• Alexander Barkov's avatar
    MDEV-23525 Wrong result of MIN(time_expr) and MAX(time_expr) with GROUP BY · ae33ebe5
    Alexander Barkov authored
    Problem:
    
    When calculatung MIN() and MAX() in a query with GROUP BY, like this:
    
      SELECT MIN(time_expr), MAX(time_expr) FROM t1 GROUP BY i;
    
    the code in Item_sum_min_max::update_field() erroneosly used
    string format comparison, therefore '100:20:30' was considered as
    smaller than '10:20:30'.
    
    Fix:
    
    1. Implementing low level "native" related methods in class Time:
         Time::Time(const Native &native)           - convert native to Time
         Time::to_native(Native *to, uint decimals) - convert Time to native
    
       The "native" binary representation for TIME is equal to
       the binary data format of Field_timef, which is used to
       store TIME when mysql56_temporal_format is ON (default).
    
    2. Implementing Type_handler_time_common "native" related methods:
    
      Type_handler_time_common::cmp_native()
      Type_handler_time_common::Item_val_native_with_conversion()
      Type_handler_time_common::Item_val_native_with_conversion_result()
      Type_handler_time_common::Item_param_val_native()
    
    3. Implementing missing "native representation" related methods
       in Field_time and Field_timef:
    
      Field_time::store_native()
      Field_time::val_native()
      Field_timef::store_native()
      Field_timef::val_native()
    
    4. Implementing missing "native" related methods in all Items
       that can have the TIME data type:
    
      Item_timefunc::val_native()
      Item_name_const::val_native()
      Item_time_literal::val_native()
      Item_cache_time::val_native()
      Item_handled_func::val_native()
    
    5. Marking Type_handler_time_common as "native ready".
       So now Item_sum_min_max::update_field() calculates
       values using min_max_update_native_field(),
       which uses native binary representation rather than string representation.
    
       Before this change, only the TIMESTAMP data type used native
       representation to calculate MIN() and MAX().
    
    Benchmarks (see more details in MDEV):
    
      This change not only fixes the wrong result, but also
      makes a "SELECT .. MAX.. GROUP BY .." query faster:
    
      # TIME(0)
      CREATE TABLE t1 (id INT, time_col TIME) ENGINE=HEAP;
      INSERT INTO t1 VALUES (1,'10:10:10'); -- repeat this 1m times
      SELECT id, MAX(time_col) FROM t1 GROUP BY id;
    
      MySQL80: 0.159 sec
      10.3:    0.108 sec
      10.4:    0.094 sec (fixed)
    
      # TIME(6):
      CREATE TABLE t1 (id INT, time_col TIME(6)) ENGINE=HEAP;
      INSERT INTO t1 VALUES (1,'10:10:10.999999'); -- repeat this 1m times
      SELECT id, MAX(time_col) FROM t1 GROUP BY id;
    
      My80: 0.154
      10.3: 0.135
      10.4: 0.093 (fixed)
    ae33ebe5
item.h 236 KB