• Guilhem Bichot's avatar
    Fixes for: · 9e0b69c0
    Guilhem Bichot authored
    BUG#13519696 - 62940: SELECT RESULTS VARY WITH VERSION AND
    WITH/WITHOUT INDEX RANGE SCAN
    BUG#13453382 - REGRESSION SINCE 5.1.39, RANGE OPTIMIZER WRONG
    RESULTS WITH DECIMAL CONVERSION
    BUG#13463488 - 63437: CHAR & BETWEEN WITH INDEX RETURNS WRONG
    RESULT AFTER MYSQL 5.1.
    Those are all cases where the range optimizer got it wrong
    with > and >=.
    
    mysql-test/r/range.result:
      Without the code fix for DECIMAL, "select count(val) from t2 where val > 0.1155"
      (which uses a range scan) returned 127 instead of 128);
      Moreover, both
      select * from t1 force  index (primary) where a=1 and c>= 2.9;
      and
      select * from t1 force  index (primary) where a=1 and c> 2.9;
      would miss "1	1	3".
      Without the code fix for strings, both
      SELECT * FROM t1 WHERE F1 >= 'A    ';
      and
      SELECT * FROM t1 WHERE F1 BETWEEN 'A    ' AND 'AAAAA';
      would miss "A	A	A".
    sql/item.cc:
      Preamble to the explanations below: opt_range.cc:get_mm_leaf() does
      this (this is not changed by the patch): changes
      column > value
      to
      column OP V
      where:
      * V is what is in "column" after we stored "value" in it
      (such store operation may have done rounding...)
      * OP is > or >=, depending on what's correct.
      For example, if c is an INT column,
      c > 2.9 is changed to
      c OP 3
      where OP is >= ('>' would not be correct).
      The bugs below are cases where we chose OP wrongly.
      Note that such transformations are visible in the optimizer trace.
      
      1) Fix for STRING. In the scenario with CHAR(5) in range.test, this happens,
      in get_mm_tree(), for the condition F1>='A    ':
      * value->save_in_field_no_warnings(field, 1) wants to store the right argument
      (named 'item') into the CHAR(5) field; this stores 'A    ' (the item's value)
      padded with spaces (which changes nothing: still 'A    ')
      * we come to
        case Item_func::GE_FUNC:
          /* Don't use open ranges for partial key_segments */
          if ((!(key_part->flag & HA_PART_KEY_SEG)) &&
              (stored_field_cmp_to_item(param->thd, field, value) < 0))
            tree->min_flag= NEAR_MIN;
          tree->max_flag=NO_MAX_RANGE;
      What this wants to do is: if the field's value is strictly smaller
      than the item's, then ">=" can be changed to ">" (this is an optimization,
      it can help pruning one useless partition).
      * stored_field_cmp_to_item() is called; it compares the field's
      and item's values: the item's value (Item_string::val_str()) is
      'A    ') and the field's value (Field_string::val_str()) is
      'A' (yes val_str() removes end spaces unless sql_mode='PAD_CHAR_TO_FULL_LENGTH');
      and the comparison is done with stringcmp() which considers
      end spaces as relevant; as end spaces differ, function returns a
      negative number, and ">='A    '" becomes ">'A'" (i.e. the NEAR_MIN
      flag is turned on).
      During execution the index range scan code will search for "A", find
      a match, but exclude it (because of ">"), wrongly.
      The badness is the string comparison done by stored_field_cmp_to_item():
      we use the reply of this function to determine where the index search
      should start, so it should do comparison like index search does
      comparisons; index search comparisons are ha_key_cmp() which uses
      a collation-aware comparison (in our case, my_strnncollsp_simple(),
      which ignores end spaces); so stored_field_cmp_to_item()
      needs to do the same. When this is fixed, condition becomes
      ">='A    '".
      
      2) Fix for DECIMAL: just like in other comparisons in stored_field_cmp_to_item(),
      we must first pass the field and then the item; otherwise expectations
      on what <0 and >0 mean (inferiority, superiority) get violated.
      In the test in range.test about c>2.9: c is an INT column, so 2.9
      gets stored as 3, then stored_field_cmp_to_item() compares 3
      and 2.9; because of the wrong order of arguments passed
      to my_decimal_cmp(), range optimizer
      thinks that 3 is < 2.9 and thus changes "c> 2.9" to "c> 3".
      After fixing the order, it changes to the correct "c>= 3".
      In the test in range.inc for val > 0.1155, it was changed to
      val > 0.116, now it is changed to val >= 0.116.
    9e0b69c0
range.test 52.1 KB