• Georgi Kodinov's avatar
    Bug #18359924: INNODB AND MYISAM CORRUPTION ON PREFIX INDEXES · 29694eeb
    Georgi Kodinov authored
    The problem was in the validation of the input data for blob types.
    When assigned binary data, the character blob types were only checking if 
    the length of these data is a multiple of the minimum char length for the 
    destination charset. 
    And since e.g. UTF-8's minimum character length is 1 (becuase it's 
    variable length) even byte sequences that are invalid utf-8 strings (e.g. 
    wrong leading byte etc) were copied verbatim into utf-8 columns when
    coming from binary strings or fields.
    Storing invalid data into string columns was having all kinds of ill effects 
    on code that assumed that the encoding data are valid to begin with.
    
    Fixed by additionally checking the incoming binary string for validity when 
    assigning it to a non-binary string column.
    Made sure the conversions to charsets with no known "invalid" ranges 
    are not covered by the extra check.
    Removed trailing spaces.
    
    Test case added.
    29694eeb
sql_string.h 13 KB