Commit ff6b117c authored by Venkata Sidagam's avatar Venkata Sidagam

Bug #17760379 COLLATIONS WITH CONTRACTIONS BUFFER-OVERFLOW THEMSELVES IN THE FOOT

Description: A typo in create_tailoring() causes the "contraction_flags" to be written
into cs->contractions in the wrong place. This causes two problems:
(1) Anyone relying on `contraction_flags` to decide "could this character be
part of a contraction" is 100% broken.
(2) Anyone relying on `contractions` to determine the weight of a contraction
is mostly broken

Analysis: When we are preparing the contraction in create_tailoring(), we are corrupting the 
cs->contractions memory location which is supposed to store the weights(8k) + contraction information(256 bytes). We started storing the contraction information after the 4k location. This is because of logic flaw in the code.

Fix: When we create the contractions, we need to calculate the contraction with (char*) (cs->contractions + 0x40*0x40) from ((char*) cs->contractions) + 0x40*0x40. This makes the "cs->contractions" to move to 8k bytes and stores the contraction information from there. Similarly when we are calculating it for like range queries we need to calculate it from the 8k bytes onwards, this can be done by changing the logic to (const char*) (cs->contractions + 0x40*0x40). And for ucs2 charsets we need to modify the my_cs_can_be_contraction_head() and my_cs_can_be_contraction_tail() to point to 8k+ locations.
parent 8765bec5
......@@ -372,13 +372,13 @@ my_cs_have_contractions(CHARSET_INFO *cs)
static inline my_bool
my_cs_can_be_contraction_head(CHARSET_INFO *cs, my_wc_t wc)
{
return ((const char *)cs->contractions)[0x40*0x40 + (wc & 0xFF)];
return ((const char *) cs->contractions)[0x40 * 0x40 * 2 + (wc & 0xFF)];
}
static inline my_bool
my_cs_can_be_contraction_tail(CHARSET_INFO *cs, my_wc_t wc)
{
return ((const char *)cs->contractions)[0x40*0x40 + (wc & 0xFF)];
return ((const char *) cs->contractions)[0x40 * 0x40 * 2 + (wc & 0xFF)];
}
static inline uint16*
......
......@@ -697,7 +697,7 @@ my_bool my_like_range_mb(CHARSET_INFO *cs,
char *max_end= max_str + res_length;
size_t maxcharlen= res_length / cs->mbmaxlen;
const char *contraction_flags= cs->contractions ?
((const char*) cs->contractions) + 0x40*0x40 : NULL;
(const char *) (cs->contractions + 0x40*0x40) : NULL;
for (; ptr != end && min_str != min_end && maxcharlen ; maxcharlen--)
{
......
......@@ -8046,7 +8046,7 @@ static my_bool create_tailoring(CHARSET_INFO *cs, void *(*alloc)(size_t))
if (!(cs->contractions= (uint16*) (*alloc)(size)))
return 1;
bzero((void*)cs->contractions, size);
contraction_flags= ((char*) cs->contractions) + 0x40*0x40;
contraction_flags= (char *) (cs->contractions + 0x40*0x40);
for (i=0; i < rc; i++)
{
if (rule[i].curr[1])
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment