Bug#13694811: THE OPTIMIZER WRONGLY USES THE FIRST INNODB
PARTITION STATISTICS Problem was the fix for bug#11756867; It always used the first partitions, and stopped after it checked 10 [sub]partitions. (or until it found a partition which would contain a match). This results in bad statistics for tables where the first 10 partitions don't represent the majority of the data (like when the first 10 partitions only contained a few rows in total). The solution was to take statisics from the partitions containing the most rows instead: Added an array of partition ids which is sorted by number of records in descending order. this array is used in records_in_range to cover as many records as possible in as few calls as possible. Also changed the limit of how many partitions to use for the statistics from a static max of 10 partitions, into a dynamic model: Maximum number of partitions is now log2(total number of partitions) taken from the ordered array. It will continue calling partitions records_in_range until it has checked: (total rows in matching partitions) * (maximum number of partitions) / (number of used partitions) Also reverted the changes for ha_partition::scan_time() and ha_partition::estimate_rows_upper_bound() to before the fix of bug#11756867. Since they are not as slow as records_in_range.
Showing
Please register or sign in to comment