Bug#22555: STDDEV yields positive result for groups with only one row
When only one row was present, the subtraction of two nearly equal numbers resulted in catastrophic cancellation, introducing an error near 1e-15 in the VARIANCE calculation. Taking sqrt() of that to get STDDEV escalated the error to near 1e-8.

The simple fix of testing for a row count of 1 and forcing that case to yield 0.0 is insufficient: two rows with the same value should also have a variance of 0.0, yet the error would be about the same.

So, this patch changes the formula that computes VARIANCE to one that is not subject to catastrophic cancellation. In addition, it now calculates using only floating-point numbers (faster than decimal), and renders the result to other types on demand.

mysql-test/r/func_group.result:
  Test that the bug is fixed, and that no unexpected behavior arises from the changes.
mysql-test/t/func_group.test:
  Test that the bug is fixed, and that no unexpected behavior arises from the changes.
sql/item_sum.cc:
  Serg's suggestion: Force all VARIANCE calculations to be done with floating-point types. It's faster, and the SQL standard says we may implement these functions any way we want.

  Additionally, use a form of variance calculation that is not subject to catastrophic cancellation.

  http://static.flickr.com/108/311308512_5c4e1c0c3d_b.jpg
sql/item_sum.h:
  Remove unused members and add a comment describing the recurrence relation.
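
For illustration only, here is a minimal sketch of a variance recurrence that avoids subtracting two large, nearly equal sums. It uses the well-known running-mean update (often attributed to Welford and Knuth). This is an assumption about the general shape of such a recurrence, not a copy of the code in sql/item_sum.cc, and the names RunningVariance, add, population_variance, population_stddev and naive_population_variance are hypothetical. The naive sum-of-squares form is included only for contrast; whether it shows a visible error depends on the data and on the arithmetic used.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Contrast case (hypothetical helper): the textbook "sum of squares" form.
// It subtracts two quantities that can be nearly equal, so it is prone to
// catastrophic cancellation when the values are large relative to their spread.
// Assumes xs is non-empty.
double naive_population_variance(const std::vector<double>& xs) {
    double sum = 0.0;
    double sum_sq = 0.0;
    for (double x : xs) {
        sum += x;
        sum_sq += x * x;
    }
    const double n = static_cast<double>(xs.size());
    return (sum_sq - sum * sum / n) / n;
}

// Sketch of a running recurrence (hypothetical type, not the MySQL class):
//   mean_k = mean_{k-1} + (x_k - mean_{k-1}) / k
//   m2_k   = m2_{k-1}   + (x_k - mean_{k-1}) * (x_k - mean_k)
// Only differences from the running mean are accumulated, so identical
// inputs contribute exactly zero to m2.
struct RunningVariance {
    std::size_t count = 0;
    double mean = 0.0;
    double m2 = 0.0;  // running sum of squared deviations from the mean

    void add(double x) {
        ++count;
        const double delta = x - mean;             // deviation from the old mean
        mean += delta / static_cast<double>(count);
        m2 += delta * (x - mean);                  // uses old and new mean
    }

    // Population variance; defined here as 0.0 for an empty group.
    double population_variance() const {
        return count == 0 ? 0.0 : m2 / static_cast<double>(count);
    }

    double population_stddev() const {
        return std::sqrt(population_variance());
    }
};
```

With this form, a group of one row, or of several rows holding the same value, never accumulates a nonzero deviation, so both the variance and its square root come out as exactly 0.0 without any special case for the row count.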