    Bug#22555: STDDEV yields positive result for groups with only one row · 398dbceb
    unknown authored
    When only one row was present, subtracting two nearly equal numbers
    caused catastrophic cancellation, introducing an error of about 1e-15
    into the VARIANCE calculation.  Taking sqrt() of that value to get
    STDDEV escalated the error to about 1e-8.
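
    As a rough check of the magnitudes quoted above (illustrative
    arithmetic only, not a measurement from the server): the square root
    of a tiny positive error is far larger than the error itself.

```latex
\sqrt{10^{-15}} = 10^{-7.5} \approx 3.2 \times 10^{-8}
```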
    
    The simple fix of testing for a row count of 1 and forcing that to yield 
    0.0 is insufficient, as two rows of the same value should also have a
    variance of 0.0, yet the error would be about the same.
    
    So, this patch changes the formula that computes the VARIANCE to be one
    that is not subject to catastrophic cancellation.
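
    A minimal sketch of one such formula, assuming a Welford-style
    running mean and running sum of squared deviations.  The struct and
    member names here are hypothetical, it computes the population
    variance, and the actual code in sql/item_sum.cc may differ in detail.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical accumulator for illustration; not the server's Item_sum class.
struct RunningVariance {
    std::size_t count = 0;
    double mean = 0.0;  // running mean M_k
    double m2 = 0.0;    // running sum of squared deviations S_k

    // Recurrence for the k-th value x:
    //   M_k = M_{k-1} + (x - M_{k-1}) / k
    //   S_k = S_{k-1} + (x - M_{k-1}) * (x - M_k)
    // No two large, nearly equal sums are ever subtracted.
    void add(double x) {
        ++count;
        const double delta = x - mean;
        mean += delta / static_cast<double>(count);
        m2 += delta * (x - mean);
    }

    // Population variance. Returning 0.0 for an empty group is an
    // assumption of this sketch; an SQL aggregate would yield NULL.
    double variance() const {
        return count ? m2 / static_cast<double>(count) : 0.0;
    }

    double stddev() const { return std::sqrt(variance()); }
};
```

    With this recurrence, a group of one row, or of several identical
    rows, leaves the sum of squared deviations at exactly zero, so no
    special case for a row count of 1 is needed.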
    
    In addition, the calculation now uses only floating-point numbers, which
    are faster than decimals, and converts the result to other types on demand.
    
    
    mysql-test/r/func_group.result:
      Test that the bug is fixed, and that no unexpected behavior arises from the 
      changes.
    mysql-test/t/func_group.test:
      Test that the bug is fixed, and that no unexpected behavior arises from the 
      changes.
    sql/item_sum.cc:
      Serg's suggestion: Force all VARIANCE calculations to be done with floating-
      point types.  It's faster, and the SQL standard says we may implement these
      functions any way we want.
      
      Additionally, use a form of variance calculation that is not subject to 
      catastrophic cancellation.   
      http://static.flickr.com/108/311308512_5c4e1c0c3d_b.jpg
    sql/item_sum.h:
      Remove unused members and add a comment describing the recurrence relation.