• Guilhem Bichot's avatar
    Bug#16539979 - BASIC SELECT COUNT(DISTINCT ID) IS BROKEN · c90cdf5d
    Guilhem Bichot authored
    Bug#17867117 - ERROR RESULT WHEN "COUNT + DISTINCT + CASE WHEN" NEED MERGE_WALK 
    
    Problem:
    COUNT DISTINCT gives incorrect result when it uses a Unique
    Tree and its last inserted record has null value.
    
    Here is how COUNT DISTINCT is processed, given that this query is not
    using loose index scan.
    
    When a row is produced as a result of joining tables (there is only
    one table here), we store the SELECTed value in a Unique tree. This
    allows elimination of any duplicates, and thus implements DISTINCT.
    
    When we have processed all rows like this, we walk the Unique tree,
    counting its elements, in Aggregator_distinct::endup() (tree->walk());
    for each element we call Item_sum_count::add(). Such function wants to
    ignore any NULL value, for that it checks item_sum -> args[0] ->
    null_value. It is a mistake: when walking the Unique tree, the value
    to be aggregated is not item_sum ->args[0] but rather table ->
    field[0].
    
    Solution:
    instead of item_sum -> args[0] -> null_value, use arg_is_null(), which
    knows where to look (like in fix for bug 57932).
    
    As a consequence of this solution, we have to make arg_is_null() a
    little more general:
    1) Because it was so far only used for AVG() (which always has a
    single argument), this function was looking at a single argument; now
    that it has to work with COUNT(DISTINCT expression1,expression2), it
    must look at all arguments.
    2) Because we start using arg_is_null () for COUNT(DISTINCT), i.e. in
    Item_sum_count::add (), it implies that we are also using it for
    COUNT(no DISTINCT) (same add ()). For COUNT(no DISTINCT), the
    nullness to check is that of item_sum -> args[0]. But the null_value
    of such item is reliable only if val_*() has been called on it. So far
    arg_is_null() was always used after a call to arg_val*(), so could
    rely on null_value; but for COUNT, there is no call to arg_val*(), so
    arg_is_null() has to call is_null() instead.
    
    Testcase for 16539979 by Neeraj. Testcase for 17867117 contributed by
    Xiaobin Lin from Taobao.
    c90cdf5d
distinct.result 27.4 KB