    perf sort: Fix symbol sort output by separating unresolved samples by type · 6bb8f311
    Anton Blanchard authored
    I took a profile that suggested 60% of total CPU time was in the
    hypervisor:
    
    ...
        60.20%  [H] 0x33d43c
         4.43%  [k] ._spin_lock_irqsave
         1.07%  [k] ._spin_lock
    
    Using perf stat to get the user/kernel/hypervisor breakdown contradicted
    this.
    
    The problem is that we merge all unresolved samples into a single
    unknown bucket, regardless of sample type. If we add a comparison by
    sample type to sort__sym_cmp we get the real picture:
    
    ...
        57.11%  [.] 0x80fbf63c
         4.43%  [k] ._spin_lock_irqsave
         1.07%  [k] ._spin_lock
         0.65%  [H] 0x33d43c
    
    So it was almost all userspace, not hypervisor as the initial profile
    suggested.
    
    I found another issue while adding this. Symbol sorting sometimes shows
    multiple entries for the unknown bucket:
    
    ...
        16.65%  [.] 0x6cd3a8
         7.25%  [.] 0x422460
         5.37%  [.] yylex
         4.79%  [.] malloc
         4.78%  [.] _int_malloc
         4.03%  [.] _int_free
         3.95%  [.] hash_source_code_string
         2.82%  [.] 0x532908
         2.64%  [.] 0x36b538
         0.94%  [H] 0x8000000000e132a4
         0.82%  [H] 0x800000000000e8b0
    
    This happens because we aren't consistent with our sorting. On
    one hand we check whether both symbols match; for two unresolved
    samples sym is NULL on both sides, so they compare equal:
    
            if (left->ms.sym == right->ms.sym)
                    return 0;
    
    On the other hand we use sample IP for unresolved samples when
    comparing against a symbol:
    
           ip_l = left->ms.sym ? left->ms.sym->start : left->ip;
           ip_r = right->ms.sym ? right->ms.sym->start : right->ip;
    
    This means unresolved samples end up spread across the rbtree and we
    can't merge them all.
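
    A minimal sketch of the inconsistency, assuming simplified stand-in
    types (struct sym and struct entry are hypothetical, not the perf
    structures) and assuming the comparison returns the difference of
    the two addresses. Two unresolved samples compare equal to each
    other, yet they sort on opposite sides of a resolved symbol whose
    start lies between their IPs:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for perf's symbol and hist entry. */
struct sym { uint64_t start; };
struct entry { const struct sym *sym; uint64_t ip; };

/*
 * Comparison as described above: pointer equality first, then symbol
 * start for resolved samples and raw IP for unresolved ones.
 * The return expression is an assumption made for this sketch.
 */
static int64_t old_sym_cmp(const struct entry *l, const struct entry *r)
{
	uint64_t ip_l, ip_r;

	/* Two unresolved samples both have sym == NULL, so they "match". */
	if (l->sym == r->sym)
		return 0;

	ip_l = l->sym ? l->sym->start : l->ip;
	ip_r = r->sym ? r->sym->start : r->ip;

	return (int64_t)(ip_r - ip_l);
}
```

    With unresolved samples at 0x100 and 0x900 and a symbol starting at
    0x500, the two samples compare equal but land on different sides of
    the symbol, so an ordered tree cannot keep them adjacent.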
    
    If we use cmp_null all unresolved samples will end up in the one bucket
    and the output makes more sense:
    
    ...
        39.12%  [.] 0x36b538
         5.37%  [.] yylex
         4.79%  [.] malloc
         4.78%  [.] _int_malloc
         4.03%  [.] _int_free
         3.95%  [.] hash_source_code_string
         2.26%  [H] 0x800000000000e8b0
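
    The two changes can be sketched together as follows. This is an
    illustration under stated assumptions, not the patch itself: the
    types are simplified stand-ins, the level field (the '.', 'k' or
    'H' tag shown in the output) and the body of cmp_null are assumed
    for the example. Unresolved samples are compared only by type, and
    an unresolved sample is ordered against a resolved one without
    looking at its IP:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for perf's symbol and hist entry. */
struct sym { uint64_t start; };
struct entry { const struct sym *sym; uint64_t ip; char level; };

/* Assumed helper: order NULL against non-NULL, ignoring addresses. */
static int64_t cmp_null(const void *l, const void *r)
{
	if (!l && !r)
		return 0;
	return !l ? -1 : 1;
}

static int64_t new_sym_cmp(const struct entry *l, const struct entry *r)
{
	/* Both unresolved: one bucket per sample type, IP is ignored. */
	if (!l->sym && !r->sym)
		return (int64_t)r->level - l->level;

	/* Exactly one unresolved: always sorts on the same side. */
	if (!l->sym || !r->sym)
		return cmp_null(l->sym, r->sym);

	if (l->sym == r->sym)
		return 0;

	return (int64_t)(r->sym->start - l->sym->start);
}
```

    In this sketch every unresolved sample compares the same way against
    any resolved symbol, so samples of one type collapse into a single
    entry while userspace and hypervisor samples stay apart.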

    Acked-by: Eric B Munson <emunson@mgebm.net>
    Cc: Eric B Munson <emunson@mgebm.net>
    Cc: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: Ingo Molnar <mingo@elte.hu>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Cc: Ian Munsie <imunsie@au1.ibm.com>
    Link: http://lkml.kernel.org/r/20110831115145.4f598ab2@kryten
    Signed-off-by: Anton Blanchard <anton@samba.org>
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
sort.c 7.79 KB