    testing: net-drv: use stats64 for testing · a61a459f
    Joe Damato authored
    Testing a network device whose byte/packet counters are very large may
    fail because the 32-bit counters in the rtnl 'stats' structure overflow.
    Comparing against 'stats64' instead fixes this problem.
    
    I tripped on this while iterating on a qstats patch for mlx5. See below
    for confirmation that the bug is present even without my added code.
    
    Before this patch (with added debugging output):
    
    $ NETIF=eth0 tools/testing/selftests/drivers/net/stats.py
    KTAP version 1
    1..4
    ok 1 stats.check_pause
    ok 2 stats.check_fec
    rstat: 481708634 qstat: 666201639514 key: tx-bytes
    not ok 3 stats.pkt_byte_sum
    ok 4 stats.qstat_by_ifindex
    
    Note the huge delta above ^^^ between the rtnl stats (rstat) and the
    qstats (qstat).
    
    After this patch:
    
    $ NETIF=eth0 tools/testing/selftests/drivers/net/stats.py
    KTAP version 1
    1..4
    ok 1 stats.check_pause
    ok 2 stats.check_fec
    ok 3 stats.pkt_byte_sum
    ok 4 stats.qstat_by_ifindex
    
    It looks like rtnl_fill_stats in net/core/rtnetlink.c will attempt to
    copy the 64-bit stats into a 32-bit structure, which is probably why
    this behavior is occurring.
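
    As a rough check of that theory, the sketch below keeps only the low
    32 bits of the qstat value from the failing run and compares the result
    with the rstat value printed next to it. The helper name to_u32 is made
    up for illustration and is not part of stats.py; the two numbers are
    copied from the debugging output above.

```python
U32_MASK = 0xFFFFFFFF  # a 32-bit counter can only hold the low 32 bits


def to_u32(value: int) -> int:
    """Keep only the low 32 bits, as storing into a 32-bit field would."""
    return value & U32_MASK


# Values copied from the failing run's debugging output.
qstat_tx_bytes = 666201639514
rstat_tx_bytes = 481708634

# If truncation is the cause, the truncated qstat equals the rstat value.
print(to_u32(qstat_tx_bytes) == rstat_tx_bytes)
```

    If the comparison holds, the "huge delta" is a counter that has wrapped,
    not a real accounting difference.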
    
    To show this is happening, you can get the underlying stats that the
    stats.py test uses like this:
    
    $ ./cli.py --spec ../../../Documentation/netlink/specs/rt_link.yaml \
               --do getlink --json '{"ifi-index": 7}'
    
    And examine the output (heavily snipped to show relevant fields):
    
     'stats': {
               'multicast': 3739197,
               'rx-bytes': 1201525399,
               'rx-packets': 56807158,
               'tx-bytes': 492404458,
               'tx-packets': 1200285371,
    
     'stats64': {
                 'multicast': 3739197,
                 'rx-bytes': 35561263767,
                 'rx-packets': 56807158,
                 'tx-bytes': 666212335338,
                 'tx-packets': 1200285371,
    
    The stats.py test prior to this patch was using the 'stats' structure
    above, which matches the failure output on my system.
    
    Comparing rx-bytes and tx-bytes side by side, together with the
    ethtool -S output:
    
    rx-bytes stats:    1201525399
    rx-bytes stats64: 35561263767
    rx-bytes ethtool: 36203402638
    
    tx-bytes stats:      492404458
    tx-bytes stats64: 666212335338
    tx-bytes ethtool: 666215360113
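
    The relationship between the 'stats' and 'stats64' values above can be
    checked with a little arithmetic. This is only a sketch: the helper
    wraps_and_remainder is a hypothetical name, and the inputs are the
    values listed in the comparison. It splits each 64-bit value into the
    number of times a 32-bit counter would have wrapped and what would be
    left in it.

```python
WRAP = 2 ** 32  # a 32-bit counter wraps every 2**32 units

# (stats, stats64) pairs copied from the side-by-side comparison.
samples = {
    "rx-bytes": (1201525399, 35561263767),
    "tx-bytes": (492404458, 666212335338),
}


def wraps_and_remainder(stats64_value: int) -> tuple:
    """Return (number of 32-bit wraps, value left in the low 32 bits)."""
    return divmod(stats64_value, WRAP)


for key, (stats32, stats64) in samples.items():
    wraps, low = wraps_and_remainder(stats64)
    print(f"{key}: wraps={wraps} low32={low} matches stats: {low == stats32}")
```

    The small gap between stats64 and ethtool is a different matter: the
    two were read at slightly different times on a live interface.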
    
    Note that the above was taken from a system with an mlx5 NIC, which only
    exposes ndo_get_stats64.
    
    Based on the ethtool output and qstat output, it appears that stats.py
    should be updated to use the 'stats64' structure for accurate
    comparisons when packet/byte counters get very large.
    
    To confirm that this was not related to the qstats code I was iterating
    on, I booted a kernel without my driver changes and re-ran the test,
    which shows the qstats checks are skipped (as qstats don't exist for
    mlx5):
    
    $ NETIF=eth0 tools/testing/selftests/drivers/net/stats.py
    KTAP version 1
    1..4
    ok 1 stats.check_pause
    ok 2 stats.check_fec
    ok 3 stats.pkt_byte_sum # SKIP qstats not supported by the device
    ok 4 stats.qstat_by_ifindex # SKIP No ifindex supports qstats
    
    But fetching the stats using the CLI:
    
    $ ./cli.py --spec ../../../Documentation/netlink/specs/rt_link.yaml \
               --do getlink --json '{"ifi-index": 7}'
    
    Shows the same issue (heavily snipped for relevant fields only):
    
     'stats': {
               'multicast': 105489,
               'rx-bytes': 530879526,
               'rx-packets': 751415,
               'tx-bytes': 2510191396,
               'tx-packets': 27700323,
    
     'stats64': {
                 'multicast': 105489,
                 'rx-bytes': 530879526,
                 'rx-packets': 751415,
                 'tx-bytes': 15395093284,
                 'tx-packets': 27700323,
    
    Comparing side by side with ethtool -S on the unmodified mlx5 driver:
    
    tx-bytes stats:    2510191396
    tx-bytes stats64: 15395093284
    tx-bytes ethtool: 17718435810
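
    For the unmodified driver, a short sketch can show which counters in
    the dump above disagree between the two structures and whether 32-bit
    truncation accounts for all of them. The function names here are
    hypothetical and the dictionaries are copied from the snipped output;
    this is not code from stats.py.

```python
# Counters copied from the getlink output on the unmodified mlx5 driver.
stats = {
    "multicast": 105489,
    "rx-bytes": 530879526,
    "rx-packets": 751415,
    "tx-bytes": 2510191396,
    "tx-packets": 27700323,
}
stats64 = {
    "multicast": 105489,
    "rx-bytes": 530879526,
    "rx-packets": 751415,
    "tx-bytes": 15395093284,
    "tx-packets": 27700323,
}


def diverging_keys(narrow: dict, wide: dict) -> list:
    """List the counters whose 'stats' and 'stats64' values differ."""
    return sorted(k for k in narrow if narrow[k] != wide[k])


def explained_by_truncation(narrow: dict, wide: dict) -> bool:
    """True if every 'stats' value is the low 32 bits of its 'stats64' value."""
    return all(wide[k] & 0xFFFFFFFF == narrow[k] for k in narrow)


print(diverging_keys(stats, stats64))
print(explained_by_truncation(stats, stats64))
```

    Only counters that have passed 2**32 can diverge, which is why rx-bytes
    still agrees on this system while tx-bytes does not.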
    
    Fixes: f0e6c86e ("testing: net-drv: add a driver test for stats reporting")
    Signed-off-by: Joe Damato <jdamato@fastly.com>
    Link: https://lore.kernel.org/r/20240520235850.190041-1-jdamato@fastly.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>