    ixgbe: rework Tx hang detection to fix reoccurring false Tx hangs · c84d324c
    John Fastabend authored
    The Tx hang logic has been known to detect false hangs when
    the device is receiving pause frames or has delayed processing
    for some other reason.
    
    This patch makes the logic more robust and resolves these
    known issues. The old logic queried the HW to see whether
    the device was paused and aborted the hang check if it was.
    This check was racy because the device could have been
    paused at any time before the check and no longer be paused
    when the register was read. The other job of the hang logic
    is to verify that the Tx ring is still advancing; the old
    logic checked the EOP timestamp for this. That is not
    sufficient to determine that the ring has stopped advancing;
    it only implies that the ring may be moving slowly.
    
    Here we add logic to track the number of completed Tx
    descriptors and use the adapter stats to check if any
    pause frames have been received since the previous Tx
    hang check. This way we avoid racing with the HW
    register and do not detect false hangs if the ring is
    advancing slowly.
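
The check described above can be sketched roughly as follows. This is a minimal illustration, not the driver's code: the struct names, field names, and function names are all hypothetical, and the real patch may arm or debounce the check differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-ring state; field names are illustrative only. */
struct tx_ring_state {
    uint64_t completed;      /* Tx descriptors completed so far */
    uint64_t completed_old;  /* snapshot taken at the previous hang check */
    bool     pending;        /* work is queued but not yet completed */
};

/* Hypothetical adapter state holding cumulative pause-frame stats. */
struct adapter_state {
    uint64_t pause_rx;       /* pause frames received, from adapter stats */
    uint64_t pause_rx_old;   /* snapshot taken at the previous hang check */
};

/* Report whether any pause frame arrived since the previous check.
 * Comparing cumulative counters avoids sampling an instantaneous
 * "paused right now" HW state, which is what made the old check racy. */
static bool tx_pause_seen(struct adapter_state *a)
{
    bool seen = a->pause_rx != a->pause_rx_old;

    a->pause_rx_old = a->pause_rx;
    return seen;
}

/* Return true only if the ring made no progress since the previous
 * check, work is still pending, and no pause frames were received. */
static bool check_tx_hang(struct tx_ring_state *r, struct adapter_state *a)
{
    bool paused, progressed;

    assert(r && a);

    paused = tx_pause_seen(a);
    progressed = r->completed != r->completed_old;

    if (progressed || !r->pending || paused) {
        /* Slow but advancing, idle, or paused: not a hang. */
        r->completed_old = r->completed;
        return false;
    }
    return true;
}
```

In this sketch a slowly advancing ring never trips the check, because any change in the completed-descriptor count since the last call counts as progress regardless of how old the last completion is.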
    
    This patch is primarily the work of Jesse Brandeburg. I
    cleaned it up somewhat and fixed the PFC checking.
    Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
    Tested-by: Ross Brattain <ross.b.brattain@intel.com>
    Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
ixgbe_main.c 208 KB