    bnxt_en: Fix possible unintended driver initiated error recovery · 1b2b9183
    Michael Chan authored
    If error recovery is already enabled, bnxt_timer() will periodically
    check the heartbeat register and the reset counter.  If we get an
    error recovery async. notification from the firmware (e.g. change in
    primary/secondary role), we will immediately read and update the
    heartbeat register and the reset counter.  If the timer for the next
    health check expires soon after this, we may read the heartbeat register
    again in quick succession and find that it hasn't changed.  This will
    trigger error recovery unintentionally.
    
    The likelihood is small because we also reset fw_health->tmr_counter
    which will reset the interval for the next health check.  But the
    update is not protected and bnxt_timer() can miss the update and
    perform the health check without waiting for the full interval.
    
    Fix it by only reading the heartbeat register and reset counter in
    bnxt_async_event_process() if error recovery is transitioning to the
    enabled state.  Also add proper memory barriers so that when enabling
    for the first time, bnxt_timer() will see the tmr_counter interval and
    perform the health check after the full interval has elapsed.
    
    Fixes: 7e914027 ("bnxt_en: Enable health monitoring.")
    Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
    Signed-off-by: Michael Chan <michael.chan@broadcom.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
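
The logic described in the commit message can be illustrated with a small userspace sketch. This is not the driver code: the struct, the `sketch_*` function names, and the field layout are hypothetical, and C11 release/acquire atomics stand in for the kernel memory barriers the message mentions. It only shows the two ideas of the fix, under those assumptions: the event path snapshots the heartbeat and reset counter only on a disabled-to-enabled transition, and the timer path sees the reloaded interval before it sees the enabled flag.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the driver's health state. */
struct fw_health_sketch {
    atomic_bool enabled;
    unsigned int tmr_multiplier; /* timer ticks to wait between checks */
    unsigned int tmr_counter;    /* ticks left before the next check */
    uint32_t last_heartbeat;
    uint32_t last_reset_cnt;
};

/* Event path: snapshot the registers only on disabled -> enabled. */
static void sketch_async_event(struct fw_health_sketch *fh, bool enable,
                               uint32_t hb_reg, uint32_t reset_reg)
{
    bool was_enabled = atomic_load_explicit(&fh->enabled,
                                            memory_order_relaxed);

    if (!enable) {
        atomic_store_explicit(&fh->enabled, false, memory_order_release);
        return;
    }
    if (was_enabled)
        return; /* already enabled: leave the timer's snapshot alone */

    assert(fh->tmr_multiplier > 0);
    fh->last_heartbeat = hb_reg;
    fh->last_reset_cnt = reset_reg;
    fh->tmr_counter = fh->tmr_multiplier;
    /* Publish the fields above before the timer can observe enabled. */
    atomic_store_explicit(&fh->enabled, true, memory_order_release);
}

/* Timer path: returns true if this tick would start error recovery. */
static bool sketch_timer_tick(struct fw_health_sketch *fh,
                              uint32_t hb_reg, uint32_t reset_reg)
{
    /* Acquire pairs with the release store in the event path. */
    if (!atomic_load_explicit(&fh->enabled, memory_order_acquire))
        return false;
    if (fh->tmr_counter) {
        fh->tmr_counter--; /* full interval has not elapsed yet */
        return false;
    }

    fh->tmr_counter = fh->tmr_multiplier;
    if (hb_reg == fh->last_heartbeat || reset_reg != fh->last_reset_cnt)
        return true; /* stalled heartbeat or unexpected reset */
    fh->last_heartbeat = hb_reg;
    return false;
}
```

In this sketch, a repeated notification while already enabled changes neither the stored heartbeat nor the countdown, so the next timer check still compares against a value that is a full interval old.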
bnxt.c 352 KB