    tg3: Prevent system hang during repeated EEH errors. · 72bb72b0
    Michael Chan authored
    The current tg3 code assumes that the pci_error_handlers are always
    called in sequence.  In particular, during ->error_detected(), NAPI is
    disabled and the device is shut down.  The device is later reset and
    NAPI re-enabled in ->slot_reset() and ->resume().
    
    In EEH, if more than 6 errors are detected in an hour, only
    ->error_detected() will be called.  This will leave the driver in an
    inconsistent state as NAPI is disabled but netif_running state is still
    true.  When the device is later closed, we'll try to disable NAPI again
    and it will loop forever.
    
    We fix this by closing the device if we encounter any error conditions
    during the normal sequence of the pci_error_handlers.
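
    The failure mode and the fix can be sketched with a small user-space
    model.  This is only an illustration under stated assumptions, not the
    tg3 code: struct model_dev and every model_* function are hypothetical
    names, and a "false" return stands in for the point where the real
    NAPI disable would wait forever.

```c
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in tg3.c. */
struct model_dev {
    bool running;       /* stands in for the netif_running() state */
    bool napi_enabled;  /* stands in for the NAPI enable state */
};

/* Returns false where a second disable would spin forever in the kernel. */
static bool model_napi_disable(struct model_dev *d)
{
    if (!d->napi_enabled)
        return false;
    d->napi_enabled = false;
    return true;
}

/* Close path: disables NAPI only if the device is still marked running. */
static bool model_close(struct model_dev *d)
{
    if (!d->running)
        return true;            /* already closed, nothing to do */
    if (!model_napi_disable(d))
        return false;           /* models the hang */
    d->running = false;
    return true;
}

/* Old behavior: NAPI is disabled, but the device stays "running". */
static void model_error_detected(struct model_dev *d)
{
    if (d->running)
        model_napi_disable(d);
}

/*
 * Sketched fix: when recovery will not continue (assumption: the caller
 * knows this), restore NAPI and close the device right away so that the
 * two state flags stay consistent.
 */
static void model_error_detected_fixed(struct model_dev *d,
                                       bool recovery_continues)
{
    if (!d->running)
        return;
    model_napi_disable(d);
    if (!recovery_continues) {
        d->napi_enabled = true;
        model_close(d);
    }
}

/* Runs one error with no follow-up handlers, then a later close. */
bool model_run(bool fixed)
{
    struct model_dev d = { .running = true, .napi_enabled = true };

    if (fixed)
        model_error_detected_fixed(&d, false);
    else
        model_error_detected(&d);
    return model_close(&d);     /* true if the later close completes */
}
```

    In this model, model_run(false) reports the stuck close and
    model_run(true) completes, because the later close finds the device
    already marked as not running.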
    
    v2: Remove the changes in tg3_io_resume() based on Benjamin Poirier's
        feedback.
    Signed-off-by: Michael Chan <mchan@broadcom.com>
    Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
tg3.c 455 KB