• Yishai Hadas's avatar
    net/mlx4_core: Enable device recovery flow with SRIOV · 55ad3592
    Yishai Hadas authored
    In SRIOV, both the PF and the VF may attempt device recovery whenever they
    assume that the device is not functioning.  When the PF driver resets the
    device, the VF should detect this and attempt to reinitialize itself.
    
    The VF must be able to reset itself under all circumstances, even
    if the PF is not responsive.
    
    The VF shall reset itself in the following cases:
    
    1. Commands are not processed within reasonable time over the communication channel.
    This is done considering device state and the correct return code based on
    the command as was done in the native mode, done in the next patch.
    
    2. The VF driver receives an internal error event reported by the PF on the
    communication channel. This occurs when the PF driver resets the device or
    when VF is out of sync with the PF.
    
    Add 'VF reset' capability, which allows the VF to reinitialize itself even when the
    PF is not responsive.
    
    As PF and VF may run their reset flow simulantanisly, there are several cases
    that are handled:
    - Prevent freeing VF resources upon FLR, when PF is in its unloading stage.
    - Prevent PF getting VF commands before it has finished initializing its resources.
    - Upon VF startup, check that comm-channel is online before sending
      commands to the PF and getting timed-out.
    Signed-off-by: default avatarYishai Hadas <yishaih@mellanox.com>
    Signed-off-by: default avatarOr Gerlitz <ogerlitz@mellanox.com>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    55ad3592
main.c 98.9 KB