    net/mlx5: Skip mlx5_unload_one if mlx5_load_one fails · b3cb5388
    Huy Nguyen authored
    If the firmware fails during mlx5_load_one, the health_care timer detects
    the failure and schedules a health_care call. mlx5_load_one then detects
    the failure as well, cleans up, and returns. When health_care then runs,
    it calls mlx5_unload_one to clean up resources that no longer exist,
    which causes a kernel panic.
    
    The root cause is that the bit MLX5_INTERFACE_STATE_DOWN is not set
    after mlx5_load_one fails. The fix is to remove the bit
    MLX5_INTERFACE_STATE_DOWN and to return early from mlx5_unload_one if
    the bit MLX5_INTERFACE_STATE_UP is not set. MLX5_INTERFACE_STATE_DOWN
    is redundant; MLX5_INTERFACE_STATE_UP can be used instead.
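
The guard described above can be illustrated with a small user-space sketch. It is not the driver code: the `demo_*` names, the `struct demo_dev` layout, and the `fw_ok` parameter are hypothetical stand-ins, and the sketch leaves out any locking the real driver needs. It only shows the idea of using a single UP bit, so that unload becomes a no-op when load never completed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the interface state bits; only UP is kept. */
enum demo_interface_state {
    DEMO_INTERFACE_STATE_UP = 1u << 0,
};

/* Hypothetical device structure, not the driver's real one. */
struct demo_dev {
    unsigned int intf_state;   /* bitmask of demo_interface_state */
    bool resources_allocated;  /* models resources owned while loaded */
    int teardowns;             /* counts how often teardown really ran */
};

/* Bring the device up. On failure, undo partial setup and leave UP clear. */
static int demo_load_one(struct demo_dev *dev, bool fw_ok)
{
    if (dev->intf_state & DEMO_INTERFACE_STATE_UP)
        return 0;                          /* already up */

    dev->resources_allocated = true;
    if (!fw_ok) {
        dev->resources_allocated = false;  /* clean up and quit */
        return -1;
    }

    dev->intf_state |= DEMO_INTERFACE_STATE_UP;
    return 0;
}

/* Tear the device down, but only if load actually completed. */
static int demo_unload_one(struct demo_dev *dev)
{
    if (!(dev->intf_state & DEMO_INTERFACE_STATE_UP))
        return 0;                          /* nothing to release: skip */

    assert(dev->resources_allocated);      /* teardown needs live resources */
    dev->resources_allocated = false;
    dev->intf_state &= ~DEMO_INTERFACE_STATE_UP;
    dev->teardowns++;
    return 0;
}
```

In this sketch, calling `demo_unload_one` after a failed `demo_load_one` returns without touching any resources, which corresponds to the health_care path no longer cleaning up state that was already released.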
    
    Fixes: 5fc7197d ("net/mlx5: Add pci shutdown callback")
    Signed-off-by: Huy Nguyen <huyn@mellanox.com>
    Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
main.c 38.9 KB