    bonding: 3ad: Fix the conflict between bond_update_slave_arr and the state machine · 83d686a6
    jinyiting authored
    When a bond in mode 4 (802.3ad) that has already negotiated normally is
    brought down and then up again, bond->slave_arr may end up NULL.
    
    Test commands:
       ifconfig bond1 down
       ifconfig bond1 up
    
    The conflict occurs in the following process:
    
    __dev_open (CPU A)
    --bond_open
      --queue_delayed_work(bond->wq,&bond->ad_work,0);
      --bond_update_slave_arr
        --bond_3ad_get_active_agg_info
    
    ad_work(CPU B)
    --bond_3ad_state_machine_handler
      --ad_agg_selection_logic
    
    ad_work runs on CPU B. In ad_agg_selection_logic, every agg->is_active
    is cleared before the new active aggregator is selected. If
    bond_3ad_get_active_agg_info runs on CPU A inside that window, it fails
    and bond->slave_arr is set to NULL. Because the best aggregator chosen
    by ad_agg_selection_logic has not changed, the state machine sees no
    need to update the slave array, so bond->slave_arr stays NULL.
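
    The interleaving above can be replayed deterministically in a small
    single-threaded model. This is only a sketch: the model_* types and
    functions are hypothetical stand-ins, not the kernel code, and a
    boolean slave_arr_set stands in for bond->slave_arr being non-NULL.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_AGGS 2

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct model_agg {
    bool is_active;
};

struct model_bond {
    struct model_agg aggs[NUM_AGGS];
    int best;            /* index of the current best aggregator */
    bool slave_arr_set;  /* models bond->slave_arr != NULL */
};

/* Models bond_3ad_get_active_agg_info: fails (-1) if nothing is active. */
static int model_get_active_agg(const struct model_bond *b)
{
    for (int i = 0; i < NUM_AGGS; i++)
        if (b->aggs[i].is_active)
            return i;
    return -1;
}

/* Models bond_update_slave_arr: the array is dropped on lookup failure. */
static void model_update_slave_arr(struct model_bond *b)
{
    b->slave_arr_set = (model_get_active_agg(b) >= 0);
}

/* First half of the selection logic: clear every is_active flag. */
static void model_selection_clear(struct model_bond *b)
{
    for (int i = 0; i < NUM_AGGS; i++)
        b->aggs[i].is_active = false;
}

/* Second half: activate the chosen aggregator, report if "best" changed. */
static bool model_selection_pick(struct model_bond *b, int chosen)
{
    assert(chosen >= 0 && chosen < NUM_AGGS);
    bool changed = (chosen != b->best);
    b->best = chosen;
    b->aggs[chosen].is_active = true;
    return changed;
}

/* A bond that has already negotiated: aggregator 0 is active. */
static struct model_bond model_negotiated_bond(void)
{
    struct model_bond b = {
        .aggs = { { true }, { false } },
        .best = 0,
        .slave_arr_set = true,
    };
    return b;
}

/* CPU A reads inside the window opened by CPU B. */
static bool model_replay_racy(void)
{
    struct model_bond b = model_negotiated_bond();

    model_selection_clear(&b);          /* CPU B: all flags cleared */
    model_update_slave_arr(&b);         /* CPU A: lookup fails here */
    if (model_selection_pick(&b, 0))    /* CPU B: same best as before */
        model_update_slave_arr(&b);     /* skipped, nothing changed */
    return b.slave_arr_set;
}

/* CPU A reads only after CPU B has finished its selection. */
static bool model_replay_serialized(void)
{
    struct model_bond b = model_negotiated_bond();

    model_selection_clear(&b);          /* CPU B */
    if (model_selection_pick(&b, 0))    /* CPU B */
        model_update_slave_arr(&b);
    model_update_slave_arr(&b);         /* CPU A */
    return b.slave_arr_set;
}
```

    In this model, model_replay_racy() ends with slave_arr_set false,
    while model_replay_serialized() ends with it true.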
    
    The conflict arises because ad_agg_selection_logic clears
    agg->is_active under mode_lock, while bond_open -> bond_update_slave_arr
    inspects agg->is_active outside the lock.
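
    The general remedy for this kind of conflict is to have the reader
    take the same lock as the writer. The sketch below is a userspace
    analogue using pthreads, not the kernel patch: a pthread mutex
    stands in for mode_lock, and agg_active, selection_worker,
    locked_has_active_agg and count_missed_reads are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_AGGS 2
#define ROUNDS 100000

/* Stand-in for the bond's mode_lock. */
static pthread_mutex_t mode_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for the per-aggregator is_active flags. */
static bool agg_active[NUM_AGGS] = { true, false };

/* Writer: clears all flags, then re-selects aggregator 0, under the lock. */
static void *selection_worker(void *arg)
{
    (void)arg;
    for (int r = 0; r < ROUNDS; r++) {
        pthread_mutex_lock(&mode_lock);
        for (int i = 0; i < NUM_AGGS; i++)
            agg_active[i] = false;
        agg_active[0] = true;
        pthread_mutex_unlock(&mode_lock);
    }
    return NULL;
}

/* Reader: inspects the flags under the same lock as the writer. */
static bool locked_has_active_agg(void)
{
    bool found = false;

    pthread_mutex_lock(&mode_lock);
    for (int i = 0; i < NUM_AGGS; i++)
        if (agg_active[i])
            found = true;
    pthread_mutex_unlock(&mode_lock);
    return found;
}

/* Counts reads that saw no active aggregator; returns -1 on thread error. */
static int count_missed_reads(void)
{
    pthread_t tid;
    int missed = 0;

    if (pthread_create(&tid, NULL, selection_worker, NULL) != 0)
        return -1;
    for (int r = 0; r < ROUNDS; r++)
        if (!locked_has_active_agg())
            missed++;
    int rc = pthread_join(tid, NULL);
    assert(rc == 0);
    (void)rc;
    return missed;
}
```

    Because the writer clears and re-sets the flags inside one critical
    section, a reader holding the same lock cannot observe the
    intermediate state in which no aggregator is active.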
    
    Also, bond_update_slave_arr may legitimately sleep when allocating
    memory, so replace the WARN_ON with a call to might_sleep.

    Signed-off-by: jinyiting <jinyiting@huawei.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
bond_main.c 152 KB