    net/smc: Transitional solution for clcsock race issue · c0bf3d8a
    Wen Gu authored
    We encountered a crash in smc_setsockopt(), caused by accessing
    smc->clcsock after clcsock had been released.
    
     BUG: kernel NULL pointer dereference, address: 0000000000000020
     #PF: supervisor read access in kernel mode
     #PF: error_code(0x0000) - not-present page
     PGD 0 P4D 0
     Oops: 0000 [#1] PREEMPT SMP PTI
     CPU: 1 PID: 50309 Comm: nginx Kdump: loaded Tainted: G E     5.16.0-rc4+ #53
     RIP: 0010:smc_setsockopt+0x59/0x280 [smc]
     Call Trace:
      <TASK>
      __sys_setsockopt+0xfc/0x190
      __x64_sys_setsockopt+0x20/0x30
      do_syscall_64+0x34/0x90
      entry_SYSCALL_64_after_hwframe+0x44/0xae
     RIP: 0033:0x7f16ba83918e
      </TASK>
    
    This patch tries to fix it by holding clcsock_release_lock and
    checking whether clcsock has already been released before access.
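
    The lock-then-check pattern described above can be illustrated with a
    minimal userspace sketch. It is not the kernel code: the fake_* names
    are hypothetical, a pthread mutex stands in for clcsock_release_lock,
    and the -EBADF error code is an assumption of this sketch.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct fake_clcsock {
	int optval;
};

struct fake_smc_sock {
	pthread_mutex_t clcsock_release_lock; /* stands in for the kernel mutex */
	struct fake_clcsock *clcsock;         /* NULL once released */
};

/* Release path: clears the pointer while holding the lock. */
static void fake_release_clcsock(struct fake_smc_sock *smc)
{
	pthread_mutex_lock(&smc->clcsock_release_lock);
	smc->clcsock = NULL;
	pthread_mutex_unlock(&smc->clcsock_release_lock);
}

/* Accessor path: takes the same lock and checks the pointer before use. */
static int fake_setsockopt(struct fake_smc_sock *smc, int val)
{
	int rc = 0;

	pthread_mutex_lock(&smc->clcsock_release_lock);
	if (!smc->clcsock) {
		rc = -EBADF; /* assumed error code: socket already released */
		goto out;
	}
	smc->clcsock->optval = val; /* safe: release cannot run concurrently */
out:
	pthread_mutex_unlock(&smc->clcsock_release_lock);
	return rc;
}
```

    Because the release path and the accessor take the same lock, the
    pointer cannot become NULL between the check and the dereference.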
    
    To prevent a crash with the same cause in smc_getsockopt() or
    smc_switch_to_fallback(), this patch also checks smc->clcsock in
    them. The callers of smc_switch_to_fallback() determine whether
    the fallback succeeded from its return value.
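
    The return-value convention for the fallback path might look roughly
    like the following userspace sketch. The demo_* names are hypothetical,
    the caller is an invented example rather than a real caller in
    af_smc.c, and -EBADF is again an assumed error code.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct smc_sock. */
struct demo_smc_sock {
	pthread_mutex_t clcsock_release_lock;
	void *clcsock;     /* NULL once released */
	bool use_fallback;
};

/* Returns 0 on success, or a negative errno if clcsock is already gone. */
static int demo_switch_to_fallback(struct demo_smc_sock *smc)
{
	int rc = 0;

	pthread_mutex_lock(&smc->clcsock_release_lock);
	if (!smc->clcsock) {
		rc = -EBADF; /* assumed error code */
		goto out;
	}
	smc->use_fallback = true;
out:
	pthread_mutex_unlock(&smc->clcsock_release_lock);
	return rc;
}

/* Invented caller: proceeds on the fallback path only if the switch worked. */
static int demo_caller(struct demo_smc_sock *smc)
{
	int rc = demo_switch_to_fallback(smc);

	if (rc)
		return rc; /* fallback failed: do not touch clcsock */
	/* ... continue on the fallback path here ... */
	return 0;
}
```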
    
    Fixes: fd57770d ("net/smc: wait for pending work before clcsock release_sock")
    Link: https://lore.kernel.org/lkml/5dd7ffd1-28e2-24cc-9442-1defec27375e@linux.ibm.com/T/
    Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
    Acked-by: Karsten Graul <kgraul@linux.ibm.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>