    net/tcp: Disable TCP-AO static key after RCU grace period · 14ab4792
    Dmitry Safonov authored
    The lifetime of the TCP-AO static_key matches that of the last
    tcp_ao_info. On socket destruction, tcp_ao_info goes away after an
    RCU grace period, while the tcp-ao static branch is currently disabled
    in a deferred manner. The static key definition is
    : DEFINE_STATIC_KEY_DEFERRED_FALSE(tcp_ao_needed, HZ);
    
    which means that if the RCU grace period is delayed by more than a second
    and tcp_ao_needed is in the process of being disabled, other CPUs may
    still see a tcp_ao_info that isn't dead yet, but soon will be.
    That breaks the assumption of static_key_fast_inc_not_disabled().
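
The ordering problem can be sketched as a toy timeline in userspace C. This is only a model under stated assumptions: the struct, the function names and the millisecond values are hypothetical, the deferral is taken to be about one second (HZ jiffies), and "new ordering" only reflects what the subject line says (disable the key after the RCU grace period), not the actual patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy timeline; all times are milliseconds after socket destruction starts. */
struct teardown {
    long info_gone_at;  /* tcp_ao_info is no longer visible to readers */
    long key_off_at;    /* disabling of the static key starts */
};

/* Hypothetical model of the old ordering: both delays start at destruction. */
static struct teardown old_order(long grace_ms, long defer_ms)
{
    struct teardown t = { .info_gone_at = grace_ms, .key_off_at = defer_ms };
    return t;
}

/* Hypothetical model of the new ordering: the deferral starts only
 * once the grace period has ended. */
static struct teardown new_order(long grace_ms, long defer_ms)
{
    struct teardown t = { .info_gone_at = grace_ms,
                          .key_off_at = grace_ms + defer_ms };
    return t;
}

/* True if a reader can still find tcp_ao_info while the key is being disabled. */
static bool race_window(struct teardown t)
{
    return t.key_off_at < t.info_gone_at;
}

void demo_teardown_order(void)
{
    /* Grace period of 1.5s against a 1s deferral: the old ordering races. */
    assert(race_window(old_order(1500, 1000)));
    /* A short grace period hides the problem, which is why it is rare. */
    assert(!race_window(old_order(200, 1000)));
    /* With the key disabled after the grace period there is no window. */
    assert(!race_window(new_order(1500, 1000)));
}
```

In this model the window only opens when the grace period outlasts the deferral, which matches the "delayed by more than a second" condition above.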
    
    See the comment near the definition:
    > * The caller must make sure that the static key can't get disabled while
    > * in this function. It doesn't patch jump labels, only adds a user to
    > * an already enabled static key.
    
    The helper was originally introduced in commit eb8c5072 ("jump_label:
    Prevent key->enabled int overflow") for use in atomic contexts, one
    of which is the creation of a full socket from a request socket. In
    that atomic context, the presence of the key (md5/ao) guarantees that
    the static branch is already enabled, so the reference counter for
    that static branch is simply incremented instead of taking the proper
    mutex. static_key_fast_inc_not_disabled() is just a helper for that
    use case. It must not be used if the static branch could get disabled
    in parallel: it is not protected by jump_label_mutex and, as a result,
    races with the implementation details of jump_label_update().
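
A minimal userspace sketch of what such a helper does, using C11 atomics. It is a model, not the kernel source: struct model_key and model_fast_inc_not_disabled() are hypothetical names, and the exact conditions are an assumption based on the quoted comment and on the title of the commit that introduced the helper.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's struct static_key (illustrative only). */
struct model_key {
    atomic_int enabled;  /* user count; 0 means the branch is disabled */
};

/*
 * Add a user only if the key already has one. There is no code patching
 * and no mutex here, which is why the caller has to guarantee that the
 * key cannot be disabled concurrently.
 */
static bool model_fast_inc_not_disabled(struct model_key *key)
{
    int v = atomic_load(&key->enabled);

    do {
        if (v <= 0 || v == INT_MAX)
            return false;  /* not enabled, or the count would overflow */
    } while (!atomic_compare_exchange_weak(&key->enabled, &v, v + 1));

    return true;
}

void demo_fast_inc(void)
{
    struct model_key key;

    atomic_init(&key.enabled, 1);
    assert(model_fast_inc_not_disabled(&key));   /* 1 -> 2 */
    assert(atomic_load(&key.enabled) == 2);

    atomic_store(&key.enabled, 0);
    assert(!model_fast_inc_not_disabled(&key));  /* disabled: refused */
    assert(atomic_load(&key.enabled) == 0);
}
```

The model shows only the counter side. In the kernel the counter going back above zero is invisible to a disable that is already patching branches, which is the race described above.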
    
    This happened on the netdev test-bot [1], so it is not a theoretical issue:
    
    [] jump_label: Fatal kernel bug, unexpected op at tcp_inbound_hash+0x1a7/0x870 [ffffffffa8c4e9b7] (eb 50 0f 1f 44 != 66 90 0f 1f 00)) size:2 type:1
    [] ------------[ cut here ]------------
    [] kernel BUG at arch/x86/kernel/jump_label.c:73!
    [] Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
    [] CPU: 3 PID: 243 Comm: kworker/3:3 Not tainted 6.10.0-virtme #1
    [] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
    [] Workqueue: events jump_label_update_timeout
    [] RIP: 0010:__jump_label_patch+0x2f6/0x350
    ...
    [] Call Trace:
    []  <TASK>
    []  arch_jump_label_transform_queue+0x6c/0x110
    []  __jump_label_update+0xef/0x350
    []  __static_key_slow_dec_cpuslocked.part.0+0x3c/0x60
    []  jump_label_update_timeout+0x2c/0x40
    []  process_one_work+0xe3b/0x1670
    []  worker_thread+0x587/0xce0
    []  kthread+0x28a/0x350
    []  ret_from_fork+0x31/0x70
    []  ret_from_fork_asm+0x1a/0x30
    []  </TASK>
    [] Modules linked in: veth
    [] ---[ end trace 0000000000000000 ]---
    [] RIP: 0010:__jump_label_patch+0x2f6/0x350
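
The first line of the log can be read as a failed sanity check before patching. The sketch below reproduces that comparison in userspace C with the bytes copied from the log. It assumes the first byte sequence is what was found at the patch site, the second is what was expected, and size:2 is the number of bytes compared; site_matches() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return true if the first `size` bytes at the patch site match `expect`. */
static bool site_matches(const unsigned char *site,
                         const unsigned char *expect, size_t size)
{
    return memcmp(site, expect, size) == 0;
}

void demo_patch_check(void)
{
    /* Bytes copied from the log line: (found != expected), size:2. */
    const unsigned char found[5]    = { 0xeb, 0x50, 0x0f, 0x1f, 0x44 };
    const unsigned char expected[5] = { 0x66, 0x90, 0x0f, 0x1f, 0x00 };

    /* Only the first 2 bytes belong to this jump entry, and they differ. */
    assert(!site_matches(found, expected, 2));

    /* 0xeb is the x86 short jump opcode; 66 90 is a 2-byte NOP. */
    assert(found[0] == 0xeb);
    assert(expected[0] == 0x66 && expected[1] == 0x90);
}
```

Under those assumptions, the patching code expected a NOP at the site and found a jump already in place, so the key's counter and the patched code had gone out of sync.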
    
    [1]: https://netdev-3.bots.linux.dev/vmksft-tcp-ao-dbg/results/696681/5-connect-deny-ipv6/stderr
    
    Cc: stable@kernel.org
    Fixes: 67fa83f7 ("net/tcp: Add static_key for TCP-AO")
    Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>