1. 08 Dec, 2021 6 commits
    • Eric Dumazet's avatar
      netfilter: conntrack: annotate data-races around ct->timeout · 802a7dc5
      Eric Dumazet authored
      (struct nf_conn)->timeout can be read/written locklessly,
      add READ_ONCE()/WRITE_ONCE() to prevent load/store tearing.
      
      BUG: KCSAN: data-race in __nf_conntrack_alloc / __nf_conntrack_find_get
      
      write to 0xffff888132e78c08 of 4 bytes by task 6029 on cpu 0:
       __nf_conntrack_alloc+0x158/0x280 net/netfilter/nf_conntrack_core.c:1563
       init_conntrack+0x1da/0xb30 net/netfilter/nf_conntrack_core.c:1635
       resolve_normal_ct+0x502/0x610 net/netfilter/nf_conntrack_core.c:1746
       nf_conntrack_in+0x1c5/0x88f net/netfilter/nf_conntrack_core.c:1901
       ipv6_conntrack_local+0x19/0x20 net/netfilter/nf_conntrack_proto.c:414
       nf_hook_entry_hookfn include/linux/netfilter.h:142 [inline]
       nf_hook_slow+0x72/0x170 net/netfilter/core.c:619
       nf_hook include/linux/netfilter.h:262 [inline]
       NF_HOOK include/linux/netfilter.h:305 [inline]
       ip6_xmit+0xa3a/0xa60 net/ipv6/ip6_output.c:324
       inet6_csk_xmit+0x1a2/0x1e0 net/ipv6/inet6_connection_sock.c:135
       __tcp_transmit_skb+0x132a/0x1840 net/ipv4/tcp_output.c:1402
       tcp_transmit_skb net/ipv4/tcp_output.c:1420 [inline]
       tcp_write_xmit+0x1450/0x4460 net/ipv4/tcp_output.c:2680
       __tcp_push_pending_frames+0x68/0x1c0 net/ipv4/tcp_output.c:2864
       tcp_push_pending_frames include/net/tcp.h:1897 [inline]
       tcp_data_snd_check+0x62/0x2e0 net/ipv4/tcp_input.c:5452
       tcp_rcv_established+0x880/0x10e0 net/ipv4/tcp_input.c:5947
       tcp_v6_do_rcv+0x36e/0xa50 net/ipv6/tcp_ipv6.c:1521
       sk_backlog_rcv include/net/sock.h:1030 [inline]
       __release_sock+0xf2/0x270 net/core/sock.c:2768
       release_sock+0x40/0x110 net/core/sock.c:3300
       sk_stream_wait_memory+0x435/0x700 net/core/stream.c:145
       tcp_sendmsg_locked+0xb85/0x25a0 net/ipv4/tcp.c:1402
       tcp_sendmsg+0x2c/0x40 net/ipv4/tcp.c:1440
       inet6_sendmsg+0x5f/0x80 net/ipv6/af_inet6.c:644
       sock_sendmsg_nosec net/socket.c:704 [inline]
       sock_sendmsg net/socket.c:724 [inline]
       __sys_sendto+0x21e/0x2c0 net/socket.c:2036
       __do_sys_sendto net/socket.c:2048 [inline]
       __se_sys_sendto net/socket.c:2044 [inline]
       __x64_sys_sendto+0x74/0x90 net/socket.c:2044
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      read to 0xffff888132e78c08 of 4 bytes by task 17446 on cpu 1:
       nf_ct_is_expired include/net/netfilter/nf_conntrack.h:286 [inline]
       ____nf_conntrack_find net/netfilter/nf_conntrack_core.c:776 [inline]
       __nf_conntrack_find_get+0x1c7/0xac0 net/netfilter/nf_conntrack_core.c:807
       resolve_normal_ct+0x273/0x610 net/netfilter/nf_conntrack_core.c:1734
       nf_conntrack_in+0x1c5/0x88f net/netfilter/nf_conntrack_core.c:1901
       ipv6_conntrack_local+0x19/0x20 net/netfilter/nf_conntrack_proto.c:414
       nf_hook_entry_hookfn include/linux/netfilter.h:142 [inline]
       nf_hook_slow+0x72/0x170 net/netfilter/core.c:619
       nf_hook include/linux/netfilter.h:262 [inline]
       NF_HOOK include/linux/netfilter.h:305 [inline]
       ip6_xmit+0xa3a/0xa60 net/ipv6/ip6_output.c:324
       inet6_csk_xmit+0x1a2/0x1e0 net/ipv6/inet6_connection_sock.c:135
       __tcp_transmit_skb+0x132a/0x1840 net/ipv4/tcp_output.c:1402
       __tcp_send_ack+0x1fd/0x300 net/ipv4/tcp_output.c:3956
       tcp_send_ack+0x23/0x30 net/ipv4/tcp_output.c:3962
       __tcp_ack_snd_check+0x2d8/0x510 net/ipv4/tcp_input.c:5478
       tcp_ack_snd_check net/ipv4/tcp_input.c:5523 [inline]
       tcp_rcv_established+0x8c2/0x10e0 net/ipv4/tcp_input.c:5948
       tcp_v6_do_rcv+0x36e/0xa50 net/ipv6/tcp_ipv6.c:1521
       sk_backlog_rcv include/net/sock.h:1030 [inline]
       __release_sock+0xf2/0x270 net/core/sock.c:2768
       release_sock+0x40/0x110 net/core/sock.c:3300
       tcp_sendpage+0x94/0xb0 net/ipv4/tcp.c:1114
       inet_sendpage+0x7f/0xc0 net/ipv4/af_inet.c:833
       rds_tcp_xmit+0x376/0x5f0 net/rds/tcp_send.c:118
       rds_send_xmit+0xbed/0x1500 net/rds/send.c:367
       rds_send_worker+0x43/0x200 net/rds/threads.c:200
       process_one_work+0x3fc/0x980 kernel/workqueue.c:2298
       worker_thread+0x616/0xa70 kernel/workqueue.c:2445
       kthread+0x2c7/0x2e0 kernel/kthread.c:327
       ret_from_fork+0x1f/0x30
      
      value changed: 0x00027cc2 -> 0x00000000
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 17446 Comm: kworker/u4:5 Tainted: G        W         5.16.0-rc4-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: krdsd rds_send_worker
      
      Note: I chose an arbitrary commit for the Fixes: tag,
      because I do not think we need to backport this fix to very old kernels.
      
      Fixes: e37542ba ("netfilter: conntrack: avoid possible false sharing")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      802a7dc5
    • Florian Westphal's avatar
      selftests: netfilter: switch zone stress to socat · d46cea0e
      Florian Westphal authored
      centos9 has nmap-ncat which doesn't like the '-q' option, use socat.
      While at it, mark test skipped if needed tools are missing.
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      d46cea0e
    • Pablo Neira Ayuso's avatar
      netfilter: nft_exthdr: break evaluation if setting TCP option fails · 962e5a40
      Pablo Neira Ayuso authored
      Break rule evaluation on malformed TCP options.
      
      Fixes: 99d1712b ("netfilter: exthdr: tcp option set support")
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      962e5a40
    • Stefano Brivio's avatar
      selftests: netfilter: Add correctness test for mac,net set type · 0de53b0f
      Stefano Brivio authored
      The existing net,mac test didn't cover the issue recently reported
      by Nikita Yushchenko, where MAC addresses wouldn't match if given
      as first field of a concatenated set with AVX2 and 8-bit groups,
      because there's a different code path covering the lookup of six
      8-bit groups (MAC addresses) if that's the first field.
      
      Add a similar mac,net test, with MAC address and IPv4 address
      swapped in the set specification.
      Signed-off-by: default avatarStefano Brivio <sbrivio@redhat.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      0de53b0f
    • Stefano Brivio's avatar
      nft_set_pipapo: Fix bucket load in AVX2 lookup routine for six 8-bit groups · b7e945e2
      Stefano Brivio authored
      The sixth byte of packet data has to be looked up in the sixth group,
      not in the seventh one, even if we load the bucket data into ymm6
      (and not ymm5, for convenience of tracking stalls).
      
      Without this fix, matching on a MAC address as first field of a set,
      if 8-bit groups are selected (due to a small set size) would fail,
      that is, the given MAC address would never match.
      Reported-by: default avatarNikita Yushchenko <nikita.yushchenko@virtuozzo.com>
      Cc: <stable@vger.kernel.org> # 5.6.x
      Fixes: 7400b063 ("nft_set_pipapo: Introduce AVX2-based lookup implementation")
      Signed-off-by: default avatarStefano Brivio <sbrivio@redhat.com>
      Tested-By: default avatarNikita Yushchenko <nikita.yushchenko@virtuozzo.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      b7e945e2
    • Nicolas Dichtel's avatar
      vrf: don't run conntrack on vrf with !dflt qdisc · d43b75fb
      Nicolas Dichtel authored
      After the below patch, the conntrack attached to skb is set to "notrack" in
      the context of vrf device, for locally generated packets.
      But this is true only when the default qdisc is set to the vrf device. When
      changing the qdisc, notrack is not set anymore.
      In fact, there is a shortcut in the vrf driver, when the default qdisc is
      set, see commit dcdd43c4 ("net: vrf: performance improvements for
      IPv4") for more details.
      
      This patch ensures that the behavior is always the same, whatever the qdisc
      is.
      
      To demonstrate the difference, a new test is added in conntrack_vrf.sh.
      
      Fixes: 8c9c296a ("vrf: run conntrack only in context of lower/physdev for locally generated packets")
      Signed-off-by: default avatarNicolas Dichtel <nicolas.dichtel@6wind.com>
      Acked-by: default avatarFlorian Westphal <fw@strlen.de>
      Reviewed-by: default avatarDavid Ahern <dsahern@kernel.org>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      d43b75fb
  2. 30 Nov, 2021 18 commits
  3. 29 Nov, 2021 16 commits