1. 07 Nov, 2023 10 commits
  2. 06 Nov, 2023 8 commits
    • David S. Miller's avatar
      Merge branch 'smc-fixes' · c1ed833e
      David S. Miller authored
      D. Wythe says
      
      ====================
      bugfixs for smc
      
      This patches includes bugfix following:
      
      1. hung state
      2. sock leak
      3. potential panic
      
      We have been testing these patches for some time, but
      if you have any questions, please let us know.
      
      --
      v1:
      Fix spelling errors and incorrect function names in descriptions
      
      v2->v1:
      Add fix tags for bugfix patch
      ====================
      Reviewed-by: default avatarWenjia Zhang <wenjia@linux.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c1ed833e
    • D. Wythe's avatar
      net/smc: put sk reference if close work was canceled · aa96fbd6
      D. Wythe authored
      Note that we always hold a reference to sock when attempting
      to submit close_work. Therefore, if we have successfully
      canceled close_work from pending, we MUST release that reference
      to avoid potential leaks.
      
      Fixes: 42bfba9e ("net/smc: immediate termination for SMCD link groups")
      Signed-off-by: default avatarD. Wythe <alibuda@linux.alibaba.com>
      Reviewed-by: default avatarDust Li <dust.li@linux.alibaba.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      aa96fbd6
    • D. Wythe's avatar
      net/smc: allow cdc msg send rather than drop it with NULL sndbuf_desc · c5bf605b
      D. Wythe authored
      This patch re-fix the issues mentioned by commit 22a825c5
      ("net/smc: fix NULL sndbuf_desc in smc_cdc_tx_handler()").
      
      Blocking sending message do solve the issues though, but it also
      prevents the peer to receive the final message. Besides, in logic,
      whether the sndbuf_desc is NULL or not have no impact on the processing
      of cdc message sending.
      
      Hence that, this patch allows the cdc message sending but to check the
      sndbuf_desc with care in smc_cdc_tx_handler().
      
      Fixes: 22a825c5 ("net/smc: fix NULL sndbuf_desc in smc_cdc_tx_handler()")
      Signed-off-by: default avatarD. Wythe <alibuda@linux.alibaba.com>
      Reviewed-by: default avatarDust Li <dust.li@linux.alibaba.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c5bf605b
    • D. Wythe's avatar
      net/smc: fix dangling sock under state SMC_APPFINCLOSEWAIT · 5211c972
      D. Wythe authored
      Considering scenario:
      
      				smc_cdc_rx_handler
      __smc_release
      				sock_set_flag
      smc_close_active()
      sock_set_flag
      
      __set_bit(DEAD)			__set_bit(DONE)
      
      Dues to __set_bit is not atomic, the DEAD or DONE might be lost.
      if the DEAD flag lost, the state SMC_CLOSED  will be never be reached
      in smc_close_passive_work:
      
      if (sock_flag(sk, SOCK_DEAD) &&
      	smc_close_sent_any_close(conn)) {
      	sk->sk_state = SMC_CLOSED;
      } else {
      	/* just shutdown, but not yet closed locally */
      	sk->sk_state = SMC_APPFINCLOSEWAIT;
      }
      
      Replace sock_set_flags or __set_bit to set_bit will fix this problem.
      Since set_bit is atomic.
      
      Fixes: b38d7324 ("smc: socket closing and linkgroup cleanup")
      Signed-off-by: default avatarD. Wythe <alibuda@linux.alibaba.com>
      Reviewed-by: default avatarDust Li <dust.li@linux.alibaba.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5211c972
    • Jakub Kicinski's avatar
      nfsd: regenerate user space parsers after ynl-gen changes · d93f9528
      Jakub Kicinski authored
      Commit 8cea95b0 ("tools: ynl-gen: handle do ops with no input attrs")
      added support for some of the previously-skipped ops in nfsd.
      Regenerate the user space parsers to fill them in.
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      Acked-by: default avatarChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d93f9528
    • Kuniyuki Iwashima's avatar
      tcp: Fix SYN option room calculation for TCP-AO. · 0a8e987d
      Kuniyuki Iwashima authored
      When building SYN packet in tcp_syn_options(), MSS, TS, WS, and
      SACKPERM are used without checking the remaining bytes in the
      options area.
      
      To keep that logic as is, we limit the TCP-AO MAC length in
      tcp_ao_parse_crypto().  Currently, the limit is calculated as below.
      
        MAX_TCP_OPTION_SPACE - TCPOLEN_TSTAMP_ALIGNED
                             - TCPOLEN_WSCALE_ALIGNED
                             - TCPOLEN_SACKPERM_ALIGNED
      
      This looks confusing as (1) we pack SACKPERM into the leading
      2-bytes of the aligned 12-bytes of TS and (2) TCPOLEN_MSS_ALIGNED
      is not used.  Fortunately, the calculated limit is not wrong as
      TCPOLEN_SACKPERM_ALIGNED and TCPOLEN_MSS_ALIGNED are the same value.
      
      However, we should use the proper constant in the formula.
      
        MAX_TCP_OPTION_SPACE - TCPOLEN_MSS_ALIGNED
                             - TCPOLEN_TSTAMP_ALIGNED
                             - TCPOLEN_WSCALE_ALIGNED
      
      Fixes: 4954f17d ("net/tcp: Introduce TCP_AO setsockopt()s")
      Signed-off-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Reviewed-by: default avatarDmitry Safonov <dima@arista.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0a8e987d
    • Geetha sowjanya's avatar
      octeontx2-pf: Free pending and dropped SQEs · 3423ca23
      Geetha sowjanya authored
      On interface down, the pending SQEs in the NIX get dropped
      or drained out during SMQ flush. But skb's pointed by these
      SQEs never get free or updated to the stack as respective CQE
      never get added.
      This patch fixes the issue by freeing all valid skb's in SQ SG list.
      
      Fixes: b1bc8457 ("octeontx2-pf: Cleanup all receive buffers in SG descriptor")
      Signed-off-by: default avatarGeetha sowjanya <gakula@marvell.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3423ca23
    • Jamal Hadi Salim's avatar
      net, sched: Fix SKB_NOT_DROPPED_YET splat under debug config · 40cb2fdf
      Jamal Hadi Salim authored
      Getting the following splat [1] with CONFIG_DEBUG_NET=y and this
      reproducer [2]. Problem seems to be that classifiers clear 'struct
      tcf_result::drop_reason', thereby triggering the warning in
      __kfree_skb_reason() due to reason being 'SKB_NOT_DROPPED_YET' (0).
      
      Fixed by disambiguating a legit error from a verdict with a bogus drop_reason
      
      [1]
      WARNING: CPU: 0 PID: 181 at net/core/skbuff.c:1082 kfree_skb_reason+0x38/0x130
      Modules linked in:
      CPU: 0 PID: 181 Comm: mausezahn Not tainted 6.6.0-rc6-custom-ge43e6d95 #682
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
      RIP: 0010:kfree_skb_reason+0x38/0x130
      [...]
      Call Trace:
       <IRQ>
       __netif_receive_skb_core.constprop.0+0x837/0xdb0
       __netif_receive_skb_one_core+0x3c/0x70
       process_backlog+0x95/0x130
       __napi_poll+0x25/0x1b0
       net_rx_action+0x29b/0x310
       __do_softirq+0xc0/0x29b
       do_softirq+0x43/0x60
       </IRQ>
      
      [2]
      
      ip link add name veth0 type veth peer name veth1
      ip link set dev veth0 up
      ip link set dev veth1 up
      tc qdisc add dev veth1 clsact
      tc filter add dev veth1 ingress pref 1 proto all flower dst_mac 00:11:22:33:44:55 action drop
      mausezahn veth0 -a own -b 00:11:22:33:44:55 -q -c 1
      
      Ido reported:
      
        [...] getting the following splat [1] with CONFIG_DEBUG_NET=y and this
        reproducer [2]. Problem seems to be that classifiers clear 'struct
        tcf_result::drop_reason', thereby triggering the warning in
        __kfree_skb_reason() due to reason being 'SKB_NOT_DROPPED_YET' (0). [...]
      
        [1]
        WARNING: CPU: 0 PID: 181 at net/core/skbuff.c:1082 kfree_skb_reason+0x38/0x130
        Modules linked in:
        CPU: 0 PID: 181 Comm: mausezahn Not tainted 6.6.0-rc6-custom-ge43e6d95 #682
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
        RIP: 0010:kfree_skb_reason+0x38/0x130
        [...]
        Call Trace:
         <IRQ>
         __netif_receive_skb_core.constprop.0+0x837/0xdb0
         __netif_receive_skb_one_core+0x3c/0x70
         process_backlog+0x95/0x130
         __napi_poll+0x25/0x1b0
         net_rx_action+0x29b/0x310
         __do_softirq+0xc0/0x29b
         do_softirq+0x43/0x60
         </IRQ>
      
        [2]
        #!/bin/bash
      
        ip link add name veth0 type veth peer name veth1
        ip link set dev veth0 up
        ip link set dev veth1 up
        tc qdisc add dev veth1 clsact
        tc filter add dev veth1 ingress pref 1 proto all flower dst_mac 00:11:22:33:44:55 action drop
        mausezahn veth0 -a own -b 00:11:22:33:44:55 -q -c 1
      
      What happens is that inside most classifiers the tcf_result is copied over
      from a filter template e.g. *res = f->res which then implicitly overrides
      the prior SKB_DROP_REASON_TC_{INGRESS,EGRESS} default drop code which was
      set via sch_handle_{ingress,egress}() for kfree_skb_reason().
      
      Commit text above copied verbatim from Daniel. The general idea of the patch
      is not very different from what Ido originally posted but instead done at the
      cls_api codepath.
      
      Fixes: 54a59aed ("net, sched: Make tc-related drop reason more flexible")
      Reported-by: default avatarIdo Schimmel <idosch@idosch.org>
      Signed-off-by: default avatarJamal Hadi Salim <jhs@mojatatu.com>
      Link: https://lore.kernel.org/netdev/ZTjY959R+AFXf3Xy@shredderReviewed-by: default avatarSimon Horman <horms@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      40cb2fdf
  3. 03 Nov, 2023 7 commits
    • Jakub Kicinski's avatar
      netlink: fill in missing MODULE_DESCRIPTION() · 016b9332
      Jakub Kicinski authored
      W=1 builds now warn if a module is built without
      a MODULE_DESCRIPTION(). Fill it in for sock_diag.
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      016b9332
    • Eric Dumazet's avatar
      net/tcp: fix possible out-of-bounds reads in tcp_hash_fail() · 02f0717e
      Eric Dumazet authored
      syzbot managed to trigger a fault by sending TCP packets
      with all flags being set.
      
      v2:
       - While fixing this bug, add PSH flag handling and represent
         flags the way tcpdump does : [S], [S.], [P.]
       - Print 4-tuples more consistently between families.
      
      BUG: KASAN: stack-out-of-bounds in string_nocheck lib/vsprintf.c:645 [inline]
      BUG: KASAN: stack-out-of-bounds in string+0x394/0x3d0 lib/vsprintf.c:727
      Read of size 1 at addr ffffc9000397f3f5 by task syz-executor299/5039
      
      CPU: 1 PID: 5039 Comm: syz-executor299 Not tainted 6.6.0-rc7-syzkaller-02075-g55c90047 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023
      Call Trace:
      <TASK>
      __dump_stack lib/dump_stack.c:88 [inline]
      dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106
      print_address_description mm/kasan/report.c:364 [inline]
      print_report+0xc4/0x620 mm/kasan/report.c:475
      kasan_report+0xda/0x110 mm/kasan/report.c:588
      string_nocheck lib/vsprintf.c:645 [inline]
      string+0x394/0x3d0 lib/vsprintf.c:727
      vsnprintf+0xc5f/0x1870 lib/vsprintf.c:2818
      vprintk_store+0x3a0/0xb80 kernel/printk/printk.c:2191
      vprintk_emit+0x14c/0x5f0 kernel/printk/printk.c:2288
      vprintk+0x7b/0x90 kernel/printk/printk_safe.c:45
      _printk+0xc8/0x100 kernel/printk/printk.c:2332
      tcp_inbound_hash.constprop.0+0xdb2/0x10d0 include/net/tcp.h:2760
      tcp_v6_rcv+0x2b31/0x34d0 net/ipv6/tcp_ipv6.c:1882
      ip6_protocol_deliver_rcu+0x33b/0x13d0 net/ipv6/ip6_input.c:438
      ip6_input_finish+0x14f/0x2f0 net/ipv6/ip6_input.c:483
      NF_HOOK include/linux/netfilter.h:314 [inline]
      NF_HOOK include/linux/netfilter.h:308 [inline]
      ip6_input+0xce/0x440 net/ipv6/ip6_input.c:492
      dst_input include/net/dst.h:461 [inline]
      ip6_rcv_finish net/ipv6/ip6_input.c:79 [inline]
      NF_HOOK include/linux/netfilter.h:314 [inline]
      NF_HOOK include/linux/netfilter.h:308 [inline]
      ipv6_rcv+0x563/0x720 net/ipv6/ip6_input.c:310
      __netif_receive_skb_one_core+0x115/0x180 net/core/dev.c:5527
      __netif_receive_skb+0x1f/0x1b0 net/core/dev.c:5641
      netif_receive_skb_internal net/core/dev.c:5727 [inline]
      netif_receive_skb+0x133/0x700 net/core/dev.c:5786
      tun_rx_batched+0x429/0x780 drivers/net/tun.c:1579
      tun_get_user+0x29e7/0x3bc0 drivers/net/tun.c:2002
      tun_chr_write_iter+0xe8/0x210 drivers/net/tun.c:2048
      call_write_iter include/linux/fs.h:1956 [inline]
      new_sync_write fs/read_write.c:491 [inline]
      vfs_write+0x650/0xe40 fs/read_write.c:584
      ksys_write+0x12f/0x250 fs/read_write.c:637
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      Fixes: 2717b5ad ("net/tcp: Add tcp_hash_fail() ratelimited logs")
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Dmitry Safonov <dima@arista.com>
      Cc: Francesco Ruggeri <fruggeri@arista.com>
      Cc: David Ahern <dsahern@kernel.org>
      Reviewed-by: default avatarDmitry Safonov <dima@arista.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      02f0717e
    • Ronald Wahl's avatar
      net: ethernet: ti: am65-cpsw: rx_pause/tx_pause controls wrong direction · 153a58c6
      Ronald Wahl authored
      The rx_pause flag says that whether we support receiving Pause frames.
      When a Pause frame is received TX is delayed for some time. This is TX
      flow control. In the same manner tx_pause is actually RX flow control.
      Signed-off-by: default avatarRonald Wahl <ronald.wahl@raritan.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      153a58c6
    • Eric Dumazet's avatar
      tcp: fix fastopen code vs usec TS · cdbab623
      Eric Dumazet authored
      After blamed commit, TFO client-ack-dropped-then-recovery-ms-timestamps
      packetdrill test failed.
      
      David Morley and Neal Cardwell started investigating and Neal pointed
      that we had :
      
      tcp_conn_request()
        tcp_try_fastopen()
         -> tcp_fastopen_create_child
           -> child = inet_csk(sk)->icsk_af_ops->syn_recv_sock()
             -> tcp_create_openreq_child()
                -> copy req_usec_ts from req:
                newtp->tcp_usec_ts = treq->req_usec_ts;
                // now the new TFO server socket always does usec TS, no matter
                // what the route options are...
        send_synack()
          -> tcp_make_synack()
              // disable tcp_rsk(req)->req_usec_ts if route option is not present:
              if (tcp_rsk(req)->req_usec_ts < 0)
                      tcp_rsk(req)->req_usec_ts = dst_tcp_usec_ts(dst);
      
      tcp_conn_request() has the initial dst, we can initialize
      tcp_rsk(req)->req_usec_ts there instead of later in send_synack();
      
      This means tcp_rsk(req)->req_usec_ts can be a boolean.
      
      Many thanks to David an Neal for their help.
      
      Fixes: 614e8316 ("tcp: add support for usec resolution in TCP TS values")
      Reported-by: default avatarkernel test robot <oliver.sang@intel.com>
      Closes: https://lore.kernel.org/oe-lkp/202310302216.f79d78bc-oliver.sang@intel.comSuggested-by: default avatarNeal Cardwell <ncardwell@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: David Morley <morleyd@google.com>
      Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cdbab623
    • Hangbin Liu's avatar
      selftests: pmtu.sh: fix result checking · 63e20191
      Hangbin Liu authored
      In the PMTU test, when all previous tests are skipped and the new test
      passes, the exit code is set to 0. However, the current check mistakenly
      treats this as an assignment, causing the check to pass every time.
      
      Consequently, regardless of how many tests have failed, if the latest test
      passes, the PMTU test will report a pass.
      
      Fixes: 2a9d3716 ("selftests: pmtu.sh: improve the test result processing")
      Signed-off-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Acked-by: default avatarPo-Hsu Lin <po-hsu.lin@canonical.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      63e20191
    • Furong Xu's avatar
      net: stmmac: xgmac: Enable support for multiple Flexible PPS outputs · db456d90
      Furong Xu authored
      From XGMAC Core 3.20 and later, each Flexible PPS has individual PPSEN bit
      to select Fixed mode or Flexible mode. The PPSEN must be set, or it stays
      in Fixed PPS mode by default.
      XGMAC Core prior 3.20, only PPSEN0(bit 4) is writable. PPSEN{1,2,3} are
      read-only reserved, and they are already in Flexible mode by default, our
      new code always set PPSEN{1,2,3} do not make things worse ;-)
      
      Fixes: 95eaf3cd ("net: stmmac: dwxgmac: Add Flexible PPS support")
      Reviewed-by: default avatarSerge Semin <fancer.lancer@gmail.com>
      Reviewed-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Signed-off-by: default avatarFurong Xu <0x1207@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      db456d90
    • NeilBrown's avatar
      Fix termination state for idr_for_each_entry_ul() · e8ae8ad4
      NeilBrown authored
      The comment for idr_for_each_entry_ul() states
      
        after normal termination @entry is left with the value NULL
      
      This is not correct in the case where UINT_MAX has an entry in the idr.
      In that case @entry will be non-NULL after termination.
      No current code depends on the documentation being correct, but to
      save future code we should fix it.
      
      Also fix idr_for_each_entry_continue_ul().  While this is not documented
      as leaving @entry as NULL, the mellanox driver appears to depend on
      it doing so.  So make that explicit in the documentation as well as in
      the code.
      
      Fixes: e33d2b74 ("idr: fix overflow case for idr_for_each_entry_ul()")
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Chris Mi <chrism@mellanox.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e8ae8ad4
  4. 02 Nov, 2023 15 commits