1. 06 Jun, 2023 5 commits
    • netfilter: nf_tables: out-of-bound check in chain blob · 08e42a0d
      Pablo Neira Ayuso authored
      Add current size of rule expressions to the boundary check.
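A minimal sketch of this class of boundary check, assuming the bug shape is "check the write offset but not the size about to be written". The function names and the flat `used`/`capacity` model are hypothetical and are not taken from the nf_tables code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical buggy shape: only the current offset is compared against the
 * capacity, so a write that starts in bounds may still run past the end. */
static bool blob_fits_buggy(size_t used, size_t size, size_t capacity)
{
    (void)size; /* the size of the pending write is ignored */
    return used <= capacity;
}

/* Hypothetical fixed shape: the size of the pending write is part of the
 * check. Written as a subtraction so that used + size cannot overflow. */
static bool blob_fits_fixed(size_t used, size_t size, size_t capacity)
{
    return used <= capacity && size <= capacity - used;
}
```

With 90 bytes used out of 100, a 20-byte write passes the first check but is rejected by the second.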
      
      Fixes: 2c865a8a ("netfilter: nf_tables: add rule blob layout")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: ipset: Add schedule point in call_ad(). · 24e22789
      Kuniyuki Iwashima authored
      syzkaller found a repro that causes Hung Task [0] with ipset.  The repro
      first creates an ipset and then tries to delete a large number of IPs
      from the ipset concurrently:
      
        IPSET_ATTR_IPADDR_IPV4 : 172.20.20.187
        IPSET_ATTR_CIDR        : 2
      
      The first deleting thread hogs a CPU with nfnl_lock(NFNL_SUBSYS_IPSET)
      held, and other threads wait for it to be released.
      
      Previously, the same issue existed in set->variant->uadt(), which could
      run for a long time under ip_set_lock(set).  Commit 5e29dc36 ("netfilter:
      ipset: Rework long task execution when adding/deleting entries") tried to
      fix it, but the issue still exists in the caller with another mutex.
      
      While adding/deleting many IPs, we should release the CPU periodically to
      prevent someone from abusing ipset to hang the system.
      
      Note we need to increment the ipset's refcnt to prevent the ipset from
      being destroyed while rescheduling.
      
      [0]:
      INFO: task syz-executor174:268 blocked for more than 143 seconds.
            Not tainted 6.4.0-rc1-00145-gba79e9a7 #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      task:syz-executor174 state:D stack:0     pid:268   ppid:260    flags:0x0000000d
      Call trace:
       __switch_to+0x308/0x714 arch/arm64/kernel/process.c:556
       context_switch kernel/sched/core.c:5343 [inline]
       __schedule+0xd84/0x1648 kernel/sched/core.c:6669
       schedule+0xf0/0x214 kernel/sched/core.c:6745
       schedule_preempt_disabled+0x58/0xf0 kernel/sched/core.c:6804
       __mutex_lock_common kernel/locking/mutex.c:679 [inline]
       __mutex_lock+0x6fc/0xdb0 kernel/locking/mutex.c:747
       __mutex_lock_slowpath+0x14/0x20 kernel/locking/mutex.c:1035
       mutex_lock+0x98/0xf0 kernel/locking/mutex.c:286
       nfnl_lock net/netfilter/nfnetlink.c:98 [inline]
       nfnetlink_rcv_msg+0x480/0x70c net/netfilter/nfnetlink.c:295
       netlink_rcv_skb+0x1c0/0x350 net/netlink/af_netlink.c:2546
       nfnetlink_rcv+0x18c/0x199c net/netfilter/nfnetlink.c:658
       netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline]
       netlink_unicast+0x664/0x8cc net/netlink/af_netlink.c:1365
       netlink_sendmsg+0x6d0/0xa4c net/netlink/af_netlink.c:1913
       sock_sendmsg_nosec net/socket.c:724 [inline]
       sock_sendmsg net/socket.c:747 [inline]
       ____sys_sendmsg+0x4b8/0x810 net/socket.c:2503
       ___sys_sendmsg net/socket.c:2557 [inline]
       __sys_sendmsg+0x1f8/0x2a4 net/socket.c:2586
       __do_sys_sendmsg net/socket.c:2595 [inline]
       __se_sys_sendmsg net/socket.c:2593 [inline]
       __arm64_sys_sendmsg+0x80/0x94 net/socket.c:2593
       __invoke_syscall arch/arm64/kernel/syscall.c:38 [inline]
       invoke_syscall+0x84/0x270 arch/arm64/kernel/syscall.c:52
       el0_svc_common+0x134/0x24c arch/arm64/kernel/syscall.c:142
       do_el0_svc+0x64/0x198 arch/arm64/kernel/syscall.c:193
       el0_svc+0x2c/0x7c arch/arm64/kernel/entry-common.c:637
       el0t_64_sync_handler+0x84/0xf0 arch/arm64/kernel/entry-common.c:655
       el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:591
      Reported-by: syzkaller <syzkaller@googlegroups.com>
      Fixes: a7b4f989 ("netfilter: ipset: IP set core support")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: conntrack: fix NULL pointer dereference in nf_confirm_cthelper · e1f543dc
      Tijs Van Buggenhout authored
      An nf_conntrack_helper from nf_conn_help may become NULL after DNAT.
      
      Observed when TCP port 1720 (Q931_PORT), associated with h323 conntrack
      helper, is DNAT'ed to another destination port (e.g. 1730), while
      nfqueue is being used for final acceptance (e.g. snort).
      
      This happened after the transition from kernel 4.14 to 5.10.161.
      
      Workarounds:
       * keep the same port (1720) in DNAT
       * disable nfqueue
       * disable/unload h323 NAT helper
      
      $ linux-5.10/scripts/decode_stacktrace.sh vmlinux < /tmp/kernel.log
      BUG: kernel NULL pointer dereference, address: 0000000000000084
      [..]
      RIP: 0010:nf_conntrack_update (net/netfilter/nf_conntrack_core.c:2080 net/netfilter/nf_conntrack_core.c:2134) nf_conntrack
      [..]
      nfqnl_reinject (net/netfilter/nfnetlink_queue.c:237) nfnetlink_queue
      nfqnl_recv_verdict (net/netfilter/nfnetlink_queue.c:1230) nfnetlink_queue
      nfnetlink_rcv_msg (net/netfilter/nfnetlink.c:241) nfnetlink
      [..]
      
      Fixes: ee04805f ("netfilter: conntrack: make conntrack userspace helpers work again")
      Signed-off-by: Tijs Van Buggenhout <tijs.van.buggenhout@axsguard.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nft_bitwise: fix register tracking · 14e8b293
      Jeremy Sowden authored
      At the end of `nft_bitwise_reduce`, there is a loop which is intended to
      update the bitwise expression associated with each tracked destination
      register.  However, currently, it just updates the first register
      repeatedly.  Fix it.
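The kind of loop bug described above can be sketched in plain C. The struct and function names below are hypothetical stand-ins for illustration and do not reproduce `nft_bitwise_reduce` or the kernel's tracking structures.

```c
#include <stddef.h>

#define NUM_REGS 4

/* Hypothetical per-register tracking slot; not the kernel's struct. */
struct reg_track {
    const void *bitwise; /* expression currently tracked for this register */
};

/* Buggy shape: the loop runs 'count' times but never uses the index,
 * so only the first slot is written, repeatedly. */
static void track_update_buggy(struct reg_track *regs, size_t first,
                               size_t count, const void *expr)
{
    for (size_t i = 0; i < count; i++)
        regs[first].bitwise = expr;
}

/* Fixed shape: every slot in [first, first + count) is updated. */
static void track_update_fixed(struct reg_track *regs, size_t first,
                               size_t count, const void *expr)
{
    for (size_t i = 0; i < count; i++)
        regs[first + i].bitwise = expr;
}
```

Calling both variants with `first = 1` and `count = 2` leaves slot 2 untouched in the buggy version and updated in the fixed one.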
      
      Fixes: 34cc9e52 ("netfilter: nf_tables: cancel tracking for clobbered destination registers")
      Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nf_tables: Add null check for nla_nest_start_noflag() in nft_dump_basechain_hook() · bd058763
      Gavrilov Ilia authored
      The nla_nest_start_noflag() function may fail and return NULL;
      the return value needs to be checked.
      
      Found by InfoTeCS on behalf of Linux Verification Center
      (linuxtesting.org) with SVACE.
      
      Fixes: d54725cd ("netfilter: nf_tables: support for multiple devices per netdev hook")
      Signed-off-by: Gavrilov Ilia <Ilia.Gavrilov@infotecs.ru>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  2. 19 May, 2023 16 commits
    • net: fec: add dma_wmb to ensure correct descriptor values · 9025944f
      Shenwei Wang authored
      Two dma_wmb() calls are added in the XDP TX path to ensure proper
      ordering of descriptor and buffer updates, and transmission now starts
      as soon as the descriptor is configured:
      1. A dma_wmb() is added after updating the last BD to make sure
         the updates to the rest of the descriptor are visible before
         transferring ownership to FEC.
      2. A dma_wmb() is also added after updating the bdp to ensure these
         updates are visible before updating txq->bd.cur.
      3. Start the xmit of the frame immediately right after configuring the
         tx descriptor.
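The ordering idea in the list above (fill in the descriptor, then a write barrier, then hand over ownership) can be sketched as a user-space analogy with C11 atomics. This is only an analogy: dma_wmb() orders CPU writes as observed by a device, which C11 fences do not model, and the descriptor layout and names here are hypothetical rather than the FEC driver's.

```c
#include <stdatomic.h>
#include <stdint.h>

#define DESC_READY 0x8000u /* hypothetical "owned by consumer" flag */

/* Hypothetical transmit descriptor; not the FEC buffer descriptor layout. */
struct desc {
    uint32_t buf_addr;
    uint16_t len;
    _Atomic uint16_t status;
};

/* Write the payload fields first, then publish ownership. The release
 * fence keeps the field writes ordered before the status store, so a
 * consumer that observes DESC_READY with acquire semantics also sees
 * buf_addr and len. */
static void publish_desc(struct desc *d, uint32_t addr, uint16_t len)
{
    d->buf_addr = addr;
    d->len = len;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&d->status, DESC_READY, memory_order_relaxed);
}
```

Without the fence, the status store could become visible before the field writes, which is the user-space counterpart of a device seeing a ready descriptor with stale contents.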
      
      Fixes: 6d6b39f1 ("net: fec: add initial XDP support")
      Signed-off-by: Shenwei Wang <shenwei.wang@nxp.com>
      Reviewed-by: Wei Fang <wei.fang@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • MAINTAINERS: add myself as maintainer for enetc · 3be5f6cd
      Vladimir Oltean authored
      I would like to be copied on new patches submitted on this driver.
      I am relatively familiar with the code, having practically maintained
      it for a while.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Acked-by: Claudiu Manoil <claudiu.manoil@nxp.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeontx2-pf: Fix TSOv6 offload · de678ca3
      Sunil Goutham authored
      HW adds segment size to the payload length
      in the IPv6 header. Fix payload length to
      just TCP header length instead of 'TCP header
      size + IPv6 header size'.
      
      Fixes: 86d74760 ("octeontx2-pf: TCP segmentation offload support")
      Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
      Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: fix devlink info error handling · cfcb9428
      Alejandro Lucero authored
      Avoid early devlink info return if errors arise with MCDI commands
      executed for getting the required info from the device. The rationale
      is some commands can fail but later ones could still give useful data.
      Moreover, some nvram partitions might not be present, which needs to be
      handled as a non-error.
      
      The specific errors are reported through system messages and if any
      error appears, it will be reported generically through extack.
      
      Fixes: 14743ddd ("sfc: add devlink info support for ef100")
      Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
      Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/smc: Reset connection when trying to use SMCRv2 fails. · 35112271
      Wen Gu authored
      We found a crash when using SMCRv2 with 2 Mellanox ConnectX-4. It
      can be reproduced by:
      
      - smc_run nginx
      - smc_run wrk -t 32 -c 500 -d 30 http://<ip>:<port>
      
       BUG: kernel NULL pointer dereference, address: 0000000000000014
       #PF: supervisor read access in kernel mode
       #PF: error_code(0x0000) - not-present page
       PGD 8000000108713067 P4D 8000000108713067 PUD 151127067 PMD 0
       Oops: 0000 [#1] PREEMPT SMP PTI
       CPU: 4 PID: 2441 Comm: kworker/4:249 Kdump: loaded Tainted: G        W   E      6.4.0-rc1+ #42
       Workqueue: smc_hs_wq smc_listen_work [smc]
       RIP: 0010:smc_clc_send_confirm_accept+0x284/0x580 [smc]
       RSP: 0018:ffffb8294b2d7c78 EFLAGS: 00010a06
       RAX: ffff8f1873238880 RBX: ffffb8294b2d7dc8 RCX: 0000000000000000
       RDX: 00000000000000b4 RSI: 0000000000000001 RDI: 0000000000b40c00
       RBP: ffffb8294b2d7db8 R08: ffff8f1815c5860c R09: 0000000000000000
       R10: 0000000000000400 R11: 0000000000000000 R12: ffff8f1846f56180
       R13: ffff8f1815c5860c R14: 0000000000000001 R15: 0000000000000001
       FS:  0000000000000000(0000) GS:ffff8f1aefd00000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000000014 CR3: 00000001027a0001 CR4: 00000000003706e0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       Call Trace:
        <TASK>
        ? mlx5_ib_map_mr_sg+0xa1/0xd0 [mlx5_ib]
        ? smcr_buf_map_link+0x24b/0x290 [smc]
        ? __smc_buf_create+0x4ee/0x9b0 [smc]
        smc_clc_send_accept+0x4c/0xb0 [smc]
        smc_listen_work+0x346/0x650 [smc]
        ? __schedule+0x279/0x820
        process_one_work+0x1e5/0x3f0
        worker_thread+0x4d/0x2f0
        ? __pfx_worker_thread+0x10/0x10
        kthread+0xe5/0x120
        ? __pfx_kthread+0x10/0x10
        ret_from_fork+0x2c/0x50
        </TASK>
      
      During the CLC handshake, server sequentially tries available SMCRv2
      and SMCRv1 devices in smc_listen_work().
      
      If an SMCRv2 device is found, an SMCv2-based link group and link will be
      assigned to the connection. Suppose some buffer assignment errors then
      happen later in the CLC handshake, such as an RMB registration failure:
      the server will give up on SMCRv2 and try an SMCRv1 device instead, but
      the resources assigned to the connection won't be reset.
      
      When the server tries the SMCRv1 device, the connection creation process
      is executed again. Since conn->lnk was already assigned when trying
      SMCRv2, it is not set to the correct SMCRv1 link in
      smcr_lgr_conn_assign_link(). In this situation, conn->lgr points to the
      correct SMCRv1 link group but conn->lnk mistakenly points to the SMCRv2
      link.

      Then in smc_clc_send_confirm_accept(), conn->rmb_desc->mr[link->link_idx]
      is accessed. Since link->link_idx is not correct, the related MR may not
      have been initialized, so a crash occurs.
      
       | Try SMCRv2 device first
       |     |-> conn->lgr:	assign existed SMCRv2 link group;
       |     |-> conn->link:	assign existed SMCRv2 link (link_idx may be 1 in SMC_LGR_SYMMETRIC);
       |     |-> sndbuf & RMB creation fails, quit;
       |
       | Try SMCRv1 device then
       |     |-> conn->lgr:	create SMCRv1 link group and assign;
       |     |-> conn->link:	keep SMCRv2 link mistakenly;
       |     |-> sndbuf & RMB creation succeed, only RMB->mr[link_idx = 0]
       |         initialized.
       |
       | Then smc_clc_send_confirm_accept() accesses
       | conn->rmb_desc->mr[conn->link->link_idx, which is 1], then crash.
       v
      
      This patch tries to fix this by cleaning conn->lnk before assigning
      link. In addition, it is better to reset the connection and clean the
      resources assigned if trying SMCRv2 failed in buffer creation or
      registration.
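The stale-pointer pattern described above can be sketched in a few lines of C. The types and helpers are hypothetical simplifications; they only mirror the idea that an "assign if unset" helper silently keeps a link left over from an earlier, failed attempt.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the SMC structures. */
struct link { int link_idx; };
struct link_group { struct link links[2]; };
struct conn {
    struct link_group *lgr;
    struct link *lnk;
};

/* Assigns the group's first link only when the connection has none yet. */
static void assign_link(struct conn *c)
{
    if (c->lnk)
        return; /* an already-set link is kept as-is */
    c->lnk = &c->lgr->links[0];
}

/* Buggy retry: the group is replaced but the old link survives. */
static void attach_keep_stale(struct conn *c, struct link_group *lgr)
{
    c->lgr = lgr;
    assign_link(c);
}

/* Fixed retry: the link is cleared before assignment. */
static void attach_with_reset(struct conn *c, struct link_group *lgr)
{
    c->lgr = lgr;
    c->lnk = NULL;
    assign_link(c);
}
```

After a failed first attempt that left `lnk` pointing into the first group, `attach_keep_stale()` ends with `lgr` and `lnk` referring to different groups, while `attach_with_reset()` keeps them consistent.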
      
      Fixes: e49300a6 ("net/smc: add listen processing for SMC-Rv2")
      Link: https://lore.kernel.org/r/20220523055056.2078994-1-liuyacan@corp.netease.com/
      Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
      Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: fib_tests: mute cleanup error message · d226b1df
      Po-Hsu Lin authored
      At the end of the test, there is an error message caused by the
      `ip netns del ns1` command in cleanup():
      
        Tests passed: 201
        Tests failed:   0
        Cannot remove namespace file "/run/netns/ns1": No such file or directory
      
      This can even be reproduced with just `./fib_tests.sh -h` as we're
      calling cleanup() on exit.
      
      Redirect the error message to /dev/null to mute it.
      
      V2: Update commit message and fixes tag.
      V3: resubmit due to missing netdev ML in V2
      
      Fixes: b60417a9 ("selftest: fib_tests: Always cleanup before exit")
      Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: do as little as possible in napi poll when budget is 0 · afbed3f7
      Jakub Kicinski authored
      NAPI gets called with budget of 0 from netpoll, which has interrupts
      disabled. We should try to free some space on Tx rings and nothing
      else.
      
      Specifically do not try to handle XDP TX or try to refill Rx buffers -
      we can't use the page pool from IRQ context. Don't check if IRQs moved,
      either, that makes no sense in netpoll. Netpoll calls _all_ the rings
      from whatever CPU it happens to be invoked on.
      
      In general do as little as possible, the work quickly adds up when
      there's tens of rings to poll.
      
      The immediate stack trace I was seeing is:
      
          __do_softirq+0xd1/0x2c0
          __local_bh_enable_ip+0xc7/0x120
          </IRQ>
          <TASK>
          page_pool_put_defragged_page+0x267/0x320
          mlx5e_free_xdpsq_desc+0x99/0xd0
          mlx5e_poll_xdpsq_cq+0x138/0x3b0
          mlx5e_napi_poll+0xc3/0x8b0
          netpoll_poll_dev+0xce/0x150
      
      AFAIU page pool takes a BH lock, releases it and since BH is now
      enabled tries to run softirqs.
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Fixes: 60bbf7ee ("mlx5: use page_pool for xdp_return_frame call")
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'tls-fixes' · 2897041e
      David S. Miller authored
      Jakub Kicinski says:
      
      ====================
      tls: rx: strp: fix inline crypto offload
      
      The local strparser version I added to TLS does not preserve
      decryption status, which breaks inline crypto (NIC offload).
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tls: rx: strp: don't use GFP_KERNEL in softirq context · 74836ec8
      Jakub Kicinski authored
      When receive buffer is small, or the TCP rx queue looks too
      complicated to bother using it directly - we allocate a new
      skb and copy data into it.
      
      We already use sk->sk_allocation... but nothing actually
      sets it to GFP_ATOMIC on the ->sk_data_ready() path.
      
      Users of HW offload are far more likely to experience problems
      due to scheduling while atomic. "Copy mode" is very rarely
      triggered with SW crypto.
      
      Fixes: 84c61fe1 ("tls: rx: do not use the standard strparser")
      Tested-by: Shai Amiram <samiram@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tls: rx: strp: preserve decryption status of skbs when needed · eca9bfaf
      Jakub Kicinski authored
      When receive buffer is small we try to copy out the data from
      TCP into a skb maintained by TLS to prevent connection from
      stalling. Unfortunately if a single record is made up of a mix
      of decrypted and non-decrypted skbs combining them into a single
      skb leads to loss of decryption status, resulting in decryption
      errors or data corruption.
      
      Similarly when trying to use TCP receive queue directly we need
      to make sure that all the skbs within the record have the same
      status. If we don't the mixed status will be detected correctly
      but we'll CoW the anchor, again collapsing it into a single paged
      skb without decrypted status preserved. So the "fixup" code will
      not know which parts of skb to re-encrypt.
      
      Fixes: 84c61fe1 ("tls: rx: do not use the standard strparser")
      Tested-by: Shai Amiram <samiram@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tls: rx: strp: factor out copying skb data · c1c607b1
      Jakub Kicinski authored
      We'll need to copy input skbs individually in the next patch.
      Factor that code out (without assuming we're copying a full record).
      Tested-by: Shai Amiram <samiram@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tls: rx: strp: fix determining record length in copy mode · 8b0c0dc9
      Jakub Kicinski authored
      We call tls_rx_msg_size(skb) before doing skb->len += chunk.
      So the tls_rx_msg_size() code will see old skb->len, most
      likely leading to an over-read.
      
      Worst case we will over read an entire record, next iteration
      will try to trim the skb but may end up turning frag len negative
      or discarding the subsequent record (since we already told TCP
      we've read it during previous read but now we'll trim it out of
      the skb).
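A minimal sketch of the ordering problem, assuming a toy buffer with a 5-byte header whose last two bytes carry a big-endian payload length (loosely modelled on a TLS record header). The buffer type and function names are hypothetical and do not correspond to tls_rx_msg_size() or skb handling; the sketch only shows how parsing before updating the length makes the parser work from stale state.

```c
#include <stddef.h>
#include <string.h>

#define HDR_LEN 5

/* Hypothetical receive buffer; a stand-in for an skb, not a kernel type. */
struct rec_buf {
    unsigned char data[64];
    size_t len; /* bytes currently valid in data[] */
};

/* Returns 0 while the header is incomplete, else header + payload length. */
static size_t record_size(const struct rec_buf *b)
{
    if (b->len < HDR_LEN)
        return 0;
    return HDR_LEN + (((size_t)b->data[3] << 8) | b->data[4]);
}

/* Buggy order: the size is computed before len covers the new chunk.
 * The caller must ensure b->len + n fits in data[]. */
static size_t append_size_stale(struct rec_buf *b,
                                const unsigned char *chunk, size_t n)
{
    memcpy(b->data + b->len, chunk, n);
    size_t sz = record_size(b); /* still sees the old len */
    b->len += n;
    return sz;
}

/* Fixed order: account for the chunk first, then parse. */
static size_t append_size_fresh(struct rec_buf *b,
                                const unsigned char *chunk, size_t n)
{
    memcpy(b->data + b->len, chunk, n);
    b->len += n;
    return record_size(b);
}
```

In the stale variant the parser still reports an incomplete header after the chunk that completes it has arrived, so a caller relying on the result would keep reading past the point where it should have stopped.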
      
      Fixes: 84c61fe1 ("tls: rx: do not use the standard strparser")
      Tested-by: Shai Amiram <samiram@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tls: rx: strp: force mixed decrypted records into copy mode · 14c4be92
      Jakub Kicinski authored
      If a record is partially decrypted we'll have to CoW it, anyway,
      so go into copy mode and allocate a writable skb right away.
      
      This will make subsequent fix simpler because we won't have to
      teach tls_strp_msg_make_copy() how to copy skbs while preserving
      decrypt status.
      Tested-by: Shai Amiram <samiram@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tls: rx: strp: set the skb->len of detached / CoW'ed skbs · 210620ae
      Jakub Kicinski authored
      alloc_skb_with_frags() fills in page frag sizes but does not
      set skb->len and skb->data_len. Set those correctly otherwise
      device offload will most likely generate an empty skb and
      hit the BUG() at the end of __skb_nsg().
      
      Fixes: 84c61fe1 ("tls: rx: do not use the standard strparser")
      Tested-by: Shai Amiram <samiram@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tls: rx: device: fix checking decryption status · b3a03b54
      Jakub Kicinski authored
      skb->len covers the entire skb, including the frag_list.
      In fact we're guaranteed that rxm->full_len <= skb->len,
      so since the change under Fixes we were not checking decrypt
      status of any skb but the first.
      
      Note that the skb_pagelen() added here may feel a bit costly,
      but it's removed by subsequent fixes, anyway.
      Reported-by: Tariq Toukan <tariqt@nvidia.com>
      Fixes: 86b259f6 ("tls: rx: device: bound the frag walk")
      Tested-by: Shai Amiram <samiram@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: cdc_ncm: Deal with too low values of dwNtbOutMaxSize · 7e01c7f7
      Tudor Ambarus authored
      Currently in cdc_ncm_check_tx_max(), if dwNtbOutMaxSize is lower than
      the calculated "min" value, but greater than zero, the logic sets
      tx_max to dwNtbOutMaxSize. This is then used to allocate a new SKB in
      cdc_ncm_fill_tx_frame() where all the data is handled.
      
      For small values of dwNtbOutMaxSize the memory allocated during
      alloc_skb(dwNtbOutMaxSize, GFP_ATOMIC) will have the same size, due to
      how size is aligned at alloc time:
        size = SKB_DATA_ALIGN(size);
        size += SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
      Thus we hit the same bug that we tried to squash with
      commit 2be6d4d1 ("net: cdc_ncm: Allow for dwNtbOutMaxSize to be unset or zero")
      
      Low values of dwNtbOutMaxSize do not cause an issue presently because at
      alloc_skb() time more memory (512b) is allocated than required for the
      SKB headers alone (320b), leaving some space (512b - 320b = 192b)
      for CDC data (172b).
      
      However, if more elements (for example 3 x u64 = [24b]) were added to
      one of the SKB header structs, say 'struct skb_shared_info',
      increasing its original size (320b [320b aligned]) to something larger
      (344b [384b aligned]), then suddenly the CDC data (172b) no longer
      fits in the spare SKB data area (512b - 384b = 128b).
      
      Consequently the SKB bounds checking semantics fails and panics:
      
      skbuff: skb_over_panic: text:ffffffff831f755b len:184 put:172 head:ffff88811f1c6c00 data:ffff88811f1c6c00 tail:0xb8 end:0x80 dev:<NULL>
      ------------[ cut here ]------------
      kernel BUG at net/core/skbuff.c:113!
      invalid opcode: 0000 [#1] PREEMPT SMP KASAN
      CPU: 0 PID: 57 Comm: kworker/0:2 Not tainted 5.15.106-syzkaller-00249-g19c0ed55a470 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/14/2023
      Workqueue: mld mld_ifc_work
      RIP: 0010:skb_panic net/core/skbuff.c:113 [inline]
      RIP: 0010:skb_over_panic+0x14c/0x150 net/core/skbuff.c:118
      [snip]
      Call Trace:
       <TASK>
       skb_put+0x151/0x210 net/core/skbuff.c:2047
       skb_put_zero include/linux/skbuff.h:2422 [inline]
       cdc_ncm_ndp16 drivers/net/usb/cdc_ncm.c:1131 [inline]
       cdc_ncm_fill_tx_frame+0x11ab/0x3da0 drivers/net/usb/cdc_ncm.c:1308
       cdc_ncm_tx_fixup+0xa3/0x100
      
      Deal with too low values of dwNtbOutMaxSize, clamp it in the range
      [USB_CDC_NCM_NTB_MIN_OUT_SIZE, CDC_NCM_NTB_MAX_SIZE_TX]. We ensure
      enough data space is allocated to handle CDC data by making sure
      dwNtbOutMaxSize is not smaller than USB_CDC_NCM_NTB_MIN_OUT_SIZE.
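The clamping described above can be sketched as follows. The bounds and names here are illustrative placeholders, not values or identifiers copied from the kernel headers; in the driver the limits are USB_CDC_NCM_NTB_MIN_OUT_SIZE and CDC_NCM_NTB_MAX_SIZE_TX.

```c
#include <stdint.h>

/* Illustrative bounds only; the real limits are defined in kernel headers. */
#define EXAMPLE_NTB_MIN_OUT_SIZE 2048u
#define EXAMPLE_NTB_MAX_SIZE_TX  32768u

/* Restricts val to the closed range [lo, hi]; assumes lo <= hi. */
static uint32_t clamp_u32(uint32_t val, uint32_t lo, uint32_t hi)
{
    if (val < lo)
        return lo;
    if (val > hi)
        return hi;
    return val;
}

/* Hypothetical helper: derive a safe tx_max from a device-reported
 * dwNtbOutMaxSize, which may be zero or unreasonably small. */
static uint32_t example_tx_max(uint32_t dw_ntb_out_max_size)
{
    return clamp_u32(dw_ntb_out_max_size,
                     EXAMPLE_NTB_MIN_OUT_SIZE,
                     EXAMPLE_NTB_MAX_SIZE_TX);
}
```

With these placeholder bounds, a device reporting 172 bytes or 0 ends up with the minimum, so the later allocation always has room for the CDC data.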
      
      Fixes: 289507d3 ("net: cdc_ncm: use sysfs for rx/tx aggregation tuning")
      Cc: stable@vger.kernel.org
      Reported-by: syzbot+9f575a1f15fc0c01ed69@syzkaller.appspotmail.com
      Link: https://syzkaller.appspot.com/bug?extid=b982f1059506db48409d
      Link: https://lore.kernel.org/all/20211202143437.1411410-1-lee.jones@linaro.org/
      Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Link: https://lore.kernel.org/r/20230517133808.1873695-2-tudor.ambarus@linaro.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  3. 18 May, 2023 10 commits
    • Merge tag 'net-6.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 1f594fe7
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from can, xfrm, bluetooth and netfilter.
      
        Current release - regressions:
      
         - ipv6: fix RCU splat in ipv6_route_seq_show()
      
         - wifi: iwlwifi: disable RFI feature
      
        Previous releases - regressions:
      
         - tcp: fix possible sk_priority leak in tcp_v4_send_reset()
      
         - tipc: do not update mtu if msg_max is too small in mtu negotiation
      
         - netfilter: fix null deref on element insertion
      
         - devlink: change per-devlink netdev notifier to static one
      
         - phylink: fix ksettings_set() ethtool call
      
         - wifi: mac80211: fortify the spinlock against deadlock by interrupt
      
         - wifi: brcmfmac: check for probe() id argument being NULL
      
         - eth: ice:
            - fix undersized tx_flags variable
            - fix ice VF reset during iavf initialization
      
         - eth: hns3: fix sending pfc frames after reset issue
      
        Previous releases - always broken:
      
         - xfrm: release all offloaded policy memory
      
         - nsh: use correct mac_offset to unwind gso skb in nsh_gso_segment()
      
         - vsock: avoid to close connected socket after the timeout
      
         - dsa: rzn1-a5psw: enable management frames for CPU port
      
         - eth: virtio_net: fix error unwinding of XDP initialization
      
         - eth: tun: fix memory leak for detached NAPI queue.
      
        Misc:
      
         - MAINTAINERS: sctp: move Neil to CREDITS"
      
      * tag 'net-6.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (107 commits)
        MAINTAINERS: skip CCing netdev for Bluetooth patches
        mdio_bus: unhide mdio_bus_init prototype
        bridge: always declare tunnel functions
        atm: hide unused procfs functions
        net: isa: include net/Space.h
        Revert "ARM: dts: stm32: add CAN support on stm32f746"
        netfilter: nft_set_rbtree: fix null deref on element insertion
        netfilter: nf_tables: fix nft_trans type confusion
        netfilter: conntrack: define variables exp_nat_nla_policy and any_addr with CONFIG_NF_NAT
        net: wwan: t7xx: Ensure init is completed before system sleep
        net: selftests: Fix optstring
        net: pcs: xpcs: fix C73 AN not getting enabled
        net: wwan: iosm: fix NULL pointer dereference when removing device
        vlan: fix a potential uninit-value in vlan_dev_hard_start_xmit()
        mailmap: add entries for Nikolay Aleksandrov
        igb: fix bit_shift to be in [1..8] range
        net: dsa: mv88e6xxx: Fix mv88e6393x EPC write command offset
        cassini: Fix a memory leak in the error handling path of cas_init_one()
        tun: Fix memory leak for detached NAPI queue.
        can: kvaser_pciefd: Disable interrupts in probe error path
        ...
    • Merge tag 'media/v6.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · b802651b
      Linus Torvalds authored
      Pull media fixes from Mauro Carvalho Chehab:
       "Several fixes for the dvb core and drivers:
      
         - fix UAF and null pointer de-reference in DVB core
      
         - fix kernel runtime warning for blocking operation in wait_event*()
           in dvb core
      
         - fix write size bug in DVB conditional access core
      
         - fix dvb demux continuity counter debug check logic
      
         - randconfig build fixes in pvrusb2 and mn88443x
      
         - fix memory leak in ttusb-dec
      
         - fix netup_unidvb probe-time error check logic
      
         - improve error handling in dw2102 if it can't retrieve DVB MAC
           address"
      
      * tag 'media/v6.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
        media: dvb-core: Fix use-after-free due to race condition at dvb_ca_en50221
        media: dvb-core: Fix kernel WARNING for blocking operation in wait_event*()
        media: dvb-core: Fix use-after-free due to race at dvb_register_device()
        media: dvb-core: Fix use-after-free due on race condition at dvb_net
        media: dvb-core: Fix use-after-free on race condition at dvb_frontend
        media: mn88443x: fix !CONFIG_OF error by drop of_match_ptr from ID table
        media: ttusb-dec: fix memory leak in ttusb_dec_exit_dvb()
        media: dvb_ca_en50221: fix a size write bug
        media: netup_unidvb: fix irq init by register it at the end of probe
        media: dvb-usb: dw2102: fix uninit-value in su3000_read_mac_address
        media: dvb-usb: digitv: fix null-ptr-deref in digitv_i2c_xfer()
        media: dvb-usb-v2: rtl28xxu: fix null-ptr-deref in rtl28xxu_i2c_xfer
        media: dvb-usb-v2: ce6230: fix null-ptr-deref in ce6230_i2c_master_xfer()
        media: dvb-usb-v2: ec168: fix null-ptr-deref in ec168_i2c_xfer()
        media: dvb-usb: az6027: fix three null-ptr-deref in az6027_i2c_xfer()
        media: netup_unidvb: fix use-after-free at del_timer()
        media: dvb_demux: fix a bug for the continuity counter
        media: pvrusb2: fix DVB_CORE dependency
      b802651b
    • Paolo Abeni's avatar
      Merge tag 'linux-can-fixes-for-6.4-20230518' of... · 6e42fae0
      Paolo Abeni authored
      Merge tag 'linux-can-fixes-for-6.4-20230518' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
      
      Marc Kleine-Budde says:
      
      ====================
      pull-request: can 2023-05-18
      
      this is a pull request of 7 patches for net/master.
      
      The first 6 patches are by Jimmy Assarsson and fix several bugs in the
      kvaser_pciefd driver.
      
      The last patch is from me and reverts a change in stm32f746.dtsi
      that causes build errors due to a missing dependent patch.
      
      * tag 'linux-can-fixes-for-6.4-20230518' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
        Revert "ARM: dts: stm32: add CAN support on stm32f746"
        can: kvaser_pciefd: Disable interrupts in probe error path
        can: kvaser_pciefd: Do not send EFLUSH command on TFD interrupt
        can: kvaser_pciefd: Empty SRB buffer in probe
        can: kvaser_pciefd: Call request_irq() before enabling interrupts
        can: kvaser_pciefd: Clear listen-only bit if not explicitly requested
        can: kvaser_pciefd: Set CAN_STATE_STOPPED in kvaser_pciefd_stop()
      ====================
      
      Link: https://lore.kernel.org/r/20230518073241.1110453-1-mkl@pengutronix.de
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      6e42fae0
    • Jakub Kicinski's avatar
      Merge tag 'nf-23-05-17' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · 30a0f49d
      Jakub Kicinski authored
      Florian Westphal says:
      
      ====================
      Netfilter fixes for net
      
      1. Silence warning about unused variable when CONFIG_NF_NAT=n, from Tom Rix.
      2. nftables: Fix possible out-of-bounds access, from myself.
      3. nftables: fix null deref+UAF during element insertion into rbtree,
         also from myself.
      
      * tag 'nf-23-05-17' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
        netfilter: nft_set_rbtree: fix null deref on element insertion
        netfilter: nf_tables: fix nft_trans type confusion
        netfilter: conntrack: define variables exp_nat_nla_policy and any_addr with CONFIG_NF_NAT
      ====================
      
      Link: https://lore.kernel.org/r/20230517123756.7353-1-fw@strlen.de
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      30a0f49d
    • Jakub Kicinski's avatar
      Merge tag 'wireless-2023-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless · c259ad11
      Jakub Kicinski authored
      Kalle Valo says:
      
      ====================
      wireless fixes for v6.4
      
      A lot of fixes this time, for both the stack and the drivers. The
      brcmfmac resume fix has been reported by several people so I would say
      it's the most important here. The iwlwifi RFI workaround is also
      something which was reported as a regression recently.
      
      * tag 'wireless-2023-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: (31 commits)
        wifi: b43: fix incorrect __packed annotation
        wifi: rtw88: sdio: Always use two consecutive bytes for word operations
        mac80211_hwsim: fix memory leak in hwsim_new_radio_nl
        wifi: iwlwifi: mvm: Add locking to the rate read flow
        wifi: iwlwifi: Don't use valid_links to iterate sta links
        wifi: iwlwifi: mvm: don't trust firmware n_channels
        wifi: iwlwifi: mvm: fix OEM's name in the tas approved list
        wifi: iwlwifi: fix OEM's name in the ppag approved list
        wifi: iwlwifi: mvm: fix initialization of a return value
        wifi: iwlwifi: mvm: fix access to fw_id_to_mac_id
        wifi: iwlwifi: fw: fix DBGI dump
        wifi: iwlwifi: mvm: fix number of concurrent link checks
        wifi: iwlwifi: mvm: fix cancel_delayed_work_sync() deadlock
        wifi: iwlwifi: mvm: don't double-init spinlock
        wifi: iwlwifi: mvm: always free dup_data
        wifi: mac80211: recalc chanctx mindef before assigning
        wifi: mac80211: consider reserved chanctx for mindef
        wifi: mac80211: simplify chanctx allocation
        wifi: mac80211: Abort running color change when stopping the AP
        wifi: mac80211: fix min center freq offset tracing
        ...
      ====================
      
      Link: https://lore.kernel.org/r/20230517151914.B0AF6C433EF@smtp.kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      c259ad11
    • Jakub Kicinski's avatar
      MAINTAINERS: skip CCing netdev for Bluetooth patches · bfa00d8f
      Jakub Kicinski authored
      As requested by Marcel skip netdev for Bluetooth patches.
      Bluetooth has its own mailing list and overloading netdev
      leads to fewer people reading it.
      
      Link: https://lore.kernel.org/netdev/639C8EA4-1F6E-42BE-8F04-E4A753A6EFFC@holtmann.org/
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Link: https://lore.kernel.org/r/20230517014253.1233333-1-kuba@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      bfa00d8f
    • Arnd Bergmann's avatar
      mdio_bus: unhide mdio_bus_init prototype · 2e9f8ab6
      Arnd Bergmann authored
      mdio_bus_init() is either used as a local module_init() entry,
      or it gets called in phy_device.c. In the former case, there
      is no declaration, which causes a warning:
      
      drivers/net/phy/mdio_bus.c:1371:12: error: no previous prototype for 'mdio_bus_init' [-Werror=missing-prototypes]
      
      Remove the #ifdef around the declaration to avoid the warning.

      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Link: https://lore.kernel.org/r/20230516194625.549249-4-arnd@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      2e9f8ab6
    • Arnd Bergmann's avatar
      bridge: always declare tunnel functions · 89dcd87c
      Arnd Bergmann authored
      When CONFIG_BRIDGE_VLAN_FILTERING is disabled, two functions are still
      defined but have no prototype or caller. This causes a W=1 warning for
      the missing prototypes:
      
      net/bridge/br_netlink_tunnel.c:29:6: error: no previous prototype for 'vlan_tunid_inrange' [-Werror=missing-prototypes]
      net/bridge/br_netlink_tunnel.c:199:5: error: no previous prototype for 'br_vlan_tunnel_info' [-Werror=missing-prototypes]
      
      The functions are already conditional on CONFIG_BRIDGE_VLAN_FILTERING,
      and I couldn't easily figure out the right set of #ifdefs, so just
      move the declarations out of the #ifdef to avoid the warning,
      at a small cost in code size over a more elaborate fix.
      
      Fixes: 188c67dd ("net: bridge: vlan options: add support for tunnel id dumping")
      Fixes: 569da082 ("net: bridge: vlan options: add support for tunnel mapping set/del")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Link: https://lore.kernel.org/r/20230516194625.549249-3-arnd@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      89dcd87c
    • Arnd Bergmann's avatar
      atm: hide unused procfs functions · fb1b7be9
      Arnd Bergmann authored
      When CONFIG_PROC_FS is disabled, the function declarations for some
      procfs functions are hidden, but the definitions are still built,
      as shown by this compiler warning:
      
      net/atm/resources.c:403:7: error: no previous prototype for 'atm_dev_seq_start' [-Werror=missing-prototypes]
      net/atm/resources.c:409:6: error: no previous prototype for 'atm_dev_seq_stop' [-Werror=missing-prototypes]
      net/atm/resources.c:414:7: error: no previous prototype for 'atm_dev_seq_next' [-Werror=missing-prototypes]
      
      Add another #ifdef to leave these out of the build.

      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Link: https://lore.kernel.org/r/20230516194625.549249-2-arnd@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      fb1b7be9
    • Arnd Bergmann's avatar
      net: isa: include net/Space.h · 067dee65
      Arnd Bergmann authored
      The legacy drivers that still get called from net/Space.c have prototypes
      in net/Space.h, but this header is not included in most of the files that
      define those functions:
      
      drivers/net/ethernet/cirrus/cs89x0.c:1649:28: error: no previous prototype for 'cs89x0_probe' [-Werror=missing-prototypes]
      drivers/net/ethernet/8390/ne.c:947:28: error: no previous prototype for 'ne_probe' [-Werror=missing-prototypes]
      drivers/net/ethernet/8390/smc-ultra.c:167:28: error: no previous prototype for 'ultra_probe' [-Werror=missing-prototypes]
      drivers/net/ethernet/amd/lance.c:438:28: error: no previous prototype for 'lance_probe' [-Werror=missing-prototypes]
      drivers/net/ethernet/3com/3c515.c:422:20: error: no previous prototype for 'tc515_probe' [-Werror=missing-prototypes]
      
      Add the inclusion to avoid the warnings.

      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Link: https://lore.kernel.org/r/20230516194625.549249-1-arnd@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      067dee65
  4. 17 May, 2023 9 commits