1. 14 Jun, 2024 5 commits
  2. 13 Jun, 2024 14 commits
    • net: mvpp2: use slab_build_skb for oversized frames · 4467c09b
      Aryan Srivastava authored
      Setting frag_size to 0 to indicate a kmalloc'ed buffer has been
      deprecated; use slab_build_skb() directly.
      
      Fixes: ce098da1 ("skbuff: Introduce slab_build_skb()")
      Signed-off-by: Aryan Srivastava <aryan.srivastava@alliedtelesis.co.nz>
      Reviewed-by: Kees Cook <kees@kernel.org>
      Link: https://lore.kernel.org/r/20240613024900.3842238-1-aryan.srivastava@alliedtelesis.co.nz
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'net-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · d20f6b3d
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from bluetooth and netfilter.
      
        Slim pickings this time, probably a combination of summer, DevConf.cz,
        and the end of first half of the year at corporations.
      
        Current release - regressions:
      
         - Revert "igc: fix a log entry using uninitialized netdev", it traded
           lack of netdev name in a printk() for a crash
      
        Previous releases - regressions:
      
         - Bluetooth: L2CAP: fix rejecting L2CAP_CONN_PARAM_UPDATE_REQ
      
         - geneve: fix incorrectly setting lengths of inner headers in the
           skb, confusing the drivers and causing mangled packets
      
         - sched: initialize noop_qdisc owner to avoid false-positive
           recursion detection (recursing on CPU 0), which bubbles up to user
           space as a sendmsg() error, while noop_qdisc should silently drop
      
         - netdevsim: fix backwards compatibility in nsim_get_iflink()
      
        Previous releases - always broken:
      
         - netfilter: ipset: fix race between namespace cleanup and gc in the
           list:set type"
      
      * tag 'net-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (35 commits)
        bnxt_en: Adjust logging of firmware messages in case of released token in __hwrm_send()
        af_unix: Read with MSG_PEEK loops if the first unread byte is OOB
        bnxt_en: Cap the size of HWRM_PORT_PHY_QCFG forwarded response
        gve: Clear napi->skb before dev_kfree_skb_any()
        ionic: fix use after netif_napi_del()
        Revert "igc: fix a log entry using uninitialized netdev"
        net: bridge: mst: fix suspicious rcu usage in br_mst_set_state
        net: bridge: mst: pass vlan group directly to br_mst_vlan_set_state
        net/ipv6: Fix the RT cache flush via sysctl using a previous delay
        net: stmmac: replace priv->speed with the portTransmitRate from the tc-cbs parameters
        gve: ignore nonrelevant GSO type bits when processing TSO headers
        net: pse-pd: Use EOPNOTSUPP error code instead of ENOTSUPP
        netfilter: Use flowlabel flow key when re-routing mangled packets
        netfilter: ipset: Fix race between namespace cleanup and gc in the list:set type
        netfilter: nft_inner: validate mandatory meta and payload
        tcp: use signed arithmetic in tcp_rtx_probe0_timed_out()
        mailmap: map Geliang's new email address
        mptcp: pm: update add_addr counters after connect
        mptcp: pm: inc RmAddr MIB counter once per RM_ADDR ID
        mptcp: ensure snd_una is properly initialized on connect
        ...
    • Merge tag 'nfs-for-6.10-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · fd88e181
      Linus Torvalds authored
      Pull NFS client fixes from Trond Myklebust:
       "Bugfixes:
         - NFSv4.2: Fix a memory leak in nfs4_set_security_label
         - NFSv2/v3: abort nfs_atomic_open_v23 if the name is too long.
         - NFS: Add appropriate memory barriers to the sillyrename code
         - Propagate readlink errors in nfs_symlink_filler
         - NFS: don't invalidate dentries on transient errors
         - NFS: fix unnecessary synchronous writes in random write workloads
         - NFSv4.1: enforce rootpath check when deciding whether or not to trunk
      
        Other:
         - Change email address for Trond Myklebust due to email server concerns"
      
      * tag 'nfs-for-6.10-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        NFS: add barriers when testing for NFS_FSDATA_BLOCKED
        SUNRPC: return proper error from gss_wrap_req_priv
        NFSv4.1 enforce rootpath check in fs_location query
        NFS: abort nfs_atomic_open_v23 if name is too long.
        nfs: don't invalidate dentries on transient errors
        nfs: Avoid flushing many pages with NFS_FILE_SYNC
        nfs: propagate readlink errors in nfs_symlink_filler
        MAINTAINERS: Change email address for Trond Myklebust
        NFSv4: Fix memory leak in nfs4_set_security_label
    • Merge tag 'fixes-2024-06-13' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock · 3572597c
      Linus Torvalds authored
      Pull memblock fixes from Mike Rapoport:
       "Fix validation of NUMA coverage.
      
        memblock_validate_numa_coverage() was checking for an unset node ID
        using NUMA_NO_NODE, but x86 used MAX_NUMNODES when no node ID was
        specified by buggy firmware.
      
        Update memblock to substitute MAX_NUMNODES with NUMA_NO_NODE in
        memblock_set_node() and use NUMA_NO_NODE in x86::numa_init()"
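
The sentinel mismatch described above can be modeled in a few lines of Python. This is a toy sketch, not kernel code: the region dictionaries, the `substitute` switch, the helper names, and the value chosen for `MAX_NUMNODES` are assumptions made for illustration. Only the names NUMA_NO_NODE, MAX_NUMNODES, memblock_set_node() and memblock_validate_numa_coverage() come from the message.

```python
NUMA_NO_NODE = -1      # sentinel for "no node ID assigned"
MAX_NUMNODES = 64      # assumed value for illustration; configuration dependent


def set_node(region, nid, substitute=True):
    """Toy stand-in for memblock_set_node().

    With substitute=True, MAX_NUMNODES is rewritten to NUMA_NO_NODE so that
    later checks only have to look for a single sentinel value.
    """
    if substitute and nid == MAX_NUMNODES:
        nid = NUMA_NO_NODE
    region["nid"] = nid


def uncovered_pages(regions):
    """Toy stand-in for the coverage check: count pages without a node ID."""
    return sum(r["pages"] for r in regions if r["nid"] == NUMA_NO_NODE)


regions_old = [{"pages": 100, "nid": 0}, {"pages": 50, "nid": 0}]
regions_new = [{"pages": 100, "nid": 0}, {"pages": 50, "nid": 0}]

# Firmware gave no node ID for the second region; the caller passes MAX_NUMNODES.
set_node(regions_old[1], MAX_NUMNODES, substitute=False)
set_node(regions_new[1], MAX_NUMNODES, substitute=True)

print(uncovered_pages(regions_old))  # without substitution the gap goes unnoticed
print(uncovered_pages(regions_new))  # with substitution the gap is counted
```

Normalizing to one sentinel at the point where the node ID is stored keeps every later consumer simple, which appears to be the motivation for doing the substitution in memblock_set_node().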
      
      * tag 'fixes-2024-06-13' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
        x86/mm/numa: Use NUMA_NO_NODE when calling memblock_set_node()
        memblock: make memblock_set_node() also warn about use of MAX_NUMNODES
    • bnxt_en: Adjust logging of firmware messages in case of released token in __hwrm_send() · a9b97418
      Aleksandr Mishin authored
      If the token is released because token->state == BNXT_HWRM_DEFERRED,
      the released token (set to NULL) is still used in log messages. This
      is expected to be prevented by the HWRM_ERR_CODE_PF_UNAVAILABLE error
      code, but that error code is returned by recent firmware, so some
      firmware may not return it. This may lead to a NULL pointer dereference.
      Fix this by adding a token pointer check.
      
      Found by Linux Verification Center (linuxtesting.org) with SVACE.
      
      Fixes: 8fa4219d ("bnxt_en: add dynamic debug support for HWRM messages")
      Suggested-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru>
      Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
      Reviewed-by: Michael Chan <michael.chan@broadcom.com>
      Link: https://lore.kernel.org/r/20240611082547.12178-1-amishin@t-argos.ru
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • af_unix: Read with MSG_PEEK loops if the first unread byte is OOB · a6736a0a
      Rao Shoaib authored
      Read with MSG_PEEK flag loops if the first byte to read is an OOB byte.
      commit 22dd70eb ("af_unix: Don't peek OOB data without MSG_OOB.")
      addresses the loop issue but does not address the issue that no data
      beyond OOB byte can be read.
      
      >>> from socket import *
      >>> c1, c2 = socketpair(AF_UNIX, SOCK_STREAM)
      >>> c1.send(b'a', MSG_OOB)
      1
      >>> c1.send(b'b')
      1
      >>> c2.recv(1, MSG_PEEK | MSG_DONTWAIT)
      b'b'
      
      >>> from socket import *
      >>> c1, c2 = socketpair(AF_UNIX, SOCK_STREAM)
      >>> c2.setsockopt(SOL_SOCKET, SO_OOBINLINE, 1)
      >>> c1.send(b'a', MSG_OOB)
      1
      >>> c1.send(b'b')
      1
      >>> c2.recv(1, MSG_PEEK | MSG_DONTWAIT)
      b'a'
      >>> c2.recv(1, MSG_PEEK | MSG_DONTWAIT)
      b'a'
      >>> c2.recv(1, MSG_DONTWAIT)
      b'a'
      >>> c2.recv(1, MSG_PEEK | MSG_DONTWAIT)
      b'b'
      >>>
      
      Fixes: 314001f0 ("af_unix: Add OOB support")
      Signed-off-by: Rao Shoaib <Rao.Shoaib@oracle.com>
      Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://lore.kernel.org/r/20240611084639.2248934-1-Rao.Shoaib@oracle.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • bnxt_en: Cap the size of HWRM_PORT_PHY_QCFG forwarded response · 7d9df38c
      Michael Chan authored
      Firmware interface 1.10.2.118 has increased the size of
      HWRM_PORT_PHY_QCFG response beyond the maximum size that can be
      forwarded.  When the VF's link state is not the default auto state,
      the PF will need to forward the response back to the VF to indicate
      the forced state.  This regression may cause the VF to fail to
      initialize.
      
      Fix it by capping the HWRM_PORT_PHY_QCFG response to the maximum
      96 bytes.  The SPEEDS2_SUPPORTED flag needs to be cleared because the
      new speeds2 fields are beyond the legacy structure.  Also modify
      bnxt_hwrm_fwd_resp() to print a warning if the message size exceeds 96
      bytes to make this failure more obvious.
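
The capping step can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the 96-byte limit and the SPEEDS2_SUPPORTED name come from the message, while the flag's byte offset, its bit value, the 112-byte example response, and the behaviour of returning False on oversize are hypothetical and not taken from the driver.

```python
import warnings

FWD_RESP_MAX = 96            # maximum forwardable size, from the commit message
OPTION_FLAGS_OFFSET = 8      # hypothetical byte offset of the flags field
SPEEDS2_SUPPORTED = 0x02     # hypothetical bit value of the flag


def cap_phy_qcfg_resp(resp: bytes) -> bytes:
    """Truncate the response to the legacy size and clear SPEEDS2_SUPPORTED."""
    capped = bytearray(resp[:FWD_RESP_MAX])
    if len(capped) > OPTION_FLAGS_OFFSET:
        # The speeds2 fields were cut off, so the flag must not advertise them.
        capped[OPTION_FLAGS_OFFSET] &= ~SPEEDS2_SUPPORTED & 0xFF
    return bytes(capped)


def fwd_resp(resp: bytes) -> bool:
    """Toy stand-in for bnxt_hwrm_fwd_resp(): warn loudly on oversize."""
    if len(resp) > FWD_RESP_MAX:
        warnings.warn(f"response of {len(resp)} bytes exceeds {FWD_RESP_MAX}")
        return False
    return True


new_fw_resp = bytes([0xFF]) * 112   # assumed size of the larger response
capped = cap_phy_qcfg_resp(new_fw_resp)
print(len(capped), fwd_resp(capped))
```

Clearing the flag together with the truncation matters: a receiver that saw the flag set would expect fields that are no longer present in the shortened message.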
      
      Fixes: 84a911db ("bnxt_en: Update firmware interface to 1.10.2.118")
      Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
      Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Link: https://lore.kernel.org/r/20240612231736.57823-1-michael.chan@broadcom.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • gve: Clear napi->skb before dev_kfree_skb_any() · 6f4d93b7
      Ziwei Xiao authored
      gve_rx_free_skb incorrectly leaves napi->skb referencing an skb after it
      is freed with dev_kfree_skb_any(). This can result in a subsequent call
      to napi_get_frags returning a dangling pointer.
      
      Fix this by clearing napi->skb before the skb is freed.
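
A toy Python model may help show why the cached pointer has to be cleared. The `Skb` and `Napi` classes and the two `rx_free_skb_*` helpers are hypothetical stand-ins, and a `freed` flag takes the place of real memory being released; this is a sketch of the pattern, not the driver's code.

```python
class Skb:
    """Stand-in for an skb; 'freed' marks memory that must not be reused."""

    def __init__(self):
        self.freed = False


class Napi:
    """Stand-in for a napi context with its cached skb pointer."""

    def __init__(self):
        self.skb = None


def napi_get_frags(napi):
    """Reuse the cached skb if there is one, otherwise allocate a new one."""
    if napi.skb is None:
        napi.skb = Skb()
    return napi.skb


def rx_free_skb_buggy(napi):
    """Free the skb but leave napi.skb pointing at it."""
    napi.skb.freed = True


def rx_free_skb_fixed(napi):
    """Clear the cached reference first, then free the skb."""
    skb, napi.skb = napi.skb, None
    skb.freed = True


buggy, fixed = Napi(), Napi()
napi_get_frags(buggy)
napi_get_frags(fixed)
rx_free_skb_buggy(buggy)
rx_free_skb_fixed(fixed)

print(napi_get_frags(buggy).freed)  # the stale, already freed skb comes back
print(napi_get_frags(fixed).freed)  # a fresh skb is allocated instead
```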
      
      Fixes: 9b8dd5e5 ("gve: DQO: Add RX path")
      Cc: stable@vger.kernel.org
      Reported-by: Shailend Chand <shailend@google.com>
      Signed-off-by: Ziwei Xiao <ziweixiao@google.com>
      Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
      Reviewed-by: Shailend Chand <shailend@google.com>
      Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com>
      Link: https://lore.kernel.org/r/20240612001654.923887-1-ziweixiao@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ionic: fix use after netif_napi_del() · 79f18a41
      Taehee Yoo authored
      When queues are started, netif_napi_add() and napi_enable() are called.
      If there are 4 queues and only 3 are used in the current configuration,
      only the napi of those 3 queues should be registered and enabled.
      ionic_qcq_enable() checks whether the .poll pointer is non-NULL so that
      it enables only the napi of queues in use. The napi of an unused queue
      is never registered by netif_napi_add(), so its .poll pointer is NULL.
      However, this check cannot tell whether a napi has since been
      unregistered, because netif_napi_del() doesn't reset the .poll pointer
      to NULL. So ionic_qcq_enable() calls napi_enable() for a queue that was
      unregistered by netif_napi_del().
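
The stale-pointer problem can be reproduced in a small Python model. Everything here is a simplified sketch: the `Napi` class, the `registered` field, and both `qcq_enable_*` helpers are hypothetical, and the second helper shows one possible way to avoid the problem, not necessarily what the patch does.

```python
class Napi:
    """Stand-in for a napi context."""

    def __init__(self):
        self.poll = None
        self.registered = False
        self.enabled = False


def netif_napi_add(napi, poll):
    napi.poll = poll
    napi.registered = True


def netif_napi_del(napi):
    # As in the message: unregistering does not reset .poll to None.
    napi.registered = False
    napi.enabled = False


def napi_enable(napi):
    if not napi.registered:
        raise RuntimeError("napi_enable() called on an unregistered napi")
    napi.enabled = True


def qcq_enable_poll_check(napi):
    """Check described in the message: a non-NULL .poll means 'in use'."""
    if napi.poll is not None:
        napi_enable(napi)


def qcq_enable_registered_check(napi):
    """Hypothetical alternative: consult explicit registration state."""
    if napi.registered:
        napi_enable(napi)


napi = Napi()
netif_napi_add(napi, poll=lambda budget: 0)
netif_napi_del(napi)            # queue goes unused, but .poll stays set

try:
    qcq_enable_poll_check(napi)
except RuntimeError as err:
    print("poll check:", err)

qcq_enable_registered_check(napi)
print("registered check enabled napi:", napi.enabled)
```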
      
      Reproducer:
         ethtool -L <interface name> rx 1 tx 1 combined 0
         ethtool -L <interface name> rx 0 tx 0 combined 1
         ethtool -L <interface name> rx 0 tx 0 combined 4
      
      Splat looks like:
      kernel BUG at net/core/dev.c:6666!
      Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
      CPU: 3 PID: 1057 Comm: kworker/3:3 Not tainted 6.10.0-rc2+ #16
      Workqueue: events ionic_lif_deferred_work [ionic]
      RIP: 0010:napi_enable+0x3b/0x40
      Code: 48 89 c2 48 83 e2 f6 80 b9 61 09 00 00 00 74 0d 48 83 bf 60 01 00 00 00 74 03 80 ce 01 f0 4f
      RSP: 0018:ffffb6ed83227d48 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: ffff97560cda0828 RCX: 0000000000000029
      RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff97560cda0a28
      RBP: ffffb6ed83227d50 R08: 0000000000000400 R09: 0000000000000001
      R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
      R13: ffff97560ce3c1a0 R14: 0000000000000000 R15: ffff975613ba0a20
      FS:  0000000000000000(0000) GS:ffff975d5f780000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f8f734ee200 CR3: 0000000103e50000 CR4: 00000000007506f0
      PKRU: 55555554
      Call Trace:
       <TASK>
       ? die+0x33/0x90
       ? do_trap+0xd9/0x100
       ? napi_enable+0x3b/0x40
       ? do_error_trap+0x83/0xb0
       ? napi_enable+0x3b/0x40
       ? napi_enable+0x3b/0x40
       ? exc_invalid_op+0x4e/0x70
       ? napi_enable+0x3b/0x40
       ? asm_exc_invalid_op+0x16/0x20
       ? napi_enable+0x3b/0x40
       ionic_qcq_enable+0xb7/0x180 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8]
       ionic_start_queues+0xc4/0x290 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8]
       ionic_link_status_check+0x11c/0x170 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8]
       ionic_lif_deferred_work+0x129/0x280 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8]
       process_one_work+0x145/0x360
       worker_thread+0x2bb/0x3d0
       ? __pfx_worker_thread+0x10/0x10
       kthread+0xcc/0x100
       ? __pfx_kthread+0x10/0x10
       ret_from_fork+0x2d/0x50
       ? __pfx_kthread+0x10/0x10
       ret_from_fork_asm+0x1a/0x30
      
      Fixes: 0f3154e6 ("ionic: Add Tx and Rx handling")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Reviewed-by: Brett Creeley <brett.creeley@amd.com>
      Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
      Link: https://lore.kernel.org/r/20240612060446.1754392-1-ap420073@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Revert "igc: fix a log entry using uninitialized netdev" · 8eef5c3c
      Sasha Neftin authored
      This reverts commit 86167183.
      
      igc_ptp_init() needs to be called before igc_reset(), otherwise a kernel
      crash can be observed. Following the corresponding discussions [1] and
      [2], revert this commit.
      
      Link: https://lore.kernel.org/all/8fb634f8-7330-4cf4-a8ce-485af9c0a61a@intel.com/ [1]
      Link: https://lore.kernel.org/all/87o78rmkhu.fsf@intel.com/ [2]
      Fixes: 86167183 ("igc: fix a log entry using uninitialized netdev")
      Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
      Tested-by: Naama Meir <naamax.meir@linux.intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Link: https://lore.kernel.org/r/20240611162456.961631-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'net-bridge-mst-fix-suspicious-rcu-usage-warning' · b60b1bdc
      Jakub Kicinski authored
      Nikolay Aleksandrov says:
      
      ====================
      net: bridge: mst: fix suspicious rcu usage warning
      
      This set fixes a suspicious RCU usage warning triggered by syzbot[1] in
      the bridge's MST code. After I converted br_mst_set_state to RCU, I
      forgot to update the vlan group dereference helper. Fix it by using
      the proper helper; to do that we need to pass the vlan group, which
      is already obtained correctly by the callers for their respective
      context. Patch 01 is a requirement for the fix in patch 02.
      
      Note I did consider rcu_dereference_rtnl() but the churn is much bigger
      and in every part of the bridge. We can do that as a cleanup in
      net-next.
      
      [1] https://syzkaller.appspot.com/bug?extid=9bbe2de1bc9d470eb5fe
       =============================
       WARNING: suspicious RCU usage
       6.10.0-rc2-syzkaller-00235-g8a929806 #0 Not tainted
       -----------------------------
       net/bridge/br_private.h:1599 suspicious rcu_dereference_protected() usage!
      
       other info that might help us debug this:
      
       rcu_scheduler_active = 2, debug_locks = 1
       4 locks held by syz-executor.1/5374:
        #0: ffff888022d50b18 (&mm->mmap_lock){++++}-{3:3}, at: mmap_read_lock include/linux/mmap_lock.h:144 [inline]
        #0: ffff888022d50b18 (&mm->mmap_lock){++++}-{3:3}, at: __mm_populate+0x1b0/0x460 mm/gup.c:2111
        #1: ffffc90000a18c00 ((&p->forward_delay_timer)){+.-.}-{0:0}, at: call_timer_fn+0xc0/0x650 kernel/time/timer.c:1789
        #2: ffff88805fb2ccb8 (&br->lock){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
        #2: ffff88805fb2ccb8 (&br->lock){+.-.}-{2:2}, at: br_forward_delay_timer_expired+0x50/0x440 net/bridge/br_stp_timer.c:86
        #3: ffffffff8e333fa0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:329 [inline]
        #3: ffffffff8e333fa0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:781 [inline]
        #3: ffffffff8e333fa0 (rcu_read_lock){....}-{1:2}, at: br_mst_set_state+0x171/0x7a0 net/bridge/br_mst.c:105
      
       stack backtrace:
       CPU: 1 PID: 5374 Comm: syz-executor.1 Not tainted 6.10.0-rc2-syzkaller-00235-g8a929806 #0
       Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
       Call Trace:
        <IRQ>
        __dump_stack lib/dump_stack.c:88 [inline]
        dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
        lockdep_rcu_suspicious+0x221/0x340 kernel/locking/lockdep.c:6712
        nbp_vlan_group net/bridge/br_private.h:1599 [inline]
        br_mst_set_state+0x29e/0x7a0 net/bridge/br_mst.c:106
        br_set_state+0x28a/0x7b0 net/bridge/br_stp.c:47
        br_forward_delay_timer_expired+0x176/0x440 net/bridge/br_stp_timer.c:88
        call_timer_fn+0x18e/0x650 kernel/time/timer.c:1792
        expire_timers kernel/time/timer.c:1843 [inline]
        __run_timers kernel/time/timer.c:2417 [inline]
        __run_timer_base+0x66a/0x8e0 kernel/time/timer.c:2428
        run_timer_base kernel/time/timer.c:2437 [inline]
        run_timer_softirq+0xb7/0x170 kernel/time/timer.c:2447
        handle_softirqs+0x2c4/0x970 kernel/softirq.c:554
        __do_softirq kernel/softirq.c:588 [inline]
        invoke_softirq kernel/softirq.c:428 [inline]
        __irq_exit_rcu+0xf4/0x1c0 kernel/softirq.c:637
        irq_exit_rcu+0x9/0x30 kernel/softirq.c:649
        instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1043 [inline]
        sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1043
        </IRQ>
        <TASK>
      ====================
      
      Link: https://lore.kernel.org/r/20240609103654.914987-1-razor@blackwall.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: bridge: mst: fix suspicious rcu usage in br_mst_set_state · 546ceb1d
      Nikolay Aleksandrov authored
      I converted br_mst_set_state to RCU to avoid a vlan use-after-free
      but forgot to change the vlan group dereference helper. Switch to vlan
      group RCU deref helper to fix the suspicious rcu usage warning.
      
      Fixes: 3a7c1661 ("net: bridge: mst: fix vlan use-after-free")
      Reported-by: syzbot+9bbe2de1bc9d470eb5fe@syzkaller.appspotmail.com
      Closes: https://syzkaller.appspot.com/bug?extid=9bbe2de1bc9d470eb5fe
      Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
      Link: https://lore.kernel.org/r/20240609103654.914987-3-razor@blackwall.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: bridge: mst: pass vlan group directly to br_mst_vlan_set_state · 36c92936
      Nikolay Aleksandrov authored
      Pass the already obtained vlan group pointer to br_mst_vlan_set_state()
      instead of dereferencing it again. Each caller has already correctly
      dereferenced it for their context. This change is required for the
      following suspicious RCU dereference fix. No functional changes
      intended.
      
      Fixes: 3a7c1661 ("net: bridge: mst: fix vlan use-after-free")
      Reported-by: syzbot+9bbe2de1bc9d470eb5fe@syzkaller.appspotmail.com
      Closes: https://syzkaller.appspot.com/bug?extid=9bbe2de1bc9d470eb5fe
      Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
      Link: https://lore.kernel.org/r/20240609103654.914987-2-razor@blackwall.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net/ipv6: Fix the RT cache flush via sysctl using a previous delay · 14a20e5b
      Petr Pavlu authored
      The net.ipv6.route.flush system parameter takes a value which specifies
      a delay used during the flush operation for aging exception routes. The
      written value is however not used in the currently requested flush and
      instead utilized only in the next one.
      
      A problem is that ipv6_sysctl_rtcache_flush() first reads the old value
      of net->ipv6.sysctl.flush_delay into a local delay variable and then
      calls proc_dointvec() which actually updates the sysctl based on the
      provided input.
      
      Fix the problem by switching the order of the two operations.
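
The ordering issue is easy to see in a minimal Python model. The `Ipv6Sysctl` class and the `write_sysctl`, `flush_buggy` and `flush_fixed` helpers are hypothetical names for illustration; `write_sysctl` merely stands in for what proc_dointvec() does to the stored value.

```python
class Ipv6Sysctl:
    """Stand-in for the per-namespace sysctl storage."""

    def __init__(self, flush_delay):
        self.flush_delay = flush_delay


def write_sysctl(sysctl, new_value):
    """Stand-in for proc_dointvec(): store the value the user wrote."""
    sysctl.flush_delay = new_value


def flush_buggy(sysctl, written):
    """Read the delay first, then update: the flush uses the previous value."""
    delay = sysctl.flush_delay
    write_sysctl(sysctl, written)
    return delay


def flush_fixed(sysctl, written):
    """Update first, then read: the flush uses the value just written."""
    write_sysctl(sysctl, written)
    return sysctl.flush_delay


old_order = Ipv6Sysctl(flush_delay=0)
new_order = Ipv6Sysctl(flush_delay=0)

print(flush_buggy(old_order, 5), flush_buggy(old_order, 7))
print(flush_fixed(new_order, 5), flush_fixed(new_order, 7))
```

With the old ordering each write only takes effect on the following flush, which matches the behaviour described in the first paragraph of the message.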
      
      Fixes: 4990509f ("[NETNS][IPV6]: Make sysctls route per namespace.")
      Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20240607112828.30285-1-petr.pavlu@suse.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  3. 12 Jun, 2024 14 commits
  4. 11 Jun, 2024 7 commits
    • Kent Overstreet's avatar
    • Merge tag 'vfs-6.10-rc4.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 2ef5971f
      Linus Torvalds authored
      Pull vfs fixes from Christian Brauner:
       "Misc:
         - Restore debugfs behavior of ignoring unknown mount options
         - Fix kernel doc for netfs_wait_for_outstanding_io()
         - Fix struct statx comment after new addition for this cycle
         - Fix a check in find_next_fd()
      
        iomap:
         - Fix data zeroing behavior when an extent spans the block that
           contains i_size
         - Restore i_size increasing in iomap_write_end() for now to avoid
           stale data exposure on xfs with a realtime device
      
        Cachefiles:
         - Remove unneeded fdtable.h include
         - Improve trace output for cachefiles_obj_{get,put}_ondemand_fd()
         - Remove requests from the request list to prevent accessing already
           freed requests
         - Fix UAF when issuing restore command while the daemon is still
           alive by adding an additional reference count to requests
         - Fix UAF by grabbing a reference during xarray lookup with xa_lock()
           held
         - Simplify error handling in cachefiles_ondemand_daemon_read()
         - Add consistency checks for read and open requests to avoid crashes
         - Add a spinlock to protect ondemand_id variable which is used to
           determine whether an anonymous cachefiles fd has already been
           closed
         - Make on-demand reads killable allowing to handle broken cachefiles
           daemon better
         - Flush all requests after the kernel has been marked dead via
           CACHEFILES_DEAD to avoid hung-tasks
         - Ensure that closed requests are marked as such to avoid reusing
           them with a reopen request
         - Defer fd_install() until after copy_to_user() succeeded and thereby
           get rid of having to use close_fd()
         - Ensure that anonymous cachefiles on-demand fds are reused while
           they are valid to avoid pinning already freed cookies"
      
      * tag 'vfs-6.10-rc4.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        iomap: Fix iomap_adjust_read_range for plen calculation
        iomap: keep on increasing i_size in iomap_write_end()
        cachefiles: remove unneeded include of <linux/fdtable.h>
        fs/file: fix the check in find_next_fd()
        cachefiles: make on-demand read killable
        cachefiles: flush all requests after setting CACHEFILES_DEAD
        cachefiles: Set object to close if ondemand_id < 0 in copen
        cachefiles: defer exposing anon_fd until after copy_to_user() succeeds
        cachefiles: never get a new anonymous fd if ondemand_id is valid
        cachefiles: add spin_lock for cachefiles_ondemand_info
        cachefiles: add consistency check for copen/cread
        cachefiles: remove err_put_fd label in cachefiles_ondemand_daemon_read()
        cachefiles: fix slab-use-after-free in cachefiles_ondemand_daemon_read()
        cachefiles: fix slab-use-after-free in cachefiles_ondemand_get_fd()
        cachefiles: remove requests from xarray during flushing requests
        cachefiles: add output string to cachefiles_obj_[get|put]_ondemand_fd
        statx: Update offset commentary for struct statx
        netfs: fix kernel doc for nets_wait_for_outstanding_io()
        debugfs: continue to ignore unknown mount options
    • netfilter: Use flowlabel flow key when re-routing mangled packets · 6f8f132c
      Florian Westphal authored
      'ip6 dscp set $v' in an nftables output route chain has no effect.
      nftables does detect the dscp change and calls the reroute hook,
      but ip6_route_me_harder never sets the dscp/flowlabel:
      flowlabel/dsfield routing rules are ignored and no reroute takes place.
      
      Thanks to Yi Chen for an excellent reproducer script that I used
      to validate this change.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: Yi Chen <yiche@redhat.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: ipset: Fix race between namespace cleanup and gc in the list:set type · 4e7aaa6b
      Jozsef Kadlecsik authored
      Lion Ackermann reported a race condition between namespace cleanup in
      ipset and the garbage collection of the list:set type. The namespace
      cleanup can destroy sets of the list:set type while the gc of the set
      type is waiting to run in rcu cleanup. The latter uses data from the
      destroyed set, which leads to a use-after-free. The patch contains the
      following parts:
      
      - When destroying all sets, first remove the garbage collectors, then wait
        if needed and then destroy the sets.
      - Fix the badly ordered "wait then remove gc" in the case of destroying
        a single set.
      - Fix the missing rcu locking in the list:set type in the userspace test
        case.
      - Use proper RCU list handling in the list:set type.
      
      The patch depends on c1193d9b (netfilter: ipset: Add list flush to cancel_gc).
      
      Fixes: 97f7cf1c (netfilter: ipset: fix performance regression in swap operation)
      Reported-by: Lion Ackermann <nnamrec@gmail.com>
      Tested-by: Lion Ackermann <nnamrec@gmail.com>
      Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nft_inner: validate mandatory meta and payload · c4ab9da8
      Davide Ornaghi authored
      Check for mandatory netlink attributes in payload and meta expression
      when used embedded from the inner expression, otherwise NULL pointer
      dereference is possible from userspace.
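
The kind of check described here can be sketched in Python as follows. This is an illustration of validating mandatory attributes before use, not the kernel's implementation: the attribute names in `MANDATORY` are placeholders rather than the real netlink attribute identifiers, and `inner_init` is a hypothetical helper.

```python
import errno

# Placeholder attribute names for illustration; not the kernel's identifiers.
MANDATORY = {
    "payload": ("base", "offset", "len"),
    "meta": ("key",),
}


def inner_init(kind, attrs):
    """Return 0 on success, or -EINVAL if a mandatory attribute is missing.

    'attrs' maps attribute names to values supplied by userspace. A missing
    entry plays the role of a NULL attribute pointer.
    """
    for name in MANDATORY[kind]:
        if attrs.get(name) is None:
            return -errno.EINVAL
    return 0


print(inner_init("payload", {"base": 1, "offset": 0, "len": 4}))
print(inner_init("payload", {"base": 1}))
print(inner_init("meta", {}))
```

Rejecting the request up front means later code can read every mandatory attribute without re-checking it for NULL.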
      
      Fixes: a150d122 ("netfilter: nft_meta: add inner match support")
      Fixes: 3a07327d ("netfilter: nft_inner: support for inner tunnel header matching")
      Signed-off-by: Davide Ornaghi <d.ornaghi97@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • tcp: use signed arithmetic in tcp_rtx_probe0_timed_out() · 36534d3c
      Eric Dumazet authored
      Due to the timer wheel implementation, a timer will usually fire
      after its scheduled expiry.
      
      For instance, for HZ=1000, a timeout between 512ms and 4s
      has a granularity of 64ms.
      For this range of values, the extra delay could be up to 63ms.
      
      For TCP, this means that tp->rcv_tstamp may be after
      inet_csk(sk)->icsk_timeout whenever the timer interrupt
      finally triggers, if one packet came during the extra delay.
      
      We need to make sure tcp_rtx_probe0_timed_out() handles this case.
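
The effect of unsigned versus signed arithmetic on such a late timer can be shown with a short Python sketch that emulates 32-bit C integers. It models only the timestamp comparison, under the assumption that timestamps are 32-bit jiffies values; the helper names and the sample numbers are made up for illustration and the real function contains more logic than this.

```python
U32_MASK = 0xFFFFFFFF


def u32(value: int) -> int:
    """Truncate to an unsigned 32-bit value, like a C u32."""
    return value & U32_MASK


def s32(value: int) -> int:
    """Reinterpret the low 32 bits as a signed 32-bit value, like a C s32."""
    value &= U32_MASK
    return value - (1 << 32) if value & 0x80000000 else value


def exceeds_timeout_unsigned(icsk_timeout, rcv_tstamp, timeout):
    """Unsigned delta: a small negative difference wraps to a huge number."""
    rcv_delta = u32(icsk_timeout - rcv_tstamp)
    return rcv_delta > timeout


def exceeds_timeout_signed(icsk_timeout, rcv_tstamp, timeout):
    """Signed delta: a packet received after the deadline gives a negative delta."""
    rcv_delta = s32(icsk_timeout - rcv_tstamp)
    return rcv_delta > timeout


# Timer was scheduled for jiffy 10000 but fired late; a packet arrived at 10030.
print(exceeds_timeout_unsigned(10_000, 10_030, 5_000))
print(exceeds_timeout_signed(10_000, 10_030, 5_000))
```

In the unsigned variant the difference of -30 becomes a value close to 2^32, so the connection looks as if nothing had been received for a very long time even though a packet just arrived.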
      
      Fixes: e89688e3 ("net: tcp: fix unexcepted socket die when snd_wnd is 0")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Menglong Dong <imagedong@tencent.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
      Link: https://lore.kernel.org/r/20240607125652.1472540-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'mptcp-various-fixes' · 70b3c88c
      Jakub Kicinski authored
      Matthieu Baerts says:
      
      ====================
      mptcp: various fixes
      
      The different patches here are some unrelated fixes for MPTCP:
      
      - Patch 1 ensures 'snd_una' is initialised on connect in case of MPTCP
        fallback to TCP followed by retransmissions before the processing of
        any other incoming packets. A fix for v5.9+.
      
      - Patch 2 makes sure the RmAddr MIB counter is incremented, and only
        once per ID, upon the reception of a RM_ADDR. A fix for v5.10+.
      
      - Patch 3 doesn't update 'add addr' related counters if the connect()
        was not possible. A fix for v5.7+.
      
      - Patch 4 updates the mailmap file to add Geliang's new email address.
      ====================
      
      Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-0-1ab9ddfa3d00@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>