1. 13 Oct, 2022 23 commits
    • Merge tag 'net-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 66ae0436
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from netfilter, and wifi.
      
      Current release - regressions:
      
         - Revert "net/sched: taprio: make qdisc_leaf() see the
           per-netdev-queue pfifo child qdiscs", it may cause crashes when the
           qdisc is reconfigured
      
         - inet: ping: fix splat due to packet allocation refactoring in inet
      
         - tcp: clean up kernel listener's reqsk in inet_twsk_purge(), fix UAF
           due to races when per-netns hash table is used
      
        Current release - new code bugs:
      
         - eth: adin1110: check in netdev_event that netdev belongs to driver
      
         - fixes for PTR_ERR() vs NULL bugs in driver code, from Dan and co.
      
        Previous releases - regressions:
      
         - ipv4: handle attempt to delete multipath route when fib_info
           contains an nh reference, avoid oob access
      
         - wifi: fix handful of bugs in the new Multi-BSSID code
      
         - wifi: mt76: fix rate reporting / throughput regression on mt7915
           and newer, fix checksum offload
      
         - wifi: iwlwifi: mvm: fix double list_add at
           iwl_mvm_mac_wake_tx_queue (other cases)
      
         - wifi: mac80211: do not drop packets smaller than the LLC-SNAP
           header on fast-rx
      
        Previous releases - always broken:
      
         - ieee802154: don't warn zero-sized raw_sendmsg()
      
         - ipv6: ping: fix wrong checksum for large frames
      
         - mctp: prevent double key removal and unref
      
         - tcp/udp: fix memory leaks and races around IPV6_ADDRFORM
      
         - hv_netvsc: fix race between VF offering and VF association message
      
        Misc:
      
         - remove -Warray-bounds silencing in the drivers, compilers fixed"
      
      * tag 'net-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (73 commits)
        sunhme: fix an IS_ERR() vs NULL check in probe
        net: marvell: prestera: fix a couple NULL vs IS_ERR() checks
        kcm: avoid potential race in kcm_tx_work
        tcp: Clean up kernel listener's reqsk in inet_twsk_purge()
        net: phy: micrel: Fixes FIELD_GET assertion
        openvswitch: add nf_ct_is_confirmed check before assigning the helper
        tcp: Fix data races around icsk->icsk_af_ops.
        ipv6: Fix data races around sk->sk_prot.
        tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct().
        udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM).
        tcp/udp: Fix memory leak in ipv6_renew_options().
        mctp: prevent double key removal and unref
        selftests: netfilter: Fix nft_fib.sh for all.rp_filter=1
        netfilter: rpfilter/fib: Populate flowic_l3mdev field
        selftests: netfilter: Test reverse path filtering
        net/mlx5: Make ASO poll CQ usable in atomic context
        tcp: cdg: allow tcp_cdg_release() to be called multiple times
        inet: ping: fix recent breakage
        ipv6: ping: fix wrong checksum for large frames
        net: ethernet: ti: am65-cpsw: set correct devlink flavour for unused ports
        ...
      66ae0436
    • Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · d6f04f26
      Linus Torvalds authored
      Pull virtio fixes from Michael Tsirkin:
      
       - Fix a regression in virtio pci on power
      
       - Add a reviewer for ifcvf
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        vdpa/ifcvf: add reviewer
        virtio_pci: use irq to detect interrupt support
      d6f04f26
    • Merge tag 'trace-v6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · aa41478a
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
      
       - Found that the synthetic events were using strlen/strscpy() on values
         that could have come from userspace, and that is bad.
      
         Consolidate the string logic of kprobe and eprobe and extend it to
         the synthetic events to safely process string addresses.
      
       - Clean up the text dump in ftrace_bug() so that char reads are not
         treated as signed and the byte output is not sign-extended.
      
       - Fix some kernel docs in the ring buffer code.
      
      * tag 'trace-v6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        tracing: Fix reading strings from synthetic events
        tracing: Add "(fault)" name injection to kernel probes
        tracing: Move duplicate code of trace_kprobe/eprobe.c into header
        ring-buffer: Fix kernel-doc
        ftrace: Fix char print issue in print_ip_ins()
      aa41478a
    • Merge tag 'linux-watchdog-6.1-rc1' of git://www.linux-watchdog.org/linux-watchdog · 3d33e6dd
      Linus Torvalds authored
      Pull watchdog updates from Wim Van Sebroeck:
      
       - new driver for Exar/MaxLinear XR28V38x
      
       - support for exynosautov9 SoC
      
       - support for Renesas R-Car V5H (R8A779G0) and RZ/V2M (r9a09g011) SoC
      
       - support for imx93
      
       - several other fixes and improvements
      
      * tag 'linux-watchdog-6.1-rc1' of git://www.linux-watchdog.org/linux-watchdog: (36 commits)
        watchdog: twl4030_wdt: add missing mod_devicetable.h include
        dt-bindings: watchdog: migrate mt7621 text bindings to YAML
        watchdog: sp5100_tco: Add "action" module parameter
        watchdog: imx93: add watchdog timer on imx93
        watchdog: imx7ulp_wdt: init wdog when it was active
        watchdog: imx7ulp_wdt: Handle wdog reconfigure failure
        watchdog: imx7ulp_wdt: Fix RCS timeout issue
        watchdog: imx7ulp_wdt: Check CMD32EN in wdog init
        watchdog: imx7ulp: Add explict memory barrier for unlock sequence
        watchdog: imx7ulp: Move suspend/resume to noirq phase
        watchdog: rti-wdt:using the pm_runtime_resume_and_get to simplify the code
        dt-bindings: watchdog: rockchip: add rockchip,rk3128-wdt
        watchdog: s3c2410_wdt: support exynosautov9 watchdog
        dt-bindings: watchdog: add exynosautov9 compatible
        watchdog: npcm: Enable clock if provided
        watchdog: meson: keep running if already active
        watchdog: dt-bindings: atmel,at91sam9-wdt: convert to json-schema
        watchdog: armada_37xx_wdt: Fix .set_timeout callback
        watchdog: sa1100: make variable sa1100dog_driver static
        watchdog: w83977f_wdt: Fix comment typo
        ...
      3d33e6dd
    • Merge tag 'ceph-for-6.1-rc1' of https://github.com/ceph/ceph-client · 524d0c68
      Linus Torvalds authored
      Pull ceph updates from Ilya Dryomov:
       "A quiet round this time: several assorted filesystem fixes, the most
        noteworthy one being some additional wakeups in cap handling code, and
        a messenger cleanup"
      
      * tag 'ceph-for-6.1-rc1' of https://github.com/ceph/ceph-client:
        ceph: remove Sage's git tree from documentation
        ceph: fix incorrectly showing the .snap size for stat
        ceph: fail the open_by_handle_at() if the dentry is being unlinked
        ceph: increment i_version when doing a setattr with caps
        ceph: Use kcalloc for allocating multiple elements
        ceph: no need to wait for transition RDCACHE|RD -> RD
        ceph: fail the request if the peer MDS doesn't support getvxattr op
        ceph: wake up the waiters if any new caps comes
        libceph: drop last_piece flag from ceph_msg_data_cursor
      524d0c68
    • Merge tag 'nfs-for-6.1-1' of git://git.linux-nfs.org/projects/anna/linux-nfs · 66b83455
      Linus Torvalds authored
      Pull NFS client updates from Anna Schumaker:
       "New Features:
         - Add NFSv4.2 xattr tracepoints
         - Replace xprtiod WQ in rpcrdma
         - Flexfiles cancels I/O on layout recall or revoke
      
        Bugfixes and Cleanups:
         - Directly use ida_alloc() / ida_free()
         - Don't open-code max_t()
         - Prefer using strscpy over strlcpy
         - Remove unused forward declarations
         - Always return layout stats on flexfiles layout return
         - Have LISTXATTR treat NFS4ERR_NOXATTR as an empty reply instead of
           error
         - Allow more xprtrdma memory allocations to fail without triggering a
           reclaim
         - Various other xprtrdma clean ups
         - Fix rpc_killall_tasks() races"
      
      * tag 'nfs-for-6.1-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (27 commits)
        NFSv4/flexfiles: Cancel I/O if the layout is recalled or revoked
        SUNRPC: Add API to force the client to disconnect
        SUNRPC: Add a helper to allow pNFS drivers to selectively cancel RPC calls
        SUNRPC: Fix races with rpc_killall_tasks()
        xprtrdma: Fix uninitialized variable
        xprtrdma: Prevent memory allocations from driving a reclaim
        xprtrdma: Memory allocation should be allowed to fail during connect
        xprtrdma: MR-related memory allocation should be allowed to fail
        xprtrdma: Clean up synopsis of rpcrdma_regbuf_alloc()
        xprtrdma: Clean up synopsis of rpcrdma_req_create()
        svcrdma: Clean up RPCRDMA_DEF_GFP
        SUNRPC: Replace the use of the xprtiod WQ in rpcrdma
        NFSv4.2: Add a tracepoint for listxattr
        NFSv4.2: Add tracepoints for getxattr, setxattr, and removexattr
        NFSv4.2: Move TRACE_DEFINE_ENUM(NFS4_CONTENT_*) under CONFIG_NFS_V4_2
        NFSv4.2: Add special handling for LISTXATTR receiving NFS4ERR_NOXATTR
        nfs: remove nfs_wait_atomic_killable() and nfs_write_prepare() declaration
        NFSv4: remove nfs4_renewd_prepare_shutdown() declaration
        fs/nfs/pnfs_nfs.c: fix spelling typo and syntax error in comment
        NFSv4/pNFS: Always return layout stats on layout return for flexfiles
        ...
      66b83455
    • Merge tag 'for-linus-6.1-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux · 531d3b5f
      Linus Torvalds authored
      Pull orangefs update from Mike Marshall:
       "Change iterate to iterate_shared"
      
      * tag 'for-linus-6.1-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
        Orangefs: change iterate to iterate_shared
      531d3b5f
    • sunhme: fix an IS_ERR() vs NULL check in probe · 99df45c9
      Dan Carpenter authored
      The devm_request_region() function does not return error pointers, it
      returns NULL on error.
      
      Fixes: 914d9b27 ("sunhme: switch to devres")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: Sean Anderson <seanga2@gmail.com>
      Reviewed-by: Rolf Eike Beer <eike-kernel@sf-tec.de>
      Link: https://lore.kernel.org/r/Y0bWzJL8JknX8MUf@kili
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      99df45c9
    • net: marvell: prestera: fix a couple NULL vs IS_ERR() checks · 30e9672a
      Dan Carpenter authored
      The __prestera_nexthop_group_create() function returns NULL on error
      and the prestera_nexthop_group_get() returns error pointers.  Fix these
      two checks.
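
The two return conventions described above can be illustrated with a small userspace sketch. The helpers err_ptr(), is_err(), and ptr_err() are simplified stand-ins for the kernel's ERR_PTR(), IS_ERR(), and PTR_ERR(); create_returns_null() and get_returns_err_ptr() are hypothetical functions made up for this example and are not the prestera code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace stand-ins for the kernel's ERR_PTR()/IS_ERR()/PTR_ERR(). */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static inline int is_err(const void *ptr)
{
	/* Error pointers live in the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static int dummy_object;

/* Hypothetical function that reports failure with NULL. */
static void *create_returns_null(int fail)
{
	return fail ? NULL : &dummy_object;
}

/* Hypothetical function that reports failure with an error pointer. */
static void *get_returns_err_ptr(int fail)
{
	return fail ? err_ptr(-ENOMEM) : &dummy_object;
}

/* The caller's check has to match the callee's convention. */
static int use_create(int fail)
{
	void *obj = create_returns_null(fail);

	if (!obj)		/* is_err(NULL) is false, so an IS_ERR() test would miss this */
		return -ENOMEM;
	return 0;
}

static int use_get(int fail)
{
	void *obj = get_returns_err_ptr(fail);

	if (is_err(obj))	/* a NULL test would miss the error pointer */
		return (int)ptr_err(obj);
	return 0;
}
```

Swapping the two checks compiles cleanly in both cases, which is why this class of bug tends to be found by static analysis rather than by the compiler.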
      
      Fixes: 0a23ae23 ("net: marvell: prestera: Add router nexthops ABI")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Link: https://lore.kernel.org/r/Y0bWq+7DoKK465z8@kili
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      30e9672a
    • kcm: avoid potential race in kcm_tx_work · ec7eede3
      Eric Dumazet authored
      syzbot found that kcm_tx_work() could crash [1] in:
      
      	/* Primarily for SOCK_SEQPACKET sockets */
      	if (likely(sk->sk_socket) &&
      	    test_bit(SOCK_NOSPACE, &sk->sk_socket->flags)) {
      <<*>>	clear_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
      		sk->sk_write_space(sk);
      	}
      
      I think the reason is that another thread might concurrently
      run in kcm_release() and call sock_orphan(sk) while sk is not
      locked. kcm_tx_work() can then find sk->sk_socket set to NULL
      when it dereferences it again.
      
      [1]
      BUG: KASAN: null-ptr-deref in instrument_atomic_write include/linux/instrumented.h:86 [inline]
      BUG: KASAN: null-ptr-deref in clear_bit include/asm-generic/bitops/instrumented-atomic.h:41 [inline]
      BUG: KASAN: null-ptr-deref in kcm_tx_work+0xff/0x160 net/kcm/kcmsock.c:742
      Write of size 8 at addr 0000000000000008 by task kworker/u4:3/53
      
      CPU: 0 PID: 53 Comm: kworker/u4:3 Not tainted 5.19.0-rc3-next-20220621-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: kkcmd kcm_tx_work
      Call Trace:
      <TASK>
      __dump_stack lib/dump_stack.c:88 [inline]
      dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
      kasan_report+0xbe/0x1f0 mm/kasan/report.c:495
      check_region_inline mm/kasan/generic.c:183 [inline]
      kasan_check_range+0x13d/0x180 mm/kasan/generic.c:189
      instrument_atomic_write include/linux/instrumented.h:86 [inline]
      clear_bit include/asm-generic/bitops/instrumented-atomic.h:41 [inline]
      kcm_tx_work+0xff/0x160 net/kcm/kcmsock.c:742
      process_one_work+0x996/0x1610 kernel/workqueue.c:2289
      worker_thread+0x665/0x1080 kernel/workqueue.c:2436
      kthread+0x2e9/0x3a0 kernel/kthread.c:376
      ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:302
      </TASK>
      
      Fixes: ab7ac4eb ("kcm: Kernel Connection Multiplexor module")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <tom@herbertland.com>
      Link: https://lore.kernel.org/r/20221012133412.519394-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      ec7eede3
    • tcp: Clean up kernel listener's reqsk in inet_twsk_purge() · 740ea3c4
      Kuniyuki Iwashima authored
      Eric Dumazet reported a use-after-free related to the per-netns ehash
      series. [0]
      
      When we create a TCP socket from userspace, the socket always holds a
      refcnt of the netns.  This guarantees that a reqsk timer is always fired
      before netns dismantle.  Each reqsk has a refcnt of its listener, so the
      listener is not freed before the reqsk, and the net is not freed before
      the listener as well.
      
      OTOH, when in-kernel users create a TCP socket, it might not hold a refcnt
      of its netns.  Thus, a reqsk timer can be fired after the netns dismantle
      and access freed per-netns ehash.
      
      To avoid the use-after-free, we need to clean up TCP_NEW_SYN_RECV sockets
      in inet_twsk_purge() if the netns uses a per-netns ehash.
      
      [0]: https://lore.kernel.org/netdev/CANn89iLXMup0dRD_Ov79Xt8N9FM0XdhCHEN05sf3eLwxKweM6w@mail.gmail.com/
      
      BUG: KASAN: use-after-free in tcp_or_dccp_get_hashinfo
      include/net/inet_hashtables.h:181 [inline]
      BUG: KASAN: use-after-free in reqsk_queue_unlink+0x320/0x350
      net/ipv4/inet_connection_sock.c:913
      Read of size 8 at addr ffff88807545bd80 by task syz-executor.2/8301
      
      CPU: 1 PID: 8301 Comm: syz-executor.2 Not tainted
      6.0.0-syzkaller-02757-gaf7d23f9 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine,
      BIOS Google 09/22/2022
      Call Trace:
      <IRQ>
      __dump_stack lib/dump_stack.c:88 [inline]
      dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
      print_address_description mm/kasan/report.c:317 [inline]
      print_report.cold+0x2ba/0x719 mm/kasan/report.c:433
      kasan_report+0xb1/0x1e0 mm/kasan/report.c:495
      tcp_or_dccp_get_hashinfo include/net/inet_hashtables.h:181 [inline]
      reqsk_queue_unlink+0x320/0x350 net/ipv4/inet_connection_sock.c:913
      inet_csk_reqsk_queue_drop net/ipv4/inet_connection_sock.c:927 [inline]
      inet_csk_reqsk_queue_drop_and_put net/ipv4/inet_connection_sock.c:939 [inline]
      reqsk_timer_handler+0x724/0x1160 net/ipv4/inet_connection_sock.c:1053
      call_timer_fn+0x1a0/0x6b0 kernel/time/timer.c:1474
      expire_timers kernel/time/timer.c:1519 [inline]
      __run_timers.part.0+0x674/0xa80 kernel/time/timer.c:1790
      __run_timers kernel/time/timer.c:1768 [inline]
      run_timer_softirq+0xb3/0x1d0 kernel/time/timer.c:1803
      __do_softirq+0x1d0/0x9c8 kernel/softirq.c:571
      invoke_softirq kernel/softirq.c:445 [inline]
      __irq_exit_rcu+0x123/0x180 kernel/softirq.c:650
      irq_exit_rcu+0x5/0x20 kernel/softirq.c:662
      sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1107
      </IRQ>
      
      Fixes: d1e5e640 ("tcp: Introduce optional per-netns ehash.")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Reported-by: Eric Dumazet <edumazet@google.com>
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20221012145036.74960-1-kuniyu@amazon.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      740ea3c4
    • vdpa/ifcvf: add reviewer · be8ddea9
      Michael S. Tsirkin authored
      Zhu Lingshan has been writing and reviewing ifcvf patches for
      a while now, add as reviewer.
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Zhu Lingshan <lingshan.zhu@intel.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      be8ddea9
    • virtio_pci: use irq to detect interrupt support · 2145ab51
      Michael S. Tsirkin authored
      commit 71491c54 ("virtio_pci: don't try to use intxif pin is zero")
      breaks virtio_pci on powerpc, when running as a qemu guest.
      
      vp_find_vqs() bails out because pci_dev->pin == 0.
      
      But pci_dev->irq is populated correctly, so vp_find_vqs_intx() would
      succeed if we called it - which is what the code used to do.
      
      This seems to happen because pci_dev->pin is not populated in
      pci_assign_irq(). A PCI core bug? Maybe.
      
      However Linus said:
      	I really think that that is basically the only time you should use
      	that 'pci_dev->pin' thing: it basically exists not for "does this
      	device have an IRQ", but for "what is the routing of this irq on this
      	device".
      
      and
      	The correct way to check for "no irq" doesn't use NO_IRQ at all, it just does
      		if (dev->irq) ...
      
      so let's just check irq and be done with it.
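
The difference between the two checks can be sketched in a few lines of userspace C. The struct below is a toy with only the two fields discussed here, not the kernel's struct pci_dev, and both helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy subset of the fields discussed above; not the kernel's struct pci_dev. */
struct toy_pci_dev {
	unsigned char pin;	/* which INTx pin the interrupt is routed through */
	unsigned int irq;	/* assigned IRQ number, 0 meaning "no irq" */
};

/* Check described in the commit text: test irq, not pin. */
static bool toy_has_legacy_irq(const struct toy_pci_dev *dev)
{
	return dev->irq != 0;
}

/* The check that regressed: treats pin == 0 as "no interrupt". */
static bool toy_has_legacy_irq_by_pin(const struct toy_pci_dev *dev)
{
	return dev->pin != 0;
}
```

With pin == 0 and a populated irq, which is the situation reported for the powerpc qemu guest, only the irq-based helper reports that an interrupt is available.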
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Reported-by: Michael Ellerman <mpe@ellerman.id.au>
      Fixes: 71491c54 ("virtio_pci: don't try to use intxif pin is zero")
      Cc: "Angus Chen" <angus.chen@jaguarmicro.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Tested-by: Michael Ellerman <mpe@ellerman.id.au>
      Acked-by: Jason Wang <jasowang@redhat.com>
      Message-Id: <20221012220312.308522-1-mst@redhat.com>
      2145ab51
    • Merge tag 'wireless-2022-10-13' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless · ac85bc71
      Paolo Abeni authored
      Johannes Berg says:
      
      ====================
      More wireless fixes for 6.1
      
      This has only the fixes for the scan parsing issues.
      
      * tag 'wireless-2022-10-13' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
        wifi: cfg80211: update hidden BSSes to avoid WARN_ON
        wifi: mac80211: fix crash in beacon protection for P2P-device
        wifi: mac80211_hwsim: avoid mac80211 warning on bad rate
        wifi: cfg80211: avoid nontransmitted BSS list corruption
        wifi: cfg80211: fix BSS refcounting bugs
        wifi: cfg80211: ensure length byte is present before access
        wifi: mac80211: fix MBSSID parsing use-after-free
        wifi: cfg80211/mac80211: reject bad MBSSID elements
        wifi: cfg80211: fix u8 overflow in cfg80211_update_notlisted_nontrans()
      ====================
      
      Link: https://lore.kernel.org/r/20221013100522.46346-1-johannes@sipsolutions.net
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      ac85bc71
    • Merge branch 'cve-fixes-2022-10-13' · e7ad651c
      Johannes Berg authored
      Pull in the fixes for various scan parsing bugs found by
      Sönke Huster by fuzzing.
      e7ad651c
    • net: phy: micrel: Fixes FIELD_GET assertion · fa182ea2
      Divya Koppera authored
      FIELD_GET() must only be used with a mask that is a compile-time
      constant. Mark the functions as __always_inline to avoid the problem.
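
As a rough illustration, here is a simplified userspace version of a FIELD_GET()-style macro. TOY_BF_SHF(), TOY_FIELD_GET(), and toy_get_field() are names invented for this sketch. The kernel macro additionally asserts at build time that the mask is a constant, which this sketch leaves out. A helper that receives the mask as a parameter can only satisfy that assertion if the compiler inlines it into a caller passing a constant, which is the reason for forcing inlining.

```c
#include <assert.h>
#include <stdint.h>

/* Bit position of the lowest set bit of a non-zero mask (GCC/Clang builtin). */
#define TOY_BF_SHF(mask) (__builtin_ffsll(mask) - 1)

/* Simplified FIELD_GET(): isolate the masked bits and shift them down. */
#define TOY_FIELD_GET(mask, reg) \
	((((uint64_t)(reg)) & ((uint64_t)(mask))) >> TOY_BF_SHF(mask))

/*
 * Helper that takes the mask as a parameter. Forcing it inline lets the
 * compiler see the constant mask at each call site.
 */
static inline __attribute__((always_inline))
uint16_t toy_get_field(uint16_t reg, uint16_t mask)
{
	return (uint16_t)TOY_FIELD_GET(mask, reg);
}
```

For example, toy_get_field(0x1A34, 0x00F0) extracts bits 7:4 of the register value and yields 0x3.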
      
      Fixes: 21b688da ("net: phy: micrel: Cable Diag feature for lan8814 phy")
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Divya Koppera <Divya.Koppera@microchip.com>
      Link: https://lore.kernel.org/r/20221011095437.12580-1-Divya.Koppera@microchip.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      fa182ea2
    • openvswitch: add nf_ct_is_confirmed check before assigning the helper · 3c186054
      Xin Long authored
      A WARN_ON call trace would be triggered when 'ct(commit, alg=helper)'
      applies on a confirmed connection:
      
        WARNING: CPU: 0 PID: 1251 at net/netfilter/nf_conntrack_extend.c:98
        RIP: 0010:nf_ct_ext_add+0x12d/0x150 [nf_conntrack]
        Call Trace:
         <TASK>
         nf_ct_helper_ext_add+0x12/0x60 [nf_conntrack]
         __nf_ct_try_assign_helper+0xc4/0x160 [nf_conntrack]
         __ovs_ct_lookup+0x72e/0x780 [openvswitch]
         ovs_ct_execute+0x1d8/0x920 [openvswitch]
         do_execute_actions+0x4e6/0xb60 [openvswitch]
         ovs_execute_actions+0x60/0x140 [openvswitch]
         ovs_packet_cmd_execute+0x2ad/0x310 [openvswitch]
         genl_family_rcv_msg_doit.isra.15+0x113/0x150
         genl_rcv_msg+0xef/0x1f0
      
      which can be reproduced with these OVS flows:
      
        table=0, in_port=veth1,tcp,tcp_dst=2121,ct_state=-trk
        actions=ct(commit, table=1)
        table=1, in_port=veth1,tcp,tcp_dst=2121,ct_state=+trk+new
        actions=ct(commit, alg=ftp),normal
      
      The issue was introduced by commit 248d45f1 ("openvswitch: Allow
      attaching helper in later commit"), which removed the check
      of nf_ct_is_confirmed before assigning the helper. This patch fixes
      it by bringing the check back.
      
      Fixes: 248d45f1 ("openvswitch: Allow attaching helper in later commit")
      Reported-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Aaron Conole <aconole@redhat.com>
      Tested-by: Aaron Conole <aconole@redhat.com>
      Link: https://lore.kernel.org/r/c5c9092a22a2194650222bffaf786902613deb16.1665085502.git.lucien.xin@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      3c186054
    • Merge branch 'tcp-udp-fix-memory-leaks-and-data-races-around-ipv6_addrform' · 4f0f2121
      Jakub Kicinski authored
      Kuniyuki Iwashima says:
      
      ====================
      tcp/udp: Fix memory leaks and data races around IPV6_ADDRFORM.
      
      This series fixes some memory leaks and data races caused in the
      same scenario where one thread converts an IPv6 socket into IPv4
      with IPV6_ADDRFORM and another accesses the socket concurrently.
      
        v4: https://lore.kernel.org/netdev/20221004171802.40968-1-kuniyu@amazon.com/
        v3 (Resend): https://lore.kernel.org/netdev/20221003154425.49458-1-kuniyu@amazon.com/
        v3: https://lore.kernel.org/netdev/20220929012542.55424-1-kuniyu@amazon.com/
        v2: https://lore.kernel.org/netdev/20220928002741.64237-1-kuniyu@amazon.com/
        v1: https://lore.kernel.org/netdev/20220927161209.32939-1-kuniyu@amazon.com/
      ====================
      
      Link: https://lore.kernel.org/r/20221006185349.74777-1-kuniyu@amazon.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      4f0f2121
    • tcp: Fix data races around icsk->icsk_af_ops. · f49cd2f4
      Kuniyuki Iwashima authored
      setsockopt(IPV6_ADDRFORM) and tcp_v6_connect() change icsk->icsk_af_ops
      under lock_sock(), but tcp_(get|set)sockopt() read it locklessly.  To
      avoid load/store tearing, we need to add READ_ONCE() and WRITE_ONCE()
      for the reads and writes.
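
The annotation pattern can be sketched in userspace as follows. TOY_READ_ONCE() and TOY_WRITE_ONCE() are simplified volatile-cast versions of the kernel macros, and the toy_* types and functions are hypothetical stand-ins rather than the TCP code.

```c
#include <assert.h>

/* Simplified userspace versions; the kernel macros do more checking. */
#define TOY_READ_ONCE(x)	(*(const volatile __typeof__(x) *)&(x))
#define TOY_WRITE_ONCE(x, val)	(*(volatile __typeof__(x) *)&(x) = (val))

struct toy_af_ops {
	int family;
};

static const struct toy_af_ops toy_ipv4_ops = { 4 };
static const struct toy_af_ops toy_ipv6_ops = { 6 };

struct toy_icsk {
	const struct toy_af_ops *af_ops;
};

/* Writer side: in the kernel this runs under lock_sock(). */
static void toy_convert_to_ipv4(struct toy_icsk *icsk)
{
	TOY_WRITE_ONCE(icsk->af_ops, &toy_ipv4_ops);
}

/* Lockless reader: load the pointer once, then use only the local copy. */
static int toy_current_family(struct toy_icsk *icsk)
{
	const struct toy_af_ops *ops = TOY_READ_ONCE(icsk->af_ops);

	return ops->family;
}
```

The volatile access keeps the compiler from splitting, repeating, or merging the load and store. It does not order them against other memory accesses, so it addresses tearing only.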
      
      Thanks to Eric Dumazet for providing the syzbot report:
      
      BUG: KCSAN: data-race in tcp_setsockopt / tcp_v6_connect
      
      write to 0xffff88813c624518 of 8 bytes by task 23936 on cpu 0:
      tcp_v6_connect+0x5b3/0xce0 net/ipv6/tcp_ipv6.c:240
      __inet_stream_connect+0x159/0x6d0 net/ipv4/af_inet.c:660
      inet_stream_connect+0x44/0x70 net/ipv4/af_inet.c:724
      __sys_connect_file net/socket.c:1976 [inline]
      __sys_connect+0x197/0x1b0 net/socket.c:1993
      __do_sys_connect net/socket.c:2003 [inline]
      __se_sys_connect net/socket.c:2000 [inline]
      __x64_sys_connect+0x3d/0x50 net/socket.c:2000
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      read to 0xffff88813c624518 of 8 bytes by task 23937 on cpu 1:
      tcp_setsockopt+0x147/0x1c80 net/ipv4/tcp.c:3789
      sock_common_setsockopt+0x5d/0x70 net/core/sock.c:3585
      __sys_setsockopt+0x212/0x2b0 net/socket.c:2252
      __do_sys_setsockopt net/socket.c:2263 [inline]
      __se_sys_setsockopt net/socket.c:2260 [inline]
      __x64_sys_setsockopt+0x62/0x70 net/socket.c:2260
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      value changed: 0xffffffff8539af68 -> 0xffffffff8539aff8
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 23937 Comm: syz-executor.5 Not tainted
      6.0.0-rc4-syzkaller-00331-g4ed9c1e9-dirty #0
      
      Hardware name: Google Google Compute Engine/Google Compute Engine,
      BIOS Google 08/26/2022
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Reported-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      f49cd2f4
    • ipv6: Fix data races around sk->sk_prot. · 364f997b
      Kuniyuki Iwashima authored
      Commit 086d4905 ("ipv6: annotate some data-races around sk->sk_prot")
      fixed some data-races around sk->sk_prot but it was not enough.
      
      Some functions in inet6_(stream|dgram)_ops still access sk->sk_prot
      without lock_sock() or rtnl_lock(), so they need READ_ONCE() to avoid
      load tearing.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      364f997b
    • tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct(). · d38afeec
      Kuniyuki Iwashima authored
      Originally, inet6_sk(sk)->XXX were changed under lock_sock(), so we were
      able to clean them up by calling inet6_destroy_sock() during the IPv6 ->
      IPv4 conversion by IPV6_ADDRFORM.  However, commit 03485f2a ("udpv6:
      Add lockless sendmsg() support") added a lockless memory allocation path,
      which could cause a memory leak:
      
      setsockopt(IPV6_ADDRFORM)                 sendmsg()
      +-----------------------+                 +-------+
      - do_ipv6_setsockopt(sk, ...)             - udpv6_sendmsg(sk, ...)
        - sockopt_lock_sock(sk)                   ^._ called via udpv6_prot
          - lock_sock(sk)                             before WRITE_ONCE()
        - WRITE_ONCE(sk->sk_prot, &tcp_prot)
        - inet6_destroy_sock()                    - if (!corkreq)
        - sockopt_release_sock(sk)                  - ip6_make_skb(sk, ...)
          - release_sock(sk)                          ^._ lockless fast path for
                                                          the non-corking case
      
                                                      - __ip6_append_data(sk, ...)
                                                        - ipv6_local_rxpmtu(sk, ...)
                                                          - xchg(&np->rxpmtu, skb)
                                                            ^._ rxpmtu is never freed.
      
                                                      - goto out_no_dst;
      
                                                  - lock_sock(sk)
      
      For now, rxpmtu is the only such case, but to avoid missing future changes
      and bugs similar to the one fixed in commit e2732600 ("net: ping6: Fix
      memleak in ipv6_renew_options()."), let's set a new function as the IPv6
      sk->sk_destruct() and call inet6_cleanup_sock() there.  Since the
      conversion does not change sk->sk_destruct(), we can guarantee that
      the IPv6 resources are eventually cleaned up.
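
The idea of hanging the cleanup on a callback that the conversion does not replace can be modelled with a toy socket. Everything named toy_* below is hypothetical and heavily simplified; it is a sketch of the pattern under those assumptions, not the kernel implementation.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_sock;

struct toy_proto {
	void (*destroy)(struct toy_sock *sk);	/* protocol-specific teardown */
};

struct toy_sock {
	const struct toy_proto *prot;		/* swapped by the conversion */
	void (*destruct)(struct toy_sock *sk);	/* left alone by the conversion */
	void *rxpmtu;				/* IPv6-only resource */
};

static void toy_v6_destroy(struct toy_sock *sk)
{
	free(sk->rxpmtu);
	sk->rxpmtu = NULL;
}

static void toy_v4_destroy(struct toy_sock *sk)
{
	(void)sk;	/* knows nothing about IPv6 state */
}

static const struct toy_proto toy_udpv6_prot = { toy_v6_destroy };
static const struct toy_proto toy_udp_prot = { toy_v4_destroy };

/* Destructor installed when the IPv6 socket is created. */
static void toy_v6_destruct(struct toy_sock *sk)
{
	free(sk->rxpmtu);
	sk->rxpmtu = NULL;
}

static void toy_sock_init_v6(struct toy_sock *sk)
{
	sk->prot = &toy_udpv6_prot;
	sk->destruct = toy_v6_destruct;
	sk->rxpmtu = NULL;
}

/* IPv6 -> IPv4 conversion: only the protocol ops change. */
static void toy_addrform(struct toy_sock *sk)
{
	sk->prot = &toy_udp_prot;
}

static void toy_sock_release(struct toy_sock *sk)
{
	sk->prot->destroy(sk);	/* IPv4 ops after conversion: frees nothing */
	sk->destruct(sk);	/* still the IPv6 destructor: frees rxpmtu */
}
```

If rxpmtu is allocated after toy_addrform() has run, the IPv4 destroy hook leaves it alone, but the destructor still frees it when the socket is released.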
      
      We can now remove all inet6_destroy_sock() calls from IPv6 protocol
      specific ->destroy() functions, but such changes are invasive to
      backport.  So they can be posted as a follow-up later for net-next.
      
      Fixes: 03485f2a ("udpv6: Add lockless sendmsg() support")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      d38afeec
    • udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM). · 21985f43
      Kuniyuki Iwashima authored
      Commit 4b340ae2 ("IPv6: Complete IPV6_DONTFRAG support") forgot
      to add a change to free inet6_sk(sk)->rxpmtu while converting an IPv6
      socket into IPv4 with IPV6_ADDRFORM.  After conversion, sk_prot is
      changed to udp_prot and ->destroy() never cleans it up, resulting in
      a memory leak.
      
      This is due to the discrepancy between inet6_destroy_sock() and
      IPV6_ADDRFORM, so let's call inet6_destroy_sock() from IPV6_ADDRFORM
      to remove the difference.
      
      However, this is not enough for now because rxpmtu can be changed
      without lock_sock() after commit 03485f2a ("udpv6: Add lockless
      sendmsg() support").  We will fix this case in the following patch.
      
      Note we will rename inet6_destroy_sock() to inet6_cleanup_sock() and
      remove unnecessary inet6_destroy_sock() calls in sk_prot->destroy()
      in the future.
      
      Fixes: 4b340ae2 ("IPv6: Complete IPV6_DONTFRAG support")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      21985f43
    • tcp/udp: Fix memory leak in ipv6_renew_options(). · 3c52c6bb
      Kuniyuki Iwashima authored
      syzbot reported a memory leak [0] related to IPV6_ADDRFORM.
      
      The scenario is that while one thread is converting an IPv6 socket into
      IPv4 with IPV6_ADDRFORM, another thread calls do_ipv6_setsockopt() and
      allocates memory to inet6_sk(sk)->XXX after conversion.
      
      Then, the converted sk with (tcp|udp)_prot never frees the IPv6 resources,
      which inet6_destroy_sock() should have cleaned up.
      
      setsockopt(IPV6_ADDRFORM)                 setsockopt(IPV6_DSTOPTS)
      +-----------------------+                 +----------------------+
      - do_ipv6_setsockopt(sk, ...)
        - sockopt_lock_sock(sk)                 - do_ipv6_setsockopt(sk, ...)
          - lock_sock(sk)                         ^._ called via tcpv6_prot
        - WRITE_ONCE(sk->sk_prot, &tcp_prot)          before WRITE_ONCE()
        - xchg(&np->opt, NULL)
        - txopt_put(opt)
        - sockopt_release_sock(sk)
          - release_sock(sk)                      - sockopt_lock_sock(sk)
                                                    - lock_sock(sk)
                                                  - ipv6_set_opt_hdr(sk, ...)
                                                    - ipv6_update_options(sk, opt)
                                                      - xchg(&inet6_sk(sk)->opt, opt)
                                                        ^._ opt is never freed.
      
                                                  - sockopt_release_sock(sk)
                                                    - release_sock(sk)
      
      Since IPV6_DSTOPTS allocates options under lock_sock(), we can avoid this
      memory leak by testing whether sk_family is changed by IPV6_ADDRFORM after
      acquiring the lock.
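The re-check described above can be sketched in plain userspace C. This is a toy model under stated assumptions, not the kernel patch: `struct toy_sock`, `toy_convert_to_v4()` and `toy_set_v6_opt()` are hypothetical names, a boolean flag stands in for lock_sock(), and the -1 return value is an arbitrary choice. It shows only the ordering: take the lock first, then test the family, then allocate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-ins for illustration only; none of these are kernel types. */
enum toy_family { TOY_AF_INET, TOY_AF_INET6 };

struct toy_sock {
	bool locked;            /* stands in for lock_sock()/release_sock() */
	enum toy_family family; /* stands in for sk->sk_family */
	void *v6_opt;           /* stands in for inet6_sk(sk)->opt */
};

static void toy_lock(struct toy_sock *sk)
{
	assert(!sk->locked);
	sk->locked = true;
}

static void toy_unlock(struct toy_sock *sk)
{
	assert(sk->locked);
	sk->locked = false;
}

void toy_sock_init(struct toy_sock *sk)
{
	sk->locked = false;
	sk->family = TOY_AF_INET6;
	sk->v6_opt = NULL;
}

/* Models IPV6_ADDRFORM: switch to IPv4 and drop the IPv6 options. */
void toy_convert_to_v4(struct toy_sock *sk)
{
	toy_lock(sk);
	sk->family = TOY_AF_INET;
	free(sk->v6_opt);
	sk->v6_opt = NULL;
	toy_unlock(sk);
}

/*
 * Models an IPv6 option setter. Returns 0 on success and -1 when the
 * socket is no longer IPv6 or the allocation fails. The family test
 * happens after the lock is taken, so nothing is allocated on a socket
 * that was converted in the meantime.
 */
int toy_set_v6_opt(struct toy_sock *sk, size_t len)
{
	void *opt;
	int ret = -1;

	toy_lock(sk);
	if (sk->family != TOY_AF_INET6)
		goto unlock;
	opt = malloc(len);
	if (!opt)
		goto unlock;
	free(sk->v6_opt);
	sk->v6_opt = opt;
	ret = 0;
unlock:
	toy_unlock(sk);
	return ret;
}
```

Calling `toy_set_v6_opt()` after `toy_convert_to_v4()` fails without allocating, which is the property the fix relies on.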
      
      This issue has existed since the initial commit, between IPV6_ADDRFORM
      and IPV6_PKTOPTIONS.
      
      [0]:
      BUG: memory leak
      unreferenced object 0xffff888009ab9f80 (size 96):
        comm "syz-executor583", pid 328, jiffies 4294916198 (age 13.034s)
        hex dump (first 32 bytes):
          01 00 00 00 48 00 00 00 08 00 00 00 00 00 00 00  ....H...........
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<000000002ee98ae1>] kmalloc include/linux/slab.h:605 [inline]
          [<000000002ee98ae1>] sock_kmalloc+0xb3/0x100 net/core/sock.c:2566
          [<0000000065d7b698>] ipv6_renew_options+0x21e/0x10b0 net/ipv6/exthdrs.c:1318
          [<00000000a8c756d7>] ipv6_set_opt_hdr net/ipv6/ipv6_sockglue.c:354 [inline]
          [<00000000a8c756d7>] do_ipv6_setsockopt.constprop.0+0x28b7/0x4350 net/ipv6/ipv6_sockglue.c:668
          [<000000002854d204>] ipv6_setsockopt+0xdf/0x190 net/ipv6/ipv6_sockglue.c:1021
          [<00000000e69fdcf8>] tcp_setsockopt+0x13b/0x2620 net/ipv4/tcp.c:3789
          [<0000000090da4b9b>] __sys_setsockopt+0x239/0x620 net/socket.c:2252
          [<00000000b10d192f>] __do_sys_setsockopt net/socket.c:2263 [inline]
          [<00000000b10d192f>] __se_sys_setsockopt net/socket.c:2260 [inline]
          [<00000000b10d192f>] __x64_sys_setsockopt+0xbe/0x160 net/socket.c:2260
          [<000000000a80d7aa>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
          [<000000000a80d7aa>] do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
          [<000000004562b5c6>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  2. 12 Oct, 2022 17 commits
    • Linus Torvalds's avatar
      Merge tag 'linux-kselftest-kunit-6.1-rc1-2' of... · a185a099
      Linus Torvalds authored
      Merge tag 'linux-kselftest-kunit-6.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
      
      Pull more KUnit updates from Shuah Khan:
       "Features and fixes:
      
         - simplify resource use
      
         - make kunit_malloc() and kunit_free() allocations and frees
           consistent. kunit_free() frees only the memory allocated by
           kunit_malloc()
      
         - stop downloading risc-v opensbi binaries using wget
      
         - other fixes and improvements to tool and KUnit framework"
      
      * tag 'linux-kselftest-kunit-6.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
        Documentation: kunit: Update description of --alltests option
        kunit: declare kunit_assert structs as const
        kunit: rename base KUNIT_ASSERTION macro to _KUNIT_FAILED
        kunit: remove format func from struct kunit_assert, get it to 0 bytes
        kunit: tool: Don't download risc-v opensbi firmware with wget
        kunit: make kunit_kfree(NULL) a no-op to match kfree()
        kunit: make kunit_kfree() not segfault on invalid inputs
        kunit: make kunit_kfree() only work on pointers from kunit_malloc() and friends
        kunit: drop test pointer in string_stream_fragment
        kunit: string-stream: Simplify resource use
    • Linus Torvalds's avatar
      Merge tag 'linux-kselftest-next-6.1-rc1-2' of... · 661e0096
      Linus Torvalds authored
      Merge tag 'linux-kselftest-next-6.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
      
      Pull more Kselftest updates from Shuah Khan:
       "This consists of fixes and improvements to memory-hotplug test and a
        minor spelling fix to ftrace test"
      
      * tag 'linux-kselftest-next-6.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
        docs: notifier-error-inject: Correct test's name
        selftests/memory-hotplug: Adjust log info for maintainability
        selftests/memory-hotplug: Restore memory before exit
        selftests/memory-hotplug: Add checking after online or offline
        selftests/ftrace: func_event_triggers: fix typo in user message
    • Linus Torvalds's avatar
      Merge tag 'vfio-v6.1-rc1' of https://github.com/awilliam/linux-vfio · d3cf4051
      Linus Torvalds authored
      Pull VFIO updates from Alex Williamson:
      
       - Prune private items from vfio_pci_core.h to a new internal header,
         fix missed function rename, and refactor vfio-pci interrupt defines
         (Jason Gunthorpe)
      
       - Create consistent naming and handling of ioctls with a function per
         ioctl for vfio-pci and vfio group handling, use proper type args
         where available (Jason Gunthorpe)
      
       - Implement a set of low power device feature ioctls allowing userspace
         to make use of power states such as D3cold where supported (Abhishek
         Sahu)
      
       - Remove device counter on vfio groups, which had restricted the page
         pinning interface to singleton groups to account for limitations in
         the type1 IOMMU backend. Document usage as limited to emulated IOMMU
         devices, ie. traditional mdev devices where this restriction is
         consistent (Jason Gunthorpe)
      
       - Correct function prefix in hisi_acc driver incurred during previous
         refactoring (Shameer Kolothum)
      
       - Correct typo and remove redundant warning triggers in vfio-fsl driver
         (Christophe JAILLET)
      
       - Introduce device level DMA dirty tracking uAPI and implementation in
         the mlx5 variant driver (Yishai Hadas & Joao Martins)
      
       - Move much of the vfio_device life cycle management into vfio core,
         simplifying and avoiding duplication across drivers. This also
         facilitates adding a struct device to vfio_device which begins the
         introduction of device rather than group level user support and fills
         a gap, allowing userspace to identify devices as vfio capable without
         implicit knowledge of the driver (Kevin Tian & Yi Liu)
      
       - Split vfio container handling to a separate file, creating a more
         well defined API between the core and container code, masking IOMMU
         backend implementation from the core, allowing for an easier future
         transition to an iommufd based implementation of the same (Jason
         Gunthorpe)
      
       - Attempt to resolve race accessing the iommu_group for a device
         between vfio releasing DMA ownership and removal of the device from
         the IOMMU driver. Follow-up with support to allow vfio_group to exist
         with NULL iommu_group pointer to support existing userspace use cases
         of holding the group file open (Jason Gunthorpe)
      
       - Fix error code and hi/lo register manipulation issues in the hisi_acc
         variant driver, along with various code cleanups (Longfang Liu)
      
       - Fix a prior regression in GVT-g group teardown, resulting in
         unreleased resources (Jason Gunthorpe)
      
       - A significant cleanup and simplification of the mdev interface,
         consolidating much of the open coded per driver sysfs interface
         support into the mdev core (Christoph Hellwig)
      
       - Simplification of tracking and locking around vfio_groups that fall
         out from previous refactoring (Jason Gunthorpe)
      
       - Replace trivial open coded f_ops tests with new helper (Alex
         Williamson)
      
      * tag 'vfio-v6.1-rc1' of https://github.com/awilliam/linux-vfio: (77 commits)
        vfio: More vfio_file_is_group() use cases
        vfio: Make the group FD disassociate from the iommu_group
        vfio: Hold a reference to the iommu_group in kvm for SPAPR
        vfio: Add vfio_file_is_group()
        vfio: Change vfio_group->group_rwsem to a mutex
        vfio: Remove the vfio_group->users and users_comp
        vfio/mdev: add mdev available instance checking to the core
        vfio/mdev: consolidate all the description sysfs into the core code
        vfio/mdev: consolidate all the available_instance sysfs into the core code
        vfio/mdev: consolidate all the name sysfs into the core code
        vfio/mdev: consolidate all the device_api sysfs into the core code
        vfio/mdev: remove mtype_get_parent_dev
        vfio/mdev: remove mdev_parent_dev
        vfio/mdev: unexport mdev_bus_type
        vfio/mdev: remove mdev_from_dev
        vfio/mdev: simplify mdev_type handling
        vfio/mdev: embedd struct mdev_parent in the parent data structure
        vfio/mdev: make mdev.h standalone includable
        drm/i915/gvt: simplify vgpu configuration management
        drm/i915/gvt: fix a memory leak in intel_gvt_init_vgpu_types
        ...
    • Linus Torvalds's avatar
      Merge tag 'for-linus-6.1-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 778ce723
      Linus Torvalds authored
      Pull xen updates from Juergen Gross:
      
       - Some minor typo fixes
      
       - A fix of the Xen pcifront driver for supporting the device model to
         run in a Linux stub domain
      
       - A cleanup of the pcifront driver
      
       - A series to enable grant-based virtio with Xen on x86
      
       - A cleanup of Xen PV guests to distinguish between safe and faulting
         MSR accesses
      
       - Two fixes of the Xen gntdev driver
      
       - Two fixes of the new xen grant DMA driver
      
      * tag 'for-linus-6.1-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen: Kconfig: Fix spelling mistake "Maxmium" -> "Maximum"
        xen/pv: support selecting safe/unsafe msr accesses
        xen/pv: refactor msr access functions to support safe and unsafe accesses
        xen/pv: fix vendor checks for pmu emulation
        xen/pv: add fault recovery control to pmu msr accesses
        xen/virtio: enable grant based virtio on x86
        xen/virtio: use dom0 as default backend for CONFIG_XEN_VIRTIO_FORCE_GRANT
        xen/virtio: restructure xen grant dma setup
        xen/pcifront: move xenstore config scanning into sub-function
        xen/gntdev: Accommodate VMA splitting
        xen/gntdev: Prevent leaking grants
        xen/virtio: Fix potential deadlock when accessing xen_grant_dma_devices
        xen/virtio: Fix n_pages calculation in xen_grant_dma_map(unmap)_page()
        xen/xenbus: Fix spelling mistake "hardward" -> "hardware"
        xen-pcifront: Handle missed Connected state
    • Linus Torvalds's avatar
      Merge tag 'mm-hotfixes-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm · 1440f576
      Linus Torvalds authored
      Pull misc hotfixes from Andrew Morton:
       "Five hotfixes - three for nilfs2, two for MM. Four are cc:stable, one
        is not"
      
      * tag 'mm-hotfixes-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
        nilfs2: fix leak of nilfs_root in case of writer thread creation failure
        nilfs2: fix NULL pointer dereference at nilfs_bmap_lookup_at_level()
        nilfs2: fix use-after-free bug of struct nilfs_root
        mm/damon/core: initialize damon_target->list in damon_new_target()
        mm/hugetlb: fix races when looking up a CONT-PTE/PMD size hugetlb page
    • Linus Torvalds's avatar
      Merge tag 'mm-nonmm-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm · 676cb495
      Linus Torvalds authored
      Pull non-MM updates from Andrew Morton:
      
       - hfs and hfsplus kmap API modernization (Fabio Francesco)
      
       - make crash-kexec work properly when invoked from an NMI-time panic
         (Valentin Schneider)
      
       - ntfs bugfixes (Hawkins Jiawei)
      
       - improve IPC msg scalability by replacing atomic_t's with percpu
         counters (Jiebin Sun)
      
       - nilfs2 cleanups (Minghao Chi)
      
       - lots of other single patches all over the tree!
      
      * tag 'mm-nonmm-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (71 commits)
        include/linux/entry-common.h: remove has_signal comment of arch_do_signal_or_restart() prototype
        proc: test how it holds up with mapping'less process
        mailmap: update Frank Rowand email address
        ia64: mca: use strscpy() is more robust and safer
        init/Kconfig: fix unmet direct dependencies
        ia64: update config files
        nilfs2: replace WARN_ONs by nilfs_error for checkpoint acquisition failure
        fork: remove duplicate included header files
        init/main.c: remove unnecessary (void*) conversions
        proc: mark more files as permanent
        nilfs2: remove the unneeded result variable
        nilfs2: delete unnecessary checks before brelse()
        checkpatch: warn for non-standard fixes tag style
        usr/gen_init_cpio.c: remove unnecessary -1 values from int file
        ipc/msg: mitigate the lock contention with percpu counter
        percpu: add percpu_counter_add_local and percpu_counter_sub_local
        fs/ocfs2: fix repeated words in comments
        relay: use kvcalloc to alloc page array in relay_alloc_page_array
        proc: make config PROC_CHILDREN depend on PROC_FS
        fs: uninline inode_maybe_inc_iversion()
        ...
    • Steven Rostedt (Google)'s avatar
      tracing: Fix reading strings from synthetic events · 0934ae99
      Steven Rostedt (Google) authored
      The following commands caused a crash:
      
        # cd /sys/kernel/tracing
        # echo 's:open char file[]' > dynamic_events
        # echo 'hist:keys=common_pid:file=filename:onchange($file).trace(open,$file)' > events/syscalls/sys_enter_openat/trigger
        # echo 1 > events/synthetic/open/enable
      
      BOOM!
      
      The problem is that the synthetic event field "char file[]" will read
      the value given to it as a string without any memory checks to make sure
      the address is valid. The above example will pass in the user space
      address and the synthetic event code will happily call strlen() on it
      and then strscpy() where either one will cause an oops when accessing
      user space addresses.
      
      Use the helper functions from trace_kprobe and trace_eprobe that can
      read strings safely (and actually succeed when the address is from user
      space and the memory is mapped in).
      
      Now the above can show:
      
           packagekitd-1721    [000] ...2.   104.597170: open: file=/usr/lib/rpm/fileattrs/cmake.attr
          in:imjournal-978     [006] ...2.   104.599642: open: file=/var/lib/rsyslog/imjournal.state.tmp
           packagekitd-1721    [000] ...2.   104.626308: open: file=/usr/lib/rpm/fileattrs/debuginfo.attr
      
      Link: https://lkml.kernel.org/r/20221012104534.826549315@goodmis.org
      
      Cc: stable@vger.kernel.org
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Tom Zanussi <zanussi@kernel.org>
      Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Reviewed-by: Tom Zanussi <zanussi@kernel.org>
      Fixes: bd82631d ("tracing: Add support for dynamic strings to synthetic events")
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    • Steven Rostedt (Google)'s avatar
      tracing: Add "(fault)" name injection to kernel probes · 2e9906f8
      Steven Rostedt (Google) authored
      Have the specific functions for kernel probes that read strings to inject
      the "(fault)" name directly. trace_probes.c does this too (for uprobes)
      but as the code to read strings is going to be used by synthetic events
      (and perhaps other utilities), it simplifies the code by making sure those
      other uses do not need to implement the "(fault)" name injection as well.
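The idea of injecting the "(fault)" name inside the string-reading helper, so that callers do not each need their own fallback, can be sketched in userspace C. This is an illustration under stated assumptions, not the trace_probe code: `toy_fetch_string()`, `toy_read_fn`, `toy_read_ok()` and `toy_read_fail()` are hypothetical names, and the readers only simulate a successful or a faulting memory access.

```c
#include <stddef.h>
#include <stdio.h>

#define TOY_FAULT_STR "(fault)"

/*
 * Hypothetical reader callback: copies a NUL-terminated string of at most
 * size bytes (including the NUL) into dst. Returns the number of bytes
 * written, or a negative value if the source could not be read.
 */
typedef long (*toy_read_fn)(char *dst, const void *src, size_t size);

/* Simulated successful read: a bounded copy of the source string. */
long toy_read_ok(char *dst, const void *src, size_t size)
{
	const char *s = src;
	size_t i;

	for (i = 0; i + 1 < size && s[i] != '\0'; i++)
		dst[i] = s[i];
	dst[i] = '\0';
	return (long)(i + 1);
}

/* Simulated faulting read: the source address is treated as unreadable. */
long toy_read_fail(char *dst, const void *src, size_t size)
{
	(void)dst;
	(void)src;
	(void)size;
	return -1;
}

/*
 * Fetch a string through the reader. On failure the helper itself stores
 * "(fault)" in dst, so every caller gets the same placeholder without
 * extra code. The reader's return value is passed through unchanged.
 */
long toy_fetch_string(char *dst, size_t size, const void *src,
		      toy_read_fn reader)
{
	long ret;

	if (size == 0)
		return 0;
	ret = reader(dst, src, size);
	if (ret < 0)
		snprintf(dst, size, "%s", TOY_FAULT_STR);
	return ret;
}
```

A caller that only prints the buffer sees either the real string or "(fault)", and can still inspect the return value if it needs to tell the two cases apart.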
      
      Link: https://lkml.kernel.org/r/20221012104534.644803645@goodmis.org
      
      Cc: stable@vger.kernel.org
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Tom Zanussi <zanussi@kernel.org>
      Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Reviewed-by: Tom Zanussi <zanussi@kernel.org>
      Fixes: bd82631d ("tracing: Add support for dynamic strings to synthetic events")
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    • Steven Rostedt (Google)'s avatar
      tracing: Move duplicate code of trace_kprobe/eprobe.c into header · f1d3cbfa
      Steven Rostedt (Google) authored
      The functions:
      
        fetch_store_strlen_user()
        fetch_store_strlen()
        fetch_store_string_user()
        fetch_store_string()
      
      are identical in both trace_kprobe.c and trace_eprobe.c. Move them into
      a new header file trace_probe_kernel.h to share it. This code will later
      be used by the synthetic events as well.
      
      Marked for stable as a fix for a crash in synthetic events requires it.
      
      Link: https://lkml.kernel.org/r/20221012104534.467668078@goodmis.org
      
      Cc: stable@vger.kernel.org
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Tom Zanussi <zanussi@kernel.org>
      Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Reviewed-by: Tom Zanussi <zanussi@kernel.org>
      Fixes: bd82631d ("tracing: Add support for dynamic strings to synthetic events")
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    • Linus Torvalds's avatar
      Merge tag 'loongarch-6.1' of... · 95b8b595
      Linus Torvalds authored
      Merge tag 'loongarch-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
      
      Pull LoongArch updates from Huacai Chen:
      
       - Use EXPLICIT_RELOCS (ABIv2.0)
      
       - Use generic BUG() handler
      
       - Refactor TLB/Cache operations
      
       - Add qspinlock support
      
       - Add perf events support
      
       - Add kexec/kdump support
      
       - Add BPF JIT support
      
       - Add ACPI-based laptop driver
      
       - Update the default config file
      
      * tag 'loongarch-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (25 commits)
        LoongArch: Update Loongson-3 default config file
        LoongArch: Add ACPI-based generic laptop driver
        LoongArch: Add BPF JIT support
        LoongArch: Add some instruction opcodes and formats
        LoongArch: Move {signed,unsigned}_imm_check() to inst.h
        LoongArch: Add kdump support
        LoongArch: Add kexec support
        LoongArch: Use generic BUG() handler
        LoongArch: Add SysRq-x (TLB Dump) support
        LoongArch: Add perf events support
        LoongArch: Add qspinlock support
        LoongArch: Use TLB for ioremap()
        LoongArch: Support access filter to /dev/mem interface
        LoongArch: Refactor cache probe and flush methods
        LoongArch: mm: Refactor TLB exception handlers
        LoongArch: Support R_LARCH_GOT_PC_{LO12,HI20} in modules
        LoongArch: Support PC-relative relocations in modules
        LoongArch: Define ELF relocation types added in ABIv2.0
        LoongArch: Adjust symbol addressing for AS_HAS_EXPLICIT_RELOCS
        LoongArch: Add Kconfig option AS_HAS_EXPLICIT_RELOCS
        ...
    • Linus Torvalds's avatar
      Merge tag 'irq-core-2022-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 60ac35bf
      Linus Torvalds authored
      Pull interrupt updates from Thomas Gleixner:
       "Core code:
      
         - Provide a generic wrapper which can be utilized in drivers to
           handle the problem of force threaded demultiplex interrupts on RT
           enabled kernels. This avoids conditionals and horrible quirks in
           drivers all over the place
      
         - Fix up affected pinctrl and GPIO drivers to make them cleanly RT
           safe
      
        Interrupt drivers:
      
         - A new driver for the FSL MU platform specific MSI implementation
      
         - Make irqchip_init() available for pure ACPI based systems
      
         - Provide a functional DT binding for the Realtek RTL interrupt chip
      
         - The usual DT updates and small code improvements all over the
           place"
      
      * tag 'irq-core-2022-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
        irqchip: IMX_MU_MSI should depend on ARCH_MXC
        irqchip/imx-mu-msi: Fix wrong register offset for 8ulp
        irqchip/ls-extirq: Fix invalid wait context by avoiding to use regmap
        dt-bindings: irqchip: Describe the IMX MU block as a MSI controller
        irqchip: Add IMX MU MSI controller driver
        dt-bindings: irqchip: renesas,irqc: Add r8a779g0 support
        irqchip/gic-v3: Fix typo in comment
        dt-bindings: interrupt-controller: ti,sci-intr: Fix missing reg property in the binding
        dt-bindings: irqchip: ti,sci-inta: Fix warning for missing #interrupt-cells
        irqchip: Allow extra fields to be passed to IRQCHIP_PLATFORM_DRIVER_END
        platform-msi: Export symbol platform_msi_create_irq_domain()
        irqchip/realtek-rtl: use parent interrupts
        dt-bindings: interrupt-controller: realtek,rtl-intc: require parents
        irqchip/realtek-rtl: use irq_domain_add_linear()
        irqchip: Make irqchip_init() usable on pure ACPI systems
        bcma: gpio: Use generic_handle_irq_safe()
        gpio: mlxbf2: Use generic_handle_irq_safe()
        platform/x86: intel_int0002_vgpio: Use generic_handle_irq_safe()
        ssb: gpio: Use generic_handle_irq_safe()
        pinctrl: amd: Use generic_handle_irq_safe()
        ...
    • Jiapeng Chong's avatar
      ring-buffer: Fix kernel-doc · b7085b6f
      Jiapeng Chong authored
      kernel/trace/ring_buffer.c:895: warning: expecting prototype for ring_buffer_nr_pages_dirty(). Prototype was for ring_buffer_nr_dirty_pages() instead.
      kernel/trace/ring_buffer.c:5313: warning: expecting prototype for ring_buffer_reset_cpu(). Prototype was for ring_buffer_reset_online_cpus() instead.
      kernel/trace/ring_buffer.c:5382: warning: expecting prototype for rind_buffer_empty(). Prototype was for ring_buffer_empty() instead.
      
      Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2340
      Link: https://lkml.kernel.org/r/20221009020642.12506-1-jiapeng.chong@linux.alibaba.com
      Reported-by: Abaci Robot <abaci@linux.alibaba.com>
      Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    • Jeremy Kerr's avatar
      mctp: prevent double key removal and unref · 3a732b46
      Jeremy Kerr authored
      Currently, we have a bug where a simultaneous DROPTAG ioctl and socket
      close may race, as we attempt to remove a key from lists twice, and
      perform an unref for each removal operation. This may result in a
      use-after-free when we attempt the second unref.
      
      This change fixes the race by making __mctp_key_remove() tolerant of
      being called on a key that has already been removed from the socket/net
      lists, and by performing the unref only when the key is actually
      removed. We also need
      to hold the list lock on the ioctl cleanup path.
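The "tolerant remove" pattern described above can be sketched in userspace C. This is a minimal model under stated assumptions, not the MCTP code: `struct toy_key` and `toy_key_remove()` are hypothetical names, a boolean stands in for list membership, and the caller is assumed to hold whatever lock protects the list.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy key: refs counts references, on_list says whether the lists hold it. */
struct toy_key {
	int refs;
	bool on_list;
};

/*
 * Remove the key from the (modelled) lists and drop the reference the
 * lists held. Safe to call more than once: a second call finds the key
 * already unlisted and does nothing, so the reference is dropped exactly
 * once. Returns true only for the call that performed the removal.
 *
 * Assumption: the caller holds the lock that protects on_list.
 */
bool toy_key_remove(struct toy_key *key)
{
	if (!key->on_list)
		return false;

	key->on_list = false;
	assert(key->refs > 0);
	key->refs--;
	return true;
}
```

With two paths racing to remove the same key (here, two sequential calls), only the first one drops the list's reference, so the count never goes below what the remaining holders own.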
      
      This fix is based on a bug report and comprehensive analysis from
      butt3rflyh4ck <butterflyhuangxx@gmail.com>, found via syzkaller.
      
      Cc: stable@vger.kernel.org
      Fixes: 63ed1aab ("mctp: Add SIOCMCTP{ALLOC,DROP}TAG ioctls for tag control")
      Reported-by: butt3rflyh4ck <butterflyhuangxx@gmail.com>
      Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • David S. Miller's avatar
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · ed5d1f61
      David S. Miller authored
      Florian Westphal says:
      
      ====================
      netfilter fixes for net
      
      This series from Phil Sutter for the *net* tree fixes a problem with a change
      from the 6.1 development phase: the change to nft_fib should have used
      the more recent flowic_l3mdev field.  Pointed out by Guillaume Nault.
      This also makes the older iptables module follow the same pattern.
      
      Also add selftest case and avoid test failure in nft_fib.sh when the
      host environment has set rp_filter=1.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Phil Sutter's avatar
      selftests: netfilter: Fix nft_fib.sh for all.rp_filter=1 · 6a91e727
      Phil Sutter authored
      If net.ipv4.conf.all.rp_filter is set, it overrides the per-interface
      setting and thus defeats the fix from bbe4c089 ("selftests:
      netfilter: disable rp_filter on router"). Unset it as well to cover that
      case.
      
      Fixes: bbe4c089 ("selftests: netfilter: disable rp_filter on router")
      Signed-off-by: Phil Sutter <phil@nwl.cc>
      Signed-off-by: Florian Westphal <fw@strlen.de>
    • Phil Sutter's avatar
      netfilter: rpfilter/fib: Populate flowic_l3mdev field · acc641ab
      Phil Sutter authored
      Use the introduced field for correct operation with VRF devices instead
      of conditionally overwriting flowic_oif. This is a partial revert of
      commit b575b24b ("netfilter: Fix rpfilter dropping vrf packets by
      mistake"), implementing a simpler solution.
      Signed-off-by: Phil Sutter <phil@nwl.cc>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Reviewed-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
    • Phil Sutter's avatar
      selftests: netfilter: Test reverse path filtering · 6e31ce83
      Phil Sutter authored
      Test reverse path (filter) matches in iptables, ip6tables and nftables,
      both with a regular interface and a VRF.
      Signed-off-by: Phil Sutter <phil@nwl.cc>
      Reviewed-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>