1. 10 Mar, 2023 5 commits
    • Linus Torvalds's avatar
      Merge tag 'erofs-for-6.3-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs · 388a8101
      Linus Torvalds authored
      Pull erofs fixes from Gao Xiang:
       "The most important one reverts an improper fix which can cause an
        unexpected warning more often on specific images, and another one
        fixes LZMA decompression on 32-bit platforms. The others are minor
        fixes and cleanups.
      
         - Fix LZMA decompression failure on HIGHMEM platforms
      
         - Revert an inproper fix since it is actually an implementation issue
           of vmalloc()
      
         - Avoid a wrong DBG_BUGON since it could be triggered with -EINTR
      
         - Minor cleanups"
      
      * tag 'erofs-for-6.3-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
        erofs: use wrapper i_blocksize() in erofs_file_read_iter()
        erofs: get rid of a useless DBG_BUGON
        erofs: Revert "erofs: fix kvcalloc() misuse with __GFP_NOFAIL"
        erofs: fix wrong kunmap when using LZMA on HIGHMEM platforms
        erofs: mark z_erofs_lzma_init/erofs_pcpubuf_init w/ __init
      388a8101
    • Linus Torvalds's avatar
      Merge tag 'nfsd-6.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux · 92cadfcf
      Linus Torvalds authored
      Pull nfsd fixes from Chuck Lever:
      
       - Protect NFSD writes against filesystem freezing
      
       - Fix a potential memory leak during server shutdown
      
      * tag 'nfsd-6.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
        SUNRPC: Fix a server shutdown leak
        NFSD: Protect against filesystem freezing
      92cadfcf
    • Linus Torvalds's avatar
      Merge tag 'for-6.3-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · ae195ca1
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
       "First batch of fixes. Among them there are two updates to sysfs and
        ioctl which are not strictly fixes but are used for testing so there's
        no reason to delay them.
      
         - fix block group item corruption after inserting new block group
      
         - fix extent map logging bit not cleared for split maps after
           dropping range
      
         - fix calculation of unusable block group space reporting bogus
           values due to 32/64b division
      
         - fix unnecessary increment of read error stat on write error
      
         - improve error handling in inode update
      
         - export per-device fsid in DEV_INFO ioctl to distinguish seeding
           devices, needed for testing
      
         - allocator size classes:
            - fix potential dead lock in size class loading logic
            - print sysfs stats for the allocation classes"
      
      * tag 'for-6.3-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: fix block group item corruption after inserting new block group
        btrfs: fix extent map logging bit not cleared for split maps after dropping range
        btrfs: fix percent calculation for bg reclaim message
        btrfs: fix unnecessary increment of read error stat on write error
        btrfs: handle btrfs_del_item errors in __btrfs_update_delayed_inode
        btrfs: ioctl: return device fsid from DEV_INFO ioctl
        btrfs: fix potential dead lock in size class loading logic
        btrfs: sysfs: add size class stats
      ae195ca1
    • Linus Torvalds's avatar
      Merge tag 'io_uring-6.3-2023-03-09' of git://git.kernel.dk/linux · f331c5de
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
      
       - Stop setting PF_NO_SETAFFINITY on io-wq workers.
      
         This has been reported in the past as it confuses some applications,
         as some of their threads will fail with -1/EINVAL if attempted
         affinitized. Most recent report was on cpusets, where enabling that
         with io-wq workers active will fail.
      
         Just deal with the mask changing by checking when a worker times out,
         and then exit if we have no work pending.
      
       - Fix an issue with passthrough support where we don't properly check
         if the file type has pollable uring_cmd support.
      
       - Fix a reported W=1 warning on a variable being set and unused. Add a
         special helper for iterating these lists that doesn't save the
         previous list element, if that iterator never ends up using it.
      
      * tag 'io_uring-6.3-2023-03-09' of git://git.kernel.dk/linux:
        io_uring: silence variable ‘prev’ set but not used warning
        io_uring/uring_cmd: ensure that device supports IOPOLL
        io_uring/io-wq: stop setting PF_NO_SETAFFINITY on io-wq workers
      f331c5de
    • Linus Torvalds's avatar
      Merge tag 'perf-tools-fixes-for-v6.3-1-2023-03-09' of... · 49be4fb2
      Linus Torvalds authored
      Merge tag 'perf-tools-fixes-for-v6.3-1-2023-03-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
      
      Pull perf tools fixes from Arnaldo Carvalho de Melo:
      
       - Add Adrian Hunter to MAINTAINERS as a perf tools reviewer
      
       - Sync various tools/ copies of kernel headers with the kernel sources,
         this time trying to avoid first merging with upstream to then update
         but instead copy from upstream so that a merge is avoided and the end
         result after merging this pull request is the one expected,
         tools/perf/check-headers.sh (mostly) happy, less warnings while
         building tools/perf/
      
       - Fix counting when initial delay configured by setting
         perf_attr.enable_on_exec when starting workloads from the perf
         command line
      
       - Don't avoid emitting a PERF_RECORD_MMAP2 in 'perf inject
         --buildid-all' when that record comes with a build-id, otherwise we
         end up not being able to resolve symbols
      
       - Don't use comma as the CSV output separator the "stat+csv_output"
         test, as comma can appear on some tests as a modifier for an event,
         use @ instead, ditto for the JSON linter test
      
       - The offcpu test was looking for some bits being set on
         task_struct->prev_state without masking other bits not important for
         this specific 'perf test', fix it
      
      * tag 'perf-tools-fixes-for-v6.3-1-2023-03-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
        perf tools: Add Adrian Hunter to MAINTAINERS as a reviewer
        tools headers UAPI: Sync linux/perf_event.h with the kernel sources
        tools headers x86 cpufeatures: Sync with the kernel sources
        tools include UAPI: Sync linux/vhost.h with the kernel sources
        tools arch x86: Sync the msr-index.h copy with the kernel sources
        tools headers kvm: Sync uapi/{asm/linux} kvm.h headers with the kernel sources
        tools include UAPI: Synchronize linux/fcntl.h with the kernel sources
        tools headers: Synchronize {linux,vdso}/bits.h with the kernel sources
        tools headers UAPI: Sync linux/prctl.h with the kernel sources
        tools headers: Update the copy of x86's mem{cpy,set}_64.S used in 'perf bench'
        perf stat: Fix counting when initial delay configured
        tools headers svm: Sync svm headers with the kernel sources
        perf test: Avoid counting commas in json linter
        perf tests stat+csv_output: Switch CSV separator to @
        perf inject: Fix --buildid-all not to eat up MMAP2
        tools arch x86: Sync the msr-index.h copy with the kernel sources
        perf test: Fix offcpu test prev_state check
      49be4fb2
  2. 09 Mar, 2023 19 commits
    • Linus Torvalds's avatar
      Merge tag 'net-6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 44889ba5
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from netfilter and bpf.
      
        Current release - regressions:
      
         - core: avoid skb end_offset change in __skb_unclone_keeptruesize()
      
         - sched:
            - act_connmark: handle errno on tcf_idr_check_alloc
            - flower: fix fl_change() error recovery path
      
         - ieee802154: prevent user from crashing the host
      
        Current release - new code bugs:
      
         - eth: bnxt_en: fix the double free during device removal
      
         - tools: ynl:
            - fix enum-as-flags in the generic CLI
            - fully inherit attrs in subsets
            - re-license uniformly under GPL-2.0 or BSD-3-clause
      
        Previous releases - regressions:
      
         - core: use indirect calls helpers for sk_exit_memory_pressure()
      
         - tls:
            - fix return value for async crypto
            - avoid hanging tasks on the tx_lock
      
         - eth: ice: copy last block omitted in ice_get_module_eeprom()
      
        Previous releases - always broken:
      
         - core: avoid double iput when sock_alloc_file fails
      
         - af_unix: fix struct pid leaks in OOB support
      
         - tls:
            - fix possible race condition
            - fix device-offloaded sendpage straddling records
      
         - bpf:
            - sockmap: fix an infinite loop error
            - test_run: fix &xdp_frame misplacement for LIVE_FRAMES
            - fix resolving BTF_KIND_VAR after ARRAY, STRUCT, UNION, PTR
      
         - netfilter: tproxy: fix deadlock due to missing BH disable
      
         - phylib: get rid of unnecessary locking
      
         - eth: bgmac: fix *initial* chip reset to support BCM5358
      
         - eth: nfp: fix csum for ipsec offload
      
         - eth: mtk_eth_soc: fix RX data corruption issue
      
        Misc:
      
         - usb: qmi_wwan: add telit 0x1080 composition"
      
      * tag 'net-6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (64 commits)
        tools: ynl: fix enum-as-flags in the generic CLI
        tools: ynl: move the enum classes to shared code
        net: avoid double iput when sock_alloc_file fails
        af_unix: fix struct pid leaks in OOB support
        eth: fealnx: bring back this old driver
        net: dsa: mt7530: permit port 5 to work without port 6 on MT7621 SoC
        net: microchip: sparx5: fix deletion of existing DSCP mappings
        octeontx2-af: Unlock contexts in the queue context cache in case of fault detection
        net/smc: fix fallback failed while sendmsg with fastopen
        ynl: re-license uniformly under GPL-2.0 OR BSD-3-Clause
        mailmap: update entries for Stephen Hemminger
        mailmap: add entry for Maxim Mikityanskiy
        nfc: change order inside nfc_se_io error path
        ethernet: ice: avoid gcc-9 integer overflow warning
        ice: don't ignore return codes in VSI related code
        ice: Fix DSCP PFC TLV creation
        net: usb: qmi_wwan: add Telit 0x1080 composition
        net: usb: cdc_mbim: avoid altsetting toggling for Telit FE990
        netfilter: conntrack: adopt safer max chain length
        net: tls: fix device-offloaded sendpage straddling records
        ...
      44889ba5
    • Linus Torvalds's avatar
      Merge tag 'for-linus-2023030901' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid · 2653e3fe
      Linus Torvalds authored
      Pull HID fixes from Benjamin Tissoires:
      
       - fix potential out of bound write of zeroes in HID core with a
         specially crafted uhid device (Lee Jones)
      
       - fix potential use-after-free in work function in intel-ish-hid (Reka
         Norman)
      
       - selftests config fixes (Benjamin Tissoires)
      
       - few device small fixes and support
      
      * tag 'for-linus-2023030901' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
        HID: intel-ish-hid: ipc: Fix potential use-after-free in work function
        HID: logitech-hidpp: Add support for Logitech MX Master 3S mouse
        HID: cp2112: Fix driver not registering GPIO IRQ chip as threaded
        selftest: hid: fix hid_bpf not set in config
        HID: uhid: Over-ride the default maximum data buffer value with our own
        HID: core: Provide new max_buffer_size attribute to over-ride the default
      2653e3fe
    • Linus Torvalds's avatar
      Merge tag 'm68k-for-v6.3-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k · c70e9b8e
      Linus Torvalds authored
      Pull m68k fixes from Geert Uytterhoeven:
      
       - Fix systems with memory at end of 32-bit address space
      
       - Fix initrd on systems where memory does not start at address zero
      
       - Fix 68030 handling of bus errors for addresses in exception tables
      
      * tag 'm68k-for-v6.3-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
        m68k: Only force 030 bus error if PC not in exception table
        m68k: mm: Move initrd phys_to_virt handling after paging_init()
        m68k: mm: Fix systems with memory at end of 32-bit address space
      c70e9b8e
    • Al Viro's avatar
      sh: sanitize the flags on sigreturn · 573b22cc
      Al Viro authored
      We fetch %SR value from sigframe; it might have been modified by signal
      handler, so we can't trust it with any bits that are not modifiable in
      user mode.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Cc: Rich Felker <dalias@libc.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      573b22cc
    • Jens Axboe's avatar
      io_uring: silence variable ‘prev’ set but not used warning · fa780334
      Jens Axboe authored
      If io_uring.o is built with W=1, it triggers a warning:
      
      io_uring/io_uring.c: In function ‘__io_submit_flush_completions’:
      io_uring/io_uring.c:1502:40: warning: variable ‘prev’ set but not used [-Wunused-but-set-variable]
       1502 |         struct io_wq_work_node *node, *prev;
            |                                        ^~~~
      
      which is due to the wq_list_for_each() iterator always keeping a 'prev'
      variable. Most users need this to remove an entry from a list, for
      example, but __io_submit_flush_completions() never does that.
      
      Add a basic helper that doesn't track prev instead, and use that in
      that function.
      Reported-by: default avatarVincenzo Palazzo <vincenzopalazzodev@gmail.com>
      Reviewed-by: default avatarVincenzo Palazzo <vincenzopalazzodev@gmail.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      fa780334
    • Jens Axboe's avatar
      io_uring/uring_cmd: ensure that device supports IOPOLL · 03b3d6be
      Jens Axboe authored
      It's possible for a file type to support uring commands, but not
      pollable ones. Hence before issuing one of those, we should check
      that it is supported and error out upfront if it isn't.
      
      Cc: stable@vger.kernel.org
      Fixes: 5756a3a7 ("io_uring: add iopoll infrastructure for io_uring_cmd")
      Link: https://github.com/axboe/liburing/issues/816Reviewed-by: default avatarKanchan Joshi <joshi.k@samsung.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      03b3d6be
    • Yue Hu's avatar
      3993f4f4
    • Gao Xiang's avatar
      erofs: get rid of a useless DBG_BUGON · 9ff47180
      Gao Xiang authored
      `err` could be -EINTR and it should not be the case.  Actually such
      DBG_BUGON is useless.
      Reviewed-by: default avatarChao Yu <chao@kernel.org>
      Link: https://lore.kernel.org/r/20230309053148.9223-2-hsiangkao@linux.alibaba.comSigned-off-by: default avatarGao Xiang <hsiangkao@linux.alibaba.com>
      9ff47180
    • Gao Xiang's avatar
      erofs: Revert "erofs: fix kvcalloc() misuse with __GFP_NOFAIL" · 647dd2c3
      Gao Xiang authored
      Let's revert commit 12724ba3 ("erofs: fix kvcalloc() misuse with
      __GFP_NOFAIL") since kvmalloc() already supports __GFP_NOFAIL in commit
      a421ef30 ("mm: allow !GFP_KERNEL allocations for kvmalloc").  So
      the original fix was wrong.
      
      Actually there was some issue as [1] discussed, so before that mm fix
      is landed, the warn could still happen but applying this commit first
      will cause less.
      
      [1] https://lore.kernel.org/r/20230305053035.1911-1-hsiangkao@linux.alibaba.com
      
      Fixes: 12724ba3 ("erofs: fix kvcalloc() misuse with __GFP_NOFAIL")
      Reviewed-by: default avatarChao Yu <chao@kernel.org>
      Link: https://lore.kernel.org/r/20230309053148.9223-1-hsiangkao@linux.alibaba.comSigned-off-by: default avatarGao Xiang <hsiangkao@linux.alibaba.com>
      647dd2c3
    • Gao Xiang's avatar
      erofs: fix wrong kunmap when using LZMA on HIGHMEM platforms · 8f121dfb
      Gao Xiang authored
      As the call trace shown, the root cause is kunmap incorrect pages:
      
       BUG: kernel NULL pointer dereference, address: 00000000
       CPU: 1 PID: 40 Comm: kworker/u5:0 Not tainted 6.2.0-rc5 #4
       Workqueue: erofs_worker z_erofs_decompressqueue_work
       EIP: z_erofs_lzma_decompress+0x34b/0x8ac
        z_erofs_decompress+0x12/0x14
        z_erofs_decompress_queue+0x7e7/0xb1c
        z_erofs_decompressqueue_work+0x32/0x60
        process_one_work+0x24b/0x4d8
        ? process_one_work+0x1a4/0x4d8
        worker_thread+0x14c/0x3fc
        kthread+0xe6/0x10c
        ? rescuer_thread+0x358/0x358
        ? kthread_complete_and_exit+0x18/0x18
        ret_from_fork+0x1c/0x28
       ---[ end trace 0000000000000000 ]---
      
      The bug is trivial and should be fixed now.  It has no impact on
      !HIGHMEM platforms.
      
      Fixes: 622ceadd ("erofs: lzma compression support")
      Cc: <stable@vger.kernel.org> # 5.16+
      Reviewed-by: default avatarYue Hu <huyue2@coolpad.com>
      Reviewed-by: default avatarChao Yu <chao@kernel.org>
      Signed-off-by: default avatarGao Xiang <hsiangkao@linux.alibaba.com>
      Link: https://lore.kernel.org/r/20230305134455.88236-1-hsiangkao@linux.alibaba.com
      8f121dfb
    • Yangtao Li's avatar
      erofs: mark z_erofs_lzma_init/erofs_pcpubuf_init w/ __init · a279aded
      Yangtao Li authored
      They are used during the erofs module init phase. Let's mark it as
      __init like any other function.
      Signed-off-by: default avatarYangtao Li <frank.li@vivo.com>
      Reviewed-by: default avatarYue Hu <huyue2@coolpad.com>
      Reviewed-by: default avatarGao Xiang <hsiangkao@linux.alibaba.com>
      Reviewed-by: default avatarChao Yu <chao@kernel.org>
      Link: https://lore.kernel.org/r/20230303063731.66760-1-frank.li@vivo.comSigned-off-by: default avatarGao Xiang <hsiangkao@linux.alibaba.com>
      a279aded
    • Paolo Abeni's avatar
      Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · 67eeadf2
      Paolo Abeni authored
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2023-03-07 (ice)
      
      This series contains updates to ice driver only.
      
      Dave removes masking from pfcena field as it was incorrectly preventing
      valid traffic classes from being enabled.
      
      Michal resolves various smatch issues such as not propagating error
      codes and returning 0 explicitly.
      
      Arnd Bergmann resolves gcc-9 warning for integer overflow.
      
      * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
        ethernet: ice: avoid gcc-9 integer overflow warning
        ice: don't ignore return codes in VSI related code
        ice: Fix DSCP PFC TLV creation
      ====================
      
      Link: https://lore.kernel.org/r/20230307220714.3997294-1-anthony.l.nguyen@intel.comSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      67eeadf2
    • Jakub Kicinski's avatar
      Merge branch 'tools-ynl-fix-enum-as-flags-in-the-generic-cli' · 2d8cb0bf
      Jakub Kicinski authored
      Jakub Kicinski says:
      
      ====================
      tools: ynl: fix enum-as-flags in the generic CLI
      
      The CLI needs to use proper classes when looking at Enum definitions
      rather than interpreting the YAML spec ad-hoc, because we have more
      than on format of the definition supported.
      ====================
      
      Link: https://lore.kernel.org/r/20230308003923.445268-1-kuba@kernel.orgSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      2d8cb0bf
    • Jakub Kicinski's avatar
      tools: ynl: fix enum-as-flags in the generic CLI · c311aaa7
      Jakub Kicinski authored
      Lorenzo points out that the generic CLI is broken for the netdev
      family. When I added the support for documentation of enums
      (and sparse enums) the client script was not updated.
      It expects the values in enum to be a list of names,
      now it can also be a dict (YAML object).
      Reported-by: default avatarLorenzo Bianconi <lorenzo@kernel.org>
      Fixes: e4b48ed4 ("tools: ynl: add a completely generic client")
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      c311aaa7
    • Jakub Kicinski's avatar
      tools: ynl: move the enum classes to shared code · 6517a60b
      Jakub Kicinski authored
      Move bulk of the EnumSet and EnumEntry code to shared
      code for reuse by cli.
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      6517a60b
    • Thadeu Lima de Souza Cascardo's avatar
      net: avoid double iput when sock_alloc_file fails · 649c15c7
      Thadeu Lima de Souza Cascardo authored
      When sock_alloc_file fails to allocate a file, it will call sock_release.
      __sys_socket_file should then not call sock_release again, otherwise there
      will be a double free.
      
      [   89.319884] ------------[ cut here ]------------
      [   89.320286] kernel BUG at fs/inode.c:1764!
      [   89.320656] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
      [   89.321051] CPU: 7 PID: 125 Comm: iou-sqp-124 Not tainted 6.2.0+ #361
      [   89.321535] RIP: 0010:iput+0x1ff/0x240
      [   89.321808] Code: d1 83 e1 03 48 83 f9 02 75 09 48 81 fa 00 10 00 00 77 05 83 e2 01 75 1f 4c 89 ef e8 fb d2 ba 00 e9 80 fe ff ff c3 cc cc cc cc <0f> 0b 0f 0b e9 d0 fe ff ff 0f 0b eb 8d 49 8d b4 24 08 01 00 00 48
      [   89.322760] RSP: 0018:ffffbdd60068bd50 EFLAGS: 00010202
      [   89.323036] RAX: 0000000000000000 RBX: ffff9d7ad3cacac0 RCX: 0000000000001107
      [   89.323412] RDX: 000000000003af00 RSI: 0000000000000000 RDI: ffff9d7ad3cacb40
      [   89.323785] RBP: ffffbdd60068bd68 R08: ffffffffffffffff R09: ffffffffab606438
      [   89.324157] R10: ffffffffacb3dfa0 R11: 6465686361657256 R12: ffff9d7ad3cacb40
      [   89.324529] R13: 0000000080000001 R14: 0000000080000001 R15: 0000000000000002
      [   89.324904] FS:  00007f7b28516740(0000) GS:ffff9d7aeb1c0000(0000) knlGS:0000000000000000
      [   89.325328] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   89.325629] CR2: 00007f0af52e96c0 CR3: 0000000002a02006 CR4: 0000000000770ee0
      [   89.326004] PKRU: 55555554
      [   89.326161] Call Trace:
      [   89.326298]  <TASK>
      [   89.326419]  __sock_release+0xb5/0xc0
      [   89.326632]  __sys_socket_file+0xb2/0xd0
      [   89.326844]  io_socket+0x88/0x100
      [   89.327039]  ? io_issue_sqe+0x6a/0x430
      [   89.327258]  io_issue_sqe+0x67/0x430
      [   89.327450]  io_submit_sqes+0x1fe/0x670
      [   89.327661]  io_sq_thread+0x2e6/0x530
      [   89.327859]  ? __pfx_autoremove_wake_function+0x10/0x10
      [   89.328145]  ? __pfx_io_sq_thread+0x10/0x10
      [   89.328367]  ret_from_fork+0x29/0x50
      [   89.328576] RIP: 0033:0x0
      [   89.328732] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
      [   89.329073] RSP: 002b:0000000000000000 EFLAGS: 00000202 ORIG_RAX: 00000000000001a9
      [   89.329477] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007f7b28637a3d
      [   89.329845] RDX: 00007fff4e4318a8 RSI: 00007fff4e4318b0 RDI: 0000000000000400
      [   89.330216] RBP: 00007fff4e431830 R08: 00007fff4e431711 R09: 00007fff4e4318b0
      [   89.330584] R10: 0000000000000000 R11: 0000000000000202 R12: 00007fff4e441b38
      [   89.330950] R13: 0000563835e3e725 R14: 0000563835e40d10 R15: 00007f7b28784040
      [   89.331318]  </TASK>
      [   89.331441] Modules linked in:
      [   89.331617] ---[ end trace 0000000000000000 ]---
      
      Fixes: da214a47 ("net: add __sys_socket_file()")
      Signed-off-by: default avatarThadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Reviewed-by: default avatarJens Axboe <axboe@kernel.dk>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Reviewed-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://lore.kernel.org/r/20230307173707.468744-1-cascardo@canonical.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      649c15c7
    • Eric Dumazet's avatar
      af_unix: fix struct pid leaks in OOB support · 2aab4b96
      Eric Dumazet authored
      syzbot reported struct pid leak [1].
      
      Issue is that queue_oob() calls maybe_add_creds() which potentially
      holds a reference on a pid.
      
      But skb->destructor is not set (either directly or by calling
      unix_scm_to_skb())
      
      This means that subsequent kfree_skb() or consume_skb() would leak
      this reference.
      
      In this fix, I chose to fully support scm even for the OOB message.
      
      [1]
      BUG: memory leak
      unreferenced object 0xffff8881053e7f80 (size 128):
      comm "syz-executor242", pid 5066, jiffies 4294946079 (age 13.220s)
      hex dump (first 32 bytes):
      01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
      backtrace:
      [<ffffffff812ae26a>] alloc_pid+0x6a/0x560 kernel/pid.c:180
      [<ffffffff812718df>] copy_process+0x169f/0x26c0 kernel/fork.c:2285
      [<ffffffff81272b37>] kernel_clone+0xf7/0x610 kernel/fork.c:2684
      [<ffffffff812730cc>] __do_sys_clone+0x7c/0xb0 kernel/fork.c:2825
      [<ffffffff849ad699>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      [<ffffffff849ad699>] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
      [<ffffffff84a0008b>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      Fixes: 314001f0 ("af_unix: Add OOB support")
      Reported-by: syzbot+7699d9e5635c10253a27@syzkaller.appspotmail.com
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Rao Shoaib <rao.shoaib@oracle.com>
      Reviewed-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://lore.kernel.org/r/20230307164530.771896-1-edumazet@google.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      2aab4b96
    • Jakub Kicinski's avatar
      eth: fealnx: bring back this old driver · 8f148208
      Jakub Kicinski authored
      This reverts commit d5e2d038.
      
      We have a report of this chip being used on a
      
        SURECOM EP-320X-S 100/10M Ethernet PCI Adapter
      
      which could still have been purchased in some parts
      of the world 3 years ago.
      
      Cc: stable@vger.kernel.org
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=217151
      Fixes: d5e2d038 ("eth: fealnx: delete the driver for Myson MTD-800")
      Link: https://lore.kernel.org/r/20230307171930.4008454-1-kuba@kernel.orgSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      8f148208
    • Vladimir Oltean's avatar
      net: dsa: mt7530: permit port 5 to work without port 6 on MT7621 SoC · c8b8a3c6
      Vladimir Oltean authored
      The MT7530 switch from the MT7621 SoC has 2 ports which can be set up as
      internal: port 5 and 6. Arınç reports that the GMAC1 attached to port 5
      receives corrupted frames, unless port 6 (attached to GMAC0) has been
      brought up by the driver. This is true regardless of whether port 5 is
      used as a user port or as a CPU port (carrying DSA tags).
      
      Offline debugging (blind for me) which began in the linked thread showed
      experimentally that the configuration done by the driver for port 6
      contains a step which is needed by port 5 as well - the write to
      CORE_GSWPLL_GRP2 (note that I've no idea as to what it does, apart from
      the comment "Set core clock into 500Mhz"). Prints put by Arınç show that
      the reset value of CORE_GSWPLL_GRP2 is RG_GSWPLL_POSDIV_500M(1) |
      RG_GSWPLL_FBKDIV_500M(40) (0x128), both on the MCM MT7530 from the
      MT7621 SoC, as well as on the standalone MT7530 from MT7623NI Bananapi
      BPI-R2. Apparently, port 5 on the standalone MT7530 can work under both
      values of the register, while on the MT7621 SoC it cannot.
      
      The call path that triggers the register write is:
      
      mt753x_phylink_mac_config() for port 6
      -> mt753x_pad_setup()
         -> mt7530_pad_clk_setup()
      
      so this fully explains the behavior noticed by Arınç, that bringing port
      6 up is necessary.
      
      The simplest fix for the problem is to extract the register writes which
      are needed for both port 5 and 6 into a common mt7530_pll_setup()
      function, which is called at mt7530_setup() time, immediately after
      switch reset. We can argue that this mirrors the code layout introduced
      in mt7531_setup() by commit 42bc4faf ("net: mt7531: only do PLL once
      after the reset"), in that the PLL setup has the exact same positioning,
      and further work to consolidate the separate setup() functions is not
      hindered.
      
      Testing confirms that:
      
      - the slight reordering of writes to MT7530_P6ECR and to
        CORE_GSWPLL_GRP1 / CORE_GSWPLL_GRP2 introduced by this change does not
        appear to cause problems for the operation of port 6 on MT7621 and on
        MT7623 (where port 5 also always worked)
      
      - packets sent through port 5 are not corrupted anymore, regardless of
        whether port 6 is enabled by phylink or not (or even present in the
        device tree)
      
      My algorithm for determining the Fixes: tag is as follows. Testing shows
      that some logic from mt7530_pad_clk_setup() is needed even for port 5.
      Prior to commit ca366d6c ("net: dsa: mt7530: Convert to PHYLINK
      API"), a call did exist for all phy_is_pseudo_fixed_link() ports - so
      port 5 included. That commit replaced it with a temporary "Port 5 is not
      supported!" comment, and the following commit 38f790a8 ("net: dsa:
      mt7530: Add support for port 5") replaced that comment with a
      configuration procedure in mt7530_setup_port5() which was insufficient
      for port 5 to work. I'm laying the blame on the patch that claimed
      support for port 5, although one would have also needed the change from
      commit c3b8e079 ("net: dsa: mt7530: setup core clock even in TRGMII
      mode") for the write to be performed completely independently from port
      6's configuration.
      
      Thanks go to Arınç for describing the problem, for debugging and for
      testing.
      Reported-by: default avatarArınç ÜNAL <arinc.unal@arinc9.com>
      Link: https://lore.kernel.org/netdev/f297c2c4-6e7c-57ac-2394-f6025d309b9d@arinc9.com/
      Fixes: 38f790a8 ("net: dsa: mt7530: Add support for port 5")
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Tested-by: default avatarArınç ÜNAL <arinc.unal@arinc9.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
      Link: https://lore.kernel.org/r/20230307155411.868573-1-vladimir.oltean@nxp.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      c8b8a3c6
  3. 08 Mar, 2023 9 commits
    • Linus Torvalds's avatar
      Merge tag 'fs_for_v6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs · 6a98c9ca
      Linus Torvalds authored
      Pull udf fixes from Jan Kara:
       "Fix bugs in UDF caused by the big pile of changes that went in during
        the merge window"
      
      * tag 'fs_for_v6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
        udf: Warn if block mapping is done for in-ICB files
        udf: Fix reading of in-ICB files
        udf: Fix lost writes in udf_adinicb_writepage()
      6a98c9ca
    • Linus Torvalds's avatar
      Merge tag 'platform-drivers-x86-v6.3-2' of... · 55ee6646
      Linus Torvalds authored
      Merge tag 'platform-drivers-x86-v6.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
      
      Pull x86 platform driver fixes from Hans de Goede:
       "A small set of assorted bug and build/warning fixes"
      
      * tag 'platform-drivers-x86-v6.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
        platform: mellanox: mlx-platform: Initialize shift variable to 0
        platform/x86: int3472: Add GPIOs to Surface Go 3 Board data
        platform/x86: ISST: Fix kernel documentation warnings
        platform: x86: MLX_PLATFORM: select REGMAP instead of depending on it
        platform: mellanox: select REGMAP instead of depending on it
        platform/x86/intel/tpmi: Fix double free reported by Smatch
        platform/x86: ISST: Increase range of valid mail box commands
        platform/x86: dell-ddv: Fix temperature scaling
        platform/x86: dell-ddv: Fix cache invalidation on resume
        platform/x86/amd: pmc: remove CONFIG_SUSPEND checks
      55ee6646
    • Linus Torvalds's avatar
      x86/resctl: fix scheduler confusion with 'current' · 7fef0997
      Linus Torvalds authored
      The implementation of 'current' on x86 is very intentionally special: it
      is a very common thing to look up, and it uses 'this_cpu_read_stable()'
      to get the current thread pointer efficiently from per-cpu storage.
      
      And the keyword in there is 'stable': the current thread pointer never
      changes as far as a single thread is concerned.  Even if when a thread
      is preempted, or moved to another CPU, or even across an explicit call
      'schedule()' that thread will still have the same value for 'current'.
      
      It is, after all, the kernel base pointer to thread-local storage.
      That's why it's stable to begin with, but it's also why it's important
      enough that we have that special 'this_cpu_read_stable()' access for it.
      
      So this is all done very intentionally to allow the compiler to treat
      'current' as a value that never visibly changes, so that the compiler
      can do CSE and combine multiple different 'current' accesses into one.
      
      However, there is obviously one very special situation when the
      currently running thread does actually change: inside the scheduler
      itself.
      
      So the scheduler code paths are special, and do not have a 'current'
      thread at all.  Instead there are _two_ threads: the previous and the
      next thread - typically called 'prev' and 'next' (or prev_p/next_p)
      internally.
      
      So this is all actually quite straightforward and simple, and not all
      that complicated.
      
      Except for when you then have special code that is run in scheduler
      context, that code then has to be aware that 'current' isn't really a
      valid thing.  Did you mean 'prev'? Did you mean 'next'?
      
      In fact, even if then look at the code, and you use 'current' after the
      new value has been assigned to the percpu variable, we have explicitly
      told the compiler that 'current' is magical and always stable.  So the
      compiler is quite free to use an older (or newer) value of 'current',
      and the actual assignment to the percpu storage is not relevant even if
      it might look that way.
      
      Which is exactly what happened in the resctl code, that blithely used
      'current' in '__resctrl_sched_in()' when it really wanted the new
      process state (as implied by the name: we're scheduling 'into' that new
      resctl state).  And clang would end up just using the old thread pointer
      value at least in some configurations.
      
      This could have happened with gcc too, and purely depends on random
      compiler details.  Clang just seems to have been more aggressive about
      moving the read of the per-cpu current_task pointer around.
      
      The fix is trivial: just make the resctl code adhere to the scheduler
      rules of using the prev/next thread pointer explicitly, instead of using
      'current' in a situation where it just wasn't valid.
      
      That same code is then also used outside of the scheduler context (when
      a thread resctl state is explicitly changed), and then we will just pass
      in 'current' as that pointer, of course.  There is no ambiguity in that
      case.
      
      The fix may be trivial, but noticing and figuring out what went wrong
      was not.  The credit for that goes to Stephane Eranian.
      Reported-by: default avatarStephane Eranian <eranian@google.com>
      Link: https://lore.kernel.org/lkml/20230303231133.1486085-1-eranian@google.com/
      Link: https://lore.kernel.org/lkml/alpine.LFD.2.01.0908011214330.3304@localhost.localdomain/Reviewed-by: default avatarNick Desaulniers <ndesaulniers@google.com>
      Tested-by: default avatarTony Luck <tony.luck@intel.com>
      Tested-by: default avatarStephane Eranian <eranian@google.com>
      Tested-by: default avatarBabu Moger <babu.moger@amd.com>
      Cc: stable@kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      7fef0997
    • Jens Axboe's avatar
      io_uring/io-wq: stop setting PF_NO_SETAFFINITY on io-wq workers · 01e68ce0
      Jens Axboe authored
      Every now and then reports come in that are puzzled on why changing
      affinity on the io-wq workers fails with EINVAL. This happens because they
      set PF_NO_SETAFFINITY as part of their creation, as io-wq organizes
      workers into groups based on what CPU they are running on.
      
      However, this is purely an optimization and not a functional requirement.
      We can allow setting affinity, and just lazily update our worker to wqe
      mappings. If a given io-wq thread times out, it normally exits if there's
      no more work to do. The exception is if it's the last worker available.
      For the timeout case, check the affinity of the worker against group mask
      and exit even if it's the last worker. New workers should be created with
      the right mask and in the right location.
      Reported-by: default avatarDaniel Dao <dqminh@cloudflare.com>
      Link: https://lore.kernel.org/io-uring/CA+wXwBQwgxB3_UphSny-yAP5b26meeOu1W4TwYVcD_+5gOhvPw@mail.gmail.com/Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      01e68ce0
    • Benjamin Coddington's avatar
      SUNRPC: Fix a server shutdown leak · 9ca6705d
      Benjamin Coddington authored
      Fix a race where kthread_stop() may prevent the threadfn from ever getting
      called.  If that happens the svc_rqst will not be cleaned up.
      
      Fixes: ed6473dd ("NFSv4: Fix callback server shutdown")
      Signed-off-by: default avatarBenjamin Coddington <bcodding@redhat.com>
      Reviewed-by: default avatarJeff Layton <jlayton@kernel.org>
      Signed-off-by: default avatarChuck Lever <chuck.lever@oracle.com>
      9ca6705d
    • Daniel Machon's avatar
      net: microchip: sparx5: fix deletion of existing DSCP mappings · cdd28833
      Daniel Machon authored
      Fix deletion of existing DSCP mappings in the APP table.
      
      Adding and deleting DSCP entries are replicated per-port, since the
      mapping table is global for all ports in the chip. Whenever a mapping
      for a DSCP value already exists, the old mapping is deleted first.
      However, it is only deleted for the specified port. Fix this by calling
      sparx5_dcb_ieee_delapp() instead of dcb_ieee_delapp() as it ought to be.
      
      Reproduce:
      
      // Map and remap DSCP value 63
      $ dcb app add dev eth0 dscp-prio 63:1
      $ dcb app add dev eth0 dscp-prio 63:2
      
      $ dcb app show dev eth0 dscp-prio
      dscp-prio 63:2
      
      $ dcb app show dev eth1 dscp-prio
      dscp-prio 63:1 63:2 <-- 63:1 should not be there
      
      Fixes: 8dcf69a6 ("net: microchip: sparx5: add support for offloading dscp table")
      Signed-off-by: default avatarDaniel Machon <daniel.machon@microchip.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cdd28833
    • Suman Ghosh's avatar
      octeontx2-af: Unlock contexts in the queue context cache in case of fault detection · ea9dd2e5
      Suman Ghosh authored
      NDC caches contexts of frequently used queue's (Rx and Tx queues)
      contexts. Due to a HW errata when NDC detects fault/poision while
      accessing contexts it could go into an illegal state where a cache
      line could get locked forever. To makesure all cache lines in NDC
      are available for optimum performance upon fault/lockerror/posion
      errors scan through all cache lines in NDC and clear the lock bit.
      
      Fixes: 4a3581cd ("octeontx2-af: NPA AQ instruction enqueue support")
      Signed-off-by: default avatarSuman Ghosh <sumang@marvell.com>
      Signed-off-by: default avatarSunil Kovvuri Goutham <sgoutham@marvell.com>
      Signed-off-by: default avatarSai Krishna <saikrishnag@marvell.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ea9dd2e5
    • D. Wythe's avatar
      net/smc: fix fallback failed while sendmsg with fastopen · ce7ca794
      D. Wythe authored
      Before determining whether the msg has unsupported options, it has been
      prematurely terminated by the wrong status check.
      
      For the application, the general usages of MSG_FASTOPEN likes
      
      fd = socket(...)
      /* rather than connect */
      sendto(fd, data, len, MSG_FASTOPEN)
      
      Hence, We need to check the flag before state check, because the sock
      state here is always SMC_INIT when applications tries MSG_FASTOPEN.
      Once we found unsupported options, fallback it to TCP.
      
      Fixes: ee9dfbef ("net/smc: handle sockopts forcing fallback")
      Signed-off-by: default avatarD. Wythe <alibuda@linux.alibaba.com>
      Signed-off-by: default avatarSimon Horman <simon.horman@corigine.com>
      
      v2 -> v1: Optimize code style
      Reviewed-by: default avatarTony Lu <tonylu@linux.alibaba.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ce7ca794
    • Filipe Manana's avatar
      btrfs: fix block group item corruption after inserting new block group · 675dfe12
      Filipe Manana authored
      We can often end up inserting a block group item, for a new block group,
      with a wrong value for the used bytes field.
      
      This happens if for the new allocated block group, in the same transaction
      that created the block group, we have tasks allocating extents from it as
      well as tasks removing extents from it.
      
      For example:
      
      1) Task A creates a metadata block group X;
      
      2) Two extents are allocated from block group X, so its "used" field is
         updated to 32K, and its "commit_used" field remains as 0;
      
      3) Transaction commit starts, by some task B, and it enters
         btrfs_start_dirty_block_groups(). There it tries to update the block
         group item for block group X, which currently has its "used" field with
         a value of 32K. But that fails since the block group item was not yet
         inserted, and so on failure update_block_group_item() sets the
         "commit_used" field of the block group back to 0;
      
      4) The block group item is inserted by task A, when for example
         btrfs_create_pending_block_groups() is called when releasing its
         transaction handle. This results in insert_block_group_item() inserting
         the block group item in the extent tree (or block group tree), with a
         "used" field having a value of 32K, but without updating the
         "commit_used" field in the block group, which remains with value of 0;
      
      5) The two extents are freed from block X, so its "used" field changes
         from 32K to 0;
      
      6) The transaction commit by task B continues, it enters
         btrfs_write_dirty_block_groups() which calls update_block_group_item()
         for block group X, and there it decides to skip the block group item
         update, because "used" has a value of 0 and "commit_used" has a value
         of 0 too.
      
         As a result, we end up with a block item having a 32K "used" field but
         no extents allocated from it.
      
      When this issue happens, a btrfs check reports an error like this:
      
         [1/7] checking root items
         [2/7] checking extents
         block group [1104150528 1073741824] used 39796736 but extent items used 0
         ERROR: errors found in extent allocation tree or chunk allocation
         (...)
      
      Fix this by making insert_block_group_item() update the block group's
      "commit_used" field.
      
      Fixes: 7248e0ce ("btrfs: skip update of block group item if used bytes are the same")
      CC: stable@vger.kernel.org # 6.2+
      Reviewed-by: default avatarQu Wenruo <wqu@suse.com>
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      675dfe12
  4. 07 Mar, 2023 7 commits