1. 17 Dec, 2023 2 commits
  2. 12 Dec, 2023 7 commits
    • IB/ipoib: Fix mcast list locking · 4f973e21
      Daniel Vacek authored
      Releasing the `priv->lock` while iterating the `priv->multicast_list` in
      `ipoib_mcast_join_task()` opens a window for `ipoib_mcast_dev_flush()` to
      remove the items while in the middle of iteration. If the mcast is removed
      while the lock was dropped, the for loop spins forever resulting in a hard
      lockup (as was reported on RHEL 4.18.0-372.75.1.el8_6 kernel):
      
          Task A (kworker/u72:2 below)       | Task B (kworker/u72:0 below)
          -----------------------------------+-----------------------------------
          ipoib_mcast_join_task(work)        | ipoib_ib_dev_flush_light(work)
            spin_lock_irq(&priv->lock)       | __ipoib_ib_dev_flush(priv, ...)
            list_for_each_entry(mcast,       | ipoib_mcast_dev_flush(dev = priv->dev)
                &priv->multicast_list, list) |
              ipoib_mcast_join(dev, mcast)   |
                spin_unlock_irq(&priv->lock) |
                                             |   spin_lock_irqsave(&priv->lock, flags)
                                             |   list_for_each_entry_safe(mcast, tmcast,
                                             |                  &priv->multicast_list, list)
                                             |     list_del(&mcast->list);
                                             |     list_add_tail(&mcast->list, &remove_list)
                                             |   spin_unlock_irqrestore(&priv->lock, flags)
                spin_lock_irq(&priv->lock)   |
                                             |   ipoib_mcast_remove_list(&remove_list)
         (Here, `mcast` is no longer on the  |     list_for_each_entry_safe(mcast, tmcast,
          `priv->multicast_list` and we keep |                            remove_list, list)
          spinning on the `remove_list` of   |  >>>  wait_for_completion(&mcast->done)
          the other thread which is blocked  |
          and the list is still valid on     |
          its stack.)
      
      Fix this by keeping the lock held and switching to GFP_ATOMIC so that the
      allocation cannot sleep under the spinlock.
      Unfortunately, we could not reproduce the lockup to confirm this fix, but
      based on code review I think it should address such lockups.
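
      The race in the diagram above can be imitated in user space. The sketch
      below is only an illustration under stated assumptions: `struct node` and
      all helper names are hypothetical stand-ins for the kernel's list_head
      API, and the "other task" is simulated inline instead of with real
      threads and locks.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's circular doubly linked list_head. */
struct node {
    struct node *next, *prev;
};

static void node_init(struct node *head)
{
    head->next = head;
    head->prev = head;
}

static void node_del(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

static void node_add_tail(struct node *n, struct node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/*
 * Task A stands on the first entry of multicast_list. If flush_in_between is
 * true, "Task B" moves every entry to remove_list before Task A continues.
 * Returns true if Task A's walk gets back to the multicast_list head within
 * a bounded number of steps (the bound only exists so the demo terminates).
 */
static bool walk_returns_to_head(bool flush_in_between)
{
    struct node multicast_list, remove_list, a, b;
    const struct node *cursor;
    int steps;

    node_init(&multicast_list);
    node_init(&remove_list);
    node_add_tail(&a, &multicast_list);
    node_add_tail(&b, &multicast_list);

    cursor = &a; /* Task A is here when it drops the lock. */

    if (flush_in_between) {
        /* Task B: list_del() + list_add_tail() for every entry. */
        node_del(&a);
        node_add_tail(&a, &remove_list);
        node_del(&b);
        node_add_tail(&b, &remove_list);
    }

    for (steps = 0; steps < 1000; steps++) {
        cursor = cursor->next;
        if (cursor == &multicast_list)
            return true; /* loop termination condition met */
    }
    return false; /* cursor is circling remove_list forever */
}
```

      With the flush in between, the cursor only ever visits the entries and
      the head of `remove_list`, so the termination test against the
      `multicast_list` head can never succeed. This matches the empty
      `multicast_list` seen in the crash dump.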
      
      crash> bc 31
      PID: 747      TASK: ff1c6a1a007e8000  CPU: 31   COMMAND: "kworker/u72:2"
      --
          [exception RIP: ipoib_mcast_join_task+0x1b1]
          RIP: ffffffffc0944ac1  RSP: ff646f199a8c7e00  RFLAGS: 00000002
          RAX: 0000000000000000  RBX: ff1c6a1a04dc82f8  RCX: 0000000000000000
                                        work (&priv->mcast_task{,.work})
          RDX: ff1c6a192d60ac68  RSI: 0000000000000286  RDI: ff1c6a1a04dc8000
                 &mcast->list
          RBP: ff646f199a8c7e90   R8: ff1c699980019420   R9: ff1c6a1920c9a000
          R10: ff646f199a8c7e00  R11: ff1c6a191a7d9800  R12: ff1c6a192d60ac00
                                                               mcast
          R13: ff1c6a1d82200000  R14: ff1c6a1a04dc8000  R15: ff1c6a1a04dc82d8
                 dev                    priv (&priv->lock)     &priv->multicast_list (aka head)
          ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
      --- <NMI exception stack> ---
       #5 [ff646f199a8c7e00] ipoib_mcast_join_task+0x1b1 at ffffffffc0944ac1 [ib_ipoib]
       #6 [ff646f199a8c7e98] process_one_work+0x1a7 at ffffffff9bf10967
      
      crash> rx ff646f199a8c7e68
      ff646f199a8c7e68:  ff1c6a1a04dc82f8 <<< work = &priv->mcast_task.work
      
      crash> list -hO ipoib_dev_priv.multicast_list ff1c6a1a04dc8000
      (empty)
      
      crash> ipoib_dev_priv.mcast_task.work.func,mcast_mutex.owner.counter ff1c6a1a04dc8000
        mcast_task.work.func = 0xffffffffc0944910 <ipoib_mcast_join_task>,
        mcast_mutex.owner.counter = 0xff1c69998efec000
      
      crash> b 8
      PID: 8        TASK: ff1c69998efec000  CPU: 33   COMMAND: "kworker/u72:0"
      --
       #3 [ff646f1980153d50] wait_for_completion+0x96 at ffffffff9c7d7646
       #4 [ff646f1980153d90] ipoib_mcast_remove_list+0x56 at ffffffffc0944dc6 [ib_ipoib]
       #5 [ff646f1980153de8] ipoib_mcast_dev_flush+0x1a7 at ffffffffc09455a7 [ib_ipoib]
       #6 [ff646f1980153e58] __ipoib_ib_dev_flush+0x1a4 at ffffffffc09431a4 [ib_ipoib]
       #7 [ff646f1980153e98] process_one_work+0x1a7 at ffffffff9bf10967
      
      crash> rx ff646f1980153e68
      ff646f1980153e68:  ff1c6a1a04dc83f0 <<< work = &priv->flush_light
      
      crash> ipoib_dev_priv.flush_light.func,broadcast ff1c6a1a04dc8000
        flush_light.func = 0xffffffffc0943820 <ipoib_ib_dev_flush_light>,
        broadcast = 0x0,
      
      The mcast(s) on the `remove_list` (the remaining part of the former `priv->multicast_list`):
      
      crash> list -s ipoib_mcast.done.done ipoib_mcast.list -H ff646f1980153e10 | paste - -
      ff1c6a192bd0c200          done.done = 0x0,
      ff1c6a192d60ac00          done.done = 0x0,

      Reported-by: Yuya Fujita-bishamonten <fj-lsoft-rh-driver@dl.jp.fujitsu.com>
      Signed-off-by: Daniel Vacek <neelx@redhat.com>
      Link: https://lore.kernel.org/all/20231212080746.1528802-1-neelx@redhat.com
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
    • Expose c0 and SW encap ICM for RDMA · afcda192
      Leon Romanovsky authored
      These two series from Mark and Shun extend RDMA mlx5 API.
      
      Mark's series provides c0 register used to match egress
      traffic sent by local device.
      
      Shun's series adds new type for ICM area.
      
      Link: https://lore.kernel.org/all/cover.1701871118.git.leon@kernel.org
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
    • RDMA/mlx5: Expose register c0 for RDMA device · d727d27d
      Mark Bloch authored
      This patch introduces improvements for matching egress traffic sent by the
      local device. When applicable, all egress traffic from the local vport is
      now tagged with the provided value. This enhancement is particularly useful
      for FDB steering purposes.
      
      The primary focus of this update is facilitating the transmission of
      traffic from the hypervisor to a VF. To achieve this, one must initiate an
      SQ on the hypervisor and subsequently create a rule in the FDB that matches
      on the eswitch manager vport and the SQN of the aforementioned SQ.
      
      The SQN can be obtained from the opened SQ, and the eswitch manager vport
      match can be substituted with the register c0 value exposed by this patch.

      Signed-off-by: Mark Bloch <mbloch@nvidia.com>
      Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
      Link: https://lore.kernel.org/r/aa4120a91c98ff1c44f1213388c744d4cb0324d6.1701871118.git.leon@kernel.org
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
    • net/mlx5: E-Switch, expose eswitch manager vport · eb524d0f
      Mark Bloch authored
      Expose the ability to query the eswitch manager vport number.
      Next patch will utilize this capability to reveal the correct
      register C0 value to the users.

      Signed-off-by: Mark Bloch <mbloch@nvidia.com>
      Link: https://lore.kernel.org/r/614fb0e216250e2ce3340471ec141b83ec45c7f4.1701871118.git.leon@kernel.org
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
    • net/mlx5: Manage ICM type of SW encap · abf8e8f2
      Shun Hao authored
      Support allocating/deallocating memory of the new SW encap ICM type.
      The new ICM type is used for encap context allocation managed by SW
      instead of FW. It can increase the maximum number of encap contexts and
      the allocation speed.

      Signed-off-by: Shun Hao <shunh@nvidia.com>
      Link: https://lore.kernel.org/r/bed5121255918eb132a1334141c76a0594df8143.1701871118.git.leon@kernel.org
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
    • RDMA/mlx5: Support handling of SW encap ICM area · a429ec96
      Shun Hao authored
      Add a new type for this ICM area. The user can now allocate/deallocate
      the new type of SW encap ICM memory to store encap header data, which
      is managed by SW.

      Signed-off-by: Shun Hao <shunh@nvidia.com>
      Link: https://lore.kernel.org/r/546fe43fc700240709e30acf7713ec6834d652bd.1701871118.git.leon@kernel.org
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
    • net/mlx5: Introduce indirect-sw-encap ICM properties · 1ca51628
      Shun Hao authored
      Add new fields for device memory capabilities, in order to support
      creation of new ICM memory type of SW encap.

      Signed-off-by: Shun Hao <shunh@nvidia.com>
      Link: https://lore.kernel.org/r/107cca7dd6a932a1704abf6ebd1b801105546a8e.1701871118.git.leon@kernel.org
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
  3. 11 Dec, 2023 6 commits
  4. 10 Dec, 2023 7 commits
  5. 09 Dec, 2023 15 commits
    • Merge tag 'usb-6.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 21b73ffc
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Here are some small USB fixes for 6.7-rc5 to resolve some reported
        issues. Included in here are:
      
         - usb gadget f_hid, and uevent fix
      
         - xhci driver revert to resolve a much-reported issue
      
         - typec driver fix
      
        All of these have been in linux-next with no reported issues"
      
      * tag 'usb-6.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        usb: gadget: f_hid: fix report descriptor allocation
        Revert "xhci: Loosen RPM as default policy to cover for AMD xHC 1.1"
        usb: typec: class: fix typec_altmode_put_partner to put plugs
        USB: gadget: core: adjust uevent timing on gadget unbind
    • Merge tag 'tty-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 0b526090
      Linus Torvalds authored
      Pull serial driver fixes from Greg KH:
       "Here are some small serial driver fixes for 6.7-rc4 to resolve some
        reported issues. Included in here are:
      
         - pl011 dma support fix
      
         - sc16is7xx driver fix
      
         - ma35d1 console index fix
      
         - 8250 driver fixes for small issues
      
        All of these have been in linux-next with no reported issues"
      
      * tag 'tty-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        serial: 8250_dw: Add ACPI ID for Granite Rapids-D UART
        serial: ma35d1: Validate console index before assignment
        ARM: PL011: Fix DMA support
        serial: sc16is7xx: address RX timeout interrupt errata
        serial: 8250: 8250_omap: Clear UART_HAS_RHR_IT_DIS bit
        serial: 8250_omap: Add earlycon support for the AM654 UART controller
        serial: 8250: 8250_omap: Do not start RX DMA on THRI interrupt
    • Merge tag 'char-misc-6.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · ca20f162
      Linus Torvalds authored
      Pull char / misc driver fixes from Greg KH:
       "Here are some small fixes for 6.7-rc5 for a variety of small driver
        subsystems. Included in here are:
      
         - debugfs revert for reported issue
      
         - greybus revert for reported issue
      
         - greybus fixup for endian build warning
      
         - coresight driver fixes
      
         - nvmem driver fixes
      
         - devcoredump fix
      
         - parport new device id
      
         - ndtest build fix
      
        All of these have been in linux-next with no reported issues"
      
      * tag 'char-misc-6.7-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        nvmem: Do not expect fixed layouts to grab a layout driver
        parport: Add support for Brainboxes IX/UC/PX parallel cards
        Revert "greybus: gb-beagleplay: Ensure le for values in transport"
        greybus: gb-beagleplay: Ensure le for values in transport
        greybus: BeaglePlay driver needs CRC_CCITT
        Revert "debugfs: annotate debugfs handlers vs. removal with lockdep"
        devcoredump: Send uevent once devcd is ready
        ndtest: fix typo class_regster -> class_register
        misc: mei: client.c: fix problem of return '-EOVERFLOW' in mei_cl_write
        misc: mei: client.c: return negative error code in mei_cl_write
        mei: pxp: fix mei_pxp_send_message return value
        coresight: ultrasoc-smb: Fix uninitialized before use buf_hw_base
        coresight: ultrasoc-smb: Config SMB buffer before register sink
        coresight: ultrasoc-smb: Fix sleep while close preempt in enable_smb
        Documentation: coresight: fix `make refcheckdocs` warning
        hwtracing: hisi_ptt: Don't try to attach a task
        hwtracing: hisi_ptt: Handle the interrupt in hardirq context
        hwtracing: hisi_ptt: Add dummy callback pmu::read()
        coresight: Fix crash when Perf and sysfs modes are used concurrently
        coresight: etm4x: Remove bogous __exit annotation for some functions
    • Merge tag 'loongarch-fixes-6.7-2' of... · b10a3cca
      Linus Torvalds authored
      Merge tag 'loongarch-fixes-6.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
      
      Pull LoongArch fixes from Huacai Chen:
       "Preserve syscall nr across execve(), slightly clean up drdtime(), fix
        the Clang built zboot kernel, fix a stack unwinder bug and several bpf
        jit bugs"
      
      * tag 'loongarch-fixes-6.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
        LoongArch: BPF: Fix unconditional bswap instructions
        LoongArch: BPF: Fix sign-extension mov instructions
        LoongArch: BPF: Don't sign extend function return value
        LoongArch: BPF: Don't sign extend memory load operand
        LoongArch: Preserve syscall nr across execve()
        LoongArch: Set unwind stack type to unknown rather than set error flag
        LoongArch: Slightly clean up drdtime()
        LoongArch: Apply dynamic relocations for LLD
    • Merge tag 'mips-fixes_6.7_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · b8503b21
      Linus Torvalds authored
      Pull MIPS fixes from Thomas Bogendoerfer:
      
       - Fixes for broken Loongson firmware
      
       - Fix lockdep splat
      
       - Fix FPU states when creating kernel threads
      
      * tag 'mips-fixes_6.7_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
        MIPS: kernel: Clear FPU states when setting up kernel threads
        MIPS: Loongson64: Handle more memory types passed from firmware
        MIPS: Loongson64: Enable DMA noncoherent support
        MIPS: Loongson64: Reserve vgabios memory on boot
        mips/smp: Call rcutree_report_cpu_starting() earlier
    • Merge tag 'perf-tools-fixes-for-v6.7-2-2023-12-08' of... · 9d3bc457
      Linus Torvalds authored
      Merge tag 'perf-tools-fixes-for-v6.7-2-2023-12-08' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
      
      Pull perf tools fixes from Namhyung Kim:
       "A random set of small bug fixes including:
      
         - Fix segfault on AmpereOne due to missing default metricgroup name
      
         - Fix segfault on `perf list --json` due to NULL pointer"
      
      * tag 'perf-tools-fixes-for-v6.7-2-2023-12-08' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
        perf list: Fix JSON segfault by setting the used skip_duplicate_pmus callback
        perf vendor events arm64: AmpereOne: Add missing DefaultMetricgroupName fields
        perf metrics: Avoid segv if default metricgroup isn't set
    • Merge tag '6.7-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 · 2099306c
      Linus Torvalds authored
      Pull smb client fixes from Steve French:
       "Six smb3 client fixes:
      
         - Fixes for copy_file_range and clone (cache invalidation and file
           size), also addresses an xfstest failure
      
         - Fix to return proper error if REMAP_FILE_DEDUP set (also fixes
           xfstest generic/304)
      
         - Fix potential null pointer reference with DFS
      
         - Multichannel fix addressing (reverting an earlier patch) some of
           the problems with enabling/disabling channels dynamically
      
        Still working on a follow-on multichannel fix to address another issue
        found in reconnect testing that will be sent next week"
      
      * tag '6.7-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: reconnect worker should take reference on server struct unconditionally
        Revert "cifs: reconnect work should have reference on server struct"
        cifs: Fix non-availability of dedup breaking generic/304
        smb: client: fix potential NULL deref in parse_dfs_referrals()
        cifs: Fix flushing, invalidation and file size with FICLONE
        cifs: Fix flushing, invalidation and file size with copy_file_range()
    • LoongArch: BPF: Fix unconditional bswap instructions · e2f7b3d8
      Tiezhu Yang authored
      The BPF Instruction Set Specification states: "bswap32: Takes an unsigned
      32-bit number in either big- or little-endian format and returns the
      equivalent number with the same bit width but opposite endianness". The
      JIT should therefore clear the upper 32 bits in "case 32:" for both
      BPF_ALU and BPF_ALU64.
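
      As a rough illustration of the quoted semantics (not the JIT code
      itself), the sketch below models a 64-bit BPF register in plain C. The
      function name `bpf_bswap32_model` is a hypothetical helper introduced
      only for this example.

```c
#include <stdint.h>

/*
 * Model of a 32-bit unconditional byte swap on a 64-bit register:
 * only the low 32 bits take part in the swap, and the result is
 * zero-extended, so the upper 32 bits of the register end up clear.
 */
static uint64_t bpf_bswap32_model(uint64_t reg)
{
    uint32_t v = (uint32_t)reg; /* drop the upper 32 bits */

    v = (v >> 24) |
        ((v >> 8) & 0x0000ff00u) |
        ((v << 8) & 0x00ff0000u) |
        (v << 24);

    return (uint64_t)v; /* zero-extend back to 64 bits */
}
```

      Feeding it the two inputs from the test_bpf cases below gives
      0xefcdab89 and 0x10325476, the values those tests expect.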
      
      [root@linux fedora]# echo 1 > /proc/sys/net/core/bpf_jit_enable
      [root@linux fedora]# modprobe test_bpf
      
      Before:
      test_bpf: #313 BSWAP 32: 0x0123456789abcdef -> 0xefcdab89 jited:1 ret 1460850314 != -271733879 (0x5712ce8a != 0xefcdab89)FAIL (1 times)
      test_bpf: #317 BSWAP 32: 0xfedcba9876543210 -> 0x10325476 jited:1 ret -1460850316 != 271733878 (0xa8ed3174 != 0x10325476)FAIL (1 times)
      
      After:
      test_bpf: #313 BSWAP 32: 0x0123456789abcdef -> 0xefcdab89 jited:1 4 PASS
      test_bpf: #317 BSWAP 32: 0xfedcba9876543210 -> 0x10325476 jited:1 4 PASS
      
      Fixes: 4ebf9216 ("LoongArch: BPF: Support unconditional bswap instructions")
      Acked-by: Hengqi Chen <hengqi.chen@gmail.com>
      Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
    • LoongArch: BPF: Fix sign-extension mov instructions · 772cbe94
      Tiezhu Yang authored
      include/linux/filter.h documents this instruction as "Short form of
      movsx, dst_reg = (s8,s16,s32)src_reg". Additionally, for BPF_ALU64 the
      value of the destination register is unchanged, whereas for BPF_ALU the
      upper 32 bits of the destination register are zeroed, so the JIT should
      clear the upper 32 bits for BPF_ALU.
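
      One reading of the paragraph above, sketched in plain C: the BPF_ALU64
      form keeps the full 64-bit sign-extended value, while the BPF_ALU form
      zeroes the upper 32 bits of that result. The function names are
      hypothetical and exist only for this example. The narrowing casts rely
      on the two's-complement wrap-around that common compilers provide.

```c
#include <stdint.h>

/* dst = (s8|s16|s32)src, sign-extended to the full 64-bit register. */
static uint64_t movsx_alu64_model(uint64_t src, int bits)
{
    switch (bits) {
    case 8:
        return (uint64_t)(int64_t)(int8_t)src;
    case 16:
        return (uint64_t)(int64_t)(int16_t)src;
    default: /* 32 */
        return (uint64_t)(int64_t)(int32_t)src;
    }
}

/* 32-bit form: same sign extension, then the upper 32 bits are zeroed. */
static uint64_t movsx_alu32_model(uint64_t src, int bits)
{
    return (uint64_t)(uint32_t)movsx_alu64_model(src, bits);
}
```

      For example, a source byte of 0x80 becomes 0xffffffffffffff80 in the
      64-bit form and 0x00000000ffffff80 in the 32-bit form.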
      
      [root@linux fedora]# echo 1 > /proc/sys/net/core/bpf_jit_enable
      [root@linux fedora]# modprobe test_bpf
      
      Before:
      test_bpf: #81 ALU_MOVSX | BPF_B jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)
      test_bpf: #82 ALU_MOVSX | BPF_H jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)
      
      After:
      test_bpf: #81 ALU_MOVSX | BPF_B jited:1 6 PASS
      test_bpf: #82 ALU_MOVSX | BPF_H jited:1 6 PASS
      
      By the way, the bpf selftest case "./test_progs -t verifier_movsx" can
      also be fixed with this patch.
      
      Fixes: f48012f1 ("LoongArch: BPF: Support sign-extension mov instructions")
      Acked-by: Hengqi Chen <hengqi.chen@gmail.com>
      Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
    • LoongArch: BPF: Don't sign extend function return value · 5d47ec2e
      Hengqi Chen authored
      The `cls_redirect` test triggers a kernel panic like:
      
        # ./test_progs -t cls_redirect
        Can't find bpf_testmod.ko kernel module: -2
        WARNING! Selftests relying on bpf_testmod.ko will be skipped.
        [   30.938489] CPU 3 Unable to handle kernel paging request at virtual address fffffffffd814de0, era == ffff800002009fb8, ra == ffff800002009f9c
        [   30.939331] Oops[#1]:
        [   30.939513] CPU: 3 PID: 1260 Comm: test_progs Not tainted 6.7.0-rc2-loong-devel-g2f56bb0d2327 #35 a896aca3f4164f09cc346f89f2e09832e07be5f6
        [   30.939732] Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 2/2/2022
        [   30.939901] pc ffff800002009fb8 ra ffff800002009f9c tp 9000000104da4000 sp 9000000104da7ab0
        [   30.940038] a0 fffffffffd814de0 a1 9000000104da7a68 a2 0000000000000000 a3 9000000104da7c10
        [   30.940183] a4 9000000104da7c14 a5 0000000000000002 a6 0000000000000021 a7 00005555904d7f90
        [   30.940321] t0 0000000000000110 t1 0000000000000000 t2 fffffffffd814de0 t3 0004c4b400000000
        [   30.940456] t4 ffffffffffffffff t5 00000000c3f63600 t6 0000000000000000 t7 0000000000000000
        [   30.940590] t8 000000000006d803 u0 0000000000000020 s9 9000000104da7b10 s0 900000010504c200
        [   30.940727] s1 fffffffffd814de0 s2 900000010504c200 s3 9000000104da7c10 s4 9000000104da7ad0
        [   30.940866] s5 0000000000000000 s6 90000000030e65bc s7 9000000104da7b44 s8 90000000044f6fc0
        [   30.941015]    ra: ffff800002009f9c bpf_prog_846803e5ae81417f_cls_redirect+0xa0/0x590
        [   30.941535]   ERA: ffff800002009fb8 bpf_prog_846803e5ae81417f_cls_redirect+0xbc/0x590
        [   30.941696]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
        [   30.942224]  PRMD: 00000004 (PPLV0 +PIE -PWE)
        [   30.942330]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
        [   30.942453]  ECFG: 00071c1c (LIE=2-4,10-12 VS=7)
        [   30.942612] ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
        [   30.942764]  BADV: fffffffffd814de0
        [   30.942854]  PRID: 0014c010 (Loongson-64bit, Loongson-3A5000)
        [   30.942974] Modules linked in:
        [   30.943078] Process test_progs (pid: 1260, threadinfo=00000000ce303226, task=000000007d10bb76)
        [   30.943306] Stack : 900000010a064000 90000000044f6fc0 9000000104da7b48 0000000000000000
        [   30.943495]         0000000000000000 9000000104da7c14 9000000104da7c10 900000010504c200
        [   30.943626]         0000000000000001 ffff80001b88c000 9000000104da7b70 90000000030e6668
        [   30.943785]         0000000000000000 9000000104da7b58 ffff80001b88c048 9000000003d05000
        [   30.943936]         900000000303ac88 0000000000000000 0000000000000000 9000000104da7b70
        [   30.944091]         0000000000000000 0000000000000001 0000000731eeab00 0000000000000000
        [   30.944245]         ffff80001b88c000 0000000000000000 0000000000000000 54b99959429f83b8
        [   30.944402]         ffff80001b88c000 90000000044f6fc0 9000000101d70000 ffff80001b88c000
        [   30.944538]         000000000000005a 900000010504c200 900000010a064000 900000010a067000
        [   30.944697]         9000000104da7d88 0000000000000000 9000000003d05000 90000000030e794c
        [   30.944852]         ...
        [   30.944924] Call Trace:
        [   30.945120] [<ffff800002009fb8>] bpf_prog_846803e5ae81417f_cls_redirect+0xbc/0x590
        [   30.945650] [<90000000030e6668>] bpf_test_run+0x1ec/0x2f8
        [   30.945958] [<90000000030e794c>] bpf_prog_test_run_skb+0x31c/0x684
        [   30.946065] [<90000000026d4f68>] __sys_bpf+0x678/0x2724
        [   30.946159] [<90000000026d7288>] sys_bpf+0x20/0x2c
        [   30.946253] [<90000000032dd224>] do_syscall+0x7c/0x94
        [   30.946343] [<9000000002541c5c>] handle_syscall+0xbc/0x158
        [   30.946492]
        [   30.946549] Code: 0015030e  5c0009c0  5001d000 <28c00304> 02c00484  29c00304  00150009  2a42d2e4  0280200d
        [   30.946793]
        [   30.946971] ---[ end trace 0000000000000000 ]---
        [   32.093225] Kernel panic - not syncing: Fatal exception in interrupt
        [   32.093526] Kernel relocated by 0x2320000
        [   32.093630]  .text @ 0x9000000002520000
        [   32.093725]  .data @ 0x9000000003400000
        [   32.093792]  .bss  @ 0x9000000004413200
        [   34.971998] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
      
      This is because we sign-extend function return values. When subprog
      mode is enabled, we have:
      
        cls_redirect()
          -> get_global_metrics() returns pcpu ptr 0xfffffefffc00b480
      
      The pointer returned is later sign-extended to 0xfffffffffc00b480 at
      `BPF_JMP | BPF_EXIT`. During the BPF prog run, this triggers an unhandled
      page fault and a kernel panic.
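
      The corruption can be reproduced with ordinary integer arithmetic. The
      sketch below is only a model of what a 32-bit sign extension does to a
      64-bit value. `sext32_model` is a hypothetical helper, not code from
      the JIT.

```c
#include <stdint.h>

/*
 * Keep the low 32 bits of a 64-bit register and sign-extend them back
 * to 64 bits. For a 64-bit pointer this discards the real upper half
 * and replaces it with copies of bit 31.
 */
static uint64_t sext32_model(uint64_t reg)
{
    return (uint64_t)(int64_t)(int32_t)(uint32_t)reg;
}
```

      Applied to the per-CPU pointer 0xfffffefffc00b480 from the example
      above, this yields 0xfffffffffc00b480, which is the bogus value
      reported in this commit message.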
      
      Drop the unnecessary sign extension on return values, like other
      architectures do.
      
      With this change, we have:
      
        # ./test_progs -t cls_redirect
        Can't find bpf_testmod.ko kernel module: -2
        WARNING! Selftests relying on bpf_testmod.ko will be skipped.
        #51/1    cls_redirect/cls_redirect_inlined:OK
        #51/2    cls_redirect/IPv4 TCP accept unknown (no hops, flags: SYN):OK
        #51/3    cls_redirect/IPv6 TCP accept unknown (no hops, flags: SYN):OK
        #51/4    cls_redirect/IPv4 TCP accept unknown (no hops, flags: ACK):OK
        #51/5    cls_redirect/IPv6 TCP accept unknown (no hops, flags: ACK):OK
        #51/6    cls_redirect/IPv4 TCP forward unknown (one hop, flags: ACK):OK
        #51/7    cls_redirect/IPv6 TCP forward unknown (one hop, flags: ACK):OK
        #51/8    cls_redirect/IPv4 TCP accept known (one hop, flags: ACK):OK
        #51/9    cls_redirect/IPv6 TCP accept known (one hop, flags: ACK):OK
        #51/10   cls_redirect/IPv4 UDP accept unknown (no hops, flags: none):OK
        #51/11   cls_redirect/IPv6 UDP accept unknown (no hops, flags: none):OK
        #51/12   cls_redirect/IPv4 UDP forward unknown (one hop, flags: none):OK
        #51/13   cls_redirect/IPv6 UDP forward unknown (one hop, flags: none):OK
        #51/14   cls_redirect/IPv4 UDP accept known (one hop, flags: none):OK
        #51/15   cls_redirect/IPv6 UDP accept known (one hop, flags: none):OK
        #51/16   cls_redirect/cls_redirect_subprogs:OK
        #51/17   cls_redirect/IPv4 TCP accept unknown (no hops, flags: SYN):OK
        #51/18   cls_redirect/IPv6 TCP accept unknown (no hops, flags: SYN):OK
        #51/19   cls_redirect/IPv4 TCP accept unknown (no hops, flags: ACK):OK
        #51/20   cls_redirect/IPv6 TCP accept unknown (no hops, flags: ACK):OK
        #51/21   cls_redirect/IPv4 TCP forward unknown (one hop, flags: ACK):OK
        #51/22   cls_redirect/IPv6 TCP forward unknown (one hop, flags: ACK):OK
        #51/23   cls_redirect/IPv4 TCP accept known (one hop, flags: ACK):OK
        #51/24   cls_redirect/IPv6 TCP accept known (one hop, flags: ACK):OK
        #51/25   cls_redirect/IPv4 UDP accept unknown (no hops, flags: none):OK
        #51/26   cls_redirect/IPv6 UDP accept unknown (no hops, flags: none):OK
        #51/27   cls_redirect/IPv4 UDP forward unknown (one hop, flags: none):OK
        #51/28   cls_redirect/IPv6 UDP forward unknown (one hop, flags: none):OK
        #51/29   cls_redirect/IPv4 UDP accept known (one hop, flags: none):OK
        #51/30   cls_redirect/IPv6 UDP accept known (one hop, flags: none):OK
        #51/31   cls_redirect/cls_redirect_dynptr:OK
        #51/32   cls_redirect/IPv4 TCP accept unknown (no hops, flags: SYN):OK
        #51/33   cls_redirect/IPv6 TCP accept unknown (no hops, flags: SYN):OK
        #51/34   cls_redirect/IPv4 TCP accept unknown (no hops, flags: ACK):OK
        #51/35   cls_redirect/IPv6 TCP accept unknown (no hops, flags: ACK):OK
        #51/36   cls_redirect/IPv4 TCP forward unknown (one hop, flags: ACK):OK
        #51/37   cls_redirect/IPv6 TCP forward unknown (one hop, flags: ACK):OK
        #51/38   cls_redirect/IPv4 TCP accept known (one hop, flags: ACK):OK
        #51/39   cls_redirect/IPv6 TCP accept known (one hop, flags: ACK):OK
        #51/40   cls_redirect/IPv4 UDP accept unknown (no hops, flags: none):OK
        #51/41   cls_redirect/IPv6 UDP accept unknown (no hops, flags: none):OK
        #51/42   cls_redirect/IPv4 UDP forward unknown (one hop, flags: none):OK
        #51/43   cls_redirect/IPv6 UDP forward unknown (one hop, flags: none):OK
        #51/44   cls_redirect/IPv4 UDP accept known (one hop, flags: none):OK
        #51/45   cls_redirect/IPv6 UDP accept known (one hop, flags: none):OK
        #51      cls_redirect:OK
        Summary: 1/45 PASSED, 0 SKIPPED, 0 FAILED
      
      Fixes: 5dc61552 ("LoongArch: Add BPF JIT support")
      Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
    • LoongArch: BPF: Don't sign extend memory load operand · fe575755
      Hengqi Chen authored
      The `cgrp_local_storage` test triggers a kernel panic like:
      
        # ./test_progs -t cgrp_local_storage
        Can't find bpf_testmod.ko kernel module: -2
        WARNING! Selftests relying on bpf_testmod.ko will be skipped.
        [  550.930632] CPU 1 Unable to handle kernel paging request at virtual address 0000000000000080, era == ffff80000200be34, ra == ffff80000200be00
        [  550.931781] Oops[#1]:
        [  550.931966] CPU: 1 PID: 1303 Comm: test_progs Not tainted 6.7.0-rc2-loong-devel-g2f56bb0d2327 #35 a896aca3f4164f09cc346f89f2e09832e07be5f6
        [  550.932215] Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 2/2/2022
        [  550.932403] pc ffff80000200be34 ra ffff80000200be00 tp 9000000108350000 sp 9000000108353dc0
        [  550.932545] a0 0000000000000000 a1 0000000000000517 a2 0000000000000118 a3 00007ffffbb15558
        [  550.932682] a4 00007ffffbb15620 a5 90000001004e7700 a6 0000000000000021 a7 0000000000000118
        [  550.932824] t0 ffff80000200bdc0 t1 0000000000000517 t2 0000000000000517 t3 00007ffff1c06ee0
        [  550.932961] t4 0000555578ae04d0 t5 fffffffffffffff8 t6 0000000000000004 t7 0000000000000020
        [  550.933097] t8 0000000000000040 u0 00000000000007b8 s9 9000000108353e00 s0 90000001004e7700
        [  550.933241] s1 9000000004005000 s2 0000000000000001 s3 0000000000000000 s4 0000555555eb2ec8
        [  550.933379] s5 00007ffffbb15bb8 s6 00007ffff1dafd60 s7 000055555663f610 s8 00007ffff1db0050
        [  550.933520]    ra: ffff80000200be00 bpf_prog_98f1b9e767be2a84_on_enter+0x40/0x200
        [  550.933911]   ERA: ffff80000200be34 bpf_prog_98f1b9e767be2a84_on_enter+0x74/0x200
        [  550.934105]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
        [  550.934596]  PRMD: 00000004 (PPLV0 +PIE -PWE)
        [  550.934712]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
        [  550.934836]  ECFG: 00071c1c (LIE=2-4,10-12 VS=7)
        [  550.934976] ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
        [  550.935097]  BADV: 0000000000000080
        [  550.935181]  PRID: 0014c010 (Loongson-64bit, Loongson-3A5000)
        [  550.935291] Modules linked in:
        [  550.935391] Process test_progs (pid: 1303, threadinfo=000000006c3b1c41, task=0000000061f84a55)
        [  550.935643] Stack : 00007ffffbb15bb8 0000555555eb2ec8 0000000000000000 0000000000000001
        [  550.935844]         9000000004005000 ffff80001b864000 00007ffffbb15450 90000000029aa034
        [  550.935990]         0000000000000000 9000000108353ec0 0000000000000118 d07d9dfb09721a09
        [  550.936175]         0000000000000001 0000000000000000 9000000108353ec0 0000000000000118
        [  550.936314]         9000000101d46ad0 900000000290abf0 000055555663f610 0000000000000000
        [  550.936479]         0000000000000003 9000000108353ec0 00007ffffbb15450 90000000029d7288
        [  550.936635]         00007ffff1dafd60 000055555663f610 0000000000000000 0000000000000003
        [  550.936779]         9000000108353ec0 90000000035dd1f0 00007ffff1dafd58 9000000002841c5c
        [  550.936939]         0000000000000119 0000555555eea5a8 00007ffff1d78780 00007ffffbb153e0
        [  550.937083]         ffffffffffffffda 00007ffffbb15518 0000000000000040 00007ffffbb15558
        [  550.937224]         ...
        [  550.937299] Call Trace:
        [  550.937521] [<ffff80000200be34>] bpf_prog_98f1b9e767be2a84_on_enter+0x74/0x200
        [  550.937910] [<90000000029aa034>] bpf_trace_run2+0x90/0x154
        [  550.938105] [<900000000290abf0>] syscall_trace_enter.isra.0+0x1cc/0x200
        [  550.938224] [<90000000035dd1f0>] do_syscall+0x48/0x94
        [  550.938319] [<9000000002841c5c>] handle_syscall+0xbc/0x158
        [  550.938477]
        [  550.938607] Code: 580009ae  50016000  262402e4 <28c20085> 14092084  03a00084  16000024  03240084  00150006
        [  550.938851]
        [  550.939021] ---[ end trace 0000000000000000 ]---
      
      Further investigation shows that this panic is triggered by memory
      load operations:
      
        ptr = bpf_cgrp_storage_get(&map_a, task->cgroups->dfl_cgrp, 0,
                                   BPF_LOCAL_STORAGE_GET_F_CREATE);
      
      The expression `task->cgroups->dfl_cgrp` involves two memory loads.
      Since the field offsets fit in imm12 or imm14, we use the ldd or ldptrd
      instructions. But both instructions have the side effect of
      sign-extending the imm operand. As a result, we get wrong addresses
      and a panic is inevitable.
      
      Use a generic ldxd instruction to avoid this kind of issue.
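
      To make the sign-extension behaviour concrete, here is a minimal
      sketch of how a 12-bit immediate field is interpreted once it is
      sign-extended. The helpers `sign_extend_imm12` and `load_address`
      are hypothetical names chosen for illustration; this is not code
      from the LoongArch BPF JIT.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sign-extend a 12-bit immediate field to 64 bits.
 * Hypothetical helper for illustration only.
 */
static int64_t sign_extend_imm12(uint32_t field)
{
	assert(field <= 0xfff);	/* only 12 bits are encodable */

	/* Bit 11 is the sign bit: field values >= 0x800 become negative. */
	return (field & 0x800) ? (int64_t)field - 0x1000 : (int64_t)field;
}

/* Effective address of a load whose imm12 offset is sign-extended. */
static uint64_t load_address(uint64_t base, uint32_t imm12_field)
{
	return base + (uint64_t)sign_extend_imm12(imm12_field);
}
```

      For example, `sign_extend_imm12(0x080)` yields 128, while
      `sign_extend_imm12(0x800)` yields -2048, so the same bit pattern
      can move the effective address below the base.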
      
      With this change, we have:
      
        # ./test_progs -t cgrp_local_storage
        Can't find bpf_testmod.ko kernel module: -2
        WARNING! Selftests relying on bpf_testmod.ko will be skipped.
        test_cgrp_local_storage:PASS:join_cgroup /cgrp_local_storage 0 nsec
        #48/1    cgrp_local_storage/tp_btf:OK
        test_attach_cgroup:PASS:skel_open 0 nsec
        test_attach_cgroup:PASS:prog_attach 0 nsec
        test_attach_cgroup:PASS:prog_attach 0 nsec
        libbpf: prog 'update_cookie_tracing': failed to attach: ERROR: strerror_r(-524)=22
        test_attach_cgroup:FAIL:prog_attach unexpected error: -524
        #48/2    cgrp_local_storage/attach_cgroup:FAIL
        test_recursion:PASS:skel_open_and_load 0 nsec
        libbpf: prog 'on_lookup': failed to attach: ERROR: strerror_r(-524)=22
        libbpf: prog 'on_lookup': failed to auto-attach: -524
        test_recursion:FAIL:skel_attach unexpected error: -524 (errno 524)
        #48/3    cgrp_local_storage/recursion:FAIL
        #48/4    cgrp_local_storage/negative:OK
        #48/5    cgrp_local_storage/cgroup_iter_sleepable:OK
        test_yes_rcu_lock:PASS:skel_open 0 nsec
        test_yes_rcu_lock:PASS:skel_load 0 nsec
        libbpf: prog 'yes_rcu_lock': failed to attach: ERROR: strerror_r(-524)=22
        libbpf: prog 'yes_rcu_lock': failed to auto-attach: -524
        test_yes_rcu_lock:FAIL:skel_attach unexpected error: -524 (errno 524)
        #48/6    cgrp_local_storage/yes_rcu_lock:FAIL
        #48/7    cgrp_local_storage/no_rcu_lock:OK
        #48      cgrp_local_storage:FAIL
      
        All error logs:
        test_cgrp_local_storage:PASS:join_cgroup /cgrp_local_storage 0 nsec
        test_attach_cgroup:PASS:skel_open 0 nsec
        test_attach_cgroup:PASS:prog_attach 0 nsec
        test_attach_cgroup:PASS:prog_attach 0 nsec
        libbpf: prog 'update_cookie_tracing': failed to attach: ERROR: strerror_r(-524)=22
        test_attach_cgroup:FAIL:prog_attach unexpected error: -524
        #48/2    cgrp_local_storage/attach_cgroup:FAIL
        test_recursion:PASS:skel_open_and_load 0 nsec
        libbpf: prog 'on_lookup': failed to attach: ERROR: strerror_r(-524)=22
        libbpf: prog 'on_lookup': failed to auto-attach: -524
        test_recursion:FAIL:skel_attach unexpected error: -524 (errno 524)
        #48/3    cgrp_local_storage/recursion:FAIL
        test_yes_rcu_lock:PASS:skel_open 0 nsec
        test_yes_rcu_lock:PASS:skel_load 0 nsec
        libbpf: prog 'yes_rcu_lock': failed to attach: ERROR: strerror_r(-524)=22
        libbpf: prog 'yes_rcu_lock': failed to auto-attach: -524
        test_yes_rcu_lock:FAIL:skel_attach unexpected error: -524 (errno 524)
        #48/6    cgrp_local_storage/yes_rcu_lock:FAIL
        #48      cgrp_local_storage:FAIL
        Summary: 0/4 PASSED, 0 SKIPPED, 1 FAILED
      
      No more panics (the test still fails because of the lack of BPF
      trampoline support, which I am actively working on).
      
      Fixes: 5dc61552 ("LoongArch: Add BPF JIT support")
      Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
      fe575755
    • Hengqi Chen's avatar
      LoongArch: Preserve syscall nr across execve() · d6c5f06e
      Hengqi Chen authored
      Currently, we store the syscall nr in pt_regs::regs[11], and execve()
      accidentally overwrites it during its execution:
      
          sys_execve()
            -> do_execve()
              -> do_execveat_common()
                -> bprm_execve()
                  -> exec_binprm()
                    -> search_binary_handler()
                      -> load_elf_binary()
                        -> ELF_PLAT_INIT()
      
      ELF_PLAT_INIT() resets regs[11] to 0, so in syscall_exit_to_user_mode()
      we later get a wrong syscall nr. This breaks tools like execsnoop since
      they rely on execve() tracepoints.
      
      Skip pt_regs::regs[11] reset in ELF_PLAT_INIT() to fix the issue.
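
      The idea of the fix can be sketched as follows. This is a simplified
      model, not the kernel macro: `struct mock_pt_regs`, `mock_plat_init`
      and the constants are hypothetical names, and which registers the
      real ELF_PLAT_INIT() clears is not modeled here.

```c
#include <assert.h>
#include <stddef.h>

#define NR_GPRS        32
#define SYSCALL_NR_REG 11	/* per the text: syscall nr lives in regs[11] */

/* Simplified stand-in for the kernel's struct pt_regs (assumption). */
struct mock_pt_regs {
	unsigned long regs[NR_GPRS];
};

/*
 * Clear the general registers for a freshly loaded image, but leave the
 * slot holding the syscall nr untouched so the exit path still sees it.
 */
static void mock_plat_init(struct mock_pt_regs *r)
{
	assert(r != NULL);

	for (size_t i = 0; i < NR_GPRS; i++) {
		if (i == SYSCALL_NR_REG)
			continue;	/* preserve the syscall nr */
		r->regs[i] = 0;
	}
}
```

      After `mock_plat_init()` runs, every register in the model is zero
      except `regs[11]`, which keeps whatever syscall nr was stored on
      entry.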
      Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
      d6c5f06e
    • Jinyang He's avatar
      LoongArch: Set unwind stack type to unknown rather than set error flag · 97ceddbc
      Jinyang He authored
      During unwinding, unwind_done() is used as the end condition. Normally
      the unwinder reaches the user stack and then sets the stack type to
      unknown, which is a normal exit. When something unexpected happens
      during unwinding and we cannot unwind any further, we should set the
      error flag, and also set the stack type to unknown to indicate that
      unwinding cannot continue. The error flag emphasizes that the unwind
      process produced an unexpected error. Nothing unexpected has happened
      when we unwind the PT_REGS at the top of the IRQ stack and find that
      it is a user mode PT_REGS. Thus, we should not set the error flag and
      should just set the stack type to unknown.
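
      The distinction between the two exits can be sketched like this. The
      types and function names below (`mock_unwind_state`,
      `mock_end_normal`, `mock_end_error`) are hypothetical and only model
      the behaviour described above; they are not the kernel's unwinder.

```c
#include <assert.h>
#include <stdbool.h>

enum mock_stack_type { MOCK_STACK_UNKNOWN, MOCK_STACK_TASK, MOCK_STACK_IRQ };

/* Simplified stand-in for the unwinder state (assumption). */
struct mock_unwind_state {
	enum mock_stack_type stack_type;
	bool error;
};

/* End condition: unwinding is finished once the stack type is unknown. */
static bool mock_unwind_done(const struct mock_unwind_state *s)
{
	return s->stack_type == MOCK_STACK_UNKNOWN;
}

/* Normal exit, e.g. a user mode PT_REGS was found: no error flag. */
static void mock_end_normal(struct mock_unwind_state *s)
{
	s->stack_type = MOCK_STACK_UNKNOWN;
}

/* Unexpected failure: flag the error and also stop the unwind. */
static void mock_end_error(struct mock_unwind_state *s)
{
	s->error = true;
	s->stack_type = MOCK_STACK_UNKNOWN;
}
```

      Both paths make `mock_unwind_done()` return true; only the error
      path leaves `error` set, so a caller can tell a clean stop from a
      failed one.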
      Reported-by: Hengqi Chen <hengqi.chen@gmail.com>
      Acked-by: Hengqi Chen <hengqi.chen@gmail.com>
      Signed-off-by: Jinyang He <hejinyang@loongson.cn>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
      97ceddbc
    • Xi Ruoyao's avatar
      LoongArch: Slightly clean up drdtime() · 8146c5b3
      Xi Ruoyao authored
      As we are just discarding the stable clock ID, simply write it into
      $zero instead of allocating a temporary register.
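
      A minimal sketch of the pattern, assuming a 64-bit LoongArch target
      for the inline assembly: `rdtime.d` writes the counter value to its
      first operand and the counter ID to its second, so naming `$zero`
      as the second operand discards the ID. The function name
      `read_stable_counter` and the non-LoongArch fallback are
      illustrative additions so the snippet builds elsewhere; they are not
      part of the kernel change.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Read a monotonically increasing counter value. */
static inline uint64_t read_stable_counter(void)
{
#if defined(__loongarch__) && __loongarch_grlen == 64
	uint64_t val;

	/* The counter ID goes to $zero, so no temporary register is needed. */
	__asm__ __volatile__("rdtime.d %0, $zero" : "=r"(val));
	return val;
#else
	/* Fallback for other architectures (assumption, for portability). */
	struct timespec ts;
	int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(rc == 0);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
#endif
}
```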
      Signed-off-by: Xi Ruoyao <xry111@xry111.site>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
      8146c5b3
    • WANG Rui's avatar
      LoongArch: Apply dynamic relocations for LLD · eea673e9
      WANG Rui authored
      For the following assembly code:
      
           .text
           .global func
       func:
           nop
      
           .data
       var:
           .dword func
      
      When linked with `-pie`, GNU LD populates the `var` variable with the
      pre-relocated value of `func`. However, LLVM LLD does not exhibit the
      same behavior. This issue also arises with the `kernel_entry` in arch/
      loongarch/kernel/head.S:
      
       _head:
           .word   MZ_MAGIC                /* "MZ", MS-DOS header */
           .org    0x8
           .dword  kernel_entry            /* Kernel entry point */
      
      The correct kernel entry from the MS-DOS header is crucial for jumping
      to vmlinux from zboot. This necessity is why the compressed relocatable
      kernel compiled by Clang encounters difficulties in booting.
      
      To address this problem, apply dynamic relocations in place with
      `--apply-dynamic-relocs`.
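
      To illustrate why the value stored at that location matters, here is
      a sketch of how a loader could read the entry point from such a
      header. The function `header_entry` and its return convention are
      hypothetical and do not come from the zboot code; only the "MZ"
      magic and the 0x8 offset are taken from the layout shown above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MZ_MAGIC        0x5a4d	/* "MZ" read as a little-endian 16-bit value */
#define ENTRY_FIELD_OFF 0x8	/* offset of the .dword kernel_entry field */

/* Read an n-byte little-endian integer (n <= 8). */
static uint64_t read_le(const uint8_t *p, size_t n)
{
	uint64_t v = 0;

	for (size_t i = 0; i < n; i++)
		v |= (uint64_t)p[i] << (8 * i);
	return v;
}

/*
 * Fetch the entry point stored in the header.
 * Returns 0 on success, -1 if the image is too short or lacks the magic.
 */
static int header_entry(const uint8_t *image, size_t len, uint64_t *entry)
{
	assert(image != NULL && entry != NULL);

	if (len < ENTRY_FIELD_OFF + 8)
		return -1;
	if (read_le(image, 2) != MZ_MAGIC)
		return -1;

	/* The loader trusts whatever the linker stored in this field. */
	*entry = read_le(image + ENTRY_FIELD_OFF, 8);
	return 0;
}
```

      If the linker leaves the field unpopulated, a reader like this
      returns the placeholder contents rather than the address of
      `kernel_entry`.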
      
      Link: https://github.com/ClangBuiltLinux/linux/issues/1962
      Signed-off-by: WANG Rui <wangrui@loongson.cn>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
      eea673e9
  6. 08 Dec, 2023 3 commits
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · f2e8a57e
      Linus Torvalds authored
      Pull SCSI fix from James Bottomley:
       "One tiny fix to the be2iscsi driver fixing a memory leak in an error
        leg"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: be2iscsi: Fix a memleak in beiscsi_init_wrb_handle()
      f2e8a57e
    • Linus Torvalds's avatar
      Merge tag 'block-6.7-2023-12-08' of git://git.kernel.dk/linux · d71369db
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "Nothing major in here, just miscellaneous fixes for MD and NVMe:
      
         - NVMe pull request via Keith:
            - Proper nvme ctrl state setting (Keith)
            - Passthrough command optimization (Keith)
            - Spectre fix (Nitesh)
            - Kconfig clarifications (Shin'ichiro)
            - Frozen state deadlock fix (Bitao)
            - Power setting quirk (Georg)
      
         - MD pull requests via Song:
            - 6.7 regressions with recovery/sync (Yu)
            - Reshape fix (David)"
      
      * tag 'block-6.7-2023-12-08' of git://git.kernel.dk/linux:
        md: split MD_RECOVERY_NEEDED out of mddev_resume
        nvme-pci: Add sleep quirk for Kingston drives
        md: fix stopping sync thread
        md: don't leave 'MD_RECOVERY_FROZEN' in error path of md_set_readonly()
        md: fix missing flush of sync_work
        nvme: fix deadlock between reset and scan
        nvme: prevent potential spectre v1 gadget
        nvme: improve NVME_HOST_AUTH and NVME_TARGET_AUTH config descriptions
        nvme-ioctl: move capable() admin check to the end
        nvme: ensure reset state check ordering
        nvme: introduce helper function to get ctrl state
        md/raid6: use valid sector values to determine if an I/O should wait on the reshape
      d71369db
    • Linus Torvalds's avatar
      Merge tag 'io_uring-6.7-2023-12-08' of git://git.kernel.dk/linux · 689659c9
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
       "Two minor fixes for issues introduced in this release cycle, and two
        fixes for issues or potential issues that are heading to stable.
      
        One of these ends up disabling passing io_uring file descriptors via
        SCM_RIGHTS. There really shouldn't be an overlap between that kind of
        historic use case and modern usage of io_uring, which is why this was
        deemed appropriate"
      
      * tag 'io_uring-6.7-2023-12-08' of git://git.kernel.dk/linux:
        io_uring/af_unix: disable sending io_uring over sockets
        io_uring/kbuf: check for buffer list readiness after NULL check
        io_uring/kbuf: Fix an NULL vs IS_ERR() bug in io_alloc_pbuf_ring()
        io_uring: fix mutex_unlock with unreferenced ctx
      689659c9