1. 09 Dec, 2023 7 commits
    • Tiezhu Yang's avatar
      LoongArch: BPF: Fix sign-extension mov instructions · 772cbe94
      Tiezhu Yang authored
      We can see that "Short form of movsx, dst_reg = (s8,s16,s32)src_reg" in
      include/linux/filter.h, additionally, for BPF_ALU64 the value of the
      destination register is unchanged whereas for BPF_ALU the upper 32 bits
      of the destination register are zeroed, so it should clear the upper 32
      bits for BPF_ALU.
      
      [root@linux fedora]# echo 1 > /proc/sys/net/core/bpf_jit_enable
      [root@linux fedora]# modprobe test_bpf
      
      Before:
      test_bpf: #81 ALU_MOVSX | BPF_B jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)
      test_bpf: #82 ALU_MOVSX | BPF_H jited:1 ret 2 != 1 (0x2 != 0x1)FAIL (1 times)
      
      After:
      test_bpf: #81 ALU_MOVSX | BPF_B jited:1 6 PASS
      test_bpf: #82 ALU_MOVSX | BPF_H jited:1 6 PASS
      
      By the way, the bpf selftest case "./test_progs -t verifier_movsx" can
      also be fixed with this patch.
      
      Fixes: f48012f1 ("LoongArch: BPF: Support sign-extension mov instructions")
      Acked-by: default avatarHengqi Chen <hengqi.chen@gmail.com>
      Signed-off-by: default avatarTiezhu Yang <yangtiezhu@loongson.cn>
      Signed-off-by: default avatarHuacai Chen <chenhuacai@loongson.cn>
      772cbe94
    • Hengqi Chen's avatar
      LoongArch: BPF: Don't sign extend function return value · 5d47ec2e
      Hengqi Chen authored
      The `cls_redirect` test triggers a kernel panic like:
      
        # ./test_progs -t cls_redirect
        Can't find bpf_testmod.ko kernel module: -2
        WARNING! Selftests relying on bpf_testmod.ko will be skipped.
        [   30.938489] CPU 3 Unable to handle kernel paging request at virtual address fffffffffd814de0, era == ffff800002009fb8, ra == ffff800002009f9c
        [   30.939331] Oops[#1]:
        [   30.939513] CPU: 3 PID: 1260 Comm: test_progs Not tainted 6.7.0-rc2-loong-devel-g2f56bb0d2327 #35 a896aca3f4164f09cc346f89f2e09832e07be5f6
        [   30.939732] Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 2/2/2022
        [   30.939901] pc ffff800002009fb8 ra ffff800002009f9c tp 9000000104da4000 sp 9000000104da7ab0
        [   30.940038] a0 fffffffffd814de0 a1 9000000104da7a68 a2 0000000000000000 a3 9000000104da7c10
        [   30.940183] a4 9000000104da7c14 a5 0000000000000002 a6 0000000000000021 a7 00005555904d7f90
        [   30.940321] t0 0000000000000110 t1 0000000000000000 t2 fffffffffd814de0 t3 0004c4b400000000
        [   30.940456] t4 ffffffffffffffff t5 00000000c3f63600 t6 0000000000000000 t7 0000000000000000
        [   30.940590] t8 000000000006d803 u0 0000000000000020 s9 9000000104da7b10 s0 900000010504c200
        [   30.940727] s1 fffffffffd814de0 s2 900000010504c200 s3 9000000104da7c10 s4 9000000104da7ad0
        [   30.940866] s5 0000000000000000 s6 90000000030e65bc s7 9000000104da7b44 s8 90000000044f6fc0
        [   30.941015]    ra: ffff800002009f9c bpf_prog_846803e5ae81417f_cls_redirect+0xa0/0x590
        [   30.941535]   ERA: ffff800002009fb8 bpf_prog_846803e5ae81417f_cls_redirect+0xbc/0x590
        [   30.941696]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
        [   30.942224]  PRMD: 00000004 (PPLV0 +PIE -PWE)
        [   30.942330]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
        [   30.942453]  ECFG: 00071c1c (LIE=2-4,10-12 VS=7)
        [   30.942612] ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
        [   30.942764]  BADV: fffffffffd814de0
        [   30.942854]  PRID: 0014c010 (Loongson-64bit, Loongson-3A5000)
        [   30.942974] Modules linked in:
        [   30.943078] Process test_progs (pid: 1260, threadinfo=00000000ce303226, task=000000007d10bb76)
        [   30.943306] Stack : 900000010a064000 90000000044f6fc0 9000000104da7b48 0000000000000000
        [   30.943495]         0000000000000000 9000000104da7c14 9000000104da7c10 900000010504c200
        [   30.943626]         0000000000000001 ffff80001b88c000 9000000104da7b70 90000000030e6668
        [   30.943785]         0000000000000000 9000000104da7b58 ffff80001b88c048 9000000003d05000
        [   30.943936]         900000000303ac88 0000000000000000 0000000000000000 9000000104da7b70
        [   30.944091]         0000000000000000 0000000000000001 0000000731eeab00 0000000000000000
        [   30.944245]         ffff80001b88c000 0000000000000000 0000000000000000 54b99959429f83b8
        [   30.944402]         ffff80001b88c000 90000000044f6fc0 9000000101d70000 ffff80001b88c000
        [   30.944538]         000000000000005a 900000010504c200 900000010a064000 900000010a067000
        [   30.944697]         9000000104da7d88 0000000000000000 9000000003d05000 90000000030e794c
        [   30.944852]         ...
        [   30.944924] Call Trace:
        [   30.945120] [<ffff800002009fb8>] bpf_prog_846803e5ae81417f_cls_redirect+0xbc/0x590
        [   30.945650] [<90000000030e6668>] bpf_test_run+0x1ec/0x2f8
        [   30.945958] [<90000000030e794c>] bpf_prog_test_run_skb+0x31c/0x684
        [   30.946065] [<90000000026d4f68>] __sys_bpf+0x678/0x2724
        [   30.946159] [<90000000026d7288>] sys_bpf+0x20/0x2c
        [   30.946253] [<90000000032dd224>] do_syscall+0x7c/0x94
        [   30.946343] [<9000000002541c5c>] handle_syscall+0xbc/0x158
        [   30.946492]
        [   30.946549] Code: 0015030e  5c0009c0  5001d000 <28c00304> 02c00484  29c00304  00150009  2a42d2e4  0280200d
        [   30.946793]
        [   30.946971] ---[ end trace 0000000000000000 ]---
        [   32.093225] Kernel panic - not syncing: Fatal exception in interrupt
        [   32.093526] Kernel relocated by 0x2320000
        [   32.093630]  .text @ 0x9000000002520000
        [   32.093725]  .data @ 0x9000000003400000
        [   32.093792]  .bss  @ 0x9000000004413200
        [   34.971998] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
      
      This is because we signed-extend function return values. When subprog
      mode is enabled, we have:
      
        cls_redirect()
          -> get_global_metrics() returns pcpu ptr 0xfffffefffc00b480
      
      The pointer returned is later signed-extended to 0xfffffffffc00b480 at
      `BPF_JMP | BPF_EXIT`. During BPF prog run, this triggers unhandled page
      fault and a kernel panic.
      
      Drop the unnecessary signed-extension on return values like other
      architectures do.
      
      With this change, we have:
      
        # ./test_progs -t cls_redirect
        Can't find bpf_testmod.ko kernel module: -2
        WARNING! Selftests relying on bpf_testmod.ko will be skipped.
        #51/1    cls_redirect/cls_redirect_inlined:OK
        #51/2    cls_redirect/IPv4 TCP accept unknown (no hops, flags: SYN):OK
        #51/3    cls_redirect/IPv6 TCP accept unknown (no hops, flags: SYN):OK
        #51/4    cls_redirect/IPv4 TCP accept unknown (no hops, flags: ACK):OK
        #51/5    cls_redirect/IPv6 TCP accept unknown (no hops, flags: ACK):OK
        #51/6    cls_redirect/IPv4 TCP forward unknown (one hop, flags: ACK):OK
        #51/7    cls_redirect/IPv6 TCP forward unknown (one hop, flags: ACK):OK
        #51/8    cls_redirect/IPv4 TCP accept known (one hop, flags: ACK):OK
        #51/9    cls_redirect/IPv6 TCP accept known (one hop, flags: ACK):OK
        #51/10   cls_redirect/IPv4 UDP accept unknown (no hops, flags: none):OK
        #51/11   cls_redirect/IPv6 UDP accept unknown (no hops, flags: none):OK
        #51/12   cls_redirect/IPv4 UDP forward unknown (one hop, flags: none):OK
        #51/13   cls_redirect/IPv6 UDP forward unknown (one hop, flags: none):OK
        #51/14   cls_redirect/IPv4 UDP accept known (one hop, flags: none):OK
        #51/15   cls_redirect/IPv6 UDP accept known (one hop, flags: none):OK
        #51/16   cls_redirect/cls_redirect_subprogs:OK
        #51/17   cls_redirect/IPv4 TCP accept unknown (no hops, flags: SYN):OK
        #51/18   cls_redirect/IPv6 TCP accept unknown (no hops, flags: SYN):OK
        #51/19   cls_redirect/IPv4 TCP accept unknown (no hops, flags: ACK):OK
        #51/20   cls_redirect/IPv6 TCP accept unknown (no hops, flags: ACK):OK
        #51/21   cls_redirect/IPv4 TCP forward unknown (one hop, flags: ACK):OK
        #51/22   cls_redirect/IPv6 TCP forward unknown (one hop, flags: ACK):OK
        #51/23   cls_redirect/IPv4 TCP accept known (one hop, flags: ACK):OK
        #51/24   cls_redirect/IPv6 TCP accept known (one hop, flags: ACK):OK
        #51/25   cls_redirect/IPv4 UDP accept unknown (no hops, flags: none):OK
        #51/26   cls_redirect/IPv6 UDP accept unknown (no hops, flags: none):OK
        #51/27   cls_redirect/IPv4 UDP forward unknown (one hop, flags: none):OK
        #51/28   cls_redirect/IPv6 UDP forward unknown (one hop, flags: none):OK
        #51/29   cls_redirect/IPv4 UDP accept known (one hop, flags: none):OK
        #51/30   cls_redirect/IPv6 UDP accept known (one hop, flags: none):OK
        #51/31   cls_redirect/cls_redirect_dynptr:OK
        #51/32   cls_redirect/IPv4 TCP accept unknown (no hops, flags: SYN):OK
        #51/33   cls_redirect/IPv6 TCP accept unknown (no hops, flags: SYN):OK
        #51/34   cls_redirect/IPv4 TCP accept unknown (no hops, flags: ACK):OK
        #51/35   cls_redirect/IPv6 TCP accept unknown (no hops, flags: ACK):OK
        #51/36   cls_redirect/IPv4 TCP forward unknown (one hop, flags: ACK):OK
        #51/37   cls_redirect/IPv6 TCP forward unknown (one hop, flags: ACK):OK
        #51/38   cls_redirect/IPv4 TCP accept known (one hop, flags: ACK):OK
        #51/39   cls_redirect/IPv6 TCP accept known (one hop, flags: ACK):OK
        #51/40   cls_redirect/IPv4 UDP accept unknown (no hops, flags: none):OK
        #51/41   cls_redirect/IPv6 UDP accept unknown (no hops, flags: none):OK
        #51/42   cls_redirect/IPv4 UDP forward unknown (one hop, flags: none):OK
        #51/43   cls_redirect/IPv6 UDP forward unknown (one hop, flags: none):OK
        #51/44   cls_redirect/IPv4 UDP accept known (one hop, flags: none):OK
        #51/45   cls_redirect/IPv6 UDP accept known (one hop, flags: none):OK
        #51      cls_redirect:OK
        Summary: 1/45 PASSED, 0 SKIPPED, 0 FAILED
      
      Fixes: 5dc61552 ("LoongArch: Add BPF JIT support")
      Signed-off-by: default avatarHengqi Chen <hengqi.chen@gmail.com>
      Signed-off-by: default avatarHuacai Chen <chenhuacai@loongson.cn>
      5d47ec2e
    • Hengqi Chen's avatar
      LoongArch: BPF: Don't sign extend memory load operand · fe575755
      Hengqi Chen authored
      The `cgrp_local_storage` test triggers a kernel panic like:
      
        # ./test_progs -t cgrp_local_storage
        Can't find bpf_testmod.ko kernel module: -2
        WARNING! Selftests relying on bpf_testmod.ko will be skipped.
        [  550.930632] CPU 1 Unable to handle kernel paging request at virtual address 0000000000000080, era == ffff80000200be34, ra == ffff80000200be00
        [  550.931781] Oops[#1]:
        [  550.931966] CPU: 1 PID: 1303 Comm: test_progs Not tainted 6.7.0-rc2-loong-devel-g2f56bb0d2327 #35 a896aca3f4164f09cc346f89f2e09832e07be5f6
        [  550.932215] Hardware name: QEMU QEMU Virtual Machine, BIOS unknown 2/2/2022
        [  550.932403] pc ffff80000200be34 ra ffff80000200be00 tp 9000000108350000 sp 9000000108353dc0
        [  550.932545] a0 0000000000000000 a1 0000000000000517 a2 0000000000000118 a3 00007ffffbb15558
        [  550.932682] a4 00007ffffbb15620 a5 90000001004e7700 a6 0000000000000021 a7 0000000000000118
        [  550.932824] t0 ffff80000200bdc0 t1 0000000000000517 t2 0000000000000517 t3 00007ffff1c06ee0
        [  550.932961] t4 0000555578ae04d0 t5 fffffffffffffff8 t6 0000000000000004 t7 0000000000000020
        [  550.933097] t8 0000000000000040 u0 00000000000007b8 s9 9000000108353e00 s0 90000001004e7700
        [  550.933241] s1 9000000004005000 s2 0000000000000001 s3 0000000000000000 s4 0000555555eb2ec8
        [  550.933379] s5 00007ffffbb15bb8 s6 00007ffff1dafd60 s7 000055555663f610 s8 00007ffff1db0050
        [  550.933520]    ra: ffff80000200be00 bpf_prog_98f1b9e767be2a84_on_enter+0x40/0x200
        [  550.933911]   ERA: ffff80000200be34 bpf_prog_98f1b9e767be2a84_on_enter+0x74/0x200
        [  550.934105]  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
        [  550.934596]  PRMD: 00000004 (PPLV0 +PIE -PWE)
        [  550.934712]  EUEN: 00000003 (+FPE +SXE -ASXE -BTE)
        [  550.934836]  ECFG: 00071c1c (LIE=2-4,10-12 VS=7)
        [  550.934976] ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
        [  550.935097]  BADV: 0000000000000080
        [  550.935181]  PRID: 0014c010 (Loongson-64bit, Loongson-3A5000)
        [  550.935291] Modules linked in:
        [  550.935391] Process test_progs (pid: 1303, threadinfo=000000006c3b1c41, task=0000000061f84a55)
        [  550.935643] Stack : 00007ffffbb15bb8 0000555555eb2ec8 0000000000000000 0000000000000001
        [  550.935844]         9000000004005000 ffff80001b864000 00007ffffbb15450 90000000029aa034
        [  550.935990]         0000000000000000 9000000108353ec0 0000000000000118 d07d9dfb09721a09
        [  550.936175]         0000000000000001 0000000000000000 9000000108353ec0 0000000000000118
        [  550.936314]         9000000101d46ad0 900000000290abf0 000055555663f610 0000000000000000
        [  550.936479]         0000000000000003 9000000108353ec0 00007ffffbb15450 90000000029d7288
        [  550.936635]         00007ffff1dafd60 000055555663f610 0000000000000000 0000000000000003
        [  550.936779]         9000000108353ec0 90000000035dd1f0 00007ffff1dafd58 9000000002841c5c
        [  550.936939]         0000000000000119 0000555555eea5a8 00007ffff1d78780 00007ffffbb153e0
        [  550.937083]         ffffffffffffffda 00007ffffbb15518 0000000000000040 00007ffffbb15558
        [  550.937224]         ...
        [  550.937299] Call Trace:
        [  550.937521] [<ffff80000200be34>] bpf_prog_98f1b9e767be2a84_on_enter+0x74/0x200
        [  550.937910] [<90000000029aa034>] bpf_trace_run2+0x90/0x154
        [  550.938105] [<900000000290abf0>] syscall_trace_enter.isra.0+0x1cc/0x200
        [  550.938224] [<90000000035dd1f0>] do_syscall+0x48/0x94
        [  550.938319] [<9000000002841c5c>] handle_syscall+0xbc/0x158
        [  550.938477]
        [  550.938607] Code: 580009ae  50016000  262402e4 <28c20085> 14092084  03a00084  16000024  03240084  00150006
        [  550.938851]
        [  550.939021] ---[ end trace 0000000000000000 ]---
      
      Further investigation shows that this panic is triggered by memory
      load operations:
      
        ptr = bpf_cgrp_storage_get(&map_a, task->cgroups->dfl_cgrp, 0,
                                   BPF_LOCAL_STORAGE_GET_F_CREATE);
      
      The expression `task->cgroups->dfl_cgrp` involves two memory load.
      Since the field offset fits in imm12 or imm14, we use ldd or ldptrd
      instructions. But both instructions have the side effect that it will
      signed-extended the imm operand. Finally, we got the wrong addresses
      and panics is inevitable.
      
      Use a generic ldxd instruction to avoid this kind of issues.
      
      With this change, we have:
      
        # ./test_progs -t cgrp_local_storage
        Can't find bpf_testmod.ko kernel module: -2
        WARNING! Selftests relying on bpf_testmod.ko will be skipped.
        test_cgrp_local_storage:PASS:join_cgroup /cgrp_local_storage 0 nsec
        #48/1    cgrp_local_storage/tp_btf:OK
        test_attach_cgroup:PASS:skel_open 0 nsec
        test_attach_cgroup:PASS:prog_attach 0 nsec
        test_attach_cgroup:PASS:prog_attach 0 nsec
        libbpf: prog 'update_cookie_tracing': failed to attach: ERROR: strerror_r(-524)=22
        test_attach_cgroup:FAIL:prog_attach unexpected error: -524
        #48/2    cgrp_local_storage/attach_cgroup:FAIL
        test_recursion:PASS:skel_open_and_load 0 nsec
        libbpf: prog 'on_lookup': failed to attach: ERROR: strerror_r(-524)=22
        libbpf: prog 'on_lookup': failed to auto-attach: -524
        test_recursion:FAIL:skel_attach unexpected error: -524 (errno 524)
        #48/3    cgrp_local_storage/recursion:FAIL
        #48/4    cgrp_local_storage/negative:OK
        #48/5    cgrp_local_storage/cgroup_iter_sleepable:OK
        test_yes_rcu_lock:PASS:skel_open 0 nsec
        test_yes_rcu_lock:PASS:skel_load 0 nsec
        libbpf: prog 'yes_rcu_lock': failed to attach: ERROR: strerror_r(-524)=22
        libbpf: prog 'yes_rcu_lock': failed to auto-attach: -524
        test_yes_rcu_lock:FAIL:skel_attach unexpected error: -524 (errno 524)
        #48/6    cgrp_local_storage/yes_rcu_lock:FAIL
        #48/7    cgrp_local_storage/no_rcu_lock:OK
        #48      cgrp_local_storage:FAIL
      
        All error logs:
        test_cgrp_local_storage:PASS:join_cgroup /cgrp_local_storage 0 nsec
        test_attach_cgroup:PASS:skel_open 0 nsec
        test_attach_cgroup:PASS:prog_attach 0 nsec
        test_attach_cgroup:PASS:prog_attach 0 nsec
        libbpf: prog 'update_cookie_tracing': failed to attach: ERROR: strerror_r(-524)=22
        test_attach_cgroup:FAIL:prog_attach unexpected error: -524
        #48/2    cgrp_local_storage/attach_cgroup:FAIL
        test_recursion:PASS:skel_open_and_load 0 nsec
        libbpf: prog 'on_lookup': failed to attach: ERROR: strerror_r(-524)=22
        libbpf: prog 'on_lookup': failed to auto-attach: -524
        test_recursion:FAIL:skel_attach unexpected error: -524 (errno 524)
        #48/3    cgrp_local_storage/recursion:FAIL
        test_yes_rcu_lock:PASS:skel_open 0 nsec
        test_yes_rcu_lock:PASS:skel_load 0 nsec
        libbpf: prog 'yes_rcu_lock': failed to attach: ERROR: strerror_r(-524)=22
        libbpf: prog 'yes_rcu_lock': failed to auto-attach: -524
        test_yes_rcu_lock:FAIL:skel_attach unexpected error: -524 (errno 524)
        #48/6    cgrp_local_storage/yes_rcu_lock:FAIL
        #48      cgrp_local_storage:FAIL
        Summary: 0/4 PASSED, 0 SKIPPED, 1 FAILED
      
      No panics any more (The test still failed because lack of BPF trampoline
      which I am actively working on).
      
      Fixes: 5dc61552 ("LoongArch: Add BPF JIT support")
      Signed-off-by: default avatarHengqi Chen <hengqi.chen@gmail.com>
      Signed-off-by: default avatarHuacai Chen <chenhuacai@loongson.cn>
      fe575755
    • Hengqi Chen's avatar
      LoongArch: Preserve syscall nr across execve() · d6c5f06e
      Hengqi Chen authored
      Currently, we store syscall nr in pt_regs::regs[11] and syscall execve()
      accidentally overrides it during its execution:
      
          sys_execve()
            -> do_execve()
              -> do_execveat_common()
                -> bprm_execve()
                  -> exec_binprm()
                    -> search_binary_handler()
                      -> load_elf_binary()
                        -> ELF_PLAT_INIT()
      
      ELF_PLAT_INIT() reset regs[11] to 0, so in syscall_exit_to_user_mode()
      we later get a wrong syscall nr. This breaks tools like execsnoop since
      it relies on execve() tracepoints.
      
      Skip pt_regs::regs[11] reset in ELF_PLAT_INIT() to fix the issue.
      Signed-off-by: default avatarHengqi Chen <hengqi.chen@gmail.com>
      Signed-off-by: default avatarHuacai Chen <chenhuacai@loongson.cn>
      d6c5f06e
    • Jinyang He's avatar
      LoongArch: Set unwind stack type to unknown rather than set error flag · 97ceddbc
      Jinyang He authored
      During unwinding, unwind_done() is used as an end condition. Normally it
      unwind to the user stack and then set the stack type to unknown, which
      is a normal exit. When something unexpected happens in unwind process
      and we cannot unwind anymore, we should set the error flag, and also set
      the stack type to unknown to indicate that the unwind process can not
      continue. The error flag emphasizes that the unwind process produce an
      unexpected error. There is no unexpected things when we unwind the PT_REGS
      in the top of IRQ stack and find out that is an user mode PT_REGS. Thus,
      we should not set error flag and just set stack type to unknown.
      Reported-by: default avatarHengqi Chen <hengqi.chen@gmail.com>
      Acked-by: default avatarHengqi Chen <hengqi.chen@gmail.com>
      Signed-off-by: default avatarJinyang He <hejinyang@loongson.cn>
      Signed-off-by: default avatarHuacai Chen <chenhuacai@loongson.cn>
      97ceddbc
    • Xi Ruoyao's avatar
      LoongArch: Slightly clean up drdtime() · 8146c5b3
      Xi Ruoyao authored
      As we are just discarding the stable clock ID, simply write it into
      $zero instead of allocating a temporary register.
      Signed-off-by: default avatarXi Ruoyao <xry111@xry111.site>
      Signed-off-by: default avatarHuacai Chen <chenhuacai@loongson.cn>
      8146c5b3
    • WANG Rui's avatar
      LoongArch: Apply dynamic relocations for LLD · eea673e9
      WANG Rui authored
      For the following assembly code:
      
           .text
           .global func
       func:
           nop
      
           .data
       var:
           .dword func
      
      When linked with `-pie`, GNU LD populates the `var` variable with the
      pre-relocated value of `func`. However, LLVM LLD does not exhibit the
      same behavior. This issue also arises with the `kernel_entry` in arch/
      loongarch/kernel/head.S:
      
       _head:
           .word   MZ_MAGIC                /* "MZ", MS-DOS header */
           .org    0x8
           .dword  kernel_entry            /* Kernel entry point */
      
      The correct kernel entry from the MS-DOS header is crucial for jumping
      to vmlinux from zboot. This necessity is why the compressed relocatable
      kernel compiled by Clang encounters difficulties in booting.
      
      To address this problem, it is proposed to apply dynamic relocations to
      place with `--apply-dynamic-relocs`.
      
      Link: https://github.com/ClangBuiltLinux/linux/issues/1962Signed-off-by: default avatarWANG Rui <wangrui@loongson.cn>
      Signed-off-by: default avatarHuacai Chen <chenhuacai@loongson.cn>
      eea673e9
  2. 03 Dec, 2023 3 commits
  3. 02 Dec, 2023 5 commits
    • Linus Torvalds's avatar
      Merge tag 'powerpc-6.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 1b8af655
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
      
       - Fix corruption of f0/vs0 during FP/Vector save, seen as userspace
         crashes when using io-uring workers (in particular with MariaDB)
      
       - Fix KVM_RUN potentially clobbering all host userspace FP/Vector
         registers
      
      Thanks to Timothy Pearson, Jens Axboe, and Nicholas Piggin.
      
      * tag 'powerpc-6.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        KVM: PPC: Book3S HV: Fix KVM_RUN clobbering FP/VEC user registers
        powerpc: Don't clobber f0/vs0 during fp|altivec register save
      1b8af655
    • Linus Torvalds's avatar
      Merge tag 'vfio-v6.7-rc4' of https://github.com/awilliam/linux-vfio · 17b17be2
      Linus Torvalds authored
      Pull vfio fixes from Alex Williamson:
      
       - Fix the lifecycle of a mutex in the pds variant driver such that a
         reset prior to opening the device won't find it uninitialized.
         Implement the release path to symmetrically destroy the mutex. Also
         switch a different lock from spinlock to mutex as the code path has
         the potential to sleep and doesn't need the spinlock context
         otherwise (Brett Creeley)
      
       - Fix an issue detected via randconfig where KVM tries to symbol_get an
         undeclared function. The symbol is temporarily declared
         unconditionally here, which resolves the problem and avoids churn
         relative to a series pending for the next merge window which resolves
         some of this symbol ugliness, but also fixes Kconfig dependencies
         (Sean Christopherson)
      
      * tag 'vfio-v6.7-rc4' of https://github.com/awilliam/linux-vfio:
        vfio: Drop vfio_file_iommu_group() stub to fudge around a KVM wart
        vfio/pds: Fix possible sleep while in atomic context
        vfio/pds: Fix mutex lock->magic != lock warning
      17b17be2
    • Linus Torvalds's avatar
      Merge tag 'for-linus-6.7a-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · deb4b9dd
      Linus Torvalds authored
      Pull xen fixes from Juergen Gross:
      
       - A fix for the Xen event driver setting the correct return value when
         experiencing an allocation failure
      
       - A fix for allocating space for a struct in the percpu area to not
         cross page boundaries (this one is for x86, a similar one for Arm was
         already in the pull request for rc3)
      
      * tag 'for-linus-6.7a-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen/events: fix error code in xen_bind_pirq_msi_to_irq()
        x86/xen: fix percpu vcpu_info allocation
      deb4b9dd
    • Linus Torvalds's avatar
      Merge tag 'probes-fixes-v6.7-rc3' of... · 669fc834
      Linus Torvalds authored
      Merge tag 'probes-fixes-v6.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
      
      Pull probes fixes from Masami Hiramatsu:
      
       - objpool: Fix objpool overrun case on memory/cache access delay
         especially on the big.LITTLE SoC. The objpool uses a copy of object
         slot index internal loop, but the slot index can be changed on
         another processor in parallel. In that case, the difference of 'head'
         local copy and the 'slot->last' index will be bigger than local slot
         size. In that case, we need to re-read the slot::head to update it.
      
       - kretprobe: Fix to use appropriate rcu API for kretprobe holder. Since
         kretprobe_holder::rp is RCU managed, it should use
         rcu_assign_pointer() and rcu_dereference_check() correctly. Also
         adding __rcu tag for finding wrong usage by sparse.
      
       - rethook: Fix to use appropriate rcu API for rethook::handler. The
         same as kretprobe, rethook::handler is RCU managed and it should use
         rcu_assign_pointer() and rcu_dereference_check(). This also adds
         __rcu tag for finding wrong usage by sparse.
      
      * tag 'probes-fixes-v6.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        rethook: Use __rcu pointer for rethook::handler
        kprobes: consistent rcu api usage for kretprobe holder
        lib: objpool: fix head overrun on RK3588 SBC
      669fc834
    • Linus Torvalds's avatar
      Merge tag 'pm-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 815fb87b
      Linus Torvalds authored
      Pull power management fixes from Rafael Wysocki:
       "These fix issues in two cpufreq drivers, in the AMD P-state driver and
        in the power-capping DTPM framework.
      
        Specifics:
      
         - Fix the AMD P-state driver's EPP sysfs interface in the cases when
           the performance governor is in use (Ayush Jain)
      
         - Make the ->fast_switch() callback in the AMD P-state driver return
           the target frequency as expected (Gautham R. Shenoy)
      
         - Allow user space to control the range of frequencies to use via
           scaling_min_freq and scaling_max_freq when AMD P-state driver is in
           use (Wyes Karny)
      
         - Prevent power domains needed for wakeup signaling from being turned
           off during system suspend on Qualcomm systems and prevent
           performance states votes from runtime-suspended devices from being
           lost across a system suspend-resume cycle in qcom-cpufreq-nvmem
           (Stephan Gerhold)
      
         - Fix disabling the 792 Mhz OPP in the imx6q cpufreq driver for the
           i.MX6ULL types that can run at that frequency (Christoph
           Niedermaier)
      
         - Eliminate unnecessary and harmful conversions to uW from the DTPM
           (dynamic thermal and power management) framework (Lukasz Luba)"
      
      * tag 'pm-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        cpufreq/amd-pstate: Only print supported EPP values for performance governor
        cpufreq/amd-pstate: Fix scaling_min_freq and scaling_max_freq update
        powercap: DTPM: Fix unneeded conversions to micro-Watts
        cpufreq/amd-pstate: Fix the return value of amd_pstate_fast_switch()
        pmdomain: qcom: rpmpd: Set GENPD_FLAG_ACTIVE_WAKEUP
        cpufreq: qcom-nvmem: Preserve PM domain votes in system suspend
        cpufreq: qcom-nvmem: Enable virtual power domain devices
        cpufreq: imx6q: Don't disable 792 Mhz OPP unnecessarily
      815fb87b
  4. 01 Dec, 2023 24 commits
    • Linus Torvalds's avatar
      Merge tag 'acpi-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · ce474ae7
      Linus Torvalds authored
      Pull ACPI fixes from Rafael Wysocki:
       "This fixes a recently introduced build issue on ARM32 and a NULL
        pointer dereference in the ACPI backlight driver due to a design issue
        exposed by a recent change in the ACPI bus type code.
      
        Specifics:
      
         - Fix a recently introduced build issue on ARM32 platforms caused by
           an inadvertent header file breakage (Dave Jiang)
      
         - Eliminate questionable usage of acpi_driver_data() in the ACPI
           backlight cooling device code that leads to NULL pointer
           dereferences after recent ACPI core changes (Hans de Goede)"
      
      * tag 'acpi-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI: video: Use acpi_video_device for cooling-dev driver data
        ACPI: Fix ARM32 platforms compile issue introduced by fw_table changes
      ce474ae7
    • Linus Torvalds's avatar
      Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 35f84584
      Linus Torvalds authored
      Pull arm64 fix from Catalin Marinas:
       "Fix a regression where the arm64 KPTI ends up enabled even on systems
        that don't need it"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: Avoid enabling KPTI unnecessarily
      35f84584
    • Linus Torvalds's avatar
      Merge tag 'iommu-fixes-v6.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 1a2b4185
      Linus Torvalds authored
      Pull iommu fixes from Joerg Roedel:
      
       - Fix race conditions in device probe path
      
       - Handle ERR_PTR() returns in __iommu_domain_alloc() path
      
       - Update MAINTAINERS entry for Qualcom IOMMUs
      
       - Printk argument fix in device tree specific code
      
       - Several Intel VT-d fixes from Lu Baolu:
           - Do not support enforcing cache coherency for non-empty domains
           - Avoid devTLB invalidation if iommu is off
           - Disable PCI ATS in legacy passthrough mode
           - Support non-PCI devices when clearing context
           - Fix incorrect cache invalidation for mm notification
           - Add MTL to quirk list to skip TE disabling
           - Set variable intel_dirty_ops to static
      
      * tag 'iommu-fixes-v6.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        iommu: Fix printk arg in of_iommu_get_resv_regions()
        iommu/vt-d: Set variable intel_dirty_ops to static
        iommu/vt-d: Fix incorrect cache invalidation for mm notification
        iommu/vt-d: Add MTL to quirk list to skip TE disabling
        iommu/vt-d: Make context clearing consistent with context mapping
        iommu/vt-d: Disable PCI ATS in legacy passthrough mode
        iommu/vt-d: Omit devTLB invalidation requests when TES=0
        iommu/vt-d: Support enforce_cache_coherency only for empty domains
        iommu: Avoid more races around device probe
        MAINTAINERS: list all Qualcomm IOMMU drivers in the QUALCOMM IOMMU entry
        iommu: Flow ERR_PTR out from __iommu_domain_alloc()
      1a2b4185
    • Linus Torvalds's avatar
      Merge tag 'sound-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 06a3c59f
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "No surprise here, including only a collection of HD-audio
        device-specific small fixes"
      
      * tag 'sound-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda: Disable power-save on KONTRON SinglePC
        ALSA: hda/realtek: Add supported ALC257 for ChromeOS
        ALSA: hda/realtek: Headset Mic VREF to 100%
        ALSA: hda: intel-nhlt: Ignore vbps when looking for DMIC 32 bps format
        ALSA: hda: cs35l56: Enable low-power hibernation mode on SPI
        ALSA: cs35l41: Fix for old systems which do not support command
        ALSA: hda: cs35l41: Remove unnecessary boolean state variable firmware_running
        ALSA: hda - Fix speaker and headset mic pin config for CHUWI CoreBook XPro
      06a3c59f
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-2023-12-01' of git://anongit.freedesktop.org/drm/drm · b1e51588
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Weekly fixes, mostly amdgpu fixes with a scattering of nouveau, i915,
        and a couple of reverts. Hopefully it will quieten down in coming
        weeks.
      
        drm:
         - Revert unexport of prime helpers for fd/handle conversion
      
        dma_resv:
         - Do not double add fences in dma_resv_add_fence.
      
        gpuvm:
         - Fix GPUVM license identifier.
      
        i915:
         - Mark internal GSC engine with reserved uabi class
         - Take VGA converters into account in eDP probe
         - Fix intel_pre_plane_updates() call to ensure workarounds get applied
      
        panel:
         - Revert panel fixes as they require exporting device_is_dependent.
      
        nouveau:
         - fix oversized allocations in new vm path
         - fix zero-length array
         - remove a stray lock
      
        nt36523:
         - Fix error check for nt36523.
      
        amdgpu:
         - DMUB fix
         - DCN 3.5 fixes
         - XGMI fix
         - DCN 3.2 fixes
         - Vangogh suspend fix
         - NBIO 7.9 fix
         - GFX11 golden register fix
         - Backlight fix
         - NBIO 7.11 fix
         - IB test overflow fix
         - DCN 3.1.4 fixes
         - fix a runtime pm ref count
         - Retimer fix
         - ABM fix
         - DCN 3.1.5 fix
         - Fix AGP addressing
         - Fix possible memory leak in SMU error path
         - Make sure PME is enabled in D3
         - Fix possible NULL pointer dereference in debugfs
         - EEPROM fix
         - GC 9.4.3 fix
      
        amdkfd:
         - IP version check fix
         - Fix memory leak in pqm_uninit()"
      
      * tag 'drm-fixes-2023-12-01' of git://anongit.freedesktop.org/drm/drm: (53 commits)
        Revert "drm/prime: Unexport helpers for fd/handle conversion"
        drm/amdgpu: Use another offset for GC 9.4.3 remap
        drm/amd/display: Fix some HostVM parameters in DML
        drm/amdkfd: Free gang_ctx_bo and wptr_bo in pqm_uninit
        drm/amdgpu: Update EEPROM I2C address for smu v13_0_0
        drm/amd/display: Allow DTBCLK disable for DCN35
        drm/amdgpu: Fix cat debugfs amdgpu_regs_didt causes kernel null pointer
        drm/amd: Enable PCIe PME from D3
        drm/amd/pm: fix a memleak in aldebaran_tables_init
        drm/amdgpu: fix AGP addressing when GART is not at 0
        drm/amd/display: update dcn315 lpddr pstate latency
        drm/amd/display: fix ABM disablement
        drm/amd/display: Fix black screen on video playback with embedded panel
        drm/amd/display: Fix conversions between bytes and KB
        drm/amdkfd: Use common function for IP version check
        drm/amd/display: Remove config update
        drm/amd/display: Update DCN35 clock table policy
        drm/amd/display: force toggle rate wa for first link training for a retimer
        drm/amdgpu: correct the amdgpu runtime dereference usage count
        drm/amd/display: Update min Z8 residency time to 2100 for DCN314
        ...
      b1e51588
    • Linus Torvalds's avatar
      Merge tag 'io_uring-6.7-2023-11-30' of git://git.kernel.dk/linux · c9a925b7
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
      
       - Fix an issue with discontig page checking for IORING_SETUP_NO_MMAP
      
       - Fix an issue with not allowing IORING_SETUP_NO_MMAP also disallowing
         mmap'ed buffer rings
      
       - Fix an issue with deferred release of memory mapped pages
      
       - Fix a lockdep issue with IORING_SETUP_NO_MMAP
      
       - Use fget/fput consistently, even from our sync system calls. No real
         issue here, but if we were ever to allow closing io_uring descriptors
         it would be required. Let's play it safe and just use the full ref
         counted versions upfront. Most uses of io_uring are threaded anyway,
         and hence already doing the full version underneath.
      
      * tag 'io_uring-6.7-2023-11-30' of git://git.kernel.dk/linux:
        io_uring: use fget/fput consistently
        io_uring: free io_buffer_list entries via RCU
        io_uring/kbuf: prune deferred locked cache when tearing down
        io_uring/kbuf: recycle freed mapped buffer ring entries
        io_uring/kbuf: defer release of mapped buffer rings
        io_uring: enable io_mem_alloc/free to be used in other parts
        io_uring: don't guard IORING_OFF_PBUF_RING with SETUP_NO_MMAP
        io_uring: don't allow discontig pages for IORING_SETUP_NO_MMAP
      c9a925b7
    • Linus Torvalds's avatar
      Merge tag 'block-6.7-2023-12-01' of git://git.kernel.dk/linux · ee0c8a9b
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - NVMe pull request via Keith:
           - Invalid namespace identification error handling (Marizio Ewan,
             Keith)
           - Fabrics keep-alive tuning (Mark)
      
       - Fix for a bad error check regression in bcache (Markus)
      
       - Fix for a performance regression with O_DIRECT (Ming)
      
       - Fix for a flush related deadlock (Ming)
      
       - Make the read-only warn on per-partition (Yu)
      
      * tag 'block-6.7-2023-12-01' of git://git.kernel.dk/linux:
        nvme-core: check for too small lba shift
        blk-mq: don't count completed flush data request as inflight in case of quiesce
        block: Document the role of the two attribute groups
        block: warn once for each partition in bio_check_ro()
        block: move .bd_inode into 1st cacheline of block_device
        nvme: check for valid nvme_identify_ns() before using it
        nvme-core: fix a memory leak in nvme_ns_info_from_identify()
        nvme: fine-tune sending of first keep-alive
        bcache: revert replacing IS_ERR_OR_NULL with IS_ERR
      ee0c8a9b
    • Linus Torvalds's avatar
      Merge tag 'dm-6.7/dm-fixes-2' of... · abd792f3
      Linus Torvalds authored
      Merge tag 'dm-6.7/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
      
      Pull device mapper fixes from Mike Snitzer:
      
       - Fix DM verity target's FEC support to always initialize IO before it
         frees it. Also fix alignment of struct dm_verity_fec_io within the
         per-bio-data
      
       - Fix DM verity target to not FEC failed readahead IO
      
       - Update DM flakey target to use MAX_ORDER rather than MAX_ORDER - 1
      
      * tag 'dm-6.7/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        dm-flakey: start allocating with MAX_ORDER
        dm-verity: align struct dm_verity_fec_io properly
        dm verity: don't perform FEC for failed readahead IO
        dm verity: initialize fec io before freeing it
      abd792f3
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · ff4a9f49
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Three small fixes, one in drivers.
      
        The core changes are to the internal representation of flags in
        scsi_devices which removes space wasting bools in favour of single bit
        flags and to add a flag to force a runtime resume which is used by ATA
        devices"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: sd: Fix system start for ATA devices
        scsi: Change SCSI device boolean fields to single bit flags
        scsi: ufs: core: Clear cmd if abort succeeds in MCQ mode
      ff4a9f49
    • Linus Torvalds's avatar
      Merge tag 'fs_for_v6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs · c1c09da0
      Linus Torvalds authored
      Pull ext2 fix from Jan Kara:
       "Fix an ext2 bug introduced by changes in ext2 & iomap stepping on each
        other toes (apparently ext2 driver does not get much testing in
        linux-next)"
      
      * tag 'fs_for_v6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
        ext2: Fix ki_pos update for DIO buffered-io fallback case
      c1c09da0
    • Linus Torvalds's avatar
      Merge tag 'bcachefs-2023-11-29' of https://evilpiepirate.org/git/bcachefs · e6861be4
      Linus Torvalds authored
      Pull more bcachefs bugfixes from Kent Overstreet:
      
       - bcache & bcachefs were broken with CFI enabled; patch for closures to
         fix type punning
      
       - mark erasure coding as extra-experimental; there are incompatible
         disk space accounting changes coming for erasure coding, and I'm
         still seeing checksum errors in some tests
      
       - several fixes for durability-related issues (durability is a device
         specific setting where we can tell bcachefs that data on a given
         device should be counted as replicated x times)
      
       - a fix for a rare livelock when a btree node merge then updates a
         parent node that is almost full
      
       - fix a race in the device removal path, where dropping a pointer in a
         btree node to a device would be clobbered by an in flight btree write
         updating the btree node key on completion
      
       - fix one SRCU lock hold time warning in the btree gc code - ther's
         still a bunch more of these to fix
      
       - fix a rare race where we'd start copygc before initializing the "are
         we rw" percpu refcount; copygc would think we were already ro and die
         immediately
      
      * tag 'bcachefs-2023-11-29' of https://evilpiepirate.org/git/bcachefs: (23 commits)
        bcachefs: Extra kthread_should_stop() calls for copygc
        bcachefs: Convert gc_alloc_start() to for_each_btree_key2()
        bcachefs: Fix race between btree writes and metadata drop
        bcachefs: move journal seq assertion
        bcachefs: -EROFS doesn't count as move_extent_start_fail
        bcachefs: trace_move_extent_start_fail() now includes errcode
        bcachefs: Fix split_race livelock
        bcachefs: Fix bucket data type for stripe buckets
        bcachefs: Add missing validation for jset_entry_data_usage
        bcachefs: Fix zstd compress workspace size
        bcachefs: bpos is misaligned on big endian
        bcachefs: Fix ec + durability calculation
        bcachefs: Data update path won't accidentaly grow replicas
        bcachefs: deallocate_extra_replicas()
        bcachefs: Proper refcounting for journal_keys
        bcachefs: preserve device path as device name
        bcachefs: Fix an endianness conversion
        bcachefs: Start gc, copygc, rebalance threads after initing writes ref
        bcachefs: Don't stop copygc thread on device resize
        bcachefs: Make sure bch2_move_ratelimit() also waits for move_ops
        ...
      e6861be4
    • Rafael J. Wysocki's avatar
      Merge branch 'acpi-tables' · 7d4c44a5
      Rafael J. Wysocki authored
      Merge a fix for a recently introduced build issue on ARM32 platforms
      caused by an inadvertent header file breakage (Dave Jiang).
      
      * acpi-tables:
        ACPI: Fix ARM32 platforms compile issue introduced by fw_table changes
      7d4c44a5
    • Rafael J. Wysocki's avatar
      Merge branch 'powercap' · a6b31256
      Rafael J. Wysocki authored
      Merge a power capping fix for 6.7-rc4 which eliminates unnecessary
      and harmful conversions to uW from the DTPM (dynamic thermal and power
      management) framework (Lukasz Luba).
      
      * powercap:
        powercap: DTPM: Fix unneeded conversions to micro-Watts
      a6b31256
    • Jens Axboe's avatar
      Merge tag 'nvme-6.7-2023-12-01' of git://git.infradead.org/nvme into block-6.7 · 8ad3ac92
      Jens Axboe authored
      Pull NVMe fixes from Keith:
      
      "nvme fixes for Linux 6.7
      
       - Invalid namespace identification error handling (Marizio Ewan, Keith)
       - Fabrics keep-alive tuning (Mark)"
      
      * tag 'nvme-6.7-2023-12-01' of git://git.infradead.org/nvme:
        nvme-core: check for too small lba shift
        nvme: check for valid nvme_identify_ns() before using it
        nvme-core: fix a memory leak in nvme_ns_info_from_identify()
        nvme: fine-tune sending of first keep-alive
      8ad3ac92
    • Keith Busch's avatar
      nvme-core: check for too small lba shift · 74fbc88e
      Keith Busch authored
      The block layer doesn't support logical block sizes smaller than 512
      bytes. The nvme spec doesn't support that small either, but the driver
      isn't checking to make sure the device responded with usable data.
      Failing to catch this will result in a kernel bug, either from a
      division by zero when stacking, or a zero length bio.
      Reviewed-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarKeith Busch <kbusch@kernel.org>
      74fbc88e
    • Ming Lei's avatar
      blk-mq: don't count completed flush data request as inflight in case of quiesce · 0e4237ae
      Ming Lei authored
      Request queue quiesce may interrupt flush sequence, and the original request
      may have been marked as COMPLETE, but can't get finished because of
      queue quiesce.
      
      This way is fine from driver viewpoint, because flush sequence is block
      layer concept, and it isn't related with driver.
      
      However, driver(such as dm-rq) can call blk_mq_queue_inflight() to count &
      drain inflight requests, then the wait & drain never gets done because
      the completed & not-finished flush request is counted as inflight.
      
      Fix this issue by not counting completed flush data request as inflight in
      case of quiesce.
      
      Cc: Mike Snitzer <snitzer@kernel.org>
      Cc: David Jeffery <djeffery@redhat.com>
      Cc: John Pittman <jpittman@redhat.com>
      Signed-off-by: default avatarMing Lei <ming.lei@redhat.com>
      Link: https://lore.kernel.org/r/20231201085605.577730-1-ming.lei@redhat.comSigned-off-by: default avatarJens Axboe <axboe@kernel.dk>
      0e4237ae
    • Daniel Mentz's avatar
      iommu: Fix printk arg in of_iommu_get_resv_regions() · c2183b3d
      Daniel Mentz authored
      The variable phys is defined as (struct resource *) which aligns with
      the printk format specifier %pr. Taking the address of it results in a
      value of type (struct resource **) which is incompatible with the format
      specifier %pr. Therefore, remove the address of operator (&).
      
      Fixes: a5bf3cfc ("iommu: Implement of_iommu_get_resv_regions()")
      Signed-off-by: default avatarDaniel Mentz <danielmentz@google.com>
      Acked-by: default avatarThierry Reding <treding@nvidia.com>
      Link: https://lore.kernel.org/r/20231108062226.928985-1-danielmentz@google.comSigned-off-by: default avatarJoerg Roedel <jroedel@suse.de>
      c2183b3d
    • Masami Hiramatsu (Google)'s avatar
      rethook: Use __rcu pointer for rethook::handler · a1461f1f
      Masami Hiramatsu (Google) authored
      Since the rethook::handler is an RCU-maganged pointer so that it will
      notice readers the rethook is stopped (unregistered) or not, it should
      be an __rcu pointer and use appropriate functions to be accessed. This
      will use appropriate memory barrier when accessing it. OTOH,
      rethook::data is never changed, so we don't need to check it in
      get_kretprobe().
      
      NOTE: To avoid sparse warning, rethook::handler is defined by a raw
      function pointer type with __rcu instead of rethook_handler_t.
      
      Link: https://lore.kernel.org/all/170126066201.398836.837498688669005979.stgit@devnote2/
      
      Fixes: 54ecbe6f ("rethook: Add a generic return hook")
      Cc: stable@vger.kernel.org
      Reported-by: default avatarkernel test robot <lkp@intel.com>
      Closes: https://lore.kernel.org/oe-kbuild-all/202311241808.rv9ceuAh-lkp@intel.com/Tested-by: default avatarJP Kobryn <inwardvessel@gmail.com>
      Signed-off-by: default avatarMasami Hiramatsu (Google) <mhiramat@kernel.org>
      a1461f1f
    • JP Kobryn's avatar
      kprobes: consistent rcu api usage for kretprobe holder · d839a656
      JP Kobryn authored
      It seems that the pointer-to-kretprobe "rp" within the kretprobe_holder is
      RCU-managed, based on the (non-rethook) implementation of get_kretprobe().
      The thought behind this patch is to make use of the RCU API where possible
      when accessing this pointer so that the needed barriers are always in place
      and to self-document the code.
      
      The __rcu annotation to "rp" allows for sparse RCU checking. Plain writes
      done to the "rp" pointer are changed to make use of the RCU macro for
      assignment. For the single read, the implementation of get_kretprobe()
      is simplified by making use of an RCU macro which accomplishes the same,
      but note that the log warning text will be more generic.
      
      I did find that there is a difference in assembly generated between the
      usage of the RCU macros vs without. For example, on arm64, when using
      rcu_assign_pointer(), the corresponding store instruction is a
      store-release (STLR) which has an implicit barrier. When normal assignment
      is done, a regular store (STR) is found. In the macro case, this seems to
      be a result of rcu_assign_pointer() using smp_store_release() when the
      value to write is not NULL.
      
      Link: https://lore.kernel.org/all/20231122132058.3359-1-inwardvessel@gmail.com/
      
      Fixes: d741bf41 ("kprobes: Remove kretprobe hash")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJP Kobryn <inwardvessel@gmail.com>
      Acked-by: default avatarMasami Hiramatsu (Google) <mhiramat@kernel.org>
      Signed-off-by: default avatarMasami Hiramatsu (Google) <mhiramat@kernel.org>
      d839a656
    • wuqiang.matt's avatar
      lib: objpool: fix head overrun on RK3588 SBC · d67f39d2
      wuqiang.matt authored
      objpool overrun stress with test_objpool on OrangePi5+ SBC triggered the
      following kernel warnings:
      
          WARNING: CPU: 6 PID: 3115 at lib/objpool.c:168 objpool_push+0xc0/0x100
      
      This message is from objpool.c:168:
      
          WARN_ON_ONCE(tail - head > pool->nr_objs);
      
      The overrun test case is to validate the case that pre-allocated objects
      are insufficient: 8 objects are pre-allocated for each node and consumer
      thread per node tries to grab 16 objects in a row. The testing system is
      OrangePI 5+, with RK3588, a big.LITTLE SOC with 4x A76 and 4x A55. When
      disabling either all 4 big or 4 little cores, the overrun tests run well,
      and once with big and little cores mixed together, the overrun test would
      always cause an overrun loop. It's likely the memory timing differences
      of big and little cores cause this trouble. Here are the debugging data
      of objpool_try_get_slot after try_cmpxchg_release:
      
          objpool_pop: cpu: 4/0 0:0 head: 278/279 tail:278 last:276/278
      
      The local copies of 'head' and 'last' were 278 and 276, and reloading of
      'slot->head' and 'slot->last' got 279 and 278. After try_cmpxchg_release
      'slot->head' became 'head + 1', which is correct. But what's wrong here
      is the stale value of 'last', and that stale value of 'last' finally led
      the overrun of 'head'.
      
      Memory updating of 'last' and 'head' are performed in push() and pop()
      independently, which could be the culprit leading this out of order
      visibility of 'last' and 'head'. So for objpool_try_get_slot(), it's
      not enough only checking the condition of 'head != slot', the implicit
      condition 'last - head <= nr_objs' must also be explicitly asserted to
      guarantee 'last' is always behind 'head' before the object retrieving.
      
      This patch will check and try reloading of 'head' and 'last' to ensure
      'last' is behind 'head' at the time of object retrieving. Performance
      testings show the average impact is about 0.1% for X86_64 and 1.12% for
      ARM64. Here are the results:
      
          OS: Debian 10 X86_64, Linux 6.6rc
          HW: XEON 8336C x 2, 64 cores/128 threads, DDR4 3200MT/s
                            1T         2T         4T         8T        16T
          native:     49543304   99277826  199017659  399070324  795185848
          objpool:    29909085   59865637  119692073  239750369  478005250
          objpool+:   29879313   59230743  119609856  239067773  478509029
                           32T        48T        64T        96T       128T
          native:   1596927073 2390099988 2929397330 3183875848 3257546602
          objpool:   957553042 1435814086 1680872925 2043126796 2165424198
          objpool+:  956476281 1434491297 1666055740 2041556569 2157415622
      
          OS: Debian 11 AARCH64, Linux 6.6rc
          HW: Kunpeng-920 96 cores/2 sockets/4 NUMA nodes, DDR4 2933 MT/s
                            1T         2T         4T         8T        16T
          native:     30890508   60399915  123111980  242257008  494002946
          objpool:    14742531   28883047   57739948  115886644  232455421
          objpool+:   14107220   29032998   57286084  113730493  232232850
                           24T        32T        48T        64T        96T
          native:    746406039 1000174750 1493236240 1998318364 2942911180
          objpool:   349164852  467284332  702296756  934459713 1387898285
          objpool+:  348388180  462750976  696606096  927865887 1368402195
      
      Link: https://lore.kernel.org/all/20231114115148.298821-1-wuqiang.matt@bytedance.com/
      
      Fixes: b4edb8d2 ("lib: objpool added: ring-array based lockless MPMC")
      Signed-off-by: default avatarwuqiang.matt <wuqiang.matt@bytedance.com>
      Acked-by: default avatarMasami Hiramatsu (Google) <mhiramat@kernel.org>
      Signed-off-by: default avatarMasami Hiramatsu (Google) <mhiramat@kernel.org>
      d67f39d2
    • Linus Torvalds's avatar
      Merge tag 'hardening-v6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 994d5c58
      Linus Torvalds authored
      Pull hardening fixes from Kees Cook:
      
       - struct_group: propagate attributes to top-level union (Dmitry
         Antipov)
      
       - gcc-plugins: randstruct: Update code comment in relayout_struct
         (Gustavo A. R. Silva)
      
       - MAINTAINERS: refresh LLVM support (Nick Desaulniers)
      
      * tag 'hardening-v6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        gcc-plugins: randstruct: Update code comment in relayout_struct()
        uapi: propagate __struct_group() attributes to the container union
        MAINTAINERS: refresh LLVM support
      994d5c58
    • Linus Torvalds's avatar
      Merge tag 'linux_kselftest-kunit-fixes-6.7-rc4' of... · 47669f40
      Linus Torvalds authored
      Merge tag 'linux_kselftest-kunit-fixes-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
      
      Pull KUnit fixes from Shuah Khan:
       "Three fixes to warnings and run-time test behavior. With these fixes,
        test suite counter will be reset correctly before running tests, kunit
        will warn if tests are too slow, and eliminate warning when kfree() as
        an action"
      
      * tag 'linux_kselftest-kunit-fixes-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
        kunit: test: Avoid cast warning when adding kfree() as an action
        kunit: Reset suite counter right before running tests
        kunit: Warn if tests are slow
      47669f40
    • Dave Airlie's avatar
      Merge tag 'amd-drm-fixes-6.7-2023-11-30' of... · 908f6064
      Dave Airlie authored
      Merge tag 'amd-drm-fixes-6.7-2023-11-30' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
      
      amd-drm-fixes-6.7-2023-11-30:
      
      amdgpu:
      - DMUB fix
      - DCN 3.5 fixes
      - XGMI fix
      - DCN 3.2 fixes
      - Vangogh suspend fix
      - NBIO 7.9 fix
      - GFX11 golden register fix
      - Backlight fix
      - NBIO 7.11 fix
      - IB test overflow fix
      - DCN 3.1.4 fixes
      - fix a runtime pm ref count
      - Retimer fix
      - ABM fix
      - DCN 3.1.5 fix
      - Fix AGP addressing
      - Fix possible memory leak in SMU error path
      - Make sure PME is enabled in D3
      - Fix possible NULL pointer dereference in debugfs
      - EEPROM fix
      - GC 9.4.3 fix
      
      amdkfd:
      - IP version check fix
      - Fix memory leak in pqm_uninit()
      
      drm:
      - Revert unexport of prime helpers for fd/handle conversion
      Signed-off-by: default avatarDave Airlie <airlied@redhat.com>
      
      From: Alex Deucher <alexander.deucher@amd.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20231130213135.5083-1-alexander.deucher@amd.com
      908f6064
    • Linus Torvalds's avatar
      Merge tag 'perf-tools-fixes-for-v6.7-1-2023-11-29' of... · 2594faaf
      Linus Torvalds authored
      Merge tag 'perf-tools-fixes-for-v6.7-1-2023-11-29' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
      
      Pull perf tools fixes from Namhyung Kim:
       "Assorted build fixes including:
      
         - fix compile errors in printf() with u64 on 32-bit systesm
      
         - sync kernel headers to the tool copies
      
         - update arm64 sysreg generation for tarballs
      
         - disable compile warnings on __packed attribute"
      
      * tag 'perf-tools-fixes-for-v6.7-1-2023-11-29' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
        tools: Disable __packed attribute compiler warning due to -Werror=attributes
        perf build: Ensure sysreg-defs Makefile respects output dir
        tools perf: Add arm64 sysreg files to MANIFEST
        tools/perf: Update tools's copy of mips syscall table
        tools/perf: Update tools's copy of s390 syscall table
        tools/perf: Update tools's copy of powerpc syscall table
        tools/perf: Update tools's copy of x86 syscall table
        tools headers: Update tools's copy of s390/asm headers
        tools headers: Update tools's copy of arm64/asm headers
        tools headers: Update tools's copy of x86/asm headers
        tools headers: Update tools's copy of socket.h header
        tools headers UAPI: Update tools's copy of unistd.h header
        tools headers UAPI: Update tools's copy of vhost.h header
        tools headers UAPI: Update tools's copy of mount.h header
        tools headers UAPI: Update tools's copy of kvm.h header
        tools headers UAPI: Update tools's copy of fscrypt.h header
        tools headers UAPI: Update tools's copy of drm headers
        perf lock contention: Fix a build error on 32-bit
        perf kwork: Fix a build error on 32-bit
      2594faaf
  5. 30 Nov, 2023 1 commit
    • Linus Torvalds's avatar
      Merge tag 'net-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 6172a518
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from bpf and wifi.
      
        Current release - regressions:
      
         - neighbour: fix __randomize_layout crash in struct neighbour
      
         - r8169: fix deadlock on RTL8125 in jumbo mtu mode
      
        Previous releases - regressions:
      
         - wifi:
             - mac80211: fix warning at station removal time
             - cfg80211: fix CQM for non-range use
      
         - tools: ynl-gen: fix unexpected response handling
      
         - octeontx2-af: fix possible buffer overflow
      
         - dpaa2: recycle the RX buffer only after all processing done
      
         - rswitch: fix missing dev_kfree_skb_any() in error path
      
        Previous releases - always broken:
      
         - ipv4: fix uaf issue when receiving igmp query packet
      
         - wifi: mac80211: fix debugfs deadlock at device removal time
      
         - bpf:
             - sockmap: af_unix stream sockets need to hold ref for pair sock
             - netdevsim: don't accept device bound programs
      
         - selftests: fix a char signedness issue
      
         - dsa: mv88e6xxx: fix marvell 6350 probe crash
      
         - octeontx2-pf: restore TC ingress police rules when interface is up
      
         - wangxun: fix memory leak on msix entry
      
         - ravb: keep reverse order of operations in ravb_remove()"
      
      * tag 'net-6.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (51 commits)
        net: ravb: Keep reverse order of operations in ravb_remove()
        net: ravb: Stop DMA in case of failures on ravb_open()
        net: ravb: Start TX queues after HW initialization succeeded
        net: ravb: Make write access to CXR35 first before accessing other EMAC registers
        net: ravb: Use pm_runtime_resume_and_get()
        net: ravb: Check return value of reset_control_deassert()
        net: libwx: fix memory leak on msix entry
        ice: Fix VF Reset paths when interface in a failed over aggregate
        bpf, sockmap: Add af_unix test with both sockets in map
        bpf, sockmap: af_unix stream sockets need to hold ref for pair sock
        tools: ynl-gen: always construct struct ynl_req_state
        ethtool: don't propagate EOPNOTSUPP from dumps
        ravb: Fix races between ravb_tx_timeout_work() and net related ops
        r8169: prevent potential deadlock in rtl8169_close
        r8169: fix deadlock on RTL8125 in jumbo mtu mode
        neighbour: Fix __randomize_layout crash in struct neighbour
        octeontx2-pf: Restore TC ingress police rules when interface is up
        octeontx2-pf: Fix adding mbox work queue entry when num_vfs > 64
        net: stmmac: xgmac: Disable FPE MMC interrupts
        octeontx2-af: Fix possible buffer overflow
        ...
      6172a518