1. 12 Sep, 2022 12 commits
    • Valentin Schneider's avatar
      panic, kexec: make __crash_kexec() NMI safe · 05c62574
      Valentin Schneider authored
      Attempting to get a crash dump out of a debug PREEMPT_RT kernel via an NMI
      panic() doesn't work.  The cause of that lies in the PREEMPT_RT definition
      of mutex_trylock():
      
      	if (IS_ENABLED(CONFIG_DEBUG_RT_MUTEXES) && WARN_ON_ONCE(!in_task()))
      		return 0;
      
      This prevents an nmi_panic() from executing the main body of
      __crash_kexec() which does the actual kexec into the kdump kernel.  The
      warning and return are explained by:
      
        6ce47fd9 ("rtmutex: Warn if trylock is called from hard/softirq context")
        [...]
        The reasons for this are:
      
            1) There is a potential deadlock in the slowpath
      
            2) Another cpu which blocks on the rtmutex will boost the task
      	 which allegedly locked the rtmutex, but that cannot work
      	 because the hard/softirq context borrows the task context.
      
      Furthermore, grabbing the lock isn't NMI safe, so do away with kexec_mutex
      and replace it with an atomic variable.  This is somewhat overzealous as
      *some* callsites could keep using a mutex (e.g.  the sysfs-facing ones
      like crash_shrink_memory()), but this has the benefit of involving a
      single unified lock and preventing any future NMI-related surprises.
      
      Tested by triggering NMI panics via:
      
        $ echo 1 > /proc/sys/kernel/panic_on_unrecovered_nmi
        $ echo 1 > /proc/sys/kernel/unknown_nmi_panic
        $ echo 1 > /proc/sys/kernel/panic
      
        $ ipmitool power diag
      
      Link: https://lkml.kernel.org/r/20220630223258.4144112-3-vschneid@redhat.com
      Fixes: 6ce47fd9 ("rtmutex: Warn if trylock is called from hard/softirq context")
      Signed-off-by: default avatarValentin Schneider <vschneid@redhat.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: "Eric W . Biederman" <ebiederm@xmission.com>
      Cc: Juri Lelli <jlelli@redhat.com>
      Cc: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
      Cc: Miaohe Lin <linmiaohe@huawei.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      05c62574
    • Valentin Schneider's avatar
      kexec: turn all kexec_mutex acquisitions into trylocks · 7bb5da0d
      Valentin Schneider authored
      Patch series "kexec, panic: Making crash_kexec() NMI safe", v4.
      
      
      This patch (of 2):
      
      Most acquistions of kexec_mutex are done via mutex_trylock() - those were
      a direct "translation" from:
      
        8c5a1cf0 ("kexec: use a mutex for locking rather than xchg()")
      
      there have however been two additions since then that use mutex_lock():
      crash_get_memory_size() and crash_shrink_memory().
      
      A later commit will replace said mutex with an atomic variable, and
      locking operations will become atomic_cmpxchg().  Rather than having those
      mutex_lock() become while (atomic_cmpxchg(&lock, 0, 1)), turn them into
      trylocks that can return -EBUSY on acquisition failure.
      
      This does halve the printable size of the crash kernel, but that's still
      neighbouring 2G for 32bit kernels which should be ample enough.
      
      Link: https://lkml.kernel.org/r/20220630223258.4144112-1-vschneid@redhat.com
      Link: https://lkml.kernel.org/r/20220630223258.4144112-2-vschneid@redhat.comSigned-off-by: default avatarValentin Schneider <vschneid@redhat.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: "Eric W . Biederman" <ebiederm@xmission.com>
      Cc: Juri Lelli <jlelli@redhat.com>
      Cc: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
      Cc: Miaohe Lin <linmiaohe@huawei.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Baoquan He <bhe@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      7bb5da0d
    • Neel Natu's avatar
      lib/cmdline: avoid page fault in next_arg · 9847f212
      Neel Natu authored
      An argument list like "arg=val arg2 \"" can trigger a page fault if the
      page pointed by 'args[0xffffffff]' is not mapped and potential memory
      corruption otherwise (unlikely but possible if the bogus address is mapped
      and contents happen to match the ascii value of the quote character).
      
      The fix is to ensure that we load 'args[i-1]' only when (i > 0).
      
      Prior to this commit the following command would trigger an
      unhandled page fault in the kernel:
      
      root@(none):/linus/fs/fat# insmod ./fat.ko  "foo=bar \""
      [   33.870507] BUG: unable to handle page fault for address: ffff888204252608
      [   33.872180] #PF: supervisor read access in kernel mode
      [   33.873414] #PF: error_code(0x0000) - not-present page
      [   33.874650] PGD 4401067 P4D 4401067 PUD 0
      [   33.875321] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC PTI
      [   33.876113] CPU: 16 PID: 399 Comm: insmod Not tainted 5.19.0-dbg-DEV #4
      [   33.877193] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-4 04/01/2014
      [   33.878739] RIP: 0010:next_arg+0xd1/0x110
      [   33.879399] Code: 22 75 1d 41 c6 04 01 00 41 80 f8 22 74 18 eb 35 4c 89 0e 45 31 d2 4c 89 cf 48 c7 02 00 00 00 00 41 80 f8 22 75 1f 41 8d 42 ff <41> 80 3c 01 22 75 14 41 c6 04 01 00 eb 0d 48 c7 02 00 00 00 00 41
      [   33.882338] RSP: 0018:ffffc90001253d08 EFLAGS: 00010246
      [   33.883174] RAX: 00000000ffffffff RBX: ffff888104252608 RCX: 0fc317bba1c1dd00
      [   33.884311] RDX: ffffc90001253d40 RSI: ffffc90001253d48 RDI: ffff888104252609
      [   33.885450] RBP: ffffc90001253d10 R08: 0000000000000022 R09: ffff888104252609
      [   33.886595] R10: 0000000000000000 R11: ffffffff82c7ff20 R12: 0000000000000282
      [   33.887748] R13: 00000000ffff8000 R14: 0000000000000000 R15: 0000000000007fff
      [   33.888887] FS:  00007f04ec7432c0(0000) GS:ffff88813d300000(0000) knlGS:0000000000000000
      [   33.890183] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   33.891111] CR2: ffff888204252608 CR3: 0000000100f36005 CR4: 0000000000170ee0
      [   33.892241] Call Trace:
      [   33.892641]  <TASK>
      [   33.892989]  parse_args+0x8f/0x220
      [   33.893538]  load_module+0x138b/0x15a0
      [   33.894149]  ? prepare_coming_module+0x50/0x50
      [   33.894879]  ? kernel_read_file_from_fd+0x5f/0x90
      [   33.895639]  __se_sys_finit_module+0xce/0x130
      [   33.896342]  __x64_sys_finit_module+0x1d/0x20
      [   33.897042]  do_syscall_64+0x44/0xa0
      [   33.897622]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
      [   33.898434] RIP: 0033:0x7f04ec85ef79
      [   33.899009] Code: 48 8d 3d da db 0d 00 0f 05 eb a5 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c7 9e 0d 00 f7 d8 64 89 01 48
      [   33.901912] RSP: 002b:00007fffae81bfe8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
      [   33.903081] RAX: ffffffffffffffda RBX: 0000559c5f1d2640 RCX: 00007f04ec85ef79
      [   33.904191] RDX: 0000000000000000 RSI: 0000559c5f1d12a0 RDI: 0000000000000003
      [   33.905304] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
      [   33.906421] R10: 0000000000000003 R11: 0000000000000246 R12: 0000559c5f1d12a0
      [   33.907526] R13: 0000000000000000 R14: 0000559c5f1d25f0 R15: 0000559c5f1d12a0
      [   33.908631]  </TASK>
      [   33.908986] Modules linked in: fat(+) [last unloaded: fat]
      [   33.909843] CR2: ffff888204252608
      [   33.910375] ---[ end trace 0000000000000000 ]---
      [   33.911172] RIP: 0010:next_arg+0xd1/0x110
      [   33.911796] Code: 22 75 1d 41 c6 04 01 00 41 80 f8 22 74 18 eb 35 4c 89 0e 45 31 d2 4c 89 cf 48 c7 02 00 00 00 00 41 80 f8 22 75 1f 41 8d 42 ff <41> 80 3c 01 22 75 14 41 c6 04 01 00 eb 0d 48 c7 02 00 00 00 00 41
      [   33.914643] RSP: 0018:ffffc90001253d08 EFLAGS: 00010246
      [   33.915446] RAX: 00000000ffffffff RBX: ffff888104252608 RCX: 0fc317bba1c1dd00
      [   33.916544] RDX: ffffc90001253d40 RSI: ffffc90001253d48 RDI: ffff888104252609
      [   33.917636] RBP: ffffc90001253d10 R08: 0000000000000022 R09: ffff888104252609
      [   33.918727] R10: 0000000000000000 R11: ffffffff82c7ff20 R12: 0000000000000282
      [   33.919821] R13: 00000000ffff8000 R14: 0000000000000000 R15: 0000000000007fff
      [   33.920908] FS:  00007f04ec7432c0(0000) GS:ffff88813d300000(0000) knlGS:0000000000000000
      [   33.922125] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   33.923017] CR2: ffff888204252608 CR3: 0000000100f36005 CR4: 0000000000170ee0
      [   33.924098] Kernel panic - not syncing: Fatal exception
      [   33.925776] Kernel Offset: disabled
      [   33.926347] Rebooting in 10 seconds..
      
      Link: https://lkml.kernel.org/r/20220728232434.1666488-1-neelnatu@google.comSigned-off-by: default avatarNeel Natu <neelnatu@google.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      9847f212
    • Ira Weiny's avatar
      checkpatch: add kmap and kmap_atomic to the deprecated list · defdaff1
      Ira Weiny authored
      kmap() and kmap_atomic() are being deprecated in favor of
      kmap_local_page().
      
      There are two main problems with kmap(): (1) It comes with an overhead as
      mapping space is restricted and protected by a global lock for
      synchronization and (2) it also requires global TLB invalidation when the
      kmap's pool wraps and it might block when the mapping space is fully
      utilized until a slot becomes available.
      
      kmap_local_page() is safe from any context and is therefore redundant with
      kmap_atomic() with the exception of any pagefault or preemption disable
      requirements.  However, using kmap_atomic() for these side effects makes
      the code less clear.  So any requirement for pagefault or preemption
      disable should be made explicitly.
      
      With kmap_local_page() the mappings are per thread, CPU local, can take
      page faults, and can be called from any context (including interrupts). 
      It is faster than kmap() in kernels with HIGHMEM enabled.  Furthermore,
      the tasks can be preempted and, when they are scheduled to run again, the
      kernel virtual addresses are restored.
      
      Link: https://lkml.kernel.org/r/20220813220034.806698-1-ira.weiny@intel.comSigned-off-by: default avatarIra Weiny <ira.weiny@intel.com>
      Suggested-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Suggested-by: default avatarFabio M. De Francesco <fmdefrancesco@gmail.com>
      Reviewed-by: default avatarChaitanya Kulkarni <kch@nvidia.com>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      defdaff1
    • Fabio M. De Francesco's avatar
      fs/isofs: replace kmap() with kmap_local_page() · 5bb6ce3a
      Fabio M. De Francesco authored
      The use of kmap() is being deprecated in favor of kmap_local_page().
      
      There are two main problems with kmap(): (1) It comes with an overhead as
      mapping space is restricted and protected by a global lock for
      synchronization and (2) it also requires global TLB invalidation when the
      kmap's pool wraps and it might block when the mapping space is fully
      utilized until a slot becomes available.
      
      With kmap_local_page() the mappings are per thread, CPU local, can take
      page faults, and can be called from any context (including interrupts). 
      Tasks can be preempted and, when scheduled to run again, the kernel
      virtual addresses are restored and still valid.  It is faster than kmap()
      in kernels with HIGHMEM enabled.
      
      Since kmap_local_page() can be safely used in compress.c, it should be
      called everywhere instead of kmap().
      
      Therefore, replace kmap() with kmap_local_page() in compress.c.  Where it
      is needed, use memzero_page() instead of open coding kmap_local_page()
      plus memset() to fill the pages with zeros.  Delete the redundant
      flush_dcache_page() in the two call sites of memzero_page().
      
      Tested with mkisofs on a QEMU/KVM x86_32 VM, 6GB RAM, booting a kernel
      with HIGHMEM64GB enabled.
      
      Link: https://lkml.kernel.org/r/20220801122709.8164-1-fmdefrancesco@gmail.comSigned-off-by: default avatarFabio M. De Francesco <fmdefrancesco@gmail.com>
      Suggested-by: default avatarIra Weiny <ira.weiny@intel.com>
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Pali Rohár <pali@kernel.org>
      Cc: Muchun Song <songmuchun@bytedance.com>
      Cc: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      5bb6ce3a
    • Arnd Bergmann's avatar
      treewide: defconfig: address renamed CONFIG_DEBUG_INFO=y · 64367f2e
      Arnd Bergmann authored
      CONFIG_DEBUG_INFO is now implicitly selected if one picks one of the
      explicit options that could be DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT,
      DEBUG_INFO_DWARF4, DEBUG_INFO_DWARF5.
      
      This was actually not what I had in mind when I suggested making it a
      'choice' statement, but it's too late to change again now, and the Kconfig
      logic is more sensible in the new form.
      
      Change any defconfig file that had CONFIG_DEBUG_INFO enabled but did not
      pick DWARF4 or DWARF5 explicitly to now pick the toolchain default.
      
      Link: https://lkml.kernel.org/r/20220811114609.2097335-1-arnd@kernel.org
      Fixes: f9b3cd24 ("Kconfig.debug: make DEBUG_INFO selectable from a choice")
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Reviewed-by: default avatarKees Cook <keescook@chromium.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Vineet Gupta <vgupta@kernel.org>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Dinh Nguyen <dinguyen@kernel.org>
      Cc: Yoshinori Sato <ysato@users.osdn.me>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
      Cc: Johannes Berg <johannes@sipsolutions.net>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      64367f2e
    • Manfred Spraul's avatar
      ipc/util.c: cleanup and improve sysvipc_find_ipc() · 58b5c203
      Manfred Spraul authored
      sysvipc_find_ipc() can be simplified further:
      
      - It uses a for() loop to locate the next entry in the idr.
        This can be replaced with idr_get_next().
      
      - It receives two parameters (pos - which is actually
        an idr index and not a position, and new_pos, which
        is really a position).
        One parameter is sufficient.
      
      Link: https://lore.kernel.org/all/20210903052020.3265-3-manfred@colorfullife.com/
      Link: https://lkml.kernel.org/r/20220805115733.104763-1-manfred@colorfullife.comSigned-off-by: default avatarManfred Spraul <manfred@colorfullife.com>
      Acked-by: default avatarDavidlohr Bueso <dave@stgolabs.net>
      Acked-by: default avatarWaiman Long <longman@redhat.com>
      Cc: "Eric W . Biederman" <ebiederm@xmission.com>
      Cc: <1vier1@web.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      58b5c203
    • Borislav Petkov's avatar
      scripts/decodecode: improve faulting line determination · 765f2bf0
      Borislav Petkov authored
      There are cases where the IP pointer in a Code: line in an oops doesn't
      point at the beginning of an instruction:
      
      Code: 0f bd c2 e9 a0 cd b5 e4 48 0f bd c2 e9 97 cd b5 e4 0f 1f 80 00 00 00 00 \
      	  e9 8b cd b5 e4 0f 1f 00 66 0f a3 d0 e9 7f cd b5 e4 0f 1f <80> 00 00 00 \
      	  00 0f a3 d0 e9 70 cd b5 e4 48 0f a3 d0 e9 67 cd b5
      
        e9 7f cd b5 e4          jmp    0xffffffffe4b5cda8
        0f 1f 80 00 00 00 00    nopl   0x0(%rax)
      	^^
      
      and the current way of determining the faulting instruction line doesn't
      work because disassembled instructions are counted from the IP byte to
      the end and when that thing points in the middle, the trailing bytes can
      be interpreted as different insns:
      
        Code starting with the faulting instruction
        ===========================================
           0:   80 00 00                addb   $0x0,(%rax)
           3:   00 00                   add    %al,(%rax)
      
      whereas, this is part of
      
      0f 1f 80 00 00 00 00    nopl   0x0(%rax)
      
           5:   0f a3 d0                bt     %edx,%eax
           ...
      
      leading to:
      
        1d:   0f 1f 00                nopl   (%rax)
        20:   66 0f a3 d0             bt     %dx,%ax
        24:*  e9 7f cd b5 e4          jmp    0xffffffffe4b5cda8               <-- trapping instruction
        29:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)
        30:   0f a3 d0                bt     %edx,%eax
      
      which is the wrong faulting instruction.
      
      Change the way the faulting line number is determined by matching the
      opcode bytes from the beginning, leading to correct output:
      
        1d:   0f 1f 00                nopl   (%rax)
        20:   66 0f a3 d0             bt     %dx,%ax
        24:   e9 7f cd b5 e4          jmp    0xffffffffe4b5cda8
        29:*  0f 1f 80 00 00 00 00    nopl   0x0(%rax)                <-- trapping instruction
        30:   0f a3 d0                bt     %edx,%eax
      
      While at it, make decodecode use bash as the interpreter - that thing
      should be present on everything by now. It simplifies the code a lot
      too.
      
      Link: https://lkml.kernel.org/r/20220808085928.29840-1-bp@alien8.deSigned-off-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      765f2bf0
    • Fabio M. De Francesco's avatar
      hfsplus: convert kmap() to kmap_local_page() in btree.c · 9f25f357
      Fabio M. De Francesco authored
      kmap() is being deprecated in favor of kmap_local_page().
      
      There are two main problems with kmap(): (1) It comes with an overhead as
      mapping space is restricted and protected by a global lock for
      synchronization and (2) it also requires global TLB invalidation when the
      kmap's pool wraps and it might block when the mapping space is fully
      utilized until a slot becomes available.
      
      With kmap_local_page() the mappings are per thread, CPU local, can take
      page faults, and can be called from any context (including interrupts). 
      It is faster than kmap() in kernels with HIGHMEM enabled.  Furthermore,
      the tasks can be preempted and, when they are scheduled to run again, the
      kernel virtual addresses are restored and are still valid.
      
      Since its use in btree.c is safe everywhere, it should be preferred.
      
      Therefore, replace kmap() with kmap_local_page() in btree.c.
      
      Tested in a QEMU/KVM x86_32 VM, 6GB RAM, booting a kernel with
      HIGHMEM64GB enabled.
      
      Link: https://lkml.kernel.org/r/20220809203105.26183-5-fmdefrancesco@gmail.comSigned-off-by: default avatarFabio M. De Francesco <fmdefrancesco@gmail.com>
      Suggested-by: default avatarIra Weiny <ira.weiny@intel.com>
      Reviewed-by: default avatarIra Weiny <ira.weiny@intel.com>
      Reviewed-by: default avatarViacheslav Dubeyko <slava@dubeyko.com>
      Cc: Bart Van Assche <bvanassche@acm.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Muchun Song <songmuchun@bytedance.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      9f25f357
    • Fabio M. De Francesco's avatar
      hfsplus: convert kmap() to kmap_local_page() in bitmap.c · f9ef3b95
      Fabio M. De Francesco authored
      kmap() is being deprecated in favor of kmap_local_page().
      
      There are two main problems with kmap(): (1) It comes with an overhead as
      mapping space is restricted and protected by a global lock for
      synchronization and (2) it also requires global TLB invalidation when the
      kmap's pool wraps and it might block when the mapping space is fully
      utilized until a slot becomes available.
      
      With kmap_local_page() the mappings are per thread, CPU local, can take
      page faults, and can be called from any context (including interrupts). 
      It is faster than kmap() in kernels with HIGHMEM enabled.  Furthermore,
      the tasks can be preempted and, when they are scheduled to run again, the
      kernel virtual addresses are restored and are still valid.
      
      Since its use in bitmap.c is safe everywhere, it should be preferred.
      
      Therefore, replace kmap() with kmap_local_page() in bitmap.c.
      
      Tested in a QEMU/KVM x86_32 VM, 6GB RAM, booting a kernel with
      HIGHMEM64GB enabled.
      
      Link: https://lkml.kernel.org/r/20220809203105.26183-4-fmdefrancesco@gmail.comSigned-off-by: default avatarFabio M. De Francesco <fmdefrancesco@gmail.com>
      Suggested-by: default avatarIra Weiny <ira.weiny@intel.com>
      Reviewed-by: default avatarIra Weiny <ira.weiny@intel.com>
      Reviewed-by: default avatarViacheslav Dubeyko <slava@dubeyko.com>
      Cc: Bart Van Assche <bvanassche@acm.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Muchun Song <songmuchun@bytedance.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      f9ef3b95
    • Fabio M. De Francesco's avatar
      hfsplus: convert kmap() to kmap_local_page() in bnode.c · 6c3014a6
      Fabio M. De Francesco authored
      kmap() is being deprecated in favor of kmap_local_page().
      
      Two main problems with kmap(): (1) It comes with an overhead as mapping
      space is restricted and protected by a global lock for synchronization and
      (2) it also requires global TLB invalidation when the kmap's pool wraps
      and it might block when the mapping space is fully utilized until a slot
      becomes available.
      
      With kmap_local_page() the mappings are per thread, CPU local, can take
      page faults, and can be called from any context (including interrupts). 
      It is faster than kmap() in kernels with HIGHMEM enabled.  Furthermore,
      the tasks can be preempted and, when they are scheduled to run again, the
      kernel virtual addresses are restored and still valid.
      
      Since its use in bnode.c is safe everywhere, it should be preferred.
      
      Therefore, replace kmap() with kmap_local_page() in bnode.c.  Where
      possible, use the suited standard helpers (memzero_page(), memcpy_page())
      instead of open coding kmap_local_page() plus memset() or memcpy().
      
      Tested in a QEMU/KVM x86_32 VM, 6GB RAM, booting a kernel with
      HIGHMEM64GB enabled.
      
      Link: https://lkml.kernel.org/r/20220809203105.26183-3-fmdefrancesco@gmail.comSigned-off-by: default avatarFabio M. De Francesco <fmdefrancesco@gmail.com>
      Suggested-by: default avatarIra Weiny <ira.weiny@intel.com>
      Reviewed-by: default avatarIra Weiny <ira.weiny@intel.com>
      Reviewed-by: default avatarViacheslav Dubeyko <slava@dubeyko.com>
      Cc: Bart Van Assche <bvanassche@acm.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Muchun Song <songmuchun@bytedance.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      6c3014a6
    • Fabio M. De Francesco's avatar
      hfsplus: unmap the page in the "fail_page" label · f5b23d67
      Fabio M. De Francesco authored
      Patch series "hfsplus: Replace kmap() with kmap_local_page()".
      
      kmap() is being deprecated in favor of kmap_local_page().
      
      There are two main problems with kmap(): (1) It comes with an overhead as
      mapping space is restricted and protected by a global lock for
      synchronization and (2) it also requires global TLB invalidation when the
      kmap’s pool wraps and it might block when the mapping space is fully
      utilized until a slot becomes available.
      
      With kmap_local_page() the mappings are per thread, CPU local, can take
      page faults, and can be called from any context (including interrupts). 
      It is faster than kmap() in kernels with HIGHMEM enabled.  Furthermore,
      the tasks can be preempted and, when they are scheduled to run again, the
      kernel virtual addresses are restored and still valid.
      
      Since its use in fs/hfsplus is safe everywhere, it should be preferred.
      
      Therefore, replace kmap() with kmap_local_page() in fs/hfsplus.  Where
      possible, use the suited standard helpers (memzero_page(), memcpy_page())
      instead of open coding kmap_local_page() plus memset() or memcpy().
      
      Fix a bug due to a page being not unmapped if the code jumps to the
      "fail_page" label (1/4).
      
      Tested in a QEMU/KVM x86_32 VM, 6GB RAM, booting a kernel with
      HIGHMEM64GB enabled.
      
      
      This patch (of 4):
      
      Several paths within hfs_btree_open() jump to the "fail_page" label where
      put_page() is called while the page is still mapped.
      
      Call kunmap() to unmap the page soon before put_page().
      
      Link: https://lkml.kernel.org/r/20220809203105.26183-1-fmdefrancesco@gmail.com
      Link: https://lkml.kernel.org/r/20220809203105.26183-2-fmdefrancesco@gmail.comSigned-off-by: default avatarFabio M. De Francesco <fmdefrancesco@gmail.com>
      Reviewed-by: default avatarIra Weiny <ira.weiny@intel.com>
      Reviewed-by: default avatarViacheslav Dubeyko <slava@dubeyko.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Fabio M. De Francesco <fmdefrancesco@gmail.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Bart Van Assche <bvanassche@acm.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Muchun Song <songmuchun@bytedance.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      f5b23d67
  2. 28 Aug, 2022 25 commits
  3. 27 Aug, 2022 3 commits
    • Linus Torvalds's avatar
      Merge tag 'thermal-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 10d4879f
      Linus Torvalds authored
      Pull thermal control fixes from Rafael Wysocki:
       "Fix two issues introduced recently and one driver problem leading to a
        NULL pointer dereference in some cases.
      
        Specifics:
      
         - Add missing EXPORT_SYMBOL_GPL in the thermal core and add back the
           required 'trips' property to the thermal zone DT bindings (Daniel
           Lezcano)
      
         - Prevent the int340x_thermal driver from crashing when a package
           with a buffer of 0 length is returned by an ACPI control method
           evaluated by it (Lee, Chun-Yi)"
      
      * tag 'thermal-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        thermal/int340x_thermal: handle data_vault when the value is ZERO_SIZE_PTR
        dt-bindings: thermal: Fix missing required property
        thermal/core: Add missing EXPORT_SYMBOL_GPL
      10d4879f
    • Linus Torvalds's avatar
      Merge tag 'pm-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · b98f602d
      Linus Torvalds authored
      Pull power management fix from Rafael Wysocki:
       "Make __resolve_freq() check the presence of the frequency table
        instead of checking whether or not the ->target_index() callback is
        implemented by the driver, because that need not be the case when
        __resolve_freq() is used (Lukasz Luba)"
      
      * tag 'pm-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        cpufreq: check only freq_table in __resolve_freq()
      b98f602d
    • Linus Torvalds's avatar
      Merge tag 'acpi-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 2b1ddb59
      Linus Torvalds authored
      Pull ACPI fixes from Rafael Wysocki:
       "These fix issues introduced by recent changes related to the handling
        of ACPI device properties and a coding mistake in the exit path of the
        ACPI processor driver.
      
        Specifics:
      
         - Prevent acpi_thermal_cpufreq_exit() from attempting to remove
           the same frequency QoS request multiple times (Riwen Lu)
      
         - Fix type detection for integer ACPI device properties (Stefan
           Binding)
      
         - Avoid emitting false-positive warnings when processing ACPI
           device properties and drop the useless default case from the
           acpi_copy_property_array_uint() macro (Sakari Ailus)"
      
      * tag 'acpi-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI: property: Remove default association from integer maximum values
        ACPI: property: Ignore already existing data node tags
        ACPI: property: Fix type detection of unified integer reading functions
        ACPI: processor: Remove freq Qos request for all CPUs
      2b1ddb59