1. 08 Jul, 2021 38 commits
    • scripts/decode_stacktrace.sh: support debuginfod · 26681eb3
      Stephen Boyd authored
      Now that stacktraces contain the build ID information we can update this
      script to use debuginfod-find to locate the debuginfo for the vmlinux and
      modules automatically.  This can replace the existing code that requires
      specifying a path to vmlinux or tries to find the vmlinux and modules
      automatically by using the release number.  Work it into the script as a
      fallback option if the vmlinux isn't specified on the commandline.
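
As a rough illustration of the fallback described above (not the script's actual code), the sketch below builds a `debuginfod-find debuginfo <build-id>` invocation from a build ID. The helper names `debuginfod_find_cmd` and `fetch_debuginfo` are hypothetical, and the sketch assumes the `debuginfod-find` client from elfutils is installed and configured to reach a debuginfod server.

```python
import re
import shutil
import subprocess
from typing import List, Optional

_HEX_RE = re.compile(r"^[0-9a-f]+$")


def debuginfod_find_cmd(build_id: str) -> List[str]:
    """Return the argv that looks up debuginfo by build ID (hypothetical helper)."""
    build_id = build_id.strip().lower()
    if not _HEX_RE.match(build_id):
        raise ValueError(f"not a hex build ID: {build_id!r}")
    return ["debuginfod-find", "debuginfo", build_id]


def fetch_debuginfo(build_id: str) -> Optional[str]:
    """Run debuginfod-find; return the path it prints, or None on any failure."""
    if shutil.which("debuginfod-find") is None:
        return None  # assumption: treat a missing client as "not found"
    result = subprocess.run(
        debuginfod_find_cmd(build_id), capture_output=True, text=True
    )
    return result.stdout.strip() if result.returncode == 0 else None
```

A caller would try an explicitly supplied vmlinux path first and only call something like `fetch_debuginfo()` when none was given, mirroring the "fallback option" wording above.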
      
      Link: https://lkml.kernel.org/r/20210511003845.2429846-9-swboyd@chromium.org
      Signed-off-by: Stephen Boyd <swboyd@chromium.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Jessica Yu <jeyu@kernel.org>
      Cc: Evan Green <evgreen@chromium.org>
      Cc: Hsin-Yi Wang <hsinyi@chromium.org>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Sasha Levin <sashal@kernel.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • x86/dumpstack: use %pSb/%pBb for backtrace printing · 9ef8af2a
      Stephen Boyd authored
      Let's use the new printk formats to print the stacktrace entries when
      printing a backtrace to the kernel logs.  This will include any module's
      build ID[1] in it so that offline/crash debugging can easily locate the
      debuginfo for a module via something like debuginfod[2].
      
      Link: https://lkml.kernel.org/r/20210511003845.2429846-8-swboyd@chromium.org
      Link: https://fedoraproject.org/wiki/Releases/FeatureBuildId [1]
      Link: https://sourceware.org/elfutils/Debuginfod.html [2]
      Signed-off-by: Stephen Boyd <swboyd@chromium.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Jessica Yu <jeyu@kernel.org>
      Cc: Evan Green <evgreen@chromium.org>
      Cc: Hsin-Yi Wang <hsinyi@chromium.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Sasha Levin <sashal@kernel.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • arm64: stacktrace: use %pSb for backtrace printing · f61b8706
      Stephen Boyd authored
      Let's use the new printk format to print the stacktrace entry when
      printing a backtrace to the kernel logs. This will include any module's
      build ID[1] in it so that offline/crash debugging can easily locate the
      debuginfo for a module via something like debuginfod[2].
      
      Link: https://lkml.kernel.org/r/20210511003845.2429846-7-swboyd@chromium.org
      Link: https://fedoraproject.org/wiki/Releases/FeatureBuildId [1]
      Link: https://sourceware.org/elfutils/Debuginfod.html [2]
      Signed-off-by: Stephen Boyd <swboyd@chromium.org>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Jessica Yu <jeyu@kernel.org>
      Cc: Evan Green <evgreen@chromium.org>
      Cc: Hsin-Yi Wang <hsinyi@chromium.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Sasha Levin <sashal@kernel.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • module: add printk formats to add module build ID to stacktraces · 9294523e
      Stephen Boyd authored
      Let's make kernel stacktraces easier to identify by including the build
      ID[1] of a module if the stacktrace is printing a symbol from a module.
      This makes it simpler for developers to locate a kernel module's full
      debuginfo for a particular stacktrace.  Combined with
      scripts/decode_stacktrace.sh, a developer can download the matching
      debuginfo from a debuginfod[2] server and find the exact file and line
      number for the functions plus offsets in a stacktrace that match the
      module.  This is especially useful for pstore crash debugging where the
      kernel crashes are recorded in something like console-ramoops and the
      recovery kernel/modules are different or the debuginfo doesn't exist on
      the device due to space concerns (the debuginfo can be too large for space
      limited devices).
      
      Originally, I put this on the %pS format, but that was quickly rejected
      given that %pS is used in other places such as ftrace where build IDs
      aren't meaningful.  There were some discussions on the list to put every
      module build ID into the "Modules linked in:" section of the stacktrace
      message but that quickly becomes very hard to read once you have more than
      three or four modules linked in.  It also provides too much information
      when we don't expect each module to be traversed in a stacktrace.  Having
      the build ID for modules that aren't important just makes things messy.
      Splitting it to multiple lines for each module quickly explodes the number
      of lines printed in an oops too, possibly wrapping the warning off the
      console.  And finally, trying to stash away each module used in a
      callstack to provide the ID of each symbol printed is cumbersome and would
      require changes to each architecture to stash away modules and return
      their build IDs once unwinding has completed.
      
      Instead, we opt for the simpler approach of introducing new printk formats
      '%pS[R]b' for "pointer symbolic backtrace with module build ID" and '%pBb'
      for "pointer backtrace with module build ID" and then updating the few
      places in the architecture layer where the stacktrace is printed to use
      this new format.
      
      Before:
      
       Call trace:
        lkdtm_WARNING+0x28/0x30 [lkdtm]
        direct_entry+0x16c/0x1b4 [lkdtm]
        full_proxy_write+0x74/0xa4
        vfs_write+0xec/0x2e8
      
      After:
      
       Call trace:
        lkdtm_WARNING+0x28/0x30 [lkdtm 6c2215028606bda50de823490723dc4bc5bf46f9]
        direct_entry+0x16c/0x1b4 [lkdtm 6c2215028606bda50de823490723dc4bc5bf46f9]
        full_proxy_write+0x74/0xa4
        vfs_write+0xec/0x2e8
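
To show how a post-processing tool might consume the "After" format, here is a minimal sketch that splits a call-trace line into symbol, offset, size, module and build ID. The function name `parse_trace_line` and the regular expression are assumptions made for illustration; they are not taken from decode_stacktrace.sh.

```python
import re
from typing import Dict, Optional

# symbol+0xOFF/0xSIZE, optionally followed by "[module]" or "[module buildid]"
_TRACE_RE = re.compile(
    r"^(?P<symbol>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
    r"(?: \[(?P<module>[^\s\]]+)(?: (?P<build_id>[0-9a-f]+))?\])?$"
)


def parse_trace_line(line: str) -> Optional[Dict[str, Optional[str]]]:
    """Parse one call-trace entry; return None if the line does not match."""
    match = _TRACE_RE.match(line.strip())
    return match.groupdict() if match else None
```

Entries for built-in symbols come back with `module` and `build_id` set to `None`, so a caller can fall back to the vmlinux build ID for those frames.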
      
      [akpm@linux-foundation.org: fix build with CONFIG_MODULES=n, tweak code layout]
      [rdunlap@infradead.org: fix build when CONFIG_MODULES is not set]
        Link: https://lkml.kernel.org/r/20210513171510.20328-1-rdunlap@infradead.org
      [akpm@linux-foundation.org: make kallsyms_lookup_buildid() static]
      [cuibixuan@huawei.com: fix build error when CONFIG_SYSFS is disabled]
        Link: https://lkml.kernel.org/r/20210525105049.34804-1-cuibixuan@huawei.com
      
      Link: https://lkml.kernel.org/r/20210511003845.2429846-6-swboyd@chromium.org
      Link: https://fedoraproject.org/wiki/Releases/FeatureBuildId [1]
      Link: https://sourceware.org/elfutils/Debuginfod.html [2]
      Signed-off-by: Stephen Boyd <swboyd@chromium.org>
      Signed-off-by: Bixuan Cui <cuibixuan@huawei.com>
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Jessica Yu <jeyu@kernel.org>
      Cc: Evan Green <evgreen@chromium.org>
      Cc: Hsin-Yi Wang <hsinyi@chromium.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Sasha Levin <sashal@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • dump_stack: add vmlinux build ID to stack traces · 22f4e66d
      Stephen Boyd authored
      Add the running kernel's build ID[1] to the stacktrace information header.
      This makes it simpler for developers to locate the vmlinux with full
      debuginfo for a particular kernel stacktrace.  Combined with
      scripts/decode_stacktrace.sh, a developer can download the correct
      vmlinux from a debuginfod[2] server and find the exact file and line
      number for the functions plus offsets in a stacktrace.
      
      This is especially useful for pstore crash debugging where the kernel
      crashes are recorded in the pstore logs and the recovery kernel is
      different or the debuginfo doesn't exist on the device due to space
      concerns (the data can be large and a security concern).  The stacktrace
      can be analyzed after the crash by using the build ID to find the matching
      vmlinux and understand where in the function something went wrong.
      
      Example stacktrace from lkdtm:
      
       WARNING: CPU: 4 PID: 3255 at drivers/misc/lkdtm/bugs.c:83 lkdtm_WARNING+0x28/0x30 [lkdtm]
       Modules linked in: lkdtm rfcomm algif_hash algif_skcipher af_alg xt_cgroup uinput xt_MASQUERADE
       CPU: 4 PID: 3255 Comm: bash Not tainted 5.11 #3 aa23f7a1231c229de205662d5a9e0d4c580f19a1
       Hardware name: Google Lazor (rev3+) with KB Backlight (DT)
       pstate: 00400009 (nzcv daif +PAN -UAO -TCO BTYPE=--)
       pc : lkdtm_WARNING+0x28/0x30 [lkdtm]
      
      The hex string aa23f7a1231c229de205662d5a9e0d4c580f19a1 is the build ID,
      following the kernel version number. Put it all behind a config option,
      STACKTRACE_BUILD_ID, so that kernel developers can remove this
      information if they decide it is too much.
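
A small sketch of how a log-processing tool could pick the vmlinux build ID out of the header shown above. The function name `vmlinux_build_id` is hypothetical, and the pattern assumes a 40-digit (SHA-1 style) build ID at the end of the "CPU: ... Comm: ..." line; other build ID lengths would need a looser pattern.

```python
import re
from typing import Optional

# Assumption: the build ID is the last token on the "CPU: ... Comm: ..." line
# and is 40 hex digits long, as in the example trace.
_HEADER_RE = re.compile(r"\bCPU: \d+ PID: \d+ Comm: .* ([0-9a-f]{40})\s*$")


def vmlinux_build_id(log: str) -> Optional[str]:
    """Return the first build ID found in a stacktrace header, else None."""
    for line in log.splitlines():
        match = _HEADER_RE.search(line)
        if match:
            return match.group(1)
    return None
```

With STACKTRACE_BUILD_ID disabled the header carries no trailing hex string, and the function simply returns `None`.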
      
      Link: https://lkml.kernel.org/r/20210511003845.2429846-5-swboyd@chromium.org
      Link: https://fedoraproject.org/wiki/Releases/FeatureBuildId [1]
      Link: https://sourceware.org/elfutils/Debuginfod.html [2]
      Signed-off-by: Stephen Boyd <swboyd@chromium.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Jessica Yu <jeyu@kernel.org>
      Cc: Evan Green <evgreen@chromium.org>
      Cc: Hsin-Yi Wang <hsinyi@chromium.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Sasha Levin <sashal@kernel.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • buildid: stash away kernels build ID on init · 83cc6fa0
      Stephen Boyd authored
      Parse the kernel's build ID at initialization so that other code can print
      a hex format string representation of the running kernel's build ID.  This
      will be used in the kdump and dump_stack code so that developers can
      easily locate the vmlinux debug symbols for a crash/stacktrace.
      
      [swboyd@chromium.org: fix implicit declaration of init_vmlinux_build_id()]
        Link: https://lkml.kernel.org/r/CAE-0n51UjTbay8N9FXAyE7_aR2+ePrQnKSRJ0gbmRsXtcLBVaw@mail.gmail.com
      
      Link: https://lkml.kernel.org/r/20210511003845.2429846-4-swboyd@chromium.org
      Signed-off-by: Stephen Boyd <swboyd@chromium.org>
      Acked-by: Baoquan He <bhe@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Jessica Yu <jeyu@kernel.org>
      Cc: Evan Green <evgreen@chromium.org>
      Cc: Hsin-Yi Wang <hsinyi@chromium.org>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Sasha Levin <sashal@kernel.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • buildid: add API to parse build ID out of buffer · 7eaf3cf3
      Stephen Boyd authored
      Add an API that can parse the build ID out of a buffer, instead of a vma,
      to support printing a kernel module's build ID for stack traces.
      
      Link: https://lkml.kernel.org/r/20210511003845.2429846-3-swboyd@chromium.org
      Signed-off-by: Stephen Boyd <swboyd@chromium.org>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Jessica Yu <jeyu@kernel.org>
      Cc: Evan Green <evgreen@chromium.org>
      Cc: Hsin-Yi Wang <hsinyi@chromium.org>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Sasha Levin <sashal@kernel.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • buildid: only consider GNU notes for build ID parsing · a010d79b
      Stephen Boyd authored
      Patch series "Add build ID to stacktraces", v6.
      
      This series adds the kernel's build ID[1] to the stacktrace header printed
      in oops messages, warnings, etc., and the build ID for any module that
      appears in the stacktrace after the module name.  The goal is to make the
      stacktrace more self-contained and descriptive by including the relevant
      build IDs in the kernel logs when something goes wrong.  This can be used
      by post processing tools like scripts/decode_stacktrace.sh and kernel
      developers to easily locate the debug info associated with a kernel crash
      and pinpoint the file and line where things started falling apart.
      
      To show how this can be used I've included a patch to decode_stacktrace.sh
      that downloads the debuginfo from a debuginfod server.  This also includes
      some patches to make the buildid.c file use more const arguments and
      consolidate logic into buildid.c from kdump.  These are left to the end as
      they were mostly cleanup patches.
      
      Here's an example lkdtm stacktrace on arm64.
      
       WARNING: CPU: 4 PID: 3255 at drivers/misc/lkdtm/bugs.c:83 lkdtm_WARNING+0x28/0x30 [lkdtm]
       Modules linked in: lkdtm rfcomm algif_hash algif_skcipher af_alg xt_cgroup uinput xt_MASQUERADE
       CPU: 4 PID: 3255 Comm: bash Not tainted 5.11 #3 aa23f7a1231c229de205662d5a9e0d4c580f19a1
       Hardware name: Google Lazor (rev3+) with KB Backlight (DT)
       pstate: 00400009 (nzcv daif +PAN -UAO -TCO BTYPE=--)
       pc : lkdtm_WARNING+0x28/0x30 [lkdtm]
       lr : lkdtm_do_action+0x24/0x40 [lkdtm]
       sp : ffffffc0134fbca0
       x29: ffffffc0134fbca0 x28: ffffff92d53ba240
       x27: 0000000000000000 x26: 0000000000000000
       x25: 0000000000000000 x24: ffffffe3622352c0
       x23: 0000000000000020 x22: ffffffe362233366
       x21: ffffffe3622352e0 x20: ffffffc0134fbde0
       x19: 0000000000000008 x18: 0000000000000000
       x17: ffffff929b6536fc x16: 0000000000000000
       x15: 0000000000000000 x14: 0000000000000012
       x13: ffffffe380ed892c x12: ffffffe381d05068
       x11: 0000000000000000 x10: 0000000000000000
       x9 : 0000000000000001 x8 : ffffffe362237000
       x7 : aaaaaaaaaaaaaaaa x6 : 0000000000000000
       x5 : 0000000000000000 x4 : 0000000000000001
       x3 : 0000000000000008 x2 : ffffff93fef25a70
       x1 : ffffff93fef15788 x0 : ffffffe3622352e0
       Call trace:
        lkdtm_WARNING+0x28/0x30 [lkdtm ed5019fdf5e53be37cb1ba7899292d7e143b259e]
        direct_entry+0x16c/0x1b4 [lkdtm ed5019fdf5e53be37cb1ba7899292d7e143b259e]
        full_proxy_write+0x74/0xa4
        vfs_write+0xec/0x2e8
        ksys_write+0x84/0xf0
        __arm64_sys_write+0x24/0x30
        el0_svc_common+0xf4/0x1c0
        do_el0_svc_compat+0x28/0x3c
        el0_svc_compat+0x10/0x1c
        el0_sync_compat_handler+0xa8/0xcc
        el0_sync_compat+0x178/0x180
       ---[ end trace 3d95032303e59e68 ]---
      
      This patch (of 13):
      
      Some kernel elf files have various notes that also happen to have an elf
      note type of '3', which matches NT_GNU_BUILD_ID but the note name isn't
      "GNU".  For example, this note trips up the existing logic:
      
       Owner  Data size   Description
       Xen    0x00000008  Unknown note type: (0x00000003) description data: 00 00 00 ffffff80 ffffffff ffffffff ffffffff ffffffff
      
      Let's make sure that it is a GNU note when parsing the build ID so that we
      can use this function to parse a vmlinux's build ID too.
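
The check described above can be sketched in userspace as follows. This is an illustration of the idea, not the kernel's lib/buildid.c code: the function name `find_gnu_build_id` is hypothetical, and the buffer is assumed to hold little-endian ELF notes, each with a 12-byte header (name size, descriptor size, type) followed by a 4-byte-aligned name and descriptor.

```python
import struct
from typing import Optional

NT_GNU_BUILD_ID = 3  # note type shared by unrelated notes such as the Xen one


def _align4(n: int) -> int:
    """Round n up to the next multiple of 4, as ELF note fields are padded."""
    return (n + 3) & ~3


def find_gnu_build_id(notes: bytes) -> Optional[str]:
    """Return the hex build ID from the first type-3 note named "GNU"."""
    off = 0
    while off + 12 <= len(notes):
        namesz, descsz, ntype = struct.unpack_from("<III", notes, off)
        off += 12
        name = notes[off:off + namesz]
        off += _align4(namesz)
        desc = notes[off:off + descsz]
        off += _align4(descsz)
        if len(name) != namesz or len(desc) != descsz:
            return None  # truncated buffer
        # Matching on the type alone is what tripped on the Xen note;
        # requiring the owner name "GNU" as well avoids the false positive.
        if ntype == NT_GNU_BUILD_ID and name == b"GNU\0":
            return desc.hex()
    return None
```

Given a buffer that holds a "Xen" note of type 3 followed by a real GNU build ID note, the function skips the first and returns the ID from the second.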
      
      Link: https://lkml.kernel.org/r/20210511003845.2429846-1-swboyd@chromium.org
      Link: https://lkml.kernel.org/r/20210511003845.2429846-2-swboyd@chromium.org
      Fixes: bd7525da ("bpf: Move stack_map_get_build_id into lib")
      Signed-off-by: Stephen Boyd <swboyd@chromium.org>
      Reported-by: Petr Mladek <pmladek@suse.com>
      Tested-by: Petr Mladek <pmladek@suse.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Jessica Yu <jeyu@kernel.org>
      Cc: Evan Green <evgreen@chromium.org>
      Cc: Hsin-Yi Wang <hsinyi@chromium.org>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Sasha Levin <sashal@kernel.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • x86: convert to setup_initial_init_mm() · 30120d72
      Kefeng Wang authored
      Use setup_initial_init_mm() helper to simplify code.
      
      Link: https://lkml.kernel.org/r/20210608083418.137226-16-wangkefeng.wang@huawei.com
      Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • sh: convert to setup_initial_init_mm() · f7cce365
      Kefeng Wang authored
      Use setup_initial_init_mm() helper to simplify code.
      
      Link: https://lkml.kernel.org/r/20210608083418.137226-15-wangkefeng.wang@huawei.com
      Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Rich Felker <dalias@libc.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • s390: convert to setup_initial_init_mm() · 638cd5a3
      Kefeng Wang authored
      Use setup_initial_init_mm() helper to simplify code.
      
      Link: https://lkml.kernel.org/r/20210608083418.137226-14-wangkefeng.wang@huawei.com
      Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • riscv: convert to setup_initial_init_mm() · 723a42f4
      Kefeng Wang authored
      Use setup_initial_init_mm() helper to simplify code.
      
      Link: https://lkml.kernel.org/r/20210608083418.137226-13-wangkefeng.wang@huawei.com
      Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • powerpc: convert to setup_initial_init_mm() · 6cd7547b
      Kefeng Wang authored
      Use setup_initial_init_mm() helper to simplify code.
      
      Link: https://lkml.kernel.org/r/20210608083418.137226-12-wangkefeng.wang@huawei.com
      Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Souptick Joarder <jrdr.linux@gmail.com>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • openrisc: convert to setup_initial_init_mm() · 20f2eccf
      Kefeng Wang authored
      Use setup_initial_init_mm() helper to simplify code.
      
      Link: https://lkml.kernel.org/r/20210608083418.137226-11-wangkefeng.wang@huawei.com
      Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Acked-by: Stafford Horne <shorne@gmail.com>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
      Cc: Stafford Horne <shorne@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • nios2: convert to setup_initial_init_mm() · 4154267a
      Kefeng Wang authored
      Use setup_initial_init_mm() helper to simplify code.
      
      Link: https://lkml.kernel.org/r/20210608083418.137226-10-wangkefeng.wang@huawei.com
      Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: Ley Foon Tan <ley.foon.tan@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • nds32: convert to setup_initial_init_mm() · de26fb41
      Kefeng Wang authored
      Use setup_initial_init_mm() helper to simplify code.
      
      Link: https://lkml.kernel.org/r/20210608083418.137226-9-wangkefeng.wang@huawei.com
      Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: Nick Hu <nickhu@andestech.com>
      Cc: Greentime Hu <green.hu@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • h8300: convert to setup_initial_init_mm() · 9772bdef
      Kefeng Wang authored
      Use setup_initial_init_mm() helper to simplify code.
      
      Link: https://lkml.kernel.org/r/20210608083418.137226-7-wangkefeng.wang@huawei.com
      Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • arm64: convert to setup_initial_init_mm() · 29ffbca1
      Kefeng Wang authored
      Use setup_initial_init_mm() helper to simplify code.
      
      Link: https://lkml.kernel.org/r/20210608083418.137226-5-wangkefeng.wang@huawei.com
      Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • arc: convert to setup_initial_init_mm() · 8e339d50
      Kefeng Wang authored
      Use setup_initial_init_mm() helper to simplify code.
      
      Link: https://lkml.kernel.org/r/20210608083418.137226-3-wangkefeng.wang@huawei.com
      Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Acked-by: Vineet Gupta <vgupta@synopsys.com> [arch/arc]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: add setup_initial_init_mm() helper · 5748fbc5
      Kefeng Wang authored
      Patch series "init_mm: cleanup ARCH's text/data/brk setup code", v3.
      
      Add setup_initial_init_mm() helper, then use it to clean up the text, data
      and brk setup code.
      
      This patch (of 15):
      
      Add setup_initial_init_mm() helper to set up kernel text, data and brk.
      
      Link: https://lkml.kernel.org/r/20210608083418.137226-1-wangkefeng.wang@huawei.com
      Link: https://lkml.kernel.org/r/20210608083418.137226-2-wangkefeng.wang@huawei.com
      Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: Souptick Joarder <jrdr.linux@gmail.com>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Heiko Carstens <hca@linux.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Ley Foon Tan <ley.foon.tan@intel.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Nick Hu <nickhu@andestech.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: fix spelling mistakes in header files · 06c88398
      Zhen Lei authored
      Fix some spelling mistakes in comments:
      successfull ==> successful
      potentialy ==> potentially
      alloced ==> allocated
      indicies ==> indices
      wont ==> won't
      resposible ==> responsible
      dirtyness ==> dirtiness
      droppped ==> dropped
      alread ==> already
      occured ==> occurred
      interupts ==> interrupts
      extention ==> extension
      slighly ==> slightly
      Dont't ==> Don't
      
      Link: https://lkml.kernel.org/r/20210531034849.9549-2-thunder.leizhen@huawei.com
      Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Christoph Lameter <cl@linux.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      06c88398
    • Mike Rapoport's avatar
      secretmem: test: add basic selftest for memfd_secret(2) · 76fe17ef
      Mike Rapoport authored
      The test verifies that a file descriptor created with memfd_secret() does
      not allow read/write operations, that secret memory mappings respect
      RLIMIT_MEMLOCK and that remote accesses with process_vm_readv() and
      ptrace() to the secret memory fail.
      
      Link: https://lkml.kernel.org/r/20210518072034.31572-8-rppt@kernel.org
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christopher Lameter <cl@linux.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Elena Reshetova <elena.reshetova@intel.com>
      Cc: Hagen Paul Pfeifer <hagen@jauu.net>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Bottomley <jejb@linux.ibm.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Palmer Dabbelt <palmerdabbelt@google.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tycho Andersen <tycho@tycho.ws>
      Cc: Will Deacon <will@kernel.org>
      Cc: kernel test robot <lkp@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      76fe17ef
    • Mike Rapoport's avatar
      arch, mm: wire up memfd_secret system call where relevant · 7bb7f2ac
      Mike Rapoport authored
      Wire up memfd_secret system call on architectures that define
      ARCH_HAS_SET_DIRECT_MAP, namely arm64, risc-v and x86.
      
      Link: https://lkml.kernel.org/r/20210518072034.31572-7-rppt@kernel.org
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Acked-by: Palmer Dabbelt <palmerdabbelt@google.com>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: David Hildenbrand <david@redhat.com>
      Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Christopher Lameter <cl@linux.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Elena Reshetova <elena.reshetova@intel.com>
      Cc: Hagen Paul Pfeifer <hagen@jauu.net>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Bottomley <jejb@linux.ibm.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tycho Andersen <tycho@tycho.ws>
      Cc: Will Deacon <will@kernel.org>
      Cc: kernel test robot <lkp@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7bb7f2ac
    • Mike Rapoport's avatar
      PM: hibernate: disable when there are active secretmem users · 9a436f8f
      Mike Rapoport authored
      It is unsafe to save secretmem areas to the hibernation snapshot: they
      would be visible after resume, which would essentially defeat the
      purpose of secret memory mappings.
      
      Prevent hibernation whenever there are active secret memory users.
      
      Link: https://lkml.kernel.org/r/20210518072034.31572-6-rppt@kernel.org
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Acked-by: David Hildenbrand <david@redhat.com>
      Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christopher Lameter <cl@linux.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Elena Reshetova <elena.reshetova@intel.com>
      Cc: Hagen Paul Pfeifer <hagen@jauu.net>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Bottomley <jejb@linux.ibm.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Palmer Dabbelt <palmerdabbelt@google.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tycho Andersen <tycho@tycho.ws>
      Cc: Will Deacon <will@kernel.org>
      Cc: kernel test robot <lkp@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9a436f8f
    • Mike Rapoport's avatar
      mm: introduce memfd_secret system call to create "secret" memory areas · 1507f512
      Mike Rapoport authored
      Introduce the "memfd_secret" system call with the ability to create memory
      areas that are visible only in the context of the owning process and are
      mapped neither into other processes nor into the kernel page tables.
      
      The secretmem feature is off by default and the user must explicitly
      enable it at boot time.
      
      Once secretmem is enabled, the user will be able to create a file
      descriptor using the memfd_secret() system call.  The memory areas created
      by mmap() calls from this file descriptor will be unmapped from the kernel
      direct map and they will be only mapped in the page table of the processes
      that have access to the file descriptor.
      
      Secretmem is designed to provide the following protections:
      
      * Enhanced protection (in conjunction with all the other in-kernel
        attack prevention systems) against ROP attacks.  Secretmem makes
        "simple" ROP insufficient to perform exfiltration, which increases the
        required complexity of the attack.  Along with other protections like
        the kernel stack size limit and address space layout randomization which
        make finding gadgets really hard, the absence of any in-kernel primitive
        for accessing secret memory means the one gadget ROP attack can't work.
        Since the only way to access secret memory is to reconstruct the missing
        mapping entry, the attacker has to recover the physical page and insert
        a PTE pointing to it in the kernel and then retrieve the contents.  That
        takes at least three gadgets which is a level of difficulty beyond most
        standard attacks.
      
      * Prevent cross-process secret userspace memory exposures.  Once the
        secret memory is allocated, the user can't accidentally pass it into the
        kernel to be transmitted somewhere.  The secretmem pages cannot be
        accessed via the direct map and they are disallowed in GUP.
      
      * Harden against exploited kernel flaws.  In order to access secretmem,
        a kernel-side attack would need to either walk the page tables and
        create new ones, or spawn a new privileged userspace process to perform
        secrets exfiltration using ptrace.
      
      The file descriptor based memory has several advantages over the
      "traditional" mm interfaces, such as mlock(), mprotect(), madvise().  The
      file descriptor approach allows explicit and controlled sharing of the
      memory areas and allows the operations to be sealed.  Besides, file
      descriptor based memory paves the way for VMMs to remove the secret memory
      range from the userspace hypervisor process, for instance QEMU.  Andy
      Lutomirski says:
      
        "Getting fd-backed memory into a guest will take some possibly major
        work in the kernel, but getting vma-backed memory into a guest without
        mapping it in the host user address space seems much, much worse."
      
      memfd_secret() is made a dedicated system call rather than an extension to
      memfd_create() because its purpose is to allow the user to create more
      secure memory mappings rather than to simply allow file based access to
      the memory.  Nowadays the cost of a new system call is negligible, while it
      is way simpler for userspace to deal with clear-cut system calls than with
      a multiplexer or an overloaded syscall.  Moreover, the initial
      implementation of memfd_secret() is completely distinct from
      memfd_create(), so there is not much sense in overloading memfd_create() to
      begin with.  If code sharing between these implementations is ever needed,
      it can be easily achieved without adjusting user visible APIs.
      
      The secret memory remains accessible in the process context using uaccess
      primitives, but it is not exposed to the kernel otherwise; secret memory
      areas are removed from the direct map and functions in the
      follow_page()/get_user_page() family will refuse to return a page that
      belongs to the secret memory area.
      
      If a use case arises that requires exposing secretmem to the kernel, it
      will be an opt-in request in the system call flags so that the user has
      to decide what data can be exposed to the kernel.
      
      Removing pages from the direct map may fragment it on architectures that
      use large pages to map the physical memory, which affects system
      performance.  However, the original Kconfig text for
      CONFIG_DIRECT_GBPAGES said that gigabyte pages in the direct map "...  can
      improve the kernel's performance a tiny bit ..." (commit 00d1c5e0
      ("x86: add gbpages switches")) and the recent report [1] showed that "...
      although 1G mappings are a good default choice, there is no compelling
      evidence that it must be the only choice".  Hence, it is sufficient to
      have secretmem disabled by default with the ability of a system
      administrator to enable it at boot time.
      
      Pages in the secretmem regions are unevictable and unmovable to avoid
      accidental exposure of the sensitive data via swap or during page
      migration.
      
      Since the secretmem mappings are locked in memory they cannot exceed
      RLIMIT_MEMLOCK.  Since these mappings are already locked independently
      from mlock(), an attempt to mlock()/munlock() a secretmem range will fail
      and mlockall()/munlockall() will ignore secretmem mappings.
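
      As a rough illustration of the RLIMIT_MEMLOCK constraint, a caller could
      compare a requested mapping size against the soft limit before trying to
      create the mapping.  This is only a sketch: fits_memlock_limit() is a
      hypothetical helper, not part of this patch, and it ignores memory that
      the process has already locked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/resource.h>

/*
 * Hypothetical helper (not from the patch): report whether a locked
 * mapping of len bytes could fit under the soft RLIMIT_MEMLOCK limit.
 * Memory that is already locked by the process is not accounted for.
 */
static bool fits_memlock_limit(size_t len)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
		return false;	/* limit unknown: be conservative */

	return rl.rlim_cur == RLIM_INFINITY || len <= rl.rlim_cur;
}
```

      A caller would typically check fits_memlock_limit(MAP_SIZE) first and
      fall back to ordinary memory when it returns false.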
      
      However, unlike mlock()ed memory, secretmem currently behaves more like
      long-term GUP: secretmem mappings are unmovable mappings directly consumed
      by user space.  With default limits, there is no excessive use of
      secretmem and it poses no real problem in combination with
      ZONE_MOVABLE/CMA, but in the future this should be addressed to allow
      balanced use of large amounts of secretmem along with ZONE_MOVABLE/CMA.
      
      A page that was a part of the secret memory area is cleared when it is
      freed to ensure the data is not exposed to the next user of that page.
      
      The following example demonstrates creation of a secret mapping (error
      handling is omitted):
      
      	fd = memfd_secret(0);
      	ftruncate(fd, MAP_SIZE);
      	ptr = mmap(NULL, MAP_SIZE, PROT_READ | PROT_WRITE,
      		   MAP_SHARED, fd, 0);
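
      The same sequence with error handling might look like the sketch below.
      It assumes the C library has no memfd_secret() wrapper and goes through
      syscall(2) instead; secret_map() is a hypothetical helper name, and the
      fallback syscall number 447 is an assumption that only applies when the
      installed headers predate the system call.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: used only when the installed headers lack the definition. */
#ifndef SYS_memfd_secret
#define SYS_memfd_secret 447
#endif

/*
 * Hypothetical helper: create a secret mapping of len bytes.
 * Returns the mapping, or NULL with errno set (for example ENOSYS when
 * secretmem is not available or not enabled at boot).
 */
static void *secret_map(size_t len)
{
	void *ptr;
	int saved;
	int fd;

	assert(len > 0);

	fd = (int)syscall(SYS_memfd_secret, 0U);
	if (fd < 0)
		return NULL;

	if (ftruncate(fd, (off_t)len) < 0) {
		saved = errno;
		close(fd);
		errno = saved;
		return NULL;
	}

	ptr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	saved = errno;
	close(fd);	/* the mapping stays valid after the fd is closed */
	if (ptr == MAP_FAILED) {
		errno = saved;
		return NULL;
	}
	return ptr;
}
```

      Returning NULL on any failure lets the caller fall back to ordinary
      memory on kernels where secretmem is disabled.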
      
      [1] https://lore.kernel.org/linux-mm/213b4567-46ce-f116-9cdf-bbd0c884eb3c@linux.intel.com/
      
      [akpm@linux-foundation.org: suppress Kconfig whine]
      
      Link: https://lkml.kernel.org/r/20210518072034.31572-5-rppt@kernel.org
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Acked-by: Hagen Paul Pfeifer <hagen@jauu.net>
      Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christopher Lameter <cl@linux.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Elena Reshetova <elena.reshetova@intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Bottomley <jejb@linux.ibm.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Palmer Dabbelt <palmerdabbelt@google.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tycho Andersen <tycho@tycho.ws>
      Cc: Will Deacon <will@kernel.org>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: kernel test robot <lkp@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1507f512
    • Mike Rapoport's avatar
      set_memory: allow querying whether set_direct_map_*() is actually enabled · 6d47c23b
      Mike Rapoport authored
      On arm64, set_direct_map_*() functions may return 0 without actually
      changing the linear map.  This behaviour can be controlled using kernel
      parameters, so we need a way to determine at runtime whether calls to
      set_direct_map_invalid_noflush() and set_direct_map_default_noflush() have
      any effect.
      
      Extend set_memory API with can_set_direct_map() function that allows
      checking if calling set_direct_map_*() will actually change the page
      table, replace several occurrences of open coded checks in arm64 with the
      new function and provide a generic stub for architectures that always
      modify page tables upon calls to set_direct_map APIs.
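
      The "generic stub plus architecture override" pattern described above can
      be sketched in plain C as follows.  This is a user-space illustration
      under stated assumptions, not the kernel header itself:
      secretmem_usable() is a hypothetical caller added for the example.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * An architecture that needs a runtime check (arm64 in the commit text)
 * would define can_set_direct_map before this point.  Everyone else gets
 * the generic stub, which says set_direct_map_*() always takes effect.
 */
#ifndef can_set_direct_map
static inline bool can_set_direct_map(void)
{
	return true;
}
#define can_set_direct_map can_set_direct_map
#endif

/*
 * Hypothetical caller: a feature that relies on direct map changes is
 * usable only if it was enabled and the direct map can really be changed.
 */
static inline bool secretmem_usable(bool enabled_at_boot)
{
	return enabled_at_boot && can_set_direct_map();
}
```

      Defining the macro with the same name as the function lets later code
      detect that an override is already in place.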
      
      [arnd@arndb.de: arm64: kfence: fix header inclusion ]
      
      Link: https://lkml.kernel.org/r/20210518072034.31572-4-rppt@kernel.org
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Christopher Lameter <cl@linux.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Elena Reshetova <elena.reshetova@intel.com>
      Cc: Hagen Paul Pfeifer <hagen@jauu.net>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Bottomley <jejb@linux.ibm.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Palmer Dabbelt <palmerdabbelt@google.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tycho Andersen <tycho@tycho.ws>
      Cc: Will Deacon <will@kernel.org>
      Cc: kernel test robot <lkp@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6d47c23b
    • Mike Rapoport's avatar
      riscv/Kconfig: make direct map manipulation options depend on MMU · 10cc3278
      Mike Rapoport authored
      ARCH_HAS_SET_DIRECT_MAP and ARCH_HAS_SET_MEMORY configuration options have
      no meaning when CONFIG_MMU is disabled and there is no point in enabling
      them for the nommu case.
      
      Add an explicit dependency on MMU for these options.
      
      Link: https://lkml.kernel.org/r/20210518072034.31572-3-rppt@kernel.org
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Reported-by: kernel test robot <lkp@intel.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christopher Lameter <cl@linux.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Elena Reshetova <elena.reshetova@intel.com>
      Cc: Hagen Paul Pfeifer <hagen@jauu.net>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Bottomley <jejb@linux.ibm.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Palmer Dabbelt <palmerdabbelt@google.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tycho Andersen <tycho@tycho.ws>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      10cc3278
    • Mike Rapoport's avatar
      mmap: make mlock_future_check() global · 6aeb2542
      Mike Rapoport authored
      Patch series "mm: introduce memfd_secret system call to create "secret" memory areas", v20.
      
      This is an implementation of "secret" mappings backed by a file
      descriptor.
      
      The file descriptor backing secret memory mappings is created using a
      dedicated memfd_secret system call.  The desired protection mode for the
      memory is configured using flags parameter of the system call.  The mmap()
      of the file descriptor created with memfd_secret() will create a "secret"
      memory mapping.  The pages in that mapping will be marked as not present
      in the direct map and will be present only in the page table of the owning
      mm.
      
      Although normally Linux userspace mappings are protected from other users,
      such secret mappings are useful for environments where a hostile tenant is
      trying to trick the kernel into giving them access to other tenants'
      mappings.
      
      It's designed to provide the following protections:
      
      * Enhanced protection (in conjunction with all the other in-kernel
        attack prevention systems) against ROP attacks.  Secretmem makes
        "simple" ROP insufficient to perform exfiltration, which increases the
        required complexity of the attack.  Along with other protections like
        the kernel stack size limit and address space layout randomization which
        make finding gadgets really hard, the absence of any in-kernel primitive
        for accessing secret memory means the one gadget ROP attack can't work.
        Since the only way to access secret memory is to reconstruct the missing
        mapping entry, the attacker has to recover the physical page and insert
        a PTE pointing to it in the kernel and then retrieve the contents.  That
        takes at least three gadgets which is a level of difficulty beyond most
        standard attacks.
      
      * Prevent cross-process secret userspace memory exposures.  Once the
        secret memory is allocated, the user can't accidentally pass it into the
        kernel to be transmitted somewhere.  The secretmem pages cannot be
        accessed via the direct map and they are disallowed in GUP.
      
      * Harden against exploited kernel flaws.  In order to access secretmem,
        a kernel-side attack would need to either walk the page tables and
        create new ones, or spawn a new privileged userspace process to perform
        secrets exfiltration using ptrace.
      
      In the future the secret mappings may be used as a means to protect guest
      memory in a virtual machine host.
      
      For demonstration of secret memory usage we've created a userspace library
      
      https://git.kernel.org/pub/scm/linux/kernel/git/jejb/secret-memory-preloader.git
      
      that does two things: first, it acts as a preloader for openssl to
      redirect all the OPENSSL_malloc calls to secret memory, meaning any secret
      keys get automatically protected this way; second, it exposes the API to
      the user who needs it.  We anticipate that a lot of the
      use cases would be like the openssl one: many toolkits that deal with
      secret keys already have special handling for the memory to try to give
      them greater protection, so this would simply be pluggable into the
      toolkits without any need for user application modification.
      
      Hiding secret memory mappings behind an anonymous file allows usage of the
      page cache for tracking pages allocated for the "secret" mappings as well
      as using address_space_operations for e.g.  page migration callbacks.
      
      The anonymous file may also be used implicitly, like hugetlb files, to
      implement mmap(MAP_SECRET) and use the secret memory areas with "native"
      mm ABIs in the future.
      
      Removing pages from the direct map may fragment it on architectures that
      use large pages to map the physical memory, which affects system
      performance.  However, the original Kconfig text for
      CONFIG_DIRECT_GBPAGES said that gigabyte pages in the direct map "...  can
      improve the kernel's performance a tiny bit ..." (commit 00d1c5e0
      ("x86: add gbpages switches")) and the recent report [1] showed that "...
      although 1G mappings are a good default choice, there is no compelling
      evidence that it must be the only choice".  Hence, it is sufficient to
      have secretmem disabled by default with the ability of a system
      administrator to enable it at boot time.
      
      In addition, there is also a long term goal to improve management of the
      direct map.
      
      [1] https://lore.kernel.org/linux-mm/213b4567-46ce-f116-9cdf-bbd0c884eb3c@linux.intel.com/
      
      This patch (of 7):
      
      Make mlock_future_check() global; it will be used by the upcoming secret
      memory implementation.
      
      Link: https://lkml.kernel.org/r/20210518072034.31572-1-rppt@kernel.org
      Link: https://lkml.kernel.org/r/20210518072034.31572-2-rppt@kernel.org
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christopher Lameter <cl@linux.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Elena Reshetova <elena.reshetova@intel.com>
      Cc: Hagen Paul Pfeifer <hagen@jauu.net>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: James Bottomley <jejb@linux.ibm.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Palmer Dabbelt <palmerdabbelt@google.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rick Edgecombe <rick.p.edgecombe@intel.com>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tycho Andersen <tycho@tycho.ws>
      Cc: Will Deacon <will@kernel.org>
      Cc: kernel test robot <lkp@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6aeb2542
    • Oliver Glitta's avatar
      mm/slub: use stackdepot to save stack trace in objects · 78869146
      Oliver Glitta authored
      Many stack traces are similar so there are many similar arrays.
      Stackdepot saves each unique stack only once.
      
      Replace field addrs in struct track with depot_stack_handle_t handle.  Use
      stackdepot to save stack trace.
      
      The benefits are smaller memory overhead and possibility to aggregate
      per-cache statistics in the future using the stackdepot handle instead of
      matching stacks manually.
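
      The deduplication idea can be illustrated with a toy user-space table: a
      trace is hashed, stored once, and identified afterwards by a small
      handle, so identical traces share one record.  This sketch is not the
      kernel's stackdepot implementation; every toy_* name, the table size and
      the hash are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_FRAMES 16
#define TOY_SLOTS 64	/* must be a power of two */

typedef uint32_t toy_handle_t;	/* 0 means "no trace stored" */

struct toy_record {
	unsigned int nr;	/* 0 marks an empty slot */
	uintptr_t frames[TOY_MAX_FRAMES];
};

static struct toy_record toy_table[TOY_SLOTS];
static unsigned int toy_used;	/* number of unique traces stored */

/* FNV-style mixing over whole return addresses. */
static uint32_t toy_hash(const uintptr_t *frames, unsigned int nr)
{
	uint64_t h = 1469598103934665603ULL;

	for (unsigned int i = 0; i < nr; i++) {
		h ^= (uint64_t)frames[i];
		h *= 1099511628211ULL;
	}
	return (uint32_t)(h ^ (h >> 32));
}

/* Store a trace once; return the same handle for an identical trace. */
static toy_handle_t toy_depot_save(const uintptr_t *frames, unsigned int nr)
{
	if (nr == 0 || nr > TOY_MAX_FRAMES)
		return 0;

	uint32_t start = toy_hash(frames, nr);

	for (uint32_t probe = 0; probe < TOY_SLOTS; probe++) {
		uint32_t idx = (start + probe) & (TOY_SLOTS - 1);
		struct toy_record *rec = &toy_table[idx];

		if (rec->nr == 0) {	/* first time this trace is seen */
			rec->nr = nr;
			memcpy(rec->frames, frames, nr * sizeof(*frames));
			toy_used++;
			return idx + 1;
		}
		if (rec->nr == nr &&
		    memcmp(rec->frames, frames, nr * sizeof(*frames)) == 0)
			return idx + 1;	/* duplicate: reuse the record */
	}
	return 0;	/* table full */
}

/* Look a trace up again from its handle. */
static const struct toy_record *toy_depot_fetch(toy_handle_t handle)
{
	assert(handle <= TOY_SLOTS);
	return handle ? &toy_table[handle - 1] : NULL;
}
```

      An object then only needs to keep the 32-bit handle instead of a full
      array of addresses, which is where the memory saving comes from.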
      
      [rdunlap@infradead.org: rename save_stack_trace()]
        Link: https://lkml.kernel.org/r/20210513051920.29320-1-rdunlap@infradead.org
      [vbabka@suse.cz: fix lockdep splat]
        Link: https://lkml.kernel.org/r/20210516195150.26740-1-vbabka@suse.cz
      Link: https://lkml.kernel.org/r/20210414163434.4376-1-glittao@gmail.com
      Signed-off-by: Oliver Glitta <glittao@gmail.com>
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: David Rientjes <rientjes@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      78869146
    • Nathan Chancellor's avatar
      hexagon: select ARCH_WANT_LD_ORPHAN_WARN · 113616ec
      Nathan Chancellor authored
      Now that we handle all of the sections in a Hexagon defconfig, select
      ARCH_WANT_LD_ORPHAN_WARN so that unhandled sections are warned about by
      default.
      
      Link: https://lkml.kernel.org/r/20210521011239.1332345-4-nathan@kernel.org
      Signed-off-by: Nathan Chancellor <nathan@kernel.org>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
      Acked-by: Brian Cain <bcain@codeaurora.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Oliver Glitta <glittao@gmail.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      113616ec
    • Nathan Chancellor's avatar
      hexagon: use common DISCARDS macro · 681ba73c
      Nathan Chancellor authored
      ld.lld warns that the '.modinfo' section is not currently handled:
      
      ld.lld: warning: kernel/built-in.a(workqueue.o):(.modinfo) is being placed in '.modinfo'
      ld.lld: warning: kernel/built-in.a(printk/printk.o):(.modinfo) is being placed in '.modinfo'
      ld.lld: warning: kernel/built-in.a(irq/spurious.o):(.modinfo) is being placed in '.modinfo'
      ld.lld: warning: kernel/built-in.a(rcu/update.o):(.modinfo) is being placed in '.modinfo'
      
      The '.modinfo' section was added in commit 898490c0 ("moduleparam:
      Save information about built-in modules in separate file") to the DISCARDS
      macro but Hexagon has never used that macro.  The unification of DISCARDS
      happened in commit 023bf6f1 ("linker script: unify usage of discard
      definition") in 2009, prior to Hexagon being added in 2011.
      
      Switch Hexagon over to the DISCARDS macro so that anything that is
      expected to be discarded gets discarded.
      
      Link: https://lkml.kernel.org/r/20210521011239.1332345-3-nathan@kernel.org
      Fixes: e95bf452 ("Hexagon: Add configuration and makefiles for the Hexagon architecture.")
      Signed-off-by: Nathan Chancellor <nathan@kernel.org>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
      Acked-by: Brian Cain <bcain@codeaurora.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Oliver Glitta <glittao@gmail.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      681ba73c
    • Nathan Chancellor's avatar
      hexagon: handle {,SOFT}IRQENTRY_TEXT in linker script · 6fef087d
      Nathan Chancellor authored
      Patch series "hexagon: Fix build error with CONFIG_STACKDEPOT and select CONFIG_ARCH_WANT_LD_ORPHAN_WARN".
      
      This series fixes an error with ARCH=hexagon that was pointed out by the
      patch "mm/slub: use stackdepot to save stack trace in objects".
      
      The first patch fixes that error by handling the '.irqentry.text' and
      '.softirqentry.text' sections.
      
      The second patch switches Hexagon over to the common DISCARDS macro, which
      should have been done when Hexagon was merged into the tree to match
      commit 023bf6f1 ("linker script: unify usage of discard definition").
      
      The third patch selects CONFIG_ARCH_WANT_LD_ORPHAN_WARN so that something
      like this does not happen again.
      
      This patch (of 3):
      
      Patch "mm/slub: use stackdepot to save stack trace in objects" in -mm
      selects CONFIG_STACKDEPOT when CONFIG_STACKTRACE_SUPPORT is selected and
      CONFIG_STACKDEPOT requires IRQENTRY_TEXT and SOFTIRQENTRY_TEXT to be
      handled after commit 505a0ef1 ("kasan: stackdepot: move
      filter_irq_stacks() to stackdepot.c") due to the use of the
      __{,soft}irqentry_text_{start,end} section symbols.  If those sections are
      not handled, the build is broken.
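
The role of the section boundary symbols can be illustrated with a small user-space sketch. It is loosely modeled on the kind of range check that filter_irq_stacks() needs and is not the kernel's code: `struct text_range`, `in_range()` and `trim_at_irq_entry()` are hypothetical names, and the bounds are ordinary variables standing in for the linker-provided `__{,soft}irqentry_text_{start,end}` symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-in for a pair of linker-provided section bounds. In the kernel
 * the symbols come from the linker script; here they are plain values
 * so the sketch runs in user space (an assumption for illustration).
 */
struct text_range {
	uintptr_t start; /* plays the role of __irqentry_text_start */
	uintptr_t end;   /* plays the role of __irqentry_text_end (exclusive) */
};

/* True if addr lies inside [start, end). */
static int in_range(uintptr_t addr, const struct text_range *r)
{
	return addr >= r->start && addr < r->end;
}

/*
 * Return how many stack entries to keep: everything up to and including
 * the first frame that falls inside either entry-text range. If no frame
 * matches, the whole trace is kept.
 */
static size_t trim_at_irq_entry(const uintptr_t *entries, size_t nr_entries,
				const struct text_range *irq,
				const struct text_range *softirq)
{
	assert(entries != NULL || nr_entries == 0);

	for (size_t i = 0; i < nr_entries; i++) {
		if (in_range(entries[i], irq) || in_range(entries[i], softirq))
			return i + 1;
	}
	return nr_entries;
}
```

Without real bounds for both ranges there is nothing to compare against, which is why a linker script that does not define the start and end symbols leaves the references unresolved.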
      
      $ make ARCH=hexagon CROSS_COMPILE=hexagon-linux- LLVM=1 LLVM_IAS=1 defconfig all
      ...
      ld.lld: error: undefined symbol: __irqentry_text_start
      >>> referenced by stackdepot.c
      >>>               stackdepot.o:(filter_irq_stacks) in archive lib/built-in.a
      >>> referenced by stackdepot.c
      >>>               stackdepot.o:(filter_irq_stacks) in archive lib/built-in.a
      
      ld.lld: error: undefined symbol: __irqentry_text_end
      >>> referenced by stackdepot.c
      >>>               stackdepot.o:(filter_irq_stacks) in archive lib/built-in.a
      >>> referenced by stackdepot.c
      >>>               stackdepot.o:(filter_irq_stacks) in archive lib/built-in.a
      
      ld.lld: error: undefined symbol: __softirqentry_text_start
      >>> referenced by stackdepot.c
      >>>               stackdepot.o:(filter_irq_stacks) in archive lib/built-in.a
      >>> referenced by stackdepot.c
      >>>               stackdepot.o:(filter_irq_stacks) in archive lib/built-in.a
      
      ld.lld: error: undefined symbol: __softirqentry_text_end
      >>> referenced by stackdepot.c
      >>>               stackdepot.o:(filter_irq_stacks) in archive lib/built-in.a
      >>> referenced by stackdepot.c
      >>>               stackdepot.o:(filter_irq_stacks) in archive lib/built-in.a
      ...
      
      Add these sections to the Hexagon linker script so the build continues to
      work.  ld.lld's orphan section warning would have caught this prior to the
      -mm commit mentioned above:
      
      ld.lld: warning: kernel/built-in.a(softirq.o):(.softirqentry.text) is being placed in '.softirqentry.text'
      ld.lld: warning: kernel/built-in.a(softirq.o):(.softirqentry.text) is being placed in '.softirqentry.text'
      ld.lld: warning: kernel/built-in.a(softirq.o):(.softirqentry.text) is being placed in '.softirqentry.text'
      
      Link: https://lkml.kernel.org/r/20210521011239.1332345-1-nathan@kernel.org
      Link: https://lkml.kernel.org/r/20210521011239.1332345-2-nathan@kernel.org
      Link: https://github.com/ClangBuiltLinux/linux/issues/1381
      Fixes: 505a0ef1 ("kasan: stackdepot: move filter_irq_stacks() to stackdepot.c")
      Signed-off-by: Nathan Chancellor <nathan@kernel.org>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
      Acked-by: Brian Cain <bcain@codeaurora.org>
      Cc: Oliver Glitta <glittao@gmail.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6fef087d
    • Zhen Lei's avatar
      lib: fix spelling mistakes in header files · c23c8082
      Zhen Lei authored
      Fix some spelling mistakes in comments found by "codespell":
      Hoever ==> However
      poiter ==> pointer
      representaion ==> representation
      uppon ==> upon
      independend ==> independent
      aquired ==> acquired
      mis-match ==> mismatch
      scrach ==> scratch
      struture ==> structure
      Analagous ==> Analogous
      interation ==> iteration
      
      And some were discovered manually by Joe Perches and Christoph Lameter:
      stroed ==> stored
      arch independent ==> an architecture independent
      A example structure for ==> Example structure for
      
      Link: https://lkml.kernel.org/r/20210609150027.14805-2-thunder.leizhen@huawei.com
      Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
      Cc: Christoph Lameter <cl@gentwo.de>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Joe Perches <joe@perches.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c23c8082
    • Zhen Lei's avatar
      lib: fix spelling mistakes · 9dbbc3b9
      Zhen Lei authored
      Fix some spelling mistakes in comments:
      permanentely ==> permanently
      wont ==> won't
      remaning ==> remaining
      succed ==> succeed
      shouldnt ==> shouldn't
      alpha-numeric ==> alphanumeric
      storeing ==> storing
      funtion ==> function
      documenation ==> documentation
      Determin ==> Determine
      intepreted ==> interpreted
      ammount ==> amount
      obious ==> obvious
      interupts ==> interrupts
      occured ==> occurred
      asssociated ==> associated
      taking into acount ==> taking into account
      squence ==> sequence
      stil ==> still
      contiguos ==> contiguous
      matchs ==> matches
      
      Link: https://lkml.kernel.org/r/20210607072555.12416-1-thunder.leizhen@huawei.com
      Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
      Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9dbbc3b9
    • Zhen Lei's avatar
      lib/test: fix spelling mistakes · 53b0fe36
      Zhen Lei authored
      Fix some spelling mistakes in comments found by "codespell":
      thats ==> that's
      unitialized ==> uninitialized
      panicing ==> panicking
      sucess ==> success
      possitive ==> positive
      intepreted ==> interpreted
      
      Link: https://lkml.kernel.org/r/20210607133036.12525-2-thunder.leizhen@huawei.com
      Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
      Acked-by: Yonghong Song <yhs@fb.com>	[test_bfp.c]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      53b0fe36
  2. 07 Jul, 2021 2 commits
    • Linus Torvalds's avatar
      Merge tag 'modules-for-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux · a931dd33
      Linus Torvalds authored
      Pull module updates from Jessica Yu:
      
       - Fix incorrect logic in module_kallsyms_on_each_symbol()
      
       - Fix for a Coccinelle warning
      
      * tag 'modules-for-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
        module: correctly exit module_kallsyms_on_each_symbol when fn() != 0
        kernel/module: Use BUG_ON instead of if condition followed by BUG
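
The first shortlog title describes an iteration that must stop as soon as the callback returns non-zero. A minimal user-space sketch of that contract follows. It does not reproduce the kernel function: `struct sym`, `struct mod`, `for_each_symbol()` and `stop_at_addr()` are hypothetical names, and the nested-loop layout is an assumption chosen to show why the exit path needs care.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a symbol and a module's symbol table. */
struct sym { const char *name; unsigned long addr; };
struct mod { const struct sym *syms; size_t nr_syms; };

/* Callback type: a non-zero return asks the walker to stop. */
typedef int (*sym_fn)(void *data, const char *name, unsigned long addr);

/*
 * Visit every symbol of every module. The walk ends at the first
 * non-zero callback result, and that result is returned to the caller.
 */
static int for_each_symbol(const struct mod *mods, size_t nr_mods,
			   sym_fn fn, void *data)
{
	int ret = 0;

	for (size_t m = 0; m < nr_mods; m++) {
		for (size_t s = 0; s < mods[m].nr_syms; s++) {
			ret = fn(data, mods[m].syms[s].name,
				 mods[m].syms[s].addr);
			if (ret != 0)
				goto out; /* a plain break would only leave the inner loop */
		}
	}
out:
	return ret;
}

/* Example callback: count visits and stop at a given address. */
struct stop_at { unsigned long addr; size_t visited; };

static int stop_at_addr(void *data, const char *name, unsigned long addr)
{
	struct stop_at *st = data;

	(void)name;
	st->visited++;
	return addr == st->addr ? 1 : 0;
}
```

With two tables of two symbols each, a callback that matches the second symbol is invoked exactly twice, and the walker returns its non-zero result without touching the second table.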
      a931dd33
    • Linus Torvalds's avatar
      Merge tag 'x86-fpu-2021-07-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 1423e266
      Linus Torvalds authored
      Pull x86 fpu updates from Thomas Gleixner:
       "Fixes and improvements for FPU handling on x86:
      
         - Prevent sigaltstack out of bounds writes.
      
           The kernel unconditionally writes the FPU state to the alternate
           stack without checking whether the stack is large enough to
           accommodate it.
      
           Check the alternate stack size before doing so and in case it's too
           small force a SIGSEGV instead of silently corrupting user space
           data.
      
         - MINSIGSTKZ and SIGSTKSZ are constants in signal.h and have never
           been updated despite the fact that the FPU state which is stored on
           the signal stack has grown over time which causes trouble in the
           field when AVX512 is available on a CPU. The kernel does not expose
           the minimum requirements for the alternate stack size depending on
           the available and enabled CPU features.
      
           ARM already added an aux vector AT_MINSIGSTKSZ for the same reason.
           Add it to x86 as well.
      
         - A major cleanup of the x86 FPU code. The recent discoveries of
           XSTATE related issues unearthed quite some inconsistencies,
           duplicated code and other issues.
      
           The fine-grained overhaul addresses this and makes the code more
           robust and maintainable, which allows upcoming XSTATE related
           features to be integrated in sane ways"
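
From user space, the AT_MINSIGSTKSZ aux vector entry mentioned above can be read with getauxval() and used to size an alternate signal stack. The following is a minimal sketch, assuming Linux with glibc: `altstack_size()` and `install_altstack()` are hypothetical helper names, the fallback to MINSIGSTKSZ is a design choice of this sketch, and getauxval() returns 0 on kernels that do not provide the entry.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for sigaltstack() and stack_t under strict modes */
#endif
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/auxv.h>

#ifndef AT_MINSIGSTKSZ
#define AT_MINSIGSTKSZ 51 /* value from the Linux UAPI auxvec header */
#endif

/*
 * Pick an alternate-stack size: the kernel-reported minimum when it is
 * available and larger than MINSIGSTKSZ, otherwise MINSIGSTKSZ, plus
 * whatever headroom the caller's signal handlers need.
 */
static size_t altstack_size(size_t headroom)
{
	size_t min = (size_t)getauxval(AT_MINSIGSTKSZ); /* 0 if not provided */

	if (min < (size_t)MINSIGSTKSZ)
		min = (size_t)MINSIGSTKSZ;
	return min + headroom;
}

/* Allocate and install the alternate stack; returns 0 on success. */
static int install_altstack(size_t headroom)
{
	stack_t ss = {0};

	ss.ss_size = altstack_size(headroom);
	ss.ss_sp = malloc(ss.ss_size);
	if (ss.ss_sp == NULL)
		return -1;
	ss.ss_flags = 0;

	if (sigaltstack(&ss, NULL) != 0) {
		free(ss.ss_sp);
		return -1;
	}
	return 0;
}
```

A program would call `install_altstack()` once at startup, before registering handlers with SA_ONSTACK. The headroom value is application specific and is not something the kernel reports.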
      
      * tag 'x86-fpu-2021-07-07' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (74 commits)
        x86/fpu/xstate: Clear xstate header in copy_xstate_to_uabi_buf() again
        x86/fpu/signal: Let xrstor handle the features to init
        x86/fpu/signal: Handle #PF in the direct restore path
        x86/fpu: Return proper error codes from user access functions
        x86/fpu/signal: Split out the direct restore code
        x86/fpu/signal: Sanitize copy_user_to_fpregs_zeroing()
        x86/fpu/signal: Sanitize the xstate check on sigframe
        x86/fpu/signal: Remove the legacy alignment check
        x86/fpu/signal: Move initial checks into fpu__restore_sig()
        x86/fpu: Mark init_fpstate __ro_after_init
        x86/pkru: Remove xstate fiddling from write_pkru()
        x86/fpu: Don't store PKRU in xstate in fpu_reset_fpstate()
        x86/fpu: Remove PKRU handling from switch_fpu_finish()
        x86/fpu: Mask PKRU from kernel XRSTOR[S] operations
        x86/fpu: Hook up PKRU into ptrace()
        x86/fpu: Add PKRU storage outside of task XSAVE buffer
        x86/fpu: Dont restore PKRU in fpregs_restore_userspace()
        x86/fpu: Rename xfeatures_mask_user() to xfeatures_mask_uabi()
        x86/fpu: Move FXSAVE_LEAK quirk info __copy_kernel_to_fpregs()
        x86/fpu: Rename __fpregs_load_activate() to fpregs_restore_userregs()
        ...
      1423e266