1. 08 May, 2022 16 commits
    • Mark Rutland's avatar
      lkdtm/stackleak: prevent unexpected stack usage · f03a5093
      Mark Rutland authored
      The lkdtm_STACKLEAK_ERASING() test is instrumentable and runs with IRQs
      unmasked, so it's possible for unrelated code to clobber the task stack
      and/or manipulate current->lowest_stack while the test is running,
      resulting in spurious failures.
      
      The regular stackleak erasing code is non-instrumentable and runs with
      IRQs masked, preventing similar issues.
      
      Make the body of the test non-instrumentable, and run it with IRQs
      masked, avoiding such spurious failures.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Cc: Alexander Popov <alex.popov@linux.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Link: https://lore.kernel.org/r/20220427173128.2603085-11-mark.rutland@arm.com
      f03a5093
    • Mark Rutland's avatar
      lkdtm/stackleak: rework boundary management · 72b61896
      Mark Rutland authored
      There are a few problems with the way the LKDTM STACKLEAK_ERASING test
      manipulates the stack pointer and boundary values:
      
      * It uses the address of a local variable to determine the current stack
        pointer, rather than using current_stack_pointer directly. As the
        local variable could be placed anywhere within the stack frame, this
        can be an over-estimate of the true stack pointer value.
      
      * Is uses an estimate of the current stack pointer as the upper boundary
        when scanning for poison, even though prior functions could have used
        more stack (and may have updated current->lowest stack accordingly).
      
      * A pr_info() call is made in the middle of the test. As the printk()
        code is out-of-line and will make use of the stack, this could clobber
        poison and/or adjust current->lowest_stack. It would be better to log
        the metadata after the body of the test to avoid such problems.
      
      These have been observed to result in spurious test failures on arm64.
      
      In addition to this there are a couple of things which are sub-optimal:
      
      * To avoid the STACK_END_MAGIC value, it conditionally modifies 'left'
        if this contains more than a single element, when it could instead
        calculate the bound unconditionally using stackleak_task_low_bound().
      
      * It open-codes the poison scanning. It would be better if this used the
        same helper code as used by erasing function so that the two cannot
        diverge.
      
      This patch reworks the test to avoid these issues, making use of the
      recently introduced helpers to ensure this is aligned with the regular
      stackleak code.
      
      As the new code tests stack boundaries before accessing the stack, there
      is no need to fail early when the tracked or untracked portions of the
      stack extend all the way to the low stack boundary.
      
      As stackleak_find_top_of_poison() is now used to find the top of the
      poisoned region of the stack, the subsequent poison checking starts at
      this boundary and verifies that stackleak_find_top_of_poison() is
      working correctly.
      
      The pr_info() which logged the untracked portion of stack is now moved
      to the end of the function, and logs the size of all the portions of the
      stack relevant to the test, including the portions at the top and bottom
      of the stack which are not erased or scanned, and the current / lowest
      recorded stack usage.
      
      Tested on x86_64:
      
      | # echo STACKLEAK_ERASING > /sys/kernel/debug/provoke-crash/DIRECT
      | lkdtm: Performing direct entry STACKLEAK_ERASING
      | lkdtm: stackleak stack usage:
      |   high offset: 168 bytes
      |   current:     336 bytes
      |   lowest:      656 bytes
      |   tracked:     656 bytes
      |   untracked:   400 bytes
      |   poisoned:    15152 bytes
      |   low offset:  8 bytes
      | lkdtm: OK: the rest of the thread stack is properly erased
      
      Tested on arm64:
      
      | # echo STACKLEAK_ERASING > /sys/kernel/debug/provoke-crash/DIRECT
      | lkdtm: Performing direct entry STACKLEAK_ERASING
      | lkdtm: stackleak stack usage:
      |   high offset: 336 bytes
      |   current:     656 bytes
      |   lowest:      1232 bytes
      |   tracked:     1232 bytes
      |   untracked:   672 bytes
      |   poisoned:    14136 bytes
      |   low offset:  8 bytes
      | lkdtm: OK: the rest of the thread stack is properly erased
      
      Tested on arm64 with deliberate breakage to the starting stack value and
      poison scanning:
      
      | # echo STACKLEAK_ERASING > /sys/kernel/debug/provoke-crash/DIRECT
      | lkdtm: Performing direct entry STACKLEAK_ERASING
      | lkdtm: FAIL: non-poison value 24 bytes below poison boundary: 0x0
      | lkdtm: FAIL: non-poison value 32 bytes below poison boundary: 0xffff8000083dbc00
      ...
      | lkdtm: FAIL: non-poison value 1912 bytes below poison boundary: 0x78b4b9999e8cb15
      | lkdtm: FAIL: non-poison value 1920 bytes below poison boundary: 0xffff8000083db400
      | lkdtm: stackleak stack usage:
      |   high offset: 336 bytes
      |   current:     688 bytes
      |   lowest:      1232 bytes
      |   tracked:     576 bytes
      |   untracked:   288 bytes
      |   poisoned:    15176 bytes
      |   low offset:  8 bytes
      | lkdtm: FAIL: the thread stack is NOT properly erased!
      | lkdtm: Unexpected! This kernel (5.18.0-rc1-00013-g1f7b1f1e29e0-dirty aarch64) was built with CONFIG_GCC_PLUGIN_STACKLEAK=y
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Cc: Alexander Popov <alex.popov@linux.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Link: https://lore.kernel.org/r/20220427173128.2603085-10-mark.rutland@arm.com
      72b61896
    • Mark Rutland's avatar
      lkdtm/stackleak: avoid spurious failure · 4130a61c
      Mark Rutland authored
      The lkdtm_STACKLEAK_ERASING() test scans for a contiguous block of
      poison values between the low stack bound and the stack pointer, and
      fails if it does not find a sufficiently large block.
      
      This can happen legitimately if the scan the low stack bound, which
      could occur if functions called prior to lkdtm_STACKLEAK_ERASING() used
      a large amount of stack. If this were to occur, it means that the erased
      portion of the stack is smaller than the size used by the scan, but does
      not cause a functional problem
      
      In practice this is unlikely to happen, but as this is legitimate and
      would not result in a functional problem, the test should not fail in
      this case.
      
      Remove the spurious failure case.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Cc: Alexander Popov <alex.popov@linux.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Link: https://lore.kernel.org/r/20220427173128.2603085-9-mark.rutland@arm.com
      4130a61c
    • Mark Rutland's avatar
      stackleak: rework poison scanning · 77cf2b6d
      Mark Rutland authored
      Currently we over-estimate the region of stack which must be erased.
      
      To determine the region to be erased, we scan downwards for a contiguous
      block of poison values (or the low bound of the stack). There are a few
      minor problems with this today:
      
      * When we find a block of poison values, we include this block within
        the region to erase.
      
        As this is included within the region to erase, this causes us to
        redundantly overwrite 'STACKLEAK_SEARCH_DEPTH' (128) bytes with
        poison.
      
      * As the loop condition checks 'poison_count <= depth', it will run an
        additional iteration after finding the contiguous block of poison,
        decrementing 'erase_low' once more than necessary.
      
        As this is included within the region to erase, this causes us to
        redundantly overwrite an additional unsigned long with poison.
      
      * As we always decrement 'erase_low' after checking an element on the
        stack, we always include the element below this within the region to
        erase.
      
        As this is included within the region to erase, this causes us to
        redundantly overwrite an additional unsigned long with poison.
      
        Note that this is not a functional problem. As the loop condition
        checks 'erase_low > task_stack_low', we'll never clobber the
        STACK_END_MAGIC. As we always decrement 'erase_low' after this, we'll
        never fail to erase the element immediately above the STACK_END_MAGIC.
      
      In total, this can cause us to erase `128 + 2 * sizeof(unsigned long)`
      bytes more than necessary, which is unfortunate.
      
      This patch reworks the logic to find the address immediately above the
      poisoned region, by finding the lowest non-poisoned address. This is
      factored into a stackleak_find_top_of_poison() helper both for clarity
      and so that this can be shared with the LKDTM test in subsequent
      patches.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Cc: Alexander Popov <alex.popov@linux.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Link: https://lore.kernel.org/r/20220427173128.2603085-8-mark.rutland@arm.com
      77cf2b6d
    • Mark Rutland's avatar
      stackleak: rework stack high bound handling · 0cfa2ccd
      Mark Rutland authored
      Prior to returning to userspace, we reset current->lowest_stack to a
      reasonable high bound. Currently we do this by subtracting the arbitrary
      value `THREAD_SIZE/64` from the top of the stack, for reasons lost to
      history.
      
      Looking at configurations today:
      
      * On i386 where THREAD_SIZE is 8K, the bound will be 128 bytes. The
        pt_regs at the top of the stack is 68 bytes (with 0 to 16 bytes of
        padding above), and so this covers an additional portion of 44 to 60
        bytes.
      
      * On x86_64 where THREAD_SIZE is at least 16K (up to 32K with KASAN) the
        bound will be at least 256 bytes (up to 512 with KASAN). The pt_regs
        at the top of the stack is 168 bytes, and so this cover an additional
        88 bytes of stack (up to 344 with KASAN).
      
      * On arm64 where THREAD_SIZE is at least 16K (up to 64K with 64K pages
        and VMAP_STACK), the bound will be at least 256 bytes (up to 1024 with
        KASAN). The pt_regs at the top of the stack is 336 bytes, so this can
        fall within the pt_regs, or can cover an additional 688 bytes of
        stack.
      
      Clearly the `THREAD_SIZE/64` value doesn't make much sense -- in the
      worst case, this will cause more than 600 bytes of stack to be erased
      for every syscall, even if actual stack usage were substantially
      smaller.
      
      This patches makes this slightly less nonsensical by consistently
      resetting current->lowest_stack to the base of the task pt_regs. For
      clarity and for consistency with the handling of the low bound, the
      generation of the high bound is split into a helper with commentary
      explaining why.
      
      Since the pt_regs at the top of the stack will be clobbered upon the
      next exception entry, we don't need to poison these at exception exit.
      By using task_pt_regs() as the high stack boundary instead of
      current_top_of_stack() we avoid some redundant poisoning, and the
      compiler can share the address generation between the poisoning and
      resetting of `current->lowest_stack`, making the generated code more
      optimal.
      
      It's not clear to me whether the existing `THREAD_SIZE/64` offset was a
      dodgy heuristic to skip the pt_regs, or whether it was attempting to
      minimize the number of times stackleak_check_stack() would have to
      update `current->lowest_stack` when stack usage was shallow at the cost
      of unconditionally poisoning a small portion of the stack for every exit
      to userspace.
      
      For now I've simply removed the offset, and if we need/want to minimize
      updates for shallow stack usage it should be easy to add a better
      heuristic atop, with appropriate commentary so we know what's going on.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Cc: Alexander Popov <alex.popov@linux.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Link: https://lore.kernel.org/r/20220427173128.2603085-7-mark.rutland@arm.com
      0cfa2ccd
    • Mark Rutland's avatar
      stackleak: clarify variable names · 1723d39d
      Mark Rutland authored
      The logic within __stackleak_erase() can be a little hard to follow, as
      `boundary` switches from being the low bound to the high bound mid way
      through the function, and `kstack_ptr` is used to represent the start of
      the region to erase while `boundary` represents the end of the region to
      erase.
      
      Make this a little clearer by consistently using clearer variable names.
      The `boundary` variable is removed, the bounds of the region to erase
      are described by `erase_low` and `erase_high`, and bounds of the task
      stack are described by `task_stack_low` and `task_stack_high`.
      
      As the same time, remove the comment above the variables, since it is
      unclear whether it's intended as rationale, a complaint, or a TODO, and
      is more confusing than helpful.
      
      There should be no functional change as a result of this patch.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Cc: Alexander Popov <alex.popov@linux.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Link: https://lore.kernel.org/r/20220427173128.2603085-6-mark.rutland@arm.com
      1723d39d
    • Mark Rutland's avatar
      stackleak: rework stack low bound handling · 9ec79840
      Mark Rutland authored
      In stackleak_task_init(), stackleak_track_stack(), and
      __stackleak_erase(), we open-code skipping the STACK_END_MAGIC at the
      bottom of the stack. Each case is implemented slightly differently, and
      only the __stackleak_erase() case is commented.
      
      In stackleak_task_init() and stackleak_track_stack() we unconditionally
      add sizeof(unsigned long) to the lowest stack address. In
      stackleak_task_init() we use end_of_stack() for this, and in
      stackleak_track_stack() we use task_stack_page(). In __stackleak_erase()
      we handle this by detecting if `kstack_ptr` has hit the stack end
      boundary, and if so, conditionally moving it above the magic.
      
      This patch adds a new stackleak_task_low_bound() helper which is used in
      all three cases, which unconditionally adds sizeof(unsigned long) to the
      lowest address on the task stack, with commentary as to why. This uses
      end_of_stack() as stackleak_task_init() did prior to this patch, as this
      is consistent with the code in kernel/fork.c which initializes the
      STACK_END_MAGIC value.
      
      In __stackleak_erase() we no longer need to check whether we've spilled
      into the STACK_END_MAGIC value, as stackleak_track_stack() ensures that
      `current->lowest_stack` stops immediately above this, and similarly the
      poison scan will stop immediately above this.
      
      For stackleak_task_init() and stackleak_track_stack() this results in no
      change to code generation. For __stackleak_erase() the generated
      assembly is slightly simpler and shorter.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Cc: Alexander Popov <alex.popov@linux.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Link: https://lore.kernel.org/r/20220427173128.2603085-5-mark.rutland@arm.com
      9ec79840
    • Mark Rutland's avatar
      stackleak: remove redundant check · ac7838b4
      Mark Rutland authored
      In __stackleak_erase() we check that the `erase_low` value derived from
      `current->lowest_stack` is above the lowest legitimate stack pointer
      value, but this is already enforced by stackleak_track_stack() when
      recording the lowest stack value.
      
      Remove the redundant check.
      
      There should be no functional change as a result of this patch.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Cc: Alexander Popov <alex.popov@linux.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Link: https://lore.kernel.org/r/20220427173128.2603085-4-mark.rutland@arm.com
      ac7838b4
    • Mark Rutland's avatar
      stackleak: move skip_erasing() check earlier · a12685e2
      Mark Rutland authored
      In stackleak_erase() we check skip_erasing() after accessing some fields
      from current. As generating the address of current uses asm which
      hazards with the static branch asm, this work is always performed, even
      when the static branch is patched to jump to the return at the end of the
      function.
      
      This patch avoids this redundant work by moving the skip_erasing() check
      earlier.
      
      To avoid complicating initialization within stackleak_erase(), the body
      of the function is split out into a __stackleak_erase() helper, with the
      check left in a wrapper function. The __stackleak_erase() helper is
      marked __always_inline to ensure that this is inlined into
      stackleak_erase() and not instrumented.
      
      Before this patch, on x86-64 w/ GCC 11.1.0 the start of the function is:
      
      <stackleak_erase>:
         65 48 8b 04 25 00 00    mov    %gs:0x0,%rax
         00 00
         48 8b 48 20             mov    0x20(%rax),%rcx
         48 8b 80 98 0a 00 00    mov    0xa98(%rax),%rax
         66 90                   xchg   %ax,%ax  <------------ static branch
         48 89 c2                mov    %rax,%rdx
         48 29 ca                sub    %rcx,%rdx
         48 81 fa ff 3f 00 00    cmp    $0x3fff,%rdx
      
      After this patch, on x86-64 w/ GCC 11.1.0 the start of the function is:
      
      <stackleak_erase>:
         0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)  <--- static branch
         65 48 8b 04 25 00 00    mov    %gs:0x0,%rax
         00 00
         48 8b 48 20             mov    0x20(%rax),%rcx
         48 8b 80 98 0a 00 00    mov    0xa98(%rax),%rax
         48 89 c2                mov    %rax,%rdx
         48 29 ca                sub    %rcx,%rdx
         48 81 fa ff 3f 00 00    cmp    $0x3fff,%rdx
      
      Before this patch, on arm64 w/ GCC 11.1.0 the start of the function is:
      
      <stackleak_erase>:
         d503245f        bti     c
         d5384100        mrs     x0, sp_el0
         f9401003        ldr     x3, [x0, #32]
         f9451000        ldr     x0, [x0, #2592]
         d503201f        nop  <------------------------------- static branch
         d503233f        paciasp
         cb030002        sub     x2, x0, x3
         d287ffe1        mov     x1, #0x3fff
         eb01005f        cmp     x2, x1
      
      After this patch, on arm64 w/ GCC 11.1.0 the start of the function is:
      
      <stackleak_erase>:
         d503245f        bti     c
         d503201f        nop  <------------------------------- static branch
         d503233f        paciasp
         d5384100        mrs     x0, sp_el0
         f9401003        ldr     x3, [x0, #32]
         d287ffe1        mov     x1, #0x3fff
         f9451000        ldr     x0, [x0, #2592]
         cb030002        sub     x2, x0, x3
         eb01005f        cmp     x2, x1
      
      While this may not be a huge win on its own, moving the static branch
      will permit further optimization of the body of the function in
      subsequent patches.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Cc: Alexander Popov <alex.popov@linux.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Link: https://lore.kernel.org/r/20220427173128.2603085-3-mark.rutland@arm.com
      a12685e2
    • Mark Rutland's avatar
      arm64: stackleak: fix current_top_of_stack() · e85094c3
      Mark Rutland authored
      Due to some historical confusion, arm64's current_top_of_stack() isn't
      what the stackleak code expects. This could in theory result in a number
      of problems, and practically results in an unnecessary performance hit.
      We can avoid this by aligning the arm64 implementation with the x86
      implementation.
      
      The arm64 implementation of current_top_of_stack() was added
      specifically for stackleak in commit:
      
        0b3e3366 ("arm64: Add support for STACKLEAK gcc plugin")
      
      This was intended to be equivalent to the x86 implementation, but the
      implementation, semantics, and performance characteristics differ
      wildly:
      
      * On x86, current_top_of_stack() returns the top of the current task's
        task stack, regardless of which stack is in active use.
      
        The implementation accesses a percpu variable which the x86 entry code
        maintains, and returns the location immediately above the pt_regs on
        the task stack (above which x86 has some padding).
      
      * On arm64 current_top_of_stack() returns the top of the stack in active
        use (i.e. the one which is currently being used).
      
        The implementation checks the SP against a number of
        potentially-accessible stacks, and will BUG() if no stack is found.
      
      The core stackleak_erase() code determines the upper bound of stack to
      erase with:
      
      | if (on_thread_stack())
      |         boundary = current_stack_pointer;
      | else
      |         boundary = current_top_of_stack();
      
      On arm64 stackleak_erase() is always called on a task stack, and
      on_thread_stack() should always be true. On x86, stackleak_erase() is
      mostly called on a trampoline stack, and is sometimes called on a task
      stack.
      
      Currently, this results in a lot of unnecessary code being generated for
      arm64 for the impossible !on_thread_stack() case. Some of this is
      inlined, bloating stackleak_erase(), while portions of this are left
      out-of-line and permitted to be instrumented (which would be a
      functional problem if that code were reachable).
      
      As a first step towards improving this, this patch aligns arm64's
      implementation of current_top_of_stack() with x86's, always returning
      the top of the current task's stack. With GCC 11.1.0 this results in the
      bulk of the unnecessary code being removed, including all of the
      out-of-line instrumentable code.
      
      While I don't believe there's a functional problem in practice I've
      marked this as a fix since the semantic was clearly wrong, the fix
      itself is simple, and other code might rely upon this in future.
      
      Fixes: 0b3e3366 ("arm64: Add support for STACKLEAK gcc plugin")
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Cc: Alexander Popov <alex.popov@linux.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Will Deacon <will@kernel.org>
      Acked-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Link: https://lore.kernel.org/r/20220427173128.2603085-2-mark.rutland@arm.com
      e85094c3
    • Kees Cook's avatar
      randstruct: Enable Clang support · 035f7f87
      Kees Cook authored
      Clang 15 will support randstruct via the -frandomize-layout-seed-file=...
      option. Update the Kconfig and Makefile to recognize this feature.
      
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: linux-kbuild@vger.kernel.org
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Link: https://lore.kernel.org/r/20220503205503.3054173-7-keescook@chromium.org
      035f7f87
    • Kees Cook's avatar
      randstruct: Move seed generation into scripts/basic/ · be2b34fa
      Kees Cook authored
      To enable Clang randstruct support, move the structure layout
      randomization seed generation out of scripts/gcc-plugins/ into
      scripts/basic/ so it happens early enough that it can be used by either
      compiler implementation. The gcc-plugin still builds its own header file,
      but now does so from the common "randstruct.seed" file.
      
      Cc: linux-hardening@vger.kernel.org
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Link: https://lore.kernel.org/r/20220503205503.3054173-6-keescook@chromium.org
      be2b34fa
    • Kees Cook's avatar
      randstruct: Split randstruct Makefile and CFLAGS · 613f4b3e
      Kees Cook authored
      To enable the new Clang randstruct implementation[1], move
      randstruct into its own Makefile and split the CFLAGS from
      GCC_PLUGINS_CFLAGS into RANDSTRUCT_CFLAGS.
      
      [1] https://reviews.llvm.org/D121556
      
      Cc: linux-hardening@vger.kernel.org
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Link: https://lore.kernel.org/r/20220503205503.3054173-5-keescook@chromium.org
      613f4b3e
    • Kees Cook's avatar
      randstruct: Reorganize Kconfigs and attribute macros · 595b893e
      Kees Cook authored
      In preparation for Clang supporting randstruct, reorganize the Kconfigs,
      move the attribute macros, and generalize the feature to be named
      CONFIG_RANDSTRUCT for on/off, CONFIG_RANDSTRUCT_FULL for the full
      randomization mode, and CONFIG_RANDSTRUCT_PERFORMANCE for the cache-line
      sized mode.
      
      Cc: linux-hardening@vger.kernel.org
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Link: https://lore.kernel.org/r/20220503205503.3054173-4-keescook@chromium.org
      595b893e
    • Kees Cook's avatar
      sancov: Split plugin build from plugin CFLAGS · d3646589
      Kees Cook authored
      When the sancov_plugin is enabled, it gets added to gcc-plugin-y which
      is used to populate both GCC_PLUGIN (for building the plugin) and
      GCC_PLUGINS_CFLAGS (for enabling and options). Instead of adding sancov
      to both and then removing it from GCC_PLUGINS_CFLAGS, create a separate
      list, gcc-plugin-external-y, which is only added to GCC_PLUGIN.
      
      This will also be used by the coming randstruct build changes.
      
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: linux-kbuild@vger.kernel.org
      Cc: linux-hardening@vger.kernel.org
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Link: https://lore.kernel.org/r/20220503205503.3054173-3-keescook@chromium.org
      d3646589
    • Kees Cook's avatar
      netfs: Eliminate Clang randstruct warning · 3b5eed3c
      Kees Cook authored
      Clang's structure layout randomization feature gets upset when it sees
      struct inode (which is randomized) cast to struct netfs_i_context. This
      is due to seeing the inode pointer as being treated as an array of inodes,
      rather than "something else, following struct inode".
      
      Since netfs can't use container_of() (since it doesn't know what the
      true containing struct is), it uses this direct offset instead. Adjust
      the code to better reflect what is happening: an arbitrary pointer is
      being adjusted and cast to something else: use a "void *" for the math.
      The resulting binary output is the same, but Clang no longer sees an
      unexpected cross-structure cast:
      
      In file included from ../fs/nfs/inode.c:50:
      In file included from ../fs/nfs/fscache.h:15:
      In file included from ../include/linux/fscache.h:18:
      ../include/linux/netfs.h:298:9: error: casting from randomized structure pointer type 'struct inode *' to 'struct netfs_i_context *'
              return (struct netfs_i_context *)(inode + 1);
                     ^
      1 error generated.
      
      Cc: David Howells <dhowells@redhat.com>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Link: https://lore.kernel.org/r/20220503205503.3054173-2-keescook@chromium.orgReviewed-by: default avatarJeff Layton <jlayton@kernel.org>
      Link: https://lore.kernel.org/lkml/7562f8eccd7cc0e447becfe9912179088784e3b9.camel@kernel.org
      3b5eed3c
  2. 13 Apr, 2022 10 commits
  3. 12 Apr, 2022 3 commits
    • Mikulas Patocka's avatar
      stat: fix inconsistency between struct stat and struct compat_stat · 932aba1e
      Mikulas Patocka authored
      struct stat (defined in arch/x86/include/uapi/asm/stat.h) has 32-bit
      st_dev and st_rdev; struct compat_stat (defined in
      arch/x86/include/asm/compat.h) has 16-bit st_dev and st_rdev followed by
      a 16-bit padding.
      
      This patch fixes struct compat_stat to match struct stat.
      
      [ Historical note: the old x86 'struct stat' did have that 16-bit field
        that the compat layer had kept around, but it was changes back in 2003
        by "struct stat - support larger dev_t":
      
          https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/commit/?id=e95b2065677fe32512a597a79db94b77b90c968d
      
        and back in those days, the x86_64 port was still new, and separate
        from the i386 code, and had already picked up the old version with a
        16-bit st_dev field ]
      
      Note that we can't change compat_dev_t because it is used by
      compat_loop_info.
      
      Also, if the st_dev and st_rdev values are 32-bit, we don't have to use
      old_valid_dev to test if the value fits into them.  This fixes
      -EOVERFLOW on filesystems that are on NVMe because NVMe uses the major
      number 259.
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Cc: Andreas Schwab <schwab@linux-m68k.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      932aba1e
    • Jason A. Donenfeld's avatar
      gcc-plugins: latent_entropy: use /dev/urandom · c40160f2
      Jason A. Donenfeld authored
      While the latent entropy plugin mostly doesn't derive entropy from
      get_random_const() for measuring the call graph, when __latent_entropy is
      applied to a constant, then it's initialized statically to output from
      get_random_const(). In that case, this data is derived from a 64-bit
      seed, which means a buffer of 512 bits doesn't really have that amount
      of compile-time entropy.
      
      This patch fixes that shortcoming by just buffering chunks of
      /dev/urandom output and doling it out as requested.
      
      At the same time, it's important that we don't break the use of
      -frandom-seed, for people who want the runtime benefits of the latent
      entropy plugin, while still having compile-time determinism. In that
      case, we detect whether gcc's set_random_seed() has been called by
      making a call to get_random_seed(noinit=true) in the plugin init
      function, which is called after set_random_seed() is called but before
      anything that calls get_random_seed(noinit=false), and seeing if it's
      zero or not. If it's not zero, we're in deterministic mode, and so we
      just generate numbers with a basic xorshift prng.
      
      Note that we don't detect if -frandom-seed is being used using the
      documented local_tick variable, because it's assigned via:
         local_tick = (unsigned) tv.tv_sec * 1000 + tv.tv_usec / 1000;
      which may well overflow and become -1 on its own, and so isn't
      reliable: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105171
      
      [kees: The 256 byte rnd_buf size was chosen based on average (250),
       median (64), and std deviation (575) bytes of used entropy for a
       defconfig x86_64 build]
      
      Fixes: 38addce8 ("gcc-plugins: Add latent_entropy plugin")
      Cc: stable@vger.kernel.org
      Cc: PaX Team <pageexec@freemail.hu>
      Signed-off-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Link: https://lore.kernel.org/r/20220405222815.21155-1-Jason@zx2c4.com
      c40160f2
    • Linus Torvalds's avatar
      Merge tag 'platform-drivers-x86-v5.18-2' of... · 7281a59c
      Linus Torvalds authored
      Merge tag 'platform-drivers-x86-v5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
      
      Pull x86 platform drivers fixes from Hans de Goede:
      
       - Documentation and compilation warning fixes
      
       - Kconfig dep fixes
      
       - Misc small code cleanups
      
      * tag 'platform-drivers-x86-v5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
        platform/x86: amd-pmc: Fix compilation without CONFIG_SUSPEND
        platform/x86: acerhdf: Cleanup str_starts_with()
        Documentation/ABI: sysfs-class-firmware-attributes: Misc. cleanups
        Documentation/ABI: sysfs-class-firmware-attributes: Fix Sphinx errors
        Documentation/ABI: sysfs-driver-intel_sdsi: Fix sphinx warnings
        platform/x86: barco-p50-gpio: Fix duplicate included linux/io.h
        platform/x86: samsung-laptop: Fix an unsigned comparison which can never be negative
        platform/x86: think-lmi: certificate support clean ups
      7281a59c
  4. 11 Apr, 2022 6 commits
  5. 10 Apr, 2022 5 commits
    • Linus Torvalds's avatar
      Merge tag 'tty-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 8b57b304
      Linus Torvalds authored
      Pull serial driver fix from Greg KH:
       "This is a single serial driver fix for a build issue that showed up
        due to changes that came in through the tty tree in 5.18-rc1 that were
        missed previously. It resolves a build error with the mpc52xx_uart
        driver.
      
        It has been in linux-next this week with no reported problems"
      
      * tag 'tty-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
        tty: serial: mpc52xx_uart: make rx/tx hooks return unsigned, part II.
      8b57b304
    • Linus Torvalds's avatar
      Merge tag 'staging-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · 95aa17c3
      Linus Torvalds authored
      Pull staging driver fix from Greg KH:
       "Here is a single staging driver fix for 5.18-rc2 that resolves an
        endian issue for the r8188eu driver. It has been in linux-next all
        this week with no reported problems"
      
      * tag 'staging-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        staging: r8188eu: Fix PPPoE tag insertion on little endian systems
      95aa17c3
    • Linus Torvalds's avatar
      Merge tag 'driver-core-5.18-rc2' of... · 33563138
      Linus Torvalds authored
      Merge tag 'driver-core-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
      
      Pull driver core updates from Greg KH:
       "Here are two small driver core changes for 5.18-rc2.
      
        They are the final bits in the removal of the default_attrs field in
        struct kobj_type. I had to wait until after 5.18-rc1 for all of the
        changes to do this came in through different development trees, and
        then one new user snuck in. So this series has two changes:
      
         - removal of the default_attrs field in the powerpc/pseries/vas code.
      
           The change has been acked by the PPC maintainers to come through
           this tree
      
         - removal of default_attrs from struct kobj_type now that all
           in-kernel users are removed.
      
           This cleans up the kobject code a little bit and removes some
           duplicated functionality that confused people (now there is only
           one way to do default groups)
      
        Both of these have been in linux-next for all of this week with no
        reported problems"
      
      * tag 'driver-core-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
        kobject: kobj_type: remove default_attrs
        powerpc/pseries/vas: use default_groups in kobj_type
      33563138
    • Linus Torvalds's avatar
      Merge tag 'char-misc-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · f58d3410
      Linus Torvalds authored
      Pull char/misc driver fix from Greg KH:
       "A single driver fix. It resolves the build warning issue on 32bit
        systems in the habannalabs driver that came in during the 5.18-rc1
        merge cycle.
      
        It has been in linux-next for all this week with no reported problems"
      
      * tag 'char-misc-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        habanalabs: Fix test build failures
      f58d3410
    • Linus Torvalds's avatar
      Merge tag 'powerpc-5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 4ea3c642
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
      
       - Fix KVM "lost kick" race, where an attempt to pull a vcpu out of the
         guest could be lost (or delayed until the next guest exit).
      
       - Disable SCV (system call vectored) when PR KVM guests could be run.
      
       - Fix KVM PR guests using SCV, by disallowing AIL != 0 for KVM PR
         guests.
      
       - Add a new KVM CAP to indicate if AIL == 3 is supported.
      
       - Fix a regression when hotplugging a CPU to a memoryless/cpuless node.
      
       - Make virt_addr_valid() stricter for 64-bit Book3E & 32-bit, which
         fixes crashes seen due to hardened usercopy.
      
       - Revert a change to max_mapnr which broke HIGHMEM.
      
      Thanks to Christophe Leroy, Fabiano Rosas, Kefeng Wang, Nicholas Piggin,
      and Srikar Dronamraju.
      
      * tag 'powerpc-5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        Revert "powerpc: Set max_mapnr correctly"
        powerpc: Fix virt_addr_valid() for 64-bit Book3E & 32-bit
        KVM: PPC: Move kvmhv_on_pseries() into kvm_ppc.h
        powerpc/numa: Handle partially initialized numa nodes
        powerpc/64: Fix build failure with allyesconfig in book3s_64_entry.S
        KVM: PPC: Use KVM_CAP_PPC_AIL_MODE_3
        KVM: PPC: Book3S PR: Disallow AIL != 0
        KVM: PPC: Book3S PR: Disable SCV when AIL could be disabled
        KVM: PPC: Book3S HV P9: Fix "lost kick" race
      4ea3c642