Commits · d37aa2efc89b387cda93bf15317883519683d435 · Kirill Smelkov / linux

24 May, 2022 1 commit

gcc-plugins: use KERNELVERSION for plugin version · d37aa2ef

Masahiro Yamada authored May 24, 2022

Commit 61f60bac ("gcc-plugins: Change all version strings match
kernel") broke parallel builds.

Instead of adding the dependency between GCC plugins and utsrelease.h,
let's use KERNELVERSION, which does not require any build artifact.

Another reason why I want to avoid utsrelease.h is because it depends
on CONFIG_LOCALVERSION(_AUTO) and localversion* files.

(include/generated/utsrelease.h depends on include/config/kernel.release,
which is generated by scripts/setlocalversion)

I want to keep host tools independent of the kernel configuration.

There is no good reason to rebuild GCC plugins just because of
CONFIG_LOCALVERSION being changed.

We just want to associate the plugin versions with the kernel source
version. KERNELVERSION should be enough for our purpose.
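A minimal sketch of the idea, assuming the build passes the kernel source version to the plugin sources as a string macro (called PLUGIN_VERSION here, for example via a -D compiler flag). The struct is a simplified, hypothetical stand-in and not GCC's real plugin descriptor; the fallback version string is only a placeholder so the sketch compiles on its own.

```c
#include <string.h>

/*
 * Assumption: the build system defines PLUGIN_VERSION from KERNELVERSION.
 * The fallback below is a placeholder so this sketch builds standalone.
 */
#ifndef PLUGIN_VERSION
#define PLUGIN_VERSION "5.18.0-rc2"
#endif

/* Hypothetical, simplified stand-in for a plugin's version/help record. */
struct sketch_plugin_info {
	const char *version;
	const char *help;
};

static const struct sketch_plugin_info example_plugin_info = {
	.version = PLUGIN_VERSION,
	.help    = "example plugin\n",
};

/* Returns non-zero when the plugin reports the given kernel source version. */
int plugin_version_matches(const char *kernelversion)
{
	return strcmp(example_plugin_info.version, kernelversion) == 0;
}
```

Because the version comes from a build-time constant rather than a generated header, nothing in this sketch depends on the kernel configuration or on localversion files.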

Fixes: 61f60bac ("gcc-plugins: Change all version strings match kernel")
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/linux-mm/202205230239.EZxeZ3Fv-lkp@intel.com
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220524135541.1453693-1-masahiroy@kernel.org

16 May, 2022 6 commits

loadpin: stop using bdevname · ed5edd5a

Christoph Hellwig authored May 12, 2022

Use the %pg format specifier to save on stack consumption and code size.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220512062014.1826835-1-hch@lst.de

mm: usercopy: move the virt_addr_valid() below the is_vmalloc_addr() · a5f4d9df

Yuanzheng Song authored May 05, 2022

The is_kmap_addr() and the is_vmalloc_addr() in the check_heap_object()
will not work, because the virt_addr_valid() will exclude the kmap and
vmalloc regions. So let's move the virt_addr_valid() below
the is_vmalloc_addr().

Signed-off-by: Yuanzheng Song <songyuanzheng@huawei.com>
Fixes: 4e140f59 ("mm/usercopy: Check kmap addresses properly")
Fixes: 0aef499f ("mm/usercopy: Detect vmalloc overruns")
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220505071037.4121100-1-songyuanzheng@huawei.com
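The ordering problem described in this commit can be illustrated with a small userspace sketch. The address ranges and the sketch_* predicates are hypothetical stand-ins for virt_addr_valid() and is_vmalloc_addr(); they only model the property stated above, namely that the validity test rejects vmalloc addresses.

```c
#include <stdbool.h>

/* Hypothetical address ranges, chosen only for illustration. */
#define SKETCH_LINEAR_START  0x1000UL
#define SKETCH_LINEAR_END    0x2000UL
#define SKETCH_VMALLOC_START 0x8000UL
#define SKETCH_VMALLOC_END   0x9000UL

enum region { REGION_UNCHECKED, REGION_VMALLOC, REGION_LINEAR };

/* Stand-in for virt_addr_valid(): true only for linear-map addresses. */
static bool sketch_virt_addr_valid(unsigned long addr)
{
	return addr >= SKETCH_LINEAR_START && addr < SKETCH_LINEAR_END;
}

/* Stand-in for is_vmalloc_addr(). */
static bool sketch_is_vmalloc_addr(unsigned long addr)
{
	return addr >= SKETCH_VMALLOC_START && addr < SKETCH_VMALLOC_END;
}

/* Old order: the validity test runs first and hides the vmalloc check. */
enum region classify_old(unsigned long addr)
{
	if (!sketch_virt_addr_valid(addr))
		return REGION_UNCHECKED;
	if (sketch_is_vmalloc_addr(addr))
		return REGION_VMALLOC;	/* never reached for vmalloc addresses */
	return REGION_LINEAR;
}

/* New order: the vmalloc check runs before the validity test. */
enum region classify_new(unsigned long addr)
{
	if (sketch_is_vmalloc_addr(addr))
		return REGION_VMALLOC;
	if (!sketch_virt_addr_valid(addr))
		return REGION_UNCHECKED;
	return REGION_LINEAR;
}
```

With the old order a vmalloc address is classified as unchecked; with the new order it reaches the vmalloc branch, while linear-map addresses behave the same either way.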

gcc-plugins: randstruct: Remove cast exception handling · 710e4ebf

Kees Cook authored May 10, 2022

With all randstruct exceptions removed, remove all the exception
handling code. Any future warnings are likely to be shared between
this plugin and Clang randstruct, and will need to be addressed in a
more holistic fashion.

Cc: Christoph Hellwig <hch@infradead.org>
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>

af_unix: Silence randstruct GCC plugin warning · b146cbf2

Kees Cook authored May 10, 2022

While preparing for Clang randstruct support (which duplicated many of
the warnings the randstruct GCC plugin warned about), one strange one
remained only for the randstruct GCC plugin. Eliminating this rids
the plugin of the last exception.

It seems the plugin is happy to dereference individual members of
a cross-struct cast, but it is upset about casting to a whole object
pointer. This only manifests in one place in the kernel, so just replace
the variable with individual member accesses. There is no change in
executable instruction output.

Drop the last exception from the randstruct GCC plugin.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Cong Wang <cong.wang@bytedance.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: netdev@vger.kernel.org
Cc: linux-hardening@vger.kernel.org
Acked-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Link: https://lore.kernel.org/lkml/20220511022217.58586-1-kuniyu@amazon.co.jp
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/lkml/20220511151542.4cb3ff17@kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>

niu: Silence randstruct warnings · 2dcfe9e2

Kees Cook authored May 09, 2022

Clang randstruct gets upset when it sees struct address_space (which is
randomized) being assigned to a struct page (which is not randomized):

drivers/net/ethernet/sun/niu.c:3385:12: error: casting from randomized structure pointer type 'struct address_space *' to 'struct page *'
                        *link = (struct page *) page->mapping;
                                ^

It looks like niu.c is looking for an in-line place to chain its allocated
pages together and is overloading the "mapping" member, as it is unused.
This is very non-standard, and is expected to be cleaned up in the
future[1], but there is no "correct" way to handle it today.

No meaningful machine code changes result after this change, and source
readability is improved.

Drop the randstruct exception now that there is no "confusing" cross-type
assignment.

[1] https://lore.kernel.org/lkml/YnqgjVoMDu5v9PNG@casper.infradead.org/

Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Du Cheng <ducheng2@gmail.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: netdev@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: linux-hardening@vger.kernel.org
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/lkml/20220511151647.7290adbe@kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>

big_keys: Use struct for internal payload · c1298a3a

Kees Cook authored May 08, 2022

The randstruct GCC plugin gets upset when it sees struct path (which is
randomized) being assigned from a "void *" (which it cannot type-check).

There's no need for these casts, as the entire internal payload use is
following a normal struct layout. Convert the enum-based void * offset
dereferencing to the new big_key_payload struct. No meaningful machine
code changes result after this change, and source readability is improved.
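A rough sketch of the conversion, using hypothetical names throughout (sketch_payload, sketch_big_payload, the SLOT_* indices). It assumes a common ABI where size_t and pointers have the same size, which the static assertions check at compile time. A union is used here so both views of the storage are well defined in standalone C.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NSLOTS 4

/* Old style: slot meaning is tracked by enum indices, values by casts. */
enum { SLOT_DATA, SLOT_PATH, SLOT_PATH_2ND, SLOT_LEN };

/* New style: a struct that names and types each field. */
struct sketch_path {
	void *mnt;
	void *dentry;
};

struct sketch_big_payload {
	unsigned char *data;
	struct sketch_path path;
	size_t length;
};

/* Hypothetical generic payload with both views of the same storage. */
union sketch_payload {
	void *slots[SKETCH_NSLOTS];
	struct sketch_big_payload big;
};

_Static_assert(sizeof(struct sketch_big_payload) <= sizeof(void *[SKETCH_NSLOTS]),
	       "typed view must fit in the generic payload");
_Static_assert(offsetof(struct sketch_big_payload, length) == SLOT_LEN * sizeof(void *),
	       "length must overlay the old length slot");

/* Old access: index by enum, then cast the untyped slot. */
size_t length_old(const union sketch_payload *p)
{
	return (size_t)(uintptr_t)p->slots[SLOT_LEN];
}

/* New access: a plain, type-checked member read. */
size_t length_new(const union sketch_payload *p)
{
	return p->big.length;
}

/* Writes through the typed view and reads back through both views. */
int demo_lengths_agree(void)
{
	union sketch_payload p = { .slots = { NULL, NULL, NULL, NULL } };

	p.big.length = 42;
	return length_old(&p) == 42 && length_new(&p) == 42;
}
```

The typed view removes the casts at every use site, which is the property a type-checking plugin can then verify.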

Drop the randstruct exception now that there is no "confusing" cross-type
assignment.

Cc: David Howells <dhowells@redhat.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: linux-hardening@vger.kernel.org
Cc: keyrings@vger.kernel.org
Cc: linux-security-module@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>

10 May, 2022 2 commits

gcc-plugins: Change all version strings match kernel · 61f60bac

Kees Cook authored May 10, 2022

It's not meaningful for the GCC plugins to track their versions separately
from the rest of the kernel. Switch all versions to the kernel version.

Fix mismatched indenting while we're at it.

Cc: linux-hardening@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>

randomize_kstack: Improve docs on requirements/rationale · 1ff29758

Kees Cook authored May 10, 2022

There were some recent questions about where and why to use the
random_kstack routines when applying them to new architectures[1].
Update the header comments to reflect the design choices for the
routines.

[1] https://lore.kernel.org/lkml/1652173338.7bltwybi0c.astroid@bobo.none

Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Kees Cook <keescook@chromium.org>

08 May, 2022 20 commits

lkdtm/stackleak: fix CONFIG_GCC_PLUGIN_STACKLEAK=n · 8c6a490e

Mark Rutland authored May 06, 2022

Recent rework broke building LKDTM when CONFIG_GCC_PLUGIN_STACKLEAK=n.
This patch fixes that breakage.

Prior to recent stackleak rework, the LKDTM STACKLEAK_ERASING code could
be built when the kernel was not built with stackleak support, and would
run a test that would almost certainly fail (or pass by sheer cosmic
coincidence), e.g.

| # echo STACKLEAK_ERASING > /sys/kernel/debug/provoke-crash/DIRECT
| lkdtm: Performing direct entry STACKLEAK_ERASING
| lkdtm: checking unused part of the thread stack (15560 bytes)...
| lkdtm: FAIL: the erased part is not found (checked 15560 bytes)
| lkdtm: FAIL: the thread stack is NOT properly erased!
| lkdtm: This is probably expected, since this kernel (5.18.0-rc2 aarch64) was built *without* CONFIG_GCC_PLUGIN_STACKLEAK=y

The recent rework to the test made it more accurate by using helpers
which are only defined when CONFIG_GCC_PLUGIN_STACKLEAK=y, and so when
building LKDTM when CONFIG_GCC_PLUGIN_STACKLEAK=n, we get a build
failure:

| drivers/misc/lkdtm/stackleak.c: In function 'check_stackleak_irqoff':
| drivers/misc/lkdtm/stackleak.c:30:46: error: implicit declaration of function 'stackleak_task_low_bound' [-Werror=implicit-function-declaration]
|    30 |         const unsigned long task_stack_low = stackleak_task_low_bound(current);
|       |                                              ^~~~~~~~~~~~~~~~~~~~~~~~
| drivers/misc/lkdtm/stackleak.c:31:47: error: implicit declaration of function 'stackleak_task_high_bound'; did you mean 'stackleak_task_init'? [-Werror=implicit-function-declaration]
|    31 |         const unsigned long task_stack_high = stackleak_task_high_bound(current);
|       |                                               ^~~~~~~~~~~~~~~~~~~~~~~~~
|       |                                               stackleak_task_init
| drivers/misc/lkdtm/stackleak.c:33:48: error: 'struct task_struct' has no member named 'lowest_stack'
|    33 |         const unsigned long lowest_sp = current->lowest_stack;
|       |                                                ^~
| drivers/misc/lkdtm/stackleak.c:74:23: error: implicit declaration of function 'stackleak_find_top_of_poison' [-Werror=implicit-function-declaration]
|    74 |         poison_high = stackleak_find_top_of_poison(task_stack_low, untracked_high);
|       |                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~

This patch fixes the issue by not compiling the body of the test when
CONFIG_GCC_PLUGIN_STACKLEAK=n, and replacing this with an unconditional
XFAIL message. This means the pr_expected_config() in
check_stackleak_irqoff() is redundant, and so it is removed.

Where an architecture does not support stackleak, the test will log:

| # echo STACKLEAK_ERASING > /sys/kernel/debug/provoke-crash/DIRECT
| lkdtm: Performing direct entry STACKLEAK_ERASING
| lkdtm: XFAIL: stackleak is not supported on this arch (HAVE_ARCH_STACKLEAK=n)

Where an architecture does support stackleak, but this has not been
compiled in, the test will log:

| # echo STACKLEAK_ERASING > /sys/kernel/debug/provoke-crash/DIRECT
| lkdtm: Performing direct entry STACKLEAK_ERASING
| lkdtm: XFAIL: stackleak is not enabled (CONFIG_GCC_PLUGIN_STACKLEAK=n)

Where stackleak has been compiled in, the test behaves as usual:

| # echo STACKLEAK_ERASING > /sys/kernel/debug/provoke-crash/DIRECT
| lkdtm: Performing direct entry STACKLEAK_ERASING
| lkdtm: stackleak stack usage:
|   high offset: 336 bytes
|   current:     688 bytes
|   lowest:      1232 bytes
|   tracked:     1232 bytes
|   untracked:   672 bytes
|   poisoned:    14136 bytes
|   low offset:  8 bytes
| lkdtm: OK: the rest of the thread stack is properly erased
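The three cases above can be sketched with compile-time selection. This is only an illustration: SKETCH_STACKLEAK_ENABLED and SKETCH_HAVE_ARCH_STACKLEAK are hypothetical stand-ins for the kernel configuration symbols, and the enabled branch returns a fixed string instead of running a real test body.

```c
#include <string.h>

/*
 * Hypothetical stand-ins for kernel config symbols. Define
 * SKETCH_STACKLEAK_ENABLED and/or SKETCH_HAVE_ARCH_STACKLEAK at
 * compile time to select a branch; by default neither is defined.
 */
const char *stackleak_test_result(void)
{
#if defined(SKETCH_STACKLEAK_ENABLED)
	/* Only here may the body use helpers that exist with stackleak on. */
	return "OK: the rest of the thread stack is properly erased";
#elif defined(SKETCH_HAVE_ARCH_STACKLEAK)
	return "XFAIL: stackleak is not enabled (CONFIG_GCC_PLUGIN_STACKLEAK=n)";
#else
	return "XFAIL: stackleak is not supported on this arch (HAVE_ARCH_STACKLEAK=n)";
#endif
}

/* Returns non-zero when the result is an expected failure. */
int stackleak_test_is_xfail(void)
{
	return strncmp(stackleak_test_result(), "XFAIL", 5) == 0;
}
```

Since the test body is excluded by the preprocessor when the feature is off, references to stackleak-only helpers never reach the compiler in that configuration.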

Fixes: f4cfacd92972cc44 ("lkdtm/stackleak: rework boundary management")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220506121145.1162908-1-mark.rutland@arm.com

arm64: entry: use stackleak_erase_on_task_stack() · 88959a39

Mark Rutland authored Apr 27, 2022

On arm64 we always call stackleak_erase() on a task stack, and never
call it on another stack. We can avoid some redundant work by using
stackleak_erase_on_task_stack(), telling the stackleak code that it's
being called on a task stack.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Will Deacon <will@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220427173128.2603085-14-mark.rutland@arm.com

stackleak: add on/off stack variants · 8111e67d

Mark Rutland authored Apr 27, 2022

The stackleak_erase() code dynamically handles being on a task stack or
another stack. In most cases, this is a fixed property of the caller,
which the caller is aware of, as an architecture might always return
using the task stack, or might always return using a trampoline stack.

This patch adds stackleak_erase_on_task_stack() and
stackleak_erase_off_task_stack() functions which callers can use to
avoid the on_thread_stack() check and associated redundant work when the
calling stack is known. The existing stackleak_erase() is retained as a
safe default.
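One way to picture the three entry points is a shared inline helper that takes the "on task stack" answer as a parameter. The sketch below is a userspace illustration with hypothetical names; the choice of upper erase bound in each case (current stack pointer when on the task stack, task stack high bound otherwise) is an assumption made for the example, not something stated in this commit message.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the values the erase code would consult. */
struct sketch_ctx {
	unsigned long current_sp;	/* stack pointer at call time */
	unsigned long task_stack_high;	/* high bound of the task stack */
	bool on_task_stack;		/* what a runtime check would report */
};

/* Shared helper: the caller states which stack it is running on. */
static inline unsigned long sketch_erase_high(const struct sketch_ctx *c,
					      bool on_task_stack)
{
	return on_task_stack ? c->current_sp : c->task_stack_high;
}

/* Safe default: decide at run time. */
unsigned long erase_high_default(const struct sketch_ctx *c)
{
	return sketch_erase_high(c, c->on_task_stack);
}

/* Variants for callers that already know the answer. */
unsigned long erase_high_on_task_stack(const struct sketch_ctx *c)
{
	return sketch_erase_high(c, true);
}

unsigned long erase_high_off_task_stack(const struct sketch_ctx *c)
{
	return sketch_erase_high(c, false);
}

/* The default and the matching fixed variant give the same bound. */
int demo_variants_agree(void)
{
	const struct sketch_ctx c = {
		.current_sp = 0x1800UL,
		.task_stack_high = 0x2000UL,
		.on_task_stack = true,
	};

	return erase_high_default(&c) == erase_high_on_task_stack(&c) &&
	       erase_high_off_task_stack(&c) == 0x2000UL;
}
```

Passing a constant into an inlined helper lets the compiler drop the branch in the fixed variants, which is the redundant work the commit refers to.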

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220427173128.2603085-13-mark.rutland@arm.com

lkdtm/stackleak: check stack boundaries · f171d695

Mark Rutland authored Apr 27, 2022

The stackleak code relies upon the current SP and lowest recorded SP
falling within expected task stack boundaries.

Check this at the start of the test.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220427173128.2603085-12-mark.rutland@arm.com
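A minimal sketch of such a boundary check, with hypothetical function names. It assumes the valid range is half-open, that is, the low bound is a usable address and the high bound is one past the last usable address.

```c
#include <stdbool.h>

/* True when sp lies within [low, high). */
bool sp_in_bounds(unsigned long sp, unsigned long low, unsigned long high)
{
	return sp >= low && sp < high;
}

/* Both the current SP and the lowest recorded SP must be in bounds. */
bool stack_state_sane(unsigned long current_sp, unsigned long lowest_sp,
		      unsigned long low, unsigned long high)
{
	return sp_in_bounds(current_sp, low, high) &&
	       sp_in_bounds(lowest_sp, low, high);
}
```

Running this check first means later steps can read the stack between the bounds without re-validating each pointer.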

lkdtm/stackleak: prevent unexpected stack usage · f03a5093

Mark Rutland authored Apr 27, 2022

The lkdtm_STACKLEAK_ERASING() test is instrumentable and runs with IRQs
unmasked, so it's possible for unrelated code to clobber the task stack
and/or manipulate current->lowest_stack while the test is running,
resulting in spurious failures.

The regular stackleak erasing code is non-instrumentable and runs with
IRQs masked, preventing similar issues.

Make the body of the test non-instrumentable, and run it with IRQs
masked, avoiding such spurious failures.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220427173128.2603085-11-mark.rutland@arm.com

lkdtm/stackleak: rework boundary management · 72b61896

Mark Rutland authored Apr 27, 2022

There are a few problems with the way the LKDTM STACKLEAK_ERASING test
manipulates the stack pointer and boundary values:

* It uses the address of a local variable to determine the current stack
  pointer, rather than using current_stack_pointer directly. As the
  local variable could be placed anywhere within the stack frame, this
  can be an over-estimate of the true stack pointer value.

* It uses an estimate of the current stack pointer as the upper boundary
  when scanning for poison, even though prior functions could have used
  more stack (and may have updated current->lowest_stack accordingly).

* A pr_info() call is made in the middle of the test. As the printk()
  code is out-of-line and will make use of the stack, this could clobber
  poison and/or adjust current->lowest_stack. It would be better to log
  the metadata after the body of the test to avoid such problems.

These have been observed to result in spurious test failures on arm64.

In addition to this there are a couple of things which are sub-optimal:

* To avoid the STACK_END_MAGIC value, it conditionally modifies 'left'
  if this contains more than a single element, when it could instead
  calculate the bound unconditionally using stackleak_task_low_bound().

* It open-codes the poison scanning. It would be better if this used the
  same helper code as used by the erasing function so that the two cannot
  diverge.

This patch reworks the test to avoid these issues, making use of the
recently introduced helpers to ensure this is aligned with the regular
stackleak code.

As the new code tests stack boundaries before accessing the stack, there
is no need to fail early when the tracked or untracked portions of the
stack extend all the way to the low stack boundary.

As stackleak_find_top_of_poison() is now used to find the top of the
poisoned region of the stack, the subsequent poison checking starts at
this boundary and verifies that stackleak_find_top_of_poison() is
working correctly.

The pr_info() which logged the untracked portion of stack is now moved
to the end of the function, and logs the size of all the portions of the
stack relevant to the test, including the portions at the top and bottom
of the stack which are not erased or scanned, and the current / lowest
recorded stack usage.

Tested on x86_64:

| # echo STACKLEAK_ERASING > /sys/kernel/debug/provoke-crash/DIRECT
| lkdtm: Performing direct entry STACKLEAK_ERASING
| lkdtm: stackleak stack usage:
|   high offset: 168 bytes
|   current:     336 bytes
|   lowest:      656 bytes
|   tracked:     656 bytes
|   untracked:   400 bytes
|   poisoned:    15152 bytes
|   low offset:  8 bytes
| lkdtm: OK: the rest of the thread stack is properly erased

Tested on arm64:

| # echo STACKLEAK_ERASING > /sys/kernel/debug/provoke-crash/DIRECT
| lkdtm: Performing direct entry STACKLEAK_ERASING
| lkdtm: stackleak stack usage:
|   high offset: 336 bytes
|   current:     656 bytes
|   lowest:      1232 bytes
|   tracked:     1232 bytes
|   untracked:   672 bytes
|   poisoned:    14136 bytes
|   low offset:  8 bytes
| lkdtm: OK: the rest of the thread stack is properly erased

Tested on arm64 with deliberate breakage to the starting stack value and
poison scanning:

| # echo STACKLEAK_ERASING > /sys/kernel/debug/provoke-crash/DIRECT
| lkdtm: Performing direct entry STACKLEAK_ERASING
| lkdtm: FAIL: non-poison value 24 bytes below poison boundary: 0x0
| lkdtm: FAIL: non-poison value 32 bytes below poison boundary: 0xffff8000083dbc00
...
| lkdtm: FAIL: non-poison value 1912 bytes below poison boundary: 0x78b4b9999e8cb15
| lkdtm: FAIL: non-poison value 1920 bytes below poison boundary: 0xffff8000083db400
| lkdtm: stackleak stack usage:
|   high offset: 336 bytes
|   current:     688 bytes
|   lowest:      1232 bytes
|   tracked:     576 bytes
|   untracked:   288 bytes
|   poisoned:    15176 bytes
|   low offset:  8 bytes
| lkdtm: FAIL: the thread stack is NOT properly erased!
| lkdtm: Unexpected! This kernel (5.18.0-rc1-00013-g1f7b1f1e29e0-dirty aarch64) was built with CONFIG_GCC_PLUGIN_STACKLEAK=y

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220427173128.2603085-10-mark.rutland@arm.com

lkdtm/stackleak: avoid spurious failure · 4130a61c

Mark Rutland authored Apr 27, 2022

The lkdtm_STACKLEAK_ERASING() test scans for a contiguous block of
poison values between the low stack bound and the stack pointer, and
fails if it does not find a sufficiently large block.

This can happen legitimately if the scan reaches the low stack bound, which
could occur if functions called prior to lkdtm_STACKLEAK_ERASING() used
a large amount of stack. If this were to occur, it means that the erased
portion of the stack is smaller than the size used by the scan, but does
not cause a functional problem.

In practice this is unlikely to happen, but as this is legitimate and
would not result in a functional problem, the test should not fail in
this case.

Remove the spurious failure case.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220427173128.2603085-9-mark.rutland@arm.com

stackleak: rework poison scanning · 77cf2b6d

Mark Rutland authored Apr 27, 2022

Currently we over-estimate the region of stack which must be erased.

To determine the region to be erased, we scan downwards for a contiguous
block of poison values (or the low bound of the stack). There are a few
minor problems with this today:

* When we find a block of poison values, we include this block within
  the region to erase.

  As this is included within the region to erase, this causes us to
  redundantly overwrite 'STACKLEAK_SEARCH_DEPTH' (128) bytes with
  poison.

* As the loop condition checks 'poison_count <= depth', it will run an
  additional iteration after finding the contiguous block of poison,
  decrementing 'erase_low' once more than necessary.

  As this is included within the region to erase, this causes us to
  redundantly overwrite an additional unsigned long with poison.

* As we always decrement 'erase_low' after checking an element on the
  stack, we always include the element below this within the region to
  erase.

  As this is included within the region to erase, this causes us to
  redundantly overwrite an additional unsigned long with poison.

  Note that this is not a functional problem. As the loop condition
  checks 'erase_low > task_stack_low', we'll never clobber the
  STACK_END_MAGIC. As we always decrement 'erase_low' after this, we'll
  never fail to erase the element immediately above the STACK_END_MAGIC.

In total, this can cause us to erase `128 + 2 * sizeof(unsigned long)`
bytes more than necessary, which is unfortunate.

This patch reworks the logic to find the address immediately above the
poisoned region, by finding the lowest non-poisoned address. This is
factored into a stackleak_find_top_of_poison() helper both for clarity
and so that this can be shared with the LKDTM test in subsequent
patches.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220427173128.2603085-8-mark.rutland@arm.com
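The reworked scan can be sketched in userspace as a search for the lowest non-poisoned element above a sufficiently long run of poison. This is an illustration rather than the kernel helper itself: the stack is modelled as an array indexed from the lowest address, and the poison value and search depth (4 elements instead of 128 bytes) are placeholders picked to keep the demo short.

```c
#include <stddef.h>

#define SKETCH_POISON       0xBADC0DEUL	/* placeholder poison value */
#define SKETCH_SEARCH_DEPTH 4		/* placeholder depth, in elements */

/*
 * Scan downwards from 'high' (exclusive) towards 'low' (inclusive) and
 * return the index of the lowest non-poisoned element seen before a run
 * of SKETCH_SEARCH_DEPTH poison values, or 'high' if none was seen.
 * Everything below the returned index within the run is already poison.
 */
size_t find_top_of_poison(const unsigned long *stack, size_t low, size_t high)
{
	size_t poison_count = 0;
	size_t poison_high = high;
	size_t i = high;

	while (i > low && poison_count < SKETCH_SEARCH_DEPTH) {
		i--;
		if (stack[i] == SKETCH_POISON) {
			poison_count++;
		} else {
			poison_count = 0;
			poison_high = i;
		}
	}

	return poison_high;
}

/* Index 0 is the lowest address; a lone poison word sits at index 7. */
size_t demo_top_of_poison(void)
{
	const unsigned long stack[12] = {
		SKETCH_POISON, SKETCH_POISON, SKETCH_POISON,
		SKETCH_POISON, SKETCH_POISON, SKETCH_POISON,
		1, SKETCH_POISON, 2, 3, 4, 5,
	};

	return find_top_of_poison(stack, 0, 12);
}
```

In the demo, the isolated poison word at index 7 does not end the scan, and the result is index 6, so the region to erase starts at the lowest dirty element and excludes the poison run beneath it.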

stackleak: rework stack high bound handling · 0cfa2ccd

Mark Rutland authored Apr 27, 2022

Prior to returning to userspace, we reset current->lowest_stack to a
reasonable high bound. Currently we do this by subtracting the arbitrary
value `THREAD_SIZE/64` from the top of the stack, for reasons lost to
history.

Looking at configurations today:

* On i386 where THREAD_SIZE is 8K, the bound will be 128 bytes. The
  pt_regs at the top of the stack is 68 bytes (with 0 to 16 bytes of
  padding above), and so this covers an additional portion of 44 to 60
  bytes.

* On x86_64 where THREAD_SIZE is at least 16K (up to 32K with KASAN) the
  bound will be at least 256 bytes (up to 512 with KASAN). The pt_regs
  at the top of the stack is 168 bytes, and so this covers an additional
  88 bytes of stack (up to 344 with KASAN).

* On arm64 where THREAD_SIZE is at least 16K (up to 64K with 64K pages
  and VMAP_STACK), the bound will be at least 256 bytes (up to 1024 with
  KASAN). The pt_regs at the top of the stack is 336 bytes, so this can
  fall within the pt_regs, or can cover an additional 688 bytes of
  stack.

Clearly the `THREAD_SIZE/64` value doesn't make much sense -- in the
worst case, this will cause more than 600 bytes of stack to be erased
for every syscall, even if actual stack usage were substantially
smaller.
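The figures in the list above follow from simple arithmetic, reproduced here as a small sketch. The function names are hypothetical; the THREAD_SIZE and pt_regs sizes used in the comments are the ones quoted in this commit message.

```c
/* Old high-bound offset from the top of the stack: THREAD_SIZE / 64. */
unsigned long old_high_offset(unsigned long thread_size)
{
	return thread_size / 64;
}

/*
 * Bytes covered beyond the pt_regs at the top of the stack. A negative
 * result means the old bound fell inside the pt_regs.
 *
 * i386:   8K stack, 68-byte pt_regs, no padding
 * x86_64: 16K or 32K stack, 168-byte pt_regs
 * arm64:  16K or 64K stack, 336-byte pt_regs
 */
long extra_beyond_pt_regs(unsigned long thread_size, unsigned long pt_regs_size)
{
	return (long)old_high_offset(thread_size) - (long)pt_regs_size;
}
```

For example, a 64K arm64 stack gives a 1024-byte offset and 688 extra bytes, while a 16K arm64 stack gives a 256-byte offset that lands 80 bytes inside the pt_regs.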

This patch makes this slightly less nonsensical by consistently
resetting current->lowest_stack to the base of the task pt_regs. For
clarity and for consistency with the handling of the low bound, the
generation of the high bound is split into a helper with commentary
explaining why.

Since the pt_regs at the top of the stack will be clobbered upon the
next exception entry, we don't need to poison these at exception exit.
By using task_pt_regs() as the high stack boundary instead of
current_top_of_stack() we avoid some redundant poisoning, and the
compiler can share the address generation between the poisoning and
resetting of `current->lowest_stack`, making the generated code more
optimal.

It's not clear to me whether the existing `THREAD_SIZE/64` offset was a
dodgy heuristic to skip the pt_regs, or whether it was attempting to
minimize the number of times stackleak_check_stack() would have to
update `current->lowest_stack` when stack usage was shallow at the cost
of unconditionally poisoning a small portion of the stack for every exit
to userspace.

For now I've simply removed the offset, and if we need/want to minimize
updates for shallow stack usage it should be easy to add a better
heuristic atop, with appropriate commentary so we know what's going on.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220427173128.2603085-7-mark.rutland@arm.com

stackleak: clarify variable names · 1723d39d

Mark Rutland authored Apr 27, 2022

The logic within __stackleak_erase() can be a little hard to follow, as
`boundary` switches from being the low bound to the high bound mid way
through the function, and `kstack_ptr` is used to represent the start of
the region to erase while `boundary` represents the end of the region to
erase.

Make this a little clearer by consistently using clearer variable names.
The `boundary` variable is removed, the bounds of the region to erase
are described by `erase_low` and `erase_high`, and bounds of the task
stack are described by `task_stack_low` and `task_stack_high`.

At the same time, remove the comment above the variables, since it is
unclear whether it's intended as rationale, a complaint, or a TODO, and
is more confusing than helpful.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220427173128.2603085-6-mark.rutland@arm.com

stackleak: rework stack low bound handling · 9ec79840

Mark Rutland authored Apr 27, 2022

In stackleak_task_init(), stackleak_track_stack(), and
__stackleak_erase(), we open-code skipping the STACK_END_MAGIC at the
bottom of the stack. Each case is implemented slightly differently, and
only the __stackleak_erase() case is commented.

In stackleak_task_init() and stackleak_track_stack() we unconditionally
add sizeof(unsigned long) to the lowest stack address. In
stackleak_task_init() we use end_of_stack() for this, and in
stackleak_track_stack() we use task_stack_page(). In __stackleak_erase()
we handle this by detecting if `kstack_ptr` has hit the stack end
boundary, and if so, conditionally moving it above the magic.

This patch adds a new stackleak_task_low_bound() helper which is used in
all three cases, which unconditionally adds sizeof(unsigned long) to the
lowest address on the task stack, with commentary as to why. This uses
end_of_stack() as stackleak_task_init() did prior to this patch, as this
is consistent with the code in kernel/fork.c which initializes the
STACK_END_MAGIC value.

In __stackleak_erase() we no longer need to check whether we've spilled
into the STACK_END_MAGIC value, as stackleak_track_stack() ensures that
`current->lowest_stack` stops immediately above this, and similarly the
poison scan will stop immediately above this.

For stackleak_task_init() and stackleak_track_stack() this results in no
change to code generation. For __stackleak_erase() the generated
assembly is slightly simpler and shorter.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220427173128.2603085-5-mark.rutland@arm.com
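The low-bound helper described in this commit can be sketched in userspace as follows. The sketch_* names, the tiny stack size, and the magic constant are placeholders for illustration; only the rule itself (skip one unsigned long above the lowest stack address) comes from the text above.

```c
#include <stdint.h>

#define SKETCH_STACK_WORDS 16
#define SKETCH_END_MAGIC   0x57AC6E9DUL	/* placeholder end-of-stack marker */

/* Hypothetical task with a small downward-growing stack. */
struct sketch_task {
	unsigned long stack[SKETCH_STACK_WORDS];
};

/* Stand-in for end_of_stack(): the lowest address of the stack. */
static unsigned long *sketch_end_of_stack(struct sketch_task *t)
{
	return &t->stack[0];
}

/* Places the marker at the very bottom of the stack. */
void sketch_task_init(struct sketch_task *t)
{
	*sketch_end_of_stack(t) = SKETCH_END_MAGIC;
}

/* Lowest address that tracking and erasing may touch: just above the marker. */
uintptr_t sketch_task_low_bound(struct sketch_task *t)
{
	return (uintptr_t)sketch_end_of_stack(t) + sizeof(unsigned long);
}

/* The low bound is the second word, and the marker stays intact. */
int demo_low_bound_skips_magic(void)
{
	struct sketch_task t = { { 0 } };

	sketch_task_init(&t);
	return sketch_task_low_bound(&t) == (uintptr_t)&t.stack[1] &&
	       t.stack[0] == SKETCH_END_MAGIC;
}
```

Centralising the offset in one helper means every user agrees on where the usable stack begins, so no caller needs its own special case for the marker.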

stackleak: remove redundant check · ac7838b4

Mark Rutland authored Apr 27, 2022

In __stackleak_erase() we check that the `erase_low` value derived from
`current->lowest_stack` is above the lowest legitimate stack pointer
value, but this is already enforced by stackleak_track_stack() when
recording the lowest stack value.

Remove the redundant check.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220427173128.2603085-4-mark.rutland@arm.com

stackleak: move skip_erasing() check earlier · a12685e2

Mark Rutland authored Apr 27, 2022

In stackleak_erase() we check skip_erasing() after accessing some fields
from current. As generating the address of current uses asm which
hazards with the static branch asm, this work is always performed, even
when the static branch is patched to jump to the return at the end of the
function.

This patch avoids this redundant work by moving the skip_erasing() check
earlier.

To avoid complicating initialization within stackleak_erase(), the body
of the function is split out into a __stackleak_erase() helper, with the
check left in a wrapper function. The __stackleak_erase() helper is
marked __always_inline to ensure that this is inlined into
stackleak_erase() and not instrumented.
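The wrapper-plus-helper split can be sketched in userspace as below. A plain boolean stands in for the static branch and a counter stands in for the loads from current; both, along with the sketch_* names, are hypothetical and exist only to make the ordering observable.

```c
#include <stdbool.h>

static bool sketch_skip_erasing;	/* stand-in for the static branch */
static unsigned int sketch_loads;	/* counts simulated loads from 'current' */

/*
 * Body of the erase routine. The kernel marks its helper as always
 * inlined; plain 'static inline' is used in this sketch.
 */
static inline void sketch_erase_body(void)
{
	sketch_loads++;	/* stand-in for reading fields of 'current' */
	/* ... the erase work itself would follow here ... */
}

/* Wrapper: the skip check comes first, before any other work. */
void sketch_stackleak_erase(void)
{
	if (sketch_skip_erasing)
		return;

	sketch_erase_body();
}

/* Returns how many simulated loads one call performed. */
unsigned int demo_loads(bool skip)
{
	sketch_skip_erasing = skip;
	sketch_loads = 0;
	sketch_stackleak_erase();
	return sketch_loads;
}
```

When skipping is requested, the wrapper returns before the body touches any state, which mirrors the reordering visible in the disassembly that follows.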

Before this patch, on x86-64 w/ GCC 11.1.0 the start of the function is:

<stackleak_erase>:
   65 48 8b 04 25 00 00    mov    %gs:0x0,%rax
   00 00
   48 8b 48 20             mov    0x20(%rax),%rcx
   48 8b 80 98 0a 00 00    mov    0xa98(%rax),%rax
   66 90                   xchg   %ax,%ax  <------------ static branch
   48 89 c2                mov    %rax,%rdx
   48 29 ca                sub    %rcx,%rdx
   48 81 fa ff 3f 00 00    cmp    $0x3fff,%rdx

After this patch, on x86-64 w/ GCC 11.1.0 the start of the function is:

<stackleak_erase>:
   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)  <--- static branch
   65 48 8b 04 25 00 00    mov    %gs:0x0,%rax
   00 00
   48 8b 48 20             mov    0x20(%rax),%rcx
   48 8b 80 98 0a 00 00    mov    0xa98(%rax),%rax
   48 89 c2                mov    %rax,%rdx
   48 29 ca                sub    %rcx,%rdx
   48 81 fa ff 3f 00 00    cmp    $0x3fff,%rdx

Before this patch, on arm64 w/ GCC 11.1.0 the start of the function is:

<stackleak_erase>:
   d503245f        bti     c
   d5384100        mrs     x0, sp_el0
   f9401003        ldr     x3, [x0, #32]
   f9451000        ldr     x0, [x0, #2592]
   d503201f        nop  <------------------------------- static branch
   d503233f        paciasp
   cb030002        sub     x2, x0, x3
   d287ffe1        mov     x1, #0x3fff
   eb01005f        cmp     x2, x1

After this patch, on arm64 w/ GCC 11.1.0 the start of the function is:

<stackleak_erase>:
   d503245f        bti     c
   d503201f        nop  <------------------------------- static branch
   d503233f        paciasp
   d5384100        mrs     x0, sp_el0
   f9401003        ldr     x3, [x0, #32]
   d287ffe1        mov     x1, #0x3fff
   f9451000        ldr     x0, [x0, #2592]
   cb030002        sub     x2, x0, x3
   eb01005f        cmp     x2, x1

While this may not be a huge win on its own, moving the static branch
will permit further optimization of the body of the function in
subsequent patches.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220427173128.2603085-3-mark.rutland@arm.com

a12685e2

arm64: stackleak: fix current_top_of_stack() · e85094c3

Mark Rutland authored Apr 27, 2022

Due to some historical confusion, arm64's current_top_of_stack() isn't
what the stackleak code expects. This could in theory result in a number
of problems, and practically results in an unnecessary performance hit.
We can avoid this by aligning the arm64 implementation with the x86
implementation.

The arm64 implementation of current_top_of_stack() was added
specifically for stackleak in commit:

  0b3e3366 ("arm64: Add support for STACKLEAK gcc plugin")

This was intended to be equivalent to the x86 implementation, but the
implementation, semantics, and performance characteristics differ
wildly:

* On x86, current_top_of_stack() returns the top of the current task's
  task stack, regardless of which stack is in active use.

  The implementation accesses a percpu variable which the x86 entry code
  maintains, and returns the location immediately above the pt_regs on
  the task stack (above which x86 has some padding).

* On arm64, current_top_of_stack() returns the top of whichever stack is
  in active use.

  The implementation checks the SP against a number of
  potentially-accessible stacks, and will BUG() if no stack is found.

The core stackleak_erase() code determines the upper bound of stack to
erase with:

| if (on_thread_stack())
|         boundary = current_stack_pointer;
| else
|         boundary = current_top_of_stack();

On arm64, stackleak_erase() is always called on a task stack, and
on_thread_stack() should always be true. On x86, stackleak_erase() is
mostly called on a trampoline stack, and is sometimes called on a task
stack.
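
To make the boundary selection above concrete, here is a minimal
user-space sketch. The struct and the model_* functions are hypothetical,
and the task stack is assumed to be a single contiguous range
[base, top), with top playing the role of the value returned by
current_top_of_stack().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a task stack occupying [base, top). */
struct model_task_stack {
	uintptr_t base;
	uintptr_t top;
};

/* True if sp lies within the task stack. */
static bool model_on_thread_stack(const struct model_task_stack *t,
				  uintptr_t sp)
{
	return sp >= t->base && sp < t->top;
}

/*
 * Upper bound of the region to erase: the current SP when running on
 * the task stack, otherwise the top of the task stack.
 */
static uintptr_t model_erase_boundary(const struct model_task_stack *t,
				      uintptr_t sp)
{
	assert(t->base < t->top);

	if (model_on_thread_stack(t, sp))
		return sp;

	return t->top;
}
```

In this model, an SP inside the task stack (the arm64 case) is returned
unchanged, while an SP on some other stack (the x86 trampoline case)
falls back to the top of the task stack.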

Currently, this results in a lot of unnecessary code being generated for
arm64 for the impossible !on_thread_stack() case. Some of this is
inlined, bloating stackleak_erase(), while portions of this are left
out-of-line and permitted to be instrumented (which would be a
functional problem if that code were reachable).

As a first step towards improving this, this patch aligns arm64's
implementation of current_top_of_stack() with x86's, always returning
the top of the current task's stack. With GCC 11.1.0 this results in the
bulk of the unnecessary code being removed, including all of the
out-of-line instrumentable code.
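
The difference between the two semantics can be sketched in user space
as below. Everything here is a hypothetical model: the stack size, the
two-stack CPU, and the assumption that the top of the task stack is
simply the stack base plus the stack size. A return value of 0 models
the BUG() case.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_STACK_SIZE 0x4000u	/* hypothetical stack size */

/* Hypothetical model: each stack is [base, base + MODEL_STACK_SIZE). */
struct model_stack {
	uintptr_t base;
};

struct model_cpu {
	struct model_stack task;	/* the current task's stack */
	struct model_stack irq;		/* some other accessible stack */
};

static bool model_on_stack(const struct model_stack *s, uintptr_t sp)
{
	return sp >= s->base && sp < s->base + MODEL_STACK_SIZE;
}

/* Old arm64-style semantics: top of whichever stack SP is on. */
static uintptr_t model_top_of_active_stack(const struct model_cpu *c,
					   uintptr_t sp)
{
	if (model_on_stack(&c->task, sp))
		return c->task.base + MODEL_STACK_SIZE;
	if (model_on_stack(&c->irq, sp))
		return c->irq.base + MODEL_STACK_SIZE;

	return 0;	/* models BUG(): no stack found */
}

/* x86-style semantics: top of the task stack, regardless of SP. */
static uintptr_t model_top_of_task_stack(const struct model_cpu *c)
{
	return c->task.base + MODEL_STACK_SIZE;
}
```

The two agree whenever SP is on the task stack, but only the second needs
no SP-dependent search, which is why it produces far less code.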

While I don't believe there's a functional problem in practice, I've
marked this as a fix since the semantics were clearly wrong, the fix
itself is simple, and other code might rely upon this in future.

Fixes: 0b3e3366 ("arm64: Add support for STACKLEAK gcc plugin")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Will Deacon <will@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220427173128.2603085-2-mark.rutland@arm.com

e85094c3

randstruct: Enable Clang support · 035f7f87