1. 06 Dec, 2021 5 commits
    • irqchip: nvic: Use GENERIC_IRQ_MULTI_HANDLER · 52d24087
      Vladimir Murzin authored
      Rather than restructuring the ARMv7M entry logic per TODO, just move
      NVIC to GENERIC_IRQ_MULTI_HANDLER.
      Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Marc Zyngier <maz@kernel.org>
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Tested-by: Marc Zyngier <maz@kernel.org>
      Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
    • ARM: remove old-style irq entry · 54f481a2
      Arnd Bergmann authored
      The last user of arch_irq_handler_default is gone now, so the
      entry-macro-multi.S file and all references to mach/entry-macro.S can
      be removed, as well as the asm_do_IRQ() entrypoint into the interrupt
      handling routines implemented in C.
      
      Note: The ARMv7-M entry still uses its own top-level IRQ entry, calling
      nvic_handle_irq() from assembly. This could be changed to go through
      generic_handle_arch_irq() as well, but it's unclear to me if there are
      any benefits.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      [ardb: keep irq_handler macro as it carries all the IRQ stack handling]
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Tested-by: Marc Zyngier <maz@kernel.org>
      Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
      Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
    • ARM: iop32x: use GENERIC_IRQ_MULTI_HANDLER · 6f5d248d
      Arnd Bergmann authored
      iop32x uses the entry-macro.S file for both the IRQ entry and for
      hooking into the arch_ret_to_user code path. This is done because the
      cp6 registers have to be enabled before accessing any of the interrupt
      controller registers but have to be disabled when running in user space.
      
      There is also a lazy-enable logic in cp6.c, but during a hardirq, we
      know it has to be enabled.
      
      Both the cp6-enable code and the code to read the IRQ status can be
      lifted into the normal generic_handle_arch_irq() path, but the
      cp6-disable code has to remain in the user return code. As nothing
      other than iop32x uses this hook, just open-code it there with an
      ifdef for the platform that can eventually be removed when iop32x
      has reached the end of its life.
      
      The cp6-enable path in the IRQ entry has an extra cp_wait barrier that
      the trap version does not have, but it is harmless to do it in both
      cases to simplify the logic here at the cost of a few extra cycles
      for the trap.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Tested-by: Marc Zyngier <maz@kernel.org>
      Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
    • ARM: iop32x: offset IRQ numbers by 1 · 9d67412f
      Arnd Bergmann authored
      iop32x is one of the last platforms to use IRQ 0, and this has apparently
      stopped working in a 2014 cleanup without anyone noticing. This interrupt
      is used for the DMA engine, so most likely this has not actually worked
      in the past 7 years, but it's also not essential for using this board.
      
      I'm splitting out this change from my GENERIC_IRQ_MULTI_HANDLER
      conversion so it can be backported if anyone cares.
      
      Fixes: a71b092a ("ARM: Convert handle_IRQ to use __handle_domain_irq")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      [ardb: take +1 offset into account in mask/unmask and init as well]
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Tested-by: Marc Zyngier <maz@kernel.org>
      Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
      Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
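      The +1 offset described in the iop32x commit above can be sketched
      as follows. This is only an illustration: the helper names and the
      software copy of the mask register are hypothetical and are not
      taken from the actual driver. The point is that every path that
      converts between a Linux IRQ number and a hardware bit (including
      mask and unmask) has to apply the same offset.

```c
#include <stdint.h>

/* Hypothetical names; the real driver's identifiers may differ. */
#define IRQ_OFFSET 1u /* Linux IRQ number = hardware bit + 1, so IRQ 0 is never handed out */

static inline unsigned int hwbit_to_irq(unsigned int bit)
{
	return bit + IRQ_OFFSET;
}

static inline unsigned int irq_to_hwbit(unsigned int irq)
{
	return irq - IRQ_OFFSET;
}

/* Software stand-in for the interrupt mask register (bit set = enabled). */
static uint32_t irq_mask;

static void unmask_irq(unsigned int irq)
{
	irq_mask |= UINT32_C(1) << irq_to_hwbit(irq);
}

static void mask_irq(unsigned int irq)
{
	irq_mask &= ~(UINT32_C(1) << irq_to_hwbit(irq));
}
```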
    • ARM: footbridge: use GENERIC_IRQ_MULTI_HANDLER · 90890f17
      Arnd Bergmann authored
      Footbridge still uses the classic IRQ entry path in assembler,
      but this is easily converted into an equivalent C version.
      
      In this case, the correlation between IRQ numbers and bits in
      the status register is non-obvious, and the priorities are
      handled by manually checking each bit in a static order,
      re-reading the status register after each handled event.
      
      I moved the code into the new file and edited the syntax without
      changing this sequence to keep the behavior as close as possible
      to what it traditionally did.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Tested-by: Marc Zyngier <maz@kernel.org>
      Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
      Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
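      The dispatch pattern described in the footbridge commit above
      (check status bits in a fixed priority order, re-read the status
      register after each handled event) might look roughly like this in
      C. The table contents, the fake status register and all function
      names are made up for the sketch; the real bit-to-IRQ mapping is
      in the kernel source.

```c
#include <stddef.h>
#include <stdint.h>

/* One row per source, listed in the order the bits are checked. */
struct prio_entry {
	uint32_t mask;    /* bit in the status register */
	unsigned int irq; /* IRQ number; deliberately unrelated to the bit position */
};

/* Hypothetical table, not the footbridge mapping. */
static const struct prio_entry prio_table[] = {
	{ UINT32_C(1) << 4, 9 },
	{ UINT32_C(1) << 0, 3 },
	{ UINT32_C(1) << 7, 1 },
};

static uint32_t fake_status;    /* stands in for the hardware status register */
static unsigned int handled[8]; /* record of dispatched IRQs, for illustration */
static size_t n_handled;

static uint32_t read_status(void)
{
	return fake_status;
}

static void handle_one(const struct prio_entry *e)
{
	if (n_handled < sizeof(handled) / sizeof(handled[0]))
		handled[n_handled++] = e->irq;
	fake_status &= ~e->mask; /* the handler acknowledges the source */
}

static void handle_irq(void)
{
	const size_t n = sizeof(prio_table) / sizeof(prio_table[0]);

	for (;;) {
		uint32_t status = read_status(); /* re-read after every handled event */
		size_t i;

		for (i = 0; i < n; i++) {
			if (status & prio_table[i].mask) {
				handle_one(&prio_table[i]);
				break; /* restart from the highest priority */
			}
		}
		if (i == n)
			return; /* no known source pending */
	}
}
```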
  2. 03 Dec, 2021 20 commits
    • ARM: riscpc: use GENERIC_IRQ_MULTI_HANDLER · c1fe8d05
      Arnd Bergmann authored
      This is one of the last platforms using the old entry path.
      While this code path is spread over a few files, it is fairly
      straightforward to convert it into an equivalent C version,
      leaving the existing algorithm and all the priority handling
      the same.
      
      Unlike most irqchip drivers, this means reading the status
      register(s) in a loop and always handling the highest-priority
      irq first.
      
      The IOMD_IRQREQC and IOMD_IRQREQD registers are not actually
      used here, but I left the code in place for the time being,
      to keep the conversion as direct as possible. It could be
      removed in a cleanup on top.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      [ardb: drop obsolete IOMD_IRQREQC/IOMD_IRQREQD handling]
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Tested-by: Marc Zyngier <maz@kernel.org>
      Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
    • ARM: riscpc: drop support for IOMD_IRQREQC/IOMD_IRQREQD IRQ groups · d60ff2e7
      Ard Biesheuvel authored
      Neither IOMD_IRQREQC nor IOMD_IRQREQD is ever defined, so any conditionally
      compiled code that depends on them is dead code, and can be removed.
      Suggested-by: Russell King <linux@armlinux.org.uk>
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
    • ARM: implement support for vmap'ed stacks · a1c510d0
      Ard Biesheuvel authored
      Wire up the generic support for managing task stack allocations via vmalloc,
      and implement the entry code that detects whether we faulted because of a
      stack overrun (or future stack overrun caused by pushing the pt_regs array).
      
      While this adds a fair amount of tricky entry asm code, it should be
      noted that it only adds a TST + branch to the svc_entry path. The code
      implementing the non-trivial handling of the overflow stack is emitted
      out-of-line into the .text section.
      
      Since on ARM, we rely on do_translation_fault() to keep PMD level page
      table entries that cover the vmalloc region up to date, we need to
      ensure that we don't hit such a stale PMD entry when accessing the
      stack. So we do a dummy read from the new stack while still running from
      the old one on the context switch path, and bump the vmalloc_seq counter
      when PMD level entries in the vmalloc range are modified, so that the MM
      switch fetches the latest version of the entries.
      
      Note that we need to increase the per-mode stack by 1 word, to gain some
      space to stash a GPR until we know it is safe to touch the stack.
      However, due to the cacheline alignment of the struct, this does not
      actually increase the memory footprint of the struct stack array at all.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Tested-by: Keith Packard <keithpac@amazon.com>
      Tested-by: Marc Zyngier <maz@kernel.org>
      Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
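      A single TST + branch can detect an overflow if stacks are aligned
      to twice their size, so that one address bit is always clear for
      an in-bounds stack pointer. The sketch below illustrates that idea
      in C. The 8 KiB stack size, the 2x alignment and the function name
      are assumptions made for the example and are not quoted from the
      commit; the check is meant to be applied after the exception frame
      has been subtracted from SP.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration only. */
#define THREAD_SIZE  UINT32_C(0x2000)  /* 8 KiB stack */
#define THREAD_ALIGN (2 * THREAD_SIZE) /* stack base is aligned to twice its size */

/*
 * The stack occupies [base, base + THREAD_SIZE) with base aligned to
 * THREAD_ALIGN. For any SP inside that range the THREAD_SIZE bit is
 * clear; once SP drops below base (by up to THREAD_SIZE bytes) the bit
 * becomes set. One bit test is therefore enough.
 */
static bool sp_overflowed(uint32_t sp)
{
	return (sp & THREAD_SIZE) != 0;
}
```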
    • ARM: entry: rework stack realignment code in svc_entry · ae5cc07d
      Ard Biesheuvel authored
      The original Thumb-2 enablement patches updated the stack realignment
      code in svc_entry to work around the lack of a STMIB instruction in
      Thumb-2, by subtracting 4 from the frame size, inverting the sense of
      the misalignment check, and changing to a STMIA instruction and a final
      stack push of a 4 byte quantity that results in the stack becoming
      aligned at the end of the sequence. It also pushes and pops R0 to the
      stack in order to have a temp register that Thumb-2 allows in general
      purpose ALU instructions, as TST using SP is not permitted.
      
      Both are a bit problematic for vmap'ed stacks, as using the stack is
      only permitted after we decide that we did not overflow the stack, or
      have already switched to the overflow stack.
      
      As for the alignment check: the current approach creates a corner case
      where, if the initial SUB of SP ends up right at the start of the stack,
      we will end up subtracting another 8 bytes and overflowing it.  This
      means we would need to add the overflow check *after* the SUB that
      deliberately misaligns the stack. However, this would require us to keep
      local state (i.e., whether we performed the subtract or not) across the
      overflow check, but without any GPRs or stack available.
      
      So let's switch to an approach where we don't use the stack, and where
      the alignment check of the stack pointer occurs in the usual way, as
      this is guaranteed not to result in overflow. This means we will be able
      to do the overflow check first.
      
      While at it, switch to R1 so the mode stack pointer in R0 remains
      accessible.
      Acked-by: Nicolas Pitre <nico@fluxnic.net>
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Tested-by: Marc Zyngier <maz@kernel.org>
      Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
    • ARM: switch_to: clean up Thumb2 code path · b832faec
      Ard Biesheuvel authored
      The load-multiple instruction that essentially performs the switch_to
      operation in ARM mode, by loading all callee save registers as well as the
      stack pointer and the program counter, is split into 3 separate loads
      for Thumb-2, with the IP register used as a temporary to capture the
      value of R4 before it gets overwritten.
      
      We can clean this up a bit, by sticking with a single LDMIA instruction,
      but one that pops SP and PC into IP and LR, respectively, and by using
      ordinary move register and branch instructions to get those values into
      SP and PC. This also allows us to move the set_current call closer to
      the assignment of SP, reducing the window where those are mutually out
      of sync. This is especially relevant for CONFIG_VMAP_STACK, which is
      being introduced in a subsequent patch, where we need to issue a load
      that might fault from the new stack while running from the old one, to
      ensure that stale PMD entries in the VMALLOC space are synced up.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Tested-by: Keith Packard <keithpac@amazon.com>
      Tested-by: Marc Zyngier <maz@kernel.org>
      Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
    • ARM: unwind: disregard unwind info before stack frame is set up · 532319b9
      Ard Biesheuvel authored
      When unwinding the stack from a stack overflow, we are likely to start
      from a stack push instruction, given that this is the most common way to
      grow the stack for compiler emitted code. This push instruction rarely
      appears anywhere else than at offset 0x0 of the function, and if it
      doesn't, the compiler tends to split up the unwind annotations, given
      that the stack frame layout is apparently not the same throughout the
      function.
      
      This means that, in the general case, if the frame's PC points at the
      first instruction covered by a certain unwind entry, there is no way the
      stack frame that the unwind entry describes could have been created yet,
      and so we are still on the stack frame of the caller in that case. So
      treat this as a special case, and return with the new PC taken from the
      frame's LR, without applying the unwind transformations to the virtual
      register set.
      
      This permits us to unwind the call stack on stack overflow when the
      overflow was caused by a stack push on function entry.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Tested-by: Keith Packard <keithpac@amazon.com>
      Tested-by: Marc Zyngier <maz@kernel.org>
      Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
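      The special case described in the unwind commit above can be
      sketched in C as follows. The struct layouts and the function name
      are hypothetical simplifications and do not match the kernel's
      unwinder types. The extra guard against pc == lr is an assumption
      added here so the sketch cannot loop forever.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified types; not the kernel's definitions. */
struct unwind_entry {
	uint32_t start; /* address of the first instruction the entry covers */
};

struct frame {
	uint32_t pc;
	uint32_t lr;
	uint32_t sp;
};

/*
 * If PC sits on the first instruction covered by the unwind entry, the
 * frame that the entry describes cannot exist yet, so we are still on
 * the caller's frame: take the new PC from LR and leave SP and the rest
 * of the virtual register set untouched.
 *
 * Returns true when the special case applied.
 */
static bool unwind_at_entry_start(struct frame *f, const struct unwind_entry *e)
{
	if (f->pc != e->start)
		return false; /* normal unwind opcodes must be applied instead */
	if (f->pc == f->lr)
		return false; /* assumed guard: no progress would be made */
	f->pc = f->lr;
	return true;
}
```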
    • ARM: memset: clean up unwind annotations · ad3d09b5
      Ard Biesheuvel authored
      The memset implementation carves up the code in different sections, each
      covered with their own unwind info. In this case, it is done in a way
      similar to how the compiler might do it, to disambiguate between parts
      where the return address is in LR and the SP is unmodified, and parts
      where a stack frame is live, and the unwinder needs to know the size of
      the stack frame and the location of the return address within it.
      
      Only the placement of the unwind directives is slightly odd: the stack
      pushes are placed in the wrong sections, which may confuse the unwinder
      when attempting to unwind with PC pointing at the stack push in
      question.
      
      So let's fix this up, by reordering the directives and instructions as
      appropriate.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Tested-by: Keith Packard <keithpac@amazon.com>
      Tested-by: Marc Zyngier <maz@kernel.org>
      Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
    • ARM: memmove: use frame pointer as unwind anchor · ccb81601
      Ard Biesheuvel authored
      The memmove routine is a bit unusual in the way it manages the stack
      pointer: depending on the execution path through the function, the SP
      assumes different values as different subsets of the register file are
      preserved and restored again. This is problematic when it comes to EHABI
      unwind info, as it is not instruction accurate, and does not allow
      tracking the SP value as it changes.
      
      Commit 207a6cb0 ("ARM: 8224/1: Add unwinding support for memmove
      function") addressed this by carving up the function in different chunks
      as far as the unwinder is concerned, and keeping a set of unwind
      directives for each of them, each corresponding with the state of the
      stack pointer during execution of the chunk in question. This not only
      duplicates unwind info unnecessarily, but it also complicates unwinding
      the stack upon overflow.
      
      Instead, let's do what the compiler does when the SP is updated halfway
      through a function, which is to use a frame pointer and emit the
      appropriate unwind directives to communicate this to the unwinder.
      
      Note that Thumb-2 uses R7 for this, while ARM uses R11 aka FP. So let's
      avoid touching R7 in the body of the function, so that Thumb-2 can use
      it as the frame pointer. R11 was not modified in the first place.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Tested-by: Keith Packard <keithpac@amazon.com>
      Tested-by: Marc Zyngier <maz@kernel.org>
      Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
    • ARM: memcpy: use frame pointer as unwind anchor · ba999a04
      Ard Biesheuvel authored
      The memcpy template is a bit unusual in the way it manages the stack
      pointer: depending on the execution path through the function, the SP
      assumes different values as different subsets of the register file are
      preserved and restored again. This is problematic when it comes to EHABI
      unwind info, as it is not instruction accurate, and does not allow
      tracking the SP value as it changes.
      
      Commit 279f487e ("ARM: 8225/1: Add unwinding support for memory
      copy functions") addressed this by carving up the function in different
      chunks as far as the unwinder is concerned, and keeping a set of unwind
      directives for each of them, each corresponding with the state of the
      stack pointer during execution of the chunk in question. This not only
      duplicates unwind info unnecessarily, but it also complicates unwinding
      the stack upon overflow.
      
      Instead, let's do what the compiler does when the SP is updated halfway
      through a function, which is to use a frame pointer and emit the
      appropriate unwind directives to communicate this to the unwinder.
      
      Note that Thumb-2 uses R7 for this, while ARM uses R11 aka FP. So let's
      avoid touching R7 in the body of the template, so that Thumb-2 can use
      it as the frame pointer. R11 was not modified in the first place.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Tested-by: Keith Packard <keithpac@amazon.com>
      Tested-by: Marc Zyngier <maz@kernel.org>
      Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
    • ARM: run softirqs on the per-CPU IRQ stack · 9974f857
      Ard Biesheuvel authored
      Now that we have enabled IRQ stacks, any softIRQs that are handled over
      the back of a hard IRQ will run from the IRQ stack as well. However, any
      synchronous softirq processing that happens when re-enabling softIRQs
      from task context will still execute on that task's stack.
      
      Since any call to local_bh_enable() at any level in the task's call
      stack may trigger a softIRQ processing run, which could potentially
      cause a task stack overflow if the combined stack footprints exceed the
      stack's size, let's run these synchronous invocations of do_softirq() on
      the IRQ stack as well.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Reviewed-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Linus Walleij <linus.walleij@linaro.org>
      Tested-by: Keith Packard <keithpac@amazon.com>
      Tested-by: Marc Zyngier <maz@kernel.org>
      Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
    • ARM: call_with_stack: add unwind support · 0b78f2e9
      Ard Biesheuvel authored
      Restructure the code and add the unwind annotations so that both the
      frame pointer unwinder as well as the EHABI unwind info based unwinder
      will be able to follow the call stack through call_with_stack().
      
      Since GCC and Clang use different formats for the stack frame, two
      methods are implemented: a GCC version that pushes fp, sp, lr and pc for
      compatibility with the frame pointer unwinder, and a second version that
      works with Clang, as well as with the EHABI unwinder both in ARM and
      Thumb2 modes.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Acked-by: Linus Walleij <linus.walleij@linaro.org>
      Tested-by: Keith Packard <keithpac@amazon.com>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
      Tested-by: Marc Zyngier <maz@kernel.org>
      Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
    • ARM: implement IRQ stacks · d4664b6c
      Ard Biesheuvel authored
      Now that we no longer rely on the stack pointer to access the current
      task struct or thread info, we can implement support for IRQ stacks
      cleanly as well.
      
      Define a per-CPU IRQ stack and switch to this stack when taking an IRQ,
      provided that we were not already using that stack in the interrupted
      context. This is never the case for IRQs taken from user space, but ones
      taken while running in the kernel could fire while one taken from user
      space has not completed yet.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Acked-by: Linus Walleij <linus.walleij@linaro.org>
      Tested-by: Keith Packard <keithpac@amazon.com>
      Acked-by: Nick Desaulniers <ndesaulniers@google.com>
      Tested-by: Marc Zyngier <maz@kernel.org>
      Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
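      The "switch only if we are not already on the IRQ stack" rule from
      the commit above could be expressed like this in C. The stack size
      and the function names are assumptions for the sketch; the real
      switch is done in entry assembly.

```c
#include <stdbool.h>
#include <stdint.h>

#define IRQ_STACK_SIZE UINT32_C(0x2000) /* assumed size, for illustration */

/* True if sp lies inside [irq_stack_base, irq_stack_base + IRQ_STACK_SIZE). */
static bool on_irq_stack(uint32_t sp, uint32_t irq_stack_base)
{
	/* Unsigned subtraction wraps for sp < base, so one compare covers both bounds. */
	return (uint32_t)(sp - irq_stack_base) < IRQ_STACK_SIZE;
}

/* Returns the stack pointer the IRQ handler should run with. */
static uint32_t pick_irq_sp(uint32_t sp, uint32_t irq_stack_base)
{
	if (on_irq_stack(sp, irq_stack_base))
		return sp; /* nested IRQ: keep using the current position */
	return irq_stack_base + IRQ_STACK_SIZE; /* top of the empty descending stack */
}
```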
    • ARM: backtrace-clang: avoid crash on bogus frame pointer · eae9523f
      Ard Biesheuvel authored
      The Clang backtrace code dereferences the link register value pulled
      from the stack to decide whether the caller was a branch-and-link
      instruction, in order to subsequently decode the offset to find the
      start of the calling function. Unlike other loads in this routine, this
      one is not protected by a fixup, and may therefore cause a crash if the
      address in question is bogus.
      
      So let's fix this, by treating the fault as a failure to decode the 'bl'
      instruction. To avoid a label renum, reuse a fixup label that guards an
      instruction that cannot fault to begin with.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
      Tested-by: Marc Zyngier <maz@kernel.org>
      Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
    • ARM: unwind: dump exception stack from calling frame · 4ab68270
      Ard Biesheuvel authored
      The existing code that dumps the contents of the pt_regs structure
      passed to __entry routines does so while unwinding the callee frame, and
      dereferences the stack pointer as a struct pt_regs*. This will no longer
      work when we enable support for IRQ or overflow stacks, because the
      struct pt_regs may live on the task stack, while we are executing from
      another stack.
      
      The unwinder has access to this information, but only while unwinding
      the calling frame. So let's combine the exception stack dumping code
      with the handling of the calling frame as well. By printing it before
      dumping the caller/callee addresses, the output order is preserved.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Reviewed-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Linus Walleij <linus.walleij@linaro.org>
      Tested-by: Keith Packard <keithpac@amazon.com>
      Tested-by: Marc Zyngier <maz@kernel.org>
      Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
    • ARM: export dump_mem() to other objects · 8cdfdf7f
      Ard Biesheuvel authored
      The unwind info based stack unwinder will make its own call to
      dump_mem() to dump the exception stack, so give it external linkage.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Reviewed-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Linus Walleij <linus.walleij@linaro.org>
      Tested-by: Keith Packard <keithpac@amazon.com>
      Tested-by: Marc Zyngier <maz@kernel.org>
      Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
    • ARM: unwind: support unwinding across multiple stacks · b6506981
      Ard Biesheuvel authored
      Implement support in the unwinder for dealing with multiple stacks.
      This will be needed once we add support for IRQ stacks, or for the
      overflow stack used by the vmap'ed stacks code.
      
      This involves tracking the unwind opcodes that either update the virtual
      stack pointer from another virtual register, or perform an explicit
      subtract on the virtual stack pointer, and updating the low and high
      bounds that we use to sanitize the stack pointer accordingly.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Reviewed-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Linus Walleij <linus.walleij@linaro.org>
      Tested-by: Keith Packard <keithpac@amazon.com>
      Tested-by: Marc Zyngier <maz@kernel.org>
      Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
    • ARM: assembler: introduce bl_r macro · b3ab60b1
      Ard Biesheuvel authored
      Add a bl_r macro that abstracts the difference between the ways indirect
      calls are performed on older and newer ARM architecture revisions.
      
      The main difference is to prefer blx instructions over explicit LR
      assignments when possible, as these tend to confuse the prediction logic
      in out-of-order cores when speculating across a function return.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Reviewed-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Linus Walleij <linus.walleij@linaro.org>
      Tested-by: Keith Packard <keithpac@amazon.com>
      Tested-by: Marc Zyngier <maz@kernel.org>
      Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
    • ARM: remove some dead code · 08572cd4
      Ard Biesheuvel authored
      This code appears to be no longer used so let's get rid of it.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Reviewed-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Linus Walleij <linus.walleij@linaro.org>
      Tested-by: Keith Packard <keithpac@amazon.com>
      Tested-by: Marc Zyngier <maz@kernel.org>
      Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
    • ARM: stackprotector: prefer compiler for TLS based per-task protector · f05eb1d2
      Ard Biesheuvel authored
      Currently, we implement the per-task stack protector for ARM using a GCC
      plugin, due to lack of native compiler support. However, work is
      underway to get this implemented in the compiler, which means we will be
      able to deprecate the GCC plugin at some point.
      
      In the meantime, we will need to support both, where the native compiler
      implementation is obviously preferred. So let's wire this up in Kconfig
      and the Makefile.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Tested-by: Marc Zyngier <maz@kernel.org>
      Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
    • ARM: decompressor: disable stack protector · 672513bf
      Ard Biesheuvel authored
      Enabling the stack protector in the decompressor is of dubious value,
      given that it uses a fixed value for the canary, cannot print any output
      unless CONFIG_DEBUG_LL is enabled (which relies on board specific build
      time settings), and is already disabled for a good chunk of the code
      (libfdt).
      
      So let's just disable it in the decompressor. This will make it easier
      in the future to manage the command line options that would need to be
      removed again in this context for the TLS register based stack
      protector.
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
  3. 14 Nov, 2021 15 commits
    • Linux 5.16-rc1 · fa55b7dc
      Linus Torvalds authored
    • kconfig: Add support for -Wimplicit-fallthrough · dee2b702
      Gustavo A. R. Silva authored
      Add Kconfig support for -Wimplicit-fallthrough for both GCC and Clang.
      
      The compiler option is under configuration CC_IMPLICIT_FALLTHROUGH,
      which is enabled by default.
      
      Special thanks to Nathan Chancellor who fixed the Clang bug[1][2]. This
      bugfix only appears in Clang 14.0.0, so older versions still contain
      the bug and -Wimplicit-fallthrough won't be enabled for them, for now.
      
      This concludes a long journey and now we are finally getting rid
      of the unintentional fallthrough bug-class in the kernel, entirely. :)
      
      Link: https://github.com/llvm/llvm-project/commit/9ed4a94d6451046a51ef393cd62f00710820a7e8 [1]
      Link: https://bugs.llvm.org/show_bug.cgi?id=51094 [2]
      Link: https://github.com/KSPP/linux/issues/115
      Link: https://github.com/ClangBuiltLinux/linux/issues/236
      Co-developed-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Co-developed-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Reviewed-by: Nathan Chancellor <nathan@kernel.org>
      Tested-by: Nathan Chancellor <nathan@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
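      For readers unfamiliar with the warning: with
      -Wimplicit-fallthrough, a case that runs into the next one must
      say so explicitly. The sketch below uses a fallthrough macro built
      on the compiler's fallthrough statement attribute, similar in
      spirit to the kernel's pseudo-keyword. The classify() function and
      its flag values are invented for the example.

```c
/* Use the statement attribute where the compiler offers it. */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do { } while (0) /* fallthrough */
#endif

/* Hypothetical example: higher levels accumulate the flags of lower ones. */
static int classify(int level)
{
	int flags = 0;

	switch (level) {
	case 2:
		flags |= 4;
		fallthrough; /* intentional: level 2 implies level 1 */
	case 1:
		flags |= 2;
		fallthrough; /* intentional: level 1 implies level 0 */
	case 0:
		flags |= 1;
		break;
	default:
		flags = -1;
		break;
	}
	return flags;
}
```

      Removing either fallthrough statement would make the compiler warn
      when the option is enabled, which is how unintentional cases get
      caught.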
    • Merge tag 'xfs-5.16-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · ce49bfc8
      Linus Torvalds authored
      Pull xfs cleanups from Darrick Wong:
       "The most 'exciting' aspect of this branch is that the xfsprogs
        maintainer and I have worked through the last of the code
        discrepancies between kernel and userspace libxfs such that there are
        no code differences between the two except for #includes.
      
        IOWs, diff suffices to demonstrate that the userspace tools behave the
        same as the kernel, and kernel-only bits are clearly marked in the
        /kernel/ source code instead of just the userspace source.
      
        Summary:
      
         - Clean up open-coded swap() calls.
      
         - A little bit of #ifdef golf to complete the reunification of the
           kernel and userspace libxfs source code"
      
      * tag 'xfs-5.16-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: sync xfs_btree_split macros with userspace libxfs
        xfs: #ifdef out perag code for userspace
        xfs: use swap() to make dabtree code cleaner
      ce49bfc8
    • Merge tag 'for-5.16/parisc-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · c3b68c27
      Linus Torvalds authored
      Pull more parisc fixes from Helge Deller:
       "Fix a build error in stacktrace.c, fix resolving of addresses to
        function names in backtraces, fix single-stepping in assembly code and
        flush userspace pte's when using set_pte_at()"
      
      * tag 'for-5.16/parisc-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
        parisc/entry: fix trace test in syscall exit path
        parisc: Flush kernel data mapping in set_pte_at() when installing pte for user page
        parisc: Fix implicit declaration of function '__kernel_text_address'
        parisc: Fix backtrace to always include init funtion names
      c3b68c27
    • Merge tag 'sh-for-5.16' of git://git.libc.org/linux-sh · 24318ae8
      Linus Torvalds authored
      Pull arch/sh updates from Rich Felker.
      
      * tag 'sh-for-5.16' of git://git.libc.org/linux-sh:
        sh: pgtable-3level: Fix cast to pointer from integer of different size
        sh: fix READ/WRITE redefinition warnings
        sh: define __BIG_ENDIAN for math-emu
        sh: math-emu: drop unused functions
        sh: fix kconfig unmet dependency warning for FRAME_POINTER
        sh: Cleanup about SPARSE_IRQ
        sh: kdump: add some attribute to function
        maple: fix wrong return value of maple_bus_init().
        sh: boot: avoid unneeded rebuilds under arch/sh/boot/compressed/
        sh: boot: add intermediate vmlinux.bin* to targets instead of extra-y
        sh: boards: Fix the cacography in irq.c
        sh: check return code of request_irq
        sh: fix trivial misannotations
      24318ae8
    • Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm · 6ea45c57
      Linus Torvalds authored
      Pull ARM fixes from Russell King:
      
       - Fix early_iounmap
      
       - Drop cc-option fallbacks for architecture selection
      
      * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
        ARM: 9156/1: drop cc-option fallbacks for architecture selection
        ARM: 9155/1: fix early early_iounmap()
      6ea45c57
    • Merge tag 'devicetree-fixes-for-5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · 0d1503d8
      Linus Torvalds authored
      Pull devicetree fixes from Rob Herring:
      
       - Two fixes due to DT node name changes on Arm, Ltd. boards
      
       - Treewide rename of Ingenic CGU headers
      
       - Update ST email addresses
      
       - Remove Netlogic DT bindings
      
       - Drop a few more cases of redundant 'maxItems' in schemas
      
       - Convert toshiba,tc358767 bridge binding to schema
      
      * tag 'devicetree-fixes-for-5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
        dt-bindings: watchdog: sunxi: fix error in schema
        bindings: media: venus: Drop redundant maxItems for power-domain-names
        dt-bindings: Remove Netlogic bindings
        clk: versatile: clk-icst: Ensure clock names are unique
        of: Support using 'mask' in making device bus id
        dt-bindings: treewide: Update @st.com email address to @foss.st.com
        dt-bindings: media: Update maintainers for st,stm32-hwspinlock.yaml
        dt-bindings: media: Update maintainers for st,stm32-cec.yaml
        dt-bindings: mfd: timers: Update maintainers for st,stm32-timers
        dt-bindings: timer: Update maintainers for st,stm32-timer
        dt-bindings: i2c: imx: hardware do not restrict clock-frequency to only 100 and 400 kHz
        dt-bindings: display: bridge: Convert toshiba,tc358767.txt to yaml
        dt-bindings: Rename Ingenic CGU headers to ingenic,*.h
      0d1503d8
    • Merge tag 'timers-urgent-2021-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 622c72b6
      Linus Torvalds authored
      Pull timer fix from Thomas Gleixner:
       "A single fix for POSIX CPU timers to address a problem where POSIX CPU
        timer delivery stops working for a new child task because
        copy_process() copies state information which is only valid for the
        parent task"
      
      * tag 'timers-urgent-2021-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        posix-cpu-timers: Clear task::posix_cputimers_work in copy_process()
      622c72b6
    • Merge tag 'irq-urgent-2021-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · c36e33e2
      Linus Torvalds authored
      Pull irq fixes from Thomas Gleixner:
       "A set of fixes for the interrupt subsystem
      
        Core code:
      
         - A regression fix for the Open Firmware interrupt mapping code where
           an interrupt-controller property in a node caused a map property in
           the same node to be ignored.
      
        Interrupt chip drivers:
      
         - Work around a limitation in the SiFive PLIC interrupt chip, which
           silently ignores an EOI when the interrupt line is masked.
      
         - Provide the missing mask/unmask implementation for the CSKY MP
           interrupt controller.
      
        PCI/MSI:
      
         - Prevent a use after free when PCI/MSI interrupts are released by
           destroying the sysfs entries before freeing the memory which is
           accessed in the sysfs show() function.
      
         - Implement a mask quirk for the Nvidia ION AHCI chip which does not
           advertise masking capability despite implementing it. Even worse
           the chip comes out of reset with all MSI entries masked, which due
           to the missing masking capability never get unmasked.
      
         - Move the check which prevents accessing the MSI[X] masking for XEN
           back into the low level accessors. The recent consolidation missed
           that these accessors can be invoked from places which do not have
           that check which broke XEN. Move them back to the original place
           instead of sprinkling tons of these checks all over the code"
      
      * tag 'irq-urgent-2021-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        of/irq: Don't ignore interrupt-controller when interrupt-map failed
        irqchip/sifive-plic: Fixup EOI failed when masked
        irqchip/csky-mpintc: Fixup mask/unmask implementation
        PCI/MSI: Destroy sysfs before freeing entries
        PCI: Add MSI masking quirk for Nvidia ION AHCI
        PCI/MSI: Deal with devices lying about their MSI mask capability
        PCI/MSI: Move non-mask check back into low level accessors
      c36e33e2
    • Merge tag 'locking-urgent-2021-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 218cc8b8
      Linus Torvalds authored
      Pull x86 static call update from Thomas Gleixner:
       "A single fix for static calls to make the trampoline patching more
        robust by placing explicit signature bytes after the call trampoline
        to prevent patching random other jumps like the CFI jump table
        entries"
      
      * tag 'locking-urgent-2021-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        static_call,x86: Robustify trampoline patching
      218cc8b8
    • Merge tag 'sched_urgent_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · fc661f2d
      Linus Torvalds authored
      Pull scheduler fixes from Borislav Petkov:
      
       - Avoid touching ~100 config files in order to be able to select the
         preemption model
      
       - Clear cluster CPU masks too, on the CPU unplug path

       - Prevent use-after-free in cfs
      
       - Prevent a race condition when updating CPU cache domains
      
       - Factor out common shared part of smp_prepare_cpus() into a common
         helper which can be called by both baremetal and Xen, in order to fix
         booting of Xen PV guests
      
      * tag 'sched_urgent_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        preempt: Restore preemption model selection configs
        arch_topology: Fix missing clear cluster_cpumask in remove_cpu_topology()
        sched/fair: Prevent dead task groups from regaining cfs_rq's
        sched/core: Mitigate race cpus_share_cache()/update_top_cache_domain()
        x86/smp: Factor out parts of native_smp_prepare_cpus()
      fc661f2d
    • Merge tag 'perf_urgent_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · f7018be2
      Linus Torvalds authored
      Pull perf fixes from Borislav Petkov:
      
       - Prevent unintentional page sharing by checking that a page
         reference to a PMU samples page has been acquired properly before
         dropping it
      
       - Make sure the LBR_SELECT MSR is saved/restored too
      
       - Reset the LBR_SELECT MSR when resetting the LBR PMU to clear any
         residual data left
      
      * tag 'perf_urgent_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/core: Avoid put_page() when GUP fails
        perf/x86/vlbr: Add c->flags to vlbr event constraints
        perf/x86/lbr: Reset LBR_SELECT during vlbr reset
      f7018be2
    • Merge tag 'x86_urgent_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 1654e95e
      Linus Torvalds authored
      Pull x86 fixes from Borislav Petkov:
      
       - Add the model number of the new Raptor Lake CPU to intel-family.h
      
       - Do not log spurious corrected MCEs on SKL too, due to an erratum
      
       - Clarify the path of paravirt ops patches upstream
      
       - Add an optimization to avoid writing out AMX components to sigframes
         when the former are in init state
      
      * tag 'x86_urgent_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/cpu: Add Raptor Lake to Intel family
        x86/mce: Add errata workaround for Skylake SKX37
        MAINTAINERS: Add some information to PARAVIRT_OPS entry
        x86/fpu: Optimize out sigframe xfeatures when in init state
      1654e95e
    • Merge tag 'perf-tools-for-v5.16-2021-11-13' of... · 35c8fad4
      Linus Torvalds authored
      Merge tag 'perf-tools-for-v5.16-2021-11-13' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
      
      Pull more perf tools updates from Arnaldo Carvalho de Melo:
       "Hardware tracing:
      
         - ARM:
            * Print the buffer size consistently in hexadecimal in ARM
              Coresight.
            * Add Coresight snapshot mode support.
            * Update --switch-events docs in 'perf record'.
            * Support hardware-based PID tracing.
            * Track task context switch for cpu-mode events.
      
         - Vendor events:
            * Add metric events JSON file for power10 platform
      
        perf test:
      
         - Get 'perf test' unit tests closer to kunit.
      
         - Topology tests improvements.
      
         - Remove bashisms from some tests.
      
        perf bench:
      
         - Fix memory leak of perf_cpu_map__new() in the futex benchmarks.
      
        libbpf:
      
         - Add some more weak libbpf functions to allow building with the
           older libbpf versions present in distros.
      
        libbeauty:
      
         - Translate [gs]etsockopt 'level' argument integer values to
           strings.
      
        tools headers UAPI:
      
         - Sync futex_waitv, arch prctl, sound, i915_drm and msr-index files
           with the kernel sources.
      
        Documentation:
      
         - Add documentation to 'struct symbol'.
      
         - Synchronize the definition of enum perf_hw_id with code in
           tools/perf/design.txt"
      
      * tag 'perf-tools-for-v5.16-2021-11-13' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (67 commits)
        perf tests: Remove bash constructs from stat_all_pmu.sh
        perf tests: Remove bash construct from record+zstd_comp_decomp.sh
        perf test: Remove bash construct from stat_bpf_counters.sh test
        perf bench futex: Fix memory leak of perf_cpu_map__new()
        tools arch x86: Sync the msr-index.h copy with the kernel sources
        tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
        tools headers UAPI: Sync sound/asound.h with the kernel sources
        tools headers UAPI: Sync linux/prctl.h with the kernel sources
        tools headers UAPI: Sync arch prctl headers with the kernel sources
        perf tools: Add more weak libbpf functions
        perf bpf: Avoid memory leak from perf_env__insert_btf()
        perf symbols: Factor out annotation init/exit
        perf symbols: Bit pack to save a byte
        perf symbols: Add documentation to 'struct symbol'
        tools headers UAPI: Sync files changed by new futex_waitv syscall
        perf test bpf: Use ARRAY_CHECK() instead of ad-hoc equivalent, addressing array_size.cocci warning
        perf arm-spe: Support hardware-based PID tracing
        perf arm-spe: Save context ID in record
        perf arm-spe: Update --switch-events docs in 'perf record'
        perf arm-spe: Track task context switch for cpu-mode events
        ...
      35c8fad4
    • Merge tag 'irqchip-fixes-5.16-1' of... · 979292af
      Thomas Gleixner authored
      Merge tag 'irqchip-fixes-5.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent
      
      Pull irqchip fixes from Marc Zyngier:
      
        - Address an issue with the SiFive PLIC being unable to EOI
          a masked interrupt
      
        - Move the disable/enable methods in the CSky mpintc to
          mask/unmask
      
        - Fix a regression in the OF irq code where an interrupt-controller
          property in the same node as an interrupt-map property would get
          ignored
      
      Link: https://lore.kernel.org/all/20211112173459.4015233-1-maz@kernel.org
      979292af