- 01 Dec, 2020 1 commit
-
-
Nathan Chancellor authored
Currently, '--orphan-handling=warn' is spread out across four different architectures in their respective Makefiles, which makes it a little unruly to deal with in case it needs to be disabled for a specific linker version (in this case, ld.lld 10.0.1). To make it easier to control this, hoist this warning into Kconfig and the main Makefile so that disabling it is simpler, as the warning will only be enabled in a couple of places (the main Makefile and a couple of compressed boot folders that blow away LDFLAGS_vmlinux) and making it conditional is easier due to Kconfig syntax. One small additional benefit of this is saving a call to ld-option on incremental builds because we will have already evaluated it for CONFIG_LD_ORPHAN_WARN. To keep the list of supported architectures the same, introduce CONFIG_ARCH_WANT_LD_ORPHAN_WARN, which an architecture can select to gain this automatically after all of the sections are specified and size asserted. A special thanks to Kees Cook for the help text on this config.

Link: https://github.com/ClangBuiltLinux/linux/issues/1187
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
-
- 08 Oct, 2020 1 commit
-
-
YiFei Zhu authored
In order to make adding configurable features into seccomp easier, it's better to have the options at one single location, especially considering that the bulk of seccomp code is arch-independent. A quick look also shows that many SECCOMP descriptions are outdated; they talk about /proc rather than prctl. As a result of moving the config option and keeping it default on: architectures arm, arm64, csky, riscv, sh, and xtensa did not have SECCOMP on by default prior to this, and SECCOMP will be on by default for them after this change. Architectures microblaze, mips, powerpc, s390, sh, and sparc had an outdated dependency on PROC_FS; this dependency is removed in this change.

Suggested-by: Jann Horn <jannh@google.com>
Link: https://lore.kernel.org/lkml/CAG48ez1YWz9cnp08UZgeieYRhHdqh-ch7aNwc4JRBnGyrmgfMg@mail.gmail.com/
Signed-off-by: YiFei Zhu <yifeifz2@illinois.edu>
[kees: added HAVE_ARCH_SECCOMP help text, tweaked wording]
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/9ede6ef35c847e58d61e476c6a39540520066613.1600951211.git.yifeifz2@illinois.edu
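[Editor's note] As the message says, the modern control interface is prctl() rather than /proc. A minimal, self-contained userspace sketch of installing a trivial allow-everything filter via prctl() (for illustration only, not part of the patch):

    #include <stdio.h>
    #include <sys/prctl.h>
    #include <linux/filter.h>
    #include <linux/seccomp.h>

    int main(void)
    {
        /* A trivial BPF filter that allows every syscall. */
        struct sock_filter insns[] = {
            BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
        };
        struct sock_fprog prog = {
            .len = sizeof(insns) / sizeof(insns[0]),
            .filter = insns,
        };

        /* Required so an unprivileged process may install a filter. */
        if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
            perror("PR_SET_NO_NEW_PRIVS");
            return 1;
        }
        if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog)) {
            perror("PR_SET_SECCOMP");
            return 1;
        }
        printf("seccomp filter installed\n");
        return 0;
    }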
-
- 06 Oct, 2020 3 commits
-
-
Andrew Donnellan authored
A number of userspace utilities depend on making calls to RTAS to retrieve information and update various things. The existing API through which we expose RTAS to userspace exposes more RTAS functionality than we actually need, through the sys_rtas syscall, which allows root (or anyone with CAP_SYS_ADMIN) to make any RTAS call they want with arbitrary arguments. Many RTAS calls take the address of a buffer as an argument, and it's up to the caller to specify the physical address of that buffer. We allocate a buffer (the "RMO buffer") in the Real Memory Area that RTAS can access, and then expose the physical address and size of this buffer in /proc/powerpc/rtas/rmo_buffer. Userspace is expected to read this address, poke at the buffer using /dev/mem, and pass an address in the RMO buffer to the RTAS call. However, there's nothing stopping the caller from specifying whatever address they want in the RTAS call, and it's easy to construct a series of RTAS calls that can overwrite arbitrary bytes (even without /dev/mem access). Additionally, there are some RTAS calls that do potentially dangerous things and for which there are no legitimate userspace use cases. In the past, this would not have been a particularly big deal, as it was assumed that root could modify all system state freely, but with Secure Boot and lockdown we need to care about this. We can't fundamentally change the ABI at this point; however, we can address this by implementing a filter that checks RTAS calls against a list of permitted calls and forces the caller to use addresses within the RMO buffer. The list is based on the list of calls that are used by the librtas userspace library, and has been tested with a number of existing userspace RTAS utilities. For compatibility with any applications we are not aware of that require other calls, the filter can be turned off at build time.

Cc: stable@vger.kernel.org
Reported-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200820044512.7543-1-ajd@linux.ibm.com
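[Editor's note] The filtering approach described above amounts to an allow-list plus a bounds check on buffer arguments. A simplified, hypothetical C sketch of the idea — struct layout, field names, example entries, and helpers are all illustrative, not the kernel's actual implementation:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* Illustrative only: one entry per permitted RTAS call; buf_idx marks
     * which argument (if any) must point into the RMO buffer. */
    struct rtas_call_filter {
        const char *name;
        int buf_idx;    /* index of the buffer argument, -1 if none */
    };

    static const struct rtas_call_filter filters[] = {
        { "get-time-of-day",            -1 },
        { "ibm,get-system-parameter",    1 },
    };

    /* RMO buffer bounds, as exposed via /proc/powerpc/rtas/rmo_buffer. */
    static bool in_rmo_buf(uint32_t addr, uint32_t rmo_base, uint32_t rmo_size)
    {
        return addr >= rmo_base && addr < rmo_base + rmo_size;
    }

    static bool rtas_call_allowed(const char *name, const uint32_t *args,
                                  uint32_t rmo_base, uint32_t rmo_size)
    {
        for (size_t i = 0; i < sizeof(filters) / sizeof(filters[0]); i++) {
            if (strcmp(filters[i].name, name))
                continue;
            if (filters[i].buf_idx < 0)
                return true;
            /* Force buffer arguments to stay inside the RMO buffer. */
            return in_rmo_buf(args[filters[i].buf_idx], rmo_base, rmo_size);
        }
        return false;   /* not on the allow-list: reject */
    }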
-
Daniel Axtens authored
In commit 61f879d9 ("powerpc/pseries: Detect secure and trusted boot state of the system.") we taught the kernel how to understand the secure-boot parameters used by a pseries guest. However, CONFIG_PPC_SECURE_BOOT still requires PowerNV. I didn't catch this because pseries_le_defconfig includes support for PowerNV and so everything still worked. Indeed, most configs will. Nonetheless, technically PPC_SECURE_BOOT doesn't require PowerNV any more. The secure variables support (PPC_SECVAR_SYSFS) doesn't do anything on pSeries yet, but I don't think it's worth adding a new condition - at some stage we'll want to add a backend for pSeries anyway.

Fixes: 61f879d9 ("powerpc/pseries: Detect secure and trusted boot state of the system.")
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200924014922.172914-1-dja@axtens.net
-
Dan Williams authored
In reaction to a proposal to introduce a memcpy_mcsafe_fast() implementation, Linus points out that memcpy_mcsafe() is poorly named relative to communicating the scope of the interface: specifically, what addresses are valid to pass as source and destination, and what faults / exceptions are handled. Of particular concern is that even though x86 might be able to handle the semantics of copy_mc_to_user() with its common copy_user_generic() implementation, other archs likely need / want an explicit path for this case:

On Fri, May 1, 2020 at 11:28 AM Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> On Thu, Apr 30, 2020 at 6:21 PM Dan Williams <dan.j.williams@intel.com> wrote:
> >
> > However now I see that copy_user_generic() works for the wrong reason.
> > It works because the exception on the source address due to poison
> > looks no different than a write fault on the user address to the
> > caller, it's still just a short copy. So it makes copy_to_user() work
> > for the wrong reason relative to the name.
>
> Right.
>
> And it won't work that way on other architectures. On x86, we have a
> generic function that can take faults on either side, and we use it
> for both cases (and for the "in_user" case too), but that's an
> artifact of the architecture oddity.
>
> In fact, it's probably wrong even on x86 - because it can hide bugs -
> but writing those things is painful enough that everybody prefers
> having just one function.

Replace the single top-level memcpy_mcsafe() with either copy_mc_to_user() or copy_mc_to_kernel(). Introduce an x86 copy_mc_fragile() name as the rename for the low-level x86 implementation formerly named memcpy_mcsafe(). It is used as the slow / careful backend that is supplanted by a fast copy_mc_generic() in a follow-on patch. One side effect of this reorganization is that separating copy_mc_64.S into its own file means that perf no longer needs to track dependencies for its memcpy_64.S benchmarks.

[ bp: Massage a bit. ]

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: <stable@vger.kernel.org>
Link: http://lore.kernel.org/r/CAHk-=wjSqtXAqfUJxFtWNwmguFASTgB0dz1dT3V-78Quiezqbg@mail.gmail.com
Link: https://lkml.kernel.org/r/160195561680.2163339.11574962055305783722.stgit@dwillia2-desk3.amr.corp.intel.com
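[Editor's note] The renamed interface follows the copy_to_user() convention of returning the number of bytes left uncopied. A kernel-context sketch of a caller, showing the assumed usage pattern rather than code from the patch:

    #include <linux/printk.h>
    #include <linux/uaccess.h>

    /* Sketch: read from possibly-poisoned persistent memory into a
     * kernel buffer, tolerating a machine-check-shortened copy. */
    static size_t read_pmem_buf(void *dst, const void *pmem_src, size_t len)
    {
        /* Non-zero return = bytes NOT copied (e.g. poison consumed). */
        unsigned long rem = copy_mc_to_kernel(dst, pmem_src, len);

        if (rem)
            pr_warn("pmem read truncated: %lu of %zu bytes lost\n",
                    rem, len);
        return len - rem;   /* bytes actually copied */
    }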
-
- 28 Sep, 2020 1 commit
-
-
Thomas Gleixner authored
The unconditional selection of PCI_MSI_ARCH_FALLBACKS has an unmet dependency because PCI_MSI_ARCH_FALLBACKS is defined in an 'if PCI' clause. As it is only relevant when PCI_MSI is enabled, update the affected architecture Kconfigs to make the selection of PCI_MSI_ARCH_FALLBACKS depend on 'if PCI_MSI'.

Fixes: 077ee78e ("PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable")
Reported-by: Qian Cai <cai@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/cdfd63305caa57785b0925dd24c0711ea02c8527.camel@redhat.com
-
- 16 Sep, 2020 2 commits
-
-
Thomas Gleixner authored
The arch_.*_msi_irq[s] fallbacks are compiled in whether an architecture requires them or not. Architectures which are fully utilizing hierarchical irq domains should never call into that code. It's not only architectures which depend on that by implementing one or more of the weak functions; there is also a bunch of drivers which rely on the weak functions that invoke msi_controller::setup_irq[s] and msi_controller::teardown_irq. Make the architectures and drivers which rely on them select them in Kconfig and, if not selected, replace them by stub functions which emit a warning and fail the PCI/MSI interrupt allocation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20200826112333.992429909@linutronix.de
-
Nicholas Piggin authored
powerpc uses IPIs in some situations to switch a kernel thread away from a lazy tlb mm, which is subject to the TLB flushing race described in the changelog introducing ARCH_WANT_IRQS_OFF_ACTIVATE_MM.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200914045219.3736466-3-npiggin@gmail.com
-
- 15 Sep, 2020 1 commit
-
-
Aneesh Kumar K.V authored
Implement the page mapping percpu first chunk allocator as a fallback to the embedding allocator. With 4K hash translation we limit our page table range to 64TB, and commit 0034d395 ("powerpc/mm/hash64: Map all the kernel regions in the same 0xc range") moved all kernel mappings to that 64TB range. In order to support a sparse memory layout we need to increase our linear mapping space and reduce other mappings. With such a layout the percpu embedded first chunk allocator will fail because of the small vmalloc range. Add a fallback to the page mapping percpu first chunk allocator for such failures. The below dmesg output can be observed in such a case:

percpu: max_distance=0x1ffffef00000 too large for vmalloc space 0x10000000000
PERCPU: auto allocator failed (-22), falling back to page size
percpu: 40 4K pages/cpu s148816 r0 d15024

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200608070904.387440-2-aneesh.kumar@linux.ibm.com
-
- 09 Sep, 2020 2 commits
-
-
Christoph Hellwig authored
Stop providing the possibility to override the address space using set_fs() now that there is no need for that any more.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Christoph Hellwig authored
Add a CONFIG_SET_FS option that is selected by architectures that implement set_fs, which is all of them initially. If the option is not set, stubs for routines related to overriding the address space are provided so that architectures can start to opt out of providing set_fs.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
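[Editor's note] The opt-out works via the classic Kconfig-guarded stub pattern; a simplified sketch of the idea — the exact stub set in the patch differs, so treat the names here as illustrative:

    /* In a shared header: real definitions only when the arch selects
     * CONFIG_SET_FS; otherwise harmless stand-ins so common code builds. */
    #ifdef CONFIG_SET_FS
    /* The arch provides mm_segment_t, get_fs()/set_fs(), uaccess_kernel(). */
    #else
    typedef struct {
        /* empty dummy: no address-space override exists on this arch */
    } mm_segment_t;

    /* Without set_fs(), user accessors can never be aimed at kernel space. */
    #define uaccess_kernel() (false)
    #endif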
-
- 02 Sep, 2020 1 commit
-
-
Aneesh Kumar K.V authored
The test is broken w.r.t. page table update rules and results in a kernel crash as below. Disable the support until we get the tests updated.

[ 21.083519] kernel BUG at arch/powerpc/mm/pgtable.c:304!
cpu 0x0: Vector: 700 (Program Check) at [c000000c6d1e76c0]
    pc: c00000000009a5ec: assert_pte_locked+0x14c/0x380
    lr: c0000000005eeeec: pte_update+0x11c/0x190
    sp: c000000c6d1e7950
   msr: 8000000002029033
  current = 0xc000000c6d172c80
  paca    = 0xc000000003ba0000  irqmask: 0x03  irq_happened: 0x01
  pid = 1, comm = swapper/0
kernel BUG at arch/powerpc/mm/pgtable.c:304!
[link register   ] c0000000005eeeec pte_update+0x11c/0x190
[c000000c6d1e7950] 0000000000000001 (unreliable)
[c000000c6d1e79b0] c0000000005eee14 pte_update+0x44/0x190
[c000000c6d1e7a10] c000000001a2ca9c pte_advanced_tests+0x160/0x3d8
[c000000c6d1e7ab0] c000000001a2d4fc debug_vm_pgtable+0x7e8/0x1338
[c000000c6d1e7ba0] c0000000000116ec do_one_initcall+0xac/0x5f0
[c000000c6d1e7c80] c0000000019e4fac kernel_init_freeable+0x4dc/0x5a4
[c000000c6d1e7db0] c000000000012474 kernel_init+0x24/0x160
[c000000c6d1e7e20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c

With DEBUG_VM disabled:

[ 20.530152] BUG: Kernel NULL pointer dereference on read at 0x00000000
[ 20.530183] Faulting instruction address: 0xc0000000000df330
cpu 0x33: Vector: 380 (Data SLB Access) at [c000000c6d19f700]
    pc: c0000000000df330: memset+0x68/0x104
    lr: c00000000009f6d8: hash__pmdp_huge_get_and_clear+0xe8/0x1b0
    sp: c000000c6d19f990
   msr: 8000000002009033
   dar: 0
  current = 0xc000000c6d177480
  paca    = 0xc00000001ec4f400  irqmask: 0x03  irq_happened: 0x01
  pid = 1, comm = swapper/0
[link register   ] c00000000009f6d8 hash__pmdp_huge_get_and_clear+0xe8/0x1b0
[c000000c6d19f990] c00000000009f748 hash__pmdp_huge_get_and_clear+0x158/0x1b0 (unreliable)
[c000000c6d19fa10] c0000000019ebf30 pmd_advanced_tests+0x1f0/0x378
[c000000c6d19fab0] c0000000019ed088 debug_vm_pgtable+0x79c/0x1244
[c000000c6d19fba0] c0000000000116ec do_one_initcall+0xac/0x5f0
[c000000c6d19fc80] c0000000019a4fac kernel_init_freeable+0x4dc/0x5a4
[c000000c6d19fdb0] c000000000012474 kernel_init+0x24/0x160
[c000000c6d19fe20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c
33:mon>

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200902040122.136414-1-aneesh.kumar@linux.ibm.com
-
- 24 Aug, 2020 1 commit
-
-
Shawn Anastasio authored
Since migration of guests using SAO to ISA 3.1 hosts may cause issues, disable PROT_SAO in LPARs by default and introduce a new Kconfig option PPC_PROT_SAO_LPAR to allow users to enable it if desired.

Signed-off-by: Shawn Anastasio <shawn@anastas.io>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200821185558.35561-3-shawn@anastas.io
-
- 26 Jul, 2020 2 commits
-
-
Christophe Leroy authored
When STRICT_KERNEL_RWX is set, we want to set the NX bit on vmalloc segments. But modules require exec. Use a dedicated segment for modules. There is not much space above the kernel, and we don't want to waste vmalloc space on alignment. Therefore, we take the segment before PAGE_OFFSET for modules.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/eb8faba9148b6cf17c696ba776b4e8ee2f6313bf.1593428200.git.christophe.leroy@csgroup.eu
-
Nicholas Piggin authored
These have shown significantly improved performance and fairness when spinlock contention is moderate to high on very large systems. With this series, including subsequent patches, on a 16-socket 1536-thread POWER9, a stress test such as same-file open/close from all CPUs gets big speedups: 11620 op/s aggregate with simple spinlocks vs 384158 op/s (33x faster), where the difference in throughput between the fastest and slowest thread goes from 7x to 1.4x. Thanks to the fast path being identical in terms of atomics and barriers (after a subsequent optimisation patch), single threaded performance is not changed (no measurable difference). On smaller systems, performance and fairness seem to be generally improved. Using dbench on tmpfs as a test (that starts to run into kernel spinlock contention), a 2-socket OpenPOWER POWER9 system was tested with bare metal and KVM guest configurations. Results can be found here: https://github.com/linuxppc/issues/issues/305#issuecomment-663487453

Observations are:
- Queued spinlocks are equal when contention is insignificant, as expected and as measured with microbenchmarks.
- When there is contention, on bare metal queued spinlocks have better throughput and max latency at all points.
- When virtualised, queued spinlocks are slightly worse approaching peak throughput, but significantly better throughput and max latency at all points beyond peak, until queued spinlock maximum latency rises when clients are 2x vCPUs.

The regressions haven't been analysed very well yet; there are a lot of things that can be tuned, particularly the paravirtualised locking, but the numbers already look like a good net win even on relatively small systems.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200724131423.1362108-4-npiggin@gmail.com
-
- 23 Jul, 2020 1 commit
-
-
Nicholas Piggin authored
powerpc return from interrupt and return from system call sequences are context synchronising.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200716013522.338318-1-npiggin@gmail.com
-
- 21 Jul, 2020 1 commit
-
-
Nicholas Piggin authored
The subpage_prot syscall was added for specialised system software (Lx86) that has been discontinued for about 7 years, and is not thought to be used elsewhere, so disable it by default.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200703011958.1166620-4-npiggin@gmail.com
-
- 19 Jul, 2020 2 commits
-
-
Christoph Hellwig authored
Use the DMA API bypass mechanism for direct window mappings. This uses common code and speeds up the direct mapping case by avoiding indirect calls when not using dma ops at all. It also fixes a problem where the sync_* methods were using the bypass check for DMA allocations, but those are part of the streaming ops. Note that this patch loses the DMA_ATTR_WEAK_ORDERING override, which has never been well defined and is only used by a few drivers, which IIRC never showed up in the typical Cell blade setups that are affected by the ordering workaround.

Fixes: efd176a0 ("powerpc/pseries/dma: Allow SWIOTLB")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
-
Christoph Hellwig authored
Avoid the overhead of the dma ops support for tiny builds that only use the direct mapping.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
-
- 04 Jul, 2020 1 commit
-
-
Christian Brauner authored
All architectures support copy_thread_tls() now, so remove the legacy copy_thread() function and the HAVE_COPY_THREAD_TLS config option. Everyone uses the same process creation calling convention based on copy_thread_tls() and struct kernel_clone_args. This will make it easier to maintain the core process creation code under kernel/, simplifies the callpaths, and makes them identical for all architectures.

Cc: linux-arch@vger.kernel.org
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Acked-by: Greentime Hu <green.hu@gmail.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
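[Editor's note] For reference, the single remaining per-arch hook has roughly this shape (signature reproduced from memory of this kernel era; treat it as illustrative):

    /* Each architecture implements this one hook, covering fork/clone
     * with a TLS argument as well as kernel threads. */
    int copy_thread_tls(unsigned long clone_flags, unsigned long usp,
                        unsigned long kthread_arg, struct task_struct *p,
                        unsigned long tls);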
-
- 22 Jun, 2020 1 commit
-
-
Chris Packham authored
Since commit cbe46bd4 ("powerpc: remove CONFIG_CMDLINE #ifdef mess") CONFIG_CMDLINE has always had a value regardless of CONFIG_CMDLINE_BOOL. For example:

$ make ARCH=powerpc defconfig
$ cat .config
# CONFIG_CMDLINE_BOOL is not set
CONFIG_CMDLINE=""

When enabling CONFIG_CMDLINE_BOOL this value is kept, making the 'default "..." if CONFIG_CMDLINE_BOOL' ineffective.

$ ./scripts/config --enable CONFIG_CMDLINE_BOOL
$ cat .config
CONFIG_CMDLINE_BOOL=y
CONFIG_CMDLINE=""

Remove CONFIG_CMDLINE_BOOL and the inaccessible default.

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200611224220.25066-2-chris.packham@alliedtelesis.co.nz
-
- 05 Jun, 2020 1 commit
-
-
Anshuman Khandual authored
This adds tests which will validate architecture page table helpers and other accessors in their compliance with expected generic MM semantics. This will help various architectures in validating changes to existing page table helpers or addition of new ones. This test covers basic page table entry transformations, including but not limited to old, young, dirty, clean, write, write protect, etc., at various levels, along with populating intermediate entries with the next page table page and validating them. Test page table pages are allocated from system memory with the required size and alignments. The mapped pfns at page table levels are derived from a real pfn representing a valid kernel text symbol. This test gets called via late_initcall(). This test gets built and run when CONFIG_DEBUG_VM_PGTABLE is selected. Any architecture willing to subscribe to this test will need to select ARCH_HAS_DEBUG_VM_PGTABLE. For now this is limited to arc, arm64, x86, s390 and powerpc platforms, where the test is known to build and run successfully. Going forward, other architectures too can subscribe to the test after fixing any build or runtime problems with their page table helpers. Folks interested in making sure that a given platform's page table helpers conform to expected generic MM semantics should enable the above config, which will just trigger this test during boot. Any non-conformity here will be reported as a warning which would need to be fixed. This test will help catch any changes to the agreed upon semantics expected from generic MM and enable platforms to accommodate it thereafter.

[anshuman.khandual@arm.com: v17]
Link: http://lkml.kernel.org/r/1587436495-22033-3-git-send-email-anshuman.khandual@arm.com
[anshuman.khandual@arm.com: v18]
Link: http://lkml.kernel.org/r/1588564865-31160-3-git-send-email-anshuman.khandual@arm.com
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
Tested-by: Christophe Leroy <christophe.leroy@c-s.fr> [ppc32]
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Link: http://lkml.kernel.org/r/1583919272-24178-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
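[Editor's note] The flavour of these checks is simple round-tripping of entry transformations. A simplified kernel-context sketch in the style of mm/debug_vm_pgtable.c (not the file's exact code):

    #include <linux/mm.h>

    /* Verify that PTE helper pairs invert each other as generic MM expects. */
    static void __init pte_basic_checks(unsigned long pfn, pgprot_t prot)
    {
        pte_t pte = pfn_pte(pfn, prot);

        WARN_ON(!pte_same(pte, pte));
        WARN_ON(!pte_young(pte_mkyoung(pte_mkold(pte))));
        WARN_ON(!pte_dirty(pte_mkdirty(pte_mkclean(pte))));
        WARN_ON(!pte_write(pte_mkwrite(pte_wrprotect(pte))));
        WARN_ON(pte_young(pte_mkold(pte_mkyoung(pte))));
        WARN_ON(pte_dirty(pte_mkclean(pte_mkdirty(pte))));
        WARN_ON(pte_write(pte_wrprotect(pte_mkwrite(pte))));
    }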
-
- 04 Jun, 2020 2 commits
-
-
Mike Rapoport authored
The memmap_init() function was made to iterate over memblock regions, and as a result the early_pfn_in_nid() function became obsolete. Since CONFIG_NODES_SPAN_OTHER_NODES is only used to pick a stub or a real implementation of early_pfn_in_nid(), it is also not needed anymore. Remove both early_pfn_in_nid() and CONFIG_NODES_SPAN_OTHER_NODES.

Co-developed-by: Hoan Tran <Hoan@os.amperecomputing.com>
Signed-off-by: Hoan Tran <Hoan@os.amperecomputing.com>
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Hoan Tran <hoan@os.amperecomputing.com> [arm64]
Cc: Baoquan He <bhe@redhat.com>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200412194859.12663-17-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Mike Rapoport authored
CONFIG_HAVE_MEMBLOCK_NODE_MAP is used to differentiate initialization of nodes and zones structures between the systems that have region-to-node mapping in memblock and those that don't. Currently all the NUMA architectures enable this option, and for the non-NUMA systems we can presume that all the memory belongs to node 0; therefore the compile time configuration option is not required. The remaining few architectures that use DISCONTIGMEM without NUMA are easily updated to use memblock_add_node() instead of memblock_add(), and thus have proper correspondence of memblock regions to NUMA nodes. Still, free_area_init_node() must have a backward compatible version because its semantics with and without CONFIG_HAVE_MEMBLOCK_NODE_MAP are different. Once all architectures use the new semantics, the entire compatibility layer can be dropped. To avoid adding extra runtime memory to store the node id for architectures that keep memblock but have only a single node, the node id field of the memblock_region is guarded by CONFIG_NEED_MULTIPLE_NODES and the corresponding accessors presume that in those cases it is always 0.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Hoan Tran <hoan@os.amperecomputing.com> [arm64]
Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64]
Cc: Baoquan He <bhe@redhat.com>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200412194859.12663-4-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
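[Editor's note] For the few non-NUMA DISCONTIGMEM architectures, the conversion mentioned above is mechanical; a kernel-context sketch (node 0 chosen as the single-node example):

    #include <linux/init.h>
    #include <linux/memblock.h>

    static void __init register_ram(phys_addr_t base, phys_addr_t size)
    {
        /* Was: memblock_add(base, size);
         * Regions now record their node id explicitly, so a single-node
         * machine simply registers everything on node 0. */
        memblock_add_node(base, size, 0);
    }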
-
- 02 Jun, 2020 1 commit
-
-
Christophe Leroy authored
Mapping of the early shadow area is implemented by using a single static page table having all entries pointing to the same early shadow page. The shadow area must therefore occupy full PGD entries. The shadow area has a size of 128MB starting at 0xf8000000.

With 4k pages, a PGD entry is 4MB.
With 16k pages, a PGD entry is 64MB.
With 64k pages, a PGD entry is 1GB, which is too big.

Until we rework the early shadow mapping, disable KASAN when the page size is too big.

Fixes: 2edb16ef ("powerpc/32: Add KASAN support")
Cc: stable@vger.kernel.org # v5.2+
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/7195fcde7314ccbf7a081b356084a69d421b10d4.1590660977.git.christophe.leroy@csgroup.eu
-
- 28 May, 2020 1 commit
-
-
Petr Mladek authored
Commit 0ebeea8c ("bpf: Restrict bpf_probe_read{, str}() only to archs where they work") made the bpf_probe_read{, str}() functions no longer available on architectures where the same logical address might have different content in kernel and user memory mappings. These architectures should use the probe_read_{user,kernel}_str helpers. For backward compatibility, the problematic functions are still available on architectures where the user and kernel address spaces are not overlapping. This is indicated by CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE. At the moment, these backward compatible functions are enabled only on x86_64, arm, and arm64. Let's enable them also on powerpc, which has a non-overlapping address space as well.

Fixes: 0ebeea8c ("bpf: Restrict bpf_probe_read{, str}() only to archs where they work")
Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/lkml/20200527122844.19524-1-pmladek@suse.com
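[Editor's note] From a BPF program's perspective, the split looks like this. A hypothetical kprobe sketch, assuming libbpf headers and compiling with clang -target bpf -D__TARGET_ARCH_x86; the probed function and argument index are illustrative:

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_tracing.h>

    char LICENSE[] SEC("license") = "GPL";

    SEC("kprobe/do_sys_openat2")
    int trace_openat(struct pt_regs *ctx)
    {
        char path[64];
        /* Second argument of do_sys_openat2() is the user filename. */
        const char *uptr = (const char *)PT_REGS_PARM2(ctx);

        /* Explicitly user-space read; the legacy bpf_probe_read_str()
         * remains usable only where user/kernel spaces don't overlap. */
        bpf_probe_read_user_str(path, sizeof(path), uptr);
        return 0;
    }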
-
- 26 May, 2020 5 commits
-
-
Christophe Leroy authored
DEBUG_PAGEALLOC only manages RW data. Text and RO data can still be mapped with BATs. In order to map with BATs, also enforce data alignment. It defaults to 256M, which is a good compromise for also keeping enough BATs for KASAN and IMMR.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/fd29c1718ee44d82115d0e835ced808eb4ccbf51.1589866984.git.christophe.leroy@csgroup.eu
-
Christophe Leroy authored
DEBUG_PAGEALLOC only manages RW data. Text and RO data can still be mapped with hugepages and pinned TLB. In order to map with hugepages, also enforce a 512kB data alignment minimum. That's a trade-off between size and speed, taking into account that DEBUG_PAGEALLOC is a debug option. Anyway, the alignment is still tunable. We also allow tuning of alignment for book3s to limit the complexity of the test in Kconfig, which will anyway disappear in the following patches once DEBUG_PAGEALLOC is handled together with BATs.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c13256f2d356a316715da61fe089b3623ef217a5.1589866984.git.christophe.leroy@csgroup.eu
-
Christophe Leroy authored
Pinned TLBs are 8M. Now that there is no strict boundary anymore between text and RO data, it is possible to use 8M pinned executable TLB entries that cover both text and RO data. When PIN_TLB_DATA or PIN_TLB_TEXT is selected, enforce 8M RW data alignment and allow STRICT_KERNEL_RWX.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c535fc97bf0dd8693192e25feeed8088701e00c6.1589866984.git.christophe.leroy@csgroup.eu
-
Christophe Leroy authored
Similar to PPC64, allow mapping RO data as ROX as a trade-off between security and memory usage. Having RO data executable is not a high risk, as RO data can't be modified to forge an exploit.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/8c4a0d89d944eed984dd941e509614031a5ace2b.1589866984.git.christophe.leroy@csgroup.eu
-
Christophe Leroy authored
The PPC_PIN_TLB options are dedicated to the 8xx, so move them into the 8xx Kconfig. While we are at it, add some text to explain what they do.

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1ece39fac6312e1d14e6a67b3f9d9f9f91990a7b.1589866984.git.christophe.leroy@csgroup.eu
-
- 21 May, 2020 1 commit
-
-
Michael Ellerman authored
Several strange crashes have been eventually traced back to STRICT_KERNEL_RWX and its interaction with code patching. Various paths in our ftrace, kprobes and other patching code need to be hardened against patching failures, otherwise we can end up running with partially/incorrectly patched ftrace paths, kprobes or jump labels, which can then cause strange crashes. Although fixes for those are in development, they're not -rc material. There also seem to be problems with the underlying strict RWX logic, which needs further debugging. So for now disable STRICT_KERNEL_RWX on 64-bit to prevent people from enabling the option and tripping over the bugs.

Fixes: 1e0fc9d1 ("powerpc/Kconfig: Enable STRICT_KERNEL_RWX for some configs")
Cc: stable@vger.kernel.org # v4.13+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200520133605.972649-1-mpe@ellerman.id.au
-
- 11 May, 2020 1 commit
-
-
Christophe Leroy authored
When CONFIG_KASAN is selected, the stack usage is increased. In the same way as the x86 and arm64 architectures, increase THREAD_SHIFT when CONFIG_KASAN is selected.

Fixes: 2edb16ef ("powerpc/32: Add KASAN support")
Reported-by: <erhard_f@mailbox.org>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=207129
Link: https://lore.kernel.org/r/2c50f3b1c9bbaa4217c9a98f3044bd2a36c46a4f.1586361277.git.christophe.leroy@c-s.fr
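[Editor's note] The x86/arm64 approach referenced above simply adds to the stack-size shift when KASAN is enabled; an arm64-flavoured sketch of the pattern (constants illustrative):

    /* KASAN redzones and instrumentation deepen call stacks, so give
     * each task one extra power-of-two of stack. */
    #ifdef CONFIG_KASAN
    #define KASAN_THREAD_SHIFT  1
    #else
    #define KASAN_THREAD_SHIFT  0
    #endif

    #define THREAD_SHIFT        (14 + KASAN_THREAD_SHIFT)
    #define THREAD_SIZE         (1UL << THREAD_SHIFT)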
-
- 30 Apr, 2020 1 commit
-
-
Naveen N. Rao authored
Currently, it is possible to have CONFIG_FUNCTION_TRACER disabled, but CONFIG_MPROFILE_KERNEL enabled. Though all existing users of MPROFILE_KERNEL are doing the right thing, it is weird to have MPROFILE_KERNEL enabled when the function tracer isn't. Fix this by making MPROFILE_KERNEL depend on FUNCTION_TRACER.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200422092612.514301-1-naveen.n.rao@linux.vnet.ibm.com
-
- 02 Apr, 2020 1 commit
-
-
Michal Suchanek authored
On bigendian ppc64 it is common to have 32bit legacy binaries, but much less so on littleendian.

Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/41393d6e895b0d3a47ee62f8f51e1cf888ad6226.1584699455.git.msuchanek@suse.de
-
- 12 Mar, 2020 1 commit
-
-
Nayna Jain authored
Every time a new architecture defines the IMA architecture specific functions - arch_ima_get_secureboot() and arch_ima_get_policy() - the IMA include file needs to be updated. To avoid this "noise", this patch defines a new IMA Kconfig option, IMA_SECURE_AND_OR_TRUSTED_BOOT, allowing the different architectures to select it.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Nayna Jain <nayna@linux.ibm.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Philipp Rudo <prudo@linux.ibm.com> (s390)
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
-
- 21 Feb, 2020 1 commit
-
-
Dan Williams authored
The "sub-section memory hotplug" facility allows memremap_pages() users like libnvdimm to compensate for hardware platforms like x86 that have a section size larger than their hardware memory mapping granularity. The compensation that sub-section support affords is being tolerant of physical memory resources shifting by units smaller (64MiB on x86) than the memory-hotplug section size (128 MiB). Where the platform physical-memory mapping granularity is limited by the number and capability of address-decode-registers in the memory controller. While the sub-section support allows memremap_pages() to operate on sub-section (2MiB) granularity, the Power architecture may still require 16MiB alignment on "!radix_enabled()" platforms. In order for libnvdimm to be able to detect and manage this per-arch limitation, introduce memremap_compat_align() as a common minimum alignment across all driver-facing memory-mapping interfaces, and let Power override it to 16MiB in the "!radix_enabled()" case. The assumption / requirement for 16MiB to be a viable memremap_compat_align() value is that Power does not have platforms where its equivalent of address-decode-registers never hardware remaps a persistent memory resource on smaller than 16MiB boundaries. Note that I tried my best to not add a new Kconfig symbol, but header include entanglements defeated the #ifndef memremap_compat_align design pattern and the need to export it defeats the __weak design pattern for arch overrides. Based on an initial patch by Aneesh. Link: http://lore.kernel.org/r/CAPcyv4gBGNP95APYaBcsocEa50tQj9b5h__83vgngjq3ouGX_Q@mail.gmail.comReported-by:
Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reported-by:
Jeff Moyer <jmoyer@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Reviewed-by:
Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
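[Editor's note] A kernel-context sketch of how a consumer like libnvdimm might use the new interface — the usage pattern and the header are assumptions, not code from the patch:

    #include <linux/kernel.h>
    #include <linux/memremap.h>   /* assumed home of memremap_compat_align() */

    static int pmem_range_ok(resource_size_t start, resource_size_t size)
    {
        /* Common minimum alignment across driver-facing mapping APIs;
         * 16MiB on Power "!radix_enabled()" platforms. */
        unsigned long align = memremap_compat_align();

        if (!IS_ALIGNED(start, align) || !IS_ALIGNED(size, align))
            return -EINVAL;
        return 0;
    }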
-
- 14 Feb, 2020 1 commit
-
-
Frederic Weisbecker authored
A few archs (x86, arm, arm64) don't rely anymore on TIF_NOHZ to call into context tracking on user entry/exit, but instead use static keys (or not) to optimize those calls. Ideally every arch should migrate to that behaviour in the long run. Settle a config option to let those archs remove their TIF_NOHZ definitions.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paulburton@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: David S. Miller <davem@davemloft.net>
-
- 04 Feb, 2020 2 commits
-
-
Peter Zijlstra authored
Towards a more consistent naming scheme.

Link: http://lkml.kernel.org/r/20200116064531.483522-8-aneesh.kumar@linux.ibm.com
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Peter Zijlstra authored
Towards a more consistent naming scheme.

[akpm@linux-foundation.org: fix sparc64 Kconfig]
Link: http://lkml.kernel.org/r/20200116064531.483522-7-aneesh.kumar@linux.ibm.com
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-