- 12 Oct, 2020 1 commit
-
-
Petr Mladek authored
-
- 05 Oct, 2020 1 commit
-
-
Gustavo A. R. Silva authored
Replace /* FALL THRU */ comment with the new pseudo-keyword macro fallthrough[1]. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-throughSigned-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20201002224627.GA30475@embeddedor
-
- 30 Sep, 2020 2 commits
-
-
John Ogness authored
@setup_text_buf only copies the original text messages (without any prefix or extended text). It only needs to be LOG_LINE_MAX in size. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20200930090134.8723-3-john.ogness@linutronix.de
-
John Ogness authored
If a reader provides a buffer that is smaller than the message text, the @text_len field of @info will have a value larger than the buffer size. If readers blindly read @text_len bytes of data without checking the size, they will read beyond their buffer. Add this check to record_print_text() to properly recognize when such truncation has occurred. Add a maximum size argument to the ringbuffer function to extend records so that records can not be created that are larger than the buffer size of readers. When extending records (LOG_CONT), do not extend records beyond LOG_LINE_MAX since that is the maximum size available in the buffers used by consoles and syslog. Fixes: f5f022e5 ("printk: reimplement log_cont using record extension") Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20200930090134.8723-2-john.ogness@linutronix.de
-
- 22 Sep, 2020 3 commits
-
-
John Ogness authored
Since there is no code that will ever store anything into the dict ring, remove it. If any future dictionary properties are to be added, these should be added to the struct printk_info. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20200918223421.21621-4-john.ogness@linutronix.de
-
John Ogness authored
Dictionaries are only used for SUBSYSTEM and DEVICE properties. The current implementation stores the property names each time they are used. This requires more space than otherwise necessary. Also, because the dictionary entries are currently considered optional, it cannot be relied upon that they are always available, even if the writer wanted to store them. These issues will increase should new dictionary properties be introduced. Rather than storing the subsystem and device properties in the dict ring, introduce a struct dev_printk_info with separate fields to store only the property values. Embed this struct within the struct printk_info to provide guaranteed availability. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/87mu1jl6ne.fsf@jogness.linutronix.de
-
John Ogness authored
The majority of the size of a descriptor is taken up by meta data, which is often not of interest to the ringbuffer (for example, when performing state checks). Since descriptors are often temporarily stored on the stack, keeping their size minimal will help reduce stack pressure. Rather than embedding the printk_info into the descriptor, create a separate printk_info array. The index of a descriptor in the descriptor array corresponds to the printk_info with the same index in the printk_info array. The rules for validity of a printk_info match the existing rules for the data blocks: the descriptor must be in a consistent state. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20200918223421.21621-2-john.ogness@linutronix.de
-
- 15 Sep, 2020 9 commits
-
-
John Ogness authored
Use the record extending feature of the ringbuffer to implement continuous messages. This preserves the existing continuous message behavior. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20200914123354.832-7-john.ogness@linutronix.de
-
John Ogness authored
Add support for extending the newest data block. For this, introduce a new finalization state (desc_finalized) denoting a committed descriptor that cannot be extended. Until a record is finalized, a writer can reopen that record to append new data. Reopening a record means transitioning from the desc_committed state back to the desc_reserved state. A writer can explicitly finalize a record if there is no intention of extending it. Also, records are automatically finalized when a new record is reserved. This relieves writers of needing to explicitly finalize while also making such records available to readers sooner. (Readers can only traverse finalized records.) Four new memory barrier pairs are introduced. Two of them are insignificant additions (data_realloc:A/desc_read:D and data_realloc:A/data_push_tail:B) because they are alternate path memory barriers that exactly match the purpose, pairing, and context of the two existing memory barrier pairs they provide an alternate path for. The other two new memory barrier pairs are significant additions: desc_reopen_last:A / _prb_commit:B - When reopening a descriptor, ensure the state transitions back to desc_reserved before fully trusting the descriptor data. _prb_commit:B / desc_reserve:D - When committing a descriptor, ensure the state transitions to desc_committed before checking the head ID to see if the descriptor needs to be finalized. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20200914123354.832-6-john.ogness@linutronix.de
-
John Ogness authored
Rather than deriving the state by evaluating bits within the flags area of the state variable, assign the states explicit values and set those values in the flags area. Introduce macros to make it simple to read and write state values for the state variable. Although the functionality is preserved, the binary representation for the states is changed. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20200914123354.832-5-john.ogness@linutronix.de
-
John Ogness authored
prb_reserve() will set some meta data values and leave others uninitialized (or rather, containing the values of the previous wrap). Simplify the API by always clearing out all the fields. Only the sequence number is filled in. The caller is now responsible for filling in the rest of the meta data fields. In particular, for correctly filling in text and dict lengths. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20200914123354.832-4-john.ogness@linutronix.de
-
John Ogness authored
Rather than continually needing to explicitly check @begin and @next to identify a dataless block, introduce and use a BLK_DATALESS() macro. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20200914123354.832-3-john.ogness@linutronix.de
-
John Ogness authored
Move the internal get_data() function as-is above prb_reserve() so that a later change can make use of the static function. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20200914123354.832-2-john.ogness@linutronix.de
-
John Ogness authored
@state_var is copied as part of the descriptor copying via memcpy(). This is not allowed because @state_var is an atomic type, which in some implementations may contain a spinlock. Avoid using memcpy() with @state_var by explicitly copying the other fields of the descriptor. @state_var is set using atomic set operator before returning. Fixes: b6cf8b3f ("printk: add lockless ringbuffer") Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20200914094803.27365-2-john.ogness@linutronix.de
-
John Ogness authored
It is expected that desc_read() will always set at least the @state_var field. However, if the descriptor is in an inconsistent state, no fields are set. Also, the second load of @state_var is not stored in @desc_out and so might not match the state value that is returned. Always set the last loaded @state_var into @desc_out, regardless of the descriptor consistency. Fixes: b6cf8b3f ("printk: add lockless ringbuffer") Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20200914094803.27365-1-john.ogness@linutronix.de
-
Andy Shevchenko authored
The oops_in_progress is defined in printk.c, so it's logical to move oops_in_progress to printk.h. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20200911170202.8565-1-andriy.shevchenko@linux.intel.com
-
- 08 Sep, 2020 5 commits
-
-
John Ogness authored
With the introduction of the lockless printk ringbuffer, the data structure for the kernel log buffer was changed. Update the gdb scripts to be able to parse/print the new log buffer structure. Fixes: 896fbe20 ("printk: use the lockless ringbuffer") Signed-off-by: John Ogness <john.ogness@linutronix.de> Reported-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Petr Mladek <pmladek@suse.com> [akpm@linux-foundation.org: A typo fix.] Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20200814212525.6118-3-john.ogness@linutronix.de
-
John Ogness authored
Add a function for reading unsigned long values, which vary in size depending on the architecture. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20200814212525.6118-2-john.ogness@linutronix.de
-
John Ogness authored
With the introduction of the lockless printk ringbuffer, the VMCOREINFO relating to the kernel log buffer was changed. Update the documentation to match those changes. Fixes: 896fbe20 ("printk: use the lockless ringbuffer") Reported-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20200814213316.6394-1-john.ogness@linutronix.de
-
John Ogness authored
The .bss section for the h8300 is relatively small. A value of CONFIG_LOG_BUF_SHIFT that is larger than 19 will create a static printk ringbuffer that is too large. Limit the range appropriately for the H8300. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20200812073122.25412-1-john.ogness@linutronix.de
-
John Ogness authored
With commit 896fbe20 ("printk: use the lockless ringbuffer"), printk() started silently dropping messages without text because such records are not supported by the new printk ringbuffer. Add support for such records. Currently dataless records are denoted by INVALID_LPOS in order to recognize failed prb_reserve() calls. Change the ringbuffer to instead use two different identifiers (FAILED_LPOS and NO_LPOS) to distinguish between failed prb_reserve() records and successful dataless records, respectively. Fixes: 896fbe20 ("printk: use the lockless ringbuffer") Fixes: https://lkml.kernel.org/r/20200718121053.GA691245@elver.google.comReported-by: Marco Elver <elver@google.com> Signed-off-by: John Ogness <john.ogness@linutronix.de> Cc: Petr Mladek <pmladek@suse.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Marco Elver <elver@google.com> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20200721132528.9661-1-john.ogness@linutronix.de
-
- 18 Aug, 2020 1 commit
-
-
Sergey Senozhatsky authored
We have a number of "uart.port->desc.lock vs desc.lock->uart.port" lockdep reports coming from 8250 driver; this causes a bit of trouble to people, so let's fix it. The problem is reverse lock order in two different call paths: chain #1: serial8250_do_startup() spin_lock_irqsave(&port->lock); disable_irq_nosync(port->irq); raw_spin_lock_irqsave(&desc->lock) chain #2: __report_bad_irq() raw_spin_lock_irqsave(&desc->lock) for_each_action_of_desc() printk() spin_lock_irqsave(&port->lock); Fix this by changing the order of locks in serial8250_do_startup(): do disable_irq_nosync() first, which grabs desc->lock, and grab uart->port after that, so that chain #1 and chain #2 have same lock order. Full lockdep splat: ====================================================== WARNING: possible circular locking dependency detected 5.4.39 #55 Not tainted ====================================================== swapper/0/0 is trying to acquire lock: ffffffffab65b6c0 (console_owner){-...}, at: console_lock_spinning_enable+0x31/0x57 but task is already holding lock: ffff88810a8e34c0 (&irq_desc_lock_class){-.-.}, at: __report_bad_irq+0x5b/0xba which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&irq_desc_lock_class){-.-.}: _raw_spin_lock_irqsave+0x61/0x8d __irq_get_desc_lock+0x65/0x89 __disable_irq_nosync+0x3b/0x93 serial8250_do_startup+0x451/0x75c uart_startup+0x1b4/0x2ff uart_port_activate+0x73/0xa0 tty_port_open+0xae/0x10a uart_open+0x1b/0x26 tty_open+0x24d/0x3a0 chrdev_open+0xd5/0x1cc do_dentry_open+0x299/0x3c8 path_openat+0x434/0x1100 do_filp_open+0x9b/0x10a do_sys_open+0x15f/0x3d7 kernel_init_freeable+0x157/0x1dd kernel_init+0xe/0x105 ret_from_fork+0x27/0x50 -> #1 (&port_lock_key){-.-.}: _raw_spin_lock_irqsave+0x61/0x8d serial8250_console_write+0xa7/0x2a0 console_unlock+0x3b7/0x528 vprintk_emit+0x111/0x17f printk+0x59/0x73 register_console+0x336/0x3a4 uart_add_one_port+0x51b/0x5be serial8250_register_8250_port+0x454/0x55e dw8250_probe+0x4dc/0x5b9 platform_drv_probe+0x67/0x8b really_probe+0x14a/0x422 driver_probe_device+0x66/0x130 device_driver_attach+0x42/0x5b __driver_attach+0xca/0x139 bus_for_each_dev+0x97/0xc9 bus_add_driver+0x12b/0x228 driver_register+0x64/0xed do_one_initcall+0x20c/0x4a6 do_initcall_level+0xb5/0xc5 do_basic_setup+0x4c/0x58 kernel_init_freeable+0x13f/0x1dd kernel_init+0xe/0x105 ret_from_fork+0x27/0x50 -> #0 (console_owner){-...}: __lock_acquire+0x118d/0x2714 lock_acquire+0x203/0x258 console_lock_spinning_enable+0x51/0x57 console_unlock+0x25d/0x528 vprintk_emit+0x111/0x17f printk+0x59/0x73 __report_bad_irq+0xa3/0xba note_interrupt+0x19a/0x1d6 handle_irq_event_percpu+0x57/0x79 handle_irq_event+0x36/0x55 handle_fasteoi_irq+0xc2/0x18a do_IRQ+0xb3/0x157 ret_from_intr+0x0/0x1d cpuidle_enter_state+0x12f/0x1fd cpuidle_enter+0x2e/0x3d do_idle+0x1ce/0x2ce cpu_startup_entry+0x1d/0x1f start_kernel+0x406/0x46a secondary_startup_64+0xa4/0xb0 other info that might help us debug this: Chain exists of: console_owner --> &port_lock_key --> &irq_desc_lock_class Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&irq_desc_lock_class); lock(&port_lock_key); lock(&irq_desc_lock_class); lock(console_owner); *** DEADLOCK *** 2 locks held by swapper/0/0: #0: ffff88810a8e34c0 (&irq_desc_lock_class){-.-.}, at: __report_bad_irq+0x5b/0xba #1: ffffffffab65b5c0 (console_lock){+.+.}, at: console_trylock_spinning+0x20/0x181 stack backtrace: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.39 #55 Hardware name: XXXXXX Call Trace: <IRQ> dump_stack+0xbf/0x133 ? print_circular_bug+0xd6/0xe9 check_noncircular+0x1b9/0x1c3 __lock_acquire+0x118d/0x2714 lock_acquire+0x203/0x258 ? console_lock_spinning_enable+0x31/0x57 console_lock_spinning_enable+0x51/0x57 ? console_lock_spinning_enable+0x31/0x57 console_unlock+0x25d/0x528 ? console_trylock+0x18/0x4e vprintk_emit+0x111/0x17f ? lock_acquire+0x203/0x258 printk+0x59/0x73 __report_bad_irq+0xa3/0xba note_interrupt+0x19a/0x1d6 handle_irq_event_percpu+0x57/0x79 handle_irq_event+0x36/0x55 handle_fasteoi_irq+0xc2/0x18a do_IRQ+0xb3/0x157 common_interrupt+0xf/0xf </IRQ> Fixes: 768aec0b ("serial: 8250: fix shared interrupts issues with SMP and RT kernels") Reported-by: Guenter Roeck <linux@roeck-us.net> Reported-by: Raul Rangel <rrangel@google.com> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/lkml/CAHQZ30BnfX+gxjPm1DUd5psOTqbyDh4EJE=2=VAMW_VDafctkA@mail.gmail.com/T/#u Link: https://lore.kernel.org/r/20200817022646.1484638-1-sergey.senozhatsky@gmail.com BugLink: https://bugs.chromium.org/p/chromium/issues/detail?id=1114800
-
- 10 Aug, 2020 1 commit
-
-
Randy Dunlap authored
Drop repeated words "the" in kernel/printk/. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20200807033227.8349-1-rdunlap@infradead.org
-
- 05 Aug, 2020 9 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linuxLinus Torvalds authored
Pull printk updates from Petr Mladek: - Herbert Xu made printk header file self-contained. - Andy Shevchenko and Sergey Senozhatsky cleaned up console->setup() error handling. - Andy Shevchenko did some cleanups (e.g. sparse warning) in vsprintf code. - Minor documentation updates. * tag 'printk-for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: lib/vsprintf: Force type of flags value for gfp_t lib/vsprintf: Replace custom spec to print decimals with generic one lib/vsprintf: Replace hidden BUILD_BUG_ON() with static_assert() printk: Make linux/printk.h self-contained doc:kmsg: explicitly state the return value in case of SEEK_CUR Replace HTTP links with HTTPS ones: vsprintf hvc: unify console setup naming console: Fix trivia typo 'change' -> 'chance' console: Propagate error code from console ->setup() tty: hvc: Return proper error code from console ->setup() hook serial: sunzilog: Return proper error code from console ->setup() hook serial: sunsab: Return proper error code from console ->setup() hook mips: Return proper error code from console ->setup() hook
-
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linuxLinus Torvalds authored
Pull parisc updates from Helge Deller: "The majority of the patches are reverts of previous commits regarding the parisc-specific low level spinlocking code and barrier handling, with which we tried to fix CPU stalls on our build servers. In the end John David Anglin found the culprit: We missed a define for atomic64_set_release(). This seems to have fixed our issues, so now it's good to remove the unnecessary code again. Other than that it's trivial stuff: Spelling fixes, constifications and such" * 'parisc-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: make the log level string for register dumps const parisc: Do not use an ordered store in pa_tlb_lock() Revert "parisc: Revert "Release spinlocks using ordered store"" Revert "parisc: Use ldcw instruction for SMP spinlock release barrier" Revert "parisc: Drop LDCW barrier in CAS code when running UP" Revert "parisc: Improve interrupt handling in arch_spin_lock_flags()" parisc: Replace HTTP links with HTTPS ones parisc: elf.h: delete a duplicated word parisc: Report bad pages as HardwareCorrupted parisc: Convert to BIT_MASK() and BIT_WORD()
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 fsgsbase from Thomas Gleixner: "Support for FSGSBASE. Almost 5 years after the first RFC to support it, this has been brought into a shape which is maintainable and actually works. This final version was done by Sasha Levin who took it up after Intel dropped the ball. Sasha discovered that the SGX (sic!) offerings out there ship rogue kernel modules enabling FSGSBASE behind the kernels back which opens an instantanious unpriviledged root hole. The FSGSBASE instructions provide a considerable speedup of the context switch path and enable user space to write GSBASE without kernel interaction. This enablement requires careful handling of the exception entries which go through the paranoid entry path as they can no longer rely on the assumption that user GSBASE is positive (as enforced via prctl() on non FSGSBASE enabled systemn). All other entries (syscalls, interrupts and exceptions) can still just utilize SWAPGS unconditionally when the entry comes from user space. Converting these entries to use FSGSBASE has no benefit as SWAPGS is only marginally slower than WRGSBASE and locating and retrieving the kernel GSBASE value is not a free operation either. The real benefit of RD/WRGSBASE is the avoidance of the MSR reads and writes. The changes come with appropriate selftests and have held up in field testing against the (sanitized) Graphene-SGX driver" * tag 'x86-fsgsbase-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits) x86/fsgsbase: Fix Xen PV support x86/ptrace: Fix 32-bit PTRACE_SETREGS vs fsbase and gsbase selftests/x86/fsgsbase: Add a missing memory constraint selftests/x86/fsgsbase: Fix a comment in the ptrace_write_gsbase test selftests/x86: Add a syscall_arg_fault_64 test for negative GSBASE selftests/x86/fsgsbase: Test ptracer-induced GS base write with FSGSBASE selftests/x86/fsgsbase: Test GS selector on ptracer-induced GS base write Documentation/x86/64: Add documentation for GS/FS addressing mode x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2 x86/cpu: Enable FSGSBASE on 64bit by default and add a chicken bit x86/entry/64: Handle FSGSBASE enabled paranoid entry/exit x86/entry/64: Introduce the FIND_PERCPU_BASE macro x86/entry/64: Switch CR3 before SWAPGS in paranoid entry x86/speculation/swapgs: Check FSGSBASE in enabling SWAPGS mitigation x86/process/64: Use FSGSBASE instructions on thread copy and ptrace x86/process/64: Use FSBSBASE in switch_to() if available x86/process/64: Make save_fsgs_for_kvm() ready for FSGSBASE x86/fsgsbase/64: Enable FSGSBASE instructions in helper functions x86/fsgsbase/64: Add intrinsics for FSGSBASE instructions x86/cpu: Add 'unsafe_fsgsbase' to enable CR4.FSGSBASE ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 conversion to generic entry code from Thomas Gleixner: "The conversion of X86 syscall, interrupt and exception entry/exit handling to the generic code. Pretty much a straight-forward 1:1 conversion plus the consolidation of the KVM handling of pending work before entering guest mode" * tag 'x86-entry-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/kvm: Use __xfer_to_guest_mode_work_pending() in kvm_run_vcpu() x86/kvm: Use generic xfer to guest work function x86/entry: Cleanup idtentry_enter/exit x86/entry: Use generic interrupt entry/exit code x86/entry: Cleanup idtentry_entry/exit_user x86/entry: Use generic syscall exit functionality x86/entry: Use generic syscall entry function x86/ptrace: Provide pt_regs helper for entry/exit x86/entry: Move user return notifier out of loop x86/entry: Consolidate 32/64 bit syscall entry x86/entry: Consolidate check_user_regs() x86: Correct noinstr qualifiers x86/idtentry: Remove stale comment
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull generic kernel entry/exit code from Thomas Gleixner: "Generic implementation of common syscall, interrupt and exception entry/exit functionality based on the recent X86 effort to ensure correctness of entry/exit vs RCU and instrumentation. As this functionality and the required entry/exit sequences are not architecture specific, sharing them allows other architectures to benefit instead of copying the same code over and over again. This branch was kept standalone to allow others to work on it. The conversion of x86 comes in a seperate pull request which obviously is based on this branch" * tag 'core-entry-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: entry: Correct __secure_computing() stub entry: Correct 'noinstr' attributes entry: Provide infrastructure for work before transitioning to guest mode entry: Provide generic interrupt entry/exit code entry: Provide generic syscall exit function entry: Provide generic syscall entry functionality seccomp: Provide stub for __secure_computing()
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull timer updates from Thomas Gleixner: "Time, timers and related driver updates: - Prevent unnecessary timer softirq invocations by extending the tracking of the next expiring timer in the timer wheel beyond the existing NOHZ functionality. The tracking overhead at enqueue time is within the noise, but on sensitive workloads the avoidance of the soft interrupt invocation is a measurable improvement. - The obligatory new clocksource driver for Ingenic X100 OST - The usual fixes, improvements, cleanups and extensions for newer chip variants all over the driver space" * tag 'timers-core-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (28 commits) timers: Recalculate next timer interrupt only when necessary clocksource/drivers/ingenic: Add support for the Ingenic X1000 OST. dt-bindings: timer: Add Ingenic X1000 OST bindings. clocksource/drivers: Replace HTTP links with HTTPS ones clocksource/drivers/nomadik-mtu: Handle 32kHz clock clocksource/drivers/sh_cmt: Use "kHz" for kilohertz clocksource/drivers/imx: Add support for i.MX TPM driver with ARM64 clocksource/drivers/ingenic: Add high resolution timer support for SMP/SMT. timers: Lower base clock forwarding threshold timers: Remove must_forward_clk timers: Spare timer softirq until next expiry timers: Expand clk forward logic beyond nohz timers: Reuse next expiry cache after nohz exit timers: Always keep track of next expiry timers: Optimize _next_timer_interrupt() level iteration timers: Add comments about calc_index() ceiling work timers: Move trigger_dyntick_cpu() to enqueue_timer() timers: Use only bucket expiry for base->next_expiry value timers: Preserve higher bits of expiration on index calculation clocksource/drivers/timer-atmel-tcb: Add sama5d2 support ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull irq updates from Thomas Gleixner: "The usual boring updates from the interrupt subsystem: - Infrastructure to allow building irqchip drivers as modules - Consolidation of irqchip ACPI probing - Removal of the EOI-preflow interrupt handler which was required for SPARC support and became obsolete after SPARC was converted to use sparse interrupts. - Cleanups, fixes and improvements all over the place" * tag 'irq-core-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits) irqchip/loongson-pch-pic: Fix the misused irq flow handler irqchip/loongson-htvec: Support 8 groups of HT vectors irqchip/loongson-liointc: Fix misuse of gc->mask_cache dt-bindings: interrupt-controller: Update Loongson HTVEC description irqchip/imx-intmux: Fix irqdata regs save in imx_intmux_runtime_suspend() irqchip/imx-intmux: Implement intmux runtime power management irqchip/gic-v4.1: Use GFP_ATOMIC flag in allocate_vpe_l1_table() irqchip: Fix IRQCHIP_PLATFORM_DRIVER_* compilation by including module.h irqchip/stm32-exti: Map direct event to irq parent irqchip/mtk-cirq: Convert to a platform driver irqchip/mtk-sysirq: Convert to a platform driver irqchip/qcom-pdc: Switch to using IRQCHIP_PLATFORM_DRIVER helper macros irqchip: Add IRQCHIP_PLATFORM_DRIVER_BEGIN/END and IRQCHIP_MATCH helper macros irqchip: irq-bcm2836.h: drop a duplicated word irqchip/gic-v4.1: Ensure accessing the correct RD when writing INVALLR irqchip/irq-bcm7038-l1: Guard uses of cpu_logical_map irqchip/gic-v3: Remove unused register definition irqchip/qcom-pdc: Allow QCOM_PDC to be loadable as a permanent module genirq: Export irq_chip_retrigger_hierarchy and irq_chip_set_vcpu_affinity_parent irqdomain: Export irq_domain_update_bus_token ...
-
git://git.infradead.org/users/hch/dma-mappingLinus Torvalds authored
Pull dma-mapping updates from Christoph Hellwig: - make support for dma_ops optional - move more code out of line - add generic support for a dma_ops bypass mode - misc cleanups * tag 'dma-mapping-5.9' of git://git.infradead.org/users/hch/dma-mapping: dma-contiguous: cleanup dma_alloc_contiguous dma-debug: use named initializers for dir2name powerpc: use the generic dma_ops_bypass mode dma-mapping: add a dma_ops_bypass flag to struct device dma-mapping: make support for dma ops optional dma-mapping: inline the fast path dma-direct calls dma-mapping: move the remaining DMA API calls out of line
-
git://git.infradead.org/users/hch/uuidLinus Torvalds authored
Pull uuid update from Christoph Hellwig: "Remove a now unused helper (Andy Shevchenko)" * tag 'uuid-for-5.9' of git://git.infradead.org/users/hch/uuid: uuid: remove unused uuid_le_to_bin() definition
-
- 04 Aug, 2020 8 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linuxLinus Torvalds authored
Pull close_range() implementation from Christian Brauner: "This adds the close_range() syscall. It allows to efficiently close a range of file descriptors up to all file descriptors of a calling task. This is coordinated with the FreeBSD folks which have copied our version of this syscall and in the meantime have already merged it in April 2019: https://reviews.freebsd.org/D21627 https://svnweb.freebsd.org/base?view=revision&revision=359836 The syscall originally came up in a discussion around the new mount API and making new file descriptor types cloexec by default. During this discussion, Al suggested the close_range() syscall. First, it helps to close all file descriptors of an exec()ing task. This can be done safely via (quoting Al's example from [1] verbatim): /* that exec is sensitive */ unshare(CLONE_FILES); /* we don't want anything past stderr here */ close_range(3, ~0U); execve(....); The code snippet above is one way of working around the problem that file descriptors are not cloexec by default. This is aggravated by the fact that we can't just switch them over without massively regressing userspace. For a whole class of programs having an in-kernel method of closing all file descriptors is very helpful (e.g. demons, service managers, programming language standard libraries, container managers etc.). Second, it allows userspace to avoid implementing closing all file descriptors by parsing through /proc/<pid>/fd/* and calling close() on each file descriptor and other hacks. From looking at various large(ish) userspace code bases this or similar patterns are very common in service managers, container runtimes, and programming language runtimes/standard libraries such as Python or Rust. In addition, the syscall will also work for tasks that do not have procfs mounted and on kernels that do not have procfs support compiled in. In such situations the only way to make sure that all file descriptors are closed is to call close() on each file descriptor up to UINT_MAX or RLIMIT_NOFILE, OPEN_MAX trickery. Based on Linus' suggestion close_range() also comes with a new flag CLOSE_RANGE_UNSHARE to more elegantly handle file descriptor dropping right before exec. This would usually be expressed in the sequence: unshare(CLONE_FILES); close_range(3, ~0U); as pointed out by Linus it might be desirable to have this be a part of close_range() itself under a new flag CLOSE_RANGE_UNSHARE which gets especially handy when we're closing all file descriptors above a certain threshold. Test-suite as always included" * tag 'close-range-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: tests: add CLOSE_RANGE_UNSHARE tests close_range: add CLOSE_RANGE_UNSHARE tests: add close_range() tests arch: wire-up close_range() open: add close_range()
-
Linus Torvalds authored
Merge tag 'cap-checkpoint-restore-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull checkpoint-restore updates from Christian Brauner: "This enables unprivileged checkpoint/restore of processes. Given that this work has been going on for quite some time the first sentence in this summary is hopefully more exciting than the actual final code changes required. Unprivileged checkpoint/restore has seen a frequent increase in interest over the last two years and has thus been one of the main topics for the combined containers & checkpoint/restore microconference since at least 2018 (cf. [1]). Here are just the three most frequent use-cases that were brought forward: - The JVM developers are integrating checkpoint/restore into a Java VM to significantly decrease the startup time. - In high-performance computing environment a resource manager will typically be distributing jobs where users are always running as non-root. Long-running and "large" processes with significant startup times are supposed to be checkpointed and restored with CRIU. - Container migration as a non-root user. In all of these scenarios it is either desirable or required to run without CAP_SYS_ADMIN. The userspace implementation of checkpoint/restore CRIU already has the pull request for supporting unprivileged checkpoint/restore up (cf. [2]). To enable unprivileged checkpoint/restore a new dedicated capability CAP_CHECKPOINT_RESTORE is introduced. This solution has last been discussed in 2019 in a talk by Google at Linux Plumbers (cf. [1] "Update on Task Migration at Google Using CRIU") with Adrian and Nicolas providing the implementation now over the last months. In essence, this allows the CRIU binary to be installed with the CAP_CHECKPOINT_RESTORE vfs capability set thereby enabling unprivileged users to restore processes. To make this possible the following permissions are altered: - Selecting a specific PID via clone3() set_tid relaxed from userns CAP_SYS_ADMIN to CAP_CHECKPOINT_RESTORE. - Selecting a specific PID via /proc/sys/kernel/ns_last_pid relaxed from userns CAP_SYS_ADMIN to CAP_CHECKPOINT_RESTORE. - Accessing /proc/pid/map_files relaxed from init userns CAP_SYS_ADMIN to init userns CAP_CHECKPOINT_RESTORE. - Changing /proc/self/exe from userns CAP_SYS_ADMIN to userns CAP_CHECKPOINT_RESTORE. Of these four changes the /proc/self/exe change deserves a few words because the reasoning behind even restricting /proc/self/exe changes in the first place is just full of historical quirks and tracking this down was a questionable version of fun that I'd like to spare others. In short, it is trivial to change /proc/self/exe as an unprivileged user, i.e. without userns CAP_SYS_ADMIN right now. Either via ptrace() or by simply intercepting the elf loader in userspace during exec. Nicolas was nice enough to even provide a POC for the latter (cf. [3]) to illustrate this fact. The original patchset which introduced PR_SET_MM_MAP had no permissions around changing the exe link. They too argued that it is trivial to spoof the exe link already which is true. The argument brought up against this was that the Tomoyo LSM uses the exe link in tomoyo_manager() to detect whether the calling process is a policy manager. This caused changing the exe links to be guarded by userns CAP_SYS_ADMIN. All in all this rather seems like a "better guard it with something rather than nothing" argument which imho doesn't qualify as a great security policy. Again, because spoofing the exe link is possible for the calling process so even if this were security relevant it was broken back then and would be broken today. So technically, dropping all permissions around changing the exe link would probably be possible and would send a clearer message to any userspace that relies on /proc/self/exe for security reasons that they should stop doing this but for now we're only relaxing the exe link permissions from userns CAP_SYS_ADMIN to userns CAP_CHECKPOINT_RESTORE. There's a final uapi change in here. Changing the exe link used to accidently return EINVAL when the caller lacked the necessary permissions instead of the more correct EPERM. This pr contains a commit fixing this. I assume that userspace won't notice or care and if they do I will revert this commit. But since we are changing the permissions anyway it seems like a good opportunity to try this fix. With these changes merged unprivileged checkpoint/restore will be possible and has already been tested by various users" [1] LPC 2018 1. "Task Migration at Google Using CRIU" https://www.youtube.com/watch?v=yI_1cuhoDgA&t=12095 2. "Securely Migrating Untrusted Workloads with CRIU" https://www.youtube.com/watch?v=yI_1cuhoDgA&t=14400 LPC 2019 1. "CRIU and the PID dance" https://www.youtube.com/watch?v=LN2CUgp8deo&list=PLVsQ_xZBEyN30ZA3Pc9MZMFzdjwyz26dO&index=9&t=2m48s 2. "Update on Task Migration at Google Using CRIU" https://www.youtube.com/watch?v=LN2CUgp8deo&list=PLVsQ_xZBEyN30ZA3Pc9MZMFzdjwyz26dO&index=9&t=1h2m8s [2] https://github.com/checkpoint-restore/criu/pull/1155 [3] https://github.com/nviennot/run_as_exe * tag 'cap-checkpoint-restore-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: selftests: add clone3() CAP_CHECKPOINT_RESTORE test prctl: exe link permission error changed from -EINVAL to -EPERM prctl: Allow local CAP_CHECKPOINT_RESTORE to change /proc/self/exe proc: allow access in init userns for map_files with CAP_CHECKPOINT_RESTORE pid_namespace: use checkpoint_restore_ns_capable() for ns_last_pid pid: use checkpoint_restore_ns_capable() for set_tid capabilities: Introduce CAP_CHECKPOINT_RESTORE
-
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linuxLinus Torvalds authored
Pull fork cleanups from Christian Brauner: "This is cleanup series from when we reworked a chunk of the process creation paths in the kernel and switched to struct {kernel_}clone_args. High-level this does two main things: - Remove the double export of both do_fork() and _do_fork() where do_fork() used the incosistent legacy clone calling convention. Now we only export _do_fork() which is based on struct kernel_clone_args. - Remove the copy_thread_tls()/copy_thread() split making the architecture specific HAVE_COYP_THREAD_TLS config option obsolete. This switches all remaining architectures to select HAVE_COPY_THREAD_TLS and thus to the copy_thread_tls() calling convention. The current split makes the process creation codepaths more convoluted than they need to be. Each architecture has their own copy_thread() function unless it selects HAVE_COPY_THREAD_TLS then it has a copy_thread_tls() function. The split is not needed anymore nowadays, all architectures support CLONE_SETTLS but quite a few of them never bothered to select HAVE_COPY_THREAD_TLS and instead simply continued to use copy_thread() and use the old calling convention. Removing this split cleans up the process creation codepaths and paves the way for implementing clone3() on such architectures since it requires the copy_thread_tls() calling convention. After having made each architectures support copy_thread_tls() this series simply renames that function back to copy_thread(). It also switches all architectures that call do_fork() directly over to _do_fork() and the struct kernel_clone_args calling convention. This is a corollary of switching the architectures that did not yet support it over to copy_thread_tls() since do_fork() is conditional on not supporting copy_thread_tls() (Mostly because it lacks a separate argument for tls which is trivial to fix but there's no need for this function to exist.). The do_fork() removal is in itself already useful as it allows to to remove the export of both do_fork() and _do_fork() we currently have in favor of only _do_fork(). This has already been discussed back when we added clone3(). The legacy clone() calling convention is - as is probably well-known - somewhat odd: # # ABI hall of shame # config CLONE_BACKWARDS config CLONE_BACKWARDS2 config CLONE_BACKWARDS3 that is aggravated by the fact that some architectures such as sparc follow the CLONE_BACKWARDSx calling convention but don't really select the corresponding config option since they call do_fork() directly. So do_fork() enforces a somewhat arbitrary calling convention in the first place that doesn't really help the individual architectures that deviate from it. They can thus simply be switched to _do_fork() enforcing a single calling convention. (I really hope that any new architectures will __not__ try to implement their own calling conventions...) Most architectures already have made a similar switch (m68k comes to mind). Overall this removes more code than it adds even with a good portion of added comments. It simplifies a chunk of arch specific assembly either by moving the code into C or by simply rewriting the assembly. Architectures that have been touched in non-trivial ways have all been actually boot and stress tested: sparc and ia64 have been tested with Debian 9 images. They are the two architectures which have been touched the most. All non-trivial changes to architectures have seen acks from the relevant maintainers. nios2 with a custom built buildroot image. h8300 I couldn't get something bootable to test on but the changes have been fairly automatic and I'm sure we'll hear people yell if I broke something there. All other architectures that have been touched in trivial ways have been compile tested for each single patch of the series via git rebase -x "make ..." v5.8-rc2. arm{64} and x86{_64} have been boot tested even though they have just been trivially touched (removal of the HAVE_COPY_THREAD_TLS macro from their Kconfig) because well they are basically "core architectures" and since it is trivial to get your hands on a useable image" * tag 'fork-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: arch: rename copy_thread_tls() back to copy_thread() arch: remove HAVE_COPY_THREAD_TLS unicore: switch to copy_thread_tls() sh: switch to copy_thread_tls() nds32: switch to copy_thread_tls() microblaze: switch to copy_thread_tls() hexagon: switch to copy_thread_tls() c6x: switch to copy_thread_tls() alpha: switch to copy_thread_tls() fork: remove do_fork() h8300: select HAVE_COPY_THREAD_TLS, switch to kernel_clone_args nios2: enable HAVE_COPY_THREAD_TLS, switch to kernel_clone_args ia64: enable HAVE_COPY_THREAD_TLS, switch to kernel_clone_args sparc: unconditionally enable HAVE_COPY_THREAD_TLS sparc: share process creation helpers between sparc and sparc64 sparc64: enable HAVE_COPY_THREAD_TLS fork: fold legacy_clone_args_valid() into _do_fork()
-
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linuxLinus Torvalds authored
Pull thread updates from Christian Brauner: "This contains the changes to add the missing support for attaching to time namespaces via pidfds. Last cycle setns() was changed to support attaching to multiple namespaces atomically. This requires all namespaces to have a point of no return where they can't fail anymore. Specifically, <namespace-type>_install() is allowed to perform permission checks and install the namespace into the new struct nsset that it has been given but it is not allowed to make visible changes to the affected task. Once <namespace-type>_install() returns, anything that the given namespace type additionally requires to be setup needs to ideally be done in a function that can't fail or if it fails the failure must be non-fatal. For time namespaces the relevant functions that fell into this category were timens_set_vvar_page() and vdso_join_timens(). The latter could still fail although it didn't need to. This function is only implemented for vdso_join_timens() in current mainline. As discussed on-list (cf. [1]), in order to make setns() support time namespaces when attaching to multiple namespaces at once properly we changed vdso_join_timens() to always succeed. So vdso_join_timens() replaces the mmap_write_lock_killable() with mmap_read_lock(). Please note that arm is about to grow vdso support for time namespaces (possibly this merge window). We've synced on this change and arm64 also uses mmap_read_lock(), i.e. makes vdso_join_timens() a function that can't fail. Once the changes here and the arm64 changes have landed, vdso_join_timens() should be turned into a void function so it's obvious to callers and implementers on other architectures that the expectation is that it can't fail. We didn't do this right away because it would've introduced unnecessary merge conflicts between the two trees for no major gain. As always, tests included" [1]: https://lore.kernel.org/lkml/20200611110221.pgd3r5qkjrjmfqa2@wittgenstein * tag 'threads-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: tests: add CLONE_NEWTIME setns tests nsproxy: support CLONE_NEWTIME with setns() timens: add timens_commit() helper timens: make vdso_join_timens() always succeed
-
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespaceLinus Torvalds authored
Pull execve updates from Eric Biederman: "During the development of v5.7 I ran into bugs and quality of implementation issues related to exec that could not be easily fixed because of the way exec is implemented. So I have been diggin into exec and cleaning up what I can. This cycle I have been looking at different ideas and different implementations to see what is possible to improve exec, and cleaning the way exec interfaces with in kernel users. Only cleaning up the interfaces of exec with rest of the kernel has managed to stabalize and make it through review in time for v5.9-rc1 resulting in 2 sets of changes this cycle. - Implement kernel_execve - Make the user mode driver code a better citizen With kernel_execve the code size got a little larger as the copying of parameters from userspace and copying of parameters from userspace is now separate. The good news is kernel threads no longer need to play games with set_fs to use exec. Which when combined with the rest of Christophs set_fs changes should security bugs with set_fs much more difficult" * 'exec-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (23 commits) exec: Implement kernel_execve exec: Factor bprm_stack_limits out of prepare_arg_pages exec: Factor bprm_execve out of do_execve_common exec: Move bprm_mm_init into alloc_bprm exec: Move initialization of bprm->filename into alloc_bprm exec: Factor out alloc_bprm exec: Remove unnecessary spaces from binfmts.h umd: Stop using split_argv umd: Remove exit_umh bpfilter: Take advantage of the facilities of struct pid exit: Factor thread_group_exited out of pidfd_poll umd: Track user space drivers with struct pid bpfilter: Move bpfilter_umh back into init data exec: Remove do_execve_file umh: Stop calling do_execve_file umd: Transform fork_usermode_blob into fork_usermode_driver umd: Rename umd_info.cmdline umd_info.driver_name umd: For clarity rename umh_info umd_info umh: Separate the user mode driver and the user mode helper support umh: Remove call_usermodehelper_setup_file. ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/auditLinus Torvalds authored
Pull audit updates from Paul Moore: "Aside from some smaller bug fixes, here are the highlights: - add a new backlog wait metric to the audit status message, this is intended to help admins determine how long processes have been waiting for the audit backlog queue to clear - generate audit records for nftables configuration changes - generate CWD audit records for for the relevant LSM audit records" * tag 'audit-pr-20200803' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: report audit wait metric in audit status reply audit: purge audit_log_string from the intra-kernel audit API audit: issue CWD record to accompany LSM_AUDIT_DATA_* records audit: use the proper gfp flags in the audit_log_nfcfg() calls audit: remove unused !CONFIG_AUDITSYSCALL __audit_inode* stubs audit: add gfp parameter to audit_log_nfcfg audit: log nftables configuration change events audit: Use struct_size() helper in alloc_chunk
-
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinuxLinus Torvalds authored
Pull selinux updates from Paul Moore: "Beyond the usual smattering of bug fixes, we've got three small improvements worth highlighting: - improved SELinux policy symbol table performance due to a reworking of the insert and search functions - allow reading of SELinux labels before the policy is loaded, allowing for some more "exotic" initramfs approaches - improved checking an error reporting about process class/permissions during SELinux policy load" * tag 'selinux-pr-20200803' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: complete the inlining of hashtab functions selinux: prepare for inlining of hashtab functions selinux: specialize symtab insert and search functions selinux: Fix spelling mistakes in the comments selinux: fixed a checkpatch warning with the sizeof macro selinux: log error messages on required process class / permissions scripts/selinux/mdp: fix initial SID handling selinux: allow reading labels before policy is loaded
-
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linuxLinus Torvalds authored
Pull seccomp updates from Kees Cook: "There are a bunch of clean ups and selftest improvements along with two major updates to the SECCOMP_RET_USER_NOTIF filter return: EPOLLHUP support to more easily detect the death of a monitored process, and being able to inject fds when intercepting syscalls that expect an fd-opening side-effect (needed by both container folks and Chrome). The latter continued the refactoring of __scm_install_fd() started by Christoph, and in the process found and fixed a handful of bugs in various callers. - Improved selftest coverage, timeouts, and reporting - Add EPOLLHUP support for SECCOMP_RET_USER_NOTIF (Christian Brauner) - Refactor __scm_install_fd() into __receive_fd() and fix buggy callers - Introduce 'addfd' command for SECCOMP_RET_USER_NOTIF (Sargun Dhillon)" * tag 'seccomp-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (30 commits) selftests/seccomp: Test SECCOMP_IOCTL_NOTIF_ADDFD seccomp: Introduce addfd ioctl to seccomp user notifier fs: Expand __receive_fd() to accept existing fd pidfd: Replace open-coded receive_fd() fs: Add receive_fd() wrapper for __receive_fd() fs: Move __scm_install_fd() to __receive_fd() net/scm: Regularize compat handling of scm_detach_fds() pidfd: Add missing sock updates for pidfd_getfd() net/compat: Add missing sock updates for SCM_RIGHTS selftests/seccomp: Check ENOSYS under tracing selftests/seccomp: Refactor to use fixture variants selftests/harness: Clean up kern-doc for fixtures seccomp: Use -1 marker for end of mode 1 syscall list seccomp: Fix ioctl number for SECCOMP_IOCTL_NOTIF_ID_VALID selftests/seccomp: Rename user_trap_syscall() to user_notif_syscall() selftests/seccomp: Make kcmp() less required seccomp: Use pr_fmt selftests/seccomp: Improve calibration loop selftests/seccomp: use 90s as timeout selftests/seccomp: Expand benchmark to per-filter measurements ...
-