- 07 Apr, 2018 1 commit
-
-
Rafael J. Wysocki authored
In order to address the issue with short idle duration predictions by the idle governor after the scheduler tick has been stopped, split tick_nohz_stop_sched_tick() into two separate routines, one computing the time to the next timer event and the other simply stopping the tick when the time to the next timer event is known. Prepare these two routines to be called separately, as one of them will be called by the idle governor in the cpuidle_select() code path after subsequent changes. Update the former callers of tick_nohz_stop_sched_tick() to use the new routines, tick_nohz_next_event() and tick_nohz_stop_tick(), instead of it and move the updates of the sleep_length field in struct tick_sched into __tick_nohz_idle_stop_tick() as it doesn't need to be updated anywhere else. There should be no intentional visible changes in functionality resulting from this change. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
-
- 06 Apr, 2018 2 commits
-
-
Rafael J. Wysocki authored
Add a new pointer argument to cpuidle_select() and to the ->select cpuidle governor callback to allow a boolean value indicating whether or not the tick should be stopped before entering the selected state to be returned from there. Make the ladder governor ignore that pointer (to preserve its current behavior) and make the menu governor return 'false" through it if: (1) the idle exit latency is constrained at 0, or (2) the selected state is a polling one, or (3) the expected idle period duration is within the tick period range. In addition to that, the correction factor computations in the menu governor need to take the possibility that the tick may not be stopped into account to avoid artificially small correction factor values. To that end, add a mechanism to record tick wakeups, as suggested by Peter Zijlstra, and use it to modify the menu_update() behavior when tick wakeup occurs. Namely, if the CPU is woken up by the tick and the return value of tick_nohz_get_sleep_length() is not within the tick boundary, the predicted idle duration is likely too short, so make menu_update() try to compensate for that by updating the governor statistics as though the CPU was idle for a long time. Since the value returned through the new argument pointer of cpuidle_select() is not used by its caller yet, this change by itself is not expected to alter the functionality of the code. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
-
Rafael J. Wysocki authored
Since the subsequent changes will need a TICK_USEC definition analogous to TICK_NSEC, rename the existing TICK_USEC as USER_TICK_USEC, update its users and redefine TICK_USEC accordingly. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
-
- 05 Apr, 2018 3 commits
-
-
Rafael J. Wysocki authored
Make cpuidle_idle_call() decide whether or not to stop the tick. First, the cpuidle_enter_s2idle() path deals with the tick (and with the entire timekeeping for that matter) by itself and it doesn't need the tick to be stopped beforehand. Second, to address the issue with short idle duration predictions by the idle governor after the tick has been stopped, it will be necessary to change the ordering of cpuidle_select() with respect to tick_nohz_idle_stop_tick(). To prepare for that, put a tick_nohz_idle_stop_tick() call in the same branch in which cpuidle_select() is called. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
-
Rafael J. Wysocki authored
Push the decision whether or not to stop the tick somewhat deeper into the idle loop. Stopping the tick upfront leads to unpleasant outcomes in case the idle governor doesn't agree with the nohz code on the duration of the upcoming idle period. Specifically, if the tick has been stopped and the idle governor predicts short idle, the situation is bad regardless of whether or not the prediction is accurate. If it is accurate, the tick has been stopped unnecessarily which means excessive overhead. If it is not accurate, the CPU is likely to spend too much time in the (shallow, because short idle has been predicted) idle state selected by the governor [1]. As the first step towards addressing this problem, change the code to make the tick stopping decision inside of the loop in do_idle(). In particular, do not stop the tick in the cpu_idle_poll() code path. Also don't do that in tick_nohz_irq_exit() which doesn't really have enough information on whether or not to stop the tick. Link: https://marc.info/?l=linux-pm&m=150116085925208&w=2 # [1] Link: https://tu-dresden.de/zih/forschung/ressourcen/dateien/projekte/haec/powernightmares.pdfSuggested-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
-
Rafael J. Wysocki authored
Prepare the scheduler tick code for reworking the idle loop to avoid stopping the tick in some cases. The idea is to split the nohz idle entry call to decouple the idle time stats accounting and preparatory work from the actual tick stop code, in order to later be able to delay the tick stop once we reach more power-knowledgeable callers. Move away the tick_nohz_start_idle() invocation from __tick_nohz_idle_enter(), rename the latter to __tick_nohz_idle_stop_tick() and define tick_nohz_idle_stop_tick() as a wrapper around it for calling it from the outside. Make tick_nohz_idle_enter() only call tick_nohz_start_idle() instead of calling the entire __tick_nohz_idle_enter(), add another wrapper disabling and enabling interrupts around tick_nohz_idle_stop_tick() and make the current callers of tick_nohz_idle_enter() call it too to retain their current functionality. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
-
- 03 Apr, 2018 9 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pmLinus Torvalds authored
Pull power management updates from Rafael Wysocki: "These update the cpuidle poll state definition to reduce excessive energy usage related to it, add new CPU ID to the RAPL power capping driver, update the ACPI system suspend code to handle some special cases better, extend the PM core's device links code slightly, add new sysfs attribute for better suspend-to-idle diagnostics and easier hibernation handling, update power management tools and clean up cpufreq quite a bit. Specifics: - Modify the cpuidle poll state implementation to prevent CPUs from staying in the loop in there for excessive times (Rafael Wysocki). - Add Intel Cannon Lake chips support to the RAPL power capping driver (Joe Konno). - Add reference counting to the device links handling code in the PM core (Lukas Wunner). - Avoid reconfiguring GPEs on suspend-to-idle in the ACPI system suspend code (Rafael Wysocki). - Allow devices to be put into deeper low-power states via ACPI if both _SxD and _SxW are missing (Daniel Drake). - Reorganize the core ACPI suspend-to-idle wakeup code to avoid a keyboard wakeup issue on Asus UX331UA (Chris Chiu). - Prevent the PCMCIA library code from aborting suspend-to-idle due to noirq suspend failures resulting from incorrect assumptions (Rafael Wysocki). - Add coupled cpuidle supprt to the Exynos3250 platform (Marek Szyprowski). - Add new sysfs file to make it easier to specify the image storage location during hibernation (Mario Limonciello). - Add sysfs files for collecting suspend-to-idle usage and time statistics for CPU idle states (Rafael Wysocki). - Update the pm-graph utilities (Todd Brandt). - Reduce the kernel log noise related to reporting Low-power Idle constraings by the ACPI system suspend code (Rafael Wysocki). - Make it easier to distinguish dedicated wakeup IRQs in the /proc/interrupts output (Tony Lindgren). - Add the frequency table validation in cpufreq to the core and drop it from a number of cpufreq drivers (Viresh Kumar). - Drop "cooling-{min|max}-level" for CPU nodes from a couple of DT bindings (Viresh Kumar). - Clean up the CPU online error code path in the cpufreq core (Viresh Kumar). - Fix assorted issues in the SCPI, CPPC, mediatek and tegra186 cpufreq drivers (Arnd Bergmann, Chunyu Hu, George Cherian, Viresh Kumar). - Drop memory allocation error messages from a few places in cpufreq and cpuildle drivers (Markus Elfring)" * tag 'pm-4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (56 commits) ACPI / PM: Fix keyboard wakeup from suspend-to-idle on ASUS UX331UA cpufreq: CPPC: Use transition_delay_us depending transition_latency PM / hibernate: Change message when writing to /sys/power/resume PM / hibernate: Make passing hibernate offsets more friendly cpuidle: poll_state: Avoid invoking local_clock() too often PM: cpuidle/suspend: Add s2idle usage and time state attributes cpuidle: Enable coupled cpuidle support on Exynos3250 platform cpuidle: poll_state: Add time limit to poll_idle() cpufreq: tegra186: Don't validate the frequency table twice cpufreq: speedstep: Don't validate the frequency table twice cpufreq: sparc: Don't validate the frequency table twice cpufreq: sh: Don't validate the frequency table twice cpufreq: sfi: Don't validate the frequency table twice cpufreq: scpi: Don't validate the frequency table twice cpufreq: sc520: Don't validate the frequency table twice cpufreq: s3c24xx: Don't validate the frequency table twice cpufreq: qoirq: Don't validate the frequency table twice cpufreq: pxa: Don't validate the frequency table twice cpufreq: ppc_cbe: Don't validate the frequency table twice cpufreq: powernow: Don't validate the frequency table twice ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pmLinus Torvalds authored
Pull ACPI updates from Rafael Wysocki: "These update the ACPICA code in the kernel to follow upstream revision 20180313 which includes fixes related to the so-called module-level AML (mostly "if" type of statements outside of any methods) that should improve the handling of systems that load alternative SSDTs depending on the current configuration, for example, and event handling fixes related to disabling and enabling GPEs on system startup and on suspend/resume. Moreover, the ACPICA license boilerplate is replaced with SPDX license IDs which alone reduces the number of lines of ACPICA code in the kernel quite a bit. Also added is a new driver for the generic ACPI Time and Alarm Device (TAD). At the moment it only handles the most basic capabilities of the TAD, however. In addition to that the ACPI battery driver is improved to handle battery thresholds on ThinkPads, among other things, some bugs are fixed, a new backlight quirk is added and some documentation is updated. Specifics: - Update the in-kernel ACPICA code to upstream revision 20180313 including: * Module-level AML code handling fixes and simplifications (Bob Moore, Erik Schmauss). * Fixes and cleanups related to messaging (Bob Moore). * Events handling fixes related to disabling and enabling GPEs (Erik Schmauss). * Introduction of SPDX license identifiers and removal of license boilerplate in multiple files (Erik Schmauss). * Assorted fixes and cleanups (Bob Moore, Erik Schmauss, Hans de Goede, Seunghun Han). - Add new basic driver for the ACPI Time and Alarm Device (Rafael Wysocki). - Modify the ACPI battery driver to support battery thresholds on Lenovo ThinkPads (Ognjen Galic, Colin Ian King). - Avoid reporting battery capacity over 100 in the ACPI battery driver in some cases (Laszlo Toth). - Make the kernel recognize an OEM _OSI string from Dell to avoid power management issues with NVidia GPUs in Dell platforms (Alex Hung). - Make the PCI IRQ management code handle missing _PRS cleanly (Alex Hung). - Fix uevent notifications related to device hotplut (Lee, Chun-Yi). - Prevent the ACPI PAD driver from leaking memory (Lenny Szubowicz). - Update the ACPI CPPC library code to include subspace IDs in the kernel messages logged by it (George Cherian). - Add backlight quirk for Samsung 670Z5E (Hans de Goede). - Add the NFIT and HMAT tables to the list of ACPI tables that can be overridden via initrd (Dan Williams). - Fix and clean up some ACPI documentation and Kconfig help language (Aishwarya Pant, Randy Dunlap). - Replace license boilerplate with an SPDX license ID in the ACPI PMIC operation region handling code (Rajmohan Mani)" * tag 'acpi-4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (39 commits) ACPI: acpi_pad: Fix memory leak in power saving threads ACPI / video: Add quirk to force acpi-video backlight on Samsung 670Z5E ACPI: Add Time and Alarm Device (TAD) driver ACPI / scan: Send change uevent with offine environmental data ACPI / Kconfig: Update ACPI_PROCFS_POWER help text ACPI / OSI: Add OEM _OSI strings to disable NVidia RTD3 ACPICA: Update version to 20180313 ACPICA: Cleanup/simplify module-level code support ACPICA: Events: add a return on failure from acpi_hw_register_read ACPICA: adding SPDX headers ACPICA: Rename a global for clarity, no functional change ACPICA: macros: fix ACPI_ERROR_NAMESPACE macro ACPICA: Change a compile-time option to a runtime option ACPICA: Remove calling of _STA from acpi_get_object_info() ACPICA: AML Debug Object: Don't ignore output of zero-length strings ACPICA: Fix memory leak on unusual memory leak ACPICA: Events: Dispatch GPEs after enabling for the first time ACPICA: Events: Add parallel GPE handling support to fix potential redundant _Exx evaluations ACPICA: Events: Stop unconditionally clearing ACPI IRQs during suspend/resume ACPICA: acpi: acpica: fix acpi operand cache leak in nseval.c ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/brodo/linuxLinus Torvalds authored
Pull removal of in-kernel calls to syscalls from Dominik Brodowski: "System calls are interaction points between userspace and the kernel. Therefore, system call functions such as sys_xyzzy() or compat_sys_xyzzy() should only be called from userspace via the syscall table, but not from elsewhere in the kernel. At least on 64-bit x86, it will likely be a hard requirement from v4.17 onwards to not call system call functions in the kernel: It is better to use use a different calling convention for system calls there, where struct pt_regs is decoded on-the-fly in a syscall wrapper which then hands processing over to the actual syscall function. This means that only those parameters which are actually needed for a specific syscall are passed on during syscall entry, instead of filling in six CPU registers with random user space content all the time (which may cause serious trouble down the call chain). Those x86-specific patches will be pushed through the x86 tree in the near future. Moreover, rules on how data may be accessed may differ between kernel data and user data. This is another reason why calling sys_xyzzy() is generally a bad idea, and -- at most -- acceptable in arch-specific code. This patchset removes all in-kernel calls to syscall functions in the kernel with the exception of arch/. On top of this, it cleans up the three places where many syscalls are referenced or prototyped, namely kernel/sys_ni.c, include/linux/syscalls.h and include/linux/compat.h" * 'syscalls-next' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux: (109 commits) bpf: whitelist all syscalls for error injection kernel/sys_ni: remove {sys_,sys_compat} from cond_syscall definitions kernel/sys_ni: sort cond_syscall() entries syscalls/x86: auto-create compat_sys_*() prototypes syscalls: sort syscall prototypes in include/linux/compat.h net: remove compat_sys_*() prototypes from net/compat.h syscalls: sort syscall prototypes in include/linux/syscalls.h kexec: move sys_kexec_load() prototype to syscalls.h x86/sigreturn: use SYSCALL_DEFINE0 x86: fix sys_sigreturn() return type to be long, not unsigned long x86/ioport: add ksys_ioperm() helper; remove in-kernel calls to sys_ioperm() mm: add ksys_readahead() helper; remove in-kernel calls to sys_readahead() mm: add ksys_mmap_pgoff() helper; remove in-kernel calls to sys_mmap_pgoff() mm: add ksys_fadvise64_64() helper; remove in-kernel call to sys_fadvise64_64() fs: add ksys_fallocate() wrapper; remove in-kernel calls to sys_fallocate() fs: add ksys_p{read,write}64() helpers; remove in-kernel calls to syscalls fs: add ksys_truncate() wrapper; remove in-kernel calls to sys_truncate() fs: add ksys_sync_file_range helper(); remove in-kernel calls to syscall kernel: add ksys_setsid() helper; remove in-kernel call to sys_setsid() kernel: add ksys_unshare() helper; remove in-kernel calls to sys_unshare() ...
-
Omar Sandoval authored
Commit 2a98dc02 ("include/linux/bitmap.h: turn bitmap_set and bitmap_clear into memset when possible") introduced an optimization to bitmap_{set,clear}() which uses memset() when the start and length are constants aligned to a byte. This is wrong on big-endian systems; our bitmaps are arrays of unsigned long, so bit n is not at byte n / 8 in memory. This was caught by the Btrfs selftests, but the bitmap selftests also fail when run on a big-endian machine. We can still use memset if the start and length are aligned to an unsigned long, so do that on big-endian. The same problem applies to the memcmp in bitmap_equal(), so fix it there, too. Fixes: 2a98dc02 ("include/linux/bitmap.h: turn bitmap_set and bitmap_clear into memset when possible") Fixes: 2c6deb01 ("bitmap: use memcmp optimisation in more situations") Cc: stable@kernel.org Reported-by: "Erhard F." <erhard_f@mailbox.org> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-genericLinus Torvalds authored
Pul removal of obsolete architecture ports from Arnd Bergmann: "This removes the entire architecture code for blackfin, cris, frv, m32r, metag, mn10300, score, and tile, including the associated device drivers. I have been working with the (former) maintainers for each one to ensure that my interpretation was right and the code is definitely unused in mainline kernels. Many had fond memories of working on the respective ports to start with and getting them included in upstream, but also saw no point in keeping the port alive without any users. In the end, it seems that while the eight architectures are extremely different, they all suffered the same fate: There was one company in charge of an SoC line, a CPU microarchitecture and a software ecosystem, which was more costly than licensing newer off-the-shelf CPU cores from a third party (typically ARM, MIPS, or RISC-V). It seems that all the SoC product lines are still around, but have not used the custom CPU architectures for several years at this point. In contrast, CPU instruction sets that remain popular and have actively maintained kernel ports tend to all be used across multiple licensees. [ See the new nds32 port merged in the previous commit for the next generation of "one company in charge of an SoC line, a CPU microarchitecture and a software ecosystem" - Linus ] The removal came out of a discussion that is now documented at https://lwn.net/Articles/748074/. Unlike the original plans, I'm not marking any ports as deprecated but remove them all at once after I made sure that they are all unused. Some architectures (notably tile, mn10300, and blackfin) are still being shipped in products with old kernels, but those products will never be updated to newer kernel releases. After this series, we still have a few architectures without mainline gcc support: - unicore32 and hexagon both have very outdated gcc releases, but the maintainers promised to work on providing something newer. At least in case of hexagon, this will only be llvm, not gcc. - openrisc, risc-v and nds32 are still in the process of finishing their support or getting it added to mainline gcc in the first place. They all have patched gcc-7.3 ports that work to some degree, but complete upstream support won't happen before gcc-8.1. Csky posted their first kernel patch set last week, their situation will be similar [ Palmer Dabbelt points out that RISC-V support is in mainline gcc since gcc-7, although gcc-7.3.0 is the recommended minimum - Linus ]" This really says it all: 2498 files changed, 95 insertions(+), 467668 deletions(-) * tag 'arch-removal' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (74 commits) MAINTAINERS: UNICORE32: Change email account staging: iio: remove iio-trig-bfin-timer driver tty: hvc: remove tile driver tty: remove bfin_jtag_comm and hvc_bfin_jtag drivers serial: remove tile uart driver serial: remove m32r_sio driver serial: remove blackfin drivers serial: remove cris/etrax uart drivers usb: Remove Blackfin references in USB support usb: isp1362: remove blackfin arch glue usb: musb: remove blackfin port usb: host: remove tilegx platform glue pwm: remove pwm-bfin driver i2c: remove bfin-twi driver spi: remove blackfin related host drivers watchdog: remove bfin_wdt driver can: remove bfin_can driver mmc: remove bfin_sdh driver input: misc: remove blackfin rotary driver input: keyboard: remove bf54x driver ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/greentime/linuxLinus Torvalds authored
Pull nds32 architecture support from Greentime Hu: "This contains the core nds32 Linux port (including interrupt controller driver and timer driver), which has been through seven rounds of review on mailing list. It is able to boot to shell and passes most LTP-2017 testsuites in nds32 AE3XX platform: Total Tests: 1901 Total Skipped Tests: 618 Total Failures: 78" Reviewed-by: Arnd Bergmann <arnd@arndb.de> * tag 'nds32-for-linus-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/greentime/linux: (44 commits) nds32: To use the generic dump_stack() nds32: fix building failed if using elf toolchain. nios2: add ioremap_nocache declaration before include asm-generic/io.h. nds32: fix building failed if using older version gcc. dt-bindings: timer: Add andestech atcpit100 timer binding doc clocksource/drivers/atcpit100: VDSO support clocksource/drivers/atcpit100: Add andestech atcpit100 timer net: faraday add nds32 support. irqchip: Andestech Internal Vector Interrupt Controller driver dt-bindings: interrupt-controller: Andestech Internal Vector Interrupt Controller dt-bindings: nds32 SoC Bindings dt-bindings: nds32 L2 cache controller Bindings dt-bindings: nds32 CPU Bindings MAINTAINERS: Add nds32 nds32: Build infrastructure nds32: defconfig nds32: Miscellaneous header files nds32: Device tree support nds32: Generic timers support nds32: Loadable modules ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68kLinus Torvalds authored
Pull m68k updates from Geert Uytterhoeven: - Macintosh enhancements and fixes - Remove useless memory layout printing using hashed pointers - Add missing Amiga Zorro bus DMA mask - Small fixes and cleanups - Defconfig updates * tag 'm68k-for-v4.17-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68k/mac: Remove bogus "FIXME" comment m68k/mac: Enable RTC for 100-series PowerBooks m68k/mac: Clean up whitespace and remove redundant parentheses m68k/defconfig: Update defconfigs for v4.16-rc5 zorro: Set up z->dev.dma_mask for the DMA API m68k/time: Stop validating rtc_time in .read_time m68k/mm: Stop printing the virtual memory layout macintosh/via-pmu68k: Initialize PMU driver with setup_arch and arch_initcall m68k/mac: Fix apparent race condition in Baboon interrupt dispatch m68k/mac: Enable PDMA support for PowerBook 190
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull EFI updates from Ingo Molnar: "The main EFI changes in this cycle were: - Fix the apple-properties code (Andy Shevchenko) - Add WARN() on arm64 if UEFI Runtime Services corrupt the reserved x18 register (Ard Biesheuvel) - Use efi_switch_mm() on x86 instead of manipulating %cr3 directly (Sai Praneeth) - Fix early memremap leak in ESRT code (Ard Biesheuvel) - Switch to L"xxx" notation for wide string literals (Ard Biesheuvel) - ... plus misc other cleanups and bugfixes" * 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/efi: Use efi_switch_mm() rather than manually twiddling with %cr3 x86/efi: Replace efi_pgd with efi_mm.pgd efi: Use string literals for efi_char16_t variable initializers efi/esrt: Fix handling of early ESRT table mapping efi: Use efi_mm in x86 as well as ARM efi: Make const array 'apple' static efi/apple-properties: Use memremap() instead of ioremap() efi: Reorder pr_notice() with add_device_randomness() call x86/efi: Replace GFP_ATOMIC with GFP_KERNEL in efi_query_variable_store() efi/arm64: Check whether x18 is preserved by runtime services calls efi/arm*: Stop printing addresses of virtual mappings efi/apple-properties: Remove redundant attribute initialization from unmarshal_key_value_pairs() efi/arm*: Only register page tables when they exist
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 dma mapping updates from Ingo Molnar: "This tree, by Christoph Hellwig, switches over the x86 architecture to the generic dma-direct and swiotlb code, and also unifies more of the dma-direct code between architectures. The now unused x86-only primitives are removed" * 'x86-dma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: dma-mapping: Don't clear GFP_ZERO in dma_alloc_attrs swiotlb: Make swiotlb_{alloc,free}_buffer depend on CONFIG_DMA_DIRECT_OPS dma/swiotlb: Remove swiotlb_{alloc,free}_coherent() dma/direct: Handle force decryption for DMA coherent buffers in common code dma/direct: Handle the memory encryption bit in common code dma/swiotlb: Remove swiotlb_set_mem_attributes() set_memory.h: Provide set_memory_{en,de}crypted() stubs x86/dma: Remove dma_alloc_coherent_gfp_flags() iommu/intel-iommu: Enable CONFIG_DMA_DIRECT_OPS=y and clean up intel_{alloc,free}_coherent() iommu/amd_iommu: Use CONFIG_DMA_DIRECT_OPS=y and dma_direct_{alloc,free}() x86/dma/amd_gart: Use dma_direct_{alloc,free}() x86/dma/amd_gart: Look at dev->coherent_dma_mask instead of GFP_DMA x86/dma: Use generic swiotlb_ops x86/dma: Use DMA-direct (CONFIG_DMA_DIRECT_OPS=y) x86/dma: Remove dma_alloc_coherent_mask()
-
- 02 Apr, 2018 25 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull wait_var_event updates from Ingo Molnar: "This introduces the new wait_var_event() API, which is a more flexible waiting primitive than wait_on_atomic_t(). All wait_on_atomic_t() users are migrated over to the new API and wait_on_atomic_t() is removed. The migration fixes one bug and should result in no functional changes for the other usecases" * 'sched-wait-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/wait: Improve __var_waitqueue() code generation sched/wait: Remove the wait_on_atomic_t() API sched/wait, arch/mips: Fix and convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait, fs/ocfs2: Convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait, fs/nfs: Convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait, fs/fscache: Convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait, fs/btrfs: Convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait, fs/afs: Convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait, drivers/media: Convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait, drivers/drm: Convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait: Introduce wait_var_event()
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 timer updates from Ingo Molnar: "Two changes: add the new convert_art_ns_to_tsc() API for upcoming Intel Goldmont+ drivers, and remove the obsolete rdtscll() API" * 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/tsc: Get rid of rdtscll() x86/tsc: Convert ART in nanoseconds to TSC
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 platform updates from Ingo Molnar: "The main changes in this cycle were: - Add "Jailhouse" hypervisor support (Jan Kiszka) - Update DeviceTree support (Ivan Gorinov) - Improve DMI date handling (Andy Shevchenko)" * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/PCI: Fix a potential regression when using dmi_get_bios_year() firmware/dmi_scan: Uninline dmi_get_bios_year() helper x86/devicetree: Use CPU description from Device Tree of/Documentation: Specify local APIC ID in "reg" MAINTAINERS: Add entry for Jailhouse x86/jailhouse: Allow to use PCI_MMCONFIG without ACPI x86: Consolidate PCI_MMCONFIG configs x86: Align x86_64 PCI_MMCONFIG with 32-bit variant x86/jailhouse: Enable PCI mmconfig access in inmates PCI: Scan all functions when running over Jailhouse jailhouse: Provide detection for non-x86 systems x86/devicetree: Fix device IRQ settings in DT x86/devicetree: Initialize device tree before using it pci: Simplify code by using the new dmi_get_bios_year() helper ACPI/sleep: Simplify code by using the new dmi_get_bios_year() helper x86/pci: Simplify code by using the new dmi_get_bios_year() helper dmi: Introduce the dmi_get_bios_year() helper function x86/platform/quark: Re-use DEFINE_SHOW_ATTRIBUTE() macro x86/platform/atom: Re-use DEFINE_SHOW_ATTRIBUTE() macro
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 mm updates from Ingo Molnar: - Extend the memmap= boot parameter syntax to allow the redeclaration and dropping of existing ranges, and to support all e820 range types (Jan H. Schönherr) - Improve the W+X boot time security checks to remove false positive warnings on Xen (Jan Beulich) - Support booting as Xen PVH guest (Juergen Gross) - Improved 5-level paging (LA57) support, in particular it's possible now to have a single kernel image for both 4-level and 5-level hardware (Kirill A. Shutemov) - AMD hardware RAM encryption support (SME/SEV) fixes (Tom Lendacky) - Preparatory commits for hardware-encrypted RAM support on Intel CPUs. (Kirill A. Shutemov) - Improved Intel-MID support (Andy Shevchenko) - Show EFI page tables in page_tables debug files (Andy Lutomirski) - ... plus misc fixes and smaller cleanups * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (56 commits) x86/cpu/tme: Fix spelling: "configuation" -> "configuration" x86/boot: Fix SEV boot failure from change to __PHYSICAL_MASK_SHIFT x86/mm: Update comment in detect_tme() regarding x86_phys_bits x86/mm/32: Remove unused node_memmap_size_bytes() & CONFIG_NEED_NODE_MEMMAP_SIZE logic x86/mm: Remove pointless checks in vmalloc_fault x86/platform/intel-mid: Add special handling for ACPI HW reduced platforms ACPI, x86/boot: Introduce the ->reduced_hw_early_init() ACPI callback ACPI, x86/boot: Split out acpi_generic_reduce_hw_init() and export x86/pconfig: Provide defines and helper to run MKTME_KEY_PROG leaf x86/pconfig: Detect PCONFIG targets x86/tme: Detect if TME and MKTME is activated by BIOS x86/boot/compressed/64: Handle 5-level paging boot if kernel is above 4G x86/boot/compressed/64: Use page table in trampoline memory x86/boot/compressed/64: Use stack from trampoline memory x86/boot/compressed/64: Make sure we have a 32-bit code segment x86/mm: Do not use paravirtualized calls in native_set_p4d() kdump, vmcoreinfo: Export pgtable_l5_enabled value x86/boot/compressed/64: Prepare new top-level page table for trampoline x86/boot/compressed/64: Set up trampoline memory x86/boot/compressed/64: Save and restore trampoline memory ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 cleanups and msr updates from Ingo Molnar: "The main change is a performance/latency improvement to /dev/msr access. The rest are misc cleanups" * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/msr: Make rdmsrl_safe_on_cpu() scheduling safe as well x86/cpuid: Allow cpuid_read() to schedule x86/msr: Allow rdmsr_safe_on_cpu() to schedule x86/rtc: Stop using deprecated functions x86/dumpstack: Unify show_regs() x86/fault: Do not print IP in show_fault_oops() x86/MSR: Move native_* variants to msr.h
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 build updates from Ingo Molnar: "The biggest change is the forcing of asm-goto support on x86, which effectively increases the GCC minimum supported version to gcc-4.5 (on x86)" * 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/build: Don't pass in -D__KERNEL__ multiple times x86: Remove FAST_FEATURE_TESTS x86: Force asm-goto x86/build: Drop superfluous ALIGN from the linker script
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 asm fixlets from Ingo Molnar: "A clobber list fix and cleanups" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/asm: Trim clear_page.S includes x86/asm: Clobber flags in clear_page()
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 apic updates from Ingo Molnar: "The main x86 APIC/IOAPIC changes in this cycle were: - Robustify kexec support to more carefully restore IRQ hardware state before calling into kexec/kdump kernels. (Baoquan He) - Clean up the local APIC code a bit (Dou Liyang) - Remove unused callbacks (David Rientjes)" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic: Finish removing unused callbacks x86/apic: Drop logical_smp_processor_id() inline x86/apic: Modernize the pending interrupt code x86/apic: Move pending interrupt check code into it's own function x86/apic: Set up through-local-APIC mode on the boot CPU if 'noapic' specified x86/apic: Rename variables and functions related to x86_io_apic_ops x86/apic: Remove the (now) unused disable_IO_APIC() function x86/apic: Fix restoring boot IRQ mode in reboot and kexec/kdump x86/apic: Split disable_IO_APIC() into two functions to fix CONFIG_KEXEC_JUMP=y x86/apic: Split out restore_boot_irq_mode() from disable_IO_APIC() x86/apic: Make setup_local_APIC() static x86/apic: Simplify init_bsp_APIC() usage x86/x2apic: Mark set_x2apic_phys_mode() as __init
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull SMP hotplug updates from Ingo Molnar: "Simplify the CPU hot-plug state machine" * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: cpu/hotplug: Fix unused function warning cpu/hotplug: Merge cpuhp_bp_states and cpuhp_ap_states
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull scheduler updates from Ingo Molnar: "The main scheduler changes in this cycle were: - NUMA balancing improvements (Mel Gorman) - Further load tracking improvements (Patrick Bellasi) - Various NOHZ balancing cleanups and optimizations (Peter Zijlstra) - Improve blocked load handling, in particular we can now reduce and eventually stop periodic load updates on 'very idle' CPUs. (Vincent Guittot) - On isolated CPUs offload the final 1Hz scheduler tick as well, plus related cleanups and reorganization. (Frederic Weisbecker) - Core scheduler code cleanups (Ingo Molnar)" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits) sched/core: Update preempt_notifier_key to modern API sched/cpufreq: Rate limits for SCHED_DEADLINE sched/fair: Update util_est only on util_avg updates sched/cpufreq/schedutil: Use util_est for OPP selection sched/fair: Use util_est in LB and WU paths sched/fair: Add util_est on top of PELT sched/core: Remove TASK_ALL sched/completions: Use bool in try_wait_for_completion() sched/fair: Update blocked load when newly idle sched/fair: Move idle_balance() sched/nohz: Merge CONFIG_NO_HZ_COMMON blocks sched/fair: Move rebalance_domains() sched/nohz: Optimize nohz_idle_balance() sched/fair: Reduce the periodic update duration sched/nohz: Stop NOHZ stats when decayed sched/cpufreq: Provide migration hint sched/nohz: Clean up nohz enter/exit sched/fair: Update blocked load from NEWIDLE sched/fair: Add NOHZ stats balancing sched/fair: Restructure nohz_balance_kick() ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 RAS updates from Ingo Molnar: "The main changes in this cycle were: - AMD MCE support/decoding improvements (Yazen Ghannam) - general MCE header cleanups and reorganization (Borislav Petkov)" * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Revert "x86/mce/AMD: Collect error info even if valid bits are not set" x86/MCE: Cleanup and complete struct mce fields definitions x86/mce/AMD: Carve out SMCA get_block_address() code x86/mce/AMD: Get address from already initialized block x86/mce/AMD, EDAC/mce_amd: Enumerate Reserved SMCA bank type x86/mce/AMD: Pass the bank number to smca_get_bank_type() x86/mce/AMD: Collect error info even if valid bits are not set x86/mce: Issue the 'mcelog --ascii' message only on !AMD x86/mce: Convert 'struct mca_config' bools to a bitfield x86/mce: Put private structures and definitions into the internal header
-
Howard McLauchlan authored
Error injection is a useful mechanism to fail arbitrary kernel functions. However, it is often hard to guarantee an error propagates appropriately to user space programs. By injecting into syscalls, we can return arbitrary values to user space directly; this increases flexibility and robustness in testing, allowing us to test user space error paths effectively. The following script, for example, fails calls to sys_open() from a given pid: from bcc import BPF from sys import argv pid = argv[1] prog = r""" int kprobe__SyS_open(struct pt_regs *ctx, const char *pathname, int flags) { u32 pid = bpf_get_current_pid_tgid(); if (pid == %s) bpf_override_return(ctx, -ENOMEM); return 0; } """ % pid b = BPF(text=prog) while 1: b.perf_buffer_poll() This patch whitelists all syscalls defined with SYSCALL_DEFINE and COMPAT_SYSCALL_DEFINE for error injection. These changes are not intended to be considered stable, and would normally be configured off. Signed-off-by: Howard McLauchlan <hmclauchlan@fb.com> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
-
Dominik Brodowski authored
This keeps it in line with the SYSCALL_DEFINEx() / COMPAT_SYSCALL_DEFINEx() calling convention. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
-
Dominik Brodowski authored
Shuffle the cond_syscall() entries in kernel/sys_ni.c around so that they are kept in the same order as in include/uapi/asm-generic/unistd.h. For better structuring, add the same comments as in that file, but keep a few additional comments and extend the commentary where it seems useful. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
-
Dominik Brodowski authored
compat_sys_*() functions are no longer called from within the kernel on x86 except from the system call table. Linking the system call does not require compat_sys_*() function prototypes at least on x86. Therefore, generate compat_sys_*() prototypes on-the-fly within the COMPAT_SYSCALL_DEFINEx() macro, and remove x86-specific prototypes from various header files. Suggested-by: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: x86@kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
-
Dominik Brodowski authored
Shuffle the syscall prototypes in include/linux/compat.h around so that they are kept in the same order as in include/uapi/asm-generic/unistd.h. The individual entries are kept the same, and neither modified to bring them in line with kernel coding style nor wrapped in proper ifdefs -- as an exception to this, add the prefix "asmlinkage" where it was missing. Cc: Arnd Bergmann <arnd@arndb.de> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
-
Dominik Brodowski authored
As the syscall functions should only be called from the system call table but not from elsewhere in the kernel, it is sufficient that they are defined in linux/compat.h. Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
-
Dominik Brodowski authored
Shuffle the syscall prototypes in include/linux/syscalls.h around so that they are kept in the same order as in include/uapi/asm-generic/unistd.h. The individual entries are kept the same, and neither modified to bring them in line with kernel coding style nor wrapped in proper ifdefs. Cc: Arnd Bergmann <arnd@arndb.de> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
-
Dominik Brodowski authored
As the syscall function should only be called from the system call table but not from elsewhere in the kernel, move the prototype for sys_kexec_load() to include/syscall.h. Cc: Eric Biederman <ebiederm@xmission.com> Cc: kexec@lists.infradead.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
-
Tautschnig, Michael authored
All definitions of syscalls in x86 except for those patched here have already been using the appropriate SYSCALL_DEFINE*. Signed-off-by: Michael Tautschnig <tautschn@amazon.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jaswinder Singh <jaswinder@infradead.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: x86@kernel.org Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
-
Dominik Brodowski authored
Same as with other system calls, sys_sigreturn() should return a value of type long, not unsigned long. This also matches the behaviour for IA32_EMULATION, see sys32_sigreturn() in arch/x86/ia32/ia32_signal.c . Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Slaby <jslaby@suse.com> Cc: x86@kernel.org Cc: Michael Tautschnig <tautschn@amazon.co.uk> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
-
Dominik Brodowski authored
Using this helper allows us to avoid the in-kernel calls to the sys_ioperm() syscall. The ksys_ prefix denotes that this function is meant as a drop-in replacement for the syscall. In particular, it uses the same calling convention as sys_ioperm(). This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Slaby <jslaby@suse.com> Cc: x86@kernel.org Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
-
Dominik Brodowski authored
Using this helper allows us to avoid the in-kernel calls to the sys_readahead() syscall. The ksys_ prefix denotes that this function is meant as a drop-in replacement for the syscall. In particular, it uses the same calling convention as sys_readahead(). This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-mm@kvack.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
-
Dominik Brodowski authored
Using this helper allows us to avoid the in-kernel calls to the sys_mmap_pgoff() syscall. The ksys_ prefix denotes that this function is meant as a drop-in replacement for the syscall. In particular, it uses the same calling convention as sys_mmap_pgoff(). This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-mm@kvack.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
-
Dominik Brodowski authored
Using the ksys_fadvise64_64() helper allows us to avoid the in-kernel calls to the sys_fadvise64_64() syscall. The ksys_ prefix denotes that this function is meant as a drop-in replacement for the syscall. In particular, it uses the same calling convention as ksys_fadvise64_64(). Some compat stubs called sys_fadvise64(), which then just passed through the arguments to sys_fadvise64_64(). Get rid of this indirection, and call ksys_fadvise64_64() directly. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-mm@kvack.org Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
-