- 13 Jun, 2016 4 commits
-
-
Jon Hunter authored
Some IRQ chips, such as GPIO controllers or secondary level interrupt controllers, may require require additional runtime power management control to ensure they are accessible. For such IRQ chips, it makes sense to enable the IRQ chip when interrupts are requested and disabled them again once all interrupts have been freed. When mapping an IRQ, the IRQ type settings are read and then programmed. The mapping of the IRQ happens before the IRQ is requested and so the programming of the type settings occurs before the IRQ is requested. This is a problem for IRQ chips that require additional power management control because they may not be accessible yet. Therefore, when mapping the IRQ, don't program the type settings, just save them and then program these saved settings when the IRQ is requested (so long as if they are not overridden via the call to request the IRQ). Add a stub function for irq_domain_free_irqs() to avoid any compilation errors when CONFIG_IRQ_DOMAIN_HIERARCHY is not selected. Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Marc Zyngier authored
As we now do for non-percpu interrupt, perform a lookup of the interrupt trigger if the user doesn't supply one. The difference here is that we can only do it at enable time (trigger configuration can be per-cpu as well). Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Jon Hunter authored
For some devices the IRQ trigger type for a device is read from firmware, such as device-tree. The IRQ trigger type is typically read when the mapping for IRQ is created, which is before the IRQ is requested. Hence, the IRQ trigger type is programmed when mapping the IRQ and not when requesting the IRQ. Although this works for most cases, in order to support IRQ chips which require runtime power management, which may not be accessible prior to requesting the IRQ, it is desirable to look-up the IRQ trigger type when it is requested. Therefore, if the IRQ trigger type is not specified when __setup_irq() is called, look-up the saved IRQ trigger type. This will allow us to defer the programming of the trigger type from when the IRQ is mapped to when it is actually requested. Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
Jon Hunter authored
When mapping an IRQ, it is possible that a mapping for the IRQ already exists. If mapping does exist then there are the following issues with regard to the handling of the IRQ type settings ... 1. If the domain is part of a hierarchy, then: a. We do not check that the type settings for the existing mapping match those of the new mapping. b. We do not check to see if the type settings have been programmed yet (and they might not have been) and so we may never set the type. 2. If the domain is NOT part of a hierarchy, we will overwrite the current type settings programmed if they are different from the previous mapping. Please note that irq_create_mapping() calls irq_find_mapping() to check if a mapping already exists. Although, it may be unlikely that the type settings for a shared interrupt would not match, nonetheless we should check for this. Therefore, to fix this check if a mapping exists (regardless of whether the domain is part of a hierarchy or not) and if it does then: 1. Return the IRQ number if the type settings match or are not specified. 2. Program the type settings and return the IRQ number if the type settings have not been programmed yet. 3. Otherwise if the type setting do not match, then print a warning and don't return the IRQ number. Furthermore, add a warning if the type return by irq_domain_translate() has bits outside the sense mask set and then clear these bits. If these bits are not cleared then this will cause the comparision of the type settings for an existing mapping to fail with that of the new mapping even if the sense bit themselves match. The reason being is that the existing type settings are read by calling irq_get_trigger_type() which will clear any bits outside the sense mask. This will allow us to detect irqchips that are not correctly clearing these bits and fix them. Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
-
- 05 Jun, 2016 7 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linuxLinus Torvalds authored
Pull parisc fixes from Helge Deller: - Fix printk time stamps on SMP systems which got wrong due to a patch which was added during the merge window - Fix two bugs in the stack backtrace code: Races in module unloading and possible invalid accesses to memory due to wrong instruction decoding (Mikulas Patocka) - Fix userspace crash when syscalls access invalid unaligned userspace addresses. Those syscalls will now return EFAULT as expected. (tagged for stable kernel series) * 'parisc-4.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Move die_if_kernel() prototype into traps.h header parisc: Fix pagefault crash in unaligned __get_user() call parisc: Fix printk time during boot parisc: Fix backtrace on PA-RISC
-
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-securityLinus Torvalds authored
Pull key handling update from James Morris: "This alters a new keyctl function added in the current merge window to allow for a future extension planned for the next merge window" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: KEYS: Add placeholder for KDF usage with DH
-
Eric W. Biederman authored
The /dev/ptmx device node is changed to lookup the directory entry "pts" in the same directory as the /dev/ptmx device node was opened in. If there is a "pts" entry and that entry is a devpts filesystem /dev/ptmx uses that filesystem. Otherwise the open of /dev/ptmx fails. The DEVPTS_MULTIPLE_INSTANCES configuration option is removed, so that userspace can now safely depend on each mount of devpts creating a new instance of the filesystem. Each mount of devpts is now a separate and equal filesystem. Reserved ttys are now available to all instances of devpts where the mounter is in the initial mount namespace. A new vfs helper path_pts is introduced that finds a directory entry named "pts" in the directory of the passed in path, and changes the passed in path to point to it. The helper path_pts uses a function path_parent_directory that was factored out of follow_dotdot. In the implementation of devpts: - devpts_mnt is killed as it is no longer meaningful if all mounts of devpts are equal. - pts_sb_from_inode is replaced by just inode->i_sb as all cached inodes in the tty layer are now from the devpts filesystem. - devpts_add_ref is rolled into the new function devpts_ptmx. And the unnecessary inode hold is removed. - devpts_del_ref is renamed devpts_release and reduced to just a deacrivate_super. - The newinstance mount option continues to be accepted but is now ignored. In devpts_fs.h definitions for when !CONFIG_UNIX98_PTYS are removed as they are never used. Documentation/filesystems/devices.txt is updated to describe the current situation. This has been verified to work properly on openwrt-15.05, centos5, centos6, centos7, debian-6.0.2, debian-7.9, debian-8.2, ubuntu-14.04.3, ubuntu-15.10, fedora23, magia-5, mint-17.3, opensuse-42.1, slackware-14.1, gentoo-20151225 (13.0?), archlinux-2015-12-01. With the caveat that on centos6 and on slackware-14.1 that there wind up being two instances of the devpts filesystem mounted on /dev/pts, the lower copy does not end up getting used. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Greg KH <greg@kroah.com> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: Peter Anvin <hpa@zytor.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Serge Hallyn <serge.hallyn@ubuntu.com> Cc: Willy Tarreau <w@1wt.eu> Cc: Aurelien Jarno <aurelien@aurel32.net> Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk> Cc: Jann Horn <jann@thejh.net> Cc: Jiri Slaby <jslaby@suse.com> Cc: Florian Weimer <fw@deneb.enyo.de> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Helge Deller authored
Signed-off-by: Helge Deller <deller@gmx.de>
-
Helge Deller authored
One of the debian buildd servers had this crash in the syslog without any other information: Unaligned handler failed, ret = -2 clock_adjtime (pid 22578): Unaligned data reference (code 28) CPU: 1 PID: 22578 Comm: clock_adjtime Tainted: G E 4.5.0-2-parisc64-smp #1 Debian 4.5.4-1 task: 000000007d9960f8 ti: 00000001bde7c000 task.ti: 00000001bde7c000 YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI PSW: 00001000000001001111100000001111 Tainted: G E r00-03 000000ff0804f80f 00000001bde7c2b0 00000000402d2be8 00000001bde7c2b0 r04-07 00000000409e1fd0 00000000fa6f7fff 00000001bde7c148 00000000fa6f7fff r08-11 0000000000000000 00000000ffffffff 00000000fac9bb7b 000000000002b4d4 r12-15 000000000015241c 000000000015242c 000000000000002d 00000000fac9bb7b r16-19 0000000000028800 0000000000000001 0000000000000070 00000001bde7c218 r20-23 0000000000000000 00000001bde7c210 0000000000000002 0000000000000000 r24-27 0000000000000000 0000000000000000 00000001bde7c148 00000000409e1fd0 r28-31 0000000000000001 00000001bde7c320 00000001bde7c350 00000001bde7c218 sr00-03 0000000001200000 0000000001200000 0000000000000000 0000000001200000 sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000 IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000402d2e84 00000000402d2e88 IIR: 0ca0d089 ISR: 0000000001200000 IOR: 00000000fa6f7fff CPU: 1 CR30: 00000001bde7c000 CR31: ffffffffffffffff ORIG_R28: 00000002369fe628 IAOQ[0]: compat_get_timex+0x2dc/0x3c0 IAOQ[1]: compat_get_timex+0x2e0/0x3c0 RP(r2): compat_get_timex+0x40/0x3c0 Backtrace: [<00000000402d4608>] compat_SyS_clock_adjtime+0x40/0xc0 [<0000000040205024>] syscall_exit+0x0/0x14 This means the userspace program clock_adjtime called the clock_adjtime() syscall and then crashed inside the compat_get_timex() function. Syscalls should never crash programs, but instead return EFAULT. The IIR register contains the executed instruction, which disassebles into "ldw 0(sr3,r5),r9". This load-word instruction is part of __get_user() which tried to read the word at %r5/IOR (0xfa6f7fff). This means the unaligned handler jumped in. The unaligned handler is able to emulate all ldw instructions, but it fails if it fails to read the source e.g. because of page fault. The following program reproduces the problem: #define _GNU_SOURCE #include <unistd.h> #include <sys/syscall.h> #include <sys/mman.h> int main(void) { /* allocate 8k */ char *ptr = mmap(NULL, 2*4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0); /* free second half (upper 4k) and make it invalid. */ munmap(ptr+4096, 4096); /* syscall where first int is unaligned and clobbers into invalid memory region */ /* syscall should return EFAULT */ return syscall(__NR_clock_adjtime, 0, ptr+4095); } To fix this issue we simply need to check if the faulting instruction address is in the exception fixup table when the unaligned handler failed. If it is, call the fixup routine instead of crashing. While looking at the unaligned handler I found another issue as well: The target register should not be modified if the handler was unsuccessful. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org
-
Helge Deller authored
Avoid showing invalid printk time stamps during boot. Signed-off-by: Helge Deller <deller@gmx.de> Reviewed-by: Aaro Koskinen <aaro.koskinen@iki.fi>
-
- 04 Jun, 2016 9 commits
-
-
Mikulas Patocka authored
This patch fixes backtrace on PA-RISC There were several problems: 1) The code that decodes instructions handles instructions that subtract from the stack pointer incorrectly. If the instruction subtracts the number X from the stack pointer the code increases the frame size by (0x100000000-X). This results in invalid accesses to memory and recursive page faults. 2) Because gcc reorders blocks, handling instructions that subtract from the frame pointer is incorrect. For example, this function int f(int a) { if (__builtin_expect(a, 1)) return a; g(); return a; } is compiled in such a way, that the code that decreases the stack pointer for the first "return a" is placed before the code for "g" call. If we recognize this decrement, we mistakenly believe that the frame size for the "g" call is zero. To fix problems 1) and 2), the patch doesn't recognize instructions that decrease the stack pointer at all. To further safeguard the unwind code against nonsense values, we don't allow frame size larger than Total_frame_size. 3) The backtrace is not locked. If stack dump races with module unload, invalid table can be accessed. This patch adds a spinlock when processing module tables. Note, that for correct backtrace, you need recent binutils. Binutils 2.18 from Debian 5 produce garbage unwind tables. Binutils 2.21 work better (it sometimes forgets function frames, but at least it doesn't generate garbage). Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Helge Deller <deller@gmx.de>
-
git://people.freedesktop.org/~airlied/linuxLinus Torvalds authored
Pull drm fixes from Dave Airlie: "A bunch of ARM drivers got into the fixes vibe this time around, so this contains a bunch of fixes for imx, atmel hlcdc, arm hdlcd (only so many combos of hlcd), mediatek and omap drm. Other than that there is one mgag200 fix and a few core drm regression fixes" * tag 'drm-fixes-for-v4.7-rc2' of git://people.freedesktop.org/~airlied/linux: (34 commits) drm/omap: fix unused variable warning. drm: hdlcd: Add information about the underlying framebuffers in debugfs drm: hdlcd: Cleanup the atomic plane operations drm/hdlcd: Fix up crtc_state->event handling drm: hdlcd: Revamp runtime power management drm/mediatek: mtk_dsi: Remove spurious drm_connector_unregister drm/mediatek: mtk_dpi: remove invalid error message drm: atmel-hlcdc: fix a NULL check drm: atmel-hlcdc: fix atmel_hlcdc_crtc_reset() implementation drm/mgag200: Black screen fix for G200e rev 4 drm: Wrap direct calls to driver->gem_free_object from CMA drm: fix fb refcount issue with atomic modesetting drm: make drm_atomic_set_mode_prop_for_crtc() more reliable drm/sti: remove extra mode fixup drm: add missing drm_mode_set_crtcinfo call drm/omap: include gpio/consumer.h where needed drm/omap: include linux/seq_file.h where needed Revert "drm/omap: no need to select OMAP2_DSS" drm/omap: Remove regulator API abuse OMAPDSS: HDMI5: Change DDC timings ...
-
git://github.com/awilliam/linux-vfioLinus Torvalds authored
Pull VFIO fixes from Alex Williamson: "Fix irqfd shutdown ordering, build warning, and VPD short read" * tag 'vfio-v4.7-rc2' of git://github.com/awilliam/linux-vfio: vfio/pci: Allow VPD short read vfio/type1: Fix build warning vfio/pci: Fix ordering of eventfd vs virqfd shutdown
-
git://git.linaro.org/people/ulf.hansson/mmcLinus Torvalds authored
Pull MMC fixes from Ulf Hansson: "MMC core: - Fix/restore behaviour when selecting bus width for (e)MMC MMC host: - sunxi: Fix eMMC HS-DDR modes on Allwinner A80" * tag 'mmc-v4.7-rc1-2' of git://git.linaro.org/people/ulf.hansson/mmc: mmc: sunxi: Re-enable eMMC HS-DDR modes on Allwinner A80 mmc: sunxi: Fix DDR MMC timings for A80 mmc: fix mmc mode selection for HS-DDR and higher
-
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfsLinus Torvalds authored
Pull btrfs fixes from Chris Mason: "The important part of this pull is Filipe's set of fixes for btrfs device replacement. Filipe fixed a few issues seen on the list and a number he found on his own" * 'for-linus-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: deal with duplciates during extent_map insertion in btrfs_get_extent Btrfs: fix race between device replace and read repair Btrfs: fix race between device replace and discard Btrfs: fix race between device replace and chunk allocation Btrfs: fix race setting block group back to RW mode during device replace Btrfs: fix unprotected assignment of the left cursor for device replace Btrfs: fix race setting block group readonly during device replace Btrfs: fix race between device replace and block group removal Btrfs: fix race between readahead and device replace/removal
-
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-clientLinus Torvalds authored
Pull Ceph fixes from Sage Weil: "We have a few follow-up fixes for the libceph refactor from Ilya, and then some cephfs + fscache fixes from Zheng. The first two FS-Cache patches are acked by David Howells and deemed trivial enough to go through our tree. The rest fix some issues with the ceph fscache handling (disable cache for inodes opened for write, and simplify the revalidation logic accordingly, dropping the now-unnecessary work queue)" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: ceph: use i_version to check validity of fscache ceph: improve fscache revalidation ceph: disable fscache when inode is opened for write ceph: avoid unnecessary fscache invalidation/revlidation ceph: call __fscache_uncache_page() if readpages fails FS-Cache: make check_consistency callback return int FS-Cache: wake write waiter after invalidating writes libceph: use %s instead of %pE in dout()s libceph: put request only if it's done in handle_reply() libceph: change ceph_osdmap_flag() to take osdc
-
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pmLinus Torvalds authored
Pull ACPI fixes from Rafael Wysocki: "Two fixes for problems introduced recently (ACPICA and the ACPI backlight driver) and one fix for an older issue that prevents at least one system from booting. Specifics: - Fix an incorrect check introduced by recent ACPICA changes which causes problems with booting KVM guests to happen, among other things (Lv Zheng). - Fix a backlight issue introduced by recent changes to the ACPI video driver (Aaron Lu). - Fix the ACPI processor initialization which attempts to register an IO region without checking if that really is necessary and sometimes prevents drivers loaded subsequently from registering their resources which leads to boot issues (Rafael Wysocki)" * tag 'acpi-4.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / processor: Avoid reserving IO regions too early ACPICA / Hardware: Fix old register check in acpi_hw_get_access_bit_width() ACPI / Thermal / video: fix max_level incorrect value
-
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pmLinus Torvalds authored
Pull power management fixes from Rafael Wysocki: "Two fixes for problems introduced recently in the cpufreq core and the intel_pstate driver. Specifics: - Fix a silly mistake related to the clamp_val() usage in a function added by a recent commit (Rafael Wysocki). - Reduce the log level of an annoying message added to intel_pstate during the recent merge window (Srinivas Pandruvada)" * tag 'pm-4.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: Fix clamp_val() usage in cpufreq_driver_fast_switch() cpufreq: intel_pstate: Downgrade print level for _PPC
-
Linus Torvalds authored
Merge various fixes from Andrew Morton: "10 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm, page_alloc: recalculate the preferred zoneref if the context can ignore memory policies mm, page_alloc: reset zonelist iterator after resetting fair zone allocation policy mm, oom_reaper: do not use siglock in try_oom_reaper() mm, page_alloc: prevent infinite loop in buffered_rmqueue() checkpatch: reduce git commit description style false positives mm/z3fold.c: avoid modifying HEADLESS page and minor cleanup memcg: add RCU locking around css_for_each_descendant_pre() in memcg_offline_kmem() mm: check the return value of lookup_page_ext for all call sites kdump: fix dmesg gdbmacro to work with record based printk mm: fix overflow in vm_map_ram()
-
- 03 Jun, 2016 20 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull irq fixes from Thomas Gleixner: - a few simple fixes for fallout from the recent gic-v3 changes - a workaround for a Cavium thunderX erratum - a bugfix for the pic32 irqchip to make external interrupts work proper - a missing return value in the generic IPI management code * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/irq-pic32-evic: Fix bug with external interrupts. irqchip/gicv3-its: numa: Enable workaround for Cavium thunderx erratum 23144 irqchip/gic-v3: Fix quiescence check in gic_enable_redist irqchip/gic-v3: Fix copy+paste mistakes in defines irqchip/gic-v3: Fix ICC_SGI1R_EL1.INTID decoding mask genirq: Fix missing return value in irq_destroy_ipi()
-
Mel Gorman authored
The optimistic fast path may use cpuset_current_mems_allowed instead of of a NULL nodemask supplied by the caller for cpuset allocations. The preferred zone is calculated on this basis for statistic purposes and as a starting point in the zonelist iterator. However, if the context can ignore memory policies due to being atomic or being able to ignore watermarks then the starting point in the zonelist iterator is no longer correct. This patch resets the zonelist iterator in the allocator slowpath if the context can ignore memory policies. This will alter the zone used for statistics but only after it is known that it makes sense for that context. Resetting it before entering the slowpath would potentially allow an ALLOC_CPUSET allocation to be accounted for against the wrong zone. Note that while nodemask is not explicitly set to the original nodemask, it would only have been overwritten if cpuset_enabled() and it was reset before the slowpath was entered. Link: http://lkml.kernel.org/r/20160602103936.GU2527@techsingularity.net Fixes: c33d6c06 ("mm, page_alloc: avoid looking up the first zone in a zonelist twice") Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Mel Gorman authored
Geert Uytterhoeven reported the following problem that bisected to commit c33d6c06 ("mm, page_alloc: avoid looking up the first zone in a zonelist twice") on m68k/ARAnyM BUG: scheduling while atomic: cron/668/0x10c9a0c0 Modules linked in: CPU: 0 PID: 668 Comm: cron Not tainted 4.6.0-atari-05133-gc33d6c06 #364 Call Trace: [<0003d7d0>] __schedule_bug+0x40/0x54 __schedule+0x312/0x388 __schedule+0x0/0x388 prepare_to_wait+0x0/0x52 schedule+0x64/0x82 schedule_timeout+0xda/0x104 set_next_entity+0x18/0x40 pick_next_task_fair+0x78/0xda io_schedule_timeout+0x36/0x4a bit_wait_io+0x0/0x40 bit_wait_io+0x12/0x40 __wait_on_bit+0x46/0x76 wait_on_page_bit_killable+0x64/0x6c bit_wait_io+0x0/0x40 wake_bit_function+0x0/0x4e __lock_page_or_retry+0xde/0x124 do_scan_async+0x114/0x17c lookup_swap_cache+0x24/0x4e handle_mm_fault+0x626/0x7de find_vma+0x0/0x66 down_read+0x0/0xe wait_on_page_bit_killable_timeout+0x77/0x7c find_vma+0x16/0x66 do_page_fault+0xe6/0x23a res_func+0xa3c/0x141a buserr_c+0x190/0x6d4 res_func+0xa3c/0x141a buserr+0x20/0x28 res_func+0xa3c/0x141a buserr+0x20/0x28 The relationship is not obvious but it's due to a failure to rescan the full zonelist after the fair zone allocation policy exhausts the batch count. While this is a functional problem, it's also a performance issue. A page allocator microbenchmark showed the following 4.7.0-rc1 4.7.0-rc1 vanilla reset-v1r2 Min alloc-odr0-1 327.00 ( 0.00%) 326.00 ( 0.31%) Min alloc-odr0-2 235.00 ( 0.00%) 235.00 ( 0.00%) Min alloc-odr0-4 198.00 ( 0.00%) 198.00 ( 0.00%) Min alloc-odr0-8 170.00 ( 0.00%) 170.00 ( 0.00%) Min alloc-odr0-16 156.00 ( 0.00%) 156.00 ( 0.00%) Min alloc-odr0-32 150.00 ( 0.00%) 150.00 ( 0.00%) Min alloc-odr0-64 146.00 ( 0.00%) 146.00 ( 0.00%) Min alloc-odr0-128 145.00 ( 0.00%) 145.00 ( 0.00%) Min alloc-odr0-256 155.00 ( 0.00%) 155.00 ( 0.00%) Min alloc-odr0-512 168.00 ( 0.00%) 165.00 ( 1.79%) Min alloc-odr0-1024 175.00 ( 0.00%) 174.00 ( 0.57%) Min alloc-odr0-2048 180.00 ( 0.00%) 180.00 ( 0.00%) Min alloc-odr0-4096 187.00 ( 0.00%) 186.00 ( 0.53%) Min alloc-odr0-8192 190.00 ( 0.00%) 190.00 ( 0.00%) Min alloc-odr0-16384 191.00 ( 0.00%) 191.00 ( 0.00%) Min alloc-odr1-1 736.00 ( 0.00%) 445.00 ( 39.54%) Min alloc-odr1-2 343.00 ( 0.00%) 335.00 ( 2.33%) Min alloc-odr1-4 277.00 ( 0.00%) 270.00 ( 2.53%) Min alloc-odr1-8 238.00 ( 0.00%) 233.00 ( 2.10%) Min alloc-odr1-16 224.00 ( 0.00%) 218.00 ( 2.68%) Min alloc-odr1-32 210.00 ( 0.00%) 208.00 ( 0.95%) Min alloc-odr1-64 207.00 ( 0.00%) 203.00 ( 1.93%) Min alloc-odr1-128 276.00 ( 0.00%) 202.00 ( 26.81%) Min alloc-odr1-256 206.00 ( 0.00%) 202.00 ( 1.94%) Min alloc-odr1-512 207.00 ( 0.00%) 202.00 ( 2.42%) Min alloc-odr1-1024 208.00 ( 0.00%) 205.00 ( 1.44%) Min alloc-odr1-2048 213.00 ( 0.00%) 212.00 ( 0.47%) Min alloc-odr1-4096 218.00 ( 0.00%) 216.00 ( 0.92%) Min alloc-odr1-8192 341.00 ( 0.00%) 219.00 ( 35.78%) Note that order-0 allocations are unaffected but higher orders get a small boost from this patch and a large reduction in system CPU usage overall as can be seen here: 4.7.0-rc1 4.7.0-rc1 vanilla reset-v1r2 User 85.32 86.31 System 2221.39 2053.36 Elapsed 2368.89 2202.47 Fixes: c33d6c06 ("mm, page_alloc: avoid looking up the first zone in a zonelist twice") Link: http://lkml.kernel.org/r/20160531100848.GR2527@techsingularity.netSigned-off-by: Mel Gorman <mgorman@techsingularity.net> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Mikulas Patocka <mpatocka@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Michal Hocko authored
Oleg has noted that siglock usage in try_oom_reaper is both pointless and dangerous. signal_group_exit can be checked lockless. The problem is that sighand becomes NULL in __exit_signal so we can crash. Fixes: 3ef22dff ("oom, oom_reaper: try to reap tasks which skip regular OOM killer path") Link: http://lkml.kernel.org/r/1464679423-30218-1-git-send-email-mhocko@kernel.orgSigned-off-by: Michal Hocko <mhocko@suse.com> Suggested-by: Oleg Nesterov <oleg@redhat.com> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Vlastimil Babka authored
In DEBUG_VM kernel, we can hit infinite loop for order == 0 in buffered_rmqueue() when check_new_pcp() returns 1, because the bad page is never removed from the pcp list. Fix this by removing the page before retrying. Also we don't need to check if page is non-NULL, because we simply grab it from the list which was just tested for being non-empty. Fixes: 479f854a ("mm, page_alloc: defer debugging checks of pages allocated from the PCP") Link: http://lkml.kernel.org/r/20160530090154.GM2527@techsingularity.netSigned-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Reported-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joe Perches authored
Some lines in a commit log appear to be commit SHA1 ids like: ERROR: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 0123456789ab ("commit description")' Link: http://lkml.kernel.org/r/40e03fd7aaf1f55c75d787128d6d17c5a71226c2.1464358556.git.vdavydov@virtuozzo.com Reduce the false positives. Link: http://lkml.kernel.org/r/eda977eaa8328fef42bb3c87935d97e10ea8ff67.1464384023.git.joe@perches.comSigned-off-by: Joe Perches <joe@perches.com> Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Vitaly Wool authored
Fix erroneous z3fold header access in a HEADLESS page in reclaim function, and change one remaining direct handle-to-buddy conversion to use the appropriate helper. Link: http://lkml.kernel.org/r/5748706F.9020208@gmail.comSigned-off-by: Vitaly Wool <vitalywool@gmail.com> Reviewed-by: Dan Streetman <ddstreet@ieee.org> Cc: Seth Jennings <sjenning@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Tejun Heo authored
memcg_offline_kmem() may be called from memcg_free_kmem() after a css init failure. memcg_free_kmem() is a ->css_free callback which is called without cgroup_mutex and memcg_offline_kmem() ends up using css_for_each_descendant_pre() without any locking. Fix it by adding rcu read locking around it. mkdir: cannot create directory `65530': No space left on device =============================== [ INFO: suspicious RCU usage. ] 4.6.0-work+ #321 Not tainted ------------------------------- kernel/cgroup.c:4008 cgroup_mutex or RCU read lock required! [ 527.243970] other info that might help us debug this: [ 527.244715] rcu_scheduler_active = 1, debug_locks = 0 2 locks held by kworker/0:5/1664: #0: ("cgroup_destroy"){.+.+..}, at: [<ffffffff81060ab5>] process_one_work+0x165/0x4a0 #1: ((&css->destroy_work)#3){+.+...}, at: [<ffffffff81060ab5>] process_one_work+0x165/0x4a0 [ 527.248098] stack backtrace: CPU: 0 PID: 1664 Comm: kworker/0:5 Not tainted 4.6.0-work+ #321 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.1-1.fc24 04/01/2014 Workqueue: cgroup_destroy css_free_work_fn Call Trace: dump_stack+0x68/0xa1 lockdep_rcu_suspicious+0xd7/0x110 css_next_descendant_pre+0x7d/0xb0 memcg_offline_kmem.part.44+0x4a/0xc0 mem_cgroup_css_free+0x1ec/0x200 css_free_work_fn+0x49/0x5e0 process_one_work+0x1c5/0x4a0 worker_thread+0x49/0x490 kthread+0xea/0x100 ret_from_fork+0x1f/0x40 Link: http://lkml.kernel.org/r/20160526203018.GG23194@mtj.duckdns.orgSigned-off-by: Tejun Heo <tj@kernel.org> Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: <stable@vger.kernel.org> [4.5+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull timer bugfix from Thomas Gleixner: "A single bugfix for the error check wreckage we introduced in the merge window" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: time: Make settimeofday error checking work again
-
Yang Shi authored
Per the discussion with Joonsoo Kim [1], we need check the return value of lookup_page_ext() for all call sites since it might return NULL in some cases, although it is unlikely, i.e. memory hotplug. Tested with ltp with "page_owner=0". [1] http://lkml.kernel.org/r/20160519002809.GA10245@js1304-P5Q-DELUXE [akpm@linux-foundation.org: fix build-breaking typos] [arnd@arndb.de: fix build problems from lookup_page_ext] Link: http://lkml.kernel.org/r/6285269.2CksypHdYp@wuerfel [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/1464023768-31025-1-git-send-email-yang.shi@linaro.orgSigned-off-by: Yang Shi <yang.shi@linaro.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Corey Minyard authored
Commit 7ff9554b ("printk: convert byte-buffer to variable-length record buffer") introduced a record based printk buffer. Modify gdbmacros.txt to parse this new structure so dmesg will work properly. Link: http://lkml.kernel.org/r/1463515794-1599-1-git-send-email-minyard@acm.orgSigned-off-by: Corey Minyard <cminyard@mvista.com> Cc: Dave Young <dyoung@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Guillermo Julián Moreno authored
When remapping pages accounting for 4G or more memory space, the operation 'count << PAGE_SHIFT' overflows as it is performed on an integer. Solution: cast before doing the bitshift. [akpm@linux-foundation.org: fix vm_unmap_ram() also] [akpm@linux-foundation.org: fix vmap() as well, per Guillermo] Link: http://lkml.kernel.org/r/etPan.57175fb3.7a271c6b.2bd@naudit.esSigned-off-by: Guillermo Julián Moreno <guillermo.julian@naudit.es> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds authored
Pull ARM fix from Russell King: "Just one fix to the ptrace code, spotted by Simon Marchi, where if a thread migrates to a different CPU and the VFP registers are changed through ptrace, the application doesn't see the updated VFP registers" * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: fix PTRACE_SETVFPREGS on SMP systems
-
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linuxLinus Torvalds authored
Pull arm64 fixes from Will Deacon: "The main thing here is reviving hugetlb support using contiguous ptes, which we ended up reverting at the last minute in 4.5 pending a fix which went into the core mm/ code during the recent merge window. - Revert a previous revert and get hugetlb going with contiguous hints - Wire up missing compat syscalls - Enable CONFIG_SET_MODULE_RONX by default - Add missing line to our compat /proc/cpuinfo output - Clarify levels in our page table dumps - Fix booting with RANDOMIZE_TEXT_OFFSET enabled - Misc fixes to the ARM CPU PMU driver (refcounting, probe failure) - Remove some dead code and update a comment" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: fix alignment when RANDOMIZE_TEXT_OFFSET is enabled arm64: move {PAGE,CONT}_SHIFT into Kconfig arm64: mm: dump: log span level arm64: update stale PAGE_OFFSET comment drivers/perf: arm_pmu: Avoid leaking pmu->irq_affinity on error drivers/perf: arm_pmu: Defer the setting of __oprofile_cpu_pmu drivers/perf: arm_pmu: Fix reference count of a device_node in of_pmu_irq_cfg arm64: report CPU number in bad_mode arm64: unistd32.h: wire up missing syscalls for compat tasks arm64: Provide "model name" in /proc/cpuinfo for PER_LINUX32 tasks arm64: enable CONFIG_SET_MODULE_RONX by default arm64: Remove orphaned __addr_ok() definition Revert "arm64: hugetlb: partial revert of 66b3923a"
-
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds authored
Pull powerpc fixes from Michael Ellerman: - Handle RTAS delay requests in configure_bridge from Russell Currey - Refactor the configure_bridge RTAS tokens from Russell Currey - Fix definition of SIAR and SDAR registers from Thomas Huth - Use privileged SPR number for MMCR2 from Thomas Huth - Update LPCR only if it is powernv from Aneesh Kumar K.V - Fix the reference bit update when handling hash fault from Aneesh Kumar K.V - Add missing tlb flush from Aneesh Kumar K.V - Add POWER8NVL support to ibm,client-architecture-support call from Thomas Huth * tag 'powerpc-4.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call powerpc/mm/radix: Add missing tlb flush powerpc/mm/hash: Fix the reference bit update when handling hash fault powerpc/mm/radix: Update LPCR only if it is powernv powerpc: Use privileged SPR number for MMCR2 powerpc: Fix definition of SIAR and SDAR registers powerpc/pseries/eeh: Refactor the configure_bridge RTAS tokens powerpc/pseries/eeh: Handle RTAS delay requests in configure_bridge
-
Rafael J. Wysocki authored
* acpica-fixes: ACPICA / Hardware: Fix old register check in acpi_hw_get_access_bit_width() * acpi-video: ACPI / Thermal / video: fix max_level incorrect value * acpi-processor: ACPI / processor: Avoid reserving IO regions too early
-
Rafael J. Wysocki authored
* pm-cpufreq-fixes: cpufreq: Fix clamp_val() usage in cpufreq_driver_fast_switch() cpufreq: intel_pstate: Downgrade print level for _PPC
-
Chris Mason authored
When dealing with inline extents, btrfs_get_extent will incorrectly try to insert a duplicate extent_map. The dup hits -EEXIST from add_extent_map, but then we try to merge with the existing one and end up trying to insert a zero length extent_map. This actually works most of the time, except when there are extent maps past the end of the inline extent. rocksdb will trigger this sometimes because it preallocates an extent and then truncates down. Josef made a script to trigger with xfs_io: #!/bin/bash xfs_io -f -c "pwrite 0 1000" inline xfs_io -c "falloc -k 4k 1M" inline xfs_io -c "pread 0 1000" -c "fadvise -d 0 1000" -c "pread 0 1000" inline xfs_io -c "fadvise -d 0 1000" inline cat inline You'll get EIOs trying to read inline after this because add_extent_map is returning EEXIST Signed-off-by: Chris Mason <clm@fb.com>
-
Thomas Gleixner authored
Merge tag 'irqchip-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent Merge irqchip updates from Marc Zyngier: - A number of embarassing buglets (GICv3, PIC32) - A more substential errata workaround for Cavium's GICv3 ITS (kept for post-rc1 due to its dependency on NUMA)
-
Mark Rutland authored
With ARM64_64K_PAGES and RANDOMIZE_TEXT_OFFSET enabled, we hit the following issue on the boot: kernel BUG at arch/arm64/mm/mmu.c:480! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 4.6.0 #310 Hardware name: ARM Juno development board (r2) (DT) task: ffff000008d58a80 ti: ffff000008d30000 task.ti: ffff000008d30000 PC is at map_kernel_segment+0x44/0xb0 LR is at paging_init+0x84/0x5b0 pc : [<ffff000008c450b4>] lr : [<ffff000008c451a4>] pstate: 600002c5 Call trace: [<ffff000008c450b4>] map_kernel_segment+0x44/0xb0 [<ffff000008c451a4>] paging_init+0x84/0x5b0 [<ffff000008c42728>] setup_arch+0x198/0x534 [<ffff000008c40848>] start_kernel+0x70/0x388 [<ffff000008c401bc>] __primary_switched+0x30/0x74 Commit 7eb90f2f ("arm64: cover the .head.text section in the .text segment mapping") removed the alignment between the .head.text and .text sections, and used the _text rather than the _stext interval for mapping the .text segment. Prior to this commit _stext was always section aligned and didn't cause any issue even when RANDOMIZE_TEXT_OFFSET was enabled. Since that alignment has been removed and _text is used to map the .text segment, we need ensure _text is always page aligned when RANDOMIZE_TEXT_OFFSET is enabled. This patch adds logic to TEXT_OFFSET fuzzing to ensure that the offset is always aligned to the kernel page size. To ensure this, we rely on the PAGE_SHIFT being available via Kconfig. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reported-by: Sudeep Holla <sudeep.holla@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Fixes: 7eb90f2f ("arm64: cover the .head.text section in the .text segment mapping") Signed-off-by: Will Deacon <will.deacon@arm.com>
-