- 14 Nov, 2014 12 commits
-
-
Tang Chen authored
In free_area_init_core(), zone->managed_pages is set to an approximate value for lowmem, and will be adjusted when the bootmem allocator frees pages into the buddy system. But free_area_init_core() is also called by hotadd_new_pgdat() when hot-adding memory. As a result, zone->managed_pages of the newly added node's pgdat is set to an approximate value in the very beginning. Even if the memory on that node has node been onlined, /sys/device/system/node/nodeXXX/meminfo has wrong value: hot-add node2 (memory not onlined) cat /sys/device/system/node/node2/meminfo Node 2 MemTotal: 33554432 kB Node 2 MemFree: 0 kB Node 2 MemUsed: 33554432 kB Node 2 Active: 0 kB This patch fixes this problem by reset node managed pages to 0 after hot-adding a new node. 1. Move reset_managed_pages_done from reset_node_managed_pages() to reset_all_zones_managed_pages() 2. Make reset_node_managed_pages() non-static 3. Call reset_node_managed_pages() in hotadd_new_pgdat() after pgdat is initialized Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: <stable@vger.kernel.org> [3.16+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joonsoo Kim authored
One thing I did in this patch is fixing freepage accounting. If we clear guard page and link it onto isolate buddy list, we should not increase freepage count. This patch adds conditional branch to skip counting in this case. Without this patch, this overcounting happens frequently if guard order is set and CMA is used. Another thing fixed in this patch is the target to reset order. In __free_one_page(), we check the buddy page whether it is a guard page or not. And, if so, we should clear guard attribute on the buddy page and reset order of it to 0. But, current code resets original page's order rather than buddy one's. Maybe, this doesn't have any problem, because whole merged page's order will be re-assigned soon. But, it is better to correct code. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Gioh Kim <gioh.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Jan Kara authored
fsnotify() needs to merge inode and mount marks lists when notifying groups about events so that ignore masks from inode marks are reflected in mount mark notifications and groups are notified in proper order (according to priorities). Currently the sorting of the lists done by fsnotify_add_inode_mark() / fsnotify_add_vfsmount_mark() and fsnotify() differed which resulted ignore masks not being used in some cases. Fix the problem by always using the same comparison function when sorting / merging the mark lists. Thanks to Heinrich Schuchardt for improvements of my patch. Link: https://bugzilla.kernel.org/show_bug.cgi?id=87721Signed-off-by: Jan Kara <jack@suse.cz> Reported-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Tested-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Vlastimil Babka authored
Several people have reported occasionally seeing processes stuck in compact_zone(), even triggering soft lockups, in 3.18-rc2+. Testing a revert of commit e14c720e ("mm, compaction: remember position within pageblock in free pages scanner") fixed the issue, although the stuck processes do not appear to involve the free scanner. Finally, by code inspection, the bug was found in isolate_migratepages() which uses a slightly different condition to detect if the migration and free scanners have met, than compact_finished(). That has not been a problem until commit e14c720e allowed the free scanner position between individual invocations to be in the middle of a pageblock. In a relatively rare case, the migration scanner position can end up at the beginning of a pageblock, with the free scanner position in the middle of the same pageblock. If it's the migration scanner's turn, isolate_migratepages() exits immediately (without updating the position), while compact_finished() decides to continue compaction, resulting in a potentially infinite loop. The system can recover only if another process creates enough high-order pages to make the watermark checks in compact_finished() pass. This patch fixes the immediate problem by bumping the migration scanner's position to meet the free scanner in isolate_migratepages(), when both are within the same pageblock. This causes compact_finished() to terminate properly. A more robust check in compact_finished() is planned as a cleanup for better future maintainability. Fixes: e14c720e ("mm, compaction: remember position within pageblock in free pages scanner) Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: P. Christeas <xrg@linux.gr> Tested-by: P. Christeas <xrg@linux.gr> Link: http://marc.info/?l=linux-mm&m=141508604232522&w=2Reported-by: Norbert Preining <preining@logic.at> Tested-by: Norbert Preining <preining@logic.at> Link: https://lkml.org/lkml/2014/11/4/904Reported-by: Pavel Machek <pavel@ucw.cz> Link: https://lkml.org/lkml/2014/11/7/164 Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: David Rientjes <rientjes@google.com> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Michal Nazarewicz authored
Having test_pages_isolated failure message as a warning confuses users into thinking that it is more serious than it really is. In reality, if called via CMA, allocation will be retried so a single test_pages_isolated failure does not prevent allocation from succeeding. Demote the warning message to an info message and reformat it such that the text "failed" does not appear and instead a less worrying "PFNS busy" is used. This message is trivially reproducible on a 10GB x86 machine on 3.16.y kernels configured with CONFIG_DMA_CMA. Signed-off-by: Michal Nazarewicz <mina86@mina86.com> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joonsoo Kim authored
Unlike SLUB, sometimes, object isn't started at the beginning of the slab in SLAB. This causes the unalignment problem after slab merging is supported by commit 12220dea ("mm/slab: support slab merge"). Following is the report from Markos that fail to boot on Malta with EVA. Calibrating delay loop... 19.86 BogoMIPS (lpj=99328) pid_max: default: 32768 minimum: 301 Mount-cache hash table entries: 4096 (order: 0, 16384 bytes) Mountpoint-cache hash table entries: 4096 (order: 0, 16384 bytes) Kernel bug detected[#1]: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.17.0-05639-g12220dea #1631 task: 1f04f5d8 ti: 1f050000 task.ti: 1f050000 epc : 80141190 alloc_unbound_pwq+0x234/0x304 Not tainted ra : 80141184 alloc_unbound_pwq+0x228/0x304 Process swapper/0 (pid: 1, threadinfo=1f050000, task=1f04f5d8, tls=00000000) Call Trace: alloc_unbound_pwq+0x234/0x304 apply_workqueue_attrs+0x11c/0x294 __alloc_workqueue_key+0x23c/0x470 init_workqueues+0x320/0x400 do_one_initcall+0xe8/0x23c kernel_init_freeable+0x9c/0x224 kernel_init+0x10/0x100 ret_from_kernel_thread+0x14/0x1c [ end trace cb88537fdc8fa200 ] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b alloc_unbound_pwq() allocates slab object from pool_workqueue. This kmem_cache requires 256 bytes alignment, but, current merging code doesn't honor that, and merge it with kmalloc-256. kmalloc-256 requires only cacheline size alignment so that above failure occurs. However, in x86, kmalloc-256 is luckily aligned in 256 bytes, so the problem didn't happen on it. To fix this problem, this patch introduces alignment mismatch check in find_mergeable(). This will fix the problem. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Reported-by: Markos Chandras <Markos.Chandras@imgtec.com> Tested-by: Markos Chandras <Markos.Chandras@imgtec.com> Acked-by: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joonsoo Kim authored
Current pageblock isolation logic could isolate each pageblock individually. This causes freepage accounting problem if freepage with pageblock order on isolate pageblock is merged with other freepage on normal pageblock. We can prevent merging by restricting max order of merging to pageblock order if freepage is on isolate pageblock. A side-effect of this change is that there could be non-merged buddy freepage even if finishing pageblock isolation, because undoing pageblock isolation is just to move freepage from isolate buddy list to normal buddy list rather than to consider merging. So, the patch also makes undoing pageblock isolation consider freepage merge. When un-isolation, freepage with more than pageblock order and it's buddy are checked. If they are on normal pageblock, instead of just moving, we isolate the freepage and free it in order to get merged. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Mel Gorman <mgorman@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Laura Abbott <lauraa@codeaurora.org> Cc: Heesub Shin <heesub.shin@samsung.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Ritesh Harjani <ritesh.list@gmail.com> Cc: Gioh Kim <gioh.kim@lge.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joonsoo Kim authored
All the caller of __free_one_page() has similar freepage counting logic, so we can move it to __free_one_page(). This reduce line of code and help future maintenance. This is also preparation step for "mm/page_alloc: restrict max order of merging on isolated pageblock" which fix the freepage counting problem on freepage with more than pageblock order. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Mel Gorman <mgorman@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Laura Abbott <lauraa@codeaurora.org> Cc: Heesub Shin <heesub.shin@samsung.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Ritesh Harjani <ritesh.list@gmail.com> Cc: Gioh Kim <gioh.kim@lge.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joonsoo Kim authored
In free_pcppages_bulk(), we use cached migratetype of freepage to determine type of buddy list where freepage will be added. This information is stored when freepage is added to pcp list, so if isolation of pageblock of this freepage begins after storing, this cached information could be stale. In other words, it has original migratetype rather than MIGRATE_ISOLATE. There are two problems caused by this stale information. One is that we can't keep these freepages from being allocated. Although this pageblock is isolated, freepage will be added to normal buddy list so that it could be allocated without any restriction. And the other problem is incorrect freepage accounting. Freepages on isolate pageblock should not be counted for number of freepage. Following is the code snippet in free_pcppages_bulk(). /* MIGRATE_MOVABLE list may include MIGRATE_RESERVEs */ __free_one_page(page, page_to_pfn(page), zone, 0, mt); trace_mm_page_pcpu_drain(page, 0, mt); if (likely(!is_migrate_isolate_page(page))) { __mod_zone_page_state(zone, NR_FREE_PAGES, 1); if (is_migrate_cma(mt)) __mod_zone_page_state(zone, NR_FREE_CMA_PAGES, 1); } As you can see above snippet, current code already handle second problem, incorrect freepage accounting, by re-fetching pageblock migratetype through is_migrate_isolate_page(page). But, because this re-fetched information isn't used for __free_one_page(), first problem would not be solved. This patch try to solve this situation to re-fetch pageblock migratetype before __free_one_page() and to use it for __free_one_page(). In addition to move up position of this re-fetch, this patch use optimization technique, re-fetching migratetype only if there is isolate pageblock. Pageblock isolation is rare event, so we can avoid re-fetching in common case with this optimization. This patch also correct migratetype of the tracepoint output. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Minchan Kim <minchan@kernel.org> Acked-by: Michal Nazarewicz <mina86@mina86.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Mel Gorman <mgorman@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Laura Abbott <lauraa@codeaurora.org> Cc: Heesub Shin <heesub.shin@samsung.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Ritesh Harjani <ritesh.list@gmail.com> Cc: Gioh Kim <gioh.kim@lge.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joonsoo Kim authored
Before describing bugs itself, I first explain definition of freepage. 1. pages on buddy list are counted as freepage. 2. pages on isolate migratetype buddy list are *not* counted as freepage. 3. pages on cma buddy list are counted as CMA freepage, too. Now, I describe problems and related patch. Patch 1: There is race conditions on getting pageblock migratetype that it results in misplacement of freepages on buddy list, incorrect freepage count and un-availability of freepage. Patch 2: Freepages on pcp list could have stale cached information to determine migratetype of buddy list to go. This causes misplacement of freepages on buddy list and incorrect freepage count. Patch 4: Merging between freepages on different migratetype of pageblocks will cause freepages accouting problem. This patch fixes it. Without patchset [3], above problem doesn't happens on my CMA allocation test, because CMA reserved pages aren't used at all. So there is no chance for above race. With patchset [3], I did simple CMA allocation test and get below result: - Virtual machine, 4 cpus, 1024 MB memory, 256 MB CMA reservation - run kernel build (make -j16) on background - 30 times CMA allocation(8MB * 30 = 240MB) attempts in 5 sec interval - Result: more than 5000 freepage count are missed With patchset [3] and this patchset, I found that no freepage count are missed so that I conclude that problems are solved. On my simple memory offlining test, these problems also occur on that environment, too. This patch (of 4): There are two paths to reach core free function of buddy allocator, __free_one_page(), one is free_one_page()->__free_one_page() and the other is free_hot_cold_page()->free_pcppages_bulk()->__free_one_page(). Each paths has race condition causing serious problems. At first, this patch is focused on first type of freepath. And then, following patch will solve the problem in second type of freepath. In the first type of freepath, we got migratetype of freeing page without holding the zone lock, so it could be racy. There are two cases of this race. 1. pages are added to isolate buddy list after restoring orignal migratetype CPU1 CPU2 get migratetype => return MIGRATE_ISOLATE call free_one_page() with MIGRATE_ISOLATE grab the zone lock unisolate pageblock release the zone lock grab the zone lock call __free_one_page() with MIGRATE_ISOLATE freepage go into isolate buddy list, although pageblock is already unisolated This may cause two problems. One is that we can't use this page anymore until next isolation attempt of this pageblock, because freepage is on isolate buddy list. The other is that freepage accouting could be wrong due to merging between different buddy list. Freepages on isolate buddy list aren't counted as freepage, but ones on normal buddy list are counted as freepage. If merge happens, buddy freepage on normal buddy list is inevitably moved to isolate buddy list without any consideration of freepage accouting so it could be incorrect. 2. pages are added to normal buddy list while pageblock is isolated. It is similar with above case. This also may cause two problems. One is that we can't keep these freepages from being allocated. Although this pageblock is isolated, freepage would be added to normal buddy list so that it could be allocated without any restriction. And the other problem is same as case 1, that it, incorrect freepage accouting. This race condition would be prevented by checking migratetype again with holding the zone lock. Because it is somewhat heavy operation and it isn't needed in common case, we want to avoid rechecking as much as possible. So this patch introduce new variable, nr_isolate_pageblock in struct zone to check if there is isolated pageblock. With this, we can avoid to re-check migratetype in common case and do it only if there is isolated pageblock or migratetype is MIGRATE_ISOLATE. This solve above mentioned problems. Changes from v3: Add one more check in free_one_page() that checks whether migratetype is MIGRATE_ISOLATE or not. Without this, abovementioned case 1 could happens. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Minchan Kim <minchan@kernel.org> Acked-by: Michal Nazarewicz <mina86@mina86.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Mel Gorman <mgorman@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Laura Abbott <lauraa@codeaurora.org> Cc: Heesub Shin <heesub.shin@samsung.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Ritesh Harjani <ritesh.list@gmail.com> Cc: Gioh Kim <gioh.kim@lge.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joonsoo Kim authored
Commit 7d49d886 ("mm, compaction: reduce zone checking frequency in the migration scanner") has a side-effect that changes the iteration range calculation. Before the change, block_end_pfn is calculated using start_pfn, but now it blindly adds pageblock_nr_pages to the previous value. This causes the problem that isolation_start_pfn is larger than block_end_pfn when we isolate the page with more than pageblock order. In this case, isolation would fail due to an invalid range parameter. To prevent this, this patch implements skipping the range until a proper target pageblock is met. Without this patch, CMA with more than pageblock order always fails but with this patch it will succeed. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Minchan Kim <minchan@kernel.org> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Weijie Yang authored
zram could kunmap_atomic() a NULL pointer in a rare situation: a zram page becomes a full-zeroed page after a partial write io. The current code doesn't handle this case and performs kunmap_atomic() on a NULL pointer, which panics the kernel. This patch fixes this issue. Signed-off-by: Weijie Yang <weijie.yang@samsung.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Dan Streetman <ddstreet@ieee.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Weijie Yang <weijie.yang.kh@gmail.com> Acked-by: Jerome Marchand <jmarchan@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 13 Nov, 2014 4 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-securityLinus Torvalds authored
Pull SELinux fixlet from James Morris: "WARN_ONCE() here will unnecessarily terrify users" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: selinux: convert WARN_ONCE() to printk() in selinux_nlmsg_perm()
-
git://git.infradead.org/users/pcmoore/auditLinus Torvalds authored
Pull audit fixes from Paul Moore: "After he sent the initial audit pull request for 3.18, Eric asked me to take over the management of the audit tree, hence this pull request to fix a couple of problems with audit. As you can see below, the changes are minimal: adding some whitespace to a string so userspace parses it correctly, and fixing a problem with audit's usage of fsnotify that was causing audit watch rules to be lost. Neither of these patches were very controversial on the mailing lists and they fix real problems, getting them into 3.18 would be a good thing" * 'stable-3.18' of git://git.infradead.org/users/pcmoore/audit: audit: keep inode pinned audit: AUDIT_FEATURE_CHANGE message format missing delimiting space
-
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dmLinus Torvalds authored
Pull device mapper fixes from Mike Snitzer: - stable fix for dm-thin that avoids normal IO racing with discard - stable fix for a dm-cache related bug in dm-btree walking code that results from using very large fast device (eg 4T) with a very small cache blocksize (eg 32K) -- this is a very uncommon configuration - a couple fixes for dm-raid (one for stable and the other addresses a crash in 3.18-rc1 code) - stable fix for dm-thinp that addresses a very rare dm-bufio bug having to do with memory reclaimation (via shrinker) when using dm-thinp ontop of loopback devices - fix a leak in dm-stripe target constructor's error path * tag 'dm-3.18-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm btree: fix a recursion depth bug in btree walking code dm thin: grab a virtual cell before looking up the mapping dm raid: fix inaccessible superblocks causing oops in configure_discard_support dm raid: ensure superblock's size matches device's logical block size dm bufio: change __GFP_IO to __GFP_FS in shrinker callbacks dm stripe: fix potential for leak in stripe_ctr error path
-
-
- 12 Nov, 2014 10 commits
-
-
git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds authored
Pull kvm fixes from Paolo Bonzini: "Two fixes --- one of them not exactly a one liner, but things are calming down on the KVM front at last" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: Fix uninitialized op->type for some immediate values KVM: s390: virtio_ccw: remove unused variable
-
git://github.com/czankel/xtensa-linuxLinus Torvalds authored
Pull Xtensa fixes from Chris Zankel: - fix umount syscall - fix ISS and xtfpga Kconfig dependencies so that more randconfigs are buildable - add seccomp, getrandom, and memfd_create syscalls - add defconfigs for KC705 and SMP LX200 - implement pgprot_noncached * tag 'xtensa-20141109' of git://github.com/czankel/xtensa-linux: xtensa: xtfpga: add lx200 SMP DTS and defconfig xtensa: xtfpga: add generic KC705 board config xtensa: re-wire umount syscall to sys_oldumount xtensa: xtfpga: only select ethoc when ethernet is available xtensa: add seccomp, getrandom, and memfd_create syscalls xtensa: ISS: add BLOCK dependency to BLK_DEV_SIMDISK xtensa: implement pgprot_noncached xtensa/uapi: Add definition of TIOC[SG]RS485
-
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds authored
Pull crypto fixes from Herbert Xu: - stack corruption fix for pseries hwrng driver - add missing DMA unmap in caam crypto driver - fix NUMA crash in qat crypto driver - fix buggy mapping of zero-length associated data in qat crypto driver * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: hwrng: pseries - port to new read API and fix stack corruption crypto: caam - fix missing dma unmap on error path crypto: qat - Enforce valid numa configuration crypto: qat - Prevent dma mapping zero length assoc data
-
Linus Torvalds authored
Merge tag 'trace-fixes-v3.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fix from Steven Rostedt: "Rabin Vincent found a way that tracing could cause an infinite loop in the kernel. The splice logic wants a full page from the ring buffer but the ring_buffer_wait() returns when there's any data in the ring buffer. The splice code would then continue the loop waiting for a full page. But if a full page never happens, the splice code will never sleep and just continue to loop. There's another case that Rabin fixed that could loop if there's no memory and kmalloc() constantly returns NULL" * tag 'trace-fixes-v3.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Do not risk busy looping in buffer splice tracing: Do not busy wait in buffer splice
-
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linuxLinus Torvalds authored
Pull kernel argument parsing fix from Rusty Russell: "Nasty, stupid bug, and I've suddenly had two reports" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: param: fix crash on bad kernel arguments
-
Linus Torvalds authored
Merge tag 'hwmon-for-linus-v3.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: - fix PCI device ID in fam15h_power driver - fix suspend/resume behavior in pwm-fan driver - reduce logging noise created by ibmpowernv driver * tag 'hwmon-for-linus-v3.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (fam15h_power) Fix NB device ID for F16h M30h hwmon: (pwm-fan) Fix suspend/resume behavior hwmon: (ibmpowernv) Quieten when probing finds no device
-
git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermalLinus Torvalds authored
Pull thermal driver fixes from Eduardo Valentin: "This week we have few fixes: - fix in IMX thermal driver to do the correct loading sequence with CPUfreq - fix in Exynos related to TMU_CONTROL offset in Exynos5260 - fix the unit conversion in int3403" [ Still pulling from Eduardo as Rui Zhang is on a business trip and has troubles with his machine ] * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal: imx: thermal: imx_get_temp might be called before sensor clock is prepared thermal: exynos: use correct offset for TMU_CONTROL register on Exynos5260 thermal: imx: correct driver load sequence for cpu cooling Thermal/int3403: Fix thermal hysteresis unit conversion
-
Richard Guy Briggs authored
Convert WARN_ONCE() to printk() in selinux_nlmsg_perm(). After conversion from audit_log() in commit e173fb26, WARN_ONCE() was deemed too alarmist, so switch it to printk(). Signed-off-by: Richard Guy Briggs <rgb@redhat.com> [PM: Changed to printk(WARNING) so we catch all of the different invalid netlink messages. In Richard's defense, he brought this point up earlier, but I didn't understand his point at the time.] Signed-off-by: Paul Moore <pmoore@redhat.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfdLinus Torvalds authored
Pull MFD fixes from Lee Jones: - register offset fix for stmpe - eradicate build warning when !PM in rtsx_pcr - fix device ID collision when multiple boards are connected in viperboard - use correct Regmap handle - fixing unhanded IRQs in max77693 - unmask MUIC IRQs in max77693 - clear VBUS & CHG bits so board doesn't reboot instead of poweroff in twl4030 * tag 'mfd-fixes-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: mfd: twl4030-power: Fix poweroff with PM configuration enabled mfd: max77693: Fix always masked MUIC interrupts mfd: max77693: Use proper regmap for handling MUIC interrupts mfd: viperboard: Fix platform-device id collision mfd: rtsx: Fix build warnings for !PM mfd: stmpe: Fix STMPE24xx GPMR LSB
-
git://people.freedesktop.org/~airlied/linuxLinus Torvalds authored
Pull drm fixes from Dave Airlie: "Radeon and i915 fixes. I probably should have sent these earlier, but nothing too urgent in them: - i915: blackscreen and corruption fixes - radeon: oops, locking and stability" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: drm/radeon: add missing crtc unlock when setting up the MC drm/radeon: use gart for DMA IB tests drm/radeon: make sure mode init is complete in bandwidth_update drm/radeon: set correct CE ram size for CIK drm/i915: safeguard against too high minimum brightness drm/i915: vlv: fix gunit HW state corruption during S4 suspend drm/i915: Disable caches for Global GTT.
-
- 11 Nov, 2014 5 commits
-
-
Miklos Szeredi authored
Audit rules disappear when an inode they watch is evicted from the cache. This is likely not what we want. The guilty commit is "fsnotify: allow marks to not pin inodes in core", which didn't take into account that audit_tree adds watches with a zero mask. Adding any mask should fix this. Fixes: 90b1e7a5 ("fsnotify: allow marks to not pin inodes in core") Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: stable@vger.kernel.org # 2.6.36+ Signed-off-by: Paul Moore <pmoore@redhat.com>
-
Aravind Gopalakrishnan authored
F3 device ID is wrongly included in fam15h_power_id_table for F16h M30h. It should be F4 device ID. Fix this. Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
-
Kamil Debski authored
The state of a PWM output is not clearly defined after resume. Some PWM drivers do not restore the duty cycle upon resume, thus it is necessary to manually restore the correct value. Signed-off-by: Kamil Debski <k.debski@samsung.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
-
Michael Ellerman authored
Because we build kernels with drivers built in for many platforms, it's normal for the ibmpowernv driver to be loaded on systems that don't have the appropriate hardware. Currently the driver spams the log with: ibmpowernv ibmpowernv.0: Opal node 'sensors' not found ibmpowernv: Platfrom driver probe failed But there is no error, this machine is not a powernv and doesn't have the hardware. So change the sensors message to dev_dbg(), and only print an error about the probe failing if it's not ENODEV. Also fix the spelling of "Platfrom" and print the actual error value. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
-
Daniel Thompson authored
Currently if the user passes an invalid value on the kernel command line then the kernel will crash during argument parsing. On most systems this is very hard to debug because the console hasn't been initialized yet. This is a regression due to commit 51e158c1 ("param: hand arguments after -- straight to init") which, in response to the systemd debug controversy, made it possible to explicitly pass arguments to init. To achieve this parse_args() was extended from simply returning an error code to returning a pointer. Regretably the new init args logic does not perform a proper validity check on the pointer resulting in a crash. This patch fixes the validity check. Should the check fail then no arguments will be passed to init. This is reasonable and matches how the kernel treats its own arguments (i.e. no error recovery). Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Cc: stable@vger.kernel.org Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
-
- 10 Nov, 2014 9 commits
-
-
Rabin Vincent authored
If the read loop in trace_buffers_splice_read() keeps failing due to memory allocation failures without reading even a single page then this function will keep busy looping. Remove the risk for that by exiting the function if memory allocation failures are seen. Link: http://lkml.kernel.org/r/1415309167-2373-2-git-send-email-rabin@rab.inSigned-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Rabin Vincent authored
On a !PREEMPT kernel, attempting to use trace-cmd results in a soft lockup: # trace-cmd record -e raw_syscalls:* -F false NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [trace-cmd:61] ... Call Trace: [<ffffffff8105b580>] ? __wake_up_common+0x90/0x90 [<ffffffff81092e25>] wait_on_pipe+0x35/0x40 [<ffffffff810936e3>] tracing_buffers_splice_read+0x2e3/0x3c0 [<ffffffff81093300>] ? tracing_stats_read+0x2a0/0x2a0 [<ffffffff812d10ab>] ? _raw_spin_unlock+0x2b/0x40 [<ffffffff810dc87b>] ? do_read_fault+0x21b/0x290 [<ffffffff810de56a>] ? handle_mm_fault+0x2ba/0xbd0 [<ffffffff81095c80>] ? trace_event_buffer_lock_reserve+0x40/0x80 [<ffffffff810951e2>] ? trace_buffer_lock_reserve+0x22/0x60 [<ffffffff81095c80>] ? trace_event_buffer_lock_reserve+0x40/0x80 [<ffffffff8112415d>] do_splice_to+0x6d/0x90 [<ffffffff81126971>] SyS_splice+0x7c1/0x800 [<ffffffff812d1edd>] tracesys_phase2+0xd3/0xd8 The problem is this: tracing_buffers_splice_read() calls ring_buffer_wait() to wait for data in the ring buffers. The buffers are not empty so ring_buffer_wait() returns immediately. But tracing_buffers_splice_read() calls ring_buffer_read_page() with full=1, meaning it only wants to read a full page. When the full page is not available, tracing_buffers_splice_read() tries to wait again with ring_buffer_wait(), which again returns immediately, and so on. Fix this by adding a "full" argument to ring_buffer_wait() which will make ring_buffer_wait() wait until the writer has left the reader's page, i.e. until full-page reads will succeed. Link: http://lkml.kernel.org/r/1415645194-25379-1-git-send-email-rabin@rab.in Cc: stable@vger.kernel.org # 3.16+ Fixes: b1169cc6 ("tracing: Remove mock up poll wait function") Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
-
Joe Thornber authored
The walk code was using a 'ro_spine' to hold it's locked btree nodes. But this data structure is designed for the rolling lock scheme, and as such automatically unlocks blocks that are two steps up the call chain. This is not suitable for the simple recursive walk algorithm, which retraces its steps. This code is only used by the persistent array code, which in turn is only used by dm-cache. In order to trigger it you need to have a mapping tree that is more than 2 levels deep; which equates to 8-16 million cache blocks. For instance a 4T ssd with a very small block size of 32k only just triggers this bug. The fix just places the locked blocks on the stack, and stops using the ro_spine altogether. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org
-
Tony Lindgren authored
Commit e7cd1d1e ("mfd: twl4030-power: Add generic reset configuration") enabled configuring the PM features for twl4030. This caused poweroff command to fail on devices that have the BCI charger on twl4030 wired, or have power wired for VBUS. Instead of powering off, the device reboots. This is because voltage is detected on charger or VBUS with the default bits enabled for the power transition registers. To fix the issue, let's just clear VBUS and CHG bits as we want poweroff command to keep the system powered off. Fixes: e7cd1d1e ("mfd: twl4030-power: Add generic reset configuration") Cc: stable@vger.kernel.org # v3.16+ Reported-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
-
Krzysztof Kozlowski authored
All interrupts coming from MUIC were ignored because interrupt source register was masked. The Maxim 77693 has a "interrupt source" - a separate register and interrupts which give information about PMIC block triggering the individual interrupt (charger, topsys, MUIC, flash LED). By default bootloader could initialize this register to "mask all" value. In such case (observed on Trats2 board) MUIC interrupts won't be generated regardless of their mask status. Regmap irq chip was unmasking individual MUIC interrupts but the source was masked Before introducing regmap irq chip this interrupt source was unmasked, read and acked. Reading and acking is not necessary but unmasking is. Fixes: 342d669c ("mfd: max77693: Handle IRQs using regmap") Cc: <stable@vger.kernel.org> Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
-
Krzysztof Kozlowski authored
Interrupts coming from Maxim77693 MUIC block (MicroUSB Interface Controller) were not handled at all because wrong regmap was used for MUIC's regmap_irq_chip. The MUIC component of Maxim 77693 uses different I2C address thus second regmap is created and used by max77693 extcon driver. The registers for MUIC interrupts are also in that block and should be handled by that second regmap. However the regmap irq chip for MUIC was configured with default regmap which could not read MUIC registers. Fixes: 342d669c ("mfd: max77693: Handle IRQs using regmap") Cc: <stable@vger.kernel.org> Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
-
Johan Hovold authored
Allow more than one viperboard to be connected by registering with PLATFORM_DEVID_AUTO instead of PLATFORM_DEVID_NONE. The subdevices are currently registered with PLATFORM_DEVID_NONE, which will cause a name collision on the platform bus when a second viperboard is plugged in: viperboard 1-2.4:1.0: version 0.00 found at bus 001 address 004 ------------[ cut here ]------------ WARNING: CPU: 0 PID: 181 at /home/johan/work/omicron/src/linux/fs/sysfs/dir.c:31 sysfs_warn_dup+0x74/0x84() sysfs: cannot create duplicate filename '/bus/platform/devices/viperboard-gpio' Modules linked in: i2c_viperboard viperboard netconsole [last unloaded: viperboard] CPU: 0 PID: 181 Comm: bash Tainted: G W 3.17.0-rc6 #1 [<c0016bf4>] (unwind_backtrace) from [<c0013860>] (show_stack+0x20/0x24) [<c0013860>] (show_stack) from [<c04305f8>] (dump_stack+0x24/0x28) [<c04305f8>] (dump_stack) from [<c0040fb4>] (warn_slowpath_common+0x80/0x98) [<c0040fb4>] (warn_slowpath_common) from [<c004100c>] (warn_slowpath_fmt+0x40/0x48) [<c004100c>] (warn_slowpath_fmt) from [<c016f1bc>] (sysfs_warn_dup+0x74/0x84) [<c016f1bc>] (sysfs_warn_dup) from [<c016f548>] (sysfs_do_create_link_sd.isra.2+0xcc/0xd0) [<c016f548>] (sysfs_do_create_link_sd.isra.2) from [<c016f588>] (sysfs_create_link+0x3c/0x48) [<c016f588>] (sysfs_create_link) from [<c02867ec>] (bus_add_device+0x12c/0x1e0) [<c02867ec>] (bus_add_device) from [<c0284820>] (device_add+0x410/0x584) [<c0284820>] (device_add) from [<c0289440>] (platform_device_add+0xd8/0x26c) [<c0289440>] (platform_device_add) from [<c02a5ae4>] (mfd_add_device+0x240/0x344) [<c02a5ae4>] (mfd_add_device) from [<c02a5ce0>] (mfd_add_devices+0xb8/0x110) [<c02a5ce0>] (mfd_add_devices) from [<bf00d1c8>] (vprbrd_probe+0x160/0x1b0 [viperboard]) [<bf00d1c8>] (vprbrd_probe [viperboard]) from [<c030c000>] (usb_probe_interface+0x1bc/0x2a8) [<c030c000>] (usb_probe_interface) from [<c028768c>] (driver_probe_device+0x14c/0x3ac) [<c028768c>] (driver_probe_device) from [<c02879e4>] (__driver_attach+0xa4/0xa8) [<c02879e4>] (__driver_attach) from [<c0285698>] (bus_for_each_dev+0x70/0xa4) [<c0285698>] (bus_for_each_dev) from [<c0287030>] (driver_attach+0x2c/0x30) [<c0287030>] (driver_attach) from [<c030a288>] (usb_store_new_id+0x170/0x1ac) [<c030a288>] (usb_store_new_id) from [<c030a2f8>] (new_id_store+0x34/0x3c) [<c030a2f8>] (new_id_store) from [<c02853ec>] (drv_attr_store+0x30/0x3c) [<c02853ec>] (drv_attr_store) from [<c016eaa8>] (sysfs_kf_write+0x5c/0x60) [<c016eaa8>] (sysfs_kf_write) from [<c016dc68>] (kernfs_fop_write+0xd4/0x194) [<c016dc68>] (kernfs_fop_write) from [<c010fe40>] (vfs_write+0xb4/0x1c0) [<c010fe40>] (vfs_write) from [<c01104a8>] (SyS_write+0x4c/0xa0) [<c01104a8>] (SyS_write) from [<c000f900>] (ret_fast_syscall+0x0/0x48) ---[ end trace 98e8603c22d65817 ]--- viperboard 1-2.4:1.0: Failed to add mfd devices to core. viperboard: probe of 1-2.4:1.0 failed with error -17 Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org>
-
Thierry Reding authored
rtsx_pci_power_off() is called only from rtsx_pci_suspend(), which isn't built when PM is disabled. Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
-
Linus Walleij authored
The least significat byte of the GPIO value read register on the STMPE24xx series is on addres 0xA4 not 0xA5. Correct against datasheet and tested on the STMPE2401 hardware. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Lee Jones <lee.jones@linaro.org>
-