- 21 Nov, 2012 3 commits
-
-
Sarveshwar Bandi authored
Patch sets the lowest gso_max_size and gso_max_segs values of the slave devices during enslave and detach. Signed-off-by: Sarveshwar Bandi <sarveshwar.bandi@emulex.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
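The clamping described in this message can be sketched in plain C. This is a minimal userspace sketch under stated assumptions, not the bonding driver's code: `struct slave_caps`, `lowest_gso_limits`, and the `defaults` argument are hypothetical names introduced only for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for the GSO limits of one slave device. */
struct slave_caps {
    unsigned int gso_max_size;
    unsigned short gso_max_segs;
};

/*
 * Compute the limits a master device could advertise: the lowest
 * gso_max_size and gso_max_segs over all slaves. With no slaves,
 * the caller-supplied defaults are returned unchanged.
 */
struct slave_caps lowest_gso_limits(const struct slave_caps *slaves,
                                    size_t n,
                                    struct slave_caps defaults)
{
    struct slave_caps out = defaults;

    for (size_t i = 0; i < n; i++) {
        if (slaves[i].gso_max_size < out.gso_max_size)
            out.gso_max_size = slaves[i].gso_max_size;
        if (slaves[i].gso_max_segs < out.gso_max_segs)
            out.gso_max_segs = slaves[i].gso_max_segs;
    }
    return out;
}
```

Recomputing from scratch on both enslave and detach keeps the result correct when the most restrictive slave is the one that leaves.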
-
Ian Campbell authored
An SKB paged fragment can consist of a compound page with order > 0. However the netchannel protocol deals only in PAGE_SIZE frames. Handle this in xennet_make_frags by iterating over the frames which make up the page. This is the netfront equivalent to 6a8ed462 for netback. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: netdev@vger.kernel.org Cc: xen-devel@lists.xen.org Cc: Eric Dumazet <edumazet@google.com> Cc: Konrad Rzeszutek Wilk <konrad@kernel.org> Cc: ANNIE LI <annie.li@oracle.com> Cc: Sander Eikelenboom <linux@eikelenboom.it> Cc: Stefan Bader <stefan.bader@canonical.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
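Iterating over the frames that make up a compound page can be sketched as follows. This is a userspace illustration only: `FRAME_SIZE`, `struct frame_chunk`, and `split_frag` are hypothetical names, and a 4096-byte frame is assumed; it is not the code in xennet_make_frags.

```c
#include <stddef.h>

#define FRAME_SIZE 4096u  /* assumed PAGE_SIZE for this sketch */

/* Hypothetical record of the part of a fragment that lies in one frame. */
struct frame_chunk {
    size_t frame;   /* index of the frame within the compound page */
    size_t offset;  /* byte offset inside that frame */
    size_t len;     /* number of bytes taken from that frame */
};

/*
 * Split a fragment (offset and len relative to the start of the compound
 * page) into chunks that never cross a frame boundary. Returns the number
 * of chunks written, at most max_out.
 */
size_t split_frag(size_t offset, size_t len,
                  struct frame_chunk *out, size_t max_out)
{
    size_t n = 0;

    while (len > 0 && n < max_out) {
        size_t in_frame = offset % FRAME_SIZE;
        size_t bytes = FRAME_SIZE - in_frame;

        if (bytes > len)
            bytes = len;

        out[n].frame = offset / FRAME_SIZE;
        out[n].offset = in_frame;
        out[n].len = bytes;
        n++;

        offset += bytes;
        len -= bytes;
    }
    return n;
}
```

A fragment that starts near the end of the first frame therefore yields one short chunk, then full frames, then a final partial chunk.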
-
John W. Linville authored
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem John W. Linville says: ==================== This is a batch of fixes intended for 3.7... Included are two pulls. Regarding the mac80211 tree, Johannes says: "Please pull my mac80211.git tree (see below) to get two more fixes for 3.7. Both fix regressions introduced *before* this cycle that weren't noticed until now, one for IBSS not cleaning up properly and the other to add back the "wireless" sysfs directory for Fedora's startup scripts." Regarding the iwlwifi tree, Johannes says: "Please also pull my iwlwifi.git tree, I have two fixes: one to remove a spurious warning that can actually trigger in legitimate situations, and the other to fix a regression from when monitor mode was changed to use the "sniffer" firmware mode." Also included is an nfc tree pull. Samuel says: "We mostly have pn533 fixes here, 2 memory leaks and an early unlocking fix. Moreover, we also have an LLCP adapter linked list insertion fix." On top of that, a few more bits... Albert Pool adds a USB ID to rtlwifi. Bing Zhao provides two mwifiex fixes -- one to fix a system hang during a command timeout, and the other to properly report a suspend error to the MMC core. Finally, Sujith Manoharan fixes a thinko that would trigger an ath9k hang during device reset. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 20 Nov, 2012 7 commits
-
-
Jeff Mahoney authored
Commit 71c6c837 (drivers/net: fix tasklet misuse issue) introduced a build failure in the xilinx driver. axienet_dma_err_handler isn't declared before its use in axienet_open. This patch provides the prototype before axienet_open. Cc: Xiaotian Feng <dannyfeng@tencent.com> Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alexander Shiyan authored
Signed-off-by: Alexander Shiyan <shc_work@mail.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
In case of error, inet6_csk_update_pmtu() should consistently return NULL. Bug added in commit 35ad9b9c (ipv6: Add helper inet6_csk_update_pmtu().) Reported-by: Lluís Batlle i Rossell <viric@viric.name> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
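The class of bug fixed here, a function that sometimes reports failure as an encoded error pointer while its callers only test for NULL, can be sketched in userspace C. `ERR_PTR` and `IS_ERR` below are simplified stand-ins for the kernel helpers of the same name, and `struct route`, `lookup_route`, and `update_pmtu` are hypothetical; none of this is the actual IPv6 code.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error)
{
    return (void *)(intptr_t)error;
}

static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical route object. */
struct route {
    int mtu;
};

/* Hypothetical lookup that reports failure as an encoded error pointer. */
struct route *lookup_route(struct route *r, int fail)
{
    return fail ? ERR_PTR(-1) : r;
}

/*
 * Callers of this helper only check for NULL, so every failure path
 * must be normalised to NULL before returning.
 */
struct route *update_pmtu(struct route *r, int fail, int mtu)
{
    struct route *dst = lookup_route(r, fail);

    if (IS_ERR(dst))
        return NULL;

    dst->mtu = mtu;
    return dst;
}
```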
-
Xi Wang authored
Use &port->netdev->dev instead of NULL since dma_pool_create() doesn't allow NULL dev. Signed-off-by: Xi Wang <xi.wang@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Xi Wang authored
Use &port->netdev->dev instead of NULL since dma_pool_create() doesn't allow NULL dev. Signed-off-by: Xi Wang <xi.wang@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alan Cox authored
Without this udev doesn't have a way to key the ne device to the platform device. Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-3.0
John W. Linville authored
Samuel says: "This is the first pull request for 3.7 NFC fixes. We mostly have pn533 fixes here, 2 memory leaks and an early unlocking fix. Moreover, we also have an LLCP adapter linked list insertion fix." Signed-off-by: John W. Linville <linville@tuxdriver.com>
-
- 19 Nov, 2012 9 commits
-
-
Srinivas Kandagatla authored
When the mdio-gpio driver is probed via device trees, the platform device id is set to -1. However, pdev->id is reused as the bus id when creating the mdio gpio bus. So in the device tree case the mdio-gpio bus name appears as "gpio-ffffffff", whereas in the non-device-tree case the bus name appears as "gpio-<bus-num>". This means the bus_id is fixed in the device tree case, so we can't have two mdio gpio buses via device trees. Assigning a logical bus number via device tree solves the problem, and the bus name is then consistent with the non-device-tree bus name. Without this patch: 1. we can't support two mdio-gpio buses via device trees. 2. we must always pass gpio-ffffffff as the bus name to phy_connect, which is very different from the non-device-tree bus name. So setting up the bus_id via aliases from the device tree is the right solution, and other drivers do a similar thing. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
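The "gpio-ffffffff" name follows directly from printing an id of -1 in hexadecimal. A minimal sketch, assuming a 32-bit `int` and a "gpio-%x" format string (the format string and the helper name `format_bus_name` are assumptions made for illustration, not taken from the driver):

```c
#include <stdio.h>

/*
 * Build a bus name from a bus id. An id of -1 converts to the unsigned
 * value 0xffffffff (with 32-bit int), which is why every device-tree
 * probed bus would end up with the same name.
 */
void format_bus_name(char *buf, size_t size, int bus_id)
{
    snprintf(buf, size, "gpio-%x", (unsigned int)bus_id);
}
```

With a logical bus number taken from a device tree alias, each bus gets a distinct id and therefore a distinct name.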
-
Thierry Escande authored
In target mode, sent sk_buff were not freed in pn533_tm_send_complete Signed-off-by: Thierry Escande <thierry.escande@linux.intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
-
Waldemar Rymarkiewicz authored
cmd is allocated in pn533_dep_link_up and passed as an arg to pn533_send_cmd_frame_async together with a complete cb. arg is passed to the cb and must be kfreed there. Signed-off-by: Waldemar Rymarkiewicz <waldemar.rymarkiewicz@tieto.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
-
Szymon Janc authored
cmd was freed in pn533_dep_link_up regardless of pn533_send_cmd_frame_async return code. Cmd is passed as argument to pn533_in_dep_link_up_complete callback and should be freed there. Signed-off-by: Szymon Janc <szymon.janc@tieto.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
-
Szymon Janc authored
In pn533_wq_cmd command was removed from list without cmd_lock held (race with pn533_send_cmd_frame_async) which could lead to list corruption. Delete command from list before releasing lock. Signed-off-by: Szymon Janc <szymon.janc@tieto.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
-
Thierry Escande authored
list_add was called with swapped parameters Signed-off-by: Thierry Escande <thierry.escande@linux.intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
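Why swapped arguments are harmful can be seen with a tiny circular doubly linked list. The sketch below reimplements a `list_add` with the same argument order as the kernel's (new entry first, list head second) for userspace illustration; `list_init` and `list_count` are hypothetical helpers, and this is not the LLCP code.

```c
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

/* Make h an empty list (it points to itself). */
void list_init(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* Insert entry right after head: new entry first, list head second. */
void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Count the entries reachable from head. */
size_t list_count(const struct list_head *head)
{
    size_t n = 0;

    for (const struct list_head *p = head->next; p != head; p = p->next)
        n++;
    return n;
}
```

Calling `list_add(&head, &entry)` instead of `list_add(&entry, &head)` splices the head into the new entry's empty list, so entries that were already on the list become unreachable from the head.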
-
-
Sujith Manoharan authored
Commit "ath9k: improve suspend/resume reliability" broke ath9k_htc and bringing up the device would hang indefinitely. Fix this. Cc: stable@vger.kernel.org Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
-
- 18 Nov, 2012 9 commits
-
-
Francois Romieu authored
Leftover of 57d6d456 ("sis900: stop using net_device.{base_addr, irq} and convert to __iomem."). It is needed for suspend / resume to work. Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Tested-by: Jan Janssen <medhefgo@web.de> Cc: Daniele Venzano <venza@brownhat.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Al Viro authored
If the FAN_Q_OVERFLOW bit is set in event->mask, the fanotify event metadata will not contain a valid file descriptor, but copy_event_to_user() didn't check for that and unconditionally did an fd_install() on the file descriptor, which in turn caused a BUG_ON() in __fd_install(). Introduced by commit 352e3b24 ("fanotify: sanitize failure exits in copy_event_to_user()") Mea culpa - missed that path ;-/ Reported-by: Alex Shi <lkml.alex@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
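The missing check can be sketched in userspace C. `FAN_Q_OVERFLOW` and `FAN_NOFD` carry the values from the fanotify UAPI header; everything else (`struct sketch_event`, `sketch_fd_install`, `sketch_copy_event`, `install_calls`) is hypothetical and only models the control flow, not the kernel function.

```c
#define FAN_Q_OVERFLOW 0x00004000  /* event queue overflowed */
#define FAN_NOFD       (-1)        /* marker for "no file descriptor" */

/* Hypothetical, simplified event record for this sketch. */
struct sketch_event {
    unsigned int mask;
    int fd;
};

/* Counts calls; stands in for installing a file into the fd table. */
int install_calls;

void sketch_fd_install(int fd)
{
    (void)fd;
    install_calls++;
}

/* Returns the fd that would be reported to user space. */
int sketch_copy_event(const struct sketch_event *ev)
{
    /* Overflow events carry no file, so they report FAN_NOFD. */
    int fd = (ev->mask & FAN_Q_OVERFLOW) ? FAN_NOFD : ev->fd;

    /* Only install when there is a real descriptor to install. */
    if (fd != FAN_NOFD)
        sketch_fd_install(fd);
    return fd;
}
```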
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds authored
Pull misc VFS fixes from Al Viro: "Remove a bogus BUG_ON() that can trigger spuriously + alpha bits of do_mount() constification I'd missed during the merge window." This pull request came in a week ago, I missed it for some reason.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  kill bogus BUG_ON() in do_close_on_exec()
  missing const in alpha callers of do_mount()
-
git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
Linus Torvalds authored
Pull m68k fix from Geert Uytterhoeven: "This is a bug fix for asm constraints that affect sending RT signals, also destined for -stable."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k: fix sigset_t accessor functions
-
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Linus Torvalds authored
Pull last minute GPIO fixes from Linus Walleij:
 - Disable blinking on the Orion GPIO driver
 - Two Kconfig-style fixes to avoid broken builds
* tag 'gpio-fixes-for-v3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio-mcp23s08: Build I2C support even when CONFIG_I2C=m
  gpio: adnp: Depend on OF_GPIO instead of OF
  mvebu-gpio: Disable blinking when enabling a GPIO for output
-
git://oss.sgi.com/xfs/xfs
Linus Torvalds authored
Pull xfs bugfixes from Ben Myers:
 - fix attr tree double split corruption
 - fix broken error handling in xfs_vm_writepage
 - drop buffer io reference when a bad bio is built
* tag 'for-linus-v3.7-rc7' of git://oss.sgi.com/xfs/xfs:
  xfs: drop buffer io reference when a bad bio is built
  xfs: fix broken error handling in xfs_vm_writepage
  xfs: fix attr tree double split corruption
-
git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
Linus Torvalds authored
Pull libata fixes from Jeff Garzik: "If you were going to shoot me for not sending these earlier, you would be right. -rc6 beat me by ~2 hours it seems, and they really should have gone out long before that. These have been in libata-dev.git for a day or so (unfortunately linux-next is on vacation). The main one is #1, with the others being minor bits. #1 has multiple tested-by, and can be considered a regression fix IMO.
 1) Fix ACPI oops: https://bugzilla.kernel.org/show_bug.cgi?id=48211
 2) Temporary WARN_ONCE() debugging patch for further ACPI debugging. The code already oopses here, and so this merely gives slightly better info. Related to https://bugzilla.kernel.org/show_bug.cgi?id=49151 which has been bisected down to a patch that _exposes_ a latent bug, but said bisection target does not actually appear to be the root cause itself.
 3) sata_svw: fix longstanding error recovery bug, which was preventing kdump, by adding missing DMA-start bit check. Core code was already checking DMA-start, but ancillary, less-used routines were not. Fixed.
 4) sata_highbank: fix minor __init/__devinit warning
 5) Fix minor warning, if CONFIG_PM is set, but CONFIG_PM_SLEEP is not set
 6) pata_arasan: proper functioning requires clock setting"
* tag 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  [libata] PM callbacks should be conditionally compiled on CONFIG_PM_SLEEP
  sata_svw: check DMA start bit before reset
  libata debugging: Warn when unable to find timing descriptor based on xfer_mode
  sata_highbank: mark ahci_highbank_probe as __devinit
  pata_arasan: Initialize cf clock to 166MHz
  libata-acpi: Fix NULL ptr derference in ata_acpi_dev_handle
-
Emmanuel Grumbach authored
This can happen when we suddenly shut down an interface. Cc: stable@vger.kernel.org Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
-
Andreas Schwab authored
The sigaddset/sigdelset/sigismember functions that are implemented with bitfield insn cannot allow the sigset argument to be placed in a data register since the sigset is wider than 32 bits. Remove the "d" constraint from the asm statements. The effect of the bug is that sending RT signals does not work, the signal number is truncated modulo 32. Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: stable@vger.kernel.org
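The effect of the truncation can be modelled in portable C. This sketch assumes a 64-signal set stored as two 32-bit words and ignores the m68k bitfield instructions and bit numbering entirely; `struct sigset64`, `set_full`, and `set_truncated` are hypothetical names used only to show how a signal number wraps modulo 32 when just one 32-bit word is addressable.

```c
#include <stdint.h>

#define NSIG_WORDS 2  /* 64 signals held in two 32-bit words */

struct sigset64 {
    uint32_t sig[NSIG_WORDS];
};

/* Correct behaviour: the whole set is addressable. */
void set_full(struct sigset64 *s, int signo)
{
    unsigned int bit = (unsigned int)signo - 1;

    s->sig[bit / 32] |= (uint32_t)1 << (bit % 32);
}

/*
 * Model of the bug: only a single 32-bit register's worth of the set
 * is reachable, so the bit index wraps modulo 32.
 */
void set_truncated(struct sigset64 *s, int signo)
{
    unsigned int bit = ((unsigned int)signo - 1) % 32;

    s->sig[0] |= (uint32_t)1 << bit;
}
```

In this model, adding signal 34 through the truncated path sets the bit that belongs to signal 2, which matches the description of RT signals not working.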
-
- 17 Nov, 2012 8 commits
-
-
Daniel M. Weeks authored
The driver has both SPI and I2C pieces. The appropriate pieces are built based on whether SPI and/or I2C is/are enabled. However, it was only checking if I2C was built-in, never if it was built as a module. This patch checks for either since building both this driver and I2C as modules is possible. Signed-off-by: Daniel M. Weeks <dan@danweeks.net> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
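The difference between the two checks can be reproduced with the preprocessor alone. Kconfig defines `CONFIG_FOO` for a built-in option and `CONFIG_FOO_MODULE` for a modular one; the sketch below simulates `CONFIG_I2C=m` by hand. The macro and function names other than the `CONFIG_` symbols are hypothetical, and the actual patch may express the test differently (for example with the kernel's `IS_ENABLED()` macro).

```c
/* Simulate a .config with CONFIG_I2C=m: only the _MODULE symbol exists. */
#define CONFIG_I2C_MODULE 1

/* Old check: true only when I2C is built in. */
#ifdef CONFIG_I2C
#define I2C_BUILTIN_ONLY_CHECK 1
#else
#define I2C_BUILTIN_ONLY_CHECK 0
#endif

/* New check: true when I2C is built in or built as a module. */
#if defined(CONFIG_I2C) || defined(CONFIG_I2C_MODULE)
#define I2C_BUILTIN_OR_MODULE_CHECK 1
#else
#define I2C_BUILTIN_OR_MODULE_CHECK 0
#endif

int i2c_support_old(void)
{
    return I2C_BUILTIN_ONLY_CHECK;
}

int i2c_support_new(void)
{
    return I2C_BUILTIN_OR_MODULE_CHECK;
}
```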
-
Thierry Reding authored
The driver accesses the of_node field of struct gpio_chip, which is only available if OF_GPIO is selected. This solves a build issue on SPARC which conflicts with OF_GPIO and therefore does not provide this field. Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
-
Jamie Lentin authored
The plat-orion GPIO driver would disable any pin blinking whenever using a pin for output. Do the same here, as a blinking LED will continue to blink regardless of what the GPIO pin level is. Signed-off-by: Jamie Lentin <jm@lentin.co.uk> Acked-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
-
Dave Chinner authored
Error handling in xfs_buf_ioapply_map() does not handle IO reference counts correctly. We increment the b_io_remaining count before building the bio, but then fail to decrement it in the failure case. This leads to the buffer never running IO completion and releasing the reference that the IO holds, so at unmount we can leak the buffer. This leak is captured by this assert failure during unmount: XFS: Assertion failed: atomic_read(&pag->pag_ref) == 0, file: fs/xfs/xfs_mount.c, line: 273 This is not a new bug - the b_io_remaining accounting has had this problem for a long, long time - it's just very hard to get a zero length bio being built by this code... Further, the buffer IO error can be overwritten on a multi-segment buffer by subsequent bio completions for partial sections of the buffer. Hence we should only set the buffer error status if the buffer is not already carrying an error status. This ensures that a partial IO error on a multi-segment buffer will not be lost. This part of the problem is a regression, however. cc: <stable@vger.kernel.org> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
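The "first error wins" rule from the second half of this message can be sketched as follows. `struct sketch_buf` and `sketch_bio_done` are hypothetical and only model the bookkeeping; they are not the XFS buffer code.

```c
/* Hypothetical, simplified buffer state for this sketch. */
struct sketch_buf {
    int error;         /* 0 means no error recorded yet */
    int io_remaining;  /* outstanding I/O references */
};

/*
 * Completion of one segment of a multi-segment buffer. The error is
 * recorded only if the buffer is not already carrying one, so a later
 * successful (or differently failing) segment cannot overwrite it.
 * The I/O reference is dropped on every completion.
 */
void sketch_bio_done(struct sketch_buf *bp, int error)
{
    if (!bp->error)
        bp->error = error;
    bp->io_remaining--;
}
```

With three segments completing as failure, success, failure, the buffer keeps the first failure code and ends with no outstanding references.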
-
Dave Chinner authored
When we shut down the filesystem, it might first be detected in writeback when we are allocating an inode size transaction. This happens after we have moved all the pages into the writeback state and unlocked them. Unfortunately, if we fail to set up the transaction we then abort writeback and try to invalidate the current page. This then triggers a BUG() in block_invalidatepage() because we are trying to invalidate an unlocked page. Fixing this is a bit of a chicken and egg problem - we can't allocate the transaction until we've clustered all the pages into the IO and we know the size of it (i.e. whether the last block of the IO is beyond the current EOF or not). However, we don't want to hold pages locked for long periods of time, especially while we lock other pages to cluster them into the write. To fix this, we need to make a clear delineation in writeback where errors can only be handled by IO completion processing. That is, once we have marked a page for writeback and unlocked it, we have to report errors via IO completion because we've already started the IO. We may not have submitted any IO, but we've changed the page state to indicate that it is under IO so we must now use the IO completion path to report errors. To do this, add an error field to xfs_submit_ioend() to pass it the error that occurred during the building of the ioend chain. When this is non-zero, mark each ioend with the error and call xfs_finish_ioend() directly rather than building bios. This will immediately push the ioends through completion processing with the error that has occurred. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
-
Dave Chinner authored
In certain circumstances, a double split of an attribute tree is needed to insert or replace an attribute. In rare situations, this can go wrong, leaving the attribute tree corrupted. In this case, the attr being replaced is the last attr in a leaf node, and the replacement is larger so doesn't fit in the same leaf node.

We have the initial condition of a node format attribute btree with two leaves at index 1 and 2. Call them L1 and L2. The leaf L1 is completely full, there is not a single byte of free space in it. L2 is mostly empty. The attribute being replaced - call it X - is the last attribute in L1.

The way an attribute replace is executed is that the replacement attribute - call it Y - is first inserted into the tree, but has an INCOMPLETE flag set on it so that list traversals ignore it. Once this transaction is committed, a second transaction is run to atomically mark Y as COMPLETE and X as INCOMPLETE, so that a traversal will now find Y and skip X. Once that transaction is committed, attribute X is then removed.

So, the initial condition is:

    +--------+     +--------+
    |   L1   |     |   L2   |
    | fwd: 2 |---->| fwd: 0 |
    | bwd: 0 |<----| bwd: 1 |
    | fsp: 0 |     | fsp: N |
    |--------|     |--------|
    | attr A |     | attr 1 |
    |--------|     |--------|
    | attr B |     | attr 2 |
    |--------|     |--------|
    ..........     ..........
    |--------|     |--------|
    | attr X |     | attr n |
    +--------+     +--------+

So now we go to replace X, and see that L1:fsp = 0 - it is full so we can't insert Y in the same leaf. So we record the location of attribute X so we can track it for later use, then we split L1 into L1 and L3 and rebalance across the two leaves. We end with:

    +--------+     +--------+     +--------+
    |   L1   |     |   L3   |     |   L2   |
    | fwd: 3 |---->| fwd: 2 |---->| fwd: 0 |
    | bwd: 0 |<----| bwd: 1 |<----| bwd: 3 |
    | fsp: M |     | fsp: J |     | fsp: N |
    |--------|     |--------|     |--------|
    | attr A |     | attr X |     | attr 1 |
    |--------|     +--------+     |--------|
    | attr B |                    | attr 2 |
    |--------|                    |--------|
    ..........                    ..........
    |--------|                    |--------|
    | attr W |                    | attr n |
    +--------+                    +--------+

And we track that the original attribute is now at L3:0.

We then try to insert Y into L1 again, and find that there isn't enough room because the new attribute is larger than the old one. Hence we have to split again to make room for Y. We end up with this:

    +--------+     +--------+     +--------+     +--------+
    |   L1   |     |   L4   |     |   L3   |     |   L2   |
    | fwd: 4 |---->| fwd: 3 |---->| fwd: 2 |---->| fwd: 0 |
    | bwd: 0 |<----| bwd: 1 |<----| bwd: 4 |<----| bwd: 3 |
    | fsp: M |     | fsp: J |     | fsp: J |     | fsp: N |
    |--------|     |--------|     |--------|     |--------|
    | attr A |     | attr Y |     | attr X |     | attr 1 |
    |--------|     + INCOMP +     +--------+     |--------|
    | attr B |     +--------+                    | attr 2 |
    |--------|                                   |--------|
    ..........                                   ..........
    |--------|                                   |--------|
    | attr W |                                   | attr n |
    +--------+                                   +--------+

And now we have the new (incomplete) attribute @ L4:0, and the original attribute at L3:0. At this point, the first transaction is committed, and we move to the flipping of the flags.

This is where we are supposed to end up with this:

    +--------+     +--------+     +--------+     +--------+
    |   L1   |     |   L4   |     |   L3   |     |   L2   |
    | fwd: 4 |---->| fwd: 3 |---->| fwd: 2 |---->| fwd: 0 |
    | bwd: 0 |<----| bwd: 1 |<----| bwd: 4 |<----| bwd: 3 |
    | fsp: M |     | fsp: J |     | fsp: J |     | fsp: N |
    |--------|     |--------|     |--------|     |--------|
    | attr A |     | attr Y |     | attr X |     | attr 1 |
    |--------|     +--------+     + INCOMP +     |--------|
    | attr B |                    +--------+     | attr 2 |
    |--------|                                   |--------|
    ..........                                   ..........
    |--------|                                   |--------|
    | attr W |                                   | attr n |
    +--------+                                   +--------+

But that doesn't happen properly - the attribute tracking indexes are not pointing to the right locations. What we end up with is both the old attribute to be removed pointing at L4:0 and the new attribute at L4:1. On a debug kernel, this assert fails like so:

    XFS: Assertion failed: args->index2 < be16_to_cpu(leaf2->hdr.count), file: fs/xfs/xfs_attr_leaf.c, line: 2725

because the new attribute location does not exist. On a production kernel, this goes unnoticed and the code proceeds ahead merrily and removes L4 because it thinks that is the block that is no longer needed. This leaves the hash index node pointing to entries L1, L4 and L2, but only blocks L1, L3 and L2 exist. Further, the leaf level sibling list is L1 <-> L4 <-> L2, but L4 is now free space, and so everything is busted. This corruption is caused by the removal of the old attribute triggering a join - it joins everything correctly but then frees the wrong block.

xfs_repair will report something like:

    bad sibling back pointer for block 4 in attribute fork for inode 131
    problem with attribute contents in inode 131
    would clear attr fork
    bad nblocks 8 for inode 131, would reset to 3
    bad anextents 4 for inode 131, would reset to 0

The problem lies in the assignment of the old/new blocks for tracking purposes when the double leaf split occurs. The first split tries to place the new attribute inside the current leaf (i.e. "inleaf == true") and moves the old attribute (X) to the new block. This sets up the old block/index to L1:X, and newly allocated block to L3:0. It then moves attr X to the new block and tries to insert attr Y at the old index. That fails, so it splits again.

With the second split, the rebalance ends up placing the new attr in the second new block - L4:0 - and this is where the code goes wrong. What it does is set both the new and old block index to the second new block. Hence it inserts attr Y at the right place (L4:0) but overwrites the current location of the attr to replace that is held in the new block index (currently L3:0). It overwrites it with L4:1 - the index we later assert fail on.

Hopefully this table will show this in a format that is a bit easier to understand:

    Split           old attr index          new attr index
                    vanilla  patched        vanilla  patched
    before 1st      L1:26    L1:26          N/A      N/A
    after 1st       L3:0     L3:0           L1:26    L1:26
    after 2nd       L4:0     L3:0           L4:1     L4:0
                    ^^^^                    ^^^^
                    wrong                   wrong

The fix is surprisingly simple, for all this analysis - just stop the rebalance on the out-of leaf case from overwriting the new attr index - it's already correct for the double split case.

Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Mark Tinguely <tinguely@sgi.com> Signed-off-by: Ben Myers <bpm@sgi.com>
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds authored
Pull KVM fix from Marcelo Tosatti: "A correction for oops on module init with older Intel hosts."
* git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Fix invalid secondary exec controls in vmx_cpuid_update()
-
- 16 Nov, 2012 4 commits
-
-
Linus Torvalds authored
Merge misc fixes from Andrew Morton.
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (12 patches)
  revert "mm: fix-up zone present pages"
  tmpfs: change final i_blocks BUG to WARNING
  tmpfs: fix shmem_getpage_gfp() VM_BUG_ON
  mm: highmem: don't treat PKMAP_ADDR(LAST_PKMAP) as a highmem address
  mm: revert "mm: vmscan: scale number of pages reclaimed by reclaim/compaction based on failures"
  rapidio: fix kernel-doc warnings
  swapfile: fix name leak in swapoff
  memcg: fix hotplugged memory zone oops
  mips, arc: fix build failure
  memcg: oom: fix totalpages calculation for memory.swappiness==0
  mm: fix build warning for uninitialized value
  mm: add anon_vma_lock to validate_mm()
-
Andrew Morton authored
Revert commit 7f1290f2 ("mm: fix-up zone present pages") That patch tried to fix an issue when calculating zone->present_pages, but it caused a regression on 32bit systems with HIGHMEM. With that change, reset_zone_present_pages() resets all zone->present_pages to zero, and fixup_zone_present_pages() is called to recalculate zone->present_pages when the boot allocator frees core memory pages into buddy allocator. Because highmem pages are not freed by bootmem allocator, all highmem zones' present_pages become zero. Various options for improving the situation are being discussed but for now, let's return to the 3.6 code. Cc: Jianguo Wu <wujianguo@huawei.com> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Petr Tesarik <ptesarik@suse.cz> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Minchan Kim <minchan.kim@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Acked-by: David Rientjes <rientjes@google.com> Tested-by: Chris Clayton <chris2553@googlemail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Hugh Dickins authored
Under a particular load on one machine, I have hit shmem_evict_inode()'s BUG_ON(inode->i_blocks), enough times to narrow it down to a particular race between swapout and eviction. It comes from the "if (freed > 0)" asymmetry in shmem_recalc_inode(), and the lack of coherent locking between mapping's nrpages and shmem's swapped count. There's a window in shmem_writepage(), between lowering nrpages in shmem_delete_from_page_cache() and then raising swapped count, when the freed count appears to be +1 when it should be 0, and then the asymmetry stops it from being corrected with -1 before hitting the BUG. One answer is coherent locking: using tree_lock throughout, without info->lock; reasonable, but the raw_spin_lock in percpu_counter_add() on used_blocks makes that messier than expected. Another answer may be a further effort to eliminate the weird shmem_recalc_inode() altogether, but previous attempts at that failed. So far undecided, but for now change the BUG_ON to WARN_ON: in usual circumstances it remains a useful consistency check. Signed-off-by: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Hugh Dickins authored
Fuzzing with trinity hit the "impossible" VM_BUG_ON(error) (which Fedora has converted to WARNING) in shmem_getpage_gfp():

    WARNING: at mm/shmem.c:1151 shmem_getpage_gfp+0xa5c/0xa70()
    Pid: 29795, comm: trinity-child4 Not tainted 3.7.0-rc2+ #49
    Call Trace:
      warn_slowpath_common+0x7f/0xc0
      warn_slowpath_null+0x1a/0x20
      shmem_getpage_gfp+0xa5c/0xa70
      shmem_fault+0x4f/0xa0
      __do_fault+0x71/0x5c0
      handle_pte_fault+0x97/0xae0
      handle_mm_fault+0x289/0x350
      __do_page_fault+0x18e/0x530
      do_page_fault+0x2b/0x50
      page_fault+0x28/0x30
      tracesys+0xe1/0xe6

Thanks to Johannes for pointing to truncation: free_swap_and_cache() only does a trylock on the page, so the page lock we've held since before confirming swap is not enough to protect against truncation. What cleanup is needed in this case? Just delete_from_swap_cache(), which takes care of the memcg uncharge. Signed-off-by: Hugh Dickins <hughd@google.com> Reported-by: Dave Jones <davej@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-