- 03 Dec, 2010 1 commit
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hidLinus Torvalds authored
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: HID: length resolution should be reported units/mm HID: add support for F430 Force Feedback Wheel HID: egalax: Use kzalloc HID: Remove KERN_DEBUG from dbg_hid use Manually fixed trivial conflict in drivers/hid/hid-input.c (due to removal of KERN_DEBUG from dbg_hid use clashing with new keycode interface switch)
-
- 02 Dec, 2010 33 commits
-
-
Nelson Elhage authored
If a user manages to trigger an oops with fs set to KERNEL_DS, fs is not otherwise reset before do_exit(). do_exit may later (via mm_release in fork.c) do a put_user to a user-controlled address, potentially allowing a user to leverage an oops into a controlled write into kernel memory. This is only triggerable in the presence of another bug, but this potentially turns a lot of DoS bugs into privilege escalations, so it's worth fixing. I have proof-of-concept code which uses this bug along with CVE-2010-3849 to write a zero to an arbitrary kernel address, so I've tested that this is not theoretical. A more logical place to put this fix might be when we know an oops has occurred, before we call do_exit(), but that would involve changing every architecture, in multiple places. Let's just stick it in do_exit instead. [akpm@linux-foundation.org: update code comment] Signed-off-by: Nelson Elhage <nelhage@ksplice.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
KOSAKI Motohiro authored
commit 62b61f61 ("ksm: memory hotremove migration only") caused the following new lockdep warning. ======================================================= [ INFO: possible circular locking dependency detected ] ------------------------------------------------------- bash/1621 is trying to acquire lock: ((memory_chain).rwsem){.+.+.+}, at: [<ffffffff81079339>] __blocking_notifier_call_chain+0x69/0xc0 but task is already holding lock: (ksm_thread_mutex){+.+.+.}, at: [<ffffffff8113a3aa>] ksm_memory_callback+0x3a/0xc0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (ksm_thread_mutex){+.+.+.}: [<ffffffff8108b70a>] lock_acquire+0xaa/0x140 [<ffffffff81505d74>] __mutex_lock_common+0x44/0x3f0 [<ffffffff81506228>] mutex_lock_nested+0x48/0x60 [<ffffffff8113a3aa>] ksm_memory_callback+0x3a/0xc0 [<ffffffff8150c21c>] notifier_call_chain+0x8c/0xe0 [<ffffffff8107934e>] __blocking_notifier_call_chain+0x7e/0xc0 [<ffffffff810793a6>] blocking_notifier_call_chain+0x16/0x20 [<ffffffff813afbfb>] memory_notify+0x1b/0x20 [<ffffffff81141b7c>] remove_memory+0x1cc/0x5f0 [<ffffffff813af53d>] memory_block_change_state+0xfd/0x1a0 [<ffffffff813afd62>] store_mem_state+0xe2/0xf0 [<ffffffff813a0bb0>] sysdev_store+0x20/0x30 [<ffffffff811bc116>] sysfs_write_file+0xe6/0x170 [<ffffffff8114f398>] vfs_write+0xc8/0x190 [<ffffffff8114fc14>] sys_write+0x54/0x90 [<ffffffff810028b2>] system_call_fastpath+0x16/0x1b -> #0 ((memory_chain).rwsem){.+.+.+}: [<ffffffff8108b5ba>] __lock_acquire+0x155a/0x1600 [<ffffffff8108b70a>] lock_acquire+0xaa/0x140 [<ffffffff81506601>] down_read+0x51/0xa0 [<ffffffff81079339>] __blocking_notifier_call_chain+0x69/0xc0 [<ffffffff810793a6>] blocking_notifier_call_chain+0x16/0x20 [<ffffffff813afbfb>] memory_notify+0x1b/0x20 [<ffffffff81141f1e>] remove_memory+0x56e/0x5f0 [<ffffffff813af53d>] memory_block_change_state+0xfd/0x1a0 [<ffffffff813afd62>] store_mem_state+0xe2/0xf0 [<ffffffff813a0bb0>] sysdev_store+0x20/0x30 [<ffffffff811bc116>] sysfs_write_file+0xe6/0x170 [<ffffffff8114f398>] vfs_write+0xc8/0x190 [<ffffffff8114fc14>] sys_write+0x54/0x90 [<ffffffff810028b2>] system_call_fastpath+0x16/0x1b But it's a false positive. Both memory_chain.rwsem and ksm_thread_mutex have an outer lock (mem_hotplug_mutex). So they cannot deadlock. Thus, This patch annotate ksm_thread_mutex is not deadlock source. [akpm@linux-foundation.org: update comment, from Hugh] Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Hugh Dickins <hughd@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
KOSAKI Motohiro authored
Presently hwpoison is using lock_system_sleep() to prevent a race with memory hotplug. However lock_system_sleep() is a no-op if CONFIG_HIBERNATION=n. Therefore we need a new lock. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Suggested-by: Hugh Dickins <hughd@google.com> Acked-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrew Morton authored
->releasepage() does not remove the page from the mapping. Acked-by: Neil Brown <neilb@suse.de> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Jeremy Fitzhardinge authored
On stock 2.6.37-rc4, running: # mount lilith:/export /mnt/lilith # find /mnt/lilith/ -type f -print0 | xargs -0 file crashes the machine fairly quickly under Xen. Often it results in oops messages, but the couple of times I tried just now, it just hung quietly and made Xen print some rude messages: (XEN) mm.c:2389:d80 Bad type (saw 7400000000000001 != exp 3000000000000000) for mfn 1d7058 (pfn 18fa7) (XEN) mm.c:964:d80 Attempt to create linear p.t. with write perms (XEN) mm.c:2389:d80 Bad type (saw 7400000000000010 != exp 1000000000000000) for mfn 1d2e04 (pfn 1d1fb) (XEN) mm.c:2965:d80 Error while pinning mfn 1d2e04 Which means the domain tried to map a pagetable page RW, which would allow it to map arbitrary memory, so Xen stopped it. This is because vm_unmap_ram() left some pages mapped in the vmalloc area after NFS had finished with them, and those pages got recycled as pagetable pages while still having these RW aliases. Removing those mappings immediately removes the Xen-visible aliases, and so it has no problem with those pages being reused as pagetable pages. Deferring the TLB flush doesn't upset Xen because it can flush the TLB itself as needed to maintain its invariants. When unmapping a region in the vmalloc space, clear the ptes immediately. There's no point in deferring this because there's no amortization benefit. The TLBs are left dirty, and they are flushed lazily to amortize the cost of the IPIs. This specific motivation for this patch is an oops-causing regression since 2.6.36 when using NFS under Xen, triggered by the NFS client's use of vm_map_ram() introduced in 56e4ebf8 ("NFS: readdir with vmapped pages") . XFS also uses vm_map_ram() and could cause similar problems. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Bryan Schumaker <bjschuma@netapp.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Alex Elder <aelder@sgi.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andres Salomon authored
The AMD Geode CS5536 Companion Device Silicon Revision B1 Specification Update mentions the follow as issue #36: "Atomic write transactions to the atomic GPIO High Bank Feature Bit registers should only affect the bits selected [...]" "after Suspend, an atomic write transaction [...] will clear all non-selected bits of the accessed register." In other words, writing to the high bank for a single GPIO bit will clear every other GPIO bit (but only sometimes after a suspend). The workaround described is obvious and simple; do a read-modify-write. This patch does that, and documents why we're doing it. Signed-off-by: Andres Salomon <dilinger@queued.net> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Frederic Weisbecker authored
reiserfs_acl_chmod() can be called by reiserfs_set_attr() and then take the reiserfs lock a second time. Thereafter it may call journal_begin() that definitely requires the lock not to be nested in order to release it before taking the journal mutex because the reiserfs lock depends on the journal mutex already. So, aviod nesting the lock in reiserfs_acl_chmod(). Reported-by: Pawel Zawora <pzawora@gmail.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Tested-by: Pawel Zawora <pzawora@gmail.com> Cc: Jeff Mahoney <jeffm@suse.com> Cc: <stable@kernel.org> [2.6.32.x+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Johannes Berg authored
It's not useful to build LED triggers when there's no LEDs that can be triggered by them. Therefore, fix up the dependencies so that this cannot happen, and fix a few users that select triggers to depend on LEDS_CLASS as well (there is also one user that also selects LEDS_CLASS, which is OK). Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Tested-by: Ingo Molnar <mingo@elte.hu> Cc: Arnd Hannemann <arnd@arndnet.de> Cc: Michal Hocko <mhocko@suse.cz> Cc: Richard Purdie <rpurdie@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Wu Fengguang authored
The nr_dirty_[background_]threshold fields are misplaced before the numa_* fields, and users will read strange values. This is the right order. Before patch, nr_dirty_background_threshold will read as 0 (the value from numa_miss). numa_hit 128501 numa_miss 0 numa_foreign 0 numa_interleave 7388 numa_local 128501 numa_other 0 nr_dirty_threshold 144291 nr_dirty_background_threshold 72145 Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Cc: Michael Rubin <mrubin@google.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by: Minchan Kim <minchan.kim@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Zeng Zhaoming authored
find_task_by_vpid() should be protected by rcu_read_lock(), to prevent free_pid() reclaiming pid. Signed-off-by: Zeng Zhaoming <zengzm.kernel@gmail.com> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Christoph Lameter <cl@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Dean Nelson authored
Have hugetlb_fault() call unlock_page(page) only if it had previously called lock_page(page). Setting CONFIG_DEBUG_VM=y and then running the libhugetlbfs test suite, resulted in the tripping of VM_BUG_ON(!PageLocked(page)) in unlock_page() having been called by hugetlb_fault() when page == pagecache_page. This patch remedied the problem. Signed-off-by: Dean Nelson <dnelson@redhat.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6Linus Torvalds authored
* 'staging-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6: (27 commits) Staging: rt2870: Add USB ID for Buffalo Airstation WLI-UC-GN staging: easycap needs smp_lock.h, fixes build error Staging: batman-adv: ensure that eth_type_trans gets linear memory Staging: batman-adv: Don't remove interface with spinlock held staging: brcm80211: updated maintainers contact information staging: fix winbond build, needs delay.h Staging: line6: fix up my fixup for some sysfs attribute permissions Staging: zram: fix up my fixup for some sysfs attribute permissions Staging: udlfb: fix up my fixup for some sysfs attribute permissions Staging: samsung-laptop: fix up my fixup for some sysfs attribute permissions Staging: iio: adis16220: fix up my fixup for some sysfs attribute permissions Staging: frontier: fix up my fixup for some sysfs attribute permissions Staging: asus_oled: fix up my fixup for some sysfs attribute permissions staging: spectra: fix build error Staging: intel_sst: fix memory leak Staging: rtl8712: signedness bug in init staging: rtl8187se: Change panic to warn when RF switch turned off staging: comedi: fix memory leak Staging: quickstart: free after input_unregister_device() Staging: speakup: free after input_unregister_device() ...
-
Linus Torvalds authored
Merge branch 'driver-core-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6 * 'driver-core-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: uio: Change mail address of Hans J. Koch driver core: prune docs about device_interface driver core: the development tree has switched to git
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6Linus Torvalds authored
* 'tty-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6: serial: mfd: adjust the baud rate setting TTY: open/hangup race fixup TTY: don't allow reopen when ldisc is changing NET: wan/x25, fix ldisc->open retval TTY: ldisc, fix open flag handling serial8250: Mark console as CON_ANYTIME
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6Linus Torvalds authored
* 'usb-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: USB: fix autosuspend bug in usb-serial USB: ehci: disable LPM and PPCD for nVidia MCP89 chips USB: serial: ftdi_sio: Vardaan USB RS422/485 converter PID added USB: yurex: add .llseek fop to file_operations USB: ftdi_sio: Add ID for RT Systems USB-29B radio cable usb: musb: do not use dma for control transfers usb: musb: gadget: fix compilation warning usb: musb: clear RXCSR_AUTOCLEAR before PIO read usb: musb: unmap dma buffer when switching to PIO xhci: Don't let the USB core disable SuperSpeed ports. xhci: Setup array of USB 2.0 and USB 3.0 ports. xhci: Fix reset-device and configure-endpoint commands
-
git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdogLinus Torvalds authored
* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog: watchdog: it8712f_wdt: add note to Kconfig watchdog: gef_wdt: include fs.h watchdog: bcm63xx_wdt: improve platform part. watchdog: iTCO_wdt: TCO Watchdog patch for Intel Patsburg PCH
-
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infinibandLinus Torvalds authored
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB: Fix information leak in marshalling code IB/pack: Remove some unused code added by the IBoE patches IB/mlx4: Fix IBoE link state IB/mlx4: Fix IBoE reported link rate mlx4_core: Workaround firmware bug in query dev cap IB/mlx4: Fix memory ordering of VLAN insertion control bits MAINTAINERS: Update NetEffect entry
-
git://oss.sgi.com/xfs/xfsLinus Torvalds authored
* 'for-linus' of git://oss.sgi.com/xfs/xfs: xfs: only run xfs_error_test if error injection is active xfs: avoid moving stale inodes in the AIL xfs: delayed alloc blocks beyond EOF are valid after writeback xfs: push stale, pinned buffers on trylock failures xfs: fix failed write truncation handling.
-
git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6Linus Torvalds authored
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6: regulator: fix kernel-doc for set_consumer_device_supply regulator: enable supply regulator only when use count is zero regulator: twl-regulator - fix twlreg_set_mode regulator: lock supply in regulator enable regulator: Return proper error for regulator_register() regulator: Ensure enough delay time for enabling regulator regulator: Remove a redundant device_remove_file call in create_regulator regulator: Staticise mc13783_powermisc_rmw() regulator: regulator disable supply fix
-
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6Linus Torvalds authored
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: [media] v4l: Remove module_name argument to the v4l2_i2c_new_subdev* functions [media] v4l: Remove hardcoded module names passed to v4l2_i2c_new_subdev* (2)
-
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-clientLinus Torvalds authored
* 'rbd-sysfs' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: rbd: replace the rbd sysfs interface
-
git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6Linus Torvalds authored
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: cifs: fix parsing of hostname in dfs referrals cifs: display fsc in /proc/mounts cifs: enable fscache iff fsc mount option is used explicitly cifs: allow fsc mount option only if CONFIG_CIFS_FSCACHE is set cifs: Handle extended attribute name cifs_acl to generate cifs acl blob (try #4) cifs: Misc. cleanup in cifsacl handling [try #4] cifs: trivial comment fix for cifs_invalidate_mapping [CIFS] fs/cifs/Kconfig: CIFS depends on CRYPTO_HMAC cifs: don't take extra tlink reference in initiate_cifs_search cifs: Percolate error up to the caller during get/set acls [try #4] cifs: fix another memleak, in cifs_root_iget cifs: fix potential use-after-free in cifs_oplock_break_put
-
Wim Van Sebroeck authored
On some motherboards the it8712f watchdog does not work unless the game port was enabled. see Bug 13140. We therefor add a note to Kconfig. Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
-
Wolfram Sang & Martyn Welch authored
Add missing include "linux/fs.h". This fixes compile failure. Signed-off-by: Wolfram Sang <w.sang@pengutronix.de> Signed-off-by: Martyn Welch <martyn.welch@ge.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
-
Wim Van Sebroeck authored
* fix devinit and devexit sections * fix platform removal code so that the iounmap happens after the removal of the timer. * changes the reboot_notifier by a platform shutdown method. Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
-
Seth Heasley authored
This patch adds an additional LPC Controller DeviceID for the Intel Patsburg PCH for TCO Watchdog. Signed-off-by: Seth Heasley <seth.heasley@intel.com> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
-
Dmitry Torokhov authored
Input ABI requires reporting resolution on main axes in units per millimeter, not units per inch, so we need to convert accordingly. Tested-by: Nikolai Kondrashov <spbnick@gmail.com> Acked-by: Nikolai Kondrashov <spbnick@gmail.com> Signed-off-by: Dmitry Torokhov <dtor@mail.ru> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
-
Roland Dreier authored
-
Vasiliy Kulikov authored
ib_ucm_init_qp_attr() and ucma_init_qp_attr() pass struct ib_uverbs_qp_attr with reserved, qp_state, {ah_attr,alt_ah_attr}{reserved,->grh.reserved} fields uninitialized to copy_to_user(). This leads to leaking of contents of kernel stack memory to userspace. Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Or Gerlitz authored
Remove unused functions added by commit ff7f5aab ("IB/pack: IBoE UD packet packing support"). Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
-
Eli Cohen authored
Use netif_running() and netif_carrier_ok() to report link state, exactly as is done to report Ethernet link state in sysfs. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Eli Cohen authored
The link rate is the product of the link speed in the link width. For Etherent ports the rate is 10G, so we use 1 for the width and 4 for speed to get the correct rate. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Eli Cohen authored
ConnectX firmware is supposed to report the number blue flame registers per page as log2 of the value. However, due to a firmware bug, it reports actual number. This patch works around this by checking if the number of registers calculated fits within a page. If it does not, we use 8 registers per page. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
- 01 Dec, 2010 6 commits
-
-
Yehuda Sadeh authored
The new interface creates directories per mapped image and under each it creates a subdir per available snapshot. This allows keeping a cleaner interface within the sysfs guidelines. The ABI documentation was updated too. Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Sage Weil <sage@newdream.net>
-
Eli Cohen authored
We must fully update the control segment before marking it as valid, so that hardware doesn't start executing it before we're ready. Signed-off-by: Eli Cohen <eli@mellanox.co.il> [ Move VLAN control bit setting to before wmb(). - Roland ] Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Chien Tung authored
Correct web link as www.neteffect.com is no longer valid. Remove Chien Tung as maintainer. I am moving on to other responsibilities at Intel. Thanks for all the fish. Signed-off-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
-
Dave Chinner authored
Recent tests writing lots of small files showed the flusher thread being CPU bound and taking a long time to do allocations on a debug kernel. perf showed this as the prime reason: samples pcnt function DSO _______ _____ ___________________________ _________________ 224648.00 36.8% xfs_error_test [kernel.kallsyms] 86045.00 14.1% xfs_btree_check_sblock [kernel.kallsyms] 39778.00 6.5% prandom32 [kernel.kallsyms] 37436.00 6.1% xfs_btree_increment [kernel.kallsyms] 29278.00 4.8% xfs_btree_get_rec [kernel.kallsyms] 27717.00 4.5% random32 [kernel.kallsyms] Walking btree blocks during allocation checking them requires each block (a cache hit, so no I/O) call xfs_error_test(), which then does a random32() call as the first operation. IOWs, ~50% of the CPU is being consumed just testing whether we need to inject an error, even though error injection is not active. Kill this overhead when error injection is not active by adding a global counter of active error traps and only calling into xfs_error_test when fault injection is active. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
-
Dave Chinner authored
When an inode has been marked stale because the cluster is being freed, we don't want to (re-)insert this inode into the AIL. There is a race condition where the cluster buffer may be unpinned before the inode is inserted into the AIL during transaction committed processing. If the buffer is unpinned before the inode item has been committed and inserted, then it is possible for the buffer to be released and hence processthe stale inode callbacks before the inode is inserted into the AIL. In this case, we then insert a clean, stale inode into the AIL which will never get removed by an IO completion. It will, however, get reclaimed and that triggers an assert in xfs_inode_free() complaining about freeing an inode still in the AIL. This race can be avoided by not moving stale inodes forward in the AIL during transaction commit completion processing. This closes the race condition by ensuring we never insert clean stale inodes into the AIL. It is safe to do this because a dirty stale inode, by definition, must already be in the AIL. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
-
Dave Chinner authored
There is an assumption in the parts of XFS that flushing a dirty file will make all the delayed allocation blocks disappear from an inode. That is, that after calling xfs_flush_pages() then ip->i_delayed_blks will be zero. This is an invalid assumption as we may have specualtive preallocation beyond EOF and they are recorded in ip->i_delayed_blks. A flush of the dirty pages of an inode will not change the state of these blocks beyond EOF, so a non-zero deeelalloc block count after a flush is valid. The bmap code has an invalid ASSERT() that needs to be removed, and the swapext code has a bug in that while it swaps the data forks around, it fails to swap the i_delayed_blks counter associated with the fork and hence can get the block accounting wrong. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
-