  1. 04 Jun, 2009 9 commits
    • Tao Ma's avatar
      06c59bb8
    • Joel Becker's avatar
      ocfs2: Add statistics for the checksum and ecc operations. · 73be192b
      Joel Becker authored
      It would be nice to know how often we get checksum failures.  Even
      better, how many of them we can fix with the single bit ecc.  So, we add
      a statistics structure.  The structure can be installed into debugfs
      wherever the user wants.
      
      For ocfs2, we'll put it in the superblock-specific debugfs directory and
      pass it down from our higher-level functions.  The stats are only
      registered with debugfs when the filesystem supports metadata ecc.
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      73be192b
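A minimal sketch of the kind of statistics structure described above: one counter for checksum verifications, one for failures, and one for failures repaired by single-bit ECC. The struct and function names are hypothetical and are not taken from the patch. The sketch is plain userspace C without locking or debugfs registration, both of which a kernel implementation would need.

```c
#include <stdint.h>

/* Hypothetical counters; the names are illustrative, not the kernel's. */
struct blockcheck_stats {
    uint64_t checks;     /* blocks whose checksum was verified */
    uint64_t failures;   /* checksum mismatches seen */
    uint64_t recoveries; /* mismatches repaired by single-bit ECC */
};

/*
 * Record the outcome of one verification.
 * A NULL stats pointer means the caller is not tracking statistics.
 */
void blockcheck_record(struct blockcheck_stats *s, int crc_ok, int ecc_fixed)
{
    if (!s)
        return;
    s->checks++;
    if (!crc_ok) {
        s->failures++;
        if (ecc_fixed)
            s->recoveries++;
    }
}
```

Accepting a NULL pointer lets callers that have no statistics to report use the same code path, which matches the idea of passing the structure down from higher-level functions.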
    • Srinivas Eeda's avatar
      ocfs2 patch to track delayed orphan scan timer statistics · 15633a22
      Srinivas Eeda authored
      Patch to track delayed orphan scan timer statistics.
      
      Modifies ocfs2_osb_dump to print the following:
        Orphan Scan=> Local: 10  Global: 21  Last Scan: 67 seconds ago
      Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      15633a22
    • Srinivas Eeda's avatar
      ocfs2: timer to queue scan of all orphan slots · 83273932
      Srinivas Eeda authored
      When a dentry is unlinked, the unlinking node takes an EX on the dentry lock
      before moving the dentry to the orphan directory. Other nodes that have
      this dentry in cache have a PR on the same dentry lock.  When the EX is
      requested, the other nodes flag the corresponding inode as MAYBE_ORPHANED
      during downconvert.  The inode is finally deleted when the last node to iput
      the inode sees that i_nlink==0 and the MAYBE_ORPHANED flag is set.
      
      A problem arises if a node is forced to free dentry locks because of memory
      pressure. If this happens, the node will no longer get downconvert
      notifications for the dentries that have been unlinked on another node.
      If that node is also actively using the corresponding inode and happens to
      be the one performing the last iput on it, it will fail to delete the inode
      because the MAYBE_ORPHANED flag will not be set.
      
      This patch fixes this shortcoming by introducing a periodic scan of the
      orphan directories to delete such inodes. Care has been taken to distribute
      the workload across the cluster so that no one node has to perform the task
      all the time.
      Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      83273932
    • Jan Kara's avatar
      ocfs2: Correct ordering of ip_alloc_sem and localloc locks for directories · edd45c08
      Jan Kara authored
      We use ordering ip_alloc_sem -> local alloc locks in ocfs2_write_begin().
      So change lock ordering in ocfs2_extend_dir() and ocfs2_expand_inline_dir()
      to also use this lock ordering.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Acked-by: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      edd45c08
    • Jan Kara's avatar
      ocfs2: Fix possible deadlock in quota recovery · 80d73f15
      Jan Kara authored
      In ocfs2_finish_quota_recovery() we acquired the global quota file lock and
      started recovering the local quota file. During this process we need to get
      quota structures, which calls ocfs2_dquot_acquire(), which takes the global
      quota file lock again. This second lock can block if some other node has
      requested the quota file lock in the meantime. Fix the problem by moving quota
      file locking down into the function where it is really needed.  Then dqget()
      or dqput() won't be called with the lock held.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      80d73f15
    • Jan Kara's avatar
      ocfs2: Fix possible deadlock with quotas in ocfs2_setattr() · 65bac575
      Jan Kara authored
      We called vfs_dq_transfer() with the global quota file lock held. This can
      lead to deadlocks: if vfs_dq_transfer() has to allocate a new quota structure,
      it calls ocfs2_dquot_acquire(), which tries to get the quota file lock again,
      and this can block if another node requested the lock in the meantime.

      Since we have to call vfs_dq_transfer() with a transaction already started,
      and the quota file lock ranks above the transaction start, we cannot just
      rely on ocfs2_dquot_acquire() or ocfs2_dquot_release() to get the lock
      if they need it. We fix the problem by acquiring pointers to all quota
      structures needed by vfs_dq_transfer() before calling the function.
      This way we are sure that all quota structures are properly allocated and
      that they can be freed only after we drop our references to them. Thus we
      don't need the quota file lock anywhere inside vfs_dq_transfer().
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      65bac575
    • Jan Kara's avatar
      ocfs2: Fix lock inversion in ocfs2_local_read_info() · b4c30de3
      Jan Kara authored
      This function is called with dqio_mutex held, but it has to acquire the lock
      from the global quota file, which ranks above this lock. This is not a
      deadlockable lock inversion, since this code path is taken only during mount
      when no one else can race with us, but let's clean this up to silence lockdep.

      We just drop the dqio_mutex at the beginning of the function and reacquire
      it at the end since we don't need it: no one can race with us at this moment.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      b4c30de3
    • Jan Kara's avatar
      ocfs2: Fix possible deadlock in ocfs2_global_read_dquot() · 4e8a3019
      Jan Kara authored
      It is not possible to get a read lock and then try to get the same lock for
      writing in one thread, as that can block on a downconvert being requested by
      another node, leading to deadlock. So first drop the quota lock for reading
      and only after that get it for writing.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      4e8a3019
  2. 05 May, 2009 2 commits
    • Coly Li's avatar
      ocfs2: update comments in masklog.h · 2b53bc7b
      Coly Li authored
      In the mainline ocfs2 code, the interface for masklog is in files under
      /sys/fs/o2cb/masklog, but the comments in fs/ocfs2/cluster/masklog.h
      reference the old /proc interface.  They are out of date.
      
      This patch modifies the comments in cluster/masklog.h, which also provides
      a bash script example on how to change the log mask bits.
      Signed-off-by: Coly Li <coly.li@suse.de>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      2b53bc7b
    • Tao Ma's avatar
      ocfs2: Don't printk the error when listing too many xattrs. · a46fa684
      Tao Ma authored
      Currently the kernel defines XATTR_LIST_MAX as 65536
      in include/linux/limits.h.  This is the largest buffer that is used for
      listing xattrs.
      
      But with the ocfs2 xattr tree, we actually have no limit on the number.  If
      a file has more names than can fit in the buffer, the kernel
      logs will be polluted with something like this when listing:
      
      (27738,0):ocfs2_iterate_xattr_buckets:3158 ERROR: status = -34
      (27738,0):ocfs2_xattr_tree_list_index_block:3264 ERROR: status = -34
      
      So don't print the "ERROR" message, as this is not an ocfs2 error.
      Signed-off-by: Tao Ma <tao.ma@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      a46fa684
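The idea in the commit above can be sketched in userspace C: filling a fixed-size buffer with names returns -ERANGE when they do not fit, and the caller treats that status as an expected outcome instead of logging it as an error. Both function names are hypothetical and the code is not taken from ocfs2. Status -34 in the log lines above corresponds to -ERANGE on Linux.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy NUL-terminated names back to back into buf.
 * Returns the number of bytes used, or -ERANGE if buf is too small.
 */
long copy_xattr_names(char *buf, size_t size,
                      const char *const *names, size_t count)
{
    size_t used = 0;

    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(names[i]) + 1; /* include the NUL */

        if (len > size - used)             /* used <= size always holds */
            return -ERANGE;
        memcpy(buf + used, names[i], len);
        used += len;
    }
    return (long)used;
}

/* A too-small caller buffer is not a filesystem error, so do not log it. */
int should_log_as_error(long status)
{
    return status < 0 && status != -ERANGE;
}
```

A caller would check `should_log_as_error()` before printing, so that only genuine failures such as -EIO reach the kernel log.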
  3. 02 May, 2009 29 commits
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs · b4348f32
      Linus Torvalds authored
      * 'for-linus' of git://oss.sgi.com/xfs/xfs:
        xfs: fix getbmap vs mmap deadlock
        xfs: a couple getbmap cleanups
        xfs: add more checks to superblock validation
        xfs_file_last_byte() needs to acquire ilock
      b4348f32
    • David Gibson's avatar
      Move dtc and libfdt sources from arch/powerpc/boot to scripts/dtc · 9fffb55f
      David Gibson authored
      The powerpc kernel always requires an Open Firmware like device tree
      to supply device information.  On systems without OF, this comes from
      a flattened device tree blob.  This blob is usually generated by dtc,
      a tool which compiles a text description of the device tree into the
      flattened format used by the kernel.  Sometimes, the bootwrapper makes
      small changes to the pre-compiled device tree blob (e.g. filling in
      the size of RAM).  To do this it uses the libfdt library.
      
      Because these are only used on powerpc, the code for both these tools
      is included under arch/powerpc/boot (these were imported and are
      periodically updated from the upstream dtc tree).
      
      However, the microblaze architecture, currently being prepared for
      merging to mainline also uses dtc to produce device tree blobs.  A few
      other archs have also mentioned some interest in using dtc.
      Therefore, this patch moves dtc and libfdt from arch/powerpc into
      scripts, where it can be used by any architecture.
      
      The vast bulk of this patch is a literal move, the rest is adjusting
      the various Makefiles to use dtc and libfdt correctly from their new
      locations.
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9fffb55f
    • Linus Torvalds's avatar
      Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/configfs · afc1e702
      Linus Torvalds authored
      * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/configfs:
        configfs: Fix Trivial Warning in fs/configfs/symlink.c
      afc1e702
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6 · 7b39da78
      Linus Torvalds authored
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
        ide-cd: fix REQ_QUIET tests in cdrom_decode_status
      
      Fix up trivial conflicts in include/linux/blkdev.h
      7b39da78
    • Linus Torvalds's avatar
      Merge master.kernel.org:/home/rmk/linux-2.6-arm · 2142baba
      Linus Torvalds authored
      * master.kernel.org:/home/rmk/linux-2.6-arm: (45 commits)
        [ARM] 5489/1: ARM errata: Data written to the L2 cache can be overwritten with stale data
        [ARM] 5490/1: ARM errata: Processor deadlock when a false hazard is created
        [ARM] 5487/1: ARM errata: Stale prediction on replaced interworking branch
        [ARM] 5488/1: ARM errata: Invalidation of the Instruction Cache operation can fail
        davinci: DM644x: NAND: update partitioning
        davinci: update DM644x support in preparation for more SoCs
        davinci: DM644x: rename board file
        davinci: update pin-multiplexing support
        davinci: serial: generalize for more SoCs
        davinci: DM355 IRQ Definitions
        davinci: DM646x: add interrupt number and priorities
        davinci: PSC: Clear bits in MDCTL reg before setting new bits
        davinci: gpio bugfixes
        davinci: add EDMA driver
        davinci: timers: use clk_get_rate()
        [ARM] pxa/littleton: add missing da9034 touchscreen support
        [ARM] pxa/zylonite: configure GPIO18/19 correctly, used by 2 GPIO expanders
        [ARM] pxa/zylonite: fix the issue of unused SDATA_IN_1 pin get AC97 not working
        [ARM] pxa: make ads7846 on corgi and spitz to sync on HSYNC
        [ARM] pxa: remove unused CPU_FREQ_PXA Kconfig symbol
        ...
      2142baba
    • Linus Torvalds's avatar
      Merge branch 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip · bb402c4f
      Linus Torvalds authored
      * 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip:
        x86, mce: fix boot logging logic
        x86, mce: make polling timer interval per CPU
      bb402c4f
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6 · 61bd1e85
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: (53 commits)
        [SCSI] libosd: OSD2r05: on-the-wire changes for latest OSD2 revision 5.
        [SCSI] libosd: OSD2r05: OSD_CRYPTO_KEYID_SIZE will grow 20 => 32 bytes
        [SCSI] libosd: OSD2r05: Prepare for rev5 attribute list changes
        [SCSI] libosd: fix potential ERR_PTR dereference in osd_initiator.c
        [SCSI] mpt2sas : bump driver version to 01.100.02.00
        [SCSI] mpt2sas: fix hotplug event processing
        [SCSI] mpt2sas : release diagnotic buffers prior host reset
        [SCSI] mpt2sas : Broadcast Primative AEN bug fix
        [SCSI] mpt2sas : Identify Dell series-7 adapters at driver load time
        [SCSI] mpt2sas : driver name needs to be in the MPT2IOCINFO ioctl
        [SCSI] mpt2sas : running out of message frames
        [SCSI] mpt2sas : fix oops when firmware sends large sense buffer size
        [SCSI] mpt2sas : the sanity check in base_interrupt needs to be on dword boundary
        [SCSI] mpt2sas : unique ioctl magic number
        [SCSI] fix sign extension with 1.5TB usb-storage LBD=y
        [SCSI] ipr: Fix sleeping function called with interrupts disabled
        [SCSI] fcoe: fip: add multicast filter to receive FIP advertisements.
        [SCSI] libfc: Fix compilation warnings with allmodconfig
        [SCSI] fcoe: fix spelling typos and bad comments
        [SCSI] fcoe: don't export functions that are internal to fcoe
        ...
      61bd1e85
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 8c0c3f7f
      Linus Torvalds authored
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Input: document the multi-touch (MT) protocol
        Input: add detailed multi-touch finger data report protocol
        Input: allow certain EV_ABS events to bypass all filtering
        Input: bcm5974 - add documentation for the driver
        Input: bcm5974 - augment debug information
        Input: bcm5974 - Add support for the Macbook 5 (Unibody)
        Input: bcm5974 - add quad-finger tapping
        Input: bcm5974 - prepare for a new trackpad header type
        Input: appletouch - fix DMA to/from stack buffer
        Input: wacom - fix TabletPC touch bug
        Input: lifebook - add DMI entry for Fujitsu B-2130
        Input: ALPS - add signature for Toshiba Satellite Pro M10
        Input: elantech - make sure touchpad is really in absolute mode
        Input: elantech - provide a workaround for jumpy cursor on firmware 2.34
        Input: ucb1400 - use disable_irq_nosync() in irq handler
        Input: tsc2007 - use disable_irq_nosync() in irq handler
        Input: sa1111ps2 - use disable_irq_nosync() in irq handlers
        Input: omap-keypad - use disable_irq_nosync() in irq handler
      8c0c3f7f
    • Trond Myklebust's avatar
      SUNRPC: Fix the problem of EADDRNOTAVAIL syslog floods on reconnect · f75e6745
      Trond Myklebust authored
      See http://bugzilla.kernel.org/show_bug.cgi?id=13034
      
      If the port gets into a TIME_WAIT state, then we cannot reconnect without
      binding to a new port.
      Tested-by: Petr Vandrovec <petr@vandrovec.name>
      Tested-by: Jean Delvare <khali@linux-fr.org>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f75e6745
    • Linus Torvalds's avatar
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixes · 414772fa
      Linus Torvalds authored
      * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixes:
        kbuild, modpost: Check the section flags, to catch missing "ax"/"aw"
        kbuild: fix comment in modpost.c
        kbuild: fix scripts/setlocalversion with git
        kbuild: fix Module.markers permission error under cygwin
        docs: also clean index.html
        kbuild: remove a tag file before it is regenerated
        kbuild: "make prepare" should be "make modules_prepare"
        kbuild: clean Module.markers and modules.order for out-of-tree modules
        avr32: drop unused CLEAN_FILES
      414772fa
    • Linus Torvalds's avatar
      Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 · 7e567b44
      Linus Torvalds authored
      * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2:
        ocfs2: Change repository in MAINTAINERS.
        ocfs2: Fix a missing credit when deleting from indexed directories.
        ocfs2/trivial: Remove unused variable in ocfs2_rename.
        ocfs2: Add missing iput() during error handling in ocfs2_dentry_attach_lock()
        ocfs2: Fix some printk() warnings.
        ocfs2: Fix 2 warning during ocfs2 make.
        ocfs2: Reserve 1 more cluster in expanding_inline_dir for indexed dir.
      7e567b44
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid · 020f932b
      Linus Torvalds authored
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
        HID: fix oops in hid_check_keys_pressed()
        HID: fix possible deadlock in usbhid_close()
        HID: Fix the support for apple mini aluminium keyboard
        HID: Add support for the G25 force feedback wheel in native mode
        HID: hidraw -- fix missing unlocks in unlocked_ioctl
      020f932b
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://www.linux-m32r.org/git/takata/linux-2.6_dev · 912e7796
      Linus Torvalds authored
      * 'for-linus' of git://www.linux-m32r.org/git/takata/linux-2.6_dev:
        m32r: use __stringify() macro in assembler.h
        m32r: build fix for __stringify macro
      912e7796
    • Ashutosh Naik's avatar
      ibft: fix the display of a few fields in the NIC attribute structure in sysfs · 65fd2105
      Ashutosh Naik authored
      Fix the display of a few fields in the iBFT NIC attribute structure in
      sysfs.
      
      Ensure that, if the DHCP IP address and the subnet mask for the interface
      are present in the iBFT NIC structure, the corresponding entries are
      created in the sysfs tree for the device.  This creates the additional
      entries in the tree based on the iBFT table and does not delete any
      existing entries.
      Signed-off-by: Ashutosh Naik <ashutosh.naik@gmail.com>
      Cc: Vishnu V <vishnu@chelsio.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      65fd2105
    • Andrea Righi's avatar
      mm: prevent divide error for small values of vm_dirty_bytes · 9e4a5bda
      Andrea Righi authored
      Avoid setting less than two pages for vm_dirty_bytes: this is necessary to
      avoid potential division by 0 (like the following) in get_dirty_limits().
      
      [   49.951610] divide error: 0000 [#1] PREEMPT SMP
      [   49.952195] last sysfs file: /sys/devices/pci0000:00/0000:00:01.1/host0/target0:0:0/0:0:0:0/block/sda/uevent
      [   49.952195] CPU 1
      [   49.952195] Modules linked in: pcspkr
      [   49.952195] Pid: 3064, comm: dd Not tainted 2.6.30-rc3 #1
      [   49.952195] RIP: 0010:[<ffffffff802d39a9>]  [<ffffffff802d39a9>] get_dirty_limits+0xe9/0x2c0
      [   49.952195] RSP: 0018:ffff88001de03a98  EFLAGS: 00010202
      [   49.952195] RAX: 00000000000000c0 RBX: ffff88001de03b80 RCX: 28f5c28f5c28f5c3
      [   49.952195] RDX: 0000000000000000 RSI: 00000000000000c0 RDI: 0000000000000000
      [   49.952195] RBP: ffff88001de03ae8 R08: 0000000000000000 R09: 0000000000000000
      [   49.952195] R10: ffff88001ddda9a0 R11: 0000000000000001 R12: 0000000000000001
      [   49.952195] R13: ffff88001fbc8218 R14: ffff88001de03b70 R15: ffff88001de03b78
      [   49.952195] FS:  00007fe9a435b6f0(0000) GS:ffff8800025d9000(0000) knlGS:0000000000000000
      [   49.952195] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   49.952195] CR2: 00007fe9a39ab000 CR3: 000000001de38000 CR4: 00000000000006e0
      [   49.952195] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   49.952195] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [   49.952195] Process dd (pid: 3064, threadinfo ffff88001de02000, task ffff88001ddda250)
      [   49.952195] Stack:
      [   49.952195]  ffff88001fa0de00 ffff88001f2dbd70 ffff88001f9fe800 000080b900000000
      [   49.952195]  00000000000000c0 ffff8800027a6100 0000000000000400 ffff88001fbc8218
      [   49.952195]  0000000000000000 0000000000000600 ffff88001de03bb8 ffffffff802d3ed7
      [   49.952195] Call Trace:
      [   49.952195]  [<ffffffff802d3ed7>] balance_dirty_pages_ratelimited_nr+0x1d7/0x3f0
      [   49.952195]  [<ffffffff80368f8e>] ? ext3_writeback_write_end+0x9e/0x120
      [   49.952195]  [<ffffffff802cc7df>] generic_file_buffered_write+0x12f/0x330
      [   49.952195]  [<ffffffff802cce8d>] __generic_file_aio_write_nolock+0x26d/0x460
      [   49.952195]  [<ffffffff802cda32>] ? generic_file_aio_write+0x52/0xd0
      [   49.952195]  [<ffffffff802cda49>] generic_file_aio_write+0x69/0xd0
      [   49.952195]  [<ffffffff80365fa6>] ext3_file_write+0x26/0xc0
      [   49.952195]  [<ffffffff803034d1>] do_sync_write+0xf1/0x140
      [   49.952195]  [<ffffffff80290d1a>] ? get_lock_stats+0x2a/0x60
      [   49.952195]  [<ffffffff80280730>] ? autoremove_wake_function+0x0/0x40
      [   49.952195]  [<ffffffff8030411b>] vfs_write+0xcb/0x190
      [   49.952195]  [<ffffffff803042d0>] sys_write+0x50/0x90
      [   49.952195]  [<ffffffff8022ff6b>] system_call_fastpath+0x16/0x1b
      [   49.952195] Code: 00 00 00 2b 05 09 1c 17 01 48 89 c6 49 0f af f4 48 c1 ee 02 48 89 f0 48 f7 e1 48 89 d6 31 d2 48 c1 ee 02 48 0f af 75 d0 48 89 f0 <48> f7 f7 41 8b 95 ac 01 00 00 48 89 c7 49 0f af d4 48 c1 ea 02
      [   49.952195] RIP  [<ffffffff802d39a9>] get_dirty_limits+0xe9/0x2c0
      [   49.952195]  RSP <ffff88001de03a98>
      [   50.096523] ---[ end trace 008d7aa02f244d7b ]---
      Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9e4a5bda
    • Andrew Morton's avatar
      vmscan: avoid multiplication overflow in shrink_zone() · 8713e012
      Andrew Morton authored
      Local variable `scan' can overflow on zones which are larger than
      
      	(2G * 4k) / 100 = 80GB.
      
      Making it 64-bit on 64-bit will fix that up.
      
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Wu Fengguang <fengguang.wu@intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8713e012
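The arithmetic in the commit above can be worked through in a short sketch. It assumes that the intermediate product (page count times a percentage of up to 100) was held in a 32-bit value, which the "2G" figure suggests but the message does not state outright. The function names and the 4k page size parameter are illustrative only.

```c
#include <stdint.h>

/* Scaling done in 32 bits: the product wraps once it exceeds 2^32 - 1. */
uint32_t scale_scan32(uint32_t nr_pages, uint32_t percent)
{
    return nr_pages * percent / 100;
}

/* The same scaling done in 64 bits does not wrap for realistic zone sizes. */
uint64_t scale_scan64(uint64_t nr_pages, uint64_t percent)
{
    return nr_pages * percent / 100;
}

/*
 * Largest zone size in bytes for which nr_pages * 100 still fits in a
 * signed 32-bit integer, given the page size in bytes.
 */
uint64_t max_safe_zone_bytes(uint64_t page_size)
{
    uint64_t max_pages = (uint64_t)INT32_MAX / 100;

    return max_pages * page_size;
}
```

With 4096-byte pages, `max_safe_zone_bytes()` comes out at roughly 82 GiB, in line with the approximate 80GB bound quoted in the message.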
    • Oleg Nesterov's avatar
      ptrace: s/parent/real_parent/ in binfmt_elf_fdpic.c · 0ae05fb2
      Oleg Nesterov authored
      ->real_parent is the parent. ->parent may be the tracer.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Acked-by: Roland McGrath <roland@redhat.com>
      Cc: Greg Ungerer <gerg@snapgear.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0ae05fb2
    • Randy Dunlap's avatar
      kernel-doc: restrict syntax for private: and public: · 52dc5aec
      Randy Dunlap authored
      scripts/kernel-doc can (incorrectly) delete struct members that are
      surrounded by /* ...  */ <struct members> /* ...  */ if there is a /*
      private: */ comment in there somewhere also.
      
      Fix that by making the "/* private:" only allow whitespace between /* and
      "private:", not anything/everything in the world.
      
      This fixes some erroneous kernel-doc warnings that popped up while
      processing include/linux/usb/composite.h.
      Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      52dc5aec
    • KOSAKI Motohiro's avatar
      mm: fix Committed_AS underflow on large NR_CPUS environment · 00a62ce9
      KOSAKI Motohiro authored
      The Committed_AS field can underflow in certain situations:
      
      >         # while true; do cat /proc/meminfo  | grep _AS; sleep 1; done | uniq -c
      >               1 Committed_AS: 18446744073709323392 kB
      >              11 Committed_AS: 18446744073709455488 kB
      >               6 Committed_AS:    35136 kB
      >               5 Committed_AS: 18446744073709454400 kB
      >               7 Committed_AS:    35904 kB
      >               3 Committed_AS: 18446744073709453248 kB
      >               2 Committed_AS:    34752 kB
      >               9 Committed_AS: 18446744073709453248 kB
      >               8 Committed_AS:    34752 kB
      >               3 Committed_AS: 18446744073709320960 kB
      >               7 Committed_AS: 18446744073709454080 kB
      >               3 Committed_AS: 18446744073709320960 kB
      >               5 Committed_AS: 18446744073709454080 kB
      >               6 Committed_AS: 18446744073709320960 kB
      
      This happens because NR_CPUS can be greater than 1000 and
      meminfo_proc_show() does not check for underflow.

      But a calculation proportional to NR_CPUS isn't a good one.  In general,
      the possibility of lock contention is proportional to the number of online
      cpus, not the theoretical maximum number of cpus (NR_CPUS).

      The current kernel has generic percpu-counter infrastructure, and using it
      is the right way.  It simplifies the code, and
      percpu_counter_read_positive() doesn't have the underflow issue.
      Reported-by: Dave Hansen <dave@linux.vnet.ibm.com>
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Eric B Munson <ebmunson@us.ibm.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: <stable@kernel.org>		[All kernel versions]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      00a62ce9
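The huge Committed_AS values quoted above are what small negative numbers look like when printed as unsigned 64-bit integers. For example, 18446744073709323392 is 2^64 - 228224. The sketch below shows that conversion and a clamped read that never reports a negative value. The function names are hypothetical, and clamping to 0 is an assumption of this sketch, not a statement about what percpu_counter_read_positive() returns.

```c
#include <stdint.h>

/* Reinterpret a signed counter as unsigned, as an unchecked print would. */
uint64_t as_unsigned_kb(int64_t kb)
{
    return (uint64_t)kb;
}

/* Clamp a transiently negative counter before reporting it. */
int64_t read_positive(int64_t counter)
{
    return counter < 0 ? 0 : counter;
}
```

A reader that goes through `read_positive()` reports 0 for a momentarily negative sum instead of a value near 2^64.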
    • Grant Likely's avatar
      of: make of_(un)register_platform_driver common code · 0763ed23
      Grant Likely authored
      Some drivers using of_register_platform_driver() wrapper break on sparc
      because the wrapper isn't in the header file.  This patch moves it from
      Microblaze and PowerPC implementations and makes it common code.
      
      Fixes this sparc64 allmodconfig build error (at least):
      
      drivers/leds/leds-gpio.c: In function `gpio_led_init':
      drivers/leds/leds-gpio.c:295: error: implicit declaration of function `of_register_platform_driver'
      drivers/leds/leds-gpio.c: In function `gpio_led_exit':
      drivers/leds/leds-gpio.c:311: error: implicit declaration of function `of_unregister_platform_driver'
      Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
      Acked-by: David S. Miller <davem@davemloft.net>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Richard Purdie <rpurdie@rpsys.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0763ed23
    • Ivan Kokshaysky's avatar
      alpha: binfmt_aout fix · 74641f58
      Ivan Kokshaysky authored
      This fixes the problem introduced by commit 3bfacef4 (get rid of
      special-casing the /sbin/loader on alpha): an osf/1 ecoff binary segfaults
      when binfmt_aout is built as a module.  That happens because the aout binary
      handler gets to the top of the binfmt list due to late registration, and
      the kernel attempts to execute the binary without the preparatory work that
      must be done by binfmt_loader.

      Fixed by changing the registration order of the default binfmt handlers
      using list_add_tail() and introducing an insert_binfmt() function, which
      places a new handler at the top of the binfmt list.  This might be generally
      useful for installing arch-specific frontends for default handlers or just
      for overriding them.
      Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Richard Henderson <rth@twiddle.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      74641f58
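The ordering change described above can be sketched with a small singly linked list: default handlers are appended at the tail, while an overriding handler is placed at the head so that a walk from the head reaches it first. The types and function names are hypothetical stand-ins and do not reproduce the kernel's list_head API or its binfmt code.

```c
#include <stddef.h>

struct handler {
    const char *name;
    struct handler *next;
};

struct handler_list {
    struct handler *head;
    struct handler *tail;
};

/* Default registration: append, so later registrations are tried later. */
void register_default(struct handler_list *l, struct handler *h)
{
    h->next = NULL;
    if (l->tail)
        l->tail->next = h;
    else
        l->head = h;
    l->tail = h;
}

/* Override registration: put the handler in front of all existing ones. */
void insert_front(struct handler_list *l, struct handler *h)
{
    h->next = l->head;
    l->head = h;
    if (!l->tail)
        l->tail = h;
}
```

With this split, a late-loaded module that registers with `register_default()` can no longer jump ahead of a frontend that was deliberately placed with `insert_front()`.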
    • Ivan Kokshaysky's avatar
      alpha: futex implementation · 77b4cf5c
      Ivan Kokshaysky authored
      Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Richard Henderson <rth@twiddle.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      77b4cf5c
    • Ivan Kokshaysky's avatar
      alpha: exception table sorting · 08a42e86
      Ivan Kokshaysky authored
      Exception fixups for sections other than .text (like one in futex_init())
      break the natural ordering of fixup entries, so sorting is required.
      
      Without that, the result of the exception table search depends on the
      phase of the moon.
      Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Richard Henderson <rth@twiddle.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      08a42e86
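Why sorting matters here can be sketched in userspace C: a binary search over fixup entries only gives a reliable answer when the table is ordered by instruction address. The struct layout and function names below are hypothetical, and the standard library's qsort() and bsearch() stand in for whatever the kernel uses.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fixup entry: faulting instruction address and its handler. */
struct ex_entry {
    unsigned long insn;
    unsigned long fixup;
};

/* Order entries by instruction address. */
static int cmp_ex(const void *a, const void *b)
{
    const struct ex_entry *x = a;
    const struct ex_entry *y = b;

    if (x->insn < y->insn)
        return -1;
    if (x->insn > y->insn)
        return 1;
    return 0;
}

/* Restore the ordering that entries from other sections may have broken. */
void sort_ex_table(struct ex_entry *table, size_t n)
{
    qsort(table, n, sizeof(*table), cmp_ex);
}

/* Binary search; only valid on a table sorted by sort_ex_table(). */
const struct ex_entry *search_ex_table(const struct ex_entry *table,
                                       size_t n, unsigned long addr)
{
    struct ex_entry key = { addr, 0 };

    return bsearch(&key, table, n, sizeof(*table), cmp_ex);
}
```

Calling `search_ex_table()` on an unsorted table may miss an entry that is present, which is the nondeterminism the commit message jokes about.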
    • Ivan Kokshaysky's avatar
      alpha: titan and marvel build fixes · 1ffb1c0c
      Ivan Kokshaysky authored
      These platforms got broken after u64 => 'long long' conversion.
      
      Apparently that change was compile-tested with 'make allmodconfig', but it
      doesn't include systems that depend on !ALPHA_LEGACY_START_ADDRESS.
      Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Richard Henderson <rth@twiddle.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1ffb1c0c
    • Stefan Bader's avatar
      vgacon: return the upper half of 512 character fonts · b175dc09
      Stefan Bader authored
      Uwe Geuder noted that he gets random bitmaps on a text console when he
      tries to type extended characters (like the e acute).  For him everything
      above unicode 0xa0 was corrupted.
      
      After some digging there seems to be a little culprit in vgacon since the
      beginning of ages (well git).  The function vgacon_font_get stores the
      number of characters correctly in font->charcount but then calls
      vgacon_do_font_op(..., 0, 0), which means only the lower 256 characters
      are actually stored to the fontdata.  The rest is left untouched.  So the
      next time that saved data is used, the garbled font appears.  This happens
      on every switch between text consoles.
      
      Addresses https://bugs.launchpad.net/ubuntu/+source/linux/+bug/355057
      
      Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
      Tested-by: Uwe Geuder <ubuntuLp-ugeuder@sneakemail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b175dc09
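The failure described in the vgacon commit above can be illustrated with a small userspace sketch. Everything here is an assumption made for illustration: `save_font`, `GLYPH_BYTES` and the buffer layout are hypothetical and are not the vgacon code. The sketch only shows how passing 0 for a "512 characters" flag leaves the upper half of the saved buffer untouched.

```c
#include <assert.h>
#include <string.h>

#define GLYPH_BYTES     32                    /* assumed bytes reserved per glyph */
#define HALF_FONT_BYTES (256 * GLYPH_BYTES)   /* one block of 256 glyphs */

/*
 * Hypothetical stand-in for a font save routine: always copies the lower
 * 256 glyphs, and copies the upper 256 only when ch512 is nonzero.
 */
static void save_font(unsigned char *dst, const unsigned char *src, int ch512)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, HALF_FONT_BYTES);
    if (ch512)
        memcpy(dst + HALF_FONT_BYTES, src + HALF_FONT_BYTES, HALF_FONT_BYTES);
}
```

With `ch512` hard-coded to 0, whatever was already in the upper half of `dst` survives the save, which matches the "rest is left untouched" symptom in the message.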
    • Daisuke Nishimura's avatar
      memcg: fix mem_cgroup_shrink_usage() · ae3abae6
      Daisuke Nishimura authored
      Current mem_cgroup_shrink_usage() has two problems.
      
      1. It doesn't call mem_cgroup_out_of_memory and doesn't update
         last_oom_jiffies, so pagefault_out_of_memory invokes global OOM.
      
      2. Considering hierarchy, shrinking has to be done from the
         mem_over_limit, not from the memcg which the page would be charged to.
      
      mem_cgroup_try_charge_swapin() does all of these things properly, so we
      use it and call cancel_charge_swapin when it succeeds.
      
      The name of "shrink_usage" is not appropriate for this behavior, so we
      change it too.
      Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
      Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Li Zefan <lizf@cn.fujitsu.cn>
      Cc: Paul Menage <menage@google.com>
      Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
      Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
      Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ae3abae6
    • Vitaly Mayatskikh's avatar
      pagemap: require aligned-length, non-null reads of /proc/pid/pagemap · 08161786
      Vitaly Mayatskikh authored
      The intention of commit aae8679b
      ("pagemap: fix bug in add_to_pagemap, require aligned-length reads of
      /proc/pid/pagemap") was to force reads of /proc/pid/pagemap to be a
      multiple of 8 bytes, but it now allows a read of 0 bytes, which actually
      writes some data to the user's buffer.  According to POSIX, if count is
      zero, read() should return zero and have no other results.
      Signed-off-by: Vitaly Mayatskikh <v.mayatskih@gmail.com>
      Cc: Thomas Tuttle <ttuttle@google.com>
      Acked-by: Matt Mackall <mpm@selenic.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      08161786
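The read semantics the pagemap commit above asks for can be sketched as an argument check. This is a userspace illustration under stated assumptions, not the kernel's pagemap_read: `pagemap_read_sketch` is a hypothetical name, the offset check is an assumption added alongside the length check, and the "entries" written are placeholder zeros.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define PM_ENTRY_BYTES 8   /* reads must be a multiple of 8 bytes, per the message */

/*
 * Hypothetical read handler: rejects misaligned requests, returns 0 for a
 * zero-length read without touching buf, otherwise fills buf with
 * placeholder entries and returns the number of bytes written.
 */
static long pagemap_read_sketch(unsigned char *buf, size_t count, long long pos)
{
    /* Assumption: both the file offset and the length must be aligned. */
    if ((pos % PM_ENTRY_BYTES) != 0 || (count % PM_ENTRY_BYTES) != 0)
        return -EINVAL;

    /* POSIX: a zero-length read returns 0 and has no other results. */
    if (count == 0)
        return 0;

    assert(buf != NULL);
    memset(buf, 0, count);   /* stand-in for copying real pagemap entries */
    return (long)count;
}
```

The zero-length case is handled before any write to `buf`, which is the property the commit message says was missing.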
    • Nick Piggin's avatar
      mm: close page_mkwrite races · b827e496
      Nick Piggin authored
      Change page_mkwrite to allow implementations to return with the page
      locked, and also change its callers (in page fault paths) to hold the
      lock until the page is marked dirty.  This allows the filesystem to have
      full control of page dirtying events coming from the VM.
      
      Rather than simply hold the page locked over the page_mkwrite call, we
      call page_mkwrite with the page unlocked and allow implementations to
      return with it locked, so filesystems can avoid LOR conditions with page
      lock.
      
      The problem with the current scheme is this: a filesystem that wants to
      associate some metadata with a page as long as the page is dirty, will
      perform this manipulation in its ->page_mkwrite.  It currently then must
      return with the page unlocked and may not hold any other locks (according
      to existing page_mkwrite convention).
      
      In this window, the VM could write out the page, clearing page-dirty.  The
      filesystem has no good way to detect that a dirty pte is about to be
      attached, so it will happily write out the page, at which point, the
      filesystem may manipulate the metadata to reflect that the page is no
      longer dirty.
      
      It is not always possible to perform the required metadata manipulation in
      ->set_page_dirty, because that function cannot block or fail.  The
      filesystem may need to allocate some data structure, for example.
      
      And the VM cannot mark the pte dirty before page_mkwrite, because
      page_mkwrite is allowed to fail, so we must not allow any window where the
      page could be written to if page_mkwrite does fail.
      
      This solution of holding the page locked over the 3 critical operations
      (page_mkwrite, setting the pte dirty, and finally setting the page dirty)
      closes out races nicely, preventing page cleaning for writeout being
      initiated in that window.  This provides the filesystem with a strong
      synchronisation against the VM here.
      
      - Sage needs this race closed for ceph filesystem.
      - Trond for NFS (http://bugzilla.kernel.org/show_bug.cgi?id=12913).
      - I need it for fsblock.
      - I suspect other filesystems may need it too (eg. btrfs).
      - I have converted buffer.c to the new locking. Even simple block allocation
        under dirty pages might be susceptible to i_size changing under partial page
        at the end of file (we also have a buffer.c-side problem here, but it cannot
        be fixed properly without this patch).
      - Other filesystems (eg. NFS, maybe btrfs) will need to change their
        page_mkwrite functions themselves.
      
      [ This also moves page_mkwrite another step closer to fault, which should
        eventually allow page_mkwrite to be moved into ->fault, thus avoiding a
        filesystem calldown and page lock/unlock cycle in __do_fault. ]
      
      [akpm@linux-foundation.org: fix derefs of NULL ->mapping]
      Cc: Sage Weil <sage@newdream.net>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b827e496
    • Heiko Carstens's avatar
      atomic: fix atomic_long_cmpxchg/xchg for 64 bit architectures · a5fc1abe
      Heiko Carstens authored
      On a linux-next allyesconfig build:
      
      kernel/trace/ring_buffer.c:1726:
      	warning: passing argument 1 of 'atomic_cmpxchg' from incompatible pointer type
      linux-next/arch/s390/include/asm/atomic.h:112:
      	note: expected 'struct atomic_t *' but argument is of type 'struct atomic64_t *'
      
      atomic_long_cmpxchg and atomic_long_xchg are incorrectly defined for 64
      bit architectures.  They should be mapped to the atomic64_* variants.
      Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a5fc1abe
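The mapping described in the atomic commit above can be sketched in userspace C. This is a rough illustration, not the kernel header: the struct layouts are simplified, the GCC/Clang `__atomic` builtins stand in for the architecture primitives, and selecting the 64-bit branch via `LONG_MAX` is an assumption made here in place of the kernel's own configuration test.

```c
#include <limits.h>

/* Simplified stand-ins for the kernel types. */
typedef struct { int counter; } atomic_t;
typedef struct { long long counter; } atomic64_t;

/* Compare-and-swap; returns the value seen before the operation. */
static int atomic_cmpxchg(atomic_t *v, int old, int new_val)
{
    __atomic_compare_exchange_n(&v->counter, &old, new_val, 0,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return old;   /* unchanged on success, updated to current on failure */
}

static long long atomic64_cmpxchg(atomic64_t *v, long long old, long long new_val)
{
    __atomic_compare_exchange_n(&v->counter, &old, new_val, 0,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return old;
}

static int atomic_xchg(atomic_t *v, int new_val)
{
    return __atomic_exchange_n(&v->counter, new_val, __ATOMIC_SEQ_CST);
}

static long long atomic64_xchg(atomic64_t *v, long long new_val)
{
    return __atomic_exchange_n(&v->counter, new_val, __ATOMIC_SEQ_CST);
}

#if LONG_MAX > INT_MAX
/* 64-bit long: the atomic_long_* operations map to the atomic64_* variants. */
typedef atomic64_t atomic_long_t;
#define atomic_long_cmpxchg(l, old, new_val) \
    (atomic64_cmpxchg((atomic64_t *)(l), (old), (new_val)))
#define atomic_long_xchg(l, new_val) \
    (atomic64_xchg((atomic64_t *)(l), (new_val)))
#else
/* 32-bit long: the plain atomic_* variants are the right width. */
typedef atomic_t atomic_long_t;
#define atomic_long_cmpxchg(l, old, new_val) \
    (atomic_cmpxchg((atomic_t *)(l), (old), (new_val)))
#define atomic_long_xchg(l, new_val) \
    (atomic_xchg((atomic_t *)(l), (new_val)))
#endif
```

Mapping the 64-bit case to `atomic_cmpxchg` instead would hand an `atomic64_t *` to a function expecting `atomic_t *`, which is the incompatible pointer warning quoted in the message.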