1. 12 Jul, 2013 1 commit
    • cgroup: replace task_cgroup_path_from_hierarchy() with task_cgroup_path() · 913ffdb5
      Tejun Heo authored
      task_cgroup_path_from_hierarchy() was added for the planned new users
      and none of the currently planned users wants to know about multiple
      hierarchies.  This patch drops the multiple hierarchy part and makes
      it always return the path in the first non-dummy hierarchy.
      
      As unified hierarchy will always have id 1, this is guaranteed to
      return the path for the unified hierarchy if mounted; otherwise, it
      will return the path from the hierarchy which happens to occupy the
      lowest hierarchy id, which will usually be the first hierarchy mounted
      after boot.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Cc: Lennart Poettering <lennart@poettering.net>
      Cc: Kay Sievers <kay.sievers@vrfy.org>
      Cc: Jan Kaluža <jkaluza@redhat.com>
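The selection rule described in the cgroup commit above can be sketched in plain C. This is a minimal userspace illustration, not kernel code: `struct hier_sketch`, `pick_task_path()`, and the convention that id 0 stands for the dummy hierarchy are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a mounted hierarchy; not the kernel's structure. */
struct hier_sketch {
    int id;            /* assumption: 0 models the dummy hierarchy */
    const char *path;  /* the task's cgroup path in this hierarchy */
};

/*
 * Return the path from the non-dummy hierarchy with the lowest id,
 * or NULL when no such hierarchy exists.
 */
const char *pick_task_path(const struct hier_sketch *h, size_t n)
{
    const struct hier_sketch *best = NULL;

    assert(h != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (h[i].id <= 0)
            continue;               /* skip the dummy hierarchy */
        if (best == NULL || h[i].id < best->id)
            best = &h[i];
    }
    return best ? best->path : NULL;
}
```

With a hierarchy of id 1 present (the unified hierarchy, per the commit message), its path always wins; otherwise the lowest id that happens to be in use is chosen.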
  2. 09 Jul, 2013 39 commits
    • cgroup: remove bcache_subsys_id which got added stealthily · add0c59d
      Tejun Heo authored
      cafe5635 ("bcache: A block layer cache") added a new cgroup
      subsystem bcache_subsys without proper review and ack.  bcache_subsys
      seems to use cgroup for group stats and per-group cache_mode
      configuration.  This is very much the type of usage that we don't want
      to allow.
      
      Fortunately, CONFIG_CGROUP_BCACHE which enables bcache_subsys is
      currently commented out, so this shouldn't have any upstream users.
      Let's nip it in the bud.  While at it, clarify in cgroup_subsys.h that no
      new subsystem should be added without explicit acks from cgroup
      maintainers.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: cgroups@vger.kernel.org
      Cc: Kent Overstreet <kent.overstreet@gmail.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: linux-bcache@vger.kernel.org
    • Merge branch 'akpm' (updates from Andrew Morton) · a82a729f
      Linus Torvalds authored
      Merge second patch-bomb from Andrew Morton:
       - misc fixes
       - audit stuff
       - fanotify/inotify/dnotify things
       - most of the rest of MM.  The new cache shrinker code from Glauber and
         Dave Chinner probably isn't quite stabilized yet.
       - ptrace
       - ipc
       - partitions
       - reboot cleanups
       - add LZ4 decompressor, use it for kernel compression
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (118 commits)
        lib/scatterlist: error handling in __sg_alloc_table()
        scsi_debug: fix do_device_access() with wrap around range
        crypto: talitos: use sg_pcopy_to_buffer()
        lib/scatterlist: introduce sg_pcopy_from_buffer() and sg_pcopy_to_buffer()
        lib/scatterlist: factor out sg_miter_get_next_page() from sg_miter_next()
        crypto: add lz4 Cryptographic API
        lib: add lz4 compressor module
        arm: add support for LZ4-compressed kernel
        lib: add support for LZ4-compressed kernel
        decompressor: add LZ4 decompressor module
        lib: add weak clz/ctz functions
        reboot: move arch/x86 reboot= handling to generic kernel
        reboot: arm: change reboot_mode to use enum reboot_mode
        reboot: arm: prepare reboot_mode for moving to generic kernel code
        reboot: arm: remove unused restart_mode fields from some arm subarchs
        reboot: unicore32: prepare reboot_mode for moving to generic kernel code
        reboot: x86: prepare reboot_mode for moving to generic kernel code
        reboot: checkpatch.pl the new kernel/reboot.c file
        reboot: move shutdown/reboot related functions to kernel/reboot.c
        reboot: remove -stable friendly PF_THREAD_BOUND define
        ...
    • Merge tag 'for-linus-3.11-merge-window-part-1' of... · 899dd388
      Linus Torvalds authored
      Merge tag 'for-linus-3.11-merge-window-part-1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs
      
      Pull 9p update from Eric Van Hensbergen:
       "Grab bag of little fixes and enhancements:
        - optional security enhancements
        - fix path coverage in MAINTAINERS
        - switch to using most used protocol and transport as default
        - clean up buffer dumps in trace code
      
        Held off on RDMA patches as they need to be cleaned up a bit, but will
        try to get them cleaned, checked, and pushed by mid-week"
      
      * tag 'for-linus-3.11-merge-window-part-1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
        9p: Add rest of 9p files to MAINTAINERS entry
        9p: trace: use %*ph to dump buffer
        net/9p: Handle error in zero copy request correctly for 9p2000.u
        net/9p: Use virtio transpart as the default transport
        net/9p: Make 9P2000.L the default protocol for 9p file system
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client · 9a5889ae
      Linus Torvalds authored
      Pull Ceph updates from Sage Weil:
       "There is some follow-on RBD cleanup after the last window's code drop,
        a series from Yan fixing multi-mds behavior in cephfs, and then a
        sprinkling of bug fixes all around.  Some warnings, sleeping while
        atomic, a null dereference, and cleanups"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (36 commits)
        libceph: fix invalid unsigned->signed conversion for timespec encoding
        libceph: call r_unsafe_callback when unsafe reply is received
        ceph: fix race between cap issue and revoke
        ceph: fix cap revoke race
        ceph: fix pending vmtruncate race
        ceph: avoid accessing invalid memory
        libceph: Fix NULL pointer dereference in auth client code
        ceph: Reconstruct the func ceph_reserve_caps.
        ceph: Free mdsc if alloc mdsc->mdsmap failed.
        ceph: remove sb_start/end_write in ceph_aio_write.
        ceph: avoid meaningless calling ceph_caps_revoking if sync_mode == WB_SYNC_ALL.
        ceph: fix sleeping function called from invalid context.
        ceph: move inode to proper flushing list when auth MDS changes
        rbd: fix a couple warnings
        ceph: clear migrate seq when MDS restarts
        ceph: check migrate seq before changing auth cap
        ceph: fix race between page writeback and truncate
        ceph: reset iov_len when discarding cap release messages
        ceph: fix cap release race
        libceph: fix truncate size calculation
        ...
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs · e3a0dd98
      Linus Torvalds authored
      Pull btrfs update from Chris Mason:
       "These are the usual mixture of bugs, cleanups and performance fixes.
        Miao has some really nice tuning of our crc code as well as our
        transaction commits.
      
        Josef is peeling off more and more problems related to early enospc,
        and has a number of important bug fixes in here too"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (81 commits)
        Btrfs: wait ordered range before doing direct io
        Btrfs: only do the tree_mod_log_free_eb if this is our last ref
        Btrfs: hold the tree mod lock in __tree_mod_log_rewind
        Btrfs: make backref walking code handle skinny metadata
        Btrfs: fix crash regarding to ulist_add_merge
        Btrfs: fix several potential problems in copy_nocow_pages_for_inode
        Btrfs: cleanup the code of copy_nocow_pages_for_inode()
        Btrfs: fix oops when recovering the file data by scrub function
        Btrfs: make the chunk allocator completely tree lockless
        Btrfs: cleanup orphaned root orphan item
        Btrfs: fix wrong mirror number tuning
        Btrfs: cleanup redundant code in btrfs_submit_direct()
        Btrfs: remove btrfs_sector_sum structure
        Btrfs: check if we can nocow if we don't have data space
        Btrfs: stop using try_to_writeback_inodes_sb_nr to flush delalloc
        Btrfs: use a percpu to keep track of possibly pinned bytes
        Btrfs: check for actual acls rather than just xattrs when caching no acl
        Btrfs: move btrfs_truncate_page to btrfs_cont_expand instead of btrfs_truncate
        Btrfs: optimize reada_for_balance
        Btrfs: optimize read_block_for_search
        ...
    • Merge tag 'for-linus-v3.11-rc1' of git://oss.sgi.com/xfs/xfs · da89bd21
      Linus Torvalds authored
      Pull xfs update from Ben Myers:
       "This includes several bugfixes, part of the work for project quotas
        and group quotas to be used together, performance improvements for
        inode creation/deletion, buffer readahead, and bulkstat,
        implementation of the inode change count, an inode create transaction,
        and the removal of a bunch of dead code.
      
        There are also some duplicate commits that you already have from the
        3.10-rc series.
      
         - part of the work to allow project quotas and group quotas to be
           used together
         - inode change count
         - inode create transaction
         - block queue plugging in buffer readahead and bulkstat
         - ordered log vector support
         - removal of dead code in and around xfs_sync_inode_grab,
           xfs_ialloc_get_rec, XFS_MOUNT_RETERR, XFS_ALLOCFREE_LOG_RES,
           XFS_DIROP_LOG_RES, xfs_chash, ctl_table, and
           xfs_growfs_data_private
         - don't keep silent if sunit/swidth can not be changed via mount
         - fix a leak of remote symlink blocks into the filesystem when xattrs
           are used on symlinks
         - fix for fiemap to return FIEMAP_EXTENT_UNKNOWN flag on delay extents
         - part of a fix for xfs_fsr
         - disable speculative preallocation with small files
         - performance improvements for inode creates and deletes"
      
      * tag 'for-linus-v3.11-rc1' of git://oss.sgi.com/xfs/xfs: (61 commits)
        xfs: Remove incore use of XFS_OQUOTA_ENFD and XFS_OQUOTA_CHKD
        xfs: Change xfs_dquot_acct to be a 2-dimensional array
        xfs: Code cleanup and removal of some typedef usage
        xfs: Replace macro XFS_DQ_TO_QIP with a function
        xfs: Replace macro XFS_DQUOT_TREE with a function
        xfs: Define a new function xfs_is_quota_inode()
        xfs: implement inode change count
        xfs: Use inode create transaction
        xfs: Inode create item recovery
        xfs: Inode create transaction reservations
        xfs: Inode create log items
        xfs: Introduce an ordered buffer item
        xfs: Introduce ordered log vector support
        xfs: xfs_ifree doesn't need to modify the inode buffer
        xfs: don't do IO when creating an new inode
        xfs: don't use speculative prealloc for small files
        xfs: plug directory buffer readahead
        xfs: add pluging for bulkstat readahead
        xfs: Remove dead function prototype xfs_sync_inode_grab()
        xfs: Remove the left function variable from xfs_ialloc_get_rec()
        ...
    • libceph: fix invalid unsigned->signed conversion for timespec encoding · 8b8cf891
      Josh Durgin authored
      __kernel_time_t is a long, which cannot hold a U32_MAX on 32-bit
      architectures.  Just drop this check as it has limited value.
      
      This fixes a crash like:
      
      [  957.905812] kernel BUG at /srv/autobuild-ceph/gitbuilder.git/build/include/linux/ceph/decode.h:164!
      [  957.914849] Internal error: Oops - BUG: 0 [#1] SMP ARM
      [  957.919978] Modules linked in: rbd libceph libcrc32c ipmi_devintf ipmi_si ipmi_msghandler nfsd nfs_acl auth_rpcgss nfs fscache lockd sunrpc
      [  957.932547] CPU: 1    Tainted: G        W     (3.9.0-ceph-19bb6a83-highbank #1)
      [  957.939881] PC is at ceph_osdc_build_request+0x8c/0x4f8 [libceph]
      [  957.945967] LR is at 0xec520904
      [  957.949103] pc : [<bf13e76c>]    lr : [<ec520904>]    psr: 20000153
      [  957.949103] sp : ec753df8  ip : 00000001  fp : ec53e100
      [  957.960571] r10: ebef25c0  r9 : ec5fa400  r8 : ecbcc000
      [  957.965788] r7 : 00000000  r6 : 00000000  r5 : ffffffff  r4 : 00000020
      [  957.972307] r3 : 51cc8143  r2 : ec520900  r1 : ec753e58  r0 : ec520908
      [  957.978827] Flags: nzCv  IRQs on  FIQs off  Mode SVC_32  ISA ARM  Segment user
      [  957.986039] Control: 10c5387d  Table: 2c59c04a  DAC: 00000015
      [  957.991777] Process rbd (pid: 2138, stack limit = 0xec752238)
      [  957.997514] Stack: (0xec753df8 to 0xec754000)
      [  958.001864] 3de0:                                                       00000001 00000001
      [  958.010032] 3e00: 00000001 bf139744 ecbcc000 ec55a0a0 00000024 00000000 ebef25c0 fffffffe
      [  958.018204] 3e20: ffffffff 00000000 00000000 00000001 ec5fa400 ebef25c0 ec53e100 bf166b68
      [  958.026377] 3e40: 00000000 0000220f fffffffe ffffffff ec753e58 bf13ff24 51cc8143 05b25ed2
      [  958.034548] 3e60: 00000001 00000000 00000000 bf1688d4 00000001 00000000 00000000 00000000
      [  958.042720] 3e80: 00000001 00000060 ec5fa400 ed53d200 ed439600 ed439300 00000001 00000060
      [  958.050888] 3ea0: ec5fa400 ed53d200 00000000 bf16a320 00000000 ec53e100 00000040 ec753eb8
      [  958.059059] 3ec0: ec51df00 ed53d7c0 ed53d200 ed53d7c0 00000000 ed53d7c0 ec5fa400 bf16ed70
      [  958.067230] 3ee0: 00000000 00000060 00000002 ed53d200 00000000 bf16acf4 ed53d7c0 ec752000
      [  958.075402] 3f00: ed980e50 e954f5d8 00000000 00000060 ed53d240 ed53d258 ec753f80 c04f44a8
      [  958.083574] 3f20: edb7910c ec664700 01ade920 c02e4c44 00000060 c016b3dc ec51de40 01adfb84
      [  958.091745] 3f40: 00000060 ec752000 ec753f80 ec752000 00000060 c0108444 00000007 ec51de48
      [  958.099914] 3f60: ed0eb8c0 00000000 00000000 ec51de40 01adfb84 00000001 00000060 c0108858
      [  958.108085] 3f80: 00000000 00000000 51cc8143 00000060 01adfb84 00000007 00000004 c000dd68
      [  958.116257] 3fa0: 00000000 c000dbc0 00000060 01adfb84 00000007 01adfb84 00000060 01adfb80
      [  958.124429] 3fc0: 00000060 01adfb84 00000007 00000004 beded1a8 00000000 01adf2f0 01ade920
      [  958.132599] 3fe0: 00000000 beded180 b6811324 b6811334 800f0010 00000007 2e7f5821 2e7f5c21
      [  958.140815] [<bf13e76c>] (ceph_osdc_build_request+0x8c/0x4f8 [libceph]) from [<bf166b68>] (rbd_osd_req_format_write+0x50/0x7c [rbd])
      [  958.152739] [<bf166b68>] (rbd_osd_req_format_write+0x50/0x7c [rbd]) from [<bf1688d4>] (rbd_dev_header_watch_sync+0xe0/0x204 [rbd])
      [  958.164486] [<bf1688d4>] (rbd_dev_header_watch_sync+0xe0/0x204 [rbd]) from [<bf16a320>] (rbd_dev_image_probe+0x23c/0x850 [rbd])
      [  958.175967] [<bf16a320>] (rbd_dev_image_probe+0x23c/0x850 [rbd]) from [<bf16acf4>] (rbd_add+0x3c0/0x918 [rbd])
      [  958.185975] [<bf16acf4>] (rbd_add+0x3c0/0x918 [rbd]) from [<c02e4c44>] (bus_attr_store+0x20/0x2c)
      [  958.194850] [<c02e4c44>] (bus_attr_store+0x20/0x2c) from [<c016b3dc>] (sysfs_write_file+0x168/0x198)
      [  958.203984] [<c016b3dc>] (sysfs_write_file+0x168/0x198) from [<c0108444>] (vfs_write+0x9c/0x170)
      [  958.212768] [<c0108444>] (vfs_write+0x9c/0x170) from [<c0108858>] (sys_write+0x3c/0x70)
      [  958.220768] [<c0108858>] (sys_write+0x3c/0x70) from [<c000dbc0>] (ret_fast_syscall+0x0/0x30)
      [  958.229199] Code: e59d1058 e5913000 e3530000 ba000114 (e7f001f2)
      
      CC: stable@vger.kernel.org  # 3.4+
      Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
      Reviewed-by: Sage Weil <sage@inktank.com>
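The libceph commit above says that a 32-bit long cannot hold U32_MAX. The sketch below shows why a range check against U32_MAX misfires in that situation. The log does not show the removed check, so its shape here is an assumption; `time32_t` and `exceeds_u32_max_after_cast()` are hypothetical names, and `int32_t` stands in for a 32-bit `__kernel_time_t`.

```c
#include <stdbool.h>
#include <stdint.h>

/* int32_t models a 'long' on a 32-bit architecture (an assumption). */
typedef int32_t time32_t;

/*
 * A check of the shape "tv_sec > (time type)U32_MAX".
 * Converting UINT32_MAX to a 32-bit signed type is implementation-defined;
 * with GCC and Clang it wraps to -1, so every non-negative time value
 * compares greater than the limit and the check fires.
 */
bool exceeds_u32_max_after_cast(time32_t tv_sec)
{
    time32_t limit = (time32_t)UINT32_MAX;  /* -1 on common compilers */

    return tv_sec > limit;
}
```

Under that assumption, even an ordinary timestamp trips the check, which is consistent with the commit's decision to drop it rather than rework it.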
    • Merge tag 'nfs-for-3.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · be0c5d8c
      Linus Torvalds authored
      Pull NFS client updates from Trond Myklebust:
       "Feature highlights include:
         - Add basic client support for NFSv4.2
         - Add basic client support for Labeled NFS (selinux for NFSv4.2)
         - Fix the use of credentials in NFSv4.1 stateful operations, and add
           support for NFSv4.1 state protection.
      
        Bugfix highlights:
         - Fix another NFSv4 open state recovery race
         - Fix an NFSv4.1 back channel session regression
         - Various rpc_pipefs races
         - Fix another issue with NFSv3 auth negotiation
      
        Please note that Labeled NFS does require some additional support from
        the security subsystem.  The relevant changesets have all been
        reviewed and acked by James Morris."
      
      * tag 'nfs-for-3.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (54 commits)
        NFS: Set NFS_CS_MIGRATION for NFSv4 mounts
        NFSv4.1 Refactor nfs4_init_session and nfs4_init_channel_attrs
        nfs: have NFSv3 try server-specified auth flavors in turn
        nfs: have nfs_mount fake up a auth_flavs list when the server didn't provide it
        nfs: move server_authlist into nfs_try_mount_request
        nfs: refactor "need_mount" code out of nfs_try_mount
        SUNRPC: PipeFS MOUNT notification optimization for dying clients
        SUNRPC: split client creation routine into setup and registration
        SUNRPC: fix races on PipeFS UMOUNT notifications
        SUNRPC: fix races on PipeFS MOUNT notifications
        NFSv4.1 use pnfs_device maxcount for the objectlayout gdia_maxcount
        NFSv4.1 use pnfs_device maxcount for the blocklayout gdia_maxcount
        NFSv4.1 Fix gdia_maxcount calculation to fit in ca_maxresponsesize
        NFS: Improve legacy idmapping fallback
        NFSv4.1 end back channel session draining
        NFS: Apply v4.1 capabilities to v4.2
        NFSv4.1: Clean up layout segment comparison helper names
        NFSv4.1: layout segment comparison helpers should take 'const' parameters
        NFSv4: Move the DNS resolver into the NFSv4 module
        rpc_pipefs: only set rpc_dentry_ops if d_op isn't already set
        ...
    • Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs · 1f792dd1
      Linus Torvalds authored
      Pull ext3 fix and quota cleanup from Jan Kara:
       "A fix of ext3 error reporting from fsync and a quota cleanup"
      
      * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
        quota: Convert use of typedef ctl_table to struct ctl_table
        ext3: Fix fsync error handling after filesystem abort.
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · c75e2475
      Linus Torvalds authored
      Pull third set of VFS updates from Al Viro:
       "Misc stuff all over the place.  There will be one more pile in a
        couple of days"
      
      This is an "evil merge" that also uses the new d_count helper in
      fs/configfs/dir.c, missed by commit 84d08fa8 ("helper for reading
      ->d_count")
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        ncpfs: fix error return code in ncp_parse_options()
        locks: move file_lock_list to a set of percpu hlist_heads and convert file_lock_lock to an lglock
        seq_file: add seq_list_*_percpu helpers
        f2fs: fix readdir incorrectness
        mode_t whack-a-mole...
        lustre: kill the pointless wrapper
        helper for reading ->d_count
    • lib/scatterlist: error handling in __sg_alloc_table() · 27daabd9
      Dan Carpenter authored
      I was reviewing code which I suspected might allocate a zero size SG
      table.  That will cause memory corruption.  Also we can't return before
      doing the memset or we could end up using uninitialized memory in the
      cleanup path.
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: Akinobu Mita <akinobu.mita@gmail.com>
      Cc: Imre Deak <imre.deak@intel.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Maxim Levitsky <maximlevitsky@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
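The ordering issue in the __sg_alloc_table() commit above (zero the structure before any early return, and reject an empty request) can be sketched as follows. `struct table_sketch` and `table_init_sketch()` are hypothetical stand-ins for illustration, not the kernel's types or functions.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for the real table structure. */
struct table_sketch {
    void *entries;
    unsigned int nents;
};

/*
 * Clear the table first so that a later cleanup path never reads
 * uninitialized fields, then refuse a zero-sized table.
 */
int table_init_sketch(struct table_sketch *t, unsigned int nents)
{
    assert(t != NULL);
    memset(t, 0, sizeof(*t));
    if (nents == 0)
        return -EINVAL;

    t->nents = nents;   /* allocation of entries omitted in this sketch */
    return 0;
}
```

Because the memset comes before the size check, a caller that runs its cleanup after a failed call sees a fully zeroed structure.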
    • scsi_debug: fix do_device_access() with wrap around range · a4517511
      Akinobu Mita authored
      do_device_access() is a function that abstracts copying SG list from/to
      ramdisk storage (fake_storep).
      
      It must deal with ranges that exceed the actual fake_storep size, because
      such ranges are valid if virtual_gb is set greater than zero, and they
      should be treated as if fake_storep were repeatedly mirrored up to the
      virtual size.

      Unfortunately, it can't deal with a range that wraps around the end of
      fake_storep.  A wrap around range is copied by two
      sg_copy_{from,to}_buffer() calls, but sg_copy_{from,to}_buffer() can't
      copy from/to the middle of the SG list, so the second call can't
      copy correctly.

      This fixes it by using sg_pcopy_{from,to}_buffer(), which can copy from/to
      the middle of the SG list.

      This also simplifies the assignment of sdb->resid in
      fill_from_dev_buffer(): because fill_from_dev_buffer() is now only called
      once per command execution cycle, it is no longer necessary to take care
      to decrease sdb->resid when fill_from_dev_buffer() is called more than once.
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
      Cc: Douglas Gilbert <dgilbert@interlog.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Horia Geanta <horia.geanta@freescale.com>
      Cc: Imre Deak <imre.deak@intel.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
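The wrap-around case in the scsi_debug commit above can be illustrated with a simplified model in which the destination is a linear buffer rather than an SG list. `wrap_read()` is a hypothetical helper: it reads from a backing store that repeats every `store_len` bytes, splitting the copy in two when the range crosses the end of the store.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy len bytes starting at logical offset start from a store that is
 * treated as repeating every store_len bytes. Returns the bytes copied,
 * or 0 if the request is larger than the store (not handled here).
 */
size_t wrap_read(const unsigned char *store, size_t store_len,
                 size_t start, unsigned char *dst, size_t len)
{
    assert(store != NULL && dst != NULL);
    if (store_len == 0 || len > store_len)
        return 0;

    size_t off = start % store_len;     /* position inside the store */
    size_t first = store_len - off;     /* bytes before the wrap point */
    if (first > len)
        first = len;
    size_t rest = len - first;          /* bytes after wrapping to 0 */

    memcpy(dst, store + off, first);
    if (rest)
        memcpy(dst + first, store, rest);  /* lands mid-destination */
    return len;
}
```

The second copy has to start `first` bytes into the destination. In the commit's terms, that offset is what the skip argument of sg_pcopy_{from,to}_buffer() provides when the destination is an SG list.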
    • crypto: talitos: use sg_pcopy_to_buffer() · d0525723
      Akinobu Mita authored
      Use sg_pcopy_to_buffer(), which is better than the function previously
      used because it doesn't do kmap/kunmap for skipped pages.
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
      Cc: Douglas Gilbert <dgilbert@interlog.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Horia Geanta <horia.geanta@freescale.com>
      Cc: Imre Deak <imre.deak@intel.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • lib/scatterlist: introduce sg_pcopy_from_buffer() and sg_pcopy_to_buffer() · df642cea
      Akinobu Mita authored
      The only difference between sg_pcopy_{from,to}_buffer() and
      sg_copy_{from,to}_buffer() is an additional argument that specifies the
      number of bytes to skip in the SG list before copying.
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
      Cc: Douglas Gilbert <dgilbert@interlog.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Horia Geanta <horia.geanta@freescale.com>
      Cc: Imre Deak <imre.deak@intel.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
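The effect of the extra skip argument described in the commit above can be sketched with a toy segment list. This is a userspace model only: `struct seg_sketch` and `seg_pcopy_to_buffer()` are hypothetical and do not reproduce the kernel's scatterlist types or signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one scatter/gather segment. */
struct seg_sketch {
    const unsigned char *data;
    size_t len;
};

/*
 * Copy up to buflen bytes from the segment list into buf, first skipping
 * 'skip' bytes of the list. Returns the number of bytes copied.
 */
size_t seg_pcopy_to_buffer(const struct seg_sketch *segs, size_t nsegs,
                           void *buf, size_t buflen, size_t skip)
{
    unsigned char *out = buf;
    size_t copied = 0;

    assert(buf != NULL || buflen == 0);
    for (size_t i = 0; i < nsegs && copied < buflen; i++) {
        size_t len = segs[i].len;

        if (skip >= len) {          /* whole segment is skipped */
            skip -= len;
            continue;
        }
        size_t avail = len - skip;
        size_t want = buflen - copied;
        size_t n = want < avail ? want : avail;

        memcpy(out + copied, segs[i].data + skip, n);
        copied += n;
        skip = 0;                   /* only the first copied segment is offset */
    }
    return copied;
}
```

Calling it with `skip` set to 0 gives the behaviour of a plain copy from the start of the list, which mirrors the relationship between the two function families stated in the commit message.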
    • lib/scatterlist: factor out sg_miter_get_next_page() from sg_miter_next() · 11052004
      Akinobu Mita authored
      This patchset introduces sg_pcopy_from_buffer() and sg_pcopy_to_buffer(),
      which copy data between a linear buffer and an SG list.
      
      The only difference between sg_pcopy_{from,to}_buffer() and
      sg_copy_{from,to}_buffer() is an additional argument that specifies the
      number of bytes to skip in the SG list before copying.

      The main reason for introducing these functions is to fix a problem in
      the scsi_debug module.  There is also a local function in the
      crypto/talitos module that can be replaced by sg_pcopy_to_buffer().

      This patch:

      sg_miter_get_next_page() advances the page iterator to the next page
      if necessary, and will be used to implement the variants of
      sg_copy_{from,to}_buffer() later.
      Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Imre Deak <imre.deak@intel.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
      Cc: Douglas Gilbert <dgilbert@interlog.com>
      Cc: Horia Geanta <horia.geanta@freescale.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • crypto: add lz4 Cryptographic API · 0ea8530d
      Chanho Min authored
      Add support for the lz4 and lz4hc compression algorithms using the
      lib/lz4/* codebase.

      [akpm@linux-foundation.org: fix warnings]
      Signed-off-by: Chanho Min <chanho.min@lge.com>
      Cc: "Darrick J. Wong" <djwong@us.ibm.com>
      Cc: Bob Pearson <rpearson@systemfabricworks.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Herbert Xu <herbert@gondor.hengli.com.au>
      Cc: Yann Collet <yann.collet.73@gmail.com>
      Cc: Kyungsik Lee <kyungsik.lee@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • lib: add lz4 compressor module · c72ac7a1
      Chanho Min authored
      This patchset adds support for LZ4 compression and a crypto API that
      uses it.

      As shown below, the compressed data is slightly larger than with LZO,
      but compression is faster when unaligned memory access is enabled.  lz4
      compression and decompression can also be used through the crypto API,
      which will be useful for other potential users of lz4 compression.
      
      lz4 Compression Benchmark:
      Compiler: ARM gcc 4.6.4
      ARMv7, 1 GHz based board
         Kernel: linux 3.4
         Uncompressed data Size: 101 MB
               Compressed Size  Compression Speed
         LZO   72.1MB           32.1MB/s, 33.0MB/s(UA)
         LZ4   75.1MB           30.4MB/s, 35.9MB/s(UA)
         LZ4HC 59.8MB            2.4MB/s,  2.5MB/s(UA)
      - UA: Unaligned memory Access support
      - Latest patch set for LZO applied
      
      This patch:
      
      Add support for LZ4 compression in the Linux Kernel.  LZ4 Compression APIs
      for kernel are based on LZ4 implementation by Yann Collet and were changed
      for kernel coding style.
      
      LZ4 homepage : http://fastcompression.blogspot.com/p/lz4.html
      LZ4 source repository : http://code.google.com/p/lz4/
      svn revision : r90
      
      Two APIs are added:
      
      lz4_compress() supports basic lz4 compression, whereas lz4hc_compress()
      supports high compression: CPU performance is lower but the compression
      ratio is higher.  Both require pre-allocated working memory of the
      defined size, and the destination buffer must be allocated with the size
      given by lz4_compressbound.

      [akpm@linux-foundation.org: make lz4_compresshcctx() static]
      Signed-off-by: Chanho Min <chanho.min@lge.com>
      Cc: "Darrick J. Wong" <djwong@us.ibm.com>
      Cc: Bob Pearson <rpearson@systemfabricworks.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Herbert Xu <herbert@gondor.hengli.com.au>
      Cc: Yann Collet <yann.collet.73@gmail.com>
      Cc: Kyungsik Lee <kyungsik.lee@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • arm: add support for LZ4-compressed kernel · f9b493ac
      Kyungsik Lee authored
      Integrates the LZ4 decompression code into the arm pre-boot code.
      Signed-off-by: Kyungsik Lee <kyungsik.lee@lge.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Florian Fainelli <florian@openwrt.org>
      Cc: Yann Collet <yann.collet.73@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • lib: add support for LZ4-compressed kernel · e76e1fdf
      Kyungsik Lee authored
      Add support for extracting LZ4-compressed kernel images, as well as
      LZ4-compressed ramdisk images in the kernel boot process.
      Signed-off-by: Kyungsik Lee <kyungsik.lee@lge.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Florian Fainelli <florian@openwrt.org>
      Cc: Yann Collet <yann.collet.73@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • decompressor: add LZ4 decompressor module · cffb78b0
      Kyungsik Lee authored
      Add support for LZ4 decompression in the Linux Kernel.  LZ4 Decompression
      APIs for kernel are based on LZ4 implementation by Yann Collet.
      
      Benchmark Results(PATCH v3)
      Compiler: Linaro ARM gcc 4.6.2
      
      1. ARMv7, 1.5GHz based board
         Kernel: linux 3.4
         Uncompressed Kernel Size: 14MB
              Compressed Size  Decompression Speed
         LZO  6.7MB            20.1MB/s, 25.2MB/s(UA)
         LZ4  7.3MB            29.1MB/s, 45.6MB/s(UA)
      
      2. ARMv7, 1.7GHz based board
         Kernel: linux 3.7
         Uncompressed Kernel Size: 14MB
              Compressed Size  Decompression Speed
         LZO  6.0MB            34.1MB/s, 52.2MB/s(UA)
         LZ4  6.5MB            86.7MB/s
      - UA: Unaligned memory Access support
      - Latest patch set for LZO applied
      
      This patch set is for adding support for LZ4-compressed Kernel.  LZ4 is a
      very fast lossless compression algorithm and it also features an extremely
      fast decoder [1].
      
      But we already have five decompressors, which raises the question of
      where we stop adding new ones.  This issue has been discussed and a
      conclusion was reached [2].
      
      Russell King said that we should have:
      
       - one decompressor which is the fastest
       - one decompressor for the highest compression ratio
       - one popular decompressor (eg conventional gzip)
      
      If we have a replacement one for one of these, then it should do exactly
      that: replace it.
      
      The benchmark shows an 8% increase in image size versus a 66% increase
      in decompression speed compared to LZO (which has been known as the
      fastest decompressor in the kernel).  Therefore the "fast but may not be
      small" compression title has clearly been taken by LZ4 [3].
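
As a quick check of those two percentages, the Python sketch below recomputes them from the second benchmark table (ARMv7, 1.7GHz). The helper name pct_increase is only illustrative, and the sketch assumes the 86.7MB/s LZ4 figure is being compared against the 52.2MB/s(UA) LZO figure, which the message does not state explicitly.

```python
def pct_increase(old: float, new: float) -> float:
    """Return the percentage increase going from old to new."""
    return (new - old) / old * 100.0


# Figures copied from benchmark 2 (ARMv7, 1.7GHz, linux 3.7).
lzo_size_mb, lz4_size_mb = 6.0, 6.5
# Assumption: LZ4's single speed figure is compared with LZO's UA figure.
lzo_speed_mbs, lz4_speed_mbs = 52.2, 86.7

size_growth = pct_increase(lzo_size_mb, lz4_size_mb)
speed_gain = pct_increase(lzo_speed_mbs, lz4_speed_mbs)

print(f"compressed image size: +{size_growth:.0f}%")
print(f"decompression speed:   +{speed_gain:.0f}%")
```

Under that assumption the rounded results agree with the 8% and 66% quoted in the message.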
      
      [1] http://code.google.com/p/lz4/
      [2] http://thread.gmane.org/gmane.linux.kbuild.devel/9157
      [3] http://thread.gmane.org/gmane.linux.kbuild.devel/9347
      
      LZ4 homepage: http://fastcompression.blogspot.com/p/lz4.html
      LZ4 source repository: http://code.google.com/p/lz4/
      Signed-off-by: Kyungsik Lee <kyungsik.lee@lge.com>
      Signed-off-by: Yann Collet <yann.collet.73@gmail.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Florian Fainelli <florian@openwrt.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cffb78b0
    • Chanho Min's avatar
      lib: add weak clz/ctz functions · 4df87bb7
      Chanho Min authored
      Some architectures need __c[lt]z[sd]i2() for __builtin_c[lt]z[ll], and
      their absence causes a build failure.  They can be implemented using
      fls()/__ffs() as weak functions and overridden at link time by
      arch-specific versions, which may not be implemented yet.
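
The relationship between the two pairs of helpers can be modelled in a few lines of Python. This is only a sketch for 32-bit values, not the kernel code: fls() is modelled as the 1-based position of the highest set bit (0 for an input of 0), ffs0() stands in for __ffs() as the 0-based position of the lowest set bit, and the names clz32/ctz32 are made up for this example.

```python
MASK32 = 0xFFFFFFFF


def fls(x: int) -> int:
    """1-based index of the most significant set bit; 0 when x is 0."""
    return (x & MASK32).bit_length()


def ffs0(x: int) -> int:
    """0-based index of the least significant set bit (models __ffs())."""
    x &= MASK32
    if x == 0:
        raise ValueError("result is undefined for 0")
    # x & -x isolates the lowest set bit.
    return (x & -x).bit_length() - 1


def clz32(x: int) -> int:
    """Count leading zeros of a 32-bit value, built on fls()."""
    return 32 - fls(x)


def ctz32(x: int) -> int:
    """Count trailing zeros of a 32-bit value, built on ffs0()."""
    return ffs0(x)


for value in (1, 8, 0x00010000, 0x80000000):
    print(f"{value:#010x}: clz={clz32(value)} ctz={ctz32(value)}")
```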
      
      This is required by "lib: add lz4 compressor module".
      
      Reference: https://lkml.org/lkml/2013/4/18/603
      Signed-off-by: Chanho Min <chanho.min@lge.com>
      Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: "Darrick J. Wong" <djwong@us.ibm.com>
      Cc: Bob Pearson <rpearson@systemfabricworks.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Herbert Xu <herbert@gondor.hengli.com.au>
      Cc: Yann Collet <yann.collet.73@gmail.com>
      Cc: Kyungsik Lee <kyungsik.lee@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4df87bb7
    • Robin Holt's avatar
      reboot: move arch/x86 reboot= handling to generic kernel · 1b3a5d02
      Robin Holt authored
      Merge together the unicore32, arm, and x86 reboot= command line
      parameter handling.
      Signed-off-by: Robin Holt <holt@sgi.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Russ Anderson <rja@sgi.com>
      Cc: Robin Holt <holt@sgi.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Acked-by: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1b3a5d02
    • Robin Holt's avatar
      reboot: arm: change reboot_mode to use enum reboot_mode · 7b6d864b
      Robin Holt authored
      Preparing to move the parsing of reboot= to generic kernel code forces
      the change in reboot_mode handling to use the enum.
      
      [akpm@linux-foundation.org: fix arch/arm/mach-socfpga/socfpga.c]
      Signed-off-by: Robin Holt <holt@sgi.com>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Russ Anderson <rja@sgi.com>
      Cc: Robin Holt <holt@sgi.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7b6d864b
    • Robin Holt's avatar
      reboot: arm: prepare reboot_mode for moving to generic kernel code · 16d6d5b0
      Robin Holt authored
      Prepare for moving the parsing of reboot= to the generic kernel code
      by making reboot_mode into a more generic form.
      Signed-off-by: Robin Holt <holt@sgi.com>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Russ Anderson <rja@sgi.com>
      Cc: Robin Holt <holt@sgi.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      16d6d5b0
    • Robin Holt's avatar
      reboot: arm: remove unused restart_mode fields from some arm subarchs · 58591942
      Robin Holt authored
      These restart_mode fields are not used at all.  Remove them to make
      moving the reboot= cmdline options to the general kernel easier.
      Signed-off-by: Robin Holt <holt@sgi.com>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Russ Anderson <rja@sgi.com>
      Cc: Robin Holt <holt@sgi.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      58591942
    • Robin Holt's avatar
      reboot: unicore32: prepare reboot_mode for moving to generic kernel code · c97a7008
      Robin Holt authored
      Prepare for moving the parsing of reboot= to the generic kernel code
      by making reboot_mode into a more generic form.
      Signed-off-by: Robin Holt <holt@sgi.com>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Russ Anderson <rja@sgi.com>
      Cc: Robin Holt <holt@sgi.com>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Acked-by: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c97a7008
    • Robin Holt's avatar
      reboot: x86: prepare reboot_mode for moving to generic kernel code · edf2b139
      Robin Holt authored
      Prepare for moving the parsing of reboot= to the generic kernel code
      by making reboot_mode into a more generic form.
      Signed-off-by: Robin Holt <holt@sgi.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Miguel Boton <mboton.lkml@gmail.com>
      Cc: Russ Anderson <rja@sgi.com>
      Cc: Robin Holt <holt@sgi.com>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      edf2b139
    • Robin Holt's avatar
      reboot: checkpatch.pl the new kernel/reboot.c file · 972ee83d
      Robin Holt authored
      Get the new file to pass scripts/checkpatch.pl
      Signed-off-by: Robin Holt <holt@sgi.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Russ Anderson <rja@sgi.com>
      Cc: Robin Holt <holt@sgi.com>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      972ee83d
    • Robin Holt's avatar
      reboot: move shutdown/reboot related functions to kernel/reboot.c · 15d94b82
      Robin Holt authored
      This patch is preparatory.  It moves the reboot-related syscall and
      associated functions from kernel/sys.c to kernel/reboot.c.
      Signed-off-by: Robin Holt <holt@sgi.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Russ Anderson <rja@sgi.com>
      Cc: Robin Holt <holt@sgi.com>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      15d94b82
    • Robin Holt's avatar
      reboot: remove -stable friendly PF_THREAD_BOUND define · 0efbee70
      Robin Holt authored
      Remove the #define that the prior patch added for easier backporting to
      the stable releases.
      Signed-off-by: Robin Holt <holt@sgi.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Russ Anderson <rja@sgi.com>
      Cc: Robin Holt <holt@sgi.com>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0efbee70
    • Philippe De Muyter's avatar
      partitions/msdos: enumerate also AIX LVM partitions · f8f06603
      Philippe De Muyter authored
      Graft AIX partitions enumeration into partitions/msdos.c
      
      There is already AIX disk detection logic in msdos.c.  When an AIX disk
      has been found, and if configured to do so, call the AIX partition
      recognizer.  This avoids removing the AIX disk protection from msdos.c,
      avoids code duplication, and ensures that AIX partition enumeration is
      called before plain msdos partition enumeration.
      Signed-off-by: Philippe De Muyter <phdm@macqel.be>
      Cc: Karel Zak <kzak@redhat.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f8f06603
    • Philippe De Muyter's avatar
      partitions: add aix lvm partition support files · 6ceea22b
      Philippe De Muyter authored
      Add partitions/aix.h and partitions/aix.c.
      
      AIX LVM allows creating "logical volumes" which are made of multiple
      slices of multiple disks.  The new code allows access only to the
      "logical volumes" which are made of one slice on the probed disk, a
      slice being a contiguous disk area.  The code also detects "logical
      volumes" made of multiple slices on the probed disk, but cannot
      describe them to the partition layer, because the partition layer
      generic code does not support that.  When such non-contiguous "logical
      volumes" are detected, a diagnostic message is printed.
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Philippe De Muyter <phdm@macqel.be>
      Cc: Karel Zak <kzak@redhat.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6ceea22b
    • Philippe De Muyter's avatar
      partitions/msdos.c: end-of-line whitespace and semicolon cleanup · 1d04f3c6
      Philippe De Muyter authored
      Signed-off-by: Philippe De Muyter <phdm@macqel.be>
      Cc: Karel Zak <kzak@redhat.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1d04f3c6
    • Dan Carpenter's avatar
      mwave: fix info leak in mwave_ioctl() · 026dadad
      Dan Carpenter authored
      Smatch complains that on 64 bit systems, there is a hole in the
      MW_ABILITIES struct between ->component_count and ->component_list[].
      It leaks stack information from the mwave_ioctl() function.
      
      I've added a memset() to initialize the struct to zero.
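
The effect of such a padding hole can be reproduced outside the kernel. The Python sketch below uses ctypes with a hypothetical two-field struct (it is not the real MW_ABILITIES layout, and c_uint64 merely stands in for a 64-bit unsigned long). A bytearray pre-filled with 0xAA plays the role of stale stack memory; the names Abilities, padding_holes and fill are invented for this example.

```python
import ctypes


class Abilities(ctypes.Structure):
    # Hypothetical stand-in for illustration; NOT the real MW_ABILITIES.
    _fields_ = [
        ("component_count", ctypes.c_int16),
        # Assumption: c_uint64 models a 64-bit unsigned long.
        ("component_list", ctypes.c_uint64 * 7),
    ]


def padding_holes(struct_type):
    """Return (offset, length) for every gap between consecutive fields."""
    holes, end = [], 0
    for name, _ctype in struct_type._fields_:
        desc = getattr(struct_type, name)
        if desc.offset > end:
            holes.append((end, desc.offset - end))
        end = desc.offset + desc.size
    return holes


def fill(zero_first: bool) -> bytes:
    """Fill the struct field by field on top of stale memory."""
    raw = bytearray(b"\xAA" * ctypes.sizeof(Abilities))  # stale contents
    obj = Abilities.from_buffer(raw)
    if zero_first:
        # Equivalent of the memset() the patch adds.
        ctypes.memset(ctypes.addressof(obj), 0, ctypes.sizeof(obj))
    obj.component_count = 3
    for i in range(7):
        obj.component_list[i] = i
    return bytes(raw)


for offset, length in padding_holes(Abilities):
    leaked = fill(zero_first=False)[offset:offset + length]
    cleared = fill(zero_first=True)[offset:offset + length]
    print(f"hole at {offset} ({length} bytes): {leaked.hex()} vs {cleared.hex()}")
```

Assigning every named field is not enough: the bytes in the hole keep whatever was in memory before, which is why zeroing the whole struct first closes the leak.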
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: Greg KH <greg@kroah.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      026dadad
    • Manfred Spraul's avatar
      ipc/sem.c: rename try_atomic_semop() to perform_atomic_semop(), docu update · 758a6ba3
      Manfred Spraul authored
      Cleanup: Some minor points that I noticed while writing the previous
      patches
      
      1) The name try_atomic_semop() is misleading: The function performs the
         operation (if it is possible).
      
      2) Some documentation updates.
      
      No real code change, a rename and documentation changes.
      Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      758a6ba3
    • Manfred Spraul's avatar
      ipc/sem.c: replace shared sem_otime with per-semaphore value · d12e1e50
      Manfred Spraul authored
      sem_otime contains the time of the last semaphore operation that
      completed successfully.  Every operation updates this value, thus access
      from multiple cpus can cause thrashing.
      
      Therefore the patch replaces the variable with a per-semaphore variable.
      The per-array sem_otime is only calculated when required.
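
A small Python model may make the idea concrete. It is a sketch only: the class and method names are invented, and it assumes the array-wide value is derived on demand as the most recent of the per-semaphore timestamps, which the message implies but does not spell out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Sem:
    semval: int = 0
    sem_otime: int = 0  # time of the last successful op on this semaphore


@dataclass
class SemArray:
    sems: List[Sem] = field(default_factory=list)

    def record_op(self, semnum: int, now: int) -> None:
        """A successful operation touches only its own semaphore's stamp."""
        self.sems[semnum].sem_otime = now

    def get_otime(self) -> int:
        """Derive the array-wide value only when someone asks for it.

        Assumption: it is the newest per-semaphore timestamp.
        """
        return max((s.sem_otime for s in self.sems), default=0)


array = SemArray([Sem(), Sem(), Sem()])
array.record_op(0, now=100)
array.record_op(2, now=250)
array.record_op(1, now=180)
print(array.get_otime())
```

The write path no longer shares a single variable between CPUs; the cost moves to the comparatively rare read of the array-wide value.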
      
      No performance improvement on a single-socket i3 - only important for
      larger systems.
      Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d12e1e50
    • Manfred Spraul's avatar
      ipc/sem.c: always use only one queue for alter operations · f269f40a
      Manfred Spraul authored
      There are two places that can contain alter operations:
       - the global queue: sma->pending_alter
       - the per-semaphore queues: sma->sem_base[].pending_alter.
      
      Since one of the queues must be processed first, this causes an odd
      prioritization of the wakeups: complex operations have priority over
      simple ops.
      
      The patch restores the behavior of linux <=3.0.9: The longest waiting
      operation has the highest priority.
      
      This is done by using only one queue:
       - if there are complex ops, then sma->pending_alter is used.
       - otherwise, the per-semaphore queues are used.
      
      As a side effect, do_smart_update_queue() becomes much simpler: no more
      goto logic.
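
The queue selection rule can be sketched as a toy Python model. Several things here are assumptions rather than facts from the patch: a "complex" operation is taken to be one touching more than one semaphore, waiting simple operations are moved to the array-wide queue in arrival order when the first complex operation shows up, and dequeuing is not modelled at all. The class name SemArrayModel and its members are invented for the example.

```python
from collections import deque


class SemArrayModel:
    def __init__(self, nsems: int):
        self.pending_alter = deque()                        # array-wide queue
        self.sem_pending = [deque() for _ in range(nsems)]  # per-semaphore
        self.complex_count = 0
        self._seq = 0  # arrival counter, used to keep ops in waiting order

    def enqueue(self, sem_nums):
        """Queue an alter operation touching the given semaphore numbers."""
        self._seq += 1
        op = (self._seq, tuple(sem_nums))
        if len(sem_nums) > 1:  # assumption: this is what "complex" means
            if self.complex_count == 0:
                self._merge()
            self.complex_count += 1
        if self.complex_count > 0:
            self.pending_alter.append(op)
        else:
            self.sem_pending[sem_nums[0]].append(op)
        return op

    def _merge(self):
        """Assumed step: move waiting simple ops over, oldest first."""
        waiting = sorted(op for queue in self.sem_pending for op in queue)
        for queue in self.sem_pending:
            queue.clear()
        self.pending_alter.extend(waiting)


model = SemArrayModel(nsems=3)
a = model.enqueue([0])
b = model.enqueue([1])
c = model.enqueue([0, 2])   # first complex op: everything moves to one queue
d = model.enqueue([1])      # simple op, but complex ops are pending
print(list(model.pending_alter))
```

At any moment only one kind of queue holds alter operations, so the head of that queue is the longest-waiting one.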
      Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f269f40a
    • Manfred Spraul's avatar
      ipc/sem: separate wait-for-zero and alter tasks into seperate queues · 1a82e9e1
      Manfred Spraul authored
      Introduce separate queues for operations that do not modify the
      semaphore values.  Advantages:
      
       - Simpler logic in check_restart().
       - Faster update_queue(): Right now, all wait-for-zero operations are
         always tested, even if the semaphore value is not 0.
       - wait-for-zero gets priority again, as in linux <=3.0.9
      Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1a82e9e1
    • Manfred Spraul's avatar
      ipc/sem.c: cacheline align the semaphore structures · f5c936c0
      Manfred Spraul authored
      As now each semaphore has its own spinlock and parallel operations are
      possible, give each semaphore its own cacheline.
      
      On an i3 laptop, this gives up to 28% better performance:
      
        #semscale 10 | grep "interleave 2"
        - before:
        Cpus 1, interleave 2 delay 0: 36109234 in 10 secs
        Cpus 2, interleave 2 delay 0: 55276317 in 10 secs
        Cpus 3, interleave 2 delay 0: 62411025 in 10 secs
        Cpus 4, interleave 2 delay 0: 81963928 in 10 secs
      
        - after:
        Cpus 1, interleave 2 delay 0: 35527306 in 10 secs
        Cpus 2, interleave 2 delay 0: 70922909 in 10 secs <<< + 28%
        Cpus 3, interleave 2 delay 0: 80518538 in 10 secs
        Cpus 4, interleave 2 delay 0: 89115148 in 10 secs <<< + 8.7%
      
      i3, with 2 cores and with hyperthreading enabled.  Interleave 2 is used in
      order to use the full cores first.  HT partially hides the delay from
      cacheline thrashing, thus the improvement is "only" 8.7% if 4 threads are
      running.
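
The annotated percentages can be rechecked from the semscale figures with a few lines of Python. The dictionary names are only illustrative; the operation counts are copied from the listing above.

```python
# Operations completed in 10 seconds, keyed by number of CPUs.
before = {1: 36109234, 2: 55276317, 3: 62411025, 4: 81963928}
after = {1: 35527306, 2: 70922909, 3: 80518538, 4: 89115148}

# Relative change per CPU count, in percent.
gains = {
    cpus: (after[cpus] - before[cpus]) / before[cpus] * 100.0
    for cpus in before
}

for cpus, gain in sorted(gains.items()):
    print(f"Cpus {cpus}: {gain:+.1f}%")
```

The 2-CPU and 4-CPU rows correspond to the two figures highlighted in the listing.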
      Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f5c936c0