1. 26 May, 2011 · 40 commits
    • Fix build with !HUGETLBFS · be93d8cf
      Linus Torvalds authored
      I stupidly broke the case of CONFIG_HUGETLBFS=n when doing the
      conversion to vm_flags_t in commit ca16d140 ("mm: don't access
      vm_flags as 'int'").  And my 'allyesconfig' build didn't find it, for
      obvious reasons..
      
      Include <linux/mm_types.h> in <linux/hugetlb.h>.  The problem could have
      been avoided by just turning the hugetlb_file_setup() error wrapper into
      a macro, but mm_types.h is a reasonable include in this file.
      Reported-by: Richard -rw- Weinberger <richard.weinberger@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
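The trade-off described in this message (an inline error wrapper versus a macro) can be illustrated in user space. This is a minimal sketch, not the kernel's code: `file_setup_inline`, `file_setup_macro`, `SKETCH_ERR_PTR`, and `SKETCH_PTR_ERR` are hypothetical names, and the error-pointer encoding is a simplified imitation. The inline form type-checks its arguments, so the header must be able to see `vm_flags_t`; the macro form discards its arguments and needs no type at all.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the typedef the kernel keeps in <linux/mm_types.h>. */
typedef unsigned long vm_flags_t;

struct file; /* opaque in this sketch */

/* Simplified user-space imitation of the kernel's error-pointer idiom. */
#define SKETCH_ERR_PTR(err) ((void *)(intptr_t)(err))
#define SKETCH_PTR_ERR(ptr) ((long)(intptr_t)(ptr))

/* Inline stub: the compiler must know vm_flags_t to parse this prototype,
 * so the header that defines the type has to be included first. */
static inline struct file *file_setup_inline(const char *name, size_t size,
                                             vm_flags_t acct)
{
    (void)name;
    (void)size;
    (void)acct;
    return SKETCH_ERR_PTR(-ENOSYS);
}

/* Macro stub: the arguments are dropped textually, so no type is required,
 * at the price of losing argument type checking. */
#define file_setup_macro(name, size, acct) \
    ((struct file *)SKETCH_ERR_PTR(-ENOSYS))

/* Both stubs report the same "not implemented" error. */
long stub_demo(void)
{
    struct file *a = file_setup_inline("demo", 4096, 0);
    struct file *b = file_setup_macro("demo", 4096, 0);

    assert(a == b);
    return SKETCH_PTR_ERR(a);
}
```

Calling `stub_demo()` returns the negative error code shared by both stubs; only the inline variant would fail to compile if the `vm_flags_t` typedef were missing.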
    • Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 · a74b81b0
      Linus Torvalds authored
      * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: (28 commits)
        Ocfs2: Teach local-mounted ocfs2 to handle unwritten_extents correctly.
        ocfs2/dlm: Do not migrate resource to a node that is leaving the domain
        ocfs2/dlm: Add new dlm message DLM_BEGIN_EXIT_DOMAIN_MSG
        Ocfs2/move_extents: Set several trivial constraints for threshold.
        Ocfs2/move_extents: Let defrag handle partial extent moving.
        Ocfs2/move_extents: move/defrag extents within a certain range.
        Ocfs2/move_extents: helper to calculate the defraging length in one run.
        Ocfs2/move_extents: move entire/partial extent.
        Ocfs2/move_extents: helpers to update the group descriptor and global bitmap inode.
        Ocfs2/move_extents: helper to probe a proper region to move in an alloc group.
        Ocfs2/move_extents: helper to validate and adjust moving goal.
        Ocfs2/move_extents: find the victim alloc group, where the given #blk fits.
        Ocfs2/move_extents: defrag a range of extent.
        Ocfs2/move_extents: move a range of extent.
        Ocfs2/move_extents: lock allocators and reserve metadata blocks and data clusters for extents moving.
        Ocfs2/move_extents: Add basic framework and source files for extent moving.
        Ocfs2/move_extents: Adding new ioctl code 'OCFS2_IOC_MOVE_EXT' to ocfs2.
        Ocfs2/refcounttree: Publicize couple of funcs from refcounttree.c
        Ocfs2: Add a new code 'OCFS2_INFO_FREEFRAG' for o2info ioctl.
        Ocfs2: Add a new code 'OCFS2_INFO_FREEINODE' for o2info ioctl.
        ...
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/djm/tmem · f8d613e2
      Linus Torvalds authored
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/djm/tmem:
        xen: cleancache shim to Xen Transcendent Memory
        ocfs2: add cleancache support
        ext4: add cleancache support
        btrfs: add cleancache support
        ext3: add cleancache support
        mm/fs: add hooks to support cleancache
        mm: cleancache core ops functions and config
        fs: add field to superblock to support cleancache
        mm/fs: cleancache documentation
      
      Fix up trivial conflict in fs/btrfs/extent_io.c due to includes
    • Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs · 8a0599dd
      Linus Torvalds authored
      * 'for-linus' of git://oss.sgi.com/xfs/xfs:
        xfs: correctly decrement the extent buffer index in xfs_bmap_del_extent
        xfs: check for valid indices in xfs_iext_get_ext and xfs_iext_idx_to_irec
        xfs: fix up asserts in xfs_iflush_fork
        xfs: do not do pointer arithmetic on extent records
        xfs: do not use unchecked extent indices in xfs_bunmapi
        xfs: do not use unchecked extent indices in xfs_bmapi
        xfs: do not use unchecked extent indices in xfs_bmap_add_extent_*
        xfs: remove if_lastex
        xfs: remove the unused XFS_BMAPI_RSVBLOCKS flag
        xfs: do not discard alloc btree blocks
        xfs: add online discard support
    • Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 35806b4f
      Linus Torvalds authored
      * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (61 commits)
        jbd2: Add MAINTAINERS entry
        jbd2: fix a potential leak of a journal_head on an error path
        ext4: teach ext4_ext_split to calculate extents efficiently
        ext4: Convert ext4 to new truncate calling convention
        ext4: do not normalize block requests from fallocate()
        ext4: enable "punch hole" functionality
        ext4: add "punch hole" flag to ext4_map_blocks()
        ext4: punch out extents
        ext4: add new function ext4_block_zero_page_range()
        ext4: add flag to ext4_has_free_blocks
        ext4: reserve inodes and feature code for 'quota' feature
        ext4: add support for multiple mount protection
        ext4: ensure f_bfree returned by ext4_statfs() is non-negative
        ext4: protect bb_first_free in ext4_trim_all_free() with group lock
        ext4: only load buddy bitmap in ext4_trim_fs() when it is needed
        jbd2: Fix comment to match the code in jbd2__journal_start()
        ext4: fix waiting and sending of a barrier in ext4_sync_file()
        jbd2: Add function jbd2_trans_will_send_data_barrier()
        jbd2: fix sending of data flush on journal commit
        ext4: fix ext4_ext_fiemap_cb() to handle blocks before request range correctly
        ...
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 · 32e51f14
      Linus Torvalds authored
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (25 commits)
        cifs: remove unnecessary dentry_unhash on rmdir/rename_dir
        ocfs2: remove unnecessary dentry_unhash on rmdir/rename_dir
        exofs: remove unnecessary dentry_unhash on rmdir/rename_dir
        nfs: remove unnecessary dentry_unhash on rmdir/rename_dir
        ext2: remove unnecessary dentry_unhash on rmdir/rename_dir
        ext3: remove unnecessary dentry_unhash on rmdir/rename_dir
        ext4: remove unnecessary dentry_unhash on rmdir/rename_dir
        btrfs: remove unnecessary dentry_unhash in rmdir/rename_dir
        ceph: remove unnecessary dentry_unhash calls
        vfs: clean up vfs_rename_other
        vfs: clean up vfs_rename_dir
        vfs: clean up vfs_rmdir
        vfs: fix vfs_rename_dir for FS_RENAME_DOES_D_MOVE filesystems
        libfs: drop unneeded dentry_unhash
        vfs: update dentry_unhash() comment
        vfs: push dentry_unhash on rename_dir into file systems
        vfs: push dentry_unhash on rmdir into file systems
        vfs: remove dget() from dentry_unhash()
        vfs: dentry_unhash immediately prior to rmdir
        vfs: Block mmapped writes while the fs is frozen
        ...
    • mm: don't access vm_flags as 'int' · ca16d140
      KOSAKI Motohiro authored
      The type of vma->vm_flags is 'unsigned long'. Neither 'int' nor
      'unsigned int'. This patch fixes such misuse.
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      [ Changed to use a typedef - we'll extend it to cover more cases
        later, since there has been discussion about making it a 64-bit
        type..                      - Linus ]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
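Why the flags type matters can be shown with a small user-space sketch. It assumes a 64-bit flags type (the widening that the bracketed note says was under discussion), and the flag names `VM_SKETCH_LOW` and `VM_SKETCH_HIGH` are hypothetical, not kernel flags. Passing the value through a 32-bit temporary silently drops every bit above bit 31, while carrying it in the typedef keeps them.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the kernel typedef; a fixed 64-bit type is assumed here so
 * the sketch behaves the same on every platform. */
typedef uint64_t vm_flags_t;

/* Hypothetical flags: one in the low word, one above bit 31. */
#define VM_SKETCH_LOW  ((vm_flags_t)1 << 0)
#define VM_SKETCH_HIGH ((vm_flags_t)1 << 32)

/* Buggy pattern: the flags pass through a 32-bit temporary. */
static vm_flags_t through_int(vm_flags_t flags)
{
    uint32_t narrowed = (uint32_t)flags; /* drops bits 32..63 */
    return narrowed;
}

/* Fixed pattern: the temporary uses the typedef, so nothing is lost. */
static vm_flags_t through_typedef(vm_flags_t flags)
{
    vm_flags_t kept = flags;
    return kept;
}

/* Returns 0 after checking both patterns. */
int vm_flags_demo(void)
{
    vm_flags_t flags = VM_SKETCH_LOW | VM_SKETCH_HIGH;

    assert(through_int(flags) == VM_SKETCH_LOW); /* high bit is gone */
    assert(through_typedef(flags) == flags);     /* high bit survives */
    return 0;
}
```

Using one typedef everywhere also means a later change of width only has to be made in a single place.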
    • xen: cleancache shim to Xen Transcendent Memory · 5bc20fc5
      Dan Magenheimer authored
      This patch provides a shim between the kernel-internal cleancache
      API (see Documentation/vm/cleancache.txt) and the Xen Transcendent
      Memory ABI (see http://oss.oracle.com/projects/tmem).
      
      Xen tmem provides "hypervisor RAM" as an ephemeral page-oriented
      pseudo-RAM store for cleancache pages, shared cleancache pages,
      and frontswap pages.  Tmem provides enterprise-quality concurrency,
      full save/restore and live migration support, compression
      and deduplication.
      
      A presentation showing up to 8% faster performance and up to 52%
      reduction in sectors read on a kernel compile workload, despite
      aggressive in-kernel page reclamation ("self-ballooning") can be
      found at:
      
      http://oss.oracle.com/projects/tmem/dist/documentation/presentations/TranscendentMemoryXenSummit2010.pdf
      Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
      Reviewed-by: Jeremy Fitzhardinge <jeremy@goop.org>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Rik Van Riel <riel@redhat.com>
      Cc: Jan Beulich <JBeulich@novell.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Andreas Dilger <adilger@sun.com>
      Cc: Ted Ts'o <tytso@mit.edu>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <joel.becker@oracle.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
    • ocfs2: add cleancache support · 1cfd8bd0
      Dan Magenheimer authored
      This eighth patch of eight in this cleancache series "opts-in"
      cleancache for ocfs2.  Clustered filesystems must explicitly enable
      cleancache by calling cleancache_init_shared_fs anytime an instance
      of the filesystem is mounted.  Ocfs2 is currently the only user of
      the clustered filesystem interface but nevertheless, the cleancache
      hooks in the VFS layer are sufficient for ocfs2 including the matching
      cleancache_flush_fs hook which must be called on unmount.
      
      Details and a FAQ can be found in Documentation/vm/cleancache.txt
      
      [v8: trivial merge conflict update]
      [v5: jeremy@goop.org: simplify init hook and any future fs init changes]
      Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
      Signed-off-by: Joel Becker <joel.becker@oracle.com>
      Reviewed-by: Jeremy Fitzhardinge <jeremy@goop.org>
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Rik Van Riel <riel@redhat.com>
      Cc: Jan Beulich <JBeulich@novell.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Andreas Dilger <adilger@sun.com>
      Cc: Ted Ts'o <tytso@mit.edu>
      Cc: Nitin Gupta <ngupta@vflare.org>
    • ext4: add cleancache support · 7abc52c2
      Dan Magenheimer authored
      This seventh patch of eight in this cleancache series "opts-in"
      cleancache for ext4.  Filesystems must explicitly enable cleancache
      by calling cleancache_init_fs anytime an instance of the filesystem
      is mounted. For ext4, all other cleancache hooks are in
      the VFS layer including the matching cleancache_flush_fs
      hook which must be called on unmount.
      
      Details and a FAQ can be found in Documentation/vm/cleancache.txt
      
      [v6-v8: no changes]
      [v5: jeremy@goop.org: simplify init hook and any future fs init changes]
      Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
      Reviewed-by: Jeremy Fitzhardinge <jeremy@goop.org>
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Acked-by: Andreas Dilger <adilger@sun.com>
      Cc: Ted Ts'o <tytso@mit.edu>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Rik Van Riel <riel@redhat.com>
      Cc: Jan Beulich <JBeulich@novell.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <joel.becker@oracle.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
    • btrfs: add cleancache support · 90a887c9
      Dan Magenheimer authored
      This sixth patch of eight in this cleancache series "opts-in"
      cleancache for btrfs.  Filesystems must explicitly enable
      cleancache by calling cleancache_init_fs anytime an instance
      of the filesystem is mounted.  Btrfs uses its own readpage
      which must be hooked, but all other cleancache hooks are in
      the VFS layer including the matching cleancache_flush_fs hook
      which must be called on unmount.
      
      Details and a FAQ can be found in Documentation/vm/cleancache.txt
      
      [v6-v8: no changes]
      [v5: jeremy@goop.org: simplify init hook and any future fs init changes]
      Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
      Reviewed-by: Jeremy Fitzhardinge <jeremy@goop.org>
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Rik Van Riel <riel@redhat.com>
      Cc: Jan Beulich <JBeulich@novell.com>
      Cc: Andreas Dilger <adilger@sun.com>
      Cc: Ted Ts'o <tytso@mit.edu>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <joel.becker@oracle.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
    • ext3: add cleancache support · d71bc6db
      Dan Magenheimer authored
      This fifth patch of eight in this cleancache series "opts-in"
      cleancache for ext3.  Filesystems must explicitly enable
      cleancache by calling cleancache_init_fs anytime an instance
      of the filesystem is mounted. For ext3, all other cleancache
      hooks are in the VFS layer including the matching cleancache_flush_fs
      hook which must be called on unmount.
      
      Details and a FAQ can be found in Documentation/vm/cleancache.txt
      
      [v6-v8: no changes]
      [v5: jeremy@goop.org: simplify init hook and any future fs init changes]
      Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
      Reviewed-by: Jeremy Fitzhardinge <jeremy@goop.org>
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Acked-by: Andreas Dilger <adilger@sun.com>
      Cc: Ted Ts'o <tytso@mit.edu>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Rik Van Riel <riel@redhat.com>
      Cc: Jan Beulich <JBeulich@novell.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <joel.becker@oracle.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
    • mm/fs: add hooks to support cleancache · c515e1fd
      Dan Magenheimer authored
      This fourth patch of eight in this cleancache series provides the
      core hooks in VFS for: initializing cleancache per filesystem;
      capturing clean pages reclaimed by page cache; attempting to get
      pages from cleancache before filesystem read; and ensuring coherency
      between pagecache, disk, and cleancache.  Note that the placement
      of these hooks was stable from 2.6.18 to 2.6.38; a minor semantic
      change was required due to a patchset in 2.6.39.
      
      All hooks become no-ops if CONFIG_CLEANCACHE is unset, or become
      a check of a boolean global if CONFIG_CLEANCACHE is set but no
      cleancache "backend" has claimed cleancache_ops.
      
      Details and a FAQ can be found in Documentation/vm/cleancache.txt
      
      [v8: minchan.kim@gmail.com: adapt to new remove_from_page_cache function]
      Signed-off-by: Chris Mason <chris.mason@oracle.com>
      Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
      Reviewed-by: Jeremy Fitzhardinge <jeremy@goop.org>
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Rik Van Riel <riel@redhat.com>
      Cc: Jan Beulich <JBeulich@novell.com>
      Cc: Andreas Dilger <adilger@sun.com>
      Cc: Ted Ts'o <tytso@mit.edu>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <joel.becker@oracle.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
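The "no-op or boolean check" behaviour described in this message can be sketched in user space. This is an illustration under stated assumptions, not the kernel implementation: `SKETCH_CLEANCACHE` stands in for `CONFIG_CLEANCACHE`, and every `sketch_*` name, the page structure, and the return convention (0 for a hit, -1 for a miss) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flip to 0 to model CONFIG_CLEANCACHE being unset. */
#define SKETCH_CLEANCACHE 1

struct sketch_page { int index; };

#if SKETCH_CLEANCACHE
/* Backend operations; unset until a backend registers. */
struct sketch_ops { int (*get_page)(struct sketch_page *); };

static struct sketch_ops sketch_backend;
static bool sketch_enabled; /* the "boolean global" */

static void sketch_register(int (*get_page)(struct sketch_page *))
{
    sketch_backend.get_page = get_page;
    sketch_enabled = true;
}

/* Hook: costs a single flag test while no backend is present. */
static inline int sketch_get_page(struct sketch_page *page)
{
    if (!sketch_enabled)
        return -1;
    return sketch_backend.get_page(page);
}
#else
/* Feature compiled out: registration and the hook are empty stubs. */
static inline void sketch_register(int (*get_page)(struct sketch_page *))
{
    (void)get_page;
}

static inline int sketch_get_page(struct sketch_page *page)
{
    (void)page;
    return -1;
}
#endif

/* Toy backend that reports a hit for every page. */
static int always_hit(struct sketch_page *page)
{
    (void)page;
    return 0;
}

/* Returns the hook result after a backend has registered. */
int hook_demo(void)
{
    struct sketch_page page = { 7 };
    int before = sketch_get_page(&page); /* no backend yet: miss */

    sketch_register(always_hit);
    assert(before == -1);
    return sketch_get_page(&page);
}
```

With `SKETCH_CLEANCACHE` set to 1, the first lookup misses because the flag is still false and the second is forwarded to the registered backend; with it set to 0, both lookups miss without touching any global state.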
    • mm: cleancache core ops functions and config · 077b1f83
      Dan Magenheimer authored
      This third patch of eight in this cleancache series provides
      the core code for cleancache that interfaces between the hooks in
      VFS and individual filesystems and a cleancache backend.  It also
      includes build and config patches.
      
      Two new files are added: mm/cleancache.c and include/linux/cleancache.h.
      
      Note that CONFIG_CLEANCACHE can default to on; in systems that do
      not provide a cleancache backend, all hooks devolve to a simple
      check of a global enable flag, so performance impact should
      be negligible but can be reduced to zero impact if config'ed off.
      However for this first commit, it defaults to off.
      
      Details and a FAQ can be found in Documentation/vm/cleancache.txt
      
      Credits: Cleancache_ops design derived from Jeremy Fitzhardinge
      design for tmem
      
      [v8: dan.magenheimer@oracle.com: fix exportfs call affecting btrfs]
      [v8: akpm@linux-foundation.org: use static inline function, not macro]
      [v7: dan.magenheimer@oracle.com: cleanup sysfs and remove cleancache prefix]
      [v6: JBeulich@novell.com: robustly handle buggy fs encode_fh actor definition]
      [v5: jeremy@goop.org: clean up global usage and static var names]
      [v5: jeremy@goop.org: simplify init hook and any future fs init changes]
      [v5: hch@infradead.org: cleaner non-global interface for ops registration]
      [v4: adilger@sun.com: interface must support exportfs FS's]
      [v4: hch@infradead.org: interface must support 64-bit FS on 32-bit kernel]
      [v3: akpm@linux-foundation.org: use one ops struct to avoid pointer hops]
      [v3: akpm@linux-foundation.org: document and ensure PageLocked reqts are met]
      [v3: ngupta@vflare.org: fix success/fail codes, change funcs to void]
      [v2: viro@ZenIV.linux.org.uk: use sane types]
      Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
      Reviewed-by: Jeremy Fitzhardinge <jeremy@goop.org>
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
      Acked-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Nitin Gupta <ngupta@vflare.org>
      Acked-by: Minchan Kim <minchan.kim@gmail.com>
      Acked-by: Andreas Dilger <adilger@sun.com>
      Acked-by: Jan Beulich <JBeulich@novell.com>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Rik Van Riel <riel@redhat.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Ted Ts'o <tytso@mit.edu>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <joel.becker@oracle.com>
    • fs: add field to superblock to support cleancache · 9fdfdcf1
      Dan Magenheimer authored
      This second patch of eight in this cleancache series adds a field to
      the generic superblock to squirrel away a pool identifier that is
      dynamically provided by cleancache-enabled filesystems at mount time
      to uniquely identify files and pages belonging to this mounted filesystem.
      
      Details and a FAQ can be found in Documentation/vm/cleancache.txt
      
      [v8: trivial merge conflict update]
      Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
      Reviewed-by: Jeremy Fitzhardinge <jeremy@goop.org>
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Rik Van Riel <riel@redhat.com>
      Cc: Jan Beulich <JBeulich@novell.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Andreas Dilger <adilger@sun.com>
      Cc: Ted Ts'o <tytso@mit.edu>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <joel.becker@oracle.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
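How a per-mount pool identifier stored in the superblock pairs an init call at mount time with a flush call at unmount can be sketched as follows. This is a user-space illustration only: `struct sketch_super_block`, its `poolid` field, `sketch_init_fs`, and `sketch_flush_fs` are hypothetical stand-ins, and the counter-based allocator is an assumption rather than what any real backend does.

```c
#include <assert.h>

/* Minimal stand-in for a superblock carrying a pool identifier. */
struct sketch_super_block {
    int poolid; /* -1 while no pool is attached */
};

static int next_pool_id; /* toy allocator: hands out 0, 1, 2, ... */
static int live_pools;   /* pools created and not yet flushed */

/* Called when an instance of the filesystem is mounted. */
static void sketch_init_fs(struct sketch_super_block *sb)
{
    sb->poolid = next_pool_id++;
    live_pools++;
}

/* Called on unmount; drops everything that belonged to this mount. */
static void sketch_flush_fs(struct sketch_super_block *sb)
{
    if (sb->poolid < 0)
        return; /* cleancache was never enabled for this mount */
    sb->poolid = -1;
    live_pools--;
}

/* Mounts two filesystems, unmounts both, returns the pools left over. */
int pool_demo(void)
{
    struct sketch_super_block a = { -1 }, b = { -1 };

    sketch_init_fs(&a);
    sketch_init_fs(&b);
    assert(a.poolid != b.poolid); /* each mount gets its own pool */

    sketch_flush_fs(&a);
    sketch_flush_fs(&b);
    assert(a.poolid == -1 && b.poolid == -1);
    return live_pools;
}
```

Keeping the identifier in the superblock means every later put or get for a page can find its pool from the page's filesystem without a separate lookup table.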
    • mm/fs: cleancache documentation · 4fe4746a
      Dan Magenheimer authored
      This patchset introduces cleancache, an optional new feature exposed
      by the VFS layer that potentially dramatically increases page cache
      effectiveness for many workloads in many environments at a negligible
      cost.  It does this by providing an interface to transcendent memory,
      which is memory/storage that is not otherwise visible to and/or directly
      addressable by the kernel.
      
      Instead of being discarded, hooks in the reclaim code "put" clean
      pages to cleancache.  Filesystems that "opt-in" may "get" pages
      from cleancache that were previously put, but pages in cleancache are
      "ephemeral", meaning they may disappear at any time. And the size
      of cleancache is entirely dynamic and unknowable to the kernel.
      Filesystems currently supported by this patchset include ext3, ext4,
      btrfs, and ocfs2.  Other filesystems (especially those built entirely
      on VFS) should be easy to add, but should first be thoroughly tested to
      ensure coherency.
      
      Details and a FAQ are provided in Documentation/vm/cleancache.txt
      
      This first patch of eight in this cleancache series only adds two
      new documentation files.
      
      [v8: minor documentation changes by author]
      [v3: akpm@linux-foundation.org: document sysfs API]
      [v3: hch@infradead.org: move detailed description to Documentation/vm]
      Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
      Reviewed-by: Jeremy Fitzhardinge <jeremy@goop.org>
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Acked-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Rik Van Riel <riel@redhat.com>
      Cc: Jan Beulich <JBeulich@novell.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Andreas Dilger <adilger@sun.com>
      Cc: Ted Ts'o <tytso@mit.edu>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <joel.becker@oracle.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
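The "put", "get", and "ephemeral" semantics described in this message can be modelled with a toy store. This is a minimal sketch under stated assumptions: the two-slot table, the modulo placement, and all names (`sketch_put`, `sketch_get`, `read_page`) are hypothetical, and a real backend decides for itself what to keep. The point it illustrates is that a put is best effort, a get may miss at any time, and the reader must always be able to fall back to the disk.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SLOTS 2
#define PAGE_BYTES 8

struct slot {
    bool used;
    unsigned long key;
    char data[PAGE_BYTES];
};

static struct slot store[SLOTS];
static int disk_reads; /* counts fallbacks to the "disk" */

/* "put": best effort; silently evicts whatever shared the slot. */
static void sketch_put(unsigned long key, const char *data)
{
    struct slot *s = &store[key % SLOTS];

    s->used = true;
    s->key = key;
    memcpy(s->data, data, PAGE_BYTES);
}

/* "get": returns true on a hit, false if the page has disappeared. */
static bool sketch_get(unsigned long key, char *out)
{
    struct slot *s = &store[key % SLOTS];

    if (!s->used || s->key != key)
        return false;
    memcpy(out, s->data, PAGE_BYTES);
    return true;
}

/* Read path: try the store first, fall back to the disk on a miss. */
static void read_page(unsigned long key, char *out)
{
    if (sketch_get(key, out))
        return;
    memset(out, 'd', PAGE_BYTES); /* pretend this came from disk */
    disk_reads++;
}

/* Returns how many reads had to go to the disk. */
int ephemeral_demo(void)
{
    char page[PAGE_BYTES] = "clean";
    char out[PAGE_BYTES];

    sketch_put(1, page);
    read_page(1, out); /* hit: served from the store */
    assert(disk_reads == 0);
    assert(memcmp(out, page, PAGE_BYTES) == 0);

    sketch_put(3, page); /* 3 % 2 == 1, so key 1 is evicted */
    read_page(1, out);   /* miss: falls back to the disk */
    return disk_reads;
}
```

Because correctness never depends on a hit, the store is free to shrink or discard pages whenever it likes, which is what makes its size "unknowable" to the caller.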
    • jbd2: Add MAINTAINERS entry · d183e11a
      Theodore Ts'o authored
      Create a separate MAINTAINERS entry for jbd2
      
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • cifs: remove unnecessary dentry_unhash on rmdir/rename_dir · b6ff24a3
      Sage Weil authored
      Cifs has no problems with lingering references to unlinked directory
      inodes.
      
      CC: Steve French <sfrench@samba.org>
      CC: linux-cifs@vger.kernel.org
      Signed-off-by: Sage Weil <sage@newdream.net>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • ocfs2: remove unnecessary dentry_unhash on rmdir/rename_dir · 7ca57363
      Sage Weil authored
      Ocfs2 has no issues with lingering references to unlinked directory inodes.
      
      CC: Mark Fasheh <mfasheh@suse.com>
      CC: ocfs2-devel@oss.oracle.com
      Acked-by: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Sage Weil <sage@newdream.net>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • exofs: remove unnecessary dentry_unhash on rmdir/rename_dir · 8cbfa53b
      Sage Weil authored
      Exofs has no problems with lingering references to unlinked directory
      inodes.
      
      CC: Benny Halevy <bhalevy@panasas.com>
      CC: osd-dev@open-osd.org
      Acked-by: Boaz Harrosh <bharrosh@panasas.com>
      Signed-off-by: Sage Weil <sage@newdream.net>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • nfs: remove unnecessary dentry_unhash on rmdir/rename_dir · 052e2a1b
      Sage Weil authored
      NFS has no problems with lingering references to unlinked directory
      inodes.
      
      CC: Trond Myklebust <Trond.Myklebust@netapp.com>
      CC: linux-nfs@vger.kernel.org
      Signed-off-by: Sage Weil <sage@newdream.net>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • ext2: remove unnecessary dentry_unhash on rmdir/rename_dir · 5afcb940
      Sage Weil authored
      ext2 has no problems with lingering references to unlinked directory
      inodes.
      
      CC: Jan Kara <jack@suse.cz>
      CC: linux-ext4@vger.kernel.org
      Signed-off-by: Sage Weil <sage@newdream.net>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • ext3: remove unnecessary dentry_unhash on rmdir/rename_dir · 5a61a245
      Sage Weil authored
      ext3 has no problems with lingering references to unlinked directory
      inodes.
      
      CC: Jan Kara <jack@suse.cz>
      CC: Andrew Morton <akpm@linux-foundation.org>
      CC: Andreas Dilger <adilger.kernel@dilger.ca>
      CC: linux-ext4@vger.kernel.org
      Signed-off-by: Sage Weil <sage@newdream.net>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • ext4: remove unnecessary dentry_unhash on rmdir/rename_dir · 40ebc0af
      Sage Weil authored
      ext4 has no problems with lingering references to unlinked directory
      inodes.
      
      CC: "Theodore Ts'o" <tytso@mit.edu>
      CC: Andreas Dilger <adilger.kernel@dilger.ca>
      CC: linux-ext4@vger.kernel.org
      Signed-off-by: Sage Weil <sage@newdream.net>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • btrfs: remove unnecessary dentry_unhash in rmdir/rename_dir · f64f58f8
      Sage Weil authored
      Btrfs has no problems with lingering references to unlinked directory
      inodes.
      
      CC: Chris Mason <chris.mason@oracle.com>
      CC: linux-btrfs@vger.kernel.org
      Signed-off-by: Sage Weil <sage@newdream.net>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • ceph: remove unnecessary dentry_unhash calls · 051e8f0e
      Sage Weil authored
      Ceph does not need these, and they screw up our use of the dcache as a
      consistent cache.
      Signed-off-by: Sage Weil <sage@newdream.net>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • vfs: clean up vfs_rename_other · 51892bbb
      Sage Weil authored
      Simplify control flow to match vfs_rename_dir.
      Signed-off-by: Sage Weil <sage@newdream.net>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • vfs: clean up vfs_rename_dir · 9055cba7
      Sage Weil authored
      Simplify control flow through vfs_rename_dir.
      Signed-off-by: Sage Weil <sage@newdream.net>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • vfs: clean up vfs_rmdir · 912dbc15
      Sage Weil authored
      Simplify the control flow with an out label.
      Signed-off-by: Sage Weil <sage@newdream.net>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
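The "out label" style mentioned in this message is the common C single-exit pattern: every failure path jumps to one label that undoes whatever was set up, instead of repeating the cleanup in each branch. The sketch below is not the actual `vfs_rmdir()`; `struct sketch_dir`, its fields, and `sketch_rmdir` are hypothetical, and an integer field stands in for the lock.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical directory state, just enough to exercise the pattern. */
struct sketch_dir {
    int locked;
    int is_mountpoint;
    int entries;
    int removed;
};

/* Single-exit removal: every path leaves through 'out' and unlocks. */
int sketch_rmdir(struct sketch_dir *d)
{
    int error;

    d->locked = 1; /* stand-in for taking a lock */

    error = -EBUSY;
    if (d->is_mountpoint)
        goto out;

    error = -ENOTEMPTY;
    if (d->entries > 0)
        goto out;

    d->removed = 1;
    error = 0;
out:
    d->locked = 0; /* the one and only unlock site */
    return error;
}

/* Returns 1 if the empty directory was removed. */
int rmdir_demo(void)
{
    struct sketch_dir busy = { 0, 1, 0, 0 };
    struct sketch_dir empty = { 0, 0, 0, 0 };

    assert(sketch_rmdir(&busy) == -EBUSY);
    assert(busy.locked == 0 && busy.removed == 0);
    assert(sketch_rmdir(&empty) == 0);
    return empty.removed;
}
```

Setting `error` before each check keeps the branches to a bare `goto out`, so adding a new precondition later does not require touching the cleanup code.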
    • vfs: fix vfs_rename_dir for FS_RENAME_DOES_D_MOVE filesystems · b5afd2c4
      Miklos Szeredi authored
      vfs_rename_dir() doesn't properly account for filesystems with
      FS_RENAME_DOES_D_MOVE.  If new_dentry has a target inode attached, it
      unhashes the new_dentry prior to the rename() iop and rehashes it after,
      but doesn't account for the possibility that rename() may have swapped
      {old,new}_dentry.  For FS_RENAME_DOES_D_MOVE filesystems, it rehashes
      new_dentry (now the old renamed-from name, which d_move() expected to go
      away), such that a subsequent lookup will find it.  Currently all
      FS_RENAME_DOES_D_MOVE filesystems compensate for this by failing in
      d_revalidate.
      
      The bug was introduced by: commit 349457cc
      "[PATCH] Allow file systems to manually d_move() inside of ->rename()"
      
      Fix by not rehashing the new dentry.  Rehashing used to be needed by
      d_move() but isn't anymore.
      Reported-by: Sage Weil <sage@newdream.net>
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • libfs: drop unneeded dentry_unhash · 5c5d3f3b
      Sage Weil authored
      There are no libfs issues with dangling references to empty directories.
      Signed-off-by: Sage Weil <sage@newdream.net>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • vfs: update dentry_unhash() comment · a71905f0
      Sage Weil authored
      The helper is now only called by file systems, not the VFS.
      Signed-off-by: Sage Weil <sage@newdream.net>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • vfs: push dentry_unhash on rename_dir into file systems · e4eaac06
      Sage Weil authored
      Only a few file systems need this.  Start by pushing it down into each
      rename method (except gfs2 and xfs) so that it can be dealt with on a
      per-fs basis.
      Acked-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Sage Weil <sage@newdream.net>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • vfs: push dentry_unhash on rmdir into file systems · 79bf7c73
      Sage Weil authored
      Only a few file systems need this.  Start by pushing it down into each
      fs rmdir method (except gfs2 and xfs) so it can be dealt with on a per-fs
      basis.
      
      This does not change behavior for any in-tree file systems.
      Acked-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Sage Weil <sage@newdream.net>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      79bf7c73
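The idea behind the two "push dentry_unhash ... into file systems" commits above can be modelled in plain user-space C. This is only a toy sketch: every name here (`toy_dentry`, `toy_fs_ops`, `toy_vfs_rmdir` and so on) is hypothetical and none of it is kernel code. It shows the generic layer only dispatching, while each filesystem method decides for itself whether to unhash.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a dentry: it only tracks whether it is hashed. */
struct toy_dentry {
	bool hashed;
};

/* Hypothetical per-filesystem operations table. */
struct toy_fs_ops {
	int (*rmdir)(struct toy_dentry *dentry);
};

/* Toy equivalent of unhashing: the entry can no longer be looked up. */
void toy_dentry_unhash(struct toy_dentry *dentry)
{
	dentry->hashed = false;
}

/* A filesystem that still wants the dentry unhashed does it in its own method. */
int toy_needs_unhash_rmdir(struct toy_dentry *dentry)
{
	toy_dentry_unhash(dentry);
	return 0;
}

/* A filesystem that does not need it can simply leave the call out. */
int toy_plain_rmdir(struct toy_dentry *dentry)
{
	(void)dentry;
	return 0;
}

/* Generic layer: no unhash here any more, it only dispatches to the fs. */
int toy_vfs_rmdir(const struct toy_fs_ops *ops, struct toy_dentry *dentry)
{
	if (ops == NULL || ops->rmdir == NULL)
		return -1;
	return ops->rmdir(dentry);
}
```

In this model, moving the call into every method first (as the commits describe) keeps behaviour unchanged; individual methods can later drop it, as `toy_plain_rmdir` does.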
    • Sage Weil's avatar
      vfs: remove dget() from dentry_unhash() · 64252c75
      Sage Weil authored
      This serves no useful purpose that I can discern.  All callers (rename,
      rmdir) hold their own reference to the dentry.
      
      A quick audit of all file systems showed no relevant checks on the value
      of d_count in vfs_rmdir/vfs_rename_dir paths.
      Signed-off-by: Sage Weil <sage@newdream.net>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      64252c75
    • Sage Weil's avatar
      vfs: dentry_unhash immediately prior to rmdir · 48293699
      Sage Weil authored
      This presumes that there is no reason to unhash a dentry if we fail because
      it is a mountpoint or the LSM check fails, and that the LSM checks do not
      depend on the dentry being unhashed.
      Signed-off-by: Sage Weil <sage@newdream.net>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      48293699
    • Jan Kara's avatar
      vfs: Block mmapped writes while the fs is frozen · ea13a864
      Jan Kara authored
      We should not allow file modification via mmap while the filesystem is
      frozen. So block in block_page_mkwrite() while the filesystem is frozen.
      We cannot do the blocking wait in __block_page_mkwrite() since e.g. ext4
      will want to call that function with a transaction started in some cases,
      and that would deadlock. But we can at least do the non-blocking, reliable
      check in __block_page_mkwrite(), which is the hardest part anyway.
      
      We have to check for a frozen filesystem with the page marked dirty and under
      the page lock, which we still hold when returning from ->page_mkwrite(). Only
      that way can we avoid racing with the writeback done by the freezing code:
      either we mark the page dirty after the writeback has started, see freezing
      in progress and block, or writeback waits for our page lock, which is released
      only when the fault is done, and then writes out and write-protects the page
      again.
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      ea13a864
    • Jan Kara's avatar
      vfs: Create __block_page_mkwrite() helper passing error values back · 24da4fab
      Jan Kara authored
      Create a __block_page_mkwrite() helper which does everything that
      block_page_mkwrite() does, except that it passes back errors from the
      __block_write_begin / block_commit_write calls.
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      24da4fab
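The split described in the commit above, an inner helper that passes the raw error back and an outer wrapper that turns it into a fault result, can be sketched in user-space C. This is an illustration only: the `sketch_*` names and the `SKETCH_FAULT_*` codes are hypothetical and are not the kernel's `VM_FAULT_*` values, and the errno-to-fault mapping shown is an assumption, not taken from the commit.

```c
#include <errno.h>

/* Hypothetical fault codes for this sketch; not the kernel's VM_FAULT_* values. */
enum sketch_fault {
	SKETCH_FAULT_LOCKED,	/* success: page is returned locked */
	SKETCH_FAULT_OOM,	/* allocation failure */
	SKETCH_FAULT_SIGBUS,	/* any other failure */
};

/*
 * Inner helper: passes back a plain negative errno so callers can tell
 * failures apart. 'simulated_err' stands in for whatever the underlying
 * block-mapping calls would have returned.
 */
int sketch_inner_mkwrite(int simulated_err)
{
	return simulated_err;
}

/* Outer wrapper: collapses the errno into a fault code (assumed mapping). */
enum sketch_fault sketch_mkwrite(int simulated_err)
{
	int err = sketch_inner_mkwrite(simulated_err);

	if (err == 0)
		return SKETCH_FAULT_LOCKED;
	if (err == -ENOMEM)
		return SKETCH_FAULT_OOM;
	return SKETCH_FAULT_SIGBUS;
}
```

A caller that needs to distinguish, say, -ENOSPC from -EIO would call the inner helper directly; a caller that only needs a fault result keeps using the wrapper.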
    • Roman Borisov's avatar
      fs/namespace.c: bound mount propagation fix · 7c6e984d
      Roman Borisov authored
      This issue was discovered by users of busybox.  The bug definitely affects
      busybox users; I don't know how it affects others.  Apparently, mount is
      called with and without MS_SILENT, and this affects mount() behaviour,
      although MS_SILENT is only supposed to affect kernel logging verbosity.
      
      The following script was run in an empty test directory:
      
      mkdir -p mount.dir mount.shared1 mount.shared2
      touch mount.dir/a mount.dir/b
      mount -vv --bind         mount.shared1 mount.shared1
      mount -vv --make-rshared mount.shared1
      mount -vv --bind         mount.shared2 mount.shared2
      mount -vv --make-rshared mount.shared2
      mount -vv --bind mount.shared2 mount.shared1
      mount -vv --bind mount.dir     mount.shared2
      ls -R mount.dir mount.shared1 mount.shared2
      umount mount.dir mount.shared1 mount.shared2 2>/dev/null
      umount mount.dir mount.shared1 mount.shared2 2>/dev/null
      umount mount.dir mount.shared1 mount.shared2 2>/dev/null
      rm -f mount.dir/a mount.dir/b mount.dir/c
      rmdir mount.dir mount.shared1 mount.shared2
      
      mount -vv was used to show the mount() call arguments and result.
      The output shows that the flags argument has the MS_SILENT bit (0x00008000) set:
      
      mount: mount('mount.shared1','mount.shared1','(null)',0x00009000,'(null)'):0
      mount: mount('','mount.shared1','',0x0010c000,''):0
      mount: mount('mount.shared2','mount.shared2','(null)',0x00009000,'(null)'):0
      mount: mount('','mount.shared2','',0x0010c000,''):0
      mount: mount('mount.shared2','mount.shared1','(null)',0x00009000,'(null)'):0
      mount: mount('mount.dir','mount.shared2','(null)',0x00009000,'(null)'):0
      mount.dir:
      a
      b
      
      mount.shared1:
      
      mount.shared2:
      a
      b
      
      After adding the --loud option to remove the MS_SILENT bit from just one mount command:
      
      mkdir -p mount.dir mount.shared1 mount.shared2
      touch mount.dir/a mount.dir/b
      mount -vv --bind         mount.shared1 mount.shared1 2>&1
      mount -vv --make-rshared mount.shared1               2>&1
      mount -vv --bind         mount.shared2 mount.shared2 2>&1
      mount -vv --loud --make-rshared mount.shared2               2>&1  # <-HERE
      mount -vv --bind mount.shared2 mount.shared1         2>&1
      mount -vv --bind mount.dir     mount.shared2         2>&1
      ls -R mount.dir mount.shared1 mount.shared2      2>&1
      umount mount.dir mount.shared1 mount.shared2 2>/dev/null
      umount mount.dir mount.shared1 mount.shared2 2>/dev/null
      umount mount.dir mount.shared1 mount.shared2 2>/dev/null
      rm -f mount.dir/a mount.dir/b mount.dir/c
      rmdir mount.dir mount.shared1 mount.shared2
      
      The result is different now: look closely at the mount.shared1 directory
      listing, which now does show files 'a' and 'b':
      
      mount: mount('mount.shared1','mount.shared1','(null)',0x00009000,'(null)'):0
      mount: mount('','mount.shared1','',0x0010c000,''):0
      mount: mount('mount.shared2','mount.shared2','(null)',0x00009000,'(null)'):0
      mount: mount('','mount.shared2','',0x00104000,''):0
      mount: mount('mount.shared2','mount.shared1','(null)',0x00009000,'(null)'):0
      mount: mount('mount.dir','mount.shared2','(null)',0x00009000,'(null)'):0
      
      mount.dir:
      a
      b
      
      mount.shared1:
      a
      b
      
      mount.shared2:
      a
      b
      
      The analysis shows that the MS_SILENT flag, which is on by default in every
      busybox mount operation, reaches flags_to_propagation_type() and causes an
      error return at the is_power_of_2() check, because the function expects
      exactly one bit to be set.  This prevents busybox mount from working with
      any of the --make-[r]shared, --make-[r]private etc. options.
      
      So the recently added flags_to_propagation_type() function rejects
      operations such as --make-[r]private and --make-[r]shared whenever
      MS_SILENT is set.  The idea of clearing the MS_SILENT flag came from
      Denys Vlasenko.
      Signed-off-by: Roman Borisov <ext-roman.borisov@nokia.com>
      Reported-by: Denys Vlasenko <vda.linux@googlemail.com>
      Cc: Chuck Ebbert <cebbert@redhat.com>
      Cc: Alexander Shishkin <virtuoso@slind.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      7c6e984d
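The single-bit check described in the commit above can be reproduced in user-space C against the flag words from the mount -vv output. This is a sketch, not the kernel source: the `SK_`/`sk_` names are local to this example, the mask-then-check structure is an assumption based on the commit message, and only the numeric flag values are the standard Linux `MS_*` values.

```c
#include <stdbool.h>

/* Numeric values match Linux's MS_* mount flags; the SK_ prefix marks
 * these names as local to this sketch. */
#define SK_MS_REC        0x00004000u
#define SK_MS_SILENT     0x00008000u
#define SK_MS_UNBINDABLE 0x00020000u
#define SK_MS_PRIVATE    0x00040000u
#define SK_MS_SLAVE      0x00080000u
#define SK_MS_SHARED     0x00100000u

#define SK_PROPAGATION_MASK \
	(SK_MS_SHARED | SK_MS_PRIVATE | SK_MS_SLAVE | SK_MS_UNBINDABLE)

/* True when exactly one bit is set. */
static bool sk_is_power_of_2(unsigned int n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Strip the bits in 'ignored', then require that what is left is exactly
 * one propagation flag. Returns that flag, or 0 to signal an error.
 */
unsigned int sk_propagation_type(unsigned int flags, unsigned int ignored)
{
	unsigned int type = flags & ~ignored;

	if (type & ~SK_PROPAGATION_MASK)
		return 0;	/* a non-propagation bit survived the mask */
	if (!sk_is_power_of_2(type))
		return 0;	/* zero or several propagation bits */
	return type;
}

/* Behaviour described as buggy: only the recursive bit is stripped. */
unsigned int sk_propagation_type_before(unsigned int flags)
{
	return sk_propagation_type(flags, SK_MS_REC);
}

/* Behaviour after the fix: MS_SILENT is cleared as well. */
unsigned int sk_propagation_type_after(unsigned int flags)
{
	return sk_propagation_type(flags, SK_MS_REC | SK_MS_SILENT);
}
```

With the flag word 0x0010c000 from the first run, the "before" variant returns 0 because the MS_SILENT bit is still present, while the "after" variant returns the shared-propagation bit. For 0x00104000, the --loud case, both variants agree.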
    • Jonas Gorski's avatar
      exportfs: reallow building as a module · 79fead47
      Jonas Gorski authored
      Commit 990d6c2d ("vfs: Add name to file handle conversion support")
      changed EXPORTFS to be a bool.  This was needed for earlier revisions of
      the original patch, but the actual commit put the code needing it into
      its own file, which only gets compiled when FHANDLE is selected, which in
      turn selects EXPORTFS.  So EXPORTFS can safely be built as a module when
      FHANDLE is not selected.
      Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
      Acked-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      79fead47