1. 22 Feb, 2015 13 commits
    • trylock_super(): replacement for grab_super_passive() · eb6ef3df
      Konstantin Khlebnikov authored
      I've noticed significant lock contention in the memory reclaimer around
      sb_lock inside grab_super_passive(). grab_super_passive() is called from
      two places: the icache/dcache shrinkers (function super_cache_scan) and
      writeback (function __writeback_inodes_wb). Both are required for
      progress in the memory allocator.
      
      grab_super_passive() acquires sb_lock to increment sb->s_count and check
      sb->s_instances. It seems that holding sb->s_umount for read is enough here:
      super-block deactivation always runs with sb->s_umount held for write.
      Protecting the super-block itself isn't a problem: in super_cache_scan() sb
      is protected by shrinker_rwsem, so it cannot be freed while its slab shrinkers
      are still active. Inside writeback, the super-block comes from an inode on the
      bdi writeback list under wb->list_lock.
      
      This patch stops taking sb_lock and checks s_instances under s_umount:
      generic_shutdown_super() unlinks it with sb->s_umount held for write.
      The new variant is called trylock_super() and, since it only locks the
      semaphore, callers must call up_read(&sb->s_umount) instead of
      drop_super(sb) when they're done.
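
      The locking pattern described above can be sketched in userspace. This is
      only an illustration, not the kernel code: every toy_* name is
      hypothetical, the semaphore is a simplified stand-in built on C11 atomics,
      and the "linked" flag stands in for the s_instances check. The point it
      shows is that a reader takes s_umount with a trylock and validates the
      super-block under it, while shutdown unlinks only with the lock held for
      write.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy read/write semaphore: >0 means that many readers, 0 free, -1 one writer. */
struct toy_rwsem {
    atomic_int state;
};

/* Hypothetical stand-in for struct super_block. */
struct toy_super {
    struct toy_rwsem s_umount; /* plays the role of sb->s_umount */
    bool linked;               /* true while sb is still on its s_instances list */
};

bool toy_down_read_trylock(struct toy_rwsem *sem)
{
    int cur = atomic_load(&sem->state);

    while (cur >= 0) {
        /* On failure, cur is reloaded with the current value and we retry. */
        if (atomic_compare_exchange_weak(&sem->state, &cur, cur + 1))
            return true;
    }
    return false; /* a writer holds the semaphore */
}

void toy_up_read(struct toy_rwsem *sem)
{
    atomic_fetch_sub(&sem->state, 1);
}

bool toy_down_write_trylock(struct toy_rwsem *sem)
{
    int expected = 0;

    return atomic_compare_exchange_strong(&sem->state, &expected, -1);
}

void toy_up_write(struct toy_rwsem *sem)
{
    atomic_store(&sem->state, 0);
}

void toy_super_init(struct toy_super *sb)
{
    assert(sb != NULL);
    atomic_init(&sb->s_umount.state, 0);
    sb->linked = true;
}

/* Take s_umount for read, then check liveness under it.
 * On success the caller holds the read lock and drops it with toy_up_read(). */
bool toy_trylock_super(struct toy_super *sb)
{
    assert(sb != NULL);
    if (!toy_down_read_trylock(&sb->s_umount))
        return false;
    if (sb->linked)
        return true;
    toy_up_read(&sb->s_umount);
    return false;
}

/* Shutdown unlinks the super-block only while holding s_umount for write.
 * The kernel would block here; this toy version just reports failure. */
bool toy_shutdown_super(struct toy_super *sb)
{
    assert(sb != NULL);
    if (!toy_down_write_trylock(&sb->s_umount))
        return false;
    sb->linked = false;
    toy_up_write(&sb->s_umount);
    return true;
}
```

      A caller that gets true from toy_trylock_super() releases the lock with
      toy_up_read(&sb->s_umount), mirroring the up_read() requirement stated in
      the message.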
      Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • fanotify: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions · 54f2a2f4
      David Howells authored
      Fanotify probably doesn't want to watch autodirs so make it use d_can_lookup()
      rather than d_is_dir() when checking a dir watch and give an error on fake
      directories.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Cachefiles: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions · ce40fa78
      David Howells authored
      Fix up the following scripted S_ISDIR/S_ISREG/S_ISLNK conversions (or lack
      thereof) in cachefiles:
      
       (1) Cachefiles mostly wants to use d_can_lookup() rather than d_is_dir() as
           it doesn't want to deal with automounts in its cache.
      
       (2) Coccinelle didn't find S_IS* expressions in ASSERT() statements in
           cachefiles.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry) · e36cb0b8
      David Howells authored
      Convert the following where appropriate:
      
       (1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry).
      
       (2) S_ISREG(dentry->d_inode) to d_is_reg(dentry).
      
       (3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry).  This is actually more
           complicated than it appears as some calls should be converted to
           d_can_lookup() instead.  The difference is whether the directory in
           question is a real dir with a ->lookup op or whether it's a fake dir with
           a ->d_automount op.
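
      The d_is_dir() versus d_can_lookup() distinction in item (3) can be
      modelled with a small type tag. This is a hedged sketch: the toy_* names
      and the enum values are hypothetical and are not the kernel's DCACHE_*
      constants. It only shows that a fake (automount) directory counts as a
      directory but cannot be looked up in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative type tags (hypothetical, not the kernel's values). */
enum toy_dentry_type {
    TOY_MISS,      /* negative dentry */
    TOY_DIRECTORY, /* real directory with a ->lookup op */
    TOY_AUTODIR,   /* fake directory with a ->d_automount op */
    TOY_REGULAR,   /* regular file */
    TOY_SYMLINK    /* symbolic link */
};

struct toy_dentry {
    enum toy_dentry_type type;
};

/* True only for real directories that can be looked up in. */
bool toy_d_can_lookup(const struct toy_dentry *d)
{
    assert(d != NULL);
    return d->type == TOY_DIRECTORY;
}

/* True for real directories and for fake (automount) directories. */
bool toy_d_is_dir(const struct toy_dentry *d)
{
    assert(d != NULL);
    return toy_d_can_lookup(d) || d->type == TOY_AUTODIR;
}

bool toy_d_is_symlink(const struct toy_dentry *d)
{
    assert(d != NULL);
    return d->type == TOY_SYMLINK;
}

bool toy_d_is_reg(const struct toy_dentry *d)
{
    assert(d != NULL);
    return d->type == TOY_REGULAR;
}
```

      Code that must not deal with automount points would test
      toy_d_can_lookup(); code that only cares whether the entry looks like a
      directory would test toy_d_is_dir().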
      
      In some circumstances, we can subsume checks for dentry->d_inode not being
      NULL into this, provided the code isn't in a filesystem that expects
      d_inode to be NULL if the dirent really *is* negative (i.e. if we're going to
      use d_inode() rather than d_backing_inode() to get the inode pointer).
      
      Note that the dentry type field may be set to something other than
      DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS
      manages the fall-through from a negative dentry to a lower layer.  In such a
      case, the dentry type of the negative union dentry is set to the same as the
      type of the lower dentry.
      
      However, if you know d_inode is not NULL at the call site, then you can use
      the d_is_xxx() functions even in a filesystem.
      
      There is one further complication: a 0,0 chardev dentry may be labelled
      DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE.  Strictly, this was
      intended for special directory entry types that don't have attached inodes.
      
      The following perl+coccinelle script was used:
      
      use strict;
      
      my @callers;
      my $fd;    # declared up front so the script compiles under 'use strict'
      open($fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' |') ||
          die "Can't grep for S_ISDIR and co. callers";
      @callers = <$fd>;
      close($fd);
      unless (@callers) {
          print "No matches\n";
          exit(0);
      }
      
      my @cocci = (
          '@@',
          'expression E;',
          '@@',
          '',
          '- S_ISLNK(E->d_inode->i_mode)',
          '+ d_is_symlink(E)',
          '',
          '@@',
          'expression E;',
          '@@',
          '',
          '- S_ISDIR(E->d_inode->i_mode)',
          '+ d_is_dir(E)',
          '',
          '@@',
          'expression E;',
          '@@',
          '',
          '- S_ISREG(E->d_inode->i_mode)',
          '+ d_is_reg(E)' );
      
      my $coccifile = "tmp.sp.cocci";
      open($fd, ">$coccifile") || die $coccifile;
      print($fd "$_\n") || die $coccifile foreach (@cocci);
      close($fd);
      
      foreach my $file (@callers) {
          chomp $file;
          print "Processing ", $file, "\n";
          system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 ||
              die "spatch failed";
      }
      
      [AV: overlayfs parts skipped]
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • SELinux: Use d_is_positive() rather than testing dentry->d_inode · 2c616d4d
      David Howells authored
      Use d_is_positive() rather than testing dentry->d_inode in SELinux to get rid
      of direct references to d_inode outside of the VFS.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Smack: Use d_is_positive() rather than testing dentry->d_inode · 8802565b
      David Howells authored
      Use d_is_positive() rather than testing dentry->d_inode in Smack to get rid of
      direct references to d_inode outside of the VFS.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • TOMOYO: Use d_is_dir() rather than d_inode and S_ISDIR() · e656a8eb
      David Howells authored
      Use d_is_dir() rather than d_inode and S_ISDIR().  Note that this will include
      fake directories such as automount triggers.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Apparmor: Use d_is_positive/negative() rather than testing dentry->d_inode · 729b8a3d
      David Howells authored
      Use d_is_positive(dentry) or d_is_negative(dentry) rather than testing
      dentry->d_inode as the dentry may cover another layer that has an inode when
      the top layer doesn't or may hold a 0,0 chardev that's actually a whiteout.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Apparmor: mediated_filesystem() should use dentry->d_sb not inode->i_sb · 7ac2856d
      David Howells authored
      mediated_filesystem() should use dentry->d_sb not dentry->d_inode->i_sb and
      should avoid file_inode() also since it is really dealing with the path.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • VFS: Split DCACHE_FILE_TYPE into regular and special types · 44bdb5e5
      David Howells authored
      Split DCACHE_FILE_TYPE into DCACHE_REGULAR_TYPE (dentries representing regular
      files) and DCACHE_SPECIAL_TYPE (representing blockdev, chardev, FIFO and
      socket files).
      
      d_is_reg() and d_is_special() are added to detect these subtypes and
      d_is_file() is left as the union of the two.
      
      This allows a number of places that use S_ISREG(dentry->d_inode->i_mode) to
      use d_is_reg(dentry) instead.
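
      As a rough illustration of the split, the sketch below sorts inode kinds
      into a "regular" and a "special" class and keeps a file test as the union
      of the two. All toy_* names are hypothetical; the kernel stores the class
      in the dentry flags rather than computing it on demand like this.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical inode kinds, standing in for the S_IF* file types. */
enum toy_inode_kind {
    TOY_INODE_REG,
    TOY_INODE_BLK,
    TOY_INODE_CHR,
    TOY_INODE_FIFO,
    TOY_INODE_SOCK,
    TOY_INODE_DIR
};

/* Hypothetical file classes, standing in for the two new dentry types. */
enum toy_file_class {
    TOY_CLASS_NONE,    /* not a file at all (e.g. a directory) */
    TOY_CLASS_REGULAR, /* regular file */
    TOY_CLASS_SPECIAL  /* blockdev, chardev, FIFO or socket */
};

enum toy_file_class toy_classify(enum toy_inode_kind kind)
{
    assert(kind <= TOY_INODE_DIR);
    switch (kind) {
    case TOY_INODE_REG:
        return TOY_CLASS_REGULAR;
    case TOY_INODE_BLK:
    case TOY_INODE_CHR:
    case TOY_INODE_FIFO:
    case TOY_INODE_SOCK:
        return TOY_CLASS_SPECIAL;
    default:
        return TOY_CLASS_NONE;
    }
}

bool toy_is_reg(enum toy_file_class c)
{
    return c == TOY_CLASS_REGULAR;
}

bool toy_is_special(enum toy_file_class c)
{
    return c == TOY_CLASS_SPECIAL;
}

/* The file test remains the union of the two subtypes. */
bool toy_is_file(enum toy_file_class c)
{
    return toy_is_reg(c) || toy_is_special(c);
}
```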
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • VFS: Add a fallthrough flag for marking virtual dentries · df1a085a
      David Howells authored
      Add a DCACHE_FALLTHRU flag to indicate that, in a layered filesystem, this is
      a virtual dentry that covers another one in a lower layer that should be used
      instead.  This may be recorded on medium if directory integration is stored
      there.
      
      The flag can be set with d_set_fallthru() and tested with d_is_fallthru().
      
      Original-author: Valerie Aurora <vaurora@redhat.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • VFS: Add a whiteout dentry type · e7f7d225
      David Howells authored
      Add DCACHE_WHITEOUT_TYPE and provide a d_is_whiteout() accessor function.  A
      d_is_miss() accessor is also added for ordinary cache misses and
      d_is_negative() is modified to indicate either an ordinary miss or an enforced
      miss (whiteout).
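
      The semantics described above can be sketched as follows, assuming a
      two-layer union for the sake of the example. The toy_* names and the
      toy_resolve() helper are hypothetical and only illustrate the idea that a
      whiteout is an enforced miss that hides the lower layer, whereas an
      ordinary miss lets the lower layer show through.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical entry types; only the miss/whiteout distinction matters here. */
enum toy_entry_type {
    TOY_MISS_TYPE,     /* ordinary cache miss */
    TOY_WHITEOUT_TYPE, /* enforced miss */
    TOY_REGULAR_TYPE   /* some positive entry */
};

struct toy_entry {
    enum toy_entry_type type;
};

bool toy_d_is_miss(const struct toy_entry *e)
{
    assert(e != NULL);
    return e->type == TOY_MISS_TYPE;
}

bool toy_d_is_whiteout(const struct toy_entry *e)
{
    assert(e != NULL);
    return e->type == TOY_WHITEOUT_TYPE;
}

/* Negative means either an ordinary miss or an enforced miss. */
bool toy_d_is_negative(const struct toy_entry *e)
{
    return toy_d_is_miss(e) || toy_d_is_whiteout(e);
}

/* Pick the entry a two-layer lookup would see, or NULL if there is none. */
const struct toy_entry *toy_resolve(const struct toy_entry *upper,
                                    const struct toy_entry *lower)
{
    assert(upper != NULL && lower != NULL);
    if (toy_d_is_whiteout(upper))
        return NULL; /* whiteout hides whatever the lower layer has */
    if (toy_d_is_miss(upper))
        return toy_d_is_negative(lower) ? NULL : lower;
    return upper;
}
```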
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • VFS: Introduce inode-getting helpers for layered/unioned fs environments · 155e35d4
      David Howells authored
      Introduce some functions for getting the inode (and also the dentry) in an
      environment where layered/unioned filesystems are in operation.
      
      The problem is that we have places where we need *both* the union dentry and
      the lower source or workspace inode or dentry available, but we can only have
      a handle on one of them.  Therefore we need to derive the handle to the other
      from that.
      
      The idea is to introduce an extra field in struct dentry that allows the union
      dentry to refer to and pin the lower dentry.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  2. 20 Feb, 2015 9 commits
  3. 18 Feb, 2015 11 commits
  4. 17 Feb, 2015 7 commits
    • Merge branch 'iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 66dc830d
      Linus Torvalds authored
      Pull iov_iter updates from Al Viro:
       "More iov_iter work - missing counterpart of iov_iter_init() for
        bvec-backed ones and vfs_read_iter()/vfs_write_iter() - wrappers for
        sync calls of ->read_iter()/->write_iter()"
      
      * 'iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        fs: add vfs_iter_{read,write} helpers
        new helper: iov_iter_bvec()
    • Merge branch 'getname2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 05016b0f
      Linus Torvalds authored
      Pull getname/putname updates from Al Viro:
       "Rework of getname/getname_kernel/etc., mostly from Paul Moore.  Gets
        rid of quite a pile of kludges between namei and audit..."
      
      * 'getname2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        audit: replace getname()/putname() hacks with reference counters
        audit: fix filename matching in __audit_inode() and __audit_inode_child()
        audit: enable filename recording via getname_kernel()
        simpler calling conventions for filename_mountpoint()
        fs: create proper filename objects using getname_kernel()
        fs: rework getname_kernel to handle up to PATH_MAX sized filenames
        cut down the number of do_path_lookup() callers
    • Merge branch 'debugfs_automount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · c6b1de1b
      Linus Torvalds authored
      Pull debugfs patches from Al Viro:
       "debugfs patches, mostly to make it possible for something like tracefs
        to be transparently automounted on given directory in debugfs.
      
        New primitive in there is debugfs_create_automount(name, parent, func,
        arg), which creates a directory and makes its ->d_automount() return
        func(arg).  Another missing primitive was debugfs_create_file_size() -
        open-coded in quite a few places.  Dave's patch adds it and converts
        the open-code instances to calling it"
      
      * 'debugfs_automount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        debugfs: Provide a file creation function that also takes an initial size
        new primitive: debugfs_create_automount()
        debugfs: split end_creating() into success and failure cases
        debugfs: take mode-dependent parts of debugfs_get_inode() into callers
        fold debugfs_mknod() into callers
        fold debugfs_create() into caller
        fold debugfs_mkdir() into caller
        debugfs_mknod(): get rid useless arguments
        fold debugfs_link() into caller
        debugfs: kill __create_file()
        debugfs: split the beginning and the end of __create_file() off
        debugfs_{mkdir,create,link}(): get rid of redundant argument
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 50652963
      Linus Torvalds authored
      Pull misc VFS updates from Al Viro:
       "This cycle a lot of stuff sits on topical branches, so I'll be sending
        more or less one pull request per branch.
      
        This is the first pile; more to follow in a few.  In this one are
        several misc commits from early in the cycle (before I went for
        separate branches), plus the rework of mntput/dput ordering on umount,
        switching to use of fs_pin instead of convoluted games in
        namespace_unlock()"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        switch the IO-triggering parts of umount to fs_pin
        new fs_pin killing logics
        allow attaching fs_pin to a group not associated with some superblock
        get rid of the second argument of acct_kill()
        take count and rcu_head out of fs_pin
        dcache: let the dentry count go down to zero without taking d_lock
        pull bumping refcount into ->kill()
        kill pin_put()
        mode_t whack-a-mole: chelsio
        file->f_path.dentry is pinned down for as long as the file is open...
        get rid of lustre_dump_dentry()
        gut proc_register() a bit
        kill d_validate()
        ncpfs: get rid of d_validate() nonsense
        selinuxfs: don't open-code d_genocide()
    • Merge branch 'akpm' (patches from Andrew) · e2b74f23
      Linus Torvalds authored
      Merge yet more updates from Andrew Morton:
      
       - a pile of minor fs fixes and cleanups
      
       - kexec updates
      
       - random misc fixes in various places: vmcore, rbtree, eventfd, ipc, seccomp.
      
       - a series of python-based kgdb helper scripts
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (58 commits)
        seccomp: cap SECCOMP_RET_ERRNO data to MAX_ERRNO
        samples/seccomp: improve label helper
        ipc,sem: use current->state helpers
        scripts/gdb: disable pagination while printing from breakpoint handler
        scripts/gdb: define maintainer
        scripts/gdb: convert CpuList to generator function
        scripts/gdb: convert ModuleList to generator function
        scripts/gdb: use a generator instead of iterator for task list
        scripts/gdb: ignore byte-compiled python files
        scripts/gdb: port to python3 / gdb7.7
        scripts/gdb: add basic documentation
        scripts/gdb: add lx-lsmod command
        scripts/gdb: add class to iterate over CPU masks
        scripts/gdb: add lx_current convenience function
        scripts/gdb: add internal helper and convenience function for per-cpu lookup
        scripts/gdb: add get_gdbserver_type helper
        scripts/gdb: add internal helper and convenience function to retrieve thread_info
        scripts/gdb: add is_target_arch helper
        scripts/gdb: add helper and convenience function to look up tasks
        scripts/gdb: add task iteration class
        ...
    • seccomp: cap SECCOMP_RET_ERRNO data to MAX_ERRNO · 580c57f1
      Kees Cook authored
      The value resulting from the SECCOMP_RET_DATA mask could exceed MAX_ERRNO
      when setting errno during a SECCOMP_RET_ERRNO filter action.  This makes
      sure we have a reliable value being set, so that an invalid errno will not
      be ignored by userspace.
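
      The capping described above amounts to a clamp after the mask. The sketch
      below is a standalone illustration, not the kernel code:
      toy_errno_from_filter() is a hypothetical name, and the constants are
      written out here on the assumption that SECCOMP_RET_DATA is 0x0000ffff
      and MAX_ERRNO is 4095.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror SECCOMP_RET_DATA: the low 16 bits of a filter return value. */
#define TOY_SECCOMP_RET_DATA 0x0000ffffU
/* Assumed to mirror the kernel's MAX_ERRNO. */
#define TOY_MAX_ERRNO 4095U

/* The cap only matters because the mask admits values above the limit. */
static_assert(TOY_MAX_ERRNO < TOY_SECCOMP_RET_DATA,
              "mask must be wider than the errno range");

/* Extract the errno carried by a filter return value, capped at TOY_MAX_ERRNO. */
uint32_t toy_errno_from_filter(uint32_t filter_ret)
{
    uint32_t data = filter_ret & TOY_SECCOMP_RET_DATA;

    if (data > TOY_MAX_ERRNO)
        data = TOY_MAX_ERRNO;
    return data;
}
```

      Without the clamp, a filter could hand back a value above the errno range
      that userspace would not recognise as an error.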
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Reported-by: Dmitry V. Levin <ldv@altlinux.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Will Drewry <wad@chromium.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • samples/seccomp: improve label helper · 3a9af0bd
      Kees Cook authored
      Fixes a potential corruption with uninitialized stack memory in the
      seccomp BPF sample program.
      
      [akpm@linux-foundation.org: coding-style fixlet]
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Reported-by: Robert Swiecki <swiecki@google.com>
      Tested-by: Robert Swiecki <swiecki@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>