  1. 09 Aug, 2018 3 commits
    • make sure that __dentry_kill() always invalidates d_seq, unhashed or not · 4c0d7cd5
      Al Viro authored
      RCU pathwalk relies upon the assumption that anything that changes
      ->d_inode of a dentry will invalidate its ->d_seq.  That's almost
      true - the one exception is that the final dput() of already unhashed
      dentry does *not* touch ->d_seq at all.  Unhashing does, though,
      so for anything we'd found by RCU dcache lookup we are fine.
      Unfortunately, we can *start* with an unhashed dentry or jump into
      it.
      
      We could try and be careful in the (few) places where that could
      happen.  Or we could just make the final dput() invalidate the damn
      thing, unhashed or not.  The latter is much simpler and easier to
      backport, so let's do it that way.

      Reported-by: "Dae R. Jeong" <threeearcat@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
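The reasoning in the commit above can be illustrated with a minimal userspace sketch. It assumes a simplified, single-threaded model: `toy_dentry`, `toy_read_begin()`, `toy_read_retry()`, `toy_kill()` and `reader_notices_kill()` are hypothetical names, not kernel APIs, and the counter is simply bumped by 2 rather than going through the odd/even write protocol of a real seqcount.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a dentry; not the kernel's struct. */
struct toy_dentry {
    unsigned seq;       /* plays the role of ->d_seq */
    const void *inode;  /* plays the role of ->d_inode */
    bool hashed;
};

/* Reader side: sample the counter before looking at ->inode. */
static unsigned toy_read_begin(const struct toy_dentry *d)
{
    return d->seq;
}

/* Reader side: true if the counter moved since toy_read_begin(). */
static bool toy_read_retry(const struct toy_dentry *d, unsigned start)
{
    return d->seq != start;
}

/*
 * Final put. With always_invalidate == false this models the behaviour the
 * commit message describes as the exception: an already unhashed dentry
 * loses its inode without the counter being touched.
 */
static void toy_kill(struct toy_dentry *d, bool always_invalidate)
{
    if (d->hashed) {
        d->hashed = false;
        d->seq += 2;            /* unhashing invalidates the counter */
    } else if (always_invalidate) {
        d->seq += 2;            /* invalidate even when already unhashed */
    }
    d->inode = NULL;
}

/* Does a reader that started on an unhashed dentry notice the kill? */
bool reader_notices_kill(bool always_invalidate)
{
    static const int fake_inode = 0;
    struct toy_dentry d = { .seq = 0, .inode = &fake_inode, .hashed = false };
    unsigned start = toy_read_begin(&d);

    toy_kill(&d, always_invalidate);
    assert(d.inode == NULL);
    return toy_read_retry(&d, start);
}
```

In this model `reader_notices_kill(false)` returns false (the reader would trust a stale inode pointer), while `reader_notices_kill(true)` returns true and the reader would retry.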
    • fix __legitimize_mnt()/mntput() race · 119e1ef8
      Al Viro authored
      __legitimize_mnt() has two problems - one is that in case of success
      the check of mount_lock is not ordered wrt preceding increment of
      refcount, making it possible to have successful __legitimize_mnt()
      on one CPU just before the otherwise final mntput() on another,
      with __legitimize_mnt() not seeing mntput() taking the lock and
      mntput() not seeing the increment done by __legitimize_mnt().
      Solved by a pair of barriers.
      
      Another is that failure of __legitimize_mnt() on the second
      read_seqretry() leaves us with reference that'll need to be
      dropped by caller; however, if that races with final mntput()
      we can end up with caller dropping rcu_read_lock() and doing
      mntput() to release that reference - with the first mntput()
      having freed the damn thing just as rcu_read_lock() had been
      dropped.  Solution: in "do mntput() yourself" failure case
      grab mount_lock, check if MNT_DOOMED has been set by racing
      final mntput() that has missed our increment and if it has -
      undo the increment and treat that as "failure, caller doesn't
      need to drop anything" case.
      
      It's not easy to hit - the final mntput() has to come right
      after the first read_seqretry() in __legitimize_mnt() *and*
      manage to miss the increment done by __legitimize_mnt() before
      the second read_seqretry() in there.  The things that are almost
      impossible to hit on bare hardware are not impossible on SMP
      KVM, though...

      Reported-by: Oleg Nesterov <oleg@redhat.com>
      Fixes: 48a066e7 ("RCU'd vsfmounts")
      Cc: stable@vger.kernel.org
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • fix mntput/mntput race · 9ea0a46c
      Al Viro authored
      mntput_no_expire() does the calculation of total refcount under mount_lock;
      unfortunately, the decrement (as well as all increments) are done outside
      of it, leading to false positives in the "are we dropping the last reference"
      test.  Consider the following situation:
      	* mnt is a lazy-umounted mount, kept alive by two opened files.  One
      of those files gets closed.  Total refcount of mnt is 2.  On CPU 42
      mntput(mnt) (called from __fput()) drops one reference, decrementing component
      #42.
      	* After it has looked at component #0, the process on CPU 0 does
      mntget(), incrementing component #0, gets preempted and gets to run again -
      on CPU 69.  There it does mntput(), which drops the reference (component #69)
      and proceeds to spin on mount_lock.
      	* On CPU 42 our first mntput() finishes counting.  It observes the
      decrement of component #69, but not the increment of component #0.  As the
      result, the total it gets is not 1 as it should've been - it's 0.  At which
      point we decide that vfsmount needs to be killed and proceed to free it and
      shut the filesystem down.  However, there's still another opened file
      on that filesystem, with reference to (now freed) vfsmount, etc. and we are
      screwed.
      
      It's not a wide race, but it can be reproduced with artificial slowdown of
      the mnt_get_count() loop, and it should be easier to hit on SMP KVM setups.
      
      Fix consists of moving the refcount decrement under mount_lock; the tricky
      part is that we want (and can) keep the fast case (i.e. mount that still
      has non-NULL ->mnt_ns) entirely out of mount_lock.  All places that zero
      mnt->mnt_ns are dropping some reference to mnt and they call synchronize_rcu()
      before that mntput().  IOW, if mntput() observes (under rcu_read_lock())
      a non-NULL ->mnt_ns, it is guaranteed that there is another reference yet to
      be dropped.

      Reported-by: Jann Horn <jannh@google.com>
      Tested-by: Jann Horn <jannh@google.com>
      Fixes: 48a066e7 ("RCU'd vsfmounts")
      Cc: stable@vger.kernel.org
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
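The false positive described in the commit above can be replayed deterministically in userspace. The following is a sketch under stated assumptions, not the kernel code: `toy_mnt`, `toy_get_count()`, `racing_get_then_put()`, `racy_total()` and `quiescent_total()` are hypothetical names, the per-CPU counter is a plain array, and the "other CPU" activity is injected through a hook instead of real concurrency. It shows only the miscount, not the fix.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_CPUS 70

/* Hypothetical per-CPU reference counter, one component per CPU. */
struct toy_mnt {
    long count[TOY_NR_CPUS];
};

typedef void (*toy_hook)(struct toy_mnt *m, int next_cpu);

/*
 * Sum the components in CPU order. hook() runs just before each component
 * is read, so a caller can interleave activity from "other CPUs".
 */
static long toy_get_count(struct toy_mnt *m, toy_hook hook)
{
    long sum = 0;

    assert(m != NULL);
    for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
        if (hook)
            hook(m, cpu);
        sum += m->count[cpu];
    }
    return sum;
}

/* Fires once the summing loop has already looked at component #0. */
static void racing_get_then_put(struct toy_mnt *m, int next_cpu)
{
    if (next_cpu == 1) {
        m->count[0]++;   /* mntget() on CPU 0 */
        m->count[69]--;  /* mntput() after migrating to CPU 69 */
    }
}

/* Total seen by the first mntput() when the get/put pair races with it. */
long racy_total(void)
{
    struct toy_mnt m = { .count = { [0] = 2 } };  /* two open files */

    m.count[42]--;                                /* first mntput(), CPU 42 */
    return toy_get_count(&m, racing_get_then_put);
}

/* Total when the same get/put pair completes before counting starts. */
long quiescent_total(void)
{
    struct toy_mnt m = { .count = { [0] = 2 } };

    m.count[42]--;
    racing_get_then_put(&m, 1);
    return toy_get_count(&m, NULL);
}
```

Here `racy_total()` yields 0 although one reference is still held, while `quiescent_total()` yields the correct 1. The loop sees the decrement of component #69 but misses the increment of component #0.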
  2. 06 Aug, 2018 1 commit
    • root dentries need RCU-delayed freeing · 90bad5e0
      Al Viro authored
      Since mountpoint crossing can happen without leaving lazy mode,
      root dentries do need the same protection against having their
      memory freed without RCU delay as everything else in the tree.
      
      It's partially hidden by RCU delay between detaching from the
      mount tree and dropping the vfsmount reference, but the starting
      point of pathwalk can be on an already detached mount, in which
      case umount-caused RCU delay has already passed by the time the
      lazy pathwalk grabs rcu_read_lock().  If the starting point
      happens to be at the root of that vfsmount *and* that vfsmount
      covers the entire filesystem, we get trouble.
      
      Fixes: 48a066e7 ("RCU'd vsfmounts")
      Cc: stable@vger.kernel.org
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  3. 18 Jul, 2018 1 commit
  4. 11 Jul, 2018 3 commits
  5. 28 Jun, 2018 1 commit
    • proc: add proc_seq_release · 877f919e
      Chunyu Hu authored
      kmemleak reported memory leaks when reading proc files. Added debug
      output showed that proc_seq_fops uses seq_release as its release
      handler, which does not free the 'private' field of seq_file, even
      though the open handler proc_seq_open can allocate that private data
      with __seq_open_private when state_size is greater than zero. As a
      result, after reading files created with proc_create_seq_private,
      such as /proc/timer_list and /proc/vmallocinfo, the private memory of
      the seq_file is never freed. Fix it by adding the paired
      proc_seq_release as the default release handler of proc_seq_fops
      instead of seq_release.
      
      Fixes: 44414d82 ("proc: introduce proc_create_seq_private")
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      CC: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Chunyu Hu <chuhu@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
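The open/release mismatch described in the commit above can be modelled in userspace. This is a sketch only, assuming a toy file object: `toy_seq_file`, `toy_seq_open()`, `toy_release_plain()`, `toy_release_paired()` and `toy_leaked_after_cycle()` are hypothetical names and do not reproduce the kernel's seq_file helpers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a seq_file with optional private state. */
struct toy_seq_file {
    void *private_data;
};

static int toy_live_allocs;  /* private allocations not yet freed */

/* Models an open handler that allocates private state when state_size > 0. */
static int toy_seq_open(struct toy_seq_file *f, size_t state_size)
{
    assert(f != NULL);
    f->private_data = NULL;
    if (state_size) {
        f->private_data = calloc(1, state_size);
        if (!f->private_data)
            return -1;
        toy_live_allocs++;
    }
    return 0;
}

/* Models a release handler that knows nothing about private state. */
static void toy_release_plain(struct toy_seq_file *f)
{
    (void)f;
}

/* Models the paired release handler: frees what the open handler created. */
static void toy_release_paired(struct toy_seq_file *f, size_t state_size)
{
    if (state_size && f->private_data) {
        free(f->private_data);
        f->private_data = NULL;
        toy_live_allocs--;
    }
}

/* Number of allocations left behind by one open/release cycle. */
int toy_leaked_after_cycle(size_t state_size, int use_paired)
{
    struct toy_seq_file f;
    int before = toy_live_allocs;
    int leaked;

    if (toy_seq_open(&f, state_size) != 0)
        return 0;
    if (use_paired)
        toy_release_paired(&f, state_size);
    else
        toy_release_plain(&f);
    leaked = toy_live_allocs - before;

    /* Clean up so the demonstration itself does not leak. */
    if (leaked) {
        free(f.private_data);
        toy_live_allocs--;
    }
    return leaked;
}
```

With a non-zero `state_size`, the plain release leaves one allocation behind per cycle and the paired release leaves none. With `state_size == 0` both behave the same, which matches the commit message's point that only files opened with private state were affected.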
  6. 16 Jun, 2018 8 commits
    • Linux 4.18-rc1 · ce397d21
      Linus Torvalds authored
    • Merge tag 'for-linus-20180616' of git://git.kernel.dk/linux-block · 265c5596
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "A collection of fixes that should go into -rc1. This contains:
      
         - bsg_open vs bsg_unregister race fix (Anatoliy)
      
         - NVMe pull request from Christoph, with fixes for regressions in
           this window, FC connect/reconnect path code unification, and a
           trace point addition.
      
         - timeout fix (Christoph)
      
         - remove a few unused functions (Christoph)
      
         - blk-mq tag_set reinit fix (Roman)"
      
      * tag 'for-linus-20180616' of git://git.kernel.dk/linux-block:
        bsg: fix race of bsg_open and bsg_unregister
        block: remov blk_queue_invalidate_tags
        nvme-fabrics: fix and refine state checks in __nvmf_check_ready
        nvme-fabrics: handle the admin-only case properly in nvmf_check_ready
        nvme-fabrics: refactor queue ready check
        blk-mq: remove blk_mq_tagset_iter
        nvme: remove nvme_reinit_tagset
        nvme-fc: fix nulling of queue data on reconnect
        nvme-fc: remove reinit_request routine
        blk-mq: don't time out requests again that are in the timeout handler
        nvme-fc: change controllers first connect to use reconnect path
        nvme: don't rely on the changed namespace list log
        nvmet: free smart-log buffer after use
        nvme-rdma: fix error flow during mapping request data
        nvme: add bio remapping tracepoint
        nvme: fix NULL pointer dereference in nvme_init_subsystem
        blk-mq: reinit q->tag_set_list entry only after grace period
    • Merge tag 'docs-broken-links' of git://linuxtv.org/mchehab/experimental · 5e7b9212
      Linus Torvalds authored
      Pull documentation fixes from Mauro Carvalho Chehab:
       "This solves a series of broken links for files under Documentation,
        and improves a script meant to detect such broken links (see
        scripts/documentation-file-ref-check).
      
        The changes on this series are:
      
         - can.rst: fix a footnote reference;
      
         - crypto_engine.rst: Fix two parsing warnings;
      
         - Fix a lot of broken references to Documentation/*;
      
         - improve the scripts/documentation-file-ref-check script, in order
           to help detecting/fixing broken references, preventing
           false-positives.
      
        After this patch series, only 33 broken references to doc files are
        detected by scripts/documentation-file-ref-check"
      
      * tag 'docs-broken-links' of git://linuxtv.org/mchehab/experimental: (26 commits)
        fix a series of Documentation/ broken file name references
        Documentation: rstFlatTable.py: fix a broken reference
        ABI: sysfs-devices-system-cpu: remove a broken reference
        devicetree: fix a series of wrong file references
        devicetree: fix name of pinctrl-bindings.txt
        devicetree: fix some bindings file names
        MAINTAINERS: fix location of DT npcm files
        MAINTAINERS: fix location of some display DT bindings
        kernel-parameters.txt: fix pointers to sound parameters
        bindings: nvmem/zii: Fix location of nvmem.txt
        docs: Fix more broken references
        scripts/documentation-file-ref-check: check tools/*/Documentation
        scripts/documentation-file-ref-check: get rid of false-positives
        scripts/documentation-file-ref-check: hint: dash or underline
        scripts/documentation-file-ref-check: add a fix logic for DT
        scripts/documentation-file-ref-check: accept more wildcards at filenames
        scripts/documentation-file-ref-check: fix help message
        media: max2175: fix location of driver's companion documentation
        media: v4l: fix broken video4linux docs locations
        media: dvb: point to the location of the old README.dvb-usb file
        ...
    • Merge tag 'fsnotify_for_v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs · dbb2816f
      Linus Torvalds authored
      Pull fsnotify updates from Jan Kara:
       "fsnotify cleanups unifying handling of different watch types.
      
        This is the shortened fsnotify series from Amir with the last five
        patches pulled out. Amir has modified those patches to not change
        struct inode but obviously it's too late for those to go into this
        merge window"
      
      * tag 'fsnotify_for_v4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
        fsnotify: add fsnotify_add_inode_mark() wrappers
        fanotify: generalize fanotify_should_send_event()
        fsnotify: generalize send_to_group()
        fsnotify: generalize iteration of marks by object type
        fsnotify: introduce marks iteration helpers
        fsnotify: remove redundant arguments to handle_event()
        fsnotify: use type id to identify connector object type
    • Merge tag 'fbdev-v4.18' of git://github.com/bzolnier/linux · 644f2639
      Linus Torvalds authored
      Pull fbdev updates from Bartlomiej Zolnierkiewicz:
       "There is nothing really major here, few small fixes, some cleanups and
        dead drivers removal:
      
         - mark omapfb drivers as orphans in MAINTAINERS file (Tomi Valkeinen)
      
         - add missing module license tags to omap/omapfb driver (Arnd
           Bergmann)
      
         - add missing GPIOLIB dependendy to omap2/omapfb driver (Arnd
           Bergmann)
      
         - convert savagefb, aty128fb & radeonfb drivers to use msleep & co.
           (Jia-Ju Bai)
      
         - allow COMPILE_TEST build for viafb driver (media part was reviewed
           by media subsystem Maintainer)
      
         - remove unused MERAM support from sh_mobile_lcdcfb and shmob-drm
           drivers (drm parts were acked by shmob-drm driver Maintainer)
      
         - remove unused auo_k190xfb drivers
      
         - misc cleanups (Souptick Joarder, Wolfram Sang, Markus Elfring, Andy
           Shevchenko, Colin Ian King)"
      
      * tag 'fbdev-v4.18' of git://github.com/bzolnier/linux: (26 commits)
        fb_omap2: add gpiolib dependency
        video/omap: add module license tags
        MAINTAINERS: make omapfb orphan
        video: fbdev: pxafb: match_string() conversion fixup
        video: fbdev: nvidia: fix spelling mistake: "scaleing" -> "scaling"
        video: fbdev: fix spelling mistake: "frambuffer" -> "framebuffer"
        video: fbdev: pxafb: Convert to use match_string() helper
        video: fbdev: via: allow COMPILE_TEST build
        video: fbdev: remove unused sh_mobile_meram driver
        drm: shmobile: remove unused MERAM support
        video: fbdev: sh_mobile_lcdcfb: remove unused MERAM support
        video: fbdev: remove unused auo_k190xfb drivers
        video: omap: Improve a size determination in omapfb_do_probe()
        video: sm501fb: Improve a size determination in sm501fb_probe()
        video: fbdev-MMP: Improve a size determination in path_init()
        video: fbdev-MMP: Delete an error message for a failed memory allocation in two functions
        video: auo_k190x: Delete an error message for a failed memory allocation in auok190x_common_probe()
        video: sh_mobile_lcdcfb: Delete an error message for a failed memory allocation in two functions
        video: sh_mobile_meram: Delete an error message for a failed memory allocation in sh_mobile_meram_probe()
        video: fbdev: sh_mobile_meram: Drop SUPERH platform dependency
        ...
    • Merge branch 'afs-proc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 35773c93
      Linus Torvalds authored
      Pull AFS updates from Al Viro:
       "Assorted AFS stuff - ended up in vfs.git since most of that consists
        of David's AFS-related followups to Christoph's procfs series"
      
      * 'afs-proc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        afs: Optimise callback breaking by not repeating volume lookup
        afs: Display manually added cells in dynamic root mount
        afs: Enable IPv6 DNS lookups
        afs: Show all of a server's addresses in /proc/fs/afs/servers
        afs: Handle CONFIG_PROC_FS=n
        proc: Make inline name size calculation automatic
        afs: Implement network namespacing
        afs: Mark afs_net::ws_cell as __rcu and set using rcu functions
        afs: Fix a Sparse warning in xdr_decode_AFSFetchStatus()
        proc: Add a way to make network proc files writable
        afs: Rearrange fs/afs/proc.c to remove remaining predeclarations.
        afs: Rearrange fs/afs/proc.c to move the show routines up
        afs: Rearrange fs/afs/proc.c by moving fops and open functions down
        afs: Move /proc management functions to the end of the file
    • Merge branch 'work.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 29d6849d
      Linus Torvalds authored
      Pull compat updates from Al Viro:
       "Some biarch patches - getting rid of assorted (mis)uses of
        compat_alloc_user_space().
      
        Not much in that area this cycle..."
      
      * 'work.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        orangefs: simplify compat ioctl handling
        signalfd: lift sigmask copyin and size checks to callers of do_signalfd4()
        vmsplice(): lift importing iovec into vmsplice(2) and compat counterpart
    • Merge branch 'work.aio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · a5b729ea
      Linus Torvalds authored
      Pull aio fixes from Al Viro:
       "Assorted AIO followups and fixes"
      
      * 'work.aio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        eventpoll: switch to ->poll_mask
        aio: only return events requested in poll_mask() for IOCB_CMD_POLL
        eventfd: only return events requested in poll_mask()
        aio: mark __aio_sigset::sigmask const
  7. 15 Jun, 2018 23 commits