1. 31 Oct, 2015 3 commits
    • dm: add support for passing through persistent reservations · 71cdb697
      Christoph Hellwig authored
      This adds support to pass through persistent reservation requests
      similar to the existing ioctl handling, and with the same limitations,
      e.g. devices may only have a single target attached.
      
      This is mostly intended for multipathing.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm: refactor ioctl handling · e56f81e0
      Christoph Hellwig authored
      This moves the call to blkdev_ioctl and the argument checking to DM core
      code, and only leaves a callout to find the block device to operate on
      in the targets.  This simplifies the code and allows us to pass through
      ioctl-like command using other methods in the next patch.
      
      Also split out a helper around calling the prepare_ioctl method that
      will be reused for persistent reservation handling.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • Revert "dm mpath: fix stalls when handling invalid ioctls" · 47796938
      Mauricio Faria de Oliveira authored
      This reverts commit a1989b33.
      
      That commit introduced a regression at least for the case of the SG_IO ioctl()
      running without CAP_SYS_RAWIO capability (e.g., unprivileged users) when there
      are no active paths: the ioctl() fails with the ENOTTY errno immediately rather
      than blocking due to queue_if_no_path until a path becomes active, for example.
      
      That case happens to be exercised by QEMU KVM guests with 'scsi-block' devices
      (qemu "-device scsi-block" [1], libvirt "<disk type='block' device='lun'>" [2])
      from multipath devices; which leads to SCSI/filesystem errors in such a guest.
      
      More general scenarios can hit that regression too. The following demonstration
      employs a SG_IO ioctl() with a standard SCSI INQUIRY command for this objective
      (some output & user changes omitted for brevity and comments added for clarity).
      
      Reverting that commit restores normal operation (queueing) in failing scenarios;
      tested on linux-next (next-20151022).
      
      1) Test-case is based on sg_simple0 [3] (just SG_IO; remove SG_GET_VERSION_NUM)
      
          $ cat sg_simple0.c
          ... see [3] ...
          $ sed '/SG_GET_VERSION_NUM/,/}/d' sg_simple0.c > sgio_inquiry.c
          $ gcc sgio_inquiry.c -o sgio_inquiry
      
      2) The ioctl() works fine with active paths present.
      
          # multipath -l 85ag56
          85ag56 (...) dm-19 IBM     ,2145
          size=60G features='1 queue_if_no_path' hwhandler='0' wp=rw
          |-+- policy='service-time 0' prio=0 status=active
          | |- 8:0:11:0  sdz  65:144  active undef running
          | `- 9:0:9:0   sdbf 67:144  active undef running
          `-+- policy='service-time 0' prio=0 status=enabled
            |- 8:0:12:0  sdae 65:224  active undef running
            `- 9:0:12:0  sdbo 68:32   active undef running
      
          $ ./sgio_inquiry /dev/mapper/85ag56
          Some of the INQUIRY command's response:
              IBM       2145              0000
          INQUIRY duration=0 millisecs, resid=0
      
      3) The ioctl() fails with ENOTTY errno with _no_ active paths present,
         for unprivileged users (rather than blocking due to queue_if_no_path).
      
          # for path in $(multipath -l 85ag56 | grep -o 'sd[a-z]\+'); \
                do multipathd -k"fail path $path"; done
      
          # multipath -l 85ag56
          85ag56 (...) dm-19 IBM     ,2145
          size=60G features='1 queue_if_no_path' hwhandler='0' wp=rw
          |-+- policy='service-time 0' prio=0 status=enabled
          | |- 8:0:11:0  sdz  65:144  failed undef running
          | `- 9:0:9:0   sdbf 67:144  failed undef running
          `-+- policy='service-time 0' prio=0 status=enabled
            |- 8:0:12:0  sdae 65:224  failed undef running
            `- 9:0:12:0  sdbo 68:32   failed undef running
      
          $ ./sgio_inquiry /dev/mapper/85ag56
          sg_simple0: Inquiry SG_IO ioctl error: Inappropriate ioctl for device
      
      4) dmesg shows that scsi_verify_blk_ioctl() failed for SG_IO (0x2285);
         it returns -ENOIOCTLCMD, later replaced with -ENOTTY in vfs_ioctl().
      
          $ dmesg
          <...>
          [] device-mapper: multipath: Failing path 65:144.
          [] device-mapper: multipath: Failing path 67:144.
          [] device-mapper: multipath: Failing path 65:224.
          [] device-mapper: multipath: Failing path 68:32.
          [] sgio_inquiry: sending ioctl 2285 to a partition!
      
      5) The ioctl() only works if the CAP_SYS_RAWIO capability is present
         (then queueing happens -- in this example, queue_if_no_path is set);
         this is due to a conditional check in scsi_verify_blk_ioctl().
      
          # capsh --drop=cap_sys_rawio -- -c './sgio_inquiry /dev/mapper/85ag56'
          sg_simple0: Inquiry SG_IO ioctl error: Inappropriate ioctl for device
      
          # ./sgio_inquiry /dev/mapper/85ag56 &
          [1] 72830
      
          # cat /proc/72830/stack
          [<c00000171c0df700>] 0xc00000171c0df700
          [<c000000000015934>] __switch_to+0x204/0x350
          [<c000000000152d4c>] msleep+0x5c/0x80
          [<c00000000077dfb0>] dm_blk_ioctl+0x70/0x170
          [<c000000000487c40>] blkdev_ioctl+0x2b0/0x9b0
          [<c0000000003128e4>] block_ioctl+0x64/0xd0
          [<c0000000002dd3b0>] do_vfs_ioctl+0x490/0x780
          [<c0000000002dd774>] SyS_ioctl+0xd4/0xf0
          [<c000000000009358>] system_call+0x38/0xd0
      
      6) This is the function call chain exercised in this analysis:
      
      SYSCALL_DEFINE3(ioctl, <...>) @ fs/ioctl.c
          -> do_vfs_ioctl()
              -> vfs_ioctl()
                  ...
                  error = filp->f_op->unlocked_ioctl(filp, cmd, arg);
                  ...
                      -> dm_blk_ioctl() @ drivers/md/dm.c
                          -> multipath_ioctl() @ drivers/md/dm-mpath.c
                              ...
                              (bdev = NULL, due to no active paths)
                              ...
                              if (!bdev || <...>) {
                                  int err = scsi_verify_blk_ioctl(NULL, cmd);
                                  if (err)
                                      r = err;
                              }
                              ...
                                  -> scsi_verify_blk_ioctl() @ block/scsi_ioctl.c
                                      ...
                                      if (bd && bd == bd->bd_contains) // not taken (bd = NULL)
                                          return 0;
                                      ...
                                      if (capable(CAP_SYS_RAWIO)) // not taken (unprivileged user)
                                          return 0;
                                      ...
                                      printk_ratelimited(KERN_WARNING
                                                 "%s: sending ioctl %x to a partition!\n" <...>);
      
                                      return -ENOIOCTLCMD;
                                  <-
                              ...
                              return r ? : <...>
                          <-
                  ...
                  if (error == -ENOIOCTLCMD)
                      error = -ENOTTY;
                   out:
                      return error;
                  ...
      
      Links:
      [1] http://git.qemu.org/?p=qemu.git;a=commit;h=336a6915bc7089fb20fea4ba99972ad9a97c5f52
      [2] https://libvirt.org/formatdomain.html#elementsDisks (see 'disk' -> 'device')
      [3] http://tldp.org/HOWTO/SCSI-Generic-HOWTO/pexample.html (Revision 1.2, 2002-05-03)
      Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Cc: stable@vger.kernel.org
  2. 30 Oct, 2015 1 commit
    • dm: initialize non-blk-mq queue data before queue is used · ad5f498f
      Mikulas Patocka authored
      Commit bfebd1cd ("dm: add full blk-mq
      support to request-based DM") moves the initialization of the fields
      backing_dev_info.congested_fn, backing_dev_info.congested_data and
      queuedata from the function dm_init_md_queue (that is called when the
      device is created) to dm_init_old_md_queue (that is called after the
      device type is determined).
      
      There is no locking when accessing these variables, thus it is possible
      for other parts of the kernel to briefly see this data in a transient
      state (e.g. queue->backing_dev_info.congested_fn initialized and
      md->queue->backing_dev_info.congested_data uninitialized, resulting in
      passing an incorrect parameter to the function dm_any_congested).
      
      This queue data is left initialized for blk-mq devices even though they
      don't use it.
      
      Fixes: bfebd1cd ("dm: add full blk-mq support to request-based DM")
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Cc: stable@vger.kernel.org # v4.1+
  3. 22 Oct, 2015 2 commits
  4. 21 Oct, 2015 3 commits
  5. 15 Oct, 2015 2 commits
  6. 12 Oct, 2015 2 commits
  7. 09 Oct, 2015 15 commits
  8. 04 Oct, 2015 6 commits
    • Linux 4.3-rc4 · 049e6dde
      Linus Torvalds authored
    • Merge branch 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile · 30c44659
      Linus Torvalds authored
      Pull strscpy string copy function implementation from Chris Metcalf.
      
      Chris sent this during the merge window, but I waffled back and forth on
      the pull request, which is why it's going in only now.
      
      The new "strscpy()" function is definitely easier to use and more secure
      than either strncpy() or strlcpy(), both of which are horrible nasty
      interfaces that have serious and irredeemable problems.
      
      strncpy() has a useless return value, and doesn't NUL-terminate an
      overlong result.  To make matters worse, it pads a short result with
      zeroes, which is a performance disaster if you have big buffers.
      
      strlcpy(), by contrast, is a mis-designed "fix" for strncpy(), lacking
      the insane NUL padding, but having a differently broken return value
      which returns the original length of the source string.  Which means
      that it will read characters past the count from the source buffer, and
      you have to trust the source to be properly terminated.  It also makes
      error handling fragile, since the test for overflow is unnecessarily
      subtle.
      
      strscpy() avoids both these problems, guaranteeing the NUL termination
      (but not excessive padding) if the destination size wasn't zero, and
      making the overflow condition very obvious by returning -E2BIG.  It also
      doesn't read past the size of the source, and can thus be used for
      untrusted source data too.
      
      So why did I waffle about this for so long?
      
      Every time we introduce a new-and-improved interface, people start doing
      these interminable series of trivial conversion patches.
      
      And every time that happens, somebody does some silly mistake, and the
      conversion patch to the improved interface actually makes things worse.
      Because the patch is mindnumbing and trivial, nobody has the attention
      span to look at it carefully, and it's usually done over large swatches
      of source code which means that not every conversion gets tested.
      
      So I'm pulling the strscpy() support because it *is* a better interface.
      But I will refuse to pull mindless conversion patches.  Use this in
      places where it makes sense, but don't do trivial patches to fix things
      that aren't actually known to be broken.
      
      * 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
        tile: use global strscpy() rather than private copy
        string: provide strscpy()
        Make asm/word-at-a-time.h available on all architectures
    • Merge tag 'md/4.3-fixes' of git://neil.brown.name/md · 15ecf9a9
      Linus Torvalds authored
      Pull md fixes from Neil Brown:
       "Assorted fixes for md in 4.3-rc.
      
        Two tagged for -stable, and one is really a cleanup to match and
        improve kmemcache interface."
      
      * tag 'md/4.3-fixes' of git://neil.brown.name/md:
        md/bitmap: don't pass -1 to bitmap_storage_alloc.
        md/raid1: Avoid raid1 resync getting stuck
        md: drop null test before destroy functions
        md: clear CHANGE_PENDING in readonly array
        md/raid0: apply base queue limits *before* disk_stack_limits
        md/raid5: don't index beyond end of array in need_this_block().
        raid5: update analysis state for failed stripe
        md: wait for pending superblock updates before switching to read-only
    • Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus · 0d877081
      Linus Torvalds authored
      Pull MIPS updates from Ralf Baechle:
       "This week's round of MIPS fixes:
         - Fix JZ4740 build
         - Fix fallback to GFP_DMA
         - FP seccomp in case of ENOSYS
         - Fix bootmem panic
         - A number of FP and CPS fixes
         - Wire up new syscalls
         - Make sure BPF assembler objects can properly be disassembled
         - Fix BPF assembler code for MIPS I"
      
      * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
        MIPS: scall: Always run the seccomp syscall filters
        MIPS: Octeon: Fix kernel panic on startup from memory corruption
        MIPS: Fix R2300 FP context switch handling
        MIPS: Fix octeon FP context switch handling
        MIPS: BPF: Fix load delay slots.
        MIPS: BPF: Do all exports of symbols with FEXPORT().
        MIPS: Fix the build on jz4740 after removing the custom gpio.h
        MIPS: CPS: #ifdef on CONFIG_MIPS_MT_SMP rather than CONFIG_MIPS_MT
        MIPS: CPS: Don't include MT code in non-MT kernels.
        MIPS: CPS: Stop dangling delay slot from has_mt.
        MIPS: dma-default: Fix 32-bit fall back to GFP_DMA
        MIPS: Wire up userfaultfd and membarrier syscalls.
    • Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3e519dde
      Linus Torvalds authored
      Pull irq fixes from Thomas Gleixner:
       "This update contains:
      
         - Fix for a long standing race affecting /proc/irq/NNN
      
         - One line fix for ARM GICV3-ITS counting the wrong data
      
         - Warning silencing in ARM GICV3-ITS.  Another GCC trying to be
           overly clever issue"
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irqchip/gic-v3-its: Count additional LPIs for the aliased devices
        irqchip/gic-v3-its: Silence warning when its_lpi_alloc_chunks gets inlined
        genirq: Fix race in register_irq_proc()
    • MIPS: scall: Always run the seccomp syscall filters · d218af78
      Markos Chandras authored
      The MIPS syscall handler code used to return -ENOSYS on invalid
      syscalls. Whilst this is expected, it caused problems for seccomp
      filters because the said filters never had the chance to run since
      the code returned -ENOSYS before triggering them. This caused
      problems on the chromium testsuite for filters looking for invalid
      syscalls. This has now changed and the seccomp filters are always
      run even if the syscall is invalid. We return -ENOSYS once we
      return from the seccomp filters. Moreover, similar codepaths have
      been merged in the process which simplifies somewhat the overall
      syscall code.
      Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/11236/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
  9. 03 Oct, 2015 4 commits
    • Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 2cf30826
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
       "Fixes all around the map: W+X kernel mapping fix, WCHAN fixes, two
        build failure fixes for corner case configs, x32 header fix and a
        speling fix"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/headers/uapi: Fix __BITS_PER_LONG value for x32 builds
        x86/mm: Set NX on gap between __ex_table and rodata
        x86/kexec: Fix kexec crash in syscall kexec_file_load()
        x86/process: Unify 32bit and 64bit implementations of get_wchan()
        x86/process: Add proper bound checks in 64bit get_wchan()
        x86, efi, kasan: Fix build failure on !KASAN && KMEMCHECK=y kernels
        x86/hyperv: Fix the build in the !CONFIG_KEXEC_CORE case
        x86/cpufeatures: Correct spelling of the HWP_NOTIFY flag
    • Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 37cc7ab1
      Linus Torvalds authored
      Pull timer fixes from Ingo Molnar:
       "An abs64() fix in the watchdog driver, and two clocksource driver
        NO_IRQ assumption fixes"
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        clocksource: Fix abs() usage w/ 64bit values
        clocksource/drivers/keystone: Fix bad NO_IRQ usage
        clocksource/drivers/rockchip: Fix bad NO_IRQ usage
    • Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a758379b
      Linus Torvalds authored
      Pull EFI fixes from Ingo Molnar:
       "Two EFI fixes: one for x86, one for ARM, fixing a boot crash bug that
        can trigger under newer EFI firmware"
      
      * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        arm64/efi: Fix boot crash by not padding between EFI_MEMORY_RUNTIME regions
        x86/efi: Fix boot crash by mapping EFI memmap entries bottom-up at runtime, instead of top-down
    • Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · 14f97d97
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Bunch of fixes all over the place, all pretty small: amdgpu, i915,
        exynos, one qxl and one vmwgfx.
      
        There is also a bunch of mst fixes, I left some cleanups in the series
        as I didn't think it was worth splitting up the tested series"
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (37 commits)
        drm/dp/mst: add some defines for logical/physical ports
        drm/dp/mst: drop cancel work sync in the mstb destroy path (v2)
        drm/dp/mst: split connector registration into two parts (v2)
        drm/dp/mst: update the link_address_sent before sending the link address (v3)
        drm/dp/mst: fixup handling hotplug on port removal.
        drm/dp/mst: don't pass port into the path builder function
        drm/radeon: drop radeon_fb_helper_set_par
        drm: handle cursor_set2 in restore_fbdev_mode
        drm/exynos: Staticize local function in exynos_drm_gem.c
        drm/exynos: fimd: actually disable dp clock
        drm/exynos: dp: remove suspend/resume functions
        drm/qxl: recreate the primary surface when the bo is not primary
        drm/amdgpu: only print meaningful VM faults
        drm/amdgpu/cgs: remove import_gpu_mem
        drm/i915: Call non-locking version of drm_kms_helper_poll_enable(), v2
        drm: Add a non-locking version of drm_kms_helper_poll_enable(), v2
        drm/vmwgfx: Fix a command submission hang regression
        drm/exynos: remove unused mode_fixup() code
        drm/exynos: remove decon_mode_fixup()
        drm/exynos: remove fimd_mode_fixup()
        ...
  10. 02 Oct, 2015 2 commits
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 978ab6a0
      Linus Torvalds authored
      Pull input layer fixes from Dmitry Torokhov:
       "Fixes for two recent regressions (in Synaptics PS/2 and uinput
        drivers) and some more driver fixups"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Revert "Input: synaptics - fix handling of disabling gesture mode"
        Input: psmouse - fix data race in __ps2_command
        Input: elan_i2c - add all valid ic type for i2c/smbus
        Input: zhenhua - ensure we have BITREVERSE
        Input: omap4-keypad - fix memory leak
        Input: serio - fix blocking of parport
        Input: uinput - fix crash when using ABS events
        Input: elan_i2c - expand maximum product_id form 0xFF to 0xFFFF
        Input: elan_i2c - add ic type 0x03
        Input: elan_i2c - don't require known iap version
        Input: imx6ul_tsc - fix controller name
        Input: imx6ul_tsc - use the preferred method for kzalloc()
        Input: imx6ul_tsc - check for negative return value
        Input: imx6ul_tsc - propagate the errors
        Input: walkera0701 - fix abs() calculations on 64 bit values
        Input: mms114 - remove unneded semicolons
        Input: pm8941-pwrkey - remove unneded semicolon
        Input: fix typo in MT documentation
        Input: cyapa - fix address of Gen3 devices in device tree documentation
    • clocksource: Fix abs() usage w/ 64bit values · 67dfae0c
      John Stultz authored
      This patch fixes one case where abs() was being used with 64-bit
      nanosecond values, where the result may be capped at 32 bits.
      
      This potentially could cause watchdog false negatives on 32-bit
      systems, so this patch addresses the issue by using abs64().
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/1442279124-7309-2-git-send-email-john.stultz@linaro.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>