1. 07 Jun, 2014 40 commits
    • Alexander Stein's avatar
      can: c_can: Set reserved bit in IFx_MASK2 to 1 on write · c7b5c6cd
      Alexander Stein authored
      commit 2bd3bc4e upstream.
      
      According to C_CAN documentation, the reserved bit in IFx_MASK2 register is
      fixed 1.
      Signed-off-by: default avatarAlexander Stein <alexander.stein@systec-electronic.com>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Qiang Huang <h.huangqiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c7b5c6cd
    • Konrad Rzeszutek Wilk's avatar
      intel_idle: Don't register CPU notifier if we are not running. · a0700471
      Konrad Rzeszutek Wilk authored
      commit 6f8c2e79 upstream.
      
      The 'intel_idle_probe' probes the CPU and sets the CPU notifier.
      But if later on during the module initialization we fail (say
      in cpuidle_register_driver), we stop loading, but we neglect
      to unregister the CPU notifier.  This means that during CPU
      hotplug events the system will fail:
      
      calling  intel_idle_init+0x0/0x326 @ 1
      intel_idle: MWAIT substates: 0x1120
      intel_idle: v0.4 model 0x2A
      intel_idle: lapic_timer_reliable_states 0xffffffff
      intel_idle: intel_idle yielding to none
      initcall intel_idle_init+0x0/0x326 returned -19 after 14 usecs
      
      ... some time later, offlining and onlining a CPU:
      
      cpu 3 spinlock event irq 62
      BUG: unable to ] __cpuidle_register_device+0x1c/0x120
      PGD 99b8b067 PUD 99b95067 PMD 0
      Oops: 0000 [#1] SMP
      Modules linked in: xen_evtchn nouveau mxm_wmi wmi radeon ttm i915 fbcon tileblit font atl1c bitblit softcursor drm_kms_helper video xen_blkfront xen_netfront fb_sys_fops sysimgblt sysfillrect syscopyarea xenfs xen_privcmd mperf
      CPU 0
      Pid: 2302, comm: udevd Not tainted 3.8.0-rc3upstream-00249-g09ad159 #1 MSI MS-7680/H61M-P23 (MS-7680)
      RIP: e030:[<ffffffff814d956c>]  [<ffffffff814d956c>] __cpuidle_register_device+0x1c/0x120
      RSP: e02b:ffff88009dacfcb8  EFLAGS: 00010286
      RAX: 0000000000000000 RBX: ffff880105380000 RCX: 000000000000001c
      RDX: 0000000000000000 RSI: 0000000000000055 RDI: ffff880105380000
      RBP: ffff88009dacfce8 R08: ffffffff81a4f048 R09: 0000000000000008
      R10: 0000000000000008 R11: 0000000000000000 R12: ffff880105380000
      R13: 00000000ffffffdd R14: 0000000000000000 R15: ffffffff81a523d0
      FS:  00007f37bd83b7a0(0000) GS:ffff880105200000(0000) knlGS:0000000000000000
      CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000008 CR3: 00000000a09ea000 CR4: 0000000000042660
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process udevd (pid: 2302, threadinfo ffff88009dace000, task ffff88009afb47f0)
      Stack:
       ffffffff8107f2d0 ffffffff810c2fb7 ffff88009dacfce8 00000000ffffffea
       ffff880105380000 00000000ffffffdd ffff88009dacfd08 ffffffff814d9882
       0000000000000003 ffff880105380000 ffff88009dacfd28 ffffffff81340afd
      Call Trace:
       [<ffffffff8107f2d0>] ? collect_cpu_info_local+0x30/0x30
       [<ffffffff810c2fb7>] ? __might_sleep+0xe7/0x100
       [<ffffffff814d9882>] cpuidle_register_device+0x32/0x70
       [<ffffffff81340afd>] intel_idle_cpu_init+0xad/0x110
       [<ffffffff81340bc8>] cpu_hotplug_notify+0x68/0x80
       [<ffffffff8166023d>] notifier_call_chain+0x4d/0x70
       [<ffffffff810bc369>] __raw_notifier_call_chain+0x9/0x10
       [<ffffffff81094a4b>] __cpu_notify+0x1b/0x30
       [<ffffffff81652cf7>] _cpu_up+0x103/0x14b
       [<ffffffff81652e18>] cpu_up+0xd9/0xec
       [<ffffffff8164a254>] store_online+0x94/0xd0
       [<ffffffff814122fb>] dev_attr_store+0x1b/0x20
       [<ffffffff81216404>] sysfs_write_file+0xf4/0x170
       [<ffffffff811a1024>] vfs_write+0xb4/0x130
       [<ffffffff811a17ea>] sys_write+0x5a/0xa0
       [<ffffffff816643a9>] system_call_fastpath+0x16/0x1b
      Code: 03 18 00 c9 c3 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83 ec 30 48 89 5d e8 4c 89 65 f0 48 89 fb 4c 89 6d f8 e8 84 08 00 00 <48> 8b 78 08 49 89 c4 e8 f8 7f c1 ff 89 c2 b8 ea ff ff ff 84 d2
      RIP  [<ffffffff814d956c>] __cpuidle_register_device+0x1c/0x120
       RSP <ffff88009dacfcb8>
      
      This patch fixes that by moving the CPU notifier registration
      as the last item to be done by the module.
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Reviewed-by: default avatarSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      [bwh: Backported to 3.2: notifier is registered only if we do not have ARAT]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Qiang Huang <h.huangqiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a0700471
    • Axel Lin's avatar
      regulator: max8998: Ensure enough delay time for max8998_set_voltage_buck_time_sel · af857f1c
      Axel Lin authored
      commit e8d9897f upstream.
      
      commit 81d0a6ae upstream.
      
      Use DIV_ROUND_UP to prevent truncation by integer division issue.
      This ensures we return enough delay time.
      Signed-off-by: default avatarAxel Lin <axel.lin@ingics.com>
      Signed-off-by: default avatarMark Brown <broonie@opensource.wolfsonmicro.com>
      [bwh: Backported to 3.2: delay is done by driver, not returned to the caller]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Qiang Huang <h.huangqiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      af857f1c
    • Axel Lin's avatar
      regulator: max8997: Use uV in voltage_map_desc · cd858528
      Axel Lin authored
      commit bc3b7756 upstream.
      
      Current code does integer division (min_vol = min_uV / 1000) before pass
      min_vol to max8997_get_voltage_proper_val().
      So it is possible min_vol is truncated to a smaller value.
      
      For example, if the request min_uV is 800900 for ldo.
      min_vol = 800900 / 1000 = 800 (mV)
      Then max8997_get_voltage_proper_val returns 800 mV for this case which is lower
      than the requested voltage.
      
      Use uV rather than mV in voltage_map_desc to prevent truncation by integer
      division.
      Signed-off-by: default avatarAxel Lin <axel.lin@ingics.com>
      Signed-off-by: default avatarMark Brown <broonie@opensource.wolfsonmicro.com>
      [bwh: Backported to 3.2:
       - Adjust context
       - voltage_map_desc also has an n_bits field]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Qiang Huang <h.huangqiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cd858528
    • Krzysztof Mazur's avatar
      i915: ensure that VGA plane is disabled · c7e0950c
      Krzysztof Mazur authored
      commit 0fde901f upstream.
      
      Some broken systems (like HP nc6120) in some cases, usually after LID
      close/open, enable VGA plane, making display unusable (black screen on LVDS,
      some strange mode on VGA output). We used to disable VGA plane only once at
      startup. Now we also check, if VGA plane is still disabled while changing
      mode, and fix that if something changed it.
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57434Signed-off-by: default avatarKrzysztof Mazur <krzysiek@podlesie.net>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      [bwh: Backported to 3.2: intel_modeset_setup_hw_state() does not
       exist, so call i915_redisable_vga() directly from intel_lid_notify()]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Qiang Huang <h.huangqiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c7e0950c
    • Mauro Carvalho Chehab's avatar
      i82975x_edac: Fix dimm label initialization · cabb4147
      Mauro Carvalho Chehab authored
      commit 47969684 upstream.
      
      The driver has only 4 hardcoded labels, but allows much more memory.
      Fix it by removing the hardcoded logic, using snprintf() instead.
      
      [   19.833972] general protection fault: 0000 [#1] SMP
      [   19.837733] Modules linked in: i82975x_edac(+) edac_core firewire_ohci firewire_core crc_itu_t nouveau mxm_wmi wmi video i2c_algo_bit drm_kms_helper ttm drm i2c_core
      [   19.837733] CPU 0
      [   19.837733] Pid: 390, comm: udevd Not tainted 3.6.1-1.fc17.x86_64.debug #1 Dell Inc.                 Precision WorkStation 390    /0MY510
      [   19.837733] RIP: 0010:[<ffffffff813463a8>]  [<ffffffff813463a8>] strncpy+0x18/0x30
      [   19.837733] RSP: 0018:ffff880078535b68  EFLAGS: 00010202
      [   19.837733] RAX: ffff880069fa9708 RBX: ffff880078588000 RCX: ffff880069fa9708
      [   19.837733] RDX: 000000000000001f RSI: 5f706f5f63616465 RDI: ffff880069fa9708
      [   19.837733] RBP: ffff880078535b68 R08: ffff880069fa9727 R09: 000000000000fffe
      [   19.837733] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000003
      [   19.837733] R13: 0000000000000000 R14: ffff880069fa9290 R15: ffff880079624a80
      [   19.837733] FS:  00007f3de01ee840(0000) GS:ffff88007c400000(0000) knlGS:0000000000000000
      [   19.837733] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   19.837733] CR2: 00007f3de00b9000 CR3: 0000000078dbc000 CR4: 00000000000007f0
      [   19.837733] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   19.837733] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [   19.837733] Process udevd (pid: 390, threadinfo ffff880078534000, task ffff880079642450)
      [   19.837733] Stack:
      [   19.837733]  ffff880078535c18 ffffffffa017c6b8 00040000816d627f ffff880079624a88
      [   19.837733]  ffffc90004cd6000 ffff880079624520 ffff88007ac21148 0000000000000000
      [   19.837733]  0000000000000000 0004000000000000 feda000078535bc8 ffffffff810d696d
      [   19.837733] Call Trace:
      [   19.837733]  [<ffffffffa017c6b8>] i82975x_init_one+0x2e6/0x3e6 [i82975x_edac]
      ...
      
      Fix bug reported at:
      	https://bugzilla.redhat.com/show_bug.cgi?id=848149
      And, very likely:
      	https://bbs.archlinux.org/viewtopic.php?id=148033
      	https://bugzilla.kernel.org/show_bug.cgi?id=47171
      
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab@redhat.com>
      [bwh: Backported to 3.2:
       - Adjust context
       - Use csrow->channels[chan].label not csrow->channels[chan]->dimm->label]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Qiang Huang <h.huangqiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cabb4147
    • Jiri Slaby's avatar
      MISC: hpilo, remove pci_disable_device · fb35bbb9
      Jiri Slaby authored
      commit bcdee04e upstream.
      
      pci_disable_device(pdev) used to be in pci remove function. But this
      PCI device has two functions with interrupt lines connected to a
      single pin. The other one is a USB host controller. So when we disable
      the PIN there e.g. by rmmod hpilo, the controller stops working. It is
      because the interrupt link is disabled in ACPI since it is not
      refcounted yet. See acpi_pci_link_free_irq called from
      acpi_pci_irq_disable.
      
      It is not the best solution whatsoever, but as a workaround until the
      ACPI irq link refcounting is sorted out this should fix the reported
      errors.
      
      References: https://lkml.org/lkml/2008/11/4/535Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Cc: Grant Grundler <grundler@parisc-linux.org>
      Cc: Nobin Mathew <nobin.mathew@gmail.com>
      Cc: Robert Hancock <hancockr@shaw.ca>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: David Altobelli <david.altobelli@hp.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Qiang Huang <h.huangqiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fb35bbb9
    • Herton Ronaldo Krzesinski's avatar
      floppy: properly handle failure on add_disk loop · 85923c3d
      Herton Ronaldo Krzesinski authored
      commit d60e7ec1 upstream.
      
      On floppy initialization, if something failed inside the loop we call
      add_disk, there was no cleanup of previous iterations in the error
      handling.
      Signed-off-by: default avatarHerton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Qiang Huang <h.huangqiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      85923c3d
    • Seth Forshee's avatar
      Input: synaptics - adjust threshold for treating position values as negative · ad4751d3
      Seth Forshee authored
      commit 824efd37 upstream.
      
      Commit c039450 (Input: synaptics - handle out of bounds values from the
      hardware) caused any hardware reported values over 7167 to be treated as
      a wrapped-around negative value. It turns out that some firmware uses
      the value 8176 to indicate a finger near the edge of the touchpad whose
      actual position cannot be determined. This value now gets treated as
      negative, which can cause pointer jumps and broken edge scrolling on
      these machines.
      
      I only know of one touchpad which reports negative values, and this
      hardware never reports any value lower than -8 (i.e. 8184). Moving the
      threshold for treating a value as negative up to 8176 should work fine
      then for any hardware we currently know about, and since we're dealing
      with unspecified behavior it's probably the best we can do. The special
      8176 value is also likely to result in sudden jumps in position, so
      let's also clamp this to the maximum specified value for the axis.
      
      BugLink: http://bugs.launchpad.net/bugs/1046512
      https://bugzilla.kernel.org/show_bug.cgi?id=46371Signed-off-by: default avatarSeth Forshee <seth.forshee@canonical.com>
      Reviewed-by: default avatarDaniel Kurtz <djkurtz@chromium.org>
      Tested-by: default avatarAlan Swanson <swanson@ukfsn.org>
      Tested-by: default avatarArteom <arutemus@gmail.com>
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Qiang Huang <h.huangqiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ad4751d3
    • Matthieu CASTET's avatar
      UBI: erase free PEB with bitflip in EC header · b2d99537
      Matthieu CASTET authored
      commit 193819cf upstream.
      
      Without this patch, these PEB are not scrubbed until we put data in them.
      Bitflip can accumulate latter and we can loose the EC header (but VID header
      should be intact and allow to recover data).
      Signed-off-by: default avatarMatthieu Castet <matthieu.castet@parrot.com>
      Signed-off-by: default avatarArtem Bityutskiy <artem.bityutskiy@linux.intel.com>
      [bwh: Backported to 3.2: adjust filename, context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Qiang Huang <h.huangqiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b2d99537
    • Rashika Kheria's avatar
      Staging: zram: Fix access of NULL pointer · 70e8fcb6
      Rashika Kheria authored
      commit 46a51c80 upstream.
      
      This patch fixes the bug in reset_store caused by accessing NULL pointer.
      
      The bdev gets its value from bdget_disk() which could fail when memory
      pressure is severe and hence can return NULL because allocation of
      inode in bdget could fail.
      
      Hence, this patch introduces a check for bdev to prevent reference to a
      NULL pointer in the later part of the code. It also removes unnecessary
      check of bdev for fsync_bdev().
      Acked-by: default avatarJerome Marchand <jmarchan@redhat.com>
      Signed-off-by: default avatarRashika Kheria <rashika.kheria@gmail.com>
      Acked-by: default avatarMinchan Kim <minchan@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      [bwh: Backported to 3.2: adjust filename]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Jianguo Wu <wujianguo@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      70e8fcb6
    • Sergey Senozhatsky's avatar
      zram: allow request end to coincide with disksize · 1f607459
      Sergey Senozhatsky authored
      commit 75c7caf5 upstream.
      
      Pass valid_io_request() checks if request end coincides with disksize
      (end equals bound), only fail if we attempt to read beyond the bound.
      
      mkfs.ext2 produces numerous errors:
      [ 2164.632747] quiet_error: 1 callbacks suppressed
      [ 2164.633260] Buffer I/O error on device zram0, logical block 153599
      [ 2164.633265] lost page write due to I/O error on zram0
      Signed-off-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Jianguo Wu <wujianguo@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1f607459
    • Jiang Liu's avatar
      zram: avoid access beyond the zram device · f3ec6e7a
      Jiang Liu authored
      commit 12a7ad3b upstream.
      
      Function valid_io_request() should verify the entire request are within
      the zram device address range. Otherwise it may cause invalid memory
      access when accessing/modifying zram->meta->table[index] because the
      'index' is out of range. Then it may access non-exist memory, randomly
      modify memory belong to other subsystems, which is hard to track down.
      Signed-off-by: default avatarJiang Liu <jiang.liu@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Jianguo Wu <wujianguo@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f3ec6e7a
    • Jiang Liu's avatar
      zram: destroy all devices on error recovery path in zram_init() · 11fc2ee5
      Jiang Liu authored
      commit 39a9b8ac upstream.
      
      On error recovery path of zram_init(), it leaks the zram device object
      causing the failure. So change create_device() to free allocated
      resources on error path.
      Signed-off-by: default avatarJiang Liu <jiang.liu@huawei.com>
      Acked-by: default avatarMinchan Kim <minchan@kernel.org>
      Acked-by: default avatarJerome Marchand <jmarchan@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Jianguo Wu <wujianguo@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      11fc2ee5
    • Jiang Liu's avatar
      zram: avoid invalid memory access in zram_exit() · 5e84bf6d
      Jiang Liu authored
      commit 6030ea9b upstream.
      
      Memory for zram->disk object may have already been freed after returning
      from destroy_device(zram), then it's unsafe for zram_reset_device(zram)
      to access zram->disk again.
      
      We can't solve this bug by flipping the order of destroy_device(zram)
      and zram_reset_device(zram), that will cause deadlock issues to the
      zram sysfs handler.
      
      So fix it by holding an extra reference to zram->disk before calling
      destroy_device(zram).
      Signed-off-by: default avatarJiang Liu <jiang.liu@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Jianguo Wu <wujianguo@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5e84bf6d
    • Minchan Kim's avatar
      zram: Fix deadlock bug in partial read/write · 74eb879b
      Minchan Kim authored
      commit 7e5a5104 upstream.
      
      Now zram allocates new page with GFP_KERNEL in zram I/O path
      if IO is partial. Unfortunately, It may cause deadlock with
      reclaim path like below.
      
      write_page from fs
      fs_lock
      allocation(GFP_KERNEL)
      reclaim
      pageout
      				write_page from fs
      				fs_lock <-- deadlock
      
      This patch fixes it by using GFP_NOIO.  In read path, we
      reorganize code flow so that kmap_atomic is called after the
      GFP_NOIO allocation.
      Acked-by: default avatarJerome Marchand <jmarchand@redhat.com>
      Acked-by: default avatarNitin Gupta <ngupta@vflare.org>
      [ penberg@kernel.org: don't use GFP_ATOMIC ]
      Signed-off-by: default avatarPekka Enberg <penberg@kernel.org>
      Signed-off-by: default avatarMinchan Kim <minchan@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      [bwh: Backported to 3.2: no reordering is needed in the read path]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Jianguo Wu <wujianguo@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      74eb879b
    • Joe Thornber's avatar
      dm thin: fix discard corruption · 87dba703
      Joe Thornber authored
      commit f046f89a upstream.
      
      Fix a bug in dm_btree_remove that could leave leaf values with incorrect
      reference counts.  The effect of this was that removal of a shared block
      could result in the space maps thinking the block was no longer used.
      More concretely, if you have a thin device and a snapshot of it, sending
      a discard to a shared region of the thin could corrupt the snapshot.
      
      Thinp uses a 2-level nested btree to store it's mappings.  This first
      level is indexed by thin device, and the second level by logical
      block.
      
      Often when we're removing an entry in this mapping tree we need to
      rebalance nodes, which can involve shadowing them, possibly creating a
      copy if the block is shared.  If we do create a copy then children of
      that node need to have their reference counts incremented.  In this
      way reference counts percolate down the tree as shared trees diverge.
      
      The rebalance functions were incrementing the children at the
      appropriate time, but they were always assuming the children were
      internal nodes.  This meant the leaf values (in our case packed
      block/flags entries) were not being incremented.
      Signed-off-by: default avatarJoe Thornber <ejt@redhat.com>
      Signed-off-by: default avatarAlasdair G Kergon <agk@redhat.com>
      [bwh: Backported to 3.2: bump target version numbers from 1.0.1 to 1.0.2]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      [xr: Backported to 3.4: bump target version numbers to 1.1.1]
      Signed-off-by: default avatarRui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      87dba703
    • Shiva Krishna Merla's avatar
      dm mpath: fix race condition between multipath_dtr and pg_init_done · 5e301eba
      Shiva Krishna Merla authored
      commit 954a73d5 upstream.
      
      Whenever multipath_dtr() is happening we must prevent queueing any
      further path activation work.  Implement this by adding a new
      'pg_init_disabled' flag to the multipath structure that denotes future
      path activation work should be skipped if it is set.  By disabling
      pg_init and then re-enabling in flush_multipath_work() we also avoid the
      potential for pg_init to be initiated while suspending an mpath device.
      
      Without this patch a race condition exists that may result in a kernel
      panic:
      
      1) If after pg_init_done() decrements pg_init_in_progress to 0, a call
         to wait_for_pg_init_completion() assumes there are no more pending path
         management commands.
      2) If pg_init_required is set by pg_init_done(), due to retryable
         mode_select errors, then process_queued_ios() will again queue the
         path activation work.
      3) If free_multipath() completes before activate_path() work is called a
         NULL pointer dereference like the following can be seen when
         accessing members of the recently destructed multipath:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000090
      RIP: 0010:[<ffffffffa003db1b>]  [<ffffffffa003db1b>] activate_path+0x1b/0x30 [dm_multipath]
      [<ffffffff81090ac0>] worker_thread+0x170/0x2a0
      [<ffffffff81096c80>] ? autoremove_wake_function+0x0/0x40
      
      [switch to disabling pg_init in flush_multipath_work & header edits by Mike Snitzer]
      Signed-off-by: default avatarShiva Krishna Merla <shivakrishna.merla@netapp.com>
      Reviewed-by: default avatarKrishnasamy Somasundaram <somasundaram.krishnasamy@netapp.com>
      Tested-by: default avatarSpeagle Andy <Andy.Speagle@netapp.com>
      Acked-by: default avatarJunichi Nomura <j-nomura@ce.jp.nec.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      [bwh: Backported to 3.2:
       - Adjust context
       - Bump version to 1.3.2 not 1.6.0]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      [xr: Backported to 3.4: Adjust context]
      Signed-off-by: default avatarRui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5e301eba
    • Mikulas Patocka's avatar
      dm snapshot: avoid snapshot space leak on crash · d110fd51
      Mikulas Patocka authored
      commit 230c83af upstream.
      
      There is a possible leak of snapshot space in case of crash.
      
      The reason for space leaking is that chunks in the snapshot device are
      allocated sequentially, but they are finished (and stored in the metadata)
      out of order, depending on the order in which copying finished.
      
      For example, supposed that the metadata contains the following records
      SUPERBLOCK
      METADATA (blocks 0 ... 250)
      DATA 0
      DATA 1
      DATA 2
      ...
      DATA 250
      
      Now suppose that you allocate 10 new data blocks 251-260. Suppose that
      copying of these blocks finish out of order (block 260 finished first
      and the block 251 finished last). Now, the snapshot device looks like
      this:
      SUPERBLOCK
      METADATA (blocks 0 ... 250, 260, 259, 258, 257, 256)
      DATA 0
      DATA 1
      DATA 2
      ...
      DATA 250
      DATA 251
      DATA 252
      DATA 253
      DATA 254
      DATA 255
      METADATA (blocks 255, 254, 253, 252, 251)
      DATA 256
      DATA 257
      DATA 258
      DATA 259
      DATA 260
      
      Now, if the machine crashes after writing the first metadata block but
      before writing the second metadata block, the space for areas DATA 250-255
      is leaked, it contains no valid data and it will never be used in the
      future.
      
      This patch makes dm-snapshot complete exceptions in the same order they
      were allocated, thus fixing this bug.
      
      Note: when backporting this patch to the stable kernel, change the version
      field in the following way:
      * if version in the stable kernel is {1, 11, 1}, change it to {1, 12, 0}
      * if version in the stable kernel is {1, 10, 0} or {1, 10, 1}, change it
        to {1, 10, 2}
      Userspace reads the version to determine if the bug was fixed, so the
      version change is needed.
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      [xr: Backported to 3.4: adjust version]
      Signed-off-by: default avatarRui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d110fd51
    • Harshula Jayasuriya's avatar
      nfsd: nfsd_open: when dentry_open returns an error do not propagate as struct file · 4834ca94
      Harshula Jayasuriya authored
      commit e4daf1ff upstream.
      
      The following call chain:
      ------------------------------------------------------------
      nfs4_get_vfs_file
      - nfsd_open
        - dentry_open
          - do_dentry_open
            - __get_file_write_access
              - get_write_access
                - return atomic_inc_unless_negative(&inode->i_writecount) ? 0 : -ETXTBSY;
      ------------------------------------------------------------
      
      can result in the following state:
      ------------------------------------------------------------
      struct nfs4_file {
      ...
        fi_fds = {0xffff880c1fa65c80, 0xffffffffffffffe6, 0x0},
        fi_access = {{
            counter = 0x1
          }, {
            counter = 0x0
          }},
      ...
      ------------------------------------------------------------
      
      1) First time around, in nfs4_get_vfs_file() fp->fi_fds[O_WRONLY] is
      NULL, hence nfsd_open() is called where we get status set to an error
      and fp->fi_fds[O_WRONLY] to -ETXTBSY. Thus we do not reach
      nfs4_file_get_access() and fi_access[O_WRONLY] is not incremented.
      
      2) Second time around, in nfs4_get_vfs_file() fp->fi_fds[O_WRONLY] is
      NOT NULL (-ETXTBSY), so nfsd_open() is NOT called, but
      nfs4_file_get_access() IS called and fi_access[O_WRONLY] is incremented.
      Thus we leave a landmine in the form of the nfs4_file data structure in
      an incorrect state.
      
      3) Eventually, when __nfs4_file_put_access() is called it finds
      fi_access[O_WRONLY] being non-zero, it decrements it and calls
      nfs4_file_put_fd() which tries to fput -ETXTBSY.
      ------------------------------------------------------------
      ...
           [exception RIP: fput+0x9]
           RIP: ffffffff81177fa9  RSP: ffff88062e365c90  RFLAGS: 00010282
           RAX: ffff880c2b3d99cc  RBX: ffff880c2b3d9978  RCX: 0000000000000002
           RDX: dead000000100101  RSI: 0000000000000001  RDI: ffffffffffffffe6
           RBP: ffff88062e365c90   R8: ffff88041fe797d8   R9: ffff88062e365d58
           R10: 0000000000000008  R11: 0000000000000000  R12: 0000000000000001
           R13: 0000000000000007  R14: 0000000000000000  R15: 0000000000000000
           ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
        #9 [ffff88062e365c98] __nfs4_file_put_access at ffffffffa0562334 [nfsd]
       #10 [ffff88062e365cc8] nfs4_file_put_access at ffffffffa05623ab [nfsd]
       #11 [ffff88062e365ce8] free_generic_stateid at ffffffffa056634d [nfsd]
       #12 [ffff88062e365d18] release_open_stateid at ffffffffa0566e4b [nfsd]
       #13 [ffff88062e365d38] nfsd4_close at ffffffffa0567401 [nfsd]
       #14 [ffff88062e365d88] nfsd4_proc_compound at ffffffffa0557f28 [nfsd]
       #15 [ffff88062e365dd8] nfsd_dispatch at ffffffffa054543e [nfsd]
       #16 [ffff88062e365e18] svc_process_common at ffffffffa04ba5a4 [sunrpc]
       #17 [ffff88062e365e98] svc_process at ffffffffa04babe0 [sunrpc]
       #18 [ffff88062e365eb8] nfsd at ffffffffa0545b62 [nfsd]
       #19 [ffff88062e365ee8] kthread at ffffffff81090886
       #20 [ffff88062e365f48] kernel_thread at ffffffff8100c14a
      ------------------------------------------------------------
      Signed-off-by: default avatarHarshula Jayasuriya <harshula@redhat.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      [xr: Backported to 3.4: adjust context]
      Signed-off-by: default avatarRui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4834ca94
    • NeilBrown's avatar
      md/raid10: fix "enough" function for detecting if array is failed. · 352f526f
      NeilBrown authored
      commit 80b48124 upstream.
      
      The 'enough' function is written to work with 'near' arrays only
      in that is implicitly assumes that the offset from one 'group' of
      devices to the next is the same as the number of copies.
      In reality it is the number of 'near' copies.
      
      So change it to make this number explicit.
      
      This bug makes it possible to run arrays without enough drives
      present, which is dangerous.
      It is appropriate for an -stable kernel, but will almost certainly
      need to be modified for some of them.
      Reported-by: default avatarJakub Husák <jakub@gooseman.cz>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      [bwh: Backported to 3.2: s/geo->/conf->/]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      352f526f
    • Mikulas Patocka's avatar
      dm snapshot: add missing module aliases · e4bf9301
      Mikulas Patocka authored
      commit 23cb2109 upstream.
      
      Add module aliases so that autoloading works correctly if the user
      tries to activate "snapshot-origin" or "snapshot-merge" targets.
      
      Reference: https://bugzilla.redhat.com/889973Reported-by: default avatarChao Yang <chyang@redhat.com>
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: default avatarAlasdair G Kergon <agk@redhat.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e4bf9301
    • Mikulas Patocka's avatar
      dm bufio: avoid a possible __vmalloc deadlock · bed74df4
      Mikulas Patocka authored
      commit 502624bd upstream.
      
      This patch uses memalloc_noio_save to avoid a possible deadlock in
      dm-bufio.  (it could happen only with large block size, at most
      PAGE_SIZE << MAX_ORDER (typically 8MiB).
      
      __vmalloc doesn't fully respect gfp flags. The specified gfp flags are
      used for allocation of requested pages, structures vmap_area, vmap_block
      and vm_struct and the radix tree nodes.
      
      However, the kernel pagetables are allocated always with GFP_KERNEL.
      Thus the allocation of pagetables can recurse back to the I/O layer and
      cause a deadlock.
      
      This patch uses the function memalloc_noio_save to set per-process
      PF_MEMALLOC_NOIO flag and the function memalloc_noio_restore to restore
      it. When this flag is set, all allocations in the process are done with
      implied GFP_NOIO flag, thus the deadlock can't happen.
      
      This should be backported to stable kernels, but they don't have the
      PF_MEMALLOC_NOIO flag and memalloc_noio_save/memalloc_noio_restore
      functions. So, PF_MEMALLOC should be set and restored instead.
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: default avatarAlasdair G Kergon <agk@redhat.com>
      [bwh: Backported to 3.2 as recommended]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bed74df4
    • Trond Myklebust's avatar
      NFSv4.1: Handle NFS4ERR_DELAY when resetting the NFSv4.1 session · ba795927
      Trond Myklebust authored
      commit c489ee29 upstream.
      
      NFS4ERR_DELAY is a legal reply when we call DESTROY_SESSION. It
      usually means that the server is busy handling an unfinished RPC
      request. Just sleep for a second and then retry.
      We also need to be able to handle the NFS4ERR_BACK_CHAN_BUSY return
      value. If the NFS server has outstanding callbacks, we just want to
      similarly sleep & retry.
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ba795927
    • Weston Andros Adamson's avatar
      NFSv4.1: Don't decode skipped layoutgets · 0c5fa16a
      Weston Andros Adamson authored
      commit 085b7a45 upstream.
      
      layoutget's prepare hook can call rpc_exit with status = NFS4_OK (0).
      Because of this, nfs4_proc_layoutget can't depend on a 0 status to mean
      that the RPC was successfully sent, received and parsed.
      
      To fix this, use the result's len member to see if parsing took place.
      
      This fixes the following OOPS -- calling xdr_init_decode() with a buffer length
      0 doesn't set the stream's 'p' member and ends up using uninitialized memory
      in filelayout_decode_layout.
      
      BUG: unable to handle kernel paging request at 0000000000008050
      IP: [<ffffffff81282e78>] memcpy+0x18/0x120
      PGD 0
      Oops: 0000 [#1] SMP
      last sysfs file: /sys/devices/pci0000:00/0000:00:11.0/0000:02:01.0/irq
      CPU 1
      Modules linked in: nfs_layout_nfsv41_files nfs lockd fscache auth_rpcgss nfs_acl autofs4 sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log dm_mod ppdev parport_pc parport snd_ens1371 snd_rawmidi snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc e1000 microcode vmware_balloon i2c_piix4 i2c_core sg shpchp ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif pata_acpi ata_generic ata_piix mptspi mptscsih mptbase scsi_transport_spi [last unloaded: speedstep_lib]
      
      Pid: 1665, comm: flush-0:22 Not tainted 2.6.32-356-test-2 #2 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
      RIP: 0010:[<ffffffff81282e78>]  [<ffffffff81282e78>] memcpy+0x18/0x120
      RSP: 0018:ffff88003dfab588  EFLAGS: 00010206
      RAX: ffff88003dc42000 RBX: ffff88003dfab610 RCX: 0000000000000009
      RDX: 000000003f807ff0 RSI: 0000000000008050 RDI: ffff88003dc42000
      RBP: ffff88003dfab5b0 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000080 R12: 0000000000000024
      R13: ffff88003dc42000 R14: ffff88003f808030 R15: ffff88003dfab6a0
      FS:  0000000000000000(0000) GS:ffff880003420000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      CR2: 0000000000008050 CR3: 000000003bc92000 CR4: 00000000001407e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process flush-0:22 (pid: 1665, threadinfo ffff88003dfaa000, task ffff880037f77540)
      Stack:
      ffffffffa0398ac1 ffff8800397c5940 ffff88003dfab610 ffff88003dfab6a0
      <d> ffff88003dfab5d0 ffff88003dfab680 ffffffffa01c150b ffffea0000d82e70
      <d> 000000508116713b 0000000000000000 0000000000000000 0000000000000000
      Call Trace:
      [<ffffffffa0398ac1>] ? xdr_inline_decode+0xb1/0x120 [sunrpc]
      [<ffffffffa01c150b>] filelayout_decode_layout+0xeb/0x350 [nfs_layout_nfsv41_files]
      [<ffffffffa01c17fc>] filelayout_alloc_lseg+0x8c/0x3c0 [nfs_layout_nfsv41_files]
      [<ffffffff8150e6ce>] ? __wait_on_bit+0x7e/0x90
      Signed-off-by: default avatarWeston Andros Adamson <dros@netapp.com>
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0c5fa16a
    • Trond Myklebust's avatar
      NFSv4.1: Fix a race in pNFS layoutcommit · 079ee145
      Trond Myklebust authored
      commit a073dbff upstream.
      
      We need to clear the NFS_LSEG_LAYOUTCOMMIT bits atomically with the
      NFS_INO_LAYOUTCOMMIT bit, otherwise we may end up with situations
      where the two are out of sync.
      The first half of the problem is to ensure that pnfs_layoutcommit_inode
      clears the NFS_LSEG_LAYOUTCOMMIT bit through pnfs_list_write_lseg.
      We still need to keep the reference to those segments until the RPC call
      is finished, so in order to make it clear _where_ those references come
      from, we add a helper pnfs_list_write_lseg_done() that cleans up after
      pnfs_list_write_lseg.
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: default avatarBenny Halevy <bhalevy@tonian.com>
      [bwh: Backported to 3.2: s/pnfs_put_lseg/put_lseg/]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      079ee145
    • Chuck Lever's avatar
      NFS: nfs_getaclargs.acl_len is a size_t · f40661f4
      Chuck Lever authored
      commit 56d08fef upstream.
      
      Squelch compiler warnings:
      
      fs/nfs/nfs4proc.c: In function ‘__nfs4_get_acl_uncached’:
      fs/nfs/nfs4proc.c:3811:14: warning: comparison between signed and
      	unsigned integer expressions [-Wsign-compare]
      fs/nfs/nfs4proc.c:3818:15: warning: comparison between signed and
      	unsigned integer expressions [-Wsign-compare]
      
      Introduced by commit bf118a34 "NFSv4: include bitmap in nfsv4 get
      acl data", Dec 7, 2011.
      Signed-off-by: default avatarChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f40661f4
    • fanchaoting's avatar
      nfsd: don't run get_file if nfs4_preprocess_stateid_op return error · 79854e6e
      fanchaoting authored
      commit b022032e upstream.
      
      we should return error status directly when nfs4_preprocess_stateid_op
      return error.
      Signed-off-by: default avatarfanchaoting <fanchaoting@cn.fujitsu.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      79854e6e
    • Dan Carpenter's avatar
      NFSv4.1: integer overflow in decode_cb_sequence_args() · 42b607da
      Dan Carpenter authored
      commit 0439f31c upstream.
      
      This seems like it could overflow on 32 bits.  Use kmalloc_array() which
      has overflow protection built in.
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      42b607da
    • J. Bruce Fields's avatar
      nfsd4: fix xdr decoding of large non-write compounds · ca36e74e
      J. Bruce Fields authored
      commit 365da4ad upstream.
      
      This fixes a regression from 24750082
      "nfsd4: fix decoding of compounds across page boundaries".  The previous
      code was correct: argp->pagelist is initialized in
      nfs4svc_deocde_compoundargs to rqstp->rq_arg.pages, and is therefore a
      pointer to the page *after* the page we are currently decoding.
      
      The reason that patch nevertheless fixed a problem with decoding
      compounds containing write was a bug in the write decoding introduced by
      5a80a54d "nfsd4: reorganize write
      decoding", after which write decoding no longer adhered to the rule that
      argp->pagelist point to the next page.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      [bwh: Backported to 3.2: adjust context; there is only one instance to fix]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ca36e74e
    • Andy Adamson's avatar
      NFSv4 wait on recovery for async session errors · e9d735ee
      Andy Adamson authored
      commit 4a82fd7c upstream.
      
      When the state manager is processing the NFS4CLNT_DELEGRETURN flag, session
      draining is off, but DELEGRETURN can still get a session error.
      The async handler calls nfs4_schedule_session_recovery returns -EAGAIN, and
      the DELEGRETURN done then restarts the RPC task in the prepare state.
      With the state manager still processing the NFS4CLNT_DELEGRETURN flag with
      session draining off, these DELEGRETURNs will cycle with errors filling up the
      session slots.
      
      This prevents OPEN reclaims (from nfs_delegation_claim_opens) required by the
      NFS4CLNT_DELEGRETURN state manager processing from completing, hanging the
      state manager in the __rpc_wait_for_completion_task in nfs4_run_open_task
      as seen in this kernel thread dump:
      
      kernel: 4.12.32.53-ma D 0000000000000000     0  3393      2 0x00000000
      kernel: ffff88013995fb60 0000000000000046 ffff880138cc5400 ffff88013a9df140
      kernel: ffff8800000265c0 ffffffff8116eef0 ffff88013fc10080 0000000300000001
      kernel: ffff88013a4ad058 ffff88013995ffd8 000000000000fbc8 ffff88013a4ad058
      kernel: Call Trace:
      kernel: [<ffffffff8116eef0>] ? cache_alloc_refill+0x1c0/0x240
      kernel: [<ffffffffa0358110>] ? rpc_wait_bit_killable+0x0/0xa0 [sunrpc]
      kernel: [<ffffffffa0358152>] rpc_wait_bit_killable+0x42/0xa0 [sunrpc]
      kernel: [<ffffffff8152914f>] __wait_on_bit+0x5f/0x90
      kernel: [<ffffffffa0358110>] ? rpc_wait_bit_killable+0x0/0xa0 [sunrpc]
      kernel: [<ffffffff815291f8>] out_of_line_wait_on_bit+0x78/0x90
      kernel: [<ffffffff8109b520>] ? wake_bit_function+0x0/0x50
      kernel: [<ffffffffa035810d>] __rpc_wait_for_completion_task+0x2d/0x30 [sunrpc]
      kernel: [<ffffffffa040d44c>] nfs4_run_open_task+0x11c/0x160 [nfs]
      kernel: [<ffffffffa04114e7>] nfs4_open_recover_helper+0x87/0x120 [nfs]
      kernel: [<ffffffffa0411646>] nfs4_open_recover+0xc6/0x150 [nfs]
      kernel: [<ffffffffa040cc6f>] ? nfs4_open_recoverdata_alloc+0x2f/0x60 [nfs]
      kernel: [<ffffffffa0414e1a>] nfs4_open_delegation_recall+0x6a/0xa0 [nfs]
      kernel: [<ffffffffa0424020>] nfs_end_delegation_return+0x120/0x2e0 [nfs]
      kernel: [<ffffffff8109580f>] ? queue_work+0x1f/0x30
      kernel: [<ffffffffa0424347>] nfs_client_return_marked_delegations+0xd7/0x110 [nfs]
      kernel: [<ffffffffa04225d8>] nfs4_run_state_manager+0x548/0x620 [nfs]
      kernel: [<ffffffffa0422090>] ? nfs4_run_state_manager+0x0/0x620 [nfs]
      kernel: [<ffffffff8109b0f6>] kthread+0x96/0xa0
      kernel: [<ffffffff8100c20a>] child_rip+0xa/0x20
      kernel: [<ffffffff8109b060>] ? kthread+0x0/0xa0
      kernel: [<ffffffff8100c200>] ? child_rip+0x0/0x20
      
      The state manager can not therefore process the DELEGRETURN session errors.
      Change the async handler to wait for recovery on session errors.
      Signed-off-by: default avatarAndy Adamson <andros@netapp.com>
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      [bwh: Backported to 3.2:
       - Adjust context
       - There's no restart_call label]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e9d735ee
    • Mateusz Guzik's avatar
      cifs: delay super block destruction until all cifsFileInfo objects are gone · 38feb080
      Mateusz Guzik authored
      commit 24261fc2 upstream.
      
      cifsFileInfo objects hold references to dentries and it is possible that
      these will still be around in workqueues when VFS decides to kill super
      block during unmount.
      
      This results in panics like this one:
      BUG: Dentry ffff88001f5e76c0{i=66b4a,n=1M-2} still in use (1) [unmount of cifs cifs]
      ------------[ cut here ]------------
      kernel BUG at fs/dcache.c:943!
      [..]
      Process umount (pid: 1781, threadinfo ffff88003d6e8000, task ffff880035eeaec0)
      [..]
      Call Trace:
       [<ffffffff811b44f3>] shrink_dcache_for_umount+0x33/0x60
       [<ffffffff8119f7fc>] generic_shutdown_super+0x2c/0xe0
       [<ffffffff8119f946>] kill_anon_super+0x16/0x30
       [<ffffffffa036623a>] cifs_kill_sb+0x1a/0x30 [cifs]
       [<ffffffff8119fcc7>] deactivate_locked_super+0x57/0x80
       [<ffffffff811a085e>] deactivate_super+0x4e/0x70
       [<ffffffff811bb417>] mntput_no_expire+0xd7/0x130
       [<ffffffff811bc30c>] sys_umount+0x9c/0x3c0
       [<ffffffff81657c19>] system_call_fastpath+0x16/0x1b
      
      Fix this by making each cifsFileInfo object hold a reference to cifs
      super block, which implicitly keeps VFS super block around as well.
      Signed-off-by: default avatarMateusz Guzik <mguzik@redhat.com>
      Reviewed-by: default avatarJeff Layton <jlayton@redhat.com>
      Reported-and-Tested-by: default avatarBen Greear <greearb@candelatech.com>
      Signed-off-by: default avatarSteve French <sfrench@us.ibm.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      [xr: Backported to 3.4: adjust context]
      Signed-off-by: default avatarRui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      38feb080
    • Linus Torvalds's avatar
      VFS: make vfs_fstat() use f[get|put]_light() · 9b67aeff
      Linus Torvalds authored
      commit e994defb upstream.
      
      Use the *_light() versions that properly avoid doing the file user count
      updates when they are unnecessary.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [xr: Backported to 3.4: adjust function name]
      Signed-off-by: default avatarRui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9b67aeff
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Keep overwrite in sync between regular and snapshot buffers · 4652951d
      Steven Rostedt (Red Hat) authored
      commit 80902822 upstream.
      
      Changing the overwrite mode for the ring buffer via the trace
      option only sets the normal buffer. But the snapshot buffer could
      swap with it, and then the snapshot would be in non overwrite mode
      and the normal buffer would be in overwrite mode, even though the
      option flag states otherwise.
      
      Keep the two buffers overwrite modes in sync.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4652951d
    • Wei Yongjun's avatar
      perf: Fix error return code · 926685e9
      Wei Yongjun authored
      commit c4814202 upstream.
      
      Fix to return -ENOMEM in the allocation error case instead of 0
      (if pmu_bus_running == 1), as done elsewhere in this function.
      Signed-off-by: default avatarWei Yongjun <yongjun_wei@trendmicro.com.cn>
      Cc: a.p.zijlstra@chello.nl
      Cc: paulus@samba.org
      Cc: acme@ghostprotocols.net
      Link: http://lkml.kernel.org/r/CAPgLHd8j_fWcgqe%3DKLWjpBj%2B%3Do0Pw6Z-SEq%3DNTPU08c2w1tngQ@mail.gmail.com
      [ Tweaked the error code setting placement and the changelog. ]
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      926685e9
    • libin's avatar
      sched/debug: Fix sd->*_idx limit range avoiding overflow · e5f1ec5d
      libin authored
      commit fd9b86d3 upstream.
      
      Commit 201c373e ("sched/debug: Limit sd->*_idx range on
      sysctl") was an incomplete bug fix.
      
      This patch fixes sd->*_idx limit range to [0 ~ CPU_LOAD_IDX_MAX-1]
      avoiding array overflow caused by setting sd->*_idx to CPU_LOAD_IDX_MAX
      on sysctl.
      Signed-off-by: default avatarLibin <huawei.libin@huawei.com>
      Cc: <jiang.liu@huawei.com>
      Cc: <guohanjun@huawei.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/51626610.2040607@huawei.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e5f1ec5d
    • Namhyung Kim's avatar
      sched/debug: Limit sd->*_idx range on sysctl · 4aff95ab
      Namhyung Kim authored
      commit 201c373e upstream.
      
      Various sd->*_idx's are used for refering the rq's load average table
      when selecting a cpu to run.  However they can be set to any number
      with sysctl knobs so that it can crash the kernel if something bad is
      given. Fix it by limiting them into the actual range.
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1345104204-8317-1-git-send-email-namhyung@kernel.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4aff95ab
    • Steven Rostedt (Red Hat)'s avatar
      ftrace: Check module functions being traced on reload · 74d86ed7
      Steven Rostedt (Red Hat) authored
      commit 8c4f3c3f upstream.
      
      There's been a nasty bug that would show up and not give much info.
      The bug displayed the following warning:
      
       WARNING: at kernel/trace/ftrace.c:1529 __ftrace_hash_rec_update+0x1e3/0x230()
       Pid: 20903, comm: bash Tainted: G           O 3.6.11+ #38405.trunk
       Call Trace:
        [<ffffffff8103e5ff>] warn_slowpath_common+0x7f/0xc0
        [<ffffffff8103e65a>] warn_slowpath_null+0x1a/0x20
        [<ffffffff810c2ee3>] __ftrace_hash_rec_update+0x1e3/0x230
        [<ffffffff810c4f28>] ftrace_hash_move+0x28/0x1d0
        [<ffffffff811401cc>] ? kfree+0x2c/0x110
        [<ffffffff810c68ee>] ftrace_regex_release+0x8e/0x150
        [<ffffffff81149f1e>] __fput+0xae/0x220
        [<ffffffff8114a09e>] ____fput+0xe/0x10
        [<ffffffff8105fa22>] task_work_run+0x72/0x90
        [<ffffffff810028ec>] do_notify_resume+0x6c/0xc0
        [<ffffffff8126596e>] ? trace_hardirqs_on_thunk+0x3a/0x3c
        [<ffffffff815c0f88>] int_signal+0x12/0x17
       ---[ end trace 793179526ee09b2c ]---
      
      It was finally narrowed down to unloading a module that was being traced.
      
      It was actually more than that. When functions are being traced, there's
      a table of all functions that have a ref count of the number of active
      tracers attached to that function. When a function trace callback is
      registered to a function, the function's record ref count is incremented.
      When it is unregistered, the function's record ref count is decremented.
      If an inconsistency is detected (ref count goes below zero) the above
      warning is shown and the function tracing is permanently disabled until
      reboot.
      
      The ftrace callback ops holds a hash of functions that it filters on
      (and/or filters off). If the hash is empty, the default means to filter
      all functions (for the filter_hash) or to disable no functions (for the
      notrace_hash).
      
      When a module is unloaded, it frees the function records that represent
      the module functions. These records exist on their own pages, that is
      function records for one module will not exist on the same page as
      function records for other modules or even the core kernel.
      
      Now when a module unloads, the records that represents its functions are
      freed. When the module is loaded again, the records are recreated with
      a default ref count of zero (unless there's a callback that traces all
      functions, then they will also be traced, and the ref count will be
      incremented).
      
      The problem is that if an ftrace callback hash includes functions of the
      module being unloaded, those hash entries will not be removed. If the
      module is reloaded in the same location, the hash entries still point
      to the functions of the module but the module's ref counts do not reflect
      that.
      
      With the help of Steve and Joern, we found a reproducer:
      
       Using uinput module and uinput_release function.
      
       cd /sys/kernel/debug/tracing
       modprobe uinput
       echo uinput_release > set_ftrace_filter
       echo function > current_tracer
       rmmod uinput
       modprobe uinput
       # check /proc/modules to see if loaded in same addr, otherwise try again
       echo nop > current_tracer
      
       [BOOM]
      
      The above loads the uinput module, which creates a table of functions that
      can be traced within the module.
      
      We add uinput_release to the filter_hash to trace just that function.
      
      Enable function tracincg, which increments the ref count of the record
      associated to uinput_release.
      
      Remove uinput, which frees the records including the one that represents
      uinput_release.
      
      Load the uinput module again (and make sure it's at the same address).
      This recreates the function records all with a ref count of zero,
      including uinput_release.
      
      Disable function tracing, which will decrement the ref count for uinput_release
      which is now zero because of the module removal and reload, and we have
      a mismatch (below zero ref count).
      
      The solution is to check all currently tracing ftrace callbacks to see if any
      are tracing any of the module's functions when a module is loaded (it already does
      that with callbacks that trace all functions). If a callback happens to have
      a module function being traced, it increments that records ref count and starts
      tracing that function.
      
      There may be a strange side effect with this, where tracing module functions
      on unload and then reloading a new module may have that new module's functions
      being traced. This may be something that confuses the user, but it's not
      a big deal. Another approach is to disable all callback hashes on module unload,
      but this leaves some ftrace callbacks that may not be registered, but can
      still have hashes tracing the module's function where ftrace doesn't know about
      it. That situation can cause the same bug. This solution solves that case too.
      Another benefit of this solution, is it is possible to trace a module's
      function on unload and load.
      
      Link: http://lkml.kernel.org/r/20130705142629.GA325@redhat.comReported-by: default avatarJörn Engel <joern@logfs.org>
      Reported-by: default avatarDave Jones <davej@redhat.com>
      Reported-by: default avatarSteve Hodgson <steve@purestorage.com>
      Tested-by: default avatarSteve Hodgson <steve@purestorage.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      74d86ed7
    • Peter Zijlstra's avatar
      perf: Fix perf ring buffer memory ordering · 1fbbea7b
      Peter Zijlstra authored
      commit bf378d34 upstream.
      
      The PPC64 people noticed a missing memory barrier and crufty old
      comments in the perf ring buffer code. So update all the comments and
      add the missing barrier.
      
      When the architecture implements local_t using atomic_long_t there
      will be double barriers issued; but short of introducing more
      conditional barrier primitives this is the best we can do.
      Reported-by: default avatarVictor Kaplansky <victork@il.ibm.com>
      Tested-by: default avatarVictor Kaplansky <victork@il.ibm.com>
      Signed-off-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: michael@ellerman.id.au
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Michael Neuling <mikey@neuling.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: anton@samba.org
      Cc: benh@kernel.crashing.org
      Link: http://lkml.kernel.org/r/20131025173749.GG19466@laptop.lanSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      [bwh: Backported to 3.2: adjust filename]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1fbbea7b
    • Justin Lecher's avatar
      fs: cachefiles: add support for large files in filesystem caching · 76504c24
      Justin Lecher authored
      commit 98c350cd upstream.
      
      Support the caching of large files.
      
      Addresses https://bugzilla.kernel.org/show_bug.cgi?id=31182Signed-off-by: default avatarJustin Lecher <jlec@gentoo.org>
      Signed-off-by: default avatarSuresh Jayaraman <sjayaraman@suse.com>
      Tested-by: default avatarSuresh Jayaraman <sjayaraman@suse.com>
      Acked-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [bwh: Backported to 3.2:
       - Adjust context
       - dentry_open() takes dentry and vfsmount pointers, not a path pointer]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      76504c24