1. 19 Aug, 2015 19 commits
    • Brian Campbell's avatar
      xhci: Calculate old endpoints correctly on device reset · 567ddd57
      Brian Campbell authored
      commit 326124a0 upstream.
      
      When resetting a device the number of active TTs may need to be
      corrected by xhci_update_tt_active_eps, but the number of old active
      endpoints supplied to it was always zero, so the number of TTs and the
      bandwidth reserved for them was not updated, and could rise
      unnecessarily.
      
      This affected systems using Intel's Patherpoint chipset, which rely on
      software bandwidth checking.  For example, a Lenovo X230 would lose the
      ability to use ports on the docking station after enough suspend/resume
      cycles because the bandwidth calculated would rise with every cycle when
      a suitable device is attached.
      
      The correct number of active endpoints is calculated in the same way as
      in xhci_reserve_bandwidth.
      Signed-off-by: default avatarBrian Campbell <bacam@z273.org.uk>
      Signed-off-by: default avatarMathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      567ddd57
    • Oliver Neukum's avatar
      usb-storage: ignore ZTE MF 823 card reader in mode 0x1225 · 6464a738
      Oliver Neukum authored
      commit 5fb2c782 upstream.
      
      This device automatically switches itself to another mode (0x1405)
      unless the specific access pattern of Windows is followed in its
      initial mode. That makes a dirty unmount of the internal storage
      devices inevitable if they are mounted. So the card reader of
      such a device should be ignored, lest an unclean removal become
      inevitable.
      
      This replaces an earlier patch that ignored all LUNs of this device.
      That patch was overly broad.
      Signed-off-by: default avatarOliver Neukum <oneukum@suse.com>
      Reviewed-by: default avatarLars Melin <larsm17@gmail.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      6464a738
    • Lior Amsalem's avatar
      ata: pmp: add quirk for Marvell 4140 SATA PMP · 6a060daa
      Lior Amsalem authored
      commit 945b4744 upstream.
      
      This commit adds the necessary quirk to make the Marvell 4140 SATA PMP
      work properly. This PMP doesn't like SRST on port number 4 (the host
      port) so this commit marks this port as not supporting SRST.
      Signed-off-by: default avatarLior Amsalem <alior@marvell.com>
      Reviewed-by: default avatarNadav Haklai <nadavh@marvell.com>
      Signed-off-by: default avatarThomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      6a060daa
    • Tejun Heo's avatar
      blkcg: fix gendisk reference leak in blkg_conf_prep() · 5b649869
      Tejun Heo authored
      commit 5f6c2d2b upstream.
      
      When a blkcg configuration is targeted to a partition rather than a
      whole device, blkg_conf_prep fails with -EINVAL; unfortunately, it
      forgets to put the gendisk ref in that case.  Fix it.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      5b649869
    • Bernhard Bender's avatar
      Input: usbtouchscreen - avoid unresponsive TSC-30 touch screen · 251414e3
      Bernhard Bender authored
      commit 96849170 upstream.
      
      This patch fixes a problem in the usbtouchscreen driver for DMC TSC-30
      touch screen.  Due to a missing delay between the RESET and SET_RATE
      commands, the touch screen may become unresponsive during system startup or
      driver loading.
      
      According to the DMC documentation, a delay is needed after the RESET
      command to allow the chip to complete its internal initialization. As this
      delay is not guaranteed, we had a system where the touch screen
      occasionally did not send any touch data. There was no other indication of
      the problem.
      
      The patch fixes the problem by adding a 150ms delay between the RESET and
      SET_RATE commands.
      Suggested-by: default avatarJakob Mustafa <jakob.mustafa@bytecmed.com>
      Signed-off-by: default avatarBernhard Bender <bernhard.bender@bytecmed.com>
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      251414e3
    • Chris Metcalf's avatar
      tile: use free_bootmem_late() for initrd · d1cb2e26
      Chris Metcalf authored
      commit 3f81d244 upstream.
      
      We were previously using free_bootmem() and just getting lucky
      that nothing too bad happened.
      Signed-off-by: default avatarChris Metcalf <cmetcalf@ezchip.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      d1cb2e26
    • NeilBrown's avatar
      md/raid1: fix test for 'was read error from last working device'. · 7ceb7391
      NeilBrown authored
      commit 34cab6f4 upstream.
      
      When we get a read error from the last working device, we don't
      try to repair it, and don't fail the device.  We simple report a
      read error to the caller.
      
      However the current test for 'is this the last working device' is
      wrong.
      When there is only one fully working device, it assumes that a
      non-faulty device is that device.  However a spare which is rebuilding
      would be non-faulty but so not the only working device.
      
      So change the test from "!Faulty" to "In_sync".  If ->degraded says
      there is only one fully working device and this device is in_sync,
      this must be the one.
      
      This bug has existed since we allowed read_balance to read from
      a recovering spare in v3.0
      Reported-and-tested-by: default avatarAlexander Lyakas <alex.bolshoy@gmail.com>
      Fixes: 76073054 ("md/raid1: clean up read_balance.")
      Signed-off-by: default avatarNeilBrown <neilb@suse.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      7ceb7391
    • Jingju Hou's avatar
      mmc: sdhci-pxav3: fix platform_data is not initialized · 3e9b9255
      Jingju Hou authored
      commit 9cd76049 upstream.
      
      pdev->dev.platform_data is not initialized if match is true in function
      sdhci_pxav3_probe. Just local variable pdata is assigned the return value
      from function pxav3_get_mmc_pdata().
      
      static int sdhci_pxav3_probe(struct platform_device *pdev) {
      
          struct sdhci_pxa_platdata *pdata = pdev->dev.platform_data;
          ...
          if (match) {
      		ret = mmc_of_parse(host->mmc);
      		if (ret)
      			goto err_of_parse;
      		sdhci_get_of_property(pdev);
      		pdata = pxav3_get_mmc_pdata(dev);
           }
           ...
      }
      Signed-off-by: default avatarJingju Hou <houjingj@marvell.com>
      Fixes: b650352d("mmc: sdhci-pxa: Add device tree support")
      Signed-off-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      3e9b9255
    • Joakim Tjernlund's avatar
      mmc: sdhci-esdhc: Make 8BIT bus work · 2c8e7b05
      Joakim Tjernlund authored
      commit 8e91125f upstream.
      
      Support for 8BIT bus with was added some time ago to sdhci-esdhc but
      then missed to remove the 8BIT from the reserved bit mask which made
      8BIT non functional.
      
      Fixes: 66b50a00 ("mmc: esdhc: Add support for 8-bit bus width and..")
      Signed-off-by: default avatarJoakim Tjernlund <joakim.tjernlund@transmode.se>
      Signed-off-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      2c8e7b05
    • Tom Hughes's avatar
      mac80211: clear subdir_stations when removing debugfs · 0045487d
      Tom Hughes authored
      commit 4479004e upstream.
      
      If we don't do this, and we then fail to recreate the debugfs
      directory during a mode change, then we will fail later trying
      to add stations to this now bogus directory:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000006c
      IP: [<c0a92202>] mutex_lock+0x12/0x30
      Call Trace:
      [<c0678ab4>] start_creating+0x44/0xc0
      [<c0679203>] debugfs_create_dir+0x13/0xf0
      [<f8a938ae>] ieee80211_sta_debugfs_add+0x6e/0x490 [mac80211]
      Signed-off-by: default avatarTom Hughes <tom@compton.nu>
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      0045487d
    • Seymour, Shane M's avatar
      st: null pointer dereference panic caused by use after kref_put by st_open · f603c11c
      Seymour, Shane M authored
      commit e7ac6c66 upstream.
      
      Two SLES11 SP3 servers encountered similar crashes simultaneously
      following some kind of SAN/tape target issue:
      
      ...
      qla2xxx [0000:81:00.0]-801c:3: Abort command issued nexus=3:0:2 --  1 2002.
      qla2xxx [0000:81:00.0]-801c:3: Abort command issued nexus=3:0:2 --  1 2002.
      qla2xxx [0000:81:00.0]-8009:3: DEVICE RESET ISSUED nexus=3:0:2 cmd=ffff882f89c2c7c0.
      qla2xxx [0000:81:00.0]-800c:3: do_reset failed for cmd=ffff882f89c2c7c0.
      qla2xxx [0000:81:00.0]-800f:3: DEVICE RESET FAILED: Task management failed nexus=3:0:2 cmd=ffff882f89c2c7c0.
      qla2xxx [0000:81:00.0]-8009:3: TARGET RESET ISSUED nexus=3:0:2 cmd=ffff882f89c2c7c0.
      qla2xxx [0000:81:00.0]-800c:3: do_reset failed for cmd=ffff882f89c2c7c0.
      qla2xxx [0000:81:00.0]-800f:3: TARGET RESET FAILED: Task management failed nexus=3:0:2 cmd=ffff882f89c2c7c0.
      qla2xxx [0000:81:00.0]-8012:3: BUS RESET ISSUED nexus=3:0:2.
      qla2xxx [0000:81:00.0]-802b:3: BUS RESET SUCCEEDED nexus=3:0:2.
      qla2xxx [0000:81:00.0]-505f:3: Link is operational (8 Gbps).
      qla2xxx [0000:81:00.0]-8018:3: ADAPTER RESET ISSUED nexus=3:0:2.
      qla2xxx [0000:81:00.0]-00af:3: Performing ISP error recovery - ha=ffff88bf04d18000.
       rport-3:0-0: blocked FC remote port time out: removing target and saving binding
      qla2xxx [0000:81:00.0]-505f:3: Link is operational (8 Gbps).
      qla2xxx [0000:81:00.0]-8017:3: ADAPTER RESET SUCCEEDED nexus=3:0:2.
       rport-2:0-0: blocked FC remote port time out: removing target and saving binding
      sg_rq_end_io: device detached
      BUG: unable to handle kernel NULL pointer dereference at 00000000000002a8
      IP: [<ffffffff8133b268>] __pm_runtime_idle+0x28/0x90
      PGD 7e6586f067 PUD 7e5af06067 PMD 0 [1739975.390354] Oops: 0002 [#1] SMP
      CPU 0
      ...
      Supported: No, Proprietary modules are loaded [1739975.390463]
      Pid: 27965, comm: ABCD Tainted: PF           X 3.0.101-0.29-default #1 HP ProLiant DL580 Gen8
      RIP: 0010:[<ffffffff8133b268>]  [<ffffffff8133b268>] __pm_runtime_idle+0x28/0x90
      RSP: 0018:ffff8839dc1e7c68  EFLAGS: 00010202
      RAX: 0000000000000000 RBX: ffff883f0592fc00 RCX: 0000000000000090
      RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000138
      RBP: 0000000000000138 R08: 0000000000000010 R09: ffffffff81bd39d0
      R10: 00000000000009c0 R11: ffffffff81025790 R12: 0000000000000001
      R13: ffff883022212b80 R14: 0000000000000004 R15: ffff883022212b80
      FS:  00007f8e54560720(0000) GS:ffff88407f800000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 00000000000002a8 CR3: 0000007e6ced6000 CR4: 00000000001407f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process ABCD (pid: 27965, threadinfo ffff8839dc1e6000, task ffff883592e0c640)
      Stack:
       ffff883f0592fc00 00000000fffffffa 0000000000000001 ffff883022212b80
       ffff883eff772400 ffffffffa03fa309 0000000000000000 0000000000000000
       ffffffffa04003a0 ffff883f063196c0 ffff887f0379a930 ffffffff8115ea1e
      Call Trace:
       [<ffffffffa03fa309>] st_open+0x129/0x240 [st]
       [<ffffffff8115ea1e>] chrdev_open+0x13e/0x200
       [<ffffffff811588a8>] __dentry_open+0x198/0x310
       [<ffffffff81167d74>] do_last+0x1f4/0x800
       [<ffffffff81168fe9>] path_openat+0xd9/0x420
       [<ffffffff8116946c>] do_filp_open+0x4c/0xc0
       [<ffffffff8115a00f>] do_sys_open+0x17f/0x250
       [<ffffffff81468d92>] system_call_fastpath+0x16/0x1b
       [<00007f8e4f617fd0>] 0x7f8e4f617fcf
      Code: eb d3 90 48 83 ec 28 40 f6 c6 04 48 89 6c 24 08 4c 89 74 24 20 48 89 fd 48 89 1c 24 4c 89 64 24 10 41 89 f6 4c 89 6c 24 18 74 11 <f0> ff 8f 70 01 00 00 0f 94 c0 45 31 ed 84 c0 74 2b 4c 8d a5 a0
      RIP  [<ffffffff8133b268>] __pm_runtime_idle+0x28/0x90
       RSP <ffff8839dc1e7c68>
      CR2: 00000000000002a8
      
      Analysis reveals the cause of the crash to be due to STp->device
      being NULL. The pointer was NULLed via scsi_tape_put(STp) when it
      calls scsi_tape_release(). In st_open() we jump to err_out after
      scsi_block_when_processing_errors() completes and returns the
      device as offline (sdev_state was SDEV_DEL):
      
      1180 /* Open the device. Needs to take the BKL only because of incrementing the SCSI host
      1181    module count. */
      1182 static int st_open(struct inode *inode, struct file *filp)
      1183 {
      1184         int i, retval = (-EIO);
      1185         int resumed = 0;
      1186         struct scsi_tape *STp;
      1187         struct st_partstat *STps;
      1188         int dev = TAPE_NR(inode);
      1189         char *name;
      ...
      1217         if (scsi_autopm_get_device(STp->device) < 0) {
      1218                 retval = -EIO;
      1219                 goto err_out;
      1220         }
      1221         resumed = 1;
      1222         if (!scsi_block_when_processing_errors(STp->device)) {
      1223                 retval = (-ENXIO);
      1224                 goto err_out;
      1225         }
      ...
      1264  err_out:
      1265         normalize_buffer(STp->buffer);
      1266         spin_lock(&st_use_lock);
      1267         STp->in_use = 0;
      1268         spin_unlock(&st_use_lock);
      1269         scsi_tape_put(STp); <-- STp->device = 0 after this
      1270         if (resumed)
      1271                 scsi_autopm_put_device(STp->device);
      1272         return retval;
      
      The ref count for the struct scsi_tape had already been reduced
      to 1 when the .remove method of the st module had been called.
      The kref_put() in scsi_tape_put() caused scsi_tape_release()
      to be called:
      
      0266 static void scsi_tape_put(struct scsi_tape *STp)
      0267 {
      0268         struct scsi_device *sdev = STp->device;
      0269
      0270         mutex_lock(&st_ref_mutex);
      0271         kref_put(&STp->kref, scsi_tape_release); <-- calls this
      0272         scsi_device_put(sdev);
      0273         mutex_unlock(&st_ref_mutex);
      0274 }
      
      In scsi_tape_release() the struct scsi_device in the struct
      scsi_tape gets set to NULL:
      
      4273 static void scsi_tape_release(struct kref *kref)
      4274 {
      4275         struct scsi_tape *tpnt = to_scsi_tape(kref);
      4276         struct gendisk *disk = tpnt->disk;
      4277
      4278         tpnt->device = NULL; <<<---- where the dev is nulled
      4279
      4280         if (tpnt->buffer) {
      4281                 normalize_buffer(tpnt->buffer);
      4282                 kfree(tpnt->buffer->reserved_pages);
      4283                 kfree(tpnt->buffer);
      4284         }
      4285
      4286         disk->private_data = NULL;
      4287         put_disk(disk);
      4288         kfree(tpnt);
      4289         return;
      4290 }
      
      Although the problem was reported on SLES11.3 the problem appears
      in linux-next as well.
      
      The crash is fixed by reordering the code so we no longer access
      the struct scsi_tape after the kref_put() is done on it in st_open().
      Signed-off-by: default avatarShane Seymour <shane.seymour@hp.com>
      Signed-off-by: default avatarDarren Lavender <darren.lavender@hp.com>
      Reviewed-by: default avatarJohannes Thumshirn <jthumshirn@suse.com>
      Acked-by: default avatarKai Mäkisara <kai.makisara@kolumbus.fi>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Odin.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      f603c11c
    • Takashi Iwai's avatar
      ALSA: hda - Fix MacBook Pro 5,2 quirk · 8e711dc3
      Takashi Iwai authored
      commit 649ccd08 upstream.
      
      MacBook Pro 5,2 with ALC889 codec had already a fixup entry, but this
      seems not working correctly, a fix for pin NID 0x15 is needed in
      addition.  It's equivalent with the fixup for MacBook Air 1,1, so use
      this instead.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=102131Reported-and-tested-by: default avatarJeffery Miller <jefferym@gmail.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      8e711dc3
    • Yao-Wen Mao's avatar
      ALSA: usb-audio: add dB range mapping for some devices · 85262576
      Yao-Wen Mao authored
      commit 2d1cb7f6 upstream.
      
      Add the correct dB ranges of Bose Companion 5 and Drangonfly DAC 1.2.
      Signed-off-by: default avatarYao-Wen Mao <yaowen@google.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      85262576
    • Dominic Sacré's avatar
      ALSA: usb-audio: Add MIDI support for Steinberg MI2/MI4 · 6f5602ec
      Dominic Sacré authored
      commit 0689a86a upstream.
      
      The Steinberg MI2 and MI4 interfaces are compatible with the USB class
      audio spec, but the MIDI part of the devices is reported as a vendor
      specific interface.
      
      This patch adds entries to quirks-table.h to recognize the MIDI
      endpoints. Audio functionality was already working and is unaffected by
      this change.
      Signed-off-by: default avatarDominic Sacré <dominic.sacre@gmx.de>
      Signed-off-by: default avatarAlbert Huitsing <albert@huitsing.nl>
      Acked-by: default avatarClemens Ladisch <clemens@ladisch.de>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      6f5602ec
    • Thomas Gleixner's avatar
      genirq: Prevent resend to interrupts marked IRQ_NESTED_THREAD · 351ee276
      Thomas Gleixner authored
      commit 75a06189 upstream.
      
      The resend mechanism happily calls the interrupt handler of interrupts
      which are marked IRQ_NESTED_THREAD from softirq context. This can
      result in crashes because the interrupt handler is not the proper way
      to invoke the device handlers. They must be invoked via
      handle_nested_irq.
      
      Prevent the resend even if the interrupt has no valid parent irq
      set. Its better to have a lost interrupt than a crashing machine.
      Reported-by: default avatarUwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      351ee276
    • Alexey Brodkin's avatar
      ARC: make sure instruction_pointer() returns unsigned value · b9d5df39
      Alexey Brodkin authored
      commit f51e2f19 upstream.
      
      Currently instruction_pointer() returns pt_regs->ret and so return value
      is of type "long", which implicitly stands for "signed long".
      
      While that's perfectly fine when dealing with 32-bit values if return
      value of instruction_pointer() gets assigned to 64-bit variable sign
      extension may happen.
      
      And at least in one real use-case it happens already.
      In perf_prepare_sample() return value of perf_instruction_pointer()
      (which is an alias to instruction_pointer() in case of ARC) is assigned
      to (struct perf_sample_data)->ip (which type is "u64").
      
      And what we see if instuction pointer points to user-space application
      that in case of ARC lays below 0x8000_0000 "ip" gets set properly with
      leading 32 zeros. But if instruction pointer points to kernel address
      space that starts from 0x8000_0000 then "ip" is set with 32 leadig
      "f"-s. I.e. id instruction_pointer() returns 0x8100_0000, "ip" will be
      assigned with 0xffff_ffff__8100_0000. Which is obviously wrong.
      
      In particular that issuse broke output of perf, because perf was unable
      to associate addresses like 0xffff_ffff__8100_0000 with anything from
      /proc/kallsyms.
      
      That's what we used to see:
       ----------->8----------
        6.27%  ls       [unknown]                [k] 0xffffffff8046c5cc
        2.96%  ls       libuClibc-0.9.34-git.so  [.] memcpy
        2.25%  ls       libuClibc-0.9.34-git.so  [.] memset
        1.66%  ls       [unknown]                [k] 0xffffffff80666536
        1.54%  ls       libuClibc-0.9.34-git.so  [.] 0x000224d6
        1.18%  ls       libuClibc-0.9.34-git.so  [.] 0x00022472
       ----------->8----------
      
      With that change perf output looks much better now:
       ----------->8----------
        8.21%  ls       [kernel.kallsyms]        [k] memset
        3.52%  ls       libuClibc-0.9.34-git.so  [.] memcpy
        2.11%  ls       libuClibc-0.9.34-git.so  [.] malloc
        1.88%  ls       libuClibc-0.9.34-git.so  [.] memset
        1.64%  ls       [kernel.kallsyms]        [k] _raw_spin_unlock_irqrestore
        1.41%  ls       [kernel.kallsyms]        [k] __d_lookup_rcu
       ----------->8----------
      Signed-off-by: default avatarAlexey Brodkin <abrodkin@synopsys.com>
      Cc: arc-linux-dev@synopsys.com
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarVineet Gupta <vgupta@synopsys.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      b9d5df39
    • Martin Schwidefsky's avatar
      s390/sclp: clear upper register halves in _sclp_print_early · 496398df
      Martin Schwidefsky authored
      commit f9c87a6f upstream.
      
      If the kernel is compiled with gcc 5.1 and the XZ compression option
      the decompress_kernel function calls _sclp_print_early in 64-bit mode
      while the content of the upper register half of %r6 is non-zero.
      This causes a specification exception on the servc instruction in
      _sclp_servc.
      
      The _sclp_print_early function saves and restores the upper registers
      halves but it fails to clear them for the 31-bit code of the mini sclp
      driver.
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      496398df
    • Al Viro's avatar
      freeing unlinked file indefinitely delayed · f9c48e68
      Al Viro authored
      commit 75a6f82a upstream.
      
      	Normally opening a file, unlinking it and then closing will have
      the inode freed upon close() (provided that it's not otherwise busy and
      has no remaining links, of course).  However, there's one case where that
      does *not* happen.  Namely, if you open it by fhandle with cold dcache,
      then unlink() and close().
      
      	In normal case you get d_delete() in unlink(2) notice that dentry
      is busy and unhash it; on the final dput() it will be forcibly evicted from
      dcache, triggering iput() and inode removal.  In this case, though, we end
      up with *two* dentries - disconnected (created by open-by-fhandle) and
      regular one (used by unlink()).  The latter will have its reference to inode
      dropped just fine, but the former will not - it's considered hashed (it
      is on the ->s_anon list), so it will stay around until the memory pressure
      will finally do it in.  As the result, we have the final iput() delayed
      indefinitely.  It's trivial to reproduce -
      
      void flush_dcache(void)
      {
              system("mount -o remount,rw /");
      }
      
      static char buf[20 * 1024 * 1024];
      
      main()
      {
              int fd;
              union {
                      struct file_handle f;
                      char buf[MAX_HANDLE_SZ];
              } x;
              int m;
      
              x.f.handle_bytes = sizeof(x);
              chdir("/root");
              mkdir("foo", 0700);
              fd = open("foo/bar", O_CREAT | O_RDWR, 0600);
              close(fd);
              name_to_handle_at(AT_FDCWD, "foo/bar", &x.f, &m, 0);
              flush_dcache();
              fd = open_by_handle_at(AT_FDCWD, &x.f, O_RDWR);
              unlink("foo/bar");
              write(fd, buf, sizeof(buf));
              system("df .");			/* 20Mb eaten */
              close(fd);
              system("df .");			/* should've freed those 20Mb */
              flush_dcache();
              system("df .");			/* should be the same as #2 */
      }
      
      will spit out something like
      Filesystem     1K-blocks   Used Available Use% Mounted on
      /dev/root         322023 303843      1131 100% /
      Filesystem     1K-blocks   Used Available Use% Mounted on
      /dev/root         322023 303843      1131 100% /
      Filesystem     1K-blocks   Used Available Use% Mounted on
      /dev/root         322023 283282     21692  93% /
      - inode gets freed only when dentry is finally evicted (here we trigger
      than by remount; normally it would've happened in response to memory
      pressure hell knows when).
      Acked-by: default avatarJ. Bruce Fields <bfields@fieldses.org>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      f9c48e68
    • Kirill A. Shutemov's avatar
      mm: avoid setting up anonymous pages into file mapping · bf653833
      Kirill A. Shutemov authored
      commit 6b7339f4 upstream.
      
      Reading page fault handler code I've noticed that under right
      circumstances kernel would map anonymous pages into file mappings: if
      the VMA doesn't have vm_ops->fault() and the VMA wasn't fully populated
      on ->mmap(), kernel would handle page fault to not populated pte with
      do_anonymous_page().
      
      Let's change page fault handler to use do_anonymous_page() only on
      anonymous VMA (->vm_ops == NULL) and make sure that the VMA is not
      shared.
      
      For file mappings without vm_ops->fault() or shred VMA without vm_ops,
      page fault on pte_none() entry would lead to SIGBUS.
      Signed-off-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: default avatarOleg Nesterov <oleg@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Willy Tarreau <w@1wt.eu>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      bf653833
  2. 06 Aug, 2015 2 commits
  3. 05 Aug, 2015 1 commit
  4. 04 Aug, 2015 18 commits