1. 05 Oct, 2014 17 commits
  2. 17 Sep, 2014 23 commits
    • Greg Kroah-Hartman's avatar
      Linux 3.16.3 · c13c2820
      Greg Kroah-Hartman authored
      c13c2820
    • David Howells's avatar
      KEYS: Fix termination condition in assoc array garbage collection · a4b9e45f
      David Howells authored
      commit 95389b08 upstream.
      
      This fixes CVE-2014-3631.
      
      It is possible for an associative array to end up with a shortcut node at the
      root of the tree if there are more than fan-out leaves in the tree, but they
      all crowd into the same slot in the lowest level (ie. they all have the same
      first nibble of their index keys).
      
      When assoc_array_gc() returns back up the tree after scanning some leaves, it
      can fall off of the root and crash because it assumes that the back pointer
      from a shortcut (after label ascend_old_tree) must point to a normal node -
      which isn't true of a shortcut node at the root.
      
      Should we find we're ascending rootwards over a shortcut, we should check to
      see if the backpointer is zero - and if it is, we have completed the scan.
      
      This particular bug cannot occur if the root node is not a shortcut - ie. if
      you have fewer than 17 keys in a keyring or if you have at least two keys that
      sit into separate slots (eg. a keyring and a non keyring).
      
      This can be reproduced by:
      
      	ring=`keyctl newring bar @s`
      	for ((i=1; i<=18; i++)); do last_key=`keyctl newring foo$i $ring`; done
      	keyctl timeout $last_key 2
      
      Doing this:
      
      	echo 3 >/proc/sys/kernel/keys/gc_delay
      
      first will speed things up.
      
      If we do fall off of the top of the tree, we get the following oops:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
      IP: [<ffffffff8136cea7>] assoc_array_gc+0x2f7/0x540
      PGD dae15067 PUD cfc24067 PMD 0
      Oops: 0000 [#1] SMP
      Modules linked in: xt_nat xt_mark nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_ni
      CPU: 0 PID: 26011 Comm: kworker/0:1 Not tainted 3.14.9-200.fc20.x86_64 #1
      Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      Workqueue: events key_garbage_collector
      task: ffff8800918bd580 ti: ffff8800aac14000 task.ti: ffff8800aac14000
      RIP: 0010:[<ffffffff8136cea7>] [<ffffffff8136cea7>] assoc_array_gc+0x2f7/0x540
      RSP: 0018:ffff8800aac15d40  EFLAGS: 00010206
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8800aaecacc0
      RDX: ffff8800daecf440 RSI: 0000000000000001 RDI: ffff8800aadc2bc0
      RBP: ffff8800aac15da8 R08: 0000000000000001 R09: 0000000000000003
      R10: ffffffff8136ccc7 R11: 0000000000000000 R12: 0000000000000000
      R13: 0000000000000000 R14: 0000000000000070 R15: 0000000000000001
      FS:  0000000000000000(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 0000000000000018 CR3: 00000000db10d000 CR4: 00000000000006f0
      Stack:
       ffff8800aac15d50 0000000000000011 ffff8800aac15db8 ffffffff812e2a70
       ffff880091a00600 0000000000000000 ffff8800aadc2bc3 00000000cd42c987
       ffff88003702df20 ffff88003702dfa0 0000000053b65c09 ffff8800aac15fd8
      Call Trace:
       [<ffffffff812e2a70>] ? keyring_detect_cycle_iterator+0x30/0x30
       [<ffffffff812e3e75>] keyring_gc+0x75/0x80
       [<ffffffff812e1424>] key_garbage_collector+0x154/0x3c0
       [<ffffffff810a67b6>] process_one_work+0x176/0x430
       [<ffffffff810a744b>] worker_thread+0x11b/0x3a0
       [<ffffffff810a7330>] ? rescuer_thread+0x3b0/0x3b0
       [<ffffffff810ae1a8>] kthread+0xd8/0xf0
       [<ffffffff810ae0d0>] ? insert_kthread_work+0x40/0x40
       [<ffffffff816ffb7c>] ret_from_fork+0x7c/0xb0
       [<ffffffff810ae0d0>] ? insert_kthread_work+0x40/0x40
      Code: 08 4c 8b 22 0f 84 bf 00 00 00 41 83 c7 01 49 83 e4 fc 41 83 ff 0f 4c 89 65 c0 0f 8f 5a fe ff ff 48 8b 45 c0 4d 63 cf 49 83 c1 02 <4e> 8b 34 c8 4d 85 f6 0f 84 be 00 00 00 41 f6 c6 01 0f 84 92
      RIP  [<ffffffff8136cea7>] assoc_array_gc+0x2f7/0x540
       RSP <ffff8800aac15d40>
      CR2: 0000000000000018
      ---[ end trace 1129028a088c0cbd ]---
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Acked-by: default avatarDon Zickus <dzickus@redhat.com>
      Signed-off-by: default avatarJames Morris <james.l.morris@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a4b9e45f
    • David Howells's avatar
      KEYS: Fix use-after-free in assoc_array_gc() · b3c24771
      David Howells authored
      commit 27419604 upstream.
      
      An edit script should be considered inaccessible by a function once it has
      called assoc_array_apply_edit() or assoc_array_cancel_edit().
      
      However, assoc_array_gc() is accessing the edit script just after the
      gc_complete: label.
      Reported-by: default avatarAndreea-Cristina Bernat <bernat.ada@gmail.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Reviewed-by: default avatarAndreea-Cristina Bernat <bernat.ada@gmail.com>
      cc: shemming@brocade.com
      cc: paulmck@linux.vnet.ibm.com
      Signed-off-by: default avatarJames Morris <james.l.morris@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b3c24771
    • Pavel Shilovsky's avatar
      CIFS: Fix SMB2 readdir error handling · c857808d
      Pavel Shilovsky authored
      commit 52755808 upstream.
      
      SMB2 servers indicates the end of a directory search with
      STATUS_NO_MORE_FILE error code that is not processed now.
      This causes generic/257 xfstest to fail. Fix this by triggering
      the end of search by this error code in SMB2_query_directory.
      
      Also when negotiating CIFS protocol we tell the server to close
      the search automatically at the end and there is no need to do
      it itself. In the case of SMB2 protocol, we need to close it
      explicitly - separate close directory checks for different
      protocols.
      Signed-off-by: default avatarPavel Shilovsky <pshilovsky@samba.org>
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c857808d
    • Linus Torvalds's avatar
      vfs: fix bad hashing of dentries · b5cf3193
      Linus Torvalds authored
      commit 99d263d4 upstream.
      
      Josef Bacik found a performance regression between 3.2 and 3.10 and
      narrowed it down to commit bfcfaa77 ("vfs: use 'unsigned long'
      accesses for dcache name comparison and hashing"). He reports:
      
       "The test case is essentially
      
            for (i = 0; i < 1000000; i++)
                    mkdir("a$i");
      
        On xfs on a fio card this goes at about 20k dir/sec with 3.2, and 12k
        dir/sec with 3.10.  This is because we spend waaaaay more time in
        __d_lookup on 3.10 than in 3.2.
      
        The new hashing function for strings is suboptimal for <
        sizeof(unsigned long) string names (and hell even > sizeof(unsigned
        long) string names that I've tested).  I broke out the old hashing
        function and the new one into a userspace helper to get real numbers
        and this is what I'm getting:
      
            Old hash table had 1000000 entries, 0 dupes, 0 max dupes
            New hash table had 12628 entries, 987372 dupes, 900 max dupes
            We had 11400 buckets with a p50 of 30 dupes, p90 of 240 dupes, p99 of 567 dupes for the new hash
      
        My test does the hash, and then does the d_hash into a integer pointer
        array the same size as the dentry hash table on my system, and then
        just increments the value at the address we got to see how many
        entries we overlap with.
      
        As you can see the old hash function ended up with all 1 million
        entries in their own bucket, whereas the new one they are only
        distributed among ~12.5k buckets, which is why we're using so much
        more CPU in __d_lookup".
      
      The reason for this hash regression is two-fold:
      
       - On 64-bit architectures the down-mixing of the original 64-bit
         word-at-a-time hash into the final 32-bit hash value is very
         simplistic and suboptimal, and just adds the two 32-bit parts
         together.
      
         In particular, because there is no bit shuffling and the mixing
         boundary is also a byte boundary, similar character patterns in the
         low and high word easily end up just canceling each other out.
      
       - the old byte-at-a-time hash mixed each byte into the final hash as it
         hashed the path component name, resulting in the low bits of the hash
         generally being a good source of hash data.  That is not true for the
         word-at-a-time case, and the hash data is distributed among all the
         bits.
      
      The fix is the same in both cases: do a better job of mixing the bits up
      and using as much of the hash data as possible.  We already have the
      "hash_32|64()" functions to do that.
      Reported-by: default avatarJosef Bacik <jbacik@fb.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Chris Mason <clm@fb.com>
      Cc: linux-fsdevel@vger.kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b5cf3193
    • Mario Kleiner's avatar
      drm/nouveau: Bump version from 1.1.1 to 1.1.2 · b80e6286
      Mario Kleiner authored
      commit 7820e5ee upstream.
      
      Linux 3.16 fixed multiple bugs in kms pageflip completion events
      and timestamping, which were originally introduced in Linux 3.13.
      
      These fixes have been backported to all stable kernels since 3.13.
      
      However, the userspace nouveau-ddx needs to be aware if it is
      running on a kernel on which these bugs are fixed, or not.
      
      Bump the patchlevel of the drm driver version to signal this,
      so backporting this patch to stable 3.13+ kernels will give the
      ddx the required info.
      Signed-off-by: default avatarMario Kleiner <mario.kleiner.de@gmail.com>
      Signed-off-by: default avatarBen Skeggs <bskeggs@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b80e6286
    • Mario Kleiner's avatar
      drm/nouveau: Dis/Enable vblank irqs during suspend/resume. · a82fd712
      Mario Kleiner authored
      commit 9cba5efa upstream.
      
      Vblank irqs don't get disabled during suspend or driver
      unload, which causes irq delivery after "suspend" or
      driver unload, at least until the gpu is powered off.
      This could race with drm_vblank_cleanup() in the case
      of nouveau and cause a use-after-free bug if the driver
      is unloaded.
      
      More annoyingly during everyday use, at least on nv50
      display engine (likely also others), vblank irqs are
      off after a resume from suspend, but the drm doesn't
      know this, so all vblank related functionality is dead
      after a resume. E.g., all windowed OpenGL clients will
      hang at swapbuffers time, as well as many fullscreen
      clients in many cases. This makes suspend/resume useless
      if one wants to use any OpenGL apps after the resume.
      
      In Linux 3.16, drm_vblank_on() was added, complementing
      the older drm_vblank_off()  to solve these problems
      elegantly, so use those calls in nouveaus suspend/resume
      code.
      
      For kernels 3.8 - 3.15, we need to cherry-pick the
      drm_vblank_on() patch to support this patch.
      Signed-off-by: default avatarMario Kleiner <mario.kleiner.de@gmail.com>
      Signed-off-by: default avatarBen Skeggs <bskeggs@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a82fd712
    • Bart Van Assche's avatar
      IB/srp: Fix deadlock between host removal and multipathd · 1b29db8c
      Bart Van Assche authored
      commit bcc05910 upstream.
      
      If scsi_remove_host() is invoked after a SCSI device has been blocked,
      if the fast_io_fail_tmo or dev_loss_tmo work gets scheduled on the
      workqueue executing srp_remove_work() and if an I/O request is
      scheduled after the SCSI device had been blocked by e.g. multipathd
      then the following deadlock can occur:
      
          kworker/6:1     D ffff880831f3c460     0   195      2 0x00000000
          Call Trace:
           [<ffffffff814aafd9>] schedule+0x29/0x70
           [<ffffffff814aa0ef>] schedule_timeout+0x10f/0x2a0
           [<ffffffff8105af6f>] msleep+0x2f/0x40
           [<ffffffff8123b0ae>] __blk_drain_queue+0x4e/0x180
           [<ffffffff8123d2d5>] blk_cleanup_queue+0x225/0x230
           [<ffffffffa0010732>] __scsi_remove_device+0x62/0xe0 [scsi_mod]
           [<ffffffffa000ed2f>] scsi_forget_host+0x6f/0x80 [scsi_mod]
           [<ffffffffa0002eba>] scsi_remove_host+0x7a/0x130 [scsi_mod]
           [<ffffffffa07cf5c5>] srp_remove_work+0x95/0x180 [ib_srp]
           [<ffffffff8106d7aa>] process_one_work+0x1ea/0x6c0
           [<ffffffff8106dd9b>] worker_thread+0x11b/0x3a0
           [<ffffffff810758bd>] kthread+0xed/0x110
           [<ffffffff814b972c>] ret_from_fork+0x7c/0xb0
          multipathd      D ffff880096acc460     0  5340      1 0x00000000
          Call Trace:
           [<ffffffff814aafd9>] schedule+0x29/0x70
           [<ffffffff814aa0ef>] schedule_timeout+0x10f/0x2a0
           [<ffffffff814ab79b>] io_schedule_timeout+0x9b/0xf0
           [<ffffffff814abe1c>] wait_for_completion_io_timeout+0xdc/0x110
           [<ffffffff81244b9b>] blk_execute_rq+0x9b/0x100
           [<ffffffff8124f665>] sg_io+0x1a5/0x450
           [<ffffffff8124fd21>] scsi_cmd_ioctl+0x2a1/0x430
           [<ffffffff8124fef2>] scsi_cmd_blk_ioctl+0x42/0x50
           [<ffffffffa00ec97e>] sd_ioctl+0xbe/0x140 [sd_mod]
           [<ffffffff8124bd04>] blkdev_ioctl+0x234/0x840
           [<ffffffff811cb491>] block_ioctl+0x41/0x50
           [<ffffffff811a0df0>] do_vfs_ioctl+0x300/0x520
           [<ffffffff811a1051>] SyS_ioctl+0x41/0x80
           [<ffffffff814b9962>] tracesys+0xd0/0xd5
      
      Fix this by scheduling removal work on another workqueue than the
      transport layer timers.
      Signed-off-by: default avatarBart Van Assche <bvanassche@acm.org>
      Reviewed-by: default avatarSagi Grimberg <sagig@mellanox.com>
      Reviewed-by: default avatarDavid Dillow <dave@thedillows.org>
      Cc: Sebastian Parschauer <sebastian.riemer@profitbricks.com>
      Signed-off-by: default avatarRoland Dreier <roland@purestorage.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1b29db8c
    • Jeff Moyer's avatar
      dm table: propagate QUEUE_FLAG_NO_SG_MERGE · 05a09533
      Jeff Moyer authored
      commit 200612ec upstream.
      
      Commit 05f1dd53 ("block: add queue flag for disabling SG merging")
      introduced a new queue flag: QUEUE_FLAG_NO_SG_MERGE.  This gets set by
      default in blk_mq_init_queue for mq-enabled devices.  The effect of
      the flag is to bypass the SG segment merging.  Instead, the
      bio->bi_vcnt is used as the number of hardware segments.
      
      With a device mapper target on top of a device with
      QUEUE_FLAG_NO_SG_MERGE set, we can end up sending down more segments
      than a driver is prepared to handle.  I ran into this when backporting
      the virtio_blk mq support.  It triggerred this BUG_ON, in
      virtio_queue_rq:
      
              BUG_ON(req->nr_phys_segments + 2 > vblk->sg_elems);
      
      The queue's max is set here:
              blk_queue_max_segments(q, vblk->sg_elems-2);
      
      Basically, what happens is that a bio is built up for the dm device
      (which does not have the QUEUE_FLAG_NO_SG_MERGE flag set) using
      bio_add_page.  That path will call into __blk_recalc_rq_segments, so
      what you end up with is bi_phys_segments being much smaller than bi_vcnt
      (and bi_vcnt grows beyond the maximum sg elements).  Then, when the bio
      is submitted, it gets cloned.  When the cloned bio is submitted, it will
      end up in blk_recount_segments, here:
      
              if (test_bit(QUEUE_FLAG_NO_SG_MERGE, &q->queue_flags))
                      bio->bi_phys_segments = bio->bi_vcnt;
      
      and now we've set bio->bi_phys_segments to a number that is beyond what
      was registered as queue_max_segments by the driver.
      
      The right way to fix this is to propagate the queue flag up the stack.
      
      The rules for propagating the flag are simple:
      - if the flag is set for any underlying device, it must be set for the
        upper device
      - consequently, if the flag is not set for any underlying device, it
        should not be set for the upper device.
      Signed-off-by: default avatarJeff Moyer <jmoyer@redhat.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      05a09533
    • Roger Quadros's avatar
      mtd: nand: omap: Fix 1-bit Hamming code scheme, omap_calculate_ecc() · 20c0fb35
      Roger Quadros authored
      commit 40ddbf50 upstream.
      
      commit 65b97cf6 introduced in v3.7 caused a regression
      by using a reversed CS_MASK thus causing omap_calculate_ecc to
      always fail. As the NAND base driver never checks for .calculate()'s
      return value, the zeroed ECC values are used as is without showing
      any error to the user. However, this won't work and the NAND device
      won't be guarded by any error code.
      
      Fix the issue by using the correct mask.
      
      Code was tested on omap3beagle using the following procedure
      - flash the primary bootloader (MLO) from the kernel to the first
      NAND partition using nandwrite.
      - boot the board from NAND. This utilizes OMAP ROM loader that
      relies on 1-bit Hamming code ECC.
      
      Fixes: 65b97cf6 (mtd: nand: omap2: handle nand on gpmc)
      Signed-off-by: default avatarRoger Quadros <rogerq@ti.com>
      Signed-off-by: default avatarTony Lindgren <tony@atomide.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      20c0fb35
    • Kevin Hao's avatar
      mtd/ftl: fix the double free of the buffers allocated in build_maps() · 711f8bcf
      Kevin Hao authored
      commit a152056c upstream.
      
      I got the following panic on my fsl p5020ds board.
      
        Unable to handle kernel paging request for data at address 0x7375627379737465
        Faulting instruction address: 0xc000000000100778
        Oops: Kernel access of bad area, sig: 11 [#1]
        SMP NR_CPUS=24 CoreNet Generic
        Modules linked in:
        CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.15.0-next-20140613 #145
        task: c0000000fe080000 ti: c0000000fe088000 task.ti: c0000000fe088000
        NIP: c000000000100778 LR: c00000000010073c CTR: 0000000000000000
        REGS: c0000000fe08aa00 TRAP: 0300   Not tainted  (3.15.0-next-20140613)
        MSR: 0000000080029000 <CE,EE,ME>  CR: 24ad2e24  XER: 00000000
        DEAR: 7375627379737465 ESR: 0000000000000000 SOFTE: 1
        GPR00: c0000000000c99b0 c0000000fe08ac80 c0000000009598e0 c0000000fe001d80
        GPR04: 00000000000000d0 0000000000000913 c000000007902b20 0000000000000000
        GPR08: c0000000feaae888 0000000000000000 0000000007091000 0000000000200200
        GPR12: 0000000028ad2e28 c00000000fff4000 c0000000007abe08 0000000000000000
        GPR16: c0000000007ab160 c0000000007aaf98 c00000000060ba68 c0000000007abda8
        GPR20: c0000000007abde8 c0000000feaea6f8 c0000000feaea708 c0000000007abd10
        GPR24: c000000000989370 c0000000008c6228 00000000000041ed c0000000fe00a400
        GPR28: c00000000017c1cc 00000000000000d0 7375627379737465 c0000000fe001d80
        NIP [c000000000100778] .__kmalloc_track_caller+0x70/0x168
        LR [c00000000010073c] .__kmalloc_track_caller+0x34/0x168
        Call Trace:
        [c0000000fe08ac80] [c00000000087e6b8] uevent_sock_list+0x0/0x10 (unreliable)
        [c0000000fe08ad20] [c0000000000c99b0] .kstrdup+0x44/0x90
        [c0000000fe08adc0] [c00000000017c1cc] .__kernfs_new_node+0x4c/0x130
        [c0000000fe08ae70] [c00000000017d7e4] .kernfs_new_node+0x2c/0x64
        [c0000000fe08aef0] [c00000000017db00] .kernfs_create_dir_ns+0x34/0xc8
        [c0000000fe08af80] [c00000000018067c] .sysfs_create_dir_ns+0x58/0xcc
        [c0000000fe08b010] [c0000000002c711c] .kobject_add_internal+0xc8/0x384
        [c0000000fe08b0b0] [c0000000002c7644] .kobject_add+0x64/0xc8
        [c0000000fe08b140] [c000000000355ebc] .device_add+0x11c/0x654
        [c0000000fe08b200] [c0000000002b5988] .add_disk+0x20c/0x4b4
        [c0000000fe08b2c0] [c0000000003a21d4] .add_mtd_blktrans_dev+0x340/0x514
        [c0000000fe08b350] [c0000000003a3410] .mtdblock_add_mtd+0x74/0xb4
        [c0000000fe08b3e0] [c0000000003a32cc] .blktrans_notify_add+0x64/0x94
        [c0000000fe08b470] [c00000000039b5b4] .add_mtd_device+0x1d4/0x368
        [c0000000fe08b520] [c00000000039b830] .mtd_device_parse_register+0xe8/0x104
        [c0000000fe08b5c0] [c0000000003b8408] .of_flash_probe+0x72c/0x734
        [c0000000fe08b750] [c00000000035ba40] .platform_drv_probe+0x38/0x84
        [c0000000fe08b7d0] [c0000000003599a4] .really_probe+0xa4/0x29c
        [c0000000fe08b870] [c000000000359d3c] .__driver_attach+0x100/0x104
        [c0000000fe08b900] [c00000000035746c] .bus_for_each_dev+0x84/0xe4
        [c0000000fe08b9a0] [c0000000003593c0] .driver_attach+0x24/0x38
        [c0000000fe08ba10] [c000000000358f24] .bus_add_driver+0x1c8/0x2ac
        [c0000000fe08bab0] [c00000000035a3a4] .driver_register+0x8c/0x158
        [c0000000fe08bb30] [c00000000035b9f4] .__platform_driver_register+0x6c/0x80
        [c0000000fe08bba0] [c00000000084e080] .of_flash_driver_init+0x1c/0x30
        [c0000000fe08bc10] [c000000000001864] .do_one_initcall+0xbc/0x238
        [c0000000fe08bd00] [c00000000082cdc0] .kernel_init_freeable+0x188/0x268
        [c0000000fe08bdb0] [c0000000000020a0] .kernel_init+0x1c/0xf7c
        [c0000000fe08be30] [c000000000000884] .ret_from_kernel_thread+0x58/0xd4
        Instruction dump:
        41bd0010 480000c8 4bf04eb5 60000000 e94d0028 e93f0000 7cc95214 e8a60008
        7fc9502a 2fbe0000 419e00c8 e93f0022 <7f7e482a> 39200000 88ed06b2 992d06b2
        ---[ end trace b4c9a94804a42d40 ]---
      
      It seems that the corrupted partition header on my mtd device triggers
      a bug in the ftl. In function build_maps() it will allocate the buffers
      needed by the mtd partition, but if something goes wrong such as kmalloc
      failure, mtd read error or invalid partition header parameter, it will
      free all allocated buffers and then return non-zero. In my case, it
      seems that partition header parameter 'NumTransferUnits' is invalid.
      
      And the ftl_freepart() is a function which free all the partition
      buffers allocated by build_maps(). Given the build_maps() is a self
      cleaning function, so there is no need to invoke this function even
      if build_maps() return with error. Otherwise it will causes the
      buffers to be freed twice and then weird things would happen.
      Signed-off-by: default avatarKevin Hao <haokexin@gmail.com>
      Signed-off-by: default avatarBrian Norris <computersforpeace@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      711f8bcf
    • Pavel Shilovsky's avatar
      CIFS: Fix wrong restart readdir for SMB1 · f586b768
      Pavel Shilovsky authored
      commit f736906a upstream.
      
      The existing code calls server->ops->close() that is not
      right. This causes XFS test generic/310 to fail. Fix this
      by using server->ops->closedir() function.
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarPavel Shilovsky <pshilovsky@samba.org>
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f586b768
    • Pavel Shilovsky's avatar
      CIFS: Fix wrong filename length for SMB2 · be462337
      Pavel Shilovsky authored
      commit 1bbe4997 upstream.
      
      The existing code uses the old MAX_NAME constant. This causes
      XFS test generic/013 to fail. Fix it by replacing MAX_NAME with
      PATH_MAX that SMB1 uses. Also remove an unused MAX_NAME constant
      definition.
      Signed-off-by: default avatarPavel Shilovsky <pshilovsky@samba.org>
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      be462337
    • Pavel Shilovsky's avatar
      CIFS: Fix directory rename error · cb22458c
      Pavel Shilovsky authored
      commit a07d3220 upstream.
      
      CIFS servers process nlink counts differently for files and directories.
      In cifs_rename() if we the request fails on the existing target, we
      try to remove it through cifs_unlink() but this is not what we want
      to do for directories. As the result the following sequence of commands
      
      mkdir {1,2}; mv -T 1 2; rmdir {1,2}; mkdir {1,2}; echo foo > 2/bar
      
      and XFS test generic/023 fail with -ENOENT error. That's why the second
      mkdir reuses the existing inode (target inode of the mv -T command) with
      S_DEAD flag.
      
      Fix this by checking whether the target is directory or not and
      calling cifs_rmdir() rather than cifs_unlink() for directories.
      Signed-off-by: default avatarPavel Shilovsky <pshilovsky@samba.org>
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cb22458c
    • Pavel Shilovsky's avatar
      CIFS: Fix wrong directory attributes after rename · 7454dc80
      Pavel Shilovsky authored
      commit b46799a8 upstream.
      
      When we requests rename we also need to update attributes
      of both source and target parent directories. Not doing it
      causes generic/309 xfstest to fail on SMB2 mounts. Fix this
      by marking these directories for force revalidating.
      Signed-off-by: default avatarPavel Shilovsky <pshilovsky@samba.org>
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7454dc80
    • Steve French's avatar
      CIFS: Possible null ptr deref in SMB2_tcon · fa7007a5
      Steve French authored
      commit 18f39e7b upstream.
      
      As Raphael Geissert pointed out, tcon_error_exit can dereference tcon
      and there is one path in which tcon can be null.
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Reported-by: default avatarRaphael Geissert <geissert@debian.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fa7007a5
    • Pavel Shilovsky's avatar
      CIFS: Fix async reading on reconnects · 09f07a14
      Pavel Shilovsky authored
      commit 038bc961 upstream.
      
      If we get into read_into_pages() from cifs_readv_receive() and then
      loose a network, we issue cifs_reconnect that moves all mids to
      a private list and issue their callbacks. The callback of the async
      read request sets a mid to retry, frees it and wakes up a process
      that waits on the rdata completion.
      
      After the connection is established we return from read_into_pages()
      with a short read, use the mid that was freed before and try to read
      the remaining data from the a newly created socket. Both actions are
      not what we want to do. In reconnect cases (-EAGAIN) we should not
      mask off the error with a short read but should return the error
      code instead.
      Acked-by: default avatarJeff Layton <jlayton@samba.org>
      Signed-off-by: default avatarPavel Shilovsky <pshilovsky@samba.org>
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      09f07a14
    • Pavel Shilovsky's avatar
      CIFS: Fix STATUS_CANNOT_DELETE error mapping for SMB2 · fd6cb8b1
      Pavel Shilovsky authored
      commit 21496687 upstream.
      
      The existing mapping causes unlink() call to return error after delete
      operation. Changing the mapping to -EACCES makes the client process
      the call like CIFS protocol does - reset dos attributes with ATTR_READONLY
      flag masked off and retry the operation.
      Signed-off-by: default avatarPavel Shilovsky <pshilovsky@samba.org>
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fd6cb8b1
    • Ilya Dryomov's avatar
      libceph: do not hard code max auth ticket len · 346acdff
      Ilya Dryomov authored
      commit c27a3e4d upstream.
      
      We hard code cephx auth ticket buffer size to 256 bytes.  This isn't
      enough for any moderate setups and, in case tickets themselves are not
      encrypted, leads to buffer overflows (ceph_x_decrypt() errors out, but
      ceph_decode_copy() doesn't - it's just a memcpy() wrapper).  Since the
      buffer is allocated dynamically anyway, allocated it a bit later, at
      the point where we know how much is going to be needed.
      
      Fixes: http://tracker.ceph.com/issues/8979Signed-off-by: default avatarIlya Dryomov <ilya.dryomov@inktank.com>
      Reviewed-by: default avatarSage Weil <sage@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      346acdff
    • Ilya Dryomov's avatar
      libceph: add process_one_ticket() helper · fc65783a
      Ilya Dryomov authored
      commit 597cda35 upstream.
      
      Add a helper for processing individual cephx auth tickets.  Needed for
      the next commit, which deals with allocating ticket buffers.  (Most of
      the diff here is whitespace - view with git diff -b).
      Signed-off-by: default avatarIlya Dryomov <ilya.dryomov@inktank.com>
      Reviewed-by: default avatarSage Weil <sage@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fc65783a
    • Sage Weil's avatar
      libceph: gracefully handle large reply messages from the mon · 7fa66ee4
      Sage Weil authored
      commit 73c3d481 upstream.
      
      We preallocate a few of the message types we get back from the mon.  If we
      get a larger message than we are expecting, fall back to trying to allocate
      a new one instead of blindly using the one we have.
      Signed-off-by: default avatarSage Weil <sage@redhat.com>
      Reviewed-by: default avatarIlya Dryomov <ilya.dryomov@inktank.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7fa66ee4
    • Ilya Dryomov's avatar
      libceph: set last_piece in ceph_msg_data_pages_cursor_init() correctly · b704f8b1
      Ilya Dryomov authored
      commit 5f740d7e upstream.
      
      Determining ->last_piece based on the value of ->page_offset + length
      is incorrect because length here is the length of the entire message.
      ->last_piece set to false even if page array data item length is <=
      PAGE_SIZE, which results in invalid length passed to
      ceph_tcp_{send,recv}page() and causes various asserts to fire.
      
          # cat pages-cursor-init.sh
          #!/bin/bash
          rbd create --size 10 --image-format 2 foo
          FOO_DEV=$(rbd map foo)
          dd if=/dev/urandom of=$FOO_DEV bs=1M &>/dev/null
          rbd snap create foo@snap
          rbd snap protect foo@snap
          rbd clone foo@snap bar
          # rbd_resize calls librbd rbd_resize(), size is in bytes
          ./rbd_resize bar $(((4 << 20) + 512))
          rbd resize --size 10 bar
          BAR_DEV=$(rbd map bar)
          # trigger a 512-byte copyup -- 512-byte page array data item
          dd if=/dev/urandom of=$BAR_DEV bs=1M count=1 seek=5
      
      The problem exists only in ceph_msg_data_pages_cursor_init(),
      ceph_msg_data_pages_advance() does the right thing.  The size_t cast is
      unnecessary.
      Signed-off-by: default avatarIlya Dryomov <ilya.dryomov@inktank.com>
      Reviewed-by: default avatarSage Weil <sage@redhat.com>
      Reviewed-by: default avatarAlex Elder <elder@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b704f8b1
    • Chris Mason's avatar
      xfs: don't zero partial page cache pages during O_DIRECT writes · 2f3219f6
      Chris Mason authored
      commit 85e584da upstream.
      
      xfs is using truncate_pagecache_range to invalidate the page cache
      during DIO reads.  This is different from the other filesystems who
      only invalidate pages during DIO writes.
      
      truncate_pagecache_range is meant to be used when we are freeing the
      underlying data structs from disk, so it will zero any partial
      ranges in the page.  This means a DIO read can zero out part of the
      page cache page, and it is possible the page will stay in cache.
      
      buffered reads will find an up to date page with zeros instead of
      the data actually on disk.
      
      This patch fixes things by using invalidate_inode_pages2_range
      instead.  It preserves the page cache invalidation, but won't zero
      any pages.
      
      [dchinner: catch error and warn if it fails. Comment.]
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Reviewed-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarBrian Foster <bfoster@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2f3219f6