1. 08 Jul, 2018 32 commits
  2. 03 Jul, 2018 8 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.14.53 · fa745a1b
      Greg Kroah-Hartman authored
      fa745a1b
    • Mathias Nyman's avatar
      xhci: Fix use-after-free in xhci_free_virt_device · 4798e96b
      Mathias Nyman authored
      commit 44a182b9 upstream.
      
      KASAN found a use-after-free in xhci_free_virt_device+0x33b/0x38e
      where xhci_free_virt_device() sets slot id to 0 if udev exists:
      if (dev->udev && dev->udev->slot_id)
      	dev->udev->slot_id = 0;
      
      dev->udev will be true even if udev is freed because dev->udev is
      not set to NULL.
      
      set dev->udev pointer to NULL in xhci_free_dev()
      
      The original patch went to stable so this fix needs to be applied
      there as well.
      
      Fixes: a400efe4 ("xhci: zero usb device slot_id member when disabling and freeing a xhci slot")
      Cc: <stable@vger.kernel.org>
      Reported-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Reviewed-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Tested-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarMathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: default avatarSudip Mukherjee <sudipm.mukherjee@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4798e96b
    • Mike Snitzer's avatar
      dm thin: handle running out of data space vs concurrent discard · 0b19825f
      Mike Snitzer authored
      commit a685557f upstream.
      
      Discards issued to a DM thin device can complete to userspace (via
      fstrim) _before_ the metadata changes associated with the discards is
      reflected in the thinp superblock (e.g. free blocks).  As such, if a
      user constructs a test that loops repeatedly over these steps, block
      allocation can fail due to discards not having completed yet:
      1) fill thin device via filesystem file
      2) remove file
      3) fstrim
      
      From initial report, here:
      https://www.redhat.com/archives/dm-devel/2018-April/msg00022.html
      
      "The root cause of this issue is that dm-thin will first remove
      mapping and increase corresponding blocks' reference count to prevent
      them from being reused before DISCARD bios get processed by the
      underlying layers. However. increasing blocks' reference count could
      also increase the nr_allocated_this_transaction in struct sm_disk
      which makes smd->old_ll.nr_allocated +
      smd->nr_allocated_this_transaction bigger than smd->old_ll.nr_blocks.
      In this case, alloc_data_block() will never commit metadata to reset
      the begin pointer of struct sm_disk, because sm_disk_get_nr_free()
      always return an underflow value."
      
      While there is room for improvement to the space-map accounting that
      thinp is making use of: the reality is this test is inherently racey and
      will result in the previous iteration's fstrim's discard(s) completing
      vs concurrent block allocation, via dd, in the next iteration of the
      loop.
      
      No amount of space map accounting improvements will be able to allow
      user's to use a block before a discard of that block has completed.
      
      So the best we can really do is allow DM thinp to gracefully handle such
      aggressive use of all the pool's data by degrading the pool into
      out-of-data-space (OODS) mode.  We _should_ get that behaviour already
      (if space map accounting didn't falsely cause alloc_data_block() to
      believe free space was available).. but short of that we handle the
      current reality that dm_pool_alloc_data_block() can return -ENOSPC.
      Reported-by: default avatarDennis Yang <dennisyang@qnap.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0b19825f
    • Bart Van Assche's avatar
      dm zoned: avoid triggering reclaim from inside dmz_map() · fb4d8744
      Bart Van Assche authored
      commit 2d0b2d64 upstream.
      
      This patch avoids that lockdep reports the following:
      
      ======================================================
      WARNING: possible circular locking dependency detected
      4.18.0-rc1 #62 Not tainted
      ------------------------------------------------------
      kswapd0/84 is trying to acquire lock:
      00000000c313516d (&xfs_nondir_ilock_class){++++}, at: xfs_free_eofblocks+0xa2/0x1e0
      
      but task is already holding lock:
      00000000591c83ae (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x30
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #2 (fs_reclaim){+.+.}:
        kmem_cache_alloc+0x2c/0x2b0
        radix_tree_node_alloc.constprop.19+0x3d/0xc0
        __radix_tree_create+0x161/0x1c0
        __radix_tree_insert+0x45/0x210
        dmz_map+0x245/0x2d0 [dm_zoned]
        __map_bio+0x40/0x260
        __split_and_process_non_flush+0x116/0x220
        __split_and_process_bio+0x81/0x180
        __dm_make_request.isra.32+0x5a/0x100
        generic_make_request+0x36e/0x690
        submit_bio+0x6c/0x140
        mpage_readpages+0x19e/0x1f0
        read_pages+0x6d/0x1b0
        __do_page_cache_readahead+0x21b/0x2d0
        force_page_cache_readahead+0xc4/0x100
        generic_file_read_iter+0x7c6/0xd20
        __vfs_read+0x102/0x180
        vfs_read+0x9b/0x140
        ksys_read+0x55/0xc0
        do_syscall_64+0x5a/0x1f0
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      -> #1 (&dmz->chunk_lock){+.+.}:
        dmz_map+0x133/0x2d0 [dm_zoned]
        __map_bio+0x40/0x260
        __split_and_process_non_flush+0x116/0x220
        __split_and_process_bio+0x81/0x180
        __dm_make_request.isra.32+0x5a/0x100
        generic_make_request+0x36e/0x690
        submit_bio+0x6c/0x140
        _xfs_buf_ioapply+0x31c/0x590
        xfs_buf_submit_wait+0x73/0x520
        xfs_buf_read_map+0x134/0x2f0
        xfs_trans_read_buf_map+0xc3/0x580
        xfs_read_agf+0xa5/0x1e0
        xfs_alloc_read_agf+0x59/0x2b0
        xfs_alloc_pagf_init+0x27/0x60
        xfs_bmap_longest_free_extent+0x43/0xb0
        xfs_bmap_btalloc_nullfb+0x7f/0xf0
        xfs_bmap_btalloc+0x428/0x7c0
        xfs_bmapi_write+0x598/0xcc0
        xfs_iomap_write_allocate+0x15a/0x330
        xfs_map_blocks+0x1cf/0x3f0
        xfs_do_writepage+0x15f/0x7b0
        write_cache_pages+0x1ca/0x540
        xfs_vm_writepages+0x65/0xa0
        do_writepages+0x48/0xf0
        __writeback_single_inode+0x58/0x730
        writeback_sb_inodes+0x249/0x5c0
        wb_writeback+0x11e/0x550
        wb_workfn+0xa3/0x670
        process_one_work+0x228/0x670
        worker_thread+0x3c/0x390
        kthread+0x11c/0x140
        ret_from_fork+0x3a/0x50
      
      -> #0 (&xfs_nondir_ilock_class){++++}:
        down_read_nested+0x43/0x70
        xfs_free_eofblocks+0xa2/0x1e0
        xfs_fs_destroy_inode+0xac/0x270
        dispose_list+0x51/0x80
        prune_icache_sb+0x52/0x70
        super_cache_scan+0x127/0x1a0
        shrink_slab.part.47+0x1bd/0x590
        shrink_node+0x3b5/0x470
        balance_pgdat+0x158/0x3b0
        kswapd+0x1ba/0x600
        kthread+0x11c/0x140
        ret_from_fork+0x3a/0x50
      
      other info that might help us debug this:
      
      Chain exists of:
        &xfs_nondir_ilock_class --> &dmz->chunk_lock --> fs_reclaim
      
      Possible unsafe locking scenario:
      
           CPU0                    CPU1
           ----                    ----
      lock(fs_reclaim);
                                   lock(&dmz->chunk_lock);
                                   lock(fs_reclaim);
      lock(&xfs_nondir_ilock_class);
      fb4d8744
    • Kirill A. Shutemov's avatar
      x86/efi: Fix efi_call_phys_epilog() with CONFIG_X86_5LEVEL=y · 0cfb151b
      Kirill A. Shutemov authored
      commit cfe19577 upstream.
      
      Open-coded page table entry checks don't work correctly when we fold the
      page table level at runtime.
      
      pgd_present() on 4-level paging machine always returns true, but
      open-coded version of the check may return false-negative result and
      we silently skip the rest of the loop body in efi_call_phys_epilog().
      
      Replace open-coded checks with proper helpers.
      Signed-off-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org # v4.12+
      Fixes: 94133e46 ("x86/efi: Correct EFI identity mapping under 'efi=old_map' when KASLR is enabled")
      Link: http://lkml.kernel.org/r/20180625120852.18300-1-kirill.shutemov@linux.intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0cfb151b
    • Bart Van Assche's avatar
      block: Fix cloning of requests with a special payload · 25114134
      Bart Van Assche authored
      commit 297ba57d upstream.
      
      This patch avoids that removing a path controlled by the dm-mpath driver
      while mkfs is running triggers the following kernel bug:
      
          kernel BUG at block/blk-core.c:3347!
          invalid opcode: 0000 [#1] PREEMPT SMP KASAN
          CPU: 20 PID: 24369 Comm: mkfs.ext4 Not tainted 4.18.0-rc1-dbg+ #2
          RIP: 0010:blk_end_request_all+0x68/0x70
          Call Trace:
           <IRQ>
           dm_softirq_done+0x326/0x3d0 [dm_mod]
           blk_done_softirq+0x19b/0x1e0
           __do_softirq+0x128/0x60d
           irq_exit+0x100/0x110
           smp_call_function_single_interrupt+0x90/0x330
           call_function_single_interrupt+0xf/0x20
           </IRQ>
      
      Fixes: f9d03f96 ("block: improve handling of the magic discard payload")
      Reviewed-by: default avatarMing Lei <ming.lei@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Acked-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarBart Van Assche <bart.vanassche@wdc.com>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Johannes Thumshirn <jthumshirn@suse.de>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      25114134
    • Keith Busch's avatar
      block: Fix transfer when chunk sectors exceeds max · 29413e06
      Keith Busch authored
      commit 15bfd21f upstream.
      
      A device may have boundary restrictions where the number of sectors
      between boundaries exceeds its max transfer size. In this case, we need
      to cap the max size to the smaller of the two limits.
      Reported-by: default avatarJitendra Bhivare <jitendra.bhivare@broadcom.com>
      Tested-by: default avatarJitendra Bhivare <jitendra.bhivare@broadcom.com>
      Cc: <stable@vger.kernel.org>
      Reviewed-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarKeith Busch <keith.busch@intel.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      29413e06
    • Mikulas Patocka's avatar
      slub: fix failure when we delete and create a slab cache · 804a0db7
      Mikulas Patocka authored
      commit d50d82fa upstream.
      
      In kernel 4.17 I removed some code from dm-bufio that did slab cache
      merging (commit 21bb1327: "dm bufio: remove code that merges slab
      caches") - both slab and slub support merging caches with identical
      attributes, so dm-bufio now just calls kmem_cache_create and relies on
      implicit merging.
      
      This uncovered a bug in the slub subsystem - if we delete a cache and
      immediatelly create another cache with the same attributes, it fails
      because of duplicate filename in /sys/kernel/slab/.  The slub subsystem
      offloads freeing the cache to a workqueue - and if we create the new
      cache before the workqueue runs, it complains because of duplicate
      filename in sysfs.
      
      This patch fixes the bug by moving the call of kobject_del from
      sysfs_slab_remove_workfn to shutdown_cache.  kobject_del must be called
      while we hold slab_mutex - so that the sysfs entry is deleted before a
      cache with the same attributes could be created.
      
      Running device-mapper-test-suite with:
      
        dmtest run --suite thin-provisioning -n /commit_failure_causes_fallback/
      
      triggered:
      
        Buffer I/O error on dev dm-0, logical block 1572848, async page read
        device-mapper: thin: 253:1: metadata operation 'dm_pool_alloc_data_block' failed: error = -5
        device-mapper: thin: 253:1: aborting current metadata transaction
        sysfs: cannot create duplicate filename '/kernel/slab/:a-0000144'
        CPU: 2 PID: 1037 Comm: kworker/u48:1 Not tainted 4.17.0.snitm+ #25
        Hardware name: Supermicro SYS-1029P-WTR/X11DDW-L, BIOS 2.0a 12/06/2017
        Workqueue: dm-thin do_worker [dm_thin_pool]
        Call Trace:
         dump_stack+0x5a/0x73
         sysfs_warn_dup+0x58/0x70
         sysfs_create_dir_ns+0x77/0x80
         kobject_add_internal+0xba/0x2e0
         kobject_init_and_add+0x70/0xb0
         sysfs_slab_add+0xb1/0x250
         __kmem_cache_create+0x116/0x150
         create_cache+0xd9/0x1f0
         kmem_cache_create_usercopy+0x1c1/0x250
         kmem_cache_create+0x18/0x20
         dm_bufio_client_create+0x1ae/0x410 [dm_bufio]
         dm_block_manager_create+0x5e/0x90 [dm_persistent_data]
         __create_persistent_data_objects+0x38/0x940 [dm_thin_pool]
         dm_pool_abort_metadata+0x64/0x90 [dm_thin_pool]
         metadata_operation_failed+0x59/0x100 [dm_thin_pool]
         alloc_data_block.isra.53+0x86/0x180 [dm_thin_pool]
         process_cell+0x2a3/0x550 [dm_thin_pool]
         do_worker+0x28d/0x8f0 [dm_thin_pool]
         process_one_work+0x171/0x370
         worker_thread+0x49/0x3f0
         kthread+0xf8/0x130
         ret_from_fork+0x35/0x40
        kobject_add_internal failed for :a-0000144 with -EEXIST, don't try to register things with the same name in the same directory.
        kmem_cache_create(dm_bufio_buffer-16) failed with error -17
      
      Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1806151817130.6333@file01.intranet.prod.int.rdu2.redhat.comSigned-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Reported-by: default avatarMike Snitzer <snitzer@redhat.com>
      Tested-by: default avatarMike Snitzer <snitzer@redhat.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      804a0db7