1. 23 May, 2021 3 commits
    • ipc/mqueue, msg, sem: avoid relying on a stack reference past its expiry · a11ddb37
      Varad Gautam authored
      do_mq_timedreceive calls wq_sleep with a stack local address.  The
      sender (do_mq_timedsend) uses this address to later call pipelined_send.
      
      This leads to a very hard to trigger race where a do_mq_timedreceive
      call might return and leave do_mq_timedsend to rely on an invalid
      address, causing the following crash:
      
        RIP: 0010:wake_q_add_safe+0x13/0x60
        Call Trace:
         __x64_sys_mq_timedsend+0x2a9/0x490
         do_syscall_64+0x80/0x680
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
        RIP: 0033:0x7f5928e40343
      
      The race occurs as:
      
      1. do_mq_timedreceive calls wq_sleep with the address of `struct
         ext_wait_queue` on function stack (aliased as `ewq_addr` here) - it
         holds a valid `struct ext_wait_queue *` as long as the stack has not
         been overwritten.
      
      2. `ewq_addr` gets added to info->e_wait_q[RECV].list in wq_add, and
         do_mq_timedsend receives it via wq_get_first_waiter(info, RECV) to call
         __pipelined_op.
      
      3. Sender calls __pipelined_op::smp_store_release(&this->state,
         STATE_READY).  Here is where the race window begins.  (`this` is
         `ewq_addr`.)
      
      4. If the receiver wakes up now in do_mq_timedreceive::wq_sleep, it
         will see `state == STATE_READY` and break.
      
      5. do_mq_timedreceive returns, and `ewq_addr` is no longer guaranteed
         to be a `struct ext_wait_queue *` since it was on do_mq_timedreceive's
         stack.  (Although the address may not get overwritten until another
         function happens to touch it, which means it can persist around for an
         indefinite time.)
      
      6. do_mq_timedsend::__pipelined_op() still believes `ewq_addr` is a
         `struct ext_wait_queue *`, and uses it to find a task_struct to pass to
         the wake_q_add_safe call.  In the lucky case where nothing has
         overwritten `ewq_addr` yet, `ewq_addr->task` is the right task_struct.
         In the unlucky case, __pipelined_op::wake_q_add_safe gets handed a
         bogus address as the receiver's task_struct causing the crash.
      
      do_mq_timedsend::__pipelined_op() should not dereference `this` after
      setting STATE_READY, as the receiver counterpart is now free to return.
      Change __pipelined_op to call wake_q_add_safe on the receiver's
      task_struct returned by get_task_struct, instead of dereferencing `this`
      which sits on the receiver's stack.
      
      As Manfred pointed out, the race potentially also exists in
      ipc/msg.c::expunge_all and ipc/sem.c::wake_up_sem_queue_prepare.  Fix
      those in the same way.
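
      The ordering described above can be sketched in userspace C11.  This is
      an illustration only, not the kernel code: `struct task`, `struct
      waiter`, `wake()`, `after_publish` and `clobber_waiter()` are
      hypothetical stand-ins, and the sketch leaves out the reference counting
      that get_task_struct performs.  The point is that the task pointer is
      copied out of the waiter before STATE_READY is published, so the wakeup
      no longer depends on the receiver's stack.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

enum { STATE_NONE = 0, STATE_READY = 1 };

struct task { int id; };            /* hypothetical stand-in for task_struct */

struct waiter {                     /* hypothetical stand-in for ext_wait_queue */
    struct task *task;
    atomic_int state;
};

/* Stand-in for wake_q_add_safe(): records which task would be woken. */
static struct task *woken;
static void wake(struct task *t) { woken = t; }

/* Test hook: simulates the receiver returning and its stack slot being reused. */
static void (*after_publish)(struct waiter *w);

static void clobber_waiter(struct waiter *w) { w->task = NULL; }

static void pipelined_op_sketch(struct waiter *w)
{
    struct task *t = w->task;       /* read while *w is guaranteed valid */

    atomic_store_explicit(&w->state, STATE_READY, memory_order_release);

    /* The receiver may have returned by now: *w must not be read again. */
    if (after_publish)
        after_publish(w);

    wake(t);                        /* uses the saved pointer, not w->task */
}
```

      With `after_publish` set to `clobber_waiter`, the saved pointer still
      names the right task even though `w->task` has been overwritten.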
      
      Link: https://lkml.kernel.org/r/20210510102950.12551-1-varad.gautam@suse.com
      Fixes: c5b2cbdb ("ipc/mqueue.c: update/document memory barriers")
      Fixes: 8116b54e ("ipc/sem.c: document and update memory barriers")
      Fixes: 0d97a82b ("ipc/msg.c: update and document memory barriers")
      Signed-off-by: Varad Gautam <varad.gautam@suse.com>
      Reported-by: Matthias von Faber <matthias.vonfaber@aox-tech.de>
      Acked-by: Davidlohr Bueso <dbueso@suse.de>
      Acked-by: Manfred Spraul <manfred@colorfullife.com>
      Cc: Christian Brauner <christian.brauner@ubuntu.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Revert "mm/gup: check page posion status for coredump." · f10628d2
      Michal Hocko authored
      While reviewing [1] I came across commit d3378e86 ("mm/gup: check
      page posion status for coredump.") and noticed that this patch is broken
      in two ways.  First, it doesn't really prevent hwpoison pages from being
      dumped because hwpoison pages can be marked asynchronously at any time
      after the check.  Secondly, and more importantly, the patch introduces a
      ref count leak because get_dump_page takes a reference on the page which
      is not released.
      
      It also seems that the patch was merged incorrectly because there were
      follow up changes not included as well as discussions on how to address
      the underlying problem [2].
      
      Therefore revert the original patch.
      
      Link: http://lkml.kernel.org/r/20210429122519.15183-4-david@redhat.com [1]
      Link: http://lkml.kernel.org/r/57ac524c-b49a-99ec-c1e4-ef5027bfb61b@redhat.com [2]
      Link: https://lkml.kernel.org/r/20210505135407.31590-1-mhocko@kernel.org
      Fixes: d3378e86 ("mm/gup: check page posion status for coredump.")
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Cc: Aili Yao <yaoaili@kingsoft.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/shuffle: fix section mismatch warning · f9f74dc2
      Arnd Bergmann authored
      clang sometimes decides not to inline shuffle_zone(), but it calls a
      __meminit function.  Without the extra __meminit annotation we get this
      warning:
      
        WARNING: modpost: vmlinux.o(.text+0x2a86d4): Section mismatch in reference from the function shuffle_zone() to the function .meminit.text:__shuffle_zone()
        The function shuffle_zone() references
        the function __meminit __shuffle_zone().
        This is often because shuffle_zone lacks a __meminit
        annotation or the annotation of __shuffle_zone is wrong.
      
      shuffle_free_memory() did not show the same problem in my tests, but it
      could happen in theory as well, so mark both as __meminit.
      
      Link: https://lkml.kernel.org/r/20210514135952.2928094-1-arnd@kernel.org
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 22 May, 2021 4 commits
    • Merge tag 'block-5.13-2021-05-22' of git://git.kernel.dk/linux-block · 4ff2473b
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - Fix BLKRRPART and deletion race (Gulam, Christoph)
      
       - NVMe pull request (Christoph):
            - nvme-tcp corruption and timeout fixes (Sagi Grimberg, Keith
              Busch)
            - nvme-fc teardown fix (James Smart)
            - nvmet/nvme-loop memory leak fixes (Wu Bo)
      
      * tag 'block-5.13-2021-05-22' of git://git.kernel.dk/linux-block:
        block: fix a race between del_gendisk and BLKRRPART
        block: prevent block device lookups at the beginning of del_gendisk
        nvme-fc: clear q_live at beginning of association teardown
        nvme-tcp: rerun io_work if req_list is not empty
        nvme-tcp: fix possible use-after-completion
        nvme-loop: fix memory leak in nvme_loop_create_ctrl()
        nvmet: fix memory leak in nvmet_alloc_ctrl()
    • Merge tag 'io_uring-5.13-2021-05-22' of git://git.kernel.dk/linux-block · b9231dfb
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
       "One fix for a regression with poll in this merge window, and another
        just hardens the io-wq exit path a bit"
      
      * tag 'io_uring-5.13-2021-05-22' of git://git.kernel.dk/linux-block:
        io_uring: fortify tctx/io_wq cleanup
        io_uring: don't modify req->poll for rw
    • Merge tag 'for-linus-5.13b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 23d72926
      Linus Torvalds authored
      Pull xen fixes from Juergen Gross:
      
       - a fix for a boot regression when running as PV guest on hardware
         without NX support
      
       - a small series fixing a bug in the Xen pciback driver when
         configuring a PCI card with multiple virtual functions
      
      * tag 'for-linus-5.13b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen-pciback: reconfigure also from backend watch handler
        xen-pciback: redo VF placement in the virtual topology
        x86/Xen: swap NX determination and GDT setup on BSP
    • Merge tag 'xfs-5.13-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · a3969ef4
      Linus Torvalds authored
      Pull xfs fixes from Darrick Wong:
      
       - Fix some math errors in the realtime allocator when extent size hints
         are applied.
      
       - Fix unnecessary short writes to realtime files when free space is
         fragmented.
      
       - Fix a crash when using scrub tracepoints.
      
       - Restore ioctl uapi definitions that were accidentally removed in
         5.13-rc1.
      
      * tag 'xfs-5.13-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: restore old ioctl definitions
        xfs: fix deadlock retry tracepoint arguments
        xfs: retry allocations when locality-based search fails
        xfs: adjust rt allocation minlen when extszhint > rtextsize
  3. 21 May, 2021 20 commits
  4. 20 May, 2021 13 commits
    • Fix KASAN identified use-after-free issue. · 9687c85d
      Rohith Surabattula authored
      [  612.157429] ==================================================================
      [  612.158275] BUG: KASAN: use-after-free in process_one_work+0x90/0x9b0
      [  612.158801] Read of size 8 at addr ffff88810a31ca60 by task kworker/2:9/2382
      
      [  612.159611] CPU: 2 PID: 2382 Comm: kworker/2:9 Tainted: G OE     5.13.0-rc2+ #98
      [  612.159623] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014
      [  612.159640] Workqueue:  0x0 (deferredclose)
      [  612.159669] Call Trace:
      [  612.159685]  dump_stack+0xbb/0x107
      [  612.159711]  print_address_description.constprop.0+0x18/0x140
      [  612.159733]  ? process_one_work+0x90/0x9b0
      [  612.159743]  ? process_one_work+0x90/0x9b0
      [  612.159754]  kasan_report.cold+0x7c/0xd8
      [  612.159778]  ? lock_is_held_type+0x80/0x130
      [  612.159789]  ? process_one_work+0x90/0x9b0
      [  612.159812]  kasan_check_range+0x145/0x1a0
      [  612.159834]  process_one_work+0x90/0x9b0
      [  612.159877]  ? pwq_dec_nr_in_flight+0x110/0x110
      [  612.159914]  ? spin_bug+0x90/0x90
      [  612.159967]  worker_thread+0x3b6/0x6c0
      [  612.160023]  ? process_one_work+0x9b0/0x9b0
      [  612.160038]  kthread+0x1dc/0x200
      [  612.160051]  ? kthread_create_worker_on_cpu+0xd0/0xd0
      [  612.160092]  ret_from_fork+0x1f/0x30
      
      [  612.160399] Allocated by task 2358:
      [  612.160757]  kasan_save_stack+0x1b/0x40
      [  612.160768]  __kasan_kmalloc+0x9b/0xd0
      [  612.160778]  cifs_new_fileinfo+0xb0/0x960 [cifs]
      [  612.161170]  cifs_open+0xadf/0xf20 [cifs]
      [  612.161421]  do_dentry_open+0x2aa/0x6b0
      [  612.161432]  path_openat+0xbd9/0xfa0
      [  612.161441]  do_filp_open+0x11d/0x230
      [  612.161450]  do_sys_openat2+0x115/0x240
      [  612.161460]  __x64_sys_openat+0xce/0x140
      
      When mod_delayed_work is called to modify the delay of pending work,
      it might return false and queue new work, either because the pending
      work is already scheduled or because the attempt to grab the pending
      work failed.
      
      So, increase the reference count when new work is scheduled to
      avoid a use-after-free.
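
      The reference-counting rule can be sketched with a small userspace
      model.  Everything here is hypothetical: `struct file_info`,
      `model_mod_delayed_work()`, `defer_close()` and `run_deferred_work()`
      are illustrative names, and the model only mimics the return convention
      described above (false means new work was queued).  It is not the cifs
      code.

```c
#include <assert.h>
#include <stdbool.h>

struct file_info {              /* hypothetical stand-in for the cifs file object */
    int refcount;
    bool work_pending;
    unsigned long delay;
};

/*
 * Models the return convention: true if already pending work had its
 * delay modified, false if new work had to be queued.
 */
static bool model_mod_delayed_work(struct file_info *f, unsigned long delay)
{
    bool was_pending = f->work_pending;

    f->work_pending = true;
    f->delay = delay;
    return was_pending;
}

static void defer_close(struct file_info *f, unsigned long delay)
{
    /* Newly queued work needs its own reference on the object. */
    if (!model_mod_delayed_work(f, delay))
        f->refcount++;
}

static void run_deferred_work(struct file_info *f)
{
    if (!f->work_pending)
        return;
    f->work_pending = false;
    f->refcount--;              /* the work item drops the reference it owned */
}
```

      In this model the object's count never reaches zero while work is still
      queued against it, which is the property the fix is after.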
      Signed-off-by: Rohith Surabattula <rohiths@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · f01da525
      Linus Torvalds authored
      Pull rdma fixes from Jason Gunthorpe:
       "A mixture of small bug fixes, most for longer standing problems:
      
         - NULL pointer crash in siw
      
         - Various error unwind bugs in siw, rxe, cm
      
         - User triggerable errors in uverbs
      
         - Minor bugs in mlx5 and rxe drivers"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
        RDMA/uverbs: Fix a NULL vs IS_ERR() bug
        RDMA/mlx5: Fix query DCT via DEVX
        RDMA/core: Don't access cm_id after its destruction
        RDMA/rxe: Return CQE error if invalid lkey was supplied
        RDMA/mlx5: Recover from fatal event in dual port mode
        RDMA/mlx5: Verify that DM operation is reasonable
        RDMA/rxe: Clear all QP fields if creation failed
        RDMA/core: Prevent divide-by-zero error triggered by the user
        RDMA/siw: Release xarray entry
        RDMA/siw: Properly check send and receive CQ pointers
    • Merge tag 'sound-5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 6aa37a53
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "All small device-specific fixes here: a series of FireWire audio
        fixes, UAF and other fixes in USB-audio and co spotted by fuzzer,
        and a few HD-audio quirks as usual"
      
      * tag 'sound-5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: line6: Fix racy initialization of LINE6 MIDI
        ALSA: dice: fix stream format for TC Electronic Konnekt Live at high sampling transfer frequency
        ALSA: dice: disable double_pcm_frames mode for M-Audio Profire 610, 2626 and Avid M-Box 3 Pro
        ALSA: intel8x0: Don't update period unless prepared
        ALSA: hda/realtek: Add some CLOVE SSIDs of ALC293
        ALSA: firewire-lib: fix amdtp_packet tracepoints event for packet_index field
        ALSA: firewire-lib: fix calculation for size of IR context payload
        ALSA: firewire-lib: fix check for the size of isochronous packet payload
        ALSA: bebob/oxfw: fix Kconfig entry for Mackie d.2 Pro
        ALSA: dice: fix stream format at middle sampling rate for Alesis iO 26
        ALSA: hda/realtek: Add fixup for HP Spectre x360 15-df0xxx
        ALSA: usb-audio: Fix potential out-of-bounce access in MIDI EP parser
        ALSA: usb-audio: Validate MS endpoint descriptors
        ALSA: hda: fixup headset for ASUS GU502 laptop
        ALSA: hda/realtek: reset eapd coeff to default value for alc287
    • Merge tag 'platform-drivers-x86-v5.13-2' of... · 9ebd8118
      Linus Torvalds authored
      Merge tag 'platform-drivers-x86-v5.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
      
      Pull x86 platform driver fixes from Hans de Goede:
       "Assorted pdx86 bug-fixes and model-specific quirks for 5.13"
      
      * tag 'platform-drivers-x86-v5.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
        platform/x86: touchscreen_dmi: Add info for the Chuwi Hi10 Pro (CWI529) tablet
        platform/x86: touchscreen_dmi: Add info for the Mediacom Winpad 7.0 W700 tablet
        platform/x86: intel_punit_ipc: Append MODULE_DEVICE_TABLE for ACPI
        platform/x86: dell-smbios-wmi: Fix oops on rmmod dell_smbios
        platform/x86: hp-wireless: add AMD's hardware id to the supported list
        platform/x86: intel_int0002_vgpio: Only call enable_irq_wake() when using s2idle
        platform/x86: gigabyte-wmi: add support for B550 Aorus Elite
        platform/x86: gigabyte-wmi: add support for X570 UD
        platform/x86: gigabyte-wmi: streamline dmi matching
        platform/mellanox: mlxbf-tmfifo: Fix a memory barrier issue
        platform/surface: dtx: Fix poll function
        platform/surface: aggregator: Add platform-drivers-x86 list to MAINTAINERS entry
        platform/surface: aggregator: avoid clang -Wconstant-conversion warning
        platform/surface: aggregator: Do not mark interrupt as shared
        platform/x86: hp_accel: Avoid invoking _INI to speed up resume
        platform/x86: ideapad-laptop: fix method name typo
        platform/x86: ideapad-laptop: fix a NULL pointer dereference
    • Merge tag 'char-misc-5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 50f09a3d
      Linus Torvalds authored
      Pull char/misc driver fixes from Greg KH:
       "Here is a big set of char/misc/other driver fixes for 5.13-rc3.
      
        The majority here is the fallout of the umn.edu re-review of all prior
        submissions. That resulted in a bunch of reverts along with the
        "correct" changes made, such that there is no regression of any of the
        potential fixes that were made by those individuals. I would like to
        thank the over 80 different developers who helped with the review and
        fixes for this mess.
      
        Other than that, there's a few habanna driver fixes for reported
        issues, and some dyndbg fixes for reported problems.
      
        All of these have been in linux-next for a while with no reported
        problems"
      
      * tag 'char-misc-5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (82 commits)
        misc: eeprom: at24: check suspend status before disable regulator
        uio_hv_generic: Fix another memory leak in error handling paths
        uio_hv_generic: Fix a memory leak in error handling paths
        uio/uio_pci_generic: fix return value changed in refactoring
        Revert "Revert "ALSA: usx2y: Fix potential NULL pointer dereference""
        dyndbg: drop uninformative vpr_info
        dyndbg: avoid calling dyndbg_emit_prefix when it has no work
        binder: Return EFAULT if we fail BINDER_ENABLE_ONEWAY_SPAM_DETECTION
        cdrom: gdrom: initialize global variable at init time
        brcmfmac: properly check for bus register errors
        Revert "brcmfmac: add a check for the status of usb_register"
        video: imsttfb: check for ioremap() failures
        Revert "video: imsttfb: fix potential NULL pointer dereferences"
        net: liquidio: Add missing null pointer checks
        Revert "net: liquidio: fix a NULL pointer dereference"
        media: gspca: properly check for errors in po1030_probe()
        Revert "media: gspca: Check the return value of write_bridge for timeout"
        media: gspca: mt9m111: Check write_bridge for timeout
        Revert "media: gspca: mt9m111: Check write_bridge for timeout"
        media: dvb: Add check on sp8870_readreg return
        ...
    • Merge tag 'quota_for_v5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs · 7ac17714
      Linus Torvalds authored
      Pull quota fixes from Jan Kara:
       "The most important part in the pull is disablement of the new syscall
        quotactl_path() which was added in rc1.
      
        The reason is some people at LWN discussion pointed out dirfd would be
        useful for this path based syscall and Christian Brauner agreed.
      
        Without dirfd it may be indeed problematic for containers. So let's
        just disable the syscall for now when it doesn't have users yet so
        that we have more time to mull over how to best specify the filesystem
        we want to work on"
      
      * tag 'quota_for_v5.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
        quota: Disable quotactl_path syscall
        quota: Use 'hlist_for_each_entry' to simplify code
    • xfs: restore old ioctl definitions · e3c2b047
      Darrick J. Wong authored
      These ioctl definitions in xfs_fs.h are part of the userspace ABI and
      were mistakenly removed during the 5.13 merge window.
      
      Fixes: 9fefd5db ("xfs: convert to fileattr")
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
    • xfs: fix deadlock retry tracepoint arguments · 16c9de54
      Darrick J. Wong authored
      sc->ip is the inode that's being scrubbed, which means that it's not set
      for scrub types that don't involve inodes.  If one of those scrubbers
      (e.g. inode btrees) returns EDEADLOCK, we'll trip over the null pointer.
      Fix that by reporting either the file being examined or the file that
      was used to call scrub.
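
      The fallback can be sketched as below.  The types and the
      `caller_inode` field are hypothetical stand-ins for illustration and do
      not match the real xfs structures.  The sketch only shows the selection
      rule: use the inode being scrubbed when there is one, otherwise the file
      that was used to call scrub.

```c
#include <assert.h>
#include <stddef.h>

struct inode { unsigned long ino; };

struct scrub_ctx {                  /* hypothetical stand-in for the scrub context */
    struct inode *ip;               /* inode being scrubbed, NULL for non-inode scrubs */
    struct inode *caller_inode;     /* file used to call scrub, assumed always set */
};

/* Pick the inode number to report without dereferencing a NULL sc->ip. */
static unsigned long trace_ino(const struct scrub_ctx *sc)
{
    const struct inode *ip = sc->ip ? sc->ip : sc->caller_inode;

    return ip->ino;
}
```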
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
    • xfs: retry allocations when locality-based search fails · 676a659b
      Darrick J. Wong authored
      If a realtime allocation fails because we can't find a sufficiently
      large free extent satisfying locality rules, relax the locality rules
      and try again.  This reduces the occurrence of short writes to realtime
      files when the write size is large and the free space is fragmented.
      
      This was originally discovered by running generic/186 with the realtime
      reflink patchset and a 128k cow extent size hint, but the short write
      symptoms can manifest with a 128k extent size hint and no reflink, so
      apply the fix now.
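
      The retry idea can be sketched with a toy free-extent list.  This is a
      simplification under stated assumptions, not the xfs realtime allocator:
      `struct extent`, `find_extent()`, `alloc_with_retry()` and the notion of
      a `window` around a `target` block are all hypothetical, and overflow of
      `target + window` is ignored.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

struct extent { unsigned long start, len; };

/* Index of the first free extent with len >= want and start in [lo, hi), or -1. */
static int find_extent(const struct extent *fr, size_t n, unsigned long want,
                       unsigned long lo, unsigned long hi)
{
    for (size_t i = 0; i < n; i++)
        if (fr[i].len >= want && fr[i].start >= lo && fr[i].start < hi)
            return (int)i;
    return -1;
}

/* Try near the target first; if that fails, drop the locality rule and retry. */
static int alloc_with_retry(const struct extent *fr, size_t n, unsigned long want,
                            unsigned long target, unsigned long window)
{
    unsigned long lo = target > window ? target - window : 0;
    int idx = find_extent(fr, n, want, lo, target + window);

    if (idx < 0)
        idx = find_extent(fr, n, want, 0, ULONG_MAX);
    return idx;
}
```

      Without the second search, a large request near a fragmented region
      would fail or be shortened even though a big enough extent exists
      elsewhere.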
      Signed-off-by: Darrick J. Wong <djwong@kernel.org>
      Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
    • block: fix a race between del_gendisk and BLKRRPART · bc6a3851
      Gulam Mohamed authored
      When BLKRRPART is called concurrently with del_gendisk, the partition
      rescan can create a stale partition that will never be cleaned up.
      
      Fix this by checking that the disk is up before rescanning partitions
      while under bd_mutex.
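
      The check can be sketched with a userspace mutex.  The names below are
      hypothetical: `struct disk`, `rescan_partitions()` and `delete_disk()`
      only model the pattern, with `lock` playing the role of bd_mutex and
      `up` the role of the "disk is up" flag, and -1 standing in for whatever
      error the kernel returns.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct disk {
    pthread_mutex_t lock;   /* plays the role of bd_mutex */
    bool up;                /* plays the role of the "disk is up" flag */
    int partitions;
};

/* Rescan only if the disk is still up, checked under the same lock as teardown. */
static int rescan_partitions(struct disk *d, int found)
{
    int ret = -1;

    pthread_mutex_lock(&d->lock);
    if (d->up) {
        d->partitions = found;
        ret = 0;
    }
    pthread_mutex_unlock(&d->lock);
    return ret;
}

/* Teardown clears the flag and the partitions under the lock. */
static void delete_disk(struct disk *d)
{
    pthread_mutex_lock(&d->lock);
    d->up = false;
    d->partitions = 0;
    pthread_mutex_unlock(&d->lock);
}
```

      Because both paths take the same lock, a rescan that runs after teardown
      sees the cleared flag and cannot recreate a partition.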
      Signed-off-by: Gulam Mohamed <gulam.mohamed@oracle.com>
      [hch: split from a larger patch]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Ming Lei <ming.lei@redhat.com>
      Link: https://lore.kernel.org/r/20210514131842.1600568-3-hch@lst.de
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: prevent block device lookups at the beginning of del_gendisk · 6c60ff04
      Christoph Hellwig authored
      As an artifact of how gendisk lookup used to work in earlier kernels,
      GENHD_FL_UP is only cleared very late in del_gendisk, and a global lock
      is used to prevent opens from succeeding while del_gendisk is tearing
      down the gendisk.  Switch to clearing the flag early and under bd_mutex
      so that callers can use bd_mutex to stabilize the flag, which removes
      the need for the global mutex.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Ming Lei <ming.lei@redhat.com>
      Link: https://lore.kernel.org/r/20210514131842.1600568-2-hch@lst.de
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • Merge tag 'nvme-5.13-2021-05-20' of git://git.infradead.org/nvme into block-5.13 · 9a66e6bd
      Jens Axboe authored
      Pull NVMe fixes from Christoph:
      
      "nvme fixes for Linux 5.13:
      
       - nvme-tcp corruption and timeout fixes (Sagi Grimberg, Keith Busch)
       - nvme-fc teardown fix (James Smart)
       - nvmet/nvme-loop memory leak fixes (Wu Bo)"
      
      * tag 'nvme-5.13-2021-05-20' of git://git.infradead.org/nvme:
        nvme-fc: clear q_live at beginning of association teardown
        nvme-tcp: rerun io_work if req_list is not empty
        nvme-tcp: fix possible use-after-completion
        nvme-loop: fix memory leak in nvme_loop_create_ctrl()
        nvmet: fix memory leak in nvmet_alloc_ctrl()
    • btrfs: zoned: fix parallel compressed writes · 764c7c9a
      Johannes Thumshirn authored
      When multiple processes write data to the same block group on a
      compressed zoned filesystem, the underlying device could report I/O
      errors and data corruption is possible.
      
      This happens because on a zoned file system, compressed data writes
      were sent to the device via a REQ_OP_WRITE instead of a
      REQ_OP_ZONE_APPEND operation. But with REQ_OP_WRITE and parallel
      submission it cannot be guaranteed that the data is always submitted
      aligned to the underlying zone's write pointer.
      
      The change to using REQ_OP_ZONE_APPEND instead of REQ_OP_WRITE on a
      zoned filesystem is non-intrusive on a regular file system or when
      submitting to a conventional zone on a zoned filesystem, as it is
      guarded by btrfs_use_zone_append.
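
      The difference between the two operations can be sketched with a toy
      zone model.  `struct zone`, `zone_write()` and `zone_append()` are
      hypothetical and only mimic the write-pointer rule: a regular write must
      land exactly on the write pointer, while an append lets the device pick
      the location and report it back.

```c
#include <assert.h>

struct zone {
    unsigned long wp;        /* write pointer: next writable offset */
    unsigned long capacity;
};

/* Regular write: rejected unless it starts exactly at the write pointer. */
static int zone_write(struct zone *z, unsigned long off, unsigned long len)
{
    if (off != z->wp || len > z->capacity - z->wp)
        return -1;
    z->wp += len;
    return 0;
}

/* Zone append: the zone chooses the offset and returns it, or -1 if full. */
static long zone_append(struct zone *z, unsigned long len)
{
    unsigned long off;

    if (len > z->capacity - z->wp)
        return -1;
    off = z->wp;
    z->wp += len;
    return (long)off;
}
```

      If two submitters reserve offsets 0 and 8 but the second one reaches the
      zone first, `zone_write()` fails, whereas `zone_append()` succeeds in
      either order.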
      Reported-by: David Sterba <dsterba@suse.com>
      Fixes: 9d294a68 ("btrfs: zoned: enable to mount ZONED incompat flag")
      CC: stable@vger.kernel.org # 5.12.x: e380adfc: btrfs: zoned: pass start block to btrfs_use_zone_append
      CC: stable@vger.kernel.org # 5.12.x
      Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Signed-off-by: David Sterba <dsterba@suse.com>