1. 30 Jun, 2022 3 commits
  2. 29 Jun, 2022 4 commits
    • nvme: fix regression when disconnect a recovering ctrl · f7f70f4a
      Ruozhu Li authored
      We encountered a hang in the disconnect command.
      After analyzing the log and stack, we found that the triggering
      sequence is as follows:
      CPU0                          CPU1
                                      nvme_rdma_error_recovery_work
                                        nvme_rdma_teardown_io_queues
      nvme_do_delete_ctrl                 nvme_stop_queues
        nvme_remove_namespaces
        --clear ctrl->namespaces
                                          nvme_start_queues
                                          --no ns in ctrl->namespaces
          nvme_ns_remove                  return(because ctrl is deleting)
            blk_freeze_queue
              blk_mq_freeze_queue_wait
              --wait for ns to unquiesce to clean inflight IO, hang forever
      
      This problem did not occur in older kernels because the err work was
      flushed in nvme_stop_ctrl before nvme_remove_namespaces. The callout
      does not seem to have been removed for functional reasons, so the
      patch can be reverted to solve the problem.
      
      Revert commit 794a4cb3 ("nvme: remove the .stop_ctrl callout")
      Signed-off-by: Ruozhu Li <liruozhu@huawei.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA XPG SX6000LNP (AKA SPECTRIX S40G) · 1629de0e
      Pablo Greco authored
      ADATA XPG SPECTRIX S40G drives report bogus eui64 values that appear to
      be the same across drives in one system. Quirk them out so they are
      not marked as "non globally unique" duplicates.
      
      Before:
      [    2.258919] nvme nvme1: pci function 0000:06:00.0
      [    2.264898] nvme nvme2: pci function 0000:05:00.0
      [    2.323235] nvme nvme1: failed to set APST feature (2)
      [    2.326153] nvme nvme2: failed to set APST feature (2)
      [    2.333935] nvme nvme1: allocated 64 MiB host memory buffer.
      [    2.336492] nvme nvme2: allocated 64 MiB host memory buffer.
      [    2.339611] nvme nvme1: 7/0/0 default/read/poll queues
      [    2.341805] nvme nvme2: 7/0/0 default/read/poll queues
      [    2.346114]  nvme1n1: p1
      [    2.347197] nvme nvme2: globally duplicate IDs for nsid 1
      After:
      [    2.427715] nvme nvme1: pci function 0000:06:00.0
      [    2.427771] nvme nvme2: pci function 0000:05:00.0
      [    2.488154] nvme nvme2: failed to set APST feature (2)
      [    2.489895] nvme nvme1: failed to set APST feature (2)
      [    2.498773] nvme nvme2: allocated 64 MiB host memory buffer.
      [    2.500587] nvme nvme1: allocated 64 MiB host memory buffer.
      [    2.504113] nvme nvme2: 7/0/0 default/read/poll queues
      [    2.507026] nvme nvme1: 7/0/0 default/read/poll queues
      [    2.509467] nvme nvme2: Ignoring bogus Namespace Identifiers
      [    2.512804] nvme nvme1: Ignoring bogus Namespace Identifiers
      [    2.513698]  nvme1n1: p1
      Signed-off-by: Pablo Greco <pgreco@centosproject.org>
      Reviewed-by: Keith Busch <kbusch@kernel.org>
      Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvme-tcp: always fail a request when sending it failed · 41d07df7
      Sagi Grimberg authored
      Queue stoppage and inflight request cancellation are fully fenced from
      io_work, and thus from failing a request from this context. Hence we
      don't need to guess from the socket retcode whether this failure is
      because the queue is about to be torn down.

      It is perfectly safe to just fail it; the request will not be cancelled
      later on.

      This resolves potentially very long shutdown delays when the user
      issues 'nvme disconnect-all'.
      Reported-by: Daniel Wagner <dwagner@suse.de>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • nvmet-tcp: fix regression in data_digest calculation · ed0691cf
      Sagi Grimberg authored
      Data digest calculation iterates over the command's mapped iovec.
      However, since commit bac04454 we unmap the iovec before we handle the
      data digest, and since commit 69b85e1f we clear nr_mapped when we unmap
      the iov.
      
      Instead of open-coding the command iov traversal, simply call
      crypto_ahash_digest with the command sg that is already allocated (we
      already do that for the send path). Rename nvmet_tcp_send_ddgst to
      nvmet_tcp_calc_ddgst and call it from send and recv paths.
      
      Fixes: 69b85e1f ("nvmet-tcp: add an helper to free the cmd buffers")
      Fixes: bac04454 ("nvmet-tcp: fix kmap leak when data digest in use")
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
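The idea in the commit above, computing the data digest over the command's scatter/gather list instead of walking a mapped iovec, can be illustrated with a small userspace sketch. This is not the kernel code: `struct seg`, `crc32c_update`, `calc_ddgst` and `ddgst_selftest` are hypothetical names, and the sketch assumes the digest is CRC32C, as NVMe/TCP specifies for data digests. It only shows that a digest accumulated segment by segment equals the digest of the same bytes laid out contiguously, so it does not depend on how the buffer was mapped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one scatter/gather entry (not struct scatterlist). */
struct seg {
	const void *buf;
	size_t len;
};

/* Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t crc32c_update(uint32_t crc, const void *data, size_t len)
{
	const uint8_t *p = data;

	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ (0x82F63B78u & -(crc & 1u));
	}
	return crc;
}

/* Accumulate the digest across all segments, then apply the final XOR. */
static uint32_t calc_ddgst(const struct seg *sg, size_t nents)
{
	uint32_t crc = 0xFFFFFFFFu;

	for (size_t i = 0; i < nents; i++)
		crc = crc32c_update(crc, sg[i].buf, sg[i].len);
	return crc ^ 0xFFFFFFFFu;
}

/* Split and contiguous layouts of the same payload give the same digest. */
void ddgst_selftest(void)
{
	struct seg split[] = { { "1234", 4 }, { "56789", 5 } };
	struct seg whole[] = { { "123456789", 9 } };

	assert(calc_ddgst(split, 2) == calc_ddgst(whole, 1));
	/* 0xE3069283 is the standard CRC32C check value for "123456789". */
	assert(calc_ddgst(whole, 1) == 0xE3069283u);
}
```

Because one helper takes only the segment list, the same routine can serve both the send and the receive path, which is the shape the commit describes for nvmet_tcp_calc_ddgst.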
  3. 25 Jun, 2022 1 commit
  4. 23 Jun, 2022 5 commits
  5. 21 Jun, 2022 1 commit
  6. 20 Jun, 2022 1 commit
  7. 17 Jun, 2022 4 commits
  8. 16 Jun, 2022 5 commits
  9. 15 Jun, 2022 4 commits
    • Merge branch 'md-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into block-5.19 · 04cb45b4
      Jens Axboe authored
      Pull MD fixes from Song.
      
      * 'md-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
        md/raid5-ppl: Fix argument order in bio_alloc_bioset()
        Revert "md: don't unregister sync_thread with reconfig_mutex held"
    • md/raid5-ppl: Fix argument order in bio_alloc_bioset() · f34fdcd4
      Logan Gunthorpe authored
      bio_alloc_bioset() takes a block device, number of vectors, the
      OP flags, the GFP mask and the bio set. However, when the prototype
      was changed, the call site in ppl_do_flush() had the OP flags and
      the GFP flags reversed. This introduced the following sparse warnings:
      
        drivers/md/raid5-ppl.c:632:57: warning: incorrect type in argument 3
      				    (different base types)
        drivers/md/raid5-ppl.c:632:57:    expected unsigned int opf
        drivers/md/raid5-ppl.c:632:57:    got restricted gfp_t [usertype]
        drivers/md/raid5-ppl.c:633:61: warning: incorrect type in argument 4
        				    (different base types)
        drivers/md/raid5-ppl.c:633:61:    expected restricted gfp_t [usertype]
      				    gfp_mask
        drivers/md/raid5-ppl.c:633:61:    got unsigned long long
      
      The sparse error introduction may not have been reported correctly by
      0day due to other work that was cleaning up other sparse errors in this
      area.
      
      Fixes: 609be106 ("block: pass a block_device and opf to bio_alloc_bioset")
      Cc: stable@vger.kernel.org # 5.18+
      Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Song Liu <song@kernel.org>
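The class of bug fixed above, two integer-like flag arguments passed in swapped order, compiles silently in plain C and is only caught by sparse through its type annotations. As a rough illustration of why distinct types help, the sketch below wraps each flag kind in its own single-member struct so that the compiler itself rejects a swap. Everything here is hypothetical (`opf_t`, `gfp_mask_t`, `bio_alloc_sketch`, the `_SKETCH` constants and their values); it is not the kernel's bio_alloc_bioset() and omits the block device and bio set arguments.

```c
#include <assert.h>

/* Hypothetical distinct wrapper types; the kernel relies on sparse instead. */
typedef struct { unsigned int v; } opf_t;
typedef struct { unsigned int v; } gfp_mask_t;

/* Made-up flag values, for illustration only. */
#define OPF_WRITE_SKETCH ((opf_t){ 0x1 })
#define GFP_NOIO_SKETCH  ((gfp_mask_t){ 0x10 })

struct bio_sketch {
	opf_t opf;
	gfp_mask_t gfp;
	unsigned short nr_vecs;
};

/* Argument order mirrors the commit text: vector count, OP flags, GFP mask. */
static struct bio_sketch bio_alloc_sketch(unsigned short nr_vecs,
					  opf_t opf, gfp_mask_t gfp)
{
	struct bio_sketch bio = { .opf = opf, .gfp = gfp, .nr_vecs = nr_vecs };

	return bio;
}

/*
 * Correct order compiles:
 *   bio_alloc_sketch(1, OPF_WRITE_SKETCH, GFP_NOIO_SKETCH);
 * Swapped order is a compile error (incompatible struct types):
 *   bio_alloc_sketch(1, GFP_NOIO_SKETCH, OPF_WRITE_SKETCH);
 */
```

With plain `unsigned int` parameters, both calls in the closing comment would compile, which is how the reversed arguments at the ppl_do_flush() call site went unnoticed by the compiler.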
    • Revert "md: don't unregister sync_thread with reconfig_mutex held" · d0a18034
      Guoqing Jiang authored
      The 07reshape5intr test is broken because of the path below.
      
          md_reap_sync_thread
                  -> mddev_unlock
                  -> md_unregister_thread(&mddev->sync_thread)
      
      And md_check_recovery is triggered by,
      
      mddev_unlock -> md_wakeup_thread(mddev->thread)
      
      then mddev->reshape_position is set to MaxSector in raid5_finish_reshape
      since MD_RECOVERY_INTR is cleared in md_check_recovery, which means
      feature_map is not set with MD_FEATURE_RESHAPE_ACTIVE and superblock's
      reshape_position can't be updated accordingly.
      
      Fixes: 8b48ec23 ("md: don't unregister sync_thread with reconfig_mutex held")
      Reported-by: Logan Gunthorpe <logang@deltatee.com>
      Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
      Signed-off-by: Song Liu <song@kernel.org>
    • Merge tag 'nvme-5.19-2022-06-15' of git://git.infradead.org/nvme into block-5.19 · 2396e958
      Jens Axboe authored
      Pull NVMe fixes from Christoph:
      
      "nvme fixes for Linux 5.19
      
       - quirks, quirks, quirks to work around buggy consumer grade devices
         (Keith Busch, Ning Wang, Stefan Reiter, Rasheed Hsueh)
       - better kernel messages for devices that need quirking (Keith Busch)
       - make a kernel message more useful (Thomas Weißschuh)"
      
      * tag 'nvme-5.19-2022-06-15' of git://git.infradead.org/nvme:
        nvme-pci: disable write zeros support on UMIC and Samsung SSDs
        nvme-pci: avoid the deepest sleep state on ZHITAI TiPro7000 SSDs
        nvme-pci: sk hynix p31 has bogus namespace ids
        nvme-pci: smi has bogus namespace ids
        nvme-pci: phison e12 has bogus namespace ids
        nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA XPG GAMMIX S50
        nvme-pci: add trouble shooting steps for timeouts
        nvme: add bug report info for global duplicate id
        nvme: add device name to warning in uuid_show()
  10. 13 Jun, 2022 9 commits
  11. 12 Jun, 2022 3 commits