1. 24 Jan, 2019 5 commits
    • nvme-multipath: drop optimization for static ANA group IDs · 78a61cd4
      Hannes Reinecke authored
      Bit 6 in the ANACAP field is used to indicate that the ANA group ID
      doesn't change while the namespace is attached to the controller.
      There is an optimisation in the code to only allocate space
      for the ANA group header, as the namespace list won't change and
      hence would not need to be refreshed.
      However, this optimisation was never carried over to the actual
      workflow, which always assumes that the buffer is large enough
      to hold the ANA header _and_ the namespace list.
      So drop this optimisation and always allocate enough space.
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
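The sizing rule described in the commit above (always reserve room for the ANA header and the namespace list) can be illustrated with a minimal sketch. The struct and function names here are hypothetical stand-ins, not the driver's own; the layouts follow the ANA log page format of the NVMe specification (16-byte header, 32-byte group descriptor, 4-byte namespace IDs).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the ANA log page header (16 bytes). */
struct ana_hdr {
    uint64_t chgcnt;    /* change count */
    uint16_t ngrps;     /* number of group descriptors that follow */
    uint8_t  rsvd[6];
};

/* Hypothetical mirror of one ANA group descriptor (32 bytes), without its NSID list. */
struct ana_group_desc {
    uint32_t grpid;     /* ANA group ID */
    uint32_t nnsids;    /* number of NSIDs listed after this descriptor */
    uint64_t chgcnt;
    uint8_t  state;
    uint8_t  rsvd[15];
};

/*
 * Worst-case buffer size: the header, one descriptor per ANA group, and
 * one 32-bit NSID entry for every namespace the controller may expose.
 */
static size_t ana_log_size(uint32_t nr_groups, uint32_t max_namespaces)
{
    assert(sizeof(struct ana_hdr) == 16);
    assert(sizeof(struct ana_group_desc) == 32);

    return sizeof(struct ana_hdr)
         + (size_t)nr_groups * sizeof(struct ana_group_desc)
         + (size_t)max_namespaces * sizeof(uint32_t);
}
```

Under these assumptions, a controller reporting 4 ANA groups and up to 1024 namespaces would need 16 + 4 * 32 + 1024 * 4 = 4240 bytes.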
    • nvmet-rdma: fix null dereference under heavy load · 5cbab630
      Raju Rangoju authored
      Under heavy load, if no pre-allocated rsps are left, we dynamically
      allocate a rsp, but we do not actually allocate memory for
      nvme_completion (rsp->req.rsp). In that case, accessing pointer
      fields (req->rsp->status) in nvmet_req_init() will result in a crash.
      
      To fix this, allocate the memory for nvme_completion by calling
      nvmet_rdma_alloc_rsp().
      
      Fixes: 8407879c ("nvmet-rdma: fix possible bogus dereference under heavy load")
      
      Cc: <stable@vger.kernel.org>
      Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Raju Rangoju <rajur@chelsio.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • nvme-rdma: rework queue maps handling · b1064d3e
      Sagi Grimberg authored
      If the device supports fewer queues than requested (if the device has
      fewer completion vectors), we might hit a bug because
      nvme_rdma_map_queues ignores that limit (we override the maps'
      nr_queues with the user opts).
      
      Instead, keep track of how many default/read/poll queues we actually
      allocated (rather than how many the user asked for) and use that to
      assign our queue mappings.
      
      Fixes: b65bb777 ("nvme-rdma: support separate queue maps for read and write")
      Reported-by: Saleem, Shiraz <shiraz.saleem@intel.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
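The idea in the commit above, recording how many queues of each type were actually granted instead of reusing the user's requested counts, might look roughly like the following sketch. All names and the exact split policy are assumptions made for illustration; the driver's own accounting may differ in detail.

```c
#include <assert.h>

/* Hypothetical queue types, loosely modelled on default/read/poll maps. */
enum { Q_DEFAULT, Q_READ, Q_POLL, Q_TYPES };

/* What the user asked for (hypothetical option names). */
struct queue_opts {
    unsigned nr_io_queues;
    unsigned nr_write_queues;
    unsigned nr_poll_queues;
};

static unsigned min_u(unsigned a, unsigned b)
{
    return a < b ? a : b;
}

/*
 * Split `available` queues between the queue types and record the result
 * in granted[]. Queue mappings should later be built from granted[], never
 * from the requested counts. Returns the total number of queues granted.
 */
static unsigned grant_queues(const struct queue_opts *o, unsigned available,
                             unsigned granted[Q_TYPES])
{
    unsigned left = available;

    assert(o && granted);
    granted[Q_DEFAULT] = granted[Q_READ] = granted[Q_POLL] = 0;

    if (o->nr_write_queues && o->nr_io_queues < left) {
        /* Enough room for dedicated read queues plus default (write) queues. */
        granted[Q_READ] = o->nr_io_queues;
        left -= granted[Q_READ];
        granted[Q_DEFAULT] = min_u(o->nr_write_queues, left);
        left -= granted[Q_DEFAULT];
    } else {
        /* Reads and writes share the default queues. */
        granted[Q_DEFAULT] = min_u(o->nr_io_queues, left);
        left -= granted[Q_DEFAULT];
    }

    granted[Q_POLL] = min_u(o->nr_poll_queues, left);
    left -= granted[Q_POLL];

    return available - left;
}
```

With a request of 4 I/O, 2 write and 1 poll queues but only 3 queues available, this sketch grants 3 default queues and nothing else, so a mapping built from `granted[]` never refers to a queue that was not allocated.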
    • nvme-tcp: fix timeout handler · 39d57757
      Sagi Grimberg authored
      Currently, we have several problems with the timeout
      handler:
      1. If we time out on the controller establishment flow, we will hang
      because we don't execute the error recovery (and we shouldn't because
      the create_ctrl flow needs to fail and clean up on its own)
      2. We might also hang if we get a disconnect on a queue while the
      controller is already deleting. This racy flow can cause the controller
      disable/shutdown admin command to hang.
      
      We cannot complete a timed out request from the timeout handler without
      mutual exclusion from the teardown flow (e.g. nvme_rdma_error_recovery_work).
      So we serialize it in the timeout handler and tear down the I/O and admin
      queues to guarantee that no one races with us in completing the
      request.
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • nvme-rdma: fix timeout handler · 4c174e63
      Sagi Grimberg authored
      Currently, we have several problems with the timeout
      handler:
      1. If we time out on the controller establishment flow, we will hang
      because we don't execute the error recovery (and we shouldn't because
      the create_ctrl flow needs to fail and clean up on its own)
      2. We might also hang if we get a disconnect on a queue while the
      controller is already deleting. This racy flow can cause the controller
      disable/shutdown admin command to hang.
      
      We cannot complete a timed out request from the timeout handler without
      mutual exclusion from the teardown flow (e.g. nvme_rdma_error_recovery_work).
      So we serialize it in the timeout handler and tear down the I/O and admin
      queues to guarantee that no one races with us in completing the
      request.
      Reported-by: Jaesoo Lee <jalee@purestorage.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
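The two timeout-handler commits above describe the same decision: when the controller is not in normal operation, the handler itself tears down the queues so the request can be completed safely. The sketch below models that decision with hypothetical types and helpers. The branch for a live controller (hand off to error recovery and re-arm the timer) is an assumption about how the normal path behaves and is not stated in the commit messages.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical controller states and timeout-handler results. */
enum ctrl_state { CTRL_CONNECTING, CTRL_LIVE, CTRL_DELETING };
enum eh_result { EH_DONE, EH_RESET_TIMER };

struct ctrl {
    enum ctrl_state state;
    bool queues_torn_down;      /* set by the stand-in teardown helper */
    bool recovery_scheduled;    /* set by the stand-in recovery helper */
};

/* Stand-in for tearing down the I/O and admin queues synchronously. */
static void teardown_queues(struct ctrl *c)
{
    c->queues_torn_down = true;
}

/* Stand-in for kicking off asynchronous error recovery. */
static void schedule_error_recovery(struct ctrl *c)
{
    c->recovery_scheduled = true;
}

static enum eh_result handle_timeout(struct ctrl *c)
{
    assert(c);

    if (c->state != CTRL_LIVE) {
        /*
         * Connecting or deleting: error recovery will not run for us, so
         * tear the queues down here; nothing can then race with us while
         * the timed-out request is completed.
         */
        teardown_queues(c);
        return EH_DONE;
    }

    /* Assumed normal path: let error recovery complete the request. */
    schedule_error_recovery(c);
    return EH_RESET_TIMER;
}
```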
  2. 22 Jan, 2019 3 commits
  3. 18 Jan, 2019 1 commit
    • block: Cleanup license notice · 38197ca1
      Thomas Gleixner authored
      Remove the imprecise and sloppy:
      
        "This files is licensed under the GPL."
      
      license notice in the top level comment.
      
      1) The file already contains an SPDX license identifier which clearly
         states that the license of the file is GPL V2 only
      
      2) The notice resolves to GPL v1 or later for scanners, which is
         contrary to the intent of SPDX identifiers to provide clear and
         unambiguous license information. Aside from that, the value add of
         this notice is below zero.
      
      Cc: Damien Le Moal <damien.lemoal@wdc.com>
      Cc: Matias Bjorling <mb@lightnvm.io>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: linux-block@vger.kernel.org
      Fixes: 6a5ac984 ("block: Make struct request_queue smaller for CONFIG_BLK_DEV_ZONED=n")
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  4. 16 Jan, 2019 3 commits
  5. 15 Jan, 2019 2 commits
    • blockdev: Fix livelocks on loop device · 04906b2f
      Jan Kara authored
      bd_set_size() also updates the block device's block size. This is
      somewhat unexpected given its name, and at this point only
      blkdev_open() uses this functionality. Furthermore, this can result in
      the block size changing under a filesystem mounted on a loop device,
      which leads to livelocks inside __getblk_gfp() like:
      
      Sending NMI from CPU 0 to CPUs 1:
      NMI backtrace for cpu 1
      CPU: 1 PID: 10863 Comm: syz-executor0 Not tainted 4.18.0-rc5+ #151
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:__sanitizer_cov_trace_pc+0x3f/0x50 kernel/kcov.c:106
      ...
      Call Trace:
       init_page_buffers+0x3e2/0x530 fs/buffer.c:904
       grow_dev_page fs/buffer.c:947 [inline]
       grow_buffers fs/buffer.c:1009 [inline]
       __getblk_slow fs/buffer.c:1036 [inline]
       __getblk_gfp+0x906/0xb10 fs/buffer.c:1313
       __bread_gfp+0x2d/0x310 fs/buffer.c:1347
       sb_bread include/linux/buffer_head.h:307 [inline]
       fat12_ent_bread+0x14e/0x3d0 fs/fat/fatent.c:75
       fat_ent_read_block fs/fat/fatent.c:441 [inline]
       fat_alloc_clusters+0x8ce/0x16e0 fs/fat/fatent.c:489
       fat_add_cluster+0x7a/0x150 fs/fat/inode.c:101
       __fat_get_block fs/fat/inode.c:148 [inline]
      ...
      
      Trivial reproducer for the problem looks like:
      
      truncate -s 1G /tmp/image
      losetup /dev/loop0 /tmp/image
      mkfs.ext4 -b 1024 /dev/loop0
      mount -t ext4 /dev/loop0 /mnt
      losetup -c /dev/loop0
      ls /mnt
      
      Fix the problem by moving the initialization of a block device's block
      size into a separate function and calling it when needed.
      
      Thanks to Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> for help with
      debugging the problem.
      
      Reported-by: syzbot+9933e4476f365f5d5a1b@syzkaller.appspotmail.com
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • nbd: Use set_blocksize() to set device blocksize · c8a83a6b
      Jan Kara authored
      NBD can update the block device's block size implicitly through
      bd_set_size(). Make it set the block size explicitly with
      set_blocksize(), as this behavior of bd_set_size() is going away.
      
      CC: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  6. 14 Jan, 2019 2 commits
  7. 11 Jan, 2019 5 commits
    • ata: ahci: mvebu: request PHY suspend/resume for Armada 3700 · bde0b5c1
      Miquel Raynal authored
      A feature has been added in the libahci driver: the possibility to set
      a new flag in hpriv->flags to let the core handle PHY suspend/resume
      automatically. Make use of this feature to make suspend to RAM work
      with SATA drives on A3700.
      Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • ata: ahci: mvebu: add Armada 3700 initialization needed for S2RAM · 2f558bc3
      Miquel Raynal authored
      A3700 comphy initialization is done in the firmware (TF-A). Looking at
      the SATA PHY initialization routine, there is a comment about "vendor
      specific" registers. Two registers are mentioned. They are not
      initialized there in the firmware because they are AHCI related, while
      the firmware at this location only does PHY configuration. So far, the
      solution to avoid doing this initialization has been to rely on U-Boot.
      
      While this works at boot time, U-Boot is definitely not going to run
      during a resume after suspending to RAM.
      
      Two possible solutions were considered:
      * Fixing the firmware.
      * Fixing the kernel driver.
      
      The first solution would take ages to propagate, while the second
      solution is easy to implement as the driver has been reworked a
      little to prepare for such platform configuration. Hence, this patch
      adds an Armada 3700 configuration function to set these two registers
      both at boot time (in the probe) and after a suspend (in the resume
      path).
      Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • ata: ahci: mvebu: do Armada 38x configuration only on relevant SoCs · 96dbcb40
      Miquel Raynal authored
      At the beginning, only Armada 38x SoCs were supported by the
      ahci_mvebu.c driver. Commit 15d3ce7b ("ata: ahci_mvebu: add
      support for Armada 3700 variant") introduced Armada 3700 support. As
      opposed to Armada 38x SoCs, the 3700 variants do not have to configure
      mbus and the regret option. That commit took care of avoiding such
      configuration when not needed in the probe function, but failed to do
      the same in the resume path. While doing so looks harmless in
      practice, let's clean up the driver logic and avoid doing this useless
      configuration on Armada 3700 SoCs.
      
      Because the logic is very similar in these two places, this code is
      factored out into an "Armada 38x configuration function". This
      function is part of a new (per-compatible) platform data structure,
      which will make it easier to add such a configuration function for
      Armada 3700.
      
      Fixes: 15d3ce7b ("ata: ahci_mvebu: add support for Armada 3700 variant")
      Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • ata: ahci: mvebu: remove stale comment · c9bc1367
      Miquel Raynal authored
      For Armada-38x (32-bit) SoCs, PM platform support has been added since:
      commit 32f9494c ("ARM: mvebu: prepare pm-board.c for the
                            introduction of Armada 38x support")
      commit 3cbd6a6c ("ARM: mvebu: Add standby support")
      
      For Armada 64-bit SoCs, like the A3700 also using this AHCI driver, PM
      platform support has always existed.
      
      There are even suspend/resume hooks in this driver since:
      commit d6ecf158 ("ata: ahci_mvebu: add suspend/resume support")
      
      Remove the stale comment at the end of this driver stating that all
      the above does not exist yet.
      
      Fixes: d6ecf158 ("ata: ahci_mvebu: add suspend/resume support")
      Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • ata: libahci_platform: comply to PHY framework · 49e54187
      Miquel Raynal authored
      The current implementation of libahci does not take into account the
      new PHY framework. Correct the situation by adding a call to
      phy_set_mode() before phy_power_on().
      
      PHYs should also be handled at suspend/resume time. For this, call
      ahci_platform_enable/disable_phys() at suspend/resume_host() time. These
      calls are guarded by a HFLAG (AHCI_HFLAG_SUSPEND_PHYS) that the user of
      the libahci driver must set manually in hpriv->flags at probe time. This
      is to avoid breaking users that have not been tested with this change.
      Reviewed-by: Hans de Goede <hdegoede@redhat.com>
      Suggested-by: Grzegorz Jaszczyk <jaz@semihalf.com>
      Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
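The opt-in behaviour described in the commit above (PHYs are only touched at suspend/resume when the platform driver set the flag at probe time) can be sketched as follows. The struct, helpers, and the flag's bit value are hypothetical stand-ins for illustration; only the flag's purpose is taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for AHCI_HFLAG_SUSPEND_PHYS; the bit value here is made up. */
#define HFLAG_SUSPEND_PHYS (1u << 0)

/* Hypothetical, heavily reduced host private data. */
struct host_priv {
    unsigned flags;       /* set by the platform driver at probe time */
    bool phys_enabled;    /* models the PHY power state */
};

static void disable_phys(struct host_priv *h) { h->phys_enabled = false; }
static void enable_phys(struct host_priv *h)  { h->phys_enabled = true; }

/* Only drivers that opted in have their PHYs powered down on suspend. */
static void suspend_host(struct host_priv *h)
{
    assert(h);
    if (h->flags & HFLAG_SUSPEND_PHYS)
        disable_phys(h);
}

/* Mirror of suspend_host(): re-enable the PHYs for opted-in drivers. */
static void resume_host(struct host_priv *h)
{
    assert(h);
    if (h->flags & HFLAG_SUSPEND_PHYS)
        enable_phys(h);
}
```

A driver that never sets the flag sees no change in behaviour, which is the compatibility property the commit message asks for.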
  8. 10 Jan, 2019 2 commits
    • Merge branch 'nvme-5.0' of git://git.infradead.org/nvme into for-linus · a39c330d
      Jens Axboe authored
      Pull NVMe fixes from Christoph.
      
      * 'nvme-5.0' of git://git.infradead.org/nvme:
        nvme: don't initlialize ctrl->cntlid twice
        nvme: introduce NVME_QUIRK_IGNORE_DEV_SUBNQN
        nvme: pad fake subsys NQN vid and ssvid with zeros
        nvme-multipath: zero out ANA log buffer
        nvme-fabrics: unset write/poll queues for discovery controllers
        nvme-tcp: don't ask if controller is fabrics
        nvme-tcp: remove dead code
        nvme-pci: fix out of bounds access in nvme_cqe_pending
        nvme-pci: rerun irq setup on IO queue init errors
        nvme-pci: use the same attributes when freeing host_mem_desc_bufs.
        nvme-pci: fix the wrong setting of nr_maps
    • loop: drop caches if offset or block_size are changed · 5db470e2
      Jaegeuk Kim authored
      If we don't drop the caches used with the old offset or block_size, we
      can read old data at the new offset/block_size, which returns
      unexpected data to the user.
      
      For example, Martijn found a loopback bug in the scenario below.
      1) LOOP_SET_FD loads the first two pages of the loop file
      2) LOOP_SET_STATUS64 changes the offset on the loop file
      3) mount fails because the cached pages contain the wrong superblock
      
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: linux-block@vger.kernel.org
      Reported-by: Martijn Coenen <maco@google.com>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
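The stale-data scenario in the commit above can be reproduced in miniature with a toy model. Everything below is hypothetical (a one-byte "page", a four-entry cache, made-up function names) and only illustrates why cached pages must be dropped when the offset changes; it is not how the loop driver is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NPAGES 4

/* Toy loop device: each "page" is a single byte of the backing file. */
struct toy_loop {
    const unsigned char *backing;   /* backing file contents */
    size_t offset;                  /* offset into the backing file, in pages */
    bool cached[NPAGES];            /* which pages are currently cached */
    unsigned char cache[NPAGES];    /* cached page contents */
};

/* Read a page through the cache, filling it from the backing file on a miss. */
static unsigned char toy_read(struct toy_loop *lo, size_t page)
{
    assert(page < NPAGES);
    if (!lo->cached[page]) {
        lo->cache[page] = lo->backing[lo->offset + page];
        lo->cached[page] = true;
    }
    return lo->cache[page];
}

/* Change the offset; with drop_caches set, forget pages read at the old one. */
static void toy_set_offset(struct toy_loop *lo, size_t offset, bool drop_caches)
{
    if (drop_caches && offset != lo->offset)
        memset(lo->cached, 0, sizeof(lo->cached));
    lo->offset = offset;
}
```

In this model, reading page 0, moving the offset without dropping caches, and reading page 0 again returns the byte from the old offset, which mirrors the wrong-superblock failure reported in the commit message.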
  9. 09 Jan, 2019 14 commits
  10. 06 Jan, 2019 1 commit
  11. 05 Jan, 2019 1 commit
  12. 04 Jan, 2019 1 commit