  1. 26 Apr, 2022 5 commits
  2. 23 Feb, 2022 1 commit
  3. 03 Dec, 2021 2 commits
  4. 29 Nov, 2021 1 commit
  5. 22 Sep, 2021 1 commit
  6. 14 Sep, 2021 1 commit
  7. 12 Aug, 2021 1 commit
  8. 01 Jun, 2021 2 commits
  9. 19 Mar, 2021 1 commit
  10. 17 Mar, 2021 1 commit
    • scsi: sd_zbc: Update write pointer offset cache · 2db4215f
      Johannes Thumshirn authored
      Recent changes moved the completion of SCSI commands from soft-IRQ
      context to IRQ context. This triggers the following warning when
      completing writes to zoned block devices that go through the zone append
      emulation:
      
       CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.12.0-rc2+ #2
       Hardware name: Supermicro Super Server/X10SRL-F, BIOS 2.0 12/17/2015
       RIP: 0010:__local_bh_disable_ip+0x3f/0x50
       RSP: 0018:ffff8883e1409ba8 EFLAGS: 00010006
       RAX: 0000000080010001 RBX: 0000000000000001 RCX: 0000000000000013
       RDX: ffff888129e4d200 RSI: 0000000000000201 RDI: ffffffff915b9dbd
       RBP: ffff888113e9a540 R08: ffff888113e9a540 R09: 00000000000077f0
       R10: 0000000000080000 R11: 0000000000000001 R12: ffff888129e4d200
       R13: 0000000000001000 R14: 00000000000077f0 R15: ffff888129e4d218
       FS:  0000000000000000(0000) GS:ffff8883e1400000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 00007f2f8418ebc0 CR3: 000000021202a006 CR4: 00000000001706f0
       Call Trace:
        <IRQ>
        _raw_spin_lock_bh+0x18/0x40
        sd_zbc_complete+0x43d/0x1150
        sd_done+0x631/0x1040
        ? mark_lock+0xe4/0x2fd0
        ? provisioning_mode_store+0x3f0/0x3f0
        scsi_finish_command+0x31b/0x5c0
        _scsih_io_done+0x960/0x29e0 [mpt3sas]
        ? mpt3sas_scsih_scsi_lookup_get+0x1c7/0x340 [mpt3sas]
        ? __lock_acquire+0x166b/0x58b0
        ? _get_st_from_smid+0x4a/0x80 [mpt3sas]
        _base_process_reply_queue+0x23f/0x26e0 [mpt3sas]
        ? lock_is_held_type+0x98/0x110
        ? find_held_lock+0x2c/0x110
        ? mpt3sas_base_sync_reply_irqs+0x360/0x360 [mpt3sas]
        _base_interrupt+0x8d/0xd0 [mpt3sas]
        ? rcu_read_lock_sched_held+0x3f/0x70
        __handle_irq_event_percpu+0x24d/0x600
        handle_irq_event+0xef/0x240
        ? handle_irq_event_percpu+0x110/0x110
        handle_edge_irq+0x1f6/0xb60
        __common_interrupt+0x75/0x160
        common_interrupt+0x7b/0xa0
        </IRQ>
        asm_common_interrupt+0x1e/0x40
      
      Don't use spin_lock_bh() to protect the update of the write pointer offset
      cache, but use spin_lock_irqsave() for it.
      
      Link: https://lore.kernel.org/r/3cfebe48d09db73041b7849be71ffbcec7ee40b3.1615369586.git.johannes.thumshirn@wdc.com
      Fixes: 664f0dce ("scsi: mpt3sas: Add support for shared host tagset for CPU hotplug")
      Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  11. 23 Feb, 2021 1 commit
  12. 10 Feb, 2021 2 commits
    • sd_zbc: clear zone resources for non-zoned case · 78e1663f
      Damien Le Moal authored
      For host-aware ZBC disks, setting the device zoned model to BLK_ZONED_HA
      using blk_queue_set_zoned() in sd_read_block_characteristics() may
      result in the block device effective zoned model being "none"
      (BLK_ZONED_NONE) if partitions are present on the device. In this case,
      sd_zbc_read_zones() should not set up the zone related queue limits for
      the disk, so that the device limits and configuration are consistent with
      a regular disk and resources are not uselessly allocated (e.g. the zone
      write pointer tracking array for zone append emulation).
      
      Furthermore, if the disk zoned model changes at run time due to the
      creation of a partition by the user, the zone related resources can be
      released.
      
      Fix both problems by introducing the function sd_zbc_clear_zone_info()
      to reset the scsi disk zone information and free resources and by
      returning early in sd_zbc_read_zones() for a block device that has a
      zoned model equal to BLK_ZONED_NONE.
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@edc.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: introduce zone_write_granularity limit · a805a4fa
      Damien Le Moal authored
      Per ZBC and ZAC specifications, host-managed SMR hard-disks mandate that
      all writes into sequential write required zones be aligned to the device
      physical block size. However, NVMe ZNS does not have this constraint and
      allows write operations into sequential zones to be aligned to the
      device logical block size. This inconsistency does not help with
      software portability across device types.
      
      To solve this, introduce the zone_write_granularity queue limit to
      indicate the alignment constraint, in bytes, of write operations into
      zones of a zoned block device. This new limit is exported as a
      read-only sysfs queue attribute, and the helper
      blk_queue_zone_write_granularity() is introduced for drivers to set this
      limit.
      
      The function blk_queue_set_zoned() is modified to set this new limit to
      the device logical block size by default. NVMe ZNS devices as well as
      zoned nullb devices use this default value as is. The scsi disk driver
      is modified to execute the blk_queue_zone_write_granularity() helper to
      set the zone write granularity of host-managed SMR disks to the disk
      physical block size.
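
The alignment constraint described above can be illustrated with a small user-space sketch. This is not kernel code: `zone_write_is_aligned()` is a hypothetical helper, and treating both the start offset and the length as subject to the granularity is an assumption made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): returns true when a write
 * into a sequential zone starts and ends on a multiple of the zone write
 * granularity, expressed in bytes.
 */
static bool zone_write_is_aligned(uint64_t offset_bytes, uint64_t len_bytes,
                                  uint32_t granularity)
{
    if (granularity == 0)
        return false; /* no valid limit to check against */

    return offset_bytes % granularity == 0 &&
           len_bytes % granularity == 0;
}
```

With a granularity of 4096 (a typical physical block size for an SMR disk), a 512-byte write would be rejected by this check, while the same write passes with a granularity of 512 (a typical logical block size).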
      
      The accessor functions queue_zone_write_granularity() and
      bdev_zone_write_granularity() are also introduced.
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@edc.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  13. 16 Sep, 2020 2 commits
  14. 05 Aug, 2020 1 commit
    • scsi: sd_zbc: Improve zone revalidation · a3d8a257
      Damien Le Moal authored
      Currently, for zoned disks, since blk_revalidate_disk_zones() requires the
      disk capacity to be set already to operate correctly, zones revalidation
      can only be done on the second revalidate scan once the gendisk capacity is
      set at the end of the first scan. As a result, if zone revalidation fails,
      there is no second chance to recover from the failure and the disk capacity
      is changed to 0, with the disk left unusable.
      
      This can be improved by shuffling around code, specifically, by moving the
      call to sd_zbc_revalidate_zones() from sd_zbc_read_zones() to the end of
      sd_revalidate_disk(), after set_capacity_revalidate_and_notify() is called
      to set the gendisk capacity. With this change, if sd_zbc_revalidate_zones()
      fails on the first scan, the second scan will call it again to recover, if
      possible.
      
      Using the new struct scsi_disk fields rev_nr_zones and rev_zone_blocks,
      sd_zbc_revalidate_zones() does actual work only if it detects a change with
      the disk zone configuration. This means that for a successful zones
      revalidation on the first scan, the second scan will not cause another
      heavy full check.
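
The "do work only on change" idea can be sketched in user space as follows. Only the field names `rev_nr_zones` and `rev_zone_blocks` come from the commit message; the struct, the `full_checks` counter and `revalidate_zones()` are hypothetical stand-ins, not the driver's code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the zone-related state kept per disk. */
struct disk_zone_state {
    uint32_t nr_zones;        /* configuration validated by the last full check */
    uint32_t zone_blocks;
    uint32_t rev_nr_zones;    /* configuration seen by the most recent scan */
    uint32_t rev_zone_blocks;
    unsigned int full_checks; /* counts how often the heavy check ran */
};

/* Returns true if a full check was needed, false if it was skipped. */
static bool revalidate_zones(struct disk_zone_state *d)
{
    if (d->nr_zones == d->rev_nr_zones &&
        d->zone_blocks == d->rev_zone_blocks)
        return false; /* nothing changed: skip the heavy full check */

    d->full_checks++; /* stand-in for the full zone report and checks */
    d->nr_zones = d->rev_nr_zones;
    d->zone_blocks = d->rev_zone_blocks;
    return true;
}
```

In this model, two consecutive scans of an unchanged disk trigger the full check only once.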
      
      While at it, remove the unnecessary "extern" declaration of
      sd_zbc_read_zones().
      
      Link: https://lore.kernel.org/r/20200731054928.668547-1-damien.lemoal@wdc.com
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  15. 25 Jul, 2020 1 commit
  16. 15 Jul, 2020 2 commits
  17. 08 Jul, 2020 2 commits
  18. 02 Jun, 2020 1 commit
    • mm: remove the pgprot argument to __vmalloc · 88dca4ca
      Christoph Hellwig authored
      The pgprot argument to __vmalloc is always PAGE_KERNEL now, so remove it.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Michael Kelley <mikelley@microsoft.com> [hyperv]
      Acked-by: Gao Xiang <xiang@kernel.org> [erofs]
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Wei Liu <wei.liu@kernel.org>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Laura Abbott <labbott@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Sumit Semwal <sumit.semwal@linaro.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Paul Mackerras <paulus@ozlabs.org>
      Cc: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Will Deacon <will@kernel.org>
      Link: http://lkml.kernel.org/r/20200414131348.444715-22-hch@lst.de
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  19. 13 May, 2020 2 commits
    • scsi: sd_zbc: emulate ZONE_APPEND commands · 5795eb44
      Johannes Thumshirn authored
      Emulate ZONE_APPEND for SCSI disks using a regular WRITE(16) command
      with a start LBA set to the target zone write pointer position.
      
      In order to always know the write pointer position of a sequential write
      zone, the write pointer of all zones is tracked using an array of 32-bit
      zone write pointer offsets attached to the scsi disk structure. Each
      entry of the array indicates a zone write pointer position relative to
      the zone start sector. The write pointer offsets are maintained in sync
      with the device as follows:
      1) the write pointer offset of a zone is reset to 0 when a
         REQ_OP_ZONE_RESET command completes.
      2) the write pointer offset of a zone is set to the zone size when a
         REQ_OP_ZONE_FINISH command completes.
      3) the write pointer offset of a zone is incremented by the number of
         512B sectors written when a write, write same or a zone append
         command completes.
      4) the write pointer offset of all zones is reset to 0 when a
         REQ_OP_ZONE_RESET_ALL command completes.
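
The four update rules above can be modelled in user space roughly as follows. This is a simplified sketch, not the driver's code: the enum, `struct wp_cache` and both functions are hypothetical names, all zones are assumed to have the same size, and locking is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the block layer request operations. */
enum zone_op {
    OP_WRITE,          /* write, write same, or emulated zone append */
    OP_ZONE_RESET,
    OP_ZONE_RESET_ALL,
    OP_ZONE_FINISH,
};

/* Hypothetical model of the cache: one 32-bit offset per zone. */
struct wp_cache {
    uint32_t *wp_ofst;     /* in 512 B sectors, relative to the zone start */
    size_t nr_zones;
    uint32_t zone_sectors; /* zone size in 512 B sectors (assumed uniform) */
};

/* Apply the completion of one command to the cache (rules 1 to 4). */
static void wp_cache_complete(struct wp_cache *c, enum zone_op op,
                              size_t zno, uint32_t sectors_written)
{
    switch (op) {
    case OP_ZONE_RESET:     /* rule 1 */
        c->wp_ofst[zno] = 0;
        break;
    case OP_ZONE_FINISH:    /* rule 2 */
        c->wp_ofst[zno] = c->zone_sectors;
        break;
    case OP_WRITE:          /* rule 3 */
        c->wp_ofst[zno] += sectors_written;
        break;
    case OP_ZONE_RESET_ALL: /* rule 4 */
        for (size_t i = 0; i < c->nr_zones; i++)
            c->wp_ofst[i] = 0;
        break;
    }
}

/* Start sector for an emulated zone append: zone start plus cached offset. */
static uint64_t zone_append_sector(const struct wp_cache *c, size_t zno)
{
    return (uint64_t)zno * c->zone_sectors + c->wp_ofst[zno];
}
```

`zone_append_sector()` shows why the cache is needed: the emulation must pick the start LBA of the regular write itself, so it has to know the current write pointer without querying the device.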
      
      Since the block layer does not write lock zones for zone append
      commands, to ensure a sequential ordering of the regular write commands
      used for the emulation, the target zone of a zone append command is
      locked when the function sd_zbc_prepare_zone_append() is called from
      sd_setup_read_write_cmnd(). If the zone write lock cannot be obtained
      (e.g. a zone append is in-flight or a regular write has already locked
      the zone), the zone append command dispatching is delayed by returning
      BLK_STS_ZONE_RESOURCE.
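
The dispatch decision described above can be sketched as a single-threaded user-space model. The names below are hypothetical, `STS_ZONE_RESOURCE` merely stands in for BLK_STS_ZONE_RESOURCE, and a real implementation would need an atomic test-and-set rather than a plain boolean.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical status codes for this model. */
enum dispatch_status {
    STS_OK,
    STS_ZONE_RESOURCE, /* stands in for BLK_STS_ZONE_RESOURCE */
};

/* Hypothetical per-disk table of zone write locks. */
struct zone_locks {
    bool *locked;
    size_t nr_zones;
};

/* Try to take the target zone's write lock before dispatching. */
static enum dispatch_status prepare_zone_append(struct zone_locks *zl,
                                                size_t zno)
{
    if (zl->locked[zno])
        return STS_ZONE_RESOURCE; /* busy zone: delay the dispatch */

    zl->locked[zno] = true;
    return STS_OK;
}

/* Release the zone write lock once the emulated append has completed. */
static void complete_zone_append(struct zone_locks *zl, size_t zno)
{
    zl->locked[zno] = false;
}
```

Holding the lock for the lifetime of the command is what keeps the regular writes used for the emulation sequential within a zone.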
      
      To avoid the need for write locking all zones for REQ_OP_ZONE_RESET_ALL
      requests, use a spinlock to protect accesses and modifications of the
      zone write pointer offsets. This spinlock is initialized from sd_probe()
      using the new function sd_zbc_init().
      Co-developed-by: Damien Le Moal <Damien.LeMoal@wdc.com>
      Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • scsi: sd_zbc: factor out sanity checks for zoned commands · 02494d35
      Johannes Thumshirn authored
      Factor sanity checks for zoned commands from sd_zbc_setup_zone_mgmt_cmnd().
      
      This will help with the introduction of an emulated ZONE_APPEND command.
      Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  20. 24 Feb, 2020 1 commit
    • scsi: sd_sbc: Fix sd_zbc_report_zones() · 51fdaa04
      Damien Le Moal authored
      The block layer generic blk_revalidate_disk_zones() checks the validity of
      zone descriptors reported by a disk using the blk_revalidate_zone_cb()
      callback function executed for each zone descriptor. If a ZBC disk reports
      invalid zone descriptors, blk_revalidate_disk_zones() returns an error and
      sd_zbc_read_zones() changes the disk capacity to 0, which in turn results
      in the gendisk structure capacity being set to 0. This all works well for
      the first revalidate pass on a disk and the block layer detects the
      capacity change.
      
      On the second revalidate pass, blk_revalidate_disk_zones() is called again
      and sd_zbc_report_zones() executed to check the zones a second time.
      However, for this second pass, the gendisk capacity is now 0, which results
      in sd_zbc_report_zones() doing nothing and reporting success and no
      zones. blk_revalidate_disk_zones() in turn returns success and sets the
      disk queue chunk_sectors limit to zero as no zones were checked, causing
      an oops to trigger on the BUG_ON(!is_power_of_2(chunk_sectors)) in
      blk_queue_chunk_sectors().
      
      Fix this by using the sdkp capacity field rather than the gendisk capacity
      for the report zones loop in sd_zbc_report_zones(). Also add a check to
      immediately return an error if the sdkp capacity is 0. With this fix,
      scans of invalid/buggy ZBC disks do not trigger an oops and the disks are
      exposed with a 0 capacity. This change also preserves the chance for the
      disk to be
      correctly revalidated on the second revalidate pass as the scsi disk
      structure capacity field is always set to the disk reported value when
      sd_zbc_report_zones() is called.
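
The shape of the fix can be illustrated with a user-space sketch of the report loop. `count_reported_zones()` is a hypothetical function, uniform zone sizes are assumed, and the choice of -ENODEV as the error value is an assumption for illustration.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical model of the report loop: walk zones from 'start' up to the
 * capacity recorded by the scsi disk layer, not the gendisk capacity.
 * A zero capacity is reported as an error instead of "success, no zones".
 */
static int count_reported_zones(uint64_t sdkp_capacity, uint64_t zone_len,
                                uint64_t start)
{
    int nr = 0;

    if (sdkp_capacity == 0 || zone_len == 0)
        return -ENODEV; /* error code chosen for illustration */

    while (start < sdkp_capacity) {
        nr++;
        if (sdkp_capacity - start <= zone_len)
            break; /* last zone reached */
        start += zone_len;
    }
    return nr;
}
```

Because a zero capacity now yields an error rather than an empty but successful report, the caller never derives a chunk_sectors value of zero from it.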
      
      Link: https://lore.kernel.org/r/20200219063800.880834-1-damien.lemoal@wdc.com
      Fixes: d4100351 ("block: rework zone reporting")
      Cc: <stable@vger.kernel.org> # v5.5
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  21. 03 Jan, 2020 2 commits
  22. 03 Dec, 2019 1 commit
  23. 27 Nov, 2019 1 commit
  24. 18 Nov, 2019 1 commit
  25. 13 Nov, 2019 3 commits
    • block: rework zone reporting · d4100351
      Christoph Hellwig authored
      Avoid the need to allocate a potentially large array of struct blk_zone
      in the block layer by switching the ->report_zones method interface to
      a callback model. Now the caller simply supplies a callback that is
      executed on each reported zone, and private data for it.
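
A callback-based report interface of this kind can be sketched in user space as follows. The types and functions are hypothetical and much simpler than the kernel's: `struct zone_desc` is not `struct blk_zone`, and the "device" is just a capacity split into fixed-size zones.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical zone descriptor for this model. */
struct zone_desc {
    uint64_t start; /* first sector of the zone */
    uint64_t len;   /* zone length in sectors */
};

/* Callback type: called once per reported zone with the caller's data. */
typedef int (*report_zones_cb)(const struct zone_desc *zone,
                               unsigned int idx, void *data);

/*
 * Report up to nr_zones zones without allocating an array of descriptors:
 * each zone is built on the stack and handed to the callback. Returns the
 * number of zones reported, or the first non-zero callback return value.
 */
static int report_zones(uint64_t capacity, uint64_t zone_len,
                        unsigned int nr_zones, report_zones_cb cb, void *data)
{
    unsigned int idx = 0;

    if (zone_len == 0)
        return -1;

    for (uint64_t start = 0; start < capacity && idx < nr_zones;
         start += zone_len, idx++) {
        struct zone_desc z = { start, zone_len };
        int ret;

        if (capacity - start < zone_len)
            z.len = capacity - start; /* smaller last zone */

        ret = cb(&z, idx, data);
        if (ret)
            return ret;
    }
    return (int)idx;
}

/* Example callback: accumulate the total length of all reported zones. */
static int sum_len_cb(const struct zone_desc *zone, unsigned int idx,
                      void *data)
{
    (void)idx;
    *(uint64_t *)data += zone->len;
    return 0;
}
```

The caller keeps whatever state it needs behind the `data` pointer, so memory use no longer scales with the number of zones requested.
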
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • scsi: sd_zbc: Cleanup sd_zbc_alloc_report_buffer() · 23a50861
      Damien Le Moal authored
      There is no need to arbitrarily limit the size of a report zone to the
      number of zones defined by SD_ZBC_REPORT_MAX_ZONES. Rather, simply
      calculate the report buffer size needed for the requested number of
      zones without exceeding the device total number of zones. This buffer
      size limitation to the hardware maximum transfer size and page mapping
      capabilities is kept unchanged. Starting with this initial buffer size,
      the allocation is optimized by iterating over decreasing buffer size
      until the allocation succeeds (each iteration is allowed to fail fast
      using the __GFP_NORETRY flag). This ensures forward progress for zone
      reports and avoids failures of zones revalidation under memory pressure.
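
The "try a smaller buffer on failure" strategy can be sketched in user space as follows. The function names are hypothetical, halving the size at each step is an assumption (the commit message only says the size decreases), and `capped_alloc()` merely simulates memory pressure in place of a fail-fast kernel allocation.

```c
#include <stddef.h>
#include <stdlib.h>

/* Allocator hook so that allocation failures can be simulated. */
typedef void *(*alloc_fn)(size_t size);

/*
 * Try to allocate 'want' bytes; on failure, retry with half the size until
 * the size would drop below 'min_size'. On success, *got holds the size
 * actually obtained.
 */
static void *alloc_report_buffer(size_t want, size_t min_size,
                                 alloc_fn alloc, size_t *got)
{
    size_t size = want;

    while (size > 0 && size >= min_size) {
        void *buf = alloc(size);

        if (buf) {
            *got = size;
            return buf;
        }
        size /= 2; /* each attempt fails fast, then a smaller size is tried */
    }

    *got = 0;
    return NULL;
}

/* Simulated memory pressure: anything above 4096 bytes fails. */
static void *capped_alloc(size_t size)
{
    return size <= 4096 ? malloc(size) : NULL;
}
```

A smaller buffer only means that the report covers fewer zones per command, so the caller can still make forward progress by issuing more report commands.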
      
      While at it, also replace the hard coded 512 B sector size with the
      SECTOR_SIZE macro.
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: Enhance blk_revalidate_disk_zones() · d9dd7308
      Damien Le Moal authored
      For ZBC and ZAC zoned devices, the scsi driver revalidation processing
      implemented by sd_revalidate_disk() includes a call to
      sd_zbc_read_zones() which executes a full disk zone report used to
      check that all zones of the disk are the same size. This processing is
      followed by a call to blk_revalidate_disk_zones(), used to initialize
      the device request queue zone bitmaps (zone type and zone write lock
      bitmaps). To do so, blk_revalidate_disk_zones() also executes a full
      device zone report to obtain zone types. As a result, the entire
      zoned block device revalidation process includes two full device zone
      reports.
      
      By moving the zone size checks into blk_revalidate_disk_zones(), this
      process can be optimized to a single full device zone report, leading to
      shorter device scan and revalidation times. This patch implements this
      optimization, reducing the original full device zone report implemented
      in sd_zbc_check_zones() to a single, small, report zones command
      execution to obtain the size of the first zone of the device. Checks
      whether all zones of the device are the same size as the first zone
      size are moved to the generic blk_check_zone() function called from
      blk_revalidate_disk_zones().
      
      This optimization also has the following benefits:
      1) fewer memory allocations in the scsi layer during disk revalidation
         as the potentially large buffer for zone report execution is not
         needed.
      2) zone checks are implemented in a generic manner, reducing the burden
         on device drivers, which only need to obtain the zone size and check
         that this size is a power of 2 number of LBAs. Any new type of zoned
         block device will benefit from this.
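
The split of responsibilities described above can be sketched in user space as follows. Both functions are hypothetical and deliberately simplified: special cases such as a smaller last zone are ignored.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Driver-side check: the zone size must be a power of 2 number of LBAs. */
static bool zone_size_is_valid(uint64_t zone_len)
{
    return zone_len != 0 && (zone_len & (zone_len - 1)) == 0;
}

/* Generic check: every zone must have the same size as the first zone. */
static bool zones_are_uniform(const uint64_t *zone_len, size_t nr_zones)
{
    if (nr_zones == 0)
        return false;

    for (size_t i = 1; i < nr_zones; i++) {
        if (zone_len[i] != zone_len[0])
            return false;
    }
    return true;
}
```

Only the first check depends on information the driver must fetch itself (the size of the first zone); the second can run on the zones as they are reported, which is what allows it to live in generic code.
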
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  26. 07 Nov, 2019 1 commit