1. 23 Jul, 2024 4 commits
    • scsi: mpt3sas: Avoid IOMMU page faults on REPORT ZONES · 82dbb57a
      Damien Le Moal authored
      Some firmware versions of the 9600 series SAS HBA byte-swap the REPORT
      ZONES command reply buffer from ATA-ZAC devices by directly accessing the
      buffer in the host memory. This does not respect the default command DMA
      direction and causes IOMMU page faults on architectures with an IOMMU
      enforcing write-only mappings for DMA_FROM_DEVICE DMA direction (e.g. AMD
      hosts).
      
      scsi 18:0:0:0: Direct-Access-ZBC ATA      WDC  WSH722020AL W870 PQ: 0 ANSI: 6
      scsi 18:0:0:0: SATA: handle(0x0027), sas_addr(0x300062b2083e7c40), phy(0), device_name(0x5000cca29dc35e11)
      scsi 18:0:0:0: enclosure logical id (0x300062b208097c40), slot(0)
      scsi 18:0:0:0: enclosure level(0x0000), connector name( C0.0)
      scsi 18:0:0:0: atapi(n), ncq(y), asyn_notify(n), smart(y), fua(y), sw_preserve(y)
      scsi 18:0:0:0: qdepth(32), tagged(1), scsi_level(7), cmd_que(1)
      sd 18:0:0:0: Attached scsi generic sg2 type 20
      sd 18:0:0:0: [sdc] Host-managed zoned block device
      mpt3sas 0000:41:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0021 address=0xfff9b200 flags=0x0050]
      mpt3sas 0000:41:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0021 address=0xfff9b300 flags=0x0050]
      mpt3sas_cm0: mpt3sas_ctl_pre_reset_handler: Releasing the trace buffer due to adapter reset.
      mpt3sas_cm0 fault info from func: mpt3sas_base_make_ioc_ready
      mpt3sas_cm0: fault_state(0x2666)!
      mpt3sas_cm0: sending diag reset !!
      mpt3sas_cm0: diag reset: SUCCESS
      sd 18:0:0:0: [sdc] REPORT ZONES start lba 0 failed
      sd 18:0:0:0: [sdc] REPORT ZONES: Result: hostbyte=DID_RESET driverbyte=DRIVER_OK
      sd 18:0:0:0: [sdc] 0 4096-byte logical blocks: (0 B/0 B)
      
      Avoid this issue by always mapping the buffer of REPORT ZONES commands
      using DMA_BIDIRECTIONAL (read+write IOMMU mapping). This is done by
      introducing the helper function _base_scsi_dma_map() and using this helper
      in _base_build_sg_scmd() and _base_build_sg_scmd_ieee() instead of calling
      scsi_dma_map() directly.
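
The direction choice described above can be sketched in user-space C. This is a minimal illustration, not the driver's actual helper: the enum, the `SKETCH_` macros and `sketch_pick_dma_dir()` are hypothetical stand-ins, and the opcode 0x95 (ZBC IN) with service action 0x00 (REPORT ZONES) is assumed from the ZBC command set. A real helper would set the command's data direction and then call scsi_dma_map().

```c
#include <stdint.h>

/* Local stand-ins for the kernel's DMA direction values (illustrative only). */
enum sketch_dma_dir {
    SKETCH_DMA_BIDIRECTIONAL,
    SKETCH_DMA_TO_DEVICE,
    SKETCH_DMA_FROM_DEVICE,
};

/* Assumed ZBC values: ZBC IN opcode and the REPORT ZONES service action. */
#define SKETCH_ZBC_IN          0x95
#define SKETCH_ZI_REPORT_ZONES 0x00

/*
 * Return the DMA direction to use when mapping the data buffer of a command.
 * REPORT ZONES buffers are mapped bidirectionally so that firmware which
 * byte-swaps the reply in host memory (a read followed by a write) does not
 * trip an IOMMU that enforces write-only DMA_FROM_DEVICE mappings.
 * Every other command keeps its default direction.
 */
static enum sketch_dma_dir
sketch_pick_dma_dir(const uint8_t *cdb, enum sketch_dma_dir default_dir)
{
    /* The service action occupies the low 5 bits of CDB byte 1. */
    if (cdb[0] == SKETCH_ZBC_IN &&
        (cdb[1] & 0x1f) == SKETCH_ZI_REPORT_ZONES)
        return SKETCH_DMA_BIDIRECTIONAL;

    return default_dir;
}
```

With this sketch, a CDB starting with 0x95 0x00 yields the bidirectional mapping, while a READ(16) CDB (opcode 0x88) keeps the default direction passed in by the caller.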
      
      Fixes: 471ef9d4 ("mpt3sas: Build MPI SGL LIST on GEN2 HBAs and IEEE SGL LIST on GEN3 HBAs")
      Cc: stable@vger.kernel.org
      Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
      Link: https://lore.kernel.org/r/20240719073913.179559-3-dlemoal@kernel.org
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: mpi3mr: Avoid IOMMU page faults on REPORT ZONES · 1abc900d
      Damien Le Moal authored
      Some firmware versions of the 9600 series SAS HBA byte-swap the REPORT
      ZONES command reply buffer from ATA-ZAC devices by directly accessing the
      buffer in the host memory. This does not respect the default command DMA
      direction and causes IOMMU page faults on architectures with an IOMMU
      enforcing write-only mappings for DMA_FROM_DEVICE DMA direction (e.g. AMD
      hosts), causing the device capacity to drop to 0:
      
      scsi 18:0:58:0: Direct-Access-ZBC ATA      WDC  WSH722626AL W930 PQ: 0 ANSI: 7
      scsi 18:0:58:0: Power-on or device reset occurred
      sd 18:0:58:0: Attached scsi generic sg9 type 20
      sd 18:0:58:0: [sdj] Host-managed zoned block device
      mpi3mr 0000:c1:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0001 address=0xfec0c400 flags=0x0050]
      mpi3mr 0000:c1:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0001 address=0xfec0c500 flags=0x0050]
      sd 18:0:58:0: [sdj] REPORT ZONES start lba 0 failed
      sd 18:0:58:0: [sdj] REPORT ZONES: Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
      sd 18:0:58:0: [sdj] 0 4096-byte logical blocks: (0 B/0 B)
      sd 18:0:58:0: [sdj] Write Protect is off
      sd 18:0:58:0: [sdj] Mode Sense: 6b 00 10 08
      sd 18:0:58:0: [sdj] Write cache: enabled, read cache: enabled, supports DPO and FUA
      sd 18:0:58:0: [sdj] Attached SCSI disk
      
      Avoid this issue by always mapping the buffer of REPORT ZONES commands
      using DMA_BIDIRECTIONAL, that is, using a read-write IOMMU mapping.
      
      Suggested-by: Christoph Hellwig <hch@lst.de>
      Fixes: 023ab2a9 ("scsi: mpi3mr: Add support for queue command processing")
      Cc: stable@vger.kernel.org
      Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
      Link: https://lore.kernel.org/r/20240719073913.179559-2-dlemoal@kernel.org
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: ufs: core: Do not set link to OFF state while waking up from hibernation · ac6efb12
      Manivannan Sadhasivam authored
      The UFS link is only put into the hibern8 state during the 'freeze' phase
      of hibernation. The system may get powered down afterwards, but that does
      not matter during wakeup: when waking up from hibernation, the restore
      kernel again puts the UFS link into the hibern8 state and then hands
      control over to the image kernel.
      
      So in both cases the UFS link is never turned OFF. But
      ufshcd_system_restore() assumes that the link will be in the OFF state and
      sets the link state accordingly, which breaks hibernation wakeup:
      
      [ 2445.371335] phy phy-1d87000.phy.3: phy_power_on was called before phy_init
      [ 2445.427883] ufshcd-qcom 1d84000.ufshc: Controller enable failed
      [ 2445.427890] ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5
      [ 2445.427906] ufs_device_wlun 0:0:0:49488: ufshcd_wl_resume failed: -5
      [ 2445.427918] ufs_device_wlun 0:0:0:49488: PM: dpm_run_callback(): scsi_bus_restore returns -5
      [ 2445.427973] ufs_device_wlun 0:0:0:49488: PM: failed to restore async: error -5
      
      So fix the issue by removing the code that sets the link to OFF state.
      
      Cc: Anjana Hari <quic_ahari@quicinc.com>
      Cc: stable@vger.kernel.org # 6.3
      Fixes: 88441a8d ("scsi: ufs: core: Add hibernation callbacks")
      Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
      Link: https://lore.kernel.org/r/20240718170659.201647-1-manivannan.sadhasivam@linaro.org
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: Revert "scsi: sd: Do not repeat the starting disk message" · da3e19ef
      Johan Hovold authored
      This reverts commit 7a6bbc28.
      
      The offending commit tried to suppress a double "Starting disk" message for
      some drivers, but instead started spamming the log with bogus messages
      every five seconds:
      
      	[  311.798956] sd 0:0:0:0: [sda] Starting disk
      	[  316.919103] sd 0:0:0:0: [sda] Starting disk
      	[  322.040775] sd 0:0:0:0: [sda] Starting disk
      	[  327.161140] sd 0:0:0:0: [sda] Starting disk
      	[  332.281352] sd 0:0:0:0: [sda] Starting disk
      	[  337.401878] sd 0:0:0:0: [sda] Starting disk
      	[  342.521527] sd 0:0:0:0: [sda] Starting disk
      	[  345.850401] sd 0:0:0:0: [sda] Starting disk
      	[  350.967132] sd 0:0:0:0: [sda] Starting disk
      	[  356.090454] sd 0:0:0:0: [sda] Starting disk
      	...
      
      on machines that do not actually stop the disk on runtime suspend (e.g.
      the Qualcomm sc8280xp CRD with UFS).
      
      Revert the commit for now to address the regression.
      
      Fixes: 7a6bbc28 ("scsi: sd: Do not repeat the starting disk message")
      Cc: stable@vger.kernel.org
      Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
      Link: https://lore.kernel.org/r/20240716161101.30692-1-johan+linaro@kernel.org
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
  2. 16 Jul, 2024 3 commits
  3. 11 Jul, 2024 33 commits