Commits · 046ab7d0f5943dd74c351e1f3a771dea785fe25d · Kirill Smelkov / linux

13 Oct, 2021 2 commits

scsi: hisi_sas: Wait for phyup in hisi_sas_control_phy() · 046ab7d0

Xiang Chen authored Oct 12, 2021

When issuing a hardreset/linkreset/phy_set_linkrate from sysfs, the phy
will be disabled and re-enabled for the directly attached scenario.

It takes some time for the phy to come back up after re-enabling the phy.
If the controller becomes suspended while waiting for the phy to come back,
the phy up may be lost (along with the disk).

To solve this problem, wait for the phy up to occur with a timeout. Indeed
this is already done in hisi_sas_debug_I_T_nexus_reset() for local phys, so
just relocate the functionality to hisi_sas_control_phy().

Since the HA workqueue is drained when suspending the controller, and the
phy control function is called from the same workqueue, we can guarantee
that the controller will not be suspended during this period.

Link: https://lore.kernel.org/r/1634041588-74824-3-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: hisi_sas: Initialise devices in .slave_alloc callback · 36c6b761

Xiang Chen authored Oct 12, 2021

Perform driver-specific SCSI device initialization in the designated SCSI
midlayer callback instead of relying on the libsas "device found" callback.

The SCSI midlayer .slave_alloc interface is called prior to sending any I/O
to the device.

Link: https://lore.kernel.org/r/1634041588-74824-2-git-send-email-john.garry@huawei.com
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


12 Oct, 2021 4 commits

scsi: ufs: core: Fix synchronization between scsi_unjam_host() and ufshcd_queuecommand() · d489f18a

Adrian Hunter authored Oct 08, 2021

The SCSI error handler calls scsi_unjam_host() which can call the queue
function ufshcd_queuecommand() indirectly. The error handler changes the
state to UFSHCD_STATE_RESET while running, but error interrupts that
happen while the error handler is running could change the state to
UFSHCD_STATE_EH_SCHEDULED_NON_FATAL which would allow requests to go
through ufshcd_queuecommand() even though the error handler is running.
Block that hole by checking whether the error handler is in progress.
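
The gate described above can be sketched in plain C. This is a minimal userspace illustration, not the driver's code: the `host_state` enum, the `host` struct, and `may_queue()` are hypothetical names chosen for the example. Only the decision logic (refuse requests in the non-fatal scheduled state while the error handler is running) is taken from the commit message.

```c
#include <stdbool.h>

/* Hypothetical, simplified host states for illustration only. */
enum host_state {
    STATE_OPERATIONAL,
    STATE_EH_SCHEDULED_NON_FATAL,
    STATE_RESET
};

/* Hypothetical host descriptor: current state plus an "EH running" flag. */
struct host {
    enum host_state state;
    bool eh_in_progress;
};

/* Decide whether a new request may be passed to the hardware. */
bool may_queue(const struct host *h)
{
    switch (h->state) {
    case STATE_OPERATIONAL:
        return true;
    case STATE_EH_SCHEDULED_NON_FATAL:
        /* Non-fatal errors normally allow I/O, but not while the
         * error handler itself is running. */
        return !h->eh_in_progress;
    default:
        return false;
    }
}
```

Without the `eh_in_progress` test, an error interrupt that moves the state from reset to the non-fatal scheduled state would reopen the queue path while recovery is still under way.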

Link: https://lore.kernel.org/r/20211008084048.257498-1-adrian.hunter@intel.com
Reviewed-by: Asutosh Das <asutoshd@codeaurora.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: ufs: mediatek: Support vops pre suspend to disable auto-hibern8 · 9561f584

Peter Wang authored Oct 06, 2021

Mediatek UFS needs auto-hibern8 disabled before suspend. Introduce a
solution to do pre-suspend before SSU (sleep).

Link: https://lore.kernel.org/r/20211006054705.21885-1-peter.wang@mediatek.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: csiostor: Uninitialized data in csio_ln_vnp_read_cbfn() · f4875d50

Dan Carpenter authored Oct 06, 2021

This variable is just a temporary variable, used to do an endian
conversion. The problem is that the last byte is not initialized. After
the conversion is completely done, the last byte is discarded so it doesn't
cause a problem. But static checkers and the KMSan runtime checker can
detect the uninitialized read and will complain about it.
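
A minimal sketch of the pattern, assuming a 3-byte big-endian field that is widened through a 4-byte temporary: the function name `be24_to_cpu()` is hypothetical and this is not the csiostor code. Zero-initialising the temporary means the byte that is later shifted out is never read uninitialised.

```c
#include <stdint.h>
#include <string.h>

/* Convert three big-endian bytes into a host-order 24-bit value. */
uint32_t be24_to_cpu(const uint8_t src[3])
{
    uint8_t tmp[4] = {0};   /* zero-initialised, so tmp[3] is defined */
    uint32_t be;

    memcpy(tmp, src, 3);    /* only three of the four bytes are filled */

    /* Assemble the big-endian 32-bit word byte by byte. */
    be = ((uint32_t)tmp[0] << 24) | ((uint32_t)tmp[1] << 16) |
         ((uint32_t)tmp[2] << 8)  |  (uint32_t)tmp[3];

    return be >> 8;         /* discard the padding byte */
}
```

The result is the same with or without the initialiser, because the last byte is shifted out. The initialiser only removes the uninitialised read that checkers report.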

Link: https://lore.kernel.org/r/20211006073242.GA8404@kili
Fixes: 5036f0a0 ("[SCSI] csiostor: Fix sparse warnings.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


Merge branch '5.15/scsi-fixes' into 5.16/scsi-staging · ec65e6be

Martin K. Petersen authored Oct 12, 2021

Merge the 5.15/scsi-fixes branch into the staging tree to resolve UFS
conflict reported by sfr.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


05 Oct, 2021 34 commits

scsi: smartpqi: Update version to 2.1.12-055 · 605ae389

Don Brace authored Sep 28, 2021

Update driver version to reflect changes.

Link: https://lore.kernel.org/r/20210928235442.201875-12-don.brace@microchip.com
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: smartpqi: Add 3252-8i PCI id · 80982656

Mike McGowen authored Sep 28, 2021

Add PCI ID information for the Adaptec SmartRAID 3252-8i controller:

9005 / 028F / 9005 / 14A2

Link: https://lore.kernel.org/r/20210928235442.201875-11-don.brace@microchip.com
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: smartpqi: Fix duplicate device nodes for tape changers · d4dc6aea

Kevin Barnett authored Sep 28, 2021

Stop the OS from re-discovering multiple LUNs for tape drive and medium
changer.

Duplicate device nodes for Ultrium tape drive and medium changer are being
created.

The Ultrium tape drive is a multi-LUN SCSI target.  It presents a LUN for
the tape drive and a 2nd LUN for the medium changer.  Our controller FW
lists both LUNs in the RPL results.

As a result, the smartpqi driver exposes both devices to the OS. Then the
OS does its normal device discovery via the SCSI REPORT LUNS command, which
causes it to re-discover both devices a 2nd time, which results in the
duplicate device nodes.

Link: https://lore.kernel.org/r/20210928235442.201875-10-don.brace@microchip.com
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: smartpqi: Fix boot failure during LUN rebuild · 987d3560

Mike McGowen authored Sep 28, 2021

Move the delay in the register polling loop to the beginning of the loop to
ensure there is always a delay between writing the register and reading it.
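
The reordering can be illustrated with a small, self-contained sketch. Everything here is hypothetical (`fake_dev`, `fake_delay()`, `fake_read()`, `poll_ready_delay_first()`); it simulates a register that only becomes valid some time after it is written and is not the smartpqi code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a device whose register needs time to settle. */
struct fake_dev {
    unsigned int delays;       /* delays elapsed since the register write */
    unsigned int ready_after;  /* register is valid after this many delays */
};

/* Simulated delay: only counts how often it was called. */
void fake_delay(struct fake_dev *dev)
{
    dev->delays++;
}

/* Simulated read: returns 1 once enough delays have elapsed, else 0. */
uint32_t fake_read(const struct fake_dev *dev)
{
    return dev->delays >= dev->ready_after ? 1u : 0u;
}

/* Poll with the delay at the top of the loop, so that even the first
 * read is separated from the preceding write by one delay. */
bool poll_ready_delay_first(struct fake_dev *dev, unsigned int max_polls)
{
    for (unsigned int i = 0; i < max_polls; i++) {
        fake_delay(dev);              /* always wait before reading */
        if (fake_read(dev) == 1u)
            return true;
    }
    return false;
}
```

With the delay at the bottom of the loop instead, the first read would happen immediately after the write, which is the case the commit message says must be avoided.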

Link: https://lore.kernel.org/r/20210928235442.201875-9-don.brace@microchip.com
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: smartpqi: Add extended report physical LUNs · 28ca6d87

Mike McGowen authored Sep 28, 2021

Add support for the new extended formats in the data returned from the
Report Physical LUNs command for controllers that enable this feature.

The new formats allow the reporting of 16-byte WWIDs.

Link: https://lore.kernel.org/r/20210928235442.201875-8-don.brace@microchip.com
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Mike McGowen <Mike.McGowen@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: smartpqi: Avoid failing I/Os for offline devices · 4f3cefc3

Mahesh Rajashekhara authored Sep 28, 2021

Prevent kernel crash by failing outstanding I/O request when the OS takes
device offline.

When I/Os posted to the controller's inbound queue are not picked up by
the controller, the driver will halt the controller and take the
controller offline.

When the driver takes the controller offline, the driver will fail all the
outstanding requests which can sometimes lead to an OS crash.

Link: https://lore.kernel.org/r/20210928235442.201875-7-don.brace@microchip.com
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: smartpqi: Add TEST UNIT READY check for SANITIZE operation · be76f906

Don Brace authored Sep 28, 2021

Send a TEST UNIT READY to HBA disks and do not present them to the OS if
0x02/0x04/0x1b (SANITIZE IN PROGRESS) is returned.

During boot-up, some OSes appear to hang when there are one or more disks
undergoing a sanitize operation.

According to SCSI SBC4 specification section 4.11.2 "Commands allowed
during SANITIZE", some SCSI commands are permitted, but read/write
operations are not.

When the OS attempts to read the disk partition table, a CHECK CONDITION
with ASC 0x04 / ASCQ 0x1b is returned, which causes the OS to retry the
read until SANITIZE has completed. This can take hours.

According to the HPE Smart Storage Administrator User Guide, the drive is
unusable during the sanitize erase operation. That is, the expected
behavior for SANITIZE is that the disk remains offline even after SANITIZE
has completed. The customer is expected to re-enable the disk using the
management utility.
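
As a rough illustration of the 0x02/0x04/0x1b test, the sketch below checks a fixed-format sense buffer for that combination. The function name and constants are hypothetical and this is not the smartpqi implementation. It assumes fixed-format sense data (response code 0x70 or 0x71), where the sense key is the low nibble of byte 2, the ASC is byte 12, and the ASCQ is byte 13. Descriptor-format sense data is deliberately not handled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SENSE_KEY_NOT_READY        0x02
#define ASC_LUN_NOT_READY          0x04
#define ASCQ_SANITIZE_IN_PROGRESS  0x1b

/* Return true if fixed-format sense data reports SANITIZE IN PROGRESS. */
bool sense_is_sanitize_in_progress(const uint8_t *sense, size_t len)
{
    uint8_t response_code;

    if (len < 14)
        return false;               /* too short to contain ASC/ASCQ */

    response_code = sense[0] & 0x7f;
    if (response_code != 0x70 && response_code != 0x71)
        return false;               /* not fixed-format sense data */

    return (sense[2] & 0x0f) == SENSE_KEY_NOT_READY &&
           sense[12] == ASC_LUN_NOT_READY &&
           sense[13] == ASCQ_SANITIZE_IN_PROGRESS;
}
```

A caller would run this on the sense data returned for TEST UNIT READY and skip exposing the disk when it returns true.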

Link: https://lore.kernel.org/r/20210928235442.201875-6-don.brace@microchip.com
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: smartpqi: Update LUN reset handler · 6ce1ddf5

Kevin Barnett authored Sep 28, 2021

Enhance check for commands queued to the controller. Add new function
pqi_nonempty_inbound_queue_count() that will wait for all I/O queued for
submission to controller across all queue groups to drain. Add helper
functions to obtain queue command counts for each queue group. These
queues should drain quickly as they are already staged to be submitted down
to the controller's IB queue.

Enhance check for outstanding command completion. Update the count of
outstanding commands while waiting. This value was not re-obtained and was
potentially causing infinite wait for all completions.

Link: https://lore.kernel.org/r/20210928235442.201875-5-don.brace@microchip.com
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: smartpqi: Capture controller reason codes · 5d1f03e6

Murthy Bhat authored Sep 28, 2021

In some rare cases, the driver can halt the controller. Add a reason code
describing why the controller was halted. Store this reason code in a
controller register to aid in debugging the issue.

Link: https://lore.kernel.org/r/20210928235442.201875-4-don.brace@microchip.com
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Murthy Bhat <Murthy.Bhat@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: smartpqi: Add controller handshake during kdump · 9ee5d6e9

Mahesh Rajashekhara authored Sep 28, 2021

Correct kdump hangs when controller is locked up.

There are occasions when a controller reboot (controller soft reset) is
issued when a controller firmware crash dump is in progress.

This leads to incomplete controller firmware crash dump:

- When the controller crash dump is in progress, and a kdump is initiated,
the driver issues inbound doorbell reset to bring back the controller in
SIS mode.

- If the controller is in locked up state, the inbound doorbell reset does
not work causing controller initialization failures. This results in the
driver hanging waiting for SIS mode.

To avoid an incomplete controller crash dump, add in a controller crash
dump handshake:

- Controller will indicate start and end of the controller crash dump by
setting some register bits.

- Driver will look at these bits when a kdump is initiated. If a controller
crash dump is in progress, the driver will wait for the controller crash
dump to complete before issuing the controller soft reset, and then
complete driver initialization.

Link: https://lore.kernel.org/r/20210928235442.201875-3-don.brace@microchip.com
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: smartpqi: Update device removal management · 819225b0

Don Brace authored Sep 28, 2021

Update device removal path to handle issues for:

 - rmmod: Correct stack trace when removing devices.
 - rmmod: Synchronize SCSI cache.
 - Update handling for removing devices using sysfs.

Link: https://lore.kernel.org/r/20210928235442.201875-2-don.brace@microchip.com
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Acked-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: mpi3mr: Clean up mpi3mr_print_ioc_info() · 76a4f7cc

Dan Carpenter authored Sep 16, 2021

This function is more complicated than necessary.

If we change from scnprintf() to snprintf() that lets us remove the if
bytes_wrote < sizeof(protocol) checks.  Also, we can use bytes_wrote ? ","
: "" to print the comma and remove the separate if statement and the
"is_string_nonempty" variable.

[mkp: a few formatting cleanups and s/wrote/written/]
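
The comma idiom can be shown in a userspace sketch. The helper names `append_clamped()` and `join_names()` are hypothetical, and this is not the mpi3mr code. Standard C has no `scnprintf()`, so the sketch clamps the return value of `vsnprintf()` to the number of characters actually stored, which keeps the running offset inside the buffer without a separate bounds check per call.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Format into buf and return the number of characters actually stored
 * (excluding the terminating NUL), which is at most size - 1. */
size_t append_clamped(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    if (size == 0)
        return 0;

    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);

    if (n < 0)
        return 0;
    return (size_t)n < size ? (size_t)n : size - 1;
}

/* Join names with commas; the separator is emitted only when something
 * has already been written, so no "non-empty" flag is needed. */
size_t join_names(char *out, size_t size,
                  const char *const *names, size_t count)
{
    size_t written = 0;

    if (size > 0)
        out[0] = '\0';

    for (size_t i = 0; i < count; i++)
        written += append_clamped(out + written, size - written, "%s%s",
                                  written ? "," : "", names[i]);
    return written;
}
```

Because `written` never exceeds `size - 1`, the expression `size - written` stays at least 1 and truncated output is still NUL-terminated.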

Link: https://lore.kernel.org/r/20210916132605.GF25094@kili
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: iscsi: Fix iscsi_task use after free · 258aad75

Mike Christie authored Oct 04, 2021

Commit d39df158 ("scsi: iscsi: Have abort handler get ref to conn")
added iscsi_get_conn()/iscsi_put_conn() calls during abort handling but
then also changed the handling of the case where we detect an already
completed task where we now end up doing a goto to the common put/cleanup
code. This results in an iscsi_task use after free, because the common
cleanup code will do a put on the iscsi_task.

This reverts the goto and moves the iscsi_get_conn() to after we've checked
if the iscsi_task is valid.

Link: https://lore.kernel.org/r/20211004210608.9962-1-michael.christie@oracle.com
Fixes: d39df158 ("scsi: iscsi: Have abort handler get ref to conn")
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: lpfc: Fix memory overwrite during FC-GS I/O abort handling · 69a3a7bc

James Smart authored Oct 04, 2021

When an FC-GS I/O is aborted by lpfc, the driver requires a node pointer
for a dereference operation. In the abort I/O routine, the driver miscasts
a context pointer to the wrong data type and overwrites a single byte
outside of the allocated space. This miscast is done in the abort I/O
function handler because the handler works on both FC-GS and FC-LS
commands. However, the code neglected to get the correct job location for
the node.

Fix this by acquiring the necessary node pointer from the correct job
structure depending on the I/O type.

Link: https://lore.kernel.org/r/20211004231210.35524-1-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: elx: efct: Delete stray unlock statement · a013c71c

Dan Carpenter authored Oct 04, 2021

It's not holding the lock at this stage and the IRQ "flags" are not correct
so it would restore something bogus. Delete the unlock statement.

Link: https://lore.kernel.org/r/20211004103851.GE25015@kili
Fixes: 3e641400 ("scsi: elx: efct: SCSI I/O handling routines")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: pm80xx: Fix misleading log statement in pm8001_mpi_get_nvmd_resp() · 4084a723

Igor Pylypiv authored Sep 28, 2021

pm8001_mpi_get_nvmd_resp() handles a GET_NVMD_DATA response, not a
SET_NVMD_DATA response, as the log statement implies.

Fixes: 1f889b58 ("scsi: pm80xx: Fix pm8001_mpi_get_nvmd_resp() race condition")
Link: https://lore.kernel.org/r/20210929025847.646999-1-ipylypiv@google.com
Reviewed-by: Changyuan Lyu <changyuanl@google.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Igor Pylypiv <ipylypiv@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: pm80xx: Replace open coded check with dev_is_expander() · 4f632918

Igor Pylypiv authored Sep 28, 2021

This is a follow up cleanup to the commit 924a3541 ("scsi: libsas:
aic94xx: hisi_sas: mvsas: pm8001: Use dev_is_expander()")

Link: https://lore.kernel.org/r/20210929025807.646589-1-ipylypiv@google.com
Reviewed-by: Vishakha Channapattan <vishakhavc@google.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Igor Pylypiv <ipylypiv@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: target: tcmu: Use struct_size() helper in kmalloc() · c20bda34

Gustavo A. R. Silva authored Sep 27, 2021

Make use of the struct_size() helper instead of an open-coded version, in
order to avoid any potential type mistakes or integer overflows that, in
the worst scenario, could lead to heap overflows.
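
To illustrate what an overflow-safe size computation for a structure with a flexible array member looks like, here is a userspace sketch. `struct cmd_batch`, `flex_size()`, and `cmd_batch_size()` are hypothetical; the kernel's `struct_size()` macro is the real helper, and this sketch only imitates its saturate-on-overflow behaviour.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure ending in a flexible array member. */
struct cmd_batch {
    uint32_t count;
    uint64_t ids[];
};

/* Return header + n * elem, saturating at SIZE_MAX on overflow so that
 * a following allocation fails instead of being undersized. */
size_t flex_size(size_t header, size_t elem, size_t n)
{
    if (elem != 0 && n > (SIZE_MAX - header) / elem)
        return SIZE_MAX;
    return header + n * elem;
}

/* Allocation size for a cmd_batch holding n ids. */
size_t cmd_batch_size(size_t n)
{
    return flex_size(sizeof(struct cmd_batch), sizeof(uint64_t), n);
}
```

The open-coded form `sizeof(*p) + n * sizeof(p->ids[0])` silently wraps for a large `n`, which is the heap overflow scenario the commit message refers to.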

Link: https://github.com/KSPP/linux/issues/160
Link: https://lore.kernel.org/r/20210927224344.GA190701@embeddedor
Reviewed-by: Bodo Stroesser <bostroesser@gmail.com>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: target: usb: Replace enable attr with ops.enable · 5384ee08

Dmitry Bogdanov authored Sep 10, 2021

Remove tpg/enable attribute. Add fabric ops enable_tpg implementation
instead.

Link: https://lore.kernel.org/r/20210910084133.17956-8-d.bogdanov@yadro.com
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: target: ibm_vscsi: Replace enable attr with ops.enable · d7e2932b

Dmitry Bogdanov authored Sep 10, 2021

Remove tpg/enable attribute. Add fabric ops enable_tpg implementation
instead.

Link: https://lore.kernel.org/r/20210910084133.17956-7-d.bogdanov@yadro.com
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: target: srpt: Replace enable attr with ops.enable · 9465b487

Dmitry Bogdanov authored Sep 10, 2021

Remove tpg/enable attribute.  Add fabric ops enable_tpg implementation
instead.

Link: https://lore.kernel.org/r/20210910084133.17956-6-d.bogdanov@yadro.com
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: target: sbp: Replace enable attr with ops.enable · fb00af92

Dmitry Bogdanov authored Sep 10, 2021

Remove tpg/enable attribute.  Add fabric ops enable_tpg implementation
instead.

Link: https://lore.kernel.org/r/20210910084133.17956-5-d.bogdanov@yadro.com
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: target: qla2xxx: Replace enable attr with ops.enable · cb8717a7

Dmitry Bogdanov authored Sep 10, 2021

Remove tpg/enable attribute. Add fabric ops enable_tpg implementation
instead.

Link: https://lore.kernel.org/r/20210910084133.17956-4-d.bogdanov@yadro.com
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: target: iscsi: Replace tpg enable attr with ops.enable · 382731ec

Dmitry Bogdanov authored Sep 10, 2021

Remove tpg/enable attribute. Add fabric ops enable_tpg implementation
instead.

Link: https://lore.kernel.org/r/20210910084133.17956-3-d.bogdanov@yadro.com
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: target: core: Add common tpg/enable attribute · 80ed33c8

Dmitry Bogdanov authored Sep 10, 2021

Many fabric modules provide their own implementation of enable attribute in
tpg.

Provide a way to remove code duplication in the fabric modules and
automatically add "enable" attribute if a fabric module has an
implementation of fabric_enable_tpg().

Link: https://lore.kernel.org/r/20210910084133.17956-2-d.bogdanov@yadro.com
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: megaraid_sas: Driver version update to 07.719.03.00-rc1 · cdf7f6a1

Sumit Saxena authored Sep 29, 2021

Link: https://lore.kernel.org/r/20210929124022.24605-4-sumit.saxena@broadcom.com
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: megaraid_sas: Add helper functions for irq_context · 4c32edc3

Sumit Saxena authored Sep 29, 2021

Add helper functions for ISR access and release to improve readability.

Link: https://lore.kernel.org/r/20210929124022.24605-3-sumit.saxena@broadcom.com
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: megaraid_sas: Fix concurrent access to ISR between IRQ polling and real interrupt · e7dcc514

Sumit Saxena authored Sep 29, 2021

IRQ polling thread calls ISR after enable_irq() to handle any missed I/O
completion. The atomic flag "in_used" was added to have the synchronization
between the IRQ polling thread and the interrupt context. There is a bug
around it leading to a race condition.

Below is the sequence:

 - IRQ polling thread accesses ISR, fetches the reply descriptor.

 - Real interrupt arrives and pre-empts polling thread (enable_irq() is
   already called).

 - Interrupt context picks the same reply descriptor as fetched by polling
   thread, processes it, and exits.

 - Polling thread resumes and processes the descriptor that was already
   processed by the interrupt context, which leads to a kernel crash.

Setting the "in_used" flag before fetching the reply descriptor ensures
synchronized access to ISR.
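
The ordering can be sketched with C11 atomics. This is an illustration under assumptions, not the megaraid_sas code: `struct irq_ctx`, `isr_enter()`, `isr_exit()`, and `handle_reply()` are hypothetical, and `atomic_exchange()` is used here as a generic test-and-set in place of whatever primitive the driver uses. The key point is that the flag is claimed before the reply descriptor is fetched.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-vector context. */
struct irq_ctx {
    atomic_bool in_use;      /* true while one context is in the handler */
    unsigned int next_desc;  /* index of the next unprocessed descriptor */
    unsigned int processed;  /* descriptors handled so far */
};

/* Try to claim the handler; returns true only for the single winner. */
bool isr_enter(struct irq_ctx *ctx)
{
    return !atomic_exchange(&ctx->in_use, true);
}

/* Release the handler so another context may enter. */
void isr_exit(struct irq_ctx *ctx)
{
    atomic_store(&ctx->in_use, false);
}

/* Handle one reply descriptor, or back off if someone else owns the ISR. */
bool handle_reply(struct irq_ctx *ctx)
{
    unsigned int desc;

    if (!isr_enter(ctx))
        return false;        /* another context is already processing */

    /* The descriptor is fetched only after the flag has been claimed. */
    desc = ctx->next_desc;
    ctx->next_desc = desc + 1;
    ctx->processed++;

    isr_exit(ctx);
    return true;
}
```

If the descriptor were fetched before `isr_enter()`, two contexts could both read the same index and process it twice, which is the race in the sequence listed above.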

Link: https://www.spinics.net/lists/linux-scsi/msg159440.html
Link: https://lore.kernel.org/r/20210929124022.24605-2-sumit.saxena@broadcom.com
Fixes: 9bedd36e ("scsi: megaraid_sas: Handle missing interrupts while re-enabling IRQs")
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: advansys: Fix kernel pointer leak · d4996c6e

Guo Zhi authored Sep 29, 2021

Pointers should be printed with %p or %px rather than cast to 'unsigned
long' and printed with %lx.

Change %lx to %p to print the hashed pointer.

Link: https://lore.kernel.org/r/20210929122538.1158235-1-qtxuning1999@sjtu.edu.cn
Signed-off-by: Guo Zhi <qtxuning1999@sjtu.edu.cn>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: target: core: Make logs less verbose · 05787e34

Konstantin Shelekhin authored Sep 29, 2021

Change the log level of the following message to debug:

Unsupported SCSI Opcode 0xXX, sending CHECK_CONDITION.

This message is mostly helpful during debugging sessions in order to
understand errors on the initiator side. But most of the time it's just
useless and makes reading logs much harder.

It gets particularly annoying if there are many initiators that come and go
or if an initiator runs a program that does not care whether the command is
supported and just keeps sending it.

Link: https://lore.kernel.org/r/20210929114959.705852-1-k.shelekhin@yadro.com
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Konstantin Shelekhin <k.shelekhin@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: ufs: core: Do not exit ufshcd_err_handler() unless operational or dead · 87bf6a6b

Adrian Hunter authored Oct 02, 2021

Callers of ufshcd_err_handler() expect it to return in an operational
state. However, the code does not check the state before exiting.

Add a check for the state and perform retries until either success or the
maximum number of retries is reached.

Link: https://lore.kernel.org/r/20211002154550.128511-3-adrian.hunter@intel.com
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: ufs: core: Do not exit ufshcd_reset_and_restore() unless operational or dead · 54a40453

Adrian Hunter authored Oct 02, 2021

Callers of ufshcd_reset_and_restore() expect it to return in an operational
state. However, the code only checks direct errors and so the ufshcd_state
may not be UFSHCD_STATE_OPERATIONAL due to error interrupts.

Fix by also checking ufshcd_state, still allowing non-fatal errors which
are left for the error handler to deal with.

Link: https://lore.kernel.org/r/20211002154550.128511-2-adrian.hunter@intel.com
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: ufs: core: Stop clearing UNIT ATTENTIONS · edc0596c

Bart Van Assche authored Oct 01, 2021

Commit aa53f580 ("scsi: ufs: Minor adjustments to error handling")
introduced a ufshcd_clear_ua_wluns() call in
ufshcd_err_handling_unprepare(). As explained in detail by Adrian Hunter,
this can trigger a deadlock. Avoid that deadlock by removing the code that
clears the unit attention. This is safe because the only software that
relies on clearing unit attentions is the Android Trusty software and
because support for handling unit attentions has been added in the Trusty
software.

See also https://lore.kernel.org/linux-scsi/20210930124224.114031-2-adrian.hunter@intel.com/

Note that "scsi: ufs: Retry START_STOP on UNIT_ATTENTION" is a prerequisite
for this commit.

Link: https://lore.kernel.org/r/20211001182015.1347587-3-jaegeuk@kernel.org
Fixes: aa53f580 ("scsi: ufs: Minor adjustments to error handling")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Bart Van Assche <bvanassche@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>


scsi: ufs: core: Retry START_STOP on UNIT_ATTENTION · af21c3fd

Jaegeuk Kim authored Oct 01, 2021

Commit 57d104c1 ("ufs: add UFS power management support") made the UFS
driver submit a REQUEST SENSE command before submitting a power management
command to a WLUN to clear the POWER ON unit attention. Instead of
submitting a REQUEST SENSE command before submitting a power management
command, retry the power management command until it succeeds.

This is preparation for removing all UNIT ATTENTION handling code from the
driver; unit attentions should be handled by users.

Link: https://lore.kernel.org/r/20211001182015.1347587-2-jaegeuk@kernel.org
Cc: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
