- 13 Oct, 2023 40 commits
-
-
Martin K. Petersen authored
Justin Tee <justintee8345@gmail.com> says: Update lpfc to revision 14.2.0.15 This patch set contains error handling fixes, ELS bug fixes, and logging improvements. The patches were cut against Martin's 6.7/scsi-queue tree. Link: https://lore.kernel.org/r/20231009161812.97232-1-justintee8345@gmail.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Justin Tee authored
Update lpfc version to 14.2.0.15. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231009161812.97232-7-justintee8345@gmail.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Justin Tee authored
The preexisting LOG_NODE message flag frequently spams a subset of the same log messages during normal FC driver operations. When analyzing driver logs, this sometimes leads to difficulty in troubleshooting. Because LOG_IP log message flag is unused, convert it to a new LOG_NODE_VERBOSE flag. The LOG_NODE_VERBOSE shall specifically be used for diagnosing issues that require precise ndlp tracking detail. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231009161812.97232-6-justintee8345@gmail.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Justin Tee authored
A WCQE success completion status does not guarantee valid LS_ACC receipt for ELS commands. So, introduce a small helper routine that validates ELS LS_ACC frames in ELS cmpl routines. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231009161812.97232-5-justintee8345@gmail.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Justin Tee authored
Currently, NPIV ports send PRLI_ACC to all received unsolicited PRLI requests. For an NPIV port, there is no point to PRLI_ACC if the received PRLI request has the initiator function bit set and the target function bit unset. Modify the lpfc_rcv_prli_support_check() routine to send a PRLI_RJT in such cases. NPIV ports are expected to send PRLI_ACC only if the Target function bit is set. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231009161812.97232-4-justintee8345@gmail.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Justin Tee authored
During receipt of a hardware error attention ACQE, IOERR_SLI_DOWN status is set by the driver for all outstanding I/Os. In such hardware error attention cases, we can treat the situation exactly the same as pci_channel_offline. Thus, add IOERR_SLI_DOWN status to the same category as pci_channel_offline handling in lpfc_nvme_io_cmd_cmpl. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231009161812.97232-3-justintee8345@gmail.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Justin Tee authored
In order to enter the !rc if statement block in question, rc had to have been zero to begin with. Thus, the rc = 0 assignment is unnecessary and can be removed. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20231009161812.97232-2-justintee8345@gmail.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Martin K. Petersen authored
Chandrakanth patil <chandrakanth.patil@broadcom.com> says: This set of patches includes critical fixes, and updates to the maintainer list. Link: https://lore.kernel.org/r/20231003110021.168862-1-chandrakanth.patil@broadcom.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Chandrakanth patil authored
Given my active involvement in megaraid_sas development, I am including myself in the maintainers list. Signed-off-by: Chandrakanth patil <chandrakanth.patil@broadcom.com> Link: https://lore.kernel.org/r/20231003110021.168862-5-chandrakanth.patil@broadcom.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Chandrakanth patil authored
Driver version update. Signed-off-by: Chandrakanth patil <chandrakanth.patil@broadcom.com> Link: https://lore.kernel.org/r/20231003110021.168862-4-chandrakanth.patil@broadcom.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Chandrakanth patil authored
The driver now includes the print message 'IO is completed, no reset is required' when a reset is requested but not issued. This message is displayed only when pending SCSI IO is completed before issuing the reset. Signed-off-by: Chandrakanth patil <chandrakanth.patil@broadcom.com> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Link: https://lore.kernel.org/r/20231003110021.168862-3-chandrakanth.patil@broadcom.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Chandrakanth patil authored
In BMC environments with concurrent access to multiple registers, certain registers occasionally yield a value of 0 even after 3 retries due to hardware errata. As a fix, we have extended the retry count from 3 to 30. The same errata applies to the mpt3sas driver, and a similar patch has been accepted. Please find more details in the mpt3sas patch reference link. Link: https://lore.kernel.org/r/20230829090020.5417-2-ranjan.kumar@broadcom.com Fixes: 272652fc ("scsi: megaraid_sas: add retry logic in megasas_readl") Cc: stable@vger.kernel.org Signed-off-by: Chandrakanth patil <chandrakanth.patil@broadcom.com> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Link: https://lore.kernel.org/r/20231003110021.168862-2-chandrakanth.patil@broadcom.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Martin K. Petersen authored
Mike Christie <michael.christie@oracle.com> says: The following patches were made over Linus tree (Martin's 6.7 branch was missing some changes to sd.c). They only contain the sshdr and rdac retry fixes from the "Allow scsi_execute users to control retries" patchset. The patches in this set are reviewed and tested but the changes to how we do retries will take a little longer and require more testing, so I broke up the series to make them easier to review. Link: https://lore.kernel.org/r/20231004210013.5601-1-michael.christie@oracle.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Mike Christie authored
If scsi_execute_cmd returns < 0, it doesn't initialize the sshdr, so we shouldn't access the sshdr. If it returns 0, then the cmd executed successfully, so there is no need to check the sshdr. This has us access the sshdr when we get a return value > 0. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20231004210013.5601-13-michael.christie@oracle.comReviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Mike Christie authored
If scsi_execute_cmd returns < 0, it doesn't initialize the sshdr, so we shouldn't access the sshdr. If it returns 0, then the cmd executed successfully, so there is no need to check the sshdr. This has us access the sshdr when we get a return value > 0. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20231004210013.5601-12-michael.christie@oracle.comReviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Mike Christie authored
If scsi_execute_cmd returns < 0, it doesn't initialize the sshdr, so we shouldn't access the sshdr. If it returns 0, then the cmd executed successfully, so there is no need to check the sshdr. This has us access the sshdr when we get a return value > 0. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20231004210013.5601-11-michael.christie@oracle.comReviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Mike Christie authored
If scsi_execute_cmd returns < 0, it doesn't initialize the sshdr, so we shouldn't access the sshdr. If it returns 0, then the cmd executed successfully, so there is no need to check the sshdr. This has us access the sshdr when we get a return value > 0. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20231004210013.5601-10-michael.christie@oracle.comReviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Mike Christie authored
The sshdr passed into scsi_execute_cmd is only initialized if scsi_execute_cmd returns >= 0, and scsi_mode_sense will convert all non good statuses like check conditions to -EIO. This has scsi_mode_sense callers that were possibly accessing an uninitialized sshdrs to only access it if we got -EIO. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20231004210013.5601-9-michael.christie@oracle.comReviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Mike Christie authored
If scsi_execute_cmd returns < 0, it doesn't initialize the sshdr, so we shouldn't access the sshdr. If it returns 0, then the cmd executed successfully, so there is no need to check the sshdr. This has us access the sshdr when we get a return value > 0. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20231004210013.5601-7-michael.christie@oracle.comReviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Mike Christie authored
If scsi_execute_cmd returns < 0, it doesn't initialize the sshdr, so we shouldn't access the sshdr. If it returns 0, then the cmd executed successfully, so there is no need to check the sshdr. This has us access the sshdr when we get a return value > 0. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20231004210013.5601-6-michael.christie@oracle.comReviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin Wilck <mwilck@suse.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Mike Christie authored
If send_mode_select retries scsi_execute_cmd it will leave err set to SCSI_DH_RETRY/SCSI_DH_IMM_RETRY. If on the retry, the command is successful, then SCSI_DH_RETRY/SCSI_DH_IMM_RETRY will be returned to the scsi_dh activation caller. On the retry, we will then detect the previous MODE SELECT had worked, and so we will return success. This patch has us return the correct return value, so we can avoid the extra scsi_dh activation call and to avoid failures if the caller had hit its activation retry limit and does not end up retrying. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20231004210013.5601-5-michael.christie@oracle.comReviewed-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Mike Christie authored
If scsi_execute_cmd returns < 0, it doesn't initialize the sshdr, so we shouldn't access the sshdr. If it returns 0, then the cmd executed successfully, so there is no need to check the sshdr. This has us access the sshdr when we get a return value > 0. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20231004210013.5601-4-michael.christie@oracle.comReviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Mike Christie authored
If scsi_execute_cmd returns < 0, it doesn't initialize the sshdr, so we shouldn't access the sshdr. If it returns 0, then the cmd executed successfully, so there is no need to check the sshdr. This has us access the sshdr when we get a return value > 0. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20231004210013.5601-3-michael.christie@oracle.comReviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Mike Christie authored
If scsi_execute_cmd returns < 0, it doesn't initialize the sshdr, so we shouldn't access the sshdr. If it returns 0, then the cmd executed successfully, so there is no need to check the sshdr. This has us access the sshdr when we get a return value > 0. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20231004210013.5601-2-michael.christie@oracle.comReviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Martin K. Petersen authored
Mike Christie <michael.christie@oracle.com> says: The following patches were made over Linus's tree but apply over Martin's branches. They allow userspace to configure how fabric drivers submit cmds to backend drivers. Right now loop and vhost use a worker thread, and the other drivers submit from the contexts they receive/process the cmd from. For multiple LUN cases where the target can queue more cmds than the backend can handle then deferring to a worker thread is safest because the backend driver can block when doing things like waiting for a free request/tag. Deferring also helps when the target has to handle transport level requests from the recv context. For cases where the backend devices can queue everything the target sends, then there is no need to defer to a workqueue and you can see a perf boost of up to 26% for small IO workloads. For a nvme device and vhost-scsi I can see with 4K IOs: fio jobs 1 2 4 8 10 -------------------------------------------------- workqueue submit 94K 190K 394K 770K 890K direct submit 128K 252K 488K 950K - Link: https://lore.kernel.org/r/1b1f7a5c-0988-45f9-b103-dfed2c0405b1@oracle.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Mike Christie authored
This exports the fabric driver's direct submit settings, so users know what the driver supports. It will be helpful when they are exporting a device through different targets and one doesn't support direct submission. The new files allow the fabric to report what submission types they default to and if they support direct submission: default_submit_type: 1 - TARGET_DIRECT_SUBMIT - If the user has not requested a specific value then the fabric requests direct submission. 2 - TARGET_QUEUE_SUBMIT - If the user has not requested a specific value then the fabric requests queued submission. Note that these fabric values are based on what the fabric driver currently defaults to for compat with exiting setups. direct_submit_supported: 0 - The fabric does not support direct submission. 1 - The fabric supports direct submission. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230928020907.5730-8-michael.christie@oracle.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Mike Christie authored
target_queue_submission() is not called by drivers anymore so unexport it. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230928020907.5730-7-michael.christie@oracle.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Mike Christie authored
This allows userspace to request the fabric drivers do direct submissions if they support it. With the new device file, submit_type, users can write 0 - 2 to control how commands are submitted to the backend: 0 - TARGET_FABRIC_DEFAULT_SUBMIT - LIO will use the fabric's default submission type. This is the default for compat. 1 - TARGET_DIRECT_SUBMIT - LIO will submit the cmd to the backend from the calling context if the fabric the cmd was received on supports it, else it will use the fabric's default type. 2 - TARGET_QUEUE_SUBMIT - LIO will queue the cmd to the LIO submission workqueue which will pass it to the backend. When using an NVMe drive and vhost-scsi with direct submission we see around a 20% improvement in 4K I/Os: fio jobs 1 2 4 8 10 -------------------------------------------------- defer 94K 190K 394K 770K 890K direct 128K 252K 488K 950K - And when using the queueing mode, we now no longer see issues like where the iSCSI tx thread is blocked in the block layer waiting on a tag so it can't respond to a nop or perform I/Os for other LUs. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230928020907.5730-6-michael.christie@oracle.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Mike Christie authored
Move the code from transport_handle_cdb_direct() to target_submit() and have iSCSI call target_submit(). Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230928020907.5730-5-michael.christie@oracle.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Mike Christie authored
Move the hack to clear some buffers to transport_handle_cdb_direct() so we can eventually merge transport_handle_cdb_direct() and target_submit(). This also fixes up the comment so it's clear it was only for udev and reflects that the referenced function does not exist and we now allow more than 1 page for control CDBs. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230928020907.5730-4-michael.christie@oracle.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Mike Christie authored
Move core_alua_check_nonop_delay() to transport_handle_cdb_direct() so the iSCSI target driver doesn't have to call as many core functions directly. We will eventually merge transport_handle_cdb_direct and target_submit so iSCSI and the other drivers call a common function. It will also be helpful as preparation for future changes which allow the iSCSI target to defer command submission to the LIO submission workqueue, because we will have a common submission function for that which will be based on transport_handle_cdb_direct()/target_submit(). Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230928020907.5730-3-michael.christie@oracle.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Mike Christie authored
In some cases, like with multiple LUN targets or where the target has to respond to transport level requests from the receiving context it can be better to defer cmd submission to a helper thread. If the backend driver blocks on something like request/tag allocation it can block the entire target submission path and other LUs and transport IO on that session. In other cases like single LUN targets with storage that can support all the commands that the target can queue, then it's best to submit the cmd to the backend from the target's cmd receiving context. Subsequent commits will allow the user to config what they prefer, but drivers like loop can't directly submit because they can be called from a context that can't sleep. And, drivers like vhost-scsi can support direct submission, but need to keep their default behavior of deferring execution to avoid possible regressions where the backend can block. Make the drivers tell LIO core if they support direct submissions and their current default, so we can prevent users from misconfiguring the system and initialize devices correctly. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230928020907.5730-2-michael.christie@oracle.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Mike Christie authored
Subsequent commits add more on/off type of settings to the target_core_fabric_ops struct so this makes write_pending_must_be_called a bit field instead of a bool to better organize the settings. Signed-off-by: Mike Christie <michael.christie@oracle.com> Link: https://lore.kernel.org/r/20230928020907.5730-1-michael.christie@oracle.comSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Martin K. Petersen authored
Hannes Reinecke <hare@suse.de> says: Hi all, (taking up an old thread:) here's the first batch of patches for my EH rework. It modifies the reset callbacks for SCSI drivers such that the final conversion to drop the 'struct scsi_cmnd' argument and use the entity in question (host, bus, target, device) as the argument to the SCSI EH callbacks becomes possible. The first part covers drivers which just requires minor tweaks. Link: https://lore.kernel.org/r/20231002154328.43718-1-hare@suse.deSigned-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Hannes Reinecke authored
SCSI EH host reset is the final callback in the escalation chain; once we reach this we need to reset the controller. As such it defeats the purpose to skip controller reset if no I/Os are pending and the RAID device is to be reset; especially after kexec there might be stale commands pending in firmware for which we have no reference whatsoever. So this patch splits off the check for pending I/O into a 'bus_reset' function, and leaves the actual controller reset to the host reset. Signed-off-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20231002154328.43718-19-hare@suse.de Cc: Kashyap Desai <kashyap.desai@broadcom.com> Cc: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com> Cc: Sumit Saxena <sumit.saxena@broadcom.com> Cc: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Hannes Reinecke authored
The reset code requires a device to be selected, but we shouldn't rely on the command to provide a device for us. So select the first device on the target when sending down a target reset. Signed-off-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20231002154328.43718-18-hare@suse.deReviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Hannes Reinecke authored
The reset code requires a device to be selected, but we shouldn't rely on the command to provide a device for us. So select the first device on the bus when sending down a bus reset. Signed-off-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20231002154328.43718-17-hare@suse.deReviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Hannes Reinecke authored
There's not much in common between host reset and all other error handlers, so use a separate function here. Signed-off-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20231002154328.43718-16-hare@suse.deReviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Hannes Reinecke authored
Split off the combined abort and device reset handling into distinct functions. And rename the current device reset handler into a target reset handler, seeing that it really is a target reset. Signed-off-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20231002154328.43718-15-hare@suse.deReviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Hannes Reinecke authored
The current handler does both, bus reset and host reset. So split them off into two distinct functions. Signed-off-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20231002154328.43718-14-hare@suse.de Cc: Matthew Wilcox <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-