- 12 Sep, 2018 40 commits
-
-
James Smart authored
When taking the board offline while performing i/o, unsafe locking errors occurred and irq level isn't properly managed. In lpfc_sli_hba_down, spin_lock_irqsave(&phba->hbalock, flags) does not disable softirqs raised from timer expiry. It is possible that a softirq is raised from the lpfc_els_retry_delay routine and recursively requests the same phba->hbalock spinlock causing deadlock. Address the deadlocks by creating a new port_list lock. The softirq behavior can then be managed a level deeper into the calling sequences. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
James Smart authored
When running an mds diagnostic that passes frames with the switch, soft lockups are detected. The driver is in a CQE processing loop and has sufficient amount of traffic that it never exits the ring processing routine, thus the "lockup". Cap the number of elements in the work processing routine to 64 elements. This ensures that the cpu will be given up and the handler reschedule to process additional items. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
James Smart authored
On io completion, the driver is taking an adapter wide lock and nulling the scsi command back pointer. The nulling of the back pointer is to signify the io was completed and the scsi_done() routine was called. However, the routine makes no check to see if the abort routine had done the same thing and possibly nulled the pointer. Thus it may doubly-complete the io. Make the following mods: - Check to make sure forward progress (call scsi_done()) only happens if the command pointer was non-null. - As the taking of the lock, which is adapter wide, is very costly on a system under load, null the pointer using an xchg operation rather than under lock. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
James Smart authored
When nvme is enabled, change the default for two parameters: sg_seg_cnt - raise the per-io sg list size so that 1MB ios are supported (based on a 4k buffer per element). iocb_cnt - raise the number of buffers used for things like NVME LS request/responses to allow more concurrent requests to for larger nvme configs. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
James Smart authored
The driver allocates a sg list per io struture based on a fixed maximum size. When it registers with the protocol transports and indicates the max sg list size it supports, the driver manipulates the fixed value to report a lesser amount so that it has reserved space for sg elements that are used for DIF. The driver initialization path sets the cfg_sg_seg_cnt field to the manipulated value for scsi. NVME initialization ran afterward and capped it's maximum by the manipulated value for SCSI. This erroneously made NVME report the SCSI-reduce-for-DIF value that reduced the max io size for nvme and wasted sg elements. Rework the driver so that cfg_sg_seg_cnt becomes the overall maximum size and allow the max size to be tunable. A separate (new) scsi sg count is then setup with the scsi-modified reduced value. NVME then initializes based off the overall maximum. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
James Smart authored
Driver only sends NVME PRLI to a device that also supports FCP. This resuls in remote ports that don't have fc_remote_ports created for them. The driver is clearing the nlp_fc4_type for a ndlp at the wrong time. Fix by moving the nlp_fc4_type clearing to the discovery engine in the DEVICE_RECOVERY state. Also ensure that rport registration is done for all nlp_fc4_types. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Himanshu Madhani authored
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
This patch fixes issue when switch command fails, current code increments retry count twice. This results in a smaller number of retries. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
Current code relies on switch to provide a unique combination of WWPN + NPORTID to tract an FC port. This patch tries to detect a case where switch data base can get corrupted where multiple WWPNs can have the same Nport ID. The 1st Nport ID on the list will be kept while the duplicate Nport ID will be discarded. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
Remove stale debug trace. Fixes: 1eb42f96 ("qla2xxx: Make trace flags more readable") Cc: stable@vger.kernel.org #4.10 Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
When qla2xxx and Target Core gets out of sync during command cleanup, qla2xxx will not free command until it is out of firmware's hand and Target Core has called the release on the command. This patch adds synchronization using cmd_lock and release flag. If the release flag is set, then qla2xxx will free up the command using qlt_free_cmd() otherwise transport_generic_free_cmd() will be responsible for relase of the command. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
Reject bsg request if chip is down. This prevent erroneous timeout. Fixes: d051a5aa ("[SCSI] qla2xxx: Add an "is reset active" helper.") Cc: stable@vger.kernel.org # 4.10 Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
If chip unable to fully initialize, use full shutdown sequence to clear out any stale FW state. Fixes: e315cd28 ("[SCSI] qla2xxx: Code changes for qla data structure refactoring") Cc: stable@vger.kernel.org #4.10 Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
On PLOGI complete + RSCN received, driver tries to handle RSCN but failed to reset the session back to the beginning to restart the login process. Instead the session was left in the Plogi complete without moving forward. This patch will push the session state back to the delete state and restart the connection. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
Task abort can take 2 paths: 1) serial/synchronous abort where the calling thread will put to sleep, wait for completion and free cmd resource. 2) async abort where the cmd free will be free by the completion thread. For path 2, driver is freeing the SRB too early. Fixes: f6145e86 ("scsi: qla2xxx: Fix race between switch cmd completion and timeout") Cc: stable@vger.kernel.org # 4.19 Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
Add ability to allow each physical port to control operating mode. Current code forces all ports to behave in one mode (i.e. initiator, target or dual). This patch allows user to select the operating mode for each port. - Driver must be loaded in dual mode to allow resource allocation modprobe qla2xxx qlini_mode=dual - In addition user can make adjustment to exchange resources using following command echo 1024 > /sys/class/scsi_host/host<x>/ql2xiniexchg echo 1024 > /sys/class/scsi_host/host<x>/ql2xexchoffld - trigger mode change and new setting of ql2xexchoffld|ql2xiniexchg echo [<value>] > /sys/class/scsi_host/host<x>/qlini_mode where, value can be one of following - enabled - disabled - dual - exclusive Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
For Loop topology + Initiator, FW is in control of PLOGI/PRLI. When link is reset, driver will try to cleanup the session by doing an Implicit Logout. Instead, the code is doing an Explicit Logout. The explicit logout interferes with FW state machine in trying to reconnect. The implicit logout was meant for FW to flush commands. In loop, it is not needed because FW will auto flush. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
When FW rejects a command due to "entry_status" error (malform IOCB), the srb resource needs to be returned back for cleanup. The filter to catch this is in the wrong location. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
Clear port speed value on chip reset. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Sawan Chandak authored
During adapter shutdown process check for register disconnect before proceeding to call PCI functions. Signed-off-by: Sawan Chandak <sawan.chandak@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
Abort IOCB request can take up to 40s or 2 ABTS timeout. We will wait for ABTS response for 20s. On a timeout, second ABTS can go out with another 20s timeout. On 2nd ABTS timeout FW will automatically do Logout. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Darren Trapp authored
This patch allows FC-NVMe under-run to be handled by transport Signed-off-by: Darren Trapp <darren.trapp@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Himanshu Madhani authored
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
Current abort code defaults to legacy single queue where hardware_lock is used to protect command search. This patch moves this code behind the QPair where the qp_lock_ptr will reference the appropriate lock for either legacy/single queue or MQ. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Himanshu Madhani authored
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Himanshu Madhani authored
Using GPNFT/GNNFT command will be able to cover switch database with less number of scans. This patch removes Get NportID with provided WWPN/GIDPN switch command. By making this change, in large fabric with lots of remote port or NPIV ports with noisy SAN, the number of GIDPN commands issued by a port when it detects large number of remote ports going away or coming back, can overwhelmn the switch and it can becomde unresponsive. In a case where the fabric has not change, GIDPN is not required. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
- Reduce sess_lock holding to prevent CPU Lock up. sess_lock was held across fc_port registration and deletion. These calls can be blocked by upper layer. Sess_lock is also being accessed by interrupt thread. - Reduce number of loops in processing work_list to prevent kernel complaint of CPU lockup or holding sess_lock. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
Currently, qla2x00_[get_sp|rel_sp] routines does {get|release} of srb resource/srb_mempool directly from qla_hw_data. qla2x00_start_sp() is used to issue management commands through the default Request Q 0 & Response Q 0 or base_qpair. This patch moves access of these resources through base_qpair. Instead of having knowledge of specific Q number and lock to rsp/req queue, this change will key off the qpair that is assigned to the srb resource. This lays the ground work for other routines to see this resource through the qpair. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
Add sysfs support to control zio6 interrupt threshold. Using this sysfs hook user can set when to generate interrupts. This value will be used to tell firmware to generate interrupt at a certain interval. If the number of exchanges/commands fall below defined setting, then the interrupt will be generated immediately by the firmware. By default ZIO6 will coalesce interrupts to a specified interval regardless of low traffic or high traffic. [mkp: fixed several typos] Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
Following changes are added by this patch - Prevent ABTS Response from getting in front of Termination of exchange. Firmware requires driver to cleanup exchanges before ABTS response can be sent. This reduces ABTS response error which triggers extra command re-termination and re-sending of ABTS response. - Add bits in driver and tracks CTIO/ATIO attribute bits for proper command Termination. A copy of the ATTR bits will be kept in the ABTS task management command as a back up copy, if an ABTS response encounters an error. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
ABTS error completion can trigger an exchange cleanup from the driver and another ABTS response will be generated. This retry of ABTS response can cause loop between driver trying to send ABTS and firmware returning error. This patch fixes this issue by adding logic to check for unresolved exchanges and clean up before ABTS is retried. This patch also addes the fix to use the same qpair as the ABTS completion for the terminatation of exchange as well as retry of ABTS response. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
When driver detect CTIO_INVALID_RX_ID status for CTIO, print message with correct information to help with debugging. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
Move ATIO queue processing out of hardware_lock to prevent deadlock. Fixes: 3bb67df5 ("qla2xxx: Check for online flag instead of active reset when transmitting responses") Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
For driver MBX submission, use mbox_busy to serialize request. For Userspace MBX submission, use optrom mutex to serialize request. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Himanshu Madhani authored
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
When driver receive PLOGI/PRLI from FW, the WWPN value will be provided. If it is not, then driver will terminate it. The WWPN allows driver to locate the session or create a new session. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
For target mode, any chip reset triggered before target mode is enabled will be held off until user is ready to enable. This prevents the chip from starting or running before it is intended. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
When switch responds with error for Get Port Speed Command (GPSC), driver should not proceed with telling FW about the speed of the remote port. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
When all fabric scan retries fail, remove all RPorts, DMA resources for the command. Otherwise we have stale Rports. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-