- 09 Dec, 2020 40 commits
-
-
Alan Stern authored
blk_queue_enter() accepts BLK_MQ_REQ_PM requests independent of the runtime power management state. Now that SCSI domain validation no longer depends on this behavior, modify the behavior of blk_queue_enter() as follows: - Do not accept any requests while suspended. - Only process power management requests while suspending or resuming. Submitting BLK_MQ_REQ_PM requests to a device that is runtime suspended causes runtime-suspended devices not to resume as they should. The request which should cause a runtime resume instead gets issued directly, without resuming the device first. Of course the device can't handle it properly, the I/O fails, and the device remains suspended. The problem is fixed by checking that the queue's runtime-PM status isn't RPM_SUSPENDED before allowing a request to be issued, and queuing a runtime-resume request if it is. In particular, the inline blk_pm_request_resume() routine is renamed blk_pm_resume_queue() and the code is unified by merging the surrounding checks into the routine. If the queue isn't set up for runtime PM, or there currently is no restriction on allowed requests, the request is allowed. Likewise if the BLK_MQ_REQ_PM flag is set and the status isn't RPM_SUSPENDED. Otherwise a runtime resume is queued and the request is blocked until conditions are more suitable. [ bvanassche: modified commit message and removed Cc: stable because without the previous patches from this series this patch would break parallel SCSI domain validation + introduced queue_rpm_status() ] Link: https://lore.kernel.org/r/20201209052951.16136-9-bvanassche@acm.org Cc: Jens Axboe <axboe@kernel.dk> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Can Guo <cang@codeaurora.org> Cc: Stanley Chu <stanley.chu@mediatek.com> Cc: Ming Lei <ming.lei@redhat.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reported-and-tested-by: Martin Kepplinger <martin.kepplinger@puri.sm> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Can Guo <cang@codeaurora.org> Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Bart Van Assche authored
Remove flag RQF_PREEMPT and BLK_MQ_REQ_PREEMPT since these are no longer used by any kernel code. Link: https://lore.kernel.org/r/20201209052951.16136-8-bvanassche@acm.org Cc: Can Guo <cang@codeaurora.org> Cc: Stanley Chu <stanley.chu@mediatek.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Ming Lei <ming.lei@redhat.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Martin Kepplinger <martin.kepplinger@puri.sm> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Can Guo <cang@codeaurora.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Bart Van Assche authored
Instead of submitting all SCSI commands submitted with scsi_execute() to a SCSI device if rpm_status != RPM_ACTIVE, only submit RQF_PM (power management requests) if rpm_status != RPM_ACTIVE. This patch makes the SCSI core handle the runtime power management status (rpm_status) as it should be handled. Link: https://lore.kernel.org/r/20201209052951.16136-7-bvanassche@acm.org Cc: Can Guo <cang@codeaurora.org> Cc: Stanley Chu <stanley.chu@mediatek.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Ming Lei <ming.lei@redhat.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Martin Kepplinger <martin.kepplinger@puri.sm> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Can Guo <cang@codeaurora.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Bart Van Assche authored
Disable runtime power management during domain validation. Since a later patch removes RQF_PREEMPT, set RQF_PM for domain validation commands such that these are executed in the quiesced SCSI device state. Link: https://lore.kernel.org/r/20201209052951.16136-6-bvanassche@acm.org Cc: Alan Stern <stern@rowland.harvard.edu> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Woody Suwalski <terraluna977@gmail.com> Cc: Can Guo <cang@codeaurora.org> Cc: Stanley Chu <stanley.chu@mediatek.com> Cc: Ming Lei <ming.lei@redhat.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Stan Johnson <userm57@yahoo.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Bart Van Assche authored
This is another step that prepares for the removal of RQF_PREEMPT. Link: https://lore.kernel.org/r/20201209052951.16136-5-bvanassche@acm.org Cc: David S. Miller <davem@davemloft.net> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Can Guo <cang@codeaurora.org> Cc: Stanley Chu <stanley.chu@mediatek.com> Cc: Ming Lei <ming.lei@redhat.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Bart Van Assche authored
RQF_PREEMPT is used for two different purposes in the legacy IDE code: 1. To mark power management requests. 2. To mark requests that should preempt another request. An (old) explanation of that feature is as follows: "The IDE driver in the Linux kernel normally uses a series of busywait delays during its initialization. When the driver executes these busywaits, the kernel does nothing for the duration of the wait. The time spent in these waits could be used for other initialization activities, if they could be run concurrently with these waits. More specifically, busywait-style delays such as udelay() in module init functions inhibit kernel preemption because the Big Kernel Lock is held, while yielding APIs such as schedule_timeout() allow preemption. This is true because the kernel handles the BKL specially and releases and reacquires it across reschedules allowed by the current thread. This IDE-preempt specification requires that the driver eliminate these busywaits and replace them with a mechanism that allows other work to proceed while the IDE driver is initializing." Since I haven't found an implementation of (2), do not set the PREEMPT flag for sense requests. This patch causes sense requests to be postponed while a drive is suspended instead of being submitted to ide_queue_rq(). If it would ever be necessary to restore the IDE PREEMPT functionality, that can be done by introducing a new flag in struct ide_request. Link: https://lore.kernel.org/r/20201209052951.16136-4-bvanassche@acm.org Cc: David S. Miller <davem@davemloft.net> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Can Guo <cang@codeaurora.org> Cc: Stanley Chu <stanley.chu@mediatek.com> Cc: Ming Lei <ming.lei@redhat.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Bart Van Assche authored
Introduce the BLK_MQ_REQ_PM flag. This flag makes the request allocation functions set RQF_PM. This is the first step towards removing BLK_MQ_REQ_PREEMPT. Link: https://lore.kernel.org/r/20201209052951.16136-3-bvanassche@acm.org Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Stanley Chu <stanley.chu@mediatek.com> Cc: Ming Lei <ming.lei@redhat.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Can Guo <cang@codeaurora.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Can Guo <cang@codeaurora.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Bart Van Assche authored
With the current implementation the following race can happen: * blk_pre_runtime_suspend() calls blk_freeze_queue_start() and blk_mq_unfreeze_queue(). * blk_queue_enter() calls blk_queue_pm_only() and that function returns true. * blk_queue_enter() calls blk_pm_request_resume() and that function does not call pm_request_resume() because the queue runtime status is RPM_ACTIVE. * blk_pre_runtime_suspend() changes the queue status into RPM_SUSPENDING. Fix this race by changing the queue runtime status into RPM_SUSPENDING before switching q_usage_counter to atomic mode. Link: https://lore.kernel.org/r/20201209052951.16136-2-bvanassche@acm.org Fixes: 986d413b ("blk-mq: Enable support for runtime power management") Cc: Ming Lei <ming.lei@redhat.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: stable <stable@vger.kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Jens Axboe <axboe@kernel.dk> Acked-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Stanley Chu <stanley.chu@mediatek.com> Co-developed-by: Can Guo <cang@codeaurora.org> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Adrian Hunter authored
Enable runtime PM auto-suspend by default for Intel host controllers. Link: https://lore.kernel.org/r/20201207083120.26732-5-adrian.hunter@intel.comSigned-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Adrian Hunter authored
Intel controllers can end up in an unrecoverable state after a hibernate exit error unless a full reset and restore is done before anything else. Force that to happen. Link: https://lore.kernel.org/r/20201207083120.26732-4-adrian.hunter@intel.comSigned-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Adrian Hunter authored
The expectation for suspend-to-disk is that devices will be powered-off, so the UFS device should be put in PowerDown mode. If spm_lvl is not 5, then that will not happen. Change the pm callbacks to force spm_lvl 5 for suspend-to-disk poweroff. Link: https://lore.kernel.org/r/20201207083120.26732-3-adrian.hunter@intel.comSigned-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Adrian Hunter authored
Currently, ufshcd-pci is the only UFS driver with support for suspend-to-disk PM callbacks (i.e. freeze/thaw/restore/poweroff). These callbacks are set by the macro SET_SYSTEM_SLEEP_PM_OPS to the same functions as system suspend/resume. That will work with spm_lvl 5 because spm_lvl 5 will result in a full restore for the ->restore() callback. In the absence of a full restore, the host controller registers will have values set up by the restore kernel (the kernel that boots and loads the restore image) which are not necessarily the same. However it turns out, the only registers that sometimes need restore are the base address registers. This has gone un-noticed because, depending on IOMMU settings, the kernel can end up allocating the same addresses every time. For Intel controllers, an spm_lvl other than 5 can be used, so to support S4 (suspend-to-disk) with spm_lvl other than 5, restore the base address registers. Link: https://lore.kernel.org/r/20201207083120.26732-2-adrian.hunter@intel.comSigned-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Stanley Chu authored
For some devices which need extra delay after VCC power down, VCC shall be kept always-on in some MediaTek UFS platforms to ensure the stability of such devices because the extra delay may not be enough in those platforms. Link: https://lore.kernel.org/r/20201207054955.24366-3-stanley.chu@mediatek.comReviewed-by: Andy Teng <andy.teng@mediatek.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Stanley Chu authored
Introduce a flag "always_on" in struct ufs_vreg to allow vendors to keep the regulator always-on. Link: https://lore.kernel.org/r/20201207054955.24366-2-stanley.chu@mediatek.comReviewed-by: Andy Teng <andy.teng@mediatek.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Randall Huang authored
If RPMB is not provisioned, we may see RPMB failure after UFS suspend/resume. Inject request_sense to clear uac in ufshcd reset flow. Link: https://lore.kernel.org/r/20201201041402.3860525-1-jaegeuk@kernel.orgReported-by: kernel test robot <lkp@intel.com> Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Randall Huang <huangrandall@google.com> Signed-off-by: Leo Liou <leoliou@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@google.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Bean Huo authored
Change dev_err() print message from "dme-reset" to "dme_enable" in function ufshcd_dme_enable(). Link: https://lore.kernel.org/r/20201207190137.6858-3-huobean@gmail.comAcked-by: Alim Akhtar <alim.akhtar@samsung.com> Acked-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Bean Huo authored
POWER_DESC_MAX_SIZE is unused, remove it. Link: https://lore.kernel.org/r/20201207190137.6858-2-huobean@gmail.comAcked-by: Avri Altman <avri.altman@wdc.com> Acked-by: Alim Akhtar <alim.akhtar@samsung.com> Signed-off-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Suganath Prabu S authored
Update driver version to 36.100.00.00 Link: https://lore.kernel.org/r/20201126094311.8686-9-suganath-prabu.subramani@broadcom.comSigned-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Suganath Prabu S authored
If a firmware update adds support for the trigger pages, then the driver should handle this by writing the existing trigger data from the driver's internal data structure to the corresponding trigger pages in NVRAM. Also handle the case where the trigger page capability is no longer present after a firmware downgrade. Link: https://lore.kernel.org/r/20201126094311.8686-8-suganath-prabu.subramani@broadcom.comSigned-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Suganath Prabu S authored
This page is used to store information about MPI (IOC Status & LogInfo) triggers. Driver Persistent Trigger Page-4 format: ------------------------------------------------------- | 31 24 23 16 15 8 7 0| Byte ------------------------------------------------------- | PageType | PageNumber | Reserved | PageVersion | 0x00 -------------------------------------------------------- | Reserved | ExtPageType | ExtPageLength | 0x04 -------------------------------------------------------- | Reserved | NumMpiTriggerEntries | 0x08 -------------------------------------------------------- | MPITriggerEntry[0] | 0x0C -------------------------------------------------------- | … | -------------------------------------------------------- | MPITriggerEntry[19] | 0xA4 -------------------------------------------------------- NumMpiTriggerEntries: This field indicates number of MPI (IOC Status & LogInfo) trigger entries stored in this page. Currently driver is supporting a maximum of 20-MPI trigger entries. MPITriggerEntry: ----------------------------------------------------- | 31 16 15 0 | ----------------------------------------------------- | Reserved | IOCStatus | ----------------------------------------------------- | IOCLogInfo | ----------------------------------------------------- IOCStatus => Status value from the IOC IOCLogInfo => Specific value that supplements the IOCStatus. Link: https://lore.kernel.org/r/20201126094311.8686-7-suganath-prabu.subramani@broadcom.comReported-by: kernel test robot <lkp@intel.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Suganath Prabu S authored
Trigger Page3 is used to store information about SCSI Sense triggers: Persistent Trigger Page-3 ------------------------------------------------------------------ | 31 24 23 16 15 8 7 0| Byte ------------------------------------------------------------------ | PageType | PageNumber | Reserved | PageVersion | 0x00 ------------------------------------------------------------------ | Reserved | ExtPageType | ExtPageLen | 0x04 ------------------------------------------------------------------ | Reserved | NumScsiSense | TriggerEntries | 0x08 ------------------------------------------------------------------ | ScsiSenseTriggerEntry[0] | 0x0C ------------------------------------------------------------------ | … … | ------------------------------------------------------------------ | ScsiSenseTriggerEntry[19] | 0x58 ------------------------------------------------------------------ NumScsiSenseTriggerEntries: This field indicates number of SCSI Sense trigger entries stored in this page. Currently driver is supporting a maximum of 20-SCSI Sense trigger entries. ScsiSenseTriggerEntry: ----------------------------------------------- | 31 24 23 16 15 8 7 0 | ----------------------------------------------- | Reserved | SenseKey | ASC | ASCQ | ----------------------------------------------- ASCQ => Additional Sense Code Qualifier ASC => Additional Sense Code SenseKey => Sense Key values ASCQ => Additional Sense Code Qualifier ASC => Additional Sense Code SenseKey => Sense Key values Link: https://lore.kernel.org/r/20201126094311.8686-6-suganath-prabu.subramani@broadcom.comReported-by: kernel test robot <lkp@intel.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Suganath Prabu S authored
Trigger Page2 is used to store information about Event triggers: 31 24 23 16 15 8 7 0 Byte ----------------------------------------------- |PageType |PageNumber |Reserved |PageVersion| 0x00 ----------------------------------------------- |Reserved |ExtPageType | ExtPageLength | 0x04 ----------------------------------------------- | Reserved | NumMPIEventTriggers | 0x08 ----------------------------------------------- | MPIEventTriggerEntries | 0x0C | | 0xFC ----------------------------------------------- Number of MPI Event Trigger Entries currently stored in this page. If this is set to zero, there are no valid MPI-Event-Trigger entries available in this page. MPIEventTriggerEntry: - MPIEventCode [15:00] MPI Event code specified in MPI-Spec - MPIEventCodeSpecific [16:31] For Event Code “MPI2_EVENT_LOG_ENTRY_ADDED (0x0021)”, this field specifies the Log-Entry-Qualifier. For all other Event Codes, this field is reserved and not used Maximum of 20-event trigger entries can be stored in this page. Link: https://lore.kernel.org/r/20201126094311.8686-5-suganath-prabu.subramani@broadcom.comReported-by: kernel test robot <lkp@intel.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Suganath Prabu S authored
Trigger Page 1 is used to store information about Master triggers. Below are the Master trigger conditions: Bit[3] Trigger condition for Device Removal event Bit[2] Trigger condition for TM command issued by driver Bit[1] Trigger condition for Adapter reset issued by driver Bit[0] Trigger condition for IOC Fault state During driver load, if Master trigger type bit is enabled in the Persistent Trigger Page0, then read the Persistent Trigger Page1 and update the IOC instance's diag_trigger_master.MasterData with Persistent Trigger Page1's MasterTriggerFlags. Link: https://lore.kernel.org/r/20201126094311.8686-4-suganath-prabu.subramani@broadcom.comReported-by: kernel test robot <lkp@intel.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Suganath Prabu S authored
The user can set trigger values in order to collect the IOC's host trace buffer automatically upon detecting certain conditions. However, the trigger values that the user sets are not persistent across system reboot or reload of the driver. In order to make the user trigger settings persistent, these trigger values need to be saved in the IOC's NVRAM pages: - Driver Persistent Trigger Page 0: This page is used to store list of trigger types that are enabled - Driver Persistent Trigger Page 1: This page stores the list of Master triggers that are enabled - Driver Persistent Trigger Page 2: This page stores the list of MPI Event Triggers that are enabled - Driver Persistent Trigger Page 3: This page stores the list of SCSI Sense Triggers that are enabled - Driver Persistent Trigger Page 4: This page stores the list of IOCStatus-LogInfo Triggers that are enabled. Whenever user configures triggers, the driver persists the values in the corresponding trigger pages. When the driver is subsequently reloaded, the driver reads the values from the trigger pages and configures the triggers accordingly. During firmware upload operation, if the newer firmware supports the trigger page feature, then driver persists the configured diag trigger values to NVRAM. Link: https://lore.kernel.org/r/20201126094311.8686-3-suganath-prabu.subramani@broadcom.comReported-by: kernel test robot <lkp@intel.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Suganath Prabu S authored
The controller time currently gets updated with host time during driver load or when a controller reset is issued. I.e. when host issues the IOCInit request message to the HBA firmware. This IOCInit message has a field named 'TimeStamp' with which the host updates the controller time. Sometimes controller time drifts with respect to the host and it is difficult to correlate host logs with controller logs. Issuing a controller reset to sync the time would impact in-flight I/O and is not a viable option. Instead the driver now sends an IO_UNIT_CONTROL Request to sync the time periodically. This is done from the watchdog thread which gets invoked every second. The time synchronization interval is specified in the 'TimeSyncInterval' field in Manufacturing Page11 by the controller: TimeSyncInterval - 8 bits bits 0-6: Time stamp Synchronization interval value bit 7: Time stamp Synchronization interval unit, (if this bit is one then Timestamp Synchronization interval value is specified in terms of hours else Timestamp Synchronization interval value is specified in terms of minutes). The driver keeps track of the timer using IOC's timestamp_update_count field. This field value gets incremented whenever the watchdog thread gets invoked. And whenever this field value is greater than or equal to the Time Stamp Synchronization interval value, the driver sends the IO_UNIT_CONTROL Request message to controller to update the time and then it resets the timestamp_update_count field to zero. Link: https://lore.kernel.org/r/20201126094311.8686-2-suganath-prabu.subramani@broadcom.comReported-by: kernel test robot <lkp@intel.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Nilesh Javali authored
Link: https://lore.kernel.org/r/20201202132312.19966-16-njavali@marvell.comReviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Arun Easi authored
Due to a bug in the older scan logic, when a once lost device re-appeared, it was not discovered. Fix this by resetting login_retry counter upon device discovery. This is applicable only for 4G and older HBAs. Link: https://lore.kernel.org/r/20201202132312.19966-15-njavali@marvell.comReviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Saurav Kashyap authored
Driver unload with I/Os in flight causes server to crash. Complete I/O with DID_IMM_RETRY if fcport undergoing deletion. CPU: 44 PID: 35008 Comm: qla2xxx_4_dpc Kdump: loaded Tainted: G OE X 5.3.18-22-default #1 SLE15-SP2 (unreleased) Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 07/16/2020 RIP: 0010:dma_direct_unmap_sg+0x24/0x60 Code: 4c 8b 04 24 eb b9 0f 1f 44 00 00 85 d2 7e 4e 41 57 4d 89 c7 41 56 41 89 ce 41 55 49 89 fd 41 54 41 89 d4 55 31 ed 53 48 89 f3 <8b> 53 18 48 8b 73 10 4d 89 f8 44 89 f1 4c 89 ef 83 c5 01 e8 44 ff RSP: 0018:ffffc0c661037d88 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002 RDX: 000000000000001d RSI: 0000000000000000 RDI: ffff9a51ee53b0b0 RBP: 0000000000000000 R08: 0000000000000000 R09: ffff9a51ee53b0b0 R10: ffffc0c646463dc8 R11: ffff9a4a067087c8 R12: 000000000000001d R13: ffff9a51ee53b0b0 R14: 0000000000000002 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff9a523f800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000018 CR3: 000000043740a004 CR4: 00000000007606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: qla2xxx_qpair_sp_free_dma+0x20d/0x3c0 [qla2xxx] qla2xxx_qpair_sp_compl+0x35/0x90 [qla2xxx] __qla2x00_abort_all_cmds+0x180/0x390 [qla2xxx] ? qla24xx_process_purex_list+0x100/0x100 [qla2xxx] qla2x00_abort_all_cmds+0x5e/0x80 [qla2xxx] qla2x00_do_dpc+0x317/0xa30 [qla2xxx] kthread+0x10d/0x130 ? kthread_park+0xa0/0xa0 ret_from_fork+0x35/0x40 Link: https://lore.kernel.org/r/20201202132312.19966-14-njavali@marvell.comReviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Saurav Kashyap authored
The call trace was because workqueue was allocated without any flags, added WQ_MEM_RECLAIM as flag when allocating. kernel: workqueue: WQ_MEM_RECLAIM kblockd:blk_mq_run_work_fn is flushing !WQ_MEM_RECLAIM qla2xxx_wq:0x0 kernel: WARNING: CPU: 0 PID: 2475 at kernel/workqueue.c:2593 check_flush_dependency+0x110/0x130 kernel: CPU: 0 PID: 2475 Comm: kworker/0:1H Kdump: loaded Tainted: G OE --------- - - 4.18.0-193.el8.x86_64 #1 kernel: Hardware name: HPE ProLiant XL170r Gen10/ProLiant XL170r Gen10, BIOS U38 05/21/2019 kernel: Workqueue: kblockd blk_mq_run_work_fn kernel: RIP: 0010:check_flush_dependency+0x110/0x130 kernel: Code: ff ff 48 8b 50 18 48 8d 8b b0 00 00 00 49 89 e8 48 81 c6 b0 00 00 00 48 c7 c7 00 1e e9 95 c6 05 dc 9a 2f 01 01 e8 1a 42 fe ff <0f> 0b e9 0a ff ff ff 80 3d ca 9a 2f 01 0 0 75 95 e9 41 ff ff ff 90 kernel: RSP: 0018:ffffa40f48b2baf8 EFLAGS: 00010282 kernel: RAX: 0000000000000000 RBX: ffff946795282600 RCX: 0000000000000000 kernel: RDX: 000000000000005f RSI: ffffffff96a1af7f RDI: 0000000000000246 kernel: RBP: 0000000000000000 R08: ffffffff96a1af20 R09: 0000000000029480 kernel: R10: 00080c89bb3e7462 R11: 00000000000009ab R12: ffff946773628000 kernel: R13: 0000000000000282 R14: 0000000000000246 R15: ffffa40f48b2bb40 kernel: FS: 0000000000000000(0000) GS:ffff94679fa00000(0000) knlGS:0000000000000000 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 kernel: CR2: 00005570c4b60110 CR3: 000000029140a005 CR4: 00000000007606f0 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 kernel: PKRU: 55555554 kernel: Call Trace: kernel: flush_workqueue+0x13a/0x440 kernel: qla2x00_wait_for_sess_deletion+0x1d6/0x200 [qla2xxx] kernel: ? finish_wait+0x80/0x80 kernel: qla2xxx_disable_port+0x2b/0x30 [qla2xxx] kernel: qla2x00_process_vendor_specific+0x1dc9/0x2d20 [qla2xxx] kernel: ? blk_rq_map_sg+0x195/0x570 kernel: qla24xx_bsg_request+0x1a3/0xf90 [qla2xxx] Link: https://lore.kernel.org/r/20201202132312.19966-13-njavali@marvell.comReviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Arun Easi authored
Flash update failed due to missing endian conversion in FLT region access as well as in checksum computation. Link: https://lore.kernel.org/r/20201202132312.19966-12-njavali@marvell.comReviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Saurav Kashyap authored
Call trace observed while shutting down the adapter ports (LINK DOWN). Handle aborts correctly. localhost kernel: INFO: task nvme:44209 blocked for more than 120 seconds. localhost kernel: "echo 0 >/proc/sys/kernel/hung_task_timeout_secs" disables this message. localhost kernel: nvme D ffff88b45fb5acc0 0 44209 1 0x00000080 localhost kernel: Call Trace: localhost kernel: [<ffffffffbd187169>] schedule+0x29/0x70 localhost kernel: [<ffffffffbd184c51>] schedule_timeout+0x221/0x2d0 localhost kernel: [<ffffffffbcad7229>] ? ttwu_do_wakeup+0x19/0xe0 localhost kernel: [<ffffffffbcad735f>] ? ttwu_do_activate+0x6f/0x80 localhost kernel: [<ffffffffbcada830>] ? try_to_wake_up+0x190/0x390 localhost kernel: [<ffffffffbd18751d>] wait_for_completion+0xfd/0x140 localhost kernel: [<ffffffffbcadaaf0>] ? wake_up_state+0x20/0x20 localhost kernel: [<ffffffffbcabe3da>] flush_work+0x10a/0x1b0 localhost kernel: [<ffffffffbcabb0f0>] ? move_linked_works+0x90/0x90 localhost kernel: [<ffffffffbcabe6cf>] flush_delayed_work+0x3f/0x50 localhost kernel: [<ffffffffc0452767>] nvme_fc_init_ctrl+0x657/0x6a0 [nvme_fc] localhost kernel: [<ffffffffc045293a>] nvme_fc_create_ctrl+0x18a/0x210 [nvme_fc] localhost kernel: [<ffffffffc028962f>] nvmf_dev_write+0x98f/0xb35 [nvme_fabrics] localhost kernel: [<ffffffffbcd08927>] ? security_file_permission+0x27/0xa0 localhost kernel: [<ffffffffbcc4db50>] vfs_write+0xc0/0x1f0 localhost kernel: [<ffffffffbcc4e92f>] SyS_write+0x7f/0xf0 localhost kernel: [<ffffffffbd193f92>] system_call_fastpath+0x25/0x2a Link: https://lore.kernel.org/r/20201202132312.19966-11-njavali@marvell.comReviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
FC-NVMe target discovery failed when initiator wwpn < target wwpn in an N2N (Direct Attach) config, where the driver was stuck on FCP PRLI mode and failed to retry with NVMe PRLI. Link: https://lore.kernel.org/r/20201202132312.19966-10-njavali@marvell.com Fixes: 84ed362a ("scsi: qla2xxx: Dual FCP-NVMe target port support”) Fixes: 983f1276 ("scsi: qla2xxx: Retry PLOGI on FC-NVMe PRLI failure”) Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Arun Easi authored
Some fields are not correctly byte swapped causing failure during initialization. As probe() returns failure, HBAs will not be claimed when this happens. qla2xxx [0007:01:00.0]-ffff:3: Secure Flash Update in FW: Supported qla2xxx [0007:01:00.0]-ffff:3: SCM in FW: Supported qla2xxx [0007:01:00.0]-00d2:3: Init Firmware **** FAILED ****. qla2xxx [0007:01:00.0]-00d6:3: Failed to initialize adapter - Adapter flags 2. qla2xxx 0007:01:00.1: enabling device (0140 -> 0142) qla2xxx [0007:01:00.1]-011c: : MSI-X vector count: 128. qla2xxx [0007:01:00.1]-001d: : Found an ISP2289 irq 18 iobase 0xd000080080004000. qla2xxx 0007:01:00.1: Using 64-bit direct DMA at offset 800000000000000 BUG: Bad page state in process insmod pfn:67118 page:f00000000168bd40 count:-1 mapcount:0 mapping: (null) index:0x0 page flags: 0x3ffff800000000() page dumped because: nonzero _count Modules linked in: qla2xxx(OE+) nvme_fc nvme_fabrics nvme_core scsi_transport_fc scsi_tgt nls_utf8 isofs ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter nx_crypto ses enclosure scsi_transport_sas pseries_rng sg ip_tables xfs libcrc32c sr_mod cdrom sd_mod crc_t10dif crct10dif_generic crct10dif_common usb_storage ipr libata tg3 ptp pps_core dm_mirror dm_region_hash dm_log dm_mod CPU: 32 PID: 8560 Comm: insmod Kdump: loaded Tainted: G OE ------------ 3.10.0-957.el7.ppc64 #1 Call Trace: [c0000006dd7caa70] [c00000000001cca8] .show_stack+0x88/0x330 (unreliable) [c0000006dd7cab30] [c000000000ac3d88] .dump_stack+0x28/0x3c [c0000006dd7caba0] [c00000000029e48c] .bad_page+0x15c/0x1c0 [c0000006dd7cac40] [c00000000029f938] .get_page_from_freelist+0x11e8/0x1ea0 [c0000006dd7caf40] [c0000000002a1d30] .__alloc_pages_nodemask+0x1c0/0xc70 [c0000006dd7cb140] [c00000000002ba0c] .__dma_direct_alloc_coherent+0x8c/0x170 [c0000006dd7cb1e0] [d000000010a94688] .qla2x00_mem_alloc+0x10f8/0x1370 [qla2xxx] [c0000006dd7cb2d0] [d000000010a9c790] .qla2x00_probe_one+0xb60/0x22e0 [qla2xxx] [c0000006dd7cb540] [c0000000005de764] .pci_device_probe+0x204/0x300 [c0000006dd7cb600] [c0000000006ca61c] .driver_probe_device+0x2cc/0x6f0 [c0000006dd7cb6b0] [c0000000006cabec] .__driver_attach+0x10c/0x110 [c0000006dd7cb740] [c0000000006c5f04] .bus_for_each_dev+0x94/0x100 [c0000006dd7cb7e0] [c0000000006c94f4] .driver_attach+0x34/0x50 [c0000006dd7cb860] [c0000000006c8f58] .bus_add_driver+0x298/0x3b0 [c0000006dd7cb900] [c0000000006cb6e0] .driver_register+0xb0/0x1a0 [c0000006dd7cb980] [c0000000005dc474] .__pci_register_driver+0xc4/0xf0 [c0000006dd7cba10] [d000000010b94e20] .qla2x00_module_init+0x2a8/0x328 [qla2xxx] [c0000006dd7cbaa0] [c00000000000c130] .do_one_initcall+0x130/0x2e0 [c0000006dd7cbb50] [c0000000001b2e8c] .load_module+0x1afc/0x2340 [c0000006dd7cbd40] [c0000000001b3920] .SyS_finit_module+0xd0/0x130 [c0000006dd7cbe30] [c00000000000a284] system_call+0x38/0xfc Link: https://lore.kernel.org/r/20201202132312.19966-9-njavali@marvell.com Fixes: 9f2475fe ("scsi: qla2xxx: SAN congestion management implementation") Fixes: cf3c54fb ("scsi: qla2xxx: Add SLER and PI control support”) Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Arun Easi authored
Crash stack: [576544.715489] Unable to handle kernel paging request for data at address 0xd00000000f970000 [576544.715497] Faulting instruction address: 0xd00000000f880f64 [576544.715503] Oops: Kernel access of bad area, sig: 11 [#1] [576544.715506] SMP NR_CPUS=2048 NUMA pSeries : [576544.715703] NIP [d00000000f880f64] .qla27xx_fwdt_template_valid+0x94/0x100 [qla2xxx] [576544.715722] LR [d00000000f7952dc] .qla24xx_load_risc_flash+0x2fc/0x590 [qla2xxx] [576544.715726] Call Trace: [576544.715731] [c0000004d0ffb000] [c0000006fe02c350] 0xc0000006fe02c350 (unreliable) [576544.715750] [c0000004d0ffb080] [d00000000f7952dc] .qla24xx_load_risc_flash+0x2fc/0x590 [qla2xxx] [576544.715770] [c0000004d0ffb170] [d00000000f7aa034] .qla81xx_load_risc+0x84/0x1a0 [qla2xxx] [576544.715789] [c0000004d0ffb210] [d00000000f79f7c8] .qla2x00_setup_chip+0xc8/0x910 [qla2xxx] [576544.715808] [c0000004d0ffb300] [d00000000f7a631c] .qla2x00_initialize_adapter+0x4dc/0xb00 [qla2xxx] [576544.715826] [c0000004d0ffb3e0] [d00000000f78ce28] .qla2x00_probe_one+0xf08/0x2200 [qla2xxx] Link: https://lore.kernel.org/r/20201202132312.19966-8-njavali@marvell.com Fixes: f73cb695 ("[SCSI] qla2xxx: Add support for ISP2071.") Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Arun Easi authored
Fix compile time errors reported on PPC systems, qla_gbl.h:991:20: error: inlining failed in call to always_inline ‘qla_nvme_abort_set_option’: function body not available Link: https://lore.kernel.org/r/20201202132312.19966-7-njavali@marvell.comSigned-off-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Saurav Kashyap authored
NVMe commands can come only after successful addition of rport and NVMe connect, and rport is only registered after FW started bit is set. Remove the redundant check. Link: https://lore.kernel.org/r/20201202132312.19966-6-njavali@marvell.comReviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
The completion status 0x28 (ppc = be = 0x2800) below indicates session is not there, trigger session deletion. qla2xxx [000b:04:00.1]-8009:8: DEVICE RESET ISSUED nexus=8:1:51 cmd=c000001432d0f600. qla2xxx [000b:04:00.1]-5039:8: Async-tmf error - hdl=67b completion status(2800). qla2xxx [000b:04:00.1]-8030:8: TM IOCB failed (102). qla2xxx [000b:04:00.1]-800c:8: do_reset failed for cmd=c000001432d0f600. qla2xxx [000b:04:00.1]-800f:8: DEVICE RESET FAILED: Task management failed nexus=8:1:51 cmd=c000001432d0f600. qla2xxx [000b:04:00.1]-8009:8: DEVICE RESET ISSUED nexus=8:1:52 cmd=c000001432d0c200. qla2xxx [000b:04:00.1]-5039:8: Async-tmf error - hdl=67c completion status(2800). qla2xxx [000b:04:00.1]-8030:8: TM IOCB failed (102). Link: https://lore.kernel.org/r/20201202132312.19966-5-njavali@marvell.comReviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Quinn Tran authored
Driver created too many QPairs(126) with 28xx adapter. Limit to the number of CPUs to minimize wasted resources. Link: https://lore.kernel.org/r/20201202132312.19966-4-njavali@marvell.comReviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Saurav Kashyap authored
Change the message debug level. Link: https://lore.kernel.org/r/20201202132312.19966-3-njavali@marvell.comSigned-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-
Daniel Wagner authored
When the fcport is about to be deleted we should return EBUSY instead of ENODEV. Only for EBUSY will the request be requeued in a multipath setup. Also return EBUSY when the firmware has not yet started to avoid dropping the request. Link: https://lore.kernel.org/r/20201014073048.36219-1-dwagner@suse.de Link: https://lore.kernel.org/r/20201202132312.19966-2-njavali@marvell.comReviewed-by: Arun Easi <aeasi@marvell.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-