1. 19 Feb, 2019 3 commits
  2. 14 Feb, 2019 2 commits
  3. 13 Feb, 2019 5 commits
  4. 12 Feb, 2019 7 commits
  5. 08 Feb, 2019 8 commits
  6. 06 Feb, 2019 15 commits
    • scsi: lpfc: Update lpfc version to 12.2.0.0 · 42fb055a
      James Smart authored
      Update lpfc version to 12.2.0.0
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: lpfc: Update 12.2.0.0 file copyrights to 2019 · 0d041215
      James Smart authored
      For files modified as part of 12.2.0.0 patches, update copyright to 2019
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: lpfc: Fix nvmet issues when link bounce under IO load · c160c0f8
      James Smart authored
      Various null pointer dereference and general protection fault panics occur
      when there is a link bounce under load. A large number of "error" 6413
      messages indicating "bad release" are also logged.
      
      The issues resolve to list corruptions due to missing or inconsistent lock
      protection. Lockups are due to nested locks in the unsolicited abort
      path. The unsolicited abort path calls the wrong abort processing
      routine. There was also duplicate context release while aborts were still
      active in the hardware.
      
      Removed duplicate locks and added lock protection around list item
      removal. Commonized lock handling around the abort processing routines.
      Prevent context release while still in ABTS list.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: lpfc: Correct upcalling nvmet_fc transport during io done downcall · 472e146d
      James Smart authored
      When the transport calls into the lpfc target to release an IO job
      structure, which corresponds to an exchange, and if the driver was waiting
      for an exchange in order to post a previously received command to the
      transport, the driver immediately takes the IO job and reuses the context
      for the prior command and calls nvmet_fc_rcv_fcp_req() to tell the
      transport about a newly received command.
      
      The problem is that the IO job release may execute in the context of the
      back end driver and its bio completion handlers, and thus may be in an
      irq context, so protection code kicks in within the bio and request layers
      that are subsequently called.
      
      Rework lpfc so that instead of immediately upcalling, queue it to a
      deferred work thread and have the thread make the upcall.
      
      Took advantage of this change to remove duplicated code with the normal
      command receive path that preps the IO job and upcalls nvmet_fc. Created a
      common routine both paths use.
      
      Also corrected some errors that were found during review of the context
      freeing and reuse - basically unlocked operations and a somewhat disjoint
      set of calls to release associated job elements. Cleaned up this path and
      added locks for coherency.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
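
The deferral described in 472e146d can be pictured with a small user-space model. This is only a sketch under stated assumptions, not the driver's code: `io_ctx`, `deferred_q`, `release_and_defer()` and `deferred_worker()` are hypothetical names, and `transport_rcv_cmd()` merely stands in for the upcall to nvmet_fc_rcv_fcp_req(). The point it illustrates is that the release path only queues the reused context, and a later worker makes the upcall.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEFERRED 8  /* hypothetical queue depth for this sketch */

/* Stand-in for a reusable IO context (one per exchange). */
struct io_ctx {
    int cmd_id;  /* command now bound to this context */
};

/* Fixed-size FIFO of contexts awaiting a deferred upcall. */
struct deferred_q {
    struct io_ctx *items[MAX_DEFERRED];
    size_t head, tail, count;
};

static int upcalls_made;  /* counts simulated upcalls into the transport */

/* Stand-in for the upcall (nvmet_fc_rcv_fcp_req() in the commit text). */
static void transport_rcv_cmd(struct io_ctx *ctx)
{
    assert(ctx != NULL);
    upcalls_made++;
}

/*
 * Release path: may run in irq context, so it only rebinds the context
 * to the waiting command and queues it. No upcall is made here.
 */
static bool release_and_defer(struct deferred_q *q, struct io_ctx *ctx,
                              int waiting_cmd)
{
    if (q->count == MAX_DEFERRED)
        return false;
    ctx->cmd_id = waiting_cmd;
    q->items[q->tail] = ctx;
    q->tail = (q->tail + 1) % MAX_DEFERRED;
    q->count++;
    return true;
}

/* Worker: runs later, outside irq context, and makes the upcalls. */
static size_t deferred_worker(struct deferred_q *q)
{
    size_t handled = 0;

    while (q->count > 0) {
        struct io_ctx *ctx = q->items[q->head];

        q->head = (q->head + 1) % MAX_DEFERRED;
        q->count--;
        transport_rcv_cmd(ctx);
        handled++;
    }
    return handled;
}
```

In a kernel driver the queue and worker would typically be a work item on a workqueue, with locking around the queue; the model omits both to keep the control flow visible.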
    • scsi: lpfc: Fix default driver parameter collision for allowing NPIV support · f6e84790
      James Smart authored
      The conversion to enable SCSI and NVME fc4 support ran into an issue with
      NPIV support. With NVME, NPIV is not currently supported, but with SCSI it
      was. The driver reverted to its lowest setting meaning NPIV with SCSI was
      not allowed.
      
      Convert the NPIV checks and implementation so that SCSI can continue to
      allow NPIV support.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
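
One way to read the NPIV change in f6e84790 is as a switch from "disable NPIV if NVME is enabled at all" to "allow NPIV whenever SCSI is enabled". The sketch below encodes that reading only; the bit values and both function names are assumptions made for illustration, and the commit message does not show the actual checks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical FC4 protocol bits for this sketch. */
#define FC4_SCSI 0x1u
#define FC4_NVME 0x2u

/* Assumed old behavior: any NVME enablement turns NPIV off. */
static bool npiv_allowed_old(unsigned int fc4_mask, bool npiv_requested)
{
    return npiv_requested && !(fc4_mask & FC4_NVME);
}

/* Assumed reworked check: NPIV stays available whenever SCSI is enabled. */
static bool npiv_allowed(unsigned int fc4_mask, bool npiv_requested)
{
    return npiv_requested && (fc4_mask & FC4_SCSI) != 0;
}
```

With both protocols enabled, the old form rejects NPIV while the reworked form permits it for the SCSI side.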
    • scsi: lpfc: Rework locking on SCSI io completion · c2017260
      James Smart authored
      A scsi host lock is taken on every io completion to check whether the abort
      handler is waiting on the io completion. This is an expensive lock to take
      on every completion when an abort condition is rare.

      Replace the scsi host lock with a command-specific lock. Synchronize the
      completion and abort paths with the new cmd lock. Ensure all flag changes
      and nulling of context pointers are done under the lock. When adding the
      lock to task management abort, it became apparent that other
      synchronization locks were missing. Added that synchronization to match
      the normal paths.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: lpfc: Enable SCSI and NVME fc4s by default · b1684a0b
      James Smart authored
      Now that performance mods don't split resources by protocol and enable both
      protocols by default, there's no reason not to enable concurrent SCSI and
      NVME fc4 support.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: lpfc: Resize cpu maps structures based on possible cpus · 222e9239
      James Smart authored
      The work done to date utilized the number of present cpus when sizing
      per-cpu structures. Structures should have been sized based on the max
      possible cpu count.
      
      Convert the driver over to possible cpu count for sizing allocation.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
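
Why the sizing in 222e9239 matters can be shown with a small user-space model: cpu ids are not guaranteed to be smaller than the number of cpus present at allocation time, so a table indexed by cpu id needs room for every possible id. The structure and helper names below are hypothetical; in the kernel the two counts would come from helpers such as num_present_cpus() and num_possible_cpus().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-cpu record; the field is illustrative only. */
struct cpu_map_entry {
    int hdwq;  /* hardware queue assigned to this cpu */
};

struct cpu_map {
    struct cpu_map_entry *entries;
    size_t nr_entries;
};

/* Size the map by a cpu count chosen by the caller. */
static int cpu_map_init(struct cpu_map *map, size_t nr_cpus)
{
    map->entries = calloc(nr_cpus, sizeof(*map->entries));
    map->nr_entries = map->entries ? nr_cpus : 0;
    return map->entries ? 0 : -1;
}

/* Look up a cpu's slot by cpu id; NULL when the id is out of range. */
static struct cpu_map_entry *cpu_map_slot(struct cpu_map *map, size_t cpu_id)
{
    return cpu_id < map->nr_entries ? &map->entries[cpu_id] : NULL;
}

static void cpu_map_free(struct cpu_map *map)
{
    free(map->entries);
    map->entries = NULL;
    map->nr_entries = 0;
}
```

With 4 cpus present out of 8 possible, a map sized by the present count has no slot for cpu id 6, while a map sized by the possible count does.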
    • scsi: lpfc: Utilize new IRQ API when allocating MSI-X vectors · 75508a8b
      James Smart authored
      The current driver uses the older IRQ API for MSI-X allocation.
      
      Change driver to utilize pci_alloc_irq_vectors when allocating IRQ vectors.
      
      Make lpfc_cpu_affinity_check use pci_irq_get_affinity to determine how the
      kernel mapped all the IRQs.
      
      Remove msix_entries from SLI4 structure, replaced with pci_irq_vector()
      usage.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: lpfc: Rework EQ/CQ processing to address interrupt coalescing · 32517fc0
      James Smart authored
      When driving high iop counts, auto_imax coalescing kicks in and drives the
      performance to extremely small iops levels.
      
      There are two issues:
      
       1) auto_imax is enabled by default. The auto algorithm, when iops gets
          high, divides the iops by the hdwq count and uses that value to
          calculate EQ_Delay. The EQ_Delay is set uniformly on all EQs whether
          they have load or not. The EQ_Delay is only manipulated every 5s (a
          long time). Thus there were large 5s swings of no interrupt delay
          followed by large/maximum delay, before repeating.

       2) When processing a CQ, the driver got mixed up on the rate at which
          to ring the doorbell to keep the chip apprised of the eqe or cqe
          consumption, as well as how long to sit in the thread and
          process queue entries. Until now, the driver capped its work at
          64 entries (very small) and exited/rearmed the CQ. Thus, on heavy
          loads, additional overheads were taken to exit and re-enter the
          interrupt handler. Worse, if in the large/maximum coalescing
          windows, it could be a while before getting back to servicing.
      
      The issues are corrected by the following:
      
       - A change in defaults. Auto_imax is turned OFF and fcp_imax is set
         to 0. Thus all interrupts are immediate.
      
       - Cleanup of field names and their meanings. Existing names were
         non-intuitive or used for duplicate things.
      
       - Added max_proc_limit field, to control the length of time the
         handlers would service completions.
      
       - Reworked EQ handling:
          Added common routine that walks eq, applying notify interval and max
            processing limits. Use queue_claimed to claim ownership of the queue
            while processing. Always rearm the queue whenever the common routine
            is called.
          Rework queue element processing, namely to eliminate hba_index vs
            host_index. Only one index is necessary. The queue entry can be
            marked invalid and the host_index updated immediately after eqe
            processing.
          After rework, xx_release routines are now DB write functions. Renamed
            the routines as such.
          Moved lpfc_sli4_eq_flush(), which does similar action, to same area.
          Replaced the 2 individual loops that walk an eq with a call to the
            common routine.
          Slightly revised lpfc_sli4_hba_handle_eqe() calling syntax.
          Added per-cpu counters to detect interrupt rates and scale
            interrupt coalescing values.
      
       - Reworked CQ handling:
          Added common routine that walks cq, applying notify interval and max
            processing limits. Use queue_claimed to claim ownership of the queue
            while processing. Always rearm the queue whenever the common routine
            is called.
          Rework queue element processing, namely to eliminate hba_index vs
            host_index. Only one index is necessary. The queue entry can be
            marked invalid and the host_index updated immediately after cqe
            processing.
          After rework, xx_release routines are now DB write functions.  Renamed
            the routines as such.
          Replaced the 3 individual loops that walk a cq with a call to the
            common routine.
          Redefined lpfc_sli4_sp_handle_mcqe() to the common handler definition
            with queue reference. Add increment for mbox completion to handler.

       - Added a new module/sysfs attribute, lpfc_cq_max_proc_limit, to allow
         dynamic changing of the CQ max_proc_limit value being used.
      
      Although this leaves an EQ as an immediate interrupt, that interrupt will
      only occur if a CQ bound to it is in an armed state and has cqes to
      process. By staying in the cq processing routine longer, high loads will
      avoid generating more interrupts as they will only rearm as the processing
      thread exits. The immediate interrupt is also beneficial to idle or
      lower-processing CQs as they get serviced immediately without being
      penalized by sharing an EQ with a more loaded CQ.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
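
The common queue-walking routine described in 32517fc0 can be modelled in user space as follows. This is a sketch under assumptions rather than the driver's implementation: the struct layout and `process_queue()` are hypothetical, and only the field names `host_index`, `queue_claimed` and `max_proc_limit` plus the "notify interval" idea come from the commit text. It shows a single consumer index, entries invalidated as they are consumed, a doorbell write every notify interval, a cap on work per invocation, and a final doorbell that rearms the queue.

```c
#include <assert.h>
#include <stdbool.h>

#define RING_SIZE 16  /* hypothetical ring depth for this sketch */

struct queue {
    bool valid[RING_SIZE];         /* valid bit per entry, set by "hardware" */
    unsigned int host_index;       /* the single consumer index */
    unsigned int notify_interval;  /* entries between doorbell writes */
    unsigned int max_proc_limit;   /* entries handled per invocation */
    bool queue_claimed;            /* set while one context owns the queue */
    bool armed;                    /* queue may raise another interrupt */
    unsigned int doorbells;        /* doorbell writes issued (for inspection) */
};

/* Walk the queue once; returns the number of entries consumed. */
static unsigned int process_queue(struct queue *q)
{
    unsigned int consumed = 0, since_notify = 0;

    assert(q->notify_interval > 0 && q->max_proc_limit > 0);

    /* A real driver would claim atomically; this model is single-threaded. */
    if (q->queue_claimed)
        return 0;
    q->queue_claimed = true;

    while (q->valid[q->host_index]) {
        /* Entry handled: mark it invalid and advance the one index. */
        q->valid[q->host_index] = false;
        q->host_index = (q->host_index + 1) % RING_SIZE;
        consumed++;

        /* Periodically tell the chip how much was consumed, without rearm. */
        if (++since_notify == q->notify_interval) {
            q->doorbells++;
            since_notify = 0;
        }
        if (consumed == q->max_proc_limit)
            break;
    }

    /* Always finish with a doorbell write that rearms the queue. */
    q->doorbells++;
    q->armed = true;
    q->queue_claimed = false;
    return consumed;
}
```

Raising `max_proc_limit` keeps the handler in the loop longer under load, which is the mechanism the commit relies on to avoid extra interrupts; entries left over after the cap are picked up on the next invocation.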
    • scsi: lpfc: cleanup: convert eq_delay to usdelay · cb733e35
      James Smart authored
      Review of the eq coalescing logic showed the code was a bit fragmented.
      Sometimes it would save/set via an interrupt max value, while in others it
      would do so via a usdelay. There were also two places changing eq delay,
      one place that issued mailbox commands, and another that changed via
      register writes if supported.
      
      Clean this up by:
      
       - Standardizing the operation of lpfc_modify_hba_eq_delay() routine so
         that it is always told of a us delay to impose. The routine then chooses
         the best way to set that - via register or via mbx.
      
       - Rather than two value types stored in eq->q_mode (usdelay if change via
         register, imax if change via mbox) - q_mode always contains usdelay.
         Before any value change, old vs new value is compared and only if
         different is a change done.
      
       - Revised the dmult calculation. dmult is not set based on overall imax
         divided by hardware queues - instead imax applies to a single cpu and
         the value will be replicated to all cpus.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
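
The cleanup in cb733e35 amounts to storing one unit (microseconds of delay) and skipping redundant updates. A minimal model, assuming an interrupts-per-second value converts to a delay as one second divided by that rate: `eq_state`, `imax_to_usdelay()` and `modify_eq_delay()` are hypothetical names, while `q_mode` is the field named in the commit text.

```c
#include <assert.h>
#include <stdbool.h>

#define USEC_PER_SECOND 1000000u

/* Convert an interrupts-per-second cap into a per-interrupt delay in us. */
static unsigned int imax_to_usdelay(unsigned int imax)
{
    return imax ? USEC_PER_SECOND / imax : 0;  /* 0 means no delay */
}

struct eq_state {
    unsigned int q_mode;      /* always holds the current usdelay */
    unsigned int reg_writes;  /* updates done by register write */
    unsigned int mbox_cmds;   /* updates done by mailbox command */
};

/*
 * Apply a new delay. The caller always passes microseconds; the routine
 * picks the mechanism and does nothing when the value is unchanged.
 */
static bool modify_eq_delay(struct eq_state *eq, unsigned int usdelay,
                            bool reg_supported)
{
    if (eq->q_mode == usdelay)
        return false;
    if (reg_supported)
        eq->reg_writes++;
    else
        eq->mbox_cmds++;
    eq->q_mode = usdelay;
    return true;
}
```

For example, a cap of 50000 interrupts per second corresponds to a 20 us delay under this conversion.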
    • scsi: lpfc: Support non-uniform allocation of MSIX vectors to hardware queues · 6a828b0f
      James Smart authored
      So far MSIX vector allocation assumed it would be 1:1 with hardware
      queues. However, there are several reasons why fewer MSIX vectors may be
      allocated than hardware queues such as the platform being out of vectors or
      adapter limits being less than cpu count.
      
      This patch reworks the MSIX/EQ relationships with the per-cpu hardware
      queues so they can function independently. MSIX vectors will be equitably
      split between cpu sockets/cores and then the per-cpu hardware queues will be
      mapped to the vectors most efficient for them.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    • scsi: lpfc: Fix setting affinity hints to correlate with hardware queues · b3295c2a
      James Smart authored
      The desired affinity for the hardware queue behavior is for hdwq 0 to be
      affinitized with cpu 0, hdwq 1 to cpu 1, and so on.  The implementation so
      far does not do this if the number of cpus is greater than the number of
      hardware queues (e.g. hardware queue allocation was administratively
      reduced or hardware queue resources could not scale to the cpu count).
      
      Correct the queue affinitization logic when queue count is less than
      cpu count.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
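
The affinity goal stated in b3295c2a (hdwq 0 with cpu 0, hdwq 1 with cpu 1, and so on) leaves open how the extra cpus are treated when there are fewer queues than cpus. The sketch below assumes a simple wrap-around, which is an illustration only and may differ from what the driver does; `build_cpu_to_hdwq()` is a hypothetical helper.

```c
#include <assert.h>

/*
 * Fill map[cpu] with the hardware queue for each cpu. The first nr_hdwq
 * cpus map 1:1; remaining cpus wrap around (an assumption of this sketch).
 */
static void build_cpu_to_hdwq(unsigned int *map, unsigned int nr_cpus,
                              unsigned int nr_hdwq)
{
    assert(nr_hdwq > 0);
    for (unsigned int cpu = 0; cpu < nr_cpus; cpu++)
        map[cpu] = cpu % nr_hdwq;
}
```

With 8 cpus and 3 queues, cpus 0 to 2 keep their identically numbered queue and cpu 3 shares queue 0 with cpu 0.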
    • scsi: lpfc: Allow override of hardware queue selection policies · 45aa312e
      James Smart authored
      Default behavior is to use the information from the upper IO stacks to
      select the hardware queue to use for IO submission, which typically has
      good cpu affinity.

      However, the driver, when used on some variants of the upstream kernel, has
      found queuing information to be suboptimal for FCP or IO completion locked
      on particular cpus.

      For command submission situations, the lpfc_fcp_io_sched module parameter
      can be set to specify a hardware queue selection policy that overrides the
      OS stack information.

      For IO completion situations, rather than queuing cq processing based on the
      cpu servicing the interrupting event, schedule the cq processing on the cpu
      associated with the hardware queue's cq.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
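
The submission-side override in 45aa312e can be sketched as a policy switch. The enum values and `select_hdwq()` below are hypothetical and do not reflect the actual values accepted by lpfc_fcp_io_sched; the sketch only shows a default that trusts the upper stack's hint and an override that ignores it in favor of the submitting cpu.

```c
#include <assert.h>

/* Hypothetical policies for this sketch. */
enum io_sched_policy {
    SCHED_BY_STACK_HINT,  /* default: use the queue hinted by the IO stack */
    SCHED_BY_CPU          /* override: use the submitting cpu instead */
};

/* Pick a hardware queue index in the range [0, nr_hdwq). */
static unsigned int select_hdwq(enum io_sched_policy policy,
                                unsigned int stack_hint,
                                unsigned int submitting_cpu,
                                unsigned int nr_hdwq)
{
    unsigned int idx;

    assert(nr_hdwq > 0);
    idx = (policy == SCHED_BY_CPU) ? submitting_cpu : stack_hint;
    return idx % nr_hdwq;
}
```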
    • scsi: lpfc: Adapt partitioned XRI lists to efficient sharing · c490850a
      James Smart authored
      The XRI get/put lists were partitioned per hardware queue. However, the
      adapter rarely had sufficient resources to give a large number of resources
      per queue. As such, it became common for a cpu to encounter a lack of XRI
      resource and request the upper io stack to retry after returning a BUSY
      condition. This occurred even though other cpus were idle and not using
      their resources.
      
      Create as efficient a scheme as possible to move resources to the cpus that
      need them. Each cpu maintains a small private pool which it allocates from
      for io. There is a watermark that the cpu attempts to keep in the private
      pool. The private pool, when empty, pulls from the cpu's global pool. When
      the cpu's global pool is empty it will pull from another cpu's global
      pool. As there are many cpu global pools (1 per cpu or hardware queue
      count) and as each cpu selects what cpu to pull from at different rates and
      at different times, it creates a randomizing effect that minimizes the
      number of cpus that will contend with each other when they steal XRIs from
      another cpu's global pool.

      On io completion, a cpu will push the XRI back on to its private pool. A
      watermark level is maintained for the private pool such that when it is
      exceeded it will move XRIs to the cpu's global pool so that other cpus may
      allocate them.
      
      On NVME, as heartbeat commands are critical to get placed on the wire, a
      single expedite pool is maintained. When a heartbeat is to be sent, it will
      allocate an XRI from the expedite pool rather than the normal cpu
      private/global pools. On any io completion, if a reduction in the expedite
      pools is seen, it will be replenished before the XRI is placed on the cpu
      private pool.
      
      Statistics are added to aid understanding the XRI levels on each cpu and
      their behaviors.
      Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
      Signed-off-by: James Smart <jsmart2021@gmail.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
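
The private/global pool scheme in c490850a can be approximated with counters instead of lists. Everything named here is hypothetical (`xri_mgr`, `xri_get()`, `xri_put()`, the `batch` size, and the fixed victim scan order), locking is omitted, and the NVME expedite pool is not modelled. The sketch shows the refill order (own global pool, then another cpu's global pool) and the watermark spill on completion.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4  /* hypothetical cpu count for this sketch */

struct cpu_pools {
    unsigned int pvt;  /* XRIs in this cpu's private pool */
    unsigned int pbl;  /* XRIs in this cpu's global (shareable) pool */
};

struct xri_mgr {
    struct cpu_pools cpu[NR_CPUS];
    unsigned int watermark;  /* max XRIs kept in a private pool */
    unsigned int batch;      /* XRIs moved per refill (assumed) */
};

/* Move up to max XRIs between two counters; returns how many moved. */
static unsigned int xri_move(unsigned int *from, unsigned int *to,
                             unsigned int max)
{
    unsigned int n = *from < max ? *from : max;

    *from -= n;
    *to += n;
    return n;
}

/* Allocate one XRI for an io on the given cpu; false means "busy". */
static bool xri_get(struct xri_mgr *m, unsigned int cpu)
{
    struct cpu_pools *p = &m->cpu[cpu];

    /* Private pool empty: refill from own global pool first. */
    if (p->pvt == 0 && xri_move(&p->pbl, &p->pvt, m->batch) == 0) {
        /* Own global pool empty too: steal from another cpu's. */
        for (unsigned int i = 1; i < NR_CPUS; i++) {
            unsigned int victim = (cpu + i) % NR_CPUS;

            if (xri_move(&m->cpu[victim].pbl, &p->pvt, m->batch))
                break;
        }
    }
    if (p->pvt == 0)
        return false;
    p->pvt--;
    return true;
}

/* Return one XRI on io completion, spilling above the watermark. */
static void xri_put(struct xri_mgr *m, unsigned int cpu)
{
    struct cpu_pools *p = &m->cpu[cpu];

    p->pvt++;
    if (p->pvt > m->watermark)
        xri_move(&p->pvt, &p->pbl, p->pvt - m->watermark);
}
```

The fixed scan order is the simplest choice for a model; the commit text instead relies on cpus picking victims at different rates and times to spread contention.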