1. 03 Oct, 2017 4 commits
    • fm10k: introduce a message queue for MAC/VLAN messages · fc917368
      Jacob Keller authored
      Under some circumstances, when dealing with a large number of MAC
      address or VLAN updates at once, the fm10k driver, particularly in the
      VFs, can overload the mailbox with too many messages.
      
      This results in a mailbox timeout, which causes the driver to initiate
      a reset. During the reset, we re-send all the same messages that
      originally caused the timeout. This results in a cycle of resets each
      triggering a future reset.
      
      To avoid this, we introduce a workqueue item which monitors a queue of
      MAC and VLAN requests. New requests are appended to the end of the
      list, and the work item periodically processes them in FIFO order.
      
      Initially we only handle requests for the netdev, but these already
      cover unicast MAC addresses, multicast MAC addresses, and VLAN update
      requests.
      
      A future patch will add support to use this queue for handling MAC
      update requests from the VF<->PF mailbox.
      
      The MAC/VLAN work item will keep checking to make sure that each request
      does not overflow the mailbox and cause a timeout. If it might, then the
      work item will reschedule itself a short time later. This avoids any
      reset cycle, since we never send the message if the mailbox is not
      ready.
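
The queue-and-reschedule behaviour described above can be sketched in plain user-space C. This is only an illustration under stated assumptions, not the driver's code: the names `req_queue`, `mailbox`, `queue_request()` and `process_requests()` are hypothetical, and the mailbox is reduced to a counter of free Tx FIFO words.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical request kinds; the driver's real types are not shown in the log. */
enum req_type { REQ_MAC, REQ_VLAN };

struct request {
    enum req_type type;
    unsigned int words;       /* mailbox space this message would need */
    struct request *next;
};

/* Dynamically allocated, non-contiguous FIFO of pending requests. */
struct req_queue {
    struct request *head;
    struct request *tail;
};

/* Toy stand-in for the mailbox Tx FIFO: tracks free space only. */
struct mailbox {
    unsigned int free_words;
};

/* Append a request to the tail of the queue; false if allocation fails. */
bool queue_request(struct req_queue *q, enum req_type type, unsigned int words)
{
    struct request *r = malloc(sizeof(*r));

    if (!r)
        return false;
    r->type = type;
    r->words = words;
    r->next = NULL;
    if (q->tail)
        q->tail->next = r;
    else
        q->head = r;
    q->tail = r;
    return true;
}

/*
 * One pass of the work item: send requests from the head while they fit.
 * Returns true if requests remain, meaning the caller should reschedule
 * the work item for a short time later instead of overflowing the mailbox.
 */
bool process_requests(struct req_queue *q, struct mailbox *mbx,
                      unsigned int *sent)
{
    while (q->head) {
        struct request *r = q->head;

        if (r->words > mbx->free_words)
            return true;          /* would overflow: stop and retry later */
        mbx->free_words -= r->words;
        q->head = r->next;
        if (!q->head)
            q->tail = NULL;
        free(r);
        (*sent)++;
    }
    return false;
}
```

Because a request is only dequeued once the mailbox has room for it, nothing is sent that could time out, which is the property that breaks the reset cycle.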
      
      As an alternative, we tried increasing the mailbox message FIFO, but
      this just delays the problem and results in needless memory waste on the
      system. Our new message queue is dynamically allocated, so it only uses
      as much memory as it needs. Additionally, it need not be contiguous
      like the Tx and Rx FIFOs.
      
      Note that this patch chose to only create a queue for MAC and VLAN
      messages, since these are the only messages sent in a large enough
      volume to cause the reset loop. Other messages are very unlikely to
      overflow the mailbox Tx FIFO so easily.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • fm10k: use generic PM hooks instead of legacy PCIe power hooks · 8249c47c
      Jacob Keller authored
      Replace the PCI-specific legacy power management hooks with the
      generic power management hooks, which work properly for both suspend
      and hibernate. The generic system also handles the lower-level PCIe
      power management itself rather than forcing the driver to handle it.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • fm10k: use spinlock to implement mailbox lock · b4fcd436
      Jacob Keller authored
      Let's not re-invent the locking wheel. Remove our bitlock and use
      a proper spinlock instead.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
    • fm10k: prepare_for_reset() when we lose PCIe Link · 0b40f457
      Jacob Keller authored
      If we lose PCIe link, such as when an unannounced PFLR event occurs, or
      when a device is surprise removed, we currently detach the device and
      close the netdev. This unfortunately leaves a lot of things still
      active, such as the msix_mbx_pf IRQ, and Tx/Rx resources.
      
      This can cause problems because the register reads will return
      potentially invalid values which may result in unknown driver behavior.
      
      Begin the process of resetting using fm10k_prepare_for_reset(), much in
      the same way as the suspend and resume cycle does. This will attempt to
      shut down as much as possible, in order to prevent possible issues.
      
      A naive implementation for this has issues, because there are now
      multiple flows calling the reset logic and setting a reset bit. This
      would cause problems, because the "re-attach" routine might call
      fm10k_handle_reset() prior to the reset actually finishing. Instead,
      we'll add state bits to indicate which flow actually initiated the
      reset.
      
      For the general reset flow, we'll assume that if someone else is
      already resetting, we do not need to handle it at all, so this flow
      does not need its own state bit. For the suspend case, we will simply
      issue a warning on resume indicating that we are attempting to recover
      from this case.
      
      For the detached subtask, we'll simply refuse to re-attach until we've
      actually initiated a reset as part of that flow.
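
The idea of recording which flow initiated a reset can be sketched in user-space C with C11 atomics standing in for the kernel's atomic bit operations. This is a hedged illustration only: the bit names and the functions `general_reset_begin()`, `detach_reset_begin()`, `may_reattach()` and `reattach_finish()` are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical state bits; the driver's real bit names may differ. */
enum {
    STATE_RESETTING      = 1u << 0,  /* some flow is resetting the device */
    STATE_RESET_DETACHED = 1u << 1   /* the detach flow started the reset */
};

struct dev_state {
    atomic_uint bits;
};

/* Atomically set a bit and report whether it was already set. */
bool test_and_set(struct dev_state *s, unsigned int bit)
{
    return (atomic_fetch_or(&s->bits, bit) & bit) != 0;
}

/* General reset flow: do nothing if another flow is already resetting. */
bool general_reset_begin(struct dev_state *s)
{
    return !test_and_set(s, STATE_RESETTING);
}

/* Detach flow: claim the reset and record that this flow owns it. */
bool detach_reset_begin(struct dev_state *s)
{
    if (test_and_set(s, STATE_RESETTING))
        return false;             /* someone else is resetting */
    atomic_fetch_or(&s->bits, STATE_RESET_DETACHED);
    return true;
}

/* Re-attach is allowed only if the detach flow initiated the reset. */
bool may_reattach(struct dev_state *s)
{
    return (atomic_load(&s->bits) & STATE_RESET_DETACHED) != 0;
}

/* Clear both bits once the device has been re-attached. */
void reattach_finish(struct dev_state *s)
{
    atomic_fetch_and(&s->bits,
                     ~(unsigned int)(STATE_RESETTING | STATE_RESET_DETACHED));
}
```

With a single shared reset bit, the re-attach path could not tell its own reset apart from one started elsewhere. The extra ownership bit is what lets it refuse to proceed until its own flow has begun the reset.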
      
      Finally, we'll stop attempting to manage the mailbox subtask when we're
      detached, since there's nothing we can do if we don't have a PCIe
      address.
      
      Overall this produces a much cleaner shutdown and recovery cycle for
      a PCIe surprise remove event.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
  2. 02 Oct, 2017 36 commits