    PCI: shpchp: Use per-slot workqueues to avoid deadlock · f652e7d2
    Bjorn Helgaas authored
    When we have an SHPC-capable bridge with a second SHPC-capable bridge
    below it, pushing the upstream bridge's attention button causes a
    deadlock.
    
    The deadlock happens because we use the shpchp_wq workqueue to run
    shpchp_pushbutton_thread(), which uses shpchp_disable_slot() to remove
    devices below the upstream bridge.  When we remove the downstream bridge,
    we call shpc_remove(), the shpchp driver's .remove() method.  That calls
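    flush_workqueue(shpchp_wq), which waits for every work item queued on
    shpchp_wq to finish.  It can never return, because the
    shpchp_pushbutton_thread() work item that called it is still running on
    that same workqueue.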
    
    This patch avoids the deadlock by creating a workqueue for every slot
    and removing the single shared workqueue.
    
    Here's the call path that leads to the deadlock:
    
      shpchp_queue_pushbutton_work
        queue_work(shpchp_wq)		# shpchp_pushbutton_thread
        ...
    
      shpchp_pushbutton_thread
        shpchp_disable_slot
          remove_board
            shpchp_unconfigure_device
              pci_stop_and_remove_bus_device
                ...
                  shpc_remove		# shpchp driver .remove method
                    hpc_release_ctlr
                      cleanup_slots
                        flush_workqueue(shpchp_wq)
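
The call path above can be illustrated with a small userspace model. This is only a sketch, not kernel code: every `toy_*` name, `shared_queue_ok()`, and `per_slot_queue_ok()` are hypothetical helpers invented for illustration, and the model assumes a flush blocks forever exactly when the caller is itself a work item on the queue being flushed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a workqueue (hypothetical, not the kernel type). */
struct toy_wq {
    const char *name;
};

/* Toy stand-in for a hotplug slot. */
struct toy_slot {
    struct toy_wq *wq;       /* queue that runs this slot's pushbutton work */
    struct toy_slot *child;  /* slot of an SHPC bridge below this one, or NULL */
};

/* Queue whose work item is executing on the current call stack, if any. */
static struct toy_wq *current_wq;

/*
 * Models flush_workqueue(): returns true if the flush could complete,
 * false if the caller is a work item on the very queue it waits for.
 */
static bool toy_flush(struct toy_wq *wq)
{
    return current_wq != wq;
}

/* Models shpc_remove() ... cleanup_slots(): flush the removed slot's queue. */
static bool toy_remove(struct toy_slot *slot)
{
    if (slot->child && !toy_remove(slot->child))
        return false;
    return toy_flush(slot->wq);
}

/* Models shpchp_pushbutton_thread() running as a work item on slot->wq. */
static bool toy_pushbutton(struct toy_slot *slot)
{
    struct toy_wq *prev = current_wq;
    bool ok = true;

    assert(slot != NULL);
    current_wq = slot->wq;           /* we are now "inside" this queue */
    if (slot->child)
        ok = toy_remove(slot->child);
    current_wq = prev;
    return ok;
}

/* Old scheme: both slots share one queue, like shpchp_wq. */
static bool shared_queue_ok(void)
{
    struct toy_wq shared = { "shared" };
    struct toy_slot down = { &shared, NULL };
    struct toy_slot up = { &shared, &down };

    return toy_pushbutton(&up);
}

/* New scheme: each slot has its own queue. */
static bool per_slot_queue_ok(void)
{
    struct toy_wq up_wq = { "up" }, down_wq = { "down" };
    struct toy_slot down = { &down_wq, NULL };
    struct toy_slot up = { &up_wq, &down };

    return toy_pushbutton(&up);
}
```

In this model, `shared_queue_ok()` returns false because the work item flushes the queue it is running on, while `per_slot_queue_ok()` returns true because the upstream slot's work item only flushes the downstream slot's separate queue.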
    
    This change is based on code inspection, since we don't have hardware
    with this topology.
    
    Based-on-patch-by: Yijing Wang <wangyijing@huawei.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    CC: stable@vger.kernel.org
shpchp.h 11 KB