PCI: shpchp: Use per-slot workqueues to avoid deadlock
When we have an SHPC-capable bridge with a second SHPC-capable bridge below it, pushing the upstream bridge's attention button causes a deadlock. The deadlock happens because we use the shpchp_wq workqueue to run shpchp_pushbutton_thread(), which uses shpchp_disable_slot() to remove devices below the upstream bridge. When we remove the downstream bridge, we call shpc_remove(), the shpchp driver's .remove() method. That calls flush_workqueue(shpchp_wq), which deadlocks because the shpchp_pushbutton_thread() work item is still running. This patch avoids the deadlock by creating a workqueue for every slot and removing the single shared workqueue. Here's the call path that leads to the deadlock: shpchp_queue_pushbutton_work queue_work(shpchp_wq) # shpchp_pushbutton_thread ... shpchp_pushbutton_thread shpchp_disable_slot remove_board shpchp_unconfigure_device pci_stop_and_remove_bus_device ... shpc_remove # shpchp driver .remove method hpc_release_ctlr cleanup_slots flush_workqueue(shpchp_wq) This change is based on code inspection, since we don't have hardware with this topology. Based-on-patch-by: Yijing Wang <wangyijing@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: stable@vger.kernel.org
Showing
Please register or sign in to comment