    mwifiex: resolve reset vs. remove()/shutdown() deadlocks · a64e7a79
    Brian Norris authored
    Commit b014e96d ("PCI: Protect pci_error_handlers->reset_notify()
    usage with device_lock()") resolves races between driver reset and
    removal, but it introduces some new deadlock problems. If we see a
    timeout while we've already started suspending, removing, or shutting
    down the driver, we might see:
    
    (a) a worker thread, running mwifiex_pcie_work() ->
        mwifiex_pcie_card_reset_work() -> pci_reset_function()
    (b) a removal thread, running mwifiex_pcie_remove() ->
        mwifiex_free_adapter() -> mwifiex_unregister() ->
        mwifiex_cleanup_pcie() -> cancel_work_sync(&card->work)
    
    Unfortunately, mwifiex_pcie_remove() already holds the device lock that
    pci_reset_function() is now requesting, and so we see a deadlock.
    
    It's necessary to cancel and synchronize our outstanding work before
    tearing down the driver, so we can't have this work wait indefinitely
    for the lock.
    
    It's reasonable to only "try" to reset here (i.e., skip the reset if
    the device lock is already held), since this will mostly happen for
    cases where it's already difficult to reset the firmware anyway (e.g.,
    while we're suspending or powering off the system). And if reset
    *really* needs to happen, we can always try again later.
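
The lock ordering described above can be sketched in userspace. This is only an illustrative analog using POSIX threads, not the driver code: `device_lock`, `card_reset_work()`, `run_reset_work_sync()`, `remove_with_pending_reset()` and `reset_without_contention()` are hypothetical names, the mutex stands in for the device lock, and joining the worker thread stands in for cancel_work_sync().

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the device lock held across remove(). */
static pthread_mutex_t device_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Analog of the reset work item: try the lock, never block on it.
 * Stores 0 if the "reset" ran, or EBUSY if it was skipped.
 */
static void *card_reset_work(void *arg)
{
	int *result = arg;
	int err = pthread_mutex_trylock(&device_lock);

	if (err) {
		*result = err;	/* lock is contended: give up for now */
		return NULL;
	}
	/* ... the actual reset would happen here ... */
	pthread_mutex_unlock(&device_lock);
	*result = 0;
	return NULL;
}

/*
 * Run the work item on its own thread and wait for it to finish.
 * The join plays the role of cancel_work_sync() in this sketch.
 */
static int run_reset_work_sync(void)
{
	pthread_t worker;
	int result = -1;
	int rc;

	rc = pthread_create(&worker, NULL, card_reset_work, &result);
	if (rc)
		return -1;
	rc = pthread_join(worker, NULL);
	assert(rc == 0);
	(void)rc;
	return result;
}

/* Analog of the removal path: holds the lock while syncing the work. */
int remove_with_pending_reset(void)
{
	int result;

	pthread_mutex_lock(&device_lock);
	result = run_reset_work_sync();
	pthread_mutex_unlock(&device_lock);
	return result;
}

/* Analog of a reset with no removal in progress. */
int reset_without_contention(void)
{
	return run_reset_work_sync();
}
```

In this sketch, remove_with_pending_reset() returns EBUSY because the worker backs off instead of waiting. If the worker used pthread_mutex_lock() instead, the join would never complete, which mirrors the deadlock between (a) and (b).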
    
    Fixes: b014e96d ("PCI: Protect pci_error_handlers->reset_notify() usage with device_lock()")
    Cc: <stable@vger.kernel.org>
    Cc: Xinming Hu <huxm@marvell.com>
    Signed-off-by: Brian Norris <briannorris@chromium.org>
    Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
pcie.c 85.6 KB