    nvdimm: Fix firmware activation deadlock scenarios · e6829d1b
    Dan Williams authored
    Lockdep reports the following deadlock scenarios for CXL root device
    power-management, device_prepare(), operations, and device_shutdown()
    operations for 'nd_region' devices:
    
     Chain exists of:
       &nvdimm_region_key --> &nvdimm_bus->reconfig_mutex --> system_transition_mutex
    
      Possible unsafe locking scenario:
    
            CPU0                    CPU1
            ----                    ----
       lock(system_transition_mutex);
                                    lock(&nvdimm_bus->reconfig_mutex);
                                    lock(system_transition_mutex);
       lock(&nvdimm_region_key);
    
     Chain exists of:
       &cxl_nvdimm_bridge_key --> acpi_scan_lock --> &cxl_root_key
    
      Possible unsafe locking scenario:
    
            CPU0                    CPU1
            ----                    ----
       lock(&cxl_root_key);
                                    lock(acpi_scan_lock);
                                    lock(&cxl_root_key);
       lock(&cxl_nvdimm_bridge_key);
    
    These stem from holding nvdimm_bus_lock() over hibernate_quiet_exec(),
    which walks the entire system device topology taking device_lock() along
    the way. The nvdimm_bus_lock() protects against unregistration and
    multiple simultaneous ops callers, and prevents activate_show() from
    racing activate_store(). For the first two, the lock is redundant:
    unregistration already flushes all ops users, and sysfs already prevents
    multiple threads from being active in an ops handler at the same time.
    For the last, userspace should already be waiting for its last
    activate_store() to complete, and does not need activate_show() to flush
    the write side, so this lock usage can be deleted in these attributes.
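
    The first chain above can be modelled in userspace as a lock-order
    graph. The following is only an illustrative sketch, not lockdep's
    implementation and not code from this patch: the enum, the struct, and
    every function name are hypothetical, and the edges are assumptions
    read off the lockdep report (nvdimm_bus_lock() is taken to correspond
    to &nvdimm_bus->reconfig_mutex, and hibernate_quiet_exec() to
    system_transition_mutex).

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes, named after the first lockdep report. */
enum lock_class {
	SYSTEM_TRANSITION_MUTEX,
	RECONFIG_MUTEX,		/* stands in for &nvdimm_bus->reconfig_mutex */
	NVDIMM_REGION_KEY,	/* stands in for device_lock() of an nd_region */
	NR_LOCK_CLASSES,
};

/* edge[a][b] is true when b has been acquired while a was held. */
struct order_graph {
	bool edge[NR_LOCK_CLASSES][NR_LOCK_CLASSES];
};

/* Record that 'inner' was taken while 'outer' was already held. */
void record_order(struct order_graph *g, enum lock_class outer,
		  enum lock_class inner)
{
	g->edge[outer][inner] = true;
}

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
bool reaches(const struct order_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_LOCK_CLASSES; next++)
		if (g->edge[from][next] && !seen[next] &&
		    reaches(g, next, to, seen))
			return true;
	return false;
}

/*
 * Taking 'inner' while holding 'outer' closes a cycle if 'outer' is
 * already reachable from 'inner' through previously recorded orderings.
 */
bool would_deadlock(const struct order_graph *g, enum lock_class outer,
		    enum lock_class inner)
{
	bool seen[NR_LOCK_CLASSES] = { false };

	return reaches(g, inner, outer, seen);
}

/*
 * Replay the first report. The flag models whether the bus lock is held
 * across the system-wide quiesce (the assumed pre-fix behaviour) or not.
 */
bool report_chain_is_unsafe(bool hold_bus_lock_over_quiesce)
{
	struct order_graph g;

	memset(&g, 0, sizeof(g));
	/* A region device's lock is held while the bus lock is taken. */
	record_order(&g, NVDIMM_REGION_KEY, RECONFIG_MUTEX);
	/* The bus lock is held while the system transition lock is taken. */
	if (hold_bus_lock_over_quiesce)
		record_order(&g, RECONFIG_MUTEX, SYSTEM_TRANSITION_MUTEX);
	/* The topology walk takes device locks under the transition lock. */
	return would_deadlock(&g, SYSTEM_TRANSITION_MUTEX, NVDIMM_REGION_KEY);
}
```

    In this model, report_chain_is_unsafe(true) finds the cycle and
    report_chain_is_unsafe(false) does not, because dropping the middle
    edge leaves no path from the region lock back to the transition lock.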
    
    Fixes: 48001ea5 ("PM, libnvdimm: Add runtime firmware activation support")
    Reviewed-by: Ira Weiny <ira.weiny@intel.com>
    Link: https://lore.kernel.org/r/165074883800.4116052.10737040861825806582.stgit@dwillia2-desk3.amr.corp.intel.com
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>
core.c 13 KB