    powerpc/eeh: Probe after unbalanced kref check · e642d11b
    Daniel Axtens authored
    In the complete hotplug case, EEH PEs are supposed to be released
    and set to NULL. Normally, this is done by eeh_remove_device(),
    which is called from pcibios_release_device().
    
    However, if something is holding a kref to the device, it will not
    be released, and the PE will remain. eeh_add_device_late() has
    a check for this which will explicitly destroy the PE in this case.
    
    This check in eeh_add_device_late() occurs after a call to
    eeh_ops->probe(). On PowerNV, probe is a pointer to pnv_eeh_probe(),
    which will exit without probing if there is an existing PE.
    
    This means that on PowerNV, devices with outstanding krefs will not
    be rediscovered by EEH correctly after a complete hotplug. This is
    affecting CXL (CAPI) devices in the field.
    
    Put the probe after the kref check so that the PE is destroyed
    and affected devices are correctly rediscovered by EEH.
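
The ordering problem can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: `model_pe`, `model_dev`, `model_probe()`, `model_drop_stale_pe()`, `model_add_late_old()` and `model_add_late_new()` are all hypothetical names invented for the example. The only behaviour taken from the description above is that the probe exits early when a PE already exists, and that the leftover-PE check destroys the stale PE.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; these are NOT the kernel's real types or functions. */
struct model_pe { int id; };

struct model_dev {
    struct model_pe *pe;  /* stale PE left over from before hotplug, or NULL */
    bool probed;          /* set once the model probe has actually done work */
};

struct model_pe model_fresh_pe = { 1 };

/* Models the described pnv_eeh_probe() behaviour: exit if a PE exists. */
void model_probe(struct model_dev *dev)
{
    if (dev->pe)
        return;                 /* existing PE: nothing is probed */
    dev->pe = &model_fresh_pe;  /* otherwise attach a freshly created PE */
    dev->probed = true;
}

/* Models the leftover-PE check: explicitly destroy a PE that survived. */
void model_drop_stale_pe(struct model_dev *dev)
{
    dev->pe = NULL;
}

/* Old ordering: probe first, then the stale-PE check. */
void model_add_late_old(struct model_dev *dev)
{
    model_probe(dev);           /* skipped because the stale PE is present */
    model_drop_stale_pe(dev);   /* PE destroyed, but nothing re-probes */
}

/* New ordering: stale-PE check first, then probe. */
void model_add_late_new(struct model_dev *dev)
{
    model_drop_stale_pe(dev);   /* stale PE destroyed first */
    model_probe(dev);           /* probe now runs and attaches a new PE */
}
```

In this model, a device that enters `model_add_late_old()` with a stale PE leaves with no PE and without having been probed, while the same device passed to `model_add_late_new()` ends up probed with a fresh PE.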
    
    Fixes: d91dafc0 ("powerpc/eeh: Delay probing EEH device during hotplug")
    Cc: stable@vger.kernel.org
    Cc: Gavin Shan <gwshan@linux.vnet.ibm.com>
    Signed-off-by: Daniel Axtens <dja@axtens.net>
    Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
eeh.c 45.2 KB