    scsi: lpfc: Fix error in remote port address change · 6825b7bd
    James Smart authored
    In a test with high nvme remote port counts connected via a multi-hop FC
    switch config where switches were systematically reset (e.g. fabric
    partitioning and re-establishment), the nvme remote ports would switch
    addresses based on the switch reconfiguration events. The driver would get
    into a situation where the nvme port changed address, PLOGI and PRLI
    succeeded, and nvme transport registration occurred, but subsequent LS
    requests by the nvme subsystem failed due to a bad ndlp state, and
    connectivity to the device was lost.
    
    The driver hit a race condition when multiple devices swapped addresses
    simultaneously. When the driver noticed that the remote port structure
    came back with the same value as before (meaning an nvme_rport structure
    was re-enabled and did not go through devloss_tmo/connect_tmo_failures on
    all controllers), it would exit unconditionally, assuming the ndlp
    information was correct. But if the ndlps had been swapped, the ndlp held
    stale port state information, which caused the LS request commands that
    used it to fail.
    
    Fix by checking whether an ndlp swap has occurred, and exit early only if
    no swap has occurred.
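
The check described above can be illustrated with a minimal, standalone sketch. This is not the driver's code: `struct node`, `struct rport`, `can_reuse_binding()` and `register_port()` are hypothetical stand-ins for the ndlp, the nvme_rport private data, and the registration path, and locking is omitted. It only shows the idea that a matching remote port pointer is not sufficient for the early exit; the remote port must also still be bound to the same node.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver's structures. */
struct node;                        /* models an ndlp */

struct rport {                      /* models the nvme_rport private data */
    struct node *ndlp;              /* node this rport is currently bound to */
};

struct node {
    unsigned int did;               /* FC address, illustrative only */
    struct rport *nrport;           /* rport this node last registered */
};

/*
 * Returns true only when the early exit is safe: the same rport came
 * back AND it is still bound to this node (no swap occurred).
 */
static bool can_reuse_binding(const struct rport *oldrport,
                              const struct rport *newrport,
                              const struct node *ndlp)
{
    if (oldrport == NULL || oldrport != newrport)
        return false;
    return oldrport->ndlp == ndlp;
}

/* Binds newrport to ndlp unless the existing binding can be reused.
 * Returns true if a (re)bind was performed, false on early exit.
 */
static bool register_port(struct node *ndlp, struct rport *newrport)
{
    struct rport *oldrport = ndlp->nrport;
    struct node *prev;

    if (can_reuse_binding(oldrport, newrport, ndlp))
        return false;               /* same rport, same node: nothing to do */

    /* Detach the node that previously owned this rport, if any. */
    prev = newrport->ndlp;
    if (prev != NULL && prev != ndlp && prev->nrport == newrport)
        prev->nrport = NULL;

    newrport->ndlp = ndlp;
    ndlp->nrport = newrport;
    return true;
}
```

In this model, a check that compared only `oldrport == newrport` would take the early exit even after the rport had been rebound to another node, leaving the stale binding in place.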
    
    Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
    Signed-off-by: James Smart <jsmart2021@gmail.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
lpfc_nvme.c 79.8 KB