    nfs/blocklayout: Fix premature PR key unregistration · d869da91
    Chuck Lever authored
    During generic/069 runs with pNFS SCSI layouts, the NFS client emits
    the following in the system journal:
    
    kernel: pNFS: failed to open device /dev/disk/by-id/dm-uuid-mpath-0x6001405e3366f045b7949eb8e4540b51 (-2)
    kernel: pNFS: using block device sdb (reservation key 0x666b60901e7b26b3)
    kernel: pNFS: failed to open device /dev/disk/by-id/dm-uuid-mpath-0x6001405e3366f045b7949eb8e4540b51 (-2)
    kernel: pNFS: using block device sdb (reservation key 0x666b60901e7b26b3)
    kernel: sd 6:0:0:1: reservation conflict
    kernel: sd 6:0:0:1: [sdb] tag#16 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
    kernel: sd 6:0:0:1: [sdb] tag#16 CDB: Write(10) 2a 00 00 00 00 50 00 00 08 00
    kernel: reservation conflict error, dev sdb, sector 80 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 2
    kernel: sd 6:0:0:1: reservation conflict
    kernel: sd 6:0:0:1: reservation conflict
    kernel: sd 6:0:0:1: [sdb] tag#18 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
    kernel: sd 6:0:0:1: [sdb] tag#17 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
    kernel: sd 6:0:0:1: [sdb] tag#18 CDB: Write(10) 2a 00 00 00 00 60 00 00 08 00
    kernel: sd 6:0:0:1: [sdb] tag#17 CDB: Write(10) 2a 00 00 00 00 58 00 00 08 00
    kernel: reservation conflict error, dev sdb, sector 96 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 0
    kernel: reservation conflict error, dev sdb, sector 88 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 0
    systemd[1]: fstests-generic-069.scope: Deactivated successfully.
    systemd[1]: fstests-generic-069.scope: Consumed 5.092s CPU time.
    systemd[1]: media-test.mount: Deactivated successfully.
    systemd[1]: media-scratch.mount: Deactivated successfully.
    kernel: sd 6:0:0:1: reservation conflict
    kernel: failed to unregister PR key.
    
    This appears to be due to a race. bl_alloc_lseg() calls this:
    
    static struct nfs4_deviceid_node *
    bl_find_get_deviceid(struct nfs_server *server,
                    const struct nfs4_deviceid *id, const struct cred *cred,
                    gfp_t gfp_mask)
    {
            struct nfs4_deviceid_node *node;
            unsigned long start, end;

    retry:
            node = nfs4_find_get_deviceid(server, id, cred, gfp_mask);
            if (!node)
                    return ERR_PTR(-ENODEV);
    
    nfs4_find_get_deviceid() first does a lookup without taking the spin
    lock. If it can't find a matching deviceid, it creates a new
    device_info (which calls bl_alloc_deviceid_node(), and that registers
    the device's PR key).
    
    Then it takes the nfs4_deviceid_lock and looks up the deviceid again.
    If it finds it this time, nfs4_find_get_deviceid() frees the spare
    (new) device_info, which unregisters the PR key for the same device.
    
    Any subsequent I/O from this client on that device gets EBADE.
    
    The umount later unregisters the device's PR key again.
    
    To prevent this problem, register the PR key after the deviceid_node
    lookup.
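
The ordering problem can be replayed in a small user-space model. This is only a sketch under stated assumptions: every toy_* name below is hypothetical and is not a kernel type or function, the cache holds a single entry, and the two racing callers are interleaved by hand instead of running as threads. The "early" mode registers the key at allocation time, as described above; the "late" mode registers it only on the node that the lookup finally returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins: none of these are the kernel's types or functions. */
struct toy_device {
	bool key_registered;	/* is the PR key registered on the target? */
};

struct toy_node {
	struct toy_device *dev;
	bool owns_key;		/* did this node register the key? */
};

static struct toy_node *toy_cache;	/* one-slot deviceid cache */

/* Allocate a node; the "early" ordering registers the key right here. */
static struct toy_node *toy_alloc_node(struct toy_device *dev, bool early)
{
	struct toy_node *n = calloc(1, sizeof(*n));

	if (!n)
		return NULL;
	n->dev = dev;
	if (early) {
		dev->key_registered = true;
		n->owns_key = true;
	}
	return n;
}

/* Free a node; unregisters the key if this node registered it. */
static void toy_free_node(struct toy_node *n)
{
	if (n->owns_key)
		n->dev->key_registered = false;
	free(n);
}

/*
 * Replay the interleaving: callers A and B both miss the first lookup
 * and both allocate a node; A inserts first, then B finds A's node
 * under the lock and frees its spare.
 * Returns true if the PR key is still registered when I/O would start.
 */
static bool toy_replay_race(bool early)
{
	struct toy_device dev = { .key_registered = false };
	struct toy_node *a, *b, *winner;
	bool key_ok;

	toy_cache = NULL;
	a = toy_alloc_node(&dev, early);	/* A: first lookup missed */
	b = toy_alloc_node(&dev, early);	/* B: first lookup missed */
	if (!a || !b) {
		free(a);
		free(b);
		return false;
	}

	assert(toy_cache == NULL);
	toy_cache = a;			/* A: second lookup misses, insert */
	toy_free_node(b);		/* B: second lookup hits, drop spare */
	winner = toy_cache;

	if (!early && !winner->owns_key) {	/* late: register after lookup */
		dev.key_registered = true;
		winner->owns_key = true;
	}

	key_ok = dev.key_registered;
	toy_free_node(winner);		/* "umount": release the cached node */
	toy_cache = NULL;
	return key_ok;
}
```

In this model toy_replay_race(true) returns false, because freeing the spare node drops the key that the cached node still depends on, while toy_replay_race(false) returns true, because the spare never held the key.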

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>