    nvme-tcp: lockdep: annotate in-kernel sockets · 841aee4d
    Chris Leech authored
    Put NVMe/TCP sockets in their own class to avoid some lockdep warnings.
    Sockets created by nvme-tcp are not exposed to user-space, and will not
    trigger certain code paths that the general socket API exposes.
    
    Lockdep complains about a circular dependency between the socket and
    filesystem locks, because setsockopt can trigger a page fault with a
    socket lock held, but nvme-tcp sends requests on the socket while
    filesystem locks are held.
    
      ======================================================
      WARNING: possible circular locking dependency detected
      5.15.0-rc3 #1 Not tainted
      ------------------------------------------------------
      fio/1496 is trying to acquire lock:
      (sk_lock-AF_INET){+.+.}-{0:0}, at: tcp_sendpage+0x23/0x80
    
      but task is already holding lock:
      (&xfs_dir_ilock_class/5){+.+.}-{3:3}, at: xfs_ilock+0xcf/0x290 [xfs]
    
      which lock already depends on the new lock.
    
      other info that might help us debug this:
    
      chain exists of:
       sk_lock-AF_INET --> sb_internal --> &xfs_dir_ilock_class/5
    
      Possible unsafe locking scenario:
    
            CPU0                    CPU1
            ----                    ----
       lock(&xfs_dir_ilock_class/5);
                                    lock(sb_internal);
                                    lock(&xfs_dir_ilock_class/5);
       lock(sk_lock-AF_INET);
    
      *** DEADLOCK ***
    
      6 locks held by fio/1496:
       #0: (sb_writers#13){.+.+}-{0:0}, at: path_openat+0x9fc/0xa20
       #1: (&inode->i_sb->s_type->i_mutex_dir_key){++++}-{3:3}, at: path_openat+0x296/0xa20
       #2: (sb_internal){.+.+}-{0:0}, at: xfs_trans_alloc_icreate+0x41/0xd0 [xfs]
       #3: (&xfs_dir_ilock_class/5){+.+.}-{3:3}, at: xfs_ilock+0xcf/0x290 [xfs]
       #4: (hctx->srcu){....}-{0:0}, at: hctx_lock+0x51/0xd0
       #5: (&queue->send_mutex){+.+.}-{3:3}, at: nvme_tcp_queue_rq+0x33e/0x380 [nvme_tcp]
    
    This annotation lets lockdep analyze nvme-tcp controlled sockets
    independently of what the user-space sockets API does.
    
    Link: https://lore.kernel.org/linux-nvme/CAHj4cs9MDYLJ+q+2_GXUK9HxFizv2pxUryUR0toX974M040z7g@mail.gmail.com/
    Signed-off-by: Chris Leech <cleech@redhat.com>
    Signed-off-by: Christoph Hellwig <hch@lst.de>
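
The reasoning in the commit message can be illustrated with a small user-space model. This is only a sketch of class-based lock-order tracking, not lockdep's implementation and not the code in this commit. The enum names, the helper functions, and the `SK_LOCK_AF_INET_NVME` class are all hypothetical and exist only for illustration. The model records the chain from the report above (`sk_lock-AF_INET --> sb_internal --> xfs_dir_ilock_class`) and then asks whether taking a socket lock while the XFS directory ilock is held would close a cycle.

```c
#include <stdbool.h>
#include <string.h>

/* Toy lock classes; SK_LOCK_AF_INET_NVME is a hypothetical separate class. */
enum lock_class {
	SK_LOCK_AF_INET,      /* class shared by every AF_INET socket */
	SK_LOCK_AF_INET_NVME, /* illustrative class for nvme-tcp sockets only */
	SB_INTERNAL,
	XFS_DIR_ILOCK,
	NR_CLASSES,
};

/* edge[a][b] is true when class b was acquired while class a was held. */
struct dep_graph {
	bool edge[NR_CLASSES][NR_CLASSES];
};

void dep_graph_init(struct dep_graph *g)
{
	memset(g, 0, sizeof(*g));
}

void add_dependency(struct dep_graph *g, enum lock_class held,
		    enum lock_class acquired)
{
	g->edge[held][acquired] = true;
}

/* Depth-first search: can we reach `target` from `from` along recorded edges? */
bool reachable(const struct dep_graph *g, int from, int target, bool *visited)
{
	if (from == target)
		return true;
	visited[from] = true;
	for (int next = 0; next < NR_CLASSES; next++) {
		if (g->edge[from][next] && !visited[next] &&
		    reachable(g, next, target, visited))
			return true;
	}
	return false;
}

/*
 * Acquiring `acquired` while holding `held` closes a cycle if `held` is
 * already reachable from `acquired` in the recorded dependency graph.
 */
bool would_close_cycle(const struct dep_graph *g, enum lock_class held,
		       enum lock_class acquired)
{
	bool visited[NR_CLASSES] = { false };

	return reachable(g, acquired, held, visited);
}

/*
 * Replay the scenario: user-space socket use has recorded
 * sk_lock-AF_INET --> sb_internal --> xfs_dir_ilock, and nvme-tcp then
 * takes its socket lock (of class `nvme_sk_class`) under the XFS ilock.
 */
bool nvme_send_closes_cycle(enum lock_class nvme_sk_class)
{
	struct dep_graph g;

	dep_graph_init(&g);
	add_dependency(&g, SK_LOCK_AF_INET, SB_INTERNAL);
	add_dependency(&g, SB_INTERNAL, XFS_DIR_ILOCK);

	return would_close_cycle(&g, XFS_DIR_ILOCK, nvme_sk_class);
}
```

In this model, `nvme_send_closes_cycle(SK_LOCK_AF_INET)` returns true because the nvme-tcp socket shares a class with user-space sockets, while `nvme_send_closes_cycle(SK_LOCK_AF_INET_NVME)` returns false because the separate class has no recorded dependency on filesystem locks.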
tcp.c 67.4 KB