• Zhengchao Shao's avatar
    netlink: fix potential sleeping issue in mqueue_flush_file · 234ec0b6
    Zhengchao Shao authored
    I analyze the potential sleeping issue of the following processes:
    Thread A                                Thread B
    ...                                     netlink_create  //ref = 1
    do_mq_notify                            ...
      sock = netlink_getsockbyfilp          ...     //ref = 2
      info->notify_sock = sock;             ...
    ...                                     netlink_sendmsg
    ...                                       skb = netlink_alloc_large_skb  //skb->head is vmalloced
    ...                                       netlink_unicast
    ...                                         sk = netlink_getsockbyportid //ref = 3
    ...                                         netlink_sendskb
    ...                                           __netlink_sendskb
    ...                                             skb_queue_tail //put skb to sk_receive_queue
    ...                                         sock_put //ref = 2
    ...                                     ...
    ...                                     netlink_release
    ...                                       deferred_put_nlk_sk //ref = 1
    mqueue_flush_file
      spin_lock
      remove_notification
        netlink_sendskb
          sock_put  //ref = 0
            sk_free
              ...
              __sk_destruct
                netlink_sock_destruct
                  skb_queue_purge  //get skb from sk_receive_queue
                    ...
                    __skb_queue_purge_reason
                      kfree_skb_reason
                        __kfree_skb
                        ...
                        skb_release_all
                          skb_release_head_state
                            netlink_skb_destructor
                              vfree(skb->head)  //sleeping while holding spinlock
    
    In netlink_sendmsg, if the memory pointed to by skb->head is allocated by
    vmalloc, and is put to sk_receive_queue queue, also the skb is not freed.
    When the mqueue executes flush, the sleeping bug will occur. Use
    vfree_atomic instead of vfree in netlink_skb_destructor to solve the issue.
    
    Fixes: c05cdb1b ("netlink: allow large data transfers from user-space")
    Signed-off-by: default avatarZhengchao Shao <shaozhengchao@huawei.com>
    Link: https://lore.kernel.org/r/20240122011807.2110357-1-shaozhengchao@huawei.comSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
    234ec0b6
af_netlink.c 69.7 KB