    net/tls: Fix race in TLS device down flow · f08d8c1b
    Tariq Toukan authored
    The socket destruction flow and tls_device_down() synchronize with
    each other using tls_device_lock and the context refcount, to
    guarantee that the device resources are freed via tls_dev_del() by
    the end of tls_device_down().
    
    In the following unfortunate flow, this won't happen:
    - refcount is decreased to zero in tls_device_sk_destruct.
    - tls_device_down starts, skips the context as refcount is zero, going
      all the way until it flushes the gc work, and returns without freeing
      the device resources.
    - only then, tls_device_queue_ctx_destruction is called, queues the gc
      work and frees the context's device resources.
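
The ordering above can be replayed deterministically in a small userspace model. This is only a sketch: `struct racy_ctx` and the `racy_*` functions are hypothetical stand-ins invented for illustration, not the kernel's types or functions, and each call stands for one step of the list.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one offloaded context; not the kernel's struct. */
struct racy_ctx {
    int  refcount;
    bool queued_for_gc;
    bool dev_resources_freed;
};

/* Step 1: the destructor drops the last reference without holding the lock. */
static void racy_drop_ref(struct racy_ctx *c)
{
    assert(c->refcount > 0);
    c->refcount--;
}

/* Step 2: device down handles live contexts itself, skips zero-refcount
 * ones, then flushes whatever gc work is already queued.
 * Returns whether the device resources are freed when it returns. */
static bool racy_device_down(struct racy_ctx *c)
{
    if (c->refcount > 0)
        c->dev_resources_freed = true;  /* live context: torn down here */
    if (c->queued_for_gc)
        c->dev_resources_freed = true;  /* queued context: freed by the flush */
    return c->dev_resources_freed;
}

/* Step 3: only now is the destruction queued and the resources freed. */
static void racy_queue_destruction(struct racy_ctx *c)
{
    c->queued_for_gc = true;
    c->dev_resources_freed = true;
}
```

Running step 1, then step 2, then step 3 on a context that starts with a refcount of one makes `racy_device_down()` return false: the context is neither live nor queued at that moment, so the device-down path has nothing to act on.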
    
    Solve it by decreasing the refcount in the socket's destruction flow
    under the tls_device_lock, so that the two flows are fully
    synchronized.  This does not slow down the common destructor flow,
    which already decreases the refcount and acquires the spinlock
    anyway.
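
A minimal userspace sketch of the idea, assuming a pthread mutex in place of the tls_device_lock spinlock and a C11 atomic in place of the kernel refcount. `struct ctx_model`, `model_sk_destruct()` and `model_device_down_consistent()` are hypothetical names for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the TLS offload context. */
struct ctx_model {
    atomic_int refcount;
    bool       queued_for_gc;
};

/* Stand-in for tls_device_lock (a spinlock in the kernel). */
static pthread_mutex_t model_device_lock = PTHREAD_MUTEX_INITIALIZER;

/* Destructor side: the decrement and the queueing happen inside one
 * critical section, so no observer holding the lock can see a context
 * with refcount zero that is not yet queued. Returns true if this call
 * dropped the last reference. */
static bool model_sk_destruct(struct ctx_model *ctx)
{
    bool last;

    pthread_mutex_lock(&model_device_lock);
    assert(atomic_load(&ctx->refcount) > 0);
    last = (atomic_fetch_sub(&ctx->refcount, 1) == 1);
    if (last)
        ctx->queued_for_gc = true;
    pthread_mutex_unlock(&model_device_lock);
    return last;
}

/* Device-down side: under the same lock, every context is either still
 * live (refcount > 0) or already queued for gc. */
static bool model_device_down_consistent(struct ctx_model *ctx)
{
    bool ok;

    pthread_mutex_lock(&model_device_lock);
    ok = atomic_load(&ctx->refcount) > 0 || ctx->queued_for_gc;
    pthread_mutex_unlock(&model_device_lock);
    return ok;
}
```

With both sides serialized on the same lock, the window between the refcount reaching zero and the context being queued is no longer observable from the device-down side.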
    
    Fixes: e8f69799 ("net/tls: Add generic NIC offload infrastructure")
    Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
    Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
    Reviewed-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
tls_device.c 36.7 KB