• Daniel Borkmann's avatar
    tls: fix currently broken MSG_PEEK behavior · 50c6b58a
    Daniel Borkmann authored
    In kTLS MSG_PEEK behavior is currently failing, strace example:
    
      [pid  2430] socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
      [pid  2430] socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 4
      [pid  2430] bind(4, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
      [pid  2430] listen(4, 10)               = 0
      [pid  2430] getsockname(4, {sa_family=AF_INET, sin_port=htons(38855), sin_addr=inet_addr("0.0.0.0")}, [16]) = 0
      [pid  2430] connect(3, {sa_family=AF_INET, sin_port=htons(38855), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
      [pid  2430] setsockopt(3, SOL_TCP, 0x1f /* TCP_??? */, [7564404], 4) = 0
      [pid  2430] setsockopt(3, 0x11a /* SOL_?? */, 1, "\3\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 40) = 0
      [pid  2430] accept(4, {sa_family=AF_INET, sin_port=htons(49636), sin_addr=inet_addr("127.0.0.1")}, [16]) = 5
      [pid  2430] setsockopt(5, SOL_TCP, 0x1f /* TCP_??? */, [7564404], 4) = 0
      [pid  2430] setsockopt(5, 0x11a /* SOL_?? */, 2, "\3\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 40) = 0
      [pid  2430] close(4)                    = 0
      [pid  2430] sendto(3, "test_read_peek", 14, 0, NULL, 0) = 14
      [pid  2430] sendto(3, "_mult_recs\0", 11, 0, NULL, 0) = 11
      [pid  2430] recvfrom(5, "test_read_peektest_read_peektest"..., 64, MSG_PEEK, NULL, NULL) = 64
    
    As can be seen from strace, there are two TLS records sent,
    i) 'test_read_peek' and ii) '_mult_recs\0' where we end up
    peeking 'test_read_peektest_read_peektest'. This is clearly
    wrong, and what happens is that given peek cannot call into
    tls_sw_advance_skb() to unpause strparser and proceed with
    the next skb, we end up looping over the current one, copying
    the 'test_read_peek' over and over into the user provided
    buffer.
    
    Here, we can only peek into the currently held skb (current,
    full TLS record) as otherwise we would end up having to hold
    all the original skb(s) (depending on the peek depth) in a
    separate queue when unpausing strparser to process next
    records, minimally intrusive is to return only up to the
    current record's size (which likely was what c46234eb
    ("tls: RX path for ktls") originally intended as well). Thus,
    after patch we properly peek the first record:
    
      [pid  2046] wait4(2075,  <unfinished ...>
      [pid  2075] socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 3
      [pid  2075] socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 4
      [pid  2075] bind(4, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
      [pid  2075] listen(4, 10)               = 0
      [pid  2075] getsockname(4, {sa_family=AF_INET, sin_port=htons(55115), sin_addr=inet_addr("0.0.0.0")}, [16]) = 0
      [pid  2075] connect(3, {sa_family=AF_INET, sin_port=htons(55115), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
      [pid  2075] setsockopt(3, SOL_TCP, 0x1f /* TCP_??? */, [7564404], 4) = 0
      [pid  2075] setsockopt(3, 0x11a /* SOL_?? */, 1, "\3\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 40) = 0
      [pid  2075] accept(4, {sa_family=AF_INET, sin_port=htons(45732), sin_addr=inet_addr("127.0.0.1")}, [16]) = 5
      [pid  2075] setsockopt(5, SOL_TCP, 0x1f /* TCP_??? */, [7564404], 4) = 0
      [pid  2075] setsockopt(5, 0x11a /* SOL_?? */, 2, "\3\0033\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 40) = 0
      [pid  2075] close(4)                    = 0
      [pid  2075] sendto(3, "test_read_peek", 14, 0, NULL, 0) = 14
      [pid  2075] sendto(3, "_mult_recs\0", 11, 0, NULL, 0) = 11
      [pid  2075] recvfrom(5, "test_read_peek", 64, MSG_PEEK, NULL, NULL) = 14
    
    Fixes: c46234eb ("tls: RX path for ktls")
    Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    50c6b58a
tls.c 17.5 KB