• Dong Chenchen's avatar
    net: Remove acked SYN flag from packet in the transmit queue correctly · f99cd562
    Dong Chenchen authored
    syzkaller report:
    
     kernel BUG at net/core/skbuff.c:3452!
     invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
     CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.7.0-rc4-00009-gbee0e776-dirty #135
     RIP: 0010:skb_copy_and_csum_bits (net/core/skbuff.c:3452)
     Call Trace:
     icmp_glue_bits (net/ipv4/icmp.c:357)
     __ip_append_data.isra.0 (net/ipv4/ip_output.c:1165)
     ip_append_data (net/ipv4/ip_output.c:1362 net/ipv4/ip_output.c:1341)
     icmp_push_reply (net/ipv4/icmp.c:370)
     __icmp_send (./include/net/route.h:252 net/ipv4/icmp.c:772)
     ip_fragment.constprop.0 (./include/linux/skbuff.h:1234 net/ipv4/ip_output.c:592 net/ipv4/ip_output.c:577)
     __ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:295)
     ip_output (net/ipv4/ip_output.c:427)
     __ip_queue_xmit (net/ipv4/ip_output.c:535)
     __tcp_transmit_skb (net/ipv4/tcp_output.c:1462)
     __tcp_retransmit_skb (net/ipv4/tcp_output.c:3387)
     tcp_retransmit_skb (net/ipv4/tcp_output.c:3404)
     tcp_retransmit_timer (net/ipv4/tcp_timer.c:604)
     tcp_write_timer (./include/linux/spinlock.h:391 net/ipv4/tcp_timer.c:716)
    
    The panic issue was trigered by tcp simultaneous initiation.
    The initiation process is as follows:
    
          TCP A                                            TCP B
    
      1.  CLOSED                                           CLOSED
    
      2.  SYN-SENT     --> <SEQ=100><CTL=SYN>              ...
    
      3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>              <-- SYN-SENT
    
      4.               ... <SEQ=100><CTL=SYN>              --> SYN-RECEIVED
    
      5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK> ...
    
      // TCP B: not send challenge ack for ack limit or packet loss
      // TCP A: close
    	tcp_close
    	   tcp_send_fin
                  if (!tskb && tcp_under_memory_pressure(sk))
                      tskb = skb_rb_last(&sk->tcp_rtx_queue); //pick SYN_ACK packet
               TCP_SKB_CB(tskb)->tcp_flags |= TCPHDR_FIN;  // set FIN flag
    
      6.  FIN_WAIT_1  --> <SEQ=100><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ...
    
      // TCP B: send challenge ack to SYN_FIN_ACK
    
      7.               ... <SEQ=301><ACK=101><CTL=ACK>   <-- SYN-RECEIVED //challenge ack
    
      // TCP A:  <SND.UNA=101>
    
      8.  FIN_WAIT_1 --> <SEQ=101><ACK=301><END_SEQ=102><CTL=SYN,FIN,ACK> ... // retransmit panic
    
    	__tcp_retransmit_skb  //skb->len=0
    	    tcp_trim_head
    		len = tp->snd_una - TCP_SKB_CB(skb)->seq // len=101-100
    		    __pskb_trim_head
    			skb->data_len -= len // skb->len=-1, wrap around
    	    ... ...
    	    ip_fragment
    		icmp_glue_bits //BUG_ON
    
    If we use tcp_trim_head() to remove acked SYN from packet that contains data
    or other flags, skb->len will be incorrectly decremented. We can remove SYN
    flag that has been acked from rtx_queue earlier than tcp_trim_head(), which
    can fix the problem mentioned above.
    
    Fixes: 1da177e4 ("Linux-2.6.12-rc2")
    Co-developed-by: default avatarEric Dumazet <edumazet@google.com>
    Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
    Signed-off-by: default avatarDong Chenchen <dongchenchen2@huawei.com>
    Link: https://lore.kernel.org/r/20231210020200.1539875-1-dongchenchen2@huawei.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
    f99cd562
tcp_output.c 128 KB