    net: dctcp: loosen requirement to assert ECT(0) during 3WHS · 843c2fdf
    Florian Westphal authored
    One deployment requirement of DCTCP is to be able to run
    in a DC setting along with TCP traffic. As Glenn Judd's
    NSDI'15 paper "Attaining the Promise and Avoiding the Pitfalls
    of TCP in the Datacenter" [1] (tba) explains, one way to
    solve this on switch side is to split DCTCP and TCP traffic
    in two queues per switch port based on the DSCP: one queue
    solely intended for DCTCP traffic and one for non-DCTCP traffic.
    
    For the DCTCP queue, there's the marking threshold K as
    explained in commit e3118e83 ("net: tcp: add DCTCP congestion
    control algorithm") for RED marking ECT(0) packets with CE.
    For the non-DCTCP queue, there's e.g. a classic tail drop queue.
    As already explained in e3118e83, running DCTCP at scale
    without marking SYN/SYN-ACK packets with ECT(0) has severe
    consequences: non-ECT(0) packets that traverse the RED
    marking DCTCP queue are far more likely to be lost, which
    severely reduces the probability of establishing a connection.
    
    This is because the DCTCP queue is dominated by ECT(0) traffic,
    and switches handle non-ECT traffic in the RED marking queue
    by dropping it once the queue has passed K, where K is usually
    a low watermark in order to leave enough tailroom for bursts.
    Splitting DCTCP traffic among several queues (ECN and non-ECN
    queue) is considered a terrible idea in the network community
    as it splits single flows across multiple network paths.
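
    The per-packet behaviour of such a DCTCP queue can be sketched
    as follows. This is only an illustrative model, not switch or
    kernel code: the names dctcp_queue_verdict, enum verdict and
    enum ecn_codepoint are hypothetical, and a simple step
    threshold at K is assumed rather than a full RED profile.

```c
#include <stdbool.h>
#include <stddef.h>

/* ECN field values in the IP header as defined by RFC 3168. */
enum ecn_codepoint {
	ECN_NOT_ECT = 0,	/* 00: not ECN-capable */
	ECN_ECT_1   = 1,	/* 01: ECT(1) */
	ECN_ECT_0   = 2,	/* 10: ECT(0) */
	ECN_CE      = 3		/* 11: congestion experienced */
};

/* Hypothetical verdicts of the modelled switch queue. */
enum verdict { VERDICT_ENQUEUE, VERDICT_MARK_CE, VERDICT_DROP };

/*
 * Assumed step-marking model: below or at the threshold k every
 * packet is enqueued unchanged. Above k, ECN-capable packets are
 * marked CE, while non-ECT packets cannot be marked and are dropped.
 */
enum verdict dctcp_queue_verdict(size_t queue_len, size_t k,
				 enum ecn_codepoint ecn)
{
	if (queue_len <= k)
		return VERDICT_ENQUEUE;
	if (ecn == ECN_NOT_ECT)
		return VERDICT_DROP;
	if (ecn == ECN_CE)
		return VERDICT_ENQUEUE;	/* already marked upstream */
	return VERDICT_MARK_CE;
}
```

    In this model a SYN without ECT(0) that arrives while the
    queue is above K is dropped, whereas an ECT(0) data packet in
    the same situation is only marked.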
    
    Therefore, commit e3118e83 implements this on Linux by sending
    ECT(0) marked traffic, as we argue that marking all packets
    of a DCTCP flow is the only viable solution and also does not
    contradict the draft.
    
    However, a DCTCP implementation for FreeBSD recently also
    landed in their mainline kernel [2]. In order to let it play
    well together with Linux' DCTCP, we need to loosen the
    requirement that ECT(0) has to be asserted during the 3WHS, as
    FreeBSD does not implement this. This simplifies the ECN test
    and lets DCTCP work together with FreeBSD.
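
    As a rough illustration of what loosening the ECN test means
    on the listener side, consider the simplified model below. It
    is not the code in tcp_input.c: struct syn_info and the
    functions ecn_ok_strict and ecn_ok_loose are hypothetical, and
    the non-DCTCP branch assumes the usual RFC 3168 rule that a
    SYN negotiating ECN must not itself carry an ECT codepoint.

```c
#include <stdbool.h>

/* Hypothetical summary of an incoming SYN. */
struct syn_info {
	bool th_ecn;	/* TCP header has ECE and CWR set (ECN setup SYN) */
	bool ip_ect;	/* IP header carries an ECT codepoint */
};

/*
 * Assumed strict test: a listener whose congestion control needs
 * ECN (e.g. DCTCP) only negotiates ECN if the SYN asserts ECT.
 */
bool ecn_ok_strict(const struct syn_info *syn, bool ca_needs_ecn,
		   bool ecn_enabled)
{
	if (!syn->th_ecn)
		return false;
	if (ca_needs_ecn)
		return syn->ip_ect;
	return !syn->ip_ect && ecn_enabled;
}

/*
 * Assumed loosened test: such a listener negotiates ECN whether or
 * not the SYN asserts ECT, so a peer that leaves the SYN unmarked
 * can still negotiate ECN.
 */
bool ecn_ok_loose(const struct syn_info *syn, bool ca_needs_ecn,
		  bool ecn_enabled)
{
	if (!syn->th_ecn)
		return false;
	return (!syn->ip_ect && ecn_enabled) || ca_needs_ecn;
}
```

    With a SYN that sets ECE and CWR but no ECT codepoint, the
    strict variant refuses ECN for a DCTCP listener while the
    loosened variant accepts it.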
    
    Joint work with Daniel Borkmann.
    
      [1] https://www.usenix.org/conference/nsdi15/technical-sessions/presentation/judd
      [2] https://github.com/freebsd/freebsd/commit/8ad879445281027858a7fa706d13e458095b595f
    
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Cc: Glenn Judd <glenn.judd@morganstanley.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_input.c 171 KB