1. 20 Mar, 2015 5 commits
    • Merge branch 'listener_refactor_part_14' · 750f2f91
      David S. Miller authored
      Eric Dumazet says:
      
      ====================
      inet: tcp listener refactoring part 14
      
      OK, we have serious patches here.
      
      We get rid of the central timer handling SYNACK rtx,
      which is killing us under even medium SYN flood.
      
      We still use the listener-specific hash table.

      Getting rid of it will be done in the next round ;)
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: increase sk_[max_]ack_backlog · becb74f0
      Eric Dumazet authored
      sk_ack_backlog & sk_max_ack_backlog were 16-bit fields, meaning
      the listen() backlog was limited to 65535.
      
      It is time to increase the width to allow a much bigger backlog,
      if admins change the /proc/sys/net/core/somaxconn &
      /proc/sys/net/ipv4/tcp_max_syn_backlog default values.
      
      Tested:
      
      echo 5000000 >/proc/sys/net/core/somaxconn
      echo 5000000 >/proc/sys/net/ipv4/tcp_max_syn_backlog
      
      Ran a SYNFLOOD test against a listener using listen(fd, 5000000)
      
      myhost~# grep request_sock_TCP /proc/slabinfo
      request_sock_TCP  4185642 4411940    304   13    1 : tunables   54   27    8 : slabdata 339380 339380      0
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
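
To make the width limit concrete, here is a minimal user-space sketch of what a 16-bit versus a 32-bit field can hold. The struct and function names are hypothetical stand-ins, not kernel definitions, and the plain truncation shown here only illustrates the field width; it is not a claim about how the kernel treated out-of-range backlog values.

```c
#include <assert.h>
#include <stdint.h>

/* A 16-bit field tops out at 65535, the old backlog ceiling. */
static_assert(UINT16_MAX == 65535, "16-bit fields cap at 65535");

/* Hypothetical stand-ins for the old and new field widths. */
struct sock_old_sketch { uint16_t sk_max_ack_backlog; };
struct sock_new_sketch { uint32_t sk_max_ack_backlog; };

/* Store a requested backlog in a 16-bit field and read it back. */
static uint32_t store_backlog_16(uint32_t backlog)
{
    struct sock_old_sketch sk;

    sk.sk_max_ack_backlog = (uint16_t)backlog; /* keeps only the low 16 bits */
    return sk.sk_max_ack_backlog;
}

/* Store a requested backlog in a 32-bit field and read it back. */
static uint32_t store_backlog_32(uint32_t backlog)
{
    struct sock_new_sketch sk;

    sk.sk_max_ack_backlog = backlog; /* value is preserved */
    return sk.sk_max_ack_backlog;
}
```

With the tested value of 5000000, the 16-bit variant cannot represent the request, while the 32-bit variant stores it unchanged.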
    • inet: get rid of central tcp/dccp listener timer · fa76ce73
      Eric Dumazet authored
      One of the major issues for TCP is the SYNACK rtx handling,
      done by inet_csk_reqsk_queue_prune(), fired by the keepalive
      timer of a TCP_LISTEN socket.
      
      This function runs for awfully long times, with the socket lock held,
      meaning that other cpus needing this lock have to spin for hundreds of ms.
      
      SYNACKs are sent in huge bursts, likely to cause severe drops anyway.
      
      This model was OK 15 years ago when memory was very tight.
      
      We now can afford to have a timer per request sock.
      
      Timer invocations no longer need to lock the listener,
      and can be run from all cpus in parallel.
      
      With the following patch increasing somaxconn width to 32 bits,
      I tested a listener with more than 4 million active request sockets,
      and a steady SYNFLOOD of ~200,000 SYN per second.
      Host was sending ~830,000 SYNACK per second.
      
      This is ~100 times more than what we could achieve before this patch.
      
      Later, we will get rid of the listener hash and use ehash instead.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
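
The per-request timer idea can be sketched in user space as follows. This is an illustration under stated assumptions, not the kernel code: the struct, the function names, and the 1 second initial / 120 second maximum timeouts are all hypothetical choices for the example. The point it shows is that each request carries its own expiry, so a timer firing only touches that one request and needs no listener-wide lock or scan.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for this sketch only. */
#define SYNACK_TIMEOUT_INIT_SEC 1u
#define SYNACK_TIMEOUT_MAX_SEC  120u

/* Hypothetical per-request state: every request owns its timer data. */
struct request_sketch {
    uint32_t num_timeout;  /* how many times this request's timer fired */
    uint64_t expires_sec;  /* absolute time of the next expiry */
};

/* Exponential backoff, capped at the maximum timeout. */
static uint32_t synack_timeout_sec(uint32_t num_timeout)
{
    uint32_t t;

    if (num_timeout >= 31)  /* avoid shifting past the type width */
        return SYNACK_TIMEOUT_MAX_SEC;
    t = SYNACK_TIMEOUT_INIT_SEC << num_timeout;
    return t < SYNACK_TIMEOUT_MAX_SEC ? t : SYNACK_TIMEOUT_MAX_SEC;
}

/*
 * Timer callback for a single request. It would retransmit the SYNACK
 * here, then re-arm itself. Only this request's state is modified.
 */
static void request_timer_fire(struct request_sketch *req, uint64_t now_sec)
{
    assert(req != NULL);
    req->num_timeout++;
    req->expires_sec = now_sec + synack_timeout_sec(req->num_timeout);
}
```

Because no shared state is touched in the callback, callbacks for different requests could run on different cpus at the same time, which is the property the commit message describes.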
    • inet: drop prev pointer handling in request sock · 52452c54
      Eric Dumazet authored
      When request socks are put in the ehash table, the whole notion
      of having a previous request to update dl_next is pointless.
      
      Also, the following patch will get rid of the big purge timer,
      so we want to delete a request sock without holding the listener lock.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rhashtable: Round up/down min/max_size to ensure we respect limit · a998f712
      Thomas Graf authored
      Round up min_size and round down max_size, respectively, to the
      next power of two to make sure we always respect the limit specified
      by the user. This is required because we compare the table size
      against the limit before we expand or shrink.
      
      Also fixes a minor bug where we modified min_size in the params
      provided instead of the copy stored in struct rhashtable.
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
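
The rounding rule can be sketched in user space like this. The helper and struct names are hypothetical, not the rhashtable API, and the sketch assumes a table whose size is always a power of two. Under that assumption, a max_size of 1000 with a table of 512 entries would pass a "size < max_size" check and double to 1024, overshooting the limit; rounding max_size down to 512 prevents that. The sketch also normalizes a copy, leaving the caller's parameters untouched, in the spirit of the minor fix mentioned in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Smallest power of two that is >= n (n must be non-zero). */
static size_t round_up_pow2(size_t n)
{
    size_t p = 1;

    assert(n != 0 && n <= (SIZE_MAX >> 1) + 1);
    while (p < n)
        p <<= 1;
    return p;
}

/* Largest power of two that is <= n (n must be non-zero). */
static size_t round_down_pow2(size_t n)
{
    size_t p = 1;

    assert(n != 0);
    while (p <= n / 2)
        p <<= 1;
    return p;
}

/* Hypothetical limits structure for this sketch. */
struct table_limits {
    size_t min_size;  /* 0 means "no lower limit" */
    size_t max_size;  /* 0 means "no upper limit" */
};

/* Return normalized limits; the caller's struct is not modified. */
static struct table_limits normalize_limits(const struct table_limits *user)
{
    struct table_limits out = *user;

    if (out.min_size)
        out.min_size = round_up_pow2(out.min_size);
    if (out.max_size)
        out.max_size = round_down_pow2(out.max_size);
    return out;
}
```

For user limits of min_size 100 and max_size 1000, the normalized limits become 128 and 512, so a power-of-two table never shrinks below 100 or grows above 1000.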
  2. 19 Mar, 2015 27 commits
  3. 18 Mar, 2015 8 commits