1. 30 Aug, 2016 3 commits
    • netfilter: conntrack: get rid of conntrack timer · f330a7fd
      Florian Westphal authored
      With stats enabled this eats 80 bytes on x86_64 per nf_conn entry, as
      Eric Dumazet pointed out during netfilter workshop 2016.
      
      Eric also says: "Another reason was the fact that Thomas was about to
      change max timer range [..]" (500462a9, 'timers: Switch to
      a non-cascading wheel').
      
      Remove the timer and use a 32-bit jiffies value holding the timestamp
      until which the entry is valid.
      
      During conntrack lookup, even before doing tuple comparison, check
      the timeout value and evict the entry in case it is too old.
      
      The dying bit is used as a synchronization point to avoid races where
      multiple cpus try to evict the same entry.
      
      Because lookup is always lockless, we need to bump the refcnt once
      when we evict, else we could try to evict already-dead entry that
      is being recycled.
      
      This is the standard/expected way when conntrack entries are destroyed.
      
      Followup patches will introduce garbage collection via work queue
      and further places where we can reap obsoleted entries (e.g. during
      netlink dumps), this is needed to avoid expired conntracks from hanging
      around for too long when lookup rate is low after a busy period.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
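The expiry check and eviction race described in this commit can be sketched in userspace C. This is a minimal model under stated assumptions, not the kernel code: `struct entry`, `entry_init`, `entry_expired`, `ref_get_unless_zero` and `entry_try_evict` are hypothetical names, a plain `uint32_t` tick counter stands in for jiffies, and C11 atomics stand in for the kernel's bit and refcount helpers.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a conntrack entry. */
struct entry {
    uint32_t timeout;   /* absolute 32-bit tick at which the entry expires */
    atomic_bool dying;  /* set exactly once by whoever wins the eviction race */
    atomic_int refcnt;  /* starts at 1: the reference held by the hash table */
};

static void entry_init(struct entry *e, uint32_t timeout)
{
    e->timeout = timeout;
    atomic_init(&e->dying, false);
    atomic_init(&e->refcnt, 1);
}

/* Wrap-safe deadline test: compare via the signed 32-bit difference, so the
 * result stays correct when the tick counter wraps around. (The conversion
 * to int32_t is two's complement on all common compilers.) */
static bool entry_expired(const struct entry *e, uint32_t now)
{
    return (int32_t)(e->timeout - now) <= 0;
}

/* Take a reference unless the count already dropped to zero, i.e. unless the
 * entry is already dead and may be in the middle of being recycled. */
static bool ref_get_unless_zero(atomic_int *ref)
{
    int old = atomic_load(ref);

    while (old != 0) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak(ref, &old, old + 1))
            return true;
    }
    return false;
}

/* Returns true only for the single caller that actually evicted the entry. */
static bool entry_try_evict(struct entry *e, uint32_t now)
{
    bool won = false;

    if (!entry_expired(e, now))
        return false;
    if (!ref_get_unless_zero(&e->refcnt))
        return false;                    /* already dead, leave it alone */

    if (!atomic_exchange(&e->dying, true)) {
        won = true;                      /* we set the dying flag first */
        atomic_fetch_sub(&e->refcnt, 1); /* drop the hash table's reference */
    }
    atomic_fetch_sub(&e->refcnt, 1);     /* drop our temporary reference */
    return won;
}
```

In this model a lookup would call `entry_try_evict()` on each entry it walks before comparing tuples; only one caller per entry ever sees `true`, which mirrors the role the commit assigns to the dying bit.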
    • netfilter: don't rely on DYING bit to detect when destroy event was sent · 616b14b4
      Florian Westphal authored
      The reliable event delivery mode currently (ab)uses the DYING bit to
      detect which entries on the dying list have to be skipped when
      re-delivering events from the ecache worker in reliable event mode.
      
      Currently when we delete the conntrack from main table we only set this
      bit if we could also deliver the netlink destroy event to userspace.
      
      If we fail we move it to the dying list, the ecache worker will
      reattempt event delivery for all confirmed conntracks on the dying list
      that do not have the DYING bit set.
      
      Once timer is gone, we can no longer use if (del_timer()) to detect
      when we 'stole' the reference count owned by the timer/hash entry, so
      we need some other way to avoid racing with other cpu.
      
      Pablo suggested to add a marker in the ecache extension that skips
      entries that have been unhashed from main table but are still waiting
      for the last reference count to be dropped (e.g. because one skb waiting
      on nfqueue verdict still holds a reference).
      
      We do this by adding a tristate.
      If we fail to deliver the destroy event, make a note of this in the
      ecache extension.  The worker can then skip all entries that are in
      a different state.  Either they never delivered a destroy event,
      e.g. because the netlink backend was not loaded, or redelivery took
      place already.
      
      Once the conntrack timer is removed we will now be able to replace
      del_timer() test with test_and_set_bit(DYING, &ct->status) to avoid
      racing with other cpu that tries to evict the same conntrack.
      
      Because DYING will then be set right before we report the destroy event
      we can no longer skip event reporting when dying bit is set.
      Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
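The tristate described in this commit can be illustrated with a small userspace model. This is a sketch under assumptions: the enum values, `struct dying_entry`, `report_destroy`, `redeliver` and the two delivery stubs are hypothetical names chosen for illustration, and a function pointer stands in for the netlink backend.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tristate kept per entry. */
enum destroy_state {
    DESTROY_UNKNOWN, /* no destroy event was ever attempted */
    DESTROY_FAIL,    /* delivery was attempted and failed: retry later */
    DESTROY_SENT     /* delivery succeeded: nothing left to do */
};

struct dying_entry {
    enum destroy_state state;
};

/* Stand-in for the event backend; NULL models "backend not loaded". */
typedef bool (*deliver_fn)(struct dying_entry *e);

/* Two trivial stubs so the sketch can be exercised. */
static bool deliver_ok(struct dying_entry *e)   { (void)e; return true; }
static bool deliver_fail(struct dying_entry *e) { (void)e; return false; }

/* Called when an entry is removed from the main table. */
static void report_destroy(struct dying_entry *e, deliver_fn deliver)
{
    if (!deliver)
        return; /* no backend: the state stays DESTROY_UNKNOWN */
    e->state = deliver(e) ? DESTROY_SENT : DESTROY_FAIL;
}

/* One worker pass over the dying list. Only entries whose destroy event
 * failed are retried; every other state is skipped. Returns the number of
 * entries for which redelivery was attempted. */
static size_t redeliver(struct dying_entry *list, size_t n, deliver_fn deliver)
{
    size_t retried = 0;

    for (size_t i = 0; i < n; i++) {
        if (list[i].state != DESTROY_FAIL)
            continue;
        retried++;
        if (deliver(&list[i]))
            list[i].state = DESTROY_SENT;
    }
    return retried;
}
```

Keeping this state separate from the dying flag is what frees the flag to serve purely as the "who evicts this entry" marker, as the commit message explains.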
    • netfilter: restart search if moved to other chain · 95a8d19f
      Florian Westphal authored
      In case nf_conntrack_tuple_taken did not find a conflicting entry
      check that all entries in this hash slot were tested and restart
      in case an entry was moved to another chain.
      Reported-by: Eric Dumazet <edumazet@google.com>
      Fixes: ea781f19 ("netfilter: nf_conntrack: use SLAB_DESTROY_BY_RCU and get rid of call_rcu()")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
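The "restart if moved to another chain" check can be sketched with a simplified, single-threaded model of a chain whose end marker encodes the bucket it belongs to. This is an illustration under assumptions, not the kernel's list implementation: `struct node`, `make_nulls`, `walk_once` and `chain_has_key` are hypothetical names, and the end marker is modelled as an odd integer stored in the `next` field.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical chain node. 'next' is either a pointer to the next node
 * (low bit clear) or an end marker (low bit set) encoding a bucket number. */
struct node {
    uintptr_t next;
    int key;
};

static uintptr_t make_nulls(unsigned bucket)
{
    return ((uintptr_t)bucket << 1) | 1;
}

static bool is_nulls(uintptr_t p)
{
    return (p & 1) != 0;
}

static unsigned nulls_value(uintptr_t p)
{
    return (unsigned)(p >> 1);
}

/* Walk the chain once. Returns true if the result can be trusted: either the
 * key was found, or the walk ended on the end marker of the bucket it started
 * in. A false return means the walk drifted onto another chain, so "not
 * found" proves nothing and the caller has to restart. */
static bool walk_once(uintptr_t head, unsigned bucket, int key, bool *found)
{
    uintptr_t p = head;

    *found = false;
    while (!is_nulls(p)) {
        const struct node *n = (const struct node *)p;

        if (n->key == key) {
            *found = true;
            return true;
        }
        p = n->next;
    }
    return nulls_value(p) == bucket;
}

/* Lookup with restart: re-read the head and walk again until a walk ends on
 * the right chain. In a concurrent setting the chain is repaired by writers;
 * this single-threaded model assumes the chain is eventually consistent. */
static bool chain_has_key(const uintptr_t *head, unsigned bucket, int key)
{
    bool found;

    while (!walk_once(*head, bucket, key, &found))
        ; /* moved to another chain mid-walk: restart */
    return found;
}
```

The point of the end-marker comparison is that a lockless reader cannot tell from a plain end-of-list that it visited every entry of its own bucket; checking which bucket the marker names closes that gap.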
  2. 26 Aug, 2016 3 commits
  3. 23 Aug, 2016 3 commits
  4. 22 Aug, 2016 4 commits
  5. 18 Aug, 2016 1 commit
  6. 17 Aug, 2016 1 commit
  7. 13 Aug, 2016 1 commit
    • netfilter: remove ip_conntrack* sysctl compat code · adf05168
      Pablo Neira Ayuso authored
      This backward compatibility has been around for more than ten years,
      since Yasuyuki Kozakai introduced IPv6 in conntrack. These days, we have
      alternate /proc/net/nf_conntrack* entries, the ctnetlink interface and
      the conntrack utility got adopted by many people in the user community
      according to what I observed on the netfilter user mailing list.
      
      So let's get rid of this.
      
      Note that nf_conntrack_htable_size and nf_conntrack_max no longer need
      to be exported as symbols.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  8. 12 Aug, 2016 1 commit
  9. 11 Aug, 2016 23 commits