1. 13 Mar, 2015 32 commits
  2. 12 Mar, 2015 8 commits
    • cls_bpf: do eBPF invocation under non-bh RCU lock variant for maps · 54720df1
      Daniel Borkmann authored
      Currently, it is possible in cls_bpf to access eBPF maps only under
      rcu_read_lock_bh() variants: while on ingress side, that is, handle_ing(),
      the classifier would be called from __netif_receive_skb_core() under
      rcu_read_lock(); on egress side, however, it's rcu_read_lock_bh() via
      __dev_queue_xmit().
      
      This rcu/rcu_bh mix doesn't work together with eBPF maps, as they must
      be called solely under rcu_read_lock(). eBPF maps could also be shared
      among various other eBPF programs (possibly even with other eBPF program
      types, e.g. tracing) and user space processes, so any context is assumed.
      
      Therefore, a possible fix for cls_bpf is to wrap/nest the eBPF program
      invocation under the non-bh RCU lock variant.
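
The nesting described above can be sketched in userspace C. This is only a mock, not the kernel change itself: the rcu_* functions here are stand-ins that count nesting depth, and classify(), egress_path(), ingress_path() and sample_prog() are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel primitives; they only count nesting. */
static int rcu_depth;     /* rcu_read_lock() nesting level */
static int rcu_bh_depth;  /* rcu_read_lock_bh() nesting level */

static void rcu_read_lock(void)      { rcu_depth++; }
static void rcu_read_unlock(void)    { assert(rcu_depth > 0); rcu_depth--; }
static void rcu_read_lock_bh(void)   { rcu_bh_depth++; }
static void rcu_read_unlock_bh(void) { assert(rcu_bh_depth > 0); rcu_bh_depth--; }

/* Hypothetical eBPF program type: takes an opaque context. */
typedef int (*prog_fn)(const void *ctx);

/* Hypothetical program: reports whether it runs under the plain variant,
 * which is what map access requires according to the commit message. */
static int sample_prog(const void *ctx)
{
	(void)ctx;
	return rcu_depth > 0;
}

/* Hypothetical classifier hook: always nests the non-bh variant,
 * regardless of which variant the caller already holds. */
static int classify(prog_fn prog, const void *ctx)
{
	int ret;

	rcu_read_lock();
	ret = prog(ctx);
	rcu_read_unlock();
	return ret;
}

/* Egress-like caller: holds only the _bh variant on entry. */
static int egress_path(prog_fn prog, const void *ctx)
{
	int ret;

	rcu_read_lock_bh();
	ret = classify(prog, ctx);
	rcu_read_unlock_bh();
	return ret;
}

/* Ingress-like caller: holds the plain variant on entry. */
static int ingress_path(prog_fn prog, const void *ctx)
{
	int ret;

	rcu_read_lock();
	ret = classify(prog, ctx);
	rcu_read_unlock();
	return ret;
}
```

With the extra nesting in classify(), the program sees the plain variant held on both paths, so it no longer matters which variant the caller took.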
      
      Fixes: e2e9b654 ("cls_bpf: add initial eBPF support for programmable classifiers")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'fib_trie_table_merge_fixes' · 06741d05
      David S. Miller authored
      Alexander Duyck says:
      
      ====================
      fib_trie: Minor fixes for table merge
      
      This patch set addresses two issues reported with the tables merged. The
      first is a NULL pointer dereference; the second is to remove a WARN_ON
      and set the ordering for aliases from different tables with the same slen
      values.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: Provide a deterministic order for fib_alias w/ tables merged · 0b65bd97
      Alexander Duyck authored
      This change makes it so that we should always have a deterministic ordering
      for the main and local aliases within the merged table when two leaves
      overlap.
      
      For example, take a leaf with a key of 192.168.254.0.  If we
      previously added two aliases with a prefix length of 24 from both local and
      main, the first entry added would be first and the second would be second.
      When I was coding this I had added a WARN_ON should such a situation occur,
      as I wasn't sure how likely it would be.  However, this WARN_ON has been
      triggered, so this is something that should be addressed.
      
      With this patch the ordering of the aliases is as follows.  First they are
      sorted on prefix length, then on their table ID, then tos, and finally
      priority.  This way what we end up doing is essentially interleaving the
      two tables on what used to be leaf_info structure boundaries.
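
A multi-key ordering of this kind can be sketched as a comparator in userspace C. This is an illustration only: struct alias_key is a hypothetical, simplified stand-in for the real fib_alias fields, and the sort direction of each key (ascending throughout) is an assumption, since the text above names the keys but not their directions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the fields the commit sorts on. */
struct alias_key {
	unsigned char plen;     /* prefix length */
	unsigned int  tb_id;    /* table ID */
	unsigned char tos;      /* type of service */
	unsigned int  priority; /* route priority */
};

/* Three-way compare that avoids overflow from subtraction. */
static int cmp_u(unsigned int a, unsigned int b)
{
	return (a > b) - (a < b);
}

/* Compare on prefix length, then table ID, then tos, then priority.
 * ASSUMPTION: every key sorts ascending; the real code may differ. */
static int alias_cmp(const void *pa, const void *pb)
{
	const struct alias_key *a = pa;
	const struct alias_key *b = pb;
	int c;

	if ((c = cmp_u(a->plen, b->plen)) != 0)
		return c;
	if ((c = cmp_u(a->tb_id, b->tb_id)) != 0)
		return c;
	if ((c = cmp_u(a->tos, b->tos)) != 0)
		return c;
	return cmp_u(a->priority, b->priority);
}

/* Sort an array of keys into the deterministic order defined above. */
static void sort_aliases(struct alias_key *v, size_t n)
{
	assert(v != NULL || n == 0);
	qsort(v, n, sizeof(*v), alias_cmp);
}
```

Because the four keys are compared in sequence, two aliases from different tables with equal prefix length end up in a fixed order decided by the table ID rather than by insertion order.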
      
      Fixes: 0ddcf43d ("ipv4: FIB Local/MAIN table collapse")
      Reported-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fib_trie: Avoid NULL pointer if local table is not allocated · 3c9e9f73
      Alexander Duyck authored
      The function fib_unmerge assumed the local table had already been
      allocated.  However, if that is not the case when custom rules are
      applied, this can result in a NULL pointer dereference.
      
      In order to prevent this we must check the value of the local table pointer
      and if it is NULL simply return 0 as there is no local table to separate
      from the main.
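
The guard described above can be sketched in userspace C. This is not the actual fib_unmerge body: struct table, struct netns and unmerge_sketch() are hypothetical stand-ins, and the "separation" step is reduced to clearing a flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct table { int merged; };
struct netns {
	struct table *local_tb; /* may be NULL if never allocated */
	struct table *main_tb;
};

/* Sketch of the early return: with no local table there is nothing to
 * separate from main, so report success instead of dereferencing NULL. */
static int unmerge_sketch(struct netns *ns)
{
	struct table *local;

	assert(ns != NULL);
	local = ns->local_tb;
	if (!local)
		return 0;

	local->merged = 0; /* placeholder for the real separation work */
	return 0;
}
```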
      
      Fixes: 0ddcf43d ("ipv4: FIB Local/MAIN table collapse")
      Reported-by: Madhu Challa <challa@noironetworks.com>
      Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ebpf: verifier: check that call reg with ARG_ANYTHING is initialized · 80f1d68c
      Daniel Borkmann authored
      I noticed that a helper function with argument type ARG_ANYTHING does
      not need to have an initialized value (register).
      
      In the worst case, this can lead to unintended stack memory leakage in
      future helper functions if they are not carefully designed, or unintended
      application behaviour in case the application developer was not careful
      enough to match a correct helper function signature in the API.
      
      The underlying issue is that ARG_ANYTHING should actually be split
      into two different semantics:
      
        1) ARG_DONTCARE for function arguments that the helper function
           does not care about (in other words: the default for unused
           function arguments), and
      
        2) ARG_ANYTHING that is an argument actually being used by a
           helper function and *guaranteed* to be an initialized register.
      
      The current risk is low: ARG_ANYTHING is only used for the 'flags'
      argument (r4) in bpf_map_update_elem() that internally does strict
      checking.
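
The two semantics can be sketched as a small check in userspace C. This is a simplified illustration, not the verifier code: struct reg_state and check_arg() are hypothetical, and returning -EACCES for a rejected argument is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Argument type names taken from the commit text. */
enum arg_type {
	ARG_DONTCARE, /* helper ignores this argument */
	ARG_ANYTHING, /* helper uses it; any value, but must be initialized */
};

/* Hypothetical, minimal register state. */
struct reg_state {
	bool initialized;
};

/* Accept unused arguments unconditionally; for ARG_ANYTHING, require
 * that the register has been written before the call. */
static int check_arg(const struct reg_state *reg, enum arg_type type)
{
	assert(reg != NULL);

	if (type == ARG_DONTCARE)
		return 0;
	if (!reg->initialized)
		return -EACCES; /* assumed error code for this sketch */
	return 0;
}
```

Splitting the types this way means an uninitialized register is only tolerated where the helper is known not to read it.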
      
      Fixes: 17a52670 ("bpf: verifier (add verifier core)")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'possible_net_t' · 20453d88
      David S. Miller authored
      Eric W. Biederman says:
      
      ====================
      Introduce possible_net_t
      
      The current usage of write_pnet and read_pnet is a little laborious and
      error prone, as you only notice that you failed to include them if you
      are compiling with network namespaces enabled.
      
      possible_net_t remedies that by using a type that is 0 bytes when
      network namespaces are disabled and can only be read and written to with
      read_pnet and write_pnet.
      
      Aka less work and safer for the same effect.
      
      I kill hold_net and release_net first, as they haven't been used
      since 2008 and are noise at the points where write_pnet and read_pnet
      are used.
      
      I have folded in Eric Dumazet's suggestions to improve the killing of
      hold_net and release_net, and respun.  I had to respin anyway, as
      there were enough changes elsewhere in the tree that the previous version
      of these patches did not quite apply cleanly.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Introduce possible_net_t · 0c5c9fb5
      Eric W. Biederman authored
      Having to say
      > #ifdef CONFIG_NET_NS
      > 	struct net *net;
      > #endif
      
      in structures is a little bit wordy and a little bit error prone.
      
      Instead it is possible to say:
      > typedef struct {
      > #ifdef CONFIG_NET_NS
      >       struct net *net;
      > #endif
      > } possible_net_t;
      
      And then in a header say:
      
      > 	possible_net_t net;
      
      Which is cleaner and easier to use and easier to test, as the
      possible_net_t is always there no matter what the compile options.
      
      Further this allows read_pnet and write_pnet to be functions in all
      cases which is better at catching typos.
      
      This change adds possible_net_t, updates the definitions of read_pnet
      and write_pnet, updates the optional struct net * variables that
      write_pnet is used on to have the type possible_net_t, and finally fixes
      up the b0rked users of read_pnet and write_pnet.
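
A userspace sketch of how the accessors might look with this type. The typedef follows the snippet quoted above; everything else is an assumption for illustration: struct net is a stand-in, CONFIG_NET_NS is defined by hand, struct holder is hypothetical, and falling back to a global init_net when namespaces are disabled is assumed rather than taken from the text.

```c
#include <assert.h>
#include <stddef.h>

#define CONFIG_NET_NS 1 /* assumption: namespaces enabled in this sketch */

struct net { int id; };               /* hypothetical stand-in */
static struct net init_net = { 0 };   /* stand-in for the initial namespace */

typedef struct {
#ifdef CONFIG_NET_NS
	struct net *net;
#endif
} possible_net_t;

/* Always a real function, so a misspelled call fails to compile
 * whether or not CONFIG_NET_NS is set. */
static inline void write_pnet(possible_net_t *pnet, struct net *net)
{
	assert(pnet != NULL);
#ifdef CONFIG_NET_NS
	pnet->net = net;
#else
	(void)net;
#endif
}

static inline struct net *read_pnet(const possible_net_t *pnet)
{
	assert(pnet != NULL);
#ifdef CONFIG_NET_NS
	return pnet->net;
#else
	return &init_net; /* assumed fallback when namespaces are disabled */
#endif
}

/* Hypothetical structure embedding the field without any #ifdef. */
struct holder {
	possible_net_t net;
	int other;
};
```

The structure definition stays free of conditional compilation; only the typedef and the two accessors know about CONFIG_NET_NS.
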
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Kill hold_net release_net · efd7ef1c
      Eric W. Biederman authored
      hold_net and release_net were an idea that turned out to be useless.
      The code has been disabled since 2008.  Kill the code; it is long past due.
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>