1. 18 Jul, 2020 4 commits
    • inet: Run SK_LOOKUP BPF program on socket lookup · 1559b4aa
      Jakub Sitnicki authored
      Run a BPF program before looking up a listening socket on the receive path.
      The program selects a listening socket to yield as the result of socket
      lookup by calling the bpf_sk_assign() helper and returning SK_PASS. The
      program can revert its decision by assigning a NULL socket with
      bpf_sk_assign().
      
      Alternatively, the BPF program can fail the lookup by returning SK_DROP,
      or let the lookup continue as usual by returning SK_PASS when no socket
      has been selected with bpf_sk_assign().
      
      This lets the user match packets with listening sockets freely at the last
      possible point on the receive path, where we know that packets are destined
      for local delivery after undergoing policing, filtering, and routing.
      
      With BPF code selecting the socket, directing packets destined to an IP
      range or to a port range to a single socket becomes possible.
      
      In case multiple programs are attached, they are run in series in the order
      in which they were attached. The end result is determined from the return
      codes of all the programs according to the following rules:
      
       1. If any program returned SK_PASS and selected a valid socket, the socket
          is used as the result of socket lookup.
       2. If more than one program returned SK_PASS and selected a socket, the
          last selection takes effect.
       3. If any program returned SK_DROP, and no program returned SK_PASS and
          selected a socket, socket lookup fails with -ECONNREFUSED.
       4. If all programs returned SK_PASS and none of them selected a socket,
          socket lookup continues to htable-based lookup.
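
The four rules above can be sketched as a small user-space model. This is not the kernel implementation: `struct prog_result`, `enum lookup_outcome`, and `combine_results()` are hypothetical names introduced only for illustration, and a socket is represented by an opaque pointer. The sketch assumes each program's return code and selection have already been recorded in attach order.

```c
#include <stddef.h>

/* Return codes named as in the commit message. */
enum verdict { SK_DROP = 0, SK_PASS = 1 };

/* Hypothetical record of one program's run: its return code and the
 * socket it selected via bpf_sk_assign(), or NULL if it selected none. */
struct prog_result {
    enum verdict ret;
    const void *selected;   /* stand-in for a socket reference */
};

/* Hypothetical names for the three possible end results. */
enum lookup_outcome {
    LOOKUP_USE_SOCKET,      /* rules 1 and 2: use the selected socket */
    LOOKUP_REFUSED,         /* rule 3: fail with -ECONNREFUSED        */
    LOOKUP_CONTINUE         /* rule 4: fall back to the regular lookup */
};

/* Combine the results of all programs, given in attach order. */
enum lookup_outcome combine_results(const struct prog_result *res, size_t n,
                                    const void **sk_out)
{
    const void *selected = NULL;
    int any_drop = 0;

    for (size_t i = 0; i < n; i++) {
        if (res[i].ret == SK_PASS) {
            /* A later SK_PASS selection overrides an earlier one. */
            if (res[i].selected != NULL)
                selected = res[i].selected;
        } else {
            /* A selection made by a program that drops is ignored. */
            any_drop = 1;
        }
    }

    *sk_out = selected;
    if (selected != NULL)
        return LOOKUP_USE_SOCKET;
    if (any_drop)
        return LOOKUP_REFUSED;
    return LOOKUP_CONTINUE;
}
```

In this model a single SK_PASS with a selection wins over any number of SK_DROP verdicts, which is why the drop flag is only consulted once no selection remains.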
      
      Suggested-by: Marek Majkowski <marek@cloudflare.com>
      Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200717103536.397595-5-jakub@cloudflare.com
    • inet: Extract helper for selecting socket from reuseport group · 80b373f7
      Jakub Sitnicki authored
      Prepare for calling into reuseport from __inet_lookup_listener as well.
      
      Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/20200717103536.397595-4-jakub@cloudflare.com
    • bpf: Introduce SK_LOOKUP program type with a dedicated attach point · e9ddbb77
      Jakub Sitnicki authored
      Add a new program type BPF_PROG_TYPE_SK_LOOKUP with a dedicated attach type
      BPF_SK_LOOKUP. The new program kind is to be invoked by the transport layer
      when looking up a listening socket for a new connection request for
      connection-oriented protocols, or when looking up an unconnected socket for
      a packet for connectionless protocols.
      
      When called, an SK_LOOKUP BPF program can select a socket that will receive
      the packet. This serves as a mechanism to overcome the limits of what the
      bind() API can express. Two use cases driving this work are:
      
       (1) steer packets destined to an IP range, on a fixed port, to a socket
      
           192.0.2.0/24, port 80 -> NGINX socket
      
       (2) steer packets destined to an IP address, on any port, to a socket
      
           198.51.100.1, any port -> L7 proxy socket
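
As an illustration of the matching such a program would perform for these two use cases, here is a minimal user-space sketch in plain C. It is not a BPF program: `enum target`, `in_prefix()`, `classify()`, and the `IPV4()` macro are hypothetical names, and addresses are assumed to be IPv4 values in host byte order.

```c
#include <stdint.h>

/* Hypothetical labels for the sockets named in the two use cases. */
enum target { TARGET_NONE, TARGET_NGINX, TARGET_L7_PROXY };

/* Build an IPv4 address in host byte order from its four octets. */
#define IPV4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Return nonzero if addr falls inside net/prefix_len (prefix_len 0..32). */
static int in_prefix(uint32_t addr, uint32_t net, unsigned int prefix_len)
{
    uint32_t mask = prefix_len ? ~(uint32_t)0 << (32 - prefix_len) : 0;

    return (addr & mask) == (net & mask);
}

/* Decide which socket, if any, should receive a packet sent to
 * dst_ip:dst_port. TARGET_NONE means "no selection", in which case
 * the regular lookup would proceed. */
static enum target classify(uint32_t dst_ip, uint16_t dst_port)
{
    /* (1) 192.0.2.0/24, port 80 -> NGINX socket */
    if (in_prefix(dst_ip, IPV4(192, 0, 2, 0), 24) && dst_port == 80)
        return TARGET_NGINX;

    /* (2) 198.51.100.1, any port -> L7 proxy socket */
    if (dst_ip == IPV4(198, 51, 100, 1))
        return TARGET_L7_PROXY;

    return TARGET_NONE;
}
```

Neither rule can be expressed with a single bind() call, which takes one address (or the wildcard) and one port; that is the limitation the commit message refers to.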
      
      In its run-time context, the program receives information about the packet
      that triggered the socket lookup, namely the IP version, L4 protocol
      identifier, and address 4-tuple. The context can be further extended to
      include the ingress interface identifier.
      
      To select a socket, the BPF program fetches it from a map holding socket
      references, like SOCKMAP or SOCKHASH, and calls the
      bpf_sk_assign(ctx, sk, ...) helper to record the selection. The transport
      layer then uses the selected socket as the result of socket lookup.
      
      In its basic form, SK_LOOKUP acts as a filter and hence must return either
      SK_PASS or SK_DROP. If the program returns with SK_PASS, transport should
      look for a socket to receive the packet, or use the one selected by the
      program if available, while SK_DROP informs the transport layer that the
      lookup should fail.
      
      This patch only enables the user to attach an SK_LOOKUP program to a
      network namespace. Subsequent patches hook it up to run on the local
      delivery path in the IPv4 and IPv6 stacks.
      
      Suggested-by: Marek Majkowski <marek@cloudflare.com>
      Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20200717103536.397595-3-jakub@cloudflare.com
    • bpf, netns: Handle multiple link attachments · ce3aa9cc
      Jakub Sitnicki authored
      Extend the BPF netns link callbacks to rebuild (grow/shrink) or update the
      prog_array at a given position when a link gets attached/updated/released.
      
      This lets us lift the limit of having just one link attached for the new
      attach type introduced by a subsequent patch.
      
      No functional changes intended.
      
      Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Link: https://lore.kernel.org/bpf/20200717103536.397595-2-jakub@cloudflare.com
  2. 16 Jul, 2020 36 commits