- 23 May, 2018 10 commits
-
-
Sirio Balmelli authored
On arm32, 'cd tools/testing/selftests/bpf && make' fails with:

    libbpf.c:80:10: error: format ‘%ld’ expects argument of type ‘long int’, but argument 4 has type ‘int64_t {aka long long int}’ [-Werror=format=]
       (func)("libbpf: " fmt, ##__VA_ARGS__); \
              ^
    libbpf.c:83:30: note: in expansion of macro ‘__pr’
     #define pr_warning(fmt, ...) __pr(__pr_warning, fmt, ##__VA_ARGS__)
                                  ^~~~
    libbpf.c:1072:3: note: in expansion of macro ‘pr_warning’
       pr_warning("map:%s value_type:%s has BTF type_size:%ld != value_size:%u\n",

To fix, typecast 'key_size' and amend format string.

Signed-off-by: Sirio Balmelli <sirio@b-ad.ch>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
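As a hedged userspace sketch of this class of -Wformat fix (the exact libbpf.c line and variable differ), the idea is to make the printf specifier match the 64-bit type, either by casting explicitly or by using PRId64:

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        int64_t type_size = 8;

        /* '%ld' breaks on arm32, where 'long' is 32-bit but int64_t is 'long long'. */
        printf("BTF type_size:%lld\n", (long long)type_size);   /* explicit cast */
        printf("BTF type_size:%" PRId64 "\n", type_size);       /* or use PRId64 */
        return 0;
    }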
-
Sirio Balmelli authored
Selftests fail to build on several distros/architectures because of missing header files. On Ubuntu/x86_64, some of the missing headers are: asm/byteorder.h, asm/socket.h, asm/sockios.h. On Debian/arm32, the build already fails at sys/cdefs.h. In both cases, these already exist in /usr/include/<arch-specific-dir>, but Clang does not include them when using the '-target bpf' flag, since it is no longer compiling against the host architecture.

The solution is to:
- run Clang without '-target bpf' and extract the include chain for the current system
- add these to the bpf build with '-idirafter'

The choice of -idirafter is to catch this error without injecting unexpected include behavior: if an arch-specific tree is built for bpf in the future, it will still be correctly found by Clang.

Signed-off-by: Sirio Balmelli <sirio@b-ad.ch>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Daniel Borkmann authored
Martin KaFai Lau says:

====================
This patch set makes some changes to clean up the unused bits in the BTF uapi. It also makes the btf_header extensible. Please see the individual patches for details.

v2:
- Remove NR_SECS from patch 2
- Remove "unsigned" check on array->index_type from patch 3
- Remove BTF_INT_VARARGS and further limit BTF_INT_ENCODING from 8 bits to 4 bits in patch 4
- Adjustments in test_btf.c to reflect changes in v2
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Martin KaFai Lau authored
This patch does the following:

1. Modify libbpf and test_btf to reflect the uapi changes in BTF
2. Add a test for the btf_header changes
3. Add tests for array->index_type
4. Add an err_str check to the tests
5. Fix a 4-byte hole in "struct test #1" by swapping "m" and "n" (see the layout sketch below)

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
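The padding-hole fix in item 5 follows the usual C layout rule; a hedged illustration (the actual members of "struct test #1" in test_btf.c may differ):

    /* A 4-byte member placed before an 8-byte-aligned member leaves a hole: */
    struct with_hole {
        long long a;
        int m;            /* 4 bytes */
        /* 4-byte padding hole here */
        long long n;      /* 8-byte alignment forces the hole above */
    };

    /* Swapping the two members removes the interior hole: */
    struct without_hole {
        long long a;
        long long n;
        int m;            /* only tail padding remains */
    };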
-
Martin KaFai Lau authored
This patch syncs the uapi bpf.h and btf.h to tools. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Martin KaFai Lau authored
In "struct bpf_map_info", the name "btf_id", "btf_key_id" and "btf_value_id" could cause confusion because the "id" of "btf_id" means the BPF obj id given to the BTF object while "btf_key_id" and "btf_value_id" means the BTF type id within that BTF object. To make it clear, btf_key_id and btf_value_id are renamed to btf_key_type_id and btf_value_type_id. Suggested-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Martin KaFai Lau authored
This patch does the following:

1. Limit BTF_MAX_TYPES and BTF_MAX_NAME_OFFSET to 64k. We can raise them later.
2. Remove BTF_TYPE_PARENT and BTF_STR_TBL_ELF_ID. They are currently encoded in the highest bit of a u32; the current use case does not require supporting a parent type (i.e. a type_id referring to a type in another BTF file) and does not support referring to a string in ELF. The BTF_TYPE_PARENT and BTF_STR_TBL_ELF_ID checks are replaced by BTF_TYPE_ID_CHECK and BTF_STR_OFFSET_CHECK, which are defined in btf.c instead of uapi/linux/btf.h.
3. Limit BTF_INFO_KIND from 5 bits to 4 bits, which is enough for now and leaves unused bits as headroom if we ever need them later.
4. Remove the root bit in BTF_INFO because it is not used in the current use case.
5. Remove BTF_INT_VARARGS since the func type is not supported yet, and limit BTF_INT_ENCODING to 4 bits instead of 8 bits.

The above can be added back later because the verifier ensures the unused bits are zeros.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
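As a rough illustration of what packing a 4-bit kind into a u32 info word looks like (macro names and shift values here are hypothetical; uapi/linux/btf.h is the authoritative encoding):

    #include <stdio.h>

    /* hypothetical example macros, not the real BTF_INFO_* encoding */
    #define EX_INFO_KIND(info)  (((info) >> 24) & 0x0f)   /* 4 bits of kind */
    #define EX_INFO_VLEN(info)  ((info) & 0xffff)

    int main(void)
    {
        unsigned int info = (1u << 24) | 3;   /* kind 1, vlen 3 */

        printf("kind=%u vlen=%u\n", EX_INFO_KIND(info), EX_INFO_VLEN(info));
        return 0;
    }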
-
Martin KaFai Lau authored
Instead of ignoring the array->index_type field, enforce that it must be a BTF_KIND_INT of size 1, 2, 4 or 8 bytes. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
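A hedged sketch of the check being described (not the kernel's btf.c implementation; BTF_KIND_INT is the uapi kind constant):

    #include <stdbool.h>
    #include <stdio.h>

    #define BTF_KIND_INT 1   /* as defined in uapi/linux/btf.h */

    static bool array_index_type_ok(int kind, unsigned int int_size)
    {
        if (kind != BTF_KIND_INT)
            return false;
        return int_size == 1 || int_size == 2 || int_size == 4 || int_size == 8;
    }

    int main(void)
    {
        printf("%d %d\n", array_index_type_ok(BTF_KIND_INT, 4),
               array_index_type_ok(BTF_KIND_INT, 3));   /* prints: 1 0 */
        return 0;
    }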
-
Martin KaFai Lau authored
There are currently unused section descriptions in the btf_header. Those sections are there to support future BTF use cases; for example, the func section (func_off) is to support function signatures (e.g. the BPF prog function signature). Instead of spelling out all potential sections up-front in the btf_header, this patch makes changes to btf_header such that extending it (e.g. adding a section) is possible later. The unused ones can be removed for now and added back later.

This patch:

1. Adds a hdr_len to the btf_header. It will allow adding sections (and other info like parent_label and parent_name) later. The check is similar to the existing one for bpf_attr: if a user passes in a longer hdr_len, the kernel ensures the extra trailing bytes are 0 (see the sketch below).
2. Allows the section order in the BTF object to be different from its sec_off order in the btf_header.
3. Follows each sec_off with a sec_len. There must be no gaps or overlaps among sections. The string section is ensured to be at the end due to the 4-byte alignment requirement of the type section.

The above changes allow enough flexibility to add new sections (and other info) to the btf_header later.

This patch also removes an unnecessary !err check at the end of btf_parse().

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
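A hedged sketch of the "longer hdr_len must be zero-padded" rule from item 1 (the kernel reuses bpf_check_uarg_tail_zero() for this, see the commit below; this code is illustrative only):

    #include <stddef.h>

    /* Return 0 if every byte past the kernel-known header length is zero. */
    static int hdr_tail_is_zero(const unsigned char *hdr, size_t user_hdr_len,
                                size_t known_hdr_len)
    {
        size_t i;

        for (i = known_hdr_len; i < user_hdr_len; i++)
            if (hdr[i] != 0)
                return -1;   /* the kernel would reject this, e.g. with -E2BIG */
        return 0;
    }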
-
Martin KaFai Lau authored
This patch exposes check_uarg_tail_zero() which will be reused by a later BTF patch. Its name is changed to bpf_check_uarg_tail_zero(). Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
- 22 May, 2018 13 commits
-
-
Daniel Borkmann authored
David Ahern says: ==================== Packets that exceed the egress MTU can not be forwarded in the fast path. Add IPv4 and IPv6 MTU helpers that take a FIB lookup result (versus the typical dst path) and add the calls to bpf_ipv{4,6}_fib_lookup. v2 - add ip6_mtu_from_fib6 to ipv6_stub - only call the new MTU helpers for fib lookups in XDP path; skb path uses is_skb_forwardable to determine if the packet can be sent via the egress device from the FIB lookup ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
David Ahern authored
Add a check that the egress MTU can handle the packet to be forwarded. If the MTU is less than the packet length, return 0, meaning the packet is expected to continue up the stack for help, e.g. fragmenting the packet or sending an ICMP. The XDP path needs to leverage the FIB entry for an MTU on the route spec or an exception entry for a given destination. The skb path lets is_skb_forwardable decide if the packet can be sent. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
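A hedged sketch of the semantics described above (not the actual net/core/filter.c code):

    /* Illustrative only: if the egress MTU cannot carry the frame, the helper
     * returns 0 so the packet continues up the stack, e.g. for fragmentation
     * or for sending an ICMP error. */
    static int fwd_allowed(unsigned int pkt_len, unsigned int egress_mtu)
    {
        if (pkt_len > egress_mtu)
            return 0;      /* not forwardable in the XDP fast path */
        return 1;          /* ok to forward via the egress device from the FIB lookup */
    }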
-
David Ahern authored
Determine path MTU from a FIB lookup result. Logic is based on ip6_dst_mtu_forward plus lookup of nexthop exception. Add ip6_dst_mtu_forward to ipv6_stubs to handle access by core bpf code. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
David Ahern authored
Determine path MTU from a FIB lookup result. Logic is a distillation of ip_dst_mtu_maybe_forward. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Daniel Borkmann authored
Björn Töpel says: ==================== This the second follow-up set. The first four patches are uapi changes: * Removing rebind support * Getting rid of structure hole * Removing explicit cache line alignment * Stricter bind checks The last patches do some cleanups, where the umem and refcount_t changes were suggested by Daniel. * Add a missing write-barrier and use READ_ONCE for data-dependencies * Clean up umem and do proper locking * Convert atomic_t to refcount_t ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Björn Töpel authored
Introduce refcount_t, in favor of atomic_t. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
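The conversion follows the standard kernel pattern; a generic hedged sketch (not the actual xsk umem code):

    #include <linux/refcount.h>
    #include <linux/slab.h>

    struct obj {
        refcount_t users;                       /* was: atomic_t users; */
    };

    static void obj_get(struct obj *o)
    {
        refcount_inc(&o->users);                /* was: atomic_inc(&o->users); */
    }

    static void obj_put(struct obj *o)
    {
        if (refcount_dec_and_test(&o->users))   /* was: atomic_dec_and_test() */
            kfree(o);
    }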
-
Björn Töpel authored
As suggested by Daniel Borkmann, the umem setup code was too defensive and complex. Here, we reduce the number of checks. Also, the memory pinning is now folded into the umem creation, and we do correct locking. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Björn Töpel authored
Here, we add a missing write-barrier, and use READ_ONCE for the data-dependency barrier. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
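The pattern being completed is the usual producer/consumer ordering; a generic hedged sketch (not the actual xsk ring code):

    #include <linux/compiler.h>   /* READ_ONCE() */
    #include <asm/barrier.h>      /* smp_wmb() */

    struct ring {
        unsigned int producer;
        unsigned int consumer;
        unsigned long long desc[64];
    };

    /* Producer: publish the descriptor before advancing the producer index. */
    static void ring_produce(struct ring *r, unsigned long long d)
    {
        r->desc[r->producer & 63] = d;
        smp_wmb();                            /* the write barrier in question */
        r->producer++;
    }

    /* Consumer: READ_ONCE() provides the ordering on the index the data depends on. */
    static int ring_consume(struct ring *r, unsigned long long *d)
    {
        unsigned int prod = READ_ONCE(r->producer);

        if (r->consumer == prod)
            return 0;
        *d = r->desc[r->consumer++ & 63];
        return 1;
    }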
-
Björn Töpel authored
Adapt xdpsock to use the new getsockopt introduced in the previous commit. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Björn Töpel authored
In this commit we remove the explicit ring structure from the uapi. It is tricky for a uapi to depend on a certain L1 cache line size, since it can differ between variants of the same architecture. Now, we let the user application determine the offsets of the producer, consumer and descriptors by asking the socket via getsockopt. A typical flow would be (Rx ring):

    struct xdp_mmap_offsets off;
    struct xdp_desc *ring;
    u32 *prod, *cons;
    void *map;
    ...

    getsockopt(fd, SOL_XDP, XDP_MMAP_OFFSETS, &off, &optlen);

    map = mmap(NULL, off.rx.desc + NUM_DESCS * sizeof(struct xdp_desc),
               PROT_READ | PROT_WRITE, MAP_SHARED | MAP_POPULATE, sfd,
               XDP_PGOFF_RX_RING);

    prod = map + off.rx.producer;
    cons = map + off.rx.consumer;
    ring = map + off.rx.desc;

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Magnus Karlsson authored
Validate the queue id against both Rx and Tx on the netdev. Also, make sure that the queue exists at xmit time. Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Tested-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Björn Töpel authored
Move the sxdp_flags up, avoiding a hole in the uapi structure. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Björn Töpel authored
Supporting rebind, i.e. after a successful bind the process can call bind again without closing the socket, makes the AF_XDP setup state machine more complex. Constrain the state space, by not supporting rebind. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
- 18 May, 2018 12 commits
-
-
Daniel Borkmann authored
John Fastabend says: ==================== In this series we add the ability for sk msg programs to read basic sock information about the sock they are attached to. The second patch adds the tests to the selftest test_verifier. One observation I had from writing this series is that lots of the ./net/core/filter.c code is almost duplicated across program types. I thought about building a template/macro that we could use as a single block of code to read sock data out for multiple programs, but I wasn't convinced it was worth it yet. Using a macro saved a couple of lines of code per block but made the code a bit harder to read, IMO. We can probably revisit the idea later if we get more duplication. v2: add errstr field to negative test_verifier test cases to ensure we get the expected err string back from the verifier. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
John Fastabend authored
Add tests for BPF_PROG_TYPE_SK_MSG to test_verifier for read access to new sk fields. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
John Fastabend authored
Currently sk_msg programs only have access to the raw data. However, it is often useful when building policies to make them specific to the socket endpoint. This allows using the socket tuple as input into filters, etc. This patch adds ctx access to the sock fields. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
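A hedged sketch of a BPF_PROG_TYPE_SK_MSG program using the new fields (field names follow uapi struct sk_msg_md; the SEC() definition and the policy itself are illustrative):

    #include <linux/bpf.h>

    #define SEC(name) __attribute__((section(name), used))   /* as in the libbpf helpers */

    SEC("sk_msg")
    int msg_endpoint_policy(struct sk_msg_md *msg)
    {
        /* Policy keyed on the socket endpoint rather than the raw payload. */
        if (msg->family != 2 /* AF_INET */)
            return SK_PASS;
        if (msg->local_port == 8080)          /* local_port is host byte order */
            return SK_DROP;
        return SK_PASS;
    }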
-
Daniel Borkmann authored
Jiong Wang says: ==================== The NFP eBPF JIT is missing logic indirect shifts (both left and right) and arithmetic right shifts (both indirect shift and shift by constant). This patch set adds support for them. For indirect shifts, the shift amount is not specified as a constant. NFP needs to get the shift amount through the low 5 bits of the source A operand in PREV_ALU, therefore extra instructions are needed compared with shifts by constants. Because NFP is 32-bit, we are using a register pair for 64-bit shifts and therefore need different instruction sequences depending on whether the shift amount is less than 32 or not. An NFP branch-on-bit-test instruction emitter is added by this patch set and is used for an efficient runtime check on the shift amount: the shift amount is treated as less than 32 if bit 5 is clear, and greater than or equal to 32 otherwise. A shift amount greater than or equal to 64 results in undefined behavior. This patch set also uses range info to avoid generating unnecessary runtime code when we are certain whether the shift amount is less than 32 or not. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Jiong Wang authored
The code logic is similar to arithmetic right shift by constant, and NFP gets the indirect shift amount through the source A operand of PREV_ALU. It would be possible to fall back to a logic right shift if the MSB is known to be zero from range info; however, there is no benefit to doing this, given the logic indirect right shift uses the same number of instructions and cycles. Suppose the MSB of regX is the bit we want to replicate to fill in all the vacant positions, and regY contains the shift amount; then we could use a single instruction to set up both: [alu, --, regY, OR, regX] -- NOTE: the PREV_ALU result doesn't need to be written to any destination register. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Jiong Wang authored
The code logic is similar to the logic right shift, except we also need to set the PREV_ALU result properly; its MSB is the bit that will be replicated to fill in all the vacant positions. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Jiong Wang authored
For indirect shifts, the shift amount is not specified as a constant. NFP needs to get the shift amount through the low 5 bits of the source A operand in PREV_ALU, therefore extra instructions are needed compared with shifts by constants. Because NFP is 32-bit, we are using a register pair for 64-bit shifts and therefore need different instruction sequences depending on whether the shift amount is less than 32 or not. An NFP branch-on-bit-test instruction emitter is added by this patch and is used for an efficient runtime check on the shift amount: the shift amount is treated as less than 32 if bit 5 is clear, and greater than or equal to 32 otherwise. A shift amount greater than or equal to 64 results in undefined behavior. This patch also uses range info to avoid generating unnecessary runtime code when we are certain whether the shift amount is less than 32 or not. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
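To see why two sequences are needed on a 32-bit machine, here is a plain-C illustration of a 64-bit logical right shift built from two 32-bit halves (generic C, not NFP instruction emission):

    #include <stdio.h>

    static unsigned long long shr64(unsigned int hi, unsigned int lo, unsigned int amt)
    {
        unsigned int new_hi, new_lo;

        if (amt < 32) {                 /* bit 5 of the shift amount is clear */
            new_lo = (lo >> amt) | (amt ? hi << (32 - amt) : 0);
            new_hi = hi >> amt;
        } else {                        /* 32 <= amt < 64: bit 5 is set */
            new_lo = hi >> (amt - 32);
            new_hi = 0;
        }
        return ((unsigned long long)new_hi << 32) | new_lo;
    }

    int main(void)
    {
        printf("%llx\n", shr64(0x12345678u, 0x9abcdef0u, 36));   /* prints: 1234567 */
        return 0;
    }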
-
Daniel Borkmann authored
Björn Töpel says: ==================== This series contain "cosmetics only" follow-up patches for AF_XDP. Thanks to Daniel for suggesting them! ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Björn Töpel authored
Properly align xsk_proto_ops initialization. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Björn Töpel authored
Removed some cases of unnecessary parentheses. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Björn Töpel authored
Minor cleanup, remove newline at end of Makefile. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Björn Töpel authored
Clean up SPDX-License-Identifier and remove licensing leftovers. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
- 17 May, 2018 4 commits
-
-
Gustavo A. R. Silva authored
`e' is being freed twice. Fix this by removing one of the kfree() calls. Addresses-Coverity-ID: 1468983 ("Double free") Fixes: 81110384 ("bpf: sockmap, add hash map support") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
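The shape of the bug, as a self-contained hedged illustration (userspace malloc/free stand in for the actual sockmap kfree() calls):

    #include <stdlib.h>

    static int add_entry(int fail)
    {
        void *e = malloc(32);
        int err = 0;

        if (!e)
            return -1;
        if (fail) {
            free(e);        /* first free on the error path */
            err = -1;
            goto out;       /* ...but 'out' frees it again */
        }
        /* normal processing of e would happen here */
    out:
        free(e);            /* double free when the error path was taken */
        return err;
    }

    int main(void)
    {
        return add_entry(0);   /* passing 1 would trigger the double free */
    }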
-
Gustavo A. R. Silva authored
There is a potential execution path in which variable err is returned without being properly initialized previously. Fix this by initializing variable err to 0. Addresses-Coverity-ID: 1468964 ("Uninitialized scalar variable") Fixes: e5cd3abc ("bpf: sockmap, refactor sockmap routines to work with hashmap") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
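The shape of the issue, as a hedged illustration (not the actual sockmap routine):

    #include <stdio.h>

    static int update_elem(int have_work)
    {
        int err = 0;       /* the fix: initialize to 0 rather than leaving it undefined */

        if (have_work)
            err = -1;      /* only some paths assign err before it is returned */
        return err;
    }

    int main(void)
    {
        printf("%d\n", update_elem(0));
        return 0;
    }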
-
Quentin Monnet authored
Documentation for eBPF helpers can be parsed from bpf.h and eventually turned into a man page. Commit 6f96674d ("bpf: relax constraints on formatting for eBPF helper documentation") changed the script used to parse it, in order to allow for different indent style and to ease the work for writing documentation for future helpers. The script currently considers that the first tab can be replaced by 6 to 8 spaces. But the documentation for bpf_fib_lookup() uses a mix of tabs (for the "Description" part) and of spaces ("Return" part), and only has 5 space long indent for the latter. We probably do not want to change the values accepted by the script each time a new helper gets a new indent style. However, it is worth noting that with those 5 spaces, the "Description" and "Return" part *look* aligned in the generated patch and in `git show`, so it is likely other helper authors will use the same length. Therefore, allow for helper documentation to use 5 spaces only for the first indent level. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
David S. Miller authored (merge of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next)
Daniel Borkmann says:

====================
pull-request: bpf-next 2018-05-17

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Provide a new BPF helper for doing a FIB and neighbor lookup in the kernel tables from an XDP or tc BPF program. The helper provides a fast-path for forwarding packets. The API supports IPv4, IPv6 and MPLS protocols, but currently IPv4 and IPv6 are implemented in this initial work, from David (Ahern).

2) Just a tiny diff but huge feature enabled for nfp driver by extending the BPF offload beyond a pure host processing offload. Offloaded XDP programs are allowed to set the RX queue index and thus opening the door for defining a fully programmable RSS/n-tuple filter replacement. Once BPF decided on a queue already, the device data-path will skip the conventional RSS processing completely, from Jakub.

3) The original sockmap implementation was array based similar to devmap. However, unlike devmap where an ifindex has a 1:1 mapping into the map, there are use cases with sockets that need to be referenced using longer keys. Hence, sockhash map is added reusing as much of the sockmap code as possible, from John.

4) Introduce BTF ID. The ID is allocated through an IDR similar as with BPF maps and progs. It also makes BTF accessible to user space via BPF_BTF_GET_FD_BY_ID and adds exposure of the BTF data through BPF_OBJ_GET_INFO_BY_FD, from Martin.

5) Enable BPF stackmap with build_id also in NMI context. Due to the up_read() of current->mm->mmap_sem, build_id cannot be parsed there. This work defers the up_read() via a per-cpu irq_work so that at least limited support can be enabled, from Song.

6) Various BPF JIT follow-up cleanups and fixups after the LD_ABS/LD_IND JIT conversion, as well as implementation of an optimized 32/64 bit immediate load in the arm64 JIT that allows to reduce the number of emitted instructions; in case of tested real-world programs they were shrinking by three percent, from Daniel.

7) Add ifindex parameter to the libbpf loader in order to enable BPF offload support. Right now only iproute2 can load offloaded BPF and this will also enable libbpf for direct integration into other applications, from David (Beckett).

8) Convert the plain text documentation under Documentation/bpf/ into RST format since this is the appropriate standard the kernel is moving to for all documentation. Also add an overview README.rst, from Jesper.

9) Add __printf verification attribute to the bpf_verifier_vlog() helper. Though it uses va_list we can still allow gcc to check the format string, from Mathieu.

10) Fix a bash reference in the BPF selftest's Makefile. The '|& ...' is a bash 4.0+ feature which is not guaranteed to be available when calling out to shell, therefore use a more portable variant, from Joe.

11) Fix a 64 bit division in xdp_umem_reg() by using div_u64() instead of relying on the gcc built-in, from Björn.

12) Fix a sock hashmap kmalloc warning reported by syzbot when an overly large key size is used in hashmap, then causing overflows in htab->elem_size. Reject bogus attr->key_size early in sock_hash_alloc(), from Yonghong.

13) Ensure in BPF selftests when urandom_read is being linked that --build-id is always enabled so that test_stacktrace_build_id[_nmi] won't be failing, from Alexei.

14) Add bitsperlong.h as well as errno.h uapi headers into the tools header infrastructure, which point to one of the arch-specific uapi headers. This was needed in order to fix a build error on some systems for the BPF selftests, from Sirio.

15) Allow for short options to be used in the xdp_monitor BPF sample code. And also a bpf.h tools uapi header sync in order to fix a selftest build failure. Both from Prashant.

16) More formally clarify the meaning of ID in the direct packet access section of the BPF documentation, from Wang.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 16 May, 2018 1 commit
-
-
John Fastabend authored
When an error happens in the sockmap element update logic, also pass the error up to the user. Fixes: e5cd3abc ("bpf: sockmap, refactor sockmap routines to work with hashmap") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-