- 24 Feb, 2018 2 commits
-
-
Daniel Borkmann authored
While analyzing some of the more complex BPF programs from Cilium, I found that LLVM generally prefers to emit LD_IMM64 instead of MOV32 BPF instructions for loading unsigned 32-bit immediates into a register. Given we cannot change the current/stable LLVM versions that are already out there, let's optimize this case such that the JIT prefers to emit 'mov %eax, imm32' over 'movabsq %rax, imm64' whenever suitable in order to reduce the image size by 4-5 bytes per such load in the typical case, reducing image size on some of the bigger programs by up to 4%. emit_mov_imm32() and emit_mov_imm64() have been added as helpers. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
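The size difference between the two encodings can be illustrated with a small user-space sketch. This is not the kernel's emit_mov_imm64(); the function name sketch_emit_load_imm, the register numbering (0 = rax ... 15 = r15) and the buffer handling are assumptions made for illustration, and a little-endian host is assumed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical sketch (not the kernel helper): encode "load imm into the
 * 64-bit register numbered reg" and return the number of bytes written.
 * Assumes a little-endian host and a buffer of at least 10 bytes.
 */
static size_t sketch_emit_load_imm(uint8_t *buf, unsigned int reg, uint64_t imm)
{
	size_t n = 0;

	if (imm <= UINT32_MAX) {
		/* mov r32, imm32 (B8+rd id): the upper 32 bits are zeroed. */
		uint32_t lo = (uint32_t)imm;

		if (reg >= 8)
			buf[n++] = 0x41;	/* REX.B selects r8d..r15d */
		buf[n++] = 0xB8 + (reg & 7);
		memcpy(buf + n, &lo, sizeof(lo));
		n += sizeof(lo);
	} else {
		/* movabs r64, imm64 (REX.W B8+rd io) */
		buf[n++] = 0x48 | (reg >= 8 ? 0x01 : 0x00);
		buf[n++] = 0xB8 + (reg & 7);
		memcpy(buf + n, &imm, sizeof(imm));
		n += sizeof(imm);
	}
	return n;
}
```

In this sketch the short form takes 5 bytes (6 with a REX prefix) against 10 bytes for the long form, which is consistent with the 4-5 bytes saved per load quoted above.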
-
Daniel Borkmann authored
When we shift by one, we can use a different encoding where imm is not explicitly needed, which saves 1 byte per such op. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
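A minimal sketch of the one-byte saving, assuming a 64-bit left shift on rax..rdi only. The name sketch_emit_shl64 is hypothetical and the code is not taken from the JIT; it only shows the two x86-64 encodings (D1 /4 for shift-by-one, C1 /4 ib for shift-by-imm8).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: encode "shl reg64, imm" for rax..rdi (reg 0..7). */
static size_t sketch_emit_shl64(uint8_t *buf, unsigned int reg, uint8_t imm)
{
	size_t n = 0;

	buf[n++] = 0x48;			/* REX.W: 64-bit operand size */
	if (imm == 1) {
		buf[n++] = 0xD1;		/* shift-by-one form, no imm8 */
		buf[n++] = 0xE0 | (reg & 7);	/* ModRM: mod=11, /4 = shl */
	} else {
		buf[n++] = 0xC1;		/* shift-by-imm8 form */
		buf[n++] = 0xE0 | (reg & 7);	/* ModRM: mod=11, /4 = shl */
		buf[n++] = imm;			/* explicit shift count */
	}
	return n;
}
```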
-
- 23 Feb, 2018 1 commit
-
-
Yafang Shao authored
sk is already allocated in inet_create/inet6_create, hence when BPF_CGROUP_RUN_PROG_INET_SOCK is executed sk will never be NULL. The logic is as below:

    sk = sk_alloc();
    if (!sk)
        goto out;
    BPF_CGROUP_RUN_PROG_INET_SOCK(sk);

Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
- 15 Feb, 2018 5 commits
-
-
Daniel Borkmann authored
Joe Stringer says: ==================== This series makes some minor changes primarily focused on making it easier to understand why test_verifier is failing a test. This includes printing the observed output when a test fails in a different way than expected, or when unprivileged tests fail due to sysctl kernel.unprivileged_bpf_disabled=1. The last patch removes some apparently dead code. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Joe Stringer authored
This array appears to be completely unused, remove it. Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Joe Stringer authored
The "kernel.unprivileged_bpf_disabled" sysctl, if enabled, causes all unprivileged tests to fail because it permanently disables unprivileged BPF access for the currently running kernel. Skip the relevant tests if the user attempts to run the testsuite with this sysctl enabled. Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
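One way such a check could look is sketched below. The helper name sketch_unpriv_disabled and its fail-closed behaviour are assumptions, not the test_verifier code; on a running system the sysctl is exposed at /proc/sys/kernel/unprivileged_bpf_disabled.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: return true when the sysctl file at path holds a
 * non-zero value. If the file cannot be read or parsed, assume that
 * unprivileged BPF is unavailable and report it as disabled.
 */
static bool sketch_unpriv_disabled(const char *path)
{
	FILE *fp = fopen(path, "r");
	int value = 1;

	if (!fp)
		return true;
	if (fscanf(fp, "%d", &value) != 1)
		value = 1;
	fclose(fp);
	return value != 0;
}
```

A test runner could call sketch_unpriv_disabled("/proc/sys/kernel/unprivileged_bpf_disabled") once at startup and skip the unprivileged cases when it returns true.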
-
Joe Stringer authored
When privileged tests are skipped due to user rights, count the number of skipped tests so it's more obvious that the test did not check everything. Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
Joe Stringer authored
This makes it easier to debug off-hand when the error message isn't exactly as expected. Signed-off-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
-
- 14 Feb, 2018 6 commits
-
-
Yonghong Song authored
The default rlimit RLIMIT_MEMLOCK is 64KB. In certain cases, e.g. in a test machine mimicking our production system, this test may fail because the required memory cannot be charged for map creation:

    # ./test_tcpbpf_user
    libbpf: failed to create map (name: 'global_map'): Operation not permitted
    libbpf: failed to load object 'test_tcpbpf_kern.o'
    FAILED: load_bpf_file failed for: test_tcpbpf_kern.o

Changing the default rlimit RLIMIT_MEMLOCK to unlimited makes the test always pass. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
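Raising the limit from a test program can be sketched as follows. The wrapper name sketch_raise_memlock_limit and its return convention are assumptions; setrlimit() and RLIMIT_MEMLOCK are the standard POSIX/Linux interfaces.

```c
#include <errno.h>
#include <sys/resource.h>

/*
 * Hypothetical wrapper: lift the locked-memory limit for this process.
 * Returns 0 on success or a negative errno value (for example -EPERM
 * when the caller is not allowed to raise the hard limit).
 */
static int sketch_raise_memlock_limit(void)
{
	struct rlimit unlimited = { RLIM_INFINITY, RLIM_INFINITY };

	if (setrlimit(RLIMIT_MEMLOCK, &unlimited) != 0)
		return -errno;
	return 0;
}
```

Such a call would typically run before any BPF map is created, since the map memory is charged against this limit.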
-
Jesper Dangaard Brouer authored
The current selftests Makefile construct results in cgroup_helpers.c being compiled together with all the TEST_GEN_PROGS. It also results in invoking the libbpf Makefile twice (tools/lib/bpf). These issues were introduced in commit 9d1f1594 ("bpf: move cgroup_helpers from samples/bpf/ to tools/testing/selftesting/bpf/"). The only test program that requires the cgroup helpers is 'test_dev_cgroup'. Thus, create a make target $(OUTPUT)/test_dev_cgroup that extends[1] the 'prerequisite' for the 'stem' %-style pattern in ../lib.mk, for this particular test program. Reviewers, note that the make rules in tools/testing/selftests/lib.mk differ from the normal kernel kbuild rules, and it is practical to use 'make -p' to follow how these 'Implicit/static pattern stem' rules get expanded. [1] https://www.gnu.org/software/make/manual/html_node/Static-Usage.html Fixes: 9d1f1594 ("bpf: move cgroup_helpers from samples/bpf/ to tools/testing/selftesting/bpf/") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jesper Dangaard Brouer authored
It is sufficient to have a forward declaration of struct xdp_rxq_info in linux/filter.h, which avoids including net/xdp.h. This was originally suggested by John Fastabend during the review phase, but wasn't included in the final patchset revision. Thus, this followup. Suggested-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
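The general C technique is sketched below: a header that only stores or passes a pointer to a structure does not need the full definition. struct sketch_xdp_buff and sketch_has_rxq() are hypothetical stand-ins, not the declarations in linux/filter.h.

```c
#include <stddef.h>

struct xdp_rxq_info;	/* forward declaration: the full layout is not needed */

/* Hypothetical stand-in for a structure that only stores a pointer. */
struct sketch_xdp_buff {
	void *data;
	struct xdp_rxq_info *rxq;
};

/* Pointer comparisons work on an incomplete type; dereferencing would not. */
static inline int sketch_has_rxq(const struct sketch_xdp_buff *xdp)
{
	return xdp->rxq != NULL;
}
```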
-
Prashant Bhole authored
The samples/sockops program keeps the sock_ops program attached to the cgroup. Fix this by detaching the program before exit. Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Prashant Bhole authored
While building samples/sockmap, an undefined reference error is thrown for `nla_dump_errormsg'. Link tools/lib/bpf/nlattr.o as a fix. Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Tushar Dave authored
The default rlimit RLIMIT_MEMLOCK is 64KB, which causes BPF map creation to fail, e.g.:

    [root@labbpf]# ./xdp_redirect $(</sys/class/net/eth2/ifindex) \
    > $(</sys/class/net/eth3/ifindex)
    failed to create a map: 1 Operation not permitted

The failure is seen when executing xdp_redirect while xdp_monitor is already running. Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
- 13 Feb, 2018 26 commits
-
-
David Ahern authored
Add test cases verifying FIB onlink commands work as expected in various conditions - IPv4, IPv6, main table, and VRF. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
David Ahern says:
====================
selftests: fib_tests: simplifications, verbosity and a race

Improve efficiency of fib_tests.sh and make the test result more verbose, from this summary (fib_tests.sh is failing in a VM):

    $ fib_tests.sh
    Running netdev unregister tests
    PASS: unicast route test
    PASS: multipath route test
    Running netdev down tests
    PASS: unicast route test
    PASS: multipath route test
    Running netdev carrier change tests
    PASS: local route carrier test
    FAIL: unicast route carrier test

where a single entry actually corresponds to many checks, to a much more verbose output that clarifies test cases:

    $ fib_tests.sh
    Single path route carrier test
    ....
        Carrier down
        IPv4 fibmatch [ OK ]
        IPv6 fibmatch [ OK ]
        IPv4 linkdown flag set [FAIL]
        IPv6 linkdown flag set [FAIL]
        Second address added with carrier down
        IPv4 fibmatch [ OK ]
        IPv6 fibmatch [ OK ]
        IPv4 linkdown flag set [FAIL]
        IPv6 linkdown flag set [ OK ]

And then fix the race between setting carrier down on the dummy device and checking the corresponding routes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Ahern authored
sleep for a second after setting carrier down to allow linkwatch to propagate the change to the routing stack via netdev_state_change. As it stands there is a race setting carrier down on the dummy device and then checking the linkdown flag in the routes. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Ahern authored
Move setup and teardown of testns and dummy0 to helpers. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Ahern authored
fib_tests.sh is failing in a VM:

    $ fib_tests.sh
    Running netdev unregister tests
    PASS: unicast route test
    PASS: multipath route test
    Running netdev down tests
    PASS: unicast route test
    PASS: multipath route test
    Running netdev carrier change tests
    PASS: local route carrier test
    FAIL: unicast route carrier test

The last test corresponds to fib_carrier_unicast_test, which has 12 places that could be failing. Be more verbose in the output so a failure is easier to track down, and separate test setup failures with set -e and set +e pairs. With the verbose logging it is easier to see which checks are failing:

    $ fib_tests.sh
    Single path route carrier test
    ....
        Carrier down
        IPv4 fibmatch [ OK ]
        IPv6 fibmatch [ OK ]
        IPv4 linkdown flag set [FAIL]
        IPv6 linkdown flag set [FAIL]
        Second address added with carrier down
        IPv4 fibmatch [ OK ]
        IPv6 fibmatch [ OK ]
        IPv4 linkdown flag set [FAIL]
        IPv6 linkdown flag set [ OK ]

Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Ahern authored
'ip netns exec testns ip' is more efficiently handled using 'ip -netns', which runs the ip command after switching the namespace and avoids an exec. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Ahern authored
Do not export fib_multipath_hash or fib_select_path; both are only used by core ipv4 code. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Ahern authored
If flow oif is set and it is not an l3mdev, then fib_select_path can jump to the source address check. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Xin Long says: ==================== sctp: rename sctp diag file and add file comments for it This patchset is to remove the sctp_ prefix for sctp diag file, and also to add the missing file comments for it. v1->v2: split them into two patches as Marcelo suggested. ==================== Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Xin Long authored
This patch is to add the missing file comments for sctp diag file. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Xin Long authored
Remove 'sctp_' prefix for diag file, to keep consistent with other files' names. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arkadi Sharshevsky authored
Use NL_SET_ERR_MSG_MOD helper which adds the module name instead of specifying the prefix each time. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
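The idea of prefixing the module name at compile time can be sketched in user space with string-literal concatenation. The SKETCH_* macros and struct sketch_extack below are hypothetical stand-ins for the kernel's extack helpers, and the fallback KBUILD_MODNAME value is an assumption, since kbuild normally supplies it.

```c
#include <stddef.h>

#ifndef KBUILD_MODNAME
#define KBUILD_MODNAME "sketch_mod"	/* normally supplied by kbuild */
#endif

/* Hypothetical stand-in for the extended-ack structure. */
struct sketch_extack {
	const char *msg;
};

/* Store a message if the caller passed an extack. */
#define SKETCH_SET_ERR_MSG(extack, text)		\
	do {						\
		if (extack)				\
			(extack)->msg = (text);		\
	} while (0)

/* Same, but prefix the module name via literal concatenation. */
#define SKETCH_SET_ERR_MSG_MOD(extack, text)		\
	SKETCH_SET_ERR_MSG((extack), KBUILD_MODNAME ": " text)
```

Because the prefix is joined at compile time, callers write only the message text and every message from one module carries the same prefix.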
-
David S. Miller authored
Jiri Pirko says: ==================== mlxsw: SPAN cleanups In patch one of this short series, a misplaced pointer star is moved to the correct place. In the second patch, we observe that if SPAN entries carry their reference count anyway, it's redundant to also carry a "used" flag. In the third patch, SPAN support code is moved to a separate module. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
For the upcoming work on SPAN, it makes sense to move the current code to a module of its own. It already has a well-defined API boundary to the mirror management (which is used from matchall and ACL code). A couple more functions need to be exported for the functions that spectrum.c needs to use for MTU handling and subsystem init/fini. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
The member ref_count already determines whether a given SPAN entry is used, and is as easy to use as a dedicated boolean. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
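A minimal sketch of the idea, assuming a simplified entry with only a reference count. The struct and function names are hypothetical and do not mirror the mlxsw definitions.

```c
#include <stdbool.h>

/* Hypothetical simplified entry: no separate "used" flag is kept. */
struct sketch_span_entry {
	int ref_count;	/* number of users; 0 means the slot is free */
};

/* An entry is in use exactly when someone holds a reference to it. */
static bool sketch_span_entry_used(const struct sketch_span_entry *entry)
{
	return entry->ref_count != 0;
}

static void sketch_span_entry_get(struct sketch_span_entry *entry)
{
	entry->ref_count++;
}

static void sketch_span_entry_put(struct sketch_span_entry *entry)
{
	if (entry->ref_count > 0)
		entry->ref_count--;
}
```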
-
Petr Machata authored
Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Jiri Pirko says: ==================== mlxsw: IPIP cleanups In the first patch, a forgotten #include is added. Even though the code compiles as-is, the include is necessary for modules that should include spectrum_ipip.h. The second patch corrects an assumption that IPv6 tunnels use struct ip_tunnel_parm to store tunnel parameters. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
struct ip_tunnel_parm, where GRE and several other tunnel types hold information, is IPv4-specific. The current router / ipip code in mlxsw however uses it as if it were generic. Make it clear that it's not. Rename many functions from _params_ to _params4_. mlxsw_sp_ipip_parms_saddr() and _daddr() take a proto argument to dispatch on it. Move the dispatch logic to mlxsw_sp_ipip_netdev_saddr() and _daddr(), and replace with single-protocol functions. In struct mlxsw_sp_ipip_entry, move the "parms" field to a (for the time being, singleton) union. Update users throughout. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Petr Machata authored
struct ip_tunnel_parm, which is used in spectrum_ipip.h, is defined in if_tunnel.h. However, the former neglects to include the latter. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jake Moroni authored
The comment stated that a thread was started, but that is not the case. Signed-off-by: Jake Moroni <mail@jakemoroni.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Kirill Tkhai says:
====================
Replacing net_mutex with rw_semaphore

This is the third version of the patchset introducing net_sem instead of net_mutex. The patchset adds net_sem in addition to net_mutex and allows pernet_operations to be "async". This flag means that the pernet_operations methods are safe to be executed in parallel with any other pernet_operations (un)initializing another net. If there are only async pernet_operations in the system, net_mutex is not used either for setup_net() or for cleanup_net().

The pernet_operations converted in this patchset allow creating a minimal .config with working networking, and the changes improve the performance as you may see below:

    % for i in {1..10000}; do unshare -n bash -c exit; done

    *before*
    real 1m40,377s
    user 0m9,672s
    sys 0m19,928s

    *after*
    real 0m17,007s
    user 0m5,311s
    sys 0m11,779

    (5.8 times faster)

In the future, when all pernet_operations become async, we'll just remove this "async" field tree-wide.

All the new logic is concentrated in patches [1-5/32]. The rest of the patches convert specific operations: review, rationale for why they can be converted, and setting of the async flag.

Kirill

v3: Improved patches descriptions. Added comment into [5/32]. Added [32/32] converting netlink_tap_net_ops (new pernet operations introduced in 2018).
v2: Single patch -> patchset with rationale of every conversion
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
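The rule that net_mutex is only needed while some non-async pernet_operations remain can be sketched as a simple predicate. The types and names below are hypothetical and leave out all real locking; they only model the decision described in the cover letter.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced view of a registered pernet_operations entry. */
struct sketch_pernet_ops {
	const char *name;
	bool async;	/* safe to run in parallel with ops of another net */
};

/*
 * Return true if at least one registered entry is not async, i.e. the
 * exclusive mutex would still have to be taken for net setup and cleanup.
 */
static bool sketch_needs_net_mutex(const struct sketch_pernet_ops *ops, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!ops[i].async)
			return true;
	}
	return false;
}
```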
-
Kirill Tkhai authored
These pernet_operations initialize just-allocated net memory, and they obviously can be executed in parallel with any others. v3: New Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Acked-by: Andrei Vagin <avagin@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kirill Tkhai authored
These pernet operations just create and destroy a netlink socket. The socket is pernet and other operations don't touch it. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Acked-by: Andrei Vagin <avagin@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kirill Tkhai authored
These pernet operations consist of exit() and exit_batch() methods. default_device_exit() moves non-local and virtual devices to init_net. There is nothing exciting here, because this may happen at any time on a working system, and rtnl_lock() and synchronize_net() protect us from all cases of external dereference. The same goes for default_device_exit_batch(). Similar unregistration may happen at any time on a system. There are several lists (like todo_list), which are accessed under rtnl_lock(). After rtnl_unlock() and netdev_run_todo() all the devices are flushed. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Acked-by: Andrei Vagin <avagin@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kirill Tkhai authored
These pernet_operations have only an init() method. It allocates memory for net_device, calls register_netdev() and assigns net::loopback_dev. register_netdev() is allowed to be used without additional locks, as it's synchronized on rtnl_lock(). There are many examples of using this function directly from ioctl(). The only difference, compared to ioctl(), is that net is not completely alive at this moment. But it looks like there is no way for parallel pernet_operations to dereference the net_device, as most of the struct net_device lists where it's linked are related to net, and the net is not linked. The exceptions are net_device::unreg_list, close_list, todo_list, used for unregistration, and ::link_watch_list, where net_device may be linked to global lists. Unregistration of loopback_dev obviously can't happen when loopback_net_init() is executing, as the net is alive. It occurs in default_device_ops, which currently requires net_mutex, and it behaves as a barrier at the moment. It will be considered in the next patch. Speaking about link_watch_list, it seems there is no way for loopback_dev at the time of registration to be linked in lweventlist and be available for another pernet_operations. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Acked-by: Andrei Vagin <avagin@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kirill Tkhai authored
These pernet_operations (un)register sysctl, which are not touched by anybody else. So, it's safe to make them async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Acked-by: Andrei Vagin <avagin@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-