- 24 Oct, 2017 24 commits
-
-
Quentin Monnet authored
`bpftool batch file FILE` takes FILE as an argument and executes all the bpftool commands it finds inside (or stops if an error occurs). To obtain a consistent JSON output, create a root JSON array, then for each command create a new object containing two fields: one with the command arguments, the other with the output (which is the JSON object that the command would have produced, if called on its own). Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Quentin Monnet authored
Reuse the json_writer API introduced in an earlier commit to make bpftool able to generate JSON output on `bpftool map { show | dump | lookup | getnext }` commands. Remaining commands produce no output. Some functions have been spit into plain-output and JSON versions in order to remain readable. Outputs for sample maps have been successfully tested against a JSON validator. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Quentin Monnet authored
Add a new printing function to dump translated eBPF instructions as JSON. As for plain output, opcodes are printed only on request (when `opcodes` is provided on the command line). The disassembled output is generated by the same code that is used by the kernel verifier. Example output: $ bpftool --json --pretty prog dump xlated id 1 [{ "disasm": "(bf) r6 = r1" },{ "disasm": "(61) r7 = *(u32 *)(r6 +16)" },{ "disasm": "(95) exit" } ] $ bpftool --json --pretty prog dump xlated id 1 opcodes [{ "disasm": "(bf) r6 = r1", "opcodes": { "code": "0xbf", "src_reg": "0x1", "dst_reg": "0x6", "off": ["0x00","0x00" ], "imm": ["0x00","0x00","0x00","0x00" ] } },{ "disasm": "(61) r7 = *(u32 *)(r6 +16)", "opcodes": { "code": "0x61", "src_reg": "0x6", "dst_reg": "0x7", "off": ["0x10","0x00" ], "imm": ["0x00","0x00","0x00","0x00" ] } },{ "disasm": "(95) exit", "opcodes": { "code": "0x95", "src_reg": "0x0", "dst_reg": "0x0", "off": ["0x00","0x00" ], "imm": ["0x00","0x00","0x00","0x00" ] } } ] Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Quentin Monnet authored
Reuse the json_writer API introduced in an earlier commit to make bpftool able to generate JSON output on `bpftool prog show *` commands. A new printing function is created to be passed as an argument to the disassembler. Similarly to plain output, opcodes are printed on request. Outputs from sample programs have been successfully tested against a JSON validator. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Quentin Monnet authored
Reuse the json_writer API introduced in an earlier commit to make bpftool able to generate JSON output on `bpftool prog show *` commands. For readability, the code from show_prog() has been split into two functions, one for plain output, one for JSON. Outputs from sample programs have been successfully tested against a JSON validator. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Quentin Monnet authored
These two options can be used to ask for a JSON output (--j or -json), and to make this JSON human-readable (-p or --pretty). A json_writer object is created when JSON is required, and will be used in follow-up commits to produce JSON output. Note that --pretty implies --json. Update for the manual pages and interactive help messages comes in a later patch of the series. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Quentin Monnet authored
Add an option parsing facility to bpftool, in prevision of future options for demanding JSON output. Currently, two options are added: --help and --version, that act the same as the respective commands `help` and `version`. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Quentin Monnet authored
In prevision of following commits, supposed to add JSON output to the tool, two files are copied from the iproute2 repository (taken at commit 268a9eee985f): lib/json_writer.c and include/json_writer.h. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Song Liu says: ==================== net: add a set of tracepoints to tcp stack Changes from v1: Fix build error (with ipv6 as ko) by adding EXPORT_TRACEPOINT_SYMBOL_GPL for trace_tcp_send_reset. These patches add the following tracepoints to tcp stack. tcp_send_reset tcp_receive_reset tcp_destroy_sock tcp_set_state These tracepoints can be used to track TCP state changes. Such state changes include but are not limited to: connection establish, connection termination, tx and rx of RST, various retransmits. Currently, we use the following kprobes to trace these events: int kprobe__tcp_validate_incoming int kprobe__tcp_send_active_reset int kprobe__tcp_v4_send_reset int kprobe__tcp_v6_send_reset int kprobe__tcp_v4_destroy_sock int kprobe__tcp_set_state int kprobe__tcp_retransmit_skb These tracepoints will help us simplify this work. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Song Liu authored
This patch adds tracepoint trace_tcp_set_state. Besides usual fields (s/d ports, IP addresses), old and new state of the socket is also printed with TP_printk, with __print_symbolic(). Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Song Liu authored
This patch adds trace event trace_tcp_destroy_sock. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Song Liu authored
New tracepoint trace_tcp_receive_reset is added and called from tcp_reset(). This tracepoint is define with a new class tcp_event_sk. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Song Liu authored
New tracepoint trace_tcp_send_reset is added and called from tcp_v4_send_reset(), tcp_v6_send_reset() and tcp_send_active_reset(). Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Song Liu authored
Some functions that we plan to add trace points require const sk and/or skb. So we mark these fields as const in the tracepoint. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Song Liu authored
Introduce event class tcp_event_sk_skb for tcp tracepoints that have arguments sk and skb. Existing tracepoint trace_tcp_retransmit_skb() falls into this class. This patch rewrites the definition of trace_tcp_retransmit_skb() with tcp_event_sk_skb. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Lipeng says: ==================== net: hns3: bug fixes & code improvements This patchset introduces various HNS3 bug fixes, optimizations and code improvements. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lipeng authored
The return value of hns3_clean_tx_ring means tx ring clean result. Return true means clean complete and there is no more pakcet need clean. Retrun false means there is packets need clean and napi need poll again. The last return of hns3_clean_tx_ring is "return !!budget" as budget will decrease when clean a buffer. If there is no valid BD in TX ring, return 0 for hns3_clean_tx_ring will cause napi poll again and never complete the napi poll. This patch fixes the bug. Fixes: 76ad4f0e (net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC) Signed-off-by: Lipeng <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lipeng authored
HW will use packet length to write packets to buffer or read packets from buffer. There is a redundant memset when alloc buffer, the memset have no sense and will increase time-consuming. This patch removes it. Fixes: 76ad4f0e (net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC) Signed-off-by: Lipeng <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lipeng authored
The interface hns3_ring_get_cfg only update TX ring queue_index, but do not update RX ring queue_index. This patch fixes it. Fixes: 76ad4f0e (net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC) Signed-off-by: Lipeng <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lipeng authored
This patch gets vf count by standard function pci_sriov_get_totalvfs, instead of info from NIC HW. Signed-off-by: Lipeng <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lipeng authored
1# patch: 07d29954 net: hns3: add support for ETHTOOL_GRXFH. 2# patch: 5668abda net: hns3: add support for set_ringparam. 1# patch adds ae_algo->ops->get_rss_tuple to hns3_get_rxnfc and 2# patch delete ae_algo->ops->get_tc_size from hns3_get_rxnfc.This patch fix the ops check in hns3_get_rxnfc. Signed-off-by: Lipeng <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lipeng authored
If one buffer had been recieved to stack, driver will alloc a new buffer, map the buffer to device and replace the old buffer. When map fail, should only free the new alloced buffer, but not free all buffers in the ring. Fixes: 76ad4f0e (net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC) Signed-off-by: Lipeng <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lipeng authored
When alloce new buffer to HW, should unmap the old buffer first. This old code map the old buffer but not unmap the old buffer, this patch fixes it. Fixes: 76ad4f0e (net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC) Signed-off-by: Lipeng <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.open-mesh.org/linux-mergeDavid S. Miller authored
Simon Wunderlich says: ==================== This documentation/cleanup patchset includes the following patches: - Fix parameter kerneldoc which caused kerneldoc warnings, by Sven Eckelmann - Remove spurious warnings in B.A.T.M.A.N. V neighbor comparison, by Sven Eckelmann - Use inline kernel-doc style for UAPI constants, by Sven Eckelmann ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 23 Oct, 2017 16 commits
-
-
Sven Eckelmann authored
The enums of constants for netlink tends to become rather large over time. Documenting them is easier when the kernel-doc is actually next to constant and not in a different block above the enum. Also inline kernel-doc allows multi-paragraph description. This could be required to better document the netlink command types and the expected return values. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
-
Gustavo A. R. Silva authored
Use BUG_ON instead of if condition followed by BUG in do_setlink. This issue was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
Because SYSTEMPORT is a (semi) normal network device, the stack may attempt to queue packets on it oustide of the DSA slave transmit path. When that happens, the DSA layer has not had a chance to tag packets with the appropriate per-port and per-queue information, and if that happens and we don't have a port 0 queue 0 available (e.g: on boards where this does not exist), we will hit a NULL pointer de-reference in bcm_sysport_select_queue(). Guard against such cases by testing for the TX ring validity. Fixes: 84ff33eeb23d ("net: systemport: Establish DSA network device queue mapping") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Jiri Pirko says: ==================== mlxsw: Add support for non-equal-cost multi-path Ido says: In the device, nexthops are stored as adjacency entries in an array called the KVD linear (KVDL). When a multi-path route is hit the packet's headers are hashed and then converted to an index into KVDL based on the adjacency group's size and base index. Up until now the driver ignored the `weight` parameter for multi-path routes and allocated only one adjacency entry for each nexthop with a limit of 32 nexthops in a group. This set makes the driver take the `weight` parameter into account when allocating adjacency entries. First patch teaches dpipe to show the size of the adjacency group, so that users will be able to determine the actual weight of each nexthop. The second patch refactors the KVDL allocator, making it more receptive towards the addition of another partition later in the set. Patches 3-5 introduce small changes towards the actual change in the sixth patch that populates the adjacency entries according to their relative weight. Last two patches finally add another partition to the KVDL, which allows us to allocate more than 32 entries per-group and thus support more nexthops and also provide higher accuracy with regards to the requested weights. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
The KVD linear is currently partitioned into two partitions. One for single entries and another for groups of 32 entries. Add another partition consisting of groups of 512 entries which will allow us to more accurately represent the nexthop weights in non-equal cost multi-path routing. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
The memory region where adjacency entries (nexthops) are stored is called the KVD linear and is configured during initialization with a size of 64K. Extend this area with 32K more entries, that will be partitioned into 64 groups of 0.5K entries, thereby allowing us to support weighted nexthops with high accuracy. Change the ratio between both types of hash entries, so as to prevent reduction in the number of double hash entries, which are used for IPv6 neighbours and routes with a prefix length greater than 64. Note that the user will be able to control all these sizes once the devlink resource manager is introduced. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
Up until now the driver assumed all the nexthops have an equal weight and wrote each to a single adjacency entry. This patch takes the `weight` parameter into account and populates the adjacency group according to the relative weight of each nexthop. Specifically, the weights of all the nexthops that should be offloaded are first normalized and then used to calculate the upper adjacency index of each nexthop. This is done according to the hash-threshold algorithm used by the kernel for IPv4 multi-path routing. Adjacency groups are currently limited to 32 entries which limits the weights that can be used, but follow-up patches will introduce groups of 512 entries. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
The device has certain restrictions regarding the size of an adjacency group. Have the router determine the size of the adjacency group according to available KVDL allocation sizes and these restrictions. This was not needed until now since only allocations of up 32 entries were supported and these are all valid sizes for an adjacency group. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
As the first step towards non-equal-cost multi-path support, store each nexthop's weight. For IPv6 nexthops always set the weight to 1, as it only supports ECMP. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
The current KVDL allocation API allows the user to specify the requested number of entries, but the user has no way of knowing how many entries were actually allocated. This works because existing users (e.g., router) request the exact number they end up using. With the introduction of large adjacency groups, this will change, as the router will have the ability to choose from several allocation sizes, where larger allocations provide higher accuracy with respect to requested weights and better resilience against nexthop failures. One option is to have the router try several allocations of descending size until one succeeds, but a better way is to simply allow it to query the actual allocation size and then size its request accordingly. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
The KVD linear (KVDL) allocator currently consists of a very large bitmap that reflects the KVDL's usage. The boundaries of each partition as well as their allocation size are represented using defines. This representation requires us to patch all the functions that act on a partition whenever the partitioning scheme is changed. In addition, it does not enable the dynamic configuration of the KVDL using the up-coming resource manager. Add objects to represent these partitions as well as the accompanying code that acts on them to perform allocations and de-allocations. In the following patches, this will allow us to easily add another partition as well as new operations to act on these partitions. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
The adjacency group size is part of the match on the adjacency group and should therefore be exposed using dpipe. When non-equal-cost multi-path support will be introduced, the group's size will help users understand the exact number of adjacency entries each nexthop occupies, as a nexthop will no longer correspond to a single entry. The output for a multi-path route with two nexthops, one with weight 255 and the second 1 will be: Example: $ devlink dpipe table dump pci/0000:01:00.0 name mlxsw_adj pci/0000:01:00.0: index 0 match_value: type field_exact header mlxsw_meta field adj_index value 65536 type field_exact header mlxsw_meta field adj_size value 512 type field_exact header mlxsw_meta field adj_hash_index value 0 action_value: type field_modify header ethernet field destination mac value e4:1d:2d:a5:f3:64 type field_modify header mlxsw_meta field erif_port mapping ifindex mapping_value 3 value 1 index 1 match_value: type field_exact header mlxsw_meta field adj_index value 65536 type field_exact header mlxsw_meta field adj_size value 512 type field_exact header mlxsw_meta field adj_hash_index value 510 action_value: type field_modify header ethernet field destination mac value e4:1d:2d:a5:f3:65 type field_modify header mlxsw_meta field erif_port mapping ifindex mapping_value 4 value 2 Thus, the first nexthop occupies 510 adjacency entries and the second 2, which leads to a ratio of 255 to 1. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Florian Fainelli says: ==================== net: dsa: bcm_sf2: Add support for IPv6 CFP rules This patch series adds support for matching IPv6 addresses to the existing CFP support code. Because IPv6 addresses are four times bigger than IPv4, we can fit them anymore in a single slice, so we need to chain two in order to have a complete match. This makes us require a second bitmap tracking unique rules so we don't over populate the TCAM. Finally, because the code had to be re-organized, it became a lot easier to support arbitrary prefix/mask lengths, so the last two patches do just that. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
There is no reason why we should limit ourselves to matching only full IPv4 addresses (/32), the same logic applies between the DATA and MASK ports, so just make it more configurable to accept both. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
There is no reason why we should limit ourselves to matching only full IPv4 addresses (/32), the same logic applies between the DATA and MASK ports, so just make it more configurable to accept both. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
Inserting IPv6 CFP rules complicates the code a little bit in that we need to insert two rules side by side and chain them to match a full IPv6 tuple (src, dst IPv6 + port + protocol). Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-