- 18 Mar, 2020 40 commits
-
-
Petr Machata authored
In the commit referenced below, hw_stats_type of an entry is set for every entry that corresponds to a pedit action. However, the assignment is only done after the entry pointer is bumped, and therefore could overwrite memory outside of the entries array. The reason for this positioning may have been that the current entry's hw_stats_type is already set above, before the action-type dispatch. However, if there are no more actions, the assignment is wrong. And if there are, the next round of the for_each_action loop will make the assignment before the action-type dispatch anyway. Therefore fix this issue by simply reordering the two lines. Fixes: 74522e7b ("net: sched: set the hw_stats_type in pedit loop") Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Ido Schimmel says: ==================== mlxsw: spectrum_cnt: Expose counter resources Jiri says: Capacity and utilization of existing flow and RIF counters are currently unavailable to be seen by the user. Use the existing devlink resources API to expose the information: $ sudo devlink resource show pci/0000:00:10.0 -v pci/0000:00:10.0: name kvd resource_path /kvd size 524288 unit entry dpipe_tables none name span_agents resource_path /span_agents size 8 occ 0 unit entry dpipe_tables none name counters resource_path /counters size 79872 occ 44 unit entry dpipe_tables none resources: name flow resource_path /counters/flow size 61440 occ 4 unit entry dpipe_tables none name rif resource_path /counters/rif size 18432 occ 40 unit entry dpipe_tables none ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Add tests for mlxsw hw_stats types. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Implement occupancy counting for counters and expose over devlink resource API. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Put all init operations related to subpools into mlxsw_sp_counter_sub_pools_init(). Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Move the validation of subpools configuration, to avoid possible over commitment to resource registration. Add WARN_ON to indicate bug in the code. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Implement devlink resources support for counter pools. Move the subpool sizes calculations into the new resources register function. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Add new field to subpool struct that would indicate which resource id should be used to query the entry size for the subpool from the device. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Currently, the global static array of subpools is used. Make it per-instance as multiple instances of the mlxsw driver can have different values. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
With the change that made the code to query counter bank size from device instead of using hard-coded value, the number of available counters changed for Spectrum-2. Adjust the limit in the selftests. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
The bank size is different between Spectrum versions. Also it is a resource that can be queried. So instead of hard coding the value in code, query it from the firmware. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Rahul Lakkireddy authored
Chelsio NICs have 3 filter regions, in following order of priority: 1. High Priority (HPFILTER) region (Highest Priority). 2. HASH region. 3. Normal FILTER region (Lowest Priority). Currently, there's a 1-to-1 mapping between the prio value passed by TC and the filter region index. However, it's possible to have multiple TC rules with the same prio value. In this case, if a region is exhausted, no attempt is made to try inserting the rule in the next available region. So, rework and remove the 1-to-1 mapping. Instead, dynamically select the region to insert the filter rule, as long as the new rule's prio value doesn't conflict with existing rules across all the 3 regions. Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
This reverts the following commits: 8537f786 ("netfilter: Introduce egress hook") 5418d388 ("netfilter: Generalize ingress hook") b030f194 ("netfilter: Rename ingress hook include file") >From the discussion in [0], the author's main motivation to add a hook in fast path is for an out of tree kernel module, which is a red flag to begin with. Other mentioned potential use cases like NAT{64,46} is on future extensions w/o concrete code in the tree yet. Revert as suggested [1] given the weak justification to add more hooks to critical fast-path. [0] https://lore.kernel.org/netdev/cover.1583927267.git.lukas@wunner.de/ [1] https://lore.kernel.org/netdev/20200318.011152.72770718915606186.davem@davemloft.net/Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: David Miller <davem@davemloft.net> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Alexei Starovoitov <ast@kernel.org> Nacked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Julian Wiedmann says: ==================== s390/qeth: updates 2020-03-18 please apply the following patch series for qeth to netdev's net-next tree. This consists of three parts: 1) support for __GFP_MEMALLOC, 2) several ethtool enhancements (.set_channels, SW Timestamping), 3) the usual cleanups. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julian Wiedmann authored
To check whether a netdevice has already been registered, look at NETREG_REGISTERED to replace some hacks I added a while ago. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julian Wiedmann authored
qeth_do_ioctl() is only reached through our own net_device_ops, so we can trust that dev->ml_priv still contains what we put there earlier. qeth_bridgeport_an_set() is an internal function that doesn't require such sanity checks. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julian Wiedmann authored
Data addresses in the AOB are absolute, and need to be translated before being fed into kmem_cache_free(). Currently this phys_to_virt() is a no-op. Also see commit 2db01da8 ("s390/qdio: fill SBALEs with absolute addresses"). Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julian Wiedmann authored
Versions are meaningless for an in-kernel driver. Instead use the UTS_RELEASE that is set by ethtool_get_drvinfo(). Cc: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julian Wiedmann authored
This adds support for SOF_TIMESTAMPING_TX_SOFTWARE. No support for non-IQD devices, since they orphan the skb in their xmit path. To play nice with TX bulking, set the timestamp when the buffer that contains the skb(s) is actually flushed out to HW. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julian Wiedmann authored
For ucast traffic, qeth_iqd_select_queue() falls back to netdev_pick_tx(). This will potentially use skb_tx_hash() to distribute the flow over all active TX queues - so txq 0 is a valid selection, and qeth_iqd_select_queue() needs to check for this and put it on some other queue. As a result, the distribution for ucast flows is unbalanced and hits QETH_IQD_MIN_UCAST_TXQ heavier than the other queues. Open-coding a custom variant of skb_tx_hash() isn't an option, since netdev_pick_tx() also gives us eg. access to XPS. But we can pull a little trick: add a single TC class that excludes the mcast txq, and thus encourage skb_tx_hash() to not pick the mcast txq. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julian Wiedmann authored
Similar to the support for z/VM NICs, but we need to take extra care about the dedicated mcast queue: 1. netdev_pick_tx() is unaware of this limitation and might select the mcast txq. Catch this. 2. require at least _two_ TX queues - one for ucast, one for mcast. 3. when reducing the number of TX queues, there's a potential race where netdev_cap_txqueue() over-rules the selected txq index and falls back to index 0. This would place ucast traffic on the mcast queue, and result in TX errors. So for IQD, reject a reduction while the interface is running. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julian Wiedmann authored
Add support for ETHTOOL_SCHANNELS to change the count of active TX queues. Since all TX queue structs are pre-allocated and -registered, we just need to trivially adjust dev->real_num_tx_queues. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julian Wiedmann authored
z/VM NICs don't offer HW QoS for TX rings. So just use netdev_pick_tx() to distribute the connections equally over all enabled TX queues. We start with just 1 enabled TX queue (this matches the typical configuration without prio-queueing). A follow-on patch will allow users to enable additional TX queues. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julian Wiedmann authored
When falling back to an allocation from the HW header cache, check if the skb is eligible for using memory reserves. This only makes a difference if the cache is empty and needs to be refilled. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julian Wiedmann authored
Use dev_alloc_page() for backing the RX buffers with pages. This way we pick up __GFP_MEMALLOC. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller authored
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for net-next: 1) Use nf_flow_offload_tuple() to fetch flow stats, from Paul Blakey. 2) Add new xt_IDLETIMER hard mode, from Manoj Basapathi. Follow up patch to clean up this new mode, from Dan Carpenter. 3) Add support for geneve tunnel options, from Xin Long. 4) Make sets built-in and remove modular infrastructure for sets, from Florian Westphal. 5) Remove unused TEMPLATE_NULLS_VAL, from Li RongQing. 6) Statify nft_pipapo_get, from Chen Wandun. 7) Use C99 flexible-array member, from Gustavo A. R. Silva. 8) More descriptive variable names for bitwise, from Jeremy Sowden. 9) Four patches to add tunnel device hardware offload to the flowtable infrastructure, from wenxu. 10) pipapo set supports for 8-bit grouping, from Stefano Brivio. 11) pipapo can switch between nibble and byte grouping, also from Stefano. 12) Add AVX2 vectorized version of pipapo, from Stefano Brivio. 13) Update pipapo to be use it for single ranges, from Stefano. 14) Add stateful expression support to elements via control plane, eg. counter per element. 15) Re-visit sysctls in unprivileged namespaces, from Florian Westphal. 15) Add new egress hook, from Lukas Wunner. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
After commit 58b09919 ("mptcp: create msk early"), the msk socket is already available at subflow_syn_recv_sock() time. Let's move there the state update, to mirror more closely the first subflow state. The above will also help multiple subflow supports. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Russell King says: ==================== net: add phylink support for PCS This series adds support for IEEE 802.3 register set compliant PCS for phylink. In order to do this, we: 1. convert BUG_ON() in existing accessors to WARN_ON_ONCE() and return an error. 2. add accessors for modifying a MDIO device register, and use them in phylib, rather than duplicating the code from phylib. 3. add support for decoding the advertisement from clause 22 compatible register sets for clause 37 advertisements and SGMII advertisements. 4. add support for clause 45 register sets for 10GBASE-R PCS. These have been tested on the LX2160A Clearfog-CX platform. v2: eliminate use of BUG_ON() in the accessors. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Russell King authored
Implement helpers for PCS accessed via the MII bus using 802.3 clause 45 cycles for 10GBASE-R. Only link up/down is supported, 10G full duplex is assumed. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Russell King authored
Implement helpers for PCS accessed via the MII bus using 802.3 clause 22 cycles, conforming to 802.3 clause 37 and Cisco SGMII specifications for the advertisement word. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Russell King authored
Add APIs for modifying a MDIO device register, similar to the existing phy_modify() group of functions, but at mdiobus level instead. Adapt __phy_modify_changed() to use the new mdiobus level helper. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Russell King authored
Avoid using BUG_ON() in the mdiobus accessors, prefering instead to use WARN_ON_ONCE() and returning an error. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Nikolay Aleksandrov says: ==================== net: bridge: vlan options: add support for tunnel mapping In order to bring the new vlan API on par with the old one and be able to completely migrate to the new one we need to support vlan tunnel mapping and statistics. This patch-set takes care of the former by making it a vlan option. There are two notable issues to deal with: - vlan range to tunnel range mapping * The tunnel ids are globally unique for the vlan code and a vlan can be mapped to one tunnel, so the old API took care of ranges by taking the starting tunnel id value and incrementally mapping vlan id(i) -> tunnel id(i). This set takes the same approach and uses one new attribute - BRIDGE_VLANDB_ENTRY_TUNNEL_ID. If used with a vlan range then it's the starting tunnel id to map. - tunnel mapping removal * Since there are no reserved/special tunnel ids defined, we can't encode mapping removal within the new attribute, in order to be able to remove a mapping we add a vlan flag which makes the new tunnel option remove the mapping The rest is pretty straight-forward, in fact we directly re-use the old code for manipulating tunnels by just mapping the command (set/del). In order to be able to keep detecting vlan ranges we check that the current vlan has a tunnel and it's extending the current vlan range end's tunnel id. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Aleksandrov authored
This patch adds support for manipulating vlan/tunnel mappings. The tunnel ids are globally unique and are one per-vlan. There were two trickier issues - first in order to support vlan ranges we have to compute the current tunnel id in the following way: - base tunnel id (attr) + current vlan id - starting vlan id This is in line how the old API does vlan/tunnel mapping with ranges. We already have the vlan range present, so it's redundant to add another attribute for the tunnel range end. It's simply base tunnel id + vlan range. And second to support removing mappings we need an out-of-band way to tell the option manipulating function because there are no special/reserved tunnel id values, so we use a vlan flag to denote the operation is tunnel mapping removal. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Aleksandrov authored
Add a new option - BRIDGE_VLANDB_ENTRY_TUNNEL_ID which is used to dump the tunnel id mapping. Since they're unique per vlan they can enter a vlan range if they're consecutive, thus we can calculate the tunnel id range map simply as: vlan range end id - vlan range start id. The starting point is the tunnel id in BRIDGE_VLANDB_ENTRY_TUNNEL_ID. This is similar to how the tunnel entries can be created in a range via the old API (a vlan range maps to a tunnel range). Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Aleksandrov authored
The vlan tunnel code changes vlan options, it shouldn't touch port or bridge options so we can constify the port argument. This would later help us to re-use these functions from the vlan options code. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Aleksandrov authored
It is more appropriate name as it shows the intent of why we need to check the options' state. It also allows us to give meaning to the two arguments of the function: the first is the current vlan (v_curr) being checked if it could enter the range ending in the second one (range_end). Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Jose Abreu says: ==================== net: stmmac: 100GB Enterprise MAC support Adds the support for Enterprise MAC IP version which allows operating speeds up to 100GB. Patch 1/4, adds the support in XPCS for XLGMII interface that is used in this kind of Enterprise MAC IPs. Patch 2/4, adds the XLGMII interface support in stmmac. Patch 3/4, adds the HW specific support for Enterprise MAC. We end in patch 4/4, by updating stmmac documentation to mention the support for this new IP version. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jose Abreu authored
Add the Enterprise MAC support to the list of supported IP versions and the newly added XLGMII interface support. Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jose Abreu authored
Adds the support for Enterprise MAC IP version which is very similar to XGMAC. It's so similar that we just need to check the device id and add new speeds definitions and some minor callbacks. Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-