- 08 Jun, 2017 26 commits
-
-
Arkadi Sharshevsky authored
When a new static FDB is added to the bridge a notification is sent to the driver for offload. In case of successful offload the driver should notify the bridge back, which in turn should mark the FDB as offloaded. Currently, externally learned is equivalent for being offloaded which is not correct due to the fact that FDBs which are added from user-space are also marked as externally learned. In order to specify if an FDB was successfully offloaded a new flag is introduced. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arkadi Sharshevsky authored
Currently the bridge doesn't notify the underlying devices about new FDBs learned. The FDB sync is placed on the switchdev notifier chain because devices may potentially learn FDB that are not directly related to their ports, for example: 1. Mixed SW/HW bridge - FDBs that point to the ASICs external devices should be offloaded as CPU traps in order to perform forwarding in slow path. 2. EVPN - Externally learned FDBs for the vtep device. Notification is sent only about static FDB add/del. This is done due to fact that currently this is the only scenario supported by switch drivers. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arkadi Sharshevsky authored
In order to use the switchdev notifier chain for FDB sync with the device it has to be changed to atomic. The is done because the bridge can learn new FDBs in atomic context. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arkadi Sharshevsky authored
This is done as a preparation to moving the switchdev notifier chain to be atomic. The FDB external learning should be called under rtnl or rcu. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arkadi Sharshevsky authored
Currently the flood, learning and learning_sync port attributes are offloaded by setting the SELF flag. Add support for offloading the flood and learning attribute through the bridge code. In case of setting an unsupported flag on a offloded port the operation will fail. The learning_sync attribute doesn't have any software representation and cannot be offloaded through the bridge code. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arkadi Sharshevsky authored
This is done as a preparation stage before setting the bridge port flags from the bridge code. Currently the device can be queried for the bridge flags state, but the querier cannot distinguish if the flag is disabled or if it is not supported at all. Thus, add new attr and a bit-mask which include information regarding the support on a per-flag basis. Drivers that support bridge offload but not support bridge flags should return zeroed bitmask. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Vivien Didelot says: ==================== net: dsa: add cross-chip VLAN support The current code in DSA does not support cross-chip VLAN. This means that in a multi-chip environment such as this one (similar to ZII Rev B) [CPU].................... (mdio) (eth0) | : : : _|_____ _______ _______ [__sw0__]--[__sw1__]--[__sw2__] | | | | | | | | | v v v v v v v v v p1 p2 p3 p4 p5 p6 p7 p8 p9 adding a VLAN to p9 won't be enough to reach the CPU, until at least one port of sw0 and sw1 join the VLAN as well and become aware of the VID. This patchset makes the DSA core program the VLAN on the CPU and DSA links itself, which brings seamlessly cross-chip VLAN support to DSA. With this series applied*, the hardware VLAN tables of a 3-switch setup look like this after adding a VLAN to only one port of the end switch: # cat /sys/class/net/br0/bridge/default_pvid 42 # cat /sys/kernel/debug/mv88e6xxx/sw{0,1,2}/vtu # ip link set up master br0 dev lan6 # cat /sys/kernel/debug/mv88e6xxx/sw{0,1,2}/vtu VID FID SID 0 1 2 3 4 5 6 42 1 0 x x x x x = = VID FID SID 0 1 2 3 4 5 6 42 1 0 x x x x x = = VID FID SID 0 1 2 3 4 5 6 7 8 9 42 1 0 u x x x x x x x x = ('x' is excluded, 'u' is untagged, '=' is unmodified DSA and CPU ports.) Completely removing a VLAN entry (which is currently the responsibility of drivers anyway) is not supported yet since it requires some caching. (*) the output is shown from this out-of-tree debugfs patch: https://github.com/vivien/linux/commit/7b61a684b9d6b6a499135a587c7f62a1fddceb8b.patch Changes in v2: - canonical incrementation (port++ instead of ++port) - check CPU and DSA ports before purging a VLAN - add Reviewed-by tags ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vivien Didelot authored
The mv88e6xxx driver currently tries to be smart and remove by itself a VLAN entry from the VTU when the driven switch sees no user ports as members of the VLAN. This is bad in a multi-chip switch fabric, since a chip in between others may have no bridge port members, but still needs to be aware of the VID in order to correctly pass frames in the data path. Now that the DSA core explicitly manages DSA and CPU ports, do not skip them when checking remaining VLAN members. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vivien Didelot authored
Now that the DSA core adds the CPU and DSA ports itself to the new VLAN entry, there is no need to include them as members of this VLAN when initializing a new VTU entry. As of now, initialize a new VTU entry with all ports excluded. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vivien Didelot authored
In a multi-chip switch fabric, it is currently the responsibility of the driver to add the CPU or DSA (interconnecting chips together) ports as members of a new VLAN entry. This makes the drivers more complicated. We want the DSA drivers to be stupid and the DSA core being the one responsible for caring about the abstracted switch logic and topology. Make the DSA core program the CPU and DSA ports as part of the VLAN. This makes all chips of the data path to be aware of VIDs spanning the the whole fabric and thus, seamlessly add support for cross-chip VLAN. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vivien Didelot authored
Now that the VLAN object is propagated to every switch chip of the switch fabric, we can easily ensure that they all support the required VLAN operations before modifying an entry on a single switch. To achieve that, remove the condition skipping other target switches, and add a bitmap of VLAN members, eventually containing the target port, if we are programming the switch target. This will allow us to easily add other VLAN members, such as the DSA or CPU ports (to introduce cross-chip VLAN support) or the other port members if we want to reduce hardware accesses later. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vivien Didelot authored
Define the target port membership of the VLAN entry in mv88e6xxx_port_vlan_add where ds is scoped. Allow the DSA core to call later the port_vlan_add operation for CPU or DSA ports, by using the Unmodified membership for these ports, as in the current behavior. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Merge tag 'rxrpc-rewrite-20170607-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: Tx length parameter Here's a set of patches that allows someone initiating a client call with AF_RXRPC to indicate upfront the total amount of data that will be transmitted. This will allow AF_RXRPC to encrypt directly from source buffer to packet rather than having to copy into the buffer and only encrypt when it's full (the encrypted portion of the packet starts with a length and so we can't encrypt until we know what the length will be). The three patches are: (1) Provide a means of finding out what control message types are actually supported. EINVAL is reported if an unsupported cmsg type is seen, so we don't want to set the new cmsg unless we know it will be accepted. (2) Consolidate some stuff into a struct to reduce the parameter count on the function that parses the cmsg buffer. (3) Introduce the RXRPC_TX_LENGTH cmsg. This can be provided on the first sendmsg() that contributes data to a client call request or a service call reply. If provided, the user must provide exactly that amount of data or an error will be incurred. Changes in version 2: (*) struct rxrpc_send_params::tx_total_len should be s64 not u64. Thanks to Julia Lawall for reporting this. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Bjorn Andersson says: ==================== Missing QRTR features The QMUX specification covers packet routing as well as service life cycle and discovery. The current implementation of qrtr supports the prior part, but in order to fully implement service management on-top a few more parts are needed. The first patch in the series serves the purpose of reducing duplication in patch two and three. The second and third patch adds two qrtr-level notifications required by the specification, in order to notify local and remote service controllers about dying clients. The last patch serves the purpose of notifying local clients about the presence of a local service register, allowing them to register services as well as querying for remote registered services. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Bjorn Andersson authored
As the higher level communication only deals with "services" the a service directory is required to keep track of local and remote services. In order for qrtr clients to be informed about when the service directory implementation is available some event needs to be passed to them. Rather than introducing support for broadcasting such a message in-band to all open local sockets we flag each socket with ENETRESET, as there are no other expected operations that would benefit from having support from locally broadcasting messages. Cc: Courtney Cavin <ccavin@gmail.com> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Bjorn Andersson authored
Per the QMUXv2 protocol specificiation a DEL_CLIENT message should be broadcasted when an endpoint is disconnected. The protocol specification does suggest that the router can keep track of which nodes the endpoint has been communicating with to not wake up sleeping remotes unecessarily, but implementation of this suggestion is left for the future. Cc: Courtney Cavin <ccavin@gmail.com> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Bjorn Andersson authored
Per the QMUX protocol specification a terminating node can send a BYE control message to signal that the link is going down, upon receiving this all information about remote services should be discarded and local clients should be notified. In the event that the link was brought down abruptly the router is supposed to act like a BYE message has arrived. As there is no harm in receiving an extra BYE from the remote this patch implements the latter by injecting a BYE when the link to the remote is unregistered. The name service will receive the BYE and can implement the notification to the local clients. Cc: Courtney Cavin <ccavin@gmail.com> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Bjorn Andersson authored
Extract the allocation and filling in the control message header fields to a separate function in order to reuse this in subsequent patches. Cc: Courtney Cavin <ccavin@gmail.com> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gustavo A. R. Silva authored
Remove unnecessary variable assignments. Addresses-Coverity-ID: 1226917 Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
DRAM supply shortage and poor memory pressure tracking in TCP stack makes any change in SO_SNDBUF/SO_RCVBUF (or equivalent autotuning limits) and tcp_mem[] quite hazardous. TCPMemoryPressures SNMP counter is an indication of tcp_mem sysctl limits being hit, but only tracking number of transitions. If TCP stack behavior under stress was perfect : 1) It would maintain memory usage close to the limit. 2) Memory pressure state would be entered for short times. We certainly prefer 100 events lasting 10ms compared to one event lasting 200 seconds. This patch adds a new SNMP counter tracking cumulative duration of memory pressure events, given in ms units. $ cat /proc/sys/net/ipv4/tcp_mem 3088 4117 6176 $ grep TCP /proc/net/sockstat TCP: inuse 180 orphan 0 tw 2 alloc 234 mem 4140 $ nstat -n ; sleep 10 ; nstat |grep Pressure TcpExtTCPMemoryPressures 1700 TcpExtTCPMemoryPressuresChrono 5209 v2: Used EXPORT_SYMBOL_GPL() instead of EXPORT_SYMBOL() as David instructed. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Eric Dumazet says: ==================== tcp: Namespaceify 3 sysctls Move tcp_sack, tcp_window_scaling and tcp_timestamps sysctls to network namespaces. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
We want to move some TCP sysctls to net namespaces in the future. tcp_window_scaling, tcp_sack and tcp_timestamps being fetched from tcp_parse_options(), we need to pass an extra parameter. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
We need to push the chain index down to the drivers, so they have the information to which chain the rule belongs. For now, no driver supports multichain offload, so only chain 0 is supported. This is needed to prevent chain squashes during offload for now. Later this will be used to implement multichain offload. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 07 Jun, 2017 14 commits
-
-
David S. Miller authored
Tariq Toukan says: ==================== mlx4 drivers: version update This patchset contains version updates for the MLX4 drivers: Core, EN, and IB. Just like we've done in mlx5, we modify the outdated driver version (reported in ethtool for example). This better reflects the current driver state, and removes the redundant date string. We are not going to change this frequently or even use it. I include the IB patch in this series as it has similar subject and content. It does not cause any kind of conflict with Doug's tree. The rdma mailing list is CCed. Please let me know if I need to submit this differently. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tariq Toukan authored
Remove date and bump version for mlx4_ib driver. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tariq Toukan authored
Remove date and bump version for mlx4_en driver. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tariq Toukan authored
Remove date and bump version for mlx4_core driver. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrew Lunn authored
The mv88e6161 and mv88e6123 are capable of using EDSA tags when passing frames from the host to the switch and back. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Mark Bloch authored
When stopping the vxlan interface we detach it from the socket. Use RCU_INIT_POINTER() and not rcu_assign_pointer() to do so. Suggested-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ganesh Goudar authored
the adapter consumes two tids for every ipv6 offload connection be it active or passive, calculate tid usage count accordingly. Also change the signatures of relevant functions to get the address family. Signed-off-by: Rizwan Ansari <rizwana@chelsio.com> Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Jakub Kicinski says: ==================== nfp: ctrl vNIC This series adds the ability to use one vNIC as a control channel for passing messages to and from the application firmware. The implementation restructures the existing netdev vNIC code to be able to deal with nfp_nets with netdev pointer set to NULL. Control vNICs are not visible to userspace (other than for dumping ring state), and since they don't have netdevs we use a tasklet for RX and simple skb list for TX queuing. Due to special status of the control vNIC we have to reshuffle the init code a bit to make sure control vNIC will be fully brought up (and therefore communication with app FW can happen) before any netdev or port is visible to user space. FW will designate which vNIC is supposed to be used as control one by setting _pf%u_net_ctrl_bar symbol. Some FWs depend on metadata being prepended to control message, some prefer to look at queue ID to decide that something is a control message. Our implementation can cater to both. First two users of this code will be eBPF maps and flower offloads. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
NFD ABI 0.5 is equivalent to NFD ABI 3.0 but requires that the driver checks the APP id symbol and makes sure it can support given app. Most advanced apps will likely require control vNIC (ability to exchange control messages between the driver and app FW). Detailed app version checking and capability exchange is left to app-specific code. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
When driver encounters an nfp_app which has a control message handler defined, allocate a control vNIC. This control channel will be used to exchange data with the application FW such as flow table programming, statistics and global datapath control. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Thus far the code assumed all vNICs will request similar number of IRQs. This will be no longer true with control vNICs (where 1 IRQ will suffice). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
We want to be able to create a special vNIC for control messages. This vNIC should be created before any netdev is registered to allow nfp_app logic to exchange messages with the FW app before any netdev is visible to user space. Unfortunately we can't enable IRQs until we know how many vNICs we will need to spawn. Divide the function which spawns netdevs for vNICs into three parts: - vNIC/memory allocation; - IRQ allocation; - netdev init and register. This will help us insert the initialization of the control channel after IRQ allocation but before netdev init and register. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Reading fw version from the BAR is trivial. Don't pass it around through layers of init functions, simply read it again where needed. This commit has the side effect of each vNIC having the exact NFD version from its own control memory, rather than all data vNICs assuming the version of the first one. This should not result in user-visible changes, though. Capabilities of data vNICs of trival apps are identical. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
RX and TX queue controllers are interleaved. Instead of creating two mappings which map the same area at slightly different offset, create only one mapping. Always map all queue controllers to simplify the code and allow reusing the mapping for non-data vNICs. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-