- 22 Jan, 2017 2 commits
-
-
Geliang Tang authored
To make the code clearer, use rb_entry() instead of container_of() to deal with rbtree. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Geliang Tang authored
To make the code clearer, use rb_entry() instead of container_of() to deal with rbtree. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 20 Jan, 2017 35 commits
-
-
David S. Miller authored
Andrew Lunn says: ==================== net: dsa: Move temperature sensor code into PHY. Marvell Ethernet switches contain a temperature sensor. There appears to be one sensor, which is shared by each of the internal PHYs. Each PHY has independent registers to read this sensor, and to set a limit for when an alarm should be raised. Some Marvell discrete PHY also have the same sensor and registers. Moving the HWMON code from DSA into the PHY makes the sensor available in discrete PHYs, and removes the layering violation, the switch driver poking around in PHY registers. While moving the code into the PHY driver, it has been re-written to use the new HWMON APIs. v2: Better Cover note explaining one sensor, but multiple independent registers Simply error checking. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrew Lunn authored
Only the Marvell mv88e6xxx DSA driver made use of the HWMON support in DSA. The temperature sensor registers are actually in the embedded PHYs, and the PHY driver now supports it. So remove all HWMON support from DSA and drivers. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrew Lunn authored
Some Marvell PHYs have an inbuilt temperature sensor. Add hwmon support for this sensor. There are two different variants. The simpler, older chips have a 5 degree accuracy. The newer devices have 1 degree accuracy. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Josef Bacik authored
When comparing two sockets we need to use inet6_rcv_saddr so we get a NULL sk_v6_rcv_saddr if the socket isn't AF_INET6, otherwise our comparison function can be wrong. Fixes: 637bc8bb ("inet: reset tb->fastreuseport when adding a reuseport sk") Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linuxDavid S. Miller authored
Saeed Mahameed says: ==================== mlx5 and mlx5e updates 2017-01-19 This series includes some updates for mlx5 core and mlx5e netdevice driver. From Leon, a small fix that remove an unnecessary print. From Eli Cohen, a fix to the FW version printout in case of internal error. From Eugenia Emantayev, two patches, the 1st adds mlx5 1pps (pulse per second) mlx5 infrastructure support and the 2nd adds the necessary bits for mlx5e ptp logic and structures. From Mohamad, add support for s-tagged packet receive when in promiscuous mode. Form Gal Pressman, MCAM (Management capabilities mask register) and PCAM (Ports capabilities mask register) registers infrastructure, those registers are needed in order to query the different statistics registers support in FW, in order for the driver to enable/disable query and reporting them back to user. On top of this infrastructure we've exposed new set of statistics groups: - MPCNT: Physical layer statistical counters (For symbol errors) - PPCNT: PCIe performance counters In addition to the statistics capabilities series we've moved the mlx5 HCA capabilities fields to a dedicated struct under the driver private data. At the end a small patch to update & query statistics in the most desired order. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Ivan Khoronzhuk says: ==================== net: ethernet: ti: cpsw: correct common res usage This series is intended to remove unneeded redundancies connected with common resource usage function. Since v1: - changed name to cpsw_get_usage_count() - added comments to open/closw for cpsw_get_usage_count() - added patch: net: ethernet: ti: cpsw: clarify ethtool ops changing num of descs Based on net-next/master ==================== Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ivan Khoronzhuk authored
After adding cpsw_set_ringparam ethtool op, better to carry out common parts of similar ops splitting descriptors in runtime. It allows to reuse these parts and shows what the ops actually do. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ivan Khoronzhuk authored
No need to duplicate the same function in rx handler to get info if any interface is running. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ivan Khoronzhuk authored
No need to create additional vars to identify if interface is running. So simplify code by removing redundant var and checking usage counter instead. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ivan Khoronzhuk authored
No need to disable interrupts if no open devices, they are disabled anyway. Even no need to disable interrupts if some ndev is opened, In this case shared resources are not touched, only parameters of ndev shell, so no reason to disable them also. Removed lines have proved it. So, no need in redundant check and interrupt disable. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ivan Khoronzhuk authored
Common res usage is possible only in case an interface is running. In case of not dual emac here can be only one interface, so while ndo_open and switch mode, only one interface can be opened, thus if open is called no any interface is running ... and no common res are used. So remove check on dual emac, it will simplify code/understanding and will match the name it's called. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Mahesh Bandewar says: ==================== use netdev_is_rx_handler_busy() in few known cases netdev_rx_handler_register() was recently split into two parts - (a) check if the handler is used, (b) register the new handler, parts. This is helpful in scenarios like bonding where at the time of registration there is too much state to unwind and it should check if the device is free before building that state. IPvlan and macvlan drivers don't have this issue however it can make use of the same check instead of using a device specific check. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Mahesh Bandewar authored
netdev_is_rx_handler_busy() check is a superset of netif_is_ipvlan_port() check and hence should be preferred. Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Mahesh Bandewar authored
IPvlan checks if the master device is already used by checking a specific device (here it's macvlan device). This is technically not sufficient and it should just ensure the rx_handler is busy or not. This would be a super check that includes macvlan and any other that has already registered rx-handler. Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Mahesh Bandewar authored
netdev_rx_handler_register() checks to see if the handler is already busy which was recently separated into netdev_is_rx_handler_busy(). So use the same function inside register() to avoid code duplication. Essentially this change should be a no-op Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrew Collins authored
The fq_codel qdisc currently always regenerates the skb flow hash. This wastes some cycles and prevents flow seperation in cases where the traffic has been encrypted and can no longer be understood by the flow dissector. Change it to use the prexisting flow hash if one exists, and only regenerate if necessary. Signed-off-by: Andrew Collins <acollins@cradlepoint.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lance Richardson authored
Eliminate sparse warning by maintaining type of dst_port as __be16. Signed-off-by: Lance Richardson <lrichard@redhat.com> Acked-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lance Richardson authored
Cast second parameter of csum_sub() from __sum16 to __wsum. Signed-off-by: Lance Richardson <lrichard@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Jon Maloy says: ==================== tipc: emulate multicast through replication TIPC multicast messages are currently distributed via L2 broadcast or IP multicast to all nodes in the cluster, irrespective of the number of real destinations of the message. In this series we introduce an option to transport messages via replication ("replicast") across a selected number of unicast links, instead of relying on the underlying media. This option is used when true broadcast/multicast is not supported by the media, or when the number of true destinations is much smaller than the cluster size. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jon Paul Maloy authored
If the bearer carrying multicast messages supports broadcast, those messages will be sent to all cluster nodes, irrespective of whether these nodes host any actual destinations socket or not. This is clearly wasteful if the cluster is large and there are only a few real destinations for the message being sent. In this commit we extend the eligibility of the newly introduced "replicast" transmit option. We now make it possible for a user to select which method he wants to be used, either as a mandatory setting via setsockopt(), or as a relative setting where we let the broadcast layer decide which method to use based on the ratio between cluster size and the message's actual number of destination nodes. In the latter case, a sending socket must stick to a previously selected method until it enters an idle period of at least 5 seconds. This eliminates the risk of message reordering caused by method change, i.e., when changes to cluster size or number of destinations would otherwise mandate a new method to be used. Reviewed-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jon Paul Maloy authored
TIPC multicast messages are currently carried over a reliable 'broadcast link', making use of the underlying media's ability to transport packets as L2 broadcast or IP multicast to all nodes in the cluster. When the used bearer is lacking that ability, we can instead emulate the broadcast service by replicating and sending the packets over as many unicast links as needed to reach all identified destinations. We now introduce a new TIPC link-level 'replicast' service that does this. Reviewed-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jon Paul Maloy authored
As a further preparation for the upcoming 'replicast' functionality, we add some necessary structs and functions for looking up and returning a list of all nodes that host destinations for a given multicast message. Reviewed-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jon Paul Maloy authored
As a preparation for the 'replicast' functionality we are going to introduce in the next commits, we need the broadcast base structure to store whether bearer broadcast is available at all from the currently used bearer or bearers. We do this by adding a new function tipc_bearer_bcast_support() to the bearer layer, and letting the bearer selection function in bcast.c use this to give a new boolean field, 'bcast_support' the appropriate value. Reviewed-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gianluca Borello authored
Provide a simple helper with the same semantics of strncpy_from_unsafe(): int bpf_probe_read_str(void *dst, int size, const void *unsafe_addr) This gives more flexibility to a bpf program. A typical use case is intercepting a file name during sys_open(). The current approach is: SEC("kprobe/sys_open") void bpf_sys_open(struct pt_regs *ctx) { char buf[PATHLEN]; // PATHLEN is defined to 256 bpf_probe_read(buf, sizeof(buf), ctx->di); /* consume buf */ } This is suboptimal because the size of the string needs to be estimated at compile time, causing more memory to be copied than often necessary, and can become more problematic if further processing on buf is done, for example by pushing it to userspace via bpf_perf_event_output(), since the real length of the string is unknown and the entire buffer must be copied (and defining an unrolled strnlen() inside the bpf program is a very inefficient and unfeasible approach). With the new helper, the code can easily operate on the actual string length rather than the buffer size: SEC("kprobe/sys_open") void bpf_sys_open(struct pt_regs *ctx) { char buf[PATHLEN]; // PATHLEN is defined to 256 int res = bpf_probe_read_str(buf, sizeof(buf), ctx->di); /* consume buf, for example push it to userspace via * bpf_perf_event_output(), but this time we can use * res (the string length) as event size, after checking * its boundaries. */ } Another useful use case is when parsing individual process arguments or individual environment variables navigating current->mm->arg_start and current->mm->env_start: using this helper and the return value, one can quickly iterate at the right offset of the memory area. The code changes simply leverage the already existent strncpy_from_unsafe() kernel function, which is safe to be called from a bpf program as it is used in bpf_trace_printk(). Signed-off-by: Gianluca Borello <g.borello@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Phil Sutter says: ==================== Retrieve number of VFs in a bus-agnostic way Previously, it was assumed that only PCI NICs would be capable of having virtual functions - with my proposed enhancement of dummy NIC driver implementing (fake) ones for testing purposes, this is no longer true. Discussion of said patch has led to the suggestion of implementing a bus-agnostic method for VF count retrieval so rtnetlink could work with both real VF-capable PCI NICs as well as my dummy modifications without introducing ugly hacks. The following series tries to achieve just that by introducing a bus type callback to retrieve a device's number of VFs, implementing this callback for PCI bus and finally adjusting rtnetlink to make use of the generalized infrastructure. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Phil Sutter authored
Now that pci_bus_type has num_vf callback set, dev_num_vf can be implemented in a bus type independent way and the check for whether a PCI device is being handled in rtnetlink can be dropped. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Phil Sutter authored
Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Phil Sutter authored
This allows for bus types to implement their own method of retrieving the number of virtual functions a NIC on that type of bus supports. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Geliang Tang authored
Use hlist_entry_safe() instead of open-coding it. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Sitnicki authored
Commit b05229f4 ("gre6: Cleanup GREv6 transmit path, call common GRE functions") removed the ip6gre specific transmit function, but left the struct ipv6_tel_txoption definition. Clean it up. Signed-off-by: Jakub Sitnicki <jkbs@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Shaohua Li made percpu_counter irq safe in commit 098faf58 ("percpu_counter: make APIs irq safe") We can safely remove BH disable/enable sections around various percpu_counter manipulations. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arnd Bergmann authored
The two new variables are only used inside of an #ifdef and cause harmless warnings when that is disabled: drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c: In function 'init_one': drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:4646:9: error: unused variable 'port_vec' [-Werror=unused-variable] drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:4646:6: error: unused variable 'v' [-Werror=unused-variable] This adds another #ifdef around the declarations. Fixes: 96fe11f2 ("cxgb4: Implement ndo_get_phys_port_id for mgmt dev") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Ahern authored
IPv6 deletes route entries associated with multipath routes on an admin down where IPv4 does not. For example: $ ip ro ls vrf red unreachable default metric 8192 1.1.1.0/24 metric 64 nexthop via 10.100.1.254 dev eth1 weight 1 nexthop via 10.100.2.254 dev eth2 weight 1 10.100.1.0/24 dev eth1 proto kernel scope link src 10.100.1.4 10.100.2.0/24 dev eth2 proto kernel scope link src 10.100.2.4 $ ip -6 ro ls vrf red 2001:db8:1::/120 dev eth1 proto kernel metric 256 pref medium 2001:db8:2:: dev red proto none metric 0 pref medium 2001:db8:2::/120 dev eth2 proto kernel metric 256 pref medium 2001:db8:11::/120 via 2001:db8:1::16 dev eth1 metric 1024 pref medium 2001:db8:11::/120 via 2001:db8:2::17 dev eth2 metric 1024 pref medium ... Set link down: $ ip li set eth1 down IPv4 retains the multihop route but flags eth1 route as dead: $ ip ro ls vrf red unreachable default metric 8192 1.1.1.0/24 nexthop via 10.100.1.16 dev eth1 weight 1 dead linkdown nexthop via 10.100.2.16 dev eth2 weight 1 10.100.2.0/24 dev eth2 proto kernel scope link src 10.100.2.4 and IPv6 deletes the route as part of flushing all routes for the device: $ ip -6 ro ls vrf red 2001:db8:2:: dev red proto none metric 0 pref medium 2001:db8:2::/120 dev eth2 proto kernel metric 256 pref medium 2001:db8:11::/120 via 2001:db8:2::17 dev eth2 metric 1024 pref medium ... Worse, on admin up of the device the multipath route has to be deleted to get this leg of the route re-added. This patch keeps routes that are part of a multipath route if ignore_routes_with_linkdown is set with the dead and linkdown flags enabling consistency between IPv4 and IPv6: $ ip -6 ro ls vrf red 2001:db8:2:: dev red proto none metric 0 pref medium 2001:db8:2::/120 dev eth2 proto kernel metric 256 pref medium 2001:db8:11::/120 via 2001:db8:1::16 dev eth1 metric 1024 dead linkdown pref medium 2001:db8:11::/120 via 2001:db8:2::17 dev eth2 metric 1024 pref medium ... Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Commit 04aeb56a ("net/mlx4_en: allocate non 0-order pages for RX ring with __GFP_NOMEMALLOC") added code that appears to be not needed at that time, since mlx4 never used __GFP_MEMALLOC allocations anyway. As using memory reserves is a must in some situations (swap over NFS or iSCSI), this patch adds this flag. Note that this driver does not reuse pages (yet) so we do not have to add anything else. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Timur Tabi authored
This reverts commit 3e884493. With commit 529ed127 ("net: phy: phy drivers should not set SUPPORTED_[Asym_]Pause"), phylib now handles automatically enabling pause frame support in the PHY, and the MAC driver should follow suit. Since the EMAC driver driver does this, we no longer need to force pause frames support. Signed-off-by: Timur Tabi <timur@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 19 Jan, 2017 3 commits
-
-
Saeed Mahameed authored
Reorder update stats flow to update most important counters last, to get more accurate results. New update order: - PCIe counters - Port counters - Vport counters - Queue counters - Software counters Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Gal Pressman <galp@mellanox.com>
-
Gal Pressman authored
The caps structure consists of hca caps and port/management caps, all under one roof. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-
Gal Pressman authored
This patch exposes PCIe performance counters, queried with ethtool -S <devname>. Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
-