- 14 Nov, 2016 6 commits
-
-
WANG Cong authored
Similar to commit 14135f30 ("inet: fix sleeping inside inet_wait_for_connect()"), sk_wait_event() needs to fix too, because release_sock() is blocking, it changes the process state back to running after sleep, which breaks the previous prepare_to_wait(). Switch to the new wait API. Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller authored
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains a second batch of Netfilter updates for your net-next tree. This includes a rework of the core hook infrastructure that improves Netfilter performance by ~15% according to synthetic benchmarks. Then, a large batch with ipset updates, including a new hash:ipmac set type, via Jozsef Kadlecsik. This also includes a couple of assorted updates. Regarding the core hook infrastructure rework to improve performance, using this simple drop-all packets ruleset from ingress: nft add table netdev x nft add chain netdev x y { type filter hook ingress device eth0 priority 0\; } nft add rule netdev x y drop And generating traffic through Jesper Brouer's samples/pktgen/pktgen_bench_xmit_mode_netif_receive.sh script using -i option. perf report shows nf_tables calls in its top 10: 17.30% kpktgend_0 [nf_tables] [k] nft_do_chain 15.75% kpktgend_0 [kernel.vmlinux] [k] __netif_receive_skb_core 10.39% kpktgend_0 [nf_tables_netdev] [k] nft_do_chain_netdev I'm measuring here an improvement of ~15% in performance with this patchset, so we got +2.5Mpps more. I have used my old laptop Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz 4-cores. This rework contains more specifically, in strict order, these patches: 1) Remove compile-time debugging from core. 2) Remove obsolete comments that predate the rcu era. These days it is well known that a Netfilter hook always runs under rcu_read_lock(). 3) Remove threshold handling, this is only used by br_netfilter too. We already have specific code to handle this from br_netfilter, so remove this code from the core path. 4) Deprecate NF_STOP, as this is only used by br_netfilter. 5) Place nf_state_hook pointer into xt_action_param structure, so this structure fits into one single cacheline according to pahole. This also implicit affects nftables since it also relies on the xt_action_param structure. 6) Move state->hook_entries into nf_queue entry. The hook_entries pointer is only required by nf_queue(), so we can store this in the queue entry instead. 7) use switch() statement to handle verdict cases. 8) Remove hook_entries field from nf_hook_state structure, this is only required by nf_queue, so store it in nf_queue_entry structure. 9) Merge nf_iterate() into nf_hook_slow() that results in a much more simple and readable function. 10) Handle NF_REPEAT away from the core, so far the only client is nf_conntrack_in() and we can restart the packet processing using a simple goto to jump back there when the TCP requires it. This update required a second pass to fix fallout, fix from Arnd Bergmann. 11) Set random seed from nft_hash when no seed is specified from userspace. 12) Simplify nf_tables expression registration, in a much smarter way to save lots of boiler plate code, by Liping Zhang. 13) Simplify layer 4 protocol conntrack tracker registration, from Davide Caratti. 14) Missing CONFIG_NF_SOCKET_IPV4 dependency for udp4_lib_lookup, due to recent generalization of the socket infrastructure, from Arnd Bergmann. 15) Then, the ipset batch from Jozsef, he describes it as it follows: * Cleanup: Remove extra whitespaces in ip_set.h * Cleanup: Mark some of the helpers arguments as const in ip_set.h * Cleanup: Group counter helper functions together in ip_set.h * struct ip_set_skbinfo is introduced instead of open coded fields in skbinfo get/init helper funcions. * Use kmalloc() in comment extension helper instead of kzalloc() because it is unnecessary to zero out the area just before explicit initialization. * Cleanup: Split extensions into separate files. * Cleanup: Separate memsize calculation code into dedicated function. * Cleanup: group ip_set_put_extensions() and ip_set_get_extensions() together. * Add element count to hash headers by Eric B Munson. * Add element count to all set types header for uniform output across all set types. * Count non-static extension memory into memsize calculation for userspace. * Cleanup: Remove redundant mtype_expire() arguments, because they can be get from other parameters. * Cleanup: Simplify mtype_expire() for hash types by removing one level of intendation. * Make NLEN compile time constant for hash types. * Make sure element data size is a multiple of u32 for the hash set types. * Optimize hash creation routine, exit as early as possible. * Make struct htype per ipset family so nets array becomes fixed size and thus simplifies the struct htype allocation. * Collapse same condition body into a single one. * Fix reported memory size for hash:* types, base hash bucket structure was not taken into account. * hash:ipmac type support added to ipset by Tomasz Chilinski. * Use setup_timer() and mod_timer() instead of init_timer() by Muhammad Falak R Wani, individually for the set type families. 16) Remove useless connlabel field in struct netns_ct, patch from Florian Westphal. 17) xt_find_table_lock() doesn't return ERR_PTR() anymore, so simplify {ip,ip6,arp}tables code that uses this. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Add a warning that the abort mechanism was triggered for device. Also avoid going through the procedure if abort was already done. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Andrew Lunn says: ==================== dsa: mv88e6xxx: Fixes for port refactoring The patches which refactored setting up the switch MACs introduced a couple of regressions. The RGMII delays for a port can be set using other mechanism than just phy-mode. Don't overwrite the delays unless explicitly asked to. This broke my Armada 370 RD. Also, the mv88e6351 family supports setting RGMII delays, but is missing the necessary entries in the ops structures to allow this. These fixes are to patches currently in net-next. No need for stable etc. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrew Lunn authored
The recent refactoring of setting the MAC configuration broke setting of RGMII delays, via the phy-mode, on the 6351 family. Add the missing ops to the structure. Fixes: 7340e5ecdbb1 ("net: dsa: mv88e6xxx: setup port's MAC") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrew Lunn authored
The RGMII modes delays can be set via strapping pings or EEPROM. Don't change them unless explicitly asked to change them. The recent refactoring of setting the MAC configuration changed this behaviours, in that CPU and DSA ports have any pre-configured RGMII delays removed. This breaks the Armada 370RD board. Restore the previous behaviour, in that RGMII delays are only applied/removed when explicitly asked for via an phy-mode being PHY_INTERFACE_MODE_RGMII* Fixes: 7340e5ecdbb1 ("net: dsa: mv88e6xxx: setup port's MAC") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 13 Nov, 2016 31 commits
-
-
Julia Lawall authored
Since commit 7926dbfa ("netfilter: don't use mutex_lock_interruptible()"), the function xt_find_table_lock can only return NULL on an error. Simplify the call sites and update the comment before the function. The semantic patch that change the code is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression t,e; @@ t = \(xt_find_table_lock(...)\| try_then_request_module(xt_find_table_lock(...),...)\) ... when != t=e - ! IS_ERR_OR_NULL(t) + t @@ expression t,e; @@ t = \(xt_find_table_lock(...)\| try_then_request_module(xt_find_table_lock(...),...)\) ... when != t=e - IS_ERR_OR_NULL(t) + !t @@ expression t,e,e1; @@ t = \(xt_find_table_lock(...)\| try_then_request_module(xt_find_table_lock(...),...)\) ... when != t=e ?- t ? PTR_ERR(t) : e1 + e1 ... when any // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
-
Florian Westphal authored
since 23014011 ('netfilter: conntrack: support a fixed size of 128 distinct labels') this isn't needed anymore. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
-
Philippe Reynes authored
The ethtool api {get|set}_settings is deprecated. We move this driver to new api {get|set}_link_ksettings. The previous implementation of set_settings was modifying the value of advertising, but with the new API, it's not possible. The structure ethtool_link_ksettings is defined as const. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ivan Khoronzhuk authored
No need to stop ctlr if it was already stopped. It can cause timeout warns. Steps: - ifconfig eth0 down - ethtool -l eth0 rx 8 tx 8 - ethtool -l eth0 rx 1 tx 1 Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ivan Khoronzhuk authored
The dma ctlr is reseted to 0 while cpdma soft reset, thus cpdma ctlr cannot be configured after cpdma is stopped. So restoring content of cpdma ctlr while off/on procedure is needed. The cpdma ctlr off/on procedure is present while interface down/up and while changing number of channels with ethtool. In order to not restore content in many places, move it to cpdma_ctlr_start(). Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
The field is 7bit long. Fix it. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
The idr_alloc(), idr_remove(), et al. routines all expect IDs to be signed integers. Therefore make the genl_family member 'id' signed too. Signed-off-by: David S. Miller <davem@davemloft.net>
-
Uwe Kleine-König authored
Instead of remembering if the page was changed, just compare the current page to the saved one. This is easier and has the advantage to save a register write if the page was already restored. Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Tom Lendacky says: ==================== amd-xgbe: AMD XGBE driver updates 2016-11-10 This patch series is targeted at adding support for a new PCI version of the hardware. As part of the new PCI device, there is a new PCS/PHY interaction, ECC support, I2C sideband communication, SFP+ support and more. The following updates and fixes are included in this driver update series: - Hardware workaround for possible incorrectly generated interrupts during software reset - Hardware workaround for Tx timestamp register access order - Add support for a PCI version of the device - Increase the Rx queue limit to take advantage of the increased number of DMA channels that might be available - Add support for a new DMA channel interrupt mode - Add ECC support for the device memory - Add support for using the integrated I2C controller for sideband communication - Expose the phylib phy_aneg_done() function so it can be called by the driver - Add support for SFP+ modules - Add support for MDIO attached PHYs - Add support for KR re-driver between the PCS/SerDes and an external PHY This patch series is based on net-next. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lendacky, Thomas authored
This patch provides support for the presence of a KR redriver chip in between the device PCS and an external PHY. When a redriver chip is present the device must perform clause 73 auto-negotiation in order to set the redriver chip for the downstream connection. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lendacky, Thomas authored
Use the phylib support in the kernel to communicate with and control an MDIO attached PHY. Use the hardware's MDIO communication mechanism to communicate with the PHY. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lendacky, Thomas authored
Add support for recognizing and using SFP+ modules directly. This includes using the I2C support to read and interpret the information returned from an SFP+ module and configuring things properly. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lendacky, Thomas authored
Make phy_aneg_done() available to drivers so that the result of the auto-negotiation initiated by phy_start_aneg() can be determined. Remove the local implementation of phy_aneg_done() from the Aeroflex driver and use the phy library version. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lendacky, Thomas authored
Add support to initialize and use the I2C controller within the hardware in order to perform sideband communication, e.g. determine the SFP media type that is installed. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lendacky, Thomas authored
Some versions of the amd-xgbe device are capable of reporting ECC error information back to the driver. Add support to process, track and report on this information. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lendacky, Thomas authored
The current per channel DMA interrupt support is based on an edge triggered interrupt that is not maskable. This results in having to call the disable_irq/enable_irq functions in order to prevent interrupts during napi processing. The hardware now has a way to configure the per channel DMA interrupt that will allow for masking the interrupt which prevents calling disable_irq/enable_irq now. This patch makes use of this support. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lendacky, Thomas authored
Remove the call to netif_get_num_default_rss_queues() and replace it with num_online_cpus() to allow for the possibility of using all of the hardware DMA channels available. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lendacky, Thomas authored
Add support for new PCI devices to the driver. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lendacky, Thomas authored
Update the reading of the Tx timestamp to account for a hardware issue on how the fields and interrupt are cleared. The "seconds" portion of the timestamp should be read first, followed by the "nanoseconds" portion. Reading the "nanoseconds" portion should clear the timestamp data and the interrupt. Because of an issue with the hardware this order is reversed and reading the "seconds" portion actually clears the timestamp. The code currently follows this workaround, but to guard against future versions where this is fixed add a field to the version data to indicate if the workaround is required or not. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lendacky, Thomas authored
Due to a hardware issue, it is possible for interrupt events to be incorrectly generated when performing a soft reset. To guard against this, perform the soft reset twice. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Jiri Benc says: ==================== openvswitch: support for layer 3 encapsulated packets At the core of this patch set is removing the assumption in Open vSwitch datapath that all packets have Ethernet header. The implementation relies on the presence of pop_eth and push_eth actions in datapath flows to facilitate adding and removing Ethernet headers as appropriate. The construction of such flows is left up to user-space. This series is based on work by Simon Horman, Lorand Jakab, Thomas Morin and others. I kept Lorand's and Simon's s-o-b in the patches that are derived from v11 to record their authorship of parts of the code. Changes from v12 to v13: * Addressed Pravin's feedback. * Removed the GRE vport conversion patch; L3 GRE ports should be created by rtnetlink instead. Main changes from v11 to v12: * The patches were restructured and split differently for easier review. * They were rebased and adjusted to the current net-next. Especially MPLS handling is different (and easier) thanks to the recent MPLS GSO rework. * Several bugs were discovered and fixed. The most notable is fragment handling: header adjustment for ARPHRD_NONE devices on tx needs to be done after refragmentation, not before it. This required significant changes in the patchset. Another one is stricter checking of attributes (match on L2 vs. L3 packet) at the kernel level. * Instead of is_layer3 bool, a mac_proto field is used. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Benc authored
Allow ARPHRD_NONE interfaces to be added to ovs bridge. Based on previous versions by Lorand Jakab and Simon Horman. Signed-off-by: Lorand Jakab <lojakab@cisco.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Benc authored
It's not allowed to push Ethernet header in front of another Ethernet header. It's not allowed to pop Ethernet header if there's a vlan tag. This preserves the invariant that L3 packet never has a vlan tag. Based on previous versions by Lorand Jakab and Simon Horman. Signed-off-by: Lorand Jakab <lojakab@cisco.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Benc authored
Extend the ovs flow netlink protocol to support L3 packets. Packets without OVS_KEY_ATTR_ETHERNET attribute specify L3 packets; for those, the OVS_KEY_ATTR_ETHERTYPE attribute is mandatory. Push/pop vlan actions are only supported for Ethernet packets. Based on previous versions by Lorand Jakab and Simon Horman. Signed-off-by: Lorand Jakab <lojakab@cisco.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Benc authored
Support receiving, extracting flow key and sending of L3 packets (packets without an Ethernet header). Note that even after this patch, non-Ethernet interfaces are still not allowed to be added to bridges. Similarly, netlink interface for sending and receiving L3 packets to/from user space is not in place yet. Based on previous versions by Lorand Jakab and Simon Horman. Signed-off-by: Lorand Jakab <lojakab@cisco.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Benc authored
Update Ethernet header only if there is one. Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Benc authored
We'll need it to alter packets sent to ARPHRD_NONE interfaces. Change do_output() to use the actual L2 header size of the packet when deciding on the minimum cutlen. The assumption here is that what matters is not the output interface hard_header_len but rather the L2 header of the particular packet. For example, ARPHRD_NONE tunnels that encapsulate Ethernet should get at least the Ethernet header. Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Benc authored
Use a hole in the structure. We support only Ethernet so far and will add a support for L2-less packets shortly. We could use a bool to indicate whether the Ethernet header is present or not but the approach with the mac_proto field is more generic and occupies the same number of bytes in the struct, while allowing later extensibility. It also makes the code in the next patches more self explaining. It would be nice to use ARPHRD_ constants but those are u16 which would be waste. Thus define our own constants. Another upside of this is that we can overload this new field to also denote whether the flow key is valid. This has the advantage that on refragmentation, we don't have to reparse the packet but can rely on the stored eth.type. This is especially important for the next patches in this series - instead of adding another branch for L2-less packets before calling ovs_fragment, we can just remove all those branches completely. Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Benc authored
On tx, use hard_header_len while deciding whether to refragment or drop the packet. That way, all combinations are calculated correctly: * L2 packet going to L2 interface (the L2 header len is subtracted), * L2 packet going to L3 interface (the L2 header is included in the packet lenght), * L3 packet going to L3 interface. Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
Commit 67f8b1dc ("net/mlx4_en: Refactor the XDP forwarding rings scheme") added a bug in that the prog's reference count is not dropped in the error path when mlx4_en_try_alloc_resources() is failing from mlx4_xdp_set(). We previously took bpf_prog_add(prog, priv->rx_ring_num - 1), that we need to release again. Earlier in the call path, dev_change_xdp_fd() itself holds a reference to the prog as well (hence the '- 1' in the bpf_prog_add()), so a simple atomic_sub() is safe to use here. When an error is propagated, then bpf_prog_put() is called eventually from dev_change_xdp_fd() Fixes: 67f8b1dc ("net/mlx4_en: Refactor the XDP forwarding rings scheme") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ivan Khoronzhuk authored
While create/destroy channel operation memory is not freed. It was supposed that memory is freed while driver remove. But a channel can be created and destroyed many times while changing number of channels with ethtool. Based on net-next/master Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 10 Nov, 2016 3 commits
-
-
David S. Miller authored
Salil Mehta says: ==================== Bug fixes & Code improvements in HNS driver This patch-set introduces some bug fixes and code improvements. These have been identified during internal review or testing of the driver by internal Hisilicon teams. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kejian Yan authored
This patch adds the support to add or remove the unicast entries to the table and remove from the table. Reported-by: Daode Huang <huangdaode@hisilicon.com> Signed-off-by: Kejian Yan <yankejian@huawei.com> Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kejian Yan authored
There is no clear operation before add a new multicast tcam table, so the tcam table will be overflow when add more entries. Reported-by: Daode Huang <huangdaode@hisilicon.com> Signed-off-by: Kejian Yan <yankejian@huawei.com> Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-