- 17 Jan, 2017 27 commits
-
-
Jason Baron authored
Using a Mac OSX box as a client connecting to a Linux server, we have found that when certain applications (such as 'ab'), are abruptly terminated (via ^C), a FIN is sent followed by a RST packet on tcp connections. The FIN is accepted by the Linux stack but the RST is sent with the same sequence number as the FIN, and Linux responds with a challenge ACK per RFC 5961. The OSX client then sometimes (they are rate-limited) does not reply with any RST as would be expected on a closed socket. This results in sockets accumulating on the Linux server left mostly in the CLOSE_WAIT state, although LAST_ACK and CLOSING are also possible. This sequence of events can tie up a lot of resources on the Linux server since there may be a lot of data in write buffers at the time of the RST. Accepting a RST equal to rcv_nxt - 1, after we have already successfully processed a FIN, has made a significant difference for us in practice, by freeing up unneeded resources in a more expedient fashion. A packetdrill test demonstrating the behavior: // testing mac osx rst behavior // Establish a connection 0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 0.000 bind(3, ..., ...) = 0 0.000 listen(3, 1) = 0 0.100 < S 0:0(0) win 32768 <mss 1460,nop,wscale 10> 0.100 > S. 0:0(0) ack 1 <mss 1460,nop,wscale 5> 0.200 < . 1:1(0) ack 1 win 32768 0.200 accept(3, ..., ...) = 4 // Client closes the connection 0.300 < F. 1:1(0) ack 1 win 32768 // now send rst with same sequence 0.300 < R. 1:1(0) ack 1 win 32768 // make sure we are in TCP_CLOSE 0.400 %{ assert tcpi_state == 7 }% Signed-off-by: Jason Baron <jbaron@akamai.com> Cc: Eric Dumazet <edumazet@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tobias Klauser authored
Make the needlessly global struct ethtool_ops ethoc_ethtool_ops static to fix a sparse warning. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Edward Cree says: ==================== sfc: RX hash configuration This series improves support for getting and setting RX hashing configuration on Solarflare adapters through ethtool. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edward Cree authored
Ensures that we report the key and indirection table the NIC is using, rather than (if setting them failed earlier) what we wanted it to use. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Edward Cree authored
Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ganesh Goudar authored
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gao Feng authored
The inet_num is u16, so use %hu instead of casting it to int. And the sk_bound_dev_if is int actually, so it needn't cast to int. Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shyam Saini authored
Use eth_zero_addr to assign zero address to the given address array instead of memset when the second argument in memset is address of zero. Also, it makes the code clearer Signed-off-by: Shyam Saini <mayhs11saini@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lance Richardson authored
Changed type of csum field in struct igmpv3_query from __be16 to __sum16 to eliminate type warning, made same change in struct igmpv3_report for consistency. Fixed up an ntohs() where htons() should have been used instead. Signed-off-by: Lance Richardson <lrichard@redhat.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
-
David S. Miller authored
Robert Shearman says: ==================== mpls: Packet stats This patchset records per-interface packet stats in the MPLS forwarding path and exports them using a nest of attributes root at a new IFLA_STATS_AF_SPEC attribute as part of RTM_GETSTATS messages: [IFLA_STATS_AF_SPEC] -> [AF_MPLS] -> [MPLS_STATS_LINK] -> struct mpls_link_stats The first patch adds the rtnl infrastructure for this, including a new callbacks to per-AF ops of fill_stats_af and get_stats_af_size. The second patch records MPLS stats and makes use of the infrastructure to export them. The rtnl infrastructure could also be used to export IPv6 stats in the future. Changes in v2: - make incrementing IPv6 stats in mpls_stats_inc_outucastpkts conditional on CONFIG_IPV6 to fix build with CONFIG_IPV6=n ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Robert Shearman authored
Having MPLS packet stats is useful for observing network operation and for diagnosing network problems. In the absence of anything better, RFC2863 and RFC3813 are used for guidance for which stats to expose and the semantics of them. In particular rx_noroutes maps to in unknown protos in RFC2863. The stats are exposed to userspace via AF_MPLS attributes embedded in the IFLA_STATS_AF_SPEC attribute of RTM_GETSTATS messages. All the introduced fields are 64-bit, even error ones, to ensure no overflow with long uptimes. Per-CPU counters are used to avoid cache-line contention on the commonly used fields. The other fields have also been made per-CPU for code to avoid performance problems in error conditions on the assumption that on some platforms the cost of atomic operations could be more expensive than sending the packet (which is what would be done in the success case). If that's not the case, we could instead not use per-CPU counters for these fields. Only unicast and non-fragment are exposed at the moment, but other counters can be exposed in the future either by adding to the end of struct mpls_link_stats or by additional netlink attributes in the AF_MPLS IFLA_STATS_AF_SPEC nested attribute. Signed-off-by: Robert Shearman <rshearma@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Robert Shearman authored
Add the functionality for including address-family-specific per-link stats in RTM_GETSTATS messages. This is done through adding a new IFLA_STATS_AF_SPEC attribute under which address family attributes are nested and then the AF-specific attributes can be further nested. This follows the model of IFLA_AF_SPEC on RTM_*LINK messages and it has the advantage of presenting an easily extended hierarchy. The rtnl_af_ops structure is extended to provide AFs with the opportunity to fill and provide the size of their stats attributes. One alternative would have been to provide AFs with the ability to add attributes directly into the RTM_GETSTATS message without a nested hierarchy. I discounted this approach as it increases the rate at which the 32 attribute number space is used up and it makes implementation a little more tricky for stats dump resuming (at the moment the order in which attributes are added to the message has to match the numeric order of the attributes). Another alternative would have been to register per-AF RTM_GETSTATS handlers. I discounted this approach as I perceived a common use-case to be getting all the stats for an interface and this approach would necessitate multiple requests/dumps to retrieve them all. Signed-off-by: Robert Shearman <rshearma@brocade.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julia Lawall authored
The function stmmac_dt_phy provides several possibilities for initializing plat->mdio_node, all of which have the effect of increasing the reference count of the assigned value. This field is not updated elsewhere, so the value is live until the end of the lifetime of plat (devm_allocated), just after the end of stmmac_remove_config_dt. Thus, add an of_node_put on plat->mdio_node in stmmac_remove_config_dt. It is possible that the field mdio_node is never initialized, but of_node_put is NULL-safe, so it is also safe to call of_node_put in that case. Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Acked-by: Alexandre TORGUE <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Rolf Neugebauer authored
This patch part reverts fd2a0437 and e858fae2 which introduced a subtle change in how the virtio_net flags are derived from the SKBs ip_summed field. With the above commits, the flags are set to VIRTIO_NET_HDR_F_DATA_VALID when ip_summed == CHECKSUM_UNNECESSARY, thus treating it differently to ip_summed == CHECKSUM_NONE, which should be the same. Further, the virtio spec 1.0 / CS04 explicitly says that VIRTIO_NET_HDR_F_DATA_VALID must not be set by the driver. Fixes: fd2a0437 ("virtio_net: introduce virtio_net_hdr_{from,to}_skb") Fixes: e858fae2 (" virtio_net: use common code for virtio_net_hdr and skb GSO conversion") Signed-off-by: Rolf Neugebauer <rolf.neugebauer@docker.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds authored
Pull networking fixes from David Miller: 1) Handle multicast packets properly in fast-RX path of mac80211, from Johannes Berg. 2) Because of a logic bug, the user can't actually force SW checksumming on r8152 devices. This makes diagnosis of hw checksumming bugs really annoying. Fix from Hayes Wang. 3) VXLAN route lookup does not take the source and destination ports into account, which means IPSEC policies cannot be matched properly. Fix from Martynas Pumputis. 4) Do proper RCU locking in netvsc callbacks, from Stephen Hemminger. 5) Fix SKB leaks in mlxsw driver, from Arkadi Sharshevsky. 6) If lwtunnel_fill_encap() fails, we do not abort the netlink message construction properly in fib_dump_info(), from David Ahern. 7) Do not use kernel stack for DMA buffers in atusb driver, from Stefan Schmidt. 8) Openvswitch conntack actions need to maintain a correct checksum, fix from Lance Richardson. 9) ax25_disconnect() is missing a check for ax25->sk being NULL, in fact it already checks this, but not in all of the necessary spots. Fix from Basil Gunn. 10) Action GET operations in the packet scheduler can erroneously bump the reference count of the entry, making it unreleasable. Fix from Jamal Hadi Salim. Jamal gives a great set of example command lines that trigger this in the commit message. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (46 commits) net sched actions: fix refcnt when GETing of action after bind net/mlx4_core: Eliminate warning messages for SRQ_LIMIT under SRIOV net/mlx4_core: Fix when to save some qp context flags for dynamic VST to VGT transitions net/mlx4_core: Fix racy CQ (Completion Queue) free net: stmmac: don't use netdev_[dbg, info, ..] before net_device is registered net/mlx5e: Fix a -Wmaybe-uninitialized warning ax25: Fix segfault after sock connection timeout bpf: rework prog_digest into prog_tag tipc: allocate user memory with GFP_KERNEL flag net: phy: dp83867: allow RGMII_TXID/RGMII_RXID interface types ip6_tunnel: Account for tunnel header in tunnel MTU mld: do not remove mld souce list info when set link down be2net: fix MAC addr setting on privileged BE3 VFs be2net: don't delete MAC on close on unprivileged BE3 VFs be2net: fix status check in be_cmd_pmac_add() cpmac: remove hopeless #warning ravb: do not use zero-length alignment DMA descriptor mlx4: do not call napi_schedule() without care openvswitch: maintain correct checksum state in conntrack actions tcp: fix tcp_fastopen unaligned access complaints on sparc ...
-
Linus Torvalds authored
Merge branch 'stable/for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb Pull swiotlb fix from Konrad Rzeszutek Wilk: "A tiny fix to make sure that page-sized mappings are page-aligned (and not say straddle two pages). This is important for some drivers (such as NVME)" * 'stable/for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb: swiotlb: ensure that page-sized mappings are page-aligned
-
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmcLinus Torvalds authored
Pull MMC fixes from Ulf Hansson: "MMC core: - fix regressions detecting HS/HS DDR eMMC cards related to CMD6 MMC host: - mmc: mxs-mmc: Fix additional cycles after transmission stop - sdhci-acpi: Only powered up enabled acpi child devices - meson: avoid possible NULL dereference" * tag 'mmc-v4.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: core: Restore parts of the polling policy when switch to HS/HS DDR mmc: mxs-mmc: Fix additional cycles after transmission stop mmc: sdhci-acpi: Only powered up enabled acpi child devices MMC: meson: avoid possible NULL dereference
-
git://git.infradead.org/linux-mtdLinus Torvalds authored
Pull MTD fixes from Brian Norris: "Just NAND updates from Boris: - avoid compiling xway NAND controller driver as a module (which didn't work) - fix tango NAND DT binding and make sure the controller is in a clean state at probe time - add dependency on HAS_IOMEM to the oxnas NAND driver - fix irq number validity check in the lpc32xx driver" * tag 'for-linus-20170116' of git://git.infradead.org/linux-mtd: mtd: nand: lpc32xx: fix invalid error handling of a requested irq mtd: nand: tango: Reset pbus to raw mode in probe mtd: nand: tango: Update DT binding description mtd: nand: oxnas_nand: fix build errors on arch/um, require HAS_IOMEM mtd: nand: xway: fix build because of module functions mtd: nand: xway: disable module support
-
Philippe Reynes authored
The ethtool api {get|set}_settings is deprecated. We move this driver to new api {get|set}_link_ksettings. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Philippe Reynes authored
The ethtool api {get|set}_settings is deprecated. We move this driver to new api {get|set}_link_ksettings. The callback set_link_ksettings no longer update the value of advertising, as the struct ethtool_link_ksettings is defined as const. As I don't have the hardware, I'd be very pleased if someone may test this patch. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Philippe Reynes authored
The ethtool api {get|set}_settings is deprecated. We move this driver to new api {get|set}_link_ksettings. As I don't have the hardware, I'd be very pleased if someone may test this patch. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Philippe Reynes authored
The ethtool api {get|set}_settings is deprecated. We move this driver to new api {get|set}_link_ksettings. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Marcin Wojtas says: ==================== mvneta xmit_more and bql support This is a delayed v2 of short patchset, which introduces xmit_more and BQL to mvneta driver. The only one change was added in xmit_more support - condition check preventing excessive descriptors concatenation before flushing in HW. Any comments or feedback would be welcome. Changelog: v1 -> v2: * Add checking condition that ensures too much descriptors are not concatenated before flushing in HW. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Marcin Wojtas authored
Tests showed that when whole bandwidth is consumed, the latency for various kind of traffic can reach high values. With saturated link (e.g. with iperf from target to host) simple ping could take significant amount of time. BQL proved to improve this situation when implemented in mvneta driver. Measurements of ping latency for 3 link speeds: Speed | Latency w/o BQL | Latency with BQL 10 | 7-14 ms | 3.5 ms 100 | 2-12 ms | 0.6 ms 1000 | often timeout | up to 2ms Decreasing latency as above result in sligt performance cost - 4kpps (-1.4%) when pushing 64B packets via two bridged interfaces of Armada 38x. For 1500B packets in the same setup, the mpstat tool showed +8% of CPU occupation (default affinity, second CPU idle). Even though this cost seems reasonable to take, considering other improvements. This commit adds byte queue limit mechanism for the mvneta driver. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Simon Guinot authored
Basing on xmit_more flag of the skb, TX descriptors can be concatenated before flushing. This commit delay Tx descriptor flush if the queue is running and if there is more skb's to send. A maximum allowed number of descriptors for flushing at once due to MVNETA_TXQ_UPDATE_REG(q) reqisters limitation, is 255. Because of that a new macro was added (MVNETA_TXQ_DEC_SENT_MASK) in order to ensure that concatenated amount of descriptor does not exceed that value. Signed-off-by: Simon Guinot <simon.guinot@sequanux.org> Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jamal Hadi Salim authored
Demonstrating the issue: .. add a drop action $sudo $TC actions add action drop index 10 .. retrieve it $ sudo $TC -s actions get action gact index 10 action order 1: gact action drop random type none pass val 0 index 10 ref 2 bind 0 installed 29 sec used 29 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 ... bug 1 above: reference is two. Reference is actually 1 but we forget to subtract 1. ... do a GET again and we see the same issue try a few times and nothing changes ~$ sudo $TC -s actions get action gact index 10 action order 1: gact action drop random type none pass val 0 index 10 ref 2 bind 0 installed 31 sec used 31 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 ... lets try to bind the action to a filter.. $ sudo $TC qdisc add dev lo ingress $ sudo $TC filter add dev lo parent ffff: protocol ip prio 1 \ u32 match ip dst 127.0.0.1/32 flowid 1:1 action gact index 10 ... and now a few GETs: $ sudo $TC -s actions get action gact index 10 action order 1: gact action drop random type none pass val 0 index 10 ref 3 bind 1 installed 204 sec used 204 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 $ sudo $TC -s actions get action gact index 10 action order 1: gact action drop random type none pass val 0 index 10 ref 4 bind 1 installed 206 sec used 206 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 $ sudo $TC -s actions get action gact index 10 action order 1: gact action drop random type none pass val 0 index 10 ref 5 bind 1 installed 235 sec used 235 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 .... as can be observed the reference count keeps going up. After the fix $ sudo $TC actions add action drop index 10 $ sudo $TC -s actions get action gact index 10 action order 1: gact action drop random type none pass val 0 index 10 ref 1 bind 0 installed 4 sec used 4 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 $ sudo $TC -s actions get action gact index 10 action order 1: gact action drop random type none pass val 0 index 10 ref 1 bind 0 installed 6 sec used 6 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 $ sudo $TC qdisc add dev lo ingress $ sudo $TC filter add dev lo parent ffff: protocol ip prio 1 \ u32 match ip dst 127.0.0.1/32 flowid 1:1 action gact index 10 $ sudo $TC -s actions get action gact index 10 action order 1: gact action drop random type none pass val 0 index 10 ref 2 bind 1 installed 32 sec used 32 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 $ sudo $TC -s actions get action gact index 10 action order 1: gact action drop random type none pass val 0 index 10 ref 2 bind 1 installed 33 sec used 33 sec Action statistics: Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 Fixes: aecc5cef ("net sched actions: fix GETing actions") Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 16 Jan, 2017 13 commits
-
-
git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds authored
Pull NFS client bugfixes from Trond Myklebust: - fix invalid fget()/fput() calls when doing file locking - fix multiple directory cache invalidation issues due to the client failing to recognise that the directory wasn't changed - fix client recovery when server reboots multiple times * tag 'nfs-for-4.10-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4: Fix client recovery when server reboots multiple times NFSv4: update_changeattr should update the attribute timestamp NFSv4: Don't call update_changeattr() unless the unlink is successful NFSv4: Don't apply change_info4 twice on rename within a directory NFSv4: Call update_changeattr() from _nfs4_proc_open only if a file was created nfs: Don't take a reference on fl->fl_file for LOCK operation
-
David S. Miller authored
Tariq Toukan says: ==================== mlx4 core fixes This patchset contains bug fixes from Jack to the mlx4 Core driver. Patch 1 solves a race in the flow of CQ free. Patch 2 moves some qp context flags update to the correct qp transition. Patch 3 eliminates warnings from the path of SRQ_LIMIT that flood the message log, and keeps them only in the path of SRQ_CATAS_ERROR. Series generated against net commit: 1a717fcf Merge tag 'mac80211-for-davem-2017-01-13' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jack Morgenstein authored
When running SRIOV, warnings for SRQ LIMIT events flood the Hypervisor's message log when (correct, normally operating) apps use SRQ LIMIT events as a trigger to post WQEs to SRQs. Add more information to the existing debug printout for SRQ_LIMIT, and output the warning messages only for the SRQ CATAS ERROR event. Fixes: acba2420 ("mlx4_core: Add wrapper functions and comm channel and slave event support to EQs") Fixes: e0debf9c ("mlx4_core: Reduce warning message for SRQ_LIMIT event to debug level") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jack Morgenstein authored
Save the qp context flags byte containing the flag disabling vlan stripping in the RESET to INIT qp transition, rather than in the INIT to RTR transition. Per the firmware spec, the flags in this byte are active in the RESET to INIT transition. As a result of saving the flags in the incorrect qp transition, when switching dynamically from VGT to VST and back to VGT, the vlan remained stripped (as is required for VST) and did not return to not-stripped (as is required for VGT). Fixes: f0f829bf ("net/mlx4_core: Add immediate activate for VGT->VST->VGT") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jack Morgenstein authored
In function mlx4_cq_completion() and mlx4_cq_event(), the radix_tree_lookup requires a rcu_read_lock. This is mandatory: if another core frees the CQ, it could run the radix_tree_node_rcu_free() call_rcu() callback while its being used by the radix tree lookup function. Additionally, in function mlx4_cq_event(), since we are adding the rcu lock around the radix-tree lookup, we no longer need to take the spinlock. Also, the synchronize_irq() call for the async event eliminates the need for incrementing the cq reference count in mlx4_cq_event(). Other changes: 1. In function mlx4_cq_free(), replace spin_lock_irq with spin_lock: we no longer take this spinlock in the interrupt context. The spinlock here, therefore, simply protects against different threads simultaneously invoking mlx4_cq_free() for different cq's. 2. In function mlx4_cq_free(), we move the radix tree delete to before the synchronize_irq() calls. This guarantees that we will not access this cq during any subsequent interrupts, and therefore can safely free the CQ after the synchronize_irq calls. The rcu_read_lock in the interrupt handlers only needs to protect against corrupting the radix tree; the interrupt handlers may access the cq outside the rcu_read_lock due to the synchronize_irq calls which protect against premature freeing of the cq. 3. In function mlx4_cq_event(), we change the mlx_warn message to mlx4_dbg. 4. We leave the cq reference count mechanism in place, because it is still needed for the cq completion tasklet mechanism. Fixes: 6d90aa5c ("net/mlx4_core: Make sure there are no pending async events when freeing CQ") Fixes: 225c7b1f ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Paul Blakey authored
Flower currently allows having the same filter twice with the same priority. Actions (and statistics update) will always execute on the first inserted rule leaving the second rule unused. This patch disallows that. Signed-off-by: Paul Blakey <paulb@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Heiner Kallweit authored
Don't use netdev_info and friends before the net_device is registered. This avoids ugly messages like "meson8b-dwmac c9410000.ethernet (unnamed net_device) (uninitialized): Enable RX Mitigation via HW Watchdog Timer" Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arnd Bergmann authored
As found by Olof's build bot, we gain a harmless warning about a potential uninitialized variable reference in mlx5: drivers/net/ethernet/mellanox/mlx5/core/en_tc.c: In function 'parse_tc_fdb_actions': drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:769:13: warning: 'out_dev' may be used uninitialized in this function [-Wmaybe-uninitialized] drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:811:21: note: 'out_dev' was declared here This was introduced through the addition of an 'IS_ERR/PTR_ERR' pair that gcc is unfortunately unable to completely figure out. The problem being gcc cannot tell that if(IS_ERR()) in mlx5e_route_lookup_ipv4() is equivalent to checking if(err) later, so it assumes that 'out_dev' is used after the 'return PTR_ERR(rt)'. The PTR_ERR_OR_ZERO() case by comparison is fairly easy to detect by gcc, so it can't get that wrong, so it no longer warns. Hadar Hen Zion already attempted to fix the warning earlier by adding fake initializations, but that ended up not fully addressing all warnings, so I'm reverting it now that it is no longer needed. Link: http://arm-soc.lixom.net/buildlogs/mainline/v4.10-rc3-98-gcff3b2c/ Fixes: a42485eb ("net/mlx5e: TC ipv4 tunnel encap offload error flow fixes") Fixes: a757d108 ("net/mlx5e: Fix kbuild warnings for uninitialized parameters") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Lebrun authored
Add missing IPv6-SR header files in include/uapi/linux/Kbuild. Also, prevent seg6_lwt_headroom() from being exported and add missing linux/types.h include. Signed-off-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
Make sure that ctx cannot potentially be accessed oob by asserting explicitly that ctx access size into pt_regs for BPF_PROG_TYPE_KPROBE programs must be within limits. In case some 32bit archs have pt_regs not being a multiple of 8, then BPF_DW access could cause such access. BPF_PROG_TYPE_KPROBE progs don't have a ctx conversion function since there's no extra mapping needed. kprobe_prog_is_valid_access() didn't enforce sizeof(long) as the only allowed access size, since LLVM can generate non BPF_W/BPF_DW access to regs from time to time. For BPF_PROG_TYPE_TRACEPOINT we don't have a ctx conversion either, so add a BUILD_BUG_ON() check to make sure that BPF_DW access will not be a similar issue in future (ctx works on event buffer as opposed to pt_regs there). Fixes: 2541517c ("tracing, perf: Implement BPF programs attached to kprobes") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Basil Gunn authored
The ax.25 socket connection timed out & the sock struct has been previously taken down ie. sock struct is now a NULL pointer. Checking the sock_flag causes the segfault. Check if the socket struct pointer is NULL before checking sock_flag. This segfault is seen in timed out netrom connections. Please submit to -stable. Signed-off-by: Basil Gunn <basil@pacabunga.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Mahesh Bandewar authored
In the last patch da36e13c ("ipvlan: improvise dev_id generation logic in IPvlan") I missed some part of Dave's suggestion and because of that the dev_id creation could fail in a corner case scenario. This would happen when more or less 64k devices have been already created and several have been deleted. If the devices that are still sticking around are the last n bits from the bitmap. So in this scenario even if lower bits are available, the dev_id search is so narrow that it always fails. Fixes: da36e13c ("ipvlan: improvise dev_id generation logic in IPvlan") CC: David Miller <davem@davemloft.org> CC: Eric Dumazet <edumazet@google.com> Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
Commit 7bd509e3 ("bpf: add prog_digest and expose it via fdinfo/netlink") was recently discussed, partially due to admittedly suboptimal name of "prog_digest" in combination with sha1 hash usage, thus inevitably and rightfully concerns about its security in terms of collision resistance were raised with regards to use-cases. The intended use cases are for debugging resp. introspection only for providing a stable "tag" over the instruction sequence that both kernel and user space can calculate independently. It's not usable at all for making a security relevant decision. So collisions where two different instruction sequences generate the same tag can happen, but ideally at a rather low rate. The "tag" will be dumped in hex and is short enough to introspect in tracepoints or kallsyms output along with other data such as stack trace, etc. Thus, this patch performs a rename into prog_tag and truncates the tag to a short output (64 bits) to make it obvious it's not collision-free. Should in future a hash or facility be needed with a security relevant focus, then we can think about requirements, constraints, etc that would fit to that situation. For now, rework the exposed parts for the current use cases as long as nothing has been released yet. Tested on x86_64 and s390x. Fixes: 7bd509e3 ("bpf: add prog_digest and expose it via fdinfo/netlink") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-