- 03 May, 2019 27 commits
-
-
Alice Michael authored
This patch introduces "recovery mode" to the i40e driver. It is part of a new Any2Any idea of upgrading the firmware. In this approach, it is required for the driver to have support for "transition firmware", that is used for migrating from structured to flat firmware image. In this new, very basic mode, i40e driver must be able to handle particular IOCTL calls from the NVM Update Tool and run a small set of AQ commands. These additional AQ commands are part of the interface used by the NVMUpdate tool. The NVMUpdate tool contains all of the necessary logic to reference these new AQ commands. The end user experience remains the same, they are using the NVMUpdate tool to update the NVM contents. Signed-off-by: Alice Michael <alice.michael@intel.com> Signed-off-by: Piotr Marczak <piotr.marczak@intel.com> Tested-by: Don Buchholz <donald.buchholz@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Stefan Assmann authored
Printing each devices PCI vendor and device ID has the advantage of easily revealing what hardware we're dealing with exactly. It's no longer necessary to match the PCI bus information to the lspci output. Helps with bug reports where no lspci output is available. Output before i40e 0000:08:00.0: fw 6.1.49420 api 1.7 nvm 6.80 0x80003c64 1.2007.0 and after i40e 0000:08:00.0: fw 6.1.49420 api 1.7 nvm 6.80 0x80003c64 1.2007.0 [8086:1572] [8086:0004] Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Harshitha Ramamurthy authored
A refactor of the i40e_vc_config_promiscuous_mode_msg function moved the check for un-trusted VF into another function. We have to lie to an un-trusted VF that its request to set promiscuous mode is successful even when it is not because we don't want the VF to find out its trust status this way. With the refactor, we were running into a case where even though we were not setting promiscuous mode for an un-trusted VF, we still printed a misleading message that it was successful. This patch fixes that by ensuring that a success message is printed on the host side only when the promiscuous mode change has been successful. Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Alice Michael authored
Just bumping the version number appropriately. Signed-off-by: Alice Michael <alice.michael@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jacob Keller authored
The function i40e_validate_cloud_filter checks that the destination and source port numbers are valid by attempting to ensure that the number is non-zero and no larger than 0xFFFF. However, the types for the dst_port and src_port variable are __be16 which by definition cannot be larger than 0xFFFF Since these values cannot be larger than 2 bytes, the check to see if they exceed 0xFFFF is meaningless. One might consider these checks as some sort of defensive coding, in case the type was later changed. However, these checks also byte-swap the value before comparison using be16_to_cpu, which will truncate the values to 16bits anyways. Additionally, changing the type would require updating the opcodes to support new data layout of these virtchnl commands. Remove the check to silence the -Wtype-limits warning that was added to GCC 8. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Aleksandr Loktionov authored
This code implements driver code changes necessary for LLDP Agent support. Modified i40e_aq_start_lldp() and i40e_aq_stop_lldp() adding false parameter whether LLDP state should be persistent across power cycles. Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Adam Ludkiewicz authored
Add assignments for advertising 40GBase_LR4, 40GBase_CR4 and fibre Signed-off-by: Adam Ludkiewicz <adam.ludkiewicz@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Maciej Paczkowski authored
Due to changes in FW the SW is required to perform double SR dump in some cases. Implementation adds two new steps to update nvm checksum function: * recalculate checksum and check if checksum in NVM is correct * if checksum in NVM is not correct then update it again Signed-off-by: Maciej Paczkowski <maciej.paczkowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Aleksandr Loktionov authored
VF's attempt to delete vlan 0 when a port vlan is configured is harmless in this case pf driver just does nothing. If vf will try to remove other vlans when a port vlan is configured it will still produce error as before. Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Carolyn Wyborny authored
TX MDD events reported on the PF are the result of the PF misconfiguring a descriptor and not because of "bad actions" by anything else. No need to reset now because if it results in a Tx hang, the Tx hang check will take care of it. Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Carolyn Wyborny authored
This patch changes the driver behavior when detecting a VF MDD event. It now disables the VF after one event, which indicates a hw detected problem in the VF. Before this change, the PF would allow a couple of events before doing the reset. Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
David S. Miller authored
Vladimir Oltean says: ==================== NXP SJA1105 DSA driver This patchset adds a DSA driver for the SPI-controlled NXP SJA1105 switch. Due to the hardware's unfriendliness, most of its state needs to be shadowed in kernel memory by the driver. To support this and keep a decent amount of cleanliness in the code, a new generic API for converting between CPU-accessible ("unpacked") structures and hardware-accessible ("packed") structures is proposed and used. The driver is GPL-2.0 licensed. The source code files which are licensed as BSD-3-Clause are hardware support files and derivative of the userspace NXP sja1105-tool program, which is BSD-3-Clause licensed. TODO items: * Add support for traffic. * Add full support for the P/Q/R/S series. The patches were mostly tested on a first-generation T device. * Add timestamping support and PTP clock manipulation. * Figure out how the tc-taprio hardware offload that was just proposed by Vinicius can be used to configure the switch's time-aware scheduler. * Rework link state callbacks to use phylink once the SGMII port is supported. Changes in v5: 1. Removed trailing empty lines at the end of files. 2. Moved the lib/packing.c file under a CONFIG_PACKING option instead of having it always built-in. The module is GPL licensed, which applies to its distribution in binary form, but the code is dual-licensed which means it can be used in projects with other licenses as well. 3. Made SJA1105 driver select CONFIG_PACKING and CONFIG_CRC32. v4 patchset can be found at: https://lwn.net/Articles/787077/ Changes in v4: 1. Previous patchset was broken apart, and for the moment the driver is configuring the switch as unmanaged. Support for regular and management traffic, as well as for PTP timestamping, will be submitted once the basic driver is accepted. Some core DSA patches were also broken out of the series, and are a dependency for this series: https://patchwork.ozlabs.org/project/netdev/list/?series=105069 2. Addressed Jiri Pirko's feedback about too generic function and macro naming. 3. Re-introduced ETH_P_DSA_8021Q. v3 patchset can be found at: https://lkml.org/lkml/2019/4/12/978 Changes in v3: 1. Removed the patch for a dedicated Ethertype to use with 802.1Q DSA tagging 2. Changed the SJA1105 switch tagging protocol sysfs label from "sja1105" to "8021q" to denote to users such as tcpdump that the structure is more generic. 3. Respun previous patch "net: dsa: Allow drivers to modulate between presence and absence of tagging". Current equivalent patch is called "net: dsa: Allow drivers to filter packets they can decode source port from" and at least allows reception of management traffic during the time when switch tagging is not enabled. 4. Added DSA-level fixes for the bridge core not unsetting vlan_filtering when ports leave. The global VLAN filtering is treated as a special case. Made the mt7530 driver use this. This patch benefits the SJA1105 because otherwise traffic in standalone mode would no longer work after removing the ports from a vlan_filtering bridge, since the driver and the hardware would be in an inconsistent state. 5. Restructured the documentation as rst. This depends upon the recently submitted "[PATCH net-next] Documentation: net: dsa: transition to the rst format": https://patchwork.ozlabs.org/patch/1084658/. v2 patchset can be found at: https://www.spinics.net/lists/netdev/msg563454.html Changes in v2: 1. Device ID is no longer auto-detected but enforced based on explicit DT compatible string. This helps with stricter checking of DT bindings. 2. Group all device-specific operations into a sja1105_info structure and avoid using the IS_ET() and IS_PQRS() macros at runtime as much as possible. 3. Added more verbiage to commit messages and documentation. 4. Treat the case where RGMII internal delays are requested through DT bindings and return error. 5. Miscellaneous cosmetic cleanup in sja1105_clocking.c 6. Not advertising link features that are not supported, such as pause frames and the half duplex modes. 7. Fixed a mistake in previous patchset where the switch tagging was not actually enabled (lost during a rebase). This brought up another uncaught issue where switching at runtime between tagging and no-tagging was not supported by DSA. Fixed up the mistake in "net: dsa: sja1105: Add support for traffic through standalone ports", and added the new patch "net: dsa: Allow drivers to modulate between presence and absence of tagging" to address the other issue. 8. Added a workaround for switch resets cutting a frame in the middle of transmission, which would throw off some link partners. 9. Changed the TPID from ETH_P_EDSA (0xDADA) to a newly introduced one: ETH_P_DSA_8021Q (0xDADB). Uncovered another mistake in the previous patchset with a missing ntohs(), which was not caught because 0xDADA is endian-agnostic. 10. Made NET_DSA_TAG_8021Q select VLAN_8021Q 11. Renamed __dsa_port_vlan_add to dsa_port_vid_add and not to dsa_port_vlan_add_trans, as suggested, because the corresponding _del function does not have a transactional phase and the naming is more uniform this way. v1 patchset can be found at: https://www.spinics.net/lists/netdev/msg561589.html Changes from RFC: 1. Removed the packing code for the static configuration tables that were not currently used 2. Removed the code for unpacking a static configuration structure from a memory buffer (not used) 3. Completely removed the SGMII stubs, since the configuration is not complete anyway. 4. Moved some code from the SJA1105 introduction commit into the patch that used it. 5. Made the code for checking global VLAN filtering generic and made b53 driver use it. 6. Made mt7530 driver use the new generic dp->vlan_filtering 7. Fixed check for stringset in .get_sset_count 8. Minor cleanup in sja1105_clocking.c 9. Fixed a confusing typo in DSA RFC can be found at: https://www.mail-archive.com/netdev@vger.kernel.org/msg291717.html ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
Ethernet flow control: The switch MAC does not consume, nor does it emit pause frames. It simply forwards them as any other Ethernet frame (and since the DMAC is, per IEEE spec, 01-80-C2-00-00-01, it means they are filtered as link-local traffic and forwarded to the CPU, which can't do anything useful with them). Duplex: There is no duplex setting in the SJA1105 MAC. It is known to forward traffic at line rate on the same port in both directions. Therefore it must be that it only supports full duplex. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
Resetting the switch at runtime is currently done while changing the vlan_filtering setting (due to the required TPID change). But reset is asynchronous with packet egress, and the switch core will not wait for egress to finish before carrying on with the reset operation. As a result, a connected PHY such as the BCM5464 would see an unterminated Ethernet frame and start to jabber (repeat the last seen Ethernet symbols - jabber is by definition an oversized Ethernet frame with bad FCS). This behavior is strange in itself, but it also causes the MACs of some link partners (such as the FRDM-LS1012A) to completely lock up. So as a remedy for this situation, when switch reset is required, simply inhibit Tx on all ports, and wait for the necessary time for the eventual one frame left in the egress queue (not even the Tx inhibit command is instantaneous) to be flushed. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
If STP is active, this setting is applied on bridged ports each time an Ethernet link is established (topology changes). Since the setting is global to the switch and a reset is required to change it, resets are prevented if the new callback does not change the value that the hardware already is programmed for. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
VLAN filtering cannot be properly disabled in SJA1105. So in order to emulate the "no VLAN awareness" behavior (not dropping traffic that is tagged with a VID that isn't configured on the port), we need to hack another switch feature: programmable TPID (which is 0x8100 for 802.1Q). We are reprogramming the TPID to a bogus value which leaves the switch thinking that all traffic is untagged, and therefore accepts it. Under a vlan_filtering bridge, the proper TPID of ETH_P_8021Q is installed again, and the switch starts identifying 802.1Q-tagged traffic. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
There are two possible utilizations so far: - Switch devices that don't support a native insertion/extraction header on the CPU port may still enjoy the benefits of port isolation with a custom VLAN tag. For this, they need to have a customizable TPID in hardware and a new Ethertype to distinguish between real 802.1Q traffic and the private tags used for port separation. - Switches that don't support the deactivation of VLAN awareness, but still want to have a mode in which they accept all traffic, including frames that are tagged with a VLAN not configured on their ports, may use this as a fake to trick the hardware into thinking that the TPID for VLAN is something other than 0x8100. What follows after the ETH_P_DSA_8021Q EtherType is a regular VLAN header (TCI), however there is no other EtherType that can be used for this purpose and doesn't already have a well-defined meaning. ETH_P_8021AD, ETH_P_QINQ1, ETH_P_QINQ2 and ETH_P_QINQ3 expect that another follow-up VLAN tag is present, which is not the case here. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Suggested-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
Documentation/devicetree/bindings/net/ethernet.txt is confusing because it says what the MAC should not do, but not what it *should* do: * "rgmii-rxid" (RGMII with internal RX delay provided by the PHY, the MAC should not add an RX delay in this case) The gap in semantics is threefold: 1. Is it illegal for the MAC to apply the Rx internal delay by itself, and simplify the phy_mode (mask off "rgmii-rxid" into "rgmii") before passing it to of_phy_connect? The documentation would suggest yes. 1. For "rgmii-rxid", while the situation with the Rx clock skew is more or less clear (needs to be added by the PHY), what should the MAC driver do about the Tx delays? Is it an implicit wild card for the MAC to apply delays in the Tx direction if it can? What if those were already added as serpentine PCB traces, how could that be made more obvious through DT bindings so that the MAC doesn't attempt to add them twice and again potentially break the link? 3. If the interface is a fixed-link and therefore the PHY object is fixed (a purely software entity that obviously cannot add clock skew), what is the meaning of the above property? So an interpretation of the RGMII bindings was chosen that hopefully does not contradict their intention but also makes them more applied. The SJA1105 driver understands to act upon "rgmii-*id" phy-mode bindings if the port is in the PHY role (either explicitly, or if it is a fixed-link). Otherwise it always passes the duty of setting up delays to the PHY driver. The error behavior that this patch adds is required on SJA1105E/T where the MAC really cannot apply internal delays. If the other end of the fixed-link cannot apply RGMII delays either (this would be specified through its own DT bindings), then the situation requires PCB delays. For SJA1105P/Q/R/S, this is however hardware supported and the error is thus only temporary. I created a stub function pointer for configuring delays per-port on RXC and TXC, and will implement it when I have access to a board with this hardware setup. Meanwhile do not allow the user to select an invalid configuration. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
Currently only the (more difficult) first generation E/T series is supported. Here the TCAM is only 4-way associative, and to know where the hardware will search for a FDB entry, we need to perform the same hash algorithm in order to install the entry in the correct bin. On P/Q/R/S, the TCAM should be fully associative. However the SPI command interface is different, and because I don't have access to a new-generation device at the moment, support for it is TODO. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
At this moment the following is supported: * Link state management through phylib * Autonomous L2 forwarding managed through iproute2 bridge commands. IP termination must be done currently through the master netdevice, since the switch is unmanaged at this point and using DSA_TAG_PROTO_NONE. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Georg Waibel <georg.waibel@sensor-technik.de> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vladimir Oltean authored
This provides an unified API for accessing register bit fields regardless of memory layout. The basic unit of data for these API functions is the u64. The process of transforming an u64 from native CPU encoding into the peripheral's encoding is called 'pack', and transforming it from peripheral to native CPU encoding is 'unpack'. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nicolas Ferre authored
This structure was used intensively for machine specific values when DT was not used. Since the removal of AVR32 from the kernel, this structure is only used for passing clocks from PCI macb wrapper, all other fields being 0. All other known platforms use DT. Remove the leftovers but make sure that PCI macb still works as expected by using default values: - phydev->irq is set to PHY_POLL by mdiobus_alloc() - mii_bus->phy_mask is cleared while allocating it - bp->phy_interface is set to PHY_INTERFACE_MODE_MII if mode not found in DT. This simplifies driver probe path and particularly phy handling. Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nicolas Ferre authored
While moving the chunk of code during 739de9a1 ("net: macb: Reorganize macb_mii bringup"), the declaration of struct phy_device declaration was kept. It's not useful in this function as we alrady have a phydev pointer. Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller authored
Three trivial overlapping conflicts. Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 02 May, 2019 7 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds authored
Pull networking fixes from David Miller: 1) Out of bounds access in xfrm IPSEC policy unlink, from Yue Haibing. 2) Missing length check for esp4 UDP encap, from Sabrina Dubroca. 3) Fix byte order of RX STBC access in mac80211, from Johannes Berg. 4) Inifnite loop in bpftool map create, from Alban Crequy. 5) Register mark fix in ebpf verifier after pkt/null checks, from Paul Chaignon. 6) Properly use rcu_dereference_sk_user_data in L2TP code, from Eric Dumazet. 7) Buffer overrun in marvell phy driver, from Andrew Lunn. 8) Several crash and statistics handling fixes to bnxt_en driver, from Michael Chan and Vasundhara Volam. 9) Several fixes to the TLS layer from Jakub Kicinski (copying negative amounts of data in reencrypt, reencrypt frag copying, blind nskb->sk NULL deref, etc). 10) Several UDP GRO fixes, from Paolo Abeni and Eric Dumazet. 11) PID/UID checks on ipv6 flow labels are inverted, from Willem de Bruijn. 12) Use after free in l2tp, from Eric Dumazet. 13) IPV6 route destroy races, also from Eric Dumazet. 14) SCTP state machine can erroneously run recursively, fix from Xin Long. 15) Adjust AF_PACKET msg_name length checks, add padding bytes if necessary. From Willem de Bruijn. 16) Preserve skb_iif, so that forwarded packets have consistent values even if fragmentation is involved. From Shmulik Ladkani. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (69 commits) udp: fix GRO packet of death ipv6: A few fixes on dereferencing rt->from rds: ib: force endiannes annotation selftests: fib_rule_tests: print the result and return 1 if any tests failed ipv4: ip_do_fragment: Preserve skb_iif during fragmentation net/tls: avoid NULL pointer deref on nskb->sk in fallback selftests: fib_rule_tests: Fix icmp proto with ipv6 packet: validate msg_namelen in send directly packet: in recvmsg msg_name return at least sizeof sockaddr_ll sctp: avoid running the sctp state machine recursively stmmac: pci: Fix typo in IOT2000 comment Documentation: fix netdev-FAQ.rst markup warning ipv6: fix races in ip6_dst_destroy() l2ip: fix possible use-after-free appletalk: Set error code if register_snap_client failed net: dsa: bcm_sf2: fix buffer overflow doing set_rxnfc rxrpc: Fix net namespace cleanup ipv6/flowlabel: wait rcu grace period before put_pid() vrf: Use orig netdev to count Ip6InNoRoutes and a fresh route lookup when sending dest unreach tcp: add sanity tests in tcp_add_backlog() ...
-
git://git.kernel.dk/linux-blockLinus Torvalds authored
Pull io_uring fixes from Jens Axboe: "This is mostly io_uring fixes/tweaks. Most of these were actually done in time for the last -rc, but I wanted to ensure that everything tested out great before including them. The code delta looks larger than it really is, as it's mostly just comment additions/changes. Outside of the comment additions/changes, this is mostly removal of unnecessary barriers. In all, this pull request contains: - Tweak to how we handle errors at submission time. We now post a completion event if the error occurs on behalf of an sqe, instead of returning it through the system call. If the error happens outside of a specific sqe, we return the error through the system call. This makes it nicer to use and makes the "normal" use case behave the same as the offload cases. (me) - Fix for a missing req reference drop from async context (me) - If an sqe is submitted with RWF_NOWAIT, don't punt it to async context. Return -EAGAIN directly, instead of using it as a hint to do async punt. (Stefan) - Fix notes on barriers (Stefan) - Remove unnecessary barriers (Stefan) - Fix potential double free of memory in setup error (Mark) - Further improve sq poll CPU validation (Mark) - Fix page allocation warning and leak on buffer registration error (Mark) - Fix iov_iter_type() for new no-ref flag (Ming) - Fix a case where dio doesn't honor bio no-page-ref (Ming)" * tag 'for-linus-20190502' of git://git.kernel.dk/linux-block: io_uring: avoid page allocation warnings iov_iter: fix iov_iter_type block: fix handling for BIO_NO_PAGE_REF io_uring: drop req submit reference always in async punt io_uring: free allocated io_memory once io_uring: fix SQPOLL cpu validation io_uring: have submission side sqe errors post a cqe io_uring: remove unnecessary barrier after unsetting IORING_SQ_NEED_WAKEUP io_uring: remove unnecessary barrier after incrementing dropped counter io_uring: remove unnecessary barrier before reading SQ tail io_uring: remove unnecessary barrier after updating SQ head io_uring: remove unnecessary barrier before reading cq head io_uring: remove unnecessary barrier before wq_has_sleeper io_uring: fix notes on barriers io_uring: fix handling SQEs requesting NOWAIT
-
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pciLinus Torvalds authored
Pull PCI fixes from Bjorn Helgaas: "I apologize for sending these so late in the cycle. We went back and forth about how to deal with the unexpected logging of intentional link state changes and finally decided to just config them off by default. PCI fixes: - Stop ignoring "pci=disable_acs_redir" parameter (Logan Gunthorpe) - Use shared MSI/MSI-X vector for Link Bandwidth Management (Alex Williamson) - Add Kconfig option for Link Bandwidth notification messages (Keith Busch)" * tag 'pci-v5.1-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI/LINK: Add Kconfig option (default off) PCI/portdrv: Use shared MSI/MSI-X vector for Bandwidth Management PCI: Fix issue with "pci=disable_acs_redir" parameter being ignored
-
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linuxLinus Torvalds authored
Pull MTD fix from Richard Weinberger: "A single regression fix for the marvell nand driver" * tag 'mtd/fixes-for-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: mtd: rawnand: marvell: Clean the controller state before each operation
-
Keith Busch authored
e8303bb7 ("PCI/LINK: Report degraded links via link bandwidth notification") added dmesg logging whenever a link changes speed or width to a state that is considered degraded. Unfortunately, it cannot differentiate signal integrity-related link changes from those intentionally initiated by an endpoint driver, including drivers that may live in userspace or VMs when making use of vfio-pci. Some GPU drivers actively manage the link state to save power, which generates a stream of messages like this: vfio-pci 0000:07:00.0: 32.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s x16 link at 0000:00:02.0 (capable of 64.000 Gb/s with 5 GT/s x16 link) Since we can't distinguish the intentional changes from the signal integrity issues, leave the reporting turned off by default. Add a Kconfig option to turn it on if desired. Fixes: e8303bb7 ("PCI/LINK: Report degraded links via link bandwidth notification") Link: https://lore.kernel.org/linux-pci/20190501142942.26972-1-keith.busch@intel.comSigned-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
-
Esben Haabendal authored
Fixes: d84aec42 ("net: ll_temac: Fix support for 64-bit platforms") Signed-off-by: Esben Haabendal <esben@geanix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
syzbot was able to crash host by sending UDP packets with a 0 payload. TCP does not have this issue since we do not aggregate packets without payload. Since dev_gro_receive() sets gso_size based on skb_gro_len(skb) it seems not worth trying to cope with padded packets. BUG: KASAN: slab-out-of-bounds in skb_gro_receive+0xf5f/0x10e0 net/core/skbuff.c:3826 Read of size 16 at addr ffff88808893fff0 by task syz-executor612/7889 CPU: 0 PID: 7889 Comm: syz-executor612 Not tainted 5.1.0-rc7+ #96 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187 kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317 __asan_report_load16_noabort+0x14/0x20 mm/kasan/generic_report.c:133 skb_gro_receive+0xf5f/0x10e0 net/core/skbuff.c:3826 udp_gro_receive_segment net/ipv4/udp_offload.c:382 [inline] call_gro_receive include/linux/netdevice.h:2349 [inline] udp_gro_receive+0xb61/0xfd0 net/ipv4/udp_offload.c:414 udp4_gro_receive+0x763/0xeb0 net/ipv4/udp_offload.c:478 inet_gro_receive+0xe72/0x1110 net/ipv4/af_inet.c:1510 dev_gro_receive+0x1cd0/0x23c0 net/core/dev.c:5581 napi_gro_frags+0x36b/0xd10 net/core/dev.c:5843 tun_get_user+0x2f24/0x3fb0 drivers/net/tun.c:1981 tun_chr_write_iter+0xbd/0x156 drivers/net/tun.c:2027 call_write_iter include/linux/fs.h:1866 [inline] do_iter_readv_writev+0x5e1/0x8e0 fs/read_write.c:681 do_iter_write fs/read_write.c:957 [inline] do_iter_write+0x184/0x610 fs/read_write.c:938 vfs_writev+0x1b3/0x2f0 fs/read_write.c:1002 do_writev+0x15e/0x370 fs/read_write.c:1037 __do_sys_writev fs/read_write.c:1110 [inline] __se_sys_writev fs/read_write.c:1107 [inline] __x64_sys_writev+0x75/0xb0 fs/read_write.c:1107 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x441cc0 Code: 05 48 3d 01 f0 ff ff 0f 83 9d 09 fc ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 83 3d 51 93 29 00 00 75 14 b8 14 00 00 00 0f 05 <48> 3d 01 f0 ff ff 0f 83 74 09 fc ff c3 48 83 ec 08 e8 ba 2b 00 00 RSP: 002b:00007ffe8c716118 EFLAGS: 00000246 ORIG_RAX: 0000000000000014 RAX: ffffffffffffffda RBX: 00007ffe8c716150 RCX: 0000000000441cc0 RDX: 0000000000000001 RSI: 00007ffe8c716170 RDI: 00000000000000f0 RBP: 0000000000000000 R08: 000000000000ffff R09: 0000000000a64668 R10: 0000000020000040 R11: 0000000000000246 R12: 000000000000c2d9 R13: 0000000000402b50 R14: 0000000000000000 R15: 0000000000000000 Allocated by task 5143: save_stack+0x45/0xd0 mm/kasan/common.c:75 set_track mm/kasan/common.c:87 [inline] __kasan_kmalloc mm/kasan/common.c:497 [inline] __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:470 kasan_slab_alloc+0xf/0x20 mm/kasan/common.c:505 slab_post_alloc_hook mm/slab.h:437 [inline] slab_alloc mm/slab.c:3393 [inline] kmem_cache_alloc+0x11a/0x6f0 mm/slab.c:3555 mm_alloc+0x1d/0xd0 kernel/fork.c:1030 bprm_mm_init fs/exec.c:363 [inline] __do_execve_file.isra.0+0xaa3/0x23f0 fs/exec.c:1791 do_execveat_common fs/exec.c:1865 [inline] do_execve fs/exec.c:1882 [inline] __do_sys_execve fs/exec.c:1958 [inline] __se_sys_execve fs/exec.c:1953 [inline] __x64_sys_execve+0x8f/0xc0 fs/exec.c:1953 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 5351: save_stack+0x45/0xd0 mm/kasan/common.c:75 set_track mm/kasan/common.c:87 [inline] __kasan_slab_free+0x102/0x150 mm/kasan/common.c:459 kasan_slab_free+0xe/0x10 mm/kasan/common.c:467 __cache_free mm/slab.c:3499 [inline] kmem_cache_free+0x86/0x260 mm/slab.c:3765 __mmdrop+0x238/0x320 kernel/fork.c:677 mmdrop include/linux/sched/mm.h:49 [inline] finish_task_switch+0x47b/0x780 kernel/sched/core.c:2746 context_switch kernel/sched/core.c:2880 [inline] __schedule+0x81b/0x1cc0 kernel/sched/core.c:3518 preempt_schedule_irq+0xb5/0x140 kernel/sched/core.c:3745 retint_kernel+0x1b/0x2d arch_local_irq_restore arch/x86/include/asm/paravirt.h:767 [inline] kmem_cache_free+0xab/0x260 mm/slab.c:3766 anon_vma_chain_free mm/rmap.c:134 [inline] unlink_anon_vmas+0x2ba/0x870 mm/rmap.c:401 free_pgtables+0x1af/0x2f0 mm/memory.c:394 exit_mmap+0x2d1/0x530 mm/mmap.c:3144 __mmput kernel/fork.c:1046 [inline] mmput+0x15f/0x4c0 kernel/fork.c:1067 exec_mmap fs/exec.c:1046 [inline] flush_old_exec+0x8d9/0x1c20 fs/exec.c:1279 load_elf_binary+0x9bc/0x53f0 fs/binfmt_elf.c:864 search_binary_handler fs/exec.c:1656 [inline] search_binary_handler+0x17f/0x570 fs/exec.c:1634 exec_binprm fs/exec.c:1698 [inline] __do_execve_file.isra.0+0x1394/0x23f0 fs/exec.c:1818 do_execveat_common fs/exec.c:1865 [inline] do_execve fs/exec.c:1882 [inline] __do_sys_execve fs/exec.c:1958 [inline] __se_sys_execve fs/exec.c:1953 [inline] __x64_sys_execve+0x8f/0xc0 fs/exec.c:1953 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe The buggy address belongs to the object at ffff88808893f7c0 which belongs to the cache mm_struct of size 1496 The buggy address is located 600 bytes to the right of 1496-byte region [ffff88808893f7c0, ffff88808893fd98) The buggy address belongs to the page: page:ffffea0002224f80 count:1 mapcount:0 mapping:ffff88821bc40ac0 index:0xffff88808893f7c0 compound_mapcount: 0 flags: 0x1fffc0000010200(slab|head) raw: 01fffc0000010200 ffffea00025b4f08 ffffea00027b9d08 ffff88821bc40ac0 raw: ffff88808893f7c0 ffff88808893e440 0000000100000001 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88808893fe80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88808893ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff88808893ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffff888088940000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff888088940080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Fixes: e20cf8d3 ("udp: implement GRO for plain UDP sockets.") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 01 May, 2019 6 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supplyLinus Torvalds authored
Pull power supply fixes from Sebastian Reichel: "Two more fixes for the 5.1 cycle. One division by zero fix in a specific driver and one core workaround for bad userspace behaviour from systemd regarding uevents. IMHO this can be considered to be a userspace bug, but the debug messages are useless anyways - cpcap-battery: fix a division by zero - core: fix systemd issue due to log messages produced by uevent" * tag 'for-v5.1-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: power: supply: sysfs: prevent endless uevent loop with CONFIG_POWER_SUPPLY_DEBUG power: supply: cpcap-battery: Fix division by zero
-
Martin KaFai Lau authored
It is a followup after the fix in commit 9c69a132 ("route: Avoid crash from dereferencing NULL rt->from") rt6_do_redirect(): 1. NULL checking is needed on rt->from because a parallel fib6_info delete could happen that sets rt->from to NULL. (e.g. rt6_remove_exception() and fib6_drop_pcpu_from()). 2. fib6_info_hold() is not enough. Same reason as (1). Meaning, holding dst->__refcnt cannot ensure rt->from is not NULL or rt->from->fib6_ref is not 0. Instead of using fib6_info_hold_safe() which ip6_rt_cache_alloc() is already doing, this patch chooses to extend the rcu section to keep "from" dereference-able after checking for NULL. inet6_rtm_getroute(): 1. NULL checking is also needed on rt->from for a similar reason. Note that inet6_rtm_getroute() is using RTNL_FLAG_DOIT_UNLOCKED. Fixes: a68886a6 ("net/ipv6: Make from in rt6_info rcu protected") Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Wei Wang <weiwan@google.com> Reviewed-by: David Ahern <dsahern@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nicholas Mc Guire authored
While the endiannes is being handled correctly as indicated by the comment above the offending line - sparse was unhappy with the missing annotation as be64_to_cpu() expects a __be64 argument. To mitigate this annotation all involved variables are changed to a consistent __le64 and the conversion to uint64_t delayed to the call to rds_cong_map_updated(). Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Maxime Chevallier says: ==================== net: mvpp2: cls: Add classification This series is a rework of the previously standalone patch adding classification support for mvpp2 : https://lore.kernel.org/netdev/20190423075031.26074-1-maxime.chevallier@bootlin.com/ This patch has been reworked according to Saeed's review, to make sure that the location of the rule is always respected and serves as a way to prioritize rules between each other. This the 3rd iteration of this submission, but since it's now a series, I reset the revision numbering. This series implements that in a limited configuration for now, since we limit the total number of rules per port to 4. The main factors for this limitation are that : - We share the classification tables between all ports (4 max, although one is only used for internal loopback), hence we have to perform a logical separation between rules, which is done today by dedicated ranges for each port in each table - The "Flow table", which dictates which lookups operations are performed for an ingress packet, in subdivided into 22 "sub flows", each corresponding to a traffic type based on the L3 proto, L4 proto, the presence or not of a VLAN tag and the L3 fragmentation. This makes so that when adding a rule, it has to be added into each of these subflows, introducing duplications of entries and limiting our max number of entries. These limitations can be overcomed in several ways, but for readability sake, I'd rather submit basic classification offload support for now, and improve it gradually. This series also adds a small cosmetic cleanup patch (1), and also adds support for the "Drop" action compared to the first submission of this feature. It is simple enough to be added with this basic support. Compared to the first submissions, the NETIF_F_NTUPLE flag was also removed, following Saeed's comment. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Maxime Chevallier authored
This commit introduces support for the "Drop" action in classification offload. This corresponds to the "-1" action with ethtool -N. This is achieved using the color marking actions available in the C2 engine, which associate a color to a packet. These colors can be either Green, Yellow or Red, Red meaning that the packet should be dropped. Green and Yellow colors are interpreted by the Policer, which isn't supported yet. This method of dropping using the Classifier is different than the already existing early-drop features, such as VLAN filtering and MAC UC/MC filtering, which are performed during the Parsing step, and therefore take precedence over classification actions. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Maxime Chevallier authored
This commit introduces basic classification offloading support for the PPv2 controller. The PPv2 classifier has many classification engines, for now we only use the C2 TCAM match engine. This engine allows to perform ternary lookups on 64 bits keys (called Header Extracted Key), that are built by extracting fields from the packet header and concatenating them. At most 4 fields can be extracted for a single lookup. This basic implementation allows to build the HEK from the following fields : - L4 source and destination ports (for UDP and TCP) More fields are to be added in the future. Classification flows are added through the ethtool interface, using the newly introduced flow_rule infrastructure as an internal rule representation, allowing to more easily implement tc flower rules if need be. The internal design for now allocates one range of 4 rules per port due to the internal design of the flow table, which uses 22 sub-flows. When inserting a classification rule, the rule is created in every relevant sub-flow. This low rule-count is a very simple design which reaches quickly the limitations of the flow table ordering, but guarantees that the rule ordering will always be respected. This commit only introduces support for the "steer to rxq" action. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-