- 27 Sep, 2022 22 commits
-
-
Paolo Abeni authored
Michael Weiß says: ==================== net: openvswitch: metering and conntrack in userns Currently using openvswitch in a non-initial user namespace, e.g., an unprivileged container, is possible but without metering and conntrack support. This is due to the restriction of the corresponding Netlink interfaces to the global CAP_NET_ADMIN. This simple patches switch from GENL_ADMIN_PERM to GENL_UNS_ADMIN_PERM in several cases to allow this also for the unprivileged container use case. We tested this for unprivileged containers created by the container manager of GyroidOS (gyroidos.github.io). However, for other container managers such as LXC or systemd which provide unprivileged containers this should be apply equally. ==================== Link: https://lore.kernel.org/r/20220923133820.993725-1-michael.weiss@aisec.fraunhofer.deSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Michael Weiß authored
Similar to the previous commit, the Netlink interface of the OVS conntrack module was restricted to global CAP_NET_ADMIN by using GENL_ADMIN_PERM. This is changed to GENL_UNS_ADMIN_PERM to support unprivileged containers in non-initial user namespace. Signed-off-by: Michael Weiß <michael.weiss@aisec.fraunhofer.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Michael Weiß authored
The Netlink interface for metering was restricted to global CAP_NET_ADMIN by using GENL_ADMIN_PERM. To allow metring in a non-inital user namespace, e.g., a container, this is changed to GENL_UNS_ADMIN_PERM. Signed-off-by: Michael Weiß <michael.weiss@aisec.fraunhofer.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Tony Lu authored
This enables SO_REUSEPORT [1] for clcsock when it is set on smc socket, so that some applications which uses it can be transparently replaced with SMC. Also, this helps improve load distribution. Here is a simple test of NGINX + wrk with SMC. The CPU usage is collected on NGINX (server) side as below. Disable SO_REUSEPORT: 05:15:33 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle 05:15:34 PM all 7.02 0.00 11.86 0.00 2.04 8.93 0.00 0.00 0.00 70.15 05:15:34 PM 0 0.00 0.00 0.00 0.00 16.00 70.00 0.00 0.00 0.00 14.00 05:15:34 PM 1 11.58 0.00 22.11 0.00 0.00 0.00 0.00 0.00 0.00 66.32 05:15:34 PM 2 1.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 0.00 98.00 05:15:34 PM 3 16.84 0.00 30.53 0.00 0.00 0.00 0.00 0.00 0.00 52.63 05:15:34 PM 4 28.72 0.00 44.68 0.00 0.00 0.00 0.00 0.00 0.00 26.60 05:15:34 PM 5 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 05:15:34 PM 6 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 05:15:34 PM 7 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 Enable SO_REUSEPORT: 05:15:20 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle 05:15:21 PM all 8.56 0.00 14.40 0.00 2.20 9.86 0.00 0.00 0.00 64.98 05:15:21 PM 0 0.00 0.00 4.08 0.00 14.29 76.53 0.00 0.00 0.00 5.10 05:15:21 PM 1 9.09 0.00 16.16 0.00 1.01 0.00 0.00 0.00 0.00 73.74 05:15:21 PM 2 9.38 0.00 16.67 0.00 1.04 0.00 0.00 0.00 0.00 72.92 05:15:21 PM 3 10.42 0.00 17.71 0.00 1.04 0.00 0.00 0.00 0.00 70.83 05:15:21 PM 4 9.57 0.00 15.96 0.00 0.00 0.00 0.00 0.00 0.00 74.47 05:15:21 PM 5 9.18 0.00 15.31 0.00 0.00 1.02 0.00 0.00 0.00 74.49 05:15:21 PM 6 8.60 0.00 15.05 0.00 0.00 0.00 0.00 0.00 0.00 76.34 05:15:21 PM 7 12.37 0.00 14.43 0.00 0.00 0.00 0.00 0.00 0.00 73.20 Using SO_REUSEPORT helps the load distribution of NGINX be more balanced. [1] https://man7.org/linux/man-pages/man7/socket.7.htmlSigned-off-by: Tony Lu <tonylu@linux.alibaba.com> Acked-by: Wenjia Zhang <wenjia@linux.ibm.com> Link: https://lore.kernel.org/r/20220922121906.72406-1-tonylu@linux.alibaba.comSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Jakub Kicinski authored
Sean Anderson says: ==================== net: sunhme: Cleanups and logging improvements This series is a continuation of [1] with a focus on logging improvements (in the style of commit b11e5f6a ("net: sunhme: output link status with a single print.")). I have included several of Rolf's patches in the series where appropriate (with slight modifications). After this series is applied, many more messages from this driver will come with driver/device information. Additionally, most messages (especially debug messages) have been condensed onto one line (as KERN_CONT messages get split). [1] https://lore.kernel.org/netdev/4686583.GXAFRqVoOG@eto.sf-tec.de/ ==================== Link: https://lore.kernel.org/r/20220924015339.1816744-1-seanga2@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Sean Anderson authored
I have the hardware so at the very least I can test things. Signed-off-by: Sean Anderson <seanga2@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Sean Anderson authored
The SXD, TXD, and RXD macros are used only once (or twice). Just use the vdbg print, which seems to have been devised for these sorts of very verbose messages. Signed-off-by: Sean Anderson <seanga2@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Sean Anderson authored
This driver seems to have been written under the assumption that messages can be continued arbitrarily. I'm not when this changed (if ever), but such ad-hoc continuations are liable to be rudely interrupted. Convert all such instances to single prints. This loses a bit of timing information (such as when a line was constructed piecemeal as the function executed), but it's easy to add a few prints if necessary. This also adds newlines to the ends of any prints without them. Since (almost every) debug print included the name of the function, include it automatically. Signed-off-by: Sean Anderson <seanga2@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Sean Anderson authored
Wherever possible, use the associated netdev (or device) when printing errors or other messages. This makes it immediately clear what device caused the error, and provides more information than just the device name. Signed-off-by: Sean Anderson <seanga2@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Sean Anderson authored
This is a mostly-mechanical translation of the existing printks into pr_foos. In several places, I have pasted messages which were broken over several lines to allow for easier grepping. Signed-off-by: Sean Anderson <seanga2@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Sean Anderson authored
Remove all the single-use debug conditionals, and just collect the debug defines at the top of the file. HMD seems like it is used for general debug info, so just redefine it as pr_debug. Additionally, instead of using the default loglevel, use the debug loglevel for debugging. Signed-off-by: Sean Anderson <seanga2@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Sean Anderson authored
With the power of variadic macros, double parentheses are unnecessary. Signed-off-by: Sean Anderson <seanga2@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Rolf Eike Beer authored
This not only removes a lot of code, it also fixes the memleak of the DMA memory when register_netdev() fails. Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de> [ rebased onto net-next/master; fixed error reporting ] Signed-off-by: Sean Anderson <seanga2@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Sean Anderson authored
This fixes several error paths to ensure they return an appropriate error (instead of ENODEV). Signed-off-by: Sean Anderson <seanga2@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Sean Anderson authored
In order to differentiate between a missing bridge and an OOM condition, return ERR_PTRs from quattro_pci_find. This also does some general linting in the area. Signed-off-by: Sean Anderson <seanga2@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Rolf Eike Beer authored
This already returns a proper error value, so pass it to the caller. Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de> Signed-off-by: Sean Anderson <seanga2@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Sean Anderson authored
Module versions are not very useful: > The basic problem is, the version string does not identify the sources > with enough accuracy. It says nothing about back ported fixes in > stable kernels. It tells you nothing about vendor patches to the > network core, etc. https://lore.kernel.org/all/Yf6mtvA1zO7cdzr7@lunn.ch/ While we're at it, inline the author and use the driver name a bit more. Signed-off-by: Sean Anderson <seanga2@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Rolf Eike Beer authored
I can't find a reference to it in the entire git history. Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de> Signed-off-by: Sean Anderson <seanga2@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Yang Yingliang says: ==================== net: dsa: remove unnecessary i2c_set_clientdata() This patchset https://lore.kernel.org/all/20220921140524.3831101-8-yangyingliang@huawei.com/T/ removed all set_drvdata(NULL) in driver remove function. i2c_set_clientdata() is another wrapper of set drvdata function, to follow the same convention, remove i2c_set_clientdata() called in driver remove function in drivers/net/dsa/. ==================== Link: https://lore.kernel.org/r/20220923143742.87093-1-yangyingliang@huawei.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Yang Yingliang authored
Remove unnecessary i2c_set_clientdata() in ->remove(), the driver_data will be set to NULL in device_unbind_cleanup() after calling ->remove(). Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Yang Yingliang authored
Remove unnecessary i2c_set_clientdata() in ->remove(), the driver_data will be set to NULL in device_unbind_cleanup() after calling ->remove(). Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Yang Yingliang authored
Remove unnecessary i2c_set_clientdata() in ->remove(), the driver_data will be set to NULL in device_unbind_cleanup() after calling ->remove(). Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 26 Sep, 2022 18 commits
-
-
Jesper Dangaard Brouer authored
Practical experience (and advice from Alexei) tell us that bitfields in structs lead to un-optimized assembly code. I've verified this change does lead to better x86_64 assembly, both via objdump and playing with code snippets in godbolt.org. Using scripts/bloat-o-meter shows the code size is reduced with 24 bytes for xdp_convert_buff_to_frame() that gets inlined e.g. in i40e_xmit_xdp_tx_ring() which were used for microbenchmarking. Microbenchmarking results do show improvements, but very small and varying between 0.5 to 2 nanosec improvement per packet. The member @metasize is changed from u8 to u32. Future users of this area could split this into two u16 fields. I've also benchmarked with two u16 fields showing equal performance gains and code size reduction. The moved member @frame_sz doesn't change sizeof struct due to existing padding. Like xdp_buff member @frame_sz is placed next to @flags, which allows compiler to optimize assignment of these. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/r/166393728005.2213882.4162674859542409548.stgit@firesoulSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Vladimir Oltean says: ==================== Improve tsn_lib selftests for future distributed tasks Some of the boards I am working with are limited in the number of ports that they offer, and as more TSN related selftests are added, it is important to be able to distribute the work among multiple boards. A large part of implementing that is ensuring network-wide synchronization, but also permitting more streams of data to flow through the network. There is the more important aspect of also coordinating the timing characteristics of those streams, and that is also something that is tackled, although not in this modest patch set. The goal here is not to introduce new selftests yet, but just to lay a better foundation for them. These patches are a part of the cleanup work I've done while working on selftests for frame preemption. They are regression-tested with psfp.sh. ==================== Link: https://lore.kernel.org/r/20220923210016.3406301-1-vladimir.oltean@nxp.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
We can make the phc2sys helper not only synchronize a PHC to CLOCK_REALTIME, which is what it currently does, but also CLOCK_REALTIME to a PHC, which is going to be needed in distributed TSN tests. Instead of making the complexity of the arguments passed to phc2sys_start() explode, we can let it figure out the sync direction automatically, based on ptp4l's port states. Towards that goal, pass just the path to the desired ptp4l instance's UNIX domain socket, and remove the $if_name argument (from which it derives the PHC). Also adapt the one caller from the ocelot psfp.sh test. In the case of psfp.sh, phc2sys_start is able to properly figure out that CLOCK_REALTIME is the source clock and swp1's PHC is the destination, because of the way in which ptp4l_start for the UDS_ADDRESS_SWP1 was called: with slave_only=false, so it will always win the BMCA and always become the sync master between itself and $h1. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
Move the PID variable for the isochron receiver into a separate namespace per stats port, to allow multiple receivers (and/or orchestration daemons) to be instantiated by the same script. Preserve the existing behavior by making isochron_do() use the default stats TCP port of 5000. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
Switch ports will want to act as Boundary Clocks, which are configured using ptp4l by specifying the "-i" argument multiple times. Since we track a log file and a pid file for each ptp4l instance, and we want to be compatible with the existing single-port callers of ptp4l_start and ptp4l_stop, pass the interface list as a single string of space-separated values. Based on this, we create a label for each ptp4l instance, where the spaces are replaced with underscores (ptp4l_start "eth0 eth1" generates "ptp4l_pid_eth0_eth1"). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
The extra_args argument ($3) of isochron_recv_start is overwritten with uds ($2), if that argument exists. This is currently not a problem, because the only TSN selftest (ocelot/psfp.sh) omits remote sync so it does not specify to the receiver a UNIX domain socket for ptp4l. So $uds is currently an empty string. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Yang Yingliang authored
This patch adds missing MODULE_DEVICE_TABLE definition which generates correct modalias for automatic loading of this driver when it is built as an external module. Fixes: bc93e19d ("net: ethernet: adi: Add ADIN1110 support") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20220922070438.586692-1-yangyingliang@huawei.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Wei Yongjun authored
SPI devices use the spi_device_id for module autoloading even on systems using device tree, after commit 5fa6863b ("spi: Check we have a spi_device_id for each DT compatible"), kernel warns as follows since the spi_device_id is missing: SPI driver mse102x has no spi_device_id for vertexcom,mse1021 SPI driver mse102x has no spi_device_id for vertexcom,mse1022 Add spi_device_id entries to silence the warnings, and ensure driver module autoloading works. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Link: https://lore.kernel.org/r/20220922065717.1448498-1-weiyongjun@huaweicloud.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Wei Yongjun authored
In case of error, the function get_phy_device() returns ERR_PTR() and never returns NULL. The NULL test in the return value check should be replaced with IS_ERR(). Fixes: bc93e19d ("net: ethernet: adi: Add ADIN1110 support") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Link: https://lore.kernel.org/r/20220922021023.811581-1-weiyongjun@huaweicloud.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Arun Ramadoss says: ==================== net: dsa: microchip: ksz9477: enable interrupt for internal phy link detection This patch series implements the common interrupt handling for ksz9477 based switches and lan937x. The ksz9477 and lan937x has similar interrupt registers except ksz9477 has 4 port based interrupts whereas lan937x has 6 interrupts. The patch moves the phy interrupt hanler implemented in lan937x_main.c to ksz_common.c, along with the mdio_register functionality. ==================== Link: https://lore.kernel.org/r/20220922071028.18012-1-arun.ramadoss@microchip.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Arun Ramadoss authored
Config_intr and handle_interrupt are enabled for ksz9477 phy. It is similar to all other phys in the micrel phys. Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Arun Ramadoss authored
The global port interrupt routines and individual ports interrupt routines has similar implementation except the mask & status register and number of nested irqs in them. The mask & status register and pointer to ksz_device is added to ksz_irq and uses the ksz_irq as irq_chip_data. Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Arun Ramadoss authored
To support the phy link detection through interrupt method for ksz9477 based switch, the interrupt handling routines are moved from lan937x_main.c to ksz_common.c. The only changes made are functions names are prefixed with ksz_ instead of lan937x_. Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Arun Ramadoss authored
Currently, if the mdio node is not present in the dts file then lan937x_mdio_register return -ENODEV and entire probing process fails. To make the mdio_register generic for all ksz series switches and to maintain back-compatibility with existing dts file, return -ENODEV is replaced with return 0. Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Arun Ramadoss authored
In the lan937x_mdio_register function, phy interrupts are enabled irrespective of irq is enabled in the switch. Now, the check is added to enable the phy interrupt only if the irq is enabled in the switch. Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Arun Ramadoss authored
Currently the number of port irqs is hard coded for the lan937x switch as 6. In order to make the generic interrupt handler for ksz switches, number of port irq supported by the switch is added to the ksz_chip_data. It is 4 for ksz9477, 2 for ksz9897 and 3 for ksz9567. Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
taprio_dev_notifier() subscribes to netdev state changes in order to determine whether interfaces which have a taprio root qdisc have changed their link speed, so the internal calculations can be adapted properly. The 'qdev' temporary variable serves no purpose, because we just use it only once, and can just as well use qdisc_dev(q->root) directly (or the "dev" that comes from the netdev notifier; this is because qdev is only interesting if it was the subject of the state change, _and_ its root qdisc belongs in the taprio list). The 'found' variable also doesn't really serve too much of a purpose either; we can just call taprio_set_picos_per_byte() within the loop, and exit immediately afterwards. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Link: https://lore.kernel.org/r/20220923145921.3038904-1-vladimir.oltean@nxp.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Gaosheng Cui says: ==================== Remove useless inline functions from net ==================== Link: https://lore.kernel.org/r/20220922083857.1430811-1-cuigaosheng1@huawei.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-