- 28 Nov, 2020 6 commits
-
-
Jakub Kicinski authored
Ido Schimmel says: ==================== mlxsw: Update adjacency index more efficiently The device supports an operation that allows the driver to issue one request to update the adjacency index for all the routes in a given virtual router (VR) from old index and size to new ones. This is useful in case the configuration of a certain nexthop group is updated and its adjacency index changes. Currently, the driver does not use this operation in an efficient manner. It iterates over all the routes using the nexthop group and issues an update request for the VR if it is not the same as the previous VR. Instead, this patch set tracks the VRs in which the nexthop group is used and issues one request for each VR. Example: 8k IPv6 routes were added in an alternating manner to two VRFs. All the routes are using the same nexthop object ('nhid 1'). Before: Performance counter stats for 'ip nexthop replace id 1 via 2001:db8:1::2 dev swp3': 16,385 devlink:devlink_hwmsg 4.255933213 seconds time elapsed 0.000000000 seconds user 0.666923000 seconds sys Number of EMAD transactions corresponds to number of routes using the nexthop group. After: Performance counter stats for 'ip nexthop replace id 1 via 2001:db8:1::2 dev swp3': 3 devlink:devlink_hwmsg 0.077655094 seconds time elapsed 0.000000000 seconds user 0.076698000 seconds sys Number of EMAD transactions corresponds to number of VRFs / VRs. Patch set overview: Patch #1 is a fix for a bug introduced in previous submission. Detected by Coverity. Patches #2 and #3 are preparations. Patch #4 tracks the VRs a nexthop group is member of. Patch #5 uses the membership tracking from the previous patch to issue one update request per each VR. ==================== Link: https://lore.kernel.org/r/20201125193505.1052466-1-idosch@idosch.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Ido Schimmel authored
The device supports an operation that allows the driver to issue one request to update the adjacency index for all the routes in a given virtual router (VR) from old index and size to new ones. This is useful in case the configuration of a certain nexthop group is updated and its adjacency index changes. Currently, the driver does not use this operation in an efficient manner. It iterates over all the routes using the nexthop group and issues an update request for the VR if it is not the same as the previous VR. Instead, use the VR tracking added in the previous patch to update the adjacency index once for each VR currently using the nexthop group. Example: 8k IPv6 routes were added in an alternating manner to two VRFs. All the routes are using the same nexthop object ('nhid 1'). Before: # perf stat -e devlink:devlink_hwmsg --filter='incoming==0' -- ip nexthop replace id 1 via 2001:db8:1::2 dev swp3 Performance counter stats for 'ip nexthop replace id 1 via 2001:db8:1::2 dev swp3': 16,385 devlink:devlink_hwmsg 4.255933213 seconds time elapsed 0.000000000 seconds user 0.666923000 seconds sys Number of EMAD transactions corresponds to number of routes using the nexthop group. After: # perf stat -e devlink:devlink_hwmsg --filter='incoming==0' -- ip nexthop replace id 1 via 2001:db8:1::2 dev swp3 Performance counter stats for 'ip nexthop replace id 1 via 2001:db8:1::2 dev swp3': 3 devlink:devlink_hwmsg 0.077655094 seconds time elapsed 0.000000000 seconds user 0.076698000 seconds sys Number of EMAD transactions corresponds to number of VRFs / VRs. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
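The difference between the two strategies can be modeled in a few lines. This is an illustrative userspace sketch, not driver code: the function names and the representation of each route by its VR identifier are assumptions made for the example, and it counts only adjacency-update requests, so it is not meant to reproduce the perf figures quoted above.

```python
def requests_per_vr_change(route_vrs):
    """Old strategy: walk the routes and issue an update request
    whenever the VR differs from the VR of the previous route."""
    requests = 0
    previous_vr = None
    for vr in route_vrs:
        if vr != previous_vr:
            requests += 1
            previous_vr = vr
    return requests


def requests_per_tracked_vr(route_vrs):
    """New strategy: issue exactly one request per VR using the group."""
    return len(set(route_vrs))


# 8k routes added alternately to two VRs (hypothetical VR ids 0 and 1).
routes = [i % 2 for i in range(8192)]
before = requests_per_vr_change(routes)
after = requests_per_tracked_vr(routes)
print(before, after)
```

Because the routes alternate between the two VRs, the old walk sees a VR change at every route, which is the worst case for that strategy.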
-
Ido Schimmel authored
For each nexthop group, track in which virtual routers (VRs) the group is used. This is going to be used by the next patch to perform a more efficient adjacency index update whenever the group's adjacency index changes. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
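One plausible way to do such tracking is a reference count per virtual router. The sketch below is a hypothetical model only: the class name, the `(vr_id, proto)` key, and the use of a counter are assumptions for illustration and are not taken from the driver.

```python
from collections import Counter


class GroupVrTracker:
    """Hypothetical model: count how many routes in each (vr_id, proto)
    pair reference one nexthop group."""

    def __init__(self):
        self._refs = Counter()

    def link(self, vr_id, proto):
        """Record that one more route in this VR uses the group."""
        self._refs[(vr_id, proto)] += 1

    def unlink(self, vr_id, proto):
        """Drop one reference; forget the VR when none remain."""
        key = (vr_id, proto)
        if self._refs[key] == 0:
            raise KeyError(key)
        self._refs[key] -= 1
        if self._refs[key] == 0:
            del self._refs[key]

    def vrs(self):
        """VRs that currently use the group, one entry each."""
        return sorted(self._refs)


tracker = GroupVrTracker()
tracker.link(1, "ipv6")
tracker.link(1, "ipv6")
tracker.link(2, "ipv6")
tracker.unlink(1, "ipv6")
print(tracker.vrs())
```

With a structure like this, the set of VRs to update is available directly, without walking every route that uses the group.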
-
Ido Schimmel authored
In the rare case where the adjacency pointer cannot be updated for a given virtual router, roll back the operation so that virtual routers that are already using the new index will use the old one again. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
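The rollback pattern described here can be sketched as follows. This is a minimal model under stated assumptions: `mass_update` and the `update_vr` callback are hypothetical names, and the callback stands in for the per-VR request.

```python
def mass_update(vr_ids, update_vr, old_index, new_index):
    """Move every VR from old_index to new_index; undo on failure.

    update_vr(vr_id, src, dst) is a caller-supplied stand-in for the
    per-VR request and returns True on success.
    """
    done = []
    for vr_id in vr_ids:
        if not update_vr(vr_id, old_index, new_index):
            # Revert the VRs that already switched to the new index.
            for updated in reversed(done):
                update_vr(updated, new_index, old_index)
            return False
        done.append(vr_id)
    return True


# Usage: VR 3 rejects the new index, so VRs 1 and 2 are moved back.
state = {1: 10, 2: 10, 3: 10}


def fake_update(vr_id, src, dst):
    if vr_id == 3 and dst == 20:
        return False
    state[vr_id] = dst
    return True


ok = mass_update([1, 2, 3], fake_update, 10, 20)
print(ok, state)
```

The sketch does not handle a failure during the rollback itself; what real code should do in that case is outside the scope of this example.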
-
Ido Schimmel authored
mlxsw_sp_adj_index_mass_update_vr() only needs the virtual router's identifier and protocol, so pass them directly. In a subsequent patch the caller will not have access to the pointer. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Ido Schimmel authored
Return error to the caller instead of suppressing it. Fixes: e3ddfb45 ("mlxsw: spectrum_router: Allow returning errors from mlxsw_sp_nexthop_group_refresh()") Addresses-Coverity: ("Error handling issues (CHECKED_RETURN)") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 27 Nov, 2020 10 commits
-
-
Jakub Kicinski authored
wenxu says: ==================== net/sched: fix over mtu packet of defrag in Currently the kernel tc subsystem can do conntrack in act_ct. But when several fragmented packets go through act_ct, the function tcf_ct_handle_fragments will defragment them into one big packet. The last action may then redirect (mirred) to a device, which can leave the reassembled packet larger than the MTU of the target device. The first patch fixes the missing initialization of qdisc_skb_cb->mru. The second one refactors the handling of xmit in act_mirred and prepares for the third one. The last one adds implicit packet fragmentation support to fix the over-MTU packets produced by defragmentation in act_ct. ==================== Link: https://lore.kernel.org/r/1606276883-6825-1-git-send-email-wenxu@ucloud.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
wenxu authored
Currently the kernel tc subsystem can do conntrack in act_ct. But when several fragmented packets go through act_ct, the function tcf_ct_handle_fragments will defragment them into one big packet. The last action may then redirect (mirred) to a device, which can leave the reassembled packet larger than the MTU of the target device. This patch adds support for an xmit hook to mirred, which gets executed before transmitting the packet. Then, when act_ct gets loaded, it configures that hook. The frag xmit hook may be reused by other modules. Signed-off-by: wenxu <wenxu@ucloud.cn> Acked-by: Cong Wang <cong.wang@bytedance.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
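The hook mechanism can be pictured with a small model. Everything named here (`Mirred`, `frag_xmit_hook`, `fragment`) is hypothetical, the byte slicing ignores IP and link-layer headers entirely, and treating a zero `mru` as "not reassembled" follows the series' statement that only packets defragmented in act_ct set the value.

```python
def fragment(payload, limit):
    """Split payload into chunks of at most limit bytes (headers ignored)."""
    return [payload[i:i + limit] for i in range(0, len(payload), limit)]


def frag_xmit_hook(payload, mru, send):
    """Hook run before transmit: re-fragment only reassembled packets.

    mru == 0 means the packet was never defragmented, so pass it through.
    """
    if mru and len(payload) > mru:
        for piece in fragment(payload, mru):
            send(piece)
    else:
        send(payload)


class Mirred:
    """Redirect action with an optional, externally installed xmit hook."""

    def __init__(self, send):
        self.send = send
        self.xmit_hook = None

    def redirect(self, payload, mru=0):
        if self.xmit_hook is not None:
            self.xmit_hook(payload, mru, self.send)
        else:
            self.send(payload)


sent = []
mirred = Mirred(sent.append)
mirred.xmit_hook = frag_xmit_hook        # what loading act_ct would configure
mirred.redirect(b"x" * 3000, mru=1500)   # reassembled packet, too large
mirred.redirect(b"y" * 100)              # ordinary packet, mru left at 0
print([len(p) for p in sent])
```

Keeping the hook optional means the redirect path is unchanged when no module has installed one.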
-
wenxu authored
This patch prepares for the next one. Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
wenxu authored
The mru in the qdisc_skb_cb should be initialized to 0. Only packets defragmented in act_ct will set the value. Fixes: 038ebb1a ("net/sched: act_ct: fix miss set mru for ovs after defrag in act_ct") Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Vadim Fedorenko says: ==================== Add CHACHA20-POLY1305 cipher to Kernel TLS RFC 7905 defines usage of ChaCha20-Poly1305 in TLS connections. This cipher is widely used nowadays and it is good to have support for it in TLS connections in the kernel. ==================== Link: https://lore.kernel.org/r/1606231490-653-1-git-send-email-vfedorenko@novek.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vadim Fedorenko authored
Add new cipher as a variant of standard tls selftests Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vadim Fedorenko authored
Add ChaCha-Poly specific configuration code. Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vadim Fedorenko authored
RFC 7905 defines special behavior for ChaCha-Poly TLS sessions. The differences are in the calculation of the nonce and the absence of an explicit IV. This behavior partly resembles TLSv1.3. Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
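The nonce calculation from RFC 7905 is short enough to show directly: the 64-bit record sequence number is serialized big-endian, left-padded with four zero bytes, and XORed with the 12-byte write IV. The sketch below illustrates only that rule; the function name is an assumption, and this is not the kernel implementation.

```python
def chacha20_poly1305_nonce(write_iv: bytes, seq: int) -> bytes:
    """Per-record nonce as described in RFC 7905.

    write_iv is the 12-byte client_write_IV or server_write_IV;
    seq is the 64-bit record sequence number.
    """
    if len(write_iv) != 12:
        raise ValueError("write IV must be 12 bytes")
    padded_seq = bytes(4) + seq.to_bytes(8, "big")
    return bytes(a ^ b for a, b in zip(write_iv, padded_seq))


iv = bytes(range(12))
print(chacha20_poly1305_nonce(iv, 0).hex())
print(chacha20_poly1305_nonce(iv, 1).hex())
```

Because the nonce is derived from the sequence number, nothing needs to be sent per record, which is why there is no explicit IV on the wire.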
-
Vadim Fedorenko authored
To provide support for ChaCha-Poly cipher we need to define specific constants and structures. Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vadim Fedorenko authored
Inline functions defined in tls.h have a lot of AES-specific constants. Remove these constants and change the argument to struct tls_prot_info to have access to the cipher type in later patches. Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 26 Nov, 2020 11 commits
-
-
Jakub Kicinski authored
Tony Nguyen says: ==================== 40GbE Intel Wired LAN Driver Updates 2020-11-24 This series contains updates to i40e and igbvf drivers. Marek removes a redundant assignment for i40e. Stefan Assmann corrects reporting of VF link speed for i40e. Karen revises a couple of error messages to warnings for igbvf as they could be misinterpreted as issues when they are not. v2: Dropped PTP patch as it's being updated. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: igbvf: Refactor traces i40e: report correct VF link speed when link state is set to enable i40e: remove redundant assignment ==================== Link: https://lore.kernel.org/r/20201124165245.2844118-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Chris Packham says: ==================== net: dsa: mv88e6xxx: serdes link without phy This small series gets my hardware into a working state. The key points are to make sure we don't force the link and that we ask the MAC for the link status. I also have updated my dts to say `phy-mode = "1000base-x";` and `managed = "in-band-status";` I've dropped the patch for the 88E6123 as it's a distraction and I lack hardware to do any proper testing with it. Earlier versions are on the mailing list if anyone wants to pick it up in the future. I notice there's a series for mv88e6393x circulating on the netdev mailing list. As patch #1 is adding a new device specific op, either this series will need updating to cover the mv88e6393x or the mv88e6393x series will need updating for the new op, depending on which lands first. ==================== Link: https://lore.kernel.org/r/20201124043440.28400-1-chris.packham@alliedtelesis.co.nz Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Chris Packham authored
If the underlying read operation failed we would end up writing stale data to the supplied buffer. This would end up with the last successfully read value repeating. Fix this by only writing the data when we know the read was good. This will mean that failed values will return 0xffff. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Chris Packham authored
The MV88E6097 presents the serdes interrupts for ports 8 and 9 via the Switch Global 2 registers. There is no additional layer of enabling/disabling the serdes interrupts like other mv88e6xxx switches. Even though most of the serdes behaviour is the same as the MV88E6185, that chip does not provide interrupts for serdes events, so unlike earlier commits the functions added here are specific to the MV88E6097. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Chris Packham authored
Implement serdes_power, serdes_get_lane and serdes_pcs_get_state ops for the MV88E6097/6095/6185 so that ports 8 & 9 can be supported as serdes ports and directly connected to other network interfaces or to SFPs without a PHY. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Chris Packham authored
When a port is configured with 'managed = "in-band-status"' switch chips like the 88E6390 need to propagate the SERDES link state to the MAC because the link state is not correctly detected. This causes problems on the 88E6185/88E6097 where the link partner won't see link state changes because we're forcing the link. To address this introduce a new device specific op port_sync_link() and push the logic from mv88e6xxx_mac_link_up() into that. Provide an implementation for the 88E6185 like devices which doesn't force the link. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Christian Eggers says: ==================== dt-bindings: net: dsa: microchip: convert KSZ bindings to yaml These patches are originally from the series "net: dsa: microchip: PTP support for KSZ956x". As the device tree conversion to yaml is not really related to the PTP patches and the original series is going to take more time than I expected, I would like to split this. Changes (original series -> v1) -------------------------------- - dts: moved "allOf" below "maintainers" - dts: use "unevaluatedProperties" instead of "additionalProperties" - dts: removed "spi-cpha" and "spi-cpol" flags as the hardware is fixed - ksz8795: setup SPI for mode 3 - ksz9477: ditto ==================== Link: https://lore.kernel.org/r/20201120112107.16334-1-ceggers@arri.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Christian Eggers authored
This should be done in the device driver instead of the device tree. Signed-off-by: Christian Eggers <ceggers@arri.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Christian Eggers authored
This should be done in the device driver instead of the device tree. Signed-off-by: Christian Eggers <ceggers@arri.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Christian Eggers authored
The dsa.yaml device tree binding allows "ethernet-ports" (preferred) and "ports". Signed-off-by: Christian Eggers <ceggers@arri.de> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Christian Eggers authored
Convert the bindings document for Microchip KSZ Series Ethernet switches from txt to yaml. Remove the spi-cpha and spi-cpol flags, as this should be handled by the device driver. Signed-off-by: Christian Eggers <ceggers@arri.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 25 Nov, 2020 13 commits
-
-
Jakub Kicinski authored
Yunsheng Lin says: ==================== Add an assert in napi_consume_skb() This patch introduces a lockdep_assert_in_softirq() interface and uses it to assert the case when napi_consume_skb() is not called in the softirq context. ==================== Link: https://lore.kernel.org/r/1606214969-97849-1-git-send-email-linyunsheng@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Yunsheng Lin authored
Use lockdep_assert_in_softirq() to assert the case when napi_consume_skb() is not called in an atomic softirq context. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Yunsheng Lin authored
The current semantics of napi_consume_skb() are that the caller needs to provide a non-zero budget when calling from NAPI context. Breaking these semantics causes hard-to-debug problems, because _kfree_skb_defer() needs to run in atomic context in order to push the skb to the particular CPU's napi_alloc_cache atomically. So add lockdep_assert_in_softirq() to assert when the running context is not in_softirq. Note that in_softirq means either that a softirq is being served or that BH is disabled, which makes its semantics ambiguous, so add a comment to emphasize that. Since softirq context can be interrupted by hard IRQ or NMI context, lockdep_assert_in_softirq() needs to assert about hard IRQ and NMI context too. Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Rikard Falkeborn authored
These are only used as input arguments to qmi_handle_init() which accepts const pointers to both qmi_ops and qmi_msg_handler. Make them const to allow the compiler to put them in read-only memory. Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com> Acked-by: Alex Elder <elder@linaro.org> Link: https://lore.kernel.org/r/20201122234031.33432-2-rikard.falkeborn@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Paolo Abeni authored
We can enter the main mptcp_recvmsg() loop even when no subflows are connected. As noted by Eric, that would result in a divide by zero oops on ack generation. Address the issue by checking the subflow status before sending the ack. Additionally protect mptcp_recvmsg() against invocation with weird socket states. v1 -> v2: - removed unneeded inline keyword - Jakub Reported-and-suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Fixes: ea4ca586 ("mptcp: refine MPTCP-level ack scheduling") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/5370c0ae03449239e3d1674ddcfb090cf6f20abe.1606253206.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Horatiu Vultur authored
Extend MRP to support LC mode (link check) for the interconnect port. This applies only to the interconnect ring. As opposed to RC mode (ring check), LC mode uses CFM frames to detect when the link goes up or down, and userspace needs to react based on that. One advantage of LC mode over RC mode is that there are fewer frames in the normal rings, because RC mode generates InTest frames on all ports while LC mode sends CFM frames only on the interconnect port. All 4 nodes that are part of the interconnect ring need to have the same mode, and it is not possible to run LC and RC mode at the same time on a node. Whenever the MIM starts, it needs to detect the status of the other 3 nodes in the interconnect ring, so it sends a frame called InLinkStatus, to which the clients need to reply with their link status. This patch adds the InLinkStatus frame type and extends the existing rules on how to forward this frame. Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Link: https://lore.kernel.org/r/20201124082525.273820-1-horatiu.vultur@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vlad Buslov authored
Currently both filter and action flags use the same "TCA_" prefix, which makes them hard to distinguish in code and confusing for users. Create aliases for existing action flags constants with the "TCA_ACT_" prefix. Signed-off-by: Vlad Buslov <vlad@buslov.dev> Link: https://lore.kernel.org/r/20201124164054.893168-1-vlad@buslov.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Florian Westphal authored
On close this timer might be scheduled. mptcp uses sk_reset_timer for this, so a reference on the mptcp socket is taken. This causes a refcount leak which can for example be reproduced with 'mp_join_server_v4.pkt' from the mptcp-packetdrill repo. The leak has nothing to do with join requests, v1_mp_capable_bind_no_cs.pkt works too when replacing the last ack mpcapable to v1 instead of v0. unreferenced object 0xffff888109bba040 (size 2744): comm "packetdrill", [..] backtrace: [..] sk_prot_alloc.isra.0+0x2b/0xc0 [..] sk_clone_lock+0x2f/0x740 [..] mptcp_sk_clone+0x33/0x1a0 [..] subflow_syn_recv_sock+0x2b1/0x690 [..] Fixes: e16163b6 ("mptcp: refactor shutdown and close") Cc: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20201124162446.11448-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Antonio Borneo authored
The rtl8211f supports downshift, and before commit 5502b218 ("net: phy: use phy_resolve_aneg_linkmode in genphy_read_status") the read-back of register MII_CTRL1000 was used to detect the negotiated link speed. The code added in commit d445dff2 ("net: phy: realtek: read actual speed to detect downshift") works fine for this phy as well, and it is trivial to reuse it to restore the downshift detection on rtl8211f. Add the phy specific read_status() pointing to the existing function rtlgen_read_status(). Signed-off-by: Antonio Borneo <antonio.borneo@st.com> Link: https://lore.kernel.org/r/478f871a-583d-01f1-9cc5-2eea56d8c2a7@huawei.com Tested-by: Yonglong Liu <liuyonglong@huawei.com> Link: https://lore.kernel.org/r/20201124230756.887925-1-antonio.borneo@st.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Christian Eggers says: ==================== net: ptp: use common defines for PTP message types in further drivers This series replaces further driver internal enumeration / uses of magic numbers with the newly introduced PTP_MSGTYPE_* defines. ==================== Link: https://lore.kernel.org/r/20201124074418.2609-1-ceggers@arri.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Christian Eggers authored
Use recently introduced PTP_MSGTYPE_SYNC and PTP_MSGTYPE_DELAY_REQ defines instead of a driver internal enumeration. Signed-off-by: Christian Eggers <ceggers@arri.de> Reviewed-by: Antoine Tenart <atenart@kernel.org> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
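The benefit of shared message-type names over per-driver enumerations or magic numbers can be illustrated with a small userspace model. The numeric values are the IEEE 1588 messageType codes, which sit in the low nibble of the first PTP header byte. The Python names are hypothetical stand-ins for the kernel's PTP_MSGTYPE_* defines.

```python
from enum import IntEnum


class PtpMsgType(IntEnum):
    """IEEE 1588 messageType values for the event messages."""
    SYNC = 0x0
    DELAY_REQ = 0x1
    PDELAY_REQ = 0x2
    PDELAY_RESP = 0x3


def ptp_msgtype(header: bytes) -> int:
    """Extract messageType: low nibble of the first PTP header byte."""
    return header[0] & 0x0F


# A 34-byte PTP header whose first byte carries messageType 0x1
# (the high nibble is set here only to show that it is masked off).
header = bytes([0x11]) + bytes(33)
if ptp_msgtype(header) == PtpMsgType.DELAY_REQ:
    print("Delay_Req")
```

Comparing against a named constant such as `PtpMsgType.DELAY_REQ` reads better than comparing against a bare `1`, which is the kind of cleanup this series applies across drivers.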
-
Christian Eggers authored
Use recently introduced PTP wide defines instead of a driver internal enumeration. Signed-off-by: Christian Eggers <ceggers@arri.de> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Cc: Petr Machata <petrm@mellanox.com> Cc: Jiri Pirko <jiri@nvidia.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Christian Eggers authored
Replace use of magic number with recently introduced define. Signed-off-by: Christian Eggers <ceggers@arri.de> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-