- 15 Jun, 2022 4 commits
-
-
Petr Machata authored
This reverts commit 923ba95e ("Merge branch 'mlxsw-spectrum-prepare-for-xm-implementation-lpm-trees'"). Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
This reverts commit e7086213 ("Merge branch 'mlxsw-spectrum-prepare-for-xm-implementation-prefix-insertion-and-removal'"). Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Petr Machata authored
This reverts commit 75c2a8fe ("Merge branch 'mlxsw-introduce-initial-xm-router-support'"). Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linuxJakub Kicinski authored
Saeed Mahameed says: ==================== mlx5-next: updates 2022-06-14 1) Updated HW bits and definitions for upcoming features 1.1) vport debug counters 1.2) flow meter 1.3) Execute ASO action for flow entry 1.4) enhanced CQE compression 2) Add ICM header-modify-pattern RDMA API Leon Says ========= SW steering manipulates packet's header using "modifying header" actions. Many of these actions do the same operation, but use different data each time. Currently we create and keep every one of these actions, which use expensive and limited resources. Now we introduce a new mechanism - pattern and argument, which splits a modifying action into two parts: 1. action pattern: contains the operations to be applied on packet's header, mainly set/add/copy of fields in the packet 2. action data/argument: contains the data to be used by each operation in the pattern. This way we reuse same patterns with different arguments to create new modifying actions, and since many actions share the same operations, we end up creating a small number of patterns that we keep in a dedicated cache. These modify header patterns are implemented as new type of ICM memory, so the following kernel patch series add the support for this new ICM type. ========== * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5: Add bits and fields to support enhanced CQE compression net/mlx5: Remove not used MLX5_CAP_BITS_RW_MASK net/mlx5: group fdb cleanup to single function net/mlx5: Add support EXECUTE_ASO action for flow entry net/mlx5: Add HW definitions of vport debug counters net/mlx5: Add IFC bits and enums for flow meter RDMA/mlx5: Support handling of modify-header pattern ICM area net/mlx5: Manage ICM of type modify-header pattern net/mlx5: Introduce header-modify-pattern ICM properties ==================== Link: https://lore.kernel.org/r/20220614184028.51548-1-saeed@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 14 Jun, 2022 3 commits
-
-
Jakub Kicinski authored
Add missing documentation for the TLS_TX_ZEROCOPY_RO opt-in. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Maxim Mikityanskiy <maximmi@nvidia.com> Link: https://lore.kernel.org/r/20220610180212.110590-1-kuba@kernel.orgSigned-off-by: Paolo Abeni <pabeni@redhat.com>
-
Marco Bonelli authored
Fix the implementation of ethtool_convert_link_mode_to_legacy_u32(), which is supposed to return false if src has bits higher than 31 set. The current implementation uses the complement of bitmap_fill(ext, 32) to test high bits of src, which is wrong as bitmap_fill() fills _with long granularity_, and sizeof(long) can be > 4. No users of this function currently check the return value, so the bug was dormant. Also remove the check for __ETHTOOL_LINK_MODE_MASK_NBITS > 32, as the enum ethtool_link_mode_bit_indices contains far beyond 32 values. Using find_next_bit() to test the src bitmask works regardless of this anyway. Signed-off-by: Marco Bonelli <marco@mebeim.net> Link: https://lore.kernel.org/r/20220609134900.11201-1-marco@mebeim.netSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Rasmus Villemoes authored
There's no point probing for phys on this artificial bus, so we can save a little bit of boot time by telling mdiobus_register() not to do that. This doesn't have any functional change, since, at this point, fixed_mdio_read() returns 0xffff for all addresses/registers, so mdiobus_scan() -> get_phy_device() -> get_phy_c22_id() will return -ENODEV, which is just ignored. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Link: https://lore.kernel.org/r/20220606200208.1665417-1-linux@rasmusvillemoes.dkSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 13 Jun, 2022 20 commits
-
-
Ofer Levi authored
Expose ifc bits and add needed structure fields and methods to support enhanced CQE compression feature. The enhanced CQE compression feature improves cpu utiliziation with better packet latency from nic to host. Signed-off-by: Ofer Levi <oferle@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Shay Drory authored
Remove not used MLX5_CAP_BITS_RW_MASK. While at it, remove CAP_MASK, MLX5_CAP_OFF_CMDIF_CSUM and MLX5_DEV_CAP_FLAG_*, since MLX5_CAP_BITS_RW_MASK was their only user. Signed-off-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Shay Drory authored
Currently, the allocation of fdb software objects are done is single function, oppose to the cleanup of them. Group the cleanup of fdb software objects to single function. Signed-off-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Jianbo Liu authored
Attach flow meter to FTE with object id and index. Use metadata register C5 to store the packet color meter result. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Saeed Mahameed authored
total_q_under_processor_handle - number of queues in error state due to an async error or errored command. send_queue_priority_update_flow - number of QP/SQ priority/SL update events. cq_overrun - number of times CQ entered an error state due to an overflow. async_eq_overrun -number of time an EQ mapped to async events was overrun. comp_eq_overrun - number of time an EQ mapped to completion events was overrun. quota_exceeded_command - number of commands issued and failed due to quota exceeded. invalid_command - number of commands issued and failed dues to any reason other than quota exceeded. Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Michael Guralnik <michaelgur@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Jianbo Liu authored
Add/extend structure layouts and defines for flow meter. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Ariel Levkovich <lariel@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Yevgeny Kliteynik authored
Add support for allocate/deallocate and registering MR of the new type of ICM area. Support exists only for devices that support sw_owner_v2. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Yevgeny Kliteynik authored
Added support for managing new type of ICM for devices that support sw_owner_v2. Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Acked-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Yevgeny Kliteynik authored
Added new fields for device memory capabilities, in order to support creation of ICM memory for modify header patterns. Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Acked-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
-
Yajun Deng authored
__sys_accept4_file() isn't used outside of the file, make it static. As the same time, move file_flags and nofile parameters into __sys_accept4_file(). Signed-off-by: Yajun Deng <yajun.deng@linux.dev> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
sk_memory_allocated_add() has three callers, and returns to them @memory_allocated. sk_forced_mem_schedule() is one of them, and ignores the returned value. Change sk_memory_allocated_add() to return void. Change sock_reserve_memory() and __sk_mem_raise_allocated() to call sk_memory_allocated(). This removes one cache line miss [1] for RPC workloads, as first skbs in TCP write queue and receive queue go through sk_forced_mem_schedule(). [1] Cache line holding tcp_memory_allocated. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Parthiban Veerasooran authored
This patch adds support for Microchip's EVB-LAN8670-USB 10BASE-T1S ethernet device to the existing smsc95xx driver by adding the new USB VID/PID pairs. Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yinjun Zhang authored
48-bit DMA addressing is supported in NFP3800 HW and implemented in NFDK firmware, so enable this feature in driver now. Note that with this change, NFD3 firmware, which doesn't implement 48-bit DMA, cannot be used for NFP3800 any more. RX free list descriptor, used by both NFD3 and NFDK, is also modified to support 48-bit DMA. That's OK because the top bits is always get set to 0 when assigned with 40-bit address. Based on initial work of Jakub Kicinski <jakub.kicinski@netronome.com>. Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Alex Elder says: ==================== net: ipa: simple refactoring This series contains some minor code improvements. The first patch verifies that the configuration is compatible with a recently-defined limit. The second and third rename two fields so they better reflect their use in the code. The next gets rid of an empty function by reworking its only caller. The last two begin to remove the assumption that an event ring is associated with a single channel. Eventually we'll support having multiple channels share an event ring but some more needs to be done before that can happen. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
In gsi_channel_tx_queued(), we report when a transaction gets passed to hardware. Change that function so it takes transaction rather than a channel as its argument, and derive the channel from the transaction. Rename the function accordingly. Delete the header comments above the function definition; the ones above the declaration in "gsi_private.h" should suffice. In addition, the comments above gsi_channel_tx_update() do a fine job of explaining what's going on. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
Each event in an event ring describes the TRE whose completion caused the event. Currently, every event ring is dedicated to a single channel, so the channel is easily derived from the event ring. An event ring can actually be shared by more than one channel though, and to distinguish events for one channel from another, the event structure contains a field indicating which channel the event is associated with. In gsi_event_trans(), use the channel ID in an event to determine which channel the event is for. This makes the channel pointer now passed to that function irrelevant; pass the GSI pointer to that function instead. And although it shouldn't happen, warn if an event arrives that records a channel ID that's not in use, or if the event does not have a transaction associated with it. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
When a GSI transaction completes, ipa_endpoint_trans_complete() is eventually called. That handles TX and RX completions separately, but ipa_endpoint_tx_complete() is a no-op. Instead, have ipa_endpoint_trans_complete() return immediately for a TX transaction, and incorporate code from ipa_endpoint_rx_complete() to handle RX transactions. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
The trans_tre_max field of the IPA endpoint structure is only used to limit the number of fragments allowed for an SKB being prepared for transmission. Recognizing that, rename the field skb_frag_max, and reduce its value by 1. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
Each GSI channel has a TLV FIFO of a certain size, specified in the configuration data for an AP channel. That size dictates the maximum number of TREs that are allowed in a single transaction. The only way that value is used after initialization is as a limit on the number of TREs in a transaction; calling it "tlv_count" isn't helpful, and in fact gsi_channel_trans_tre_max() exists to sort of abstract it. Instead, rename the channel->tlv_count field trans_tre_max, and get rid of the helper function. Update a couple of comments as well. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alex Elder authored
In commit 8797972a ("net: ipa: remove command info pool"), the maximum number of IPA commands that would be sent in a single transaction was defined. That number can't exceed the size of the TLV FIFO on the command channel, and we can check that at runtime. To add this check, pass a new flag to gsi_channel_data_valid() to indicate the channel being checked is being used for IPA commands. Knowing that we can also verify the channel direction is correct. Use a new local variable that refers to the command-specific portion of the data being checked. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 11 Jun, 2022 3 commits
-
-
Yinjun Zhang authored
Previously the traffic class field is ignored while firmware has already supported to pedit flowinfo fields, including traffic class and flow label, now add it back. Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com> Link: https://lore.kernel.org/r/20220609080136.151830-1-simon.horman@corigine.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Bin Chen authored
The commit a14857c2 ("rtnetlink: verify rate parameters for calls to ndo_set_vf_rate") has been merged to master, so we can to remove the now-duplicate checks in drivers. Signed-off-by: Bin Chen <bin.chen@corigine.com> Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20220609084717.155154-1-simon.horman@corigine.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queueJakub Kicinski authored
Tony Nguyen says: ==================== 10GbE Intel Wired LAN Driver Updates 2022-06-09 Maximilian Heyne adds reporting of VF statistics on ixgbe via iproute2 interface. Kai-Heng Feng removes duplicate defines from igb. Jiaqing Zhao fixes typos in e1000, ixgb, and ixgbe drivers. Julia Lawall fixes typos for fm10k, ixgbe, and ice drivers. * '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: drivers/net/ethernet/intel: fix typos in comments ixgbe: Fix typos in comments ixgb: Fix typos in comments e1000: Fix typos in comments igb: Remove duplicate defines drivers, ixgbe: export vf statistics ==================== Link: https://lore.kernel.org/r/20220609171257.2727150-1-anthony.l.nguyen@intel.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 10 Jun, 2022 10 commits
-
-
Jakub Kicinski authored
Eric Dumazet says: ==================== net: reduce tcp_memory_allocated inflation Hosts with a lot of sockets tend to hit so called TCP memory pressure, leading to very bad TCP performance and/or OOM. The problem is that some TCP sockets can hold up to 2MB of 'forward allocations' in their per-socket cache (sk->sk_forward_alloc), and there is no mechanism to make them relinquish their share under mem pressure. Only under some potentially rare events their share is reclaimed, one socket at a time. In this series, I implemented a per-cpu cache instead of a per-socket one. Each CPU has a +1/-1 MB (256 pages on x86) forward alloc cache, in order to not dirty tcp_memory_allocated shared cache line too often. We keep sk->sk_forward_alloc values as small as possible, to meet memcg page granularity constraint. Note that memcg already has a per-cpu cache, although MEMCG_CHARGE_BATCH is defined to 32 pages, which seems a bit small. Note that while this cover letter mentions TCP, this work is generic and supports TCP, UDP, DECNET, SCTP. ==================== Link: https://lore.kernel.org/r/20220609063412.2205738-1-eric.dumazet@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Eric Dumazet authored
These two helpers are only used from core networking. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Eric Dumazet authored
Currently, tcp_memory_allocated can hit tcp_mem[] limits quite fast. Each TCP socket can forward allocate up to 2 MB of memory, even after flow became less active. 10,000 sockets can have reserved 20 GB of memory, and we have no shrinker in place to reclaim that. Instead of trying to reclaim the extra allocations in some places, just keep sk->sk_forward_alloc values as small as possible. This should not impact performance too much now we have per-cpu reserves: Changes to tcp_memory_allocated should not be too frequent. For sockets not using SO_RESERVE_MEM: - idle sockets (no packets in tx/rx queues) have zero forward alloc. - non idle sockets have a forward alloc smaller than one page. Note: - Removal of SK_RECLAIM_CHUNK and SK_RECLAIM_THRESHOLD is left to MPTCP maintainers as a follow up. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Eric Dumazet authored
If sk->sk_forward_alloc is 150000, and we need to schedule 150001 bytes, we want to allocate 1 byte more (rounded up to one page), instead of 150001 :/ Fixes: 1da177e4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Eric Dumazet authored
We plan keeping sk->sk_forward_alloc as small as possible in future patches. This means we are going to call sk_memory_allocated_add() and sk_memory_allocated_sub() more often. Implement a per-cpu cache of +1/-1 MB, to reduce number of changes to sk->sk_prot->memory_allocated, which would otherwise be cause of false sharing. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Eric Dumazet authored
Each protocol having a ->memory_allocated pointer gets a corresponding per-cpu reserve, that following patches will use. Instead of having reserved bytes per socket, we want to have per-cpu reserves. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Eric Dumazet authored
Due to memcg interface, SK_MEM_QUANTUM is effectively PAGE_SIZE. This might change in the future, but it seems better to avoid the confusion. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Eric Dumazet authored
This reverts commit bd68a2a8. This change broke memcg on arches with PAGE_SIZE != 4096 Later, commit 2bb2f5fb ("net: add new socket option SO_RESERVE_MEM") also assumed PAGE_SIZE==SK_MEM_QUANTUM Following patches in the series will greatly reduce the over allocations problem. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski authored
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linuxLinus Torvalds authored
Pull more devicetree fixes from Rob Herring: - More DT meta-schema check fixes from new bindings in merge window - Fix stale DT binding references from Mauro - Update various binding maintainers - Fix in arm,malidp properties to match reality - Add deprecated 'atheros' vendor prefix * tag 'devicetree-fixes-for-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: dt-bindings: display: arm,malidp: remove bogus RQOS property dt-bindings: pinctrl: ralink: Fix 'enum' lists with duplicate entries dt-bindings: Drop more redundant 'maxItems/minItems' in if/then schemas dt-bindings: nvme: apple,nvme-ans: Drop 'maxItems' from 'apple,sart' MAINTAINERS: rectify entries for ARM DRM DRIVERS after dt conversion MAINTAINERS: update snps,axs10x-reset.yaml reference MAINTAINERS: update dongwoon,dw9807-vcm.yaml reference MAINTAINERS: update cortina,gemini-ethernet.yaml reference dt-bindings: mfd: rk808: update rockchip,rk808.yaml reference dt-bindings: reset: update st,stih407-powerdown.yaml references dt-bindings: arm: update vexpress-config.yaml references dt-bindings: interrupt-controller: update brcm,l2-intc.yaml reference dt-bindings: mfd: bd9571mwv: update rohm,bd9571mwv.yaml reference dt-bindings: update Luca Ceresoli's e-mail address dt-bindings: msm: update maintainers list with proper id dt-bindings: vendor-prefixes: document deprecated Atheros dt-bindings: Update QCOM USB subsystem maintainer information
-