Commits · c6a9d360874a41dc972c44c0949916da55199f85 · Kirill Smelkov / linux

05 Oct, 2022 7 commits

wifi: rtw89: phy: ignore warning of bb gain cfg_type 4 · c6a9d360

Ping-Ke Shih authored Sep 30, 2022

The new BB parameters add a new cfg_type 4 to improve performance of eFEM
modules (rfe_type >= 50), but we are using iFEM modules for now, so this
warning can be ignored.
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220930133659.7789-2-pkshih@realtek.com

c6a9d360

wifi: rtw89: 8852c: update BB parameters to v28 · a9ee25c3

Ping-Ke Shih authored Sep 30, 2022

Update BB parameters along with internal tag HALBB_027_067_07.
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220930133659.7789-1-pkshih@realtek.com

a9ee25c3

wifi: rtw89: 8852c: rfk: correct miscoding delay of DPK · 3be11416

Ping-Ke Shih authored Sep 30, 2022

Using mdelay() works, but it makes calibration take too long. Use the
proper udelay() to shorten the time while getting the same result.
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220930133318.6335-2-pkshih@realtek.com

3be11416

wifi: rtw89: 8852c: correct set of IQK backup registers · 68b0ce5b

Ping-Ke Shih authored Sep 30, 2022

IQK can change the values of this register set, so they need to be backed
up and restored. When IQK was rewritten, the policy changed: some values
are controlled and filled by IQK and don't need to be restored after IQK.
Therefore, remove those registers from this array.
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220930133318.6335-1-pkshih@realtek.com

68b0ce5b

brcmfmac: Fix AP interface delete issue · 1562bdef

Prasanna Kerekoppa authored Sep 29, 2022

Fix the AP interface delete issue by making sure the interface is created
with a supported version.
The patch has been verified by creating and deleting an AP interface.
Signed-off-by: Prasanna Kerekoppa <prasanna.kerekoppa@infineon.com>
Signed-off-by: Ian Lin <ian.lin@infineon.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220929050614.31518-4-ian.lin@infineon.com

1562bdef

brcmfmac: support station interface creation version 1, 2 and 3 · 4388827b

Wright Feng authored Sep 29, 2022

To create a virtual station interface for RSDB and VSDB, add support for
interface creation versions 1, 2 and 3.
The structures of each version are different, and only version 3 and
later can get the interface creation version from the firmware side.

The patch has been verified with a ping test on two concurrent stations:
 interface create version 1:
         89342(4359b1)-PCIE: 9.40.100
 interface create version 2:
         4373a0-sdio: 13.10.271
 interface create version 3:
         4373a0-sdio: 13.35.48
Signed-off-by: Wright Feng <wright.feng@cypress.com>
Signed-off-by: Chi-hsien Lin <chi-hsien.lin@infineon.com>
Signed-off-by: Ian Lin <ian.lin@infineon.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220929050614.31518-3-ian.lin@infineon.com

4388827b

brcmfmac: add creating station interface support · 2b5fb30f

Wright Feng authored Sep 29, 2022

An RSDB device can control two station interfaces concurrently, so add
support for creating a station interface and allow the user to create it
via cfg80211.
Signed-off-by: Wright Feng <wright.feng@cypress.com>
Signed-off-by: Chi-hsien Lin <chi-hsien.lin@infineon.com>
Signed-off-by: Ian Lin <ian.lin@infineon.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220929050614.31518-2-ian.lin@infineon.com

2b5fb30f

04 Oct, 2022 18 commits

brcmfmac: dump dongle memory when attaching failed · 5671c8b5

Wright Feng authored Sep 28, 2022

To enhance FW debugging, add a dongle memory dump when attaching fails on
the PCIE bus. It helps developers get more information about the dongle
trap reason and root cause.
Signed-off-by: Wright Feng <wright.feng@cypress.com>
Signed-off-by: Chi-hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Ian Lin <ian.lin@infineon.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220929031001.9962-4-ian.lin@infineon.com

5671c8b5

brcmfmac: return error when getting invalid max_flowrings from dongle · 2aca4f37

Wright Feng authored Sep 28, 2022

When the firmware hits a trap at initialization, the host reads an abnormal
max_flowrings number from the dongle, which causes a kernel panic when
doing iowrite to initialize the dongle ring.
To detect this error at an early stage, return an error directly when
getting an invalid max_flowrings (>256).
Signed-off-by: Wright Feng <wright.feng@cypress.com>
Signed-off-by: Chi-hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Ian Lin <ian.lin@infineon.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220929031001.9962-3-ian.lin@infineon.com

2aca4f37
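
The check described above can be sketched roughly as follows. This is a
minimal user-space illustration, not the driver code: the helper name
check_max_flowrings, the macro name and the choice of -EIO as the error
value are assumptions; only the limit of 256 comes from the commit message.

```c
#include <errno.h>
#include <stdint.h>

/* Upper bound taken from the commit message; the macro name is hypothetical. */
#define SKETCH_MAX_FLOWRINGS 256

/*
 * Hypothetical helper: reject a flowring count that can only come from a
 * trapped or misbehaving firmware, before any ring memory is touched.
 * Returning -EIO is an assumption made for this sketch.
 */
static int check_max_flowrings(uint16_t max_flowrings)
{
	if (max_flowrings > SKETCH_MAX_FLOWRINGS)
		return -EIO;
	return 0;
}
```

A caller would run this right after reading the value from the dongle and
abort the attach path on a negative return, instead of proceeding to the
ring initialization that would otherwise panic.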

brcmfmac: add a timer to read console periodically in PCIE bus · dcb485df

Wright Feng authored Sep 28, 2022

Currently, the host only reads the console buffer when it receives mailbox
data or hits a crash on the PCIE bus. Therefore, add a timer in the PCIE
code to read the console buffer periodically, so that developers and users
can check firmware messages when there is no data transmission between
host and dongle.
Signed-off-by: Wright Feng <wright.feng@cypress.com>
Signed-off-by: Chi-hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Ian Lin <ian.lin@infineon.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220929031001.9962-2-ian.lin@infineon.com

dcb485df

brcmfmac: Fix authentication latency caused by OBSS stats survey · 62ccb2e6

Ramesh Rangavittal authored Sep 28, 2022

Auto Channel Select feature of HostAP uses dump_survey to fetch
OBSS statistics. When the device is in the middle of an authentication
sequence or just at the end of authentication completion, running
dump_survey would trigger a channel change. The channel change in-turn
can cause packet loss, resulting in authentication delay. With this change,
dump_survey won't be run when authentication or association is in progress,
hence resolving the issue.
Signed-off-by: Ramesh Rangavittal <ramesh.rangavittal@infineon.com>
Signed-off-by: Chung-Hsien Hsu <chung-hsien.hsu@infineon.com>
Signed-off-by: Chi-hsien Lin <chi-hsien.lin@infineon.com>
Signed-off-by: Ian Lin <ian.lin@infineon.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220929012527.4152-5-ian.lin@infineon.com

62ccb2e6
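
The guard described above might look like the following sketch. The struct,
its field names and the -EBUSY return value are hypothetical stand-ins for
whatever state the driver actually tracks; the point is only that the
survey is refused while a connection attempt is in flight.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical connection state, invented for this sketch. */
struct conn_state {
	bool auth_in_progress;
	bool assoc_in_progress;
};

/*
 * Hypothetical pre-check for a survey request: while authentication or
 * association is ongoing, refuse the survey so that no channel change can
 * drop frames of the handshake. -EBUSY is an assumed error value.
 */
static int survey_precheck(const struct conn_state *s)
{
	if (s->auth_in_progress || s->assoc_in_progress)
		return -EBUSY;
	return 0;
}
```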

brcmfmac: fix CERT-P2P:5.1.10 failure · 25076fe2

Double Lo authored Sep 28, 2022

This patch fixes the CERT-P2P:5.1.10 failure at step 18, where group
formation failed because the chip was busy with a dump survey. Decrease
the dump survey duration to pass this certification case.
Signed-off-by: Double Lo <double.lo@cypress.com>
Signed-off-by: Chi-hsien Lin <chi-hsien.lin@infineon.com>
Signed-off-by: Ian Lin <ian.lin@infineon.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220929012527.4152-4-ian.lin@infineon.com

25076fe2

brcmfmac: fix firmware trap while dumping obss stats · 216647e6

Wright Feng authored Sep 28, 2022

When doing dump_survey, the host calls the "dump_obss" iovar on the
firmware side. The host needs to make sure the HW clock in the dongle is
on, or there is a high probability that the firmware traps because a
register or shared memory access fails. To fix this, disable mpc while
doing dump obss and set it back afterwards.

[28350.512799] brcmfmac: brcmf_dump_obss: dump_obss error (-52)
[28743.402314] ieee80211 phy0: brcmf_fw_crashed: Firmware has halted or crashed
[28745.869430] brcmfmac: brcmf_sdio_bus_rxctl: resumed on timeout
[28745.877546] brcmfmac: brcmf_sdio_checkdied: firmware trap in dongle
Signed-off-by: Wright Feng <wright.feng@cypress.com>
Signed-off-by: Chi-hsien Lin <chi-hsien.lin@infineon.com>
Signed-off-by: Ian Lin <ian.lin@infineon.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220929012527.4152-3-ian.lin@infineon.com

216647e6
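
The save, disable, call, restore pattern from the message above can be
sketched as below. Everything here is a user-space stand-in: struct
fake_dongle, fake_dump_obss and dump_obss_with_mpc_off are hypothetical
names, and the fake dongle simply fails the dump whenever mpc is still
enabled, to mimic the clock being off.

```c
/* Hypothetical model of the dongle state relevant to this fix. */
struct fake_dongle {
	int mpc;                   /* non-zero: power saving may gate the HW clock */
	int clock_on_during_dump;  /* recorded by the fake dump for inspection */
};

/* Stand-in for the firmware call: it only succeeds when mpc is disabled. */
static int fake_dump_obss(struct fake_dongle *d)
{
	d->clock_on_during_dump = (d->mpc == 0);
	return d->clock_on_during_dump ? 0 : -1;
}

/* Disable mpc around the dump and put the previous setting back afterwards. */
static int dump_obss_with_mpc_off(struct fake_dongle *d)
{
	int saved_mpc = d->mpc;
	int err;

	d->mpc = 0;
	err = fake_dump_obss(d);
	d->mpc = saved_mpc;   /* restore even when the dump failed */
	return err;
}
```

Restoring the saved value rather than forcing mpc back on is a design
choice of this sketch; the commit message only says the setting is "set
back" after the dump.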

brcmfmac: Add dump_survey cfg80211 ops for HostApd AutoChannelSelection · 6c04deae

Wright Feng authored Sep 28, 2022

To enable ACS feature in Hostap daemon, dump_survey cfg80211 ops and dump
obss survey command in firmware side are needed. This patch is for adding
dump_survey feature and adding DUMP_OBSS feature flag to check if
firmware supports dump_obss iovar.
Signed-off-by: Wright Feng <wright.feng@cypress.com>
Signed-off-by: Chi-hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Ian Lin <ian.lin@infineon.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220929012527.4152-2-ian.lin@infineon.com

6c04deae

wifi: rtl8xxxu: gen2: Turn on the rate control · 791082ec

Bitterblue Smith authored Sep 28, 2022

Re-enable the function rtl8xxxu_gen2_report_connect.

It informs the firmware when connecting to a network. This makes the
firmware enable the rate control, which makes the upload faster.

It also informs the firmware when disconnecting from a network. In the
past this made reconnecting impossible because it was sending the
auth on queue 0x7 (TXDESC_QUEUE_VO) instead of queue 0x12
(TXDESC_QUEUE_MGNT):

wlp0s20f0u3: send auth to 90:55:de:__:__:__ (try 1/3)
wlp0s20f0u3: send auth to 90:55:de:__:__:__ (try 2/3)
wlp0s20f0u3: send auth to 90:55:de:__:__:__ (try 3/3)
wlp0s20f0u3: authentication with 90:55:de:__:__:__ timed out

Probably the firmware disables the unnecessary TX queues when it
knows it's disconnected.

However, this was fixed in commit edd5747a ("wifi: rtl8xxxu: Fix
skb misuse in TX queue selection").

Fixes: c59f13bb ("rtl8xxxu: Work around issue with 8192eu and 8723bu devices not reconnecting")
Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/43200afc-0c65-ee72-48f8-231edd1df493@gmail.com

791082ec

wifi: rtl8xxxu: Support new chip RTL8188FU · c888183b

Bitterblue Smith authored Sep 29, 2022

This chip is found in the cheapest USB adapters, e.g. 1.17 USD with
VAT and shipping from China included.

It's a gen 2 chip, similar to the RTL8723BU, but without Bluetooth.
Features: 2.4 GHz, b/g/n mode, 1T1R, 150 Mbps.

The vendor driver rtl8188fu version 4.3.23.6_20964.20170110 [0]
was used as reference. The CD shipped with the device includes a
newer driver, version 5.11.5-1-g12f7cde4b.20201102, but that one
couldn't complete the WPA2 key exchange thing for whatever reason.

[0] https://github.com/kelebek333/rtl8188fu
Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/b14f299d-3248-98fe-eee1-ba50d2e76c74@gmail.com

c888183b

wifi: rtw89: 8852be: add 8852BE PCI entry · 9695dc2e

Ping-Ke Shih authored Sep 28, 2022

8852BE has two variants with different ID. One is 10ec:b852 that is a main
model with 2x2 antenna, and the other is 10ec:b85b that is a 1x1 model.
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220928084336.34981-10-pkshih@realtek.com

9695dc2e

wifi: rtw89: 8852b: add chip_ops to read phy cap · 134cf7c0

Ping-Ke Shih authored Sep 28, 2022

This efuse region stores PHY calibration data, and it is separate from the
region that stores the MAC address. The data is then used for configuration
via chip_ops::power_trim, which is a calibration mechanism for TX power.
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220928084336.34981-9-pkshih@realtek.com

134cf7c0

wifi: rtw89: 8852b: add chip_ops to read efuse · 132dc4fe

Ping-Ke Shih authored Sep 28, 2022

efuse stores individual data about a chip itself, such as MAC address,
country code, RF and crystal calibration data, and so on. Define a struct
to help access efuse content, and copy them into a common struct.
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220928084336.34981-8-pkshih@realtek.com

132dc4fe

wifi: rtw89: 8852b: add chip_ops::set_txpwr · 08484e1f

Ping-Ke Shih authored Sep 28, 2022

This chip_ops is to set TX power according to country, channel, rate and
so on.  Since shared code is used to configure TX power, we only implement
specific part in this patch.
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220928084336.34981-7-pkshih@realtek.com

08484e1f

wifi: rtw89: debug: txpwr_table considers sign · b9021616

Zong-Zhe Yang authored Sep 28, 2022

Previously, value of each field is just shown as unsigned.
Now, we start to show them with sign to make things more intuitive
during debugging.
Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220928084336.34981-6-pkshih@realtek.com

b9021616
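
The difference between the two readings can be illustrated with a small
sketch. It assumes each field is 8 bits wide and packed into a 32-bit
register value; the field width and both helper names are assumptions made
for this example, not taken from the driver.

```c
#include <stdint.h>

/* Old-style reading: the raw 8-bit field, always non-negative. */
static int txpwr_field_unsigned(uint32_t reg, unsigned int shift)
{
	return (int)((reg >> shift) & 0xff);
}

/*
 * New-style reading: the same 8 bits interpreted as a two's complement
 * value, so that raw values 0x80..0xff show up as -128..-1.
 */
static int txpwr_field_signed(uint32_t reg, unsigned int shift)
{
	int raw = (int)((reg >> shift) & 0xff);

	return raw >= 0x80 ? raw - 0x100 : raw;
}
```

With a raw field of 0xfc, the unsigned reading gives 252 while the signed
reading gives -4, which is far easier to recognize as a small negative
offset during debugging.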

wifi: rtw89: phy: make generic txpwr setting functions · 9b43bd1a

Zong-Zhe Yang authored Sep 28, 2022

Previously, we thought control registers or setting things for TX power
series may change according to chip. So, setting functions are implemented
chip by chip. However, until now, the functions keep the same among chips,
at least 8852A, 8852C, and 8852B. There is a sufficient number of chips to
share generic setting functions. So, we now remake them including TX power
by rate, TX power offset, TX power limit, and TX power limit RU as generic
ones in phy.c.

Besides, there are some code refinements in the generic ones, but almost
all of the logic doesn't change.
Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220928084336.34981-5-pkshih@realtek.com

9b43bd1a

wifi: rtw89: 8852b: add tables for RFK · 2b379eb4

Ping-Ke Shih authored Sep 28, 2022

These tables are used by RFK to assist to configure PHY and RF registers.
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220928084336.34981-4-pkshih@realtek.com

2b379eb4

wifi: rtw89: 8852b: add BB and RF tables (2 of 2) · 3e65a0ae

Ping-Ke Shih authored Sep 28, 2022

These tables contain BB and RF parameters that the driver loads into
registers. They also contain TX power according to country, band, rate and
so on. Rising temperature can degrade TX power, so power tracking tables
are defined to compensate TX power.

Internal version of these tables:
 - HALRF_029_00_014 (R32)
 - HALBB_027_046_05
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220928084336.34981-3-pkshih@realtek.com

3e65a0ae

wifi: rtw89: 8852b: add BB and RF tables (1 of 2) · c8b5fc2e

Ping-Ke Shih authored Sep 28, 2022

These tables contain BB and RF parameters that the driver loads into
registers. They also contain TX power according to country, band, rate and
so on. Rising temperature can degrade TX power, so power tracking tables
are defined to compensate TX power.

Internal version of these tables:
 - HALRF_029_00_014 (R32)
 - HALBB_027_046_05
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220928084336.34981-2-pkshih@realtek.com

c8b5fc2e

30 Sep, 2022 15 commits

Merge tag 'wireless-next-2022-09-30' of... · 915b96c5

Jakub Kicinski authored Sep 30, 2022

Merge tag 'wireless-next-2022-09-30' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next

Kalle Valo says:

====================
wireless-next patches for v6.1

Few stack changes and lots of driver changes in this round. brcmfmac
has more activity as usual and it gets new hardware support. ath11k
improves WCN6750 support and also other smaller features. And of
course changes all over.

Note: in early September wireless tree was merged to wireless-next to
avoid some conflicts with mac80211 patches, this shouldn't cause any
problems but wanted to mention anyway.

Major changes:

mac80211

 - refactoring and preparation for Wi-Fi 7 Multi-Link Operation (MLO)
  feature continues

brcmfmac

 - support CYW43439 SDIO chipset

 - support BCM4378 on Apple platforms

 - support CYW89459 PCIe chipset

rtw89

 - more work to get rtw8852c supported

 - P2P support

 - support for enabling and disabling MSDU aggregation via nl80211

mt76

 - tx status reporting improvements

ath11k

 - cold boot calibration support on WCN6750

 - Target Wake Time (TWT) debugfs support for STA interface

 - support to connect to a non-transmit MBSSID AP profile

 - enable remain-on-channel support on WCN6750

 - implement SRAM dump debugfs interface

 - enable threaded NAPI on all hardware

 - WoW support for WCN6750

 - support to provide transmit power from firmware via nl80211

 - support to get power save duration for each client

 - spectral scan support for 160 MHz

wcn36xx

 - add SNR from a received frame as a source of system entropy

* tag 'wireless-next-2022-09-30' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (231 commits)
  wifi: rtl8xxxu: Improve rtl8xxxu_queue_select
  wifi: rtl8xxxu: Fix AIFS written to REG_EDCA_*_PARAM
  wifi: rtl8xxxu: gen2: Enable 40 MHz channel width
  wifi: rtw89: 8852b: configure DLE mem
  wifi: rtw89: check DLE FIFO size with reserved size
  wifi: rtw89: mac: correct register of report IMR
  wifi: rtw89: pci: set power cut closed for 8852be
  wifi: rtw89: pci: add to do PCI auto calibration
  wifi: rtw89: 8852b: implement chip_ops::{enable,disable}_bb_rf
  wifi: rtw89: add DMA busy checking bits to chip info
  wifi: rtw89: mac: define DMA channel mask to avoid unsupported channels
  wifi: rtw89: pci: mask out unsupported TX channels
  iwlegacy: Replace zero-length arrays with DECLARE_FLEX_ARRAY() helper
  ipw2x00: Replace zero-length array with DECLARE_FLEX_ARRAY() helper
  wifi: iwlwifi: Track scan_cmd allocation size explicitly
  brcmfmac: Remove the call to "dtim_assoc" IOVAR
  brcmfmac: increase dcmd maximum buffer size
  brcmfmac: Support 89459 pcie
  brcmfmac: increase default max WOWL patterns to 16
  cw1200: fix incorrect check to determine if no element is found in list
  ...
====================

Link: https://lore.kernel.org/r/20220930150413.A7984C433D6@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

915b96c5

Merge branch 'mlx5-xsk-updates-part2-2022-09-28' · 6690c2c4

Jakub Kicinski authored Sep 30, 2022

Saeed Mahameed says:

====================
mlx5 xsk updates part2 2022-09-28

XSK buffer improvements. This is part #2 of a 4-part series.

 1) Expose xsk min chunk size to drivers, to allow the driver to adjust to a
   better buffer stride size

 2) Adjust MTT page size to the XSK frame size, to avoid umem overrun in
  certain situations.

 3) Use xsk frame size as the striding RQ page size for XSK RQs

 4) KSM for unaligned XSK, KSM allows arbitrary buffer chunk lengths
    registration in HW, which makes more sense for unaligned XSK.

 5) More cleanups and optimizations in preparation for next improvements
    in part3

part 1: https://lore.kernel.org/netdev/20220927203611.244301-1-saeed@kernel.org/
====================

Link: https://lore.kernel.org/r/20220929072156.93299-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

6690c2c4

net/mlx5e: Clean up and fix error flows in mlx5e_alloc_rq · 8f5ed1c1

Maxim Mikityanskiy authored Sep 29, 2022

Although mlx5e_rq_free_shampo can be called unconditionally, it belongs
to case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ. Move it there to allow
adding more init/cleanup actions to the striding RQ case.

If xdp_rxq_info_reg_mem_model fails, don't forget to destroy the page
pool.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

8f5ed1c1

net/mlx5e: Move repeating clear_bit in mlx5e_rx_reporter_err_rq_cqe_recover · e64d71d0

Maxim Mikityanskiy authored Sep 29, 2022

The same clear_bit is called in both error and success flows. Move the
call to do it only once and remove the out label.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

e64d71d0

net/mlx5e: Split out channel (de)activation in rx_res · d32c2253

Maxim Mikityanskiy authored Sep 29, 2022

To decrease the nesting level and reduce duplication of code, create
functions to redirect direct RQTs to the actual RQs or drop_rq, which
are used in the activation and deactivation flows of channels.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

d32c2253

net/mlx5e: xsk: Remove mlx5e_xsk_page_alloc_pool · 2d0765f7

Maxim Mikityanskiy authored Sep 29, 2022

mlx5e_xsk_page_alloc_pool became a thin wrapper around xsk_buff_alloc.
Drop it and call xsk_buff_alloc directly.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

2d0765f7

net/mlx5e: Convert struct mlx5e_alloc_unit to a union · 672db024

Maxim Mikityanskiy authored Sep 29, 2022

struct mlx5e_alloc_unit consists of a single union. Convert it to a
union itself to simplify casting it to struct xdp_buff *, which will be
used to implement XSK batching on striding RQ.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

672db024

net/mlx5e: Remove DMA address from mlx5e_alloc_unit · 6bdeb963

Maxim Mikityanskiy authored Sep 29, 2022

mlx5e_alloc_unit stores the DMA address and a pointer to either struct
page (regular RQ) or struct xdp_buff (XSK RQ). This DMA address is
redundant, because when a page or an XSK frame is allocated, the same
address is also stored there. Some flows take the address from struct
mlx5e_alloc_unit, and some take it from struct page or xdp_buff.

This commit removes the address from struct mlx5e_alloc_unit, which
makes it twice as small and improves locality (this struct is used in an
array), also saving on unnecessary stores to the addr field. Almost all
flows know unambiguously whether the DMA address should be taken from
page or from xdp_buff. The exception is the allocation flows, where a
new branch appeared, which will be optimized out in the next commits.

struct mlx5e_alloc_unit used to be called mlx5e_dma_info.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

6bdeb963

net/mlx5e: Rename mlx5e_dma_info to prepare for removal of DMA address · 79008676

Maxim Mikityanskiy authored Sep 29, 2022

The next commit will remove the DMA address from the struct currently
called mlx5e_dma_info, because the same value can be retrieved with
page_pool_get_dma_addr(page) in almost all cases, with the notable
exception of SHAMPO (HW GRO implementation) that modifies this address
on the fly, after the initial allocation.

To keep the SHAMPO logic intact, struct mlx5e_dma_info remains in the
SHAMPO code, consisting of addr and page (XSK is not compatible with
SHAMPO). The struct used in all other places is renamed to
mlx5e_alloc_unit, allowing the next commit to remove the addr field
without affecting SHAMPO.

The new name means "allocation unit", and it's more appropriate after
the field with the DMA address gets removed.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

79008676

net/mlx5e: Optimize the page cache reducing its size 2x · 707f908e

Maxim Mikityanskiy authored Sep 29, 2022

RX page cache stores dma_info structs, that consist of a pointer to
struct page and a DMA address. In fact, the DMA address is extracted
from struct page using page_pool_get_dma_addr when a page is pushed to
the cache. By moving this call to the point when a page is popped from
the cache, we can avoid storing the DMA address in the cache,
effectively reducing its size by two times without losing any
functionality.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

707f908e
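
The size reduction can be illustrated with a user-space sketch. struct
fake_page and both cache entry types are hypothetical stand-ins, and the
sketch assumes a DMA address is as wide as a pointer, as on typical 64-bit
builds; the real code obtains the address with page_pool_get_dma_addr.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct page: it already knows its DMA address. */
struct fake_page {
	uintptr_t dma_addr;
};

/* Before: every cache slot duplicates the address next to the pointer. */
struct cache_entry_old {
	struct fake_page *page;
	uintptr_t addr;
};

/* After: the slot keeps only the pointer, so it is half the size. */
struct cache_entry_new {
	struct fake_page *page;
};

/* The address is looked up when an entry is popped, not stored on push. */
static uintptr_t cache_pop_addr(const struct cache_entry_new *e)
{
	return e->page->dma_addr;
}
```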

net/mlx5e: Fix calculations for ICOSQ size · 0b9c86c7

Maxim Mikityanskiy authored Sep 29, 2022

WQEs must not cross page boundaries; they are padded with NOPs if they
don't fit the page. mlx5e_mpwrq_total_umr_wqebbs doesn't take this padding
into account, so it risks reserving too little space.

The padding is not straightforward to add to this calculation, because
WQEs of different sizes may be mixed together in the queue. If each page
ends with a big WQE that doesn't fit and requires at most its size minus
1 WQEBB of padding, the total space can be much bigger than in case when
smaller WQEs take advantage of this padding.

Replace the wrong exact calculation by the following estimation. Each
padding can be at most the size of the maximum WQE used in the queue
minus one WQEBB. Let's call the rest of the page "useful space". If we
divide the total size of all needed WQEs by this useful space, rounding
up, we'll get the number of pages, which is enough to contain all these
WQEs. It's correct, because every WQE that appeared on the boundary
between two blocks of useful space would start in the useful space of
one page and end in the padding of the same page, while our estimation
reserved space for its tail in the next space, making the estimation not
smaller than the real space occupied in the queue.

The code actually uses a looser estimation: instead of taking the
maximum size of all used WQE types minus 1 WQEBB, it takes the maximum
hardware size of a WQE. It's made for simplicity and extensibility.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

0b9c86c7
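
The tighter estimation described in the message can be written down as a
short sketch. All sizes are counted in WQEBBs, and the function and
parameter names are hypothetical. As the message notes, the actual code
uses a looser bound based on the maximum hardware WQE size, which this
sketch does not reproduce.

```c
#include <stddef.h>

/*
 * Estimate how many pages are enough to hold total_wqebbs worth of WQEs
 * when no WQE may cross a page boundary. Returns 0 when the inputs leave
 * no useful space in a page.
 */
static size_t estimate_pages(size_t total_wqebbs, size_t wqebbs_per_page,
			     size_t max_wqe_wqebbs)
{
	size_t max_padding, useful;

	if (max_wqe_wqebbs == 0 || max_wqe_wqebbs > wqebbs_per_page)
		return 0;

	/* Worst case: a maximum-size WQE misses the end of the page by one WQEBB. */
	max_padding = max_wqe_wqebbs - 1;
	useful = wqebbs_per_page - max_padding;

	/* Round up: a partially used page still has to be reserved. */
	return (total_wqebbs + useful - 1) / useful;
}
```

For example, with 64 WQEBBs per page and a largest WQE of 16 WQEBBs, the
useful space is 49 WQEBBs per page, so 100 WQEBBs of work need 3 pages,
whereas a naive division by 64 would reserve only 2.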

xsk: Remove unused xsk_buff_discard · f2f16758

Maxim Mikityanskiy authored Sep 29, 2022

The previous commit removed the last usage of xsk_buff_discard in mlx5e,
so the function that is no longer used can be removed.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
CC: "Björn Töpel" <bjorn@kernel.org>
CC: Magnus Karlsson <magnus.karlsson@intel.com>
CC: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

f2f16758

net/mlx5e: xsk: Use KSM for unaligned XSK · 6470d2e7

Maxim Mikityanskiy authored Sep 29, 2022

UMR MTTs used in striding RQ have certain alignment requirements. While
it's guaranteed to work when UMR pages are aligned to the UMR page size,
in practice it works when UMR pages are aligned to 8 bytes. However,
it's still not enough flexibility for the unaligned mode of XSK. This
patch leverages KSM to map UMR pages without alignment requirements,
when unaligned XSK is active. The downside is that KSM entries are twice
as big as MTTs, which limits the maximum WQE size, so regular RQs and
aligned XSK continue using MTTs.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

6470d2e7

net/mlx5: Add MLX5_FLEXIBLE_INLEN to safely calculate cmd inlen · c4418f34

Maxim Mikityanskiy authored Sep 29, 2022

Some commands use a flexible array after a common header. Add a macro to
safely calculate the total input length of the command, detecting
overflows and printing errors with specific values when such overflows
happen.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

c4418f34
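
The idea of an overflow-checked length calculation can be sketched as a
plain function. This is not the MLX5_FLEXIBLE_INLEN macro itself: the
function name, its parameters and the choice to report failure through a
bool are assumptions, and the sketch does not print the error details the
message mentions.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute fixed + item_size * num_items for a command with a flexible
 * array after a fixed header. Returns false instead of a wrapped value
 * when the result overflows size_t or does not fit in an int.
 */
static bool flexible_inlen(size_t fixed, size_t item_size, size_t num_items,
			   int *inlen)
{
	size_t total;

	/* Would fixed + item_size * num_items exceed SIZE_MAX? */
	if (fixed > SIZE_MAX ||
	    (item_size != 0 && num_items > (SIZE_MAX - fixed) / item_size))
		return false;

	total = fixed + item_size * num_items;
	if (total > (size_t)INT_MAX)
		return false;

	*inlen = (int)total;
	return true;
}
```

A caller checks the return value before allocating or sending the command,
so a huge entry count is rejected up front rather than producing a short
buffer.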

net/mlx5e: Keep a separate MKey for striding RQ · ecc7ad2e

Maxim Mikityanskiy authored Sep 29, 2022

Currently, rq->mkey_be keeps a big-endian value of either the PA MKey
(for legacy RQ, no address translation) or MTT MKey (for striding RQ,
direct address translation). Striding RQ stores the same value in
rq->umr_mkey in the native endianness.

The next commit will make striding RQ use KSM MKey (indirect address
translation) for the unaligned mode of XSK, which will require storing
both KSM MKey and PA MKey in the RQ struct. This commit optimizes fields
of mlx5e_rq: umr_mkey is removed (it's redundant), mkey_be always points
to the PA MKey, and mpwqe.umr_mkey_be points to the MTT MKey (or to the
KSM MKey, starting from the next commit).
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ecc7ad2e