Commits · 7c5b42055964f587e55bd87ef334c3a27e95d144 · Kirill Smelkov / linux

01 Aug, 2019 29 commits

tipc: reduce risk of wakeup queue starvation · 7c5b4205

Jon Maloy authored Jul 30, 2019

In commit 365ad353 ("tipc: reduce risk of user starvation during
link congestion") we allowed senders to add exactly one list of extra
buffers to the link backlog queues during link congestion (aka
"oversubscription"). However, the criterion for when to stop adding
wakeup messages to the input queue when the overload abates is
inaccurate, and may cause starvation problems during very high load.

Currently, we stop adding wakeup messages after 10 total failed attempts
where we find that there is no space left in the backlog queue for a
certain importance level. The counter for this is accumulated across all
levels, which may lead the algorithm to leave the loop prematurely,
although there may still be plenty of space available at some levels.
The result is sometimes that messages near the wakeup queue tail are not
added to the input queue as they should be.

We now introduce a more exact algorithm, where we keep adding wakeup
messages to a level as long as the backlog queue has free slots for
the corresponding level, and stop at the moment there are no more such
slots or when there are no more wakeup messages to dequeue.
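
As a rough illustration of the per-level accounting described above, the
sketch below wakes a message only while the backlog for its importance
level still has a free slot, and keeps scanning the rest of the wakeup
queue otherwise. All names (NUM_LEVELS, struct wakeup_msg, prepare_wakeup)
are hypothetical and do not mirror the actual TIPC code.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_LEVELS 4  /* hypothetical number of importance levels */

/* Hypothetical wakeup message: only its importance level matters here. */
struct wakeup_msg {
    int level;  /* importance level, 0 .. NUM_LEVELS - 1 */
    int woken;  /* set to 1 once moved to the input queue */
};

/*
 * Walk the whole wakeup queue and wake a message only if the backlog
 * for its level still has a free slot. avail[] holds the free slots
 * per level (limit minus current length) and is decremented as
 * messages are woken. Returns the number of messages woken.
 */
static size_t prepare_wakeup(struct wakeup_msg *q, size_t n,
                             int avail[NUM_LEVELS])
{
    size_t woken = 0;

    assert(avail != NULL);
    for (size_t i = 0; i < n; i++) {
        int lvl = q[i].level;

        if (lvl < 0 || lvl >= NUM_LEVELS || avail[lvl] <= 0)
            continue;  /* this level is full: skip, but keep scanning */
        avail[lvl]--;
        q[i].woken = 1;
        woken++;
    }
    return woken;
}
```

Because the free slots are counted per level, a full level no longer
ends the scan for messages that belong to other levels.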

Fixes: 365ad353 ("tipc: reduce risk of user starvation during link congestion")
Reported-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'net-dsa-mv88e6xxx-avoid-some-redundant-VTU-operations' · f7571cde

David S. Miller authored Aug 01, 2019

Vivien Didelot says:

====================
net: dsa: mv88e6xxx: avoid some redundant VTU operations

The mv88e6xxx driver currently uses a mv88e6xxx_vtu_get wrapper to get a
single entry and uses a boolean to optionally initialize a fresh one.

However, the fresh entry is only needed in one place and mv88e6xxx_vtu_getnext
is simple enough to call directly. Doing so makes the code easier to read,
especially for the return code expected by switchdev to honor software VLANs.

In addition to not loading the VTU again when an entry is already correctly
programmed, this also avoids programming the broadcast entries again when
updating a port's membership, e.g. from tagged to untagged.

This patch series removes the mv88e6xxx_vtu_get wrapper in favor of direct
calls to mv88e6xxx_vtu_getnext, and also renames the _mv88e6xxx_port_vlan_add
and _mv88e6xxx_port_vlan_del helpers, which use an old underscore prefix
convention.

In case the port's membership is already correctly programmed in hardware,
the following debug message may be printed:

    [  745.989884] mv88e6085 2188000.ethernet-1:00: p4: already a member of VLAN 42
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: mv88e6xxx: call vtu_getnext directly in vlan_add · b1ac6fb4

Vivien Didelot authored Aug 01, 2019

Wrapping mv88e6xxx_vtu_getnext makes the code harder to read and
_mv88e6xxx_port_vlan_add is the only function requiring the preparation
of a new VLAN entry.

To simplify things, remove the mv88e6xxx_vtu_get wrapper and make the
VLAN lookup explicit in _mv88e6xxx_port_vlan_add. This rework
also avoids programming the broadcast entries again when changing a
port's membership, e.g. from tagged to untagged.

At the same time, rename the helper, which uses an old underscore
prefix convention.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: mv88e6xxx: call vtu_getnext directly in vlan_del · 52109892

Vivien Didelot authored Aug 01, 2019

Wrapping mv88e6xxx_vtu_getnext makes the code harder to read.
Make explicit the call to mv88e6xxx_vtu_getnext in _mv88e6xxx_port_vlan_del
and the return value expected by switchdev in case of software VLANs.

At the same time, rename the helper, which uses an old underscore
prefix convention.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: mv88e6xxx: call vtu_getnext directly in db load/purge · 5ef8d249

Vivien Didelot authored Aug 01, 2019

mv88e6xxx_vtu_getnext is simple enough to call directly in the
mv88e6xxx_port_db_load_purge function. Also make explicit the return
code expected by switchdev for software VLANs when a hardware VLAN does
not exist.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: mv88e6xxx: explicit entry passed to vtu_getnext · 425d2d37

Vivien Didelot authored Aug 01, 2019

mv88e6xxx_vtu_getnext interprets two members from the input
mv88e6xxx_vtu_entry structure: the (excluded) vid member to start
the iteration from, and the valid member specifying whether the VID
must be written or not (only required once at the start of a loop).

Make the assignment of these two fields explicit right before calling
mv88e6xxx_vtu_getnext, as it is done in the mv88e6xxx_vtu_get wrapper.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: mv88e6xxx: lock mutex in vlan_prepare · 7095a4c4

Vivien Didelot authored Aug 01, 2019

Lock the mutex in the mv88e6xxx_port_vlan_prepare function
called by the DSA stack, instead of doing it in the internal
mv88e6xxx_port_check_hw_vlan helper.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue · a8e600e2

David S. Miller authored Aug 01, 2019

Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2019-07-31

This series contains updates to ice driver only.

Paul adds support for reporting what the link partner is advertising for
flow control settings.

Jake fixes the handling of the hardware statistics registers, which are
prone to rollover since they are either 32 or 40 bits wide, depending
on which register is being read.  The driver now uses 64 bit software
statistics to accumulate the hardware statistics, so that the counts
keep tracking past a register rollover.  Also fixes an issue with the
locking of the control queue, where locks were being destroyed at run
time.

Tony fixes an issue that was created when interrupt tracking was
refactored and the call to ice_vsi_setup_vector_base() was removed from
the PF VSI instead of the VF VSI.  Adds a check before trying to
configure a port to ensure that media is attached.

Brett fixes an issue in the receive queue configuration where prefena
(Prefetch Enable) was being set to 0 which caused the hardware to only
fetch descriptors when there are none free in the cache for a received
packet.  Updates the driver to only bump the receive tail once per
napi_poll call, instead of the current model of bumping the tail up to 4
times per napi_poll call.  Adds statistics for receive drops at the port
level to ethtool/netlink.  Cleans up duplicate code in the receive buffer
allocation.

Akeem updates the driver to ensure that VFs stay disabled until the
setup or reset is completed.  Modifies the driver to use the allocated
number of transmit queues per VSI to set up the scheduling tree versus
using the total number of available transmit queues.  Also fixes the
driver to update the total number of configured queues, after a
successful VF request to change its number of queues, before updating the
corresponding VSI for that VF.  Cleans up flags that are no longer
needed.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'net-hns3-some-code-optimizations-bugfixes-features' · 9b59e39f

David S. Miller authored Aug 01, 2019

Huazhong Tan says:

====================
net: hns3: some code optimizations & bugfixes & features

This patch-set includes code optimizations, bugfixes and features for
the HNS3 ethernet controller driver.

[patch 01/12] adds support for reporting link change event.

[patch 02/12] adds handler for NCSI error.

[patch 03/12] fixes a bug related to debugfs.

[patch 04/12] adds a code optimization for setting ring parameters.

[patch 05/12 - 09/12] add some cleanups.

[patch 10/12 - 12/12] fix some reset issues.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: activate reset timer when calling reset_event · 012fcb52

Huazhong Tan authored Aug 01, 2019

When hclge_reset_event() is called within HCLGE_RESET_INTERVAL,
it currently returns immediately. If no one calls it again, the
error that needs a reset is never fixed.

So this patch activates the reset timer for this case, and
adds a check at the end of the reset procedure so that the
error is fixed earlier.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: clear reset interrupt status in hclge_irq_handle() · 72e2fb07

Huazhong Tan authored Aug 01, 2019

Currently, the reset interrupt is cleared in the reset task, which
is too late. Once the hardware finishes the previous reset, it can
begin a new global/IMP reset. If this new reset is of the same type
as the previous one, the driver clears both together, so the driver
does not learn that there is another reset, while the hardware
still waits for the driver to deal with the second one.

So this patch clears the PF's reset interrupt status in
hclge_irq_handle(). The hardware waits for a handshake from the
driver before doing a reset, so the driver and hardware deal with
resets one by one.

BTW, when a VF is doing a global/IMP reset, it reads the PF's reset
interrupt register to find out whether the PF driver's re-initialization
is done, since the VF's re-initialization should be done after the PF's.
So we add a new command and a register bit to do that. When the VF
receives a reset interrupt, it sets this bit. When the PF finishes its
re-initialization, it sends a command to clear this bit, and then the VF
does its re-initialization.

Fixes: 4ed340ab ("net: hns3: Add reset process in hclge_main")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: fix some reset handshake issue · 6b428b4f

Huazhong Tan authored Aug 01, 2019

Currently, the driver sets the handshake status to tell the hardware
that the driver has brought down the netdev and that it can continue
with the reset process. The driver clears the handshake status when
re-initializing the CMDQ, and does not restore this status
when the reset fails, which may leave the hardware waiting for
the handshake status to be set and unable to continue
with the reset process.

So this patch delays clearing the handshake status until just before UP,
and restores this status when the reset fails.

BTW, this patch adds a new function hclge(vf)_reset_handshake() to
deal with the reset handshake issue, and renames
HCLGE(VF)_NIC_CMQ_ENABLE to HCLGE(VF)_NIC_SW_RST_RDY which
represents this register bit more accurately.

Fixes: ada13ee3 ("net: hns3: add handshake with hardware while doing reset")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: rename a member in struct hclge_mac_ethertype_idx_rd_cmd · 6e6e7680

Guojia Liao authored Aug 01, 2019

The member 'mac_add' defined in hclge_mac_ethertype_idx_rd_cmd
means MAC address, so 'mac_addr' is a better name for it.
Signed-off-by: Guojia Liao <liaoguojia@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: simplify hclge_cmd_query_error() · dbae56a3

Weihang Li authored Aug 01, 2019

The 4th and 5th parameters of hclge_cmd_query_error are useless, so this
patch removes them.
Signed-off-by: Weihang Li <liweihang@hisilicon.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: minor error handling change for hclge_tm_schd_info_init · b6872fd3

Yunsheng Lin authored Aug 01, 2019

When hclge_tm_schd_info_update calls hclge_tm_schd_info_init to
initialize the schedule info, hdev->tm_info.num_pg and
hdev->tx_sch_mode are not changed, which makes the checking in
hclge_tm_schd_info_init unnecessary.

So this patch moves the hdev->tm_info.num_pg and hdev->tx_sch_mode
checking into hclge_tm_schd_init and changes the return type of
hclge_tm_schd_info_init from int to void.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: minor cleanup in hns3_clean_rx_ring · a4ee7624

Yunsheng Lin authored Aug 01, 2019

The unused_count variable is used to indicate how many
RX BDs need a new buffer attached in hns3_clean_rx_ring,
and the clean_count variable has a similar meaning.

This patch removes the clean_count variable and uses
unused_count to uniformly indicate the RX BDs that need
a new buffer attached.

This patch also cleans up some coding style related to
variable assignment in hns3_clean_rx_ring.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: remove unnecessary variable in hclge_get_mac_vlan_cmd_status() · 6e4139f6

Jian Shen authored Aug 01, 2019

The local variable return_status in hclge_get_mac_vlan_cmd_status()
is useless. So this patch returns the error code directly, instead of
using this variable. Also, replace some '%d' with '%u' in
hclge_get_mac_vlan_cmd_status().
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: refine for set ring parameters · a723fb8e

Jian Shen authored Aug 01, 2019

Previously, when changing the ring parameters, we freed the old
ring resources first, and then set up the new ring resources.
If a memory allocation failed, there were no resources left
to use. This patch refines this by setting up the new ring
resources first, and only then freeing the old ring resources.
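
The ordering can be illustrated with a small user-space sketch. struct
toy_ring and toy_ring_resize are hypothetical and only model the
allocate-before-free idea, not the hns3 ring handling itself.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical ring: just a descriptor array and its length. */
struct toy_ring {
    int *desc;
    size_t num;
};

/*
 * Resize the ring to new_num descriptors. The new buffer is allocated
 * before the old one is released, so a failed allocation leaves the
 * ring usable with its previous size.
 * Returns 0 on success and -1 on failure.
 */
static int toy_ring_resize(struct toy_ring *ring, size_t new_num)
{
    int *new_desc;

    assert(ring != NULL);
    if (new_num == 0)
        return -1;

    new_desc = calloc(new_num, sizeof(*new_desc));
    if (!new_desc)
        return -1;  /* the old resources are untouched */

    free(ring->desc);  /* only now release the old resources */
    ring->desc = new_desc;
    ring->num = new_num;
    return 0;
}
```

The cost of this ordering is that both the old and the new resources
exist at the same time for a short while.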

Also reduce the max ring BD number to 32760 according to UM.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: do not query unsupported commands in debugfs · 3f0f3253

Yufeng Mo authored Aug 01, 2019

Some commands are not supported on DCB-unsupported ports.
This patch distinguishes these commands and does not query
unsupported commands in debugfs.

This patch also fixes an error in the dump "qos buf cfg"
command in debugfs.

Fixes: 2849d4e7 ("net: hns3: Add "tc config" info query function")
Fixes: 7d9d7f88 ("net: hns3: Add "qos buffer" config info query function")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: add handler for NCSI error mailbox · b18bf305

Huazhong Tan authored Aug 01, 2019

When the NCSI has a HW error, the IMP reports this error to the driver
by sending a mailbox message. After receiving this message, the driver
should assert a global reset to fix this kind of HW error.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: add link change event report · ed8fb4b2

Jian Shen authored Aug 01, 2019

Previously, the PF updated the link status once per second. Some
scenarios require the link down event to be reported more quickly.
To solve this, the firmware pushes the link change event to the PF
with a CMDQ message, and the driver updates the link status directly.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
