Merge branch 'prevent-permanently-closed-tc-taprio-gates-from-blocking-a-felix-dsa-switch-port'
Vladimir Oltean says: ==================== Prevent permanently closed tc-taprio gates from blocking a Felix DSA switch port Richie Pearn reports that if we install a tc-taprio schedule on a Felix switch port, and that schedule has at least one gate that never opens (for example TC0 below): tc qdisc add dev swp1 root taprio num_tc 8 map 0 1 2 3 4 5 6 7 \ queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \ base-time 0 sched-entry S fe 1000000 flags 0x2 then packets classified to the permanently closed traffic class will not be dequeued by the egress port. They will just remain in the queue system, to consume resources. Frame aging does not trigger either, because in order for that to happen, the packets need to be eligible for egress scheduling in the first place, which they aren't. If that port is allowed to consume the entire shared buffer of the switch (as we configure things by default using devlink-sb), then eventually, by sending enough packets, the entire switch will hang. If we think enough about the problem, we realize that this is only a special case of a more general issue, and can also be reproduced with gates that aren't permanently closed, but are not large enough to send an entire frame. In that sense, a permanently closed gate is simply a case where all frames are oversized. The ENETC has logic to reject transmitted packets that would overrun the time window - see commit 285e8ded ("net: enetc: count the tc-taprio window drops"). The Felix switch has no such thing on a per-packet basis, but it has a register replicated per {egress port, TC} which essentially limits the max MTU. A packet which exceeds the per-port-TC MTU is immediately discarded and therefore will not hang the port anymore (albeit, sadly, this only bumps a generic drop hardware counter and we cannot really infer the reason such as to offer a dedicated counter for these events). This patch set calculates the max MTU per {port, TC} when the tc-taprio config, or link speed, or port-global MTU values change. This solves the larger "gate too small for packet" problem, but also the original issue with the gate permanently closed that was reported by Richie. ==================== Link: https://lore.kernel.org/r/20220628145238.3247853-1-vladimir.oltean@nxp.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
Showing
Please register or sign in to comment