Commits · 6e10785ee148655577885b65605836210a741bee · Kirill Smelkov / linux

30 Jan, 2021 16 commits

net: mhi: Get rid of local rx queue count · 6e10785e

Loic Poulain authored Jan 11, 2021

Use the new mhi_get_free_desc_count helper to track queue usage
instead of relying on the locally maintained rx_queued count.
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: mhi: Get RX queue size from MHI core · e6ec3ccd

Loic Poulain authored Jan 28, 2021

The RX queue size can be determined at runtime by retrieving the
number of available transfer descriptors.
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'mhi-net-immutable' of https://git.kernel.org/pub/scm/linux/kernel/git/mani/mhi · 2bca263c

Jakub Kicinski authored Jan 29, 2021

Needed by mhi-net patches.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

docs: networking: timestamping: fix section title markup · 5daf8384

Jan Luebbe authored Jan 28, 2021

This section was missed during the conversion to ReST, so convert it in the
same style as the surrounding section titles.
Signed-off-by: Jan Luebbe <jlu@pengutronix.de>
Link: https://lore.kernel.org/r/20210128111930.29473-1-jlu@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/ethernet: convert to use module_platform_driver in octeon_mgmt.c · afa4f675

dingsenjie authored Jan 28, 2021

Simplify the code by using the module_platform_driver macro
for octeon_mgmt.
Signed-off-by: dingsenjie <dingsenjie@yulong.com>
Link: https://lore.kernel.org/r/20210128035330.17676-1-dingsenjie@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: atm: pppoatm: use new API for wakeup tasklet · a5874597

Emil Renner Berthing authored Jan 27, 2021

This converts the driver to use the new tasklet API introduced in
commit 12cc923f ("tasklet: Introduce new initialization API").
Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
Link: https://lore.kernel.org/r/20210127173256.13954-2-kernel@esmil.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: atm: pppoatm: use tasklet_init to initialize wakeup tasklet · a5b88632

Emil Renner Berthing authored Jan 27, 2021

Previously a temporary tasklet structure was initialized on the stack
using DECLARE_TASKLET_OLD() and then copied over and modified. Nothing
else in the kernel seems to use this pattern, so let's just call
tasklet_init() like everyone else.
Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
Link: https://lore.kernel.org/r/20210127173256.13954-1-kernel@esmil.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'net-sched-cls_flower-add-support-for-matching-on-ct_state-reply-flag' · 810e754c

Jakub Kicinski authored Jan 29, 2021

Paul Blakey says:

====================
net/sched: cls_flower: Add support for matching on ct_state reply flag

This patchset adds software match support for, and offload of, the flower
ct_state reply flag match (+/-rpl).

The first patch adds the definition for the flag and match to flower.

The second patch gives the direction of the connection to the offloading
drivers via the ct_metadata flow offload action.

The last patch offloads this new ct_state flag by using the supplied
connection's direction.
====================

Link: https://lore.kernel.org/r/1611757967-18236-1-git-send-email-paulb@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: CT: Add support for matching on ct_state reply flag · 6895cb3a

Paul Blakey authored Jan 27, 2021

Add support for matching on ct_state reply flag.

Example:
$ tc filter add dev ens1f0_0 ingress prio 1 chain 1 proto ip flower \
  ct_state +trk+est+rpl \
  action mirred egress redirect dev ens1f0_1
$ tc filter add dev ens1f0_1 ingress prio 1 chain 1 proto ip flower \
  ct_state +trk+est-rpl \
  action mirred egress redirect dev ens1f0_0
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Acked-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: flow_offload: Add original direction flag to ct_metadata · 941eff5a

Paul Blakey authored Jan 27, 2021

Give offloading drivers the direction of the offloaded ct flow.
This will be used for matches on direction (ct_state +/-rpl).
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/sched: cls_flower: Add match on the ct_state reply flag · 8c85d18c

Paul Blakey authored Jan 27, 2021

Add match on the ct_state reply flag.

Example:
$ tc filter add dev ens1f0_0 ingress prio 1 chain 1 proto ip flower \
  ct_state +trk+est+rpl \
  action mirred egress redirect dev ens1f0_1
$ tc filter add dev ens1f0_1 ingress prio 1 chain 1 proto ip flower \
  ct_state +trk+est-rpl \
  action mirred egress redirect dev ens1f0_0
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'add-nci-suit-and-virtual-nci-device-driver' · cf3c7c7b

Jakub Kicinski authored Jan 29, 2021

Bongsu Jeon says:

====================
Add nci suite and virtual nci device driver

1/2 is the virtual NCI device driver.
2/2 is the NCI selftest suite.
====================

Link: https://lore.kernel.org/r/20210127130829.4026-1-bongsu.jeon@samsung.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests: Add nci suite · f595cf12

Bongsu Jeon authored Jan 27, 2021

This is the NCI test suite. It tests the NFC/NCI module using a virtual NCI
device. Test cases consist of turning the virtual NCI device on and off and
controlling the device's polling for the NCI1.0 and NCI2.0 versions.
Signed-off-by: Bongsu Jeon <bongsu.jeon@samsung.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

nfc: Add a virtual nci device driver · e624e6c3

Bongsu Jeon authored Jan 27, 2021

The NCI virtual device simulates an NCI device to the user. It can be used to
validate the NCI module and applications. This driver supports
communication between the virtual NCI device and the NCI module.
Signed-off-by: Bongsu Jeon <bongsu.jeon@samsung.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: packet: make pkt_sk() inline · 8c224751

Menglong Dong authored Jan 27, 2021

It's better to make 'pkt_sk()' inline here, as non-inline functions
shouldn't be defined in headers. Besides, this function is simple
enough to be inline.
Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn>
Link: https://lore.kernel.org/r/20210127123302.29842-1-dong.menglong@zte.com.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
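
The pattern this commit describes can be sketched in plain C. This is only an illustration: `struct sock` and `struct packet_sock` below are simplified stand-ins invented for the example, not the kernel definitions. A function defined in a header as plain `static` typically triggers unused-function warnings in every file that includes the header without calling it, whereas `static inline` avoids that.

```c
/* Simplified stand-ins for the kernel types (assumptions for this sketch). */
struct sock {
	int sk_type;
};

struct packet_sock {
	struct sock sk;		/* must stay first so the cast below is valid */
	int ifindex;
};

/*
 * Header-style accessor. 'static inline' gives each including file its
 * own copy and keeps compilers quiet when a file never calls it.
 */
static inline struct packet_sock *pkt_sk(struct sock *sk)
{
	return (struct packet_sock *)sk;
}
```

The cast is only valid because the embedded `struct sock` is the first member of the containing structure.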

hv_netvsc: Copy packets sent by Hyper-V out of the receive buffer · 0ba35fe9

Andrea Parri (Microsoft) authored Jan 26, 2021

Pointers to receive-buffer packets sent by Hyper-V are used within the
guest VM. Hyper-V can send packets with erroneous values or modify
packet fields after they are processed by the guest. To defend against
these scenarios, copy (sections of) the incoming packet after validating
their length and offset fields in netvsc_filter_receive(). In this way,
the packet can no longer be modified by the host.
Reported-by: Juan Vazquez <juvazq@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/20210126162907.21056-1-parri.andrea@gmail.com
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
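
The validate-then-copy defence described in this commit can be illustrated with a small userspace sketch. It is not the netvsc code: `struct shared_hdr` and `copy_from_shared()` are hypothetical names, and the header layout is an assumption made for the example. The point is that the length and offset fields are snapshotted once, checked, and then used to copy the payload into private memory, so later writes to the shared buffer cannot affect what the receiver parses.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical packet header living in memory shared with the host. */
struct shared_hdr {
	uint32_t data_offset;	/* payload offset from start of packet */
	uint32_t data_len;	/* payload length in bytes */
};

/*
 * Snapshot the header, validate offset and length against both buffers,
 * then copy the payload. Returns the number of bytes copied, or -1 if
 * the packet is malformed.
 */
static long copy_from_shared(const uint8_t *shared, size_t shared_len,
			     uint8_t *priv, size_t priv_len)
{
	struct shared_hdr hdr;

	if (shared_len < sizeof(hdr))
		return -1;
	/* Take one private copy of the fields before checking them. */
	memcpy(&hdr, shared, sizeof(hdr));

	if (hdr.data_offset < sizeof(hdr) || hdr.data_offset > shared_len)
		return -1;
	if (hdr.data_len > shared_len - hdr.data_offset ||
	    hdr.data_len > priv_len)
		return -1;

	/* From here on, only the private copy is parsed. */
	memcpy(priv, shared + hdr.data_offset, hdr.data_len);
	return (long)hdr.data_len;
}
```

Checking `data_len` against `shared_len - data_offset` rather than adding the two fields avoids an integer overflow in the bounds check.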

29 Jan, 2021 24 commits

octeontx2-af: Fix 'physical' typos · 46eb3c10

Bjorn Helgaas authored Jan 27, 2021

Fix misspellings of "physical".
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20210127181359.3008316-1-helgaas@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

linux/qed: fix spelling typo in qed_chain.h · 1d3f9bb1

dingsenjie authored Jan 27, 2021

allocted -> allocated
Signed-off-by: dingsenjie <dingsenjie@yulong.com>
Link: https://lore.kernel.org/r/20210127022801.8028-1-dingsenjie@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'nexthop-preparations-for-resilient-next-hop-groups' · 67d25ce8

Jakub Kicinski authored Jan 28, 2021

Petr Machata says:

====================
nexthop: Preparations for resilient next-hop groups

At this moment, there is only one type of next-hop group: an mpath group.
Mpath groups implement the hash-threshold algorithm, described in RFC
2992[1].

To select a next hop, the hash-threshold algorithm first assigns a range of
hashes to each next hop in the group, and then selects the next hop by
comparing the SKB hash with the individual ranges. When a next hop is
removed from the group, the ranges are recomputed, which leads to
reassignment of parts of hash space from one next hop to another. RFC 2992
illustrates it thus:

             +-------+-------+-------+-------+-------+
             |   1   |   2   |   3   |   4   |   5   |
             +-------+-+-----+---+---+-----+-+-------+
             |    1    |    2    |    4    |    5    |
             +---------+---------+---------+---------+

              Before and after deletion of next hop 3
              under the hash-threshold algorithm.

Note how next hop 2 gave up part of the hash space in favor of next hop 1,
and 4 in favor of 5. While there will usually be some overlap between the
previous and the new distribution, some traffic flows change the next hop
that they resolve to.
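
The reassignment described above can be reproduced with a minimal userspace sketch of hash-threshold selection. This is not the kernel implementation: `ht_select()` is a hypothetical helper, all next hops are assumed to have equal weight, and the hash is assumed to be a 32-bit value.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: split the 32-bit hash space into n contiguous,
 * equally sized regions and return the next hop owning the region that
 * the hash falls into.
 */
static int ht_select(const int *nhs, size_t n, uint32_t hash)
{
	uint64_t region = ((uint64_t)1 << 32) / n;	/* size of one region */
	size_t idx = (size_t)(hash / region);

	if (idx >= n)		/* rounding leftovers go to the last hop */
		idx = n - 1;
	return nhs[idx];
}
```

With next hops {1, 2, 3, 4, 5}, a hash of 900000000 falls into the region of next hop 2. After next hop 3 is removed and the regions are recomputed over {1, 2, 4, 5}, the same hash resolves to next hop 1, even though neither 1 nor 2 was touched.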

If a multipath group is used for load-balancing between multiple servers,
this hash space reassignment causes an issue that packets from a single
flow suddenly end up arriving at a server that does not expect them, which
may lead to TCP reset.

If a multipath group is used for load-balancing among available paths to
the same server, the issue is that different latencies and reordering along
the way cause the packets to arrive in the wrong order.

Resilient hashing is a technique to address the above problem. A resilient
next-hop group has another layer of indirection between the group itself
and its constituent next hops: a hash table. The selection algorithm uses a
straightforward modulo operation to choose a hash bucket, then reads
the next hop that this bucket contains and forwards traffic there.

This indirection brings an important feature. In the hash-threshold
algorithm, the range of hashes associated with a next hop must be
continuous. With a hash table, mapping between the hash table buckets and
the individual next hops is arbitrary. Therefore when a next hop is deleted
the buckets that held it are simply reassigned to other next hops:

             +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
             |1|1|1|1|2|2|2|2|3|3|3|3|4|4|4|4|5|5|5|5|
             +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                              v v v v
             +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
             |1|1|1|1|2|2|2|2|1|2|4|5|4|4|4|4|5|5|5|5|
             +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

              Before and after deletion of next hop 3
              under the resilient hashing algorithm.
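
The bucket indirection can be sketched in a few lines of userspace C. This is a simplified model, not the planned kernel code: `struct res_table`, `res_select()` and `res_remove()` are hypothetical names, the table size of 20 mirrors the diagram, and the round-robin refill policy is an assumption chosen to reproduce the picture above.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_BUCKETS 20

/* Hypothetical bucket table: each bucket holds the ID of a next hop. */
struct res_table {
	int bucket[NUM_BUCKETS];
};

/* Modulo picks the bucket; the bucket names the next hop. */
static int res_select(const struct res_table *t, uint32_t hash)
{
	return t->bucket[hash % NUM_BUCKETS];
}

/*
 * Remove next hop 'dead'. Only buckets that pointed at it are rewritten,
 * round-robin over the survivors; every other bucket keeps its next hop.
 */
static void res_remove(struct res_table *t, int dead,
		       const int *survivors, size_t n)
{
	size_t next = 0;

	for (size_t i = 0; i < NUM_BUCKETS; i++) {
		if (t->bucket[i] == dead) {
			t->bucket[i] = survivors[next % n];
			next++;
		}
	}
}
```

Starting from the first row of the diagram and removing next hop 3 with survivors {1, 2, 4, 5} yields the second row: buckets 8 to 11 become 1, 2, 4, 5, and flows hashing to any other bucket keep their next hop.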

When weights of next hops in a group are altered, it may be possible to
choose a subset of buckets that are currently not used for forwarding
traffic, and use those to satisfy the new next-hop distribution demands,
keeping the "busy" buckets intact. This way, established flows are ideally
kept being forwarded to the same endpoints through the same paths as before
the next-hop group change.

This patchset prepares the next-hop code for eventual introduction of
resilient hashing groups.

- Patches #1-#4 carry otherwise disjoint changes that just remove certain
  assumptions in the next-hop code.

- Patches #5-#6 extend the in-kernel next-hop notifiers to support more
  next-hop group types.

- Patches #7-#12 refactor RTNL message handlers. Resilient next-hop groups
  will introduce a new logical object, a hash table bucket. It turns out
  that handling bucket-related messages is similar to how next-hop messages
  are handled. These patches extract the commonalities into reusable
  components.

The plan is to contribute approximately the following patchsets:

1) Nexthop policy refactoring (already pushed)
2) Preparations for resilient next hop groups (this patchset)
3) Implementation of resilient next hop group
4) Netdevsim offload plus a suite of selftests
5) Preparations for mlxsw offload of resilient next-hop groups
6) mlxsw offload including selftests

Interested parties can look at the current state of the code at [2] and
[3].

[1] https://tools.ietf.org/html/rfc2992
[2] https://github.com/idosch/linux/commits/submit/res_integ_v1
[3] https://github.com/idosch/iproute2/commits/submit/res_v1
====================

Link: https://lore.kernel.org/r/cover.1611836479.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

nexthop: Extract a helper for validation of get/del RTNL requests · 0bccf8ed

Petr Machata authored Jan 28, 2021

Validation of messages for get / del of a next hop is the same as
validation of messages for get of a resilient next-hop group bucket will
be. The difference is that the policy for resilient next-hop group buckets
is a superset of that used for next-hop get.

It is therefore possible to reuse the code that validates the nhmsg fields,
extracts the next-hop ID, and validates that. To that end, extract from
nh_valid_get_del_req() a helper __nh_valid_get_del_req() that does just
that.

Make the nlh argument const so that the function can be called from the
dump context, which only has a const nlh. Propagate the constness to
nh_valid_get_del_req().
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

nexthop: Add a callback parameter to rtm_dump_walk_nexthops() · e948217d

Petr Machata authored Jan 28, 2021

In order to allow different handling for next-hop tree dumper and for
bucket dumper, parameterize the next-hop tree walker with a callback. Add
rtm_dump_nexthop_cb() with just the bits relevant for next-hop tree
dumping.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
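
The idea of parameterizing a walker with a callback can be shown with a small standalone sketch. The names `demo_walk()`, `demo_walk_cb` and `demo_budget_cb()` are invented for illustration, and a flat array stands in for the next-hop tree. The walker keeps the resume position in `*idx`, much as a dump operation must keep state between invocations.

```c
#include <stddef.h>

struct demo_nh {
	int id;
};

/* Per-entry callback; a non-zero return stops the walk. */
typedef int (*demo_walk_cb)(const struct demo_nh *nh, void *data);

/*
 * Walk entries starting at *idx. When the callback fails, *idx still
 * points at the entry that was not consumed, so a later call resumes
 * exactly there.
 */
static int demo_walk(const struct demo_nh *nhs, size_t n, size_t *idx,
		     demo_walk_cb cb, void *data)
{
	for (; *idx < n; (*idx)++) {
		int err = cb(&nhs[*idx], data);

		if (err)
			return err;
	}
	return 0;
}

/* Example callback: accept entries until a budget (e.g. buffer space)
 * runs out. */
static int demo_budget_cb(const struct demo_nh *nh, void *data)
{
	int *budget = data;

	(void)nh;
	if (*budget == 0)
		return -1;
	(*budget)--;
	return 0;
}
```

A second dumper can reuse `demo_walk()` unchanged and supply its own callback, which is the kind of reuse the commit message aims for.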

nexthop: Extract a helper for walking the next-hop tree · cbee1807

Petr Machata authored Jan 28, 2021

Extract from rtm_dump_nexthop() a helper to walk the next hop tree. A
separate function for this will be reusable from the bucket dumper.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

nexthop: Strongly-type context of rtm_dump_nexthop() · a6fbbaa6

Petr Machata authored Jan 28, 2021

The dump operations need to keep state from one invocation to another. A
scratch area is dedicated for this purpose in the passed-in argument, cb,
namely via two aliased arrays, struct netlink_callback.args and .ctx.

Dumping of buckets will end up having to iterate over next hops as well,
and it would be nice to be able to reuse the iteration logic with the NH
dumper. The fact that the logic currently relies on a fixed index into the
.args array, and that the indices would have to be coordinated between the
two dumpers, makes this somewhat awkward.

To make the access patterns clearer, introduce a helper struct with an NH
index, and instead of using the .args array directly, use it through this
structure.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

nexthop: Extract a common helper for parsing dump attributes · b9ebea12

Petr Machata authored Jan 28, 2021

Requests to dump nexthops have many attributes in common with those that
requests to dump buckets of resilient NH groups will have. However, they
have different policies. To allow reuse of this code, extract a
policy-agnostic wrapper out of nh_valid_dump_req(), and convert this
function into a thin wrapper around it.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

nexthop: Extract dump filtering parameters into a single structure · 56450ec6

Petr Machata authored Jan 28, 2021

Requests to dump nexthops have many attributes in common with those that
requests to dump buckets of resilient NH groups will have. In order to make
reuse of this code simpler, convert the code to use a single structure with
filtering configuration instead of passing around the parameters one by
one.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

nexthop: Dispatch notifier init()/fini() by group type · da230501

Petr Machata authored Jan 28, 2021

Once there are several next-hop group types, initialization and
finalization of the notifier type need to reflect the actual type. Transform
nh_notifier_grp_info_init() and _fini() to make extending them easier.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

nexthop: Use enum to encode notification type · 09ad6bec

Ido Schimmel authored Jan 28, 2021

Currently there are only two types of in-kernel nexthop notification.
The two are distinguished by the 'is_grp' boolean field in 'struct
nh_notifier_info'.

As more notification types are introduced for more next-hop group types, a
boolean is not an easily extensible interface. Instead, convert it to an
enum.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
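
Why an enum extends more easily than a boolean can be seen in a short sketch. All names here (`demo_info_type`, `demo_notifier_info`, `demo_info_name()`) are hypothetical and do not claim to match the kernel's definitions; the third enumerator is an assumed future type added purely to show the extension.

```c
/* Hypothetical notification types. A boolean could only encode the
 * first two; the enum accepts a third without changing the field. */
enum demo_info_type {
	DEMO_INFO_TYPE_SINGLE,
	DEMO_INFO_TYPE_GRP,
	DEMO_INFO_TYPE_RES_TABLE,	/* assumed future type */
};

struct demo_notifier_info {
	enum demo_info_type type;	/* takes the place of 'bool is_grp' */
};

/* Consumers switch on the type instead of testing a flag. */
static const char *demo_info_name(const struct demo_notifier_info *info)
{
	switch (info->type) {
	case DEMO_INFO_TYPE_SINGLE:
		return "single";
	case DEMO_INFO_TYPE_GRP:
		return "group";
	case DEMO_INFO_TYPE_RES_TABLE:
		return "resilient-table";
	}
	return "unknown";
}
```

A side benefit of the `switch` form is that compilers can warn when a newly added enumerator is not handled.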

nexthop: Assert the invariant that a NH group is of only one type · 720ccd9a

Petr Machata authored Jan 28, 2021

Most of the code that deals with nexthop groups relies on the fact that the
group is of exactly one well-known type. Currently there is only one type,
"mpath", but as more next-hop group types come, it becomes desirable to
have a central place where the setting is validated. Introduce such a place
in nexthop_create_group(), so that the check is done before the code
that relies on that invariant is invoked.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

nexthop: Introduce to struct nh_grp_entry a per-type union · b9bae61b

Petr Machata authored Jan 28, 2021

The values that a next-hop group needs to keep track of depend on the group
type. Introduce a union to separate fields specific to the mpath groups
from fields specific to other group types.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
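
A per-type union of this kind might look roughly like the sketch below. The structure and field names are illustrative assumptions rather than the kernel's `struct nh_grp_entry`, and the `other` member is a placeholder for whatever a future group type would need.

```c
/* Hypothetical group entry: common fields first, then per-type state. */
struct demo_grp_entry {
	int nh_id;		/* common to all group types */
	unsigned char weight;	/* common to all group types */

	union {
		struct {
			int upper_bound;	/* end of this hop's hash range */
		} mpath;
		struct {
			unsigned int placeholder;	/* assumed future state */
		} other;
	};
};

/* Initialise an entry for an mpath group; only the mpath arm is used. */
static void demo_entry_init_mpath(struct demo_grp_entry *e, int nh_id,
				  unsigned char weight, int upper_bound)
{
	e->nh_id = nh_id;
	e->weight = weight;
	e->mpath.upper_bound = upper_bound;
}
```

Naming the arm at each access (`e->mpath.upper_bound`) makes it visible in the code which group type a given field belongs to.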

nexthop: Dispatch nexthop_select_path() by group type · 79bc55e3

Petr Machata authored Jan 28, 2021

The logic for selecting a path depends on the next-hop group type. Adapt
nexthop_select_path() to dispatch according to the group type.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
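
Dispatching on the group type can be sketched as a thin wrapper around per-type selectors. Everything below is hypothetical: the `sel_` names are invented, and the mpath selector uses a trivial modulo placeholder rather than the real hash-threshold logic, since only the dispatch shape is being illustrated.

```c
#include <stddef.h>

enum sel_grp_type {
	SEL_GRP_MPATH,
	SEL_GRP_FUTURE,		/* assumed type with no selector yet */
};

struct sel_group {
	enum sel_grp_type type;
	const int *nhs;		/* next-hop IDs in the group */
	size_t num_nh;
};

/* Placeholder policy for the sketch; NOT the hash-threshold algorithm. */
static int sel_path_mpath(const struct sel_group *g, unsigned int hash)
{
	return g->nhs[hash % g->num_nh];
}

/* Entry point: pick the selector that matches the group type. */
static int sel_path(const struct sel_group *g, unsigned int hash)
{
	switch (g->type) {
	case SEL_GRP_MPATH:
		return sel_path_mpath(g, hash);
	default:
		return -1;	/* no selector for this type */
	}
}
```

Adding a new group type then means adding one selector and one `case`, without touching callers of the entry point.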

nexthop: Rename nexthop_free_mpath · 5d1f0f09

David Ahern authored Jan 28, 2021

nexthop_free_mpath really should be nexthop_free_group. Rename it.
Signed-off-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'net-iucv-updates-2021-01-28' · 4915a404

Jakub Kicinski authored Jan 28, 2021

Julian Wiedmann says:

====================
net/iucv: updates 2021-01-28

This reworks & simplifies the TX notification path in af_iucv, so that we
can send out SG skbs over TRANS_HIPER sockets. Also remove a noisy
WARN_ONCE() in the RX path.
====================

Link: https://lore.kernel.org/r/20210128114108.39409-1-jwi@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/af_iucv: build SG skbs for TRANS_HIPER sockets · 2c3b4456

Julian Wiedmann authored Jan 28, 2021

The TX path no longer falls apart when some of its SG skbs are later
linearized by lower layers of the stack. So enable the use of SG skbs
in iucv_sock_sendmsg() again.

This effectively reverts
commit dc5367bc ("net/af_iucv: don't use paged skbs for TX on HiperSockets").
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/af_iucv: don't track individual TX skbs for TRANS_HIPER sockets · 80bc97aa

Julian Wiedmann authored Jan 28, 2021

Stop maintaining the skb_send_q list for TRANS_HIPER sockets.

Not only is it extra overhead, but keeping around a list of skb clones
means that we later also have to match the ->sk_txnotify() calls
against these clones and free them accordingly.
The current matching logic (comparing the skbs' shinfo location) is
frustratingly fragile, and breaks if the skb's head is mangled in any
sort of way while passing from dev_queue_xmit() to the device's
HW queue.

Also adjust the interface for ->sk_txnotify(), to make clear that we
don't actually care about any skb internals.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/af_iucv: count packets in the xmit path · ef6af7bd

Julian Wiedmann authored Jan 28, 2021

The TX code keeps track of all skbs that are in-flight but haven't
actually been sent out yet. For native IUCV sockets that's not a huge
deal, but with TRANS_HIPER sockets it would be much better if we
didn't need to maintain a list of skb clones.

Note that we actually only care about the _count_ of skbs in this stage
of the TX pipeline. So as prep work for removing the skb tracking on
TRANS_HIPER sockets, keep track of the skb count in a separate variable
and pair any list {enqueue, unlink} with a count {increment, decrement}.

Then replace all occurrences where we currently look at the skb list's
fill level.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
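
The pairing of list operations with a counter can be illustrated with a minimal single-threaded sketch. The `demo_` names are invented, a plain `int` stands in for whatever counter type the real code uses, and no locking or atomicity is modelled.

```c
#include <stddef.h>

struct demo_skb {
	struct demo_skb *next;
};

struct demo_txq {
	struct demo_skb *head;	/* in-flight buffers */
	int pending;		/* kept in step with the list */
};

/* Every enqueue is paired with an increment. */
static void demo_enqueue(struct demo_txq *q, struct demo_skb *skb)
{
	skb->next = q->head;
	q->head = skb;
	q->pending++;
}

/* Every unlink is paired with a decrement. */
static struct demo_skb *demo_unlink(struct demo_txq *q)
{
	struct demo_skb *skb = q->head;

	if (!skb)
		return NULL;
	q->head = skb->next;
	skb->next = NULL;
	q->pending--;
	return skb;
}

/* Callers consult the counter instead of the list's fill level. */
static int demo_tx_idle(const struct demo_txq *q)
{
	return q->pending == 0;
}
```

Once all readers use the counter, the list itself can later be dropped for socket types that do not need it, while the counter stays.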

net/af_iucv: don't lookup the socket on TX notification · c464444f

Julian Wiedmann authored Jan 28, 2021

Whoever called iucv_sk(sk)->sk_txnotify() must already know that they're
dealing with an af_iucv socket.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/af_iucv: remove WARN_ONCE on malformed RX packets · 27e9c1de

Alexander Egorenkov authored Jan 28, 2021

syzbot reported the following finding:

AF_IUCV failed to receive skb, len=0
WARNING: CPU: 0 PID: 522 at net/iucv/af_iucv.c:2039 afiucv_hs_rcv+0x174/0x190 net/iucv/af_iucv.c:2039
CPU: 0 PID: 522 Comm: syz-executor091 Not tainted 5.10.0-rc1-syzkaller-07082-g55027a88ec9f #0
Hardware name: IBM 3906 M04 701 (KVM/Linux)
Call Trace:
 [<00000000b87ea538>] afiucv_hs_rcv+0x178/0x190 net/iucv/af_iucv.c:2039
([<00000000b87ea534>] afiucv_hs_rcv+0x174/0x190 net/iucv/af_iucv.c:2039)
 [<00000000b796533e>] __netif_receive_skb_one_core+0x13e/0x188 net/core/dev.c:5315
 [<00000000b79653ce>] __netif_receive_skb+0x46/0x1c0 net/core/dev.c:5429
 [<00000000b79655fe>] netif_receive_skb_internal+0xb6/0x220 net/core/dev.c:5534
 [<00000000b796ac3a>] netif_receive_skb+0x42/0x318 net/core/dev.c:5593
 [<00000000b6fd45f4>] tun_rx_batched.isra.0+0x6fc/0x860 drivers/net/tun.c:1485
 [<00000000b6fddc4e>] tun_get_user+0x1c26/0x27f0 drivers/net/tun.c:1939
 [<00000000b6fe0f00>] tun_chr_write_iter+0x158/0x248 drivers/net/tun.c:1968
 [<00000000b4f22bfa>] call_write_iter include/linux/fs.h:1887 [inline]
 [<00000000b4f22bfa>] new_sync_write+0x442/0x648 fs/read_write.c:518
 [<00000000b4f238fe>] vfs_write.part.0+0x36e/0x5d8 fs/read_write.c:605
 [<00000000b4f2984e>] vfs_write+0x10e/0x148 fs/read_write.c:615
 [<00000000b4f29d0e>] ksys_write+0x166/0x290 fs/read_write.c:658
 [<00000000b8dc4ab4>] system_call+0xe0/0x28c arch/s390/kernel/entry.S:415
Last Breaking-Event-Address:
 [<00000000b8dc64d4>] __s390_indirect_jump_r14+0x0/0xc

Malformed RX packets shouldn't generate any warnings because
debugging info already flows to dropmon via the kfree_skb().
Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 's390-qeth-updates-2021-01-28' · 14a6daf3

Jakub Kicinski authored Jan 28, 2021

Julian Wiedmann says:

====================
s390/qeth: updates 2021-01-28

Nothing special, mostly fine-tuning and follow-on cleanups for earlier fixes.
====================

Link: https://lore.kernel.org/r/20210128112551.18780-1-jwi@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

s390/qeth: don't fake a TX completion interrupt after TX error · d6e51503

Julian Wiedmann authored Jan 28, 2021

When do_qdio() returns with an unexpected error, qeth_flush_buffers()
kicks off a recovery action.

In such a case there's no point in starting TX completion processing,
the device gets torn down anyway. So take a closer look at do_qdio()'s
return value, and skip the TX completion processing accordingly.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

s390/qeth: make cast type selection for af_iucv skbs robust · a667fee1

Julian Wiedmann authored Jan 28, 2021

As part of the TX queue selection for af_iucv skbs,
qeth_l3_get_cast_type_rcu() ends up calling qeth_get_ether_cast_type().
This is rather fragile, since such skbs don't have a proper ETH header
and we rely on it being zeroed out in the right places. Add a separate
case for ETH_P_AF_IUCV instead that does the right thing.

When later building the HW header for such skbs, don't hard-code the
cast type but follow the same path as for other protocol types. Here
the cast type should naturally come from the skb's queue mapping.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
