Commits · ef74422020aa8c224b00a927e3e47faac4d8fae3 · nexedi / linux

30 May, 2019 8 commits

mlxsw: spectrum_acl: Avoid warning after identical rules insertion · ef744220

Jiri Pirko authored May 29, 2019

When identical rules are inserted, the latter one goes to C-TCAM. For
that, a second eRP with the same mask is created. These 2 eRPs by the
nature cannot be merged and also one cannot be parent of another.
Teach mlxsw_sp_acl_erp_delta_fill() about this possibility and handle it
gracefully.
Reported-by: Alex Kushnarov <alexanderk@mellanox.com>
Fixes: c22291f7 ("mlxsw: spectrum: acl: Implement delta for ERP")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ef744220

net: dsa: mv88e6xxx: fix handling of upper half of STATS_TYPE_PORT · 84b3fd1f

Rasmus Villemoes authored May 29, 2019

Currently, the upper half of a 4-byte STATS_TYPE_PORT statistic ends
up in bits 47:32 of the return value, instead of bits 31:16 as they
should.

Fixes: 6e46e2d8 ("net: dsa: mv88e6xxx: Fix u64 statistics")
Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

84b3fd1f

r8169: fix MAC address being lost in PCI D3 · 59715171

Heiner Kallweit authored May 29, 2019

(At least) RTL8168e forgets its MAC address in PCI D3. To fix this set
the MAC address when resuming. For resuming from runtime-suspend we
had this in place already, for resuming from S3/S5 it was missing.

The commit referenced as being fixed isn't wrong, it's just the first
one where the patch applies cleanly.

Fixes: 0f07bd85 ("r8169: use dev_get_drvdata where possible")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reported-by: Albert Astals Cid <aacid@kde.org>
Tested-by: Albert Astals Cid <aacid@kde.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

59715171

Merge tag 'mlx5-fixes-2019-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 200c6758

David S. Miller authored May 30, 2019

Saeed Mahameed says:

====================
Mellanox, mlx5 fixes 2019-05-28

This series introduces some fixes to mlx5 driver.

Please pull and let me know if there is any problem.

For -stable v4.13:
('net/mlx5: Allocate root ns memory using kzalloc to match kfree')

For -stable v4.16:
('net/mlx5: Avoid double free in fs init error unwinding path')

For -stable v4.18:
('net/mlx5e: Disable rxhash when CQE compress is enabled')
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

200c6758

Merge branch 'XDP-generic-fixes' · 4b280531

David S. Miller authored May 30, 2019

Stephen Hemminger says:

====================
XDP generic fixes

This set of patches came about while investigating XDP
generic on Azure. The split brain nature of the accelerated
networking exposed issues with the stack device model.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

4b280531

net: core: support XDP generic on stacked devices. · 458bf2f2

Stephen Hemminger authored May 28, 2019

When a device is stacked like (team, bonding, failsafe or netvsc) the
XDP generic program for the parent device was not called.

Move the call to XDP generic inside __netif_receive_skb_core where
it can be done multiple times for stacked case.

Fixes: d4455169 ("net: xdp: support xdp generic on virtual devices")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

458bf2f2

netvsc: unshare skb in VF rx handler · 996ed047

Stephen Hemminger authored May 28, 2019

The netvsc VF skb handler should make sure that skb is not
shared. Similar logic already exists in bonding and team device
drivers.

This is not an issue in practice because the VF devicex
does not send up shared skb's. But the netvsc driver
should do the right thing if it did.

Fixes: 0c195567 ("netvsc: transparent VF management")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

996ed047

udp: Avoid post-GRO UDP checksum recalculation · f2696099

Sean Tranchetti authored May 28, 2019

Currently, when resegmenting an unexpected UDP GRO packet, the full UDP
checksum will be calculated for every new SKB created by skb_segment()
because the netdev features passed in by udp_rcv_segment() lack any
information about checksum offload capabilities.

Usually, we have no need to perform this calculation again, as
  1) The GRO implementation guarantees that any packets making it to the
     udp_rcv_segment() function had correct checksums, and, more
     importantly,
  2) Upon the successful return of udp_rcv_segment(), we immediately pull
     the UDP header off and either queue the segment to the socket or
     hand it off to a new protocol handler.

Unless userspace has set the IP_CHECKSUM sockopt to indicate that they
want the final checksum values, we can pass the needed netdev feature
flags to __skb_gso_segment() to avoid checksumming each segment in
skb_segment().

Fixes: cf329aa4 ("udp: cope with UDP GRO packet misdirection")
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f2696099

29 May, 2019 17 commits

Merge branch 'net-phy-dp83867-add-some-fixes' · 58e8b370

David S. Miller authored May 29, 2019

Max Uvarov says:

====================
net: phy: dp83867: add some fixes

v3: use phy_modify_mmd()
v2: fix minor comments by Heiner Kallweit and Florian Fainelli
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

58e8b370

net: phy: dp83867: Set up RGMII TX delay · 2b892649

Max Uvarov authored May 28, 2019

PHY_INTERFACE_MODE_RGMII_RXID is less then TXID
so code to set tx delay is never called.

Fixes: 2a10154a ("net: phy: dp83867: Add TI dp83867 phy")
Signed-off-by: Max Uvarov <muvarov@gmail.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2b892649

net: phy: dp83867: do not call config_init twice · c8081fc3

Max Uvarov authored May 28, 2019

Phy state machine calls _config_init just after
reset.
Signed-off-by: Max Uvarov <muvarov@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c8081fc3

net: phy: dp83867: increase SGMII autoneg timer duration · 1a97a477

Max Uvarov authored May 28, 2019

After reset SGMII Autoneg timer is set to 2us (bits 6 and 5 are 01).
That is not enough to finalize autonegatiation on some devices.
Increase this timer duration to maximum supported 16ms.
Signed-off-by: Max Uvarov <muvarov@gmail.com>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1a97a477

net: phy: dp83867: fix speed 10 in sgmii mode · 333061b9

Max Uvarov authored May 28, 2019

For supporting 10Mps speed in SGMII mode DP83867_10M_SGMII_RATE_ADAPT bit
of DP83867_10M_SGMII_CFG register has to be cleared by software.
That does not affect speeds 100 and 1000 so can be done on init.
Signed-off-by: Max Uvarov <muvarov@gmail.com>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

333061b9

net: phy: marvell10g: report if the PHY fails to boot firmware · 3d3ced2e

Russell King authored May 28, 2019

Some boards do not have the PHY firmware programmed in the 3310's flash,
which leads to the PHY not working as expected. Warn the user when the
PHY fails to boot the firmware and refuse to initialise.

Fixes: 20b2af32 ("net: phy: add Marvell Alaska X 88X3310 10Gigabit PHY support")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

3d3ced2e

net: phylink: ensure consistent phy interface mode · c6787263

Russell King authored May 28, 2019

Ensure that we supply the same phy interface mode to mac_link_down() as
we did for the corresponding mac_link_up() call.  This ensures that MAC
drivers that use the phy interface mode in these methods can depend on
mac_link_down() always corresponding to a mac_link_up() call for the
same interface mode.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

c6787263

net: sh_eth: fix mdio access in sh_eth_close() for R-Car Gen2 and RZ/A1 SoCs · 315ca92d

Yoshihiro Shimoda authored May 28, 2019

The sh_eth_close() resets the MAC and then calls phy_stop()
so that mdio read access result is incorrect without any error
according to kernel trace like below:

ifconfig-216 [003] .n.. 109.133124: mdio_access: ee700000.ethernet-ffffffff read phy:0x01 reg:0x00 val:0xffff

According to the hardware manual, the RMII mode should be set to 1
before operation the Ethernet MAC. However, the previous code was not
set to 1 after the driver issued the soft_reset in sh_eth_dev_exit()
so that the mdio read access result seemed incorrect. To fix the issue,
this patch adds a condition and set the RMII mode register in
sh_eth_dev_exit() for R-Car Gen2 and RZ/A1 SoCs.

Note that when I have tried to move the sh_eth_dev_exit() calling
after phy_stop() on sh_eth_close(), but it gets worse (kernel panic
happened and it seems that a register is accessed while the clock is
off).
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

315ca92d

net/mlx5e: Disable rxhash when CQE compress is enabled · c0194e2d

Saeed Mahameed authored May 23, 2019

When CQE compression is enabled (Multi-host systems), compressed CQEs
might arrive to the driver rx, compressed CQEs don't have a valid hash
offload and the driver already reports a hash value of 0 and invalid hash
type on the skb for compressed CQEs, but this is not good enough.

On a congested PCIe, where CQE compression will kick in aggressively,
gro will deliver lots of out of order packets due to the invalid hash
and this might cause a serious performance drop.

The only valid solution, is to disable rxhash offload at all when CQE
compression is favorable (Multi-host systems).

Fixes: 7219ab34 ("net/mlx5e: CQE compression")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

c0194e2d

net/mlx5e: restrict the real_dev of vlan device is the same as uplink device · 24bcd210

wenxu authored May 15, 2019

When register indr block for vlan device, it should check the real_dev
of vlan device is same as uplink device. Or it will set offload rule
to mlx5e which will never hit.

Fixes: 35a605db ("net/mlx5e: Offload TC e-switch rules with ingress VLAN device")
Signed-off-by: wenxu <wenxu@ucloud.cn>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

24bcd210

net/mlx5: Allocate root ns memory using kzalloc to match kfree · 25fa506b

Parav Pandit authored May 10, 2019

root ns is yet another fs core node which is freed using kfree() by
tree_put_node().
Rest of the other fs core objects are also allocated using kmalloc
variants.

However, root ns memory is allocated using kvzalloc().
Hence allocate root ns memory using kzalloc().

Fixes: 25302363 ("net/mlx5_core: Flow steering tree initialization")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

25fa506b

net/mlx5: Avoid double free in fs init error unwinding path · 9414277a

Parav Pandit authored May 10, 2019

In below code flow, for ingress acl table root ns memory leads
to double free.

mlx5_init_fs
  init_ingress_acls_root_ns()
    init_ingress_acl_root_ns
       kfree(steering->esw_ingress_root_ns);
       /* steering->esw_ingress_root_ns is not marked NULL */
  mlx5_cleanup_fs
    cleanup_ingress_acls_root_ns
       steering->esw_ingress_root_ns non NULL check passes.
       kfree(steering->esw_ingress_root_ns);
       /* double free */

Similar issue exist for other tables.

Hence zero out the pointers to not process the table again.

Fixes: 9b93ab98 ("net/mlx5: Separate ingress/egress namespaces for each vport")
Fixes: 40c3eebb49e51 ("net/mlx5: Add support in RDMA RX steering")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

9414277a

net/mlx5: Avoid double free of root ns in the error flow path · 905f6bd3

Parav Pandit authored May 09, 2019

When root ns setup for rdma, sniffer tx and sniffer rx fails,
such root ns cleanup is done by the error unwinding path of
mlx5_cleanup_fs().
Below call graph shows an example for sniffer_rx_root_ns.

mlx5_init_fs()
  init_sniffer_rx_root_ns()
    cleanup_root_ns(steering->sniffer_rx_root_ns);
mlx5_cleanup_fs()
  cleanup_root_ns(steering->sniffer_rx_root_ns);
  /* double free of sniffer_rx_root_ns */

Hence, use the existing cleanup_fs to cleanup.

Fixes: d83eb50e ("net/mlx5: Add support in RDMA RX steering")
Fixes: 87d22483 ("net/mlx5: Add sniffer namespaces")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

905f6bd3

net/mlx5: Fix error handling in mlx5_load() · 87883929

Saeed Mahameed authored May 16, 2019

In case mlx5_core_set_hca_defaults fails, it should jump to
mlx5_cleanup_fs, fix that.

Fixes: c85023e1 ("IB/mlx5: Add raw ethernet local loopback support")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>

87883929

Documentation: net-sysfs: Remove duplicate PHY device documentation · a6cd0d2d

Florian Fainelli authored May 27, 2019

Both sysfs-bus-mdio and sysfs-class-net-phydev contain the same
duplication information. There is not currently any MDIO bus specific
attribute, but there are PHY device (struct phy_device) specific
attributes. Use the more precise description from sysfs-bus-mdio and
carry that over to sysfs-class-net-phydev.

Fixes: 86f22d04 ("net: sysfs: Document PHY device sysfs attributes")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

a6cd0d2d

llc: fix skb leak in llc_build_and_send_ui_pkt() · 8fb44d60

Eric Dumazet authored May 27, 2019

If llc_mac_hdr_init() returns an error, we must drop the skb
since no llc_build_and_send_ui_pkt() caller will take care of this.

BUG: memory leak
unreferenced object 0xffff8881202b6800 (size 2048):
  comm "syz-executor907", pid 7074, jiffies 4294943781 (age 8.590s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    1a 00 07 40 00 00 00 00 00 00 00 00 00 00 00 00  ...@............
  backtrace:
    [<00000000e25b5abe>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
    [<00000000e25b5abe>] slab_post_alloc_hook mm/slab.h:439 [inline]
    [<00000000e25b5abe>] slab_alloc mm/slab.c:3326 [inline]
    [<00000000e25b5abe>] __do_kmalloc mm/slab.c:3658 [inline]
    [<00000000e25b5abe>] __kmalloc+0x161/0x2c0 mm/slab.c:3669
    [<00000000a1ae188a>] kmalloc include/linux/slab.h:552 [inline]
    [<00000000a1ae188a>] sk_prot_alloc+0xd6/0x170 net/core/sock.c:1608
    [<00000000ded25bbe>] sk_alloc+0x35/0x2f0 net/core/sock.c:1662
    [<000000002ecae075>] llc_sk_alloc+0x35/0x170 net/llc/llc_conn.c:950
    [<00000000551f7c47>] llc_ui_create+0x7b/0x140 net/llc/af_llc.c:173
    [<0000000029027f0e>] __sock_create+0x164/0x250 net/socket.c:1430
    [<000000008bdec225>] sock_create net/socket.c:1481 [inline]
    [<000000008bdec225>] __sys_socket+0x69/0x110 net/socket.c:1523
    [<00000000b6439228>] __do_sys_socket net/socket.c:1532 [inline]
    [<00000000b6439228>] __se_sys_socket net/socket.c:1530 [inline]
    [<00000000b6439228>] __x64_sys_socket+0x1e/0x30 net/socket.c:1530
    [<00000000cec820c1>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301
    [<000000000c32554f>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

BUG: memory leak
unreferenced object 0xffff88811d750d00 (size 224):
  comm "syz-executor907", pid 7074, jiffies 4294943781 (age 8.600s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 f0 0c 24 81 88 ff ff 00 68 2b 20 81 88 ff ff  ...$.....h+ ....
  backtrace:
    [<0000000053026172>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
    [<0000000053026172>] slab_post_alloc_hook mm/slab.h:439 [inline]
    [<0000000053026172>] slab_alloc_node mm/slab.c:3269 [inline]
    [<0000000053026172>] kmem_cache_alloc_node+0x153/0x2a0 mm/slab.c:3579
    [<00000000fa8f3c30>] __alloc_skb+0x6e/0x210 net/core/skbuff.c:198
    [<00000000d96fdafb>] alloc_skb include/linux/skbuff.h:1058 [inline]
    [<00000000d96fdafb>] alloc_skb_with_frags+0x5f/0x250 net/core/skbuff.c:5327
    [<000000000a34a2e7>] sock_alloc_send_pskb+0x269/0x2a0 net/core/sock.c:2225
    [<00000000ee39999b>] sock_alloc_send_skb+0x32/0x40 net/core/sock.c:2242
    [<00000000e034d810>] llc_ui_sendmsg+0x10a/0x540 net/llc/af_llc.c:933
    [<00000000c0bc8445>] sock_sendmsg_nosec net/socket.c:652 [inline]
    [<00000000c0bc8445>] sock_sendmsg+0x54/0x70 net/socket.c:671
    [<000000003b687167>] __sys_sendto+0x148/0x1f0 net/socket.c:1964
    [<00000000922d78d9>] __do_sys_sendto net/socket.c:1976 [inline]
    [<00000000922d78d9>] __se_sys_sendto net/socket.c:1972 [inline]
    [<00000000922d78d9>] __x64_sys_sendto+0x2a/0x30 net/socket.c:1972
    [<00000000cec820c1>] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301
    [<000000000c32554f>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 1da177e4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8fb44d60

selftests: pmtu: Fix encapsulating device in pmtu_vti6_link_change_mtu · 73f51d15

Stefano Brivio authored May 27, 2019

In the pmtu_vti6_link_change_mtu test, both local and remote addresses
for the vti6 tunnel are assigned to the same address given to the dummy
interface that we use as encapsulating device with a known MTU.

This works as long as the dummy interface is actually selected, via
rt6_lookup(), as encapsulating device. But if the remote address of the
tunnel is a local address too, the loopback interface could also be
selected, and there's nothing wrong with it.

This is what some older -stable kernels do (3.18.z, at least), and
nothing prevents us from subtly changing FIB implementation to revert
back to that behaviour in the future.

Define an IPv6 prefix instead, and use two separate addresses as local
and remote for vti6, so that the encapsulating device can't be a
loopback interface.
Reported-by: Xiumei Mu <xmu@redhat.com>
Fixes: 1fad59ea ("selftests: pmtu: Add pmtu_vti6_link_change_mtu test")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

73f51d15

28 May, 2019 3 commits

dpaa_eth: use only online CPU portals · 7aae703f

Madalin Bucur authored May 27, 2019

Make sure only the portals for the online CPUs are used.
Without this change, there are issues when someone boots with
maxcpus=n, with n < actual number of cores available as frames
either received or corresponding to the transmit confirmation
path would be offered for dequeue to the offline CPU portals,
getting lost.
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7aae703f

net: mvneta: Fix err code path of probe · d484e06e

Jisheng Zhang authored May 27, 2019

Fix below issues in err code path of probe:
1. we don't need to unregister_netdev() because the netdev isn't
registered.
2. when register_netdev() fails, we also need to destroy bm pool for
HWBM case.

Fixes: dc35a10f ("net: mvneta: bm: add support for hardware buffer management")
Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d484e06e

net: stmmac: Do not output error on deferred probe · 54ed6fd2

Thierry Reding authored May 27, 2019

If the subdriver defers probe, do not show an error message. It's
perfectly fine for this error to occur since the driver will get another
chance to probe after some time and will usually succeed after all of
the resources that it requires have been registered.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

54ed6fd2

27 May, 2019 11 commits

Merge branch 'aquantia-fixes' · c3cf73c7

David S. Miller authored May 27, 2019

Igor Russkikh says:

====================
net: aquantia: various fixes May, 2019

Here is a set of various bug fixes found on recent verification stage.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

c3cf73c7

net: aquantia: tcp checksum 0xffff being handled incorrectly · 76f254d4

Nikita Danilov authored May 25, 2019

Thats a known quirk in windows tcp stack it can produce 0xffff checksum.
Thats incorrect but it is.

Atlantic HW with LRO enabled handles that incorrectly and changes csum to
0xfffe - but indicates that csum is invalid. This causes driver to pass
packet to linux networking stack with CSUM_NONE, stack eventually drops
the packet.

There is a quirk in atlantic HW to enable correct processing of
0xffff incorrect csum. Enable it.

The visible bug is that windows link partner with software generated csums
caused TCP connection to be unstable since all packets that csum value
are dropped.
Reported-by: Dmitry Bezrukov <dmitry.bezrukov@aquantia.com>
Signed-off-by: Nikita Danilov <ndanilov@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

76f254d4

net: aquantia: fix LRO with FCS error · eaeb3b74

Dmitry Bogdanov authored May 25, 2019

Driver stops producing skbs on ring if a packet with FCS error
was coalesced into LRO session. Ring gets hang forever.

Thats a logical error in driver processing descriptors:
When rx_stat indicates MAC Error, next pointer and eop flags
are not filled. This confuses driver so it waits for descriptor 0
to be filled by HW.

Solution is fill next pointer and eop flag even for packets with FCS error.

Fixes: bab6de8f ("net: ethernet: aquantia: Atlantic A0 and B0 specific functions.")
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

eaeb3b74

net: aquantia: check rx csum for all packets in LRO session · f38f1ee8

Dmitry Bogdanov authored May 25, 2019

Atlantic hardware does not aggregate nor breaks LRO sessions
with bad csum packets. This means driver should take care of that.

If in LRO session there is a non-first descriptor with invalid
checksum (L2/L3/L4), the driver must account this information
in csum application logic.

Fixes: 018423e9 ("net: ethernet: aquantia: Add ring support code")
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f38f1ee8

net: aquantia: tx clean budget logic error · 31bafc49

Igor Russkikh authored May 25, 2019

In case no other traffic happening on the ring, full tx cleanup
may not be completed. That may cause socket buffer to overflow
and tx traffic to stuck until next activity on the ring happens.

This is due to logic error in budget variable decrementor.
Variable is compared with zero, and then post decremented,
causing it to become MAX_INT. Solution is remove decrementor
from the `for` statement and rewrite it in a clear way.

Fixes: b647d398 ("net: aquantia: Add tx clean budget and valid budget handling logic")
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

31bafc49

net: tulip: de4x5: Drop redundant MODULE_DEVICE_TABLE() · 3e66b7cc

Kees Cook authored May 24, 2019

Building with Clang reports the redundant use of MODULE_DEVICE_TABLE():

drivers/net/ethernet/dec/tulip/de4x5.c:2110:1: error: redefinition of '__mod_eisa__de4x5_eisa_ids_device_table'
MODULE_DEVICE_TABLE(eisa, de4x5_eisa_ids);
^
./include/linux/module.h:229:21: note: expanded from macro 'MODULE_DEVICE_TABLE'
extern typeof(name) __mod_##type##__##name##_device_table               \
                    ^
<scratch space>:90:1: note: expanded from here
__mod_eisa__de4x5_eisa_ids_device_table
^
drivers/net/ethernet/dec/tulip/de4x5.c:2100:1: note: previous definition is here
MODULE_DEVICE_TABLE(eisa, de4x5_eisa_ids);
^
./include/linux/module.h:229:21: note: expanded from macro 'MODULE_DEVICE_TABLE'
extern typeof(name) __mod_##type##__##name##_device_table               \
                    ^
<scratch space>:85:1: note: expanded from here
__mod_eisa__de4x5_eisa_ids_device_table
^

This drops the one further from the table definition to match the common
use of MODULE_DEVICE_TABLE().

Fixes: 07563c71 ("EISA bus MODALIAS attributes support")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

3e66b7cc

Merge branch 'net-tls-two-fixes-for-rx_list-pre-handling' · b933dc36

David S. Miller authored May 26, 2019

Jakub Kicinski says:

====================
net/tls: two fixes for rx_list pre-handling

tls_sw_recvmsg() had been modified to cater better to async decrypt.
Partially read records now live on the rx_list. Data is copied from
this list before the old do {} while loop, and the not included
correctly in deciding whether to sleep or not and lowat threshold
handling. These modifications, unfortunately, added some bugs.

First patch fixes lowat - we need to calculate the threshold early
and make sure all copied data is compared to the threshold, not just
the freshly decrypted data.

Third patch fixes sleep - if data is picked up from rx_list and
no flags are set, we should not put the process to sleep, but
rather return the partial read.

Patches 2 and 4 add test cases for these bugs, both will cause
a sleep and test timeout before the fix.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

b933dc36

selftests/tls: add test for sleeping even though there is data · 043556d0

Jakub Kicinski authored May 24, 2019

Add a test which sends 15 bytes of data, and then tries
to read 10 byes twice.  Previously the second read would
sleep indifinitely, since the record was already decrypted
and there is only 5 bytes left, not full 10.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

043556d0

net/tls: fix no wakeup on partial reads · 04b25a54

Jakub Kicinski authored May 24, 2019

When tls_sw_recvmsg() partially copies a record it pops that
record from ctx->recv_pkt and places it on rx_list.

Next iteration of tls_sw_recvmsg() reads from rx_list via
process_rx_list() before it enters the decryption loop.
If there is no more records to be read tls_wait_data()
will put the process on the wait queue and got to sleep.
This is incorrect, because some data was already copied
in process_rx_list().

In case of RPC connections process may never get woken up,
because peer also simply blocks in read().

I think this may also fix a similar issue when BPF is at
play, because after __tcp_bpf_recvmsg() returns some data
we subtract it from len and use continue to restart the
loop, but len could have just reached 0, so again we'd
sleep unnecessarily. That's added by:
commit d3b18ad3 ("tls: add bpf support to sk_msg handling")

Fixes: 692d7b5d ("tls: Fix recvmsg() to be able to peek across multiple records")
Reported-by: David Beckett <david.beckett@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Tested-by: David Beckett <david.beckett@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

04b25a54

selftests/tls: test for lowat overshoot with multiple records · 7718a855

Jakub Kicinski authored May 24, 2019

Set SO_RCVLOWAT and test it gets respected when gathering
data from multiple records.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7718a855

net/tls: fix lowat calculation if some data came from previous record · 46a16959

Jakub Kicinski authored May 24, 2019

If some of the data came from the previous record, i.e. from
the rx_list it had already been decrypted, so it's not counted
towards the "decrypted" variable, but the "copied" variable.
Take that into account when checking lowat.

When calculating lowat target we need to pass the original len.
E.g. if lowat is at 80, len is 100 and we had 30 bytes on rx_list
target would currently be incorrectly calculated as 70, even though
we only need 50 more bytes to make up the 80.

Fixes: 692d7b5d ("tls: Fix recvmsg() to be able to peek across multiple records")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Tested-by: David Beckett <david.beckett@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

46a16959

26 May, 2019 1 commit

Merge branch 'dpaa2-eth-Fix-smatch-warnings' · 66a04abf

David S. Miller authored May 26, 2019

Ioana Radulescu says:

====================
dpaa2-eth: Fix smatch warnings

Fix a couple of warnings reported by smatch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

66a04abf