Commits · e457d86ada27cbd2f46ded75d4b4bc06e26d0e2e · nexedi / linux

24 Aug, 2017 11 commits

net: sched: add couple of goto_chain helpers · e457d86a

Jiri Pirko authored Aug 23, 2017

Add helpers to find out if a gact instance is goto_chain termination
action and to get chain index.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e457d86a

mlxsw: spectrum: Offload multichain TC rules · 45b62742

Jiri Pirko authored Aug 23, 2017

Reflect chain index coming down from TC core and create a ruleset per
chain. Note that only chain 0, being the implicit chain, is bound to the
device for processing. The rest of chains have to be "jumped-to" by
actions.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

45b62742

Merge branch 'mvpp2-software-TSO-support' · ae99e188

David S. Miller authored Aug 23, 2017

Antoine Tenart says:

====================
net: mvpp2: software TSO support

This series adds the s/w TSO support in the PPv2 driver, in addition to
two cosmetic commits. As stated in patch 3/3:

Using iperf and 10G ports, using TSO shows a significant performance
improvement by a factor 2 to reach around 9.5Gbps in TX; as well as a
significant CPU usage drop (from 25% to 15%).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

ae99e188

net: mvpp2: software tso support · 186cd4d4

Antoine Ténart authored Aug 23, 2017

The patch uses the tso API to implement the tso functionality in Marvell
PPv2 driver.

Using iperf and 10G ports, using TSO shows a significant performance
improvement by a factor 2 to reach around 9.5Gbps in TX; as well as a
significant CPU usage drop (from 25% to 15%).
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

186cd4d4

net: mvpp2: unify the txq size define use · 85affd7e

Antoine Ténart authored Aug 23, 2017

The txq size is defined by MVPP2_AGGR_TXQ_SIZE, which is sometime not
used directly but through variables. As it is a fixed value use the
define everywhere in the driver.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

85affd7e

net: define the TSO header size in net/tso.h · f9cbe9a5

Antoine Ténart authored Aug 23, 2017

The TSO header size was defined in many drivers. Factorize the code and
define its size in net/tso.h.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f9cbe9a5

ipv4: do metrics match when looking up and deleting a route · 5f9ae3d9

Xin Long authored Aug 23, 2017

Now when ipv4 route inserts a fib_info, it memcmp fib_metrics.
It means ipv4 route identifies one route also with metrics.

But when removing a route, it tries to find the route without
caring about the metrics. It will cause that the route with
right metrics can't be removed.

Thomas noticed this issue when doing the testing:

1. add:
   # ip route append 192.168.7.0/24 dev v window 1000
   # ip route append 192.168.7.0/24 dev v window 1001
   # ip route append 192.168.7.0/24 dev v window 1002
   # ip route append 192.168.7.0/24 dev v window 1003
2. delete:
   # ip route delete 192.168.7.0/24 dev v window 1002
3. show:
     192.168.7.0/24 proto boot scope link window 1001
     192.168.7.0/24 proto boot scope link window 1002
     192.168.7.0/24 proto boot scope link window 1003

The one with window 1002 wasn't deleted but the first one was.

This patch is to do metrics match when looking up and deleting
one route.
Reported-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

5f9ae3d9

Merge branch 'tcp-sw-rx-timestamps' · d260e9e6

David S. Miller authored Aug 23, 2017

Mike Maloney says:

====================
net: Add software rx timestamp for TCP.

Add software rx timestamps for TCP, and a test to ensure consistency of
behavior between IP, UDP, and TCP implementation.

Changes since v1:
  -Initialize tss->ts[1] to 0 if caller requested any timestamps.
  -Fix test case to validate that tss->ts[1] is zero.
  -Fix tests to actually use a raw socket.
  -Fix --tcp flag to work on the test.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

d260e9e6

selftests/net: Add a test to validate behavior of rx timestamps · 16e78122

Mike Maloney authored Aug 22, 2017

Validate the behavior of the combination of various timestamp socket
options, and ensure consistency across ip, udp, and tcp.
Signed-off-by: Mike Maloney <maloney@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

16e78122

tcp: Extend SOF_TIMESTAMPING_RX_SOFTWARE to TCP recvmsg · 98aaa913

Mike Maloney authored Aug 22, 2017

When SOF_TIMESTAMPING_RX_SOFTWARE is enabled for tcp sockets, return the
timestamp corresponding to the highest sequence number data returned.

Previously the skb->tstamp is overwritten when a TCP packet is placed
in the out of order queue.  While the packet is in the ooo queue, save the
timestamp in the TCB_SKB_CB.  This space is shared with the gso_*
options which are only used on the tx path, and a previously unused 4
byte hole.

When skbs are coalesced either in the sk_receive_queue or the
out_of_order_queue always choose the timestamp of the appended skb to
maintain the invariant of returning the timestamp of the last byte in
the recvmsg buffer.
Signed-off-by: Mike Maloney <maloney@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

98aaa913

liquidio: change manner of detecting whether or not NIC firmware is loaded · b2854772

Felix Manlunas authored Aug 22, 2017

In the NIC firmware, the 1-bit flag indicating "firmware is loaded" moved
from SLI_SCRATCH_1 to SLI_SCRATCH_2 (these are Octeon general-purpose
scratch registers).  Make the PF driver conform to this change.

Remove code that sets the "firmware is loaded" flag because it's now the
firmware's job to do that.

In the code that detects whether or not the firmware is loaded, don't just
rely on checking the "firmware is loaded" flag because that may cause a
rare false negative.  Add code that deduces whether or not the firmware is
loaded; that will never give a false negative.

Also bump up driver version to match newer NIC firmware.
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b2854772

23 Aug, 2017 4 commits

gre: fix goto statement typo · e3d0328c

William Tu authored Aug 22, 2017

Fix typo: pnet_tap_faied.
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e3d0328c

Merge branch 'bpf-minor-cleanups' · 572a5767

David S. Miller authored Aug 22, 2017

Daniel Borkmann says:

====================
Two minor BPF cleanups

Two minor cleanups on devmap and redirect I still had
in my queue.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

572a5767

bpf: minor cleanups for dev_map · af4d045c

Daniel Borkmann authored Aug 23, 2017

Some minor code cleanups, while going over it I also noticed that
we're accounting the bitmap only for one CPU currently, so fix that
up as well.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

af4d045c

bpf: misc xdp redirect cleanups · e4a8e817

Daniel Borkmann authored Aug 23, 2017

Few cleanups including: bpf_redirect_map() is really XDP only due to
the return code. Move it to a more appropriate location where we do
the XDP redirect handling and change it's name into bpf_xdp_redirect_map()
to make it consistent to the bpf_xdp_redirect() helper.

xdp_do_redirect_map() helper can be static since only used out of filter.c
file. Drop the goto in xdp_do_generic_redirect() and only return errors
directly. In xdp_do_flush_map() only clear ri->map_to_flush which is the
arg we're using in that function, ri->map is cleared earlier along with
ri->ifindex.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

e4a8e817

22 Aug, 2017 25 commits

bpf: fix map value attribute for hash of maps · cd36c3a2

Daniel Borkmann authored Aug 23, 2017

Currently, iproute2's BPF ELF loader works fine with array of maps
when retrieving the fd from a pinned node and doing a selfcheck
against the provided map attributes from the object file, but we
fail to do the same for hash of maps and thus refuse to get the
map from pinned node.

Reason is that when allocating hash of maps, fd_htab_map_alloc() will
set the value size to sizeof(void *), and any user space map creation
requests are forced to set 4 bytes as value size. Thus, selfcheck
will complain about exposed 8 bytes on 64 bit archs vs. 4 bytes from
object file as value size. Contract is that fdinfo or BPF_MAP_GET_FD_BY_ID
returns the value size used to create the map.

Fix it by handling it the same way as we do for array of maps, which
means that we leave value size at 4 bytes and in the allocation phase
round up value size to 8 bytes. alloc_htab_elem() needs an adjustment
in order to copy rounded up 8 bytes due to bpf_fd_htab_map_update_elem()
calling into htab_map_update_elem() with the pointer of the map
pointer as value. Unlike array of maps where we just xchg(), we're
using the generic htab_map_update_elem() callback also used from helper
calls, which published the key/value already on return, so we need
to ensure to memcpy() the right size.

Fixes: bcc6b1b7 ("bpf: Add hash of maps support")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cd36c3a2

MIPS,bpf: fix missing break in switch statement · 4a00aa05

Colin Ian King authored Aug 22, 2017

There is a missing break causing a fall-through and setting
ctx.use_bbit_insns to the wrong value. Fix this by adding the
missing break.

Detected with cppcheck:
"Variable 'ctx.use_bbit_insns' is reassigned a value before the old
one has been used. 'break;' missing?"

Fixes: 8d8d18c3 ("MIPS,bpf: Fix using smp_processor_id() in preemptible splat.")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: David Daney <david.daney@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4a00aa05

net: sched: use kvmalloc() for class hash tables · 9695fe6f

Eric Dumazet authored Aug 22, 2017

High order GFP_KERNEL allocations can stress the host badly.

Use modern kvmalloc_array()/kvfree() instead of custom
allocations.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

9695fe6f

net: amd: constify zorro_device_id · 153890b4

Arvind Yadav authored Aug 22, 2017

zorro_device_id are not supposed to change at runtime. All functions
working with zorro_device_id provided by <linux/zorro.h> work with
const zorro_device_id. So mark the non-const structs as const.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

153890b4

Merge branch 'net-mvpp2-MAC-GoP-configuration' · 9142b673

David S. Miller authored Aug 22, 2017

Antoine Tenart says:

====================
net: mvpp2: MAC/GoP configuration

This is based on net-next (e2a7c34f).

I removed the GoP interrupt and PHY optional parts in this v2 to ease
the review process as the MAC/GoP initialization seemed to start less
discussions :)

This series now only aims at making the PPv2 driver less depending on
the firmware/bootloader initialization. Some patches cleanup some parts
as well, and add new interface descriptions in the dt.

The current implementation of the PPv2 driver relies on the
firmware/bootloader initialization to configure some parts, as the Group
of Ports (GoP) and the MACs (GMAC and/or XLG MAC --for 10G--).  The
drawback is the kernel must be configured to match exactly what the
bootloader configures which is not convenient and is an issue when using
boards having an Ethernet port and an SFP port wired to the same GoP
port, as no dynamic configuration can be done.

This series adds the GoP and GMAC/XLG MAC initializations so that the
PPV2 does not have to rely on a previous initialization. One part is
still missing from this series, and that would be the 'comphy' which
provides shared serdes PHYs and which must be configured as well for a
full kernel initialization to work. This comphy support will be part of
a following up series. (This series was also tested with this 'comphy'
support, as it's nearly ready).

@Dave: patches 9 and 10 should go through the mvebu tree. Thanks!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

9142b673

Documentation/bindings: net: marvell-pp2: add the system controller · 7afe461e

Antoine Ténart authored Aug 22, 2017

This patch documents the new marvell,system-controller property used by
the Marvell ppv2 network driver.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Tested-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7afe461e

net: mvpp2: initialize the GoP · f84bf386

Antoine Ténart authored Aug 22, 2017

The patch adds GoP (group of ports) initialization functions. The mvpp2
driver was relying on the firmware/bootloader initialization; this patch
moves this setup to the mvpp2 driver.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Tested-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f84bf386

net: mvpp2: set maximum packet size for 10G ports · 76eb1b1d

Stefan Chulski authored Aug 22, 2017

Set maximum packet size for XLG 10G ports. Missing maximum packet size
for XLG configuration will cause kernel panic if oversized packet is
received by port.
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

76eb1b1d

net: mvpp2: initialize the XLG MAC when using a port · 77321959

Antoine Ténart authored Aug 22, 2017

This adds a routine to initialize the XLG MAC at the port level when
using a port and the XAUI/10GKR interface mode. This wasn't done until
this commit, and the mvpp2 driver was relying on the bootloader/firmware
initialization. This doesn't mean everything is configured in the mvpp2
driver now, but it helps reducing the gap.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Tested-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

77321959

net: mvpp2: initialize the GMAC when using a port · 3919357f

Antoine Ténart authored Aug 22, 2017

This adds a routine to initialize the GMAC at the port level when using
a port. This wasn't done until this commit, and the mvpp2 driver was
relying on the bootloader/firmware initialization. This doesn't mean
everything is configured in the mvpp2 driver now, but it helps reducing
the gap.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Tested-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3919357f

net: mvpp2: move the mii configuration in the ndo_open path · 2055d626

Antoine Ténart authored Aug 22, 2017

This moves the mii configuration in the ndo_open path, to allow handling
different mii configurations later and to switch between these
configurations at runtime.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Tested-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2055d626

net: mvpp2: fix the synchronization module bypass macro name · 1068ec79

Antoine Ténart authored Aug 22, 2017

The macro defining the bit to toggle to bypass or not the
synchronization module is wrongly named. Writing 1 will disable bypass.
This patch s/MVPP22_CTRL4_SYNC_BYPASS/MVPP22_CTRL4_SYNC_BYPASS_DIS/.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Tested-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1068ec79

net: mvpp2: unify register definitions coding style · 81b6630f

Antoine Ténart authored Aug 22, 2017

Cosmetic patch to use the same formatting rules on all register
definitions.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Tested-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

81b6630f

gre: introduce native tunnel support for ERSPAN · 84e54fe0

William Tu authored Aug 22, 2017

The patch adds ERSPAN type II tunnel support.  The implementation
is based on the draft at [1].  One of the purposes is for Linux
box to be able to receive ERSPAN monitoring traffic sent from
the Cisco switch, by creating a ERSPAN tunnel device.
In addition, the patch also adds ERSPAN TX, so Linux virtual
switch can redirect monitored traffic to the ERSPAN tunnel device.
The traffic will be encapsulated into ERSPAN and sent out.

The implementation reuses tunnel key as ERSPAN session ID, and
field 'erspan' as ERSPAN Index fields:
./ip link add dev ers11 type erspan seq key 100 erspan 123 \
			local 172.16.1.200 remote 172.16.1.100

To use the above device as ERSPAN receiver, configure
Nexus 5000 switch as below:

monitor session 100 type erspan-source
  erspan-id 123
  vrf default
  destination ip 172.16.1.200
  source interface Ethernet1/11 both
  source interface Ethernet1/12 both
  no shut
monitor erspan origin ip-address 172.16.1.100 global

[1] https://tools.ietf.org/html/draft-foschiano-erspan-01
[2] iproute2 patch: http://marc.info/?l=linux-netdev&m=150306086924951&w=2
[3] test script: http://marc.info/?l=linux-netdev&m=150231021807304&w=2Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Meenakshi Vohra <mvohra@vmware.com>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

84e54fe0

udp: remove unreachable ufo branches · ab2fb7e3

Willem de Bruijn authored Aug 22, 2017

Remove two references to ufo in the udp send path that are no longer
reachable now that ufo has been removed.

Commit 85f1bd9a ("udp: consistently apply ufo or fragmentation")
is a fix to ufo. It is safe to revert what remains of it.

Also, no skb can enter ip_append_page with skb_is_gso true now that
skb_shinfo(skb)->gso_type is no longer set in ip_append_page/_data.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ab2fb7e3

net: mdio-gpio: make mdiobb_ops const · 41a130f7

Bhumika Goyal authored Aug 22, 2017

Make this const as it is only stored in a const field of a
mdiobb_ctrl structure.
Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

41a130f7

net: ethernet: freescale: fs_enet: make mdiobb_ops const · 94494733

Bhumika Goyal authored Aug 22, 2017

Make this const as it is only stored in a const field of a
mdiobb_ctrl structure.
Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

94494733

net: ethernet: ax88796: make mdiobb_ops const · 58e0c0db

Bhumika Goyal authored Aug 22, 2017

Make this const as it is only stored in a const field of a
mdiobb_ctrl structure.
Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

58e0c0db

Merge branch 'tcp_conn_request-cleanup' · 3d845ee1

David S. Miller authored Aug 22, 2017

Tonghao Zhang says:

====================
tcp: Simplify the tcp_conn_request.

Just simplify the tcp_conn_request function.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

3d845ee1

tcp: Remove the unused parameter for tcp_try_fastopen. · 11199369

Tonghao Zhang authored Aug 21, 2017

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

11199369

tcp: Get a proper dst before checking it. · 49c71586

Tonghao Zhang authored Aug 21, 2017

tcp_peer_is_proven needs a proper route to make the
determination, but dst always is NULL. This bug may
be there at the beginning of git tree. This does not
look serious enough to deserve backports to stable
versions.
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

49c71586

Merge branch 'hv_netvsc-Ethtool-handler-to-change-UDP-hash-levels' · 6c647935

David S. Miller authored Aug 22, 2017

Haiyang Zhang says:

====================
hv_netvsc: Ethtool handler to change UDP hash levels

The patch set adds the functions to switch UDP hash level between
L3 and L4 by ethtool command. UDP over IPv4 and v6 can be set
differently. The default hash level is L4. We currently only
allow switching TX hash level from within the guests.

The ethtool callback function is triggered by command line, and
update the per device variables of the hash level.

On Azure, fragmented UDP packets is not yet supported with L4
hashing, and may have high packet loss rate. Using L3 hashing is
recommended in this case. This ethtool option allows a user to
make this selection.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

6c647935

hv_netvsc: Update netvsc Document for UDP hash level setting · 3b0c3458

Haiyang Zhang authored Aug 21, 2017

Update Documentation/networking/netvsc.txt for UDP hash level setting
and related info.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3b0c3458

hv_netvsc: Add ethtool handler to set and get UDP hash levels · 4823eb2f

Haiyang Zhang authored Aug 21, 2017

The patch add the functions to switch UDP hash level between
L3 and L4 by ethtool command. UDP over IPv4 and v6 can be set
differently. The default hash level is L4. We currently only
allow switching TX hash level from within the guests.

On Azure, fragmented UDP packets have high loss rate with L4
hashing. Using L3 hashing is recommended in this case.

For example, for UDP over IPv4 on eth0:
To include UDP port numbers in hasing:
	ethtool -N eth0 rx-flow-hash udp4 sdfn
To exclude UDP port numbers in hasing:
	ethtool -N eth0 rx-flow-hash udp4 sd
To show UDP hash level:
	ethtool -n eth0 rx-flow-hash udp4
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4823eb2f

hv_netvsc: Clean up unused parameter from netvsc_get_rss_hash_opts() · 4c0e2cbf

Haiyang Zhang authored Aug 21, 2017

The parameter "nvdev" is not in use.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4c0e2cbf