Commits · 746305347ac39a194444029b164e76666a9cbf74 · Kirill Smelkov / linux

11 Feb, 2016 40 commits

Merge branch 'thunderx-irq-hints' · 74630534

David S. Miller authored Feb 11, 2016

Sunil Goutham says:

====================
net: thunderx: Setting IRQ affinity hints and other optimizations

This patch series contains changes
- To add support for virtual function's irq affinity hint
- Replace napi_schedule() with napi_schedule_irqoff()
- Reduce page allocation overhead by allocating pages
  of higher order when pagesize is 4KB.
- Add a couple of stats which help in debugging
- Some miscellaneous changes to BGX driver.

Changes from v1:
- As suggested, changed the invalid MAC address log message
  to use dev_err() instead of dev_warn().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

74630534

net: thunderx: Alloc higher order pages when pagesize is small · 6e4be8d6

Sunil Goutham authored Feb 11, 2016

Allocate higher-order pages when the page size is small. This
reduces the number of calls to the page allocator and the wastage of memory.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6e4be8d6
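
A rough way to see the effect described above, as a sketch only: the buffer length (1536 bytes), the allocation order (3) and the helper names below are hypothetical and are not taken from the driver. The arithmetic just counts how many receive buffers fit in one allocation and how many bytes are left over.

```python
def allocator_calls(num_buffers, buf_len, page_size, order):
    """Number of page-allocator calls needed to carve out num_buffers buffers."""
    chunk = page_size << order          # bytes returned by one allocation
    per_chunk = chunk // buf_len        # whole buffers that fit in one chunk
    if per_chunk == 0:
        raise ValueError("buffer does not fit in one allocation")
    return -(-num_buffers // per_chunk)  # ceiling division


def wasted_bytes(buf_len, page_size, order):
    """Bytes left unused at the tail of each allocation."""
    return (page_size << order) % buf_len


# Hypothetical numbers: 1024 buffers of 1536 bytes on a 4KB-page system.
order0_calls = allocator_calls(1024, 1536, 4096, 0)   # 2 buffers per call
order3_calls = allocator_calls(1024, 1536, 4096, 3)   # 21 buffers per call
```

Under these assumed numbers an order-0 allocation leaves 1024 of 4096 bytes unused, while an order-3 allocation leaves 512 of 32768 bytes unused.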

net: thunderx: bgx: Add log message when setting mac address · 1d82efac

Robert Richter authored Feb 11, 2016

Signed-off-by: Robert Richter <rrichter@cavium.com>
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1d82efac

net: thunderx: bgx: Use standard firmware node infrastructure. · eee326fd

David Daney authored Feb 11, 2016

In the case of OF device tree, the firmware information is attached to
the BGX device structure in the standard manner, so use the firmware
iterators and accessors where possible.
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

eee326fd

net: thunderx: Assign affinity hints to vf's interrupts · fb4b7d98

Sunil Goutham authored Feb 11, 2016

This affinity hint can be used by user space irqbalance tool to set
preferred CPU mask for irqs registered by this VF. Irqbalance needs
to be in 'exact' mode to set irq affinity same as indicated by
affinity hint.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fb4b7d98
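
For readers who want to inspect a hint from user space, the following is a minimal sketch, assuming the usual Linux `/proc/irq/<n>/affinity_hint` file and the kernel's comma-separated hex cpumask format. The function names are illustrative and are not part of the driver or of irqbalance.

```python
from typing import List


def parse_cpumask(text: str) -> List[int]:
    """Return the CPU numbers set in a kernel-style hex cpumask string."""
    # Cpumasks are printed as comma-separated 32-bit hex groups,
    # most significant group first, e.g. "00000000,00000004".
    value = int(text.strip().replace(",", ""), 16)
    cpus = []
    bit = 0
    while value:
        if value & 1:
            cpus.append(bit)
        value >>= 1
        bit += 1
    return cpus


def read_affinity_hint(irq: int) -> List[int]:
    """Read the hint for one IRQ (Linux only; the IRQ number is the caller's)."""
    with open(f"/proc/irq/{irq}/affinity_hint") as f:
        return parse_cpumask(f.read())
```

An all-zero mask parses to an empty list, which here means that no hint has been set for that interrupt.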

net: thunderx: Use napi_schedule_irqoff() · ef0a4d86

Sunil Goutham authored Feb 11, 2016

napi_schedule() is being called from hard irq context, hence
switch to napi_schedule_irqoff(), which avoids the unneeded calls
to local_irq_save() and local_irq_restore().
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ef0a4d86

net, thunderx: Add TX timeout and RX buffer alloc failure stats. · a05d4845

Thanneeru Srinivasulu authored Feb 11, 2016

When the system is low on atomic memory, too many error messages are logged.
Since this is not a total failure but a simple switch to non-atomic allocation,
it is better to have a stat.

Also add a stat for resets kicked off by the transmit watchdog timeout.
Signed-off-by: Thanneeru Srinivasulu <tsrinivasulu@caviumnetworks.com>
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a05d4845

Merge branch 'igmp-ns' · 65411adb

David S. Miller authored Feb 11, 2016

Nikolay Borisov says:

====================
Make igmp sysctl knobs namespace aware

This series continues making more of the net-related sysctls
namespace aware. The first 2 and last patches are
straightforward and convert sysctls which weren't defined to be
namespace aware. The only thing in them is that each removes
a define which is used in only one place (to initialise
the respective sysctl), so I don't think this is a huge loss.

The third patch however, converts igmp_llm_reports which was
already defined in the ipv4_net_table but wasn't using any of
the net namespace infrastructure.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

65411adb

igmp: Namespacify igmp_qrv sysctl knob · 165094af

Nikolay Borisov authored Feb 08, 2016

Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

165094af

igmp: Namespaceify igmp_llm_reports sysctl knob · 87a8a2ae

Nikolay Borisov authored Feb 09, 2016

This was initially introduced in df2cf4a7 ("IGMP: Inhibit
reports for local multicast groups") by defining the sysctl in the
ipv4_net_table array, however it was never implemented to be
namespace aware. Fix this by changing the code accordingly.
Signed-off-by: David S. Miller <davem@davemloft.net>

87a8a2ae

igmp: Namespaceify igmp_max_msf sysctl knob · 166b6b2d

Nikolay Borisov authored Feb 08, 2016

Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

166b6b2d

igmp: Namespaceify igmp_max_memberships sysctl knob · 815c5270

Nikolay Borisov authored Feb 08, 2016

Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

815c5270

bonding: use return instead of goto · 1e2a8868

Zhang Shengju authored Feb 09, 2016

Replace 'goto' with 'return' to remove unnecessary check at label:
err_undo_flags.

The reason is that 'err_undo_flags' does two things for the first slave device:
1. revert bond mac address if it is set by the slave device.
2. revert bond device type if it's not ARPHRD_ETHER.

This is not necessary for the following three places, as they change neither the
bond mac address nor the type. It's straightforward to return directly.
Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1e2a8868

net: macb: add wake-on-lan support via magic packet · 3e2a5e15

Sergio Prado authored Feb 09, 2016

Tested on Acqua A5 SoM (http://www.acmesystems.it/acqua).
Signed-off-by: Sergio Prado <sergio.prado@e-labworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3e2a5e15

net: hamradio: baycom_ser_fdx: Replace timeval with timespec64 · e6515203

Amitoj Kaur Chawla authored Feb 10, 2016

32-bit systems using 'struct timeval' will break in the year 2038, so
we replace the code appropriately. However, this driver is not broken
in 2038 since we are only using the microseconds portion of the time.

This patch replaces 'struct timeval' with 'struct timespec64'. We only
need to find elapsed microseconds rather than absolute time, so it's
better to use monotonic time. Using ktime_get_ts64() makes the code
more efficient and more robust against concurrent settimeofday()
calls.
Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Thomas Sailer <t.sailer@alumni.ethz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

e6515203
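
The point about monotonic time can be illustrated in user space. This is a sketch only, using Python's `time.monotonic_ns()` as a stand-in for the kernel's ktime_get_ts64(). The helper names are hypothetical.

```python
import time


def elapsed_us(start_ns: int, end_ns: int) -> int:
    """Elapsed whole microseconds between two monotonic nanosecond readings."""
    return (end_ns - start_ns) // 1000


def time_call(fn) -> int:
    """Run fn() and return how long it took in microseconds."""
    # A monotonic clock never jumps when the wall clock is changed, so the
    # difference stays meaningful even if the system time is set meanwhile.
    start = time.monotonic_ns()
    fn()
    return elapsed_us(start, time.monotonic_ns())
```

Because only a difference of two readings is used, the absolute value of the clock (and therefore any 2038 rollover of wall-clock seconds) never enters the result.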

openvswitch: allow management from inside user namespaces · 4a92602a

Tycho Andersen authored Feb 05, 2016

Operations with the GENL_ADMIN_PERM flag fail permissions checks because
this flag means we call netlink_capable, which uses the init user ns.

Instead, let's introduce a new flag, GENL_UNS_ADMIN_PERM for operations
which should be allowed inside a user namespace.

The motivation for this is to be able to run openvswitch in unprivileged
containers. I've tested this and it seems to work, but I really have no
idea about the security consequences of this patch, so thoughts would be
much appreciated.

v2: use the GENL_UNS_ADMIN_PERM flag instead of a check in each function
v3: use separate ifs for UNS_ADMIN_PERM and ADMIN_PERM, instead of one
    massive one
Reported-by: James Page <james.page@canonical.com>
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
CC: Eric Biederman <ebiederm@xmission.com>
CC: Pravin Shelar <pshelar@ovn.org>
CC: Justin Pettit <jpettit@nicira.com>
CC: "David S. Miller" <davem@davemloft.net>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

4a92602a
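
The "separate ifs" mentioned in the v3 note can be modelled as below. This is a simplified sketch, not the kernel code: the two booleans stand in for the capability checks against the init user namespace and against the user namespace that owns the network namespace, and the flag values are illustrative stand-ins.

```python
# Illustrative flag bits (stand-ins for the generic netlink op flags).
GENL_ADMIN_PERM = 0x01
GENL_UNS_ADMIN_PERM = 0x10


def may_run(op_flags: int, capable_in_init_ns: bool, capable_in_net_user_ns: bool) -> bool:
    """Return True if an operation with op_flags passes the permission checks."""
    # Classic check: requires the capability in the initial user namespace.
    if (op_flags & GENL_ADMIN_PERM) and not capable_in_init_ns:
        return False
    # Relaxed check: the capability in the netns-owning user namespace is enough.
    if (op_flags & GENL_UNS_ADMIN_PERM) and not capable_in_net_user_ns:
        return False
    return True
```

In this model a process that is privileged only inside its own user namespace is rejected for GENL_ADMIN_PERM operations but accepted for GENL_UNS_ADMIN_PERM ones, which is the case the commit message describes for unprivileged containers.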

ethtool: future-proof interface for speed extensions · 4456ed04

Michael S. Tsirkin authored Feb 07, 2016

Many virtual and not quite virtual devices allow any speed to be set
through ethtool. In particular, this applies to the virtio-net devices.
Document this fact to make sure people don't assume the enum lists all
possible values.  Reserve values greater than INT_MAX for future
extension and to avoid conflict with SPEED_UNKNOWN.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4456ed04
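
The reservation described above amounts to a simple range rule. The sketch below models the speed as an unsigned 32-bit field, in which SPEED_UNKNOWN (-1) appears as 0xFFFFFFFF. The function name is hypothetical and the code is an illustration of the rule, not the kernel helper.

```python
INT_MAX = 2**31 - 1
SPEED_UNKNOWN = 0xFFFFFFFF  # -1 as seen in an unsigned 32-bit field


def speed_is_valid(speed: int) -> bool:
    """Accept any speed up to INT_MAX, plus the SPEED_UNKNOWN sentinel."""
    if not 0 <= speed <= 0xFFFFFFFF:
        raise ValueError("speed must fit in an unsigned 32-bit field")
    # Values above INT_MAX are reserved for future extension.
    return speed <= INT_MAX or speed == SPEED_UNKNOWN
```

Note that a value such as 12345 is accepted even though it is not one of the enumerated speeds, which matches the statement that the enum does not list all possible values.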

vrf: duplicate include of rtnetlink.h · 809dc75e

stephen hemminger authored Feb 09, 2016

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

809dc75e

vxlan: udp_tunnel duplicate include net/udp_tunnel.h · 40d29af0

stephen hemminger authored Feb 09, 2016

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

40d29af0

rds: duplicate include net/tcp.h · f48e7231

stephen hemminger authored Feb 09, 2016

Duplicate include detected.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

f48e7231

bonding: Return correct error code · 6d9b6f42

Amitoj Kaur Chawla authored Feb 07, 2016

When kzalloc() fails to allocate memory, the function should
return -ENOMEM and not -1.

Found using Coccinelle. A simplified version of the semantic patch
used is:

//<smpl>
@@
expression *e;
@@

e = kzalloc(...);
if (e == NULL) {
...
return
- -1
+ -ENOMEM
;
}
//</smpl>

The single call site only checks that the return value is not 0,
hence no change is required at the call site.
Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6d9b6f42

Merge branch 'gso-checksums' · e7e9956d

David S. Miller authored Feb 11, 2016

Alexander Duyck says:

====================
Add GSO support for outer checksum w/ inner checksum offloads

This patch series updates the existing segmentation offload code for
tunnels to make better use of existing and updated GSO checksum
computation.  This is done primarily through two mechanisms.  First we
maintain a separate checksum in the GSO context block of the sk_buff.  This
allows us to maintain two checksum values, one offloaded with values stored
in csum_start and csum_offset, and one computed and tracked in
SKB_GSO_CB(skb)->csum.  By maintaining these two values we are able to take
advantage of the same sort of math used in local checksum offload so that
we can provide both inner and outer checksums with minimal overhead.

Below is the performance for a netperf session between an ixgbe PF and VF
on the same host but in different namespaces.  As can be seen a significant
gain in performance can be had from allowing the use of Tx checksum offload
on the inner headers while performing a software offload on the outer
header computation:

 Recv   Send   Send                       Utilization  Service Demand
 Socket Socket Message Elapsed            Send  Recv   Send  Recv
 Size   Size   Size    Time    Throughput local remote local remote
 bytes  bytes  bytes   secs.   10^6bits/s % S   % U    us/KB us/KB

Before:
 87380  16384  16384   10.00   12844.38   9.30  -1.00  0.712 -1.00
After:
 87380  16384  16384   10.00   13216.63   6.78  -1.00  0.504 -1.000

Changes from v1:
* Dropped use of CHECKSUM_UNNECESSARY for remote checksum offload
* Left encap_hdr_csum as it will likely be needed in future for SCTP GSO
* Broke the changes out over many more patches
* Updated GRE segmentation to more closely match UDP tunnel segmentation
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

e7e9956d
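
The "math" that makes two checksums cheap to maintain is the additivity of the 16-bit one's complement sum: the sum over a whole buffer can be combined from sums over its parts, so a running value for the payload can be reused when an outer header is added. The sketch below shows that property in user space. The function names are illustrative and this is not the kernel implementation.

```python
def fold(value: int) -> int:
    """Fold a wide sum into 16 bits with end-around carry."""
    while value >> 16:
        value = (value & 0xFFFF) + (value >> 16)
    return value


def ones_sum(data: bytes) -> int:
    """One's complement sum of data taken as big-endian 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad an odd trailing byte with zero
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    return fold(total)


def ones_add(a: int, b: int) -> int:
    """Combine two partial one's complement sums."""
    return fold(a + b)


def checksum(data: bytes) -> int:
    """Internet-style checksum: complement of the one's complement sum."""
    return ~ones_sum(data) & 0xFFFF


# Combining partial sums matches summing the concatenation, provided the
# first part has an even length so that 16-bit words stay aligned.
outer_header = bytes(range(8))
inner_packet = b"inner packet payload"
combined = ones_add(ones_sum(outer_header), ones_sum(inner_packet))
```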

net: Allow tunnels to use inner checksum offloads with outer checksums needed · f245d079

Alexander Duyck authored Feb 05, 2016

This patch enables us to use inner checksum offloads if provided by
hardware with outer checksums computed by software.

It basically reduces encap_hdr_csum to an advisory flag for now, but based
on the fact that SCTP may be getting segmentation support before long I
thought we may want to keep it as it is possible we may need to support
CRC32c and 1's complement checksum in the same packet at some point in the
future.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f245d079

udp: Use uh->len instead of skb->len to compute checksum in segmentation · dbef491e

Alexander Duyck authored Feb 05, 2016

The segmentation code was having to do a bunch of work to pull the
skb->len and strip the udp header offset before the value could be used to
adjust the checksum. Instead of doing all this work we can just use the
value that goes into uh->len since that is the correct value with the
correct byte order that we need anyway. By using this value we can save
ourselves a bunch of pain as there is no need to do multiple byte swaps.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dbef491e

udp: Clean up the use of flags in UDP segmentation offload · fdaefd62

Alexander Duyck authored Feb 05, 2016

This patch goes through and cleans up the logic related to several of the
control flags used in UDP segmentation.  Specifically the use of dont_encap
isn't really needed as we can just check the skb for CHECKSUM_PARTIAL and
if it isn't set then we don't need to update the internal headers.  As such
we can just drop that value.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fdaefd62

gre: Use inner_proto to obtain inner header protocol · 38720352

Alexander Duyck authored Feb 05, 2016

Instead of parsing headers to determine the inner protocol we can just pull
the value from inner_proto.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

38720352

gre: Use GSO flags to determine csum need instead of GRE flags · 2e598af7

Alexander Duyck authored Feb 05, 2016

This patch updates the gre checksum path to follow something much closer to
the UDP checksum path.  By doing this we can avoid needing to do as much
header inspection and can just make use of the fields we were already
reading in the sk_buff structure.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2e598af7

net: Move skb_has_shared_frag check out of GRE code and into segmentation · ddff00d4

Alexander Duyck authored Feb 05, 2016

The call skb_has_shared_frag is used in the GRE path and skb_checksum_help
to verify that no frags can be modified by an external entity. This check
really doesn't belong in the GRE path but in the skb_segment function
itself. This way any protocol that might be segmented will be performing
this check before attempting to offload a checksum to software.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ddff00d4

net: Store checksum result for offloaded GSO checksums · 08b64fcc

Alexander Duyck authored Feb 05, 2016

This patch makes it so that we can offload the checksums for a packet up
to a certain point and then begin computing the checksums via software.
Setting this up is fairly straightforward as all we need to do is reset
the values stored in csum and csum_start for the GSO context block.

One complication for this is remote checksum offload. In order to allow
the inner checksums to be offloaded while computing the outer checksum
manually we needed to have some way of indicating that the offload wasn't
real. In order to do that I replaced CHECKSUM_PARTIAL with
CHECKSUM_UNNECESSARY in the case of us computing checksums for the outer
header while skipping computing checksums for the inner headers. We clean
up the ip_summed flag and set it to either CHECKSUM_PARTIAL or
CHECKSUM_NONE once we hand the packet off to the next lower level.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

08b64fcc

net: Update remote checksum segmentation to support use of GSO checksum · 7fbeffed

Alexander Duyck authored Feb 05, 2016

This patch addresses two main issues.

First, in the case of remote checksum offload we were avoiding dealing with
scatter-gather issues.  As a result it would be possible to assemble a
series of frames that used frags instead of being linearized as they should
have been if remote checksum offload was enabled.

Second I have updated the code so that we now let GSO take care of doing
the checksum on the data itself and drop the special case that was added
for remote checksum offload.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7fbeffed

net: Move GSO csum into SKB_GSO_CB · 76443456

Alexander Duyck authored Feb 05, 2016

This patch moves the checksum maintained by GSO out of skb->csum and into
the GSO context block in order to allow for us to work on outer checksums
while maintaining the inner checksum offsets in the case of the inner
checksum being offloaded, while the outer checksums will be computed.

While updating the code I also did a minor cleanup on gso_make_checksum.
The change is mostly to make it so that we store the values and compute the
checksum instead of computing the checksum and then storing the values we
needed to update.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

76443456

net: Drop unnecessary enc_features variable from tunnel segmentation functions · bef3c6c9

Alexander Duyck authored Feb 05, 2016

The enc_features variable isn't necessary since features isn't used
anywhere after we create enc_features so instead just use a destructive AND
on features itself and save ourselves the variable declaration.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bef3c6c9

hv_netvsc: cleanup netdev feature flags for netvsc · a060679c

sixiao@microsoft.com authored Feb 04, 2016

1. Adding NETIF_F_TSO6 feature flag;
2. Adding NETIF_F_HW_CSUM. NETIF_F_IPV6_CSUM and NETIF_F_IP_CSUM are
being deprecated;
3. Clean up the coding style of flag assignment by using macro.
Signed-off-by: Simon Xiao <sixiao@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a060679c

Merge branch 'ethtool-nfc-ipv6' · c1e9a491

David S. Miller authored Feb 11, 2016

Edward Cree says:

====================
IPv6 NFC

This series adds support for steering IPv6 flows using the ethtool NFC
interface, and implements it for sfc devices.
Tested using an in-development patch to the ethtool utility.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

c1e9a491

sfc: implement IPv6 NFC (and IPV4_USER_FLOW) · a7ad40d0

Edward Cree authored Feb 05, 2016

Signed-off-by: Edward Cree <ecree@solarflare.com>
Reviewed-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

a7ad40d0

ethtool: add IPv6 to the NFC API · 72bb6872

Edward Cree authored Feb 05, 2016

Signed-off-by: Edward Cree <ecree@solarflare.com>
Reviewed-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

72bb6872

Merge branch 'cxgb4-tos' · 84e7f67a

David S. Miller authored Feb 11, 2016

Hariprasad Shenai says:

====================
Add TOS support and some cleanup

This series adds TOS support for iWARP and also does some cleanup to make
code more readable. Patch series is created against infiniband tree and
includes patches on iw_cxgb4 and cxgb4 driver.

We have included all the maintainers of respective drivers. Kindly review
the change and let us know in case of any review comments.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

84e7f67a

cxgb4/iw_cxgb4: TOS support · ba9cee6a

Hariprasad Shenai authored Feb 05, 2016

This series provides support for iWARP applications to specify a TOS
value and have that map to a VLAN Priority for iw_cxgb4 iWARP connections.

In iw_cxgb4, when allocating an L2T entry, pass the skb_priority based
on the tos value in the cm_id. Also pass the correct tos value during
connection setup so the passive side gets the client's desired tos.
When sending the FLOWC work request to FW, if the egress device is
in a vlan, then use the vlan priority bits as the scheduling class.
This allows associating RDMA connections with scheduling classes to
provide traffic shaping per flow.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ba9cee6a
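
The "vlan priority bits" referred to above are the top three bits of the 16-bit 802.1Q tag control information (TCI). The sketch below shows how they are packed and extracted. The constant names mirror the usual VLAN definitions, but how the driver turns the result into a scheduling class is not shown here and should not be inferred from this example.

```python
VLAN_PRIO_MASK = 0xE000   # priority code point: top 3 bits of the TCI
VLAN_PRIO_SHIFT = 13
VLAN_VID_MASK = 0x0FFF    # VLAN identifier: low 12 bits of the TCI


def make_tci(priority: int, vid: int) -> int:
    """Build a TCI from a 3-bit priority and a 12-bit VLAN ID (DEI bit clear)."""
    if not 0 <= priority <= 7:
        raise ValueError("priority must be 0..7")
    if not 0 <= vid <= VLAN_VID_MASK:
        raise ValueError("vid must be 0..4095")
    return (priority << VLAN_PRIO_SHIFT) | vid


def vlan_priority(tci: int) -> int:
    """Extract the 3-bit priority from a TCI."""
    return (tci & VLAN_PRIO_MASK) >> VLAN_PRIO_SHIFT
```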

iw_cxgb4: remove false error log entry · 6102c66e

Hariprasad Shenai authored Feb 05, 2016

Don't log errors if a listening endpoint is going away when processing a
PASS_ACCEPT_REQ message.  This can happen.  Change the error printk to
a PDBG() debug log entry.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6102c66e

iw_cxgb4: make queue allocation code more readable · 9d3053ef

Hariprasad Shenai authored Feb 05, 2016

Rename local mm* variables to more meaningful names
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9d3053ef