Commits · 18a4c0eab2623cc95be98a1e6af1ad18e7695977 · nexedi / linux

06 Oct, 2017 40 commits

net: add rb_to_skb() and other rb tree helpers · 18a4c0ea

Eric Dumazet authored Oct 05, 2017

Geeralize private netem_rb_to_skb()

TCP rtx queue will soon be converted to rb-tree,
so we will need skb_rbtree_walk() helpers.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

18a4c0ea

Merge branch '40GbE' of ra.kernel.org:/pub/scm/linux/kernel/git/jkirsher/next-queue · f5333f80

David S. Miller authored Oct 06, 2017

Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2017-10-06

This series contains updates to i40e and i40evf only.

Rami fixes a typo in the code comments.

Mitch adds an ethtool private flag to control source pruning to resolve an
issue where our default behavior is to enable source pruning which breaks ARP
monitoring in channel bonding.  Fixes a couple of register definitions, which
were incorrect.

Jake fixes an issue with multiple logical CPUs per core (simultaneous
multithreading - SMT) and how we set an affinity hint based on the v_idx of
that q_vector, which is an incremental value and might lead to multiple
offline CPUs being assigned to a q_vector.  Instead, we should only assign
hints for CPUs which are online, so look to use cpumask_local_spread().
Also fixed a VF VLAN tag stripping issue, where the flag created to change
this feature was seen as unchangeable.  Lastly, organized and re-numbered
the feature flags.

Alan re-enables PTP L4 for XL710 devices with firmware version 6.0 or
greater, now that the previous bug in the older firmware is fixed.
Implements the PCI error handlers for reset_prepare() and reset_done() to
allow us to handle function level resets.

Alice cleans up code that was added to the incorrect function during a
merge.

Filip adds a change to display an error message when a module is inserted
that does not meet the thermal requirements, Talking Heads "Burning Down
the House" comes to mind.  Also fixed a flow director filter issue where
a variable was not being cleared which stores the filter number to be
removed from the list when the firmware refused to add the requested
filter.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

f5333f80

Merge tag 'batadv-next-for-davem-20171006' of git://git.open-mesh.org/linux-merge · 4bc4e64c

David S. Miller authored Oct 06, 2017

Simon Wunderlich says:

====================
This cleanup patchset includes the following patches:

 - bump version strings, by Simon Wunderlich

 - Cleanup patches to make checkpatch happy, by Sven Eckelmann (3 patches)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

4bc4e64c

bnx2x: Use pci_ari_enabled() instead of local copy · d2746fe5

Bjorn Helgaas authored Oct 06, 2017

Use pci_ari_enabled() from the PCI core instead of the identical local copy
bnx2x_ari_enabled().  No functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d2746fe5

Merge branch 'xdp_monitor-improve' · 399e3514

David S. Miller authored Oct 06, 2017

Jesper Dangaard Brouer says:

====================
Improve xdp_monitor samples/bpf

Here are some improvements to the xdp_monitor tool currently located
under samples/bpf/.  Once the tools library libbpf become more feature
complete, xdp_monitor should be converted to use it, and be moved into
tools/bpf/xdp/ or tools/xdp/.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

399e3514

samples/bpf: xdp_monitor increase memory rlimit · c4eb7f46

Jesper Dangaard Brouer authored Oct 06, 2017

Other concurrent running programs, like perf or the XDP program what
needed to be monitored, might take up part of the max locked memory
limit.  Thus, the xdp_monitor tool have to set the RLIMIT_MEMLOCK to
RLIM_INFINITY, as it cannot determine a more sane limit.

Using the man exit(3) specified EXIT_FAILURE return exit code, and
correct other users too.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c4eb7f46

samples/bpf: xdp_monitor also record xdp_exception tracepoint · 280b058d

Jesper Dangaard Brouer authored Oct 06, 2017

Also monitor the tracepoint xdp_exception. This tracepoint is usually
invoked by the drivers. Programs themselves can activate this by
returning XDP_ABORTED, which will drop the packet but also trigger the
tracepoint. This is useful for distinguishing intentional (XDP_DROP)
vs. ebpf-program error cases that cased a drop (XDP_ABORTED).

Drivers also use this tracepoint for reporting on XDP actions that are
unknown to the specific driver. This can help the user to detect if a
driver e.g. doesn't implement XDP_REDIRECT yet.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

280b058d

samples/bpf: xdp_monitor first 8 bytes are not accessible by bpf · f4ce0a01

Jesper Dangaard Brouer authored Oct 06, 2017

The first 8 bytes of the tracepoint context struct are not accessible
by the bpf code.  This is a choice that dates back to the original
inclusion of this code.

See explaination in:
 commit 98b5c2c6 ("perf, bpf: allow bpf programs attach to tracepoints")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

f4ce0a01

Merge branch 'nfp-extend-match-and-action' · c6a15752

David S. Miller authored Oct 06, 2017

Simon Horman says:

====================
nfp: extend match and action for flower offload

Pieter says:

This series extends flower offload match and action capabilities. It
specifically adds offload capabilities for matching on MPLS, TTL, TOS
and flow label. Furthermore offload capabilities for action have been
expanded to include set ethernet, ipv4, ipv6, tcp and udp headers.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

c6a15752

nfp: add set tcp and udp header action flower offload · f8b7b0a6

Pieter Jansen van Vuuren authored Oct 06, 2017

Previously we did not have offloading support for set TCP/UDP actions. This
patch enables TC flower offload of set TCP/UDP sport and dport actions.
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f8b7b0a6

nfp: add set ipv6 source and destination address · 354b82bb

Pieter Jansen van Vuuren authored Oct 06, 2017

Previously we did not have offloading support for set IPv6 actions. This
patch enables TC flower offload of set IPv6 src and dst address actions.
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

354b82bb

nfp: add set ipv4 header action flower offload · c0b1bd9a

Pieter Jansen van Vuuren authored Oct 06, 2017

Previously we did not have offloading support for set IPv4 actions. This
patch enables TC flower offload of set IPv4 src and dst address actions.
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c0b1bd9a

nfp: add set ethernet header action flower offload · da83d8fe

Pieter Jansen van Vuuren authored Oct 06, 2017

Previously we did not have offloading support for set ethernet actions.
This patch enables TC flower offload of set ethernet actions.
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

da83d8fe

nfp: add IPv6 ttl and tos match offloading support · fc53b4a7

Pieter Jansen van Vuuren authored Oct 06, 2017

Previously matching on IPv6 ttl and tos fields were not offloaded. This
patch enables offloading IPv6 ttl and tos as match fields.
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fc53b4a7

nfp: add IPv4 ttl and tos match offloading support · a1e9203c

Pieter Jansen van Vuuren authored Oct 06, 2017

Previously matching on IPv4 ttl and tos fields were not offloaded. This
patch enables offloading IPv4 ttl and tos as match fields.
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a1e9203c

nfp: add mpls match offloading support · bb055c19

Pieter Jansen van Vuuren authored Oct 06, 2017

Previously MPLS match offloading was not supported. This patch enables
MPLS match offloading support for label, bos and tc fields.
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bb055c19

net/ipv6: Convert icmpv6_push_pending_frames to void · 4e64b1ed

Joe Perches authored Oct 05, 2017

commit cc71b7b0 ("net/ipv6: remove unused err variable on
icmpv6_push_pending_frames") exposed icmpv6_push_pending_frames
return value not being used.

Remove now unnecessary int err declarations and uses.

Miscellanea:

o Remove unnecessary goto and out: labels
o Realign arguments
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4e64b1ed

i40e/i40evf: organize and re-number feature flags · b74f571f

Jacob Keller authored Sep 01, 2017

Now that we've reduced the number of flags, organize similar flags
together and re-number them accordingly.

Since we don't yet have more than 32 flags, we'll use a u32 for both the
hw_features and flag field. Should we gain more flags in the future, we
may need to convert to a u64 or separate flags out into two fields.

One alternative approach considered, but not implemented here, was to
use an enumeration for the flag variables, and create a macro
I40E_FLAG() which used string concatenation to generate BIT_ULL values.
This has the advantage of making the actual bit values compile-time
dynamic so that we do not need to worry about matching the order to the
bit value. However, this does produce a high level of code churn, and
makes it more difficult to read a dumped flags value when debugging.

Change-ID: I8653fff69453cd547d6fe98d29dfa9d8710387d1
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

b74f571f

i40e: ignore skb->xmit_more when deciding to set RS bit · a5340d93

Jacob Keller authored Aug 29, 2017

Since commit 6a7fded7 ("i40e: Fix RS bit update in Tx path and
disable force WB workaround") we've tried to "optimize" setting the
RS bit based around skb->xmit_more. This same logic was refactored
in commit 1dc8b538 ("i40e: Reorder logic for coalescing RS bits"),
but ultimately was not functionally changed.

Using skb->xmit_more in this way is incorrect, because in certain
circumstances we may see a large number of skbs in sequence with
xmit_more set. This leads to a performance loss as the hardware does not
writeback anything for those packets, which delays the time it takes for
us to respond to the stack transmit requests. This significantly impacts
UDP performance, especially when layered with multiple devices, such as
bonding, VLANs, and vnet setups.

This was not noticed until now because it is difficult to create a setup
which reproduces the issue. It was discovered in a UDP_STREAM test in
a VM, connected using a vnet device to a bridge, which is connected to
a bonded pair of X710 ports in active-backup mode with a VLAN. These
layered devices seem to compound the number of skbs transmitted at once
by the qdisc. Additionally, the problem can be masked by reducing the
ITR value.

Since the original commit does not provide strong justification for this
RS bit "optimization", revert to the previous behavior of setting the RS
bit every 4th packet.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

a5340d93

i40evf: enable support for VF VLAN tag stripping control · 0a3b4f70

Jacob Keller authored Aug 29, 2017

A recent commit 809481484e5d ("i40e/i40evf: support for VF VLAN tag
stripping control") added support for VFs to negotiate the control of
VLAN tag stripping. This should have allowed VFs to disable the feature.
Unfortunately, the flag was set only in netdev->feature flags and not in
netdev->hw_features.

This ultimately causes the stack to assume that it cannot change the
flag, so it was unchangeable and marked as [fixed] in the ethtool -k
output.

Fix this by setting the feature in hw_features first, just as we do for
the PF code. This enables ethtool -K to disable the feature correctly,
and fully enables user control of the VLAN tag stripping feature.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

0a3b4f70

i40e: do not enter PHY debug mode while setting LEDs behaviour · 052b93d0

Mariusz Stachura authored Aug 29, 2017

Previous implementation of LED set/get functions required to enter
PHY debug mode, in order to prevent access to it from FW and SW at
the same time. Reset of all ports was a unwanted side effect.
Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

052b93d0

i40e: implement split PCI error reset handler · 19b7960b

Alan Brady authored Aug 29, 2017

This patch implements the PCI error handler reset_prepare and reset_done.
This allows us to handle function level reset.  Without this patch we
are unable to perform and recover from an FLR correctly and this will cause
VFs to be unable to recover from an FLR on the PF.
Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

19b7960b

i40e: Properly maintain flow director filters list · 013df598

Filip Sadowski authored Aug 29, 2017

When there is no space for more flow director filters and user requested to
add a new one it is rejected by firmware and automatically removed from the
filter list maintained by driver. This behaviour is correct. Afterwards
existing filter can be removed making free slot for the new one. This
however causes the newly added filter to be accepted by firmware but
removed from driver filter list resulting in not showing after issuing
'ethtool -n <dev_name>'.

This happened due to not clearing the variable pf->fd_inv which stores
filter number to be removed from the list when firmware refused to add the
requested filter. It caused the filter with this specific ID to be
constantly removed once it was added to the list although it has been
accepted by firmware and effectively applied to the NIC.
It was fixed by clearing pf->fd_inv variable after removal of the filter
from the list when it was rejected by firmware.
Signed-off-by: Filip Sadowski <filip.sadowski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

013df598

i40e: Display error message if module does not meet thermal requirements · 9a858178

Filip Sadowski authored Aug 29, 2017

This patch causes error message to be displayed when NIC detects
insertion of module that does not meet thermal requirements.
Signed-off-by: Filip Sadowski <filip.sadowski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

9a858178

i40e: fix merge error · 7f661822

Alice Michael authored Aug 29, 2017

This patch removes some code that was accidentally added to
the wrong function with a merge error.  Fixes: c53934c6
("i40e: fix: do not sleep in netdev_ops")
Signed-off-by: Alice Michael <alice.michael@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

7f661822

i40e/i40evf: use DECLARE_BITMAP for state · bd6cd4e6

Jesse Brandeburg authored Aug 29, 2017

When using set_bit and friends, we should be using actual
bitmaps, and fix all the locations where we might access
it.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

bd6cd4e6

i40e: fix incorrect register definition · 0a0d9af5

Mitch Williams authored Aug 29, 2017

This register was defined incorrectly. Fix the increment value to 8, and
replace the iterator with _i to make the definition consistent with
other statistics registers.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

0a0d9af5

i40e: redfine I40E_PHY_TYPE_MAX · 60518a04

Mitch Williams authored Aug 29, 2017

Since I40E_PHY_TYPE_MAX is used as an iterator, usually combined with
some sort of bit-shifting, it should only include actual PHY types and
not error cases. Move it up in the enum declaration so that loops only
iterate across valid PHY types.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

60518a04

i40e: re-enable PTP L4 capabilities for XL710 if FW >6.0 · c3d26b75

Alan Brady authored Aug 29, 2017

Starting with XL710 FW 5.3 PTP L4 was disabled for XL710 due to a bug.  The
bug has since been resolved in XL710 FW >6.0 and PTP L4 can now be
re-enabled on those devices with updated firmware.
Signed-off-by: Alan Brady <alan.brady@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

c3d26b75

i40e/i40evf: spread CPU affinity hints across online CPUs only · be664cbe

Jacob Keller authored Aug 29, 2017

Currently, when setting up the IRQ for a q_vector, we set an affinity
hint based on the v_idx of that q_vector. Meaning a loop iterates on
v_idx, which is an incremental value, and the cpumask is created based
on this value.

This is a problem in systems with multiple logical CPUs per core (like in
simultaneous multithreading (SMT) scenarios). If we disable some logical
CPUs, by turning SMT off for example, we will end up with a sparse
cpu_online_mask, i.e., only the first CPU in a core is online, and
incremental filling in q_vector cpumask might lead to multiple offline
CPUs being assigned to q_vectors.

Example: if we have a system with 8 cores each one containing 8 logical
CPUs (SMT == 8 in this case), we have 64 CPUs in total. But if SMT is
disabled, only the 1st CPU in each core remains online, so the
cpu_online_mask in this case would have only 8 bits set, in a sparse way.

In general case, when SMT is off the cpu_online_mask has only C bits set:
0, 1*N, 2*N, ..., C*(N-1)  where
C == # of cores;
N == # of logical CPUs per core.
In our example, only bits 0, 8, 16, 24, 32, 40, 48, 56 would be set.

Instead, we should only assign hints for CPUs which are online. Even
better, the kernel already provides a function, cpumask_local_spread()
which takes an index and returns a CPU, spreading the interrupts across
local NUMA nodes first, and then remote ones if necessary.

Since we generally have a 1:1 mapping between vectors and CPUs, there
is no real advantage to spreading vectors to local CPUs first. In order
to avoid mismatch of the default XPS hints, we'll pass -1 so that it
spreads across all CPUs without regard to the node locality.

Note that we don't need to change the q_vector->affinity_mask as this is
initialized to cpu_possible_mask, until an actual affinity is set and
then notified back to us.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

be664cbe

i40e: add private flag to control source pruning · 64615b54

Mitch Williams authored Aug 29, 2017

By default, our devices do source pruning, that is, they drop receive
packets that have the source MAC matching one of the receive filters.
Unfortunately, this breaks ARP monitoring in channel bonding, as the
bonding driver expects devices to receive ARPs containing their own
source address.

Add an ethtool private flag to control this feature.

Also, remove the netif_running() check when we process our private
flags. It's OK to reset when the device is closed and in most cases we
need the reset the apply these changes.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

64615b54

i40e: fix a typo in i40e_pf documentation · ec2f25d2

Rami Rosen authored Aug 19, 2017

This patch fixes a typo in i40e_pf object documentation; num_req_vfs
refers to the number of VFs requested for the PF.
Signed-off-by: Rami Rosen <rami.rosen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ec2f25d2

net/ipv6: remove unused err variable on icmpv6_push_pending_frames · cc71b7b0

Tim Hansen authored Oct 05, 2017

int err is unused by icmpv6_push_pending_frames(), this patch returns removes the variable and returns the function with 0.

git bisect shows this variable has been around since linux has been in git in commit 1da177e4.

This was found by running make coccicheck M=net/ipv6/ on linus' tree on commit 77ede3a0 (current HEAD as of this patch).
Signed-off-by: Tim Hansen <devtimhansen@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cc71b7b0

net: ipv6: remove unused code in ipv6_find_hdr() · 380537b4

Lin Zhang authored Oct 06, 2017

Storing the left length of skb into 'len' actually has no effect
so we can remove it.
Signed-off-by: Lin Zhang <xiaolou4617@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

380537b4

Merge branch 'libbpf-support-more-map-options' · d433b3f3

David S. Miller authored Oct 05, 2017

Craig Gallek says:

====================
libbpf: support more map options

The functional change to this series is the ability to use flags when
creating maps from object files loaded by libbpf.  In order to do this,
the first patch updates the library to handle map definitions that
differ in size from libbpf's struct bpf_map_def.

For object files with a larger map definition, libbpf will continue to load
if the unknown fields are all zero, otherwise the map is rejected.  If the
map definition in the object file is smaller than expected, libbpf will use
zero as a default value in the missing fields.
====================
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

d433b3f3

libbpf: use map_flags when creating maps · fe9b5f77

Craig Gallek authored Oct 05, 2017

This is required to use BPF_MAP_TYPE_LPM_TRIE or any other map type
which requires flags.
Signed-off-by: Craig Gallek <kraig@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

fe9b5f77

libbpf: parse maps sections of varying size · b13c5c14

Craig Gallek authored Oct 05, 2017

This library previously assumed a fixed-size map options structure.
Any new options were ignored.  In order to allow the options structure
to grow and to support parsing older programs, this patch updates
the maps section parsing to handle varying sizes.

Object files with maps sections smaller than expected will have the new
fields initialized to zero.  Object files which have larger than expected
maps sections will be rejected unless all of the unrecognized data is zero.

This change still assumes that each map definition in the maps section
is the same size.
Signed-off-by: Craig Gallek <kraig@google.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

b13c5c14

net: qcom/emac: make function emac_isr static · d009313c

Colin Ian King authored Oct 05, 2017

The function emac_isr is local to the source and does not need to
be in global scope, so make it static.

Cleans up sparse warnings:
symbol 'emac_isr' was not declared. Should it be static?
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d009313c

Merge branch 'tcp-improving-RACK-cpu-performance' · cec451ce

David S. Miller authored Oct 05, 2017

Yuchung Cheng says:

====================
tcp: improving RACK cpu performance

This patch set improves the CPU consumption of the RACK TCP loss
recovery algorithm, in particular for high-speed networks. Currently,
for every ACK in recovery RACK can potentially iterate over all sent
packets in the write queue. On large BDP networks with non-trivial
losses the RACK write queue walk CPU usage becomes unreasonably high.

This patch introduces a new queue in TCP that keeps only skbs sent and
not yet (s)acked or marked lost, in time order instead of sequence
order.  With that, RACK can examine this time-sorted list and only
check packets that were sent recently, within the reordering window,
per ACK. This is the fastest way without any write queue walks. The
number of skbs examined per ACK is reduced by orders of magnitude.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

cec451ce

tcp: a small refactor of RACK loss detection · bef06223

Yuchung Cheng authored Oct 04, 2017

Refactor the RACK loop to improve readability and speed up the checks.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bef06223