Commits · a293974590cfdc2d59c559a54d62a5ecb648104b · Kirill Smelkov / linux

30 Nov, 2018 26 commits

rtnetlink: avoid frame size warning in rtnl_newlink() · a2939745

Jakub Kicinski authored Nov 27, 2018

Standard kernel compilation produces the following warning:

net/core/rtnetlink.c: In function ‘rtnl_newlink’:
net/core/rtnetlink.c:3232:1: warning: the frame size of 1288 bytes is larger than 1024 bytes [-Wframe-larger-than=]
 }
  ^

This should not really be an issue, as rtnl_newlink() stack is
generally quite shallow.

Fix the warning by allocating attributes with kmalloc() in a wrapper
and passing it down to rtnl_newlink(), avoiding complexities on error
paths.

Alternatively we could kmalloc() some structure within rtnl_newlink(),
slave attributes look like a good candidate.  In practice it adds to
already rather high complexity and length of the function.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a2939745

rtnetlink: remove a level of indentation in rtnl_newlink() · 420d0318

Jakub Kicinski authored Nov 27, 2018

rtnl_newlink() used to create VLAs based on link kind.  Since
commit ccf8dbcd ("rtnetlink: Remove VLA usage") statically
sized array is created on the stack, so there is no more use
for a separate code block that used to be the VLA's live range.

While at it christmas tree the variables.  Note that there is
a goto-based retry so to be on the safe side the variables can
no longer be initialized in place.  It doesn't seem to matter,
logically, but why make the code harder to read..
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

420d0318

Merge branch 'nfp-update-TX-path-to-enable-repr-offloads' · 74315c39

David S. Miller authored Nov 30, 2018

Jakub Kicinski says:

====================
nfp: update TX path to enable repr offloads

This set starts with three micro optimizations to the TX path.
The improvement is measurable, but below 1% of CPU utilization.

Patches 4 - 9 add basic TX offloads to representor devices, like
checksum offload or TSO, and remove the unnecessary TX lock and
Qdisc (our representors are software constructs on top of the PF).

The last 2 patches add more info to error messages - id of command
which failed and exact location of incorrect TLVs, very useful for
debugging.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

74315c39

nfp: report more info when reconfiguration fails · 6db3a9dc

Jakub Kicinski authored Nov 27, 2018

FW reconfiguration timeouts are a common indicator of FW trouble.
To make debugging easier print requested update and control word
when reconfiguration fails.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6db3a9dc

nfp: add offset to all TLV parsing errors · 9571d987

Jakub Kicinski authored Nov 27, 2018

When troubleshooting incorrect FW capabilities it's useful to know
where the faulty TLV is located. Add offset to all errors messages.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9571d987

nfp: add offloads on representors · 51a6588e

Jakub Kicinski authored Nov 27, 2018

FW/HW can generally support the standard networking offloads
on representors without any trouble.  Add the ability for FW
to advertise which features should be available on representors.

Because representors are muxed on top of the vNIC we need to listen
on feature changes of their lower devices, and update their features
appropriately.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

51a6588e

nfp: add locking around representor changes · 71844fac

Jakub Kicinski authored Nov 27, 2018

Up until now we never needed to keep a networking locks around
representors accesses, we only accessed them when device was
reconfigured (under nfp pf->lock) or on fast path (under RCU).
Now we want to be able to iterate over all representors during
notifications, so make sure representor assignment is done
under RTNL lock.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

71844fac

nfp: run don't require Qdiscs on representor netdevs · fbf60e37

Jakub Kicinski authored Nov 27, 2018

Our representors are software devices built on top of the PF
vNIC, the queuing should only happen at the vNIC netdevice.
Allow representors to run qdisc-less.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fbf60e37

nfp: run representor TX locklessly · 9db8bbcb

Jakub Kicinski authored Nov 27, 2018

Our representors are software devices built on top of the PF
vNIC, the only state they have are per-cpu stats, so make
the TX run locklessly.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9db8bbcb

nfp: avoid oversized TSO headers with metadata prepend · d7cc8252

Jakub Kicinski authored Nov 27, 2018

In preparation for TSO over representors make sure the port id
prepend will always fit in the frame.  The current max header
length is 255, which is ample, so assume worst case scenario
of 8 byte prepend and save ourselves the conditionals.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d7cc8252

nfp: correct descriptor offsets in presence of metadata · b54ad0ea

Jakub Kicinski authored Nov 27, 2018

The TSO-related offsets in the descriptor should not include
the length of the prepended metadata.  Adjust them.  Note that
this could not have caused issues in the past as we don't
support TSO with metadata prepend as of this patch.
Signed-off-by: Michael Rapson <michael.rapson@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b54ad0ea

nfp: move queue variable init · 8b5ddf1e

Jakub Kicinski authored Nov 27, 2018

nd_q is only used at the very end of nfp_net_tx(), there is no need
to initialize it early.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8b5ddf1e

nfp: move temporary variables in nfp_net_tx_complete() · de31049a

Jakub Kicinski authored Nov 27, 2018

Move temporary variables in scope of the loop in nfp_net_tx_complete(),
and add a temp for txbuf software structure. This saves us 0.2% of CPU.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

de31049a

nfp: copy only the relevant part of the TX descriptor for frags · 95862749

Jakub Kicinski authored Nov 27, 2018

Chained descriptors for fragments need to duplicate all the descriptor
fields of the skb head, so we copy the descriptor and then modify the
relevant fields. This is wasteful, because the top half of the descriptor
will get overwritten entirely while the bottom half is not modified at all.
Copy only the bottom half. This saves us 0.3% of CPU in a GSO test.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

95862749

tcp: md5: add tcp_md5_needed jump label · 6015c71e

Eric Dumazet authored Nov 27, 2018

Most linux hosts never setup TCP MD5 keys. We can avoid a
cache line miss (accessing tp->md5ig_info) on RX and TX
using a jump label.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6015c71e

Merge branch 'tcp-take-a-bit-more-care-of-backlog-stress' · 2f695553

David S. Miller authored Nov 30, 2018

Eric Dumazet says:

====================
tcp: take a bit more care of backlog stress

While working on the SACK compression issue Jean-Louis Dupond
reported, we found that his linux box was suffering very hard
from tail drops on the socket backlog queue.

First patch hints the compiler about sack flows being the norm.

Second patch changes non-sack code in preparation of the ack
compression.

Third patch fixes tcp_space() to take backlog into account.

Fourth patch is attempting coalescing when a new packet must
be added to the backlog queue. Cooking bigger skbs helps
to keep backlog list smaller and speeds its handling when
user thread finally releases the socket lock.

v3: Neal/Yuchung feedback addressed :
     Do not aggregate if any skb has URG bit set.
     Do not aggregate if the skbs have different ECE/CWR bits

v2: added feedback from Neal : tcp: take care of compressed acks in tcp_add_reno_sack()
    added : tcp: hint compiler about sack flows
	added : tcp: make tcp_space() aware of socket backlog
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

2f695553

tcp: implement coalescing on backlog queue · 4f693b55

Eric Dumazet authored Nov 27, 2018

In case GRO is not as efficient as it should be or disabled,
we might have a user thread trapped in __release_sock() while
softirq handler flood packets up to the point we have to drop.

This patch balances work done from user thread and softirq,
to give more chances to __release_sock() to complete its work
before new packets are added the the backlog.

This also helps if we receive many ACK packets, since GRO
does not aggregate them.

This patch brings ~60% throughput increase on a receiver
without GRO, but the spectacular gain is really on
1000x release_sock() latency reduction I have measured.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4f693b55

tcp: make tcp_space() aware of socket backlog · 85bdf7db

Eric Dumazet authored Nov 27, 2018

Jean-Louis Dupond reported poor iscsi TCP receive performance
that we tracked to backlog drops.

Apparently we fail to send window updates reflecting the
fact that we are under stress.

Note that we might lack a proper window increase when
backlog is fully processed, since __release_sock() clears
sk->sk_backlog.len _after_ all skbs have been processed.

This should not matter in practice. If we had a significant
load through socket backlog, we are in a dangerous
situation.
Reported-by: Jean-Louis Dupond <jean-louis@dupond.be>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Tested-by: Jean-Louis Dupond<jean-louis@dupond.be>
Signed-off-by: David S. Miller <davem@davemloft.net>

85bdf7db

tcp: take care of compressed acks in tcp_add_reno_sack() · 19119f29

Eric Dumazet authored Nov 27, 2018

Neal pointed out that non sack flows might suffer from ACK compression
added in the following patch ("tcp: implement coalescing on backlog queue")

Instead of tweaking tcp_add_backlog() we can take into
account how many ACK were coalesced, this information
will be available in skb_shinfo(skb)->gso_segs
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

19119f29

tcp: hint compiler about sack flows · ebeef4bc

Eric Dumazet authored Nov 27, 2018

Tell the compiler that most TCP flows are using SACK these days.

There is no need to add the unlikely() clause in tcp_is_reno(),
the compiler is able to infer it.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ebeef4bc

net: Add trace events for all receive exit points · b0e3f1bd

Geneviève Bastien authored Nov 27, 2018

Trace events are already present for the receive entry points, to indicate
how the reception entered the stack.

This patch adds the corresponding exit trace events that will bound the
reception such that all events occurring between the entry and the exit
can be considered as part of the reception context. This greatly helps
for dependency and root cause analyses.

Without this, it is not possible with tracepoint instrumentation to
determine whether a sched_wakeup event following a netif_receive_skb
event is the result of the packet reception or a simple coincidence after
further processing by the thread. It is possible using other mechanisms
like kretprobes, but considering the "entry" points are already present,
it would be good to add the matching exit events.

In addition to linking packets with wakeups, the entry/exit event pair
can also be used to perform network stack latency analyses.
Signed-off-by: Geneviève Bastien <gbastien@versatic.net>
CC: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Ingo Molnar <mingo@redhat.com>
CC: David S. Miller <davem@davemloft.net>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> (tracing side)
Signed-off-by: David S. Miller <davem@davemloft.net>

b0e3f1bd

net/flow_dissector: correct comments on enum flow_dissector_key_id · 91c45956

Edward Cree authored Nov 27, 2018

There are no such structs flow_dissector_key_flow_vlan or
 flow_dissector_key_flow_tags, the actual structs used are struct
 flow_dissector_key_vlan and struct flow_dissector_key_tags.  So correct the
 comments against FLOW_DISSECTOR_KEY_VLAN, FLOW_DISSECTOR_KEY_FLOW_LABEL and
 FLOW_DISSECTOR_KEY_CVLAN to refer to those.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

91c45956

cxgb4: number of VFs supported is not always 16 · 1b974aa4

Ganesh Goudar authored Nov 27, 2018

Total number of VFs supported by PF is used to determine the last
byte of VF's mac address. Number of VFs supported is not always
16, use the variable nvfs to get the number of VFs supported
rather than hard coding it to 16.
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1b974aa4

Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · 93029d7d

David S. Miller authored Nov 29, 2018

Daniel Borkmann says:

====================
bpf-next 2018-11-30

The following pull-request contains BPF updates for your *net-next* tree.

(Getting out bit earlier this time to pull in a dependency from bpf.)

The main changes are:

1) Add libbpf ABI versioning and document API naming conventions
   as well as ABI versioning process, from Andrey.

2) Add a new sk_msg_pop_data() helper for sk_msg based BPF
   programs that is used in conjunction with sk_msg_push_data()
   for adding / removing meta data to the msg data, from John.

3) Optimize convert_bpf_ld_abs() for 0 offset and fix various
   lib and testsuite build failures on 32 bit, from David.

4) Make BPF prog dump for !JIT identical to how we dump subprogs
   when JIT is in use, from Yonghong.

5) Rename btf_get_from_id() to make it more conform with libbpf
   API naming conventions, from Martin.

6) Add a missing BPF kselftest config item, from Naresh.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

93029d7d

tools/bpf: make libbpf _GNU_SOURCE friendly · b4269954

Yonghong Song authored Nov 29, 2018

During porting libbpf to bcc, I got some warnings like below:
  ...
  [  2%] Building C object src/cc/CMakeFiles/bpf-shared.dir/libbpf/src/libbpf.c.o
  /home/yhs/work/bcc2/src/cc/libbpf/src/libbpf.c:12:0:
  warning: "_GNU_SOURCE" redefined [enabled by default]
   #define _GNU_SOURCE
  ...
  [  3%] Building C object src/cc/CMakeFiles/bpf-shared.dir/libbpf/src/libbpf_errno.c.o
  /home/yhs/work/bcc2/src/cc/libbpf/src/libbpf_errno.c: In function ‘libbpf_strerror’:
  /home/yhs/work/bcc2/src/cc/libbpf/src/libbpf_errno.c:45:7:
  warning: assignment makes integer from pointer without a cast [enabled by default]
     ret = strerror_r(err, buf, size);
  ...

bcc is built with _GNU_SOURCE defined and this caused the above warning.
This patch intends to make libpf _GNU_SOURCE friendly by
  . define _GNU_SOURCE in libbpf.c unless it is not defined
  . undefine _GNU_SOURCE as non-gnu version of strerror_r is expected.
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

b4269954

net: Don't default Aquantia USB driver to 'y' · 3d58c9c9
David S. Miller authored Nov 29, 2018
```
Reported-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
```
3d58c9c9

29 Nov, 2018 11 commits

net: explain __skb_checksum_complete() with comments · 14641931

Cong Wang authored Nov 26, 2018

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

14641931

tcp: remove loop to compute wscale · 19bf6261

Eric Dumazet authored Nov 29, 2018

We can remove the loop and conditional branches
and compute wscale efficiently thanks to ilog2()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

19bf6261

qede - Add a statistic for a case where driver drops tx packet due to memory allocation failure. · dcc6abae

Michael Shteinbok authored Nov 29, 2018

skb_linearization can fail due to memory allocation failure.
In such a case, the driver will drop the packet. In such a case
The driver used to print an error message.
This patch replaces this error message by a dedicated statistic.
Signed-off-by: Michael Shteinbok <michael.shteinbok@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dcc6abae

dpaa2-eth: Add "fall through" comments · c1cb11bc

Ioana Ciocoi Radulescu authored Nov 29, 2018

Add comments in the switch statement for XDP action to indicate
fallthrough is intended.
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c1cb11bc

Merge branch 'ave-suspend-resume' · a3270106

David S. Miller authored Nov 29, 2018

Kunihiko Hayashi says:

====================
Add suspend/resume support for AVE ethernet driver

This series adds support for suspend/resume to AVE ethernet driver.

And to avoid the error that wol state of phy hardware is enabled by default,
this sets initial wol state to disabled and add preservation the state in
suspend/resume sequence.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

a3270106

net: ethernet: ave: Preserve wol state in suspend/resume sequence · 8d1283b1

Kunihiko Hayashi authored Nov 29, 2018

Since the wol state forces to be initialized after reset, the state should
be preserved in suspend/resume sequence.
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8d1283b1

net: ethernet: ave: Set initial wol state to disabled · 7200f2e3

Kunihiko Hayashi authored Nov 29, 2018

If wol state of phy hardware is enabled after reset, phy_ethtool_get_wol()
returns that wol.wolopts is true.

However, since net_device.wol_enabled is zero and this doesn't apply wol
state until calling ethtool_set_wol(), so mdio_bus_phy_may_suspend()
returns true, that is, it's in a state where phy can suspend even though
wol state is enabled.

In this inconsistency, phy_suspend() returns -EBUSY, and at last,
suspend sequence fails with the following message:

    dpm_run_callback(): mdio_bus_phy_suspend+0x0/0x58 returns -16
    PM: Device 65000000.ethernet-ffffffff:01 failed to suspend: error -16
    PM: Some devices failed to suspend, or early wake event detected

In order to fix the above issue, this patch forces to set initial wol state
to disabled as default.
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7200f2e3

net: ethernet: ave: Add suspend/resume support · 0ba78b4a

Kunihiko Hayashi authored Nov 29, 2018

This patch introduces suspend and resume functions to ave driver.
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0ba78b4a

Merge tag 'linux-can-next-for-4.21-20181128' of... · bd82233f

David S. Miller authored Nov 28, 2018

Merge tag 'linux-can-next-for-4.21-20181128' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
This is a pull request for net-next/master consisting of 18 patches.

The first patch is by Colin Ian King and fixes the spelling in the ucan
driver.

The next three patches target the xilinx driver. YueHaibing's patch
fixes the return type of ndo_start_xmit function. Two patches by
Shubhrajyoti Datta add support for the CAN FD 2.0 controllers.

Flavio Suligoi's patch for the sja1000 driver add support for the ASEM
CAN raw hardware.

Wolfram Sang's and Kuninori Morimoto's patches switch the rcar driver to
use SPDX license identifiers.

The remaining 111 patches improve the flexcan driver. Pankaj Bansal's
patch enables the driver in Kconfig on all architectures with IOMEM
support. The next four patches by me fix indention, add missing
parentheses and comments. Aisheng Dong's patches add self wake support
and document it in the DT bindings. The remaining patches by Pankaj
Bansal first fix the loopback support and prepare the driver for the
CAN-FD support needed for the LX2160A SoC. The actual CAN-FD support
will be added in a later patch series.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

bd82233f

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · e561bb29

David S. Miller authored Nov 28, 2018

Trivial conflict in net/core/filter.c, a locally computed
'sdif' is now an argument to the function.
Signed-off-by: David S. Miller <davem@davemloft.net>

e561bb29

bpf: Fix various lib and testsuite build failures on 32-bit. · 1ad93ab1

David Miller authored Nov 28, 2018

Cannot cast a u64 to a pointer on 32-bit without an intervening (long)
cast otherwise GCC warns.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

1ad93ab1

28 Nov, 2018 3 commits

selftests/bpf: add config fragment CONFIG_FTRACE_SYSCALLS · 295daee4

Naresh Kamboju authored Nov 27, 2018

CONFIG_FTRACE_SYSCALLS=y is required for get_cgroup_id_user test case
this test reads a file from debug trace path
/sys/kernel/debug/tracing/events/syscalls/sys_enter_nanosleep/id
Signed-off-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

295daee4

Merge branch 'bpf-sk-msg-pop-data' · 36dbe571

Daniel Borkmann authored Nov 28, 2018

John Fastabend says:

====================
After being able to add metadata to messages with sk_msg_push_data we
have also found it useful to be able to "pop" this metadata off before
sending it to applications in some cases. This series adds a new helper
sk_msg_pop_data() and the associated patches to add tests and tools/lib
support.

Thanks!

v2: Daniel caught that we missed adding sk_msg_pop_data to the changes
    data helper so that the verifier ensures BPF programs revalidate
    data after using this helper. Also improve documentation adding a
    return description and using RST syntax per Quentin's comment. And
    delta calculations for DROP with pop'd data (albeit a strange set
    of operations for a program to be doing) had potential to be
    incorrect possibly confusing user space applications, so fix it.
====================
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

36dbe571

bpf: test_sockmap, add options for msg_pop_data() helper · 1ade9aba

John Fastabend authored Nov 26, 2018

Similar to msg_pull_data and msg_push_data add a set of options to
have msg_pop_data() exercised.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

1ade9aba