Commit ef11d41b authored by Alexei Starovoitov's avatar Alexei Starovoitov

Merge branch 'convert-doc-to-rst'

Jesper Dangaard Brouer says:

====================
The kernel is moving files under Documentation to use the RST
(reStructuredText) format and Sphinx [1].  This patchset converts the
files under Documentation/bpf/ into RST format.  The Sphinx
integration is left as followup work.

[1] https://www.kernel.org/doc/html/latest/doc-guide/sphinx.html

This patchset have been uploaded as branch bpf_doc10 on github[2], so
reviewers can see how GitHub renders this.

[2] https://github.com/netoptimizer/linux/tree/bpf_doc10/Documentation/bpf
====================
Acked-by: default avatarYonghong Song <yhs@fb.com>
Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
parents 1d827879 b7a27c3a
=================
BPF documentation
=================
This directory contains documentation for the BPF (Berkeley Packet
Filter) facility, with a focus on the extended BPF version (eBPF).
This kernel side documentation is still work in progress. The main
textual documentation is (for historical reasons) described in
`Documentation/networking/filter.txt`_, which describe both classical
and extended BPF instruction-set.
The Cilium project also maintains a `BPF and XDP Reference Guide`_
that goes into great technical depth about the BPF Architecture.
The primary info for the bpf syscall is available in the `man-pages`_
for `bpf(2)`_.
Frequently asked questions (FAQ)
================================
Two sets of Questions and Answers (Q&A) are maintained.
* QA for common questions about BPF see: bpf_design_QA_
* QA for developers interacting with BPF subsystem: bpf_devel_QA_
.. Links:
.. _bpf_design_QA: bpf_design_QA.rst
.. _bpf_devel_QA: bpf_devel_QA.rst
.. _Documentation/networking/filter.txt: ../networking/filter.txt
.. _man-pages: https://www.kernel.org/doc/man-pages/
.. _bpf(2): http://man7.org/linux/man-pages/man2/bpf.2.html
.. _BPF and XDP Reference Guide: http://cilium.readthedocs.io/en/latest/bpf/
==============
BPF Design Q&A
==============
BPF extensibility and applicability to networking, tracing, security BPF extensibility and applicability to networking, tracing, security
in the linux kernel and several user space implementations of BPF in the linux kernel and several user space implementations of BPF
virtual machine led to a number of misunderstanding on what BPF actually is. virtual machine led to a number of misunderstanding on what BPF actually is.
This short QA is an attempt to address that and outline a direction This short QA is an attempt to address that and outline a direction
of where BPF is heading long term. of where BPF is heading long term.
.. contents::
:local:
:depth: 3
Questions and Answers
=====================
Q: Is BPF a generic instruction set similar to x64 and arm64? Q: Is BPF a generic instruction set similar to x64 and arm64?
-------------------------------------------------------------
A: NO. A: NO.
Q: Is BPF a generic virtual machine ? Q: Is BPF a generic virtual machine ?
-------------------------------------
A: NO. A: NO.
BPF is generic instruction set _with_ C calling convention. BPF is generic instruction set *with* C calling convention.
-----------------------------------------------------------
Q: Why C calling convention was chosen? Q: Why C calling convention was chosen?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because BPF programs are designed to run in the linux kernel A: Because BPF programs are designed to run in the linux kernel
which is written in C, hence BPF defines instruction set compatible which is written in C, hence BPF defines instruction set compatible
with two most used architectures x64 and arm64 (and takes into with two most used architectures x64 and arm64 (and takes into
consideration important quirks of other architectures) and consideration important quirks of other architectures) and
defines calling convention that is compatible with C calling defines calling convention that is compatible with C calling
convention of the linux kernel on those architectures. convention of the linux kernel on those architectures.
Q: can multiple return values be supported in the future? Q: can multiple return values be supported in the future?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: NO. BPF allows only register R0 to be used as return value. A: NO. BPF allows only register R0 to be used as return value.
Q: can more than 5 function arguments be supported in the future? Q: can more than 5 function arguments be supported in the future?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: NO. BPF calling convention only allows registers R1-R5 to be used A: NO. BPF calling convention only allows registers R1-R5 to be used
as arguments. BPF is not a standalone instruction set. as arguments. BPF is not a standalone instruction set.
(unlike x64 ISA that allows msft, cdecl and other conventions) (unlike x64 ISA that allows msft, cdecl and other conventions)
Q: can BPF programs access instruction pointer or return address? Q: can BPF programs access instruction pointer or return address?
-----------------------------------------------------------------
A: NO. A: NO.
Q: can BPF programs access stack pointer ? Q: can BPF programs access stack pointer ?
A: NO. Only frame pointer (register R10) is accessible. ------------------------------------------
From compiler point of view it's necessary to have stack pointer. A: NO.
For example LLVM defines register R11 as stack pointer in its
BPF backend, but it makes sure that generated code never uses it. Only frame pointer (register R10) is accessible.
From compiler point of view it's necessary to have stack pointer.
For example LLVM defines register R11 as stack pointer in its
BPF backend, but it makes sure that generated code never uses it.
Q: Does C-calling convention diminishes possible use cases? Q: Does C-calling convention diminishes possible use cases?
A: YES. BPF design forces addition of major functionality in the form -----------------------------------------------------------
of kernel helper functions and kernel objects like BPF maps with A: YES.
seamless interoperability between them. It lets kernel call into
BPF programs and programs call kernel helpers with zero overhead. BPF design forces addition of major functionality in the form
As all of them were native C code. That is particularly the case of kernel helper functions and kernel objects like BPF maps with
for JITed BPF programs that are indistinguishable from seamless interoperability between them. It lets kernel call into
native kernel C code. BPF programs and programs call kernel helpers with zero overhead.
As all of them were native C code. That is particularly the case
for JITed BPF programs that are indistinguishable from
native kernel C code.
Q: Does it mean that 'innovative' extensions to BPF code are disallowed? Q: Does it mean that 'innovative' extensions to BPF code are disallowed?
A: Soft yes. At least for now until BPF core has support for ------------------------------------------------------------------------
bpf-to-bpf calls, indirect calls, loops, global variables, A: Soft yes.
jump tables, read only sections and all other normal constructs
that C code can produce. At least for now until BPF core has support for
bpf-to-bpf calls, indirect calls, loops, global variables,
jump tables, read only sections and all other normal constructs
that C code can produce.
Q: Can loops be supported in a safe way? Q: Can loops be supported in a safe way?
A: It's not clear yet. BPF developers are trying to find a way to ----------------------------------------
support bounded loops where the verifier can guarantee that A: It's not clear yet.
the program terminates in less than 4096 instructions.
BPF developers are trying to find a way to
support bounded loops where the verifier can guarantee that
the program terminates in less than 4096 instructions.
Instruction level questions
---------------------------
Q: LD_ABS and LD_IND instructions vs C code
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Q: How come LD_ABS and LD_IND instruction are present in BPF whereas Q: How come LD_ABS and LD_IND instruction are present in BPF whereas
C code cannot express them and has to use builtin intrinsics? C code cannot express them and has to use builtin intrinsics?
A: This is artifact of compatibility with classic BPF. Modern A: This is artifact of compatibility with classic BPF. Modern
networking code in BPF performs better without them. networking code in BPF performs better without them.
See 'direct packet access'. See 'direct packet access'.
Q: BPF instructions mapping not one-to-one to native CPU
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Q: It seems not all BPF instructions are one-to-one to native CPU. Q: It seems not all BPF instructions are one-to-one to native CPU.
For example why BPF_JNE and other compare and jumps are not cpu-like? For example why BPF_JNE and other compare and jumps are not cpu-like?
A: This was necessary to avoid introducing flags into ISA which are A: This was necessary to avoid introducing flags into ISA which are
impossible to make generic and efficient across CPU architectures. impossible to make generic and efficient across CPU architectures.
Q: why BPF_DIV instruction doesn't map to x64 div? Q: why BPF_DIV instruction doesn't map to x64 div?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because if we picked one-to-one relationship to x64 it would have made A: Because if we picked one-to-one relationship to x64 it would have made
it more complicated to support on arm64 and other archs. Also it it more complicated to support on arm64 and other archs. Also it
needs div-by-zero runtime check. needs div-by-zero runtime check.
Q: why there is no BPF_SDIV for signed divide operation? Q: why there is no BPF_SDIV for signed divide operation?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because it would be rarely used. llvm errors in such case and A: Because it would be rarely used. llvm errors in such case and
prints a suggestion to use unsigned divide instead prints a suggestion to use unsigned divide instead
Q: Why BPF has implicit prologue and epilogue? Q: Why BPF has implicit prologue and epilogue?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because architectures like sparc have register windows and in general A: Because architectures like sparc have register windows and in general
there are enough subtle differences between architectures, so naive there are enough subtle differences between architectures, so naive
store return address into stack won't work. Another reason is BPF has store return address into stack won't work. Another reason is BPF has
to be safe from division by zero (and legacy exception path to be safe from division by zero (and legacy exception path
of LD_ABS insn). Those instructions need to invoke epilogue and of LD_ABS insn). Those instructions need to invoke epilogue and
return implicitly. return implicitly.
Q: Why BPF_JLT and BPF_JLE instructions were not introduced in the beginning? Q: Why BPF_JLT and BPF_JLE instructions were not introduced in the beginning?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A: Because classic BPF didn't have them and BPF authors felt that compiler A: Because classic BPF didn't have them and BPF authors felt that compiler
workaround would be acceptable. Turned out that programs lose performance workaround would be acceptable. Turned out that programs lose performance
due to lack of these compare instructions and they were added. due to lack of these compare instructions and they were added.
These two instructions is a perfect example what kind of new BPF These two instructions is a perfect example what kind of new BPF
instructions are acceptable and can be added in the future. instructions are acceptable and can be added in the future.
These two already had equivalent instructions in native CPUs. These two already had equivalent instructions in native CPUs.
New instructions that don't have one-to-one mapping to HW instructions New instructions that don't have one-to-one mapping to HW instructions
will not be accepted. will not be accepted.
Q: BPF 32-bit subregister requirements
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Q: BPF 32-bit subregisters have a requirement to zero upper 32-bits of BPF Q: BPF 32-bit subregisters have a requirement to zero upper 32-bits of BPF
registers which makes BPF inefficient virtual machine for 32-bit registers which makes BPF inefficient virtual machine for 32-bit
CPU architectures and 32-bit HW accelerators. Can true 32-bit registers CPU architectures and 32-bit HW accelerators. Can true 32-bit registers
be added to BPF in the future? be added to BPF in the future?
A: NO. The first thing to improve performance on 32-bit archs is to teach A: NO. The first thing to improve performance on 32-bit archs is to teach
LLVM to generate code that uses 32-bit subregisters. Then second step LLVM to generate code that uses 32-bit subregisters. Then second step
is to teach verifier to mark operations where zero-ing upper bits is to teach verifier to mark operations where zero-ing upper bits
is unnecessary. Then JITs can take advantage of those markings and is unnecessary. Then JITs can take advantage of those markings and
drastically reduce size of generated code and improve performance. drastically reduce size of generated code and improve performance.
Q: Does BPF have a stable ABI? Q: Does BPF have a stable ABI?
------------------------------
A: YES. BPF instructions, arguments to BPF programs, set of helper A: YES. BPF instructions, arguments to BPF programs, set of helper
functions and their arguments, recognized return codes are all part functions and their arguments, recognized return codes are all part
of ABI. However when tracing programs are using bpf_probe_read() helper of ABI. However when tracing programs are using bpf_probe_read() helper
to walk kernel internal datastructures and compile with kernel to walk kernel internal datastructures and compile with kernel
internal headers these accesses can and will break with newer internal headers these accesses can and will break with newer
kernels. The union bpf_attr -> kern_version is checked at load time kernels. The union bpf_attr -> kern_version is checked at load time
to prevent accidentally loading kprobe-based bpf programs written to prevent accidentally loading kprobe-based bpf programs written
for a different kernel. Networking programs don't do kern_version check. for a different kernel. Networking programs don't do kern_version check.
Q: How much stack space a BPF program uses? Q: How much stack space a BPF program uses?
-------------------------------------------
A: Currently all program types are limited to 512 bytes of stack A: Currently all program types are limited to 512 bytes of stack
space, but the verifier computes the actual amount of stack used space, but the verifier computes the actual amount of stack used
and both interpreter and most JITed code consume necessary amount. and both interpreter and most JITed code consume necessary amount.
Q: Can BPF be offloaded to HW? Q: Can BPF be offloaded to HW?
------------------------------
A: YES. BPF HW offload is supported by NFP driver. A: YES. BPF HW offload is supported by NFP driver.
Q: Does classic BPF interpreter still exist? Q: Does classic BPF interpreter still exist?
--------------------------------------------
A: NO. Classic BPF programs are converted into extend BPF instructions. A: NO. Classic BPF programs are converted into extend BPF instructions.
Q: Can BPF call arbitrary kernel functions? Q: Can BPF call arbitrary kernel functions?
-------------------------------------------
A: NO. BPF programs can only call a set of helper functions which A: NO. BPF programs can only call a set of helper functions which
is defined for every program type. is defined for every program type.
Q: Can BPF overwrite arbitrary kernel memory? Q: Can BPF overwrite arbitrary kernel memory?
A: NO. Tracing bpf programs can _read_ arbitrary memory with bpf_probe_read() ---------------------------------------------
and bpf_probe_read_str() helpers. Networking programs cannot read A: NO.
arbitrary memory, since they don't have access to these helpers.
Programs can never read or write arbitrary memory directly. Tracing bpf programs can *read* arbitrary memory with bpf_probe_read()
and bpf_probe_read_str() helpers. Networking programs cannot read
arbitrary memory, since they don't have access to these helpers.
Programs can never read or write arbitrary memory directly.
Q: Can BPF overwrite arbitrary user memory? Q: Can BPF overwrite arbitrary user memory?
A: Sort-of. Tracing BPF programs can overwrite the user memory -------------------------------------------
of the current task with bpf_probe_write_user(). Every time such A: Sort-of.
program is loaded the kernel will print warning message, so
this helper is only useful for experiments and prototypes. Tracing BPF programs can overwrite the user memory
Tracing BPF programs are root only. of the current task with bpf_probe_write_user(). Every time such
program is loaded the kernel will print warning message, so
this helper is only useful for experiments and prototypes.
Tracing BPF programs are root only.
Q: bpf_trace_printk() helper warning
------------------------------------
Q: When bpf_trace_printk() helper is used the kernel prints nasty Q: When bpf_trace_printk() helper is used the kernel prints nasty
warning message. Why is that? warning message. Why is that?
A: This is done to nudge program authors into better interfaces when A: This is done to nudge program authors into better interfaces when
programs need to pass data to user space. Like bpf_perf_event_output() programs need to pass data to user space. Like bpf_perf_event_output()
can be used to efficiently stream data via perf ring buffer. can be used to efficiently stream data via perf ring buffer.
BPF maps can be used for asynchronous data sharing between kernel BPF maps can be used for asynchronous data sharing between kernel
and user space. bpf_trace_printk() should only be used for debugging. and user space. bpf_trace_printk() should only be used for debugging.
Q: New functionality via kernel modules?
----------------------------------------
Q: Can BPF functionality such as new program or map types, new Q: Can BPF functionality such as new program or map types, new
helpers, etc be added out of kernel module code? helpers, etc be added out of kernel module code?
A: NO. A: NO.
=================================
HOWTO interact with BPF subsystem
=================================
This document provides information for the BPF subsystem about various
workflows related to reporting bugs, submitting patches, and queueing
patches for stable kernels.
For general information about submitting patches, please refer to
`Documentation/process/`_. This document only describes additional specifics
related to BPF.
.. contents::
:local:
:depth: 2
Reporting bugs
==============
Q: How do I report bugs for BPF kernel code?
--------------------------------------------
A: Since all BPF kernel development as well as bpftool and iproute2 BPF
loader development happens through the netdev kernel mailing list,
please report any found issues around BPF to the following mailing
list:
netdev@vger.kernel.org
This may also include issues related to XDP, BPF tracing, etc.
Given netdev has a high volume of traffic, please also add the BPF
maintainers to Cc (from kernel MAINTAINERS_ file):
* Alexei Starovoitov <ast@kernel.org>
* Daniel Borkmann <daniel@iogearbox.net>
In case a buggy commit has already been identified, make sure to keep
the actual commit authors in Cc as well for the report. They can
typically be identified through the kernel's git tree.
**Please do NOT report BPF issues to bugzilla.kernel.org since it
is a guarantee that the reported issue will be overlooked.**
Submitting patches
==================
Q: To which mailing list do I need to submit my BPF patches?
------------------------------------------------------------
A: Please submit your BPF patches to the netdev kernel mailing list:
netdev@vger.kernel.org
Historically, BPF came out of networking and has always been maintained
by the kernel networking community. Although these days BPF touches
many other subsystems as well, the patches are still routed mainly
through the networking community.
In case your patch has changes in various different subsystems (e.g.
tracing, security, etc), make sure to Cc the related kernel mailing
lists and maintainers from there as well, so they are able to review
the changes and provide their Acked-by's to the patches.
Q: Where can I find patches currently under discussion for BPF subsystem?
-------------------------------------------------------------------------
A: All patches that are Cc'ed to netdev are queued for review under netdev
patchwork project:
http://patchwork.ozlabs.org/project/netdev/list/
Those patches which target BPF, are assigned to a 'bpf' delegate for
further processing from BPF maintainers. The current queue with
patches under review can be found at:
https://patchwork.ozlabs.org/project/netdev/list/?delegate=77147
Once the patches have been reviewed by the BPF community as a whole
and approved by the BPF maintainers, their status in patchwork will be
changed to 'Accepted' and the submitter will be notified by mail. This
means that the patches look good from a BPF perspective and have been
applied to one of the two BPF kernel trees.
In case feedback from the community requires a respin of the patches,
their status in patchwork will be set to 'Changes Requested', and purged
from the current review queue. Likewise for cases where patches would
get rejected or are not applicable to the BPF trees (but assigned to
the 'bpf' delegate).
Q: How do the changes make their way into Linux?
------------------------------------------------
A: There are two BPF kernel trees (git repositories). Once patches have
been accepted by the BPF maintainers, they will be applied to one
of the two BPF trees:
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/
The bpf tree itself is for fixes only, whereas bpf-next for features,
cleanups or other kind of improvements ("next-like" content). This is
analogous to net and net-next trees for networking. Both bpf and
bpf-next will only have a master branch in order to simplify against
which branch patches should get rebased to.
Accumulated BPF patches in the bpf tree will regularly get pulled
into the net kernel tree. Likewise, accumulated BPF patches accepted
into the bpf-next tree will make their way into net-next tree. net and
net-next are both run by David S. Miller. From there, they will go
into the kernel mainline tree run by Linus Torvalds. To read up on the
process of net and net-next being merged into the mainline tree, see
the `netdev FAQ`_ under:
`Documentation/networking/netdev-FAQ.txt`_
Occasionally, to prevent merge conflicts, we might send pull requests
to other trees (e.g. tracing) with a small subset of the patches, but
net and net-next are always the main trees targeted for integration.
The pull requests will contain a high-level summary of the accumulated
patches and can be searched on netdev kernel mailing list through the
following subject lines (``yyyy-mm-dd`` is the date of the pull
request)::
pull-request: bpf yyyy-mm-dd
pull-request: bpf-next yyyy-mm-dd
Q: How do I indicate which tree (bpf vs. bpf-next) my patch should be applied to?
---------------------------------------------------------------------------------
A: The process is the very same as described in the `netdev FAQ`_, so
please read up on it. The subject line must indicate whether the
patch is a fix or rather "next-like" content in order to let the
maintainers know whether it is targeted at bpf or bpf-next.
For fixes eventually landing in bpf -> net tree, the subject must
look like::
git format-patch --subject-prefix='PATCH bpf' start..finish
For features/improvements/etc that should eventually land in
bpf-next -> net-next, the subject must look like::
git format-patch --subject-prefix='PATCH bpf-next' start..finish
If unsure whether the patch or patch series should go into bpf
or net directly, or bpf-next or net-next directly, it is not a
problem either if the subject line says net or net-next as target.
It is eventually up to the maintainers to do the delegation of
the patches.
If it is clear that patches should go into bpf or bpf-next tree,
please make sure to rebase the patches against those trees in
order to reduce potential conflicts.
In case the patch or patch series has to be reworked and sent out
again in a second or later revision, it is also required to add a
version number (``v2``, ``v3``, ...) into the subject prefix::
git format-patch --subject-prefix='PATCH net-next v2' start..finish
When changes have been requested to the patch series, always send the
whole patch series again with the feedback incorporated (never send
individual diffs on top of the old series).
Q: What does it mean when a patch gets applied to bpf or bpf-next tree?
-----------------------------------------------------------------------
A: It means that the patch looks good for mainline inclusion from
a BPF point of view.
Be aware that this is not a final verdict that the patch will
automatically get accepted into net or net-next trees eventually:
On the netdev kernel mailing list reviews can come in at any point
in time. If discussions around a patch conclude that they cannot
get included as-is, we will either apply a follow-up fix or drop
them from the trees entirely. Therefore, we also reserve to rebase
the trees when deemed necessary. After all, the purpose of the tree
is to:
i) accumulate and stage BPF patches for integration into trees
like net and net-next, and
ii) run extensive BPF test suite and
workloads on the patches before they make their way any further.
Once the BPF pull request was accepted by David S. Miller, then
the patches end up in net or net-next tree, respectively, and
make their way from there further into mainline. Again, see the
`netdev FAQ`_ for additional information e.g. on how often they are
merged to mainline.
Q: How long do I need to wait for feedback on my BPF patches?
-------------------------------------------------------------
A: We try to keep the latency low. The usual time to feedback will
be around 2 or 3 business days. It may vary depending on the
complexity of changes and current patch load.
Q: How often do you send pull requests to major kernel trees like net or net-next?
----------------------------------------------------------------------------------
A: Pull requests will be sent out rather often in order to not
accumulate too many patches in bpf or bpf-next.
As a rule of thumb, expect pull requests for each tree regularly
at the end of the week. In some cases pull requests could additionally
come also in the middle of the week depending on the current patch
load or urgency.
Q: Are patches applied to bpf-next when the merge window is open?
-----------------------------------------------------------------
A: For the time when the merge window is open, bpf-next will not be
processed. This is roughly analogous to net-next patch processing,
so feel free to read up on the `netdev FAQ`_ about further details.
During those two weeks of merge window, we might ask you to resend
your patch series once bpf-next is open again. Once Linus released
a ``v*-rc1`` after the merge window, we continue processing of bpf-next.
For non-subscribers to kernel mailing lists, there is also a status
page run by David S. Miller on net-next that provides guidance:
http://vger.kernel.org/~davem/net-next.html
Q: Verifier changes and test cases
----------------------------------
Q: I made a BPF verifier change, do I need to add test cases for
BPF kernel selftests_?
A: If the patch has changes to the behavior of the verifier, then yes,
it is absolutely necessary to add test cases to the BPF kernel
selftests_ suite. If they are not present and we think they are
needed, then we might ask for them before accepting any changes.
In particular, test_verifier.c is tracking a high number of BPF test
cases, including a lot of corner cases that LLVM BPF back end may
generate out of the restricted C code. Thus, adding test cases is
absolutely crucial to make sure future changes do not accidentally
affect prior use-cases. Thus, treat those test cases as: verifier
behavior that is not tracked in test_verifier.c could potentially
be subject to change.
Q: samples/bpf preference vs selftests?
---------------------------------------
Q: When should I add code to `samples/bpf/`_ and when to BPF kernel
selftests_ ?
A: In general, we prefer additions to BPF kernel selftests_ rather than
`samples/bpf/`_. The rationale is very simple: kernel selftests are
regularly run by various bots to test for kernel regressions.
The more test cases we add to BPF selftests, the better the coverage
and the less likely it is that those could accidentally break. It is
not that BPF kernel selftests cannot demo how a specific feature can
be used.
That said, `samples/bpf/`_ may be a good place for people to get started,
so it might be advisable that simple demos of features could go into
`samples/bpf/`_, but advanced functional and corner-case testing rather
into kernel selftests.
If your sample looks like a test case, then go for BPF kernel selftests
instead!
Q: When should I add code to the bpftool?
-----------------------------------------
A: The main purpose of bpftool (under tools/bpf/bpftool/) is to provide
a central user space tool for debugging and introspection of BPF programs
and maps that are active in the kernel. If UAPI changes related to BPF
enable for dumping additional information of programs or maps, then
bpftool should be extended as well to support dumping them.
Q: When should I add code to iproute2's BPF loader?
---------------------------------------------------
A: For UAPI changes related to the XDP or tc layer (e.g. ``cls_bpf``),
the convention is that those control-path related changes are added to
iproute2's BPF loader as well from user space side. This is not only
useful to have UAPI changes properly designed to be usable, but also
to make those changes available to a wider user base of major
downstream distributions.
Q: Do you accept patches as well for iproute2's BPF loader?
-----------------------------------------------------------
A: Patches for the iproute2's BPF loader have to be sent to:
netdev@vger.kernel.org
While those patches are not processed by the BPF kernel maintainers,
please keep them in Cc as well, so they can be reviewed.
The official git repository for iproute2 is run by Stephen Hemminger
and can be found at:
https://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git/
The patches need to have a subject prefix of '``[PATCH iproute2
master]``' or '``[PATCH iproute2 net-next]``'. '``master``' or
'``net-next``' describes the target branch where the patch should be
applied to. Meaning, if kernel changes went into the net-next kernel
tree, then the related iproute2 changes need to go into the iproute2
net-next branch, otherwise they can be targeted at master branch. The
iproute2 net-next branch will get merged into the master branch after
the current iproute2 version from master has been released.
Like BPF, the patches end up in patchwork under the netdev project and
are delegated to 'shemminger' for further processing:
http://patchwork.ozlabs.org/project/netdev/list/?delegate=389
Q: What is the minimum requirement before I submit my BPF patches?
------------------------------------------------------------------
A: When submitting patches, always take the time and properly test your
patches *prior* to submission. Never rush them! If maintainers find
that your patches have not been properly tested, it is a good way to
get them grumpy. Testing patch submissions is a hard requirement!
Note, fixes that go to bpf tree *must* have a ``Fixes:`` tag included.
The same applies to fixes that target bpf-next, where the affected
commit is in net-next (or in some cases bpf-next). The ``Fixes:`` tag is
crucial in order to identify follow-up commits and tremendously helps
for people having to do backporting, so it is a must have!
We also don't accept patches with an empty commit message. Take your
time and properly write up a high quality commit message, it is
essential!
Think about it this way: other developers looking at your code a month
from now need to understand *why* a certain change has been done that
way, and whether there have been flaws in the analysis or assumptions
that the original author did. Thus providing a proper rationale and
describing the use-case for the changes is a must.
Patch submissions with >1 patch must have a cover letter which includes
a high level description of the series. This high level summary will
then be placed into the merge commit by the BPF maintainers such that
it is also accessible from the git log for future reference.
Q: Features changing BPF JIT and/or LLVM
----------------------------------------
Q: What do I need to consider when adding a new instruction or feature
that would require BPF JIT and/or LLVM integration as well?
A: We try hard to keep all BPF JITs up to date such that the same user
experience can be guaranteed when running BPF programs on different
architectures without having the program punt to the less efficient
interpreter in case the in-kernel BPF JIT is enabled.
If you are unable to implement or test the required JIT changes for
certain architectures, please work together with the related BPF JIT
developers in order to get the feature implemented in a timely manner.
Please refer to the git log (``arch/*/net/``) to locate the necessary
people for helping out.
Also always make sure to add BPF test cases (e.g. test_bpf.c and
test_verifier.c) for new instructions, so that they can receive
broad test coverage and help run-time testing the various BPF JITs.
In case of new BPF instructions, once the changes have been accepted
into the Linux kernel, please implement support into LLVM's BPF back
end. See LLVM_ section below for further information.
Stable submission
=================
Q: I need a specific BPF commit in stable kernels. What should I do?
--------------------------------------------------------------------
A: In case you need a specific fix in stable kernels, first check whether
the commit has already been applied in the related ``linux-*.y`` branches:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/
If not the case, then drop an email to the BPF maintainers with the
netdev kernel mailing list in Cc and ask for the fix to be queued up:
netdev@vger.kernel.org
The process in general is the same as on netdev itself, see also the
`netdev FAQ`_ document.
Q: Do you also backport to kernels not currently maintained as stable?
----------------------------------------------------------------------
A: No. If you need a specific BPF commit in kernels that are currently not
maintained by the stable maintainers, then you are on your own.
The current stable and longterm stable kernels are all listed here:
https://www.kernel.org/
Q: The BPF patch I am about to submit needs to go to stable as well
-------------------------------------------------------------------
What should I do?
A: The same rules apply as with netdev patch submissions in general, see
`netdev FAQ`_ under:
`Documentation/networking/netdev-FAQ.txt`_
Never add "``Cc: stable@vger.kernel.org``" to the patch description, but
ask the BPF maintainers to queue the patches instead. This can be done
with a note, for example, under the ``---`` part of the patch which does
not go into the git log. Alternatively, this can be done as a simple
request by mail instead.
Q: Queue stable patches
-----------------------
Q: Where do I find currently queued BPF patches that will be submitted
to stable?
A: Once patches that fix critical bugs got applied into the bpf tree, they
are queued up for stable submission under:
http://patchwork.ozlabs.org/bundle/bpf/stable/?state=*
They will be on hold there at minimum until the related commit made its
way into the mainline kernel tree.
After having been under broader exposure, the queued patches will be
submitted by the BPF maintainers to the stable maintainers.
Testing patches
===============
Q: How to run BPF selftests
---------------------------
A: After you have booted into the newly compiled kernel, navigate to
the BPF selftests_ suite in order to test BPF functionality (current
working directory points to the root of the cloned git tree)::
$ cd tools/testing/selftests/bpf/
$ make
To run the verifier tests::
$ sudo ./test_verifier
The verifier tests print out all the current checks being
performed. The summary at the end of running all tests will dump
information of test successes and failures::
Summary: 418 PASSED, 0 FAILED
In order to run through all BPF selftests, the following command is
needed::
$ sudo make run_tests
See the kernels selftest `Documentation/dev-tools/kselftest.rst`_
document for further documentation.
Q: Which BPF kernel selftests version should I run my kernel against?
---------------------------------------------------------------------
A: If you run a kernel ``xyz``, then always run the BPF kernel selftests
from that kernel ``xyz`` as well. Do not expect that the BPF selftest
from the latest mainline tree will pass all the time.
In particular, test_bpf.c and test_verifier.c have a large number of
test cases and are constantly updated with new BPF test sequences, or
existing ones are adapted to verifier changes e.g. due to verifier
becoming smarter and being able to better track certain things.
LLVM
====
Q: Where do I find LLVM with BPF support?
-----------------------------------------
A: The BPF back end for LLVM is upstream in LLVM since version 3.7.1.
All major distributions these days ship LLVM with BPF back end enabled,
so for the majority of use-cases it is not required to compile LLVM by
hand anymore, just install the distribution provided package.
LLVM's static compiler lists the supported targets through
``llc --version``, make sure BPF targets are listed. Example::
$ llc --version
LLVM (http://llvm.org/):
LLVM version 6.0.0svn
Optimized build.
Default target: x86_64-unknown-linux-gnu
Host CPU: skylake
Registered Targets:
bpf - BPF (host endian)
bpfeb - BPF (big endian)
bpfel - BPF (little endian)
x86 - 32-bit X86: Pentium-Pro and above
x86-64 - 64-bit X86: EM64T and AMD64
For developers in order to utilize the latest features added to LLVM's
BPF back end, it is advisable to run the latest LLVM releases. Support
for new BPF kernel features such as additions to the BPF instruction
set are often developed together.
All LLVM releases can be found at: http://releases.llvm.org/
Q: Got it, so how do I build LLVM manually anyway?
--------------------------------------------------
A: You need cmake and gcc-c++ as build requisites for LLVM. Once you have
that set up, proceed with building the latest LLVM and clang version
from the git repositories::
$ git clone http://llvm.org/git/llvm.git
$ cd llvm/tools
$ git clone --depth 1 http://llvm.org/git/clang.git
$ cd ..; mkdir build; cd build
$ cmake .. -DLLVM_TARGETS_TO_BUILD="BPF;X86" \
-DBUILD_SHARED_LIBS=OFF \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_BUILD_RUNTIME=OFF
$ make -j $(getconf _NPROCESSORS_ONLN)
The built binaries can then be found in the build/bin/ directory, where
you can point the PATH variable to.
Q: Reporting LLVM BPF issues
----------------------------
Q: Should I notify BPF kernel maintainers about issues in LLVM's BPF code
generation back end or about LLVM generated code that the verifier
refuses to accept?
A: Yes, please do!
LLVM's BPF back end is a key piece of the whole BPF
infrastructure and it ties deeply into verification of programs from the
kernel side. Therefore, any issues on either side need to be investigated
and fixed whenever necessary.
Therefore, please make sure to bring them up at netdev kernel mailing
list and Cc BPF maintainers for LLVM and kernel bits:
* Yonghong Song <yhs@fb.com>
* Alexei Starovoitov <ast@kernel.org>
* Daniel Borkmann <daniel@iogearbox.net>
LLVM also has an issue tracker where BPF related bugs can be found:
https://bugs.llvm.org/buglist.cgi?quicksearch=bpf
However, it is better to reach out through mailing lists with having
maintainers in Cc.
Q: New BPF instruction for kernel and LLVM
------------------------------------------
Q: I have added a new BPF instruction to the kernel, how can I integrate
it into LLVM?
A: LLVM has a ``-mcpu`` selector for the BPF back end in order to allow
the selection of BPF instruction set extensions. By default the
``generic`` processor target is used, which is the base instruction set
(v1) of BPF.
LLVM has an option to select ``-mcpu=probe`` where it will probe the host
kernel for supported BPF instruction set extensions and selects the
optimal set automatically.
For cross-compilation, a specific version can be select manually as well ::
$ llc -march bpf -mcpu=help
Available CPUs for this target:
generic - Select the generic processor.
probe - Select the probe processor.
v1 - Select the v1 processor.
v2 - Select the v2 processor.
[...]
Newly added BPF instructions to the Linux kernel need to follow the same
scheme, bump the instruction set version and implement probing for the
extensions such that ``-mcpu=probe`` users can benefit from the
optimization transparently when upgrading their kernels.
If you are unable to implement support for the newly added BPF instruction
please reach out to BPF developers for help.
By the way, the BPF kernel selftests run with ``-mcpu=probe`` for better
test coverage.
Q: clang flag for target bpf?
-----------------------------
Q: In some cases clang flag ``-target bpf`` is used but in other cases the
default clang target, which matches the underlying architecture, is used.
What is the difference and when I should use which?
A: Although LLVM IR generation and optimization try to stay architecture
independent, ``-target <arch>`` still has some impact on generated code:
- BPF program may recursively include header file(s) with file scope
inline assembly codes. The default target can handle this well,
while ``bpf`` target may fail if bpf backend assembler does not
understand these assembly codes, which is true in most cases.
- When compiled without ``-g``, additional elf sections, e.g.,
.eh_frame and .rela.eh_frame, may be present in the object file
with default target, but not with ``bpf`` target.
- The default target may turn a C switch statement into a switch table
lookup and jump operation. Since the switch table is placed
in the global readonly section, the bpf program will fail to load.
The bpf target does not support switch table optimization.
The clang option ``-fno-jump-tables`` can be used to disable
switch table generation.
- For clang ``-target bpf``, it is guaranteed that pointer or long /
unsigned long types will always have a width of 64 bit, no matter
whether underlying clang binary or default target (or kernel) is
32 bit. However, when native clang target is used, then it will
compile these types based on the underlying architecture's conventions,
meaning in case of 32 bit architecture, pointer or long / unsigned
long types e.g. in BPF context structure will have width of 32 bit
while the BPF LLVM back end still operates in 64 bit. The native
target is mostly needed in tracing for the case of walking ``pt_regs``
or other kernel structures where CPU's register width matters.
Otherwise, ``clang -target bpf`` is generally recommended.
You should use default target when:
- Your program includes a header file, e.g., ptrace.h, which eventually
pulls in some header files containing file scope host assembly codes.
- You can add ``-fno-jump-tables`` to work around the switch table issue.
Otherwise, you can use ``bpf`` target. Additionally, you *must* use bpf target
when:
- Your program uses data structures with pointer or long / unsigned long
types that interface with BPF helpers or context data structures. Access
into these structures is verified by the BPF verifier and may result
in verification failures if the native architecture is not aligned with
the BPF architecture, e.g. 64-bit. An example of this is
BPF_PROG_TYPE_SK_MSG require ``-target bpf``
.. Links
.. _Documentation/process/: https://www.kernel.org/doc/html/latest/process/
.. _MAINTAINERS: ../../MAINTAINERS
.. _Documentation/networking/netdev-FAQ.txt: ../networking/netdev-FAQ.txt
.. _netdev FAQ: ../networking/netdev-FAQ.txt
.. _samples/bpf/: ../../samples/bpf/
.. _selftests: ../../tools/testing/selftests/bpf/
.. _Documentation/dev-tools/kselftest.rst:
https://www.kernel.org/doc/html/latest/dev-tools/kselftest.html
Happy BPF hacking!
This document provides information for the BPF subsystem about various
workflows related to reporting bugs, submitting patches, and queueing
patches for stable kernels.
For general information about submitting patches, please refer to
Documentation/process/. This document only describes additional specifics
related to BPF.
Reporting bugs:
---------------
Q: How do I report bugs for BPF kernel code?
A: Since all BPF kernel development as well as bpftool and iproute2 BPF
loader development happens through the netdev kernel mailing list,
please report any found issues around BPF to the following mailing
list:
netdev@vger.kernel.org
This may also include issues related to XDP, BPF tracing, etc.
Given netdev has a high volume of traffic, please also add the BPF
maintainers to Cc (from kernel MAINTAINERS file):
Alexei Starovoitov <ast@kernel.org>
Daniel Borkmann <daniel@iogearbox.net>
In case a buggy commit has already been identified, make sure to keep
the actual commit authors in Cc as well for the report. They can
typically be identified through the kernel's git tree.
Please do *not* report BPF issues to bugzilla.kernel.org since it
is a guarantee that the reported issue will be overlooked.
Submitting patches:
-------------------
Q: To which mailing list do I need to submit my BPF patches?
A: Please submit your BPF patches to the netdev kernel mailing list:
netdev@vger.kernel.org
Historically, BPF came out of networking and has always been maintained
by the kernel networking community. Although these days BPF touches
many other subsystems as well, the patches are still routed mainly
through the networking community.
In case your patch has changes in various different subsystems (e.g.
tracing, security, etc), make sure to Cc the related kernel mailing
lists and maintainers from there as well, so they are able to review
the changes and provide their Acked-by's to the patches.
Q: Where can I find patches currently under discussion for BPF subsystem?
A: All patches that are Cc'ed to netdev are queued for review under netdev
patchwork project:
http://patchwork.ozlabs.org/project/netdev/list/
Those patches which target BPF, are assigned to a 'bpf' delegate for
further processing from BPF maintainers. The current queue with
patches under review can be found at:
https://patchwork.ozlabs.org/project/netdev/list/?delegate=77147
Once the patches have been reviewed by the BPF community as a whole
and approved by the BPF maintainers, their status in patchwork will be
changed to 'Accepted' and the submitter will be notified by mail. This
means that the patches look good from a BPF perspective and have been
applied to one of the two BPF kernel trees.
In case feedback from the community requires a respin of the patches,
their status in patchwork will be set to 'Changes Requested', and purged
from the current review queue. Likewise for cases where patches would
get rejected or are not applicable to the BPF trees (but assigned to
the 'bpf' delegate).
Q: How do the changes make their way into Linux?
A: There are two BPF kernel trees (git repositories). Once patches have
been accepted by the BPF maintainers, they will be applied to one
of the two BPF trees:
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/
The bpf tree itself is for fixes only, whereas bpf-next for features,
cleanups or other kind of improvements ("next-like" content). This is
analogous to net and net-next trees for networking. Both bpf and
bpf-next will only have a master branch in order to simplify against
which branch patches should get rebased to.
Accumulated BPF patches in the bpf tree will regularly get pulled
into the net kernel tree. Likewise, accumulated BPF patches accepted
into the bpf-next tree will make their way into net-next tree. net and
net-next are both run by David S. Miller. From there, they will go
into the kernel mainline tree run by Linus Torvalds. To read up on the
process of net and net-next being merged into the mainline tree, see
the netdev FAQ under:
Documentation/networking/netdev-FAQ.txt
Occasionally, to prevent merge conflicts, we might send pull requests
to other trees (e.g. tracing) with a small subset of the patches, but
net and net-next are always the main trees targeted for integration.
The pull requests will contain a high-level summary of the accumulated
patches and can be searched on netdev kernel mailing list through the
following subject lines (yyyy-mm-dd is the date of the pull request):
pull-request: bpf yyyy-mm-dd
pull-request: bpf-next yyyy-mm-dd
Q: How do I indicate which tree (bpf vs. bpf-next) my patch should be
applied to?
A: The process is the very same as described in the netdev FAQ, so
please read up on it. The subject line must indicate whether the
patch is a fix or rather "next-like" content in order to let the
maintainers know whether it is targeted at bpf or bpf-next.
For fixes eventually landing in bpf -> net tree, the subject must
look like:
git format-patch --subject-prefix='PATCH bpf' start..finish
For features/improvements/etc that should eventually land in
bpf-next -> net-next, the subject must look like:
git format-patch --subject-prefix='PATCH bpf-next' start..finish
If unsure whether the patch or patch series should go into bpf
or net directly, or bpf-next or net-next directly, it is not a
problem either if the subject line says net or net-next as target.
It is eventually up to the maintainers to do the delegation of
the patches.
If it is clear that patches should go into bpf or bpf-next tree,
please make sure to rebase the patches against those trees in
order to reduce potential conflicts.
In case the patch or patch series has to be reworked and sent out
again in a second or later revision, it is also required to add a
version number (v2, v3, ...) into the subject prefix:
git format-patch --subject-prefix='PATCH net-next v2' start..finish
When changes have been requested to the patch series, always send the
whole patch series again with the feedback incorporated (never send
individual diffs on top of the old series).
Q: What does it mean when a patch gets applied to bpf or bpf-next tree?
A: It means that the patch looks good for mainline inclusion from
a BPF point of view.
Be aware that this is not a final verdict that the patch will
automatically get accepted into net or net-next trees eventually:
On the netdev kernel mailing list reviews can come in at any point
in time. If discussions around a patch conclude that they cannot
get included as-is, we will either apply a follow-up fix or drop
them from the trees entirely. Therefore, we also reserve to rebase
the trees when deemed necessary. After all, the purpose of the tree
is to i) accumulate and stage BPF patches for integration into trees
like net and net-next, and ii) run extensive BPF test suite and
workloads on the patches before they make their way any further.
Once the BPF pull request was accepted by David S. Miller, then
the patches end up in net or net-next tree, respectively, and
make their way from there further into mainline. Again, see the
netdev FAQ for additional information e.g. on how often they are
merged to mainline.
Q: How long do I need to wait for feedback on my BPF patches?
A: We try to keep the latency low. The usual time to feedback will
be around 2 or 3 business days. It may vary depending on the
complexity of changes and current patch load.
Q: How often do you send pull requests to major kernel trees like
net or net-next?
A: Pull requests will be sent out rather often in order to not
accumulate too many patches in bpf or bpf-next.
As a rule of thumb, expect pull requests for each tree regularly
at the end of the week. In some cases pull requests could additionally
come also in the middle of the week depending on the current patch
load or urgency.
Q: Are patches applied to bpf-next when the merge window is open?
A: For the time when the merge window is open, bpf-next will not be
processed. This is roughly analogous to net-next patch processing,
so feel free to read up on the netdev FAQ about further details.
During those two weeks of merge window, we might ask you to resend
your patch series once bpf-next is open again. Once Linus released
a v*-rc1 after the merge window, we continue processing of bpf-next.
For non-subscribers to kernel mailing lists, there is also a status
page run by David S. Miller on net-next that provides guidance:
http://vger.kernel.org/~davem/net-next.html
Q: I made a BPF verifier change, do I need to add test cases for
BPF kernel selftests?
A: If the patch has changes to the behavior of the verifier, then yes,
it is absolutely necessary to add test cases to the BPF kernel
selftests suite. If they are not present and we think they are
needed, then we might ask for them before accepting any changes.
In particular, test_verifier.c is tracking a high number of BPF test
cases, including a lot of corner cases that LLVM BPF back end may
generate out of the restricted C code. Thus, adding test cases is
absolutely crucial to make sure future changes do not accidentally
affect prior use-cases. Thus, treat those test cases as: verifier
behavior that is not tracked in test_verifier.c could potentially
be subject to change.
Q: When should I add code to samples/bpf/ and when to BPF kernel
selftests?
A: In general, we prefer additions to BPF kernel selftests rather than
samples/bpf/. The rationale is very simple: kernel selftests are
regularly run by various bots to test for kernel regressions.
The more test cases we add to BPF selftests, the better the coverage
and the less likely it is that those could accidentally break. It is
not that BPF kernel selftests cannot demo how a specific feature can
be used.
That said, samples/bpf/ may be a good place for people to get started,
so it might be advisable that simple demos of features could go into
samples/bpf/, but advanced functional and corner-case testing rather
into kernel selftests.
If your sample looks like a test case, then go for BPF kernel selftests
instead!
Q: When should I add code to the bpftool?
A: The main purpose of bpftool (under tools/bpf/bpftool/) is to provide
a central user space tool for debugging and introspection of BPF programs
and maps that are active in the kernel. If UAPI changes related to BPF
enable for dumping additional information of programs or maps, then
bpftool should be extended as well to support dumping them.
Q: When should I add code to iproute2's BPF loader?
A: For UAPI changes related to the XDP or tc layer (e.g. cls_bpf), the
convention is that those control-path related changes are added to
iproute2's BPF loader as well from user space side. This is not only
useful to have UAPI changes properly designed to be usable, but also
to make those changes available to a wider user base of major
downstream distributions.
Q: Do you accept patches as well for iproute2's BPF loader?
A: Patches for the iproute2's BPF loader have to be sent to:
netdev@vger.kernel.org
While those patches are not processed by the BPF kernel maintainers,
please keep them in Cc as well, so they can be reviewed.
The official git repository for iproute2 is run by Stephen Hemminger
and can be found at:
https://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git/
The patches need to have a subject prefix of '[PATCH iproute2 master]'
or '[PATCH iproute2 net-next]'. 'master' or 'net-next' describes the
target branch where the patch should be applied to. Meaning, if kernel
changes went into the net-next kernel tree, then the related iproute2
changes need to go into the iproute2 net-next branch, otherwise they
can be targeted at master branch. The iproute2 net-next branch will get
merged into the master branch after the current iproute2 version from
master has been released.
Like BPF, the patches end up in patchwork under the netdev project and
are delegated to 'shemminger' for further processing:
http://patchwork.ozlabs.org/project/netdev/list/?delegate=389
Q: What is the minimum requirement before I submit my BPF patches?
A: When submitting patches, always take the time and properly test your
patches *prior* to submission. Never rush them! If maintainers find
that your patches have not been properly tested, it is a good way to
get them grumpy. Testing patch submissions is a hard requirement!
Note, fixes that go to bpf tree *must* have a Fixes: tag included. The
same applies to fixes that target bpf-next, where the affected commit
is in net-next (or in some cases bpf-next). The Fixes: tag is crucial
in order to identify follow-up commits and tremendously helps for people
having to do backporting, so it is a must have!
We also don't accept patches with an empty commit message. Take your
time and properly write up a high quality commit message, it is
essential!
Think about it this way: other developers looking at your code a month
from now need to understand *why* a certain change has been done that
way, and whether there have been flaws in the analysis or assumptions
that the original author did. Thus providing a proper rationale and
describing the use-case for the changes is a must.
Patch submissions with >1 patch must have a cover letter which includes
a high level description of the series. This high level summary will
then be placed into the merge commit by the BPF maintainers such that
it is also accessible from the git log for future reference.
Q: What do I need to consider when adding a new instruction or feature
that would require BPF JIT and/or LLVM integration as well?
A: We try hard to keep all BPF JITs up to date such that the same user
experience can be guaranteed when running BPF programs on different
architectures without having the program punt to the less efficient
interpreter in case the in-kernel BPF JIT is enabled.
If you are unable to implement or test the required JIT changes for
certain architectures, please work together with the related BPF JIT
developers in order to get the feature implemented in a timely manner.
Please refer to the git log (arch/*/net/) to locate the necessary
people for helping out.
Also always make sure to add BPF test cases (e.g. test_bpf.c and
test_verifier.c) for new instructions, so that they can receive
broad test coverage and help run-time testing the various BPF JITs.
In case of new BPF instructions, once the changes have been accepted
into the Linux kernel, please implement support into LLVM's BPF back
end. See LLVM section below for further information.
Stable submission:
------------------
Q: I need a specific BPF commit in stable kernels. What should I do?
A: In case you need a specific fix in stable kernels, first check whether
the commit has already been applied in the related linux-*.y branches:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/
If not the case, then drop an email to the BPF maintainers with the
netdev kernel mailing list in Cc and ask for the fix to be queued up:
netdev@vger.kernel.org
The process in general is the same as on netdev itself, see also the
netdev FAQ document.
Q: Do you also backport to kernels not currently maintained as stable?
A: No. If you need a specific BPF commit in kernels that are currently not
maintained by the stable maintainers, then you are on your own.
The current stable and longterm stable kernels are all listed here:
https://www.kernel.org/
Q: The BPF patch I am about to submit needs to go to stable as well. What
should I do?
A: The same rules apply as with netdev patch submissions in general, see
netdev FAQ under:
Documentation/networking/netdev-FAQ.txt
Never add "Cc: stable@vger.kernel.org" to the patch description, but
ask the BPF maintainers to queue the patches instead. This can be done
with a note, for example, under the "---" part of the patch which does
not go into the git log. Alternatively, this can be done as a simple
request by mail instead.
Q: Where do I find currently queued BPF patches that will be submitted
to stable?
A: Once patches that fix critical bugs got applied into the bpf tree, they
are queued up for stable submission under:
http://patchwork.ozlabs.org/bundle/bpf/stable/?state=*
They will be on hold there at minimum until the related commit made its
way into the mainline kernel tree.
After having been under broader exposure, the queued patches will be
submitted by the BPF maintainers to the stable maintainers.
Testing patches:
----------------
Q: Which BPF kernel selftests version should I run my kernel against?
A: If you run a kernel xyz, then always run the BPF kernel selftests from
that kernel xyz as well. Do not expect that the BPF selftest from the
latest mainline tree will pass all the time.
In particular, test_bpf.c and test_verifier.c have a large number of
test cases and are constantly updated with new BPF test sequences, or
existing ones are adapted to verifier changes e.g. due to verifier
becoming smarter and being able to better track certain things.
LLVM:
-----
Q: Where do I find LLVM with BPF support?
A: The BPF back end for LLVM is upstream in LLVM since version 3.7.1.
All major distributions these days ship LLVM with BPF back end enabled,
so for the majority of use-cases it is not required to compile LLVM by
hand anymore, just install the distribution provided package.
LLVM's static compiler lists the supported targets through 'llc --version',
make sure BPF targets are listed. Example:
$ llc --version
LLVM (http://llvm.org/):
LLVM version 6.0.0svn
Optimized build.
Default target: x86_64-unknown-linux-gnu
Host CPU: skylake
Registered Targets:
bpf - BPF (host endian)
bpfeb - BPF (big endian)
bpfel - BPF (little endian)
x86 - 32-bit X86: Pentium-Pro and above
x86-64 - 64-bit X86: EM64T and AMD64
For developers in order to utilize the latest features added to LLVM's
BPF back end, it is advisable to run the latest LLVM releases. Support
for new BPF kernel features such as additions to the BPF instruction
set are often developed together.
All LLVM releases can be found at: http://releases.llvm.org/
Q: Got it, so how do I build LLVM manually anyway?
A: You need cmake and gcc-c++ as build requisites for LLVM. Once you have
that set up, proceed with building the latest LLVM and clang version
from the git repositories:
$ git clone http://llvm.org/git/llvm.git
$ cd llvm/tools
$ git clone --depth 1 http://llvm.org/git/clang.git
$ cd ..; mkdir build; cd build
$ cmake .. -DLLVM_TARGETS_TO_BUILD="BPF;X86" \
-DBUILD_SHARED_LIBS=OFF \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_BUILD_RUNTIME=OFF
$ make -j $(getconf _NPROCESSORS_ONLN)
The built binaries can then be found in the build/bin/ directory, where
you can point the PATH variable to.
Q: Should I notify BPF kernel maintainers about issues in LLVM's BPF code
generation back end or about LLVM generated code that the verifier
refuses to accept?
A: Yes, please do! LLVM's BPF back end is a key piece of the whole BPF
infrastructure and it ties deeply into verification of programs from the
kernel side. Therefore, any issues on either side need to be investigated
and fixed whenever necessary.
Therefore, please make sure to bring them up at netdev kernel mailing
list and Cc BPF maintainers for LLVM and kernel bits:
Yonghong Song <yhs@fb.com>
Alexei Starovoitov <ast@kernel.org>
Daniel Borkmann <daniel@iogearbox.net>
LLVM also has an issue tracker where BPF related bugs can be found:
https://bugs.llvm.org/buglist.cgi?quicksearch=bpf
However, it is better to reach out through mailing lists with having
maintainers in Cc.
Q: I have added a new BPF instruction to the kernel, how can I integrate
it into LLVM?
A: LLVM has a -mcpu selector for the BPF back end in order to allow the
selection of BPF instruction set extensions. By default the 'generic'
processor target is used, which is the base instruction set (v1) of BPF.
LLVM has an option to select -mcpu=probe where it will probe the host
kernel for supported BPF instruction set extensions and selects the
optimal set automatically.
For cross-compilation, a specific version can be select manually as well.
$ llc -march bpf -mcpu=help
Available CPUs for this target:
generic - Select the generic processor.
probe - Select the probe processor.
v1 - Select the v1 processor.
v2 - Select the v2 processor.
[...]
Newly added BPF instructions to the Linux kernel need to follow the same
scheme, bump the instruction set version and implement probing for the
extensions such that -mcpu=probe users can benefit from the optimization
transparently when upgrading their kernels.
If you are unable to implement support for the newly added BPF instruction
please reach out to BPF developers for help.
By the way, the BPF kernel selftests run with -mcpu=probe for better
test coverage.
Q: In some cases clang flag "-target bpf" is used but in other cases the
default clang target, which matches the underlying architecture, is used.
What is the difference and when I should use which?
A: Although LLVM IR generation and optimization try to stay architecture
independent, "-target <arch>" still has some impact on generated code:
- BPF program may recursively include header file(s) with file scope
inline assembly codes. The default target can handle this well,
while bpf target may fail if bpf backend assembler does not
understand these assembly codes, which is true in most cases.
- When compiled without -g, additional elf sections, e.g.,
.eh_frame and .rela.eh_frame, may be present in the object file
with default target, but not with bpf target.
- The default target may turn a C switch statement into a switch table
lookup and jump operation. Since the switch table is placed
in the global readonly section, the bpf program will fail to load.
The bpf target does not support switch table optimization.
The clang option "-fno-jump-tables" can be used to disable
switch table generation.
- For clang -target bpf, it is guaranteed that pointer or long /
unsigned long types will always have a width of 64 bit, no matter
whether underlying clang binary or default target (or kernel) is
32 bit. However, when native clang target is used, then it will
compile these types based on the underlying architecture's conventions,
meaning in case of 32 bit architecture, pointer or long / unsigned
long types e.g. in BPF context structure will have width of 32 bit
while the BPF LLVM back end still operates in 64 bit. The native
target is mostly needed in tracing for the case of walking pt_regs
or other kernel structures where CPU's register width matters.
Otherwise, clang -target bpf is generally recommended.
You should use default target when:
- Your program includes a header file, e.g., ptrace.h, which eventually
pulls in some header files containing file scope host assembly codes.
- You can add "-fno-jump-tables" to work around the switch table issue.
Otherwise, you can use bpf target. Additionally, you _must_ use bpf target
when:
- Your program uses data structures with pointer or long / unsigned long
types that interface with BPF helpers or context data structures. Access
into these structures is verified by the BPF verifier and may result
in verification failures if the native architecture is not aligned with
the BPF architecture, e.g. 64-bit. An example of this is
BPF_PROG_TYPE_SK_MSG require '-target bpf'
Happy BPF hacking!
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment