  1. 19 Dec, 2017 10 commits
    • forcedeth: remove duplicate structure member in xmit · 41b0cd36
      Zhu Yanjun authored
      Since both first_tx_ctx and tx_skb are the head of tx ctx, it is not
      necessary to use two structure members to statically indicate
      the head of tx ctx. So first_tx_ctx is removed.
      
      CC: Srinivas Eeda <srinivas.eeda@oracle.com>
      CC: Joe Jin <joe.jin@oracle.com>
      CC: Junxiao Bi <junxiao.bi@oracle.com>
      Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-NETIF_F_GRO_HW' · e9c5a106
      David S. Miller authored
      Michael Chan says:
      
      ====================
      Introduce NETIF_F_GRO_HW
      
      Introduce NETIF_F_GRO_HW feature flag and convert drivers that support
      hardware GRO to use the new flag.
      
      v5:
      - Documentation changes requested by Alexander Duyck.
      - bnx2x changes requested by Manish Chopra to enable LRO by default, and
      disable GRO_HW if disable_tpa module parameter is set.
      
      v4:
      - more changes requested by Alexander Duyck:
      - check GRO_HW/GRO dependency in drivers' ndo_fix_features().
      - Reverse the order of RXCSUM and GRO_HW dependency check in
      netdev_fix_features().
      - No propagation in netdev_disable_gro_hw().
      
      v3:
      - Let driver's ndo_fix_features() disable NETIF_F_LRO when NETIF_F_GRO_HW
      is set instead of doing it in common netdev_fix_features().
      
      v2:
      - NETIF_F_GRO_HW flag propagation between upper and lower devices not
      required (see patch 1).
      - NETIF_F_GRO_HW depends on NETIF_F_GRO and NETIF_F_RXCSUM.
      - Add dev_disable_gro_hw() to disable GRO_HW for generic XDP.
      - Use ndo_fix_features() on all 3 drivers to drop GRO_HW when it is not
      supported
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
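The dependency rules listed in the series notes above can be sketched as a single feature-fixing function. This is a minimal illustration, not the kernel code: the bit values, the function name `sketch_fix_features`, and the `hw_agg_allowed` parameter are all hypothetical, and the sketch folds the core and per-driver checks into one place for brevity.

```c
#include <stdint.h>

/* Hypothetical bit positions for illustration only; the kernel defines
 * the real NETIF_F_* values itself. */
#define F_RXCSUM  (1u << 0)
#define F_GRO     (1u << 1)
#define F_LRO     (1u << 2)
#define F_GRO_HW  (1u << 3)

/* Takes the requested feature set and returns one that satisfies the
 * dependencies named in the series. hw_agg_allowed is an assumed input
 * standing for "XDP is not attached and the MTU supports aggregation". */
static uint32_t sketch_fix_features(uint32_t features, int hw_agg_allowed)
{
    /* The current hardware configuration forbids any aggregation. */
    if (!hw_agg_allowed)
        features &= ~(F_LRO | F_GRO_HW);

    /* GRO_HW depends on RXCSUM. */
    if (!(features & F_RXCSUM))
        features &= ~F_GRO_HW;

    /* GRO_HW is a subset of GRO, so it is dropped when GRO is off. */
    if (!(features & F_GRO))
        features &= ~F_GRO_HW;

    /* GRO_HW and LRO are mutually exclusive; LRO is cleared when
     * GRO_HW survives the checks above. */
    if (features & F_GRO_HW)
        features &= ~F_LRO;

    return features;
}
```

Requesting RXCSUM, GRO, GRO_HW and LRO together yields a set without LRO, while requesting GRO_HW without GRO or without RXCSUM yields a set without GRO_HW.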
    • qede: Use NETIF_F_GRO_HW. · 18c602de
      Michael Chan authored
      Advertise NETIF_F_GRO_HW and set edev->gro_disable according to the
      feature flag.  Add qede_fix_features() to drop NETIF_F_GRO_HW if
      XDP is running or MTU does not support GRO_HW or GRO is not set.
      qede_change_mtu() also checks and disables GRO_HW if MTU is not
      supported.
      
      Cc: Ariel Elior <Ariel.Elior@cavium.com>
      Cc: everest-linux-l2@cavium.com
      Acked-by: Manish Chopra <manish.chopra@cavium.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Acked-by: Manish Chopra <manish.chopra@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnx2x: Use NETIF_F_GRO_HW. · 3c3def5f
      Michael Chan authored
      Advertise NETIF_F_GRO_HW and turn on TPA_MODE_GRO when NETIF_F_GRO_HW
      is set.  Disable NETIF_F_GRO_HW in bnx2x_fix_features() if the MTU
      does not support TPA_MODE_GRO or GRO is not set.  bnx2x_change_mtu() also
      needs to disable NETIF_F_GRO_HW if the MTU does not support it.
      
      Original parameter disable_tpa will continue to disable LRO and GRO_HW.
      
      Preserve the original behavior of enabling LRO by default.  User has
      to run ethtool -K to explicitly enable GRO_HW.
      
      Cc: Ariel Elior <Ariel.Elior@cavium.com>
      Cc: everest-linux-l2@cavium.com
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Acked-by: Manish Chopra <manish.chopra@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnxt_en: Use NETIF_F_GRO_HW. · 1054aee8
      Michael Chan authored
      Advertise NETIF_F_GRO_HW in hw_features if hardware GRO is supported.
      In bnxt_fix_features(), disable GRO_HW and LRO if current hardware
      configuration does not allow it.  GRO_HW depends on GRO.  GRO_HW is
      also mutually exclusive with LRO.  XDP setup will now rely on
      bnxt_fix_features() to turn off aggregation.  During chip init, turn on
      or off hardware GRO based on NETIF_F_GRO_HW in features flag.
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Disable GRO_HW when generic XDP is installed on a device. · 56f5aa77
      Michael Chan authored
      Hardware should not aggregate any packets when generic XDP is installed.
      
      Cc: Ariel Elior <Ariel.Elior@cavium.com>
      Cc: everest-linux-l2@cavium.com
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Introduce NETIF_F_GRO_HW. · fb1f5f79
      Michael Chan authored
      Introduce NETIF_F_GRO_HW feature flag for NICs that support hardware
      GRO.  With this flag, we can now independently turn on or off hardware
      GRO when GRO is on.  Previously, drivers were using NETIF_F_GRO to
      control hardware GRO and so it cannot be independently turned on or
      off without affecting GRO.
      
      Hardware GRO (just like GRO) guarantees that packets can be re-segmented
      by TSO/GSO to reconstruct the original packet stream.  Logically,
      GRO_HW should depend on GRO since it is a subset, but we will let
      individual drivers enforce this dependency as they see fit.
      
      Since NETIF_F_GRO is not propagated between upper and lower devices,
      NETIF_F_GRO_HW should follow suit since it is a subset of GRO.  In other
      words, a lower device can independently have GRO/GRO_HW enabled or disabled
      and no feature propagation is required.  This will preserve the current
      GRO behavior.  This can be changed later if we decide to propagate GRO/
      GRO_HW/RXCSUM from upper to lower devices.
      
      Cc: Ariel Elior <Ariel.Elior@cavium.com>
      Cc: everest-linux-l2@cavium.com
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sock: Hide unused variable when !CONFIG_PROC_FS. · 398b841e
      Tonghao Zhang authored
      When CONFIG_PROC_FS is disabled, we will not use the prot_inuse
      counter. This adds an #ifdef to hide the variable definition in
      that case. This is not a bugfix, but it saves bytes when there
      are many network namespaces.
      
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Signed-off-by: Martin Zhang <zhangjunweimartin@didichuxing.com>
      Signed-off-by: Tonghao Zhang <zhangtonghao@didichuxing.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sock: Move the socket inuse to namespace. · 648845ab
      Tonghao Zhang authored
      In some cases, we want to know how many sockets are in use in
      different _net_ namespaces. It's a key resource metric.
      
      This patch adds a member to struct netns_core: a counter of the
      sockets in use in the _net_ namespace. The counter is incremented
      and decremented in sk_alloc, sk_clone_lock and __sk_free.
      
      This patch does not count sockets created by the kernel.
      It's not very useful for userspace to know how many kernel
      sockets we created.
      
      The main reasons for doing this are:
      
      1. When Linux calls 'do_exit' for a process to exit, the functions
      'exit_task_namespaces' and 'exit_task_work' are called sequentially.
      'exit_task_namespaces' may have destroyed the _net_ namespace, but
      'sock_release', called in 'exit_task_work', would still use the _net_
      namespace if we counted the sockets in use in sock_release.
      
      2. socket and sock come in pairs. More importantly, sock holds the _net_
      namespace. We count the sockets in use in sock to avoid holding the
      _net_ namespace again in socket. It's an easy way to maintain the code.
      Signed-off-by: Martin Zhang <zhangjunweimartin@didichuxing.com>
      Signed-off-by: Tonghao Zhang <zhangtonghao@didichuxing.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
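The counting scheme described in this commit message can be sketched as follows. The struct and function names are hypothetical stand-ins, not the kernel's definitions, and a plain `int` stands in for whatever concurrency-safe counter the real patch uses. The point is only that the counter lives in the namespace, is updated from the sock (which already holds the namespace), and skips kernel-created sockets.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-namespace state. */
struct sketch_net {
    int sock_inuse;            /* sockets in use in this namespace */
};

/* Hypothetical stand-in for a sock. */
struct sketch_sock {
    struct sketch_net *net;    /* the sock holds its namespace */
    bool kern;                 /* kernel-created sockets are not counted */
};

/* Allocation path: count the socket unless the kernel created it. */
static struct sketch_sock *sketch_sk_alloc(struct sketch_net *net, bool kern)
{
    struct sketch_sock *sk = malloc(sizeof(*sk));

    if (!sk)
        return NULL;
    sk->net = net;
    sk->kern = kern;
    if (!kern)
        net->sock_inuse++;
    return sk;
}

/* Free path: undo the accounting while the sock still references the
 * namespace, then release the memory. */
static void sketch_sk_free(struct sketch_sock *sk)
{
    if (!sk)
        return;
    if (!sk->kern)
        sk->net->sock_inuse--;
    free(sk);
}
```

Because the decrement happens on the sock's own free path, nothing in this sketch needs to reach the namespace through a separate socket-level reference.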
    • sock: Change the netns_core member name. · 08fc7f81
      Tonghao Zhang authored
      Changing the member name makes the code more readable.
      This patch is used by the next patch.
      Signed-off-by: Martin Zhang <zhangjunweimartin@didichuxing.com>
      Signed-off-by: Tonghao Zhang <zhangtonghao@didichuxing.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 18 Dec, 2017 25 commits
  3. 17 Dec, 2017 5 commits
    • trace: reenable preemption if we modify the ip · 46df3d20
      Josef Bacik authored
      Things got moved around between the original bpf_override_return patches
      and the final version, and now the ftrace kprobe dispatcher assumes that
      if you modified the ip you also enabled preemption.  Add a comment about
      this and enable preemption; this fixes the lockdep splat that happened
      when using this feature.
      
      Fixes: 9802d865 ("bpf: add a bpf_override_function helper")
      Signed-off-by: Josef Bacik <jbacik@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • nfp: set flags in the correct member of netdev_bpf · 4a29c0db
      Jakub Kicinski authored
      netdev_bpf.flags is the input member for installing the program.
      netdev_bpf.prog_flags is the output member for querying.  Set
      the correct one on query.
      
      Fixes: 92f0292b ("net: xdp: report flags program was installed with on query")
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • libbpf: fix Makefile exit code if libelf not found · 21567ede
      Jakub Kicinski authored
      /bin/sh's exit does not recognize -1 as a number, leading to
      the following error message:
      
      /bin/sh: 1: exit: Illegal number: -1
      
      Use 1 as the exit code.
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • Merge branch 'bpf-to-bpf-function-calls' · ef9fde06
      Daniel Borkmann authored
      Alexei Starovoitov says:
      
      ====================
      First of all huge thank you to Daniel, John, Jakub, Edward and others who
      reviewed multiple iterations of this patch set over the last many months
      and to Dave and others who gave critical feedback during netconf/netdev.
      
      The patch is solid enough and we thought through numerous corner cases,
      but it's not the end. More followups with code reorg and features to follow.
      
      TLDR: Allow arbitrary function calls from bpf function to another bpf function.
      
      Since the beginning of bpf all bpf programs were represented as a single function
      and program authors were forced to use always_inline for all functions
      in their C code. That was causing llvm to unnecessarily inflate the code size
      and forcing developers to move code to header files with little code reuse.
      
      With a bit of additional complexity teach verifier to recognize
      arbitrary function calls from one bpf function to another as long as
      all of the functions are presented to the verifier as a single bpf program.
      Extended program layout:
      ..
      r1 = ..    // arg1
      r2 = ..    // arg2
      call pc+1  // function call pc-relative
      exit
      .. = r1    // access arg1
      .. = r2    // access arg2
      ..
      call pc+20 // second level of function call
      ...
      
      It allows for better optimized code and finally allows to introduce
      the core bpf libraries that can be reused in different projects,
      since programs are no longer limited by single elf file.
      With function calls bpf can be compiled into multiple .o files.
      
      This patch is the first step. It detects programs that contain
      multiple functions and checks that calls between them are valid.
      It splits the sequence of bpf instructions (one program) into a set
      of bpf functions that call each other. Calls to only known
      functions are allowed. Since all functions are presented to
      the verifier at once conceptually it is 'static linking'.
      
      Future plans:
      - introduce BPF_PROG_TYPE_LIBRARY and allow a set of bpf functions
        to be loaded into the kernel that can be later linked to other
        programs with concrete program types. Aka 'dynamic linking'.
      
      - introduce function pointer type and indirect calls to allow
        bpf functions call other dynamically loaded bpf functions while
        the caller bpf function is already executing. Aka 'runtime linking'.
        This will be more generic and more flexible alternative
        to bpf_tail_calls.
      
      FAQ:
      Q: Do the interpreter and JIT changes mean that a new instruction is introduced?
      A: No. The call instruction technically stays the same. Now it can call
         both kernel helpers and other bpf functions.
         Calling convention stays the same as well.
         From uapi point of view the call insn got new 'relocation' BPF_PSEUDO_CALL
         similar to BPF_PSEUDO_MAP_FD 'relocation' of bpf_ldimm64 insn.
      
      Q: What had to change on LLVM side?
      A: A trivial LLVM patch to allow calls was applied to the upcoming 6.0 release:
         https://reviews.llvm.org/rL318614
         along with a few bugfixes.
         Make sure to build the latest llvm to have bpf_call support.
      
      More details in the patches.
      ====================
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
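The extended program layout in the cover letter above can be illustrated with a toy interpreter. This is a sketch under stated assumptions, not the kernel's interpreter: the opcode set, `struct insn`, and `toy_run` are hypothetical, and the toy ignores stack frames and callee-saved registers. It assumes a pc-relative call transfers control to `pc + 1 + imm`, which matches the layout shown, where `call pc+1` skips the following `exit` and lands on the callee's first instruction.

```c
#include <stdint.h>

/* Hypothetical toy opcodes, chosen only to mirror the layout above. */
enum op { OP_MOV_IMM, OP_ADD_REG, OP_CALL, OP_EXIT };

struct insn {
    enum op op;
    int dst;        /* destination register index, 0..10 */
    int src;        /* source register index, 0..10 */
    int32_t imm;    /* immediate, or pc-relative call offset */
};

#define MAX_DEPTH 8

/* Runs a well-formed toy program and returns r0 at the outermost exit.
 * Returns -1 if execution runs off the program or nests too deeply. */
static int64_t toy_run(const struct insn *prog, int len)
{
    int64_t r[11] = { 0 };
    int ret_stack[MAX_DEPTH];
    int depth = 0;
    int pc = 0;

    while (pc >= 0 && pc < len) {
        const struct insn *i = &prog[pc];

        switch (i->op) {
        case OP_MOV_IMM:
            r[i->dst] = i->imm;
            pc++;
            break;
        case OP_ADD_REG:
            r[i->dst] += r[i->src];
            pc++;
            break;
        case OP_CALL:
            if (depth == MAX_DEPTH)
                return -1;
            ret_stack[depth++] = pc + 1;   /* resume after the call */
            pc = pc + 1 + i->imm;          /* pc-relative target */
            break;
        case OP_EXIT:
            if (depth == 0)
                return r[0];               /* exit of the main function */
            pc = ret_stack[--depth];       /* return to the caller */
            break;
        default:
            return -1;
        }
    }
    return -1;
}
```

In this toy, arguments are passed in r1 and r2 and the result comes back in r0, so a callee placed right after the caller's `exit` can read both arguments and hand a value back without any inlining.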
    • selftests/bpf: additional bpf_call tests · 28ab173e
      Daniel Borkmann authored
      Add some additional checks for a few more corner cases.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>