1. 17 Sep, 2018 8 commits
  2. 16 Sep, 2018 6 commits
  3. 14 Sep, 2018 4 commits
  4. 13 Sep, 2018 22 commits
    • net/ibm/emac: Remove VLA usage · ee4fccbe
      Kees Cook authored
      In the quest to remove all stack VLA usage from the kernel[1], this
      removes the VLA used for the emac xaht registers. Since the number of
      registers can only ever be 4 or 8, as detected in emac_init_config(),
      the maximum can be hardcoded and a runtime test added for robustness.
      
      [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
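      
      The resulting pattern looks roughly like this (a minimal sketch;
      XAHT_MAX_REGS and the field names here are illustrative, not the
      driver's actual identifiers):
      
      #define XAHT_MAX_REGS 8  /* emac_init_config() only ever yields 4 or 8 */
      
      /* before: stack VLA, size only known at runtime */
      u32 gaht[dev->xaht_regs];
      
      /* after: fixed upper bound plus a runtime robustness check */
      u32 gaht[XAHT_MAX_REGS];
      
      if (WARN_ON(dev->xaht_regs > XAHT_MAX_REGS))
              return;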
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Christian Lamparter <chunkeey@gmail.com>
      Cc: Ivan Mikhaylov <ivan@de.ibm.com>
      Cc: netdev@vger.kernel.org
      Co-developed-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pktgen: Fix fall-through annotation · f91845da
      Gustavo A. R. Silva authored
      Replace "fallthru" with a proper "fall through" annotation.
      
      This fix is part of the ongoing effort to enable
      -Wimplicit-fallthrough.
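      
      For context, the comment annotation that -Wimplicit-fallthrough
      recognizes looks like this (an illustrative example, not the pktgen
      code itself):
      
      switch (cmd) {
      case CMD_START:
              start();
              /* fall through */
      case CMD_RESUME:
              resume();
              break;
      }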
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tg3: Fix fall-through annotations · 310fc051
      Gustavo A. R. Silva authored
      Replace "fallthru" with a proper "fall through" annotation.
      
      This fix is part of the ongoing effort to enable
      -Wimplicit-fallthrough.
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • gso_segment: Reset skb->mac_len after modifying network header · 50c12f74
      Toke Høiland-Jørgensen authored
      When splitting a GSO segment that consists of encapsulated packets, the
      skb->mac_len of the segments can end up being set wrong, causing packet
      drops in particular when using act_mirred and ifb interfaces in
      combination with a qdisc that splits GSO packets.
      
      This happens because, at the time skb_segment() is called, network_header
      points to the inner header, throwing off the calculation in
      skb_reset_mac_len(). The network_header is subsequently adjusted by the
      outer IP gso_segment handlers, but they don't set the mac_len.
      
      Fix this by adding skb_reset_mac_len() calls to both the IPv4 and IPv6
      gso_segment handlers, after they modify the network_header.
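      
      Conceptually the change is one added call per handler, along these
      lines (a sketch of the pattern, not the literal diff; iph is the
      outer IP header of the segment):
      
      /* in the IPv4/IPv6 gso_segment handlers, after the outer network
       * header offset has been restored for each segment:
       */
      skb->network_header = (u8 *)iph - skb->head;
      skb_reset_mac_len(skb); /* recompute mac_len from the corrected offset */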
      
      Many thanks to Eric Dumazet for his help in identifying the cause of
      the bug.
      Acked-by: Dave Taht <dave.taht@gmail.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vxlan: Remove duplicated include from vxlan.h · 293681f1
      YueHaibing authored
      Remove duplicated include.
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: b53: Do not fail when IRQ are not initialized · b2ddc48a
      Florian Fainelli authored
      When the Device Tree does not provide the per-port interrupts, do not fail
      during b53_srab_irq_enable(), but instead bail out gracefully. The SRAB
      driver is used on the BCM5301X (Northstar) platforms, which do not yet
      have the SRAB interrupts wired up.
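      
      The shape of the fix is an early, successful return when no per-port
      interrupt was wired up (a sketch; the exact condition in the driver
      may differ):
      
      /* in b53_srab_irq_enable(): the DT may legitimately omit the
       * per-port interrupts (e.g. on BCM5301X), so treat a missing IRQ
       * as "nothing to enable" rather than as an error
       */
      if (p->irq == -ENXIO)
              return 0;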
      
      Fixes: 16994374 ("net: dsa: b53: Make SRAB driver manage port interrupts")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'vhost_net-TX-batching' · 8bb83b78
      David S. Miller authored
      Jason Wang says:
      
      ====================
      vhost_net TX batching
      
      This series tries to batch submitting packets to the underlying socket
      through msg_control during sendmsg(). This is done by:
      
      1) Doing userspace copy inside vhost_net
      2) Build XDP buff
      3) Batch at most 64 (VHOST_NET_BATCH) XDP buffs and submit them once
         through msg_control during sendmsg().
      4) Underlying sockets can use the XDP buffs directly when XDP is
         enabled, or build skbs based on them.
      
      For packets that cannot be built easily with XDP, or for cases where
      batch submission is hard (e.g. sndbuf is limited), we fall back to the
      previous slow path, passing an iov iterator to the underlying socket
      through sendmsg() once per packet.
      
      This helps improve cache utilization and avoids lots of indirect
      calls with sendmsg(). It can also cooperate with the batching support
      of the underlying sockets (e.g. the case of XDP redirection through
      maps).
      
      Testpmd(txonly) in guest shows obvious improvements:
      
      Test                /+pps%
      XDP_DROP on TAP     /+44.8%
      XDP_REDIRECT on TAP /+29%
      macvtap (skb)       /+26%
      
      Netperf TCP_STREAM TX from the guest shows obvious improvements on small
      packets:
      
          size/session/+thu%/+normalize%
             64/     1/   +2%/    0%
             64/     2/   +3%/   +1%
             64/     4/   +7%/   +5%
             64/     8/   +8%/   +6%
            256/     1/   +3%/    0%
            256/     2/  +10%/   +7%
            256/     4/  +26%/  +22%
            256/     8/  +27%/  +23%
            512/     1/   +3%/   +2%
            512/     2/  +19%/  +14%
            512/     4/  +43%/  +40%
            512/     8/  +45%/  +41%
           1024/     1/   +4%/    0%
           1024/     2/  +27%/  +21%
           1024/     4/  +38%/  +73%
           1024/     8/  +15%/  +24%
           2048/     1/  +10%/   +7%
           2048/     2/  +16%/  +12%
           2048/     4/    0%/   +2%
           2048/     8/    0%/   +2%
           4096/     1/  +36%/  +60%
           4096/     2/  -11%/  -26%
           4096/     4/    0%/  +14%
           4096/     8/    0%/   +4%
          16384/     1/   -1%/   +5%
          16384/     2/    0%/   +2%
          16384/     4/    0%/   -3%
          16384/     8/    0%/   +4%
          65535/     1/    0%/  +10%
          65535/     2/    0%/   +8%
          65535/     4/    0%/   +1%
          65535/     8/    0%/   +3%
      
      Please review.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vhost_net: batch submitting XDP buffers to underlayer sockets · 0a0be13b
      Jason Wang authored
      This patch implements XDP batching for vhost_net. The idea is to first
      try the userspace copy and build the XDP buff directly in vhost. Instead
      of submitting each packet immediately, vhost_net batches them in an
      array and submits every 64 (VHOST_NET_BATCH) packets to the underlying
      socket through msg_control of sendmsg().
      
      When XDP is enabled on the TUN/TAP device, TUN/TAP can process XDP inside
      a loop without caring about GUP, and thus can do batched map flushing.
      When XDP is not enabled or not supported, the underlying socket needs to
      build skbs and pass them to the network core. The batched packet
      submission allows us to do batching like netif_receive_skb_list() in the
      future.
      
      This saves lots of indirect calls for better cache utilization. For
      the cases where we can't do batching, e.g. when sndbuf is limited or
      the packet size is too large, we fall back to the usual one packet per
      sendmsg() way.
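      
      In rough outline, the TX path now looks like this (simplified
      pseudo-kernel-C paraphrased from the series, not quoted from it):
      
      struct tun_msg_ctl ctl;
      struct xdp_buff *batch[VHOST_NET_BATCH];
      int n = 0;
      
      /* per packet: copy it from the guest and build batch[n++] ...
       * then, once VHOST_NET_BATCH buffs are queued (or the ring is
       * empty):
       */
      ctl.type = TUN_MSG_PTR;
      ctl.ptr = batch;                     /* the array of n XDP buffs */
      msg.msg_control = &ctl;
      sock->ops->sendmsg(sock, &msg, 0);   /* one call submits the batch */
      n = 0;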
      
      Doing testpmd on various setups gives us:
      
      Test                /+pps%
      XDP_DROP on TAP     /+44.8%
      XDP_REDIRECT on TAP /+29%
      macvtap (skb)       /+26%
      
      Netperf tests show obvious improvements for small packet transmission:
      
      size/session/+thu%/+normalize%
         64/     1/   +2%/    0%
         64/     2/   +3%/   +1%
         64/     4/   +7%/   +5%
         64/     8/   +8%/   +6%
        256/     1/   +3%/    0%
        256/     2/  +10%/   +7%
        256/     4/  +26%/  +22%
        256/     8/  +27%/  +23%
        512/     1/   +3%/   +2%
        512/     2/  +19%/  +14%
        512/     4/  +43%/  +40%
        512/     8/  +45%/  +41%
       1024/     1/   +4%/    0%
       1024/     2/  +27%/  +21%
       1024/     4/  +38%/  +73%
       1024/     8/  +15%/  +24%
       2048/     1/  +10%/   +7%
       2048/     2/  +16%/  +12%
       2048/     4/    0%/   +2%
       2048/     8/    0%/   +2%
       4096/     1/  +36%/  +60%
       4096/     2/  -11%/  -26%
       4096/     4/    0%/  +14%
       4096/     8/    0%/   +4%
      16384/     1/   -1%/   +5%
      16384/     2/    0%/   +2%
      16384/     4/    0%/   -3%
      16384/     8/    0%/   +4%
      65535/     1/    0%/  +10%
      65535/     2/    0%/   +8%
      65535/     4/    0%/   +1%
      65535/     8/    0%/   +3%
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tap: accept an array of XDP buffs through sendmsg() · 0efac277
      Jason Wang authored
      This patch implements the TUN_MSG_PTR msg_control type. This type allows
      the caller to pass an array of XDP buffs to tuntap through the ptr field
      of struct tun_msg_ctl. Tap will build skbs from those XDP buffers.
      
      This avoids lots of indirect calls, thus improving icache
      utilization, and allows batched XDP flushing when doing XDP
      redirection.
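      
      On the tap side, the sendmsg() handling is conceptually (a sketch;
      build_skb_from_xdp() is a hypothetical helper name standing in for
      the patch's per-buff skb-building logic):
      
      struct tun_msg_ctl *ctl = m->msg_control;
      
      if (ctl && ctl->type == TUN_MSG_PTR) {
              struct xdp_buff **xdp = ctl->ptr;
              int i;
      
              /* one skb per XDP buff, with no per-packet sendmsg() */
              for (i = 0; i < n; i++)   /* n: batch length from the caller */
                      build_skb_from_xdp(q, xdp[i]);
              return total_len;
      }
      /* otherwise fall back to the single-packet iov_iter path */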
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tuntap: accept an array of XDP buffs through sendmsg() · 043d222f
      Jason Wang authored
      This patch implements the TUN_MSG_PTR msg_control type. This type allows
      the caller to pass an array of XDP buffs to tuntap through the ptr field
      of struct tun_msg_ctl. If an XDP program is attached, tuntap can run the
      XDP program directly. If not, tuntap will build an skb and do a fast
      receive, since part of the work has already been done by vhost_net.
      
      This avoids lots of indirect calls, thus improving icache
      utilization, and allows batched XDP flushing when doing XDP
      redirection.
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tun: switch to new type of msg_control · fe8dd45b
      Jason Wang authored
      This patch introduces a new tun/tap specific msg_control:
      
      #define TUN_MSG_UBUF 1
      #define TUN_MSG_PTR  2
      struct tun_msg_ctl {
             int type;
             void *ptr;
      };
      
      This allows us to pass different kinds of msg_control through
      sendmsg(). The first supported type is ubuf (TUN_MSG_UBUF), which will
      be used by the existing vhost_net zerocopy code. The second is the XDP
      buff, which allows vhost_net to pass XDP buffs to TUN. This will be
      used to implement accepting an array of XDP buffs from vhost_net in
      the following patches.
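      
      For the existing zerocopy path, callers now wrap their ubuf_info in
      the new control structure instead of passing it raw (a sketch of the
      caller side under that assumption):
      
      struct tun_msg_ctl ctl = {
              .type = TUN_MSG_UBUF,
              .ptr = ubuf,         /* the vhost_net zerocopy ubuf_info */
      };
      
      msg.msg_control = &ctl;
      msg.msg_controllen = sizeof(ctl);
      sock->ops->sendmsg(sock, &msg, len);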
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tuntap: move XDP flushing out of tun_do_xdp() · 1a097910
      Jason Wang authored
      This will allow adding batch flushing on top.
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tuntap: split out XDP logic · 8ae1aff0
      Jason Wang authored
      This patch splits the XDP logic out into a single function, so that it
      can be reused by the XDP batching path in a following patch.
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tuntap: tweak on the path of skb XDP case in tun_build_skb() · ac1f1f6c
      Jason Wang authored
      If we're sure we won't go through native XDP, there's no need for several
      things like the bh and rcu handling. So this patch introduces a helper to
      build the skb and hold the page refcount; when we find we will go through
      the skb path, we build the skb directly.
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tuntap: simplify error handling in tun_build_skb() · f7053b6c
      Jason Wang authored
      There's no need to duplicate the page-get logic in each action. So this
      patch gets the page and calculates the offset before processing the XDP
      actions (except for XDP_DROP), and undoes that when an error is met (we
      don't care about performance on error paths). This will be used for
      factoring out the XDP logic.
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tuntap: enable bh early during processing XDP · 291aeb2b
      Jason Wang authored
      This patch moves the bh enabling a little bit earlier; this will be
      used for factoring out the core XDP logic of tuntap.
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • 4f23aff8
      Jason Wang authored
    • net: sock: introduce SOCK_XDP · e4a2a304
      Jason Wang authored
      This patch introduces a new sock flag - SOCK_XDP. This will be used
      to notify the upper layer that an XDP program is attached to the
      lower socket and that extra headroom is required.
      
      TUN will be the first user.
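      
      A sketch of both halves (tfile is tun's per-queue file structure;
      the exact headroom handling is simplified):
      
      /* tun: flag the socket when an XDP program is attached/detached */
      if (prog)
              sock_set_flag(&tfile->sk, SOCK_XDP);
      else
              sock_reset_flag(&tfile->sk, SOCK_XDP);
      
      /* upper layer (e.g. vhost_net): reserve extra headroom before
       * building the buffer
       */
      if (sock_flag(sock->sk, SOCK_XDP))
              pad += XDP_PACKET_HEADROOM;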
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • llc: avoid blocking in llc_sap_close() · 9708d2b5
      Cong Wang authored
      llc_sap_close() is called by llc_sap_put(), which
      can be called in BH context from llc_rcv(). We can't
      block in BH.
      
      There is no reason to block here; kfree_rcu() is
      sufficient.
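      
      The non-blocking pattern looks like this (a sketch; it assumes
      struct llc_sap carries an rcu_head member, here called rcu, for
      this purpose):
      
      /* before: waits for a grace period, which may block in BH context */
      synchronize_rcu();
      kfree(sap);
      
      /* after: free after the grace period without blocking the caller */
      kfree_rcu(sap, rcu);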
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Add sockopt IPV6_MULTICAST_ALL analogue to IP_MULTICAST_ALL · 15033f04
      Andre Naujoks authored
      The socket option will be enabled by default to ensure current behaviour
      is not changed. This is the same as for the IPv4 version.
      
      A socket bound to in6addr_any and a specific port will receive all traffic
      on that port. Analogous to IP_MULTICAST_ALL, disabling this option makes a
      socket that has joined one or more multicast groups only receive multicast
      traffic from the groups it explicitly joined via this socket.
      
      Without this option disabled, a socket (or even the whole system) joined
      to multiple multicast groups is very hard to get right: filtering by
      destination address has to take place in user space to avoid receiving
      multicast traffic from other multicast groups that might have traffic on
      the same port.
      
      Extending the IP_MULTICAST_ALL socket option to also apply to IPv6 was
      deliberately not done, to avoid changing the behaviour of current
      applications.
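      
      Usage mirrors the IPv4 option (an illustrative userspace snippet):
      
      #include <netinet/in.h>
      #include <sys/socket.h>
      
      int off = 0;
      
      /* only deliver multicast traffic for groups this socket explicitly
       * joined, instead of everything arriving on the bound port
       */
      setsockopt(fd, IPPROTO_IPV6, IPV6_MULTICAST_ALL, &off, sizeof(off));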
      Signed-off-by: Andre Naujoks <nautsch2@gmail.com>
      Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'Lantiq-Intel-vrx200-support' · d03790f5
      David S. Miller authored
      Hauke Mehrtens says:
      
      ====================
      Add support for Lantiq / Intel vrx200 network
      
      This adds basic support for the GSWIP (Gigabit Switch) found in the
      VRX200 SoC.
      There are different versions of this IP core used in different SoCs, but
      this driver has currently only been tested on the VRX200 SoC line; for
      other SoCs it probably needs some adaptations to work.
      
      I also plan to add Layer 2 offloading to the DSA driver, and later also
      Layer 3 offloading, which is supported by the PPE HW block.
      
      All these patches should go through the net-next tree.
      
      This depends on the patch "MIPS: lantiq: dma: add dev pointer" which
      should go into 4.19.
      
      Changes since:
      v2:
       * Send patch "MIPS: lantiq: dma: add dev pointer" separately
       * all: removed return in register write functions
       * switch: uses phylink
       * switch: uses hardware MDIO auto polling
       * switch: use usleep_range() in MDIO busy check
       * switch: configure MDIO bus to 2.5 MHz
       * switch: disable xMII link when it is not used
       * Ethernet: use NAPI for TX cleanups
       * Ethernet: enable clock in open callback
       * Ethernet: improve skb allocation
       * Ethernet: use net_dev->stats
      
      v1:
       * Add "MIPS: lantiq: dma: add dev pointer"
       * checkpatch fixes in all patches
       * Added binding documentation
       * use readx_poll_timeout function and ETIMEOUT error code
       * integrate GPHY firmware loading into DSA driver
       * renamed to NET_DSA_LANTIQ_GSWIP
       * removed some unneeded casts
       * added of_device_id.data information about the detected switch
       * fixed John's email address
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: Add Lantiq / Intel DSA driver for vrx200 · 14fceff4
      Hauke Mehrtens authored
      This adds the DSA driver for the GSWIP switch found in the VRX200 SoC.
      This switch is integrated in the DSL SoC, which uses GSWIP version 2.1;
      there are other SoCs using different versions of this IP block, but this
      driver was only tested with the version found in the VRX200.
      Currently only the basic features are implemented, which forward all
      packets to the CPU and let the CPU do the forwarding. The hardware also
      supports Layer 2 offloading, which is not yet implemented in this driver.
      
      The GPHY firmware loading is now done by this driver and no longer by the
      separate driver in drivers/soc/lantiq/gphy.c; I will remove that driver
      in a separate patch. To make use of the GPHYs, this switch driver is
      needed anyway. Other SoCs have more embedded GPHYs, so this driver should
      support a variable number of GPHYs. After the firmware is loaded, the
      GPHY can be probed on the MDIO bus and behaves like an external GPHY;
      without the firmware it cannot be probed on the MDIO bus.
      
      The clock names in the sysctrl.c file had to be changed because the
      clocks are now used by a different driver. This should be cleaned up,
      and a real common clock driver should provide the clocks instead.
      Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>