1. 17 Dec, 2018 5 commits
  2. 16 Dec, 2018 28 commits
  3. 15 Dec, 2018 7 commits
    • Merge tag 'mlx5e-updates-2018-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 63de273f
      David S. Miller authored
      Saeed Mahameed says:
      
      ====================
      mlx5e-updates-2018-12-14 (VF Lag)
      
      From Aviv Heller,
      
      Subsequent patches introduce VF LAG, which provides load-balancing and
      high-availability capabilities for VFs associated with different
      physical ports of the same Connect-X card.
      
      This series consists of the following:
       - mlx5 devcom, driver infrastructure that facilitates operations that involve
         both core devices (physical functions) of the same card, to synchronize and
         communicate between two driver instances of the same card.
       - Infrastructure for TC rule duplication.
       - Changes to LAG logic to enable its use when SR-IOV is enabled
       - Only PFs in switchdev mode are currently supported.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-mitigate-retpoline-overhead' · bedf3b33
      David S. Miller authored
      Paolo Abeni says:
      
      ====================
      net: mitigate retpoline overhead
      
      The Spectre v2 countermeasures, aka retpolines, are a source of measurable
      overhead[1]. We can partially address that when the function pointer refers to
      a builtin symbol, by testing the pointer against a list of well-known builtin
      functions and calling the matching one directly.
      
      Experimental results show that replacing a single indirect call via
      retpoline with several branches and a direct call gives performance gains
      even when multiple branches are added - 5 or more, as reported in [2].
      
      This may lead to some uglification around the indirect calls. In netconf 2018
      Eric Dumazet described a technique to hide the most relevant part of the needed
      boilerplate with some macro help.
      
      This series is a [re-]implementation of that idea, exposing the introduced
      helpers in a new header file. They are later leveraged to avoid the indirect
      call overhead in the GRO path, when possible.
      
      Overall this gives a > 10% performance improvement for the UDP GRO benchmark
      and a smaller but measurable one for TCP SYN flood.
      
      The added infra can be used in follow-up patches to cope with retpoline overhead
      in other points of the networking stack (e.g. at the qdisc layer) and possibly
      even in other subsystems.
      
      v2  -> v3:
       - fix build error with CONFIG_IPV6=m
      
      v1  -> v2:
       - list explicitly the builtin function names in INDIRECT_CALL_*(),
         as suggested by Ed Cree
       - expand the recipients list
      
      rfc -> v1:
       - use branch prediction hints, as suggested by Eric
      
      [1] http://vger.kernel.org/netconf2018_files/PaoloAbeni_netconf2018.pdf
      [2] https://linuxplumbersconf.org/event/2/contributions/99/attachments/98/117/lpc18_paper_af_xdp_perf-v2.pdf
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • udp: use indirect call wrappers for GRO socket lookup · 4f24ed77
      Paolo Abeni authored
      This avoids another indirect call for UDP GRO. Again, the test
      for the IPv6 variant is performed first.
      
      v1 -> v2:
       - adapted to INDIRECT_CALL_ changes
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: use indirect call wrappers at GRO transport layer · 028e0a47
      Paolo Abeni authored
      This avoids an indirect call in the receive path for TCP and UDP
      packets. TCP takes precedence over UDP, so that we have a single
      additional conditional in the common case.
      
      When IPv6 is built as a module, all GRO symbols except the UDPv6 ones
      are builtin, while the latter belong to the ipv6 module, so we
      need some special care.
      
      v1 -> v2:
       - adapted to INDIRECT_CALL_ changes
      v2 -> v3:
       - fix build issue with CONFIG_IPV6=m
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
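
The ordering and the module caveat described above can be illustrated with a small userspace sketch. This is not the patch itself: `UDP6_BUILTIN`, `CALL_L4`, `l4_gro_receive` and the two handlers are hypothetical names chosen for illustration, and the wrapper macros are simplified stand-ins that rely on GNU C statement expressions. The point is only that the TCP handler is tested first, and that the UDPv6 symbol is not referenced at all when it would live in a module.

```c
/* Userspace stand-in for the kernel's branch prediction hint. */
#define likely(x) __builtin_expect(!!(x), 1)

/* Simplified wrappers: compare against known functions, then call directly. */
#define INDIRECT_CALL_1(f, f1, ...) \
	({ likely((f) == (f1)) ? f1(__VA_ARGS__) : (f)(__VA_ARGS__); })
#define INDIRECT_CALL_2(f, f2, f1, ...) \
	({ likely((f) == (f2)) ? f2(__VA_ARGS__) : \
	   INDIRECT_CALL_1(f, f1, __VA_ARGS__); })

/*
 * Hypothetical switch (assumption): 1 if the UDPv6 handler is linked into
 * the same image as the caller, 0 if it lives in a separate module.
 */
#ifndef UDP6_BUILTIN
#define UDP6_BUILTIN 0
#endif

#if UDP6_BUILTIN
/* Both handlers are visible: test TCP first, then UDP. */
#define CALL_L4(f, f2, f1, ...) INDIRECT_CALL_2(f, f2, f1, __VA_ARGS__)
#else
/* Only the TCP handler is tested; f1 is never expanded or referenced. */
#define CALL_L4(f, f2, f1, ...) INDIRECT_CALL_1(f, f2, __VA_ARGS__)
#endif

typedef int (*gro_cb_t)(int len);

/* Hypothetical handlers; each returns its IP protocol number as a tag. */
int tcp6_gro_stub(int len) { (void)len; return 6; }
int udp6_gro_stub(int len) { (void)len; return 17; }

/* TCP is listed first, so the common case costs one extra compare. */
int l4_gro_receive(gro_cb_t cb, int len)
{
	return CALL_L4(cb, tcp6_gro_stub, udp6_gro_stub, len);
}
```

With `UDP6_BUILTIN` set to 0, a UDP callback still works, it just goes through the ordinary indirect call in the fallback branch.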
    • net: use indirect call wrappers at GRO network layer · aaa5d90b
      Paolo Abeni authored
      This avoids an indirect call in the L3 GRO receive path, both
      for IPv4 and IPv6, if the latter is not compiled as a module.
      
      Note that when IPv6 is compiled as builtin, it will be checked first,
      so we have a single additional compare for the more common path.
      
      v1 -> v2:
       - adapted to INDIRECT_CALL_ changes
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • indirect call wrappers: helpers to speed-up indirect calls of builtin · 283c16a2
      Paolo Abeni authored
      This header defines a set of helpers that avoid the retpoline
      overhead when calling builtin functions via function pointers.
      It boils down to explicitly comparing the function pointer to
      known builtin functions and, on a match, invoking the latter directly.
      
      The macros defined here implement the boilerplate for the above scheme
      and will be used by the next patches.
      
      rfc -> v1:
       - use branch prediction hint, as suggested by Eric
      v1  -> v2:
       - list explicitly the builtin function names in INDIRECT_CALL_*(),
         as suggested by Ed Cree
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
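
The compare-then-call-directly scheme described in this commit can be sketched in userspace C as follows. The macro names follow the `INDIRECT_CALL_*()` naming mentioned in the changelog, but the bodies here are an illustrative approximation rather than a copy of the header, and `handler_t`, `handle_a`, `handle_b`, `handle_other` and `dispatch` are hypothetical. The sketch relies on GNU C statement expressions and `__builtin_expect` (gcc or clang).

```c
/* Userspace stand-in for the kernel's branch prediction hint. */
#define likely(x) __builtin_expect(!!(x), 1)

/*
 * If the pointer equals a known function, call that function by name so the
 * compiler emits a direct call; otherwise fall back to the indirect call.
 */
#define INDIRECT_CALL_1(f, f1, ...) \
	({ likely((f) == (f1)) ? f1(__VA_ARGS__) : (f)(__VA_ARGS__); })

/* Same idea with two candidates; f2 is tested before f1. */
#define INDIRECT_CALL_2(f, f2, f1, ...) \
	({ likely((f) == (f2)) ? f2(__VA_ARGS__) : \
	   INDIRECT_CALL_1(f, f1, __VA_ARGS__); })

typedef int (*handler_t)(int);

/* Hypothetical handlers standing in for builtin protocol callbacks. */
int handle_a(int x) { return x + 1; }
int handle_b(int x) { return x * 2; }
int handle_other(int x) { return -x; }

/* The candidate functions are listed explicitly at the call site. */
int dispatch(handler_t h, int arg)
{
	return INDIRECT_CALL_2(h, handle_b, handle_a, arg);
}
```

A pointer that matches neither candidate, such as `handle_other`, still produces the right result through the fallback branch; only that path pays for the indirect call.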
    • net: socionext: remove mmio reads on Tx · 35e07d23
      Ilias Apalodimas authored
      Currently the driver issues two MMIO reads to figure out the number of
      transmitted packets and clean them. We can get rid of the expensive
      reads since bit 31 of the Tx descriptor can be used for that.
      We can also remove the budget counting of Tx completions since all of
      the descriptors are not deliberately processed.
      
      Performance numbers using pktgen are:
      size  pre-patch(pps)  post-patch(pps)
      64       362483           427916
      128      358315           411686
      256      352725           389683
      512      215675           216464
      1024     113812           114442
      Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
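
The idea of reclaiming Tx descriptors by inspecting a flag in the descriptor itself, rather than reading device registers, might look roughly like the sketch below. The structures, names and ring size are hypothetical and not taken from the driver. The polarity of bit 31 (set while the hardware still owns the descriptor) is an assumption made for this example, and the loop deliberately carries no budget counter.

```c
#include <stdint.h>

#define TX_DESC_NUM 8

/* Assumption: bit 31 is set while the hardware still owns the descriptor. */
#define TX_OWN_BIT (1U << 31)

/* Hypothetical descriptor: only the attribute word matters here. */
struct tx_desc {
	uint32_t attr;
};

/* Hypothetical ring state kept in ordinary memory. */
struct tx_ring {
	struct tx_desc desc[TX_DESC_NUM];
	unsigned int tail;    /* oldest descriptor not yet reclaimed */
	unsigned int pending; /* descriptors queued and not yet reclaimed */
};

/*
 * Reclaim completed descriptors by checking the ownership bit in memory.
 * No device register is read. Returns the number of descriptors reclaimed.
 */
unsigned int tx_clean(struct tx_ring *ring)
{
	unsigned int cleaned = 0;

	while (ring->pending > 0 &&
	       !(ring->desc[ring->tail].attr & TX_OWN_BIT)) {
		/* A real driver would unmap and free the packet buffer here. */
		ring->tail = (ring->tail + 1) % TX_DESC_NUM;
		ring->pending--;
		cleaned++;
	}
	return cleaned;
}
```

The loop stops at the first descriptor the hardware still owns, so completions are reclaimed in order and later descriptors are revisited on the next call.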