1. 14 Jan, 2015 17 commits
    • openvswitch: Introduce ovs_tunnel_route_lookup · 3f4c1d87
      Fan Du authored
      Introduce ovs_tunnel_route_lookup to consolidate route lookup
      shared by vxlan, gre, and geneve ports.
      Signed-off-by: Fan Du <fan.du@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'vxlan_rco' · 27331353
      David S. Miller authored
      Tom Herbert says:
      
      ====================
      net: Remote checksum offload for VXLAN
      
      This patch set adds support for remote checksum offload in VXLAN.
      
      The remote checksum offload is generalized by creating a common
      function (remcsum_adjust) that does the work of modifying the
      checksum in remote checksum offload. This function can be called
      from normal or GRO path. GUE was modified to use this function.
      
      To support RCO in VXLAN we use the 9th bit in the reserved
      flags to indicate remote checksum offload. The start and offset
      values are encoded in a compressed form in the low order (reserved)
      byte of the vni field.
      
      Remote checksum offload is described in
      https://tools.ietf.org/html/draft-herbert-remotecsumoffload-01
      
      Changes in v2:
        - Add udp_offload_callbacks which has GRO functions that take a
          udp_offload pointer argument. This argument can be used to retrieve
          a per port structure of the encapsulation for use in gro processing
          (mostly by doing container_of on the structure).
        - Use the 10th bit in VXLAN flags for RCO which does not seem to
          conflict with other proposals at this time (ie. VXLAN-GPE and
          VXLAN-GPB)
        - Require that RCO must be explicitly enabled on the receiver
          as well as the sender.
      
      Tested by running 200 TCP_STREAM connections with VXLAN (over IPv4).
      
      With UDP checksums and Remote Checksum Offload
        IPv4
            Client
              11.84% CPU utilization
            Server
              12.96% CPU utilization
            9197 Mbps
        IPv6
            Client
              12.46% CPU utilization
            Server
              14.48% CPU utilization
            8963 Mbps
      
      With UDP checksums, no remote checksum offload
        IPv4
            Client
              15.67% CPU utilization
            Server
              14.83% CPU utilization
            9094 Mbps
        IPv6
            Client
              16.21% CPU utilization
            Server
              14.32% CPU utilization
            9058 Mbps
      
      No UDP checksums
        IPv4
            Client
              15.03% CPU utilization
            Server
              23.09% CPU utilization
            9089 Mbps
        IPv6
            Client
              16.18% CPU utilization
            Server
              26.57% CPU utilization
            8954 Mbps
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vxlan: Remote checksum offload · dfd8645e
      Tom Herbert authored
      Add support for remote checksum offload in VXLAN. This uses a
      reserved bit to indicate that RCO is being done, and uses the low order
      reserved eight bits of the VNI to hold the start and offset values in a
      compressed manner.
      
      Start is encoded in the low order seven bits of VNI. This is start >> 1
      so that the checksum start offset is 0-254 using even values only.
      Checksum offset (transport checksum field) is indicated in the high
      order bit in the low order byte of the VNI. If the bit is set, the
      checksum field is for UDP (so offset = start + 6), else checksum
      field is for TCP (so offset = start + 16). Only TCP and UDP are
      supported in this implementation.
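
      The encoding described above can be sketched in plain C. This is a
      minimal userspace illustration, not the kernel code: the names
      rco_encode, rco_decode, struct rco_params and the RCO_* macros are
      hypothetical and only model the low order byte of the VNI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; they model only the low order byte of the VNI. */
#define RCO_START_MASK 0x7f /* low seven bits hold start >> 1 */
#define RCO_UDP_FLAG   0x80 /* high bit set: checksum field is UDP's */

struct rco_params {
    uint16_t start;  /* checksum start offset, even, 0-254 */
    uint16_t offset; /* position of the transport checksum field */
};

/* Pack an even start offset and the TCP/UDP selector into one byte. */
static uint8_t rco_encode(uint16_t start, bool is_udp)
{
    assert(start <= 254 && (start & 1) == 0);

    uint8_t b = (uint8_t)((start >> 1) & RCO_START_MASK);
    if (is_udp)
        b |= RCO_UDP_FLAG;
    return b;
}

/* Recover start and the checksum field offset from the encoded byte. */
static struct rco_params rco_decode(uint8_t b)
{
    struct rco_params p;

    p.start = (uint16_t)((b & RCO_START_MASK) << 1);
    /* UDP checksum sits 6 bytes into the header, TCP checksum 16 bytes. */
    p.offset = (uint16_t)(p.start + ((b & RCO_UDP_FLAG) ? 6 : 16));
    return p;
}
```

      For example, a UDP inner header starting at byte 20 would encode to
      0x8a under this sketch and decode back to start 20, offset 26.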
      
      Remote checksum offload for VXLAN is described in:
      
      https://tools.ietf.org/html/draft-herbert-vxlan-rco-00
      
      Tested by running 200 TCP_STREAM connections with VXLAN (over IPv4).
      
      With UDP checksums and Remote Checksum Offload
        IPv4
            Client
              11.84% CPU utilization
            Server
              12.96% CPU utilization
            9197 Mbps
        IPv6
            Client
              12.46% CPU utilization
            Server
              14.48% CPU utilization
            8963 Mbps
      
      With UDP checksums, no remote checksum offload
        IPv4
            Client
              15.67% CPU utilization
            Server
              14.83% CPU utilization
            9094 Mbps
        IPv6
            Client
              16.21% CPU utilization
            Server
              14.32% CPU utilization
            9058 Mbps
      
      No UDP checksums
        IPv4
            Client
              15.03% CPU utilization
            Server
              23.09% CPU utilization
            9089 Mbps
        IPv6
            Client
              16.18% CPU utilization
            Server
              26.57% CPU utilization
            8954 Mbps
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • udp: pass udp_offload struct to UDP gro callbacks · a2b12f3c
      Tom Herbert authored
      This patch introduces udp_offload_callbacks which has the same
      GRO functions (but not a GSO function) as offload_callbacks,
      except there is an argument to a udp_offload struct passed to
      gro_receive and gro_complete functions. This additional argument
      can be used to retrieve the per port structure of the encapsulation
      for use in gro processing (mostly by doing container_of on the
      structure).
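
      The container_of pattern mentioned above can be illustrated with a
      small userspace sketch. The struct names and fields here
      (udp_offload_sketch, tunnel_port_sketch, port_from_offload) are
      hypothetical stand-ins, not the kernel's definitions; the macro is a
      simplified form of the kernel's container_of.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace version of the kernel's container_of macro. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for the struct handed to the GRO callbacks. */
struct udp_offload_sketch {
    unsigned short port;
};

/* Hypothetical per port encapsulation state embedding the offload struct. */
struct tunnel_port_sketch {
    int flags;
    struct udp_offload_sketch offload;
};

/* Given the embedded member, step back to the enclosing per port struct. */
static struct tunnel_port_sketch *
port_from_offload(struct udp_offload_sketch *uoff)
{
    assert(uoff != NULL);
    return container_of(uoff, struct tunnel_port_sketch, offload);
}
```

      A callback that receives only the embedded offload pointer could use
      a helper like this to reach per port flags without a separate lookup.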
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8152: replace tasklet with NAPI · d823ab68
      hayeswang authored
      Replace tasklet with NAPI.
      
      Add rx_queue to queue the remaining rx packets if the number of the
      rx packets is more than the request from poll().
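
      The budget handling described above might look roughly like the
      following userspace model. It is only a sketch: rx_queue_sketch,
      rxq_push and poll_sketch are hypothetical names, packets are modeled
      as plain ints, and nothing here reflects the driver's actual data
      structures.

```c
#include <assert.h>
#include <stddef.h>

#define RXQ_CAP 64 /* hypothetical queue capacity */

/* Ring buffer standing in for the queue of pending rx packets. */
struct rx_queue_sketch {
    int pkts[RXQ_CAP];
    size_t head;
    size_t count;
};

/* Queue one packet; returns -1 if the queue is full. */
static int rxq_push(struct rx_queue_sketch *q, int pkt)
{
    if (q->count == RXQ_CAP)
        return -1;
    q->pkts[(q->head + q->count) % RXQ_CAP] = pkt;
    q->count++;
    return 0;
}

/*
 * Deliver at most `budget` packets into `delivered`. Anything beyond the
 * budget stays queued for the next poll. Returns the work done.
 */
static int poll_sketch(struct rx_queue_sketch *q, int budget, int *delivered)
{
    int work_done = 0;

    assert(budget >= 0);
    while (work_done < budget && q->count > 0) {
        delivered[work_done++] = q->pkts[q->head];
        q->head = (q->head + 1) % RXQ_CAP;
        q->count--;
    }
    return work_done;
}
```

      With five packets queued and a budget of three, the first call
      delivers three packets and the remaining two are picked up by the
      next call.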
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'hip04' · 237de6ef
      David S. Miller authored
      Ding Tianhong says:
      
      ====================
      add hisilicon hip04 ethernet driver
      
      v13:
      - Fix the parameter alignment problem for functions and the checkpatch warning.
      
      v12:
      - According to Alex's suggestion, modify the changelog and add MODULE_DEVICE_TABLE
        for hip04 ethernet.
      
      v11:
      - Add ethtool support for getting and setting tx coalesce. xmit_more
        is not supported in this patch, but I think it could work for hip04;
        I will add it later after some performance tests.
      
        Here are some performance test results from ping and iperf (with tx_coalesce_frames/usecs added);
        performance and latency both look better with tx_coalesce_frames/usecs.
      
        - Before:
          $ ping 192.168.1.1 ...
          === 192.168.1.1 ping statistics ===
          24 packets transmitted, 24 received, 0% packet loss, time 22999ms
          rtt min/avg/max/mdev = 0.180/0.202/0.403/0.043 ms
      
          $ iperf -c 192.168.1.1 ...
          [ ID] Interval       Transfer     Bandwidth
          [  3]  0.0- 1.0 sec   115 MBytes   945 Mbits/sec
      
        - After:
          $ ping 192.168.1.1 ...
          === 192.168.1.1 ping statistics ===
          24 packets transmitted, 24 received, 0% packet loss, time 22999ms
          rtt min/avg/max/mdev = 0.178/0.190/0.380/0.041 ms
      
          $ iperf -c 192.168.1.1 ...
          [ ID] Interval       Transfer     Bandwidth
          [  3]  0.0- 1.0 sec   115 MBytes   965 Mbits/sec
      
      v10:
      - According to Arnd's suggestion, remove the skb_orphan and use the hrtimer
        for the cleanup of the TX queue and make some modifications to the hip04
        drivers.
        1) drop the broken skb_orphan call
        2) drop the workqueue
        3) batch cleanup based on tx_coalesce_frames/usecs for better throughput
        4) use a reasonable default tx timeout (200us, could be shortened
           based on measurements) with a range timer
        5) fix napi poll function return value
        6) use a lockless queue for cleanup
      
      v9:
      - There are no tx completion interrupts to free DMAed Tx packets, which means that
        we rely on new tx packets arriving to run the destructors of completed packets,
        which opens up space in their sockets' send queues. Sometimes no such new
        packets arrive, causing Tx to stall; a single UDP transmitter is a good example of
        this situation. So we need a cleanup workqueue to reclaim completed packets;
        the workqueue only frees the last packets, which have already been waiting for several jiffies.
        Also fix some formatting.
      
      v8:
      - Use poll to reclaim xmitted buffer as workaround since no tx done interrupt
      
      v7:
      - Remove select NET_CORE in 0002
      
      v6:
      - Suggested by Russell: Use netdev_sent_queue & netdev_completed_queue to solve latency issue
        Also shorten the period of the timer, which is used to wake up the queue since there is no
        tx completed interrupt.
      
      v5:
      - no big change, fix typo
      
      v4:
      - Modify according to the suggestions from Arnd, Florian, Eric, David
        Use of_parse_phandle_with_fixed_args & syscon_node_to_regmap get ppe info
        Add skb_orphan() and tx_timer for reclaim since no tx_finished interrupt
        Update timeout, and move of_phy_connect to probe to reuse open/stop
      
      v3:
      - Suggestion from Arnd: use syscon & regmap_write/read to replace static void __iomem *ppebase.
        Modify hisilicon-hip04-net.txt according to suggestions from Florian and Sergei.
      
      v2:
      - Got many suggestions from Russell, Arnd, Florian, Mark and Sergei
        Remove memcpy, use dma_map/unmap_single, use dma_alloc_coherent rather than dma_pool, etc.
        Refer property in ethernet.txt, change ppe description, etc.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hisilicon: new hip04 ethernet driver · a41ea46a
      dingtianhong authored
      Support Hisilicon hip04 ethernet driver, including 100M / 1000M controller.
      The controller has no tx done interrupt, reclaim xmitted buffer in the poll.
      
      v13: Fix the parameter alignment problem for functions and the checkpatch warning.
      
      v12: According to Alex's suggestion, modify the changelog and add MODULE_DEVICE_TABLE
           for hip04 ethernet.
      
      v11: Add ethtool support for getting and setting tx coalesce. xmit_more
           is not supported in this patch, but I think it could work for hip04;
           I will add it later after some performance tests.
      
           Here are some performance test results from ping and iperf (with tx_coalesce_frames/usecs added);
           performance and latency both look better with tx_coalesce_frames/usecs.
      
           - Before:
           $ ping 192.168.1.1 ...
           === 192.168.1.1 ping statistics ===
           24 packets transmitted, 24 received, 0% packet loss, time 22999ms
           rtt min/avg/max/mdev = 0.180/0.202/0.403/0.043 ms
      
           $ iperf -c 192.168.1.1 ...
           [ ID] Interval       Transfer     Bandwidth
           [  3]  0.0- 1.0 sec   115 MBytes   945 Mbits/sec
      
           - After:
           $ ping 192.168.1.1 ...
           === 192.168.1.1 ping statistics ===
           24 packets transmitted, 24 received, 0% packet loss, time 22999ms
           rtt min/avg/max/mdev = 0.178/0.190/0.380/0.041 ms
      
           $ iperf -c 192.168.1.1 ...
           [ ID] Interval       Transfer     Bandwidth
           [  3]  0.0- 1.0 sec   115 MBytes   965 Mbits/sec
      
      v10: According to David Miller's and Arnd Bergmann's suggestions, make some modifications
           to the v9 version
           - drop the workqueue
           - batch cleanup based on tx_coalesce_frames/usecs for better throughput
           - use a reasonable default tx timeout (200us, could be shortened
             based on measurements) with a range timer
           - fix napi poll function return value
           - use a lockless queue for cleanup
      Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hisilicon: new hip04 MDIO driver · 4a841ee9
      Zhangfei Gao authored
      Hisilicon hip04 platform mdio driver
      Reuse Marvell phy drivers/net/phy/marvell.c
      Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
      Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Documentation: add Device tree bindings for Hisilicon hip04 ethernet · ef80c32d
      Zhangfei Gao authored
      This patch adds the Device Tree bindings for the Hisilicon hip04
      Ethernet controller, including 100M / 1000M controller.
      Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
      Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/macb: improved ethtool statistics support · 3ff13f1c
      Xander Huff authored
      Currently `ethtool -S` simply returns "no stats available". It
      would be more useful to see what the various ethtool statistics
      registers' values are. This change implements get_ethtool_stats,
      get_strings, and get_sset_count functions to accomplish this.
      
      Read all GEM statistics registers and sum them into
      macb.ethtool_stats. Add the necessary infrastructure to make this
      accessible via `ethtool -S`.
      
      Update gem_update_stats to utilize ethtool_stats.
      Signed-off-by: Xander Huff <xander.huff@ni.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/macb: Adding comments to various #defs to make interpretation easier · 5c2fa0f6
      Xander Huff authored
      This change is to help improve at-a-glance knowledge of the purpose of the
      various Cadence MACB/GEM registers. Comments are more helpful for human
      readability than short acronyms.
      
      Describe the various #define variables for Cadence MACB/GEM registers as documented
      in Xilinx's "Zynq-7000 All Programmable SoC Technical Reference Manual, v1.9.1
      (UG-585)"
      Signed-off-by: Xander Huff <xander.huff@ni.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'xen-netfront-next' · 6a38cc2b
      David S. Miller authored
      David Vrabel says:
      
      ====================
      xen-netfront: refactor making Tx requests
      
      As netfront has evolved to handle different sorts of skbs, the code to
      fill a Tx request has been copied and pasted several times.  The series
      refactors this and a few other areas.
      
      The first patch is to a Xen header but this can be merged via
      net-next.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xen-netfront: refactor making Tx requests · a55e8bb8
      David Vrabel authored
      Eliminate all the duplicate code for making Tx requests by
      consolidating them into a single xennet_make_one_txreq() function.
      
      xennet_make_one_txreq() and xennet_make_txreqs() work with pages and
      offsets so it will be easier to make netfront handle highmem frags in
      the future.
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xen-netfront: refactor skb slot counting · e84448d5
      David Vrabel authored
      A function to count the number of slots an skb needs is more useful
      than one that counts the slots needed for only the frags.
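
      One way to picture a whole-skb slot count is the userspace sketch
      below. It assumes one slot per page touched by each buffer, which is
      an assumption made for illustration; frag_sketch, pages_spanned and
      count_skb_slots_sketch are hypothetical names, and the 4096-byte page
      size is assumed as well.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE_SKETCH 4096u /* assumed page size */

/* Hypothetical stand-in for one paged fragment of an skb. */
struct frag_sketch {
    size_t offset; /* byte offset of the data within its first page */
    size_t size;   /* length of the fragment in bytes */
};

/* Number of pages touched by `len` bytes starting at `offset`. */
static size_t pages_spanned(size_t offset, size_t len)
{
    size_t in_page = offset % PAGE_SIZE_SKETCH;

    return (in_page + len + PAGE_SIZE_SKETCH - 1) / PAGE_SIZE_SKETCH;
}

/* Count slots for the linear (header) area plus every fragment. */
static size_t count_skb_slots_sketch(size_t head_offset, size_t head_len,
                                     const struct frag_sketch *frags,
                                     size_t nr_frags)
{
    size_t slots = pages_spanned(head_offset, head_len);

    assert(frags != NULL || nr_frags == 0);
    for (size_t i = 0; i < nr_frags; i++)
        slots += pages_spanned(frags[i].offset, frags[i].size);
    return slots;
}
```

      Counting the linear area together with the frags gives the caller a
      single number to compare against the free ring space.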
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xen: add page_to_mfn() · 28e98c2c
      David Vrabel authored
      pfn_to_mfn(page_to_pfn(p)) is a common use case so add a generic
      helper for it.
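
      The composition can be modeled in userspace as follows. Everything
      here is a stand-in: the *_sketch names, the four-entry page array and
      the lookup table values are invented for illustration and do not
      reflect how Xen stores its physical-to-machine mapping.

```c
#include <assert.h>
#include <stddef.h>

#define NR_PAGES_SKETCH 4

/* Hypothetical stand-in for struct page. */
struct page_sketch {
    int unused;
};

/* Toy page array: a page's index is its pseudo-physical frame number. */
static struct page_sketch mem_map_sketch[NR_PAGES_SKETCH];

/* Toy pfn -> mfn table with made-up machine frame numbers. */
static const unsigned long p2m_sketch[NR_PAGES_SKETCH] = {
    0x1000, 0x2a00, 0x0042, 0x7fff
};

static unsigned long page_to_pfn_sketch(const struct page_sketch *p)
{
    return (unsigned long)(p - mem_map_sketch);
}

static unsigned long pfn_to_mfn_sketch(unsigned long pfn)
{
    assert(pfn < NR_PAGES_SKETCH);
    return p2m_sketch[pfn];
}

/* The helper simply composes the two existing conversions. */
static unsigned long page_to_mfn_sketch(const struct page_sketch *p)
{
    return pfn_to_mfn_sketch(page_to_pfn_sketch(p));
}
```

      Callers that hold a page pointer can then use the single helper
      instead of spelling out both conversions at every call site.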
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rhashtable: Add MAINTAINERS entry · 933685ca
      Thomas Graf authored
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rhashtable: Lower/upper bucket may map to same lock while shrinking · 80ca8c3a
      Thomas Graf authored
      Each per bucket lock covers a configurable number of buckets. While
      shrinking, two buckets in the old table contain entries for a single
      bucket in the new table. We need to lock down both while linking.
      Check if they are protected by different locks to avoid a recursive
      lock.
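
      The check can be sketched with a toy lock model in userspace. This
      assumes the lock for a bucket is selected by masking the bucket index
      with a power-of-two lock mask; that mapping and the names lock_sketch,
      bucket_lock_sketch, lock_bucket_pair and unlock_bucket_pair are
      assumptions for illustration, not the rhashtable implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy non-recursive lock: acquiring it twice trips the assertion. */
struct lock_sketch {
    bool held;
};

static void lock_acquire(struct lock_sketch *l)
{
    assert(!l->held); /* a second acquire would be a recursive lock */
    l->held = true;
}

static void lock_release(struct lock_sketch *l)
{
    assert(l->held);
    l->held = false;
}

/* Assumed mapping: one lock covers every bucket sharing the masked index. */
static struct lock_sketch *bucket_lock_sketch(struct lock_sketch *locks,
                                              size_t locks_mask, size_t bucket)
{
    return &locks[bucket & locks_mask];
}

/* Lock both old buckets; take the second lock only if it is distinct. */
static int lock_bucket_pair(struct lock_sketch *locks, size_t locks_mask,
                            size_t lower, size_t upper)
{
    struct lock_sketch *l1 = bucket_lock_sketch(locks, locks_mask, lower);
    struct lock_sketch *l2 = bucket_lock_sketch(locks, locks_mask, upper);

    lock_acquire(l1);
    if (l1 == l2)
        return 1;
    lock_acquire(l2);
    return 2;
}

static void unlock_bucket_pair(struct lock_sketch *locks, size_t locks_mask,
                               size_t lower, size_t upper)
{
    struct lock_sketch *l1 = bucket_lock_sketch(locks, locks_mask, lower);
    struct lock_sketch *l2 = bucket_lock_sketch(locks, locks_mask, upper);

    if (l1 != l2)
        lock_release(l2);
    lock_release(l1);
}
```

      With four locks (mask 3), old buckets 1 and 5 share a lock and only
      one acquire happens; with eight locks (mask 7) they map to different
      locks and both are taken.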
      
      Fixes: 97defe1e ("rhashtable: Per bucket locks & deferred expansion/shrinking")
      Reported-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: Thomas Graf <tgraf@suug.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 13 Jan, 2015 23 commits