1. 25 Aug, 2014 24 commits
    • ixgbe: support skb->xmit_more in netdev_ops->ndo_start_xmit() · 9c938cdd
      Daniel Borkmann authored
      This implements the deferred tail pointer flush API for the ixgbe
      driver. A similar version was also proposed some time ago by Alexander Duyck.
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Remove ndo_xmit_flush netdev operation, use signalling instead. · 0b725a2c
      David S. Miller authored
      As reported by Jesper Dangaard Brouer, for high packet rates the
      overhead of having another indirect call in the TX path is
      non-trivial.
      
      There is the indirect call itself, and then there is all of the
      reloading of the state to refetch the tail pointer value and
      then write the device register.
      
      Move to a more passive scheme, which requires very light modifications
      to the device drivers.
      
      The signal is a new skb->xmit_more value. If it is non-zero, it means
      that more SKBs are pending to be transmitted on the same queue as the
      current SKB, and therefore the driver may elide the tail pointer
      update.
      
      Right now skb->xmit_more is always zero.
      Signed-off-by: David S. Miller <davem@davemloft.net>
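The xmit_more signalling described in this commit can be illustrated with a small userspace sketch. Everything here is a hypothetical stand-in (`fake_skb`, `fake_tx_ring`, `fake_start_xmit`), not the kernel's real structures or any driver's actual code. It only shows the idea of skipping the tail pointer write while more packets are pending on the same queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel structures; not the real definitions. */
struct fake_skb {
    bool xmit_more;             /* true: more packets follow on this queue */
};

struct fake_tx_ring {
    unsigned int next_to_use;   /* software producer index */
    unsigned int hw_tail;       /* last value written to the "tail register" */
    unsigned int tail_writes;   /* counts simulated MMIO writes */
    unsigned int size;          /* number of descriptors in the ring */
};

/* Simulated MMIO write that tells the device about new descriptors. */
static void fake_write_tail(struct fake_tx_ring *ring)
{
    ring->hw_tail = ring->next_to_use;
    ring->tail_writes++;
}

/* Sketch of a transmit routine: queue the descriptor, and notify the
 * hardware only when no further packets are pending. */
static void fake_start_xmit(const struct fake_skb *skb,
                            struct fake_tx_ring *ring)
{
    assert(ring->size > 0);
    ring->next_to_use = (ring->next_to_use + 1) % ring->size;
    if (!skb->xmit_more)
        fake_write_tail(ring);
}

/* Send a burst of n packets, marking all but the last with xmit_more.
 * Returns the total number of tail writes performed so far. */
static unsigned int fake_send_burst(struct fake_tx_ring *ring, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        struct fake_skb skb = { .xmit_more = (i + 1 < n) };
        fake_start_xmit(&skb, ring);
    }
    return ring->tail_writes;
}
```

With this scheme a burst of four packets costs one simulated register write instead of four. As the commit notes, at this point in the history skb->xmit_more is always zero, so the saving only appears once callers start setting it.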
    • Merge branch 'is_kdump_kernel' · 44a52ffd
      David S. Miller authored
      Amir Vadai says:
      
      ====================
      Make is_kdump_kernel() accessible from modules
      
      I'm re-spinning this patchset. At the beginning it was suggested to use a
      different name for the parameter, but in the end [3] the resolution was to
      leave it as it is in this patch.
      
      Drivers need to know whether they are running in a kdump kernel in order to
      change their memory profile, since the kdump environment is limited by
      available memory. Currently there are drivers that use reset_devices, as
      suggested in [2]. The context of that suggestion, however, was to let a
      driver know when the hardware device needs to be reset, not whether this
      is a kdump environment. We think that is_kdump_kernel() is better suited to
      select between different memory profiles.
      
      The first patch in this patchset exports a needed symbol in order to make
      is_kdump_kernel() accessible from the drivers. The rest of the patches change
      from reset_devices to is_kdump_kernel() in 2 networking drivers.
      
      The idea of this patchset was suggested by Vivek Goyal.
      
      Tested (only build) and applied on top of commit 8fc54f68: ("net: use
      reciprocal_scale() helper")
      
      [1] - ea1c1af1: ("net/mlx4_en: Reduce memory consumption on kdump kernel")
      [2] - https://lkml.org/lkml/2011/1/27/341
      [3] - http://www.spinics.net/lists/netdev/msg291492.html
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
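As a rough illustration of what "selecting between memory profiles" could look like, here is a minimal userspace sketch. The structure, the function name and all numbers are hypothetical and are not taken from bnx2x or mlx4. In a real driver the boolean would come from is_kdump_kernel().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical memory profile for a NIC driver; the fields and the
 * numbers below are illustrative only. */
struct fake_mem_profile {
    unsigned int num_queues;
    unsigned int rx_ring_entries;
};

/* Pick a small profile in a kdump kernel, where memory is scarce,
 * and a larger one otherwise. */
static struct fake_mem_profile fake_pick_profile(bool in_kdump_kernel,
                                                 unsigned int online_cpus)
{
    struct fake_mem_profile p;

    assert(online_cpus > 0);
    if (in_kdump_kernel) {
        p.num_queues = 1;
        p.rx_ring_entries = 128;
    } else {
        p.num_queues = online_cpus;
        p.rx_ring_entries = 1024;
    }
    return p;
}
```

The point of the patchset is that this decision is about available memory, not about whether the hardware needs a reset, which is why a dedicated predicate reads more clearly than reset_devices.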
    • net/bnx2x: Use is_kdump_kernel() to detect kdump kernel · c9931896
      Amir Vadai authored
      Use is_kdump_kernel() to detect kdump kernel, instead of
      reset_devices.
      
      CC: Ariel Elior <ariel.elior@qlogic.com>
      CC: Michal Schmidt <mschmidt@redhat.com>
      Signed-off-by: Amir Vadai <amirv@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx4: Use is_kdump_kernel() to detect kdump kernel · 48ea526a
      Amir Vadai authored
      Use is_kdump_kernel() to detect kdump kernel, instead of reset_devices.
      Signed-off-by: Amir Vadai <amirv@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • crash_dump: Make is_kdump_kernel() accessible from modules · b3292e88
      Amir Vadai authored
      In order to make is_kdump_kernel() accessible from modules,
      elfcorehdr_addr needs to be exported.
      This was rejected in the past [1] because reset_devices was preferred in
      that context (resetting the device in a kdump kernel), but now there are
      some network drivers that need to reduce memory usage when loaded from
      a kdump kernel. In that context, is_kdump_kernel() is the better fit.
      
      [1] - https://lkml.org/lkml/2011/1/27/341
      
      CC: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Amir Vadai <amirv@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
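Why exporting a variable makes a helper usable from modules can be sketched as follows. This is a userspace model under stated assumptions: the sentinel value, the `fake_` names and the exact comparison are hypothetical and only mirror the general shape of an inline predicate that reads a global. The log itself only says that elfcorehdr_addr had to be exported.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed sentinel meaning "no ELF core header address was passed in";
 * hypothetical value for this sketch. */
#define FAKE_ELFCORE_ADDR_MAX (~0ULL)

/* Stand-in for the global variable. In the kernel, a module can only
 * reference such a variable once its symbol is exported. */
static unsigned long long fake_elfcorehdr_addr = FAKE_ELFCORE_ADDR_MAX;

/* An inline helper like this is compiled into every caller, so each
 * caller (including a module) references the variable directly. */
static inline bool fake_is_kdump_kernel(void)
{
    return fake_elfcorehdr_addr != FAKE_ELFCORE_ADDR_MAX;
}

/* Simulate a capture kernel being booted with a core header address. */
static void fake_boot_with_elfcorehdr(unsigned long long addr)
{
    assert(addr != FAKE_ELFCORE_ADDR_MAX);
    fake_elfcorehdr_addr = addr;
}
```

Because the helper is inline, there is no function symbol to export. The only thing a module needs at link time is the variable it reads.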
    • stmmac: simple cleanups · a77e4acc
      Pavel Machek authored
      This adds simple cleanups for stmmac: removing a test we know is always
      true, fixing whitespace, and moving code out of if().
      Signed-off-by: Pavel Machek <pavel@denx.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8152: check code with checkpatch.pl · b209af99
      hayeswang authored
      626: CHECK: Alignment should match open parenthesis
       646: CHECK: Alignment should match open parenthesis
       655: CHECK: Alignment should match open parenthesis
       695: CHECK: Alignment should match open parenthesis
       729: CHECK: Alignment should match open parenthesis
       739: CHECK: Alignment should match open parenthesis
       976: WARNING: externs should be avoided in .c files
       1314: CHECK: Alignment should match open parenthesis
       1358: WARNING: networking block comments don't use an empty /* line, use /* Comment...
       1402: WARNING: networking block comments don't use an empty /* line, use /* Comment...
       1521: CHECK: multiple assignments should be avoided
       1775: CHECK: Alignment should match open parenthesis
       1838: CHECK: multiple assignments should be avoided
       1843: CHECK: multiple assignments should be avoided
       1847: CHECK: multiple assignments should be avoided
       1850: WARNING: Missing a blank line after declarations
       1864: CHECK: Alignment should match open parenthesis
       1872: CHECK: braces {} should be used on all arms of this statement
       1906: CHECK: usleep_range is preferred over udelay
       2865: WARNING: networking block comments don't use an empty /* line, use /* Comment...
       3088: CHECK: Alignment should match open parenthesis
       total: 0 errors, 5 warnings, 16 checks, 3567 lines checked
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'ndo_xmit_flush' · fe88e6dd
      David S. Miller authored
      Basic deferred TX queue flushing infrastructure.
      
      Over time, and most recently at the Networking Workshop during
      Kernel Summit in Chicago, we have discussed the idea of having some
      way to optimize transmits of multiple TX packets at a time.
      
      There are several areas of overhead that could be amortized with such
      schemes.  One has to do with locking and transactional overhead, the
      other has to do with device specific costs.
      
      This patch set here is more aimed at device specific costs.
      
      Typically a device queues up a packet in the TX queue and then has to
      do something to have the device start processing that new entry.
      Sometimes this is composed of doing an MMIO write to a "tail"
      register, and in other cases it can involve something as expensive as
      a hypervisor call.
      
      The basic setup defined here is that when the driver supports deferred
      TX queue flushing, ndo_start_xmit should no longer perform that
      operation.  Instead a new operation, ndo_xmit_flush, should do it.
      
      I have converted IGB and virtio_net as example initial users.  The IGB
      conversion is tested, virtio_net is not but it does compile :-)
      
      All ndo_start_xmit call sites have been abstracted behind a new helper
      called netdev_start_xmit().
      
      This just adds the infrastructure; it does not add any instances of
      doing multiple ndo_start_xmit calls per ndo_xmit_flush invocation.
      Signed-off-by: David S. Miller <davem@davemloft.net>
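The split between queuing and flushing described in this merge can be modelled in userspace as below. The ops table, device structure and helper are hypothetical, trimmed-down stand-ins. The real net_device_ops and netdev_start_xmit() differ, and the exact conditions under which the real helper flushes are not shown in this log.

```c
#include <assert.h>
#include <stddef.h>

struct fake_dev;

/* Hypothetical, trimmed-down ops table; the real one differs. */
struct fake_dev_ops {
    int  (*start_xmit)(struct fake_dev *dev, int pkt);
    void (*xmit_flush)(struct fake_dev *dev);   /* optional */
};

struct fake_dev {
    const struct fake_dev_ops *ops;
    unsigned int queued;   /* descriptors placed in the TX ring */
    unsigned int kicked;   /* descriptors the device was told about */
    unsigned int kicks;    /* number of simulated doorbell writes */
};

/* Driver side: only queue the packet, do not notify the device. */
static int fake_drv_start_xmit(struct fake_dev *dev, int pkt)
{
    (void)pkt;
    dev->queued++;
    return 0;
}

/* Driver side: notify the device about everything queued so far. */
static void fake_drv_xmit_flush(struct fake_dev *dev)
{
    dev->kicked = dev->queued;
    dev->kicks++;
}

static const struct fake_dev_ops fake_ops = {
    .start_xmit = fake_drv_start_xmit,
    .xmit_flush = fake_drv_xmit_flush,
};

/* Helper in the spirit of netdev_start_xmit(): queue, then flush if the
 * driver provides a flush operation. */
static int fake_netdev_start_xmit(struct fake_dev *dev, int pkt)
{
    int rc;

    assert(dev->ops && dev->ops->start_xmit);
    rc = dev->ops->start_xmit(dev, pkt);
    if (rc == 0 && dev->ops->xmit_flush)
        dev->ops->xmit_flush(dev);
    return rc;
}
```

In this sketch every transmit is still followed by one flush, which matches the note that the series adds infrastructure without batching. It also makes the cost visible: the flush is a second indirect call per packet.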
    • c223a078
      David S. Miller authored
    • c1ebf46c
      David S. Miller authored
    • net: Add ops->ndo_xmit_flush() · 4798248e
      David S. Miller authored
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: White-space cleansing : gaps between function and symbol export · 4c83acbc
      Ian Morris authored
      This patch makes no changes to the logic of the code but simply addresses
      coding style issues as detected by checkpatch.
      
      Both objdump and diff -w show no differences.
      
      This patch removes some blank lines between the end of a function
      definition and the EXPORT_SYMBOL_GPL macro in order to prevent
      checkpatch warning that EXPORT_SYMBOL must immediately follow
      a function.
      Signed-off-by: Ian Morris <ipm@chirality.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: White-space cleansing : Structure layouts · cc24beca
      Ian Morris authored
      This patch makes no changes to the logic of the code but simply addresses
      coding style issues as detected by checkpatch.
      
      Both objdump and diff -w show no differences.
      
      This patch addresses structure definitions, specifically it cleanses the brace
      placement and replaces spaces with tabs in a few places.
      Signed-off-by: Ian Morris <ipm@chirality.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: White-space cleansing : Line Layouts · 67ba4152
      Ian Morris authored
      This patch makes no changes to the logic of the code but simply addresses
      coding style issues as detected by checkpatch.
      
      Both objdump and diff -w show no differences.
      
      A number of items are addressed in this patch:
      * Multiple spaces converted to tabs
      * Spaces before tabs removed.
      * Spaces in pointer typing cleansed (char *)foo etc.
      * Remove space after sizeof
      * Ensure spacing around comparators, such as in if statements.
      Signed-off-by: Ian Morris <ipm@chirality.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ec_bhf: remove excessive debug messages · a9b0b2fa
      Darek Marcinkiewicz authored
      This cuts down the amount of debug information emitted by
      the driver.
      Signed-off-by: Dariusz Marcinkiewicz <reksio@newterm.pl>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • random32: improvements to prandom_bytes · a98406e2
      Daniel Borkmann authored
      This patch addresses a couple of minor items, mostly addressing
      prandom_bytes(): 1) prandom_bytes{,_state}() should use size_t
      for length arguments, 2) We can use put_unaligned() when filling
      the array instead of open coding it [ perhaps some archs will
      further benefit from their own arch specific implementation when
      GCC cannot make up for it ], 3) Fix a typo, 4) Better use unsigned
      int as type for getting the arch seed, 5) Make use of
      prandom_u32_max() for timer slack.
      
      Regarding the change to put_unaligned(): callers of prandom_bytes(),
      which internally invokes prandom_bytes_state(), are not affected, as
      they expect the array to be filled randomly and have no control over
      the internal state whatsoever (that's also why we have periodic
      reseeding there, etc.), so they really don't care.
      
      Now for the direct callers of prandom_bytes_state(), which
      are solely located in test cases for MTD devices, that is,
      drivers/mtd/tests/{oobtest.c,pagetest.c,subpagetest.c}:
      
      These tests basically fill a test write-vector through
      prandom_bytes_state() with an a-priori defined seed each time
      and write that to an MTD device. Later on, they set up a read-vector
      and read those blocks back from the device. So in the verification
      phase, the write-vector is set up again [ same seed, and
      prandom_bytes_state() called ], and then memcmp()'ed against the
      read-vector to check if the data is the same.
      
      Akinobu, Lothar and I also tested this patch and it runs through
      the 3 relevant MTD test cases w/o any errors on the nandsim device
      (simulator for MTD devs) for x86_64, ppc64, ARM (i.MX28, i.MX53
      and i.MX6):
      
        # modprobe nandsim first_id_byte=0x20 second_id_byte=0xac \
                           third_id_byte=0x00 fourth_id_byte=0x15
        # modprobe mtd_oobtest dev=0
        # modprobe mtd_pagetest dev=0
        # modprobe mtd_subpagetest dev=0
      
      We also don't have any users depending directly on a particular
      result of the PRNG (except the PRNG self-test itself), and that's
      just fine as it e.g. allowed us easily to do things like upgrading
      from taus88 to taus113.
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Tested-by: Akinobu Mita <akinobu.mita@gmail.com>
      Tested-by: Lothar Waßmann <LW@KARO-electronics.de>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
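The two ideas in this commit message, unaligned-safe stores when filling a byte array from 32-bit outputs and the seed-replay verification used by the MTD tests, can be sketched in userspace as follows. The generator is a plain xorshift32 used as a stand-in, not the kernel's PRNG, and `memcpy()` stands in for put_unaligned(). All `fake_` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in generator state; the seed must be non-zero for xorshift32. */
struct fake_rnd_state {
    uint32_t s;
};

/* xorshift32 step; used only as a deterministic stand-in. */
static uint32_t fake_u32(struct fake_rnd_state *st)
{
    uint32_t x = st->s;

    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    st->s = x;
    return x;
}

/* Fill buf with len pseudo-random bytes. memcpy() makes each 32-bit
 * store safe even when buf is not 4-byte aligned. */
static void fake_bytes_state(struct fake_rnd_state *st, void *buf, size_t len)
{
    uint8_t *p = buf;

    while (len >= sizeof(uint32_t)) {
        uint32_t v = fake_u32(st);

        memcpy(p, &v, sizeof(v));
        p += sizeof(v);
        len -= sizeof(v);
    }
    if (len > 0) {
        uint32_t v = fake_u32(st);

        memcpy(p, &v, len);     /* remaining 1 to 3 bytes */
    }
}

/* Seed-replay check in the style described above: regenerate the
 * write-vector from the same seed and compare it with what was read. */
static bool fake_roundtrip_matches(uint32_t seed, size_t len)
{
    uint8_t written[64], readback[64], expected[64];
    struct fake_rnd_state st;

    assert(seed != 0 && len <= sizeof(written));
    st.s = seed;
    fake_bytes_state(&st, written, len);
    memcpy(readback, written, len);   /* stands in for device write + read */
    st.s = seed;
    fake_bytes_state(&st, expected, len);
    return memcmp(expected, readback, len) == 0;
}
```

The check only relies on the generator being deterministic for a given seed. It does not depend on any particular output value, which is consistent with the remark that no user depends on a specific result of the PRNG.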
    • Merge branch 'csums-next' · c1e60bd4
      David S. Miller authored
      Tom Herbert says:
      
      ====================
      net: Checksum offload changes - Part V
      
      I am working on overhauling RX checksum offload. Goals of this effort
      are:
      
      - Specify what exactly it means when driver returns CHECKSUM_UNNECESSARY
      - Preserve CHECKSUM_COMPLETE through encapsulation layers
      - Don't do skb_checksum more than once per packet
      - Unify GRO and non-GRO csum verification as much as possible
      - Unify the checksum functions (checksum_init)
      - Simplify code
      
      What is in this fifth patch set:
      
      - Added GRO checksum validation functions
      - Call the GRO validation functions from TCP and GRE gro_receive
      - Perform checksum verification in the UDP gro_receive path using
        GRO functions and add support for gro_receive in UDP6
      
      Changes in V2:
      
      - Change ip_summed to CHECKSUM_UNNECESSARY instead of moving it
        to CHECKSUM_COMPLETE from GRO checksum validation. This avoids the
        performance penalty of checksumming bytes that come before the
        header GRO is currently at.
      
      Please review carefully and test if possible, mucking with basic
      checksum functions is always a little precarious :-)
      
      ----
      
      Test results with this patch set are below. I did not notice any
      performance regression.
      
      Tests run:
         TCP_STREAM: super_netperf with 200 streams
         TCP_RR: super_netperf with 200 streams and -r 1,1
      
      Device bnx2x (10Gbps):
         No GRE RSS hash (RX interrupts occur on one core)
         UDP RSS port hashing enabled.
      
      * GRE with checksum with IPv4 encapsulated packets
        With fix:
          TCP_STREAM
              9.91% CPU utilization
              5163.78 Mbps
          TCP_RR
              50.64% CPU utilization
              219/347/502 90/95/99% latencies
              834103 tps
        Without fix:
          TCP_STREAM
              10.05% CPU utilization
              5186.22 Mbps
          TCP_RR
              49.70% CPU utilization
              227/338/486 90/95/99% latencies
              813450 tps
      
      * GRE without checksum with IPv4 encapsulated packets
        With fix:
          TCP_STREAM
              10.18% CPU utilization
              5159 Mbps
          TCP_RR
              51.86% CPU utilization
              214/325/471 90/95/99% latencies
              865943 tps
        Without fix:
          TCP_STREAM
              10.26% CPU utilization
              5307.87 Mbps
          TCP_RR
              50.59% CPU utilization
              224/325/476 90/95/99% latencies
              846429 tps
      
      *** Simulate device returns CHECKSUM_COMPLETE
      
      * VXLAN with checksum
        With fix:
          TCP_STREAM
              13.03% CPU utilization
              9093.9 Mbps
          TCP_RR
              95.96% CPU utilization
              161/259/474 90/95/99% latencies
              1.14806e+06 tps
        Without fix:
          TCP_STREAM
              13.59% CPU utilization
              9093.97 Mbps
          TCP_RR
              93.95% CPU utilization
              160/259/484 90/95/99% latencies
              1.10262e+06 tps
      
      * VXLAN without checksum
        With fix:
          TCP_STREAM
              13.28% CPU utilization
              9093.87 Mbps
          TCP_RR
              95.04% CPU utilization
              155/246/439 90/95/99% latencies
              1.15e+06 tps
        Without fix:
          TCP_STREAM
              13.37% CPU utilization
              9178.45 Mbps
          TCP_RR
              93.74% CPU utilization
              161/257/469 90/95/99% latencies
              1.1068e+06 tps
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • gre: When GRE csum is present count as encap layer wrt csum · 48a5fc77
      Tom Herbert authored
      In GRE demux, if the GRE checksum is present, pop rcv encapsulation so
      that any encapsulated checksums are treated as tunnel checksums.
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • udp: additional GRO support · 57c67ff4
      Tom Herbert authored
      Implement GRO for UDPv6. Add UDP checksum verification in gro_receive
      for both UDP4 and UDP6 calling skb_gro_checksum_validate_zero_check.
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: Call skb_gro_checksum_validate · 149d0774
      Tom Herbert authored
      In tcp[64]_gro_receive call skb_gro_checksum_validate to validate TCP
      checksum in the gro context.
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • 758f75d1
      Tom Herbert authored
    • net: add gro_compute_pseudo functions · 1933a785
      Tom Herbert authored
      Add inet_gro_compute_pseudo and ip6_gro_compute_pseudo. These are
      the logical equivalents of inet_compute_pseudo and ip6_compute_pseudo
      for the GRO path. The IP header is taken from skb_gro_network_header.
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
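For readers unfamiliar with what a "pseudo" computation covers, here is a minimal userspace sketch of an IPv4 pseudo-header sum. It is not the kernel's implementation: the `fake_` functions are hypothetical, and for simplicity all inputs are taken in host byte order rather than read from a packet header.

```c
#include <assert.h>
#include <stdint.h>

/* Sum the IPv4 pseudo-header fields as 16-bit words: source address,
 * destination address, protocol number and transport-layer length.
 * Inputs are in host byte order for this sketch. */
static uint32_t fake_pseudo_sum(uint32_t saddr, uint32_t daddr,
                                uint8_t proto, uint16_t len)
{
    uint32_t sum = 0;

    sum += saddr >> 16;
    sum += saddr & 0xffffu;
    sum += daddr >> 16;
    sum += daddr & 0xffffu;
    sum += proto;
    sum += len;
    return sum;
}

/* Fold a 32-bit accumulator into 16 bits with end-around carry. */
static uint16_t fake_fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)sum;
}
```

For example, with source 192.168.0.1 (0xC0A80001), destination 192.168.0.2 (0xC0A80002), protocol 6 and length 20, the unfolded sum is 0x1816D and the folded value is 0x816E. The GRO variants named in the commit take the addresses from the header returned by skb_gro_network_header instead.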
    • net: skb_gro_checksum_* functions · 573e8fca
      Tom Herbert authored
      Add skb_gro_checksum_validate, skb_gro_checksum_validate_zero_check,
      skb_gro_checksum_simple_validate, and __skb_gro_checksum_complete.
      These are the counterparts of the normal checksum functions, but are used
      in the gro_receive path and operate on GRO-related fields in sk_buffs.
      Signed-off-by: Tom Herbert <therbert@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
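The general shape of a checksum validation with a "zero check" can be sketched in userspace as below. This is an assumption-laden model, not the kernel helpers: the `fake_` functions are hypothetical, the data is a plain byte array rather than an sk_buff, and the zero check is modelled as "a zero checksum field means the sender did not compute one", as is the case for UDP over IPv4.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Add big-endian 16-bit words of data to sum; a trailing odd byte is
 * padded with zero on the right. */
static uint32_t fake_sum16(const uint8_t *data, size_t len, uint32_t sum)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (i < len)
        sum += (uint32_t)data[i] << 8;
    return sum;
}

/* Fold a 32-bit accumulator into 16 bits with end-around carry. */
static uint16_t fake_fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Validate a segment whose checksum field is already part of seg.
 * pseudo is the unfolded pseudo-header sum. When zero_okay is set and
 * the checksum field is zero, validation is skipped. */
static bool fake_validate(const uint8_t *seg, size_t len, uint32_t pseudo,
                          uint16_t check_field, bool zero_okay)
{
    if (zero_okay && check_field == 0)
        return true;
    return fake_fold(fake_sum16(seg, len, pseudo)) == 0xffffu;
}
```

A correct segment sums to 0xffff once the checksum field is included, so a single flipped bit makes the validation fail. Doing this once in the GRO path fits the series goal of not running skb_checksum more than once per packet.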
  2. 23 Aug, 2014 16 commits