  1. 27 May, 2015 17 commits
    • net: Adding support for Cavium ThunderX network controller · 4863dea3
      Sunil Goutham authored
      This patch adds support for the Cavium ThunderX network controller.
      The driver is on the pci bus and thus requires the Thunder PCIe host
      controller driver to be enabled.
      Signed-off-by: Maciej Czekaj <mjc@semihalf.com>
      Signed-off-by: David Daney <david.daney@cavium.com>
      Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
      Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@caviumnetworks.com>
      Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
      Signed-off-by: Tomasz Nowicki <tomasz.nowicki@linaro.org>
      Signed-off-by: Robert Richter <rrichter@cavium.com>
      Signed-off-by: Kamil Rytarowski <kamil@semihalf.com>
      Signed-off-by: Thanneeru Srinivasulu <tsrinivasulu@caviumnetworks.com>
      Signed-off-by: Sruthi Vangala <svangala@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pci: Add Cavium PCI vendor id · e5c4708b
      Sunil Goutham authored
      This vendor id will be used by network (vNIC), USB (xHCI),
      SATA (AHCI), GPIO, I2C, MMC and maybe other drivers
      for ThunderX SoC.
      Acked-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
      Signed-off-by: Aleksey Makarov <aleksey.makarov@caviumnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • test_bpf: add similarly conflicting jump test case only for classic · bde28bc6
      Daniel Borkmann authored
      While 3b529602 ("test_bpf: add more eBPF jump torture cases")
      added the int3 bug test case only for eBPF, which needs exactly 11
      passes to converge, here's a version for classic BPF with 11 passes,
      and one that would need 70 passes on x86_64 to actually converge for
      being successfully JITed. Effectively, all jumps are being optimized
      out resulting in a JIT image of just 89 bytes (from originally max
      BPF insns), only returning K.
      
      Might be useful as a recipe for folks wanting to craft a test case
      when backporting the fix in commit 3f7352bf ("x86: bpf_jit: fix
      compilation of large bpf programs") while not having eBPF. The 2nd
      one is delegated to the interpreter as the last pass still results
      in shrinking, in other words, this one won't be JITed on x86_64.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'sfc-next' · 5474b132
      David S. Miller authored
      Edward Cree says:
      
      ====================
      sfc: add MCDI tracing
      
      This patchset adds support for logging MCDI (Management-Controller-to-
       Driver Interface) interactions between the sfc driver and a bound device,
       to aid in debugging.
      Solarflare has a tool to decode the resulting traces and will look to
       open-source this if there is any external interest, but the protocol is
       already detailed in drivers/net/ethernet/sfc/mcdi_pcol.h.
      The logging buffer we allocate per MCDI context is a work area for
       constructing each individual message before logging it with netif_info.
      The reason the buffer is long-lived is simply to avoid the overhead of
       allocating and freeing it every MCDI call, since MCDIs are already known
       to be serialised for other reasons.
      
      --
      v4: remove patch #4, which has already been applied via sshah
      v3: add some explanations to cover letter and patch #4
      v2: avoid long lines in cover letter; fix multiline comment style
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: add module parameter to enable MCDI logging on new functions · 42ca087f
      Edward Cree authored
      As many issues are encountered at probe time, where MCDI logging can't be
       enabled through the sysfs node, this change adds a module parameter
       'mcdi_logging_default', which defaults to false.  When set to true, newly-
       probed functions will have MCDI logging enabled.  The setting can
       subsequently be changed as normal through the sysfs node.
      Signed-off-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: add sysfs entry to control MCDI tracing · e7fef9b4
      Edward Cree authored
      MCDI tracing is enabled per-function with a sysfs file
          /sys/class/net/<NET_DEV>/device/mcdi_logging
      Signed-off-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: add tracing of MCDI commands · 75aba2a5
      Edward Cree authored
      MCDI tracing is conditional on CONFIG_SFC_MCDI_LOGGING, which is enabled
       by default.
      
      Each MCDI command will produce a console line like
          sfc dom:bus:dev:fn ifname: MCDI RPC REQ: xxxxxxxx [yyyyyyyy...]
      where xxxxxxxx etc. are the raw MCDI payload in 32-bit hex chunks.
      The response will then produce a similar line with "RESP" instead of "REQ",
       and containing the MCDI response payload (if any).
      Signed-off-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vxlan: release lock after each bucket in vxlan_cleanup · 14e1d0fa
      Sorin Dumitru authored
      We're seeing some softlockups from this function when there
      are a lot of fdb entries on a vxlan device. Taking the lock for
      each bucket instead of the whole table is enough to fix that.
      Signed-off-by: Sorin Dumitru <sdumitru@ixiacom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp/dccp: try to not exhaust ip_local_port_range in connect() · 07f4c900
      Eric Dumazet authored
      A long standing problem on busy servers is the tiny available TCP port
      range (/proc/sys/net/ipv4/ip_local_port_range) and the default
      sequential allocation of source ports in connect() system call.
      
      If a host is having a lot of active TCP sessions, chances are
      very high that all ports are in use by at least one flow,
      and subsequent bind(0) attempts fail, or have to scan a big portion of
      space to find a slot.
      
      In this patch, I changed the starting point in __inet_hash_connect()
      so that we try to favor even [1] ports, leaving odd ports for bind()
      users.
      
      We still perform a sequential search, so there is no guarantee, but
      if connect() targets are very different, end result is we leave
      more ports available to bind(), and we spread them all over the range,
      lowering time for both connect() and bind() to find a slot.
      
      This strategy only works well if /proc/sys/net/ipv4/ip_local_port_range
      is even, ie if start/end values have different parity.
      
      Therefore, default /proc/sys/net/ipv4/ip_local_port_range was changed to
      32768 - 60999 (instead of 32768 - 61000)
      
      There is no change on security aspects here, only some poor hashing
      schemes could be eventually impacted by this change.
      
      [1] : The odd/even property depends on ip_local_port_range values parity
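The parity argument above can be checked with a small userspace model. This is only a sketch: `connect_port_candidates` is a hypothetical name, and the loop is a simplification of the search described in the message, not the code in __inet_hash_connect(). It assumes the first pass steps through the range two ports at a time, starting from an even offset.

```python
def connect_port_candidates(low, high, offset):
    """Yield source ports in the order a first connect() pass might try them.

    Hypothetical model: `offset` stands in for the per-connection starting
    point; it is rounded down to an even value so the search starts on a
    port with the same parity as `low`.
    """
    remaining = high - low + 1   # number of ports in the range
    offset &= ~1                 # force an even starting offset
    for i in range(0, remaining, 2):
        # Wrap around the end of the range; parity survives the wrap
        # only when `remaining` is even.
        yield low + (offset + i) % remaining
```

With 32768 - 60999 the range holds an even number of ports, so every candidate in this pass keeps the parity of the first port. With 32768 - 61000 the count is odd, and wrapping around the end of the range flips the parity, which is why the default was changed.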
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'ip_frag_next' · 837b9955
      David S. Miller authored
      Florian Westphal says:
      
      ====================
      net: force refragmentation for DF reassembled skbs
      
      output path tests:
      
          if (skb->len > mtu) ip_fragment()
      
      This breaks connectivity in one corner case:
       If the skb was reassembled, but has the DF bit set and ..
       .. its reassembled size is <= outdev mtu ..
       .. we will forward a DF packet larger than what the sender
          transmitted on wire.
      
      If a router later in the path can't forward this packet, it will send an
      icmp error in response to an mtu that the original sender never exceeded.
      
      This changes ipv4 defrag/output path to
      
      a) force refragmentation for DF reassembled skbs and
      b) set DF bit on all fragments when refragmenting if it was set on original
      frags.
      
      tested via:
      from scapy.all import *
      dip="10.23.42.2"
      payload="A"*1400
      packet=IP(dst=dip,id=12345,flags='DF')/UDP(sport=42,dport=42)/payload
      frags=fragment(packet,fragsize=1200)
      for fragment in frags:
          send(fragment)
      
      Without this patch, we generate fragments without df bit set based
      on the outgoing device mtu when fragmenting after forwarding, ie.
      
      IP (ttl 64, id 12345, offset 0, flags [+, DF], proto UDP (17), length 1204)
          192.168.7.1.42 > 10.23.42.2.42: UDP, length 1400
      IP (ttl 64, id 12345, offset 1184, flags [DF], proto UDP (17), length 244)
          192.168.7.1 > 10.23.42.2: ip-proto-17
      
      on ingress will either turn into
      
      IP (ttl 63, id 12345, offset 0, flags [+], proto UDP (17), length 1396)
          192.168.7.1.42 > 10.23.42.2.42: UDP, length 1400
      IP (ttl 63, id 12345, offset 1376, flags [none], proto UDP (17), length 52)
      
      (mtu 1400: We strip df and send larger fragment), or
      
      IP (ttl 63, id 12345, offset 0, flags [DF], proto UDP (17), length 1428)
          192.168.7.1.42 > 10.23.42.2.42: [udp sum ok] UDP, length 1400
      
      if mtu is 1500.  And in this case things break; router with a smaller mtu
      will send icmp error, but original sender only sent packets <= 1204 byte.
      
      With patch, we keep intent of such fragments and will emit DF-fragments
      that won't exceed 1204 byte in size.
      
      Joint work with Hannes Frederic Sowa.
      
      Changes since v2:
       - split unrelated patches from series
       - rework changelog of patch #2 to better illustrate breakage
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ip_fragment: don't forward defragmented DF packet · d6b915e2
      Florian Westphal authored
      We currently always send fragments without DF bit set.
      
      Thus, given following setup:
      
      mtu1500 - mtu1500:1400 - mtu1400:1280 - mtu1280
         A           R1              R2         B
      
      Where R1 and R2 run linux with netfilter defragmentation/conntrack
      enabled, then if Host A sent a fragmented packet _with_ DF set to B, R1
      will respond with icmp too big error if one of these fragments exceeded
      1400 bytes.
      
      However, if R1 receives fragment sizes 1200 and 100, it would
      forward the reassembled packet without refragmenting, i.e.
      R2 will send an icmp error in response to a packet that was never sent,
      citing mtu that the original sender never exceeded.
      
      The other minor issue is that a refragmentation on R1 will conceal the
      MTU of R2-B since refragmentation does not set DF bit on the fragments.
      
      This modifies ip_fragment so that we track largest fragment size seen
      both for DF and non-DF packets, and set frag_max_size to the largest
      value.
      
      If the DF fragment size is larger than or equal to the non-df one, we will
      consider the packet a path mtu probe:
      We set DF bit on the reassembled skb and also tag it with a new IPCB flag
      to force refragmentation even if skb fits outdev mtu.
      
      We will also set DF bit on each fragment in this case.
      
      Joint work with Hannes Frederic Sowa.
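The decision described above can be modelled in a few lines. This is a sketch under stated assumptions: the function names, the tuple representation of fragments and the example sizes are hypothetical, and the real logic lives in the kernel's defrag and output paths.

```python
def reassembly_state(fragments):
    """Summarise received fragments given as (size_in_bytes, df_bit_set) pairs."""
    max_df = max((size for size, df in fragments if df), default=0)
    max_nondf = max((size for size, df in fragments if not df), default=0)
    frag_max_size = max(max_df, max_nondf)
    # Treat the packet as a path MTU probe when the largest DF fragment
    # is at least as large as the largest non-DF fragment.
    pmtu_probe = max_df > 0 and max_df >= max_nondf
    return frag_max_size, pmtu_probe


def must_refragment(skb_len, out_mtu, frag_max_size, pmtu_probe):
    """Decide whether the reassembled packet has to be split again on output."""
    if skb_len > out_mtu:
        return True
    # Forced refragmentation: the packet fits the outgoing MTU, but it is
    # larger than anything the original sender put on the wire.
    return pmtu_probe and skb_len > frag_max_size
```

For the R1 case above (DF fragments of 1200 and 100 bytes, outgoing mtu 1400), `reassembly_state` reports a largest fragment of 1200 and a path mtu probe, so a reassembled packet of, say, 1280 bytes is split again rather than forwarded whole.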
      Reported-by: Jesse Gross <jesse@nicira.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipv4: avoid repeated calls to ip_skb_dst_mtu helper · c5501eb3
      Florian Westphal authored
      ip_skb_dst_mtu is a small inline helper, but it's called in several places.
      
      before: 17061      44       0   17105    42d1 net/ipv4/ip_output.o
      after:  16805      44       0   16849    41d1 net/ipv4/ip_output.o
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'phy_rgmii' · 8c0ce770
      David S. Miller authored
      Florian Fainelli says:
      
      ====================
      net: phy: phy_interface_is_rgmii helper
      
      As you suggested, here is the helper function to avoid missing some RGMII
      interface checks. Had to wait for net to be merged in net-next to avoid
      submitting the same patch/commit.
      
      Dan, you might want to rebase your dp83867 submission to use that helper
      when this patchset gets merged into net-next, thanks!
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: Utilize phy_interface_is_rgmii · 32a64161
      Florian Fainelli authored
      Update all open-coded tests for all 4 PHY_INTERFACE_MODE_RGMII* values
      to use the newly introduced helper: phy_interface_is_rgmii.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: Add phy_interface_is_rgmii helper · e463d88c
      Florian Fainelli authored
      RGMII interfaces come in 4 different flavors that the PHY library needs
      to care about: regular RGMII (no delays), RGMII with either RX or TX
      delay, and both. In order to avoid errors of checking only for one type
      of RGMII interface and missing the 3 others, introduce a convenience
      function which tests for all values.
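The idea behind the helper can be illustrated outside the kernel. The following is a userspace sketch, not the kernel implementation: the enum is a small illustrative subset of interface modes, and the membership test is one possible way to cover all four RGMII flavors in a single place.

```python
from enum import Enum, auto


class PhyInterfaceMode(Enum):
    """Small subset of PHY interface modes, for illustration only."""
    MII = auto()
    GMII = auto()
    SGMII = auto()
    RGMII = auto()       # no internal delays
    RGMII_ID = auto()    # both RX and TX delay
    RGMII_RXID = auto()  # RX delay only
    RGMII_TXID = auto()  # TX delay only


_RGMII_MODES = frozenset({
    PhyInterfaceMode.RGMII,
    PhyInterfaceMode.RGMII_ID,
    PhyInterfaceMode.RGMII_RXID,
    PhyInterfaceMode.RGMII_TXID,
})


def phy_interface_is_rgmii(mode):
    """Return True for any of the four RGMII flavors."""
    return mode in _RGMII_MODES
```

Callers that previously compared against a single RGMII value can call the helper instead, so a delay variant is not silently missed.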
      Suggested-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Fix fib_trie.c build, missing linux/vmalloc.h include. · ffa915d0
      David S. Miller authored
      We used to get this indirectly I suppose, but no longer do.
      
      Either way, an explicit include should have been done in the
      first place.
      
         net/ipv4/fib_trie.c: In function '__node_free_rcu':
      >> net/ipv4/fib_trie.c:293:3: error: implicit declaration of function 'vfree' [-Werror=implicit-function-declaration]
            vfree(n);
            ^
         net/ipv4/fib_trie.c: In function 'tnode_alloc':
      >> net/ipv4/fib_trie.c:312:3: error: implicit declaration of function 'vzalloc' [-Werror=implicit-function-declaration]
            return vzalloc(size);
            ^
      >> net/ipv4/fib_trie.c:312:3: warning: return makes pointer from integer without a cast
         cc1: some warnings being treated as errors
      Reported-by: kbuild test robot <fengguang.wu@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: tcp_tso_autosize() minimum is one packet · d6a4e26a
      Eric Dumazet authored
      By making sure sk->sk_gso_max_segs minimal value is one,
      and sysctl_tcp_min_tso_segs minimal value is one as well,
      tcp_tso_autosize() will return a non zero value.
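A rough model of why the two lower bounds guarantee a non-zero result. This is a sketch only: the parameter names are hypothetical, and clamping inside the function is a simplification, since the message describes the minimums being enforced on the values themselves.

```python
def tso_autosize(pacing_bytes, mss_now, min_tso_segs, gso_max_segs):
    """Return how many segments to put in one TSO packet (simplified model)."""
    # Both limits are kept at one or more, as the commit message describes.
    min_tso_segs = max(min_tso_segs, 1)
    gso_max_segs = max(gso_max_segs, 1)
    # Size the burst from the byte budget, but never below the minimum...
    segs = max(pacing_bytes // mss_now, min_tso_segs)
    # ...and never above what the device accepts.
    return min(segs, gso_max_segs)
```

Because both bounds are at least one, the result is at least one even when the byte budget is smaller than a single segment.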
      
      We can then revert 843925f3
      ("tcp: Do not apply TSO segment limit to non-TSO packets")
      and save a few cpu cycles in fast path.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 26 May, 2015 7 commits
    • tcp: fix/cleanup inet_ehash_locks_alloc() · 095dc8e0
      Eric Dumazet authored
      If tcp ehash table is constrained to a very small number of buckets
      (eg boot parameter thash_entries=128), then we can crash if spinlock
      array has more entries.
      
      While we are at it, un-inline inet_ehash_locks_alloc() and make
      following changes :
      
      - Budget 2 cache lines per cpu worth of 'spinlocks'
      - Try to kmalloc() the array to avoid extra TLB pressure.
        (Most servers at Google allocate 8192 bytes for this hash table)
      - Get rid of various #ifdef
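The sizing rule can be sketched as follows. The numbers and names here are assumptions for illustration: a 64-byte cache line, a 4-byte spinlock, and rounding the total up to a power of two so a lock can be picked with a mask. The essential point from the message is the final cap at the number of hash buckets.

```python
def ehash_lock_count(num_cpus, ehash_buckets, lock_size=4, cache_line=64):
    """Return how many spinlocks to allocate for the ehash table (model)."""

    def roundup_pow_of_two(n):
        return 1 << (n - 1).bit_length()

    # Budget two cache lines' worth of locks per cpu, at least one lock.
    per_cpu = max(2 * cache_line // lock_size, 1)
    nlocks = roundup_pow_of_two(per_cpu * num_cpus)
    # Never allocate more locks than there are buckets to protect.
    return min(nlocks, ehash_buckets)
```

With thash_entries=128 and 8 cpus, the budget alone would give 256 locks; the cap brings it down to 128 so the lock array never exceeds the table.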
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: fix bug in link protocol message create function · f3903bcc
      Jon Paul Maloy authored
      In commit dd3f9e70
      ("tipc: add packet sequence number at instant of transmission") we
      made a change with the consequence that packets in the link backlog
      queue don't contain valid sequence numbers.
      
      However, when we create a link protocol message, we still use the
      sequence number of the first packet in the backlog, if there is any,
      as "next_sent" indicator in the message. This may entail unnecessary
      retransmissions or stale packet transmission when there is very low
      traffic on the link.
      
      This commit fixes this issue by only using the current value of
      tipc_link::snd_nxt as indicator.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: fix inet_proto_csum_replace4() sparse errors · 05c98543
      Eric Dumazet authored
      make C=2 CF=-D__CHECK_ENDIAN__ net/core/utils.o
      ...
      net/core/utils.c:307:72: warning: incorrect type in argument 2 (different base types)
      net/core/utils.c:307:72:    expected restricted __wsum [usertype] addend
      net/core/utils.c:307:72:    got restricted __be32 [usertype] from
      net/core/utils.c:308:34: warning: incorrect type in argument 2 (different base types)
      net/core/utils.c:308:34:    expected restricted __wsum [usertype] addend
      net/core/utils.c:308:34:    got restricted __be32 [usertype] to
      net/core/utils.c:310:70: warning: incorrect type in argument 2 (different base types)
      net/core/utils.c:310:70:    expected restricted __wsum [usertype] addend
      net/core/utils.c:310:70:    got restricted __be32 [usertype] from
      net/core/utils.c:310:77: warning: incorrect type in argument 2 (different base types)
      net/core/utils.c:310:77:    expected restricted __wsum [usertype] addend
      net/core/utils.c:310:77:    got restricted __be32 [usertype] to
      net/core/utils.c:312:72: warning: incorrect type in argument 2 (different base types)
      net/core/utils.c:312:72:    expected restricted __wsum [usertype] addend
      net/core/utils.c:312:72:    got restricted __be32 [usertype] from
      net/core/utils.c:313:35: warning: incorrect type in argument 2 (different base types)
      net/core/utils.c:313:35:    expected restricted __wsum [usertype] addend
      net/core/utils.c:313:35:    got restricted __be32 [usertype] to
      
      Note we can use csum_replace4() helper
      
      Fixes: 58e3cac5 ("net: optimise inet_proto_csum_replace4()")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: remove a sparse error in secure_dccpv6_sequence_number() · 68319052
      Eric Dumazet authored
      make C=2 CF=-D__CHECK_ENDIAN__ net/core/secure_seq.o
      net/core/secure_seq.c:157:50: warning: restricted __be32 degrades to
      integer
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bridge: skip fdb add if the port shouldn't learn · eb8d7baa
      Wilson Kok authored
      Check in fdb_add_entry() if the source port should learn; a similar
      check is used in br_fdb_update.
      Note that new fdb entries which are added manually or
      as local ones are still permitted.
      This patch has been tested by running traffic via a bridge port and
      switching the port's state, also by manually adding/removing entries
      from the bridge's fdb.
      Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pktgen: remove one sparse error · d4969581
      Eric Dumazet authored
      net/core/pktgen.c:2672:43: warning: incorrect type in assignment (different base types)
      net/core/pktgen.c:2672:43:    expected unsigned short [unsigned] [short] [usertype] <noident>
      net/core/pktgen.c:2672:43:    got restricted __be16 [usertype] protocol
      
      Let's use proper struct ethhdr instead of hard coding everything.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: ipv6_select_ident() returns a __be32 · 7f159867
      Eric Dumazet authored
      ipv6_select_ident() returns a 32bit value in network order.
      
      Fixes: 286c2349 ("ipv6: Clean up ipv6_select_ident() and ip6_fragment()")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: kbuild test robot <fengguang.wu@intel.com>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 25 May, 2015 16 commits
    • Merge branch 'cpsw-cleanups' · eedf4c66
      David S. Miller authored
      Richard Cochran says:
      
      ====================
      cpsw cleanups
      
      While working on an out-of-tree customization, I noticed a few minor
      problems in the cpsw code.  This series cleans up the issues I found.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: cpsw: remove redundant calls disabling dma interrupts. · 61d22596
      Richard Cochran authored
      The function, cpsw_intr_disable, already calls cpdma_ctlr_int_ctrl.  There
      is no need to disable the dma interrupts twice.  This patch removes the
      extra calls.
      Signed-off-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: cpsw: remove redundant calls enabling dma interrupts. · 071f1a96
      Richard Cochran authored
      The function, cpsw_intr_enable, already calls cpdma_ctlr_int_ctrl.  There
      is no need to enable the dma interrupts twice.  This patch removes the
      extra call.
      Signed-off-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: cpsw: remove two unused global functions · 202c5919
      Richard Cochran authored
      The functions, cpsw_ale_flush and cpsw_ale_set_ageout, have never been used
      since they were first introduced.  This patch removes the dead code.
      Signed-off-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: cpsw: fix misplaced break statements. · 26fe7eb8
      Richard Cochran authored
      Having the breaks too far to the left makes parsing the dense switch/case
      block unnecessarily harder.
      Signed-off-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'rocker-cleanups' · d1a0ed79
      David S. Miller authored
      Simon Horman says:
      
      ====================
      rocker: unused parameter and const cleanups
      
      This series provides some minor though verbose cleanup of rocker.
      
      The second patch depends on the first though it could be rebased.
      
      I had previously asked for v2 to be put on hold while some bugs I had found
      in the rocker driver were shaken out. That has now happened and the bugs
      turned out to be unrelated.  Accordingly I am reposting the series.
      
      * Changes v2 -> v3
        - Rebase and update for new variables and parameters that may be const
      
      * Changes v1 -> v2
        - Found quite a few more variables and parameters to make const
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rocker: mark parameters and local variables as const · e5054643
      Simon Horman authored
      Mark parameters and local variables as const where possible.
      Signed-off-by: Simon Horman <simon.horman@netronome.com>
      Acked-by: Scott Feldman <sfeldma@gmail.com>
      Acked-by: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rocker: remove unused rocker_port parameter from rocker_port_kfree · 0985df73
      Simon Horman authored
      Remove unused rocker_port parameter from rocker_port_kfree.
      Also remove the rocker_port parameter from callers of rocker_port_kfree
      where the parameter is now unused.
      Signed-off-by: Simon Horman <simon.horman@netronome.com>
      Acked-by: Scott Feldman <sfeldma@gmail.com>
      Acked-by: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • irda: use msecs_to_jiffies for conversion to jiffies · 005e8709
      Nicholas Mc Guire authored
      API compliance scanning with coccinelle flagged:
      ./net/irda/timer.c:63:35-37: use of msecs_to_jiffies probably preferable
      
      Converting milliseconds to jiffies by "val * HZ / 1000" technically
      is not a clean solution as it does not handle all corner cases correctly.
      By changing the conversion to use msecs_to_jiffies(val) conversion is
      correct in all cases. Further the () around the arithmetic expression
      was dropped.
      
      Patch was compile tested for x86_64_defconfig + CONFIG_IRDA=m
      
      Patch is against 4.1-rc4 (localversion-next is -next-20150522)
      Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • neterion: s2io: Fix kernel doc formatting · d07ce242
      Joe Perches authored
      These two uses seem to have had carriage returns removed.
      Make these entries like all the others in this file.
      Signed-off-by: Joe Perches <joe@perches.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • irda: irda-usb: use msecs_to_jiffies for conversions · bbfe0f37
      Nicholas Mc Guire authored
      API compliance scanning with coccinelle flagged:
      
      Converting milliseconds to jiffies by "val * HZ / 1000" technically
      is not a clean solution as it does not handle all corner cases correctly.
      By changing the conversion to use msecs_to_jiffies(val) conversion is
      correct in all cases.
      
      in the current code:
        mod_timer(&self->rx_defer_timer, jiffies + (10 * HZ / 1000));
      for HZ < 100 (e.g. CONFIG_HZ == 64|32 in alpha) this effectively results
      in no delay at all.
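The truncation is easy to reproduce with plain integer arithmetic. The sketch below uses hypothetical function names; `rounded_ms_to_jiffies` is a simplified stand-in that assumes msecs_to_jiffies() rounds up to the next jiffy, and it ignores the overflow handling of the real helper.

```python
def naive_ms_to_jiffies(ms, hz):
    """Open-coded conversion: integer division truncates toward zero."""
    return ms * hz // 1000


def rounded_ms_to_jiffies(ms, hz):
    """Round-up conversion, a simplified stand-in for msecs_to_jiffies()."""
    return (ms * hz + 999) // 1000
```

For a 10 ms delay, the open-coded form yields 0 jiffies at HZ=64 or HZ=32, while the rounding form yields 1, so the timer is still deferred by at least one tick.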
      
      Patch was compile tested for x86_64_defconfig (implies CONFIG_USB_IRDA=m)
      
      Patch is against 4.1-rc4 (localversion-next is -next-20150522)
      Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bridge: allow setting hash_max + multicast_router if interface is down · 6ae4ae8e
      Linus Lüssing authored
      Network managers like netifd (used in OpenWRT for instance) try to
      configure interface options after creation but before setting the
      interface up.
      
      Unfortunately the sysfs / bridge currently only allows configuring the
      hash_max and multicast_router options when the bridge interface is up.
      But since br_multicast_init() doesn't start any timers and only sets
      default values and initializes timers it should be safe to reconfigure
      the default values after that, before things actually get active after
      the bridge is set up.
      Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: don't increase size when refragmenting forwarded ipv6 skbs · 485fca66
      Florian Westphal authored
      Since commit 6aafeef0 ("netfilter: push reasm skb through instead of
      original frag skbs") we will end up sometimes re-fragmenting skbs
      that we've reassembled.
      
      ipv6 defrag preserves the original skbs using the skb frag list, i.e. as long
      as the skb frag list is preserved there is no problem since we keep
      original geometry of fragments intact.
      
      However, in the rare case where the frag list is munged or skb
      is linearized, we might send larger fragments than what we originally
      received.
      
      A router in the path might then send packet-too-big errors even if
      sender never sent fragments exceeding the reported mtu:
      
      mtu 1500 - 1500:1400 - 1400:1280 - 1280
           A         R1         R2        B
      
      1 - A sends to B, fragment size 1400
      2 - R2 sends pkttoobig error for 1280
      3 - A sends to B, fragment size 1280
      4 - R2 sends pkttoobig error for 1280 again because it sees fragments of size 1400.
      
      Make sure ip6_fragment always caps the MTU at the largest packet size
      seen when a defragmented skb is forwarded.
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      485fca66
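
      The capping rule from the commit above can be sketched as a small
      helper. This is an illustration under stated assumptions, not the
      kernel's ip6_fragment(): refrag_mtu() is a hypothetical name, and a
      frag_max_size of 0 is assumed to mean "this packet was not reassembled
      from fragments".

```c
#include <assert.h>
#include <stdint.h>

#define IPV6_MIN_MTU 1280u  /* minimum link MTU required by IPv6 */

/*
 * Pick the MTU to use when (re-)fragmenting a forwarded packet.
 * path_mtu:      MTU of the outgoing path
 * frag_max_size: largest fragment seen during reassembly, 0 if none
 */
static uint32_t refrag_mtu(uint32_t path_mtu, uint32_t frag_max_size)
{
    uint32_t mtu = path_mtu;

    assert(path_mtu >= IPV6_MIN_MTU);

    /* Never emit fragments larger than the ones we received. */
    if (frag_max_size != 0 && frag_max_size < mtu)
        mtu = frag_max_size;

    /* Do not go below the IPv6 minimum MTU. */
    if (mtu < IPV6_MIN_MTU)
        mtu = IPV6_MIN_MTU;

    return mtu;
}
```

      In the scenario above, R1 forwards onto a 1400-byte link after
      reassembling 1280-byte fragments; refrag_mtu(1400, 1280) yields 1280,
      so R2 no longer sees fragments larger than the MTU it reported.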
    • Shailendra Verma's avatar
      atm:he - Change 1 to true for bool type variable. · 376cd36d
      Shailendra Verma authored
      The variable irq_coalesce is bool type.
      So assign the value true instead of 1.
      Signed-off-by: Shailendra Verma <shailendra.capricorn@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      376cd36d
    • Shailendra Verma's avatar
      net:xen-netback - Change 1 to true for bool type variable. · c489dbb1
      Shailendra Verma authored
      The variable separate_tx_rx_irq is bool type.
      So assign the value true instead of 1.
      Signed-off-by: Shailendra Verma <shailendra.capricorn@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c489dbb1
    • David S. Miller's avatar
      Merge branch 'ipv6_route_sharing' · c1a34035
      David S. Miller authored
      Martin KaFai Lau says:
      
      ====================
      ipv6: Only create RTF_CACHE route after encountering pmtu exception
      
      v4 -> v5:
      - Patch 1 is new. Clean up the ipv6_select_ident() and ip6_fragment().
      
      - Further simplify the newly added rt6_get_pcpu_route().  If there is a
        'prev' after cmpxchg, return prev instead of the newly created percpu
        clone.
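
      The "return prev after cmpxchg" pattern mentioned in the last item can
      be sketched in user-space C11. This is not rt6_get_pcpu_route() itself:
      struct pcpu_route and install_or_reuse() are hypothetical names, and
      C11 atomic_compare_exchange_strong() stands in for the kernel's
      cmpxchg().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-cpu route clone; the payload is irrelevant here. */
struct pcpu_route {
    int id;
};

/*
 * Try to publish a freshly allocated clone into an empty slot.
 * If another context already installed one, free ours and return
 * the previously installed clone instead.
 */
static struct pcpu_route *install_or_reuse(_Atomic(struct pcpu_route *) *slot,
                                           struct pcpu_route *fresh)
{
    struct pcpu_route *expected = NULL;

    assert(fresh != NULL);

    if (atomic_compare_exchange_strong(slot, &expected, fresh))
        return fresh;       /* slot was empty, our clone is now installed */

    /* Lost the race: 'expected' now holds the existing ("prev") clone. */
    free(fresh);
    return expected;
}
```

      Returning the existing clone keeps every caller on the same object and
      means the losing allocation is released in one place.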
      
      v3 -> v4:
      - Patch 8 is new. It keeps track of the DST_NOCACHE routes in a list to handle
        the iface down/unregister event.
      
      - Remove rcu from the newly added rt6i_pcpu variable.  It is not needed
        because it has already been protected by the existing reader/writer lock.
      
      - Thanks to 'Julian Anastasov <ja@ssi.bg>' for testing the FLOWI_FLAG_KNOWN_NH
        patches.
      
      v2 -> v3:
      - Patches 5 to 7 are new.  They take care of cases where the daddr in
        skb is not the one used to do the route look-up.  There are also
        related changes to rt6_nexthop() since v2, which are in patch 2/9.
        Thanks to 'Julian Anastasov <ja@ssi.bg>' for pointing it out.
      
      - Fix a few problems in __ip6_rt_update_pmtu(), like setting the expire
        and mtu before inserting to the tree and not doing dst_destroy() after
        tree insertion failure.  Also update the rt6i_pmtu in fib6_add_rt2node().
        Thanks to 'Steffen Klassert <steffen.klassert@secunet.com>' for pointing
        it out.
      
      - Merge ip6_pmtu_rt_cache_alloc() into ip6_rt_cache_alloc().
      
      v1 -> v2:
      - Move the /128 route bug fixes to another series (accepted).
      - Create a function for checking (rt6i_flags & (RTF_NONEXTHOP | RTF_GATEWAY)).
      - Avoid shuffling the skb network_header.  Instead, change the function
        signature to take iph instead of skb.
      
      - Many thanks to 'Hannes Frederic Sowa <hannes@stressinduktion.org>' for
        reviewing v1 and v2 and giving advice.
      
      --Martin
      
      ~~~ start: v1 compose message (with the out-dated parts removed) ~~~
      
      This series is to avoid creating a RTF_CACHE route whenever we are consulting
      the fib6 tree with a new destination.  Instead, only create RTF_CACHE route
      when we see a pmtu exception.
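
      The lazy creation described above can be sketched as follows. This is
      a simplified model, not the fib6 code: struct route_sketch and
      clone_on_pmtu_exception() are hypothetical names, and the sketch treats
      "pmtu exception" as "a reported MTU smaller than the shared route's
      MTU".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified route entry. */
struct route_sketch {
    uint32_t mtu;
    bool     is_cache;  /* true for a per-destination cache clone */
};

/*
 * Lookups keep using the shared route. A per-destination clone is
 * allocated only when a smaller path MTU is reported for that
 * destination. Returns the new clone, or NULL when no clone is needed
 * (or the allocation failed).
 */
static struct route_sketch *
clone_on_pmtu_exception(const struct route_sketch *shared, uint32_t reported_mtu)
{
    struct route_sketch *clone;

    assert(shared != NULL);

    if (reported_mtu >= shared->mtu)
        return NULL;    /* no exception: keep sharing the route */

    clone = malloc(sizeof(*clone));
    if (clone == NULL)
        return NULL;

    clone->mtu = reported_mtu;
    clone->is_cache = true;
    return clone;
}
```

      Under this model, destinations that never report a smaller MTU never
      cause an allocation or a tree insertion.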
      
      Out of all ipv6 RTF_CACHE routes that are created, the percentage that has
      a different mtu is very small. In one of our end-user facing proxy servers,
      only 1k out of 80k RTF_CACHE routes have a smaller MTU.  For our DC
      traffic, there is no mtu exception.
      
      A large fib6 tree has problems: 'ip -6 r show' takes a long time, and
      gc may kick in too often.  Also, when a service has restarted and a lot
      of new TCP conn requests come in, it creates pressure on the tree by
      inserting a lot of RTF_CACHE routes in a short time, and it currently
      requires a write lock to do that.
      
      The first few patches are prep work to remove the assumption that the
      returned rt is always RTF_CACHE.
      
      The patch 'ipv6: Only create RTF_CACHE routes after encountering pmtu exception'
      does the lazy RTF_CACHE route creation.
      
      The following patches add percpu rt to compensate for the performance loss
      after doing the RTF_CACHE lazy creation.
      
      Here are some numbers from the udpflood test.  The udpflood has been
      slightly modified to have a time limit instead of a count limit.
      
      A /64 via gateway route is used for the test. Each udpflood uses 10000 dst
      addresses.  The dst addresses of different udpflood processes do not overlap
      with each other.
      
      1                    16M                          15M
      10                   61M                          61M
      20                   65M                          62M
      40                   88M                          83M
      
      ~~~ end: v1 compose message ~~~
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c1a34035