1. 21 Sep, 2015 3 commits
  2. 18 Sep, 2015 17 commits
    • tcp_cubic: do not set epoch_start in the future · c2e7204d
      Eric Dumazet authored
      Tracking idle time in bictcp_cwnd_event() is imprecise, as epoch_start
      is normally set at ACK processing time, not at send time.
      
      A proper fix would need an additional state variable, which does not
      seem worth the trouble given that the CUBIC bug had been there
      forever before Jana noticed it.
      
      Let's simply not set epoch_start in the future, otherwise
      bictcp_update() could overflow and CUBIC would again
      grow cwnd too fast.
      
      This was detected thanks to a packetdrill test Neal wrote that was flaky
      before applying this fix.
      
      Fixes: 30927520 ("tcp_cubic: better follow cubic curve after idle period")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Cc: Jana Iyengar <jri@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
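The clamping described in the tcp_cubic commit can be sketched in user-space C. This is a minimal illustration, not the kernel code: `shift_epoch_start` and `time_after32` are hypothetical names, timestamps are modeled as wrapping 32-bit tick counters, and an epoch_start of 0 is assumed to mean "no epoch yet".

```c
#include <stdint.h>

/* Wraparound-safe "a is later than b" for 32-bit tick counters. */
int time_after32(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/*
 * Shift epoch_start forward by the idle time (now - last_send),
 * but never past 'now'. An epoch_start of 0 is left untouched.
 */
uint32_t shift_epoch_start(uint32_t epoch_start, uint32_t now,
                           uint32_t last_send)
{
    int32_t delta = (int32_t)(now - last_send);

    if (epoch_start == 0 || delta <= 0)
        return epoch_start;

    epoch_start += (uint32_t)delta;
    if (time_after32(epoch_start, now))
        epoch_start = now;      /* do not set epoch_start in the future */
    return epoch_start;
}
```

Without the final clamp, an epoch_start later than the current time would make the elapsed-time computation wrap around, which is presumably the overflow the commit message refers to.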
    • fjes: fix off-by-one error at fjes_hw_update_zone_task() · adb094e5
      Taku Izumi authored
      Dan Carpenter reported an off-by-one error in fjes at
      http://www.mail-archive.com/netdev@vger.kernel.org/msg77520.html

      This is indeed a bug: ep_shm_info[epidx].{es_status, zone} should be
      updated inside the for loop.

      This patch fixes this bug.
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • MAINTAINERS: remove bouncing email address for qlcnic · 6cf35642
      Jiri Benc authored
      I got this automated message from <shahed.shaikh@qlogic.com> when submitting
      a qlcnic patch:
      
      > Shahed Shaikh is no longer with QLogic. If you require assistance please
      > contact Ariel Elior Ariel.Elior@qlogic.com
      
      There's no point in having a bouncing address in MAINTAINERS.
      
      CC: Dept-GELinuxNICDev@qlogic.com
      CC: Ariel Elior <Ariel.Elior@qlogic.com>
      Signed-off-by: Jiri Benc <jbenc@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'vxlan-fixes' · 1bdc0b10
      David S. Miller authored
      Jiri Benc says:
      
      ====================
      vxlan fixes
      
      This fixes various issues with vxlan related to IPv6.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnx2x: track vxlan port count · ac7eccd4
      Jiri Benc authored
      The callback for adding vxlan port can be called with the same port for
      both IPv4 and IPv6. Do not disable the offloading when the same port for
      both protocols is added and later one of them removed.
      Signed-off-by: Jiri Benc <jbenc@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
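The port-count idea can be illustrated with a small user-space sketch. The struct and function names here are hypothetical and are not taken from the bnx2x driver; the sketch only assumes that a single UDP port can be offloaded at a time and that the add callback may fire once per address family.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-device offload state. */
struct vxlan_offload {
    uint16_t port;          /* UDP port currently offloaded */
    unsigned int count;     /* how many times that port was added */
};

/* Returns true if 'port' is offloaded after the call. */
bool offload_add_port(struct vxlan_offload *st, uint16_t port)
{
    if (st->count > 0) {
        if (st->port != port)
            return false;   /* only one port can be offloaded */
        st->count++;        /* same port again, e.g. IPv4 then IPv6 */
        return true;
    }
    st->port = port;
    st->count = 1;
    return true;
}

/* Returns true if offloading is still enabled after the call. */
bool offload_del_port(struct vxlan_offload *st, uint16_t port)
{
    if (st->count == 0 || st->port != port)
        return st->count > 0;   /* not our port: nothing changes */
    st->count--;
    return st->count > 0;       /* disabled only when the last user goes */
}
```

Adding the same port twice and removing it once leaves the offload enabled, which is the behaviour the commit message asks for.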
    • be2net: allow offloading with the same port for IPv4 and IPv6 · 1e5b311a
      Jiri Benc authored
      The callback for adding vxlan port can be called with the same port for both
      IPv4 and IPv6. Do not disable the offloading if this occurs.
      Signed-off-by: Jiri Benc <jbenc@redhat.com>
      Acked-by: Sathya Perla <sathya.perla@avagotech.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qlcnic: track vxlan port count · 378fddc2
      Jiri Benc authored
      The callback for adding vxlan port can be called with the same port for
      both IPv4 and IPv6. Do not disable the offloading when the same port for
      both protocols is added and later one of them removed.
      Signed-off-by: Jiri Benc <jbenc@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vxlan: reject IPv6 addresses if IPv6 is not configured · 057ba29b
      Jiri Benc authored
      When an IPv6 address is set without IPv6 configured, the vxlan socket is
      mostly treated as an IPv4 one, but various lookups in the fdb etc. still
      take AF_INET6 into account. This creates inconsistencies with weird
      consequences.

      Just reject IPv6 addresses in such a case.
      Signed-off-by: Jiri Benc <jbenc@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
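A minimal sketch of this kind of early rejection, in user-space C. The function name `check_remote_family`, the `ipv6_enabled` parameter and the choice of `-EPFNOSUPPORT` as the error code are assumptions made for illustration; the kernel decides this at build time through its configuration rather than through a runtime flag.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/socket.h>

/*
 * Validate the address family of a configured remote address.
 * Returns 0 if acceptable, or a negative errno value otherwise.
 */
int check_remote_family(int family, bool ipv6_enabled)
{
    if (family == AF_INET6 && !ipv6_enabled)
        return -EPFNOSUPPORT;   /* reject instead of half-handling it */
    return 0;
}
```

Failing at configuration time keeps later lookups from ever seeing an AF_INET6 address on a socket that is otherwise treated as IPv4.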
    • vxlan: set needed headroom correctly · 9dc2ad10
      Jiri Benc authored
      vxlan_setup is called when allocating the net_device, i.e. way before
      vxlan_newlink (or vxlan_dev_configure) is called. This means
      vxlan->default_dst is actually unset in vxlan_setup and the condition that
      sets needed_headroom always takes the else branch.
      
      Set needed_headroom at the point where the address family information
      is available.
      
      Fixes: e4c7ed41 ("vxlan: add ipv6 support")
      Fixes: 2853af6a ("vxlan: use dev->needed_headroom instead of dev->hard_header_len")
      CC: Cong Wang <cwang@twopensource.com>
      Signed-off-by: Jiri Benc <jbenc@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
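To show why the address family matters for the headroom, here is a small sketch that computes the encapsulation overhead once the family is known. The function and constant names are hypothetical; the sizes are the standard minimum header lengths (IPv4 20, IPv6 40, UDP 8, VXLAN 8, Ethernet 14 bytes), and the exact value the driver stores in needed_headroom may include more than this.

```c
#include <sys/socket.h>

enum {
    ETH_HDR_LEN   = 14,
    IPV4_HDR_LEN  = 20,
    IPV6_HDR_LEN  = 40,
    UDP_HDR_LEN   = 8,
    VXLAN_HDR_LEN = 8
};

/*
 * Encapsulation overhead in bytes: outer IP + UDP + VXLAN + Ethernet.
 * Must be called only after the remote address family is known.
 */
int vxlan_encap_overhead(int family)
{
    int ip_len = (family == AF_INET6) ? IPV6_HDR_LEN : IPV4_HDR_LEN;

    return ip_len + UDP_HDR_LEN + VXLAN_HDR_LEN + ETH_HDR_LEN;
}
```

If this were evaluated before the family is set, the IPv4 branch would always be chosen and IPv6 tunnels would come up 20 bytes short.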
    • MAINTAINERS: add arcnet and take maintainership · c38f6ac7
      Michael Grzeschik authored
      Add entry for arcnet to MAINTAINERS file and add myself as the
      maintainer of the subsystem.
      Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
      Cc: davem@davemloft.net
      Cc: joe@perches.com
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ARCNET: fix hard_header_len limit · 980137a2
      Michael Grzeschik authored
      For arcnet the bare minimum header only contains the 4 bytes to
      specify source, dest and offset (1, 1 and 2 bytes respectively).
      The corresponding struct is struct arc_hardware.
      
      The struct archdr contains additionally a union of possible soft
      headers. When doing $insertusecasehere packets might well
      include short (or even no?) soft headers.
      
      For this reason only use arc_hardware instead of archdr to
      determine the hard_header_len for an arcnet device.
      Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
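The size difference between the two header structs can be made concrete with a sketch. The struct names below are stand-ins, not the kernel definitions: the hardware header follows the 1 + 1 + 2 byte layout given in the commit message, while the soft-header union is a hypothetical 8-byte placeholder.

```c
#include <stddef.h>
#include <stdint.h>

/* Bare minimum hardware header: source, dest and a 2-byte offset. */
struct arc_hw_hdr {
    uint8_t source;
    uint8_t dest;
    uint8_t offset[2];
};

/* Full header: hardware header plus a union of possible soft headers. */
struct arc_full_hdr {
    struct arc_hw_hdr hard;
    union {
        uint8_t raw[8];     /* hypothetical placeholder for soft headers */
    } soft;
};

_Static_assert(sizeof(struct arc_hw_hdr) == 4,
               "hardware header must be exactly 4 bytes");

/* Header length every packet is guaranteed to carry. */
size_t arc_hard_header_len(void)
{
    return sizeof(struct arc_hw_hdr);
}
```

Using the size of the full struct as the minimum would reject packets that carry a short soft header, or none at all.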
    • Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth · 1dbb2413
      David S. Miller authored
      Johan Hedberg says:
      
      ====================
      pull request: bluetooth 2015-09-17
      
      Here's one important patch for the 4.3-rc series that fixes an issue
      with Bluetooth LE encryption failing because of a too early check for
      the SMP context.
      
      Please let me know if there are any issues pulling. Thanks.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • atm: deal with setting entry before mkip was called · 34f5b006
      Sasha Levin authored
      If we didn't call ATMARP_MKIP before ATMARP_ENCAP, the VCC descriptor is
      non-existent and we'll end up dereferencing a NULL ptr:
      
      [1033173.491930] kasan: GPF could be caused by NULL-ptr deref or user memory accessirq event stamp: 123386
      [1033173.493678] general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
      [1033173.493689] Modules linked in:
      [1033173.493697] CPU: 9 PID: 23815 Comm: trinity-c64 Not tainted 4.2.0-next-20150911-sasha-00043-g353d875-dirty #2545
      [1033173.493706] task: ffff8800630c4000 ti: ffff880063110000 task.ti: ffff880063110000
      [1033173.493823] RIP: clip_ioctl (net/atm/clip.c:320 net/atm/clip.c:689)
      [1033173.493826] RSP: 0018:ffff880063117a88  EFLAGS: 00010203
      [1033173.493828] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 000000000000000c
      [1033173.493830] RDX: 0000000000000002 RSI: ffffffffb3f10720 RDI: 0000000000000014
      [1033173.493832] RBP: ffff880063117b80 R08: ffff88047574d9a4 R09: 0000000000000000
      [1033173.493834] R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff1000c622f53
      [1033173.493836] R13: ffff8800cb905500 R14: ffff8808d6da2000 R15: 00000000fffffdfd
      [1033173.493840] FS:  00007fa56b92d700(0000) GS:ffff880478000000(0000) knlGS:0000000000000000
      [1033173.493843] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [1033173.493845] CR2: 0000000000000000 CR3: 00000000630e8000 CR4: 00000000000006a0
      [1033173.493855] Stack:
      [1033173.493862]  ffffffffb0b60444 000000000000eaea 0000000041b58ab3 ffffffffb3c3ce32
      [1033173.493867]  ffffffffb0b6f3e0 ffffffffb0b60444 ffffffffb5ea2e50 1ffff1000c622f5e
      [1033173.493873]  ffff8800630c4cd8 00000000000ee09a ffffffffb3ec4888 ffffffffb5ea2de8
      [1033173.493874] Call Trace:
      [1033173.494108] do_vcc_ioctl (net/atm/ioctl.c:170)
      [1033173.494113] vcc_ioctl (net/atm/ioctl.c:189)
      [1033173.494116] svc_ioctl (net/atm/svc.c:605)
      [1033173.494200] sock_do_ioctl (net/socket.c:874)
      [1033173.494204] sock_ioctl (net/socket.c:958)
      [1033173.494244] do_vfs_ioctl (fs/ioctl.c:43 fs/ioctl.c:607)
      [1033173.494290] SyS_ioctl (fs/ioctl.c:622 fs/ioctl.c:613)
      [1033173.494295] entry_SYSCALL_64_fastpath (arch/x86/entry/entry_64.S:186)
      [1033173.494362] Code: fa 48 c1 ea 03 80 3c 02 00 0f 85 50 09 00 00 49 8b 9e 60 06 00 00 48 b8 00 00 00 00 00 fc ff df 48 8d 7b 14 48 89 fa 48 c1 ea 03 <0f> b6 04 02 48 89 fa 83 e2 07 38 d0 7f 08 84 c0 0f 85 14 09 00
      All code
      ========
         0:   fa                      cli
         1:   48 c1 ea 03             shr    $0x3,%rdx
         5:   80 3c 02 00             cmpb   $0x0,(%rdx,%rax,1)
         9:   0f 85 50 09 00 00       jne    0x95f
         f:   49 8b 9e 60 06 00 00    mov    0x660(%r14),%rbx
        16:   48 b8 00 00 00 00 00    movabs $0xdffffc0000000000,%rax
        1d:   fc ff df
        20:   48 8d 7b 14             lea    0x14(%rbx),%rdi
        24:   48 89 fa                mov    %rdi,%rdx
        27:   48 c1 ea 03             shr    $0x3,%rdx
        2b:*  0f b6 04 02             movzbl (%rdx,%rax,1),%eax               <-- trapping instruction
        2f:   48 89 fa                mov    %rdi,%rdx
        32:   83 e2 07                and    $0x7,%edx
        35:   38 d0                   cmp    %dl,%al
        37:   7f 08                   jg     0x41
        39:   84 c0                   test   %al,%al
        3b:   0f 85 14 09 00 00       jne    0x955
      
      Code starting with the faulting instruction
      ===========================================
         0:   0f b6 04 02             movzbl (%rdx,%rax,1),%eax
         4:   48 89 fa                mov    %rdi,%rdx
         7:   83 e2 07                and    $0x7,%edx
         a:   38 d0                   cmp    %dl,%al
         c:   7f 08                   jg     0x16
         e:   84 c0                   test   %al,%al
        10:   0f 85 14 09 00 00       jne    0x92a
      [1033173.494366] RIP clip_ioctl (net/atm/clip.c:320 net/atm/clip.c:689)
      [1033173.494368]  RSP <ffff880063117a88>
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
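The ordering problem behind this crash can be sketched as a guard in front of the dereference. Everything here is hypothetical and simplified: `struct vcc`, `struct clip_state` and `set_encap` only model the idea that one ioctl creates per-connection state and a second ioctl uses it, and `-EINVAL` is an assumed error code.

```c
#include <errno.h>
#include <stddef.h>

/* State that only exists after the "make IP" setup step has run. */
struct clip_state {
    int encap;
};

struct vcc {
    struct clip_state *clip;    /* NULL until the setup step is done */
};

/* Set the encapsulation mode; fails cleanly if setup never happened. */
int set_encap(struct vcc *vcc, int mode)
{
    if (vcc->clip == NULL)
        return -EINVAL;         /* setup step was skipped: do not deref */
    vcc->clip->encap = mode;
    return 0;
}
```

A fuzzer such as trinity issues ioctls in arbitrary order, so any handler that relies on an earlier call having run needs a check of this kind.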
    • ipv6: ip6_fragment: fix headroom tests and skb leak · 1d325d21
      Florian Westphal authored
      David Woodhouse reports skb_under_panic when we try to push ethernet
      header to fragmented ipv6 skbs:
      
       skbuff: skb_under_panic: text:c1277f1e len:1294 put:14 head:dec98000
       data:dec97ffc tail:0xdec9850a end:0xdec98f40 dev:br-lan
      [..]
      ip6_finish_output2+0x196/0x4da
      
      David further debugged this:
        [..] offending fragments were arriving here with skb_headroom(skb)==10.
        Which is reasonable, being the Solos ADSL card's header of 8 bytes
        followed by 2 bytes of PPP frame type.
      
      The problem is that if netfilter ipv6 defragmentation is used, skb_cow()
      in ip6_forward will only see reassembled skb.
      
      Therefore, headroom is overestimated by 8 bytes (we pulled fragment
      header) and we don't check the skbs in the frag_list either.
      
      We can't do these checks in netfilter defrag since outdev isn't known yet.
      
      Furthermore, existing tests in ip6_fragment did not consider the fragment
      or ipv6 header size when checking headroom of the fraglist skbs.
      
      While at it, also fix a skb leak on memory allocation -- ip6_fragment
      must consume the skb.
      
      I tested this with an e1000 driver hacked to not allocate additional
      headroom (we end up in slowpath, since LL_RESERVED_SPACE is 16).
      
      If 2 bytes of headroom are allocated, fastpath is taken (14 byte
      ethernet header was pulled, so 16 byte headroom available in all
      fragments).
      Reported-by: David Woodhouse <dwmw2@infradead.org>
      Diagnosed-by: David Woodhouse <dwmw2@infradead.org>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Tested-by: David Woodhouse <David.Woodhouse@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
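The headroom test for the fast path can be sketched roughly as follows. This is an assumption-laden model, not the code from ip6_fragment: headroom values are passed in as plain numbers, `can_use_fast_path` is a hypothetical name, and the rule applied is the one the message describes, namely that the fragment header and IPv6 header sizes must be counted in addition to the link-layer reserve.

```c
#include <stdbool.h>
#include <stddef.h>

enum { FRAG_HDR_LEN = 8 };  /* size of the IPv6 fragment header */

/*
 * Decide whether pre-split fragments can be sent as they are.
 * head_headroom:  headroom of the first skb, which keeps its IPv6 header
 * frag_headroom:  headroom of each skb on the fragment list
 * ll_reserved:    space the output device needs for its link-layer header
 * ipv6_hdr_len:   length of the IPv6 header(s) copied to each fragment
 */
bool can_use_fast_path(size_t head_headroom,
                       const size_t *frag_headroom, size_t nfrags,
                       size_t ll_reserved, size_t ipv6_hdr_len)
{
    if (head_headroom < ll_reserved + FRAG_HDR_LEN)
        return false;

    for (size_t i = 0; i < nfrags; i++) {
        /* each list member also needs room for its own IPv6 header */
        if (frag_headroom[i] < ll_reserved + ipv6_hdr_len + FRAG_HDR_LEN)
            return false;
    }
    return true;
}
```

When the check fails, the caller would fall back to the slow path, which allocates new buffers with enough headroom.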
    • solos-pci: Increase headroom on received packets · ce816eb0
      David Woodhouse authored
      A comment in include/linux/skbuff.h says that:
      
       * Various parts of the networking layer expect at least 32 bytes of
       * headroom, you should not reduce this.
      
      This was demonstrated by a panic when handling fragmented IPv6 packets:
      http://marc.info/?l=linux-netdev&m=144236093519172&w=2
      
      It's not entirely clear if that comment is still valid — and if it is,
      perhaps netif_rx() ought to be enforcing it with a warning.
      
      But either way, it is rather stupid from a performance point of view
      for us to be receiving packets into a buffer which doesn't have enough
      room to prepend an Ethernet header, since it means that *every* incoming
      packet is going to need to be reallocated. So let's fix that.
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ks8851: Export OF module alias information · 88c79664
      Javier Martinez Canillas authored
      Drivers need to export the OF id table, and it must be built into
      the module, or udev won't have the necessary information to autoload
      the driver module when the device is registered via OF.
      Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx4_en: really allow to change RSS key · 4671fc6d
      Eric Dumazet authored
      When changing the RSS key, we do not want to overwrite the user-provided
      key with the one provided by netdev_rss_key_fill(), which is the host
      random key generated at boot time.
      
      Fixes: 947cbb0a ("net/mlx4_en: Support for configurable RSS hash function")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Eyal Perry <eyalpe@mellanox.com>
      CC: Amir Vadai <amirv@mellanox.com>
      Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
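The intended key selection can be sketched in a few lines. The function name, the 40-byte key length and the convention that a NULL pointer means "no user key" are all assumptions for this illustration and are not taken from the mlx4_en driver.

```c
#include <stdint.h>
#include <string.h>

enum { RSS_KEY_LEN = 40 };  /* assumed key size for this sketch */

/*
 * Fill 'out' with the key to program into the hardware: the user's key
 * if one was supplied, otherwise the boot-time random key.
 */
void rss_key_select(uint8_t out[RSS_KEY_LEN], const uint8_t *user_key,
                    const uint8_t boot_key[RSS_KEY_LEN])
{
    memcpy(out, user_key != NULL ? user_key : boot_key, RSS_KEY_LEN);
}
```

The bug described in the commit corresponds to always copying the boot-time key at this point, which silently discards whatever the user configured.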
  3. 17 Sep, 2015 7 commits
  4. 15 Sep, 2015 13 commits
    • dccp: drop null test before destroy functions · 20471ed4
      Julia Lawall authored
      Remove unneeded NULL test.
      
      The semantic patch that makes this change is as follows:
      (http://coccinelle.lip6.fr/)
      
      // <smpl>
      @@
      expression x;
      @@
      
      -if (x != NULL)
        \(kmem_cache_destroy\|mempool_destroy\|dma_pool_destroy\)(x);
      
      @@
      expression x;
      @@
      
      -if (x != NULL) {
        \(kmem_cache_destroy\|mempool_destroy\|dma_pool_destroy\)(x);
        x = NULL;
      -}
      // </smpl>
      Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
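The semantic patch only makes sense if the destroy functions tolerate a NULL argument themselves, much like free() does. The sketch below shows that pattern with a hypothetical `pool` type in user-space C; it is an analogy for the kernel helpers named in the patch, not their implementation.

```c
#include <stdlib.h>

struct pool {
    void *mem;
};

/* Returns a new pool, or NULL if allocation fails. */
struct pool *pool_create(size_t size)
{
    struct pool *p = malloc(sizeof(*p));

    if (p == NULL)
        return NULL;
    p->mem = malloc(size);
    if (p->mem == NULL) {
        free(p);
        return NULL;
    }
    return p;
}

/* Safe to call with NULL, so callers need no test of their own. */
void pool_destroy(struct pool *p)
{
    if (p == NULL)
        return;
    free(p->mem);
    free(p);
}

struct ctx {
    struct pool *a;
    struct pool *b;
};

/* Teardown after the cleanup: no "if (x != NULL)" around each call. */
void ctx_teardown(struct ctx *c)
{
    pool_destroy(c->a);
    c->a = NULL;
    pool_destroy(c->b);
    c->b = NULL;
}
```

Moving the check into the destroy function removes one branch per call site and makes error paths shorter.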
    • net: core: drop null test before destroy functions · adf78eda
      Julia Lawall authored
      Remove unneeded NULL test.
      
      The semantic patch that makes this change is as follows:
      (http://coccinelle.lip6.fr/)
      
      // <smpl>
      @@ expression x; @@
      -if (x != NULL) {
        \(kmem_cache_destroy\|mempool_destroy\|dma_pool_destroy\)(x);
        x = NULL;
      -}
      // </smpl>
      Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • atm: he: drop null test before destroy functions · 58d29e3c
      Julia Lawall authored
      Remove unneeded NULL test.
      
      The semantic patch that makes this change is as follows:
      (http://coccinelle.lip6.fr/)
      
      // <smpl>
      @@ expression x; @@
      -if (x != NULL)
        \(kmem_cache_destroy\|mempool_destroy\|dma_pool_destroy\)(x);
      // </smpl>
      Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • openvswitch: Fix mask generation for nested attributes. · 982b5270
      Jesse Gross authored
      Masks were added to OVS flows in a way that was backwards compatible
      with userspace programs that did not generate masks. As a result, it is
      possible that we may receive flows that do not have a mask and we need
      to synthesize one.
      
      Generating a mask requires iterating over attributes and descending into
      nested attributes. For each level we need to know the size to generate the
      correct mask. We do this with a linked table of attribute types.
      
      Although the logic to handle these nested attributes was there in concept,
      there are a number of bugs in practice. Examples include incomplete links
      between tables, variable length attributes being treated as nested and
      missing sanity checks.
      Signed-off-by: Jesse Gross <jesse@nicira.com>
      Acked-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: Use msleep rather then udelay for reset delay · 892aa01d
      Sjoerd Simons authored
      The reset delays used for stmmac are in the order of 10ms to 1 second,
      which is far too long for udelay usage, so switch to using msleep.
      
      Practically this fixes the PHY not being reliably detected in some cases
      as udelay wouldn't actually delay for long enough to let the phy
      reliably be reset.
      Signed-off-by: Sjoerd Simons <sjoerd.simons@collabora.co.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rtnetlink: catch -EOPNOTSUPP errors from ndo_bridge_getlink · d64f69b0
      Roopa Prabhu authored
      problem reported:
      	kernel 4.1.3
      	------------
      	# bridge vlan
      	port	vlan ids
      	eth0	 1 PVID Egress Untagged
      	 	90
      	 	91
      	 	92
      	 	93
      	 	94
      	 	95
      	 	96
      	 	97
      	 	98
      	 	99
      	 	100
      
      	vmbr0	 1 PVID Egress Untagged
      	 	94
      
      	kernel 4.2
      	-----------
      	# bridge vlan
      	port	vlan ids
      
      ndo_bridge_getlink can return -EOPNOTSUPP when an interface's
      ndo_bridge_getlink op is set to switchdev_port_bridge_getlink
      and CONFIG_SWITCHDEV is not defined. Today this can happen with
      bond, rocker and team devices. This patch adds -EOPNOTSUPP
      checks after calls to ndo_bridge_getlink.
      
      Fixes: 85fdb956 ("switchdev: cut over to new switchdev_port_bridge_getlink")
      Reported-by: Alexandre DERUMIER <aderumier@odiso.com>
      Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
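The effect of the added checks can be modelled with a small dump loop. All names here are hypothetical (`getlink_fn`, `dump_bridge_links` and the three sample callbacks); the sketch assumes that a callback returning -EOPNOTSUPP should be skipped while any other negative value aborts the dump.

```c
#include <errno.h>
#include <stddef.h>

typedef int (*getlink_fn)(int ifindex);

/* Sample callbacks standing in for per-device getlink operations. */
int getlink_ok(int ifindex)          { (void)ifindex; return 0; }
int getlink_unsupported(int ifindex) { (void)ifindex; return -EOPNOTSUPP; }
int getlink_fail(int ifindex)        { (void)ifindex; return -ENOMEM; }

/*
 * Walk all devices and dump those that support the operation.
 * Returns the number of devices dumped, or a negative errno value
 * on the first real failure.
 */
int dump_bridge_links(const getlink_fn *ops, size_t n)
{
    int dumped = 0;

    for (size_t i = 0; i < n; i++) {
        int err;

        if (ops[i] == NULL)
            continue;           /* device has no getlink op at all */
        err = ops[i]((int)i);
        if (err == -EOPNOTSUPP)
            continue;           /* skip this device, keep dumping */
        if (err < 0)
            return err;         /* any other error is fatal */
        dumped++;
    }
    return dumped;
}
```

Without the -EOPNOTSUPP case, the first unsupported device would end the loop, which would explain an empty `bridge vlan` listing like the one reported on kernel 4.2.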
    • net: mvneta: fix DMA buffer unmapping in mvneta_rx() · daf158d0
      Simon Guinot authored
      This patch fixes a regression introduced by the commit a84e3289
      ("net: mvneta: fix refilling for Rx DMA buffers"). Due to this commit
      the newly allocated Rx buffers are DMA-unmapped in place of those passed
      to the networking stack. Obviously, this causes data corruptions.
      
      This patch fixes the issue by ensuring that the right Rx buffers are
      DMA-unmapped.
      Reported-by: Oren Laskin <oren@igneous.io>
      Signed-off-by: Simon Guinot <simon.guinot@sequanux.org>
      Fixes: a84e3289 ("net: mvneta: fix refilling for Rx DMA buffers")
      Cc: <stable@vger.kernel.org> # v3.8+
      Tested-by: Oren Laskin <oren@igneous.io>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'ip6tunnel_dst' · 244b7f43
      David S. Miller authored
      Martin KaFai Lau says:
      
      ====================
      ipv6: Fix dst_entry refcnt bugs in ip6_tunnel
      
      v4:
      - Fix a compilation error in patch 5 when CONFIG_LOCKDEP is turned on and
        re-test it
      
      v3:
      - Merge a 'if else if' test in patch 4
      - Use rcu_dereference_protected in patch 5 to fix a sparse check when
        CONFIG_SPARSE_RCU_POINTER is enabled
      
      v2:
      - Add patch 4 and 5 to remove the spinlock
      
      v1:
      This patch series is to fix the dst refcnt bugs in ip6_tunnel.
      
      Patch 1 and 2 are the prep works.  Patch 3 is the fix.
      
      I can reproduce the bug by adding and removing the ip6gre tunnel
      while running a super_netperf TCP_CRR test.  I get the following
      trace by adding WARN_ON_ONCE(newrefcnt < 0) to dst_release():
      
      [  312.760432] ------------[ cut here ]------------
      [  312.774664] WARNING: CPU: 2 PID: 10263 at net/core/dst.c:288 dst_release+0xf3/0x100()
      [  312.776041] Modules linked in: k10temp coretemp hwmon ip6_gre ip6_tunnel tunnel6 ipmi_devintf ipmi_ms\
      ghandler ip6table_filter ip6_tables xt_NFLOG nfnetlink_log nfnetlink xt_comment xt_statistic iptable_fil\
      ter ip_tables x_tables nfsv3 nfs_acl nfs fscache lockd grace mptctl netconsole autofs4 rpcsec_gss_krb5 a\
      uth_rpcgss oid_registry sunrpc ipv6 dm_mod loop iTCO_wdt iTCO_vendor_support serio_raw rtc_cmos pcspkr i\
      2c_i801 i2c_core lpc_ich mfd_core ehci_pci ehci_hcd e1000e mlx4_en ptp pps_core vxlan udp_tunnel ip6_udp\
      _tunnel mlx4_core sg button ext3 jbd mpt2sas raid_class
      [  312.785302] CPU: 2 PID: 10263 Comm: netperf Not tainted 4.2.0-rc8-00046-g4db9b63-dirty #15
      [  312.791695] Hardware name: Quanta Freedom /Windmill-EP, BIOS F03_3B04 09/12/2013
      [  312.792965]  ffffffff819dca2c ffff8811dfbdf6f8 ffffffff816537de ffff88123788fdb8
      [  312.794263]  0000000000000000 ffff8811dfbdf738 ffffffff81052646 ffff8811dfbdf768
      [  312.795593]  ffff881203a98180 00000000ffffffff ffff88242927a000 ffff88120a2532e0
      [  312.796946] Call Trace:
      [  312.797380]  [<ffffffff816537de>] dump_stack+0x45/0x57
      [  312.798288]  [<ffffffff81052646>] warn_slowpath_common+0x86/0xc0
      [  312.799699]  [<ffffffff8105273a>] warn_slowpath_null+0x1a/0x20
      [  312.800852]  [<ffffffff8159f9b3>] dst_release+0xf3/0x100
      [  312.801834]  [<ffffffffa03f1308>] ip6_tnl_dst_store+0x48/0x70 [ip6_tunnel]
      [  312.803738]  [<ffffffffa03fd0b6>] ip6gre_xmit2+0x536/0x720 [ip6_gre]
      [  312.804774]  [<ffffffffa03fd40a>] ip6gre_tunnel_xmit+0x16a/0x410 [ip6_gre]
      [  312.805986]  [<ffffffff8159934b>] dev_hard_start_xmit+0x23b/0x390
      [  312.808810]  [<ffffffff815a2f5f>] ? neigh_destroy+0xef/0x140
      [  312.809843]  [<ffffffff81599a6c>] __dev_queue_xmit+0x48c/0x4f0
      [  312.813931]  [<ffffffff81599ae3>] dev_queue_xmit_sk+0x13/0x20
      [  312.814993]  [<ffffffff815a0832>] neigh_direct_output+0x12/0x20
      [  312.817448]  [<ffffffffa021d633>] ip6_finish_output2+0x183/0x460 [ipv6]
      [  312.818762]  [<ffffffff81306fc5>] ? find_next_bit+0x15/0x20
      [  312.819671]  [<ffffffffa021fd79>] ip6_finish_output+0x89/0xe0 [ipv6]
      [  312.820720]  [<ffffffffa021fe14>] ip6_output+0x44/0xe0 [ipv6]
      [  312.821762]  [<ffffffff815c8809>] ? nf_hook_slow+0x69/0xc0
      [  312.823123]  [<ffffffffa021d232>] ip6_xmit+0x242/0x4c0 [ipv6]
      [  312.824073]  [<ffffffffa021c9f0>] ? ac6_proc_exit+0x20/0x20 [ipv6]
      [  312.825116]  [<ffffffffa024c751>] inet6_csk_xmit+0x61/0xa0 [ipv6]
      [  312.826127]  [<ffffffff815eb590>] tcp_transmit_skb+0x4f0/0x9b0
      [  312.827441]  [<ffffffff815ed267>] tcp_connect+0x637/0x7a0
      [  312.828327]  [<ffffffffa0245906>] tcp_v6_connect+0x2d6/0x550 [ipv6]
      [  312.829581]  [<ffffffff81606f05>] __inet_stream_connect+0x95/0x2f0
      [  312.830600]  [<ffffffff810ae13a>] ? hrtimer_try_to_cancel+0x1a/0xf0
      [  312.833456]  [<ffffffff812fba19>] ? timerqueue_add+0x59/0xb0
      [  312.834407]  [<ffffffff81607198>] inet_stream_connect+0x38/0x50
      [  312.835886]  [<ffffffff8157cb17>] SYSC_connect+0xb7/0xf0
      [  312.840035]  [<ffffffff810af6d3>] ? do_setitimer+0x1b3/0x200
      [  312.840983]  [<ffffffff810af75a>] ? alarm_setitimer+0x3a/0x70
      [  312.841941]  [<ffffffff8157d7ae>] SyS_connect+0xe/0x10
      [  312.842818]  [<ffffffff81659297>] entry_SYSCALL_64_fastpath+0x12/0x6a
      [  312.844206] ---[ end trace 43f3ecd86c3b1313 ]---
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Replace spinlock with seqlock and rcu in ip6_tunnel · 70da5b5c
      Martin KaFai Lau authored
      This patch uses a seqlock to ensure consistency between idst->dst and
      idst->cookie.  It also makes dst freeing from the fib tree undergo an
      RCU grace period.
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Avoid double dst_free · 8e3d5be7
      Martin KaFai Lau authored
      This is prep work to make dst freeing from the fib tree undergo
      an RCU grace period.

      The following is a common paradigm:
      if (ip6_del_rt(rt))
      	dst_free(rt);

      which means: if rt cannot be deleted from the fib tree, call dst_free(rt) now.
      1. We don't know the ip6_del_rt(rt) failure is because it
         was not managed by fib tree (e.g. DST_NOCACHE) or it had already been
         removed from the fib tree.
      2. If rt had been managed by the fib tree, ip6_del_rt(rt) failure means
         dst_free(rt) has been called already.  A second
         dst_free(rt) is not always obviously safe.  The rt may have
         been destroyed already.
      3. If rt is a DST_NOCACHE, dst_free(rt) should not be called.
      4. It blocks making dst freeing from the fib tree undergo an
         RCU grace period.
      
      This patch is to use a DST_NOCACHE flag to indicate a rt is
      not managed by the fib tree.
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Fix dst_entry refcnt bugs in ip6_tunnel · cdf3464e
      Martin KaFai Lau authored
      Problems in the current dst_entry cache in the ip6_tunnel:
      
      1. ip6_tnl_dst_set is racy.  There is no lock to protect it:
         - One major problem is that the dst refcnt gets messed up. E.g.
           the same dst_cache can be released multiple times, triggering
           the infamous dst refcnt < 0 warning message.
         - Another issue is the inconsistency between dst_cache and
           dst_cookie.
      
         It can be reproduced by adding and removing the ip6gre tunnel
         while running a super_netperf TCP_CRR test.
      
      2. ip6_tnl_dst_get does not take the dst refcnt before returning
         the dst.
      
      This patch:
      1. Create a percpu dst_entry cache in ip6_tnl
      2. Use a spinlock to protect the dst_cache operations
      3. ip6_tnl_dst_get always takes the dst refcnt before returning
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Rename the dst_cache helper functions in ip6_tunnel · f230d1e8
      Martin KaFai Lau authored
      This is prep work to fix the dst_entry refcnt bugs in
      ip6_tunnel.

      This patch renames:
      1. ip6_tnl_dst_check() to ip6_tnl_dst_get() to better
         reflect that it will take a dst refcnt in the next patch.
      2. ip6_tnl_dst_store() to ip6_tnl_dst_set() to have a more
         conventional name matching with ip6_tnl_dst_get().
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Refactor common ip6gre_tunnel_init codes · a3c119d3
      Martin KaFai Lau authored
      This is prep work to fix the dst_entry refcnt bugs in ip6_tunnel.

      This patch refactors some common init code used by both
      ip6gre_tunnel_init and ip6gre_tap_init.
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>