1. 09 Sep, 2015 12 commits
    • Daniel Borkmann's avatar
      ebpf: fix fd refcount leaks related to maps in bpf syscall · 592867bf
      Daniel Borkmann authored
      We may already have gotten a proper fd struct through fdget(), so
      whenever we return at the end of an map operation, we need to call
      fdput(). However, each map operation from syscall side first probes
      CHECK_ATTR() to verify that unused fields in the bpf_attr union are
      zero.
      
      In case of malformed input, we return with error, but the lookup to
      the map_fd was already performed at that time, so that we return
      without an corresponding fdput(). Fix it by performing an fdget()
      only right before bpf_map_get(). The fdget() invocation on maps in
      the verifier is not affected.
      
      Fixes: db20fd2b ("bpf: add lookup/update/delete/iterate methods to BPF maps")
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      592867bf
    • Sasha Levin's avatar
      RDS: verify the underlying transport exists before creating a connection · 74e98eb0
      Sasha Levin authored
      There was no verification that an underlying transport exists when creating
      a connection, this would cause dereferencing a NULL ptr.
      
      It might happen on sockets that weren't properly bound before attempting to
      send a message, which will cause a NULL ptr deref:
      
      [135546.047719] kasan: GPF could be caused by NULL-ptr deref or user memory accessgeneral protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
      [135546.051270] Modules linked in:
      [135546.051781] CPU: 4 PID: 15650 Comm: trinity-c4 Not tainted 4.2.0-next-20150902-sasha-00041-gbaa1222-dirty #2527
      [135546.053217] task: ffff8800835bc000 ti: ffff8800bc708000 task.ti: ffff8800bc708000
      [135546.054291] RIP: __rds_conn_create (net/rds/connection.c:194)
      [135546.055666] RSP: 0018:ffff8800bc70fab0  EFLAGS: 00010202
      [135546.056457] RAX: dffffc0000000000 RBX: 0000000000000f2c RCX: ffff8800835bc000
      [135546.057494] RDX: 0000000000000007 RSI: ffff8800835bccd8 RDI: 0000000000000038
      [135546.058530] RBP: ffff8800bc70fb18 R08: 0000000000000001 R09: 0000000000000000
      [135546.059556] R10: ffffed014d7a3a23 R11: ffffed014d7a3a21 R12: 0000000000000000
      [135546.060614] R13: 0000000000000001 R14: ffff8801ec3d0000 R15: 0000000000000000
      [135546.061668] FS:  00007faad4ffb700(0000) GS:ffff880252000000(0000) knlGS:0000000000000000
      [135546.062836] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [135546.063682] CR2: 000000000000846a CR3: 000000009d137000 CR4: 00000000000006a0
      [135546.064723] Stack:
      [135546.065048]  ffffffffafe2055c ffffffffafe23fc1 ffffed00493097bf ffff8801ec3d0008
      [135546.066247]  0000000000000000 00000000000000d0 0000000000000000 ac194a24c0586342
      [135546.067438]  1ffff100178e1f78 ffff880320581b00 ffff8800bc70fdd0 ffff880320581b00
      [135546.068629] Call Trace:
      [135546.069028] ? __rds_conn_create (include/linux/rcupdate.h:856 net/rds/connection.c:134)
      [135546.069989] ? rds_message_copy_from_user (net/rds/message.c:298)
      [135546.071021] rds_conn_create_outgoing (net/rds/connection.c:278)
      [135546.071981] rds_sendmsg (net/rds/send.c:1058)
      [135546.072858] ? perf_trace_lock (include/trace/events/lock.h:38)
      [135546.073744] ? lockdep_init (kernel/locking/lockdep.c:3298)
      [135546.074577] ? rds_send_drop_to (net/rds/send.c:976)
      [135546.075508] ? __might_fault (./arch/x86/include/asm/current.h:14 mm/memory.c:3795)
      [135546.076349] ? __might_fault (mm/memory.c:3795)
      [135546.077179] ? rds_send_drop_to (net/rds/send.c:976)
      [135546.078114] sock_sendmsg (net/socket.c:611 net/socket.c:620)
      [135546.078856] SYSC_sendto (net/socket.c:1657)
      [135546.079596] ? SYSC_connect (net/socket.c:1628)
      [135546.080510] ? trace_dump_stack (kernel/trace/trace.c:1926)
      [135546.081397] ? ring_buffer_unlock_commit (kernel/trace/ring_buffer.c:2479 kernel/trace/ring_buffer.c:2558 kernel/trace/ring_buffer.c:2674)
      [135546.082390] ? trace_buffer_unlock_commit (kernel/trace/trace.c:1749)
      [135546.083410] ? trace_event_raw_event_sys_enter (include/trace/events/syscalls.h:16)
      [135546.084481] ? do_audit_syscall_entry (include/trace/events/syscalls.h:16)
      [135546.085438] ? trace_buffer_unlock_commit (kernel/trace/trace.c:1749)
      [135546.085515] rds_ib_laddr_check(): addr 36.74.25.172 ret -99 node type -1
      Acked-by: default avatarSantosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      74e98eb0
    • David Vrabel's avatar
      xen-netback: require fewer guest Rx slots when not using GSO · 1d5d4852
      David Vrabel authored
      Commit f48da8b1 (xen-netback: fix
      unlimited guest Rx internal queue and carrier flapping) introduced a
      regression.
      
      The PV frontend in IPXE only places 4 requests on the guest Rx ring.
      Since netback required at least (MAX_SKB_FRAGS + 1) slots, IPXE could
      not receive any packets.
      
      a) If GSO is not enabled on the VIF, fewer guest Rx slots are required
         for the largest possible packet.  Calculate the required slots
         based on the maximum GSO size or the MTU.
      
         This calculation of the number of required slots relies on
         1650d545 (xen-netback: always fully coalesce guest Rx packets)
         which present in 4.0-rc1 and later.
      
      b) Reduce the Rx stall detection to checking for at least one
         available Rx request.  This is fine since we're predominately
         concerned with detecting interfaces which are down and thus have
         zero available Rx requests.
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Reviewed-by: default avatarWei Liu <wei.liu2@citrix.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1d5d4852
    • David S. Miller's avatar
      Merge branch 'cxgb4-fixes' · 9b57ab8b
      David S. Miller authored
      Hariprasad Shenai says:
      
      ====================
      cxgb4: Fix tx flit calculation and wc stat configuration
      
      This patch series fixes the following:
      Patch 1/2 fixes tx flit calculation, which if wrong can lead to
      stall, hang, data corrpution, write combining failure. Patch 2/2 fixes
      PCI-E write combining stats configuration.
      
      This patch series has been created against net tree and includes
      patches on cxgb4 driver.
      
      We have included all the maintainers of respective drivers. Kindly review
      the change and let us know in case of any review comments.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9b57ab8b
    • Hariprasad Shenai's avatar
      cxgb4: Fix for write-combining stats configuration · 2a485cf7
      Hariprasad Shenai authored
      The write-combining configuration register SGE_STAT_CFG_A needs to
      be configured after FW initializes the adapter, else FW will reset
      the configuration
      Signed-off-by: default avatarHariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2a485cf7
    • Hariprasad Shenai's avatar
      cxgb4: Fix tx flit calculation · fd1754fb
      Hariprasad Shenai authored
      In commit 0aac3f56 ("cxgb4: Add comment for calculate tx flits
      and sge length code") introduced a regression where tx flit calculation
      is going wrong, which can lead to data corruption, hang, stall and
      write-combining failure. Fixing it.
      Signed-off-by: default avatarHariprasad Shenai <hariprasad@chelsio.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      fd1754fb
    • Atsushi Nemoto's avatar
      net: eth: altera: Fix the initial device operstate · d43cefcd
      Atsushi Nemoto authored
      Call netif_carrier_off() prior to register_netdev(), otherwise
      userspace can see incorrect link state.
      Signed-off-by: default avatarAtsushi Nemoto <nemoto@toshiba-tops.co.jp>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d43cefcd
    • Kolmakov Dmitriy's avatar
      net: tipc: fix stall during bclink wakeup procedure · 7845989c
      Kolmakov Dmitriy authored
      If an attempt to wake up users of broadcast link is made when there is
      no enough place in send queue than it may hang up inside the
      tipc_sk_rcv() function since the loop breaks only after the wake up
      queue becomes empty. This can lead to complete CPU stall with the
      following message generated by RCU:
      
      INFO: rcu_sched self-detected stall on CPU { 0}  (t=2101 jiffies
      					g=54225 c=54224 q=11465)
      Task dump for CPU 0:
      tpch            R  running task        0 39949  39948 0x0000000a
       ffffffff818536c0 ffff88181fa037a0 ffffffff8106a4be 0000000000000000
       ffffffff818536c0 ffff88181fa037c0 ffffffff8106d8a8 ffff88181fa03800
       0000000000000001 ffff88181fa037f0 ffffffff81094a50 ffff88181fa15680
      Call Trace:
       <IRQ>  [<ffffffff8106a4be>] sched_show_task+0xae/0x120
       [<ffffffff8106d8a8>] dump_cpu_task+0x38/0x40
       [<ffffffff81094a50>] rcu_dump_cpu_stacks+0x90/0xd0
       [<ffffffff81097c3b>] rcu_check_callbacks+0x3eb/0x6e0
       [<ffffffff8106e53f>] ? account_system_time+0x7f/0x170
       [<ffffffff81099e64>] update_process_times+0x34/0x60
       [<ffffffff810a84d1>] tick_sched_handle.isra.18+0x31/0x40
       [<ffffffff810a851c>] tick_sched_timer+0x3c/0x70
       [<ffffffff8109a43d>] __run_hrtimer.isra.34+0x3d/0xc0
       [<ffffffff8109aa95>] hrtimer_interrupt+0xc5/0x1e0
       [<ffffffff81030d52>] ? native_smp_send_reschedule+0x42/0x60
       [<ffffffff81032f04>] local_apic_timer_interrupt+0x34/0x60
       [<ffffffff810335bc>] smp_apic_timer_interrupt+0x3c/0x60
       [<ffffffff8165a3fb>] apic_timer_interrupt+0x6b/0x70
       [<ffffffff81659129>] ? _raw_spin_unlock_irqrestore+0x9/0x10
       [<ffffffff8107eb9f>] __wake_up_sync_key+0x4f/0x60
       [<ffffffffa313ddd1>] tipc_write_space+0x31/0x40 [tipc]
       [<ffffffffa313dadf>] filter_rcv+0x31f/0x520 [tipc]
       [<ffffffffa313d699>] ? tipc_sk_lookup+0xc9/0x110 [tipc]
       [<ffffffff81659259>] ? _raw_spin_lock_bh+0x19/0x30
       [<ffffffffa314122c>] tipc_sk_rcv+0x2dc/0x3e0 [tipc]
       [<ffffffffa312e7ff>] tipc_bclink_wakeup_users+0x2f/0x40 [tipc]
       [<ffffffffa313ce26>] tipc_node_unlock+0x186/0x190 [tipc]
       [<ffffffff81597c1c>] ? kfree_skb+0x2c/0x40
       [<ffffffffa313475c>] tipc_rcv+0x2ac/0x8c0 [tipc]
       [<ffffffffa312ff58>] tipc_l2_rcv_msg+0x38/0x50 [tipc]
       [<ffffffff815a76d3>] __netif_receive_skb_core+0x5a3/0x950
       [<ffffffff815a98d3>] __netif_receive_skb+0x13/0x60
       [<ffffffff815a993e>] netif_receive_skb_internal+0x1e/0x90
       [<ffffffff815aa138>] napi_gro_receive+0x78/0xa0
       [<ffffffffa07f93f4>] tg3_poll_work+0xc54/0xf40 [tg3]
       [<ffffffff81597c8c>] ? consume_skb+0x2c/0x40
       [<ffffffffa07f9721>] tg3_poll_msix+0x41/0x160 [tg3]
       [<ffffffff815ab0f2>] net_rx_action+0xe2/0x290
       [<ffffffff8104b92a>] __do_softirq+0xda/0x1f0
       [<ffffffff8104bc26>] irq_exit+0x76/0xa0
       [<ffffffff81004355>] do_IRQ+0x55/0xf0
       [<ffffffff8165a12b>] common_interrupt+0x6b/0x6b
       <EOI>
      
      The issue occurs only when tipc_sk_rcv() is used to wake up postponed
      senders:
      
      	tipc_bclink_wakeup_users()
      		// wakeupq - is a queue which consists of special
      		// 		 messages with SOCK_WAKEUP type.
      		tipc_sk_rcv(wakeupq)
      			...
      			while (skb_queue_len(inputq)) {
      				filter_rcv(skb)
      					// Here the type of message is checked
      					// and if it is SOCK_WAKEUP then
      					// it tries to wake up a sender.
      					tipc_write_space(sk)
      						wake_up_interruptible_sync_poll()
      			}
      
      After the sender thread is woke up it can gather control and perform
      an attempt to send a message. But if there is no enough place in send
      queue it will call link_schedule_user() function which puts a message
      of type SOCK_WAKEUP to the wakeup queue and put the sender to sleep.
      Thus the size of the queue actually is not changed and the while()
      loop never exits.
      
      The approach I proposed is to wake up only senders for which there is
      enough place in send queue so the described issue can't occur.
      Moreover the same approach is already used to wake up senders on
      unicast links.
      
      I have got into the issue on our product code but to reproduce the
      issue I changed a benchmark test application (from
      tipcutils/demos/benchmark) to perform the following scenario:
      	1. Run 64 instances of test application (nodes). It can be done
      	   on the one physical machine.
      	2. Each application connects to all other using TIPC sockets in
      	   RDM mode.
      	3. When setup is done all nodes start simultaneously send
      	   broadcast messages.
      	4. Everything hangs up.
      
      The issue is reproducible only when a congestion on broadcast link
      occurs. For example, when there are only 8 nodes it works fine since
      congestion doesn't occur. Send queue limit is 40 in my case (I use a
      critical importance level) and when 64 nodes send a message at the
      same moment a congestion occurs every time.
      Signed-off-by: default avatarDmitry S Kolmakov <kolmakov.dmitriy@huawei.com>
      Reviewed-by: default avatarJon Maloy <jon.maloy@ericsson.com>
      Acked-by: default avatarYing Xue <ying.xue@windriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7845989c
    • Barry Song's avatar
      dm9000: fix a typo · 7b901873
      Barry Song authored
      Signed-off-by: default avatarBarry Song <Baohua.Song@csr.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7b901873
    • Vivien Didelot's avatar
      net: bridge: remove unnecessary switchdev include · 7a577f01
      Vivien Didelot authored
      Remove the unnecessary switchdev.h include from br_netlink.c.
      Signed-off-by: default avatarVivien Didelot <vivien.didelot@savoirfairelinux.com>
      Acked-by: default avatarJiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7a577f01
    • Vivien Didelot's avatar
      net: bridge: check __vlan_vid_del for error · bf361ad3
      Vivien Didelot authored
      Since __vlan_del can return an error code, change its inner function
      __vlan_vid_del to return an eventual error from switchdev_port_obj_del.
      Signed-off-by: default avatarVivien Didelot <vivien.didelot@savoirfairelinux.com>
      Acked-by: default avatarJiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bf361ad3
    • Florian Fainelli's avatar
      net: dsa: bcm_sf2: Fix ageing conditions and operation · 39797a27
      Florian Fainelli authored
      The comparison check between cur_hw_state and hw_state is currently
      invalid because cur_hw_state is right shifted by G_MISTP_SHIFT, while
      hw_state is not, so we end-up comparing bits 2:0 with bits 7:5, which is
      going to cause an additional aging to occur. Fix this by not shifting
      cur_hw_state while reading it, but instead, mask the value with the
      appropriately shitfted bitmask.
      
      The other problem with the fast-ageing process is that we did not set
      the EN_AGE_DYNAMIC bit to request the ageing to occur for dynamically
      learned MAC addresses. Finally, write back 0 to the FAST_AGE_CTRL
      register to avoid leaving spurious bits sets from one operation to the
      other.
      
      Fixes: 12f460f2 ("net: dsa: bcm_sf2: add HW bridging support")
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      39797a27
  2. 08 Sep, 2015 2 commits
    • Julien Grall's avatar
      device property: Don't overwrite addr when failing in device_get_mac_address · 5b902d6f
      Julien Grall authored
      The function device_get_mac_address is trying different property names
      in order to get the mac address. To check the return value, the variable
      addr (which contain the buffer pass by the caller) will be re-used. This
      means that if the previous property is not found, the next property will
      be read using a NULL buffer.
      
      Therefore it's only possible to retrieve the mac if node contains a
      property "mac-address". Fix it by using a temporary buffer for the
      return value.
      
      This has been introduced by commit 4c96b7dc
      "Add a matching set of device_ functions for determining mac/phy"
      Signed-off-by: default avatarJulien Grall <julien.grall@citrix.com>
      Cc: Jeremy Linton <jeremy.linton@arm.com>
      Cc: David S. Miller <davem@davemloft.net>
      Reviewed-by: default avatarJeremy Linton <jeremy.linton@arm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5b902d6f
    • Eugene Shatokhin's avatar
      usbnet: Fix a race between usbnet_stop() and the BH · fcb0bb6a
      Eugene Shatokhin authored
      The race may happen when a device (e.g. YOTA 4G LTE Modem) is
      unplugged while the system is downloading a large file from the Net.
      
      Hardware breakpoints and Kprobes with delays were used to confirm that
      the race does actually happen.
      
      The race is on skb_queue ('next' pointer) between usbnet_stop()
      and rx_complete(), which, in turn, calls usbnet_bh().
      
      Here is a part of the call stack with the code where the changes to the
      queue happen. The line numbers are for the kernel 4.1.0:
      
      *0 __skb_unlink (skbuff.h:1517)
          prev->next = next;
      *1 defer_bh (usbnet.c:430)
          spin_lock_irqsave(&list->lock, flags);
          old_state = entry->state;
          entry->state = state;
          __skb_unlink(skb, list);
          spin_unlock(&list->lock);
          spin_lock(&dev->done.lock);
          __skb_queue_tail(&dev->done, skb);
          if (dev->done.qlen == 1)
              tasklet_schedule(&dev->bh);
          spin_unlock_irqrestore(&dev->done.lock, flags);
      *2 rx_complete (usbnet.c:640)
          state = defer_bh(dev, skb, &dev->rxq, state);
      
      At the same time, the following code repeatedly checks if the queue is
      empty and reads these values concurrently with the above changes:
      
      *0  usbnet_terminate_urbs (usbnet.c:765)
          /* maybe wait for deletions to finish. */
          while (!skb_queue_empty(&dev->rxq)
              && !skb_queue_empty(&dev->txq)
              && !skb_queue_empty(&dev->done)) {
                  schedule_timeout(msecs_to_jiffies(UNLINK_TIMEOUT_MS));
                  set_current_state(TASK_UNINTERRUPTIBLE);
                  netif_dbg(dev, ifdown, dev->net,
                        "waited for %d urb completions\n", temp);
          }
      *1  usbnet_stop (usbnet.c:806)
          if (!(info->flags & FLAG_AVOID_UNLINK_URBS))
              usbnet_terminate_urbs(dev);
      
      As a result, it is possible, for example, that the skb is removed from
      dev->rxq by __skb_unlink() before the check
      "!skb_queue_empty(&dev->rxq)" in usbnet_terminate_urbs() is made. It is
      also possible in this case that the skb is added to dev->done queue
      after "!skb_queue_empty(&dev->done)" is checked. So
      usbnet_terminate_urbs() may stop waiting and return while dev->done
      queue still has an item.
      
      Locking in defer_bh() and usbnet_terminate_urbs() was revisited to avoid
      this race.
      Signed-off-by: default avatarEugene Shatokhin <eugene.shatokhin@rosalab.ru>
      Reviewed-by: default avatarBjørn Mork <bjorn@mork.no>
      Acked-by: default avatarOliver Neukum <oneukum@suse.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      fcb0bb6a
  3. 07 Sep, 2015 8 commits
  4. 06 Sep, 2015 7 commits
  5. 04 Sep, 2015 6 commits
  6. 03 Sep, 2015 5 commits
    • David S. Miller's avatar
      Merge branch 'sctp-fixes' · 724a7636
      David S. Miller authored
      Marcelo Ricardo Leitner says:
      
      ====================
      couple of sctp fixes for 0ca50d12
      
      These are two fixes for sctp after my patch on 0ca50d12 ("sctp: fix
      src address selection if using secondary addresses")
      
      The first, fix a dst leak on those it decided to skip.
      
      The second, adds the fallback on src selection that Vlad had asked
      about. Unfortunatelly a lot of ipvs setups relies on the old behavior
      and I don't see a better fix for it.
      
      Please consider both to -stable tree.
      ====================
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Acked-by: default avatarVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      724a7636
    • Marcelo Ricardo Leitner's avatar
      sctp: add routing output fallback · 410f0383
      Marcelo Ricardo Leitner authored
      Commit 0ca50d12 added a restriction that the address must belong to
      the output interface, so that sctp will use the right interface even
      when using secondary addresses.
      
      But it breaks IPVS setups, on which people is used to attach VIP
      addresses to loopback interface on real servers. It's preferred to
      attach to the interface actually in use, but it's a very common setup
      and that used to work.
      
      This patch then saves the first routing good result, even if it would be
      going out through an interface that doesn't have that address. If no
      better hit found, it's then used. This effectively restores the original
      behavior if no better interface could be found.
      
      Fixes: 0ca50d12 ("sctp: fix src address selection if using secondary addresses")
      Signed-off-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      410f0383
    • Marcelo Ricardo Leitner's avatar
      sctp: fix dst leak · d82f0f1f
      Marcelo Ricardo Leitner authored
      Commit 0ca50d12 failed to release the reference to dst entries that
      it decided to skip.
      
      Fixes: 0ca50d12 ("sctp: fix src address selection if using secondary addresses")
      Signed-off-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d82f0f1f
    • Atsushi Nemoto's avatar
      net: eth: altera: fix napi poll_list corruption · 4548a697
      Atsushi Nemoto authored
      tse_poll() calls __napi_complete() with irq enabled.  This leads napi
      poll_list corruption and may stop all napi drivers working.
      Use napi_complete() instead of __napi_complete().
      Signed-off-by: default avatarAtsushi Nemoto <nemoto@toshiba-tops.co.jp>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4548a697
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next · dd5cdb48
      Linus Torvalds authored
      Pull networking updates from David Miller:
       "Another merge window, another set of networking changes.  I've heard
        rumblings that the lightweight tunnels infrastructure has been voted
        networking change of the year.  But what do I know?
      
         1) Add conntrack support to openvswitch, from Joe Stringer.
      
         2) Initial support for VRF (Virtual Routing and Forwarding), which
            allows the segmentation of routing paths without using multiple
            devices.  There are some semantic kinks to work out still, but
            this is a reasonably strong foundation.  From David Ahern.
      
         3) Remove spinlock fro act_bpf fast path, from Alexei Starovoitov.
      
         4) Ignore route nexthops with a link down state in ipv6, just like
            ipv4.  From Andy Gospodarek.
      
         5) Remove spinlock from fast path of act_gact and act_mirred, from
            Eric Dumazet.
      
         6) Document the DSA layer, from Florian Fainelli.
      
         7) Add netconsole support to bcmgenet, systemport, and DSA.  Also
            from Florian Fainelli.
      
         8) Add Mellanox Switch Driver and core infrastructure, from Jiri
            Pirko.
      
         9) Add support for "light weight tunnels", which allow for
            encapsulation and decapsulation without bearing the overhead of a
            full blown netdevice.  From Thomas Graf, Jiri Benc, and a cast of
            others.
      
        10) Add Identifier Locator Addressing support for ipv6, from Tom
            Herbert.
      
        11) Support fragmented SKBs in iwlwifi, from Johannes Berg.
      
        12) Allow perf PMUs to be accessed from eBPF programs, from Kaixu Xia.
      
        13) Add BQL support to 3c59x driver, from Loganaden Velvindron.
      
        14) Stop using a zero TX queue length to mean that a device shouldn't
            have a qdisc attached, use an explicit flag instead.  From Phil
            Sutter.
      
        15) Use generic geneve netdevice infrastructure in openvswitch, from
            Pravin B Shelar.
      
        16) Add infrastructure to avoid re-forwarding a packet in software
            that was already forwarded by a hardware switch.  From Scott
            Feldman.
      
        17) Allow AF_PACKET fanout function to be implemented in a bpf
            program, from Willem de Bruijn"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1458 commits)
        netfilter: nf_conntrack: make nf_ct_zone_dflt built-in
        netfilter: nf_dup{4, 6}: fix build error when nf_conntrack disabled
        net: fec: clear receive interrupts before processing a packet
        ipv6: fix exthdrs offload registration in out_rt path
        xen-netback: add support for multicast control
        bgmac: Update fixed_phy_register()
        sock, diag: fix panic in sock_diag_put_filterinfo
        flow_dissector: Use 'const' where possible.
        flow_dissector: Fix function argument ordering dependency
        ixgbe: Resolve "initialized field overwritten" warnings
        ixgbe: Remove bimodal SR-IOV disabling
        ixgbe: Add support for reporting 2.5G link speed
        ixgbe: fix bounds checking in ixgbe_setup_tc for 82598
        ixgbe: support for ethtool set_rxfh
        ixgbe: Avoid needless PHY access on copper phys
        ixgbe: cleanup to use cached mask value
        ixgbe: Remove second instance of lan_id variable
        ixgbe: use kzalloc for allocating one thing
        flow: Move __get_hash_from_flowi{4,6} into flow_dissector.c
        ixgbe: Remove unused PCI bus types
        ...
      dd5cdb48