1. 10 Jun, 2016 24 commits
    • mlxsw: core: Remove deprecated create_workqueue · 3d5479e9
      Bhaktipriya Shridhar authored
      alloc_workqueue() replaces the deprecated create_workqueue().

      A dedicated workqueue is used because mlxsw_wq handles FDB notification
      processing, its work items are involved in normal device operation, and
      a network device can be depended upon during memory reclaim.

      The work items &mlxsw_sp->fdb_notify.dw and &trans->timeout_dw map to
      mlxsw_sp_fdb_notify_work (which processes FDB notifications from the
      underlying device, resolves the netdev to which the entry points, and
      notifies the bridge using the switchdev notifier) and
      mlxsw_emad_trans_timeout_work (which provides async EMAD register
      access), respectively. They require forward progress under memory
      pressure, hence WQ_MEM_RECLAIM has been set.

      Since there is only a fixed number of work items, an explicit concurrency
      limit is unnecessary here.
      Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
      Tested-by: Ido Schimmel <idosch@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3d5479e9
    • net: cavium: liquidio: Remove deprecated create_workqueue · 292b9dab
      Bhaktipriya Shridhar authored
      alloc_workqueue() replaces the deprecated create_workqueue().

      A dedicated workqueue is used because the work item
      (&lio->txq_status_wq.wk.work, which maps to octnet_poll_check_txq_status)
      runs a brief poll routine that checks transmit queue status
      and is an integral part of normal device operation.
      WQ_MEM_RECLAIM has been set to guarantee forward progress under memory
      pressure, which is a requirement here.
      Since there is only a fixed number of work items, an explicit concurrency
      limit is unnecessary.

      flush_workqueue() is unnecessary since destroy_workqueue() itself calls
      drain_workqueue(), which flushes repeatedly until the workqueue
      becomes empty.
      Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      292b9dab
    • net/mlx4_en: fix ethtool -x · f7d3c1cb
      Eric Dumazet authored
      mlx4 RSS is limited to spread incoming packets to a power of two number
      of queues.
      
      Uniformly distributed traffic would be split across queues 0 to N-1, N
      being a power of two, each queue having a 1/N weight.
      
      If number of RX queues is not a power of two, upper RX queues do not
      receive traffic.
      
      ethtool -x is lying, because it pretends some queues have higher weight.
      
      Before patch:
      
      lpaa24:~# ethtool -L eth1 rx 24
      lpaa24:~# ethtool -x eth1
      RX flow hash indirection table for eth1 with 24 RX ring(s):
          0:      0     1     2     3     4     5     6     7
          8:      8     9    10    11    12    13    14    15
         16:      0     1     2     3     4     5     6     7
      RSS hash key:
      e0:7c:3a:89:07:55:b6:58:69:cc:f4:e5:24:62:e3:25:88:6c:42:5b:d2:cb:9a:d2:e0:06:e1:dc:f9:09:a1:89:0f:a0:30:43:73:6f:0c:b6
      
      If this information were correct, user space tools could expect queues 0
      to 7 to receive twice as much traffic as queues 8 to 15.

      After patch:
      
      lpaa24:~# ethtool -L eth1 rx 24
      lpaa24:~# ethtool -x eth1
      RX flow hash indirection table for eth1 with 24 RX ring(s):
          0:      0     1     2     3     4     5     6     7
          8:      8     9    10    11    12    13    14    15
      RSS hash key:
      da:7b:09:60:f1:ac:67:b4:d0:72:d4:ec:a2:e5:80:0a:ad:50:22:1a:f8:f9:66:54:5f:22:45:c3:88:f4:57:82:c1:c1:90:ed:70:cb:40:ce
      lpaa24:~# ethtool -X eth1 equal 8
      lpaa24:~# ethtool -x eth1
      RX flow hash indirection table for eth1 with 24 RX ring(s):
          0:      0     1     2     3     4     5     6     7
          8:      0     1     2     3     4     5     6     7
      RSS hash key:
      da:7b:09:60:f1:ac:67:b4:d0:72:d4:ec:a2:e5:80:0a:ad:50:22:1a:f8:f9:66:54:5f:22:45:c3:88:f4:57:82:c1:c1:90:ed:70:cb:40:ce
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Maciej Żenczykowski <maze@google.com>
      Cc: Eugenia Emantayev <eugenia@mellanox.com>
      Cc: Wei Wang <weiwan@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f7d3c1cb
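
The power-of-two behaviour described in the mlx4 commit above can be modelled in a few lines of userspace C. This is only a sketch, not the driver code: rounddown_pow2() and fill_indir() are hypothetical names, and it assumes the indirection table has as many entries as the largest power of two that fits in the configured RX ring count (16 for 24 rings, matching the "after patch" output).

```c
#include <assert.h>
#include <stddef.h>

/* Largest power of two that is <= n; n must be non-zero. */
static unsigned int rounddown_pow2(unsigned int n)
{
	unsigned int p = 1;

	assert(n != 0);
	while (p <= n / 2)
		p *= 2;
	return p;
}

/*
 * Fill an indirection table sized for rx_rings, spreading entries evenly
 * over the first `equal` rings (what `ethtool -X ... equal N` asks for).
 * Returns the number of entries written, or 0 if the request does not fit.
 */
static size_t fill_indir(unsigned int *table, size_t cap,
			 unsigned int rx_rings, unsigned int equal)
{
	size_t n, i;

	if (rx_rings == 0)
		return 0;
	n = rounddown_pow2(rx_rings);
	if (n > cap || equal == 0 || equal > n)
		return 0;
	for (i = 0; i < n; i++)
		table[i] = (unsigned int)(i % equal);
	return n;
}
```

With 24 rings, fill_indir(table, 32, 24, 16) writes 16 entries (0 to 15), and fill_indir(table, 32, 24, 8) writes the pattern 0 to 7 twice, so no ring appears to carry a higher weight than it really has.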
    • r8152: replace netdev_alloc_skb_ip_align with napi_alloc_skb · c8d83963
      hayeswang authored
      Replace netdev_alloc_skb_ip_align() with napi_alloc_skb() which can save
      several CPU cycles by avoiding having to disable and re-enable IRQs.
      Signed-off-by: Hayes Wang <hayeswang@realtek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c8d83963
    • Merge branch 'bgmac-stats' · cef8a464
      David S. Miller authored
      Florian Fainelli says:
      
      ====================
      net: bgmac: Misc improvements
      
      This patch series adds minor changes to the bgmac driver:
      
      - properly bind net_device with its backing device structure such that
        we can locate the device using common helper functions
      
      - add support for ethtool statistics reading the HW MIB counters which
        is useful for debugging
      
      - add netdev statistics throughout the TX/RX path to know what is going on
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cef8a464
    • bgmac: Maintain some netdev statistics · 6d490f62
      Florian Fainelli authored
      Add a few netdev statistics to report transmitted and received bytes and
      packets and a few obvious errors.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6d490f62
    • bgmac: Add support for ethtool statistics · f6613d4f
      Florian Fainelli authored
      Read the statistics from the BGMAC's builtin MAC and return them to
      user-space using the standard ethtool helpers.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f6613d4f
    • bgmac: Bind net_device with backing device structure · 2022e9d5
      Florian Fainelli authored
      In preparation for allowing different helpers to be utilized against
      network devices created by the bgmac driver, make sure that we bind the
      net_device with core->dev.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2022e9d5
    • virtio_net: Update the feature bit to comply with spec · 7d84e37e
      Aaron Conole authored
      A draft version of the MTU Advice feature bit was specified as 25.  This
      bit is not within the allowed range for network device feature bits, and
      should be changed to be feature bit 3 to fully comply with the spec.
      
      Fixes: 14de9d11 ('virtio-net: Add initial MTU advice feature')
      Signed-off-by: Aaron Conole <aconole@redhat.com>
      Suggested-by: "Michael S. Tsirkin" <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7d84e37e
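
The range check behind the virtio_net change above can be illustrated with a small sketch. VIRTIO_NET_F_MTU is set to 3 as stated in the commit; the limit of 23 for device-specific feature bits is an assumption taken from a reading of the virtio specification, and both helper functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NET_F_MTU        3   /* bit number after this commit */
#define VIRTIO_DEV_FEATURE_MAX  23  /* assumed: bits 0..23 are device-specific */

/* True if `bit` lies in the range assumed to be reserved for device types. */
static bool feature_bit_is_device_specific(unsigned int bit)
{
	return bit <= VIRTIO_DEV_FEATURE_MAX;
}

/* Test a single feature bit in a 64-bit feature mask. */
static bool has_feature(uint64_t features, unsigned int bit)
{
	assert(bit < 64);
	return (features >> bit) & 1;
}
```

Under that assumption the draft value 25 falls outside the device-specific range while 3 falls inside it, and a device that still advertises only bit 25 no longer appears to offer the MTU feature.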
    • net: vrf: Fix crash when IPv6 is disabled at boot time · e4348637
      David Ahern authored
      Frank Kellermann reported a kernel crash with 4.5.0 when IPv6 is
      disabled at boot using the kernel option ipv6.disable=1. Using
      current net-next with the boot option:
      
      $ ip link add red type vrf table 1001
      
      Generates:
      [12210.919584] BUG: unable to handle kernel NULL pointer dereference at 0000000000000748
      [12210.921341] IP: [<ffffffff814b30e3>] fib6_get_table+0x2c/0x5a
      [12210.922537] PGD b79e3067 PUD bb32b067 PMD 0
      [12210.923479] Oops: 0000 [#1] SMP
      [12210.924001] Modules linked in: ipvlan 8021q garp mrp stp llc
      [12210.925130] CPU: 3 PID: 1177 Comm: ip Not tainted 4.7.0-rc1+ #235
      [12210.926168] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
      [12210.928065] task: ffff8800b9ac4640 ti: ffff8800bacac000 task.ti: ffff8800bacac000
      [12210.929328] RIP: 0010:[<ffffffff814b30e3>]  [<ffffffff814b30e3>] fib6_get_table+0x2c/0x5a
      [12210.930697] RSP: 0018:ffff8800bacaf888  EFLAGS: 00010202
      [12210.931563] RAX: 0000000000000748 RBX: ffffffff81a9e280 RCX: ffff8800b9ac4e28
      [12210.932688] RDX: 00000000000000e9 RSI: 0000000000000002 RDI: 0000000000000286
      [12210.933820] RBP: ffff8800bacaf898 R08: ffff8800b9ac4df0 R09: 000000000052001b
      [12210.934941] R10: 00000000657c0000 R11: 000000000000c649 R12: 00000000000003e9
      [12210.936032] R13: 00000000000003e9 R14: ffff8800bace7800 R15: ffff8800bb3ec000
      [12210.937103] FS:  00007faa1766c700(0000) GS:ffff88013ac00000(0000) knlGS:0000000000000000
      [12210.938321] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [12210.939166] CR2: 0000000000000748 CR3: 00000000b79d6000 CR4: 00000000000406e0
      [12210.940278] Stack:
      [12210.940603]  ffff8800bb3ec000 ffffffff81a9e280 ffff8800bacaf8c8 ffffffff814b3135
      [12210.941818]  ffff8800bb3ec000 ffffffff81a9e280 ffffffff81a9e280 ffff8800bace7800
      [12210.943040]  ffff8800bacaf8f0 ffffffff81397c88 ffff8800bb3ec000 ffffffff81a9e280
      [12210.944288] Call Trace:
      [12210.944688]  [<ffffffff814b3135>] fib6_new_table+0x24/0x8a
      [12210.945516]  [<ffffffff81397c88>] vrf_dev_init+0xd4/0x162
      [12210.946328]  [<ffffffff814091e1>] register_netdevice+0x100/0x396
      [12210.947209]  [<ffffffff8139823d>] vrf_newlink+0x40/0xb3
      [12210.948001]  [<ffffffff814187f0>] rtnl_newlink+0x5d3/0x6d5
      ...
      
      The problem above is due to the fact that the fib hash table is not
      allocated when IPv6 is disabled at boot.
      
      The VRF driver should not do any IPv6 initialization if IPv6 is
      disabled, so it needs to know whether IPv6 was disabled at boot. The
      disable parameter is private to the IPv6 module, so provide an accessor
      for modules to determine if IPv6 was disabled at boot time.
      
      Fixes: 35402e31 ("net: Add IPv6 support to VRF device")
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e4348637
    • rxrpc: Simplify connect() implementation and simplify sendmsg() op · 2341e077
      David Howells authored
      Simplify the RxRPC connect() implementation.  It will just note the
      destination address it is given, and if a sendmsg() comes along with no
      address, this will be assigned as the address.  No transport struct will be
      held internally, which will allow us to remove this later.
      
      Simplify sendmsg() also.  Whilst a call is active, userspace refers to it
      by a private unique user ID specified in a control message.  When sendmsg()
      sees a user ID that doesn't map to an extant call, it creates a new call
      for that user ID and attempts to add it.  If, when we try to add it, the
      user ID is now registered, we now reject the message with -EEXIST.  We
      should never see this situation unless two threads are racing, trying to
      create a call with the same ID - which would be an error.
      
      It also isn't required to provide sendmsg() with an address - provided the
      control message data holds a user ID that maps to a currently active call.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2341e077
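
The user-ID handling described in the rxrpc commit above (look up the call, create it if missing, reject a duplicate with -EEXIST) can be sketched as follows. This is a single-threaded toy model with hypothetical names (call_table, find_call, add_call) and a fixed-size array; the real code uses its own data structures and locking.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_CALLS 8

struct call_table {
	unsigned long ids[MAX_CALLS];  /* user IDs of active calls */
	size_t count;
};

/* Return the slot holding user_id, or -1 if no active call uses it. */
static int find_call(const struct call_table *t, unsigned long user_id)
{
	assert(t != NULL);
	for (size_t i = 0; i < t->count; i++)
		if (t->ids[i] == user_id)
			return (int)i;
	return -1;
}

/* Register a new call; a user ID that is already registered is an error. */
static int add_call(struct call_table *t, unsigned long user_id)
{
	if (find_call(t, user_id) >= 0)
		return -EEXIST;
	if (t->count == MAX_CALLS)
		return -ENOMEM;
	t->ids[t->count++] = user_id;
	return 0;
}
```

In this model a sendmsg()-like caller would first try find_call() and only fall back to add_call() when the ID is unknown; seeing -EEXIST from add_call() then corresponds to the race between two threads mentioned in the message.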
    • Fabien Siron's avatar
    • net/mlx4_en: mlx4_en_netpoll() should schedule TX, not RX · 7d71e994
      Eric Dumazet authored
      I am not sure mlx4_en_netpoll() is doing anything useful right now.
      
      mlx4 has different NAPI structures for RX and TX, and netpoll only wants
      to drain TX queues.
      
      Let's schedule NAPI polls on TX, not RX.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Maciej Żenczykowski <maze@google.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Acked-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7d71e994
    • Merge branch 'BCM53xx-driver' · 23c731e8
      David S. Miller authored
      Florian Fainelli says:
      
      ====================
      net: dsa: Broadcom BCM53xx switches support
      
      This patch series adds support for the Broadcom BCM53xx series aka RoboSwitches.
      
      This driver is largely based on Jonas Gorski's b53 driver for OpenWrt which can
      be found here:
      
      https://dev.openwrt.org/browser/trunk/target/linux/generic/files/drivers/net/phy/b53
      
      A few bug fixes and DSA-ifycation later, here is what we got.
      
      This has been successfully tested in the following configurations:
      
      - Broadcom BCM53011 using the SRAB bus layer with 4 ports LAN, 1 port WAN
      
      - A Broadcom BCM7445 device with an internal Starfighter 2 switch (bcm_sf2.c)
        and a Broadcom BCM53125 hanging off one of its ports connected via MDIO, creating
        two trees hanging off each other, and this works!
      
      - A Broadcom BCM53125 MDIO connected to a Lamobo/Bananapi R1 board using the STMMAC
        MDIO driver
      
      For now, we do not enable Broadcom tags, because there are different
      generations of switches being supported which have different tag formats, but
      the plan is to enable them later on.
      
      Support for additional HW features, such as EEE and the Compact Field
      Processor (TCAM), will be added once this initial cut gets accepted.
      
      Testing and bug reports welcome!
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      23c731e8
    • net: dsa: b53: Plug in VLAN support · a2482d2c
      Florian Fainelli authored
      Add support for configuring VLANs on B53 devices by implementing the
      port VLAN add/del/dump functions. We currently default to a behavior
      which is equivalent to having VLAN filtering turned on, where all VLANs
      not programmed into the VLAN port-based vector will be discarded on
      ingress.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a2482d2c
    • net: dsa: b53: Add bridge support · ff39c2d6
      Florian Fainelli authored
      Add support for HW bridging by tying the ports together in the same port
      VLAN mask when they belong to the same bridge, and isolating them to be
      alone with the CPU port when they are not.
      
      Propagate STP states from the bridge layer to the switch's HW mapping
      when requested.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ff39c2d6
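
The port VLAN mask idea from the bridge commit above can be modelled like this. It is a simplified sketch rather than the b53 code: the port count, the CPU port number and all function names are hypothetical, and each mask is assumed to hold the set of ports a given port may forward to.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PORTS 6
#define CPU_PORT  5                  /* hypothetical CPU port number */
#define NO_BRIDGE (-1)
#define BIT(n)    (UINT32_C(1) << (n))

struct sw_model {
	int bridge[NUM_PORTS];       /* bridge id per port, or NO_BRIDGE */
	uint32_t pvlan[NUM_PORTS];   /* ports each port may forward to */
};

/* Start with every port isolated: it only sees itself and the CPU port. */
static void sw_init(struct sw_model *s)
{
	for (int p = 0; p < NUM_PORTS; p++) {
		s->bridge[p] = NO_BRIDGE;
		s->pvlan[p] = BIT(p) | BIT(CPU_PORT);
	}
}

/* Joining a bridge ties the port to every other member, in both directions. */
static void sw_br_join(struct sw_model *s, int port, int br)
{
	assert(port >= 0 && port < NUM_PORTS && port != CPU_PORT);
	s->bridge[port] = br;
	for (int q = 0; q < NUM_PORTS; q++) {
		if (q == port || q == CPU_PORT || s->bridge[q] != br)
			continue;
		s->pvlan[port] |= BIT(q);
		s->pvlan[q] |= BIT(port);
	}
}

/* Leaving a bridge removes the port from its peers and isolates it again. */
static void sw_br_leave(struct sw_model *s, int port)
{
	int br;

	assert(port >= 0 && port < NUM_PORTS && port != CPU_PORT);
	br = s->bridge[port];
	if (br == NO_BRIDGE)
		return;
	for (int q = 0; q < NUM_PORTS; q++)
		if (q != port && q != CPU_PORT && s->bridge[q] == br)
			s->pvlan[q] &= ~BIT(port);
	s->pvlan[port] = BIT(port) | BIT(CPU_PORT);
	s->bridge[port] = NO_BRIDGE;
}
```

After ports 0 and 1 join the same bridge both masks contain bits 0, 1 and the CPU port; once port 1 leaves, each mask shrinks back to the port itself plus the CPU port.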
    • net: dsa: b53: Implement ARL add/del/dump operations · 1da6df85
      Florian Fainelli authored
      Add support for FDB add/delete/dump using the ARL read/write logic and
      the ARL search logic for faster dumps. The code is flexible enough
      that it could support devices with a different register layout, like
      BCM5325 and BCM5365, which have fewer entries or pack values into a
      single 64-bit register.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1da6df85
    • net: dsa: b53: Add BCM7445 quirk · 0830c980
      Florian Fainelli authored
      The Broadcom BCM7445 STB chip has an issue in its revision D0, which was
      previously worked around in drivers/net/dsa/bcm_sf2.c, where we may
      end up double programming the integrated BCM7445 switch (bcm_sf2) and an
      external Broadcom switch such as BCM53125, since these are mostly
      register compatible.
      
      Add a small quirk which just defers probing until we are sitting on the
      slave DSA MDIO bus, which will allow us to intercept reads/writes and
      funnel them through the SF2 internal MDIO master (which happens to
      disconnect its pseudo PHY).
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0830c980
    • net: dsa: b53: Add support for Broadcom RoboSwitch · 967dd82f
      Florian Fainelli authored
      This patch adds support for Broadcom's BCM53xx switch family, also known
      as RoboSwitch. Some of these switches are ubiquitous, found in home
      routers, Wi-Fi routers, DSL and cable modem gateways and other
      networking related products.

      This driver adds the library driver (b53_common.c) as well as a few bus
      glue drivers for MDIO, SPI, Switch Register Access Block (SRAB) and
      memory-mapped I/O into a SoC's address space (Broadcom BCM63xx/33xx).

      Basic operations are supported to bring the Layer 1/2 up and running,
      but not much more at this point; subsequent patches add the remaining
      features.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      967dd82f
    • Merge branch 'bcm_sf2-vlan' · 409a5f27
      David S. Miller authored
      Florian Fainelli says:
      
      ====================
      net: dsa: bcm_sf2: add VLAN support
      
      This is long overdue, finally add support for VLANs in the Broadcom Starfighter
      2 switch driver.

      There are a few things that make us differ from e.g. mv88e6xxx.c:
      
      - we keep a software cache of which VLANs are enabled and which are not to
        dramatically speed up the VLAN dump operation; we do not have any HW operation
        which would return only the list of valid VLAN entries, so they would all have
        to be queried one by one, and with 4K VLANs this takes a while
      
      - the default behavior is equivalent to setting VLAN filtering to 1, still working
        on implementing a proper port_vlan_filtering callback, but I figured the most
        conservative behavior is probably okay anyway
      
      - without enabling VLANs, the default behavior is to receive any 802.1q frames
        (per the DSA documentation), however, once we start enabling VLAN support, if
        an interface leaves the bridge, we still want it to receive all 802.1q frames
        so we utilize the "Join all VLAN" feature of the switch to perform that
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      409a5f27
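
The software VLAN cache mentioned in the series description above can be sketched as a plain bitmap. This is an illustration under assumptions, not the bcm_sf2 implementation: the structure and function names are hypothetical, and the cache only tracks whether a VLAN ID is enabled.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define VLAN_N_VID    4096
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define VLAN_WORDS    ((VLAN_N_VID + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct vlan_cache {
	unsigned long valid[VLAN_WORDS];  /* one bit per VLAN ID */
};

/* Record that a VLAN was added (on = true) or deleted (on = false). */
static void vlan_cache_set(struct vlan_cache *c, unsigned int vid, bool on)
{
	unsigned long mask = 1UL << (vid % BITS_PER_WORD);

	assert(vid < VLAN_N_VID);
	if (on)
		c->valid[vid / BITS_PER_WORD] |= mask;
	else
		c->valid[vid / BITS_PER_WORD] &= ~mask;
}

static bool vlan_cache_test(const struct vlan_cache *c, unsigned int vid)
{
	assert(vid < VLAN_N_VID);
	return (c->valid[vid / BITS_PER_WORD] >> (vid % BITS_PER_WORD)) & 1UL;
}

/* Dump enabled VLAN IDs from the cache alone, without touching hardware. */
static size_t vlan_cache_dump(const struct vlan_cache *c,
			      unsigned int *out, size_t cap)
{
	size_t n = 0;

	for (unsigned int vid = 0; vid < VLAN_N_VID && n < cap; vid++)
		if (vlan_cache_test(c, vid))
			out[n++] = vid;
	return n;
}
```

A dump then walks 4096 bits in memory instead of issuing one hardware query per VLAN ID, which is the speed-up the cover letter refers to.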
    • net: dsa: bcm_sf2: Add VLAN support · 9c57a771
      Florian Fainelli authored
      Add support for configuring VLANs on the Broadcom Starfighter 2 switch.
      This is all done through the bridge vlan facility just like other DSA
      drivers.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9c57a771
    • net: dsa: bcm_sf2: Add VLAN registers definitions · 064523ff
      Florian Fainelli authored
      Add the definitions for the VLAN registers that we are going to
      manipulate in subsequent patches.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      064523ff
    • net: dsa: bcm_sf2: Move setup function at the far end · 7fbb1a92
      Florian Fainelli authored
      Re-order the bcm_sf2_sw_setup() function so that it is at the far end of
      the driver to avoid any kind of forward declarations.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7fbb1a92
    • net: dsa: bcm_sf2: Split fast age into a helper function · a468ef45
      Florian Fainelli authored
      Add a helper function to fast age something that is controlled by the
      caller: port, VLAN. We will use this to implement a VLAN fast age
      operation.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a468ef45
  2. 09 Jun, 2016 15 commits
    • Merge branch 'netdev_lockdep_set_classes' · cf515802
      David S. Miller authored
      Eric Dumazet says:
      
      ====================
      net: better lockdep annotations
      
      Introduction of qdisc->running seqcount added lockdep false positives.
      
      While chasing the bug, it came to me that we had a lot of copies of the
      same stuff in virtual drivers.
      
      This patch series has the qdisc->running fix (considers that a trylock
      is attempted in lockdep terminology), and adds a generic helper so
      that we no longer have to patch many virtual drivers when a new per-device
      or per-qdisc lock is added.
      
      Thanks to David Ahern for reporting the issue and testing my patches :)
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cf515802
    • net: ipvlan: call netdev_lockdep_set_classes() · 0d7dd798
      Eric Dumazet authored
      In case a qdisc is used on an ipvlan device, we need to use different
      lockdep classes to avoid false positives.
      
      Use the new netdev_lockdep_set_classes() generic helper.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0d7dd798
    • net: macvlan: call netdev_lockdep_set_classes() · 24ffd752
      Eric Dumazet authored
      In case a qdisc is used on a macvlan device, we need to use different
      lockdep classes to avoid false positives.
      
      Use the new netdev_lockdep_set_classes() generic helper.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      24ffd752
    • net: vrf: call netdev_lockdep_set_classes() · 78e7a2ae
      Eric Dumazet authored
      In case a qdisc is used on a vrf device, we need to use different
      lockdep classes to avoid false positives.
      
      Use the new netdev_lockdep_set_classes() generic helper.
      Reported-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Tested-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      78e7a2ae
    • net: add netdev_lockdep_set_classes() helper · d3fff6c4
      Eric Dumazet authored
      It is time to add netdev_lockdep_set_classes() helper
      so that lockdep annotations per device type are easier to manage.
      
      This removes a lot of copies and missing annotations.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d3fff6c4
    • net: sched: fix qdisc->running lockdep annotations · 52fbb290
      Eric Dumazet authored
      1) qdisc_run_begin() is really using the equivalent of a trylock.
        Instead of using write_seqcount_begin(), use a combination of
        raw_write_seqcount_begin() and correct lockdep annotation.
      
      2) sch_direct_xmit() should use regular spin_lock(root_lock)
      
      Fixes: f9eb8aea ("net_sched: transform qdisc running bit into a seqcount")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: David Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      52fbb290
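
Point 1) of the qdisc->running commit above describes qdisc_run_begin() as a trylock. A single-threaded model of that behaviour, assuming the sequence counter is odd while an owner is running, might look like the sketch below. The struct is hypothetical, and the lockdep annotation, memory barriers and the surrounding locking of the real code are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

struct qdisc_model {
	unsigned int running;  /* sequence counter: odd while an owner runs */
};

static bool qdisc_is_running(const struct qdisc_model *q)
{
	return q->running & 1;
}

/* Trylock semantics: fail immediately instead of waiting for the owner. */
static bool qdisc_run_begin(struct qdisc_model *q)
{
	if (qdisc_is_running(q))
		return false;
	q->running++;          /* counter becomes odd: we are the owner */
	return true;
}

static void qdisc_run_end(struct qdisc_model *q)
{
	assert(qdisc_is_running(q));
	q->running++;          /* counter becomes even again */
}
```

Because a failed attempt never blocks, a lock checker should treat the acquisition as a trylock rather than as a regular write-side seqcount section, which is the annotation change the commit describes.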
    • netvsc: get rid of completion timeouts · 5362855a
      Vitaly Kuznetsov authored
      I'm hitting a 5 second timeout in rndis_filter_set_rss_param() while setting
      RSS parameters for the device. When this happens we end up returning
      -ETIMEDOUT from the function and rndis_filter_device_add() falls back to
      setting
      
              net_device->max_chn = 1;
              net_device->num_chn = 1;
              net_device->num_sc_offered = 0;
      
      but after a moment the rndis request succeeds and subchannels start to
      appear. netvsc_sc_open() does an unconditional nvscdev->num_sc_offered-- and
      it becomes U32_MAX-1. A subsequent rndis_filter_device_remove() will hang
      while waiting for all U32_MAX-1 subchannels to appear, which is not
      going to happen.
      
      The immediate issue could be solved by adding num_sc_offered > 0 check to
      netvsc_sc_open() but we're getting out of sync with the host and it's not
      easy to adjust things later, e.g. in this particular case we'll be creating
      queues without a user request for it and races are expected. Same applies
      to other parts of the driver which have the same completion timeout.
      
      Following the trend in drivers/hv/* code I suggest we remove all these
      timeouts completely. As a guest we can always trust the host we're running
      on and if the host screws things up there is no easy way to recover anyway.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5362855a
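
The counter wrap-around described in the netvsc commit above is ordinary unsigned arithmetic and can be reproduced in userspace. Both helpers below are hypothetical; the checked variant corresponds to the "num_sc_offered > 0" guard that the message mentions as a possible but insufficient fix.

```c
#include <assert.h>
#include <stdint.h>

/* Unconditional decrement, as done when a subchannel opens. */
static uint32_t sc_open_unchecked(uint32_t num_sc_offered)
{
	return (uint32_t)(num_sc_offered - 1);
}

/* Guarded decrement: never goes below zero. */
static uint32_t sc_open_checked(uint32_t num_sc_offered)
{
	return num_sc_offered > 0 ? num_sc_offered - 1 : 0;
}
```

Starting from 0, one unchecked decrement wraps to UINT32_MAX and a second one gives UINT32_MAX - 1, so a waiter counting down to zero would effectively never finish. The guard avoids the wrap, but it does not bring guest and host back in sync, which is why the commit removes the timeouts instead.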
    • sit: remove unnecessary protocol check in ipip6_tunnel_xmit() · adba931f
      Simon Horman authored
      ipip6_tunnel_xmit() is called immediately after checking that
      skb->protocol is htons(ETH_P_IPV6), so there is no need
      to check it a second time.
      
      Found by inspection.
      Signed-off-by: Simon Horman <simon.horman@netronome.com>
      Reviewed-by: Dinan Gunawardena <dinan.gunawardena@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      adba931f
    • Merge branch 'cbq-kill-drop' · b8d99ba0
      David S. Miller authored
      Florian Westphal says:
      
      ====================
      sched, cbq: remove OVL_STRATEGY/POLICE support
      
      iproute2 does not implement any options that result in the
      TCA_CBQ_OVL_STRATEGY/TCA_CBQ_POLICE attributes being set/used.
      
      This series removes these two attributes from cbq and makes the kernel
      reject them via EOPNOTSUPP in case they are present.
      
      The two followup changes then remove several features from qdisc
      infrastructure that are then no longer used/needed.  These are:
       - the 'drop' method provided by most qdiscs
       - the 'reshape_fail' function used by some qdiscs
       - the __parent member in struct Qdisc
      
      I tested this with allmod and allyesconfig builds and also with
      a brief cbq script:
      
        tc qdisc add dev eth0 root handle 1:0 cbq bandwidth 10Mbit avpkt 1000 cell 8
        tc class add dev eth0 parent 1:0 classid 1:1 est 1sec 8sec cbq bandwidth 10Mbit rate 5Mbit prio 1 allot 1514 maxburst 20 cell 8 avpkt 1000 bounded split 1:0 defmap 3f
        tc class add dev eth0 parent 1:0 classid 1:2 est 1sec 8sec cbq bandwidth 10Mbit rate 5Mbit prio 1 allot 1514 maxburst 20 cell 8 avpkt 1000 bounded split 1:0 defmap 3f
        tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 match ip tos 0x10 0xff classid 1:1 police rate 2Mbit burst 10K reclassify
        tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 match ip tos 0x0c 0xff classid 1:2
        tc filter add dev eth0 parent 1:0 protocol ip prio 2 u32 match ip tos 0x10 0xff classid 1:2
        tc filter add dev eth0 parent 1:0 protocol ip prio 3 u32 match ip tos 0x0 0x0 classid 1:2
      
      No changes since v1 except patch #5 to fix up struct Qdisc layout.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b8d99ba0
    • sched: place state, next_sched and gso_skb in same cacheline again · c8945043
      Florian Westphal authored
      Earlier commits removed two members from struct Qdisc, which placed
      next_sched/gso_skb into a different cacheline than ->state.
      
      This restores the struct layout to what it was before the removal.
      Move the two members, then add an annotation so they all reside in the
      same cacheline.
      
      This adds a 16 byte hole after cpu_qstats.
      
      The hole could be closed, but as doing so doesn't decrease the total
      struct size, just do it this way.
      Reported-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c8945043
    • sched: remove qdisc->drop · a09ceb0e
      Florian Westphal authored
      After the removal of TCA_CBQ_OVL_STRATEGY from the cbq scheduler, there
      are no more callers of ->drop() outside of other ->drop functions, i.e.
      nothing calls them.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a09ceb0e
    • sched: remove qdisc_rehape_fail · c3a173d7
      Florian Westphal authored
      After the removal of TCA_CBQ_POLICE from the cbq scheduler, qdisc->reshape_fail
      is always NULL, i.e. qdisc_reshape_fail is now the same as qdisc_drop.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c3a173d7
    • cbq: remove TCA_CBQ_POLICE support · dd47c1fa
      Florian Westphal authored
      iproute2 doesn't implement any cbq option that results in this attribute
      being sent to kernel.
      
      To make use of it, user would have to
      
      - patch iproute2
      - add a class
      - attach a qdisc to the class (default pfifo doesn't work as
        q->handle is 0 and cbq_set_police() is a no-op in this case)
      - re-'add' the same class (tc class change ...) again
      - user must also specify a defmap (e.g. 'split 1:0 defmap 3f'), since
        this 'police' feature relies on its presence
      - the added qdisc must be one of bfifo, pfifo or netem
      
      If all of these conditions are met and _some_ leaf qdiscs, namely
      p/bfifo, netem, plug or tbf would drop a packet, kernel calls back into
      cbq, which will attempt to re-queue the skb into a different class
      as indicated by the parents' defmap entry for TC_PRIO_BESTEFFORT.
      
      [ i.e. we behave as if tc_classify returned TC_ACT_RECLASSIFY ].
      
      This feature, which isn't documented or implemented in iproute2,
      and isn't implemented consistently (most qdiscs like sfq, codel, etc
      drop right away instead of attempting this reclassification) is the
      sole reason for the reshape_fail and __parent member in Qdisc struct.
      
      So remove TCA_CBQ_POLICE support from the kernel, reject it via EOPNOTSUPP
      so userspace knows we don't support it, and then remove no-longer needed
      infrastructure in followup commit.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      dd47c1fa
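
The "reject it via EOPNOTSUPP" approach used by the cbq commits can be sketched as a check over a parsed attribute table. Everything here is hypothetical: the enum values are illustrative and are not the kernel's TCA_CBQ_* numbering, and cbq_check_attrs() only stands in for the idea of refusing a request when a removed attribute is present.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Illustrative attribute slots; not the kernel's numeric values. */
enum cbq_attr {
	CBQ_ATTR_UNSPEC,
	CBQ_ATTR_RATE,
	CBQ_ATTR_OVL_STRATEGY,  /* removed feature */
	CBQ_ATTR_POLICE,        /* removed feature */
	CBQ_ATTR_COUNT
};

/*
 * tb[] holds one pointer per attribute, NULL when the attribute was not
 * supplied. Any use of a removed attribute fails the whole request.
 */
static int cbq_check_attrs(const void *tb[])
{
	assert(tb != NULL);
	if (tb[CBQ_ATTR_OVL_STRATEGY] || tb[CBQ_ATTR_POLICE])
		return -EOPNOTSUPP;
	return 0;
}
```

Returning an explicit error, rather than silently ignoring the attribute, lets userspace detect that the feature is gone.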
    • cbq: remove TCA_CBQ_OVL_STRATEGY support · c3498d34
      Florian Westphal authored
      Since the initial revision of cbq in 2004, iproute2 has never implemented
      support for TCA_CBQ_OVL_STRATEGY, which is what needs to be set to
      activate the class->drop() call (the TC_CBQ_OVL_DROP strategy must be
      set by userspace).
      
      David Miller says:
         It seems really safe to kill this thing off, flag an error if someone
         tries to set the attribute, and therefore kill off all of the
         non-default cbq_ovl_*() functions.
      
      A followup commit can then remove all .drop qdisc methods since this
      removed the only caller.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c3498d34
    • ip6gre: Allow live link address change · 76e48f9f
      Shweta Choudaha authored
      The ip6 GRE tap device should not have to be brought down to change
      the MAC address; it should allow live address changes, as the IPv4
      GRE tap device does.
      Signed-off-by: Shweta Choudaha <schoudah@brocade.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      76e48f9f
  3. 08 Jun, 2016 1 commit
    • Merge branch 'vrf-fib-rule-improve' · 753c104b
      David S. Miller authored
      David Ahern says:
      
      ====================
      net: vrf: Improve use of FIB rules
      
      Currently, VRFs require 1 oif and 1 iif rule per address family per
      VRF. As the number of VRF devices increases it brings scalability
      issues with the increasing rule list. All of the VRF rules have the
      same format with the exception of the specific table id to direct the
      lookup. Since the table id is available from the oif or iif in the
      lookup, the VRF rules can be consolidated to a single rule that pulls
      the table from the VRF device.
      
      This solution still allows a user to insert their own rules for VRFs,
      including rules with additional attributes. Accordingly, it is backwards
      compatible with existing setups and allows other policy routing as
      desired.
      
      Hopefully v5 is the charm; my e-waste can is getting full.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      753c104b
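
The consolidation described in the VRF series above, one rule that takes the table from the device instead of one oif and one iif rule per VRF, can be sketched with a toy lookup. The structure and function names are hypothetical, and a table id of 0 is assumed to mean that the device is not associated with a VRF.

```c
#include <assert.h>
#include <stddef.h>

struct netdev_model {
	int ifindex;
	unsigned int vrf_table;  /* 0: device is not associated with a VRF */
};

/* Table id bound to the device with this ifindex, or 0 if none. */
static unsigned int dev_table(const struct netdev_model *devs, size_t n,
			      int ifindex)
{
	assert(devs != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (devs[i].ifindex == ifindex)
			return devs[i].vrf_table;
	return 0;
}

/*
 * The single consolidated rule: pick the table from the oif if it has one,
 * otherwise from the iif. A result of 0 means the rule does not match and
 * later rules are consulted.
 */
static unsigned int vrf_rule_table(const struct netdev_model *devs, size_t n,
				   int oif, int iif)
{
	unsigned int table = dev_table(devs, n, oif);

	return table ? table : dev_table(devs, n, iif);
}
```

Adding another VRF in this model only adds a device entry; the rule list itself does not grow, which is the scalability gain the cover letter describes.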