1. 10 Feb, 2017 9 commits
    • net/ena: use READ_ONCE to access completion descriptors · a8496eb8
      Netanel Belgazal authored
      Completion descriptors are accessed by both the driver and the device.
      To avoid reading a stale value, use the READ_ONCE macro.
      Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
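The idea behind READ_ONCE can be sketched in userspace C. This is only an approximation under stated assumptions: `READ_ONCE_U8`, `struct cdesc`, the phase-bit layout, and `cdesc_poll` are hypothetical names chosen for illustration and are not taken from the ENA driver. The volatile cast makes the compiler perform one fresh load each time the macro is evaluated, so a value the device has updated is not replaced by a cached copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's READ_ONCE(): the volatile cast forces
 * the compiler to emit exactly one load on every evaluation. */
#define READ_ONCE_U8(x) (*(const volatile uint8_t *)&(x))

/* Hypothetical completion descriptor; the layout is illustrative only. */
struct cdesc {
    uint16_t req_id;
    uint8_t  flags;   /* bit 0: phase bit, assumed to be written by the device */
};

#define CDESC_PHASE_MASK 0x1u

/* Return the descriptor if its phase bit matches the expected phase,
 * or NULL if it still holds the value from the previous pass. */
static const struct cdesc *cdesc_poll(const struct cdesc *desc,
                                      uint8_t expected_phase)
{
    assert(desc != NULL);
    uint8_t flags = READ_ONCE_U8(desc->flags);   /* single, fresh load */

    if ((flags & CDESC_PHASE_MASK) != expected_phase)
        return NULL;
    return desc;
}
```

A caller would invoke `cdesc_poll()` once per polling iteration and process the descriptor only when the result is non-NULL.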
    • net/ena: use napi_complete_done() return value · b1669c9f
      Netanel Belgazal authored
      Do not unmask interrupts if we are in busy poll mode.
      Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/ena: fix potential access to freed memory during device reset · 3f6159db
      Netanel Belgazal authored
      If the ena driver detects that the device is not behaving as expected,
      it tries to reset the device.
      The reset flow calls ena_down, which frees all the resources
      the driver allocated, and only then resets the device.

      This flow can cause memory corruption if the device is still writing
      to the driver's memory space.
      To overcome this potential race, move the reset before the device
      resources are freed.
      Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/ena: refactor ena_get_stats64 to be atomic context safe · d81db240
      Netanel Belgazal authored
      ndo_get_stats64() can be called from atomic context, but the current
      implementation sends an admin command to retrieve the statistics from
      the device. This admin command can sleep.

      This patch refactors the implementation of ena_get_stats64() to use
      the {rx,tx}bytes/count from the driver's inner counters, and to obtain
      the rx drop counter from the asynchronous keep-alive (heartbeat)
      event.
      Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/ena: fix NULL dereference when removing the driver after device reset failed · 22b331c9
      Netanel Belgazal authored
      If for some reason the device stops responding, and the device reset
      fails to recover the device, the mmio register read data structure
      will not be reinitialized.

      On driver removal, the driver will also try to reset the device, but
      this time the mmio data structure will be NULL.

      To solve this issue, perform the device reset in the remove function
      only if the device is running.
      
      Crash log
      [   54.240382] BUG: unable to handle kernel NULL pointer dereference at           (null)
      [   54.244186] IP: [<ffffffffc067de5a>] ena_com_reg_bar_read32+0x8a/0x180 [ena_drv]
      [   54.244186] PGD 0
      [   54.244186] Oops: 0002 [#1] SMP
      [   54.244186] Modules linked in: ena_drv(OE-) snd_hda_codec_generic kvm_intel kvm crct10dif_pclmul ppdev crc32_pclmul ghash_clmulni_intel aesni_intel snd_hda_intel aes_x86_64 snd_hda_controller lrw gf128mul cirrus glue_helper ablk_helper ttm snd_hda_codec drm_kms_helper cryptd snd_hwdep drm snd_pcm pvpanic snd_timer syscopyarea sysfillrect snd parport_pc sysimgblt serio_raw soundcore i2c_piix4 mac_hid lp parport psmouse floppy
      [   54.244186] CPU: 5 PID: 1841 Comm: rmmod Tainted: G           OE 3.16.0-031600-generic #201408031935
      [   54.244186] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      [   54.244186] task: ffff880135852880 ti: ffff8800bb640000 task.ti: ffff8800bb640000
      [   54.244186] RIP: 0010:[<ffffffffc067de5a>]  [<ffffffffc067de5a>] ena_com_reg_bar_read32+0x8a/0x180 [ena_drv]
      [   54.244186] RSP: 0018:ffff8800bb643d50  EFLAGS: 00010083
      [   54.244186] RAX: 000000000000deb0 RBX: 0000000000030d40 RCX: 0000000000000003
      [   54.244186] RDX: 0000000000000202 RSI: 0000000000000058 RDI: ffffc90000775104
      [   54.244186] RBP: ffff8800bb643d88 R08: 0000000000000000 R09: cf00000000000000
      [   54.244186] R10: 0000000fffffffe0 R11: 0000000000000001 R12: 0000000000000000
      [   54.244186] R13: ffffc90000765000 R14: ffffc90000775104 R15: 00007fca1fa98090
      [   54.244186] FS:  00007fca1f1bd740(0000) GS:ffff88013fd40000(0000) knlGS:0000000000000000
      [   54.244186] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   54.244186] CR2: 0000000000000000 CR3: 00000000b9cf6000 CR4: 00000000001406e0
      [   54.244186] Stack:
      [   54.244186]  0000000000000202 0000005800000286 ffffc90000765000 ffffc90000765000
      [   54.244186]  ffff880135f6b000 ffff8800b9360000 00007fca1fa98090 ffff8800bb643db8
      [   54.244186]  ffffffffc0680b3d ffff8800b93608c0 ffffc90000765000 ffff880135f6b000
      [   54.244186] Call Trace:
      [   54.244186]  [<ffffffffc0680b3d>] ena_com_dev_reset+0x1d/0x1b0 [ena_drv]
      [   54.244186]  [<ffffffffc0678497>] ena_remove+0xa7/0x130 [ena_drv]
      [   54.244186]  [<ffffffff813d4df6>] pci_device_remove+0x46/0xc0
      [   54.244186]  [<ffffffff814c3b7f>] __device_release_driver+0x7f/0xf0
      [   54.244186]  [<ffffffff814c4738>] driver_detach+0xc8/0xd0
      [   54.244186]  [<ffffffff814c3969>] bus_remove_driver+0x59/0xd0
      [   54.244186]  [<ffffffff814c4fde>] driver_unregister+0x2e/0x60
      [   54.244186]  [<ffffffff810f0a80>] ? show_refcnt+0x40/0x40
      [   54.244186]  [<ffffffff813d4ec3>] pci_unregister_driver+0x23/0xa0
      [   54.244186]  [<ffffffffc068413f>] ena_cleanup+0x10/0xed1 [ena_drv]
      [   54.244186]  [<ffffffff810f3a47>] SyS_delete_module+0x157/0x1e0
      [   54.244186]  [<ffffffff81014fb7>] ? do_notify_resume+0xc7/0xd0
      [   54.244186]  [<ffffffff81793fad>] system_call_fastpath+0x1a/0x1f
      [   54.244186] Code: c3 4d 8d b5 04 01 01 00 4c 89 f7 e8 e1 5a 11 c1 48 89 45 c8 41 0f b7 85 00 01 01 00 8d 48 01 66 2d 52 21 66 41 89 8d 00 01 01 00 <66> 41 89 04 24 0f b7 45 d4 89 45 d0 89 c1 41 0f b7 85 00 01 01
      [   54.244186] RIP  [<ffffffffc067de5a>] ena_com_reg_bar_read32+0x8a/0x180 [ena_drv]
      [   54.244186]  RSP <ffff8800bb643d50>
      [   54.244186] CR2: 0000000000000000
      [   54.244186] ---[ end trace 18dd9889b6497810 ]---
      Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
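The guard described in this commit can be sketched as follows. This is a simplified userspace model, not the driver's code: `struct adapter`, its fields, `device_reset`, and `adapter_remove` are hypothetical names, and the real driver tracks the running state differently. The point is only that the remove path skips the reset when the device is not running, so the NULL MMIO read state is never dereferenced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical adapter state; field names are illustrative only. */
struct adapter {
    bool  dev_running;     /* cleared when a reset fails to recover the device */
    void *mmio_read_resp;  /* NULL after a failed recovery */
    int   resets_issued;
};

/* Stand-in for the device reset, which relies on the MMIO read state. */
static void device_reset(struct adapter *a)
{
    assert(a->mmio_read_resp != NULL);
    a->resets_issued++;
}

/* Remove path: reset only while the device is known to be running. */
static void adapter_remove(struct adapter *a)
{
    if (a->dev_running)
        device_reset(a);
    /* ...release the remaining resources here... */
}
```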
    • net/ena: fix RSS default hash configuration · 422e21e7
      Netanel Belgazal authored
      The ENA default hash configuration sets up the IPv4_frag hash twice
      instead of configuring the hash for non-IP packets.

      The bug caused the hash for fragmented IPv4 packets to be calculated
      from the L2 source and destination addresses instead of the L3 source
      and destination addresses. As a result, IPv4 packets could reach the
      wrong Rx queue.
      Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/ena: fix ethtool RSS flow configuration · 6e2de20d
      Netanel Belgazal authored
      ena_flow_data_to_flow_hash and ena_flow_hash_to_flow_type
      treat the ena_admin_flow_hash_fields enum values as powers of two.

      Change the values of ena_admin_flow_hash_fields to be powers of two.

      This bug affects the ethtool set/get rxnfc operations.
      ethtool reports wrong hash fields on get and
      configures wrong hash fields on set.
      Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
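Why the enum values need to be powers of two can be illustrated with a small sketch. The enum names and values below are hypothetical and are not copied from the driver; `bad_hash_field` is an invented counter-example with sequential numbering. When each value owns exactly one bit, fields can be combined with OR and tested with AND without ambiguity.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Hypothetical hash-field flags; each value owns exactly one bit. */
enum hash_field {
    HASH_FIELD_L2_DA = BIT(0),
    HASH_FIELD_L2_SA = BIT(1),
    HASH_FIELD_L3_SA = BIT(2),
    HASH_FIELD_L3_DA = BIT(3),
    HASH_FIELD_L4_SP = BIT(4),
    HASH_FIELD_L4_DP = BIT(5),
};

/* Sequential numbering, as a counter-example: 1 | 2 == 3, so combining the
 * first two fields is indistinguishable from the third field alone. */
enum bad_hash_field {
    BAD_L2_DA = 1,
    BAD_L2_SA = 2,
    BAD_L3_SA = 3,
};

/* Test whether a single field is present in a combined mask. */
static int hash_field_is_set(uint32_t mask, enum hash_field f)
{
    assert(f != 0 && (f & (f - 1)) == 0);  /* f must be a single bit */
    return (mask & (uint32_t)f) != 0;
}
```

With the bit-flag enum, a mask such as `HASH_FIELD_L3_SA | HASH_FIELD_L3_DA` reports exactly those two fields and nothing else.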
    • net/ena: fix queues number calculation · 6a1ce2fb
      Netanel Belgazal authored
      The ENA driver tries to open a queue per vCPU.
      To determine how many vCPUs the instance has, it uses num_possible_cpus(),
      while it should use num_online_cpus() instead.
      Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
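A rough userspace analogue of sizing queues by online CPUs is sketched below. It assumes a Linux/glibc system where `sysconf(_SC_NPROCESSORS_ONLN)` is available; that call is only an approximation of the kernel's num_online_cpus(), and `queues_for_online_cpus` and its `device_max_queues` parameter are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <unistd.h>

/* One queue per online CPU, capped by what the device can provide. */
static long queues_for_online_cpus(long device_max_queues)
{
    assert(device_max_queues > 0);
    long online = sysconf(_SC_NPROCESSORS_ONLN);

    if (online < 1)          /* sysconf failed or reported nothing usable */
        online = 1;
    return online < device_max_queues ? online : device_max_queues;
}
```

Sizing by the online count avoids allocating queues for CPUs that are configured but not currently available to the instance.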
    • net/ena: remove ntuple filter support from device feature list · fdeea0ad
      Netanel Belgazal authored
      Remove NETIF_F_NTUPLE from netdev->features.
      The ENA device driver does not support ntuple filtering.
      Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 09 Feb, 2017 24 commits
  3. 08 Feb, 2017 7 commits
    • mlxsw: spectrum_router: Don't reflect LINKDOWN nexthops · df6dd79b
      Ido Schimmel authored
      The kernel resolves the nexthops for a given route using
      FIB_LOOKUP_IGNORE_LINKSTATE which means a notification can be sent for a
      route with one of its nexthops being LINKDOWN.
      
      In case IGNORE_ROUTES_WITH_LINKDOWN is set for the nexthop netdev, then
      we shouldn't reflect the nexthop to the device's table.
      
      Once the nexthop netdev's carrier goes up we'll be notified using NH_ADD
      and reflect it to the device.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'mlxsw-Reflect-nexthop-status-changes' · d9e1661d
      David S. Miller authored
      Jiri Pirko says:
      
      ====================
      mlxsw: Reflect nexthop status changes
      
      Ido says:
      
      When the kernel forwards IPv4 packets via multipath routes it doesn't
      consider nexthops that are dead or linkdown. For example, if the nexthop
      netdev is administratively down or doesn't have a carrier.
      
      Devices capable of offloading such multipath routes need to be made
      aware of changes in the reflected nexthops' status. Otherwise, the
      device might forward packets via non-functional nexthops, resulting in
      packet loss. This patchset aims to fix that.
      
      The first 11 patches deal with the necessary restructuring in the
      mlxsw driver, so that it's able to correctly add and remove nexthops
      from the device's adjacency table.
      
      The 12th patch adds the NH_{ADD,DEL} events to the FIB notification
      chain. These notifications are sent whenever the kernel decides to add
      or remove a nexthop from the forwarding plane.
      
      Finally, the last three patches add support for these events in the
      mlxsw driver, which is currently the only driver capable of offloading
      multipath routes.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_router: Flush resources when RIF is deleted · 9665b745
      Ido Schimmel authored
      When the last IP address is removed from a netdev, its RIF is deleted.
      However, if the user didn't first remove neighbours and nexthops using this
      interface, then they would still be present in the device's tables.
      
      Therefore, whenever a RIF is deleted, make sure all the neighbours and
      nexthops (adjacency entries) using it are removed from the relevant
      tables as well.
      
      The action associated with any route using this RIF would be refreshed,
      most likely to trap. If the kernel decides to remove the route (e.g.,
      because all the nexthops are now DEAD), then an event would be sent,
      causing the route to be removed from the device.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_router: Reflect nexthop status changes · ad178c8e
      Ido Schimmel authored
      When a packet hits a multipath route in the device's routing table, a
      hash is computed over its headers, which is then used to select the
      appropriate nexthop from the device's adjacency table.
      
      There are situations in which the kernel removes a nexthop from a
      multipath route (e.g., no carrier) and the device should do the same.
      
      Upon the reception of NH_{ADD,DEL} events, add or remove a nexthop from
      the device's adjacency table and refresh all the routes using the
      nexthop group. If all the nexthops of a multipath route are invalid,
      then any packet hitting the route would be trapped to the CPU for
      forwarding.
      
      If all the nexthops are DEAD, then the kernel would remove the route
      entirely. On the other hand, if all the nexthops are merely LINKDOWN,
      then the kernel would keep the route and forward any incoming packet
      using a different route.
      
      While the last case might sound like a problem, it's expected that a
      routing daemon running in user space would remove such a route from the
      FIB as it's dumped with the DEAD flag set.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
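The selection and fallback behaviour described in this commit can be modelled in a few lines of C. This is a sketch under stated assumptions, not the device's algorithm: the structures, `NH_TRAP`, `flow_hash` (FNV-1a, used only as a stand-in for the hardware hash), and `nh_group_select` are all hypothetical. A nexthop's `valid` flag plays the role of NH_ADD (set) and NH_DEL (cleared); when no nexthop is valid, the packet is trapped to the CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NH_TRAP (-1)          /* no usable nexthop: trap to the CPU */
#define MAX_NH  8

struct nexthop {
    int port;                 /* hypothetical egress port id */
    int valid;                /* cleared on NH_DEL, set on NH_ADD */
};

struct nh_group {
    struct nexthop nh[MAX_NH];
    size_t count;
};

/* FNV-1a over the flow key; a stand-in for whatever hash the hardware uses. */
static uint32_t flow_hash(const uint8_t *key, size_t len)
{
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < len; i++) {
        h ^= key[i];
        h *= 16777619u;
    }
    return h;
}

/* Pick the (hash mod n)-th valid nexthop, or NH_TRAP if none is valid. */
static int nh_group_select(const struct nh_group *grp, uint32_t hash)
{
    assert(grp != NULL && grp->count <= MAX_NH);
    size_t valid = 0;

    for (size_t i = 0; i < grp->count; i++)
        valid += grp->nh[i].valid ? 1 : 0;
    if (valid == 0)
        return NH_TRAP;

    size_t target = hash % valid;
    for (size_t i = 0; i < grp->count; i++) {
        if (!grp->nh[i].valid)
            continue;
        if (target-- == 0)
            return grp->nh[i].port;
    }
    return NH_TRAP;           /* not reached: target < valid */
}
```

Clearing `valid` on one entry redistributes flows over the remaining nexthops, which mirrors the refresh of routes using the nexthop group.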
    • ipv4: fib: Notify about nexthop status changes · 982acb97
      Ido Schimmel authored
      When a multipath route is hit, the kernel doesn't consider nexthops that
      are DEAD or LINKDOWN when IN_DEV_IGNORE_ROUTES_WITH_LINKDOWN is set.
      Devices that offload multipath routes need to be made aware of nexthop
      status changes. Otherwise, the device will keep forwarding packets to
      non-functional nexthops.
      
      Add the FIB_EVENT_NH_{ADD,DEL} events to the FIB notification chain,
      which notify capable devices when they should add or delete a nexthop
      from their tables.
      
      Cc: Roopa Prabhu <roopa@cumulusnetworks.com>
      Cc: David Ahern <dsa@cumulusnetworks.com>
      Cc: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_router: Use trap action only for some route types · 70ad3506
      Ido Schimmel authored
      The device can have one of three actions associated with a route:
      
      1) Remote - packets continue to the adjacency table
      2) Local - packets continue to the neighbour table
      3) Trap - packets continue to the CPU
      
      The first two actions can also trap packets to the CPU, but they do so
      using a different trap ID, which has a lower traffic class and less
      allotted bandwidth.
      
      We currently use the third action for both RTN_{LOCAL,BROADCAST} routes
      and RTN_UNICAST routes not pointing to the switch ports.
      
      However, packets that merely need to be forwarded by the switch are
      likely not control packets and can therefore be scheduled towards the
      CPU using a lower traffic class.
      
      Achieve the above by assigning the third action only to local and
      broadcast routes and have any other route use either of the first two
      actions, based on whether the route is gatewayed or not.
      
      This will also allow us to refresh routes using the local action and
      have them trap packets when their RIF is no longer valid following a
      NH_DEL event.
      
      One side effect of this patch is that we no longer give special
      treatment to multipath routes using both switch and non-switch ports
      towards their nexthops. If at least one of the nexthops can be resolved,
      then the device will forward the packets instead of trapping them.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
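The action assignment described in this commit can be sketched as a small decision function. The enum and function names are hypothetical, the model is reduced to three route types, and mapping gatewayed routes to the remote action (adjacency table) and non-gatewayed routes to the local action (neighbour table) is an assumption inferred from the description of the three actions rather than taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified route types; the real kernel has more RTN_* values. */
enum route_type { ROUTE_UNICAST, ROUTE_LOCAL, ROUTE_BROADCAST };

/* The three device actions listed in the commit message. */
enum route_action { ACTION_REMOTE, ACTION_LOCAL, ACTION_TRAP };

/* Trap only local and broadcast routes; other routes use remote or local
 * depending on whether they go through a gateway. */
static enum route_action route_action_for(enum route_type type, bool gatewayed)
{
    switch (type) {
    case ROUTE_LOCAL:
    case ROUTE_BROADCAST:
        return ACTION_TRAP;
    case ROUTE_UNICAST:
        return gatewayed ? ACTION_REMOTE : ACTION_LOCAL;
    }
    assert(!"unknown route type");
    return ACTION_TRAP;
}
```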
    • mlxsw: spectrum_router: Determine offload status using generic function · 4b411477
      Ido Schimmel authored
      The previous patch introduced a generic function to determine whether a
      route should be offloaded or not. Make use of it here.
      
      In the future we're going to add more conditions to this test (e.g.,
      whether TOS is non-zero), so it makes sense to centralize it instead of
      open coding it in a few places.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>