1. 09 Feb, 2017 23 commits
  2. 08 Feb, 2017 17 commits
    • Ido Schimmel's avatar
      mlxsw: spectrum_router: Don't reflect LINKDOWN nexthops · df6dd79b
      Ido Schimmel authored
      The kernel resolves the nexthops for a given route using
      FIB_LOOKUP_IGNORE_LINKSTATE which means a notification can be sent for a
      route with one of its nexthops being LINKDOWN.
      
      In case IGNORE_ROUTES_WITH_LINKDOWN is set for the nexthop netdev, then
      we shouldn't reflect the nexthop to the device's table.
      
      Once the nexthop netdev's carrier goes up we'll be notified using NH_ADD
      and reflect it to the device.
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      df6dd79b
    • David S. Miller's avatar
      Merge branch 'mlxsw-Reflect-nexthop-status-changes' · d9e1661d
      David S. Miller authored
      Jiri Pirko says:
      
      ====================
      mlxsw: Reflect nexthop status changes
      
      Ido says:
      
      When the kernel forwards IPv4 packets via multipath routes it doesn't
      consider nexthops that are dead or linkdown. For example, if the nexthop
      netdev is administratively down or doesn't have a carrier.
      
      Devices capable of offloading such multipath routes need to be made
      aware of changes in the reflected nexthops' status. Otherwise, the
      device might forward packets via non-functional nexthops, resulting in
      packet loss. This patchset aims to fix that.
      
      The first 11 patches deal with the necessary restructuring in the
      mlxsw driver, so that it's able to correctly add and remove nexthops
      from the device's adjacency table.
      
      The 12th patch adds the NH_{ADD,DEL} events to the FIB notification
      chain. These notifications are sent whenever the kernel decides to add
      or remove a nexthop from the forwarding plane.
      
      Finally, the last three patches add support for these events in the
      mlxsw driver, which is currently the only driver capable of offloading
      multipath routes.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d9e1661d
    • Ido Schimmel's avatar
      mlxsw: spectrum_router: Flush resources when RIF is deleted · 9665b745
      Ido Schimmel authored
      When the last IP address is removed from a netdev, its RIF is deleted.
      However, if user didn't first remove neighbours and nexthops using this
      interface, then they would still be present in the device's tables.
      
      Therefore, whenever a RIF is deleted, make sure all the neighbours and
      nexthops (adjacency entries) using it are removed from the relevant
      tables as well.
      
      The action associated with any route using this RIF would be refreshed,
      most likely to trap. If the kernel decides to remove the route (f.e.,
      because all the nexthops are now DEAD), then an event would be sent,
      causing the route to be removed from the device.
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9665b745
    • Ido Schimmel's avatar
      mlxsw: spectrum_router: Reflect nexthop status changes · ad178c8e
      Ido Schimmel authored
      When a packet hits a multipath route in the device's routing table, a
      hash is computed over its headers, which is then used to select the
      appropriate nexthop from the device's adjacency table.
      
      There are situations in which the kernel removes a nexthop from a
      multipath route (e.g., no carrier) and the device should do the same.
      
      Upon the reception of NH_{ADD,DEL} events, add or remove a nexthop from
      the device's adjacency table and refresh all the routes using the
      nexthop group. If all the nexthops of a multipath route are invalid,
      then any packet hitting the route would be trapped to the CPU for
      forwarding.
      
      If all the nexthops are DEAD, then the kernel would remove the route
      entirely. On the other hand, if all the nexthops are merely LINKDOWN,
      then the kernel would keep the route and forward any incoming packet
      using a different route.
      
      While the last case might sound like a problem, it's expected that a
      routing daemon running in user space would remove such a route from the
      FIB as it's dumped with the DEAD flag set.
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ad178c8e
    • Ido Schimmel's avatar
      ipv4: fib: Notify about nexthop status changes · 982acb97
      Ido Schimmel authored
      When a multipath route is hit the kernel doesn't consider nexthops that
      are DEAD or LINKDOWN when IN_DEV_IGNORE_ROUTES_WITH_LINKDOWN is set.
      Devices that offload multipath routes need to be made aware of nexthop
      status changes. Otherwise, the device will keep forwarding packets to
      non-functional nexthops.
      
      Add the FIB_EVENT_NH_{ADD,DEL} events to the fib notification chain,
      which notify capable devices when they should add or delete a nexthop
      from their tables.
      
      Cc: Roopa Prabhu <roopa@cumulusnetworks.com>
      Cc: David Ahern <dsa@cumulusnetworks.com>
      Cc: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarJiri Pirko <jiri@mellanox.com>
      Reviewed-by: default avatarAndy Gospodarek <gospo@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      982acb97
    • Ido Schimmel's avatar
      mlxsw: spectrum_router: Use trap action only for some route types · 70ad3506
      Ido Schimmel authored
      The device can have one of three actions associated with a route:
      
      1) Remote - packets continue to the adjacency table
      2) Local - packets continue to the neighbour table
      3) Trap - packets continue to the CPU
      
      The first two actions can also trap packets to the CPU, but they do so
      using a different trap ID, which has a lower traffic class and less
      allotted bandwidth.
      
      We currently use the third action for both RTN_{LOCAL,BROADCAST} routes
      and RTN_UNICAST routes not pointing to the switch ports.
      
      However, packets that merely need to be forwarded by the switch are
      likely not control packets and can be therefore scheduled towards the
      CPU using a lower traffic class.
      
      Achieve the above by assigning the third action only to local and
      broadcast routes and have any other route use either of the first two
      actions, based on whether the route is gatewayed or not.
      
      This will also allow us to refresh routes using the local action and
      have them trap packets when their RIF is no longer valid following a
      NH_DEL event.
      
      One side effect of this patch is that we no longer give special
      treatment to multipath routes using both switch and non-switch ports
      towards their nexthops. If at least one of the nexthops can be resolved,
      then the device will forward the packets instead of trapping them.
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      70ad3506
    • Ido Schimmel's avatar
      mlxsw: spectrum_router: Determine offload status using generic function · 4b411477
      Ido Schimmel authored
      The previous patch introduced a generic function to determine whether a
      route should be offloaded or not. Make use of it here.
      
      In the future we're going to add more conditions to this test (e.g.,
      whether TOS is non-zero), so it makes sense to centralize it instead of
      open coding it in a few places.
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4b411477
    • Ido Schimmel's avatar
      mlxsw: spectrum_router: More accurately set offload flag · 013b20f9
      Ido Schimmel authored
      We currently set the RTNH_F_OFFLOAD flag for all routes using remote
      action, but this isn't always correct. If none of the nexthops
      associated with a gatewayed route can be offloaded into the device, then
      any packet hitting it would be trapped to the CPU and forwarded by the
      kernel.
      
      Solve this by pushing the setting of the offload flag to after the route
      was programmed into the device, thereby allowing us to take all the
      parameters into account.
      
      This change will also help us further in the patchset, when we refresh
      routes following the reception of NH_{ADD,DEL} events.
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      013b20f9
    • Ido Schimmel's avatar
      mlxsw: spectrum_router: Refactor nexthop init routine · a8c97014
      Ido Schimmel authored
      The nexthop init and de-init functions both have symmetric parts
      concerned with the reflection of the neighbour entry into the device's
      adjacency table, in case it's used by a gatewayed route.
      
      These sections of code also need to be called when a nexthop is marked
      as valid / invalid following NH_{ADD,DEL} events. Break these out into
      appropriate functions, so that they could be invoked following the
      reception of above events.
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a8c97014
    • Ido Schimmel's avatar
      mlxsw: spectrum_router: Remove FIB info from FIB entry struct · c8b03077
      Ido Schimmel authored
      After the previous changes, the FIB info is embedded in every nexthop
      group struct, which in turn is embedded in every FIB entry struct.
      
      We can therefore safely remove the FIB info from the entry struct. This
      has the added advantage of making the router-related structs more
      generic and suitable for use with IPv6 offloads.
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c8b03077
    • Ido Schimmel's avatar
      mlxsw: spectrum_router: Store routes in a more generic way · b8399a1e
      Ido Schimmel authored
      Up until now, the only FIB entries that were associated with a nexthop
      group were routes to remote networks where all the nexthop devices had a
      valid router interface (RIF). This is in contrast to the FIB code,
      where all the routes are associated with a FIB info. The same design
      choice needs to be applied to the driver's cache.
      
      Based on the NH_{ADD,DEL} events which will be added later in the
      patchset, we need to be able to change the action (forward / trap)
      associated with all the routes using the nexthop group. However, if we
      can't link between the nexthop and the routes using it, then the above
      is impossible.
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b8399a1e
    • Ido Schimmel's avatar
      mlxsw: spectrum_router: Add gateway indication to nexthop group · b3e8d1eb
      Ido Schimmel authored
      The next patch is going to generalize the way in which we store routes.
      Instead of attaching a nexthop group only to gatewayed routes, one will
      be attached to each route, in a similar way to the way the FIB code
      stores its routes.
      
      The above means that any function operating on a nexthop group cannot
      assume the group represents only gatewayed nexthops. One such function
      is the one that refreshes a nexthop group and updates the adjacency
      table following nexthop changes.
      
      For a nexthop group that doesn't represent any gateways this function
      would essentially be a NOP, but it would be useful if it did update the
      action associated with any route using it. This will allow us to later
      consolidate code paths when a nexthop changes following NH_{ADD,DEL}
      events.
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b3e8d1eb
    • Ido Schimmel's avatar
      mlxsw: spectrum_router: Use nexthop's scope to set action type · d55409cb
      Ido Schimmel authored
      We currently use the scope of the FIB info to distinguish between a
      direct unicast route and a gatewayed one. However, the kernel is
      perfectly happy to configure a route with scope UNIVERSE to a directly
      connected network.
      
      Instead, we can rely on the first nexthop's scope to check if the route
      is gatewayed or not.
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d55409cb
    • Ido Schimmel's avatar
      mlxsw: spectrum_router: Store nexthops in a hash table · c53b8e1b
      Ido Schimmel authored
      Later in the patchset we'll add the NH_{ADD,DEL} events which will let
      us know when a nexthop is considered to be dead. Based on these events
      we need to be able to add or remove the nexthop from the device's
      tables.
      
      Therefore, store the private nexthop structs in a hash table and use the
      kernel's fib_nh struct as the key, so that we'll be able to easily find
      them when the events are received.
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c53b8e1b
    • Ido Schimmel's avatar
      mlxsw: spectrum_router: Store nexthop groups in a hash table · e9ad5e7d
      Ido Schimmel authored
      Currently, when we're notified about a new RTN_UNICAST route we perform
      a lookup on the nexthop group list looking for a group with a matching
      configuration to that found in the FIB info. This is quite inefficient.
      
      Instead, we can simply rely on the kernel to consolidate several FIB
      configurations into the same FIB info and use the FIB info as the key
      for our private nexthop group struct.
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e9ad5e7d
    • Ido Schimmel's avatar
      mlxsw: spectrum_router: Nullify nexthop's neigh pointer · e58be79e
      Ido Schimmel authored
      When we invalidate a nexthop we should also invalidate its neighbour
      entry pointer as it might be destroyed later on. This makes the nexthop
      de-init function symmetric with its init and also ensures nobody will
      try to access the neighbour entry.
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e58be79e
    • Jiri Pirko's avatar
      mlxsw: acl: Fix mlxsw_afa_block_commit error path · b05d0cfa
      Jiri Pirko authored
      No rollback is needed since the chain is in consistent state and
      mlxsw_afa_block_destroy() will take care of putting it away. So remove
      the one we have now which is wrong. Also move the set of 'finished' flag
      to the beginning of the function, because the block is certainly unusable
      for future action addition no matter if the function succeeds or not.
      Reported-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Fixes: 4cda7d8d ("mlxsw: core: Introduce flexible actions support")
      Signed-off-by: default avatarJiri Pirko <jiri@mellanox.com>
      Acked-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b05d0cfa