1. 25 Sep, 2020 12 commits
  2. 24 Sep, 2020 17 commits
  3. 23 Sep, 2020 11 commits
    • Merge branch 'net-bridge-mcast-IGMPv3-MLDv2-fast-path-part-2' · 68d4fd30
      David S. Miller authored
      Nikolay Aleksandrov says:
      
      ====================
      net: bridge: mcast: IGMPv3/MLDv2 fast-path (part 2)
      
      This is the second part of the IGMPv3/MLDv2 support, which adds support
      for the fast-path. In order to be able to handle source entries we add
      mdb support for S,G entries (i.e. we add source address support to
      br_ip). That requires extending the current mdb netlink API; fortunately
      we just add another attribute which will contain nested future mdb
      attributes, then we use it to add support for user add, del and dump of
      S,G entries. The lookup sequence is simple: when IGMPv3/MLDv2 are enabled
      do the S,G lookup first and if it fails fall back to *,G. The more complex
      part is when we begin handling source lists and auto-installing S,G entries
      and *,G filter mode transitions. We have the following cases:
       1) *,G INCLUDE -> EXCLUDE transition: we need to install the port in
          all of *,G's installed S,G entries for proper replication (except
          the ones explicitly blocked), this is also necessary when adding a
          new *,G EXCLUDE port group
      
       2) *,G EXCLUDE -> INCLUDE transition: we need to remove the port from
          all of *,G's installed S,G entries, this is also necessary when
          removing a *,G port group
      
       3) New S,G port entry: we need to install all current *,G EXCLUDE ports
      
       4) Remove S,G port entry: if all other port groups were auto-installed we
          can safely remove them and delete the whole S,G entry
      
      Currently we compute these operations from the available ports, their
      source lists and their filter mode. In the future we can extend the port
      group structure and reduce the running time of these ops. Also one
      current limitation is that host-joined S,G entries are not supported.
      I.e. one cannot add "dev bridge port bridge" mdb S,G entries. The host
      join is currently considered an EXCLUDE {} join, so it's reflected in
      all of *,G's installed S,G entries. If an S,G,port entry is added as
      temporary then the kernel can take it over if a source shows up from a
      report; permanent entries are skipped. In order to properly handle
      blocked sources we add a new port group blocked flag to avoid forwarding
      to that port group in the S,G. Finally when forwarding we use the port
      group filter mode (if it's INCLUDE and the port group is from a *,G then
      don't replicate to it, respectively if it's EXCLUDE then forward) and the
      blocked flag (obviously if it's set - skip that port unless it's a
      router port) to decide if the port should be skipped. Another limitation
      is that we can't do some of the above transitions without a small traffic
      drop while installing/removing entries. That will be taken care of when
      we add atomic swap of port replication lists later.
      
      Patch break down:
       patches 1-3: prepare the mdb code for better extack support which is
                    used in future patches to return a more meaningful error
       patches 4-6: add the source address field to struct br_ip, and do minor
                    cleanups around it
       patches 7-8: extend the mdb netlink API so we can send new mdb
                    attributes and use the new API for S,G entry add/del/dump
                    support
       patch     9: takes care of S,G entries when doing a lookup (first S,G
                    then *,G lookup)
       patch    10: adds a new port group field and attribute for origin protocol;
                    we use the already available RTPROT_ definitions,
                    currently user-space entries are added as RTPROT_STATIC and
                    kernel entries are added as RTPROT_KERNEL, we may allow
                    user-space to set custom values later (e.g. for FRR, clag)
       patch    11: adds an internal S,G,port rhashtable to speed up filter
                    mode transitions
       patch    12: initial automatic install of S,G entries based on port
                    groups' source lists
       patch    13: handles port group modes on transitions or when new
                    port group entries are added
       patch    14: self-explanatory - adds support for blocked port group
                    entries needed to stop forwarding to particular S,G,port
                    entries
       patch    15: handles host-join/leave state changes, treats host-joins
                    as EXCLUDE {} groups (reflected in all *,G's S,G entries)
       patch    16: finally adds the fast-path filter mode and block flag
                    support
      
      Here are the sets that will come next (in order):
       - iproute2 support for IGMPv3/MLDv2
       - selftests for all mode transitions and group flags
       - explicit host tracking for proper fast-leave support
       - atomic port replication lists (these are also needed for broadcast
         forwarding optimizations)
       - mode transition optimization and removal of open-coded sorted lists
      
      Not implemented yet:
       - Host IGMPv3/MLDv2 filter support (currently we handle only join/leave
         as before)
       - Proper other querier source timer and value updates
       - IGMPv3/v2 MLDv2/v1 compat (I have a few rough patches for this one)
      
      v2: fix build with CONFIG_BATMAN_ADV_MCAST in patch 6
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
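Case 4 from the cover letter above (removing an S,G port entry) can be modelled in a few lines. This is a minimal sketch, not the kernel code: the function name `remove_sg_port`, the dict-of-sets layout and the string flag are assumptions made for illustration, with `STAR_EXCL` standing in for the MDB_PG_FLAGS_STAR_EXCL flag that marks automatically added *,G EXCLUDE ports.

```python
STAR_EXCL = "star_excl"  # stand-in for the kernel's MDB_PG_FLAGS_STAR_EXCL


def remove_sg_port(sg_ports, port):
    """Remove `port` from one S,G entry (hypothetical helper).

    sg_ports maps a port name to the set of flags on its S,G port group.
    Returns True when the whole S,G entry can be deleted.
    """
    sg_ports.pop(port, None)
    # If every remaining port group was only auto-installed from a *,G
    # EXCLUDE port, nothing explicitly refers to this source any more,
    # so the leftovers are dropped and the entry can go away.
    if all(STAR_EXCL in flags for flags in sg_ports.values()):
        sg_ports.clear()
        return True
    return False
```

In this model an entry that still holds at least one port without the auto-installed flag is kept as it is.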
    • net: bridge: mcast: when forwarding handle filter mode and blocked flag · 36cfec73
      Nikolay Aleksandrov authored
      We need to avoid forwarding to ports in MCAST_INCLUDE filter mode when the
      mdst entry is a *,G or when the port has the blocked flag.
      Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
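The per-port forwarding decision described in this commit can be sketched as a small predicate. It is only a model under stated assumptions: `PortGroup`, `FilterMode` and `should_forward` are hypothetical names, and the router-port exception follows the cover letter's remark that a blocked port is skipped "unless it's a router port"; applying that exception to the INCLUDE check as well is an assumption of this sketch.

```python
from dataclasses import dataclass
from enum import Enum


class FilterMode(Enum):
    INCLUDE = "include"
    EXCLUDE = "exclude"


@dataclass
class PortGroup:
    port: str
    filter_mode: FilterMode
    blocked: bool = False  # models the new blocked port group flag


def should_forward(pg, entry_is_star_g, is_router_port=False):
    """Decide whether a matched mdb entry replicates to this port group."""
    if is_router_port:
        # Assumption: router ports receive the traffic regardless.
        return True
    if pg.blocked:
        return False
    # A *,G port group in INCLUDE mode only wants its listed sources,
    # which are served through their own S,G entries.
    if entry_is_star_g and pg.filter_mode is FilterMode.INCLUDE:
        return False
    return True
```

For example, `should_forward(PortGroup("swp1", FilterMode.INCLUDE), entry_is_star_g=True)` is false, while the same port group reached through an S,G entry is forwarded to.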
    • net: bridge: mcast: handle host state · 094b82fd
      Nikolay Aleksandrov authored
      Since host joins are considered as EXCLUDE {} joins we need to reflect
      that in all of *,G ports' S,G entries. Since an S,G can only have
      host_joined == true set automatically, we can safely set it to false
      when removing all automatically added entries upon S,G delete.
      Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: mcast: add support for blocked port groups · 9116ffbf
      Nikolay Aleksandrov authored
      When excluding S,G entries we need a way to block a particular S,G,port.
      The new port group flag is managed based on the source's timer as per
      RFCs 3376 and 3810. When a source expires and its port group is in
      EXCLUDE mode, it will be blocked.
      Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
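A rough model of the source-timer rule in this commit is shown below. The function name and data layout are hypothetical. The EXCLUDE branch follows the commit message; the INCLUDE branch (dropping the port from the S,G) is an assumption of this sketch, based on RFC 3376 removing the source record when a source timer expires in INCLUDE mode.

```python
BLOCKED = "blocked"  # stand-in for the new blocked port group flag


def on_source_expired(filter_mode, sg_ports, port):
    """Handle expiry of one source timer for `port` (hypothetical helper).

    sg_ports maps a port name to the set of flags on its S,G port group.
    """
    if filter_mode == "exclude":
        # EXCLUDE mode: keep the S,G,port entry but stop forwarding to it.
        sg_ports.setdefault(port, set()).add(BLOCKED)
    else:
        # INCLUDE mode (assumption): the source is no longer wanted at all.
        sg_ports.pop(port, None)
```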
    • net: bridge: mcast: handle port group filter modes · 8266a049
      Nikolay Aleksandrov authored
      We need to handle group filter mode transitions and initial state.
      To change a port group's INCLUDE -> EXCLUDE mode (or when we have added
      a new port group in EXCLUDE mode) we need to add that port to all of
      *,G ports' S,G entries for proper replication. When the EXCLUDE state is
      changed from an IGMPv3 report, br_multicast_fwd_filter_exclude() must be
      called after the source list processing because the assumption is that
      all of the group's S,G entries will be created before transitioning to
      EXCLUDE mode, i.e. most importantly its blocked entries will already be
      added so it will not get automatically added to them.
      The transition EXCLUDE -> INCLUDE happens only when a port group timer
      expires; it requires us to remove that port from all of *,G ports' S,G
      entries where it was automatically added previously.
      Finally when we are adding a new S,G entry we must add all of *,G's
      EXCLUDE ports to it.
      In order to distinguish automatically added *,G EXCLUDE ports we have a
      new port group flag - MDB_PG_FLAGS_STAR_EXCL.
      Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
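The three operations in this commit message can be sketched over a toy mdb. Everything here is a simplified model: the helper names `to_exclude`, `to_include` and `new_sg_entry` are hypothetical, each S,G entry is a dict from port name to a set of flags, and `STAR_EXCL` stands in for MDB_PG_FLAGS_STAR_EXCL.

```python
STAR_EXCL = "star_excl"  # stand-in for MDB_PG_FLAGS_STAR_EXCL
BLOCKED = "blocked"      # stand-in for the blocked port group flag


def to_exclude(sg_entries, port):
    """INCLUDE -> EXCLUDE (or new EXCLUDE port group): join every S,G.

    Ports already present, including blocked ones created earlier by the
    source list processing, are left untouched.
    """
    for ports in sg_entries.values():
        if port not in ports:
            ports[port] = {STAR_EXCL}


def to_include(sg_entries, port):
    """EXCLUDE -> INCLUDE: drop the port where it was auto-added."""
    for ports in sg_entries.values():
        if STAR_EXCL in ports.get(port, set()):
            del ports[port]


def new_sg_entry(sg_entries, star_g_modes, source, port):
    """Add an S,G port entry and pull in all current *,G EXCLUDE ports."""
    ports = sg_entries.setdefault(source, {})
    ports[port] = set()
    for other, mode in star_g_modes.items():
        if mode == "exclude" and other not in ports:
            ports[other] = {STAR_EXCL}
```

The ordering requirement from the message shows up in `to_exclude`: because blocked S,G,port entries already exist when it runs, the port is not re-added to them as a forwarding member.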
    • net: bridge: mcast: install S,G entries automatically based on reports · b0812368
      Nikolay Aleksandrov authored
      This patch adds support for automatic install of S,G mdb entries based
      on the port group's source list and the source entry's timer.
      Once installed the S,G will be used when forwarding packets if the
      appropriate multicast/mld versions are set. A new source flag called
      BR_SGRP_F_INSTALLED denotes if the source has a forwarding mdb entry
      installed.
      Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: mcast: add sg_port rhashtable · 085b53c8
      Nikolay Aleksandrov authored
      To speed up S,G forward handling we need to be able to quickly find out
      if a port is a member of an S,G group. To do that add a global S,G port
      rhashtable with key: source addr, group addr, protocol, vid (all br_ip
      fields) and port pointer.
      Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
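The idea of keying a table on the br_ip fields plus the port can be illustrated with an ordinary Python dict standing in for the rhashtable. The type and function names below are hypothetical, and ports are plain strings here instead of pointers.

```python
from typing import NamedTuple


class SGPortKey(NamedTuple):
    """Models the key from the commit message: all br_ip fields + port."""
    src: str
    group: str
    proto: str
    vid: int
    port: str


def sg_port_add(table, key, port_group):
    table[key] = port_group


def port_is_member(table, key):
    # Constant-time membership check, with no walk over the S,G's port list.
    return key in table
```

A lookup then reduces to building the key, for instance `SGPortKey("192.0.2.1", "239.1.1.1", "ipv4", 10, "swp1")`, and testing membership.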
    • net: bridge: mcast: add rt_protocol field to the port group struct · 8f8cb77e
      Nikolay Aleksandrov authored
      We need to be able to differentiate between pg entries created by
      user-space and the kernel when we start generating S,G entries for
      IGMPv3/MLDv2's fast path. User-space entries are created by default as
      RTPROT_STATIC and the kernel entries are RTPROT_KERNEL. Later we can
      allow user-space to provide the entry rt_protocol so we can
      differentiate between who added the entries specifically (e.g. clag,
      admin, frr etc).
      Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: mcast: when igmpv3/mldv2 are enabled lookup (S,G) first, then (*,G) · 7d07a68c
      Nikolay Aleksandrov authored
      If (S,G) entries are enabled (igmpv3/mldv2) then look them up first. If
      no (S,G) entry is present then try to find (*,G).
      Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
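The lookup order from this commit is easy to sketch. The function name `mdb_lookup`, the tuple keys and the `sg_enabled` parameter are assumptions for illustration; a key with `None` as the source represents a *,G entry.

```python
def mdb_lookup(mdb, group, source, sg_enabled):
    """Return the matching entry, trying (S,G) before (*,G).

    mdb maps (source, group) tuples to entries; sg_enabled models
    IGMPv3/MLDv2 being turned on for the bridge.
    """
    if sg_enabled:
        entry = mdb.get((source, group))
        if entry is not None:
            return entry
    # Fall back to the any-source entry for the group.
    return mdb.get((None, group))
```

With IGMPv3/MLDv2 disabled the model behaves as before this series: only the *,G entry is consulted.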
    • net: bridge: mdb: add support for add/del/dump of entries with source · 88d4bd18
      Nikolay Aleksandrov authored
      Add new mdb attributes (MDBE_ATTR_SOURCE for setting,
      MDBA_MDB_EATTR_SOURCE for dumping) to allow add/del and dump of mdb
      entries with a source address (S,G). New S,G entries are created with
      filter mode of MCAST_INCLUDE. The same attributes are used for IPv4 and
      IPv6, they're validated and parsed based on their protocol.
      S,G host joined entries which are added by user are not allowed yet.
      Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: mdb: add support to extend add/del commands · 9c4258c7
      Nikolay Aleksandrov authored
      Since the MDB add/del code expects an exact struct br_mdb_entry we can't
      really add any extensions, thus add a new nested attribute at the level of
      MDBA_SET_ENTRY called MDBA_SET_ENTRY_ATTRS which will be used to pass
      all new options via netlink attributes. This patch doesn't change
      anything functionally since the new attribute is not used yet, only
      parsed.
      Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
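To show what a nested netlink attribute looks like on the wire, here is a small sketch that packs generic netlink TLVs (a 16-bit length, a 16-bit type, then the payload padded to 4 bytes). The helper `nla` is hypothetical, and the numeric attribute types are placeholders chosen for the example; the real values of MDBA_SET_ENTRY_ATTRS and MDBE_ATTR_SOURCE should be taken from the kernel's uapi headers.

```python
import socket
import struct

NLA_F_NESTED = 0x8000  # flag bit marking an attribute that carries attributes
NLA_HDRLEN = 4

# Placeholder type numbers, assumed for illustration only.
EXAMPLE_SET_ENTRY_ATTRS = 2
EXAMPLE_ATTR_SOURCE = 1


def nla(attr_type, payload):
    """Pack one netlink attribute: header, payload, padding to 4 bytes."""
    length = NLA_HDRLEN + len(payload)  # the length field excludes padding
    pad = (-length) % 4
    return struct.pack("=HH", length, attr_type) + payload + b"\0" * pad


# An IPv4 source address wrapped in the nested container attribute.
source = nla(EXAMPLE_ATTR_SOURCE, socket.inet_aton("192.0.2.1"))
nested = nla(EXAMPLE_SET_ENTRY_ATTRS | NLA_F_NESTED, source)
```

Because new options travel as attributes inside the container, later extensions can add attribute types without changing the fixed struct br_mdb_entry layout.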