1. 05 Aug, 2021 5 commits
  2. 04 Aug, 2021 8 commits
  3. 03 Aug, 2021 10 commits
  4. 02 Aug, 2021 17 commits
    • net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry · 0541a629
      Vladimir Oltean authored
      Currently it is possible to add broken extern_learn FDB entries to the
      bridge in two ways:
      
      1. Entries pointing towards the bridge device that are not local/permanent:
      
      ip link add br0 type bridge
      bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static
      
      2. Entries pointing towards the bridge device or towards a port that
      are marked as local/permanent. However, the bridge does not process the
      'permanent' bit in any way, so they are recorded as though they
      aren't permanent:
      
      ip link add br0 type bridge
      bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn permanent
      
      Since commit 52e4bec1 ("net: bridge: switchdev: treat local FDBs the
      same as entries towards the bridge"), these incorrect FDB entries can
      even trigger NULL pointer dereferences inside the kernel.
      
      This is because that commit made the assumption that all FDB entries
      that are not local/permanent have a valid destination port. For context,
      local / permanent FDB entries either have fdb->dst == NULL, and these
      point towards the bridge device and are therefore local and not to be
      used for forwarding, or have fdb->dst == a net_bridge_port structure
      (but are to be treated in the same way, i.e. not for forwarding).
      
      That assumption _is_ correct as long as things are working correctly in
      the bridge driver, i.e. we cannot logically have fdb->dst == NULL under
      any circumstance for FDB entries that are not local. However, the
      extern_learn code path where FDB entries are managed by a user space
      controller shows that it is possible for the bridge kernel driver to
      misinterpret the NUD flags of an entry transmitted by user space, and
      end up having fdb->dst == NULL while not being a local entry. This is
      invalid and should be rejected.
      
      Before, the two commands listed above both crashed the kernel in this
      check from br_switchdev_fdb_notify:
      
      	struct net_device *dev = info.is_local ? br->dev : dst->dev;
      
      info.is_local == false, dst == NULL.
      
      After this patch, the invalid entry added by the first command is
      rejected:
      
      ip link add br0 type bridge && bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static; ip link del br0
      Error: bridge: FDB entry towards bridge must be permanent.
      
      and the valid entry added by the second command is properly treated as a
      local address and does not crash br_switchdev_fdb_notify anymore:
      
      ip link add br0 type bridge && bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn permanent; ip link del br0
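
      The two rules described above can be modeled in user space roughly as
      follows. This is a minimal sketch and not the bridge driver's code:
      struct fdb_request, validate_extern_learn() and the DEMO_NUD_PERMANENT
      value are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag value, standing in for the kernel's NUD_PERMANENT. */
#define DEMO_NUD_PERMANENT 0x80u

/* Hypothetical model of an extern_learn FDB add request. */
struct fdb_request {
	const void *port;	/* NULL means "towards the bridge device" */
	unsigned int state;	/* NUD-style state bits sent by user space */
	bool is_local;		/* output: treat the entry as local */
};

/* Returns 0 on success, -1 if the request must be rejected. */
int validate_extern_learn(struct fdb_request *req)
{
	bool permanent;

	assert(req != NULL);
	permanent = (req->state & DEMO_NUD_PERMANENT) != 0;

	/* A non-local entry needs a destination port to forward to. */
	if (req->port == NULL && !permanent)
		return -1;

	/* Honor the permanent bit instead of silently dropping it. */
	req->is_local = permanent;
	return 0;
}
```

      In this model, the first command from the commit message corresponds
      to port == NULL without the permanent bit (rejected), and the second
      to port == NULL with the permanent bit (accepted and marked local).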
      
      Fixes: eb100e0e ("net: bridge: allow to add externally learned entries from user-space")
      Reported-by: syzbot+9ba1174359adba5a5b7c@syzkaller.appspotmail.com
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Link: https://lore.kernel.org/r/20210801231730.7493-1-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Revert "mhi: Fix networking tree build." · 1c69d7cf
      Jakub Kicinski authored
      This reverts commit 40e15940.
      
      Looks like this commit breaks the build for me.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • docs: operstates: document IF_OPER_TESTING · 7a7b8635
      Jakub Kicinski authored
      IF_OPER_TESTING is in fact used today.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • docs: operstates: fix typo · 66e0da21
      Jakub Kicinski authored
      TVL -> TLV
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sparx5: fix compiletime_assert for GCC 4.9 · 6387f65e
      Jakub Kicinski authored
      Stephen reports sparx5 broke GCC 4.9 build.
      Move the compiletime_assert() out of the static function.
      Compile-tested only, no object code changes.
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Fixes: f3cad261 ("net: sparx5: add hostmode with phylink support")
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: natsemi: Fix missing pci_disable_device() in probe and remove · 7fe74dfd
      Wang Hai authored
      Replace pci_enable_device() with pcim_enable_device() so that
      pci_disable_device() and pci_release_regions() are called
      automatically on release.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Wang Hai <wanghai38@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: micrel: Fix detection of ksz87xx switch · a5e63c7d
      Steve Bennett authored
      The logic for discerning between KSZ8051 and KSZ87XX PHYs is incorrect,
      so the KSZ87XX switch is not identified correctly.
      
      ksz8051_ksz8795_match_phy_device() uses the parameter ksz_phy_id
      to discriminate whether it was called from ksz8051_match_phy_device()
      or from ksz8795_match_phy_device() but since PHY_ID_KSZ87XX is the
      same value as PHY_ID_KSZ8051, this doesn't work.
      
      Instead use a bool to discriminate the caller.
      
      Without this patch, the KSZ8795 switch port identifies as:
      
      ksz8795-switch spi3.1 ade1 (uninitialized): PHY [dsa-0.1:03] driver [Generic PHY]
      
      With the patch, it identifies correctly:
      
      ksz8795-switch spi3.1 ade1 (uninitialized): PHY [dsa-0.1:03] driver [Micrel KSZ87XX Switch]
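
      Why comparing the passed-in ID cannot work, and why a bool does, can
      be sketched as below. This is a simplified model rather than the
      driver code: match_by_id(), match_by_flag(), the DEMO_* constants and
      the has_ext_caps property (some feature assumed to differ between the
      PHY and the switch) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values; what matters is that both names share one ID. */
#define DEMO_PHY_ID_KSZ8051 0x00221550u
#define DEMO_PHY_ID_KSZ87XX 0x00221550u

_Static_assert(DEMO_PHY_ID_KSZ8051 == DEMO_PHY_ID_KSZ87XX,
	       "the two devices report the same PHY ID");

/* Broken approach: tell the callers apart by the ID they pass in. */
bool match_by_id(uint32_t ksz_phy_id, bool has_ext_caps)
{
	if (ksz_phy_id == DEMO_PHY_ID_KSZ8051)
		return has_ext_caps;	/* always taken: the IDs are equal */
	return !has_ext_caps;		/* never reached for the switch */
}

/* Fixed approach: the caller states explicitly which device it wants. */
bool match_by_flag(bool want_ksz8051, bool has_ext_caps)
{
	return want_ksz8051 ? has_ext_caps : !has_ext_caps;
}
```

      With match_by_id(), a caller asking for the switch on a device that
      lacks the assumed capability still takes the KSZ8051 branch and gets
      "no match"; match_by_flag() returns a match in the same situation.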
      
      Fixes: 8b95599c ("net: phy: micrel: Discern KSZ8051 and KSZ8795 PHYs")
      Signed-off-by: Steve Bennett <steveb@workware.net.au>
      Reviewed-by: Marek Vasut <marex@denx.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'sja1105-fdb-fixes' · cebb5103
      David S. Miller authored
      Vladimir Oltean says:
      
      ====================
      FDB fixes for NXP SJA1105
      
      I have some upcoming patches that make heavy use of statically installed
      FDB entries, and when testing them on SJA1105P/Q/R/S and SJA1110, it
      became clear that these switches do not behave reliably at all.
      
      - On SJA1110, a static FDB entry cannot be installed at all
      - On SJA1105P/Q/R/S, it is very picky about the inner/outer VLAN type
      - Dynamically learned entries will make us not install static ones, or
        even if we do, they might not take effect
      
      Patch 5/6 has a conflict with net-next (sorry), the commit message of
      that patch describes how to deal with it. Thanks.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: match FDB entries regardless of inner/outer VLAN tag · 47c2c0c2
      Vladimir Oltean authored
      On SJA1105P/Q/R/S and SJA1110, the L2 Lookup Table entries contain a
      maskable "inner/outer tag" bit which means:
      - when set to 1: match single-outer and double tagged frames
      - when set to 0: match untagged and single-inner tagged frames
      - when masked off: match all frames regardless of the type of tag
      
      This driver does not make any meaningful distinction between inner tags
      (matches on TPID) and outer tags (matches on TPID2). In fact, all VLAN
      table entries are installed as SJA1110_VLAN_D_TAG, which means that they
      match on both inner and outer tags.
      
      So it does not make sense that we install FDB entries with the IOTAG bit
      set to 1.
      
      In VLAN-unaware mode, we set both TPID and TPID2 to 0xdadb, so the
      switch will see frames as outer-tagged or double-tagged (never inner).
      So the FDB entries will match if IOTAG is set to 1.
      
      In VLAN-aware mode, we set TPID to 0x8100 and TPID2 to 0x88a8. So the
      switch will see untagged and 802.1Q-tagged packets as inner-tagged, and
      802.1ad-tagged packets as outer-tagged. So untagged and 802.1Q-tagged
      packets will not match FDB entries if IOTAG is set to 1, but 802.1ad
      tagged packets will. Strange.
      
      To fix this, simply mask off the IOTAG bit from FDB entries, and make
      them match regardless of whether the VLAN tag is inner or outer.
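
      The effect of masking off a maskable bit can be illustrated with a
      one-bit ternary comparison. This is only a sketch: struct iotag_key
      and iotag_matches() are hypothetical and do not reflect the layout of
      the hardware L2 Lookup Table entry.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical one-bit ternary key: a value plus a "care" mask. */
struct iotag_key {
	unsigned int value;	/* 1 = outer/double tagged, 0 = untagged/inner */
	unsigned int mask;	/* 1 = compare the bit, 0 = ignore it */
};

/* A frame matches when every bit selected by the mask is equal. */
bool iotag_matches(struct iotag_key key, unsigned int frame_iotag)
{
	assert(frame_iotag <= 1);
	return ((frame_iotag ^ key.value) & key.mask) == 0;
}
```

      A key of { .value = 1, .mask = 1 } matches only frames classified as
      outer-tagged, while any key with .mask = 0 matches both kinds of
      frames, which is the behavior the fix is after.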
      
      Fixes: 1da73821 ("net: dsa: sja1105: Add FDB operations for P/Q/R/S series")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: be stateless with FDB entries on SJA1105P/Q/R/S/SJA1110 too · 589918df
      Vladimir Oltean authored
      Similar to, but not quite the same as, what was done in commit b11f0a4c
      ("net: dsa: sja1105: be stateless when installing FDB entries") for
      SJA1105E/T, it is desirable to drop the priv->vlan_aware check and
      simply go ahead and install FDB entries in the VLAN that was given by
      the bridge.
      
      As opposed to SJA1105E/T, in SJA1105P/Q/R/S and SJA1110, the FDB is a
      maskable TCAM, and we are installing VLAN-unaware FDB entries with the
      VLAN ID masked off. However, such FDB entries might completely obscure
      VLAN-aware entries where the VLAN ID is included in the search mask,
      because the switch looks up the FDB from left to right and picks the
      first entry which results in a masked match. So it depends on whether
      the bridge installs first the VLAN-unaware or the VLAN-aware FDB entries.
      
      Anyway, if we had a VLAN-unaware FDB entry towards one set of DESTPORTS
      and a VLAN-aware one towards another set of DESTPORTS, the result is that
      the packets in VLAN-aware mode will be forwarded towards the DESTPORTS
      specified by the VLAN-unaware entry.
      
      To solve this, simply do not use the masked matching ability of the FDB
      for VLAN ID, and always match precisely on it. In VLAN-unaware mode, we
      configure the switch for shared VLAN learning, so the VLAN ID will be
      ignored anyway during lookup, so it is redundant to mask it off in the
      TCAM.
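
      The shadowing problem can be reproduced with a small first-match
      lookup. This is a sketch under assumptions: struct tcam_row and
      tcam_lookup() are hypothetical, and "left to right" is modeled simply
      as ascending array index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical TCAM row: the VLAN ID is compared only under vid_mask. */
struct tcam_row {
	uint64_t mac;
	uint16_t vid;
	uint16_t vid_mask;	/* 0 = VLAN ID ignored, 0xfff = exact match */
	unsigned int destports;
};

/* Scan from index 0 upwards and return the first masked match. */
const struct tcam_row *tcam_lookup(const struct tcam_row *rows, size_t n,
				   uint64_t mac, uint16_t vid)
{
	assert(rows != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (rows[i].mac != mac)
			continue;
		if (((rows[i].vid ^ vid) & rows[i].vid_mask) == 0)
			return &rows[i];
	}
	return NULL;
}
```

      If a row with vid_mask = 0 precedes an exact row for the same MAC
      address, a lookup for the exact row's VLAN returns the masked row and
      its DESTPORTS. When every row uses an exact VLAN ID match, the lookup
      no longer depends on installation order.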
      
      This patch conflicts with net-next commit 0fac6aa0 ("net: dsa: sja1105:
      delete the best_effort_vlan_filtering mode") which changed this line:
      	if (priv->vlan_state != SJA1105_VLAN_UNAWARE) {
      into:
      	if (priv->vlan_aware) {
      
      When merging with net-next, the lines added by this patch should take
      precedence in the conflict resolution (i.e. the "if" condition should be
      deleted in both cases).
      
      Fixes: 1da73821 ("net: dsa: sja1105: Add FDB operations for P/Q/R/S series")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: ignore the FDB entry for unknown multicast when adding a new address · 728db843
      Vladimir Oltean authored
      Currently, when sja1105pqrs_fdb_add() is called for a host-joined IPv6
      MDB entry such as 33:33:00:00:00:6a, the search for that address will
      return the FDB entry for SJA1105_UNKNOWN_MULTICAST, which has a
      destination MAC of 01:00:00:00:00:00 and a mask of 01:00:00:00:00:00.
      It returns that entry because, well, it matches, in the sense that
      unknown multicast is supposed by design to match it...
      
      But the issue is that we then proceed to overwrite this entry with the
      one for our precise host-joined multicast address, and the unknown
      multicast entry is no longer there - unknown multicast is now flooded to
      the same group of ports as broadcast, which does not look up the FDB.
      
      To solve this problem, we should ignore searches that return the unknown
      multicast address as the match, and treat them as "no match" which will
      result in the entry being installed to hardware.
      
      For this to work properly, we need to put the result of the FDB search
      in a temporary variable in order to avoid overwriting the l2_lookup
      entry we want to program. The l2_lookup entry returned by the search
      might not have the same set of DESTPORTS and not even the same MACADDR
      as the entry we're trying to add.
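
      The masked match that causes the false hit can be checked by hand
      with the addresses quoted above. The helpers below are a hypothetical
      sketch (masked_match() and hit_is_unknown_multicast() are not driver
      functions); MAC addresses are represented as 48-bit integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Address and mask of the unknown multicast entry, from the text above. */
#define UNKNOWN_MC_ADDR 0x010000000000ULL
#define UNKNOWN_MC_MASK 0x010000000000ULL

/* Masked comparison, as a ternary FDB lookup would perform it. */
bool masked_match(uint64_t addr, uint64_t entry_addr, uint64_t entry_mask)
{
	assert((addr >> 48) == 0);	/* only 48 bits are meaningful */
	return ((addr ^ entry_addr) & entry_mask) == 0;
}

/* Hypothetical helper: is the search hit just the catch-all entry? */
bool hit_is_unknown_multicast(uint64_t hit_addr, uint64_t hit_mask)
{
	return hit_addr == UNKNOWN_MC_ADDR && hit_mask == UNKNOWN_MC_MASK;
}
```

      33:33:00:00:00:6a has the multicast bit set, so it matches the
      catch-all entry. A caller following the approach described above
      would treat such a hit as "no match" and install a new entry rather
      than overwrite the catch-all one.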
      
      Fixes: 4d942354 ("net: dsa: sja1105: offload bridge port flags to device")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: invalidate dynamic FDB entries learned concurrently with statically added ones · 6c5fc159
      Vladimir Oltean authored
      The procedure to add a static FDB entry in sja1105 is concurrent with
      dynamic learning performed on all bridge ports and the CPU port.
      
      The switch looks up the FDB from left to right, and also learns
      dynamically from left to right, so it is possible that between the
      moment when we pick up a free slot to install an FDB entry, another slot
      to the left of that one becomes free due to an address ageing out, and
      that other slot is then immediately used by the switch to learn
      dynamically the same address as we're trying to add statically.
      
      The result is that we succeed in adding our static FDB entry, but it is
      being shadowed by a dynamic FDB entry to its left, and the switch will
      behave as if our static FDB entry did not exist.
      
      We cannot really prevent this from happening unless we make the entire
      process to add a static FDB entry a huge critical section where address
      learning is temporarily disabled on _all_ ports, and then re-enabled
      according to the configuration done by sja1105_port_set_learning.
      However, that is kind of disruptive for the operation of the network.
      
      What we can do alternatively is to simply read back the FDB for dynamic
      entries located before our newly added static one, and delete them.
      This will guarantee that our static FDB entry is now operational. It
      will still not guarantee that there aren't dynamic FDB entries to the
      _right_ of that static FDB entry, but at least those entries will age
      out by themselves since they aren't hit, and won't bother anyone.
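
      The read-back-and-delete step can be sketched over an in-memory copy
      of the table. This is an illustration only: struct fdb_slot and
      invalidate_shadowing() are hypothetical, and matching on both MAC
      address and VLAN ID is an assumption about what "the same address"
      means here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory copy of the FDB, one element per slot. */
struct fdb_slot {
	bool valid;
	bool is_static;
	uint64_t mac;
	uint16_t vid;
};

/*
 * After a static entry was written at static_idx, drop dynamic entries
 * for the same address that sit before it. Returns how many were dropped.
 */
size_t invalidate_shadowing(struct fdb_slot *fdb, size_t static_idx)
{
	size_t dropped = 0;

	assert(fdb[static_idx].valid && fdb[static_idx].is_static);
	for (size_t i = 0; i < static_idx; i++) {
		if (!fdb[i].valid || fdb[i].is_static)
			continue;
		if (fdb[i].mac != fdb[static_idx].mac ||
		    fdb[i].vid != fdb[static_idx].vid)
			continue;
		fdb[i].valid = false;
		dropped++;
	}
	return dropped;
}
```

      Slots after static_idx are deliberately left untouched, mirroring the
      reasoning that entries to the right are never hit and age out on
      their own.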
      
      Fixes: 291d1e72 ("net: dsa: sja1105: Add support for FDB and MDB management")
      Fixes: 1da73821 ("net: dsa: sja1105: Add FDB operations for P/Q/R/S series")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: overwrite dynamic FDB entries with static ones in .port_fdb_add · e11e865b
      Vladimir Oltean authored
      The SJA1105 switch family leaves it up to software to decide where
      within the FDB to install a static entry, and to concatenate destination
      ports for already existing entries (the FDB is also used for multicast
      entries). It is not as simple as just saying "please add this entry".
      
      This means we first need to search for an existing FDB entry before
      adding a new one. The driver currently manages to fool itself into
      thinking that if an FDB entry already exists, there is nothing to be
      done. But that FDB entry might be dynamically learned, in which case it
      should be replaced with a static entry, but instead it is left alone.
      
      This patch checks the LOCKEDS ("locked/static") bit from found FDB
      entries, and lets the code "goto skip_finding_an_index;" if the FDB
      entry was not static. So we also need to move the place where we set
      LOCKEDS = true, to cover the new case where an FDB entry existed
      but was dynamic.
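
      The decision described above can be summarized as a three-way choice.
      This sketch is a simplification that ignores the concatenation of
      destination ports; enum fdb_add_action and decide_fdb_add() are
      hypothetical names, not part of the driver.

```c
#include <assert.h>
#include <stdbool.h>

enum fdb_add_action {
	FDB_ADD_NOTHING_TO_DO,	/* a static entry is already there */
	FDB_ADD_REUSE_INDEX,	/* overwrite the dynamic entry in place */
	FDB_ADD_FIND_FREE_INDEX	/* no entry yet, pick a new slot */
};

/* Hypothetical decision helper, modeled on the description above. */
enum fdb_add_action decide_fdb_add(bool found, bool lockeds)
{
	/* LOCKEDS is only meaningful for an entry that was found. */
	assert(found || !lockeds);

	if (!found)
		return FDB_ADD_FIND_FREE_INDEX;
	return lockeds ? FDB_ADD_NOTHING_TO_DO : FDB_ADD_REUSE_INDEX;
}
```

      The FDB_ADD_REUSE_INDEX case corresponds to the new path in which the
      index search is skipped and the entry is rewritten with LOCKEDS set.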
      
      Fixes: 291d1e72 ("net: dsa: sja1105: Add support for FDB and MDB management")
      Fixes: 1da73821 ("net: dsa: sja1105: Add FDB operations for P/Q/R/S series")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: fix static FDB writes for SJA1110 · cb81698f
      Vladimir Oltean authored
      The blamed commit made FDB access on SJA1110 functional only as far as
      dumping the existing entries goes, but anything having to do with an
      entry's index (adding, deleting) is still broken.
      
      There are in fact 2 problems, both caused by improperly inheriting the
      code from SJA1105P/Q/R/S:
      - An entry size is SJA1110_SIZE_L2_LOOKUP_ENTRY (24) bytes and not
        SJA1105PQRS_SIZE_L2_LOOKUP_ENTRY (20) bytes
      - The "index" field within an FDB entry is at bits 10:1 for SJA1110 and
        not 15:6 as in SJA1105P/Q/R/S
      
      This patch moves the packing function for the cmd->index outside of
      sja1105pqrs_common_l2_lookup_cmd_packing() and into the device specific
      functions sja1105pqrs_l2_lookup_cmd_packing and
      sja1110_l2_lookup_cmd_packing.
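
      The difference in field position can be made concrete with a plain
      bit-field helper. This is a sketch only: put_bits() and the two
      pack_index_*() wrappers are hypothetical, operate on a single 32-bit
      word, and do not model the kernel's packing helpers or the rest of
      the command layout.

```c
#include <assert.h>
#include <stdint.h>

/* Place a value into bits hi..lo (inclusive) of a 32-bit word. */
uint32_t put_bits(uint32_t word, unsigned int hi, unsigned int lo,
		  uint32_t value)
{
	assert(hi < 32 && lo <= hi);
	unsigned int width = hi - lo + 1;
	uint32_t field = (width == 32) ? 0xffffffffu : ((1u << width) - 1);

	assert((value & ~field) == 0);	/* value must fit in the field */
	return (word & ~(field << lo)) | (value << lo);
}

/* Index field positions quoted in the commit message. */
uint32_t pack_index_sja1105pqrs(uint32_t index)
{
	return put_bits(0, 15, 6, index);
}

uint32_t pack_index_sja1110(uint32_t index)
{
	return put_bits(0, 10, 1, index);
}
```

      Both fields are 10 bits wide, but the same index lands in different
      bits of the word, so a shared packing routine written for one device
      addresses the wrong entry on the other.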
      
      Fixes: 74e7feff ("net: dsa: sja1105: fix dynamic access to L2 Address Lookup table for SJA1110")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mhi: Fix networking tree build. · 40e15940
      David S. Miller authored
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/sched: taprio: Fix init procedure · ebca25ea
      Yannick Vignon authored
      Commit 13511704 ("net: taprio offload: enforce qdisc to netdev queue mapping")
      resulted in duplicate entries in the qdisc hash.
      While this did not impact the overall operation of the qdisc and taprio
      code paths, it did result in an infinite loop when dumping the qdisc
      properties, at least on one target (NXP LS1028 ARDB).
      Removing the duplicate call to qdisc_hash_add() solves the problem.
      
      Fixes: 13511704 ("net: taprio offload: enforce qdisc to netdev queue mapping")
      Signed-off-by: Yannick Vignon <yannick.vignon@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net, gro: Set inner transport header offset in tcp/udp GRO hook · d51c5907
      Jakub Sitnicki authored
      GSO expects inner transport header offset to be valid when
      skb->encapsulation flag is set. GSO uses this value to calculate the length
      of an individual segment of a GSO packet in skb_gso_transport_seglen().
      
      However, tcp/udp gro_complete callbacks don't update the
      skb->inner_transport_header when processing an encapsulated TCP/UDP
      segment. As a result a GRO skb has ->inner_transport_header set to a value
      carried over from earlier skb processing.
      
      This can have mild to tragic consequences, from miscalculating the GSO
      segment length to triggering a page fault [1] when trying to read the
      TCP/UDP header at an address past the skb->data page.
      
      The latter scenario leads to an oops report like so:
      
        BUG: unable to handle page fault for address: ffff9fa7ec00d008
        #PF: supervisor read access in kernel mode
        #PF: error_code(0x0000) - not-present page
        PGD 123f201067 P4D 123f201067 PUD 123f209067 PMD 0
        Oops: 0000 [#1] SMP NOPTI
        CPU: 44 PID: 0 Comm: swapper/44 Not tainted 5.4.53-cloudflare-2020.7.21 #1
        Hardware name: HYVE EDGE-METAL-GEN10/HS-1811DLite1, BIOS V2.15 02/21/2020
        RIP: 0010:skb_gso_transport_seglen+0x44/0xa0
        Code: c0 41 83 e0 11 f6 87 81 00 00 00 20 74 30 0f b7 87 aa 00 00 00 0f [...]
        RSP: 0018:ffffad8640bacbb8 EFLAGS: 00010202
        RAX: 000000000000feda RBX: ffff9fcc8d31bc00 RCX: ffff9fa7ec00cffc
        RDX: ffff9fa7ebffdec0 RSI: 000000000000feda RDI: 0000000000000122
        RBP: 00000000000005c4 R08: 0000000000000001 R09: 0000000000000000
        R10: ffff9fe588ae3800 R11: ffff9fe011fc92f0 R12: ffff9fcc8d31bc00
        R13: ffff9fe0119d4300 R14: 00000000000005c4 R15: ffff9fba57d70900
        FS:  0000000000000000(0000) GS:ffff9fe68df00000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: ffff9fa7ec00d008 CR3: 0000003e99b1c000 CR4: 0000000000340ee0
        Call Trace:
         <IRQ>
         skb_gso_validate_network_len+0x11/0x70
         __ip_finish_output+0x109/0x1c0
         ip_sublist_rcv_finish+0x57/0x70
         ip_sublist_rcv+0x2aa/0x2d0
         ? ip_rcv_finish_core.constprop.0+0x390/0x390
         ip_list_rcv+0x12b/0x14f
         __netif_receive_skb_list_core+0x2a9/0x2d0
         netif_receive_skb_list_internal+0x1b5/0x2e0
         napi_complete_done+0x93/0x140
         veth_poll+0xc0/0x19f [veth]
         ? mlx5e_napi_poll+0x221/0x610 [mlx5_core]
         net_rx_action+0x1f8/0x790
         __do_softirq+0xe1/0x2bf
         irq_exit+0x8e/0xc0
         do_IRQ+0x58/0xe0
         common_interrupt+0xf/0xf
         </IRQ>
      
      The bug can be observed in a simple setup where we send IP/GRE/IP/TCP
      packets into a netns over a veth pair. Inside the netns, packets are
      forwarded to a dummy device:
      
        trafgen -> [veth A]--[veth B] -forward-> [dummy]
      
      For veth B to GRO aggregate packets on receive, it needs to have an XDP
      program attached (for example, a trivial XDP_PASS). Additionally, for UDP,
      we need to enable the GSO_UDP_L4 feature on the device:
      
        ip netns exec A ethtool -K AB rx-udp-gro-forwarding on
      
      The last component is an artificial delay to increase the chances of GRO
      batching happening:
      
        ip netns exec A tc qdisc add dev AB root \
           netem delay 200us slot 5ms 10ms packets 2 bytes 64k
      
      With such a setup in place, the bug can be observed by tracing the skb
      outer and inner offsets when GSO skb is transmitted from the dummy device:
      
      tcp:
      
      FUNC              DEV   SKB_LEN  NH  TH ENC INH ITH GSO_SIZE GSO_TYPE
      ip_finish_output  dumB     2830 270 290   1 294 254     1383 (tcpv4,gre,)
                                                      ^^^
      udp:
      
      FUNC              DEV   SKB_LEN  NH  TH ENC INH ITH GSO_SIZE GSO_TYPE
      ip_finish_output  dumB     2818 270 290   1 294 254     1383 (gre,udp_l4,)
                                                      ^^^
      
      Fix it by updating the inner transport header offset in tcp/udp
      gro_complete callbacks, similar to how {inet,ipv6}_gro_complete callbacks
      update the inner network header offset, when skb->encapsulation flag is
      set.
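
      The "mild" end of the consequences can be worked through with the
      offsets from the tcp trace above. This is a simplified user-space
      model, not the kernel function: struct demo_skb and
      demo_transport_seglen() are hypothetical, and the 20-byte inner IPv4
      and TCP header lengths used in the note below are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, flattened view of the skb fields shown in the trace. */
struct demo_skb {
	int transport_header;		/* TH: outer transport (GRE) offset */
	int inner_transport_header;	/* ITH: inner TCP offset */
	int gso_size;
};

/*
 * Simplified segment length for an encapsulated TCP GSO packet:
 * bytes between the outer and inner transport headers, plus the
 * inner TCP header, plus the payload size of one segment.
 */
int demo_transport_seglen(const struct demo_skb *skb, int inner_tcp_hdrlen)
{
	assert(skb != NULL);
	return (skb->inner_transport_header - skb->transport_header) +
	       inner_tcp_hdrlen + skb->gso_size;
}
```

      With TH = 290, the stale ITH = 254 and GSO_SIZE = 1383, the model
      yields (254 - 290) + 20 + 1383 = 1367. With ITH corrected to 314
      (INH = 294 plus an assumed 20-byte IPv4 header) it yields
      (314 - 290) + 20 + 1383 = 1427. In the kernel the inner TCP header
      length is itself read through the inner transport header offset,
      which is where a stale value can lead to the page fault shown
      earlier.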
      
      [1] https://lore.kernel.org/netdev/CAKxSbF01cLpZem2GFaUaifh0S-5WYViZemTicAg7FCHOnh6kug@mail.gmail.com/
      
      Fixes: bf296b12 ("tcp: Add GRO support")
      Fixes: f993bc25 ("net: core: handle encapsulation offloads when computing segment lengths")
      Fixes: e20cf8d3 ("udp: implement GRO for plain UDP sockets.")
      Reported-by: Alex Forster <aforster@cloudflare.com>
      Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>