1. 01 Nov, 2021 6 commits
  2. 29 Oct, 2021 14 commits
    • cls_flower: Fix inability to match GRE/IPIP packets · 6de6e46d
      Yoshiki Komachi authored
      When a packet of a new flow arrives in the openvswitch kernel module, it dissects
      the packet and passes the extracted flow key to the ovs-vswitchd daemon. If
      hw-offload configuration is enabled, the daemon creates a new TC flower entry to
      bypass the openvswitch kernel module for the flow (TC flower can also offload flows
      to NICs, but that does not matter here).
      
      In this processing flow, I found the following issue in cases of GRE/IPIP
      packets.
      
      When ovs_flow_key_extract() in openvswitch module parses a packet of a new
      GRE (or IPIP) flow received on non-tunneling vports, it extracts information
      of the outer IP header for ip_proto/src_ip/dst_ip match keys.
      
      This means ovs-vswitchd creates a TC flower entry with IP protocol/addresses
      match keys whose values are those of the outer IP header. OTOH, TC flower,
      which uses flow_dissector (different parser from openvswitch module), extracts
      information of the inner IP header.
      
      The following flow is an example to describe the issue in more detail.
      
         <----------- Outer IP -----------------> <---------- Inner IP ---------->
        +----------+--------------+--------------+----------+----------+----------+
        | ip_proto | src_ip       | dst_ip       | ip_proto | src_ip   | dst_ip   |
        | 47 (GRE) | 192.168.10.1 | 192.168.10.2 | 6 (TCP)  | 10.0.0.1 | 10.0.0.2 |
        +----------+--------------+--------------+----------+----------+----------+
      
      In this case, TC flower entry and extracted information are shown as below:
      
        - ovs-vswitchd creates TC flower entry with:
            - ip_proto: 47
            - src_ip: 192.168.10.1
            - dst_ip: 192.168.10.2
      
        - TC flower extracts below for IP header matches:
            - ip_proto: 6
            - src_ip: 10.0.0.1
            - dst_ip: 10.0.0.2
      
      Thus, GRE or IPIP packets never match the TC flower entry, as each
      dissector behaves differently.
      
      IMHO, the behavior of TC flower (flow dissector) does not look correct,
      as ip_proto/src_ip/dst_ip in TC flower match means the outermost IP
      header information except for GRE/IPIP cases. This patch adds a new
      flow_dissector flag FLOW_DISSECTOR_F_STOP_BEFORE_ENCAP which skips
      dissection of the encapsulated inner GRE/IPIP header in TC flower
      classifier.
      Signed-off-by: Yoshiki Komachi <komachi.yoshiki@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
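The mismatch described in this commit can be illustrated with a small model. The sketch below is Python rather than kernel C, and every name in it (`IPHeader`, `dissect`, `stop_before_encap`) is hypothetical. It only mimics the idea behind FLOW_DISSECTOR_F_STOP_BEFORE_ENCAP using the example packet from the commit message, and is not the flow_dissector implementation.

```python
from dataclasses import dataclass
from typing import Optional

# IP protocol numbers: GRE = 47, IPIP = 4, TCP = 6
GRE, IPIP, TCP = 47, 4, 6
ENCAP_PROTOS = {GRE, IPIP}


@dataclass
class IPHeader:
    """Toy IP header; `inner` is the encapsulated header, if any."""
    ip_proto: int
    src_ip: str
    dst_ip: str
    inner: Optional["IPHeader"] = None


def dissect(hdr: IPHeader, stop_before_encap: bool):
    """Return (ip_proto, src_ip, dst_ip) as a toy dissector would.

    Without the flag, the dissector descends into GRE/IPIP payloads and
    reports the inner header. With the flag, it stops at the outer header.
    """
    while (not stop_before_encap
           and hdr.ip_proto in ENCAP_PROTOS
           and hdr.inner is not None):
        hdr = hdr.inner
    return (hdr.ip_proto, hdr.src_ip, hdr.dst_ip)


# The example packet from the commit message.
packet = IPHeader(GRE, "192.168.10.1", "192.168.10.2",
                  inner=IPHeader(TCP, "10.0.0.1", "10.0.0.2"))

# The entry ovs-vswitchd creates uses the outer header values.
flower_entry = (GRE, "192.168.10.1", "192.168.10.2")

without_flag = dissect(packet, stop_before_encap=False)
with_flag = dissect(packet, stop_before_encap=True)
```

In this model, the key extracted without the flag carries the inner TCP header and so can never equal the outer-header entry, while the key extracted with the flag matches it.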
    • selftests: net: bridge: update IGMP/MLD membership interval value · 34d7ecb3
      Nikolay Aleksandrov authored
      When I fixed IGMPv3/MLDv2 to use the bridge's multicast_membership_interval
      value, which is chosen by user-space, instead of calculating it from
      multicast_query_interval and multicast_query_response_interval, I forgot
      to update the selftests relying on that behaviour. Now the expected GMI
      value has to be set manually for the tests to run correctly and produce
      proper results (similar to IGMPv2 behaviour).
      
      Fixes: fac3cb82 ("net: bridge: mcast: use multicast_membership_interval for IGMPv3")
      Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: fix uninitialized variables when BRIDGE_CFM is disabled · 829e050e
      Ivan Vecera authored
      Function br_get_link_af_size_filtered() calls br_cfm_{,peer}_mep_count(),
      which return a count. When BRIDGE_CFM is not enabled, these functions
      simply return -EOPNOTSUPP but do not modify the count parameter, so the
      calling function then works with uninitialized variables.
      Modify these inline functions to return zero in the count parameter.
      
      Fixes: b6d0425b ("bridge: cfm: Netlink Notifications.")
      Cc: Henrik Bjoernlund <henrik.bjoernlund@microchip.com>
      Signed-off-by: Ivan Vecera <ivecera@redhat.com>
      Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phylink: avoid mvneta warning when setting pause parameters · fd8d9731
      Russell King (Oracle) authored
      mvneta does not support asymmetric pause modes, and it flags this by the
      lack of AsymPause in the supported field. When setting pause modes, we
      check that pause->rx_pause == pause->tx_pause, but only when pause
      autoneg is enabled. When pause autoneg is disabled, we still allow
      pause->rx_pause != pause->tx_pause, which is incorrect when the MAC
      does not support asymmetric pause, and causes mvneta to issue a warning.
      
      Fix this by removing the test for pause->autoneg, so we always check
      that pause->rx_pause == pause->tx_pause for network devices that do not
      support AsymPause.
      
      Fixes: 9525ae83 ("phylink: add phylink infrastructure")
      Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
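The change to the pause check can be modelled in a few lines. This is a hedged Python sketch of the condition described in the commit message only; the function and parameter names are hypothetical and do not correspond to phylink's actual code.

```python
def pause_allowed_before(supports_asym_pause: bool, autoneg: bool,
                         rx_pause: bool, tx_pause: bool) -> bool:
    """Model of the old check: rx/tx symmetry is enforced only with autoneg."""
    if not supports_asym_pause and autoneg and rx_pause != tx_pause:
        return False
    return True


def pause_allowed_after(supports_asym_pause: bool, autoneg: bool,
                        rx_pause: bool, tx_pause: bool) -> bool:
    """Model of the fixed check: symmetry is enforced regardless of autoneg."""
    if not supports_asym_pause and rx_pause != tx_pause:
        return False
    return True


# A MAC without AsymPause (such as mvneta, per the commit message), with
# pause autoneg disabled and an asymmetric request:
old_result = pause_allowed_before(False, autoneg=False, rx_pause=True, tx_pause=False)
new_result = pause_allowed_after(False, autoneg=False, rx_pause=True, tx_pause=False)
```

In this model, the old check lets the asymmetric request through when autoneg is off, while the fixed check rejects it before it reaches the MAC driver.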
    • Merge branch 'nfp-fixes' · 0f48fb66
      David S. Miller authored
      Simon Horman says:
      
      ====================
      nfp: fix bugs caused by adaptive coalesce
      
      This series contains fixes for two bugs introduced when
      adaptive coalesce support was added to the NFP driver in
      v5.15 by 9d32e4e7 ("nfp: add support for coalesce adaptive feature")
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfp: fix potential deadlock when canceling dim work · 17e712c6
      Yinjun Zhang authored
      When the port is linked down, the process which has acquired rtnl_lock
      waits for the in-progress dim work to finish, and the work also
      acquires rtnl_lock, which may cause a deadlock.
      
      Currently, IRQ_MOD registers can be configured by both `ethtool -C` and
      the dim work, and which setting takes effect depends on the execution
      order. rtnl_lock is useless here, so remove it.
      
      Fixes: 9d32e4e7 ("nfp: add support for coalesce adaptive feature")
      Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
      Signed-off-by: Louis Peens <louis.peens@corigine.com>
      Signed-off-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfp: fix NULL pointer access when scheduling dim work · f8d384a6
      Yinjun Zhang authored
      Each rx/tx ring has a related dim work. When the rx/tx ring number is
      decreased by `ethtool -L`, the corresponding rx_ring or tx_ring is
      assigned NULL, while its related work is not destroyed. When scheduled,
      the work will access a NULL pointer.
      
      Fixes: 9d32e4e7 ("nfp: add support for coalesce adaptive feature")
      Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
      Signed-off-by: Louis Peens <louis.peens@corigine.com>
      Signed-off-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests/net: update .gitignore with newly added tests · e300a85d
      Shuah Khan authored
      Update .gitignore with newly added tests:
      	tools/testing/selftests/net/af_unix/test_unix_oob
      	tools/testing/selftests/net/gro
      	tools/testing/selftests/net/ioam6_parser
      	tools/testing/selftests/net/toeplitz
      Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: amd-xgbe: Toggle PLL settings during rate change · daf182d3
      Shyam Sundar S K authored
      For each rate change command submission, the FW has to do a phy
      power off sequence internally. For this to happen correctly, the
      PLL re-initialization control setting has to be turned off before
      sending mailbox commands and re-enabled once the command submission
      is complete.
      
      Without the PLL control setting, link up takes longer in a
      fixed phy configuration.
      
      Fixes: 47f164de ("amd-xgbe: Add PCI device support")
      Co-developed-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
      Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
      Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
      Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'sctp-plpmtud-fixes' · cec6880d
      David S. Miller authored
      Xin Long says:
      
      ====================
      sctp: a couple of fixes for PLPMTUD
      
      Four fixes included in this patchset:
      
        - fix the packet sending in Error state.
        - fix the timer stop when transport update dst.
        - fix the outer header len calculation.
        - fix the return value for toobig processing.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: return true only for pathmtu update in sctp_transport_pl_toobig · 75cf662c
      Xin Long authored
      sctp_transport_pl_toobig() is supposed to return true only if there's a
      pathmtu update, so that sctp_icmp_frag_needed() then calls
      sctp_assoc_sync_pmtu() and sctp_retransmit(). This patch fixes
      the return values in sctp_transport_pl_toobig().
      
      Fixes: 83696408 ("sctp: do state transition when receiving an icmp TOOBIG packet")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: subtract sctphdr len in sctp_transport_pl_hlen · cc4665ca
      Xin Long authored
      sctp_transport_pl_hlen() is called to calculate the outer header length
      for PL. However, as shown in the figure in rfc8899#section-4.4:
      
         Any additional
           headers         .--- MPS -----.
                  |        |             |
                  v        v             v
           +------------------------------+
           | IP | ** | PL | protocol data |
           +------------------------------+
      
                      <----- PLPMTU ----->
           <---------- PMTU -------------->
      
      The outer header is IP + any additional headers; it does not include the
      Packetization Layer's own header, namely sctphdr, whereas sctphdr
      is counted by __sctp_mtu_payload().
      
      The incorrect calculation caused the link pathmtu, which is set by
      t->pl.pmtu + sctp_transport_pl_hlen(), to be larger than expected. This
      patch fixes it by subtracting the sctphdr len in sctp_transport_pl_hlen().
      
      Fixes: d9e2e410 ("sctp: add the constants/variables and states and some APIs for transport")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
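The effect of the header length error can be checked with simple arithmetic. The sketch below is an illustrative Python model, not the kernel code: the function names are hypothetical, and it assumes an IPv4 header without options (20 bytes), the 12-byte SCTP common header, no additional encapsulation headers, and an example PLPMTU of 1200.

```python
IPV4_HDR_LEN = 20   # IPv4 header without options (assumption for this example)
SCTP_HDR_LEN = 12   # SCTP common header: ports, verification tag, checksum


def pl_hlen_before(ip_hdr_len: int, extra_hdr_len: int = 0) -> int:
    """Model of the old result: the PL's own header was counted as outer header."""
    return ip_hdr_len + extra_hdr_len + SCTP_HDR_LEN


def pl_hlen_after(ip_hdr_len: int, extra_hdr_len: int = 0) -> int:
    """Model of the fixed result: subtract the sctphdr length again."""
    return pl_hlen_before(ip_hdr_len, extra_hdr_len) - SCTP_HDR_LEN


plpmtu = 1200  # example PLPMTU; per the RFC figure it already spans the PL header

pathmtu_before = plpmtu + pl_hlen_before(IPV4_HDR_LEN)
pathmtu_after = plpmtu + pl_hlen_after(IPV4_HDR_LEN)
```

Under these assumptions the old calculation yields a path MTU that is too large by exactly the SCTP header length.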
    • sctp: reset probe_timer in sctp_transport_pl_update · c6ea04ea
      Xin Long authored
      sctp_transport_pl_update() is called when a transport updates its dst and
      pathmtu. Instead of stopping the PLPMTUD probe timer, PLPMTUD should
      start over and reset the probe timer. Otherwise, the PLPMTUD service
      would stop.
      
      Fixes: 92548ec2 ("sctp: add the probe timer in transport for PLPMTUD")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: allow IP fragmentation when PLPMTUD enters Error state · 40171248
      Xin Long authored
      Currently, when PLPMTUD enters Error state, the transport pathmtu is set
      to MIN_PLPMTU(512) while probing continues with BASE_PLPMTU(1200). This
      causes pathmtu to stay at a very small value, even if the real pmtu
      is some value like 1000.
      
      RFC8899 doesn't clearly say how to set the value in Error state. One
      possibility is to keep using BASE_PLPMTU for the real pmtu, but allow
      IP fragmentation while in Error state.
      
      As it says in rfc8899#section-5.4:
      
         Some paths could be unable to sustain packets of the BASE_PLPMTU
         size.  The Error State could be implemented to provide robustness to
         such paths.  This allows fallback to a smaller than desired PLPMTU
         rather than suffer connectivity failure.  This could utilize methods
         such as endpoint IP fragmentation to enable the PL sender to
         communicate using packets smaller than the BASE_PLPMTU.
      
      This patch is to set pmtu to BASE_PLPMTU instead of MIN_PLPMTU for Error
      state in sctp_transport_pl_send/toobig(), and set packet ipfragok for
      non-probe packets when it's in Error state.
      
      Fixes: 1dc68c19 ("sctp: do state transition when PROBE_COUNT == MAX_PROBES on HB send path")
      Reported-by: Ying Xu <yinxu@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
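The Error state behaviour before and after this change can be summarised in a small model. This is a hedged Python sketch based only on the commit message: `error_state_settings` and its parameters are hypothetical names, the constants are the values quoted above, and the real logic lives in sctp_transport_pl_send/toobig() and the packet transmit path.

```python
MIN_PLPMTU = 512    # value quoted in the commit message
BASE_PLPMTU = 1200  # value quoted in the commit message


def error_state_settings(is_probe: bool, patched: bool):
    """Return (pl_pmtu, ipfragok) for a packet sent while in Error state.

    Before the patch: pmtu drops to MIN_PLPMTU and fragmentation is not enabled.
    After the patch: pmtu stays at BASE_PLPMTU and non-probe packets may be
    fragmented by IP, so paths with a real pmtu below 1200 still work.
    """
    if not patched:
        return MIN_PLPMTU, False
    return BASE_PLPMTU, not is_probe


data_before = error_state_settings(is_probe=False, patched=False)
data_after = error_state_settings(is_probe=False, patched=True)
probe_after = error_state_settings(is_probe=True, patched=True)
```

Probe packets are kept unfragmented in this model because a probe that is split by IP would no longer say anything about whether the path can carry a packet of that size.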
  3. 28 Oct, 2021 20 commits