1. 24 Jan, 2023 14 commits
  2. 23 Jan, 2023 23 commits
    • net: mdio: mux-meson-g12a: use devm_clk_get_enabled to simplify the code · 32e54254
      Heiner Kallweit authored
      Use devm_clk_get_enabled() to simplify the code.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Reviewed-by: Jerome Brunet <jbrunet@baylibre.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mdiobus: Convert to use fwnode_device_is_compatible() · d408ec0b
      Andy Shevchenko authored
      Replace open coded fwnode_device_is_compatible() in the driver.
      Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • David S. Miller's avatar
      dc0b98a1
    • Merge branch 'enetc-mac-merge-prep' · 7a981431
      David S. Miller authored
      Vladimir Oltean says:
      
      ====================
      ENETC MAC Merge cleanup
      
      This is a preparatory patch set for MAC Merge layer support in enetc via
      ethtool. It does the following:
      
      - consolidates a software lockstep register write procedure for the pMAC
      - detects per-port frame preemption capability and only writes pMAC
        registers if a pMAC exists
      - stops enabling the pMAC by default
      
      Additionally, I noticed some build warnings in the driver which are new
      in this kernel version, so patch 1/6 fixes those.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: enetc: stop auto-configuring the port pMAC · 086cc080
      Vladimir Oltean authored
      The pMAC (ENETC_PFPMR_PMACE) is probably unconditionally enabled in the
      enetc driver to allow RX of preemptible packets and not see them as
      error frames. I don't know why TX preemption (ENETC_MMCSR_ME) is enabled
      though. With no way to say which traffic classes are preemptible (all
      are express by default), no preemptible frames would be transmitted
      anyway.
      
      Lastly, it may have been believed that the register write lock-step mode
      (now deleted) needed the pMAC to be enabled at all times. I don't know
      if that's true. However, I've checked that driver writes to PM1
      registers do propagate through to the ENETC IP even when the pMAC is
      disabled.
      
      With such incomplete support for frame preemption, it's best to just
      remove whatever exists right now and come with something more coherent
      later.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: enetc: implement software lockstep for port MAC registers · 12717dec
      Vladimir Oltean authored
      Currently the enetc driver duplicates its writes to the PM0 registers
      also to PM1, but it doesn't do this consistently - for example we write
      to ENETC_PM0_MAXFRM but not to ENETC_PM1_MAXFRM.
      
      Create enetc_port_mac_wr() which writes both the PM0 and PM1 register
      with the same value (if frame preemption is supported on this port).
      Also create enetc_port_mac_rd() which reads from PM0 - the assumption
      being that PM1 contains just the same value.
      
      This will be necessary when we enable the MAC Merge layer properly, and
      the pMAC becomes operational.
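
The accessor pair described above can be sketched roughly as follows. This is a user-space model rather than the driver's code: the `struct port` register array, the `PM0_MAXFRM` offset value and the helper names are assumptions made for illustration. Only the 0x1000 distance between a PM0 register and its PM1 counterpart is taken from this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* All names and offsets below are hypothetical, chosen for illustration. */
#define PMAC_OFFSET 0x1000u /* byte offset from a PM0 register to its PM1 twin */
#define PM0_MAXFRM  0x0014u /* made-up byte offset of a PM0 register */
#define REG_BYTES   0x2000u /* size of the simulated register space */

struct port {
	uint32_t regs[REG_BYTES / 4]; /* simulated MMIO space */
	bool has_pmac;                /* frame preemption supported on this port */
};

/* Write the eMAC register and, if a pMAC exists, mirror the value to it. */
static void port_mac_wr(struct port *p, uint32_t reg, uint32_t val)
{
	assert(reg + PMAC_OFFSET < REG_BYTES);
	p->regs[reg / 4] = val;
	if (p->has_pmac)
		p->regs[(reg + PMAC_OFFSET) / 4] = val;
}

/* Read from the eMAC only, on the assumption that the pMAC holds the same value. */
static uint32_t port_mac_rd(const struct port *p, uint32_t reg)
{
	assert(reg < REG_BYTES);
	return p->regs[reg / 4];
}
```

Routing every port MAC write through one helper means a register such as MAXFRM cannot be updated in PM0 and forgotten in PM1.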
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: enetc: stop configuring pMAC in lockstep with eMAC · 219355f1
      Vladimir Oltean authored
      The MWLM bit (MAC write lock-step mode) allows register writes to the
      pMAC to be auto-performed whenever the corresponding eMAC register is
      written by the driver. This allows their configuration to remain
      in sync.
      
      The driver has set this bit since the initial commit, but it doesn't do
      anything, since the hardware feature doesn't work (and the bit has been
      removed from more recent versions of the documentation).
      
      The driver does attempt, more or less, to keep those MAC registers in
      sync by writing the same value once to e.g. ENETC_PM0_CMD_CFG (eMAC) and
      once to ENETC_PM1_CMD_CFG (pMAC). Because the lockstep feature doesn't
      work, that's what it will stick to.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: enetc: add definition for offset between eMAC and pMAC regs · 9c949e0b
      Vladimir Oltean authored
      This is a preliminary patch which replaces the hardcoded 0x1000 present
      in other PM1 (port MAC 1, aka pMAC) register definitions, which is an
      offset to the PM0 (port MAC 0, aka eMAC) equivalent register.
      This definition will be used in more places by future code.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: enetc: detect frame preemption hardware capability · 94557a9a
      Vladimir Oltean authored
      Similar to other TSN features, query the Station Interface capability
      register to see whether preemption is supported on this port or not.
      On LS1028A, preemption is available on ports 0 and 2, but not on 1
      and 3.
      
      This will allow us in the future to write the pMAC registers only on the
      ENETC ports where a pMAC actually exists.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: enetc: build common object files into a separate module · e3972399
      Vladimir Oltean authored
      The build system is complaining about the following:
      
      enetc.o is added to multiple modules: fsl-enetc fsl-enetc-vf
      enetc_cbdr.o is added to multiple modules: fsl-enetc fsl-enetc-vf
      enetc_ethtool.o is added to multiple modules: fsl-enetc fsl-enetc-vf
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'ethtool-mac-merge' · f3c6e128
      David S. Miller authored
      Vladimir Oltean says:
      
      ====================
      ethtool support for IEEE 802.3 MAC Merge layer
      
      Change log
      ----------
      
      v3->v4:
      - add missing opening bracket in ocelot_port_mm_irq()
      - moved cfg.verify_time range checking so that it actually takes place
        for the updated rather than old value
      v3 at:
      https://patchwork.kernel.org/project/netdevbpf/cover/20230117085947.2176464-1-vladimir.oltean@nxp.com/
      
      v2->v3:
      - made get_mm return int instead of void
      - deleted ETHTOOL_A_MM_SUPPORTED
      - renamed ETHTOOL_A_MM_ADD_FRAG_SIZE to ETHTOOL_A_MM_TX_MIN_FRAG_SIZE
      - introduced ETHTOOL_A_MM_RX_MIN_FRAG_SIZE
      - cleaned up documentation
      - rebased on top of PLCA changes
      - renamed ETHTOOL_STATS_SRC_* to ETHTOOL_MAC_STATS_SRC_*
      v2 at:
      https://patchwork.kernel.org/project/netdevbpf/cover/20230111161706.1465242-1-vladimir.oltean@nxp.com/
      
      v1->v2:
      I've decided to focus just on the MAC Merge layer for now, which is why
      I am able to submit this patch set as non-RFC.
      v1 (RFC) at:
      https://patchwork.kernel.org/project/netdevbpf/cover/20220816222920.1952936-1-vladimir.oltean@nxp.com/
      
      What is being introduced
      ------------------------
      
      TL;DR: a MAC Merge layer as defined by IEEE 802.3-2018, clause 99
      (interspersing of express traffic). This is controlled through ethtool
      netlink (ETHTOOL_MSG_MM_GET, ETHTOOL_MSG_MM_SET). The raw ethtool
      commands are posted here:
      https://patchwork.kernel.org/project/netdevbpf/cover/20230111153638.1454687-1-vladimir.oltean@nxp.com/
      
      The MAC Merge layer has its own statistics counters
      (ethtool --include-statistics --show-mm swp0) as well as two member
      MACs, the statistics of which can be queried individually, through a new
      ethtool netlink attribute, corresponding to:
      
      $ ethtool -I --show-pause eno2 --src aggregate
      $ ethtool -S eno2 --groups eth-mac eth-phy eth-ctrl rmon -- --src pmac
      
      The core properties of the MAC Merge layer are described in great detail
      in patches 02/12 and 03/12. They can be viewed in "make htmldocs" format.
      
      Devices for which the API is supported
      --------------------------------------
      
      I decided to start with the Ethernet switch on NXP LS1028A (Felix)
      because of the smaller patch set. I also have support for the ENETC
      controller pending.
      
      I would like to get confirmation that the UAPI being proposed here will
      not restrict any use cases known by other hardware vendors.
      
      Why is support for preemptible traffic classes not here?
      --------------------------------------------------------
      
      There is legitimate concern whether the 802.1Q portion of the standard
      (which traffic classes go to the eMAC and which to the pMAC) should be
      modeled in Linux using tc or using another UAPI. I think that is
      stalling the entire series, but should be discussed separately instead.
      Removing FP adminStatus support makes me confident enough to submit this
      patch set without an RFC tag (meaning: I wouldn't mind if it was merged
      as is).
      
      What is submitted here is sufficient for an LLDP daemon to do its job.
      I've patched openlldp to advertise and configure frame preemption:
      https://github.com/vladimiroltean/openlldp/tree/frame-preemption-v3
      
      In case someone wants to try it out, here are some commands I've used.
      
       # Configure the interfaces to receive and transmit LLDP Data Units
       lldptool -L -i eno0 adminStatus=rxtx
       lldptool -L -i swp0 adminStatus=rxtx
       # Enable the transmission of certain TLVs on switch's interface
       lldptool -T -i eno0 -V addEthCap enableTx=yes
       lldptool -T -i swp0 -V addEthCap enableTx=yes
       # Query LLDP statistics on switch's interface
       lldptool -S -i swp0
       # Query the received neighbor TLVs
       lldptool -i swp0 -t -n -V addEthCap
       Additional Ethernet Capabilities TLV
               Preemption capability supported
               Preemption capability enabled
               Preemption capability active
               Additional fragment size: 60 octets
      
      So using this patch set, lldpad will be able to advertise and configure
      frame preemption, but still, no data packet will be sent as preemptible
      over the link, because there is no UAPI to control which traffic classes
      are sent as preemptible and which as express.
      
      Preemptable or preemptible?
      ---------------------------
      
      IEEE 802.3 uses "preemptable" throughout. IEEE 802.1Q uses "preemptible"
      throughout. Because the definition of "preemptible" falls under 802.1Q's
      jurisdiction and 802.3 just references it, I went with the 802.1Q naming
      even where supporting an 802.3 feature. Also, checkpatch agrees with this.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mscc: ocelot: add MAC Merge layer support for VSC9959 · 6505b680
      Vladimir Oltean authored
      Felix (VSC9959) has a DEV_GMII:MM_CONFIG block composed of 2 registers
      (ENABLE_CONFIG and VERIF_CONFIG). Because the MAC Merge statistics and
      pMAC statistics are already in the Ocelot switch lib even if just Felix
      supports them, I'm adding support for the whole MAC Merge layer in the
      common Ocelot library too.
      
      There is an interrupt (shared with the PTP interrupt) which signals
      changes to the MM verification state. This is done because the
      preemptible traffic classes should be committed to hardware only once
      the verification procedure has declared the link partner capable of
      receiving preemptible frames.
      
      We implement ethtool getters and setters for the MAC Merge layer state.
      The "TX enabled" and "verify status" are taken from the IRQ handler,
      using a mutex to ensure serialized access.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mscc: ocelot: export ethtool MAC Merge stats for Felix VSC9959 · ab3f97a9
      Vladimir Oltean authored
      The Felix VSC9959 switch supports frame preemption and has a MAC Merge
      layer. In addition to the structured stats that exist for the eMAC,
      export the counters associated with its pMAC (pause, RMON, MAC, PHY,
      control) plus the high-level MAC Merge layer stats. The unstructured
      ethtool counters, as well as the rtnl_link_stats64 were left to report
      only the eMAC counters.
      
      Because statistics processing is quite self-contained in ocelot_stats.c
      now, I've opted for introducing an ocelot->mm_supported bool, based on
      which the common switch lib does everything, rather than pushing the
      TSN-specific code in felix_vsc9959.c, as happens for other TSN stuff.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mscc: ocelot: hide access to ocelot_stats_layout behind a helper · 497eea9f
      Vladimir Oltean authored
      Some hardware instances of the ocelot driver support the MAC Merge
      layer, which gives access to an extra preemptible MAC. This has
      implications upon the statistics. There will be a stats layout when MM
      isn't supported, and a different one when it is.
      The ocelot_stats_layout() helper will return the correct one.
      In preparation of that, refactor the existing code to use this helper.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mscc: ocelot: allow ocelot_stat_layout elements with no name · 1a733bbd
      Vladimir Oltean authored
      We will add support for pMAC counters and MAC merge layer counters,
      which are only reported via the structured stats, and the current
      ocelot_get_strings() stands in our way, because it expects that the
      statistics should be placed in the data array at the same index as found
      in the ocelot_stats_layout array.
      
      That is not true. Statistics which don't have a name should not be
      exported to the unstructured ethtool -S, so we need to have different
      indices into the ocelot_stats_layout array (i) and into the data array
      (data itself).
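
A minimal sketch of the two-index idea, written as a user-space model: the `struct stat_desc` type and the `fill_strings()` helper are hypothetical names, and `STR_LEN` stands in for the kernel's ETH_GSTRING_LEN. The layout index `i` advances on every entry, while the output pointer advances only when an entry has a name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STR_LEN 32 /* stand-in for ETH_GSTRING_LEN */

/* Hypothetical layout entry: a NULL name means "structured stats only". */
struct stat_desc {
	const char *name;
};

/* Copy only the named entries; return how many strings were written. */
static size_t fill_strings(const struct stat_desc *layout, size_t n, char *data)
{
	size_t count = 0;

	assert(layout != NULL && data != NULL);
	for (size_t i = 0; i < n; i++) {
		if (!layout[i].name)
			continue; /* skip: not exported to unstructured stats */
		memset(data, 0, STR_LEN);
		strncpy(data, layout[i].name, STR_LEN - 1);
		data += STR_LEN; /* output position moves independently of i */
		count++;
	}
	return count;
}
```

The returned count is what a matching string-set count callback would have to report, so that the number of names and the number of values stay in agreement.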
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: add plumbing for changing and getting MAC merge layer state · 5f6c2d49
      Vladimir Oltean authored
      The DSA core is in charge of the ethtool_ops of the net devices
      associated with switch ports, so in case a hardware driver supports the
      MAC merge layer, DSA must pass the callbacks through to the driver.
      Add support for precisely that.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethtool: add helpers for MM fragment size translation · dd1c4164
      Vladimir Oltean authored
      We deliberately make the Linux UAPI pass the minimum fragment size in
      octets, even though IEEE 802.3 defines it as discrete values, and
      addFragSize is just the multiplier. This is because there is nothing
      impossible in operating with an in-between value for the fragment size
      of non-final preempted fragments, and there may even appear hardware
      which supports the in-between sizes.
      
      For the hardware which just understands the addFragSize multiplier,
      create two helpers which translate back and forth the values passed in
      octets.
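
The translation can be sketched as below. The helper names are hypothetical, and the formula is an assumption: it takes the minimum non-final fragment to be 64 × (1 + addFragSize) octets including a 4-octet FCS, with the value passed in octets excluding that FCS and addFragSize being a 2-bit multiplier (0 to 3). Under that assumption the discrete sizes are 60, 124, 188 and 252 octets.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define ZLEN              60u /* minimum frame length without FCS (ETH_ZLEN) */
#define FCS_LEN           4u
#define MAX_ADD_FRAG_SIZE 3u  /* assumed: addFragSize is a 2-bit multiplier */

/* addFragSize multiplier -> minimum fragment size in octets (FCS excluded). */
static uint32_t frag_size_add_to_min(uint32_t add)
{
	return (ZLEN + FCS_LEN) * (1 + add) - FCS_LEN;
}

/* Octets -> multiplier; fails for sizes the multiplier cannot express. */
static int frag_size_min_to_add(uint32_t min, uint32_t *add)
{
	assert(add != NULL);
	for (uint32_t a = 0; a <= MAX_ADD_FRAG_SIZE; a++) {
		if (frag_size_add_to_min(a) == min) {
			*add = a;
			return 0;
		}
	}
	return -EINVAL;
}
```

Hardware that accepts in-between sizes would not need the reverse helper at all; it exists only for devices limited to the four discrete values.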
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethtool: add helpers for aggregate statistics · 449c5459
      Vladimir Oltean authored
      When a pMAC exists but the driver is unable to atomically query the
      aggregate eMAC+pMAC statistics, the user should be given back at least
      the sum of eMAC and pMAC counters queried separately.
      
      This is a generic problem, so add helpers in ethtool to do this
      operation, if the driver doesn't have a better way to report aggregate
      stats. Do this in a way that does not require changes to these functions
      when new stats are added (basically treat the structures as an array of
      u64 values, except for the first element which is the stats source).
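
One way to picture the "array of u64 after the source field" approach is the sketch below. The structure, its field names and the `aggregate()` helper are hypothetical and much smaller than the real ethtool structures. Handling of counters that a driver did not fill in is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum stats_src { SRC_AGGREGATE, SRC_EMAC, SRC_PMAC };

/* Hypothetical stats structure: a source selector followed by u64 counters. */
struct mac_stats {
	enum stats_src src;
	struct {
		uint64_t frames_ok;
		uint64_t octets_ok;
		uint64_t fcs_errors;
	} c;
};

#define N_COUNTERS (sizeof(((struct mac_stats *)0)->c) / sizeof(uint64_t))

/* Sum eMAC and pMAC counters without naming any individual field. */
static void aggregate(struct mac_stats *agg, const struct mac_stats *emac,
		      const struct mac_stats *pmac)
{
	uint64_t e[N_COUNTERS], p[N_COUNTERS], sum[N_COUNTERS];

	assert(agg != NULL && emac != NULL && pmac != NULL);
	memcpy(e, &emac->c, sizeof(e));
	memcpy(p, &pmac->c, sizeof(p));
	for (size_t i = 0; i < N_COUNTERS; i++)
		sum[i] = e[i] + p[i];
	memcpy(&agg->c, sum, sizeof(sum));
	agg->src = SRC_AGGREGATE;
}
```

Adding a fourth counter to the inner structure changes `N_COUNTERS` automatically, which is the property the commit message asks for.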
      
      In include/linux/ethtool.h, there is already a section where helper
      function prototypes should be placed. The trouble is, this section is
      too early, before the definitions of struct ethtool_eth_mac_stats et al.
      Move that section to the end and append these new helpers to it.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • docs: ethtool: document ETHTOOL_A_STATS_SRC and ETHTOOL_A_PAUSE_STATS_SRC · c319df10
      Vladimir Oltean authored
      Two new netlink attributes were added to PAUSE_GET and STATS_GET and
      their replies. Document them.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethtool: netlink: retrieve stats from multiple sources (eMAC, pMAC) · 04692c90
      Vladimir Oltean authored
      IEEE 802.3-2018 clause 99 defines a MAC Merge sublayer which contains an
      Express MAC and a Preemptible MAC. Both MACs are hidden to higher and
      lower layers and visible as a single MAC (packet classification to eMAC
      or pMAC on TX is done based on priority; classification on RX is done
      based on SFD).
      
      For devices which support a MAC Merge sublayer, it is desirable to
      retrieve individual packet counters from the eMAC and the pMAC, as well
      as aggregate statistics (their sum).
      
      Introduce a new ETHTOOL_A_STATS_SRC attribute which is part of the
      policy of ETHTOOL_MSG_STATS_GET, and an ETHTOOL_A_PAUSE_STATS_SRC
      which is part of the policy of ETHTOOL_MSG_PAUSE_GET (accepted when
      ETHTOOL_FLAG_STATS is set in the common ethtool header). Both of these
      take values from enum ethtool_mac_stats_src, defaulting to "aggregate"
      in the absence of the attribute.
      
      Existing drivers do not need to pay attention to this enum which was
      added to all driver-facing structures, just the ones which report the
      MAC merge layer as supported.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • docs: ethtool-netlink: document interface for MAC Merge layer · 37000004
      Vladimir Oltean authored
      Show details about the structures passed back and forth related to MAC
      Merge layer configuration, state and statistics. The rendered htmldocs
      will be much more verbose due to the kerneldoc references.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethtool: add support for MAC Merge layer · 2b30f829
      Vladimir Oltean authored
      The MAC merge sublayer (IEEE 802.3-2018 clause 99) is one of 2
      specifications (the other being Frame Preemption; IEEE 802.1Q-2018
      clause 6.7.2), which work together to minimize latency caused by frame
      interference at TX. The overall goal of TSN is for normal traffic and
      traffic with a bounded deadline to be able to cohabitate on the same L2
      network and not bother each other too much.
      
      The standards achieve this (partly) by introducing the concept of
      preemptible traffic, i.e. Ethernet frames that have a custom value for
      the Start-of-Frame-Delimiter (SFD), and these frames can be fragmented
      and reassembled at L2 on a link-local basis. The non-preemptible frames
      are called express traffic, they are transmitted using a normal SFD, and
      they can preempt preemptible frames, therefore having lower latency,
      which can matter at lower (100 Mbps) link speeds, or at high MTUs (jumbo
      frames around 9K). Preemption is not recursive, i.e. a P frame cannot
      preempt another P frame. Preemption also does not depend upon priority,
      or otherwise said, an E frame with prio 0 will still preempt a P frame
      with prio 7.
      
      In terms of implementation, the standards talk about the presence of an
      express MAC (eMAC) which handles express traffic, and a preemptible MAC
      (pMAC) which handles preemptible traffic, and these MACs are multiplexed
      on the same MII by a MAC merge layer.
      
      To support frame preemption, the definition of the SFD was generalized
      to SMD (Start-of-mPacket-Delimiter), where an mPacket is essentially an
      Ethernet frame fragment, or a complete frame. Stations unaware of an SMD
      value different from the standard SFD will treat P frames as error
      frames. To prevent that from happening, a negotiation process is
      defined.
      
      On RX, packets are dispatched to the eMAC or pMAC after being filtered
      by their SMD. On TX, the eMAC/pMAC classification decision is taken by
      the 802.1Q spec, based on packet priority (each of the 8 user priority
      values may have an admin-status of preemptible or express).
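
The TX-side decision and the preemption rule described earlier can be modelled in a few lines. This is an illustrative sketch only: the bitmask representation of the per-priority admin-status and both function names are assumptions, not part of this patch set (which deliberately leaves the 802.1Q portion out).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum mac_sel { MAC_EXPRESS, MAC_PREEMPTIBLE };

/*
 * Hypothetical model: bit N of preemptible_prios set means user priority N
 * has an admin-status of preemptible; a clear bit means express.
 */
static enum mac_sel tx_classify(uint8_t preemptible_prios, unsigned int prio)
{
	assert(prio < 8);
	return ((preemptible_prios >> prio) & 1u) ? MAC_PREEMPTIBLE : MAC_EXPRESS;
}

/*
 * Only an express frame may interrupt a preemptible frame in flight.
 * Priority plays no role, and preemption is not recursive.
 */
static bool may_preempt(enum mac_sel in_flight, enum mac_sel pending)
{
	return in_flight == MAC_PREEMPTIBLE && pending == MAC_EXPRESS;
}
```

With only priority 7 marked preemptible, a priority 0 frame is classified as express and may still preempt the priority 7 frame, matching the example in the text.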
      
      The MAC Merge layer and the Frame Preemption parameters have some degree
      of independence in terms of how software stacks are supposed to deal
      with them. The activation of the MM layer is supposed to be controlled
      by an LLDP daemon (after it has been communicated that the link partner
      also supports it), after which a (hardware-based or not) verification
      handshake takes place, before actually enabling the feature. So the
      process is intended to be relatively plug-and-play. Whereas FP settings
      are supposed to be coordinated across a network using something
      approximating NETCONF.
      
      The support contained here is exclusively for the 802.3 (MAC Merge)
      portions and not for the 802.1Q (Frame Preemption) parts. This API is
      sufficient for an LLDP daemon to do its job. The FP adminStatus variable
      from 802.1Q is outside the scope of an LLDP daemon.
      
      I have taken a few creative licenses and augmented the Linux kernel UAPI
      compared to the standard managed objects recommended by IEEE 802.3.
      These are:
      
      - ETHTOOL_A_MM_PMAC_ENABLED: According to Figure 99-6: Receive
        Processing state diagram, a MAC Merge layer is always supposed to be
        able to receive P frames. However, this implies keeping the pMAC
        powered on, which will consume needless power in applications where FP
        will never be used. If LLDP is used, the reception of an Additional
        Ethernet Capabilities TLV from the link partner is sufficient
        indication that the pMAC should be enabled. So my proposal is that in
        Linux, we keep the pMAC turned off by default and that user space
        turns it on when needed.
      
      - ETHTOOL_A_MM_VERIFY_ENABLED: The IEEE managed object is called
        aMACMergeVerifyDisableTx. I opted for consistency (positive logic) in
        the boolean netlink attributes offered, so this is also positive here.
        Other than the meaning being reversed, they correspond to the same
        thing.
      
      - ETHTOOL_A_MM_MAX_VERIFY_TIME: I found it most reasonable for a LLDP
        daemon to maximize the verifyTime variable (delay between SMD-V
        transmissions), to maximize its chances that the LP replies. IEEE says
        that the verifyTime can range between 1 and 128 ms, but the NXP ENETC
        stupidly keeps this variable in a 7 bit register, so the maximum
        supported value is 127 ms. I could have chosen to hardcode this in the
        LLDP daemon to a lower value, but why not let the kernel expose its
        supported range directly.
      
      - ETHTOOL_A_MM_TX_MIN_FRAG_SIZE: the standard managed object is called
        aMACMergeAddFragSize, and expresses the "additional" fragment size
        (on top of ETH_ZLEN), whereas this expresses the absolute value of the
        fragment size.
      
      - ETHTOOL_A_MM_RX_MIN_FRAG_SIZE: there doesn't appear to exist a managed
        object mandated by the standard, but user space clearly needs to know
        what is the minimum supported fragment size of our local receiver,
        since LLDP must advertise a value no lower than that.
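
Two of the points above lend themselves to a short sketch: the inverted sense of the verify flag, and range-checking a requested verify time against the maximum a device reports. The function names are hypothetical. The 1 to 128 ms range and the 127 ms device limit are the figures quoted in the list.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define IEEE_VERIFY_TIME_MIN_MS 1u
#define IEEE_VERIFY_TIME_MAX_MS 128u

/* The Linux attribute uses positive logic; the IEEE object is "disable". */
static bool verify_enabled_from_ieee(bool verify_disable_tx)
{
	return !verify_disable_tx;
}

/* Accept a verify time only if it fits both the standard and the device. */
static int check_verify_time(uint32_t verify_time_ms, uint32_t dev_max_ms)
{
	assert(dev_max_ms <= IEEE_VERIFY_TIME_MAX_MS);
	if (verify_time_ms < IEEE_VERIFY_TIME_MIN_MS || verify_time_ms > dev_max_ms)
		return -ERANGE;
	return 0;
}
```

A daemon that wants the largest possible verify time can then request whatever maximum the device reports, instead of hardcoding a value that some hardware would reject.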
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/sock: Introduce trace_sk_data_ready() · 40e0b090
      Peilin Ye authored
      As suggested by Cong, introduce a tracepoint for all ->sk_data_ready()
      callback implementations.  For example:
      
      <...>
        iperf-609  [002] .....  70.660425: sk_data_ready: family=2 protocol=6 func=sock_def_readable
        iperf-609  [002] .....  70.660436: sk_data_ready: family=2 protocol=6 func=sock_def_readable
      <...>
      Suggested-by: Cong Wang <cong.wang@bytedance.com>
      Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 21 Jan, 2023 3 commits
    • Merge branch 'mlxsw-add-support-of-latency-tlv' · a7b87d2a
      Jakub Kicinski authored
      Petr Machata says:
      
      ====================
      mlxsw: Add support of latency TLV
      
      Amit Cohen writes:
      
      Ethernet Management Datagrams (EMADs) are Ethernet packets sent between
      the driver and device's firmware. They are used to pass various
      configurations to the device, but also to get events (e.g., port up)
      from it. After the Ethernet header, these packets are built in a TLV
      format.
      
      This is the structure of EMADs:
      * Ethernet header
      * Operation TLV
      * String TLV (optional)
      * Latency TLV (optional)
      * Reg TLV
      * End TLV
      
      The latency of each EMAD is measured by firmware. The driver can get the
      measurement via latency TLV which can be added to each EMAD. This TLV is
      optional; when an EMAD is sent with this TLV, the EMAD's response will include
      the TLV and will contain the firmware measurement.
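
The EMAD layout listed above, with its two optional TLVs, can be sketched as an ordered sequence. The enum values here are plain ordinals chosen for illustration and are not the device's TLV type codes. The function name is likewise hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative ordinals only; not the on-wire TLV type codes. */
enum emad_tlv { TLV_OPERATION, TLV_STRING, TLV_LATENCY, TLV_REG, TLV_END };

#define EMAD_MAX_TLVS 5

/* Fill out[] with the TLVs that follow the Ethernet header, in order. */
static size_t emad_tlv_layout(bool with_string, bool with_latency,
			      enum emad_tlv out[EMAD_MAX_TLVS])
{
	size_t n = 0;

	assert(out != NULL);
	out[n++] = TLV_OPERATION;
	if (with_string)
		out[n++] = TLV_STRING;   /* optional */
	if (with_latency)
		out[n++] = TLV_LATENCY;  /* optional */
	out[n++] = TLV_REG;
	out[n++] = TLV_END;
	return n;
}
```

In this model the two flags would be set from whatever the firmware reports as supported, so an EMAD never carries a TLV the device cannot parse.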
      
      Add support for Latency TLV and use it by default for all EMADs (see
      more information in commit messages). The latency measurements can be
      processed using BPF program for example, to create a histogram and average
      of the latency per register. In addition, it is possible to measure the
      end-to-end latency, so then the latency of the software overhead can be
      calculated. This information can be useful to improve the driver
      performance.
      
      See an example of output of BPF tool which presents these measurements:
      
      $ ./emadlatency -f -a
          Tracing EMADs... Hit Ctrl-C to end.
          Register write = RALUE (0x8013)
          E2E Measurements:
          average = 23 usecs, total = 32052693 usecs, count = 1337061
               usecs               : count    distribution
                   0 -> 1          : 0        |                                 |
                   2 -> 3          : 0        |                                 |
                   4 -> 7          : 0        |                                 |
                   8 -> 15         : 0        |                                 |
                  16 -> 31         : 1290814  |*********************************|
                  32 -> 63         : 45339    |*                                |
                  64 -> 127        : 532      |                                 |
                 128 -> 255        : 247      |                                 |
                 256 -> 511        : 57       |                                 |
                 512 -> 1023       : 26       |                                 |
                1024 -> 2047       : 33       |                                 |
                2048 -> 4095       : 0        |                                 |
                4096 -> 8191       : 10       |                                 |
                8192 -> 16383      : 1        |                                 |
               16384 -> 32767      : 1        |                                 |
               32768 -> 65535      : 1        |                                 |
      
          Firmware Measurements:
          average = 10 usecs, total = 13884128 usecs, count = 1337061
               usecs               : count    distribution
                   0 -> 1          : 0        |                                 |
                   2 -> 3          : 0        |                                 |
                   4 -> 7          : 0        |                                 |
                   8 -> 15         : 1337035  |*********************************|
                  16 -> 31         : 17       |                                 |
                  32 -> 63         : 7        |                                 |
                  64 -> 127        : 0        |                                 |
                 128 -> 255        : 2        |                                 |
      
          Diff between measurements: 13 usecs
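
The averages and the difference in the output above can be reproduced from the printed totals and counts. The sketch assumes the tool truncates to whole microseconds, which is consistent with the figures shown. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Integer average, truncating toward zero; 0 when there are no samples. */
static uint64_t avg_usecs(uint64_t total_usecs, uint64_t count)
{
	return count ? total_usecs / count : 0;
}

/* Software overhead: end-to-end latency minus the firmware's share of it. */
static uint64_t software_overhead(uint64_t e2e_usecs, uint64_t fw_usecs)
{
	assert(e2e_usecs >= fw_usecs);
	return e2e_usecs - fw_usecs;
}
```

With the totals above, 32052693 / 1337061 truncates to 23 and 13884128 / 1337061 truncates to 10, so the software overhead comes out at 13 usecs per EMAD.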
      
      Patch set overview:
      Patches #1-#3 add support for querying MGIR, to know if string TLV and
      latency TLV are supported
      Patches #4-#5 add some relevant fields to support latency TLV
      Patch #6 adds support of latency TLV
      ====================
      
      Link: https://lore.kernel.org/r/cover.1674123673.git.petrm@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • mlxsw: Add support of latency TLV · 49f5b769
      Amit Cohen authored
      The latency of each EMAD can be measured by firmware. The driver can get
      the measurement via latency TLV which can be added to each EMAD. This TLV
      is optional; when an EMAD is sent with this TLV, the EMAD's response will
      include the TLV and the field 'latency_time' will contain the firmware
      measurement.
      
      This information can be processed using BPF program for example, to
      create a histogram and average of the latency per register. In addition,
      it is possible to measure the end-to-end latency and then subtract the
      firmware measurement, which yields the latency of the software overhead.
      This information can be useful to improve the driver performance.
      
      Add support for latency TLV by default for all EMADs. Initially we planned
      to enable latency TLV on demand, using devlink-param. After some tests, we
      know that the usage of latency TLV does not impact the end-to-end latency,
      so it is OK to enable it by default.
      
      Note that similar to string TLV, the latency TLV is not supported in all
      firmware versions. Enable the usage of this TLV only after verifying it is
      supported by the current firmware version by querying the Management
      General Information Register (MGIR).
      Signed-off-by: Danielle Ratson <danieller@nvidia.com>
      Signed-off-by: Amit Cohen <amcohen@nvidia.com>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • mlxsw: core: Define latency TLV fields · 6ee0d3a9
      Amit Cohen authored
      The next patch will add support for latency TLV as part of EMAD (Ethernet
      Management Datagrams) packets. As preparation, add the relevant fields.
      Signed-off-by: Danielle Ratson <danieller@nvidia.com>
      Signed-off-by: Amit Cohen <amcohen@nvidia.com>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>