1. 16 Aug, 2021 40 commits
    • net/mlx5: Bridge, obtain core device from eswitch instead of priv · a514d173
      Vlad Buslov authored
      Following patches in series will pass bond device to bridge, which means
      the code can't assume the device is mlx5 representor. Moreover, the core
      device can be easily obtained from eswitch instance, so there is no reason
      for more complex code that obtains struct mlx5_priv from net_device in
      order to use its mdev. Refactor the code to use esw->dev instead of
      priv->mdev.
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Reviewed-by: Mark Bloch <mbloch@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: Bridge, release bridge in same function where it is taken · 4de20e9a
      Vlad Buslov authored
      Refactor mlx5_esw_bridge_vport_link() to release the bridge instance if
      mlx5_esw_bridge_vport_init() returned an error instead of relying on it to
      release the bridge. This improves the design because object instance is
      taken and released in same layer and simplifies following patches that add
      more logic to mlx5_esw_bridge_vport_link().
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Reviewed-by: Mark Bloch <mbloch@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
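A minimal sketch of the "take and release in the same layer" idea described in the commit above. All names here (sketch_bridge, sketch_vport_link, and so on) are hypothetical stand-ins, not the driver's real types or functions; the point is only that the caller that takes the reference also drops it when the callee fails.

```c
#include <stddef.h>

/* Hypothetical stand-in for a refcounted bridge object. */
struct sketch_bridge {
	int refcnt;
};

static void sketch_bridge_get(struct sketch_bridge *br)
{
	br->refcnt++;
}

static void sketch_bridge_put(struct sketch_bridge *br)
{
	br->refcnt--;
}

/* Callee: initializes the port and never touches the reference. */
static int sketch_vport_init(struct sketch_bridge *br, int fail)
{
	(void)br;
	return fail ? -1 : 0;
}

/* Caller: takes the reference and releases it itself on error. */
static int sketch_vport_link(struct sketch_bridge *br, int fail)
{
	int err;

	sketch_bridge_get(br);
	err = sketch_vport_init(br, fail);
	if (err)
		sketch_bridge_put(br);
	return err;
}
```

With this shape, adding more steps to the link function only requires extending its own error path, since the init helper has no release duty.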
    • net/mlx5e: Support MQPRIO channel mode · ec60c458
      Tariq Toukan authored
      Add support for MQPRIO channel mode, in which a partition to TCs
      is defined over the channels. We allow only partitions with contiguous
      queue indices, with no holes within. We do not allow modification
      of the number of channels while this MQPRIO mode is active.
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
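The "contiguous queue indices, with no holes" rule from the commit above can be illustrated with a small check. This is a sketch under assumptions, not the driver's validation code: the struct and function names are hypothetical, and it assumes the first range must start at queue 0 and that the ranges must fit within the channel count.

```c
#include <stdbool.h>

/* Hypothetical per-TC queue range, like the "count@offset" pairs of tc mqprio. */
struct sketch_tc_range {
	unsigned int count;
	unsigned int offset;
};

/*
 * Return true when every TC has at least one queue, each range starts
 * exactly where the previous one ended (no holes, no overlap), and the
 * ranges do not exceed the number of channels.
 */
static bool sketch_tc_ranges_valid(const struct sketch_tc_range *tc,
				   unsigned int num_tc,
				   unsigned int num_channels)
{
	unsigned int next = 0;
	unsigned int i;

	for (i = 0; i < num_tc; i++) {
		if (tc[i].count == 0 || tc[i].offset != next)
			return false;
		next += tc[i].count;
	}
	return next <= num_channels;
}
```

For example, a partition given as "queues 3@0 1@3" passes the check on a device with four or more channels, while "3@0 1@4" fails because queue 3 is left as a hole.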
    • net/mlx5e: Handle errors of netdev_set_num_tc() · 21ecfcb8
      Tariq Toukan authored
      Add handling for failures in netdev_set_num_tc().
      Let mlx5e_netdev_set_tcs return an int.
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Maintain MQPRIO mode parameter · e2aeac44
      Tariq Toukan authored
      This is in preparation for supporting MQPRIO CHANNEL mode in a
      downstream patch, in addition to the DCB mode that is supported today.
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Abstract MQPRIO params · 86d747a3
      Tariq Toukan authored
      Abstract the MQPRIO params into a struct.
      Use a getter for DCB mode num_tcs.
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Support flow classification into RSS contexts · 248d3b4c
      Tariq Toukan authored
      Extend the existing flow classification support, to steer
      flows not only directly to a receive ring, but also into
      the new RSS contexts.
      
      Create needed TIR objects on demand, and hold reference
      on the RSS context.
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Support multiple RSS contexts · f01cc58c
      Tariq Toukan authored
      Add support for multiple RSS contexts. Resources of the non-default
      RSS contexts are allocated and created on demand. Each RSS context
      can be controlled and configured separately, via the implemented
      ethtool ops. Here we limit the total number of contexts to 16.
      
      We do not enforce any kind of new limitation over the indirection table
      content. More specifically, two separate contexts can be configured to
      fully or partially point to the same set of receive rings.
      
      The default RSS context (index 0) is created with its full set of TIRs.
      All other contexts are created with an empty set, then TIRs are added
      upon first usage when steering rules are added.
      We use a reference counting mechanism to make sure an RSS context is
      not removed before the rules pointing to it.
      
      Block ethtool set_channels operations when multiple RSS contexts exist,
      as currently the kernel doesn't protect against inconsistent channels
      configs that break non-default RSS contexts.
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
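The bookkeeping described in the commit above (a default context at index 0, at most 16 contexts in total, and a reference count that keeps a context alive while steering rules point to it) might be sketched as follows. Every name is hypothetical and the error codes are an assumption; this is not the mlx5e implementation and it omits TIRs and indirection tables entirely.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define SKETCH_MAX_RSS_CTX 16 /* total contexts, default one included */

struct sketch_rss_ctx {
	bool in_use;
	unsigned int refcnt; /* number of steering rules using this context */
};

struct sketch_rss_table {
	struct sketch_rss_ctx ctx[SKETCH_MAX_RSS_CTX];
};

/* The default context (index 0) always exists. */
static void sketch_rss_init(struct sketch_rss_table *t)
{
	memset(t, 0, sizeof(*t));
	t->ctx[0].in_use = true;
}

/* Create a non-default context on demand; return its index or -ENOSPC. */
static int sketch_rss_create(struct sketch_rss_table *t)
{
	int i;

	for (i = 1; i < SKETCH_MAX_RSS_CTX; i++) {
		if (!t->ctx[i].in_use) {
			t->ctx[i].in_use = true;
			return i;
		}
	}
	return -ENOSPC;
}

/* A steering rule takes a reference on the context it points to. */
static int sketch_rss_rule_add(struct sketch_rss_table *t, unsigned int idx)
{
	if (idx >= SKETCH_MAX_RSS_CTX || !t->ctx[idx].in_use)
		return -ENOENT;
	t->ctx[idx].refcnt++;
	return 0;
}

static void sketch_rss_rule_del(struct sketch_rss_table *t, unsigned int idx)
{
	if (idx < SKETCH_MAX_RSS_CTX && t->ctx[idx].refcnt)
		t->ctx[idx].refcnt--;
}

/* Refuse to remove the default context or one that rules still use. */
static int sketch_rss_destroy(struct sketch_rss_table *t, unsigned int idx)
{
	if (idx == 0)
		return -EINVAL;
	if (idx >= SKETCH_MAX_RSS_CTX || !t->ctx[idx].in_use)
		return -ENOENT;
	if (t->ctx[idx].refcnt)
		return -EBUSY;
	t->ctx[idx].in_use = false;
	return 0;
}
```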
    • net/mlx5e: Dynamically allocate TIRs in RSS contexts · 49095f64
      Tariq Toukan authored
      Move from static to dynamic memory allocations for TIR.
      This is in preparation for supporting on-demand TIR operations in
      downstream patches, where every RSS context will be initialized with an
      empty set of TIRs.
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Convert RSS to a dedicated object · 25307a91
      Tariq Toukan authored
      Code related to RSS is now encapsulated into a dedicated object and put
      into new files en/rss.{c,h}. All usages are converted.
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Introduce abstraction of RSS context · 713ba5e5
      Tariq Toukan authored
      Bring all fields that define and maintain RSS behavior together
      into a new structure.
      Align all usages with this new structure. Keep it hidden within
      rx_res.c.
      This helps support multiple RSS contexts in a downstream patch.
      
      Use dynamic allocations for the RSS context.
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Introduce TIR create/destroy API in rx_res · fc651ff9
      Tariq Toukan authored
      Move TIR control operations in rx_res into dedicated functions.
      This is in preparation for supporting on-demand TIR operations in
      downstream patches.
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Do not try enable RSS when resetting indir table · 6e5fea51
      Tariq Toukan authored
      All calls to mlx5e_rx_res_rss_set_indir_uniform() occur while the RSS
      state is inactive, i.e. the RQT is pointing to the drop RQ, not to the
      channels' RQs.
      It means that the "apply" part of the function is never executed.
      Remove this part from the function, and document the change. This will
      be useful for the next patches in the series, as it allows code
      simplifications when multiple RSS contexts are introduced.
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • bonding: improve nl error msg when device can't be enslaved because of IFF_MASTER · 1b3f78df
      Antoine Tenart authored
      Use a more user friendly netlink error message when a device can't be
      enslaved because it has IFF_MASTER, by not referring directly to a
      kernel internal flag.
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'bridge-mcast-fixes' · ab636138
      David S. Miller authored
      Nikolay Aleksandrov says:
      
      ====================
      net: bridge: mcast: fixes for mcast querier state
      
      These three fix querier state dumping. The first patch can be considered
      a minor behaviour improvement, it avoids dumping querier state when mcast
      snooping is disabled. The second patch was a report of sizeof(0) used
      for nested netlink attribute size which should be just 0, and the third
      patch accounts for IPv6 querier state size when allocating skb for
      notifications.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: mcast: account for ipv6 size when dumping querier state · 175e6692
      Nikolay Aleksandrov authored
      We need to account for the IPv6 attributes when dumping querier state.
      
      Fixes: 5e924fe6ccfd ("net: bridge: mcast: dump ipv6 querier state")
      Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: mcast: drop sizeof for nest attribute's zero size · cdda378b
      Nikolay Aleksandrov authored
      This was a dumb error I made: instead of writing nla_total_size(0)
      for a nest attribute, I wrote nla_total_size(sizeof(0)).
      Reported-by: kernel test robot <lkp@intel.com>
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Fixes: 606433fe3e11 ("net: bridge: mcast: dump ipv4 querier state")
      Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
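The size difference behind the fix above can be worked out with a userspace restatement of the netlink attribute arithmetic. The sketch_ names are local to this example; the constants mirror the kernel's 4-byte attribute header and 4-byte alignment, and the second result assumes a platform where sizeof(int) is 4.

```c
#include <stddef.h>

#define SKETCH_NLA_ALIGNTO 4
#define SKETCH_NLA_ALIGN(len) \
	(((len) + SKETCH_NLA_ALIGNTO - 1) & ~(SKETCH_NLA_ALIGNTO - 1))
#define SKETCH_NLA_HDRLEN 4 /* aligned header: 16-bit length + 16-bit type */

/* Total size of one attribute: header plus payload, padded to 4 bytes. */
static int sketch_nla_total_size(int payload)
{
	return SKETCH_NLA_ALIGN(SKETCH_NLA_HDRLEN + payload);
}

/* Size reserved for an empty nest: header only. */
static int sketch_nest_size_correct(void)
{
	return sketch_nla_total_size(0);
}

/* sizeof(0) is sizeof(int), so this reserves a payload the nest header never carries. */
static int sketch_nest_size_buggy(void)
{
	return sketch_nla_total_size((int)sizeof(0));
}
```

With a 4-byte int the buggy variant reserves 8 bytes instead of 4, so the allocation was slightly too large rather than too small.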
    • net: bridge: mcast: don't dump querier state if snooping is disabled · f137b7d4
      Nikolay Aleksandrov authored
      A minor improvement to avoid dumping mcast ctx querier state if snooping
      is disabled for that context (either bridge or vlan).
      Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'stmmac-per-queue-stats' · 23a44b77
      David S. Miller authored
      Vijayakannan Ayyathurai says:
      
      ====================
      net: stmmac: Add ethtool per-queue statistic
      
      Add a generic ethtool per-queue statistics framework to display the
      statistics for each rx/tx queue. In the future, it can be extended with
      more per-queue counters. The number of rx/tx queues displayed depends
      on the rx/tx queues available in that particular MAC config, and is
      limited to the MTL_MAX_{RX|TX}_QUEUES defined in the driver.
      
      Ethtool per-queue statistic display will look like below, when users
      start adding more counters.
      
      Example - 1:
       q0_tx_statA:
       q0_tx_statB:
       q0_tx_statC:
       |
       q0_tx_statX:
       .
       .
       .
       qMAX_tx_statA:
       qMAX_tx_statB:
       qMAX_tx_statC:
       |
       qMAX_tx_statX:
      
       q0_rx_statA:
       q0_rx_statB:
       q0_rx_statC:
       |
       q0_rx_statX:
       .
       .
       .
       qMAX_rx_statA:
       qMAX_rx_statB:
       qMAX_rx_statC:
       |
       qMAX_rx_statX:
      
      Example - 2: Ping test using the tx queue 3.
      
      $ tc qdisc add dev enp0s30f4 root mqprio num_tc 2 map 1 0 0 0 0 0 0 0 \
         0 0 0 0 0 0 0 0 queues 3@0 1@3 hw 0
      
      Statistic before ping:
      ---------------------
      $ ethtool -S enp0s30f4
      
      [ snip ]
           q3_tx_pkt_n: 7916
           q3_tx_irq_n: 316
      [ snip ]
      
      $ cat /proc/interrupts
      
      [ snip ]
       143:          0          0          0        316          0          0          0          0  IR-PCI-MSI 499719-edge      enp0s30f4:tx-3
      [ snip ]
      
      $ ping -I enp0s30f4 192.168.1.10 -i 0.01 -c 100 > /dev/null
      
      Statistic after ping:
      ---------------------
      $ ethtool -S enp0s30f4
      
      [ snip ]
           q3_tx_pkt_n: 8016
           q3_tx_irq_n: 320
      [ snip ]
      
      $ cat /proc/interrupts
      
      [ snip ]
       143:          0          0          0        320          0          0          0          0  IR-PCI-MSI 499719-edge      enp0s30f4:tx-3
      [ snip ]
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: add ethtool per-queue irq statistic support · af9bf701
      Vijayakannan Ayyathurai authored
      Add ethtool per-queue statistics support to show the number of interrupts
      generated at DMA tx and DMA rx. All the counters are incremented in the
      dwmac4_dma_interrupt function.
      Signed-off-by: Vijayakannan Ayyathurai <vijayakannan.ayyathurai@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: add ethtool per-queue statistic framework · 68e9c5de
      Vijayakannan Ayyathurai authored
      Add a generic ethtool per-queue statistics framework to display the
      statistics for each rx/tx queue. In the future, it can be extended with
      more per-queue counters. The number of rx/tx queues displayed depends
      on the rx/tx queues available in that particular MAC config, and is
      limited to the MTL_MAX_{RX|TX}_QUEUES defined in the driver.
      
      Ethtool per-queue statistic display will look like below, when users
      start adding more counters.
      
      Example:
       q0_tx_statA:
       q0_tx_statB:
       q0_tx_statC:
       |
       q0_tx_statX:
       .
       .
       .
       qMAX_tx_statA:
       qMAX_tx_statB:
       qMAX_tx_statC:
       |
       qMAX_tx_statX:
      
       q0_rx_statA:
       q0_rx_statB:
       q0_rx_statC:
       |
       q0_rx_statX:
       .
       .
       .
       qMAX_rx_statA:
       qMAX_rx_statB:
       qMAX_rx_statC:
       |
       qMAX_rx_statX:
      
      In addition, this patch adds support for displaying the number of
      packets received and transmitted per queue.
      Signed-off-by: Vijayakannan Ayyathurai <vijayakannan.ayyathurai@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
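The naming scheme shown in the example above (queue index, direction, then counter name) can be reproduced with a small formatter. This is an illustrative sketch, not the stmmac code: the function name is hypothetical, and only the resulting string layout, such as q3_tx_pkt_n, comes from the commit text.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Build a per-queue counter name of the form q<N>_<dir>_<stat>.
 * Returns the snprintf() result, i.e. the length the full name needs.
 */
static int sketch_queue_stat_name(char *buf, size_t len, unsigned int queue,
				  const char *dir, const char *stat)
{
	return snprintf(buf, len, "q%u_%s_%s", queue, dir, stat);
}
```

Calling it with queue 3, direction "tx" and counter "pkt_n" yields the q3_tx_pkt_n label seen in the ethtool -S output of the series cover letter.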
    • net: stmmac: fix INTR TBU status affecting irq count statistic · 1975df88
      Voon Weifeng authored
      The DMA channel status "Transmit buffer unavailable (TBU)" bit does not
      indicate a successful DMA tx. Hence, it should not affect
      the irq count statistics.
      
      Fixes: 1103d3a5 ("net: stmmac: dwmac4: Also use TBU interrupt to clean TX path")
      Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
      Signed-off-by: Vijayakannan Ayyathurai <vijayakannan.ayyathurai@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: reorganize probe, remove, setup and teardown ordering · 022522ac
      Vladimir Oltean authored
      The sja1105 driver's initialization and teardown sequence is a chaotic
      mess that has gathered a lot of cruft over time. It works because there
      is no strict dependency between the functions, but it could be improved.
      
      The basic principle that teardown should be the exact reverse of setup
      is obviously not held. We have initialization steps (sja1105_tas_setup,
      sja1105_flower_setup) in the probe method that are torn down in the DSA
      .teardown method instead of driver unbind time.
      
      We also have code after the dsa_register_switch() call, which implicitly
      means after the .setup() method has finished, which is pretty unusual.
      
      Also, sja1105_teardown() has calls set up in a different order than the
      error path of sja1105_setup(): see the reversed ordering between
      sja1105_ptp_clock_unregister and sja1105_mdiobus_unregister.
      
      Also, sja1105_static_config_load() is called towards the end of
      sja1105_setup(), but sja1105_static_config_free() is also towards the
      end of the error path and teardown path. The static_config_load() call
      should be earlier.
      
      Also, making and breaking the connections between struct sja1105_port
      and struct dsa_port could be refactored into dedicated functions, which
      makes the code easier to follow.
      
      We move some code from the DSA .setup() method into the probe method,
      like the device tree parsing, and we move some code from the probe
      method into the DSA .setup() method to be symmetric with its placement
      in the DSA .teardown() method, which is nice because the unbind function
      has a single call to dsa_unregister_switch(). An example of the latter
      type of code movement is the connections between ports mentioned above,
      which are now in the .setup() method.

      Finally, due to the fact that the kthread_init_worker() call is no longer
      in sja1105_probe() - located towards the bottom of the file - but in
      sja1105_setup() - located much higher - there is an inverse ordering
      with the worker function declaration, sja1105_port_deferred_xmit. To
      avoid that, the entire sja1105_setup() and sja1105_teardown() functions
      are moved towards the bottom of the file.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
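The principle stated in the commit above, that teardown should be the exact reverse of setup and match the setup error path, is commonly written with goto-based unwinding. The sketch below uses hypothetical steps A, B and C that only record the order in which they run; it does not model the sja1105 driver's actual steps.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_LOG_MAX 16

/* Records one character per executed step; always NUL-terminated. */
static char sketch_order_log[SKETCH_LOG_MAX];
static size_t sketch_order_len;

static void sketch_record(char c)
{
	if (sketch_order_len < SKETCH_LOG_MAX - 1)
		sketch_order_log[sketch_order_len++] = c;
}

static void sketch_order_reset(void)
{
	memset(sketch_order_log, 0, sizeof(sketch_order_log));
	sketch_order_len = 0;
}

/* A setup step logs an uppercase letter, or fails without side effects. */
static int sketch_step_setup(char c, int fail)
{
	if (fail)
		return -1;
	sketch_record(c);
	return 0;
}

/* The matching teardown step logs the lowercase letter. */
static void sketch_step_teardown(char c)
{
	sketch_record(c);
}

/* fail_at selects which step (1..3) fails; 0 means all steps succeed. */
static int sketch_setup(int fail_at)
{
	int err;

	err = sketch_step_setup('A', fail_at == 1);
	if (err)
		return err;
	err = sketch_step_setup('B', fail_at == 2);
	if (err)
		goto undo_a;
	err = sketch_step_setup('C', fail_at == 3);
	if (err)
		goto undo_b;
	return 0;

undo_b:
	sketch_step_teardown('b');
undo_a:
	sketch_step_teardown('a');
	return err;
}

/* Full teardown undoes the steps in the same reverse order as the error path. */
static void sketch_teardown(void)
{
	sketch_step_teardown('c');
	sketch_step_teardown('b');
	sketch_step_teardown('a');
}
```

Keeping both unwinding paths in the same order makes mismatches, such as the reversed unregister calls mentioned in the commit message, easy to spot by reading the two sequences side by side.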
    • r8169: rename rtl_csi_access_enable to rtl_set_aspm_entry_latency · c07c8ffc
      Heiner Kallweit authored
      Rename the function to reflect what it's doing. Also add a description
      of the register values as kindly provided by Realtek.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'ocelot-phylink' · 793ee362
      David S. Miller authored
      Vladimir Oltean says:
      
      ====================
      Convert ocelot to phylink
      
      The ocelot switchdev and felix dsa drivers are interesting because they
      target the same class of hardware switches but used in different modes.
      
      Colin has an interesting use case where he wants to use a hardware
      switch supported by the ocelot switchdev driver with the felix dsa
      driver.
      
      So far, the existing hardware revisions were similar between the ocelot
      and felix drivers, but not completely identical. With identical hardware,
      it is absurd that the felix driver uses phylink while the ocelot driver
      uses phylib - this should not be one of the differences between the
      switchdev and dsa driver, and we could eliminate it.
      
      Colin will need the common phylink support in ocelot and felix when
      adding a phylink_pcs driver for the PCS1G block inside VSC7514, which
      will make the felix driver work with either the NXP or the Microchip PCS.
      
      As usual, Alex, Horatiu, sorry for bugging you, but it would be
      appreciated if you could give this a quick run on actual VSC7514
      hardware (which I don't have) to make sure I'm not introducing any
      breakage.
      ====================
      
      Fixes: 0f06a678 ("samples: Add an IPv6 "-6" option to the pktgen scripts")
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mscc: ocelot: convert to phylink · e6e12df6
      Vladimir Oltean authored
      The felix DSA driver, which is a wrapper over the same hardware class as
      ocelot, is integrated with phylink, but ocelot is using the plain PHY
      library. It makes sense to bring together the two implementations, which
      is what this patch achieves.
      
      This is a large patch and hard to break up, but it does the following:
      
      The existing ocelot_adjust_link writes some registers, and
      felix_phylink_mac_link_up writes some registers, some of them are
      common, but both functions write to some registers to which the other
      doesn't.
      
      The main reasons for this are:
      - Felix switches so far have used an NXP PCS so they had no need to
        write the PCS1G registers that ocelot_adjust_link writes
      - Felix switches have the MAC fixed at 1G, so some of the MAC speed
        changes actually break the link and must be avoided.
      
      The naming conventions for the functions introduced in this patch are:
      - vsc7514_phylink_{mac_config,validate} are specific to the Ocelot
        instantiations and placed in ocelot_net.c which is built only for the
        ocelot switchdev driver.
      - ocelot_phylink_mac_link_{up,down} are shared between the ocelot
        switchdev driver and the felix DSA driver (they are put in the common
        lib).
      
      One by one, the registers written by ocelot_adjust_link are:
      
      DEV_MAC_MODE_CFG - felix_phylink_mac_link_up had no need to write this
                         register since its out-of-reset value was fine and
                         did not need changing. The write is moved to the
                         common ocelot_phylink_mac_link_up and on felix it is
                         guarded by a quirk bit that makes the written value
                         identical with the out-of-reset one
      DEV_PORT_MISC - runtime invariant, was moved to vsc7514_phylink_mac_config
      PCS1G_MODE_CFG - same as above
      PCS1G_SD_CFG - same as above
      PCS1G_CFG - same as above
      PCS1G_ANEG_CFG - same as above
      PCS1G_LB_CFG - same as above
      DEV_MAC_ENA_CFG - both ocelot_adjust_link and ocelot_port_disable
                        touched this. felix_phylink_mac_link_{up,down} also
                        do. We go with what felix does and put it in
                        ocelot_phylink_mac_link_up.
      DEV_CLOCK_CFG - ocelot_adjust_link and felix_phylink_mac_link_up both
                      write this, but to different values. Move to the common
                      ocelot_phylink_mac_link_up and make sure via the quirk
                      that the old values are preserved for both.
      ANA_PFC_PFC_CFG - ocelot_adjust_link wrote this, felix_phylink_mac_link_up
                        did not. Runtime invariant, speed does not matter since
                        PFC is disabled via the RX_PFC_ENA bits which are cleared.
                        Move to vsc7514_phylink_mac_config.
      QSYS_SWITCH_PORT_MODE_PORT_ENA - both ocelot_adjust_link and
                                       felix_phylink_mac_link_{up,down} wrote
                                       this. Ocelot also wrote this register
                                       from ocelot_port_disable. Keep what
                                       felix did, move in ocelot_phylink_mac_link_{up,down}
                                       and delete ocelot_port_disable.
      ANA_POL_FLOWC - same as above
      SYS_MAC_FC_CFG - same as above, except slight behavior change. Whereas
                       ocelot always enabled RX and TX flow control, felix
                       listened to phylink (for the most part, at least - see
                       the 2500base-X comment).
      
      The registers which only felix_phylink_mac_link_up wrote are:
      
      SYS_PAUSE_CFG_PAUSE_ENA - this is why I am not sure that flow control
                                worked on ocelot. Now it should, since the
                                code is shared with felix where it does.
      ANA_PORT_PORT_CFG - this is a Frame Analyzer block register, phylink
                          should not be the one touching it, deleted.
      
      Other changes:
      
      - The old phylib registration code was in mscc_ocelot_init_ports. It is
        hard to work with 2 levels of indentation already in, and with hard to
        follow teardown logic. The new phylink registration code was moved
        inside ocelot_probe_port(), right between alloc_etherdev() and
        register_netdev(). It could not be done before (=> outside of)
        ocelot_probe_port() because ocelot_probe_port() allocates the struct
        ocelot_port which we then use to assign ocelot_port->phy_mode to. I
        prefer to have all PHY handling logic inside the same function.
      - On the same topic: struct ocelot_port_private :: serdes is only used
        in ocelot_port_open to set the SERDES protocol to Ethernet. This is
        logically a runtime invariant and can be done just once, when the port
        registers with phylink. We therefore don't even need to keep the
        serdes reference inside struct ocelot_port_private, or to use the devm
        variant of of_phy_get().
      - Phylink needs a valid phy-mode for phylink_create() to succeed, and
        the existing device tree bindings in arch/mips/boot/dts/mscc/ocelot_pcb120.dts
        don't define one for the internal PHY ports. So we patch
        PHY_INTERFACE_MODE_NA into PHY_INTERFACE_MODE_INTERNAL.
      - There was a strategically placed:
      
      	switch (priv->phy_mode) {
      	case PHY_INTERFACE_MODE_NA:
      	        continue;
      
        which made the code skip the serdes initialization for the internal
        PHY ports. Frankly that is not all that obvious, so now we explicitly
        initialize the serdes under an "if" condition and not rely on code
        jumps, so everything is clearer.
      - There was a write of OCELOT_SPEED_1000 to DEV_CLOCK_CFG for QSGMII
        ports. Since that is in fact the default value for the register field
        DEV_CLOCK_CFG_LINK_SPEED, I can only guess the intention was to clear
        the adjacent fields, MAC_TX_RST and MAC_RX_RST, aka take the port out
        of reset, which does match the comment. I don't even want to know why
        this code is placed there, but if there is indeed an issue that all
        ports that share a QSGMII lane must all be up, then this logic is
        already buggy, since mscc_ocelot_init_ports iterates using
        for_each_available_child_of_node, so nobody prevents the user from
        putting a 'status = "disabled";' for some QSGMII ports which would
        break the driver's assumption.
        In any case, in the eventuality that I'm right, we would have yet
        another issue if ocelot_phylink_mac_link_down would reset those ports
        and that would be forbidden, so since the ocelot_adjust_link logic did
        not do that (maybe for a reason), add another quirk to preserve the
        old logic.
      
      The ocelot driver teardown goes through all ports in one fell swoop.
      When initialization of one port fails, the ocelot->ports[port] pointer
      for that is reset to NULL, and teardown is done only for non-NULL ports,
      so there is no reason to do partial teardowns, let the central
      mscc_ocelot_release_ports() do its job.
      
      Tested bind, unbind, rebind, link up, link down, speed change on mock-up
      hardware (modified the driver to probe on Felix VSC9959). Also
      regression tested the felix DSA driver. Could not test the Ocelot
      specific bits (PCS1G, SERDES, device tree bindings).
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: felix: stop calling ocelot_port_{enable,disable} · 46efe4ef
      Vladimir Oltean authored
      ocelot_port_enable touches ANA_PORT_PORT_CFG, which has the following
      fields:
      
      - LOCKED_PORTMOVE_CPU, LEARNDROP, LEARNCPU, LEARNAUTO, RECV_ENA, all of
        which are written with their hardware default values, also runtime
        invariants. So it makes no sense to write these during every .ndo_open.
      
      - PORTID_VAL: this field has an out-of-reset value of zero for all ports
        and must be initialized by software. Additionally, the
        ocelot_setup_logical_port_ids() code path sets up different logical
        port IDs for the ports in a hardware LAG, and we absolutely don't want
        .ndo_open to interfere there and reset those values.
      
      So in fact the write from ocelot_port_enable can better be moved to
      ocelot_init_port, and the .ndo_open hook deleted.
      
      ocelot_port_disable touches DEV_MAC_ENA_CFG and QSYS_SWITCH_PORT_MODE_PORT_ENA,
      in an attempt to undo what ocelot_adjust_link did. But since .ndo_stop
      does not get called each time the link falls (i.e. this isn't a
      substitute for .phylink_mac_link_down), felix already does better at
      this by writing those registers already in felix_phylink_mac_link_down.
      
      So keep ocelot_port_disable (for now, until ocelot is converted to
      phylink too), and just delete the felix call to it, which is not
      necessary.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • s390/net: replace in_irq() with in_hardirq() · e871ee69
      Changbin Du authored
      Replace the obsolete and ambiguous macro in_irq() with the new
      macro in_hardirq().
      Signed-off-by: Changbin Du <changbin.du@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: tag_8021q: fix notifiers broadcast when they shouldn't, and vice versa · b2b89133
      Vladimir Oltean authored
      During the development of the blamed patch, the "bool broadcast"
      argument of dsa_port_tag_8021q_vlan_{add,del} was originally called
      "bool local", and the meaning was the exact opposite.
      
      Due to a rookie mistake where the patch was modified at the last minute
      without retesting, the instances of dsa_port_tag_8021q_vlan_{add,del}
      are called with the wrong values. During setup and teardown, cross-chip
      notifiers should not be broadcast to all DSA trees, while during
      bridging, they should.
      
      Fixes: 724395f4 ("net: dsa: tag_8021q: don't broadcast during setup/teardown")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ptp: ocp: don't allow on S390 · 944f5101
      Randy Dunlap authored
      Fix kconfig warning on arch/s390/:
      
      WARNING: unmet direct dependencies detected for SERIAL_8250
        Depends on [n]: TTY [=y] && HAS_IOMEM [=y] && !S390 [=y]
        Selected by [m]:
        - PTP_1588_CLOCK_OCP [=m] && PTP_1588_CLOCK [=m] && HAS_IOMEM [=y] && PCI [=y] && SPI [=y] && I2C [=m] && MTD [=m]
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Jonathan Lemon <jonathan.lemon@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • af_unix: check socket state when queuing OOB · 19eed721
      Rao Shoaib authored
      edumazet@google.com pointed out that queue_oob
      does not check socket state after acquiring
      the lock. He also pointed to an incorrect usage
      of kfree_skb and an unnecessary setting of skb
      length. This patch addresses those issues.
      Signed-off-by: Rao Shoaib <Rao.Shoaib@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: marvell: Add WAKE_PHY support to WOL event · 6164659f
      Song Yoong Siang authored
      Add Wake-on-PHY feature support by enabling the Link Up Event.
      Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: pcs: xpcs: Add Pause Mode support for SGMII and 2500BaseX · 849d2f83
      Wong Vee Khee authored
      SGMII/2500BaseX supports Pause frames as defined in the IEEE 802.3x
      Flow Control standard.
      
      Add this as a supported feature under the xpcs_sgmii_features struct.
      
      Cc: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'pktgen-samples' · 5fa5fb8b
      David S. Miller authored
      samples: pktgen: enhance the usability of pktgen samples
      
      This patchset improves the usability of the pktgen samples by adding an option
      that propagates the normal user's environment variables to sudo. It also adds
      the missing IPv6 option to the pktgen scripts.
      
      Currently, all pktgen samples can take environment variables instead of
      option parameters. However, this does not work when the samples are run as a
      normal user.
      
      These are the results of running a sample as root and as a normal user:
      
          // running as root
          # DEV=eth0 DEST_IP=10.1.0.1 DST_MAC=00:11:22:33:44:55 ./pktgen_sample01_simple.sh -v -n 1
          Running... ctrl^C to stop
      
          // running as normal user
          $ DEV=eth0 DEST_IP=10.1.0.1 DST_MAC=00:11:22:33:44:55 ./pktgen_sample01_simple.sh -v -n 1
          [...]
          ERROR: Please specify output device
      
      Passing environment variables does not work when running the samples as a
      normal user because the normal user's environment does not propagate to sudo
      (see root_check_run_with_sudo). The first commit solves this by using the "-E"
      (--preserve-env) option of "sudo", which passes the normal user's existing
      environment variables.
      
      Also, "sample04" and "sample05" do not work properly when run with the IPv6
      option ("-6"), because commit 0f06a678 ("samples: Add an IPv6 "-6" option to
      the pktgen scripts") did not add this option to these samples. The second
      commit adds the missing IPv6 option to these scripts.
      
      ====================
      
      Fixes: 0f06a678 ("samples: Add an IPv6 "-6" option to the pktgen scripts")
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • samples: pktgen: add missing IPv6 option to pktgen scripts · 0f0c4f1b
      Juhee Kang authored
      Currently, "sample04" and "sample05" do not work properly when run with
      the IPv6 option ("-6"). Commit 0f06a678 ("samples: Add an IPv6 "-6" option
      to the pktgen scripts") did not add this option to "sample04" and
      "sample05".
      
      To support the IPv6 option, this commit adds the IPv6-related logic to
      both samples.
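      
      A minimal sketch of what such option handling could look like, assuming a
      flag variable IP6 that is appended to the pktgen command name so that "dst"
      becomes "dst6". The function name dst_command is hypothetical, and this is
      not the code added by the commit.

```shell
#!/bin/sh
# Hypothetical helper: parse an optional "-6" flag and print the pktgen
# destination command name ("dst" for IPv4, "dst6" for IPv6).
dst_command() {
    IP6=""
    OPTIND=1                      # reset so the function can be called repeatedly
    while getopts "6" opt; do
        case "$opt" in
            6) IP6=6 ;;           # IPv6 requested
            *) return 1 ;;        # unknown option
        esac
    done
    echo "dst${IP6}"
}

v4_cmd=$(dst_command)
v6_cmd=$(dst_command -6)

echo "without -6: $v4_cmd"
echo "with -6:    $v6_cmd"
```

      A script that never sets the flag always emits the IPv4 command name, which
      is the kind of omission described above.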
      
      Fixes: 0f06a678 ("samples: Add an IPv6 "-6" option to the pktgen scripts")
      Signed-off-by: Juhee Kang <claudiajkang@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • samples: pktgen: pass the environment variable of normal user to sudo · 7caeabd7
      Juhee Kang authored
      All pktgen samples can take environment variables instead of option
      parameters (e.g. $DEV can be used instead of the '-i' option).
      
      These are the results of running a sample as root and as a normal user:
      
          // running as root
          # DEV=eth0 DEST_IP=10.1.0.1 DST_MAC=00:11:22:33:44:55 ./pktgen_sample01_simple.sh -v -n 1
          Running... ctrl^C to stop
      
          // running as normal user
          $ DEV=eth0 DEST_IP=10.1.0.1 DST_MAC=00:11:22:33:44:55 ./pktgen_sample01_simple.sh -v -n 1
          [...]
          ERROR: Please specify output device
      
      These results show that the sample does not work properly when run as a
      normal user. The sample is restarted by the function
      root_check_run_with_sudo so that it runs with sudo, and in this process
      the normal user's environment variables do not propagate to sudo.
      
      This can be solved with the "-E" (--preserve-env) option of "sudo", which
      preserves the normal user's existing environment variables. This commit
      adds the "-E" option to the call in root_check_run_with_sudo.
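      
      The effect described above can be reproduced without root access. The
      following is a minimal sketch, not the code from the samples: "env -i" is
      used only as a stand-in for sudo's default environment reset, plain "env"
      stands in for "sudo -E", and the helper names run_reset, run_preserve and
      rerun_with_sudo are hypothetical.

```shell
#!/bin/sh
# Stand-in for plain "sudo": the child starts with an empty environment.
# (Assumption: real sudo resets the environment according to its policy.)
run_reset() { env -i "$@"; }

# Stand-in for "sudo -E": the caller's environment is passed through.
run_preserve() { env "$@"; }

# Hypothetical re-exec helper in the spirit of root_check_run_with_sudo.
# It is defined for illustration only and is not called below.
rerun_with_sudo() {
    if [ "$(id -u)" -ne 0 ]; then
        sudo -E "$0" "$@"
        exit $?
    fi
}

export DEV=eth0
reset_result=$(run_reset /bin/sh -c 'echo "${DEV:-unset}"')
preserve_result=$(run_preserve /bin/sh -c 'echo "${DEV:-unset}"')

echo "environment reset:     DEV=$reset_result"
echo "environment preserved: DEV=$preserve_result"
```

      With the environment reset, the child no longer sees DEV, which matches the
      "Please specify output device" error shown above; with the environment
      preserved, the child sees DEV=eth0.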
      Signed-off-by: Juhee Kang <claudiajkang@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'ipq-mdio' · cbbb7abd
      David S. Miller authored
      Luo Jie says:
      
      ====================
      net: mdio: Add IPQ MDIO reset related function
      
      This patch series adds the MDIO reset features, which include
      configuring the MDIO clock source frequency and indicating to CMN_PLL that
      the ethernet LDO is ready. This ethernet LDO is dedicated to the
      IPQ5018 platform.
      
      Specify the chipsets IPQ40xx, IPQ807x, IPQ60xx and IPQ50xx as supported by
      this MDIO driver.
      
      Changes in v3:
      	* simplify the function ipq_mdio_reset.
      
      Changes in v2:
      	* Addressed review comments (Andrew Lunn).
      	* Remove the IS_ERR().
      	* make binding patch part of series.
      	* document the property 'reg' and 'clock'.
      
      Changes in v1:
      	* make MDIO_IPQ4019 unchanged for backwards compatibility.
      	* remove the PHY reset functions
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dt-bindings: net: Add the properties for ipq4019 MDIO · 2a4c32e7
      Luo Jie authored
      The newly added "reg" resource is for configuring the
      ethernet LDO in the IPQ5018 chipset; the "clocks" property
      is for configuring the MDIO clock source frequency.
      Signed-off-by: Luo Jie <luoj@codeaurora.org>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • MDIO: Kconfig: Specify more IPQ chipset supported · c76ee263
      Luo Jie authored
      The IPQ MDIO driver currently supports the chipsets IPQ40xx, IPQ807x,
      IPQ60xx and IPQ50xx.
      
      Add the compatible 'qcom,ipq5018-mdio' because the ethernet LDO is dedicated
      to the IPQ5018 platform.
      Signed-off-by: Luo Jie <luoj@codeaurora.org>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mdio: Add the reset function for IPQ MDIO driver · 23a890d4
      Luo Jie authored
      1. Configure the MDIO clock source frequency.
      2. The LDO resource is needed to configure the ethernet LDO as available
      for CMN_PLL.
      Signed-off-by: Luo Jie <luoj@codeaurora.org>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>