1. 29 Jun, 2019 8 commits
  2. 28 Jun, 2019 32 commits
    • Merge branch 'net-sched-Add-txtime-assist-support-for-taprio' · 0a7960c7
      David S. Miller authored
      Vedang Patel says:
      
      ====================
      net/sched: Add txtime-assist support for taprio.
      
      Changes in v6:
      - Use _BITUL() instead of BIT() in UAPI for etf. (patch #1)
      - Fix a bug reported by kbuild test bot in length_to_duration(). (patch #6)
      - Remove an unused function (get_cycle_start()). (Patch #6)
      
      Changes in v5:
      - Commit message improved for the igb patch (patch #1).
      - Fixed typo in commit message for etf patch (patch #2).
      
      Changes in v4:
      - Remove inline directive from functions in foo.c.
      - Fix spacing in pkt_sched.h (for etf patch).
      
      Changes in v3:
      - Simplify implementation for taprio flags.
      - txtime_delay can only be set if txtime-assist mode is enabled.
      - txtime_delay and flags will only be visible in tc output if set by user.
      - Minor changes in error reporting.
      
      Changes in v2:
      - Txtime-offload has now been renamed to txtime-assist mode.
      - Renamed the offload parameter to flags.
      - Removed the code which introduced the hardware offloading functionality.
      
      Original Cover letter (with above changes included)
      --------------------------------------------------
      
      Currently, we are seeing packets being transmitted outside their
      timeslices. We can confirm that the packets are being dequeued at the right
      time. So, the delay is induced after the packet is dequeued, because
      taprio, without any offloading, has no control of when a packet is actually
      transmitted.
      
      In order to solve this, we are making use of the txtime feature provided by
      ETF qdisc. Hardware offloading needs to be supported by the ETF qdisc in
      order to take advantage of this feature. The taprio qdisc will assign
      txtime (in skb->tstamp) for all the packets which do not have the txtime
      allocated via the SO_TXTIME socket option. For the packets which already
      have SO_TXTIME set, taprio will validate whether the packet will be
      transmitted in the correct interval.
      
      In order to support this, the following parameters have been added:
      - flags (taprio): This is added in order to support different offloading
        modes which will be added in the future.
      - txtime-delay (taprio): This indicates the minimum time it will take for
        the packet to hit the wire after it reaches taprio_enqueue(). This is
        useful in determining whether we can transmit the packet in the remaining
        time if the gate corresponding to the packet is currently open.
      - skip_sock_check (ETF): ETF currently drops any packet which does not have
        the SO_TXTIME socket option set. This check can be skipped by specifying
        this option.
      
      Following is an example configuration:
      
      tc qdisc replace dev $IFACE parent root handle 100 taprio \
          num_tc 3 \
          map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
          queues 1@0 1@0 1@0 \
          base-time $BASE_TIME \
          sched-entry S 01 300000 \
          sched-entry S 02 300000 \
          sched-entry S 04 400000 \
          flags 0x1 \
          txtime-delay 200000 \
          clockid CLOCK_TAI
      
      tc qdisc replace dev $IFACE parent 100:1 etf \
          offload delta 200000 clockid CLOCK_TAI skip_sock_check
      
      Here, the "flags" parameter indicates that txtime-assist mode is enabled.
      Also, all the traffic classes have been assigned to the same queue. This
      prevents the traffic classes in the lower priority queues from getting
      starved. Note that this configuration is specific to the i210 Ethernet
      card. Other network cards, where the hardware queues are given the same
      priority, might be able to utilize more than one queue.
      
      Following are some of the other highlights of the series:
      - Fix a bug where hardware timestamping and SO_TXTIME options cannot be
        used together. (Patch 1)
      - Introduce the skip_sock_check option. (Patch 2)
      - Make txtime-assist mode work with TCP packets. (Patch 7)
      
      The following changes are recommended to be done in order to get the best
      performance from taprio in this mode:
      ip link set dev enp1s0 mtu 1514
      ethtool -K eth0 gso off
      ethtool -K eth0 tso off
      ethtool --set-eee eth0 eee off
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • taprio: Adjust timestamps for TCP packets · 54002066
      Vedang Patel authored
      When the taprio qdisc is running in txtime-assist mode, it will set the
      launchtime value (in skb->tstamp) for all the packets which do not have
      the SO_TXTIME socket option. However, TCP packets already have this value
      set, and it indicates the earliest departure time expressed in the
      CLOCK_MONOTONIC clock.
      
      We need to respect the timestamp set by the TCP subsystem. So, convert
      this time to the clock which taprio is using and ensure that the packet
      is not transmitted before the deadline set by TCP.
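The rule described above can be sketched in plain C. This is only an illustration, not the code from the patch: the helper names and the idea of a single fixed offset between CLOCK_MONOTONIC and the qdisc's clock are assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical helper: translate a CLOCK_MONOTONIC timestamp (in ns) into
 * the qdisc's clock by adding an assumed, already-known offset. */
int64_t ex_mono_to_qdisc_clock(int64_t mono_ns, int64_t offset_ns)
{
    return mono_ns + offset_ns;
}

/* Never schedule the packet earlier than the departure time requested by
 * TCP, once that time is expressed in the qdisc's clock. */
int64_t ex_respect_tcp_time(int64_t txtime_ns, int64_t tcp_mono_ns,
                            int64_t offset_ns)
{
    int64_t earliest = ex_mono_to_qdisc_clock(tcp_mono_ns, offset_ns);

    return txtime_ns < earliest ? earliest : txtime_ns;
}
```

If the candidate txtime is already later than the converted TCP time it is kept unchanged; otherwise it is pushed back to that time.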
      Signed-off-by: Vedang Patel <vedang.patel@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • taprio: make clock reference conversions easier · 7ede7b03
      Vedang Patel authored
      Later in this series we will need to transform from
      CLOCK_MONOTONIC (used in TCP) to the clock reference used in TAPRIO.
      Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Signed-off-by: Vedang Patel <vedang.patel@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • taprio: Add support for txtime-assist mode · 4cfd5779
      Vedang Patel authored
      Currently, we are seeing non-critical packets being transmitted outside of
      their timeslice. We can confirm that the packets are being dequeued at the
      right time, so the delay is induced on the hardware side. The most likely
      reason is that the hardware queues are starving the lower priority queues.
      
      In order to improve the performance of taprio, we will be making use of the
      txtime feature provided by the ETF qdisc. For all the packets which do not
      have the SO_TXTIME option set, taprio will set the transmit timestamp (set
      in skb->tstamp) in this mode. TAPrio Qdisc will ensure that the transmit
      time for the packet is set to when the gate is open. If SO_TXTIME is set,
      the TAPrio qdisc will validate whether the timestamp (in skb->tstamp)
      occurs when the gate corresponding to skb's traffic class is open.
      
      The following two parameters are added to support this mode:
      - flags: used to enable txtime-assist mode. Will also be used to enable
        other modes (like hardware offloading) later.
      - txtime-delay: This indicates the minimum time it will take for the packet
        to hit the wire. This is useful in determining whether we can transmit
        the packet in the remaining time if the gate corresponding to the packet
        is currently open.
      
      An example configuration for enabling txtime-assist:
      
      tc qdisc replace dev eth0 parent root handle 100 taprio \
            num_tc 3 \
            map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
            queues 1@0 1@0 1@0 \
            base-time 1558653424279842568 \
            sched-entry S 01 300000 \
            sched-entry S 02 300000 \
            sched-entry S 04 400000 \
            flags 0x1 \
            txtime-delay 40000 \
            clockid CLOCK_TAI
      
      tc qdisc replace dev $IFACE parent 100:1 etf skip_sock_check \
            offload delta 200000 clockid CLOCK_TAI
      
      Note that all the traffic classes are mapped to the same queue.  This is
      only possible in taprio when txtime-assist is enabled. Also, note that the
      ETF Qdisc is enabled with offload mode set.
      
      In this mode, if the gate for the packet's traffic class is open and the
      complete packet can be transmitted, taprio will try to transmit the packet
      immediately. This will be done by setting skb->tstamp to current_time + the
      time delta indicated in the txtime-delay parameter. This parameter indicates
      the time taken (in software) for the packet to reach the network adapter.
      
      If the packet cannot be transmitted in the current interval or if the
      packet's traffic class is not currently transmitting, the skb->tstamp is set
      to the next available timestamp value. This is tracked in the
      next_launchtime parameter in the struct sched_entry.
      
      The behaviour w.r.t admin and oper schedules is not changed from what is
      present in software mode.
      
      The transmit time is already known in advance. So, we do not need the HR
      timers to advance the schedule and wakeup the dequeue side of taprio.  So,
      HR timer won't be run when this mode is enabled.
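The transmit-time selection described in this message can be sketched as a small decision function. It is a simplified model under stated assumptions, not the kernel implementation: the function name and its parameters are hypothetical, the gate state and interval end are passed in directly, and all times are in nanoseconds on one clock.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pick a launch time for one packet.
 * now_ns:             current time
 * txtime_delay_ns:    software delay before the packet reaches the adapter
 * pkt_duration_ns:    time needed to put the whole packet on the wire
 * gate_open:          whether the packet's traffic class gate is open now
 * interval_end_ns:    end of the currently open interval
 * next_launchtime_ns: next time at which this traffic class may transmit */
int64_t ex_pick_txtime(int64_t now_ns, int64_t txtime_delay_ns,
                       int64_t pkt_duration_ns, bool gate_open,
                       int64_t interval_end_ns, int64_t next_launchtime_ns)
{
    int64_t earliest = now_ns + txtime_delay_ns;

    /* Send right away only if the complete packet fits in this interval. */
    if (gate_open && earliest + pkt_duration_ns <= interval_end_ns)
        return earliest;

    /* Otherwise defer to the next slot for this traffic class. */
    return next_launchtime_ns;
}
```

With a 200 ns delay and a 100 ns packet at time 1000, the packet is launched at 1200 when the interval ends at 2000, and deferred to next_launchtime_ns when the interval ends at 1250.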
      Signed-off-by: Vedang Patel <vedang.patel@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • taprio: Remove inline directive · 566af331
      Vedang Patel authored
      Remove inline directive from length_to_duration(). We will let the compiler
      make the decisions.
      Signed-off-by: Vedang Patel <vedang.patel@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • taprio: calculate cycle_time when schedule is installed · 037be037
      Vedang Patel authored
      cycle time for a particular schedule is calculated only when it is first
      installed. So, it makes sense to just calculate it once right after the
      'cycle_time' parameter has been parsed and store it in cycle_time.
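As an illustration of computing the value once at install time, the sketch below derives a cycle time from the entry intervals. It assumes that, when no explicit cycle time is given, the cycle time is the sum of the schedule entries' intervals; the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum the per-entry intervals (in ns) once, when the schedule is installed,
 * so later code can reuse the stored value instead of recomputing it. */
int64_t ex_cycle_time_from_entries(const int64_t *intervals_ns, size_t n)
{
    int64_t cycle = 0;
    size_t i;

    for (i = 0; i < n; i++)
        cycle += intervals_ns[i];

    return cycle;
}
```

For the three entries used in the example configurations in this series (300000, 300000 and 400000 ns) this gives a cycle of 1000000 ns.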
      Signed-off-by: Vedang Patel <vedang.patel@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • etf: Add skip_sock_check · d14d2b20
      Vedang Patel authored
      Currently, etf expects a socket with the SO_TXTIME option set for each
      packet it encounters, and it will drop all other packets. However, in
      future commits we are planning to add functionality where the tstamp value
      will be set by another qdisc. Also, some packets which are generated from
      within the kernel (e.g. ICMP packets) do not have any socket associated
      with them.
      
      So, this commit adds support for skip_sock_check. When this option is set,
      etf will skip checking for a socket and other associated options for all
      skbs.
      Signed-off-by: Vedang Patel <vedang.patel@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • etf: Don't use BIT() in UAPI headers. · 9903c8dc
      Vedang Patel authored
      The BIT() macro isn't exported as part of the UAPI interface. So, the
      compile-test to ensure they are self contained fails. So, use _BITUL()
      instead.
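A header meant for userspace has to define its flag bits with helpers it can reach on its own. The sketch below shows the pattern with a locally defined stand-in macro so that it compiles without kernel headers; EX_BITUL and the flag names are hypothetical and are not the identifiers from pkt_sched.h.

```c
/* Local stand-in for a UAPI-safe bit helper such as _BITUL(). */
#define EX_BITUL(n) (1UL << (n))

/* Hypothetical flag bits, each defined only in terms of EX_BITUL(). */
#define EX_FLAG_A EX_BITUL(0)
#define EX_FLAG_B EX_BITUL(1)
#define EX_FLAG_C EX_BITUL(2)

/* Return 1 if the given flag bit is set in flags, 0 otherwise. */
int ex_flag_is_set(unsigned long flags, unsigned long flag)
{
    return (flags & flag) != 0;
}
```

Because the macro expands to a plain shift on an unsigned long constant, the header stays self-contained and can be compile-tested in isolation.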
      Signed-off-by: Vedang Patel <vedang.patel@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • igb: clear out skb->tstamp after reading the txtime · 1e08511d
      Vedang Patel authored
      If a packet which is utilizing the launchtime feature (via SO_TXTIME socket
      option) also requests the hardware transmit timestamp, the hardware
      timestamp is not delivered to the userspace. This is because the value in
      skb->tstamp is mistaken as the software timestamp.
      
      Applications, like ptp4l, request a hardware timestamp by setting the
      SOF_TIMESTAMPING_TX_HARDWARE socket option. Whenever a new timestamp is
      detected by the driver (this work is done in igb_ptp_tx_work() which calls
      igb_ptp_tx_hwtstamps() in igb_ptp.c[1]), it will queue the timestamp in the
      ERR_QUEUE for the userspace to read. When the userspace is ready, it will
      issue a recvmsg() call to collect this timestamp.  The problem is in this
      recvmsg() call. If the skb->tstamp is not cleared out, it will be
      interpreted as a software timestamp and the hardware tx timestamp will not
      be successfully sent to the userspace. Look at skb_is_swtx_tstamp() and its
      caller, __sock_recv_timestamp(), in net/socket.c for more details.
      Signed-off-by: Vedang Patel <vedang.patel@intel.com>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'mirred-recurse' · 8747d82d
      David S. Miller authored
      John Hurley says:
      
      ====================
      Track recursive calls in TC act_mirred
      
      These patches aim to prevent act_mirred from causing stack overflow events
      by recursively calling packet xmit or receive functions. Such events can
      occur with poor TC configuration that causes packets to travel in loops
      within the system.
      
      Florian Westphal advises that a recursion crash and packets looping are
      separate issues and should be treated as such. David Miller further points
      out that pcpu counters cannot track the precise skb context required to
      detect loops. Hence these patches are not aimed at detecting packet loops,
      but rather at preventing stack overflows arising from such loops.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: protect against stack overflow in TC act_mirred · e2ca070f
      John Hurley authored
      TC hooks allow the application of filters and actions to packets at both
      ingress and egress of the network stack. It is possible, with poor
      configuration, that this can produce loops whereby an ingress hook calls
      a mirred egress action that has an egress hook that redirects back to
      the first ingress etc. The TC core classifier protects against loops when
      doing reclassifies but there is no protection against a packet looping
      between multiple hooks and recursively calling act_mirred. This can lead
      to stack overflow panics.
      
      Add a per-CPU counter to act_mirred that is incremented for each recursive
      call of the action function when processing a packet. If a limit is passed,
      the packet is dropped and the per-CPU counter is reset.
      
      Note that this patch does not protect against loops in TC datapaths. Its
      aim is to prevent stack overflow kernel panics that can be a consequence
      of such loops.
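The recursion guard can be illustrated with a userspace sketch. Several things here are assumptions made for the example: a single global counter stands in for the per-CPU counter, the limit of 4 is arbitrary, the function names are hypothetical, and the counter is unwound on every return path rather than reset in one place.

```c
#define EX_RECURSION_LIMIT 4

enum ex_verdict { EX_VERDICT_OK, EX_VERDICT_DROP };

/* Stand-in for the per-CPU recursion counter. */
static unsigned int ex_rec_level;

/* Models an action that redirects the packet back into itself
 * redirects_left more times, as a looping configuration would. */
enum ex_verdict ex_mirred_like_action(int redirects_left)
{
    enum ex_verdict verdict = EX_VERDICT_OK;

    if (++ex_rec_level > EX_RECURSION_LIMIT) {
        /* Too deep: drop the packet instead of growing the stack. */
        ex_rec_level--;
        return EX_VERDICT_DROP;
    }

    if (redirects_left > 0)
        verdict = ex_mirred_like_action(redirects_left - 1);

    ex_rec_level--;
    return verdict;
}
```

A chain of three redirects stays within the limit and completes, while a longer chain is cut off with a drop. The guard bounds stack depth only; it does not detect a packet that loops without recursing.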
      Signed-off-by: John Hurley <john.hurley@netronome.com>
      Reviewed-by: Simon Horman <simon.horman@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: refactor reinsert action · 720f22fe
      John Hurley authored
      The TC_ACT_REINSERT return type was added as an in-kernel only option to
      allow a packet ingress or egress redirect. This is used to avoid
      unnecessary skb clones in situations where they are not required. If a TC
      hook returns this code then the packet is 'reinserted' and no skb consume
      is carried out as no clone took place.
      
      This return type is only used in act_mirred. Rather than have the reinsert
      called from the main datapath, call it directly in act_mirred. Instead of
      returning TC_ACT_REINSERT, change the type to the new TC_ACT_CONSUMED
      which tells the caller that the packet has been stolen by another process
      and that no consume call is required.
      
      Moving all redirect calls to the act_mirred code is in preparation for
      tracking recursion created by act_mirred.
      Signed-off-by: John Hurley <john.hurley@netronome.com>
      Reviewed-by: Simon Horman <simon.horman@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: enable route flushing in network namespaces · 5cdda5f1
      Christian Brauner authored
      Tools such as vpnc try to flush routes when run inside network
      namespaces by writing 1 into /proc/sys/net/ipv4/route/flush. This
      currently does not work because flush is not enabled in non-initial
      network namespaces.
      Since routes are per network namespace it is safe to enable
      /proc/sys/net/ipv4/route/flush in there.
      
      Link: https://github.com/lxc/lxd/issues/4257
      Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'batadv-next-for-davem-20190627v2' of git://git.open-mesh.org/linux-merge · 65dc5416
      David S. Miller authored
      Simon Wunderlich says:
      
      ====================
      This feature/cleanup patchset includes the following patches:
      
       - bump version strings, by Simon Wunderlich
      
       - fix includes for _MAX constants, atomic functions and fwdecls,
         by Sven Eckelmann (3 patches)
      
       - shorten multicast tt/tvlv worker spinlock section, by Linus Luessing
      
       - routable multicast preparations: implement MAC multicast filtering,
         by Linus Luessing (2 patches, David Miller's comments integrated)
      
       - remove return value checks for debugfs_create, by Greg Kroah-Hartman
      
       - add routable multicast optimizations, by Linus Luessing (2 patches)
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'hns3-next' · fcd71efd
      David S. Miller authored
      Huazhong Tan says:
      
      ====================
      net: hns3: some code optimizations & cleanups & bugfixes
      
      [patch 01/12] fixes a TX timeout issue.
      
      [patch 02/12 - 04/12] add some patches related to the TM module.
      
      [patch 05/12] fixes a compile warning.
      
      [patch 06/12] adds Asym Pause support for autoneg.
      
      [patch 07/12] optimizes the error handler for VF reset.
      
      [patch 08/12] deals with the empty interrupt case.
      
      [patch 09/12 - 12/12] adds some cleanups & optimizations.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: optimize the CSQ cmd error handling · 82c8ae6e
      Peng Li authored
      If the CMDQ ring is full, hclge_cmd_send() may return directly while the
      IMP is still working and the HW pointer has changed, so the SW ring
      pointer no longer matches the HW pointer. This patch updates the SW
      pointer every time the ring is full, so that it can work normally next
      time if the IMP and HW are still working.
      Signed-off-by: Peng Li <lipeng321@huawei.com>
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: remove RXD_VLD check in hns3_handle_bdinfo · 289f8125
      Yunsheng Lin authored
      The HNS3_RXD_VLD_B bit has already been checked in hns3_add_frag
      or hns3_handle_rx_bd before calling hns3_handle_bdinfo, so when
      hns3_handle_bdinfo is called, the HNS3_RXD_VLD_B bit is always
      set, which makes the checking in hns3_handle_bdinfo unnecessary.
      
      This patch removes the RXD_VLD_B checking in hns3_handle_bdinfo.
      Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
      Signed-off-by: Peng Li <lipeng321@huawei.com>
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: remove unused linkmode definition · 53eb60c7
      Jian Shen authored
      This patch removes unused linkmode definition.
      Signed-off-by: Jian Shen <shenjian15@huawei.com>
      Signed-off-by: Peng Li <lipeng321@huawei.com>
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: fix a statistics issue about l3l4 checksum error · 8b552079
      Yufeng Mo authored
      The frame column is based on rx_crc_errors and rx_frame_errors. So
      l3l4 checksum error should not be counted by rx_crc_errors. Instead,
      l3l4 checksum error should be counted in ifconfig error column.
      
      Fixes: d3ec4ef6 ("net: hns3: refactor the statistics updating for netdev")
      Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
      Signed-off-by: Peng Li <lipeng321@huawei.com>
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: handle empty unknown interrupt · 9bc6ac91
      Huazhong Tan authored
      Since the status of some MSI-X interrupts may be cleared by hardware,
      reading the HCLGE_VECTOR0_PF_OTHER_INT_STS_REG register when the driver
      receives the interrupt may yield an empty unknown interrupt. In this
      case, the irq handler should enable the vector0 interrupt. This patch
      also uses dev_info() instead of dev_dbg() in hclge_check_event_cause(),
      since this information will be useful for normal usage.
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: Peng Li <lipeng321@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: re-schedule reset task while VF reset fail · bbe6540e
      Huazhong Tan authored
      The VF reset may fail for some probabilistic reasons, such as a timeout
      waiting for the hardware reset or for a mailbox response, so this patch
      tries to re-schedule the reset task when the number of reset failures
      is under HCLGEVF_RESET_MAX_FAIL_CNT. This patch also adds a function,
      hclgevf_reset_err_handle(), to handle the reset failure.
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: Peng Li <lipeng321@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: add Asym Pause support to fix autoneg problem · bc3781ed
      Yonglong Liu authored
      Auto-negotiation is configured on both the local device and the link
      partner. The local device configures pause frame use as rx on/tx off,
      and the link partner configures pause frame use as rx off/tx on.
      
      We expect the result to be:
      Local device:
      Autonegotiate:  on
      RX:             on
      TX:             off
      RX negotiated:  on
      TX negotiated:  off
      
      Link partner:
      Autonegotiate:  on
      RX:             off
      TX:             on
      RX negotiated:  off
      TX negotiated:  on
      
      But actually, the result on both the local device and the link partner is:
      Autonegotiate:  on
      RX:             off
      TX:             off
      RX negotiated:  off
      TX negotiated:  off
      
      The root cause is that the supported flag has only Pause set; refer to
      the function genphy_config_advert():
      static int genphy_config_advert(struct phy_device *phydev)
      {
      	...
      	linkmode_and(phydev->advertising, phydev->advertising,
      		     phydev->supported);
      	...
      }
      The pause frame use of the link partner is rx off/tx on, so its
      advertising only sets the Asym_Pause bit, while the supported flag only
      has the Pause bit set, so the result of linkmode_and() is rx off/tx off.
      
      This patch adds Asym_Pause to the supported flag to fix it.
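The masking effect can be reproduced with two plain bit flags. This is a sketch only: the EX_LM_* constants and the helper are hypothetical stand-ins for the kernel's link mode bitmaps and linkmode_and().

```c
/* Hypothetical single-word stand-ins for the link mode bits. */
#define EX_LM_PAUSE      (1UL << 0)
#define EX_LM_ASYM_PAUSE (1UL << 1)

/* What is actually advertised: the requested bits masked by the
 * bits the device claims to support. */
unsigned long ex_effective_advert(unsigned long advertising,
                                  unsigned long supported)
{
    return advertising & supported;
}
```

When only Pause is marked as supported, an Asym_Pause-only advertisement is masked to nothing. Once Asym_Pause is also marked as supported, the bit survives the mask.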
      Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: Peng Li <lipeng321@huawei.com>
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: fix a -Wformat-nonliteral compile warning · 18d219b7
      Yonglong Liu authored
      When setting -Wformat=2, there is a compiler warning like this:
      
      hclge_main.c:xxx:x: warning: format not a string literal and no
      format arguments [-Wformat-nonliteral]
      strs[i].desc);
      ^~~~
      
      This patch adds missing format parameter "%s" to snprintf() to
      fix it.
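The pattern behind this fix, shown as a small standalone sketch (the helper name is hypothetical): pass the variable string as an argument to a literal "%s" format instead of using it as the format itself.

```c
#include <stddef.h>
#include <stdio.h>

/* Copy a description into buf. The description is data, not a format, so
 * any '%' characters it contains are copied verbatim. */
int ex_copy_desc(char *buf, size_t size, const char *desc)
{
    return snprintf(buf, size, "%s", desc);
}
```

Besides silencing the warning, this avoids undefined behavior if the string ever contains a conversion specifier.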
      
      Fixes: 46a3df9f ("Add HNS3 Acceleration Engine & Compatibility Layer Support")
      Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
      Signed-off-by: Peng Li <lipeng321@huawei.com>
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: add some error checking in hclge_tm module · 04f25edb
      Yunsheng Lin authored
      When hdev->tx_sch_mode is HCLGE_FLAG_VNET_BASE_SCH_MODE,
      hclge_tm_schd_mode_vnet_base_cfg() calls hclge_tm_pri_schd_mode_cfg()
      with vport->vport_id as pri_id, which is used as an index into
      hdev->tm_info.tc_info. This causes an out-of-bounds access if vport_id
      is equal to or larger than HNAE3_MAX_TC.
      
      Also, the hardware only supports a maximum speed of HCLGE_ETHER_MAX_RATE.
      
      So this patch adds two checks for above cases.
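The two checks can be sketched as a standalone validation helper. The function name is hypothetical, and the limits below (8 traffic classes, a maximum rate of 100000) are assumed values chosen for illustration rather than taken from the driver.

```c
#include <errno.h>
#include <stdint.h>

#define EX_MAX_TC   8U       /* assumed size of the per-TC table */
#define EX_MAX_RATE 100000U  /* assumed maximum supported rate */

/* Reject an index that would run past the per-TC table and a rate above
 * what the hardware supports; return 0 when both are acceptable. */
int ex_check_tm_args(uint32_t pri_id, uint32_t rate)
{
    if (pri_id >= EX_MAX_TC)
        return -EINVAL;

    if (rate > EX_MAX_RATE)
        return -EINVAL;

    return 0;
}
```

Validating before the table is indexed means an out-of-range value results in an error return rather than a stray memory access.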
      
      Fixes: 84844054 ("net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver")
      Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
      Signed-off-by: Peng Li <lipeng321@huawei.com>
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: change SSU's buffer allocation according to UM · 9e15be90
      Yunsheng Lin authored
      Currently, when there is a shared buffer in the SSU (storage switching
      unit), the low waterline for the RX private buffer is too low to keep the
      hardware running. The hardware may have processed all the packets stored
      in the private buffer below the low waterline before a new packet
      arrives, because the hardware only tells the peer to send packets again
      when the private buffer is under the low waterline.
      
      So this patch only allocates the RX private buffer if there is enough
      buffer according to the hardware user manual.
      
      This patch also reserves some buffer for reuse when the TC num is less
      than or equal to 2, and changes PAUSE_TRANS_GAP and
      HCLGE_NON_DCB_ADDITIONAL_BUF according to the hardware user manual.
      Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
      Signed-off-by: Peng Li <lipeng321@huawei.com>
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: enable DCB when TC num is one and pfc_en is non-zero · ae179b2f
      Yunsheng Lin authored
      Currently when TC num is one, the DCB will be disabled no matter if
      pfc_en is non-zero or not.
      
      This patch enables the DCB if pfc_en is non-zero, even when TC num
      is one.
      Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
      Signed-off-by: Peng Li <lipeng321@huawei.com>
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hns3: fix __QUEUE_STATE_STACK_XOFF not cleared issue · f96315f2
      Huazhong Tan authored
      When changing the MTU or performing other operations that just call
      .reset_notify to do HNAE3_DOWN_CLIENT and HNAE3_UP_CLIENT, the
      netdev_tx_reset_queue() in hns3_clear_all_ring() is ignored, so
      dev_watchdog() may misdiagnose a TX timeout.
      
      This patch separates netdev_tx_reset_queue() from
      hns3_clear_all_ring(), and unifies hns3_clear_all_ring() and
      hns3_force_clear_all_ring() into one, since they do similar things.
      
      Fixes: 3a30964a ("net: hns3: delay ring buffer clearing during reset")
      Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'Better-PHYLINK-compliance-for-SJA1105-DSA' · 5b18c705
      David S. Miller authored
      Vladimir Oltean says:
      
      ====================
      Better PHYLINK compliance for SJA1105 DSA
      
      After discussing with Russell King, it appears this driver confuses a
      few things and does not perform some checks needed for consistent
      operation.
      
      Changes in v2:
      - Removed redundant print in the phylink_validate callback (in 2/3).
      ====================
      Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: Mark in-band AN modes not supported for PHYLINK · 9f971573
      Vladimir Oltean authored
      We need a better way to signal this, perhaps in phylink_validate, but
      for now just print this error message as guidance for other people
      looking at this driver's code while trying to rework PHYLINK.
      
      Cc: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: Check for PHY mode mismatches with what PHYLINK reports · 39710229
      Vladimir Oltean authored
      PHYLINK being designed with PHYs in mind that can change MII protocol,
      for correct operation it is necessary to ensure that the PHY interface
      mode stays the same (otherwise clear the supported bit mask, as
      required).
      
      Because this is just a hypothetical situation for now, we don't bother
      to check whether we could actually support the new PHY interface mode.
      Actually we could modify the xMII table, reset the switch and send an
      updated static configuration, but adding that would just be dead code.
      
      Cc: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: Don't check state->link in phylink_mac_config · a979a0ab
      Vladimir Oltean authored
      It has been pointed out that PHYLINK can call mac_config only to update
      the phy_interface_type and without knowing what the AN results are.
      
      Experimentally, when this was observed to happen, state->link was also
      unset, and therefore was used as a proxy to ignore this call. However it
      is also suggested that state->link is undefined for this callback and
      should not be relied upon.
      
      So let the previously-dead codepath for SPEED_UNKNOWN be called, and
      update the comment to make sure the MAC's behavior is sane.
      
      Cc: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • hinic: reduce rss_init stack usage · f7110b75
      Arnd Bergmann authored
      On 32-bit architectures, putting an array of 256 u32 values on the
      stack uses more space than the warning limit:
      
      drivers/net/ethernet/huawei/hinic/hinic_main.c: In function 'hinic_rss_init':
      drivers/net/ethernet/huawei/hinic/hinic_main.c:286:1: error: the frame size of 1068 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
      
      I considered changing the code to use u8 values here, since that's
      all the hardware supports, but dynamically allocating the array is
      a more isolated fix here.
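The approach of moving the table off the stack can be sketched in userspace C. This is an illustration, not the driver code: calloc() and free() stand in for the kernel allocator, the function name is hypothetical, and the round-robin fill is an assumed way of populating the table.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define EX_INDIR_SIZE 256

/* Build a 256-entry indirection table on the heap instead of the stack.
 * Returns how many entries map to queue 0, or a negative errno value. */
int ex_rss_init(uint32_t num_queues)
{
    uint32_t *indir_tbl;
    int on_queue0 = 0;
    size_t i;

    if (num_queues == 0)
        return -EINVAL;

    /* 256 * 4 bytes would take 1 KiB of stack; allocate it instead. */
    indir_tbl = calloc(EX_INDIR_SIZE, sizeof(*indir_tbl));
    if (!indir_tbl)
        return -ENOMEM;

    for (i = 0; i < EX_INDIR_SIZE; i++) {
        indir_tbl[i] = (uint32_t)(i % num_queues);
        if (indir_tbl[i] == 0)
            on_queue0++;
    }

    /* A real driver would program the hardware here before freeing. */
    free(indir_tbl);
    return on_queue0;
}
```

The trade-off is an extra failure path for the allocation, in exchange for a stack frame that stays well below the warning limit.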
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>