1. 31 Mar, 2020 17 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next · d9679cd9
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter/IPVS updates for net-next
      
      The following patchset contains Netfilter/IPVS updates for net-next:
      
      1) Add support to specify a stateful expression in set definitions,
         this allows users to specify e.g. counters per set element.
      
      2) Flowtable software counter support.
      
      3) Flowtable hardware offload counter support, from wenxu.
      
      4) Parallelize flowtable hardware offload requests, from Paul Blakey.
         This includes a patch to add one work entry per offload command.
      
      5) Several patches to rework nf_queue refcount handling, from Florian
         Westphal.
      
      6) A few fixes for the flowtable tunnel offload: Fix crash if tunneling
         information is missing and set up indirect flow block as TC_SETUP_FT,
         patch from wenxu.
      
      7) Stricter netlink attribute sanity check on filters, from Romain Bellan
         and Florent Fourcot.
      
      8) Annotations to make sparse happy, from Jules Irenge.
      
      9) Improve icmp errors in debugging information, from Haishuang Yan.
      
      10) Fix warning in IPVS icmp error debugging, from Haishuang Yan.
      
      11) Fix endianness issue in tcp extension header, from Sergey Marinkevich.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'Add-packet-trap-policers-support' · 6fe9a949
      David S. Miller authored
      Ido Schimmel says:
      
      ====================
      Add packet trap policers support
      
      Background
      ==========
      
      Devices capable of offloading the kernel's datapath and performing
      functions such as bridging and routing must also be able to send (trap)
      specific packets to the kernel (i.e., the CPU) for processing.
      
      For example, a device acting as a multicast-aware bridge must be able to
      trap IGMP membership reports to the kernel for processing by the bridge
      module.
      
      Motivation
      ==========
      
      In most cases, the underlying device is capable of handling packet rates
      that are several orders of magnitude higher compared to those that can
      be handled by the CPU.
      
      Therefore, in order to prevent the underlying device from overwhelming
      the CPU, devices usually include packet trap policers that are able to
      police the trapped packets to rates that can be handled by the CPU.
      
      Proposed solution
      =================
      
      This patch set allows capable device drivers to register their supported
      packet trap policers with devlink. User space can then tune the
      parameters of these policers (currently, rate and burst size) and read
      from the device the number of packets that were dropped by the policer,
      if supported.
      
      These packet trap policers can then be bound to existing packet trap
      groups, which are used to aggregate logically related packet traps. As a
      result, trapped packets are policed to rates that can be handled by the
      host CPU.
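
As a rough illustration of the rate and burst parameters described above, a policer can be modeled as a token bucket. This is a minimal user-space sketch under that assumption, not the devlink or driver implementation; the struct and function names are hypothetical, and the rate is taken to be in packets per second.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model of a packet trap policer. */
struct trap_policer {
	uint64_t rate;    /* tokens (packets) added per second */
	uint64_t burst;   /* bucket capacity, in packets */
	uint64_t tokens;  /* packets that may currently pass */
	uint64_t last_ns; /* time up to which tokens were credited */
	uint64_t dropped; /* packets dropped by the policer */
};

static void trap_policer_init(struct trap_policer *p, uint64_t rate,
			      uint64_t burst, uint64_t now_ns)
{
	assert(rate > 0 && burst > 0);
	p->rate = rate;
	p->burst = burst;
	p->tokens = burst; /* start with a full bucket */
	p->last_ns = now_ns;
	p->dropped = 0;
}

/*
 * Returns true if the packet may be passed to the CPU, false if it is
 * dropped. Assumes now_ns never goes backwards; very long idle periods
 * could overflow the multiplication in this simplified version.
 */
static bool trap_policer_admit(struct trap_policer *p, uint64_t now_ns)
{
	uint64_t elapsed = now_ns - p->last_ns;
	uint64_t refill = elapsed * p->rate / 1000000000ULL;

	if (refill > 0) {
		p->tokens += refill;
		if (p->tokens > p->burst)
			p->tokens = p->burst;
		/* Advance only by the time that produced whole tokens. */
		p->last_ns += refill * 1000000000ULL / p->rate;
	}
	if (p->tokens == 0) {
		p->dropped++;
		return false;
	}
	p->tokens--;
	return true;
}
```

With rate 100 and burst 16, a back-to-back burst of packets is admitted only up to the bucket size, and the remainder shows up in the dropped counter.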
      
      Example usage
      =============
      
      Instantiate netdevsim:
      
      Dump available packet trap policers:
      netdevsim/netdevsim10:
        policer 1 rate 1000 burst 128
        policer 2 rate 2000 burst 256
        policer 3 rate 3000 burst 512
      
      Change the parameters of a packet trap policer:
      
      Bind a packet trap policer to a packet trap group:
      
      Dump parameters and statistics of a packet trap policer:
      netdevsim/netdevsim10:
        policer 3 rate 100 burst 16
          stats:
              rx:
                dropped 92
      
      Unbind a packet trap policer from a packet trap group:
      
      Patch set overview
      ==================
      
      Patch #1 adds the core infrastructure in devlink which allows capable
      device drivers to register their supported packet trap policers with
      devlink.
      
      Patch #2 extends the existing devlink-trap documentation.
      
      Patch #3 extends netdevsim to register a few dummy packet trap policers
      with devlink. Used later on to selftest the core infrastructure.
      
      Patches #4-#5 add infrastructure in devlink to allow binding of packet
      trap policers to packet trap groups.
      
      Patch #6 extends netdevsim to allow such binding.
      
      Patch #7 adds a selftest over netdevsim that verifies the core
      devlink-trap policers functionality.
      
      Patches #8-#14 gradually add devlink-trap policers support in mlxsw.
      
      Patch #15 adds a selftest over mlxsw. All registered packet trap
      policers are verified to handle the configured rate and burst size.
      
      Future plans
      ============
      
      * Allow changing default association between packet traps and packet
        trap groups
      * Add more packet traps. For example, for control packets (e.g., IGMP)
      
      v3:
      * Rebase
      
      v2 (address comments from Jiri and Jakub):
      * Patch #1: Add 'strict_start_type' in devlink policy
      * Patch #1: Have device drivers provide max/min rate/burst size for each
        policer. Use them to check validity of user provided parameters
      * Patch #3: Remove check about burst size being a power of 2 and instead
        add a debugfs knob to fail the operation
      * Patch #3: Provide max/min rate/burst size when registering policers
        and remove the validity checks from nsim_dev_devlink_trap_policer_set()
      * Patch #5: Check for presence of 'DEVLINK_ATTR_TRAP_POLICER_ID' in
        devlink_trap_group_set() and bail if not present
      * Patch #5: Add extack error message in case trap group was partially
        modified
      * Patch #7: Add test case with new 'fail_trap_policer_set' knob
      * Patch #7: Add test case for partially modified trap group
      * Patch #10: Provide max/min rate/burst size when registering policers
      * Patch #11: Remove the max/min validity checks from
        __mlxsw_sp_trap_policer_set()
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: mlxsw: Add test cases for devlink-trap policers · 9f3e63c5
      Ido Schimmel authored
      Add test cases that verify that each registered packet trap policer:
      
      * Honors the imposed limitations of rate and burst size
      * Is able to police trapped packets to the specified rate
      * Is able to police trapped packets to the specified burst size
      * Can be unbound from its trap group
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_trap: Add support for setting of packet trap group parameters · 39defcbb
      Ido Schimmel authored
      Implement support for setting of packet trap group parameters by
      invoking the trap_group_init() callback with the new parameters.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_trap: Switch to use correct packet trap group · d12d8468
      Ido Schimmel authored
      Some packet traps are currently exposed to user space as members of the
      "l3_drops" trap group, but internally they are members of a different
      group.
      
      Switch these traps to use the correct group so that they are all subject
      to the same policer, as exposed to user space.
      
      Set the trap priority of packets trapped due to loopback error during
      routing to the lowest priority. Such packets are not routed again by the
      kernel and therefore should not mask other traps (e.g., host miss) that
      should be routed.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_trap: Do not initialize dedicated discard policer · bc82521e
      Ido Schimmel authored
      The policer is now initialized as part of the registration with devlink,
      so there is no need to initialize it before the registration.
      
      Remove the initialization.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_trap: Add devlink-trap policer support · 13f2e64b
      Ido Schimmel authored
      Register supported packet trap policers with devlink and implement
      callbacks to change their parameters and read their counters.
      
      Prevent user space from passing invalid policer parameters down to the
      device by checking their validity and communicating the failure via an
      appropriate extack message.
      
      v2:
      * Remove the max/min validity checks from __mlxsw_sp_trap_policer_set()
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_trap: Prepare policers for registration with devlink · 4561705b
      Ido Schimmel authored
      Prepare an array of policer IDs to register with devlink and their
      associated parameters.
      
      The array is composed of both policers that are currently bound to
      exposed trap groups and policers that are not bound to any trap group.
      
      v2:
      * Provide max/min rate/burst size when registering policers
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Track used packet trap policer IDs · 03484e49
      Ido Schimmel authored
      During initialization the driver configures various packet trap groups
      and binds policers to them.
      
      Currently, most of these groups are not exposed to user space and
      therefore their policers should not be exposed either. Otherwise, user
      space will be able to alter policer parameters without knowing which
      packet traps are policed by the policer.
      
      Use a bitmap to track the used policer IDs so that these policers will
      not be registered with devlink in a subsequent patch.
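
The bitmap idea above can be sketched in plain C as follows. This is an illustrative user-space model, not the mlxsw code: the tracker type, the helper names and the limit of 128 policers are all assumptions made for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_POLICERS 128 /* hypothetical; the real limit comes from the device */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS ((MAX_POLICERS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* One bit per policer ID; a set bit means the ID is used internally. */
struct policer_tracker {
	unsigned long used[BITMAP_WORDS];
};

static void policer_mark_used(struct policer_tracker *t, unsigned int id)
{
	assert(id < MAX_POLICERS);
	t->used[id / BITS_PER_WORD] |= 1UL << (id % BITS_PER_WORD);
}

static bool policer_is_used(const struct policer_tracker *t, unsigned int id)
{
	assert(id < MAX_POLICERS);
	return (t->used[id / BITS_PER_WORD] >> (id % BITS_PER_WORD)) & 1UL;
}

/*
 * Collect the IDs that are not used internally, i.e. the candidates that
 * could later be exposed. Returns the number of IDs written to ids[].
 */
static size_t policer_collect_unused(const struct policer_tracker *t,
				     unsigned int *ids, size_t max)
{
	size_t n = 0;

	for (unsigned int id = 0; id < MAX_POLICERS && n < max; id++)
		if (!policer_is_used(t, id))
			ids[n++] = id;
	return n;
}
```

A bitmap keeps the bookkeeping to one bit per ID and makes the later "skip the used ones" pass a simple scan.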
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: reg: Extend QPCR register · 2b84d7c3
      Ido Schimmel authored
      The QoS Policer Configuration Register (QPCR) is used to configure
      hardware policers. Extend this register with the following fields and
      defines, which will be used by subsequent patches:
      
      1. Violate counter: reads number of packets dropped by the policer
      2. Clear counter: to ensure we start counting from 0
      3. Rate and burst size limits
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: netdevsim: Add test cases for devlink-trap policers · 5fbff58e
      Ido Schimmel authored
      Add test cases for packet trap policer set / show commands as well as
      for the binding of these policers to packet trap groups.
      
      Both good and bad flows are tested for maximum coverage.
      
      v2:
      * Add test case with new 'fail_trap_policer_set' knob
      * Add test case for partially modified trap group
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netdevsim: Add support for setting of packet trap group parameters · 0dc8249a
      Ido Schimmel authored
      Add a dummy callback to set trap group parameters. Return an error when
      the 'fail_trap_group_set' debugfs file is set in order to exercise error
      paths and verify that the error is propagated to user space when it
      should be.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Reviewed-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • devlink: Allow setting of packet trap group parameters · c064875a
      Ido Schimmel authored
      The previous patch allowed device drivers to publish their default
      binding between packet trap policers and packet trap groups. However,
      some users might not be content with this binding and would like to
      change it.
      
      In case user space passed a packet trap policer identifier when setting
      a packet trap group, invoke the appropriate device driver callback and
      pass the new policer identifier.
      
      v2:
      * Check for presence of 'DEVLINK_ATTR_TRAP_POLICER_ID' in
        devlink_trap_group_set() and bail if not present
      * Add extack error message in case trap group was partially modified
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Acked-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • devlink: Add packet trap group parameters support · f9f54392
      Ido Schimmel authored
      Packet trap groups are used to aggregate logically related packet traps.
      Currently, these groups allow user space to batch operations such as
      setting the trap action of all member traps.
      
      In order to prevent the CPU from being overwhelmed by too many trapped
      packets, it is desirable to bind a packet trap policer to these groups.
      For example, to limit all the packets that encountered an exception
      during routing to 10Kpps.
      
      Allow device drivers to bind default packet trap policers to packet trap
      groups when the latter are registered with devlink.
      
      The next patch will enable user space to change this default binding.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Reviewed-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netdevsim: Add devlink-trap policer support · ad188458
      Ido Schimmel authored
      Register three dummy packet trap policers with devlink and implement
      callbacks to change their parameters and read their counters.
      
      This will be used later on in the series to test the devlink-trap
      policer infrastructure.
      
      v2:
      * Remove check about burst size being a power of 2 and instead add a
        debugfs knob to fail the operation
      * Provide max/min rate/burst size when registering policers and remove
        the validity checks from nsim_dev_devlink_trap_policer_set()
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Reviewed-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Documentation: Add description of packet trap policers · ef7d5c7d
      Ido Schimmel authored
      Extend devlink-trap documentation with information about packet trap
      policers.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • devlink: Add packet trap policers support · 1e8c6619
      Ido Schimmel authored
      Devices capable of offloading the kernel's datapath and performing
      functions such as bridging and routing must also be able to send (trap)
      specific packets to the kernel (i.e., the CPU) for processing.
      
      For example, a device acting as a multicast-aware bridge must be able to
      trap IGMP membership reports to the kernel for processing by the bridge
      module.
      
      In most cases, the underlying device is capable of handling packet rates
      that are several orders of magnitude higher compared to those that can
      be handled by the CPU.
      
      Therefore, in order to prevent the underlying device from overwhelming
      the CPU, devices usually include packet trap policers that are able to
      police the trapped packets to rates that can be handled by the CPU.
      
      This patch allows capable device drivers to register their supported
      packet trap policers with devlink. User space can then tune the
      parameters of these policers (currently, rate and burst size) and read
      from the device the number of packets that were dropped by the policer,
      if supported.
      
      Subsequent patches in the series will allow device drivers to create
      default binding between these policers and packet trap groups and allow
      user space to change the binding.
      
      v2:
      * Add 'strict_start_type' in devlink policy
      * Have device drivers provide max/min rate/burst size for each policer.
        Use them to check validity of user provided parameters
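
The v2 note about driver-provided limits could look roughly like the check below. This is a hedged sketch only: the struct, the function name and the error strings are hypothetical, and in the kernel the message would be reported through extack rather than a plain string pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limits a driver might supply when registering a policer. */
struct policer_limits {
	uint64_t min_rate, max_rate;
	uint64_t min_burst, max_burst;
};

/*
 * Returns 0 if the user-provided parameters fall within the limits,
 * -EINVAL otherwise. On failure *errmsg points to a short description;
 * on success it is set to NULL.
 */
static int policer_params_check(const struct policer_limits *lim,
				uint64_t rate, uint64_t burst,
				const char **errmsg)
{
	assert(lim->min_rate <= lim->max_rate);
	assert(lim->min_burst <= lim->max_burst);

	*errmsg = NULL;
	if (rate < lim->min_rate) {
		*errmsg = "Policer rate lower than limit";
		return -EINVAL;
	}
	if (rate > lim->max_rate) {
		*errmsg = "Policer rate higher than limit";
		return -EINVAL;
	}
	if (burst < lim->min_burst) {
		*errmsg = "Policer burst size lower than limit";
		return -EINVAL;
	}
	if (burst > lim->max_burst) {
		*errmsg = "Policer burst size higher than limit";
		return -EINVAL;
	}
	return 0;
}
```

Doing the range check once in common code means each driver only has to declare its limits and never sees out-of-range values.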
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reviewed-by: Jiri Pirko <jiri@mellanox.com>
      Reviewed-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 30 Mar, 2020 23 commits
    • ipvs: fix uninitialized variable warning · e19680f8
      Haishuang Yan authored
      If outer_proto is not set, GCC emits the following warning:
      
      In file included from net/netfilter/ipvs/ip_vs_core.c:52:
      net/netfilter/ipvs/ip_vs_core.c: In function 'ip_vs_in_icmp':
      include/net/ip_vs.h:233:4: warning: 'outer_proto' may be used uninitialized in this function [-Wmaybe-uninitialized]
       233 |    printk(KERN_DEBUG pr_fmt(msg), ##__VA_ARGS__); \
           |    ^~~~~~
      net/netfilter/ipvs/ip_vs_core.c:1666:8: note: 'outer_proto' was declared here
      1666 |  char *outer_proto;
           |        ^~~~~~~~~~~
      
      Fixes: 73348fed ("ipvs: optimize tunnel dumps for icmp errors")
      Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
      Acked-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nft_exthdr: fix endianness of tcp option cast · 2e34328b
      Sergey Marinkevich authored
      I hit a problem on big-endian MIPS: every time NF tries to change the
      TCP MSS it returns early because new.v16 is greater than old.v16. But
      the real MSS was 1460 and my rule was like this:
      
      	add rule table chain tcp option maxseg size set 1400
      
      And 1400 is less than 1460, not greater.
      
      Later I found that the main cause is the cast from u32 to __be16.
      
      Debugging:
      
      In this example MSS = 1400 (hex: 0x578). Below, each byte is shown as it
      lies in memory, by address from left to right (e.g. [0x0 0x1 0x2 0x3]).
      LE is a little-endian system, BE is big-endian; the left column is the type.
      
      	     LE               BE
      	u32: [78 05 00 00]    [00 00 05 78]
      
      As you can see, casting the u32 to u16 takes a different half of the
      4-byte address range on each system. But nf_tables uses registers to
      store data of various sizes, and the TCP MSS is stored in 2 bytes. The
      registers are still defined as u32:
      
      	struct nft_regs {
      		union {
      			u32			data[20];
      			struct nft_verdict	verdict;
      		};
      	};
      
      So an access like regs->data[priv->sreg] is exactly a u32. According to
      the table above, the per-byte representation of the TCP MSS stored in
      the register will be:
      
      	                     LE               BE
      	(u32)regs->data[]:   [78 05 00 00]    [05 78 00 00]
      	                                       ^^ ^^
      
      We see that the register uses just half of the u32, and the other 2
      bytes may be used for other data. But in nft_exthdr_tcp_set_eval() it is
      cast directly, u32 -> __be16:
      
      	new.v16 = src
      
      But a u32 does not fit in a __be16, so the cast keeps the 2 low-order
      bytes. For clarity, here is one more table (<xx xx> marks the bytes used
      by the cast).
      
      	                     LE                 BE
      	u32:                 [<78 05> 00 00]    [00 00 <05 78>]
      	(u32)regs->data[]:   [<78 05> 00 00]    [05 78 <00 00>]
      
      As you can see, nothing changes for little-endian, but for big-endian we
      take the wrong half. In my case there was some other data instead of
      zeros, so the new MSS was wrongly greater.
      
      To fix this bug I used the solution already used for port ranges. This
      patch does not affect little-endian systems.
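
The difference between truncating the whole register and reading its first two bytes can be reproduced in user space. This is only a sketch of the effect described above, not the kernel fix: the helper names are hypothetical, and the value is kept in host byte order for simplicity, whereas the kernel stores a __be16.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store a 16-bit value in the first two bytes of a 32-bit register. */
static void reg_store16(uint32_t *reg, uint16_t val)
{
	memcpy(reg, &val, sizeof(val));
}

/* Read back the first two bytes of the register, whatever the endianness. */
static uint16_t reg_load16(const uint32_t *reg)
{
	uint16_t val;

	memcpy(&val, reg, sizeof(val));
	return val;
}

/*
 * The problematic pattern: truncating the whole u32 keeps its low-order
 * 16 bits, which are the first two bytes in memory only on little-endian.
 */
static uint16_t reg_truncate(const uint32_t *reg)
{
	return (uint16_t)*reg;
}

/* Runtime probe so the example behaves the same on either kind of host. */
static int host_is_little_endian(void)
{
	const uint16_t probe = 1;
	uint8_t first;

	memcpy(&first, &probe, sizeof(first));
	return first == 1;
}
```

If the register still holds unrelated data in its other two bytes, `reg_load16()` returns the stored value on any host, while `reg_truncate()` returns it only on little-endian hosts.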
      Signed-off-by: Sergey Marinkevich <sergey.marinkevich@eltex-co.ru>
      Acked-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • Merge branch 'split-phylink-PCS-operations' · 2d39eab4
      David S. Miller authored
      Russell King says:
      
      ====================
      split phylink PCS operations
      
      This series splits the phylink_mac_ops structure so that PCS can be
      supported separately with their own PCS operations, separating them
      from the MAC layer.  This may need adaptation later as more users come
      along.
      
      v2: change pcs_config() and associated called function prototypes to
      only pass the information that is required, and add some documentation.
      
      v3: change phylink_create() prototype
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phylink: add separate pcs operations structure · 4c0d6d3a
      Russell King authored
      Add a separate set of PCS operations, which MAC drivers can use to
      couple phylink with their associated MAC PCS layer.  The PCS
      operations include:
      
      - pcs_get_state() - reads the link up/down, resolved speed, duplex
         and pause from the PCS.
      - pcs_config() - configures the PCS for the specified mode, PHY
         interface type, and setting the advertisement.
      - pcs_an_restart() - restarts 802.3 in-band negotiation with the
         link partner
      - pcs_link_up() - informs the PCS that link has come up, and the
         parameters of the link. Link parameters are used to program the
         PCS for fixed speed and non-inband modes.
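
The split into separate operation tables can be pictured with the simplified sketch below. The callback names follow the list above, but every type, signature and helper here is a hypothetical stand-in; the kernel's phylink structures and prototypes differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified link state; not the kernel's struct phylink_link_state. */
struct link_state {
	bool link;
	int speed;
	bool full_duplex;
};

/* PCS callbacks kept in their own table, apart from the MAC callbacks. */
struct pcs_ops {
	void (*pcs_get_state)(void *priv, struct link_state *state);
	int (*pcs_config)(void *priv, unsigned int mode);
	void (*pcs_an_restart)(void *priv);
	void (*pcs_link_up)(void *priv, int speed, bool full_duplex);
};

struct mac_ops {
	void (*mac_link_up)(void *priv, int speed, bool full_duplex);
};

struct link_mgr {
	const struct mac_ops *mac_ops;
	const struct pcs_ops *pcs_ops; /* NULL if no separate PCS is attached */
	void *priv;
};

/* Query the PCS if one is attached; returns -1 when there is none. */
static int link_mgr_get_state(const struct link_mgr *lm,
			      struct link_state *state)
{
	assert(state != NULL);
	if (!lm->pcs_ops || !lm->pcs_ops->pcs_get_state)
		return -1;
	lm->pcs_ops->pcs_get_state(lm->priv, state);
	return 0;
}

/* Demo PCS that always reports a 1000 Mbit/s full-duplex link. */
static void demo_pcs_get_state(void *priv, struct link_state *state)
{
	(void)priv;
	state->link = true;
	state->speed = 1000;
	state->full_duplex = true;
}

static const struct pcs_ops demo_pcs_ops = {
	.pcs_get_state = demo_pcs_get_state,
};
```

Keeping the two tables separate lets a PCS implementation be swapped or shared without touching the MAC callbacks.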
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phylink: rename 'ops' to 'mac_ops' · e7765d63
      Russell King authored
      Rename the bland 'ops' member of struct phylink to be a more
      descriptive 'mac_ops' - this is necessary as we're about to introduce
      another set of operations.
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phylink: change phylink_mii_c22_pcs_set_advertisement() prototype · 0bd27406
      Russell King authored
      Change phylink_mii_c22_pcs_set_advertisement() to take only the PHY
      interface and advertisement mask, rather than the full phylink state.
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8169: factor out rtl8169_tx_map · b8447abc
      Heiner Kallweit authored
      Factor out mapping the tx skb to a new function rtl8169_tx_map(). This
      allows removing redundancies, and rtl8169_get_txd_opts1() has only
      one user left, so it can be inlined.
      As a result rtl8169_xmit_frags() is significantly simplified, and
      the code in rtl8169_start_xmit() is simpler and more readable.
      No functional change intended.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'for-upstream' of... · 033c6f3b
      David S. Miller authored
      Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
      
      Johan Hedberg says:
      
      ====================
      pull request: bluetooth-next 2020-03-29
      
      Here are a few more Bluetooth patches for the 5.7 kernel:
      
       - Fix assumption of encryption key size when reading fails
       - Add support for DEFER_SETUP with L2CAP Enhanced Credit Based Mode
       - Fix issue with auto-connected devices
       - Fix suspend handling when entering the state fails
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: Fix use after free in qed_chain_free · 8063f761
      Yuval Basson authored
      The qed_chain data structure was modified in
      commit 1a4a6975 ("qed: Chain support for external PBL") to support
      receiving an external pbl (due to iWARP FW requirements).
      The pages pointed to by the pbl are allocated in qed_chain_alloc
      and their virtual addresses are stored in a virtual address array to
      enable accessing and freeing the data. The physical addresses, however,
      weren't stored and were accessed directly from the external pbl
      during free.
      
      The destroy-qp flow frees the external pbl before the chain is
      freed. When the chain is freed, it tries to access the already freed
      external pbl, leading to a use-after-free. Therefore we need to store
      the physical addresses in addition to the virtual addresses in a
      new data structure.
      
      Fixes: 1a4a6975 ("qed: Chain support for external PBL")
      Signed-off-by: Michal Kalderon <mkalderon@marvell.com>
      Signed-off-by: Yuval Bason <ybason@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8169: improve handling of TD_MSS_MAX · 4abc3c04
      Heiner Kallweit authored
      If the mtu is greater than TD_MSS_MAX, then TSO is disabled, see
      rtl8169_fix_features(). Because mss is less than mtu, we can't have
      the case mss > TD_MSS_MAX in the TSO path.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'Port-and-flow-policers-for-DSA' · 3288dffc
      David S. Miller authored
      Vladimir Oltean says:
      
      ====================
      Port and flow policers for DSA (SJA1105, Felix/Ocelot)
      
      This series adds support for 2 types of policers:
       - port policers, via tc matchall filter
       - flow policers, via tc flower filter
      for 2 DSA drivers:
       - sja1105
       - felix/ocelot
      
      First we start with ocelot/felix. Prior to this patch, the ocelot core
      library only supported:
      - Port policers
      - Flow-based dropping and trapping
      But the felix wrapper could not actually use the port policers due to
      missing linkage and support in the DSA core. So one of the patches
      addresses exactly that limitation by adding the missing support to the
      DSA core. The other patch for felix flow policers (via the VCAP IS2
      engine) is actually in the ocelot library itself, since the linkage with
      the ocelot flower classifier has already been done in an earlier patch
      set.
      
      Then with the newly added .port_policer_add and .port_policer_del, we
      can also start supporting the L2 policers on sja1105.
      
      Then, for full functionality of these L2 policers on sja1105, we also
      implement a more limited set of flow-based policing keys for this
      switch, namely for broadcast and VLAN PCP.
      
      Series version 1 was submitted here:
      https://patchwork.ozlabs.org/cover/1263353/
      
      Nothing functional changed in v2, only a rebase.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: add broadcast and per-traffic class policers · a6af7763
      Vladimir Oltean authored
      This patch adds complete support for manipulating the L2 Policing Tables
      of this switch. There are 45 table entries: one entry per port and
      traffic class, and one dedicated entry for broadcast traffic for
      each ingress port.
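
The count of 45 entries is consistent with 5 ports and 8 traffic classes (5 * 8 per-class entries plus 5 broadcast entries). The sketch below shows one way such a table could be indexed; the ordering (per-class entries first, broadcast entries last) and the helper names are assumptions made for illustration, not taken from the driver.

```c
#include <assert.h>

#define NUM_PORTS 5 /* one ingress port per shared policer */
#define NUM_TC 8    /* traffic classes per port */
#define NUM_POLICERS (NUM_PORTS * NUM_TC + NUM_PORTS) /* 45 */

/* Assumed layout: entries 0..39 are (port, traffic class) pairs. */
static int policer_index_tc(int port, int tc)
{
	assert(port >= 0 && port < NUM_PORTS);
	assert(tc >= 0 && tc < NUM_TC);
	return port * NUM_TC + tc;
}

/* Assumed layout: entries 40..44 are the per-port broadcast policers. */
static int policer_index_bcast(int port)
{
	assert(port >= 0 && port < NUM_PORTS);
	return NUM_PORTS * NUM_TC + port;
}
```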
      
      Policing entries are shareable, and we use this functionality to support
      shared block filters.
      
      We are modeling broadcast policers as simple tc-flower matches on
      dst_mac. As for the traffic class policers, the switch only deduces the
      traffic class from the VLAN PCP field, so it makes sense to model this
      as a tc-flower match on vlan_prio.
      
      How to limit broadcast traffic coming from all front-panel ports to a
      cumulative total of 10 Mbit/s:
      
      tc qdisc add dev sw0p0 ingress_block 1 clsact
      tc qdisc add dev sw0p1 ingress_block 1 clsact
      tc qdisc add dev sw0p2 ingress_block 1 clsact
      tc qdisc add dev sw0p3 ingress_block 1 clsact
      tc filter add block 1 flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \
      	action police rate 10mbit burst 64k
      
      How to limit traffic with VLAN PCP 0 (also includes untagged traffic) to
      100 Mbit/s on port 0 only:
      
      tc filter add dev sw0p0 ingress protocol 802.1Q flower skip_sw \
      	vlan_prio 0 action police rate 100mbit burst 64k
      
      The broadcast, VLAN PCP and port policers are compatible with one
      another (can be installed at the same time on a port).
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a6af7763
    • net: dsa: sja1105: add configuration of port policers · a7cc081c
      Vladimir Oltean authored
      This adds partial configuration support for the L2 Policing Table. Out
      of the 45 policing entries, only 5 are used (one for each port), in a
      shared manner. All 8 traffic classes, and the broadcast policer, are
      redirected to a common instance which belongs to the ingress port.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a7cc081c
    • net: dsa: felix: add port policers · fc411eaa
      Vladimir Oltean authored
      This patch is a trivial passthrough towards the ocelot library, which
      has supported port policers since commit 2c1d029a ("net: mscc: ocelot:
      Implement port policers via tc command").
      
      Some data structure conversion between the DSA core and the Ocelot
      library is necessary, for policer parameters.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fc411eaa
    • net: dsa: add port policers · 34297176
      Vladimir Oltean authored
      The approach taken to pass the port policer methods on to drivers is
      pragmatic. It is similar to the port mirroring implementation (in that
      the DSA core does all of the filter block interaction and only passes
      simple operations for the driver to implement) and dissimilar to how
      flow-based policers are going to be implemented (where the driver has
      full control over the flow_cls_offload data structure).
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      34297176
    • net: dsa: refactor matchall mirred action to separate function · e13c2075
      Vladimir Oltean authored
      Make room for other actions for the matchall filter by keeping the
      mirred argument parsing self-contained in its own function.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e13c2075
    • net: mscc: ocelot: add action of police on vcap_is2 · c9a7fe12
      Xiaoliang Yang authored
      Ocelot has 384 policers that can be allocated to ingress ports,
      QoS classes per port, and VCAP IS2 entries. ocelot_police.c
      supports configuring policers that can be allocated to the police
      action of VCAP IS2. We allocate policers starting from the maximum
      pol_id, and decrement the pol_id each time a new vcap_is2 entry
      with a police action is added.
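
      The top-down allocation described above can be sketched roughly as
      follows. This is a simplified model, not the ocelot code: the class
      name PolicerPool is hypothetical, and details the text does not give
      (reserved policer ids, releasing ids when a rule is deleted) are
      deliberately left out.

```python
# Hypothetical model of allocating policer ids from the top of the range.
NUM_POLICERS = 384  # from the text: Ocelot has 384 policers


class PolicerPool:
    """Hands out policer ids starting at the maximum id and counting down."""

    def __init__(self, size: int = NUM_POLICERS) -> None:
        self.next_id = size - 1  # highest id is handed out first

    def alloc(self) -> int:
        """Return the next free id for a new rule with a police action."""
        if self.next_id < 0:
            raise RuntimeError("no policers left")
        pol_id = self.next_id
        self.next_id -= 1  # the following rule gets the next lower id
        return pol_id
```

      Counting down from the top keeps the IS2 policers away from the low
      ids, which is presumably where the per-port and per-QoS-class policers
      live; the text does not state the exact ranges.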
      Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c9a7fe12
    • Merge branch 'ionic-support-for-firmware-upgrade' · 0d5d6045
      David S. Miller authored
      Shannon Nelson says:
      
      ====================
      ionic support for firmware upgrade
      
      The Pensando Distributed Services Card can get firmware upgrades from
      the off-host centralized management suite, and can be upgraded without a
      host reboot or driver reload.  This patchset sets up the support for fw
      upgrade in the Linux driver.
      
      When the upgrade begins, the DSC first brings the link down, then stops
      the firmware.  The driver will notice this and quiesce itself by stopping
      the queues and releasing DMA resources, then monitoring for firmware to
      start back up.  When the upgrade is finished the firmware is restarted
      and link is brought up, and the driver rebuilds the queues and restarts
      traffic flow.
      
      First we separate the Link state from the netdev state, then reorganize a
      few things to prepare for partial tear-down of the queues.  Next we fix
      up the state machine so that we take the Tx and Rx queues down and back
      up when we get LINK_DOWN and LINK_UP events.  Lastly, we add handling of
      the FW reset itself by tearing down the lif internals and rebuilding them
      with the new FW setup.
      
      v2: This changes the design from (ab)using the full .ndo_stop and
          .ndo_open routines to getting a better separation between the
          alloc and the init functions so that we can keep our resource
          allocations as long as possible.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0d5d6045
    • ionic: remove lifs on fw reset · c672412f
      Shannon Nelson authored
      When the FW RESET event comes to the driver from the firmware,
      or the fw_status goes to 0 (stopped) or to 0xff (no PCI
      connection), then shut down the driver activity.  This event
      signals a FW upgrade where we need to quiesce all operations and
      wait for the FW to restart.  The FW will continue the update
      process once it sees all the LIFs are reset.  When the update
      process is done it will set the fw_status back to RUNNING.
      Meanwhile, the heartbeat check continues and when the fw_status
      is seen as set to running we can restart the driver operations.
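
      The heartbeat-driven behaviour above can be pictured as a small state
      transition. The sketch below is illustrative only: the function name
      and the numeric value used for RUNNING are assumptions, and the
      explicit FW RESET event is not modeled, only the fw_status values.

```python
# Hypothetical model of the heartbeat check (not the ionic driver code).
FW_STOPPED = 0x00  # from the text: fw_status 0 means stopped
FW_NO_PCI = 0xFF   # from the text: 0xff means no PCI connection
FW_RUNNING = 0x01  # assumed value for the RUNNING status


def heartbeat_check(driver_up: bool, fw_status: int):
    """Return (new_driver_up, action) for one heartbeat sample.

    action is "quiesce" when driver activity must be shut down,
    "restart" when operations can resume, and None otherwise.
    """
    if fw_status in (FW_STOPPED, FW_NO_PCI):
        # Firmware is gone: quiesce once, then keep waiting.
        return False, ("quiesce" if driver_up else None)
    if fw_status == FW_RUNNING and not driver_up:
        # Firmware is back after the update: rebuild and resume.
        return True, "restart"
    return driver_up, None
```

      Feeding it the samples RUNNING, 0, 0, RUNNING would yield no action,
      then one quiesce, then no action while waiting, then one restart.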
      Signed-off-by: Shannon Nelson <snelson@pensando.io>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c672412f
    • ionic: disable the queues on link down · 49d3b493
      Shannon Nelson authored
      When the link goes down, we need to disable the queues on the
      NIC in addition to stopping the netdev stack.  This lets the
      FW know that the driver has stopped queue activity, and then
      the FW can do internal reconfiguration work, whether actually
      Link related, or for other internal FW needs.  To do this,
      we pull out the queue enable and disable from ionic_open()
      and ionic_stop() so they can be used by other routines.
      
      To help keep things sane, we swap the queue enables so that
      the rx queue and its napi are enabled before the tx queue,
      which rides on the rx queue's napi.
      
      We also drop the ionic_lif_quiesce() as it doesn't do anything
      more than what the queue disable has already taken care of.
      Signed-off-by: Shannon Nelson <snelson@pensando.io>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      49d3b493
    • ionic: check for queues before deleting · d5eddde5
      Shannon Nelson authored
      Make sure the queue structures exist before trying
      to delete them.  This addresses a couple of error
      recovery issues.
      Signed-off-by: Shannon Nelson <snelson@pensando.io>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d5eddde5
    • ionic: clean tx queue of unfinished requests · f9c00e2c
      Shannon Nelson authored
      Clean out tx requests that didn't get finished before
      shutting down the queue.
      Signed-off-by: Shannon Nelson <snelson@pensando.io>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f9c00e2c
    • ionic: move irq request to qcq alloc · 0b064100
      Shannon Nelson authored
      Move the irq request and free out of the qcq_init and deinit
      and into the alloc and free routines where they belong for
      better resource management.
      Signed-off-by: Shannon Nelson <snelson@pensando.io>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0b064100