1. 04 Nov, 2022 35 commits
    • ixgbe: Remove local variable · 6a6f9e3e
      Anirudh Venkataramanan authored
      Remove local variable "match" and directly return evaluated conditional
      instead.
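      The shape of this cleanup can be sketched as follows. The function
      names and the comparison are hypothetical stand-ins, not the ixgbe
      code itself; only the "match" local comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Before: the result is parked in a local that is only returned. */
static bool ids_match_verbose(uint16_t a, uint16_t b)
{
	bool match;

	match = (a == b);
	return match;
}

/* After: return the evaluated conditional directly. */
static bool ids_match(uint16_t a, uint16_t b)
{
	return a == b;
}
```

      Both variants behave identically; the second simply drops the
      temporary.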
      Suggested-by: Alexander Duyck <alexander.duyck@gmail.com>
      Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
      Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ixgbe: change MAX_RXD/MAX_TXD based on adapter type · 864f8888
      Daniel Willenson authored
      Set the length limit for the receive descriptor buffer and transmit
      descriptor buffer based on the controller type. The values used are called
      out in the controller datasheets as a 'Note:' in the RDLEN and TDLEN
      register descriptions.
      
      This allows the user to use ethtool to allocate larger descriptor buffers
      in the case where data is received or transmitted too quickly for the
      driver to keep up.
      Signed-off-by: Daniel Willenson <daniel@veobot.com>
      Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • Merge branch 'net-ipa-more-endpoints' · 95ec6bce
      David S. Miller authored
      Alex Elder says:
      
      ====================
      net: ipa: support more endpoints
      
      This series adds support for more than 32 IPA endpoints.  To do
      this, five registers whose bits represent endpoint state are
      replicated as needed to represent endpoints beyond 32.  For existing
      platforms, the number of endpoints is never greater than 32, so
      there is just one of each register.  IPA v5.0+ supports more than
      that though; these changes prepare the code for that.
      
      Beyond that, the IPA fields that represent endpoints in a 32-bit
      bitmask are updated to support an arbitrary number of these endpoint
      registers.  (There is one exception, explained in patch 7.)
      
      The first two patches are somewhat unrelated cleanups, making
      use of a helper function introduced recently.
      
      The third and fourth use parameterized functions to determine the
      register offset for registers that represent endpoints.
      
      The last five convert fields representing endpoints to allow more
      than 32 endpoints to be represented.
      
      Since v1, I have implemented Jakub's suggestions:
        - Don't print a message on (bitmap) memory allocation failure
        - Do not do "mass null checks" when allocating bitmaps
        - Rework some code to ensure error path is sane
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipa: use a bitmap for enabled endpoints · 9b7a0065
      Alex Elder authored
      Replace the 32-bit unsigned used to track enabled endpoints with a
      Linux bitmap, to allow an arbitrary number of endpoints to be
      represented.
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipa: use a bitmap for set-up endpoints · ae5108e9
      Alex Elder authored
      Replace the 32-bit unsigned used to track endpoints that have
      completed setup with a Linux bitmap, to allow an arbitrary number
      of endpoints to be represented.
      
      Rework the error handling in ipa_endpoint_init() so the defined
      endpoint bitmap is freed if an error occurs early.  Once endpoints
      have been initialized, ipa_endpoint_exit() is used to recover if
      the set of filtered endpoints is invalid.
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipa: support more filtering endpoints · 0f97fbd4
      Alex Elder authored
      Prior to IPA v5.0, there could be no more than 32 endpoints.
      
      A filter table begins with a bitmap indicating which endpoints have
      a filter defined.  That bitmap is currently assumed to fit in a
      32-bit value.
      
      Starting with IPA v5.0, more than 32 endpoints are supported, so
      it's conceivable that a TX endpoint has an ID of 32 or higher.
      Increase the size of the field representing endpoints that support
      filtering to 64 bits.  Rename the bitmap field "filtered".
      
      Unlike other similar fields, we do not use an (arbitrarily long)
      Linux bitmap for this purpose.  The reason is that if a filter table
      ever *did* need to support more than 64 TX endpoints, its format
      would change in ways we can't anticipate.
      
      Have ipa_endpoint_init() return a negative errno rather than a mask
      that indicates which endpoints support filtering, and have that
      function assign the "filtered" field directly.
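      A minimal user-space sketch of a 64-bit "filtered" mask is shown
      below. The helper names and the FILTERED_MAX_ENDPOINTS constant are
      assumptions made for illustration; they are not the driver's
      identifiers.

```c
#include <stdbool.h>
#include <stdint.h>

#define FILTERED_MAX_ENDPOINTS	64	/* assumption: width of the mask */

/* Mark endpoint_id as supporting filtering; false if it cannot be recorded. */
static bool filtered_set(uint64_t *filtered, unsigned int endpoint_id)
{
	if (endpoint_id >= FILTERED_MAX_ENDPOINTS)
		return false;
	*filtered |= UINT64_C(1) << endpoint_id;
	return true;
}

/* Report whether endpoint_id is recorded in the mask. */
static bool filtered_test(uint64_t filtered, unsigned int endpoint_id)
{
	return endpoint_id < FILTERED_MAX_ENDPOINTS &&
	       ((filtered >> endpoint_id) & 1);
}
```

      Using a 64-bit constant for the shifted 1 matters here: shifting a
      plain int by 32 or more would be undefined.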
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipa: use a bitmap for available endpoints · 88de7672
      Alex Elder authored
      Similar to the previous patch, replace the 32-bit unsigned used to
      track endpoints supported by hardware with a Linux bitmap, to allow
      an arbitrary number of endpoints to be represented.
      
      Move ipa_endpoint_deconfig() above ipa_endpoint_config() and use
      it in the error path of the latter function.
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipa: use a bitmap for defined endpoints · 9a9f5129
      Alex Elder authored
      IPA v5.0 supports more than 32 endpoints, so we will be unable to
      represent endpoints defined in the configuration data with a 32-bit
      value.  To prepare for that, convert the field in the IPA structure
      representing defined endpoints to be a Linux bitmap.
      
      Convert loops based on that field into for_each_set_bit() calls over
      the new bitmap.  Note that the loop in ipa_endpoint_config() still
      assumes there are 32 or fewer endpoints (when comparing against the
      available endpoint bit mask); that assumption goes away in the next
      patch.
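      The idea of replacing a fixed 32-bit mask with an arbitrarily long
      bitmap, and looping over only its set bits, can be sketched in plain
      user-space C. This is not the kernel bitmap API: every name below
      (bit_set, next_set_bit, for_each_set, count_set) is a hypothetical
      stand-in written for this example.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_WORD	 (sizeof(unsigned long) * CHAR_BIT)
#define WORDS_FOR(nbits) (((nbits) + BITS_PER_WORD - 1) / BITS_PER_WORD)

static void bit_set(unsigned long *map, size_t bit)
{
	map[bit / BITS_PER_WORD] |= 1UL << (bit % BITS_PER_WORD);
}

static bool bit_test(const unsigned long *map, size_t bit)
{
	return (map[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL;
}

/* Return the first set bit at or after 'start', or nbits if there is none. */
static size_t next_set_bit(const unsigned long *map, size_t nbits, size_t start)
{
	while (start < nbits && !bit_test(map, start))
		start++;
	return start;
}

/* Visit each set bit in ascending order. */
#define for_each_set(bit, map, nbits) \
	for ((bit) = next_set_bit((map), (nbits), 0); (bit) < (nbits); \
	     (bit) = next_set_bit((map), (nbits), (bit) + 1))

/* Count defined endpoints by walking only the set bits. */
static size_t count_set(const unsigned long *map, size_t nbits)
{
	size_t bit, count = 0;

	for_each_set(bit, map, nbits)
		count++;
	return count;
}
```

      Because the map is an array of words, the number of endpoints is
      limited only by how many words are allocated.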
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipa: add a parameter to suspend registers · f298ba78
      Alex Elder authored
      The SUSPEND_INFO, SUSPEND_EN, SUSPEND_CLR registers represent
      endpoint IDs in a bit mask.  When more than 32 endpoints are
      supported, these registers will be replicated as needed to represent
      the number of supported endpoints.  Update the definitions of these
      registers to have a stride of 4 bytes, and update the code that
      operates on them to select the proper offset and bit.
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipa: add a parameter to aggregation registers · 1d8f16db
      Alex Elder authored
      Starting with IPA v5.0, a single IPA instance can have more than 32
      endpoints defined.  To handle this, each register that holds a
      bitmap of IPA endpoints is replicated as needed to represent the
      available endpoints.
      
      To prepare for this, registers that represent endpoint IDs in a bit
      mask will be defined to have a parameter, with a stride value of 4
      bytes.  The first 32 endpoints are represented in the first 32-bit
      register, then the next (up to) 32 endpoints at an offset 4 bytes
      higher.  When accessing such a register, the endpoint ID divided
      by 32 determines the offset, and the endpoint ID modulo 32 defines
      the endpoint's bit position within the register.
      
      The first two registers we'll update for this are STATE_AGGR_ACTIVE
      and AGGR_FORCE_CLOSE.
      
      Until more than 32 endpoints are supported, this change has no
      practical effect.
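      The offset and bit arithmetic described above can be written out as
      a short sketch. The helper names and the idea of passing a base
      offset are assumptions for illustration; only the 32-endpoint
      register width and the 4-byte stride come from the commit message.

```c
#include <stdint.h>

#define ENDPOINTS_PER_REG	32u
#define REG_STRIDE		4u	/* bytes between replicated registers */

/* Byte offset of the register that holds endpoint_id's bit. */
static uint32_t ep_reg_offset(uint32_t base, uint32_t endpoint_id)
{
	return base + (endpoint_id / ENDPOINTS_PER_REG) * REG_STRIDE;
}

/* Mask selecting endpoint_id's bit within that register. */
static uint32_t ep_reg_bit(uint32_t endpoint_id)
{
	return UINT32_C(1) << (endpoint_id % ENDPOINTS_PER_REG);
}
```

      For example, with a hypothetical base offset of 0x100, endpoint 37
      lands in the second register (offset 0x104) at bit position 5, while
      every endpoint below 32 stays in the first register, which is why
      the change has no effect on existing platforms.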
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipa: use ipa_table_mem() in ipa_table_reset_add() · 6337b147
      Alex Elder authored
      Similar to the previous commit, pass flags rather than a memory
      region ID to ipa_table_reset_add(), and there use ipa_table_mem() to
      look up the memory region affected based on those flags.
      
      Currently all eight of these table memory regions are assumed to
      exist, because they all have canaries within them.  Stop assuming
      that will always be the case, and in ipa_table_reset_add() allow
      these memory regions to be non-existent.
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipa: reduce arguments to ipa_table_init_add() · 5cb76899
      Alex Elder authored
      Recently ipa_table_mem() was added as a way to look up one of 8
      possible memory regions by indicating whether it was a filter or
      route table, hashed or not, and IPv6 or not.
      
      We can simplify the interface to ipa_table_init_add() by passing two
      flags to it instead of the opcode and both hashed and non-hashed
      memory region IDs.  The "filter" and "ipv6" flags are sufficient to
      determine the opcode to use, and with ipa_table_mem() can look up
      the correct memory region as well.
      
      It's possible to not have hashed tables, but we already verify the
      number of entries in a filter or routing table is nonzero.  Stop
      assuming a hashed table entry exists in ipa_table_init_add().
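      How three boolean flags can select one of eight table memory regions
      is sketched below. The enum values, their order, and the function
      name are hypothetical; the real ipa_table_mem() may be structured
      differently and returns a memory descriptor rather than an ID.

```c
#include <stdbool.h>

/* Hypothetical region IDs; the driver's real enum may be ordered differently. */
enum table_region {
	V4_ROUTE, V4_ROUTE_HASHED, V6_ROUTE, V6_ROUTE_HASHED,
	V4_FILTER, V4_FILTER_HASHED, V6_FILTER, V6_FILTER_HASHED,
};

/* Map (filter or route, hashed or not, IPv6 or IPv4) to a region ID. */
static enum table_region table_region_id(bool filter, bool hashed, bool ipv6)
{
	if (filter) {
		if (ipv6)
			return hashed ? V6_FILTER_HASHED : V6_FILTER;
		return hashed ? V4_FILTER_HASHED : V4_FILTER;
	}
	if (ipv6)
		return hashed ? V6_ROUTE_HASHED : V6_ROUTE;
	return hashed ? V4_ROUTE_HASHED : V4_ROUTE;
}
```

      With a lookup of this kind, a caller only needs to carry the
      "filter" and "ipv6" flags and can ask for the hashed and non-hashed
      regions in turn.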
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rds: remove redundant variable total_payload_len · d28c0e73
      Colin Ian King authored
      Variable total_payload_len is used to accumulate payload lengths;
      however, it is never read or used afterwards. It is redundant and can
      be removed.
      Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue · b3809277
      Jakub Kicinski authored
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2022-11-02 (i40e, iavf)
      
      This series contains updates to i40e and iavf drivers.
      
      Joe Damato adds tracepoint information to i40e_napi_poll to expose helpful
      debug information for users who'd like to get a better understanding of
      how their NIC is performing as they adjust various parameters and tuning
      knobs.
      
      Note: this does not touch any XDP related code paths. This
      tracepoint will only work when not using XDP. Care has been taken to avoid
      changing control flow in i40e_napi_poll with this change.
      
      Alicja adds error messaging for unsupported duplex settings for i40e.
      
      Ye Xingchen replaces use of __FUNCTION__ with __func__ for iavf.
      
      Bartosz changes tense of device removal message to be more clear on the
      action for iavf.
      
      * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
        iavf: Change information about device removal in dmesg
        iavf: Replace __FUNCTION__ with __func__
        i40e: Add appropriate error message logged for incorrect duplex setting
        i40e: Add i40e_napi_poll tracepoint
        i40e: Record number of RXes cleaned during NAPI
        i40e: Record number TXes cleaned during NAPI
        i40e: Store the irq number in i40e_q_vector
      ====================
      
      Link: https://lore.kernel.org/r/20221102211011.2944983-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue · 20224838
      Jakub Kicinski authored
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2022-11-02 (e1000e, e1000, igc)
      
      This series contains updates to e1000e, e1000, and igc drivers.
      
      For e1000e, Sasha adds a new board type to help distinguish platforms and
      adds device id support for upcoming platforms. He also adds trace points
      for CSME flows to aid in debugging.
      
      Ani removes unnecessary kmap_atomic call for e1000 and e1000e.
      
      Muhammad sets speed based transmit offsets for launchtime functionality to
      reduce latency for igc.
      
      * '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
        igc: Correct the launchtime offset
        e1000: Remove unnecessary use of kmap_atomic()
        e1000e: Remove unnecessary use of kmap_atomic()
        e1000e: Add e1000e trace module
        e1000e: Add support for the next LOM generation
        e1000e: Separate MTP board type from ADP
      ====================
      
      Link: https://lore.kernel.org/r/20221102203957.2944396-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • hamradio: baycom_epp: Fix return type of baycom_send_packet() · c5733e5b
      Nathan Chancellor authored
      With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
      indirect call targets are validated against the expected function
      pointer prototype to make sure the call target is valid to help mitigate
      ROP attacks. If they are not identical, there is a failure at run time,
      which manifests as either a kernel panic or thread getting killed. A
      proposed warning in clang aims to catch these at compile time, which
      reveals:
      
        drivers/net/hamradio/baycom_epp.c:1119:25: error: incompatible function pointer types initializing 'netdev_tx_t (*)(struct sk_buff *, struct net_device *)' (aka 'enum netdev_tx (*)(struct sk_buff *, struct net_device *)') with an expression of type 'int (struct sk_buff *, struct net_device *)' [-Werror,-Wincompatible-function-pointer-types-strict]
                .ndo_start_xmit      = baycom_send_packet,
                                      ^~~~~~~~~~~~~~~~~~
        1 error generated.
      
      ->ndo_start_xmit() in 'struct net_device_ops' expects a return type of
      'netdev_tx_t', not 'int'. Adjust the return type of baycom_send_packet()
      to match the prototype's to resolve the warning and CFI failure.
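      The mismatch can be reproduced in miniature outside the kernel. The
      types below are simplified stand-ins declared only for this sketch,
      and demo_start_xmit and demo_ops are hypothetical names; the point
      is that the handler's return type matches the function pointer's
      prototype exactly.

```c
#include <stddef.h>

/* Stand-ins for the kernel types, declared here only for illustration. */
struct sk_buff;
struct net_device;

typedef enum netdev_tx {
	NETDEV_TX_OK   = 0x00,
	NETDEV_TX_BUSY = 0x10,
} netdev_tx_t;

struct net_device_ops {
	netdev_tx_t (*ndo_start_xmit)(struct sk_buff *skb,
				      struct net_device *dev);
};

/* Returning netdev_tx_t (not int) makes the type match the ops prototype. */
static netdev_tx_t demo_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
	(void)skb;
	(void)dev;
	return NETDEV_TX_OK;
}

static const struct net_device_ops demo_ops = {
	.ndo_start_xmit = demo_start_xmit,
};
```

      Had demo_start_xmit() been declared to return int, the initializer
      would have a different function type than the pointer it is assigned
      to, which is the situation the strict clang warning reports.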
      
      Link: https://github.com/ClangBuiltLinux/linux/issues/1750
      Signed-off-by: Nathan Chancellor <nathan@kernel.org>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Link: https://lore.kernel.org/r/20221102160610.1186145-1-nathan@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: ethernet: ti: Fix return type of netcp_ndo_start_xmit() · 63fe6ff6
      Nathan Chancellor authored
      With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
      indirect call targets are validated against the expected function
      pointer prototype to make sure the call target is valid to help mitigate
      ROP attacks. If they are not identical, there is a failure at run time,
      which manifests as either a kernel panic or thread getting killed. A
      proposed warning in clang aims to catch these at compile time, which
      reveals:
      
        drivers/net/ethernet/ti/netcp_core.c:1944:21: error: incompatible function pointer types initializing 'netdev_tx_t (*)(struct sk_buff *, struct net_device *)' (aka 'enum netdev_tx (*)(struct sk_buff *, struct net_device *)') with an expression of type 'int (struct sk_buff *, struct net_device *)' [-Werror,-Wincompatible-function-pointer-types-strict]
                .ndo_start_xmit         = netcp_ndo_start_xmit,
                                          ^~~~~~~~~~~~~~~~~~~~
        1 error generated.
      
      ->ndo_start_xmit() in 'struct net_device_ops' expects a return type of
      'netdev_tx_t', not 'int'. Adjust the return type of
      netcp_ndo_start_xmit() to match the prototype's to resolve the warning
      and CFI failure.
      
      Link: https://github.com/ClangBuiltLinux/linux/issues/1750
      Signed-off-by: Nathan Chancellor <nathan@kernel.org>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Link: https://lore.kernel.org/r/20221102160933.1601260-1-nathan@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: usb: Use kstrtobool() instead of strtobool() · c2cce3a6
      Christophe JAILLET authored
      strtobool() is the same as kstrtobool().
      However, the latter is more widely used within the kernel.
      
      In order to remove strtobool() and slightly simplify kstrtox.h, switch to
      the other function name.
      
      While at it, include the corresponding header file (<linux/kstrtox.h>).
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Link: https://lore.kernel.org/r/d4432a67b6f769cac0a9ec910ac725298b64e102.1667336095.git.christophe.jaillet@wanadoo.fr
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'net-fix-netdev-to-devlink_port-linkage-and-expose-to-user' · 7712b3e9
      Jakub Kicinski authored
      Jiri Pirko says:
      
      ====================
      net: fix netdev to devlink_port linkage and expose to user
      
      Currently, the info about linkage from netdev to the related
      devlink_port instance is done using ndo_get_devlink_port().
      This is not sufficient, as it is up to the driver to implement it and
      some of them don't do that. Also it leads to a lot of unnecessary
      boilerplate code in all the drivers.
      
      Instead of that, introduce a possibility for driver to expose this
      relationship by new SET_NETDEV_DEVLINK_PORT macro which stores it into
      dev->devlink_port. It is ensured by the driver init/fini flows that
      the devlink_port pointer does not change during the netdev lifetime.
      Devlink port is always registered before netdev register and
      unregistered after netdev unregister.
      
      Benefit from this linkage setup and remove explicit calls from drivers
      to devlink_port_type_eth_set() and clear(). Many of the drivers
      didn't use them correctly anyway. Let devlink.c track associated
      netdev events and adjust type and type pointer accordingly. Also
      use these events to keep track of ifname changes and remove RTNL lock
      taking from devlink_nl_port_fill().
      
      Finally, remove the ndo_get_devlink_port() ndo which is no longer used
      and expose devlink_port handle as a new netdev netlink attribute to the
      user. That way, during the ifname->devlink_port lookup, userspace app
      does not have to dump whole devlink port list and instead it can just
      do a simple RTM_GETLINK query.
      ====================
      
      Link: https://lore.kernel.org/r/20221102160211.662752-1-jiri@resnulli.us
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: expose devlink port over rtnetlink · dca56c30
      Jiri Pirko authored
      Expose devlink port handle related to netdev over rtnetlink. Introduce a
      new nested IFLA attribute to carry the info. Call into devlink code to
      fill-up the nest with existing devlink attributes that are used over
      devlink netlink.
      Signed-off-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: remove unused ndo_get_devlink_port · 77df1db8
      Jiri Pirko authored
      Remove ndo_get_devlink_port, which is no longer used, along with its
      implementations in drivers.
      Signed-off-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: devlink: use devlink_port pointer instead of ndo_get_devlink_port · 8eba37f7
      Jiri Pirko authored
      Use the newly introduced devlink_port pointer instead of getting it by
      calling the ndo_get_devlink_port op.
      Signed-off-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: devlink: add not cleared type warning to port unregister · e705a621
      Jiri Pirko authored
      By the time port unregister is called, there should be no type set. Make
      sure that the driver cleared it before and warn in case it didn't. This
      enforces symmetry with type set and port register.
      Signed-off-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: devlink: store copy netdevice ifindex and ifname to allow port_fill() without RTNL held · 31265c1e
      Jiri Pirko authored
      To avoid the need to take the RTNL mutex in the port_fill() function,
      benefit from the introduced infrastructure that tracks netdevice
      notifier events.
      Store the ifindex and ifname upon register and change name events.
      Remove the rtnl_held bool propagated down to port_fill() function as it
      is no longer needed.
      Signed-off-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: devlink: remove net namespace check from devlink_nl_port_fill() · d0f51726
      Jiri Pirko authored
      It is ensured by the netdevice notifier event processing, that only
      netdev pointers from the same net namespaces are filled. Remove the
      net namespace check from devlink_nl_port_fill() as it is no longer
      needed.
      Signed-off-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: devlink: remove netdev arg from devlink_port_type_eth_set() · c8096578
      Jiri Pirko authored
      Since devlink_port_type_eth_set() should no longer be called by any
      driver with netdev pointer as it should rather use
      SET_NETDEV_DEVLINK_PORT, remove the netdev arg. Add a warning to
      type_clear().
      Signed-off-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: make drivers to use SET_NETDEV_DEVLINK_PORT to set devlink_port · ac73d4bf
      Jiri Pirko authored
      Benefit from the previously implemented tracking of netdev events in
      devlink code and instead of calling devlink_port_type_eth_set() and
      devlink_port_type_clear() to set devlink port type and link to related
      netdev, use SET_NETDEV_DEVLINK_PORT() macro to assign devlink_port
      pointer to netdevice which is about to be registered.
      Signed-off-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: devlink: track netdev with devlink_port assigned · 02a68a47
      Jiri Pirko authored
      Currently, ethernet drivers are using devlink_port_type_eth_set() and
      devlink_port_type_clear() to set devlink port type and link to related
      netdev.
      
      Instead of calling them directly, let the driver use the
      SET_NETDEV_DEVLINK_PORT macro to assign the devlink_port pointer and let
      devlink track it. Note the devlink port pointer is static during
      the time the netdevice is registered.
      
      In devlink code, use per-namespace netdev notifier to track
      the netdevices with devlink_port assigned and change the internal
      devlink_port type and related type pointer accordingly.
      Signed-off-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: devlink: take RTNL in port_fill() function only if it is not held · d41c9dbd
      Jiri Pirko authored
      A follow-up patch is going to introduce netdevice notifier event
      processing, which is called with the RTNL mutex held. Processing of this
      will eventually lead to a call to port_notify() and port_fill(), which
      currently takes the RTNL mutex internally. So as a temporary solution,
      propagate a bool indicating if the mutex is already held. This will go
      away in one of the follow-up patches.
      Signed-off-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: devlink: move port_type_netdev_checks() call to __devlink_port_type_set() · 45791e0d
      Jiri Pirko authored
      As __devlink_port_type_set() is going to be called directly from the
      netdevice notifier event handler in one of the follow-up patches, move
      the port_type_netdev_checks() call there.
      Signed-off-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: devlink: move port_type_warn_schedule() call to __devlink_port_type_set() · 8573a044
      Jiri Pirko authored
      As __devlink_port_type_set() is going to be called directly from the
      netdevice notifier event handler in one of the follow-up patches, move
      the port_type_warn_schedule() call there.
      Signed-off-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: devlink: convert devlink port type-specific pointers to union · 3830c571
      Jiri Pirko authored
      Instead of storing type_dev as a void pointer, convert it to union and
      use it to store either struct net_device or struct ib_device pointer.
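      The difference between an untyped void pointer and a union of typed
      pointers can be sketched as below. The structure layout, the enum,
      and the accessor are hypothetical and much simpler than the real
      struct devlink_port; only the two pointed-to types come from the
      commit message.

```c
#include <stddef.h>

struct net_device;	/* opaque stand-ins for the kernel structures */
struct ib_device;

enum port_type { PORT_TYPE_NOTSET, PORT_TYPE_ETH, PORT_TYPE_IB };

struct port {
	enum port_type type;
	union {
		struct net_device *netdev;	/* valid when type is ETH */
		struct ib_device *ibdev;	/* valid when type is IB */
	} type_dev;
};

/* Return the netdev only when the port type says the pointer is one. */
static struct net_device *port_netdev(const struct port *port)
{
	return port->type == PORT_TYPE_ETH ? port->type_dev.netdev : NULL;
}
```

      Compared with a void pointer, each union member carries its own
      type, so readers and the compiler can see which kind of device a
      given access expects.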
      Signed-off-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'bridge-add-mac-authentication-bypass-mab-support' · 0884aaf3
      Jakub Kicinski authored
      Ido Schimmel says:
      
      ====================
      bridge: Add MAC Authentication Bypass (MAB) support
      
      Patch #1 adds MAB support in the bridge driver. See the commit message
      for motivation, design choices and implementation details.
      
      Patch #2 adds corresponding test cases.
      
      Follow-up patchsets will add offload support in mlxsw and mv88e6xxx.
      ====================
      
      Link: https://lore.kernel.org/r/20221101193922.2125323-1-idosch@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: forwarding: Add MAC Authentication Bypass (MAB) test cases · 4a331d34
      Hans J. Schultz authored
      Add four test cases to verify MAB functionality:
      
      * Verify that a locked FDB entry can be generated by the bridge,
        preventing a host from communicating via the bridge. Test that user
        space can clear the "locked" flag by replacing the entry, thereby
        authenticating the host and allowing it to communicate via the bridge.
      
      * Test that an entry cannot roam to a locked port, but that it can roam
        to an unlocked port.
      
      * Test that MAB can only be enabled on a port that is both locked and
        has learning enabled.
      
      * Test that locked FDB entries are flushed from a port when MAB is
        disabled.
      Signed-off-by: Hans J. Schultz <netdev@kapio-technology.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • bridge: Add MAC Authentication Bypass (MAB) support · a35ec8e3
      Hans J. Schultz authored
      Hosts that support 802.1X authentication are able to authenticate
      themselves by exchanging EAPOL frames with an authenticator (Ethernet
      bridge, in this case) and an authentication server. Access to the
      network is only granted by the authenticator to successfully
      authenticated hosts.
      
      The above is implemented in the bridge using the "locked" bridge port
      option. When enabled, link-local frames (e.g., EAPOL) can be locally
      received by the bridge, but all other frames are dropped unless the host
      is authenticated. That is, unless the user space control plane installed
      an FDB entry according to which the source address of the frame is
      located behind the locked ingress port. The entry can be dynamic, in
      which case learning needs to be enabled so that the entry will be
      refreshed by incoming traffic.
      
      There are deployments in which not all the devices connected to the
      authenticator (the bridge) support 802.1X. Such devices can include
      printers and cameras. One option to support such deployments is to
      unlock the bridge ports connecting these devices, but a slightly more
      secure option is to use MAB. When MAB is enabled, the MAC address of the
      connected device is used as the user name and password for the
      authentication.
      
      For MAB to work, the user space control plane needs to be notified about
      MAC addresses that are trying to gain access so that they will be
      compared against an allow list. This can be implemented via the regular
      learning process with the sole difference that learned FDB entries are
      installed with a new "locked" flag indicating that the entry cannot be
      used to authenticate the device. The flag cannot be set by user space,
      but user space can clear the flag by replacing the entry, thereby
      authenticating the device.
      
      Locked FDB entries implement the following semantics with regards to
      roaming, aging and forwarding:
      
      1. Roaming: Locked FDB entries can roam to unlocked (authorized) ports,
         in which case the "locked" flag is cleared. FDB entries cannot roam
         to locked ports regardless of MAB being enabled or not. Therefore,
         locked FDB entries are only created if an FDB entry with the given {MAC,
         VID} does not already exist. This behavior prevents unauthenticated
         devices from disrupting traffic destined to already authenticated
         devices.
      
      2. Aging: Locked FDB entries age and are refreshed by incoming traffic
         like regular entries.
      
      3. Forwarding: Locked FDB entries forward traffic like regular entries.
         If user space detects an unauthorized MAC behind a locked port and
         wishes to prevent traffic with this MAC DA from reaching the host, it
         can do so using tc or a different mechanism.
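
      The ingress and roaming rules above can be sketched as a small
      user-space model. This is a simplified illustration, not the bridge
      code: the names (struct port, fdb_find, fdb_add, ingress, authorize)
      and the fixed-size table are hypothetical, and aging is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define FDB_SIZE 8 /* hypothetical table size, for illustration only */

struct port { int id; bool locked, learning, mab; };

struct fdb_entry {
	bool used;
	bool locked; /* models the new "locked" flag */
	unsigned char mac[6];
	unsigned short vid;
	int port_id;
};

static struct fdb_entry fdb[FDB_SIZE];

static struct fdb_entry *fdb_find(const unsigned char *mac, unsigned short vid)
{
	for (size_t i = 0; i < FDB_SIZE; i++)
		if (fdb[i].used && fdb[i].vid == vid && !memcmp(fdb[i].mac, mac, 6))
			return &fdb[i];
	return NULL;
}

static struct fdb_entry *fdb_add(const unsigned char *mac, unsigned short vid,
				 int port_id, bool locked)
{
	assert(mac != NULL);
	for (size_t i = 0; i < FDB_SIZE; i++) {
		if (fdb[i].used)
			continue;
		fdb[i].used = true;
		fdb[i].locked = locked;
		memcpy(fdb[i].mac, mac, 6);
		fdb[i].vid = vid;
		fdb[i].port_id = port_id;
		return &fdb[i];
	}
	return NULL; /* table full */
}

/* Returns true if a frame with this source {MAC, VID} is admitted on port p. */
static bool ingress(const struct port *p, const unsigned char *mac,
		    unsigned short vid)
{
	struct fdb_entry *e = fdb_find(mac, vid);

	if (p->locked) {
		if (!e) {
			/* MAB: record the unknown MAC as a locked entry. */
			if (p->mab && p->learning)
				fdb_add(mac, vid, p->id, true);
			return false;
		}
		/* No roaming to a locked port; a locked entry does not authenticate. */
		return e->port_id == p->id && !e->locked;
	}

	/* Unlocked port: regular learning, roaming clears the "locked" flag. */
	if (p->learning) {
		if (!e) {
			fdb_add(mac, vid, p->id, false);
		} else {
			e->port_id = p->id;
			e->locked = false;
		}
	}
	return true;
}

/* Models user space replacing the entry, which clears the "locked" flag. */
static void authorize(const unsigned char *mac, unsigned short vid, int port_id)
{
	struct fdb_entry *e = fdb_find(mac, vid);

	if (e) {
		e->port_id = port_id;
		e->locked = false;
	} else {
		fdb_add(mac, vid, port_id, false);
	}
}
```

      In this model the first frame from an unknown host on a locked MAB
      port is dropped and leaves a locked entry behind; frames are only
      admitted after authorize() replaces that entry.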
      
      Enable the above behavior using a new bridge port option called "mab".
      It can only be enabled on a bridge port that is both locked and has
      learning enabled. Locked FDB entries are flushed from the port once MAB
      is disabled. A new option is added because there are pure 802.1X
      deployments that are not interested in notifications about locked FDB
      entries.
      Signed-off-by: Hans J. Schultz <netdev@kapio-technology.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      a35ec8e3
  2. 03 Nov, 2022 5 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · fbeb229a
      Jakub Kicinski authored
      No conflicts.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      fbeb229a
    • Merge tag 'net-6.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 9521c9d6
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from bluetooth and netfilter.
      
        Current release - regressions:
      
         - net: several zerocopy flags fixes
      
         - netfilter: fix possible memory leak in nf_nat_init()
      
         - openvswitch: add missing .resv_start_op
      
        Previous releases - regressions:
      
         - neigh: fix null-ptr-deref in neigh_table_clear()
      
         - sched: fix use after free in red_enqueue()
      
         - dsa: fall back to default tagger if we can't load the one from DT
      
         - bluetooth: fix use-after-free in l2cap_conn_del()
      
        Previous releases - always broken:
      
         - netfilter: netlink notifier might race to release objects
      
         - nfc: fix potential memory leak of skb
      
         - bluetooth: fix use-after-free caused by l2cap_reassemble_sdu
      
         - bluetooth: use skb_put to set length
      
         - eth: tun: fix bugs for oversize packet when napi frags enabled
      
         - eth: lan966x: fixes for when MTU is changed
      
         - eth: dwmac-loongson: fix invalid mdio_node"
      
      * tag 'net-6.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (53 commits)
        vsock: fix possible infinite sleep in vsock_connectible_wait_data()
        vsock: remove the unused 'wait' in vsock_connectible_recvmsg()
        ipv6: fix WARNING in ip6_route_net_exit_late()
        bridge: Fix flushing of dynamic FDB entries
        net, neigh: Fix null-ptr-deref in neigh_table_clear()
        net/smc: Fix possible leaked pernet namespace in smc_init()
        stmmac: dwmac-loongson: fix invalid mdio_node
        ibmvnic: Free rwi on reset success
        net: mdio: fix undefined behavior in bit shift for __mdiobus_register
        Bluetooth: L2CAP: Fix attempting to access uninitialized memory
        Bluetooth: L2CAP: Fix l2cap_global_chan_by_psm
        Bluetooth: L2CAP: Fix accepting connection request for invalid SPSM
        Bluetooth: hci_conn: Fix not restoring ISO buffer count on disconnect
        Bluetooth: L2CAP: Fix memory leak in vhci_write
        Bluetooth: L2CAP: fix use-after-free in l2cap_conn_del()
        Bluetooth: virtio_bt: Use skb_put to set length
        Bluetooth: hci_conn: Fix CIS connection dst_type handling
        Bluetooth: L2CAP: Fix use-after-free caused by l2cap_reassemble_sdu
        netfilter: ipset: enforce documented limit to prevent allocating huge memory
        isdn: mISDN: netjet: fix wrong check of device registration
        ...
      9521c9d6
    • Merge tag 'powerpc-6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 4d740391
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
      
       - Fix an endian thinko in the asm-generic compat_arg_u64() which led to
         syscall arguments being swapped for some compat syscalls.
      
       - Fix syscall wrapper handling of syscalls with 64-bit arguments on
         32-bit kernels, which led to syscall arguments being misplaced.
      
       - A build fix for amdgpu on Book3E with AltiVec disabled.
      
      Thanks to Andreas Schwab, Christian Zigotzky, and Arnd Bergmann.
      
      * tag 'powerpc-6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/32: Select ARCH_SPLIT_ARG64
        powerpc/32: fix syscall wrappers with 64-bit arguments
        asm-generic: compat: fix compat_arg_u64() and compat_arg_u64_dual()
        powerpc/64e: Fix amdgpu build on Book3E w/o AltiVec
      4d740391
    • Merge branch 'add-new-pcp-and-apptrust-attributes-to-dcbnl' · d9095f92
      Paolo Abeni authored
      Daniel Machon says:
      
      ====================
      Add new PCP and APPTRUST attributes to dcbnl
      
      This patch series adds new extension attributes to dcbnl, to support PCP
      prioritization (and thereby hw offloadable pcp-based queue
      classification) and per-selector trust and trust order. Additionally,
      the microchip sparx5 driver has been dcb-enabled to make use of the new
      attributes to offload PCP, DSCP and Default prio to the switch, and
      implement trust order of selectors.
      
      For pre-RFC discussion see:
      https://lore.kernel.org/netdev/Yv9VO1DYAxNduw6A@DEN-LT-70577/
      
      For RFC series see:
      https://lore.kernel.org/netdev/20220915095757.2861822-1-daniel.machon@microchip.com/
      
      In summary: there currently exists no convenient way to offload per-port
      PCP-based queue classification to hardware. The DCB subsystem offers
      different ways to prioritize through its APP table, but lacks an option
      for PCP. Similarly, there is no way to indicate the notion of trust for
      APP table selectors. This patch series addresses both topics.
      
      PCP based queue classification:
        - 802.1Q standardizes the Priority Code Point table (see 6.9.3 of IEEE
          Std 802.1Q-2018).  This patch series makes it possible to offload
          the PCP classification to said table.  The new PCP selector is not a
          standard part of the APP managed object, therefore it is
          encapsulated in a new non-std extension attribute.
      
      Selector trust:
        - ASICs often have the notion of trust DSCP and trust PCP. The new
          attribute makes it possible to specify a trust order of app
          selectors, which drivers can then react to.
      
      DCB-enable sparx5 driver:
       - Now supports offloading of DSCP, PCP and default priority. Only one
         mapping of protocol:priority is allowed. Consecutive mappings of the
         same protocol to some new priority will overwrite the previous one. This
         is to keep a consistent view of the app table and the hardware.
       - Now supports dscp and pcp trust, by use of the introduced
         dcbnl_set/getapptrust ops. Sparx5 supports trust orders: [], [dscp],
         [pcp] and [dscp, pcp]. For now, only DSCP and PCP selectors are
         supported by the driver, everything else is bounced.
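
      The "one mapping per protocol" rule above can be sketched as a
      table update that overwrites in place. This is an illustrative
      model only: struct app, struct app_table and app_set() are
      hypothetical names, not the driver's or dcbnl's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define APP_MAX 32 /* hypothetical capacity, for illustration only */

/* An app triplet: selector, protocol value, priority. */
struct app {
	uint8_t selector;
	uint16_t protocol;
	uint8_t priority;
};

struct app_table {
	struct app entries[APP_MAX];
	size_t n;
};

/*
 * Set the priority for {selector, protocol}. An existing mapping for the
 * same key is overwritten, so the table never holds two priorities for
 * one protocol. Returns 0 on success, -1 if the table is full.
 */
static int app_set(struct app_table *t, struct app a)
{
	assert(t != NULL);
	for (size_t i = 0; i < t->n; i++) {
		if (t->entries[i].selector == a.selector &&
		    t->entries[i].protocol == a.protocol) {
			t->entries[i].priority = a.priority;
			return 0;
		}
	}
	if (t->n == APP_MAX)
		return -1;
	t->entries[t->n++] = a;
	return 0;
}
```

      Keeping a single entry per key means the software table and the
      hardware table cannot disagree about which priority is active.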
      
      Patch #1 introduces a new PCP selector to the APP object, which makes it
      possible to encode PCP and DEI in the app triplet and offload it to the
      PCP table of the ASIC.
      
      Patch #2 Introduces the new extension attributes
      DCB_ATTR_DCB_APP_TRUST_TABLE and DCB_ATTR_DCB_APP_TRUST. Trusted
      selectors are passed in the nested DCB_ATTR_DCB_APP_TRUST_TABLE
      attribute, and assembled into an array of selectors:
      
        u8 selectors[256];
      
      where lower indexes have higher precedence.  In the array, selectors are
      stored consecutively, starting from index zero. With a maximum number of
      256 unique selectors, the list has the same maximum size.
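
      A minimal sketch of such a trust array, assuming duplicates are
      rejected and precedence is simply the array index. The helper
      names (trust_add, trust_rank) and the selector constants below
      are illustrative assumptions, not the dcbnl implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRUST_MAX 256

/* Illustrative selector values; treat them as assumptions. */
enum { SEL_DSCP = 5, SEL_PCP = 255 };

struct trust_table {
	uint8_t selectors[TRUST_MAX]; /* index 0 has the highest precedence */
	size_t nselectors;
};

/* Append a selector; returns 0 on success, -1 on a duplicate or full table. */
static int trust_add(struct trust_table *t, uint8_t sel)
{
	assert(t != NULL);
	for (size_t i = 0; i < t->nselectors; i++)
		if (t->selectors[i] == sel)
			return -1;
	if (t->nselectors == TRUST_MAX)
		return -1;
	t->selectors[t->nselectors++] = sel;
	return 0;
}

/* Returns the precedence (array index) of sel, or -1 if it is not trusted. */
static int trust_rank(const struct trust_table *t, uint8_t sel)
{
	assert(t != NULL);
	for (size_t i = 0; i < t->nselectors; i++)
		if (t->selectors[i] == sel)
			return (int)i;
	return -1;
}
```

      Because a selector is a u8 and entries are unique, at most 256
      entries can ever be stored, which is why the array and the list
      share the same maximum size.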
      
      Patch #3 Sets up the dcbnl ops hook and adds support for offloading pcp
      app entries to the PCP table of the switch.
      
      Patch #4 Makes use of the dcbnl_set/getapptrust ops to set a per-port
      trust order.
      
      Patch #5 Adds support for offloading dscp app entries to the DSCP table
      of the switch.
      
      Patch #6 Adds support for offloading default prio app entries to the
      switch.
      
      ====================
      
      Link: https://lore.kernel.org/r/20221101094834.2726202-1-daniel.machon@microchip.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      d9095f92
    • net: microchip: sparx5: add support for offloading default prio · c58ff3ed
      Daniel Machon authored
      Add support for offloading default prio {ETHERTYPE, 0, prio}.
      Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      c58ff3ed