1. 10 Jun, 2022 9 commits
  2. 09 Jun, 2022 31 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · a98a62e4
      Jakub Kicinski authored
      No conflicts.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'net-5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 825464e7
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from bpf and netfilter.
      
        Current release - regressions:
      
         - eth: amt: fix possible null-ptr-deref in amt_rcv()
      
        Previous releases - regressions:
      
         - tcp: use alloc_large_system_hash() to allocate table_perturb
      
         - af_unix: fix a data-race in unix_dgram_peer_wake_me()
      
         - nfc: st21nfca: fix memory leaks in EVT_TRANSACTION handling
      
         - eth: ixgbe: fix unexpected VLAN rx in promisc mode on VF
      
        Previous releases - always broken:
      
         - ipv6: fix signed integer overflow in __ip6_append_data
      
         - netfilter:
             - nat: really support inet nat without l3 address
             - nf_tables: memleak flow rule from commit path
      
         - bpf: fix calling global functions from BPF_PROG_TYPE_EXT programs
      
         - openvswitch: fix misuse of the cached connection on tuple changes
      
         - nfc: nfcmrvl: fix memory leak in nfcmrvl_play_deferred
      
         - eth: altera: fix refcount leak in altera_tse_mdio_create
      
        Misc:
      
         - add Quentin Monnet to bpftool maintainers"
      
      * tag 'net-5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits)
        net: amd-xgbe: fix clang -Wformat warning
        tcp: use alloc_large_system_hash() to allocate table_perturb
        net: dsa: realtek: rtl8365mb: fix GMII caps for ports with internal PHY
        net: dsa: mv88e6xxx: correctly report serdes link failure
        net: dsa: mv88e6xxx: fix BMSR error to be consistent with others
        net: dsa: mv88e6xxx: use BMSR_ANEGCOMPLETE bit for filling an_complete
        net: altera: Fix refcount leak in altera_tse_mdio_create
        net: openvswitch: fix misuse of the cached connection on tuple changes
        net: ethernet: mtk_eth_soc: fix misuse of mem alloc interface netdev[napi]_alloc_frag
        ip_gre: test csum_start instead of transport header
        au1000_eth: stop using virt_to_bus()
        ipv6: Fix signed integer overflow in l2tp_ip6_sendmsg
        ipv6: Fix signed integer overflow in __ip6_append_data
        nfc: nfcmrvl: Fix memory leak in nfcmrvl_play_deferred
        nfc: st21nfca: fix incorrect sizing calculations in EVT_TRANSACTION
        nfc: st21nfca: fix memory leaks in EVT_TRANSACTION handling
        nfc: st21nfca: fix incorrect validating logic in EVT_TRANSACTION
        net: ipv6: unexport __init-annotated seg6_hmac_init()
        net: xfrm: unexport __init-annotated xfrm4_protocol_init()
        net: mdio: unexport __init-annotated mdio_bus_init()
        ...
    • netfs: gcc-12: temporarily disable '-Wattribute-warning' for now · 507160f4
      Linus Torvalds authored
      This is a pure band-aid so that I can continue merging stuff from people
      while some of the gcc-12 fallout gets sorted out.
      
      In particular, gcc-12 is very unhappy about the kinds of pointer
      arithmetic tricks that netfs does, and that makes the fortify checks
      trigger in afs and ceph:
      
        In function ‘fortify_memset_chk’,
            inlined from ‘netfs_i_context_init’ at include/linux/netfs.h:327:2,
            inlined from ‘afs_set_netfs_context’ at fs/afs/inode.c:61:2,
            inlined from ‘afs_root_iget’ at fs/afs/inode.c:543:2:
        include/linux/fortify-string.h:258:25: warning: call to ‘__write_overflow_field’ declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wattribute-warning]
          258 |                         __write_overflow_field(p_size_field, size);
              |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      and the reason is that netfs_i_context_init() is passed a 'struct inode'
      pointer, and then it does
      
              struct netfs_i_context *ctx = netfs_i_context(inode);
      
              memset(ctx, 0, sizeof(*ctx));
      
      where that netfs_i_context() function just does pointer arithmetic on
      the inode pointer, knowing that the netfs_i_context is laid out
      immediately after it in memory.
      
      This is all truly disgusting, since the whole "netfs_i_context is laid
      out immediately after it in memory" is not actually remotely true in
      general, but is just made to be that way for afs and ceph.
      
      See for example fs/cifs/cifsglob.h:
      
        struct cifsInodeInfo {
              struct {
                      /* These must be contiguous */
                      struct inode    vfs_inode;      /* the VFS's inode record */
                      struct netfs_i_context netfs_ctx; /* Netfslib context */
              };
      	[...]
      
      and realize that this is all entirely wrong, and the pointer arithmetic
      that netfs_i_context() is doing is also very very wrong and wouldn't
      give the right answer if netfs_ctx had different alignment rules from a
      'struct inode', for example.
      
      Anyway, that's just a long-winded way to say "the gcc-12 warning is
      actually quite reasonable, and our code happens to work but is pretty
      disgusting".
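
      To make the alignment point concrete, here is a minimal user-space
      sketch. The toy_* types are hypothetical stand-ins (not the kernel's
      'struct inode' or 'struct netfs_i_context'), and the 64-byte alignment
      is an assumption chosen only to make the mismatch visible. It compares
      the offset that "inode + 1" style arithmetic assumes with the offset
      the compiler actually assigns inside a cifsInodeInfo-like wrapper:

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
struct toy_inode {
	long i_ino;
	char i_flags[4];
};

/* Assumption: the context needs stricter alignment than the inode. */
struct toy_ctx {
	_Alignas(64) long cache_cookie;
};

/* Same shape as the wrapper quoted above: inode first, context second. */
struct toy_wrapper {
	struct toy_inode vfs_inode;
	struct toy_ctx netfs_ctx;
};

/* Offset implied by "the context starts right after the inode". */
static size_t ctx_offset_assumed(void)
{
	return sizeof(struct toy_inode);
}

/* Offset the compiler actually uses, padding included. */
static size_t ctx_offset_actual(void)
{
	return offsetof(struct toy_wrapper, netfs_ctx);
}
```

      With these toy types the two offsets disagree, which is the failure
      mode described above; with identical alignment rules they would
      happen to match.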
      
      This is getting fixed properly, but for now I made the mistake of
      thinking "the week right after the merge window tends to be calm for me
      as people take a breather" and I did a system upgrade.  And I got gcc-12
      as a result, so to continue merging fixes from people and not have the
      end result drown in warnings, I am fixing all these gcc-12 issues I hit.
      
      Including with these kinds of temporary fixes.
      
      Cc: Kees Cook <keescook@chromium.org>
      Cc: David Howells <dhowells@redhat.com>
      Link: https://lore.kernel.org/all/AEEBCF5D-8402-441D-940B-105AA718C71F@chromium.org/
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • gcc-12: disable '-Warray-bounds' universally for now · f0be87c4
      Linus Torvalds authored
      In commit 8b202ee2 ("s390: disable -Warray-bounds") the s390 people
      disabled the '-Warray-bounds' warning for gcc-12, because the new logic
      in gcc would cause warnings for their use of the S390_lowcore macro,
      which accesses absolute pointers.
      
      It turns out gcc-12 has many other issues in this area, so this takes
      that s390 warning disable logic, and turns it into a kernel build config
      entry instead.
      
      Part of the intent is that we can make this all much more targeted, and
      use this config flag to disable it in only particular configurations
      that cause problems, with the s390 case as an example:
      
              select GCC12_NO_ARRAY_BOUNDS
      
      and we could do that for other configuration cases that cause issues.
      
      Or we could possibly use the CONFIG_CC_NO_ARRAY_BOUNDS thing in a more
      targeted way, and disable the warning only for particular uses: again
      the s390 case as an example:
      
        KBUILD_CFLAGS_DECOMPRESSOR += $(if $(CONFIG_CC_NO_ARRAY_BOUNDS),-Wno-array-bounds)
      
      but this ends up just doing it globally in the top-level Makefile, since
      the current issues are spread fairly widely all over:
      
        KBUILD_CFLAGS-$(CONFIG_CC_NO_ARRAY_BOUNDS) += -Wno-array-bounds
      
      We'll try to limit this later, since the gcc-12 problems are rare enough
      that *much* of the kernel can be built with it without disabling this
      warning.
      
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mellanox: mlx5: avoid uninitialized variable warning with gcc-12 · 842c3b3d
      Linus Torvalds authored
      gcc-12 started warning about 'tracker' being used uninitialized:
      
        drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c: In function ‘mlx5_do_bond’:
        drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c:786:28: warning: ‘tracker’ is used uninitialized [-Wuninitialized]
          786 |         struct lag_tracker tracker;
              |                            ^~~~~~~
      
      which seems to be because it doesn't track how the use (and
      initialization) is bound by the 'do_bond' flag.
      
      But admittedly that 'do_bond' usage is fairly complicated, and involves
      passing it around as an argument to helper functions, so it's somewhat
      understandable that gcc doesn't see how that all works.
      
      This function could be rewritten to make the use of that tracker
      variable more obviously safe, but for now I'm just adding the forced
      initialization of it.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • gcc-12: disable '-Wdangling-pointer' warning for now · 49beadbd
      Linus Torvalds authored
      While the concept of checking for dangling pointers to local variables
      at function exit is really interesting, the gcc-12 implementation is not
      compatible with reality, and results in false positives.
      
      For example, gcc sees us putting things on a local list head allocated
      on the stack, which involves exactly those kinds of pointers to the
      local stack entry:
      
        In function ‘__list_add’,
            inlined from ‘list_add_tail’ at include/linux/list.h:102:2,
            inlined from ‘rebuild_snap_realms’ at fs/ceph/snap.c:434:2:
        include/linux/list.h:74:19: warning: storing the address of local variable ‘realm_queue’ in ‘*&realm_27(D)->rebuild_item.prev’ [-Wdangling-pointer=]
           74 |         new->prev = prev;
              |         ~~~~~~~~~~^~~~~~
      
      But then gcc - understandably - doesn't really understand the big
      picture how the doubly linked list works, so doesn't see how we then end
      up emptying said list head in a loop and the pointer we added has been
      removed.
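
      The pattern gcc flags can be sketched in user space as follows. The
      toy_* helpers are simplified re-implementations written for this
      example (not the kernel's list.h), and toy_rebuild() is a hypothetical
      function loosely shaped like the ceph case: it links a long-lived
      object onto a list head that lives on the stack, then drains the list
      before returning, so no pointer to the stack survives.

```c
#include <stddef.h>

struct toy_list_head {
	struct toy_list_head *next, *prev;
};

static void toy_init_list_head(struct toy_list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void toy_list_add_tail(struct toy_list_head *new, struct toy_list_head *head)
{
	struct toy_list_head *prev = head->prev;

	new->prev = prev;	/* may store the address of a stack variable */
	new->next = head;
	prev->next = new;
	head->prev = new;
}

static void toy_list_del_init(struct toy_list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	toy_init_list_head(entry);	/* entry points only at itself again */
}

static int toy_list_empty(const struct toy_list_head *h)
{
	return h->next == h;
}

struct toy_realm {
	int id;
	struct toy_list_head rebuild_item;
};

/* Returns the number of entries processed from the on-stack queue. */
static int toy_rebuild(struct toy_realm *realm)
{
	struct toy_list_head queue;	/* local list head on the stack */
	int handled = 0;

	toy_init_list_head(&queue);
	toy_list_add_tail(&realm->rebuild_item, &queue);

	/* Drain the queue: every entry is unlinked before we return. */
	while (!toy_list_empty(&queue)) {
		toy_list_del_init(queue.next);
		handled++;
	}
	return handled;
}
```

      After toy_rebuild() returns, realm->rebuild_item points back at itself,
      which is the "big picture" a per-store analysis cannot see.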
      
      Gcc also complains about us (intentionally) using this as a way to store
      a kind of fake stack trace, eg
      
        drivers/acpi/acpica/utdebug.c:40:38: warning: storing the address of local variable ‘current_sp’ in ‘acpi_gbl_entry_stack_pointer’ [-Wdangling-pointer=]
           40 |         acpi_gbl_entry_stack_pointer = &current_sp;
              |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~
      
      which is entirely reasonable from a compiler standpoint, and we may want
      to change those kinds of patterns, but not now.
      
      So this is one of those "it would be lovely if the compiler were to
      complain about us leaving dangling pointers to the stack", but not this
      way.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • drm: imx: fix compiler warning with gcc-12 · 7aefd8b5
      Linus Torvalds authored
      Gcc-12 correctly warned about this code using a non-NULL pointer as a
      truth value:
      
        drivers/gpu/drm/imx/ipuv3-crtc.c: In function ‘ipu_crtc_disable_planes’:
        drivers/gpu/drm/imx/ipuv3-crtc.c:72:21: error: the comparison will always evaluate as ‘true’ for the address of ‘plane’ will never be NULL [-Werror=address]
           72 |                 if (&ipu_crtc->plane[1] && plane == &ipu_crtc->plane[1]->base)
              |                     ^
      
      due to the extraneous '&' address-of operator.
      
      Philipp Zabel points out that the mistake had no adverse effect since
      the following condition doesn't actually dereference the NULL pointer,
      but the intent of the code was obviously to check for it, not to take
      the address of the member.
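
      A reduced sketch of the intended check, using hypothetical toy_* types
      rather than the real ipu_crtc structures. The buggy form is left in a
      comment so the example compiles cleanly; plane_is_overlay() is an
      illustrative name, not a function from the driver.

```c
#include <stddef.h>

struct toy_plane {
	int base;
};

struct toy_crtc {
	struct toy_plane *plane[2];
};

/*
 * Buggy shape (kept as a comment so the example builds without warnings):
 *     if (&crtc->plane[1] && base == &crtc->plane[1]->base)
 * &crtc->plane[1] is the address of an array slot and can never be NULL.
 */
static int plane_is_overlay(const struct toy_crtc *crtc, const int *base)
{
	/* Test the stored pointer itself, then compare against its member. */
	return crtc->plane[1] && base == &crtc->plane[1]->base;
}
```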
      
      Fixes: eb8c8880 ("drm/imx: add deferred plane disabling")
      Acked-by: Philipp Zabel <p.zabel@pengutronix.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • net: macb: change return type for gem_ptp_set_one_step_sync() · 263efe85
      Claudiu Beznea authored
      gem_ptp_set_one_step_sync() always returns zero thus change its return
      type to void.
      Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
      Link: https://lore.kernel.org/r/20220608080818.1495044-1-claudiu.beznea@microchip.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Merge branch 'vmxnet3-upgrade-to-version-7' · e4c437cd
      Paolo Abeni authored
      Ronak Doshi says:
      
      ====================
      vmxnet3: upgrade to version 7
      
      vmxnet3 emulation has recently added several new features including
      support for uniform passthrough (UPT). To make UPT work vmxnet3 has
      to be enhanced as per the new specification. This patch series
      extends the vmxnet3 driver to leverage these new features.
      
      Compatibility is maintained using existing vmxnet3 versioning mechanism as
      follows:
       - new features added to vmxnet3 emulation are associated with new vmxnet3
         version viz. vmxnet3 version 7.
       - emulation advertises all the versions it supports to the driver.
       - during initialization, vmxnet3 driver picks the highest version number
       supported by both the emulation and the driver and configures emulation
       to run at that version.
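
      The "highest version supported by both sides" rule can be sketched as
      below. This is an illustration only: pick_version() is a hypothetical
      helper, and the encoding (bit n - 1 of the advertised mask set means
      "version n is supported") is an assumption made for the example, not
      something stated in this series.

```c
#include <stdint.h>

/*
 * Assumption: bit (n - 1) set in 'advertised' means the emulation supports
 * version n. 'driver_max' is the newest version the driver implements.
 * Returns the highest common version, or 0 if there is none.
 */
static unsigned int pick_version(uint32_t advertised, unsigned int driver_max)
{
	unsigned int v;

	if (driver_max > 32)
		driver_max = 32;

	/* Walk down from the driver's newest version to the oldest. */
	for (v = driver_max; v >= 1; v--) {
		if (advertised & (UINT32_C(1) << (v - 1)))
			return v;
	}
	return 0;
}
```

      Under that assumed encoding, a version 7 driver talking to an emulation
      that advertises only versions 1 through 6 would settle on version 6.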
      
      In particular, following changes are introduced:
      
      Patch 1:
        This patch introduces utility macros for vmxnet3 version 7 comparison
        and updates Copyright information.
      
      Patch 2:
        This patch adds new capability registers to fine control enablement of
        individual features based on emulation and passthrough.
      
      Patch 3:
        This patch adds support for large passthrough BAR register.
      
      Patch 4:
        This patch adds support for out of order rx completion processing.
      
      Patch 5:
        This patch introduces new command to set ring buffer sizes to pass this
        information to the hardware.
      
      Patch 6:
        For better performance, hardware has a requirement to limit number of TSO
        descriptors. This patch adds that support.
      
      Patch 7:
        With vmxnet3 version 7, new descriptor fields are used to indicate
        encapsulation offload.
      
      Patch 8:
        With all vmxnet3 version 7 changes incorporated in the vmxnet3 driver,
        with this patch, the driver can configure emulation to run at vmxnet3
        version 7.
      
      Changes in v2->v3:
       - use correct byte ordering for ringBufSize
      
      Changes in v2:
       - use local rss_fields variable for the rss capability checks in patch 2
      ====================
      
      Link: https://lore.kernel.org/r/20220608032353.964-1-doshir@vmware.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • vmxnet3: update to version 7 · acc38e04
      Ronak Doshi authored
      With all vmxnet3 version 7 changes incorporated in the vmxnet3 driver,
      the driver can configure emulation to run at vmxnet3 version 7, provided
      the emulation advertises support for version 7.
      Signed-off-by: Ronak Doshi <doshir@vmware.com>
      Acked-by: Guolin Yang <gyang@vmware.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • vmxnet3: use ext1 field to indicate encapsulated packet · 60cafa03
      Ronak Doshi authored
      Till vmxnet3 version 6, om field of transmit descriptor was used
      to indicate encapsulated offload packet and msscof was used to
      indirectly indicate TSO/CSO. From version 7 and later, ext1 field
      will be used to indicate whether packet is encapsulated or not and
      om fields will continue to indicate if the packet is TSO or CSO.
      Signed-off-by: Ronak Doshi <doshir@vmware.com>
      Acked-by: Guolin Yang <gyang@vmware.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • vmxnet3: limit number of TXDs used for TSO packet · d2857b99
      Ronak Doshi authored
      Currently, vmxnet3 does not have a limit on number of descriptors
      used for a TSO packet. However, with UPT, for hardware performance
      reasons, this patch limits the number of transmit descriptors to 24
      for a TSO packet.
      Signed-off-by: Ronak Doshi <doshir@vmware.com>
      Acked-by: Guolin Yang <gyang@vmware.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • vmxnet3: add command to set ring buffer sizes · c7112ebd
      Ronak Doshi authored
      This patch adds a new command to set ring buffer sizes. This is
      required to pass the buffer size information to passthrough devices.
      For performance reasons, with version 7 and later, ring1 will contain
      only mtu size buffers (bound to 3K). Packets > 3K will use both ring1
      and ring2.
      
      Also, ring sizes are rounded down to a power of 2 and ring2 default
      size is increased to 512.
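
      For reference, rounding a requested ring size down to a power of 2 can
      be sketched like this. round_down_pow2() is a hypothetical stand-alone
      helper written for illustration; it is not taken from the driver.

```c
#include <stdint.h>

/* Largest power of 2 that is <= n; returns 0 for n == 0. */
static uint32_t round_down_pow2(uint32_t n)
{
	uint32_t p = 1;

	if (n == 0)
		return 0;

	/* Comparing against n / 2 keeps p *= 2 from overflowing. */
	while (p <= n / 2)
		p *= 2;
	return p;
}
```

      A requested size of 1000 entries would become 512 under this rule,
      while a size that is already a power of 2 is left unchanged.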
      Signed-off-by: Ronak Doshi <doshir@vmware.com>
      Acked-by: Guolin Yang <gyang@vmware.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • vmxnet3: add support for out of order rx completion · 2c5a5748
      Ronak Doshi authored
      Currently, vmxnet3 processes rx completions in-order i.e. no
      out of order completion descriptor is expected. With UPT, if
      hardware supports LRO, then hardware can report out of order
      rx completions. This patch enhances vmxnet3 to add this support.
      This support takes effect only when the corresponding feature
      bit is set.
      
      Also, minor enhancements are done for performance.
      Signed-off-by: Ronak Doshi <doshir@vmware.com>
      Acked-by: Guolin Yang <gyang@vmware.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • vmxnet3: add support for large passthrough BAR register · 543fb674
      Ronak Doshi authored
      For vmxnet3 to work in UPT mode, the BAR sizes have been increased.
      The PT page has been extended to 2 pages and also includes OOB pages
      as a part of PT BAR. This patch enhances vmxnet3 to use appropriate
      BAR offsets based on the capability registered. To use new offsets,
      VMXNET3_CAP_LARGE_BAR needs to be set by the device. If it is not set
      then the device will use legacy PT page layout.
      Signed-off-by: Ronak Doshi <doshir@vmware.com>
      Acked-by: Guolin Yang <gyang@vmware.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • vmxnet3: add support for capability registers · 6f91f4ba
      Ronak Doshi authored
      This patch enhances vmxnet3 to support capability registers, which
      allow it to enable features selectively. The DCR register tracks
      the capabilities vmxnet3 device supports. The PTCR register states
      the capabilities that the passthrough device supports.
      
      With the help of these registers, vmxnet3 can enable only those
      features which the passthrough device supports. This allows
      smooth transition to Uniform-Passthrough (UPT) mode if the virtual
      nic requests it. If PTCR register returns nothing or error it means
      UPT is not being requested and vnic will continue in emulation mode.
      Signed-off-by: Ronak Doshi <doshir@vmware.com>
      Acked-by: Guolin Yang <gyang@vmware.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • vmxnet3: prepare for version 7 changes · 55f0395f
      Ronak Doshi authored
      vmxnet3 is currently at version 6 and this patch initiates the
      preparation to accommodate changes for up to version 7. It introduces
      utility macros for vmxnet3 version 7 comparison and updates the
      copyright information.
      Signed-off-by: Ronak Doshi <doshir@vmware.com>
      Acked-by: Guolin Yang <gyang@vmware.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: amd-xgbe: fix clang -Wformat warning · 647df0d4
      Justin Stitt authored
      see warning:
      | drivers/net/ethernet/amd/xgbe/xgbe-drv.c:2787:43: warning: format specifies
      | type 'unsigned short' but the argument has type 'int' [-Wformat]
      |        netdev_dbg(netdev, "Protocol: %#06hx\n", ntohs(eth->h_proto));
      |                                      ~~~~~~     ^~~~~~~~~~~~~~~~~~~
      
      Variadic functions (printf-like) undergo default argument promotion.
      Documentation/core-api/printk-formats.rst specifically recommends
      using the promoted-to-type's format flag.
      
      Also, as per C11 6.3.1.1:
      (https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf)
      `If an int can represent all values of the original type ..., the
      value is converted to an int; otherwise, it is converted to an
      unsigned int. These are called the integer promotions.`
      
      Since the argument is a u16 it will get promoted to an int and thus it is
      most accurate to use the %x format specifier here. It should be noted that the
      `#06` formatting sugar does not alter the promotion rules.
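
      A small user-space sketch of the same formatting, using snprintf() in
      place of netdev_dbg(). format_proto() is a hypothetical helper written
      for this example; the point is only that a 16-bit argument arrives as
      an int in a variadic call, so plain %x (with the same `#06` flags) is
      the matching specifier.

```c
#include <stdint.h>
#include <stdio.h>

/* Formats a 16-bit protocol number; returns what snprintf() returns. */
static int format_proto(char *buf, size_t len, uint16_t proto)
{
	/*
	 * 'proto' is promoted to int when passed through "...", so %x
	 * without the 'h' length modifier matches the promoted argument.
	 */
	return snprintf(buf, len, "Protocol: %#06x", proto);
}
```

      For example, passing 0x0800 produces "Protocol: 0x0800": the width of
      6 includes the "0x" prefix added by the '#' flag.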
      
      Link: https://github.com/ClangBuiltLinux/linux/issues/378
      Signed-off-by: Justin Stitt <jstitt007@gmail.com>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
      Link: https://lore.kernel.org/r/20220607191119.20686-1-jstitt007@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • tcp: use alloc_large_system_hash() to allocate table_perturb · e67b72b9
      Muchun Song authored
      In our server, there may be no high order (>= 6) memory since we reserve
      lots of HugeTLB pages when booting.  Then the system panics.  So use
      alloc_large_system_hash() to allocate table_perturb.
      
      Fixes: e9261476 ("tcp: dynamically allocate the perturb table used by source ports")
      Signed-off-by: Muchun Song <songmuchun@bytedance.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20220607070214.94443-1-songmuchun@bytedance.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • xen/netback: do some code cleanup · 5834e72e
      Juergen Gross authored
      Remove some unused macros and functions, make local functions static.
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Acked-by: Wei Liu <wei.liu@kernel.org>
      Link: https://lore.kernel.org/r/20220608043726.9380-1-jgross@suse.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: dsa: realtek: rtl8365mb: fix GMII caps for ports with internal PHY · 487994ff
      Alvin Šipraga authored
      Since commit a18e6521 ("net: phylink: handle NA interface mode in
      phylink_fwnode_phy_connect()"), phylib defaults to GMII when no phy-mode
      or phy-connection-type property is specified in a DSA port node of the
      device tree. The same commit caused a regression in rtl8365mb whereby
      phylink would fail to connect, because the driver did not advertise
      support for GMII for ports with internal PHY.
      
      It should be noted that the aforementioned regression is not because the
      blamed commit was incorrect: on the contrary, the blamed commit is
      correcting the previous behaviour whereby unspecified phy-mode would
      cause the internal interface mode to be PHY_INTERFACE_MODE_NA. The
      rtl8365mb driver only worked by accident before because it _did_
      advertise support for PHY_INTERFACE_MODE_NA, despite NA being reserved
      for internal use by phylink. With one mistake fixed, the other was
      exposed.
      
      Commit a5dba0f2 ("net: dsa: rtl8365mb: add GMII as user port mode")
      then introduced implicit support for GMII mode on ports with internal
      PHY to allow a PHY connection for device trees where the phy-mode is not
      explicitly set to "internal". At this point everything was working OK
      again.
      
      Subsequently, commit 6ff60646 ("net: dsa: realtek: convert to
      phylink_generic_validate()") broke this behaviour again by discarding
      the usage of rtl8365mb_phy_mode_supported() - where this GMII support
      was indicated - while switching to the new .phylink_get_caps API.
      
      With the new API, rtl8365mb_phy_mode_supported() is no longer needed.
      Remove it altogether and add back the GMII capability - this time to
      rtl8365mb_phylink_get_caps() - so that the above default behaviour works
      for ports with internal PHY again.
      
      Fixes: 6ff60646 ("net: dsa: realtek: convert to phylink_generic_validate()")
      Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk>
      Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Link: https://lore.kernel.org/r/20220607184624.417641-1-alvin@pqrs.dk
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · 568a32f5
      Jakub Kicinski authored
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2022-06-07
      
      This series contains updates to ixgbe driver only.
      
      Olivier Matz resolves an issue so that broadcast packets can still be
      received when VF removes promiscuous settings and removes setting of
      VLAN promiscuous, in promiscuous mode, to prevent a loop when VFs are
      bridged.
      
      * '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
        ixgbe: fix unexpected VLAN Rx in promisc mode on VF
        ixgbe: fix bcast packets Rx on VF after promisc removal
      ====================
      
      Link: https://lore.kernel.org/r/20220607181538.748786-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue · 42a09d93
      Jakub Kicinski authored
      Tony Nguyen says:
      
      ====================
      40GbE Intel Wired LAN Driver Updates 2022-06-07
      
      This series contains updates to i40e and iavf drivers.
      
      Mateusz adds implementation for setting VF VLAN pruning to allow user to
      specify visibility of VLAN tagged traffic to VFs for i40e. He also adds
      waiting for result from PF for setting MAC address in iavf.
      ====================
      
      Link: https://lore.kernel.org/r/20220607175506.696671-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'mv88e6xxx-fixes-for-reading-serdes-state' · 5d4af9c1
      Jakub Kicinski authored
      Russell King says:
      
      ====================
      mv88e6xxx: fixes for reading serdes state
      
      These are some low-priority fixes to the mv88e6xxx serdes code.
      Patch 1 fixes the reporting of an_complete, which is used in the
      emulation of a conventional C22 PHY. Patch from Marek.
      
      Patch 2 makes one of the error messages in patch 2 to be consistent
      with the other error messages in this function.
      
      Patch 3 ensures that we do not miss a link-failure event.
      ====================
      
      Link: https://lore.kernel.org/r/Yp82TyoLon9jz6k3@shell.armlinux.org.uk
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: dsa: mv88e6xxx: correctly report serdes link failure · b4d78731
      Russell King (Oracle) authored
      Phylink wants to know if the link has dropped since the last time state
      was retrieved, and the BMSR gives us that. Read the BMSR and use it when
      deciding the link state. Fill in the an_complete member as well for the
      emulated PHY state.
      Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: dsa: mv88e6xxx: fix BMSR error to be consistent with others · 2b4bb9cd
      Russell King (Oracle) authored
      Other errors accessing the registers in mv88e6352_serdes_pcs_get_state()
      print "PHY " before the register name, except for the BMSR. Make this
      consistent with the other error messages.
      Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: dsa: mv88e6xxx: use BMSR_ANEGCOMPLETE bit for filling an_complete · 47e96930
      Marek Behún authored
      Commit ede359d8 ("net: dsa: mv88e6xxx: Link in pcs_get_state() if AN
      is bypassed") added the ability to link if AN was bypassed, and added
      filling of state->an_complete field, but set it to true if AN was
      enabled in BMCR, not when AN was reported complete in BMSR.
      
      This was done because for some reason, when I wanted to use BMSR value
      to infer an_complete, I was looking at BMSR_ANEGCAPABLE bit (which was
      always 1), instead of BMSR_ANEGCOMPLETE bit.
      
      Use BMSR_ANEGCOMPLETE for filling state->an_complete.
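
      The difference between the two bits can be shown with a short sketch.
      The bit values match include/uapi/linux/mii.h; the two helper
      functions are hypothetical and exist only to contrast the mistaken
      test with the intended one.

```c
#include <stdbool.h>
#include <stdint.h>

#define BMSR_ANEGCAPABLE	0x0008	/* able to do auto-negotiation */
#define BMSR_ANEGCOMPLETE	0x0020	/* auto-negotiation complete */

/* Mistaken test: "capable" stays set whether or not AN has finished. */
static bool an_complete_from_capable(uint16_t bmsr)
{
	return bmsr & BMSR_ANEGCAPABLE;
}

/* Intended test: set only once auto-negotiation has completed. */
static bool an_complete_from_bmsr(uint16_t bmsr)
{
	return bmsr & BMSR_ANEGCOMPLETE;
}
```

      For a status word where only the "capable" bit is set, the first
      helper reports completion while the second correctly does not.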
      
      Fixes: ede359d8 ("net: dsa: mv88e6xxx: Link in pcs_get_state() if AN is bypassed")
      Signed-off-by: Marek Behún <kabel@kernel.org>
      Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: altera: Fix refcount leak in altera_tse_mdio_create · 11ec18b1
      Miaoqian Lin authored
      Every iteration of for_each_child_of_node() decrements
      the reference count of the previous node.
      When breaking out of a for_each_child_of_node() loop,
      we need to explicitly call of_node_put() on the child node when
      it is no longer needed.
      Add missing of_node_put() to avoid refcount leak.
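
      The rule can be illustrated with a user-space model. Everything here
      is a toy written for this example: toy_next_child() mimics an iterator
      that drops the reference on the previous node and takes one on the
      next, and toy_node_put() stands in for of_node_put(). None of it is
      the kernel's device tree API.

```c
#include <stddef.h>

struct toy_node {
	int refcount;
	int is_mdio;
};

struct toy_parent {
	struct toy_node *children;
	int nr_children;
};

/* Drops the reference on prev (if any) and takes one on the next child. */
static struct toy_node *toy_next_child(struct toy_parent *p, struct toy_node *prev)
{
	int idx = prev ? (int)(prev - p->children) + 1 : 0;

	if (prev)
		prev->refcount--;
	if (idx >= p->nr_children)
		return NULL;
	p->children[idx].refcount++;
	return &p->children[idx];
}

static void toy_node_put(struct toy_node *n)
{
	if (n)
		n->refcount--;
}

/* Returns 1 if a child marked is_mdio exists, 0 otherwise. */
static int toy_find_mdio(struct toy_parent *p, int put_on_break)
{
	struct toy_node *child;

	for (child = toy_next_child(p, NULL); child; child = toy_next_child(p, child)) {
		if (child->is_mdio)
			break;	/* leaves the loop still holding a reference */
	}
	if (!child)
		return 0;
	if (put_on_break)
		toy_node_put(child);	/* the missing put the fix adds */
	return 1;
}
```

      Calling toy_find_mdio() with put_on_break set to 0 leaves the matched
      child with an extra reference, which is the leak being fixed.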
      
      Fixes: bbd2190c ("Altera TSE: Add main and header file for Altera Ethernet Driver")
      Signed-off-by: default avatarMiaoqian Lin <linmq006@gmail.com>
      Link: https://lore.kernel.org/r/20220607041144.7553-1-linmq006@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      11ec18b1
    • net: openvswitch: fix misuse of the cached connection on tuple changes · 2061ecfd
      Ilya Maximets authored
      If the packet headers change, the cached nfct is no longer relevant
      for the packet, and an attempt to re-use it leads to incorrect packet
      classification.
      
      This issue is causing broken connectivity in OpenStack deployments
      with OVS/OVN due to hairpin traffic being unexpectedly dropped.
      
      The setup has datapath flows with several conntrack actions and tuple
      changes between them:
      
        actions:ct(commit,zone=8,mark=0/0x1,nat(src)),
                set(eth(src=00:00:00:00:00:01,dst=00:00:00:00:00:06)),
                set(ipv4(src=172.18.2.10,dst=192.168.100.6,ttl=62)),
                ct(zone=8),recirc(0x4)
      
      After the first ct() action the packet headers are almost fully
      re-written.  The next ct() tries to re-use the existing nfct entry
      and marks the packet as invalid, so it gets dropped later in the
      pipeline.
      
      Clear the cached conntrack entry whenever the packet tuple is changed
      to avoid the issue.
      
      The flow key should not be cleared though, because we should still
      be able to match on the ct_state if the recirculation happens after
      the tuple change but before the next ct() action.
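
The intended split between the cached entry and the flow key can be sketched with a toy model. The struct and function names below are hypothetical stand-ins, not the Open vSwitch datapath code: cached_ct models the nfct cached on the packet, and key_ct_state models the ct_state held in the flow key.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy packet: one header field plus the two pieces of conntrack state. */
struct toy_packet {
	uint32_t ipv4_src;
	const void *cached_ct;  /* stand-in for the cached conntrack entry */
	uint32_t key_ct_state;  /* stand-in for ct_state in the flow key */
};

/* Hypothetical set action: rewriting part of the tuple invalidates the
 * cached entry, so the next ct() has to do a fresh lookup. The flow key
 * is left alone so that a recirculation before the next ct() can still
 * match on ct_state. */
void toy_set_ipv4_src(struct toy_packet *pkt, uint32_t new_src)
{
	pkt->ipv4_src = new_src;
	pkt->cached_ct = NULL;
	/* pkt->key_ct_state is deliberately not modified. */
}
```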
      
      Cc: stable@vger.kernel.org
      Fixes: 7f8a436e ("openvswitch: Add conntrack action")
      Reported-by: Frode Nordahl <frode.nordahl@canonical.com>
      Link: https://mail.openvswitch.org/pipermail/ovs-discuss/2022-May/051829.html
      Link: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1967856
      Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
      Link: https://lore.kernel.org/r/20220606221140.488984-1-i.maximets@ovn.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      2061ecfd
    • net: ethernet: mtk_eth_soc: fix misuse of mem alloc interface netdev[napi]_alloc_frag · 2f2c0d29
      Chen Lin authored
      When rx_flag == MTK_RX_FLAGS_HWLRO,
      rx_data_len = MTK_MAX_LRO_RX_LENGTH (4096 * 3) > PAGE_SIZE.
      netdev_alloc_frag() is for allocation of page fragments only.
      See other drivers and Documentation/vm/page_frags.rst for reference.
      
      Branch to use __get_free_pages when ring->frag_size > PAGE_SIZE.
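
The branching rule can be sketched in userspace as below. The enum and both functions are hypothetical; page_order() only assumes that the page order is the smallest power-of-two number of pages that covers the requested size, which is how whole-page allocators are usually sized.

```c
#include <stddef.h>

/* Hypothetical labels for the two allocation paths. */
enum rx_alloc_kind {
	RX_ALLOC_PAGE_FRAG,   /* buffer fits in a page fragment */
	RX_ALLOC_WHOLE_PAGES, /* buffer needs one or more whole pages */
};

/* Pick the allocation path from the buffer size, as described above. */
enum rx_alloc_kind pick_rx_alloc(size_t frag_size, size_t page_size)
{
	return frag_size > page_size ? RX_ALLOC_WHOLE_PAGES
				     : RX_ALLOC_PAGE_FRAG;
}

/* Smallest order such that (page_size << order) covers size. */
unsigned int page_order(size_t size, size_t page_size)
{
	unsigned int order = 0;

	while ((page_size << order) < size)
		order++;
	return order;
}
```

For the HW LRO case with 4096-byte pages, a 4096 * 3 byte buffer takes the whole-page path and needs order 2, that is, four pages.
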
      Signed-off-by: Chen Lin <chen45464546@163.com>
      Link: https://lore.kernel.org/r/1654692413-2598-1-git-send-email-chen45464546@163.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      2f2c0d29
    • ip_gre: test csum_start instead of transport header · 8d21e996
      Willem de Bruijn authored
      GRE with TUNNEL_CSUM will apply local checksum offload on
      CHECKSUM_PARTIAL packets.
      
      ipgre_xmit must validate csum_start after an optional skb_pull,
      else lco_csum may trigger an overflow. The original check was
      
      	if (csum && skb_checksum_start(skb) < skb->data)
      		return -EINVAL;
      
      This had false positives when skb_checksum_start is undefined:
      when ip_summed is not CHECKSUM_PARTIAL. A discussed refinement
      was straightforward
      
      	if (csum && skb->ip_summed == CHECKSUM_PARTIAL &&
      	    skb_checksum_start(skb) < skb->data)
      		return -EINVAL;
      
      But was eventually revised more thoroughly:
      - restrict the check to the only branch where needed, in an
        uncommon GRE path that uses header_ops and calls skb_pull.
      - test skb_transport_header, which is set along with csum_start
        in skb_partial_csum_set in the normal header_ops datapath.
      
      Turns out skbs can arrive in this branch without the transport
      header set, e.g., through BPF redirection.
      
      Revise the check back to check csum_start directly, and only if
      CHECKSUM_PARTIAL. Do leave the check in the updated location.
      Check field regardless of whether TUNNEL_CSUM is configured.
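
The shape of the revised check can be modelled in userspace as follows. struct toy_skb and csum_start_valid() are hypothetical; the model assumes, as in the kernel, that the checksum start is the buffer head plus the csum_start offset, and it uses the kernel's numeric values for the ip_summed states.

```c
/* ip_summed states, with the same numeric values as in the kernel. */
enum {
	CHECKSUM_NONE = 0,
	CHECKSUM_PARTIAL = 3,
};

/* Toy stand-in for the few sk_buff fields the check looks at. */
struct toy_skb {
	unsigned char *head;     /* start of the buffer */
	unsigned char *data;     /* current start of packet data */
	unsigned int csum_start; /* offset from head; only meaningful
				  * when ip_summed is CHECKSUM_PARTIAL */
	int ip_summed;
};

/* Returns 1 if the packet may proceed, 0 if it must be rejected.
 * csum_start is only inspected for CHECKSUM_PARTIAL packets, which
 * avoids the false positives of the original unconditional test. */
int csum_start_valid(const struct toy_skb *skb)
{
	if (skb->ip_summed == CHECKSUM_PARTIAL &&
	    skb->head + skb->csum_start < skb->data)
		return 0;
	return 1;
}
```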
      
      Link: https://lore.kernel.org/netdev/YS+h%2FtqCJJiQei+W@shredder/
      Link: https://lore.kernel.org/all/20210902193447.94039-2-willemdebruijn.kernel@gmail.com/T/#u
      Fixes: 8a0ed250 ("ip_gre: validate csum_start only on pull")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
      Link: https://lore.kernel.org/r/20220606132107.3582565-1-willemdebruijn.kernel@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      8d21e996