1. 10 Jun, 2022 21 commits
  2. 09 Jun, 2022 19 commits
    • Jakub Kicinski's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · a98a62e4
      Jakub Kicinski authored
      No conflicts.
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      a98a62e4
    • Linus Torvalds's avatar
      Merge tag 'net-5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 825464e7
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from bpf and netfilter.
      
        Current release - regressions:
      
         - eth: amt: fix possible null-ptr-deref in amt_rcv()
      
        Previous releases - regressions:
      
         - tcp: use alloc_large_system_hash() to allocate table_perturb
      
         - af_unix: fix a data-race in unix_dgram_peer_wake_me()
      
         - nfc: st21nfca: fix memory leaks in EVT_TRANSACTION handling
      
         - eth: ixgbe: fix unexpected VLAN rx in promisc mode on VF
      
        Previous releases - always broken:
      
         - ipv6: fix signed integer overflow in __ip6_append_data
      
         - netfilter:
             - nat: really support inet nat without l3 address
             - nf_tables: memleak flow rule from commit path
      
         - bpf: fix calling global functions from BPF_PROG_TYPE_EXT programs
      
         - openvswitch: fix misuse of the cached connection on tuple changes
      
         - nfc: nfcmrvl: fix memory leak in nfcmrvl_play_deferred
      
         - eth: altera: fix refcount leak in altera_tse_mdio_create
      
        Misc:
      
         - add Quentin Monnet to bpftool maintainers"
      
      * tag 'net-5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits)
        net: amd-xgbe: fix clang -Wformat warning
        tcp: use alloc_large_system_hash() to allocate table_perturb
        net: dsa: realtek: rtl8365mb: fix GMII caps for ports with internal PHY
        net: dsa: mv88e6xxx: correctly report serdes link failure
        net: dsa: mv88e6xxx: fix BMSR error to be consistent with others
        net: dsa: mv88e6xxx: use BMSR_ANEGCOMPLETE bit for filling an_complete
        net: altera: Fix refcount leak in altera_tse_mdio_create
        net: openvswitch: fix misuse of the cached connection on tuple changes
        net: ethernet: mtk_eth_soc: fix misuse of mem alloc interface netdev[napi]_alloc_frag
        ip_gre: test csum_start instead of transport header
        au1000_eth: stop using virt_to_bus()
        ipv6: Fix signed integer overflow in l2tp_ip6_sendmsg
        ipv6: Fix signed integer overflow in __ip6_append_data
        nfc: nfcmrvl: Fix memory leak in nfcmrvl_play_deferred
        nfc: st21nfca: fix incorrect sizing calculations in EVT_TRANSACTION
        nfc: st21nfca: fix memory leaks in EVT_TRANSACTION handling
        nfc: st21nfca: fix incorrect validating logic in EVT_TRANSACTION
        net: ipv6: unexport __init-annotated seg6_hmac_init()
        net: xfrm: unexport __init-annotated xfrm4_protocol_init()
        net: mdio: unexport __init-annotated mdio_bus_init()
        ...
      825464e7
    • Linus Torvalds's avatar
      netfs: gcc-12: temporarily disable '-Wattribute-warning' for now · 507160f4
      Linus Torvalds authored
      This is a pure band-aid so that I can continue merging stuff from people
      while some of the gcc-12 fallout gets sorted out.
      
      In particular, gcc-12 is very unhappy about the kinds of pointer
      arithmetic tricks that netfs does, and that makes the fortify checks
      trigger in afs and ceph:
      
        In function ‘fortify_memset_chk’,
            inlined from ‘netfs_i_context_init’ at include/linux/netfs.h:327:2,
            inlined from ‘afs_set_netfs_context’ at fs/afs/inode.c:61:2,
            inlined from ‘afs_root_iget’ at fs/afs/inode.c:543:2:
        include/linux/fortify-string.h:258:25: warning: call to ‘__write_overflow_field’ declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wattribute-warning]
          258 |                         __write_overflow_field(p_size_field, size);
              |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      and the reason is that netfs_i_context_init() is passed a 'struct inode'
      pointer, and then it does
      
              struct netfs_i_context *ctx = netfs_i_context(inode);
      
              memset(ctx, 0, sizeof(*ctx));
      
      where that netfs_i_context() function just does pointer arithmetic on
      the inode pointer, knowing that the netfs_i_context is laid out
      immediately after it in memory.
      
      This is all truly disgusting, since the whole "netfs_i_context is laid
      out immediately after it in memory" is not actually remotely true in
      general, but is just made to be that way for afs and ceph.
      
      See for example fs/cifs/cifsglob.h:
      
        struct cifsInodeInfo {
              struct {
                      /* These must be contiguous */
                      struct inode    vfs_inode;      /* the VFS's inode record */
                      struct netfs_i_context netfs_ctx; /* Netfslib context */
              };
      	[...]
      
      and realize that this is all entirely wrong, and the pointer arithmetic
      that netfs_i_context() is doing is also very very wrong and wouldn't
      give the right answer if netfs_ctx had different alignment rules from a
      'struct inode', for example).
      
      Anyway, that's just a long-winded way to say "the gcc-12 warning is
      actually quite reasonable, and our code happens to work but is pretty
      disgusting".
      
      This is getting fixed properly, but for now I made the mistake of
      thinking "the week right after the merge window tends to be calm for me
      as people take a breather" and I did a sustem upgrade.  And I got gcc-12
      as a result, so to continue merging fixes from people and not have the
      end result drown in warnings, I am fixing all these gcc-12 issues I hit.
      
      Including with these kinds of temporary fixes.
      
      Cc: Kees Cook <keescook@chromium.org>
      Cc: David Howells <dhowells@redhat.com>
      Link: https://lore.kernel.org/all/AEEBCF5D-8402-441D-940B-105AA718C71F@chromium.org/Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      507160f4
    • Linus Torvalds's avatar
      gcc-12: disable '-Warray-bounds' universally for now · f0be87c4
      Linus Torvalds authored
      In commit 8b202ee2 ("s390: disable -Warray-bounds") the s390 people
      disabled the '-Warray-bounds' warning for gcc-12, because the new logic
      in gcc would cause warnings for their use of the S390_lowcore macro,
      which accesses absolute pointers.
      
      It turns out gcc-12 has many other issues in this area, so this takes
      that s390 warning disable logic, and turns it into a kernel build config
      entry instead.
      
      Part of the intent is that we can make this all much more targeted, and
      use this conflig flag to disable it in only particular configurations
      that cause problems, with the s390 case as an example:
      
              select GCC12_NO_ARRAY_BOUNDS
      
      and we could do that for other configuration cases that cause issues.
      
      Or we could possibly use the CONFIG_CC_NO_ARRAY_BOUNDS thing in a more
      targeted way, and disable the warning only for particular uses: again
      the s390 case as an example:
      
        KBUILD_CFLAGS_DECOMPRESSOR += $(if $(CONFIG_CC_NO_ARRAY_BOUNDS),-Wno-array-bounds)
      
      but this ends up just doing it globally in the top-level Makefile, since
      the current issues are spread fairly widely all over:
      
        KBUILD_CFLAGS-$(CONFIG_CC_NO_ARRAY_BOUNDS) += -Wno-array-bounds
      
      We'll try to limit this later, since the gcc-12 problems are rare enough
      that *much* of the kernel can be built with it without disabling this
      warning.
      
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      f0be87c4
    • Linus Torvalds's avatar
      mellanox: mlx5: avoid uninitialized variable warning with gcc-12 · 842c3b3d
      Linus Torvalds authored
      gcc-12 started warning about 'tracker' being used uninitialized:
      
        drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c: In function ‘mlx5_do_bond’:
        drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c:786:28: warning: ‘tracker’ is used uninitialized [-Wuninitialized]
          786 |         struct lag_tracker tracker;
              |                            ^~~~~~~
      
      which seems to be because it doesn't track how the use (and
      initialization) is bound by the 'do_bond' flag.
      
      But admittedly that 'do_bond' usage is fairly complicated, and involves
      passing it around as an argument to helper functions, so it's somewhat
      understandable that gcc doesn't see how that all works.
      
      This function could be rewritten to make the use of that tracker
      variable more obviously safe, but for now I'm just adding the forced
      initialization of it.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      842c3b3d
    • Linus Torvalds's avatar
      gcc-12: disable '-Wdangling-pointer' warning for now · 49beadbd
      Linus Torvalds authored
      While the concept of checking for dangling pointers to local variables
      at function exit is really interesting, the gcc-12 implementation is not
      compatible with reality, and results in false positives.
      
      For example, gcc sees us putting things on a local list head allocated
      on the stack, which involves exactly those kinds of pointers to the
      local stack entry:
      
        In function ‘__list_add’,
            inlined from ‘list_add_tail’ at include/linux/list.h:102:2,
            inlined from ‘rebuild_snap_realms’ at fs/ceph/snap.c:434:2:
        include/linux/list.h:74:19: warning: storing the address of local variable ‘realm_queue’ in ‘*&realm_27(D)->rebuild_item.prev’ [-Wdangling-pointer=]
           74 |         new->prev = prev;
              |         ~~~~~~~~~~^~~~~~
      
      But then gcc - understandably - doesn't really understand the big
      picture how the doubly linked list works, so doesn't see how we then end
      up emptying said list head in a loop and the pointer we added has been
      removed.
      
      Gcc also complains about us (intentionally) using this as a way to store
      a kind of fake stack trace, eg
      
        drivers/acpi/acpica/utdebug.c:40:38: warning: storing the address of local variable ‘current_sp’ in ‘acpi_gbl_entry_stack_pointer’ [-Wdangling-pointer=]
           40 |         acpi_gbl_entry_stack_pointer = &current_sp;
              |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~
      
      which is entirely reasonable from a compiler standpoint, and we may want
      to change those kinds of patterns, but not not.
      
      So this is one of those "it would be lovely if the compiler were to
      complain about us leaving dangling pointers to the stack", but not this
      way.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      49beadbd
    • Linus Torvalds's avatar
      drm: imx: fix compiler warning with gcc-12 · 7aefd8b5
      Linus Torvalds authored
      Gcc-12 correctly warned about this code using a non-NULL pointer as a
      truth value:
      
        drivers/gpu/drm/imx/ipuv3-crtc.c: In function ‘ipu_crtc_disable_planes’:
        drivers/gpu/drm/imx/ipuv3-crtc.c:72:21: error: the comparison will always evaluate as ‘true’ for the address of ‘plane’ will never be NULL [-Werror=address]
           72 |                 if (&ipu_crtc->plane[1] && plane == &ipu_crtc->plane[1]->base)
              |                     ^
      
      due to the extraneous '&' address-of operator.
      
      Philipp Zabel points out that The mistake had no adverse effect since
      the following condition doesn't actually dereference the NULL pointer,
      but the intent of the code was obviously to check for it, not to take
      the address of the member.
      
      Fixes: eb8c8880 ("drm/imx: add deferred plane disabling")
      Acked-by: default avatarPhilipp Zabel <p.zabel@pengutronix.de>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      7aefd8b5
    • Claudiu Beznea's avatar
      net: macb: change return type for gem_ptp_set_one_step_sync() · 263efe85
      Claudiu Beznea authored
      gem_ptp_set_one_step_sync() always returns zero thus change its return
      type to void.
      Signed-off-by: default avatarClaudiu Beznea <claudiu.beznea@microchip.com>
      Link: https://lore.kernel.org/r/20220608080818.1495044-1-claudiu.beznea@microchip.comSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      263efe85
    • Paolo Abeni's avatar
      Merge branch 'vmxnet3-upgrade-to-version-7' · e4c437cd
      Paolo Abeni authored
      Ronak Doshi says:
      
      ====================
      vmxnet3: upgrade to version 7
      
      vmxnet3 emulation has recently added several new features including
      support for uniform passthrough(UPT). To make UPT work vmxnet3 has
      to be enhanced as per the new specification. This patch series
      extends the vmxnet3 driver to leverage these new features.
      
      Compatibility is maintained using existing vmxnet3 versioning mechanism as
      follows:
       - new features added to vmxnet3 emulation are associated with new vmxnet3
         version viz. vmxnet3 version 7.
       - emulation advertises all the versions it supports to the driver.
       - during initialization, vmxnet3 driver picks the highest version number
       supported by both the emulation and the driver and configures emulation
       to run at that version.
      
      In particular, following changes are introduced:
      
      Patch 1:
        This patch introduces utility macros for vmxnet3 version 7 comparison
        and updates Copyright information.
      
      Patch 2:
        This patch adds new capability registers to fine control enablement of
        individual features based on emulation and passthrough.
      
      Patch 3:
        This patch adds support for large passthrough BAR register.
      
      Patch 4:
        This patch adds support for out of order rx completion processing.
      
      Patch 5:
        This patch introduces new command to set ring buffer sizes to pass this
        information to the hardware.
      
      Patch 6:
        For better performance, hardware has a requirement to limit number of TSO
        descriptors. This patch adds that support.
      
      Patch 7:
        With vmxnet3 version 7, new descriptor fields are used to indicate
        encapsulation offload.
      
      Patch 8:
        With all vmxnet3 version 7 changes incorporated in the vmxnet3 driver,
        with this patch, the driver can configure emulation to run at vmxnet3
        version 7.
      
      Changes in v2->v3:
       - use correct byte ordering for ringBufSize
      
      Changes in v2:
       - use local rss_fields variable for the rss capability checks in patch 2
      ====================
      
      Link: https://lore.kernel.org/r/20220608032353.964-1-doshir@vmware.comSigned-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      e4c437cd
    • Ronak Doshi's avatar
      vmxnet3: update to version 7 · acc38e04
      Ronak Doshi authored
      With all vmxnet3 version 7 changes incorporated in the vmxnet3 driver,
      the driver can configure emulation to run at vmxnet3 version 7, provided
      the emulation advertises support for version 7.
      Signed-off-by: default avatarRonak Doshi <doshir@vmware.com>
      Acked-by: default avatarGuolin Yang <gyang@vmware.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      acc38e04
    • Ronak Doshi's avatar
      vmxnet3: use ext1 field to indicate encapsulated packet · 60cafa03
      Ronak Doshi authored
      Till vmxnet3 version 6, om field of transmit descriptor was used
      to indicate encapsulated offload packet and msscof was used to
      indirectly indicate TSO/CSO. From version 7 and later, ext1 field
      will be used to indicate whether packet is encapsulated or not and
      om fields will continue to indicate if the packet is TSO or CSO.
      Signed-off-by: default avatarRonak Doshi <doshir@vmware.com>
      Acked-by: default avatarGuolin Yang <gyang@vmware.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      60cafa03
    • Ronak Doshi's avatar
      vmxnet3: limit number of TXDs used for TSO packet · d2857b99
      Ronak Doshi authored
      Currently, vmxnet3 does not have a limit on number of descriptors
      used for a TSO packet. However, with UPT, for hardware performance
      reasons, this patch limits the number of transmit descriptors to 24
      for a TSO packet.
      Signed-off-by: default avatarRonak Doshi <doshir@vmware.com>
      Acked-by: default avatarGuolin Yang <gyang@vmware.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      d2857b99
    • Ronak Doshi's avatar
      vmxnet3: add command to set ring buffer sizes · c7112ebd
      Ronak Doshi authored
      This patch adds a new command to set ring buffer sizes. This is
      required to pass the buffer size information to passthrough devices.
      For performance reasons, with version7 and later, ring1 will contain
      only mtu size buffers (bound to 3K). Packets > 3K will use both ring1
      and ring2.
      
      Also, ring sizes are round down to power of 2 and ring2 default
      size is increased to 512.
      Signed-off-by: default avatarRonak Doshi <doshir@vmware.com>
      Acked-by: default avatarGuolin Yang <gyang@vmware.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      c7112ebd
    • Ronak Doshi's avatar
      vmxnet3: add support for out of order rx completion · 2c5a5748
      Ronak Doshi authored
      Currently, vmxnet3 processes rx completions in-order i.e. no
      out of order completion descriptor is expected. With UPT, if
      hardware supports LRO, then hardware can report out of order
      rx completions. This patch enhances vmxnet3 to add this support.
      This supports gets effective only when the corresponding feature
      bit is set.
      
      Also, minor enhancements are done for performance.
      Signed-off-by: default avatarRonak Doshi <doshir@vmware.com>
      Acked-by: default avatarGuolin Yang <gyang@vmware.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      2c5a5748
    • Ronak Doshi's avatar
      vmxnet3: add support for large passthrough BAR register · 543fb674
      Ronak Doshi authored
      For vmxnet3 to work in UPT mode, the BAR sizes have been increased.
      The PT page has been extended to 2 pages and also includes OOB pages
      as a part of PT BAR. This patch enhances vmxnet3 to use appropriate
      BAR offsets based on the capability registered. To use new offsets,
      VMXNET3_CAP_LARGE_BAR needs to be set by the device. If it is not set
      then the device will use legacy PT page layout.
      Signed-off-by: default avatarRonak Doshi <doshir@vmware.com>
      Acked-by: default avatarGuolin Yang <gyang@vmware.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      543fb674
    • Ronak Doshi's avatar
      vmxnet3: add support for capability registers · 6f91f4ba
      Ronak Doshi authored
      This patch enhances vmxnet3 to suuport capability registers which
      allows it to enable features selectively. The DCR register tracks
      the capabilities vmxnet3 device supports. The PTCR register states
      the capabilities that the passthrough device supports.
      
      With the help of these registers, vmxnet3 can enable only those
      features which the passthrough device supoprts. This allows
      smooth trasition to Uniform-Passthrough (UPT) mode if the virtual
      nic requests it. If PTCR register returns nothing or error it means
      UPT is not being requested and vnic will continue in emulation mode.
      Signed-off-by: default avatarRonak Doshi <doshir@vmware.com>
      Acked-by: default avatarGuolin Yang <gyang@vmware.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      6f91f4ba
    • Ronak Doshi's avatar
      vmxnet3: prepare for version 7 changes · 55f0395f
      Ronak Doshi authored
      vmxnet3 is currently at version 6 and this patch initiates the
      preparation to accommodate changes for upto version 7. Introduced
      utility macros for vmxnet3 version 7 comparison and update Copyright
      information.
      Signed-off-by: default avatarRonak Doshi <doshir@vmware.com>
      Acked-by: default avatarGuolin Yang <gyang@vmware.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      55f0395f
    • Justin Stitt's avatar
      net: amd-xgbe: fix clang -Wformat warning · 647df0d4
      Justin Stitt authored
      see warning:
      | drivers/net/ethernet/amd/xgbe/xgbe-drv.c:2787:43: warning: format specifies
      | type 'unsigned short' but the argument has type 'int' [-Wformat]
      |        netdev_dbg(netdev, "Protocol: %#06hx\n", ntohs(eth->h_proto));
      |                                      ~~~~~~     ^~~~~~~~~~~~~~~~~~~
      
      Variadic functions (printf-like) undergo default argument promotion.
      Documentation/core-api/printk-formats.rst specifically recommends
      using the promoted-to-type's format flag.
      
      Also, as per C11 6.3.1.1:
      (https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf)
      `If an int can represent all values of the original type ..., the
      value is converted to an int; otherwise, it is converted to an
      unsigned int. These are called the integer promotions.`
      
      Since the argument is a u16 it will get promoted to an int and thus it is
      most accurate to use the %x format specifier here. It should be noted that the
      `#06` formatting sugar does not alter the promotion rules.
      
      Link: https://github.com/ClangBuiltLinux/linux/issues/378Signed-off-by: default avatarJustin Stitt <jstitt007@gmail.com>
      Reviewed-by: default avatarNick Desaulniers <ndesaulniers@google.com>
      Link: https://lore.kernel.org/r/20220607191119.20686-1-jstitt007@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      647df0d4
    • Muchun Song's avatar
      tcp: use alloc_large_system_hash() to allocate table_perturb · e67b72b9
      Muchun Song authored
      In our server, there may be no high order (>= 6) memory since we reserve
      lots of HugeTLB pages when booting.  Then the system panic.  So use
      alloc_large_system_hash() to allocate table_perturb.
      
      Fixes: e9261476 ("tcp: dynamically allocate the perturb table used by source ports")
      Signed-off-by: default avatarMuchun Song <songmuchun@bytedance.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20220607070214.94443-1-songmuchun@bytedance.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      e67b72b9