1. 23 Apr, 2023 15 commits
  2. 22 Apr, 2023 6 commits
    • Merge tag 'mlx5-updates-2023-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · fbc1449d
      Jakub Kicinski authored
      Saeed Mahameed says:
      
      ====================
      mlx5-updates-2023-04-20
      
      1) Dragos improves the RX page pool and provides some fixes to his
         previous series:
       1.1) Fix releasing page_pool for striding RQ and legacy RQ nonlinear case
       1.2) Hook NAPIs to page pools to gain more performance.
      
      2) From Roi, some cleanups to the TC and eswitch modules.
      
      3) Maher migrates vnic diagnostic counters reporting from debugfs to a
         dedicated devlink health reporter.
      
      Maher Says:
      ===========
       net/mlx5: Expose vnic diagnostic counters using devlink
      
      Currently, vnic diagnostic counters are exposed through the following
      debugfs:
      
      $ ls /sys/kernel/debug/mlx5/0000:08:00.0/esw/vf_0/vnic_diag/
      cq_overrun
      quota_exceeded_command
      total_q_under_processor_handle
      invalid_command
      send_queue_priority_update_flow
      nic_receive_steering_discard
      
      The current design does not allow the hypervisor to view the diagnostic
      counters of its VFs once the VFs are bound to a VM. In other words,
      the counters are not exposed for representor interfaces.
      Furthermore, the debugfs design is inconvenient to extend if the driver
      needs to report more counters in the future.
      
      As these counters pertain to vNIC health, it is more appropriate to
      utilize the devlink health reporter to expose them.
      
      Thus, this patchset includes the following changes:
      
      * Drop the current vnic diagnostic counters debugfs interface.
      * Add a vnic devlink health reporter for PFs/VFs core devices, which
        when diagnosed will dump vnic diagnostic counter values that are
        queried from FW.
      * Add a vnic devlink health reporter for the representor interface, which
        serves the same purpose listed in the previous point, in addition to
        allowing the hypervisor to view its VFs' diagnostic counters, even when
        the VFs are bound to external VMs.
      
      Example of devlink health reporter usage:
      $ devlink health diagnose pci/0000:08:00.0 reporter vnic
       vNIC env counters:
          total_error_queues: 0 send_queue_priority_update_flow: 0
          comp_eq_overrun: 0 async_eq_overrun: 0 cq_overrun: 0
          invalid_command: 0 quota_exceeded_command: 0
          nic_receive_steering_discard: 0
      
      ===========
      
      4) SW steering fixes and improvements
      
      Yevgeny Kliteynik Says:
      =======================
      This short patch series contains just small fixes / improvements for
      SW steering:
      
       - Patch 1: Fix dumping of legacy modify_hdr in debug dump to
         align to what is expected by parser
       - Patch 2: Have separate threshold for ICM sync per ICM type
       - Patch 3: Add more info to the steering debug dump - Linux
         version and device name
       - Patch 4: Keep track of number of buddies that are currently
         in use per domain per buddy type
      
      =======================
      
      * tag 'mlx5-updates-2023-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
        net/mlx5: Update op_mode to op_mod for port selection
        net/mlx5: E-Switch, Remove unused mlx5_esw_offloads_vport_metadata_set()
        net/mlx5: E-Switch, Remove redundant dev arg from mlx5_esw_vport_alloc()
        net/mlx5: Include linux/pci.h for pci_msix_can_alloc_dyn()
        net/mlx5e: RX, Hook NAPIs to page pools
        net/mlx5e: RX, Fix XDP_TX page release for legacy rq nonlinear case
        net/mlx5e: RX, Fix releasing page_pool pages twice for striding RQ
        net/mlx5e: Add vnic devlink health reporter to representors
        net/mlx5: Add vnic devlink health reporter to PFs/VFs
        Revert "net/mlx5: Expose vnic diagnostic counters for eswitch managed vports"
        Revert "net/mlx5: Expose steering dropped packets counter"
        net/mlx5: DR, Add memory statistics for domain object
        net/mlx5: DR, Add more info in domain dbg dump
        net/mlx5: DR, Calculate sync threshold of each pool according to its type
        net/mlx5: DR, Fix dumping of legacy modify_hdr in debug dump
      ====================
      
      Link: https://lore.kernel.org/r/20230421013850.349646-1-saeed@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      fbc1449d
    • Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · 9a82cdc2
      Jakub Kicinski authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf-next 2023-04-21
      
      We've added 71 non-merge commits during the last 8 day(s) which contain
      a total of 116 files changed, 13397 insertions(+), 8896 deletions(-).
      
      The main changes are:
      
      1) Add a new BPF netfilter program type and minimal support to hook
         BPF programs to netfilter hooks such as prerouting or forward,
         from Florian Westphal.
      
      2) Fix race between btf_put and btf_idr walk which caused a deadlock,
         from Alexei Starovoitov.
      
      3) Second big batch to migrate test_verifier unit tests into test_progs
         for ease of readability and debugging, from Eduard Zingerman.
      
      4) Add support for refcounted local kptrs to the verifier for allowing
         shared ownership, useful for adding a node to both the BPF list and
         rbtree, from Dave Marchevsky.
      
      5) Migrate bpf_for(), bpf_for_each() and bpf_repeat() macros from BPF
        selftests into libbpf-provided bpf_helpers.h header and improve
        kfunc handling, from Andrii Nakryiko.
      
      6) Support 64-bit pointers to kfuncs needed for archs like s390x,
         from Ilya Leoshkevich.
      
      7) Support BPF progs under getsockopt with a NULL optval,
         from Stanislav Fomichev.
      
      8) Improve verifier u32 scalar equality checking in order to enable
         LLVM transformations which earlier had to be disabled specifically
         for BPF backend, from Yonghong Song.
      
      9) Extend bpftool's struct_ops object loading to support links,
         from Kui-Feng Lee.
      
      10) Add xsk selftest follow-up fixes for hugepage allocated umem,
          from Magnus Karlsson.
      
      11) Support BPF redirects from tc BPF to ifb devices,
          from Daniel Borkmann.
      
      12) Add BPF support for integer type when accessing variable length
          arrays, from Feng Zhou.
      
      * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (71 commits)
        selftests/bpf: verifier/value_ptr_arith converted to inline assembly
        selftests/bpf: verifier/value_illegal_alu converted to inline assembly
        selftests/bpf: verifier/unpriv converted to inline assembly
        selftests/bpf: verifier/subreg converted to inline assembly
        selftests/bpf: verifier/spin_lock converted to inline assembly
        selftests/bpf: verifier/sock converted to inline assembly
        selftests/bpf: verifier/search_pruning converted to inline assembly
        selftests/bpf: verifier/runtime_jit converted to inline assembly
        selftests/bpf: verifier/regalloc converted to inline assembly
        selftests/bpf: verifier/ref_tracking converted to inline assembly
        selftests/bpf: verifier/map_ptr_mixing converted to inline assembly
        selftests/bpf: verifier/map_in_map converted to inline assembly
        selftests/bpf: verifier/lwt converted to inline assembly
        selftests/bpf: verifier/loops1 converted to inline assembly
        selftests/bpf: verifier/jeq_infer_not_null converted to inline assembly
        selftests/bpf: verifier/direct_packet_access converted to inline assembly
        selftests/bpf: verifier/d_path converted to inline assembly
        selftests/bpf: verifier/ctx converted to inline assembly
        selftests/bpf: verifier/btf_ctx_access converted to inline assembly
        selftests/bpf: verifier/bpf_get_stack converted to inline assembly
        ...
      ====================
      
      Link: https://lore.kernel.org/r/20230421211035.9111-1-daniel@iogearbox.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      9a82cdc2
    • net: dst: fix missing initialization of rt_uncached · 418a7307
      Maxime Bizon authored
      xfrm_alloc_dst() followed by xfrm4_dst_destroy(), without a
      xfrm4_fill_dst() call in between, causes the following BUG:
      
       BUG: spinlock bad magic on CPU#0, fbxhostapd/732
        lock: 0x890b7668, .magic: 890b7668, .owner: <none>/-1, .owner_cpu: 0
       CPU: 0 PID: 732 Comm: fbxhostapd Not tainted 6.3.0-rc6-next-20230414-00613-ge8de66369925-dirty #9
       Hardware name: Marvell Kirkwood (Flattened Device Tree)
        unwind_backtrace from show_stack+0x10/0x14
        show_stack from dump_stack_lvl+0x28/0x30
        dump_stack_lvl from do_raw_spin_lock+0x20/0x80
        do_raw_spin_lock from rt_del_uncached_list+0x30/0x64
        rt_del_uncached_list from xfrm4_dst_destroy+0x3c/0xbc
        xfrm4_dst_destroy from dst_destroy+0x5c/0xb0
        dst_destroy from rcu_process_callbacks+0xc4/0xec
        rcu_process_callbacks from __do_softirq+0xb4/0x22c
        __do_softirq from call_with_stack+0x1c/0x24
        call_with_stack from do_softirq+0x60/0x6c
        do_softirq from __local_bh_enable_ip+0xa0/0xcc
      
      Patch "net: dst: Prevent false sharing vs. dst_entry:: __refcnt" moved
      the rt_uncached and rt_uncached_list fields from the rtable struct to
      the dst struct, so they are no longer zeroed by
      memset_after(xdst, 0, u.dst) in xfrm_alloc_dst().
      
      Note that rt_uncached (list_head) was never properly initialized at
      alloc time, but xfrm[46]_dst_destroy() is written in such a way that
      it was not an issue thanks to the memset:
      
      	if (xdst->u.rt.dst.rt_uncached_list)
      		rt_del_uncached_list(&xdst->u.rt);
      
      The route code does it the other way around: rt_uncached_list is
      assumed to be valid if and only if the rt_uncached list_head is not
      empty:
      
      void rt_del_uncached_list(struct rtable *rt)
      {
              if (!list_empty(&rt->dst.rt_uncached)) {
                      struct uncached_list *ul = rt->dst.rt_uncached_list;
      
                      spin_lock_bh(&ul->lock);
                      list_del_init(&rt->dst.rt_uncached);
                      spin_unlock_bh(&ul->lock);
              }
      }
      
      This patch adds mandatory rt_uncached list_head initialization in
      generic dst_init(), and adapts the xfrm[46]_dst_destroy() logic to match
      the rest of the code.
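
      The failure mode described above can be reproduced in miniature in
      userspace. The sketch below is an illustration only, not the kernel
      code: struct mini_dst, init_list_head(), list_is_empty(),
      destroy_would_take_lock() and run_demo() are hypothetical stand-ins
      that mimic the semantics of INIT_LIST_HEAD() and list_empty(). It
      shows that a zeroed (or otherwise uninitialized) list_head does not
      count as "empty", so a destroy path guarded by !list_empty() would go
      on to use rt_uncached_list, while an initialized head is skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Userspace stand-in for the kernel's struct list_head (illustration only). */
struct list_head {
    struct list_head *next, *prev;
};

/* Mimics INIT_LIST_HEAD(): an empty list head points at itself. */
void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Mimics list_empty(): empty means next points back at the head. */
bool list_is_empty(const struct list_head *head)
{
    return head->next == head;
}

/* Hypothetical miniature of a dst entry carrying the two fields discussed. */
struct mini_dst {
    struct list_head rt_uncached;
    void *rt_uncached_list;
};

/*
 * Returns true when a destroy path guarded by !list_empty(), as in
 * rt_del_uncached_list(), would go on to dereference rt_uncached_list.
 */
bool destroy_would_take_lock(const struct mini_dst *dst)
{
    return !list_is_empty(&dst->rt_uncached);
}

/* Walks through both cases; returns 0 when the expectations hold. */
int run_demo(void)
{
    struct mini_dst dst;

    /* Zeroed but never initialized: next is NULL, which is not &head. */
    memset(&dst, 0, sizeof(dst));
    assert(destroy_would_take_lock(&dst));

    /* Initialized at allocation time: the destroy path is skipped. */
    init_list_head(&dst.rt_uncached);
    assert(!destroy_would_take_lock(&dst));

    return 0;
}
```

      Calling run_demo() exercises both cases. The point is only that
      initializing the head at allocation time makes the list_empty() check
      safe regardless of whether the entry was ever added to a list.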
      
      Fixes: d288a162 ("net: dst: Prevent false sharing vs. dst_entry:: __refcnt")
      Reported-by: kernel test robot <oliver.sang@intel.com>
      Link: https://lore.kernel.org/oe-lkp/202304162125.18b7bcdd-oliver.sang@intel.com
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      CC: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Maxime Bizon <mbizon@freebox.fr>
      Link: https://lore.kernel.org/r/20230420182508.2417582-1-mbizon@freebox.fr
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      418a7307
    • net: dsa: qca8k: fix LEDS_CLASS dependency · 33c1af8e
      Arnd Bergmann authored
      With LEDS_CLASS=m, a built-in qca8k driver fails to link:
      
      arm-linux-gnueabi-ld: drivers/net/dsa/qca/qca8k-leds.o: in function `qca8k_setup_led_ctrl':
      qca8k-leds.c:(.text+0x1ea): undefined reference to `devm_led_classdev_register_ext'
      
      Change the dependency to avoid the broken configuration.
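
      The constraint behind this link failure can be modelled in a few lines
      of C. This is a rough sketch of general Kconfig tristate reasoning, not
      the Kconfig expression used by the patch (which is not shown here);
      enum tristate, can_link() and check_reported_case() are hypothetical
      names. "provider" plays the role of LEDS_CLASS and "consumer" the role
      of the qca8k LED code.

```c
#include <assert.h>
#include <stdbool.h>

/* Kconfig-style tristate values: disabled, module, built-in. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/*
 * Returns true when code built as `consumer` can resolve symbols exported
 * by code built as `provider`.
 */
bool can_link(enum tristate provider, enum tristate consumer)
{
    if (consumer == TRI_N)
        return true;            /* nothing is built, nothing to resolve */
    if (provider == TRI_Y)
        return true;            /* built-in symbols are visible to everyone */
    /* A modular provider can only serve modular consumers. */
    return provider == TRI_M && consumer == TRI_M;
}

/* The reported case: LEDS_CLASS=m with a built-in qca8k driver. */
void check_reported_case(void)
{
    assert(!can_link(TRI_M, TRI_Y));
}
```

      Under this model, can_link(TRI_M, TRI_Y) is the only combination the
      report describes, and a dependency that rules it out avoids the
      undefined reference.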
      
      Fixes: 1e264f9d ("net: dsa: qca8k: add LEDs basic support")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: Christian Marangi <ansuelsmth@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/20230420213639.2243388-1-arnd@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      33c1af8e
    • net/handshake: Fix section mismatch in handshake_exit · 6aa445e3
      Geert Uytterhoeven authored
      If CONFIG_NET_NS=n (e.g. m68k/defconfig):
      
          WARNING: modpost: vmlinux.o: section mismatch in reference: handshake_exit (section: .exit.text) -> handshake_genl_net_ops (section: .init.data)
          ERROR: modpost: Section mismatches detected.
      
      Fix this by dropping the __net_initdata tag from handshake_genl_net_ops.
      
      Fixes: 3b3009ea ("net/handshake: Create a NETLINK service for handling handshake requests")
      Reported-by: noreply@ellerman.id.au
      Closes: http://kisskb.ellerman.id.au/kisskb/buildresult/14912987
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
      Link: https://lore.kernel.org/r/20230420173723.3773434-1-geert@linux-m68k.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      6aa445e3
    • net: phy: add basic driver for NXP CBTX PHY · f3b766d9
      Vladimir Oltean authored
      The CBTX PHY is a Fast Ethernet PHY integrated into the SJA1110 A/B/C
      automotive Ethernet switches.
      
      It was hoped it would work with the Generic PHY driver, but alas, it
      doesn't. The most important reason why is that the PHY is powered down
      by default, and it needs a vendor register to power it on.
      
      It has a linear memory map that is accessed over SPI by the SJA1110
      switch driver, which exposes a fake MDIO controller. It has the
      following (and only the following) standard clause 22 registers:
      
      0x0: MII_BMCR
      0x1: MII_BMSR
      0x2: MII_PHYSID1
      0x3: MII_PHYSID2
      0x4: MII_ADVERTISE
      0x5: MII_LPA
      0x6: MII_EXPANSION
      0x7: the missing MII_NPAGE for Next Page Transmit Register
      
      Every other register is vendor-defined.
      
      The register map extends beyond the standard clause 22 5-bit address
      space of 0x20 registers; however, the driver does not need to access the extra
      registers for now (and hopefully never). If it ever needs to do that, it
      is possible to implement a fake (software) page switching mechanism
      between the PHY driver and the SJA1110 MDIO controller driver.
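
      A software page switching scheme of the kind mentioned above could be
      sketched as follows. This is a hypothetical illustration, not code from
      the driver: struct fake_mdio_bus, C22_REGS_PER_PAGE and
      fake_mdio_linear_addr() are invented names, and the page-to-offset
      mapping is an assumption. The register numbers are the standard clause
      22 ones listed earlier.

```c
#include <assert.h>
#include <stdint.h>

/* Standard clause 22 register numbers, as listed in the commit message. */
enum {
    MII_BMCR      = 0x00,
    MII_BMSR      = 0x01,
    MII_PHYSID1   = 0x02,
    MII_PHYSID2   = 0x03,
    MII_ADVERTISE = 0x04,
    MII_LPA       = 0x05,
    MII_EXPANSION = 0x06,
};

/* A 5-bit clause 22 register address covers 0x20 registers. */
#define C22_REGS_PER_PAGE 0x20u

/* Hypothetical: page selected in software by the fake MDIO controller. */
struct fake_mdio_bus {
    uint32_t page;
};

/*
 * Hypothetical mapping from (software page, 5-bit register number) to an
 * offset in the PHY's linear memory map. Page 0 is the plain clause 22 view.
 */
uint32_t fake_mdio_linear_addr(const struct fake_mdio_bus *bus, uint32_t regnum)
{
    /* Keep only the 5 bits a clause 22 access can carry. */
    uint32_t reg = regnum & (C22_REGS_PER_PAGE - 1u);

    return bus->page * C22_REGS_PER_PAGE + reg;
}

/* Sanity check: page 0 leaves the standard registers where they are. */
void check_page_zero(void)
{
    struct fake_mdio_bus bus = { .page = 0 };

    assert(fake_mdio_linear_addr(&bus, MII_BMSR) == 0x01);
}
```

      With page 0 selected, the standard registers keep their usual numbers,
      so a PHY driver that never switches pages would be unaffected by such
      a scheme.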
      
      Also, Auto-MDIX is turned off by default in hardware; the driver turns
      it on by default and reports the current status. I've tested this with a
      VSC8514 link partner and a crossover cable, by forcing the mode on the
      link partner, and seeing that the CBTX PHY always sees the reverse of
      the mode forced on the VSC8514 (and that traffic works). The link
      doesn't come up (as expected) if MDI modes are forced on both ends in
      the same way (with the cross-over cable, that is).
      
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/20230418190141.1040562-1-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      f3b766d9
  3. 21 Apr, 2023 19 commits