1. 31 May, 2022 10 commits
  2. 29 May, 2022 3 commits
    • Merge branch 'sfc-fixes' · 90343f57
      David S. Miller authored
      Íñigo Huguet says:
      
      ====================
      sfc: fix some efx_separate_tx_channels errors
      
      Trying to load sfc driver with modparam efx_separate_tx_channels=1
      resulted in errors during initialization and not being able to use the
      NIC. These patches fix a few bugs and make it work again.
      
      v2:
      * added Martin's patch instead of my previous one. Mine solved some
      of the initialization errors, but Martin's solves them in all
      possible cases.
      * removed whitespaces cleanup, as requested by Jakub
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: fix wrong tx channel offset with efx_separate_tx_channels · c308dfd1
      Íñigo Huguet authored
      tx_channel_offset is calculated in efx_allocate_msix_channels, but it is
      also calculated again in efx_set_channels because it was originally done
      there, and when efx_allocate_msix_channels was introduced the old
      calculation was not removed from efx_set_channels.
      
      Moreover, the old calculation is wrong when using
      efx_separate_tx_channels because now we can have XDP channels after the
      TX channels, so n_channels - n_tx_channels doesn't point to the first TX
      channel.
      
      Remove the old calculation from efx_set_channels, and add the
      initialization of this variable if MSI or legacy interrupts are used,
      next to the initialization of the rest of the related variables, where
      it was missing.
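
The offset problem described above can be illustrated with a small userspace sketch. It assumes a channel layout of RX channels first, then TX channels, then XDP channels; the struct and function names are hypothetical and are not taken from the sfc driver.

```c
/* Hypothetical channel counts; assumed layout is [RX][TX][XDP]. */
struct chan_layout {
    unsigned int n_rx_channels;   /* RX-only channels, placed first */
    unsigned int n_tx_channels;   /* TX-only channels, placed after RX */
    unsigned int n_xdp_channels;  /* XDP TX channels, placed last */
};

/* Total number of channels in the assumed layout. */
static unsigned int total_channels(const struct chan_layout *l)
{
    return l->n_rx_channels + l->n_tx_channels + l->n_xdp_channels;
}

/*
 * Old-style formula: n_channels - n_tx_channels. It counts back from the
 * end of the array, so any XDP channels shift the result past the first
 * TX channel.
 */
static unsigned int old_tx_offset(const struct chan_layout *l)
{
    return total_channels(l) - l->n_tx_channels;
}

/* Index of the first TX channel once the XDP channels are excluded. */
static unsigned int first_tx_channel(const struct chan_layout *l)
{
    return total_channels(l) - l->n_xdp_channels - l->n_tx_channels;
}
```

With 4 RX, 4 TX and 2 XDP channels in this assumed layout, the old formula yields 6 while the first TX channel sits at index 4; the two agree only when there are no XDP channels.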
      
      Fixes: 3990a8ff ("sfc: allocate channels for XDP tx queues")
      Reported-by: Tianhao Zhao <tizhao@redhat.com>
      Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sfc: fix considering that all channels have TX queues · 2e102b53
      Martin Habets authored
      Normally, all channels have RX and TX queues, but this is not true if
      modparam efx_separate_tx_channels=1 is used. In that case, some
      channels only have RX queues and others only TX queues (or more
      precisely, they have them allocated, but not initialized).
      
      Fix efx_channel_has_tx_queues to return the correct value for this case
      too.
      
      Messages shown at probe time before the fix:
       sfc 0000:03:00.0 ens6f0np0: MC command 0x82 inlen 544 failed rc=-22 (raw=0) arg=0
       ------------[ cut here ]------------
       netdevice: ens6f0np0: failed to initialise TXQ -1
       WARNING: CPU: 1 PID: 626 at drivers/net/ethernet/sfc/ef10.c:2393 efx_ef10_tx_init+0x201/0x300 [sfc]
       [...] stripped
       RIP: 0010:efx_ef10_tx_init+0x201/0x300 [sfc]
       [...] stripped
       Call Trace:
        efx_init_tx_queue+0xaa/0xf0 [sfc]
        efx_start_channels+0x49/0x120 [sfc]
        efx_start_all+0x1f8/0x430 [sfc]
        efx_net_open+0x5a/0xe0 [sfc]
        __dev_open+0xd0/0x190
        __dev_change_flags+0x1b3/0x220
        dev_change_flags+0x21/0x60
       [...] stripped
      
      Messages shown at remove time before the fix:
       sfc 0000:03:00.0 ens6f0np0: failed to flush 10 queues
       sfc 0000:03:00.0 ens6f0np0: failed to flush queues
      
      Fixes: 8700aff0 ("sfc: fix channel allocation with brute force")
      Reported-by: Tianhao Zhao <tizhao@redhat.com>
      Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
      Tested-by: Íñigo Huguet <ihuguet@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 28 May, 2022 13 commits
  4. 27 May, 2022 14 commits
    • bpf: Fix probe read error in ___bpf_prog_run() · caff1fa4
      Menglong Dong authored
      I think there is something wrong with BPF_PROBE_MEM in ___bpf_prog_run()
      on big-endian machines. Let's make a test and see what happens if we
      want to load a 'u16' with BPF_PROBE_MEM.
      
      If the src value is '0x0001', the value of the dest register will become
      0x0001000000000000, as the value will be loaded into the first 2 bytes of
      DST by the following code:
      
        bpf_probe_read_kernel(&DST, SIZE, (const void *)(long) (SRC + insn->off));
      
      Obviously, the value in DST is not correct. In fact, we can compare
      BPF_PROBE_MEM with LDX_MEM_H:
      
        DST = *(SIZE *)(unsigned long) (SRC + insn->off);
      
      If the memory load is done by LDX_MEM_H, the value in DST will be 0x1 now.
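
The difference between the two loads can be reproduced in userspace with a small sketch. It models a big-endian 64-bit register as an explicit byte array so that it behaves the same on any host; the helper names are hypothetical and this is not the interpreter code.

```c
#include <stdint.h>
#include <string.h>

/* Interpret 8 bytes as a big-endian 64-bit value, independent of the host. */
static uint64_t be64_value(const unsigned char b[8])
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | b[i];
    return v;
}

/*
 * Models copying `size` bytes straight into the start of a zeroed 64-bit
 * register on a big-endian machine: the bytes land in the most significant
 * positions.
 */
static uint64_t probe_read_into_reg(const unsigned char *src, size_t size)
{
    unsigned char reg[8] = {0};
    memcpy(reg, src, size);
    return be64_value(reg);
}

/*
 * Models DST = *(u16 *)src on a big-endian machine: the value is read at
 * its own width and then zero-extended to 64 bits.
 */
static uint64_t typed_load_u16(const unsigned char src[2])
{
    uint16_t v = (uint16_t)((src[0] << 8) | src[1]);
    return (uint64_t)v;
}
```

For the source bytes `00 01`, the byte-copy model gives 0x0001000000000000 and the typed load gives 0x1, matching the two values discussed above.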
      
      And I think this error results in the test case 'test_bpf_sk_storage_map'
      failing:
      
        test_bpf_sk_storage_map:PASS:bpf_iter_bpf_sk_storage_map__open_and_load 0 nsec
        test_bpf_sk_storage_map:PASS:socket 0 nsec
        test_bpf_sk_storage_map:PASS:map_update 0 nsec
        test_bpf_sk_storage_map:PASS:socket 0 nsec
        test_bpf_sk_storage_map:PASS:map_update 0 nsec
        test_bpf_sk_storage_map:PASS:socket 0 nsec
        test_bpf_sk_storage_map:PASS:map_update 0 nsec
        test_bpf_sk_storage_map:PASS:attach_iter 0 nsec
        test_bpf_sk_storage_map:PASS:create_iter 0 nsec
        test_bpf_sk_storage_map:PASS:read 0 nsec
        test_bpf_sk_storage_map:FAIL:ipv6_sk_count got 0 expected 3
        $10/26 bpf_iter/bpf_sk_storage_map:FAIL
      
      The code of the test case is simple: it loads sk->sk_family into a
      register with BPF_PROBE_MEM and checks whether it is AF_INET6. With this
      patch, the test case 'bpf_iter' now passes:
      
        $10  bpf_iter:OK
      
      Fixes: 2a02759e ("bpf: Add support for BTF pointers to interpreter")
      Signed-off-by: Menglong Dong <imagedong@tencent.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Jiang Biao <benbjiang@tencent.com>
      Reviewed-by: Hao Peng <flyingpeng@tencent.com>
      Cc: Ilya Leoshkevich <iii@linux.ibm.com>
      Link: https://lore.kernel.org/bpf/20220524021228.533216-1-imagedong@tencent.com
    • selftests/bpf: fix stacktrace_build_id with missing kprobe/urandom_read · 59ed76fe
      Song Liu authored
      The kernel function urandom_read has been replaced with urandom_read_iter.
      Therefore, a kprobe on urandom_read no longer works:
      
      [root@eth50-1 bpf]# ./test_progs -n 161
      test_stacktrace_build_id:PASS:skel_open_and_load 0 nsec
      libbpf: kprobe perf_event_open() failed: No such file or directory
      libbpf: prog 'oncpu': failed to create kprobe 'urandom_read+0x0' \
              perf event: No such file or directory
      libbpf: prog 'oncpu': failed to auto-attach: -2
      test_stacktrace_build_id:FAIL:attach_tp err -2
      161     stacktrace_build_id:FAIL
      
      Fix this by replacing urandom_read with urandom_read_iter in the test.
      
      Fixes: 1b388e77 ("random: convert to using fops->read_iter()")
      Reported-by: Mykola Lysenko <mykolal@fb.com>
      Signed-off-by: Song Liu <song@kernel.org>
      Acked-by: David Vernet <void@manifault.com>
      Link: https://lore.kernel.org/r/20220526191608.2364049-1-song@kernel.org
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • net: usb: qmi_wwan: add Telit 0x1250 composition · 2c262b21
      Carlo Lobrano authored
      Add support for Telit LN910Cx 0x1250 composition
      
      0x1250: rmnet, tty, tty, tty, tty
      Signed-off-by: Carlo Lobrano <c.lobrano@gmail.com>
      Acked-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: lan743x: PCI11010 / PCI11414 fix · 79dfeb29
      Raju Lakkaraju authored
      Fix the MDIO interface declarations to reflect what is currently supported by
      the PCI11010 / PCI11414 devices (C22 for RGMII and C22_C45 for SGMII)
      Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · 55919b32
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following contain more Netfilter fixes for net:
      
      1) syzbot warning in nfnetlink bind, from Florian.
      
      2) Refetch conntrack after __nf_conntrack_confirm(), from Florian Westphal.
      
      3) Move struct nf_ct_timeout back to the bottom of the ctnl_time, where
         it was before the recent update, also from Florian.
      
      4) Add NL_SET_BAD_ATTR() to nf_tables netlink for proper set element
         commands error reporting.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netfilter: nf_tables: set element extended ACK reporting support · b53c1166
      Pablo Neira Ayuso authored
      Report the element that causes problems via netlink extended ACK for set
      element commands.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: cttimeout: fix slab-out-of-bounds read in cttimeout_net_exit · aeed55a0
      Florian Westphal authored
      syzbot reports:
      BUG: KASAN: slab-out-of-bounds in __list_del_entry_valid+0xcc/0xf0 lib/list_debug.c:42
      [..]
       list_del include/linux/list.h:148 [inline]
       cttimeout_net_exit+0x211/0x540 net/netfilter/nfnetlink_cttimeout.c:617
      
      No reproducer so far. Looking at recent changes in this area,
      it's clear that the free_head must not be at the end of the
      structure because the nf_ct_timeout structure has variable size.
      
      Reported-by: <syzbot+92968395eedbdbd3617d@syzkaller.appspotmail.com>
      Fixes: 78222bac ("netfilter: cttimeout: decouple unlink and free on netns destruction")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: conntrack: re-fetch conntrack after insertion · 56b14ece
      Florian Westphal authored
      In case the conntrack is clashing, insertion can free skb->_nfct and
      set skb->_nfct to the already-confirmed entry.
      
      This wasn't found before because the conntrack entry and the extension
      space used to be freed after an RCU grace period, plus the race needs
      events enabled to trigger.
      
      Reported-by: <syzbot+793a590957d9c1b96620@syzkaller.appspotmail.com>
      Fixes: 71d8c47f ("netfilter: conntrack: introduce clash resolution on insertion race")
      Fixes: 2ad9d774 ("netfilter: conntrack: free extension area immediately")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: nfnetlink: fix warn in nfnetlink_unbind · ffd219ef
      Florian Westphal authored
      syzbot reports the following warning:
      WARNING: CPU: 0 PID: 3600 at net/netfilter/nfnetlink.c:703 nfnetlink_unbind+0x357/0x3b0 net/netfilter/nfnetlink.c:694
      
      The syzbot generated program does this:
      
      socket(AF_NETLINK, SOCK_RAW, NETLINK_NETFILTER) = 3
      setsockopt(3, SOL_NETLINK, NETLINK_DROP_MEMBERSHIP, [1], 4) = 0
      
      ... which triggers 'WARN_ON_ONCE(nfnlnet->ctnetlink_listeners == 0)' check.
      
      Instead of counting, just enable reporting for every bind request
      and check if we still have listeners on unbind.
      
      While at it, also add the needed bounds check on nfnl_group2type[]
      access.
      
      Reported-by: <syzbot+4903218f7fba0a2d6226@syzkaller.appspotmail.com>
      Reported-by: <syzbot+afd2d80e495f96049571@syzkaller.appspotmail.com>
      Fixes: 2794cdb0 ("netfilter: nfnetlink: allow to detect if ctnetlink listeners exist")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • net: dsa: mv88e6xxx: Fix refcount leak in mv88e6xxx_mdios_register · 02ded5a1
      Miaoqian Lin authored
      of_get_child_by_name() returns a node pointer with its refcount
      incremented; we should call of_node_put() on it when done.
      
      mv88e6xxx_mdio_register() passes the device node to of_mdiobus_register().
      We don't need the device node after that.
      
      Add missing of_node_put() to avoid refcount leak.
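
The get/put pairing described above can be sketched with a toy reference counter. Everything here is a hypothetical userspace stand-in: `struct node`, `node_get_child()`, `node_put()` and `bus_register()` only mimic the roles of the device node, of_get_child_by_name(), of_node_put() and the bus registration call, and are not the kernel APIs.

```c
/* Hypothetical stand-in for a refcounted device node. */
struct node {
    int refcount;
};

/* Mimics a lookup that returns the node with its refcount incremented. */
static struct node *node_get_child(struct node *child)
{
    child->refcount++;
    return child;
}

/* Mimics dropping a reference; tolerates NULL. */
static void node_put(struct node *n)
{
    if (n)
        n->refcount--;
}

/* Placeholder for a registration step that only uses the node. */
static int bus_register(struct node *n)
{
    (void)n;
    return 0;
}

/* Leaky variant: the reference taken by the lookup is never dropped. */
static int register_leaky(struct node *child)
{
    struct node *np = node_get_child(child);
    return bus_register(np);
}

/* Balanced variant: drop the reference once the node is no longer needed. */
static int register_balanced(struct node *child)
{
    struct node *np = node_get_child(child);
    int err = bus_register(np);

    node_put(np);
    return err;
}
```

Calling the leaky variant leaves the counter one higher than before, while the balanced variant returns it to its starting value on both success and failure.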
      
      Fixes: a3c53be5 ("net: dsa: mv88e6xxx: Support multiple MDIO busses")
      Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
      Reviewed-by: Marek Behún <kabel@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: ti: am65-cpsw-nuss: Fix some refcount leaks · 5dd89d2f
      Miaoqian Lin authored
      of_get_child_by_name() returns a node pointer with its refcount
      incremented; we should call of_node_put() on it when it is no longer
      needed. am65_cpsw_init_cpts() and am65_cpsw_nuss_probe() don't release
      the refcount in the error case.
      Add the missing of_node_put() calls to avoid refcount leaks.
      
      Fixes: b1f66a5b ("net: ethernet: ti: am65-cpsw-nuss: enable packet timestamping support")
      Fixes: 93a76530 ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver")
      Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: mtk_eth_soc: out of bounds read in mtk_hwlro_get_fdir_entry() · e7e7104e
      Dan Carpenter authored
      The "fsp->location" variable comes from the user via ethtool_get_rxnfc().
      Check that it is valid to prevent an out of bounds read.
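
A minimal sketch of this kind of check follows. The table, its size and the choice of -EINVAL as the error code are assumptions made for illustration; they are not taken from the mtk_eth_soc driver.

```c
#include <errno.h>

#define MAX_LRO_ENTRIES 4   /* hypothetical table size */

/* Hypothetical per-device table indexed by a user-supplied location. */
struct lro_table {
    unsigned int ip[MAX_LRO_ENTRIES];
};

/*
 * Validate the user-controlled index before using it, so that an
 * out-of-range location is rejected instead of reading past the array.
 */
static int get_entry(const struct lro_table *t, unsigned int location,
                     unsigned int *out)
{
    if (location >= MAX_LRO_ENTRIES)
        return -EINVAL;

    *out = t->ip[location];
    return 0;
}
```

An index equal to the table size is the first invalid value, which is why the comparison uses `>=` rather than `>`.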
      
      Fixes: 7aab747e ("net: ethernet: mediatek: add ethtool functions to configure RX flows of HW LRO")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: fixed barrier to prevent skbuff sticking in qdisc backlog · a54ce370
      Vincent Ray authored
      In qdisc_run_begin(), smp_mb__before_atomic() used before test_bit()
      does not provide any ordering guarantee as test_bit() is not an atomic
      operation. This, added to the fact that the spin_trylock() call at
      the beginning of qdisc_run_begin() does not guarantee acquire
      semantics if it does not grab the lock, makes it possible for the
      following statement :
      
      if (test_bit(__QDISC_STATE_MISSED, &qdisc->state))
      
      to be executed before an enqueue operation called before
      qdisc_run_begin().
      
      As a result the following race can happen :
      
                 CPU 1                             CPU 2
      
            qdisc_run_begin()               qdisc_run_begin() /* true */
              set(MISSED)                            .
            /* returns false */                      .
                .                            /* sees MISSED = 1 */
                .                            /* so qdisc not empty */
                .                            __qdisc_run()
                .                                    .
                .                              pfifo_fast_dequeue()
       ----> /* may be done here */                  .
      |         .                                clear(MISSED)
      |         .                                    .
      |         .                                smp_mb__after_atomic();
      |         .                                    .
      |         .                                /* recheck the queue */
      |         .                                /* nothing => exit   */
      |   enqueue(skb1)
      |         .
      |   qdisc_run_begin()
      |         .
      |     spin_trylock() /* fail */
      |         .
      |     smp_mb__before_atomic() /* not enough */
      |         .
       ---- if (test_bit(MISSED))
              return false;   /* exit */
      
      In the above scenario, CPU 1 and CPU 2 both try to grab the
      qdisc->seqlock at the same time. Only CPU 2 succeeds and enters the
      bypass code path, where it emits its skb then calls __qdisc_run().
      
      CPU1 fails, sets MISSED and goes down the traditional enqueue() +
      dequeue() code path. But when executing qdisc_run_begin() for the
      second time, after enqueuing its skbuff, it sees the MISSED bit still
      set (by itself) and consequently chooses to exit early without setting
      it again nor trying to grab the spinlock again.
      
      Meanwhile CPU2 has seen MISSED = 1, cleared it, checked the queue
      and found it empty, so it returned.
      
      At the end of the sequence, we end up with skb1 enqueued in the
      backlog, both CPUs out of __dev_xmit_skb(), the MISSED bit not set,
      and no __netif_schedule() call made. skb1 will now linger in the
      qdisc until somebody later performs a full __qdisc_run(). Combined
      with the bypass capacity of the qdisc, and the ability of the TCP layer
      to avoid resending packets which it knows are still in the qdisc, this
      can lead to serious traffic "holes" in a TCP connection.
      
      We fix this by replacing the smp_mb__before_atomic() / test_bit() /
      set_bit() / smp_mb__after_atomic() sequence inside qdisc_run_begin()
      by a single test_and_set_bit() call, which is more concise and
      enforces the needed memory barriers.
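
The idea of folding the test and the set into one atomic read-modify-write can be sketched in userspace with C11 atomics. This is only an analogue: the bit name, the helper names and the single lock retry are assumptions for illustration, and C11 sequentially consistent ordering is not identical to the kernel memory model.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define STATE_MISSED (1UL << 0)   /* hypothetical state bit */

/*
 * Atomically set the bit and report whether it was already set.
 * atomic_fetch_or() is a sequentially consistent read-modify-write here,
 * so no separate barrier is needed around it.
 */
static bool test_and_set_missed(atomic_ulong *state)
{
    return (atomic_fetch_or(state, STATE_MISSED) & STATE_MISSED) != 0;
}

/*
 * Models the contended path after a failed trylock: exit early only if
 * another caller had already recorded the miss, otherwise retry the lock
 * once. Returns true when the caller now owns the lock.
 */
static bool run_begin_contended(atomic_ulong *state, atomic_flag *lock)
{
    if (test_and_set_missed(state))
        return false;

    return !atomic_flag_test_and_set(lock);
}
```

Because the read of the old bit value and the write of the new one happen in a single ordered operation, the check cannot be reordered ahead of an earlier enqueue in the way a plain bit test can.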
      
      Fixes: 89837eb4 ("net: sched: add barrier to ensure correct ordering for lockless qdisc")
      Signed-off-by: Vincent Ray <vray@kalrayinc.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20220526001746.2437669-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: lan966x: check devm_of_phy_get() for -EDEFER_PROBE · b58cdd43
      Michael Walle authored
      At the moment, if devm_of_phy_get() returns an error the serdes
      simply isn't set. While it is bad to ignore an error in general, this
      causes a particular bug: the network doesn't work if the serdes driver is
      compiled as a module. In that case, devm_of_phy_get() returns
      -EPROBE_DEFER and the error is silently ignored.
      
      The serdes is optional; it is not there if the port is using RGMII, in
      which case devm_of_phy_get() returns -ENODEV. Rearrange the error
      handling so that -ENODEV will be handled but other error codes will
      abort the probing.
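
The rearranged error handling can be sketched as a small classification function. The function name and the `have_serdes` flag are hypothetical; the kernel's deferral code is spelled `EPROBE_DEFER` and is defined locally here because it is a kernel-internal errno that userspace headers do not provide.

```c
#include <errno.h>

#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517   /* kernel-internal errno, absent from userspace */
#endif

/*
 * Classify the result of looking up an optional serdes.
 * Returns 0 when probing may continue (with or without a serdes) and a
 * negative errno when probing must be aborted.
 */
static int handle_serdes_lookup(int lookup_err, int *have_serdes)
{
    if (lookup_err == 0) {
        *have_serdes = 1;       /* serdes found and usable */
        return 0;
    }

    if (lookup_err == -ENODEV) {
        *have_serdes = 0;       /* optional serdes is simply absent */
        return 0;
    }

    return lookup_err;          /* includes -EPROBE_DEFER: abort and retry */
}
```

Propagating the deferral code lets the driver core retry the probe later, once the serdes module has been loaded, instead of continuing without a serdes.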
      
      Fixes: d28d6d2e ("net: lan966x: add port module support")
      Signed-off-by: Michael Walle <michael@walle.cc>
      Link: https://lore.kernel.org/r/20220525231239.1307298-1-michael@walle.cc
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>