1. 30 May, 2023 4 commits
    • tcp: deny tcp_disconnect() when threads are waiting · 4faeee0c
      Eric Dumazet authored
      Historically connect(AF_UNSPEC) has been abused by syzkaller
      and other fuzzers to trigger various bugs.
      
      A recent one triggers a divide-by-zero [1], and Paolo Abeni
      was able to diagnose the issue.
      
      tcp_recvmsg_locked() checks that sk_state is not TCP_LISTEN
      and that TCP REPAIR mode is not in use.
      
      Later, if the socket lock is released in sk_wait_data(),
      another thread can call connect(AF_UNSPEC) and then make this
      socket a TCP listener.
      
      When recvmsg() resumes, it can eventually call tcp_cleanup_rbuf()
      and attempt a divide by 0 in tcp_rcv_space_adjust() [1].
      
      This patch adds a new socket field that counts the number of threads
      blocked in sk_wait_event() and inet_wait_for_connect().
      
      If this counter is not zero, tcp_disconnect() returns an error.
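
      A minimal user-space sketch of the idea, not the kernel patch itself:
      a counter tracks threads that are blocked waiting, and the disconnect
      path refuses to run while it is non-zero. The names toy_sock,
      wait_pending, toy_wait_begin(), toy_wait_end() and toy_disconnect()
      are hypothetical, and -EBUSY is an assumed error code; the text above
      only says that an error is returned.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a socket; not the kernel's struct sock. */
struct toy_sock {
	int wait_pending;	/* threads currently blocked in a wait helper */
	int connected;		/* 1 while the connection is established */
};

/* Call with the (imaginary) socket lock held, just before sleeping. */
void toy_wait_begin(struct toy_sock *sk)
{
	sk->wait_pending++;
}

/* Call with the lock held again, right after waking up. */
void toy_wait_end(struct toy_sock *sk)
{
	assert(sk->wait_pending > 0);
	sk->wait_pending--;
}

/* Refuse to reset the socket while any thread is still mid-wait. */
int toy_disconnect(struct toy_sock *sk)
{
	if (sk->wait_pending)
		return -EBUSY;	/* assumed error code */
	sk->connected = 0;
	return 0;
}
```

      In this sketch only the blocking paths touch the counter, which
      mirrors the point that non-blocking calls pay nothing extra.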
      
      This patch adds code only to blocking socket system calls, so it
      should not hurt the performance of non-blocking ones.
      
      Note that we probably could revert commit 499350a5 ("tcp:
      initialize rcv_mss to TCP_MIN_MSS instead of 0") to restore
      the original tcpi_rcv_mss meaning (it was 0 if no payload was ever
      received on a socket).
      
      [1]
      divide error: 0000 [#1] PREEMPT SMP KASAN
      CPU: 0 PID: 13832 Comm: syz-executor.5 Not tainted 6.3.0-rc4-syzkaller-00224-g00c7b5f4 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/02/2023
      RIP: 0010:tcp_rcv_space_adjust+0x36e/0x9d0 net/ipv4/tcp_input.c:740
      Code: 00 00 00 00 fc ff df 4c 89 64 24 48 8b 44 24 04 44 89 f9 41 81 c7 80 03 00 00 c1 e1 04 44 29 f0 48 63 c9 48 01 e9 48 0f af c1 <49> f7 f6 48 8d 04 41 48 89 44 24 40 48 8b 44 24 30 48 c1 e8 03 48
      RSP: 0018:ffffc900033af660 EFLAGS: 00010206
      RAX: 4a66b76cbade2c48 RBX: ffff888076640cc0 RCX: 00000000c334e4ac
      RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000000001
      RBP: 00000000c324e86c R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff8880766417f8
      R13: ffff888028fbb980 R14: 0000000000000000 R15: 0000000000010344
      FS: 00007f5bffbfe700(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
      CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000001b32f25000 CR3: 000000007ced0000 CR4: 00000000003506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
      <TASK>
      tcp_recvmsg_locked+0x100e/0x22e0 net/ipv4/tcp.c:2616
      tcp_recvmsg+0x117/0x620 net/ipv4/tcp.c:2681
      inet6_recvmsg+0x114/0x640 net/ipv6/af_inet6.c:670
      sock_recvmsg_nosec net/socket.c:1017 [inline]
      sock_recvmsg+0xe2/0x160 net/socket.c:1038
      ____sys_recvmsg+0x210/0x5a0 net/socket.c:2720
      ___sys_recvmsg+0xf2/0x180 net/socket.c:2762
      do_recvmmsg+0x25e/0x6e0 net/socket.c:2856
      __sys_recvmmsg net/socket.c:2935 [inline]
      __do_sys_recvmmsg net/socket.c:2958 [inline]
      __se_sys_recvmmsg net/socket.c:2951 [inline]
      __x64_sys_recvmmsg+0x20f/0x260 net/socket.c:2951
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      RIP: 0033:0x7f5c0108c0f9
      Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007f5bffbfe168 EFLAGS: 00000246 ORIG_RAX: 000000000000012b
      RAX: ffffffffffffffda RBX: 00007f5c011ac050 RCX: 00007f5c0108c0f9
      RDX: 0000000000000001 RSI: 0000000020000bc0 RDI: 0000000000000003
      RBP: 00007f5c010e7b39 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000122 R11: 0000000000000246 R12: 0000000000000000
      R13: 00007f5c012cfb1f R14: 00007f5bffbfe300 R15: 0000000000022000
      </TASK>
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Reported-by: Paolo Abeni <pabeni@redhat.com>
      Diagnosed-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Tested-by: Paolo Abeni <pabeni@redhat.com>
      Link: https://lore.kernel.org/r/20230526163458.2880232-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • af_packet: do not use READ_ONCE() in packet_bind() · 6ffc57ea
      Eric Dumazet authored
      A recent patch added READ_ONCE() in packet_bind() and packet_bind_spkt().
      
      This is better handled by reading pkt_sk(sk)->num later
      in packet_do_bind() while the appropriate lock is held.
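
      A minimal sketch of that pattern, not the actual af_packet code:
      the caller passes a placeholder and the shared field is read only
      once the lock is held, so no lockless annotation is needed. The
      names toy_packet_sock, toy_lock(), toy_unlock() and toy_do_bind()
      are hypothetical, and treating a zero protocol as "keep the current
      one" is an assumption made for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a packet socket. */
struct toy_packet_sock {
	bool locked;		/* models the bind lock being held */
	unsigned short num;	/* currently bound protocol number */
};

void toy_lock(struct toy_packet_sock *po)
{
	assert(!po->locked);
	po->locked = true;
}

void toy_unlock(struct toy_packet_sock *po)
{
	assert(po->locked);
	po->locked = false;
}

/* proto == 0 means "keep the current protocol"; resolved under the lock. */
unsigned short toy_do_bind(struct toy_packet_sock *po, unsigned short proto)
{
	unsigned short bound;

	toy_lock(po);
	if (!proto)
		proto = po->num;	/* plain read: writers are excluded here */
	po->num = proto;
	bound = po->num;
	toy_unlock(po);
	return bound;
}
```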
      
      READ_ONCE() in a writer is often a sign that something is wrong.
      
      Fixes: 822b5a1c ("af_packet: Fix data-races of pkt_sk(sk)->num.")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://lore.kernel.org/r/20230526154342.2533026-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • netlink: specs: correct types of legacy arrays · 0684f29a
      Jakub Kicinski authored
      ethtool has some attrs which dump multiple scalars into
      an attribute. The spec currently expects one attr per entry.
      
      Fixes: a353318e ("tools: ynl: populate most of the ethtool spec")
      Acked-by: Stanislav Fomichev <sdf@google.com>
      Link: https://lore.kernel.org/r/20230526220653.65538-1-kuba@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: usb: qmi_wwan: Set DTR quirk for BroadMobi BM818 · 36936a56
      Sebastian Krzyszkowiak authored
      BM818 is based on Qualcomm MDM9607 chipset.
      
      Fixes: 9a07406b ("net: usb: qmi_wwan: Add the BroadMobi BM818 card")
      Cc: stable@vger.kernel.org
      Signed-off-by: Sebastian Krzyszkowiak <sebastian.krzyszkowiak@puri.sm>
      Acked-by: Bjørn Mork <bjorn@mork.no>
      Link: https://lore.kernel.org/r/20230526-bm818-dtr-v1-1-64bbfa6ba8af@puri.sm
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  2. 26 May, 2023 9 commits
    • nfcsim.c: Fix error checking for debugfs_create_dir · 9b9e46aa
      Osama Muhammad authored
      This patch fixes the error checking in nfcsim.c.
      The debugfs kernel API is designed so that callers can
      safely ignore errors that occur while creating debugfs
      nodes.
      Signed-off-by: Osama Muhammad <osmtendev@gmail.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • amd-xgbe: fix the false linkup in xgbe_phy_status · dc362e20
      Raju Rangoju authored
      In the event of a change in XGBE mode, the current auto-negotiation
      needs to be reset and the AN cycle needs to be re-triggered. However,
      the current code ignores the return value of xgbe_set_mode(), leading to
      false information as the link is declared without checking the status
      register.
      
      Fix this by propagating the mode switch status information to
      xgbe_phy_status().
      
      Fixes: e57f7a3f ("amd-xgbe: Prepare for working with more than one type of phy")
      Co-developed-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
      Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
      Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tls: improve lockless access safety of tls_err_abort() · 8a0d57df
      Jakub Kicinski authored
      Most protos' poll() methods insert a memory barrier between
      writes to sk_err and sk_error_report(). This dates back to
      commit a4d25803 ("tcp: Fix race in tcp_poll").
      
      I guess we should do the same thing in TLS, since tcp_poll() does
      not hold the socket lock.
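
      A minimal user-space sketch of the ordering concern, using C11
      atomics as a stand-in for the kernel's barrier primitives (which
      this example does not use): the writer publishes the error before
      signalling, and the lockless reader pairs with that ordering. The
      names toy_sk, toy_err_abort() and toy_poll_err() are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the socket fields involved. */
struct toy_sk {
	atomic_int err;		/* models sk_err */
	atomic_int reported;	/* set by the stand-in for sk_error_report() */
};

/* Writer: store the error, then a barrier, then the report. */
void toy_err_abort(struct toy_sk *sk, int err)
{
	atomic_store_explicit(&sk->err, err, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);	/* err before report */
	atomic_store_explicit(&sk->reported, 1, memory_order_relaxed);
}

/* Lockless reader: returns the error once a report has been observed. */
int toy_poll_err(struct toy_sk *sk)
{
	if (!atomic_load_explicit(&sk->reported, memory_order_relaxed))
		return 0;
	atomic_thread_fence(memory_order_acquire);	/* pairs with the release */
	return atomic_load_explicit(&sk->err, memory_order_relaxed);
}
```

      Without the two fences, a reader on another CPU could in principle
      observe the report while still seeing the old error value.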
      
      Fixes: 3c4d7559 ("tls: kernel TLS support")
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'mlx5-fixes-2023-05-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · aa866ee4
      Jakub Kicinski authored
      Saeed Mahameed says:
      
      ====================
      mlx5 fixes 2023-05-24
      
      This series includes bug fixes for the mlx5 driver.
      
      * tag 'mlx5-fixes-2023-05-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
        Documentation: net/mlx5: Wrap notes in admonition blocks
        Documentation: net/mlx5: Add blank line separator before numbered lists
        Documentation: net/mlx5: Use bullet and definition lists for vnic counters description
        Documentation: net/mlx5: Wrap vnic reporter devlink commands in code blocks
        net/mlx5: Fix check for allocation failure in comp_irqs_request_pci()
        net/mlx5: DR, Add missing mutex init/destroy in pattern manager
        net/mlx5e: Move Ethernet driver debugfs to profile init callback
        net/mlx5e: Don't attach netdev profile while handling internal error
        net/mlx5: Fix post parse infra to only parse every action once
        net/mlx5e: Use query_special_contexts cmd only once per mdev
        net/mlx5: fw_tracer, Fix event handling
        net/mlx5: SF, Drain health before removing device
        net/mlx5: Drain health before unregistering devlink
        net/mlx5e: Do not update SBCM when prio2buffer command is invalid
        net/mlx5e: Consider internal buffers size in port buffer calculations
        net/mlx5e: Prevent encap offload when neigh update is running
        net/mlx5e: Extract remaining tunnel encap code to dedicated file
      ====================
      
      Link: https://lore.kernel.org/r/20230525034847.99268-1-saeed@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • af_packet: Fix data-races of pkt_sk(sk)->num. · 822b5a1c
      Kuniyuki Iwashima authored
      syzkaller found a data race of pkt_sk(sk)->num.
      
      The value is changed under lock_sock() and po->bind_lock, so we
      need READ_ONCE() to access pkt_sk(sk)->num without these locks in
      packet_bind_spkt(), packet_bind(), and sk_diag_fill().
      
      Note that WRITE_ONCE() is already added by commit c7d2ef5d
      ("net/packet: annotate accesses to po->bind").
      
      BUG: KCSAN: data-race in packet_bind / packet_do_bind
      
      write (marked) to 0xffff88802ffd1cee of 2 bytes by task 7322 on cpu 0:
       packet_do_bind+0x446/0x640 net/packet/af_packet.c:3236
       packet_bind+0x99/0xe0 net/packet/af_packet.c:3321
       __sys_bind+0x19b/0x1e0 net/socket.c:1803
       __do_sys_bind net/socket.c:1814 [inline]
       __se_sys_bind net/socket.c:1812 [inline]
       __x64_sys_bind+0x40/0x50 net/socket.c:1812
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x72/0xdc
      
      read to 0xffff88802ffd1cee of 2 bytes by task 7318 on cpu 1:
       packet_bind+0xbf/0xe0 net/packet/af_packet.c:3322
       __sys_bind+0x19b/0x1e0 net/socket.c:1803
       __do_sys_bind net/socket.c:1814 [inline]
       __se_sys_bind net/socket.c:1812 [inline]
       __x64_sys_bind+0x40/0x50 net/socket.c:1812
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x72/0xdc
      
      value changed: 0x0300 -> 0x0000
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 7318 Comm: syz-executor.4 Not tainted 6.3.0-13380-g7fddb5b5300c #4
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
      
      Fixes: 96ec6327 ("packet: Diag core and basic socket info dumping")
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: syzkaller <syzkaller@googlegroups.com>
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Link: https://lore.kernel.org/r/20230524232934.50950-1-kuniyu@amazon.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • tools: ynl: avoid dict errors on older Python versions · 081e8df6
      Jakub Kicinski authored
      Python 3.9.0 or newer supports combining dicts with |,
      but older versions of Python are still used in the wild
      (e.g. on CentOS 8, which goes EoL May 31, 2024).
      With Python 3.6.8 we get:
      
        TypeError: unsupported operand type(s) for |: 'dict' and 'dict'
      
      Use older syntax. Tested with non-legacy families only.
      
      Fixes: f036d936 ("tools: ynl: Add fixed-header support to ynl")
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
      Tested-by: Donald Hunter <donald.hunter@gmail.com>
      Link: https://lore.kernel.org/r/20230524170712.2036128-1-kuba@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • netrom: fix info-leak in nr_write_internal() · 31642e70
      Eric Dumazet authored
      Simon Kapadia reported the following issue:
      
      <quote>
      
      The Online Amateur Radio Community (OARC) has recently been experimenting
      with building a nationwide packet network in the UK.
      As part of our experimentation, we have been testing out packet on 300bps HF,
      and playing with net/rom.  For HF packet at this baud rate you really need
      to make sure that your MTU is relatively low; AX.25 suggests a PACLEN of 60,
      and a net/rom PACLEN of 40 to go with that.
      However the Linux net/rom support didn't work with a low PACLEN;
      the mkiss module would truncate packets if you set the PACLEN below about 200 or so, e.g.:
      
      Apr 19 14:00:51 radio kernel: [12985.747310] mkiss: ax1: truncating oversized transmit packet!
      
      This didn't make any sense to me (if the packets are smaller why would they
      be truncated?) so I started investigating.
      I looked at the packets using ethereal, and found that many were just huge
      compared to what I would expect.
      A simple net/rom connection request packet had the request and then a bunch
      of what appeared to be random data following it:
      
      </quote>
      
      Simon provided a patch that I slightly revised:
      Not only must we not use skb_tailroom(), we also do
      not want to count NR_NETWORK_LEN twice.
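
      A minimal sketch of why sizing a frame by the buffer's spare room
      leaks data, not the netrom code itself: allocators may hand out more
      space than requested, so claiming all of the tailroom sends bytes
      that were never written. The toy_* names and the sizes used here are
      assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative sizes only; assumed for this sketch. */
#define TOY_NETWORK_LEN   15
#define TOY_TRANSPORT_LEN  5

/* Hypothetical stand-in for a socket buffer. */
struct toy_skb {
	size_t capacity;	/* bytes the allocator actually handed out */
	size_t headroom;	/* bytes reserved in front of the data */
	size_t len;		/* bytes that will go on the wire */
};

size_t toy_tailroom(const struct toy_skb *skb)
{
	return skb->capacity - skb->headroom - skb->len;
}

/* Extend the data area by n bytes, in the spirit of skb_put(). */
void toy_put(struct toy_skb *skb, size_t n)
{
	assert(n <= toy_tailroom(skb));
	skb->len += n;
}

/* Leaky variant: claims every spare byte, initialized or not. */
size_t toy_frame_len_leaky(struct toy_skb skb)
{
	toy_put(&skb, toy_tailroom(&skb));
	return skb.len;
}

/*
 * Exact variant: claims only what the frame needs. The network header
 * lives in the reserved headroom, so it is not counted a second time.
 */
size_t toy_frame_len_exact(struct toy_skb skb, size_t payload)
{
	toy_put(&skb, TOY_TRANSPORT_LEN + payload);
	return skb.len;
}
```

      With a 256-byte buffer and 15 bytes of headroom, the leaky variant
      produces a 241-byte frame where 22 bytes were intended, which matches
      the oversized packets described in the report.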
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Co-Developed-by: Simon Kapadia <szymon@kapadia.pl>
      Signed-off-by: Simon Kapadia <szymon@kapadia.pl>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Tested-by: Simon Kapadia <szymon@kapadia.pl>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Link: https://lore.kernel.org/r/20230524141456.1045467-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: stmmac: fix call trace when stmmac_xdp_xmit() is invoked · ffb33221
      Wei Fang authored
      We encountered a kernel call trace related to the
      ndo_xdp_xmit callback on our i.MX8MP platform. The steps
      to reproduce are as follows.
      1. The FEC port (eth0) connects to a PC port, and the PC uses
      pktgen_sample03_burst_single_flow.sh to generate packets and
      send them to the FEC port. Note that the script must
      be started before step 2.
      2. Run the "./xdp_redirect eth0 eth1" command on i.MX8MP, where
      the eth1 interface is the dwmac. A call trace appears
      soon afterwards. Please see the log for more details.
      The root cause is that the NETDEV_XDP_ACT_NDO_XMIT feature is
      enabled by default, so when the step 2 command is executed
      and packets have already been sent to eth0, stmmac_xdp_xmit()
      starts running before stmmac_xdp_set_prog() finishes. To
      resolve this issue, we disable the NETDEV_XDP_ACT_NDO_XMIT
      feature by default and turn it on or off when the bpf
      program is installed or uninstalled, just like the other
      Ethernet drivers.
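
      A minimal sketch of the resulting ordering, not the stmmac driver
      code: the transmit capability is advertised only after setup has
      finished and withdrawn before teardown, so a redirect can never
      reach a half-initialized device. The names toy_netdev,
      TOY_XDP_ACT_NDO_XMIT, toy_xdp_set_prog() and toy_redirect_to() are
      hypothetical, as is the -EOPNOTSUPP return value.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_XDP_ACT_NDO_XMIT 0x1u	/* hypothetical feature bit */

/* Hypothetical stand-in for a network device. */
struct toy_netdev {
	unsigned int xdp_features;	/* advertised XDP capabilities */
	const void *prog;		/* installed program, or NULL */
	bool tx_ready;			/* XDP transmit resources set up */
};

/* Install (prog != NULL) or remove (prog == NULL) a program. */
void toy_xdp_set_prog(struct toy_netdev *dev, const void *prog)
{
	if (prog) {
		dev->tx_ready = true;	/* finish setup first ... */
		dev->prog = prog;
		dev->xdp_features |= TOY_XDP_ACT_NDO_XMIT;	/* ... then advertise */
	} else {
		dev->xdp_features &= ~TOY_XDP_ACT_NDO_XMIT;	/* withdraw first */
		dev->prog = NULL;
		dev->tx_ready = false;
	}
}

/* Redirect is attempted only toward devices that advertise the bit. */
int toy_redirect_to(struct toy_netdev *dev)
{
	if (!(dev->xdp_features & TOY_XDP_ACT_NDO_XMIT))
		return -EOPNOTSUPP;	/* assumed error code */
	assert(dev->tx_ready);	/* would fail if the bit were on by default */
	return 0;
}
```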
      
      Call Trace log:
      [  306.311271] ------------[ cut here ]------------
      [  306.315910] WARNING: CPU: 0 PID: 15 at lib/timerqueue.c:55 timerqueue_del+0x68/0x70
      [  306.323590] Modules linked in:
      [  306.326654] CPU: 0 PID: 15 Comm: ksoftirqd/0 Not tainted 6.4.0-rc1+ #37
      [  306.333277] Hardware name: NXP i.MX8MPlus EVK board (DT)
      [  306.338591] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      [  306.345561] pc : timerqueue_del+0x68/0x70
      [  306.349577] lr : __remove_hrtimer+0x5c/0xa0
      [  306.353777] sp : ffff80000b7c3920
      [  306.357094] x29: ffff80000b7c3920 x28: 0000000000000000 x27: 0000000000000001
      [  306.364244] x26: ffff80000a763a40 x25: ffff0000d0285a00 x24: 0000000000000001
      [  306.371390] x23: 0000000000000001 x22: ffff000179389a40 x21: 0000000000000000
      [  306.378537] x20: ffff000179389aa0 x19: ffff0000d2951308 x18: 0000000000001000
      [  306.385686] x17: f1d3000000000000 x16: 00000000c39c1000 x15: 55e99bbe00001a00
      [  306.392835] x14: 09000900120aa8c0 x13: e49af1d300000000 x12: 000000000000c39c
      [  306.399987] x11: 100055e99bbe0000 x10: ffff8000090b1048 x9 : ffff8000081603fc
      [  306.407133] x8 : 000000000000003c x7 : 000000000000003c x6 : 0000000000000001
      [  306.414284] x5 : ffff0000d2950980 x4 : 0000000000000000 x3 : 0000000000000000
      [  306.421432] x2 : 0000000000000001 x1 : ffff0000d2951308 x0 : ffff0000d2951308
      [  306.428585] Call trace:
      [  306.431035]  timerqueue_del+0x68/0x70
      [  306.434706]  __remove_hrtimer+0x5c/0xa0
      [  306.438549]  hrtimer_start_range_ns+0x2bc/0x370
      [  306.443089]  stmmac_xdp_xmit+0x174/0x1b0
      [  306.447021]  bq_xmit_all+0x194/0x4b0
      [  306.450612]  __dev_flush+0x4c/0x98
      [  306.454024]  xdp_do_flush+0x18/0x38
      [  306.457522]  fec_enet_rx_napi+0x6c8/0xc68
      [  306.461539]  __napi_poll+0x40/0x220
      [  306.465038]  net_rx_action+0xf8/0x240
      [  306.468707]  __do_softirq+0x128/0x3a8
      [  306.472378]  run_ksoftirqd+0x40/0x58
      [  306.475961]  smpboot_thread_fn+0x1c4/0x288
      [  306.480068]  kthread+0x124/0x138
      [  306.483305]  ret_from_fork+0x10/0x20
      [  306.486889] ---[ end trace 0000000000000000 ]---
      
      Fixes: 66c0e13a ("drivers: net: turn on XDP features")
      Signed-off-by: Wei Fang <wei.fang@nxp.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Link: https://lore.kernel.org/r/20230524125714.357337-1-wei.fang@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: mellanox: mlxbf_gige: Fix skb_panic splat under memory pressure · d68cb7cf
      Thomas Bogendoerfer authored
      Do skb_put() only after a new skb has been successfully allocated;
      otherwise the reused skb leads to skb_panics or incorrect packet sizes.
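
      A minimal sketch of that ordering, not the mlxbf_gige driver code:
      the received buffer's length is extended only once its replacement
      exists, so a buffer that stays in the ring after a failed allocation
      is never grown twice. The names toy_skb and toy_rx_one() are
      hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket buffer. */
struct toy_skb {
	size_t len;	/* bytes marked as valid data */
};

/*
 * Handle one received packet of pkt_len bytes sitting in *slot.
 * `fresh` is the replacement buffer, or NULL if allocation failed.
 * Returns the completed buffer, or NULL if the packet was dropped.
 */
struct toy_skb *toy_rx_one(struct toy_skb **slot, struct toy_skb *fresh,
			   size_t pkt_len)
{
	struct toy_skb *done;

	if (!fresh)
		return NULL;	/* keep *slot in the ring, length untouched */

	done = *slot;
	done->len += pkt_len;	/* the skb_put() step, only after success */
	*slot = fresh;
	return done;
}
```

      If the length were extended before the allocation check, each failed
      attempt would grow the reused buffer again.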
      
      Fixes: f92e1869 ("Add Mellanox BlueField Gigabit Ethernet driver")
      Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Link: https://lore.kernel.org/r/20230524194908.147145-1-tbogendoerfer@suse.de
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  3. 25 May, 2023 27 commits