- 24 Jun, 2023 1 commit
-
-
Florian Fainelli authored
With support for Ethernet PHY LEDs having been added, there may be "late" accesses to the MDIO bus while unregistering an MDIO bus and its child devices such as PHYs. One typical use case is setting the PHY LED brightness to OFF. We need to ensure that the MDIO bus controller remains entirely functional, since it runs off the main GENET adapter clock. Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20230617155500.4005881-1-andrew@lunn.ch/ Fixes: 9a4e7969 ("net: bcmgenet: utilize generic Broadcom UniMAC MDIO controller driver") Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20230622103107.1760280-1-florian.fainelli@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 23 Jun, 2023 15 commits
-
-
Jakub Kicinski authored
Merge tag 'linux-can-fixes-for-6.4-20230622' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2023-06-22 Oliver Hartkopp's patch fixes the return value in the error path of isotp_sendmsg() in the CAN ISOTP protocol. * tag 'linux-can-fixes-for-6.4-20230622' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: can: isotp: isotp_sendmsg(): fix return error fix on TX path ==================== Link: https://lore.kernel.org/r/20230622090122.574506-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Oleksij Rempel authored
Fix an issue where the kernel would stall during netboot, showing the "sched: RT throttling activated" message. This stall was triggered by the behavior of the mii_interrupt bit (Bit 7 - DP83TD510E_STS_MII_INT) in the DP83TD510E's PHY_STS Register (Address = 0x10). The DP83TD510E datasheet (2020) states that the bit clears on write; in practice, however, the bit clears on read. This discrepancy had significant implications for the driver's interrupt handling. The PHY_STS Register was used by handle_interrupt() to check for pending interrupts and by read_status() to get the current link status. The call to read_status() was unintentionally clearing the mii_interrupt status bit without deasserting the IRQ pin, causing handle_interrupt() to miss other pending interrupts. This issue was most apparent during netboot. The fix refrains from using the PHY_STS Register for interrupt handling. Instead, we now rely solely on the INTERRUPT_REG_1 Register (Address = 0x12) and INTERRUPT_REG_2 Register (Address = 0x13) for this purpose. These registers directly influence the IRQ pin state and are latched high until read. Note: The INTERRUPT_REG_2 Register (Address = 0x13) exists and can also be used for interrupt handling, specifically for "Aneg page received interrupt" and "Polarity change interrupt". However, these features are currently not supported by this driver. Fixes: 165cd04f ("net: phy: dp83td510: Add support for the DP83TD510 Ethernet PHY") Cc: <stable@vger.kernel.org> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20230621043848.3806124-1-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
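The register behaviour described in this message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the `sim_*` names and the `struct sim_phy` layout are hypothetical and are not taken from the dp83td510 driver; only the register addresses (0x10, 0x12) and bit 7 come from the commit message. It models a status bit that clears on read without touching the IRQ line, next to a latched interrupt register whose read deasserts the IRQ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Addresses and bit position as quoted in the commit message. */
#define SIM_PHY_STS          0x10
#define SIM_INTERRUPT_REG_1  0x12
#define SIM_STS_MII_INT      (1u << 7)

/* Hypothetical toy model of the PHY, not the real hardware or driver. */
struct sim_phy {
	uint16_t int_latched;  /* latched event bits that drive the IRQ pin */
	bool sts_mii_int;      /* PHY_STS bit 7, observed to clear on read */
};

static bool sim_irq_asserted(const struct sim_phy *phy)
{
	return phy->int_latched != 0;
}

static void sim_raise_event(struct sim_phy *phy, uint16_t bit)
{
	phy->int_latched |= bit;
	phy->sts_mii_int = true;
}

static uint16_t sim_read(struct sim_phy *phy, int reg)
{
	uint16_t val = 0;

	switch (reg) {
	case SIM_PHY_STS:
		if (phy->sts_mii_int)
			val |= SIM_STS_MII_INT;
		phy->sts_mii_int = false;  /* clears on read, IRQ pin untouched */
		break;
	case SIM_INTERRUPT_REG_1:
		val = phy->int_latched;
		phy->int_latched = 0;      /* latched until read, read deasserts IRQ */
		break;
	}
	return val;
}

/* Old shape: trust PHY_STS to decide whether an interrupt is pending. */
static bool sim_handle_irq_old(struct sim_phy *phy)
{
	if (!(sim_read(phy, SIM_PHY_STS) & SIM_STS_MII_INT))
		return false;              /* interrupt treated as "not ours" */
	sim_read(phy, SIM_INTERRUPT_REG_1);
	return true;
}

/* New shape: rely only on the latched interrupt register. */
static bool sim_handle_irq_new(struct sim_phy *phy)
{
	return sim_read(phy, SIM_INTERRUPT_REG_1) != 0;
}
```

In this model, a status read that happens between the event and the handler makes the old handler return early while the IRQ stays asserted, which matches the stall described above; the new handler still sees and clears the latched event.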
-
Eric Dumazet authored
syzbot reports that some netdev devices do not have a six bytes address [1] Replace ETH_ALEN by dev->addr_len. [1] (Case of a device where dev->addr_len = 4) BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:114 [inline] BUG: KMSAN: kernel-infoleak in copyout+0xb8/0x100 lib/iov_iter.c:169 instrument_copy_to_user include/linux/instrumented.h:114 [inline] copyout+0xb8/0x100 lib/iov_iter.c:169 _copy_to_iter+0x6d8/0x1d00 lib/iov_iter.c:536 copy_to_iter include/linux/uio.h:206 [inline] simple_copy_to_iter+0x68/0xa0 net/core/datagram.c:513 __skb_datagram_iter+0x123/0xdc0 net/core/datagram.c:419 skb_copy_datagram_iter+0x5c/0x200 net/core/datagram.c:527 skb_copy_datagram_msg include/linux/skbuff.h:3960 [inline] netlink_recvmsg+0x4ae/0x15a0 net/netlink/af_netlink.c:1970 sock_recvmsg_nosec net/socket.c:1019 [inline] sock_recvmsg net/socket.c:1040 [inline] ____sys_recvmsg+0x283/0x7f0 net/socket.c:2722 ___sys_recvmsg+0x223/0x840 net/socket.c:2764 do_recvmmsg+0x4f9/0xfd0 net/socket.c:2858 __sys_recvmmsg net/socket.c:2937 [inline] __do_sys_recvmmsg net/socket.c:2960 [inline] __se_sys_recvmmsg net/socket.c:2953 [inline] __x64_sys_recvmmsg+0x397/0x490 net/socket.c:2953 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Uninit was stored to memory at: __nla_put lib/nlattr.c:1009 [inline] nla_put+0x1c6/0x230 lib/nlattr.c:1067 nlmsg_populate_fdb_fill+0x2b8/0x600 net/core/rtnetlink.c:4071 nlmsg_populate_fdb net/core/rtnetlink.c:4418 [inline] ndo_dflt_fdb_dump+0x616/0x840 net/core/rtnetlink.c:4456 rtnl_fdb_dump+0x14ff/0x1fc0 net/core/rtnetlink.c:4629 netlink_dump+0x9d1/0x1310 net/netlink/af_netlink.c:2268 netlink_recvmsg+0xc5c/0x15a0 net/netlink/af_netlink.c:1995 sock_recvmsg_nosec+0x7a/0x120 net/socket.c:1019 ____sys_recvmsg+0x664/0x7f0 net/socket.c:2720 ___sys_recvmsg+0x223/0x840 net/socket.c:2764 do_recvmmsg+0x4f9/0xfd0 net/socket.c:2858 __sys_recvmmsg 
net/socket.c:2937 [inline] __do_sys_recvmmsg net/socket.c:2960 [inline] __se_sys_recvmmsg net/socket.c:2953 [inline] __x64_sys_recvmmsg+0x397/0x490 net/socket.c:2953 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Uninit was created at: slab_post_alloc_hook+0x12d/0xb60 mm/slab.h:716 slab_alloc_node mm/slub.c:3451 [inline] __kmem_cache_alloc_node+0x4ff/0x8b0 mm/slub.c:3490 kmalloc_trace+0x51/0x200 mm/slab_common.c:1057 kmalloc include/linux/slab.h:559 [inline] __hw_addr_create net/core/dev_addr_lists.c:60 [inline] __hw_addr_add_ex+0x2e5/0x9e0 net/core/dev_addr_lists.c:118 __dev_mc_add net/core/dev_addr_lists.c:867 [inline] dev_mc_add+0x9a/0x130 net/core/dev_addr_lists.c:885 igmp6_group_added+0x267/0xbc0 net/ipv6/mcast.c:680 ipv6_mc_up+0x296/0x3b0 net/ipv6/mcast.c:2754 ipv6_mc_remap+0x1e/0x30 net/ipv6/mcast.c:2708 addrconf_type_change net/ipv6/addrconf.c:3731 [inline] addrconf_notify+0x4d3/0x1d90 net/ipv6/addrconf.c:3699 notifier_call_chain kernel/notifier.c:93 [inline] raw_notifier_call_chain+0xe4/0x430 kernel/notifier.c:461 call_netdevice_notifiers_info net/core/dev.c:1935 [inline] call_netdevice_notifiers_extack net/core/dev.c:1973 [inline] call_netdevice_notifiers+0x1ee/0x2d0 net/core/dev.c:1987 bond_enslave+0xccd/0x53f0 drivers/net/bonding/bond_main.c:1906 do_set_master net/core/rtnetlink.c:2626 [inline] rtnl_newlink_create net/core/rtnetlink.c:3460 [inline] __rtnl_newlink net/core/rtnetlink.c:3660 [inline] rtnl_newlink+0x378c/0x40e0 net/core/rtnetlink.c:3673 rtnetlink_rcv_msg+0x16a6/0x1840 net/core/rtnetlink.c:6395 netlink_rcv_skb+0x371/0x650 net/netlink/af_netlink.c:2546 rtnetlink_rcv+0x34/0x40 net/core/rtnetlink.c:6413 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline] netlink_unicast+0xf28/0x1230 net/netlink/af_netlink.c:1365 netlink_sendmsg+0x122f/0x13d0 net/netlink/af_netlink.c:1913 sock_sendmsg_nosec net/socket.c:724 [inline] sock_sendmsg 
net/socket.c:747 [inline] ____sys_sendmsg+0x999/0xd50 net/socket.c:2503 ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2557 __sys_sendmsg net/socket.c:2586 [inline] __do_sys_sendmsg net/socket.c:2595 [inline] __se_sys_sendmsg net/socket.c:2593 [inline] __x64_sys_sendmsg+0x304/0x490 net/socket.c:2593 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Bytes 2856-2857 of 3500 are uninitialized Memory access of size 3500 starts at ffff888018d99104 Data copied to user address 0000000020000480 Fixes: d83b0603 ("net: add fdb generic dump routine") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20230621174720.1845040-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
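Why a fixed six-byte copy leaks two bytes for a device with a four-byte address can be shown with a minimal userspace sketch. Everything here is an illustrative assumption: `struct sim_hw_addr`, the `sim_*` helpers and the poison value are hypothetical stand-ins for the kernel's address list entry and the netlink attribute fill, not the actual rtnetlink code.

```c
#include <stddef.h>
#include <string.h>

#define SIM_ETH_ALEN      6     /* fixed Ethernet address length */
#define SIM_MAX_ADDR_LEN  32    /* size of the address storage */
#define SIM_POISON        0xAA  /* stands in for uninitialized memory */

/* Hypothetical stand-in for a hardware address list entry. */
struct sim_hw_addr {
	unsigned char addr[SIM_MAX_ADDR_LEN];
	size_t addr_len;  /* only this many bytes are initialized */
};

static void sim_hw_addr_set(struct sim_hw_addr *ha,
			    const unsigned char *addr, size_t len)
{
	memset(ha->addr, SIM_POISON, sizeof(ha->addr));
	memcpy(ha->addr, addr, len);
	ha->addr_len = len;
}

/*
 * Copy copy_len bytes into out, as a dump routine would, and return how
 * many of the copied bytes lie beyond the initialized address.
 */
static size_t sim_dump_addr(const struct sim_hw_addr *ha, size_t copy_len,
			    unsigned char *out)
{
	size_t i, leaked = 0;

	memcpy(out, ha->addr, copy_len);
	for (i = ha->addr_len; i < copy_len; i++)
		if (out[i] == SIM_POISON)
			leaked++;
	return leaked;
}
```

With a four-byte address, passing `SIM_ETH_ALEN` as `copy_len` copies two poison bytes, while passing `ha->addr_len` copies none. That is the same shape as the "Bytes 2856-2857 ... are uninitialized" report above.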
-
Eric Dumazet authored
syzbot reported a possible deadlock in netlink_set_err() [1]

A similar issue was fixed in netlink_lock_table() by commit 1d482e66 ("netlink: disable IRQs for netlink_lock_table()").

This patch adds IRQ safety to netlink_set_err() and __netlink_diag_dump(), which were not covered by the cited commit.

[1]
WARNING: possible irq lock inversion dependency detected
6.4.0-rc6-syzkaller-00240-g4e9f0ec3 #0 Not tainted
syz-executor.2/23011 just changed the state of lock:
ffffffff8e1a7a58 (nl_table_lock){.+.?}-{2:2}, at: netlink_set_err+0x2e/0x3a0 net/netlink/af_netlink.c:1612
but this lock was taken by another, SOFTIRQ-safe lock in the past:
 (&local->queue_stop_reason_lock){..-.}-{2:2}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
 Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(nl_table_lock);
                               local_irq_disable();
                               lock(&local->queue_stop_reason_lock);
                               lock(nl_table_lock);
  <Interrupt>
    lock(&local->queue_stop_reason_lock);

 *** DEADLOCK ***

Fixes: 1d482e66 ("netlink: disable IRQs for netlink_lock_table()") Reported-by: syzbot+a7d200a347f912723e5c@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?extid=a7d200a347f912723e5c Link: https://lore.kernel.org/netdev/000000000000e38d1605fea5747e@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20230621154337.1668594-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Bartosz Golaszewski authored
Commit 49725ffc ("net: stmmac: power up/down serdes in stmmac_open/release") correctly added a call to the serdes_powerdown() callback to stmmac_release() but did not remove the one from stmmac_remove() which leads to a doubled call to serdes_powerdown(). This can lead to all kinds of problems: in the case of the qcom ethqos driver, it caused an unbalanced regulator disable splat. Fixes: 49725ffc ("net: stmmac: power up/down serdes in stmmac_open/release") Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Acked-by: Junxiao Chang <junxiao.chang@intel.com> Reviewed-by: Andrew Halaney <ahalaney@redhat.com> Tested-by: Andrew Halaney <ahalaney@redhat.com> Link: https://lore.kernel.org/r/20230621135537.376649-1-brgl@bgdev.pl Signed-off-by: Jakub Kicinski <kuba@kernel.org>
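The imbalance described here can be sketched with a simple enable counter. The model is hypothetical throughout: `struct sim_serdes` and the `sim_*` functions only mimic the open/release/remove ordering named in the message and do not reproduce the stmmac or regulator code.

```c
#include <stdbool.h>

/* Hypothetical model of a resource with balanced enable/disable calls. */
struct sim_serdes {
	int enable_count;
	bool unbalanced;  /* set when a disable arrives with nothing enabled */
};

static void sim_serdes_powerup(struct sim_serdes *s)
{
	s->enable_count++;
}

static void sim_serdes_powerdown(struct sim_serdes *s)
{
	if (s->enable_count == 0) {
		s->unbalanced = true;  /* the "unbalanced disable" condition */
		return;
	}
	s->enable_count--;
}

/* open/release pair up the power calls, as the cited commit intended. */
static void sim_open(struct sim_serdes *s)
{
	sim_serdes_powerup(s);
}

static void sim_release(struct sim_serdes *s)
{
	sim_serdes_powerdown(s);
}

/* remove() with the leftover powerdown (true) or without it (false). */
static void sim_remove(struct sim_serdes *s, bool leftover_powerdown)
{
	if (leftover_powerdown)
		sim_serdes_powerdown(s);
}
```

Running open, release and then remove with the leftover call flags the imbalance; dropping the call from remove keeps the count balanced.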
-
Sathesh Edara authored
Update email addresses of Marvell octeon_ep driver maintainers. Also remove a former maintainer.

The responsibilities of a maintainer are:
- Pushing bug fixes and new features upstream.
- Reviewing external changes submitted for the octeon_ep driver.
- Replying to maintainers' questions in a timely manner.

Signed-off-by: Sathesh Edara <sedara@marvell.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Krzysztof Kozlowski authored
The Devicetree bindings should be picked up by subsystem maintainers, but the respective pattern for Bluetooth drivers was missing. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds authored
Pull networking fixes from Paolo Abeni: "Including fixes from ipsec, bpf, mptcp and netfilter. Current release - regressions: - netfilter: add NFT_TRANS_PREPARE_ERROR to deal with bound set/chain - eth: mlx5e: - fix scheduling of IPsec ASO query while in atomic - free IRQ rmap and notifier on kernel shutdown Current release - new code bugs: - phy: manual remove LEDs to ensure correct ordering Previous releases - regressions: - mptcp: fix possible divide by zero in recvmsg() - dsa: revert "net: phy: dp83867: perform soft reset and retain established link" Previous releases - always broken: - sched: netem: acquire qdisc lock in netem_change() - bpf: - fix verifier id tracking of scalars on spill - fix NULL dereference on exceptions - accept function names that contain dots - netfilter: disallow element updates of bound anonymous sets - mptcp: ensure listener is unhashed before updating the sk status - xfrm: - add missed call to delete offloaded policies - fix inbound ipv4/udp/esp packets to UDPv6 dualstack sockets - selftests: fixes for FIPS mode - dsa: mt7530: fix multiple CPU ports, BPDU and LLDP handling - eth: sfc: use budget for TX completions Misc: - wifi: iwlwifi: add support for SO-F device with PCI id 0x7AF0" * tag 'net-6.4-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (74 commits) revert "net: align SO_RCVMARK required privileges with SO_MARK" net: wwan: iosm: Convert single instance struct member to flexible array sch_netem: acquire qdisc lock in netem_change() selftests: forwarding: Fix race condition in mirror installation wifi: mac80211: report all unusable beacon frames mptcp: ensure listener is unhashed before updating the sk status mptcp: drop legacy code around RX EOF mptcp: consolidate fallback and non fallback state machine mptcp: fix possible list corruption on passive MPJ mptcp: fix possible divide by zero in recvmsg() mptcp: handle correctly disconnect() failures bpf: Force kprobe multi expected_attach_type for kprobe_multi 
link bpf/btf: Accept function names that contain dots Revert "net: phy: dp83867: perform soft reset and retain established link" net: mdio: fix the wrong parameters netfilter: nf_tables: Fix for deleting base chains with payload netfilter: nfnetlink_osf: fix module autoload netfilter: nf_tables: drop module reference after updating chain netfilter: nf_tables: disallow timeout for anonymous sets netfilter: nf_tables: disallow updates of anonymous sets ...
-
git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds authored
Pull kvm fixes from Paolo Bonzini: "ARM: - Correctly save/restore PMUSERNR_EL0 when host userspace is using PMU counters directly - Fix GICv2 emulation on GICv3 after the locking rework - Don't use smp_processor_id() in kvm_pmu_probe_armpmu(), and document why Generic: - Avoid setting page table entries pointing to a deleted memslot if a host page table entry is changed concurrently with the deletion" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: Avoid illegal stage2 mapping on invalid memory slot KVM: arm64: Use raw_smp_processor_id() in kvm_pmu_probe_armpmu() KVM: arm64: Restore GICv2-on-GICv3 functionality KVM: arm64: PMU: Don't overwrite PMUSERENR with vcpu loaded KVM: arm64: PMU: Restore the host's PMUSERENR_EL0
-
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Linus Torvalds authored
Pull powerpc fix from Michael Ellerman: - Disable IRQs when switching mm in exit_lazy_flush_tlb() called from exit_mmap() Thanks to Nicholas Piggin and Sachin Sant. * tag 'powerpc-6.4-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s/radix: Fix exit lazy tlb mm switch with irqs enabled
-
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Linus Torvalds authored
Pull pci fix from Bjorn Helgaas: - Transfer Intel LGM GW PCIe maintenance from Rahul Tanwar to Chuanhua Lei (Zhu YiXin) * tag 'pci-v6.4-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: MAINTAINERS: Add Chuanhua Lei as Intel LGM GW PCIe maintainer
-
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Linus Torvalds authored
Pull MMC fixes from Ulf Hansson: - Fix support for deferred probing for several host drivers - litex_mmc: Use async probe as it's common for all mmc hosts - meson-gx: Fix bug when scheduling while atomic - mmci_stm32: Fix max busy timeout calculation - sdhci-msm: Disable broken 64-bit DMA on MSM8916 * tag 'mmc-v6.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: usdhi60rol0: fix deferred probing mmc: sunxi: fix deferred probing mmc: sh_mmcif: fix deferred probing mmc: sdhci-spear: fix deferred probing mmc: sdhci-acpi: fix deferred probing mmc: owl: fix deferred probing mmc: omap_hsmmc: fix deferred probing mmc: omap: fix deferred probing mmc: mvsdio: fix deferred probing mmc: mtk-sd: fix deferred probing mmc: meson-gx: fix deferred probing mmc: bcm2835: fix deferred probing mmc: litex_mmc: set PROBE_PREFER_ASYNCHRONOUS mmc: meson-gx: remove redundant mmc_request_done() call from irq context mmc: mmci: stm32: fix max busy timeout calculation mmc: sdhci-msm: Disable broken 64-bit DMA on MSM8916
-
Linus Torvalds authored
Merge tag 'platform-drivers-x86-v6.4-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fix from Hans de Goede: "One small fix for an AMD PMF driver issue which is causing issues for users of just released AMD laptop models" * tag 'platform-drivers-x86-v6.4-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86/amd/pmf: Register notify handler only if SPS is enabled
-
git://git.kernel.dk/linux
Linus Torvalds authored
Pull io_uring fixes from Jens Axboe: "A fix for a race condition with poll removal and linked timeouts, and then a few followup fixes/tweaks for the msg_control patch from last week. Not super important, particularly the sparse fixup, as it was broken before that recent commit. But let's get it sorted for real for this release, rather than just have it broken a bit differently" * tag 'io_uring-6.4-2023-06-21' of git://git.kernel.dk/linux: io_uring/net: use the correct msghdr union member in io_sendmsg_copy_hdr io_uring/net: disable partial retries for recvmsg with cmsg io_uring/net: clear msg_controllen on partial sendmsg retry io_uring/poll: serialize poll linked timer start with poll removal
-
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Linus Torvalds authored
Pull cgroup fixes from Tejun Heo: "It's late but here are two bug fixes. Both fix problems which can be severe but are very confined in scope. The risk to most use cases should be minimal. - Fix for an old bug which triggers if a cgroup subsystem is remounted to a different hierarchy while someone is reading its cgroup.procs/tasks file. The risk is pretty low given how seldom cgroup subsystems are moved across hierarchies. - We moved cpus_read_lock() outside of cgroup internal locks a while ago but forgot to update the legacy_freezer leading to lockdep triggers. Fixed" * tag 'cgroup-for-6.4-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: Do not corrupt task iteration when rebinding subsystem cgroup,freezer: hold cpu_hotplug_lock before freezer_mutex in freezer_css_{online,offline}()
-
- 22 Jun, 2023 17 commits
-
-
Paolo Bonzini authored
Merge tag 'kvmarm-fixes-6.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 6.4, take #4 - Correctly save/restore PMUSERNR_EL0 when host userspace is using PMU counters directly - Fix GICv2 emulation on GICv3 after the locking rework - Don't use smp_processor_id() in kvm_pmu_probe_armpmu(), and document why...
-
Gavin Shan authored
We run into a guest hang in the edk2 firmware when KSM is kept running on the host. The edk2 firmware is waiting for status 0x80 from QEMU's pflash device (TYPE_PFLASH_CFI01) during a sector erase or buffered write operation. The status is returned by reading the memory region of the pflash device, and the read request should have been forwarded to QEMU and emulated by it. Unfortunately, the read request is covered by an illegal stage2 mapping when the guest hang occurs. The read request is completed with QEMU bypassed and the wrong status is fetched. The edk2 firmware runs into an infinite loop with the wrong status.

The illegal stage2 mapping is populated due to same page sharing by KSM at (C), even though the associated memory slot has been marked as invalid at (B) when the memory slot is requested to be deleted. It's notable that the active and inactive memory slots can't be swapped when we're in the middle of kvm_mmu_notifier_change_pte() because kvm->mn_active_invalidate_count is elevated, and kvm_swap_active_memslots() will busy loop until it reaches zero again. Besides, the swapping from the active to the inactive memory slots is also avoided by holding &kvm->srcu in __kvm_handle_hva_range(), corresponding to synchronize_srcu_expedited() in kvm_swap_active_memslots().

  CPU-A                                             CPU-B
  -----                                             -----
  ioctl(kvm_fd, KVM_SET_USER_MEMORY_REGION)
  kvm_vm_ioctl_set_memory_region
  kvm_set_memory_region
  __kvm_set_memory_region
  kvm_set_memslot(kvm, old, NULL, KVM_MR_DELETE)
  kvm_invalidate_memslot
  kvm_copy_memslot
  kvm_replace_memslot
  kvm_swap_active_memslots        (A)
  kvm_arch_flush_shadow_memslot   (B)
                                                    same page sharing by KSM
                                                    kvm_mmu_notifier_invalidate_range_start
                                                    :
                                                    kvm_mmu_notifier_change_pte
                                                    kvm_handle_hva_range
                                                    __kvm_handle_hva_range
                                                    kvm_set_spte_gfn            (C)
                                                    :
                                                    kvm_mmu_notifier_invalidate_range_end

Fix the issue by skipping the invalid memory slot at (C) to avoid the illegal stage2 mapping, so that the read request for the pflash's status is forwarded to QEMU and emulated by it. In this way, the correct pflash status can be returned from QEMU to break the infinite loop in the edk2 firmware.

We tried a git-bisect and the first problematic commit is cd4c7183 ("KVM: arm64: Convert to the gfn-based MMU notifier callbacks"). With this commit, clean_dcache_guest_page() is called after the memory slots are iterated in kvm_mmu_notifier_change_pte(). Before this commit, clean_dcache_guest_page() was called before the iteration over the memory slots. This change enlarges the racy window between kvm_mmu_notifier_change_pte() and memory slot removal so that we're able to reproduce the issue in a practical test case. However, the issue has existed since commit d5d8184d ("KVM: ARM: Memory virtualization setup").

Cc: stable@vger.kernel.org # v3.9+ Fixes: d5d8184d ("KVM: ARM: Memory virtualization setup") Reported-by: Shuai Hu <hshuai@redhat.com> Reported-by: Zhenyu Zhang <zhenyzha@redhat.com> Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Message-Id: <20230615054259.14911-1-gshan@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
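The idea of skipping an invalid memory slot at (C) can be sketched in a few lines of userspace C. This is a toy model only: `struct sim_memslot`, `SIM_MEMSLOT_INVALID` and `sim_change_pte()` are hypothetical names, and the real KVM notifier path is considerably more involved.

```c
#include <stdbool.h>
#include <stddef.h>

#define SIM_MEMSLOT_INVALID (1u << 0)  /* slot is being deleted */

/* Hypothetical stand-in for a memory slot. */
struct sim_memslot {
	unsigned long base_gfn;
	unsigned long npages;
	unsigned int flags;
	bool mapped;  /* whether a stage2-like mapping was installed */
};

/*
 * Handle a "PTE changed" notification for gfn. Slots marked invalid are
 * skipped so that no mapping is installed for a slot under deletion.
 * Returns the number of slots that received a mapping.
 */
static size_t sim_change_pte(struct sim_memslot *slots, size_t n,
			     unsigned long gfn)
{
	size_t i, installed = 0;

	for (i = 0; i < n; i++) {
		struct sim_memslot *s = &slots[i];

		if (gfn < s->base_gfn || gfn >= s->base_gfn + s->npages)
			continue;
		if (s->flags & SIM_MEMSLOT_INVALID)
			continue;  /* the skip that avoids the illegal mapping */
		s->mapped = true;
		installed++;
	}
	return installed;
}
```

In this model an access to a skipped slot stays unmapped, which corresponds to the read being forwarded to the emulator instead of being served from a stale mapping.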
-
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Paolo Abeni authored
Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net This is v3, including a crash fix for patch 01/14. The following patchset contains Netfilter/IPVS fixes for net: 1) Fix UDP segmentation with IPVS tunneled traffic, from Terin Stock. 2) Fix chain binding transaction logic, add a bound flag to rule transactions. Remove incorrect logic in nft_data_hold() and nft_data_release(). 3) Add a NFT_TRANS_PREPARE_ERROR deactivate state to deal with releasing the set/chain as a follow up to 1240eb93 ("netfilter: nf_tables: incorrect error path handling with NFT_MSG_NEWRULE") 4) Drop map element references from preparation phase instead of set destroy path, otherwise bogus EBUSY with transactions such as: flush chain ip x y delete chain ip x w where chain ip x y contains jump/goto from set elements. 5) Pipapo set type does not regard generation mask from the walk iteration. 6) Fix reference count underflow in set element reference to stateful object. 7) Several patches to tighten the nf_tables API: - disallow set element updates of bound anonymous set - disallow unbound anonymous set/chain at the end of transaction. - disallow updates of anonymous set. - disallow timeout configuration for anonymous sets. 8) Fix module reference leak in chain updates. 9) Fix nfnetlink_osf module autoload. 10) Fix deletion of basechain when NFTA_CHAIN_HOOK is specified as in iptables-nft. This Netfilter batch is larger than usual at this stage, I am aware we are fairly late in the -rc cycle, if you prefer to route them through net-next, please let me know. 
netfilter pull request 23-06-21 * tag 'nf-23-06-21' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nf_tables: Fix for deleting base chains with payload netfilter: nfnetlink_osf: fix module autoload netfilter: nf_tables: drop module reference after updating chain netfilter: nf_tables: disallow timeout for anonymous sets netfilter: nf_tables: disallow updates of anonymous sets netfilter: nf_tables: reject unbound chain set before commit phase netfilter: nf_tables: reject unbound anonymous set before commit phase netfilter: nf_tables: disallow element updates of bound anonymous sets netfilter: nf_tables: fix underflow in object reference counter netfilter: nft_set_pipapo: .walk does not deal with generations netfilter: nf_tables: drop map element references from preparation phase netfilter: nf_tables: add NFT_TRANS_PREPARE_ERROR to deal with bound set/chain netfilter: nf_tables: fix chain binding transaction logic ipvs: align inner_mac_header for encapsulation ==================== Link: https://lore.kernel.org/r/20230621100731.68068-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Maciej Żenczykowski authored
This reverts commit 1f86123b ("net: align SO_RCVMARK required privileges with SO_MARK") because the reasoning in the commit message is not really correct: SO_RCVMARK is used for 'reading' incoming skb mark (via cmsg); as such, it is more equivalent to 'getsockopt(SO_MARK)', which has no priv check and retrieves the socket mark, rather than 'setsockopt(SO_MARK)', which sets the socket mark and does require privs. Additionally, incoming skb->mark may already be visible if sysctl_fwmark_reflect and/or sysctl_tcp_fwmark_accept are enabled. Furthermore, it is easier to block the getsockopt via bpf (either cgroup setsockopt hook, or via syscall filters) than to unblock it if it requires CAP_NET_RAW/ADMIN. On Android the socket mark is (among other things) used to store the network identifier a socket is bound to. Setting it is privileged, but retrieving it is not. We'd like unprivileged userspace to be able to read the network id of incoming packets (where mark is set via iptables [to be moved to bpf])... An alternative would be to add another sysctl to control whether setting SO_RCVMARK is privileged or not (or even a MASK of which bits in the mark can be exposed). But this seems like over-engineering... Note: This is a non-trivial revert, due to later merged commit e42c7bee ("bpf: net: Consider has_current_bpf_ctx() when testing capable() in sk_setsockopt()") which changed both 'ns_capable' into 'sockopt_ns_capable' calls.
Fixes: 1f86123b ("net: align SO_RCVMARK required privileges with SO_MARK") Cc: Larysa Zaremba <larysa.zaremba@intel.com> Cc: Simon Horman <simon.horman@corigine.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Eyal Birger <eyal.birger@gmail.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Patrick Rohr <prohr@google.com> Signed-off-by: Maciej Żenczykowski <maze@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20230618103130.51628-1-maze@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Kees Cook authored
struct mux_adth actually ends with multiple struct mux_adth_dg members. This is seen both in the comments about the member:

    /**
     * struct mux_adth - Structure of the Aggregated Datagram Table Header.
     ...
     * @dg: datagramm table with variable length
     */

and in the preparation for populating it:

    adth_dg_size = offsetof(struct mux_adth, dg) +
                   ul_adb->dg_count[i] * sizeof(*dg);
    ...
    adth_dg_size -= offsetof(struct mux_adth, dg);
    memcpy(&adth->dg, ul_adb->dg[i], adth_dg_size);

This was reported as a run-time false positive warning:

    memcpy: detected field-spanning write (size 16) of single field "&adth->dg" at drivers/net/wwan/iosm/iosm_ipc_mux_codec.c:852 (size 8)

Adjust the struct mux_adth definition and associated sizeof() math; no binary output differences are observed in the resulting object file.

Reported-by: Florian Klink <flokli@flokli.de> Closes: https://lore.kernel.org/lkml/dbfa25f5-64c8-5574-4f5d-0151ba95d232@gmail.com/ Fixes: 1f52d7b6 ("net: wwan: iosm: Enable M.2 7360 WWAN card support") Cc: M Chetan Kumar <m.chetan.kumar@intel.com> Cc: Bagas Sanjaya <bagasdotme@gmail.com> Cc: Intel Corporation <linuxwwan@intel.com> Cc: Loic Poulain <loic.poulain@linaro.org> Cc: Sergey Ryazanov <ryazanov.s.a@gmail.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org> Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230620194234.never.023-kees@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
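The conversion described here follows the usual C flexible array member pattern, sketched below. The struct layout is invented for illustration: `struct sim_adth` and `struct sim_dg` are hypothetical and do not reproduce the fields of the real `struct mux_adth`; only the `offsetof()` plus count-times-element sizing mirrors the snippet quoted in the message.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical table entry, 8 bytes. */
struct sim_dg {
	uint32_t index;
	uint16_t length;
	uint16_t service_class;
};

/* Hypothetical header that ends in a variable number of entries. */
struct sim_adth {
	uint32_t signature;
	uint16_t table_length;
	uint8_t if_id;
	uint8_t opt;
	struct sim_dg dg[];  /* flexible array member instead of a single dg */
};

/* Size of a header followed by count entries. */
static size_t sim_adth_size(size_t count)
{
	return offsetof(struct sim_adth, dg) + count * sizeof(struct sim_dg);
}

/* Allocate a table and copy count entries into the flexible array. */
static struct sim_adth *sim_adth_build(const struct sim_dg *src, size_t count)
{
	struct sim_adth *adth = calloc(1, sim_adth_size(count));

	if (!adth)
		return NULL;
	adth->table_length = (uint16_t)sim_adth_size(count);
	/* The destination is the whole array, so the copy spans no field. */
	memcpy(adth->dg, src, count * sizeof(*src));
	return adth;
}
```

Because `dg` is declared as an array of unknown length, a bounds-checking `memcpy()` no longer sees a write of 16 bytes into an 8-byte field, while the allocation size stays the same.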
-
Eric Dumazet authored
syzbot managed to trigger a divide error [1] in netem. It could happen if q->rate changes while netem_enqueue() is running, since q->rate is read twice. It turns out netem_change() always lacked proper synchronization. [1] divide error: 0000 [#1] SMP KASAN CPU: 1 PID: 7867 Comm: syz-executor.1 Not tainted 6.1.30-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/25/2023 RIP: 0010:div64_u64 include/linux/math64.h:69 [inline] RIP: 0010:packet_time_ns net/sched/sch_netem.c:357 [inline] RIP: 0010:netem_enqueue+0x2067/0x36d0 net/sched/sch_netem.c:576 Code: 89 e2 48 69 da 00 ca 9a 3b 42 80 3c 28 00 4c 8b a4 24 88 00 00 00 74 0d 4c 89 e7 e8 c3 4f 3b fd 48 8b 4c 24 18 48 89 d8 31 d2 <49> f7 34 24 49 01 c7 4c 8b 64 24 48 4d 01 f7 4c 89 e3 48 c1 eb 03 RSP: 0018:ffffc9000dccea60 EFLAGS: 00010246 RAX: 000001a442624200 RBX: 000001a442624200 RCX: ffff888108a4f000 RDX: 0000000000000000 RSI: 000000000000070d RDI: 000000000000070d RBP: ffffc9000dcceb90 R08: ffffffff849c5e26 R09: fffffbfff10e1297 R10: 0000000000000000 R11: dffffc0000000001 R12: ffff888108a4f358 R13: dffffc0000000000 R14: 0000001a8cd9a7ec R15: 0000000000000000 FS: 00007fa73fe18700(0000) GS:ffff8881f6b00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fa73fdf7718 CR3: 000000011d36e000 CR4: 0000000000350ee0 Call Trace: <TASK> [<ffffffff84714385>] __dev_xmit_skb net/core/dev.c:3931 [inline] [<ffffffff84714385>] __dev_queue_xmit+0xcf5/0x3370 net/core/dev.c:4290 [<ffffffff84d22df2>] dev_queue_xmit include/linux/netdevice.h:3030 [inline] [<ffffffff84d22df2>] neigh_hh_output include/net/neighbour.h:531 [inline] [<ffffffff84d22df2>] neigh_output include/net/neighbour.h:545 [inline] [<ffffffff84d22df2>] ip_finish_output2+0xb92/0x10d0 net/ipv4/ip_output.c:235 [<ffffffff84d21e63>] __ip_finish_output+0xc3/0x2b0 [<ffffffff84d10a81>] ip_finish_output+0x31/0x2a0 net/ipv4/ip_output.c:323 [<ffffffff84d10f14>] NF_HOOK_COND 
include/linux/netfilter.h:298 [inline] [<ffffffff84d10f14>] ip_output+0x224/0x2a0 net/ipv4/ip_output.c:437 [<ffffffff84d123b5>] dst_output include/net/dst.h:444 [inline] [<ffffffff84d123b5>] ip_local_out net/ipv4/ip_output.c:127 [inline] [<ffffffff84d123b5>] __ip_queue_xmit+0x1425/0x2000 net/ipv4/ip_output.c:542 [<ffffffff84d12fdc>] ip_queue_xmit+0x4c/0x70 net/ipv4/ip_output.c:556 Fixes: 1da177e4 ("Linux-2.6.12-rc2") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Jiri Pirko <jiri@resnulli.us> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230620184425.1179809-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
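The hazard of reading a changing rate twice, once for the zero check and once as the divisor, can be illustrated in isolation. Note that this sketch shows a different technique from the patch: judging by its title elsewhere in this log ("sch_netem: acquire qdisc lock in netem_change()"), the actual fix adds locking, whereas the code below merely takes a single snapshot of the value. All `sim_*` names are hypothetical.

```c
#include <stdint.h>

#define SIM_NSEC_PER_SEC 1000000000ULL

/* Hypothetical config that another thread may rewrite at any time. */
struct sim_netem {
	volatile uint64_t rate;  /* bytes per second, 0 means "no rate limit" */
};

/*
 * Time needed to send len bytes at the configured rate, in nanoseconds.
 * The rate is read exactly once, so the value that passed the zero check
 * is the same value used as the divisor.
 */
static uint64_t sim_packet_time_ns(const struct sim_netem *q, uint64_t len)
{
	uint64_t rate = q->rate;  /* single snapshot */

	if (rate == 0)
		return 0;
	return len * SIM_NSEC_PER_SEC / rate;
}
```

A snapshot like this only protects a single field; keeping several related parameters consistent while they are being changed still needs proper synchronization, which is what the commit message says was missing.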
-
Oliver Hartkopp authored
Commit d674a8f1 ("can: isotp: isotp_sendmsg(): fix return error on FC timeout on TX path") introduced the previously missing return value for the protocol error case. But the way the error value is read and passed to user space does not follow the common scheme of clearing the error after reading it, which is what the sock_error() function provides. As a result, the following write() attempt reports an error although everything should be working. Fixes: d674a8f1 ("can: isotp: isotp_sendmsg(): fix return error on FC timeout on TX path") Reported-by: Carsten Schmidt <carsten.schmidt-achim@t-online.de> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://lore.kernel.org/all/20230607072708.38809-1-socketcan@hartkopp.net Cc: stable@vger.kernel.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
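The difference between peeking at a pending error and reading it with a clear can be sketched in userspace as follows. The struct and both helpers are hypothetical; the second helper only mirrors the read-and-clear behaviour that the message attributes to sock_error().

```c
#include <stdatomic.h>

/* Hypothetical socket with a pending (positive) error code. */
struct demo_sock_err {
    _Atomic int err;
};

/* Peek: the error stays set, so the next caller sees it again. */
int demo_error_sticky(struct demo_sock_err *sk)
{
    return -atomic_load(&sk->err);
}

/* Read and clear in one atomic step: reported once, then gone. */
int demo_error_once(struct demo_sock_err *sk)
{
    return -atomic_exchange(&sk->err, 0);
}
```

With the sticky variant a later, otherwise successful write() would still return the stale error, which matches the symptom described above.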
-
Shyam Sundar S K authored
The power source notify handler gets registered even when none of the PMF features is enabled, leading to a crash. ... [ 22.592162] Call Trace: [ 22.592164] <TASK> [ 22.592164] ? rcu_note_context_switch+0x5e0/0x660 [ 22.592166] ? __warn+0x81/0x130 [ 22.592171] ? rcu_note_context_switch+0x5e0/0x660 [ 22.592172] ? report_bug+0x171/0x1a0 [ 22.592175] ? prb_read_valid+0x1b/0x30 [ 22.592177] ? handle_bug+0x3c/0x80 [ 22.592178] ? exc_invalid_op+0x17/0x70 [ 22.592179] ? asm_exc_invalid_op+0x1a/0x20 [ 22.592182] ? rcu_note_context_switch+0x5e0/0x660 [ 22.592183] ? acpi_ut_delete_object_desc+0x86/0xb0 [ 22.592186] ? acpi_ut_update_ref_count.part.0+0x22d/0x930 [ 22.592187] __schedule+0xc0/0x1410 [ 22.592189] ? ktime_get+0x3c/0xa0 [ 22.592191] ? lapic_next_event+0x1d/0x30 [ 22.592193] ? hrtimer_start_range_ns+0x25b/0x350 [ 22.592196] schedule+0x5e/0xd0 [ 22.592197] schedule_hrtimeout_range_clock+0xbe/0x140 [ 22.592199] ? __pfx_hrtimer_wakeup+0x10/0x10 [ 22.592200] usleep_range_state+0x64/0x90 [ 22.592203] amd_pmf_send_cmd+0x106/0x2a0 [amd_pmf bddfe0fe3712aaa99acce3d5487405c5213c6616] [ 22.592207] amd_pmf_update_slider+0x56/0x1b0 [amd_pmf bddfe0fe3712aaa99acce3d5487405c5213c6616] [ 22.592210] amd_pmf_set_sps_power_limits+0x72/0x80 [amd_pmf bddfe0fe3712aaa99acce3d5487405c5213c6616] [ 22.592213] amd_pmf_pwr_src_notify_call+0x49/0x90 [amd_pmf bddfe0fe3712aaa99acce3d5487405c5213c6616] [ 22.592216] notifier_call_chain+0x5a/0xd0 [ 22.592218] atomic_notifier_call_chain+0x32/0x50 ... Fix this by registering the power source change notify handler only when SPS (Static Slider) is advertised as supported. 
Reported-by: Allen Zhong <allen@atr.me> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217571 Fixes: 4c71ae41 ("platform/x86/amd/pmf: Add support SPS PMF feature") Tested-by: Patil Rajesh Reddy <Patil.Reddy@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Link: https://lore.kernel.org/r/20230622060309.310001-1-Shyam-sundar.S-k@amd.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
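The shape of this fix (install a callback only when the feature it depends on is advertised) can be sketched in plain C. Everything below is hypothetical and unrelated to the real amd_pmf data structures; it only illustrates gating the registration on a feature bit.

```c
#include <stddef.h>

#define DEMO_FEAT_SPS (1u << 0) /* hypothetical "static slider" feature bit */

struct demo_pm_dev {
    unsigned int supported;                       /* advertised feature bits */
    int notify_count;                             /* times the handler ran   */
    void (*pwr_src_notify)(struct demo_pm_dev *); /* NULL when unregistered  */
};

/* Handler that relies on SPS state having been set up. */
void demo_on_pwr_src_change(struct demo_pm_dev *dev)
{
    dev->notify_count++;
}

void demo_pm_dev_init(struct demo_pm_dev *dev, unsigned int supported)
{
    dev->supported = supported;
    dev->notify_count = 0;
    dev->pwr_src_notify = NULL;

    /* Register the handler only if the feature it serves exists. */
    if (supported & DEMO_FEAT_SPS)
        dev->pwr_src_notify = demo_on_pwr_src_change;
}

/* Event source: calls the handler only when one was registered. */
void demo_pm_dev_power_event(struct demo_pm_dev *dev)
{
    if (dev->pwr_src_notify)
        dev->pwr_src_notify(dev);
}
```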
-
Danielle Ratson authored
When mirroring to a gretap in hardware the device expects to be programmed with the egress port and all the encapsulating headers. This requires the driver to resolve the path the packet will take in the software data path and program the device accordingly. If the path cannot be resolved (in this case because of an unresolved neighbor), then mirror installation fails until the path is resolved. This results in a race that causes the test to sometimes fail. Fix this by setting the neighbor's state to permanent in a couple of tests, so that it is always valid. Fixes: 35c31d5c ("selftests: forwarding: Test mirror-to-gretap w/ UL 802.1d") Fixes: 239e754a ("selftests: forwarding: Test mirror-to-gretap w/ UL 802.1q") Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/268816ac729cb6028c7a34d4dda6f4ec7af55333.1687264607.git.petrm@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-
Benjamin Berg authored
Properly check for RX_DROP_UNUSABLE now that the new drop reason infrastructure is used. Without this change, the comparison will always be false as a more specific reason is given in the lower bits of result. Fixes: baa951a1 ("mac80211: use the new drop reasons infrastructure") Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20230621120543.412920-2-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
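Why an equality test stops working once a specific reason is packed into the lower bits can be shown with a small sketch. The bit layout, the mask and every name here are assumptions made for illustration; they are not the mac80211 definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: drop class in the upper 16 bits, reason below. */
#define DEMO_DROP_CLASS_SHIFT 16
#define DEMO_DROP_CLASS_MASK  0xffff0000u
#define DEMO_DROP_UNUSABLE    (2u << DEMO_DROP_CLASS_SHIFT)

/* Combine a drop class with a specific reason code. */
uint32_t demo_make_drop(uint32_t cls, uint32_t reason)
{
    return cls | (reason & 0xffffu);
}

/* Broken check: false whenever any specific reason bit is set. */
bool demo_is_unusable_exact(uint32_t res)
{
    return res == DEMO_DROP_UNUSABLE;
}

/* Working check: compare only the class bits. */
bool demo_is_unusable_masked(uint32_t res)
{
    return (res & DEMO_DROP_CLASS_MASK) == DEMO_DROP_UNUSABLE;
}
```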
-
Jakub Kicinski authored
Matthieu Baerts says: ==================== mptcp: fixes for 6.4 Patch 1 correctly handles disconnect() failures that can happen in some specific cases: now the socket state is set as unconnected as expected. That fixes an issue introduced in v6.2. Patch 2 fixes a divide by zero bug in mptcp_recvmsg() with a fix similar to a recent one from Eric Dumazet for TCP introducing sk_wait_pending flag. It should address an issue present in MPTCP from almost the beginning, from v5.9. Patch 3 fixes a possible list corruption on passive MPJ even if the race seems very unlikely, better be safe than sorry. The possible issue is present from v5.17. Patch 4 consolidates fallback and non fallback state machines to avoid leaking some MPTCP sockets. The fix is likely needed for versions from v5.11. Patch 5 drops code that is no longer used after the introduction of patch 4/6. This is not really a fix but this patch can probably land in the -net tree as well not to leave unused code. Patch 6 ensures listeners are unhashed before updating their sk status to avoid possible deadlocks when diag info are going to be retrieved with a lock. Even if it should not be visible with the way we are currently getting diag info, the issue is present from v5.17. ==================== Link: https://lore.kernel.org/r/20230620-upstream-net-20230620-misc-fixes-for-v6-4-v1-0-f36aa5eae8b9@tessares.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Paolo Abeni authored
The MPTCP protocol accesses the listener subflow in a lockless manner in a couple of places (poll, diag). That works only if the msk itself leaves the listener status only after the subflow itself has been closed/disconnected. Otherwise we risk deadlock in diag, as reported by Christoph. Address the issue by ensuring that the first subflow (the listener one) is always disconnected before updating the msk socket status. Reported-by: Christoph Paasch <cpaasch@apple.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/407 Fixes: b29fcfb5 ("mptcp: full disconnect implementation") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Paolo Abeni authored
Thanks to the previous patch -- "mptcp: consolidate fallback and non fallback state machine" -- we can finally drop the "temporary hack" used to detect rx eof. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Paolo Abeni authored
An orphaned msk releases the used resources via the worker, when the latter first sees the msk in CLOSED status. If the msk status transitions to TCP_CLOSE in the release callback invoked by the worker's final release_sock(), such an instance of the workqueue will not take any action. Additionally the MPTCP code prevents scheduling the worker once the socket reaches the CLOSE status: such msk resources will be leaked. The only code path that can trigger the above scenario is the __mptcp_check_send_data_fin() in fallback mode. Address the issue by removing the special handling of fallback sockets in __mptcp_check_send_data_fin(), consolidating the state machine for fallback and non fallback sockets. Since non-fallback sockets do not send and do not receive data_fin, the mptcp code can update the msk internal status to match the next step in the SM every time data fin (ack) should be generated or received. As a consequence we can remove a bunch of checks for fallback from the fastpath. Fixes: 6e628cd3 ("mptcp: use mptcp release_cb for delayed tasks") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Paolo Abeni authored
At passive MPJ time, if the msk socket lock is held by the user, the new subflow is appended to the msk->join_list under the msk data lock. In mptcp_release_cb()/__mptcp_flush_join_list(), the subflows in that list are moved from the join_list into the conn_list under the msk socket lock. Append and removal could race, possibly corrupting such list. Address the issue splicing the join list into a temporary one while still under the msk data lock. Found by code inspection, the race itself should be almost impossible to trigger in practice. Fixes: 3e501490 ("mptcp: cleanup MPJ subflow list handling") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
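The splice pattern named in this message can be sketched with a mutex and two singly linked lists. This is a simplified userspace illustration under assumed names (the kernel uses its own list and locking primitives): the shared list is detached in one step while the data lock is held, and the detached nodes are then processed without racing against concurrent appends.

```c
#include <pthread.h>
#include <stddef.h>

struct demo_node {
    struct demo_node *next;
    int id;
};

/* Hypothetical owner: producers append to join_head under data_lock. */
struct demo_owner {
    pthread_mutex_t data_lock;
    struct demo_node *join_head; /* shared, protected by data_lock       */
    struct demo_node *conn_head; /* private to the thread doing the flush */
};

void demo_join_add(struct demo_owner *o, struct demo_node *n)
{
    pthread_mutex_lock(&o->data_lock);
    n->next = o->join_head;
    o->join_head = n;
    pthread_mutex_unlock(&o->data_lock);
}

/* Returns the number of nodes moved from the join list to conn_head. */
int demo_flush_join_list(struct demo_owner *o)
{
    struct demo_node *tmp;
    int moved = 0;

    /* Splice: take the whole list while still under the data lock. */
    pthread_mutex_lock(&o->data_lock);
    tmp = o->join_head;
    o->join_head = NULL;
    pthread_mutex_unlock(&o->data_lock);

    /* The temporary list is private now; no producer can touch it. */
    while (tmp) {
        struct demo_node *next = tmp->next;

        tmp->next = o->conn_head;
        o->conn_head = tmp;
        tmp = next;
        moved++;
    }
    return moved;
}
```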
-
Paolo Abeni authored
Christoph reported a divide by zero bug in mptcp_recvmsg(): divide error: 0000 [#1] PREEMPT SMP CPU: 1 PID: 19978 Comm: syz-executor.6 Not tainted 6.4.0-rc2-gffcc7899081b #20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 RIP: 0010:__tcp_select_window+0x30e/0x420 net/ipv4/tcp_output.c:3018 Code: 11 ff 0f b7 cd c1 e9 0c b8 ff ff ff ff d3 e0 89 c1 f7 d1 01 cb 21 c3 eb 17 e8 2e 83 11 ff 31 db eb 0e e8 25 83 11 ff 89 d8 99 <f7> 7c 24 04 29 d3 65 48 8b 04 25 28 00 00 00 48 3b 44 24 10 75 60 RSP: 0018:ffffc90000a07a18 EFLAGS: 00010246 RAX: 000000000000ffd7 RBX: 000000000000ffd7 RCX: 0000000000040000 RDX: 0000000000000000 RSI: 000000000003ffff RDI: 0000000000040000 RBP: 000000000000ffd7 R08: ffffffff820cf297 R09: 0000000000000001 R10: 0000000000000000 R11: ffffffff8103d1a0 R12: 0000000000003f00 R13: 0000000000300000 R14: ffff888101cf3540 R15: 0000000000180000 FS: 00007f9af4c09640(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b33824000 CR3: 000000012f241001 CR4: 0000000000170ee0 Call Trace: <TASK> __tcp_cleanup_rbuf+0x138/0x1d0 net/ipv4/tcp.c:1611 mptcp_recvmsg+0xcb8/0xdd0 net/mptcp/protocol.c:2034 inet_recvmsg+0x127/0x1f0 net/ipv4/af_inet.c:861 ____sys_recvmsg+0x269/0x2b0 net/socket.c:1019 ___sys_recvmsg+0xe6/0x260 net/socket.c:2764 do_recvmmsg+0x1a5/0x470 net/socket.c:2858 __do_sys_recvmmsg net/socket.c:2937 [inline] __se_sys_recvmmsg net/socket.c:2953 [inline] __x64_sys_recvmmsg+0xa6/0x130 net/socket.c:2953 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x47/0xa0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x72/0xdc RIP: 0033:0x7f9af58fc6a9 Code: 5c c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 4f 37 0d 00 f7 d8 64 89 01 48 RSP: 002b:00007f9af4c08cd8 EFLAGS: 00000246 ORIG_RAX: 000000000000012b RAX: ffffffffffffffda RBX: 
00000000006bc050 RCX: 00007f9af58fc6a9 RDX: 0000000000000001 RSI: 0000000020000140 RDI: 0000000000000004 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000f00 R11: 0000000000000246 R12: 00000000006bc05c R13: fffffffffffffea8 R14: 00000000006bc050 R15: 000000000001fe40 </TASK> mptcp_recvmsg is allowed to release the msk socket lock when blocking, and before re-acquiring it another thread could have switched the sock to TCP_LISTEN status - with a prior connect(AF_UNSPEC) - also clearing icsk_ack.rcv_mss. Address the issue by preventing the disconnect if some other process is concurrently performing a blocking syscall on the same socket, like commit 4faeee0c ("tcp: deny tcp_disconnect() when threads are waiting"). Fixes: a6b118fe ("mptcp: add receive buffer auto-tuning") Cc: stable@vger.kernel.org Reported-by: Christoph Paasch <cpaasch@apple.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/404 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Christoph Paasch <cpaasch@apple.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
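The approach of refusing a disconnect while another thread is blocked on the same socket can be sketched as a waiter counter checked before any state is reset. The struct, field and function names are hypothetical, the caller is assumed to hold the socket lock, and the choice of -EBUSY as the error code is an assumption of this sketch.

```c
#include <errno.h>

enum { DEMO_ESTABLISHED = 1, DEMO_CLOSED = 2 };

/* Hypothetical socket; all fields are protected by a lock held by callers. */
struct demo_sock {
    int state;
    int rcv_mss;      /* later used as a divisor by the receive path */
    int wait_pending; /* number of threads blocked in a syscall      */
};

/* Called just before a blocking wait drops the lock. */
void demo_wait_begin(struct demo_sock *sk) { sk->wait_pending++; }

/* Called after the wait re-acquires the lock. */
void demo_wait_end(struct demo_sock *sk) { sk->wait_pending--; }

int demo_disconnect(struct demo_sock *sk)
{
    /* A sleeper would wake up to a reset socket: refuse instead. */
    if (sk->wait_pending)
        return -EBUSY;

    sk->state = DEMO_CLOSED;
    sk->rcv_mss = 0;
    return 0;
}
```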
-
Paolo Abeni authored
Currently the mptcp code assumes that disconnect() can fail only at mptcp_sendmsg_fastopen() time - to avoid a deadlock scenario - and doesn't even bother returning an error code. Soon mptcp_disconnect() will handle more error conditions: let's track them explicitly. As a bonus, explicitly annotate TCP-level disconnect as not failing: the mptcp code never blocks for events on the subflows. Fixes: 7d803344 ("mptcp: fix deadlock in fastopen error path") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Tested-by: Christoph Paasch <cpaasch@apple.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 21 Jun, 2023 7 commits
-
-
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Jakub Kicinski authored
Daniel Borkmann says: ==================== pull-request: bpf 2023-06-21 We've added 7 non-merge commits during the last 14 day(s) which contain a total of 7 files changed, 181 insertions(+), 15 deletions(-). The main changes are: 1) Fix a verifier id tracking issue with scalars upon spill, from Maxim Mikityanskiy. 2) Fix NULL dereference if an exception is generated while a BPF subprogram is running, from Krister Johansen. 3) Fix a BTF verification failure when compiling kernel with LLVM_IAS=0, from Florent Revest. 4) Fix expected_attach_type enforcement for kprobe_multi link, from Jiri Olsa. 5) Fix a bpf_jit_dump issue for x86_64 to pick the correct JITed image, from Yonghong Song. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf: Force kprobe multi expected_attach_type for kprobe_multi link bpf/btf: Accept function names that contain dots selftests/bpf: add a test for subprogram extables bpf: ensure main program has an extable bpf: Fix a bpf_jit_dump issue for x86_64 with sysctl bpf_jit_enable. selftests/bpf: Add test cases to assert proper ID tracking on spill bpf: Fix verifier id tracking of scalars on spill ==================== Link: https://lore.kernel.org/r/20230621101116.16122-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds authored
Pull timer fix from Thomas Gleixner: "A single regression fix for a regression fix: For a long time the tick was aligned to clock MONOTONIC so that the tick event happened at a multiple of nanoseconds per tick starting from clock MONOTONIC = 0. At some point this changed as the refined jiffies clocksource which is used during boot before the TSC or other clocksources becomes usable, was adjusted with a boot offset, so that time 0 is closer to the point where the kernel starts. This broke the assumption in the tick code that when the tick setup happens early on ktime_get() will return a multiple of nanoseconds per tick. As a consequence applications which aligned their periodic execution so that it does not collide with the tick were no longer guaranteed that the tick period starts from time 0. The fix for this regression was to realign the tick when it is initially set up to a multiple of tick periods. That works as long as the underlying tick device supports periodic mode, but breaks under certain conditions when the tick device supports only one shot mode. Depending on the offset, the alignment delta to clock MONOTONIC can get in a range where the minimal programming delta of the underlying clock event device is larger than the calculated delta to the next tick. This results in a boot hang as the tick code tries to play catch up, but as the tick never fires jiffies are not advanced so it keeps trying forever. Solve this by moving the tick alignment into the NOHZ / HIGHRES enablement code because at that point it is guaranteed that the underlying clocksource is high resolution capable and no longer depending on the tick. 
This is far before user space starts, so at the point where applications try to align their timers, the old behaviour of the tick happening at a multiple of nanoseconds per tick starting from clock MONOTONIC = 0 is restored" * tag 'timers-urgent-2023-06-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tick/common: Align tick period during sched_timer setup
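The alignment this message talks about (moving a timestamp up to the next multiple of the tick period counted from time 0) comes down to a remainder calculation. The sketch below is a userspace illustration with a hypothetical function name and an assumed 250 Hz period; it is not the kernel code.

```c
#include <stdint.h>

#define DEMO_TICK_NSEC 4000000ULL /* assumed period: 250 ticks per second */

/*
 * Round now_ns up to the next multiple of period_ns. A value that is
 * already aligned is returned unchanged. period_ns must be non-zero.
 */
uint64_t demo_align_to_tick(uint64_t now_ns, uint64_t period_ns)
{
    uint64_t rem = now_ns % period_ns;

    return rem ? now_ns + (period_ns - rem) : now_ns;
}
```

The hang described above did not come from this arithmetic but from where it was applied: early in boot the resulting delta could be smaller than what a one shot clock event device is able to program.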
-
git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Linus Torvalds authored
Pull virtio fix from Michael Tsirkin: "A last minute revert to fix a regression" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: Revert "virtio-blk: support completion batching for the IRQ path"
-
Linus Torvalds authored
This reverts commit e7b813b3 (and the subsequent fix for it: 41a15855 "efi: random: fix NULL-deref when refreshing seed"). It turns out to cause non-deterministic boot stalls on at least a HP 6730b laptop. Reported-and-bisected-by: Sami Korkalainen <sami.korkalainen@proton.me> Link: https://lore.kernel.org/all/GQUnKz2al3yke5mB2i1kp3SzNHjK8vi6KJEh7rnLrOQ24OrlljeCyeWveLW9pICEmB9Qc8PKdNt3w1t_g3-Uvxq1l8Wj67PpoMeWDoH8PKk=@proton.me/ Cc: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Bagas Sanjaya <bagasdotme@gmail.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Linus Torvalds authored
Pull spi fix from Mark Brown: "One last fix for SPI, just a simple fix for incorrect handling of probe deferral for DMA in the Qualcomm GENI driver" * tag 'spi-fix-v6.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: spi-geni-qcom: correctly handle -EPROBE_DEFER from dma_request_chan()
-
Linus Torvalds authored
Merge tag 'regulator-fix-v6.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fix from Mark Brown: "One simple fix for v6.4, some incorrectly specified bitfield masks in the PCA9450 driver" * tag 'regulator-fix-v6.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: pca9450: Fix LDO3OUT and LDO4OUT MASK
-
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Linus Torvalds authored
Pull regmap fix from Mark Brown: "One more fix for v6.4. The earlier fix to take account of the register data size when limiting raw register writes exposed the fact that the Intel AVMM bus was incorrectly specifying too low a limit on the maximum data transfer: it is only capable of transmitting one register, so it had set a transfer size limit that couldn't fit both the value and the register address into a single message" * tag 'regmap-fix-v6.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: regmap: spi-avmm: Fix regmap_bus max_raw_write
-