- 05 Mar, 2022 6 commits
-
-
Chengming Zhou authored
The error message when I build vm tests on debian10 (GLIBC 2.28): userfaultfd.c: In function `userfaultfd_pagemap_test': userfaultfd.c:1393:37: error: `MADV_PAGEOUT' undeclared (first use in this function); did you mean `MADV_RANDOM'? if (madvise(area_dst, test_pgsize, MADV_PAGEOUT)) ^~~~~~~~~~~~ MADV_RANDOM This patch includes these newer definitions from UAPI linux/mman.h, which is useful for fixing the tests build on systems without these definitions in glibc sys/mman.h. Link: https://lkml.kernel.org/r/20220227055330.43087-2-zhouchengming@bytedance.com Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
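The patch itself pulls the definitions in from the UAPI header. As a rough alternative sketch (not what the patch does), a test could also carry guarded fallback definitions. The values below are the ones from the generic UAPI header `asm-generic/mman-common.h`, and `pageout_range()` is a hypothetical helper used only for illustration.

```c
#include <stddef.h>
#include <sys/mman.h>

/*
 * Fallback values as found in the generic UAPI header
 * (asm-generic/mman-common.h), for a libc whose <sys/mman.h>
 * predates them, e.g. glibc 2.28.
 */
#ifndef MADV_COLD
#define MADV_COLD	20
#endif
#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT	21
#endif

/*
 * Hypothetical helper: ask the kernel to reclaim the given range.
 * Returns 0 on success, -1 with errno set otherwise (for example
 * EINVAL on kernels that do not know MADV_PAGEOUT).
 */
int pageout_range(void *addr, size_t len)
{
	return madvise(addr, len, MADV_PAGEOUT);
}
```

With the guards in place the build no longer depends on the libc version, while a newer libc still wins when it provides its own definitions.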
-
Hugh Dickins authored
Wangyong reports: after enabling tmpfs filesystem to support transparent hugepage with the following command: echo always > /sys/kernel/mm/transparent_hugepage/shmem_enabled the docker program tries to add F_SEAL_WRITE through the following command, but it fails unexpectedly with errno EBUSY: fcntl(5, F_ADD_SEALS, F_SEAL_WRITE) = -1. That is because memfd_tag_pins() and memfd_wait_for_pins() were never updated for shmem huge pages: checking page_mapcount() against page_count() is hopeless on THP subpages - they need to check total_mapcount() against page_count() on THP heads only. Make memfd_tag_pins() (compared > 1) as strict as memfd_wait_for_pins() (compared != 1): either can be justified, but given the non-atomic total_mapcount() calculation, it is better now to be strict. Bear in mind that total_mapcount() itself scans all of the THP subpages, when choosing to take an XA_CHECK_SCHED latency break. Also fix the unlikely xa_is_value() case in memfd_wait_for_pins(): if a page has been swapped out since memfd_tag_pins(), then its refcount must have fallen, and so it can safely be untagged. Link: https://lkml.kernel.org/r/a4f79248-df75-2c8c-3df-ba3317ccb5da@google.com Signed-off-by: Hugh Dickins <hughd@google.com> Reported-by: Zeal Robot <zealci@zte.com.cn> Reported-by: wangyong <wang.yong12@zte.com.cn> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: CGEL ZTE <cgel.zte@gmail.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Song Liu <songliubraving@fb.com> Cc: Yang Yang <yang.yang29@zte.com.cn> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
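The strict comparison described above can be modelled in userspace as follows. This is a simplified sketch, not the kernel code: `model_page_is_pinned()` and its parameters are hypothetical, and it assumes the page cache holds one reference for a small page and one per subpage for a THP head (512 for a 2 MiB huge page made of 4 KiB pages).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Simplified model of the pin check. A page counts as pinned when it
 * holds references beyond those explained by its mappings and by the
 * page cache itself.
 *
 * page_count     - total references on the (head) page
 * total_mapcount - mappings summed over all subpages
 * cache_refs     - references held by the page cache (assumed: 1 for a
 *                  small page, number of subpages for a THP head)
 */
bool model_page_is_pinned(long page_count, long total_mapcount,
			  long cache_refs)
{
	assert(cache_refs >= 1);
	/* Strict "!=" comparison, as in memfd_wait_for_pins(). */
	return page_count - total_mapcount != cache_refs;
}
```

In this model a small page mapped three times with a count of 4 is not pinned, while a THP head with an extra, unexplained reference is.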
-
Suren Baghdasaryan authored
When adjacent vmas are being merged it can result in the vma that was originally passed to madvise_update_vma being destroyed. In the current implementation, the name parameter passed to madvise_update_vma points directly to vma->anon_name and it is used after the call to vma_merge. In the cases when vma_merge merges the original vma and destroys it, this might result in UAF. For that the original vma would have to hold the anon_vma_name with the last reference. The following vma would need to contain a different anon_vma_name object with the same string. Such a scenario is shown below: madvise_vma_behavior(vma) madvise_update_vma(vma, ..., anon_name == vma->anon_name) vma_merge(vma) __vma_adjust(vma) <-- merges vma with adjacent one vm_area_free(vma) <-- frees the original vma replace_vma_anon_name(anon_name) <-- UAF of vma->anon_name Fix this by raising the name refcount and stabilizing it. Link: https://lkml.kernel.org/r/20220224231834.1481408-3-surenb@google.com Link: https://lkml.kernel.org/r/20220223153613.835563-3-surenb@google.com Fixes: 9a10064f ("mm: add a field to store names for private anonymous memory") Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reported-by: syzbot+aa7b3d4b35f9dc46a366@syzkaller.appspotmail.com Acked-by: Michal Hocko <mhocko@suse.com> Cc: Alexey Gladkov <legion@kernel.org> Cc: Chris Hyser <chris.hyser@oracle.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Colin Cross <ccross@google.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kees Cook <keescook@chromium.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Sasha Levin <sashal@kernel.org> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xiaofeng Cao <caoxiaofeng@yulong.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
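The idea of "raising the name refcount and stabilizing it" can be illustrated with a small userspace model. Everything here is hypothetical (`struct model_name`, `model_name_get()`, `model_name_put()` and so on); it only sketches why taking a temporary reference before the merge keeps the name valid afterwards, and is not the kernel implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical refcounted name object, standing in for anon_vma_name. */
struct model_name {
	int refs;
	char name[32];
};

struct model_name *model_name_alloc(const char *s)
{
	struct model_name *n = calloc(1, sizeof(*n));

	if (!n)
		return NULL;
	n->refs = 1;
	strncpy(n->name, s, sizeof(n->name) - 1);
	return n;
}

void model_name_get(struct model_name *n)
{
	n->refs++;
}

/* Drops one reference; returns 1 if the object was freed. */
int model_name_put(struct model_name *n)
{
	assert(n->refs > 0);
	if (--n->refs == 0) {
		free(n);
		return 1;
	}
	return 0;
}

/*
 * Models the fixed flow: pin the name, let the "merge" drop the vma's
 * own reference, use the name, then unpin it. Returns 1 if the name
 * stayed alive across the merge and was freed only at the end.
 */
int model_update_with_stable_name(struct model_name *vma_name)
{
	int freed_by_merge, freed_at_end;
	size_t len;

	assert(vma_name != NULL);
	model_name_get(vma_name);			/* stabilize */
	freed_by_merge = model_name_put(vma_name);	/* vma destroyed */
	len = strlen(vma_name->name);			/* still valid here */
	freed_at_end = model_name_put(vma_name);	/* drop temporary ref */
	return !freed_by_merge && freed_at_end && len > 0;
}
```

Without the temporary `model_name_get()`, the put performed by the merge would free the object and the later `strlen()` would be the use-after-free from the scenario above.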
-
Suren Baghdasaryan authored
A deep process chain with many vmas could grow really high. With default sysctl_max_map_count (64k) and default pid_max (32k) the max number of vmas in the system is 2147450880 and the refcounter has headroom of 1073774592 before it reaches REFCOUNT_SATURATED (3221225472). Therefore it's unlikely that an anonymous name refcounter will overflow with these defaults. Currently the max for pid_max is PID_MAX_LIMIT (4194304) and for sysctl_max_map_count it's INT_MAX (2147483647). In this configuration anon_vma_name refcount overflow becomes theoretically possible (that would still require heavy sharing of that anon_vma_name between processes). The kref refcounting interface used in the anon_vma_name structure will detect a counter overflow when it reaches the REFCOUNT_SATURATED value but will only generate a warning and freeze the ref counter. This would lead to the refcounted object never being freed. A determined attacker could leak memory like that but it would be a rather expensive and inefficient way to do so. To ensure anon_vma_name refcount does not overflow, stop anon_vma_name sharing when the refcount reaches REFCOUNT_MAX (2147483647), which still leaves INT_MAX/2 (1073741823) values before the counter reaches REFCOUNT_SATURATED. This should provide enough headroom for raising the refcounts temporarily. Link: https://lkml.kernel.org/r/20220223153613.835563-2-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Suggested-by: Michal Hocko <mhocko@suse.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Alexey Gladkov <legion@kernel.org> Cc: Chris Hyser <chris.hyser@oracle.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Colin Cross <ccross@google.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kees Cook <keescook@chromium.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Peter Collingbourne <pcc@google.com> Cc: Sasha Levin <sashal@kernel.org> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xiaofeng Cao <caoxiaofeng@yulong.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
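The arithmetic in this message can be checked with a few lines of C. The macro and function names are hypothetical; the numbers are the ones quoted above. Note that the quoted total of 2147450880 corresponds to 65535 × 32768, and that the gap between REFCOUNT_MAX and REFCOUNT_SATURATED works out to 1073741825, i.e. roughly the INT_MAX/2 figure given in the text.

```c
#include <assert.h>
#include <stdint.h>

/* Figures quoted in the commit message; the names are hypothetical. */
#define MODEL_MAX_MAP_COUNT		65535ULL	/* "64k" */
#define MODEL_PID_MAX			32768ULL	/* "32k" */
#define MODEL_REFCOUNT_MAX		2147483647ULL	/* INT_MAX */
#define MODEL_REFCOUNT_SATURATED	3221225472ULL

/* Upper bound on vmas in the system: one full map per process. */
uint64_t model_max_vmas(uint64_t map_count, uint64_t pids)
{
	return map_count * pids;
}

/* Remaining increments before the counter would saturate. */
uint64_t model_headroom(uint64_t refs)
{
	assert(refs <= MODEL_REFCOUNT_SATURATED);
	return MODEL_REFCOUNT_SATURATED - refs;
}
```

With the maximum settings (4194304 pids, INT_MAX maps) the product is far above the saturation point, which is why sharing is capped at REFCOUNT_MAX.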
-
Suren Baghdasaryan authored
Avoid mixing strings and their anon_vma_name referenced pointers by using struct anon_vma_name whenever possible. This simplifies the code and allows easier sharing of anon_vma_name structures when they represent the same name. [surenb@google.com: fix comment] Link: https://lkml.kernel.org/r/20220223153613.835563-1-surenb@google.com Link: https://lkml.kernel.org/r/20220224231834.1481408-1-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Suggested-by: Matthew Wilcox <willy@infradead.org> Suggested-by: Michal Hocko <mhocko@suse.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Colin Cross <ccross@google.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Alexey Gladkov <legion@kernel.org> Cc: Sasha Levin <sashal@kernel.org> Cc: Chris Hyser <chris.hyser@oracle.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Peter Collingbourne <pcc@google.com> Cc: Xiaofeng Cao <caoxiaofeng@yulong.com> Cc: David Hildenbrand <david@redhat.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Mike Kravetz authored
The hugepage-mremap test will create a file in a hugetlb filesystem. In a default 'run_vmtests' run, the file will contain all the hugetlb pages. After the test, the file remains and there are no free hugetlb pages for subsequent tests. This causes those hugetlb tests to fail. Change hugepage-mremap to take the name of the hugetlb file as an argument. Unlink the file within the test, and just to be sure remove the file in the run_vmtests script. Link: https://lkml.kernel.org/r/20220201033459.156944-1-mike.kravetz@oracle.com Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Acked-by: Yosry Ahmed <yosryahmed@google.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Reviewed-by: Mina Almasry <almasrymina@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
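The "unlink the file within the test" pattern can be sketched as below. This is an illustration on an ordinary file, not the selftest code: `model_open_and_unlink()` and `model_path_exists()` are hypothetical helpers, and on a real run the path would name a file on a mounted hugetlb filesystem.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: open (creating if needed) the file at 'path'
 * and unlink it right away. The descriptor stays usable, and the
 * backing pages are released once it is closed, even if the test
 * crashes before reaching any cleanup code.
 * Returns the descriptor, or -1 on error.
 */
int model_open_and_unlink(const char *path)
{
	int fd;

	assert(path != NULL);
	fd = open(path, O_CREAT | O_RDWR, 0600);
	if (fd < 0)
		return -1;
	if (unlink(path) != 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Returns 1 if 'path' currently exists, 0 otherwise. */
int model_path_exists(const char *path)
{
	struct stat st;

	return stat(path, &st) == 0;
}
```

Because the name disappears immediately, nothing is left behind for subsequent tests; removing the file again from the runner script, as the commit does, is then only a safety net.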
-
- 04 Mar, 2022 11 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Linus Torvalds authored
Pull RISC-V fixes from Palmer Dabbelt: - Fixes for a handful of KASAN-related crashes. - A fix to avoid a crash during boot for SPARSEMEM && !SPARSEMEM_VMEMMAP configurations. - A fix to stop reporting some incorrect errors under DEBUG_VIRTUAL. - A fix for the K210's device tree to properly populate the interrupt map, so hart1 will get interrupts again. * tag 'riscv-for-linus-5.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: dts: k210: fix broken IRQs on hart1 riscv: Fix kasan pud population riscv: Move high_memory initialization to setup_bootmem riscv: Fix config KASAN && DEBUG_VIRTUAL riscv: Fix DEBUG_VIRTUAL false warnings riscv: Fix config KASAN && SPARSEMEM && !SPARSE_VMEMMAP riscv: Fix is_linear_mapping with recent move of KASAN region
-
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Linus Torvalds authored
Pull iommu fixes from Joerg Roedel: - Fix a double list_add() in Intel VT-d code - Add missing put_device() in Tegra SMMU driver - Two AMD IOMMU fixes: - Memory leak in IO page-table freeing code - Add missing recovery from event-log overflow * tag 'iommu-fixes-v5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/tegra-smmu: Fix missing put_device() call in tegra_smmu_find iommu/vt-d: Fix double list_add when enabling VMD in scalable mode iommu/amd: Fix I/O page table memory leak iommu/amd: Recover from event log overflow
-
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds authored
Pull thermal control fix from Rafael Wysocki: "Fix NULL pointer dereference in the thermal netlink interface (Nicolas Cavallari)" * tag 'thermal-5.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thermal: core: Fix TZ_GET_TRIP NULL pointer dereference
-
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Linus Torvalds authored
Pull sound fixes from Takashi Iwai: "Hopefully the last PR for 5.17, including just a few small changes: an additional fix for ASoC ops boundary check and other minor device-specific fixes" * tag 'sound-5.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: intel_hdmi: Fix reference to PCM buffer address ASoC: cs4265: Fix the duplicated control name ASoC: ops: Shift tested values in snd_soc_put_volsw() by +min
-
git://anongit.freedesktop.org/drm/drm
Linus Torvalds authored
Pull drm fixes from Dave Airlie: "Things are quieting down as expected, just a small set of fixes, i915, exynos, amdgpu, vrr, bridge and hdlcd. Nothing scary at all. i915: - Fix GuC SLPC unset command - Fix misidentification of some Apple MacBook Pro laptops as Jasper Lake amdgpu: - Suspend regression fix exynos: - irq handling fixes - Fix two regressions to TE-gpio handling arm/hdlcd: - Select DRM_GEM_CMA_HELPER for HDLCD bridge: - ti-sn65dsi86: Properly undo autosuspend vrr: - Fix potential NULL-pointer deref" * tag 'drm-fixes-2022-03-04' of git://anongit.freedesktop.org/drm/drm: drm/amdgpu: fix suspend/resume hang regression drm/vrr: Set VRR capable prop only if it is attached to connector drm/arm: arm hdlcd select DRM_GEM_CMA_HELPER drm/bridge: ti-sn65dsi86: Properly undo autosuspend drm/i915: s/JSP2/ICP2/ PCH drm/i915/guc/slpc: Correct the param count for unset param drm/exynos: Search for TE-gpio in DSI panel's node drm/exynos: Don't fail if no TE-gpio is defined for DSI driver drm/exynos: gsc: Use platform_get_irq() to get the interrupt drm/exynos/fimc: Use platform_get_irq() to get the interrupt drm/exynos/exynos_drm_fimd: Use platform_get_irq_byname() to get the interrupt drm/exynos: mixer: Use platform_get_irq() to get the interrupt drm/exynos/exynos7_drm_decon: Use platform_get_irq_byname() to get the interrupt
-
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Linus Torvalds authored
Pull pin control fixes from Linus Walleij: "These two fixes should fix the issues seen on the OrangePi, first we needed the correct offset when calling pinctrl_gpio_direction(), and fixing that made a lockdep issue explode in our face. Both now fixed" * tag 'pinctrl-v5.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: sunxi: Use unique lockdep classes for IRQs pinctrl-sunxi: sunxi_pinctrl_gpio_direction_in/output: use correct offset
-
Daniel Borkmann authored
syzkaller was recently triggering an oversized kvmalloc() warning via xdp_umem_create(). The triggered warning was added back in 7661809d ("mm: don't allow oversized kvmalloc() calls"). The rationale for the warning for huge kvmalloc sizes was as a reaction to a security bug where the size was more than UINT_MAX but not everything was prepared to handle unsigned long sizes. Anyway, the AF_XDP related call trace from this syzkaller report was: kvmalloc include/linux/mm.h:806 [inline] kvmalloc_array include/linux/mm.h:824 [inline] kvcalloc include/linux/mm.h:829 [inline] xdp_umem_pin_pages net/xdp/xdp_umem.c:102 [inline] xdp_umem_reg net/xdp/xdp_umem.c:219 [inline] xdp_umem_create+0x6a5/0xf00 net/xdp/xdp_umem.c:252 xsk_setsockopt+0x604/0x790 net/xdp/xsk.c:1068 __sys_setsockopt+0x1fd/0x4e0 net/socket.c:2176 __do_sys_setsockopt net/socket.c:2187 [inline] __se_sys_setsockopt net/socket.c:2184 [inline] __x64_sys_setsockopt+0xb5/0x150 net/socket.c:2184 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae Björn mentioned that requests for >2GB allocation can still be valid: The structure that is being allocated is the page-pinning accounting. AF_XDP has an internal limit of U32_MAX pages, which is *a lot*, but still fewer than what memcg allows (PAGE_COUNTER_MAX is a LONG_MAX/ PAGE_SIZE on 64 bit systems). [...] I could just change from U32_MAX to INT_MAX, but as I stated earlier that has a hacky feeling to it. [...] From my perspective, the code isn't broken, with the memcg limits in consideration. [...] Linus says: [...] Pretty much every time this has come up, the kernel warning has shown that yes, the code was broken and there really wasn't a reason for doing allocations that big. Of course, some people would be perfectly fine with the allocation failing, they just don't want the warning. I didn't want __GFP_NOWARN to shut it up originally because I wanted people to see all those cases, but these days I think we can just say "yeah, people can shut it up explicitly by saying 'go ahead and fail this allocation, don't warn about it'". So enough time has passed that by now I'd certainly be ok with [it]. Thus allow call-sites to silence such userspace triggered splats if the allocation requests have __GFP_NOWARN. For xdp_umem_pin_pages()'s call to kvcalloc() this is already the case, so nothing else needed there. Fixes: 7661809d ("mm: don't allow oversized kvmalloc() calls") Reported-by: syzbot+11421fbbff99b989670e@syzkaller.appspotmail.com Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: syzbot+11421fbbff99b989670e@syzkaller.appspotmail.com Cc: Björn Töpel <bjorn@kernel.org> Cc: Magnus Karlsson <magnus.karlsson@intel.com> Cc: Willy Tarreau <w@1wt.eu> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Jakub Kicinski <kuba@kernel.org> Cc: David S. Miller <davem@davemloft.net> Link: https://lore.kernel.org/bpf/CAJ+HfNhyfsT5cS_U9EC213ducHs9k9zNxX9+abqC0kTrPbQ0gg@mail.gmail.com Link: https://lore.kernel.org/bpf/20211201202905.b9892171e3f5b9a60f9da251@linux-foundation.org Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
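The resulting behaviour can be modelled in userspace as follows. This is only a sketch under assumptions: `model_kvmalloc()`, `MODEL_GFP_NOWARN` and the warning counter are hypothetical stand-ins, with the counter playing the role of the kernel's one-time warning.

```c
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_GFP_NOWARN	0x1u	/* hypothetical stand-in for __GFP_NOWARN */

static unsigned int model_warnings;	/* counts would-be kernel warnings */

unsigned int model_warning_count(void)
{
	return model_warnings;
}

/*
 * Model of the oversized-allocation check: requests above INT_MAX
 * always fail, but they only warn when the caller did not pass the
 * "no warning" flag.
 */
void *model_kvmalloc(size_t size, unsigned int flags)
{
	if (size > (size_t)INT_MAX) {
		if (!(flags & MODEL_GFP_NOWARN))
			model_warnings++;
		return NULL;
	}
	return malloc(size);
}
```

A caller that can cope with failure, such as the page-pinning accounting described above, passes the flag and simply sees a NULL return; any other caller still produces a warning that points at a likely bug.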
-
Niklas Cassel authored
Commit 67d96729 ("riscv: Update Canaan Kendryte K210 device tree") incorrectly removed two entries from the PLIC interrupt-controller node's interrupts-extended property. The PLIC driver cannot know the mapping between hart contexts and hart ids, so this information has to be provided by device tree, as specified by the PLIC device tree binding. The PLIC driver uses the interrupts-extended property, and initializes the hart context registers in the exact same order as provided by the interrupts-extended property. In other words, if we don't specify the S-mode interrupts, the PLIC driver will simply initialize the hart0 S-mode hart context with the hart1 M-mode configuration. It is therefore essential to specify the S-mode IRQs even though the system itself will only ever be running in M-mode. Re-add the S-mode interrupts, so that we get working IRQs on hart1 again. Cc: <stable@vger.kernel.org> Fixes: 67d96729 ("riscv: Update Canaan Kendryte K210 device tree") Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
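The ordering problem can be illustrated with a small model. It assumes, as the description above implies, that the hardware contexts alternate M-mode and S-mode per hart (context 0 is hart0 M-mode, context 1 is hart0 S-mode, and so on); all names are hypothetical and this is not the PLIC driver code.

```c
#include <assert.h>
#include <stddef.h>

enum model_mode { MODEL_M_MODE, MODEL_S_MODE };

struct model_ctx {
	int hart;
	enum model_mode mode;
};

/* Assumed hardware layout: contexts alternate M/S for each hart. */
struct model_ctx model_hw_context(size_t index)
{
	struct model_ctx c;

	c.hart = (int)(index / 2);
	c.mode = (index % 2) ? MODEL_S_MODE : MODEL_M_MODE;
	return c;
}

/*
 * The driver programs context i with the i-th device tree entry.
 * Count the entries that land on a context belonging to a different
 * hart or privilege mode.
 */
size_t model_count_misprogrammed(const struct model_ctx *dt, size_t n)
{
	size_t i, bad = 0;

	assert(dt != NULL);
	for (i = 0; i < n; i++) {
		struct model_ctx hw = model_hw_context(i);

		if (hw.hart != dt[i].hart || hw.mode != dt[i].mode)
			bad++;
	}
	return bad;
}
```

With all four entries present nothing is misprogrammed; with only the two M-mode entries, the hart1 M-mode entry lands on the hart0 S-mode context, which matches the breakage described above.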
-
git://anongit.freedesktop.org/drm/drm-misc
Dave Airlie authored
* drm/arm: Select DRM_GEM_CMA_HELPER for HDLCD * drm/bridge: ti-sn65dsi86: Properly undo autosuspend * drm/vrr: Fix potential NULL-pointer deref Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/YiCTGZ8IVCw0ilKK@linux-uq9g
-
Dave Airlie authored
Merge tag 'amd-drm-fixes-5.17-2022-03-02' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-5.17-2022-03-02: amdgpu: - Suspend regression fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220303045035.5650-1-alexander.deucher@amd.com
-
Dave Airlie authored
Merge tag 'drm-intel-fixes-2022-03-03' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes - Fix GuC SLPC unset command. (Vinay Belgaumkar) - Fix misidentification of some Apple MacBook Pro laptops as Jasper Lake. (Ville Syrjälä) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/YiCXHiTyCE7TbopG@tursulin-mobl2
-
- 03 Mar, 2022 23 commits
-
-
Alexandre Ghiti authored
In sv48, the kasan inner regions are not aligned on PGDIR_SIZE and then when we populate the kasan linear mapping region, we clear the kasan vmalloc region which is in the same PGD. Fix this by copying the content of the kasan early pud after allocating a new PGD for the first time. Fixes: e8a62cc2 ("riscv: Implement sv48 support") Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com> Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
Alexandre Ghiti authored
high_memory used to be initialized in mem_init, way after setup_bootmem. But a call to dma_contiguous_reserve in this function gives rise to the below warning because high_memory is equal to 0 and is used at the very beginning at cma_declare_contiguous_nid. It went unnoticed since the move of the kasan region redefined KERN_VIRT_SIZE so that it does not encompass -1 anymore. Fix this by initializing high_memory in setup_bootmem. ------------[ cut here ]------------ virt_to_phys used for non-linear address: ffffffffffffffff (0xffffffffffffffff) WARNING: CPU: 0 PID: 0 at arch/riscv/mm/physaddr.c:14 __virt_to_phys+0xac/0x1b8 Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 5.17.0-rc1-00007-ga68b89289e26 #27 Hardware name: riscv-virtio,qemu (DT) epc : __virt_to_phys+0xac/0x1b8 ra : __virt_to_phys+0xac/0x1b8 epc : ffffffff80014922 ra : ffffffff80014922 sp : ffffffff84a03c30 gp : ffffffff85866c80 tp : ffffffff84a3f180 t0 : ffffffff86bce657 t1 : fffffffef09406e8 t2 : 0000000000000000 s0 : ffffffff84a03c70 s1 : ffffffffffffffff a0 : 000000000000004f a1 : 00000000000f0000 a2 : 0000000000000002 a3 : ffffffff8011f408 a4 : 0000000000000000 a5 : 0000000000000000 a6 : 0000000000f00000 a7 : ffffffff84a03747 s2 : ffffffd800000000 s3 : ffffffff86ef4000 s4 : ffffffff8467f828 s5 : fffffff800000000 s6 : 8000000000006800 s7 : 0000000000000000 s8 : 0000000480000000 s9 : 0000000080038ea0 s10: 0000000000000000 s11: ffffffffffffffff t3 : ffffffff84a035c0 t4 : fffffffef09406e8 t5 : fffffffef09406e9 t6 : ffffffff84a03758 status: 0000000000000100 badaddr: 0000000000000000 cause: 0000000000000003 [<ffffffff8322ef4c>] cma_declare_contiguous_nid+0xf2/0x64a [<ffffffff83212a58>] dma_contiguous_reserve_area+0x46/0xb4 [<ffffffff83212c3a>] dma_contiguous_reserve+0x174/0x18e [<ffffffff83208fc2>] paging_init+0x12c/0x35e [<ffffffff83206bd2>] setup_arch+0x120/0x74e [<ffffffff83201416>] start_kernel+0xce/0x68c irq event stamp: 0 hardirqs last enabled at (0): [<0000000000000000>] 0x0 hardirqs 
last disabled at (0): [<0000000000000000>] 0x0 softirqs last enabled at (0): [<0000000000000000>] 0x0 softirqs last disabled at (0): [<0000000000000000>] 0x0 ---[ end trace 0000000000000000 ]--- Fixes: f7ae0233 ("riscv: Move KASAN mapping next to the kernel mapping") Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com> Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
Alexandre Ghiti authored
The __virt_to_phys function is called very early in the boot process (i.e. in kasan_early_init), so it should not be instrumented by KASAN, otherwise it crashes. Fix this by declaring physaddr.c as non-kasan instrumentable. Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com> Fixes: 8ad8b727 ("riscv: Add KASAN support") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
Alexandre Ghiti authored
KERN_VIRT_SIZE used to encompass the kernel mapping before it was redefined, when the kasan mapping was moved next to the kernel mapping, to only match the maximum amount of physical memory. As a result, kernel mapping addresses that go through __virt_to_phys are now flagged as wrong, which is not true: one can use __virt_to_phys on such addresses. Fix this by redefining the condition that matches wrong addresses. Fixes: f7ae0233 ("riscv: Move KASAN mapping next to the kernel mapping") Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com> Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
Alexandre Ghiti authored
In order to get the pfn of a struct page* when sparsemem is enabled without vmemmap, the mem_section structures need to be initialized which happens in sparse_init. But kasan_early_init calls pfn_to_page way before sparse_init is called, which then tries to dereference a null mem_section pointer. Fix this by removing the usage of this function in kasan_early_init. Fixes: 8ad8b727 ("riscv: Add KASAN support") Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com> Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
-
Alexandre Ghiti authored
The KASAN region was recently moved between the linear mapping and the kernel mapping. is_linear_mapping used to check the validity of an address by using the start of the kernel mapping, which is now wrong. Fix this by using the maximum size of the physical memory. Fixes: f7ae0233 ("riscv: Move KASAN mapping next to the kernel mapping") Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com> Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
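The difference between the two checks can be shown with a toy address layout. The constants below are made up for illustration and are much smaller than real RISC-V addresses; only the ordering of the regions (linear mapping, then KASAN, then kernel mapping) is taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy layout, hypothetical values: linear | KASAN | kernel mapping. */
#define MODEL_PAGE_OFFSET	0x1000ULL	/* start of the linear mapping */
#define MODEL_MAX_PHYS_SIZE	0x1000ULL	/* maximum physical memory */
#define MODEL_KASAN_START	0x2000ULL	/* KASAN region follows */
#define MODEL_KERNEL_START	0x3000ULL	/* kernel mapping comes last */

static_assert(MODEL_KASAN_START >= MODEL_PAGE_OFFSET + MODEL_MAX_PHYS_SIZE,
	      "KASAN region must not overlap the linear mapping");

/* Old check: everything below the kernel mapping counts as linear. */
bool model_is_linear_old(uint64_t va)
{
	return va >= MODEL_PAGE_OFFSET && va < MODEL_KERNEL_START;
}

/* New check: bounded by the maximum size of physical memory. */
bool model_is_linear_new(uint64_t va)
{
	return va >= MODEL_PAGE_OFFSET &&
	       va < MODEL_PAGE_OFFSET + MODEL_MAX_PHYS_SIZE;
}
```

An address inside the KASAN region passes the old check even though it is not part of the linear mapping, and is correctly rejected by the new one.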
-
Ammar Faizi authored
The patchwork link is dead. It says: 404: File not found The page URL requested (/project/LKML/list/) does not exist. Remove it. Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
David Howells authored
When cachefiles_shorten_object() calls fallocate() to shape the cache file to match the DIO size, it passes the total file size it wants to achieve, not the amount of zeros that should be inserted. Since this is meant to preallocate that amount of storage for the file, it can cause the cache to fill up the disk and hit ENOSPC. Fix this by passing the length actually required to go from the current EOF to the desired EOF. Fixes: 7623ed67 ("cachefiles: Implement cookie resize for truncate") Reported-by: Jeffle Xu <jefflexu@linux.alibaba.com> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/164630854858.3665356.17419701804248490708.stgit@warthog.procyon.org.uk # v1 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
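The corrected length computation amounts to a subtraction. A minimal sketch, with a hypothetical helper name and without the actual fallocate() call:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of bytes to preallocate in order to
 * grow a file from its current EOF to the desired EOF. Passing
 * 'want_eof' itself as the length (the bug described above) would
 * reserve that many bytes beyond the current EOF instead.
 */
int64_t model_extend_len(int64_t cur_eof, int64_t want_eof)
{
	assert(cur_eof >= 0 && want_eof >= 0);
	return want_eof > cur_eof ? want_eof - cur_eof : 0;
}
```

For example, rounding a 5000-byte file up to an 8192-byte DIO boundary needs 3192 bytes at offset 5000, not 8192.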
-
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds authored
Pull networking fixes from Jakub Kicinski: "Including fixes from can, xfrm, wifi, bluetooth, and netfilter. Lots of various size fixes, the length of the tag speaks for itself. Most of the 5.17-relevant stuff comes from xfrm, wifi and bt trees which had been lagging as you pointed out previously. But there's also a larger than we'd like portion of fixes for bugs from previous releases. Three more fixes still under discussion, including an xfrm revert for uAPI error. Current release - regressions: - iwlwifi: don't advertise TWT support, prevent FW crash - xfrm: fix the if_id check in changelink - xen/netfront: destroy queues before real_num_tx_queues is zeroed - bluetooth: fix not checking MGMT cmd pending queue, make scanning work again Current release - new code bugs: - mptcp: make SIOCOUTQ accurate for fallback socket - bluetooth: access skb->len after null check - bluetooth: hci_sync: fix not using conn_timeout - smc: fix cleanup when register ULP fails - dsa: restore error path of dsa_tree_change_tag_proto - iwlwifi: fix build error for IWLMEI - iwlwifi: mvm: propagate error from request_ownership to the user Previous releases - regressions: - xfrm: fix pMTU regression when reported pMTU is too small - xfrm: fix TCP MSS calculation when pMTU is close to 1280 - bluetooth: fix bt_skb_sendmmsg not allocating partial chunks - ipv6: ensure we call ipv6_mc_down() at most once, prevent leaks - ipv6: prevent leaks in igmp6 when input queues get full - fix up skbs delta_truesize in UDP GRO frag_list - eth: e1000e: fix possible HW unit hang after an s0ix exit - eth: e1000e: correct NVM checksum verification flow - ptp: ocp: fix large time adjustments Previous releases - always broken: - tcp: make tcp_read_sock() more robust in presence of urgent data - xfrm: distinguishing SAs and SPs by if_id in xfrm_migrate - xfrm: fix xfrm_migrate issues when address family changes - dcb: flush lingering app table entries for unregistered devices - smc: fix unexpected
SMC_CLC_DECL_ERR_REGRMB error - mac80211: fix EAPoL rekey fail in 802.3 rx path - mac80211: fix forwarded mesh frames AC & queue selection - netfilter: nf_queue: fix socket access races and bugs - batman-adv: fix ToCToU iflink problems and check the result belongs to the expected net namespace - can: gs_usb, etas_es58x: fix opened_channel_cnt's accounting - can: rcar_canfd: register the CAN device when fully ready - eth: igb, igc: phy: drop premature return leaking HW semaphore - eth: ixgbe: xsk: change !netif_carrier_ok() handling in ixgbe_xmit_zc(), prevent live lock when link goes down - eth: stmmac: only enable DMA interrupts when ready - eth: sparx5: move vlan checks before any changes are made - eth: iavf: fix races around init, removal, resets and vlan ops - ibmvnic: more reset flow fixes Misc: - eth: fix return value of __setup handlers" * tag 'net-5.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (92 commits) ipv6: fix skb drops in igmp6_event_query() and igmp6_event_report() net: dsa: make dsa_tree_change_tag_proto actually unwind the tag proto change ixgbe: xsk: change !netif_carrier_ok() handling in ixgbe_xmit_zc() selftests: mlxsw: resource_scale: Fix return value selftests: mlxsw: tc_police_scale: Make test more robust net: dcb: disable softirqs in dcbnl_flush_dev() bnx2: Fix an error message sfc: extend the locking on mcdi->seqno net/smc: fix unexpected SMC_CLC_DECL_ERR_REGRMB error cause by server net/smc: fix unexpected SMC_CLC_DECL_ERR_REGRMB error generated by client net: arcnet: com20020: Fix null-ptr-deref in com20020pci_probe() tcp: make tcp_read_sock() more robust bpf, sockmap: Do not ignore orig_len parameter net: ipa: add an interconnect dependency net: fix up skbs delta_truesize in UDP GRO frag_list iwlwifi: mvm: return value for request_ownership nl80211: Update bss channel on channel switch for P2P_CLIENT iwlwifi: fix build error for IWLMEI ptp: ocp: Add ptp_ocp_adjtime_coarse for large adjustments batman-adv: Don't 
expect inter-netns unique iflink indices ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Linus Torvalds authored
Pull MIPS fixes from Thomas Bogendoerfer: - Fix memory detection for MT7621 devices - Fix setnocoherentio kernel option - Fix warning when CONFIG_SCHED_CORE is enabled * tag 'mips-fixes-5.17_4' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: ralink: mt7621: use bitwise NOT instead of logical mips: setup: fix setnocoherentio() boolean setting MIPS: smp: fill in sibling and core maps earlier MIPS: ralink: mt7621: do memory detection on KSEG1
-
git://github.com/ojeda/linux
Linus Torvalds authored
Pull auxdisplay fixes from Miguel Ojeda: "A few lcd2s fixes from Andy Shevchenko" * tag 'auxdisplay-for-linus-v5.17-rc7' of git://github.com/ojeda/linux: auxdisplay: lcd2s: Use proper API to free the instance of charlcd object auxdisplay: lcd2s: Fix memory leak in ->remove() auxdisplay: lcd2s: Fix lcd2s_redefine_char() feature
-
Eric Dumazet authored
While investigating why a synchronize_net() has been added recently in ipv6_mc_down(), I found that igmp6_event_query() and igmp6_event_report() might drop skbs in some cases. Discussion about removing synchronize_net() from ipv6_mc_down() will happen in a different thread. Fixes: f185de28 ("mld: add new workqueues for process mld events") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Taehee Yoo <ap420073@gmail.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220303173728.937869-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
The blamed commit said one thing but did another. It explains that we should restore the "return err" to the original "goto out_unwind_tagger", but instead it replaced it with "goto out_unlock". When DSA_NOTIFIER_TAG_PROTO fails after the first switch of a multi-switch tree, the switches would end up not using the same tagging protocol. Fixes: 0b0e2ff1 ("net: dsa: restore error path of dsa_tree_change_tag_proto") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220303154249.1854436-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
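The importance of jumping to the unwind label rather than straight to the unlock can be shown with a small model. The function, the array of per-switch protocols and the `fail_at` parameter are all hypothetical; the sketch only mirrors the control flow, not the DSA code.

```c
#include <assert.h>

#define MODEL_NSWITCH 3

/*
 * Model of changing the tagging protocol on every switch of a tree.
 * 'fail_at' is the index of the switch that rejects the change, or -1
 * for none. On failure the switches already updated are rolled back,
 * so the tree never ends up with mixed protocols.
 * Returns 0 on success, -1 on failure.
 */
int model_change_tag_proto(int protos[MODEL_NSWITCH], int new_proto,
			   int fail_at)
{
	int old_proto = protos[0];
	int i, err = 0;

	assert(fail_at < MODEL_NSWITCH);
	for (i = 0; i < MODEL_NSWITCH; i++) {
		if (i == fail_at) {
			err = -1;
			goto out_unwind;	/* not a plain return or unlock */
		}
		protos[i] = new_proto;
	}
	return 0;

out_unwind:
	while (i-- > 0)
		protos[i] = old_proto;	/* restore switches already changed */
	return err;
}
```

Skipping the unwind step in this model would leave the first switch on the new protocol and the others on the old one, which is the inconsistency described above.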
-
Maciej Fijalkowski authored
Commit c685c69f ("ixgbe: don't do any AF_XDP zero-copy transmit if netif is not OK") addressed the ring transient state when MEM_TYPE_XSK_BUFF_POOL was being configured, which in turn caused the interface to go through down/up. Maurice reported that when the carrier is not ok and an xsk_pool is present on the ring pair, ksoftirqd will consume 100% CPU cycles due to the constant NAPI rescheduling, as ixgbe_poll() states that there is still some work to be done.

To fix this, do not set work_done to false for a !netif_carrier_ok().

Fixes: c685c69f ("ixgbe: don't do any AF_XDP zero-copy transmit if netif is not OK")
Reported-by: Maurice Baijens <maurice.baijens@ellips.com>
Tested-by: Maurice Baijens <maurice.baijens@ellips.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Ido Schimmel says:

====================
selftests: mlxsw: A couple of fixes

Patch #1 fixes a breakage due to a change in iproute2 output. The real problem is not iproute2, but the fact that the check was not strict enough. Fixed by using JSON output instead. Targeting at net so that the test will pass as part of old and new kernels regardless of iproute2 version.

Patch #2 fixes an issue uncovered by the first one.
====================

Link: https://lore.kernel.org/r/20220302161447.217447-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Amit Cohen authored
The test runs several test cases and is supposed to return an error in case at least one of them failed.

Currently, the check of the return value of each test case is in the wrong place, which can result in the wrong return value. For example:

 # TESTS='tc_police' ./resource_scale.sh
 TEST: 'tc_police' [default] 968                                     [FAIL]
         tc police offload count failed
 Error: mlxsw_spectrum: Failed to allocate policer index.
 We have an error talking to the kernel
 Command failed /tmp/tmp.i7Oc5HwmXY:969
 TEST: 'tc_police' [default] overflow 969                            [ OK ]
 ...
 TEST: 'tc_police' [ipv4_max] overflow 969                           [ OK ]

 $ echo $?
 0

Fix this by moving the check to be done after each test case.

Fixes: 059b18e2 ("selftests: mlxsw: Return correct error code in resource scale test")
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
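The effect of checking the return value in the wrong place can be sketched in user-space Python. This is only an illustration under stated assumptions: `run_all`, `run_all_buggy`, and the toy cases are hypothetical names, and the buggy variant shows one way a failure can be masked, not the actual code of resource_scale.sh.

```python
def run_all(test_cases):
    """Run every case; return 1 if at least one failed, else 0."""
    exit_status = 0
    for _name, case in test_cases:
        ok = case()
        # Check right after each case, so a later success cannot hide
        # an earlier failure.
        if not ok:
            exit_status = 1
    return exit_status


def run_all_buggy(test_cases):
    """Illustrative bug: only the result of the last case is checked."""
    ok = True
    for _name, case in test_cases:
        ok = case()
    return 0 if ok else 1


# Hypothetical cases mirroring the log above: the first fails, the rest pass.
cases = [
    ("default", lambda: False),
    ("default overflow", lambda: True),
    ("ipv4_max overflow", lambda: True),
]

print(run_all(cases))        # 1
print(run_all_buggy(cases))  # 0
```

Recording the failure inside the loop keeps the final status independent of the order in which the cases run.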
-
Amit Cohen authored
The test adds tc filters and checks how many of them were offloaded by grepping for 'in_hw'.

iproute2 commit f4cd4f127047 ("tc: add skip_hw and skip_sw to control action offload") added offload indication to tc actions, producing the following output:

 $ tc filter show dev swp2 ingress
 ...
 filter protocol ipv6 pref 1000 flower chain 0 handle 0x7c0
   eth_type ipv6
   dst_ip 2001:db8:1::7bf
   skip_sw
   in_hw in_hw_count 1
         action order 1: police 0x7c0 rate 10Mbit burst 100Kb mtu 2Kb action drop overhead 0b
         ref 1 bind 1
         not_in_hw
         used_hw_stats immediate

The current grep expression matches on both 'in_hw' and 'not_in_hw', resulting in incorrect results.

Fix that by using JSON output instead.

Fixes: 5061e773 ("selftests: mlxsw: Add scale test for tc-police")
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
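The over-counting described above can be reproduced with a small Python sketch. It is an illustration only: the line breaks in `text_output` are a reconstruction of the quoted output, the JSON field layout in `json_output` is a simplified assumption rather than the exact `tc -j` schema, and both helper functions are hypothetical.

```python
import json

# One offloaded filter whose action line carries "not_in_hw"
# (line breaks reconstructed for illustration).
text_output = """\
filter protocol ipv6 pref 1000 flower chain 0 handle 0x7c0
  eth_type ipv6
  dst_ip 2001:db8:1::7bf
  skip_sw
  in_hw in_hw_count 1
        action order 1: police 0x7c0 rate 10Mbit burst 100Kb mtu 2Kb
        ref 1 bind 1
        not_in_hw
        used_hw_stats immediate
"""

# Assumed, simplified JSON for the same filter.
json_output = json.dumps([
    {
        "kind": "flower",
        "options": {
            "in_hw": True,
            "actions": [{"kind": "police", "not_in_hw": True}],
        },
    }
])


def count_by_substring(text):
    """Count lines containing 'in_hw', as a loose grep would."""
    return sum(1 for line in text.splitlines() if "in_hw" in line)


def count_by_json(raw):
    """Count filters whose own 'in_hw' field is true."""
    filters = json.loads(raw)
    return sum(1 for f in filters if f.get("options", {}).get("in_hw") is True)


print(count_by_substring(text_output))  # 2
print(count_by_json(json_output))       # 1
```

Reading a named field avoids depending on how neighbouring keywords happen to be spelled in the text output.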
-
Vladimir Oltean authored
Ido Schimmel points out that since commit 52cff74e ("dcbnl : Disable software interrupts before taking dcb_lock"), the DCB API can be called by drivers from softirq context.

One such in-tree example is the chelsio cxgb4 driver:

 dcb_rpl
 -> cxgb4_dcb_handle_fw_update
    -> dcb_ieee_setapp

If the firmware for this driver happened to send an event which resulted in a call to dcb_ieee_setapp() at the exact same time as another DCB-enabled interface was unregistering on the same CPU, the softirq would deadlock, because the interrupted process was already holding the dcb_lock in dcbnl_flush_dev().

Fix this unlikely event by using spin_lock_bh() in dcbnl_flush_dev() as in the rest of the dcbnl code.

Fixes: 91b0383f ("net: dcb: flush lingering app table entries for unregistered devices")
Reported-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220302193939.1368823-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Christophe JAILLET authored
Fix an error message and report the correct failing function. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Niels Dossche authored
seqno could be read as a stale value outside of the lock. The lock is already acquired to protect the modification of seqno against a possible race condition. Read this value inside the lock as well, to protect it against a possible race condition.

Signed-off-by: Niels Dossche <dossche.niels@gmail.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
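The pattern of reading a shared value under the same lock that protects its modification can be sketched in user-space Python. This is a minimal illustration, not the driver code: `SeqnoCounter`, `take`, and `collect` are hypothetical names, and the modification is assumed to be a simple increment.

```python
import threading


class SeqnoCounter:
    """Hypothetical stand-in for a shared sequence-number field."""

    def __init__(self):
        self._lock = threading.Lock()
        self._seqno = 0

    def take(self):
        # Read and update under the same lock, so no other thread can
        # change the value between the read and the increment.
        with self._lock:
            seqno = self._seqno
            self._seqno = seqno + 1
        return seqno


def collect(counter, rounds, out):
    """Take `rounds` sequence numbers and record them in `out`."""
    for _ in range(rounds):
        out.append(counter.take())


counter = SeqnoCounter()
seen = []
workers = [
    threading.Thread(target=collect, args=(counter, 500, seen))
    for _ in range(4)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

# Every sequence number was handed out exactly once.
print(len(seen) == len(set(seen)))
```

If the read happened before the lock was taken, two threads could observe the same value and then both increment it, which is the stale-read hazard the commit message describes.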
-
David S. Miller authored
D. Wythe says:

====================
fix unexpected SMC_CLC_DECL_ERR_REGRMB error

We can easily trigger the SMC_CLC_DECL_ERR_REGRMB exception with the following script:

server: smc_run nginx

client: smc_run ./wrk -c 2000 -t 8 -d 20 http://smc-server

We can clearly see that this error falls into two types:

1. 0x09990003
2. 0x05000000/0x09990003

Both have the same root cause, but the immediate causes vary.

The root cause of this issue is that removing connections from the link group is not synchronous with adding/deleting rtoken entries, which means that even if the number of connections is less than SMC_RMBS_PER_LGR_MAX, the connection is not guaranteed to register its rtoken successfully later. In other words, the rtoken entry may not have been released yet. This causes an unexpected SMC_CLC_DECL_ERR_REGRMB to be reported, and the SMC connection then has to fall back to TCP.

This patch set handles the two types of SMC_CLC_DECL_ERR_REGRMB exceptions from different perspectives.

Patch 1: fix the 0x05000000/0x09990003 error.
Patch 2: fix the 0x09990003 error.

After these patches, there are no SMC_CLC_DECL_ERR_REGRMB exceptions in my test case any more.

v1 -> v2:
 - add bugfix patch for SMC_CLC_DECL_ERR_REGRMB caused by server side

v2 -> v3:
 - fix incorrect mail thread
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
-
D. Wythe authored
The cause of SMC_CLC_DECL_ERR_REGRMB on the server is clear: whether a new SMC connection can be accepted depends not only on the limit on the number of connections, but also on the available rtoken entries.

Since the rtoken release is triggered by the peer, while the connection count is decreased locally, many things can happen in the time between the two.

The only thing that needs to be mentioned is that, since all connection creations are now completely protected by the smc_server_lgr_pending lock, it is enough to check only the available entries in rtokens_used_mask.

Fixes: cd6851f3 ("smc: remote memory buffers (RMBs)")
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
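The difference between checking the connection count and checking the used-entry mask can be sketched as follows. This is a user-space illustration under stated assumptions: `MAX_ENTRIES` is a tiny hypothetical limit, and `first_free_entry`, `accept_by_count`, and `accept_by_mask` are made-up helpers, not kernel functions.

```python
MAX_ENTRIES = 4  # hypothetical limit, kept tiny for illustration


def first_free_entry(used_mask, max_entries=MAX_ENTRIES):
    """Return the index of the first clear bit, or None if all are set."""
    for index in range(max_entries):
        if not (used_mask >> index) & 1:
            return index
    return None


def accept_by_count(conn_count, max_entries=MAX_ENTRIES):
    """Count-only check: looks at the number of connections."""
    return conn_count < max_entries


def accept_by_mask(used_mask, max_entries=MAX_ENTRIES):
    """Mask check: accept only when an entry is actually free."""
    return first_free_entry(used_mask, max_entries) is not None


# One connection has already left locally (count dropped to 3), but the
# peer has not released its entry yet, so all four bits are still set.
conn_count, used_mask = 3, 0b1111
print(accept_by_count(conn_count))  # True
print(accept_by_mask(used_mask))    # False
```

In this state the count-only check would admit a connection that cannot obtain an entry, while the mask check defers until an entry is really free.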
-
D. Wythe authored
The main reason for this unexpected SMC_CLC_DECL_ERR_REGRMB in the client is the following execution sequence:

 Server Conn A:           Server Conn B:           Client Conn B:

 smc_lgr_unregister_conn
                          smc_lgr_register_conn
                          smc_clc_send_accept ->
                                                   smc_rtoken_add
 smcr_buf_unuse ->        Client Conn A:
                          smc_rtoken_delete

smc_lgr_unregister_conn() makes the current link available to be assigned to a new incoming connection while smcr_buf_unuse() has not executed yet, which means that smc_rtoken_add may fail because of insufficient rtoken_entry. Reversing their execution order will avoid this problem.

Fixes: 3e034725 ("net/smc: common functions for RMBs and send buffers")
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-