- 14 Aug, 2023 16 commits
-
-
Chao Yu authored
As reported, status debugfs entry shows inconsistent GC stats as below: GC calls: 6008 (BG: 6161) - data segments : 3053 (BG: 3053) - node segments : 2955 (BG: 2955) Total GC calls is larger than BGGC calls, the reason is: - f2fs_stat_info.call_count accounts total migrated section count by f2fs_gc() - f2fs_stat_info.bg_gc accounts total call times of f2fs_gc() from background gc_thread Another issue is gc_foreground_calls sysfs entry shows total GC call count rather than FGGC call count. This patch changes as below for fix: - account GC calls and migrated segment count separately - support to account migrated section count if it enables large section mode - fix to show correct value in gc_foreground_calls sysfs entry Fixes: fc7100ea ("f2fs: Add f2fs stats to sysfs") Signed-off-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
It has checked return value of write_all_xattrs(), remove unneeded following check condition. Signed-off-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
generic/728 - output mismatch (see /media/fstests/results//generic/728.out.bad) --- tests/generic/728.out 2023-07-19 07:10:48.362711407 +0000 +++ /media/fstests/results//generic/728.out.bad 2023-07-19 08:39:57.000000000 +0000 QA output created by 728 +Expected ctime to change after setxattr. +Expected ctime to change after removexattr. Silence is golden ... (Run 'diff -u /media/fstests/tests/generic/728.out /media/fstests/results//generic/728.out.bad' to see the entire diff) generic/729 1s It needs to update i_ctime after {set,remove}xattr, fix it. Signed-off-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
syzbot reports a f2fs bug as below: UBSAN: array-index-out-of-bounds in fs/f2fs/f2fs.h:3275:19 index 1409 is out of range for type '__le32[923]' (aka 'unsigned int[923]') Call Trace: __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x1e7/0x2d0 lib/dump_stack.c:106 ubsan_epilogue lib/ubsan.c:217 [inline] __ubsan_handle_out_of_bounds+0x11c/0x150 lib/ubsan.c:348 inline_data_addr fs/f2fs/f2fs.h:3275 [inline] __recover_inline_status fs/f2fs/inode.c:113 [inline] do_read_inode fs/f2fs/inode.c:480 [inline] f2fs_iget+0x4730/0x48b0 fs/f2fs/inode.c:604 f2fs_fill_super+0x640e/0x80c0 fs/f2fs/super.c:4601 mount_bdev+0x276/0x3b0 fs/super.c:1391 legacy_get_tree+0xef/0x190 fs/fs_context.c:611 vfs_get_tree+0x8c/0x270 fs/super.c:1519 do_new_mount+0x28f/0xae0 fs/namespace.c:3335 do_mount fs/namespace.c:3675 [inline] __do_sys_mount fs/namespace.c:3884 [inline] __se_sys_mount+0x2d9/0x3c0 fs/namespace.c:3861 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd The issue was bisected to: commit d48a7b3a Author: Chao Yu <chao@kernel.org> Date: Mon Jan 9 03:49:20 2023 +0000 f2fs: fix to do sanity check on extent cache correctly The root cause is we applied both v1 and v2 of the patch, v2 is the right fix, so it needs to revert v1 in order to fix reported issue. v1: commit d48a7b3a ("f2fs: fix to do sanity check on extent cache correctly") https://lore.kernel.org/lkml/20230109034920.492914-1-chao@kernel.org/ v2: commit 269d1194 ("f2fs: fix to do sanity check on extent cache correctly") https://lore.kernel.org/lkml/20230207134808.1827869-1-chao@kernel.org/ Reported-by: syzbot+601018296973a481f302@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-f2fs-devel/000000000000fcf0690600e4d04d@google.com/ Fixes: d48a7b3a ("f2fs: fix to do sanity check on extent cache correctly") Signed-off-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Minjie Du authored
Simplify code pattern of 'folio->index + folio_nr_pages(folio)' by using the existing helper folio_next_index(). Signed-off-by:
Minjie Du <duminjie@vivo.com> Reviewed-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Chunhai Guo authored
Now f2fs support four block allocation modes: lfs, adaptive, fragment:segment, fragment:block. Only lfs mode is allowed with zoned block device feature. Fixes: 6691d940 ("f2fs: introduce fragment allocation mode mount option") Signed-off-by:
Chunhai Guo <guochunhai@vivo.com> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Shin'ichiro Kawasaki authored
The commit 25f90805 ("f2fs: add async reset zone command support") introduced "async reset zone commands" by calling __submit_zone_reset_cmd() in async discard operations. However, __submit_zone_reset_cmd() is called regardless of zone type of discard target zone. When devices have conventional zones, zone reset commands are sent to the conventional zones and cause I/O errors. Avoid the I/O errors by checking that the discard target zone type is sequential write required. If not, handle the discard operation in same manner as non-zoned, regular block devices. For that purpose, add a new helper function f2fs_bdev_index() which gets index of the zone reset target device. Fixes: 25f90805 ("f2fs: add async reset zone command support") Signed-off-by:
Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Reviewed-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
f2fs won't compress non-full cluster in tail of file, let's skip dirtying and rewrite such cluster during f2fs_ioc_{,de}compress_file. Signed-off-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
This patch allows f2fs_ioc_{,de}compress_file() to be interrupted, so that, userspace won't be blocked when manual {,de}compression on large file is interrupted by signal. Signed-off-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Christoph Hellwig authored
f2fs_scan_devices reopens the main device since the very beginning, which has always been useless, and also means that we don't pass the right holder for the reopen, which now leads to a warning as the core super.c holder ops aren't passed in for the reopen. Fixes: 3c62be17 ("f2fs: support multiple devices") Fixes: 0718afd4 ("block: introduce holder ops") Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
Compression option in inode should not be changed after they have been used, however, it may happen in below race case: Thread A Thread B - f2fs_ioc_set_compress_option - check f2fs_is_mmap_file() - check get_dirty_pages() - check F2FS_HAS_BLOCKS() - f2fs_file_mmap - set_inode_flag(FI_MMAP_FILE) - fault - do_page_mkwrite - f2fs_vm_page_mkwrite - f2fs_get_block_locked - fault_dirty_shared_page - set_page_dirty - update i_compress_algorithm - update i_log_cluster_size - update i_cluster_size Avoid such race condition by covering f2fs_file_mmap() w/ i_sem lock, meanwhile add mmap file check condition in f2fs_may_compress() as well. Fixes: e1e8debe ("f2fs: add F2FS_IOC_SET_COMPRESS_OPTION ioctl") Signed-off-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Randy Dunlap authored
Correct spelling problems as identified by codespell. Fixes: 9e615dbb ("f2fs: add missing description for ipu_policy node") Fixes: b2e4a2b3 ("f2fs: expose discard related parameters in sysfs") Fixes: 846ae671 ("f2fs: expose extension_list sysfs entry") Signed-off-by:
Randy Dunlap <rdunlap@infradead.org> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Chao Yu <chao@kernel.org> Cc: linux-f2fs-devel@lists.sourceforge.net Cc: Yangtao Li <frank.li@vivo.com> Cc: Konstantin Vyshetsky <vkon@google.com> Reviewed-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
https://bugzilla.kernel.org/show_bug.cgi?id=216050 Somehow we're getting a page which has a different mapping. Let's avoid the infinite loop. Cc: <stable@vger.kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
Let's flush the inode being aborted atomic operation to avoid stale dirty inode during eviction in this call stack: f2fs_mark_inode_dirty_sync+0x22/0x40 [f2fs] f2fs_abort_atomic_write+0xc4/0xf0 [f2fs] f2fs_evict_inode+0x3f/0x690 [f2fs] ? sugov_start+0x140/0x140 evict+0xc3/0x1c0 evict_inodes+0x17b/0x210 generic_shutdown_super+0x32/0x120 kill_block_super+0x21/0x50 deactivate_locked_super+0x31/0x90 cleanup_mnt+0x100/0x160 task_work_run+0x59/0x90 do_exit+0x33b/0xa50 do_group_exit+0x2d/0x80 __x64_sys_exit_group+0x14/0x20 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd This triggers f2fs_bug_on() in f2fs_evict_inode: f2fs_bug_on(sbi, is_inode_flag_set(inode, FI_DIRTY_INODE)); This fixes the syzbot report: loop0: detected capacity change from 0 to 131072 F2FS-fs (loop0): invalid crc value F2FS-fs (loop0): Found nat_bits in checkpoint F2FS-fs (loop0): Mounted with checkpoint version = 48b305e4 ------------[ cut here ]------------ kernel BUG at fs/f2fs/inode.c:869! invalid opcode: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 5014 Comm: syz-executor220 Not tainted 6.4.0-syzkaller-11479-g6cd06ab1 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/27/2023 RIP: 0010:f2fs_evict_inode+0x172d/0x1e00 fs/f2fs/inode.c:869 Code: ff df 48 c1 ea 03 80 3c 02 00 0f 85 6a 06 00 00 8b 75 40 ba 01 00 00 00 4c 89 e7 e8 6d ce 06 00 e9 aa fc ff ff e8 63 22 e2 fd <0f> 0b e8 5c 22 e2 fd 48 c7 c0 a8 3a 18 8d 48 ba 00 00 00 00 00 fc RSP: 0018:ffffc90003a6fa00 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 RDX: ffff8880273b8000 RSI: ffffffff83a2bd0d RDI: 0000000000000007 RBP: ffff888077db91b0 R08: 0000000000000007 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff888029a3c000 R13: ffff888077db9660 R14: ffff888029a3c0b8 R15: ffff888077db9c50 FS: 0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1909bb9000 CR3: 00000000276a9000 CR4: 0000000000350ef0 Call Trace: <TASK> evict+0x2ed/0x6b0 fs/inode.c:665 dispose_list+0x117/0x1e0 fs/inode.c:698 evict_inodes+0x345/0x440 fs/inode.c:748 generic_shutdown_super+0xaf/0x480 fs/super.c:478 kill_block_super+0x64/0xb0 fs/super.c:1417 kill_f2fs_super+0x2af/0x3c0 fs/f2fs/super.c:4704 deactivate_locked_super+0x98/0x160 fs/super.c:330 deactivate_super+0xb1/0xd0 fs/super.c:361 cleanup_mnt+0x2ae/0x3d0 fs/namespace.c:1254 task_work_run+0x16f/0x270 kernel/task_work.c:179 exit_task_work include/linux/task_work.h:38 [inline] do_exit+0xa9a/0x29a0 kernel/exit.c:874 do_group_exit+0xd4/0x2a0 kernel/exit.c:1024 __do_sys_exit_group kernel/exit.c:1035 [inline] __se_sys_exit_group kernel/exit.c:1033 [inline] __x64_sys_exit_group+0x3e/0x50 kernel/exit.c:1033 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f309be71a09 Code: Unable to access opcode bytes at 0x7f309be719df. RSP: 002b:00007fff171df518 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7 RAX: ffffffffffffffda RBX: 00007f309bef7330 RCX: 00007f309be71a09 RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000001 RBP: 0000000000000001 R08: ffffffffffffffc0 R09: 00007f309bef1e40 R10: 0000000000010600 R11: 0000000000000246 R12: 00007f309bef7330 R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000001 </TASK> Modules linked in: ---[ end trace 0000000000000000 ]--- RIP: 0010:f2fs_evict_inode+0x172d/0x1e00 fs/f2fs/inode.c:869 Code: ff df 48 c1 ea 03 80 3c 02 00 0f 85 6a 06 00 00 8b 75 40 ba 01 00 00 00 4c 89 e7 e8 6d ce 06 00 e9 aa fc ff ff e8 63 22 e2 fd <0f> 0b e8 5c 22 e2 fd 48 c7 c0 a8 3a 18 8d 48 ba 00 00 00 00 00 fc RSP: 0018:ffffc90003a6fa00 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 RDX: ffff8880273b8000 RSI: ffffffff83a2bd0d RDI: 0000000000000007 RBP: ffff888077db91b0 R08: 0000000000000007 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff888029a3c000 R13: ffff888077db9660 R14: ffff888029a3c0b8 R15: ffff888077db9c50 FS: 0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1909bb9000 CR3: 00000000276a9000 CR4: 0000000000350ef0 Cc: <stable@vger.kernel.org> Reported-and-tested-by: syzbot+e1246909d526a9d470fa@syzkaller.appspotmail.com Reviewed-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Chao Yu authored
f2fs_compress_alloc_page() uses mempool to allocate memory, it never fail, don't handle error case in its callers. Signed-off-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
Jaegeuk Kim authored
This reverts commit bfd47662. Shinichiro Kawasaki reported: When I ran workloads on f2fs using v6.5-rcX with fixes [1][2] and a zoned block devices with 4kb logical block size, I observe mount failure as follows. When I revert this commit, the failure goes away. [ 167.781975][ T1555] F2FS-fs (dm-0): IO Block Size: 4 KB [ 167.890728][ T1555] F2FS-fs (dm-0): Found nat_bits in checkpoint [ 171.482588][ T1555] F2FS-fs (dm-0): Zone without valid block has non-zero write pointer. Reset the write pointer: wp[0x1300,0x8] [ 171.496000][ T1555] F2FS-fs (dm-0): (0) : Unaligned zone reset attempted (block 280000 + 80000) [ 171.505037][ T1555] F2FS-fs (dm-0): Discard zone failed: (errno=-5) The patch replaced "sbi->log_blocksize - SECTOR_SHIFT" with "sbi->log_sectors_per_block". However, I think these two are not equal when the device has 4k logical block size. The former uses Linux kernel sector size 512 byte. The latter use 512b sector size or 4kb sector size depending on the device. mkfs.f2fs obtains logical block size via BLKSSZGET ioctl from the device and reflects it to the value sbi->log_sector_size_per_block. This causes unexpected write pointer calculations in check_zone_write_pointer(). This resulted in unexpected zone reset and the mount failure. [1] https://lkml.kernel.org/linux-f2fs-devel/20230711050101.GA19128@lst.de/ [2] https://lore.kernel.org/linux-f2fs-devel/20230804091556.2372567-1-shinichiro.kawasaki@wdc.com/ Cc: stable@vger.kernel.org Reported-by:
Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Fixes: bfd47662 ("f2fs: clean up w/ sbi->log_sectors_per_block") Reviewed-by:
Chao Yu <chao@kernel.org> Signed-off-by:
Jaegeuk Kim <jaegeuk@kernel.org>
-
- 09 Jul, 2023 10 commits
-
-
Linus Torvalds authored
-
Linus Torvalds authored
We just sorted the entries and fields last release, so just out of a perverse sense of curiosity, I decided to see if we can keep things ordered for even just one release. The answer is "No. No we cannot". I suggest that all kernel developers will need weekly training sessions, involving a lot of Big Bird and Sesame Street. And at the yearly maintainer summit, we will all sing the alphabet song together. I doubt I will keep doing this. At some point "perverse sense of curiosity" turns into just a cold dark place filled with sadness and despair. Repeats: 80e62bc8 ("MAINTAINERS: re-sort all entries and fields") Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.infradead.org/users/hch/dma-mappingLinus Torvalds authored
Pull dma-mapping fixes from Christoph Hellwig: - swiotlb area sizing fixes (Petr Tesarik) * tag 'dma-mapping-6.5-2023-07-09' of git://git.infradead.org/users/hch/dma-mapping: swiotlb: reduce the number of areas to match actual memory pool size swiotlb: always set the number of areas before allocating the pool
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull irq update from Borislav Petkov: - Optimize IRQ domain's name assignment * tag 'irq_urgent_for_v6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqdomain: Use return value of strreplace()
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 fpu fix from Borislav Petkov: - Do FPU AP initialization on Xen PV too which got missed by the recent boot reordering work * tag 'x86_urgent_for_v6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/xen: Fix secondary processors' FPU initialization
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 fix from Thomas Gleixner: "A single fix for the mechanism to park CPUs with an INIT IPI. On shutdown or kexec, the kernel tries to park the non-boot CPUs with an INIT IPI. But the same code path is also used by the crash utility. If the CPU which panics is not the boot CPU then it sends an INIT IPI to the boot CPU which resets the machine. Prevent this by validating that the CPU which runs the stop mechanism is the boot CPU. If not, leave the other CPUs in HLT" * tag 'x86-core-2023-07-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/smp: Don't send INIT to boot CPU
-
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linuxLinus Torvalds authored
Pull MIPS fixes from Thomas Bogendoerfer: - fixes for KVM - fix for loongson build and cpu probing - DT fixes * tag 'mips_6.5_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: kvm: Fix build error with KVM_MIPS_DEBUG_COP0_COUNTERS enabled MIPS: dts: add missing space before { MIPS: Loongson: Fix build error when make modules_install MIPS: KVM: Fix NULL pointer dereference MIPS: Loongson: Fix cpu_probe_loongson() again
-
git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds authored
Pull xfs fix from Darrick Wong: "Nothing exciting here, just getting rid of a gcc warning that I got tired of seeing when I turn on gcov" * tag 'xfs-6.5-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: fix uninit warning in xfs_growfs_data
-
git://git.samba.org/sfrench/cifs-2.6Linus Torvalds authored
Pull more smb client updates from Steve French: - fix potential use after free in unmount - minor cleanup - add worker to cleanup stale directory leases * tag '6.5-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6: cifs: Add a laundromat thread for cached directories smb: client: remove redundant pointer 'server' cifs: fix session state transition to avoid use-after-free issue
-
https://github.com/jonmason/ntbLinus Torvalds authored
Pull NTB updates from Jon Mason: "Fixes for pci_clean_master, error handling in driver inits, and various other issues/bugs" * tag 'ntb-6.5' of https://github.com/jonmason/ntb: ntb: hw: amd: Fix debugfs_create_dir error checking ntb.rst: Fix copy and paste error ntb_netdev: Fix module_init problem ntb: intel: Remove redundant pci_clear_master ntb: epf: Remove redundant pci_clear_master ntb_hw_amd: Remove redundant pci_clear_master ntb: idt: drop redundant pci_enable_pcie_error_reporting() MAINTAINERS: git://github -> https://github.com for jonmason NTB: EPF: fix possible memory leak in pci_vntb_probe() NTB: ntb_tool: Add check for devm_kcalloc NTB: ntb_transport: fix possible memory leak while device_register() fails ntb: intel: Fix error handling in intel_ntb_pci_driver_init() NTB: amd: Fix error handling in amd_ntb_pci_driver_init() ntb: idt: Fix error handling in idt_pci_driver_init()
-
- 08 Jul, 2023 14 commits
-
-
Hugh Dickins authored
Lockdep is certainly right to complain about (&vma->vm_lock->lock){++++}-{3:3}, at: vma_start_write+0x2d/0x3f but task is already holding lock: (&mapping->i_mmap_rwsem){+.+.}-{3:3}, at: mmap_region+0x4dc/0x6db Invert those to the usual ordering. Fixes: 33313a74 ("mm: lock newly mapped VMA which can be modified after it becomes visible") Cc: stable@vger.kernel.org Signed-off-by:
Hugh Dickins <hughd@google.com> Tested-by:
Suren Baghdasaryan <surenb@google.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
Merge tag 'mm-hotfixes-stable-2023-07-08-10-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull hotfixes from Andrew Morton: "16 hotfixes. Six are cc:stable and the remainder address post-6.4 issues" The merge undoes the disabling of the CONFIG_PER_VMA_LOCK feature, since it was all hopefully fixed in mainline. * tag 'mm-hotfixes-stable-2023-07-08-10-43' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: lib: dhry: fix sleeping allocations inside non-preemptable section kasan, slub: fix HW_TAGS zeroing with slub_debug kasan: fix type cast in memory_is_poisoned_n mailmap: add entries for Heiko Stuebner mailmap: update manpage link bootmem: remove the vmemmap pages from kmemleak in free_bootmem_page MAINTAINERS: add linux-next info mailmap: add Markus Schneider-Pargmann writeback: account the number of pages written back mm: call arch_swap_restore() from do_swap_page() squashfs: fix cache race with migration mm/hugetlb.c: fix a bug within a BUG(): inconsistent pte comparison docs: update ocfs2-devel mailing list address MAINTAINERS: update ocfs2-devel mailing list address mm: disable CONFIG_PER_VMA_LOCK until its fixed fork: lock VMAs of the parent process when forking
-
Suren Baghdasaryan authored
When forking a child process, the parent write-protects anonymous pages and COW-shares them with the child being forked using copy_present_pte(). We must not take any concurrent page faults on the source vma's as they are being processed, as we expect both the vma and the pte's behind it to be stable. For example, the anon_vma_fork() expects the parents vma->anon_vma to not change during the vma copy. A concurrent page fault on a page newly marked read-only by the page copy might trigger wp_page_copy() and a anon_vma_prepare(vma) on the source vma, defeating the anon_vma_clone() that wasn't done because the parent vma originally didn't have an anon_vma, but we now might end up copying a pte entry for a page that has one. Before the per-vma lock based changes, the mmap_lock guaranteed exclusion with concurrent page faults. But now we need to do a vma_start_write() to make sure no concurrent faults happen on this vma while it is being processed. This fix can potentially regress some fork-heavy workloads. Kernel build time did not show noticeable regression on a 56-core machine while a stress test mapping 10000 VMAs and forking 5000 times in a tight loop shows ~5% regression. If such fork time regression is unacceptable, disabling CONFIG_PER_VMA_LOCK should restore its performance. Further optimizations are possible if this regression proves to be problematic. Suggested-by:
David Hildenbrand <david@redhat.com> Reported-by:
Jiri Slaby <jirislaby@kernel.org> Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/Reported-by:
Holger Hoffstätte <holger@applied-asynchrony.com> Closes: https://lore.kernel.org/all/b198d649-f4bf-b971-31d0-e8433ec2a34c@applied-asynchrony.com/Reported-by:
Jacob Young <jacobly.alt@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624 Fixes: 0bff0aae ("x86/mm: try VMA lock-based page fault handling first") Cc: stable@vger.kernel.org Signed-off-by:
Suren Baghdasaryan <surenb@google.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Suren Baghdasaryan authored
mmap_region adds a newly created VMA into VMA tree and might modify it afterwards before dropping the mmap_lock. This poses a problem for page faults handled under per-VMA locks because they don't take the mmap_lock and can stumble on this VMA while it's still being modified. Currently this does not pose a problem since post-addition modifications are done only for file-backed VMAs, which are not handled under per-VMA lock. However, once support for handling file-backed page faults with per-VMA locks is added, this will become a race. Fix this by write-locking the VMA before inserting it into the VMA tree. Other places where a new VMA is added into VMA tree do not modify it after the insertion, so do not need the same locking. Cc: stable@vger.kernel.org Signed-off-by:
Suren Baghdasaryan <surenb@google.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Suren Baghdasaryan authored
With recent changes necessitating mmap_lock to be held for write while expanding a stack, per-VMA locks should follow the same rules and be write-locked to prevent page faults into the VMA being expanded. Add the necessary locking. Cc: stable@vger.kernel.org Signed-off-by:
Suren Baghdasaryan <surenb@google.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds authored
Pull more SCSI updates from James Bottomley: "A few late arriving patches that missed the initial pull request. It's mostly bug fixes (the dt-bindings is a fix for the initial pull)" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: core: Remove unused function declaration scsi: target: docs: Remove tcm_mod_builder.py scsi: target: iblock: Quiet bool conversion warning with pr_preempt use scsi: dt-bindings: ufs: qcom: Fix ICE phandle scsi: core: Simplify scsi_cdl_check_cmd() scsi: isci: Fix comment typo scsi: smartpqi: Replace one-element arrays with flexible-array members scsi: target: tcmu: Replace strlcpy() with strscpy() scsi: ncr53c8xx: Replace strlcpy() with strscpy() scsi: lpfc: Fix lpfc_name struct packing
-
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linuxLinus Torvalds authored
Pull more i2c updates from Wolfram Sang: - xiic patch should have been in the original pull but slipped through - mpc patch fixes a build regression - nomadik cleanup * tag 'i2c-for-6.5-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: mpc: Drop unused variable i2c: nomadik: Remove a useless call in the remove function i2c: xiic: Don't try to handle more interrupt events after error
-
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linuxLinus Torvalds authored
Pull hardening fixes from Kees Cook: - Check for NULL bdev in LoadPin (Matthias Kaehlcke) - Revert unwanted KUnit FORTIFY build default - Fix 1-element array causing boot warnings with xhci-hub * tag 'hardening-v6.5-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: usb: ch9: Replace bmSublinkSpeedAttr 1-element array with flexible array Revert "fortify: Allow KUnit test to build without FORTIFY" dm: verity-loadpin: Add NULL pointer check for 'bdev' parameter
-
Anup Sharma authored
The debugfs_create_dir function returns ERR_PTR in case of error, and the only correct way to check if an error occurred is 'IS_ERR' inline function. This patch will replace the null-comparison with IS_ERR. Signed-off-by:
Anup Sharma <anupnewsmail@gmail.com> Suggested-by:
Ivan Orlov <ivan.orlov0322@gmail.com> Signed-off-by:
Jon Mason <jdmason@kudzu.us>
-
Linus Torvalds authored
Merge tag 'perf-tools-for-v6.5-2-2023-07-06' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next Pull more perf tools updates from Namhyung Kim: "These are remaining changes and fixes for this cycle. Build: - Allow generating vmlinux.h from BTF using `make GEN_VMLINUX_H=1` and skip if the vmlinux has no BTF. - Replace deprecated clang -target xxx option by --target=xxx. perf record: - Print event attributes with well known type and config symbols in the debug output like below: # perf record -e cycles,cpu-clock -C0 -vv true <SNIP> ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) size 136 config 0 (PERF_COUNT_HW_CPU_CYCLES) { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|CPU|PERIOD|IDENTIFIER read_format ID disabled 1 inherit 1 freq 1 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 5 ------------------------------------------------------------ perf_event_attr: type 1 (PERF_TYPE_SOFTWARE) size 136 config 0 (PERF_COUNT_SW_CPU_CLOCK) { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|CPU|PERIOD|IDENTIFIER read_format ID disabled 1 inherit 1 freq 1 sample_id_all 1 exclude_guest 1 - Update AMD IBS event error message since it now support per-process profiling but no priviledge filters. $ sudo perf record -e ibs_op//k -C 0 Error: AMD IBS doesn't support privilege filtering. Try again without the privilege modifiers (like 'k') at the end. perf lock contention: - Support CSV style output using -x option $ sudo perf lock con -ab -x, sleep 1 # output: contended, total wait, max wait, avg wait, type, caller 19, 194232, 21415, 10222, spinlock, process_one_work+0x1f0 15, 162748, 23843, 10849, rwsem:R, do_user_addr_fault+0x40e 4, 86740, 23415, 21685, rwlock:R, ep_poll_callback+0x2d 1, 84281, 84281, 84281, mutex, iwl_mvm_async_handlers_wk+0x135 8, 67608, 27404, 8451, spinlock, __queue_work+0x174 3, 58616, 31125, 19538, rwsem:W, do_mprotect_pkey+0xff 3, 52953, 21172, 17651, rwlock:W, do_epoll_wait+0x248 2, 30324, 19704, 15162, rwsem:R, do_madvise+0x3ad 1, 24619, 24619, 24619, spinlock, rcu_core+0xd4 - Add --output option to save the data to a file not to be interfered by other debug messages. Test: - Fix event parsing test on ARM where there's no raw PMU nor supports PERF_PMU_CAP_EXTENDED_HW_TYPE. - Update the lock contention test case for CSV output. - Fix a segfault in the daemon command test. Vendor events (JSON): - Add has_event() to check if the given event is available on system at runtime. On Intel machines, some transaction events may not be present when TSC extensions are disabled. - Update Intel event metrics. Misc: - Sort symbols by name using an external array of pointers instead of a rbtree node in the symbol. This will save 16-bytes or 24-bytes per symbol whether the sorting is actually requested or not. - Fix unwinding DWARF callstacks using libdw when --symfs option is used" * tag 'perf-tools-for-v6.5-2-2023-07-06' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next: (38 commits) perf test: Fix event parsing test when PERF_PMU_CAP_EXTENDED_HW_TYPE isn't supported. perf test: Fix event parsing test on Arm perf evsel amd: Fix IBS error message perf: unwind: Fix symfs with libdw perf symbol: Fix uninitialized return value in symbols__find_by_name() perf test: Test perf lock contention CSV output perf lock contention: Add --output option perf lock contention: Add -x option for CSV style output perf lock: Remove stale comments perf vendor events intel: Update tigerlake to 1.13 perf vendor events intel: Update skylakex to 1.31 perf vendor events intel: Update skylake to 57 perf vendor events intel: Update sapphirerapids to 1.14 perf vendor events intel: Update icelakex to 1.21 perf vendor events intel: Update icelake to 1.19 perf vendor events intel: Update cascadelakex to 1.19 perf vendor events intel: Update meteorlake to 1.03 perf vendor events intel: Add rocketlake events/metrics perf vendor metrics intel: Make transaction metrics conditional perf jevents: Support for has_event function ...
-
https://github.com/norov/linuxLinus Torvalds authored
Pull bitmap updates from Yury Norov: "Fixes for different bitmap pieces: - lib/test_bitmap: increment failure counter properly The tests that don't use expect_eq() macro to determine that a test is failured must increment failed_tests explicitly. - lib/bitmap: drop optimization of bitmap_{from,to}_arr64 bitmap_{from,to}_arr64() optimization is overly optimistic on 32-bit LE architectures when it's wired to bitmap_copy_clear_tail(). - nodemask: Drop duplicate check in for_each_node_mask() As the return value type of first_node() became unsigned, the node >= 0 became unnecessary. - cpumask: fix function description kernel-doc notation - MAINTAINERS: Add bits.h and bitfield.h to the BITMAP API record Add linux/bits.h and linux/bitfield.h for visibility" * tag 'bitmap-6.5-rc1' of https://github.com/norov/linux: MAINTAINERS: Add bitfield.h to the BITMAP API record MAINTAINERS: Add bits.h to the BITMAP API record cpumask: fix function description kernel-doc notation nodemask: Drop duplicate check in for_each_node_mask() lib/bitmap: drop optimization of bitmap_{from,to}_arr64 lib/test_bitmap: increment failure counter properly
-
Geert Uytterhoeven authored
The Smatch static checker reports the following warnings: lib/dhry_run.c:38 dhry_benchmark() warn: sleeping in atomic context lib/dhry_run.c:43 dhry_benchmark() warn: sleeping in atomic context Indeed, dhry() does sleeping allocations inside the non-preemptable section delimited by get_cpu()/put_cpu(). Fix this by using atomic allocations instead. Add error handling, as atomic these allocations may fail. Link: https://lkml.kernel.org/r/bac6d517818a7cd8efe217c1ad649fffab9cc371.1688568764.git.geert+renesas@glider.be Fixes: 13684e96 ("lib: dhry: fix unstable smp_processor_id(_) usage") Reported-by:
Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/r/0469eb3a-02eb-4b41-b189-de20b931fa56@moroto.mountainSigned-off-by:
Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Andrey Konovalov authored
Commit 946fa0db ("mm/slub: extend redzone check to extra allocated kmalloc space than requested") added precise kmalloc redzone poisoning to the slub_debug functionality. However, this commit didn't account for HW_TAGS KASAN fully initializing the object via its built-in memory initialization feature. Even though HW_TAGS KASAN memory initialization contains special memory initialization handling for when slub_debug is enabled, it does not account for in-object slub_debug redzones. As a result, HW_TAGS KASAN can overwrite these redzones and cause false-positive slub_debug reports. To fix the issue, avoid HW_TAGS KASAN memory initialization when slub_debug is enabled altogether. Implement this by moving the __slub_debug_enabled check to slab_post_alloc_hook. Common slab code seems like a more appropriate place for a slub_debug check anyway. Link: https://lkml.kernel.org/r/678ac92ab790dba9198f9ca14f405651b97c8502.1688561016.git.andreyknvl@google.com Fixes: 946fa0db ("mm/slub: extend redzone check to extra allocated kmalloc space than requested") Signed-off-by:
Andrey Konovalov <andreyknvl@google.com> Reported-by:
Will Deacon <will@kernel.org> Acked-by:
Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Feng Tang <feng.tang@intel.com> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: kasan-dev@googlegroups.com Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Collingbourne <pcc@google.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-
Andrey Konovalov authored
Commit bb6e04a1 ("kasan: use internal prototypes matching gcc-13 builtins") introduced a bug into the memory_is_poisoned_n implementation: it effectively removed the cast to a signed integer type after applying KASAN_GRANULE_MASK. As a result, KASAN started failing to properly check memset, memcpy, and other similar functions. Fix the bug by adding the cast back (through an additional signed integer variable to make the code more readable). Link: https://lkml.kernel.org/r/8c9e0251c2b8b81016255709d4ec42942dcaf018.1688431866.git.andreyknvl@google.com Fixes: bb6e04a1 ("kasan: use internal prototypes matching gcc-13 builtins") Signed-off-by:
Andrey Konovalov <andreyknvl@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Marco Elver <elver@google.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
-