- 14 Jun, 2023 1 commit
-
-
Mostafa Saleh authored
When the use of pointer authentication is enabled in the kernel it applies to both the kernel itself as well as KVM's nVHE hypervisor. The same keys are used for both the kernel and the nVHE hypervisor, which is less than desirable for pKVM as the host is not trusted at runtime. Naturally, the fix is to use a different set of keys for the hypervisor when running in protected mode. Have the host generate a new set of keys for the hypervisor before deprivileging the kernel. While there might be other sources of random directly available at EL2, this keeps the implementation simple, and the host is trusted anyways until it is deprivileged. Since the host and hypervisor no longer share a set of pointer authentication keys, start context switching them on the host entry/exit path exactly as we do for guest entry/exit. There is no need to handle CPU migration as the nVHE code is not migratable in the first place. Signed-off-by: Mostafa Saleh <smostafa@google.com> Link: https://lore.kernel.org/r/20230614122600.2098901-1-smostafa@google.comSigned-off-by: Oliver Upton <oliver.upton@linux.dev>
-
- 13 Jun, 2023 1 commit
-
-
Dan Carpenter authored
Smatch detected this bug: arch/arm64/kvm/arch_timer.c:1425 kvm_timer_hyp_init() warn: missing unwind goto? There are two resources to be freed the vtimer and ptimer. The line that Smatch complains about should free the vtimer first before returning and then after that cleanup code should free the ptimer. I've added a out_free_ptimer_irq to free the ptimer and renamed the existing label to out_free_vtimer_irq. Fixes: 9e01dc76 ("KVM: arm/arm64: arch_timer: Assign the phys timer on VHE systems") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/72fffc35-7669-40b1-9d14-113c43269cf3@kili.mountainSigned-off-by: Oliver Upton <oliver.upton@linux.dev>
-
- 30 May, 2023 1 commit
-
-
Mostafa Saleh authored
CONFIG_ARM64_BTI_KERNEL compiles the kernel to support ARMv8.5-BTI. However, the nvhe code doesn't make use of it as it doesn't map any pages with Guarded Page(GP) bit. kvm pgtable code is modified to map executable pages with GP bit if BTI is enabled for the kernel. At hyp init, SCTLR_EL2.BT is set to 1 to match EL1 configuration (SCTLR_EL1.BT1) set in bti_enable(). One difference between kernel and nvhe code, is that the kernel maps .text with GP while nvhe maps all the executable pages, this makes nvhe code need to deal with special initialization code coming from other executable sections (.idmap.text). For this we need to add bti instruction at the beginning of __kvm_handle_stub_hvc as it can be called by __host_hvc through branch instruction(br) and unlike SYM_FUNC_START, SYM_CODE_START doesn’t add bti instruction at the beginning, and it can’t be modified to add it as it is used with vector tables. Another solution which is more intrusive is to convert __kvm_handle_stub_hvc to a function and inject “bti jc” instead of “bti c” in SYM_FUNC_START Signed-off-by: Mostafa Saleh <smostafa@google.com> Link: https://lore.kernel.org/r/20230530150845.2856828-1-smostafa@google.comSigned-off-by: Oliver Upton <oliver.upton@linux.dev>
-
- 21 May, 2023 1 commit
-
-
Marc Zyngier authored
CTR_EL0 can often be used in userspace, and it would be nice if KVM didn't have to emulate it unnecessarily. While it isn't possible to trap the cache configuration registers independently from CTR_EL0 in the base ARMv8.0 architecture, FEAT_EVT allows these cache configuration registers (CCSIDR_EL1, CCSIDR2_EL1, CLIDR_EL1 and CSSELR_EL1) to be trapped independently by setting HCR_EL2.TID4. Switch to using TID4 instead of TID2 in the cases where FEAT_EVT is available *and* that KVM doesn't need to sanitise CTR_EL0 to paper over mismatched cache configurations. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230515170016.965378-1-maz@kernel.orgSigned-off-by: Oliver Upton <oliver.upton@linux.dev>
-
- 14 May, 2023 13 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxlLinus Torvalds authored
Pull compute express link fixes from Dan Williams: - Fix a compilation issue with DEFINE_STATIC_SRCU() in the unit tests - Fix leaking kernel memory to a root-only sysfs attribute * tag 'cxl-fixes-6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl: Add missing return to cdat read error path tools/testing/cxl: Use DEFINE_STATIC_SRCU()
-
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linuxLinus Torvalds authored
Pull parisc architecture fixes from Helge Deller: - Fix encoding of swp_entry due to added SWP_EXCLUSIVE flag - Include reboot.h to avoid gcc-12 compiler warning * tag 'parisc-for-6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Fix encoding of swp_entry due to added SWP_EXCLUSIVE flag parisc: kexec: include reboot.h
-
git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds authored
Pull ARM fixes from Russell King: - fix unwinder for uleb128 case - fix kernel-doc warnings for HP Jornada 7xx - fix unbalanced stack on vfp success path * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 9297/1: vfp: avoid unbalanced stack on 'success' return path ARM: 9296/1: HP Jornada 7XX: fix kernel-doc warnings ARM: 9295/1: unwind:fix unwind abort for uleb128 case
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull locking fix from Borislav Petkov: - Make sure __down_read_common() is always inlined so that the callers' names land in traceevents output and thus the blocked function can be identified * tag 'locking_urgent_for_v6.4_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/rwsem: Add __always_inline annotation to __down_read_common() and inlined callers
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull perf fixes from Borislav Petkov: - Make sure the PEBS buffer is flushed before reprogramming the hardware so that the correct record sizes are used - Update the sample size for AMD BRS events - Fix a confusion with using the same on-stack struct with different events in the event processing path * tag 'perf_urgent_for_v6.4_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/ds: Flush PEBS DS when changing PEBS_DATA_CFG perf/x86: Fix missing sample size update on AMD BRS perf/core: Fix perf_sample_data not properly initialized for different swevents in perf_tp_event()
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull scheduler fix from Borislav Petkov: - Fix a couple of kernel-doc warnings * tag 'sched_urgent_for_v6.4_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: fix cid_lock kernel-doc warnings
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 fix from Borislav Petkov: - Add the required PCI IDs so that the generic SMN accesses provided by amd_nb.c work for drivers which switch to them. Add a PCI device ID to k10temp's table so that latter is loaded on such systems too * tag 'x86_urgent_for_v6.4_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: hwmon: (k10temp) Add PCI ID for family 19, model 78h x86/amd_nb: Add PCI ID for family 19h model 78h
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull timer fix from Borislav Petkov: - Prevent CPU state corruption when an active clockevent broadcast device is replaced while the system is already in oneshot mode * tag 'timers_urgent_for_v6.4_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tick/broadcast: Make broadcast device replacement work correctly
-
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4Linus Torvalds authored
Pull ext4 fixes from Ted Ts'o: "Some ext4 bug fixes (mostly to address Syzbot reports)" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: bail out of ext4_xattr_ibody_get() fails for any reason ext4: add bounds checking in get_max_inline_xattr_value_size() ext4: add indication of ro vs r/w mounts in the mount message ext4: fix deadlock when converting an inline directory in nojournal mode ext4: improve error recovery code paths in __ext4_remount() ext4: improve error handling from ext4_dirhash() ext4: don't clear SB_RDONLY when remounting r/w until quota is re-enabled ext4: check iomap type only if ext4_iomap_begin() does not fail ext4: avoid a potential slab-out-of-bounds in ext4_group_desc_csum ext4: fix data races when using cached status extents ext4: avoid deadlock in fs reclaim with page writeback ext4: fix invalid free tracking in ext4_xattr_move_to_block() ext4: remove a BUG_ON in ext4_mb_release_group_pa() ext4: allow ext4_get_group_info() to fail ext4: fix lockdep warning when enabling MMP ext4: fix WARNING in mb_find_extent
-
git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdevLinus Torvalds authored
Pull fbdev fixes from Helge Deller: - use after free fix in imsttfb (Zheng Wang) - fix error handling in arcfb (Zongjie Li) - lots of whitespace cleanups (Thomas Zimmermann) - add 1920x1080 modedb entry (me) * tag 'fbdev-for-6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev: fbdev: stifb: Fix info entry in sti_struct on error path fbdev: modedb: Add 1920x1080 at 60 Hz video mode fbdev: imsttfb: Fix use after free bug in imsttfb_probe fbdev: vfb: Remove trailing whitespaces fbdev: valkyriefb: Remove trailing whitespaces fbdev: stifb: Remove trailing whitespaces fbdev: sa1100fb: Remove trailing whitespaces fbdev: platinumfb: Remove trailing whitespaces fbdev: p9100: Remove trailing whitespaces fbdev: maxinefb: Remove trailing whitespaces fbdev: macfb: Remove trailing whitespaces fbdev: hpfb: Remove trailing whitespaces fbdev: hgafb: Remove trailing whitespaces fbdev: g364fb: Remove trailing whitespaces fbdev: controlfb: Remove trailing whitespaces fbdev: cg14: Remove trailing whitespaces fbdev: atmel_lcdfb: Remove trailing whitespaces fbdev: 68328fb: Remove trailing whitespaces fbdev: arcfb: Fix error handling in arcfb_probe()
-
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds authored
Pull SCSI fix from James Bottomley: "A single small fix for the UFS driver to fix a power management failure" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: core: Fix I/O hang that occurs when BKOPS fails in W-LUN suspend
-
Helge Deller authored
Fix the __swp_offset() and __swp_entry() macros due to commit 6d239fc7 ("parisc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE") which introduced the SWP_EXCLUSIVE flag by reusing the _PAGE_ACCESSED flag. Reported-by: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de> Tested-by: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Helge Deller <deller@gmx.de> Fixes: 6d239fc7 ("parisc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE") Cc: <stable@vger.kernel.org> # v6.3+
-
- 13 May, 2023 17 commits
-
-
Theodore Ts'o authored
In ext4_update_inline_data(), if ext4_xattr_ibody_get() fails for any reason, it's best if we just fail as opposed to stumbling on, especially if the failure is EFSCORRUPTED. Cc: stable@kernel.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>
-
Theodore Ts'o authored
Normally the extended attributes in the inode body would have been checked when the inode is first opened, but if someone is writing to the block device while the file system is mounted, it's possible for the inode table to get corrupted. Add bounds checking to avoid reading beyond the end of allocated memory if this happens. Reported-by: syzbot+1966db24521e5f6e23f7@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?extid=1966db24521e5f6e23f7 Cc: stable@kernel.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>
-
Theodore Ts'o authored
Whether the file system is mounted read-only or read/write is more important than the quota mode, which we are already printing. Add the ro vs r/w indication since this can be helpful in debugging problems from the console log. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
-
Theodore Ts'o authored
In no journal mode, ext4_finish_convert_inline_dir() can self-deadlock by calling ext4_handle_dirty_dirblock() when it already has taken the directory lock. There is a similar self-deadlock in ext4_incvert_inline_data_nolock() for data files which we'll fix at the same time. A simple reproducer demonstrating the problem: mke2fs -Fq -t ext2 -O inline_data -b 4k /dev/vdc 64 mount -t ext4 -o dirsync /dev/vdc /vdc cd /vdc mkdir file0 cd file0 touch file0 touch file1 attr -s BurnSpaceInEA -V abcde . touch supercalifragilisticexpialidocious Cc: stable@kernel.org Link: https://lore.kernel.org/r/20230507021608.1290720-1-tytso@mit.edu Reported-by: syzbot+91dccab7c64e2850a4e5@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=ba84cc80a9491d65416bc7877e1650c87530fe8aSigned-off-by: Theodore Ts'o <tytso@mit.edu>
-
Theodore Ts'o authored
If there are failures while changing the mount options in __ext4_remount(), we need to restore the old mount options. This commit fixes two problem. The first is there is a chance that we will free the old quota file names before a potential failure leading to a use-after-free. The second problem addressed in this commit is if there is a failed read/write to read-only transition, if the quota has already been suspended, we need to renable quota handling. Cc: stable@kernel.org Link: https://lore.kernel.org/r/20230506142419.984260-2-tytso@mit.eduSigned-off-by: Theodore Ts'o <tytso@mit.edu>
-
Theodore Ts'o authored
The ext4_dirhash() will *almost* never fail, especially when the hash tree feature was first introduced. However, with the addition of support of encrypted, casefolded file names, that function can most certainly fail today. So make sure the callers of ext4_dirhash() properly check for failures, and reflect the errors back up to their callers. Cc: stable@kernel.org Link: https://lore.kernel.org/r/20230506142419.984260-1-tytso@mit.edu Reported-by: syzbot+394aa8a792cb99dbc837@syzkaller.appspotmail.com Reported-by: syzbot+344aaa8697ebd232bfc8@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=db56459ea4ac4a676ae4b4678f633e55da005a9bSigned-off-by: Theodore Ts'o <tytso@mit.edu>
-
Theodore Ts'o authored
When a file system currently mounted read/only is remounted read/write, if we clear the SB_RDONLY flag too early, before the quota is initialized, and there is another process/thread constantly attempting to create a directory, it's possible to trigger the WARN_ON_ONCE(dquot_initialize_needed(inode)); in ext4_xattr_block_set(), with the following stack trace: WARNING: CPU: 0 PID: 5338 at fs/ext4/xattr.c:2141 ext4_xattr_block_set+0x2ef2/0x3680 RIP: 0010:ext4_xattr_block_set+0x2ef2/0x3680 fs/ext4/xattr.c:2141 Call Trace: ext4_xattr_set_handle+0xcd4/0x15c0 fs/ext4/xattr.c:2458 ext4_initxattrs+0xa3/0x110 fs/ext4/xattr_security.c:44 security_inode_init_security+0x2df/0x3f0 security/security.c:1147 __ext4_new_inode+0x347e/0x43d0 fs/ext4/ialloc.c:1324 ext4_mkdir+0x425/0xce0 fs/ext4/namei.c:2992 vfs_mkdir+0x29d/0x450 fs/namei.c:4038 do_mkdirat+0x264/0x520 fs/namei.c:4061 __do_sys_mkdirat fs/namei.c:4076 [inline] __se_sys_mkdirat fs/namei.c:4074 [inline] __x64_sys_mkdirat+0x89/0xa0 fs/namei.c:4074 Cc: stable@kernel.org Link: https://lore.kernel.org/r/20230506142419.984260-1-tytso@mit.edu Reported-by: syzbot+6385d7d3065524c5ca6d@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=6513f6cb5cd6b5fc9f37e3bb70d273b94be9c34cSigned-off-by: Theodore Ts'o <tytso@mit.edu>
-
Baokun Li authored
When ext4_iomap_overwrite_begin() calls ext4_iomap_begin() map blocks may fail for some reason (e.g. memory allocation failure, bare disk write), and later because "iomap->type ! = IOMAP_MAPPED" triggers WARN_ON(). When ext4 iomap_begin() returns an error, it is normal that the type of iomap->type may not match the expectation. Therefore, we only determine if iomap->type is as expected when ext4_iomap_begin() is executed successfully. Cc: stable@kernel.org Reported-by: syzbot+08106c4b7d60702dbc14@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/00000000000015760b05f9b4eee9@google.comSigned-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230505132429.714648-1-libaokun1@huawei.comSigned-off-by: Theodore Ts'o <tytso@mit.edu>
-
Tudor Ambarus authored
When modifying the block device while it is mounted by the filesystem, syzbot reported the following: BUG: KASAN: slab-out-of-bounds in crc16+0x206/0x280 lib/crc16.c:58 Read of size 1 at addr ffff888075f5c0a8 by task syz-executor.2/15586 CPU: 1 PID: 15586 Comm: syz-executor.2 Not tainted 6.2.0-rc5-syzkaller-00205-gc9661827 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/12/2023 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x1b1/0x290 lib/dump_stack.c:106 print_address_description+0x74/0x340 mm/kasan/report.c:306 print_report+0x107/0x1f0 mm/kasan/report.c:417 kasan_report+0xcd/0x100 mm/kasan/report.c:517 crc16+0x206/0x280 lib/crc16.c:58 ext4_group_desc_csum+0x81b/0xb20 fs/ext4/super.c:3187 ext4_group_desc_csum_set+0x195/0x230 fs/ext4/super.c:3210 ext4_mb_clear_bb fs/ext4/mballoc.c:6027 [inline] ext4_free_blocks+0x191a/0x2810 fs/ext4/mballoc.c:6173 ext4_remove_blocks fs/ext4/extents.c:2527 [inline] ext4_ext_rm_leaf fs/ext4/extents.c:2710 [inline] ext4_ext_remove_space+0x24ef/0x46a0 fs/ext4/extents.c:2958 ext4_ext_truncate+0x177/0x220 fs/ext4/extents.c:4416 ext4_truncate+0xa6a/0xea0 fs/ext4/inode.c:4342 ext4_setattr+0x10c8/0x1930 fs/ext4/inode.c:5622 notify_change+0xe50/0x1100 fs/attr.c:482 do_truncate+0x200/0x2f0 fs/open.c:65 handle_truncate fs/namei.c:3216 [inline] do_open fs/namei.c:3561 [inline] path_openat+0x272b/0x2dd0 fs/namei.c:3714 do_filp_open+0x264/0x4f0 fs/namei.c:3741 do_sys_openat2+0x124/0x4e0 fs/open.c:1310 do_sys_open fs/open.c:1326 [inline] __do_sys_creat fs/open.c:1402 [inline] __se_sys_creat fs/open.c:1396 [inline] __x64_sys_creat+0x11f/0x160 fs/open.c:1396 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f72f8a8c0c9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f72f97e3168 EFLAGS: 00000246 ORIG_RAX: 0000000000000055 RAX: ffffffffffffffda RBX: 00007f72f8bac050 RCX: 00007f72f8a8c0c9 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000020000280 RBP: 00007f72f8ae7ae9 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ffd165348bf R14: 00007f72f97e3300 R15: 0000000000022000 Replace le16_to_cpu(sbi->s_es->s_desc_size) with sbi->s_desc_size It reduces ext4's compiled text size, and makes the code more efficient (we remove an extra indirect reference and a potential byte swap on big endian systems), and there is no downside. It also avoids the potential KASAN / syzkaller failure, as a bonus. Reported-by: syzbot+fc51227e7100c9294894@syzkaller.appspotmail.com Reported-by: syzbot+8785e41224a3afd04321@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=70d28d11ab14bd7938f3e088365252aa923cff42 Link: https://syzkaller.appspot.com/bug?id=b85721b38583ecc6b5e72ff524c67302abbc30f3 Link: https://lore.kernel.org/all/000000000000ece18705f3b20934@google.com/ Fixes: 717d50e4 ("Ext4: Uninitialized Block Groups") Cc: stable@vger.kernel.org Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org> Link: https://lore.kernel.org/r/20230504121525.3275886-1-tudor.ambarus@linaro.orgSigned-off-by: Theodore Ts'o <tytso@mit.edu>
-
Jan Kara authored
When using cached extent stored in extent status tree in tree->cache_es another process holding ei->i_es_lock for reading can be racing with us setting new value of tree->cache_es. If the compiler would decide to refetch tree->cache_es at an unfortunate moment, it could result in a bogus in_range() check. Fix the possible race by using READ_ONCE() when using tree->cache_es only under ei->i_es_lock for reading. Cc: stable@kernel.org Reported-by: syzbot+4a03518df1e31b537066@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/000000000000d3b33905fa0fd4a6@google.comSuggested-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230504125524.10802-1-jack@suse.czSigned-off-by: Theodore Ts'o <tytso@mit.edu>
-
Jan Kara authored
Ext4 has a filesystem wide lock protecting ext4_writepages() calls to avoid races with switching of journalled data flag or inode format. This lock can however cause a deadlock like: CPU0 CPU1 ext4_writepages() percpu_down_read(sbi->s_writepages_rwsem); ext4_change_inode_journal_flag() percpu_down_write(sbi->s_writepages_rwsem); - blocks, all readers block from now on ext4_do_writepages() ext4_init_io_end() kmem_cache_zalloc(io_end_cachep, GFP_KERNEL) fs_reclaim frees dentry... dentry_unlink_inode() iput() - last ref => iput_final() - inode dirty => write_inode_now()... ext4_writepages() tries to acquire sbi->s_writepages_rwsem and blocks forever Make sure we cannot recurse into filesystem reclaim from writeback code to avoid the deadlock. Reported-by: syzbot+6898da502aef574c5f8a@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/0000000000004c66b405fa108e27@google.com Fixes: c8585c6f ("ext4: fix races between changing inode journal mode and ext4_writepages") CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230504124723.20205-1-jack@suse.czSigned-off-by: Theodore Ts'o <tytso@mit.edu>
-
Theodore Ts'o authored
In ext4_xattr_move_to_block(), the value of the extended attribute which we need to move to an external block may be allocated by kvmalloc() if the value is stored in an external inode. So at the end of the function the code tried to check if this was the case by testing entry->e_value_inum. However, at this point, the pointer to the xattr entry is no longer valid, because it was removed from the original location where it had been stored. So we could end up calling kvfree() on a pointer which was not allocated by kvmalloc(); or we could also potentially leak memory by not freeing the buffer when it should be freed. Fix this by storing whether it should be freed in a separate variable. Cc: stable@kernel.org Link: https://lore.kernel.org/r/20230430160426.581366-1-tytso@mit.edu Link: https://syzkaller.appspot.com/bug?id=5c2aee8256e30b55ccf57312c16d88417adbd5e1 Link: https://syzkaller.appspot.com/bug?id=41a6b5d4917c0412eb3b3c3c604965bed7d7420b Reported-by: syzbot+64b645917ce07d89bde5@syzkaller.appspotmail.com Reported-by: syzbot+0d042627c4f2ad332195@syzkaller.appspotmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
-
Theodore Ts'o authored
If a malicious fuzzer overwrites the ext4 superblock while it is mounted such that the s_first_data_block is set to a very large number, the calculation of the block group can underflow, and trigger a BUG_ON check. Change this to be an ext4_warning so that we don't crash the kernel. Cc: stable@kernel.org Link: https://lore.kernel.org/r/20230430154311.579720-3-tytso@mit.edu Reported-by: syzbot+e2efa3efc15a1c9e95c3@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=69b28112e098b070f639efb356393af3ffec4220Signed-off-by: Theodore Ts'o <tytso@mit.edu>
-
Theodore Ts'o authored
Previously, ext4_get_group_info() would treat an invalid group number as BUG(), since in theory it should never happen. However, if a malicious attaker (or fuzzer) modifies the superblock via the block device while it is the file system is mounted, it is possible for s_first_data_block to get set to a very large number. In that case, when calculating the block group of some block number (such as the starting block of a preallocation region), could result in an underflow and very large block group number. Then the BUG_ON check in ext4_get_group_info() would fire, resutling in a denial of service attack that can be triggered by root or someone with write access to the block device. For a quality of implementation perspective, it's best that even if the system administrator does something that they shouldn't, that it will not trigger a BUG. So instead of BUG'ing, ext4_get_group_info() will call ext4_error and return NULL. We also add fallback code in all of the callers of ext4_get_group_info() that it might NULL. Also, since ext4_get_group_info() was already borderline to be an inline function, un-inline it. The results in a next reduction of the compiled text size of ext4 by roughly 2k. Cc: stable@kernel.org Link: https://lore.kernel.org/r/20230430154311.579720-2-tytso@mit.edu Reported-by: syzbot+e2efa3efc15a1c9e95c3@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=69b28112e098b070f639efb356393af3ffec4220Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>
-
git://git.kernel.dk/linuxLinus Torvalds authored
Pull block fixes from Jens Axboe: "Just a few minor fixes for drivers, and a deletion of a file that is woefully out-of-date these days" * tag 'block-6.4-2023-05-13' of git://git.kernel.dk/linux: Documentation/block: drop the request.rst file ublk: fix command op code check block/rnbd: replace REQ_OP_FLUSH with REQ_OP_WRITE nbd: Fix debugfs_create_dir error checking
-
Dave Jiang authored
Add a return to the error path when cxl_cdat_read_table() fails. Current code continues with the table pointer points to freed memory. Fixes: 7a877c92 ("cxl/pci: Simplify CDAT retrieval error path") Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://lore.kernel.org/r/168382793506.3510737.4792518576623749076.stgit@djiang5-mobl3Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
Dan Williams authored
Starting with commit: 95433f72 ("srcu: Begin offloading srcu_struct fields to srcu_update") ...it is no longer possible to do: static DEFINE_SRCU(x) Switch to DEFINE_STATIC_SRCU(x) to fix: tools/testing/cxl/test/mock.c:22:1: error: duplicate ‘static’ 22 | static DEFINE_SRCU(cxl_mock_srcu); | ^~~~~~ Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168392709546.1135523.10424917245934547117.stgit@dwillia2-xfh.jf.intel.comSigned-off-by: Dan Williams <dan.j.williams@intel.com>
-
- 12 May, 2023 6 commits
-
-
Borislav Petkov (AMD) authored
SYM_FUNC_START_LOCAL_NOALIGN() adds an endbr leading to this layout (leaving only the last 2 bytes of the address): 3bff <zen_untrain_ret>: 3bff: f3 0f 1e fa endbr64 3c03: f6 test $0xcc,%bl 3c04 <__x86_return_thunk>: 3c04: c3 ret 3c05: cc int3 3c06: 0f ae e8 lfence However, "the RET at __x86_return_thunk must be on a 64 byte boundary, for alignment within the BTB." Use SYM_START instead. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linuxLinus Torvalds authored
Pull more btrfs fixes from David Sterba: - fix incorrect number of bitmap entries for space cache if loading is interrupted by some error - fix backref walking, this breaks a mode of LOGICAL_INO_V2 ioctl that is used in deduplication tools - zoned mode fixes: - properly finish zone reserved for relocation - correctly calculate super block zone end on ZNS - properly initialize new extent buffer for redirty - make mount option clear_cache work with block-group-tree, to rebuild free-space-tree instead of temporarily disabling it that would lead to a forced read-only mount - fix alignment check for offset when printing extent item * tag 'for-6.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: make clear_cache mount option to rebuild FST without disabling it btrfs: zero the buffer before marking it dirty in btrfs_redirty_list_add btrfs: zoned: fix full zone super block reading on ZNS btrfs: zoned: zone finish data relocation BG with last IO btrfs: fix backref walking not returning all inode refs btrfs: fix space cache inconsistency after error loading it from disk btrfs: print-tree: parent bytenr must be aligned to sector size
-
git://git.samba.org/sfrench/cifs-2.6Linus Torvalds authored
Pull cifs client fixes from Steve French: - fix for copy_file_range bug for very large files that are multiples of rsize - do not ignore "isolated transport" flag if set on share - set rasize default better - three fixes related to shutdown and freezing (fixes 4 xfstests, and closes deferred handles faster in some places that were missed) * tag '6.4-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: release leases for deferred close handles when freezing smb3: fix problem remounting a share after shutdown SMB3: force unmount was failing to close deferred close files smb3: improve parallel reads of large files do not reuse connection if share marked as isolated cifs: fix pcchunk length type in smb2_copychunk_range
-
Linus Torvalds authored
Pull vfs fix from Christian Brauner: "During the pipe nonblock rework the check for both O_NONBLOCK and IOCB_NOWAIT was dropped. Both checks need to be performed to ensure that files without O_NONBLOCK but IOCB_NOWAIT don't block when writing to or reading from a pipe. This just contains the fix adding the check for IOCB_NOWAIT back in" * tag 'vfs/v6.4-rc1/pipe' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: pipe: check for IOCB_NOWAIT alongside O_NONBLOCK
-
git://git.kernel.dk/linuxLinus Torvalds authored
Pull io_uring fix from Jens Axboe: "Just a single fix making io_uring_sqe_cmd() available regardless of CONFIG_IO_URING, fixing a regression introduced during the merge window if nvme was selected but io_uring was not" * tag 'io_uring-6.4-2023-05-12' of git://git.kernel.dk/linux: io_uring: make io_uring_sqe_cmd() unconditionally available
-
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linuxLinus Torvalds authored
Pull RISC-V fix from Palmer Dabbelt: "Just a single fix this week for a build issue. That'd usually be a good sign, but we've started to get some reports of boot failures on some hardware/bootloader configurations. Nothing concrete yet, but I've got a funny feeling that's where much of the bug hunting is going right now. Nothing's reproducing on my end, though, and this fixes some pretty concrete issues so I figured there's no reason to delay it: - a fix to the linker script to avoid orpahaned sections in kernel/pi" * tag 'riscv-for-linus-6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Fix orphan section warnings caused by kernel/pi
-