- 13 May, 2022 3 commits
-
-
Puyou Lu authored
Add allocated strarray to device's resource list. This is a must to automatically release strarray when the device disappears. Without this fix we have a memory leak in the few drivers which use devm_kasprintf_strarray(). Link: https://lkml.kernel.org/r/20220506044409.30066-1-puyou.lu@gmail.com Link: https://lkml.kernel.org/r/20220506073623.2679-1-puyou.lu@gmail.com Fixes: acdb89b6 ("lib/string_helpers: Introduce managed variant of kasprintf_strarray()") Signed-off-by: Puyou Lu <puyou.lu@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Cc: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
lizhe authored
At the end of get_last_crashkernel(), the judgement of ck_cmdline is obviously unnecessary and causes redundance, let's clean it up. Link: https://lkml.kernel.org/r/20220506104116.259323-1-sensor1010@163.comSigned-off-by: lizhe <sensor1010@163.com> Acked-by: Baoquan He <bhe@redhat.com> Acked-by: Philipp Rudo <prudo@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Dave Young <dyoung@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Alexey Dobriyan authored
This is very theoretical compile failure: ELF_ST_TYPE(st_info = A) Cast will bind first and st_info will stop being lvalue: error: lvalue required as left operand of assignment Given that the only use of this macro is ELF_ST_TYPE(sym->st_info) where st_info is "unsigned char" I've decided to remove cast especially given that companion macro ELF_ST_BIND doesn't use cast. Link: https://lkml.kernel.org/r/Ymv7G1BeX4kt3obz@localhost.localdomainSigned-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
- 10 May, 2022 10 commits
-
-
Waiman Long authored
When running the stress-ng clone benchmark with multiple testing threads, it was found that there were significant spinlock contention in sget_fc(). The contended spinlock was the sb_lock. It is under heavy contention because the following code in the critcal section of sget_fc(): hlist_for_each_entry(old, &fc->fs_type->fs_supers, s_instances) { if (test(old, fc)) goto share_extant_sb; } After testing with added instrumentation code, it was found that the benchmark could generate thousands of ipc namespaces with the corresponding number of entries in the mqueue's fs_supers list where the namespaces are the key for the search. This leads to excessive time in scanning the list for a match. Looking back at the mqueue calling sequence leading to sget_fc(): mq_init_ns() => mq_create_mount() => fc_mount() => vfs_get_tree() => mqueue_get_tree() => get_tree_keyed() => vfs_get_super() => sget_fc() Currently, mq_init_ns() is the only mqueue function that will indirectly call mqueue_get_tree() with a newly allocated ipc namespace as the key for searching. As a result, there will never be a match with the exising ipc namespaces stored in the mqueue's fs_supers list. So using get_tree_keyed() to do an existing ipc namespace search is just a waste of time. Instead, we could use get_tree_nodev() to eliminate the useless search. By doing so, we can greatly reduce the sb_lock hold time and avoid the spinlock contention problem in case a large number of ipc namespaces are present. Of course, if the code is modified in the future to allow mqueue_get_tree() to be called with an existing ipc namespace instead of a new one, we will have to use get_tree_keyed() in this case. The following stress-ng clone benchmark command was run on a 2-socket 48-core Intel system: ./stress-ng --clone 32 --verbose --oomable --metrics-brief -t 20 The "bogo ops/s" increased from 5948.45 before patch to 9137.06 after patch. This is an increase of 54% in performance. Link: https://lkml.kernel.org/r/20220121172315.19652-1-longman@redhat.com Fixes: 935c6912 ("ipc: Convert mqueue fs to fs_context") Signed-off-by: Waiman Long <longman@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: David Howells <dhowells@redhat.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Prakash Sangappa authored
semtimedop() should be converted to use hrtimer like it has been done for most of the system calls with timeouts. This system call already takes a struct timespec as an argument and can therefore provide finer granularity timed wait. Link: https://lkml.kernel.org/r/1651187881-2858-1-git-send-email-prakash.sangappa@oracle.comSigned-off-by: Prakash Sangappa <prakash.sangappa@oracle.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Michal Orzel authored
Get rid of redundant assignments which end up in values not being read either because they are overwritten or the function ends. Reported by clang-tidy [deadcode.DeadStores] Link: https://lkml.kernel.org/r/20220409101933.207157-1-michalorzel.eng@gmail.comSigned-off-by: Michal Orzel <michalorzel.eng@gmail.com> Reviewed-by: Tom Rix <trix@redhat.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
David Disseldorp authored
Add support for extraction of checksum-enabled "070702" cpio archives, specified in Documentation/driver-api/early-userspace/buffer-format.rst. Fail extraction if the calculated file data checksum doesn't match the value carried in the header. Link: https://lkml.kernel.org/r/20220404093429.27570-7-ddiss@suse.deSigned-off-by: David Disseldorp <ddiss@suse.de> Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Martin Wilck <mwilck@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
David Disseldorp authored
Documentation/driver-api/early-userspace/buffer-format.rst includes the specification for checksum-enabled cpio archives. Implement support for this format in gen_init_cpio via a new '-c' parameter. Link: https://lkml.kernel.org/r/20220404093429.27570-6-ddiss@suse.deSigned-off-by: David Disseldorp <ddiss@suse.de> Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Martin Wilck <mwilck@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
David Disseldorp authored
When processing a "file" entry, gen_init_cpio attempts to allocate a buffer large enough to stage the entire contents of the source file. It then attempts to fill the buffer via a single read() call and subsequently writes out the entire buffer length, without checking that read() returned the full length, potentially writing uninitialized buffer memory. Fix this by breaking up file I/O into 64k chunks and only writing the length returned by the prior read() call. Link: https://lkml.kernel.org/r/20220404093429.27570-5-ddiss@suse.deSigned-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Martin Wilck <mwilck@suse.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
David Disseldorp authored
initramfs cpio mtime preservation, as implemented in commit 889d51a1 ("initramfs: add option to preserve mtime from initramfs cpio images"), uses a linked list to defer directory mtime processing until after all other items in the cpio archive have been processed. This is done to ensure that parent directory mtimes aren't overwritten via subsequent child creation. The lkml link below indicates that the mtime retention use case was for embedded devices with applications running exclusively out of initramfs, where the 32-bit mtime value provided a rough file version identifier. Linux distributions which discard an extracted initramfs immediately after the root filesystem has been mounted may want to avoid the unnecessary overhead. This change adds a new INITRAMFS_PRESERVE_MTIME Kconfig option, which can be used to disable on-by-default mtime retention and in turn speed up initramfs extraction, particularly for cpio archives with large directory counts. Benchmarks with a one million directory cpio archive extracted 20 times demonstrated: mean extraction time (s) std dev INITRAMFS_PRESERVE_MTIME=y 3.808 0.006 INITRAMFS_PRESERVE_MTIME unset 3.056 0.004 The above extraction times were measured using ftrace (initcall_finish - initcall_start) values for populate_rootfs() with initramfs_async disabled. [ddiss@suse.de: rebase atop dir_entry.name flexible array member and drop separate initramfs_mtime.h header] Link: https://lkml.org/lkml/2008/9/3/424 Link: https://lkml.kernel.org/r/20220404093429.27570-4-ddiss@suse.deSigned-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Martin Wilck <mwilck@suse.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
David Disseldorp authored
dir_entry.name is currently allocated via a separate kstrdup(). Change it to a flexible array member and allocate it along with struct dir_entry. Link: https://lkml.kernel.org/r/20220404093429.27570-3-ddiss@suse.deSigned-off-by: David Disseldorp <ddiss@suse.de> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Martin Wilck <mwilck@suse.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
David Disseldorp authored
Patch series "initramfs: "crc" cpio format and INITRAMFS_PRESERVE_MTIME", v7. This patchset does some minor initramfs refactoring and allows cpio entry mtime preservation to be disabled via a new Kconfig INITRAMFS_PRESERVE_MTIME option. Patches 4/6 to 6/6 implement support for creation and extraction of "crc" cpio archives, which carry file data checksums. Basic tests for this functionality can be found at https://github.com/rapido-linux/rapido/pull/163 This patch (of 6): do_header() is called for each cpio entry and fails if the first six bytes don't match "newc" magic. The magic check includes a special case error message if POSIX.1 ASCII (cpio -H odc) magic is detected. This special case POSIX.1 check can be nested under the "newc" mismatch code path to avoid calling memcmp() twice in a non-error case. Link: https://lkml.kernel.org/r/20220404093429.27570-1-ddiss@suse.de Link: https://lkml.kernel.org/r/20220404093429.27570-2-ddiss@suse.deSigned-off-by: David Disseldorp <ddiss@suse.de> Reviewed-by: Martin Wilck <mwilck@suse.com> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Alexey Dobriyan authored
When a process exits, /proc/${pid}, and /proc/${pid}/net dentries are flushed. However some leaf dentries like /proc/${pid}/net/arp_cache aren't. That's because respective PDEs have proc_misc_d_revalidate() hook which returns 1 and leaves dentries/inodes in the LRU. Force revalidation/lookup on everything under /proc/${pid}/net by inheriting proc_net_dentry_ops. [akpm@linux-foundation.org: coding-style cleanups] Link: https://lkml.kernel.org/r/YjdVHgildbWO7diJ@localhost.localdomain Fixes: c6c75ded ("proc: fix lookup in /proc/net subdirectories after setns(2)") Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reported-by: hui li <juanfengpy@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
- 29 Apr, 2022 27 commits
-
-
Liu Shixin authored
sbi->s_firstinodezone is initialized to 2 and sbi->s_firstdatazone is read from sbd. There's no guarantee that sbi->s_firstdatazone must bigger than sbi->s_firstinodezone. If sbi->s_firstdatazone less than 2, the filesystem can still be mounted unexpetly. At this point, sbi->s_ninodes flip to very large value and this filesystem is broken. We can observe this by executing 'df' command. When we execute, we will get an error message: "sysv_count_free_inodes: unable to read inode table" Link: https://lkml.kernel.org/r/20220330104215.530223-1-liushixin2@huawei.comSigned-off-by: Liu Shixin <liushixin2@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
xu xin authored
If getdelays runs in a non-init network namespace, it will fail in getting delayacct stats even if it has privilege of root user, which seems to be not very reasonable. We can simply reproduce this by executing commands: unshare -n getdelays -d -p <pid> I don't think net namespace should be an obstacle to the normal execution of getdelay function. So let's make it available from all net namespaces. Link: https://lkml.kernel.org/r/20220412071946.2532318-1-xu.xin16@zte.com.cnSigned-off-by: xu xin <xu.xin16@zte.com.cn> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Yang Yang <yang.yang29@zte.com.cn> Cc: "Dr. Thomas Orgis" <thomas.orgis@uni-hamburg.de> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Ismael Luceno <ismael@iodev.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Dr. Thomas Orgis authored
The task exit struct needs some crucial information to be able to provide an enhanced version of process and thread accounting. This change provides: 1. ac_tgid in additon to ac_pid 2. thread group execution walltime in ac_tgetime 3. flag AGROUP in ac_flag to indicate the last task in a thread group / process 4. device ID and inode of task's /proc/self/exe in ac_exe_dev and ac_exe_inode 5. tools/accounting/procacct as demonstrator When a task exits, taskstats are reported to userspace including the task's pid and ppid, but without the id of the thread group this task is part of. Without the tgid, the stats of single tasks cannot be correlated to each other as a thread group (process). The taskstats documentation suggests that on process exit a data set consisting of accumulated stats for the whole group is produced. But such an additional set of stats is only produced for actually multithreaded processes, not groups that had only one thread, and also those stats only contain data about delay accounting and not the more basic information about CPU and memory resource usage. Adding the AGROUP flag to be set when the last task of a group exited enables determination of process end also for single-threaded processes. My applicaton basically does enhanced process accounting with summed cputime, biggest maxrss, tasks per process. The data is not available with the traditional BSD process accounting (which is not designed to be extensible) and the taskstats interface allows more efficient on-the-fly grouping and summing of the stats, anyway, without intermediate disk writes. Furthermore, I do carry statistics on which exact program binary is used how often with associated resources, getting a picture on how important which parts of a collection of installed scientific software in different versions are, and how well they put load on the machine. This is enabled by providing information on /proc/self/exe for each task. I assume the two 64-bit fields for device ID and inode are more appropriate than the possibly large resolved path to keep the data volume down. Add the tgid to the stats to complete task identification, the flag AGROUP to mark the last task of a group, the group wallclock time, and inode-based identification of the associated executable file. Add tools/accounting/procacct.c as a simplified fork of getdelays.c to demonstrate process and thread accounting. [thomas.orgis@uni-hamburg.de: fix version number in comment] Link: https://lkml.kernel.org/r/20220405003601.7a5f6008@plasteblaster Link: https://lkml.kernel.org/r/20220331004106.64e5616b@plasteblasterSigned-off-by: Dr. Thomas Orgis <thomas.orgis@uni-hamburg.de> Reviewed-by: Ismael Luceno <ismael@iodev.co.uk> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: xu xin <xu.xin16@zte.com.cn> Cc: Yang Yang <yang.yang29@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Jakob Koschel authored
req->map is set in the valid case and always equals 'map' if the break was hit. It therefore is unnecessary to use the list iterator variable and the use of 'map' can be replaced with req->map. This is done in preparation to limit the scope of a list iterator to the list traversal loop [1]. Link: https://lore.kernel.org/all/YhdfEIwI4EdtHdym@kroah.com/ Link: https://lkml.kernel.org/r/20220319203344.2547702-1-jakobkoschel@gmail.comSigned-off-by: Jakob Koschel <jakobkoschel@gmail.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Alexandre Bounine <alex.bou9@gmail.com> Cc: Kees Cook <keescook@chromium.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: "Brian Johannesmeyer" <bjohannesmeyer@gmail.com> Cc: Cristiano Giuffrida <c.giuffrida@vu.nl> Cc: "Bos, H.J." <h.j.bos@vu.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Michal Orzel authored
Get rid of redundant assignments which end up in values not being read either because they are overwritten or the function ends. Reported by clang-tidy [deadcode.DeadStores] Link: https://lkml.kernel.org/r/20220326180948.192154-1-michalorzel.eng@gmail.comSigned-off-by: Michal Orzel <michalorzel.eng@gmail.com> Acked-by: Baoquan He <bhe@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Michal Orzel <michalorzel.eng@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Tiezhu Yang authored
In MAINTAINERS PTRACE SUPPORT entry, the file include/uapi/linux/ptrace.h is redundant, remove it. Link: https://lkml.kernel.org/r/1649240981-11024-4-git-send-email-yangtiezhu@loongson.cnSigned-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Tiezhu Yang authored
PT_DTRACE is only used on um now, fix the wrong comment. Link: https://lkml.kernel.org/r/1649240981-11024-3-git-send-email-yangtiezhu@loongson.cnSigned-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Tiezhu Yang authored
Patch series "ptrace: do some cleanup". This patch (of 3): PTRACE_SINGLESTEP is always defined as 9 in include/uapi/linux/ptrace.h, remove redudant check of #ifdef PTRACE_SINGLESTEP. Link: https://lkml.kernel.org/r/1649240981-11024-2-git-send-email-yangtiezhu@loongson.cnSigned-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
OGAWA Hirofumi authored
fat*_ent_bread() can be the cause of too many report on I/O error path. So use fat_msg_ratelimit() instead. Link: https://lkml.kernel.org/r/87bkxogfeq.fsf@mail.parknet.co.jpSigned-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Reported-by: qianfan <qianfanguijin@163.com> Tested-by: qianfan <qianfanguijin@163.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Jonathan Lassoff authored
In order for end users to quickly react to new issues that come up in production, it is proving useful to leverage the printk indexing system. This printk index enables kernel developers to use calls to printk() with changeable ad-hoc format strings (as they always have; no change of expectations), while enabling end users to examine format strings to detect changes. Since end users are using regular expressions to match messages printed through printk(), being able to detect changes in chosen format strings from release to release provides a useful signal to review printk()-matching regular expressions for any necessary updates. So that detailed FAT messages are captured by this printk index, this patch wraps fat_msg with a macro. [akpm@linux-foundation.org: coding-style cleanups] Link: https://lkml.kernel.org/r/8aaa2dd7995e820292bb40d2120ab69756662c65.1648688136.git.jof@thejof.comSigned-off-by: Jonathan Lassoff <jof@thejof.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Reviewed-by: Petr Mladek <pmladek@suse.com> Tested-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Yubo Feng authored
iput() has already judged the incoming parameter, so there is no need to repeat outside. Link: https://lkml.kernel.org/r/1648265418-76563-1-git-send-email-fengyubo3@huawei.comSigned-off-by: Yubo Feng <fengyubo3@huawei.com> Reported-by: Hulk Robot <hulkci@huawei.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Kees Cook authored
The uselib syscall has been long deprecated. There's no need to keep this enabled by default under X86_32. Link: https://lkml.kernel.org/r/20220412212519.4113845-1-keescook@chromium.orgSigned-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Kuniyuki Iwashima authored
ep_poll() first calls ep_events_available() with no lock held and checks if ep->rdllist is empty by list_empty_careful(), which reads rdllist->prev. Thus all accesses to it need some protection to avoid store/load-tearing. Note INIT_LIST_HEAD_RCU() already has the annotation for both prev and next. Commit bf3b9f63 ("epoll: Add busy poll support to epoll with socket fds.") added the first lockless ep_events_available(), and commit c5a282e9 ("fs/epoll: reduce the scope of wq lock in epoll_wait()") made some ep_events_available() calls lockless and added single call under a lock, finally commit e59d3c64 ("epoll: eliminate unnecessary lock for zero timeout") made the last ep_events_available() lockless. BUG: KCSAN: data-race in do_epoll_wait / do_epoll_wait write to 0xffff88810480c7d8 of 8 bytes by task 1802 on cpu 0: INIT_LIST_HEAD include/linux/list.h:38 [inline] list_splice_init include/linux/list.h:492 [inline] ep_start_scan fs/eventpoll.c:622 [inline] ep_send_events fs/eventpoll.c:1656 [inline] ep_poll fs/eventpoll.c:1806 [inline] do_epoll_wait+0x4eb/0xf40 fs/eventpoll.c:2234 do_epoll_pwait fs/eventpoll.c:2268 [inline] __do_sys_epoll_pwait fs/eventpoll.c:2281 [inline] __se_sys_epoll_pwait+0x12b/0x240 fs/eventpoll.c:2275 __x64_sys_epoll_pwait+0x74/0x80 fs/eventpoll.c:2275 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae read to 0xffff88810480c7d8 of 8 bytes by task 1799 on cpu 1: list_empty_careful include/linux/list.h:329 [inline] ep_events_available fs/eventpoll.c:381 [inline] ep_poll fs/eventpoll.c:1797 [inline] do_epoll_wait+0x279/0xf40 fs/eventpoll.c:2234 do_epoll_pwait fs/eventpoll.c:2268 [inline] __do_sys_epoll_pwait fs/eventpoll.c:2281 [inline] __se_sys_epoll_pwait+0x12b/0x240 fs/eventpoll.c:2275 __x64_sys_epoll_pwait+0x74/0x80 fs/eventpoll.c:2275 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae value changed: 0xffff88810480c7d0 -> 0xffff888103c15098 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 1799 Comm: syz-fuzzer Tainted: G W 5.17.0-rc7-syzkaller-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Link: https://lkml.kernel.org/r/20220322002653.33865-3-kuniyu@amazon.co.jp Fixes: e59d3c64 ("epoll: eliminate unnecessary lock for zero timeout") Fixes: c5a282e9 ("fs/epoll: reduce the scope of wq lock in epoll_wait()") Fixes: bf3b9f63 ("epoll: Add busy poll support to epoll with socket fds.") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Reported-by: syzbot+bdd6e38a1ed5ee58d8bd@syzkaller.appspotmail.com Cc: Al Viro <viro@zeniv.linux.org.uk>, Andrew Morton <akpm@linux-foundation.org> Cc: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Cc: Kuniyuki Iwashima <kuni1840@gmail.com> Cc: "Soheil Hassas Yeganeh" <soheil@google.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: "Sridhar Samudrala" <sridhar.samudrala@intel.com> Cc: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Kuniyuki Iwashima authored
Patch series "Fix data-races around epoll reported by KCSAN." This series suppresses a false positive KCSAN's message and fixes a real data-race. This patch (of 2): pipe_poll() runs locklessly and assigns 1 to poll_usage. Once poll_usage is set to 1, it never changes in other places. However, concurrent writes of a value trigger KCSAN, so let's make KCSAN happy. BUG: KCSAN: data-race in pipe_poll / pipe_poll write to 0xffff8880042f6678 of 4 bytes by task 174 on cpu 3: pipe_poll (fs/pipe.c:656) ep_item_poll.isra.0 (./include/linux/poll.h:88 fs/eventpoll.c:853) do_epoll_wait (fs/eventpoll.c:1692 fs/eventpoll.c:1806 fs/eventpoll.c:2234) __x64_sys_epoll_wait (fs/eventpoll.c:2246 fs/eventpoll.c:2241 fs/eventpoll.c:2241) do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:113) write to 0xffff8880042f6678 of 4 bytes by task 177 on cpu 1: pipe_poll (fs/pipe.c:656) ep_item_poll.isra.0 (./include/linux/poll.h:88 fs/eventpoll.c:853) do_epoll_wait (fs/eventpoll.c:1692 fs/eventpoll.c:1806 fs/eventpoll.c:2234) __x64_sys_epoll_wait (fs/eventpoll.c:2246 fs/eventpoll.c:2241 fs/eventpoll.c:2241) do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:113) Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 177 Comm: epoll_race Not tainted 5.17.0-58927-gf443e374 #6 Hardware name: Red Hat KVM, BIOS 1.11.0-2.amzn2 04/01/2014 Link: https://lkml.kernel.org/r/20220322002653.33865-1-kuniyu@amazon.co.jp Link: https://lkml.kernel.org/r/20220322002653.33865-2-kuniyu@amazon.co.jp Fixes: 3b844826 ("pipe: avoid unnecessary EPOLLET wakeups under normal loads") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Cc: Alexander Duyck <alexander.h.duyck@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Kuniyuki Iwashima <kuni1840@gmail.com> Cc: "Soheil Hassas Yeganeh" <soheil@google.com> Cc: "Sridhar Samudrala" <sridhar.samudrala@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Tom Rix authored
Clang static analysis reports this false positive glob.c:48:32: warning: Assigned value is garbage or undefined char const *back_pat = NULL, *back_str = back_str; ^~~~~~~~ ~~~~~~~~ back_str is set after back_pat and it's use is protected by the !back_pat check. It is not necessary to initialize back_str, so remove the initialization. Link: https://lkml.kernel.org/r/20220402131546.3383578-1-trix@redhat.comSigned-off-by: Tom Rix <trix@redhat.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Cc: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Rasmus Villemoes authored
Use strchr(), which makes them a lot shorter, and more obviously symmetric in their treatment of accept/reject. It also saves a little bit of .text; bloat-o-meter for an arm build says Function old new delta strcspn 92 76 -16 strspn 108 76 -32 While here, also remove a stray empty line before EXPORT_SYMBOL(). Link: https://lkml.kernel.org/r/20220328224119.3003834-2-linux@rasmusvillemoes.dkSigned-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Andy Shevchenko <andy@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Rasmus Villemoes authored
Before refactoring strspn() and strcspn(), add some simple test cases. Link: https://lkml.kernel.org/r/20220328224119.3003834-1-linux@rasmusvillemoes.dkSigned-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Andy Shevchenko <andy@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Rasmus Villemoes authored
As in "kernel/panic.c: remove CONFIG_PANIC_ON_OOPS_VALUE indirection", use the IS_ENABLED() helper rather than having a hidden config option. Link: https://lkml.kernel.org/r/20220321121301.1389693-1-linux@rasmusvillemoes.dkSigned-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Xiaoke Wang authored
To make the test more robust, there are the following changes: 1. add a check for the return value of kmem_cache_alloc(). 2. properly release the object `buf` on several error paths. 3. release the objects of `used_objects` if we never hit `saved_ptr`. 4. destroy the created cache by default. Link: https://lkml.kernel.org/r/tencent_7CB95F1C3914BCE1CA4A61FF7C20E7CCB108@qq.comSigned-off-by: Xiaoke Wang <xkernel.wang@foxmail.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Marco Elver <elver@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Xiaoke Wang <xkernel.wang@foxmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Rob Herring authored
Add support to also use the mailmap for 'in file' email addresses. Link: https://lkml.kernel.org/r/20220323193645.317514-1-robh@kernel.orgSigned-off-by: Rob Herring <robh@kernel.org> Reported-by: Marc Zyngier <maz@kernel.org> Acked-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Haowen Bai authored
This fixes the following sparse warnings: kernel/pid_namespace.c:55:77: warning: Using plain integer as NULL pointer Link: https://lkml.kernel.org/r/1647944288-2806-1-git-send-email-baihaowen@meizu.comSigned-off-by: Haowen Bai <baihaowen@meizu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Christoph Hellwig authored
csum_and_copy_from_user and csum_and_copy_to_user are exported by a few architectures, but not actually used in modular code. Drop the exports. Link: https://lkml.kernel.org/r/20220421070440.1282704-1-hch@lst.deSigned-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
Remove the read_from_oldmem() wrapper introduced earlier and convert all the remaining callers to pass an iov_iter. Link: https://lkml.kernel.org/r/20220408090636.560886-4-bhe@redhat.comSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Baoquan He <bhe@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Amit Daniel Kachhap <amit.kachhap@arm.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
This gets rid of copy_to() and let us use proc_read_iter() instead of proc_read(). Link: https://lkml.kernel.org/r/20220408090636.560886-3-bhe@redhat.comSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Baoquan He <bhe@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
Patch series "Convert vmcore to use an iov_iter", v5. For some reason several people have been sending bad patches to fix compiler warnings in vmcore recently. Here's how it should be done. Compile-tested only on x86. As noted in the first patch, s390 should take this conversion a bit further, but I'm not inclined to do that work myself. This patch (of 3): Instead of passing in a 'buf' and 'userbuf' argument, pass in an iov_iter. s390 needs more work to pass the iov_iter down further, or refactor, but I'd be more comfortable if someone who can test on s390 did that work. It's more convenient to convert the whole of read_from_oldmem() to take an iov_iter at the same time, so rename it to read_from_oldmem_iter() and add a temporary read_from_oldmem() wrapper that creates an iov_iter. Link: https://lkml.kernel.org/r/20220408090636.560886-1-bhe@redhat.com Link: https://lkml.kernel.org/r/20220408090636.560886-2-bhe@redhat.comSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Baoquan He <bhe@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Jakob Koschel authored
When list_for_each_entry() completes the iteration over the whole list without breaking the loop, the iterator value will be a bogus pointer computed based on the head element. While it is safe to use the pointer to determine if it was computed based on the head element, either with list_entry_is_head() or &pos->member == head, using the iterator variable after the loop should be avoided. In preparation to limit the scope of a list iterator to the list traversal loop, use a dedicated pointer to point to the found element [1]. [akpm@linux-foundation.org: reduce scope of `iter'] Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=EHreAsk5SqXPwr9Y7k9sA6cWXJ6w@mail.gmail.com/ [1] Link: https://lkml.kernel.org/r/20220331223700.902556-1-jakobkoschel@gmail.comSigned-off-by: Jakob Koschel <jakobkoschel@gmail.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: "Brian Johannesmeyer" <bjohannesmeyer@gmail.com> Cc: Cristiano Giuffrida <c.giuffrida@vu.nl> Cc: "Bos, H.J." <h.j.bos@vu.nl> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Heming Zhao via Ocfs2-devel authored
Current ocfs2_fill_super() uses one goto label "read_super_error" to handle all error cases. And with previous serial patches, the error handling should fork more branches to handle different error cases. This patch rewrite the error handling of ocfs2_fill_super. Link: https://lkml.kernel.org/r/20220424130952.2436-6-heming.zhao@suse.comSigned-off-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jun Piao <piaojun@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mark@fasheh.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-