- 14 Oct, 2019 7 commits
-
-
Alexander Potapenko authored
slab_alloc_node() already zeroed out the freelist pointer if init_on_free was on. Thibaut Sautereau noticed that the same needs to be done for kmem_cache_alloc_bulk(), which performs the allocations separately. kmem_cache_alloc_bulk() is currently used in two places in the kernel, so this change is unlikely to have a major performance impact. SLAB doesn't require a similar change, as auto-initialization makes the allocator store the freelist pointers off-slab. Link: http://lkml.kernel.org/r/20191007091605.30530-1-glider@google.com Fixes: 6471384a ("mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options") Signed-off-by: Alexander Potapenko <glider@google.com> Reported-by: Thibaut Sautereau <thibaut@sautereau.fr> Reported-by: Kees Cook <keescook@chromium.org> Cc: Christoph Lameter <cl@linux.com> Cc: Laura Abbott <labbott@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Eric Biggers authored
Kmemleak is falsely reporting a leak of the slab allocation in sctp_stream_init_ext(): BUG: memory leak unreferenced object 0xffff8881114f5d80 (size 96): comm "syz-executor934", pid 7160, jiffies 4294993058 (age 31.950s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000ce7a1326>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline] [<00000000ce7a1326>] slab_post_alloc_hook mm/slab.h:439 [inline] [<00000000ce7a1326>] slab_alloc mm/slab.c:3326 [inline] [<00000000ce7a1326>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553 [<000000007abb7ac9>] kmalloc include/linux/slab.h:547 [inline] [<000000007abb7ac9>] kzalloc include/linux/slab.h:742 [inline] [<000000007abb7ac9>] sctp_stream_init_ext+0x2b/0xa0 net/sctp/stream.c:157 [<0000000048ecb9c1>] sctp_sendmsg_to_asoc+0x946/0xa00 net/sctp/socket.c:1882 [<000000004483ca2b>] sctp_sendmsg+0x2a8/0x990 net/sctp/socket.c:2102 [...] But it's freed later. Kmemleak misses the allocation because its pointer is stored in the generic radix tree sctp_stream::out, and the generic radix tree uses raw pages which aren't tracked by kmemleak. Fix this by adding the kmemleak hooks to the generic radix tree code. Link: http://lkml.kernel.org/r/20191004065039.727564-1-ebiggers@kernel.orgSigned-off-by: Eric Biggers <ebiggers@google.com> Reported-by: <syzbot+7f3b6b106be8dcdcdeec@syzkaller.appspotmail.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Vlad Yasevich <vyasevich@gmail.com> Cc: Xin Long <lucien.xin@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Qian Cai authored
A long time ago we fixed a similar deadlock in show_slab_objects() [1]. However, it is apparently due to the commits like 01fb58bc ("slab: remove synchronous synchronize_sched() from memcg cache deactivation path") and 03afc0e2 ("slab: get_online_mems for kmem_cache_{create,destroy,shrink}"), this kind of deadlock is back by just reading files in /sys/kernel/slab which will generate a lockdep splat below. Since the "mem_hotplug_lock" here is only to obtain a stable online node mask while racing with NUMA node hotplug, in the worst case, the results may me miscalculated while doing NUMA node hotplug, but they shall be corrected by later reads of the same files. WARNING: possible circular locking dependency detected ------------------------------------------------------ cat/5224 is trying to acquire lock: ffff900012ac3120 (mem_hotplug_lock.rw_sem){++++}, at: show_slab_objects+0x94/0x3a8 but task is already holding lock: b8ff009693eee398 (kn->count#45){++++}, at: kernfs_seq_start+0x44/0xf0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (kn->count#45){++++}: lock_acquire+0x31c/0x360 __kernfs_remove+0x290/0x490 kernfs_remove+0x30/0x44 sysfs_remove_dir+0x70/0x88 kobject_del+0x50/0xb0 sysfs_slab_unlink+0x2c/0x38 shutdown_cache+0xa0/0xf0 kmemcg_cache_shutdown_fn+0x1c/0x34 kmemcg_workfn+0x44/0x64 process_one_work+0x4f4/0x950 worker_thread+0x390/0x4bc kthread+0x1cc/0x1e8 ret_from_fork+0x10/0x18 -> #1 (slab_mutex){+.+.}: lock_acquire+0x31c/0x360 __mutex_lock_common+0x16c/0xf78 mutex_lock_nested+0x40/0x50 memcg_create_kmem_cache+0x38/0x16c memcg_kmem_cache_create_func+0x3c/0x70 process_one_work+0x4f4/0x950 worker_thread+0x390/0x4bc kthread+0x1cc/0x1e8 ret_from_fork+0x10/0x18 -> #0 (mem_hotplug_lock.rw_sem){++++}: validate_chain+0xd10/0x2bcc __lock_acquire+0x7f4/0xb8c lock_acquire+0x31c/0x360 get_online_mems+0x54/0x150 show_slab_objects+0x94/0x3a8 total_objects_show+0x28/0x34 slab_attr_show+0x38/0x54 sysfs_kf_seq_show+0x198/0x2d4 kernfs_seq_show+0xa4/0xcc seq_read+0x30c/0x8a8 kernfs_fop_read+0xa8/0x314 __vfs_read+0x88/0x20c vfs_read+0xd8/0x10c ksys_read+0xb0/0x120 __arm64_sys_read+0x54/0x88 el0_svc_handler+0x170/0x240 el0_svc+0x8/0xc other info that might help us debug this: Chain exists of: mem_hotplug_lock.rw_sem --> slab_mutex --> kn->count#45 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(kn->count#45); lock(slab_mutex); lock(kn->count#45); lock(mem_hotplug_lock.rw_sem); *** DEADLOCK *** 3 locks held by cat/5224: #0: 9eff00095b14b2a0 (&p->lock){+.+.}, at: seq_read+0x4c/0x8a8 #1: 0eff008997041480 (&of->mutex){+.+.}, at: kernfs_seq_start+0x34/0xf0 #2: b8ff009693eee398 (kn->count#45){++++}, at: kernfs_seq_start+0x44/0xf0 stack backtrace: Call trace: dump_backtrace+0x0/0x248 show_stack+0x20/0x2c dump_stack+0xd0/0x140 print_circular_bug+0x368/0x380 check_noncircular+0x248/0x250 validate_chain+0xd10/0x2bcc __lock_acquire+0x7f4/0xb8c lock_acquire+0x31c/0x360 get_online_mems+0x54/0x150 show_slab_objects+0x94/0x3a8 total_objects_show+0x28/0x34 slab_attr_show+0x38/0x54 sysfs_kf_seq_show+0x198/0x2d4 kernfs_seq_show+0xa4/0xcc seq_read+0x30c/0x8a8 kernfs_fop_read+0xa8/0x314 __vfs_read+0x88/0x20c vfs_read+0xd8/0x10c ksys_read+0xb0/0x120 __arm64_sys_read+0x54/0x88 el0_svc_handler+0x170/0x240 el0_svc+0x8/0xc I think it is important to mention that this doesn't expose the show_slab_objects to use-after-free. There is only a single path that might really race here and that is the slab hotplug notifier callback __kmem_cache_shrink (via slab_mem_going_offline_callback) but that path doesn't really destroy kmem_cache_node data structures. [1] http://lkml.iu.edu/hypermail/linux/kernel/1101.0/02850.html [akpm@linux-foundation.org: add comment explaining why we don't need mem_hotplug_lock] Link: http://lkml.kernel.org/r/1570192309-10132-1-git-send-email-cai@lca.pw Fixes: 01fb58bc ("slab: remove synchronous synchronize_sched() from memcg cache deactivation path") Fixes: 03afc0e2 ("slab: get_online_mems for kmem_cache_{create,destroy,shrink}") Signed-off-by: Qian Cai <cai@lca.pw> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Tejun Heo <tj@kernel.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Roman Gushchin <guro@fb.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Vlastimil Babka authored
Commit 37389167 ("mm, page_owner: keep owner info when freeing the page") has introduced a flag PAGE_EXT_OWNER_ACTIVE to indicate that page is tracked as being allocated. Kirril suggested naming it PAGE_EXT_OWNER_ALLOCATED to make it more clear, as "active is somewhat loaded term for a page". Link: http://lkml.kernel.org/r/20190930122916.14969-4-vbabka@suse.czSigned-off-by: Vlastimil Babka <vbabka@suse.cz> Suggested-by: Kirill A. Shutemov <kirill@shutemov.name> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Walter Wu <walter-zh.wu@mediatek.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Vlastimil Babka authored
Commit 8974558f ("mm, page_owner, debug_pagealloc: save and dump freeing stack trace") enhanced page_owner to also store freeing stack trace, when debug_pagealloc is also enabled. KASAN would also like to do this [1] to improve error reports to debug e.g. UAF issues. Kirill has suggested that the freeing stack trace saving should be also possible to be enabled separately from KASAN or debug_pagealloc, i.e. with an extra boot option. Qian argued that we have enough options already, and avoiding the extra overhead is not worth the complications in the case of a debugging option. Kirill noted that the extra stack handle in struct page_owner requires 0.1% of memory. This patch therefore enables free stack saving whenever page_owner is enabled, regardless of whether debug_pagealloc or KASAN is also enabled. KASAN kernels booted with page_owner=on will thus benefit from the improved error reports. [1] https://bugzilla.kernel.org/show_bug.cgi?id=203967 [vbabka@suse.cz: v3] Link: http://lkml.kernel.org/r/20191007091808.7096-3-vbabka@suse.cz Link: http://lkml.kernel.org/r/20190930122916.14969-3-vbabka@suse.czSigned-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Qian Cai <cai@lca.pw> Suggested-by: Dmitry Vyukov <dvyukov@google.com> Suggested-by: Walter Wu <walter-zh.wu@mediatek.com> Suggested-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Suggested-by: Qian Cai <cai@lca.pw> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Vlastimil Babka authored
Patch series "followups to debug_pagealloc improvements through page_owner", v3. These are followups to [1] which made it to Linus meanwhile. Patches 1 and 3 are based on Kirill's review, patch 2 on KASAN request [2]. It would be nice if all of this made it to 5.4 with [1] already there (or at least Patch 1). This patch (of 3): As noted by Kirill, commit 7e2f2a0c ("mm, page_owner: record page owner for each subpage") has introduced an off-by-one error in __set_page_owner_handle() when looking up page_ext for subpages. As a result, the head page page_owner info is set twice, while for the last tail page, it's not set at all. Fix this and also make the code more efficient by advancing the page_ext pointer we already have, instead of calling lookup_page_ext() for each subpage. Since the full size of struct page_ext is not known at compile time, we can't use a simple page_ext++ statement, so introduce a page_ext_next() inline function for that. Link: http://lkml.kernel.org/r/20190930122916.14969-2-vbabka@suse.cz Fixes: 7e2f2a0c ("mm, page_owner: record page owner for each subpage") Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Kirill A. Shutemov <kirill@shutemov.name> Reported-by: Miles Chen <miles.chen@mediatek.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Walter Wu <walter-zh.wu@mediatek.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Catalin Marinas authored
In case of an error (e.g. memory pool too small), kmemleak disables itself and cleans up the already allocated metadata objects. However, if this happens early before the RCU callback mechanism is available, put_object() skips call_rcu() and frees the object directly. This is not safe with the RCU list traversal in __kmemleak_do_cleanup(). Change the list traversal in __kmemleak_do_cleanup() to list_for_each_entry_safe() and remove the rcu_read_{lock,unlock} since the kmemleak is already disabled at this point. In addition, avoid an unnecessary metadata object rb-tree look-up since it already has the struct kmemleak_object pointer. Fixes: c5665868 ("mm: kmemleak: use the memory pool for early allocations") Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reported-by: Marc Dionne <marc.c.dionne@gmail.com> Reported-by: Ted Ts'o <tytso@mit.edu> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 13 Oct, 2019 16 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-traceLinus Torvalds authored
Pull tracing fixes from Steven Rostedt: "A few tracing fixes: - Remove lockdown from tracefs itself and moved it to the trace directory. Have the open functions there do the lockdown checks. - Fix a few races with opening an instance file and the instance being deleted (Discovered during the lockdown updates). Kept separate from the clean up code such that they can be backported to stable easier. - Clean up and consolidated the checks done when opening a trace file, as there were multiple checks that need to be done, and it did not make sense having them done in each open instance. - Fix a regression in the record mcount code. - Small hw_lat detector tracer fixes. - A trace_pipe read fix due to not initializing trace_seq" * tag 'trace-v5.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Initialize iter->seq after zeroing in tracing_read_pipe() tracing/hwlat: Don't ignore outer-loop duration when calculating max_latency tracing/hwlat: Report total time spent in all NMIs during the sample recordmcount: Fix nop_mcount() function tracing: Do not create tracefs files if tracefs lockdown is in effect tracing: Add locked_down checks to the open calls of files created for tracefs tracing: Add tracing_check_open_get_tr() tracing: Have trace events system open call tracing_open_generic_tr() tracing: Get trace_array reference for available_tracers files ftrace: Get a reference counter for the trace_array on filter files tracefs: Revert ccbd54ff ("tracefs: Restrict tracefs when the kernel is locked down")
-
Linus Torvalds authored
Merge tag 'hwmon-for-v5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: - Update/fix inspur-ipsps1 and k10temp Documentation - Fix nct7904 driver - Fix HWMON_P_MIN_ALARM mask in hwmon core * tag 'hwmon-for-v5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: docs: Extend inspur-ipsps1 title underline hwmon: (nct7904) Add array fan_alarm and vsen_alarm to store the alarms in nct7904_data struct. docs: hwmon: Include 'inspur-ipsps1.rst' into docs hwmon: Fix HWMON_P_MIN_ALARM mask hwmon: (k10temp) Update documentation and add temp2_input info hwmon: (nct7904) Fix the incorrect value of vsen_mask in nct7904_data struct
-
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linuxLinus Torvalds authored
Pull MTD fixes from Richard Weinberger: "Two fixes for MTD: - spi-nor: Fix for a regression in write_sr() - rawnand: Regression fix for the au1550nd driver" * tag 'fixes-for-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: mtd: rawnand: au1550nd: Fix au_read_buf16() prototype mtd: spi-nor: Fix direction of the write_sr() transfer
-
git://git.kernel.dk/linux-blockLinus Torvalds authored
Pull io_uring fix from Jens Axboe: "Single small fix for a regression in the sequence logic for linked commands" * tag 'for-linus-20191012' of git://git.kernel.dk/linux-block: io_uring: fix sequence logic for timeout requests
-
Petr Mladek authored
A customer reported the following softlockup: [899688.160002] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [test.sh:16464] [899688.160002] CPU: 0 PID: 16464 Comm: test.sh Not tainted 4.12.14-6.23-azure #1 SLE12-SP4 [899688.160002] RIP: 0010:up_write+0x1a/0x30 [899688.160002] Kernel panic - not syncing: softlockup: hung tasks [899688.160002] RIP: 0010:up_write+0x1a/0x30 [899688.160002] RSP: 0018:ffffa86784d4fde8 EFLAGS: 00000257 ORIG_RAX: ffffffffffffff12 [899688.160002] RAX: ffffffff970fea00 RBX: 0000000000000001 RCX: 0000000000000000 [899688.160002] RDX: ffffffff00000001 RSI: 0000000000000080 RDI: ffffffff970fea00 [899688.160002] RBP: ffffffffffffffff R08: ffffffffffffffff R09: 0000000000000000 [899688.160002] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8b59014720d8 [899688.160002] R13: ffff8b59014720c0 R14: ffff8b5901471090 R15: ffff8b5901470000 [899688.160002] tracing_read_pipe+0x336/0x3c0 [899688.160002] __vfs_read+0x26/0x140 [899688.160002] vfs_read+0x87/0x130 [899688.160002] SyS_read+0x42/0x90 [899688.160002] do_syscall_64+0x74/0x160 It caught the process in the middle of trace_access_unlock(). There is no loop. So, it must be looping in the caller tracing_read_pipe() via the "waitagain" label. Crashdump analyze uncovered that iter->seq was completely zeroed at this point, including iter->seq.seq.size. It means that print_trace_line() was never able to print anything and there was no forward progress. The culprit seems to be in the code: /* reset all but tr, trace, and overruns */ memset(&iter->seq, 0, sizeof(struct trace_iterator) - offsetof(struct trace_iterator, seq)); It was added by the commit 53d0aa77 ("ftrace: add logic to record overruns"). It was v2.6.27-rc1. It was the time when iter->seq looked like: struct trace_seq { unsigned char buffer[PAGE_SIZE]; unsigned int len; }; There was no "size" variable and zeroing was perfectly fine. The solution is to reinitialize the structure after or without zeroing. Link: http://lkml.kernel.org/r/20191011142134.11997-1-pmladek@suse.comSigned-off-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
-
Srivatsa S. Bhat (VMware) authored
max_latency is intended to record the maximum ever observed hardware latency, which may occur in either part of the loop (inner/outer). So we need to also consider the outer-loop sample when updating max_latency. Link: http://lkml.kernel.org/r/157073345463.17189.18124025522664682811.stgit@srivatsa-ubuntu Fixes: e7c15cd8 ("tracing: Added hardware latency tracer") Cc: stable@vger.kernel.org Signed-off-by: Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
-
Srivatsa S. Bhat (VMware) authored
nmi_total_ts is supposed to record the total time spent in *all* NMIs that occur on the given CPU during the (active portion of the) sampling window. However, the code seems to be overwriting this variable for each NMI, thereby only recording the time spent in the most recent NMI. Fix it by accumulating the duration instead. Link: http://lkml.kernel.org/r/157073343544.17189.13911783866738671133.stgit@srivatsa-ubuntu Fixes: 7b2c8625 ("tracing: Add NMI tracing in hwlat detector") Cc: stable@vger.kernel.org Signed-off-by: Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
-
Steven Rostedt (VMware) authored
The removal of the longjmp code in recordmcount.c mistakenly made the return of make_nop() being negative an exit of nop_mcount(). It should not exit the routine, but instead just not process that part of the code. By exiting with an error code, it would cause the update of recordmcount to fail some files which would fail the build if ftrace function tracing was enabled. Link: http://lkml.kernel.org/r/20191009110538.5909fec6@gandalf.local.homeReported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Fixes: 3f1df120 ("recordmcount: Rewrite error/success handling") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
-
Steven Rostedt (VMware) authored
If on boot up, lockdown is activated for tracefs, don't even bother creating the files. This can also prevent instances from being created if lockdown is in effect. Link: http://lkml.kernel.org/r/CAHk-=whC6Ji=fWnjh2+eS4b15TnbsS4VPVtvBOwCy1jjEG_JHQ@mail.gmail.comSuggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
-
Steven Rostedt (VMware) authored
Added various checks on open tracefs calls to see if tracefs is in lockdown mode, and if so, to return -EPERM. Note, the event format files (which are basically standard on all machines) as well as the enabled_functions file (which shows what is currently being traced) are not lockde down. Perhaps they should be, but it seems counter intuitive to lockdown information to help you know if the system has been modified. Link: http://lkml.kernel.org/r/CAHk-=wj7fGPKUspr579Cii-w_y60PtRaiDgKuxVtBAMK0VNNkA@mail.gmail.comSuggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
-
Steven Rostedt (VMware) authored
Currently, most files in the tracefs directory test if tracing_disabled is set. If so, it should return -ENODEV. The tracing_disabled is called when tracing is found to be broken. Originally it was done in case the ring buffer was found to be corrupted, and we wanted to prevent reading it from crashing the kernel. But it's also called if a tracing selftest fails on boot. It's a one way switch. That is, once it is triggered, tracing is disabled until reboot. As most tracefs files can also be used by instances in the tracefs directory, they need to be carefully done. Each instance has a trace_array associated to it, and when the instance is removed, the trace_array is freed. But if an instance is opened with a reference to the trace_array, then it requires looking up the trace_array to get its ref counter (as there could be a race with it being deleted and the open itself). Once it is found, a reference is added to prevent the instance from being removed (and the trace_array associated with it freed). Combine the two checks (tracing_disabled and trace_array_get()) into a single helper function. This will also make it easier to add lockdown to tracefs later. Link: http://lkml.kernel.org/r/20191011135458.7399da44@gandalf.local.homeSigned-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
-
Steven Rostedt (VMware) authored
Instead of having the trace events system open call open code the taking of the trace_array descriptor (with trace_array_get()) and then calling trace_open_generic(), have it use the tracing_open_generic_tr() that does the combination of the two. This requires making tracing_open_generic_tr() global. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
-
Steven Rostedt (VMware) authored
As instances may have different tracers available, we need to look at the trace_array descriptor that shows the list of the available tracers for the instance. But there's a race between opening the file and an admin deleting the instance. The trace_array_get() needs to be called before accessing the trace_array. Cc: stable@vger.kernel.org Fixes: 607e2ea1 ("tracing: Set up infrastructure to allow tracers for instances") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
-
Steven Rostedt (VMware) authored
The ftrace set_ftrace_filter and set_ftrace_notrace files are specific for an instance now. They need to take a reference to the instance otherwise there could be a race between accessing the files and deleting the instance. It wasn't until the :mod: caching where these file operations started referencing the trace_array directly. Cc: stable@vger.kernel.org Fixes: 673feb9d ("ftrace: Add :mod: caching infrastructure to trace_array") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
-
Steven Rostedt (VMware) authored
Running the latest kernel through my "make instances" stress tests, I triggered the following bug (with KASAN and kmemleak enabled): mkdir invoked oom-killer: gfp_mask=0x40cd0(GFP_KERNEL|__GFP_COMP|__GFP_RECLAIMABLE), order=0, oom_score_adj=0 CPU: 1 PID: 2229 Comm: mkdir Not tainted 5.4.0-rc2-test #325 Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014 Call Trace: dump_stack+0x64/0x8c dump_header+0x43/0x3b7 ? trace_hardirqs_on+0x48/0x4a oom_kill_process+0x68/0x2d5 out_of_memory+0x2aa/0x2d0 __alloc_pages_nodemask+0x96d/0xb67 __alloc_pages_node+0x19/0x1e alloc_slab_page+0x17/0x45 new_slab+0xd0/0x234 ___slab_alloc.constprop.86+0x18f/0x336 ? alloc_inode+0x2c/0x74 ? irq_trace+0x12/0x1e ? tracer_hardirqs_off+0x1d/0xd7 ? __slab_alloc.constprop.85+0x21/0x53 __slab_alloc.constprop.85+0x31/0x53 ? __slab_alloc.constprop.85+0x31/0x53 ? alloc_inode+0x2c/0x74 kmem_cache_alloc+0x50/0x179 ? alloc_inode+0x2c/0x74 alloc_inode+0x2c/0x74 new_inode_pseudo+0xf/0x48 new_inode+0x15/0x25 tracefs_get_inode+0x23/0x7c ? lookup_one_len+0x54/0x6c tracefs_create_file+0x53/0x11d trace_create_file+0x15/0x33 event_create_dir+0x2a3/0x34b __trace_add_new_event+0x1c/0x26 event_trace_add_tracer+0x56/0x86 trace_array_create+0x13e/0x1e1 instance_mkdir+0x8/0x17 tracefs_syscall_mkdir+0x39/0x50 ? get_dname+0x31/0x31 vfs_mkdir+0x78/0xa3 do_mkdirat+0x71/0xb0 sys_mkdir+0x19/0x1b do_fast_syscall_32+0xb0/0xed I bisected this down to the addition of the proxy_ops into tracefs for lockdown. It appears that the allocation of the proxy_ops and then freeing it in the destroy_inode callback, is causing havoc with the memory system. Reading the documentation about destroy_inode and talking with Linus about this, this is buggy and wrong. When defining the destroy_inode() method, it is expected that the destroy_inode() will also free the inode, and not just the extra allocations done in the creation of the inode. The faulty commit causes a memory leak of the inode data structure when they are deleted. Instead of allocating the proxy_ops (and then having to free it) the checks should be done by the open functions themselves, and not hack into the tracefs directory. First revert the tracefs updates for locked_down and then later we can add the locked_down checks in the kernel/trace files. Link: http://lkml.kernel.org/r/20191011135458.7399da44@gandalf.local.home Fixes: ccbd54ff ("tracefs: Restrict tracefs when the kernel is locked down") Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
-
- 12 Oct, 2019 17 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-miscLinus Torvalds authored
Pull char/misc driver fixes from Greg KH: "Here are some small char/misc driver fixes for 5.4-rc3. Nothing huge here. Some binder driver fixes (although it is still being discussed if these all fix the reported issues or not, so more might be coming later), some mei device ids and fixes, and a google firmware driver bugfix that fixes a regression, as well as some other tiny fixes. All have been in linux-next with no reported issues" * tag 'char-misc-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: firmware: google: increment VPD key_len properly w1: ds250x: Fix build error without CRC16 virt: vbox: fix memory leak in hgcm_call_preprocess_linaddr binder: Fix comment headers on binder_alloc_prepare_to_free() binder: prevent UAF read in print_binder_transaction_log_entry() misc: fastrpc: prevent memory leak in fastrpc_dma_buf_attach mei: avoid FW version request on Ibex Peak and earlier mei: me: add comet point (lake) LP device ids
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/stagingLinus Torvalds authored
Pull staging/IIO driver fixes from Greg KH: "Here are some staging and IIO driver fixes for 5.4-rc3. The "biggest" thing here is a removal of the fbtft device and flexfb code as they have been abandoned by their authors and are no longer needed for that hardware. Other than that, the usual amount of staging driver and iio driver fixes for reported issues, and some speakup sysfs file documentation, which has been long awaited for. All have been in linux-next with no reported issues" * tag 'staging-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (32 commits) iio: Fix an undefied reference error in noa1305_probe iio: light: opt3001: fix mutex unlock race iio: adc: ad799x: fix probe error handling iio: light: add missing vcnl4040 of_compatible iio: light: fix vcnl4000 devicetree hooks iio: imu: st_lsm6dsx: fix waitime for st_lsm6dsx i2c controller iio: adc: axp288: Override TS pin bias current for some models iio: imu: adis16400: fix memory leak iio: imu: adis16400: release allocated memory on failure iio: adc: stm32-adc: fix a race when using several adcs with dma and irq iio: adc: stm32-adc: move registers definitions iio: accel: adxl372: Perform a reset at start up iio: accel: adxl372: Fix push to buffers lost samples iio: accel: adxl372: Fix/remove limitation for FIFO samples iio: adc: hx711: fix bug in sampling of data staging: vt6655: Fix memory leak in vt6655_probe staging: exfat: Use kvzalloc() instead of kzalloc() for exfat_sb_info Staging: fbtft: fix memory leak in fbtft_framebuffer_alloc staging: speakup: document sysfs attributes staging: rtl8188eu: fix HighestRate check in odm_ARFBRefresh_8188E() ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/ttyLinus Torvalds authored
Pull tty/serial driver fixes from Greg KH: "Here are some small tty and serial driver fixes for 5.4-rc3 that resolve a number of reported issues and regressions. None of these are huge, full details are in the shortlog. There's also a MAINTAINERS update that I think you might have already taken in your tree already, but git should handle that merge easily. All have been in linux-next with no reported issues" * tag 'tty-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: MAINTAINERS: kgdb: Add myself as a reviewer for kgdb/kdb tty: serial: imx: Use platform_get_irq_optional() for optional IRQs serial: fix kernel-doc warning in comments serial: 8250_omap: Fix gpio check for auto RTS/CTS serial: mctrl_gpio: Check for NULL pointer tty: serial: fsl_lpuart: Fix lpuart_flush_buffer() tty: serial: Fix PORT_LINFLEXUART definition tty: n_hdlc: fix build on SPARC serial: uartps: Fix uartps_major handling serial: uartlite: fix exit path null pointer tty: serial: linflexuart: Fix magic SysRq handling serial: sh-sci: Use platform_get_irq_optional() for optional interrupts dt-bindings: serial: sh-sci: Document r8a774b1 bindings serial/sifive: select SERIAL_EARLYCON tty: serial: rda: Fix the link time qualifier of 'rda_uart_exit()' tty: serial: owl: Fix the link time qualifier of 'owl_uart_exit()'
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usbLinus Torvalds authored
Pull USB fixes from Greg KH: "Here are a lot of small USB driver fixes for 5.4-rc3. syzbot has stepped up its testing of the USB driver stack, now able to trigger fun race conditions between disconnect and probe functions. Because of that we have a lot of fixes in here from Johan and others fixing these reported issues that have been around since almost all time. We also are just deleting the rio500 driver, making all of the syzbot bugs found in it moot as it turns out no one has been using it for years as there is a userspace version that is being used instead. There are also a number of other small fixes in here, all resolving reported issues or regressions. All have been in linux-next without any reported issues" * tag 'usb-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (65 commits) USB: yurex: fix NULL-derefs on disconnect USB: iowarrior: use pr_err() USB: iowarrior: drop redundant iowarrior mutex USB: iowarrior: drop redundant disconnect mutex USB: iowarrior: fix use-after-free after driver unbind USB: iowarrior: fix use-after-free on release USB: iowarrior: fix use-after-free on disconnect USB: chaoskey: fix use-after-free on release USB: adutux: fix use-after-free on release USB: ldusb: fix NULL-derefs on driver unbind USB: legousbtower: fix use-after-free on release usb: cdns3: Fix for incorrect DMA mask. usb: cdns3: fix cdns3_core_init_role() usb: cdns3: gadget: Fix full-speed mode USB: usb-skeleton: drop redundant in-urb check USB: usb-skeleton: fix use-after-free after driver unbind USB: usb-skeleton: fix NULL-deref on disconnect usb:cdns3: Fix for CV CH9 running with g_zero driver. usb: dwc3: Remove dev_err() on platform_get_irq() failure usb: dwc3: Switch to platform_get_irq_byname_optional() ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull scheduler fixes from Ingo Molnar: "Two fixes: a guest-cputime accounting fix, and a cgroup bandwidth quota precision fix" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/vtime: Fix guest/system mis-accounting on task switch sched/fair: Scale bandwidth quota and period without losing quota/period ratio precision
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull perf fixes from Ingo Molnar: "Mostly tooling fixes, but also a couple of updates for new Intel models (which are technically hw-enablement, but to users it's a fix to perf behavior on those new CPUs - hope this is fine), an AUX inheritance fix, event time-sharing fix, and a fix for lost non-perf NMI events on AMD systems" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits) perf/x86/cstate: Add Tiger Lake CPU support perf/x86/msr: Add Tiger Lake CPU support perf/x86/intel: Add Tiger Lake CPU support perf/x86/cstate: Update C-state counters for Ice Lake perf/x86/msr: Add new CPU model numbers for Ice Lake perf/x86/cstate: Add Comet Lake CPU support perf/x86/msr: Add Comet Lake CPU support perf/x86/intel: Add Comet Lake CPU support perf/x86/amd: Change/fix NMI latency mitigation to use a timestamp perf/core: Fix corner case in perf_rotate_context() perf/core: Rework memory accounting in perf_mmap() perf/core: Fix inheritance of aux_output groups perf annotate: Don't return -1 for error when doing BPF disassembly perf annotate: Return appropriate error code for allocation failures perf annotate: Fix arch specific ->init() failure errors perf annotate: Propagate the symbol__annotate() error return perf annotate: Fix the signedness of failure returns perf annotate: Propagate perf_env__arch() error perf evsel: Fall back to global 'perf_env' in perf_evsel__env() perf tools: Propagate get_cpuid() error ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull EFI fixes from Ingo Molnar: "Misc EFI fixes all across the map: CPER error report fixes, fixes to TPM event log parsing, fix for a kexec hang, a Sparse fix and other fixes" * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi/tpm: Fix sanity check of unsigned tbl_size being less than zero efi/x86: Do not clean dummy variable in kexec path efi: Make unexported efi_rci2_sysfs_init() static efi/tpm: Only set 'efi_tpm_final_log_size' after successful event log parsing efi/tpm: Don't traverse an event log with no events efi/tpm: Don't access event->count when it isn't mapped efivar/ssdt: Don't iterate over EFI vars if no SSDT override was specified efi/cper: Fix endianness of PCIe class code
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 fixes from Ingo Molnar: "A handful of fixes: a kexec linking fix, an AMD MWAITX fix, a vmware guest support fix when built under Clang, and new CPU model number definitions" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu: Add Comet Lake to the Intel CPU models header lib/string: Make memzero_explicit() inline instead of external x86/cpu/vmware: Use the full form of INL in VMWARE_PORT x86/asm: Fix MWAITX C-state hint value
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 license tag fixlets from Ingo Molnar: "Fix a couple of SPDX tags in x86 headers to follow the canonical pattern" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Use the correct SPDX License Identifier in headers
-
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linuxLinus Torvalds authored
Pull RISC-V fixes from Paul Walmsley: - Fix several bugs in the breakpoint trap handler - Drop an unnecessary loop around calls to preempt_schedule_irq() * tag 'riscv/for-v5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: RISC-V: entry: Remove unneeded need_resched() loop riscv: Correct the handling of unexpected ebreak in do_trap_break() riscv: avoid sending a SIGTRAP to a user thread trapped in WARN() riscv: avoid kernel hangs when trapped in BUG()
-
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linuxLinus Torvalds authored
Pull MIPS fixes from Paul Burton: - Build fixes for CONFIG_OPTIMIZE_INLINING=y builds in which the compiler may choose not to inline __xchg() & __cmpxchg(). - A build fix for Loongson configurations with GCC 9.x. - Expose some extra HWCAP bits to indicate support for various instruction set extensions to userland. - Fix bad stack access in firmware handling code for old SNI RM200/300/400 machines. * tag 'mips_fixes_5.4_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: Disable Loongson MMI instructions for kernel build MIPS: elf_hwcap: Export userspace ASEs MIPS: fw: sni: Fix out of bounds init of o32 stack MIPS: include: Mark __xchg as __always_inline MIPS: include: Mark __cmpxchg as __always_inline
-
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds authored
Pull powerpc fixes from Michael Ellerman: "Fix a kernel crash in spufs_create_root() on Cell machines, since the new mount API went in. Fix a regression in our KVM code caused by our recent PCR changes. Avoid a warning message about a failing hypervisor API on systems that don't have that API. A couple of minor build fixes. Thanks to: Alexey Kardashevskiy, Alistair Popple, Desnes A. Nunes do Rosario, Emmanuel Nicolet, Jordan Niethe, Laurent Dufour, Stephen Rothwell" * tag 'powerpc-5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: spufs: fix a crash in spufs_create_root() powerpc/kvm: Fix kvmppc_vcore->in_guest value in kvmhv_switch_to_host selftests/powerpc: Fix compile error on tlbie_test due to newer gcc powerpc/pseries: Remove confusing warning message. powerpc/64s/radix: Fix build failure with RADIX_MMU=n
-
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tipLinus Torvalds authored
Pull xen fixes from Juergen Gross: - correct panic handling when running as a Xen guest - cleanup the Xen grant driver to remove printing a pointer being always NULL - remove a soon to be wrong call of of_dma_configure() * tag 'for-linus-5.4-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: Stop abusing DT of_dma_configure API xen/grant-table: remove unnecessary printing x86/xen: Return from panic notifier
-
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linuxLinus Torvalds authored
Pull s390 fixes from Vasily Gorbik: - Fix virtio-ccw DMA regression - Fix compiler warnings in uaccess * tag 's390-5.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/uaccess: avoid (false positive) compiler warnings s390/cio: fix virtio-ccw DMA without PV
-
Kan Liang authored
Tiger Lake is the followon to Ice Lake. From the perspective of Intel cstate residency counters, there is nothing changed compared with Ice Lake. Share icl_cstates with Ice Lake. Update the comments for Tiger Lake. The External Design Specification (EDS) is not published yet. It comes from an authoritative internal source. The patch has been tested on real hardware. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/1570549810-25049-10-git-send-email-kan.liang@linux.intel.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Kan Liang authored
Tiger Lake is the followon to Ice Lake. PPERF and SMI_COUNT MSRs are also supported. The External Design Specification (EDS) is not published yet. It comes from an authoritative internal source. The patch has been tested on real hardware. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/1570549810-25049-9-git-send-email-kan.liang@linux.intel.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Kan Liang authored
Tiger Lake is the followon to Ice Lake. From the perspective of Intel core PMU, there is little changes compared with Ice Lake, e.g. small changes in event list. But it doesn't impact on core PMU functionality. Share the perf code with Ice Lake. The event list patch will be submitted later separately. The patch has been tested on real hardware. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/1570549810-25049-8-git-send-email-kan.liang@linux.intel.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-