- 29 Apr, 2019 1 commit
-
-
Joel Fernandes (Google) authored
Introduce in-kernel headers which are made available as an archive through proc (/proc/kheaders.tar.xz file). This archive makes it possible to run eBPF and other tracing programs that need to extend the kernel for tracing purposes without any dependency on the file system having headers. A github PR is sent for the corresponding BCC patch at: https://github.com/iovisor/bcc/pull/2312 On Android and embedded systems, it is common to switch kernels but not have kernel headers available on the file system. Further once a different kernel is booted, any headers stored on the file system will no longer be useful. This is an issue even well known to distros. By storing the headers as a compressed archive within the kernel, we can avoid these issues that have been a hindrance for a long time. The best way to use this feature is by building it in. Several users have a need for this, when they switch debug kernels, they do not want to update the filesystem or worry about it where to store the headers on it. However, the feature is also buildable as a module in case the user desires it not being part of the kernel image. This makes it possible to load and unload the headers from memory on demand. A tracing program can load the module, do its operations, and then unload the module to save kernel memory. The total memory needed is 3.3MB. By having the archive available at a fixed location independent of filesystem dependencies and conventions, all debugging tools can directly refer to the fixed location for the archive, without concerning with where the headers on a typical filesystem which significantly simplifies tooling that needs kernel headers. The code to read the headers is based on /proc/config.gz code and uses the same technique to embed the headers. Other approaches were discussed such as having an in-memory mountable filesystem, but that has drawbacks such as requiring an in-kernel xz decompressor which we don't have today, and requiring usage of 42 MB of kernel memory to host the decompressed headers at anytime. Also this approach is simpler than such approaches. Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 28 Apr, 2019 2 commits
-
-
Tobin C. Harding authored
Function kobject_init_and_add() is currently misused in a number of places in the kernel. On error return kobject_put() must be called but is at times not. Make the function documentation more explicit about calling kobject_put() in the error path. Signed-off-by: Tobin C. Harding <tobin@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tobin C. Harding authored
There is currently some confusion on how to wind back kobject_init_and_add() during the error paths in code that uses this function. Add documentation to kobject_add() and kobject_del() to help clarify the usage. Signed-off-by: Tobin C. Harding <tobin@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 25 Apr, 2019 20 commits
-
-
Venkata Narendra Kumar Gutta authored
Platform core is using pdev->name as the platform device name to do the binding of the devices with the drivers. But, when the platform driver overrides the platform device name with dev_set_name(), the pdev->name is pointing to a location which is freed and becomes an invalid parameter to do the binding match. use-after-free instance: [ 33.325013] BUG: KASAN: use-after-free in strcmp+0x8c/0xb0 [ 33.330646] Read of size 1 at addr ffffffc10beae600 by task modprobe [ 33.339068] CPU: 5 PID: 518 Comm: modprobe Tainted: G S W O 4.19.30+ #3 [ 33.346835] Hardware name: MTP (DT) [ 33.350419] Call trace: [ 33.352941] dump_backtrace+0x0/0x3b8 [ 33.356713] show_stack+0x24/0x30 [ 33.360119] dump_stack+0x160/0x1d8 [ 33.363709] print_address_description+0x84/0x2e0 [ 33.368549] kasan_report+0x26c/0x2d0 [ 33.372322] __asan_report_load1_noabort+0x2c/0x38 [ 33.377248] strcmp+0x8c/0xb0 [ 33.380306] platform_match+0x70/0x1f8 [ 33.384168] __driver_attach+0x78/0x3a0 [ 33.388111] bus_for_each_dev+0x13c/0x1b8 [ 33.392237] driver_attach+0x4c/0x58 [ 33.395910] bus_add_driver+0x350/0x560 [ 33.399854] driver_register+0x23c/0x328 [ 33.403886] __platform_driver_register+0xd0/0xe0 So, use dev_name(&pdev->dev), which fetches the platform device name from the kobject(dev->kobj->name) of the device instead of the pdev->name. Signed-off-by: Venkata Narendra Kumar Gutta <vnkgutta@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kimberly Brown authored
The kobj_type default_attrs field is being replaced by the default_groups field. Replace klp_ktype_patch's default_attrs field with default_groups and use the ATTRIBUTE_GROUPS macro to create klp_patch_groups. This patch was tested by loading the livepatch-sample module and verifying that the sysfs files for the attributes in the default groups were created. Signed-off-by: Kimberly Brown <kimbrownkd@gmail.com> Acked-by: Jiri Kosina <jkosina@suse.cz> Acked-by: Miroslav Benes <mbenes@suse.cz> Acked-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kimberly Brown authored
The kobj_type default_attrs field is being replaced by the default_groups field. Replace sugov_tunables_ktype's default_attrs field with default groups. Change "sugov_attributes" to "sugov_attrs" and use the ATTRIBUTE_GROUPS macro to create sugov_groups. This patch was tested by setting the scaling governor to schedutil and verifying that the sysfs files for the attributes in the default groups were created. Signed-off-by: Kimberly Brown <kimbrownkd@gmail.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kimberly Brown authored
The kobj_type default_attrs field is being replaced by the default_groups field. Replace padata_attr_type's default_attrs field with default_groups and use the ATTRIBUTE_GROUPS macro to create padata_default_groups. This patch was tested by loading the pcrypt module and verifying that the sysfs files for the attributes in the default groups were created. Signed-off-by: Kimberly Brown <kimbrownkd@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kimberly Brown authored
The kobj_type default_attrs field is being replaced by the default_groups field. Replace irq_kobj_type's default_attrs field with default_groups and use the ATTRIBUTE_GROUPS macro to create irq_groups. This patch was tested by verifying that the sysfs files for the attributes in the default groups were created. Signed-off-by: Kimberly Brown <kimbrownkd@gmail.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kimberly Brown authored
The kobj_type default_attrs field is being replaced by the default_groups field. Replace the default_attrs fields in rx_queue_ktype and netdev_queue_ktype with default_groups. Use the ATTRIBUTE_GROUPS macro to create rx_queue_default_groups and netdev_queue_default_groups. This patch was tested by verifying that the sysfs files for the attributes in the default groups were created. Signed-off-by: Kimberly Brown <kimbrownkd@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kimberly Brown authored
The kobj_type default_attrs field is being replaced by the default_groups field. Replace all of the ktype default_attrs fields in the block subsystem with default_groups and use the ATTRIBUTE_GROUPS macro to create the default groups. Remove default_ctx_attrs[] because it doesn't contain any attributes. This patch was tested by verifying that the sysfs files for the attributes in the default groups were created. Signed-off-by: Kimberly Brown <kimbrownkd@gmail.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kimberly Brown authored
The kobj_type default_attrs field is being replaced by the default_groups field. Replace foo_ktype's default_attrs field with default_groups and use the ATTRIBUTE_GROUPS macro to create foo_default_groups. This patch was tested by loading the kset-example module and verifying that the sysfs files for the attributes in the default group were created. Signed-off-by: Kimberly Brown <kimbrownkd@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kimberly Brown authored
kobj_type currently uses a list of individual attributes to store default attributes. Attribute groups are more flexible than a list of attributes because groups provide support for attribute visibility. So, add support for default attribute groups to kobj_type. In future patches, the existing uses of kobj_type’s attribute list will be converted to attribute groups. When that is complete, kobj_type’s attribute list, “default_attrs”, will be removed. Signed-off-by: Kimberly Brown <kimbrownkd@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
John Garry authored
In commit 376991db ("driver core: Postpone DMA tear-down until after devres release"), we changed the ordering of tearing down the device DMA ops and releasing all the device's resources; this was because the DMA ops should be maintained until we release the device's managed DMA memories. However, we have seen another crash on an arm64 system when a device driver probe fails: hisi_sas_v3_hw 0000:74:02.0: Adding to iommu group 2 scsi host1: hisi_sas_v3_hw BUG: Bad page state in process swapper/0 pfn:313f5 page:ffff7e0000c4fd40 count:1 mapcount:0 mapping:0000000000000000 index:0x0 flags: 0xfffe00000001000(reserved) raw: 0fffe00000001000 ffff7e0000c4fd48 ffff7e0000c4fd48 0000000000000000 raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set bad because of flags: 0x1000(reserved) Modules linked in: CPU: 49 PID: 1 Comm: swapper/0 Not tainted 5.1.0-rc1-43081-g22d97fd-dirty #1433 Hardware name: Huawei D06/D06, BIOS Hisilicon D06 UEFI RC0 - V1.12.01 01/29/2019 Call trace: dump_backtrace+0x0/0x118 show_stack+0x14/0x1c dump_stack+0xa4/0xc8 bad_page+0xe4/0x13c free_pages_check_bad+0x4c/0xc0 __free_pages_ok+0x30c/0x340 __free_pages+0x30/0x44 __dma_direct_free_pages+0x30/0x38 dma_direct_free+0x24/0x38 dma_free_attrs+0x9c/0xd8 dmam_release+0x20/0x28 release_nodes+0x17c/0x220 devres_release_all+0x34/0x54 really_probe+0xc4/0x2c8 driver_probe_device+0x58/0xfc device_driver_attach+0x68/0x70 __driver_attach+0x94/0xdc bus_for_each_dev+0x5c/0xb4 driver_attach+0x20/0x28 bus_add_driver+0x14c/0x200 driver_register+0x6c/0x124 __pci_register_driver+0x48/0x50 sas_v3_pci_driver_init+0x20/0x28 do_one_initcall+0x40/0x25c kernel_init_freeable+0x2b8/0x3c0 kernel_init+0x10/0x100 ret_from_fork+0x10/0x18 Disabling lock debugging due to kernel taint BUG: Bad page state in process swapper/0 pfn:313f6 page:ffff7e0000c4fd80 count:1 mapcount:0 mapping:0000000000000000 index:0x0 [ 89.322983] flags: 0xfffe00000001000(reserved) raw: 0fffe00000001000 ffff7e0000c4fd88 ffff7e0000c4fd88 0000000000000000 raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 The crash occurs for the same reason. In this case, on the really_probe() failure path, we are still clearing the DMA ops prior to releasing the device's managed memories. This patch fixes this issue by reordering the DMA ops teardown and the call to devres_release_all() on the failure path. Reported-by: Xiang Chen <chenxiang66@hisilicon.com> Tested-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Andy Shevchenko authored
Since insert_resource() might return an error we don't need to shadow its error code and would safely propagate to the user. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Andrea Parri authored
smp_mb__before_atomic() can not be applied to atomic_set(). Remove the barrier and rely on RELEASE synchronization. Fixes: ba16b284 ("kernfs: add an API to get kernfs node from inode number") Cc: stable@vger.kernel.org Signed-off-by: Andrea Parri <andrea.parri@amarulasolutions.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Christina Quast authored
flies => files Signed-off-by: Christina Quast <cquast@hanoverdisplays.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Qian Cai authored
The commit 665ac7e9 ("acpi/hmat: Register processor domain to its memory") introduced an uninitialized "struct memory_target" that could cause an incorrect branching. drivers/acpi/hmat/hmat.c:385:6: warning: variable 'target' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] if (p->flags & ACPI_HMAT_MEMORY_PD_VALID) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/acpi/hmat/hmat.c:392:6: note: uninitialized use occurs here if (target && p->flags & ACPI_HMAT_PROCESSOR_PD_VALID) { ^~~~~~ drivers/acpi/hmat/hmat.c:385:2: note: remove the 'if' if its condition is always true if (p->flags & ACPI_HMAT_MEMORY_PD_VALID) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/acpi/hmat/hmat.c:369:30: note: initialize the variable 'target' to silence this warning struct memory_target *target; ^ = NULL Signed-off-by: Qian Cai <cai@lca.pw> Reviewed-by: Mukesh Ojha <mojha@codeaurora.org> Fixes: 665ac7e9 ("acpi/hmat: Register processor domain to its memory") Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Alison Schofield authored
ACPI 6.3 changed the subtable "Memory Subsystem Address Range Structure" to "Memory Proximity Domain Attributes Structure". Updating and renaming of the structure was included in commit: ACPICA: ACPI 6.3: HMAT updates (9a8d961f) Rename the enum type to match the subtable and structure naming. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Qian Cai authored
The commit 665ac7e9 ("acpi/hmat: Register processor domain to its memory") introduced some memory leaks below due to it fails to release the heap memory in an error path, and then those statically-allocated __initdata memory which reference them get freed during boot renders those heap memory as leaks. Since it is valid to pass NULL to acpi_put_table(), it is fine to call it even if acpi_get_table() returns an error. unreferenced object 0xc8ff8008349e9400 (size 128): comm "swapper/0", pid 1, jiffies 4294709236 (age 48121.476s) hex dump (first 32 bytes): 00 d0 9e 34 08 80 ff 84 d8 00 43 11 00 10 ff ff ...4......C..... 00 00 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000869d4503>] __kmalloc+0x568/0x600 [<0000000070fd6afb>] alloc_memory_target+0x50/0xd8 [<00000000efa2081e>] srat_parse_mem_affinity+0x58/0x5c [<000000008bfaef74>] acpi_parse_entries_array+0x1c8/0x2c0 [<0000000022804877>] acpi_table_parse_entries_array+0x11c/0x138 [<00000000ffe9cd34>] acpi_table_parse_entries+0x7c/0xac [<00000000a7023afd>] hmat_init+0x90/0x174 [<00000000694a86c1>] do_one_initcall+0x2d8/0x5f8 [<0000000024889da9>] do_initcall_level+0x37c/0x3fc [<000000009be02908>] do_basic_setup+0x38/0x50 [<0000000037b3ac0a>] kernel_init_freeable+0x194/0x258 [<00000000f5741184>] kernel_init+0x18/0x334 [<000000007b30f423>] ret_from_fork+0x10/0x18 [<000000006c7147a8>] 0xffffffffffffffff Signed-off-by: Qian Cai <cai@lca.pw> Fixes: 665ac7e9 ("acpi/hmat: Register processor domain to its memory") Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Bartosz Golaszewski authored
It should have been 'management' not 'managemend'. Fixes: 7945f929 ("drivers: provide devm_platform_ioremap_resource()") Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Reviewed-by: Mukesh Ojha <mojha@codeaurora.org> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
zhong jiang authored
When adding the memory by probing memory block in sysfs interface, there is an obvious issue that we will unlock the device_hotplug_lock when fails to takes it. That issue was introduced in Commit 8df1d0e4 ("mm/memory_hotplug: make add_memory() take the device_hotplug_lock") We should drop out in time when fails to take the device_hotplug_lock. Fixes: 8df1d0e4 ("mm/memory_hotplug: make add_memory() take the device_hotplug_lock") Reported-by: Yang yingliang <yangyingliang@huawei.com> Signed-off-by: zhong jiang <zhongjiang@huawei.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Borislav Petkov authored
It is not absolutely clear from the docs how the cleanup path after device_add() should look like so spell it out explicitly. No functional changes, just documentation. Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ronald Tschalär authored
Since commit ff9fb72b ("debugfs: return error values, not NULL") these helper functions do not return NULL anymore (with the exception of debugfs_create_u32_array()). Fixes: ff9fb72b ("debugfs: return error values, not NULL") Signed-off-by: Ronald Tschalär <ronald@innovation.ch> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 04 Apr, 2019 14 commits
-
-
Greg Kroah-Hartman authored
There were a few files in the driver core power code that did not have SPDX identifiers on them, so fix that up. At the same time, remove the "free form" text that specified the license of the file, as that is impossible for any tool to properly parse. Cc: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Greg Kroah-Hartman authored
There were two files in the firmware_loader code that did not have SPDX identifiers on them, so fix that up. Cc: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Greg Kroah-Hartman authored
The Makefile in the drivers/base/test/ directory did not have a SPDX identifier on it, so fix that up. Cc: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Lingutla Chandrasekhar authored
If user updates any cpu's cpu_capacity, then the new value is going to be applied to all its online sibling cpus. But this need not to be correct always, as sibling cpus (in ARM, same micro architecture cpus) would have different cpu_capacity with different performance characteristics. So, updating the user supplied cpu_capacity to all cpu siblings is not correct. And another problem is, current code assumes that 'all cpus in a cluster or with same package_id (core_siblings), would have same cpu_capacity'. But with commit '5bdd2b3f ("arm64: topology: add support to remove cpu topology sibling masks")', when a cpu hotplugged out, the cpu information gets cleared in its sibling cpus. So, user supplied cpu_capacity would be applied to only online sibling cpus at the time. After that, if any cpu hotplugged in, it would have different cpu_capacity than its siblings, which breaks the above assumption. So, instead of mucking around the core sibling mask for user supplied value, use device-tree to set cpu capacity. And make the cpu_capacity node as read-only to know the asymmetry between cpus in the system. While at it, remove cpu_scale_mutex usage, which used for sysfs write protection. Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Tested-by: Quentin Perret <quentin.perret@arm.com> Reviewed-by: Quentin Perret <quentin.perret@arm.com> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Lingutla Chandrasekhar <clingutla@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Keith Busch authored
Platforms may provide system memory where some physical address ranges perform differently than others, or is cached by the system on the memory side. Add documentation describing a high level overview of such systems and the perforamnce and caching attributes the kernel provides for applications wishing to query this information. Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Tested-by: Brice Goglin <Brice.Goglin@inria.fr> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Keith Busch authored
Register memory side cache attributes with the memory's node if HMAT provides the side cache iniformation table. Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Tested-by: Brice Goglin <Brice.Goglin@inria.fr> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Keith Busch authored
Save the best performance access attributes and register these with the memory's node if HMAT provides the locality table. While HMAT does make it possible to know performance for all possible initiator-target pairings, we export only the local pairings at this time. Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Tested-by: Brice Goglin <Brice.Goglin@inria.fr> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Keith Busch authored
If the HMAT Subsystem Address Range provides a valid processor proximity domain for a memory domain, or a processor domain matches the performance access of the valid processor proximity domain, register the memory target with that initiator so this relationship will be visible under the node's sysfs directory. Since HMAT requires valid address ranges have an equivalent SRAT entry, verify each memory target satisfies this requirement. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Brice Goglin <Brice.Goglin@inria.fr> Tested-by: Brice Goglin <Brice.Goglin@inria.fr> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Keith Busch authored
System memory may have caches to help improve access speed to frequently requested address ranges. While the system provided cache is transparent to the software accessing these memory ranges, applications can optimize their own access based on cache attributes. Provide a new API for the kernel to register these memory-side caches under the memory node that provides it. The new sysfs representation is modeled from the existing cpu cacheinfo attributes, as seen from /sys/devices/system/cpu/<cpu>/cache/. Unlike CPU cacheinfo though, the node cache level is reported from the view of the memory. A higher level number is nearer to the CPU, while lower levels are closer to the last level memory. The exported attributes are the cache size, the line size, associativity indexing, and write back policy, and add the attributes for the system memory caches to sysfs stable documentation. Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Brice Goglin <Brice.Goglin@inria.fr> Tested-by: Brice Goglin <Brice.Goglin@inria.fr> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Keith Busch authored
Heterogeneous memory systems provide memory nodes with different latency and bandwidth performance attributes. Provide a new kernel interface for subsystems to register the attributes under the memory target node's initiator access class. If the system provides this information, applications may query these attributes when deciding which node to request memory. The following example shows the new sysfs hierarchy for a node exporting performance attributes: # tree -P "read*|write*"/sys/devices/system/node/nodeY/accessZ/initiators/ /sys/devices/system/node/nodeY/accessZ/initiators/ |-- read_bandwidth |-- read_latency |-- write_bandwidth `-- write_latency The bandwidth is exported as MB/s and latency is reported in nanoseconds. The values are taken from the platform as reported by the manufacturer. Memory accesses from an initiator node that is not one of the memory's access "Z" initiator nodes linked in the same directory may observe different performance than reported here. When a subsystem makes use of this interface, initiators of a different access number may not have the same performance relative to initiators in other access numbers, or omitted from the any access class' initiators. Descriptions for memory access initiator performance access attributes are added to sysfs stable documentation. Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Brice Goglin <Brice.Goglin@inria.fr> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Keith Busch authored
Systems may be constructed with various specialized nodes. Some nodes may provide memory, some provide compute devices that access and use that memory, and others may provide both. Nodes that provide memory are referred to as memory targets, and nodes that can initiate memory access are referred to as memory initiators. Memory targets will often have varying access characteristics from different initiators, and platforms may have ways to express those relationships. In preparation for these systems, provide interfaces for the kernel to export the memory relationship among different nodes memory targets and their initiators with symlinks to each other. If a system provides access locality for each initiator-target pair, nodes may be grouped into ranked access classes relative to other nodes. The new interface allows a subsystem to register relationships of varying classes if available and desired to be exported. A memory initiator may have multiple memory targets in the same access class. The target memory's initiators in a given class indicate the nodes access characteristics share the same performance relative to other linked initiator nodes. Each target within an initiator's access class, though, do not necessarily perform the same as each other. A memory target node may have multiple memory initiators. All linked initiators in a target's class have the same access characteristics to that target. The following example show the nodes' new sysfs hierarchy for a memory target node 'Y' with access class 0 from initiator node 'X': # symlinks -v /sys/devices/system/node/nodeX/access0/ relative: /sys/devices/system/node/nodeX/access0/targets/nodeY -> ../../nodeY # symlinks -v /sys/devices/system/node/nodeY/access0/ relative: /sys/devices/system/node/nodeY/access0/initiators/nodeX -> ../../nodeX The new attributes are added to the sysfs stable documentation. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Brice Goglin <Brice.Goglin@inria.fr> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Keith Busch authored
Systems may provide different memory types and export this information in the ACPI Heterogeneous Memory Attribute Table (HMAT). Parse these tables provided by the platform and report the memory access and caching attributes to the kernel messages. Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Tested-by: Brice Goglin <Brice.Goglin@inria.fr> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Keith Busch authored
The Heterogeneous Memory Attribute Table (HMAT) header has different field lengths than the existing parsing uses. Add the HMAT type to the parsing rules so it may be generically parsed. Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Tested-by: Brice Goglin <Brice.Goglin@inria.fr> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Keith Busch authored
Parsing entries in an ACPI table had assumed a generic header structure. There is no standard ACPI header, though, so less common layouts with different field sizes required custom parsers to go through their subtable entry list. Create the infrastructure for adding different table types so parsing the entries array may be more reused for all ACPI system tables and the common code doesn't need to be duplicated. Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Tested-by: Brice Goglin <Brice.Goglin@inria.fr> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 01 Apr, 2019 3 commits
-
-
Tetsuo Handa authored
syzbot is hitting use-after-free bug in uinput module [1]. This is because kobject_uevent(KOBJ_REMOVE) is called again due to commit 0f4dafc0 ("Kobject: auto-cleanup on final unref") after memory allocation fault injection made kobject_uevent(KOBJ_REMOVE) from device_del() from input_unregister_device() fail, while uinput_destroy_device() is expecting that kobject_uevent(KOBJ_REMOVE) is not called after device_del() from input_unregister_device() completed. That commit intended to catch cases where nobody even attempted to send "remove" uevents. But there is no guarantee that an event will ultimately be sent. We are at the point of no return as far as the rest of the kernel is concerned; there are no repeats or do-overs. Also, it is not clear whether some subsystem depends on that commit. If no subsystem depends on that commit, it will be better to remove the state_{add,remove}_uevent_sent logic. But we don't want to risk a regression (in a patch which will be backported) by trying to remove that logic. Therefore, as a first step, let's avoid the use-after-free bug by making sure that kobject_uevent(KOBJ_REMOVE) won't be triggered twice. [1] https://syzkaller.appspot.com/bug?id=8b17c134fe938bbddd75a45afaa9e68af43a362dReported-by: syzbot <syzbot+f648cfb7e0b52bf7ae32@syzkaller.appspotmail.com> Analyzed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Fixes: 0f4dafc0 ("Kobject: auto-cleanup on final unref") Cc: Kay Sievers <kay@vrfy.org> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Geert Uytterhoeven authored
Since commit 7934779a ("Driver-Core: disable /sbin/hotplug by default"), the help text for the /sbin/hotplug fork-bomb says "This should not be used today [...] creates a high system load, or [...] out-of-memory situations during bootup". The rationale for this was that no recent mainstream system used this anymore (in 2010!). A few years later, the complete uevent helper support was made optional in commit 86d56134 ("kobject: Make support for uevent_helper optional."). However, if was still left enabled by default, to support ancient userland. Time passed by, and nothing should use this anymore, so it can be disabled by default. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Greg Kroah-Hartman authored
struct device is big, around 760 bytes on x86_64. It's not a critical structure, but it is embedded everywhere, so making it smaller is always a good thing. With a recent patch that moved a field from struct device to the private structure, some benchmarks showed a very odd regression, despite this structure having nothing to do with those benchmarks. That caused me to look into the layout of the structure. Using 'pahole', it showed a number of holes and ways that the structure could be reordered in order to align some cachelines better, as well as reduce the size of the overall structure. Move 'struct kobj' to the start of the structure, to keep that access in the first cacheline, and try to organize things a bit more compactly where possible By doing these few moves, the result removes at least 8 bytes from 'struct device' on a 64bit system. Given we know there are systems with at least 30k devices in memory at once, every little byte counts, and this change could be a savings of 240k of kernel memory for them. On "normal" systems the overall memory savings would be much less. Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-