- 19 Jun, 2023 40 commits
-
-
SeongJae Park authored
Fix typos including a unnecessary comma and incomplete ':ref:' keywords. Link: https://lkml.kernel.org/r/20230616191742.87531-4-sj@kernel.orgSigned-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
SeongJae Park authored
DAMON user-space tool, damo, has deprecated[1] its old DAMOS schemes specification format. However, an example of DAMON documentation is still using it. Update the example to use one of the alternative options. [1] https://github.com/awslabs/damo/commit/e9950ae68f6c Link: https://lkml.kernel.org/r/20230616191742.87531-3-sj@kernel.orgSigned-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
SeongJae Park authored
Patch series "Docs/{mm,admin-guide}damon: update design and usage docs". Update DAMON design and usage documents for outdated and unnecessarily duplicated parts. This patch (of 7): The 'age' of each region in DAMON monitoring results is an important concept for both monitoring part and DAMOS. And DAMOS section of the design document is mentioning it. However, the age itself is not explained in the document. Add a section for that. Link: https://lkml.kernel.org/r/20230616191742.87531-1-sj@kernel.org Link: https://lkml.kernel.org/r/20230616191742.87531-2-sj@kernel.orgSigned-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Mathieu Desnoyers authored
The mm_struct mm_count field is frequently updated by mmgrab/mmdrop performed by context switch. This causes false-sharing for surrounding mm_struct fields which are read-mostly. This has been observed on a 2sockets/112core/224cpu Intel Sapphire Rapids server running hackbench, and by the kernel test robot will-it-scale testcase. Move the mm_count field into its own cache line to prevent false-sharing with other mm_struct fields. Move mm_count to the first field of mm_struct to minimize the amount of padding required: rather than adding padding before and after the mm_count field, padding is only added after mm_count. Note that I noticed this odd comment in mm_struct: commit 2e302543 ("mm: relocate 'write_protect_seq' in struct mm_struct") /* * With some kernel config, the current mmap_lock's offset * inside 'mm_struct' is at 0x120, which is very optimal, as * its two hot fields 'count' and 'owner' sit in 2 different * cachelines, and when mmap_lock is highly contended, both * of the 2 fields will be accessed frequently, current layout * will help to reduce cache bouncing. * * So please be careful with adding new fields before * mmap_lock, which can easily push the 2 fields into one * cacheline. */ struct rw_semaphore mmap_lock; This comment is rather odd for a few reasons: - It requires addition/removal of mm_struct fields to carefully consider field alignment of _other_ fields, - It expresses the wish to keep an "optimal" alignment for a specific kernel config. I suspect that the author of this comment may want to revisit this topic and perhaps introduce a split-struct approach for struct rw_semaphore, if the need is to place various fields of this structure in different cache lines. Link: https://lkml.kernel.org/r/20230515143536.114960-1-mathieu.desnoyers@efficios.com Fixes: 223baf9d ("sched: Fix performance regression introduced by mm_cid") Fixes: af7f588d ("sched: Introduce per-memory-map concurrency ID") Link: https://lore.kernel.org/lkml/7a0c1db1-103d-d518-ed96-1584a28fbf32@efficios.comReported-by: kernel test robot <yujie.liu@intel.com> Link: https://lore.kernel.org/oe-lkp/202305151017.27581d75-yujie.liu@intel.comSigned-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: Aaron Lu <aaron.lu@intel.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Olivier Dion <odion@efficios.com> Cc: <michael.christie@oracle.com> Cc: Feng Tang <feng.tang@intel.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
ZhangPeng authored
No user checks the return value of mem_cgroup_scan_tasks(). Make the return value void. Link: https://lkml.kernel.org/r/20230616063030.977586-1-zhangpeng362@huawei.comSigned-off-by: ZhangPeng <zhangpeng362@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nanyong Sun <sunnanyong@huawei.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Shakeel Butt <shakeelb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
SeongJae Park authored
Commit 5ff6e2ff ("mm/damon/core: fix divide error in damon_nr_accesses_to_accesses_bp()") fixed a bug by adding arguments validation in damon_set_attrs(). Add a unit test for the added validation to ensure the bug cannot occur again. Link: https://lkml.kernel.org/r/20230615183323.87561-1-sj@kernel.orgSigned-off-by: SeongJae Park <sj@kernel.org> Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Vishal Moola (Oracle) authored
folio_is_longterm_pinnable() already exists as a wrapper function. Now that the whole implementation of is_longterm_pinnable_page() can be implemented using folios, folio_is_longterm_pinnable() can be made its own standalone function - and we can remove is_longterm_pinnable_page(). Link: https://lkml.kernel.org/r/20230614021312.34085-6-vishal.moola@gmail.comSigned-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Vishal Moola (Oracle) authored
try_get_folio() takes in a page, then chooses to do some folio operations based on the flags (either FOLL_GET or FOLL_PIN). We can rewrite this function to be more purpose oriented. After calling try_get_folio(), if neither FOLL_GET nor FOLL_PIN are set, warn and fail. If FOLL_GET is set we can return the result. If FOLL_GET is not set then FOLL_PIN is set, so we pin the folio. This change assists with folio conversions, and makes the function more readable. Link: https://lkml.kernel.org/r/20230614021312.34085-5-vishal.moola@gmail.comSigned-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Vishal Moola (Oracle) authored
verify_dma_pinned() checks that pages are dma-pinned. We can convert this to use folios. Link: https://lkml.kernel.org/r/20230614021312.34085-4-vishal.moola@gmail.comSigned-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Vishal Moola (Oracle) authored
Introduce folio_migratetype() as a folio equivalent for get_pageblock_migratetype(). This function intends to return the migratetype the folio is located in, hence the name choice. Link: https://lkml.kernel.org/r/20230614021312.34085-3-vishal.moola@gmail.comSigned-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Vishal Moola (Oracle) authored
Patch series "Replace is_longterm_pinnable_page()", v2. This patchset introduces some more helper functions for the folio conversions, and converts all callers of is_longterm_pinnable_page() to use folios. This patch (of 5): Introduce folio_is_zone_movable() to act as a folio equivalent for is_zone_movable_page(). This is to assist in later folio conversions. Link: https://lkml.kernel.org/r/20230614021312.34085-1-vishal.moola@gmail.com Link: https://lkml.kernel.org/r/20230614021312.34085-2-vishal.moola@gmail.comSigned-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Marco Elver authored
KASAN's boot time kernel parameter 'kasan.fault=' currently supports 'report' and 'panic', which results in either only reporting bugs or also panicking on reports. However, some users may wish to have more control over when KASAN reports result in a kernel panic: in particular, KASAN reported invalid _writes_ are of special interest, because they have greater potential to corrupt random kernel memory or be more easily exploited. To panic on invalid writes only, introduce 'kasan.fault=panic_on_write', which allows users to choose to continue running on invalid reads, but panic only on invalid writes. Link: https://lkml.kernel.org/r/20230614095158.1133673-1-elver@google.comSigned-off-by: Marco Elver <elver@google.com> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Aleksandr Nogikh <nogikh@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Taras Madan <tarasmadan@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Sergey Senozhatsky authored
Recompression threshold should be below huge-size-class watermark. Any object larger than huge-size-class is a "huge object" and occupies a whole physical page on the zsmalloc side, in other words it's incompressible, as far as zsmalloc is concerned. Link: https://lkml.kernel.org/r/20230614141338.3480029-1-senozhatsky@chromium.orgSigned-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Suggested-by: Brian Geffon <bgeffon@google.com> Acked-by: Brian Geffon <bgeffon@google.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Minchan Kim <minchan@kernel.org> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Domenico Cerasuolo authored
When an entry started writeback, it used to be invalidated with ref count logic alone, meaning that it would stay on the tree until all references were put. The problem with this behavior is that as soon as the writeback started, the ownership of the data held by the entry is passed to the swapcache and should not be left in zswap too. Currently there are no known issues because of this, but this change explicitly invalidates an entry that started writeback to reduce opportunities for future bugs. This patch is a follow up on the series titled "mm: zswap: move writeback LRU from zpool to zswap" + commit f090b7949768("mm: zswap: support exclusive loads"). Link: https://lkml.kernel.org/r/20230614143122.74471-1-cerasuolodomenico@gmail.comSigned-off-by: Domenico Cerasuolo <cerasuolodomenico@gmail.com> Suggested-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Dan Streetman <ddstreet@ieee.org> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Seth Jennings <sjenning@redhat.com> Cc: Vitaly Wool <vitaly.wool@konsulko.com> Cc: Yosry Ahmed <yosryahmed@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Kefeng Wang authored
Since commit c7c3dec1 ("mm: rmap: remove lock_page_memcg()"), no more user, kill lock_page_memcg() and unlock_page_memcg(). Link: https://lkml.kernel.org/r/20230614143612.62575-1-wangkefeng.wang@huawei.comSigned-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Kassey Li authored
cma: display pfn as well as pfn_to_page(pfn) page_owner: display pfn in hex rather than decimal Link: https://lkml.kernel.org/r/20230613092533.15449-1-quic_yingangl@quicinc.comSigned-off-by: Kassey Li <quic_yingangl@quicinc.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
Support large folios in block_truncate_page() and avoid three hidden calls to compound_head(). [willy@infradead.org: fix check of filemap_grab_folio() return value] Link: https://lkml.kernel.org/r/ZItZOt+XxV12HtzL@casper.infradead.org Link: https://lkml.kernel.org/r/20230612210141.730128-15-willy@infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Bob Peterson <rpeterso@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
Saves a call to compound_head() and may be needed to support block size > PAGE_SIZE. Link: https://lkml.kernel.org/r/20230612210141.730128-14-willy@infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Bob Peterson <rpeterso@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
Its one caller already has a folio, so switch it to use the folio API. Removes a hidden call to compound_head(). Link: https://lkml.kernel.org/r/20230612210141.730128-13-willy@infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Bob Peterson <rpeterso@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
Use the folio API and pass the folio from both callers. Saves a hidden call to compound_head(). Link: https://lkml.kernel.org/r/20230612210141.730128-12-willy@infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Bob Peterson <rpeterso@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
Get a folio from the page cache instead of a page, then use the folio API throughout. Removes a few calls to compound_head() and may be needed to support block size > PAGE_SIZE. Link: https://lkml.kernel.org/r/20230612210141.730128-11-willy@infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Bob Peterson <rpeterso@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
Most of the callers already have a folio; convert reiserfs_write_end() to have a folio. Removes a couple of hidden calls to compound_head(). Link: https://lkml.kernel.org/r/20230612210141.730128-10-willy@infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Bob Peterson <rpeterso@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
This removes a hidden call to compound_head() inside __block_commit_write() and moves it to those callers which are still page based. Also make block_write_end() safe for large folios. Link: https://lkml.kernel.org/r/20230612210141.730128-9-willy@infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Bob Peterson <rpeterso@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
If any page in a folio is dirtied, dirty the entire folio. Removes a number of hidden calls to compound_head() and references to page->mapping and page->index. Fixes a pre-existing bug where we could mark a folio as dirty if the file is truncated to a multiple of the page size just as we take the page fault. I don't believe this bug has any bad effect, it's just inefficient. Link: https://lkml.kernel.org/r/20230612210141.730128-8-willy@infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Bob Peterson <rpeterso@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
Keep the interface as struct page, but work entirely on the folio internally. Removes several PAGE_SIZE assumptions and removes some references to page->index and page->mapping. Link: https://lkml.kernel.org/r/20230612210141.730128-7-willy@infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Tested-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Bob Peterson <rpeterso@redhat.com> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
We may someday support folios larger than 4GB, so use a size_t for the byte count within a folio to prevent unpleasant truncations. Link: https://lkml.kernel.org/r/20230612210141.730128-6-willy@infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Tested-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
Remove nine hidden calls to compound_head() by using a folio instead of a page. Link: https://lkml.kernel.org/r/20230612210141.730128-5-willy@infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Tested-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Bob Peterson <rpeterso@redhat.com> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
Add support for large folios and remove some accesses to page->mapping and page->index. Link: https://lkml.kernel.org/r/20230612210141.730128-4-willy@infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Tested-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
Remove a couple of folio->page conversions in the callers, and two calls to compound_head() in the function itself. Rename it from __gfs2_jdata_writepage() to __gfs2_jdata_write_folio(). Link: https://lkml.kernel.org/r/20230612210141.730128-3-willy@infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Tested-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
Patch series "gfs2/buffer folio changes for 6.5", v3. This kind of started off as a gfs2 patch series, then became entwined with buffer heads once I realised that gfs2 was the only remaining caller of __block_write_full_page(). For those not in the gfs2 world, the big point of this series is that block_write_full_page() should now handle large folios correctly. This patch (of 14): Replace a few implicit calls to compound_head() with one explicit one. Link: https://lkml.kernel.org/r/20230612210141.730128-1-willy@infradead.org Link: https://lkml.kernel.org/r/20230612210141.730128-2-willy@infradead.orgSigned-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Tested-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Nick Desaulniers authored
These are equivalent, but DEFINE_READ_MOSTLY_HASHTABLE exists to define a hashtable in the .data..read_mostly section. Link: https://lkml.kernel.org/r/20230609-khugepage-v1-1-dad4e8382298@google.comSigned-off-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Yu Ma authored
When running UnixBench/Execl throughput case, false sharing is observed due to frequent read on base_addr and write on free_bytes, chunk_md. UnixBench/Execl represents a class of workload where bash scripts are spawned frequently to do some short jobs. It will do system call on execl frequently, and execl will call mm_init to initialize mm_struct of the process. mm_init will call __percpu_counter_init for percpu_counters initialization. Then pcpu_alloc is called to read the base_addr of pcpu_chunk for memory allocation. Inside pcpu_alloc, it will call pcpu_alloc_area to allocate memory from a specified chunk. This function will update "free_bytes" and "chunk_md" to record the rest free bytes and other meta data for this chunk. Correspondingly, pcpu_free_area will also update these 2 members when free memory. Call trace from perf is as below: + 57.15% 0.01% execl [kernel.kallsyms] [k] __percpu_counter_init + 57.13% 0.91% execl [kernel.kallsyms] [k] pcpu_alloc - 55.27% 54.51% execl [kernel.kallsyms] [k] osq_lock - 53.54% 0x654278696e552f34 main __execve entry_SYSCALL_64_after_hwframe do_syscall_64 __x64_sys_execve do_execveat_common.isra.47 alloc_bprm mm_init __percpu_counter_init pcpu_alloc - __mutex_lock.isra.17 In current pcpu_chunk layout, `base_addr' is in the same cache line with `free_bytes' and `chunk_md', and `base_addr' is at the last 8 bytes. This patch moves `bound_map' up to `base_addr', to let `base_addr' locate in a new cacheline. With this change, on Intel Sapphire Rapids 112C/224T platform, based on v6.4-rc4, the 160 parallel score improves by 24%. The pcpu_chunk struct is a backing data structure per chunk, so the additional memory should not be dramatic. A chunk covers ballpark between 64kb and 512kb memory depending on some config and boot time stuff, so I believe the additional memory used here is nominal at best. Working the #s on my desktop: Percpu: 58624 kB 28 cores -> ~2.1MB of percpu memory. At say ~128KB per chunk -> 33 chunks, generously 40 chunks. Adding alignment might bump the chunk size ~64 bytes, so in total ~2KB of overhead? I believe we can do a little better to avoid eating that full padding, so likely less than that. [dennis@kernel.org: changelog details] Link: https://lkml.kernel.org/r/20230610030730.110074-1-yu.ma@intel.comSigned-off-by: Yu Ma <yu.ma@intel.com> Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com> Acked-by: Dennis Zhou <dennis@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Shakeel Butt <shakeelb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Miaohe Lin authored
establish_demotion_targets() is defined while CONFIG_MIGRATION is enabled. There's no need to check it again. Link: https://lkml.kernel.org/r/20230610034114.981861-1-linmiaohe@huawei.comSigned-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Miaohe Lin authored
Add __meminit to kcompactd_run() and kcompactd_stop() to ensure they're default to __init when memory hotplug is not enabled. Link: https://lkml.kernel.org/r/20230610034615.997813-1-linmiaohe@huawei.comSigned-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
YueHaibing authored
commit c7f8f31c ("mm: separate vma->lock from vm_area_struct") left this behind. Link: https://lkml.kernel.org/r/20230610101956.20592-1-yuehaibing@huawei.comSigned-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
YueHaibing authored
This inline function is unused, remove it. Link: https://lkml.kernel.org/r/20230610102858.31488-1-yuehaibing@huawei.comSigned-off-by: YueHaibing <yuehaibing@huawei.com> Cc: Jeff Xu <jeffxu@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Liam R. Howlett authored
Android reported a performance regression in the userfaultfd unmap path. A closer inspection on the userfaultfd_unmap_prep() change showed that a second tree walk would be necessary in the reworked code. Fix the regression by passing each VMA that will be unmapped through to the userfaultfd_unmap_prep() function as they are added to the unmap list, instead of re-walking the tree for the VMA. Link: https://lkml.kernel.org/r/20230601015402.2819343-1-Liam.Howlett@oracle.com Fixes: 69dbe6da ("userfaultfd: use maple tree iterator to iterate VMAs") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reported-by: Suren Baghdasaryan <surenb@google.com> Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Tarun Sahu authored
The patch ("mm/folio: Avoid special handling for order value 0 in folio_set_order") [1] removed the need for special handling of order = 0 in folio_set_order. Now, folio_set_order and set_compound_order becomes similar function. This patch removes the set_compound_order and uses folio_set_order instead. [1] https://lore.kernel.org/all/20230609183032.13E08C433D2@smtp.kernel.org/ Link: https://lkml.kernel.org/r/20230612093514.689846-1-tsahu@linux.ibm.comSigned-off-by: Tarun Sahu <tsahu@linux.ibm.com> Reviewed-by Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Domenico Cerasuolo authored
Previously, zswap_header served the purpose of storing the swpentry within zpool pages. This allowed zpool implementations to pass relevant information to the writeback function. However, with the current implementation, writeback is directly handled within zswap. Consequently, there is no longer a necessity for zswap_header, as the swp_entry_t can be stored directly in zswap_entry. Link: https://lkml.kernel.org/r/20230612093815.133504-8-cerasuolodomenico@gmail.comSigned-off-by: Domenico Cerasuolo <cerasuolodomenico@gmail.com> Tested-by: Yosry Ahmed <yosryahmed@google.com> Suggested-by: Yosry Ahmed <yosryahmed@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Dan Streetman <ddstreet@ieee.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Seth Jennings <sjenning@redhat.com> Cc: Vitaly Wool <vitaly.wool@konsulko.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Domenico Cerasuolo authored
zswap_writeback_entry() used to be a callback for the backends, which don't know about struct zswap_entry. Now that the only user is the generic zswap LRU reclaimer, it can be simplified: pass the pinned zswap_entry directly, and consolidate the refcount management in the shrink function. Link: https://lkml.kernel.org/r/20230612093815.133504-7-cerasuolodomenico@gmail.comSigned-off-by: Domenico Cerasuolo <cerasuolodomenico@gmail.com> Tested-by: Yosry Ahmed <yosryahmed@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Dan Streetman <ddstreet@ieee.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Seth Jennings <sjenning@redhat.com> Cc: Vitaly Wool <vitaly.wool@konsulko.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-