1. 16 Oct, 2023 17 commits
    • mm/ksm: add pages_skipped metric · e5a68991
      Stefan Roesch authored
      This change adds the "pages skipped" metric.  To be able to evaluate how
      successful smart page scanning is, the pages skipped metric can be
      compared to the pages scanned metric.
      
      The pages skipped metric is a cumulative counter.  The counter is stored
      under /sys/kernel/mm/ksm/pages_skipped.
      
      Link: https://lkml.kernel.org/r/20230926040939.516161-3-shr@devkernel.io
      Signed-off-by: Stefan Roesch <shr@devkernel.io>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Rik van Riel <riel@surriel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/ksm: add "smart" page scanning mode · 5e924ff5
      Stefan Roesch authored
      Patch series "Smart scanning mode for KSM", v3.
      
      This patch series adds "smart scanning" for KSM.
      
      What is smart scanning?
      =======================
      KSM evaluates all the candidate pages for each scan. It does not use historic
      information from previous scans. This has the effect that candidate pages that
      couldn't be used for KSM de-duplication continue to be evaluated for each scan.
      
      The idea of "smart scanning" is to keep historic information. With the historic
      information we can temporarily skip the candidate page for one or several scans.
      
      Details:
      ========
      "Smart scanning" is to keep two small counters to store if the page has been
      used for KSM. One counter stores how often we already tried to use the page for
      KSM and the other counter stores how often we skip a page.
      
      How often we skip the candidate page depends how often a page failed KSM
      de-duplication. The code skips a maximum of 8 times. During testing this has
      shown to be a good compromise for different workloads.
      
      New sysfs knob:
      ===============
      Smart scanning is not enabled by default.  It can be enabled via the
      /sys/kernel/mm/ksm/smart_scan knob.
      
      Monitoring:
      ===========
      To monitor how effective smart scanning is, a new sysfs knob has been
      introduced. /sys/kernel/mm/ksm/pages_skipped reports how many pages have
      been skipped by smart scanning.
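      
      As a rough illustration only (not part of the patch), a userspace helper
      along the following lines could compare the two counters; the
      pages_scanned path is assumed to sit next to pages_skipped:
      
          #include <stdio.h>
      
          /* Read a single counter from a KSM sysfs file. */
          static unsigned long read_counter(const char *path)
          {
              unsigned long val = 0;
              FILE *f = fopen(path, "r");
      
              if (f) {
                  if (fscanf(f, "%lu", &val) != 1)
                      val = 0;
                  fclose(f);
              }
              return val;
          }
      
          int main(void)
          {
              unsigned long scanned = read_counter("/sys/kernel/mm/ksm/pages_scanned");
              unsigned long skipped = read_counter("/sys/kernel/mm/ksm/pages_skipped");
      
              if (scanned + skipped)
                  printf("skipped %.1f%% of candidate pages\n",
                         100.0 * skipped / (scanned + skipped));
              return 0;
          }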
      
      Results:
      ========
      - Various workloads have shown a 20% - 25% reduction in page scans.
        For the Instagram workload, for instance, the number of pages scanned has
        been reduced from over 20M pages per scan to less than 15M pages.
      - Fewer page scans also resulted in an overall higher de-duplication rate,
        as some shorter-lived pages could additionally be de-duplicated.
      - Fewer pages scanned allows the pages_to_scan parameter to be reduced,
        and this resulted in a 25% reduction in terms of CPU.
      - The improvements have been observed for workloads that enable KSM with
        madvise as well as prctl.
      
      
      This patch (of 4):
      
      This change adds a "smart" page scanning mode for KSM.  So far all the
      candidate pages are continuously scanned to find candidates for
      de-duplication.  There are a considerably number of pages that cannot be
      de-duplicated.  This is costly in terms of CPU.  By using smart scanning
      considerable CPU savings can be achieved.
      
      This change takes the history of scanning pages into account and skips the
      page scanning of certain pages for a while if de-deduplication for this
      page has not been successful in the past.
      
      To do this it introduces two new fields in the ksm_rmap_item structure:
      age and remaining_skips.  age is the KSM age, and remaining_skips
      determines how often scanning of this page is skipped.  The age field is
      incremented each time the page is scanned and the page cannot be de-
      duplicated; it is capped at U8_MAX.
      
      How often a page is skipped depends on how often de-duplication has been
      tried so far, and the number of skips is currently limited to 8.  This
      value has shown to be effective with different workloads.
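      
      A hedged sketch of the logic described above; the field names (age,
      remaining_skips) come from this text, while the helper name, its
      signature and the exact bookkeeping are assumptions:
      
          /* Skip this rmap item for a while if merging keeps failing. */
          static bool smart_scan_should_skip(struct ksm_rmap_item *item)
          {
              if (!ksm_smart_scan)
                  return false;
      
              if (item->remaining_skips) {
                  item->remaining_skips--;
                  return true;            /* sit this scan round out */
              }
      
              if (item->age < U8_MAX)
                  item->age++;            /* scanned again, still not merged */
      
              /* Skip more rounds the longer de-duplication keeps
               * failing, capped at 8 as noted above. */
              item->remaining_skips = min_t(u8, item->age, 8);
              return false;
          }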
      
      The feature is currently disabled by default and can be enabled with the
      new smart_scan knob.
      
      The feature has shown to be very effective: up to 25% of the page scans
      can be eliminated; the pages_to_scan rate can be reduced by 40 - 50% and a
      similar de-duplication rate can be maintained.
      
      [akpm@linux-foundation.org: make ksm_smart_scan default true, for testing]
      Link: https://lkml.kernel.org/r/20230926040939.516161-1-shr@devkernel.io
      Link: https://lkml.kernel.org/r/20230926040939.516161-2-shr@devkernel.io
      Signed-off-by: Stefan Roesch <shr@devkernel.io>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Stefan Roesch <shr@devkernel.io>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • dax, kmem: calculate abstract distance with general interface · 6bc2cfdf
      Huang Ying authored
      Previously, a fixed abstract distance MEMTIER_DEFAULT_DAX_ADISTANCE was
      used for slow memory types in the kmem driver.  This limits the usage of
      the kmem driver; for example, it cannot be used for HBM (high bandwidth
      memory).
      
      So, we use the general abstract distance calculation mechanism in the kmem
      driver to get a more accurate abstract distance on systems with proper
      support.  The original MEMTIER_DEFAULT_DAX_ADISTANCE is used as a fallback
      only.
      
      Now, multiple memory types may be managed by kmem.  These memory types are
      put into the "kmem_memory_types" list and protected by
      kmem_memory_type_lock.
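      
      In sketch form, that bookkeeping could look as follows; the list and
      lock names are the ones mentioned above, while the entry layout and the
      lookup helper are assumptions:
      
          struct kmem_memory_type {
              struct list_head list;
              int adistance;
              struct memory_dev_type *mtype;
          };
      
          static DEFINE_MUTEX(kmem_memory_type_lock);
          static LIST_HEAD(kmem_memory_types);
      
          /* Look up the memory type already registered for an abstract
           * distance; on a miss the caller would alloc_memory_type(adist)
           * and add a new entry under the same lock. */
          static struct memory_dev_type *kmem_get_memory_type(int adist)
          {
              struct kmem_memory_type *kmt;
              struct memory_dev_type *mtype = NULL;
      
              mutex_lock(&kmem_memory_type_lock);
              list_for_each_entry(kmt, &kmem_memory_types, list) {
                  if (kmt->adistance == adist) {
                      mtype = kmt->mtype;
                      break;
                  }
              }
              mutex_unlock(&kmem_memory_type_lock);
              return mtype;
          }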
      
      Link: https://lkml.kernel.org/r/20230926060628.265989-5-ying.huang@intel.com
      Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
      Tested-by: Bharata B Rao <bharata@amd.com>
      Reviewed-by: Dave Jiang <dave.jiang@intel.com>
      Reviewed-by: Alistair Popple <apopple@nvidia.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Wei Xu <weixugc@google.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Yang Shi <shy828301@gmail.com>
      Cc: Rafael J Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • acpi, hmat: calculate abstract distance with HMAT · 3718c02d
      Huang Ying authored
      A memory tiering abstract distance calculation algorithm based on ACPI
      HMAT is implemented.  The basic idea is as follows.
      
      The performance attributes of the system's default DRAM nodes are recorded
      as the baseline, whose abstract distance is MEMTIER_ADISTANCE_DRAM.  Then,
      the abstract distance of a memory node (target) is scaled relative to
      MEMTIER_ADISTANCE_DRAM based on the ratio of the performance attributes of
      the node to those of the default DRAM nodes.
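      
      In sketch form, with illustrative variable names (the in-tree code also
      folds the separate read/write latency and bandwidth values together):
      
          /* Higher latency or lower bandwidth than the DRAM baseline
           * yields a proportionally larger abstract distance. */
          static int perf_to_adistance(unsigned long lat, unsigned long bw,
                                       unsigned long dram_lat,
                                       unsigned long dram_bw)
          {
              return MEMTIER_ADISTANCE_DRAM * lat * dram_bw / (dram_lat * bw);
          }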
      
      The functions to record the read/write latency/bandwidth of the default
      DRAM nodes and calculate abstract distance according to read/write
      latency/bandwidth ratio will be used by CXL CDAT (Coherent Device
      Attribute Table) and other memory device drivers.  So, they are put in
      memory-tiers.c.
      
      Link: https://lkml.kernel.org/r/20230926060628.265989-4-ying.huang@intel.com
      Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
      Tested-by: Bharata B Rao <bharata@amd.com>
      Reviewed-by: Dave Jiang <dave.jiang@intel.com>
      Reviewed-by: Alistair Popple <apopple@nvidia.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Wei Xu <weixugc@google.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Yang Shi <shy828301@gmail.com>
      Cc: Rafael J Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • acpi, hmat: refactor hmat_register_target_initiators() · d0376aac
      Huang Ying authored
      Previously, in hmat_register_target_initiators(), the performance
      attributes were calculated and the corresponding sysfs links and files
      were created too.  This is called during memory onlining.
      
      But now, to calculate the abstract distance of a memory target before
      memory onlining, we need to calculate the performance attributes for a
      memory target without creating sysfs links and files.
      
      To do that, hmat_register_target_initiators() is refactored to make it
      possible to calculate the performance attributes separately.
      
      Link: https://lkml.kernel.org/r/20230926060628.265989-3-ying.huang@intel.com
      Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
      Reviewed-by: Alistair Popple <apopple@nvidia.com>
      Tested-by: Alistair Popple <apopple@nvidia.com>
      Tested-by: Bharata B Rao <bharata@amd.com>
      Reviewed-by: Dave Jiang <dave.jiang@intel.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Wei Xu <weixugc@google.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Yang Shi <shy828301@gmail.com>
      Cc: Rafael J Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • memory tiering: add abstract distance calculation algorithms management · 07a8bdd4
      Huang Ying authored
      Patch series "memory tiering: calculate abstract distance based on ACPI
      HMAT", v4.
      
      We have the explicit memory tiers framework to manage systems with
      multiple types of memory, e.g., DRAM in DIMM slots and CXL memory devices.
      Here, memory devices of the same kind are grouped into memory types, which
      are then put into memory tiers.  To describe the performance of a memory
      type, an abstract distance is defined, which is in direct proportion to
      the memory latency and inversely proportional to the memory bandwidth.
      To keep the code as simple as possible, a fixed abstract distance is used
      in dax/kmem to describe slow memory such as Optane DCPMM.
      
      To support more memory types, this series adds an abstract distance
      calculation algorithm management mechanism, provides an algorithm
      implementation based on ACPI HMAT, and uses the general abstract distance
      calculation interface in the dax/kmem driver.  With this, dax/kmem can
      support HBM (high bandwidth memory) in addition to the original Optane
      DCPMM.
      
      
      This patch (of 4):
      
      The abstract distance may be calculated by various drivers, such as ACPI
      HMAT, CXL CDAT, etc., while it may be used by various code which hot-adds
      memory nodes, such as dax/kmem.  To decouple the algorithm users from the
      providers, an abstract distance calculation algorithm management
      mechanism is implemented in this patch.  It provides an interface for the
      providers to register their implementations, and an interface for the
      users.
      
      Multiple algorithm implementations can cooperate by calculating the
      abstract distance for different memory nodes.  The preference among
      algorithm implementations can be specified via the priority
      (notifier_block.priority).
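      
      A provider registration could look roughly like this; the registration
      function name is assumed from the series, and the my_*() helpers are
      hypothetical:
      
          static int my_adist_cb(struct notifier_block *nb, unsigned long nid,
                                 void *data)
          {
              int *adist = data;
      
              if (!my_driver_manages_node(nid))   /* hypothetical check */
                  return NOTIFY_OK;               /* let other algorithms try */
      
              *adist = my_calc_adist(nid);        /* hypothetical calculation */
              return NOTIFY_STOP;                 /* value settled */
          }
      
          static struct notifier_block my_adist_nb = {
              .notifier_call = my_adist_cb,
              .priority      = 100,       /* higher priority runs first */
          };
      
          /* In driver init: */
          register_mt_adistance_algorithm(&my_adist_nb);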
      
      Link: https://lkml.kernel.org/r/20230926060628.265989-1-ying.huang@intel.com
      Link: https://lkml.kernel.org/r/20230926060628.265989-2-ying.huang@intel.com
      Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
      Tested-by: Bharata B Rao <bharata@amd.com>
      Reviewed-by: Alistair Popple <apopple@nvidia.com>
      Reviewed-by: Dave Jiang <dave.jiang@intel.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Wei Xu <weixugc@google.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Yang Shi <shy828301@gmail.com>
      Cc: Rafael J Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/hugetlb: replace page_ref_freeze() with folio_ref_freeze() in hugetlb_folio_init_vmemmap() · a48bf7b4
      Sidhartha Kumar authored
      No functional difference, folio_ref_freeze() is currently a wrapper for
      page_ref_freeze().
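      
      The stated wrapper relationship, in sketch form:
      
          static inline bool folio_ref_freeze(struct folio *folio, int count)
          {
              return page_ref_freeze(&folio->page, count);
          }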
      
      Link: https://lkml.kernel.org/r/20230926174433.81241-1-sidhartha.kumar@oracle.com
      Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
      Reviewed-by: Muchun Song <songmuchun@bytedance.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Usama Arif <usama.arif@bytedance.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/filemap: remove hugetlb special casing in filemap.c · a08c7193
      Sidhartha Kumar authored
      Remove special cased hugetlb handling code within the page cache by
      changing the granularity of ->index to the base page size rather than the
      huge page size.  The motivation of this patch is to reduce complexity
      within the filemap code while also increasing performance by removing
      branches that are evaluated on every page cache lookup.
      
      To support the change in index, new wrappers for hugetlb page cache
      interactions are added.  These wrappers perform the conversion to a linear
      index which is now expected by the page cache for huge pages.
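      
      A hedged sketch of the index conversion the new wrappers perform; the
      helper name is illustrative.  With ->index now kept in base-page units,
      a huge-page-sized offset is expanded before it reaches the xarray:
      
          /* base pages per huge page of this hstate */
          static inline pgoff_t hugetlb_linear_index(struct hstate *h,
                                                     pgoff_t huge_idx)
          {
              return huge_idx << huge_page_order(h);
          }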
      
      ========================= PERFORMANCE ======================================
      
      Perf was used to check the performance differences after the patch.
      Overall the performance is similar to mainline, with a small additional
      overhead that occurs in __filemap_add_folio() and
      hugetlb_add_to_page_cache().  This is because of the larger overhead that
      occurs in xa_load() and xa_store(), as the xarray is now using more
      entries to store hugetlb folios in the page cache.
      
      Timing
      
      aarch64
          2MB Page Size
              6.5-rc3 + this patch:
                  [root@sidhakum-ol9-1 hugepages]# time fallocate -l 700GB test.txt
                  real    1m49.568s
                  user    0m0.000s
                  sys     1m49.461s
      
              6.5-rc3:
                  [root]# time fallocate -l 700GB test.txt
                  real    1m47.495s
                  user    0m0.000s
                  sys     1m47.370s
          1GB Page Size
              6.5-rc3 + this patch:
                  [root@sidhakum-ol9-1 hugepages1G]# time fallocate -l 700GB test.txt
                  real    1m47.024s
                  user    0m0.000s
                  sys     1m46.921s
      
              6.5-rc3:
                  [root@sidhakum-ol9-1 hugepages1G]# time fallocate -l 700GB test.txt
                  real    1m44.551s
                  user    0m0.000s
                  sys     1m44.438s
      
      x86
          2MB Page Size
              6.5-rc3 + this patch:
                  [root@sidhakum-ol9-2 hugepages]# time fallocate -l 100GB test.txt
                  real    0m22.383s
                  user    0m0.000s
                  sys     0m22.255s
      
              6.5-rc3:
                  [opc@sidhakum-ol9-2 hugepages]$ time sudo fallocate -l 100GB /dev/hugepages/test.txt
                  real    0m22.735s
                  user    0m0.038s
                  sys     0m22.567s
      
          1GB Page Size
              6.5-rc3 + this patch:
                  [root@sidhakum-ol9-2 hugepages1GB]# time fallocate -l 100GB test.txt
                  real    0m25.786s
                  user    0m0.001s
                  sys     0m25.589s
      
              6.5-rc3:
                  [root@sidhakum-ol9-2 hugepages1G]# time fallocate -l 100GB test.txt
                  real    0m33.454s
                  user    0m0.001s
                  sys     0m33.193s
      
      aarch64:
          workload - fallocate a 700GB file backed by huge pages
      
          6.5-rc3 + this patch:
              2MB Page Size:
                  --100.00%--__arm64_sys_fallocate
                                ksys_fallocate
                                vfs_fallocate
                                hugetlbfs_fallocate
                                |
                                |--95.04%--__pi_clear_page
                                |
                                |--3.57%--clear_huge_page
                                |          |
                                |          |--2.63%--rcu_all_qs
                                |          |
                                |           --0.91%--__cond_resched
                                |
                                 --0.67%--__cond_resched
                  0.17%     0.00%             0  fallocate  [kernel.vmlinux]       [k] hugetlb_add_to_page_cache
                  0.14%     0.10%            11  fallocate  [kernel.vmlinux]       [k] __filemap_add_folio
      
          6.5-rc3
              2MB Page Size:
                      --100.00%--__arm64_sys_fallocate
                                ksys_fallocate
                                vfs_fallocate
                                hugetlbfs_fallocate
                                |
                                |--94.91%--__pi_clear_page
                                |
                                |--4.11%--clear_huge_page
                                |          |
                                |          |--3.00%--rcu_all_qs
                                |          |
                                |           --1.10%--__cond_resched
                                |
                                 --0.59%--__cond_resched
                  0.08%     0.01%             1  fallocate  [kernel.kallsyms]  [k] hugetlb_add_to_page_cache
                  0.05%     0.03%             3  fallocate  [kernel.kallsyms]  [k] __filemap_add_folio
      
      x86
          workload - fallocate a 100GB file backed by huge pages
      
          6.5-rc3 + this patch:
              2MB Page Size:
                  hugetlbfs_fallocate
                  |
                  --99.57%--clear_huge_page
                      |
                      --98.47%--clear_page_erms
                          |
                          --0.53%--asm_sysvec_apic_timer_interrupt
      
                  0.04%     0.04%             1  fallocate  [kernel.kallsyms]     [k] xa_load
                  0.04%     0.00%             0  fallocate  [kernel.kallsyms]     [k] hugetlb_add_to_page_cache
                  0.04%     0.00%             0  fallocate  [kernel.kallsyms]     [k] __filemap_add_folio
                  0.04%     0.00%             0  fallocate  [kernel.kallsyms]     [k] xas_store
      
          6.5-rc3
              2MB Page Size:
                      --99.93%--__x64_sys_fallocate
                                vfs_fallocate
                                hugetlbfs_fallocate
                                |
                                 --99.38%--clear_huge_page
                                           |
                                           |--98.40%--clear_page_erms
                                           |
                                            --0.59%--__cond_resched
                  0.03%     0.03%             1  fallocate  [kernel.kallsyms]  [k] __filemap_add_folio
      
      ========================= TESTING ======================================
      
      This patch passes the libhugetlbfs tests and the LTP hugetlb tests.
      
      ********** TEST SUMMARY
      *                      2M
      *                      32-bit 64-bit
      *     Total testcases:   110    113
      *             Skipped:     0      0
      *                PASS:   107    113
      *                FAIL:     0      0
      *    Killed by signal:     3      0
      *   Bad configuration:     0      0
      *       Expected FAIL:     0      0
      *     Unexpected PASS:     0      0
      *    Test not present:     0      0
      * Strange test result:     0      0
      **********
      
          Done executing testcases.
          LTP Version:  20220527-178-g2761a81c4
      
      Page migration was also tested using Mike Kravetz's test program.[8]
      
      [dan.carpenter@linaro.org: fix a NULL vs IS_ERR() bug]
        Link: https://lkml.kernel.org/r/1772c296-1417-486f-8eef-171af2192681@moroto.mountain
      Link: https://lkml.kernel.org/r/20230926192017.98183-1-sidhartha.kumar@oracle.com
      Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
      Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
      Reported-and-tested-by: syzbot+c225dea486da4d5592bd@syzkaller.appspotmail.com
      Closes: https://syzkaller.appspot.com/bug?extid=c225dea486da4d5592bd
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Muchun Song <songmuchun@bytedance.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/ksm: test case for prctl fork/exec workflow · 0374af1d
      Stefan Roesch authored
      This adds a new test case to the ksm functional tests to make sure that
      the KSM setting is inherited by the child process when doing a fork/exec.
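      
      An illustrative outline of such a test (PR_SET_MEMORY_MERGE and
      PR_GET_MEMORY_MERGE are the actual prctl commands; this is a sketch,
      not the selftest's code):
      
          #include <stdio.h>
          #include <sys/prctl.h>
          #include <sys/wait.h>
          #include <unistd.h>
      
          int main(int argc, char *argv[])
          {
              if (argc > 1)   /* exec'd child: flag should have survived */
                  return prctl(PR_GET_MEMORY_MERGE, 0, 0, 0, 0) == 1 ? 0 : 1;
      
              if (prctl(PR_SET_MEMORY_MERGE, 1, 0, 0, 0))
                  return 1;
      
              pid_t pid = fork();
              if (pid == 0) {
                  execl("/proc/self/exe", argv[0], "child", (char *)NULL);
                  _exit(1);
              }
      
              int status;
              waitpid(pid, &status, 0);
              printf("KSM flag across fork/exec: %s\n",
                     WEXITSTATUS(status) == 0 ? "inherited" : "lost");
              return WEXITSTATUS(status);
          }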
      
      Link: https://lkml.kernel.org/r/20230922211141.320789-3-shr@devkernel.io
      Signed-off-by: Stefan Roesch <shr@devkernel.io>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Cc: Carl Klemm <carl@uvos.xyz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Rik van Riel <riel@surriel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/ksm: support fork/exec for prctl · 3c6f33b7
      Stefan Roesch authored
      Patch series "mm/ksm: add fork-exec support for prctl", v4.
      
      A process can enable KSM with the prctl system call.  When the process is
      forked, the KSM flag is inherited by the child process.  However, if the
      process executes an exec system call directly after the fork, the KSM
      setting is cleared.  This patch series addresses this problem.
      
      1) Change the mask in coredump.h for execing a new process
      2) Add a new test case in ksm_functional_tests
      
      
      This patch (of 2):
      
      Today we have two ways to enable KSM:
      
      1) madvise system call
         This long-standing interface enables KSM for a specific memory region.
      
      2) prctl system call
         This is a recent addition that enables KSM for the complete process.
         In addition, when a process is forked, the KSM setting is inherited.
      
      This change only affects the second case.
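      
      For reference, the two entry points in a minimal, hedged example:
      
          #include <stddef.h>
          #include <sys/mman.h>
          #include <sys/prctl.h>
      
          int main(void)
          {
              /* (1) madvise: opt a single region into KSM. */
              size_t len = 2 * 1024 * 1024;
              void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
              if (addr != MAP_FAILED)
                  madvise(addr, len, MADV_MERGEABLE);
      
              /* (2) prctl: opt the whole process in; inherited on fork
               * and, with this patch, preserved across exec as well. */
              return prctl(PR_SET_MEMORY_MERGE, 1, 0, 0, 0);
          }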
      
      One of the use cases for (2) was to support the ability to enable
      KSM for cgroups. This allows systemd to enable KSM for the seed
      process. By enabling it in the seed process all child processes inherit
      the setting.
      
      This works correctly when the process is forked. However, it doesn't
      support the fork/exec workflow.
      
      From the previous cover letter:
      
      ....
      Use case 3:
      With the madvise call sharing opportunities are only enabled for the
      current process: it is a workload-local decision. A considerable number
      of sharing opportunities may exist across multiple workloads or jobs
      (if they are part of the same security domain). Only a higher-level
      entity like a job scheduler or container can know for certain if it's
      running one or more instances of a job. That job scheduler however
      doesn't have the necessary internal workload knowledge to make targeted
      madvise calls.
      ....
      
      In addition it can also be a bit surprising that fork keeps the KSM
      setting and fork/exec does not.
      
      Link: https://lkml.kernel.org/r/20230922211141.320789-1-shr@devkernel.io
      Link: https://lkml.kernel.org/r/20230922211141.320789-2-shr@devkernel.io
      Signed-off-by: Stefan Roesch <shr@devkernel.io>
      Fixes: d7597f59 ("mm: add new api to enable ksm per process")
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Reported-by: Carl Klemm <carl@uvos.xyz>
      Tested-by: Carl Klemm <carl@uvos.xyz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Rik van Riel <riel@surriel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/damon/core: remove unnecessary si_meminfo invoke. · 987ffa5a
      Huan Yang authored
      si_meminfo() reads and assigns more info than just the free/ram page
      counts.  For the DAMOS_WMARK_FREE_MEM_RATE use, getting only the free and
      total ram pages is enough, which saves CPU.
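      
      A sketch of the cheaper read (kernel context; totalram_pages() and
      global_zone_page_state() are existing helpers, but whether the patch
      uses exactly these is an assumption):
      
          static unsigned long free_mem_rate_permille(void)
          {
              /* Read only the two values the watermark check needs,
               * instead of filling a whole struct sysinfo. */
              unsigned long freeram  = global_zone_page_state(NR_FREE_PAGES);
              unsigned long totalram = totalram_pages();
      
              return totalram ? freeram * 1000 / totalram : 0;
          }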
      
      Link: https://lkml.kernel.org/r/20230920015727.4482-1-link@vivo.com
      Signed-off-by: Huan Yang <link@vivo.com>
      Reviewed-by: SeongJae Park <sj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • sched/numa, mm: make numa migrate functions to take a folio · 8c9ae56d
      Kefeng Wang authored
      The cpupid (or access time) is stored in the head page for THP, so it is
      safe to make should_numa_migrate_memory() and numa_hint_fault_latency()
      take a folio.  This is in preparation for large folio numa balancing.
      
      Link: https://lkml.kernel.org/r/20230921074417.24004-7-wangkefeng.wang@huawei.com
      Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "Huang, Ying" <ying.huang@intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Zi Yan <ziy@nvidia.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: mempolicy: make mpol_misplaced() to take a folio · 75c70128
      Kefeng Wang authored
      In preparation for large folio numa balancing, make mpol_misplaced() take
      a folio; no functional change intended.
      
      Link: https://lkml.kernel.org/r/20230921074417.24004-6-wangkefeng.wang@huawei.com
      Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "Huang, Ying" <ying.huang@intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Zi Yan <ziy@nvidia.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: memory: make numa_migrate_prep() to take a folio · cda6d936
      Kefeng Wang authored
      In preparation for large folio numa balancing, make numa_migrate_prep()
      take a folio; no functional change intended.
      
      Link: https://lkml.kernel.org/r/20230921074417.24004-5-wangkefeng.wang@huawei.com
      Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "Huang, Ying" <ying.huang@intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Zi Yan <ziy@nvidia.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: memory: use a folio in do_numa_page() · 6695cf68
      Kefeng Wang authored
      Numa balancing only tries to migrate non-compound pages in do_numa_page(),
      so use a folio in it to save several compound_head() calls.  Note we use
      folio_estimated_sharers(): it is enough to check the folio sharers since
      only normal pages are handled here; if large folio numa balancing is
      supported, a precise folio sharers check will be used.  No functional
      change intended.
      
      Link: https://lkml.kernel.org/r/20230921074417.24004-4-wangkefeng.wang@huawei.com
      Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "Huang, Ying" <ying.huang@intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Zi Yan <ziy@nvidia.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: huge_memory: use a folio in do_huge_pmd_numa_page() · 667ffc31
      Kefeng Wang authored
      Use a folio in do_huge_pmd_numa_page() to reduce three page_folio() calls
      to one; no functional change intended.
      
      Link: https://lkml.kernel.org/r/20230921074417.24004-3-wangkefeng.wang@huawei.com
      Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "Huang, Ying" <ying.huang@intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Zi Yan <ziy@nvidia.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: memory: add vm_normal_folio_pmd() · 65610453
      Kefeng Wang authored
      Patch series "mm: convert numa balancing functions to use a folio", v2.
      
      do_numa_pages() only handles non-compound pages, and only PMD-mapped THPs
      are handled in do_huge_pmd_numa_page().  But a large, PTE-mapped folio
      will be supported so let's convert more numa balancing functions to
      use/take a folio in preparation for that, no functional change intended
      for now.
      
      
      This patch (of 6):
      
      The new vm_normal_folio_pmd() wrapper is similar to vm_normal_folio(),
      which allows callers to completely replace the struct page variables with
      struct folio variables.
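      
      Presumably the wrapper mirrors vm_normal_folio() on top of
      vm_normal_page_pmd(); a sketch:
      
          struct folio *vm_normal_folio_pmd(struct vm_area_struct *vma,
                                            unsigned long addr, pmd_t pmd)
          {
              struct page *page = vm_normal_page_pmd(vma, addr, pmd);
      
              return page ? page_folio(page) : NULL;
          }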
      
      Link: https://lkml.kernel.org/r/20230921074417.24004-1-wangkefeng.wang@huawei.com
      Link: https://lkml.kernel.org/r/20230921074417.24004-2-wangkefeng.wang@huawei.com
      Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: "Huang, Ying" <ying.huang@intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Zi Yan <ziy@nvidia.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  2. 06 Oct, 2023 13 commits
  3. 04 Oct, 2023 10 commits
    • mm: mlock: update mlock_pte_range to handle large folio · dc68badc
      Yin Fengwei authored
      The current kernel only locks base-size folios during the mlock syscall.
      Add large folio support with the following rules:
        - Only mlock a large folio when it is in a VM_LOCKED VMA range
          and fully mapped to the page table.
      
          A fully mapped folio is required because, if the folio is not
          fully mapped to a VM_LOCKED VMA and the system is under memory
          pressure, page reclaim is allowed to pick up this folio, split
          it and reclaim the pages which are not in the VM_LOCKED VMA.
      
        - munlock will apply to a large folio which is in the VMA range
          or crosses the VMA boundary.
      
          This is required to handle the case where the large folio is
          mlocked and the VMA is later split in the middle of the large
          folio.
      
      Link: https://lkml.kernel.org/r/20230918073318.1181104-4-fengwei.yin@intel.com
      Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Ryan Roberts <ryan.roberts@arm.com>
      Cc: Yang Shi <shy828301@gmail.com>
      Cc: Yosry Ahmed <yosryahmed@google.com>
      Cc: Yu Zhao <yuzhao@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: handle large folio when large folio in VM_LOCKED VMA range · 1acbc3f9
      Yin Fengwei authored
      If a large folio is in the range of a VM_LOCKED VMA, it should be mlocked
      to avoid being picked by page reclaim, which may split the large folio
      and then mlock each page again.
      
      Mlock this kind of large folio to prevent it from being picked by page
      reclaim.
      
      For a large folio which crosses the boundary of a VM_LOCKED VMA or is not
      fully mapped to a VM_LOCKED VMA, we'd better not mlock it.  So if the
      system is under memory pressure, this kind of large folio will be split
      and the pages out of the VM_LOCKED VMA can be reclaimed.
      
      Ideally, for a large folio, we should mlock it when the large folio is
      fully mapped to the VMA and munlock it if any page is unmapped from the
      VMA.  But it's not easy to detect whether the large folio is fully mapped
      to the VMA in some cases (like add/remove rmap).  So we update
      mlock_vma_folio() and munlock_vma_folio() to mlock/munlock the folio
      according to vma->vm_flags, and let the caller decide whether they should
      call these two functions.
      
      For add rmap, only mlock normal 4K folios and postpone large folio
      handling to the page reclaim phase.  It is possible to reuse the page
      table iterator to detect whether a folio is fully mapped or not during
      the page reclaim phase.  For remove rmap, invoke munlock_vma_folio() to
      munlock the folio unconditionally, because removing the rmap makes the
      folio not fully mapped to the VMA.
      
      Link: https://lkml.kernel.org/r/20230918073318.1181104-3-fengwei.yin@intel.com
      Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Ryan Roberts <ryan.roberts@arm.com>
      Cc: Yang Shi <shy828301@gmail.com>
      Cc: Yosry Ahmed <yosryahmed@google.com>
      Cc: Yu Zhao <yuzhao@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: add functions folio_in_range() and folio_within_vma() · 28e56657
      Yin Fengwei authored
      Patch series "support large folio for mlock", v3.
      
      Yu mentioned at [1] about the mlock() can't be applied to large folio.
      
      I leant the related code and here is my understanding:
      
      - For RLIMIT_MEMLOCK related, there is no problem.  Because the
        RLIMIT_MEMLOCK statistics is not related underneath page.  That means
        underneath page mlock or munlock doesn't impact the RLIMIT_MEMLOCK
        statistics collection which is always correct.
      
      - For keeping the page in RAM, there is no problem either.  At least,
        during try_to_unmap_one(), once detect the VMA has VM_LOCKED bit set in
        vm_flags, the folio will be kept whatever the folio is mlocked or not.
      
      So the function of mlock for large folio works.  But it's not optimized
      because the page reclaim needs scan these large folio and may split them.
      
      This series classifies large folios for mlock into four types:
        - The large folio is in the VM_LOCKED range and fully mapped to the
          range
      
        - The large folio is in the VM_LOCKED range but not fully mapped to
          the range
      
        - The large folio crosses a VM_LOCKED VMA boundary
      
        - The large folio crosses a last-level page table boundary
      
      For the first type, we mlock the large folio so page reclaim will skip it.
      
      For the second/third type, we don't mlock the large folio.  As the pages
      not mapped to the VM_LOCKED range are mapped to a non-VM_LOCKED range, if
      the system is in a memory pressure situation, the large folio can be
      picked by page reclaim and split.  Then the pages not mapped to the
      VM_LOCKED range can be reclaimed.
      
      For the fourth type, we don't mlock the large folio because locking one
      page table lock can't prevent the part in another last-level page table
      from being unmapped.  Thanks to Ryan for pointing this out.
      
      
      To check whether the folio is fully mapped to the range, the PTEs need to
      be checked to see whether each page of the folio is associated.  This
      requires taking the page table lock and is a heavy operation.  So far,
      the only places that need this check are madvise and page reclaim, and
      these functions already have their own PTE iterators.
      
      patch1 introduces APIs to check whether a large folio is in a VMA range.
      patch2 makes page reclaim/mlock_vma_folio/munlock_vma_folio support
             large folio mlock/munlock.
      patch3 makes the mlock/munlock syscalls support large folios.
      
      Yu also mentioned, during the RFC v2 discussion [3], a race which can
      make a folio unevictable after munlock.
      We decided that race issue didn't block this series based on:
        - That race issue was not introduced by this series
      
        - We had a looks-ok fix for that race issue.  Need to wait
          for the mlock_count fixing patch as Yosry Ahmed suggested [4]
      
      [1] https://lore.kernel.org/linux-mm/CAOUHufbtNPkdktjt_5qM45GegVO-rCFOMkSh0HQminQ12zsV8Q@mail.gmail.com/
      [2] https://lore.kernel.org/linux-mm/20230809061105.3369958-1-fengwei.yin@intel.com/
      [3] https://lore.kernel.org/linux-mm/CAOUHufZ6=9P_=CAOQyw0xw-3q707q-1FVV09dBNDC-hpcpj2Pg@mail.gmail.com/
      
      
      This patch (of 3):
      
      folio_in_range() will be used to check whether a folio is mapped to a
      specific VMA and whether the mapping address of the folio is in the
      range.
      
      There is also a helper function, folio_within_vma(), to check whether a
      folio is within the range of a VMA, based on folio_in_range().
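      
      A hedged sketch of the address-range half of such a check; the helper
      name is illustrative, and the series' real helpers also cover the
      fully-mapped condition:
      
          static bool folio_range_in_vma(struct folio *folio,
                                         struct vm_area_struct *vma)
          {
              /* Map the folio's file offset to a user address in this
               * VMA (assumes a file-order linear mapping). */
              unsigned long start = vma->vm_start +
                      ((folio->index - vma->vm_pgoff) << PAGE_SHIFT);
              unsigned long end = start + folio_size(folio);
      
              return start >= vma->vm_start && end <= vma->vm_end;
          }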
      
      Link: https://lkml.kernel.org/r/20230918073318.1181104-1-fengwei.yin@intel.com
      Link: https://lkml.kernel.org/r/20230918073318.1181104-2-fengwei.yin@intel.com
      Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Ryan Roberts <ryan.roberts@arm.com>
      Cc: Yang Shi <shy828301@gmail.com>
      Cc: Yosry Ahmed <yosryahmed@google.com>
      Cc: Yu Zhao <yuzhao@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/damon/core-test: fix memory leak in damon_new_ctx() · a0ce7925
      Jinjie Ruan authored
      With CONFIG_DAMON_KUNIT_TEST=y, CONFIG_DEBUG_KMEMLEAK=y and
      CONFIG_DEBUG_KMEMLEAK_AUTO_SCAN=y, the below memory leak is detected.
      
      The damon_ctx instances which are allocated by kzalloc() in
      damon_new_ctx() in damon_test_ops_registration() and
      damon_test_set_attrs() are not freed, so use damon_destroy_ctx() to free
      them.  After applying this patch, the following memory leak is never
      detected.
      
          unreferenced object 0xffff2b49c6968800 (size 512):
            comm "kunit_try_catch", pid 350, jiffies 4294895294 (age 557.028s)
            hex dump (first 32 bytes):
              88 13 00 00 00 00 00 00 a0 86 01 00 00 00 00 00  ................
              00 87 93 03 00 00 00 00 0a 00 00 00 00 00 00 00  ................
            backtrace:
              [<0000000088e71769>] slab_post_alloc_hook+0xb8/0x368
              [<0000000073acab3b>] __kmem_cache_alloc_node+0x174/0x290
              [<00000000b5f89cef>] kmalloc_trace+0x40/0x164
              [<00000000eb19e83f>] damon_new_ctx+0x28/0xb4
              [<00000000daf6227b>] damon_test_ops_registration+0x34/0x328
              [<00000000559c4801>] kunit_try_run_case+0x50/0xac
              [<000000003932ed49>] kunit_generic_run_threadfn_adapter+0x20/0x2c
              [<000000003c3e9211>] kthread+0x124/0x130
              [<0000000028f85bdd>] ret_from_fork+0x10/0x20
          unreferenced object 0xffff2b49c1a9cc00 (size 512):
            comm "kunit_try_catch", pid 356, jiffies 4294895306 (age 557.000s)
            hex dump (first 32 bytes):
              88 13 00 00 00 00 00 00 a0 86 01 00 00 00 00 00  ................
              00 00 00 00 00 00 00 00 0a 00 00 00 00 00 00 00  ................
            backtrace:
              [<0000000088e71769>] slab_post_alloc_hook+0xb8/0x368
              [<0000000073acab3b>] __kmem_cache_alloc_node+0x174/0x290
              [<00000000b5f89cef>] kmalloc_trace+0x40/0x164
              [<00000000eb19e83f>] damon_new_ctx+0x28/0xb4
              [<00000000058495c4>] damon_test_set_attrs+0x30/0x1a8
              [<00000000559c4801>] kunit_try_run_case+0x50/0xac
              [<000000003932ed49>] kunit_generic_run_threadfn_adapter+0x20/0x2c
              [<000000003c3e9211>] kthread+0x124/0x130
              [<0000000028f85bdd>] ret_from_fork+0x10/0x20
      
      Link: https://lkml.kernel.org/r/20230918120951.2230468-3-ruanjinjie@huawei.com
      Fixes: d1836a3b ("mm/damon/core-test: initialise context before test in damon_test_set_attrs()")
      Fixes: 4f540f5a ("mm/damon/core-test: add a kunit test case for ops registration")
      Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
      Reviewed-by: Feng Tang <feng.tang@intel.com>
      Reviewed-by: SeongJae Park <sj@kernel.org>
      Cc: Brendan Higgins <brendan.higgins@linux.dev>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/damon/core-test: fix memory leak in damon_new_region() · f950fa6e
      Jinjie Ruan authored
      Patch series "mm/damon/core-test: Fix memory leaks in core-test", v3.
      
      There are a few memory leaks in core-test which are detected by kmemleak. 
      This patchset fixes the issues.
      
      
      This patch (of 2):
      
      With CONFIG_DAMON_KUNIT_TEST=y, CONFIG_DEBUG_KMEMLEAK=y and
      CONFIG_DEBUG_KMEMLEAK_AUTO_SCAN=y, the below memory leak is detected.
      
      The damon_region instances which are allocated by kmem_cache_alloc() in
      damon_new_region() in damon_test_regions() and
      damon_test_update_monitoring_result() are not freed.
      
      So for damon_test_regions(), replace the damon_del_region() call with
      damon_destroy_region() so that it calls both damon_del_region() and
      damon_free_region(); the latter will free the damon_region.  For
      damon_test_update_monitoring_result(), call damon_free_region() to
      free it.  After applying this patch, the following memory leak is never
      detected.
      
          unreferenced object 0xffff2b49c3edc000 (size 56):
            comm "kunit_try_catch", pid 338, jiffies 4294895280 (age 557.084s)
            hex dump (first 32 bytes):
              01 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00  ................
              00 00 00 00 00 00 00 00 00 00 00 00 49 2b ff ff  ............I+..
            backtrace:
              [<0000000088e71769>] slab_post_alloc_hook+0xb8/0x368
              [<00000000b528f67c>] kmem_cache_alloc+0x168/0x284
              [<000000008603f022>] damon_new_region+0x28/0x54
              [<00000000a3b8c64e>] damon_test_regions+0x38/0x270
              [<00000000559c4801>] kunit_try_run_case+0x50/0xac
              [<000000003932ed49>] kunit_generic_run_threadfn_adapter+0x20/0x2c
              [<000000003c3e9211>] kthread+0x124/0x130
              [<0000000028f85bdd>] ret_from_fork+0x10/0x20
          unreferenced object 0xffff2b49c5b20000 (size 56):
            comm "kunit_try_catch", pid 354, jiffies 4294895304 (age 556.988s)
            hex dump (first 32 bytes):
              03 00 00 00 00 00 00 00 07 00 00 00 00 00 00 00  ................
              00 00 00 00 00 00 00 00 96 00 00 00 49 2b ff ff  ............I+..
            backtrace:
              [<0000000088e71769>] slab_post_alloc_hook+0xb8/0x368
              [<00000000b528f67c>] kmem_cache_alloc+0x168/0x284
              [<000000008603f022>] damon_new_region+0x28/0x54
              [<00000000ca019f80>] damon_test_update_monitoring_result+0x18/0x34
              [<00000000559c4801>] kunit_try_run_case+0x50/0xac
              [<000000003932ed49>] kunit_generic_run_threadfn_adapter+0x20/0x2c
              [<000000003c3e9211>] kthread+0x124/0x130
              [<0000000028f85bdd>] ret_from_fork+0x10/0x20
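      
      The fix pattern, sketched with the function names cited above:
      
          /* A region that was linked to a target: unlink and free it
           * in one call instead of only unlinking (which leaked it). */
          static void fix_linked_region(struct damon_region *r,
                                        struct damon_target *t)
          {
              damon_destroy_region(r, t);   /* was: damon_del_region(r, t) */
          }
      
          /* A region that was never linked: free it directly. */
          static void fix_unlinked_region(struct damon_region *r)
          {
              damon_free_region(r);
          }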
      
      Link: https://lkml.kernel.org/r/20230918120951.2230468-1-ruanjinjie@huawei.com
      Link: https://lkml.kernel.org/r/20230918120951.2230468-2-ruanjinjie@huawei.com
      Fixes: 17ccae8b ("mm/damon: add kunit tests")
      Fixes: f4c978b6 ("mm/damon/core-test: add a test for damon_update_monitoring_results()")
      Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
      Reviewed-by: SeongJae Park <sj@kernel.org>
      Cc: Brendan Higgins <brendan.higgins@linux.dev>
      Cc: Feng Tang <feng.tang@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/writeback: update filemap_dirty_folio() comment · ab428b4c
      Jianguo Bao authored
      Update the comment to refer to the new address space operation,
      dirty_folio().
      
      Link: https://lkml.kernel.org/r/20230917-trycontrib1-v1-1-db22630b8839@gmail.com
      Fixes: 6f31a5a2 ("fs: Add aops->dirty_folio")
      Signed-off-by: Jianguo Bao <roidinev@gmail.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • Docs/ABI/damon: update for DAMOS apply intervals · d57d36b5
      SeongJae Park authored
      Update DAMON ABI document for the newly added DAMON sysfs file for DAMOS
      apply intervals (apply_interval_us file).
      
      Link: https://lkml.kernel.org/r/20230916020945.47296-10-sj@kernel.org
      Signed-off-by: SeongJae Park <sj@kernel.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • Docs/admin-guide/mm/damon/usage: update for DAMOS apply intervals · 033343d5
      SeongJae Park authored
      Update DAMON usage document's DAMON sysfs interface section for the newly
      added DAMOS apply intervals support (apply_interval_us file).
      
      Link: https://lkml.kernel.org/r/20230916020945.47296-9-sj@kernel.org
      Signed-off-by: SeongJae Park <sj@kernel.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • selftests/damon/sysfs: test DAMOS apply intervals · 65ded14e
      SeongJae Park authored
      Update DAMON selftests to test existence of the file for reading/writing
      DAMOS apply interval under each scheme directory.
      
      Link: https://lkml.kernel.org/r/20230916020945.47296-8-sj@kernel.org
      Signed-off-by: SeongJae Park <sj@kernel.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/damon/sysfs-schemes: support DAMOS apply interval · a2a9f68e
      SeongJae Park authored
      Update DAMON sysfs interface to support DAMOS apply intervals by adding a
      new file, 'apply_interval_us' in each scheme directory.  Users can set and
      get the interval for each scheme in microseconds by writing to and reading
      from the file.
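      
      A minimal example of driving the new file; the kdamond, context and
      scheme indices below are placeholders, and the directory layout is
      assumed from the DAMON sysfs ABI:
      
          #include <stdio.h>
      
          int main(void)
          {
              /* Set scheme 0's apply interval to one second. */
              FILE *f = fopen("/sys/kernel/mm/damon/admin/kdamonds/0/"
                              "contexts/0/schemes/0/apply_interval_us", "w");
      
              if (!f)
                  return 1;
              fprintf(f, "%d", 1000000);    /* microseconds */
              return fclose(f) ? 1 : 0;
          }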
      
      Link: https://lkml.kernel.org/r/20230916020945.47296-7-sj@kernel.org
      Signed-off-by: SeongJae Park <sj@kernel.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>