1. 06 Jul, 2024 5 commits
    • Andrew Morton's avatar
      Merge branch 'mm-hotfixes-stable' into mm-stable to pick up "mm: fix · 8ef6fd0e
      Andrew Morton authored
      crashes from deferred split racing folio migration", needed by "mm:
      migrate: split folio_migrate_mapping()".
      8ef6fd0e
    • Lorenzo Stoakes's avatar
    • Hugh Dickins's avatar
      mm: fix crashes from deferred split racing folio migration · be9581ea
      Hugh Dickins authored
      Even on 6.10-rc6, I've been seeing elusive "Bad page state"s (often on
      flags when freeing, yet the flags shown are not bad: PG_locked had been
      set and cleared??), and VM_BUG_ON_PAGE(page_ref_count(page) == 0)s from
      deferred_split_scan()'s folio_put(), and a variety of other BUG and WARN
      symptoms implying double free by deferred split and large folio migration.
      
      6.7 commit 9bcef597 ("mm: memcg: fix split queue list crash when large
      folio migration") was right to fix the memcg-dependent locking broken in
      85ce2c51 ("memcontrol: only transfer the memcg data for migration"),
      but missed a subtlety of deferred_split_scan(): it moves folios to its own
      local list to work on them without split_queue_lock, during which time
      folio->_deferred_list is not empty, but even the "right" lock does nothing
      to secure the folio and the list it is on.
      
      Fortunately, deferred_split_scan() is careful to use folio_try_get(): so
      folio_migrate_mapping() can avoid the race by folio_undo_large_rmappable()
      while the old folio's reference count is temporarily frozen to 0 - adding
      such a freeze in the !mapping case too (originally, folio lock and
      unmapping and no swap cache left an anon folio unreachable, so no freezing
      was needed there: but the deferred split queue offers a way to reach it).
      
      Link: https://lkml.kernel.org/r/29c83d1a-11ca-b6c9-f92e-6ccb322af510@google.com
      Fixes: 9bcef597 ("mm: memcg: fix split queue list crash when large folio migration")
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Reviewed-by: default avatarBaolin Wang <baolin.wang@linux.alibaba.com>
      Cc: Barry Song <baohua@kernel.org>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Nhat Pham <nphamcs@gmail.com>
      Cc: Yang Shi <shy828301@gmail.com>
      Cc: Zi Yan <ziy@nvidia.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      be9581ea
    • Paul Menzel's avatar
      lib/build_OID_registry: avoid non-destructive substitution for Perl < 5.13.2 compat · 2fe29fe9
      Paul Menzel authored
      On a system with Perl 5.12.1, commit 5ef6dc08
      ("lib/build_OID_registry: don't mention the full path of the script in
      output") causes the build to fail with the error below.
      
           Bareword found where operator expected at ./lib/build_OID_registry line 41, near "s#^\Q$abs_srctree/\E##r"
           syntax error at ./lib/build_OID_registry line 41, near "s#^\Q$abs_srctree/\E##r"
           Execution of ./lib/build_OID_registry aborted due to compilation errors.
           make[3]: *** [lib/Makefile:352: lib/oid_registry_data.c] Error 255
      
      Ahmad Fatoum analyzed that non-destructive substitution is only supported since
      Perl 5.13.2. Instead of dropping `r` and having the side effect of modifying
      `$0`, introduce a dedicated variable to support older Perl versions.
      
      Link: https://lkml.kernel.org/r/20240702223512.8329-2-pmenzel@molgen.mpg.de
      Link: https://lkml.kernel.org/r/20240701155802.75152-1-pmenzel@molgen.mpg.de
      Fixes: 5ef6dc08 ("lib/build_OID_registry: don't mention the full path of the script in output")
      Link: https://lore.kernel.org/all/259f7a87-2692-480e-9073-1c1c35b52f67@molgen.mpg.de/Signed-off-by: default avatarPaul Menzel <pmenzel@molgen.mpg.de>
      Suggested-by: default avatarAhmad Fatoum <a.fatoum@pengutronix.de>
      Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Cc: Nicolas Schier <nicolas@fjasle.eu>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Ahmad Fatoum <a.fatoum@pengutronix.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      2fe29fe9
    • Yang Shi's avatar
      mm: gup: stop abusing try_grab_folio · f442fa61
      Yang Shi authored
      A kernel warning was reported when pinning folio in CMA memory when
      launching SEV virtual machine.  The splat looks like:
      
      [  464.325306] WARNING: CPU: 13 PID: 6734 at mm/gup.c:1313 __get_user_pages+0x423/0x520
      [  464.325464] CPU: 13 PID: 6734 Comm: qemu-kvm Kdump: loaded Not tainted 6.6.33+ #6
      [  464.325477] RIP: 0010:__get_user_pages+0x423/0x520
      [  464.325515] Call Trace:
      [  464.325520]  <TASK>
      [  464.325523]  ? __get_user_pages+0x423/0x520
      [  464.325528]  ? __warn+0x81/0x130
      [  464.325536]  ? __get_user_pages+0x423/0x520
      [  464.325541]  ? report_bug+0x171/0x1a0
      [  464.325549]  ? handle_bug+0x3c/0x70
      [  464.325554]  ? exc_invalid_op+0x17/0x70
      [  464.325558]  ? asm_exc_invalid_op+0x1a/0x20
      [  464.325567]  ? __get_user_pages+0x423/0x520
      [  464.325575]  __gup_longterm_locked+0x212/0x7a0
      [  464.325583]  internal_get_user_pages_fast+0xfb/0x190
      [  464.325590]  pin_user_pages_fast+0x47/0x60
      [  464.325598]  sev_pin_memory+0xca/0x170 [kvm_amd]
      [  464.325616]  sev_mem_enc_register_region+0x81/0x130 [kvm_amd]
      
      Per the analysis done by yangge, when starting the SEV virtual machine, it
      will call pin_user_pages_fast(..., FOLL_LONGTERM, ...) to pin the memory. 
      But the page is in CMA area, so fast GUP will fail then fallback to the
      slow path due to the longterm pinnalbe check in try_grab_folio().
      
      The slow path will try to pin the pages then migrate them out of CMA area.
      But the slow path also uses try_grab_folio() to pin the page, it will
      also fail due to the same check then the above warning is triggered.
      
      In addition, the try_grab_folio() is supposed to be used in fast path and
      it elevates folio refcount by using add ref unless zero.  We are guaranteed
      to have at least one stable reference in slow path, so the simple atomic add
      could be used.  The performance difference should be trivial, but the
      misuse may be confusing and misleading.
      
      Redefined try_grab_folio() to try_grab_folio_fast(), and try_grab_page()
      to try_grab_folio(), and use them in the proper paths.  This solves both
      the abuse and the kernel warning.
      
      The proper naming makes their usecase more clear and should prevent from
      abusing in the future.
      
      peterx said:
      
      : The user will see the pin fails, for gpu-slow it further triggers the WARN
      : right below that failure (as in the original report):
      : 
      :         folio = try_grab_folio(page, page_increm - 1,
      :                                 foll_flags);
      :         if (WARN_ON_ONCE(!folio)) { <------------------------ here
      :                 /*
      :                         * Release the 1st page ref if the
      :                         * folio is problematic, fail hard.
      :                         */
      :                 gup_put_folio(page_folio(page), 1,
      :                                 foll_flags);
      :                 ret = -EFAULT;
      :                 goto out;
      :         }
      
      [1] https://lore.kernel.org/linux-mm/1719478388-31917-1-git-send-email-yangge1116@126.com/
      
      [shy828301@gmail.com: fix implicit declaration of function try_grab_folio_fast]
        Link: https://lkml.kernel.org/r/CAHbLzkowMSso-4Nufc9hcMehQsK9PNz3OSu-+eniU-2Mm-xjhA@mail.gmail.com
      Link: https://lkml.kernel.org/r/20240628191458.2605553-1-yang@os.amperecomputing.com
      Fixes: 57edfcfd ("mm/gup: accelerate thp gup even for "pages != NULL"")
      Signed-off-by: default avatarYang Shi <yang@os.amperecomputing.com>
      Reported-by: default avataryangge <yangge1116@126.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: <stable@vger.kernel.org>	[6.6+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      f442fa61
  2. 05 Jul, 2024 35 commits