  1. 18 Jul, 2022 3 commits
  2. 16 May, 2022 2 commits
  3. 29 Apr, 2022 2 commits
    • dax: fix missing writeprotect the pte entry · 06083a09
      Muchun Song authored
      Currently dax_mapping_entry_mkclean() fails to clean and write protect the
      pte entry within a DAX PMD entry during an *sync operation.  This can
      result in data loss in the following sequence:
      
        1) process A does an mmap write to a DAX PMD, dirtying the PMD radix tree
           entry and making the pmd entry dirty and writeable.
        2) process B mmaps the same file at @offset (e.g. 4K) with @length (e.g. 4K)
           and writes to it, dirtying the PMD radix tree entry (already done in 1))
           and making the pte entry dirty and writeable.
        3) fsync, flushing out PMD data and cleaning the radix tree entry. We
           currently fail to mark the pte entry as clean and write protected
           since the vma of process B is not covered in dax_entry_mkclean().
        4) process B writes to the pte. These writes don't cause any page faults
           since the pte entry is dirty and writeable. The radix tree entry
           remains clean.
        5) fsync, which fails to flush the dirty PMD data because the radix tree
           entry was clean.
        6) crash - dirty data that should have been fsync'd as part of 5) could
           still have been in the processor cache, and is lost.
      
      Fix this by using pfn_mkclean_range() to clean and write protect the pfns.
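
      For illustration only, here is a minimal user-space sketch of the access
      pattern from steps 1)-6), assuming a DAX-capable filesystem mounted at a
      hypothetical /mnt/pmem; the two mappings stand in for processes A and B,
      and error handling is omitted:

      	/* sketch of the problematic sequence, not part of the fix */
      	#include <fcntl.h>
      	#include <string.h>
      	#include <sys/mman.h>
      	#include <unistd.h>

      	int main(void)
      	{
      		size_t pmd = 2UL << 20;	/* one PMD-sized (2MB) extent */
      		int fd = open("/mnt/pmem/file", O_RDWR | O_CREAT, 0644);
      		ftruncate(fd, pmd);

      		/* "process A": write through a 2MB mapping -> pmd entry dirty */
      		char *a = mmap(NULL, pmd, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
      		memset(a, 0xaa, pmd);

      		/* "process B": map 4K at offset 4K and write -> pte entry dirty */
      		char *b = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 4096);
      		memset(b, 0xbb, 4096);

      		fsync(fd);		/* step 3): radix tree entry cleaned, B's pte not */
      		memset(b, 0xcc, 4096);	/* step 4): no fault, radix tree entry stays clean */
      		fsync(fd);		/* step 5): nothing flushed; a crash now loses data */
      		return 0;
      	}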
      
      Link: https://lkml.kernel.org/r/20220403053957.10770-6-songmuchun@bytedance.com
      Fixes: 4b4bb46d ("dax: clear dirty entry tags on cache flush")
      Signed-off-by: Muchun Song <songmuchun@bytedance.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Alistair Popple <apopple@nvidia.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Ralph Campbell <rcampbell@nvidia.com>
      Cc: Ross Zwisler <zwisler@kernel.org>
      Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
      Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
      Cc: Yang Shi <shy828301@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • dax: fix cache flush on PMD-mapped pages · e583b5c4
      Muchun Song authored
      flush_cache_page() only removes a PAGE_SIZE-sized range from the cache, so
      for a THP it covers only the head page rather than the whole huge page.
      Replace it with flush_cache_range() to fix this issue.  This is just a
      documentation issue with respect to properly documenting the expected
      usage of cache flushing before modifying the pmd.  In practice it is not a
      problem, because DAX is not available on architectures with virtually
      indexed caches, per:
      
        commit d92576f1 ("dax: does not work correctly with virtual aliasing caches")
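
      As a rough sketch of the change (not the literal fs/dax.c diff), the cache
      flush before write protecting the pmd goes from one page to the whole
      PMD-sized extent:

      	/* before: only flushes a single PAGE_SIZE page (the head page) */
      	flush_cache_page(vma, address, pfn);

      	/* after: flush the full range that backs the PMD mapping */
      	flush_cache_range(vma, address, address + HPAGE_PMD_SIZE);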
      
      Link: https://lkml.kernel.org/r/20220403053957.10770-3-songmuchun@bytedance.com
      Fixes: f729c8c9 ("dax: wrprotect pmd_t in dax_mapping_entry_mkclean")
      Signed-off-by: Muchun Song <songmuchun@bytedance.com>
      Reviewed-by: Dan Williams <dan.j.williams@intel.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Alistair Popple <apopple@nvidia.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Ralph Campbell <rcampbell@nvidia.com>
      Cc: Ross Zwisler <zwisler@kernel.org>
      Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
      Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
      Cc: Yang Shi <shy828301@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  4. 18 Feb, 2022 1 commit
  5. 02 Feb, 2022 1 commit
  6. 18 Dec, 2021 1 commit
  7. 04 Dec, 2021 8 commits
  8. 17 Aug, 2021 5 commits
  9. 08 Jul, 2021 1 commit
  10. 29 Jun, 2021 1 commit
    • dax: fix ENOMEM handling in grab_mapping_entry() · 1a14e377
      Jan Kara authored
      grab_mapping_entry() has a bug in its handling of the ENOMEM condition.
      Suppose we have a PMD entry at index i which we are downgrading to a PTE
      entry.  grab_mapping_entry() will set pmd_downgrade to true, lock the entry,
      clear the entry in the xarray, and decrement mapping->nrpages.  Then it will
      call:
      
      	entry = dax_make_entry(pfn_to_pfn_t(0), flags);
      	dax_lock_entry(xas, entry);
      
      which inserts the new PTE entry into the xarray.  However, this may fail to
      allocate the new node.  We handle this by:
      
      	if (xas_nomem(xas, mapping_gfp_mask(mapping) & ~__GFP_HIGHMEM))
      		goto retry;
      
      However, pmd_downgrade stays set to true even though the 'entry' returned
      from get_unlocked_entry() will now be NULL, so we go through the downgrade
      branch again.  This is mostly harmless, except that mapping->nrpages is
      decremented again and we temporarily have an invalid entry stored in the
      xarray.  Fix the problem by setting pmd_downgrade to false each time we look
      up the entry we work with, so that it matches the entry we found.
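
      In other words, the shape of the fix is roughly the following (a sketch of
      the control flow, not the literal patch):

      	retry:
      		pmd_downgrade = false;	/* decide again for whatever entry we find now */
      		xas_lock_irq(xas);
      		entry = get_unlocked_entry(xas, order);
      		/* ... pmd_downgrade becomes true only if this lookup really found a
      		   PMD entry that must be downgraded for the current fault ... */
      		if (xas_nomem(xas, mapping_gfp_mask(mapping) & ~__GFP_HIGHMEM))
      			goto retry;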
      
      Link: https://lkml.kernel.org/r/20210622160015.18004-1-jack@suse.cz
      Fixes: b15cd800 ("dax: Convert page fault handlers to XArray")
      Signed-off-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Dan Williams <dan.j.williams@intel.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  11. 07 May, 2021 3 commits
  12. 05 May, 2021 2 commits
  13. 09 Feb, 2021 1 commit
    • mm: provide a saner PTE walking API for modules · 9fd6dad1
      Paolo Bonzini authored
      Currently, the follow_pfn function is exported for modules but
      follow_pte is not.  However, follow_pfn is very easy to misuse,
      because it does not provide protections (so most of its callers
      assume the page is writable!) and because it returns after having
      already unlocked the page table lock.
      
      Provide instead a simplified version of follow_pte that does
      not have the pmdpp and range arguments.  The older version
      survives as follow_invalidate_pte() for use by fs/dax.c.
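
      A minimal sketch of how a module might use the simplified API (mm and addr
      are placeholders, error handling trimmed):

      	pte_t *ptep;
      	spinlock_t *ptl;
      	unsigned long pfn;
      	bool writable;

      	/* look up the PTE for addr in mm; on success the PTE lock is held */
      	if (follow_pte(mm, addr, &ptep, &ptl))
      		return -EFAULT;

      	pfn = pte_pfn(*ptep);		/* read what we need under the lock */
      	writable = pte_write(*ptep);	/* unlike follow_pfn(), writability is visible */

      	pte_unmap_unlock(ptep, ptl);	/* drop the mapping and the PTE lock */
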
      Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  14. 16 Dec, 2020 1 commit
  15. 21 Sep, 2020 1 commit
  16. 10 Sep, 2020 1 commit
    • dax: Create a range version of dax_layout_busy_page() · 6bbdd563
      Vivek Goyal authored
      The virtiofs device has a range of memory which is mapped into file inodes
      using DAX. This memory is mapped in qemu on the host and maps different
      sections of the real file on the host. The size of this memory is limited
      (determined by the administrator), and depending on the filesystem size we
      will soon reach a situation where all the memory is in use and we need to
      reclaim some of it.
      
      As part of the reclaim process, we will need to make sure that there are
      no active references to pages (taken by get_user_pages()) on the memory
      range we are trying to reclaim. I am planning to use
      dax_layout_busy_page() for this, but in its current form it is per-inode
      and scans through all the pages of the inode.
      
      We want to reclaim only a portion of the memory (say a 2MB page), so we
      want to make sure that only that 2MB range of pages has no references
      (and we don't want to unmap all the pages of the inode).
      
      Hence, create a range version of this function, dax_layout_busy_page_range(),
      which can be passed the range that needs to be unmapped.
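
      A sketch of the intended caller side (start is a placeholder for the 2MB
      aligned offset being reclaimed, and the end offset is assumed inclusive):

      	struct page *page;

      	/* unmap the window and look for pages still pinned via get_user_pages() */
      	page = dax_layout_busy_page_range(inode->i_mapping, start, start + SZ_2M - 1);
      	if (page) {
      		/* a reference is still held; wait for it or pick another window */
      		return -EAGAIN;
      	}
      	/* no busy pages in this 2MB range; it is safe to reclaim the window */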
      
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: linux-nvdimm@lists.01.org
      Cc: Jan Kara <jack@suse.cz>
      Cc: Vishal L Verma <vishal.l.verma@intel.com>
      Cc: "Weiny, Ira" <ira.weiny@intel.com>
      Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
  17. 23 Aug, 2020 1 commit
  18. 31 Jul, 2020 1 commit
  19. 28 Jul, 2020 1 commit
  20. 03 Apr, 2020 2 commits
  21. 06 Feb, 2020 1 commit