- 18 Jul, 2022 3 commits
Shiyang Ruan authored
Add address output to dax_iomap_pfn() in order to perform a memcpy() in the CoW case. Since this function now outputs both the address and the pfn, rename it to dax_iomap_direct_access().

[ruansy.fnst@fujitsu.com: initialize `rc', per Dan]

Link: https://lore.kernel.org/linux-fsdevel/Yp8FUZnO64Qvyx5G@kili/
Link: https://lkml.kernel.org/r/20220607143837.161174-1-ruansy.fnst@fujitsu.com
Link: https://lkml.kernel.org/r/20220603053738.1218681-9-ruansy.fnst@fujitsu.com
Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Goldwyn Rodrigues <rgoldwyn@suse.com>
Cc: Goldwyn Rodrigues <rgoldwyn@suse.de>
Cc: Jane Chu <jane.chu@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
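A rough sketch of the renamed helper's shape and a hypothetical CoW caller (modeled on fs/dax.c after this patch; the exact body differs by kernel version, and src_kaddr is a placeholder):

    /* The helper now reports the kernel address as well as the pfn. */
    static int dax_iomap_direct_access(const struct iomap *iomap, loff_t pos,
                                       size_t size, void **kaddr, pfn_t *pfnp);

    /* Hypothetical CoW caller: copy through the returned address. */
    void *kaddr;
    pfn_t pfn;
    int rc = dax_iomap_direct_access(iomap, pos, PAGE_SIZE, &kaddr, &pfn);
    if (rc == 0)
        memcpy(kaddr, src_kaddr, PAGE_SIZE);    /* the CoW copy */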
-
Shiyang Ruan authored
Introduce a PAGE_MAPPING_DAX_COW flag to support association with CoW file mappings. In this case, since dax-rmap has already taken responsibility for looking up shared files for a given dax page, page->mapping is no longer used for rmap but for marking that this dax page is shared. To make sure disassociation works correctly, we use page->index as a refcount and reset page->mapping to its initial state when page->index drops to 0. With the help of this new flag, we can distinguish the normal case from the CoW case and keep the warning in the normal case.

Link: https://lkml.kernel.org/r/20220603053738.1218681-8-ruansy.fnst@fujitsu.com
Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Goldwyn Rodrigues <rgoldwyn@suse.com>
Cc: Goldwyn Rodrigues <rgoldwyn@suse.de>
Cc: Jane Chu <jane.chu@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Ritesh Harjani <riteshh@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
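A minimal sketch of the disassociation semantics described above (the function name is hypothetical; the patch's actual helpers may differ): page->index acts as the share refcount, and the marker is dropped once the last share goes away.

    static void dax_page_unshare(struct page *page)
    {
        /* page->mapping holds PAGE_MAPPING_DAX_COW while shared */
        if (--page->index == 0)
            page->mapping = NULL;    /* back to the initial state */
    }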
-
Shiyang Ruan authored
The current dax_lock_page() locks the dax entry by obtaining the mapping and index from the page. To support 1-to-N RMAP in NVDIMM, we need a new function that locks the specific dax entry corresponding to a given file's (mapping, index) pair, and that returns the page corresponding to that entry for the caller's use.

Link: https://lkml.kernel.org/r/20220603053738.1218681-5-ruansy.fnst@fujitsu.com
Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Goldwyn Rodrigues <rgoldwyn@suse.com>
Cc: Goldwyn Rodrigues <rgoldwyn@suse.de>
Cc: Jane Chu <jane.chu@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Ritesh Harjani <riteshh@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
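The new entry point's shape, per the description (a signature sketch; the unlock counterpart is included for symmetry and its exact form is an assumption):

    /* Lock the dax entry at a specific (mapping, index) and hand back the
     * corresponding page for the caller. */
    dax_entry_t dax_lock_mapping_entry(struct address_space *mapping,
                                       pgoff_t index, struct page **page);

    void dax_unlock_mapping_entry(struct address_space *mapping,
                                  pgoff_t index, dax_entry_t cookie);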
-
- 16 May, 2022 2 commits
Jane Chu authored
Introduce the dax_recovery_write() operation. The function is used to recover a dax range that contains poison. The typical use case is a user process receiving a SIGBUS with si_code BUS_MCEERR_AR indicating poison in a dax range; in response, the process issues a pwrite() to the page-aligned dax range, which clears the poison and puts valid data in the range.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jane Chu <jane.chu@oracle.com>
Link: https://lore.kernel.org/r/20220422224508.440670-6-jane.chu@oracle.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
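For illustration, a hedged user-space sketch of the recovery flow just described; the function name and buffer are hypothetical, and the kernel-side work is done by dax_recovery_write():

    #include <stdio.h>
    #include <unistd.h>

    /* After a SIGBUS with si_code BUS_MCEERR_AR on a DAX mapping, rewrite
     * the poisoned range with a page-aligned pwrite(), which clears the
     * poison and lands valid data. */
    int recover_range(int fd, off_t aligned_off, const void *good_data,
                      size_t aligned_len)
    {
        if (pwrite(fd, good_data, aligned_len, aligned_off) < 0) {
            perror("pwrite");
            return -1;
        }
        return 0;
    }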
-
Jane Chu authored
Until now, dax_direct_access() has been used implicitly for normal access only, but a recovery write needs to request a dax range that contains poison. To make the interface explicit, introduce enum dax_access_mode { DAX_ACCESS, DAX_RECOVERY_WRITE }, where DAX_ACCESS is used for normal dax access and DAX_RECOVERY_WRITE is used for dax recovery write.

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jane Chu <jane.chu@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Vivek Goyal <vgoyal@redhat.com>
Link: https://lore.kernel.org/r/165247982851.52965.11024212198889762949.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
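The widened primitive after this change (a signature sketch of the interface the patch describes):

    enum dax_access_mode {
        DAX_ACCESS,          /* normal dax access */
        DAX_RECOVERY_WRITE,  /* recovery write; range may contain poison */
    };

    long dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff,
                           long nr_pages, enum dax_access_mode mode,
                           void **kaddr, pfn_t *pfn);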
-
- 29 Apr, 2022 2 commits
Muchun Song authored
Currently dax_mapping_entry_mkclean() fails to clean and write-protect the pte entry within a DAX PMD entry during an fsync/msync operation. This can result in data loss in the following sequence:

1) Process A mmap-writes to a DAX PMD, dirtying the PMD radix tree entry and making the pmd entry dirty and writeable.
2) Process B mmaps with @offset (e.g. 4K) and @length (e.g. 4K) and writes to the same file, dirtying the PMD radix tree entry (already done in 1)) and making the pte entry dirty and writeable.
3) fsync, flushing out the PMD data and cleaning the radix tree entry. We currently fail to mark the pte entry as clean and write-protected since the vma of process B is not covered in dax_entry_mkclean().
4) Process B writes to the pte. This causes no page fault since the pte entry is dirty and writeable. The radix tree entry remains clean.
5) fsync, which fails to flush the dirty PMD data because the radix tree entry was clean.
6) crash - dirty data that should have been fsync'd as part of 5) could still have been in the processor cache, and is lost.

Use pfn_mkclean_range() to clean the pfns to fix this issue.

Link: https://lkml.kernel.org/r/20220403053957.10770-6-songmuchun@bytedance.com
Fixes: 4b4bb46d ("dax: clear dirty entry tags on cache flush")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
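The helper the fix switches to, added earlier in the same series (signature sketch per that patch): it cleans and write-protects every pfn in the range, so all ptes within a PMD entry are covered rather than a single 4K page.

    int pfn_mkclean_range(unsigned long pfn, unsigned long nr_pages,
                          pgoff_t pgoff, struct vm_area_struct *vma);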
-
Muchun Song authored
flush_cache_page() only removes a PAGE_SIZE-sized range from the cache. However, it does not cover the full pages of a THP except for the head page. Replace it with flush_cache_range() to fix this issue. This is just a documentation issue with respect to properly documenting the expected usage of cache flushing before modifying the pmd. In practice this is not a problem, because DAX is not available on architectures with virtually indexed caches, per commit d92576f1 ("dax: does not work correctly with virtual aliasing caches").

Link: https://lkml.kernel.org/r/20220403053957.10770-3-songmuchun@bytedance.com
Fixes: f729c8c9 ("dax: wrprotect pmd_t in dax_mapping_entry_mkclean")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
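The shape of the fix, as a sketch: flush the whole PMD-sized range rather than one page before write-protecting the pmd.

    -	flush_cache_page(vma, address, pfn);
    +	flush_cache_range(vma, address, address + HPAGE_PMD_SIZE);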
-
- 18 Feb, 2022 1 commit
Shiyang Ruan authored
The function name has been changed, so the description should be updated too.

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220127124058.1172422-5-ruansy.fnst@fujitsu.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
- 02 Feb, 2022 1 commit
Christoph Hellwig authored
There is no good reason to keep genhd.h separate from the main blkdev.h header that includes it. So fold the contents of genhd.h into blkdev.h and remove genhd.h entirely.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220124093913.742411-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-
- 18 Dec, 2021 1 commit
Christoph Hellwig authored
These methods indirect the actual DAX read/write path. In the end, pmem uses the cache-flushing and machine-check-safe variants, fuse and dcssblk use the plain ones, while device mapper redirects to the underlying device. Add set_dax_nocache() and set_dax_nomc() APIs to control which copy routines are used, removing the indirect calls from the read/write fast path as well as a lot of boilerplate code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Vivek Goyal <vgoyal@redhat.com> [virtiofs]
Link: https://lore.kernel.org/r/20211215084508.435401-5-hch@lst.de
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
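A sketch of how a driver opts in to the plain copy routines under the new API (the alloc_dax() call and ops are placeholder setup; per the description, fuse and dcssblk take this path, while pmem keeps the flushing and machine-check-safe variants by default):

    struct dax_device *dax_dev = alloc_dax(data, &example_dax_ops);

    set_dax_nocache(dax_dev);  /* plain copies, no cache flushing */
    set_dax_nomc(dax_dev);     /* plain copies, no machine-check handling */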
-
- 04 Dec, 2021 8 commits
Christoph Hellwig authored
Remove the last user of ->bdev in dax.c by requiring the file system to pass in an address that already includes the DAX offset. As part of this, only set ->bdev or ->daxdev when actually required in the ->iomap_begin methods.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> [erofs]
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/20211129102203.2243509-27-hch@lst.de
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
Christoph Hellwig authored
Add a flag so that the file system can easily detect DAX operations based just on the iomap operation requested, instead of looking at inode state using IS_DAX. This will be needed to apply the soon-to-be-added partition offset only to operations that actually use DAX, but not to things like fiemap that are based on the block device. In the long run it should also allow turning the bdev, dax_dev and inline_data fields into a union.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/20211129102203.2243509-25-hch@lst.de
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
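A hypothetical ->iomap_begin keying off the new flag (sketch; the helper names are placeholders, not a real file system's API):

    static int example_iomap_begin(struct inode *inode, loff_t offset,
                                   loff_t length, unsigned int flags,
                                   struct iomap *iomap, struct iomap *srcmap)
    {
        /* ... fill in iomap->type/addr/offset/length from the mapping ... */
        if (flags & IOMAP_DAX)
            iomap->dax_dev = example_dax_dev(inode);   /* DAX I/O only */
        else
            iomap->bdev = example_bdev(inode);         /* e.g. fiemap */
        return 0;
    }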
-
Christoph Hellwig authored
Unshare the DAX and iomap buffered I/O page zeroing code. This code previously did an IS_DAX check deep inside the iomap code, which in fact was the only DAX check in the code. Instead, move these checks into the callers. Most callers already have DAX special-casing anyway, and XFS will need it for reflink support as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/20211129102203.2243509-19-hch@lst.de
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
Christoph Hellwig authored
Factor out a helper for the "manual" zeroing of a DAX range to clean up dax_iomap_zero() a lot.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/20211129102203.2243509-18-hch@lst.de
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
Christoph Hellwig authored
The file-relative offset must have the same alignment as the storage offset, so use that and get rid of the call to iomap_sector.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/20211129102203.2243509-17-hch@lst.de
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
Christoph Hellwig authored
Replace the two steps of dax_iomap_sector and bdev_dax_pgoff with a single dax_iomap_pgoff helper that avoids lots of cumbersome sector conversions.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/20211129102203.2243509-15-hch@lst.de
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
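A sketch of the combined helper (modeled on fs/dax.c after this patch): it goes straight from a file position to a dax pgoff with no intermediate sector_t round trip.

    static pgoff_t dax_iomap_pgoff(const struct iomap *iomap, loff_t pos)
    {
        return PHYS_PFN(iomap->addr + (pos & PAGE_MASK) - iomap->offset);
    }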
-
Christoph Hellwig authored
Just pass the vm_fault and iomap_iter structures and figure out the rest locally. Note that this requires moving dax_iomap_sector up in the file.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/20211129102203.2243509-14-hch@lst.de
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
Christoph Hellwig authored
Despite its name, copy_user_page() expects kernel addresses, which is what we already have.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Link: https://lore.kernel.org/r/20211129102203.2243509-13-hch@lst.de
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
- 17 Aug, 2021 5 commits
Christoph Hellwig authored
Avoid the open-coded calls to ->iomap_begin and ->iomap_end and call iomap_iter instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
-
Shiyang Ruan authored
The core logic in the two dax page fault functions is similar, so move the logic into a common helper function. Also, to facilitate the addition of new features such as CoW, a switch-case is no longer used to handle the different iomap types.

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
-
Shiyang Ruan authored
The dax page fault code is too long and a bit difficult to read, and it is hard to understand when we try to add new features. Some of the PTE/PMD code paths have similar logic, so factor out helper functions to simplify the code.

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
Reviewed-by: Ritesh Harjani <riteshh@linux.ibm.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[hch: minor cleanups]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
-
Christoph Hellwig authored
Switch the dax_iomap_rw implementation to use iomap_iter.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
-
Christoph Hellwig authored
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
-
- 08 Jul, 2021 1 commit
Ira Weiny authored
dax_direct_access() takes a number of pages. PHYS_PFN(PAGE_SIZE) is a very roundabout way to specify '1'. Change the nr_pages parameter to the explicit value of '1'.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20210525172428.3634316-3-ira.weiny@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
- 29 Jun, 2021 1 commit
Jan Kara authored
grab_mapping_entry() has a bug in its handling of the ENOMEM condition. Suppose we have a PMD entry at index i which we are downgrading to a PTE entry. grab_mapping_entry() will set pmd_downgrade to true, lock the entry, clear the entry in the xarray, and decrement mapping->nrpages. Then it will call:

    entry = dax_make_entry(pfn_to_pfn_t(0), flags);
    dax_lock_entry(xas, entry);

which inserts the new PTE entry into the xarray. However, this may fail to allocate the new node. We handle this by:

    if (xas_nomem(xas, mapping_gfp_mask(mapping) & ~__GFP_HIGHMEM))
        goto retry;

However, pmd_downgrade stays set to true even though 'entry' returned from get_unlocked_entry() will be NULL now, and we will go through the downgrade branch again. This is mostly harmless, except that mapping->nrpages is decremented again and we temporarily have an invalid entry stored in the xarray. Fix the problem by setting pmd_downgrade to false each time we look up the entry we work with, so that it matches the entry we found.

Link: https://lkml.kernel.org/r/20210622160015.18004-1-jack@suse.cz
Fixes: b15cd800 ("dax: Convert page fault handlers to XArray")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
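The essence of the fix, as a sketch: pmd_downgrade is re-derived on every pass through the lookup so it always matches the entry actually found.

    retry:
        pmd_downgrade = false;    /* reset before each lookup */
        xas_lock_irq(xas);
        entry = get_unlocked_entry(xas, order);
        /* only set pmd_downgrade if the entry found this time is a PMD */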
-
- 07 May, 2021 3 commits
Vivek Goyal authored
I am seeing missed wakeups which ultimately lead to a deadlock when I am using virtiofs with DAX enabled and running "make -j". I had to mount virtiofs as rootfs and also reduce the dax window size to 256M to reproduce the problem consistently.

So here is the problem. put_unlocked_entry() wakes up waiters only if the entry is not null and !dax_is_conflict(entry). But if I call multiple instances of invalidate_inode_pages2() in parallel, I can run into a situation where there are waiters on this index but nobody will wake them:

    invalidate_inode_pages2()
      invalidate_inode_pages2_range()
        invalidate_exceptional_entry2()
          dax_invalidate_mapping_entry_sync()
            __dax_invalidate_entry() {
              xas_lock_irq(&xas);
              entry = get_unlocked_entry(&xas, 0);
              ...
              dax_disassociate_entry(entry, mapping, trunc);
              xas_store(&xas, NULL);
              ...
              put_unlocked_entry(&xas, entry);
              xas_unlock_irq(&xas);
            }

Say a fault is in progress and it has locked the entry at offset "0x1c". Now say three instances of invalidate_inode_pages2() are in progress (A, B, C) and they all try to invalidate the entry at offset "0x1c". Since the dax entry is locked, all three instances A, B and C will wait in the wait queue. When the dax fault finishes, say A is woken up. It will store a NULL entry at index "0x1c" and wake up B. When B comes along it will find "entry=0" at page offset 0x1c and it will call put_unlocked_entry(&xas, 0). And this means put_unlocked_entry() will not wake up the next waiter, given the current code. That means C continues to wait and is not woken up.

This patch fixes the issue by waking up all waiters when a dax entry has been invalidated. This seems to fix the deadlock I am facing and I can make forward progress.

Reported-by: Sergio Lopez <slp@redhat.com>
Fixes: ac401cc7 ("dax: New fault locking")
Reviewed-by: Jan Kara <jack@suse.cz>
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Link: https://lore.kernel.org/r/20210428190314.1865312-4-vgoyal@redhat.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
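The fix in essence (sketch, using the wake-mode parameter introduced by the two preparatory patches listed next in this series):

    -	put_unlocked_entry(&xas, entry);
    +	put_unlocked_entry(&xas, entry, WAKE_ALL);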
-
Vivek Goyal authored
As of now, put_unlocked_entry() always wakes up the next waiter. In later patches we want to wake up all waiters at one call site, so add a parameter to the function. This patch does not introduce any change of behavior.

Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Link: https://lore.kernel.org/r/20210428190314.1865312-3-vgoyal@redhat.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
Vivek Goyal authored
Dan mentioned that he is not very fond of passing around a boolean true/false to specify whether only the next waiter should be woken up or all waiters should be woken up. He instead prefers that we introduce an enum and make it explicit at the call site itself, which makes the code easier to read. This patch should not introduce any change of behavior.

Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Link: https://lore.kernel.org/r/20210428190314.1865312-2-vgoyal@redhat.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
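The enum and the updated helper, sketched close to the patch description:

    enum dax_wake_mode {
        WAKE_NEXT,    /* wake only the next waiter on this entry */
        WAKE_ALL,     /* wake all waiters, e.g. after invalidating the entry */
    };

    static void put_unlocked_entry(struct xa_state *xas, void *entry,
                                   enum dax_wake_mode mode)
    {
        if (entry && !dax_is_conflict(entry))
            dax_wake_entry(xas, entry, mode);
    }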
-
- 05 May, 2021 2 commits
Matthew Wilcox (Oracle) authored
Simplify mapping_needs_writeback() by accounting DAX entries as pages instead of exceptional entries.

Link: https://lkml.kernel.org/r/20201026151849.24232-4-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Tested-by: Vishal Verma <vishal.l.verma@intel.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Matthew Wilcox (Oracle) authored
Patch series "Remove nrexceptional tracking", v2.

We actually use nrexceptional for very little these days. It's a minor pain to keep in sync with nrpages, but the pain becomes much bigger with the THP patches because we don't know how many indices a shadow entry occupies. It's easier to just remove it than keep it accurate.

Also, we save 8 bytes per inode, which is nothing to sneeze at; on my laptop, it would improve shmem_inode_cache from 22 to 23 objects per 16kB, and inode_cache from 26 to 27 objects. Combined, that saves a megabyte of memory from a combined usage of 25MB for both caches. Unfortunately, ext4 doesn't cross a magic boundary, so it doesn't save any memory for ext4.

This patch (of 4): Instead of checking the two counters (nrpages and nrexceptional), we can just check whether i_pages is empty.

Link: https://lkml.kernel.org/r/20201026151849.24232-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20201026151849.24232-2-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Tested-by: Vishal Verma <vishal.l.verma@intel.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
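The replacement check described in "This patch (of 4)", as a sketch matching the patch:

    static inline bool mapping_empty(struct address_space *mapping)
    {
        return xa_empty(&mapping->i_pages);
    }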
-
- 09 Feb, 2021 1 commit
Paolo Bonzini authored
Currently, the follow_pfn function is exported for modules but follow_pte is not. However, follow_pfn is very easy to misuse, because it does not provide protections (so most of its callers assume the page is writable!) and because it returns after having already unlocked the page table lock. Provide instead a simplified version of follow_pte that does not have the pmdpp and range arguments. The older version survives as follow_invalidate_pte() for use by fs/dax.c.

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
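The simplified interface, per the description (signature sketch; the pmdpp and range arguments are gone):

    int follow_pte(struct mm_struct *mm, unsigned long address,
                   pte_t **ptepp, spinlock_t **ptlp);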
-
- 16 Dec, 2020 1 commit
Christoph Hellwig authored
Merge __follow_pte_pmd, follow_pte_pmd and follow_pte into a single follow_pte function and just pass two additional NULL arguments for the two previous follow_pte callers.

[sfr@canb.auug.org.au: merge fix for "s390/pci: remove races against pte updates"]
Link: https://lkml.kernel.org/r/20201111221254.7f6a3658@canb.auug.org.au
Link: https://lkml.kernel.org/r/20201029101432.47011-3-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 21 Sep, 2020 1 commit
Matthew Wilcox (Oracle) authored
Pass the full length to iomap_zero() and dax_iomap_zero(), and have them return how many bytes they actually handled. This is preparatory work for handling THP, although it looks like DAX could actually take advantage of it if there's a larger contiguous area.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
-
- 10 Sep, 2020 1 commit
Vivek Goyal authored
The virtiofs device has a range of memory which is mapped into file inodes using dax. This memory is mapped in qemu on the host and maps different sections of a real file on the host. The size of this memory is limited (determined by the administrator), and depending on filesystem size, we will soon reach a situation where all the memory is in use and we need to reclaim some.

As part of the reclaim process, we will need to make sure that there are no active references to pages (taken by get_user_pages()) on the memory range we are trying to reclaim. I am planning to use dax_layout_busy_page() for this. But in its current form it is per-inode and scans through all the pages of the inode. We want to reclaim only a portion of memory (say a 2MB page), so we want to make sure that only that 2MB range of pages has no references (and don't want to unmap all the pages of the inode). Hence, create a range version of this function named dax_layout_busy_page_range(), which can be passed a range which needs to be unmapped.

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: linux-nvdimm@lists.01.org
Cc: Jan Kara <jack@suse.cz>
Cc: Vishal L Verma <vishal.l.verma@intel.com>
Cc: "Weiny, Ira" <ira.weiny@intel.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
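The new range-aware entry point, with the old helper expressed in terms of it (sketch per the description):

    struct page *dax_layout_busy_page_range(struct address_space *mapping,
                                            loff_t start, loff_t end);

    struct page *dax_layout_busy_page(struct address_space *mapping)
    {
        return dax_layout_busy_page_range(mapping, 0, LLONG_MAX);
    }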
-
- 23 Aug, 2020 1 commit
Gustavo A. R. Silva authored
Replace the existing /* fall through */ comments and their variants with the new pseudo-keyword macro fallthrough [1]. Also, remove unnecessary fall-through markings where that is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
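An illustrative before/after of the conversion (the states and calls here are hypothetical):

    switch (state) {
    case STATE_PREPARE:
        prepare();
        fallthrough;    /* was a bare fall-through comment */
    case STATE_FINISH:
        finish();
        break;
    }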
-
- 31 Jul, 2020 1 commit
Hao Li authored
The argument passed to xas_set_err() to indicate an error should be negative. Otherwise, xas_error() will return 0, and grab_mapping_entry() will return the found entry instead of 'SIGBUS' when the entry is not in fact valid, which would cause problems in subsequent code paths.

Link: https://lore.kernel.org/r/20200729034436.24267-1-lihao2018.fnst@cn.fujitsu.com
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: Hao Li <lihao2018.fnst@cn.fujitsu.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
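The shape of the fix, as a sketch (the specific errno is illustrative):

    -	xas_set_err(&xas, EIO);     /* positive: xas_error() reports 0 */
    +	xas_set_err(&xas, -EIO);    /* negative: the error actually propagates */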
-
- 28 Jul, 2020 1 commit
Ira Weiny authored
Passing size to copy_user_dax() implies it can copy variable sizes of data, when in fact it calls copy_user_page(), which copies exactly one page. We are safe because the only caller uses PAGE_SIZE anyway, so just remove the variable for clarity. While we are at it, rename copy_user_dax() to copy_cow_page_dax() to make clear that it is a single-purpose helper for this one case, not an implementation of what dax_iomap_actor() does.

Link: https://lore.kernel.org/r/20200717072056.73134-11-ira.weiny@intel.com
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
-
- 03 Apr, 2020 2 commits
Vivek Goyal authored
Add a helper dax_iomap_zero() to zero a range. This patch basically merges __dax_zero_page_range() and iomap_dax_zero().

Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20200228163456.1587-7-vgoyal@redhat.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
-
Vivek Goyal authored
Use the new dax native zero-page method for zeroing a page if the I/O is page-aligned. Otherwise fall back to direct_access() plus a manual memset(). This removes one of the dependencies on the block device in the dax path.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Link: https://lore.kernel.org/r/20200228163456.1587-6-vgoyal@redhat.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
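A sketch of the zeroing logic this describes (modeled on the patch; variable setup is omitted, and exact signatures vary by kernel version):

    if (IS_ALIGNED(pos, PAGE_SIZE) && size == PAGE_SIZE) {
        /* page-aligned: use the dax-native zero path */
        rc = dax_zero_page_range(iomap->dax_dev, pgoff, 1);
    } else {
        /* sub-page: map the page and clear the bytes by hand */
        rc = dax_direct_access(iomap->dax_dev, pgoff, 1, &kaddr, NULL);
        if (rc >= 0) {
            memset(kaddr + offset, 0, size);
            dax_flush(iomap->dax_dev, kaddr + offset, size);
            rc = 0;
        }
    }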
-
- 06 Feb, 2020 1 commit
Jeff Moyer authored
fstests generic/471 reports a failure when run with MOUNT_OPTIONS="-o dax". The reason is that the initial pwrite to an empty file with the RWF_NOWAIT flag set does not return -EAGAIN. It turns out that dax_iomap_rw doesn't pass that flag through to iomap_apply. With this patch applied, generic/471 passes for me.

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/x49r1z86e1d.fsf@segfault.boston.devel.redhat.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
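The fix in essence (sketch): propagate the caller's NOWAIT request into the iomap flags so iomap_apply can return -EAGAIN.

    if (iocb->ki_flags & IOCB_NOWAIT)
        flags |= IOMAP_NOWAIT;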
-