- 08 Aug, 2018 1 commit
-
-
Bob Peterson authored
Function update_rgrp_lvb_unlinked used to do the same thing as be32_add_cpu. This patch removes it in favor of using be32_add_cpu directly. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Andrew Price <anprice@redhat.com>
-
- 07 Aug, 2018 1 commit
-
-
Bob Peterson authored
Function gfs2_testbit is called in three places. Two of those places, gfs2_alloc_extent and gfs2_unaligned_extlen, should be using the clone bitmaps, not the "real" bitmaps. Function gfs2_unaligned_extlen is used by the block reservations scheme to determine the length of an extent of free blocks. Before this patch, it wasn't using the clone bitmap, which means recently-freed blocks were treated as free blocks for the purposes of an allocation. This patch adds a new parameter to gfs2_testbit to indicate whether or not the clone bitmaps should be used (if available). Signed-off-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
-
- 03 Aug, 2018 1 commit
-
-
Andreas Gruenbacher authored
Function gfs2_ea_strlen is only called from ea_list_i, so inline it there. Remove the duplicate switch statement and the creative use of memcpy to set a null byte. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Andrew Price <anprice@redhat.com> Reviewed-by: Bob Peterson <rpeterso@redhat.com>
-
- 26 Jul, 2018 1 commit
-
-
Bob Peterson authored
Before this patch gfs2_rgrp_ondisk2lvb was called after every call to gfs2_rgrp_out. This patch just calls it directly from within gfs2_rgrp_out, and moves the function to be before it so we don't need a function prototype. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
-
- 25 Jul, 2018 1 commit
-
-
Andreas Gruenbacher authored
To speed up the common case of appending to a file, gfs2_write_alloc_required presumes that writing beyond the end of a file will always require additional blocks to be allocated. This assumption is incorrect for preallocates files, but there are no negative consequences as long as *some* space is still left on the filesystem. One special file that always has some space preallocated beyond the end of the file is the rindex: when growing a filesystem, gfs2_grow adds one or more new resource groups and appends records describing those resource groups to the rindex; the preallocated space ensures that this is always possible. However, when a filesystem is completely full, gfs2_write_alloc_required will indicate that an additional allocation is required, and appending the next record to the rindex will fail even though space for that record has already been preallocated. To fix that, skip the incorrect optimization in gfs2_write_alloc_required, but for the rindex only. Other writes to preallocated space beyond the end of the file are still allowed to fail on completely full filesystems. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Bob Peterson <rpeterso@redhat.com>
-
- 24 Jul, 2018 11 commits
-
-
Bob Peterson authored
Before this patch, several functions in rgrp.c checked the value of rgd->rd_free_clone. That does not take into account blocks that were reserved by a multi-block reservation. This causes a problem when space gets tight in the file system. For example, when function gfs2_inplace_reserve checks to see if a rgrp has enough blocks to satisfy the request, it can accept a rgrp that it should reject because, although there are enough blocks to satisfy the request _now_, those blocks may be reserved for another running process. A second problem with this occurs when we've reserved the remaining blocks in an rgrp: function rg_mblk_search() can reject an rgrp improperly because it calculates: u32 free_blocks = rgd->rd_free_clone - rgd->rd_reserved; But rd_reserved includes blocks that the current process just reserved in its own call to inplace_reserve. For example, it can reserve the last 128 blocks of an rgrp, then reject that same rgrp because the above calculates out to free_blocks = 0; Consequences include, but are not limited to, (1) leaving holes, and thus increasing file system fragmentation, and (2) reporting file system is full long before it actually is. This patch introduces a new function, rgd_free, which returns the number of clone-free blocks (blocks that are truly free as opposed to blocks that are still being used because an unlinked file is still open) minus the number of blocks reserved by processes, but not counting the blocks we ourselves reserved (because obviously we need to allocate them). Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
-
Colin Ian King authored
Variable 'moved' s being assigned but is never used hence it is redundant and can be removed. This has been the case ever since commit c752666c. Cleans up clang warning: warning: variable 'moved' set but not used [-Wunused-but-set-variable] Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
-
Andreas Gruenbacher authored
We only use iomap_readpage for pages that don't have buffer heads attached yet: iomap_readpage would otherwise read pages from disk that are marked buffer_uptodate() but not PageUptodate(). Those pages may actually contain data more recent than what's on disk. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Bob Peterson <rpeterso@redhat.com>
-
Andreas Gruenbacher authored
Remove the fallback code from direct to buffered I/O for stuffed reads. For stuffed writes, we must keep the fallback code: the deferred glock we are holding under direct I/O doesn't allow to write to the inode or change the file size. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Bob Peterson <rpeterso@redhat.com>
-
Andreas Gruenbacher authored
Merge xfs branch 'iomap-4.19-merge' into linux-gfs2/for-next. This brings in readpage and direct I/O support for inline data. The IOMAP_F_BUFFER_HEAD flag introduced in commit "iomap: add initial support for writes without buffer heads" needs to be set for gfs2 as well, so do that in the merge. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
-
Andreas Gruenbacher authored
In fallocate_chunk, always initialize the iomap before calling gfs2_iomap_get_alloc: future changes could otherwise cause things like iomap.flags to leak across calls. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Bob Peterson <rpeterso@redhat.com>
-
Bob Peterson authored
This patch fixes a couple problems dealing with spectators who remain with gfs2 mounts after the last non-spectator node fails. Before this patch, spectator mounts would try to acquire the dlm's mounted lock EX as part of its normal recovery sequence. The mounted lock is only used to determine whether the node is the first mounter, the first node to mount the file system, for the purposes of file system recovery and journal replay. It's not necessary for spectators: they should never do journal recovery. If they acquire the lock it will prevent another "real" first-mounter from acquiring the lock in EX mode, which means it also cannot do journal recovery because it doesn't think it's the first node to mount the file system. This patch checks if the mounter is a spectator, and if so, avoids grabbing the mounted lock. This allows a secondary mounter who is really the first non-spectator mounter, to do journal recovery: since the spectator doesn't acquire the lock, it can grab it in EX mode, and therefore consider itself to be the first mounter both as a "real" first mount, and as a first-real-after-spectator. Note that the control lock still needs to be taken in PR mode in order to fetch the lvb value so it has the current status of all journal's recovery. This is used as it is today by a first mounter to replay the journals. For spectators, it's merely used to fetch the status bits. All recovery is bypassed and the node waits until recovery is completed by a non-spectator node. I also improved the cryptic message given by control_mount when a spectator is waiting for a non-spectator to perform recovery. It also fixes a problem in gfs2_recover_set whereby spectators were never queueing recovery work for their own journal. They cannot do recovery themselves, but they still need to queue the work so they can check the recovery bits and clear the DFL_BLOCK_LOCKS bit once the recovery happens on another node. When the work queue runs on a spectator, it bypasses most of the work so it won't print a bunch of annoying messages. All it will print is a bunch of messages that look like this until recovery completes on the non-spectator node: GFS2: fsid=mycluster:scratch.s: recover generation 3 jid 0 GFS2: fsid=mycluster:scratch.s: recover jid 0 result busy These continue every 1.5 seconds until the recovery is done by the non-spectator, at which time it says: GFS2: fsid=mycluster:scratch.s: recover generation 4 done Then it proceeds with its mount. If the file system is mounted in spectator node and the last remaining non-spectator is fenced, any IO to the file system is blocked by dlm and the spectator waits until recovery is performed by a non-spectator. If a spectator tries to mount the file system before any non-spectators, it blocks and repeatedly gives this kernel message: GFS2: fsid=mycluster:scratch: Recovery is required. Waiting for a non-spectator to mount. GFS2: fsid=mycluster:scratch: Recovery is required. Waiting for a non-spectator to mount. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
-
Andreas Gruenbacher authored
Pull in the gfs2 iomap-write changes: Tweak the existing code to properly support iomap write and eliminate an unnecessary special case in gfs2_block_map. Implement iomap write support for buffered and direct I/O. Simplify some of the existing code and eliminate code that is no longer used: gfs2: Remove gfs2_write_{begin,end} gfs2: iomap direct I/O support gfs2: gfs2_extent_length cleanup gfs2: iomap buffered write support gfs2: Further iomap cleanups This is based on the following changes on the xfs 'iomap-4.19-merge' branch: iomap: add private pointer to struct iomap iomap: add a page_done callback iomap: generic inline data handling iomap: complete partial direct I/O writes synchronously iomap: mark newly allocated buffer heads as new fs: factor out a __generic_write_end helper Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
-
Souptick Joarder authored
Use new return type vm_fault_t for gfs2_page_mkwrite handler. see commit 1c8f4220 ("mm: change return type to vm_fault_t") for reference. Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
-
Chengguang Xu authored
It seems better to get size by calling posix_acl_xattr_size() instead of calling posix_acl_to_xattr() with NULL buffer argument. posix_acl_xattr_size() never returns 0, so remove the unnecessary check. Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
-
Bob Peterson authored
Before this patch, you could get into situations like this: 1. Process 1 searches for X free blocks, finds them, makes a reservation 2. Process 2 searches for free blocks in the same rgrp, but now the bitmap is full because process 1's reservation is skipped over. So it marks the bitmap as GBF_FULL. 3. Process 1 tries to allocate blocks from its own reservation, but since the GBF_FULL bit is set, it skips over the rgrp and searches elsewhere, thus not using its own reservation. This patch adds an additional check to allow processes to use their own reservations. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
-
- 05 Jul, 2018 1 commit
-
-
Andreas Gruenbacher authored
GFS2 remembers the last rgrp used for allocations in ip->i_rgd. However, block allocations are made by way of a reservations structure, ip->i_res, which keeps the last rgrp in ip->i_res.rs_rgd, and ip->i_res is kept in sync with ip->i_res.rs_rgd, so it's redundant. Get rid of ip->i_rgd and just use ip->i_res.rs_rgd in its place. Based on patches by Robert Peterson. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
-
- 04 Jul, 2018 1 commit
-
-
Andreas Gruenbacher authored
In the resource group list code, keep the last resource group added in the last position in the array. Check against that instead of messing with ip->i_rgd. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
-
- 03 Jul, 2018 3 commits
-
-
Andreas Gruenbacher authored
Just copy the inline data into the page using the existing helper. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
-
Andreas Gruenbacher authored
Add support for reading from and writing to inline data to iomap_dio_rw. This saves filesystems from having to implement fallback code for this case. The inline data is actually cached in the inode, so the I/O is only direct in the sense that it doesn't go through the page cache. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
-
Christoph Hellwig authored
Split the function up into two helpers for the bio based I/O and hole case, and a small helper to call the two. This separates the code a little better in preparation for supporting I/O to inline data. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
-
- 02 Jul, 2018 5 commits
-
-
Andreas Gruenbacher authored
Now that generic_file_write_iter is no longer used, there are no remaining users of these address space operations. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Bob Peterson <rpeterso@redhat.com>
-
Andreas Gruenbacher authored
The page unmapping previously done in gfs2_direct_IO is now done generically in iomap_dio_rw. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Bob Peterson <rpeterso@redhat.com>
-
Andreas Gruenbacher authored
Now that gfs2_extent_length is no longer used for determining the size of a hole and always with an upper size limit, the function can be simplified. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Bob Peterson <rpeterso@redhat.com>
-
Andreas Gruenbacher authored
With the traditional page-based writes, blocks are allocated separately for each page written to. With iomap writes, we can allocate a lot more blocks at once, with a fraction of the allocation overhead for each page. Split calculating the number of blocks that can be allocated at a given position (gfs2_alloc_size) off from gfs2_iomap_alloc: that size determines the number of blocks to allocate and reserve in the journal. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Bob Peterson <rpeterso@redhat.com>
-
Andreas Gruenbacher authored
In gfs2_iomap_alloc, set the type of newly allocated extents to IOMAP_MAPPED so that iomap_to_bh will set the bh states correctly: otherwise, the bhs would not be marked as mapped, confusing __mpage_writepage. This means that we need to check for the IOMAP_F_NEW flag in fallocate_chunk now. Further clean up gfs2_iomap_get and implement gfs2_stuffed_iomap here directly. For reads beyond the end of the file, return holes instead of failing with -ENOENT so that we can get rid of that special case in gfs2_block_map. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Bob Peterson <rpeterso@redhat.com>
-
- 21 Jun, 2018 4 commits
-
-
Arnd Bergmann authored
current_kernel_time64() is now just a deprecated wrapper around ktime_get_coarse_real_ts64(), so let's just call that directly. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
-
Andreas Gruenbacher authored
Rename end_off to end_len to make the code less confusing. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
-
Andreas Gruenbacher authored
In two places, the gfs2_io_error_bh macro is called while holding the sd_ail_lock spin lock. This isn't allowed because gfs2_io_error_bh withdraws the filesystem, which can sleep because it issues a uevent. To fix that, add a gfs2_io_error_bh_wd macro that does withdraw the filesystem and change gfs2_io_error_bh to not withdraw the filesystem. In those places where the new gfs2_io_error_bh is used, withdraw the filesystem after releasing sd_ail_lock. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Reviewed-by: Andrew Price <anprice@redhat.com>
-
Bob Peterson authored
Before this patch, block reservations kept track of the inode number. At one point, that was a valid thing to do. However, since we made the reservation a part of the inode (rather than a pointer to a separate allocated object) the reservation can determine the inode number by using container_of. This saves us a little memory in our inode. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Acked-by: Steven Whitehouse <swhiteho@redhat.com> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
-
- 20 Jun, 2018 1 commit
-
-
Christoph Hellwig authored
For now just limited to blocksize == PAGE_SIZE, where we can simply read in the full page in write begin, and just set the whole page dirty after copying data into it. This code is enabled by default and XFS will now be feed pages without buffer heads in ->writepage and ->writepages. If a file system sets the IOMAP_F_BUFFER_HEAD flag on the iomap the old path will still be used, this both helps the transition in XFS and prepares for the gfs2 migration to the iomap infrastructure. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
-
- 19 Jun, 2018 7 commits
-
-
Andreas Gruenbacher authored
Add a private pointer to struct iomap to allow filesystems to pass data from iomap_begin to iomap_end. Will be used by gfs2 for passing on the on-disk inode buffer head. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
-
Christoph Hellwig authored
Simply use iomap_apply to iterate over the file and a submit a bio for each non-uptodate but mapped region and zero everything else. Note that as-is this can not be used for file systems with a blocksize smaller than the page size, but that support will be added later. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
-
Christoph Hellwig authored
This will be used by gfs2 to attach data to transactions for the journaled data mode. But the concept is generic enough that we might be able to use it for other purposes like encryption/integrity post-processing in the future. Based on a patch from Andreas Gruenbacher. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
-
Andreas Gruenbacher authored
Add generic inline data handling by adding a pointer to the inline data region to struct iomap. When handling a buffered IOMAP_INLINE write, iomap_write_begin will copy the current inline data from the inline data region into the page cache, and iomap_write_end will copy the changes in the page cache back to the inline data region. This doesn't cover inline data reads and direct I/O yet because so far, we have no users. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> [hch: small cleanups to better fit in with other iomap work] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
-
Andreas Gruenbacher authored
According to xfstest generic/240, applications seem to expect direct I/O writes to either complete as a whole or to fail; short direct I/O writes are apparently not appreciated. This means that when only part of an asynchronous direct I/O write succeeds, we can either fail the entire write, or we can wait for the partial write to complete and retry the remaining write as buffered I/O. The old __blockdev_direct_IO helper has code for waiting for partial writes to complete; the new iomap_dio_rw iomap helper does not. The above mentioned fallback mode is needed for gfs2, which doesn't allow block allocations under direct I/O to avoid taking cluster-wide exclusive locks. As a consequence, an asynchronous direct I/O write to a file range that contains a hole will result in a short write. In that case, wait for the short write to complete to allow gfs2 to recover. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
-
Andreas Gruenbacher authored
In iomap_to_bh, not only mark buffer heads in IOMAP_UNWRITTEN maps as new, but also buffer heads in IOMAP_MAPPED maps with the IOMAP_F_NEW flag set. This will be used by filesystems like gfs2, which allocate blocks in iomap->begin. Minor corrections to the comment for IOMAP_UNWRITTEN maps. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
-
Christoph Hellwig authored
Bits of the buffer.c based write_end implementations that don't know about buffer_heads and can be reused by other implementations. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
-
- 16 Jun, 2018 2 commits
-
-
Linus Torvalds authored
-
git://git.kernel.dk/linux-blockLinus Torvalds authored
Pull block fixes from Jens Axboe: "A collection of fixes that should go into -rc1. This contains: - bsg_open vs bsg_unregister race fix (Anatoliy) - NVMe pull request from Christoph, with fixes for regressions in this window, FC connect/reconnect path code unification, and a trace point addition. - timeout fix (Christoph) - remove a few unused functions (Christoph) - blk-mq tag_set reinit fix (Roman)" * tag 'for-linus-20180616' of git://git.kernel.dk/linux-block: bsg: fix race of bsg_open and bsg_unregister block: remov blk_queue_invalidate_tags nvme-fabrics: fix and refine state checks in __nvmf_check_ready nvme-fabrics: handle the admin-only case properly in nvmf_check_ready nvme-fabrics: refactor queue ready check blk-mq: remove blk_mq_tagset_iter nvme: remove nvme_reinit_tagset nvme-fc: fix nulling of queue data on reconnect nvme-fc: remove reinit_request routine blk-mq: don't time out requests again that are in the timeout handler nvme-fc: change controllers first connect to use reconnect path nvme: don't rely on the changed namespace list log nvmet: free smart-log buffer after use nvme-rdma: fix error flow during mapping request data nvme: add bio remapping tracepoint nvme: fix NULL pointer dereference in nvme_init_subsystem blk-mq: reinit q->tag_set_list entry only after grace period
-