Commit ec3604c7 authored by Linus Torvalds

Merge tag 'wberr-v4.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux

Pull writeback error handling updates from Jeff Layton:
 "This pile continues the work from last cycle on better tracking
  writeback errors. In v4.13 we added some basic errseq_t infrastructure
  and converted a few filesystems to use it.

  This set continues refining that infrastructure, adds documentation,
  and converts most of the other filesystems to use it. The main
  exception at this point is the NFS client"

* tag 'wberr-v4.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
  ecryptfs: convert to file_write_and_wait in ->fsync
  mm: remove optimizations based on i_size in mapping writeback waits
  fs: convert a pile of fsync routines to errseq_t based reporting
  gfs2: convert to errseq_t based writeback error reporting for fsync
  fs: convert sync_file_range to use errseq_t based error-tracking
  mm: add file_fdatawait_range and file_write_and_wait
  fuse: convert to errseq_t based error tracking for fsync
  mm: consolidate dax / non-dax checks for writeback
  Documentation: add some docs for errseq_t
  errseq: rename __errseq_set to errseq_set
parents 066dea8c 6d4b5124
The errseq_t datatype
=====================
An errseq_t is a way of recording errors in one place, and allowing any
number of "subscribers" to tell whether it has changed since a previous
point where it was sampled.
The initial use case for this is tracking errors for file
synchronization syscalls (fsync, fdatasync, msync and sync_file_range),
but it may be usable in other situations.
It's implemented as an unsigned 32-bit value. The low order bits are
designated to hold an error code (between 1 and MAX_ERRNO). The upper bits
are used as a counter. This is done with atomics instead of locking so that
these functions can be called from any context.
Note that there is a risk of collisions if new errors are being recorded
frequently, since we have so few bits to use as a counter.
To mitigate this, the bit between the error value and counter is used as
a flag to tell whether the value has been sampled since a new value was
recorded. That allows us to avoid bumping the counter if no one has
sampled it since the last time an error was recorded.
Thus we end up with a value that looks something like this::
	bit:  31..13        12        11..0
	      +-----------------+----+----------------+
	      |     counter     | SF |      errno     |
	      +-----------------+----+----------------+
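
To make the layout above concrete, here is a small userspace sketch that packs and unpacks the three fields. The `demo_*` names and `DEMO_*` constants are hypothetical and exist only for this illustration. The field widths are taken from the diagram (a 12-bit errno, one flag bit, a 19-bit counter), which assumes MAX_ERRNO is 4095.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constants, read off the diagram: errno in bits 11..0. */
#define DEMO_MAX_ERRNO	4095u
#define DEMO_SHIFT	12u				/* log2(MAX_ERRNO + 1) */
#define DEMO_SEEN	(1u << DEMO_SHIFT)		/* the "SF" bit, bit 12 */
#define DEMO_CTR_INC	(1u << (DEMO_SHIFT + 1))	/* lowest counter bit, bit 13 */

typedef uint32_t demo_errseq_t;

/* Extract the positive errno stored in the low-order bits. */
static inline unsigned int demo_errno(demo_errseq_t v)
{
	return v & DEMO_MAX_ERRNO;
}

/* Has anyone sampled this value since it was last recorded? */
static inline int demo_seen(demo_errseq_t v)
{
	return (v & DEMO_SEEN) != 0;
}

/* Extract the counter held in the upper 19 bits. */
static inline uint32_t demo_counter(demo_errseq_t v)
{
	return v >> (DEMO_SHIFT + 1);
}

/* Build a value from its parts; only meant for experimenting. */
static inline demo_errseq_t demo_pack(uint32_t counter, int seen,
				      unsigned int err)
{
	assert(err <= DEMO_MAX_ERRNO);
	assert(counter < (1u << 19));
	return counter * DEMO_CTR_INC | (seen ? DEMO_SEEN : 0u) | err;
}
```

With only 19 counter bits, the counter wraps after 524288 increments, which is why avoiding unnecessary bumps matters.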
The general idea is for "watchers" to sample an errseq_t value and keep
it as a running cursor. That value can later be used to tell whether
any new errors have occurred since that sampling was done, and atomically
record the state at the time that it was checked. This allows us to
record errors in one place, and then have a number of "watchers" that
can tell whether the value has changed since they last checked it.
A new errseq_t should always be zeroed out. An errseq_t value of all zeroes
is the special (but common) case where there has never been an error. An all
zero value thus serves as the "epoch" if one wishes to know whether there
has ever been an error set since it was first initialized.
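
The interaction between the SEEN flag, the counter and the all-zero epoch can be modelled in a few lines of userspace C. This is a simplified, single-threaded sketch: the `model_*` names are hypothetical, and plain loads and stores stand in for the atomic operations the real code relies on.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO	4095u
#define MODEL_SEEN	(1u << 12)
#define MODEL_CTR_INC	(1u << 13)

typedef uint32_t model_errseq_t;

/* Record an error. err is a negative errno such as -5. Not atomic. */
static inline model_errseq_t model_set(model_errseq_t *eseq, int err)
{
	model_errseq_t old = *eseq, next;

	assert(err < 0 && (unsigned int)-err <= MODEL_MAX_ERRNO);

	/* Drop the old error and the SEEN flag, keep the counter. */
	next = (old & ~(MODEL_MAX_ERRNO | MODEL_SEEN)) | (unsigned int)-err;

	/* Bump the counter only if someone sampled the previous value. */
	if (old & MODEL_SEEN)
		next += MODEL_CTR_INC;

	*eseq = next;
	return next;
}

/* Take a cursor. Marks the value as seen unless it is still the epoch. */
static inline model_errseq_t model_sample(model_errseq_t *eseq)
{
	/* All zeroes means "never had an error": leave it unmarked. */
	if (*eseq != 0)
		*eseq |= MODEL_SEEN;
	return *eseq;
}
```

Two errors recorded back to back with no sampler in between leave the counter untouched. Only an error recorded after a sample consumes a counter value.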
API usage
=========
Let me tell you a story about a worker drone. Now, he's a good worker
overall, but the company is a little...management heavy. He has to
report to 77 supervisors today, and tomorrow the "big boss" is coming in
from out of town and he's sure to test the poor fellow too.
They're all handing him work to do -- so much he can't keep track of who
handed him what, but that's not really a big problem. The supervisors
just want to know when he's finished all of the work they've handed him so
far and whether he made any mistakes since they last asked.
He might have made the mistake on work they didn't actually hand him,
but he can't keep track of things at that level of detail; all he can
remember is the most recent mistake that he made.
Here's our worker_drone representation::
	struct worker_drone {
		errseq_t wd_err; /* for recording errors */
	};
Every day, the worker_drone starts out with a blank slate::
	struct worker_drone wd;
	wd.wd_err = (errseq_t)0;
The supervisors come in and get an initial read for the day. They
don't care about anything that happened before their watch begins::
	struct supervisor {
		errseq_t s_wd_err; /* private "cursor" for wd_err */
		spinlock_t s_wd_err_lock; /* protects s_wd_err */
	};
	struct supervisor su;
	su.s_wd_err = errseq_sample(&wd.wd_err);
	spin_lock_init(&su.s_wd_err_lock);
Now they start handing him tasks to do. Every few minutes they ask him to
finish up all of the work they've handed him so far. Then they ask him
whether he made any mistakes on any of it::
	spin_lock(&su.s_wd_err_lock);
	err = errseq_check_and_advance(&wd.wd_err, &su.s_wd_err);
	spin_unlock(&su.s_wd_err_lock);
Up to this point, that just keeps returning 0.
Now, the owners of this company are quite miserly and have given him
substandard equipment with which to do his job. Occasionally it
glitches and he makes a mistake. He sighs a heavy sigh, and marks it
down::
	errseq_set(&wd.wd_err, -EIO);
...and then gets back to work. The supervisors eventually poll again
and they each get the error when they next check. Subsequent calls will
return 0, until another error is recorded, at which point it's reported
to each of them once.
Note that the supervisors can't tell how many mistakes he made, only
whether one was made since they last checked, and the latest value
recorded.
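
The reporting behaviour described above (each supervisor sees an error once, and only the latest value) can be tried out with a single-threaded userspace model. This is a sketch under stated assumptions, not the kernel implementation: the `model_*` names are hypothetical, nothing here is atomic, and the errno values are written as plain numbers (5 and 28 are EIO and ENOSPC on Linux).

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO	4095u
#define MODEL_SEEN	(1u << 12)
#define MODEL_CTR_INC	(1u << 13)

typedef uint32_t model_errseq_t;

/* Worker side: record a negative errno, overwriting any older one. */
static inline void model_set(model_errseq_t *eseq, int err)
{
	model_errseq_t old = *eseq, next;

	assert(err < 0 && (unsigned int)-err <= MODEL_MAX_ERRNO);
	next = (old & ~(MODEL_MAX_ERRNO | MODEL_SEEN)) | (unsigned int)-err;
	if (old & MODEL_SEEN)
		next += MODEL_CTR_INC;
	*eseq = next;
}

/*
 * Supervisor side: if the shared value differs from the private cursor,
 * mark it as seen, move the cursor up to it and report the stored error.
 */
static inline int model_check_and_advance(model_errseq_t *eseq,
					  model_errseq_t *since)
{
	int err = 0;

	if (*eseq != *since) {
		*eseq |= MODEL_SEEN;
		*since = *eseq;
		err = -(int)(*eseq & MODEL_MAX_ERRNO);
	}
	return err;
}
```

If the worker records -5 and then -28 before anyone polls, every supervisor gets -28 on the next poll and 0 afterwards. The -5 is lost, which matches the "most recent mistake" limitation in the story.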
Occasionally the big boss comes in for a spot check and asks the worker
to do a one-off job for him. He's not really watching the worker
full-time like the supervisors, but he does need to know whether a
mistake occurred while his job was processing.
He can just sample the current errseq_t in the worker, and then use that
to tell whether an error has occurred later::
	errseq_t since = errseq_sample(&wd.wd_err);
	/* submit some work and wait for it to complete */
	err = errseq_check(&wd.wd_err, since);
Since he's just going to discard "since" after that point, he doesn't
need to advance it here. He also doesn't need any locking since it's
not usable by anyone else.
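
The one-off pattern can be modelled the same way. The sketch below is a simplified, non-atomic userspace illustration with hypothetical `model_*` names. It shows that a check without advancing keeps reporting the error for as long as the same cursor is reused, and that mistakes made before the sample are ignored.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO	4095u
#define MODEL_SEEN	(1u << 12)
#define MODEL_CTR_INC	(1u << 13)

typedef uint32_t model_errseq_t;

/* Record a negative errno, bumping the counter if the old one was seen. */
static inline void model_set(model_errseq_t *eseq, int err)
{
	model_errseq_t old = *eseq, next;

	assert(err < 0 && (unsigned int)-err <= MODEL_MAX_ERRNO);
	next = (old & ~(MODEL_MAX_ERRNO | MODEL_SEEN)) | (unsigned int)-err;
	if (old & MODEL_SEEN)
		next += MODEL_CTR_INC;
	*eseq = next;
}

/* Take a throwaway cursor, marking a non-zero value as seen. */
static inline model_errseq_t model_sample(model_errseq_t *eseq)
{
	if (*eseq != 0)
		*eseq |= MODEL_SEEN;
	return *eseq;
}

/* Report whether the value changed since the cursor; never modifies it. */
static inline int model_check(const model_errseq_t *eseq,
			      model_errseq_t since)
{
	if (*eseq == since)
		return 0;
	return -(int)(*eseq & MODEL_MAX_ERRNO);
}
```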
Serializing errseq_t cursor updates
===================================
Note that the errseq_t API does not protect the errseq_t cursor during a
check_and_advance operation. Only the canonical error code is handled
atomically. In a situation where more than one task might be using the
same errseq_t cursor at the same time, it's important to serialize
updates to that cursor.
If that's not done, then it's possible for the cursor to go backward,
in which case the same error could be reported more than once.
Because of this, it's often advantageous to first do an errseq_check to
see if anything has changed, and only later do an
errseq_check_and_advance after taking the lock. e.g.::
	if (errseq_check(&wd.wd_err, READ_ONCE(su.s_wd_err))) {
		/* su.s_wd_err is protected by s_wd_err_lock */
		spin_lock(&su.s_wd_err_lock);
		err = errseq_check_and_advance(&wd.wd_err, &su.s_wd_err);
		spin_unlock(&su.s_wd_err_lock);
	}
That avoids the spinlock in the common case where nothing has changed
since the last time it was checked.
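
For experimenting outside the kernel, the check-then-lock pattern can be approximated with C11 atomics. This is only a sketch: the `model_*` names are hypothetical, an `atomic_flag` spin loop stands in for `spinlock_t`, and an atomic load stands in for READ_ONCE.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO	4095u
#define MODEL_SEEN	(1u << 12)

struct model_supervisor {
	_Atomic uint32_t cursor;	/* private cursor, written under lock */
	atomic_flag lock;		/* stand-in for the spinlock */
};

static inline void model_supervisor_init(struct model_supervisor *su)
{
	atomic_init(&su->cursor, 0);
	atomic_flag_clear(&su->lock);
}

/* Lockless peek: has the shared value moved past this cursor value? */
static inline int model_check(_Atomic uint32_t *eseq, uint32_t since)
{
	uint32_t cur = atomic_load(eseq);

	return cur == since ? 0 : -(int)(cur & MODEL_MAX_ERRNO);
}

/* Must be called with su->lock held: updates the cursor. */
static inline int model_check_and_advance(_Atomic uint32_t *eseq,
					  _Atomic uint32_t *since)
{
	uint32_t old = atomic_load(eseq);
	int err = 0;

	if (old != atomic_load(since)) {
		uint32_t next = old | MODEL_SEEN;

		/* Try to mark the shared value as seen; losing a race is fine. */
		if (next != old)
			atomic_compare_exchange_strong(eseq, &old, next);
		atomic_store(since, next);
		err = -(int)(next & MODEL_MAX_ERRNO);
	}
	return err;
}

static inline int model_supervisor_poll(_Atomic uint32_t *eseq,
					struct model_supervisor *su)
{
	int err = 0;

	/* Fast path: skip the lock entirely when nothing has changed. */
	if (model_check(eseq, atomic_load(&su->cursor))) {
		while (atomic_flag_test_and_set(&su->lock))
			;	/* spin until the lock is free */
		err = model_check_and_advance(eseq, &su->cursor);
		atomic_flag_clear(&su->lock);
	}
	return err;
}
```

The unlocked check may race with a concurrent update, so its result is only a hint. The authoritative answer comes from the locked check_and_advance, which can legitimately return 0 if another task advanced the cursor first.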
...@@ -1749,7 +1749,7 @@ static int spufs_mfc_flush(struct file *file, fl_owner_t id) ...@@ -1749,7 +1749,7 @@ static int spufs_mfc_flush(struct file *file, fl_owner_t id)
static int spufs_mfc_fsync(struct file *file, loff_t start, loff_t end, int datasync) static int spufs_mfc_fsync(struct file *file, loff_t start, loff_t end, int datasync)
{ {
struct inode *inode = file_inode(file); struct inode *inode = file_inode(file);
int err = filemap_write_and_wait_range(inode->i_mapping, start, end); int err = file_write_and_wait_range(file, start, end);
if (!err) { if (!err) {
inode_lock(inode); inode_lock(inode);
err = spufs_mfc_flush(file, NULL); err = spufs_mfc_flush(file, NULL);
......
...@@ -2364,7 +2364,7 @@ int ll_fsync(struct file *file, loff_t start, loff_t end, int datasync) ...@@ -2364,7 +2364,7 @@ int ll_fsync(struct file *file, loff_t start, loff_t end, int datasync)
PFID(ll_inode2fid(inode)), inode); PFID(ll_inode2fid(inode)), inode);
ll_stats_ops_tally(ll_i2sbi(inode), LPROC_LL_FSYNC, 1); ll_stats_ops_tally(ll_i2sbi(inode), LPROC_LL_FSYNC, 1);
rc = filemap_write_and_wait_range(inode->i_mapping, start, end); rc = file_write_and_wait_range(file, start, end);
inode_lock(inode); inode_lock(inode);
/* catch async errors that were recorded back when async writeback /* catch async errors that were recorded back when async writeback
......
...@@ -69,7 +69,7 @@ int fb_deferred_io_fsync(struct file *file, loff_t start, loff_t end, int datasy ...@@ -69,7 +69,7 @@ int fb_deferred_io_fsync(struct file *file, loff_t start, loff_t end, int datasy
{ {
struct fb_info *info = file->private_data; struct fb_info *info = file->private_data;
struct inode *inode = file_inode(file); struct inode *inode = file_inode(file);
int err = filemap_write_and_wait_range(inode->i_mapping, start, end); int err = file_write_and_wait_range(file, start, end);
if (err) if (err)
return err; return err;
......
...@@ -445,7 +445,7 @@ static int v9fs_file_fsync(struct file *filp, loff_t start, loff_t end, ...@@ -445,7 +445,7 @@ static int v9fs_file_fsync(struct file *filp, loff_t start, loff_t end,
struct p9_wstat wstat; struct p9_wstat wstat;
int retval; int retval;
retval = filemap_write_and_wait_range(inode->i_mapping, start, end); retval = file_write_and_wait_range(filp, start, end);
if (retval) if (retval)
return retval; return retval;
...@@ -468,7 +468,7 @@ int v9fs_file_fsync_dotl(struct file *filp, loff_t start, loff_t end, ...@@ -468,7 +468,7 @@ int v9fs_file_fsync_dotl(struct file *filp, loff_t start, loff_t end,
struct inode *inode = filp->f_mapping->host; struct inode *inode = filp->f_mapping->host;
int retval; int retval;
retval = filemap_write_and_wait_range(inode->i_mapping, start, end); retval = file_write_and_wait_range(filp, start, end);
if (retval) if (retval)
return retval; return retval;
......
...@@ -954,7 +954,7 @@ int affs_file_fsync(struct file *filp, loff_t start, loff_t end, int datasync) ...@@ -954,7 +954,7 @@ int affs_file_fsync(struct file *filp, loff_t start, loff_t end, int datasync)
struct inode *inode = filp->f_mapping->host; struct inode *inode = filp->f_mapping->host;
int ret, err; int ret, err;
err = filemap_write_and_wait_range(inode->i_mapping, start, end); err = file_write_and_wait_range(filp, start, end);
if (err) if (err)
return err; return err;
......
...@@ -714,7 +714,7 @@ int afs_fsync(struct file *file, loff_t start, loff_t end, int datasync) ...@@ -714,7 +714,7 @@ int afs_fsync(struct file *file, loff_t start, loff_t end, int datasync)
vnode->fid.vid, vnode->fid.vnode, file, vnode->fid.vid, vnode->fid.vnode, file,
datasync); datasync);
ret = filemap_write_and_wait_range(inode->i_mapping, start, end); ret = file_write_and_wait_range(file, start, end);
if (ret) if (ret)
return ret; return ret;
inode_lock(inode); inode_lock(inode);
......
...@@ -2329,7 +2329,7 @@ int cifs_strict_fsync(struct file *file, loff_t start, loff_t end, ...@@ -2329,7 +2329,7 @@ int cifs_strict_fsync(struct file *file, loff_t start, loff_t end,
struct inode *inode = file_inode(file); struct inode *inode = file_inode(file);
struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb); struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb);
rc = filemap_write_and_wait_range(inode->i_mapping, start, end); rc = file_write_and_wait_range(file, start, end);
if (rc) if (rc)
return rc; return rc;
inode_lock(inode); inode_lock(inode);
...@@ -2371,7 +2371,7 @@ int cifs_fsync(struct file *file, loff_t start, loff_t end, int datasync) ...@@ -2371,7 +2371,7 @@ int cifs_fsync(struct file *file, loff_t start, loff_t end, int datasync)
struct cifs_sb_info *cifs_sb = CIFS_FILE_SB(file); struct cifs_sb_info *cifs_sb = CIFS_FILE_SB(file);
struct inode *inode = file->f_mapping->host; struct inode *inode = file->f_mapping->host;
rc = filemap_write_and_wait_range(inode->i_mapping, start, end); rc = file_write_and_wait_range(file, start, end);
if (rc) if (rc)
return rc; return rc;
inode_lock(inode); inode_lock(inode);
......
...@@ -328,7 +328,7 @@ ecryptfs_fsync(struct file *file, loff_t start, loff_t end, int datasync) ...@@ -328,7 +328,7 @@ ecryptfs_fsync(struct file *file, loff_t start, loff_t end, int datasync)
{ {
int rc; int rc;
rc = filemap_write_and_wait(file->f_mapping); rc = file_write_and_wait(file);
if (rc) if (rc)
return rc; return rc;
......
...@@ -48,7 +48,7 @@ static int exofs_file_fsync(struct file *filp, loff_t start, loff_t end, ...@@ -48,7 +48,7 @@ static int exofs_file_fsync(struct file *filp, loff_t start, loff_t end,
struct inode *inode = filp->f_mapping->host; struct inode *inode = filp->f_mapping->host;
int ret; int ret;
ret = filemap_write_and_wait_range(inode->i_mapping, start, end); ret = file_write_and_wait_range(filp, start, end);
if (ret) if (ret)
return ret; return ret;
......
...@@ -206,7 +206,7 @@ static int f2fs_do_sync_file(struct file *file, loff_t start, loff_t end, ...@@ -206,7 +206,7 @@ static int f2fs_do_sync_file(struct file *file, loff_t start, loff_t end,
/* if fdatasync is triggered, let's do in-place-update */ /* if fdatasync is triggered, let's do in-place-update */
if (datasync || get_dirty_pages(inode) <= SM_I(sbi)->min_fsync_blocks) if (datasync || get_dirty_pages(inode) <= SM_I(sbi)->min_fsync_blocks)
set_inode_flag(inode, FI_NEED_IPU); set_inode_flag(inode, FI_NEED_IPU);
ret = filemap_write_and_wait_range(inode->i_mapping, start, end); ret = file_write_and_wait_range(file, start, end);
clear_inode_flag(inode, FI_NEED_IPU); clear_inode_flag(inode, FI_NEED_IPU);
if (ret) { if (ret) {
......
...@@ -457,7 +457,7 @@ int fuse_fsync_common(struct file *file, loff_t start, loff_t end, ...@@ -457,7 +457,7 @@ int fuse_fsync_common(struct file *file, loff_t start, loff_t end,
* wait for all outstanding writes, before sending the FSYNC * wait for all outstanding writes, before sending the FSYNC
* request. * request.
*/ */
err = filemap_write_and_wait_range(inode->i_mapping, start, end); err = file_write_and_wait_range(file, start, end);
if (err) if (err)
goto out; goto out;
...@@ -465,10 +465,10 @@ int fuse_fsync_common(struct file *file, loff_t start, loff_t end, ...@@ -465,10 +465,10 @@ int fuse_fsync_common(struct file *file, loff_t start, loff_t end,
/* /*
* Due to implementation of fuse writeback * Due to implementation of fuse writeback
* filemap_write_and_wait_range() does not catch errors. * file_write_and_wait_range() does not catch errors.
* We have to do this directly after fuse_sync_writes() * We have to do this directly after fuse_sync_writes()
*/ */
err = filemap_check_errors(file->f_mapping); err = file_check_and_advance_wb_err(file);
if (err) if (err)
goto out; goto out;
......
...@@ -668,12 +668,14 @@ static int gfs2_fsync(struct file *file, loff_t start, loff_t end, ...@@ -668,12 +668,14 @@ static int gfs2_fsync(struct file *file, loff_t start, loff_t end,
if (ret) if (ret)
return ret; return ret;
if (gfs2_is_jdata(ip)) if (gfs2_is_jdata(ip))
filemap_write_and_wait(mapping); ret = file_write_and_wait(file);
if (ret)
return ret;
gfs2_ail_flush(ip->i_gl, 1); gfs2_ail_flush(ip->i_gl, 1);
} }
if (mapping->nrpages) if (mapping->nrpages)
ret = filemap_fdatawait_range(mapping, start, end); ret = file_fdatawait_range(file, start, end);
return ret ? ret : ret1; return ret ? ret : ret1;
} }
......
...@@ -656,7 +656,7 @@ static int hfs_file_fsync(struct file *filp, loff_t start, loff_t end, ...@@ -656,7 +656,7 @@ static int hfs_file_fsync(struct file *filp, loff_t start, loff_t end,
struct super_block * sb; struct super_block * sb;
int ret, err; int ret, err;
ret = filemap_write_and_wait_range(inode->i_mapping, start, end); ret = file_write_and_wait_range(filp, start, end);
if (ret) if (ret)
return ret; return ret;
inode_lock(inode); inode_lock(inode);
......
...@@ -283,7 +283,7 @@ int hfsplus_file_fsync(struct file *file, loff_t start, loff_t end, ...@@ -283,7 +283,7 @@ int hfsplus_file_fsync(struct file *file, loff_t start, loff_t end,
struct hfsplus_sb_info *sbi = HFSPLUS_SB(inode->i_sb); struct hfsplus_sb_info *sbi = HFSPLUS_SB(inode->i_sb);
int error = 0, error2; int error = 0, error2;
error = filemap_write_and_wait_range(inode->i_mapping, start, end); error = file_write_and_wait_range(file, start, end);
if (error) if (error)
return error; return error;
inode_lock(inode); inode_lock(inode);
......
...@@ -374,7 +374,7 @@ static int hostfs_fsync(struct file *file, loff_t start, loff_t end, ...@@ -374,7 +374,7 @@ static int hostfs_fsync(struct file *file, loff_t start, loff_t end,
struct inode *inode = file->f_mapping->host; struct inode *inode = file->f_mapping->host;
int ret; int ret;
ret = filemap_write_and_wait_range(inode->i_mapping, start, end); ret = file_write_and_wait_range(file, start, end);
if (ret) if (ret)
return ret; return ret;
......
...@@ -24,7 +24,7 @@ int hpfs_file_fsync(struct file *file, loff_t start, loff_t end, int datasync) ...@@ -24,7 +24,7 @@ int hpfs_file_fsync(struct file *file, loff_t start, loff_t end, int datasync)
struct inode *inode = file->f_mapping->host; struct inode *inode = file->f_mapping->host;
int ret; int ret;
ret = filemap_write_and_wait_range(file->f_mapping, start, end); ret = file_write_and_wait_range(file, start, end);
if (ret) if (ret)
return ret; return ret;
return sync_blockdev(inode->i_sb->s_bdev); return sync_blockdev(inode->i_sb->s_bdev);
......
...@@ -35,7 +35,7 @@ int jffs2_fsync(struct file *filp, loff_t start, loff_t end, int datasync) ...@@ -35,7 +35,7 @@ int jffs2_fsync(struct file *filp, loff_t start, loff_t end, int datasync)
struct jffs2_sb_info *c = JFFS2_SB_INFO(inode->i_sb); struct jffs2_sb_info *c = JFFS2_SB_INFO(inode->i_sb);
int ret; int ret;
ret = filemap_write_and_wait_range(inode->i_mapping, start, end); ret = file_write_and_wait_range(filp, start, end);
if (ret) if (ret)
return ret; return ret;
......
...@@ -34,7 +34,7 @@ int jfs_fsync(struct file *file, loff_t start, loff_t end, int datasync) ...@@ -34,7 +34,7 @@ int jfs_fsync(struct file *file, loff_t start, loff_t end, int datasync)
struct inode *inode = file->f_mapping->host; struct inode *inode = file->f_mapping->host;
int rc = 0; int rc = 0;
rc = filemap_write_and_wait_range(inode->i_mapping, start, end); rc = file_write_and_wait_range(file, start, end);
if (rc) if (rc)
return rc; return rc;
......
...@@ -23,7 +23,7 @@ ...@@ -23,7 +23,7 @@
static int ncp_fsync(struct file *file, loff_t start, loff_t end, int datasync) static int ncp_fsync(struct file *file, loff_t start, loff_t end, int datasync)
{ {
return filemap_write_and_wait_range(file->f_mapping, start, end); return file_write_and_wait_range(file, start, end);
} }
/* /*
......
...@@ -1506,7 +1506,7 @@ static int ntfs_dir_fsync(struct file *filp, loff_t start, loff_t end, ...@@ -1506,7 +1506,7 @@ static int ntfs_dir_fsync(struct file *filp, loff_t start, loff_t end,
ntfs_debug("Entering for inode 0x%lx.", vi->i_ino); ntfs_debug("Entering for inode 0x%lx.", vi->i_ino);
err = filemap_write_and_wait_range(vi->i_mapping, start, end); err = file_write_and_wait_range(filp, start, end);
if (err) if (err)
return err; return err;
inode_lock(vi); inode_lock(vi);
......
...@@ -1989,7 +1989,7 @@ static int ntfs_file_fsync(struct file *filp, loff_t start, loff_t end, ...@@ -1989,7 +1989,7 @@ static int ntfs_file_fsync(struct file *filp, loff_t start, loff_t end,
ntfs_debug("Entering for inode 0x%lx.", vi->i_ino); ntfs_debug("Entering for inode 0x%lx.", vi->i_ino);
err = filemap_write_and_wait_range(vi->i_mapping, start, end); err = file_write_and_wait_range(filp, start, end);
if (err) if (err)
return err; return err;
inode_lock(vi); inode_lock(vi);
......
...@@ -196,7 +196,7 @@ static int ocfs2_sync_file(struct file *file, loff_t start, loff_t end, ...@@ -196,7 +196,7 @@ static int ocfs2_sync_file(struct file *file, loff_t start, loff_t end,
if (ocfs2_is_hard_readonly(osb) || ocfs2_is_soft_readonly(osb)) if (ocfs2_is_hard_readonly(osb) || ocfs2_is_soft_readonly(osb))
return -EROFS; return -EROFS;
err = filemap_write_and_wait_range(inode->i_mapping, start, end); err = file_write_and_wait_range(file, start, end);
if (err) if (err)
return err; return err;
......
...@@ -34,7 +34,7 @@ static int reiserfs_dir_fsync(struct file *filp, loff_t start, loff_t end, ...@@ -34,7 +34,7 @@ static int reiserfs_dir_fsync(struct file *filp, loff_t start, loff_t end,
struct inode *inode = filp->f_mapping->host; struct inode *inode = filp->f_mapping->host;
int err; int err;
err = filemap_write_and_wait_range(inode->i_mapping, start, end); err = file_write_and_wait_range(filp, start, end);
if (err) if (err)
return err; return err;
......
...@@ -154,7 +154,7 @@ static int reiserfs_sync_file(struct file *filp, loff_t start, loff_t end, ...@@ -154,7 +154,7 @@ static int reiserfs_sync_file(struct file *filp, loff_t start, loff_t end,
int err; int err;
int barrier_done; int barrier_done;
err = filemap_write_and_wait_range(inode->i_mapping, start, end); err = file_write_and_wait_range(filp, start, end);
if (err) if (err)
return err; return err;
......
...@@ -342,7 +342,7 @@ SYSCALL_DEFINE4(sync_file_range, int, fd, loff_t, offset, loff_t, nbytes, ...@@ -342,7 +342,7 @@ SYSCALL_DEFINE4(sync_file_range, int, fd, loff_t, offset, loff_t, nbytes,
ret = 0; ret = 0;
if (flags & SYNC_FILE_RANGE_WAIT_BEFORE) { if (flags & SYNC_FILE_RANGE_WAIT_BEFORE) {
ret = filemap_fdatawait_range(mapping, offset, endbyte); ret = file_fdatawait_range(f.file, offset, endbyte);
if (ret < 0) if (ret < 0)
goto out_put; goto out_put;
} }
...@@ -355,7 +355,7 @@ SYSCALL_DEFINE4(sync_file_range, int, fd, loff_t, offset, loff_t, nbytes, ...@@ -355,7 +355,7 @@ SYSCALL_DEFINE4(sync_file_range, int, fd, loff_t, offset, loff_t, nbytes,
} }
if (flags & SYNC_FILE_RANGE_WAIT_AFTER) if (flags & SYNC_FILE_RANGE_WAIT_AFTER)
ret = filemap_fdatawait_range(mapping, offset, endbyte); ret = file_fdatawait_range(f.file, offset, endbyte);
out_put: out_put:
fdput(f); fdput(f);
......
...@@ -1337,7 +1337,7 @@ int ubifs_fsync(struct file *file, loff_t start, loff_t end, int datasync) ...@@ -1337,7 +1337,7 @@ int ubifs_fsync(struct file *file, loff_t start, loff_t end, int datasync)
*/ */
return 0; return 0;
err = filemap_write_and_wait_range(inode->i_mapping, start, end); err = file_write_and_wait_range(file, start, end);
if (err) if (err)
return err; return err;
inode_lock(inode); inode_lock(inode);
......
/*
* See Documentation/errseq.rst and lib/errseq.c
*/
#ifndef _LINUX_ERRSEQ_H #ifndef _LINUX_ERRSEQ_H
#define _LINUX_ERRSEQ_H #define _LINUX_ERRSEQ_H
/* See lib/errseq.c for more info */
typedef u32 errseq_t; typedef u32 errseq_t;
errseq_t __errseq_set(errseq_t *eseq, int err); errseq_t errseq_set(errseq_t *eseq, int err);
static inline void errseq_set(errseq_t *eseq, int err)
{
/* Optimize for the common case of no error */
if (unlikely(err))
__errseq_set(eseq, err);
}
errseq_t errseq_sample(errseq_t *eseq); errseq_t errseq_sample(errseq_t *eseq);
int errseq_check(errseq_t *eseq, errseq_t since); int errseq_check(errseq_t *eseq, errseq_t since);
int errseq_check_and_advance(errseq_t *eseq, errseq_t *since); int errseq_check_and_advance(errseq_t *eseq, errseq_t *since);
......
...@@ -2544,12 +2544,19 @@ extern int invalidate_inode_pages2_range(struct address_space *mapping, ...@@ -2544,12 +2544,19 @@ extern int invalidate_inode_pages2_range(struct address_space *mapping,
extern int write_inode_now(struct inode *, int); extern int write_inode_now(struct inode *, int);
extern int filemap_fdatawrite(struct address_space *); extern int filemap_fdatawrite(struct address_space *);
extern int filemap_flush(struct address_space *); extern int filemap_flush(struct address_space *);
extern int filemap_fdatawait(struct address_space *);
extern int filemap_fdatawait_keep_errors(struct address_space *mapping); extern int filemap_fdatawait_keep_errors(struct address_space *mapping);
extern int filemap_fdatawait_range(struct address_space *, loff_t lstart, extern int filemap_fdatawait_range(struct address_space *, loff_t lstart,
loff_t lend); loff_t lend);
static inline int filemap_fdatawait(struct address_space *mapping)
{
return filemap_fdatawait_range(mapping, 0, LLONG_MAX);
}
extern bool filemap_range_has_page(struct address_space *, loff_t lstart, extern bool filemap_range_has_page(struct address_space *, loff_t lstart,
loff_t lend); loff_t lend);
extern int __must_check file_fdatawait_range(struct file *file, loff_t lstart,
loff_t lend);
extern int filemap_write_and_wait(struct address_space *mapping); extern int filemap_write_and_wait(struct address_space *mapping);
extern int filemap_write_and_wait_range(struct address_space *mapping, extern int filemap_write_and_wait_range(struct address_space *mapping,
loff_t lstart, loff_t lend); loff_t lstart, loff_t lend);
...@@ -2558,12 +2565,19 @@ extern int __filemap_fdatawrite_range(struct address_space *mapping, ...@@ -2558,12 +2565,19 @@ extern int __filemap_fdatawrite_range(struct address_space *mapping,
extern int filemap_fdatawrite_range(struct address_space *mapping, extern int filemap_fdatawrite_range(struct address_space *mapping,
loff_t start, loff_t end); loff_t start, loff_t end);
extern int filemap_check_errors(struct address_space *mapping); extern int filemap_check_errors(struct address_space *mapping);
extern void __filemap_set_wb_err(struct address_space *mapping, int err); extern void __filemap_set_wb_err(struct address_space *mapping, int err);
extern int __must_check file_fdatawait_range(struct file *file, loff_t lstart,
loff_t lend);
extern int __must_check file_check_and_advance_wb_err(struct file *file); extern int __must_check file_check_and_advance_wb_err(struct file *file);
extern int __must_check file_write_and_wait_range(struct file *file, extern int __must_check file_write_and_wait_range(struct file *file,
loff_t start, loff_t end); loff_t start, loff_t end);
static inline int file_write_and_wait(struct file *file)
{
return file_write_and_wait_range(file, 0, LLONG_MAX);
}
/** /**
* filemap_set_wb_err - set a writeback error on an address_space * filemap_set_wb_err - set a writeback error on an address_space
* @mapping: mapping in which to set writeback error * @mapping: mapping in which to set writeback error
...@@ -2577,8 +2591,6 @@ extern int __must_check file_write_and_wait_range(struct file *file, ...@@ -2577,8 +2591,6 @@ extern int __must_check file_write_and_wait_range(struct file *file,
* When a writeback error occurs, most filesystems will want to call * When a writeback error occurs, most filesystems will want to call
* filemap_set_wb_err to record the error in the mapping so that it will be * filemap_set_wb_err to record the error in the mapping so that it will be
* automatically reported whenever fsync is called on the file. * automatically reported whenever fsync is called on the file.
*
* FIXME: mention FS_* flag here?
*/ */
static inline void filemap_set_wb_err(struct address_space *mapping, int err) static inline void filemap_set_wb_err(struct address_space *mapping, int err)
{ {
......
...@@ -41,23 +41,20 @@ ...@@ -41,23 +41,20 @@
#define ERRSEQ_CTR_INC (1 << (ERRSEQ_SHIFT + 1)) #define ERRSEQ_CTR_INC (1 << (ERRSEQ_SHIFT + 1))
/** /**
* __errseq_set - set a errseq_t for later reporting * errseq_set - set a errseq_t for later reporting
* @eseq: errseq_t field that should be set * @eseq: errseq_t field that should be set
* @err: error to set * @err: error to set (must be between -1 and -MAX_ERRNO)
* *
* This function sets the error in *eseq, and increments the sequence counter * This function sets the error in *eseq, and increments the sequence counter
* if the last sequence was sampled at some point in the past. * if the last sequence was sampled at some point in the past.
* *
* Any error set will always overwrite an existing error. * Any error set will always overwrite an existing error.
* *
* Most callers will want to use the errseq_set inline wrapper to efficiently * We do return the latest value here, primarily for debugging purposes. The
* handle the common case where err is 0. * return value should not be used as a previously sampled value in later calls
* * as it will not have the SEEN flag set.
* We do return an errseq_t here, primarily for debugging purposes. The return
* value should not be used as a previously sampled value in later calls as it
* will not have the SEEN flag set.
*/ */
errseq_t __errseq_set(errseq_t *eseq, int err) errseq_t errseq_set(errseq_t *eseq, int err)
{ {
errseq_t cur, old; errseq_t cur, old;
...@@ -107,7 +104,7 @@ errseq_t __errseq_set(errseq_t *eseq, int err) ...@@ -107,7 +104,7 @@ errseq_t __errseq_set(errseq_t *eseq, int err)
} }
return cur; return cur;
} }
EXPORT_SYMBOL(__errseq_set); EXPORT_SYMBOL(errseq_set);
/** /**
* errseq_sample - grab current errseq_t value * errseq_sample - grab current errseq_t value
......
...@@ -475,6 +475,29 @@ int filemap_fdatawait_range(struct address_space *mapping, loff_t start_byte, ...@@ -475,6 +475,29 @@ int filemap_fdatawait_range(struct address_space *mapping, loff_t start_byte,
} }
EXPORT_SYMBOL(filemap_fdatawait_range); EXPORT_SYMBOL(filemap_fdatawait_range);
/**
* file_fdatawait_range - wait for writeback to complete
* @file: file pointing to address space structure to wait for
* @start_byte: offset in bytes where the range starts
* @end_byte: offset in bytes where the range ends (inclusive)
*
* Walk the list of under-writeback pages of the address space that file
* refers to, in the given range and wait for all of them. Check error
* status of the address space vs. the file->f_wb_err cursor and return it.
*
* Since the error status of the file is advanced by this function,
* callers are responsible for checking the return value and handling and/or
* reporting the error.
*/
int file_fdatawait_range(struct file *file, loff_t start_byte, loff_t end_byte)
{
struct address_space *mapping = file->f_mapping;
__filemap_fdatawait_range(mapping, start_byte, end_byte);
return file_check_and_advance_wb_err(file);
}
EXPORT_SYMBOL(file_fdatawait_range);
/** /**
* filemap_fdatawait_keep_errors - wait for writeback without clearing errors * filemap_fdatawait_keep_errors - wait for writeback without clearing errors
* @mapping: address space structure to wait for * @mapping: address space structure to wait for
...@@ -489,45 +512,22 @@ EXPORT_SYMBOL(filemap_fdatawait_range); ...@@ -489,45 +512,22 @@ EXPORT_SYMBOL(filemap_fdatawait_range);
*/ */
int filemap_fdatawait_keep_errors(struct address_space *mapping) int filemap_fdatawait_keep_errors(struct address_space *mapping)
{ {
loff_t i_size = i_size_read(mapping->host); __filemap_fdatawait_range(mapping, 0, LLONG_MAX);
if (i_size == 0)
return 0;
__filemap_fdatawait_range(mapping, 0, i_size - 1);
return filemap_check_and_keep_errors(mapping); return filemap_check_and_keep_errors(mapping);
} }
EXPORT_SYMBOL(filemap_fdatawait_keep_errors); EXPORT_SYMBOL(filemap_fdatawait_keep_errors);
/** static bool mapping_needs_writeback(struct address_space *mapping)
* filemap_fdatawait - wait for all under-writeback pages to complete
* @mapping: address space structure to wait for
*
* Walk the list of under-writeback pages of the given address space
* and wait for all of them. Check error status of the address space
* and return it.
*
* Since the error status of the address space is cleared by this function,
* callers are responsible for checking the return value and handling and/or
* reporting the error.
*/
int filemap_fdatawait(struct address_space *mapping)
{ {
loff_t i_size = i_size_read(mapping->host); return (!dax_mapping(mapping) && mapping->nrpages) ||
(dax_mapping(mapping) && mapping->nrexceptional);
if (i_size == 0)
return 0;
return filemap_fdatawait_range(mapping, 0, i_size - 1);
} }
EXPORT_SYMBOL(filemap_fdatawait);
int filemap_write_and_wait(struct address_space *mapping) int filemap_write_and_wait(struct address_space *mapping)
{ {
int err = 0; int err = 0;
if ((!dax_mapping(mapping) && mapping->nrpages) || if (mapping_needs_writeback(mapping)) {
(dax_mapping(mapping) && mapping->nrexceptional)) {
err = filemap_fdatawrite(mapping); err = filemap_fdatawrite(mapping);
/* /*
* Even if the above returned error, the pages may be * Even if the above returned error, the pages may be
...@@ -566,8 +566,7 @@ int filemap_write_and_wait_range(struct address_space *mapping, ...@@ -566,8 +566,7 @@ int filemap_write_and_wait_range(struct address_space *mapping,
{ {
int err = 0; int err = 0;
if ((!dax_mapping(mapping) && mapping->nrpages) || if (mapping_needs_writeback(mapping)) {
(dax_mapping(mapping) && mapping->nrexceptional)) {
err = __filemap_fdatawrite_range(mapping, lstart, lend, err = __filemap_fdatawrite_range(mapping, lstart, lend,
WB_SYNC_ALL); WB_SYNC_ALL);
/* See comment of filemap_write_and_wait() */ /* See comment of filemap_write_and_wait() */
...@@ -589,7 +588,7 @@ EXPORT_SYMBOL(filemap_write_and_wait_range); ...@@ -589,7 +588,7 @@ EXPORT_SYMBOL(filemap_write_and_wait_range);
void __filemap_set_wb_err(struct address_space *mapping, int err) void __filemap_set_wb_err(struct address_space *mapping, int err)
{ {
errseq_t eseq = __errseq_set(&mapping->wb_err, err); errseq_t eseq = errseq_set(&mapping->wb_err, err);
trace_filemap_set_wb_err(mapping, eseq); trace_filemap_set_wb_err(mapping, eseq);
} }
...@@ -656,8 +655,7 @@ int file_write_and_wait_range(struct file *file, loff_t lstart, loff_t lend) ...@@ -656,8 +655,7 @@ int file_write_and_wait_range(struct file *file, loff_t lstart, loff_t lend)
int err = 0, err2; int err = 0, err2;
struct address_space *mapping = file->f_mapping; struct address_space *mapping = file->f_mapping;
if ((!dax_mapping(mapping) && mapping->nrpages) || if (mapping_needs_writeback(mapping)) {
(dax_mapping(mapping) && mapping->nrexceptional)) {
err = __filemap_fdatawrite_range(mapping, lstart, lend, err = __filemap_fdatawrite_range(mapping, lstart, lend,
WB_SYNC_ALL); WB_SYNC_ALL);
/* See comment of filemap_write_and_wait() */ /* See comment of filemap_write_and_wait() */
......