• Darrick J. Wong's avatar
    mm: only enforce stable page writes if the backing device requires it · 1d1d1a76
    Darrick J. Wong authored
    Create a helper function to check if a backing device requires stable
    page writes and, if so, performs the necessary wait.  Then, make it so
    that all points in the memory manager that handle making pages writable
    use the helper function.  This should provide stable page write support
    to most filesystems, while eliminating unnecessary waiting for devices
    that don't require the feature.
    
    Before this patchset, all filesystems would block, regardless of whether
    or not it was necessary.  ext3 would wait, but still generate occasional
    checksum errors.  The network filesystems were left to do their own
    thing, so they'd wait too.
    
    After this patchset, all the disk filesystems except ext3 and btrfs will
    wait only if the hardware requires it.  ext3 (if necessary) snapshots
    pages instead of blocking, and btrfs provides its own bdi so the mm will
    never wait.  Network filesystems haven't been touched, so either they
    provide their own stable page guarantees or they don't block at all.
    The blocking behavior is back to what it was before 3.0 if you don't
    have a disk requiring stable page writes.
    
    Here's the result of using dbench to test latency on ext2:
    
    3.8.0-rc3:
     Operation      Count    AvgLat    MaxLat
     ----------------------------------------
     WriteX        109347     0.028    59.817
     ReadX         347180     0.004     3.391
     Flush          15514    29.828   287.283
    
    Throughput 57.429 MB/sec  4 clients  4 procs  max_latency=287.290 ms
    
    3.8.0-rc3 + patches:
     WriteX        105556     0.029     4.273
     ReadX         335004     0.005     4.112
     Flush          14982    30.540   298.634
    
    Throughput 55.4496 MB/sec  4 clients  4 procs  max_latency=298.650 ms
    
    As you can see, the maximum write latency drops considerably with this
    patch enabled.  The other filesystems (ext3/ext4/xfs/btrfs) behave
    similarly, but see the cover letter for those results.
    Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
    Acked-by: default avatarSteven Whitehouse <swhiteho@redhat.com>
    Reviewed-by: default avatarJan Kara <jack@suse.cz>
    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Cc: Andy Lutomirski <luto@amacapital.net>
    Cc: Artem Bityutskiy <dedekind1@gmail.com>
    Cc: Joel Becker <jlbec@evilplan.org>
    Cc: Mark Fasheh <mfasheh@suse.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Eric Van Hensbergen <ericvh@gmail.com>
    Cc: Ron Minnich <rminnich@sandia.gov>
    Cc: Latchesar Ionkov <lucho@ionkov.net>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    1d1d1a76
inode.c 144 KB