• Brian Foster's avatar
    filemap: skip write and wait if end offset precedes start · feeb9b26
    Brian Foster authored
    Patch series "filemap: skip write and wait if end offset precedes start",
    v2.
    
    A fix for the odd write and wait behavior described in the patch 1 commit
    log.  Technically patch 1 could simply remove the check rather than lift
    it into the callers, but this seemed a bit more user friendly to me. 
    Patch 2 is appended after observation that fadvise() interacted poorly
    with the v1 patch.  This is no longer a problem with v2, making patch 2
    purely a cleanup.
    
    This series survived both fstests and ltp regression runs without
    observable problems.  I had (end < start) warning checks in each relevant
    function, with fadvise() being the only caller that triggered them.  That
    said, I dropped the warnings after testing because there seemed to much
    potential for noise from the various other callers.
    
    
    This patch (of 2):
    
    A call to file[map]_write_and_wait_range() with an end offset that
    precedes the start offset but happens to land in the same page can trigger
    writeback submission but fails to wait on the submitted page.  Writeback
    submission occurs because __filemap_fdatawrite_range() passes both offsets
    down into write_cache_pages(), which rounds down to page indexes before it
    starts processing writeback.  However, __filemap_fdatawait_range()
    immediately returns if the byte-granular end offset precedes the start
    offset.
    
    This behavior was observed in the form of unpredictable latency from a
    frequent write and wait call with incorrect parameters.  The behavior gave
    the impression that the fdatawait path might occasionally fail to wait on
    writeback, but further investigation showed the latency was from
    write_cache_pages() waiting on writeback state to clear for a page already
    under writeback.  Therefore, this indicated that fdatawait actually never
    waits on writeback in this particular situation.
    
    The byte granular check in __filemap_fdatawait_range() goes all the way
    back to the old wait_on_page_writeback() helper.  It originally used page
    offsets and so would have waited in this problematic case.  That changed
    to byte granularity file offsets in commit 94004ed7 ("kill
    wait_on_page_writeback_range"), which subtly changed this behavior.  The
    check itself has become somewhat redundant since the error checking code
    that used to follow the wait loop (at the time of the aforementioned
    commit) has now been removed and lifted into the higher level callers.
    
    Therefore, we can restore historical fdatawait behavior by simply removing
    the check.  Since the current fdatawait behavior has been in place for
    quite some time and is consistent with other interfaces that use file
    offsets, instead lift the check into the file[map]_write_and_wait_range()
    helpers to provide consistent behavior between the write and wait.
    
    Link: https://lkml.kernel.org/r/20221128155632.3950447-1-bfoster@redhat.com
    Link: https://lkml.kernel.org/r/20221128155632.3950447-2-bfoster@redhat.comSigned-off-by: default avatarBrian Foster <bfoster@redhat.com>
    Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    feeb9b26
filemap.c 112 KB