    ext4: only call ext4_jbd2_file_inode when an inode has been extended · decbd919
    Theodore Ts'o authored
    In delayed allocation mode, it's important to only call
    ext4_jbd2_file_inode when the file has been extended.  This is
    necessary to avoid a race which first got introduced in commit
    678aaf48, but which was made much more common with the introduction
    of the "punch hole" functionality.  (This was especially true when
    dioread_nolock was enabled, in which case I could reliably reproduce
    the problem with xfstests #74.)
    
    The race is this: suppose that, while writing back a delayed
    allocation inode, we need to map delalloc blocks and run out of space
    in the journal, *and* at the same time the inode is already on the
    committing transaction's t_inode_list (for example, because
    ext4_jbd2_file_inode() was called during a punch hole operation).
    The commit operation then calls filemap_fdatawait() to wait for all
    of the inode's pending writebacks to finish.  But the inode has one
    or more pages with the PageWriteback flag set, and the writeback
    that would clear them cannot proceed until the commit frees up
    journal space.  So the commit waits forever, the writeback of the
    inode can never take place, and the kjournald thread and the
    writeback thread end up waiting for each other, forever.
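    
    The circular wait described above can be sketched as a toy
    user-space model.  This is not kernel code: the struct, the enum,
    and all function names below are hypothetical, and the three
    booleans are only a simplified stand-in for the real state (the
    t_inode_list membership, the PageWriteback flag, and journal space).

```c
#include <stdbool.h>

/* Hypothetical actors in the toy model. */
enum actor { NOBODY, COMMIT_THREAD, WRITEBACK_THREAD };

/* Hypothetical, simplified snapshot of the state the race depends on. */
struct toy_state {
    bool inode_on_committing_list; /* inode is on the committing t_inode_list */
    bool page_under_writeback;     /* some page has PageWriteback set */
    bool journal_full;             /* no journal space left to map delalloc blocks */
};

/* The commit waits for writeback only if the inode was filed on its list
 * and pages are still under writeback (the filemap_fdatawait() step). */
static enum actor commit_waits_for(const struct toy_state *s)
{
    if (s->inode_on_committing_list && s->page_under_writeback)
        return WRITEBACK_THREAD;
    return NOBODY;
}

/* Writeback needs journal space to map delalloc blocks; in this model
 * that space is freed only when the commit finishes. */
static enum actor writeback_waits_for(const struct toy_state *s)
{
    return s->journal_full ? COMMIT_THREAD : NOBODY;
}

/* Deadlock: each side is waiting for the other. */
static bool toy_deadlocked(const struct toy_state *s)
{
    return commit_waits_for(s) == WRITEBACK_THREAD &&
           writeback_waits_for(s) == COMMIT_THREAD;
}
```

    In this model, clearing inode_on_committing_list is enough to break
    the cycle even when the journal is full, which is the effect of not
    filing the inode in the first place.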
    
    It's important at this point to recall why an inode is placed on the
    t_inode_list; it is to provide the data=ordered guarantees that we
    don't end up exposing stale data.  In the case where we are truncating
    or punching a hole in the inode, there is no possibility that stale
    data could be exposed in the first place, so we don't need to put the
    inode on the t_inode_list!
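    
    A minimal sketch of the resulting rule, again as a user-space toy
    rather than the actual patch.  It assumes that "extended" means the
    newly mapped range ends past the size already recorded on disk;
    toy_inode, toy_file_inode() and toy_after_map() are hypothetical
    names standing in for the real inode, ext4_jbd2_file_inode(), and
    the delalloc mapping path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the inode fields that matter here. */
struct toy_inode {
    int64_t disksize;      /* size recorded on disk */
    bool on_ordered_list;  /* models membership of t_inode_list */
};

/* Hypothetical stand-in for ext4_jbd2_file_inode(): put the inode on
 * the transaction's ordered-data list. */
static void toy_file_inode(struct toy_inode *inode)
{
    inode->on_ordered_list = true;
}

/*
 * Called after delalloc blocks covering bytes up to new_end are mapped.
 * Files the inode only when the mapping extends it; returns true if the
 * inode was filed.
 */
static bool toy_after_map(struct toy_inode *inode, int64_t new_end)
{
    assert(inode != NULL);

    /* No extension: no stale data can become visible, so skip filing. */
    if (new_end <= inode->disksize)
        return false;

    /* Extension: record the new size and request data=ordered handling. */
    inode->disksize = new_end;
    toy_file_inode(inode);
    return true;
}
```

    With a 4096-byte inode, mapping blocks that end at byte 4096 leaves
    the inode off the list, while mapping blocks that end at byte 8192
    files it and updates the recorded size.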
    
    The right long-term fix is to get rid of data=ordered mode altogether,
    and only update the extent tree or indirect blocks after the data has
    been written.  Until then, this change will also avoid some
    unnecessary waiting in the commit operation.
    
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Cc: Allison Henderson <achender@linux.vnet.ibm.com>
    Cc: Jan Kara <jack@suse.cz>
inode.c 134 KB