• Andrew Morton's avatar
    [PATCH] jbd: descriptor buffer state fix · ace476bb
    Andrew Morton authored
    Fix a problem discovered by Jeff Mahoney <jeffm@suse.com>, based on an initial
    patch from Chris Mason <mason@suse.com>.
    
    journal_get_descriptor_buffer() is used to obtain a regular old buffer_head
    against the blockdev mapping.  The caller will populate that bh by hand and
    will then submit it for writing.
    
    But there are problems:
    
    a) The function sets bh->b_state nonatomically.  But this buffer is
       accessible to other CPUs via pagecache lookup.
    
    b) The function sets the buffer dirty and then the caller populates it and
       then it is submitted for I/O.  Wrong order: there's a window in which the
       VM could write the buffer before it is fully populated.
    
    c) The function fails to set the buffer uptodate after zeroing it.  And one
       caller forgot to mark it uptodate as well.  So if the VM happens to decide
       to write the containing page back __block_write_full_page() encounters a
       dirty, not uptodate buffer, which is an illegal state.  This was generating
       buffer_error() warnings before we removed buffer_error().
    
       Leaving the buffer not uptodate also means that a concurrent reader of
       /dev/hda1 could cause physical I/O against the buffer, scribbling on what
       we just put in it.
    
       So journal_get_descriptor_buffer() is changed to mark the buffer
       uptodate, under the buffer lock.
    
    I considered changing journal_get_descriptor_buffer() to return a locked
    buffer but there doesn't seem to be a need for this, and both callers end up
    using ll_rw_block() anyway, which requires that the buffer be unlocked again.
    
    Note that the journal_get_descriptor_buffer() callers dirty these buffers with
    set_buffer_dirty().  That's a bit naughty, because it could create dirty
    buffers against a clean page - an illegal state.  They really should use
    mark_buffer_dirty() to dirty the page and inode as well.  But all callers will
    immediately write and clean the buffer anyway, so we can safely leave this
    optimising cheat in place.
    Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
    ace476bb
commit.c 23.5 KB