    [PATCH] Don't remove inode from hash until filesystem has · d6686d54
    Andrew Morton authored
    From: Neil Brown <neilb@cse.unsw.edu.au>
    
    When an NFS request arrives, it contains a filehandle which needs to be
    converted to a dentry.  Many filesystems use find_exported_dentry in
    fs/exportfs/expfs.c.  A key part of this, on filesystems where a 32-bit inode
    number uniquely locates a file, is export_iget, which calls iget(sb, inum).
    
    iget will either:
    
       1/ find the inode in the inode cache and return it
    
     or
    
       2/ create a new inode and call ->read_inode to load it from the
          storage device.
    
    export_iget then verifies the inode is really a good inode (->read_inode
    didn't detect any problems) and the right inode (based on the generation number
    from the file handle).
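
    The lookup path described above can be sketched as a toy user-space
    model.  This is an illustration only, not the kernel code: every toy_*
    name, the fixed-size arrays and the "generation 0 means free"
    convention are assumptions made for the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define NR_INODES 8

/* Toy on-device inode. Assumption: generation 0 marks a free slot. */
struct toy_disk_inode { unsigned generation; };

/* Toy in-memory inode. */
struct toy_inode {
    unsigned long ino;
    unsigned generation;
    int hashed;   /* present in the toy inode cache */
    int is_bad;   /* set when toy_read_inode() detects a problem */
};

static struct toy_disk_inode toy_disk[NR_INODES];
static struct toy_inode toy_cache[NR_INODES];

/* Stand-in for ->read_inode: fill the inode from the toy device. */
static void toy_read_inode(struct toy_inode *inode)
{
    assert(inode->ino < NR_INODES);
    inode->generation = toy_disk[inode->ino].generation;
    inode->is_bad = (inode->generation == 0);
}

/* Stand-in for iget(sb, inum). */
static struct toy_inode *toy_iget(unsigned long ino)
{
    struct toy_inode *inode;

    if (ino >= NR_INODES)
        return NULL;
    inode = &toy_cache[ino];
    if (inode->hashed)          /* 1/ found in the cache */
        return inode;
    inode->ino = ino;           /* 2/ new inode, loaded from the device */
    toy_read_inode(inode);
    inode->hashed = 1;
    return inode;
}

/* Stand-in for export_iget: accept only a good inode of the right generation. */
static struct toy_inode *toy_export_iget(unsigned long ino, unsigned generation)
{
    struct toy_inode *inode = toy_iget(ino);

    if (inode == NULL)
        return NULL;
    if (inode->is_bad) {
        inode->hashed = 0;      /* drop the bad inode from the toy cache */
        return NULL;
    }
    if (inode->generation != generation)
        return NULL;            /* stale file handle */
    return inode;
}
```

    In this model the generation check is the only defence against a
    reused inode number, so it is only as trustworthy as the data that
    toy_read_inode() loaded.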
    
    For this to work reliably, it is important that whenever an inode is *not* in
    the cache, the on-device version is up-to-date.  Otherwise, when read_inode
    loads the inode it will get bad data.
    
    For a file that has not been deleted, this condition always holds: a dirty
    inode is always flushed to disc before the inode is unhashed.
    
    However for a file that is being deleted this condition doesn't (didn't)
    hold.  When iput -> iput_final -> generic_drop_inode -> generic_delete_inode
    is called we would unhash the inode before calling into the filesystem through
    ->delete_inode.
    
    So there is a small window between when generic_delete_inode unhashes the
    inode, and when ->delete_inode writes something to disc, where a call to
    ->read_inode (for export_iget) might discover what it thinks is a valid
    inode, but is really one that is in the process of being destroyed.
    
    It is this window that I want to close by moving the unhashing to the end of
    generic_delete_inode.
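
    The effect of the reordering can be shown with a single-threaded toy
    model that probes the state at the point where a concurrent lookup
    could run.  Again this is a sketch under stated assumptions, not the
    patch itself: struct dying_inode, the probe_* and delete_* names and
    the two-flag state are all hypothetical, and what the kernel does on a
    cache hit for an inode that is being freed is not modelled.

```c
#include <assert.h>

/* Toy state of one inode that is being deleted. */
struct dying_inode {
    int hashed;      /* still visible in the toy inode cache */
    int disk_valid;  /* on-device copy still looks like a live inode */
};

/* What a lookup running at a given moment would conclude. */
enum probe_result { FOUND_IN_CACHE, LOADED_STALE_FROM_DISK, NOT_FOUND };

static enum probe_result probe_lookup(const struct dying_inode *inode)
{
    if (inode->hashed)
        return FOUND_IN_CACHE;
    /* Not cached: the lookup falls back to the on-device copy. */
    return inode->disk_valid ? LOADED_STALE_FROM_DISK : NOT_FOUND;
}

/* Stand-in for ->delete_inode: mark the on-device inode as gone. */
static void toy_fs_delete_inode(struct dying_inode *inode)
{
    inode->disk_valid = 0;
}

/* Old ordering: unhash first, then call into the filesystem. */
static enum probe_result delete_old_order(struct dying_inode *inode)
{
    enum probe_result seen;

    assert(inode->hashed);      /* only a cached inode is being dropped */
    inode->hashed = 0;
    seen = probe_lookup(inode); /* the window described above */
    toy_fs_delete_inode(inode);
    return seen;
}

/* New ordering: call into the filesystem first, unhash at the end. */
static enum probe_result delete_new_order(struct dying_inode *inode)
{
    enum probe_result seen;

    assert(inode->hashed);
    toy_fs_delete_inode(inode);
    seen = probe_lookup(inode); /* same point in time, inode still hashed */
    inode->hashed = 0;
    return seen;
}
```

    With the old ordering the probe reports LOADED_STALE_FROM_DISK; with
    the new ordering it reports FOUND_IN_CACHE, so the lookup never reads
    the half-deleted on-device inode.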
fs-writeback.c 16.2 KB