    ext3: avoid unnecessary spinlock in critical POSIX ACL path · 96159f25
    Linus Torvalds authored
    If a filesystem supports POSIX ACLs, the VFS layer expects the filesystem 
    to do POSIX ACL checks on any files not owned by the caller, and it does 
    this for every single pathname component that it looks up.
    
    That can obviously be pretty expensive if the filesystem isn't careful 
    about it, especially with locking. That's doubly sad, since the common 
    case tends to be that there are no ACLs associated with the files in 
    question.
    
    ext3 already caches the ACL data so that it doesn't have to look it up 
    over and over again, but it does so by taking the inode->i_lock spinlock 
    on every lookup, which is a noticeable overhead even if it's a private 
    lock, especially on CPUs where the serialization is expensive (e.g. Intel 
    Netburst aka 'P4').
    
    For the special case of not actually having any ACLs, all that locking is 
    unnecessary. Even if somebody else were to be changing the ACLs on 
    another CPU, we simply don't care - if we've seen a NULL ACL, we might as 
    well use it.
    
    So just load the ACL speculatively without any locking, and if it was 
    NULL, just use it. If it's non-NULL (either because we had a cached 
    entry, or because the cache hasn't been filled in at all), it means that 
    we'll need to get the lock and re-load it properly.
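
The speculative-load pattern described above can be sketched in userspace C. This is only an illustration under stated assumptions, not the ext3 change itself: `struct acl`, `struct inode_sketch`, `ACL_NOT_CACHED` and `get_cached_acl()` are hypothetical names, a pthread mutex stands in for the inode->i_lock spinlock, and a C11 relaxed atomic load stands in for the unlocked read.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted ACL object. */
struct acl {
    atomic_int refcount;
};

/* Hypothetical sentinel: non-NULL on purpose, meaning "cache not filled in". */
static struct acl acl_not_cached_marker;
#define ACL_NOT_CACHED (&acl_not_cached_marker)

/* Hypothetical stand-in for the inode fields involved. */
struct inode_sketch {
    pthread_mutex_t lock;          /* plays the role of inode->i_lock */
    _Atomic(struct acl *) cached;  /* NULL, ACL_NOT_CACHED, or a real ACL */
};

/*
 * Returns NULL (cached "no ACL"), ACL_NOT_CACHED (caller must go and
 * read the ACL from storage), or a real ACL with an extra reference.
 */
static struct acl *get_cached_acl(struct inode_sketch *inode)
{
    /* Speculative load without taking the lock. */
    struct acl *acl = atomic_load_explicit(&inode->cached,
                                           memory_order_relaxed);

    /* Common case: no ACL. A racing update does not matter here. */
    if (acl == NULL)
        return NULL;

    /* Non-NULL: take the lock and re-load it properly. */
    pthread_mutex_lock(&inode->lock);
    acl = atomic_load_explicit(&inode->cached, memory_order_relaxed);
    if (acl != NULL && acl != ACL_NOT_CACHED)
        atomic_fetch_add(&acl->refcount, 1);  /* take a reference */
    pthread_mutex_unlock(&inode->lock);

    return acl;
}
```

The sentinel being non-NULL is what makes the fast path safe in this sketch: an unfilled cache always falls through to the locked path, so only a genuinely cached "no ACL" result skips the lock.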
    
    This is noticeable even on Nehalem, which does locking quite well (much 
    better than P4). From lmbench:
    
    	Processor, Processes - times in microseconds - smaller is better
    	--------------------------------------------------------------------
    	Host                 OS  Mhz null null      open slct fork exec sh  
    	                             call  I/O stat clos TCP  proc proc proc
    	--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ----
     - before:
    	nehalem.l Linux 2.6.30- 3193 0.04 0.09 0.95 1.45 2.18 69.1 273. 1141
    	nehalem.l Linux 2.6.30- 3193 0.04 0.09 0.95 1.48 2.28 69.9 253. 1140
    	nehalem.l Linux 2.6.30- 3193 0.04 0.10 0.95 1.42 2.19 68.6 284. 1141
     - after:
    	nehalem.l Linux 2.6.30- 3193 0.04 0.09 0.92 1.44 2.12 68.3 282. 1094
    	nehalem.l Linux 2.6.30- 3193 0.04 0.09 0.92 1.39 2.20 67.0 308. 1123
    	nehalem.l Linux 2.6.30- 3193 0.04 0.09 0.92 1.39 2.36 67.4 293. 1148
    
    where you can see what appears to be a roughly 3% improvement in stat
    and open/close latencies from just the removal of the locking overhead. 
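
As a rough check of the 3% figure, using only the stat and open/close columns from the table above and averaging the three runs on each side:

```latex
\begin{align*}
\text{stat:}\quad
  & \frac{0.95 - 0.92}{0.95} \approx 0.032 \;(\approx 3.2\%) \\[4pt]
\text{open/close before:}\quad
  & \frac{1.45 + 1.48 + 1.42}{3} = 1.45 \\
\text{open/close after:}\quad
  & \frac{1.44 + 1.39 + 1.39}{3} \approx 1.407 \\
\text{open/close:}\quad
  & \frac{1.45 - 1.407}{1.45} \approx 0.030 \;(\approx 3.0\%)
\end{align*}
```

Three runs per side is a small sample, so this is a sanity check on the quoted number rather than a statistical result.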
    
    Of course, this only matters for files you don't own (the owner never 
    needs to do the ACL checks), but that's the common case for libraries, 
    header files, and executables, as well as for the base components of any 
    absolute pathname, even if you are the owner of the final file.
    
    [ At some point we probably want to move this ACL caching logic entirely
      into the VFS layer (and only call down to the filesystem when
      uncached), but in the meantime this improves ext3 a bit.
    
      A similar fix to btrfs makes a much bigger difference (15x improvement
      in lmbench) due to broken caching. ]
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Acked-by: Jan Kara <jack@suse.cz>
    Cc: Al Viro <viro@zeniv.linux.org.uk>