• Andi Kleen's avatar
    Cache xattr security drop check for write v2 · 69b45732
    Andi Kleen authored
    Some recent benchmarking on btrfs showed that a major scaling bottleneck
    on large systems on btrfs is currently the xattr lookup on every write.
    
    Why xattr lookup on every write I hear you ask?
    
    write wants to drop suid and security related xattrs that could set o
    capabilities for executables.  To do that it currently looks up
    security.capability on EVERY write (even for non executables) to decide
    whether to drop it or not.
    
    In btrfs this causes an additional tree walk, hitting some per file system
    locks and quite bad scalability. In a simple read workload on a 8S
    system I saw over 90% CPU time in spinlocks related to that.
    
    Chris Mason tells me this is also a problem in ext4, where it hits
    the global mbcache lock.
    
    This patch adds a simple per inode to avoid this problem.  We only
    do the lookup once per file and then if there is no xattr cache
    the decision. All xattr changes clear the flag.
    
    I also used the same flag to avoid the suid check, although
    that one is pretty cheap.
    
    A file system can also set this flag when it creates the inode,
    if it has a cheap way to do so.  This is done for some common file systems
    in followon patches.
    
    With this patch a major part of the lock contention disappears
    for btrfs. Some testing on smaller systems didn't show significant
    performance changes, but at least it helps the larger systems
    and is generally more efficient.
    
    v2: Rename is_sgid. add file system helper.
    Cc: chris.mason@oracle.com
    Cc: josef@redhat.com
    Cc: viro@zeniv.linux.org.uk
    Cc: agruen@linbit.com
    Cc: Serge E. Hallyn <serue@us.ibm.com>
    Signed-off-by: default avatarAndi Kleen <ak@linux.intel.com>
    Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
    69b45732
filemap.c 69.2 KB