Commit f1ee6162 authored by NeilBrown, committed by Al Viro

VFS: don't keep disconnected dentries on d_anon

The original purpose of the per-superblock s_anon list was to
keep disconnected dentries in the cache between consecutive
requests to the NFS server.  Dentries can be disconnected if
a client holds a file open and repeatedly performs IO on it,
and if the server drops the dentry, whether due to memory
pressure, server restart, or "echo 3 > /proc/sys/vm/drop_caches".

This purpose was thwarted by commit 75a6f82a ("freeing unlinked
file indefinitely delayed") which caused disconnected dentries
to be freed as soon as their refcount reached zero.

This means that, when a dentry being used by nfsd gets disconnected, a
new one needs to be allocated for every request (unless requests
overlap).  As the dentry has no name, no parent, and no children,
there is little of value to cache.  As small memory allocations are
typically fast (from per-cpu free lists) this likely has little cost.

This means that the original purpose of s_anon is no longer relevant:
there is no longer any need to keep disconnected dentries on a list so
they appear to be hashed.

However, s_anon now has a new use.  When you mount an NFS filesystem,
the dentry stored in s_root is just a placebo.  The "real" root dentry
is allocated using d_obtain_root() and so it is kept on the s_anon list.
I don't know the reason for this, but suspect it is related to NFSv4
where a mount of "server:/some/path" requires NFS to look up the root
filehandle on the server, then walk down "/some" and "/path" to get
the filehandle to mount.

Whatever the reason, NFS depends on the s_anon list and on
shrink_dcache_for_umount() pruning all dentries on this list.  So we
cannot simply remove s_anon.

We could just leave the code unchanged, but apart from that being
potentially confusing, the (unfair) bit-spin-lock which protects
s_anon can become a bottleneck when lots of disconnected dentries are
being created.
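
For readers unfamiliar with hlist_bl: the list head keeps its lock in
bit 0 of the first-node pointer, and waiters simply spin on that bit
with no queueing, which is why the lock is unfair.  The following is a
minimal userspace sketch of that idea using C11 atomics.  All toy_*
names are hypothetical; this is an illustration under those
assumptions, not the kernel implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy types; they only mimic the shape of hlist_bl. */
struct toy_node {
	struct toy_node *next;
};

struct toy_bl_head {
	_Atomic uintptr_t first;	/* bit 0 = lock bit, other bits = pointer */
};

#define TOY_LOCK_BIT ((uintptr_t)1)

/* Test-and-set spin on bit 0: whoever flips it from 0 to 1 owns the lock. */
static void toy_bl_lock(struct toy_bl_head *h)
{
	while (atomic_fetch_or_explicit(&h->first, TOY_LOCK_BIT,
					memory_order_acquire) & TOY_LOCK_BIT)
		;	/* busy-wait; no ordering among waiters */
}

static void toy_bl_unlock(struct toy_bl_head *h)
{
	atomic_fetch_and_explicit(&h->first, ~TOY_LOCK_BIT,
				  memory_order_release);
}

/* Return the first node with the lock bit masked off. */
static struct toy_node *toy_bl_first(struct toy_bl_head *h)
{
	return (struct toy_node *)(atomic_load(&h->first) & ~TOY_LOCK_BIT);
}

/* Push a node at the head; the caller must already hold the lock. */
static void toy_bl_add_head(struct toy_node *n, struct toy_bl_head *h)
{
	uintptr_t cur = atomic_load_explicit(&h->first, memory_order_relaxed);

	assert(cur & TOY_LOCK_BIT);			/* lock is held */
	assert(((uintptr_t)n & TOY_LOCK_BIT) == 0);	/* node is aligned */
	n->next = (struct toy_node *)(cur & ~TOY_LOCK_BIT);
	atomic_store_explicit(&h->first, (uintptr_t)n | TOY_LOCK_BIT,
			      memory_order_relaxed);
}
```

Every insertion on the list has to win the spin on that single bit, so
a workload that creates many disconnected dentries serialises on one
word per superblock.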

So this patch renames s_anon to s_roots, and stops storing
disconnected dentries on the list.  Only dentries obtained with
d_obtain_root() are now stored on this list.  There are many fewer of
these (only NFS and NILFS2 use the call, and only during filesystem
mount) so contention on the bit-lock will not be a problem.
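
The resulting behaviour can be modelled in a few lines of userspace C.
This is only a sketch: the toy_* types and functions are hypothetical
stand-ins, locking and refcounting are left out, and the only point is
that a disconnected dentry never touches the per-superblock list while
a secondary root does and is pruned at umount.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy stand-ins for struct dentry and struct super_block. */
struct toy_dentry {
	struct toy_dentry *next_root;	/* link on the roots list */
	bool on_roots;			/* true only for secondary roots */
};

struct toy_sb {
	struct toy_dentry *roots;	/* models sb->s_roots */
	size_t nr_roots;
};

/*
 * Models the patched __d_obtain_alias(): only the !disconnected case
 * (the d_obtain_root() path) links the new dentry onto the roots list.
 */
static struct toy_dentry *toy_obtain(struct toy_sb *sb, bool disconnected)
{
	struct toy_dentry *d = calloc(1, sizeof(*d));

	assert(sb != NULL);
	if (!d)
		return NULL;
	if (!disconnected) {
		d->next_root = sb->roots;
		sb->roots = d;
		d->on_roots = true;
		sb->nr_roots++;
	}
	return d;
}

/* Models the last reference to a disconnected dentry going away. */
static void toy_put_disconnected(struct toy_dentry *d)
{
	assert(d != NULL && !d->on_roots);
	free(d);	/* freed at once; there is no list to unlink from */
}

/* Models the s_roots loop in shrink_dcache_for_umount(). */
static size_t toy_shrink_for_umount(struct toy_sb *sb)
{
	size_t pruned = 0;

	while (sb->roots) {
		struct toy_dentry *d = sb->roots;

		sb->roots = d->next_root;
		sb->nr_roots--;
		free(d);
		pruned++;
	}
	return pruned;
}
```

In this model, obtaining one root and one disconnected dentry leaves
exactly one entry on the list, and umount only has that one entry to
prune.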

Possibly an alternate solution should be found for NFS and NILFS2, but
that would require understanding their needs first.

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
parent 00b0c9b8
@@ -56,13 +56,25 @@ a/ A dentry flag DCACHE_DISCONNECTED which is set on
    any dentry that might not be part of the proper prefix.
    This is set when anonymous dentries are created, and cleared when a
    dentry is noticed to be a child of a dentry which is in the proper
-   prefix.
-
-b/ A per-superblock list "s_anon" of dentries which are the roots of
-   subtrees that are not in the proper prefix. These dentries, as
-   well as the proper prefix, need to be released at unmount time. As
-   these dentries will not be hashed, they are linked together on the
-   d_hash list_head.
+   prefix. If the refcount on a dentry with this flag set
+   becomes zero, the dentry is immediately discarded, rather than being
+   kept in the dcache. If a dentry that is not already in the dcache
+   is repeatedly accessed by filehandle (as NFSD might do), an new dentry
+   will be a allocated for each access, and discarded at the end of
+   the access.
+
+   Note that such a dentry can acquire children, name, ancestors, etc.
+   without losing DCACHE_DISCONNECTED - that flag is only cleared when
+   subtree is successfully reconnected to root. Until then dentries
+   in such subtree are retained only as long as there are references;
+   refcount reaching zero means immediate eviction, same as for unhashed
+   dentries. That guarantees that we won't need to hunt them down upon
+   umount.
+
+b/ A primitive for creation of secondary roots - d_obtain_root(inode).
+   Those do _not_ bear DCACHE_DISCONNECTED. They are placed on the
+   per-superblock list (->s_roots), so they can be located at umount
+   time for eviction purposes.

 c/ Helper routines to allocate anonymous dentries, and to help attach
    loose directory dentries at lookup time. They are:
@@ -78,7 +90,6 @@ c/ Helper routines to allocate anonymous dentries, and to help attach
    It returns NULL when the passed-in dentry is used, following the calling
    convention of ->lookup.

-
 Filesystem Issues
 -----------------
......
@@ -1296,15 +1296,7 @@ static inline void d_lustre_invalidate(struct dentry *dentry, int nested)
 	spin_lock_nested(&dentry->d_lock,
 			 nested ? DENTRY_D_LOCK_NESTED : DENTRY_D_LOCK_NORMAL);
 	ll_d2d(dentry)->lld_invalid = 1;
-	/*
-	 * We should be careful about dentries created by d_obtain_alias().
-	 * These dentries are not put in the dentry tree, instead they are
-	 * linked to sb->s_anon through dentry->d_hash.
-	 * shrink_dcache_for_umount() shrinks the tree and sb->s_anon list.
-	 * If we unhashed such a dentry, unmount would not be able to find
-	 * it and busy inodes would be reported.
-	 */
-	if (d_count(dentry) == 0 && !(dentry->d_flags & DCACHE_DISCONNECTED))
+	if (d_count(dentry) == 0)
 		__d_drop(dentry);
 	spin_unlock(&dentry->d_lock);
 }
......
@@ -48,8 +48,8 @@
  *   - i_dentry, d_u.d_alias, d_inode of aliases
  * dcache_hash_bucket lock protects:
  *   - the dcache hash table
- * s_anon bl list spinlock protects:
- *   - the s_anon list (see __d_drop)
+ * s_roots bl list spinlock protects:
+ *   - the s_roots list (see __d_drop)
  * dentry->d_sb->s_dentry_lru_lock protects:
  *   - the dcache lru lists and counters
  * d_lock protects:
@@ -67,7 +67,7 @@
  * dentry->d_lock
  *   dentry->d_sb->s_dentry_lru_lock
  *   dcache_hash_bucket lock
- *   s_anon lock
+ *   s_roots lock
  *
  * If there is an ancestor relationship:
  * dentry->d_parent->...->d_parent->d_lock
@@ -476,10 +476,10 @@ void __d_drop(struct dentry *dentry)
 		/*
 		 * Hashed dentries are normally on the dentry hashtable,
 		 * with the exception of those newly allocated by
-		 * d_obtain_alias, which are always IS_ROOT:
+		 * d_obtain_root, which are always IS_ROOT:
 		 */
 		if (unlikely(IS_ROOT(dentry)))
-			b = &dentry->d_sb->s_anon;
+			b = &dentry->d_sb->s_roots;
 		else
 			b = d_hash(dentry->d_name.hash);
@@ -1499,8 +1499,8 @@ void shrink_dcache_for_umount(struct super_block *sb)
 	sb->s_root = NULL;
 	do_one_tree(dentry);

-	while (!hlist_bl_empty(&sb->s_anon)) {
-		dentry = dget(hlist_bl_entry(hlist_bl_first(&sb->s_anon), struct dentry, d_hash));
+	while (!hlist_bl_empty(&sb->s_roots)) {
+		dentry = dget(hlist_bl_entry(hlist_bl_first(&sb->s_roots), struct dentry, d_hash));
 		do_one_tree(dentry);
 	}
 }
@@ -1964,9 +1964,11 @@ static struct dentry *__d_obtain_alias(struct inode *inode, int disconnected)
 	spin_lock(&tmp->d_lock);
 	__d_set_inode_and_type(tmp, inode, add_flags);
 	hlist_add_head(&tmp->d_u.d_alias, &inode->i_dentry);
-	hlist_bl_lock(&tmp->d_sb->s_anon);
-	hlist_bl_add_head(&tmp->d_hash, &tmp->d_sb->s_anon);
-	hlist_bl_unlock(&tmp->d_sb->s_anon);
+	if (!disconnected) {
+		hlist_bl_lock(&tmp->d_sb->s_roots);
+		hlist_bl_add_head(&tmp->d_hash, &tmp->d_sb->s_roots);
+		hlist_bl_unlock(&tmp->d_sb->s_roots);
+	}
 	spin_unlock(&tmp->d_lock);
 	spin_unlock(&inode->i_lock);
......
@@ -207,7 +207,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
 	if (s->s_user_ns != &init_user_ns)
 		s->s_iflags |= SB_I_NODEV;
 	INIT_HLIST_NODE(&s->s_instances);
-	INIT_HLIST_BL_HEAD(&s->s_anon);
+	INIT_HLIST_BL_HEAD(&s->s_roots);
 	mutex_init(&s->s_sync_lock);
 	INIT_LIST_HEAD(&s->s_inodes);
 	spin_lock_init(&s->s_inode_list_lock);
......
@@ -1359,7 +1359,7 @@ struct super_block {
 	const struct fscrypt_operations *s_cop;
-	struct hlist_bl_head s_anon; /* anonymous dentries for (nfs) exporting */
+	struct hlist_bl_head s_roots; /* alternate root dentries for NFS */
 	struct list_head s_mounts; /* list of mounts; _not_ for fs use */
 	struct block_device *s_bdev;
 	struct backing_dev_info *s_bdi;
......