• Yafang Shao's avatar
    vfs: Delete the associated dentry when deleting a file · 681ce862
    Yafang Shao authored
    Our applications, built on Elasticsearch[0], frequently create and
    delete files.  These applications operate within containers, some with a
    memory limit exceeding 100GB.  Over prolonged periods, the accumulation
    of negative dentries within these containers can amount to tens of
    gigabytes.
    
    Upon container exit, directories are deleted.  However, due to the
    numerous associated dentries, this process can be time-consuming.  Our
    users have expressed frustration with this prolonged exit duration,
    which constitutes our first issue.
    
    Simultaneously, other processes may attempt to access the parent
    directory of the Elasticsearch directories.  Since the task responsible
    for deleting the dentries holds the inode lock, processes attempting
    directory lookup experience significant delays.  This issue, our second
    problem, is easily demonstrated:
    
      - Task 1 generates negative dentries:
      $ pwd
      ~/test
      $ mkdir es && cd es/ && ./create_and_delete_files.sh
    
      [ After generating tens of GB dentries ]
    
      $ cd ~/test && rm -rf es
    
      [ It will take a long duration to finish ]
    
      - Task 2 attempts to lookup the 'test/' directory
      $ pwd
      ~/test
      $ ls
    
      The 'ls' command in Task 2 experiences prolonged execution as Task 1
      is deleting the dentries.
    
    We've devised a solution to address both issues by deleting associated
    dentry when removing a file.  Interestingly, we've noted that a similar
    patch was proposed years ago[1], although it was rejected citing the
    absence of tangible issues caused by negative dentries.  Given our
    current challenges, we're resubmitting the proposal.  All relevant
    stakeholders from previous discussions have been included for reference.
    
    Some alternative solutions are also under discussion[2][3], such as
    shrinking child dentries outside of the parent inode lock or even
    asynchronously shrinking child dentries.  However, given the
    straightforward nature of the current solution, I believe this approach
    is still necessary.
    
    [ NOTE! This is a pretty fundamental change in how we deal with
      unlinking dentries, and it doesn't change the fact that you can have
      lots of negative dentries from just doing negative lookups.
    
      But the kernel test robot is at least initially happy with this from a
      performance angle, so I'm applying this ASAP just to get more testing
      and as a "known fix for an issue people hit in real life".
    
      Put another way: we should still look at the alternatives, and this
      patch may get reverted if somebody finds a performance regression on
      some other load.       - Linus ]
    Suggested-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: default avatarYafang Shao <laoar.shao@gmail.com>
    Link: https://github.com/elastic/elasticsearch [0]
    Link: https://patchwork.kernel.org/project/linux-fsdevel/patch/1502099673-31620-1-git-send-email-wangkai86@huawei.com [1]
    Link: https://lore.kernel.org/linux-fsdevel/20240511200240.6354-2-torvalds@linux-foundation.org/ [2]
    Link: https://lore.kernel.org/linux-fsdevel/CAHk-=wjEMf8Du4UFzxuToGDnF3yLaMcrYeyNAaH1NJWa6fwcNQ@mail.gmail.com/ [3]
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: Christian Brauner <brauner@kernel.org>
    Cc: Jan Kara <jack@suse.cz>
    Cc: Waiman Long <longman@redhat.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Wangkai <wangkai86@huawei.com>
    Cc: Colin Walters <walters@verbum.org>
    Tested-by: default avatarkernel test robot <oliver.sang@intel.com>
    Link: https://lore.kernel.org/all/202405221518.ecea2810-oliver.sang@intel.com/Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    681ce862
dcache.c 83.5 KB