    proc: report open files as size in stat() for /proc/pid/fd · f1f1f256
    Ivan Babrou authored
    Many monitoring tools include open file count as a metric.  Currently the
    only way to get this number is to enumerate the files in /proc/pid/fd.
    
    The problem with the current approach is that it does many things people
    generally don't care about when they need one number for a metric.  In our
    tests for cadvisor, which reports open file counts per cgroup, we observed
    that reading the number of open files is slow.  Out of 35.23% of CPU time
    spent in `proc_readfd_common`, we see 29.43% spent in `proc_fill_cache`,
    which is responsible for filling dentry info.  Some of this extra time is
    spinlock contention, but it is contention for a lock we don't want to
    take to begin with.
    
    We considered putting the number of open files in /proc/pid/status.
    Unfortunately, counting the number of fds involves iterating the
    open_fds bitmap, which takes time linear in the number of open files
    (bitmap slots really, but it's close).  We don't want to make
    /proc/pid/status any slower, so instead we report this number in the
    size member of the stat syscall result for /proc/pid/fd.  Previously the
    reported size was always zero, so there's very little risk of breaking
    anything, and callers get a reasonably logical way to count open files:
    use the size if it's nonzero, and fall back to enumerating the directory
    if it's zero.
    
    The RFC for this patch iterated open fds under RCU.  Thanks to Frank
    Hofmann for the suggestion to use the bitmap instead.
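    
    As an illustration of why counting via the bitmap is cheap, here is a
    minimal userspace sketch of a population count over a word-array bitmap.
    It is not the kernel's bitmap_weight() implementation; the name
    bitmap_count() and the BITS_PER_WORD macro are hypothetical, and
    __builtin_popcountl() assumes GCC or Clang.
    
    ```c
    #include <limits.h>
    #include <stddef.h>
    
    /* Hypothetical helper macro: number of bits in one bitmap word. */
    #define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
    
    /*
     * Count the set bits among the first nbits bits of a bitmap stored as
     * an array of unsigned long.  One popcount per word, so the cost is
     * proportional to the number of bitmap slots, not to any per-file work.
     */
    static size_t bitmap_count(const unsigned long *map, size_t nbits)
    {
        size_t full = nbits / BITS_PER_WORD;   /* whole words to scan */
        size_t rest = nbits % BITS_PER_WORD;   /* bits used in the last word */
        size_t count = 0;
    
        for (size_t i = 0; i < full; i++)
            count += (size_t)__builtin_popcountl(map[i]);
    
        if (rest) {
            /* Mask off bits beyond nbits in the partial trailing word. */
            unsigned long mask = (1UL << rest) - 1;
            count += (size_t)__builtin_popcountl(map[full] & mask);
        }
        return count;
    }
    ```
    
    No dentries are instantiated and no per-file locks are taken here, which
    is the property the patch relies on.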
    
    Previously:
    
    ```
    $ sudo stat /proc/1/fd | head -n2
      File: /proc/1/fd
      Size: 0         	Blocks: 0          IO Block: 1024   directory
    ```
    
    With this patch:
    
    ```
    $ sudo stat /proc/1/fd | head -n2
      File: /proc/1/fd
      Size: 65        	Blocks: 0          IO Block: 1024   directory
    ```
    
    Correctness check:
    
    ```
    $ sudo ls /proc/1/fd | wc -l
    65
    ```
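    
    A monitoring tool could use the new size with a fallback for kernels
    without this patch.  The following is a sketch only: the function names
    count_open_fds() and count_fds_by_readdir() are hypothetical, and a size
    of zero is treated as "not supported, enumerate instead".
    
    ```c
    #include <dirent.h>
    #include <sys/stat.h>
    #include <sys/types.h>
    
    /*
     * Slow path: count entries in an fd directory such as /proc/1/fd.
     * Returns -1 if the directory cannot be opened.  Note that when this
     * is pointed at the caller's own /proc/self/fd, the descriptor opened
     * by opendir() is itself included in the count.
     */
    static long count_fds_by_readdir(const char *fd_dir)
    {
        DIR *dir = opendir(fd_dir);
        struct dirent *ent;
        long n = 0;
    
        if (!dir)
            return -1;
        while ((ent = readdir(dir)) != NULL) {
            if (ent->d_name[0] == '.')
                continue;               /* skip "." and ".." */
            n++;
        }
        closedir(dir);
        return n;
    }
    
    /*
     * Fast path: with this patch, st_size of /proc/pid/fd is the number of
     * open files.  A size of zero means either an unpatched kernel or no
     * open fds, so fall back to enumeration in that case.
     */
    static long count_open_fds(const char *fd_dir)
    {
        struct stat st;
    
        if (stat(fd_dir, &st) != 0)
            return -1;
        if (st.st_size > 0)
            return (long)st.st_size;
        return count_fds_by_readdir(fd_dir);
    }
    ```
    
    Calling count_open_fds("/proc/1/fd") with sufficient privileges should
    then agree with the `ls | wc -l` check above.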
    
    I added the docs for /proc/<pid>/fd while I was at it.
    
    [ivan@cloudflare.com: use bitmap_weight() to count the bits]
      Link: https://lkml.kernel.org/r/20221018045844.37697-1-ivan@cloudflare.com
    [akpm@linux-foundation.org: include linux/bitmap.h for bitmap_weight()]
    [ivan@cloudflare.com: return errno from proc_fd_getattr() instead of setting negative size]
      Link: https://lkml.kernel.org/r/20221024173140.30673-1-ivan@cloudflare.com
    Link: https://lkml.kernel.org/r/20220922224027.59266-1-ivan@cloudflare.com
    Signed-off-by: Ivan Babrou <ivan@cloudflare.com>
    Cc: Alexey Dobriyan <adobriyan@gmail.com>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: David Laight <David.Laight@ACULAB.COM>
    Cc: Ivan Babrou <ivan@cloudflare.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: Kalesh Singh <kaleshsingh@google.com>
    Cc: Mike Rapoport <rppt@kernel.org>
    Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
    Cc: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>