• Christian Brauner's avatar
    exit: support non-blocking pidfds · ba7d25f3
    Christian Brauner authored
    Passing a non-blocking pidfd to waitid() currently has no effect, i.e.  is not
    supported. There are users which would like to use waitid() on pidfds that are
    O_NONBLOCK and mix it with pidfds that are blocking and both pass them to
    waitid().
    The expected behavior is to have waitid() return -EAGAIN for non-blocking
    pidfds and to block for blocking pidfds without needing to perform any
    additional checks for flags set on the pidfd before passing it to waitid().
    Non-blocking pidfds will return EAGAIN from waitid() when no child process is
    ready yet. Returning -EAGAIN for non-blocking pidfds makes it easier for event
    loops that handle EAGAIN specially.
    
    It also makes the API more consistent and uniform. In essence, waitid() is
    treated like a read on a non-blocking pidfd or a recvmsg() on a non-blocking
    socket.
    With the addition of support for non-blocking pidfds we support the same
    functionality that sockets do. For sockets() recvmsg() supports MSG_DONTWAIT
    for pidfds waitid() supports WNOHANG. Both flags are per-call options. In
    contrast non-blocking pidfds and non-blocking sockets are a setting on an open
    file description affecting all threads in the calling process as well as other
    processes that hold file descriptors referring to the same open file
    description. Both behaviors, per call and per open file description, have
    genuine use-cases.
    
    The implementation should be straightforward:
    - If a non-blocking pidfd is passed and WNOHANG is not raised we simply raise
      the WNOHANG flag internally. When do_wait() returns indicating that there are
      eligible child processes but none have exited yet we set EAGAIN. If no child
      process exists we continue returning ECHILD.
    - If a non-blocking pidfd is passed and WNOHANG is raised waitid() will
      continue returning 0, i.e. it will not set EAGAIN. This ensure backwards
      compatibility with applications passing WNOHANG explicitly with pidfds.
    
    A concrete use-case that was brought on-list was Josh's async pidfd library.
    Ever since the introduction of pidfds and more advanced async io various
    programming languages such as Rust have grown support for async event
    libraries. These libraries are created to help build epoll-based event loops
    around file descriptors. A common pattern is to automatically make all file
    descriptors they manage to O_NONBLOCK.
    
    For such libraries the EAGAIN error code is treated specially. When a function
    is called that returns EAGAIN the function isn't called again until the event
    loop indicates the the file descriptor is ready.  Supporting EAGAIN when
    waiting on pidfds makes such libraries just work with little effort.
    Suggested-by: default avatarJosh Triplett <josh@joshtriplett.org>
    Signed-off-by: default avatarChristian Brauner <christian.brauner@ubuntu.com>
    Reviewed-by: default avatarJosh Triplett <josh@joshtriplett.org>
    Reviewed-by: default avatarOleg Nesterov <oleg@redhat.com>
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Sargun Dhillon <sargun@sargun.me>
    Cc: Jann Horn <jannh@google.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Ingo Molnar <mingo@kernel.org>
    Cc: Oleg Nesterov <oleg@redhat.com>
    Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
    Link: https://lore.kernel.org/lkml/20200811181236.GA18763@localhost/
    Link: https://github.com/joshtriplett/async-pidfd
    Link: https://lore.kernel.org/r/20200902102130.147672-3-christian.brauner@ubuntu.com
    ba7d25f3
exit.c 43.3 KB