• Linus Torvalds's avatar
    pipe: make sure to wake up everybody when the last reader/writer closes · 6551d5c5
    Linus Torvalds authored
    Andrei Vagin reported that commit 0ddad21d ("pipe: use exclusive
    waits when reading or writing") broke one of the CRIU tests.  He even
    has a trivial reproducer:
    
        #include <unistd.h>
        #include <sys/types.h>
        #include <sys/wait.h>
    
        int main()
        {
                int p[2];
                pid_t p1, p2;
                int status;
    
                if (pipe(p) == -1)
                        return 1;
    
                p1 = fork();
                if (p1 == 0) {
                        close(p[1]);
                        read(p[0], &status, sizeof(status));
                        return 0;
                }
                p2 = fork();
                if (p2 == 0) {
                        close(p[1]);
                        read(p[0], &status, sizeof(status));
                        return 0;
                }
                sleep(1);
                close(p[1]);
                wait(&status);
                wait(&status);
    
                return 0;
        }
    
    and the problem - once he points it out - is obvious.  We use these nice
    exclusive waits, but when the last writer goes away, it then needs to
    wake up _every_ reader (and conversely, the last reader disappearing
    needs to wake every writer, of course).
    
    In fact, when going through this, we had several small oddities around
    how to wake things.  We did in fact wake every reader when we changed
    the size of the pipe buffers.  But that's entirely pointless, since that
    just acts as a possible source of new space - no new data to read.
    
    And when we change the size of the buffer, we don't need to wake all
    writers even when we add space - that case acts just as if somebody made
    space by reading, and any writer that finds itself not filling it up
    entirely will wake the next one.
    
    On the other hand, on the exit path, we tried to limit the wakeups with
    the proper poll keys etc, which is entirely pointless, because at that
    point we obviously need to wake up everybody.  So don't do that: just
    wake up everybody - but only do that if the counts changed to zero.
    
    So fix those non-IO wakeups to be more proper: space change doesn't add
    any new data, but it might make room for writers, so it wakes up a
    writer.  And the actual changes to reader/writer counts should wake up
    everybody, since everybody is affected (ie readers will all see EOF if
    the writers have gone away, and writers will all get EPIPE if all
    readers have gone away).
    
    Fixes: 0ddad21d ("pipe: use exclusive waits when reading or writing")
    Reported-and-tested-by: default avatarAndrei Vagin <avagin@gmail.com>
    Cc: Josh Triplett <josh@joshtriplett.org>
    Cc: Matthew Wilcox <willy@infradead.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    6551d5c5
pipe.c 32.6 KB