-
Linus Torvalds authored
Andrei Vagin reported that commit 0ddad21d ("pipe: use exclusive waits when reading or writing") broke one of the CRIU tests. He even has a trivial reproducer: #include <unistd.h> #include <sys/types.h> #include <sys/wait.h> int main() { int p[2]; pid_t p1, p2; int status; if (pipe(p) == -1) return 1; p1 = fork(); if (p1 == 0) { close(p[1]); read(p[0], &status, sizeof(status)); return 0; } p2 = fork(); if (p2 == 0) { close(p[1]); read(p[0], &status, sizeof(status)); return 0; } sleep(1); close(p[1]); wait(&status); wait(&status); return 0; } and the problem - once he points it out - is obvious. We use these nice exclusive waits, but when the last writer goes away, it then needs to wake up _every_ reader (and conversely, the last reader disappearing needs to wake every writer, of course). In fact, when going through this, we had several small oddities around how to wake things. We did in fact wake every reader when we changed the size of the pipe buffers. But that's entirely pointless, since that just acts as a possible source of new space - no new data to read. And when we change the size of the buffer, we don't need to wake all writers even when we add space - that case acts just as if somebody made space by reading, and any writer that finds itself not filling it up entirely will wake the next one. On the other hand, on the exit path, we tried to limit the wakeups with the proper poll keys etc, which is entirely pointless, because at that point we obviously need to wake up everybody. So don't do that: just wake up everybody - but only do that if the counts changed to zero. So fix those non-IO wakeups to be more proper: space change doesn't add any new data, but it might make room for writers, so it wakes up a writer. And the actual changes to reader/writer counts should wake up everybody, since everybody is affected (ie readers will all see EOF if the writers have gone away, and writers will all get EPIPE if all readers have gone away). Fixes: 0ddad21d ("pipe: use exclusive waits when reading or writing") Reported-and-tested-by: Andrei Vagin <avagin@gmail.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6551d5c5