    io-wq: fix silly logic error in io_task_work_match() · 3b33e3f4
    Jens Axboe authored
    We check for the func with an OR condition, which means the match always
    ends up being false and we never find the task_work we want to cancel.
    In the unexpected case that we do exit with that pending, we can trigger
    a hang waiting for a worker to exit that was never created. syzbot
    reports that as such:
    
    INFO: task syz-executor687:8514 blocked for more than 143 seconds.
          Not tainted 5.14.0-syzkaller #0
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    task:syz-executor687 state:D stack:27296 pid: 8514 ppid:  8479 flags:0x00024004
    Call Trace:
     context_switch kernel/sched/core.c:4940 [inline]
     __schedule+0x940/0x26f0 kernel/sched/core.c:6287
     schedule+0xd3/0x270 kernel/sched/core.c:6366
     schedule_timeout+0x1db/0x2a0 kernel/time/timer.c:1857
     do_wait_for_common kernel/sched/completion.c:85 [inline]
     __wait_for_common kernel/sched/completion.c:106 [inline]
     wait_for_common kernel/sched/completion.c:117 [inline]
     wait_for_completion+0x176/0x280 kernel/sched/completion.c:138
     io_wq_exit_workers fs/io-wq.c:1162 [inline]
     io_wq_put_and_exit+0x40c/0xc70 fs/io-wq.c:1197
     io_uring_clean_tctx fs/io_uring.c:9607 [inline]
     io_uring_cancel_generic+0x5fe/0x740 fs/io_uring.c:9687
     io_uring_files_cancel include/linux/io_uring.h:16 [inline]
     do_exit+0x265/0x2a30 kernel/exit.c:780
     do_group_exit+0x125/0x310 kernel/exit.c:922
     get_signal+0x47f/0x2160 kernel/signal.c:2868
     arch_do_signal_or_restart+0x2a9/0x1c40 arch/x86/kernel/signal.c:865
     handle_signal_work kernel/entry/common.c:148 [inline]
     exit_to_user_mode_loop kernel/entry/common.c:172 [inline]
     exit_to_user_mode_prepare+0x17d/0x290 kernel/entry/common.c:209
     __syscall_exit_to_user_mode_work kernel/entry/common.c:291 [inline]
     syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:302
     do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
     entry_SYSCALL_64_after_hwframe+0x44/0xae
    RIP: 0033:0x445cd9
    RSP: 002b:00007fc657f4b308 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
    RAX: 0000000000000001 RBX: 00000000004cb448 RCX: 0000000000445cd9
    RDX: 00000000000f4240 RSI: 0000000000000081 RDI: 00000000004cb44c
    RBP: 00000000004cb440 R08: 000000000000000e R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 000000000049b154
    R13: 0000000000000003 R14: 00007fc657f4b400 R15: 0000000000022000
    
    While in there, also decrement acct->nr_workers. This isn't strictly
    needed as we're exiting, but let's make sure the accounting matches up.
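
For illustration, the OR-vs-AND mistake described above can be sketched in
standalone C. This is a minimal sketch, not the kernel code: the struct,
the three callbacks, and the match_buggy()/match_fixed()/demo_match()
helpers are hypothetical stand-ins that only borrow names from the
description. A function pointer cannot equal two different functions at
once, so "!= A || != B" is always true and the buggy matcher rejects
every callback.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's struct callback_head. */
struct callback_head {
	void (*func)(struct callback_head *);
};

/* Distinct bodies so the three callbacks are clearly different functions. */
static int cb_hits, cont_hits, other_hits;
static void create_worker_cb(struct callback_head *cb)   { (void)cb; cb_hits++; }
static void create_worker_cont(struct callback_head *cb) { (void)cb; cont_hits++; }
static void unrelated_cb(struct callback_head *cb)       { (void)cb; other_hits++; }

/* Buggy form: at least one of the two inequalities always holds. */
static bool match_buggy(const struct callback_head *cb)
{
	if (cb->func != create_worker_cb || cb->func != create_worker_cont)
		return false;
	return true;
}

/* Fixed form: reject only when func is neither of the two callbacks. */
static bool match_fixed(const struct callback_head *cb)
{
	if (cb->func != create_worker_cb && cb->func != create_worker_cont)
		return false;
	return true;
}

/* Exercises both matchers; aborts via assert() if the sketch is wrong. */
void demo_match(void)
{
	struct callback_head a = { .func = create_worker_cb };
	struct callback_head b = { .func = create_worker_cont };
	struct callback_head c = { .func = unrelated_cb };

	assert(!match_buggy(&a) && !match_buggy(&b) && !match_buggy(&c));
	assert(match_fixed(&a) && match_fixed(&b) && !match_fixed(&c));
}
```

With the buggy form, a cancel loop built on such a matcher would never
find the pending creation work, which is consistent with the wait in the
trace above never completing.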
    
    Fixes: 3146cba9 ("io-wq: make worker creation resilient against signals")
    Reported-by: syzbot+f62d3e0a4ea4f38f5326@syzkaller.appspotmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
io-wq.c 30.9 KB