1. 13 Jan, 2021 2 commits
    • Pavel Begunkov's avatar
      io_uring: do sqo disable on install_fd error · 06585c49
      Pavel Begunkov authored
      WARNING: CPU: 0 PID: 8494 at fs/io_uring.c:8717
      	io_ring_ctx_wait_and_kill+0x4f2/0x600 fs/io_uring.c:8717
      Call Trace:
       io_uring_release+0x3e/0x50 fs/io_uring.c:8759
       __fput+0x283/0x920 fs/file_table.c:280
       task_work_run+0xdd/0x190 kernel/task_work.c:140
       tracehook_notify_resume include/linux/tracehook.h:189 [inline]
       exit_to_user_mode_loop kernel/entry/common.c:174 [inline]
       exit_to_user_mode_prepare+0x249/0x250 kernel/entry/common.c:201
       __syscall_exit_to_user_mode_work kernel/entry/common.c:291 [inline]
       syscall_exit_to_user_mode+0x19/0x50 kernel/entry/common.c:302
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      failed io_uring_install_fd() is a special case, we don't do
      io_ring_ctx_wait_and_kill() directly but defer it to fput, though still
      need to io_disable_sqo_submit() before.
      
      note: it doesn't fix any real problem, just a warning. That's because
      sqring won't be available to the userspace in this case and so SQPOLL
      won't submit anything.
      
      Reported-by: syzbot+9c9c35374c0ecac06516@syzkaller.appspotmail.com
      Fixes: d9d05217 ("io_uring: stop SQPOLL submit on creator's death")
      Signed-off-by: default avatarPavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      06585c49
    • Pavel Begunkov's avatar
      io_uring: fix null-deref in io_disable_sqo_submit · b4411616
      Pavel Begunkov authored
      general protection fault, probably for non-canonical address
      	0xdffffc0000000022: 0000 [#1] KASAN: null-ptr-deref
      	in range [0x0000000000000110-0x0000000000000117]
      RIP: 0010:io_ring_set_wakeup_flag fs/io_uring.c:6929 [inline]
      RIP: 0010:io_disable_sqo_submit+0xdb/0x130 fs/io_uring.c:8891
      Call Trace:
       io_uring_create fs/io_uring.c:9711 [inline]
       io_uring_setup+0x12b1/0x38e0 fs/io_uring.c:9739
       do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      io_disable_sqo_submit() might be called before user rings were
      allocated, don't do io_ring_set_wakeup_flag() in those cases.
      
      Reported-by: syzbot+ab412638aeb652ded540@syzkaller.appspotmail.com
      Fixes: d9d05217 ("io_uring: stop SQPOLL submit on creator's death")
      Signed-off-by: default avatarPavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      b4411616
  2. 11 Jan, 2021 2 commits
  3. 09 Jan, 2021 4 commits
    • Pavel Begunkov's avatar
      io_uring: stop SQPOLL submit on creator's death · d9d05217
      Pavel Begunkov authored
      When the creator of SQPOLL io_uring dies (i.e. sqo_task), we don't want
      its internals like ->files and ->mm to be poked by the SQPOLL task, it
      have never been nice and recently got racy. That can happen when the
      owner undergoes destruction and SQPOLL tasks tries to submit new
      requests in parallel, and so calls io_sq_thread_acquire*().
      
      That patch halts SQPOLL submissions when sqo_task dies by introducing
      sqo_dead flag. Once set, the SQPOLL task must not do any submission,
      which is synchronised by uring_lock as well as the new flag.
      
      The tricky part is to make sure that disabling always happens, that
      means either the ring is discovered by creator's do_exit() -> cancel,
      or if the final close() happens before it's done by the creator. The
      last is guaranteed by the fact that for SQPOLL the creator task and only
      it holds exactly one file note, so either it pins up to do_exit() or
      removed by the creator on the final put in flush. (see comments in
      uring_flush() around file->f_count == 2).
      
      One more place that can trigger io_sq_thread_acquire_*() is
      __io_req_task_submit(). Shoot off requests on sqo_dead there, even
      though actually we don't need to. That's because cancellation of
      sqo_task should wait for the request before going any further.
      
      note 1: io_disable_sqo_submit() does io_ring_set_wakeup_flag() so the
      caller would enter the ring to get an error, but it still doesn't
      guarantee that the flag won't be cleared.
      
      note 2: if final __userspace__ close happens not from the creator
      task, the file note will pin the ring until the task dies.
      
      Fixed: b1b6b5a3 ("kernel/io_uring: cancel io_uring before task works")
      Signed-off-by: default avatarPavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      d9d05217
    • Pavel Begunkov's avatar
      io_uring: add warn_once for io_uring_flush() · 6b5733eb
      Pavel Begunkov authored
      files_cancel() should cancel all relevant requests and drop file notes,
      so we should never have file notes after that, including on-exit fput
      and flush. Add a WARN_ONCE to be sure.
      Signed-off-by: default avatarPavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      6b5733eb
    • Pavel Begunkov's avatar
      io_uring: inline io_uring_attempt_task_drop() · 4f793dc4
      Pavel Begunkov authored
      A simple preparation change inlining io_uring_attempt_task_drop() into
      io_uring_flush().
      Signed-off-by: default avatarPavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      4f793dc4
    • Pavel Begunkov's avatar
      io_uring: io_rw_reissue lockdep annotations · 55e6ac1e
      Pavel Begunkov authored
      We expect io_rw_reissue() to take place only during submission with
      uring_lock held. Add a lockdep annotation to check that invariant.
      Signed-off-by: default avatarPavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      55e6ac1e
  4. 07 Jan, 2021 3 commits
    • Pavel Begunkov's avatar
      io_uring: synchronise ev_posted() with waitqueues · b1445e59
      Pavel Begunkov authored
      waitqueue_active() needs smp_mb() to be in sync with waitqueues
      modification, but we miss it in io_cqring_ev_posted*() apart from
      cq_wait() case.
      
      Take an smb_mb() out of wq_has_sleeper() making it waitqueue_active(),
      and place it a few lines before, so it can synchronise other
      waitqueue_active() as well.
      
      The patch doesn't add any additional overhead, so even if there are
      no problems currently, it's just safer to have it this way.
      Signed-off-by: default avatarPavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      b1445e59
    • Pavel Begunkov's avatar
      io_uring: dont kill fasync under completion_lock · 4aa84f2f
      Pavel Begunkov authored
            CPU0                    CPU1
             ----                    ----
        lock(&new->fa_lock);
                                     local_irq_disable();
                                     lock(&ctx->completion_lock);
                                     lock(&new->fa_lock);
        <Interrupt>
          lock(&ctx->completion_lock);
      
       *** DEADLOCK ***
      
      Move kill_fasync() out of io_commit_cqring() to io_cqring_ev_posted(),
      so it doesn't hold completion_lock while doing it. That saves from the
      reported deadlock, and it's just nice to shorten the locking time and
      untangle nested locks (compl_lock -> wq_head::lock).
      
      Reported-by: syzbot+91ca3f25bd7f795f019c@syzkaller.appspotmail.com
      Signed-off-by: default avatarPavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      4aa84f2f
    • Pavel Begunkov's avatar
      io_uring: trigger eventfd for IOPOLL · 80c18e4a
      Pavel Begunkov authored
      Make sure io_iopoll_complete() tries to wake up eventfd, which currently
      is skipped together with io_cqring_ev_posted() for non-SQPOLL IOPOLL.
      
      Add an iopoll version of io_cqring_ev_posted(), duplicates a bit of
      code, but they actually use different sets of wait queues may be for
      better.
      Signed-off-by: default avatarPavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      80c18e4a
  5. 06 Jan, 2021 1 commit
  6. 05 Jan, 2021 1 commit
  7. 04 Jan, 2021 4 commits
  8. 31 Dec, 2020 3 commits
  9. 29 Dec, 2020 1 commit
    • Jens Axboe's avatar
      io_uring: don't assume mm is constant across submits · 77788775
      Jens Axboe authored
      If we COW the identity, we assume that ->mm never changes. But this
      isn't true of multiple processes end up sharing the ring. Hence treat
      id->mm like like any other process compontent when it comes to the
      identity mapping. This is pretty trivial, just moving the existing grab
      into io_grab_identity(), and including a check for the match.
      
      Cc: stable@vger.kernel.org # 5.10
      Fixes: 1e6fa521 ("io_uring: COW io_identity on mismatch")
      Reported-by: Christian Brauner <christian.brauner@ubuntu.com>:
      Tested-by: Christian Brauner <christian.brauner@ubuntu.com>:
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      77788775
  10. 27 Dec, 2020 8 commits
  11. 26 Dec, 2020 5 commits
  12. 25 Dec, 2020 5 commits
    • Linus Torvalds's avatar
      drm/amd/display: avoid uninitialized variable warning · 61d79136
      Linus Torvalds authored
      clang (quite rightly) complains fairly loudly about the newly added
      mpc1_get_mpc_out_mux() function returning an uninitialized value if the
      'opp_id' checks don't pass.
      
      This may not happen in practice, but the code really shouldn't return
      garbage if the sanity checks don't pass.
      
      So just initialize 'val' to zero to avoid the issue.
      
      Fixes: 110b055b ("drm/amd/display: add getter routine to retrieve mpcc mux")
      Cc: Josip Pavic <Josip.Pavic@amd.com>
      Cc: Bindu Ramamurthy <bindu.r@amd.com>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      61d79136
    • Linus Torvalds's avatar
      Merge tag 'perf-tools-2020-12-24' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux · 5814bc2d
      Linus Torvalds authored
      Pull more perf tools updates from Arnaldo Carvalho de Melo:
      
       - Refactor 'perf stat' per CPU/socket/die/thread aggregation fixing use
         cases in ARM machines.
      
       - Fix memory leak when synthesizing SDT probes in 'perf probe'.
      
       - Update kernel header copies related to KVM, epol_pwait. msr-index and
         powerpc and s390 syscall tables.
      
      * tag 'perf-tools-2020-12-24' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (24 commits)
        perf probe: Fix memory leak when synthesizing SDT probes
        perf stat aggregation: Add separate thread member
        perf stat aggregation: Add separate core member
        perf stat aggregation: Add separate die member
        perf stat aggregation: Add separate socket member
        perf stat aggregation: Add separate node member
        perf stat aggregation: Start using cpu_aggr_id in map
        perf cpumap: Drop in cpu_aggr_map struct
        perf cpumap: Add new map type for aggregation
        perf stat: Replace aggregation ID with a struct
        perf cpumap: Add new struct for cpu aggregation
        perf cpumap: Use existing allocator to avoid using malloc
        perf tests: Improve topology test to check all aggregation types
        perf tools: Update s390's syscall.tbl copy from the kernel sources
        perf tools: Update powerpc's syscall.tbl copy from the kernel sources
        perf s390: Move syscall.tbl check into check-headers.sh
        perf powerpc: Move syscall.tbl check to check-headers.sh
        tools headers UAPI: Synch KVM's svm.h header with the kernel
        tools kvm headers: Update KVM headers from the kernel sources
        tools headers UAPI: Sync KVM's vmx.h header with the kernel sources
        ...
      5814bc2d
    • Linus Torvalds's avatar
      Merge branch 'for-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux · 42dc45e8
      Linus Torvalds authored
      Pull coccinelle updates from Julia Lawall.
      
      * 'for-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux:
        scripts: coccicheck: Correct usage of make coccicheck
        coccinelle: update expiring email addresses
        coccinnelle: Remove ptr_ret script
        kbuild: do not use scripts/ld-version.sh for checking spatch version
        remove boolinit.cocci
      42dc45e8
    • Michael Ellerman's avatar
      genirq: Fix export of irq_to_desc() for powerpc KVM · 11cc92eb
      Michael Ellerman authored
      Commit 64a1b95b ("genirq: Restrict export of irq_to_desc()") removed
      the export of irq_to_desc() unless powerpc KVM is being built, because
      there is still a use of irq_to_desc() in modular code there.
      
      However it used:
      
        #ifdef CONFIG_KVM_BOOK3S_64_HV
      
      Which doesn't work when that symbol is =m, leading to a build failure:
      
        ERROR: modpost: "irq_to_desc" [arch/powerpc/kvm/kvm-hv.ko] undefined!
      
      Fix it by checking for the definedness of the correct symbol which is
      CONFIG_KVM_BOOK3S_64_HV_MODULE.
      
      Fixes: 64a1b95b ("genirq: Restrict export of irq_to_desc()")
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      11cc92eb
    • Linus Torvalds's avatar
      Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 7bb5226c
      Linus Torvalds authored
      Pull misc vfs updates from Al Viro:
       "Assorted patches from previous cycle(s)..."
      
      * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        fix hostfs_open() use of ->f_path.dentry
        Make sure that make_create_in_sticky() never sees uninitialized value of dir_mode
        fs: Kill DCACHE_DONTCACHE dentry even if DCACHE_REFERENCED is set
        fs: Handle I_DONTCACHE in iput_final() instead of generic_drop_inode()
        fs/namespace.c: WARN if mnt_count has become negative
      7bb5226c
  13. 24 Dec, 2020 1 commit
    • Linus Torvalds's avatar
      Merge tag 'docs-5.11-2' of git://git.lwn.net/linux · 71c5f031
      Linus Torvalds authored
      Pull documentation fixes from Jonathan Corbet:
       "A small set of late-arriving, small documentation fixes"
      
      * tag 'docs-5.11-2' of git://git.lwn.net/linux:
        docs: admin-guide: Fix default value of max_map_count in sysctl/vm.rst
        Documentation/submitting-patches: Document the SoB chain
        Documentation: process: Correct numbering
        docs: submitting-patches: Trivial - fix grammatical error
      71c5f031