1. 06 Mar, 2021 5 commits
    • sched: Fix migration_cpu_stop() requeueing · 8a6edb52
      Peter Zijlstra authored
      When affine_move_task(p) is called on a running task @p, which is not
      otherwise already changing affinity, we'll first set
      p->migration_pending and then do:
      
      	 stop_one_cpu(cpu_of_rq(rq), migration_cpu_stop, &arg);
      
      This then gets us to migration_cpu_stop() running on the CPU that was
      previously running our victim task @p.
      
      If we find that our task is no longer on that runqueue (this can
      happen because of a concurrent migration due to load-balance etc.),
      then we'll end up at the:
      
      	} else if (dest_cpu < 0 || pending) {
      
      branch. Which we'll take because we set pending earlier. Here we first
      check if the task @p has already satisfied the affinity constraints,
      if so we bail early [A]. Otherwise we'll reissue migration_cpu_stop()
      onto the CPU that is now hosting our task @p:
      
      	stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop,
      			    &pending->arg, &pending->stop_work);
      
      Except, we've never initialized pending->arg, which will be all 0s.
      
      This then results in running migration_cpu_stop() on the next CPU with
      arg->p == NULL, which gives the by now obvious result of fireworks.
      
      The cure is to change affine_move_task() to always use pending->arg,
      furthermore we can use the exact same pattern as the
      SCA_MIGRATE_ENABLE case, since we'll block on the pending->done
      completion anyway, no point in adding yet another completion in
      stop_one_cpu().
      
      This then gives a clear distinction between the two
      migration_cpu_stop() use cases:
      
        - sched_exec() / migrate_task_to() : arg->pending == NULL
        - affine_move_task() : arg->pending != NULL;
      
      And we can have it ignore p->migration_pending when !arg->pending. Any
      stop work from sched_exec() / migrate_task_to() is in addition to stop
      works from affine_move_task(), which will be sufficient to issue the
      completion.
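
The failure mode and the cure described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct task_model`, `struct migration_arg_model`, `struct pending_model`, `stopper_model()`, `requeue_buggy()` and `requeue_fixed()` are hypothetical stand-ins invented for this example, not the kernel's definitions, and the stopper returns an error where the kernel would dereference a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space stand-ins; these are NOT the kernel structures. */
struct task_model {
	int cpu;
};

struct pending_model;

struct migration_arg_model {
	struct task_model *p;
	int dest_cpu;
	struct pending_model *pending;
};

struct pending_model {
	struct migration_arg_model arg;	/* reused when the stop work is reissued */
	int done;			/* stands in for the completion */
};

/* Models the stopper callback. Returns -1 where the kernel would crash. */
int stopper_model(struct migration_arg_model *arg)
{
	if (arg->p == NULL)
		return -1;		/* arg->p == NULL: the "fireworks" case */
	arg->p->cpu = arg->dest_cpu;
	if (arg->pending)
		arg->pending->done = 1;	/* issue the completion */
	return 0;
}

/* Pre-fix pattern: the first stop work uses an on-stack arg, so the
 * reissued work sees a pending->arg that was never filled in. */
int requeue_buggy(struct task_model *p, int dest_cpu)
{
	struct pending_model pending;
	struct migration_arg_model arg;

	memset(&pending, 0, sizeof(pending));
	arg.p = p;
	arg.dest_cpu = dest_cpu;
	arg.pending = &pending;
	(void)arg;	/* first stop work found the task had moved away */

	return stopper_model(&pending.arg);	/* reissue with all-zero arg */
}

/* Post-fix pattern: always fill in and use pending->arg. */
int requeue_fixed(struct task_model *p, int dest_cpu)
{
	struct pending_model pending;

	memset(&pending, 0, sizeof(pending));
	pending.arg.p = p;
	pending.arg.dest_cpu = dest_cpu;
	pending.arg.pending = &pending;

	return stopper_model(&pending.arg);
}
```

In this model, `requeue_buggy()` reports the error and leaves the task where it was, while `requeue_fixed()` moves the task and marks the pending request as done.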
      
      Fixes: 6d337eab ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()")
      Cc: stable@kernel.org
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
      Link: https://lkml.kernel.org/r/20210224131355.357743989@infradead.org
    • Linux 5.12-rc2 · a38fd874
      Linus Torvalds authored
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · f3ed4de6
      Linus Torvalds authored
      Pull rdma fixes from Jason Gunthorpe:
       "Nothing special here, though Bob's regression fixes for rxe would have
        made it before the rc cycle had there not been such strong winter
        weather!
      
         - Fix corner cases in the rxe reference counting cleanup that are
           causing regressions in blktests for SRP
      
         - Two kdoc fixes so W=1 is clean
      
         - Missing error return in error unwind for mlx5
      
         - Wrong lock type nesting in IB CM"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
        RDMA/rxe: Fix errant WARN_ONCE in rxe_completer()
        RDMA/rxe: Fix extra deref in rxe_rcv_mcast_pkt()
        RDMA/rxe: Fix missed IB reference counting in loopback
        RDMA/uverbs: Fix kernel-doc warning of _uverbs_alloc
        RDMA/mlx5: Set correct kernel-doc identifier
        IB/mlx5: Add missing error code
        RDMA/rxe: Fix missing kconfig dependency on CRYPTO
        RDMA/cm: Fix IRQ restore in ib_send_cm_sidr_rep
    • Merge tag 'gcc-plugins-v5.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · de5bd6c5
      Linus Torvalds authored
      Pull gcc-plugins fixes from Kees Cook:
       "Tiny gcc-plugin fixes for v5.12-rc2. These issues are small but have
        been reported a couple times now by static analyzers, so best to get
        them fixed to reduce the noise. :)
      
         - Fix coding style issues (Jason Yan)"
      
      * tag 'gcc-plugins-v5.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        gcc-plugins: latent_entropy: remove unneeded semicolon
        gcc-plugins: structleak: remove unneeded variable 'ret'
    • Merge tag 'pstore-v5.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 8b24ef44
      Linus Torvalds authored
      Pull pstore fixes from Kees Cook:
      
       - Rate-limit ECC warnings (Dmitry Osipenko)
      
       - Fix error path check for NULL (Tetsuo Handa)
      
      * tag 'pstore-v5.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        pstore/ram: Rate-limit "uncorrectable error in header" message
        pstore: Fix warning in pstore_kill_sb()
  2. 05 Mar, 2021 33 commits
  3. 04 Mar, 2021 2 commits
    • kernel: provide create_io_thread() helper · cc440e87
      Jens Axboe authored
      Provide a generic helper for setting up an io_uring worker. Returns a
      task_struct so that the caller can do whatever setup is needed, then call
      wake_up_new_task() to kick it into gear.
      
      Add a kernel_clone_args member, io_thread, which tells copy_process() to
      mark the task with PF_IO_WORKER.
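
The create, set up, then wake flow described above can be sketched as a user-space model. Everything here is an assumption made for illustration: `struct clone_args_model`, `struct task_model`, `copy_process_model()`, `create_io_thread_model()`, `wake_up_new_task_model()` and the `MODEL_PF_IO_WORKER` bit are hypothetical names that only mimic the shape of the kernel interfaces named in the commit message, not their real signatures.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_PF_IO_WORKER 0x10u	/* illustrative flag bit for this sketch */

/* Hypothetical stand-in for kernel_clone_args with its new member. */
struct clone_args_model {
	int io_thread;
};

/* Hypothetical stand-in for task_struct. */
struct task_model {
	unsigned int flags;
	int running;
	const char *comm;
};

/* Models copy_process(): marks the task when args->io_thread is set. */
struct task_model *copy_process_model(const struct clone_args_model *args)
{
	struct task_model *tsk = calloc(1, sizeof(*tsk));

	if (!tsk)
		return NULL;
	if (args->io_thread)
		tsk->flags |= MODEL_PF_IO_WORKER;
	return tsk;
}

/* Models the helper: returns a task that has been created but not started. */
struct task_model *create_io_thread_model(void)
{
	struct clone_args_model args = { .io_thread = 1 };

	return copy_process_model(&args);
}

/* Models wake_up_new_task(): the caller starts the task after its setup. */
void wake_up_new_task_model(struct task_model *tsk)
{
	tsk->running = 1;
}
```

A caller in this model would obtain the task from `create_io_thread_model()`, adjust fields such as `comm` while the task is still idle, and only then call `wake_up_new_task_model()`.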

      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • io_uring: reliably cancel linked timeouts · dd59a3d5
      Pavel Begunkov authored
      Linked timeouts are fired asynchronously (i.e. soft-irq), and use
      generic cancellation paths to do their work, including poking into io-wq.
      The problem is that it's racy to access tctx->io_wq, as
      io_uring_task_cancel() and others may be happening at this exact moment.
      Mark linked timeouts with REQ_F_INFLIGHT for now, making sure there are
      no timeouts before io-wq destruction.

      Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>