• Sven Schnelle's avatar
    workqueue: Fix selection of wake_cpu in kick_pool() · 57a01eaf
    Sven Schnelle authored
    
    
    With cpu_possible_mask=0-63 and cpu_online_mask=0-7 the following
    kernel oops was observed:
    
    smp: Bringing up secondary CPUs ...
    smp: Brought up 1 node, 8 CPUs
    Unable to handle kernel pointer dereference in virtual kernel address space
    Failing address: 0000000000000000 TEID: 0000000000000803
    [..]
     Call Trace:
    arch_vcpu_is_preempted+0x12/0x80
    select_idle_sibling+0x42/0x560
    select_task_rq_fair+0x29a/0x3b0
    try_to_wake_up+0x38e/0x6e0
    kick_pool+0xa4/0x198
    __queue_work.part.0+0x2bc/0x3a8
    call_timer_fn+0x36/0x160
    __run_timers+0x1e2/0x328
    __run_timer_base+0x5a/0x88
    run_timer_softirq+0x40/0x78
    __do_softirq+0x118/0x388
    irq_exit_rcu+0xc0/0xd8
    do_ext_irq+0xae/0x168
    ext_int_handler+0xbe/0xf0
    psw_idle_exit+0x0/0xc
    default_idle_call+0x3c/0x110
    do_idle+0xd4/0x158
    cpu_startup_entry+0x40/0x48
    rest_init+0xc6/0xc8
    start_kernel+0x3c4/0x5e0
    startup_continue+0x3c/0x50
    
    The crash is caused by calling arch_vcpu_is_preempted() for an offline
    CPU. To avoid this, select the cpu with cpumask_any_and_distribute()
    to mask __pod_cpumask with cpu_online_mask. In case no cpu is left in
    the pool, skip the assignment.
    
    tj: This doesn't fully fix the bug as CPUs can still go down between picking
    the target CPU and the wake call. Fixing that likely requires adding
    cpu_online() test to either the sched or s390 arch code. However, regardless
    of how that is fixed, workqueue shouldn't be picking a CPU which isn't
    online as that would result in unpredictable and worse behavior.
    Signed-off-by: default avatarSven Schnelle <svens@linux.ibm.com>
    Fixes: 8639eceb
    
     ("workqueue: Implement non-strict affinity scope for unbound workqueues")
    Cc: stable@vger.kernel.org # v6.6+
    Signed-off-by: default avatarTejun Heo <tj@kernel.org>
    57a01eaf
workqueue.c 220 KB