    workqueue, kasan: avoid alloc_pages() when recording stack · f70da745
    Marco Elver authored
    Shuah Khan reported:
    
     | When CONFIG_PROVE_RAW_LOCK_NESTING=y and CONFIG_KASAN are enabled,
     | kasan_record_aux_stack() runs into "BUG: Invalid wait context" when
     | it tries to allocate memory attempting to acquire spinlock in page
     | allocation code while holding workqueue pool raw_spinlock.
     |
     | There are several instances of this problem when block layer tries
     | to __queue_work(). Call trace from one of these instances is below:
     |
     |     kblockd_mod_delayed_work_on()
     |       mod_delayed_work_on()
     |         __queue_delayed_work()
     |           __queue_work() (rcu_read_lock, raw_spin_lock pool->lock held)
     |             insert_work()
     |               kasan_record_aux_stack()
     |                 kasan_save_stack()
     |                   stack_depot_save()
     |                     alloc_pages()
     |                       __alloc_pages()
     |                         get_page_from_freelist()
     |                           rmqueue()
     |                             rmqueue_pcplist()
     |                               local_lock_irqsave(&pagesets.lock, flags);
     |                               [ BUG: Invalid wait context triggered ]
    
    The default kasan_record_aux_stack() calls stack_depot_save() with
    GFP_NOWAIT, which in turn can then call alloc_pages(GFP_NOWAIT, ...).
    In general, however, it is not even possible to use either GFP_ATOMIC
    or GFP_NOWAIT in certain non-preemptive contexts, including
    raw_spin_locks (see gfp.h and commit ab00db21).
    
    Fix it by using kasan_record_aux_stack_noalloc(), which instructs
    stackdepot not to expand its stack storage via alloc_pages() when it
    runs out.
    
    While there is an increased risk of failing to insert the stack trace,
    this is typically unlikely, especially if the same insertion had already
    succeeded previously (stack depot hit).
    
    For frequent calls from the same location, it therefore becomes
    extremely unlikely that kasan_record_aux_stack_noalloc() fails.
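
    The trade-off described above can be illustrated with a small userspace
    model. This is only a sketch under stated assumptions, not the kernel
    implementation: struct depot, depot_save(), POOL_SLOTS, MAX_SLOTS and
    the can_alloc flag are hypothetical names chosen for illustration, and
    a counter stands in for alloc_pages(). It shows that a no-alloc insert
    can only fail on a miss with exhausted storage, while a hit always
    succeeds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of a stack depot; not the kernel API. */
#define POOL_SLOTS 2  /* slots gained per simulated page allocation */
#define MAX_SLOTS  8  /* hard upper bound of the model */

struct depot {
    uint32_t traces[MAX_SLOTS]; /* stored stack-trace hashes */
    size_t used;                /* slots currently filled */
    size_t capacity;            /* slots backed by storage */
    size_t grow_calls;          /* stand-in for alloc_pages() calls */
};

/*
 * Save a trace and return a non-zero handle, or 0 on failure.
 * With can_alloc == false the depot never grows, mirroring what a
 * caller holding a raw spinlock would need.
 */
static uint32_t depot_save(struct depot *d, uint32_t trace_hash, bool can_alloc)
{
    assert(d != NULL);

    /* Hit: the trace is already stored, so no storage is needed. */
    for (size_t i = 0; i < d->used; i++)
        if (d->traces[i] == trace_hash)
            return (uint32_t)(i + 1);

    /* Miss with exhausted storage: grow only if the caller allows it. */
    if (d->used == d->capacity) {
        if (!can_alloc || d->capacity + POOL_SLOTS > MAX_SLOTS)
            return 0;
        d->capacity += POOL_SLOTS;
        d->grow_calls++;
    }

    /* Miss with a free slot: store the trace, handle is index + 1. */
    d->traces[d->used] = trace_hash;
    d->used++;
    return (uint32_t)d->used;
}
```

    In this model, a first no-alloc insert into an empty depot fails, but
    once any allocating caller has stored the same trace, every later
    no-alloc insert of that trace is a hit and returns the same handle.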
    
    Link: https://lkml.kernel.org/r/20210902200134.25603-1-skhan@linuxfoundation.org
    Link: https://lkml.kernel.org/r/20210913112609.2651084-7-elver@google.com
    
    Signed-off-by: Marco Elver <elver@google.com>
    Reported-by: Shuah Khan <skhan@linuxfoundation.org>
    Tested-by: Shuah Khan <skhan@linuxfoundation.org>
    Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Acked-by: Tejun Heo <tj@kernel.org>
    Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
    Cc: Dmitry Vyukov <dvyukov@google.com>
    Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
    Cc: Lai Jiangshan <jiangshanlai@gmail.com>
    Cc: Taras Madan <tarasmadan@google.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Vijayanand Jitta <vjitta@codeaurora.org>
    Cc: Vinayak Menon <vinmenon@codeaurora.org>
    Cc: Walter Wu <walter-zh.wu@mediatek.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>