    workqueue/watchdog: Make unbound workqueues aware of touch_softlockup_watchdog() · 89e28ce6
    Wang Qing authored
    There are two workqueue-specific watchdog timestamps:
    
        + @wq_watchdog_touched_cpu (per-CPU) updated by
          touch_softlockup_watchdog()
    
        + @wq_watchdog_touched (global) updated by
          touch_all_softlockup_watchdogs()
    
    watchdog_timer_fn() checks only the global @wq_watchdog_touched for
    unbound workqueues. As a result, unbound workqueues are not aware
    of touch_softlockup_watchdog(). The watchdog might report a stall
    even when the unbound workqueues are blocked by known slow code.
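
    The problem can be sketched with a small userspace model. This is
    not the kernel code: the integer clock, OLD_NR_CPUS, OLD_THRESH and
    all old_* names are assumptions made for illustration, and jiffies
    wraparound is ignored.

```c
#include <stdbool.h>

#define OLD_NR_CPUS 4    /* assumption: number of CPUs in the model */
#define OLD_THRESH  30   /* assumption: stall threshold in clock ticks */
#define OLD_UNBOUND (-1) /* marks a pool that is not bound to one CPU */

/* Models @wq_watchdog_touched (global). */
static unsigned long old_touched_global;
/* Models @wq_watchdog_touched_cpu (per-CPU). */
static unsigned long old_touched_cpu[OLD_NR_CPUS];

static unsigned long old_later(unsigned long a, unsigned long b)
{
    return a > b ? a : b;
}

/* Models touch_softlockup_watchdog() before the change: per-CPU only. */
static void old_touch(int cpu, unsigned long now)
{
    old_touched_cpu[cpu] = now;
}

/* Models touch_all_softlockup_watchdogs() before the change: global only. */
static void old_touch_all(unsigned long now)
{
    old_touched_global = now;
}

/*
 * Models the stall check before the change. Every pool looks at the
 * global timestamp; only bound pools also look at their CPU's timestamp.
 */
static bool old_stalled(int pool_cpu, unsigned long pool_ts, unsigned long now)
{
    unsigned long ts = old_later(pool_ts, old_touched_global);

    if (pool_cpu != OLD_UNBOUND)
        ts = old_later(ts, old_touched_cpu[pool_cpu]);

    return now > ts + OLD_THRESH;
}
```

    In this model, old_touch(1, 100) keeps a pool bound to CPU 1 quiet
    at time 110, but an unbound pool with an old pool timestamp is still
    reported as stalled because only old_touch_all() refreshes it.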
    
    Solution:
    touch_softlockup_watchdog() must also touch the global @wq_watchdog_touched
    timestamp.
    
    The global timestamp can no longer be used for bound workqueues because
    it is now updated from all CPUs. Instead, bound workqueues have to check
    only @wq_watchdog_touched_cpu and these timestamps have to be updated for
    all CPUs in touch_all_softlockup_watchdogs().
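
    The resulting rules can be sketched with a small userspace model.
    This is not the kernel code: the integer clock, MODEL_NR_CPUS,
    MODEL_THRESH and all model_* names are assumptions made for
    illustration, and jiffies wraparound is ignored.

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 4    /* assumption: number of CPUs in the model */
#define MODEL_THRESH  30   /* assumption: stall threshold in clock ticks */
#define MODEL_UNBOUND (-1) /* marks a pool that is not bound to one CPU */

/* Models @wq_watchdog_touched (global). */
static unsigned long model_touched_global;
/* Models @wq_watchdog_touched_cpu (per-CPU). */
static unsigned long model_touched_cpu[MODEL_NR_CPUS];

/* Models touch_softlockup_watchdog(): per-CPU and global timestamp. */
static void model_touch(int cpu, unsigned long now)
{
    model_touched_cpu[cpu] = now;
    model_touched_global = now;
}

/* Models touch_all_softlockup_watchdogs(): every per-CPU timestamp. */
static void model_touch_all(unsigned long now)
{
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        model_touch(cpu, now);
}

/*
 * Models the stall check: a bound pool compares against its own CPU's
 * timestamp only, an unbound pool against the global timestamp only.
 */
static bool model_stalled(int pool_cpu, unsigned long pool_ts, unsigned long now)
{
    unsigned long touched = (pool_cpu == MODEL_UNBOUND)
        ? model_touched_global
        : model_touched_cpu[pool_cpu];
    unsigned long ts = pool_ts > touched ? pool_ts : touched;

    return now > ts + MODEL_THRESH;
}
```

    In this model, model_touch(1, 100) refreshes both the pool bound to
    CPU 1 and any unbound pool, while a pool bound to CPU 2 is refreshed
    only by model_touch(2, ...) or model_touch_all().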
    
    Beware:
    The change might cause the opposite problem. An unbound workqueue
    might get blocked on CPU A because of a real softlockup. The workqueue
    watchdog would miss it if the timestamp got touched on CPU B.
    
    This is acceptable because softlockups are detected by the softlockup
    watchdog. The workqueue watchdog is there to detect stalls where
    a work item never finishes, for example, because of dependencies
    between work items queued into the same workqueue.
    
    V3:
    - Clarify the commit message according to Petr's suggestion.
    Signed-off-by: Wang Qing <wangqing@vivo.com>
    Signed-off-by: Tejun Heo <tj@kernel.org>
watchdog.c 20.1 KB