    bdi: Do not use freezable workqueue · a2b90f11
    Mika Westerberg authored
    A removable block device, such as an NVMe drive or SSD connected over
    Thunderbolt, can be hot-removed at any time, including when the system
    is suspended. When a device is hot-removed during suspend and the system
    is resumed, the kernel first resumes devices and then thaws userspace,
    including freezable workqueues. In that case the NVMe driver notices
    that the device is unplugged and removes it from the system. This ends
    up calling bdi_unregister() for the gendisk, which then schedules
    wb_workfn() to be run one more time.
    
    However, since bdi_wq is still frozen, the flush_delayed_work() call in
    wb_shutdown() blocks forever, halting the system resume process. The
    user sees this as a hang, since nothing happens anymore.
    
    Triggering sysrq-w reveals this:
    
      Workqueue: nvme-wq nvme_remove_dead_ctrl_work [nvme]
      Call Trace:
       ? __schedule+0x2c5/0x630
       ? wait_for_completion+0xa4/0x120
       schedule+0x3e/0xc0
       schedule_timeout+0x1c9/0x320
       ? resched_curr+0x1f/0xd0
       ? wait_for...