• James Bottomley's avatar
    scsi: fix soft lockup in scsi_remove_target() on module removal · 90a88d6e
    James Bottomley authored
    This softlockup is currently happening:
    
    [  444.088002] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/1:1:29]
    [  444.088002] Modules linked in: lpfc(-) qla2x00tgt(O) qla2xxx_scst(O) scst_vdisk(O) scsi_transport_fc libcrc32c scst(O) dlm configfs nfsd lockd grace nfs_acl auth_rpcgss sunrpc ed
    d snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device dm_mod iTCO_wdt snd_hda_codec_realtek snd_hda_codec_generic gpio_ich iTCO_vendor_support ppdev snd_hda_intel snd_hda_codec snd_hda
    _core snd_hwdep tg3 snd_pcm snd_timer libphy lpc_ich parport_pc ptp acpi_cpufreq snd pps_core fjes parport i2c_i801 ehci_pci tpm_tis tpm sr_mod cdrom soundcore floppy hwmon sg 8250_
    fintek pcspkr i915 drm_kms_helper uhci_hcd ehci_hcd drm fb_sys_fops sysimgblt sysfillrect syscopyarea i2c_algo_bit usbcore button video usb_common fan ata_generic ata_piix libata th
    ermal
    [  444.088002] CPU: 1 PID: 29 Comm: kworker/1:1 Tainted: G           O    4.4.0-rc5-2.g1e923a3-default #1
    [  444.088002] Hardware name: FUJITSU SIEMENS ESPRIMO E           /D2164-A1, BIOS 5.00 R1.10.2164.A1               05/08/2006
    [  444.088002] Workqueue: fc_wq_4 fc_rport_final_delete [scsi_transport_fc]
    [  444.088002] task: f6266ec0 ti: f6268000 task.ti: f6268000
    [  444.088002] EIP: 0060:[<c07e7044>] EFLAGS: 00000286 CPU: 1
    [  444.088002] EIP is at _raw_spin_unlock_irqrestore+0x14/0x20
    [  444.088002] EAX: 00000286 EBX: f20d3800 ECX: 00000002 EDX: 00000286
    [  444.088002] ESI: f50ba800 EDI: f2146848 EBP: f6269ec8 ESP: f6269ec8
    [  444.088002]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
    [  444.088002] CR0: 8005003b CR2: 08f96600 CR3: 363ae000 CR4: 000006d0
    [  444.088002] Stack:
    [  444.088002]  f6269eec c066b0f7 00000286 f2146848 f50ba808 f50ba800 f50ba800 f2146a90
    [  444.088002]  f2146848 f6269f08 f8f0a4ed f3141000 f2146800 f2146a90 f619fa00 00000040
    [  444.088002]  f6269f40 c026cb25 00000001 166c6392 00000061 f6757140 f6136340 00000004
    [  444.088002] Call Trace:
    [  444.088002]  [<c066b0f7>] scsi_remove_target+0x167/0x1c0
    [  444.088002]  [<f8f0a4ed>] fc_rport_final_delete+0x9d/0x1e0 [scsi_transport_fc]
    [  444.088002]  [<c026cb25>] process_one_work+0x155/0x3e0
    [  444.088002]  [<c026cde7>] worker_thread+0x37/0x490
    [  444.088002]  [<c027214b>] kthread+0x9b/0xb0
    [  444.088002]  [<c07e72c1>] ret_from_kernel_thread+0x21/0x40
    
    What appears to be happening is that something has pinned the target
    so it can't go into STARGET_DEL via final release and the loop in
    scsi_remove_target spins endlessly until that happens.
    
    The fix for this soft lockup is to not keep looping over a device that
    we've called remove on but which hasn't gone into DEL state.  This
    patch will retain a simplistic memory of the last target and not keep
    looping over it.
    Reported-by: default avatarSebastian Herbszt <herbszt@gmx.de>
    Tested-by: default avatarSebastian Herbszt <herbszt@gmx.de>
    Fixes: 40998193
    Cc: stable@vger.kernel.org
    Signed-off-by: default avatarJames Bottomley <James.Bottomley@HansenPartnership.com>
    90a88d6e
scsi_sysfs.c 35.3 KB