• Chad Dupuis's avatar
    scsi: qedf: Synchronize rport restarts when multiple ELS commands time out · 44c7c859
    Chad Dupuis authored
    If multiple ELS commands time out, such as aborts, they could all try to
    restart the same rport and the same time.  This could mean multiple
    multiple processes trying to clean up any outstanding commands or trying
    to upload the same port.
    
    Add a new flag (QEDF_RPORT_IN_RESET) and check other fcport state flags
    before trying to reset the port.
    
    Fixes the crash:
    
    [17501.824701] ------------[ cut here ]------------
    [17501.824733] kernel BUG at include/asm-generic/dma-mapping-common.h:65!
    [17501.824760] invalid opcode: 0000 [#1] SMP
    [17501.824781] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ses enclosure dm_service_time vfat fat sb_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass joydev btrfs hpilo raid6_pq iTCO_wdt iTCO_vendor_support xor hpwdt ipmi_ssif sg crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul ioatdma lpc_ich glue_helper ablk_helper i2c_i801 shpchp cryptd ipmi_si pcspkr acpi_power_meter ipmi_devintf pcc_cpufreq dca wmi ipmi_msghandler dm_multipath nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sr_mod cdrom sd_mod
    [17501.825119]  crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm qedf(OE) drm libfcoe ahci qedi(OE) crct10dif_pclmul libfc libahci uio crct10dif_common crc32c_intel libiscsi libata scsi_transport_iscsi scsi_transport_fc tg3 qede(OE) scsi_tgt hpsa qed(OE) i2c_core ptp scsi_transport_sas pps_core iscsi_boot_sysfs dm_mirror dm_region_hash dm_log dm_mod
    [17501.825292] CPU: 8 PID: 10531 Comm: kworker/u96:1 Tainted: G           OE  ------------   3.10.0-693.el7.x86_64 #1
    [17501.825330] Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 06/02/2016
    [17501.825372] Workqueue: fc_rport_eq fc_rport_work [libfc]
    [17501.825395] task: ffff88101bca8000 ti: ffff881025278000 task.ti: ffff881025278000
    [17501.825424] RIP: 0010:[<ffffffffc042def9>]  [<ffffffffc042def9>] qedf_unmap_sg_list.isra.15+0x89/0x90 [qedf]
    [17501.825471] RSP: 0018:ffff88102527bb98  EFLAGS: 00010212
    [17501.825493] RAX: ffff8800224eac00 RBX: ffffc9000cd05210 RCX: 0000000000001000
    [17501.825520] RDX: 000000007e655e40 RSI: 0000000000001000 RDI: ffff88107fe3b098
    [17501.826683] RBP: ffff88102527bba0 R08: ffffffff81a13200 R09: 0000000000000286
    [17501.827747] R10: 0000000000000004 R11: 0000000000000005 R12: ffffc9000cd051b8
    [17501.828804] R13: ffff881037640c28 R14: 0000000000000007 R15: ffffc9000cd05200
    [17501.829850] FS:  0000000000000000(0000) GS:ffff88103fa00000(0000) knlGS:0000000000000000
    [17501.830910] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [17501.831966] CR2: 00007f9b94005f38 CR3: 00000000019f2000 CR4: 00000000003407e0
    [17501.833027] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [17501.834087] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [17501.835142] Stack:
    [17501.836201]  ffff881033ddbb80 ffff88102527bc30 ffffffffc042f834 0000000000002710
    [17501.837264]  ffff88102527bbd0 ffffffff8133d9dd ffffc9000cd052a0 ffff88102527bc30
    [17501.838325]  ffffffff816a9c65 0000000000000001 ffff88101bca8000 ffffffff810c4810
    [17501.839388] Call Trace:
    [17501.840446]  [<ffffffffc042f834>] qedf_scsi_done+0x54/0x1d0 [qedf]
    [17501.841504]  [<ffffffff8133d9dd>] ? list_del+0xd/0x30
    [17501.842537]  [<ffffffff816a9c65>] ? wait_for_completion_timeout+0x125/0x140
    [17501.843560]  [<ffffffff810c4810>] ? wake_up_state+0x20/0x20
    [17501.844577]  [<ffffffffc0430311>] qedf_initiate_cleanup+0x2e1/0x310 [qedf]
    [17501.845587]  [<ffffffffc04305fe>] qedf_flush_active_ios+0x10e/0x260 [qedf]
    [17501.846612]  [<ffffffffc042892f>] qedf_cleanup_fcport+0x5f/0x370 [qedf]
    [17501.847613]  [<ffffffffc04292d8>] qedf_rport_event_handler+0x398/0x950 [qedf]
    [17501.848602]  [<ffffffff810cdc7c>] ? dequeue_entity+0x11c/0x5d0
    [17501.849581]  [<ffffffff81098a2b>] ? __internal_add_timer+0xab/0x130
    [17501.850555]  [<ffffffff810ce54e>] ? dequeue_task_fair+0x41e/0x660
    [17501.851528]  [<ffffffffc03241a4>] fc_rport_work+0xf4/0x6c0 [libfc]
    [17501.852490]  [<ffffffff810a881a>] process_one_work+0x17a/0x440
    [17501.853446]  [<ffffffff810a94e6>] worker_thread+0x126/0x3c0
    Signed-off-by: default avatarChad Dupuis <chad.dupuis@cavium.com>
    Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
    44c7c859
qedf_els.c 25.3 KB