    RDMA/hns: Do not halt commands during reset until later · 52414e27
    Yangyang Li authored
    is_reset indicates whether the hardware has started to reset. When
    hns_roce_hw_v2_reset_notify_down() is called, the hardware has not yet
    started to reset. If is_reset is set at this point, the driver intercepts
    all mailbox commands issued by resource destroy actions. The driver then
    cleans up the resources while the hardware can still access them, and
    the following errors appear:
    
      arm-smmu-v3 arm-smmu-v3.2.auto: event 0x10 received:
      arm-smmu-v3 arm-smmu-v3.2.auto: 	0x0000350100000010
      arm-smmu-v3 arm-smmu-v3.2.auto: 	0x000002088000003f
      arm-smmu-v3 arm-smmu-v3.2.auto: 	0x00000000a50e0800
      arm-smmu-v3 arm-smmu-v3.2.auto: 	0x0000000000000000
      arm-smmu-v3 arm-smmu-v3.2.auto: event 0x10 received:
      arm-smmu-v3 arm-smmu-v3.2.auto: 	0x0000350100000010
      arm-smmu-v3 arm-smmu-v3.2.auto: 	0x000002088000043e
      arm-smmu-v3 arm-smmu-v3.2.auto: 	0x00000000a50a0800
      arm-smmu-v3 arm-smmu-v3.2.auto: 	0x0000000000000000
      arm-smmu-v3 arm-smmu-v3.2.auto: event 0x10 received:
      arm-smmu-v3 arm-smmu-v3.2.auto: 	0x0000350100000010
      arm-smmu-v3 arm-smmu-v3.2.auto: 	0x0000020880000436
      arm-smmu-v3 arm-smmu-v3.2.auto: 	0x00000000a50a0880
      arm-smmu-v3 arm-smmu-v3.2.auto: 	0x0000000000000000
      arm-smmu-v3 arm-smmu-v3.2.auto: event 0x10 received:
      arm-smmu-v3 arm-smmu-v3.2.auto: 	0x0000350100000010
      arm-smmu-v3 arm-smmu-v3.2.auto: 	0x000002088000043a
      arm-smmu-v3 arm-smmu-v3.2.auto: 	0x00000000a50e0840
      hns3 0000:35:00.0: INT status: CMDQ(0x0) HW errors(0x0) other(0x0)
      arm-smmu-v3 arm-smmu-v3.2.auto: 	0x0000000000000000
      hns3 0000:35:00.0: received unknown or unhandled event of vector0
      arm-smmu-v3 arm-smmu-v3.2.auto: event 0x10 received:
      arm-smmu-v3 arm-smmu-v3.2.auto: 	0x0000350100000010
      {34}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 7
    
    is_reset is already set at the correct time in check_aedev_reset_status(),
    so remove the assignment in hns_roce_hw_v2_reset_notify_down().
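
    The ordering problem can be illustrated with a small standalone model.
    This is only a sketch, not the driver code: struct model_dev, its fields,
    and every model_* function are hypothetical names invented for
    illustration, and the flag handling is reduced to the one behaviour
    described above (destroy mailbox commands are skipped while is_reset is
    set).

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the per-device driver state. */
struct model_dev {
	bool is_reset;      /* driver flag: hardware has started to reset */
	bool hw_resetting;  /* models the real hardware reset status */
	int hw_contexts;    /* contexts the hardware still holds and may access */
};

/*
 * Models a destroy mailbox command. While is_reset is set the command is
 * intercepted, so the hardware is never told to drop its context.
 */
static bool model_destroy_mbox(struct model_dev *dev)
{
	if (dev->is_reset)
		return false;
	if (dev->hw_contexts > 0)
		dev->hw_contexts--;
	return true;
}

/* Behaviour described as the bug: the flag is set before the reset starts. */
static void model_notify_down_old(struct model_dev *dev)
{
	dev->is_reset = true;
}

/* Behaviour after the change: the notification leaves is_reset alone. */
static void model_notify_down_new(struct model_dev *dev)
{
	(void)dev;
}

/*
 * Loosely models the role the message assigns to
 * check_aedev_reset_status(): set the flag only once the hardware is
 * really resetting.
 */
static void model_check_reset_status(struct model_dev *dev)
{
	if (dev->hw_resetting)
		dev->is_reset = true;
}
```

    In this model, calling model_notify_down_old() and then
    model_destroy_mbox() leaves hw_contexts unchanged, which corresponds to
    the hardware still referencing memory the driver is about to free. With
    model_notify_down_new(), the destroy command reaches the hardware until
    model_check_reset_status() observes an actual reset.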
    
    Fixes: 726be12f ("RDMA/hns: Set reset flag when hw resetting")
    Link: https://lore.kernel.org/r/20211123084809.37318-1-liangwenpeng@huawei.com
    Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
    Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
hns_roce_hw_v2.c 183 KB