    habanalabs/gaudi: print last QM PQEs on error · 2718e1d3
    Ohad Sharabi authored
    In case QMAN has an error and stop_on_err is true, print specific
    information of the "offending" command buffer batch.
    
    If the error occurred on one of the higher CPs, the CQ pointer and size
    will be printed along with (up to) last 8 PQEs of the stream.
    
    If the error occurred in the lower CP, the CQ pointer and size will be
    printed along with (up to) last 8 PQEs of ALL upper CPs as we have no
    way to know which upper CP sent the job there.
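The selection rule described above can be sketched in C as follows. This is a minimal, user-space illustration and not the driver's code: every identifier (`struct pqe`, `struct stream`, `submit`, `last_pqes`, `dump_on_error`, `LOWER_CP`), the count of 4 upper CPs, and the 16-entry PQ are assumptions made for the example. Only the limit of 8 PQEs and the upper/lower CP distinction come from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_UPPER_CPS     4   /* assumption: number of upper CPs per QMAN */
#define PQ_LEN            16  /* assumption: PQ ring size, a power of two */
#define MAX_PQES_TO_PRINT 8   /* from the commit message: up to last 8 PQEs */
#define LOWER_CP          (-1) /* hypothetical marker for the lower CP */

struct pqe { uint64_t addr; uint32_t len; };      /* hypothetical PQE layout */
struct stream { struct pqe pq[PQ_LEN]; uint32_t pi; }; /* pi: PQEs submitted so far */

/* Store one PQE in the ring and advance the producer index. */
static void submit(struct stream *s, struct pqe e)
{
    s->pq[s->pi % PQ_LEN] = e;
    s->pi++;
}

/* Copy up to max of the most recent PQEs into out, oldest first. */
static size_t last_pqes(const struct stream *s, struct pqe *out, size_t max)
{
    assert(max <= PQ_LEN);
    size_t n = s->pi < max ? s->pi : max;

    for (size_t i = 0; i < n; i++)
        out[i] = s->pq[(s->pi - n + i) % PQ_LEN];
    return n;
}

/* Print the CQ pointer/size, then the PQEs of the relevant stream(s).
 * cp is an upper CP index, or LOWER_CP when the lower CP raised the error.
 * Returns the number of PQEs printed. */
static size_t dump_on_error(const struct stream *streams, int cp,
                            uint64_t cq_ptr, uint32_t cq_size)
{
    struct pqe buf[MAX_PQES_TO_PRINT];
    size_t total = 0;

    printf("CQ ptr 0x%llx size %u\n", (unsigned long long)cq_ptr, cq_size);

    for (int i = 0; i < NUM_UPPER_CPS; i++) {
        /* Upper CP error: only that stream. Lower CP error: all streams,
         * since the submitting upper CP cannot be identified. */
        if (cp != LOWER_CP && cp != i)
            continue;

        size_t n = last_pqes(&streams[i], buf, MAX_PQES_TO_PRINT);
        for (size_t j = 0; j < n; j++)
            printf("  stream %d PQE addr 0x%llx len %u\n", i,
                   (unsigned long long)buf[j].addr, buf[j].len);
        total += n;
    }
    return total;
}
```

With this layout, an error on an upper CP dumps at most 8 entries, while an error on the lower CP can dump up to 8 entries per upper CP.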
    
    This is done so higher SW levels will be able to debug their CS by
    extracting the raw data of the offending command buffer batch and
    examining it offline to detect the issue.
    
    Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
    Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
    Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
gaudiP.h 11.6 KB