    x86, AMD, MCE thresholding: Fix the MCi_MISCj iteration order · 6dcbfe4f
    Borislav Petkov authored
    This fixes possible cases of not collecting valid error info in
    the MCE error thresholding groups on F10h hardware.
    
    The current code contains a subtle problem of checking only the
    Valid bit of MSR0000_0413 (which is MC4_MISC0 - DRAM
    thresholding group) in its first iteration and breaking out if
    the bit is cleared.
    
    But (!), this MSR contains an offset value, BlkPtr[31:24], which
    points to the remaining MSRs in this thresholding group which
    might contain valid information too. If we bail out after
    checking only the Valid bit in the first MSR, without also
    checking the block pointer, we miss that other information.
    
    The thing is, MC4_MISC0[BlkPtr] is not predicated on
    MCi_STATUS[MiscV] or MC4_MISC0[Valid] and should be checked
    prior to iterating over the MCi_MISCj thresholding group,
    irrespective of the MC4_MISC0[Valid] setting.
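
The difference in iteration order can be sketched with a small user-space
model. This is not the kernel code: the register group is simulated with an
array, and the names (misc_group, count_valid_old, count_valid_fixed), the
group size and the Valid bit position are assumptions made for illustration
only. Only the BlkPtr[31:24] field position is taken from the description
above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names and constants here are illustrative assumptions, not kernel code. */
#define NR_BLOCKS     9               /* assumed number of MISC registers per group */
#define MISC_VALID    (1ULL << 63)    /* assumed position of the Valid bit */
#define BLKPTR_SHIFT  24              /* BlkPtr[31:24], as described above */
#define BLKPTR_MASK   0xFFULL

/*
 * Simulated thresholding group: misc[0] models MC4_MISC0, misc[1..] model
 * the remaining registers that BlkPtr points to.
 */
struct misc_group {
	uint64_t misc[NR_BLOCKS];
};

static unsigned int blkptr(uint64_t misc0)
{
	return (unsigned int)((misc0 >> BLKPTR_SHIFT) & BLKPTR_MASK);
}

/* Old order: a cleared Valid bit in the first register ends the walk. */
static int count_valid_old(const struct misc_group *g)
{
	int found = 0;

	assert(g != NULL);
	for (int block = 0; block < NR_BLOCKS; ++block) {
		if (block == 1 && blkptr(g->misc[0]) == 0)
			break;		/* no further registers in this group */

		if (!(g->misc[block] & MISC_VALID)) {
			if (block)
				continue;
			break;		/* bails out before BlkPtr is consulted */
		}
		++found;
	}
	return found;
}

/* Fixed order: skip an invalid register, but still follow BlkPtr. */
static int count_valid_fixed(const struct misc_group *g)
{
	int found = 0;

	assert(g != NULL);
	for (int block = 0; block < NR_BLOCKS; ++block) {
		if (block == 1 && blkptr(g->misc[0]) == 0)
			break;		/* BlkPtr is checked regardless of Valid */

		if (!(g->misc[block] & MISC_VALID))
			continue;
		++found;
	}
	return found;
}
```

With the first register invalid but carrying a non-zero BlkPtr, and one of
the remaining registers valid, the old walk in this model reports nothing
while the fixed walk finds the valid register.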

    Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
    Cc: <stable@kernel.org>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
mce_amd.c 15.2 KB