    ACPI: APEI: set memory failure flags as MF_ACTION_REQUIRED on synchronous events · a70297d2
    Shuai Xue authored
    There are two major types of uncorrected recoverable (UCR) errors:
    
     - Synchronous error: The error is detected and raised at the point of
       the consumption in the execution flow, e.g. when a CPU tries to
       access a poisoned cache line. The CPU will take a synchronous error
       exception such as Synchronous External Abort (SEA) on Arm64 and
       Machine Check Exception (MCE) on x86. The OS must take action (for
       example, offline the failing page or kill the failing thread) to
       recover from this uncorrectable error.
    
     - Asynchronous error: The error is detected outside of the processor
       execution context, e.g. when an error is detected by a background
       scrubber. Some data in memory is corrupted but has not yet been
       consumed. The OS may optionally take action to recover from this
       uncorrectable error.
    
    When APEI firmware-first mode is enabled, a platform may describe one
    error source for handling synchronous errors (e.g. MCE or SEA
    notification) and another for handling asynchronous errors (e.g. SCI or
    External Interrupt notification). In other words, synchronous errors can
    be distinguished by their APEI notification type. For synchronous errors,
    the kernel kills the current process, which is accessing the poisoned
    page, by sending SIGBUS with BUS_MCEERR_AR. For asynchronous errors, the
    kernel notifies the process that owns the poisoned page by sending SIGBUS
    with BUS_MCEERR_AO in early kill mode. However, the GHES driver always
    sets mf_flags to 0, so all synchronous errors are handled as asynchronous
    errors in memory failure handling.
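
The distinction described above can be sketched as a small mapping from
notification type to memory failure flags. This is only an illustrative,
userspace-compilable sketch: the enum, the helper names and the flag value
are hypothetical stand-ins, not the kernel's definitions, and it assumes that
SEA and MCE notifications are the synchronous ones, as in the examples given
in this message.

```c
#include <stdbool.h>

/* Hypothetical subset of APEI notification types, for illustration only. */
enum sketch_notify_type {
    SKETCH_NOTIFY_SCI,          /* asynchronous: SCI */
    SKETCH_NOTIFY_EXTERNAL_IRQ, /* asynchronous: External Interrupt */
    SKETCH_NOTIFY_SEA,          /* synchronous: Synchronous External Abort */
    SKETCH_NOTIFY_MCE,          /* synchronous: Machine Check Exception */
};

/* Placeholder bit standing in for MF_ACTION_REQUIRED; not the real value. */
#define SKETCH_MF_ACTION_REQUIRED (1 << 1)

/* Synchronous sources report the error at the point of consumption. */
static bool sketch_notify_is_sync(enum sketch_notify_type type)
{
    return type == SKETCH_NOTIFY_SEA || type == SKETCH_NOTIFY_MCE;
}

/*
 * Choose memory failure flags for a recoverable error:
 * action required for synchronous events, 0 (action optional) otherwise.
 */
static int sketch_memory_failure_flags(enum sketch_notify_type type)
{
    return sketch_notify_is_sync(type) ? SKETCH_MF_ACTION_REQUIRED : 0;
}
```

With flags always 0, every error would take the second branch, which is the
behaviour described above as handling synchronous errors as asynchronous ones.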
    
    To this end, set the memory failure flags to MF_ACTION_REQUIRED on
    synchronous events.

    Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
    Tested-by: Ma Wupeng <mawupeng1@huawei.com>
    Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
    Reviewed-by: Xiaofei Tan <tanxiaofei@huawei.com>
    Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Reviewed-by: James Morse <james.morse@arm.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
ghes.c 42.1 KB