    EDAC, sb_edac: Classify memory mirroring modes · 039d7af6
    Qiuxu Zhuo authored
    Two memory mirroring modes exist: full memory mirroring, and address
    range partial memory mirroring (supported by Haswell EX and Broadwell EX).
    
    a) In full memory mirroring, the memory behind each memory controller
       is mirrored, i.e. the memory is split into two identical mirrors
       (primary and secondary), so half of the memory is reserved for
       redundancy.
    
    b) In address range partial memory mirroring, the size (range) of the
       primary and secondary mirrors behind each memory controller is
       user-defined via the TAD0 register. The remaining memory ranges in
       that memory controller, defined by TAD1/TAD2/..., are non-mirrored.
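
    The rule in a) and b) can be sketched as below. This is only an
    illustration under stated assumptions, not the code in sb_edac.c: the
    enum and function names are hypothetical, and each TAD range is
    modelled simply as an inclusive upper address limit, with the ranges
    assumed to be contiguous from address 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for this sketch; not taken from the driver. */
enum mirroring_mode {
	NON_MIRRORING,
	ADDR_RANGE_MIRRORING,
	FULL_MIRRORING,
};

/*
 * Return the index of the TAD range that contains addr, or -1 if addr
 * lies above every range. limits[i] is assumed to be the inclusive
 * upper bound of TADi, sorted in ascending order.
 */
static int find_tad(const uint64_t *limits, size_t n_tads, uint64_t addr)
{
	assert(limits != NULL);

	for (size_t i = 0; i < n_tads; i++)
		if (addr <= limits[i])
			return (int)i;

	return -1;
}

/*
 * Decide whether the memory covered by TAD range 'tad' is mirrored:
 *  - full mirroring:          every range is mirrored
 *  - address range mirroring: only TAD0 is mirrored
 *  - non-mirroring:           nothing is mirrored
 */
static bool tad_is_mirrored(enum mirroring_mode mode, int tad)
{
	if (tad < 0)
		return false;

	switch (mode) {
	case FULL_MIRRORING:
		return true;
	case ADDR_RANGE_MIRRORING:
		return tad == 0;
	default:
		return false;
	}
}
```

    With such a helper, a decoder would apply mirroring-specific handling
    only when tad_is_mirrored() is true for the range that contains the
    failing address, and otherwise decode it as non-mirrored memory.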
    
    For more detail on memory mirroring, see the following article by Tony Luck:
    
      https://01.org/lkp/blogs/tonyluck/2016/address-range-partial-memory-mirroring-linux
    
    Currently, the sb_edac driver only supports address decoding in full
    memory mirroring and non-mirroring modes. In address range partial
    memory mirroring mode, it may fail to decode an address that falls in a
    non-mirrored area. The following log shows one such failure:
    
      mce: Uncorrected hardware memory error in user-access at 566d53a400
      Memory failure: 0x566d53a8: Killing einj_mem_uc:4647 due to hardware memory corruption
      Memory failure: 0x566d53a8: recovery action for dirty LRU page: Recovered
      mce: [Hardware Error]: Machine check events logged
      EDAC sbridge MC1: HANDLING MCE MEMORY ERROR
      EDAC sbridge MC1: CPU 48: Machine Check Event: 0 Bank 7: ec00000000010090
      EDAC sbridge MC1: TSC 4b914aa5a99dab
      EDAC sbridge MC1: ADDR 566d53a400
      EDAC sbridge MC1: MISC 1443a0c86
      EDAC sbridge MC1: PROCESSOR 0:406f1 TIME 1499712764 SOCKET 2 APIC 80
      EDAC MC1: 0 UE Can't discover the memory rank for ch addr 0x7fb54e900 on any memory ( page:0x0 offset:0x0 grain:32)
      mce: [Hardware Error]: Machine check events logged
    
    Therefore, classify the memory mirroring modes and correct the address
    decoding in address range partial memory mirroring mode.

    Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
    Cc: Tony Luck <tony.luck@intel.com>
    Cc: linux-edac <linux-edac@vger.kernel.org>
    Link: http://lkml.kernel.org/r/20170730180651.30060-1-qiuxu.zhuo@intel.com
    Signed-off-by: Borislav Petkov <bp@suse.de>