• Tony Luck's avatar
    x86/cpu: Fix x86_match_cpu() to match just X86_VENDOR_INTEL · 93022482
    Tony Luck authored
    Code in v6.9 arch/x86/kernel/smpboot.c was changed by commit
    
      4db64279 ("x86/cpu: Switch to new Intel CPU model defines") from:
    
      static const struct x86_cpu_id intel_cod_cpu[] = {
              X86_MATCH_INTEL_FAM6_MODEL(HASWELL_X, 0),       /* COD */
              X86_MATCH_INTEL_FAM6_MODEL(BROADWELL_X, 0),     /* COD */
              X86_MATCH_INTEL_FAM6_MODEL(ANY, 1),             /* SNC */	<--- 443
              {}
      };
    
      static bool match_llc(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
      {
              const struct x86_cpu_id *id = x86_match_cpu(intel_cod_cpu);
    
    to:
    
      static const struct x86_cpu_id intel_cod_cpu[] = {
               X86_MATCH_VFM(INTEL_HASWELL_X,   0),    /* COD */
               X86_MATCH_VFM(INTEL_BROADWELL_X, 0),    /* COD */
               X86_MATCH_VFM(INTEL_ANY,         1),    /* SNC */
               {}
       };
    
      static bool match_llc(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
      {
              const struct x86_cpu_id *id = x86_match_cpu(intel_cod_cpu);
    
    On an Intel CPU with SNC enabled this code previously matched the rule on line
    443 to avoid printing messages about insane cache configuration.  The new code
    did not match any rules.
    
    Expanding the macros for the intel_cod_cpu[] array shows that the old is
    equivalent to:
    
      static const struct x86_cpu_id intel_cod_cpu[] = {
      [0] = { .vendor = 0, .family = 6, .model = 0x3F, .steppings = 0, .feature = 0, .driver_data = 0 },
      [1] = { .vendor = 0, .family = 6, .model = 0x4F, .steppings = 0, .feature = 0, .driver_data = 0 },
      [2] = { .vendor = 0, .family = 6, .model = 0x00, .steppings = 0, .feature = 0, .driver_data = 1 },
      [3] = { .vendor = 0, .family = 0, .model = 0x00, .steppings = 0, .feature = 0, .driver_data = 0 }
      }
    
    while the new code expands to:
    
      static const struct x86_cpu_id intel_cod_cpu[] = {
      [0] = { .vendor = 0, .family = 6, .model = 0x3F, .steppings = 0, .feature = 0, .driver_data = 0 },
      [1] = { .vendor = 0, .family = 6, .model = 0x4F, .steppings = 0, .feature = 0, .driver_data = 0 },
      [2] = { .vendor = 0, .family = 0, .model = 0x00, .steppings = 0, .feature = 0, .driver_data = 1 },
      [3] = { .vendor = 0, .family = 0, .model = 0x00, .steppings = 0, .feature = 0, .driver_data = 0 }
      }
    
    Looking at the code for x86_match_cpu():
    
      const struct x86_cpu_id *x86_match_cpu(const struct x86_cpu_id *match)
      {
               const struct x86_cpu_id *m;
               struct cpuinfo_x86 *c = &boot_cpu_data;
    
               for (m = match;
                    m->vendor | m->family | m->model | m->steppings | m->feature;
                    m++) {
           		...
               }
               return NULL;
    
    it is clear that there was no match because the ANY entry in the table (array
    index 2) is now the loop termination condition (all of vendor, family, model,
    steppings, and feature are zero).
    
    So this code was working before because the "ANY" check was looking for any
    Intel CPU in family 6. But fails now because the family is a wild card. So the
    root cause is that x86_match_cpu() has never been able to match on a rule with
    just X86_VENDOR_INTEL and all other fields set to wildcards.
    
    Add a new flags field to struct x86_cpu_id that has a bit set to indicate that
    this entry in the array is valid. Update X86_MATCH*() macros to set that bit.
    Change the end-marker check in x86_match_cpu() to just check the flags field
    for this bit.
    
    Backporter notes: The commit in Fixes is really the one that is broken:
    you can't have m->vendor as part of the loop termination conditional in
    x86_match_cpu() because it can happen - as it has happened above
    - that that whole conditional is 0 albeit vendor == 0 is a valid case
    - X86_VENDOR_INTEL is 0.
    
    However, the only case where the above happens is the SNC check added by
    4db64279 so you only need this fix if you have backported that
    other commit
    
      4db64279 ("x86/cpu: Switch to new Intel CPU model defines")
    
    Fixes: 644e9cbb ("Add driver auto probing for x86 features v4")
    Suggested-by: default avatarThomas Gleixner <tglx@linutronix.de>
    Suggested-by: default avatarBorislav Petkov <bp@alien8.de>
    Signed-off-by: default avatarTony Luck <tony.luck@intel.com>
    Signed-off-by: default avatarBorislav Petkov (AMD) <bp@alien8.de>
    Cc: <stable+noautosel@kernel.org> # see above
    Link: https://lore.kernel.org/r/20240517144312.GBZkdtAOuJZCvxhFbJ@fat_crate.local
    93022482
cpu_device_id.h 10.7 KB