• Thomas Gleixner's avatar
    x86/apic: Ignore secondary threads if nosmt=force · 2207def7
    Thomas Gleixner authored
    nosmt on the kernel command line merely prevents the onlining of the
    secondary SMT siblings.
    
    nosmt=force makes the APIC detection code ignore the secondary SMT siblings
    completely, so they even do not show up as possible CPUs. That reduces the
    amount of memory allocations for per cpu variables and saves other
    resources from being allocated too large.
    
    This is not fully equivalent to disabling SMT in the BIOS because the low
    level SMT enabling in the BIOS can result in partitioning of resources
    between the siblings, which is not undone by just ignoring them. Some CPUs
    can use the full resources when their sibling is not onlined, but this is
    depending on the CPU family and model and it's not well documented whether
    this applies to all partitioned resources. That means depending on the
    workload disabling SMT in the BIOS might result in better performance.
    
    Linus analysis of the Intel manual:
    
      The intel optimization manual is not very clear on what the partitioning
      rules are.
    
      I find:
    
        "In general, the buffers for staging instructions between major pipe
         stages  are partitioned. These buffers include µop queues after the
         execution trace cache, the queues after the register rename stage, the
         reorder buffer which stages instructions for retirement, and the load
         and store buffers.
    
         In the case of load and store buffers, partitioning also provided an
         easier implementation to maintain memory ordering for each logical
         processor and detect memory ordering violations"
    
      but some of that partitioning may be relaxed if the HT thread is "not
      active":
    
        "In Intel microarchitecture code name Sandy Bridge, the micro-op queue
         is statically partitioned to provide 28 entries for each logical
         processor,  irrespective of software executing in single thread or
         multiple threads. If one logical processor is not active in Intel
         microarchitecture code name Ivy Bridge, then a single thread executing
         on that processor  core can use the 56 entries in the micro-op queue"
    
      but I do not know what "not active" means, and how dynamic it is. Some of
      that partitioning may be entirely static and depend on the early BIOS
      disabling of HT, and even if we park the cores, the resources will just be
      wasted.
    Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
    Reviewed-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Acked-by: default avatarIngo Molnar <mingo@kernel.org>
    2207def7
boot.c 42.8 KB