    powerpc/pseries/cpuhp: cache node corrections · 7edd5c9a
    Nathan Lynch authored
    On pseries, cache nodes in the device tree can be added and removed by the
    CPU DLPAR code as well as the partition migration (mobility) code. PowerVM
    partitions in dedicated processor mode typically have L2 and L3 cache
    nodes.
    
    The CPU DLPAR code has the following shortcomings:
    
    * Cache nodes returned as siblings of a new CPU node by
      ibm,configure-connector are silently discarded; only the CPU node is
      added to the device tree.
    
    * Cache nodes which become unreferenced in the processor removal path are
      not removed from the device tree. This can lead to duplicate nodes when
      the post-migration device tree update code replaces cache nodes.
    
    This is long-standing behavior. Presumably it has gone mostly unnoticed
    because the two bugs obscure each other in common simple scenarios (e.g.
    remove a CPU and add it back). You would likely notice only if you
    inspected the device tree or the sysfs cacheinfo.
    
    Booted with two processors:
    
      $ pwd
      /sys/firmware/devicetree/base/cpus
      $ ls -1d */
      l2-cache@2010/
      l2-cache@2011/
      l3-cache@3110/
      l3-cache@3111/
      PowerPC,POWER9@0/
      PowerPC,POWER9@8/
      $ lsprop */l2-cache
      l2-cache@2010/l2-cache
                     00003110 (12560)
      l2-cache@2011/l2-cache
                     00003111 (12561)
      PowerPC,POWER9@0/l2-cache
                     00002010 (8208)
      PowerPC,POWER9@8/l2-cache
                     00002011 (8209)
      $ ls /sys/devices/system/cpu/cpu0/cache/
      index0  index1  index2  index3
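
    The lsprop output above implies a chain: each CPU node's l2-cache
    property holds the phandle of an L2 node, whose own l2-cache property
    holds the phandle of an L3 node. A minimal userland sketch of following
    that chain, using a toy dictionary rather than the real device tree; the
    phandle-to-node mapping is an assumption based on the unit addresses:

```python
# Toy model of the two-processor tree; values copied from the lsprop output.
# Each node maps to the phandle in its "l2-cache" property, or None if absent.
NEXT_CACHE = {
    "PowerPC,POWER9@0": 0x2010,
    "PowerPC,POWER9@8": 0x2011,
    "l2-cache@2010": 0x3110,
    "l2-cache@2011": 0x3111,
    "l3-cache@3110": None,
    "l3-cache@3111": None,
}

# Assumption: each cache node's phandle equals its unit address.
PHANDLE_TO_NODE = {
    0x2010: "l2-cache@2010",
    0x2011: "l2-cache@2011",
    0x3110: "l3-cache@3110",
    0x3111: "l3-cache@3111",
}


def cache_chain(node):
    """Return the cache nodes reachable from `node`, nearest level first."""
    chain = []
    phandle = NEXT_CACHE[node]
    while phandle is not None:
        node = PHANDLE_TO_NODE[phandle]
        chain.append(node)
        phandle = NEXT_CACHE[node]
    return chain


for cpu in ("PowerPC,POWER9@0", "PowerPC,POWER9@8"):
    print(cpu, "->", " -> ".join(cache_chain(cpu)))
```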
    
    After DLPAR-adding PowerPC,POWER9@10, we see that its associated cache
    nodes are absent, its threads' L2+L3 cacheinfo is unpopulated, and it is
    missing a cache level in its sched domain hierarchy:
    
      $ ls -1d */
      l2-cache@2010/
      l2-cache@2011/
      l3-cache@3110/
      l3-cache@3111/
      PowerPC,POWER9@0/
      PowerPC,POWER9@10/
      PowerPC,POWER9@8/
      $ lsprop PowerPC\,POWER9@10/l2-cache
      PowerPC,POWER9@10/l2-cache
                     00002012 (8210)
      $ ls /sys/devices/system/cpu/cpu16/cache/
      index0  index1
      $ grep . /sys/kernel/debug/sched/domains/cpu{0,8,16}/domain*/name
      /sys/kernel/debug/sched/domains/cpu0/domain0/name:SMT
      /sys/kernel/debug/sched/domains/cpu0/domain1/name:CACHE
      /sys/kernel/debug/sched/domains/cpu0/domain2/name:DIE
      /sys/kernel/debug/sched/domains/cpu8/domain0/name:SMT
      /sys/kernel/debug/sched/domains/cpu8/domain1/name:CACHE
      /sys/kernel/debug/sched/domains/cpu8/domain2/name:DIE
      /sys/kernel/debug/sched/domains/cpu16/domain0/name:SMT
      /sys/kernel/debug/sched/domains/cpu16/domain1/name:DIE
    
    When removing PowerPC,POWER9@8, we see that its cache nodes are left
    behind:
    
      $ ls -1d */
      l2-cache@2010/
      l2-cache@2011/
      l3-cache@3110/
      l3-cache@3111/
      PowerPC,POWER9@0/
    
    When DLPAR is combined with VM migration, we can get duplicate nodes. E.g.
    removing one processor, then migrating, adding a processor, and then
    migrating again can result in warnings from the OF core during
    post-migration device tree updates:
    
      Duplicate name in cpus, renamed to "l2-cache@2011#1"
      Duplicate name in cpus, renamed to "l3-cache@3111#1"
    
    and nodes with duplicated phandles in the tree, making lookup behavior
    unpredictable:
    
      $ lsprop l[23]-cache@*/ibm,phandle
      l2-cache@2010/ibm,phandle
                       00002010 (8208)
      l2-cache@2011#1/ibm,phandle
                       00002011 (8209)
      l2-cache@2011/ibm,phandle
                       00002011 (8209)
      l3-cache@3110/ibm,phandle
                       00003110 (12560)
      l3-cache@3111#1/ibm,phandle
                       00003111 (12561)
      l3-cache@3111/ibm,phandle
                       00003111 (12561)
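
    A check for this condition can be sketched in a few lines. This is an
    illustrative userland script over the values listed above, not something
    the kernel or this patch does:

```python
from collections import defaultdict

# (node name, ibm,phandle) pairs copied from the lsprop output above.
NODES = [
    ("l2-cache@2010", 0x2010),
    ("l2-cache@2011#1", 0x2011),
    ("l2-cache@2011", 0x2011),
    ("l3-cache@3110", 0x3110),
    ("l3-cache@3111#1", 0x3111),
    ("l3-cache@3111", 0x3111),
]


def duplicate_phandles(nodes):
    """Group node names by phandle and keep only phandles claimed twice or more."""
    by_phandle = defaultdict(list)
    for name, phandle in nodes:
        by_phandle[phandle].append(name)
    return {ph: names for ph, names in by_phandle.items() if len(names) > 1}


for phandle, names in duplicate_phandles(NODES).items():
    print(f"{phandle:#x}: {', '.join(names)}")
```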
    
    Address these issues by:
    
    * Correctly processing siblings of the node returned from
      dlpar_configure_connector().
    * Removing cache nodes in the CPU remove path when it can be determined
      that they are not associated with other CPUs or caches.
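
    The second point amounts to a reference check along the CPU's cache
    chain. A minimal sketch of that idea over a toy dictionary follows; the
    function name, the data layout, and the shared-L2 tree are hypothetical
    and do not mirror the kernel implementation:

```python
def caches_to_remove(next_cache, phandle_to_node, cpu):
    """Return cache nodes that become unreferenced once `cpu` is removed.

    next_cache maps each node name to the phandle in its "l2-cache"
    property (None if absent); phandle_to_node maps phandles to names.
    """
    doomed = [cpu]
    phandle = next_cache[cpu]
    while phandle is not None:
        cache = phandle_to_node[phandle]
        # A cache stays if any node that is not itself being removed uses it.
        still_used = any(
            ref == phandle
            for node, ref in next_cache.items()
            if node not in doomed
        )
        if still_used:
            break
        doomed.append(cache)
        phandle = next_cache[cache]
    return doomed[1:]


# Dedicated-mode tree from the example above: each CPU has private caches.
DEDICATED = {
    "PowerPC,POWER9@0": 0x2010, "PowerPC,POWER9@8": 0x2011,
    "l2-cache@2010": 0x3110, "l2-cache@2011": 0x3111,
    "l3-cache@3110": None, "l3-cache@3111": None,
}
PHANDLES = {
    0x2010: "l2-cache@2010", 0x2011: "l2-cache@2011",
    0x3110: "l3-cache@3110", 0x3111: "l3-cache@3111",
}
# Hypothetical tree in which two CPUs share one L2, to show the stop case.
SHARED = {"cpu-a": 0x2010, "cpu-b": 0x2010,
          "l2-cache@2010": 0x3110, "l3-cache@3110": None}

print(caches_to_remove(DEDICATED, PHANDLES, "PowerPC,POWER9@8"))
print(caches_to_remove(SHARED, PHANDLES, "cpu-a"))
```

    The walk stops at the first cache that is still referenced, since every
    level beyond it is then still in use as well.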
    
    Use the of_changeset API in both cases, which allows us to keep the error
    handling in this code from becoming more complex while ensuring that the
    device tree cannot become inconsistent.
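
    The all-or-nothing property can be illustrated with a toy model. The
    class below is a hypothetical userland sketch of the idea (queue the
    operations, apply them together, undo on failure); it is not the kernel's
    of_changeset implementation, and l3-cache@3112 is an invented node name:

```python
class Changeset:
    """Queue attach/detach operations and apply them as one unit."""

    def __init__(self, tree):
        self.tree = tree          # set of node names, modified in place
        self.actions = []

    def attach(self, name):
        self.actions.append(("attach", name))

    def detach(self, name):
        self.actions.append(("detach", name))

    def apply(self):
        """Apply every queued action; on any failure undo the ones done."""
        done = []
        for op, name in self.actions:
            present = name in self.tree
            if (op == "attach" and present) or (op == "detach" and not present):
                # Roll back in reverse order so the tree is left unchanged.
                for prev_op, prev_name in reversed(done):
                    if prev_op == "attach":
                        self.tree.remove(prev_name)
                    else:
                        self.tree.add(prev_name)
                return False
            if op == "attach":
                self.tree.add(name)
            else:
                self.tree.remove(name)
            done.append((op, name))
        return True


tree = {"PowerPC,POWER9@0", "l2-cache@2010", "l3-cache@3110"}

# Adding a CPU together with its cache nodes succeeds as a unit.
add = Changeset(tree)
for node in ("PowerPC,POWER9@10", "l2-cache@2012", "l3-cache@3112"):
    add.attach(node)
print(add.apply(), sorted(tree))

# A changeset that hits a duplicate leaves the tree exactly as it was.
before = set(tree)
bad = Changeset(tree)
bad.attach("PowerPC,POWER9@18")
bad.attach("l2-cache@2010")      # already present, so the whole set fails
print(bad.apply(), tree == before)
```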
    
    Fixes: ac713800 ("powerpc/pseries: Add CPU dlpar remove functionality")
    Fixes: 90edf184 ("powerpc/pseries: Add CPU dlpar add functionality")
    Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
    Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>
    Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20210927201933.76786-2-nathanl@linux.ibm.com