1. 27 Oct, 2021 2 commits
  2. 22 Oct, 2021 27 commits
  3. 14 Oct, 2021 2 commits
  4. 13 Oct, 2021 3 commits
  5. 12 Oct, 2021 2 commits
  6. 08 Oct, 2021 4 commits
      powerpc/pseries/cpuhp: remove obsolete comment from pseries_cpu_die · f9473a65
      Nathan Lynch authored
      This comment likely refers to the obsolete DLPAR workflow where some
      resource state transitions were driven more directly from user space
      utilities, but it also seems to contradict itself: "Change isolate state to
      Isolate [...]" is at odds with the preceding sentences, and it does not
      relate at all to the code that follows.
      
      Remove it to prevent confusion.
      Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
      Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20210927201933.76786-5-nathanl@linux.ibm.com
      powerpc/pseries/cpuhp: delete add/remove_by_count code · fa2a5dfe
      Nathan Lynch authored
      The core DLPAR code supports two actions (add and remove) and three
      subtypes of action:
      
      * By DRC index: the action is attempted on a single specified resource.
        This is the usual case for processors.
      * By indexed count: the action is attempted on a range of resources
        beginning at the specified index. This is implemented only by the memory
        DLPAR code.
      * By count: the lower layer (CPU or memory) is responsible for locating the
        specified number of resources to which the action can be applied.
      
      I cannot find any evidence of the "by count" subtype being used by drmgr or
      qemu for processors. And when I try to exercise this code, the add case
      does not work:
      
        $ ppc64_cpu --smt ; nproc
        SMT=8
        24
        $ printf "cpu remove count 2" > /sys/kernel/dlpar
        $ nproc
        8
        $ printf "cpu add count 2" > /sys/kernel/dlpar
        -bash: printf: write error: Invalid argument
        $ dmesg | tail -2
        pseries-hotplug-cpu: Failed to find enough CPUs (1 of 2) to add
        dlpar: Could not handle DLPAR request "cpu add count 2"
        $ nproc
        8
        $ drmgr -c cpu -a -q 2         # this uses the by-index method
        Validating CPU DLPAR capability...yes.
        CPU 1
        CPU 17
        $ nproc
        24
      
      This is because find_drc_info_cpus_to_add() does not increment drc_index
      appropriately during its search.
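
      As an illustration only (this is not the kernel's code), the kind of
      search described above can be sketched in userspace C. The struct
      layout, the names find_cpus_to_add() and demo_present(), and the index
      values are all assumptions made for this sketch. The point is the final
      statement of the loop: the candidate index has to advance on every
      iteration, otherwise the same resource is examined repeatedly.

      ```c
      #include <stdbool.h>
      #include <stddef.h>
      #include <stdint.h>

      /* Hypothetical, simplified stand-in for one range of DRC indexes. */
      struct drc_range {
      	uint32_t first_index;	/* DRC index of the first resource */
      	uint32_t num_seq;	/* number of resources in the range */
      	uint32_t seq_inc;	/* step between consecutive indexes */
      };

      /* Hypothetical predicate: is a CPU with this DRC index already present? */
      typedef bool (*present_fn)(uint32_t drc_index);

      /* Demo predicate for this sketch: only the first CPU is present. */
      static bool demo_present(uint32_t drc_index)
      {
      	return drc_index == 0x10000000;
      }

      /*
       * Collect up to 'want' DRC indexes of absent CPUs into 'out'.
       * Returns the number of indexes found.
       */
      static size_t find_cpus_to_add(const struct drc_range *r, present_fn present,
      			       uint32_t *out, size_t want)
      {
      	size_t found = 0;
      	uint32_t drc_index = r->first_index;

      	for (uint32_t i = 0; i < r->num_seq && found < want; i++) {
      		if (!present(drc_index))
      			out[found++] = drc_index;
      		/* The step a search like this must not skip. */
      		drc_index += r->seq_inc;
      	}
      	return found;
      }
      ```

      With a range of three indexes starting at 0x10000000 and a step of 8, a
      request for two CPUs yields the second and third indexes, because the
      first is reported as present.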
      
      This is not hard to fix. But the _by_count() functions also have the
      property that they attempt to roll back all prior operations if the entire
      request cannot be satisfied, even though the rollback itself can encounter
      errors. It's not possible to provide transaction-like behavior at this
      level, and it's undesirable to have code that can only pretend to do that.
      Any users of these functions cannot know what the state of the system is in
      the error case. And the error paths are, to my knowledge, impossible to
      test without adding custom error injection code.
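
      The rollback problem can be shown with a small userspace sketch. It is
      not the removed kernel code: add_by_count(), struct res_ops, the demo
      callbacks and the outcome names are hypothetical. It shows why a
      best-effort undo cannot promise transactional behavior: once an undo
      step fails, the only honest result is "state unknown".

      ```c
      #include <stdbool.h>
      #include <stddef.h>

      /* Hypothetical per-resource operations; a nonzero return means failure. */
      struct res_ops {
      	int (*add)(int id);
      	int (*remove)(int id);
      };

      enum outcome { ALL_ADDED, ROLLED_BACK, STATE_UNKNOWN };

      /* Demo callbacks for this sketch: adding id 3 fails, removing id 1 fails. */
      static int demo_add(int id)    { return id == 3 ? -1 : 0; }
      static int demo_remove(int id) { return id == 1 ? -1 : 0; }

      static enum outcome add_by_count(const struct res_ops *ops,
      				 const int *ids, size_t n)
      {
      	size_t done;
      	bool undo_failed = false;

      	/* Add resources in order, stopping at the first failure. */
      	for (done = 0; done < n; done++)
      		if (ops->add(ids[done]))
      			break;
      	if (done == n)
      		return ALL_ADDED;

      	/* Best-effort undo, newest first; each removal can itself fail. */
      	while (done-- > 0)
      		if (ops->remove(ids[done]))
      			undo_failed = true;

      	return undo_failed ? STATE_UNKNOWN : ROLLED_BACK;
      }
      ```

      With the demo callbacks, a request for ids {1, 2, 3} fails on the third
      add and then fails again while undoing the first, so the caller learns
      only that the request failed and that the system may be partially
      changed.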
      
      Summary:
      
      * This code has not worked reliably since its introduction.
      * There is no evidence that it is used.
      * It contains questionable rollback behaviors in error paths which are
        difficult to test.
      
      So let's remove it.
      
      Fixes: ac713800 ("powerpc/pseries: Add CPU dlpar remove functionality")
      Fixes: 90edf184 ("powerpc/pseries: Add CPU dlpar add functionality")
      Fixes: b015f6bc ("powerpc/pseries: Add cpu DLPAR support for drc-info property")
      Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
      Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>
      Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20210927201933.76786-4-nathanl@linux.ibm.com
      powerpc/cpuhp: BUG -> WARN conversion in offline path · 983f9101
      Nathan Lynch authored
      If, due to bugs elsewhere, we get into unregister_cpu_online() with a CPU
      that isn't marked hotpluggable, we can emit a warning and return an
      appropriate error instead of crashing.
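
      A userspace sketch of the pattern, assuming a minimal stand-in for
      per-CPU state: struct cpu_state, unregister_online() and the choice of
      -EBUSY are illustrative assumptions, not the kernel's definitions. The
      unexpected condition is reported and turned into an error return, so
      the caller can fail the operation while the system keeps running.

      ```c
      #include <errno.h>
      #include <stdbool.h>
      #include <stdio.h>

      /* Hypothetical, minimal stand-in for per-CPU state. */
      struct cpu_state {
      	int id;
      	bool hotpluggable;
      	bool registered;
      };

      /* Returns 0 on success or a negative errno value; never aborts. */
      static int unregister_online(struct cpu_state *c)
      {
      	if (!c->hotpluggable) {
      		/* Warn and bail out instead of halting on the bad state. */
      		fprintf(stderr, "warning: cpu %d can't be offlined\n", c->id);
      		return -EBUSY;	/* error code chosen for this sketch */
      	}
      	c->registered = false;
      	return 0;
      }
      ```

      A CPU that is not hotpluggable is left registered and the caller sees
      the error; a hotpluggable CPU is unregistered as before.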
      Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
      Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20210927201933.76786-3-nathanl@linux.ibm.com
      powerpc/pseries/cpuhp: cache node corrections · 7edd5c9a
      Nathan Lynch authored
      On pseries, cache nodes in the device tree can be added and removed by the
      CPU DLPAR code as well as the partition migration (mobility) code. PowerVM
      partitions in dedicated processor mode typically have L2 and L3 cache
      nodes.
      
      The CPU DLPAR code has the following shortcomings:
      
      * Cache nodes returned as siblings of a new CPU node by
        ibm,configure-connector are silently discarded; only the CPU node is
        added to the device tree.
      
      * Cache nodes which become unreferenced in the processor removal path are
        not removed from the device tree. This can lead to duplicate nodes when
        the post-migration device tree update code replaces cache nodes.
      
      This is long-standing behavior. Presumably it has gone mostly unnoticed
      because the two bugs have the property of obscuring each other in common
      simple scenarios (e.g. remove a CPU and add it back). Likely you'd notice
      only if you cared to inspect the device tree or the sysfs cacheinfo
      information.
      
      Booted with two processors:
      
        $ pwd
        /sys/firmware/devicetree/base/cpus
        $ ls -1d */
        l2-cache@2010/
        l2-cache@2011/
        l3-cache@3110/
        l3-cache@3111/
        PowerPC,POWER9@0/
        PowerPC,POWER9@8/
        $ lsprop */l2-cache
        l2-cache@2010/l2-cache
                       00003110 (12560)
        l2-cache@2011/l2-cache
                       00003111 (12561)
        PowerPC,POWER9@0/l2-cache
                       00002010 (8208)
        PowerPC,POWER9@8/l2-cache
                       00002011 (8209)
        $ ls /sys/devices/system/cpu/cpu0/cache/
        index0  index1  index2  index3
      
      After DLPAR-adding PowerPC,POWER9@10, we see that its associated cache
      nodes are absent, its threads' L2+L3 cacheinfo is unpopulated, and it is
      missing a cache level in its sched domain hierarchy:
      
        $ ls -1d */
        l2-cache@2010/
        l2-cache@2011/
        l3-cache@3110/
        l3-cache@3111/
        PowerPC,POWER9@0/
        PowerPC,POWER9@10/
        PowerPC,POWER9@8/
        $ lsprop PowerPC\,POWER9@10/l2-cache
        PowerPC,POWER9@10/l2-cache
                       00002012 (8210)
        $ ls /sys/devices/system/cpu/cpu16/cache/
        index0  index1
        $ grep . /sys/kernel/debug/sched/domains/cpu{0,8,16}/domain*/name
        /sys/kernel/debug/sched/domains/cpu0/domain0/name:SMT
        /sys/kernel/debug/sched/domains/cpu0/domain1/name:CACHE
        /sys/kernel/debug/sched/domains/cpu0/domain2/name:DIE
        /sys/kernel/debug/sched/domains/cpu8/domain0/name:SMT
        /sys/kernel/debug/sched/domains/cpu8/domain1/name:CACHE
        /sys/kernel/debug/sched/domains/cpu8/domain2/name:DIE
        /sys/kernel/debug/sched/domains/cpu16/domain0/name:SMT
        /sys/kernel/debug/sched/domains/cpu16/domain1/name:DIE
      
      When removing PowerPC,POWER9@8, we see that its cache nodes are left
      behind:
      
        $ ls -1d */
        l2-cache@2010/
        l2-cache@2011/
        l3-cache@3110/
        l3-cache@3111/
        PowerPC,POWER9@0/
      
      When DLPAR is combined with VM migration, we can get duplicate nodes. E.g.
      removing one processor, then migrating, adding a processor, and then
      migrating again can result in warnings from the OF core during
      post-migration device tree updates:
      
        Duplicate name in cpus, renamed to "l2-cache@2011#1"
        Duplicate name in cpus, renamed to "l3-cache@3111#1"
      
      and nodes with duplicated phandles in the tree, making lookup behavior
      unpredictable:
      
        $ lsprop l[23]-cache@*/ibm,phandle
        l2-cache@2010/ibm,phandle
                         00002010 (8208)
        l2-cache@2011#1/ibm,phandle
                         00002011 (8209)
        l2-cache@2011/ibm,phandle
                         00002011 (8209)
        l3-cache@3110/ibm,phandle
                         00003110 (12560)
        l3-cache@3111#1/ibm,phandle
                         00003111 (12561)
        l3-cache@3111/ibm,phandle
                         00003111 (12561)
      
      Address these issues by:
      
      * Correctly processing siblings of the node returned from
        dlpar_configure_connector().
      * Removing cache nodes in the CPU remove path when it can be determined
        that they are not associated with other CPUs or caches.
      
      Use the of_changeset API in both cases, which allows us to keep the error
      handling in this code from becoming more complex while ensuring that the
      device tree cannot become inconsistent.
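
      The removal rule can be sketched in userspace C, assuming a flattened
      array in place of the real device tree. struct node, its fields and the
      function names are hypothetical; the phandle values mirror the examples
      above. Starting from the CPU, each next-level cache is queued for
      removal only if no node that is staying behind still points at it. In
      the kernel, the queued nodes would then be detached together through a
      changeset rather than one at a time.

      ```c
      #include <stdbool.h>
      #include <stddef.h>
      #include <stdint.h>

      /* Hypothetical flattened view of a node under /cpus. */
      struct node {
      	uint32_t phandle;	/* 0 for CPU nodes in this sketch */
      	uint32_t l2_cache;	/* phandle of the next-level cache, 0 if none */
      	bool doomed;		/* queued for removal */
      };

      /* True if any node not already queued for removal points at 'phandle'. */
      static bool still_referenced(const struct node *nodes, size_t n,
      			     uint32_t phandle)
      {
      	for (size_t i = 0; i < n; i++)
      		if (!nodes[i].doomed && nodes[i].l2_cache == phandle)
      			return true;
      	return false;
      }

      static struct node *find_by_phandle(struct node *nodes, size_t n,
      				    uint32_t phandle)
      {
      	for (size_t i = 0; i < n; i++)
      		if (phandle && nodes[i].phandle == phandle)
      			return &nodes[i];
      	return NULL;
      }

      /*
       * Queue 'cpu' and every cache that becomes unreferenced as a result.
       * Returns the number of nodes queued.
       */
      static size_t queue_cpu_removal(struct node *nodes, size_t n,
      				struct node *cpu)
      {
      	size_t queued = 0;
      	struct node *cur = cpu;

      	while (cur) {
      		uint32_t next = cur->l2_cache;

      		cur->doomed = true;
      		queued++;
      		/* Stop at the end of the chain or at a shared cache. */
      		if (!next || still_referenced(nodes, n, next))
      			break;
      		cur = find_by_phandle(nodes, n, next);
      	}
      	return queued;
      }
      ```

      For the two-processor layout shown earlier, removing the CPU whose
      l2-cache property is 0x2011 queues three nodes: the CPU, the L2 cache
      with phandle 0x2011 and the L3 cache with phandle 0x3111. The caches
      used by the other CPU are left alone.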
      
      Fixes: ac713800 ("powerpc/pseries: Add CPU dlpar remove functionality")
      Fixes: 90edf184 ("powerpc/pseries: Add CPU dlpar add functionality")
      Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
      Tested-by: Daniel Henrique Barboza <danielhb413@gmail.com>
      Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20210927201933.76786-2-nathanl@linux.ibm.com