    sched/topology: Remove the EM_MAX_COMPLEXITY limit · 5b77261c
    Pierre Gondois authored
    The Energy Aware Scheduler (EAS) estimates the energy consumption
    of placing a task on different CPUs. The goal is to minimize this
    energy consumption. Estimating the energy of different task placements
    becomes more expensive as the size of the platform grows.
    
    To avoid having a slow wake-up path, EAS is only enabled if this
    complexity is low enough.
    
    The current complexity limit was set in:
    
      b68a4c0d ("sched/topology: Disable EAS on inappropriate platforms")
    
    ... based on the first implementation of EAS, which was re-computing
    the power of the whole platform for each task placement scenario, see:
    
      390031e4 ("sched/fair: Introduce an energy estimation helper function")
    
    ... but the complexity of EAS was reduced in:
    
      eb92692b ("sched/fair: Speed-up energy-aware wake-ups")
    
    ... and the find_energy_efficient_cpu() (feec) algorithm was updated in:
    
      3e8c6c9a ("sched/fair: Remove task_util from effective utilization in feec()")
    
    find_energy_efficient_cpu() (feec) now does:
    
    	feec()
    	\_ for_each_pd(pd) [0]
    	  // get max_spare_cap_cpu and compute_prev_delta
    	  \_ for_each_cpu(pd) [1]
    
    	  \_ eenv_pd_busy_time(pd) [2]
    		\_ for_each_cpu(pd)
    
    	  // compute_energy(pd) without the task
    	  \_ eenv_pd_max_util(pd, -1) [3.0]
    	    \_ for_each_cpu(pd)
    	  \_ em_cpu_energy(pd, -1)
    	    \_ for_each_ps(pd)
    
    	  // compute_energy(pd) with the task on prev_cpu
    	  \_ eenv_pd_max_util(pd, prev_cpu) [3.1]
    	    \_ for_each_cpu(pd)
    	  \_ em_cpu_energy(pd, prev_cpu)
    	    \_ for_each_ps(pd)
    
    	  // compute_energy(pd) with the task on max_spare_cap_cpu
    	  \_ eenv_pd_max_util(pd, max_spare_cap_cpu) [3.2]
    	    \_ for_each_cpu(pd)
    	  \_ em_cpu_energy(pd, max_spare_cap_cpu)
    	    \_ for_each_ps(pd)
    
    	[3.1] happens only once since prev_cpu is unique. With the same
    	      definitions for nr_pd, nr_cpus and nr_ps, the complexity is:
    
    		nr_pd * (2 * [nr_cpus in pd] + 2 * ([nr_cpus in pd] + [nr_ps in pd]))
    		+ ([nr_cpus in pd] + [nr_ps in pd])
    
    		 [0]  * (     [1] + [2]      +       [3.0] + [3.2]                  )
    		+ [3.1]
    
    		= nr_pd * (4 * [nr_cpus in pd] + 2 * [nr_ps in pd])
    		+ [nr_cpus in prev pd] + nr_ps
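
    As a cross-check of the formula above, here is a minimal sketch that
    counts loop iterations step by step, following the [0]..[3.2] labels of
    the call tree. It is an illustration only, not kernel code: the function
    name and the (nr_cpus, nr_ps) tuple layout are hypothetical, and the
    trailing nr_ps term is assumed to be the number of performance states
    of the performance domain that contains prev_cpu.

```python
from typing import List, Tuple


def feec_complexity(pds: List[Tuple[int, int]], prev_pd: int) -> int:
    """Count loop iterations of feec() as described in the call tree.

    pds     -- hypothetical layout: one (nr_cpus, nr_ps) tuple per performance domain
    prev_pd -- index of the performance domain that contains prev_cpu
    """
    total = 0
    for nr_cpus, nr_ps in pds:        # [0]   for_each_pd(pd)
        total += nr_cpus              # [1]   max_spare_cap_cpu / compute_prev_delta
        total += nr_cpus              # [2]   eenv_pd_busy_time(pd)
        total += nr_cpus + nr_ps      # [3.0] energy without the task
        total += nr_cpus + nr_ps      # [3.2] energy with the task on max_spare_cap_cpu
    prev_cpus, prev_ps = pds[prev_pd]
    total += prev_cpus + prev_ps      # [3.1] happens once, for prev_cpu's domain only
    return total


if __name__ == "__main__":
    # Two domains: 4 CPUs with 5 performance states, 2 CPUs with 10.
    print(feec_complexity([(4, 5), (2, 10)], prev_pd=1))
```

    Counting per step rather than using the closed form makes it easy to see
    which term each part of the call tree contributes.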
    
    The complexity limit was set to 2048 in:
    
      b68a4c0d ("sched/topology: Disable EAS on inappropriate platforms")
    
    ... to make "EAS usable up to 16 CPUs with per-CPU DVFS and less than 8
    performance states each". For the same platform, the complexity would
    now be:
    
      16 * (4 + 2 * 7) + 1 + 7 = 296
    
    Because the EAS complexity has been greatly reduced since the limit
    was introduced, bigger platforms can handle EAS.
    
    For instance, a platform with 112 CPUs with per-CPU DVFS and
    7 performance states each would still stay below the 2048 limit:
    
      112 * (4 + 2 * 7) + 1 + 7 = 2024
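
    The two worked figures can be reproduced with a short sketch of the
    per-CPU DVFS case, where every performance domain holds a single CPU.
    The helper name is hypothetical; the 2048 value is the limit quoted in
    this changelog.

```python
EM_MAX_COMPLEXITY = 2048  # limit value quoted in this changelog


def per_cpu_dvfs_complexity(nr_cpus: int, nr_ps: int) -> int:
    """Complexity when each CPU is its own performance domain.

    With per-CPU DVFS, nr_pd == nr_cpus and [nr_cpus in pd] == 1, so the
    general formula reduces to nr_cpus * (4 + 2 * nr_ps) + 1 + nr_ps.
    """
    return nr_cpus * (4 * 1 + 2 * nr_ps) + 1 + nr_ps


if __name__ == "__main__":
    for n in (16, 112):
        c = per_cpu_dvfs_complexity(n, 7)
        print(f"{n} CPUs, 7 performance states: {c} (below limit: {c < EM_MAX_COMPLEXITY})")
```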
    
    To reflect this improvement in the underlying EAS code, remove
    the EAS complexity check.
    
    Note that a limit on the number of CPUs still holds against
    EM_MAX_NUM_CPUS to avoid overflows during the energy estimation.
    
    [ mingo: Updates to the changelog. ]
    Signed-off-by: Pierre Gondois <Pierre.Gondois@arm.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
    Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
    Link: https://lore.kernel.org/r/20231009060037.170765-2-sshegde@linux.vnet.ibm.com