    mm: numa: Add pte updates, hinting and migration stats · 03c5a6e1
    Mel Gorman authored
    It is tricky to quantify the basic cost of automatic NUMA placement in a
    meaningful manner. This patch adds some vmstats that can be used as part
    of a basic costing model.
    
    u    = basic unit = sizeof(void *)
    Ca   = cost of struct page access = sizeof(struct page) / u
    Cpte = Cost PTE access = Ca
    Cupdate = Cost PTE update = (2 * Cpte) + (2 * Wlock)
    	where Cpte is incurred twice for a read and a write and Wlock
    	is a constant representing the cost of taking or releasing a
    	lock
    Cnumahint = Cost of a minor page fault = some high constant e.g. 1000
    Cpagerw = Cost to read or write a full page = Ca + PAGE_SIZE/u
    Ci = Cost of page isolation = Ca + Wi
    	where Wi is a constant that should reflect the approximate cost
    	of the locking operation
    Cpagecopy = Cpagerw + (Cpagerw * Wnuma) + Ci + (Ci * Wnuma)
    	where Wnuma is the approximate NUMA factor. 1 is local. 1.2
    	would imply that remote accesses are 20% more expensive
    
    Balancing cost = Cpte * numa_pte_updates +
    		Cnumahint * numa_hint_faults +
    		Ci * numa_pages_migrated +
    		Cpagecopy * numa_pages_migrated
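
The costing model above can be evaluated mechanically once the three counters are known. The following is a minimal sketch, not part of the patch. It follows the formulas exactly as written, including the use of Cpte rather than Cupdate in the first term. The default values for sizeof(void *), sizeof(struct page), PAGE_SIZE, Wi and Wnuma are illustrative assumptions for a typical 64-bit system, and the names `CostParams` and `balancing_cost` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CostParams:
    # All defaults are assumptions for illustration, not values from the patch.
    u: int = 8                   # basic unit, sizeof(void *) on a 64-bit system
    struct_page_size: int = 64   # assumed sizeof(struct page)
    page_size: int = 4096        # assumed PAGE_SIZE
    w_i: float = 1.0             # Wi: approximate cost of the isolation locking
    w_numa: float = 1.2          # Wnuma: NUMA factor, 1.0 means local
    c_numahint: float = 1000.0   # Cnumahint: "some high constant"


def balancing_cost(p, numa_pte_updates, numa_hint_faults, numa_pages_migrated):
    """Evaluate the balancing cost formula from the commit message."""
    ca = p.struct_page_size / p.u          # Ca: cost of struct page access
    cpte = ca                              # Cpte = Ca
    cpagerw = ca + p.page_size / p.u       # Cpagerw: read or write a full page
    ci = ca + p.w_i                        # Ci: cost of page isolation
    cpagecopy = cpagerw + cpagerw * p.w_numa + ci + ci * p.w_numa
    return (cpte * numa_pte_updates
            + p.c_numahint * numa_hint_faults
            + ci * numa_pages_migrated
            + cpagecopy * numa_pages_migrated)


if __name__ == "__main__":
    # Counter values here are made up; real ones would come from /proc/vmstat.
    print(balancing_cost(CostParams(), 100, 10, 5))
```

The result is in abstract units of u, so it is only meaningful when comparing runs that use the same parameters.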
    
    Note that numa_pages_migrated is used as a measure of how many pages
    were isolated even though it would miss pages that failed to migrate. A
    vmstat counter could have been added for it but the isolation cost is
    pretty marginal in comparison to the overall cost so it seemed overkill.
    
    The ideal way to measure automatic placement benefit would be to count
    the number of remote accesses versus local accesses and do something like
    
    	benefit = (remote_accesses_before - remote_accesses_after) * Wnuma
    
    but the information is not readily available. As a workload converges, the
    expectation would be that the number of remote numa hints would reduce to 0.
    
    	convergence = numa_hint_faults_local / numa_hint_faults
    		where this is measured for the last N number of
    		numa hints recorded. When the workload is fully
    		converged the value is 1.
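
One way to approximate the convergence ratio from userspace is to take two snapshots of the counters and divide the deltas. This is a sketch under assumptions: it windows by snapshot interval rather than by the last N hints, it assumes the counters appear as `name value` lines in the style of /proc/vmstat, and the helper names `parse_vmstat` and `convergence` are hypothetical.

```python
def parse_vmstat(text):
    """Parse 'name value' lines into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def convergence(before, after):
    """Ratio of local to total hinting faults between two snapshots.

    Returns None when no hinting faults occurred in the interval,
    since the ratio is undefined in that case.
    """
    total = after["numa_hint_faults"] - before["numa_hint_faults"]
    local = after["numa_hint_faults_local"] - before["numa_hint_faults_local"]
    if total <= 0:
        return None
    return local / total


if __name__ == "__main__":
    # Sample text stands in for two reads of /proc/vmstat taken some time apart.
    first = parse_vmstat("numa_hint_faults 100\nnuma_hint_faults_local 40\n")
    second = parse_vmstat("numa_hint_faults 300\nnuma_hint_faults_local 190\n")
    print(convergence(first, second))
```

A value approaching 1 over successive intervals suggests the workload is converging; how quickly it approaches 1 gives the rate.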
    
    This can be used to measure whether the placement policy is converging
    and how quickly it does so.

    Signed-off-by: Mel Gorman <mgorman@suse.de>
    Acked-by: Rik van Riel <riel@redhat.com>