    mm, memcg: Try charging a page before setting page up to date · eb3c24f3
    Mel Gorman authored
    Historically, memcg overhead was high even when memcg was unused.  This
    has improved a lot, but it still showed up in a profile summary as a
    problem.
    
    /usr/src/linux-4.0-vanilla/mm/memcontrol.c                           6.6441   395842
      mem_cgroup_try_charge                                                        2.950%   175781
      __mem_cgroup_count_vm_event                                                  1.431%    85239
      mem_cgroup_page_lruvec                                                       0.456%    27156
      mem_cgroup_commit_charge                                                     0.392%    23342
      uncharge_list                                                                0.323%    19256
      mem_cgroup_update_lru_size                                                   0.278%    16538
      memcg_check_events                                                           0.216%    12858
      mem_cgroup_charge_statistics.isra.22                                         0.188%    11172
      try_charge                                                                   0.150%     8928
      commit_charge                                                                0.141%     8388
      get_mem_cgroup_from_mm                                                       0.121%     7184
    
    This shows that 6.64% of system CPU cycles were spent in memcontrol.c,
    dominated by mem_cgroup_try_charge.  The annotation shows that the bulk
    of the cost was the PageSwapCache check, which is expected to be cache
    hot but is very expensive.  The problem appears to be that
    __SetPageUptodate, which contains a write barrier, is called just
    before the check.  The barrier is required to make sure the struct page
    and the page data are written before the PTE is updated and the data
    becomes visible to userspace.  memcg charging does not require the
    barrier but gets unfairly hit with its cost, so this patch attempts the
    charging before the barrier (a simplified sketch of the reordering is
    included after the results below).  Aside from removing the accidental
    cost to memcg, there is the added benefit that the barrier is avoided
    entirely if the page cannot be charged.  With the patch applied, the
    relevant profile summary is as follows.
    
    /usr/src/linux-4.0-chargefirst-v2r1/mm/memcontrol.c                  3.7907   223277
      __mem_cgroup_count_vm_event                                                  1.143%    67312
      mem_cgroup_page_lruvec                                                       0.465%    27403
      mem_cgroup_commit_charge                                                     0.381%    22452
      uncharge_list                                                                0.332%    19543
      mem_cgroup_update_lru_size                                                   0.284%    16704
      get_mem_cgroup_from_mm                                                       0.271%    15952
      mem_cgroup_try_charge                                                        0.237%    13982
      memcg_check_events                                                           0.222%    13058
      mem_cgroup_charge_statistics.isra.22                                         0.185%    10920
      commit_charge                                                                0.140%     8235
      try_charge                                                                   0.131%     7716
    
    That brings the overhead down to 3.79%.  It still leaves the memcg
    fault accounting against the root cgroup, but it's an improvement.  The
    difference in headline performance of the page fault microbenchmark is
    marginal, as memcg is such a small component of it.
    
    pft faults
                                           4.0.0                  4.0.0
                                         vanilla            chargefirst
    Hmean    faults/cpu-1 1443258.1051 (  0.00%) 1509075.7561 (  4.56%)
    Hmean    faults/cpu-3 1340385.9270 (  0.00%) 1339160.7113 ( -0.09%)
    Hmean    faults/cpu-5  875599.0222 (  0.00%)  874174.1255 ( -0.16%)
    Hmean    faults/cpu-7  601146.6726 (  0.00%)  601370.9977 (  0.04%)
    Hmean    faults/cpu-8  510728.2754 (  0.00%)  510598.8214 ( -0.03%)
    Hmean    faults/sec-1 1432084.7845 (  0.00%) 1497935.5274 (  4.60%)
    Hmean    faults/sec-3 3943818.1437 (  0.00%) 3941920.1520 ( -0.05%)
    Hmean    faults/sec-5 3877573.5867 (  0.00%) 3869385.7553 ( -0.21%)
    Hmean    faults/sec-7 3991832.0418 (  0.00%) 3992181.4189 (  0.01%)
    Hmean    faults/sec-8 3987189.8167 (  0.00%) 3986452.2204 ( -0.02%)
    
    The gain is only visible in the single-threaded case.  The overhead is
    still there at higher thread counts, but other factors dominate.
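    
    A minimal sketch of the reordering, for illustration only.  The barrier
    comes from __SetPageUptodate(), which is approximately:
    
        static inline void __SetPageUptodate(struct page *page)
        {
                /* Order page-content stores before the uptodate bit and PTE stores. */
                smp_wmb();
                __set_bit(PG_uptodate, &page->flags);
        }
    
    and the reordering in the anonymous fault path (simplified from
    do_anonymous_page() in mm/memory.c, using v4.0-era names and
    signatures; not the literal diff) looks roughly like:
    
        page = alloc_zeroed_user_highpage_movable(vma, address);
        if (!page)
                goto oom;
    
        /* Charge first: if the charge fails, the write barrier is never paid. */
        if (mem_cgroup_try_charge(page, mm, GFP_KERNEL, &memcg))
                goto oom_free_page;
    
        /*
         * The memory barrier inside __SetPageUptodate makes sure that
         * preceding stores to the page contents become visible before
         * the set_pte_at() write.
         */
        __SetPageUptodate(page);
    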
    Signed-off-by: Mel Gorman <mgorman@suse.de>
    Acked-by: Michal Hocko <mhocko@suse.cz>
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Tejun Heo <tj@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>