    perf sched replay: Fix the EMFILE error caused by the limitation of the maximum open files · 939cda52
    Yunlong Song authored
    The soft maximum number of open files for a calling process is 1024,
    which is defined as INR_OPEN_CUR in include/uapi/linux/fs.h, and the
    hard maximum number of open files for a calling process is 4096, which
    is defined as INR_OPEN_MAX in include/uapi/linux/fs.h.
    
    INR_OPEN_CUR and INR_OPEN_MAX are used as the default soft and hard
    values of RLIMIT_NOFILE in include/asm-generic/resource.h.
    
    The soft limit is the one that actually bounds the number of files a
    process may have open.
    
    That is to say, a process can use at most 1024 file descriptors for its
    opened files; any further open fails with an EMFILE error.
    
    This error can be fixed by raising the soft limit. The soft limit can
    not exceed the hard limit, so going beyond the hard limit means raising
    both the soft and the hard limit at the same time, which requires
    privilege.
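    
    As a rough illustration of the unprivileged case, the sketch below
    raises only the soft limit with getrlimit()/setrlimit() and refuses to
    go past the hard limit. The helper name raise_soft_nofile is
    hypothetical and is not taken from perf.

```c
#include <errno.h>
#include <sys/resource.h>

/*
 * Hypothetical helper (not part of perf): raise the soft RLIMIT_NOFILE
 * to at least `wanted` without touching the hard limit.
 * Returns 0 on success, -1 with errno set on failure.
 */
int raise_soft_nofile(rlim_t wanted)
{
	struct rlimit lim;

	if (getrlimit(RLIMIT_NOFILE, &lim) == -1)
		return -1;
	if (lim.rlim_cur >= wanted)
		return 0;		/* already high enough */
	if (wanted > lim.rlim_max) {
		errno = EPERM;		/* would need a higher hard limit */
		return -1;
	}
	lim.rlim_cur = wanted;
	return setrlimit(RLIMIT_NOFILE, &lim);
}
```

    With the limits shown in the example below (soft 1024, hard 4096),
    raise_soft_nofile(4096) would be expected to succeed without
    privilege, while raise_soft_nofile(8192) would fail with EPERM.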
    
    perf sched replay uses sys_perf_event_open to create a file descriptor
    for each task in order to handle the perf event information of that
    task.
    
    That is to say, each task needs its own file descriptor. On x86_64,
    perf.data may record more than 1024, or even more than 4096, tasks, so
    the replay process runs out of file descriptors.
    
    As a result, an EMFILE error occurs and stops the replay. To solve this
    problem, we adaptively increase the soft and hard maximum number of
    open files when the '-f' option is given.
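    
    The adaptive idea can be sketched as a retry loop: when opening a
    descriptor fails with EMFILE and forcing is enabled, grow the limit by
    the number of descriptors still needed and try again. This is a sketch
    under assumptions, not the patch itself: grow_nofile and open_task_fds
    are hypothetical names, and open("/dev/null") stands in for
    sys_perf_event_open so that the example runs anywhere.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/resource.h>
#include <unistd.h>

/*
 * Hypothetical helper: grow RLIMIT_NOFILE by `extra` descriptors.
 * The hard limit is raised as well when the new soft limit exceeds it,
 * which fails with EPERM for an unprivileged process.
 */
int grow_nofile(rlim_t extra)
{
	struct rlimit lim;

	if (getrlimit(RLIMIT_NOFILE, &lim) == -1)
		return -1;
	if (lim.rlim_cur == RLIM_INFINITY)
		return -1;		/* nothing left to raise */
	lim.rlim_cur += extra;
	if (lim.rlim_cur > lim.rlim_max)
		lim.rlim_max = lim.rlim_cur;
	return setrlimit(RLIMIT_NOFILE, &lim);
}

/*
 * Open one descriptor per task. On EMFILE, and only if `force` is set,
 * grow the limit by the number of tasks still to be opened and retry.
 * Returns the number of descriptors opened (nr_tasks on full success).
 */
int open_task_fds(int *fds, int nr_tasks, bool force)
{
	int i;

	for (i = 0; i < nr_tasks; i++) {
		for (;;) {
			/* stand-in for sys_perf_event_open() */
			fds[i] = open("/dev/null", O_RDONLY);
			if (fds[i] >= 0)
				break;
			if (errno != EMFILE || !force)
				return i;
			if (grow_nofile((rlim_t)(nr_tasks - i)) == -1)
				return i;
		}
	}
	return nr_tasks;
}
```

    Growing the limit only after the first EMFILE, and only by the number
    of tasks still missing, keeps the limit unchanged for small traces.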
    
    Example:
    
    Test environment: x86_64 with 160 cores
    
     $ cat /proc/sys/kernel/pid_max
     163840
     $ cat /proc/sys/fs/file-max
     6815744
     $ ulimit -Sn
     1024
     $ ulimit -Hn
     4096
    
    Before this patch:
    
     $ perf sched replay
     ...
     task   1549 (             :163132:    163132), nr_events: 1
     task   1550 (             :163540:    163540), nr_events: 1
     task   1551 (           <unknown>:         0), nr_events: 10
     Error: sys_perf_event_open() syscall returned with -1 (Too many open
     files)
    
    After this patch:
    
     $ perf sched replay
     ...
     task   1549 (             :163132:    163132), nr_events: 1
     task   1550 (             :163540:    163540), nr_events: 1
     task   1551 (           <unknown>:         0), nr_events: 10
     Error: sys_perf_event_open() syscall returned with -1 (Too many open
     files)
     Have a try with -f option
    
     $ perf sched replay -f
     ...
     task   1549 (             :163132:    163132), nr_events: 1
     task   1550 (             :163540:    163540), nr_events: 1
     task   1551 (           <unknown>:         0), nr_events: 10
     ------------------------------------------------------------
     #1  : 54.401, ravg: 54.40, cpu: 3285.21 / 3285.21
     #2  : 199.548, ravg: 68.92, cpu: 4999.65 / 3456.66
     #3  : 170.483, ravg: 79.07, cpu: 1349.94 / 3245.99
     #4  : 192.034, ravg: 90.37, cpu: 1322.88 / 3053.67
     #5  : 182.929, ravg: 99.62, cpu: 1406.51 / 2888.96
     #6  : 152.974, ravg: 104.96, cpu: 1167.54 / 2716.82
     #7  : 155.579, ravg: 110.02, cpu: 2992.53 / 2744.39
     #8  : 130.557, ravg: 112.08, cpu: 1126.43 / 2582.59
     #9  : 138.520, ravg: 114.72, cpu: 1253.22 / 2449.65
     #10 : 134.328, ravg: 116.68, cpu: 1587.95 / 2363.48
    
    Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Cc: Wang Nan <wangnan0@huawei.com>
    Link: http://lkml.kernel.org/r/1427809596-29559-8-git-send-email-yunlong.song@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
builtin-sched.c 45 KB