• Jann Horn's avatar
    sched/fair: Don't free p->numa_faults with concurrent readers · 48046e09
    Jann Horn authored
    commit 16d51a59 upstream.
    
    When going through execve(), zero out the NUMA fault statistics instead of
    freeing them.
    
    During execve, the task is reachable through procfs and the scheduler. A
    concurrent /proc/*/sched reader can read data from a freed ->numa_faults
    allocation (confirmed by KASAN) and write it back to userspace.
    I believe that it would also be possible for a use-after-free read to occur
    through a race between a NUMA fault and execve(): task_numa_fault() can
    lead to task_numa_compare(), which invokes task_weight() on the currently
    running task of a different CPU.
    
    Another way to fix this would be to make ->numa_faults RCU-managed or add
    extra locking, but it seems easier to wipe the NUMA fault statistics on
    execve.
    Signed-off-by: default avatarJann Horn <jannh@google.com>
    Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Petr Mladek <pmladek@suse.com>
    Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Will Deacon <will@kernel.org>
    Fixes: 82727018 ("sched/numa: Call task_numa_free() from do_execve()")
    Link: https://lkml.kernel.org/r/20190716152047.14424-1-jannh@google.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
    Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
    48046e09
exec.c 47 KB