• Lokesh Gidra's avatar
    userfaultfd: add user-mode only option to unprivileged_userfaultfd sysctl knob · d0d4730a
    Lokesh Gidra authored
    With this change, when the knob is set to 0, it allows unprivileged users
    to call userfaultfd, like when it is set to 1, but with the restriction
    that page faults from only user-mode can be handled.  In this mode, an
    unprivileged user (without SYS_CAP_PTRACE capability) must pass
    UFFD_USER_MODE_ONLY to userfaultd or the API will fail with EPERM.
    
    This enables administrators to reduce the likelihood that an attacker with
    access to userfaultfd can delay faulting kernel code to widen timing
    windows for other exploits.
    
    The default value of this knob is changed to 0.  This is required for
    correct functioning of pipe mutex.  However, this will fail postcopy live
    migration, which will be unnoticeable to the VM guests.  To avoid this,
    set 'vm.userfault = 1' in /sys/sysctl.conf.
    
    The main reason this change is desirable as in the short term is that the
    Android userland will behave as with the sysctl set to zero.  So without
    this commit, any Linux binary using userfaultfd to manage its memory would
    behave differently if run within the Android userland.  For more details,
    refer to Andrea's reply [1].
    
    [1] https://lore.kernel.org/lkml/20200904033438.GI9411@redhat.com/
    
    Link: https://lkml.kernel.org/r/20201120030411.2690816-3-lokeshgidra@google.comSigned-off-by: default avatarLokesh Gidra <lokeshgidra@google.com>
    Reviewed-by: default avatarAndrea Arcangeli <aarcange@redhat.com>
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: Peter Xu <peterx@redhat.com>
    Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Cc: Alexander Viro <viro@zeniv.linux.org.uk>
    Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
    Cc: Eric Biggers <ebiggers@kernel.org>
    Cc: Daniel Colascione <dancol@dancol.org>
    Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
    Cc: Kalesh Singh <kaleshsingh@google.com>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Cc: Jeff Vander Stoep <jeffv@google.com>
    Cc: <calin@google.com>
    Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
    Cc: Shaohua Li <shli@fb.com>
    Cc: Jerome Glisse <jglisse@redhat.com>
    Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Nitin Gupta <nigupta@nvidia.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: Iurii Zaikin <yzaikin@google.com>
    Cc: Luis Chamberlain <mcgrof@kernel.org>
    Cc: Daniel Colascione <dancol@google.com>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    d0d4730a
vm.rst 34.4 KB