Commit 133e2d3e authored by Andrei Vagin's avatar Andrei Vagin Committed by Kees Cook

fs/exec: allow to unshare a time namespace on vfork+exec

Right now, a new process can't be forked in another time namespace
if it shares mm with its parent. It is prohibited, because each time
namespace has its own vvar page that is mapped into a process address
space.

When a process calls exec, it gets a new mm and so it could be "legal"
to switch time namespace in that case. This was not implemented and
now if we want to do this, we need to add another clone flag to not
break backward compatibility.

We don't have any user requests to switch times on exec except the
vfork+exec combination, so there is no reason to add a new clone flag.
As for vfork+exec, this should be safe to allow switching timens with
the current clone flag. Right now, vfork (CLONE_VFORK | CLONE_VM) fails
if a child is forked into another time namespace. With this change,
vfork creates a new process in parent's timens, and the following exec
does the actual switch to the target time namespace.
Suggested-by: default avatarFlorian Weimer <fweimer@redhat.com>
Signed-off-by: default avatarAndrei Vagin <avagin@gmail.com>
Acked-by: default avatarChristian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: default avatarKees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220613060723.197407-1-avagin@gmail.com
parent b13baccc
......@@ -65,6 +65,7 @@
#include <linux/io_uring.h>
#include <linux/syscall_user_dispatch.h>
#include <linux/coredump.h>
#include <linux/time_namespace.h>
#include <linux/uaccess.h>
#include <asm/mmu_context.h>
......@@ -982,10 +983,12 @@ static int exec_mmap(struct mm_struct *mm)
{
struct task_struct *tsk;
struct mm_struct *old_mm, *active_mm;
bool vfork;
int ret;
/* Notify parent that we're no longer interested in the old VM */
tsk = current;
vfork = !!tsk->vfork_done;
old_mm = current->mm;
exec_mm_release(tsk, old_mm);
if (old_mm)
......@@ -1030,6 +1033,10 @@ static int exec_mmap(struct mm_struct *mm)
tsk->mm->vmacache_seqnum = 0;
vmacache_flush(tsk);
task_unlock(tsk);
if (vfork)
timens_on_fork(tsk->nsproxy, tsk);
if (old_mm) {
mmap_read_unlock(old_mm);
BUG_ON(active_mm != old_mm);
......
......@@ -2033,8 +2033,11 @@ static __latent_entropy struct task_struct *copy_process(
/*
* If the new process will be in a different time namespace
* do not allow it to share VM or a thread group with the forking task.
*
* On vfork, the child process enters the target time namespace only
* after exec.
*/
if (clone_flags & (CLONE_THREAD | CLONE_VM)) {
if ((clone_flags & (CLONE_VM | CLONE_VFORK)) == CLONE_VM) {
if (nsp->time_ns != nsp->time_ns_for_children)
return ERR_PTR(-EINVAL);
}
......
......@@ -179,6 +179,7 @@ int copy_namespaces(unsigned long flags, struct task_struct *tsk)
if (IS_ERR(new_ns))
return PTR_ERR(new_ns);
if ((flags & CLONE_VM) == 0)
timens_on_fork(new_ns, tsk);
tsk->nsproxy = new_ns;
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment