Commit c51534e8 authored by Kirill Smelkov

Run each testcase with its own /tmp and /dev/shm

and detect leaked temporary files and mount entries after each test run.

Background

Currently we have several testing-related problems that are
all connected to /tmp and similar directories:

Problem 1: many tests create temporary files for each run. Usually
tests are careful to remove them on teardown, but due to bugs, the many
kinds of tests, test processes being hard-killed (SIGKILL or SIGSEGV) and
other reasons, in practice this cleanup does not work 100% reliably and
there is a steady growth of files leaked on /tmp on testnodes.

Problem 2: due to using shared /tmp and /dev/shm, the isolation
between different test runs of potentially different users is not
strong. For example @jerome reports that due to leakage of faketime's
shared segments, separate test runs affect each other and fail:
https://erp5.nexedi.net/bug_module/20211125-1C8FE17

Problem 3: many tests depend on /tmp being a tmpfs instance. These are, for
example, wendelin.core tests, which write intensively to the database
and, if /tmp resides on disk, time out due to disk IO stalls in fsync
on every commit. The stalls can exceed 30s and lead to ~2.5x overall
slowdown for test runs. However the main problem is spikes of increased
latency which, with close to 100% probability, render some test
as missing its deadline. This topic is covered in
https://erp5.com/group_section/forum/Using-tmpfs-for--tmp-on-testnodes-JTocCtJjOd

--------

There are many ways to try to address each problem separately, but they
all come with limitations and drawbacks. We discussed things with @tomo
and @jerome, and it looks like all those problems can be addressed
in one go if we run tests under user namespaces with private mounts for
/tmp and /dev/shm.

Even though namespaces are generally a no-go in Nexedi, they seem to be ok
to use in tests. For example they are already used via the private_tmpfs
option in SlapOS:

https://lab.nexedi.com/nexedi/slapos/blob/1876c150/slapos/recipe/librecipe/execute.py#L87-103
https://lab.nexedi.com/nexedi/slapos/blob/1876c150/software/neoppod/instance-neo-input-schema.json#L121-124
https://lab.nexedi.com/nexedi/slapos/blob/1876c150/software/neoppod/instance-neo.cfg.in#L11-16
https://lab.nexedi.com/nexedi/slapos/blob/1876c150/software/neoppod/instance-neo.cfg.in#L30-34
https://lab.nexedi.com/nexedi/slapos/blob/1876c150/software/neoppod/instance-neo.cfg.in#L170-177
...
https://lab.nexedi.com/nexedi/slapos/blob/1876c150/stack/erp5/instance-zope.cfg.in#L227-230

Thomas says that using private tmpfs for each test would be a better
solution than implementing tmpfs for the whole /tmp on testnodes. He also
reports that @jp is OK with using namespaces for tests as long as there is a
fallback if namespaces aren't available.

-> So let's do that: teach nxdtest to run each test case in its own
private environment with privately-mounted /tmp and /dev/shm if we can
detect that user namespaces are available. In an environment where user
namespaces are indeed available this addresses all 3 problems because
isolation and being-tmpfs are there by design, and even if some files
leak, the kernel will free everything when the test terminates and the
filesystem is automatically unmounted. We also detect such leakage and
report a warning so that such problems do not go completely unnoticed.
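
The leak report can be pictured with a minimal sketch. It assumes the scan is a plain directory walk over the root of a freshly mounted tmpfs, so anything found there was left behind by the test. `find_leaks` is a hypothetical helper name used only for illustration; the message format mirrors the `# leaked ...` lines that the tests in this commit look for.

```python
import os
import shutil
import stat
import tempfile


def find_leaks(root):
    """Return report lines for everything found under root.

    Hypothetical helper: root is assumed to be an initially empty
    directory (e.g. a private tmpfs), so every entry counts as leaked.
    """
    report = []
    for d, dirs, files in os.walk(root):
        if d != root:
            if os.stat(d).st_mode & stat.S_ISVTX:
                # sticky directories are reported separately, not as leaks
                report.append("# found sticky %s/" % d)
            else:
                report.append("# leaked %s/" % d)
        for f in sorted(files):
            report.append("# leaked %s" % os.path.join(d, f))
    return report


if __name__ == "__main__":
    # demo on a throwaway directory instead of a real tmpfs mount
    root = tempfile.mkdtemp(prefix="leakdemo.")
    try:
        open(os.path.join(root, "stale.lock"), "w").close()
        for line in find_leaks(root):
            print(line)
    finally:
        shutil.rmtree(root)
```

The scan only reports; it does not delete anything, since in the real setup the kernel discards the whole tmpfs when the namespace goes away.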

Implementation

We leverage unshare(1) for simplicity. I decided to preserve uid/gid
instead of becoming uid=0 (= `unshare -Umr`) for better traceability, so
that it is clear from test output under which real slapuser a test is
run(*). Not changing uid requires activating ambient capabilities so
that mounting filesystems, including the FUSE-based ones needed by
wendelin.core, continues to work under a regular non-zero uid. Please see
https://git.kernel.org/linus/58319057b784 for details on this topic. And
please refer to the added trun.py for details on how the per-test namespace
is set up.

Using FUSE inside user namespaces requires Linux >= 4.18 (see
https://git.kernel.org/linus/da315f6e0398 and
https://git.kernel.org/linus/8cb08329b080), so if we are really to use
this patch we'll have to upgrade the kernel on our testnodes, at least where
wendelin.core is used in tests.

"no namespaces" detection is implemented via first running `unshare ...
true` with the same unshare options that are going to be used to create
and enter the new user namespace for real. If that fails, we fall back to
"no namespaces" mode where no private /tmp and /dev/shm are mounted(%).

(*) for example nxdtest logs information about the system on startup:

    date:   Mon, 29 Nov 2021 17:27:04 MSK
    xnode:  slapuserX@test.node
    ...

(%) Here is how nxdtest is run in fallback mode on my Debian 11 with
    user namespaces disabled via `sysctl kernel.unprivileged_userns_clone=0`:

    (neo) (z-dev) (g.env) kirr@deca:~/src/wendelin/nxdtest$ nxdtest
    date:   Thu, 02 Dec 2021 14:04:30 MSK
    xnode:  kirr@deca.navytux.spb.ru
    uname:  Linux deca 5.10.0-9-amd64 #1 SMP Debian 5.10.70-1 (2021-09-30) x86_64
    cpu:    Intel(R) Core(TM) i7-7600U CPU @ 2.80GHz

    >>> pytest
    $ python -m pytest
    # user namespaces not available. isolation and many checks will be deactivated.    <--- NOTE
    ===================== test session starts ======================
    platform linux2 -- Python 2.7.18, pytest-4.6.11, py-1.10.0, pluggy-0.13.1
    rootdir: /home/kirr/src/wendelin/nxdtest
    plugins: timeout-1.4.2
    collected 23 items

    nxdtest/nxdtest_pylint_test.py ....                      [ 17%]
    nxdtest/nxdtest_pytest_test.py ...                       [ 30%]
    nxdtest/nxdtest_test.py ......xx                         [ 65%]
    nxdtest/nxdtest_unittest_test.py ........                [100%]

    ============= 21 passed, 2 xfailed in 2.67 seconds =============
    ok      pytest  3.062s  # 23t 0e 0f 0s
    # ran 1 test case:  1·ok
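
The fallback shown in this log is triggered by a probe that runs `unshare ... true` first. A minimal sketch of such a probe follows, written for Python 3 and assuming the `-Umc --keep-caps` options that trun.py uses; `userns_available` is a hypothetical name, not a function from the patch.

```python
import subprocess

# Options as used in the commit: -U new user namespace, -m new mount
# namespace, -c map the current user, --keep-caps retain capabilities.
UNSHARE_OPTS = ["-Umc", "--keep-caps"]


def userns_available(opts=UNSHARE_OPTS):
    """Return True if `unshare <opts> true` succeeds, False otherwise.

    Hypothetical helper. A failure can mean that the kernel refuses to
    create user namespaces, that unshare(1) is missing, or that it is
    too old to know one of the options.
    """
    try:
        subprocess.check_call(["unshare"] + list(opts) + ["true"],
                              stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL)
    except (OSError, subprocess.CalledProcessError):
        return False
    return True


if __name__ == "__main__":
    if userns_available():
        print("# user namespaces available")
    else:
        print("# user namespaces not available")
```

Probing with exactly the options that will be used later means an outdated unshare(1) is treated the same way as a kernel without user namespace support, which keeps the fallback decision in one place.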

/helped-by @tomo, @jerome
parent 4fe9ee16
@@ -60,11 +60,15 @@ from subprocess import Popen, PIPE
 from time import time, sleep, strftime, gmtime, localtime
 import os, sys, argparse, logging, traceback, re, pwd, socket
 from errno import ESRCH, EPERM
+from os.path import dirname
 import six
 from golang import b, defer, func, select, default
 from golang import context, sync
 import psutil

+# trun.py is a helper via which we run tests.
+trun = "%s/trun.py" % dirname(__file__)
+
 # loadNXDTestFile loads .nxdtest file located @path.
 def loadNXDTestFile(path): # -> TestEnv
     t = TestEnv()
@@ -249,7 +253,7 @@ def main():
             # TODO session -> cgroup, because a child process could create another new session.
             def newsession():
                 os.setsid()
-            p = Popen(t.argv, env=env, stdin=devnull, stdout=PIPE, stderr=PIPE, bufsize=0, preexec_fn=newsession, **kw)
+            p = Popen([sys.executable, trun] + t.argv, env=env, stdin=devnull, stdout=PIPE, stderr=PIPE, bufsize=0, preexec_fn=newsession, **kw)
         except:
             stdout, stderr = b'', b(traceback.format_exc())
             bstderr.write(stderr)
......
@@ -19,10 +19,14 @@
 # verify general functionality

+import os
 import sys
 import re
 import time
-from os.path import dirname
+import tempfile
+import shutil
+import subprocess
+from os.path import dirname, exists, devnull
 from golang import chan, select, default, func, defer
 from golang import context, sync
@@ -52,6 +56,44 @@ def run_nxdtest(tmpdir):
     return _run_nxdtest

+# run all tests twice:
+# 1) with user namespaces disabled,
+# 2) with user namespaces potentially enabled.
+@pytest.fixture(autouse=True, params=('userns_disabled', 'userns_default'))
+def with_and_without_userns(tmp_path, monkeypatch, request):
+    if request.param == 'userns_disabled':
+        if request.node.get_closest_marker("userns_only"):
+            pytest.skip("test is @userns_only")
+
+        with open(str(tmp_path / 'unshare'), 'w') as f:
+            f.write('#!/bin/sh\nexit 1')
+        os.chmod(f.name, 0o755)
+        monkeypatch.setenv("PATH", str(tmp_path), prepend=os.pathsep)
+
+    else:
+        assert request.param == 'userns_default'
+        request.node.add_marker(
+                pytest.mark.xfail(not userns_works,
+                    reason="this functionality needs user-namespaces to work"))
+
+# @userns_only marks test as requiring user-namespaces to succeed.
+try:
+    with open(devnull, 'w') as null:
+        # since trun uses unshare(1) instead of direct system calls, use all
+        # those unshare options used by trun to verify that we indeed have
+        #
+        # 1) userns support from kernel, and
+        # 2) recent enough unshare that won't fail due to "unknown option".
+        #
+        # change this back to plain `unshare -U` when/if trun is reworked to
+        # use system calls directly.
+        subprocess.check_call(['unshare', '-Umc', '--keep-caps', 'true'], stdout=null, stderr=null)
+except (OSError, subprocess.CalledProcessError):
+    userns_works = False
+else:
+    userns_works = True
+userns_only = pytest.mark.userns_only
+

 def test_main(run_nxdtest, capsys):
     run_nxdtest(
         """\
@@ -68,7 +110,7 @@ TestCase('TESTNAME', ['echo', 'TEST OUPUT'])
     assert re.match(u"# ran 1 test case: 1·ok", output_lines[-1])


-def test_error_invoking_command(run_nxdtest, capsys):
+def test_command_does_not_exist(run_nxdtest, capsys):
     run_nxdtest(
         """\
 TestCase('TESTNAME', ['not exist command'])
@@ -76,7 +118,21 @@ TestCase('TESTNAME', ['not exist command'])
     )

     captured = capsys.readouterr()
-    assert "No such file or directory" in captured.err
+    assert 'Traceback' not in captured.out
+    assert 'Traceback' not in captured.err
+    assert captured.err == "not exist command: No such file or directory\n"
+
+
+def test_command_exit_with_non_zero(run_nxdtest, capsys):
+    run_nxdtest(
+        """\
+TestCase('TESTNAME', ['false'])
+"""
+    )
+
+    captured = capsys.readouterr()
+    assert 'Traceback' not in captured.out
+    assert 'Traceback' not in captured.err


 def test_error_invoking_summary(run_nxdtest, capsys):
@@ -165,3 +221,41 @@ TestCase('TEST_WITH_PROCLEAK', ['%s', 'AAA', 'BBB', 'CCC'])
     assert "AAA: terminating" in captured.out
     assert "BBB: terminating" in captured.out
     assert "CCC: terminating" in captured.out
+
+
+# verify that files leaked on /tmp are detected.
+@userns_only
+@func
+def test_run_tmpleak(run_nxdtest, capsys):
+    xtouch = "%s/testprog/xtouch" % (dirname(__file__),)
+    tmpd = tempfile.mkdtemp("", "nxdtest-leak.", "/tmp")
+    def _():
+        shutil.rmtree(tmpd)
+    defer(_)
+
+    tmpleakv = list('%s/%d' % (tmpd, i) for i in range(10))
+    for f in tmpleakv:
+        assert not exists(f)
+
+    run_nxdtest(
+        """
+TestCase('TESTCASE', ['%s'] + %r)
+""" % (xtouch, tmpleakv,)
+    )
+
+    captured = capsys.readouterr()
+    for f in tmpleakv:
+        assert ("# leaked %s" % f) in captured.out
+        assert not exists(f)
+
+
+# verify that leaked mounts are detected.
+@userns_only
+def test_run_mountleak(run_nxdtest, capsys):
+    run_nxdtest(
+        """
+TestCase('TESTCASE', ['mount', '-t', 'tmpfs', 'none', '/etc'])
+""")
+
+    captured = capsys.readouterr()
+    assert "# leaked mount: none /etc tmpfs" in captured.out
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Copyright (C) 2021 Nexedi SA and Contributors.
#
# This program is free software: you can Use, Study, Modify and Redistribute
# it under the terms of the GNU General Public License version 3, or (at your
# option) any later version, as published by the Free Software Foundation.
#
# You can also Link and Combine this program with other software covered by
# the terms of any of the Free Software licenses or any of the Open Source
# Initiative approved licenses and Convey the resulting work. Corresponding
# source of such a combination shall include the source code for all other
# software used.
#
# This program is distributed WITHOUT ANY WARRANTY; without even the implied
# warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# See COPYING file for full licensing terms.
# See https://www.nexedi.com/licensing for rationale and options.
"""Program xtouch helps to verify that nxdtest detects files leaked on /tmp.

It is similar to touch(1), but creates leading directories automatically.
It also always exits with non-zero status to simulate failure.
"""

from __future__ import print_function, absolute_import

import os, sys
from os.path import dirname
from errno import EEXIST

def main():
    for f in sys.argv[1:]:
        mkdir_p(dirname(f))
        with open(f, "a"):
            pass
    sys.exit(1)

# mkdir_p mimics `mkdir -p`
def mkdir_p(path):
    try:
        os.makedirs(path)
    except OSError as e:
        if e.errno != EEXIST:
            raise

if __name__ == '__main__':
    main()
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Copyright (C) 2021 Nexedi SA and Contributors.
#
# This program is free software: you can Use, Study, Modify and Redistribute
# it under the terms of the GNU General Public License version 3, or (at your
# option) any later version, as published by the Free Software Foundation.
#
# You can also Link and Combine this program with other software covered by
# the terms of any of the Free Software licenses or any of the Open Source
# Initiative approved licenses and Convey the resulting work. Corresponding
# source of such a combination shall include the source code for all other
# software used.
#
# This program is distributed WITHOUT ANY WARRANTY; without even the implied
# warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# See COPYING file for full licensing terms.
# See https://www.nexedi.com/licensing for rationale and options.
""" `trun ...` - run test specified by `...`

The test is run in dedicated environment, which, after test completes, is
checked for leaked files, leaked mount entries, etc.

The environment is activated only if user namespaces are available(*).
If user namespaces are not available, the test is still run but without the checks.

(*) see https://man7.org/linux/man-pages/man7/user_namespaces.7.html
"""

from __future__ import print_function, absolute_import

import errno, os, sys, stat, difflib
from subprocess import check_call as xrun, CalledProcessError
from os.path import join, devnull
from golang import func, defer

def main():
    # Try to respawn ourselves in user-namespace where we can mount things, e.g. new /tmp.
    # Keep current uid/gid the same for better traceability. In other words current user
    # stays the same. Activate ambient capabilities(*) so that mounting filesystems,
    # including FUSE-based ones for wendelin.core, still works under regular non-zero uid.
    #
    # (*) see https://man7.org/linux/man-pages/man7/capabilities.7.html
    #     and git.kernel.org/linus/58319057b784.
    in_userns = True
    mypid = str(os.getpid())
    _ = os.environ.get("_NXDTEST_TRUN_RESPAWNED", "")
    if mypid != _:
        uargv = ["-Umc", "--keep-caps"] # NOTE keep this in sync with @userns_only in nxdtest_test.py
        try:
            # check if user namespaces are available
            with open(devnull, "w") as null:
                xrun(["unshare"] + uargv + ["true"], stdout=null, stderr=null)
        except (OSError, CalledProcessError):
            in_userns = False
            print("# user namespaces not available. isolation and many checks will be deactivated.")
        else:
            os.environ["_NXDTEST_TRUN_RESPAWNED"] = mypid
            os.execvp("unshare", ["unshare"] + uargv + [sys.executable] + sys.argv)
            raise AssertionError("unreachable")

    # either respawned in new namespace, or entered here without respawn with in_userns=n.
    # run the test via corresponding driver.
    run = run_in_userns if in_userns else run_no_userns
    def _():
        try:
            xrun(sys.argv[1:])
        except OSError as e:
            if e.errno != errno.ENOENT:
                raise
            #print(e.strerror, file=sys.stderr) # e.strerror does not include filename on py2
            print("%s: %s" % (sys.argv[1], os.strerror(e.errno)), # e.filename is also ø on py2
                  file=sys.stderr)
            sys.exit(127)
        except CalledProcessError as e:
            sys.exit(e.returncode)
    run(_)

# run_in_userns runs f with checks assuming that we are in a user namespace.
@func
def run_in_userns(f):
    # mount new /tmp and /dev/shm to isolate this run from other programs and to detect
    # leaked temporary files at the end.
    tmpreg = {
        "/tmp": [], # mountpoint -> extra options
        "/dev/shm": []
    }
    for tmp, optv in tmpreg.items():
        xrun(["mount", "-t", "tmpfs", "none", tmp] + optv)

    # in the end: check file leakage on /tmp and friends.
    def _():
        for root in tmpreg:
            for d, dirs, files in os.walk(root):
                if d != root:
                    st = os.stat(d)
                    if st.st_mode & stat.S_ISVTX:
                        # sticky wcfs/ alike directories are used as top of registry for
                        # multiple users. It is kind of normal not to delete such
                        # directories by default.
                        print("# found sticky %s/" % d)
                    else:
                        print("# leaked %s/" % d)
                for f in files:
                    print("# leaked %s" % join(d, f))
    defer(_)

    # in the end: check fstab changes.
    fstab_before = mounts()
    def _():
        fstab_after = mounts()
        for d in difflib.ndiff(fstab_before, fstab_after):
            if d.startswith("- "):
                print("# gone mount: %s" % d[2:])
            if d.startswith("+ "):
                print("# leaked mount: %s" % d[2:])
    defer(_)

    # run the test
    f()

# run_no_userns runs f assuming that we are not in a user namespace.
def run_no_userns(f):
    f()

# mounts returns current mount entries.
def mounts(): # -> []str
    return readfile("/proc/mounts").split('\n')

# readfile returns content of file @path.
def readfile(path): # -> str
    with open(path, "r") as f:
        return f.read()

if __name__ == '__main__':
    main()
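
The mount check in trun.py compares two snapshots of /proc/mounts with `difflib.ndiff`. The standalone sketch below shows how the `- ` and `+ ` prefixes of ndiff output map to gone and leaked mounts. `diff_mounts` is a hypothetical name and the sample mount lines are made up for illustration.

```python
import difflib


def diff_mounts(before, after):
    """Classify mount-table changes between two snapshots.

    Hypothetical helper: before/after are lists of /proc/mounts lines.
    Returns (leaked, gone): entries present only in after, and entries
    present only in before. ndiff's "? " hint lines are ignored.
    """
    leaked, gone = [], []
    for d in difflib.ndiff(before, after):
        if d.startswith("- "):
            gone.append(d[2:])
        elif d.startswith("+ "):
            leaked.append(d[2:])
    return leaked, gone


if __name__ == "__main__":
    # made-up snapshots: the test mounted a tmpfs on /etc and left it there
    before = ["proc /proc proc rw 0 0",
              "none /tmp tmpfs rw 0 0"]
    after = before + ["none /etc tmpfs rw,relatime 0 0"]
    leaked, gone = diff_mounts(before, after)
    for m in gone:
        print("# gone mount: %s" % m)
    for m in leaked:
        print("# leaked mount: %s" % m)
```

A line-based diff is enough here because the snapshots are taken inside the private mount namespace, so any difference comes from the test itself.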