• Kees Cook's avatar
    fortify: Detect struct member overflows in memcpy() at compile-time · f68f2ff9
    Kees Cook authored
    memcpy() is dead; long live memcpy()
    
    tl;dr: In order to eliminate a large class of common buffer overflow
    flaws that continue to persist in the kernel, have memcpy() (under
    CONFIG_FORTIFY_SOURCE) perform bounds checking of the destination struct
    member when they have a known size. This would have caught all of the
    memcpy()-related buffer write overflow flaws identified in at least the
    last three years.
    
    Background and analysis:
    
    While stack-based buffer overflow flaws are largely mitigated by stack
    canaries (and similar) features, heap-based buffer overflow flaws continue
    to regularly appear in the kernel. Many classes of heap buffer overflows
    are mitigated by FORTIFY_SOURCE when using the strcpy() family of
    functions, but a significant number remain exposed through the memcpy()
    family of functions.
    
    At its core, FORTIFY_SOURCE uses the compiler's __builtin_object_size()
    internal[0] to determine the available size at a target address based on
    the compile-time known structure layout details. It operates in two
    modes: outer bounds (0) and inner bounds (1). In mode 0, the size of the
    enclosing structure is used. In mode 1, the size of the specific field
    is used. For example:
    
    	struct object {
    		u16 scalar1;	/* 2 bytes */
    		char array[6];	/* 6 bytes */
    		u64 scalar2;	/* 8 bytes */
    		u32 scalar3;	/* 4 bytes */
    		u32 scalar4;	/* 4 bytes */
    	} instance;
    
    __builtin_object_size(instance.array, 0) == 22, since the remaining size
    of the enclosing structure starting from "array" is 22 bytes (6 + 8 +
    4 + 4).
    
    __builtin_object_size(instance.array, 1) == 6, since the remaining size
    of the specific field "array" is 6 bytes.
    
    The initial implementation of FORTIFY_SOURCE used mode 0 because there
    were many cases of both strcpy() and memcpy() functions being used to
    write (or read) across multiple fields in a structure. For example,
    it would catch this, which is writing 2 bytes beyond the end of
    "instance":
    
    	memcpy(&instance.array, data, 25);
    
    While this didn't protect against overwriting adjacent fields in a given
    structure, it would at least stop overflows from reaching beyond the
    end of the structure into neighboring memory, and provided a meaningful
    mitigation of a subset of buffer overflow flaws. However, many desirable
    targets remain within the enclosing structure (for example function
    pointers).
    
    As it happened, there were very few cases of strcpy() family functions
    intentionally writing beyond the end of a string buffer. Once all known
    cases were removed from the kernel, the strcpy() family was tightened[1]
    to use mode 1, providing greater mitigation coverage.
    
    What remains is switching memcpy() to mode 1 as well, but making the
    switch is much more difficult because of how frustrating it can be to
    find existing "normal" uses of memcpy() that expect to write (or read)
    across multiple fields. The root cause of the problem is that the C
    language lacks a common pattern to indicate the intent of an author's
    use of memcpy(), and is further complicated by the available compile-time
    and run-time mitigation behaviors.
    
    The FORTIFY_SOURCE mitigation comes in two halves: the compile-time half,
    when both the buffer size _and_ the length of the copy is known, and the
    run-time half, when only the buffer size is known. If neither size is
    known, there is no bounds checking possible. At compile-time when the
    compiler sees that a length will always exceed a known buffer size,
    a warning can be deterministically emitted. For the run-time half,
    the length is tested against the known size of the buffer, and the
    overflowing operation is detected. (The performance overhead for these
    tests is virtually zero.)
    
    It is relatively easy to find compile-time false-positives since a warning
    is always generated. Fixing the false positives, however, can be very
    time-consuming as there are hundreds of instances. While it's possible
    some over-read conditions could lead to kernel memory exposures, the bulk
    of the risk comes from the run-time flaws where the length of a write
    may end up being attacker-controlled and lead to an overflow.
    
    Many of the compile-time false-positives take a form similar to this:
    
    	memcpy(&instance.scalar2, data, sizeof(instance.scalar2) +
    					sizeof(instance.scalar3));
    
    and the run-time ones are similar, but lack a constant expression for the
    size of the copy:
    
    	memcpy(instance.array, data, length);
    
    The former is meant to cover multiple fields (though its style has been
    frowned upon more recently), but has been technically legal. Both lack
    any expressivity in the C language about the author's _intent_ in a way
    that a compiler can check when the length isn't known at compile time.
    A comment doesn't work well because what's needed is something a compiler
    can directly reason about. Is a given memcpy() call expected to overflow
    into neighbors? Is it not? By using the new struct_group() macro, this
    intent can be much more easily encoded.
    
    It is not as easy to find the run-time false-positives since the code path
    to exercise a seemingly out-of-bounds condition that is actually expected
    may not be trivially reachable. Tightening the restrictions to block an
    operation for a false positive will either potentially create a greater
    flaw (if a copy is truncated by the mitigation), or destabilize the kernel
    (e.g. with a BUG()), making things completely useless for the end user.
    
    As a result, tightening the memcpy() restriction (when there is a
    reasonable level of uncertainty of the number of false positives), needs
    to first WARN() with no truncation. (Though any sufficiently paranoid
    end-user can always opt to set the panic_on_warn=1 sysctl.) Once enough
    development time has passed, the mitigation can be further intensified.
    (Note that this patch is only the compile-time checking step, which is
    a prerequisite to doing run-time checking, which will come in future
    patches.)
    
    Given the potential frustrations of weeding out all the false positives
    when tightening the run-time checks, it is reasonable to wonder if these
    changes would actually add meaningful protection. Looking at just the
    last three years, there are 23 identified flaws with a CVE that mention
    "buffer overflow", and 11 are memcpy()-related buffer overflows.
    
    (For the remaining 12: 7 are array index overflows that would be
    mitigated by systems built with CONFIG_UBSAN_BOUNDS=y: CVE-2019-0145,
    CVE-2019-14835, CVE-2019-14896, CVE-2019-14897, CVE-2019-14901,
    CVE-2019-17666, CVE-2021-28952. 2 are miscalculated allocation
    sizes which could be mitigated with memory tagging: CVE-2019-16746,
    CVE-2019-2181. 1 is an iovec buffer bug maybe mitigated by memory tagging:
    CVE-2020-10742. 1 is a type confusion bug mitigated by stack canaries:
    CVE-2020-10942. 1 is a string handling logic bug with no mitigation I'm
    aware of: CVE-2021-28972.)
    
    At my last count on an x86_64 allmodconfig build, there are 35,294
    calls to memcpy(). With callers instrumented to report all places
    where the buffer size is known but the length remains unknown (i.e. a
    run-time bounds check is added), we can count how many new run-time
    bounds checks are added when the destination and source arguments of
    memcpy() are changed to use "mode 1" bounds checking: 1,276. This means
    for the future run-time checking, there is a worst-case upper bounds
    of 3.6% false positives to fix. In addition, there were around 150 new
    compile-time warnings to evaluate and fix (which have now been fixed).
    
    With this instrumentation it's also possible to compare the places where
    the known 11 memcpy() flaw overflows manifested against the resulting
    list of potential new run-time bounds checks, as a measure of potential
    efficacy of the tightened mitigation. Much to my surprise, horror, and
    delight, all 11 flaws would have been detected by the newly added run-time
    bounds checks, making this a distinctly clear mitigation improvement: 100%
    coverage for known memcpy() flaws, with a possible 2 orders of magnitude
    gain in coverage over existing but undiscovered run-time dynamic length
    flaws (i.e. 1265 newly covered sites in addition to the 11 known), against
    only <4% of all memcpy() callers maybe gaining a false positive run-time
    check, with only about 150 new compile-time instances needing evaluation.
    
    Specifically these would have been mitigated:
    CVE-2020-24490 https://git.kernel.org/linus/a2ec905d1e160a33b2e210e45ad30445ef26ce0e
    CVE-2020-12654 https://git.kernel.org/linus/3a9b153c5591548612c3955c9600a98150c81875
    CVE-2020-12653 https://git.kernel.org/linus/b70261a288ea4d2f4ac7cd04be08a9f0f2de4f4d
    CVE-2019-14895 https://git.kernel.org/linus/3d94a4a8373bf5f45cf5f939e88b8354dbf2311b
    CVE-2019-14816 https://git.kernel.org/linus/7caac62ed598a196d6ddf8d9c121e12e082cac3a
    CVE-2019-14815 https://git.kernel.org/linus/7caac62ed598a196d6ddf8d9c121e12e082cac3a
    CVE-2019-14814 https://git.kernel.org/linus/7caac62ed598a196d6ddf8d9c121e12e082cac3a
    CVE-2019-10126 https://git.kernel.org/linus/69ae4f6aac1578575126319d3f55550e7e440449
    CVE-2019-9500  https://git.kernel.org/linus/1b5e2423164b3670e8bc9174e4762d297990deff
    no-CVE-yet     https://git.kernel.org/linus/130f634da1af649205f4a3dd86cbe5c126b57914
    no-CVE-yet     https://git.kernel.org/linus/d10a87a3535cce2b890897914f5d0d83df669c63
    
    To accelerate the review of potential run-time false positives, it's
    also worth noting that it is possible to partially automate checking
    by examining the memcpy() buffer argument to check for the destination
    struct member having a neighboring array member. It is reasonable to
    expect that the vast majority of run-time false positives would look like
    the already evaluated and fixed compile-time false positives, where the
    most common pattern is neighboring arrays. (And, FWIW, many of the
    compile-time fixes were actual bugs, so it is reasonable to assume we'll
    have similar cases of actual bugs getting fixed for run-time checks.)
    
    Implementation:
    
    Tighten the memcpy() destination buffer size checking to use the actual
    ("mode 1") target buffer size as the bounds check instead of their
    enclosing structure's ("mode 0") size. Use a common inline for memcpy()
    (and memmove() in a following patch), since all the tests are the
    same. All new cross-field memcpy() uses must use the struct_group() macro
    or similar to target a specific range of fields, so that FORTIFY_SOURCE
    can reason about the size and safety of the copy.
    
    For now, cross-member "mode 1" _read_ detection at compile-time will be
    limited to W=1 builds, since it is, unfortunately, very common. As the
    priority is solving write overflows, read overflows will be part of a
    future phase (and can be fixed in parallel, for anyone wanting to look
    at W=1 build output).
    
    For run-time, the "mode 0" size checking and mitigation is left unchanged,
    with "mode 1" to be added in stages. In this patch, no new run-time
    checks are added. Future patches will first bounds-check writes,
    and only perform a WARN() for now. This way any missed run-time false
    positives can be flushed out over the coming several development cycles,
    but system builders who have tested their workloads to be WARN()-free
    can enable the panic_on_warn=1 sysctl to immediately gain a mitigation
    against this class of buffer overflows. Once that is under way, run-time
    bounds-checking of reads can be similarly enabled.
    
    Related classes of flaws that will remain unmitigated:
    
    - memcpy() with flexible array structures, as the compiler does not
      currently have visibility into the size of the trailing flexible
      array. These can be fixed in the future by refactoring such cases
      to use a new set of flexible array structure helpers to perform the
      common serialization/deserialization code patterns doing allocation
      and/or copying.
    
    - memcpy() with raw pointers (e.g. void *, char *, etc), or otherwise
      having their buffer size unknown at compile time, have no good
      mitigation beyond memory tagging (and even that would only protect
      against inter-object overflow, not intra-object neighboring field
      overflows), or refactoring. Some kind of "fat pointer" solution is
      likely needed to gain proper size-of-buffer awareness. (e.g. see
      struct membuf)
    
    - type confusion where a higher level type's allocation size does
      not match the resulting cast type eventually passed to a deeper
      memcpy() call where the compiler cannot see the true type. In
      theory, greater static analysis could catch these, and the use
      of -Warray-bounds will help find some of these.
    
    [0] https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html
    [1] https://git.kernel.org/linus/6a39e62abbafd1d58d1722f40c7d26ef379c6a2fSigned-off-by: default avatarKees Cook <keescook@chromium.org>
    f68f2ff9
write_overflow_field-memcpy.c 138 Bytes