    MDEV-32268: GNU libc posix_fallocate() may be extremely slow · ee1407f7
    Marko Mäkelä authored
    os_file_set_size(): Let us invoke the Linux system call fallocate(2)
    directly, because the GNU libc posix_fallocate() implements a fallback
    that writes 1 byte to the file every 4096 or fewer bytes. In one
    environment, invoking fallocate() directly led to 4 times the
    file growth rate during ALTER TABLE. Presumably, what happened was
    that the NFS server used a smaller allocation block size than 4096 bytes
    and therefore created a heavily fragmented sparse file when
    posix_fallocate() was used. For example, extending a file by 4 MiB
    would create 1,024 file fragments. When data is actually written
    to the file, it would be "unsparsed".
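
The fragment count above follows from dividing the extension length by the write interval. A minimal sketch of that arithmetic, using a simplified model of a "touch one byte per block" emulation; touch_writes() is a hypothetical helper for illustration and is not taken from GNU libc or MariaDB:

```cpp
#include <cassert>
#include <cstdint>

// Simplified model (an assumption, not GNU libc's actual code): the number
// of 1-byte writes needed to touch every `block`-byte block within `len`
// bytes. Each such write may become its own fragment on the server.
constexpr std::uint64_t touch_writes(std::uint64_t len, std::uint64_t block)
{
  assert(block > 0);
  return (len + block - 1) / block;  // round up for a partial last block
}

// Extending by 4 MiB in steps of 4096 bytes gives 1,024 writes.
static_assert(touch_writes(4u << 20, 4096) == 1024, "4 MiB / 4096 bytes");
```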
    
    The built-in EOPNOTSUPP fallback in os_file_set_size() writes a buffer
    of 1 MiB of NUL bytes. This fallback was always used on musl libc
    and other non-GNU Linux implementations of posix_fallocate().
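
The approach described in this message can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in os_file_set_size(): the function name extend_file(), its error convention, and the loop structure are all hypothetical. It assumes Linux, where fallocate() is declared in <fcntl.h> when _GNU_SOURCE is defined (g++ defines it by default).

```cpp
#include <fcntl.h>     // fallocate() (Linux-specific)
#include <sys/stat.h>  // fstat()
#include <unistd.h>    // pwrite()
#include <algorithm>
#include <cassert>
#include <cerrno>
#include <vector>

// Hypothetical helper: extend the file to new_size bytes.
// Returns 0 on success, or an errno value on failure.
int extend_file(int fd, off_t new_size)
{
  assert(new_size >= 0);
  struct stat st;
  if (fstat(fd, &st) != 0) return errno;
  if (st.st_size >= new_size) return 0;  // nothing to extend

  // Invoke fallocate(2) directly. If the file system cannot allocate,
  // it fails with EOPNOTSUPP instead of emulating the allocation.
  int err;
  do {
    err = fallocate(fd, 0, st.st_size, new_size - st.st_size) ? errno : 0;
  } while (err == EINTR);
  if (err != EOPNOTSUPP) return err;  // success, or a genuine error

  // Fallback: append NUL bytes from a 1 MiB buffer.
  const std::vector<char> zeros(1 << 20, '\0');
  for (off_t pos = st.st_size; pos < new_size; ) {
    const size_t len = static_cast<size_t>(
        std::min<off_t>(new_size - pos, static_cast<off_t>(zeros.size())));
    const ssize_t n = pwrite(fd, zeros.data(), len, pos);
    if (n < 0) {
      if (errno == EINTR) continue;  // retry an interrupted write
      return errno;
    }
    if (n == 0) return EIO;  // no progress; avoid looping forever
    pos += n;
  }
  return 0;
}
```

Writing large contiguous chunks in the fallback, rather than one byte per block, avoids creating a sparse file in the first place.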
os0file.cc 179 KB