MDEV-32268: GNU libc posix_fallocate() may be extremely slow
os_file_set_size(): Let us invoke the Linux system call fallocate(2) directly, because the GNU libc posix_fallocate() implements a fallback that writes to the file 1 byte every 4096 or fewer bytes. In one environment, invoking fallocate() directly would lead to 4 times the file growth rate during ALTER TABLE. Presumably, what happened was that the NFS server used a smaller allocation block size than 4096 bytes and therefore created a heavily fragmented sparse file when posix_fallocate() was used. For example, extending a file by 4 MiB would create 1,024 file fragments. When the file is actually being written to with data, it would be "unsparsed". The built-in EOPNOTSUPP fallback in os_file_set_size() writes a buffer of 1 MiB of NUL bytes. This was always used on musl libc and other Linux implementations of posix_fallocate().
Showing
Please register or sign in to comment