    io_uring: add support for pre-mapped user IO buffers · edafccee
    Jens Axboe authored
    If we have fixed user buffers, we can map them into the kernel when we
    set up the io_uring. That avoids the need to do get_user_pages() for
    each and every IO.
    
    To use this feature, the application must call io_uring_register()
    after having set up an io_uring instance, passing in
    IORING_REGISTER_BUFFERS as the opcode. The argument must be a pointer to
    an iovec array, and nr_args must contain the number of iovecs the
    application wishes to map.
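
As a rough illustration of the call described above, here is a minimal sketch that issues the raw system call. The helper name register_fixed_buffers() is hypothetical, and the fallback syscall number and opcode value are assumptions taken from mainline headers; include <linux/io_uring.h> instead where it is available.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/uio.h>
#include <unistd.h>

/* Assumed fallback values, matching mainline headers on most architectures. */
#ifndef __NR_io_uring_register
#define __NR_io_uring_register 427
#endif
#ifndef IORING_REGISTER_BUFFERS
#define IORING_REGISTER_BUFFERS 0
#endif

/*
 * Hypothetical helper: register nr iovecs as fixed buffers on ring_fd.
 * ring_fd is the descriptor obtained when the io_uring was set up.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int register_fixed_buffers(int ring_fd, const struct iovec *iov,
                                  unsigned nr)
{
    long ret = syscall(__NR_io_uring_register, ring_fd,
                       IORING_REGISTER_BUFFERS, iov, nr);

    return ret < 0 ? -1 : 0;
}
```

A caller would fill the iovec array with the base address and length of each buffer and pass the array length as nr.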
    
    If successful, these buffers are now mapped into the kernel, eligible
    for IO. To use these fixed buffers, the application must use the
    IORING_OP_READ_FIXED and IORING_OP_WRITE_FIXED opcodes, and then
    set sqe->index to the desired buffer index. The range
    sqe->addr..sqe->addr+sqe->len must lie within the indexed buffer.
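
The range constraint above can be checked in userspace before queueing a request. The following is a minimal sketch, not the kernel's own check; fixed_range_ok() is a hypothetical helper and takes the iovec that was registered at the chosen index.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: true if [addr, addr + len) lies entirely inside
 * the registered buffer described by buf.
 */
static bool fixed_range_ok(const struct iovec *buf, uint64_t addr,
                           uint32_t len)
{
    uint64_t base = (uint64_t)(uintptr_t)buf->iov_base;
    uint64_t size = (uint64_t)buf->iov_len;
    uint64_t off;

    if (addr < base)
        return false;

    /* Compare offsets rather than end addresses so nothing can overflow. */
    off = addr - base;
    return off <= size && len <= size - off;
}
```

For a 4096-byte buffer, a 512-byte request at offset 1024 passes, while the same request at offset 4000 is rejected.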
    
    The application may register buffers throughout the lifetime of the
    io_uring instance. It can call io_uring_register() with
    IORING_UNREGISTER_BUFFERS as the opcode to unregister the current set of
    buffers, and then register a new set. The application need not
    unregister buffers explicitly before shutting down the io_uring
    instance.
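
The unregister-then-register sequence described above might look like the sketch below. replace_fixed_buffers() is a hypothetical helper; the fallback syscall number and opcode values are assumptions taken from mainline headers. The sketch assumes a buffer set is currently registered and simply reports failure otherwise.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/uio.h>
#include <unistd.h>

/* Assumed fallback values, matching mainline headers on most architectures. */
#ifndef __NR_io_uring_register
#define __NR_io_uring_register 427
#endif
#ifndef IORING_REGISTER_BUFFERS
#define IORING_REGISTER_BUFFERS 0
#endif
#ifndef IORING_UNREGISTER_BUFFERS
#define IORING_UNREGISTER_BUFFERS 1
#endif

/*
 * Hypothetical helper: drop the current fixed buffer set on ring_fd and
 * register a new one. Returns 0 on success, -1 with errno set on failure.
 */
static int replace_fixed_buffers(int ring_fd, const struct iovec *iov,
                                 unsigned nr)
{
    /* Unregistration takes no argument: pass NULL and a count of 0. */
    if (syscall(__NR_io_uring_register, ring_fd,
                IORING_UNREGISTER_BUFFERS, NULL, 0) < 0)
        return -1;

    if (syscall(__NR_io_uring_register, ring_fd,
                IORING_REGISTER_BUFFERS, iov, nr) < 0)
        return -1;

    return 0;
}
```

Note that between the two calls no fixed buffers are registered, so the application should not have fixed IO in flight while swapping sets.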
    
    It's perfectly valid to set up a larger buffer, and then sometimes only
    use parts of it for an IO. As long as the range is within the originally
    mapped region, it will work just fine.
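
One way to use only parts of a larger buffer is to carve it into equally sized slots and point each IO at one slot. This is a minimal sketch of that idea; fixed_slot_addr() and the slot layout are hypothetical and not part of the interface.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: address of slot number `slot`, each slot being
 * slot_len bytes, inside one registered buffer. Returns 0 if the slot
 * would not fit completely inside the buffer.
 */
static uint64_t fixed_slot_addr(const struct iovec *buf, size_t slot,
                                size_t slot_len)
{
    if (slot_len == 0 || slot >= buf->iov_len / slot_len)
        return 0;

    return (uint64_t)(uintptr_t)buf->iov_base + (uint64_t)slot * slot_len;
}
```

The returned value would go into sqe->addr with slot_len as the length, alongside the index of the registered buffer.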
    
    For now, buffers must not be file-backed. If file-backed buffers are
    passed in, the registration will fail with -1/EOPNOTSUPP. This
    restriction may be relaxed in the future.
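
To stay clear of the file-backed restriction, an application can allocate its buffers from anonymous memory. The sketch below shows one way to do that with mmap(); alloc_anon_buffer() is a hypothetical helper, and ordinary heap memory would be another option.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: allocate len bytes of private anonymous memory
 * (no file behind it) and describe it in *out, ready to be placed in
 * the iovec array used for registration.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int alloc_anon_buffer(size_t len, struct iovec *out)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return -1;

    out->iov_base = p;
    out->iov_len = len;
    return 0;
}
```

The memory should be released with munmap() only after the buffers have been unregistered or the io_uring instance has been shut down.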
    
    RLIMIT_MEMLOCK is used to check how much memory we can pin. A somewhat
    arbitrary limit of 1G per buffer is also imposed.
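
An application can estimate up front whether a buffer set is likely to pass these limits. The sketch below is an approximation under stated assumptions: the kernel accounts pinned memory in pages, so a byte-level sum can be slightly optimistic, and both helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/resource.h>
#include <sys/uio.h>

/* The 1G per-buffer cap mentioned in the text. */
#define MAX_FIXED_BUF_BYTES (1ULL << 30)

/*
 * Hypothetical helper: true if every buffer is at most 1G and the sum of
 * all lengths stays within memlock_bytes.
 */
static bool buffers_fit(const struct iovec *iov, unsigned nr,
                        uint64_t memlock_bytes)
{
    uint64_t total = 0;

    for (unsigned i = 0; i < nr; i++) {
        uint64_t len = iov[i].iov_len;

        if (len > MAX_FIXED_BUF_BYTES)
            return false;
        total += len;
        if (total > memlock_bytes)
            return false;
    }
    return true;
}

/* Hypothetical helper: current soft RLIMIT_MEMLOCK in bytes, 0 on error. */
static uint64_t memlock_soft_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
        return 0;
    return rl.rlim_cur == RLIM_INFINITY ? UINT64_MAX : (uint64_t)rl.rlim_cur;
}
```

Calling buffers_fit(iov, nr, memlock_soft_limit()) before registration gives an early hint; the registration call itself remains the authoritative check.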

    Reviewed-by: Hannes Reinecke <hare@suse.com>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>