    mm: Define struct folio_queue and ITER_FOLIOQ to handle a sequence of folios · db0aa2e9
    David Howells authored
    Define a data structure, struct folio_queue, to represent a sequence of
    folios and a kernel-internal I/O iterator type, ITER_FOLIOQ, to allow a
    list of folio_queue structures to be used to provide a buffer to
    iov_iter-taking functions, such as sendmsg and recvmsg.
    
    The folio_queue structure looks like:
    
    	struct folio_queue {
    		struct folio_batch	vec;
    		u8			orders[PAGEVEC_SIZE];
    		struct folio_queue	*next;
    		struct folio_queue	*prev;
    		unsigned long		marks;
    		unsigned long		marks2;
    	};
    
    It does not use a list_head so that next and/or prev can be set to NULL at
    the ends of the list, allowing iov_iter-handling routines to determine that
    they *are* the ends without needing to store a head pointer in the iov_iter
    struct.
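
    As a rough userspace illustration of that end detection (not the kernel
    code), the sketch below uses a hypothetical stand-in type, fq_node, that
    carries only the two links.  The helper names are invented for the
    example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct folio_queue: only the list links. */
struct fq_node {
	struct fq_node *next;
	struct fq_node *prev;
};

/* A segment is the tail exactly when its next pointer is NULL. */
static int fq_is_last(const struct fq_node *q)
{
	assert(q != NULL);
	return q->next == NULL;
}

/* A segment is the head exactly when its prev pointer is NULL. */
static int fq_is_first(const struct fq_node *q)
{
	assert(q != NULL);
	return q->prev == NULL;
}

/* Count the segments from q to the tail; no head pointer is consulted. */
static size_t fq_count_forward(const struct fq_node *q)
{
	size_t n = 0;

	for (; q; q = q->next)
		n++;
	return n;
}
```

    An iterator holding only a pointer to its current segment can therefore
    tell when it has reached either end of the list.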
    
    A folio_batch struct is used to hold the folio pointers, which allows the
    batch to be passed to batch handling functions.  Two mark bits are
    available per slot.  The intention is to use at least one of them to mark
    folios that need putting, but that might not be ultimately necessary.
    Accessor functions are used to access the slots to do the masking and an
    additional accessor function is used to indicate the size of the array.
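
    A minimal userspace sketch of what such accessors might look like
    follows.  The names (fq_marks, fq_mark() and so on) and the slot count of
    31 are assumptions made for the example, and plain non-atomic bit
    operations are used; kernel code would more likely want atomic bitops.

```c
#include <assert.h>
#include <stdbool.h>

#define FQ_NR_SLOTS 31	/* placeholder for PAGEVEC_SIZE; assumed value */

/* Hypothetical stand-in holding just the two per-slot mark bitmaps. */
struct fq_marks {
	unsigned long marks;
	unsigned long marks2;
};

/* Report the number of slots so callers need not know the array size. */
static unsigned int fq_nr_slots(const struct fq_marks *q)
{
	(void)q;
	return FQ_NR_SLOTS;
}

/* First mark bit: set, clear and test. */
static void fq_mark(struct fq_marks *q, unsigned int slot)
{
	assert(slot < FQ_NR_SLOTS);
	q->marks |= 1UL << slot;
}

static void fq_unmark(struct fq_marks *q, unsigned int slot)
{
	assert(slot < FQ_NR_SLOTS);
	q->marks &= ~(1UL << slot);
}

static bool fq_is_marked(const struct fq_marks *q, unsigned int slot)
{
	assert(slot < FQ_NR_SLOTS);
	return (q->marks & (1UL << slot)) != 0;
}

/* Second mark bit: same operations on the other bitmap. */
static void fq_mark2(struct fq_marks *q, unsigned int slot)
{
	assert(slot < FQ_NR_SLOTS);
	q->marks2 |= 1UL << slot;
}

static bool fq_is_marked2(const struct fq_marks *q, unsigned int slot)
{
	assert(slot < FQ_NR_SLOTS);
	return (q->marks2 & (1UL << slot)) != 0;
}
```

    The two bitmaps are independent, so marking a slot in one says nothing
    about the same slot in the other.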
    
    The order of each folio is also stored in the structure so that
    iov_iter_advance() and iov_iter_revert() do not have to query each folio
    to find its size.
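
    To show why the cached order helps, here is a hedged userspace sketch of
    an advance operation that derives each slot's size from orders[] alone.
    The types fq_seg and fq_pos, the nr field, the 4096-byte page size and
    the slot count are all assumptions made for the example; this is not
    the iov_iter implementation.

```c
#include <assert.h>
#include <stddef.h>

#define FQ_NR_SLOTS	31	/* placeholder for PAGEVEC_SIZE; assumed */
#define FQ_PAGE_SIZE	4096UL	/* assumed page size */

/* Hypothetical stand-in: per-slot orders, slots in use, forward link. */
struct fq_seg {
	unsigned char	orders[FQ_NR_SLOTS];
	unsigned int	nr;
	struct fq_seg	*next;
};

/* Hypothetical iterator position: segment, slot and offset in the slot. */
struct fq_pos {
	const struct fq_seg *seg;
	unsigned int slot;
	size_t offset;
};

/* Size of a slot computed from the cached order; no folio is touched. */
static size_t fq_slot_size(const struct fq_seg *q, unsigned int slot)
{
	assert(slot < q->nr);
	return FQ_PAGE_SIZE << q->orders[slot];
}

/*
 * Advance pos by bytes.  Returns the number of bytes that could not be
 * consumed because the list ran out (pos->seg is then NULL).
 */
static size_t fq_advance(struct fq_pos *pos, size_t bytes)
{
	while (bytes > 0 && pos->seg) {
		if (pos->slot >= pos->seg->nr) {
			/* Segment exhausted: step to the next one. */
			pos->seg = pos->seg->next;
			pos->slot = 0;
			pos->offset = 0;
			continue;
		}

		size_t left = fq_slot_size(pos->seg, pos->slot) - pos->offset;

		if (bytes < left) {
			pos->offset += bytes;
			return 0;
		}
		bytes -= left;
		pos->slot++;
		pos->offset = 0;
	}
	return bytes;
}
```

    A revert operation could walk the prev pointers in the same way,
    subtracting slot sizes instead of adding them.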
    
    With careful barriering, this can be used as an extending buffer with new
    folios inserted and new folio_queue structs added without the need for a
    lock.  Further, provided we always keep at least one struct in the buffer,
    we can also remove consumed folios and consumed structs from the head end
    as we go without the need for locks.
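
    The following is only a sketch of the idea, written with C11 atomics in
    userspace and assuming exactly one producer and one consumer.  The rb and
    rb_seg types, the four-slot segment size and the function names are
    invented for the example, and it stores opaque pointers rather than
    folios.  In the kernel the release/acquire pairs would presumably be
    expressed with smp_store_release() and smp_load_acquire(), but that is
    an assumption.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define RB_NR_SLOTS 4	/* small, assumed segment size for the example */

struct rb_seg {
	void *slots[RB_NR_SLOTS];
	atomic_uint nr;			/* number of slots published */
	_Atomic(struct rb_seg *) next;	/* NULL at the tail */
};

struct rb {
	struct rb_seg *head;		/* consumer-owned */
	unsigned int head_slot;		/* consumer-owned */
	struct rb_seg *tail;		/* producer-owned */
};

static struct rb_seg *rb_seg_alloc(void)
{
	struct rb_seg *seg = malloc(sizeof(*seg));

	if (seg) {
		atomic_init(&seg->nr, 0);
		atomic_init(&seg->next, NULL);
	}
	return seg;
}

static int rb_init(struct rb *rb)
{
	struct rb_seg *seg = rb_seg_alloc();

	if (!seg)
		return -1;
	rb->head = rb->tail = seg;
	rb->head_slot = 0;
	return 0;
}

/* Producer side: append an item, extending the list when the tail is full. */
static int rb_push(struct rb *rb, void *item)
{
	struct rb_seg *tail = rb->tail;
	unsigned int nr = atomic_load_explicit(&tail->nr, memory_order_relaxed);

	if (nr == RB_NR_SLOTS) {
		struct rb_seg *seg = rb_seg_alloc();

		if (!seg)
			return -1;
		seg->slots[0] = item;
		atomic_store_explicit(&seg->nr, 1, memory_order_relaxed);
		/* Release: the new segment is fully set up before it is linked. */
		atomic_store_explicit(&tail->next, seg, memory_order_release);
		rb->tail = seg;	/* the old tail is not touched again */
		return 0;
	}
	tail->slots[nr] = item;
	/* Release: the slot contents are visible before the new count. */
	atomic_store_explicit(&tail->nr, nr + 1, memory_order_release);
	return 0;
}

/* Consumer side: take the oldest item, or NULL if nothing is available. */
static void *rb_pop(struct rb *rb)
{
	for (;;) {
		struct rb_seg *head = rb->head;
		unsigned int nr = atomic_load_explicit(&head->nr,
						       memory_order_acquire);

		if (rb->head_slot < nr)
			return head->slots[rb->head_slot++];
		if (rb->head_slot < RB_NR_SLOTS)
			return NULL;	/* segment not yet full */

		struct rb_seg *next = atomic_load_explicit(&head->next,
							   memory_order_acquire);
		if (!next)
			return NULL;	/* always keep one segment */
		rb->head = next;
		rb->head_slot = 0;
		free(head);		/* consumed segment leaves the head end */
	}
}
```

    The consumer frees a segment only after it has seen a non-NULL next
    pointer, and the producer never touches a segment again once it has
    linked a successor, so the last remaining segment is always left in
    place.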
    
    [Questions/thoughts]
    
     (1) To manage this, I need a head pointer, a tail pointer and a tail slot
         number (assuming insertion happens at the tail end and the next
         pointers point from head to tail).  Should I put these into a struct
         of their own, say "folio_queue_head" or "rolling_buffer"?
    
         I will end up with two of these in netfs_io_request eventually, one
         keeping track of the pagecache I'm dealing with for buffered I/O and
         the other to hold a bounce buffer when we need one.
    
     (2) Should I make the slots {folio,off,len} or bio_vec?
    
     (3) This is intended to replace ITER_XARRAY eventually.  Using an xarray
         in I/O iteration requires the taking of the RCU read lock, doing
         copying under the RCU read lock, walking the xarray (which may change
         under us), handling retries and dealing with special values.
    
         The advantage of ITER_XARRAY is that when we're dealing with the
         pagecache directly, we don't need any allocation - but if we're doing
         encrypted comms, there's a good chance we'd be using a bounce buffer
         anyway.
    
         This will require afs, erofs, cifs, orangefs and fscache to be
         converted to stop using ITER_XARRAY.  afs still uses it for dirs and
         symlinks; some of erofs's usages should be easy to change, but there's one which
         won't be so easy; ceph's use via fscache can be fixed by porting ceph
         to netfslib; cifs is using xarray as a bounce buffer - that can be
         moved to use sheaves instead; and orangefs has a similar problem to
         erofs - maybe orangefs could use netfslib?
    
    Signed-off-by: David Howells <dhowells@redhat.com>
    cc: Matthew Wilcox <willy@infradead.org>
    cc: Jeff Layton <jlayton@kernel.org>
    cc: Steve French <sfrench@samba.org>
    cc: Ilya Dryomov <idryomov@gmail.com>
    cc: Gao Xiang <xiang@kernel.org>
    cc: Mike Marshall <hubcap@omnibond.com>
    cc: netfs@lists.linux.dev
    cc: linux-fsdevel@vger.kernel.org
    cc: linux-mm@kvack.org
    cc: linux-afs@lists.infradead.org
    cc: linux-cifs@vger.kernel.org
    cc: ceph-devel@vger.kernel.org
    cc: linux-erofs@lists.ozlabs.org
    cc: devel@lists.orangefs.org
    Link: https://lore.kernel.org/r/20240814203850.2240469-13-dhowells@redhat.com/ # v2
    Signed-off-by: Christian Brauner <brauner@kernel.org>