    powerpc/pseries: add RTAS work area allocator · 43033bc6
    Nathan Lynch authored
    Various pseries-specific RTAS functions take a temporary "work area"
    parameter - a buffer in memory accessible to RTAS. Typically such
    functions are passed the statically allocated rtas_data_buf buffer as
    the argument. This buffer is protected by a global spinlock. So users
    of rtas_data_buf cannot perform sleeping operations while accessing
    the buffer.
    
    Most RTAS functions that have a work area parameter can return a
    status (-2/990x) that indicates that the caller should retry. Before
    retrying, the caller may need to reschedule or sleep (see
    rtas_busy_delay() for details). This combination of factors
    leads to uncomfortable constructions like this:
    
    	do {
    		spin_lock(&rtas_data_buf_lock);
    		rc = rtas_call(token, __pa(rtas_data_buf), ...);
    		if (rc == 0) {
    			/* parse or copy out rtas_data_buf contents */
    		}
    		spin_unlock(&rtas_data_buf_lock);
    	} while (rtas_busy_delay(rc));
    
    Another unfortunately common way of handling this is for callers to
    blithely ignore the possibility of a -2/990x status and hope for the
    best.
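
To make the retry statuses concrete, here is a minimal userspace sketch of how a caller might classify them. The helper names are hypothetical, and the 9900..9905 range with a 10^x millisecond hint is my understanding of the convention behind rtas_busy_delay(), not something this change defines.

```c
#include <assert.h>
#include <stdbool.h>

/* Retry statuses referred to above as -2/990x. */
#define RTAS_BUSY               (-2)
#define RTAS_EXTENDED_DELAY_MIN 9900
#define RTAS_EXTENDED_DELAY_MAX 9905

/* Hypothetical helper: does this status ask the caller to retry? */
static bool status_wants_retry(int status)
{
	return status == RTAS_BUSY ||
	       (status >= RTAS_EXTENDED_DELAY_MIN &&
		status <= RTAS_EXTENDED_DELAY_MAX);
}

/*
 * Hypothetical helper: suggested delay in milliseconds before retrying.
 * Assumption: 990x hints at 10^x ms; -2 carries no delay hint, so the
 * caller would only reschedule. Returns 0 when no delay is suggested.
 */
static unsigned int status_delay_ms(int status)
{
	unsigned int ms = 0;

	if (status >= RTAS_EXTENDED_DELAY_MIN &&
	    status <= RTAS_EXTENDED_DELAY_MAX) {
		int order = status - RTAS_EXTENDED_DELAY_MIN;

		assert(order >= 0 && order <= 5);
		for (ms = 1; order > 0; order--)
			ms *= 10;
	}
	return ms;
}
```

A caller that ignores these statuses treats a transient "try again" as a hard failure, or worse, as success.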
    
    If users were allowed to perform blocking operations while owning a
    work area, the programming model would become less tedious and
    error-prone. Users could schedule away, sleep, or perform other
    blocking operations without having to release and re-acquire
    resources.
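
As a rough illustration of that model, the sketch below runs in userspace with a mocked firmware call. Everything in it is hypothetical (mock_rtas_call(), fetch_with_private_work_area(), the malloc-backed buffer); it only shows the shape of a retry loop in which the caller keeps its work area across retries instead of dropping and retaking a lock.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define WORK_AREA_SIZE 4096

static int mock_calls;

/*
 * Mock firmware call (hypothetical): reports busy (-2) on the first two
 * invocations, then fills the buffer and returns 0.
 */
static int mock_rtas_call(char *buf, size_t size)
{
	if (++mock_calls < 3)
		return -2;
	snprintf(buf, size, "result");
	return 0;
}

/* The caller owns the work area for the whole retry loop. */
static int fetch_with_private_work_area(char *out, size_t out_size)
{
	char *work_area;
	int rc;

	assert(out && out_size > 0);

	work_area = malloc(WORK_AREA_SIZE);
	if (!work_area)
		return -1;

	do {
		rc = mock_rtas_call(work_area, WORK_AREA_SIZE);
		/*
		 * A kernel caller could sleep or reschedule here; the work
		 * area stays owned, so nothing is released or re-acquired.
		 */
	} while (rc == -2);

	if (rc == 0)
		snprintf(out, out_size, "%s", work_area);

	free(work_area);
	return rc;
}
```

Compared with the spinlock loop earlier, the result is parsed once, after the loop, and no lock is held across the delay.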
    
    We could continue to use a single work area buffer, and convert
    rtas_data_buf_lock to a mutex. But that would impose an unnecessarily
    coarse serialization on all users. As awkward as the current design
    is, it prevents longer running operations that need to repeatedly use
    rtas_data_buf from blocking the progress of others.
    
    There are more considerations. One is that while 4KB is fine for all
    current in-kernel uses, some RTAS calls can take much smaller buffers,
    and some (VPD, platform dumps) would likely benefit from larger
    ones. Another is that at least one RTAS function (ibm,get-vpd)
    has *two* work area parameters. And finally, we should expect the
    number of work area users in the kernel to increase over time as we
    introduce lockdown-compatible ABIs to replace less safe use cases
    based on sys_rtas/librtas.
    
    So a special-purpose allocator for RTAS work area buffers seems worth
    trying.
    
    Properties:
    
    * The backing memory for the allocator is reserved early in boot in
      order to satisfy RTAS addressing requirements, and then managed with
      genalloc.
    * Allocations can block, but they never fail (mempool-like).
    * Prioritizes first-come, first-served fairness over throughput.
    * Early boot allocations before the allocator has been initialized are
      served via an internal static buffer.
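
The "block but never fail" and first-come, first-served properties can be sketched in userspace as follows. This is not the allocator added by this change: it swaps genalloc for a fixed array of equal-sized slots, uses pthreads in place of kernel primitives, and omits the early-boot static buffer. All names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define POOL_SLOTS 2
#define SLOT_SIZE  4096

struct work_area_pool {
	pthread_mutex_t lock;
	pthread_cond_t  changed;
	unsigned long   next_ticket;  /* next ticket to hand out */
	unsigned long   now_serving;  /* ticket currently allowed to allocate */
	int             in_use[POOL_SLOTS];
	char            mem[POOL_SLOTS][SLOT_SIZE];
};

static struct work_area_pool pool = {
	.lock    = PTHREAD_MUTEX_INITIALIZER,
	.changed = PTHREAD_COND_INITIALIZER,
};

/* Caller must hold pool.lock. Returns a free slot index, or -1. */
static int find_free_slot(void)
{
	for (int i = 0; i < POOL_SLOTS; i++)
		if (!pool.in_use[i])
			return i;
	return -1;
}

/* Blocks until it is this caller's turn and a slot is free; never NULL. */
static char *pool_alloc(void)
{
	int slot = -1;

	pthread_mutex_lock(&pool.lock);
	unsigned long ticket = pool.next_ticket++;

	/* FIFO: later arrivals wait even if a slot frees up first. */
	while (ticket != pool.now_serving || (slot = find_free_slot()) < 0)
		pthread_cond_wait(&pool.changed, &pool.lock);

	pool.in_use[slot] = 1;
	pool.now_serving++;
	pthread_cond_broadcast(&pool.changed);
	pthread_mutex_unlock(&pool.lock);
	return pool.mem[slot];
}

/* Returns a slot to the pool and wakes any waiters. */
static void pool_free(char *area)
{
	pthread_mutex_lock(&pool.lock);
	for (int i = 0; i < POOL_SLOTS; i++)
		if (pool.mem[i] == area)
			pool.in_use[i] = 0;
	pthread_cond_broadcast(&pool.changed);
	pthread_mutex_unlock(&pool.lock);
}
```

The ticket counter is what trades throughput for fairness: a waiter whose turn has not come stays blocked even when a slot is free, so a stream of later requests cannot starve an earlier one.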
    
    Intended to replace rtas_data_buf. New code that needs RTAS work area
    buffers should prefer this API.
    
    Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-12-26929c8cce78@linux.ibm.com
rtas.c 51.9 KB