    [PATCH] ppc64: extra barrier in I/O operations · 307b7297
    Paul Mackerras authored
    At the moment, on PPC64, the instruction we use for wmb() doesn't
    order cacheable stores vs. non-cacheable stores.  (It does order
    cacheable vs. cacheable and non-cacheable vs. non-cacheable.)  This
    causes problems in driver code that writes data into memory, does a
    wmb(), and then does a writel() to the device to start a DMA operation
    that reads the data just written to memory.
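
The driver pattern described above can be sketched roughly as follows. This is a portable illustration, not kernel code: `struct dma_desc`, `fake_doorbell`, `start_dma()`, `sketch_wmb()` and `sketch_writel()` are hypothetical names, the "device register" is an ordinary variable, and the C11 fence only stands in for whatever instruction wmb() maps to on a given architecture.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical DMA descriptor, living in ordinary (cacheable) memory. */
struct dma_desc {
    uint32_t addr;
    uint32_t len;
};

static struct dma_desc desc;

/* Stand-in for a non-cacheable device register (assumption: real code
 * would use an ioremap()ed address, not a plain variable). */
static volatile uint32_t fake_doorbell;

/* Stand-in for wmb(); the real macro is architecture specific. */
static inline void sketch_wmb(void)
{
    atomic_thread_fence(memory_order_release);
}

/* Stand-in for writel() as it behaved before the patch: a bare store. */
static inline void sketch_writel(uint32_t val, volatile uint32_t *reg)
{
    *reg = val;
}

static void start_dma(uint32_t buf_addr, uint32_t len)
{
    /* 1. Cacheable stores: fill in the descriptor the device will read. */
    desc.addr = buf_addr;
    desc.len = len;

    /* 2. Barrier intended to make the descriptor visible first. */
    sketch_wmb();

    /* 3. Non-cacheable store: tell the device to start the DMA.
     * If step 2 does not order cacheable vs. non-cacheable stores,
     * the device may fetch the descriptor before it has been written. */
    sketch_writel(1, &fake_doorbell);
}
```

The hazard is between steps 1 and 3: they are different kinds of store, so a barrier that only orders like with like leaves them unordered.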
    
    This patch solves the problem by adding a sync instruction before the
    store in the write* and out* macros.  The sync is a full barrier that
    orders all loads and stores, cacheable or not.  The patch also moves
    the eieio instruction that we had after the store to before the load
    in the read* and in* macros.  With the sync before the store, we don't
    need an eieio as well in a sequence of stores, but we still need an
    eieio between a store and a load.
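
As a rough sketch of the barrier placement described above (not the actual io.h code), the accessors below put a sync before each store and an eieio before each load. The names `sketch_writel()`, `sketch_readl()`, `sketch_sync()`, `sketch_eieio()` and the event log are hypothetical; the inline asm assumes a GCC-compatible compiler, and on non-ppc64 hosts the barriers degrade to plain compiler barriers so the ordering can still be observed. Byte swapping and any other details of the real macros are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Event log so the barrier placement can be inspected off-target. */
enum io_event { EV_SYNC, EV_EIEIO, EV_STORE, EV_LOAD };

static enum io_event io_log[16];
static size_t io_log_len;

static void log_event(enum io_event e)
{
    if (io_log_len < sizeof(io_log) / sizeof(io_log[0]))
        io_log[io_log_len++] = e;
}

/* Full barrier: orders all loads and stores, cacheable or not. */
static inline void sketch_sync(void)
{
#if defined(__powerpc64__)
    __asm__ __volatile__("sync" : : : "memory");
#else
    __asm__ __volatile__("" : : : "memory"); /* compiler barrier only */
#endif
    log_event(EV_SYNC);
}

/* Lighter barrier, still needed between a store and a following load. */
static inline void sketch_eieio(void)
{
#if defined(__powerpc64__)
    __asm__ __volatile__("eieio" : : : "memory");
#else
    __asm__ __volatile__("" : : : "memory"); /* compiler barrier only */
#endif
    log_event(EV_EIEIO);
}

/* write*-style accessor: sync first, then the store. */
static inline void sketch_writel(uint32_t val, volatile uint32_t *addr)
{
    sketch_sync();
    *addr = val;
    log_event(EV_STORE);
}

/* read*-style accessor: eieio first, then the load. */
static inline uint32_t sketch_readl(volatile uint32_t *addr)
{
    uint32_t val;

    sketch_eieio();
    val = *addr;
    log_event(EV_LOAD);
    return val;
}
```

With this layout, two back-to-back stores are separated by the second store's sync, and a store followed by a load is separated by the load's eieio, so no barrier is needed after the store itself.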
    
    I think it is better to do this than to turn wmb() into a full memory
    barrier (a sync instruction) because the full barrier is slow and
    isn't needed with the sync in the write*/out* macros.  This way,
    write*/out* are fully ordered with respect to preceding loads and
    stores, which is what driver writers expect, and we avoid penalizing
    users of wmb() who are only doing cacheable stores.
io.h 13.1 KB