    ring-buffer: Add rb_time_t 64 bit operations for speeding up 32 bit · 10464b4a
    Steven Rostedt (VMware) authored
    A discussion of the new time algorithm, which lets nested events still
    have proper time keeping, noted that it requires local64_t atomic
    operations. Mathieu was concerned about the performance this would have on
    32 bit machines, as in most cases, atomic 64 bit operations on them can be
    expensive.
    
    As the ring buffer's timing needs do not require full features of local64_t,
    a wrapper is made to implement a new rb_time_t operation that uses two longs
    on 32 bit machines but still uses the local64_t operations on 64 bit
    machines. There's a switch that can be made in the file to force 64 bit to
    use the 32 bit version just for testing purposes.
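    A minimal user-space sketch of how such a wrapper type could be selected
    at build time. Every name here (sketch_rb_time_t, SKETCH_RB_TIME_32,
    SKETCH_FORCE_RB_TIME_32) is hypothetical and stands in for whatever the
    patch actually uses; uint64_t stands in for local64_t, and the counter
    field is the cnt described further down.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical testing switch: uncomment to force the split layout
 * even on a 64 bit build.
 */
/* #define SKETCH_FORCE_RB_TIME_32 */

#if ULONG_MAX == 0xffffffffUL || defined(SKETCH_FORCE_RB_TIME_32)
#define SKETCH_RB_TIME_32 1
#endif

#ifdef SKETCH_RB_TIME_32
/* 32 bit layout: the time stamp is split across two longs, plus a counter. */
typedef struct {
	unsigned long cnt;
	unsigned long top;
	unsigned long bottom;
} sketch_rb_time_t;
#else
/* 64 bit layout: one 64 bit word (local64_t in the kernel). */
typedef struct {
	uint64_t time;
} sketch_rb_time_t;
#endif

/* Returns 1 when the split (32 bit) layout was selected, 0 otherwise. */
static inline int sketch_rb_time_is_split(void)
{
#ifdef SKETCH_RB_TIME_32
	return 1;
#else
	return 0;
#endif
}
```

    Keeping the choice behind one typedef means callers only ever see one
    type, and the forced switch lets the 32 bit path be exercised on 64 bit
    hardware.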
    
    Not all reads need to succeed: a read may fail if it happens while the
    stamp being read is in the process of being updated. The requirement is
    that all reads must succeed that were done by an interrupted event (where
    this event was interrupted by another event that did the write), or by an
    event that itself did the write first. That is: rb_time_set(t, x) followed
    by rb_time_read(t) will always succeed (even if it gets interrupted by
    another event that writes to t). The result of the read will be either the
    previous set, or a set performed by an interrupting event.
    
    If the read is done by an event that interrupted another event that was in
    the process of setting the time stamp, and no other event came along to
    write to that time stamp, it will fail and the rb_time_read() will return
    that it failed (the value to read will be undefined).
    
    A set will always write to the time stamp and return with a valid time
    stamp, such that any read after it will be valid.
    
    A cmpxchg may fail if it interrupted an event that was in the process of
    updating the time stamp just like the reads do. Other than that, it will act
    like a normal cmpxchg.
    
    The way this works is that the rb_time_t is made of three fields: a cnt
    that gets updated atomically every time a modification is made, a top that
    represents the most significant 30 bits of the time, and a bottom that
    represents the least significant 30 bits of the time. Notice that the time
    value is only 60 bits long (where the ring buffer only uses 59 bits, which
    gives us 18 years of nanoseconds!).
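    The 30/30 split and the 18 year figure can be checked with a small
    user-space sketch. The helper names (sketch_top, sketch_bottom,
    sketch_join, sketch_ns_to_years) are made up for illustration and are not
    taken from the patch; a year is assumed to be 365.25 days.

```c
#include <stdint.h>

#define SKETCH_PART_SHIFT 30
#define SKETCH_PART_MASK  ((UINT32_C(1) << SKETCH_PART_SHIFT) - 1)

/* Most significant 30 bits of a 60 bit time value. */
static inline uint32_t sketch_top(uint64_t val)
{
	return (uint32_t)(val >> SKETCH_PART_SHIFT) & SKETCH_PART_MASK;
}

/* Least significant 30 bits of a 60 bit time value. */
static inline uint32_t sketch_bottom(uint64_t val)
{
	return (uint32_t)val & SKETCH_PART_MASK;
}

/* Rebuild the 60 bit value from its two 30 bit halves. */
static inline uint64_t sketch_join(uint32_t top, uint32_t bottom)
{
	return ((uint64_t)(top & SKETCH_PART_MASK) << SKETCH_PART_SHIFT) |
	       (bottom & SKETCH_PART_MASK);
}

/* Convert nanoseconds to years, assuming 365.25 days per year. */
static inline double sketch_ns_to_years(uint64_t ns)
{
	return (double)ns / 1e9 / (365.25 * 24.0 * 3600.0);
}
```

    For example, sketch_ns_to_years(1ULL << 59) lands a little above 18,
    which is where the "18 years of nanoseconds" comes from.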
    
    The top two bits of both the top and the bottom hold a 2 bit counter that
    is set from the two least significant bits of the cnt. A read of the top
    and the bottom where both have the same 2 bit counter in their most
    significant bits is considered a match, and a valid 60 bit number can be
    created from it. If they do not match, then the number is considered
    invalid, and this can only happen if an event interrupted another event in
    the midst of updating the time stamp.
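    The tag matching can be sketched in user space as follows. This is a
    single-threaded illustration only: the names (sketch_rb_time, sketch_tag,
    sketch_set, sketch_read) are hypothetical, plain integers replace the
    kernel's local_t operations, and no atomicity or interrupt safety is
    modelled.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SHIFT 30
#define SKETCH_MASK  ((UINT32_C(1) << SKETCH_SHIFT) - 1)

struct sketch_rb_time {
	uint32_t cnt;     /* bumped on every modification */
	uint32_t top;     /* 2 bit tag + upper 30 bits of the time */
	uint32_t bottom;  /* 2 bit tag + lower 30 bits of the time */
};

/* Put the two least significant bits of cnt above a 30 bit payload. */
static inline uint32_t sketch_tag(uint32_t part, uint32_t cnt)
{
	return (part & SKETCH_MASK) | ((cnt & 3) << SKETCH_SHIFT);
}

/* Write both halves, each carrying the tag of the new cnt. */
static void sketch_set(struct sketch_rb_time *t, uint64_t val)
{
	uint32_t cnt = ++t->cnt; /* the kernel would do this atomically */

	t->top = sketch_tag((uint32_t)(val >> SKETCH_SHIFT), cnt);
	t->bottom = sketch_tag((uint32_t)val, cnt);
}

/* Return true and fill *ret only when the two tags agree. */
static bool sketch_read(const struct sketch_rb_time *t, uint64_t *ret)
{
	uint32_t top = t->top;
	uint32_t bottom = t->bottom;

	if ((top >> SKETCH_SHIFT) != (bottom >> SKETCH_SHIFT))
		return false; /* caught in the middle of an update */

	*ret = ((uint64_t)(top & SKETCH_MASK) << SKETCH_SHIFT) |
	       (bottom & SKETCH_MASK);
	return true;
}
```

    A half-finished update can be imitated by re-tagging only the top with
    the next cnt value; sketch_read() then reports failure instead of
    returning a mixed value.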
    
    This is only used for 32 bit machines as 64 bit machines can get better
    performance out of the local64_t. This has been tested heavily by forcing 64
    bit to use this logic.
    
    Link: https://lore.kernel.org/r/20200625225345.18cf5881@oasis.local.home
    Link: http://lkml.kernel.org/r/20200629025259.309232719@goodmis.org
    Inspired-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
ring_buffer.c 148 KB