    time: Redo timers properly · 044deb35
    Kirill Smelkov authored
    Background: in 2019, in 9c260fde (time: New package that mirrors Go's
    time) and b073f6df (time: Move/Port timers to C++/Pyx nogil), I added
    basic timers - with a proper API but with a very dumb implementation that
    spawned one thread per timer. There were just a few timers in use and
    this was working, surprisingly, relatively OK...
    
    ... until 2023, when I was working on XLTE, which needs to organize 100Hz
    polling of the Amarisoft eNodeB service to retrieve information about
    flows on Data Radio Bearers:
    
        kirr/xlte@2a016d48
        https://lab.nexedi.com/kirr/xlte/-/blob/8e606c64/amari/drb.py
    
    There each request comes with its own deadline - to catch "no reply" -
    and the deadlines are implemented via timers. So 100 threads are created
    every second, which adds visible overhead, consumes a lot of virtual
    address space and RSS for thread stacks, and should all be unnecessary.
    
    We were tolerating even that for some time, but recently Joanne approached me
    with reports that the xamari program, which does the polling, was leaking memory.
    
    With that, and because it was hard to find what was actually leaking,
    I started to remove uncertainties - and there is a lot of uncertainty
    in what is going on when lots of threads are being created over and over.
    
    In the end the leak turned out to likely be a different thing (see
    nexedi/pygolang!24, still
    discovered while working on this patch), but all of the above was
    enough motivation to finally start redoing the timers properly.
    
    --------
    
    So when it comes to doing timers more or less properly, there is
    usually a queue of armed timers, and a loop that picks entries from that
    queue to fire them. I initially tried to do the simple thing and
    use std::priority_queue for that, because priority_queue is internally a
    heap, and heaps can provide O(log(n)) insertion and removal of an
    arbitrary element, plus O(1) "pick top element to process". Exactly what
    would suit. However, I quickly found that even in 2024 std::priority_queue
    does not provide a removal operation at all, and there is no such thing
    as e.g. std::sift_heap that would help to implement that manually. This
    is surprising, because e.g. libevent implements all that just fine via
    sifting up/down upon removal in logarithmic complexity:
    
    https://github.com/libevent/libevent/blob/80e25c02/minheap-internal.h#L96-L115
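
    For illustration, removing an arbitrary element from a binary heap is a
    small amount of code - roughly the following sketch (mine, not libevent's).
    It is exactly this kind of operation that std::priority_queue does not let
    us do, because it hides its underlying container:

        // Sketch: O(log n) removal of element i from a min-heap stored in a
        // plain vector (root at 0, children of j at 2j+1 and 2j+2).
        #include <vector>
        #include <utility>

        template<typename T>
        void heap_remove(std::vector<T>& h, size_t i) {
            h[i] = std::move(h.back());     // move the last element into the hole
            h.pop_back();
            if (i >= h.size())
                return;                     // removed element was the last one
            while (i > 0 && h[i] < h[(i-1)/2]) {        // sift up, ...
                std::swap(h[i], h[(i-1)/2]);
                i = (i-1)/2;
            }
            for (;;) {                                  // ... or sift down
                size_t l = 2*i+1, r = 2*i+2, m = i;
                if (l < h.size() && h[l] < h[m]) m = l;
                if (r < h.size() && h[r] < h[m]) m = r;
                if (m == i)
                    break;
                std::swap(h[i], h[m]);
                i = m;
            }
        }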
    
    The lack of an efficient removal operation turned out to be a blocker for
    using std::priority_queue, because most of the timers that are armed for
    timeouts never expire: upon successful completion of the covered
    operation the timer is stopped. In other words the timer is removed
    from the timer queue, and removal is one of the most frequent
    operations.
    
    So, if std::priority_queue cannot work, we would need to either bring in
    another implementation of a heap, or, if we are to bring in something,
    bring in and use something else that is more suitable for implementing
    timers.
    
    That reminded me that in 2005, for my Navy project, I had already implemented
    a custom timer wheel to handle timeouts after reading https://lwn.net/Articles/152436/ .
    Contrary to heaps, such timer wheels provide O(1) insertion and removal
    of timers and generally work faster. But this time I did not want to
    delve into implementing all that myself again and looked around at
    what is available out there.
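
    To give an idea of why a wheel is O(1), here is a minimal sketch of a
    single-level wheel - illustration only, not the implementation used here;
    real implementations are hierarchical so that long timeouts do not
    degrade into scanning one huge bucket:

        #include <cstdint>
        #include <functional>
        #include <list>

        // Sketch of a single-level timer wheel: arming and cancelling are O(1)
        // list operations, and each tick only walks one bucket.
        struct Wheel {
            struct Entry {
                uint64_t              expires;  // absolute tick to fire at
                std::function<void()> f;
            };
            typedef std::list<Entry>::iterator Handle;

            static constexpr int N = 256;       // ticks per wheel turn
            std::list<Entry> slot[N];
            uint64_t         now = 0;

            Handle arm(uint64_t dticks, std::function<void()> f) {
                std::list<Entry>& s = slot[(now + dticks) % N];
                s.push_front(Entry{now + dticks, std::move(f)});
                return s.begin();               // keep this to cancel in O(1)
            }

            void cancel(Handle h) {
                slot[h->expires % N].erase(h);
            }

            void tick() {                       // advance time by one tick
                std::list<Entry>& s = slot[++now % N];
                for (auto it = s.begin(); it != s.end(); ) {
                    if (it->expires == now) {   // skip timers from later turns
                        it->f();
                        it = s.erase(it);
                    } else {
                        ++it;
                    }
                }
            }
        };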
    
    There was an update to the kernel timer-wheel implementation described at
    https://lwn.net/Articles/646950/ and from that a project called
    Timeout.c was also found, which provides an implementation of such a wheel
    for user space: https://25thandclement.com/~william/projects/timeout.c.html .
    
    However, when we pick third-party code, we should be ready to
    understand it and fix bugs in it on our own. The audit of timeout.c
    did not go very smoothly - there are many platform-dependent places,
    and the issue tracker shows signs that sometimes not everything is OK
    with the implementation. With that I looked around a bit more and found
    the more compact and more portable Ratas library, with good structure
    and description, whose audit went better:
    
        https://www.snellman.net/blog/archive/2016-07-27-ratas-hierarchical-timer-wheel
        https://github.com/jsnell/ratas
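
    For a taste of what using it looks like: going by the Ratas README,
    scheduling and cancelling a timer is roughly the following sketch. The
    header and type names below are quoted from memory and should be treated
    as an assumption - the repository is authoritative:

        #include <cstdio>
        #include <functional>
        #include "timer-wheel.h"    // the single-header Ratas library

        int main() {
            TimerWheel timers;
            TimerEvent<std::function<void()>> t([] { std::printf("fired\n"); });

            timers.schedule(&t, 5);     // arm: fire 5 ticks from now
            timers.advance(4);          // nothing fires yet
            t.cancel();                 // O(1) cancel - the common case for timeouts
            timers.advance(1);          // nothing fires: the timer was cancelled
            return 0;
        }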
    
    Here, after going through the code, I feel capable of understanding
    issues and fixing bugs myself if that becomes needed.
    
    And the benchmark comparison of Timeout.c and Ratas shows that they
    should be of the same order regarding performance:
    
        https://lab.nexedi.com/kirr/misc/-/blob/4f51fd6/bench/time-wheel/ratas-vs-timeout.pdf
        kirr/ratas@382321d2
        kirr/timeout@d6f15744
    
    which makes Ratas the winner for me.
    
    Having a timer-wheel implementation, the rest is just technique to glue it
    all together. One implementation aspect deserves to be mentioned though:

    The timer loop uses Semaphore.acquire, recently modernized to also
    accept a timeout, to organize sleeping in between firing timers, while
    also being able to be woken up if a new timer is armed with an earlier
    expiration time.
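
    In other words the pattern is roughly the following sketch - illustration
    only, with std::counting_semaphore standing in for pygolang's own
    semaphore, and with hypothetical helpers (now_s, fire_expired) in place of
    the real clock and wheel code:

        #include <algorithm>
        #include <chrono>
        #include <cmath>
        #include <mutex>
        #include <semaphore>

        static double now_s() {                     // stand-in for time::now()
            using namespace std::chrono;
            return duration<double>(steady_clock::now().time_since_epoch()).count();
        }
        static void fire_expired() { /* advance the timer wheel here */ }

        static std::counting_semaphore<> wakeup(0); // released to wake the loop early
        static std::mutex                mu;
        static double                    tnext = INFINITY;  // nearest expiration time

        void timer_loop() {
            for (;;) {
                double dt;
                {
                    std::lock_guard<std::mutex> lock(mu);
                    dt = std::min(tnext - now_s(), 3600.);  // cap sleep if nothing is armed
                }
                if (dt > 0)
                    // sleep until the nearest deadline, but return early
                    // if wakeup is released by arming a nearer timer
                    wakeup.try_acquire_for(std::chrono::duration<double>(dt));
                fire_expired();
            }
        }

        void on_arm(double texpire) {               // called whenever a timer is armed
            std::lock_guard<std::mutex> lock(mu);
            if (texpire < tnext) {
                tnext = texpire;
                wakeup.release();                   // make the loop recompute its sleep
            }
        }

    A semaphore with timeout is enough here because the only thing an earlier
    arm needs to tell the loop is "recheck the nearest expiration time".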
    
    Other than that the changes are mostly straightforward. Please see the
    patch itself for details.
    
    Regarding how much more efficient the new implementation is compared to
    what we had before: benchmarks are added to measure arming timers that do
    not fire and, for symmetry, arming timers that do fire. We are most
    interested in the first benchmark, because it shows how cheap or
    expensive it is to use timers to implement timeouts, but the second one
    is also useful to see the overhead of the whole timers machinery.
    
    On my machine under py3.11 they go as follows after this patch:
    
        name              time/op
        timer_arm_cancel   805ns ± 0%
        timer_arm_fire    9.63µs ± 0%
    
    and before the patch the benchmarks simply do not run to completion
    because they run out of memory due to the huge number of threads being
    created.
    
    Still, with the following test program we can measure the effect the new
    timers implementation has:
    
        ---- 8< ----
        from golang import time
    
        def main():
            δt_rate = 1*time.millisecond
    
            tprev = time.now()
            tnext = tprev + δt_rate
            while 1:
                timer = time.Timer(5*time.second)
                _ = timer.stop()
                assert _ is True
    
                t = time.now()
                δtsleep = tnext - t
                #print('sleep %.3f ms' % (δtsleep/time.millisecond))
                time.sleep(δtsleep)
                tprev = tnext
                tnext += δt_rate
    
        main()
        ---- 8< ----
    
    This program creates/arms and cancels a timer 1000 times per second.
    
    Before this patch the program consumes ~ 30% of CPU, while after this
    patch it consumes ~ 7-8% of CPU.
    
    For reference, just the sleep part of that program, with all code
    related to timers removed, consumes ~ 5% of CPU, while the consumption of
    a plain sleep(1ms) loop in C, directly using system calls,
    
        ---- 8< ----
        #include <unistd.h>
    
        int main() {
            while (1) {
                usleep(1000);
            }
            return 0;
        }
        ---- 8< ----
    
    is ~ 3-4% of CPU on my machine.
    
    /cc @jerome
    /cc ORS team (@jhuge, @lu.xu, @tomo, @xavier_thompson, @Daetalus)
    /proposed-for-review-on nexedi/pygolang!26
time.h
#ifndef _NXD_LIBGOLANG_TIME_H
#define	_NXD_LIBGOLANG_TIME_H

// Copyright (C) 2019-2024  Nexedi SA and Contributors.
//                          Kirill Smelkov <kirr@nexedi.com>
//
// This program is free software: you can Use, Study, Modify and Redistribute
// it under the terms of the GNU General Public License version 3, or (at your
// option) any later version, as published by the Free Software Foundation.
//
// You can also Link and Combine this program with other software covered by
// the terms of any of the Free Software licenses or any of the Open Source
// Initiative approved licenses and Convey the resulting work. Corresponding
// source of such a combination shall include the source code for all other
// software used.
//
// This program is distributed WITHOUT ANY WARRANTY; without even the implied
// warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
//
// See COPYING file for full licensing terms.
// See https://www.nexedi.com/licensing for rationale and options.

// Package time mirrors Go package time.
//
//  - `now` returns current time.
//  - `sleep` pauses current task.
//  - `Ticker` and `Timer` provide timers integrated with channels.
//  - `tick`, `after` and `after_func` are convenience wrappers to use
//    tickers and timers easily.
//
// See also https://golang.org/pkg/time for Go time package documentation.
//
//
// C-level API
//
// Subset of time package functionality is also provided via C-level API:
//
//  - `_tasknanosleep` pauses current task.
//  - `_nanotime` returns current time.


#include <golang/libgolang.h>
#include <golang/sync.h>


// ---- C-level API ----

#ifdef __cplusplus
namespace golang {
namespace time {
extern "C" {
#endif

LIBGOLANG_API void _tasknanosleep(uint64_t dt);
LIBGOLANG_API uint64_t _nanotime(void);

#ifdef __cplusplus
}}} // golang::time:: "C"
#endif


// ---- C++ API ----

#ifdef __cplusplus

// golang::time::
namespace golang {
namespace time {

// golang/pyx/c++ - the same as std python - represents time as float
constexpr double second       = 1.0;
constexpr double nanosecond   = 1E-9 * second;
constexpr double microsecond  = 1E-6 * second;
constexpr double millisecond  = 1E-3 * second;
constexpr double minute       = 60   * second;
constexpr double hour         = 60   * minute;


// sleep pauses current goroutine for at least dt seconds.
LIBGOLANG_API void sleep(double dt);

// now returns current time in seconds.
LIBGOLANG_API double now();


typedef refptr<struct _Ticker> Ticker;
typedef refptr<struct _Timer>  Timer;

// tick returns channel connected to dt ticker.
//
// Note: there is no way to stop created ticker.
// Note: for dt <= 0, contrary to Ticker, tick returns nil channel instead of panicking.
LIBGOLANG_API chan<double> tick(double dt);

// after returns channel connected to dt timer.
//
// Note: with after there is no way to stop/garbage-collect created timer until it fires.
LIBGOLANG_API chan<double> after(double dt);

// after_func arranges to call f after dt time.
//
// The function will be called in its own goroutine.
// Returned timer can be used to cancel the call.
LIBGOLANG_API Timer after_func(double dt, func<void()> f);


// new_ticker creates new Ticker that will be firing at dt intervals.
LIBGOLANG_API Ticker new_ticker(double dt);

// Ticker arranges for time events to be sent to .c channel on dt-interval basis.
//
// If the receiver is slow, Ticker does not queue events and skips them.
// Ticking can be canceled via .stop() .
struct _Ticker : object {
    chan<double> c;

private:
    double      _dt;
    sync::Mutex _mu;
    bool        _stop;
    Timer       _timer;

    // don't new - create only via new_ticker()
private:
    _Ticker();
    ~_Ticker();
    friend Ticker LIBGOLANG_API new_ticker(double dt);
public:
    LIBGOLANG_API void decref();

public:
    // stop cancels the ticker.
    //
    // It is guaranteed that ticker channel is empty after stop completes.
    LIBGOLANG_API void stop();

private:
    void _tick();
};


// new_timer creates new Timer that will fire after dt.
LIBGOLANG_API Timer new_timer(double dt);

// Timer arranges for time event to be sent to .c channel after dt time.
//
// The timer can be stopped (.stop), or reinitialized to another time (.reset).
struct _Timer : object {
    chan<double> c;

    // don't new - create only via new_timer() & co
private:
    _Timer();
    ~_Timer();
    friend Timer _new_timer(double dt, func<void()> f);
    friend class _TimerImpl;
public:
    LIBGOLANG_API void decref();

public:
    // stop cancels the timer.
    //
    // It returns:
    //
    //   False: the timer was already expired or stopped,
    //   True:  the timer was armed and canceled by this stop call.
    //
    // Note: contrary to Go version, there is no need to drain timer channel
    // after stop call - it is guaranteed that after stop the channel is empty.
    //
    // Note: similarly to Go, if Timer is used with function - it is not
    // guaranteed that after stop the function is not running - in such case
    // the caller must explicitly synchronize with that function to complete.
    LIBGOLANG_API bool stop();

    // reset rearms the timer.
    //
    // the timer must be either already stopped or expired.
    LIBGOLANG_API void reset(double dt);
};


}}   // golang::time::
#endif  // __cplusplus

#endif	// _NXD_LIBGOLANG_TIME_H