    sched/deadline: Implement cancel_dl_timer() to use in switched_from_dl() · 67dfa1b7
    Kirill Tkhai authored
    Currently used hrtimer_try_to_cancel() is racy:
    
    raw_spin_lock(&rq->lock)
    ...                            dl_task_timer                 raw_spin_lock(&rq->lock)
    ...                               raw_spin_lock(&rq->lock)   ...
       switched_from_dl()             ...                        ...
          hrtimer_try_to_cancel()     ...                        ...
       switched_to_fair()             ...                        ...
    ...                               ...                        ...
    ...                               ...                        ...
    raw_spin_unlock(&rq->lock)        ...                        (acquired)
    ...                               ...                        ...
    ...                               ...                        ...
    do_exit()                         ...                        ...
       schedule()                     ...                        ...
          raw_spin_lock(&rq->lock)    ...                        raw_spin_unlock(&rq->lock)
          ...                         ...                        ...
          raw_spin_unlock(&rq->lock)  ...                        raw_spin_lock(&rq->lock)
          ...                         ...                        (acquired)
          put_task_struct()           ...                        ...
              free_task_struct()      ...                        ...
          ...                         ...                        raw_spin_unlock(&rq->lock)
    ...                               (acquired)                 ...
    ...                               ...                        ...
    ...                               (use after free)           ...
    
    So, let's implement a 100% guaranteed way to cancel the timer and let's
    be sure we are safe even in very unlikely situations.
    
    Unlocking the rq does not restrict where switched_from_dl() may be used,
    because the same unlocking was already possible in pull_dl_task() below.
    
    Let's consider the safety of this unlocking. The new code in the patch
    runs only when hrtimer_try_to_cancel() fails, which means the callback
    is running. In this case hrtimer_cancel() just waits until the
    callback has finished. Two cases are possible:
    
    1) Since we are in switched_from_dl(), the new class is not dl_sched_class
    and the new prio is not less than MAX_DL_PRIO. So the callback returns
    early, right after the !dl_task() check. After that hrtimer_cancel()
    returns too.
    
    The above is:
    
    raw_spin_lock(rq->lock);                  ...
    ...                                       dl_task_timer()
    ...                                          raw_spin_lock(rq->lock);
       switched_from_dl()                        ...
           hrtimer_try_to_cancel()               ...
              raw_spin_unlock(rq->lock);         ...
              hrtimer_cancel()                   ...
              ...                                raw_spin_unlock(rq->lock);
              ...                                return HRTIMER_NORESTART;
              ...                             ...
              raw_spin_lock(rq->lock);        ...
    
    2) But the below is also possible:
                                       dl_task_timer()
                                          raw_spin_lock(rq->lock);
                                          ...
                                          raw_spin_unlock(rq->lock);
    raw_spin_lock(rq->lock);              ...
       switched_from_dl()                 ...
           hrtimer_try_to_cancel()        ...
           ...                            return HRTIMER_NORESTART;
           raw_spin_unlock(rq->lock);  ...
           hrtimer_cancel();           ...
           raw_spin_lock(rq->lock);    ...
    
    In this case hrtimer_cancel() returns immediately. This case is very
    unlikely and is mentioned only for completeness.
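
The two cases above can be modelled in userspace. The following is only a sketch, not the kernel code: a pthread mutex stands in for rq->lock, a thread stands in for the hrtimer callback, and every model_* name, MODEL_MAX_DL_PRIO and run_model_case() are hypothetical helpers introduced for illustration. The return values of model_try_to_cancel() follow the hrtimer_try_to_cancel() convention (-1: callback running, 0: not active, 1: cancelled).

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_DL_PRIO 0     /* assumption: deadline tasks have prio < 0 */

struct model_task {
    int prio;                   /* protected by rq_lock */
    bool timer_queued;          /* armed, callback not started */
    atomic_bool timer_running;  /* callback is executing */
    int early_returns;          /* callback bailed out at the !dl_task() check */
};

static pthread_mutex_t rq_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for dl_task_timer(): needs rq_lock, does nothing for non-dl tasks. */
static void *model_dl_task_timer(void *arg)
{
    struct model_task *p = arg;

    pthread_mutex_lock(&rq_lock);
    if (p->prio >= MODEL_MAX_DL_PRIO)   /* models the !dl_task() check */
        p->early_returns++;
    pthread_mutex_unlock(&rq_lock);

    atomic_store(&p->timer_running, false);
    return NULL;
}

/* -1: callback running, 1: dequeued a pending timer, 0: timer was idle. */
static int model_try_to_cancel(struct model_task *p)
{
    if (atomic_load(&p->timer_running))
        return -1;
    if (p->timer_queued) {
        p->timer_queued = false;
        return 1;
    }
    return 0;
}

/* Like hrtimer_cancel(): retry until the callback is no longer running. */
static int model_cancel(struct model_task *p)
{
    int ret;

    while ((ret = model_try_to_cancel(p)) < 0)
        sched_yield();
    return ret;
}

/* Caller holds rq_lock. The lock is dropped only while waiting for a
 * running callback, which itself needs rq_lock to make progress. */
static int model_cancel_dl_timer(struct model_task *p)
{
    int ret = model_try_to_cancel(p);

    if (ret == -1) {
        pthread_mutex_unlock(&rq_lock);
        model_cancel(p);
        pthread_mutex_lock(&rq_lock);
    }
    return ret;
}

/* Runs one scenario; returns what the first cancel attempt reported. */
static int run_model_case(bool callback_in_flight, int *early_returns)
{
    struct model_task p = { .prio = -1 };
    pthread_t timer_thread;
    int ret;

    pthread_mutex_lock(&rq_lock);
    p.prio = 120;                       /* the task has left the dl class */
    if (callback_in_flight) {
        atomic_store(&p.timer_running, true);
        if (pthread_create(&timer_thread, NULL, model_dl_task_timer, &p)) {
            pthread_mutex_unlock(&rq_lock);
            return -2;                  /* could not start the model thread */
        }
    } else {
        p.timer_queued = true;
    }
    ret = model_cancel_dl_timer(&p);
    pthread_mutex_unlock(&rq_lock);

    if (callback_in_flight)
        pthread_join(timer_thread, NULL);
    *early_returns = p.early_returns;
    return ret;
}
```

Calling run_model_case(true, &n) exercises the path where the first cancel attempt fails. If model_cancel_dl_timer() kept rq_lock while waiting, the callback thread could never take the lock and the model would hang, which is why the lock is dropped around the wait.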
    
    Nobody can manipulate the task, because check_class_changed() is
    always called with pi_lock locked. Nobody can force the task to
    participate in (concurrent) priority inheritance schemes (for the same reason).
    
    All concurrent task operations require pi_lock, which is held by us.
    No deadlocks with dl_task_timer() are possible, because it returns
    right after !dl_task() check (it does nothing).
    
    If a new dl task arrives while the rq is unlocked, we simply no longer
    need to call pull_dl_task() later in switched_from_dl().
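
The point about tasks arriving during the unlocked window can be sketched with a small single-threaded model. This is an illustration under assumptions, not the patch itself: struct model_rq, the hook type and all model_* functions are hypothetical, and the hook simply plays the role of another CPU that touches the rq while its lock is dropped. The only thing the sketch shows is that the "is the rq empty of dl tasks?" check is made after the cancel step, so it sees whatever happened in the window.

```c
#include <stdbool.h>
#include <stddef.h>

struct model_rq {
    int dl_nr_running;  /* deadline tasks currently on this rq */
    int pulls;          /* how many times the pull path ran */
};

/* Hook that models activity from other CPUs while the rq is unlocked. */
typedef void (*unlock_window_fn)(struct model_rq *rq);

/* Stand-in for a cancel step that may drop and retake the rq lock. */
static void model_cancel_may_unlock(struct model_rq *rq,
                                    unlock_window_fn during_unlock)
{
    if (during_unlock)
        during_unlock(rq);  /* the rq may change here */
}

static void model_pull_dl_task(struct model_rq *rq)
{
    rq->pulls++;
}

static void model_switched_from_dl(struct model_rq *rq,
                                   unlock_window_fn during_unlock)
{
    model_cancel_may_unlock(rq, during_unlock);

    /* Read the counter only now, after the lock has been retaken. */
    if (rq->dl_nr_running == 0)
        model_pull_dl_task(rq);
}

/* Example hook: a new deadline task lands on the rq during the window. */
static void model_new_dl_task_arrives(struct model_rq *rq)
{
    rq->dl_nr_running++;
}

/* Returns how many pulls one class switch triggered. */
static int model_pulls_after_switch(bool task_arrives)
{
    struct model_rq rq = { 0 };

    model_switched_from_dl(&rq, task_arrives ? model_new_dl_task_arrives
                                             : NULL);
    return rq.pulls;
}
```

With no arrival the model pulls once. With an arrival during the window it does not pull at all, matching the behaviour described above.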

    Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
    [ Added comments]
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Acked-by: Juri Lelli <juri.lelli@arm.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Link: http://lkml.kernel.org/r/1414420852.19914.186.camel@tkhai
    Signed-off-by: Ingo Molnar <mingo@kernel.org>