1. 12 Sep, 2013 1 commit
    • timekeeping: Fix HRTICK related deadlock from ntp lock changes · 7bd36014
      John Stultz authored
      Gerlando Falauto reported that when HRTICK is enabled, it is
      possible to trigger system deadlocks. These were hard to
      reproduce, as HRTICK has been broken in the past, but seemed
      to be connected to the timekeeping_seq lock.
      
      Since seqlock/seqcount's aren't supported w/ lockdep, I added
      some extra spinlock based locking and triggered the following
      lockdep output:
      
      [   15.849182] ntpd/4062 is trying to acquire lock:
      [   15.849765]  (&(&pool->lock)->rlock){..-...}, at: [<ffffffff810aa9b5>] __queue_work+0x145/0x480
      [   15.850051]
      [   15.850051] but task is already holding lock:
      [   15.850051]  (timekeeper_lock){-.-.-.}, at: [<ffffffff810df6df>] do_adjtimex+0x7f/0x100
      
      <snip>
      
      [   15.850051] Chain exists of: &(&pool->lock)->rlock --> &p->pi_lock --> timekeeper_lock
      [   15.850051]  Possible unsafe locking scenario:
      [   15.850051]
      [   15.850051]        CPU0                    CPU1
      [   15.850051]        ----                    ----
      [   15.850051]   lock(timekeeper_lock);
      [   15.850051]                                lock(&p->pi_lock);
      [   15.850051]                                lock(timekeeper_lock);
      [   15.850051]   lock(&(&pool->lock)->rlock);
      [   15.850051]
      [   15.850051]  *** DEADLOCK ***
      
      The deadlock was introduced by 06c017fd ("timekeeping:
      Hold timekeepering locks in do_adjtimex and hardpps") in 3.10
      
      This patch avoids this deadlock, by moving the call to
      schedule_delayed_work() outside of the timekeeper lock
      critical section.
      Reported-by: Gerlando Falauto <gerlando.falauto@keymile.com>
      Tested-by: Lin Ming <minggr@gmail.com>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: stable <stable@vger.kernel.org> #3.11, 3.10
      Link: http://lkml.kernel.org/r/1378943457-27314-1-git-send-email-john.stultz@linaro.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
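The fix described above follows a common pattern: record under the lock that work is needed, then queue it only after the lock is released. Below is a minimal single-threaded sketch of that pattern, not the kernel code. Every name in it (`lock_timekeeper`, `queue_sync_work`, `adjust_time`) is hypothetical, and the lock is modelled as a plain flag so the ordering can be checked.

```c
#include <stdbool.h>

/* Hypothetical model: the lock is a flag so we can observe ordering. */
static bool timekeeper_locked;
static long time_offset;
static int queued_while_locked;
static int queued_while_unlocked;

static void lock_timekeeper(void)   { timekeeper_locked = true; }
static void unlock_timekeeper(void) { timekeeper_locked = false; }

/* Stand-in for queueing deferred work, which may take other locks. */
static void queue_sync_work(void)
{
    if (timekeeper_locked)
        queued_while_locked++;
    else
        queued_while_unlocked++;
}

static void adjust_time(long delta)
{
    bool need_sync;

    lock_timekeeper();
    time_offset += delta;
    need_sync = (delta != 0);   /* only remember the request here */
    unlock_timekeeper();

    if (need_sync)
        queue_sync_work();      /* runs outside the critical section */
}
```

Calling `adjust_time(5)` queues the work once with the flag cleared, so the queueing code can never nest its own locks inside the timekeeper lock.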
  2. 04 Sep, 2013 1 commit
    • sched/cputime: Do not scale when utime == 0 · 5a8e01f8
      Stanislaw Gruszka authored
      scale_stime() silently assumes that stime < rtime, otherwise
      when stime == rtime and both values are big enough (operations
      on them do not fit in 32 bits), the resulting scaled stime can
      be bigger than rtime. As a consequence, utime = rtime - stime
      yields a negative value.
      
      User space visible symptoms of the bug are overflowed TIME
      values on ps/top, for example:
      
       $ ps aux | grep rcu
       root         8  0.0  0.0      0     0 ?        S    12:42   0:00 [rcuc/0]
       root         9  0.0  0.0      0     0 ?        S    12:42   0:00 [rcub/0]
       root        10 62422329  0.0  0     0 ?        R    12:42 21114581:37 [rcu_preempt]
       root        11  0.1  0.0      0     0 ?        S    12:42   0:02 [rcuop/0]
       root        12 62422329  0.0  0     0 ?        S    12:42 21114581:35 [rcuop/1]
       root        10 62422329  0.0  0     0 ?        R    12:42 21114581:37 [rcu_preempt]
      
      or overflowed utime values read directly from /proc/$PID/stat
      
      Reference:
      
        https://lkml.org/lkml/2013/8/20/259

      Reported-and-tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
      Cc: stable@vger.kernel.org
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Link: http://lkml.kernel.org/r/20130904131602.GC2564@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
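The guard named in the commit title can be sketched as follows. This is an illustration under assumptions, not the kernel implementation: `adjust_cputime` and `scale_naive` are hypothetical names, the scaling is a plain multiply-and-divide rather than the kernel's reduced-precision arithmetic, and the `stime == 0` branch is assumed by symmetry.

```c
#include <stdint.h>

struct cputime_split {
    uint64_t utime;
    uint64_t stime;
};

/* Hypothetical stand-in for scale_stime(): stime * rtime / total. */
static uint64_t scale_naive(uint64_t stime, uint64_t rtime, uint64_t total)
{
    return stime * rtime / total;
}

/* Split the precise runtime rtime in the ratio of the sampled values. */
static struct cputime_split adjust_cputime(uint64_t utime, uint64_t stime,
                                           uint64_t rtime)
{
    struct cputime_split out;

    if (stime == 0) {
        /* Assumed symmetric case: everything is user time. */
        out.stime = 0;
        out.utime = rtime;
    } else if (utime == 0) {
        /* No scaling: stime == total, so stime is simply rtime. */
        out.stime = rtime;
        out.utime = 0;
    } else {
        out.stime = scale_naive(stime, rtime, stime + utime);
        out.utime = rtime - out.stime;
    }
    return out;
}
```

With `utime == 0` the scaling step is skipped entirely, so a rounding error in the scaled value can no longer push stime above rtime and make `rtime - stime` wrap around.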
  3. 16 Aug, 2013 1 commit
  4. 14 Aug, 2013 18 commits
    • Merge branch 'timers/nohz-v3' of... · 6f1d6576
      Ingo Molnar authored
      Merge branch 'timers/nohz-v3' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into timers/nohz
      
      Pull nohz improvements from Frederic Weisbecker:
      
       " It mostly contains fixes and full dynticks off-case optimizations. I believe that
         distros want to enable this feature so it seems important to optimize the case
         where the "nohz_full=" parameter is empty. ie: I'm trying to remove any performance
         regression that comes with NO_HZ_FULL=y when the feature is not used.
      
         This patchset improves the current situation a lot (off-case appears to be around 11% faster
         with hackbench, although I guess it may vary depending on the configuration but it should be
         significantly faster in any case) now there is still some work to do: I can still observe a
         remaining loss of 1.6% throughput seen with hackbench compared to CONFIG_NO_HZ_FULL=n. "
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • nohz: Optimize full dynticks's sched hooks with static keys · d13508f9
      Frederic Weisbecker authored
      Scheduler IPIs and task context switches are serious fast paths.
      Let's use static keys to hide, as much as we can, the off-case
      impact of the full dynticks APIs that are called on these
      sites.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
    • nohz: Optimize full dynticks state checks with static keys · 460775df
      Frederic Weisbecker authored
      These APIs are frequently accessed and priority is given
      to optimize the full dynticks off-case in order to let
      distros enable this feature without suffering from
      significant performance regressions.
      
      Let's inline these APIs and optimize them with static keys.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
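The shape of an inlined state check with a cheap off case can be approximated in user space as below. This is only an analogy: a real static key patches the branch into a nop at runtime, whereas this sketch uses an ordinary global flag. All names (`full_dynticks_enabled`, `full_dynticks_mask`, `cpu_is_full_dynticks`) are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical global switch, standing in for the static key. */
static bool full_dynticks_enabled;
/* Hypothetical CPU mask, one bit per CPU. */
static unsigned long full_dynticks_mask;
/* Counts how often the more expensive mask lookup actually ran. */
static int mask_lookups;

static inline bool cpu_is_full_dynticks(int cpu)
{
    /* Off case: a single test, no mask access at all. */
    if (!full_dynticks_enabled)
        return false;

    mask_lookups++;
    return (full_dynticks_mask >> cpu) & 1UL;
}
```

When the feature is unused, every caller pays only for the first test, which is the cost the static key then removes as well.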
    • nohz: Rename a few state variables · 73867dcd
      Frederic Weisbecker authored
      Rename the full dynticks's cpumask and cpumask state variables
      to some more exportable names.
      
      These will be used later from global headers to optimize
      the main full dynticks APIs in conjunction with static keys.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
    • vtime: Always debug check snapshot source _before_ updating it · af2350bd
      Frederic Weisbecker authored
      The vtime delta update performed by get_vtime_delta() always checks
      that the source of the snapshot is valid.

      Meanwhile the snapshot updaters that rely on get_vtime_delta() also
      set the new snapshot origin. But some of them do this right before
      the call to get_vtime_delta(), making its debug check useless.
      
      This is easily fixable by moving the snapshot origin update after
      the call to get_vtime_delta(). The order doesn't matter there.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
    • vtime: Always scale generic vtime accounting results · b854fafa
      Frederic Weisbecker authored
      The cputime accounting in full dynticks can be a subtle
      mixup of CPUs using tick based accounting and others using
      generic vtime.
      
      As long as the tick can have a share on producing these stats, we
      want to scale the result against CFS precise accounting as the tick
      can miss some tasks hiding between the periodic interrupts.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
    • vtime: Optimize full dynticks accounting off case with static keys · b0493406
      Frederic Weisbecker authored
      If no CPU is in the full dynticks range, we can avoid the full
      dynticks cputime accounting through generic vtime along with its
      overhead and use the traditional tick based accounting instead.
      
      Let's do this and nop the off case with static keys.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
    • vtime: Describe overriden functions in dedicated arch headers · a5725ac2
      Frederic Weisbecker authored
      If the arch overrides some generic vtime APIs, let it describe
      these in a dedicated and standalone header. This way it becomes
      convenient to include it in vtime generic headers without irrelevant
      stuff in such a low level header.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
    • m68k: hardirq_count() only need preempt_mask.h · a703f9b7
      Frederic Weisbecker authored
      The m68k irqflags implementation needs to check hardirq
      context in some cases.
      
      As it is a very low level header file, it's better to
      include preempt_mask.h rather than hardirq.h when the
      only purpose is to use irq context APIs. This way we
      can avoid future header circular dependencies when
      vtime.h will expand to use static keys.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
    • hardirq: Split preempt count mask definitions · 2d4b8473
      Frederic Weisbecker authored
      In order to use static keys with vtime APIs, we'll need to
      add static keys headers to vtime.h
      
      hardirq.h then becomes a problem because it needs vtime.h
      for irqtime accounting in irq_enter/irq_exit, but it's
      often included just to get the irq mask definitions in the
      task preempt_count field and the APIs that come along:
      in_interrupt(), in_hardirq(), etc...
      
      Some very low level arch headers sometimes need these masks
      and APIs such as arch/m68k/include/asm/irqflags.h for example.
      But they don't want to include hardirq.h if vtime.h, jump_label.h
      and even workqueue.h come along. Including such a bloated high
      level header from arch headers can quickly result in circular
      header dependencies that break the build.
      
      So let's split hardirq.h in two parts:
      
      * preempt_mask.h that gathers all the preempt_count definitions
      and the APIs associated. This one is considered low level and can
      be safely included anywhere.
      
      * hardirq.h that includes the previous one. It defines the irq
      entry/exit APIs.
      
      To avoid future circular headers dependencies, the preempt_mask.h
      inclusion can replace hardirq.h on files that don't implement irq
      low level handlers but just need the atomic/context check APIs.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
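To illustrate why the mask definitions can live in a small standalone header, here is a sketch of a preempt_count-style bitfield layout with context-check helpers that depend on nothing else. The field widths and the helper names (`hardirq_count_of`, `softirq_count_of`, `in_interrupt_ctx`) are illustrative assumptions, not the values or names from the commit, and the sketch ignores NMI context.

```c
#include <stdbool.h>

/* Illustrative field widths, not taken from the commit. */
#define PREEMPT_BITS  8
#define SOFTIRQ_BITS  8
#define HARDIRQ_BITS  4

#define PREEMPT_SHIFT 0
#define SOFTIRQ_SHIFT (PREEMPT_SHIFT + PREEMPT_BITS)
#define HARDIRQ_SHIFT (SOFTIRQ_SHIFT + SOFTIRQ_BITS)

#define FIELD_MASK(bits, shift) (((1UL << (bits)) - 1) << (shift))
#define SOFTIRQ_MASK FIELD_MASK(SOFTIRQ_BITS, SOFTIRQ_SHIFT)
#define HARDIRQ_MASK FIELD_MASK(HARDIRQ_BITS, HARDIRQ_SHIFT)

/* Helpers take the counter as an argument to stay self-contained. */
static inline unsigned long hardirq_count_of(unsigned long count)
{
    return count & HARDIRQ_MASK;
}

static inline unsigned long softirq_count_of(unsigned long count)
{
    return count & SOFTIRQ_MASK;
}

static inline bool in_interrupt_ctx(unsigned long count)
{
    return (count & (HARDIRQ_MASK | SOFTIRQ_MASK)) != 0;
}
```

Nothing here needs accounting, jump label, or workqueue declarations, which is what makes such a header safe to include from low level arch code.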
    • context_tracking: Split low level state headers · e7358b3b
      Frederic Weisbecker authored
      We plan to use the context tracking static key on inline
      vtime APIs. For this we need to include the context tracking
      headers from those of vtime.
      
      However vtime headers need to stay low level because they are
      included in hardirq.h that mostly contains standalone
      definitions. But context_tracking.h includes sched.h for
      a few task_struct references, therefore it wouldn't be sensible
      to include it from vtime.h
      
      To solve this, let's split the context tracking headers and move
      out the pure state definitions that only require a few low level
      headers. We can safely include that small part in vtime.h later.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
    • vtime: Fix racy cputime delta update · 54461562
      Frederic Weisbecker authored
      get_vtime_delta() must be called under the task vtime_seqlock
      with the code that does the cputime accounting flush.
      
      Otherwise the cputime reader can be fooled and run into
      a race where it sees the snapshot update but misses the
      cputime flush. As a result it can report a cputime that is
      way too short.
      
      Fix vtime_account_user() that wasn't complying with that rule.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
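The rule above, that the snapshot update and the cputime flush belong in the same write section, can be sketched with a hand-rolled sequence counter. This is a single-threaded illustration with hypothetical names (`struct vtime_model`, `account_user`, `read_utime`); it omits the memory barriers a real seqlock needs and is not safe for concurrent use as written.

```c
/* Hypothetical per-task accounting state. */
struct vtime_model {
    unsigned int seq;          /* odd while a write is in progress */
    unsigned long long snap;   /* time of the last snapshot */
    unsigned long long utime;  /* flushed user time */
};

/* Writer: flush the delta and move the snapshot in ONE write section. */
static void account_user(struct vtime_model *v, unsigned long long now)
{
    v->seq++;                    /* begin write, seq becomes odd */
    v->utime += now - v->snap;   /* flush the elapsed delta */
    v->snap = now;               /* move the snapshot */
    v->seq++;                    /* end write, seq becomes even */
}

/* Reader: retry if a write was in progress or completed meanwhile. */
static unsigned long long read_utime(const struct vtime_model *v,
                                     unsigned long long now)
{
    unsigned int start;
    unsigned long long total;

    do {
        start = v->seq;
        total = v->utime + (now - v->snap);
    } while ((start & 1U) || start != v->seq);

    return total;
}
```

If the snapshot moved outside the write section, a reader could pair the new snapshot with the old, unflushed `utime` and report a value that is too short, which is the race the commit describes.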
    • vtime: Remove a few unneeded generic vtime state checks · 7621d1f8
      Frederic Weisbecker authored
      Some generic vtime APIs check if the vtime accounting
      is enabled on the local CPU before doing their work.
      
      Some of these are not needed because all their callers already
      take care of that. Let's remove the checks on these.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
    • context_tracking: User/kernel broundary cross trace events · 1b6a259a
      Frederic Weisbecker authored
      This can be useful to track all kernel/user round trips.
      And it's also helpful to debug the context tracking subsystem.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
    • context_tracking: Optimize context switch off case with static keys · 73d424f9
      Frederic Weisbecker authored
      No need for syscall slowpath if no CPU is full dynticks,
      rather nop this in this case.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
    • context_tracking: Optimize guest APIs off case with static key · 48d6a816
      Frederic Weisbecker authored
      Optimize guest entry/exit APIs with static keys. This minimizes
      the overhead for those who enable CONFIG_NO_HZ_FULL without
      always using it. Having no range passed to nohz_full= should
      result in the probes' overhead being minimized.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
    • context_tracking: Optimize main APIs off case with static key · ad65782f
      Frederic Weisbecker authored
      Optimize user and exception entry/exit APIs with static
      keys. This minimizes the overhead for those who enable
      CONFIG_NO_HZ_FULL without always using it. Having no range
      passed to nohz_full= should result in the probes being nopped
      (at least we hope so...).

      If this proves not to be enough in the long term, we'll need
      to bring an exception slow path by re-routing the exception
      handlers.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
    • context_tracking: Ground setup for static key use · 65f382fd
      Frederic Weisbecker authored
      Prepare for using a static key in the context tracking subsystem.
      This will help optimize the off case for its many users:
      
      * user_enter, user_exit, exception_enter, exception_exit, guest_enter,
        guest_exit, vtime_*()
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
  5. 12 Aug, 2013 8 commits
    • context_tracking: Remove full dynticks' hacky dependency on wide context tracking · d84d27a4
      Frederic Weisbecker authored
      Now that the full dynticks subsystem only enables the context tracking
      on full dynticks CPUs, let's remove the dependency on CONTEXT_TRACKING_FORCE.

      This dependency was a hack to enable the context tracking widely for the
      full dynticks subsystem until the latter becomes able to enable it in a
      more CPU-finegrained fashion.
      
      Now CONTEXT_TRACKING_FORCE only stands for testing on archs that
      work on support for the context tracking while full dynticks can't be
      used yet due to unmet dependencies. It simulates a system where all CPUs
      are full dynticks so that RCU user extended quiescent states and dynticks
      cputime accounting can be tested on the given arch.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
    • nohz: Only enable context tracking on full dynticks CPUs · 2e709338
      Frederic Weisbecker authored
      The context tracking subsystem has the ability to selectively
      enable the tracking on any defined subset of CPUs. This means that
      we can define a CPU range that doesn't run the context tracking
      and another range that does.
      
      Now what we want in practice is to enable the tracking on full
      dynticks CPUs only. In order to perform this, we just need to pass
      our full dynticks CPU range selection from the full dynticks
      subsystem to the context tracking.
      
      This way we can spare the overhead of RCU user extended quiescent
      state and vtime maintenance on the CPUs that are outside the
      full dynticks range. Just keep in mind the raw context tracking
      itself is still necessary everywhere.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
    • context_tracking: Fix runtime CPU off-case · d65ec121
      Frederic Weisbecker authored
      As long as the context tracking is enabled on any CPU, even
      a single one, all other CPUs need to keep track of their
      user <-> kernel boundary crossings as well.
      
      This is because a task can sleep while servicing an exception
      that happened in the kernel or in userspace. Then when the task
      eventually wakes up and returns from the exception, the CPU needs
      to know if we resume in userspace or in the kernel. exception_exit()
      gets this information from exception_enter() that saved the previous
      state.
      
      If the CPU where the exception happened didn't keep track of
      this information, exception_exit() doesn't know which state
      tracking to restore on the CPU where the task got migrated
      and we may return to userspace with the context tracking
      subsystem thinking that we are in kernel mode.
      
      This can be fixed in the long term if we move our context tracking
      probes on very low level arch fast path user <-> kernel boundary,
      although even that is worrisome as an exception can still happen
      in the few instructions between the probe and the actual iret.
      
      Also we are not yet ready to set these probes in the fast path given
      the potential overhead problem it induces.
      
      So let's fix this by always enabling context tracking even on CPUs
      that are not in the full dynticks range. OTOH we can spare the
      rcu_user_*() and vtime_user_*() calls there because the tick runs
      on these CPUs and we can handle RCU state machine and cputime
      accounting through it.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
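The migration problem described above can be modelled with a tiny per-CPU state array. This is a hypothetical simplification, not the kernel code: `model_exception_enter` and `model_exception_exit` are invented names, the CPU is passed explicitly, and the state is just an enum per CPU.

```c
enum ctx_state { CTX_KERNEL = 0, CTX_USER = 1 };

/* Hypothetical per-CPU tracking state for a two-CPU system. */
static enum ctx_state cpu_state[2];

/* On exception entry: remember where we came from, switch to kernel. */
static enum ctx_state model_exception_enter(int cpu)
{
    enum ctx_state prev = cpu_state[cpu];

    cpu_state[cpu] = CTX_KERNEL;
    return prev;
}

/* On exception exit: restore the saved state on the CURRENT CPU,
 * which may differ from the entry CPU if the task slept and migrated. */
static void model_exception_exit(int cpu, enum ctx_state prev)
{
    cpu_state[cpu] = prev;
}
```

The saved value is only trustworthy if the entry CPU was tracking at all. If CPU 0 never recorded that the task was in userspace, the state restored on CPU 1 would be wrong, which is why the commit keeps the raw tracking enabled everywhere.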
    • vtime: Update a few comments · 5b206d48
      Frederic Weisbecker authored
      Update a stale comment from the old vtime era and document some
      locking that might be non obvious.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
    • context_tracing: Fix guest accounting with native vtime · 2d854e57
      Frederic Weisbecker authored
      1) If context tracking is enabled with native vtime accounting (which
      combo is useless except for dev testing), we call vtime_guest_enter()
      and vtime_guest_exit() on host <-> guest switches. But those are stubs
      in this configuration. As a result, cputime is not correctly flushed
      on kvm context switches.
      
      2) If context tracking runs but is disabled on some CPUs, those
      CPUs end up calling __guest_enter/__guest_exit which in turn
      call vtime_account_system(). We don't want to call this because we
      run in tick based accounting for these CPUs.
      
      Refactor the guest_enter/guest_exit code such that all combinations
      finally work.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
    • sched: Consolidate open coded preemptible() checks · fbb00b56
      Frederic Weisbecker authored
      preempt_schedule() and preempt_schedule_context() open
      code their preemptability checks.
      
      Use the standard API instead for consolidation.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Alex Shi <alex.shi@intel.com>
      Cc: Paul Turner <pjt@google.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
    • Merge branch 'fortglx/3.11/time' of git://git.linaro.org/people/jstultz/linux into timers/urgent · ae920eb2
      Ingo Molnar authored
      Pull small fix for v3.11 from John Stultz.
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • Linux 3.11-rc5 · d4e4ab86
      Linus Torvalds authored
  6. 11 Aug, 2013 4 commits
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · e5d081f4
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "This is three bug fixes: An fnic warning caused by sleeping under a
        lock, a major regression with our updated WRITE SAME/UNMAP logic which
        caused tons of USB devices (and one RAID card) to cease to function
        and a megaraid_sas firmware initialisation problem which causes kdump
        failures"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        [SCSI] Don't attempt to send extended INQUIRY command if skip_vpd_pages is set
        [SCSI] fnic: BUG: sleeping function called from invalid context during probe
        [SCSI] megaraid_sas: megaraid_sas driver init fails in kdump kernel
      e5d081f4
    • Linus Torvalds's avatar
      Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc · 77f63b4d
      Linus Torvalds authored
      Pull powerpc fixes from Ben Herrenschmidt:
       "This includes small series from Michael Neuling to fix a couple of
        nasty remaining problems with the new Power8 support, also targeted at
        stable 3.10, without which some new userspace accessible registers
        aren't properly context switched, and in some case, can be clobbered
        by the user of transactional memory.
      
        Along with that, a few slightly more minor things, such as a missing
        Kconfig option to enable handling of denorm exceptions when not
        running under a hypervisor (or userspace will randomly crash when
        hitting denorms with the vector unit), some nasty bugs in the new
        pstore oops code, and other simple bug fixes worth having in now.
      
        Note: I picked up the two powerpc KVM fixes as Alex Graf asked me to
        handle KVM bits while he is on vacation.  However I'll let him decide
        whether they should go to -stable or not when he is back"
      
      * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
        powerpc/tm: Fix context switching TAR, PPR and DSCR SPRs
        powerpc: Save the TAR register earlier
        powerpc: Fix context switch DSCR on POWER8
        powerpc: Rework setting up H/FSCR bit definitions
        powerpc: Fix hypervisor facility unavaliable vector number
        powerpc/kvm/book3s_pr: Return appropriate error when allocation fails
        powerpc/kvm: Add signed type cast for comparation
        powerpc/eeh: Add missing procfs entry for PowerNV
        powerpc/pseries: Add backward compatibilty to read old kernel oops-log
        powerpc/pseries: Fix buffer overflow when reading from pstore
        powerpc: On POWERNV enable PPC_DENORMALISATION by default
      77f63b4d
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 30b229bd
      Linus Torvalds authored
      Pull s390 kvm fixes from Paolo Bonzini:
       "Two fixes for s390"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: s390: fix pfmf non-quiescing control handling
        KVM: s390: move kvm_guest_enter,exit closer to sie
      30b229bd
    • Linus Torvalds's avatar
      Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 9e6bdaaa
      Linus Torvalds authored
      Pull i2c fixes from Wolfram Sang:
       "Some driver bugfixes for the I2C subsystem"
      
      * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
        i2c: mv64xxx: Document the newly introduced allwinner compatible
        i2c: Fix Kontron PLD prescaler calculation
        i2c: i2c-mxs: Use DMA mode even for small transfers
      9e6bdaaa
  7. 10 Aug, 2013 6 commits
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs · d92581fc
      Linus Torvalds authored
      Pull btrfs fixes from Chris Mason:
       "These are assorted fixes, mostly from Josef nailing down xfstests
        runs.  Zach also has a long standing fix for problems with readdir
        wrapping f_pos (or ctx->pos)
      
        These patches were spread out over different bases, so I rebased
        things on top of rc4 and retested overnight"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
        btrfs: don't loop on large offsets in readdir
        Btrfs: check to see if root_list is empty before adding it to dead roots
        Btrfs: release both paths before logging dir/changed extents
        Btrfs: allow splitting of hole em's when dropping extent cache
        Btrfs: make sure the backref walker catches all refs to our extent
        Btrfs: fix backref walking when we hit a compressed extent
        Btrfs: do not offset physical if we're compressed
        Btrfs: fix extent buffer leak after backref walking
        Btrfs: fix a bug of snapshot-aware defrag to make it work on partial extents
        btrfs: fix file truncation if FALLOC_FL_KEEP_SIZE is specified
      d92581fc
    • Linus Torvalds's avatar
      Merge tag 'nfs-for-3.11-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · b8ea0d06
      Linus Torvalds authored
      Pull NFS client bugfixes from Trond Myklebust:
      
       - Stable patch for lockd to fix Oopses due to inappropriate calls to
         utsname()->nodename
      
       - Stable patches for sunrpc to fix Oopses on shutdown when using
         AF_LOCAL sockets with rpcbind
      
       - Fix memory leak and error checking issues in nfs4_proc_lookup_mountpoint
      
       - Fix a regression with the sync mount option failing to work for nfs4
         mounts
      
       - Fix a writeback performance issue when doing cache invalidation
      
       - Remove an incorrect call to nfs_setsecurity in nfs_fhget
      
      * tag 'nfs-for-3.11-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        NFSv4: Fix up nfs4_proc_lookup_mountpoint
        NFS: Remove unnecessary call to nfs_setsecurity in nfs_fhget()
        NFSv4: Fix the sync mount option for nfs4 mounts
        NFS: Fix writeback performance issue on cache invalidation
        SUNRPC: If the rpcbind channel is disconnected, fail the call to unregister
        SUNRPC: Don't auto-disconnect from the local rpcbind socket
        LOCKD: Don't call utsname()->nodename from nlmclnt_setlockargs
      b8ea0d06
    • Linus Torvalds's avatar
      Merge branch 'for-3.11' of git://linux-nfs.org/~bfields/linux · 022e5d09
      Linus Torvalds authored
      Pull nfsd fixes from Bruce Fields:
       "Some fixes for a 4.1 feature that in retrospect probably should have
        waited for 3.12....  But it appears to be working now"
      
      * 'for-3.11' of git://linux-nfs.org/~bfields/linux:
        nfsd: Fix SP4_MACH_CRED negotiation in EXCHANGE_ID
        nfsd4: Fix MACH_CRED NULL dereference
      022e5d09
    • Linus Torvalds's avatar
      Merge tag 'sound-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 1e24f76e
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "A couple of USB-audio fixes that should also go to stable kernels"
      
      * tag 'sound-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: usb-audio: do not trust too-big wMaxPacketSize values
        ALSA: 6fire: fix DMA issues with URB transfer_buffer usage
      1e24f76e
    • Linus Torvalds's avatar
      Merge tag 'staging-3.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · 8ae3f1d0
      Linus Torvalds authored
      Pull staging driver fixes from Greg KH:
       "Here are 3 small fixes for staging/IIO drivers for 3.11-rc5.  Nothing
        huge, two IIO driver fixes, and a zcache fix.  All of these have been
        in linux-next for a while"
      
      * tag 'staging-3.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        staging: zcache: fix "zcache=" kernel parameter
        iio: ti_am335x_adc: Fix wrong samples received on 1st read
        iio:trigger: Fix use_count race condition
      8ae3f1d0
    • Linus Torvalds's avatar
      Merge tag 'usb-3.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · e6e8ac44
      Linus Torvalds authored
      Pull USB fixes from Greg KH:
       "Here are 3 small USB fixes for 3.11-rc5.
      
        One is a fix that the ChromeOS developers ran into on some Intel
        hardware, one is a build fix, and the last is a MAINTAINERS update to
        help people figure out where to send USB network driver patches.
      
        All of these have been in linux-next for a while"
      
      * tag 'usb-3.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
        MAINTAINERS: Add separate section for USB NETWORKING DRIVERS
        usb: xhci: add missing dma-mapping.h includes
        usb: core: don't try to reset_device() a port that got just disconnected
      e6e8ac44
  8. 09 Aug, 2013 1 commit
    • Zach Brown's avatar
      btrfs: don't loop on large offsets in readdir · db62efbb
      Zach Brown authored
      When btrfs readdir() hits the last entry it sets the readdir offset to a
      huge value to stop buggy apps from breaking when the same name is
      returned by readdir() with concurrent rename()s.
      
      But unconditionally setting the offset to INT_MAX causes readdir() to
      loop returning any entries with offsets past INT_MAX.  It only takes a
      few hours of constant file creation and removal to create entries past
      INT_MAX.
      
      So let's set the huge offset to LLONG_MAX if the last entry has already
      overflowed 32bit loff_t.  Without large offsets behaviour is identical.
      With large offsets 64bit apps will work and 32bit apps will be no more
      broken than they currently are if they see large offsets.
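      
      The end-of-directory rule described above can be sketched in plain C
      as follows. This is a minimal standalone model, not the btrfs code:
      end_of_dir_pos() is a hypothetical helper, `long long` stands in for
      loff_t, and treating a position equal to INT_MAX as already
      overflowed is an assumption about the boundary.

```c
#include <limits.h>

/*
 * Hypothetical helper: given the position just past the last directory
 * entry, pick the "huge" offset to store so that later readdir() calls
 * return nothing more.
 */
static long long end_of_dir_pos(long long last_pos)
{
    /* Entries already exist at or beyond INT_MAX, so INT_MAX would
     * point back into live entries and readdir() would loop. Jump to
     * the largest 64-bit offset instead. */
    if (last_pos >= INT_MAX)
        return LLONG_MAX;

    /* Common case: same behaviour as before the fix. */
    return INT_MAX;
}
```

      For a small directory the result is still INT_MAX, so 32bit
      applications see no change; only directories whose offsets have
      grown past INT_MAX get the LLONG_MAX marker.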
      Signed-off-by: Zach Brown <zab@redhat.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
      db62efbb