1. 14 Aug, 2013 14 commits
    • Frederic Weisbecker's avatar
      vtime: Always debug check snapshot source _before_ updating it · af2350bd
      Frederic Weisbecker authored
      The vtime delta update performed by get_vtime_delta() always check
      that the source of the snapshot is valid.
      
      Meanhile the snapshot updaters that rely on get_vtime_delta() also
      set the new snapshot origin. But some of them do this right before
      the call to get_vtime_delta(), making its debug check useless.
      
      This is easily fixable by moving the snapshot origin update after
      the call to get_vtime_delta(). The order doesn't matter there.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      af2350bd
    • Frederic Weisbecker's avatar
      vtime: Always scale generic vtime accounting results · b854fafa
      Frederic Weisbecker authored
      The cputime accounting in full dynticks can be a subtle
      mixup of CPUs using tick based accounting and others using
      generic vtime.
      
      As long as the tick can have a share on producing these stats, we
      want to scale the result against CFS precise accounting as the tick
      can miss some task hiding between the periodic interrupt.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      b854fafa
    • Frederic Weisbecker's avatar
      vtime: Optimize full dynticks accounting off case with static keys · b0493406
      Frederic Weisbecker authored
      If no CPU is in the full dynticks range, we can avoid the full
      dynticks cputime accounting through generic vtime along with its
      overhead and use the traditional tick based accounting instead.
      
      Let's do this and nope the off case with static keys.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      b0493406
    • Frederic Weisbecker's avatar
      vtime: Describe overriden functions in dedicated arch headers · a5725ac2
      Frederic Weisbecker authored
      If the arch overrides some generic vtime APIs, let it describe
      these on a dedicated and standalone header. This way it becomes
      convenient to include it in vtime generic headers without irrelevant
      stuff in such a low level header.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      a5725ac2
    • Frederic Weisbecker's avatar
      m68k: hardirq_count() only need preempt_mask.h · a703f9b7
      Frederic Weisbecker authored
      The m68k irqflags implementation needs to check hardirq
      context in some cases.
      
      As it is a very low level header file, it's better to
      include preempt_mask.h rather than hardirq.h when the
      only purpose is to use irq context APIs. This way we
      can avoid future header circular dependencies when
      vtime.h will expand to use static keys.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      a703f9b7
    • Frederic Weisbecker's avatar
      hardirq: Split preempt count mask definitions · 2d4b8473
      Frederic Weisbecker authored
      In order to use static keys with vtime APIs, we'll need to
      add static keys headers to vtime.h
      
      hardirq.h then becomes a problem because it needs vtime.h
      for irqtime accounting in irq_enter/irq_exit, but it's
      often included just to get the irq mask definitions in the
      task preempt_count field and the APIs that come along:
      in_interrupt(), in_hardirq(), etc...
      
      Some very low level arch headers sometimes need these masks
      and APIs such as arch/m68k/include/asm/irqflags.h for example.
      But they don't want to include hardirq.h if vtime.h, jump_label.h
      and even workqueue.h come along. Including such bloated high
      level header from arch headers can quickly result in circular
      headers dependency that crash the build.
      
      So let's split hardirq.h in two parts:
      
      * preempt_mask.h that gathers all the preempt_count definitions
      and the APIs associated. This one is considered low level and can
      be safely included anywhere.
      
      * hardirq.h that includes the previous one. It defines the irq
      entry/exit APIs.
      
      To avoid future circular headers dependencies, the preempt_mask.h
      inclusion can replace hardirq.h on files that don't implement irq
      low level handlers but just need the atomic/context check APIs.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      2d4b8473
    • Frederic Weisbecker's avatar
      context_tracking: Split low level state headers · e7358b3b
      Frederic Weisbecker authored
      We plan to use the context tracking static key on inline
      vtime APIs. For this we need to include the context tracking
      headers from those of vtime.
      
      However vtime headers need to stay low level because they are
      included in hardirq.h that mostly contains standalone
      definitions. But context_tracking.h includes sched.h for
      a few task_struct references, therefore it wouldn't be sensible
      to include it from vtime.h
      
      To solve this, lets split the context tracking headers and move
      out the pure state definitions that only require a few low level
      headers. We can safely include that small part in vtime.h later.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      e7358b3b
    • Frederic Weisbecker's avatar
      vtime: Fix racy cputime delta update · 54461562
      Frederic Weisbecker authored
      get_vtime_delta() must be called under the task vtime_seqlock
      with the code that does the cputime accounting flush.
      
      Otherwise the cputime reader can be fooled and run into
      a race where it sees the snapshot update but misses the
      cputime flush. As a result it can report a cputime that is
      way too short.
      
      Fix vtime_account_user() that wasn't complying to that rule.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      54461562
    • Frederic Weisbecker's avatar
      vtime: Remove a few unneeded generic vtime state checks · 7621d1f8
      Frederic Weisbecker authored
      Some generic vtime APIs check if the vtime accounting
      is enabled on the local CPU before doing their work.
      
      Some of these are not needed because all their callers already
      take care of that. Let's remove the checks on these.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      7621d1f8
    • Frederic Weisbecker's avatar
      context_tracking: User/kernel broundary cross trace events · 1b6a259a
      Frederic Weisbecker authored
      This can be useful to track all kernel/user round trips.
      And it's also helpful to debug the context tracking subsystem.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      1b6a259a
    • Frederic Weisbecker's avatar
      context_tracking: Optimize context switch off case with static keys · 73d424f9
      Frederic Weisbecker authored
      No need for syscall slowpath if no CPU is full dynticks,
      rather nop this in this case.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      73d424f9
    • Frederic Weisbecker's avatar
      context_tracking: Optimize guest APIs off case with static key · 48d6a816
      Frederic Weisbecker authored
      Optimize guest entry/exit APIs with static keys. This minimize
      the overhead for those who enable CONFIG_NO_HZ_FULL without
      always using it. Having no range passed to nohz_full= should
      result in the probes overhead to be minimized.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      48d6a816
    • Frederic Weisbecker's avatar
      context_tracking: Optimize main APIs off case with static key · ad65782f
      Frederic Weisbecker authored
      Optimize user and exception entry/exit APIs with static
      keys. This minimize the overhead for those who enable
      CONFIG_NO_HZ_FULL without always using it. Having no range
      passed to nohz_full= should result in the probes to be nopped
      (at least we hope so...).
      
      If this proves not be enough in the long term, we'll need
      to bring an exception slow path by re-routing the exception
      handlers.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      ad65782f
    • Frederic Weisbecker's avatar
      context_tracking: Ground setup for static key use · 65f382fd
      Frederic Weisbecker authored
      Prepare for using a static key in the context tracking subsystem.
      This will help optimizing the off case on its many users:
      
      * user_enter, user_exit, exception_enter, exception_exit, guest_enter,
        guest_exit, vtime_*()
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      65f382fd
  2. 12 Aug, 2013 7 commits
    • Frederic Weisbecker's avatar
      context_tracking: Remove full dynticks' hacky dependency on wide context tracking · d84d27a4
      Frederic Weisbecker authored
      Now that the full dynticks subsystem only enables the context tracking
      on full dynticks CPUs, lets remove the dependency on CONTEXT_TRACKING_FORCE
      
      This dependency was a hack to enable the context tracking widely for the
      full dynticks susbsystem until the latter becomes able to enable it in a
      more CPU-finegrained fashion.
      
      Now CONTEXT_TRACKING_FORCE only stands for testing on archs that
      work on support for the context tracking while full dynticks can't be
      used yet due to unmet dependencies. It simulates a system where all CPUs
      are full dynticks so that RCU user extended quiescent states and dynticks
      cputime accounting can be tested on the given arch.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      d84d27a4
    • Frederic Weisbecker's avatar
      nohz: Only enable context tracking on full dynticks CPUs · 2e709338
      Frederic Weisbecker authored
      The context tracking subsystem has the ability to selectively
      enable the tracking on any defined subset of CPU. This means that
      we can define a CPU range that doesn't run the context tracking
      and another range that does.
      
      Now what we want in practice is to enable the tracking on full
      dynticks CPUs only. In order to perform this, we just need to pass
      our full dynticks CPU range selection from the full dynticks
      subsystem to the context tracking.
      
      This way we can spare the overhead of RCU user extended quiescent
      state and vtime maintainance on the CPUs that are outside the
      full dynticks range. Just keep in mind the raw context tracking
      itself is still necessary everywhere.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      2e709338
    • Frederic Weisbecker's avatar
      context_tracking: Fix runtime CPU off-case · d65ec121
      Frederic Weisbecker authored
      As long as the context tracking is enabled on any CPU, even
      a single one, all other CPUs need to keep track of their
      user <-> kernel boundaries cross as well.
      
      This is because a task can sleep while servicing an exception
      that happened in the kernel or in userspace. Then when the task
      eventually wakes up and return from the exception, the CPU needs
      to know if we resume in userspace or in the kernel. exception_exit()
      get this information from exception_enter() that saved the previous
      state.
      
      If the CPU where the exception happened didn't keep track of
      these informations, exception_exit() doesn't know which state
      tracking to restore on the CPU where the task got migrated
      and we may return to userspace with the context tracking
      subsystem thinking that we are in kernel mode.
      
      This can be fixed in the long term if we move our context tracking
      probes on very low level arch fast path user <-> kernel boundary,
      although even that is worrisome as an exception can still happen
      in the few instructions between the probe and the actual iret.
      
      Also we are not yet ready to set these probes in the fast path given
      the potential overhead problem it induces.
      
      So let's fix this by always enable context tracking even on CPUs
      that are not in the full dynticks range. OTOH we can spare the
      rcu_user_*() and vtime_user_*() calls there because the tick runs
      on these CPUs and we can handle RCU state machine and cputime
      accounting through it.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      d65ec121
    • Frederic Weisbecker's avatar
      vtime: Update a few comments · 5b206d48
      Frederic Weisbecker authored
      Update a stale comment from the old vtime era and document some
      locking that might be non obvious.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      5b206d48
    • Frederic Weisbecker's avatar
      context_tracing: Fix guest accounting with native vtime · 2d854e57
      Frederic Weisbecker authored
      1) If context tracking is enabled with native vtime accounting (which
      combo is useless except for dev testing), we call vtime_guest_enter()
      and vtime_guest_exit() on host <-> guest switches. But those are stubs
      in this configurations. As a result, cputime is not correctly flushed
      on kvm context switches.
      
      2) If context tracking runs but is disabled on some CPUs, those
      CPUs end up calling __guest_enter/__guest_exit which in turn
      call vtime_account_system(). We don't want to call this because we
      run in tick based accounting for these CPUs.
      
      Refactor the guest_enter/guest_exit code such that all combinations
      finally work.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      2d854e57
    • Frederic Weisbecker's avatar
      sched: Consolidate open coded preemptible() checks · fbb00b56
      Frederic Weisbecker authored
      preempt_schedule() and preempt_schedule_context() open
      code their preemptability checks.
      
      Use the standard API instead for consolidation.
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Alex Shi <alex.shi@intel.com>
      Cc: Paul Turner <pjt@google.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      fbb00b56
    • Ingo Molnar's avatar
      Merge branch 'fortglx/3.11/time' of git://git.linaro.org/people/jstultz/linux into timers/urgent · ae920eb2
      Ingo Molnar authored
      Pull small fix for v3.11 from John Stultz.
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      ae920eb2
  3. 25 Jul, 2013 1 commit
  4. 24 Jul, 2013 2 commits
    • Li Zhong's avatar
      nohz: fix compile warning in tick_nohz_init() · ca06416b
      Li Zhong authored
      cpu is not used after commit 5b8621a6Signed-off-by: default avatarLi Zhong <zhong@linux.vnet.ibm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      ca06416b
    • Steven Rostedt's avatar
      nohz: Do not warn about unstable tsc unless user uses nohz_full · 543487c7
      Steven Rostedt authored
      If the user enables CONFIG_NO_HZ_FULL and runs the kernel on a machine
      with an unstable TSC, it will produce a WARN_ON dump as well as taint
      the kernel. This is a bit extreme for a kernel that just enables a
      feature but doesn't use it.
      
      The warning should only happen if the user tries to use the feature by
      either adding nohz_full to the kernel command line, or by enabling
      CONFIG_NO_HZ_FULL_ALL that makes nohz used on all CPUs at boot up. Note,
      this second feature should not (yet) be used by distros or anyone that
      doesn't care if NO_HZ is used or not.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Kevin Hilman <khilman@linaro.org>
      Signed-off-by: default avatarFrederic Weisbecker <fweisbec@gmail.com>
      543487c7
  5. 23 Jul, 2013 4 commits
    • Linus Torvalds's avatar
      Merge tag 'trace-3.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · b3a3a9c4
      Linus Torvalds authored
      Pull tracing fixes and cleanups from Steven Rostedt:
       "This contains fixes, optimizations and some clean ups
      
        Some of the fixes need to go back to 3.10.  They are minor, and deal
        mostly with incorrect ref counting in accessing event files.
      
        There was a couple of optimizations that should have perf perform a
        bit better when accessing trace events.
      
        And some various clean ups.  Some of the clean ups are necessary to
        help in a fix to a theoretical race between opening a event file and
        deleting that event"
      
      * tag 'trace-3.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracing: Kill the unbalanced tr->ref++ in tracing_buffers_open()
        tracing: Kill trace_array->waiter
        tracing: Do not (ab)use trace_seq in event_id_read()
        tracing: Simplify the iteration logic in f_start/f_next
        tracing: Add ref_data to function and fgraph tracer structs
        tracing: Miscellaneous fixes for trace_array ref counting
        tracing: Fix error handling to ensure instances can always be removed
        tracing/kprobe: Wait for disabling all running kprobe handlers
        tracing/perf: Move the PERF_MAX_TRACE_SIZE check into perf_trace_buf_prepare()
        tracing/syscall: Avoid perf_trace_buf_*() if sys_data->perf_events is empty
        tracing/function: Avoid perf_trace_buf_*() if event_function.perf_events is empty
        tracing: Typo fix on ring buffer comments
        tracing: Use trace_seq_puts()/trace_seq_putc() where possible
        tracing: Use correct config guard CONFIG_STACK_TRACER
      b3a3a9c4
    • Linus Torvalds's avatar
      Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux · a582e5f5
      Linus Torvalds authored
      Pull thermal management fixes from Zhang Rui:
       "These are fixes collected over the last week, they fixes several
        problems caused by the x86_pkg_temp_thermal introduced in 3.11-rc1.
      
        Specifics:
      
         - the x86_pkg_temp_thermal driver causes crash on systems with no
           package MSR support as there is a bug in the logic to check
           presence of DTHERM and PTS feature together.  Added a change so
           that when there is no PTS support, module doesn't get loaded.
      
         - fix krealloc() misuse in pkg_temp_thermal_device_add().
      
           If krealloc() returns NULL, it doesn't free the original.  Thus if
           we want to exit because of the krealloc() failure, we must make
           sure the original one is freed.
      
         - The error code path of the x86 package temperature thermal driver's
           initialization routine makes an unbalanced call to
           get_online_cpus(), which causes subsequent CPU offline operations,
           and consequently system suspend, to permanently block in
           cpu_hotplug_begin() on systems where get_core_online() returns an
           error code.
      
           Remove the extra get_online_cpus() to fix the problem"
      
      * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
        Thermal: Fix lockup of cpu_down()
        Thermal: x86_pkg_temp: Limit number of pkg temp zones
        Thermal: x86_pkg_temp: fix krealloc() misuse in in pkg_temp_thermal_device_add()
        Thermal: x86 package temp thermal crash
      a582e5f5
    • Linus Torvalds's avatar
      Merge tag 'gpio-for-v3.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio · b7371e31
      Linus Torvalds authored
      Pull gpio fixes from Linus Walleij:
       "A first round of GPIO fixes for the v3.11 series:
         - OMAP device tree boot fix
         - Handle an error condition in the MSM driver
      
        The OMAP patches have been around since around the merge window, but
        since they first caused more breakage I let them boil in -next for a
        while.  These should be fine now"
      
      * tag 'gpio-for-v3.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
        drivers: gpio: msm: Fix the error condition for reading ngpio
        gpio/omap: fix build error when OF_GPIO is not defined.
        gpio/omap: auto request GPIO as input if used as IRQ via DT
        gpio/omap: don't create an IRQ mapping for every GPIO on DT
      b7371e31
    • Linus Torvalds's avatar
      Merge branch 'for-3.11/drivers' of git://git.kernel.dk/linux-block · d4c90b1b
      Linus Torvalds authored
      Pull block IO driver bits from Jens Axboe:
       "As I mentioned in the core block pull request, due to real life
        circumstances the driver pull request would be late.  Now it looks
        like -rc2 late...  On the plus side, apart form the rsxx update, these
        are all things that I could argue could go in later in the cycle as
        they are fixes and not features.  So even though things are late, it's
        not ALL bad.
      
        The pull request contains:
      
         - Updates to bcache, all bug fixes, from Kent.
      
         - A pile of drbd bug fixes (no big features this time!).
      
         - xen blk front/back fixes.
      
         - rsxx driver updates, some of them deferred form 3.10.  So should be
           well cooked by now"
      
      * 'for-3.11/drivers' of git://git.kernel.dk/linux-block: (63 commits)
        bcache: Allocation kthread fixes
        bcache: Fix GC_SECTORS_USED() calculation
        bcache: Journal replay fix
        bcache: Shutdown fix
        bcache: Fix a sysfs splat on shutdown
        bcache: Advertise that flushes are supported
        bcache: check for allocation failures
        bcache: Fix a dumb race
        bcache: Use standard utility code
        bcache: Update email address
        bcache: Delete fuzz tester
        bcache: Document shrinker reserve better
        bcache: FUA fixes
        drbd: Allow online change of al-stripes and al-stripe-size
        drbd: Constants should be UPPERCASE
        drbd: Ignore the exit code of a fence-peer handler if it returns too late
        drbd: Fix rcu_read_lock balance on error path
        drbd: fix error return code in drbd_init()
        drbd: Do not sleep inside rcu
        bcache: Refresh usage docs
        ...
      d4c90b1b
  6. 22 Jul, 2013 2 commits
  7. 21 Jul, 2013 5 commits
    • Linus Torvalds's avatar
      Linux 3.11-rc2 · 3b2f64d0
      Linus Torvalds authored
      3b2f64d0
    • Linus Torvalds's avatar
      Merge tag 'acpi-video-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · ea45ea70
      Linus Torvalds authored
      Pull ACPI video support fixes from Rafael Wysocki:
       "I'm sending a separate pull request for this as it may be somewhat
        controversial.  The breakage addressed here is not really new and the
        fixes may not satisfy all users of the affected systems, but we've had
        so much back and forth dance in this area over the last several weeks
        that I think it's time to actually make some progress.
      
        The source of the problem is that about a year ago we started to tell
        BIOSes that we're compatible with Windows 8, which we really need to
        do, because some systems shipping with Windows 8 are tested with it
        and nothing else, so if we tell their BIOSes that we aren't compatible
        with Windows 8, we expose our users to untested BIOS/AML code paths.
      
        However, as it turns out, some Windows 8-specific AML code paths are
        not tested either, because Windows 8 actually doesn't use the ACPI
        methods containing them, so if we declare Windows 8 compatibility and
        attempt to use those ACPI methods, things break.  That occurs mostly
        in the backlight support area where in particular the _BCM and _BQC
        methods are plain unusable on some systems if the OS declares Windows
        8 compatibility.
      
        [ The additional twist is that they actually become usable if the OS
          says it is not compatible with Windows 8, but that may cause
          problems to show up elsewhere ]
      
        Investigation carried out by Matthew Garrett indicates that what
        Windows 8 does about backlight is to leave backlight control up to
        individual graphics drivers.  At least there's evidence that it does
        that if the Intel graphics driver is used, so we've decided to follow
        Windows 8 in that respect and allow i915 to control backlight (Daniel
        likes that part).
      
        The first commit from Aaron Lu makes ACPICA export the variable from
        which we can infer whether or not the BIOS believes that we are
        compatible with Windows 8.
      
        The second commit from Matthew Garrett prepares the ACPI video driver
        by making it initialize the ACPI backlight even if it is not going to
        be used afterward (that is needed for backlight control to work on
        Thinkpads).
      
        The third commit implements the actual workaround making i915 take
        over backlight control if the firmware thinks it's dealing with
        Windows 8 and is based on the work of multiple developers, including
        Matthew Garrett, Chun-Yi Lee, Seth Forshee, and Aaron Lu.
      
        The final commit from Aaron Lu makes us follow Windows 8 by informing
        the firmware through the _DOS method that it should not carry out
        automatic brightness changes, so that brightness can be controlled by
        GUI.
      
        Hopefully, this approach will allow us to avoid using blacklists of
        systems that should not declare Windows 8 compatibility just to avoid
        backlight control problems in the future.
      
         - Change from Aaron Lu makes ACPICA export a variable which can be
           used by driver code to determine whether or not the BIOS believes
           that we are compatible with Windows 8.
      
         - Change from Matthew Garrett makes the ACPI video driver initialize
           the ACPI backlight even if it is not going to be used afterward
           (that is needed for backlight control to work on Thinkpads).
      
         - Fix from Rafael J Wysocki implements Windows 8 backlight support
           workaround making i915 take over bakclight control if the firmware
           thinks it's dealing with Windows 8.  Based on the work of multiple
           developers including Matthew Garrett, Chun-Yi Lee, Seth Forshee,
           and Aaron Lu.
      
         - Fix from Aaron Lu makes the kernel follow Windows 8 by informing
           the firmware through the _DOS method that it should not carry out
           automatic brightness changes, so that brightness can be controlled
           by GUI"
      
      * tag 'acpi-video-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI / video: no automatic brightness changes by win8-compatible firmware
        ACPI / video / i915: No ACPI backlight if firmware expects Windows 8
        ACPI / video: Always call acpi_video_init_brightness() on init
        ACPICA: expose OSI version
      ea45ea70
    • Linus Torvalds's avatar
      Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 90db76e8
      Linus Torvalds authored
      Pull ext[34] tmpfile bugfix from Ted Ts'o:
       "Fix regression caused by commit af51a2ac which added ->tmpfile()
        support (along with a similar fix for ext3)"
      
      * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext3: fix a BUG when opening a file with O_TMPFILE flag
        ext4: fix a BUG when opening a file with O_TMPFILE flag
      90db76e8
    • Zheng Liu's avatar
      ext3: fix a BUG when opening a file with O_TMPFILE flag · dda5690d
      Zheng Liu authored
      When we try to open a file with O_TMPFILE flag, we will trigger a bug.
      The root cause is that in ext4_orphan_add() we check ->i_nlink == 0 and
      this check always fails because we set ->i_nlink = 1 in
      inode_init_always().  We can use the following program to trigger it:
      
      int main(int argc, char *argv[])
      {
      	int fd;
      
      	fd = open(argv[1], O_TMPFILE, 0666);
      	if (fd < 0) {
      		perror("open ");
      		return -1;
      	}
      	close(fd);
      	return 0;
      }
      
      The oops message looks like this:
      
      kernel: kernel BUG at fs/ext3/namei.c:1992!
      kernel: invalid opcode: 0000 [#1] SMP
      kernel: Modules linked in: ext4 jbd2 crc16 cpufreq_ondemand ipv6 dm_mirror dm_region_hash dm_log dm_mod parport_pc parport serio_raw sg dcdbas pcspkr i2c_i801 ehci_pci ehci_hcd button acpi_cpufreq mperf e1000e ptp pps_core ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core ext3 jbd sd_mod ahci libahci libata scsi_mod uhci_hcd
      kernel: CPU: 0 PID: 2882 Comm: tst_tmpfile Not tainted 3.11.0-rc1+ #4
      kernel: Hardware name: Dell Inc. OptiPlex 780 /0V4W66, BIOS A05 08/11/2010
      kernel: task: ffff880112d30050 ti: ffff8801124d4000 task.ti: ffff8801124d4000
      kernel: RIP: 0010:[<ffffffffa00db5ae>] [<ffffffffa00db5ae>] ext3_orphan_add+0x6a/0x1eb [ext3]
      kernel: RSP: 0018:ffff8801124d5cc8  EFLAGS: 00010202
      kernel: RAX: 0000000000000000 RBX: ffff880111510128 RCX: ffff8801114683a0
      kernel: RDX: 0000000000000000 RSI: ffff880111510128 RDI: ffff88010fcf65a8
      kernel: RBP: ffff8801124d5d18 R08: 0080000000000000 R09: ffffffffa00d3b7f
      kernel: R10: ffff8801114683a0 R11: ffff8801032a2558 R12: 0000000000000000
      kernel: R13: ffff88010fcf6800 R14: ffff8801032a2558 R15: ffff8801115100d8
      kernel: FS:  00007f5d172b5700(0000) GS:ffff880117c00000(0000) knlGS:0000000000000000
      kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      kernel: CR2: 00007f5d16df15d0 CR3: 0000000110b1d000 CR4: 00000000000407f0
      kernel: Stack:
      kernel: 000000000000000c ffff8801048a7dc8 ffff8801114685a8 ffffffffa00b80d7
      kernel: ffff8801124d5e38 ffff8801032a2558 ffff88010ce24d68 0000000000000000
      kernel: ffff88011146b300 ffff8801124d5d44 ffff8801124d5d78 ffffffffa00db7e1
      kernel: Call Trace:
      kernel: [<ffffffffa00b80d7>] ? journal_start+0x8c/0xbd [jbd]
      kernel: [<ffffffffa00db7e1>] ext3_tmpfile+0xb2/0x13b [ext3]
      kernel: [<ffffffff821076f8>] path_openat+0x11f/0x5e7
      kernel: [<ffffffff821c86b4>] ? list_del+0x11/0x30
      kernel: [<ffffffff82065fa2>] ?  __dequeue_entity+0x33/0x38
      kernel: [<ffffffff82107cd5>] do_filp_open+0x3f/0x8d
      kernel: [<ffffffff82112532>] ? __alloc_fd+0x50/0x102
      kernel: [<ffffffff820f9296>] do_sys_open+0x13b/0x1cd
      kernel: [<ffffffff820f935c>] SyS_open+0x1e/0x20
      kernel: [<ffffffff82398c02>] system_call_fastpath+0x16/0x1b
      kernel: Code: 39 c7 0f 85 67 01 00 00 0f b7 03 25 00 f0 00 00 3d 00 40 00 00 74 18 3d 00 80 00 00 74 11 3d 00 a0 00 00 74 0a 83 7b 48 00 74 04 <0f> 0b eb fe 49 8b 85 50 03 00 00 4c 89 f6 48 c7 c7 c0 99 0e a0
      kernel: RIP  [<ffffffffa00db5ae>] ext3_orphan_add+0x6a/0x1eb [ext3]
      kernel: RSP <ffff8801124d5cc8>
      
      Here we couldn't call clear_nlink() directly because in d_tmpfile() we
      will call inode_dec_link_count() to decrease ->i_nlink.  So this commit
      tries to call d_tmpfile() before ext4_orphan_add() to fix this problem.
      Signed-off-by: default avatarZheng Liu <wenqing.lz@taobao.com>
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      dda5690d
    • Zheng Liu's avatar
      ext4: fix a BUG when opening a file with O_TMPFILE flag · e94bd349
      Zheng Liu authored
      When we try to open a file with O_TMPFILE flag, we will trigger a bug.
      The root cause is that in ext4_orphan_add() we check ->i_nlink == 0 and
      this check always fails because we set ->i_nlink = 1 in
      inode_init_always().  We can use the following program to trigger it:
      
      int main(int argc, char *argv[])
      {
      	int fd;
      
      	fd = open(argv[1], O_TMPFILE, 0666);
      	if (fd < 0) {
      		perror("open ");
      		return -1;
      	}
      	close(fd);
      	return 0;
      }
      
      The oops message looks like this:
      
      kernel BUG at fs/ext4/namei.c:2572!
      invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
      Modules linked in: dlci bridge stp hidp cmtp kernelcapi l2tp_ppp l2tp_netlink l2tp_core sctp libcrc32c rfcomm tun fuse nfnetli
      nk can_raw ipt_ULOG can_bcm x25 scsi_transport_iscsi ipx p8023 p8022 appletalk phonet psnap vmw_vsock_vmci_transport af_key vmw_vmci rose vsock atm can netrom ax25 af_rxrpc ir
      da pppoe pppox ppp_generic slhc bluetooth nfc rfkill rds caif_socket caif crc_ccitt af_802154 llc2 llc snd_hda_codec_realtek snd_hda_intel snd_hda_codec serio_raw snd_pcm pcsp
      kr edac_core snd_page_alloc snd_timer snd soundcore r8169 mii sr_mod cdrom pata_atiixp radeon backlight drm_kms_helper ttm
      CPU: 1 PID: 1812571 Comm: trinity-child2 Not tainted 3.11.0-rc1+ #12
      Hardware name: Gigabyte Technology Co., Ltd. GA-MA78GM-S2H/GA-MA78GM-S2H, BIOS F12a 04/23/2010
      task: ffff88007dfe69a0 ti: ffff88010f7b6000 task.ti: ffff88010f7b6000
      RIP: 0010:[<ffffffff8125ce69>]  [<ffffffff8125ce69>] ext4_orphan_add+0x299/0x2b0
      RSP: 0018:ffff88010f7b7cf8  EFLAGS: 00010202
      RAX: 0000000000000000 RBX: ffff8800966d3020 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: ffff88007dfe70b8 RDI: 0000000000000001
      RBP: ffff88010f7b7d40 R08: ffff880126a3c4e0 R09: ffff88010f7b7ca0
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801271fd668
      R13: ffff8800966d2f78 R14: ffff88011d7089f0 R15: ffff88007dfe69a0
      FS:  00007f70441a3740(0000) GS:ffff88012a800000(0000) knlGS:00000000f77c96c0
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000002834000 CR3: 0000000107964000 CR4: 00000000000007e0
      DR0: 0000000000780000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
      Stack:
       0000000000002000 00000020810b6dde 0000000000000000 ffff88011d46db00
       ffff8800966d3020 ffff88011d7089f0 ffff88009c7f4c10 ffff88010f7b7f2c
       ffff88007dfe69a0 ffff88010f7b7da8 ffffffff8125cfac ffff880100000004
      Call Trace:
       [<ffffffff8125cfac>] ext4_tmpfile+0x12c/0x180
       [<ffffffff811cba78>] path_openat+0x238/0x700
       [<ffffffff8100afc4>] ? native_sched_clock+0x24/0x80
       [<ffffffff811cc647>] do_filp_open+0x47/0xa0
       [<ffffffff811db73f>] ? __alloc_fd+0xaf/0x200
       [<ffffffff811ba2e4>] do_sys_open+0x124/0x210
       [<ffffffff81010725>] ? syscall_trace_enter+0x25/0x290
       [<ffffffff811ba3ee>] SyS_open+0x1e/0x20
       [<ffffffff816ca8d4>] tracesys+0xdd/0xe2
       [<ffffffff81001001>] ? start_thread_common.constprop.6+0x1/0xa0
      Code: 04 00 00 00 89 04 24 31 c0 e8 c4 77 04 00 e9 43 fe ff ff 66 25 00 d0 66 3d 00 80 0f 84 0e fe ff ff 83 7b 48 00 0f 84 04 fe ff ff <0f> 0b 49 8b 8c 24 50 07 00 00 e9 88 fe ff ff 0f 1f 84 00 00 00
      
      Here we couldn't call clear_nlink() directly because in d_tmpfile() we
      will call inode_dec_link_count() to decrease ->i_nlink.  So this commit
      tries to call d_tmpfile() before ext4_orphan_add() to fix this problem.
      Reported-by: default avatarDave Jones <davej@redhat.com>
      Signed-off-by: default avatarZheng Liu <wenqing.lz@taobao.com>
      Tested-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Tested-by: default avatarDave Jones <davej@redhat.com>
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      Acked-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      e94bd349
  8. 20 Jul, 2013 5 commits