1. 08 Apr, 2013 3 commits
    • sched/cputime: Fix accounting on multi-threaded processes · e614b333
      Stanislaw Gruszka authored
      Recent commit 6fac4829 ("cputime: Use accessors to read task
      cputime stats") introduced a bug, where we account many times
      the cputime of the first thread, instead of cputimes of all
      the different threads.
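
The bug pattern described in this message can be pictured with a small userspace sketch. The struct and both functions are hypothetical names chosen for illustration and are not the kernel's code; the loops only show how walking a thread group while always reading the first thread inflates that thread's time and ignores the others.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one thread's accumulated CPU time. */
struct thread_stat {
    uint64_t utime;
    uint64_t stime;
};

/* Buggy pattern: the loop advances, but every pass reads threads[0]. */
uint64_t group_utime_buggy(const struct thread_stat *threads, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += threads[0].utime;
    return total;
}

/* Fixed pattern: read the thread the iterator currently points at. */
uint64_t group_utime_fixed(const struct thread_stat *threads, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += threads[i].utime;
    return total;
}
```

With three threads whose user times are 10, 20 and 30, the buggy variant reports 30 (three times the first thread) while the fixed one reports 60.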
      Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
      Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20130404085740.GA2495@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/debug: Fix sd->*_idx limit range avoiding overflow · fd9b86d3
      libin authored
      Commit 201c373e ("sched/debug: Limit sd->*_idx range on
      sysctl") was an incomplete bug fix.
      
      This patch fixes sd->*_idx limit range to [0 ~ CPU_LOAD_IDX_MAX-1]
      avoiding array overflow caused by setting sd->*_idx to CPU_LOAD_IDX_MAX
      on sysctl.
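
A minimal sketch of the range check this message describes, assuming an array with CPU_LOAD_IDX_MAX slots. The value 5, the array, and both helper functions are assumptions made for illustration and are not taken from the patch.

```c
#include <stdbool.h>

/* Assumed value for illustration; the kernel defines its own constant. */
#define CPU_LOAD_IDX_MAX 5

/* Hypothetical load array with CPU_LOAD_IDX_MAX slots. */
static unsigned long cpu_load[CPU_LOAD_IDX_MAX];

/* Valid indexes are 0 .. CPU_LOAD_IDX_MAX-1; CPU_LOAD_IDX_MAX itself is out of range. */
bool load_idx_valid(int idx)
{
    return idx >= 0 && idx <= CPU_LOAD_IDX_MAX - 1;
}

/* Returns 0 on success and -1 if idx would index past the array. */
int read_cpu_load(int idx, unsigned long *out)
{
    if (!load_idx_valid(idx))
        return -1;
    *out = cpu_load[idx];
    return 0;
}
```

Allowing idx == CPU_LOAD_IDX_MAX, as an off-by-one upper limit would, reads one element past the end of the array.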
      Signed-off-by: Libin <huawei.libin@huawei.com>
      Cc: <jiang.liu@huawei.com>
      Cc: <guohanjun@huawei.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/51626610.2040607@huawei.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched_clock: Prevent 64bit inatomicity on 32bit systems · a1cbcaa9
      Thomas Gleixner authored
      The sched_clock_remote() implementation has the following inatomicity
      problem on 32bit systems when accessing the remote scd->clock, which
      is a 64bit value.
      
      CPU0			CPU1
      
      sched_clock_local()	sched_clock_remote(CPU0)
      ...
      			remote_clock = scd[CPU0]->clock
      			    read_low32bit(scd[CPU0]->clock)
      cmpxchg64(scd->clock,...)
      			    read_high32bit(scd[CPU0]->clock)
      
      While the update of scd->clock is using an atomic64 mechanism, the
      readout on the remote cpu is not, which can cause completely bogus
      readouts.
      
      It is a quite rare problem, because it requires the update to hit the
      narrow race window between the low/high readout and the update must go
      across the 32bit boundary.
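
The effect of such a torn readout can be worked through with plain arithmetic. The sketch below is a userspace illustration, not kernel code, and torn_read() is a hypothetical helper: it combines the low 32 bits of the value before the update with the high 32 bits of the value after it, which is what a reader sees when the update lands between its two 32-bit loads.

```c
#include <stdint.h>

/* Combine the low half of the old value with the high half of the new one. */
uint64_t torn_read(uint64_t old_clock, uint64_t new_clock)
{
    uint32_t low = (uint32_t)old_clock;          /* loaded before the update */
    uint32_t high = (uint32_t)(new_clock >> 32); /* loaded after the update */
    return ((uint64_t)high << 32) | low;
}
```

For example, with an old value of 0x1FFFFFF00 and a new value of 0x200000100 (nanoseconds), the torn result is 0x2FFFFFF00, which is 4294966784 ns, or roughly 4.29 seconds, ahead of the real clock. If the update does not cross the 32-bit boundary, the high halves match and the reader simply sees the old value.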
      
      The resulting misbehaviour is that CPU1 will see the sched_clock on
      CPU0 ~4 seconds ahead of its own and update CPU1's sched_clock value
      to this bogus timestamp. Due to the clamping implementation, this
      stays that way for about 4 seconds until the synchronization with
      CLOCK_MONOTONIC undoes the problem.
      
      The issue is hard to observe, because it might only result in a less
      accurate SCHED_OTHER timeslicing behaviour. To create observable
      damage on realtime scheduling classes, it is necessary that the bogus
      update of CPU1's sched_clock happens in the context of a realtime
      thread, which then gets charged 4 seconds of RT runtime, which causes
      the RT throttler mechanism to trigger and prevent scheduling of RT
      tasks for a little less than 4 seconds. So this is quite unlikely as
      well.
      
      The issue was quite hard to decode as the reproduction time is between
      2 days and 3 weeks and intrusive tracing makes it less likely, but the
      following trace recorded with trace_clock=global, which uses
      sched_clock_local(), gave the final hint:
      
        <idle>-0   0d..30 400269.477150: hrtimer_cancel: hrtimer=0xf7061e80
        <idle>-0   0d..30 400269.477151: hrtimer_start:  hrtimer=0xf7061e80 ...
      irq/20-S-587 1d..32 400273.772118: sched_wakeup:   comm= ... target_cpu=0
        <idle>-0   0dN.30 400273.772118: hrtimer_cancel: hrtimer=0xf7061e80
      
      What happens is that CPU0 goes idle and invokes
      sched_clock_idle_sleep_event() which invokes sched_clock_local() and
      CPU1 runs a remote wakeup for CPU0 at the same time, which invokes
      sched_clock_remote(). The time jump gets propagated to CPU0 via
      sched_clock_remote() and stays stale on both cores for ~4 seconds.
      
      There are only two other possibilities, which could cause a stale
      sched clock:
      
      1) ktime_get() which reads out CLOCK_MONOTONIC returns a sporadic
         wrong value.
      
      2) sched_clock() which reads the TSC returns a sporadic wrong value.
      
      #1 can be excluded because sched_clock would continue to increase for
         one jiffy and then go stale.
      
      #2 can be excluded because it would not make the clock jump
         forward. It would just result in a stale sched_clock for one jiffy.
      
      After quite some brain twisting and finding the same pattern on other
      traces, sched_clock_remote() remained the only place which could cause
      such a problem and as explained above it's indeed racy on 32bit
      systems.
      
      So while on 64bit systems the readout is atomic, we need to verify the
      remote readout on 32bit machines. We need to protect the local->clock
      readout in sched_clock_remote() on 32bit as well because an NMI could
      hit between the low and the high readout, call sched_clock_local() and
      modify local->clock.
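
One common way to get an untorn 64-bit readout is to perform a compare-and-exchange whose old and new values are identical, so the operation never changes the variable but always returns its current contents atomically. The following is a userspace sketch of that idea using C11 atomics, offered under the assumption that the platform provides a 64-bit atomic compare-and-exchange. It is not the kernel patch itself, and clock_read_atomic() is a hypothetical name.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Read a 64-bit clock without tearing, even where plain 64-bit loads
 * are split into two 32-bit loads.
 *
 * If *clock is 0, the exchange stores 0 again (no visible change) and
 * expected stays 0. Otherwise the exchange fails and expected is filled
 * with the current value. Either way expected ends up holding *clock.
 */
uint64_t clock_read_atomic(_Atomic uint64_t *clock)
{
    uint64_t expected = 0;
    atomic_compare_exchange_strong(clock, &expected, 0);
    return expected;
}
```

The readout never modifies the clock, so it can be used on both the remote and the local value.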
      
      Thanks to Siegfried Wulsch for bearing with my debug requests and
      going through the tedious tasks of running a bunch of reproducer
      systems to generate the debug information which let me decode the
      issue.
      Reported-by: Siegfried Wulsch <Siegfried.Wulsch@rovema.de>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1304051544160.21884@ionos
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
  2. 21 Mar, 2013 1 commit
  3. 24 Feb, 2013 2 commits
    • cputime: Use local_clock() for full dynticks cputime accounting · 7f6575f1
      Frederic Weisbecker authored
      Running the full dynticks cputime accounting with preemptible
      kernel debugging triggers the following warning:
      
      	[    4.488303] BUG: using smp_processor_id() in preemptible [00000000] code: init/1
      	[    4.490971] caller is native_sched_clock+0x22/0x80
      	[    4.493663] Pid: 1, comm: init Not tainted 3.8.0+ #13
      	[    4.496376] Call Trace:
      	[    4.498996]  [<ffffffff813410eb>] debug_smp_processor_id+0xdb/0xf0
      	[    4.501716]  [<ffffffff8101e642>] native_sched_clock+0x22/0x80
      	[    4.504434]  [<ffffffff8101db99>] sched_clock+0x9/0x10
      	[    4.507185]  [<ffffffff81096ccd>] fetch_task_cputime+0xad/0x120
      	[    4.509916]  [<ffffffff81096dd5>] task_cputime+0x35/0x60
      	[    4.512622]  [<ffffffff810f146e>] acct_update_integrals+0x1e/0x40
      	[    4.515372]  [<ffffffff8117d2cf>] do_execve_common+0x4ff/0x5c0
      	[    4.518117]  [<ffffffff8117cf14>] ? do_execve_common+0x144/0x5c0
      	[    4.520844]  [<ffffffff81867a10>] ? rest_init+0x160/0x160
      	[    4.523554]  [<ffffffff8117d457>] do_execve+0x37/0x40
      	[    4.526276]  [<ffffffff810021a3>] run_init_process+0x23/0x30
      	[    4.528953]  [<ffffffff81867aac>] kernel_init+0x9c/0xf0
      	[    4.531608]  [<ffffffff8188356c>] ret_from_fork+0x7c/0xb0
      
      We use sched_clock() to perform and fix up the cputime
      accounting. However, we are calling it with preemption enabled
      from the read side, which triggers the bug above.

      To fix this up, use local_clock() instead. It takes care of
      preemption and also provides a more reliable clock source. This
      is welcome for this kind of statistic that is widely relied on
      in userspace.
      Reported-by: Thomas Gleixner <tglx@linutronix.de>
      Reported-by: Ingo Molnar <mingo@kernel.org>
      Suggested-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Li Zhong <zhong@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Kevin Hilman <khilman@linaro.org>
      Link: http://lkml.kernel.org/r/1361636925-22288-3-git-send-email-fweisbec@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • cputime: Constify timeval_to_cputime(timeval) argument · c78a4bcd
      Li Zhong authored
      Saw the following compiler warning on the linux-next tree:
      
        kernel/itimer.c: In function 'set_cpu_itimer':
        kernel/itimer.c:152:2: warning: passing argument 1 of 'timeval_to_cputime' discards 'const' qualifier from pointer target type [enabled by default]
        ...
      
      timeval_to_cputime() is always passed a constant timeval in
      argument, we need to teach the nsecs based cputime
      implementation about that.
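
The kind of signature change this message asks for can be sketched in userspace as follows. The type cputime_ns_t and the function timeval_to_cputime_ns() are hypothetical names for illustration and do not reproduce the kernel's definitions; the point is only that a pointer-to-const parameter accepts a const timeval without the "discards 'const' qualifier" warning.

```c
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical nanosecond-based cputime type, for illustration only. */
typedef uint64_t cputime_ns_t;

/* Taking a pointer to const lets callers pass a const struct timeval. */
cputime_ns_t timeval_to_cputime_ns(const struct timeval *val)
{
    return (cputime_ns_t)val->tv_sec * 1000000000ULL
         + (cputime_ns_t)val->tv_usec * 1000ULL;
}
```

A caller holding `const struct timeval tv = { .tv_sec = 2, .tv_usec = 500 };` can pass `&tv` directly and gets 2000500000 nanoseconds back.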
      Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Kevin Hilman <khilman@linaro.org>
      Link: http://lkml.kernel.org/r/1361636925-22288-2-git-send-email-fweisbec@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  4. 22 Feb, 2013 3 commits
  5. 20 Feb, 2013 19 commits
    • Sha Zhengju's avatar
    • Merge branch 'for-3.9-async' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · ece8e0b2
      Linus Torvalds authored
      Pull async changes from Tejun Heo:
       "These are followups for the earlier deadlock issue involving async
        ending up waiting for itself through block requesting module[1].  The
        following changes are made by these commits.
      
         - Instead of requesting default elevator on each request_queue init,
           block now requests it once early during boot.
      
         - Kmod triggers warning if invoked from an async worker.
      
         - Async synchronization implementation has been reimplemented.  It's
           a lot simpler now."
      
      * 'for-3.9-async' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
        async: initialise list heads to fix crash
        async: replace list of active domains with global list of pending items
        async: keep pending tasks on async_domain and remove async_pending
        async: use ULLONG_MAX for infinity cookie value
        async: bring sanity to the use of words domain and running
        async, kmod: warn on synchronous request_module() from async workers
        block: don't request module during elevator init
        init, block: try to load default elevator module early during boot
    • Merge branch 'for-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · 67cb104b
      Linus Torvalds authored
      Pull workqueue changes from Tejun Heo:
       "A lot of reorganization is going on mostly to prepare for worker pools
        with custom attributes so that workqueue can replace custom pool
        implementations in places including writeback and btrfs and make CPU
        assignment in crypto more flexible.
      
        workqueue evolved from purely per-cpu design and implementation, so
        there are a lot of assumptions regarding being bound to CPUs and even
        unbound workqueues are implemented as an extension of the model -
        workqueues running on the special unbound CPU.  Bulk of changes this
        round are about promoting worker_pools as the top level abstraction
        replacing global_cwq (global cpu workqueue).  At this point, I'm
        fairly confident about getting custom worker pools working pretty soon
        and ready for the next merge window.
      
        Lai's patches are replacing the convoluted mb() dancing workqueue has
        been doing with much simpler mechanism which only depends on
        assignment atomicity of long.  For details, please read the commit
        message of 0b3dae68 ("workqueue: simplify is-work-item-queued-here
        test").  While the change ends up adding one pointer to struct
        delayed_work, the size increase is less than five percent
        and it decouples delayed_work logic much more cleanly from usual work
        handling, removes the unusual memory barrier dancing, and allows for
        further simplification, so I think the trade-off is acceptable.
      
        There will be two more workqueue related pull requests and there are
        some shared commits among them.  I'll write further pull requests
        assuming this pull request is pulled first."
      
      * 'for-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (37 commits)
        workqueue: un-GPL function delayed_work_timer_fn()
        workqueue: rename cpu_workqueue to pool_workqueue
        workqueue: reimplement is_chained_work() using current_wq_worker()
        workqueue: fix is_chained_work() regression
        workqueue: pick cwq instead of pool in __queue_work()
        workqueue: make get_work_pool_id() cheaper
        workqueue: move nr_running into worker_pool
        workqueue: cosmetic update in try_to_grab_pending()
        workqueue: simplify is-work-item-queued-here test
        workqueue: make work->data point to pool after try_to_grab_pending()
        workqueue: add delayed_work->wq to simplify reentrancy handling
        workqueue: make work_busy() test WORK_STRUCT_PENDING first
        workqueue: replace WORK_CPU_NONE/LAST with WORK_CPU_END
        workqueue: post global_cwq removal cleanups
        workqueue: rename nr_running variables
        workqueue: remove global_cwq
        workqueue: remove worker_pool->gcwq
        workqueue: replace for_each_worker_pool() with for_each_std_worker_pool()
        workqueue: make freezing/thawing per-pool
        workqueue: make hotplug processing per-pool
        ...
    • Merge branch 'for-3.9-cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · 1eaec821
      Linus Torvalds authored
      Pull workqueue [delayed_]work_pending() cleanups from Tejun Heo:
       "This is part of on-going cleanups to remove / minimize usages of
        workqueue interfaces which are deprecated and/or misleading.
      
        This round drops a number of usages of [delayed_]work_pending(), which
        are dangerous as they lack any form of synchronization and thus often
        lead to buggy / unnecessary code.  There are a couple of legitimate
        use cases in the kernel.  Hopefully, they can be converted and
        [delayed_]work_pending() can be removed completely.  Even if not,
        removing most of the misuses should make it more difficult to find
        examples of misuses and thus slow down their growth.
      
        These changes are independent from other workqueue changes."
      
      * 'for-3.9-cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
        wimax/i2400m: fix i2400m->wake_tx_skb handling
        kprobes: fix wait_for_kprobe_optimizer()
        ipw2x00: simplify scan_event handling
        video/exynos: don't use [delayed_]work_pending()
        tty/max3100: don't use [delayed_]work_pending()
        x86/mce: don't use [delayed_]work_pending()
        rfkill: don't use [delayed_]work_pending()
        wl1251: don't use [delayed_]work_pending()
        thinkpad_acpi: don't use [delayed_]work_pending()
        mwifiex: don't use [delayed_]work_pending()
        sja1000: don't use [delayed_]work_pending()
    • Merge branch 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 1a13c0b1
      Linus Torvalds authored
      Pull x86 UV3 support update from Ingo Molnar:
       "Support for the SGI Ultraviolet System 3 (UV3) platform - the upcoming
        third major iteration and upscaling of the SGI UV supercomputing
        platform."
      
      * 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86, uv, uv3: Trim MMR register definitions after code changes for SGI UV3
        x86, uv, uv3: Check current gru hub support for SGI UV3
        x86, uv, uv3: Update Time Support for SGI UV3
        x86, uv, uv3: Update x2apic Support for SGI UV3
        x86, uv, uv3: Update Hub Info for SGI UV3
        x86, uv, uv3: Update ACPI Check to include SGI UV3
        x86, uv, uv3: Update MMR register definitions for SGI Ultraviolet System 3 (UV3)
    • Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · f98982ce
      Linus Torvalds authored
      Pull x86 platform changes from Ingo Molnar:
      
       - Support for the Technologic Systems TS-5500 platform, by Vivien
         Didelot
      
       - Improved NUMA support on AMD systems:
      
         Add support for federated systems where multiple memory controllers
         can exist and see each other over multiple PCI domains.  This
         basically means that AMD node ids can be more than 8 now and the code
         handling this is taught to incorporate PCI domain into those IDs.
      
       - Support for the Goldfish virtual Android emulator, by Jun Nakajima,
         Intel, Google, et al.
      
       - Misc fixlets.
      
      * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86: Add TS-5500 platform support
        x86/srat: Simplify memory affinity init error handling
        x86/apb/timer: Remove unnecessary "if"
        goldfish: platform device for x86
        amd64_edac: Fix type usage in NB IDs and memory ranges
        amd64_edac: Fix PCI function lookup
        x86, AMD, NB: Use u16 for northbridge IDs in amd_get_nb_id
        x86, AMD, NB: Add multi-domain support
    • Merge branch 'x86-hyperv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 29d50523
      Linus Torvalds authored
      Pull x86/hyperv changes from Ingo Molnar:
       "The biggest change is support for Windows 8's improved hypervisor
        interrupt model on the Linux Hyper-V guest subsystem code side.
      
        Smallish fixes otherwise."
      
      * 'x86-hyperv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86, hyperv: HYPERV depends on X86_LOCAL_APIC
        X86: Handle Hyper-V vmbus interrupts as special hypervisor interrupts
        X86: Add a check to catch Xen emulation of Hyper-V
        x86: Hyper-V: register clocksource only if its advertised
    • Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 026f149c
      Linus Torvalds authored
      Pull x86/debug changes from Ingo Molnar:
       "Two init annotations and a built-in memtest speedup"
      
      * 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/memtest: Shorten time for tests
        x86: Convert a few mistaken __cpuinit annotations to __init
        x86/EFI: Properly init-annotate BGRT code
    • Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 11743a1d
      Linus Torvalds authored
      Pull x86 cleanup patches from Ingo Molnar:
       "Misc smaller cleanups"
      
      * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86: ptrace.c only needs export.h and not the full module.h
        x86, apb_timer: remove unused variable percpu_timer
        um: don't compare a pointer to 0
        arch/x86/platform/uv: use ARRAY_SIZE where possible
    • Merge branch 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 121027a7
      Linus Torvalds authored
      Pull two x86 kernel build changes from Ingo Molnar:
       "The first change modifies how 'make oldconfig' works on cross-bitness
        situations on x86.  It was felt the new behavior of preserving the
        bitness of the .config is more logical.  This is a leftover of the
        merge.
      
        The second change eliminates a Perl warning.  (A second, more
        complete fix that grew out of this warning fix is in flight to you
        via the kbuild tree; it will remove the timeconst.pl script
        altogether.)"
      
      * 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        timeconst.pl: Eliminate Perl warning
        x86: Default to ARCH=x86 to avoid overriding CONFIG_64BIT
    • Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 5abcd76f
      Linus Torvalds authored
      Pull x86 bootup changes from Ingo Molnar:
       "Deal with bootloaders which fail to initialize unknown fields in
        boot_params to zero, by sanitizing boot params passed in.
      
        This unbreaks versions of kexec-utils.  Other bootloaders do not
        appear to show sensitivity to this change, but it's a possibility for
        breakage nevertheless."
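
The general idea of sanitizing a parameter block that a loader may not have zeroed can be sketched as below. Everything here is an assumption made for illustration: the struct layout, the sentinel field, and the function name are hypothetical and are not the kernel's boot_params definition or the code from this pull.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily simplified parameter block; not the real layout. */
struct demo_boot_params {
    uint32_t known_field;   /* field every loader is expected to set */
    uint8_t  sentinel;      /* stays 0 only if the block was zeroed first */
    uint8_t  reserved[32];  /* fields an older loader may not know about */
};

/*
 * If the sentinel is non-zero, assume the block was not zeroed on
 * creation and clear the fields the loader may have left as garbage.
 */
void sanitize_demo_boot_params(struct demo_boot_params *bp)
{
    if (bp->sentinel) {
        memset(bp->reserved, 0, sizeof(bp->reserved));
        bp->sentinel = 0;
    }
}
```

Fields the loader is known to fill in are left untouched; only the ones it may not know about are reset.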
      
      * 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86, boot: Sanitize boot_params if not zeroed on creation
    • Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a57ed936
      Linus Torvalds authored
      Pull x86/asm changes from Ingo Molnar:
       "The biggest change (by line count) is the unification of the XOR code
        and then the introduction of an additional SSE based XOR assembly
        method.
      
        The other bigger change is the head_32.S rework/cleanup by Borislav
        Petkov.
      
        Last but not least there's the usual laundry list of small but
        dangerous (and hopefully perfectly tested) changes to subtle low level
        x86 code, plus cleanups."
      
      * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86, head_32: Give the 6 label a real name
        x86, head_32: Remove second CPUID detection from default_entry
        x86: Detect CPUID support early at boot
        x86, head_32: Remove i386 pieces
        x86: Require MOVBE feature in cpuid when we use it
        x86: Enable ARCH_USE_BUILTIN_BSWAP
        x86/xor: Add alternative SSE implementation only prefetching once per 64-byte line
        x86/xor: Unify SSE-base xor-block routines
        x86: Fix a typo
        x86/mm: Fix the argument passed to sync_global_pgds()
        x86/mm: Convert update_mmu_cache() and update_mmu_cache_pmd() to functions
        ix86: Tighten asmlinkage_protect() constraints
    • Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 5800700f
      Linus Torvalds authored
      Pull x86/apic changes from Ingo Molnar:
       "Main changes:
      
         - Multiple MSI support added to the APIC, PCI and AHCI code - acked
           by all relevant maintainers, by Alexander Gordeev.
      
           The advantage is that multiple AHCI ports can have multiple MSI
           irqs assigned, and can thus spread to multiple CPUs.
      
           [ Drivers can make use of this new facility via the
             pci_enable_msi_block_auto() method ]
      
         - x86 IOAPIC code cleanups for interrupt remapping, from Joerg
           Roedel:

           These patches move all interrupt remapping specific checks out of
           the x86 core code and replace the respective call-sites with
           function pointers.  As a result the interrupt remapping code is
           better abstracted from x86 core interrupt handling code.
      
         - Various smaller improvements, fixes and cleanups."
      
      * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits)
        x86/intel/irq_remapping: Clean up x2apic opt-out security warning mess
        x86, kvm: Fix intialization warnings in kvm.c
        x86, irq: Move irq_remapped out of x86 core code
        x86, io_apic: Introduce eoi_ioapic_pin call-back
        x86, msi: Introduce x86_msi.compose_msi_msg call-back
        x86, irq: Introduce setup_remapped_irq()
        x86, irq: Move irq_remapped() check into free_remapped_irq
        x86, io-apic: Remove !irq_remapped() check from __target_IO_APIC_irq()
        x86, io-apic: Move CONFIG_IRQ_REMAP code out of x86 core
        x86, irq: Add data structure to keep AMD specific irq remapping information
        x86, irq: Move irq_remapping_enabled declaration to iommu code
        x86, io_apic: Remove irq_remapping_enabled check in setup_timer_IRQ0_pin
        x86, io_apic: Move irq_remapping_enabled checks out of check_timer()
        x86, io_apic: Convert setup_ioapic_entry to function pointer
        x86, io_apic: Introduce set_affinity function pointer
        x86, msi: Use IRQ remapping specific setup_msi_irqs routine
        x86, hpet: Introduce x86_msi_ops.setup_hpet_msi
        x86, io_apic: Introduce x86_io_apic_ops.print_entries for debugging
        x86, io_apic: Introduce x86_io_apic_ops.disable()
        x86, apic: Mask IO-APIC and PIC unconditionally on LAPIC resume
        ...
    • Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 266d7ad7
      Linus Torvalds authored
      Pull timer changes from Ingo Molnar:
       "Main changes:
      
         - ntp: Add CONFIG_RTC_SYSTOHC: a generic RTC driver facility
           complementing the existing CONFIG_RTC_HCTOSYS, which uses NTP to
           keep the hardware clock updated.
      
         - posix-timers: Fix clock_adjtime to always return timex data on
           success.  This changes the ABI, but no breakage was expected and
           none was found - caution is warranted nevertheless.
      
         - platform persistent clock improvements/cleanups.
      
         - clockevents: refactor timer broadcast handling to be more generic
           and less duplicated with matching architecture code (mostly ARM
           motivated.)
      
         - various fixes and cleanups"
      
      * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        timers/x86/hpet: Use HPET_COUNTER to specify the hpet counter in vread_hpet()
        posix-cpu-timers: Fix nanosleep task_struct leak
        clockevents: Fix generic broadcast for FEAT_C3STOP
        time, Fix setting of hardware clock in NTP code
        hrtimer: Prevent hrtimer_enqueue_reprogram race
        clockevents: Add generic timer broadcast function
        clockevents: Add generic timer broadcast receiver
        timekeeping: Switch HAS_PERSISTENT_CLOCK to ALWAYS_USE_PERSISTENT_CLOCK
        x86/time/rtc: Don't print extended CMOS year when reading RTC
        x86: Select HAS_PERSISTENT_CLOCK on x86
        timekeeping: Add CONFIG_HAS_PERSISTENT_CLOCK option
        rtc: Skip the suspend/resume handling if persistent clock exist
        timekeeping: Add persistent_clock_exist flag
        posix-timers: Fix clock_adjtime to always return timex data on success
        Round the calculated scale factor in set_cyc2ns_scale()
        NTP: Add a CONFIG_RTC_SYSTOHC configuration
        MAINTAINERS: Update John Stultz's email
        time: create __getnstimeofday for WARNless calls
    • Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · bcbd818c
      Linus Torvalds authored
      Pull preparatory smp/hotplug patches from Ingo Molnar:
       "Some early preparatory changes for the WIP hotplug rework by Thomas
        Gleixner."
      
      * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        stop_machine: Use smpboot threads
        stop_machine: Store task reference in a separate per cpu variable
        smpboot: Allow selfparking per cpu threads
    • Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · d652e1eb
      Linus Torvalds authored
      Pull scheduler changes from Ingo Molnar:
       "Main changes:
      
         - scheduler side full-dynticks (user-space execution is undisturbed
           and receives no timer IRQs) preparation changes that convert the
           cputime accounting code to be full-dynticks ready, from Frederic
           Weisbecker.
      
         - Initial sched.h split-up changes, by Clark Williams
      
         - select_idle_sibling() performance improvement by Mike Galbraith:
      
              " 1 tbench pair (worst case) in a 10 core + SMT package:
      
                pre   15.22 MB/sec 1 procs
                post 252.01 MB/sec 1 procs "
      
         - sched_rr_get_interval() ABI fix/change.  We think this detail is
           not used by apps (so it's not an ABI in practice), but let's keep
           it under observation.

         - misc RT scheduling cleanups, optimizations"
      
      * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
        sched/rt: Add <linux/sched/rt.h> header to <linux/init_task.h>
        cputime: Remove irqsave from seqlock readers
        sched, powerpc: Fix sched.h split-up build failure
        cputime: Restore CPU_ACCOUNTING config defaults for PPC64
        sched/rt: Move rt specific bits into new header file
        sched/rt: Add a tuning knob to allow changing SCHED_RR timeslice
        sched: Move sched.h sysctl bits into separate header
        sched: Fix signedness bug in yield_to()
        sched: Fix select_idle_sibling() bouncing cow syndrome
        sched/rt: Further simplify pick_rt_task()
        sched/rt: Do not account zero delta_exec in update_curr_rt()
        cputime: Safely read cputime of full dynticks CPUs
        kvm: Prepare to add generic guest entry/exit callbacks
        cputime: Use accessors to read task cputime stats
        cputime: Allow dynamic switch between tick/virtual based cputime accounting
        cputime: Generic on-demand virtual cputime accounting
        cputime: Move default nsecs_to_cputime() to jiffies based cputime file
        cputime: Librarize per nsecs resolution cputime definitions
        cputime: Avoid multiplication overflow on utime scaling
        context_tracking: Export context state for generic vtime
        ...
      
      Fix up conflict in kernel/context_tracking.c due to comment additions.
    • Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 8f55cea4
      Linus Torvalds authored
      Pull perf changes from Ingo Molnar:
       "There are lots of improvements, the biggest changes are:
      
        Main kernel side changes:
      
         - Improve uprobes performance by adding 'pre-filtering' support, by
           Oleg Nesterov.
      
         - Make some POWER7 events available in sysfs, equivalent to what was
           done on x86, from Sukadev Bhattiprolu.
      
         - tracing updates by Steve Rostedt - mostly misc fixes and smaller
           improvements.
      
         - Use perf/event tracing to report PCI Express advanced errors, by
           Tony Luck.
      
         - Enable northbridge performance counters on AMD family 15h, by Jacob
           Shin.
      
         - This tracing commit:
      
              tracing: Remove the extra 4 bytes of padding in events
      
           changes the ABI.  All involved parties (PowerTop in particular)
           seem to agree that it's safe to do now with the introduction of
           libtraceevent, but the devil is in the details ...
      
        Main tooling side changes:
      
         - Add 'event group view', from Namhyung Kim:
      
           To use it, 'perf record' should group events when recording.  And
           then perf report parses the saved group relation from file header
           and prints them together if --group option is provided.  You can
           use the 'perf evlist' command to see event group information:
      
              $ perf record -e '{ref-cycles,cycles}' noploop 1
              [ perf record: Woken up 2 times to write data ]
              [ perf record: Captured and wrote 0.385 MB perf.data (~16807 samples) ]
      
              $ perf evlist --group
              {ref-cycles,cycles}
      
           With this example, default perf report will show you each event
           separately.
      
           You can use --group option to enable event group view:
      
              $ perf report --group
              ...
              # group: {ref-cycles,cycles}
              # ========
              # Samples: 7K of event 'anon group { ref-cycles, cycles }'
              # Event count (approx.): 6876107743
              #
              #         Overhead  Command      Shared Object                      Symbol
              # ................  .......  .................  ..........................
                  99.84%  99.76%  noploop  noploop            [.] main
                   0.07%   0.00%  noploop  ld-2.15.so         [.] strcmp
                   0.03%   0.00%  noploop  [kernel.kallsyms]  [k] timerqueue_del
                   0.03%   0.03%  noploop  [kernel.kallsyms]  [k] sched_clock_cpu
                   0.02%   0.00%  noploop  [kernel.kallsyms]  [k] account_user_time
                   0.01%   0.00%  noploop  [kernel.kallsyms]  [k] __alloc_pages_nodemask
                   0.00%   0.00%  noploop  [kernel.kallsyms]  [k] native_write_msr_safe
                   0.00%   0.11%  noploop  [kernel.kallsyms]  [k] _raw_spin_lock
                   0.00%   0.06%  noploop  [kernel.kallsyms]  [k] find_get_page
                   0.00%   0.02%  noploop  [kernel.kallsyms]  [k] rcu_check_callbacks
                   0.00%   0.02%  noploop  [kernel.kallsyms]  [k] __current_kernel_time
      
           As you can see, the Overhead column now contains both ref-cycles
           and cycles, and the header line also shows the group information:
           'anon group { ref-cycles, cycles }'.  The output is sorted by the
           period of the group leader first.
      
         - Initial GTK+ annotate browser, from Namhyung Kim.
      
         - Add option for runtime switching perf data file in perf report,
           just press 's' and a menu with the valid files found in the current
           directory will be presented, from Feng Tang.
      
         - Add support to display whole group data for raw columns, from Jiri
           Olsa.
      
         - Add per processor socket count aggregation in perf stat, from
           Stephane Eranian.
      
         - Add interval printing in 'perf stat', from Stephane Eranian.
      
         - 'perf test' improvements
      
         - Add support for wildcards in tracepoint system name, from Jiri
           Olsa.
      
         - Add anonymous huge page recognition, from Joshua Zhu.
      
         - perf build-id cache now can show DSOs present in a perf.data file
           that are not in the cache, to integrate with build-id servers being
           put in place by organizations such as Fedora.
      
         - perf top now shares more of the evsel config/creation routines with
           'record', paving the way for further integration like 'top'
           snapshots, etc.
      
         - perf top now supports DWARF callchains.
      
         - Fix mmap limitations on 32-bit, fix from David Miller.
      
         - 'perf bench numa mem' NUMA performance measurement suite
      
         - ... and lots of fixes, performance improvements, cleanups and other
           improvements I failed to list - see the shortlog and git log for
           details."
      
      * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (270 commits)
        perf/x86/amd: Enable northbridge performance counters on AMD family 15h
        perf/hwbp: Fix cleanup in case of kzalloc failure
        perf tools: Fix build with bison 2.3 and older.
        perf tools: Limit unwind support to x86 archs
        perf annotate: Make it to be able to skip unannotatable symbols
        perf gtk/annotate: Fail early if it can't annotate
        perf gtk/annotate: Show source lines with gray color
        perf gtk/annotate: Support multiple event annotation
        perf ui/gtk: Implement basic GTK2 annotation browser
        perf annotate: Fix warning message on a missing vmlinux
        perf buildid-cache: Add --update option
        uprobes/perf: Avoid uprobe_apply() whenever possible
        uprobes/perf: Teach trace_uprobe/perf code to use UPROBE_HANDLER_REMOVE
        uprobes/perf: Teach trace_uprobe/perf code to pre-filter
        uprobes/perf: Teach trace_uprobe/perf code to track the active perf_event's
        uprobes: Introduce uprobe_apply()
        perf: Introduce hw_perf_event->tp_target and ->tp_list
        uprobes/perf: Always increment trace_uprobe->nhit
        uprobes/tracing: Kill uprobe_trace_consumer, embed uprobe_consumer into trace_uprobe
        uprobes/tracing: Introduce is_trace_uprobe_enabled()
        ...
      8f55cea4
    • Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b7133a9a
      Linus Torvalds authored
      Pull irq core changes from Ingo Molnar:
       "The biggest changes are the IRQ-work and printk changes from Frederic
        Weisbecker, which prepare the code for 'full dynticks' (the ability to
        stop or slow down the periodic tick arbitrarily, not just in idle time
        as today):
      
         - Don't stop tick with irq works pending.  This fix is generally
           useful and concerns archs that can't raise self IPIs.
      
         - Flush irq works before CPU offlining.
      
         - Introduce "lazy" irq works that can wait for the next tick to be
           executed, unless it's stopped.
      
         - Implement klogd wake up using irq work.  This removes the ad-hoc
           printk_tick()/printk_needs_cpu() hooks and makes it work even in
           dynticks mode.
      
         - Cleanups and fixes."
      
      * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        genirq: Export enable/disable_percpu_irq()
        arch Kconfig: Remove references to IRQ_PER_CPU
        irq_work: Remove return value from the irq_work_queue() function
        genirq: Avoid deadlock in spurious handling
        printk: Wake up klogd using irq_work
        irq_work: Make self-IPIs optable
        irq_work: Warn if there's still work on cpu_down
        irq_work: Flush work on CPU_DYING
        irq_work: Don't stop the tick with pending works
        nohz: Add API to check tick state
        irq_work: Remove CONFIG_HAVE_IRQ_WORK
        irq_work: Fix racy check on work pending flag
        irq_work: Fix racy IRQ_WORK_BUSY flag setting
      b7133a9a
    • Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e84cf5d0
      Linus Torvalds authored
      Pull RCU changes from Ingo Molnar:
       "SRCU changes:
      
         - These include debugging aids, updates that move towards the goal of
           permitting srcu_read_lock() and srcu_read_unlock() to be used from
           idle and offline CPUs, and a few small fixes.
      
        Changes to rcutorture and to RCU documentation:
      
         - Posted to LKML at https://lkml.org/lkml/2013/1/26/188
      
        Enhancements to uniprocessor handling in tiny RCU:
      
         - Posted to LKML at https://lkml.org/lkml/2013/1/27/2
      
        Tag RCU callbacks with grace-period number to simplify callback
        advancement:
      
         - Posted to LKML at https://lkml.org/lkml/2013/1/26/203
      
        Miscellaneous fixes:
      
         - Posted to LKML at https://lkml.org/lkml/2013/1/26/204"
      
      * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
        srcu: use ACCESS_ONCE() to access sp->completed in srcu_read_lock()
        srcu: Update synchronize_srcu_expedited()'s comments
        srcu: Update synchronize_srcu()'s comments
        srcu: Remove checks preventing idle CPUs from calling srcu_read_lock()
        srcu: Remove checks preventing offline CPUs from calling srcu_read_lock()
        srcu: Simple cleanup for cleanup_srcu_struct()
        srcu: Add might_sleep() annotation to synchronize_srcu()
        srcu: Simplify __srcu_read_unlock() via this_cpu_dec()
        rcu: Allow rcutorture to be built at low optimization levels
        rcu: Make rcutorture's shuffler task shuffle recently added tasks
        rcu: Allow TREE_PREEMPT_RCU on UP systems
        rcu: Provide RCU CPU stall warnings for tiny RCU
        context_tracking: Add comments on interface and internals
        rcu: Remove obsolete Kconfig option from comment
        rcu: Remove unused code originally used for context tracking
        rcu: Consolidate debugging Kconfig options
        rcu: Correct 'optimized' to 'optimize' in header comment
        rcu: Trace callback acceleration
        rcu: Tag callback lists with corresponding grace-period number
        rcutorture: Don't compare ptr with 0
        ...
      e84cf5d0
  6. 19 Feb, 2013 3 commits
  7. 18 Feb, 2013 5 commits
  8. 16 Feb, 2013 1 commit
  9. 15 Feb, 2013 3 commits
    • Merge tag 'stable/for-linus-3.8-rc7-tag-two' of... · f741656d
      Linus Torvalds authored
      Merge tag 'stable/for-linus-3.8-rc7-tag-two' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
      
      Pull xen fixes from Konrad Rzeszutek Wilk:
       "Two fixes:
      
         - A simple bug-fix for redundant NULL check.
      
         - CVE-2013-0228/XSA-42: x86/xen: don't assume %ds is usable in
           xen_iret for 32-bit PVOPS
      
        and two reverts:
      
         - Revert the PVonHVM kexec.  The patch introduces a regression with
           older hypervisor stacks, such as Xen 4.1."
      
      * tag 'stable/for-linus-3.8-rc7-tag-two' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
        Revert "xen PVonHVM: use E820_Reserved area for shared_info"
        Revert "xen/PVonHVM: fix compile warning in init_hvm_pv_info"
        xen: remove redundant NULL check before unregister_and_remove_pcpu().
        x86/xen: don't assume %ds is usable in xen_iret for 32-bit PVOPS.
      f741656d
    • Revert "[media] dvb_frontend: return -ENOTTY for unimplement IOCTL" · ac897586
      Mauro Carvalho Chehab authored
      As reported by Klaus Schmidinger:
       "In VDR I use an ioctl() call with FE_READ_UNCORRECTED_BLOCKS on a
        device (using stb0899).  After this call I check 'errno' for
        EOPNOTSUPP to determine whether this device supports this call.  This
        used to work just fine, until a few months ago I noticed that my
        devices using stb0899 didn't display their signal quality in VDR's OSD
        any more.  After further investigation I found that
        ioctl(FE_READ_UNCORRECTED_BLOCKS) no longer returns EOPNOTSUPP, but
        rather ENOTTY.  And since I stop getting the signal quality in case
        any unknown errno value appears, this broke my signal quality query
        function."
      
      While the changes reflect what is there at:
      
        http://comments.gmane.org/gmane.linux.kernel/1235728
      
      it does cause a regression in userspace.  So, revert it to stop the
      damage.
      
      This reverts commit 177ffe50 ("[media] dvb_frontend: return -ENOTTY
      for unimplement IOCTL").
      Reported-by: Klaus Schmidinger <Klaus.Schmidinger@tvdr.de>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ac897586
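The regression reported above comes down to which errno value userspace treats as "operation not supported". A minimal sketch in C of a check that tolerates both values follows. The helper name fe_errno_means_unsupported is hypothetical; it is not part of VDR or the kernel, and this is not claimed to be how VDR handles the case.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper (an assumption for illustration only).
 * Returns true when an errno value left by a failed frontend ioctl()
 * should be read as "this device does not implement the call".
 * Both EOPNOTSUPP (the older behaviour described in the report) and
 * ENOTTY (the behaviour the reverted commit introduced) are accepted.
 */
bool fe_errno_means_unsupported(int err)
{
    return err == EOPNOTSUPP || err == ENOTTY;
}
```

A caller would pass the errno left by a failed ioctl() with FE_READ_UNCORRECTED_BLOCKS. Accepting both values keeps such a probe working whether or not the kernel carries the reverted change, while any other errno is still treated as a real error.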
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 11e76514
      Linus Torvalds authored
      Pull sparc fixes from David Miller:
       "A couple small fixes for sparc including some THP brown-paper-bag
        material:
      
         1) During the merging of all the THP support for various
            architectures, sparc missed adding a
            HAVE_ARCH_TRANSPARENT_HUGEPAGE to its Kconfig, oops.
      
         2) Sparc needs to be mindful of hugepages in get_user_pages_fast().
      
         3) Fix memory leak in SBUS probe, from Cong Ding.
      
         4) The sunvdc virtual disk client driver has a test of the bitmask of
            vdisk server supported operations which was off by one bit"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sunvdc: Fix off-by-one in generic_request().
        sparc64: Fix get_user_pages_fast() wrt. THP.
        sparc64: Add missing HAVE_ARCH_TRANSPARENT_HUGEPAGE.
        sparc: kernel/sbus.c: fix memory leakage
      11e76514
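Item 4 in the sparc pull message describes a bitmask test that was off by one bit. The message does not say in which direction. The sketch below assumes, purely for illustration, that bit N of the mask corresponds to operation code N, and contrasts that with a test that looks at bit N - 1. All names are hypothetical and are not taken from the sunvdc driver.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed-correct test: bit `op` of `operations` marks operation `op`.
 * (The bit-to-operation mapping is an assumption for this sketch.)
 */
bool op_supported(uint64_t operations, unsigned int op)
{
    if (op >= 64)
        return false;   /* shifting a 64-bit value by >= 64 is undefined */
    return (operations & (UINT64_C(1) << op)) != 0;
}

/*
 * Off-by-one variant: inspects the neighbouring bit (op - 1) instead,
 * so it can reject a supported operation or accept an unsupported one.
 */
bool op_supported_off_by_one(uint64_t operations, unsigned int op)
{
    if (op == 0 || op > 64)
        return false;   /* keep the shift count in the range 0..63 */
    return (operations & (UINT64_C(1) << (op - 1))) != 0;
}
```

With a mask that has only bit 3 set, the first function reports operation 3 as supported, while the off-by-one variant rejects operation 3 and wrongly accepts operation 4.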