1. 09 Jan, 2008 21 commits
  2. 08 Jan, 2008 5 commits
  3. 07 Jan, 2008 11 commits
  4. 06 Jan, 2008 3 commits
    • Linus Torvalds's avatar
      Merge master.kernel.org:/home/rmk/linux-2.6-arm · 2b300d20
      Linus Torvalds authored
      * master.kernel.org:/home/rmk/linux-2.6-arm:
        [ARM] 4691/1: add missing i2c_board_info struct for at91rm9200
        [ARM] 4735/1: Unbreak pxa25x suspend/resume
      2b300d20
    • Linus Torvalds's avatar
      Linux 2.6.24-rc7 · 3ce54450
      Linus Torvalds authored
      3ce54450
    • Ingo Molnar's avatar
      CPU hotplug: fix cpu_is_offline() on !CONFIG_HOTPLUG_CPU · a263898f
      Ingo Molnar authored
      make randconfig bootup testing found that the cpufreq code
      crashes on bootup, if the powernow-k8 driver is enabled and
      if maxcpus=1 passed on the boot line to a !CONFIG_HOTPLUG_CPU
      kernel.
      
      First lockdep found out that there's an inconsistent unlock
      sequence:
      
       =====================================
       [ BUG: bad unlock balance detected! ]
       -------------------------------------
       swapper/1 is trying to release lock (&per_cpu(cpu_policy_rwsem, cpu)) at:
       [<ffffffff806ffd8e>] unlock_policy_rwsem_write+0x3c/0x42
       but there are no more locks to release!
      
      Call Trace:
       [<ffffffff806ffd8e>] unlock_policy_rwsem_write+0x3c/0x42
       [<ffffffff80251c29>] print_unlock_inbalance_bug+0x104/0x12c
       [<ffffffff80252f3a>] mark_held_locks+0x56/0x94
       [<ffffffff806ffd8e>] unlock_policy_rwsem_write+0x3c/0x42
       [<ffffffff807008b6>] cpufreq_add_dev+0x2a8/0x5c4
       ...
      
      then shortly afterwards the cpufreq code crashed on an assert:
      
       ------------[ cut here ]------------
       kernel BUG at drivers/cpufreq/cpufreq.c:1068!
       invalid opcode: 0000 [1] SMP
       [...]
       Call Trace:
        [<ffffffff805145d6>] sysdev_driver_unregister+0x5b/0x91
        [<ffffffff806ff520>] cpufreq_register_driver+0x15d/0x1a2
        [<ffffffff80cc0596>] powernowk8_init+0x86/0x94
       [...]
       ---[ end trace 1e9219be2b4431de ]---
      
      the bug was caused by maxcpus=1 bootup, which brought up the
      secondary core as !cpu_online() but !cpu_is_offline() either,
      which on on !CONFIG_HOTPLUG_CPU is always 0 (include/linux/cpu.h):
      
        /* CPUs don't go offline once they're online w/o CONFIG_HOTPLUG_CPU */
        static inline int cpu_is_offline(int cpu) { return 0; }
      
      but the cpufreq code uses cpu_online() and cpu_is_offline() in
      a mixed way - the low-level drivers use cpu_online(), while
      the cpufreq core uses cpu_is_offline(). This opened up the
      possibility to add the non-initialized sysdev device of the
      secondary core:
      
       cpufreq-core: trying to register driver powernow-k8
       cpufreq-core: adding CPU 0
       powernow-k8: BIOS error - no PSB or ACPI _PSS objects
       cpufreq-core: initialization failed
       cpufreq-core: adding CPU 1
       cpufreq-core: initialization failed
      
      which then blew up. The fix is to make cpu_is_offline() always
      the negation of cpu_online(). With that fix applied the kernel
      boots up fine without crashing:
      
       Calling initcall 0xffffffff80cc0510: powernowk8_init+0x0/0x94()
       powernow-k8: Found 1 AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ processors (1 cpu cores) (version 2.20.00)
       powernow-k8: BIOS error - no PSB or ACPI _PSS objects
       initcall 0xffffffff80cc0510: powernowk8_init+0x0/0x94() returned -19.
       initcall 0xffffffff80cc0510 ran for 19 msecs: powernowk8_init+0x0/0x94()
       Calling initcall 0xffffffff80cc328f: init_lapic_nmi_sysfs+0x0/0x39()
      
      We could fix this by making CPU enumeration aware of max_cpus, but that
      would be more fragile IMO, and the cpu_online(cpu) != cpu_is_offline(cpu)
      possibility was quite confusing and a continuous source of bugs too.
      
      Most distributions have kernels with CPU hotplug enabled, so this bug
      remained hidden for a long time.
      
      Bug forensics:
      
      The broken cpu_is_offline() API variant was introduced via:
      
       commit a59d2e4e
       Author: Rusty Russell <rusty@rustcorp.com.au>
       Date:   Mon Mar 8 06:06:03 2004 -0800
      
           [PATCH] minor cleanups for hotplug CPUs
      
      ( this predates linux-2.6.git, this commit is available from Thomas's
        historic git tree. )
      
      Then 1.5 years later the cpufreq code made use of it:
      
       commit c32b6b8e
       Author: Ashok Raj <ashok.raj@intel.com>
       Date:   Sun Oct 30 14:59:54 2005 -0800
      
           [PATCH] create and destroy cpufreq sysfs entries based on cpu notifiers
      
       +       if (cpu_is_offline(cpu))
       +               return 0;
      
      which is a correct use of the subtly broken new API. v2.6.15 then
      shipped with this bug included.
      
      then it took two more years for random-kernel qa to hit it.
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      a263898f