1. 13 Jul, 2008 1 commit
      cpusets, hotplug, scheduler: fix scheduler domain breakage · 3e84050c
      Dmitry Adamushko authored
      Commit f18f982a ("sched: CPU hotplug events must not destroy scheduler
      domains created by the cpusets") introduced a hotplug-related problem as
      described below:
      
      Upon CPU_DOWN_PREPARE,
      
        update_sched_domains() -> detach_destroy_domains(&cpu_online_map)
      
      does the following:
      
      /*
       * Force a reinitialization of the sched domains hierarchy. The domains
       * and groups cannot be updated in place without racing with the balancing
       * code, so we temporarily attach all running cpus to the NULL domain
       * which will prevent rebalancing while the sched domains are recalculated.
       */
      
      The sched-domains should be rebuilt when a CPU_DOWN operation has
      completed, effectively either upon CPU_DEAD{_FROZEN} (on success) or
      CPU_DOWN_FAILED{_FROZEN} (on failure, to restore things to their
      initial state). That is what update_sched_domains() also does, but only
      for the !CPUSETS case.
      
      With f18f982a, sched-domains' reinitialization is delegated to
      CPUSETS code:
      
      cpuset_handle_cpuhp() -> common_cpu_mem_hotplug_unplug() ->
      rebuild_sched_domains()
      
      When it is called for CPU_UP_PREPARE (and if its callback is called after
      update_sched_domains()), it just negates all the work done by
      update_sched_domains() -- i.e. a soon-to-be-offline cpu is included in
      the sched-domains and that makes it visible for the load-balancer
      while the CPU_DOWN ops. is in progress.
      
      __migrate_live_tasks() moves the tasks off a 'dead' cpu (it's already
      "offline" when this function is called).
      
      try_to_wake_up() is called for one of these tasks from another CPU ->
      the load-balancer (wake_idle()) picks up a "dead" CPU and places the
      task on it. Then e.g. BUG_ON(rq->nr_running) detects this a bit later
      -> oops.
      Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
      Tested-by: Vegard Nossum <vegard.nossum@gmail.com>
      Cc: Paul Menage <menage@google.com>
      Cc: Max Krasnyansky <maxk@qualcomm.com>
      Cc: Paul Jackson <pj@sgi.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: miaox@cn.fujitsu.com
      Cc: rostedt@goodmis.org
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
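
The ordering requirement described in this message can be pictured as a hotplug callback that rebuilds the domains only once a CPU_DOWN operation has finished or been rolled back. The following is a minimal user-space sketch, not the actual patch: `enum hotplug_phase` and `should_rebuild_domains()` are hypothetical names that only mirror the notifier events named above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the hotplug notifier events named in the message. */
enum hotplug_phase {
    PHASE_UP_PREPARE,
    PHASE_DOWN_PREPARE,
    PHASE_DOWN_FAILED,
    PHASE_DEAD,
};

/*
 * Return true only for phases at which a CPU_DOWN operation has completed
 * (PHASE_DEAD) or has been rolled back (PHASE_DOWN_FAILED). Rebuilding during
 * a *_PREPARE phase would re-expose a CPU that is about to go offline.
 */
bool should_rebuild_domains(enum hotplug_phase phase)
{
    switch (phase) {
    case PHASE_DEAD:
    case PHASE_DOWN_FAILED:
        return true;
    default:
        return false;
    }
}
```

In this model a callback would consult `should_rebuild_domains()` first and return early for every other phase, so the domains stay detached while the operation is in progress.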
  2. 10 Jul, 2008 2 commits
      sched: fix cpu hotplug, cleanup · b1e38734
      Linus Torvalds authored
      Clean up __migrate_task(): to just have separate "done" and "fail"
      cases, instead of that "out" case with random error behavior.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      sched: fix cpu hotplug · dc7fab8b
      Dmitry Adamushko authored
      I think we may have a race between try_to_wake_up() and
      migrate_live_tasks() -> move_task_off_dead_cpu() in which the latter
      may end up looping endlessly.
      
      Interrupts are enabled on other CPUs when migration_call(CPU_DEAD, ...) is
      called so we may get a race between try_to_wake_up() and
      migrate_live_tasks() -> move_task_off_dead_cpu(). The former may push
      a task out of a dead CPU, causing the latter to loop endlessly.
      
      Heiko Carstens observed:
      
      | That's exactly what explains a dump I got yesterday. Thanks for fixing! :)
      Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
      Cc: miaox@cn.fujitsu.com
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Avi Kivity <avi@qumranet.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
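
The endless loop described in this message can be illustrated with a small user-space model. This is a sketch under assumptions, not the kernel change itself: `struct task`, `migrate_task_sketch()` and its parameters are hypothetical. It shows one way to avoid the loop: a task that a concurrent wakeup has already moved off the source CPU is reported as done, so the caller has no reason to retry.

```c
#include <stdbool.h>

/* Hypothetical, simplified task: only the CPU it currently sits on. */
struct task {
    int cpu;
};

/*
 * Try to move p from src_cpu to dest_cpu.
 * Returns true when the caller can stop retrying: either the task was moved
 * here, or it had already left src_cpu (for example via a concurrent wakeup).
 * Returns false only when dest_cpu is not allowed and another destination
 * should be tried.
 */
bool migrate_task_sketch(struct task *p, int src_cpu, int dest_cpu,
                         bool dest_allowed)
{
    if (p->cpu != src_cpu)
        return true;    /* done: already off the dead CPU */

    if (!dest_allowed)
        return false;   /* fail: caller picks a different destination */

    p->cpu = dest_cpu;
    return true;        /* done: moved by this call */
}
```

A caller that loops until this function returns true terminates as soon as the task is no longer on the source CPU, regardless of who moved it.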
  3. 01 Jul, 2008 1 commit
      sched: fix divide error when trying to configure rt_period to zero · 619b0488
      Raistlin authored
      Here is another little oops we found while configuring invalid values
      via cgroups:
      
      echo 0 > /dev/cgroups/0/cpu.rt_period_us
      or
      echo 4294967296 > /dev/cgroups/0/cpu.rt_period_us
      
      [  205.509825] divide error: 0000 [#1]
      [  205.510151] Modules linked in:
      [  205.510151]
      [  205.510151] Pid: 2339, comm: bash Not tainted (2.6.26-rc8 #33)
      [  205.510151] EIP: 0060:[<c030c6ef>] EFLAGS: 00000293 CPU: 0
      [  205.510151] EIP is at div64_u64+0x5f/0x70
      [  205.510151] EAX: 0000389f EBX: 00000000 ECX: 00000000 EDX: 00000000
      [  205.510151] ESI: d9800000 EDI: 00000000 EBP: c6cede60 ESP: c6cede50
      [  205.510151]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
      [  205.510151] Process bash (pid: 2339, ti=c6cec000 task=c79be370 task.ti=c6cec000)
      [  205.510151] Stack: d9800000 0000389f c05971a0 d9800000 c6cedeb4 c0214dbd 00000000 00000000
      [  205.510151]        c6cede88 c0242bd8 c05377c0 c7a41b40 00000000 00000000 00000000 c05971a0
      [  205.510151]        c780ed20 c7508494 c7a41b40 00000000 00000002 c6cedebc c05971a0 ffffffea
      [  205.510151] Call Trace:
      [  205.510151]  [<c0214dbd>] ? __rt_schedulable+0x1cd/0x240
      [  205.510151]  [<c0242bd8>] ? cgroup_file_open+0x18/0xe0
      [  205.510151]  [<c0214fe4>] ? tg_set_bandwidth+0xa4/0xf0
      [  205.510151]  [<c0215066>] ? sched_group_set_rt_period+0x36/0x50
      [  205.510151]  [<c021508e>] ? cpu_rt_period_write_uint+0xe/0x10
      [  205.510151]  [<c0242dc5>] ? cgroup_file_write+0x125/0x160
      [  205.510151]  [<c0232c15>] ? hrtimer_interrupt+0x155/0x190
      [  205.510151]  [<c02f047f>] ? security_file_permission+0xf/0x20
      [  205.510151]  [<c0277ad8>] ? rw_verify_area+0x48/0xc0
      [  205.510151]  [<c0283744>] ? dupfd+0x104/0x130
      [  205.510151]  [<c027838c>] ? vfs_write+0x9c/0x160
      [  205.510151]  [<c0242ca0>] ? cgroup_file_write+0x0/0x160
      [  205.510151]  [<c027850d>] ? sys_write+0x3d/0x70
      [  205.510151]  [<c0203019>] ? sysenter_past_esp+0x6a/0x91
      [  205.510151]  =======================
      [  205.510151] Code: 0f 45 de 31 f6 0f ad d0 d3 ea f6 c1 20 0f 45 c2 0f 45 d6 89 45 f0 89 55 f4 8b 55 f4 31 c9 8b 45 f0 39 d3 89 c6 77 08 89 d0 31 d2 <f7> f3 89 c1 83 c4 08 89 f0 f7 f3 89 ca 5b 5e 5d c3 55 89 e5 56
      [  205.510151] EIP: [<c030c6ef>] div64_u64+0x5f/0x70 SS:ESP 0068:c6cede50
      
      The attached patch solves the issue for me.
      
      I'm checking as soon as possible for the period not being zero since, if
      it is, going ahead is useless. This way we also save a mutex_lock() and
      a read_lock() wrt doing it inside tg_set_bandwidth() or
      __rt_schedulable().
      Signed-off-by: Dario Faggioli <raistlin@linux.it>
      Signed-off-by: Michael Trimarchi <trimarchimichael@yahoo.it>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
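
The early check described in this message can be sketched in user space as follows. This is an illustration under stated assumptions rather than the patch itself: `rt_ratio_sketch()` and its parameters are hypothetical, and the 32-bit `period_us` parameter is an assumption (suggested by the i386 trace) that would explain why 4294967296 behaves like 0.

```c
#include <errno.h>
#include <stdint.h>

#define NSEC_PER_USEC 1000ULL

/*
 * Hypothetical helper: compute runtime/period as a fixed-point ratio
 * (scaled by 2^16), rejecting a zero period before any division happens.
 * Returns 0 on success or -EINVAL if the period is zero.
 */
int rt_ratio_sketch(uint32_t period_us, uint64_t runtime_ns, uint64_t *ratio)
{
    uint64_t period_ns = (uint64_t)period_us * NSEC_PER_USEC;

    /* Bail out as early as possible: nothing below makes sense for zero. */
    if (period_ns == 0)
        return -EINVAL;

    *ratio = (runtime_ns << 16) / period_ns;
    return 0;
}
```

Checking before doing any other work means no locks need to be taken for a request that is going to be rejected anyway, which is the saving the message points out.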
  4. 29 Jun, 2008 1 commit
      sched: fix cpu hotplug · 79c53799
      Dmitry Adamushko authored
      The CPU hotplug problems (crashes under high-volume unplug+replug
      tests) seem to be related to migrate_dead_tasks().
      
      First, I added traces to see all tasks being migrated with
      migrate_live_tasks() and migrate_dead_tasks(). On my setup the problem
      (the one with "se == NULL" in the loop of pick_next_task_fair()) pops
      up shortly after the traces indicate that some tasks have been
      migrated with migrate_dead_tasks(). Btw., I can reproduce it
      much faster now with just a plain cpu down/up loop.
      
      [disclaimer] Well, unless I'm really missing something important at
      this late hour [/disclaimer] pick_next_task() is not something
      appropriate for migrate_dead_tasks() :-)
      
      The following change seems to eliminate the problem on my setup
      (although I kept it running only for a few minutes, to get a few
      messages indicating that migrate_dead_tasks() does move tasks and the
      system is still ok).
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  5. 25 Jun, 2008 4 commits
  6. 24 Jun, 2008 26 commits
  7. 23 Jun, 2008 5 commits