1. 03 Feb, 2010 3 commits
    • futex: Handle futex value corruption gracefully · 59647b6a
      Thomas Gleixner authored
      The WARN_ON in lookup_pi_state which complains about a mismatch
      between pi_state->owner->pid and the pid which we retrieved from the
      user space futex is completely bogus.
      
      The code just emits the warning and then continues despite the fact
      that it detected an inconsistent state of the futex. A convenient way
      for user space to spam the syslog.
      
      Replace the WARN_ON by a consistency check. If the values do not match
      return -EINVAL and let user space deal with the mess it created.
      
      This also fixes the missing task_pid_vnr() when we compare the
      pi_state->owner pid with the futex value.
      Reported-by: Jerome Marchand <jmarchan@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Darren Hart <dvhltc@us.ibm.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: <stable@kernel.org>
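The consistency check described in this commit can be sketched in user-space C as follows. This is a minimal model, not the kernel code: `struct task`, `struct pi_state`, the `vpid` field and `check_pi_owner()` are hypothetical stand-ins, and skipping the comparison when no owner is recorded is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, not the real definitions. */
struct task { int vpid; };               /* pid as seen in the task's own pid namespace */
struct pi_state { struct task *owner; };

/*
 * Instead of warning and carrying on, reject the operation when the TID
 * read from the user-space futex word disagrees with the recorded owner.
 */
static int check_pi_owner(const struct pi_state *ps, int futex_tid)
{
    assert(ps != NULL);
    /* Assumption: with no recorded owner there is nothing to compare. */
    if (ps->owner != NULL && ps->owner->vpid != futex_tid)
        return -EINVAL;   /* user space corrupted the futex value */
    return 0;
}
```

Returning an error keeps the kernel log quiet and leaves the cleanup to the application that wrote the inconsistent value.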
    • futex: Handle user space corruption gracefully · 51246bfd
      Thomas Gleixner authored
      If the owner of a PI futex dies we fix up the pi_state and set
      pi_state->owner to NULL. When a malicious or just sloppily programmed
      user space application sets the futex value to 0 e.g. by calling
      pthread_mutex_init(), then the futex can be acquired again. A new
      waiter manages to enqueue itself on the pi_state w/o damage, but on
      unlock the kernel dereferences pi_state->owner and oopses.
      
      Prevent this by checking pi_state->owner in the unlock path. If
      pi_state->owner is not current we know that user space manipulated the
      futex value. Ignore the mess and return -EINVAL.
      
      This catches the above case and also the case where a task hijacks the
      futex by setting the tid value and then tries to unlock it.
      Reported-by: Jerome Marchand <jmarchan@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Darren Hart <dvhltc@us.ibm.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: <stable@kernel.org>
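The unlock-path guard from this commit can be modelled roughly like this. The types and the `unlock_check()` helper are hypothetical and exist only to show why a single pointer comparison covers both the dead-owner case and the hijack case.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct task { int tid; };
struct pi_state { struct task *owner; };

/*
 * Only the recorded owner may unlock. The comparison fails both when
 * owner is NULL (the owner died and user space reset the futex word)
 * and when a different task wrote its own TID into the futex.
 */
static int unlock_check(const struct pi_state *ps, const struct task *current_task)
{
    assert(ps != NULL && current_task != NULL);
    if (ps->owner != current_task)
        return -EINVAL;   /* never dereference ps->owner on this path */
    return 0;
}
```

Because the check happens before any use of the owner pointer, the NULL dereference described above cannot be reached in this model.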
    • futex_lock_pi() key refcnt fix · 5ecb01cf
      Mikael Pettersson authored
      This fixes a futex key reference count bug in futex_lock_pi(),
      where a key's reference count is incremented twice but decremented
      only once, causing the backing object to not be released.
      
      If the futex is created in a temporary file in an ext3 file system,
      this bug causes the file's inode to become an "undead" orphan,
      which causes an oops from a BUG_ON() in ext3_put_super() when the
      file system is unmounted. glibc's test suite is known to trigger this,
      see <http://bugzilla.kernel.org/show_bug.cgi?id=14256>.
      
      The bug is a regression from 2.6.28-git3, namely Peter Zijlstra's
      38d47c1b "[PATCH] futex: rely on
      get_user_pages() for shared futexes". That commit made get_futex_key()
      also increment the reference count of the futex key, and updated its
      callers to decrement the key's reference count before returning.
      Unfortunately the normal exit path in futex_lock_pi() wasn't corrected:
      the reference count is incremented by get_futex_key() and queue_lock(),
      but the normal exit path only decrements once, via unqueue_me_pi().
      The fix is to call put_futex_key() after unqueue_me_pi(). Since 2.6.31
      this is easily done by 'goto out_put_key' rather than 'goto out'.
      Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Darren Hart <dvhltc@us.ibm.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: <stable@kernel.org>
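The reference count imbalance can be illustrated with a small user-space model. Everything here is hypothetical (`struct key`, `get_key()`, `put_key()`, `lock_pi_model()`); it only mirrors the get/put sequence the commit message describes for the normal exit path.

```c
#include <assert.h>

/* Hypothetical model of a futex key: only the reference count matters here. */
struct key { int refs; };

static void get_key(struct key *k) { k->refs++; }
static void put_key(struct key *k) { assert(k->refs > 0); k->refs--; }

/*
 * Models the normal exit path of the lock operation.
 * put_on_exit == 0 mimics the buggy 'goto out',
 * put_on_exit == 1 mimics the fixed 'goto out_put_key'.
 * Returns the number of references left behind.
 */
static int lock_pi_model(struct key *k, int put_on_exit)
{
    get_key(k);       /* reference taken by the key lookup */
    get_key(k);       /* reference taken when queueing */
    put_key(k);       /* reference dropped when unqueueing */
    if (put_on_exit)
        put_key(k);   /* the put that was missing before the fix */
    return k->refs;
}
```

In this model the buggy path leaves one reference behind on every call, which is the leak that keeps the backing object from being released.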
  2. 01 Feb, 2010 1 commit
    • softlockup: Add sched_clock_tick() to avoid kernel warning on kgdb resume · d6ad3e28
      Jason Wessel authored
      When CONFIG_HAVE_UNSTABLE_SCHED_CLOCK is set, sched_clock() gets
      the time from hardware such as the TSC on x86. In this
      configuration kgdb will report a softlockup warning message on
      resuming or detaching from a debug session.
      
      Sequence of events in the problem case:
      
       1) "cpu sched clock" and "hardware time" are at 100 sec prior
          to a call to kgdb_handle_exception()
      
       2) Debugger waits in kgdb_handle_exception() for 80 sec and on
          exit the following is called ...  touch_softlockup_watchdog() -->
          __raw_get_cpu_var(touch_timestamp) = 0;
      
       3) "cpu sched clock" = 100s (it was not updated, because the
          interrupt was disabled in kgdb) but the "hardware time" = 180 sec
      
       4) The first timer interrupt after resuming from
          kgdb_handle_exception updates the watchdog from the "cpu sched clock"
      
      update_process_times() {
        ...
        run_local_timers() --> softlockup_tick()
          --> check (touch_timestamp == 0)
              (it is "YES" here, we have set "touch_timestamp = 0" at kgdb)
          --> __touch_softlockup_watchdog()
              ***(A)--> reset "touch_timestamp" to "get_timestamp()"
              (Here, the "touch_timestamp" will still be set to 100s.)
        ...
        scheduler_tick()
          ***(B)--> sched_clock_tick()
              (update "cpu sched clock" to "hardware time" = 180s)
        ...
      }
      
       5) The second timer interrupt handler appears to have a large
          jump and trips the softlockup warning.
      
      update_process_times() {
        ...
        run_local_timers() --> softlockup_tick()
          --> "cpu sched clock" - "touch_timestamp" = 180s-100s > 60s
          --> printk "soft lockup error messages"
        ...
      }
      
      note: ***(A) reset "touch_timestamp" to
      "get_timestamp(this_cpu)"
      
      Why is "touch_timestamp" 100 sec, instead of 180 sec?
      
      When CONFIG_HAVE_UNSTABLE_SCHED_CLOCK is set, the call trace of
      get_timestamp() is:
      
      get_timestamp(this_cpu)
       -->cpu_clock(this_cpu)
       -->sched_clock_cpu(this_cpu)
       -->__update_sched_clock(sched_clock_data, now)
      
      The __update_sched_clock() function uses the GTOD tick value to
      create a window to normalize the "now" values.  So if "now"
      value is too big for sched_clock_data, it will be ignored.
      
      The fix is to invoke sched_clock_tick() to update "cpu sched
      clock" in order to recover from this state.  This is done by
      introducing the function touch_softlockup_watchdog_sync(). This
      allows kgdb to request that the sched clock is updated when the
      watchdog thread runs the first time after a resume from kgdb.
      
      [yong.zhang0@gmail.com: Use per cpu instead of an array]
      Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
      Signed-off-by: Dongdong Deng <Dongdong.Deng@windriver.com>
      Cc: kgdb-bugreport@lists.sourceforge.net
      Cc: peterz@infradead.org
      LKML-Reference: <1264631124-4837-2-git-send-email-jason.wessel@windriver.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
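The sequence of events above can be replayed with a small simulation. This is a sketch under stated assumptions: `struct cpu` and all function names below are hypothetical models of the per-cpu state, times are whole seconds, and the 60 second threshold is taken from the example in the commit message.

```c
#include <assert.h>

/* Hypothetical per-cpu state, in seconds. */
struct cpu {
    long sched_clock;   /* "cpu sched clock" */
    long hw_time;       /* "hardware time" */
    long touch_ts;      /* touch_timestamp, 0 means "re-arm on next tick" */
    int  sync;          /* set by the _sync variant of the touch call */
};

/* Debugger held the CPU for 'stall' seconds with interrupts off. */
static void debugger_resume(struct cpu *c, long stall, int use_sync)
{
    assert(stall >= 0);
    c->hw_time += stall;   /* hardware kept counting */
    c->touch_ts = 0;       /* watchdog was touched on exit */
    c->sync = use_sync;
}

/* Returns 1 if the watchdog would warn on this tick. */
static int softlockup_tick_model(struct cpu *c, long threshold)
{
    if (c->touch_ts == 0) {
        if (c->sync) {                    /* the fix: refresh the clock first */
            c->sync = 0;
            c->sched_clock = c->hw_time;
        }
        c->touch_ts = c->sched_clock;     /* (A) re-arm from the sched clock */
        return 0;
    }
    return (c->sched_clock - c->touch_ts) > threshold;
}

static int timer_interrupt(struct cpu *c, long threshold)
{
    int warn = softlockup_tick_model(c, threshold);
    c->sched_clock = c->hw_time;          /* (B) scheduler tick catches up */
    return warn;
}

/* Replays the 100s / 80s scenario and reports whether a warning fires. */
static int warns_after_resume(int use_sync)
{
    struct cpu c = { .sched_clock = 100, .hw_time = 100, .touch_ts = 100, .sync = 0 };
    debugger_resume(&c, 80, use_sync);
    int first = timer_interrupt(&c, 60);
    int second = timer_interrupt(&c, 60);
    return first || second;
}
```

Without the sync request the timestamp is re-armed from the stale clock value, so the second tick sees an 80 second gap. With it, both values are refreshed together and no gap appears.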
  3. 27 Jan, 2010 2 commits
    • lockdep: Fix check_usage_backwards() error message · 48d50674
      Oleg Nesterov authored
      Lockdep has found the real bug, but the output doesn't look right to me:
      
      > =========================================================
      > [ INFO: possible irq lock inversion dependency detected ]
      > 2.6.33-rc5 #77
      > ---------------------------------------------------------
      > emacs/1609 just changed the state of lock:
      >  (&(&tty->ctrl_lock)->rlock){+.....}, at: [<ffffffff8127c648>] tty_fasync+0xe8/0x190
      > but this lock took another, HARDIRQ-unsafe lock in the past:
      >  (&(&sighand->siglock)->rlock){-.....}
      
      "HARDIRQ-unsafe" and "this lock took another" looks wrong, afaics.
      
      >   ... key      at: [<ffffffff81c054a4>] __key.46539+0x0/0x8
      >   ... acquired at:
      >    [<ffffffff81089af6>] __lock_acquire+0x1056/0x15a0
      >    [<ffffffff8108a0df>] lock_acquire+0x9f/0x120
      >    [<ffffffff81423012>] _raw_spin_lock_irqsave+0x52/0x90
      >    [<ffffffff8127c1be>] __proc_set_tty+0x3e/0x150
      >    [<ffffffff8127e01d>] tty_open+0x51d/0x5e0
      
      The stack-trace shows that this lock (ctrl_lock) was taken under
      ->siglock (which is hopefully irq-safe).
      
      This is a clear typo in check_usage_backwards(), where we tell the fancy
      print routine we're forwards.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <20100126181641.GA10460@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • fcntl: f_modown should call write_lock_irqsave/restore · b04da8bf
      Greg Kroah-Hartman authored
      Commit 70362511 exposed that f_modown()
      should call write_lock_irqsave instead of just write_lock_irq, because
      a caller could have a spinlock held and it would not be good to
      re-enable interrupts.
      
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: Tavis Ormandy <taviso@google.com>
      Cc: stable <stable@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
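The difference between the two locking variants can be shown with a toy model of the interrupt flag. Nothing below is kernel API: `irqs_enabled` and the four helpers are hypothetical and model only the interrupt state, not the lock itself.

```c
#include <assert.h>

static int irqs_enabled = 1;   /* hypothetical model of the CPU interrupt flag */

/* _irq style: disable on lock, unconditionally enable on unlock. */
static void lock_irq(void)   { irqs_enabled = 0; }
static void unlock_irq(void) { irqs_enabled = 1; }

/* _irqsave style: remember the previous state and restore exactly that. */
static unsigned long lock_irqsave(void)
{
    unsigned long flags = (unsigned long)irqs_enabled;
    irqs_enabled = 0;
    return flags;
}

static void unlock_irqrestore(unsigned long flags)
{
    assert(flags == 0 || flags == 1);
    irqs_enabled = (int)flags;
}

/*
 * A caller that already runs with interrupts disabled (it holds a
 * spinlock) enters an inner critical section. Returns the interrupt
 * state the caller sees afterwards: 0 is correct, 1 means interrupts
 * were re-enabled behind its back.
 */
static int caller_state_after(int use_irqsave)
{
    irqs_enabled = 0;                  /* caller's own lock disabled interrupts */
    if (use_irqsave) {
        unsigned long flags = lock_irqsave();
        unlock_irqrestore(flags);
    } else {
        lock_irq();
        unlock_irq();
    }
    int seen = irqs_enabled;
    irqs_enabled = 1;                  /* caller releases its lock */
    return seen;
}
```

With the plain variant the caller comes back with interrupts on while it still holds its spinlock. The save/restore variant hands back whatever state it found.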
  4. 26 Jan, 2010 10 commits
  5. 25 Jan, 2010 24 commits