1. 21 Oct, 2010 1 commit
    • BKL: introduce CONFIG_BKL. · 6de5bd12
      Arnd Bergmann authored
      With all the patches we have queued in the BKL removal tree, only a
      few dozen modules are left that actually rely on the BKL, and even
      among those there is a lot of low-hanging fruit. We need to decide
      what to do about them; this patch illustrates one of the options:
      
      Every user of the BKL is marked as 'depends on BKL' in Kconfig,
      and the CONFIG_BKL becomes a user-visible option. If it gets
      disabled, no BKL using module can be built any more and the BKL
      code itself is compiled out.
      
      The one exception is file locking, which is practically always
      enabled and does a 'select BKL' instead. This effectively forces
      CONFIG_BKL to be enabled until we have solved the fs/lockd
      mess and can apply the patch that removes the BKL from fs/locks.c.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
  2. 19 Oct, 2010 10 commits
    • dabusb: remove the BKL · 7ff52efd
      Arnd Bergmann authored
      The dabusb device driver is sufficiently serialized using
      its own mutex, no need for the big kernel lock here
      in addition.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
    • sunrpc: remove the big kernel lock · a6f8dbc6
      Arnd Bergmann authored
      The sunrpc cache_ioctl function does not need the big kernel lock
      because it uses its own queue_lock already.
      
      rpc_pipe_ioctl apparently should be using i_lock like the other
      operations on the pipe file descriptor do.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    • init/main.c: remove BKL notations · 1fa4f3b5
      Namhyung Kim authored
      According to commit 5e3d20a6
      (init: Remove the BKL from startup code) these sparse notations
      should be removed also.
      Signed-off-by: Namhyung Kim <namhyung@gmail.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    • blktrace: remove the big kernel lock · 01b284f9
      Arnd Bergmann authored
      According to Jens, this code does not need the BKL at all,
      it is sufficiently serialized by bd_mutex.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Jens Axboe <jaxboe@fusionio.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
    • rtmutex-tester: make it build without BKL · 0fc86c7b
      Arnd Bergmann authored
      The big kernel lock is going away, so make sure
      that if it is disabled by Kconfig, we do not
      try to validate it, which would result in
      compile errors.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
    • dvb-core: kill the big kernel lock · 72024f1e
      Arnd Bergmann authored
      The dvb core only uses the big kernel lock in the open
      and ioctl functions, which means it can be replaced with
      a dvb specific mutex. Fortunately, all the ioctl functions
      go through dvb_usercopy, so we can move the serialization
      in there.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
      Cc: linux-media@vger.kernel.org
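The idea in the dvb-core message, moving serialization into the one function that every ioctl passes through, can be illustrated with a small userspace sketch. This is not the kernel patch: `demo_usercopy`, `demo_ioctl_mutex`, `demo_ioctl_fn` and `demo_count_ioctl` are hypothetical names, and a pthread mutex stands in for the kernel mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a subsystem-private mutex replacing the BKL. */
static pthread_mutex_t demo_ioctl_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical handler type: every ioctl of the subsystem has this shape. */
typedef int (*demo_ioctl_fn)(unsigned int cmd, void *arg);

/*
 * Common entry point, playing the role that dvb_usercopy plays in the
 * commit text: because every ioctl goes through this function, taking
 * the mutex in this one place serializes all of them.
 */
int demo_usercopy(unsigned int cmd, void *arg, demo_ioctl_fn fn)
{
    int ret;

    assert(fn != NULL);
    pthread_mutex_lock(&demo_ioctl_mutex);
    ret = fn(cmd, arg);
    pthread_mutex_unlock(&demo_ioctl_mutex);
    return ret;
}

/* Example handler: updates shared state without any locking of its own. */
int demo_count_ioctl(unsigned int cmd, void *arg)
{
    int *counter = arg;

    *counter += (int)cmd;
    return *counter;
}
```

Calling `demo_usercopy(2, &counter, demo_count_ioctl)` runs the handler under the mutex; the handlers themselves need no changes, which is why a single choke point makes this kind of conversion cheap.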
    • dvb/bt8xx: kill the big kernel lock · adfedd21
      Arnd Bergmann authored
      The bt8xx driver only uses the big kernel lock in its dst_ca_ioctl
      function and never to serialize against other code, so we can
      trivially replace it with a private mutex.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: linux-media@vger.kernel.org
      Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
    • tlclk: remove big kernel lock · efbec1cd
      Arnd Bergmann authored
      This driver already has a global mutex, so let's just
      use that in the open function instead of the BKL.
      It may not even be needed there, but this patch should
      have the smallest impact.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Mark Gross <mark.gross@intel.com>
    • fix rawctl compat ioctls breakage on amd64 and itanic · c4a04727
      Al Viro authored
      RAW_SETBIND and RAW_GETBIND 32bit versions are fscked in interesting ways.
      
      1) fs/compat_ioctl.c has COMPATIBLE_IOCTL(RAW_SETBIND) followed by
      HANDLE_IOCTL(RAW_SETBIND, raw_ioctl).  The latter is ignored.
      
      2) on amd64 (and itanic) the damn thing is broken - we have int + u64 + u64
      and layouts on i386 and amd64 are _not_ the same.  raw_ioctl() would
      work there, but it's never called due to (1).  As it is, i386 /sbin/raw
      definitely doesn't work on amd64 boxen.
      
      3) switching to raw_ioctl() as is would *not* work on e.g. sparc64 and ppc64,
      which would be rather sad, seeing that normal userland there is 32bit.
      The thing is, slapping __packed on the struct in question does not DTRT -
      it eliminates *all* padding.  The real solution is to use compat_u64.
      
      4) of course, all that stuff has no business being outside of raw.c in the
      first place - there should be ->compat_ioctl() for /dev/rawctl instead of
      messing with compat_ioctl.c.
      
      [akpm@linux-foundation.org: coding-style fixes]
      [arnd@arndb.de: port to 2.6.36]
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
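The layout problem described in points 2 and 3 of the rawctl message can be shown with a short sketch. It assumes GCC or Clang (for the `aligned` attribute) and a 4-byte `int`; the struct and type names are hypothetical, and `demo_compat_u64` only imitates what the message calls `compat_u64`: a 64-bit value that keeps the 4-byte alignment the i386 ABI gives it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * 64-bit value with 4-byte alignment, as a 32-bit x86 caller lays it out.
 * GCC/Clang extension; hypothetical stand-in for the kernel's compat_u64.
 */
typedef uint64_t __attribute__((aligned(4))) demo_compat_u64;

/* int + u64 + u64 with the host's natural alignment. */
struct demo_native_req {
    int minor;
    uint64_t block_major;
    uint64_t block_minor;
};

/* The same fields in the 32-bit x86 layout: no padding after 'minor'. */
struct demo_compat_req {
    int minor;
    demo_compat_u64 block_major;
    demo_compat_u64 block_minor;
};

/* Convert the 32-bit layout to the native one, field by field. */
struct demo_native_req demo_from_compat(const struct demo_compat_req *c)
{
    struct demo_native_req n;

    assert(c != NULL);
    n.minor = c->minor;
    n.block_major = c->block_major;
    n.block_minor = c->block_minor;
    return n;
}
```

With this typedef, `block_major` sits at offset 4 in `struct demo_compat_req`, whereas a typical 64-bit host places it at offset 8 in `struct demo_native_req`. Unlike packing the struct, the typedef changes only the alignment of the 64-bit fields, so on an architecture whose 32-bit ABI already aligns them to 8 bytes a native definition of the compat type would leave the layout untouched.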
    • uml: kill big kernel lock · 9a181c58
      Arnd Bergmann authored
      Three uml device drivers still use the big kernel lock,
      but all of them can be safely converted to using
      a per-driver mutex instead. Most likely this is not
      even necessary, so after further review these can
      and should be removed as well.
      
      The exec system call no longer requires the BKL either,
      so remove it from there, too.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Jeff Dike <jdike@addtoit.com>
      Cc: user-mode-linux-devel@lists.sourceforge.net
  3. 16 Oct, 2010 1 commit
    • parisc: remove big kernel lock · fa0d4c26
      Arnd Bergmann authored
      The parisc version of the perf code is sufficiently
      protected by its own spinlock, no need to use the BKL.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: linux-parisc@vger.kernel.org
  4. 26 Sep, 2010 4 commits
    • cris: autoconvert trivial BKL users · 0890b588
      Arnd Bergmann authored
      All uses of the big kernel lock in the cris architecture
      are for ioctl and open functions of character device drivers,
      which can be trivially converted to a per-driver mutex.
      
      Most of these are probably unnecessary, so it may make sense
      to audit them and eventually remove the extra mutex introduced
      by this patch.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: linux-cris-kernel@axis.com
    • alpha: kill big kernel lock · 80eb4a6f
      Arnd Bergmann authored
      All uses of the BKL on alpha are totally bogus, nothing
      is really protected by this. Remove the remaining users
      so we don't have to mark alpha as 'depends on BKL'.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: linux-alpha@vger.kernel.org
    • isapnp: BKL removal · 6117d213
      Arnd Bergmann authored
      Remove BKL use from isapnp_proc_bus_lseek(), as was done for
      proc_bus_pci_lseek() a long time ago and recently for Zorro
      by Geert Uytterhoeven.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Jaroslav Kysela <perex@perex.cz>
    • s390/block: kill the big kernel lock · cfdb00a7
      Arnd Bergmann authored
      The dasd and dcssblk drivers gained the big
      kernel lock in the recent pushdown from the
      block layer, but they don't really need it,
      so remove the calls without a replacement.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: linux-s390@vger.kernel.org
  5. 15 Sep, 2010 1 commit
    • hpet: kill BKL, add compat_ioctl · 54066a57
      Arnd Bergmann authored
      hpet uses the big kernel lock in its ioctl and open
      functions. Replace this with a private mutex to be
      sure. Since we're already touching the ioctl function,
      add the compat_ioctl version as well -- all commands
      except HPET_INFO are compatible and that one is easy
      to add.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Clemens Ladisch <clemens@ladisch.de>
      Cc: Bob Picco <bob.picco@hp.com>
  6. 12 Sep, 2010 1 commit
  7. 11 Sep, 2010 13 commits
  8. 10 Sep, 2010 9 commits
    • PM QoS: Correct pr_debug() misuse and improve parameter checks · 0109c2c4
      mark gross authored
      Correct some pr_debug() misuse and add a stronger parameter check to
      pm_qos_write() for the ASCII hex value case.  Thanks to Dan Carpenter
      for pointing out the problem!
      Signed-off-by: mark gross <markgross@thegnar.org>
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
    • xfs: log IO completion workqueue is a high priority queue · 51749e47
      Dave Chinner authored
      The workqueue implementation in 2.6.36-rcX has changed, resulting
      in the workqueues no longer having dedicated threads for work
      processing. This has caused severe livelocks under heavy parallel
      create workloads because the log IO completions have been getting
      held up behind metadata IO completions.  Hence log commits would
      stall, memory allocation would stall because pages could not be
      cleaned, and lock contention on the AIL during inode IO completion
      processing was being seen to slow everything down even further.
      
      By making the log IO completion workqueue a high priority workqueue,
      log IO completions are queued ahead of all data/metadata IO completions
      and processed before them. Hence the log never
      gets stalled, and operations needed to clean memory can continue as
      quickly as possible. This avoids the livelock conditions and allows
      the system to keep running under heavy load as per normal.
      Signed-off-by: Dave Chinner <dchinner@redhat.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Alex Elder <aelder@sgi.com>
    • execve: make responsive to SIGKILL with large arguments · 9aea5a65
      Roland McGrath authored
      An execve with a very large total of argument/environment strings
      can take a really long time in the execve system call.  It runs
      uninterruptibly to count and copy all the strings.  This change
      makes it abort the exec quickly if sent a SIGKILL.
      
      Note that this is the conservative change, to interrupt only for
      SIGKILL, by using fatal_signal_pending().  It would be perfectly
      correct semantics to let any signal interrupt the string-copying in
      execve, i.e. use signal_pending() instead of fatal_signal_pending().
      We'll save that change for later, since it could have user-visible
      consequences, such as having a timer set too quickly make it so that
      an execve can never complete, though it always happened to work before.
      Signed-off-by: Roland McGrath <roland@redhat.com>
      Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
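The behaviour described in the SIGKILL message, checking for a fatal signal inside the string-copying loop and bailing out early, can be sketched in userspace as follows. Everything here is an assumption for illustration: `demo_fatal_pending` stands in for fatal_signal_pending(), the `killed` flag replaces real signal state, and `-EINTR` is a userspace substitute for whatever error code the kernel returns internally.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for fatal_signal_pending(): the caller passes a
 * flag that a signal handler (or a test) can set.
 */
static int demo_fatal_pending(const volatile int *killed)
{
    return *killed != 0;
}

/*
 * Copy argc strings back to back into dst, checking for a fatal signal
 * before each string. Returns the number of bytes copied, -EINTR if a
 * fatal signal was seen, or -E2BIG if dst is too small.
 */
long demo_copy_strings(int argc, const char *const *argv,
                       char *dst, size_t cap, const volatile int *killed)
{
    size_t used = 0;

    assert(argv != NULL && dst != NULL && killed != NULL);
    for (int i = 0; i < argc; i++) {
        size_t len = strlen(argv[i]) + 1; /* include the trailing NUL */

        if (demo_fatal_pending(killed))
            return -EINTR; /* abort quickly instead of finishing the copy */
        if (len > cap - used)
            return -E2BIG;
        memcpy(dst + used, argv[i], len);
        used += len;
    }
    return (long)used;
}
```

Only a fatal condition interrupts the loop in this sketch, mirroring the conservative choice the message describes; letting any signal interrupt it would change only the predicate.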
    • execve: improve interactivity with large arguments · 7993bc1f
      Roland McGrath authored
      This adds a preemption point during the copying of the argument and
      environment strings for execve, in copy_strings().  There is already
      a preemption point in the count() loop, so this doesn't add any new
      points in the abstract sense.
      
      When the total argument+environment strings are very large, the time
      spent copying them can be much more than a normal user time slice.
      So this change improves the interactivity of the rest of the system
      when one process is doing an execve with very large arguments.
      Signed-off-by: Roland McGrath <roland@redhat.com>
      Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • setup_arg_pages: diagnose excessive argument size · 1b528181
      Roland McGrath authored
      The CONFIG_STACK_GROWSDOWN variant of setup_arg_pages() does not
      check the size of the argument/environment area on the stack.
      When it is unworkably large, shift_arg_pages() hits its BUG_ON.
      This is exploitable with a very large RLIMIT_STACK limit, to
      create a crash pretty easily.
      
      Check that the initial stack is not too large to make it possible
      to map in any executable.  We're not checking that the actual
      executable (or interpreter, for binfmt_elf) will fit.  So those
      mappings might clobber part of the initial stack mapping.  But
      that is just userland lossage that userland made happen, not a
      kernel problem.
      Signed-off-by: Roland McGrath <roland@redhat.com>
      Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
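The kind of size check the setup_arg_pages message describes can be sketched as a pure function. This is an illustration under assumptions, not the patch itself: the function name, the parameter names and the notion of a lowest usable address `min_addr` are hypothetical, and the exact condition used in the kernel may differ.

```c
#include <assert.h>
#include <errno.h>

/*
 * Return 0 if an initial stack occupying [vm_start, vm_end) and ending at
 * stack_top still leaves address space below it, or -ENOMEM if the
 * argument/environment area is unworkably large.
 * min_addr is a hypothetical lowest mappable address.
 */
int demo_check_initial_stack(unsigned long stack_top, unsigned long vm_start,
                             unsigned long vm_end, unsigned long min_addr)
{
    assert(vm_end >= vm_start);
    if (stack_top < min_addr)
        return -ENOMEM;
    /* Reject a stack area that would fill everything below stack_top. */
    if (vm_end - vm_start >= stack_top - min_addr)
        return -ENOMEM;
    return 0;
}
```

Returning an error up front means an oversized argument area fails the exec cleanly rather than tripping an assertion later, which is the point of the change described above.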
    • Merge branch 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvm · be6200aa
      Linus Torvalds authored
      * 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: x86: Perform hardware_enable in CPU_STARTING callback
        KVM: i8259: fix migration
        KVM: fix i8259 oops when no vcpus are online
        KVM: x86 emulator: fix regression with cmpxchg8b on i386 hosts
    • Merge branch 'perf-fixes-for-linus' of... · f2955b49
      Linus Torvalds authored
      Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
      
      * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
        tracing: t_start: reset FTRACE_ITER_HASH in case of seek/pread
        perf symbols: Fix multiple initialization of symbol system
        perf: Fix CPU hotplug
        perf, trace: Fix module leak
        tracing/kprobe: Fix handling of C-unlike argument names
        tracing/kprobes: Fix handling of argument names
        perf probe: Fix handling of arguments names
        perf probe: Fix return probe support
        tracing/kprobe: Fix a memory leak in error case
        tracing: Do not allow llseek to set_ftrace_filter
    • KEYS: Fix bug in keyctl_session_to_parent() if parent has no session keyring · 3d96406c
      David Howells authored
      Fix a bug in keyctl_session_to_parent() whereby it tries to check the ownership
      of the parent process's session keyring whether or not the parent has a session
      keyring [CVE-2010-2960].
      
      This results in the following oops:
      
        BUG: unable to handle kernel NULL pointer dereference at 00000000000000a0
        IP: [<ffffffff811ae4dd>] keyctl_session_to_parent+0x251/0x443
        ...
        Call Trace:
         [<ffffffff811ae2f3>] ? keyctl_session_to_parent+0x67/0x443
         [<ffffffff8109d286>] ? __do_fault+0x24b/0x3d0
         [<ffffffff811af98c>] sys_keyctl+0xb4/0xb8
         [<ffffffff81001eab>] system_call_fastpath+0x16/0x1b
      
      if the parent process has no session keyring.
      
      If the system is using pam_keyinit then it is mostly protected against this, as all
      processes derived from a login will have inherited the session keyring created
      by pam_keyinit during the login procedure.
      
      To test this, pam_keyinit calls need to be commented out in /etc/pam.d/.
      Reported-by: Tavis Ormandy <taviso@cmpxchg8b.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Tavis Ormandy <taviso@cmpxchg8b.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
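The bug class in this KEYS fix, inspecting the owner of a keyring that may not exist, comes down to a missing NULL check. The sketch below illustrates the guarded form with hypothetical types (`demo_key`, `demo_cred`) and a hypothetical function name; it is not the kernel's credential code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for a keyring and credentials. */
struct demo_key {
    unsigned int uid;
};

struct demo_cred {
    unsigned int euid;
    const struct demo_key *session_keyring; /* may be NULL */
};

/*
 * Return 1 if 'mine' may replace the parent's session keyring, 0 if not.
 * The parent's keyring owner is examined only when a keyring exists.
 */
int demo_may_replace_parent_keyring(const struct demo_cred *mine,
                                    const struct demo_cred *parent)
{
    assert(mine != NULL && parent != NULL);
    if (parent->session_keyring != NULL &&
        parent->session_keyring->uid != mine->euid)
        return 0;
    return 1;
}
```

A parent without a session keyring simply passes the ownership test here; dereferencing `parent->session_keyring` unconditionally is the pattern that leads to a NULL pointer oops like the one quoted above.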
    • KEYS: Fix RCU no-lock warning in keyctl_session_to_parent() · 9d1ac65a
      David Howells authored
      There's an unprotected access to the parent process's credentials in the middle
      of keyctl_session_to_parent().  This results in the following RCU warning:
      
        ===================================================
        [ INFO: suspicious rcu_dereference_check() usage. ]
        ---------------------------------------------------
        security/keys/keyctl.c:1291 invoked rcu_dereference_check() without protection!
      
        other info that might help us debug this:
      
        rcu_scheduler_active = 1, debug_locks = 0
        1 lock held by keyctl-session-/2137:
         #0:  (tasklist_lock){.+.+..}, at: [<ffffffff811ae2ec>] keyctl_session_to_parent+0x60/0x236
      
        stack backtrace:
        Pid: 2137, comm: keyctl-session- Not tainted 2.6.36-rc2-cachefs+ #1
        Call Trace:
         [<ffffffff8105606a>] lockdep_rcu_dereference+0xaa/0xb3
         [<ffffffff811ae379>] keyctl_session_to_parent+0xed/0x236
         [<ffffffff811af77e>] sys_keyctl+0xb4/0xb6
         [<ffffffff81001eab>] system_call_fastpath+0x16/0x1b
      
      The code should take the RCU read lock to make sure the parent's credentials
      don't go away, even though it's holding a spinlock and has IRQs disabled.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>