1. 03 May, 2007 4 commits
    • holepunch: fix disconnected pages after second truncate · 7943951f
      Hugh Dickins authored
      shmem_truncate_range has its own truncate_inode_pages_range, to free any
      pages racily instantiated while it was in progress: a SHMEM_PAGEIN flag
      is set when this might have happened.  But holepunching gets no chance
      to clear that flag at the start of vmtruncate_range, so it's always set
      (unless a truncate came just before), so holepunch almost always does
      this second truncate_inode_pages_range.
      
      shmem holepunch has unlikely swap<->file races hereabouts whatever we do
      (without a fuller rework than is fit for this release): I was going to
      skip the second truncate in the punch_hole case, but Miklos points out
      that would make holepunch correctness more vulnerable to swapoff.  So
      keep the second truncate, but follow it by an unmap_mapping_range to
      eliminate the disconnected pages (freed from pagecache while still
      mapped in userspace) that it might have left behind.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
    • holepunch: fix shmem_truncate_range punch locking · ffd0472d
      Hugh Dickins authored
      Miklos Szeredi observes that during truncation of shmem page directories,
      info->lock is released to improve latency (after lowering i_size and
      next_index to exclude races); but this is quite wrong for holepunching,
      which receives no such protection from i_size or next_index, and is left
      vulnerable to races with shmem_unuse, shmem_getpage and shmem_writepage.
      
      Hold info->lock throughout when holepunching?  No, any user could prevent
      rescheduling for far too long.  Instead take info->lock just when needed:
      in shmem_free_swp when removing the swap entries, and whenever removing
      a directory page from the level above.  But so long as we remove before
      scanning, we can safely skip taking the lock at the lower levels, except
      at misaligned start and end of the hole.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
    • holepunch: fix shmem_truncate_range punching too far · 0e846d67
      Hugh Dickins authored
      Miklos Szeredi observes BUG_ON(!entry) in shmem_writepage() triggered
      in rare circumstances, because shmem_truncate_range() erroneously
      removes partially truncated directory pages at the end of the range:
      later reclaim on pages pointing to these removed directories triggers
      the BUG.  Indeed, and it can also cause data loss beyond the hole.
      
      Fix this as in the patch proposed by Miklos, but distinguish between
      "limit" (how far we need to search: ignore truncation's next_index
      optimization in the holepunch case - if there are races it's more
      consistent to act on the whole range specified) and "upper_limit"
      (how far we can free directory pages: generally we must be careful
      to keep partially punched pages, but can relax at end of file -
      i_size being held stable by i_mutex).
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
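The rule that partially punched directory pages must be kept can be illustrated with a small userspace sketch. This is a simplified model, not the shmem code: `ENTRIES_PER_DIR`, `dir_page_freeable()` and the flat one-level layout are assumptions made for illustration, and only the name `upper_limit` comes from the commit message.

```c
#include <stdbool.h>

/* Hypothetical value: number of page indices covered by one directory page. */
#define ENTRIES_PER_DIR 4UL

/*
 * Decide whether directory page number `dir` may be freed while punching a
 * hole that starts at page index `start`.
 *
 * `upper_limit` (exclusive) is how far directory pages may be freed. In this
 * model a caller passes the end of the hole in general, or a value rounded up
 * past the last index when the hole reaches end of file.
 */
bool dir_page_freeable(unsigned long dir, unsigned long start,
                       unsigned long upper_limit)
{
    unsigned long first = dir * ENTRIES_PER_DIR;    /* first index covered */
    unsigned long last = first + ENTRIES_PER_DIR;   /* one past the last index */

    /* Free only a directory page that lies wholly inside the freeable range. */
    return first >= start && last <= upper_limit;
}
```

With a hole covering indices 2 to 9, directory page 1 (indices 4 to 7) is freeable, while pages 0 and 2 are only partially covered and must be kept unless the limit is relaxed at end of file.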
    • Linux 2.6.16.50 · e76e407e
      Adrian Bunk authored
  2. 01 May, 2007 1 commit
  3. 30 Apr, 2007 2 commits
  4. 25 Apr, 2007 5 commits
  5. 23 Apr, 2007 1 commit
    • x86 microcode: don't check the size · fe1a5ddf
      Shaohua Li authored
      The IA32 manual says that if a microcode update's size is 0, then the
      default size (2048 bytes) applies. But to me this doesn't suggest that
      every microcode update must be at least 2048 bytes. We actually had a
      microcode update whose size is 1024 bytes. The patch just removes the
      check.
      
      Backported by Daniel Drake.
      Signed-off-by: Daniel Drake <dsd@gentoo.org>
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
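The size rule described in this commit can be sketched as follows. This is a minimal userspace illustration, not the kernel's microcode driver: `struct ucode_header`, its single field, `DEFAULT_UCODE_TOTALSIZE` and `ucode_total_size()` are hypothetical names; only the 2048-byte default and the 1024-byte example come from the commit message.

```c
#include <stdint.h>

/* Default size that applies when the header's size field is 0. */
#define DEFAULT_UCODE_TOTALSIZE 2048u

/* Hypothetical, trimmed-down header: only the field this sketch needs. */
struct ucode_header {
    uint32_t totalsize;   /* 0 means "use the default size" */
};

/*
 * Return the effective size of an update. A zero field falls back to the
 * default; any other value is taken as-is, with no 2048-byte minimum.
 */
uint32_t ucode_total_size(const struct ucode_header *hdr)
{
    return hdr->totalsize ? hdr->totalsize : DEFAULT_UCODE_TOTALSIZE;
}
```

Under this reading, a 1024-byte update is accepted with its stated size instead of being rejected for falling below the default.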
  6. 22 Apr, 2007 1 commit
  7. 20 Apr, 2007 4 commits
    • Linux 2.6.16.49-rc1 · eeceec45
      Adrian Bunk authored
    • tty_io: fix race in master pty close/slave pty close path · fec2411a
      Aristeu Sergio Rozanski Filho authored
      This patch fixes a possible race that leads to double freeing an idr index.
      When the master begins to close, release_dev() is called and then
      pty_close() is called:
      
              if (tty->driver->close)
                      tty->driver->close(tty, filp);
      
      This is done without holding any locks other than the BKL.  Inside pty_close(),
      being a master close, the devpts entry will be removed:
      
      #ifdef CONFIG_UNIX98_PTYS
                      if (tty->driver == ptm_driver)
                              devpts_pty_kill(tty->index);
      #endif
      
      But devpts_pty_kill() will call get_node(), which may sleep while waiting for
      &devpts_root->d_inode->i_sem.  If the slave is being opened when this happens,
      tty_open() has just found the driver and index:
      
              driver = get_tty_driver(device, &index);
              if (!driver) {
                      mutex_unlock(&tty_mutex);
                      return -ENODEV;
              }
      
      This part of the code is already protected by tty_mutex.  The problem is
      that the slave open already got an index.  Then init_dev() is called and
      blocks waiting for the same &devpts_root->d_inode->i_sem.
      
      When the master close resumes, it removes the devpts entry, and the
      relation between idr index and the tty is gone.  The master then sleeps
      waiting for the tty_mutex on release_dev().
      
      Slave open resumes and finds no tty for that index.  As a result, a NULL tty
      is returned and init_dev() doesn't jump to fast_track:
      
              /* check whether we're reopening an existing tty */
              if (driver->flags & TTY_DRIVER_DEVPTS_MEM) {
                      tty = devpts_get_tty(idx);
                      if (tty && driver->subtype == PTY_TYPE_MASTER)
                              tty = tty->link;
              } else {
                      tty = driver->ttys[idx];
              }
              if (tty) goto fast_track;
      
      As a result, a new tty is created and init_dev() returns successfully.
      After it returns, tty_mutex is dropped and master close may resume.
      
      Master close finds it's the only use and both sides are closing, then releases
      the tty and the index. At this point, the idr index is free, but slave still
      has it.
      
      Slave open then calls pty_open() and finds that tty->link->count is 0,
      because there's no master, and returns an error.  Then tty_open() calls
      release_dev() which executes without any warning, as it was a case of last
      slave close when the master is already closed (master->count == 0,
      slave->count == 1).  The tty is then released with the already released idr
      index.
      
      This normally would only issue a warning on idr_remove() but in case of a
      customer's critical application, it's never too simple:
      
      thread1: opens master, gets index X
      thread1: begin closing master
      thread2: begin opening slave with index X
      thread1: finishes closing master, index X released
      thread3: opens master, gets index X, just released
      thread2: fails opening slave, releases index X         <----
      thread4: opens master, gets index X, init_dev() then finds an already in-use
               and healthy tty and fails
      
      If no more indexes are released, ptmx_open() will keep failing, as the
      first free index available is X, and it will make init_dev() fail because
      you're trying to "reopen a master" which isn't valid.
      
      The patch notices when this race happens and makes init_dev() fail
      immediately.  The init_dev() function is called with tty_mutex held, so it's
      safe to continue with tty till the end of function because release_dev()
      won't make any further changes without grabbing the tty_mutex.
      
      Without the patch, on some machines it's easy to get idr warnings
      like this one:
      
      idr_remove called for id=15 which is not allocated.
       [<c02555b9>] idr_remove+0x139/0x170
       [<c02a1b62>] release_mem+0x182/0x230
       [<c02a28e7>] release_dev+0x4b7/0x700
       [<c02a0ea7>] tty_ldisc_enable+0x27/0x30
       [<c02a1e64>] init_dev+0x254/0x580
       [<c02a0d64>] check_tty_count+0x14/0xb0
       [<c02a4f05>] tty_open+0x1c5/0x340
       [<c02a4d40>] tty_open+0x0/0x340
       [<c017388f>] chrdev_open+0xaf/0x180
       [<c017c2ac>] open_namei+0x8c/0x760
       [<c01737e0>] chrdev_open+0x0/0x180
       [<c0167bc9>] __dentry_open+0xc9/0x210
       [<c0167e2c>] do_filp_open+0x5c/0x70
       [<c0167a91>] get_unused_fd+0x61/0xd0
       [<c0167e93>] do_sys_open+0x53/0x100
       [<c0167f97>] sys_open+0x27/0x30
       [<c010303b>] syscall_call+0x7/0xb
      
      using this test application available on:
       http://www.ruivo.org/~aris/pty_sodomizer.c
      Signed-off-by: Aristeu Sergio Rozanski Filho <aris@ruivo.org>
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
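The thread1 to thread4 timeline in this commit message can be replayed against a toy index allocator. This is a sketch under stated assumptions: `struct index_pool`, `pool_alloc()` and `pool_release()` are hypothetical stand-ins for the idr, not its real API, and the pool hands out the lowest free index as the message describes for ptmx_open().

```c
#include <stdbool.h>

#define MAX_INDEX 8

/* Toy stand-in for the idr: in_use[i] is true while index i is allocated. */
struct index_pool {
    bool in_use[MAX_INDEX];
};

/* Hand out the lowest free index, or -1 if none is left. */
int pool_alloc(struct index_pool *p)
{
    for (int i = 0; i < MAX_INDEX; i++) {
        if (!p->in_use[i]) {
            p->in_use[i] = true;
            return i;
        }
    }
    return -1;
}

/*
 * Release an index. Returns false when the index was not allocated, which
 * is the situation the "idr_remove called for id=..." warning reports.
 * The pool cannot tell a stale owner from the current one, so a stale
 * release of a re-allocated index succeeds silently.
 */
bool pool_release(struct index_pool *p, int idx)
{
    if (idx < 0 || idx >= MAX_INDEX || !p->in_use[idx])
        return false;
    p->in_use[idx] = false;
    return true;
}
```

Replaying the sequence: one owner allocates X and releases it, a second owner allocates X again, and then the stale holder releases X. That last release succeeds without any warning, so the next allocation returns X while the second owner still uses it.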
    • elevator: move clearing of unplug flag earlier · 908991f1
      Linas Vepstas authored
      A flag was recently added to the elevator code to avoid
      performing an unplug when requests are being re-queued.
      The goal of this flag was to avoid a deep recursion that
      can occur when re-queueing requests after a SCSI device/host
      reset.  See http://lkml.org/lkml/2006/5/17/254
      
      However, that fix added the flag near the bottom of a case
      statement, where an earlier break (in an if statement) could
      transport one out of the case, without setting the flag.
      This patch sets the flag earlier in the case statement.
      
      I re-discovered the deep recursion recently during testing;
      I was told that it was a known problem, and the fix to it was
      in the kernel I was testing. Indeed it was ... but it didn't
      fix the bug. With the patch below, I no longer see the bug.
      
      Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
      Signed-off-by: Jens Axboe <axboe@suse.de>
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
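The control-flow problem described in this commit, an early break leaving a case before the flag is cleared, can be reduced to the following sketch. It is not the elevator code: the enum, the `ordered_seq_idle` parameter and both function names are hypothetical, chosen only to show the shape of the bug and of the fix.

```c
#include <stdbool.h>

/* Hypothetical insertion kinds; only the requeue case matters here. */
enum insert_where { INSERT_BACK, INSERT_REQUEUE };

/* Returns true if the queue would be unplugged after the insert. */
bool insert_buggy(enum insert_where where, bool ordered_seq_idle)
{
    bool unplug_it = true;

    switch (where) {
    case INSERT_REQUEUE:
        if (ordered_seq_idle)
            break;              /* leaves the case before the flag is cleared */
        unplug_it = false;      /* never reached on the early-break path */
        break;
    default:
        break;
    }
    return unplug_it;
}

/* Same logic with the flag cleared at the top of the case. */
bool insert_fixed(enum insert_where where, bool ordered_seq_idle)
{
    bool unplug_it = true;

    switch (where) {
    case INSERT_REQUEUE:
        unplug_it = false;      /* cleared first, so every path sees it */
        if (ordered_seq_idle)
            break;              /* early-exit path */
        break;                  /* normal requeue path */
    default:
        break;
    }
    return unplug_it;
}
```

In the buggy variant a requeue that takes the early break still reports that an unplug should happen, which is the path that allows the recursion. In the fixed variant no requeue path does.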
    • start_kernel: test if irq's got enabled early, barf, and disable them again · cfef9300
      Ard van Breemen authored
      The calls made by parse_parms to other initialization code might enable
      interrupts again way too early.
      
      Having interrupts on this early can make systems PANIC when they initialize
      the IRQ controllers (which happens later in the code).  This patch detects
      that irq's are enabled again, barfs about it and disables them again as a
      safety net.
      
      [akpm@osdl.org: cleanups]
      Signed-off-by: Ard van Breemen <ard@telegraafnet.nl>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
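The safety net described here (detect that interrupts were enabled too early, complain, and disable them again) can be modelled in userspace as below. This is only a simulation: `irqs_on`, the two hook functions and `run_early_hook()` are hypothetical names, a plain boolean stands in for the CPU interrupt state, and the warning text is made up for the sketch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Simulated interrupt state; false means interrupts are disabled. */
bool irqs_on = false;

/* Hypothetical early-init hook that wrongly enables interrupts. */
void misbehaving_early_param(void)
{
    irqs_on = true;
}

/* Hypothetical early-init hook that leaves interrupts alone. */
void well_behaved_early_param(void)
{
}

/*
 * Run one early hook with interrupts disabled. If the hook left them
 * enabled, warn, disable them again, and report the violation.
 */
bool run_early_hook(void (*hook)(void))
{
    irqs_on = false;
    hook();
    if (irqs_on) {
        fprintf(stderr, "warning: interrupts were enabled early\n");
        irqs_on = false;    /* restore the expected state as a safety net */
        return true;
    }
    return false;
}
```

The check does not fix the offending hook. It only reports the problem and restores the state that later initialization expects.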
  8. 19 Apr, 2007 7 commits
  9. 15 Apr, 2007 1 commit
  10. 13 Apr, 2007 14 commits