1. 31 Aug, 2002 9 commits
  2. 30 Aug, 2002 31 commits
    • Greg Kroah-Hartman's avatar
      3c9bd375
    • Linus Torvalds's avatar
      The SCSI layer should _not_ try to decide about non-existent · ba26eacc
      Linus Torvalds authored
      partitions. The higher layers do a better job of it.
      ba26eacc
    • Neil Brown's avatar
      [PATCH] PATCH - kNFSd - More small fixes for TCP nfsd · 03d7a386
      Neil Brown authored
      sk_inuse should be bigger than "char" as we can
      have more than 255 server threads.  Due to the way the count
      is used, this is unlikely to actually cause a problem, but it
      should nonetheless be fixed.
      
      Also, two printk generate more noise than we would like,
      so turn them into dprintk (debugging printk).
      03d7a386
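The wrap-around that motivates widening sk_inuse can be sketched in user space. This is only an illustration: the struct names below are hypothetical stand-ins rather than the kernel's socket structure, and the narrow field is assumed to behave like an unsigned char.

```c
#include <assert.h>

/* Hypothetical stand-ins for the two field widths; not the kernel's struct. */
struct sock_narrow { unsigned char inuse; };
struct sock_wide   { unsigned int  inuse; };

/* Take `threads` references on a char-sized counter and return the result. */
static unsigned int count_narrow(unsigned int threads)
{
    struct sock_narrow s = { 0 };

    for (unsigned int i = 0; i < threads; i++)
        s.inuse++;              /* wraps modulo 256 */
    return s.inuse;
}

/* Same loop with a wider counter: no wrap for realistic thread counts. */
static unsigned int count_wide(unsigned int threads)
{
    struct sock_wide s = { 0 };

    for (unsigned int i = 0; i < threads; i++)
        s.inuse++;
    return s.inuse;
}
```

With 300 simulated threads the narrow counter ends at 44 instead of 300, which is the kind of miscount the message describes as unlikely to matter in practice but still worth fixing.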
    • Chuck Lever's avatar
      [PATCH] sock_writeable not appropriate for TCP sockets, for 2.5.32 · d2279c44
      Chuck Lever authored
      sock_writeable determines whether there is space in a socket's output
      buffer.  socket write_space callbacks use it to determine whether to wake
      up those that are waiting for more output buffer space.
      
      however, sock_writeable is not appropriate for TCP sockets.  because the
      RPC layer's write_space callback uses it for TCP sockets, the RPC layer
      hammers on sock_sendmsg with dozens of write requests that are only a few
      hundred bytes long when it is trying to send a large write RPC request.
      this patch adds logic to the RPC layer's write_space callback that
      properly handles TCP sockets.
      
      patch reviewed by Trond and Alexey.
      d2279c44
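A rough user-space model of why a datagram-style writability test misfires on a stream socket. The struct, its field names, and both predicates are assumptions made for illustration; they are not taken from the patch, whose actual logic is not shown in the message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified socket accounting; field names are illustrative. */
struct model_sock {
    int sndbuf;       /* send buffer limit in bytes */
    int wmem_alloc;   /* bytes currently in flight below the socket */
    int wmem_queued;  /* bytes sitting on the stream send queue */
};

/* Datagram-style test: writable once less than half the buffer is in flight. */
static bool dgram_writeable(const struct model_sock *sk)
{
    return sk->wmem_alloc < sk->sndbuf / 2;
}

/* Stream-style test: free space must be at least half of what is queued. */
static bool stream_writeable(const struct model_sock *sk)
{
    int free_space = sk->sndbuf - sk->wmem_queued;

    return free_space >= sk->wmem_queued / 2;
}
```

With a 65536-byte buffer, 65000 bytes queued and 1000 bytes in flight, the datagram-style test reports "writable" although only 536 bytes are free. A writer woken on that signal would send a few hundred bytes at a time, which matches the behaviour the message complains about.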
    • Chuck Lever's avatar
      [PATCH] prevent oops in xprt_lock_write, against 2.5.32 · 1758bdf3
      Chuck Lever authored
      when several RPC requests want to reconnect a TCP transport socket at
      once, xprt_lock_write serializes the tasks to prevent multiple socket
      connects.  however, TCP connects are always done by an RPC child task that
      has no request slot.  xprt_lock_write can oops if there is no request slot
      allocated to the invoking RPC task.  reviewed and accepted by Trond.
      
      the xprt_lock_write changes are not yet in 2.4, so this patch does not
      apply to 2.4.
      1758bdf3
    • Ingo Molnar's avatar
      [PATCH] TLS boot-initialization bugfix on SMP, 2.5.32-BK · 44a05b3e
      Ingo Molnar authored
      This fixes a bad TLS initialization bug found by Andi Kleen.  x86/SMP
      only worked due to luck.
      44a05b3e
    • Ingo Molnar's avatar
      [PATCH] scheduler fixes, 2.5.32-BK · 2c638ab0
      Ingo Molnar authored
      This adds two scheduler related fixes:
      
       - changes the migration code to use struct completion. Andrew pointed out
         that there might be a small window where the up() touches the
         semaphore while the waiting task goes on and frees its stack. And
         completion is more suited for this kind of stuff anyway.
      
       - removes two unneeded exports, pointed out by Andrew.
      2c638ab0
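The stack-lifetime concern behind the switch to struct completion can be illustrated with a user-space analogue built on POSIX threads. Everything named *_model below is hypothetical; it mimics the idea of a completion (signal under the lock, so the waiter cannot return and release its stack frame while the signaller is still touching the object) and is not the kernel implementation.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical user-space analogue of a completion: a flag guarded by a lock. */
struct completion_model {
    pthread_mutex_t lock;
    pthread_cond_t  wait;
    int             done;
};

static void init_completion_model(struct completion_model *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->wait, NULL);
    c->done = 0;
}

/* Signal while holding the lock: the waiter can only proceed after unlock. */
static void complete_model(struct completion_model *c)
{
    pthread_mutex_lock(&c->lock);
    c->done = 1;
    pthread_cond_signal(&c->wait);
    pthread_mutex_unlock(&c->lock);
}

static void wait_for_completion_model(struct completion_model *c)
{
    pthread_mutex_lock(&c->lock);
    while (!c->done)
        pthread_cond_wait(&c->wait, &c->lock);
    pthread_mutex_unlock(&c->lock);
}

/* Worker standing in for the migration thread. */
static void *worker(void *arg)
{
    complete_model(arg);
    return NULL;
}

/* The completion lives on the caller's stack, as in the scenario described. */
static int run_once(void)
{
    struct completion_model c;
    pthread_t t;

    init_completion_model(&c);
    if (pthread_create(&t, NULL, worker, &c) != 0)
        return -1;
    wait_for_completion_model(&c);
    pthread_join(t, NULL);
    return c.done;
}
```

run_once() returns 1 once the worker has signalled; on older toolchains the program may need to be linked with -pthread.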
    • Ingo Molnar's avatar
      [PATCH] clone-cleanup 2.5.32-BK · 1f9d6582
      Ingo Molnar authored
      This moves CLONE_SETTID and CLONE_CLEARTID handling into kernel/fork.c,
      where it belongs.  [the CLONE_SETTLS is x86-specific and thus remains in
      the per-arch process.c] This makes support for these two new flags much
      easier: architectures only have to pass in the user_tid pointer.
      1f9d6582
    • Dominik Brodowski's avatar
      [PATCH] include/asm-i386/msr.h · a27b8fe9
      Dominik Brodowski authored
      It would be helpful if these msr.h #defines could get in.
      a27b8fe9
    • David Mosberger's avatar
      [PATCH] efi.h move · af05fc03
      David Mosberger authored
      It makes no sense to keep efi.h as an ia64-specific header (there really
      are x86 machines coming out with optional EFI BIOS support).
      af05fc03
    • Linus Torvalds's avatar
      Merge http://lia64.bkbits.net/to-linus-2.5 · 9caf366e
      Linus Torvalds authored
      into home.transmeta.com:/home/torvalds/v2.5/linux
      9caf366e
    • Andrew Morton's avatar
      [PATCH] ext3 __FUNCTION__ pasting fix · 7c0700ff
      Andrew Morton authored
      Fix a __FUNCTION__ paste in revoke.c
      7c0700ff
    • Andrew Morton's avatar
      [PATCH] O_DIRECT for ext3 · a3b71057
      Andrew Morton authored
      O_DIRECT support for ext3.
      
      It works OK in all journalling modes.
      
      Updates to the file metadata and inode are journalled as usual.
      
      If the system crashes during an appending O_DIRECT write then journal
      recovery will truncate the written-to file back to the length which it
      had on entry to that write.
      
      If the system crashes during a file overwrite to existing blocks then
      the file contents will be an unknown mixture of old and new.
      
      If the system crashes during a file overwrite which instantiates new
      blocks in the middle of the file then there is a possibility of
      uninitialised disk blocks being present in the file post-recovery.
      a3b71057
    • Andrew Morton's avatar
      [PATCH] fix an ext3 deadlock · bebff73c
      Andrew Morton authored
      mpage_writepages() does a lock_page() on pages to be written back, even
      when it is being used for page reclaim writeback.
      
      This is normally OK, because the page is unlocked quickly - pages are
      unlocked during writeback and nobody should be performing __GFP_FS
      allocations inside lock_page().
      
      But it has introduced a ranking problem in ext3:
      
      generic_file_write
      ->lock_page
        ->ext3_prepare_write
          ->journal_start	(waits for a commit)
      
      versus
      
      ext3_create()
      ->journal_start()
        ->ext3_new_inode(GFP_KERNEL)
          ->page reclaim
            ->mpage_writepages
              ->lock_page	(locks up, transaction is held open)
      
      Maybe sometime, I'll have to turn mpage_writepages' lock_page into a
      trylock if the caller is PF_MEMALLOC.  But for now, let's make ext3's
      inside-transaction allocations use GFP_NOFS.  There is only one of them.
      bebff73c
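The effect of switching the inside-transaction allocation to GFP_NOFS can be modelled in a few lines. The flag values and helper names below are assumptions for illustration only; the point is simply that reclaim is allowed to call back into filesystem writeback only when the FS bit is present in the allocation mask.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits; the values are assumptions, not the kernel's. */
#define MODEL_GFP_WAIT   0x1u
#define MODEL_GFP_IO     0x2u
#define MODEL_GFP_FS     0x4u
#define MODEL_GFP_KERNEL (MODEL_GFP_WAIT | MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOFS   (MODEL_GFP_WAIT | MODEL_GFP_IO)

/* Reclaim may call back into the filesystem only when the FS bit is set. */
static bool reclaim_may_enter_fs(unsigned int gfp_mask)
{
    return (gfp_mask & MODEL_GFP_FS) != 0;
}

/* True if an allocation made inside an open transaction could recurse
 * into filesystem writeback and block on a locked page. */
static bool can_deadlock(bool in_transaction, unsigned int gfp_mask)
{
    return in_transaction && reclaim_may_enter_fs(gfp_mask);
}
```

In this model the ext3_create() path with a GFP_KERNEL-style mask can reach the lock_page() in writeback, while the same path with a GFP_NOFS-style mask cannot.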
    • Andrew Morton's avatar
      [PATCH] writeback correctness and efficiency changes · ec12ac49
      Andrew Morton authored
      This is a performance and correctness fix against the writeback paths.
      
      The writeback code has competing requirements.  Sometimes it is used
      for "memory cleansing": kupdate, bdflush, writer throttling, page
      allocator writeback, etc.  And sometimes this same code is used for
      data integrity purposes: fsync, msync, fdatasync, sync, umount, various
      other kernel-internal uses.
      
      The problem is: how to handle a dirty buffer or page which is currently
      under writeback.
      
      For memory cleansing, we just want to skip that buffer/page and go onto
      the next one.  But for sync, we must wait on the old writeback and then
      start new writeback.
      
      mpage_writepages() is currently correct for cleansing, but incorrect for
      sync.  block_write_full_page() is currently correct for sync, but
      inefficient for cleansing.
      
      The fix is fairly simple.
      
      - In mpage_writepages(), don't skip the page if it's a sync
      operation.
      
      - In block_write_full_page(), skip the buffer if it is not a sync
      operation.  And return -EAGAIN to tell the caller that the writeout
      didn't work out.  The caller must then set the page dirty again and
      move it onto mapping->dirty_pages.
      
      This is an extension of the writepage API: writepage can now return
      -EAGAIN.  There are only three callers, and they have been updated.
      
      fail_writepage() and ext3_writepage() were actually doing this by
      hand.  They have been changed to return -EAGAIN.  NTFS will want to
      be able to return -EAGAIN from its writepage as well.
      
      - A sticky question is: how to tell the writeout code which mode it
      is operating in?  Cleansing or sync?
      
      It's such a tiny code change that I didn't have the heart to go and
      propagate a `mode' argument down every instance of writepages() and
      writepage() in the kernel.  So I passed it in via current->flags.
      
      Incidentally, the occurrence of a locked-and-dirty buffer in
      block_write_full_page() is fairly rare: normally the collision avoidance
      happens at the address_space level, via PageWriteback.  But some
      mappings (blockdevs, ext3 files, etc) have their dirty buffers written
      out via submit_bh().  It is these buffers which can stall
      block_write_full_page().
      
      This wart will be pretty intrusive to fix.  ext3 needs to become fully
      page-based (ugh.  It's a block-based journalling filesystem, and pages
      are unnatural).  blockdev mappings are still written out by buffers
      because that's how filesystems use them.  Putting _all_ metadata
      (indirects, inodes, superblocks, etc) into standalone address_spaces
      would fix that up.
      
      - filemap_fdatawrite() sets PF_SYNC.  So filemap_fdatawrite() is the
      kernel function which will start writeback against a mapping for
      "data integrity" purposes, whereas the unexported, internal-only
      do_writepages() is the writeback function which is used for memory
      cleansing.  This difference is the reason why I didn't consolidate
      those functions ages ago...
      
      - Lots of code paths had a bogus extra call to filemap_fdatawait(),
      which I previously added in a moment of weak-headedness.  They have
      all been removed.
      ec12ac49
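The two writeback modes can be sketched as a small user-space model. It follows the rule stated in the message (skip a busy buffer when cleansing, wait for it when syncing) and the -EAGAIN convention for the caller. The struct, the flag value, and the function names are hypothetical; only the idea of passing the mode through a per-task flag comes from the message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MODEL_PF_SYNC 0x1u      /* hypothetical per-task flag value */

struct model_buffer {
    bool locked;    /* already under writeback started by someone else */
    bool dirty;
    int  waits;     /* how many times a caller had to wait */
};

/* Write one buffer. Sync mode waits for in-flight I/O and then writes;
 * cleansing mode skips a busy buffer and reports -EAGAIN. */
static int model_write_buffer(struct model_buffer *bh, unsigned int task_flags)
{
    if (bh->locked) {
        if (!(task_flags & MODEL_PF_SYNC))
            return -EAGAIN;     /* cleansing: do not stall */
        bh->waits++;            /* sync: wait for the old writeback */
        bh->locked = false;
    }
    bh->dirty = false;          /* new writeback started */
    return 0;
}

/* Caller side: on -EAGAIN the page is marked dirty again for a later pass. */
static bool model_writepage(struct model_buffer *bh, unsigned int task_flags)
{
    int ret = model_write_buffer(bh, task_flags);

    if (ret == -EAGAIN)
        bh->dirty = true;
    return ret == 0;
}
```

A cleansing pass over a locked buffer returns without waiting and leaves it dirty; a later pass with the sync flag set waits once and writes it.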
    • Andrew Morton's avatar
      [PATCH] batched freeing of anon pages · 8fd3d458
      Andrew Morton authored
      A reworked version of the batched page freeing and lock amortisation
      for VMA teardown.
      
      It walks the existing 507-page list in the mmu_gather_t in 16-page
      chunks, drops their refcounts in 16-page chunks, and de-LRUs and
      frees any resulting zero-count pages in up-to-16 page chunks.
      8fd3d458
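The chunked walk can be sketched as follows. The 507-page list and the 16-page chunk size come from the message; the page struct, the statistics struct, and the notion of a "lock round" are assumptions used to show how batching amortises locking.

```c
#include <assert.h>
#include <stddef.h>

#define GATHER_PAGES 507    /* list size quoted in the message */
#define BATCH        16     /* chunk size quoted in the message */

struct model_page { int refcount; };

struct batch_stats {
    size_t lock_rounds;     /* times the (imaginary) LRU lock was taken */
    size_t freed;           /* pages whose count dropped to zero */
};

static struct batch_stats release_batched(struct model_page *pages, size_t n)
{
    struct batch_stats st = { 0, 0 };

    for (size_t start = 0; start < n; start += BATCH) {
        size_t end = start + BATCH < n ? start + BATCH : n;
        size_t zero = 0;

        /* Pass 1: drop the references for this chunk. */
        for (size_t i = start; i < end; i++)
            if (--pages[i].refcount == 0)
                zero++;

        /* Pass 2: one lock round frees every zero-count page of the chunk. */
        if (zero) {
            st.lock_rounds++;
            st.freed += zero;
        }
    }
    return st;
}
```

For 507 pages that each hold a single reference, this takes the lock 32 times (31 full chunks plus one chunk of 11) instead of once per page.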
    • Andrew Morton's avatar
      [PATCH] put_page() consolidation · 2b341443
      Andrew Morton authored
      Clean up put_page() and page_cache_release().  It's pretty simple now:
      
      #define page_cache_get(page)           get_page(page)
      #define page_cache_release(page)       put_page(page)
      2b341443
    • Andrew Morton's avatar
      [PATCH] remove pagevec_lru_del() · e035a047
      Andrew Morton authored
      it was only being used in invalidate_inode_pages(), and from there,
      pagevec_release() does the same thing.
      e035a047
    • Andrew Morton's avatar
      [PATCH] debug check in put_page_testzero() · c99b0372
      Andrew Morton authored
      As suggested by Daniel - it's a bug to run put_page_testzero
      against a zero-ref page.
      c99b0372
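A minimal user-space sketch of such a check, assuming a plain integer reference count. The struct and function names are hypothetical, and assert() stands in for whatever debug macro the kernel patch actually uses.

```c
#include <assert.h>
#include <stdbool.h>

struct ref_page { int count; };

/* Drop one reference; return true when the last reference went away.
 * The assert models the debug check: dropping a reference on a page
 * that already has none is a bug in the caller. */
static bool model_put_testzero(struct ref_page *p)
{
    assert(p->count > 0);
    return --p->count == 0;
}
```

Calling it twice on a page with two references returns false and then true; a third call would trip the check.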
    • Ingo Molnar's avatar
      [PATCH] MAINTAINERS patch · cdf2f98b
      Ingo Molnar authored
      please apply this patch (Robert ACK-ed it). While there is a preemptible
      kernel entry already, I think listing this at the scheduler entry is
      justified: preemption has a number of scheduler interactions.
      cdf2f98b
    • Ingo Molnar's avatar
      [PATCH] ldt-fix-2.5.32-A3 · 89d637a8
      Ingo Molnar authored
      this is an updated version of the LDT fixes. It fixes the following kinds
      of problems:
      
       - fix a possible gcc optimization causing a race causing the loading of a
         corrupt LDT descriptor upon context switch. [this fix got simplified
         over previous versions.]
      
       - remove an unconditional OOM printk, and there's no need to set ->size
         in the OOM path.
      
       - fix preemption bugs, load_LDT()/clear_LDT() was not preemption-safe,
         when it was used outside of spinlocks.
      
      the context-switch race is the following. 'LDT modification' is the
      following operation: the seg->ldt pointer is modified, then seg->size is
      modified. In theory gcc is free to reschedule the two modifications, and
      first modify ->size, then ->ldt. Thus if this modification is not
      synchronized with context-switches, another thread might see a temporary
      state of the new ->size [which was increased], but still the old pointer.
      Ie.:
      
      	CPU0				CPU1
      
      	pc->size = newsize;
      					load_LDT(); // (oldptr, newsize)
      	pc->ldt = newptr;
      
      the corrupt LDT is loaded until the SMP cross-call is sent, leaving the
      window open for many usecs.
      
      the fix is to put a wmb() after ->ldt modifications. [this is also in
      preparation of not-write-ordered SMP x86 designs.]
      89d637a8
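The ordering requirement can be expressed in portable C11 as a sketch. The release fence below plays the role that wmb() plays in the message; the struct, the function names, and the reader-side acquire fence are assumptions added for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-context LDT bookkeeping. */
struct model_ctx {
    _Atomic(int *) ldt;
    atomic_int     size;
};

/* Writer: publish the new table pointer first, then the larger size.
 * The release fence keeps the two stores from being reordered. */
static void model_grow_ldt(struct model_ctx *pc, int *newptr, int newsize)
{
    atomic_store_explicit(&pc->ldt, newptr, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&pc->size, newsize, memory_order_relaxed);
}

/* Reader: load the size first, then the pointer, with a matching fence,
 * so a reader that observes the new size also observes the new pointer. */
static int *model_load_ldt(struct model_ctx *pc, int *size_out)
{
    *size_out = atomic_load_explicit(&pc->size, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);
    return atomic_load_explicit(&pc->ldt, memory_order_relaxed);
}
```

With this ordering a concurrent reader can at worst see the new pointer with the old, smaller size, which is harmless; the (oldptr, newsize) combination from the diagram is ruled out.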
    • Linus Torvalds's avatar
      Merge bk://linux-input.bkbits.net/linux-input · e5d588fe
      Linus Torvalds authored
      into home.transmeta.com:/home/torvalds/v2.5/linux
      e5d588fe
    • Vojtech Pavlik's avatar
      Ignore error 0xff - 'general error' in AUX wire test in i8042.c, · ed0a0a9c
      Vojtech Pavlik authored
      some mainboards (Andrew Morton's Dell) report that even when everything
      is okay with AUX. Also remove a check for very old AMI i8042's, which
      could generate false positives on modern buggy mainboards.
      ed0a0a9c
    • Linus Torvalds's avatar
      Merge bk://jfs.bkbits.net/linux-2.5 · c71a4337
      Linus Torvalds authored
      into home.transmeta.com:/home/torvalds/v2.5/linux
      c71a4337
    • Peter Wächtler's avatar
      652cbb16
    • Peter Wächtler's avatar
      f7dc2012
    • Peter Wächtler's avatar
      81b1edf0
    • Peter Wächtler's avatar
      2fcfdf56
    • Peter Wächtler's avatar
      7a6316fd
    • Peter Wächtler's avatar
      e8342e87