1. 03 Dec, 2002 20 commits
    • [PATCH] readdir speedup and fixes · c384a968
      Andrew Morton authored
      2.5 is 20% slower than 2.4 in an AIM9 test which is just running
      readdir across /bin.  A lot of this is due to lots of tiny calls to
      copy_to_user() in fs/readdir.c.  The patch speeds up that test by 50%,
      so it's comfortably faster than 2.4.
      
      Also, there were lots of unchecked copy_to_user() and put_user() calls
      in there.  Fixed all that up as well.
      
      The patch assumes that each arch has a working 64-bit put_user(), which
      appears to be the case.
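
To illustrate why many tiny user copies cost more than one larger one, here is a minimal userspace sketch. It is not the code from this patch: `demo_dirent`, `demo_copy_out()`, `fill_per_field()` and `fill_batched()` are hypothetical names, and `demo_copy_out()` merely stands in for copy_to_user() so that the calls can be counted and their return values checked.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record layout, not the kernel's struct linux_dirent. */
struct demo_dirent {
    uint64_t ino;
    uint64_t off;
    uint16_t reclen;
    char     name[32];
};

unsigned long copy_calls;   /* number of simulated user copies */
int simulate_fault;         /* set nonzero to make every copy fail */

/* Stand-in for copy_to_user(): returns 0 on success, nonzero on a fault. */
int demo_copy_out(void *dst, const void *src, size_t n)
{
    copy_calls++;
    if (simulate_fault)
        return -1;
    memcpy(dst, src, n);
    return 0;
}

/* One small copy per field; every return value has to be checked. */
int fill_per_field(struct demo_dirent *dst, uint64_t ino, uint64_t off,
                   const char *name)
{
    uint16_t reclen = sizeof(*dst);
    char buf[sizeof(dst->name)] = { 0 };

    strncpy(buf, name, sizeof(buf) - 1);
    if (demo_copy_out(&dst->ino, &ino, sizeof(ino)) ||
        demo_copy_out(&dst->off, &off, sizeof(off)) ||
        demo_copy_out(&dst->reclen, &reclen, sizeof(reclen)) ||
        demo_copy_out(dst->name, buf, sizeof(buf)))
        return -1;
    return 0;
}

/* Build the record locally, then issue a single checked copy. */
int fill_batched(struct demo_dirent *dst, uint64_t ino, uint64_t off,
                 const char *name)
{
    struct demo_dirent tmp;

    memset(&tmp, 0, sizeof(tmp));
    tmp.ino = ino;
    tmp.off = off;
    tmp.reclen = sizeof(tmp);
    strncpy(tmp.name, name, sizeof(tmp.name) - 1);
    return demo_copy_out(dst, &tmp, sizeof(tmp)) ? -1 : 0;
}
```

Both helpers produce the same record; the batched one pays the per-call overhead once instead of four times and has a single failure path to check.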
    • [PATCH] Fix interaction between batched lru addition and hot/cold · 3c7b8b3c
      Andrew Morton authored
      If a page is "freed" while in the deferred-lru-addition queue, the
      final reference to it is the deferred lru addition queue.  When that
      queue gets spilled onto the LRU, the page is actually freed.
      
      Which is all expected and natural and works fine - it's a weird case.
      
      But one of the AIM9 tests was taking a 20% performance hit (relative to
      2.4) because it was going into the page allocator for new pages while
      cache-hot pages were languishing out in the deferred-addition queue.
      
      So the patch changes things so that we spill the CPU's
      deferred-lru-addition queue before starting to free pages.  This way,
      the recently-used pages actually make it to the hot/cold lists and are
      available for new allocations.
      
      It gets back 15 of the lost 20%.  The other 5% is lost to the general
      additional complexity of all this stuff.  (But we're 250% faster than
      2.4 when running four instances of the test on 4-way).
    • [PATCH] truncate speedup · 2f83855c
      Andrew Morton authored
      This patch optimises the truncate of a zero-length file, which is a
      sufficiently common case to justify the extra test-n-branch.
      
      It does this by skipping the entire call into the fs if i_size is not
      being altered.
      
      The AIM9 `open_clo' test just loops, creating and unlinking a file.
      This patch speeds it up 50% for ext2, 600% for reiserfs.
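
The early-out described above can be sketched in a few lines. This is a simplified userspace model, not the patch itself: `demo_inode`, `demo_fs_truncate()` and `demo_truncate()` are hypothetical names, and the sketch ignores locking and timestamp updates.

```c
#include <stdint.h>

/* Hypothetical stand-in for an inode; only the fields the sketch needs. */
struct demo_inode {
    int64_t       i_size;
    unsigned long fs_truncate_calls;   /* counts calls into the "filesystem" */
};

/* Stand-in for the filesystem's (potentially expensive) truncate method. */
void demo_fs_truncate(struct demo_inode *inode)
{
    inode->fs_truncate_calls++;
}

/* Returns 0 on success, -1 for a negative size. */
int demo_truncate(struct demo_inode *inode, int64_t new_size)
{
    if (new_size < 0)
        return -1;
    if (inode->i_size == new_size)
        return 0;               /* size unchanged: skip the filesystem call */
    inode->i_size = new_size;
    demo_fs_truncate(inode);
    return 0;
}
```

Truncating an already-empty file to zero never reaches `demo_fs_truncate()`, which is the case the create/unlink loop hits on every iteration.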
    • [PATCH] suppress some buffer-layer warnings on write IO errors · 1ecd39c2
      Andrew Morton authored
      The buffer-stripping code gets upset when it sees a non-uptodate buffer
      against an uptodate page.  This happens because the write end_io
      handler clears BH_Uptodate.
      
      Add a buffer_req() test to suppress these warnings.
    • [PATCH] Fix PF_MEMDIE · 89de8cb6
      Andrew Morton authored
      Patch from Hugh Dickins and Robert Love.
      
      Fixes up the PF_MEMDIE handling so that it actually works.
      
      (PF_MEMDIE allows an oom-killed task to use the emergency memory
      reserves so that it can actually get out of the page allocator and
      die)
    • [PATCH] speed up signals · 51fef6b3
      Andrew Morton authored
      2.5's signal delivery is 20% slower than 2.4.  A signal send/handle
      cycle is performing a total of 24 copy_*_user() calls, and
      copy_*_user() got optimised for large copies.
      
      The patch reduces that to six copy_*_user() calls, and gets us up to
      about 5% slower than 2.4.  We'd have to go back to some additional
      inlined copy_user() code to get the last 3% back.  And HZ=100 to get
      the 2% back.
      
      It is noteworthy that the benchmark is not using float at all during
      the body of the test, yet the kernel is still doing all that floating
      point stuff.
    • [PATCH] memory barrier work in ipc/util.c · 622d2a68
      Andrew Morton authored
      Patch from Mingming Cao <cmm@us.ibm.com>
      
      - ipc_lock() needs a read_barrier_depends() to prevent indexing an
        uninitialized new array on the read side.  This corresponds to
        the write memory barrier added in grow_ary() by Dipankar's patch to
        prevent indexing an uninitialized array.
      
      - Replaced "wmb()" in IPC code with "smp_wmb()".  "wmb()" produces a
        full write memory barrier in both UP and SMP kernels, while
        "smp_wmb()" provides a full write memory barrier in an SMP kernel,
        but only a compiler directive in a UP kernel.  The same change is
        made for "rmb()".
      
      - Removed rmb() in ipc_get().  We do not need a read memory barrier
        there since ipc_get() is protected by ipc_ids.sem semaphore.
      
      - Added more comments about why write barriers and read barriers are
        needed (or not needed) here or there.
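
The write-side/read-side pairing described above can be approximated in portable C11. This is a userspace analogy, not the ipc/util.c code: `demo_table`, `demo_grow()` and `demo_lookup()` are hypothetical names, the release store plays roughly the role of smp_wmb() before publishing the new array, and the acquire load is a conservative stand-in for read_barrier_depends() (stronger than strictly needed). Writers are assumed to be serialized by an outer lock.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical growable ID table; size and slots travel together. */
struct demo_table {
    size_t size;
    int    slots[];
};

_Atomic(struct demo_table *) current_table;

/* Writer side: fully initialize the new table, then publish it. */
int demo_grow(size_t new_size)
{
    struct demo_table *old =
        atomic_load_explicit(&current_table, memory_order_relaxed);
    size_t old_size = old ? old->size : 0;
    struct demo_table *new_tab;
    size_t i;

    if (new_size <= old_size)
        return 0;
    new_tab = malloc(sizeof(*new_tab) + new_size * sizeof(new_tab->slots[0]));
    if (!new_tab)
        return -1;
    new_tab->size = new_size;
    for (i = 0; i < old_size; i++)
        new_tab->slots[i] = old->slots[i];
    for (; i < new_size; i++)
        new_tab->slots[i] = -1;          /* -1 marks an unused slot */

    /* Release: all stores above become visible before the pointer does. */
    atomic_store_explicit(&current_table, new_tab, memory_order_release);
    /* The old table is deliberately leaked; safe reclamation is out of scope. */
    return 0;
}

/* Reader side: load the pointer first, then index through it. */
int demo_lookup(size_t id, int *out)
{
    struct demo_table *t =
        atomic_load_explicit(&current_table, memory_order_acquire);

    if (!t || id >= t->size)
        return -1;
    *out = t->slots[id];
    return 0;
}
```

Without the ordering on both sides, a reader could observe the new pointer and still index slots that the writer has not yet initialized.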
    • [PATCH] Move unreleasable pages onto the active list · 1c0f3462
      Andrew Morton authored
      With some workloads a large number of pages coming off the LRU are
      pinned blockdev pagecache - things like ext2 group descriptors, pages
      which have buffers in the per-cpu buffer LRUs, etc.
      
      They keep churning around the inactive list, reducing the overall page
      reclaim effectiveness.
      
      So move these pages onto the active list.
    • [PATCH] Special-case fail_writepage() in page reclaim · 32b51ef2
      Andrew Morton authored
      Pages from memory-backed filesystems are supposed to be moved up onto
      the active list, but that's not working because fail_writepage() is
      called when the page is not on the LRU.
      
      So look for this case in page reclaim and handle it there.
      
      This is also more efficient, since the VM knows more about what is
      going on, and it later leads to the removal of fail_writepage().
    • [PATCH] Move reclaimable pages to the tail of the inactive list on · 3b0db538
      Andrew Morton authored
      The patch addresses some search complexity failures which occur when
      there is a large amount of dirty data on the inactive list.
      
      Normally we attempt to write out those pages and then move them to the
      head of the inactive list.  But this goes against page aging, and means
      that the page has to traverse the entire list again before it can be
      reclaimed.
      
      But the VM really wants to reclaim that page - it has reached the tail
      of the LRU.
      
      So what we do in this patch is to mark the page as needing reclamation,
      and then start I/O.  In the IO completion handler we check to see if
      the page is still probably reclaimable and if so, move it to the tail of
      the inactive list, where it can be reclaimed immediately.
      
      Under really heavy swap-intensive loads this increases the page reclaim
      efficiency (pages reclaimed/pages scanned) from 10% to 25%.  Which is
      OK for that sort of load.  Not great, but OK.
      
      This code path takes the LRU lock once per page.  I didn't bother
      playing games with batching up the locking work - it's a rare code
      path, and the machine has plenty of CPU to spare when this is
      happening.
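
The mark-then-rotate idea can be modelled with a small list in userspace. This is a sketch under stated assumptions, not the VM code: `demo_page`, `demo_lru` and the `demo_*` helpers are hypothetical, "still probably reclaimable" is reduced to a single `referenced` flag, and there is no locking.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page descriptor; only the fields the sketch needs. */
struct demo_page {
    struct demo_page *prev, *next;
    bool dirty;
    bool referenced;   /* touched again while under I/O */
    bool reclaim;      /* set when reclaim wants this page back soon */
};

/* Circular list with a sentinel: head.next is the head, head.prev the tail. */
struct demo_lru {
    struct demo_page head;
};

void demo_lru_init(struct demo_lru *lru)
{
    lru->head.prev = lru->head.next = &lru->head;
}

static void demo_unlink(struct demo_page *p)
{
    p->prev->next = p->next;
    p->next->prev = p->prev;
}

/* Insert p between two neighbouring entries. */
static void demo_link(struct demo_page *p, struct demo_page *before,
                      struct demo_page *after)
{
    p->prev = before;
    p->next = after;
    before->next = p;
    after->prev = p;
}

void demo_add_head(struct demo_lru *lru, struct demo_page *p)
{
    demo_link(p, &lru->head, lru->head.next);
}

void demo_add_tail(struct demo_lru *lru, struct demo_page *p)
{
    demo_link(p, lru->head.prev, &lru->head);
}

/* Reclaim scans from the tail; NULL means the list is empty. */
struct demo_page *demo_tail(struct demo_lru *lru)
{
    return lru->head.prev == &lru->head ? NULL : lru->head.prev;
}

/* Dirty page found at the tail: mark it, rotate to the head, start I/O. */
void demo_start_writeback(struct demo_lru *lru, struct demo_page *p)
{
    p->reclaim = true;
    demo_unlink(p);
    demo_add_head(lru, p);
}

/* I/O completion: a still-reclaimable page goes straight back to the tail. */
void demo_end_writeback(struct demo_lru *lru, struct demo_page *p)
{
    p->dirty = false;
    if (p->reclaim && !p->referenced) {
        demo_unlink(p);
        demo_add_tail(lru, p);
    }
    p->reclaim = false;
}
```

A page that was written back and not touched in the meantime is the next one the tail scan sees, instead of having to traverse the whole list again.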
    • [PATCH] Remove the final per-page throttling site in the VM · 3139a3ec
      Andrew Morton authored
      This removes the last remnant of the 2.4 way of throttling page
      allocators: the wait_on_page_writeback() against mapped-or-swapcache
      pages.
      
      I did this because:
      
      a) It's not used much.
      b) It's already causing big latencies.
      c) With Jens' large-queue stuff, it can cause huuuuuuuuge latencies.
         Like: ninety seconds.
      
      So kill it, and rely on blk_congestion_wait() to slow the allocator
      down to match the rate at which the IO system can retire writes.
    • [PATCH] add the `oldalloc' and `orlov' mount options to ext3 · f513f6c6
      Andrew Morton authored
      These are the mount options which turn off and on the Orlov allocator.
      ext2 supports them but Ted forgot to wire them up for ext3.
    • [PATCH] hugetlbpage.c build fix · b292e4b7
      Andrew Morton authored
      Patch from Arnd Bergmann <arnd@bergmann-dalldorf.de>
    • [PATCH] timer fixes · 2fc616d1
      Andrew Morton authored
      - revert accidental reversion of the timer initialisation in fbcon.
      
      - init a timer in drivers/net/pcmcia/fmvj18x_cs.c
        (OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>)
    • [PATCH] remove bad inodes from hash table · 22d3a67f
      Christoph Hellwig authored
      When testing the XFS 1.2 release we found a problem that was caused
      by inodes made unusable by make_bad_inode() still being returned by
      iget() and friends.  The workaround was to call remove_inode_hash()
      before each call to make_bad_inode().
      
      I think the proper fix is to let make_bad_inode() remove the inodes
      from the hash chains.
    • Merge samba.org:/scratch/anton/linux-2.5 · e13641c9
      Anton Blanchard authored
      into samba.org:/scratch/anton/linux-2.5_ppc64_work
    • Merge samba.org:/scratch/anton/oprofile · d3c837bd
      Anton Blanchard authored
      into samba.org:/scratch/anton/linux-2.5_ppc64_work
    • Anton Blanchard's avatar
      145b74a2
    • Merge samba.org:/scratch/anton/linux-2.5 · b20e0d0f
      Anton Blanchard authored
      into samba.org:/scratch/anton/linux-2.5_ppc64_work
    • ppc64: initial oprofile support · f9e58669
      Anton Blanchard authored
  2. 02 Dec, 2002 16 commits
    • [PATCH] kill probe_cmos_for_drives · 467578b9
      Andries E. Brouwer authored
    • [PATCH] PS/2 support for PARISC · c98d5e01
      Matthew Wilcox authored
      This converts the PA-RISC PS/2 keyboard & mouse driver to the input
      layer.  New driver written by Laurent Canet & Thibaut Varene.
    • [PATCH] vfat umas doc update · e80f40a9
      Jochen Hein authored
      Fix the documentation to match the code fix.
    • [PATCH] too few spaces in struct definition · c6538c30
      Petr Vandrovec authored
          "static structi2c_clientclient_template" works much better
      when spaces are added at appropriate places.
    • Merge master.kernel.org:/home/davem/BK/sparc-2.5 · 306cf7f1
      Linus Torvalds authored
      into home.transmeta.com:/home/torvalds/v2.5/linux
    • David S. Miller's avatar
    • David S. Miller's avatar
    • [PATCH] devicefs support for system timer · 2aceefe4
      Pavel Machek authored
      Without this, time runs 50x too slow after resume, since nothing
      knows to tell the timer to re-initialize.
    • [PATCH] fixes for oprofile on ppc64 · f5b162fe
      Anton Blanchard authored
      Here are a few fixes I needed when porting oprofile to ppc64:
      
       - __PAGE_OFFSET isn't defined for all architectures, use PAGE_OFFSET
         instead
       - include linux/cache.h everywhere we use ____cacheline_aligned etc.
         Otherwise we end up with a structure called ____cacheline_aligned
         and no alignment :(
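
The second failure mode is easy to reproduce outside the kernel. In the sketch below, all names are hypothetical, the 64-byte line size is an assumption, and the attribute syntax is specific to GCC and Clang. When the alignment macro is not defined, the token after the closing brace is parsed as a variable name.

```c
/*
 * Case 1: the macro is not defined (its header was never included).
 * The name after the closing brace is then an ordinary identifier, so
 * this declares a *variable* of the struct type and aligns nothing.
 */
struct demo_stats_plain {
    long hits;
} demo_cacheline_aligned_oops;

/* Case 2: the macro is defined before it is used. */
#define DEMO_CACHE_BYTES 64   /* assumed cache-line size */
#define demo_cacheline_aligned __attribute__((__aligned__(DEMO_CACHE_BYTES)))

struct demo_stats_aligned {
    long hits;
} demo_cacheline_aligned;
```

The first form compiles without any diagnostic, which is why the missing include goes unnoticed until someone checks the structure's alignment.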
    • [PATCH] getppid-2.5.50-A3 · d86f4ccd
      Ingo Molnar authored
      This changes sys_getppid() to be more POSIX-threading conformant.
      
      sys_getppid() needs to return the PID of the "process' parent" (ie.  the
      tgid of the parent thread), not the thread parent's PID.  The patch has
      no effect on non-CLONE_THREAD users, for them current->group_leader ==
      current.  The effect on CLONE_THREAD threads is that getppid() does not
      return any PID within the thread group anymore.  Plus if a threaded
      application starts up a (non-thread) child then the child sees the
      process PID of the parent process, not the thread PID of the parent
      thread.
      
      in theory we could introduce the getttid() variant to get to the TID of
      the parent thread, but i doubt it would be of any use.  (and we can add
      it if the need arises.)
      
      The lockless algorithm is still safe because the ->group_leader pointer
      never changes asynchronously.  (the ->real_parent pointer might still
      change asynchronously so the SMP checks are still needed.)
      
      I've also updated the comments (they referenced the nonexistent p_ooptr
      field.), plus i've changed the mb() to rmb() - we need to order the
      reads, we don't do any global writes that need some predictable ordering.
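
The difference between the two lookups can be shown with a toy task structure. This is a sketch based on the description above, not the kernel source: `demo_task` is hypothetical, the field names only mirror the ones mentioned in the text, and the lockless retry logic is omitted.

```c
/* Hypothetical task descriptor with only the fields the text mentions. */
struct demo_task {
    int pid;                         /* per-thread ID */
    int tgid;                        /* thread-group (process) ID */
    struct demo_task *group_leader;  /* first thread of the group */
    struct demo_task *real_parent;   /* task that created this one */
};

/* Old-style answer: the PID of whichever thread created the caller. */
int demo_getppid_old(const struct demo_task *t)
{
    return t->real_parent->pid;
}

/* New-style answer: the process ID of the parent of the whole group. */
int demo_getppid_new(const struct demo_task *t)
{
    return t->group_leader->real_parent->tgid;
}
```

For a thread created by its own group leader, the old lookup returns an ID inside the caller's thread group, while the new one returns the ID of the process that started the group. For a single-threaded task both give the same answer.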
    • Merge bk://linuxusb.bkbits.net/pci_hp-2.5 · 79cd7c1c
      Linus Torvalds authored
      into home.transmeta.com:/home/torvalds/v2.5/linux
    • Merge master.kernel.org:/home/hch/BK/xfs/linux-2.5 · ab78bec6
      Linus Torvalds authored
      into home.transmeta.com:/home/torvalds/v2.5/linux
    • [PATCH] Neaten up mm/Makefile · 55ff56e3
      Matthew Wilcox authored
      This removes the include of (the now empty) Rules.make, gets rid of the
      ifndef clause and fixes the indentation.
    • [PATCH] tcore-fixes-2.5.50-E6 · 8a061159
      Ingo Molnar authored
      This fixes threaded coredumps and streamlines the code.  The old code
      caused crashes and hung coredumps.  The new code has been tested for
      some time already and appears to be robust.  Changes:
      
       - the code now uses completions instead of a semaphore and a waitqueue,
         attached to mm_struct:
      
              /* coredumping support */
              int core_waiters;
              struct completion *core_startup_done, core_done;
      
       - extended the completion concept with a 'complete all' call - all pending
         threads are woken up in that case.
      
       - core_waiters is a plain integer now - it's always accessed from under
         the mmap_sem. It's also used as the fastpath-check in the sys_exit()
         path, instead of ->dumpable (which was incorrect).
      
       - got rid of the ->core_waiter task flag - it's not needed anymore.
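
The 'complete all' idea can be sketched with POSIX threads. This is a userspace analogy, not the kernel's completion code: `demo_completion` and its helpers are hypothetical, and bumping the counter by a large amount so that later waiters also pass is an assumption about one reasonable way to implement it.

```c
#include <limits.h>
#include <pthread.h>

/* Hypothetical completion object built on a mutex and a condition variable. */
struct demo_completion {
    pthread_mutex_t lock;
    pthread_cond_t  wait;
    unsigned int    done;   /* number of wakeups still available */
};

void demo_init_completion(struct demo_completion *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->wait, NULL);
    c->done = 0;
}

/* Release exactly one waiter (present or future). */
void demo_complete(struct demo_completion *c)
{
    pthread_mutex_lock(&c->lock);
    c->done++;
    pthread_cond_signal(&c->wait);
    pthread_mutex_unlock(&c->lock);
}

/* Release every pending waiter, and effectively all later ones too. */
void demo_complete_all(struct demo_completion *c)
{
    pthread_mutex_lock(&c->lock);
    c->done += UINT_MAX / 2;
    pthread_cond_broadcast(&c->wait);
    pthread_mutex_unlock(&c->lock);
}

/* Block until a wakeup is available, then consume it. */
void demo_wait_for_completion(struct demo_completion *c)
{
    pthread_mutex_lock(&c->lock);
    while (c->done == 0)
        pthread_cond_wait(&c->wait, &c->lock);
    c->done--;
    pthread_mutex_unlock(&c->lock);
}
```

With a plain `demo_complete()`, the dumping thread would have to know how many threads are waiting and signal each one; the broadcast variant removes that bookkeeping.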
    • [PATCH] CREDITS update · fe43697e
      Stelian Pop authored
      Update Stelian Pop's contact information in CREDITS and MAINTAINERS.
    • [PATCH] sonypi driver update · c9bf8f64
      Stelian Pop authored
      This corrects a small typo in the previous patch (in the ZOOM button
      definition) and adds events generated by the Memory Stick reader on VAIO
      U3 laptops (thanks to Kunihiko IMAI).
  3. 01 Dec, 2002 4 commits