1. 27 May, 2002 36 commits
    • Andrew Morton's avatar
      [PATCH] move nr_active and nr_inactive into per-CPU page · ce677ce2
      Andrew Morton authored
      It might reduce pagemap_lru_lock hold times a little, and is more
      consistent.  I think all global page accounting is now inside
      page_states[].
      ce677ce2
    • Andrew Morton's avatar
      [PATCH] factor common code in page_alloc.c · 9a0bd0e3
      Andrew Morton authored
      Factor out some similar code in page_alloc.c
      9a0bd0e3
    • Andrew Morton's avatar
      [PATCH] move BH_JBD out of buffer_head.h · 28ea30f7
      Andrew Morton authored
      For historical reasons, ext3 has a private BH state bit which has
      global scope.  This patch moves it inside ext3.
      28ea30f7
    • Andrew Morton's avatar
      [PATCH] fix ext3 __FUNCTION__ warnings · ca927221
      Andrew Morton authored
      Patch from Anton Blanchard which replaces
      
      	printk(KERN_FOO __FUNCTION__ ": msg");
      
      with
      	printk(KERN_FOO "%s: msg", __FUNCTION__);
      
      in ext3.
      ca927221
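A minimal userspace sketch of the pattern above, assuming `snprintf` in place of `printk` and a hypothetical `KERN_FOO` prefix string. It uses the standard `__func__`, which is a predefined identifier rather than a string literal, so it has to be passed as a `%s` argument instead of being pasted between literals.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's KERN_* log-level prefix strings. */
#define KERN_FOO "<7>"

/*
 * Format a message with the function name supplied as a "%s" argument
 * rather than concatenated with the surrounding string literals.
 * Returns the length snprintf() reports.
 */
static int format_msg(char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, KERN_FOO "%s: msg", __func__);
}
```

Calling `format_msg(buf, sizeof buf)` fills `buf` with the prefix, the name of the enclosing function, and the message text.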
    • Andrew Morton's avatar
      [PATCH] generic_file_write() cleanup · 124d8831
      Andrew Morton authored
      Fixes all the goto spaghetti in generic_file_write() and turns it into
      something which humans can understand.
      
      Andi tells me that gcc3 does a decent job of relocating blocks out of
      line anyway.  This patch gives the compiler a helping hand with
      appropriate use of likely() and unlikely().
      124d8831
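A small sketch of how `likely()` and `unlikely()` hint the compiler, assuming GCC/Clang-style definitions built on `__builtin_expect`. The helper `sum_bytes` is hypothetical and is not taken from `generic_file_write()`.

```c
#include <stddef.h>

/* Assumption: GCC/Clang-style definitions built on __builtin_expect. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical helper: sum n bytes, treating a NULL buffer as the cold path. */
static long sum_bytes(const unsigned char *buf, size_t n)
{
	long total = 0;
	size_t i;

	if (unlikely(buf == NULL))
		return -1;	/* rare error exit, a candidate for out-of-line placement */

	for (i = 0; i < n; i++)
		total += buf[i];
	return total;
}
```

The hints do not change what the function computes; they only tell the compiler which branch to lay out as the fall-through path.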
    • Andrew Morton's avatar
      [PATCH] remove mem_map_t · fd6dee02
      Andrew Morton authored
      Random cleanup: remove the mem_map_t typedef.  Just use 'struct page'
      everywhere.
      fd6dee02
    • Andrew Morton's avatar
      [PATCH] dirsync · bb772c58
      Andrew Morton authored
      An implementation of directory-synchronous mounts.
      
      I sent this out some months ago and it didn't generate a lot of
      interest.  Later we had one of the usual cheery exchanges with Wietse
      Venema (postfix development) and he agreed that directory synchronous
      mounts were something that he could use, and that there was benefit in
      implementing them in Linux.  If you choose to apply this I'll push the
      2.4 patch.
      
      
      
      Patch against e2fsprogs-1.26:
              http://www.zip.com.au/~akpm/linux/dirsync/e2fsprogs-1.26.patch
      
      Patch against util-linux-2.11n:
              http://www.zip.com.au/~akpm/linux/dirsync/util-linux-2.11n.patch
      
      
      The kernel patch includes implementations for ext2 and ext3. It's
      pretty simple.
      
      - When dirsync is in operation against a directory, the following operations
        are synchronous within that directory:  create, link, unlink, symlink,
        mkdir, rmdir, mknod, rename (synchronous if either the source or dest
        directory is dirsync).
      
      - dirsync is a subset of sync.  So `mount -o sync' or `chattr +S'
        give you everything which `mount -o dirsync' or `chattr +D' gives,
        plus synchronous file writes.
      
      - ext2's inode.i_attr_flags is unused, and is removed.
      
      - mount /dev/foo /mnt/bar -o dirsync  works as expected.
      
      - An ext2 or ext3 directory tree can be set dirsync with `chattr +D -R'.
      
      - dirsync is maintained as new directories are created under
        a `chattr +D' directory.  Like `chattr +S'.
      
      - Other filesystems can trivially be taught about dirsync.  It's just
        a matter of replacing `IS_SYNC(inode)' with `IS_DIRSYNC(inode)' in
        the directory update functions.  IS_SYNC will still be honoured when
        IS_DIRSYNC is used.
      
      - Non-directory files do not have their dirsync flag propagated.  So
        an S_ISREG file which is created inside a dirsync directory will not
        have its dirsync bit set.  chattr needs to do this as well.
      
      - There was a bit of version skew between e2fsprogs' idea of the
        inode flags and the kernel's.  That is sorted out here.
      
      - `lsattr' shows the dirsync flag as "D".  The letter "D" was
        previously being used for Compressed_Dirty_File.  I changed
        Compressed_Dirty_File to use "Z".  Is that OK?
      
      The mount(2) manpage needs to be taught about MS_DIRSYNC.
      bb772c58
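The relationship between `IS_SYNC` and `IS_DIRSYNC` described above can be sketched as a simplified model. The flag values and the `inode_model` struct are assumptions for illustration; the kernel's real macros also consult mount flags and use different definitions.

```c
/* Hypothetical flag bits; the kernel's real values and fields differ. */
#define S_SYNC    0x1u	/* all writes are synchronous */
#define S_DIRSYNC 0x2u	/* directory updates are synchronous */

struct inode_model {
	unsigned int i_flags;
};

#define IS_SYNC(inode)    (((inode)->i_flags & S_SYNC) != 0)
/* dirsync is a subset of sync, so a sync inode also counts as dirsync. */
#define IS_DIRSYNC(inode) (((inode)->i_flags & (S_SYNC | S_DIRSYNC)) != 0)

/* Hypothetical directory-update check: 1 if the update must be synchronous. */
static int dir_update_is_sync(const struct inode_model *dir)
{
	return IS_DIRSYNC(dir);
}

/* rename is synchronous if either the source or the dest directory is dirsync. */
static int rename_is_sync(const struct inode_model *src,
			  const struct inode_model *dst)
{
	return IS_DIRSYNC(src) || IS_DIRSYNC(dst);
}
```

In this model a directory update function only needs to test `IS_DIRSYNC`, and a filesystem mounted or flagged fully synchronous is still honoured because `S_SYNC` is included in the mask.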
    • Andrew Morton's avatar
      [PATCH] rename writeback_mapping to writepages · 7d608fac
      Andrew Morton authored
      Spot the difference:
      
      aops.readpage
      aops.readpages
      aops.writepage
      aops.writeback_mapping
      
      The patch renames `writeback_mapping' to `writepages'
      7d608fac
    • Andrew Morton's avatar
      [PATCH] enable direct-to-BIO readahead for ext3 · 1dd747c0
      Andrew Morton authored
      Turn on multipage no-buffers reads for ext3.
      1dd747c0
    • Andrew Morton's avatar
      [PATCH] direct-to-BIO writeback · ab9e8941
      Andrew Morton authored
      Multipage BIO writeout from the pagecache.
      
      It's pretty much the same as multipage reads.  It falls back to buffers
      if things get complex.
      
      The write case is a little more complex because it handles pages which
      have buffers and pages which do not.  If the page didn't have buffers
      this code does not add them.
      ab9e8941
    • Andrew Morton's avatar
      [PATCH] direct-to-BIO readahead · bc67de55
      Andrew Morton authored
      Implements BIO-based multipage reads into the pagecache, and turns this
      on for ext2.
      
      CPU load for `cat large_file > /dev/null' is reduced by approximately
      15%.  Similar reductions for tiobench with a single thread.  (Earlier
      claims of 25% were exaggerated - they were measured with slab debug
      enabled.  But 15% isn't bad for a load which is dominated by copy_*_user
      costs).
      
      With 2, 4 and 8 tiobench threads, throughput is increased as well, which was
      unexpected.  It's due to request queue weirdness.  (Generally the
      request queueing is doing bad things under certain workloads - that's a
      separate issue.)
      
      BIOs of up to 64 kbytes are assembled and submitted for readahead and
      for single-page reads.  So the work involved in reading 32 pages has gone
      from:
      
      	- allocate and attach 32 buffer_heads
      	- submit 32 buffer_heads
      	- allocate 32 bios
      	- submit 32 bios
      
      to:
      
      	- allocate 2 bios
      	- submit 2 bios
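The drop from 32 bios to 2 follows from the 64 kbyte BIO limit. A sketch of the arithmetic, assuming 4 KiB pages and that all 32 pages are contiguous on disk (the names below are hypothetical):

```c
/* Assumed sizes: 4 KiB pages and the 64 KiB BIO limit quoted above. */
#define PAGE_BYTES    4096u
#define BIO_MAX_BYTES (64u * 1024u)

/* Number of BIOs needed to cover nr_pages contiguous pages, rounding up. */
static unsigned int bios_needed(unsigned int nr_pages)
{
	unsigned int pages_per_bio = BIO_MAX_BYTES / PAGE_BYTES;

	return (nr_pages + pages_per_bio - 1) / pages_per_bio;
}
```

With these sizes each BIO carries 16 pages, so 32 pages need 2 BIOs.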
      
      These pages never have buffers attached.  Buffers will be attached
      later if the application writes to these pages (file overwrite).
      
      The first version of this code (in the "delayed allocation" patches)
      tries to handle everything - bios which start mid-page, bios which end
      mid-page and pages which are covered by multiple bios.  It is very
      complex code and in fact appears to be incorrect: out-of-order BIO
      completion could cause a page to come unlocked at the wrong time.
      
      This implementation is much simpler: if things get complex, it just
      falls back to the buffer-based block_read_full_page(), which isn't
      going away, and which understands all that complexity.  There's no
      point in doing this in two places.
      
      This code will bypass the buffer layer for
      
       - fully-mapped pages which are on-disk contiguous.
      
       - fully unmapped pages (holes)
      
       - partially unmapped pages, where the unmappedness is at the end of
         the page (end-of-file).
      
      and everything else falls back to buffers.
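The three bypass cases can be sketched as a classifier over the blocks of one page. This is a hedged model, not the code in fs/mpage.c: the encoding (block number 0 meaning "unmapped") and all names are assumptions, and it does not model the end-of-file check or pages that are already partially uptodate.

```c
#include <stddef.h>

enum read_path { READ_DIRECT_BIO, READ_VIA_BUFFERS };

/*
 * blocks[i] is the on-disk block number of the i-th block in the page,
 * or 0 for an unmapped block (hypothetical encoding for this sketch).
 */
static enum read_path classify_page(const unsigned long *blocks, size_t n)
{
	size_t i, mapped = 0;

	/* Walk the leading run of mapped blocks, requiring disk contiguity. */
	while (mapped < n && blocks[mapped] != 0) {
		if (mapped > 0 && blocks[mapped] != blocks[mapped - 1] + 1)
			return READ_VIA_BUFFERS;	/* discontiguous on disk */
		mapped++;
	}

	/* Everything after the first unmapped block must be unmapped too. */
	for (i = mapped; i < n; i++)
		if (blocks[i] != 0)
			return READ_VIA_BUFFERS;	/* hole followed by data */

	/* Fully mapped, fully unmapped, or unmapped only at the tail. */
	return READ_DIRECT_BIO;
}
```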
      
      This means that with blocksize == PAGE_CACHE_SIZE, 100% of pages are
      handed direct to BIO.  With a heavy 10-minute dbench run on 4k
      PAGE_CACHE_SIZE and 1k blocks, 95% of pages were handed direct to BIO.
      Almost all of the other 5% were passed to block_read_full_page()
      because they were already partially uptodate from an earlier sub-page
      write().  This ratio will fall if PAGE_CACHE_SIZE/blocksize is greater
      than four.  But if that's the case, CPU efficiency is far from the main
      concern - there are significant seek and bandwidth problems just at 4
      blocks per page.
      
      This code will stress out the block layer somewhat - RAID0 doesn't like
      multipage BIOs, and there are probably others.  RAID0 seems to struggle
      along - readahead fails but read falls back to single-page reads, which
      succeed.  Such problems may be worked around by setting MPAGE_BIO_MAX_SIZE
      to PAGE_CACHE_SIZE in fs/mpage.c.
      
      It is trivial to enable multipage reads for many other filesystems.  We
      can do that after completion of external testing of ext2.
      bc67de55
    • Andrew Morton's avatar
      [PATCH] relax nr_to_write requirements · 47279570
      Andrew Morton authored
      Relax the requirements on the writeback_mapping a_op.
      
      This function is passed the number of pages which it should write.  The
      current fs-writeback.c code will get confused if the address_space
      writes back more pages than it was asked to.
      
      With this change the address_space may write more pages than required
      if that is convenient.  Extent-based filesystems may wish to do this.
      47279570
    • Andrew Morton's avatar
      [PATCH] mark swapout pages PageWriteback() · 357f5a5e
      Andrew Morton authored
      Pages which are under writeout to swap are locked, and not
      PageWriteback().  So page allocators do not throttle against them in
      shrink_caches().
      
      This causes enormous list scans and general coma under really heavy
      swapout loads.
      
      One fix would be to teach shrink_cache() to wait on PG_locked for swap
      pages.  The other approach is to set both PG_locked and PG_writeback
      for swap pages so they can be handled in the same manner as file-backed
      pages in shrink_cache().
      
      This patch takes the latter approach.
      357f5a5e
    • Andrew Morton's avatar
      [PATCH] fix loop driver for large BIOs · bd052817
      Andrew Morton authored
      Fix bug in the loop driver.
      
      When presented with a multipage BIO, loop is overindexing the first
      page in the BIO rather than advancing to the second page.  It scribbles
      on the backing file and/or on kernel memory.
      
      This happens with multipage BIO-based pagecache I/O and presumably with
      O_DIRECT also.
      
      The fix is much-needed with the multipage-BIO patches - using that code
      on loop-backed filesystems has rather messy results.
      bd052817
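A userspace sketch of the kind of bug described above, assuming a simplified, hypothetical `struct seg` in place of a real BIO segment. The point is only that each segment has its own page, so a transfer must move on to the next segment's data pointer instead of indexing beyond the end of the first page.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one BIO segment (one page's worth). */
struct seg {
	unsigned char *data;	/* start of this segment's page */
	size_t len;		/* bytes used in this segment */
};

/*
 * Copy every segment into dst, advancing to each segment's own page
 * rather than indexing past the end of the first one.
 * Returns the number of bytes copied, at most dst_len.
 */
static size_t copy_segments(const struct seg *segs, size_t nsegs,
			    unsigned char *dst, size_t dst_len)
{
	size_t done = 0, i;

	for (i = 0; i < nsegs; i++) {
		size_t n = segs[i].len;

		if (n > dst_len - done)
			n = dst_len - done;	/* never write past dst */
		memcpy(dst + done, segs[i].data, n);	/* segs[i], not segs[0] */
		done += n;
	}
	return done;
}
```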
    • Andrew Morton's avatar
      [PATCH] ext3 set_page_dirty fix · 12feeeda
      Andrew Morton authored
      The set_page_dirty() in the ext3_writepage() failure path isn't right.
      set_page_dirty() will alter buffer states - it's a "whole page"
      dirtying.
      
      __set_page_dirty_buffers() is emitting warnings when it refuses to set
      dirty a non-uptodate buffer against a partially-mapped page.
      
      All we want to do in there is to move the page back onto
      mapping->dirty_pages, without altering the state of its buffers.
      12feeeda
    • Andrew Morton's avatar
      [PATCH] block_truncate_page fix · 9ff5178d
      Andrew Morton authored
      Fix bug in block_truncate_page().
      
      When buffers are attached to an uptodate page, they are marked as
      being uptodate, to preserve buffer/page state coherency.  Dirtiness
      is handled in the same way.
      
      But block_truncate_page() assumes that a buffer which is unmapped and
      uptodate is over a hole.  That's not the case, and the net effect is
      that block_truncate_page() is failing to zero the block outside the
      truncation point.
      
      This only happens if the page has a disk mapping but has no attached
      buffers on entry to block_truncate_page().  That's never the case in
      current kernels, so the problem does not exhibit (it _does_ exhibit
      with direct-to-BIO bypass-the-buffers I/O).
      
      There are actually three possible states of buffer mappedness:
      
      - Buffer has a disk mapping            (buffer_mapped(bh) == true)
      
      - buffer is over a hole	               (buffer_mapped(bh) == false)
      
      - don't know.  Need to run get_block() (buffer_mapped(bh) == false)
      
      This ambiguity could be resolved by adding another buffer state bit
      (BH_mapping_state_known?) but given that we already elide the get_block
      calls for the common case (buffer outside i_size) it is unlikely that
      the complexity is worthwhile.
      9ff5178d
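The ambiguity between the three states can be modelled as follows. The enum and helper names are hypothetical; the sketch only shows that a single "mapped" bit reports the same value for a hole and for a buffer whose mapping has not been looked up yet.

```c
enum mapping_state {
	MAPPING_ON_DISK,	/* buffer has a disk mapping */
	MAPPING_HOLE,		/* buffer is known to be over a hole */
	MAPPING_UNKNOWN		/* get_block() has not been run yet */
};

/* What a single "mapped" bit can report for each state. */
static int mapped_bit(enum mapping_state s)
{
	return s == MAPPING_ON_DISK;
}

/* A clear bit alone cannot tell a hole from an unknown mapping. */
static int needs_get_block(enum mapping_state s)
{
	return s == MAPPING_UNKNOWN;
}
```

Code that treats "unmapped and uptodate" as "over a hole" is making the assumption that `MAPPING_UNKNOWN` never occurs, which is the assumption the fix removes.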
    • Andrew Morton's avatar
      [PATCH] small fixes in buffer.c · 122d749c
      Andrew Morton authored
      - Fix the fix to the fix to the sector_t printing in buffer_io_error()
      
      - A few microoptimisations in buffer.c.  Replace:
      
      	set_buffer_foo(bh);
      
        with
      
      	if (!buffer_foo(bh))
      		set_buffer_foo(bh);
      
        when buffer_fooness is likely.  To avoid the buslocked rmw, and to
        avoid dirtying a cacheline.
      
      
      - export write_mapping_buffers() - filesystems which put buffers on
        mapping->private_list need this function for I/O scheduling reasons.
      122d749c
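The test-before-set micro-optimisation can be sketched in portable C11, assuming `<stdatomic.h>` in place of the kernel's bit operations. The `flags_model` struct and `FLAG_FOO` bit are hypothetical stand-ins for a buffer_head's state word.

```c
#include <stdatomic.h>

/* Hypothetical flag word standing in for a buffer_head's state bits. */
struct flags_model {
	atomic_ulong state;
};

#define FLAG_FOO 0x1ul

/* Plain load: no locked bus cycle, and the cacheline is not dirtied. */
static int test_foo(struct flags_model *f)
{
	return (atomic_load_explicit(&f->state, memory_order_relaxed)
		& FLAG_FOO) != 0;
}

/* Unconditional set: always an atomic read-modify-write. */
static void set_foo(struct flags_model *f)
{
	atomic_fetch_or(&f->state, FLAG_FOO);
}

/* Cheap read first; skip the atomic RMW when the bit is already set. */
static void set_foo_if_clear(struct flags_model *f)
{
	if (!test_foo(f))
		set_foo(f);
}
```

The trade-off is an extra branch, which pays off only when the bit is usually already set.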
    • Rusty Russell's avatar
      [PATCH] MAINTAINERS file addition: Al Viro · 49e59dea
      Rusty Russell authored
      I'm sick of searching my mail archives to find that email addr.
      49e59dea
    • Frank Davis's avatar
      [PATCH] net/ipv4/ipconfig.c minor fix · edc83a99
      Frank Davis authored
      Hello all,
          The following patch fixes two compile warnings 'defined but not used'.
      Since the label and int are only used for IPCONFIG_DYNAMIC, appropriate
      fixes were made to remove the warnings.
      edc83a99
    • Rusty Russell's avatar
      [PATCH] check_region cleanup from drivers/char/ip2main.c · 27b7e752
      Rusty Russell authored
      johnpol@2ka.mipt.ru: 40) request_region check, 31-40:
        You say, i'm frezy :)
      
        	Evgeniy Polyakov ( s0mbre )
      27b7e752
    • Rusty Russell's avatar
      [PATCH] exit path cleanup in drivers/cdrom/sonycd535.c · 5a99778c
      Rusty Russell authored
      johnpol@2ka.mipt.ru: 30) request_region check, 21-30:
        here is one more trivial check.
      
        	Evgeniy Polyakov ( s0mbre )
      5a99778c
    • Rusty Russell's avatar
      [PATCH] irq.h comment fix · 793c1151
      Rusty Russell authored
      Tim Schmielau <tim@physik3.uni-rostock.de>: trivial irq.h comment fix:
      
        Now THIS is a trivial patch: (though admittedly quite useless;-)
      
        include/linux/irq.h starts with
          #ifndef __irq_h
        but ends with a comment
          #endif /* __asm_h */
      
        Tim
      793c1151
    • Rusty Russell's avatar
      [PATCH] jiffies.h includes asm/param.h · a5e3ce10
      Rusty Russell authored
      Tim Schmielau <tim@physik3.uni-rostock.de>: provide HZ from jiffies.h:
        Most files that include <jiffies.h> also need HZ defined, which is
        quite reasonable. So don't require them to include <asm/param.h>
        themselves.
      a5e3ce10
    • Rusty Russell's avatar
      [PATCH] Alpha macro standardize · a7ecd054
      Rusty Russell authored
      Rusty Russell <rusty@rustcorp.com.au>: Trivial ALPHA patch to remove minmax macros:
        Change over to standard max and ALIGN macros.
      a7ecd054
    • Rusty Russell's avatar
      [PATCH] ppc chrp/start.c warnings removal · 174f05b2
      Rusty Russell authored
      Rusty Russell <rusty@rustcorp.com.au>: Finally squish those chrp_start.c warnings:
        They finally irritated me enough to patch.  2.5, should apply against 2.4.
      174f05b2
    • Rusty Russell's avatar
      [PATCH] do_mounts warning removal · 0140603e
      Rusty Russell authored
      Peter Chubb <peter@chubb.wattle.id.au>: Fix compilation warning in do_mounts.c:
      
        	change_floppy() is unused if you don't have the floppy device
        compiled into the kernel --- so why not #ifdef it out?
      0140603e
    • Rusty Russell's avatar
      [PATCH] ppc spinlock warning removal · a51a00ea
      Rusty Russell authored
      Rusty Russell <rusty@rustcorp.com.au>: 2.5.17 Warning removal for ppc:
        test_and_set_bit now expects an "unsigned long", so we want
        &spinlock->lock rather than &spinlock (even though they are
        equivalent).
      
        Rusty.
      a51a00ea
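Why the two pointers are equivalent can be shown with a hypothetical spinlock layout whose lock word is the first member. The types and the non-atomic `test_and_set_bit0` helper are assumptions for illustration, not the ppc definitions.

```c
#include <stddef.h>

/* Hypothetical spinlock layout: the lock word is the first member. */
typedef struct {
	volatile unsigned long lock;
} spinlock_model_t;

/* Returns 1 when sl and &sl->lock are the same address. */
static int lock_is_first_member(spinlock_model_t *sl)
{
	return (void *)sl == (void *)&sl->lock;
}

/* Hypothetical bit operation that takes an unsigned long pointer. */
static int test_and_set_bit0(volatile unsigned long *addr)
{
	unsigned long old = *addr;	/* not atomic; for illustration only */

	*addr = old | 1ul;
	return (int)(old & 1ul);
}
```

Passing `&sl->lock` gives the same address as `sl` but with the pointer type the function expects, so the compiler has nothing to warn about.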
    • Rusty Russell's avatar
      [PATCH] vmscan.c tidy up · b2c71135
      Rusty Russell authored
      (Included in 2.4)
      Pavel Machek <pavel@ucw.cz>: trivial: vmscan extra {}s:
        Hi!
      
        Extra { } look ugly, too, they are not consistent with rest of code and I introduced them :-(
      b2c71135
    • Rusty Russell's avatar
      [PATCH] CREDITS sort order · c6bc7b2f
      Rusty Russell authored
      (Included in 2.2)
      Pavel Machek <pavel@ucw.cz>: CREDITS not sorted properly:
        Hi!
      
        Please apply,
        									Pavel
      c6bc7b2f
    • Rusty Russell's avatar
      [PATCH] dcache.c spelling · 1e8b2524
      Rusty Russell authored
      Dan Aloni <da-x@gmx.net>: fs_dcache.c - typo:
      1e8b2524
    • Rusty Russell's avatar
      [PATCH] semctl SUSv2 compliance · 7cf3b4c6
      Rusty Russell authored
      Christopher Yeoh <cyeoh@samba.org>: (Made -p1 compliant by rusty) SUSv2 semctl compliance:
      
        The semctl call with SETVAL currently does not set sempid (at the
        moment sempid is only set during a successful semop call). An
        explanation from Geoff Clare of the Open Group regarding why sempid
        should be set during the semctl call:
      
        "The spec isn't very clear, but there is a statement on the semget()
        page which I think justifies the assumption made by the test.  It says
        that upon creation, the data structure associated with each semaphore
        in the set is not initialised, and that the semctl() function with
        SETVAL or SETALL can be used to initialise each semaphore.
      
        Therefore semctl() with SETVAL has to set sempid to *something*, and
        since sempid contains the "process ID of the last operation", setting
        it to anything other than the pid of the calling process would mean
        that sempid contained misleading information.  It could be argued that
        setting it to zero would not be misleading, but zero cannot be the
        process ID of a process, and so is not a valid value for sempid anyway."
      
        The following patch changes semctl so when called with SETVAL
        sempid is set to the pid of the calling process:
      7cf3b4c6
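A short userspace check of the behaviour described above, assuming Linux with System V semaphores available. It uses only `semget`, `semctl` and `getpid`; the function name is hypothetical, and `union semun` is defined locally because Linux leaves that definition to the caller.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <unistd.h>

/* On Linux the caller must define union semun itself. */
union semun {
	int val;
	struct semid_ds *buf;
	unsigned short *array;
};

/*
 * Create a private one-semaphore set, initialise it with SETVAL, and
 * report whether GETPID then returns our own pid.
 * Returns 1 if it does, 0 if it does not, -1 if a call failed.
 */
static int setval_records_pid(void)
{
	union semun arg;
	int semid, pid;

	semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
	if (semid < 0)
		return -1;

	arg.val = 1;
	if (semctl(semid, 0, SETVAL, arg) < 0) {
		semctl(semid, 0, IPC_RMID);
		return -1;
	}

	pid = semctl(semid, 0, GETPID);
	semctl(semid, 0, IPC_RMID);	/* always remove the set again */

	if (pid < 0)
		return -1;
	return pid == (int)getpid();
}
```

On a kernel with this change applied the function is expected to return 1; before it, `sempid` would only have been set by a successful `semop` call.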
    • Rusty Russell's avatar
      [PATCH] xconfig fix · fbce8464
      Rusty Russell authored
      Alexander.Riesen@synopsys.com: xconfig for tulip subsection:
        fixes broken xconfig for tulip drivers.
      
        P.S. Why the double quotes in comment break it?
      fbce8464
    • Rusty Russell's avatar
      [PATCH] autofs_wqt_t for ppc64 · 856e7226
      Rusty Russell authored
      Anton Blanchard <anton@samba.org>: Fix autofs on ppc64:
      
        Define autofs_wqt_t to be an int on ppc64, just like the other mixed
        32/64 bit archs do.
      856e7226
    • Martin Dalecki's avatar
      [PATCH] 2.5.18 IDE 71 · 4478f040
      Martin Dalecki authored
       - Rewritten Artop host chip driver by Vojtech Pavlik. His log entries are:
      
         Cleanup whitespace.
      
         Remove superfluous chip entries in chip table.  Remove global variables to
         allow more than one controller.  Remove other forgotten stuff.
      
         This is a new driver for the Artop (Acard) controllers. It's completely
         untested, as I have never seen the hardware. However, I suspect it is much
         less broken than the previous one ...
      
         UDMA33 controller cannot detect 80-wire cable.
      
       - Separate ioctl handling out from ide.c. It's big enough.
      
       - Move atapi_read and atapi_write to the new atapi module.  Fix the declaration
         of those functions. The data buffer did have the void * type!
      
       - Separate module handling code out from actual transfer handling code into a
         new module called main.c. Slowly we are at the stage where the code indeed
         has to be organized logically and not just "sporadically" as was the case
         before.
      
       - Apply patch by Adam Richter for the ide-scsi.c attach method implementation.
         This particular driver is still broken due to generic SCSI layer issues.
      
       - Apply true modularization patch for qd65xx.c by Samuel Thibault. Here
         are his notes about it:
      
         Then, patch-modularize-2.[45] is a proposal for modularizing qd65xx.o. As a
         single module, one can choose to insmod it before being able to do some
         hdparm -p /dev/hd[a-d]. But one can't remove it while tuned, since selectproc
         may be needed.
      
         I am sorry I wasn't able to test it under 2.5 series, lacking a functioning
         kernel for my test computer, but it seemed to work perfectly under 2.4
         series, and patches are almost the same.
      
       - Move PCI device id's to where they belong. Patch by Vojtech Pavlik.
      
       - Don't use BH_Lock in ide-tape.c - somehow this driver scares me sometimes.
      4478f040
    • Jens Axboe's avatar
      [PATCH] block documentation updates · cd556cb5
      Jens Axboe authored
      o Add 'tag' to request.txt doc
      o Add bio design etc discussions
      cd556cb5
    • Linus Torvalds's avatar
  2. 26 May, 2002 2 commits
  3. 25 May, 2002 2 commits