1. 18 Jun, 2004 40 commits
    • [PATCH] iso9660: fix handling of inodes beyond 4GB · 9210c204
      Paul Serice authored
      This is my fourth attempt to patch the isofs code.  It is similar to the last
      posting except this one implements the NFS get_parent() method which has
      always been missing.
      
      The original problem I set out to address is that the current iso9660 file
      system cannot reach inodes located beyond the 4GB barrier.  This is caused by
      using the inode number as the byte offset of the inode data.  Being 32-bits
      wide, the inode number is unable to reach inode data that does not reside on
      the first 4GB of the file system.
      
      This causes real problems with "growisofs"
      
            http://fy.chalmers.se/~appro/linux/DVD+RW/#isofs4gb
      
      and my pet project "shunt"
      
            http://www.serice.net/shunt/
      
      This patch switches the isofs code from iget() to iget5_locked() which allows
      extra data to be passed into isofs_read_inode() so that inode data anywhere on
      the disk can be reached.
      
      The inode number scheme was also changed.  Continuing to use the byte offset
      would have resulted in non-unique inodes in many common situations, but
      because the inode number no longer plays any role in reading the meta-data off
      the disk, I was free to set the inode number to some unique characteristic of
      the file.  I have chosen to use the block offset which is also 32-bits wide.
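
      The arithmetic behind the 4GB barrier can be illustrated with a small
      sketch.  The function names and the 2048-byte block size (a shift of 11)
      are assumptions for illustration and are not taken from the isofs code,
      which may derive the number differently.

```c
#include <assert.h>
#include <stdint.h>

/* Old scheme as described above: the inode number is the byte offset of
 * the inode data, which is silently truncated to 32 bits. */
static uint32_t ino_from_byte_offset(uint64_t byte_offset)
{
    return (uint32_t)byte_offset;
}

/* Simplified new scheme: the inode number is the block offset.
 * block_bits is log2 of the block size (11 assumes 2048-byte blocks). */
static uint32_t ino_from_block_offset(uint64_t byte_offset, unsigned block_bits)
{
    assert(block_bits < 32);
    return (uint32_t)(byte_offset >> block_bits);
}
```

      With these helpers, a byte offset of 4GB + 2048 and a byte offset of 2048
      truncate to the same 32-bit value, while their block offsets stay
      distinct.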
      
      Lastly, the pre-patch code uses the default export_operations to handle
      accessing the file system through NFS.  The problem with this is that the
      default NFS operations assume that iget() works which is no longer the case
      because of the necessity of switching to iget5_locked().  So, I had to
      implement the NFS operations too.  As a bonus, I went ahead and implemented
      the NFS get_parent() method which has always been missing.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] BSD accounting format rework · f38928f4
      Tim Schmielau authored
      BSD accounting format rework:
      
      Use all explicit and implicit padding in struct acct to
      
       - correctly report 32 bit uid/gid,
       - correctly report jobs (e.g., daemons) running longer than 497 days,
       - increase the precision of ac_etime from 2^-13 to 2^-20
         (i.e., from ~6 hours to ~1 min. after a year)
       - store the current AHZ value.
       - allow cross-platform processing of the accounting file
         (limited for m68k which has a different size struct acct).
       - introduce versioning for smooth transition to incompatible formats in
         the future. Currently the following version numbers are defined:
           0: old format (until 2.6.7) with 16 bit uid/gid
           1: extended variant (binary compatible to v0 on M68K)
           2: extended variant (binary compatible to v0 on everything except M68K)
           3: a new binary incompatible format (64 bytes)
           4: new binary incompatible format (128 bytes).
              layout of its first 64 bytes is the same as for v3.
           5: marks second half of new binary incompatible format (128 bytes)
              (layout is not yet defined)
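
      The 497-day figure in the list above matches the point at which an
      unsigned 32-bit tick counter wraps at 100 ticks per second.  A minimal
      sketch of that arithmetic follows; the function name is hypothetical and
      the link to the accounting fields is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Days until an unsigned 32-bit tick counter wraps at the given tick rate. */
static double days_until_u32_wrap(unsigned ticks_per_second)
{
    assert(ticks_per_second > 0);
    const double ticks = (double)UINT32_MAX + 1.0;  /* 2^32 ticks */
    return ticks / ticks_per_second / 86400.0;      /* 86400 seconds per day */
}
```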
      
      All this is accomplished without breaking binary compatibility.  32 bit
      uid/gid support is compatible with the patch previously floating around and
      used e.g.  by Red Hat.
      
      This patch also introduces a config option for a new, binary incompatible
      "version 3" format that
      
       - is uniform across and properly aligned on all platforms
       - stores pid and ppid
       - uses AHZ==100 on all platforms (allows longer times to be reported)
      
      Much of the compatibility glue goes away when v1/v2 support is removed from
      the kernel.  Such a patch is at
      
        http://www.physik3.uni-rostock.de/tim/kernel/2.7/acct-cleanup-04.patch
      
      and might be applied in the 2.7 timeframe.
      
      The new v3 format is source compatible with current GNU acct tools (6.3.5).
      However, current GNU acct tools can be compiled for only one format.  As there
      is no way to pass the kernel configuration to userspace, with my patch it will
      still only support the old v2 format.  Recompiling GNU acct tools will yield
      v3 support only once v1/v2 support is removed from the kernel.
      
      A preliminary take at the corresponding work on cross-platform userspace tools
      (GNU acct package) is at
      
        http://www.physik3.uni-rostock.de/tim/kernel/utils/acct/
      
      This version of the package is able to read any of the v0/v2/v3 formats,
      regardless of byte-order (untested), even within the same file.
      Cross-platform compatibility with m68k (v1 format) is not yet implemented, but
      native use on m68k should work (untested).  pid and ppid are currently only
      shown by the dump-acct utility.
      
      Thanks to Arthur Corliss, Albert Cahalan and Ragnar Kjørstad for their
      comments, and to Albert Cahalan for the u64->IEEE float conversion code.
      Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] make the 3c59x/3c90x driver somewhat more reliable · e1cb4984
      Alan Cox authored
      The existing driver violates basic PCI rules in several places making it
      unusable for basic things like DHCP in Fedora Core.  This patch removes all
      the situations I can find where it writes to the device while in D3 state
      and breaks stuff.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] fix 3c59x.c to allow 3c905c 100bT-FD · 02bf06ec
      Burton N. Windle authored
      Fix the 3c905C 10/100 transceiver initialisation woes.
      
      (This was reverted from 2.6.7-rcX, but the bug reporter said the failure
      turned out to be unrepeatable).
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Validate PM-Timer rate at boot time · 6d58b128
      Joris van Rantwijk authored
      Add a check to the PM-Timer initialization code.  It validates the PM-Timer
      rate against PIT channel 2 and rejects the PM-Timer if its rate is not
      within 5% of the expected number.
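
      The 5% tolerance check can be sketched in integer arithmetic as follows.
      This is only an illustration: the function name is hypothetical, the
      measurement against PIT channel 2 is not shown, and whether the boundary
      itself is accepted is an assumption.  3579545 Hz is the nominal ACPI PM
      timer frequency.

```c
#include <assert.h>
#include <stdint.h>

#define PMTMR_EXPECTED_HZ 3579545u  /* nominal ACPI PM timer frequency */

/* Returns 1 if measured_hz is within 5% of expected_hz, otherwise 0. */
static int rate_within_5_percent(uint64_t measured_hz, uint64_t expected_hz)
{
    assert(expected_hz > 0);
    uint64_t diff = measured_hz > expected_hz ? measured_hz - expected_hz
                                              : expected_hz - measured_hz;
    /* diff / expected <= 5 / 100, rearranged to avoid floating point */
    return diff * 100 <= expected_hz * 5;
}
```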
      
      Rationale:
      
      The PMTMR timers of certain (older) mainboards are running at invalid
      rates, often much faster than the rate expected by the PM-Timer code.  This
      causes the system clock to run much too fast.  See also
      http://bugme.osdl.org/show_bug.cgi?id=2375
      
      Possible workarounds are disabling the PM-Timer in the kernel config or
      disabling the PM-Timer at boot time through the "clock=tsc" parameter.
      However, we believe it is more user friendly to automatically validate the
      PM-Timer rate at boot time before using it as the system time source.
      
      Tested by me (with broken timer) and John Stultz (with good timer) and
      believed to be ok.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] ahc1542 !CONFIG_MCA build fix · 261efef3
      Matthew Wilcox authored
      The old 1542 scsi driver is both ISA and MCA.  The MCA portions are disabled
      when !CONFIG_MCA through the typical wrapper scheme (a la pci.h and
      !CONFIG_PCI).  However...  the driver unconditionally includes linux/mca.h
      which in turn unconditionally includes asm/mca.h.
      
      This breaks drivers on platforms with ISA but not MCA, like alpha.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] x86: remove io_apic_sync · 2765df29
      Ingo Molnar authored
      The patch below gets rid of io_apic_sync().
      
      io_apic_sync() was introduced in 2.1.104 and it was originally done for
      masking and unmasking as well.  Later the unmasking use got removed but the
      masking use lingered around.  I don't think it was ever justified to do it
      and clearly since the lack of io_apic_sync() didn't break some of the other
      writes we do to the IO-APIC registers, it must be unnecessary in the
      masking case too.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] x86: remove APIC_LOCKUP_DEBUG · ecab0503
      Ingo Molnar authored
      The patch below gets rid of APIC_LOCKUP_DEBUG.  It has been in the kernel
      for more than 3 years and the message was only reported once during that
      period of time - and even in that case it was a side-effect of a really bad
      crash.  The lockup workaround works, the debugging code can be moved out.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] io_apic.c code consolidation · 6be35b97
      Pavel Machek authored
      This cleans up io_apic.c a bit -- I do not really like 4 copies of same
      code.
      
      Ingo said:
      
         yeah, agreed - i checked & test it, it's ok.  I made a small
         modification (see the patch below) to uninline the __modify_IO_APIC_irq()
         function - shaving 0.5K off the kernel's size.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Fix read() vs truncate race · 4bd9607e
      Nick Piggin authored
      do_generic_mapping_read()
      {
      	isize1 = i_size_read();
      	...
      	readpage
      	copy_to_user up to isize1;
      }
      
      readpage()
      {
      	isize2 = i_size_read();
      	...
      	read blocks
      	...
      	zero-fill all blocks past isize2
      }
      
      If a second thread runs truncate and shrinks i_size, so isize1 and isize2 are
      different, the read can return up to a page of zero-fill that shouldn't really
      exist.
      
      The trick is to read isize1 after doing the readpage.  I realised this is the
      right way to do it without having to change the readpage API.
      
      The patch should not cost any cycles when reading from pagecache.
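
      The effect of sampling the size after readpage can be sketched as a pure
      function.  This is a userspace illustration with hypothetical names, not
      the code in do_generic_mapping_read(); it only shows how the copy length
      is clamped to the size observed once the page contents are final.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BYTES 4096u  /* assumed page size for the illustration */

/* Number of bytes that may be copied to the user from file position pos,
 * given the file size sampled AFTER readpage has completed. */
static size_t bytes_to_copy(uint64_t isize_after_readpage, uint64_t pos,
                            size_t want)
{
    assert(want <= PAGE_BYTES);  /* at most one page per call */
    if (pos >= isize_after_readpage)
        return 0;                /* file now ends at or before pos */
    uint64_t avail = isize_after_readpage - pos;
    return avail < want ? (size_t)avail : want;
}
```

      If a concurrent truncate shrinks the file to 100 bytes while the page is
      being read, a request for a full page at offset 0 copies only 100 bytes,
      so none of the zero-fill past the new size is returned.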
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] invalidate_inodes2(): mark pages not uptodate · 3cf8b87b
      Andrew Morton authored
      Andrea Arcangeli <andrea@suse.de> points out that invalidate_inode_pages2() is
      supposed to mark mapped-into-pagetable pages as not uptodate so that next time
      someone faults the page in we will go get a new version from backing store.
      
      The callers are the direct-io code and the NFS "something changed on the
      server" code.  In both these cases we do need to go and re-read the page.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Check return status of register calls in i82365 · 8d6d3943
      Herbert Xu authored
      i82365 calls driver_register and platform_device_register without checking
      their return values.  This patch fixes that.
      
      It also runs platform_device_register() prior to isa_probe() so we don't have
      to undo isa_probe()'s effects if platform_device_register() ends up failing.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] getgroups16() fix · 8783a1ce
      Tomas Olsson authored
      sys_getgroups16 (or rather groups16_to_user()) returns large gids
      truncated.  Needs to be fixed, one way or another.  Don't know why the
      other similar casts are still there.
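
      The truncation can be reproduced in a few lines.  The sketch below
      contrasts a plain cast with one possible remedy, mapping any gid that
      does not fit in 16 bits to a placeholder value.  The helper names and the
      placeholder 65534 are assumptions; the message above does not say which
      fix the patch takes.

```c
#include <assert.h>
#include <stdint.h>

#define OVERFLOW_GID 65534u  /* assumed placeholder for unrepresentable gids */

/* What a bare cast does: the upper 16 bits are silently dropped. */
static uint16_t gid_truncate(uint32_t gid)
{
    return (uint16_t)gid;
}

/* One possible remedy: report a placeholder instead of a wrong gid. */
static uint16_t gid_to_16bit(uint32_t gid)
{
    assert(OVERFLOW_GID <= 0xFFFFu);  /* the placeholder itself must fit */
    return gid > 0xFFFFu ? (uint16_t)OVERFLOW_GID : (uint16_t)gid;
}
```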
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] same small resource tweaks, x86_64 version · 406e1707
      Rene Herman authored
      The same small tweaks for x86_64.  Just to keep the two in sync.  One
      additional wrinkle: vram_resource was exported to e820.c, which didn't
      actually use it.  Undo that.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] small tweaks to standard resource stuff · bc3e9bd2
      Rene Herman authored
      Various small tweaks. Compiled and booted.
      
      1. add IORESOURCE_BUSY | IORESOURCE_MEM also for the kernel code and
           data resources. I don't believe this actually matters one bit, but
           they're hooked into a BUSY/MEM parent ("System RAM") and marking
           them busy seems to make sense.
      
      2. delete the .start = 1M default for the kernel code resource. This
           isn't actually a change; it's set to virt_to_phys(_text) in
           setup_arch() overriding any default anyways.
      
      3. s/vram_resource/video_ram_resource/. Lines up much nicer with
           video_rom_resource...
      
      4. s/checksum/romchecksum/. setup.c is a fairly large file, and
           "checksum" pollutes the namespace.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Use first-fit for pty allocation · 3931ca0a
      H. Peter Anvin authored
      (With Andrew Morton).
      
      The current dynamic pty allocation scheme has a few problems:
      
      - pty numbers grow to be very large, causing wtmp file bloat.
      
      - Seems to break libc5 and some old applications
      
      So change it to do first-fit.  An IDR tree is used to provide a
      logarithmic-time search.
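
      First-fit behaviour can be shown with a much simpler structure than an
      IDR tree.  The sketch below uses a linear scan over a small array, so it
      illustrates only the allocation policy (the lowest free number is always
      reused), not the logarithmic-time search.  All names and the table size
      are hypothetical.

```c
#include <assert.h>

#define MAX_PTYS 64  /* hypothetical table size */

struct pty_map {
    unsigned char used[MAX_PTYS];
};

/* Returns the lowest free index and marks it used, or -1 if none is free. */
static int pty_alloc(struct pty_map *m)
{
    for (int i = 0; i < MAX_PTYS; i++) {
        if (!m->used[i]) {
            m->used[i] = 1;
            return i;
        }
    }
    return -1;
}

/* Releases an index so that a later allocation can reuse it. */
static void pty_free(struct pty_map *m, int idx)
{
    assert(idx >= 0 && idx < MAX_PTYS && m->used[idx]);
    m->used[idx] = 0;
}
```

      Because a freed number is handed out again before any higher one, pty
      numbers stay small as long as few ptys are open at the same time.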
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Ext3: Retry allocation after transaction commit (v2) · 5c4ad014
      Theodore Y. Ts'o authored
      Here is a reworked version of my patch to ext3 to retry certain filesystem
      operations after an ENOSPC error.  The ext3_should_retry_alloc() function will
      not wait on the currently running transaction if there is a currently active
      handle; hence this should avoid deadlocks in the Lustre use case.  The patch
      is versus BK-recent.
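
      The general shape of retrying an operation after ENOSPC can be sketched
      in userspace as follows.  Everything here is a hypothetical stand-in: the
      callbacks, the retry bound and the toy filesystem are invented for
      illustration, and the sketch leaves out the active-handle check that
      ext3_should_retry_alloc() performs.

```c
#include <assert.h>
#include <errno.h>

#define MAX_RETRIES 3  /* hypothetical bound on retries */

/* Hypothetical callbacks supplied by the caller. */
struct retry_ops {
    int (*try_op)(void *ctx);            /* returns 0 or a negative errno */
    void (*wait_for_commit)(void *ctx);  /* waits for pending frees to commit */
};

/* Runs try_op, waiting for a commit and retrying while it reports -ENOSPC. */
static int run_with_enospc_retry(const struct retry_ops *ops, void *ctx)
{
    int err, retries = 0;

    assert(ops && ops->try_op && ops->wait_for_commit);
    for (;;) {
        err = ops->try_op(ctx);
        if (err != -ENOSPC || retries++ >= MAX_RETRIES)
            return err;
        ops->wait_for_commit(ctx);
    }
}

/* Toy filesystem: freed blocks become usable only after a commit. */
struct toy_fs {
    int free_blocks;
    int pending_free;
};

static int toy_alloc(void *ctx)
{
    struct toy_fs *fs = ctx;

    if (fs->free_blocks == 0)
        return -ENOSPC;
    fs->free_blocks--;
    return 0;
}

static void toy_commit(void *ctx)
{
    struct toy_fs *fs = ctx;

    fs->free_blocks += fs->pending_free;
    fs->pending_free = 0;
}
```

      In the toy model the first attempt fails, the commit releases the pending
      block, and the second attempt succeeds.  A filesystem that is genuinely
      full still returns -ENOSPC once the retry bound is reached.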
      
      I've also included a simple, reliable test case which demonstrates the problem
      this patch is intended to fix.  (Note that BK-recent is not sufficient to
      address this test case, and waiting on the committing transaction in
      ext3_new_block is also not sufficient.  Been there, tried that, didn't work.
      We need to do the full-bore retry from the top level.  The
      ext3_should_retry_alloc() will only wait on the committing transaction if
      there is an active handle; hence Lustre will probably also need to use
      ext3_should_retry_alloc() if it wants to reliably avoid this particular
      problem.)
      
      #!/bin/sh
      #
      #
      TEST_DIR=/tmp
      IMAGE=$TEST_DIR/retry.img
      MNTPT=$TEST_DIR/retry.mnt
      TEST_SRC=/usr/projects/e2fsprogs/e2fsprogs/build
      MKE2FS_OPTS=""
      IMAGE_SIZE=8192
      
      umount $MNTPT
      dd if=/dev/zero of=$IMAGE bs=4k count=$IMAGE_SIZE
      mke2fs -j -F $MKE2FS_OPTS $IMAGE 
      
      function test_log ()
      {
      	echo $*
      	logger -p local4.notice $*
      }
      
      mkdir -p $MNTPT
      mount -o loop -t ext3 $IMAGE $MNTPT
      test_log Retry test: BEGIN
      for i in `seq 1 3`
      do
      	test_log "Retry test: Loop $i"
      	echo 2 > /proc/sys/fs/jbd-debug
      	while ! mkdir -p $MNTPT/foo/bar
      	do
      		test_log "Retry test: mkdir failed"
      		sleep 1
      	done
      	echo 0 > /proc/sys/fs/jbd-debug
      	cp -r $TEST_SRC $MNTPT/foo/bar 2> /dev/null
      	rm -rf $MNTPT/*
      done
      umount $MNTPT
      test_log "Retry test: END"
      
      
      akpm@osdl.org
      
        Rework the code to make it a formal JBD API entry point.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] pc9800: merge std_resources.c back into setup.c · 248af7e2
      Rene Herman authored
      std_resources.{c,h} was only split off due to pc9800 wanting to override it.
      With it gone, it might as well be merged back in.  Doesn't change any code.
      It was compiled and booted.
      
      This time this also actually doesn't break compilation of any of the
      subarches.  That's to say, any further.  I guess it might have been my .config
      (my regular PC config, with just the subarch switched through menuconfig) or
      O=, but only ELAN actually compiled.  Voyager and VISWS bombed out at the
      final link and NUMAQ much sooner (with "physnode_map undeclared" during
      compilation of numaq.c).
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] more PC9800 removal · 808ef260
      Adrian Bunk authored
      Removes more PC9800 code.
      
      Requires:
      
        bk rm drivers/char/upd4990a.c
        bk rm drivers/net/ne2k_cbus.c
        bk rm drivers/net/ne2k_cbus.h
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Remove PC9800 support · 5e018f7e
      Randy Dunlap authored
      PC9800 sub-arch is incomplete, hackish (at least in IDE), maintainers don't
      reply to emails and haven't touched it in a while.  Can't even config it to
      try to build it without other patches to the kernel tree.
      
      bk-rm-script:
      
      #! /bin/sh
      bk rm -r ./arch/i386/mach-pc9800
      bk rm -r ./arch/i386/boot98
      bk rm ./drivers/char/lp_old98.c
      bk rm ./drivers/serial/serial98.c
      bk rm ./drivers/scsi/scsi_pc98.c
      bk rm ./drivers/scsi/pc980155.c
      bk rm ./drivers/scsi/pc980155.h
      bk rm ./drivers/block/floppy98.c
      bk rm ./drivers/input/keyboard/98kbd.c
      bk rm ./drivers/input/serio/98kbd-io.c
      bk rm ./drivers/input/misc/98spkr.c
      bk rm ./drivers/input/mouse/98busmouse.c
      bk rm ./drivers/ide/legacy/pc9800.c
      bk rm ./drivers/ide/legacy/hd98.c
      bk rm -r ./include/asm-i386/mach-pc9800
      bk rm ./include/asm-i386/pc9800_sca.h
      bk rm ./include/asm-i386/pc9800.h
      bk rm ./fs/partitions/nec98.c
      bk rm ./fs/partitions/nec98.h
      bk rm ./sound/isa/cs423x/pc98.c
      bk rm ./sound/isa/cs423x/pc9801_118_magic.h
      bk rm ./sound/isa/cs423x/sound_pc9800.h
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] HPET driver · b429f3b3
      Robert Picco authored
      The driver supports the High Precision Event Timer.  The driver has adopted
      a similar API to the Real Time Clock driver.  It can support any number of
      HPET devices and the maximum number of timers per HPET device.  For further
      information look at the documentation in the patch.
      
      Thanks to Venki at Intel for testing the driver on X86 hardware with HPET.
      
      HPET documentation is available at http://www.intel.com/design/chipsets/datashts/252516.htm
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] RLIM: adjust default mqueue sizes · 02fb4124
      Chris Wright authored
      Lower default sizes for POSIX mqueue allocation now that rlimits are in place.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] RLIM: enforce rlimits for POSIX mqueue allocation · ae17b2b3
      Chris Wright authored
      Add a user_struct to the mq_inode_info structure.  Charge the maximum number
      of bytes that could be allocated to a mqueue to the user who creates the
      mqueue.  This is checked against the per user rlimit.
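
      Charging a queue's worst-case size against a per-user limit might look
      roughly like the sketch below.  The struct, the function name and the
      simplified size formula (maxmsg * msgsize, ignoring any per-message
      overhead) are assumptions for illustration, not the kernel's accounting
      code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-user accounting record. */
struct user_acct {
    uint64_t mq_bytes;  /* bytes currently charged to this user */
};

/* Charges the worst-case size of a new queue to the user.
 * Returns 0 on success, or -1 if the size overflows or exceeds the limit. */
static int mq_charge(struct user_acct *u, uint64_t maxmsg, uint64_t msgsize,
                     uint64_t limit)
{
    assert(u);
    if (msgsize != 0 && maxmsg > UINT64_MAX / msgsize)
        return -1;                   /* maxmsg * msgsize would overflow */
    uint64_t bytes = maxmsg * msgsize;
    if (bytes > limit || u->mq_bytes > limit - bytes)
        return -1;                   /* would exceed the per-user limit */
    u->mq_bytes += bytes;
    return 0;
}
```

      Checking for overflow before multiplying matters here, because a wrapped
      product could otherwise slip under the limit.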
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] RLIM: add mq_attr_ok() helper · b1cae1ec
      Chris Wright authored
      Add helper function mq_attr_ok() to do mq_attr sanity checking, and do some
      extra overflow checking.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] RLIM: add mq_bytes to user_struct · 9d9f6e8b
      Chris Wright authored
      Add mq_bytes field to user_struct, and make sure it's properly initialized.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] RLIM: add rlimit entry for POSIX mqueue allocation · faaa0feb
      Chris Wright authored
      Add an rlimit entry to control the maximum number of bytes a user can allocate
      to a POSIX mqueue.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] RLIM: add simple get_uid() helper · db49b0f9
      Chris Wright authored
      Add simple helper function to grab a reference to a user_struct.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] RLIM: add sigpending field to user_struct · 84f4d297
      Chris Wright authored
      Add sigpending field to user_struct, and make sure it's properly initialized.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] RLIM: add rlimit entry for controlling queued signals · 63e9e5dc
      Chris Wright authored
      The following patches introduce per-user rlimits for both queued signals and
      POSIX message queues.  The changes touch all the arches' resource.h files as
      well as init_task.c to get the rlimit defaults set up.
      
      Both require caching the user_struct to avoid problems with setuid().
      
      The signal change makes some small changes to send_signal() to pass along the
      task being signalled to get proper accounting for signals initiated in
      interrupt.  Thanks to Marcelo for getting this one going.
      
      
      This patch:
      
      Add an rlimit entry to control the maximum number of pending signals a user
      may have.  This is essentially just the resource.h changes.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] i2c fixups for idr API change · 8e3ca9ba
      Andrew Morton authored
      Fix up the i2c code which uses the IDR library.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] IDR fixups · 935b33bc
      Corey Minyard authored
      There were definitely some problems in there.  I've made some changes and
      tested with a lot of bounds.  I don't have a machine with enough memory to
      fill it up (it would take ~16GB on a 64-bit machine), but I use the "above"
      code to simulate a lot of situations.
      
      The problems were:
      
          * IDR_FULL was not the right value
          * idr_get_new_above() was not defined in the headers or documented.
          * idr_alloc() bug-ed if there was a race and not enough memory was
            allocated.  It should have returned NULL.
          * id will overflow when you go past the end.
          * There was a "(id >= (1 << (layers*IDR_BITS)))" comparison, but at
            the top layer it would overflow the id and be zero.
          * The allocation should return ENOSPC for an "above" value with
            nothing above it, but it returned EAGAIN.
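
      The overflow in the "(id >= (1 << (layers*IDR_BITS)))" comparison can be
      avoided by checking the shift width first.  The sketch below is one way
      to write such a guard, assuming IDR_BITS is 5 and ids are non-negative
      ints; the function name is hypothetical and this is not claimed to be the
      fix used in the patch.

```c
#include <assert.h>
#include <limits.h>

#define IDR_BITS 5  /* bits consumed per layer (assumed value) */

/* Returns 1 if id needs more layers than the tree has, otherwise 0.
 * Guards the shift so that it never reaches the width of int. */
static int id_beyond_layers(int id, int layers)
{
    assert(id >= 0 && layers > 0);
    int shift = layers * IDR_BITS;
    if (shift >= (int)(sizeof(int) * CHAR_BIT) - 1)
        return 0;  /* tree already covers every non-negative int */
    return id >= (1 << shift);
}
```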
      
      I have not tested on 64-bits (as I don't have a 64-bit machine).
      
      I've included the files, a diff from the previous version, and my test
      programs.
      
      For the test programs, idr_test <size> will just attempt to allocate 
      <size> elements, check them, free them, and check them again.
      
      idr_test2 <size> <incr> will allocate <size> elements with <incr> between
      them.
      
      idr_test3 just tests some bounds and tries all values with just a few in
      the idr.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] idr: remove counter bits from id's · 5470e17c
      Andrew Morton authored
      idr_get_new() currently returns an incrementing counter in the top 8 bits of
      the counter.  Which means that most users have to mask it off again, and we
      only have a 24-bit range.
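
      The pre-patch layout described above, with a counter in the top 8 bits
      and 24 bits of usable id, can be modelled as follows.  The macro and
      function names are hypothetical; the sketch only shows why callers had to
      mask the returned value.

```c
#include <assert.h>
#include <stdint.h>

#define ID_BITS 24u                      /* usable id bits before the patch */
#define ID_MASK ((1u << ID_BITS) - 1u)   /* 0x00FFFFFF */

/* Combines a counter and an index the way the old scheme is described. */
static uint32_t id_pack(uint32_t counter, uint32_t index)
{
    assert(counter <= 0xFFu && index <= ID_MASK);
    return (counter << ID_BITS) | index;
}

/* What most users had to do: mask the counter off again. */
static uint32_t id_index(uint32_t raw)
{
    return raw & ID_MASK;
}

/* The incrementing counter stored in the top 8 bits. */
static uint32_t id_counter(uint32_t raw)
{
    return raw >> ID_BITS;
}
```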
      
      So remove that counter.  Also:
      
      - Remove the BITS_PER_INT define due to namespace collision risk.
      
      - Make MAX_ID_SHIFT 31, so counters have a 0 to 2G-1 range.
      
      - Why is MAX_ID_SHIFT using sizeof(int) and not sizeof(long)?  If it's for
        consistency across 32- and 64-bit machines, why not just make it "31"?
      
      - Does this still hold true with the counter removed?
      
      /* We can only use half the bits in the top level because there are
         only four possible bits in the top level (5 bits * 4 levels = 25
         bits, but you only use 24 bits in the id). */
      
        If not, what needs to change?
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Fixes for idr code · 90e518e1
      Corey Minyard authored
      * On a 32-bit architecture, the idr code will cease to work if you add
        more than 2^20 entries.  You will not be able to find many of the
        entries.  The problem is that the IDR code uses 5-bit chunks of the
        number and the lower portion used by IDR is 24 bits, so you have one bit
        that leaks over into the comparisons that should not be there.  The
        solution is to mask off that bit before doing IDR processing.  This
        actually causes the POSIX timer code to crash if you create that many
        timers.  I have included an idr_test.tar.gz file that demonstrates this
        with and without the fix, in case you need more evidence :).
      
      * When the IDR fills up, it returns -1.  However, there was no way to
        check for this condition.  This patch adds the ability to check for the
        idr being full and fixes all the users.  It also fixes a problem in
        fs/super.c where the idr code wasn't checking for -1.
      
      * There was a race condition creating POSIX timers.  The timer was added
        to a task struct for another process then the data for the timer was
        filled out.  The other task could use/destroy the timer as soon as it is
        in the task's queue and the lock is released.  This moves setting up the
        timer data to before the timer is enqueued or (for some data) into the
        lock.
      
      * Change things so that the caller doesn't need to run idr_full() to find
        out the reason for an idr_get_new() failure.
      
        Just return -ENOSPC if the tree was full, or -EAGAIN if the caller needs
        to re-run idr_pre_get() and try again.
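
      The resulting caller pattern is a loop that preloads, tries to allocate,
      and repeats only on -EAGAIN.  The sketch below models it with toy
      stand-ins (toy_pre_get(), toy_get_new() and struct toy_idr are invented
      for illustration and are not the kernel functions), so it shows the
      control flow rather than the real API.

```c
#include <assert.h>
#include <errno.h>

/* Toy model of an id allocator that needs a preload before each allocation. */
struct toy_idr {
    int preloaded;  /* 1 if memory has been set aside for the next id */
    int next_id;
    int capacity;
};

/* Stand-in for the preload step; returns 0 only when out of memory. */
static int toy_pre_get(struct toy_idr *idr)
{
    idr->preloaded = 1;
    return 1;
}

/* Stand-in for the allocation step with the error codes described above. */
static int toy_get_new(struct toy_idr *idr, int *id)
{
    if (idr->next_id >= idr->capacity)
        return -ENOSPC;   /* tree is full: retrying cannot help */
    if (!idr->preloaded)
        return -EAGAIN;   /* caller must preload and try again */
    idr->preloaded = 0;
    *id = idr->next_id++;
    return 0;
}

/* Caller loop: the return code alone says whether to retry or give up. */
static int toy_alloc_id(struct toy_idr *idr, int *id)
{
    int ret;

    assert(idr && id);
    do {
        if (!toy_pre_get(idr))
            return -ENOMEM;
        ret = toy_get_new(idr, id);
    } while (ret == -EAGAIN);
    return ret;
}
```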
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] reiserfs data logging support · f1372916
      Chris Mason authored
      Add data=journal support for reiserfs
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] reiserfs: btree readahead · 2167f071
      Chris Mason authored
      Walking the btree can trigger a number of single block synchronous reads.
      This patch does btree readahead during operations that are likely to be long
      and sequential.  So far, that only includes directory reads and truncates, but
      it can make both much faster.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] reiserfs: remove debugging warning from block allocator · 36f9f7fc
      Chris Mason authored
      Remove debugging warning from the reiserfs block allocator code
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] reiserfs: block allocator should not inherit "packing locality 1" · 930c07f9
      Chris Mason authored
      reiserfsck --rebuild-tree expects the only key with a packing locality of 1 to
      be for the root directory (key [1 2]).  The new block allocator inherited that
      packing locality down to subdirectories, which triggers failures in reiserfsck
      --rebuild-tree
      
      reiserfsck in readonly check mode doesn't complain about this.  Thanks to Jeff
      Mahoney for finding it.
      
      The fix is to never inherit packing locality #1
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] reiserfs: block allocator optimizations · 734db689
      Chris Mason authored
      From: <mason@suse.com>
      From: <jeffm@suse.com>
      
      The current reiserfs allocator pretty much allocates things sequentially
      from the start of the disk.  It works very nicely for desktop loads, but
      once you've got more than one proc doing io, data files can fragment badly.
      
      One obvious solution is something like ext2's bitmap groups, which puts
      file data into different areas of the disk based on which subdirectory
      they are in.  The problem with bitmap groups is that if you've got a
      group of subdirectories their contents will be spread out all over the
      disk, leading to lots of seeks during a sequential read.
      
      This allocator patch uses the packing locality to determine which bitmap
      group to allocate from, but when you create a file it looks in the bitmaps
      to see how 'full' that packing locality already is.  If it hasn't been
      heavily used yet, the packing locality is inherited from the parent
      directory, putting files in new subdirs close to the parent subdir;
      otherwise it is the inode number of the parent directory, putting new
      files far away from the parent subdir.
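
      The decision described above can be sketched roughly as follows.  This
      is an illustration only, not the code from the patch: the function
      name, its parameters and the 50% "heavily used" threshold are all
      assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical threshold: a locality counts as "heavily used" once more
 * than this percentage of its bitmap group is allocated.  This value is
 * an assumption for the sketch, not one taken from the kernel. */
#define BUSY_PERCENT 50u

/*
 * Choose the packing locality for a newly created object.
 *
 * parent_locality: packing locality of the parent directory
 * parent_objectid: inode number of the parent directory
 * used_blocks, total_blocks: fullness of the bitmap group that
 *     parent_locality maps to (assumed to be supplied by the caller)
 */
uint32_t choose_packing_locality(uint32_t parent_locality,
                                 uint32_t parent_objectid,
                                 uint32_t used_blocks,
                                 uint32_t total_blocks)
{
    /* Lightly used: inherit, so new subdirs stay close to the parent. */
    if (total_blocks != 0 &&
        (uint64_t)used_blocks * 100u <= (uint64_t)total_blocks * BUSY_PERCENT)
        return parent_locality;

    /* Heavily used: start a new locality keyed by the parent's inode
     * number, which places new files away from the parent. */
    return parent_objectid;
}
```

      Inheriting while the group is lightly used is what keeps the number of
      distinct packing localities small for a given working set.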
      
      The end result is fewer bitmap groups for the same working set.  For
      example, one test data set created by 20 procs running in parallel has
      6822 subdirs.  And with vanilla reiserfs that would mean 6822
      packing localities.  This patch turns that into 26 packing localities.
      
      This makes sequential reads of big directory trees more efficient, but
      it also makes the btree more efficient in general.  Things end up sorted
      better because groups of subdirs end up with similar keys in the btree,
      instead of being spread out all over.
      
      The bitmap grouping code tries to use the start of each bitmap group
      for metadata, and offsets the data slightly.  The data and metadata
      are still close together, but not completely intermixed like they are
      in the default allocator.  The end result is that leaf nodes tend to be
      close to each other, making metadata readahead more effective.
      
      The old block allocator had the ability to enforce a minimum
      allocation size, but did not use it.  It now tries to do a pass looking
      for larger allocation chunks before falling back to the old behaviour
      of taking any blocks it can find.
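
      A minimal sketch of that two-pass search, assuming a simplified
      in-memory bitmap with one byte per block.  The names find_free_run
      and pick_blocks are hypothetical, and the real allocator works on
      on-disk bitmap blocks rather than a flat array.

```c
#include <stddef.h>

/* Return the start of the first run of at least `want` free blocks in
 * map[0..n), where 0 means free and nonzero means used.
 * Returns -1 if no such run exists or if want is 0. */
long find_free_run(const unsigned char *map, size_t n, size_t want)
{
    size_t run = 0;

    if (want == 0)
        return -1;
    for (size_t i = 0; i < n; i++) {
        run = map[i] ? 0 : run + 1;
        if (run == want)
            return (long)(i + 1 - want);
    }
    return -1;
}

/* Pass 1 looks for a chunk of min_chunk blocks; pass 2 falls back to the
 * old behaviour of taking any single free block.  *got receives the
 * number of blocks found (0 when the map is completely full). */
long pick_blocks(const unsigned char *map, size_t n,
                 size_t min_chunk, size_t *got)
{
    long start = find_free_run(map, n, min_chunk);

    if (start >= 0) {
        *got = min_chunk;
        return start;
    }
    start = find_free_run(map, n, 1);
    *got = (start >= 0) ? 1 : 0;
    return start;
}
```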
      
      The patch changes the defaults to:
      
      mount -o alloc=skip_busy:dirid_groups:packing_groups
      
      You can get back the old behaviour with mount -o alloc=skip_busy
      
      mount -o alloc=dirid_groups will turn on the bitmap groups
      mount -o alloc=packing_groups turns on the packing locality reduction code
      mount -o alloc=skip_busy:dirid_groups turns on both dirid_groups and
      skip_busy
      
      Finally the patch adds a mount -o alloc=oid_groups, which puts files into
      bitmap groups based on a hash of their objectid.  This would be used for
      databases or other situations where you have a limited number of very
      large files.
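
      As a rough illustration of the oid_groups idea, the sketch below maps
      an objectid to a bitmap group through a hash.  The hash function
      (a multiplicative hash) and every name here are assumptions for the
      example; the patch may use a different hash and layout.

```c
#include <stdint.h>

/* Stand-in integer hash (multiplicative).  Hypothetical: the hash used
 * by the patch is not specified in this message. */
uint32_t oid_hash(uint32_t objectid)
{
    return objectid * 2654435761u;
}

/* Map an objectid to one of ngroups bitmap groups. */
uint32_t oid_to_group(uint32_t objectid, uint32_t ngroups)
{
    if (ngroups == 0)
        return 0;
    return oid_hash(objectid) % ngroups;
}

/* First block of a group, assuming every group covers
 * blocks_per_group consecutive blocks. */
uint64_t group_start_block(uint32_t group, uint32_t blocks_per_group)
{
    return (uint64_t)group * blocks_per_group;
}
```

      Because the group depends only on the objectid, a handful of very
      large files spread across the disk regardless of which directory
      they live in.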
      
      This command will tell you how many packing localities are actually in
      use:
      
      debugreiserfs -d /dev/xxx | grep '^|.*SD' | sed 's/^.....//' | awk '{print $1}' | sort -u | wc -l
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      734db689
    • Andrew Morton's avatar
      [PATCH] ppc64: uninline __pte_free_tlb() · fab177a4
      Andrew Morton authored
      The pgalloc.h changes broke ppc64:
      
      In file included from include/asm-generic/tlb.h:18,
                       from include/asm/tlb.h:24,
                       from arch/ppc64/mm/hash_utils.c:48:
      include/asm/pgalloc.h: In function `__pte_free_tlb':
      include/asm/pgalloc.h:110: dereferencing pointer to incomplete type
      include/asm/pgalloc.h:111: dereferencing pointer to incomplete type
      
      Uninlining __pte_free_tlb() fixes that.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      fab177a4
    • Russell King's avatar
      [PATCH] Clean up asm/pgalloc.h include 3 · d01034ea
      Russell King authored
      This patch cleans up needless includes of asm/pgalloc.h from the arch/i386/
      subtree.  Compile tested on x86_pc SMP.
      
      [I also tried VISWS + SMP.  Without PM, it doesn't build in smpboot.c,
       though I don't believe it's caused by this patch.  With PM, it fails
       to link, complaining that maxcpus is undefined.  Therefore, I presume
       VISWS + SMP is an invalid configuration.]
      
      This patch is part of a larger patch aiming towards getting the include of
      asm/pgtable.h out of linux/mm.h, so that asm/pgtable.h can sanely get at
      things like mm_struct and friends.
      
      I suggest testing in -mm for a while to ensure there aren't any hidden arch
      issues.
      
      The outstanding list of files for other architectures can be found
      at http://www.arm.linux.org.uk/misc/pgalloc.txt
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
      d01034ea