1. 03 Apr, 2003 13 commits
    • [PATCH] struct stat - support larger dev_t · e95b2065
      Andrew Morton authored
      From: Andries.Brouwer@cwi.nl
      
      Below a patch that changes struct stat for a number of
      architectures. Maintainers, please watch carefully.
      
      Struct stat is used to transfer information from kernel
      to user space on a stat() system call.
      It has fields st_dev, st_rdev.
      
      The size of these fields is in principle unrelated to
      the size of a dev_t in user space or the size of a
      dev_t or kdev_t in kernel space.
      
      It is just the "capacity" of the channel.
      The actual amount of useful information is the minimum
      of the four sizes (kernel dev_t, kernel kdev_t,
      user dev_t, width of stat st_dev, st_rdev fields).
      
      The goal of this patch is to make sure that the stat() and stat64()
      system calls transmit at least 32 and 64 bits, respectively.
      This is achieved by using the padding that was present already.
      We fail when no padding was present, or when the padding is on
      the wrong side (after the field, while the machine is big-endian).
      
      alpha:	stat: uses unsigned int, 32 bits
      arm:	stat: uses unsigned short - bad.
      	The padding is on one side, which means that this can
      	be made into unsigned long only on little endian systems.
      	FIXED - unless __ARMEB__.
      	stat64: used unsigned short - FIXED, now unsigned long long.
      cris:	stat: used unsigned short - FIXED, now unsigned long
      	stat64: used unsigned short - FIXED, now unsigned long long.
      i386:	stat: used unsigned short - FIXED, now unsigned long
      	stat64: used unsigned short - FIXED, now unsigned long long.
      ia64:	stat: uses unsigned long, 64 bits
      m68k:	stat: used unsigned short - bad, but this cannot be fixed
      	since m68k is big-endian, and the available padding is on
      	the wrong side. NOT FIXED.
      	stat64: used unsigned short - FIXED, now unsigned long long.
      mips:	stat: uses dev_t which is unsigned int, 32 bits
      	stat64: used unsigned long, 32 bits. NOT FIXED.
      	(There is padding on one side, so this can be fixed if __MIPSEL__.)
      mips64:	stat: uses dev_t which is unsigned int, 32 bits
      parisc:	stat: uses dev_t, 32 bits
      	stat64: uses unsigned long long, 64 bits
      ppc:	stat: uses dev_t which is unsigned int, 32 bits
      	stat64: unsigned long long, 64 bits
      ppc64:	stat: uses dev_t which is unsigned long, 64 bits
      	stat64: uses unsigned long, 64 bits
      sparc:	stat: uses unsigned short, no padding. NOT FIXED.
      	stat64: used unsigned short - FIXED, now unsigned long long.
      sparc64:stat: uses dev_t which is unsigned int, 32 bits
      	stat64: used unsigned short - FIXED, now unsigned long long.
      s390:	stat: used unsigned short, big-endian, padding on the wrong side,
      	NOT FIXED.
      	stat64: used unsigned short - FIXED, now unsigned long long.
      s390x:	stat: uses unsigned long, 64 bits
      sh:	stat: used unsigned short, but padding may be on the wrong side.
      	NOT FIXED.
      	stat64: used unsigned short - FIXED, now unsigned long long.
      v850:	stat: used __kernel_dev_t.
      	BUG: NEVER use __kernel types in a user space interface.
      	Replaced the types. FIXED - now unsigned int - 32 bits.
      	stat64: FIXED - now unsigned long long - 64 bits.
      x86_64:	stat: uses unsigned long, 64 bits
      
      So, on most architectures we achieve the aim of 32 bits for stat,
      64 bits for stat64. On all architectures we achieve at least
      16 bits for stat, 32 bits for stat64.
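
      The padding argument above can be illustrated in user space. This is a
      minimal sketch, not code from the patch: the two struct names and the
      helper functions are hypothetical, and the layouts are simplified to just
      the leading field. It shows why absorbing padding that follows a 16-bit
      field only keeps old readers working on little-endian machines.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical old layout: 16-bit device number, then 16 bits of padding. */
struct old_stat_head {
    uint16_t st_dev;
    uint16_t __pad1;
};

/* Hypothetical new layout: the padding is absorbed into a 32-bit field. */
struct new_stat_head {
    uint32_t st_dev;
};

/* Widening must not change the size or the offsets of later fields. */
static_assert(sizeof(struct old_stat_head) == sizeof(struct new_stat_head),
              "widening must not change the structure size");

/* Returns 1 on a little-endian machine, 0 otherwise. */
static int is_little_endian(void)
{
    uint16_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 1;
}

/*
 * Fill the structure through the new layout, then read it back through
 * the old layout, the way an old binary would.
 */
static uint16_t old_reader_sees(uint32_t dev)
{
    struct new_stat_head n = { dev };
    struct old_stat_head o;

    memcpy(&o, &n, sizeof o);
    return o.st_dev;
}
```

      On a little-endian machine old_reader_sees(0x0305) returns 0x0305,
      because the low-order bytes stay where the old field was. On a big-endian
      machine the old field overlaps the high-order bytes, so the old reader
      sees 0.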
    • [PATCH] tmpfs 6/6: percentile sizing of tmpfs · 65aaef27
      Andrew Morton authored
      From: CaT <cat@zip.com.au>
      
      This patch allows you to specify the maximum amount of memory tmpfs can
      use as a percentage of available real RAM.  This (in my eyes) is useful
      because you do not have to remember to change the setting if you want
      something other than 50% and some of your RAM goes away.
      
      Hugh redid the arithmetic to not overflow at 4GB; the particular order of
      lines helps RH's gcc-2.96-110 not to get confused in the do_div.  2.5 can use
      totalram_pages.  Update mount options in tmpfs Doc.
      
      There's an argument that the percentage should be of ram+swap, which is
      what Christoph originally intended.  But we set the default at 50% of ram
      only, so I believe it's more consistent to follow that precedent.
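
      The overflow concern can be sketched as follows. This is an illustrative
      user-space version, not the code from the patch: the function name and
      SKETCH_PAGE_SHIFT are assumptions, and plain 64-bit division stands in
      for the kernel's do_div.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/*
 * Convert "percent of RAM" into a byte count. The page count is widened
 * to 64 bits before any shift or multiply, so the intermediate values do
 * not wrap at 4GB on a machine with a 32-bit unsigned long.
 */
static uint64_t tmpfs_size_from_percent(unsigned long totalram_pages,
                                        unsigned int percent)
{
    uint64_t size = (uint64_t)totalram_pages;

    size <<= SKETCH_PAGE_SHIFT; /* pages to bytes */
    size *= percent;
    size /= 100;
    return size;
}
```

      With 2097152 pages of 4 KiB (8GB of RAM) and 50 percent, the result is
      4294967296 bytes, a value that would not fit in 32 bits.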
    • [PATCH] tmpfs 5/6: use cond_resched · 548ac1de
      Andrew Morton authored
      From: Hugh Dickins <hugh@veritas.com>
      
      cond_resched each time around the loop in shmem_file_write
      and do_shmem_file_read, matching filemap.c.
    • [PATCH] tmpfs 4/6: use mark_page_accessed · 5d86cc8b
      Andrew Morton authored
      From: Hugh Dickins <hugh@veritas.com>
      
      tmpfs pages should be surfing the LRUs in the company of their filemap
      friends: I was expecting the rules to change, but they've been stable so
      long, let's sprinkle mark_page_accessed in the equivalent places here; but
      (don't ask me why) SetPageReferenced in shmem_file_write.  Ooh, and
      shmem_populate was missing a flush_page_to_ram.
    • [PATCH] tmpfs 3/6: use generic_file_llseek · f56453c9
      Andrew Morton authored
      From: Hugh Dickins <hugh@veritas.com>
      
      default_llseek's use of BKL and not i_sem was recently exposed:
      tmpfs should be using generic_file_llseek which guards with i_sem.
    • [PATCH] tmpfs 2/6 remove shmem_readpage · 2927b748
      Andrew Morton authored
      From: Hugh Dickins <hugh@veritas.com>
      
      shmem_readpage was created to give tmpfs sendfile and loop ability; but
      they're both using shmem_file_sendfile now, so remove shmem_readpage.
    • [PATCH] tmpfs 1/6 use generic_write_checks · acad2c18
      Andrew Morton authored
      From: Hugh Dickins <hugh@veritas.com>
      
      Remove the duplicated checks in shmem_file_write(), use
      generic_write_checks() instead.
    • [PATCH] file limit checking simplification · d80bbda5
      Andrew Morton authored
      From: Hugh Dickins <hugh@veritas.com>
      
      When handling rlimit != RLIM_INFINITY, generic_write_checks tests file
      position against 0xFFFFFFFFULL, and casts it to a u32.  This code is
      carried forward from 2.4.4, and the 2.4-ac tree contains an apparently
      obvious fix to one part of it (should set count to 0 not to a negative).
      But when you think it through, it all turns out to be bogus.
      
      On a 32-bit architecture: limit is a 32-bit unsigned long, we've
      already handled *pos < 0 and *pos >= limit, so *pos here has no way
      of being > 0xFFFFFFFFULL, and thus casting it to u32 won't truncate it.
      And on a 64-bit architecture: limit is a 64-bit unsigned long, but this
      code is disallowing file position beyond the 32 bits; or if there's some
      userspace compatibility issue, with limit having to fit into 32 bits,
      the 32-bit architecture argument applies and they're still irrelevant.
      
      So just remove the 0xFFFFFFFFULL test; and in place of the u32, cast to
      typeof(limit) so it's right even if rlimits get wider.  And there's no
      way we'd want to send SIGXFSZ below the limit: remove send_sig comment.
      
      There's a similarly suspicious u32 cast a little further down, when
      checking MAX_NON_LFS.  Given its definition, that does no harm on any
      arch: but it's better changed to unsigned long, the type of MAX_NON_LFS.
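
      The simplified check can be sketched in user space. This is a minimal
      illustration of the reasoning above, not the kernel function: the helper
      name and its return convention are hypothetical, and the cast to
      unsigned long plays the role of the cast to typeof(limit).

```c
#include <errno.h>

/*
 * Hypothetical helper: returns how many bytes may be written at `pos`
 * under a finite `limit`, or a negative errno value. The sketch assumes
 * the returned count fits in a long long.
 */
static long long allowed_count(long long pos, unsigned long limit,
                               unsigned long count)
{
    if (pos < 0)
        return -EINVAL;
    if ((unsigned long long)pos >= limit)
        return -EFBIG; /* the kernel also sends SIGXFSZ at this point */

    /* Here 0 <= pos < limit, so casting pos to the limit's type cannot
     * truncate it and no separate 0xFFFFFFFF test is needed. */
    if (count > limit - (unsigned long)pos)
        count = limit - (unsigned long)pos;
    return (long long)count;
}
```

      A write of 50 bytes at position 90 under a limit of 100 is shortened to
      10 bytes. A write at or beyond the limit fails with -EFBIG.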
    • [PATCH] bio kmapping changes · 240d3e2d
      Andrew Morton authored
      RAID5 is calling copy_data() under sh->lock.  But copy_data() does kmap(),
      which can sleep.
      
      The best fix is to use kmap_atomic() in there.  It is faster than kmap() and
      does not block.
      
      The patch removes the unused bio_kmap() and replaces __bio_kmap() with
      __bio_kmap_atomic().  I think it's best to withdraw the sleeping-and-slow
      bio_kmap() from the kernel API before someone else tries to use it.
      
      
      Also, I notice that bio_kmap_irq() was using local_save_flags().  This is a
      bug - local_save_flags() does not disable interrupts.  Converted that to
      local_irq_save().  These names are terribly chosen.
      
      This patch was acked by Jens and Neil.
    • [PATCH] Fix some compile warnings · d597f71b
      Andrew Morton authored
      From: "Martin J. Bligh" <mbligh@aracnet.com>
      
      Fix a couple of instances of "warning: suggest parentheses around assignment
      used as truth value".
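
      For reference, a small sketch of the pattern this warning is about. The
      function is a made-up example, not one of the places the patch touches.
      gcc warns when a bare assignment is used as a condition. Extra
      parentheses, or an explicit comparison as here, mark the assignment as
      intentional.

```c
#include <stddef.h>

/* Count the characters before the terminating NUL. */
static size_t count_chars(const char *s)
{
    size_t n = 0;
    char c;

    /* "while (c = *s++)" would trigger the warning. Comparing the
     * assigned value explicitly makes the intent clear. */
    while ((c = *s++) != '\0')
        n++;
    return n;
}
```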
    • [PATCH] monotonic clock source for hangcheck timer · 92525be5
      Andrew Morton authored
      From: john stultz <johnstul@us.ibm.com>
      
      This patch, written with the advice of Joel Becker, addresses a problem with
      the hangcheck-timer.
      
      The basic problem is that the hangcheck-timer code (required for Oracle)
      needs an accurate hard clock which can be used to detect OS stalls (due to
      udelay() or PCI bus hangs) that would cause system time to skew (it's sort
      of a sanity check that ensures the system's notion of time is accurate).
      However, it currently uses get_cycles() to fetch the CPU's TSC register,
      so it does not work on systems without a synced TSC.
      
      As suggested by Andi Kleen (see thread here:
      http://www.uwsg.iu.edu/hypermail/linux/kernel/0302.0/1234.html ) I've worked
      with Joel and others to implement the monotonic_clock() interface.  Some of
      the major considerations made when writing this patch were:
      
      o Needs to be able to return accurate time in the absence of multiple timer
        interrupts
      
      o Needs to be abstracted out from the hardware
      
      o Avoids impacting gettimeofday() performance
      
      This interface returns an unsigned long long representing the number of
      nanoseconds that have passed since time_init().
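
      A user-space analogue of such an interface, as a rough sketch. It uses
      POSIX clock_gettime(CLOCK_MONOTONIC), not the kernel's monotonic_clock(),
      and counts from an unspecified starting point rather than from
      time_init(). The names monotonic_ns and stalled are hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <time.h>

/* Nanoseconds on the monotonic clock, or 0 if the clock is unavailable. */
static unsigned long long monotonic_ns(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return 0;
    return (unsigned long long)ts.tv_sec * 1000000000ULL
           + (unsigned long long)ts.tv_nsec;
}

/*
 * Hangcheck-style test: returns 1 if more time passed between two
 * samples than the allowed margin, 0 otherwise.
 */
static int stalled(unsigned long long before_ns, unsigned long long after_ns,
                   unsigned long long margin_ns)
{
    return after_ns - before_ns > margin_ns;
}
```

      A checker would sample monotonic_ns() when it arms its timer and again
      when the timer fires, then pass both samples to stalled() together with
      the expected interval plus a tolerance.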
    • [PATCH] handle bad inodes in put_inode · 68fa8120
      Andrew Morton authored
      From: "J. Bruce Fields" <bfields@fieldses.org>
      
      If the NFS daemon is presented with a filehandle for a file that has
      been deleted, it does an iget() in fs/exportfs/expfs.c:export_iget() and
      gets a bad inode back.  When it subsequently iput()s the inode, the
      result is:
      
      Mar 27 12:53:40 snoopy kernel: EXT2-fs error (device ide0(3,3)): ext2_free_blocks: Freeing blocks not in datazone - block = 1802201963, count = 27499
      Mar 27 12:53:40 snoopy kernel: Remounting filesystem read-only
      
      The same can happen if ext2_get_inode() returns an error - ext2_read_inode()
      will return an uninitialised inode and ext2_put_inode() is not allowed to go
      looking inside the bad inode.
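
      The shape of such a guard, as a user-space sketch. The struct and the
      function are hypothetical stand-ins. In the kernel the test would be made
      with is_bad_inode(), and what the put path does with a good inode is
      filesystem-specific.

```c
/* Hypothetical stand-in for an in-memory inode. */
struct sketch_inode {
    int is_bad;          /* set when the read path failed to fill the inode */
    int prealloc_count;  /* only meaningful when is_bad is 0 */
};

/*
 * Release per-inode state on the final put and return how many
 * preallocated blocks were discarded. A bad inode is never examined,
 * because its fields were never initialised.
 */
static int sketch_put_inode(struct sketch_inode *inode)
{
    int discarded;

    if (inode->is_bad)
        return 0;

    discarded = inode->prealloc_count;
    inode->prealloc_count = 0;
    return discarded;
}
```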
    • [PATCH] tmpfs blk_congestion_wait fix · 505f7dd2
      Andrew Morton authored
      From: Hugh Dickins <hugh@veritas.com>
      
      The blk_congestion_waits in shmem_getpage are appropriate when the error is
      -ENOMEM, but not when the error is -EEXIST.  So add that test in the first
      instance, but omit it all in the second instance.
  2. 02 Apr, 2003 19 commits
  3. 01 Apr, 2003 8 commits