1. 18 Mar, 2004 30 commits
    • [PATCH] proper alignment of init task in kernel image · 8a87e758
      Andrew Morton authored
      From: Matt Mackall <mpm@selenic.com>
      
      This keeps the alignment of the init task matched with the stack size.
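
      Illustrative sketch (not part of the patch): many architectures locate
      per-task data by masking the stack pointer with ~(THREAD_SIZE - 1), which
      only works when the stack region is aligned to the stack size.  The
      THREAD_SIZE value and the helper names below are assumptions made for the
      example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stack size for illustration; real values are per-architecture. */
#define THREAD_SIZE 8192u

/* Recover the base of a stack region from any address inside it.
 * This only gives the right answer when the region is aligned to
 * THREAD_SIZE. */
static uintptr_t stack_base(uintptr_t sp)
{
    return sp & ~((uintptr_t)THREAD_SIZE - 1);
}

/* Returns 1 if masking an address inside the region finds its true base. */
static int base_is_recoverable(uintptr_t region_start, uintptr_t offset)
{
    return stack_base(region_start + offset) == region_start;
}
```

      A region starting at 0x10000 is aligned to 8192 and passes the check; one
      starting at 0x11000 is only 4096-aligned and fails it.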
    • [PATCH] sysfs: pin kobjects to fix use-after-free crashes · 2c0e195b
      Andrew Morton authored
      From: Maneesh Soni <maneesh@in.ibm.com>
      
      Fix a sysfs use-after-free crash.  The problem is that the kobject can go
      away while a live dentry (the corresponding sysfs directory) still points
      to it through the d_fsdata pointer.  The patch keeps the kobject alive by
      taking a reference to it for the lifetime of the corresponding dentry.
      
      
      o The following pins the kobject when sysfs assigns dentry and inode to
        the kobject. This ensures that kobject is alive during the life time of
        the dentry and inode, and people holding ref. to the dentry can access the
        kobject without any problems.
      
      o The ref. taken for the kobject is released through dentry->d_op->d_iput()
        call when the dentry ref. count drops to zero and it is being freed. For
        this sysfs_dentry_operations is introduced.
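
      Illustrative sketch (not the patch itself): a userspace model of the
      pinning described above.  The model_kobj and model_dentry types and all
      function names are hypothetical stand-ins, not the kernel structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for kobject and dentry; not the kernel structures. */
struct model_kobj { int refcount; int freed; };
struct model_dentry { struct model_kobj *fsdata; };

static void kobj_get(struct model_kobj *k) { k->refcount++; }

static void kobj_put(struct model_kobj *k)
{
    if (--k->refcount == 0)
        k->freed = 1;   /* the real object would be released here */
}

/* Pin the object for as long as the dentry points at it. */
static void dentry_attach(struct model_dentry *d, struct model_kobj *k)
{
    kobj_get(k);
    d->fsdata = k;
}

/* Plays the role of the d_iput hook: drop the pin when the dentry dies. */
static void dentry_iput(struct model_dentry *d)
{
    struct model_kobj *k = d->fsdata;

    d->fsdata = NULL;
    if (k)
        kobj_put(k);
}
```

      In this model the object survives the owner dropping its own reference
      and is released only once the dentry's pin is dropped as well.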
      
      To test, run the following on an SMP box:
      
      1) insmod/rmmod the "dummy.o" network driver in an endless loop.
      
      2) In parallel, run "find /sys/class/net | xargs cat", also in an endless loop.
    • [PATCH] Fix dentry refcounting in sysfs_remove_group() · b67cee68
      Andrew Morton authored
      From: Maneesh Soni <maneesh@in.ibm.com>
      
      The following patch fixes the dentry refcounting during
      sysfs_remove_group() and also adds the missing dput() for the "extra" ref
      taken during sysfs_create() for the sub-directory dentry corresponding to
      the attribute group.
    • [PATCH] sysfs_remove_dir-vs-dcache_readdir race fix · f8f71d1b
      Andrew Morton authored
      From: Maneesh Soni <maneesh@in.ibm.com>
      
      I have re-done the patch fixing the race between sysfs_remove_dir() and
      dcache_readdir().  If you recall, sysfs_remove_dir(kobj) manipulates the
      ->d_subdirs list for the dentry corresponding to the sysfs directory being
      removed.  It can end up deleting the cursor dentry which is added to the
      ->d_subdirs list during a concurrent dcache_dir_open() ==> dcache_readdir()
      for the same directory.  And as a result dcache_readdir() can loop for ever
      holding dcache_lock.
      
      The earlier patch which was included in -mm1 created problems which
      resulted in list_del() BUG hits in prune_dcache().  The reason I think is
      that in the main loop in sysfs_remove_dir(), dcache_lock is dropped and
      re-acquired, and this could result in inconsistent ->d_subdirs list and
      prune_dcache() may try to delete an already deleted dentry.  I have
      corrected this in the new patch as below.
      
      I could do sysfs_remove_dir() more neatly on the sysfs backing store patch
      set, as there I don't use the ->d_subdirs list.  Instead the list of children
      sysfs_dirent works out well.  But until the sysfs backing store patch is
      picked up, the existing code suffers from this race.  This can be easily
      tested by running the following two loops on an SMP box:
      
      # while true; do insmod drivers/net/dummy.ko; rmmod dummy; done
      # while true; do find /sys/class/net > /dev/null; done
      
      
      o This patch fixes the sysfs_remove_dir race with dcache_readdir.  There is
        no need for sysfs_remove_dir to modify the d_subdirs list for the
        directory being deleted, as that is taken care of in the final dput.
        Modifying this list results in an inconsistent d_subdirs list and causes
        an infinite loop in a concurrently running dcache_readdir.
      
      o The main loop is restarted every time dcache_lock is re-acquired, in
        order to maintain consistency.
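
      Illustrative sketch (not the patch itself): a single-threaded model of
      the "restart the walk after the lock was dropped" pattern.  The node and
      dir types and the lock_cycles counter are hypothetical; a real
      implementation would use an actual lock.

```c
#include <assert.h>
#include <stddef.h>

struct node { struct node *next; int hashed; };

/* Hypothetical model: the "lock" is a counter of drop/re-acquire cycles. */
struct dir { struct node *head; int lock_cycles; };

/* Unhash every child without touching the list links.  The lock must be
 * dropped for the per-child work, so after re-acquiring it the walk
 * restarts from the head rather than trusting a saved position in a
 * list that may have changed in the meantime. */
static int unhash_children(struct dir *d)
{
    int removed = 0;
    struct node *n;

restart:
    for (n = d->head; n != NULL; n = n->next) {
        if (!n->hashed)
            continue;          /* already handled on an earlier pass */
        n->hashed = 0;
        d->lock_cycles++;      /* stands for unlock, work, lock again */
        removed++;
        goto restart;
    }
    return removed;
}
```

      The restart costs extra passes over the list, but it never follows a
      pointer that was saved before the lock was released.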
    • [PATCH] Add dma_error() and pci_dma_error() · 78576382
      Andrew Morton authored
      From: Anton Blanchard <anton@samba.org>
      
      Introduce dma_error() and pci_dma_error() which are used to detect failures
      in pci_map_single.
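
      Illustrative sketch (not the patch itself): how a caller might check a
      mapping result with a dma_error()-style helper.  The sentinel value, the
      model_ names and the fake translation are all assumptions made for the
      example, not the kernel API.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t model_dma_addr_t;

/* Assumed sentinel for a failed mapping; the real value is arch-specific. */
#define MODEL_DMA_ERROR_CODE (~(model_dma_addr_t)0)

/* Hypothetical mapping routine: fails when the table has no free slots. */
static model_dma_addr_t model_map_single(int free_slots, uint64_t phys)
{
    if (free_slots <= 0)
        return MODEL_DMA_ERROR_CODE;
    return phys + 0x1000;  /* arbitrary translation for the sketch */
}

/* Plays the role of dma_error(): the caller checks the returned handle. */
static int model_dma_error(model_dma_addr_t addr)
{
    return addr == MODEL_DMA_ERROR_CODE;
}
```

      A driver would test the handle right after mapping and back out of the
      I/O setup if the check reports a failure.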
    • [PATCH] ppc64: Fix POWER3 TCE allocation · 1c4c0ff6
      Andrew Morton authored
      From: Anton Blanchard <anton@samba.org>
      
      - Fix for machines with 3GB IO holes (eg nighthawk).
      - Increase the maximum number of PHBs and warn if we exceed this (we used
        to walk off the end of the array)
      - Only allocate an 8MB TCE table on POWER4
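
      Illustrative sketch (not the patch itself): refusing to add an entry past
      a fixed limit, with a warning, instead of walking off the end of the
      array.  The limit of 4 and all names are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>

#define MODEL_MAX_PHBS 4   /* assumed limit for the sketch */

struct phb_table { int ids[MODEL_MAX_PHBS]; int count; };

/* Add an entry, refusing (with a warning) instead of writing past the end. */
static int phb_add(struct phb_table *t, int id)
{
    if (t->count >= MODEL_MAX_PHBS) {
        fprintf(stderr, "warning: more than %d PHBs, ignoring %d\n",
                MODEL_MAX_PHBS, id);
        return -1;
    }
    t->ids[t->count++] = id;
    return 0;
}
```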
    • [PATCH] ppc64: Fix SLB reload bug · f75bd853
      Andrew Morton authored
      From: Paul Mackerras <paulus@samba.org>
      
      Recently we found a particularly nasty bug in the segment handling in the
      ppc64 kernel.  It would only happen rarely under heavy load, but when it
      did the machine would lock up with the whole of memory filled with
      exception stack frames.
      
      The primary cause was that we were losing the translation for the kernel
      stack from the SLB, but we still had it in the ERAT for a while longer.
      Now, there is a critical region in various exception exit paths where we
      have loaded the SRR0 and SRR1 registers from GPRs and we are loading those
      GPRs and the stack pointer from the exception frame on the kernel stack.
      If we lose the ERAT entry for the kernel stack in that region, we take an
      SLB miss on the next access to the kernel stack.  Taking the exception
      overwrites the values we have put into SRR0 and SRR1, which means we lose
      state.  In fact we ended up repeating that last section of the exception
      exit path, but using the user stack pointer this time.  That caused another
      exception (or if it didn't, we loaded a new value from the user stack and
      then went around and tried to use that).  And it spiralled downwards from
      there.
      
      The patch below fixes the primary problem by making sure that we really
      never cast out the SLB entry for the kernel stack.  It also improves
      debuggability in case anything like this happens again by:
      
      - In our exception exit paths, we now check whether the RI bit in the
        SRR1 value is 0.  We already set the RI bit to 0 before starting the
        critical region, but we never checked it.  Now, if we do ever get an
        exception in one of the critical regions, we will detect it before
        returning to the critical region, and instead we will print a nasty
        message and oops.
      
      - In the exception entry code, we now check that the kernel stack pointer
        value we're about to use isn't a userspace address.  If it is, we print a
        nasty message and oops.
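
      Illustrative sketch (not the patch itself): the two sanity checks above
      expressed as C predicates.  The real checks live in assembly exception
      paths; the bit positions here (RI as bit 1 of the saved MSR image, top
      address bit set for kernel addresses) are assumptions of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for the sketch: RI is bit 1 of the saved MSR image, and
 * kernel addresses have the top bit set while user addresses do not. */
#define MODEL_MSR_RI      ((uint64_t)1 << 1)
#define MODEL_KERNEL_BIT  ((uint64_t)1 << 63)

/* An exception taken inside the critical region sees RI == 0 in SRR1. */
static int exception_is_recoverable(uint64_t srr1)
{
    return (srr1 & MODEL_MSR_RI) != 0;
}

/* A stack pointer without the kernel bit is treated as a user address. */
static int kernel_sp_is_sane(uint64_t sp)
{
    return (sp & MODEL_KERNEL_BIT) != 0;
}
```

      When either predicate is false, the behaviour described above is to
      print a message and oops rather than continue with lost state.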
      
      This has been tested on G5 and pSeries (both with and without hypervisor)
      and compile-tested on iSeries.
    • [PATCH] ppc64: Add numa=off command line option · 973a4e70
      Andrew Morton authored
      From: Anton Blanchard <anton@samba.org>
      
      Add numa=off command line option to disable NUMA support at runtime.
      Useful if there are issues with our parsing of the NUMA topology or for
      testing NUMA gains.
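
      Illustrative sketch (not the patch itself): detecting an exact
      "numa=off" token in a space-separated command line.  numa_disabled() is
      a hypothetical helper, not the kernel's option parser.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if the space-separated command line contains the exact
 * token "numa=off".  Hypothetical helper, not the kernel's parser. */
static int numa_disabled(const char *cmdline)
{
    static const char opt[] = "numa=off";
    const size_t len = sizeof(opt) - 1;
    const char *p = cmdline;

    while ((p = strstr(p, opt)) != NULL) {
        int starts = (p == cmdline) || (p[-1] == ' ');
        int ends = (p[len] == '\0') || (p[len] == ' ');

        if (starts && ends)
            return 1;
        p += len;   /* skip this partial match and keep looking */
    }
    return 0;
}
```

      Checking both token boundaries keeps strings such as "numa=offline" or
      "xnuma=off" from being mistaken for the option.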
    • [PATCH] ppc64: remove IO_DEBUG · a800d7b1
      Andrew Morton authored
      From: Anton Blanchard <anton@samba.org>
      
      Remove the old __IO_DEBUG stuff and add some nice comments courtesy of x86.
    • [PATCH] ppc64: iSeries virtual tape driver · 590b0a82
      Andrew Morton authored
      From: Stephen Rothwell <sfr@canb.auug.org.au>
      
      This patch adds the driver for the PPC64 iSeries virtual tape.
    • [PATCH] add touch_atime() helper · df8781b5
      Alexander Viro authored
      Preparation for per-mountpoint noatime, nodiratime and later -
      per-mountpoint r/o.  Depends on file_accessed() patch, should go after
      it.
      
      New helper - touch_atime(mnt, dentry).  It's a wrapper for
      update_atime() and that's where all future per-mountpoint checks will
      go.
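
      Illustrative sketch (not the patch itself): the wrapper layering in a
      reduced userspace model.  All model_ types are hypothetical; the noatime
      field stands for a possible future per-mountpoint check, and routing the
      file-level helper through the mount-level one is an assumption of the
      sketch.

```c
#include <assert.h>

/* Hypothetical, heavily reduced stand-ins for the VFS objects. */
struct model_inode  { long atime; };
struct model_mount  { int noatime; };   /* assumed future per-mount flag */
struct model_dentry { struct model_inode *inode; };
struct model_file   { struct model_mount *mnt; struct model_dentry *dentry; };

static long model_now = 100;            /* fake clock */

static void model_update_atime(struct model_inode *inode)
{
    inode->atime = model_now;
}

/* Single choke point: per-mount checks can be added here later. */
static void model_touch_atime(struct model_mount *mnt, struct model_dentry *d)
{
    if (mnt->noatime)
        return;
    model_update_atime(d->inode);
}

/* Convenience wrapper for callers that hold an open file. */
static void model_file_accessed(struct model_file *f)
{
    model_touch_atime(f->mnt, f->dentry);
}
```

      With every caller going through one wrapper, a new per-mountpoint policy
      needs a change in only one place.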
    • [PATCH] add file_accessed() helper · 5f9861a6
      Alexander Viro authored
      New inlined helper - file_accessed(file) (wrapper for update_atime())
    • [PATCH] missing check in do_add_mount() · d2a4a177
      Alexander Viro authored
      Make sure that we don't end up with symlink mounted over something
      
      (mount --bind is safe since we use LOOKUP_FOLLOW in pathname resolution
      there).
    • Merge nuts.davemloft.net:/disk1/BK/sparcwork-2.6 · 4143a413
      David S. Miller authored
      into nuts.davemloft.net:/disk1/BK/sparc-2.6
    • William Lee Irwin III's avatar
      bce3bd85
    • Jason Wever's avatar
    • [PATCH] FD_CLOEXEC fcntl cleanup · dc5178fa
      David Howells authored
      This fixes a minor problem with fcntl.
      
      get_close_on_exec() uses FD_ISSET() to determine the fd state, but this
      is not guaranteed to be either 0 or FD_CLOEXEC.  Make that explicit.
      
      Also, the argument of set_close_on_exec() is being AND'ed with the
      literal constant 1.  Make it use an explicit FD_CLOEXEC test.
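
      Illustrative sketch (not the patch itself): a reduced model of the two
      changes.  The model_fdset type and model_ functions are hypothetical;
      only FD_CLOEXEC comes from <fcntl.h>.

```c
#include <assert.h>
#include <fcntl.h>   /* FD_CLOEXEC */

/* Reduced model: one word of close-on-exec bits, one bit per descriptor. */
struct model_fdset { unsigned long bits; };

/* A raw mask test: non-zero when set, but not necessarily 1. */
static unsigned long model_fd_isset_raw(const struct model_fdset *s, int fd)
{
    return s->bits & (1UL << fd);
}

/* Normalise explicitly so callers only ever see 0 or FD_CLOEXEC. */
static int model_get_close_on_exec(const struct model_fdset *s, int fd)
{
    return model_fd_isset_raw(s, fd) ? FD_CLOEXEC : 0;
}

/* Test the named flag instead of AND-ing with a literal 1. */
static void model_set_close_on_exec(struct model_fdset *s, int fd, int flags)
{
    if (flags & FD_CLOEXEC)
        s->bits |= 1UL << fd;
    else
        s->bits &= ~(1UL << fd);
}
```

      For descriptor 5 the raw test yields 32, which is why the getter
      normalises the result before handing it to callers.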
    • Make ppc64 __FD_ISSET() return a proper boolean return value. · 437117bf
      Linus Torvalds authored
      (The broken macro only triggers for non-gcc compiles, but
      still..)
    • [PATCH] add missing MODULE_DEVICE_TABLE() to IDE PCI drivers · 2b34fa5f
      Bartlomiej Zolnierkiewicz authored
      Original patch from Hannes Reinecke <hare@suse.de>.
      
      This is required to have modular IDE drivers announce themselves
      properly in modules.pcimap.
    • Linus Torvalds's avatar
      1ddcf61a
    • [PATCH] Implement migrate_all_tasks · ae01bd8f
      Rusty Russell authored
      Implement migrate_all_tasks() which moves tasks off cpu while machine
      is stopped.
    • [PATCH] Export cpu notifiers and do locking. · 30d67695
      Rusty Russell authored
      The registration and unregistration of CPU notifiers should be done
      under the cpucontrol sem.  They should also be exported.
    • [PATCH] hpfs: general cleanup · ea747b67
      Alexander Viro authored
      include files moved to fs/hpfs/, gratuitous #include removed, stuff that
      doesn't have to be global made static, misindented chunk of
      hpfs_readdir() put in place, etc.
    • [PATCH] hpfs: fix locking scheme · 9c96c8be
      Alexander Viro authored
      	Fixed the locking scheme.  The need of extra locking was caused by
      the fact that hpfs_write_inode() must update directory entry; since HPFS
      directories are implemented as b-trees, we must provide protection both
      against rename() (to make sure that we update the entry in right directory)
      and against rebalancing of the parent.
      
      	Old scheme had both deadlocks and races - to start with, we had no
      protection against rename()/unlink()/rmdir(), since (a) locking parent
      was done without any warranties that it will remain our parent and (b)
      check that we still have a directory entry (== have positive nlink) was
      done before we tried to lock the parent.  Moreover, iget serialization
      killed two steps ago gave immediate deadlocks if iget() of parent had
      triggered another hpfs_write_inode().
      
      	New scheme introduces another per-inode semaphore (hpfs-only,
      obviously) protecting the reference to parent.  It's taken on
      rename/rmdir/unlink victims and inode being moved by rename.  Old semaphores
      are taken only on parent(s) and only after we grab one(s) of the new kind.
      hpfs_write_inode() gets the new semaphore on our inode, checks nlink and
      if it's non-zero grabs parent and takes the old semaphore on it.
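
      Illustrative sketch (not the patch itself): the lock order used by the
      write path in the new scheme, with locks modelled as plain flags so the
      order can be inspected.  The model_inode layout and field names are
      hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: locks are flags so the ordering can be inspected. */
struct model_inode {
    struct model_inode *parent;
    int nlink;
    int parent_sem;      /* new kind: protects the reference to parent */
    int dir_sem;         /* old kind: protects the directory contents */
    int dirent_updates;
};

/* Order: new-kind lock on the child first, old-kind lock on the parent
 * second.  Returns 1 if the directory entry was updated. */
static int model_write_inode(struct model_inode *i)
{
    int updated = 0;

    i->parent_sem = 1;                 /* parent cannot change from here on */
    if (i->nlink > 0 && i->parent != NULL) {
        struct model_inode *dir = i->parent;

        assert(dir->dir_sem == 0);     /* the flag model has no waiting */
        dir->dir_sem = 1;
        dir->dirent_updates++;         /* update the entry in the right dir */
        dir->dir_sem = 0;
        updated = 1;
    }
    i->parent_sem = 0;
    return updated;
}
```

      The nlink check happens only after the child's lock is held, which is
      the ordering the old scheme got wrong.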
      
      	Order among the semaphores of the same kind is arbitrary - the only
      function that might take more than one of the same kind is hpfs_rename()
      and it's serialized by VFS.
      
      	We might get away with only one semaphore, but then the ordering
      issues would bite us big way - we would have to make sure that child is
      always locked before parent (hpfs_write_inode() leaves no other choice)
      and while that's easy to do for almost all operations, rename() is a bitch -
      as always.  And per-superblock rwsem giving rename() vs. write_inode()
      exclusion on hpfs would make the entire thing too baroque for my taste.
      	->readdir() takes no locks at all (protection against directory
      modifications is provided by VFS exclusion), ditto for ->lookup().
      	->llseek() on directories switched to use of (VFS) ->i_sem, so
      it's safe from directory modifications and ->readdir() is safe from it -
      no hpfs locks are needed here.
    • [PATCH] hpfs: deadlock fixes · d9013aae
      Alexander Viro authored
      We used to have GFP_KERNEL kmalloc() done by the code that held hpfs
      lock on directory.  That could trigger a call of hpfs_write_inode() and
      deadlock; fixed by switch to GFP_NOFS.  Same for hpfs inodes themselves
      - hpfs_write_inode() calls iget() and that could trigger both the
      deadlocks (avoidable with very baroque locking scheme) and stack
      overflows (unavoidable unless we kill potential recursion here).
    • [PATCH] hpfs: hpfs iget locking cleanup · 772fd530
      Alexander Viro authored
      Killed the nightmares in hpfs iget handling.  Since in some (fairly
      frequent) cases hpfs_read_inode() could avoid any IO (basically, lookup
      hitting a native HPFS regular file can get all data from directory
      entry) hpfs had a flag passed to that sucker.  Said flag had been
      protected by a semaphore lookalike made out of spit and duct-tape and
      callers of iget looked like
      
      	hpfs_lock_iget(sb, flag);
      	result = iget(sb, ino);
      	hpfs_unlock_iget(sb);
      
      Since now we are calling hpfs_read_inode() directly (note that calling
      it without hpfs_lock_iget() would simply break) we can forget all that
      crap and get rid of the flag - caller knows what it wants to call.
      
      BTW, that had killed one of the last sleep_on() users in fs/*/*.
    • [PATCH] hpfs: hpfs iget locking cleanup preparation · b5b83bae
      Alexander Viro authored
      	Preparation to hpfs iget locking cleanup - remaining iget() callers
      replaced with explicit iget_locked() + call hpfs_read_inode()/unlock_new_inode()
      if inode is new.
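
      Illustrative sketch (not the patch itself): the "look up or claim, then
      let the caller initialise a new entry" pattern in a userspace model.  The
      model_ cache, its size and the MODEL_I_NEW flag are hypothetical
      stand-ins for the inode cache and its new-inode state.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_CACHE_SLOTS 8
#define MODEL_I_NEW 1

struct model_inode { unsigned long ino; int state; int in_use; int reads; };

static struct model_inode model_cache[MODEL_CACHE_SLOTS];

/* Find the inode in the cache, or claim a free slot and mark it new. */
static struct model_inode *model_iget_locked(unsigned long ino)
{
    size_t i;

    for (i = 0; i < MODEL_CACHE_SLOTS; i++)
        if (model_cache[i].in_use && model_cache[i].ino == ino)
            return &model_cache[i];
    for (i = 0; i < MODEL_CACHE_SLOTS; i++)
        if (!model_cache[i].in_use) {
            model_cache[i].in_use = 1;
            model_cache[i].ino = ino;
            model_cache[i].state = MODEL_I_NEW;
            model_cache[i].reads = 0;
            return &model_cache[i];
        }
    return NULL;  /* cache full: the caller must handle failure */
}

/* The caller, not the cache, decides how to fill in a new inode. */
static struct model_inode *model_get_inode(unsigned long ino)
{
    struct model_inode *inode = model_iget_locked(ino);

    if (inode != NULL && (inode->state & MODEL_I_NEW)) {
        inode->reads++;                 /* stands for hpfs_read_inode() */
        inode->state &= ~MODEL_I_NEW;   /* stands for unlock_new_inode() */
    }
    return inode;
}
```

      A second lookup of the same number returns the cached entry without
      running the read step again.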
    • [PATCH] hpfs: new/read/write_inode() cleanups · eb3a6d15
      Alexander Viro authored
      	1) common initialization for all paths in hpfs_read_inode() taken into
      a separate helper (hpfs_init_inode())
      	2) hpfs mkdir(),create(),mknod() and symlink() do not bother with
      iget() anymore - they call new_inode(), do initializations and insert new
      inode into icache.  Handling of OOM failures cleaned up - if we can't
      allocate in-core inode, bail instead of corrupting the filesystem.
      Allocating in-core inode early also avoids one of the deadlocks here
      (hpfs_write_inode() from memory pressure by kmem_cache_alloc() could
      deadlock on attempt to lock our directory).
	3) hpfs_write_inode() marks the inode dirty again if it fails to
iget() its parent directory.  Again, OOM could trigger fs corruption
here.
    • [PATCH] hpfs: clean up lock ordering · fde48def
      Alexander Viro authored
      	hpfs_{lock,unlock}_{2,3}inodes() killed; all places that take more than
      one lock have ->i_sem held by VFS on all inodes involved and all hpfs per-inode
      locks are of the same type.  IOW, we can replace these guys with multiple
      hpfs_lock_inode() - order doesn't matter here.
    • [PATCH] hpfs: namei.c failure case cleanups · c4357dfe
      Alexander Viro authored
      Failure exits in hpfs/namei.c merged and cleaned up.
  2. 17 Mar, 2004 10 commits
    • [PATCH] ISDN kernelcapi notifier NULL pointer fix · 8754a8e2
      Andrew Morton authored
      From: Armin Schindler <armin@melware.de>
      
      Fixed NULL pointer reference in recv_handler()
    • [PATCH] ISDN kernelcapi notifier workqueue re-structured · 3820421b
      Andrew Morton authored
      From: Armin Schindler <armin@melware.de>
      
      Use the notifier workqueue in a cleaner way.
    • [PATCH] ISDN kernelcapi debug message enable · 9bab8e3c
      Andrew Morton authored
      From: Armin Schindler <armin@melware.de>
      
      Show debug messages only if debug is enabled.
    • [PATCH] exportfs - Remove unnecessary locking from find_exported_dentry() · 1b8e3f21
      Andrew Morton authored
      From: "Jose R. Santos" <jrsantos@austin.ibm.com>
      
      After discussing it with Neil, he felt that the original justification for
      taking the kernel_lock in find_exported_dentry() is no longer valid and
      that it should be safe to remove.
      
      This patch fixes an issue seen while running SpecSFS where, under memory
      pressure, shrinking the dcache causes find_exported_dentry() to allocate
      disconnected dentries that later need to be properly connected.  The
      connecting part of the code ran with the BKL held, which caused a sharp
      drop in performance during iterations, with profiles showing 75% of time
      spent in find_exported_dentry().  After applying the patch, time spent in
      the function is reduced to <1%.
      
      I have tested this on an 8-way machine with 56 filesystems for several days
      now with no problems using ext2, ext3, xfs and jfs.
    • [PATCH] JBD: avoid panic on corrupted journal superblock · efbe9b14
      Andrew Morton authored
      Don't panic if the journal superblock is wrecked: just fail the mount.
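
      Illustrative sketch (not the patch itself): validating a journal
      superblock and returning an error so the mount can fail cleanly.  The
      struct layout, the magic constant and the specific checks are assumptions
      made for the example, not the JBD code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed magic value for the sketch. */
#define MODEL_JOURNAL_MAGIC 0xC03B3998u

/* Reduced, hypothetical superblock already in host byte order. */
struct model_journal_sb { uint32_t magic; uint32_t maxlen; };

/* Returns 0 if the superblock looks usable, -EINVAL if not, so the
 * caller can fail the mount instead of panicking on bad on-disk data. */
static int model_check_journal_sb(const struct model_journal_sb *sb)
{
    if (sb->magic != MODEL_JOURNAL_MAGIC)
        return -EINVAL;
    if (sb->maxlen == 0)
        return -EINVAL;
    return 0;
}
```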
    • [PATCH] ppc64: CONFIG_PREEMPT Kconfig help fix · 83a6b2ed
      Andrew Morton authored
      From: Anton Blanchard <anton@samba.org>
      
      From: Robert Love <rml@ximian.com>
      
      arch/ppc64/Kconfig's entry for CONFIG_PREEMPT is missing the description
      after the "bool" statement, so the entry does not show up.
      
      Also, the help description mentions a restriction that is not [any longer]
      true.
    • [PATCH] ppc64: xmon oops-the-kernel option · d74f0f95
      Andrew Morton authored
      From: Anton Blanchard <anton@samba.org>
      
      Sometimes we just want to pass the error up to the kernel and let it oops.
      X it is.
    • [PATCH] ppc64: wrap some stuff in __KERNEL__ · 62c2f777
      Andrew Morton authored
      From: Anton Blanchard <anton@samba.org>
      
      - remove now unused kernel syscalls.
      - wrap recently added defines in #ifdef __KERNEL__, fixes glibc
        compile issue
      - some of our extra syscalls used asmlinkage, some did not. Make them
        consistent
    • [PATCH] ppc32: Fix booting some IBM PRePs · 6cfe07f4
      Andrew Morton authored
      From: Tom Rini <trini@kernel.crashing.org>
      
      The following patch comes from Paul Mackerras.  Earlier on in 2.6,
      arch/ppc/boot/utils/mkprep.c was changed slightly so that it would build
      and work on Solaris.  Doing this required changing from filling out
      pointers to an area to filling out a local copy of the struct.  However, a
      memcpy was left out, and the info is only needed on some machines to boot.
      The following adds in the missing memcpy and allows for IBM PRePs to boot
      from a raw floppy again.
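
      Illustrative sketch (not the patch itself): filling a local copy of a
      struct and then copying it into the output image, which is the step the
      message says was left out.  The model_hdr layout and field values are
      hypothetical, not the real mkprep structures.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical header layout; not the real mkprep structures. */
struct model_hdr { unsigned char boot_flag; unsigned char entry[3]; };

/* Fill a local struct, then copy it into the output image.  Forgetting
 * the memcpy leaves the image bytes untouched. */
static void model_write_header(unsigned char *image, size_t offset)
{
    struct model_hdr h;

    memset(&h, 0, sizeof(h));
    h.boot_flag = 0x80;
    h.entry[0] = 0x01;
    memcpy(image + offset, &h, sizeof(h));   /* the step that was missing */
}
```

      Working on a local copy avoids writing through a pointer into the image
      buffer, at the cost of one extra copy that must not be forgotten.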
    • [PATCH] ppc32: fix SMP build · af0d0480
      Andrew Morton authored
      From: Olaf Hering <olh@suse.de>
      
      Current Linus tree adds an extra space and dot to the mkprep options.
      `make all' with an smp config doesn't work.  This patch fixes it.