- 18 Mar, 2004 24 commits
-
-
Andrew Morton authored
From: Paul Mackerras <paulus@samba.org>

Recently we found a particularly nasty bug in the segment handling in the ppc64 kernel. It would only happen rarely under heavy load, but when it did the machine would lock up with the whole of memory filled with exception stack frames. The primary cause was that we were losing the translation for the kernel stack from the SLB, but we still had it in the ERAT for a while longer.

Now, there is a critical region in various exception exit paths where we have loaded the SRR0 and SRR1 registers from GPRs and we are loading those GPRs and the stack pointer from the exception frame on the kernel stack. If we lose the ERAT entry for the kernel stack in that region, we take an SLB miss on the next access to the kernel stack. Taking the exception overwrites the values we have put into SRR0 and SRR1, which means we lose state. In fact we ended up repeating that last section of the exception exit path, but using the user stack pointer this time. That caused another exception (or if it didn't, we loaded a new value from the user stack and then went around and tried to use that). And it spiralled downwards from there.

The patch below fixes the primary problem by making sure that we really never cast out the SLB entry for the kernel stack. It also improves debuggability in case anything like this happens again by:

- In our exception exit paths, we now check whether the RI bit in the SRR1 value is 0. We already set the RI bit to 0 before starting the critical region, but we never checked it. Now, if we do ever get an exception in one of the critical regions, we will detect it before returning to the critical region, and instead we will print a nasty message and oops.
- In the exception entry code, we now check that the kernel stack pointer value we're about to use isn't a userspace address. If it is, we print a nasty message and oops.

This has been tested on G5 and pSeries (both with and without hypervisor) and compile-tested on iSeries.
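The two debugging checks described in this commit message can be sketched in plain C. This is only an illustration, not the patch itself (the real checks live in the assembly exception paths). The helper names are hypothetical, and the RI mask and the kernel address boundary are assumed values chosen for the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration; not taken from the patch itself. */
#define SKETCH_MSR_RI      0x2ULL                 /* "recoverable interrupt" bit */
#define SKETCH_KERNEL_BASE 0xC000000000000000ULL  /* assumed start of kernel addresses */

/* Hypothetical helper: true if the saved SRR1 value shows that the
 * interrupted code was inside a critical (RI = 0) region. */
static bool sketch_srr1_unrecoverable(uint64_t srr1)
{
    return (srr1 & SKETCH_MSR_RI) == 0;
}

/* Hypothetical helper: true if a would-be kernel stack pointer
 * lies below the assumed kernel base, i.e. looks like a user address. */
static bool sketch_stack_pointer_is_user(uint64_t sp)
{
    return sp < SKETCH_KERNEL_BASE;
}
```

In either case the commit message says the kernel prints a message and oopses rather than continuing with bad state.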
-
Andrew Morton authored
From: Anton Blanchard <anton@samba.org> Add numa=off command line option to disable NUMA support at runtime. Useful if there are issues with our parsing of the NUMA topology or for testing NUMA gains.
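A rough userspace sketch of how a numa=off token might be recognised on a boot command line follows. The function and variable names are hypothetical, and the kernel's own option parsing works differently; this only illustrates the idea of a runtime switch.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical flag: NUMA support stays on unless numa=off is seen. */
static bool sketch_numa_enabled = true;

/* Scan a space-separated command line for the exact token "numa=off". */
static void sketch_parse_cmdline(const char *cmdline)
{
    const char *p = cmdline;

    while (*p != '\0') {
        while (*p == ' ')
            p++;                        /* skip separators */
        size_t len = strcspn(p, " ");   /* length of the current token */
        if (len == strlen("numa=off") && strncmp(p, "numa=off", len) == 0)
            sketch_numa_enabled = false;
        p += len;
    }
}
```

Matching the whole token avoids accidentally treating something like numa=offline as the switch.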
-
Andrew Morton authored
From: Anton Blanchard <anton@samba.org> Remove the old __IO_DEBUG stuff and add some nice comments courtesy of x86.
-
Andrew Morton authored
From: Stephen Rothwell <sfr@canb.auug.org.au> This patch adds the driver for the PPC64 iSeries virtual tape.
-
Alexander Viro authored
Preparation for per-mountpoint noatime, nodiratime and later - per-mountpoint r/o. Depends on file_accessed() patch, should go after it. New helper - touch_atime(mnt, dentry). It's a wrapper for update_atime() and that's where all future per-mountpoint checks will go.
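The wrapper layering described here can be sketched with mock types. Everything prefixed sketch_ is a stand-in invented for this example, not a kernel definition, and having file_accessed() forward to touch_atime() is an assumption about how the two helpers compose.

```c
/* Mock types: only the fields this sketch needs. */
struct sketch_inode    { int atime_updates; };
struct sketch_dentry   { struct sketch_inode *d_inode; };
struct sketch_vfsmount { int flags; };
struct sketch_file {
    struct sketch_vfsmount *f_vfsmnt;
    struct sketch_dentry   *f_dentry;
};

/* Stand-in for update_atime(): just counts calls. */
static void sketch_update_atime(struct sketch_inode *inode)
{
    inode->atime_updates++;
}

/* Wrapper taking the mount as well, so per-mountpoint checks
 * (noatime, nodiratime, read-only) have a single place to live. */
static void sketch_touch_atime(struct sketch_vfsmount *mnt,
                               struct sketch_dentry *dentry)
{
    (void)mnt;  /* future per-mountpoint checks would go here */
    sketch_update_atime(dentry->d_inode);
}

/* Wrapper for callers that only hold a file (assumed to forward). */
static void sketch_file_accessed(struct sketch_file *file)
{
    sketch_touch_atime(file->f_vfsmnt, file->f_dentry);
}
```

Routing every caller through one helper means a later per-mountpoint check needs to be added in only one function.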
-
Alexander Viro authored
New inlined helper - file_accessed(file) (wrapper for update_atime())
-
Alexander Viro authored
Make sure that we don't end up with symlink mounted over something (mount --bind is safe since we use LOOKUP_FOLLOW in pathname resolution there).
-
David S. Miller authored
into nuts.davemloft.net:/disk1/BK/sparc-2.6
-
William Lee Irwin III authored
-
Jason Wever authored
-
David Howells authored
This fixes a minor problem with fcntl. get_close_on_exec() uses FD_ISSET() to determine the fd state, but this is not guaranteed to be either 0 or FD_CLOEXEC. Make that explicit. Also, the argument of set_close_on_exec() is being AND'ed with the literal constant 1. Make it use an explicit FD_CLOEXEC test.
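A small userspace sketch of the normalisation this message describes is given below. The sketch_ functions are hypothetical stand-ins, not the kernel helpers; the raw bit test deliberately returns the masked word to mimic an FD_ISSET()-style macro whose result is merely nonzero.

```c
#include <fcntl.h>   /* FD_CLOEXEC */

/* Stand-in for a bitmap test that returns the masked word, not 0 or 1.
 * fd must be smaller than the number of bits in unsigned long. */
static unsigned long sketch_raw_isset(int fd, unsigned long bits)
{
    return bits & (1UL << fd);
}

/* Map "any nonzero value" explicitly to FD_CLOEXEC. */
static int sketch_get_close_on_exec(int fd, unsigned long bits)
{
    return sketch_raw_isset(fd, bits) ? FD_CLOEXEC : 0;
}

/* Test the caller's argument against FD_CLOEXEC, not a literal 1. */
static int sketch_wants_close_on_exec(unsigned long arg)
{
    return (arg & FD_CLOEXEC) != 0;
}
```

With fd 3 set in the bitmap, the raw test yields 8 while the normalised getter yields FD_CLOEXEC, which is the value fcntl callers expect to compare against.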
-
Linus Torvalds authored
(The broken macro only triggers for non-gcc compiles, but still..)
-
Bartlomiej Zolnierkiewicz authored
Original patch from Hannes Reinecke <hare@suse.de>. This is required to have modular IDE drivers announce themselves properly in modules.pcimap.
-
Linus Torvalds authored
-
Rusty Russell authored
Implement migrate_all_tasks() which moves tasks off cpu while machine is stopped.
-
Rusty Russell authored
The registration and unregistration of CPU notifiers should be done under the cpucontrol sem. They should also be exported.
-
Alexander Viro authored
include files moved to fs/hpfs/, gratuitous #include removed, stuff that doesn't have to be global made static, misindented chunk of hpfs_readdir() put in place, etc.
-
Alexander Viro authored
Fixed the locking scheme. The need for extra locking was caused by the fact that hpfs_write_inode() must update the directory entry; since HPFS directories are implemented as b-trees, we must provide protection both against rename() (to make sure that we update the entry in the right directory) and against rebalancing of the parent.

Old scheme had both deadlocks and races - to start with, we had no protection against rename()/unlink()/rmdir(), since (a) locking the parent was done without any guarantee that it would remain our parent and (b) the check that we still have a directory entry (== have positive nlink) was done before we tried to lock the parent. Moreover, iget serialization killed two steps ago gave immediate deadlocks if iget() of parent had triggered another hpfs_write_inode().

New scheme introduces another per-inode semaphore (hpfs-only, obviously) protecting the reference to parent. It's taken on rename/rmdir/unlink victims and the inode being moved by rename. Old semaphores are taken only on parent(s) and only after we grab one(s) of the new kind. hpfs_write_inode() gets the new semaphore on our inode, checks nlink and if it's non-zero grabs parent and takes the old semaphore on it.

Order among the semaphores of the same kind is arbitrary - the only function that might take more than one of the same kind is hpfs_rename() and it's serialized by VFS.

We might get away with only one semaphore, but then the ordering issues would bite us in a big way - we would have to make sure that child is always locked before parent (hpfs_write_inode() leaves no other choice) and while that's easy to do for almost all operations, rename() is a bitch - as always. And per-superblock rwsem giving rename() vs. write_inode() exclusion on hpfs would make the entire thing too baroque for my taste.

->readdir() takes no locks at all (protection against directory modifications is provided by VFS exclusion), ditto for ->lookup(). ->llseek() on directories switched to use of (VFS) ->i_sem, so it's safe from directory modifications and ->readdir() is safe from it - no hpfs locks are needed here.
-
Alexander Viro authored
We used to have GFP_KERNEL kmalloc() done by the code that held the hpfs lock on a directory. That could trigger a call of hpfs_write_inode() and deadlock; fixed by switching to GFP_NOFS. Same for hpfs inodes themselves - hpfs_write_inode() calls iget() and that could trigger both the deadlocks (avoidable with a very baroque locking scheme) and stack overflows (unavoidable unless we kill potential recursion here).
-
Alexander Viro authored
Killed the nightmares in hpfs iget handling. Since in some (fairly frequent) cases hpfs_read_inode() could avoid any IO (basically, lookup hitting a native HPFS regular file can get all data from directory entry) hpfs had a flag passed to that sucker. Said flag had been protected by a semaphore lookalike made out of spit and duct-tape and callers of iget looked like

    hpfs_lock_iget(sb, flag);
    result = iget(sb, ino);
    hpfs_unlock_iget(sb);

Since now we are calling hpfs_read_inode() directly (note that calling it without hpfs_lock_iget() would simply break) we can forget all that crap and get rid of the flag - caller knows what it wants to call. BTW, that had killed one of the last sleep_on() users in fs/*/*.
-
Alexander Viro authored
Preparation for hpfs iget locking cleanup - remaining iget() callers replaced with explicit iget_locked() + a call to hpfs_read_inode()/unlock_new_inode() if the inode is new.
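The calling pattern named in this message (look up the inode, and only if it is new, read it and unlock it) can be sketched against a toy inode cache. All sketch_ names, the cache layout, and the flag value are invented for this example; they are not the VFS or hpfs implementations.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_I_NEW      0x1u   /* assumed "freshly allocated" flag */
#define SKETCH_CACHE_SIZE 8

struct sketch_inode {
    unsigned long ino;
    unsigned int  state;
    bool          in_use;
    int           reads;   /* how many times the on-disk data was read */
};

static struct sketch_inode sketch_cache[SKETCH_CACHE_SIZE];

/* Mock lookup: return the cached inode, or a fresh one flagged as new. */
static struct sketch_inode *sketch_iget_locked(unsigned long ino)
{
    struct sketch_inode *free_slot = NULL;

    for (size_t i = 0; i < SKETCH_CACHE_SIZE; i++) {
        if (sketch_cache[i].in_use && sketch_cache[i].ino == ino)
            return &sketch_cache[i];
        if (!sketch_cache[i].in_use && free_slot == NULL)
            free_slot = &sketch_cache[i];
    }
    if (free_slot == NULL)
        return NULL;               /* cache full: treat as allocation failure */
    free_slot->in_use = true;
    free_slot->ino = ino;
    free_slot->state = SKETCH_I_NEW;
    free_slot->reads = 0;
    return free_slot;
}

static void sketch_read_inode(struct sketch_inode *inode)
{
    inode->reads++;                /* stands in for filling the inode */
}

static void sketch_unlock_new_inode(struct sketch_inode *inode)
{
    inode->state &= ~SKETCH_I_NEW;
}

/* The explicit pattern: initialise only when the inode is new. */
static struct sketch_inode *sketch_get_inode(unsigned long ino)
{
    struct sketch_inode *inode = sketch_iget_locked(ino);

    if (inode == NULL)
        return NULL;
    if (inode->state & SKETCH_I_NEW) {
        sketch_read_inode(inode);
        sketch_unlock_new_inode(inode);
    }
    return inode;
}
```

Because the caller decides what to do with a new inode, no side-channel flag has to be passed down to the read routine.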
-
Alexander Viro authored
1) common initialization for all paths in hpfs_read_inode() taken into a separate helper (hpfs_init_inode())
2) hpfs mkdir(), create(), mknod() and symlink() do not bother with iget() anymore - they call new_inode(), do initializations and insert the new inode into icache. Handling of OOM failures cleaned up - if we can't allocate an in-core inode, bail instead of corrupting the filesystem. Allocating the in-core inode early also avoids one of the deadlocks here (hpfs_write_inode() from memory pressure by kmem_cache_alloc() could deadlock on attempt to lock our directory).
3) hpfs_write_inode() marks the inode dirty again if it fails to iget() its parent directory. Again, OOM could trigger fs corruption here.
-
Alexander Viro authored
hpfs_{lock,unlock}_{2,3}inodes() killed; all places that take more than one lock have ->i_sem held by VFS on all inodes involved and all hpfs per-inode locks are of the same type. IOW, we can replace these guys with multiple hpfs_lock_inode() - order doesn't matter here.
-
Alexander Viro authored
Failure exits in hpfs/namei.c merged and cleaned up.
-
- 17 Mar, 2004 16 commits
-
-
Andrew Morton authored
From: Armin Schindler <armin@melware.de> Fixed NULL pointer reference in recv_handler()
-
Andrew Morton authored
From: Armin Schindler <armin@melware.de> Use the notifier workqueue in a cleaner way.
-
Andrew Morton authored
From: Armin Schindler <armin@melware.de> Show debug messages if debug is enabled only.
-
Andrew Morton authored
From: "Jose R. Santos" <jrsantos@austin.ibm.com> After discussing it with Neil, he felt that the original justification for taking the kernel_lock in find_exported_dentry() is no longer valid and that the lock should be safe to remove. This patch fixes an issue seen while running SpecSFS where, under memory pressure, shrinking the dcache causes find_exported_dentry() to allocate disconnected dentries that later need to be properly connected. The connecting part of the code ran with the BKL taken, which caused a sharp drop in performance during iterations, with profiles showing 75% of time spent in find_exported_dentry(). After applying the patch, time spent in the function is reduced to <1%. I have tested this on an 8-way machine with 56 filesystems for several days now with no problems using ext2, ext3, xfs and jfs.
-
Andrew Morton authored
Don't panic if the journal superblock is wrecked: just fail the mount.
-
Andrew Morton authored
From: Anton Blanchard <anton@samba.org> From: Robert Love <rml@ximian.com> arch/ppc64/Kconfig's entry for CONFIG_PREEMPT is missing the description after the "bool" statement, so the entry does not show up. Also, the help description mentions a restriction that is not [any longer] true.
-
Andrew Morton authored
From: Anton Blanchard <anton@samba.org> Sometimes we just want to pass the error up to the kernel and let it oops. X it is.
-
Andrew Morton authored
From: Anton Blanchard <anton@samba.org>

- remove now unused kernel syscalls.
- wrap recently added defines in #ifdef __KERNEL__, fixes glibc compile issue
- some of our extra syscalls used asmlinkage, some did not. Make them consistent
-
Andrew Morton authored
From: Tom Rini <trini@kernel.crashing.org> The following patch comes from Paul Mackerras. Earlier on in 2.6, arch/ppc/boot/utils/mkprep.c was changed slightly so that it would build and work on Solaris. Doing this required changing from filling out pointers to an area to filling out a local copy of the struct. However, a memcpy was left out, and the info is only needed on some machines to boot. The following adds in the missing memcpy and allows for IBM PRePs to boot from a raw floppy again.
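The kind of bug described here (filling out a local copy of a struct but never copying it into the output area) can be sketched as follows. The struct layout, field values, and function name are hypothetical and are not taken from mkprep.c.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical on-disk record; not the real mkprep layout. */
struct sketch_entry {
    uint8_t  boot_indicator;
    uint8_t  type;
    uint32_t sector_count;
};

/* Fill a local copy, then copy it into the output block.
 * Omitting the final memcpy() leaves the block untouched,
 * which is the class of mistake the message describes. */
static void sketch_write_entry(unsigned char *block, size_t offset,
                               uint32_t sectors)
{
    struct sketch_entry local;

    memset(&local, 0, sizeof(local));
    local.boot_indicator = 0x80;   /* illustrative values only */
    local.type = 0x41;
    local.sector_count = sectors;

    memcpy(block + offset, &local, sizeof(local));
}
```

Working on a local copy avoids casting a byte buffer to a struct pointer, at the cost of needing that final copy.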
-
Andrew Morton authored
From: Olaf Hering <olh@suse.de> Current Linus tree adds an extra space and dot to the mkprep options. `make all' with an smp config doesn't work. This patch fixes it.
-
Linus Torvalds authored
bk://gkernel.bkbits.net/net-drivers-2.6 into ppc970.osdl.org:/home/torvalds/v2.6/linux
-
Don Fry authored
Please apply this fix to back out an erroneous change in loopback.c. The statistics structure is allocated separately from the loopback_dev structure, and the current code overwrites something other than the statistics. In my case, it overwrote the scsi_cmd_pool structure.
-
Jeff Garzik authored
-
David S. Miller authored
into kernel.bkbits.net:/home/davem/net-2.6
-
Krzysztof Halasa authored
The attached patch fixes the problem: the de->macmode variable, meant to shadow the MacMode (CSR6) register, was used inconsistently, causing some updates to this register to be dropped. The 2.4 kernel doesn't shadow this register at all, so I removed shadowing from 2.6 as well.
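A minimal sketch of updating a register without a shadow copy follows: each change is a read-modify-write against the register itself, so there is no cached value to fall out of date. The sketch_ names and the plain variable standing in for the hardware register are assumptions for illustration, not the driver's code.

```c
#include <stdint.h>

/* Stand-in for the hardware register; a real driver would use MMIO. */
static uint32_t sketch_csr6;

static uint32_t sketch_read_macmode(void)
{
    return sketch_csr6;
}

static void sketch_write_macmode(uint32_t value)
{
    sketch_csr6 = value;
}

/* Read-modify-write with no cached copy: bits changed by any
 * earlier writer are always seen by the next update. */
static void sketch_update_macmode(uint32_t clear, uint32_t set)
{
    uint32_t value = sketch_read_macmode();

    value = (value & ~clear) | set;
    sketch_write_macmode(value);
}
```

The trade-off is an extra register read per update in exchange for never losing a write to a stale shadow.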
-
David S. Miller authored
into nuts.davemloft.net:/disk1/BK/net-2.6
-