- 03 Jun, 2002 9 commits
-
-
Robert Love authored
Looks like sys_sysinfo has not been touched in years. Among other things, it uses a global cli() for protection; I switched it to an existing rwlock. I also pulled it out of info.c and stuck it in timer.c (I chose timer.c because it shares dependencies there already).

The details:
- move sys_sysinfo to kernel/timer.c from kernel/info.c: why one small syscall got its own file is beyond me.
- delete kernel/info.c
- stop the global cli! Now grab a read_lock on xtime_lock. This is safe as we moved the write_unlock on xtime_lock down one line to cover the calculating of avenrun.
- trivial code cleanup
-
Robert Love authored
The attached patch cleans up some per-CPU code in arch/i386/kernel/smp.c that could be problematic under preemption. The first I solve with the new get_cpu interface, for the second two I explicitly disable preemption. I also changed 1 to 1UL in the shift to properly match the type.
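The 1 to 1UL change can be illustrated with a small userspace sketch. The helper names below (cpu_mask_bit, cpu_mask_bits) are hypothetical and do not come from the patch; the sketch only assumes that the CPU mask is an unsigned long, so the shifted constant should have the same type.

```c
#include <limits.h>

/* Hypothetical helper, not from the patch: build the mask bit for a CPU
 * number. The constant is 1UL so the shift is carried out in unsigned
 * long, the same type as the mask the result is combined with. */
static unsigned long cpu_mask_bit(unsigned int cpu)
{
	return 1UL << cpu;
}

/* Number of bits in the mask type; cpu must be smaller than this. */
static unsigned int cpu_mask_bits(void)
{
	return (unsigned int)(sizeof(unsigned long) * CHAR_BIT);
}
```

With a plain int constant the shift would be done in int, which is narrower than unsigned long on 64-bit targets.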
-
Robert Love authored
Might as well make it explicit... Patch is against 2.5.20, please apply. Robert Love
-
Robert Love authored
Resend of trivial bits from my scheduler tree...:
- shift cpu by 1UL not 1 to match type
- clarify various comments
- remove the barrier from preempt_schedule. This was here because I used to check need_resched before returning from preempt_schedule but we do not now (although should). The barrier ensured need_resched and preempt_count were in sync now and after an interrupt that could occur.
-
Robert Love authored
I started looking into a couple FIXMEs in kernel/capability.c and I ended up with a fairly largish patch (although not quite so many changes to object code).

First, it is unsafe to touch task->cap_* while not holding task_capability_lock. The most notable occurrence of this is sys_access which saves the current cap_* values, changes them, does its business, then restores them. In between all this they can change and then be restored to old values. Unfortunately we cannot just grab the lock here since the function can sleep - I marked this with a FIXME for now.

Second, I formalized the locking rules with task_capability_lock. I declared the lock in include/linux/capability.h so other code can grab it.

Finally, there is a whole boatload of code cleanup:
- remove conditional locking/unlocking - that is just gross
- don't pointlessly grab the read_lock twice
- add/remove/edit comments
- change some types (int -> pid_t, etc)
- static inline two small functions that are called only once each
- remove two FIXMEs
- general code cleanup for readability and performance

TODO:
- fix sys_access and other cap_* accesses
- do something about the annoying oddball 5-space indentation in kernel/capability.c !!

Patch is against 2.5.20, please apply. Robert Love
-
Robert Love authored
This patch removes the whole wq_lock_t abstraction, forcing the behavior to be that of a standard spinlock and changes all the wq_lock code in the tree appropriately. Removes lots of code - always a Good Thing to me. New behavior is same as previous behavior (USE_RW_WAIT_QUEUE_SPINLOCK unset).
-
Martin Dalecki authored
- Remove last parameter from ide_dump_status. This information is now permanently present in the device->status field, so there is no need to pass it around.
- Patch for DVD read through ide-scsi. There is the possibility that we can get request structures passed down which don't have the queue field set. At least on the BIO code path this seems to be something worth further investigation. Found by Adam J. Richter. (Jens?)
- Revert my change to the hostdata handling. I did get it wrong about the way host structures are allocated by the generic SCSI layer. It plays tricks there.
- piix driver updates by Vojtech Pavlik.
- We have an ata_out_regfile, so we should have ata_in_regfile too.
-
Martin Dalecki authored
Fix namespace clash with proc stuff and compilation warnings.
-
Linus Torvalds authored
Add "drop_inode" VFS interface to make FS operations cleaner and race-free. Remove old force_delete interface, and update filesystems that used it to use the new infrastructure.
-
- 02 Jun, 2002 31 commits
-
-
Linus Torvalds authored
-
Linus Torvalds authored
-
Linus Torvalds authored
http://linux-isdn.bkbits.net/linux-2.5.make into home.transmeta.com:/home/torvalds/v2.5/linux
-
Rusty Russell authored
-
Linus Torvalds authored
-
Linus Torvalds authored
explicitly.
-
Martin Dalecki authored
- Remove DEVICE_INTR and associated code from floppy driver.
- Salvage s390 xpram code from kernel version dependent compilation disease.
- Eliminate SET_INTR code from the places where it was used.
- Eliminate bogus support for multiple sbpcd controllers. The driver didn't even compile right now; before we could think about further supporting it at all we have to get rid of this hack first. Don't call invalidate_buffers in the release method there. Why should it be necessary?
- Resurrect sonycd535 compilation.
- Let CURRENT request macro use the same primitive as the remaining QUEUE macro in blk.h, which is still not quite right, but first things first :-).
-
Andrew Morton authored
Makes minixfs, sysvfs and ufs understand `mount -o dirsync'.
-
Andrew Morton authored
Fixes a pet peeve: the identifier "flushpage" implies "flush the page to disk". Which is very much not what the flushpage functions actually do. The patch renames block_flushpage and the flushpage address_space_operation to "invalidatepage". It also fixes a buglet in invalidate_this_page2(), which was calling block_flushpage() directly - it needs to call do_flushpage() (now do_invalidatepage()) so that the filesystem's ->flushpage (now ->invalidatepage) a_op gets a chance to relinquish any interest which it has in the page's buffers.
-
Andrew Morton authored
A patch from Hugh Dickins which fixes a couple of error-path leaks related to tmpfs (I think). Also fixes a yield()-inside-spinlock bug. It also includes code to clear the final page outside i_size on truncate. tmpfs should be returning zeroes when a truncated file is later expanded and it currently is not. Hugh is taking care of the 2.4 fix for this.
-
Andrew Morton authored
Replaces SetPageDirty() with set_page_dirty() in several places related to in-memory filesystems. SetPageDirty() is basically always the wrong thing to do. Pages should be moved to the ->dirty_pages list when dirtied so that writeback can see them. Without this change, dirty pages against in-memory filesystems would churn around on the inactive list all the time, rather than getting pushed away onto the active list. A minor efficiency thing.
-
Andrew Morton authored
Fixes a race between unlink and writeback: on the sys_sync() and pdflush paths the caller does not have a reference against the inode. So run __iget prior to dropping inode_lock. Oleg Drokin reported this and seems to believe that it fixes the crashes he was observing. But I was never able to reproduce them..
-
Andrew Morton authored
Fixes a few lock ranking bugs (and deadlocks) related to swap_list_lock(), swap_device_lock(), mapping->page_lock and mapping->private_lock.
- Cannot call block_flushpage->try_to_free_buffers() inside mapping->page_lock, because __set_page_dirty_buffers() takes ->page_lock inside ->private_lock.
- Cannot call swap_free->swap_list_lock/swap_device_lock inside mapping->page_lock because exclusive_swap_page() takes ->page_lock inside swap_info_get().

The patch also removes all the block_flushpage() calls from the swap code in favour of a direct call to try_to_free_buffers(). The theory is that the page is locked, there is no I/O underway, nobody else has access to the buffers so they MUST be freeable. A bunch of BUG() checks have been added, and unless someone manages to trigger one, the "block_flushpage() inside spinlock" problem is fixed.
-
Andrew Morton authored
Give swapper_space a ->set_page_dirty() address_space_operation. So swapcache pages do not need special-casing in set_page_dirty_buffers().
-
Andrew Morton authored
Turn on direct-to-BIO writeback for ext3 in data=writeback mode.
-
Andrew Morton authored
block_symlink() is not a "block" function at all. It is a pure pagecache/address_space function. Seeing driverfs calling it was the last straw. The patch renames it to `page_symlink()' and moves it into fs/namei.c
-
Andrew Morton authored
Remove i_wait from struct inode and hash it instead. This is a pure space-saving exercise - 12 bytes from struct inode on x86. NFS was using i_wait for its own purposes. Add a wait_queue_head_t to the fs-private inode for that. This change has been acked by Trond.
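The "hash it instead" idea can be sketched in userspace C: rather than embedding a wait queue head in every inode, a small shared table is indexed by a hash of the inode's address. All names here (wait_head, inode_stub, wait_table, inode_wait_head), the table size and the hash itself are assumptions made for illustration and are not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define WAIT_TABLE_BITS 6
#define WAIT_TABLE_SIZE (1u << WAIT_TABLE_BITS)

/* Stand-in for wait_queue_head_t; only here so the table has a type. */
struct wait_head {
	int nr_waiters;
};

/* Stand-in inode: note there is no per-inode wait queue field. */
struct inode_stub {
	unsigned long i_ino;
};

/* One shared table replaces the per-inode wait queue heads. */
static struct wait_head wait_table[WAIT_TABLE_SIZE];

/* Map an inode to its shared wait queue head by hashing its address. */
static struct wait_head *inode_wait_head(const struct inode_stub *inode)
{
	uintptr_t key = (uintptr_t)inode;

	/* Placeholder hash: fold upper bits down, then mask to the table size. */
	key ^= key >> WAIT_TABLE_BITS;
	key ^= key >> (2 * WAIT_TABLE_BITS);
	return &wait_table[key & (WAIT_TABLE_SIZE - 1)];
}
```

In a scheme like this, unrelated inodes can land on the same table slot, so a woken waiter has to recheck its own condition before proceeding.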
-
Andrew Morton authored
Implement buffer_boundary() for ext3. buffer_boundary() is an I/O scheduling hint which the filesystem's get_block() function passes up to the BIO assembly code. It is described in fs/mpage.c The time to read 1,000 52 kbyte files goes from 8.6 seconds down to 2.9 seconds. 52 kbytes is the worst-case size.
-
Andrew Morton authored
Speeds up generic_file_write() by not calling mark_inode_dirty() when the mtime and ctime didn't change.

There may be concerns over the fact that this restricts mtime and ctime updates to one-second resolution. But the interface doesn't support that anyway - all the filesystem knows is that its dirty_inode() superop was called. It doesn't know why. So filesystems which support high-resolution timestamps already need to make their own arrangements. We need an update_mtime i_op to support those properly.

Time to write a one megabyte file one-byte-at-a-time:

Before:
    ext3:     24.8 seconds
    ext2:      4.9 seconds
    reiserfs: 17.0 seconds

After:
    ext3:     22.5 seconds
    ext2:      4.8 seconds
    reiserfs: 11.6 seconds

Not much improvement because we're also calling expensive mark_inode_dirty() functions when i_size is expanded. So compare the overwrite case:

    time dd if=/dev/zero of=foo bs=1 count=1M conv=notrunc

    ext3 before: 20.0 seconds
    ext3 after:   9.7 seconds
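The optimisation described above, skipping the dirtying call when the timestamps would not change, can be sketched in userspace C. The names inode_stub, mark_inode_dirty_stub and update_times are hypothetical stand-ins rather than the kernel's code, and time_t is assumed to give the one-second resolution the message refers to.

```c
#include <stdbool.h>
#include <time.h>

/* Stand-in inode with one-second timestamps. */
struct inode_stub {
	time_t i_mtime;
	time_t i_ctime;
	int dirty_calls;	/* counts simulated mark_inode_dirty() calls */
};

/* Stand-in for the expensive call the patch tries to avoid. */
static void mark_inode_dirty_stub(struct inode_stub *inode)
{
	inode->dirty_calls++;
}

/* Update mtime/ctime on write; dirty the inode only if a value changed.
 * Returns true when the inode was marked dirty. */
static bool update_times(struct inode_stub *inode, time_t now)
{
	if (inode->i_mtime == now && inode->i_ctime == now)
		return false;	/* same second: nothing to write back */

	inode->i_mtime = now;
	inode->i_ctime = now;
	mark_inode_dirty_stub(inode);
	return true;
}
```

Many one-byte writes inside the same second then cost a single dirtying call, which is the effect the overwrite benchmark measures.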
-
Andrew Morton authored
First some terminology: this patch introduces a kernel-wide `pgoff_t' type. It is the index of a page into the pagecache. The thing at page->index. For most mappings it is also the offset of the page into that mapping. This type has a very distinct function in the kernel and it needs a name. I don't have any particular plans to go and migrate everything so we can support 64-bit pagecache indices on x86, but this would be the way to do it.

This patch improves the packing density of swapcache pages in the radix tree. A swapcache page is identified by the `swap type' (indexes the swap device) and the `offset' (into that swap device). These two numbers are encoded into a `swp_entry_t' machine word in arch-specific code because the resulting number is placed into pagetables in a form which will generate a fault.

The kernel also needs to generate a pgoff_t for that page to index it into the swapper_space radix tree. That pgoff_t is usually bitwise-identical to the swp_entry_t. That worked OK when the pagecache was using a hash. But with a radix tree, it produces catastrophically bad results.

x86 (and many other architectures) place the `type' field into the low-order bits of the swp_entry_t. So *all* swapcache pages are basically identical in the eight low-order bits. This produces a very sparse radix tree for swapcache. I'm observing packing densities of 1% to 2%: so the typical 128-slot radix tree node has only one or two pages in it.

The end result is that the kernel needs to allocate approximately one new radix-tree node for each page which is added to the swapcache. So no wonder we're having radix-tree node exhaustion during swapout! (It's actually quite encouraging that the kernel works as well as it does).

The patch changes the encoding of the swp_entry_t so that its most-significant bits contain the `type' field and the least-significant bits contain the `offset' field, right-aligned. That is: the encoding in swp_entry_t is now arch-independent.

The new file <linux/swapops.h> has conversion functions which convert the swp_entry_t to and from its machine pte representation. Packing density in the swapper_space mapping goes up to around 90% (observed) and the kernel is tons happier under swap load.

An alternative approach would be to create new conversion functions which convert an arch-specific swp_entry_t to and from a pgoff_t. I tried that. It worked, but I liked it less.
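The new layout, `type' in the most-significant bits and `offset' right-aligned in the least-significant bits, can be sketched as follows. This is a userspace illustration only: the 5-bit type width and the names swp_entry_sketch_t, make_entry, entry_type and entry_offset are assumptions and are not the contents of <linux/swapops.h>.

```c
#include <limits.h>

/* Assumed width of the type field; the real value is not given above. */
#define SWP_TYPE_BITS	5
#define SWP_WORD_BITS	(sizeof(unsigned long) * CHAR_BIT)
#define SWP_OFFSET_BITS	(SWP_WORD_BITS - SWP_TYPE_BITS)
#define SWP_OFFSET_MASK	((1UL << SWP_OFFSET_BITS) - 1)

/* Hypothetical stand-in for swp_entry_t. */
typedef struct {
	unsigned long val;
} swp_entry_sketch_t;

/* Pack type into the top bits and offset, right-aligned, into the rest. */
static swp_entry_sketch_t make_entry(unsigned long type, unsigned long offset)
{
	swp_entry_sketch_t e;

	e.val = (type << SWP_OFFSET_BITS) | (offset & SWP_OFFSET_MASK);
	return e;
}

static unsigned long entry_type(swp_entry_sketch_t e)
{
	return e.val >> SWP_OFFSET_BITS;
}

static unsigned long entry_offset(swp_entry_sketch_t e)
{
	return e.val & SWP_OFFSET_MASK;
}
```

With this packing, consecutive offsets on one swap device give consecutive entry values, so when the value is used as the radix tree index neighbouring pages fall into the same node instead of one node each.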
-
Andrew Morton authored
Remove some unused PageSkip() macros. Presumably leftovers from PG_skip which isn't there any more.
-
Andrew Morton authored
A common and very subtle bug is to use list_heads which aren't on any lists. It causes kernel memory corruption which is observed long after the offending code has executed. The patch nulls out the dangling pointers so we get a nice oops at the site of the buggy code.
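A minimal userspace sketch of the idea: when an entry is unlinked, its own pointers are set to NULL, so a later attempt to use it as if it were still on a list dereferences NULL right away instead of silently corrupting memory. The names list_node, list_init, list_add_front and list_del_poison are hypothetical and this is not the patch's code.

```c
#include <stddef.h>

/* Minimal doubly linked circular list, in the style of list_head. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert node right after head. */
static void list_add_front(struct list_node *node, struct list_node *head)
{
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

/* Unlink entry and null out its now-dangling pointers. */
static void list_del_poison(struct list_node *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	entry->next = NULL;
	entry->prev = NULL;
}
```

Calling list_del_poison() a second time on the same entry would fault on the first pointer dereference, which corresponds to the "nice oops at the site of the buggy code" described above.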
-
Jens Axboe authored
I missed this one in the last patch I sent to you.
-
Jens Axboe authored
Too much copy'n paste between 2.4 and 2.5 code base, attached patch on top of the previous block tag fixes makes it work/compile again. Sorry about that.
-
Jens Axboe authored
A buglet and a few adjustments.
-
Jens Axboe authored
This should be the last of tq_disk, at least the trivial ones. md still has some queue_task references, I'll let Ingo/Neil clean those up. suspend is still broken, it was broken before too though. I guess Pavel will want to fix that. Also, I've documented the plug functions.
-
Martin Dalecki authored
- PPC compilation fix by Paul Mackerras.
- Various fixes by Bartek:
  - fix ata_irq_enable() and ata_reset() for legacy ATA-1 devices
  - in start_request() for REQ_DRIVE_ACB: a) don't run ->prehandler() twice b) return ata_taskfile() value
-
Martin Dalecki authored
- Don't use ata_taskfiles cmd field for drive status reporting, we can now simply use drive->status instead.
- Unify command type parser entries which could be unified due to the unification of corresponding interrupt handlers.
- Eliminate reading parameter from ata_do_udma(). We have this information already in the rq. This allows us to merge several methods.
- Rename XXX_udma to udma_setup, since we have finally settled on these semantics.
- Simplify tons of host chip code by removing wrapper functions.
-
Martin Dalecki authored
- Sanitize the handling of the ioctl's and fix a bug on the way in dealing with the WIN_SMART command where arguments were exchanged.
- Finally sanitize ioctl further until it turned out that we could get rid of the special request type REQ_DRIVE_CMD entirely. We are now consistently using REQ_DRIVE_ACB. One hidden code path less again!
- Realize that ide_end_drive_cmd can be on the REQ_DRIVE_ACB only for ioctl() to a disk. Eliminate its usage from device type driver modules.
- Remove command member from struct hd_drive_task_hdr and place it in struct ata_taskfile. It is not common between the normal register file and HOB. We will have to introduce some helper functions for particular command types.
-
Martin Dalecki authored
- Fix typo in sparc_v9 code, in ns87415, just introduced.
- Eliminate unnecessary struct hd_drive_hob_hdr: those are in reality precisely the same registers as usual.
- Eliminate control_t, a type used nowhere.
- Unfold ide_init_drive_cmd() at the places where it's used. This makes it obvious that REQ_DRIVE_CMD only gets used on the ioctl command path.
-
Martin Dalecki authored
- Move ide_fixstring() from ide.c to probe.c, since this is the place where it's most used.
- Remove GET_STAT() - it's not used any longer.
- Remove last parameter of ide_error. Rename it to ata_error().
- Don't use ide_fixstring in qd65xx.c host chip driver. The model name is already fixed in probe.c.
- Invent ata_irq_enable() for the handling of the trice nIEN bit of the control register. Consistently use ch->intrproc method every time we toggle this bit. This simply wasn't the case before!
- Disable interrupts on a previous channel only when we share them indeed.
- Eliminate simple drive command handling function drive_cmd.
- Simplify the ioctl handler. Move it to ioctl, since that's the only place where it's actually used.
-