- 02 Feb, 2003 1 commit
-
-
Andrew Morton authored
The second quota locking fix. Sorry, I seem to have misplaced the changelog.
-
- 01 Jan, 2003 2 commits
-
-
Andrew Morton authored
I've been carrying this since Jan sent it out a month or two ago. I don't know if anyone has tested it though. The sort of people who use quotas tend to like nice stable kernels. I read through it, but can't say that I know enough about quotas to know if it makes sense. The wait_on_dquot() synchronisation is a bit odd. I do need to do a round of stability testing with this and ext3 - the interaction between quotas and ext3 is an area where we've had deadlocks in the past. But the quota locking is definitely looking crufty, and I'd suggest that we run with this.

Patch from Jan Kara <jack@suse.cz>

"I'm resending you the patch with new quota SMP locking. The patch removes BKL and replaces it with two spinlocks protecting quota lists and data stored in dquot structures. Also non-SMP locking was changed a bit to make SMP locking easier (eg. we got rid of not very nice dq_dup_ref counters). The patch is against 2.5.48 but applies well also to 2.5.49. Would you please apply the patch?"

- Change dqoff_sem from a semaphore to an rwsem.
- Convert dqi_flags from an int to a ulong and use test_bit/set_bit rather than &/|
- The various exported quota operations now run without lock_kernel(). This means that things like DQUOT_ALLOC_SPACE no longer take lock_kernel() in our high-performance filesystems. Nice.
- Replace lock_kernel() in the quota code with two quota-private global locks.
- Replace all the open-coded waitqueue management with a semaphore (wait_on_dquot())
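The dqi_flags conversion mentioned in this commit can be pictured with a small userspace sketch. The helper names, the struct and the bit numbers below are hypothetical stand-ins invented for illustration; they are not the kernel's definitions, and unlike the kernel's set_bit()/test_bit() they are not atomic.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical userspace stand-ins for set_bit()/clear_bit()/test_bit().
 * Assumption: plain, NON-atomic read-modify-write, unlike the kernel. */
#define DEMO_BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Illustrative bit numbers only, not the kernel's actual values. */
enum { DEMO_DQF_BUSY_B = 0, DEMO_DQF_DIRTY_B = 1 };

struct demo_dqinfo {
    unsigned long dqi_flags;    /* previously an int combined with & and | */
};

static void demo_set_bit(unsigned int nr, unsigned long *addr)
{
    assert(nr < DEMO_BITS_PER_ULONG);
    *addr |= 1UL << nr;
}

static void demo_clear_bit(unsigned int nr, unsigned long *addr)
{
    assert(nr < DEMO_BITS_PER_ULONG);
    *addr &= ~(1UL << nr);
}

static int demo_test_bit(unsigned int nr, const unsigned long *addr)
{
    assert(nr < DEMO_BITS_PER_ULONG);
    return (int)((*addr >> nr) & 1UL);
}
```

Callers name a bit number rather than a mask, which is what makes it possible to swap in atomic per-bit operations without touching the call sites.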
-
Christoph Hellwig authored
fs.h only needs the forward-declaration of struct statfs
-
- 21 Dec, 2002 1 commit
-
-
Andrew Morton authored
Running a `mount -o remount' against ext3 deadlocks if there is heavy write activity. It's a sort of AB/BA deadlock caused by calling log_wait_commit() under lock_super(). The caller holds lock_super() and is waiting for a commit, but the commit cannot complete because lock_super() is also used in the block allocator. The way we fixed this in the past is to drop the superblock lock inside ext3. The way this patch fixes it is to arrange for lock_super() to not be held around the ->sync_fs() call. Also: sync_filesystems is on the sys_sync() path and is racy wrt unmount. Check sb->s_root after taking sb->s_umount.
-
- 14 Dec, 2002 2 commits
-
-
Andrew Morton authored
A little cleanup suggested by Chris Mason or Al Viro. Quite a number of codepaths are testing whether a superblock has a non-null ->s_op pointer. We can remove all those by making sure that all superblocks have a valid ->s_op.
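A minimal sketch of the idea, assuming a shared all-NULL operations table is installed at allocation time. All names here (demo_super_block, demo_default_op and so on) are hypothetical and only mirror the shape of the kernel structures.

```c
#include <assert.h>
#include <stddef.h>

struct demo_super_block;

struct demo_super_operations {
    void (*write_super)(struct demo_super_block *sb);
};

struct demo_super_block {
    const struct demo_super_operations *s_op;
    int writes;                 /* counts write_super calls, for the demo */
};

/* Shared table with every method left NULL (assumed default). */
static const struct demo_super_operations demo_default_op = {
    .write_super = NULL,
};

static void demo_count_write(struct demo_super_block *sb)
{
    sb->writes++;
}

/* Example table for a filesystem that does implement the method. */
static const struct demo_super_operations demo_fs_op = {
    .write_super = demo_count_write,
};

/* Every superblock starts out with a valid, non-NULL s_op. */
static void demo_alloc_super(struct demo_super_block *sb)
{
    sb->s_op = &demo_default_op;
    sb->writes = 0;
}

/* Callers test only the method pointer; no "sb->s_op &&" is needed. */
static void demo_write_super(struct demo_super_block *sb)
{
    if (sb->s_op->write_super)
        sb->s_op->write_super(sb);
}
```

The per-method NULL test remains; only the test on the table pointer itself goes away.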
-
Andrew Morton authored
This is infrastructure for fixing the journalled-data ext3 unmount data loss problem. It was sent for comment to linux-fsdevel a week ago; there was none.

Add a `sync_fs' superblock operation whose mandate is to perform filesystem-specific operations to ensure a successful sync. It is called in two places:

1: fsync_super() - for umount.
2: sys_sync() - for global sync.

In the sys_sync() case we call all the ->write_super() methods first. write_super() is an async flushing operation. It should not block. After that, we call all the ->sync_fs functions. This is independent of the state of s_dirt! That was all confused up before, and in this patch ->write_super() and ->sync_fs() are quite separate.

With ext3 as an example, the initial ->write_super() will start a transaction, but will not wait on it. (But only if s_dirt was set!) The first ->sync_fs() call will get the IO underway. The second ->sync_fs() call will wait on the IO.

And we really do need to be this elaborate, because all the testing of s_dirt in there makes ->write_super() an unreliable way of detecting when the VFS is trying to sync the filesystem.
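The sys_sync() ordering described in this commit can be sketched as three passes over the superblocks. This is a userspace model only: the demo_ types, the event log and the `wait` argument to the sync_fs stand-in are assumptions made for illustration, not the patch's actual code.

```c
#include <assert.h>
#include <stddef.h>

enum demo_event { DEMO_WRITE_SUPER, DEMO_SYNC_START, DEMO_SYNC_WAIT };

#define DEMO_MAX_EVENTS 16

struct demo_log {
    enum demo_event ev[DEMO_MAX_EVENTS];
    size_t n;
};

struct demo_sb {
    int s_dirt;
    struct demo_log *log;       /* shared log, records call order */
};

static void demo_record(struct demo_log *log, enum demo_event e)
{
    assert(log->n < DEMO_MAX_EVENTS);
    log->ev[log->n++] = e;
}

/* Async flush: does something only if s_dirt was set, never waits. */
static void demo_write_super(struct demo_sb *sb)
{
    if (!sb->s_dirt)
        return;
    sb->s_dirt = 0;
    demo_record(sb->log, DEMO_WRITE_SUPER);
}

/* Called regardless of s_dirt. wait == 0 starts IO, wait != 0 waits on it. */
static void demo_sync_fs(struct demo_sb *sb, int wait)
{
    demo_record(sb->log, wait ? DEMO_SYNC_WAIT : DEMO_SYNC_START);
}

/* Pass 1: write_super everywhere. Pass 2: start IO. Pass 3: wait on IO. */
static void demo_sys_sync(struct demo_sb *sbs, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++)
        demo_write_super(&sbs[i]);
    for (i = 0; i < count; i++)
        demo_sync_fs(&sbs[i], 0);
    for (i = 0; i < count; i++)
        demo_sync_fs(&sbs[i], 1);
}
```

In this model a clean superblock skips the write_super step but still receives both sync_fs calls, which is the separation the commit message argues for.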
-
- 11 Dec, 2002 1 commit
-
-
Christoph Hellwig authored
-
- 16 Nov, 2002 1 commit
-
-
Christoph Hellwig authored
This is a preparation to get rid of the implicit includes in dcache.h and fs_struct.h.
-
- 28 Oct, 2002 1 commit
-
-
Alexander Viro authored
* do_open() cleaned up
* we always pick block_device_operations from gendisk->fops now
* register_blkdev() just stores the name of driver, nothing more
* ->bd_op and ->bd_queue removed - we have that in gendisk
* get_blkfops() is gone
-
- 17 Oct, 2002 2 commits
-
-
Greg Kroah-Hartman authored
-
Greg Kroah-Hartman authored
This is needed for the next patches that change the way the security calls work.
-
- 10 Sep, 2002 1 commit
-
-
Andrew Morton authored
The patch fixes a few problems in the writer throttling code. Mainly in the situation where a single large file is being written out. That file could be parked on sb->locked_inodes due to pdflush writeback, and the writer throttling path coming out of balance_dirty_pages() forgot to look for inodes on ->locked_inodes. The net effect was that the amount of dirty memory was exceeding the limit set in /proc/sys/vm/dirty_async_ratio, possibly to the point where the system gets seriously choked. The patch removes sb->locked_inodes altogether and teaches the throttling code to look for inodes on sb->s_io as well as sb->s_dirty. Also, just leave unwritten dirty pages on mapping->io_pages, and unwritten dirty inodes on sb->s_io. Putting them back onto ->dirty_pages and ->dirty_inodes was fairly pointless, given that both lists need to be looked at.
-
- 10 Aug, 2002 2 commits
-
-
Alexander Viro authored
Small, but tricky: fix for check_disk_change() deadlocks. What we do is

a) opening block device shifted from check_partition() to grok_partitions(); check_partition() takes opened struct block_device.

b) all callers of check_disk_change() fall in two groups - ones that are called only from some ->open() and ones that are _never_ called from ->open(). There is no middle ground. We split the thing in two functions - check_disk_change() for the first class and full_check_.... for the second. The former (ones inside ->open()) doesn't touch partition tables but marks the bdev as "had been invalidated". At the end of do_open() we check if bdev is marked and call wipe_partitions()/check_partition() if it is - at that point bdev is fully set up and ready.

c) ->bd_part_sem kludge is gone - we use ->bd_sem instead. That is, do_open() on a partition grabs ->bd_sem on entire disk and picks partition data while under it; do_open() on entire disk rereads partition if needed before dropping ->bd_sem (right before dropping it); BLKRRPART does trylock on ->bd_sem and then checks ->bd_part_count - same logic as before, except that we use ->bd_sem instead of ->bd_part_sem.

That kills recursive open(), gives us the same exclusion rules as we had and makes sure that actual IO (including rereading partition tables) is done only when we are ready to do it.

It actually sounds a lot nastier than it is. do_open() is one sick puppy right now, but we have everything in one place and _out_ of drivers (and 20-odd equally sick puppies are gone from them, along with about the same number of races). Now we are almost ready to clean it up for good - all that remains to do before that is to get the rest of drivers (cciss, DAC960, i2o and a couple of ancients - xd and acsi) using per-disk gendisks. Then most of that crap will disappear.

BTW, the only generic ioctl remaining in the drivers is HDIO_GETGEO - a lot of foo_ioctl() starts with if (cmd != HDIO_GETGEO) return -EINVAL; ;-)
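Point (b) of this commit, deferring the partition rescan until the end of do_open(), might look roughly like the following userspace model. The struct fields and demo_ functions are hypothetical; the real code also takes ->bd_sem and does actual IO, which is left out here.

```c
#include <assert.h>

struct demo_bdev {
    int media_changed;  /* set by "hardware" when the disk was swapped */
    int invalidated;    /* the "had been invalidated" mark */
    int rescans;        /* how many times partitions were re-read */
};

/* Variant used inside ->open(): only marks the bdev, no partition IO. */
static int demo_check_disk_change(struct demo_bdev *bdev)
{
    if (!bdev->media_changed)
        return 0;
    bdev->media_changed = 0;
    bdev->invalidated = 1;
    return 1;
}

/* Stand-in for a driver's ->open() method. */
static int demo_driver_open(struct demo_bdev *bdev)
{
    demo_check_disk_change(bdev);
    return 0;
}

/* The rescan happens here, once the bdev is fully set up. */
static int demo_do_open(struct demo_bdev *bdev)
{
    int err = demo_driver_open(bdev);

    if (err)
        return err;
    if (bdev->invalidated) {
        bdev->invalidated = 0;
        bdev->rescans++;        /* models wipe + re-read of partitions */
    }
    return 0;
}
```

Because the driver's open only sets a flag, nothing in it needs to reopen the device, which is how the recursion goes away in this model.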
-
Alexander Viro authored
check_disk_change() converted to passing struct block_device. Old variant is still needed for a couple of places; wrapper is provided (__check_disk_change(kdev)). do_open() logics with setting ->bd_op sanitized - now we do that before calling ->open().
-
- 22 Jul, 2002 1 commit
-
-
Stephen D. Smalley authored
The below patch adds the filesystem-related LSM hooks, specifically the super_block, inode, and file hooks, to the 2.5.27 kernel.
-
- 11 Jun, 2002 5 commits
-
-
Alexander Viro authored
->s_dev is switched to dev_t. Everything that uses it uses it as a number - i.e. all instances are either minor() or kdev_t_to_nr().
-
Alexander Viro authored
get_super() split in two functions - get_super(bdev) and user_get_super(dev_t). Callers that used get_super() to get superblock by (dev_t) syscall argument switched to the latter; the rest had block_device in question and switched to passing it.
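As a rough illustration of the split, the sketch below looks a superblock up either by block device pointer or by device number. The array-based table and every demo_ name are assumptions for the example; the kernel walks a locked list instead.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int demo_dev_t;

struct demo_block_device {
    demo_dev_t bd_dev;
};

struct demo_super_block {
    struct demo_block_device *s_bdev;
    demo_dev_t s_dev;
};

/* For in-kernel callers that already hold the block device. */
static struct demo_super_block *
demo_get_super(struct demo_super_block *sbs, size_t n,
               const struct demo_block_device *bdev)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (sbs[i].s_bdev == bdev)
            return &sbs[i];
    return NULL;
}

/* For callers that only have a device number from a syscall argument. */
static struct demo_super_block *
demo_user_get_super(struct demo_super_block *sbs, size_t n, demo_dev_t dev)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (sbs[i].s_dev == dev)
            return &sbs[i];
    return NULL;
}
```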
-
Alexander Viro authored
added bdev_read_only() - analog of is_read_only() using block_device. Almost all callers of is_read_only() converted.
-
Alexander Viro authored
sget()/generic_shutdown_super() cleaned up; fixed error handling in sget()
-
Alexander Viro authored
FS_NOMOUNT is gone, initialization for pseudo-filesystems (bdev, pipe, sock) switched to use of a common helper.
-
- 23 May, 2002 1 commit
-
-
Christoph Hellwig authored
Make the 144 files in fs/ that need it include buffer_head.h directly. Again some uses in the VFS files are layering violations and need to be addressed later. The new include statement gives a nice grep pattern for that :)
-
- 22 May, 2002 1 commit
-
-
Christoph Hellwig authored
Currently fs.h is full of unrelated declarations and included in almost any source file. Thus it makes sense to split out certain aspects that are only used by a few users. This patch starts with the namei/path lookup interface and splits it into <linux/namei.h> which is now directly included by the 24 files that actually need it.
-
- 20 May, 2002 2 commits
-
-
Christoph Hellwig authored
The locks.h header contained some hand-crafted locking routines from the pre-SMP days. In 2.5 only lock_super/unlock_super are left, guarded by a number of completely unrelated (!) includes. This patch moves lock_super/unlock_super to fs.h, which defines the struct super_block that they operate on, removes locks.h and updates all callers to not include it and add the missing, previously nested includes where needed.
-
Jan Kara authored
This is probably the largest chunk in quota patches. It removes old quotactl interface and implements new one. New interface should not need arch specific conversions so they are removed. All quota interface stuff is moved to quota.c so we can easily separate things which should be compiled even if quota is disabled (mainly because XFS needs some interface even if standard VFS quota is disabled). Callbacks to filesystem on quota_on() and quota_off() are implemented (needed by Ext3), quota operations callbacks are now set in super.c on superblock initialization and not on quota_on(). This way it starts to make sense to have callbacks on alloc_space(), alloc_inode() etc. as filesystem can override them on read_super(). This will be used later for implementing journalled quota.
-
- 19 May, 2002 1 commit
-
-
Andrew Morton authored
Fixes a performance problem with many-small-file writeout.

At present, files are written out via their mapping and their indirect blocks are written out via the blockdev mapping. As we know that indirects are disk-adjacent to the data it is better to start I/O against the indirects at the same time as the data.

The delalloc patches have code in ext2_writepage() which recognises when the target page->index was at an indirect boundary and does an explicit hunt-and-write against the neighbouring indirect block. Which is ideal. (Unless the file was dirtied seekily and the page which is next to the indirect was not dirtied).

This patch does it the other way: when we start writeback against a mapping, also start writeback against any dirty buffers which are attached to mapping->private_list. Let the elevator take care of the rest.

The patch makes a number of tuning changes to the writeback path in fs-writeback.c. This is very fiddly code: getting the throughput tuned, getting the data-integrity "sync" operations right, avoiding most of the livelock opportunities, getting the `kupdate' function working efficiently, keeping it all at least somewhat comprehensible.

An important intent here is to ensure that metadata blocks for inodes are marked dirty before writeback starts working the blockdev mapping, so all the inode blocks are efficiently written back.

The patch removes try_to_writeback_unused_inodes(), which became unreferenced in vm-writeback.patch.

The patch has a tweak in ext2_put_inode() to prevent ext2 from incorrectly dropping its preallocation window in response to a random iput().

Generally, many-small-file writeout is a lot faster than 2.5.7 (which is linux-before-I-futzed-with-it). The workload which was optimised was

tar xfz /nfs/mountpoint/linux-2.4.18.tar.gz ; sync

on mem=128M and mem=2048M.

With these patches, 2.5.15 is completing in about 2/3 of the time of 2.5.7. But it is only a shade faster than 2.4.19-pre7. Why is 2.5.7 so much slower than 2.4.19? Not sure yet.

Heavy dbench loads (dbench 32 on mem=128M) are slightly faster than 2.5.7 and significantly slower than 2.4.19. It appears that the cause is poor read throughput at the later stages of the run. Because there are background writeback threads operating at the same time.

The 2.4.19-pre8 write scheduling manages to stop writeback during the latter stages of the dbench run in a way which I haven't been able to sanely emulate yet. It may not be desirable to do this anyway - it's optimising for the case where the files are about to be deleted. But it would be good to find a way of "pausing" the writeback for a few seconds to allow readers to get an interval of decent bandwidth.

tiobench throughput is basically the same across all recent kernels. CPU load on writes is down maybe 30% in 2.5.15.
-
- 10 May, 2002 1 commit
-
-
Neil Brown authored
This removes the old alternates to export_operations for exporting a filesystem. It removes fh_to_dentry, dentry_to_fh, and s_nfsd_free_path_sem. It also removes a lot of code. The fs/ntfs change is because it was setting fh_to_dentry and dentry_to_fh (which no longer exist) to NULL.
-
- 01 May, 2002 2 commits
-
-
Alexander Viro authored
- switch block_size() to struct block_device *.
-
Alexander Viro authored
- switch set_blocksize() to struct block_device *.
-
- 25 Apr, 2002 1 commit
-
-
Alexander Viro authored
- bdevname() switched to struct block_device *. Old variant (taking kdev_t) renamed to __bdevname() (very few callers remain). This allows us to drop ->b_dev conveniently - it's duplicated by ->b_bdev and most of the remaining users were bdevname(bh->b_dev) in various places.
-
- 15 Apr, 2002 1 commit
-
-
Neil Brown authored
Prepare for new export_operations interface (for filehandle lookup):
- define d_splice_alias and d_alloc_anon.
- define shrink_dcache_anon for removing anonymous dentries
- modify d_move to work with anonymous dentries (IS_ROOT dentries)
- modify d_find_alias to avoid anonymous dentries where possible as d_splice_alias and d_alloc_anon use this
- put in place infrastructure for s_anon allocation and cleaning
- replace a piece of code that is in nfsfh, reiserfs and fat with a call to d_alloc_anon
- Rename DCACHE_NFSD_DISCONNECTED to DCACHE_DISCONNECTED
- Add documentation at Documentation/filesystems/Exporting
-
- 10 Apr, 2002 1 commit
-
-
Alexander Viro authored
Fixes races in jffs2_get_sb() - current code has a window when two mounts of the same mtd device can miss each other, resulting in two active instances of jffs2 fighting over the same device.
-
- 02 Apr, 2002 1 commit
-
-
Alexander Viro authored
get_sb_bdev() stores original block size in ->s_old_blocksize and kill_block_super() restores it. This kills 99% of crap with "oh, I've mounted/umounted that device and its behaviour had changed" (remaining 1% can be dealt with in pretty similar ways; ideally I'd like to see ioctls that get/set block size dead and gone).
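The save-and-restore pattern in this commit is simple enough to model directly. The sketch below is a userspace approximation with hypothetical demo_ names; the real functions do far more than adjust a block size.

```c
#include <assert.h>
#include <stddef.h>

struct demo_bdev {
    int block_size;
};

struct demo_sb {
    struct demo_bdev *s_bdev;
    int s_old_blocksize;        /* block size the device had before mount */
};

/* Mount side: remember the device's block size, then change it. */
static void demo_get_sb_bdev(struct demo_sb *sb, struct demo_bdev *bdev,
                             int fs_blocksize)
{
    sb->s_bdev = bdev;
    sb->s_old_blocksize = bdev->block_size;
    bdev->block_size = fs_blocksize;
}

/* Unmount side: put the original block size back. */
static void demo_kill_block_super(struct demo_sb *sb)
{
    sb->s_bdev->block_size = sb->s_old_blocksize;
    sb->s_bdev = NULL;
}
```

After a mount/umount cycle the device reports the same block size it had beforehand, which is the behaviour the commit message is after.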
-
- 18 Mar, 2002 1 commit
-
-
Alexander Viro authored
file_system_type-related code moved from fs/super.c to fs/filesystems.c
-
- 12 Mar, 2002 2 commits
-
-
Alexander Viro authored
kill_super() and deactivate_super() merged. Next step will be to export these suckers - after that we will be finally done with infrastructure for filesystems with nontrivial ->get_sb().
-
Alexander Viro authored
New helper - sget(). get_sb_bdev() and get_anon_super() switched to using it. Basically, it's get_anon_super() done right (and get_anon_super() itself will probably die).
-
- 11 Mar, 2002 5 commits
-
-
Alexander Viro authored
Grr... When loop in get_sb_bdev() had been switched from global list of superblocks to per-type one, we should have switched from sb_entry(p) (aka. list_entry(p, struct super_block, s_list)) to list_entry(p, struct super_block, s_instances). As it is, we end up with false negatives all the time. I.e. second mount from the same block device with the same type gives a new superblock. With obvious nasty results... This fixes that.
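Why the wrong member name matters can be shown with a cut-down list_entry(). The struct layout below is illustrative only (the field order and the s_dev field are assumptions); the point is that subtracting the offset of the wrong list member yields a pointer that is not the start of the containing structure.

```c
#include <assert.h>
#include <stddef.h>

struct demo_list_head {
    struct demo_list_head *next, *prev;
};

/* Same idea as the kernel's list_entry(): step back from the embedded
 * list node to the start of the structure that contains it. */
#define demo_list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_super_block {
    struct demo_list_head s_list;       /* link in the global list */
    int s_dev;
    struct demo_list_head s_instances;  /* link in the per-type list */
};

/* Correct: p is a node on the per-type list, so use s_instances. */
static struct demo_super_block *demo_from_instances(struct demo_list_head *p)
{
    return demo_list_entry(p, struct demo_super_block, s_instances);
}

/* The bug: p is on the per-type list but is decoded as if it were s_list.
 * The result points into the middle of the real superblock. */
static struct demo_super_block *
demo_from_instances_buggy(struct demo_list_head *p)
{
    return demo_list_entry(p, struct demo_super_block, s_list);
}
```

Any field comparison made through the mis-decoded pointer reads unrelated memory, so in a lookup loop the match fails and a fresh superblock gets allocated.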
-
Alexander Viro authored
bdev filesystems switched. Changes documented in Locking and porting.
-
Alexander Viro authored
The rest of nodev filesystems switched.
-
Alexander Viro authored
FS_LITTER filesystems (ramfs-like) switched to use of ->kill_sb(). FS_LITTER is gone.
-
Alexander Viro authored
New method - ->kill_sb(). It will eventually replace current fs/super.c::shutdown_super() - i.e. it's called when fs driver must shut the superblock down, remove it from all lists, etc.
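A per-filesystem shutdown hook of this kind could be sketched as follows. Everything here is a hypothetical model: the demo_ types, the generic helper and the "litter" cleanup flag only illustrate how a filesystem type might layer its own teardown on top of a common one.

```c
#include <assert.h>
#include <stddef.h>

struct demo_sb;

struct demo_fs_type {
    const char *name;
    void (*kill_sb)(struct demo_sb *sb);    /* per-type shutdown method */
};

struct demo_sb {
    const struct demo_fs_type *s_type;
    int on_lists;       /* 1 while the superblock is still registered */
    int litter_freed;   /* 1 once ramfs-like leftovers were cleaned up */
};

/* Common part: take the superblock off all lists. */
static void demo_generic_kill(struct demo_sb *sb)
{
    sb->on_lists = 0;
}

/* A ramfs-like type does its extra cleanup, then the common part. */
static void demo_kill_litter(struct demo_sb *sb)
{
    sb->litter_freed = 1;
    demo_generic_kill(sb);
}

/* The core no longer needs a per-type flag; it just calls the method. */
static void demo_shutdown(struct demo_sb *sb)
{
    assert(sb->s_type && sb->s_type->kill_sb);
    sb->s_type->kill_sb(sb);
}
```

With the decision moved into a method, a type flag consulted by the core becomes unnecessary in this model.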
-