- 05 Jan, 2009 40 commits
-
-
Jan Kara authored
For each quota type each node has local quota file. In this file it stores changes users have made to disk usage via this node. Once in a while this information is synced to global file (and thus with other nodes) so that limits enforcement at least aproximately works. Global quota files contain all the information about usage and limits. It's mostly handled by the generic VFS code (which implements a trie of structures inside a quota file). We only have to provide functions to convert structures from on-disk format to in-memory one. We also have to provide wrappers for various quota functions starting transactions and acquiring necessary cluster locks before the actual IO is really started. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Jan Kara authored
Mark system files as not subject to quota accounting. This prevents possible recursions into quota code and thus deadlocks. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Jan Kara authored
Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Jan Kara authored
OCFS2 can easily support nested transactions. We just have to take care and not spoil statistics acquire semaphore unnecessarily. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Jan Kara authored
OCFS2 needs to scan all active dquots once in a while and sync quota information among cluster nodes. Provide a helper function for it so that it does not have to reimplement internally a list which VFS already has. Moreover this function is probably going to be useful for other clustered filesystems if they decide to use VFS quotas. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Jan Kara authored
OCFS2 needs to peek whether quota structure is already in memory so that it can avoid expensive cluster locking in that case. Similarly when freeing dquots, it checks whether it is the last quota structure user or not. Finally, it needs to get reference to dquot structure for specified id and quota type when recovering quota file after crash. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Jan Kara authored
Increase reported version number of quota support since quota core has changed significantly. Also remove __DQUOT_NUM_VERSION__ since nobody uses it. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Jan Kara authored
Quota in a clustered environment needs to synchronize quota information among cluster nodes. This means we have to occasionally update some information in dquot from disk / network. On the other hand we have to be careful not to overwrite changes administrator did via SETQUOTA. So indicate in dquot->dq_flags which entries have been set by SETQUOTA and quota format can clear these flags when it properly propagated the changes. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Jan Kara authored
For clustered filesystems, it can happen that space / inode usage goes negative temporarily (because some node is allocating another node is freeing and they are not completely in sync). So let quota code allow this and change qsize_t so a signed type so that we don't underflow the variables. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Jan Kara authored
Coming quota support for OCFS2 is going to need quite a bit of additional per-sb quota information. Moreover having fs.h include all the types needed for this structure would be a pain in the a**. So remove the union from mem_dqinfo and add a private pointer for filesystem's use. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Jan Kara authored
There is going to be a new version of quota format having 64-bit quota limits and a new quota format for OCFS2. They are both going to use the same tree structure as VFSv0 quota format. So split out tree handling into a separate file and make size of leaf blocks, amount of space usable in each block (needed for checksumming) and structures contained in them configurable so that the code can be shared. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Jan Kara authored
Since these include files are used only by implementation of quota formats, there's no need to have them in include/linux/. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Jan Kara authored
If filesystem can handle quota files as system files hidden from users, we can skip a lot of cache invalidation, syncing, inode flags setting etc. when turning quotas on, off and quota_sync. Allow filesystem to indicate that it is hiding quota files from users by DQUOT_QUOTA_SYS_FILE flag. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Jan Kara authored
Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Jan Kara authored
Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Jan Kara authored
Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Jan Kara authored
Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Jan Kara authored
Split DQUOT_USR_ENABLED (and DQUOT_GRP_ENABLED) into DQUOT_USR_USAGE_ENABLED and DQUOT_USR_LIMITS_ENABLED. This way we are able to separately enable / disable whether we should: 1) ignore quotas completely 2) just keep uptodate information about usage 3) actually enforce quota limits This is going to be useful when quota is treated as filesystem metadata - we then want to keep quota information uptodate all the time and just enable / disable limits enforcement. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Jan Kara authored
Upto now, DQUOT_USR_SUSPENDED behaved like a state - i.e., either quota was enabled or suspended or none. Now allowed states are 0, ENABLED, ENABLED | SUSPENDED. This will be useful later when we implement separate enabling of quota usage tracking and limits enforcement because we need to keep track of a state which has been suspended. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Jan Kara authored
Checks like <= 0 for an unsigned type do not make much sence. The value could be only 0 and that does not happen often enough for the check to be worth it. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Jan Kara authored
So far quota was fine with quota block limits and inode limits/numbers in a 32-bit type. Now with rapid increase in storage sizes there are coming requests to be able to handle quota limits above 4TB / more that 2^32 inodes. So bump up sizes of types in mem_dqblk structure to 64-bits to be able to handle this. Also update inode allocation / checking functions to use qsize_t and make global structure keep quota limits in bytes so that things are consistent. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Jan Kara authored
Some filesystems would like to keep private information together with each dquot. Add callbacks alloc_dquot and destroy_dquot allowing filesystem to allocate larger dquots from their private slab in a similar fashion we currently allocate inodes. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Tao Ma authored
During an xattr set, when we move a xattr which was stored in inode to the outside bucket, we have to delete it and it will use the old value of xis->not_found. xis->not_found is removed by ocfs2_calc_xattr_set_need though, so we must restore it. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Tao Ma authored
When we extend one xattr's value to a large size, the old value size might be smaller than the size of a value root. In those cases, we still need to guess the metadata allocation. Reported-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Mark Fasheh authored
JBD2 is fully backwards compatible with JBD and it's been tested enough with Ocfs2 that we can clean this code up now. Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Joel Becker authored
Now that we've centralized the ocfs2_read_virt_blocks() code, let's use it in ocfs2_read_dir_block(). Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Joel Becker authored
The ocfs2_read_dir_block() function really maps an inode's virtual blocks to physical ones before calling ocfs2_read_blocks(). Let's extract that to common code, because other places might want to do that. Other than the block number being virtual, ocfs2_read_virt_blocks() takes the same arguments as ocfs2_read_blocks(). It converts those virtual block numbers to physical before calling ocfs2_read_blocks() directly. If the blocks asked for are discontiguous, this can mean multiple calls to ocfs2_read_blocks(), but this is mostly hidden from the caller. Like ocfs2_read_blocks(), the caller can pass in an existing buffer_head. This is usually done to pick up some readahead I/O. ocfs2_read_virt_blocks() checks the buffer_head's block number against the extent map - it must match. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Joel Becker authored
Add an optional validation hook to ocfs2_read_blocks(). Now the validation function is only called when a block was actually read off of disk. It is not called when the buffer was in cache. We add a buffer state bit BH_NeedsValidate to flag these buffers. It must always be one higher than the last JBD2 buffer state bit. The dinode, dirblock, extent_block, and xattr_block validators are lifted to this scheme directly. The group_descriptor validator needs to be split into two pieces. The first part only needs the gd buffer and is passed to ocfs2_read_block(). The second part requires the dinode as well, and is called every time. It's only 3 compares, so it's tiny. This also allows us to clean up the non-fatal gd check used by resize.c. It now has no magic argument. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Joel Becker authored
We weren't consistently checking xattr blocks after we read them. Most places checked the signature, but none checked xb_blkno or xb_fs_signature. Create a toplevel ocfs2_read_xattr_block() that does the read and the validation. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Joel Becker authored
We have ocfs2_bread() as a vestige of the original ext-based dir code. It's only used by directories, though. Turn it into ocfs2_read_dir_block(), with a prototype matching the other metadata read functions. It's set up to validate dirblocks when the time comes. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Joel Becker authored
We weren't consistently checking extent blocks after we read them. Most places checked the signature, but none checked h_blkno or h_fs_signature. Create a toplevel ocfs2_read_extent_block() that does the read and the validation. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Joel Becker authored
Random places in the code would check a group descriptor bh to see if it was valid. The previous commit unified descriptor block reads, validating all block reads in the same place. Thus, these checks are no longer necessary. Rather than eliminate them, however, we change them to BUG_ON() checks. This ensures the assumptions remain true. All of the code paths to these checks have been audited to ensure they come from a validated descriptor read. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Joel Becker authored
We have a clean call for validating group descriptors, but every place that wants the always does a read_block()+validate() call pair. Create a toplevel ocfs2_read_group_descriptor() that does the right thing. This allows us to leverage the single call point later for fancier handling. We also add validation of gd->bg_generation against the superblock and gd->bg_blkno against the block we thought we read. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Joel Becker authored
Currently the validation of group descriptors is directly duplicated so that one version can error the filesystem and the other (resize) can just report the problem. Consolidate to one function that takes a boolean. Wrap that function with the old call for the old users. This is in preparation for lifting the read+validate step into a single function. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Joel Becker authored
Random places in the code would check a dinode bh to see if it was valid. Not only did they do different levels of validation, they handled errors in different ways. The previous commit unified inode block reads, validating all block reads in the same place. Thus, these haphazard checks are no longer necessary. Rather than eliminate them, however, we change them to BUG_ON() checks. This ensures the assumptions remain true. All of the code paths to these checks have been audited to ensure they come from a validated inode read. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Joel Becker authored
The ocfs2 code currently reads inodes off disk with a simple ocfs2_read_block() call. Each place that does this has a different set of sanity checks it performs. Some check only the signature. A couple validate the block number (the block read vs di->i_blkno). A couple others check for VALID_FL. Only one place validates i_fs_generation. A couple check nothing. Even when an error is found, they don't all do the same thing. We wrap inode reading into ocfs2_read_inode_block(). This will validate all the above fields, going readonly if they are invalid (they never should be). ocfs2_read_inode_block_full() is provided for the places that want to pass read_block flags. Every caller is passing a struct inode with a valid ip_blkno, so we don't need a separate blkno argument either. We will remove the validation checks from the rest of the code in a later commit, as they are no longer necessary. Signed-off-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Tiger Yang authored
This patch adds the Kconfig option "CONFIG_OCFS2_FS_POSIX_ACL" and mount options "acl" to enable acls in Ocfs2. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Tiger Yang authored
We need to get the parent directories acls and let the new child inherit it. To this, we add additional calculations for data/metadata allocation. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Tiger Yang authored
This function is used to update acl xattrs during file mode changes. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-
Tiger Yang authored
This function is used to enhance permission checking with POSIX ACLs. Signed-off-by: Tiger Yang <tiger.yang@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
-