- 03 Oct, 2002 40 commits
-
-
Ivan Kokshaysky authored
Some PCI devices may have base address registers locked with non-zero values. Examples:
  - AGP aperture BAR of AMD-7xx host bridges: if the AGP window is disabled, this BAR is read-only and reads as 0x00000008;
  - BAR0-4 of ALi IDE controllers can be non-zero and read-only.
Obviously, we can't calculate the correct size of the respective region in this case (for the AMD AGP window we'll get a 4 GB resource - ouch). So I think that we should ignore r/o BARs (let the device specific fixups deal with them if needed). Patch appended (note that an extra write(0)/read-back pair is required, as the BAR might be programmed with all 1s).
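To make the reasoning about the extra write(0)/read-back pair concrete, here is a minimal Python model of BAR sizing. It is a sketch under stated assumptions, not the patch itself: `FakeBar`, `probe_bar_size` and `MEM_MASK` are hypothetical names, only 32-bit memory BARs are modelled, and the real kernel code may detect read-only registers differently.

```python
MEM_MASK = 0xFFFFFFF0  # address bits of a 32-bit memory BAR; the low 4 bits are flags


class FakeBar:
    """Hypothetical stand-in for one 32-bit BAR in PCI config space."""

    def __init__(self, value, writable_mask):
        self.value = value & 0xFFFFFFFF
        self.writable_mask = writable_mask & 0xFFFFFFFF

    def read(self):
        return self.value

    def write(self, new):
        # Only writable bits change; read-only bits keep their current state.
        keep = self.value & ~self.writable_mask & 0xFFFFFFFF
        self.value = keep | (new & self.writable_mask)


def probe_bar_size(bar):
    """Return the decoded region size in bytes, or None if the BAR is ignored."""
    original = bar.read()
    bar.write(0xFFFFFFFF)
    ones = bar.read()
    if ones == original:
        # Unchanged after writing all 1s: either the register is read-only,
        # or it was already programmed with all 1s. Writing 0 tells them apart.
        bar.write(0)
        if bar.read() == original:
            return None  # read-only BAR: leave it to device-specific fixups
    bar.write(original)  # restore the programmed address
    size_bits = ones & MEM_MASK
    if size_bits == 0:
        return None  # unimplemented BAR
    return (~size_bits & 0xFFFFFFFF) + 1
```

A normal 1 MB BAR reports 0x100000, a register locked at 0x00000008 is skipped, and a writable BAR that happens to hold all 1s is still sized correctly because the write of 0 shows that it is not read-only.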
-
Linus Torvalds authored
into penguin.transmeta.com:/home/penguin/torvalds/repositories/kernel/linux
-
Hugh Dickins authored
Regularize the erratic whitespace conventions in mm/shmem.c. Removal of blank line changes BUG_ON line numbers, otherwise builds the same.
-
Hugh Dickins authored
If PAGE_CACHE_SIZE were to differ from PAGE_SIZE, the VM_ACCT macro, and shmem_nopage's vm_pgoff manipulation, were still not quite right. Slip a cond_resched_lock into shmem_truncate's long loop; but not into shmem_unuse_inode's, since other locks held, and swapoff awful anyway. Move SetPageUptodate to where it's not already set. Replace copy_from_user by __copy_from_user since access already verified. Replace BUG()s by BUG_ON()s. Remove an uninteresting PAGE_BUG().
-
Hugh Dickins authored
If we're going to rely on struct page *s rather than virtual addresses for the metadata pages, let's count nr_swapped in the private field: these pages are only for storing swp_entry_ts, and need not be examined at all when nr_swapped is zero.
-
Hugh Dickins authored
wli suffered OOMs because tmpfs was allocating GFP_USER for its metadata pages. This patch allocates them GFP_HIGHUSER (default mapping->gfp_mask) and uses atomic kmaps to access (KM_USER0 for upper levels, KM_USER1 for lowest level). shmem_unuse_inode and shmem_truncate rewritten alike to avoid repeated maps and unmaps of the same page: cr's truncate was much more elegant, but I couldn't quite see how to convert it. I do wonder whether this patch is a bloat too far for tmpfs, and even non-highmem configs will be penalised by page_address overhead (perhaps a further patch could get over that). There is an attractive alternative (keep swp_entry_ts in the existing radix-tree, no metadata pages at all), but we haven't worked out an unhacky interface to that. For now at least, let's give tmpfs highmem metadata a spin.
-
Hugh Dickins authored
akpm and wli each discovered unfortunate behaviour of dbench on tmpfs: after tmpfs has reached its data memory limit, dbench continues to lseek and write, and tmpfs carries on allocating unlimited metadata blocks to accommodate the data it then refuses. That particular behaviour could be simply fixed by checking earlier; but I think tmpfs metablocks should be subject to the memory limit, and included in df and du accounting. Also, manipulate inode->i_blocks under lock, which was missed before.
-
Hugh Dickins authored
The distinction between shmem_getpage and shmem_getpage_locked is not helpful, particularly now info->sem is gone; and shmem_getpage confusingly tailored to shmem_nopage's expectations. Put the code of shmem_getpage_locked into the frame of shmem_getpage, leaving its callers to unlock_page afterwards.
-
Hugh Dickins authored
Between inode->i_sem and info->lock comes info->sem; but it doesn't guard thoroughly against the difficult races (truncate during read), and serializes reads from tmpfs unlike other filesystems. I'd prefer to work with just i_sem and info->lock, backtracking when necessary (when another task allocates block or metablock at the same time). (I am not satisfied with the locked setting of next_index at the start of shmem_getpage_locked: it's one lock hold too many, and it doesn't really fix races against truncate better than before: another patch in a later batch will resolve that.)
-
Hugh Dickins authored
The earlier partial truncation fix in shmem_truncate admits it is racy, and I've now seen that (though perhaps more likely when mpage_writepages was writing pages it shouldn't). A cleaner fix is, not to repeat the memclear in shmem_truncate, but to hold the partial page in memory throughout truncation, by shmem_holdpage from shmem_notify_change.
-
Hugh Dickins authored
Give tmpfs its own shmem_vm_writeback (and empty shmem_writepages): going through the default mpage_writepages is very wrong for tmpfs, since that may write nearby pages while still mapped into mms, but "writing" converts pages from tmpfs file identity to swap backing identity: doing so while mapped breaks assumptions throughout, e.g. the shared file is liable to disintegrate into private instances.
-
Hugh Dickins authored
tmpfs contributes to the AltSysRqM swapcache add and delete statistics, but not to its find statistics: use lookup_swap_cache wrapper to find_get_page, to contribute to those statistics too. Elsewhere, use existing info pointer and NAME_MAX definition. (I'll be sending 2.4 version to Marcelo shortly.)
-
Hugh Dickins authored
Apparently some applications are confused by tmpfs's practice of returning zero for the size of directories. In 2.4.20-pre6 Peter Anvin submitted a change to make tmpfs directories always have a size of "1". In the same spirit, this patch arranges for tmpfs directories to show up as having a size of 20 * number_of_entries, including "." and "..". Apparently counting up the size of all the entries isn't worth the hassle.
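The size rule described above is simple arithmetic; a minimal sketch, assuming a fixed 20 bytes per entry as the message states (the function name and the constant name used here are illustrative only):

```python
BOGO_DIRENT_SIZE = 20  # per-entry size from the commit message; name used for illustration


def tmpfs_dir_size(names):
    """Reported size of a tmpfs directory holding the given entry names."""
    # "." and ".." are counted in addition to the real entries.
    return BOGO_DIRENT_SIZE * (len(names) + 2)
```

Under this rule an empty directory reports 40 and a directory with three files reports 100, so no directory ever shows a size of zero.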
-
Hugh Dickins authored
shmem_rename still didn't get parent directory link count quite right, in the case where you rename a directory in place of an empty directory (with rename syscall: doesn't happen like that with mv command); and it forgot to update new directory's ctime and mtime. (I'll be sending 2.4 version to Marcelo shortly.)
-
Hugh Dickins authored
I've had this patch hanging around for a couple of months (you liked an earlier version, but I never found time to resubmit it): remove some unnecessary PageDirty and PageUptodate manipulations. add_to_page_cache can only receive a dirty page in the add_to_swap case, so deal with it there. add_to_swap is better off using add_to_page_cache directly than add_to_swap_cache. Keep move_to_ and _from_swap_cache simple, and don't fiddle with flags without reason. It's a little less efficient to correct clean->dirty list as an afterthought, but cuts unusual code from slow path.
-
Hugh Dickins authored
tmpfs 1/5 swapoff deadlock: my igrab/iput around the yield in shmem_unuse_inode was rubbish, seems my testing never really hit the case until last week, when truncation of course deadlocked on the page held locked across the iput (at least I had the foresight to say "ugh!" there). Don't yield here, switch over to the simple backoff I'd been using for months in the loopable tmpfs patch (yes, it could loop indefinitely for memory, that's already an issue to be dealt with later). The return convention from shmem_unuse to try_to_unuse is inelegant (commented at both ends), but effective.
-
Andrew Morton authored
From Badari Pulavarty. Use bio_add_page() in direct-io.c.
-
Andrew Morton authored
Patch from Rik adds "I/O wait" statistics to /proc/stat. This allows us to determine how much system time is being spent awaiting IO completion. This is an important statistic, as it tends to directly subtract from job completion time. procps-2.0.9 is OK with this, but doesn't report it.
-
Andrew Morton authored
`kswapd_steal` tells us how many pages were reclaimed by kswapd. The `pgsteal' statistic tells us how many pages were reclaimed altogether. So pgsteal - kswapd_steal is the number of pages which were directly reclaimed by page allocating processes. Also, the `pgscan' data is currently counting the number of pages scanned in shrink_cache() plus the number of pages scanned in refill_inactive_zone(). These are rather separate concepts, so I created the new `pgrefill' counter for refill_inactive_zone(). `pgscan' is now just the number of pages scanned in shrink_cache().
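Since `pgsteal` counts every reclaimed page and `kswapd_steal` counts only kswapd's share, the direct-reclaim figure is the former minus the latter. A minimal sketch of that arithmetic, assuming counters are presented as "name value" lines; the helper names and the sample numbers are invented for illustration:

```python
def parse_counters(text):
    """Parse 'name value' lines into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def direct_reclaim(counters):
    """Pages reclaimed directly by allocating processes rather than by kswapd."""
    return counters["pgsteal"] - counters["kswapd_steal"]


# Invented sample values, not real measurements.
SAMPLE = """\
pgsteal 1500
kswapd_steal 1200
pgscan 4000
pgrefill 900
"""
```

With the sample above, `direct_reclaim(parse_counters(SAMPLE))` gives 300 pages reclaimed outside kswapd.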
-
Andrew Morton authored
Moves the VM accounting out of /proc/stat and into /proc/vmstat. The VM accounting is now per-cpu. It also moves kstat.pgpgin and kstat.pgpgout into /proc/vmstat. Which is a bit of a duplication of /proc/diskstats (SARD), but it's easy, super-cheap and makes life a lot easier for all the system monitoring applications which we just broke. We now require procps 2.0.9. Updated versions of top and vmstat are available at http://surriel.com and the Cygnus CVS is up to date for these changes. (Rik has the CVS info at the above site). This tidies up kernel_stat quite a lot - it now only contains CPU things (interrupts and CPU loads) and disk things. So we now have:
  /proc/stat: CPU things and disk things
  /proc/vmstat: VM things (plus pgpgin, pgpgout)
The SARD patch removes the disk things from /proc/stat as well.
-
Andrew Morton authored
Rewrite these functions to use gang lookup.
  - This probably has similar performance to the old code in the common case.
  - It will be vastly quicker than current code for the worst case (single-page truncate).
  - invalidate_inode_pages() has been changed. It used to use page_count(page) as the "is it mapped into pagetables" heuristic. It now uses the (page->pte.direct != 0) heuristic.
  - Removes the worst cause of scheduling latency in the kernel.
  - It's a big code cleanup.
  - invalidate_inode_pages() has been changed to take an address_space *, not an inode *.
  - The maximum hold times for mapping->page_lock are enormously reduced, making it quite feasible to turn this into an irq-safe lock. Which, it seems, is a requirement for sane AIO<->direct-io integration, as well as possibly other AIO things.
(Thanks Hugh for fixing a bug in this one as well). (Christoph added some stuff too)
-
Andrew Morton authored
Adds a gang lookup facility to radix trees. It provides an efficient means of locating a bunch of pages starting at a particular offset. The implementation is a bit dumb, but is efficient enough. And it is amenable to the `tagged lookup' extension which is proving tricky to write, but which will allow the dirty pages within a mapping to be located in pgoff_t order. Thanks are due to Hugh Dickins for finding and fixing an unpleasant bug in here.
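The lookup semantics described here (return up to N items at or after a starting offset, in offset order) can be modelled briefly. This sketch uses a sorted list with `bisect` rather than a radix tree, so it illustrates only the interface, not the kernel data structure; `PageIndex` and its methods are hypothetical names.

```python
import bisect


class PageIndex:
    """Toy offset -> page index; a sorted list stands in for the radix tree."""

    def __init__(self):
        self._offsets = []  # sorted page offsets
        self._pages = {}    # offset -> page object

    def insert(self, offset, page):
        if offset not in self._pages:
            bisect.insort(self._offsets, offset)
        self._pages[offset] = page

    def gang_lookup(self, first_index, max_items):
        """Return up to max_items (offset, page) pairs with offset >= first_index."""
        start = bisect.bisect_left(self._offsets, first_index)
        chosen = self._offsets[start:start + max_items]
        return [(off, self._pages[off]) for off in chosen]
```

A caller that walks a sparse mapping can then fetch pages in batches, restarting each lookup just past the last offset returned, instead of probing every offset one at a time.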
-
Andrew Morton authored
Pages with no reverse mapping can be present in page tables as a result of a driver performing remap_page_range(). Don't go BUG over them.
-
Andrew Morton authored
Patch from Hugh Dickins. Our earlier fix for mprotect_fixup was broken: it was passing an already-freed VMA to change_protection().
-
Andrew Morton authored
sys_ioperm() is calling kmalloc(GFP_KERNEL) inside get_cpu(). That's wrong, because the memory allocation could schedule away and return on a different CPU. So change it to perform the memory allocation outside the atomic region.
-
Andrew Morton authored
  - hugetlb Documentation update
  - Add /proc/buddyinfo documentation
  - nano-cleanup in __remove_from_page_cache.
-
Linus Torvalds authored
into penguin.transmeta.com:/home/penguin/torvalds/repositories/kernel/linux
-
Alexander Viro authored
-
Alexander Viro authored
-
Alexander Viro authored
-
Alexander Viro authored
-
Alexander Viro authored
-
Alexander Viro authored
-
Alexander Viro authored
-
Alexander Viro authored
Removed cruft from pd_ioctl() and friends.
-
Alexander Viro authored
-
Linus Torvalds authored
into home.transmeta.com:/home/torvalds/v2.5/linux
-
Jaroslav Kysela authored
  - save_flags/cli/restore_flags removal
  - updated USB code for 2.5
  - fixed SPARC configuration
  - fixed spinlock/sleep race in PCM midlevel
-
Matthew Wilcox authored
-
David S. Miller authored
-