Commit 88c4650a authored by Andrew Morton, committed by Linus Torvalds

[PATCH] direct-to-BIO I/O for swapcache pages

This patch changes the swap I/O handling.  The objectives are:

- Remove swap special-casing
- Stop using buffer_heads -> direct-to-BIO
- Make S_ISREG swapfiles more robust.

I've spent quite some time with swap.  The first patches converted swap to
use block_read/write_full_page().  These were discarded because they still
use buffer_heads, and because a reasonable amount of otherwise-unnecessary
infrastructure had to be added to the swap code just to make it look like a
regular fs.  So this code just has a custom direct-to-BIO path for swap,
which seems to be the most comfortable approach.

A significant thing here is the introduction of "swap extents".  A swap
extent is a simple data structure which maps a range of swap pages onto a
range of disk sectors.  It is simply:

	struct swap_extent {
		struct list_head list;
		pgoff_t start_page;
		pgoff_t nr_pages;
		sector_t start_block;
	};

At swapon time (for an S_ISREG swapfile), each block in the file is bmapped()
and the block numbers are parsed to generate the device's swap extent list.
This extent list is quite compact - a 512 megabyte swapfile generates about
130 nodes in the list.  That's about 4 kbytes of storage.  The conversion
from filesystem blocksize blocks into PAGE_SIZE blocks is performed at swapon
time.

At swapon time (for an S_ISBLK swapfile), we install a single swap extent
which describes the entire device.
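
For illustration, here is a minimal sketch of how that extent list might be
assembled at swapon time.  The helper names and error handling are
assumptions for this sketch, not code from the patch; only the data
structure above is literal.

	#include <linux/swap.h>		/* struct swap_extent, swap_info_struct */
	#include <linux/slab.h>		/* kmalloc() */
	#include <linux/list.h>

	/*
	 * Sketch: append one PAGE_SIZE-unit run to the extent list.  New
	 * extents go at the head, so extent_list.prev keeps pointing at the
	 * lowest-index extent.  A run which directly continues the previous
	 * one is merged rather than allocated, which is why a 512MB swapfile
	 * ends up with only ~130 nodes.
	 */
	static int add_swap_extent(struct swap_info_struct *sis,
				   pgoff_t start_page, pgoff_t nr_pages,
				   sector_t start_block)
	{
		struct list_head *lh = sis->extent_list.next; /* highest-index extent */
		struct swap_extent *se;

		if (lh != &sis->extent_list) {
			se = list_entry(lh, struct swap_extent, list);
			if (se->start_page + se->nr_pages == start_page &&
			    se->start_block + se->nr_pages == start_block) {
				se->nr_pages += nr_pages;	/* contiguous: merge */
				return 0;
			}
		}

		se = kmalloc(sizeof(*se), GFP_KERNEL);
		if (se == NULL)
			return -ENOMEM;
		se->start_page = start_page;
		se->nr_pages = nr_pages;
		se->start_block = start_block;
		list_add(&se->list, &sis->extent_list);
		sis->nr_extents++;
		return 0;
	}

	/* An S_ISBLK swapfile is trivial: one extent covers the whole device */
	static int setup_blockdev_extent(struct swap_info_struct *sis,
					 pgoff_t nr_pages)
	{
		return add_swap_extent(sis, 0, nr_pages, 0);
	}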

The advantages of the swap extents are:

1: We never have to run bmap() (ie: read from disk) at swapout time.  So
   S_ISREG swapfiles are now just as robust as S_ISBLK swapfiles.

2: All the differences between S_ISBLK swapfiles and S_ISREG swapfiles are
   handled at swapon time.  During normal operation, we just don't care.
   Both types of swapfiles are handled the same way.

3: The extent lists always operate in PAGE_SIZE units.  So the problems of
   going from fs blocksize to PAGE_SIZE are handled at swapon time and normal
   operating code doesn't need to care.

4: Because we don't have to fiddle with different blocksizes, we can go
   direct-to-BIO for swap_readpage() and swap_writepage().  This introduces
   the kernel-wide invariant "anonymous pages never have buffers attached",
   which cleans some things up nicely.  All those block_flushpage() calls in
   the swap code simply go away.

5: The kernel no longer has to allocate both buffer_heads and BIOs to
   perform swapout.  Just a BIO.

6: It permits us to perform swapcache writeout and throttling for
   GFP_NOFS allocations (a later patch).

(Well, there is one sort of anon page which can have buffers: the pages which
are cast adrift in truncate_complete_page() because do_invalidatepage()
failed.  But these pages are never added to swapcache, and nobody except the
VM LRU has to deal with them).

The swapfile parser in setup_swap_extents() will attempt to extract the
largest possible number of PAGE_SIZE-sized and PAGE_SIZE-aligned chunks of
disk from the S_ISREG swapfile.  Any stray blocks (due to file
discontiguities) are simply discarded - we never swap to those.

If an S_ISREG swapfile is found to have any unmapped blocks (file holes) then
the swapon attempt will fail.
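
A sketch of that per-page probing follows; the function name and return
convention are illustrative (the real logic lives inside
setup_swap_extents(), which is not visible in the hunks below).

	#include <linux/fs.h>		/* bmap(), struct inode */

	/*
	 * Probe one PAGE_SIZE chunk of an S_ISREG swapfile.  Returns 0 and
	 * the starting device block (in PAGE_SIZE units) on success, 1 if
	 * the chunk is misaligned/discontiguous and should be discarded, or
	 * -EINVAL if the file has a hole, which fails the whole swapon.
	 */
	static int probe_swap_page(struct inode *inode, pgoff_t page_no,
				   sector_t *first_block)
	{
		unsigned blkbits = inode->i_blkbits;
		unsigned blocks_per_page = PAGE_SIZE >> blkbits;
		sector_t probe_block = (sector_t)page_no * blocks_per_page;
		sector_t block;
		unsigned i;

		block = bmap(inode, probe_block);
		if (block == 0)
			return -EINVAL;		/* hole: refuse the swapfile */

		/* the chunk must start on a PAGE_SIZE-aligned device block */
		if (block & (blocks_per_page - 1))
			return 1;		/* misaligned: never swap here */

		for (i = 1; i < blocks_per_page; i++) {
			sector_t b = bmap(inode, probe_block + i);

			if (b == 0)
				return -EINVAL;	/* hole: refuse the swapfile */
			if (b != block + i)
				return 1;	/* discontiguous: skip this chunk */
		}

		*first_block = block >> (PAGE_SHIFT - blkbits); /* fs blocks -> pages */
		return 0;
	}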

The extent list can be quite large (hundreds of nodes for a gigabyte S_ISREG
swapfile).  It needs to be consulted once for each page within
swap_readpage() and swap_writepage().  Hence there is a risk that we could
blow significant amounts of CPU walking that list.  However, I have
implemented a "where we found the last block" cache, which is used as the
starting point for the next search.  Empirical testing indicates that this is
wildly effective - the average length of the list walk in map_swap_page() is
0.3 iterations per page, with a 130-element list.

It _could_ be that some workloads do start suffering long walks in that code,
and perhaps a tree would be needed there.  But I doubt that, and if this is
happening then it means that we're seeking all over the disk for swap I/O,
and the list walk is the least of our problems.
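
A sketch of what that lookup looks like, modelled on the description above
(the real map_swap_page() is in mm/swapfile.c and is not part of the hunks
below); the walk starts at the extent which satisfied the previous lookup,
so mostly-sequential swap I/O rarely advances at all.

	#include <linux/swap.h>
	#include <linux/list.h>

	/* Map a swap page offset to a device block, in PAGE_SIZE units */
	sector_t map_swap_page(struct swap_info_struct *sis, pgoff_t offset)
	{
		struct swap_extent *se = sis->curr_swap_extent;
		struct swap_extent *start_se = se;

		for (;;) {
			struct list_head *lh;

			if (se->start_page <= offset &&
			    offset < se->start_page + se->nr_pages) {
				sis->curr_swap_extent = se;	/* cache the hit */
				return se->start_block + (offset - se->start_page);
			}
			lh = se->list.next;
			if (lh == &sis->extent_list)	/* skip the list head */
				lh = lh->next;
			se = list_entry(lh, struct swap_extent, list);
			BUG_ON(se == start_se);		/* every swap page is mapped */
		}
	}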

rw_swap_page_nolock() now takes a page*, not a kernel virtual address.  It
has been renamed to rw_swap_page_sync() and it takes care of locking and
unlocking the page itself, which makes for a much better interface.
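
A minimal usage sketch, modelled on the swsusp changes further down (the
wrapper function name is hypothetical; root_swap is the swsusp variable
used in those hunks).

	#include <linux/swap.h>
	#include <linux/swapops.h>	/* swp_entry() */
	#include <linux/mm.h>		/* alloc_page(), __free_page() */

	/* Read the swapfile's header page synchronously */
	static int read_swap_header_page(void)
	{
		swp_entry_t entry = swp_entry(root_swap, 0);
		struct page *page = alloc_page(GFP_KERNEL);
		int err;

		if (page == NULL)
			return -ENOMEM;
		err = rw_swap_page_sync(READ, entry, page);
		/* on success the page comes back unlocked and uptodate */
		__free_page(page);
		return err;
	}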

Support for type 0 swap has been removed.  Current versions of mkswap(8)
seem to never produce v0 swap unless you explicitly ask for it, so I doubt
this will affect anyone.  If you _do_ have a type 0 swapfile, swapon will
fail and the message
the message

	version 0 swap is no longer supported. Use mkswap -v1 /dev/sdb3

is printed.  We can remove that code for real later on.  Really, all that
swapfile header parsing should be pushed out to userspace.

This code always uses single-page BIOs for swapin and swapout.  I have an
additional patch which converts swap to use mpage_writepages(), so we swap
out in 16-page BIOs.  It works fine, but I don't intend to submit that.
There just doesn't seem to be any significant advantage to it.

I can't see anything in sys_swapon()/sys_swapoff() which needs the
lock_kernel() calls, so I deleted them.

If you ftruncate an S_ISREG swapfile to a shorter size while it is in use,
subsequent swapout will destroy the filesystem.  It was always thus, but it
is much, much easier to do now.  Not really a kernel problem, but swapon(8)
should not be allowing the kernel to use swapfiles which are modifiable by
unprivileged users.
parent 3ab86fb0
......@@ -492,7 +492,7 @@ static void free_more_memory(void)
}
/*
* I/O completion handler for block_read_full_page() and brw_page() - pages
* I/O completion handler for block_read_full_page() - pages
* which come unlocked at the end of I/O.
*/
static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
......@@ -551,9 +551,8 @@ static void end_buffer_async_read(struct buffer_head *bh, int uptodate)
}
/*
* Completion handler for block_write_full_page() and for brw_page() - pages
* which are unlocked during I/O, and which have PageWriteback cleared
* upon I/O completion.
* Completion handler for block_write_full_page() - pages which are unlocked
* during I/O, and which have PageWriteback cleared upon I/O completion.
*/
static void end_buffer_async_write(struct buffer_head *bh, int uptodate)
{
......@@ -1360,11 +1359,11 @@ int block_invalidatepage(struct page *page, unsigned long offset)
{
struct buffer_head *head, *bh, *next;
unsigned int curr_off = 0;
int ret = 1;
if (!PageLocked(page))
BUG();
BUG_ON(!PageLocked(page));
if (!page_has_buffers(page))
return 1;
goto out;
head = page_buffers(page);
bh = head;
......@@ -1386,12 +1385,10 @@ int block_invalidatepage(struct page *page, unsigned long offset)
* The get_block cached value has been unconditionally invalidated,
* so real IO is not possible anymore.
*/
if (offset == 0) {
if (!try_to_release_page(page, 0))
return 0;
}
return 1;
if (offset == 0)
ret = try_to_release_page(page, 0);
out:
return ret;
}
EXPORT_SYMBOL(block_invalidatepage);
......@@ -2266,57 +2263,6 @@ int brw_kiovec(int rw, int nr, struct kiobuf *iovec[],
return err ? err : transferred;
}
/*
* Start I/O on a page.
* This function expects the page to be locked and may return
* before I/O is complete. You then have to check page->locked
* and page->uptodate.
*
* FIXME: we need a swapper_inode->get_block function to remove
* some of the bmap kludges and interface ugliness here.
*/
int brw_page(int rw, struct page *page,
struct block_device *bdev, sector_t b[], int size)
{
struct buffer_head *head, *bh;
BUG_ON(!PageLocked(page));
if (!page_has_buffers(page))
create_empty_buffers(page, size, 0);
head = bh = page_buffers(page);
/* Stage 1: lock all the buffers */
do {
lock_buffer(bh);
bh->b_blocknr = *(b++);
bh->b_bdev = bdev;
set_buffer_mapped(bh);
if (rw == WRITE) {
set_buffer_uptodate(bh);
clear_buffer_dirty(bh);
mark_buffer_async_write(bh);
} else {
mark_buffer_async_read(bh);
}
bh = bh->b_this_page;
} while (bh != head);
if (rw == WRITE) {
BUG_ON(PageWriteback(page));
SetPageWriteback(page);
unlock_page(page);
}
/* Stage 2: start the IO */
do {
struct buffer_head *next = bh->b_this_page;
submit_bh(rw, bh);
bh = next;
} while (bh != head);
return 0;
}
/*
* Sanity checks for try_to_free_buffers.
*/
......
......@@ -183,7 +183,6 @@ struct buffer_head * __bread(struct block_device *, int, int);
void wakeup_bdflush(void);
struct buffer_head *alloc_buffer_head(int async);
void free_buffer_head(struct buffer_head * bh);
int brw_page(int, struct page *, struct block_device *, sector_t [], int);
void FASTCALL(unlock_buffer(struct buffer_head *bh));
/*
......
......@@ -5,6 +5,7 @@
#include <linux/kdev_t.h>
#include <linux/linkage.h>
#include <linux/mmzone.h>
#include <linux/list.h>
#include <asm/page.h>
#define SWAP_FLAG_PREFER 0x8000 /* set if swap priority specified */
......@@ -61,6 +62,21 @@ typedef struct {
#ifdef __KERNEL__
/*
* A swap extent maps a range of a swapfile's PAGE_SIZE pages onto a range of
* disk blocks. A list of swap extents maps the entire swapfile. (Where the
* term `swapfile' refers to either a blockdevice or an IS_REG file. Apart
* from setup, they're handled identically.
*
* We always assume that blocks are of size PAGE_SIZE.
*/
struct swap_extent {
struct list_head list;
pgoff_t start_page;
pgoff_t nr_pages;
sector_t start_block;
};
/*
* Max bad pages in the new format..
*/
......@@ -83,11 +99,17 @@ enum {
/*
* The in-memory structure used to track swap areas.
* extent_list.prev points at the lowest-index extent. That list is
* sorted.
*/
struct swap_info_struct {
unsigned int flags;
spinlock_t sdev_lock;
struct file *swap_file;
struct block_device *bdev;
struct list_head extent_list;
int nr_extents;
struct swap_extent *curr_swap_extent;
unsigned old_block_size;
unsigned short * swap_map;
unsigned int lowest_bit;
......@@ -134,8 +156,9 @@ extern wait_queue_head_t kswapd_wait;
extern int FASTCALL(try_to_free_pages(zone_t *, unsigned int, unsigned int));
/* linux/mm/page_io.c */
extern void rw_swap_page(int, struct page *);
extern void rw_swap_page_nolock(int, swp_entry_t, char *);
int swap_readpage(struct file *file, struct page *page);
int swap_writepage(struct page *page);
int rw_swap_page_sync(int rw, swp_entry_t entry, struct page *page);
/* linux/mm/page_alloc.c */
......@@ -163,12 +186,13 @@ extern unsigned int nr_swapfiles;
extern struct swap_info_struct swap_info[];
extern void si_swapinfo(struct sysinfo *);
extern swp_entry_t get_swap_page(void);
extern void get_swaphandle_info(swp_entry_t, unsigned long *, struct inode **);
extern int swap_duplicate(swp_entry_t);
extern int swap_count(struct page *);
extern int valid_swaphandles(swp_entry_t, unsigned long *);
extern void swap_free(swp_entry_t);
extern void free_swap_and_cache(swp_entry_t);
sector_t map_swap_page(struct swap_info_struct *p, pgoff_t offset);
struct swap_info_struct *get_swap_info_struct(unsigned type);
struct swap_list_t {
int head; /* head of priority-ordered swapfile list */
int next; /* swapfile to be used next */
......
......@@ -559,7 +559,6 @@ EXPORT_SYMBOL(buffer_insert_list);
EXPORT_SYMBOL(make_bad_inode);
EXPORT_SYMBOL(is_bad_inode);
EXPORT_SYMBOL(event);
EXPORT_SYMBOL(brw_page);
#ifdef CONFIG_UID16
EXPORT_SYMBOL(overflowuid);
......
......@@ -320,14 +320,15 @@ static void mark_swapfiles(swp_entry_t prev, int mode)
{
swp_entry_t entry;
union diskpage *cur;
cur = (union diskpage *)get_free_page(GFP_ATOMIC);
if (!cur)
struct page *page;
page = alloc_page(GFP_ATOMIC);
if (!page)
panic("Out of memory in mark_swapfiles");
cur = page_address(page);
/* XXX: this is dirty hack to get first page of swap file */
entry = swp_entry(root_swap, 0);
lock_page(virt_to_page((unsigned long)cur));
rw_swap_page_nolock(READ, entry, (char *) cur);
rw_swap_page_sync(READ, entry, page);
if (mode == MARK_SWAP_RESUME) {
if (!memcmp("SUSP1R",cur->swh.magic.magic,6))
......@@ -345,10 +346,8 @@ static void mark_swapfiles(swp_entry_t prev, int mode)
cur->link.next = prev; /* prev is the first/last swap page of the resume area */
/* link.next lies *no more* in last 4 bytes of magic */
}
lock_page(virt_to_page((unsigned long)cur));
rw_swap_page_nolock(WRITE, entry, (char *)cur);
free_page((unsigned long)cur);
rw_swap_page_sync(WRITE, entry, page);
__free_page(page);
}
static void read_swapfiles(void) /* This is called before saving image */
......@@ -409,6 +408,7 @@ static int write_suspend_image(void)
int nr_pgdir_pages = SUSPEND_PD_PAGES(nr_copy_pages);
union diskpage *cur, *buffer = (union diskpage *)get_free_page(GFP_ATOMIC);
unsigned long address;
struct page *page;
PRINTS( "Writing data to swap (%d pages): ", nr_copy_pages );
for (i=0; i<nr_copy_pages; i++) {
......@@ -421,13 +421,8 @@ static int write_suspend_image(void)
panic("\nPage %d: not enough swapspace on suspend device", i );
address = (pagedir_nosave+i)->address;
lock_page(virt_to_page(address));
{
long dummy1;
struct inode *suspend_file;
get_swaphandle_info(entry, &dummy1, &suspend_file);
}
rw_swap_page_nolock(WRITE, entry, (char *) address);
page = virt_to_page(address);
rw_swap_page_sync(WRITE, entry, page);
(pagedir_nosave+i)->swap_address = entry;
}
PRINTK(" done\n");
......@@ -452,8 +447,8 @@ static int write_suspend_image(void)
if (PAGE_SIZE % sizeof(struct pbe))
panic("I need PAGE_SIZE to be integer multiple of struct pbe, otherwise next assignment could damage pagedir");
cur->link.next = prev;
lock_page(virt_to_page((unsigned long)cur));
rw_swap_page_nolock(WRITE, entry, (char *) cur);
page = virt_to_page((unsigned long)cur);
rw_swap_page_sync(WRITE, entry, page);
prev = entry;
}
PRINTK(", header");
......@@ -473,8 +468,8 @@ static int write_suspend_image(void)
cur->link.next = prev;
lock_page(virt_to_page((unsigned long)cur));
rw_swap_page_nolock(WRITE, entry, (char *) cur);
page = virt_to_page((unsigned long)cur);
rw_swap_page_sync(WRITE, entry, page);
prev = entry;
PRINTK( ", signature" );
......
......@@ -14,112 +14,163 @@
#include <linux/kernel_stat.h>
#include <linux/pagemap.h>
#include <linux/swap.h>
#include <linux/swapctl.h>
#include <linux/buffer_head.h> /* for brw_page() */
#include <linux/bio.h>
#include <linux/buffer_head.h>
#include <asm/pgtable.h>
#include <linux/swapops.h>
/*
* Reads or writes a swap page.
* wait=1: start I/O and wait for completion. wait=0: start asynchronous I/O.
*
* Important prevention of race condition: the caller *must* atomically
* create a unique swap cache entry for this swap page before calling
* rw_swap_page, and must lock that page. By ensuring that there is a
* single page of memory reserved for the swap entry, the normal VM page
* lock on that page also doubles as a lock on swap entries. Having only
* one lock to deal with per swap entry (rather than locking swap and memory
* independently) also makes it easier to make certain swapping operations
* atomic, which is particularly important when we are trying to ensure
* that shared pages stay shared while being swapped.
*/
static int
swap_get_block(struct inode *inode, sector_t iblock,
struct buffer_head *bh_result, int create)
{
struct swap_info_struct *sis;
swp_entry_t entry;
static int rw_swap_page_base(int rw, swp_entry_t entry, struct page *page)
entry.val = iblock;
sis = get_swap_info_struct(swp_type(entry));
bh_result->b_bdev = sis->bdev;
bh_result->b_blocknr = map_swap_page(sis, swp_offset(entry));
bh_result->b_size = PAGE_SIZE;
set_buffer_mapped(bh_result);
return 0;
}
static struct bio *
get_swap_bio(int gfp_flags, struct page *page, bio_end_io_t end_io)
{
unsigned long offset;
sector_t zones[PAGE_SIZE/512];
int zones_used;
int block_size;
struct inode *swapf = 0;
struct block_device *bdev;
struct bio *bio;
struct buffer_head bh;
if (rw == READ) {
bio = bio_alloc(gfp_flags, 1);
if (bio) {
swap_get_block(NULL, page->index, &bh, 1);
bio->bi_sector = bh.b_blocknr * (PAGE_SIZE >> 9);
bio->bi_bdev = bh.b_bdev;
bio->bi_io_vec[0].bv_page = page;
bio->bi_io_vec[0].bv_len = PAGE_SIZE;
bio->bi_io_vec[0].bv_offset = 0;
bio->bi_vcnt = 1;
bio->bi_idx = 0;
bio->bi_size = PAGE_SIZE;
bio->bi_end_io = end_io;
}
return bio;
}
static void end_swap_bio_write(struct bio *bio)
{
const int uptodate = test_bit(BIO_UPTODATE, &bio->bi_flags);
struct page *page = bio->bi_io_vec[0].bv_page;
if (!uptodate)
SetPageError(page);
end_page_writeback(page);
bio_put(bio);
}
static void end_swap_bio_read(struct bio *bio)
{
const int uptodate = test_bit(BIO_UPTODATE, &bio->bi_flags);
struct page *page = bio->bi_io_vec[0].bv_page;
if (!uptodate) {
SetPageError(page);
ClearPageUptodate(page);
kstat.pswpin++;
} else
kstat.pswpout++;
get_swaphandle_info(entry, &offset, &swapf);
bdev = swapf->i_bdev;
if (bdev) {
zones[0] = offset;
zones_used = 1;
block_size = PAGE_SIZE;
} else {
int i, j;
unsigned int block = offset
<< (PAGE_SHIFT - swapf->i_sb->s_blocksize_bits);
block_size = swapf->i_sb->s_blocksize;
for (i=0, j=0; j< PAGE_SIZE ; i++, j += block_size)
if (!(zones[i] = bmap(swapf,block++))) {
printk("rw_swap_page: bad swap file\n");
return 0;
}
zones_used = i;
bdev = swapf->i_sb->s_bdev;
SetPageUptodate(page);
}
unlock_page(page);
bio_put(bio);
}
/* block_size == PAGE_SIZE/zones_used */
brw_page(rw, page, bdev, zones, block_size);
/*
* We may have stale swap cache pages in memory: notice
* them here and get rid of the unnecessary final write.
*/
int swap_writepage(struct page *page)
{
struct bio *bio;
int ret = 0;
/* Note! For consistency we do all of the logic,
* decrementing the page count, and unlocking the page in the
* swap lock map - in the IO completion handler.
*/
return 1;
if (remove_exclusive_swap_page(page)) {
unlock_page(page);
goto out;
}
bio = get_swap_bio(GFP_NOIO, page, end_swap_bio_write);
if (bio == NULL) {
ret = -ENOMEM;
goto out;
}
kstat.pswpout++;
SetPageWriteback(page);
unlock_page(page);
submit_bio(WRITE, bio);
out:
return ret;
}
int swap_readpage(struct file *file, struct page *page)
{
struct bio *bio;
int ret = 0;
ClearPageUptodate(page);
bio = get_swap_bio(GFP_KERNEL, page, end_swap_bio_read);
if (bio == NULL) {
ret = -ENOMEM;
goto out;
}
kstat.pswpin++;
submit_bio(READ, bio);
out:
return ret;
}
/*
* A simple wrapper so the base function doesn't need to enforce
* that all swap pages go through the swap cache! We verify that:
* - the page is locked
* - it's marked as being swap-cache
* - it's associated with the swap inode
* swapper_space doesn't have a real inode, so it gets a special vm_writeback()
* so we don't need swap special cases in generic_vm_writeback().
*
* Swap pages are PageLocked and PageWriteback while under writeout so that
* memory allocators will throttle against them.
*/
void rw_swap_page(int rw, struct page *page)
static int swap_vm_writeback(struct page *page, int *nr_to_write)
{
swp_entry_t entry;
struct address_space *mapping = page->mapping;
entry.val = page->index;
if (!PageLocked(page))
PAGE_BUG(page);
if (!PageSwapCache(page))
PAGE_BUG(page);
if (!rw_swap_page_base(rw, entry, page))
unlock_page(page);
unlock_page(page);
return generic_writepages(mapping, nr_to_write);
}
struct address_space_operations swap_aops = {
vm_writeback: swap_vm_writeback,
writepage: swap_writepage,
readpage: swap_readpage,
sync_page: block_sync_page,
set_page_dirty: __set_page_dirty_nobuffers,
};
/*
* The swap lock map insists that pages be in the page cache!
* Therefore we can't use it. Later when we can remove the need for the
* lock map and we can reduce the number of functions exported.
* A scruffy utility function to read or write an arbitrary swap page
* and wait on the I/O.
*/
void rw_swap_page_nolock(int rw, swp_entry_t entry, char *buf)
int rw_swap_page_sync(int rw, swp_entry_t entry, struct page *page)
{
struct page *page = virt_to_page(buf);
if (!PageLocked(page))
PAGE_BUG(page);
if (page->mapping)
PAGE_BUG(page);
/* needs sync_page to wait I/O completation */
int ret;
lock_page(page);
BUG_ON(page->mapping);
page->mapping = &swapper_space;
if (rw_swap_page_base(rw, entry, page))
lock_page(page);
if (page_has_buffers(page) && !try_to_free_buffers(page))
PAGE_BUG(page);
page->index = entry.val;
if (rw == READ) {
ret = swap_readpage(NULL, page);
wait_on_page_locked(page);
} else {
ret = swap_writepage(page);
wait_on_page_writeback(page);
}
page->mapping = NULL;
unlock_page(page);
if (ret == 0 && (!PageUptodate(page) || PageError(page)))
ret = -EIO;
return ret;
}
......@@ -14,54 +14,27 @@
#include <linux/init.h>
#include <linux/pagemap.h>
#include <linux/smp_lock.h>
#include <linux/buffer_head.h> /* block_sync_page()/try_to_free_buffers() */
#include <linux/buffer_head.h> /* block_sync_page() */
#include <asm/pgtable.h>
/*
* We may have stale swap cache pages in memory: notice
* them here and get rid of the unnecessary final write.
*/
static int swap_writepage(struct page *page)
{
if (remove_exclusive_swap_page(page)) {
unlock_page(page);
return 0;
}
rw_swap_page(WRITE, page);
return 0;
}
/*
* swapper_space doesn't have a real inode, so it gets a special vm_writeback()
* so we don't need swap special cases in generic_vm_writeback().
*
* Swap pages are PageLocked and PageWriteback while under writeout so that
* memory allocators will throttle against them.
*/
static int swap_vm_writeback(struct page *page, int *nr_to_write)
{
struct address_space *mapping = page->mapping;
unlock_page(page);
return generic_writepages(mapping, nr_to_write);
}
static struct address_space_operations swap_aops = {
vm_writeback: swap_vm_writeback,
writepage: swap_writepage,
sync_page: block_sync_page,
set_page_dirty: __set_page_dirty_nobuffers,
};
/*
* swapper_inode doesn't do anything much. It is really only here to
* avoid some special-casing in other parts of the kernel.
*
* We set i_size to "infinity" to keep the page I/O functions happy. The swap
* block allocator makes sure that allocations are in-range. A strange
* number is chosen to prevent various arith overflows elsewhere. For example,
* `lblock' in block_read_full_page().
*/
static struct inode swapper_inode = {
i_mapping: &swapper_space,
i_mapping: &swapper_space,
i_size: PAGE_SIZE * 0xffffffffLL,
i_blkbits: PAGE_SHIFT,
};
extern struct address_space_operations swap_aops;
struct address_space swapper_space = {
page_tree: RADIX_TREE_INIT(GFP_ATOMIC),
page_lock: RW_LOCK_UNLOCKED,
......@@ -149,14 +122,9 @@ void delete_from_swap_cache(struct page *page)
{
swp_entry_t entry;
/*
* I/O should have completed and nobody can have a ref against the
* page's buffers
*/
BUG_ON(!PageLocked(page));
BUG_ON(PageWriteback(page));
if (page_has_buffers(page) && !try_to_free_buffers(page))
BUG();
BUG_ON(page_has_buffers(page));
entry.val = page->index;
......@@ -222,16 +190,9 @@ int move_from_swap_cache(struct page *page, unsigned long index,
void **pslot;
int err;
/*
* Drop the buffers now, before taking the page_lock. Because
* mapping->private_lock nests outside mapping->page_lock.
* This "must" succeed. The page is locked and all I/O has completed
* and nobody else has a ref against its buffers.
*/
BUG_ON(!PageLocked(page));
BUG_ON(PageWriteback(page));
if (page_has_buffers(page) && !try_to_free_buffers(page))
BUG();
BUG_ON(page_has_buffers(page));
write_lock(&swapper_space.page_lock);
write_lock(&mapping->page_lock);
......@@ -361,7 +322,7 @@ struct page * read_swap_cache_async(swp_entry_t entry)
/*
* Initiate read into locked page and return.
*/
rw_swap_page(READ, new_page);
swap_readpage(NULL, new_page);
return new_page;
}
} while (err != -ENOENT && err != -ENOMEM);
......