[PATCH] loop setup race fix

From: Chris Mason <mason@suse.com> There's a race in loopback setup, it's easiest to trigger with one or more procs doing loopback mounts at the same time. The problem is that fs/block_dev.c:do_open() only calls bdev_set_size on the first open. Picture two procs: proc1: mount -o loop file1 mnt1 proc2: mount -o loop file2 mnt2 proc1 proc2 open /dev/loop0 # bd_openers now 1 do_open bd_set_size(bdev, 0) # loop unbound, so bdev size is 0 open /dev/loop0 # bd_openers now 2 loop_set_fd # disk capacity now correct, but # bdev not updated mount /dev/loop0 /mnt do_open Because bd_openers != 0 for the last do_open, bd_set_size is not called again and a size of 0 is used. This eventually leads to an oops when the loop device is unmounted, because fsync_bdev calls block_write_full_page who decides every page on the block device is outside i_size and unmaps them. When ext2 or reiserfs try to sync a metadata buffer, we get an oops on because the buffers are no longer mapped. The patch below changes loop_set_fd and loop_clr_fd to also manipulate the size of the block device, which fixes things for me.

[PATCH] loop setup race fix
From: Chris Mason <mason@suse.com> There's a race in loopback setup, it's easiest to trigger with one or more procs doing loopback mounts at the same time. The problem is that fs/block_dev.c:do_open() only calls bdev_set_size on the first open. Picture two procs: proc1: mount -o loop file1 mnt1 proc2: mount -o loop file2 mnt2 proc1 proc2 open /dev/loop0 # bd_openers now 1 do_open bd_set_size(bdev, 0) # loop unbound, so bdev size is 0 open /dev/loop0 # bd_openers now 2 loop_set_fd # disk capacity now correct, but # bdev not updated mount /dev/loop0 /mnt do_open Because bd_openers != 0 for the last do_open, bd_set_size is not called again and a size of 0 is used. This eventually leads to an oops when the loop device is unmounted, because fsync_bdev calls block_write_full_page who decides every page on the block device is outside i_size and unmaps them. When ext2 or reiserfs try to sync a metadata buffer, we get an oops on because the buffers are no longer mapped. The patch below changes loop_set_fd and loop_clr_fd to also manipulate the size of the block device, which fixes things for me.
238a43a0 · Andrew Morton · Linus Torvalds · 275da6a3 · 238a43a0 · 238a43a0
Commit 238a43a0 authored Mar 11, 2004 by Andrew Morton Committed by Linus Torvalds Mar 11, 2004
Hide whitespace changes
Inline Side-by-side

Showing with 5 additions and 1 deletion

drivers/block/loop.c drivers/block/loop.c +2 -0

fs/block_dev.c fs/block_dev.c +2 -1

include/linux/fs.h include/linux/fs.h +1 -0

No files found.
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -677,6 +677,7 @@ static int loop_set_fd(struct loop_device *lo, struct file *lo_file,
 	lo->transfer = NULL;
 	lo->ioctl = NULL;
 	lo->lo_sizelimit = 0;
+	bd_set_size(bdev,(loff_t)get_capacity(disks[lo->lo_number])<<9);
 	lo->old_gfp_mask = mapping_gfp_mask(mapping);
 	mapping_set_gfp_mask(mapping, lo->old_gfp_mask & ~(__GFP_IO|__GFP_FS));

@@ -780,6 +781,7 @@ static int loop_clr_fd(struct loop_device *lo, struct block_device *bdev)
 	memset(lo->lo_file_name, 0, LO_NAME_SIZE);
 	invalidate_bdev(bdev, 0);
 	set_capacity(disks[lo->lo_number], 0);
+	bd_set_size(bdev, 0);
 	mapping_set_gfp_mask(filp->f_mapping, gfp);
 	lo->lo_state = Lo_unbound;
 	fput(filp);

--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -522,7 +522,7 @@ int check_disk_change(struct block_device *bdev)

 EXPORT_SYMBOL(check_disk_change);

-static void bd_set_size(struct block_device *bdev, loff_t size)
+void bd_set_size(struct block_device *bdev, loff_t size)
 {
 	unsigned bsize = bdev_hardsect_size(bdev);

@@ -535,6 +535,7 @@ static void bd_set_size(struct block_device *bdev, loff_t size)
 	bdev->bd_block_size = bsize;
 	bdev->bd_inode->i_blkbits = blksize_bits(bsize);
 }
+EXPORT_SYMBOL(bd_set_size);

 static int do_open(struct block_device *bdev, struct file *file)
 {

--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1136,6 +1136,7 @@ extern void vfs_caches_init(unsigned long);
 extern int register_blkdev(unsigned int, const char *);
 extern int unregister_blkdev(unsigned int, const char *);
 extern struct block_device *bdget(dev_t);
+extern void bd_set_size(struct block_device *, loff_t size);
 extern void bd_forget(struct inode *inode);
 extern void bdput(struct block_device *);
 extern int blkdev_open(struct inode *, struct file *);