[PATCH] md: Handle concurrent failure of two drives in raid5

If two drives fail during a write request, raid5 doesn't cope properly and will eventually oops. With this patch, when a double drive failure is noticed, blocks that have already been 'written' are failed along with blocks that are about to be written.
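
The list handling this relies on can be sketched in user space. The following is only an illustration, not the kernel code: `struct fake_bio`, `fail_written`, the `completed` counter and the `STRIPE_SECTORS` value of 8 are hypothetical stand-ins chosen for this example. It shows the pattern of detaching the 'written' chain, marking each request in the stripe as failed, and moving a request to the return list once its last outstanding segment is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bio; not the kernel definition. */
struct fake_bio {
    unsigned long sector;      /* first sector covered by the request */
    int pending_segments;      /* plays the role of bi_phys_segments */
    bool uptodate;             /* plays the role of the BIO_UPTODATE flag */
    struct fake_bio *next;     /* plays the role of bi_next */
};

/* Assumed stripe size in sectors, for illustration only. */
#define STRIPE_SECTORS 8UL

/*
 * Detach the 'written' chain of one device and fail every request that
 * falls inside the stripe starting at dev_sector. A request whose last
 * outstanding segment is dropped is pushed onto return_list and counted
 * in *completed (standing in for the md_write_end() call).
 * Returns the new head of the return list.
 */
static struct fake_bio *fail_written(struct fake_bio **written,
                                     unsigned long dev_sector,
                                     struct fake_bio *return_list,
                                     int *completed)
{
    assert(written != NULL && completed != NULL);

    struct fake_bio *bi = *written;
    *written = NULL;                       /* the device no longer owns the chain */

    while (bi && bi->sector < dev_sector + STRIPE_SECTORS) {
        struct fake_bio *next = bi->next;  /* save before relinking bi */

        bi->uptodate = false;              /* mark the request as failed */
        if (--bi->pending_segments == 0) {
            (*completed)++;
            bi->next = return_list;        /* hand back to the caller */
            return_list = bi;
        }
        bi = next;
    }
    return return_list;
}
```

A request that still has segments outstanding in other stripes is marked failed but stays off the return list until its count reaches zero, so each request is handed back only once.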

Commit 95e7ce7f, authored May 26, 2003 by Neil Brown; committed by Linus Torvalds May 26, 2003

Showing with 15 additions and 1 deletion

drivers/md/raid5.c +15 -1

--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -918,7 +918,7 @@ static void handle_stripe(struct stripe_head *sh)
 	/* check if the array has lost two devices and, if so, some requests might
 	 * need to be failed
 	 */
-	if (failed > 1 && to_read+to_write) {
+	if (failed > 1 && to_read+to_write+written) {
 		spin_lock_irq(&conf->device_lock);
 		for (i=disks; i--; ) {
 			/* fail all writes first */
@@ -936,6 +936,20 @@ static void handle_stripe(struct stripe_head *sh)
 				}
 				bi = nextbi;
 			}
+			/* and fail all 'written' */
+			bi = sh->dev[i].written;
+			sh->dev[i].written = NULL;
+			while (bi && bi->bi_sector < dev->sector + STRIPE_SECTORS) {
+				struct bio *bi2 = bi->bi_next;
+				clear_bit(BIO_UPTODATE, &bi->bi_flags);
+				if (--bi->bi_phys_segments == 0) {
+					md_write_end(conf->mddev);
+					bi->bi_next = return_bi;
+					return_bi = bi;
+				}
+				bi = bi2;
+			}
+
 			/* fail any reads if this device is non-operational */
 			if (!test_bit(R5_Insync, &sh->dev[i].flags)) {
 				bi = sh->dev[i].toread;