    [PATCH] md: Fix assorted raid1 problems. · 81fc1e93
    Neil Brown authored
    From Angus Sawyer <angus.sawyer@dsl.pipex.com>:
    
    1. Null pointer dereference in end_sync_read
    
       r1_bio->read_disk is not initialised correctly in sync_request().
    
       It is used in end_sync_read() to reference
       conf->mirror[read_disk].rdev, which is NULL when one disk is missing.
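    
       A minimal userspace sketch of the idea behind fix (1), not the actual
       kernel change: read_disk is set to a mirror whose rdev is known to be
       non-NULL before any completion handler can use it. The struct layouts
       and the helpers pick_read_disk() and init_sync_read() are hypothetical
       stand-ins invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the raid1 structures. */
struct rdev { int id; };
struct mirror_info { struct rdev *rdev; };
struct r1conf { struct mirror_info mirror[4]; int raid_disks; };
struct r1bio { int read_disk; };

/* Return the index of the first mirror that has a device, or -1 if none. */
static int pick_read_disk(const struct r1conf *conf)
{
    assert(conf != NULL);
    for (int i = 0; i < conf->raid_disks; i++)
        if (conf->mirror[i].rdev != NULL)
            return i;
    return -1;
}

/*
 * Set read_disk before the request is submitted, so a completion handler
 * that dereferences conf->mirror[read_disk].rdev never sees a missing disk.
 * Returns 0 on success, -1 if no mirror has a device.
 */
static int init_sync_read(const struct r1conf *conf, struct r1bio *r1_bio)
{
    int disk = pick_read_disk(conf);

    if (disk < 0)
        return -1;
    r1_bio->read_disk = disk;
    return 0;
}
```

       With mirror 0 missing and mirror 1 present, init_sync_read() selects
       index 1 rather than leaving read_disk pointing at the empty slot.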
    
    
    2. Null pointer dereference in mempool_free()
    
       This is a race between close_sync() setting conf->r1_bufpool = NULL and
       put_buf() calling mempool_free().
    
       bio completion -> resume_device -> put_buf -> mempool_free(r1_bufpool)
    				|
    			   [ wakeup]
    				|
    			   close_sync()	-> r1_bufpool = NULL;
    
       The attached patch moves the mempool_free() call before the barrier is
       released and merges resume_device() into put_buf() (they are only used
       together). Otherwise I have kept the locking and wakeups identical to
       the existing code (maybe this could be streamlined).
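    
       The ordering in (2) can be sketched in userspace as follows. This is
       an illustration under stated assumptions, not the kernel code: a
       pthread mutex and condition variable stand in for the spinlock and
       wait queue, and struct pool, pool_free() and the field names are
       hypothetical. The point is only that the buffer goes back to the pool
       while the barrier is still held, so close_sync() cannot yet have
       cleared the pool pointer.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the mempool. */
struct pool { int free_count; };

/* Hypothetical, simplified stand-in for the raid1 conf structure. */
struct conf {
    pthread_mutex_t lock;
    pthread_cond_t  wait;        /* stands in for the kernel wait queue */
    int             barrier;     /* non-zero while resync I/O is in flight */
    struct pool    *r1_bufpool;
};

/* Return a buffer to the pool; the pool must still exist. */
static void pool_free(struct pool *p, void *buf)
{
    assert(p != NULL);
    free(buf);
    p->free_count++;
}

/* Free first, then drop the barrier and wake any waiter. */
static void put_buf(struct conf *conf, void *buf)
{
    pool_free(conf->r1_bufpool, buf);

    pthread_mutex_lock(&conf->lock);
    conf->barrier--;
    pthread_cond_broadcast(&conf->wait);
    pthread_mutex_unlock(&conf->lock);
}

/* Wait until the barrier is down, then tear down the pool pointer. */
static void close_sync(struct conf *conf)
{
    pthread_mutex_lock(&conf->lock);
    while (conf->barrier > 0)
        pthread_cond_wait(&conf->wait, &conf->lock);
    conf->r1_bufpool = NULL;
    pthread_mutex_unlock(&conf->lock);
}
```

       Had put_buf() dropped the barrier before calling pool_free(), a woken
       close_sync() could set r1_bufpool to NULL in between, which is the
       dereference described above.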
    
    3. BUG() at close_sync() if (waitqueue_active(&conf->wait_resume)).
    
       This occurs with and without the patch for (2).
    
       I think this is a false BUG().  From what I understand of the device barrier
       code, there is nothing wrong with make_request() waiting on wait_resume when
       this test is made.  Therefore I have removed it (the wait_idle test is still
       correct).
    
    4. raid1 tries to start a resync if there is only one working drive,
       which is pretty pointless, and noisy.  We notice that special case and
       avoid the resync.
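    
       The special case in (4) amounts to a check like the one below. This is
       a sketch only: struct array_state, its fields and should_resync() are
       hypothetical names, and the real patch may express the test
       differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the array state at start-up. */
struct array_state {
    int  working_disks;  /* number of usable mirrors */
    bool dirty;          /* array was not shut down cleanly */
};

/*
 * A resync copies data between mirrors, so it is only worth starting
 * when the array is dirty and there is more than one working drive.
 */
static bool should_resync(const struct array_state *s)
{
    assert(s != NULL);
    return s->dirty && s->working_disks > 1;
}
```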
raid1.c 29.9 KB