1. 14 Apr, 2023 1 commit
    • md/raid10: fix task hung in raid10d · 72c215ed
      Li Nan authored
      Commit fe630de0 ("md/raid10: avoid deadlock on recovery.") allowed
      normal I/O and sync I/O to be in flight at the same time. As a result,
      a task can hang as shown below:
      
      T1                      T2              T3              T4
      raid10d
       handle_read_error
        allow_barrier
         conf->nr_pending--
          -> 0
                              //submit sync io
                              raid10_sync_request
                               raise_barrier
                                ->will not be blocked
                                ...
                              //submit to drivers
        raid10_read_request
         wait_barrier
          conf->nr_pending++
           -> 1
                                              //retry read fail
                                              raid10_end_read_request
                                               reschedule_retry
                                                add to retry_list
                                                conf->nr_queued++
                                                 -> 1
                                                              //sync io fail
                                                              end_sync_read
                                                               __end_sync_read
                                                                reschedule_retry
                                                                 add to retry_list
                                                                 conf->nr_queued++
                                                                  -> 2
       ...
       handle_read_error
       get from retry_list
       conf->nr_queued--
        freeze_array
         wait nr_pending == nr_queued+1
              ->1           ->2
         //task hung
      
      The retried read and the sync I/O are both added to retry_list
      (nr_queued -> 2) if they fail. raid10d() then calls handle_read_error()
      and hangs in freeze_array(). nr_queued cannot decrease because raid10d
      is blocked, and nr_pending cannot increase because conf->barrier is
      not released.
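
The counter arithmetic in the trace above can be replayed in a small user-space sketch. This is only an illustrative model, not kernel code: `struct model_conf`, `freeze_condition_met()` and `replay_hung_trace()` are hypothetical names, and the starting state (one pending read, empty retry list) is an assumption about the moment raid10d enters handle_read_error().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two counters the trace refers to. */
struct model_conf {
	int nr_pending;	/* I/O that has passed wait_barrier() */
	int nr_queued;	/* failed I/O parked on retry_list */
};

/* Condition from the trace: nr_pending == nr_queued + extra. */
static bool freeze_condition_met(const struct model_conf *c, int extra)
{
	return c->nr_pending == c->nr_queued + extra;
}

/* Replays the counter updates of T1..T4 in the order shown in the trace. */
static struct model_conf replay_hung_trace(void)
{
	/* Assumed start: the failed read still holds its nr_pending reference. */
	struct model_conf c = { .nr_pending = 1, .nr_queued = 0 };

	c.nr_pending--;	/* T1: allow_barrier()                 -> 0 */
	/* T2: raise_barrier() is not blocked, sync I/O is submitted */
	c.nr_pending++;	/* T1: wait_barrier() for the retry    -> 1 */
	c.nr_queued++;	/* T3: retried read fails              -> 1 */
	c.nr_queued++;	/* T4: sync read fails                 -> 2 */
	c.nr_queued--;	/* T1: raid10d takes one retry entry   -> 1 */

	return c;
}
```

With these values, `freeze_condition_met(&c, 1)` compares 1 against 1 + 1, so the condition that freeze_array() waits for is never satisfied.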
      
      Fix it by moving allow_barrier() after raid10_read_request().
      raise_barrier() will wait for nr_waiting to become 0. Therefore, sync io
      and regular io will not be issued at the same time.
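
The effect of the reordering can be sketched with the same kind of user-space model. It assumes, as a simplification of the trace in this message, that sync I/O can slip in only while nr_pending is 0. `retry_opens_sync_window()` is a hypothetical helper for illustration, not a kernel function.

```c
#include <stdbool.h>

/*
 * Hypothetical model: returns true if nr_pending ever reaches 0 while the
 * failed read is handed over to its retry, i.e. if there is a window in
 * which a raise_barrier() caller would not be blocked.
 */
static bool retry_opens_sync_window(bool allow_barrier_first)
{
	int nr_pending = 1;	/* the failed read still holds its reference */
	bool window = false;

	if (allow_barrier_first) {
		/* Old order: drop the reference, then resubmit. */
		nr_pending--;			/* allow_barrier() */
		window = window || (nr_pending == 0);
		nr_pending++;			/* wait_barrier() in raid10_read_request() */
	} else {
		/* New order: resubmit first, then drop the old reference. */
		nr_pending++;			/* wait_barrier() in raid10_read_request() */
		window = window || (nr_pending == 0);
		nr_pending--;			/* allow_barrier() */
	}

	return window || (nr_pending == 0);
}
```

In this model the old order lets nr_pending touch 0 between the two calls, while the new order keeps it at 1 or above for the whole hand-over.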
      
      Also remove the check of nr_queued in stop_waiting_barrier().
      nr_queued can be 0, but there is no need to block in that case. Remove
      the check for MD_RECOVERY_RUNNING as well, since it is redundant.
      
      Fixes: fe630de0 ("md/raid10: avoid deadlock on recovery.")
      Signed-off-by: Li Nan <linan122@huawei.com>
      Signed-off-by: Song Liu <song@kernel.org>
      Link: https://lore.kernel.org/r/20230222041000.3341651-2-linan666@huaweicloud.com
  2. 13 Apr, 2023 33 commits
  3. 12 Apr, 2023 6 commits