    block: Add blk_rq_pos(rq) to sort rq when flushing · 975927b9
    Jianpeng Ma authored
    My workload is a raid5 array with 16 disks, written to by our
    filesystem in direct-io mode.
    
    Using blktrace, I found the following messages:
    8,16   0     6647     2.453665504  2579  M   W 7493152 + 8 [md0_raid5]
    8,16   0     6648     2.453672411  2579  Q   W 7493160 + 8 [md0_raid5]
    8,16   0     6649     2.453672606  2579  M   W 7493160 + 8 [md0_raid5]
    8,16   0     6650     2.453679255  2579  Q   W 7493168 + 8 [md0_raid5]
    8,16   0     6651     2.453679441  2579  M   W 7493168 + 8 [md0_raid5]
    8,16   0     6652     2.453685948  2579  Q   W 7493176 + 8 [md0_raid5]
    8,16   0     6653     2.453686149  2579  M   W 7493176 + 8 [md0_raid5]
    8,16   0     6654     2.453693074  2579  Q   W 7493184 + 8 [md0_raid5]
    8,16   0     6655     2.453693254  2579  M   W 7493184 + 8 [md0_raid5]
    8,16   0     6656     2.453704290  2579  Q   W 7493192 + 8 [md0_raid5]
    8,16   0     6657     2.453704482  2579  M   W 7493192 + 8 [md0_raid5]
    8,16   0     6658     2.453715016  2579  Q   W 7493200 + 8 [md0_raid5]
    8,16   0     6659     2.453715247  2579  M   W 7493200 + 8 [md0_raid5]
    8,16   0     6660     2.453721730  2579  Q   W 7493208 + 8 [md0_raid5]
    8,16   0     6661     2.453721974  2579  M   W 7493208 + 8 [md0_raid5]
    8,16   0     6662     2.453728202  2579  Q   W 7493216 + 8 [md0_raid5]
    8,16   0     6663     2.453728436  2579  M   W 7493216 + 8 [md0_raid5]
    8,16   0     6664     2.453734782  2579  Q   W 7493224 + 8 [md0_raid5]
    8,16   0     6665     2.453735019  2579  M   W 7493224 + 8 [md0_raid5]
    8,16   0     6666     2.453741401  2579  Q   W 7493232 + 8 [md0_raid5]
    8,16   0     6667     2.453741632  2579  M   W 7493232 + 8 [md0_raid5]
    8,16   0     6668     2.453748148  2579  Q   W 7493240 + 8 [md0_raid5]
    8,16   0     6669     2.453748386  2579  M   W 7493240 + 8 [md0_raid5]
    8,16   0     6670     2.453851843  2579  I   W 7493144 + 104 [md0_raid5]
    8,16   0        0     2.453853661     0  m   N cfq2579 insert_request
    8,16   0     6671     2.453854064  2579  I   W 7493120 + 24 [md0_raid5]
    8,16   0        0     2.453854439     0  m   N cfq2579 insert_request
    8,16   0     6672     2.453854793  2579  U   N [md0_raid5] 2
    8,16   0        0     2.453855513     0  m   N cfq2579 Not idling.st->count:1
    8,16   0        0     2.453855927     0  m   N cfq2579 dispatch_insert
    8,16   0        0     2.453861771     0  m   N cfq2579 dispatched a request
    8,16   0        0     2.453862248     0  m   N cfq2579 activate rq,drv=1
    8,16   0     6673     2.453862332  2579  D   W 7493120 + 24 [md0_raid5]
    8,16   0        0     2.453865957     0  m   N cfq2579 Not idling.st->count:1
    8,16   0        0     2.453866269     0  m   N cfq2579 dispatch_insert
    8,16   0        0     2.453866707     0  m   N cfq2579 dispatched a request
    8,16   0        0     2.453867061     0  m   N cfq2579 activate rq,drv=2
    8,16   0     6674     2.453867145  2579  D   W 7493144 + 104 [md0_raid5]
    8,16   0     6675     2.454147608     0  C   W 7493120 + 24 [0]
    8,16   0        0     2.454149357     0  m   N cfq2579 complete rqnoidle 0
    8,16   0     6676     2.454791505     0  C   W 7493144 + 104 [0]
    8,16   0        0     2.454794803     0  m   N cfq2579 complete rqnoidle 0
    8,16   0        0     2.454795160     0  m   N cfq schedule dispatch
    
    From the messages above, we can see that rq[W 7493144 + 104] and rq[W
    7493120 + 24] were not merged.
    This is because the bios arrived in this order:
      8,16   0     6638     2.453619407  2579  Q   W 7493144 + 8 [md0_raid5]
      8,16   0     6639     2.453620460  2579  G   W 7493144 + 8 [md0_raid5]
      8,16   0     6640     2.453639311  2579  Q   W 7493120 + 8 [md0_raid5]
      8,16   0     6641     2.453639842  2579  G   W 7493120 + 8 [md0_raid5]
    bio(7493144) arrived first and bio(7493120) later, so the subsequent
    bios were split into two requests.
    When the plug list is flushed, elv_attempt_insert_merge only supports
    back merges, not front merges,
    so rq[7493120 + 24] can't merge with rq[7493144 + 104].
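
The effect described above can be reproduced with a small userspace toy model. This is only a sketch under stated assumptions: `struct toy_rq`, `toy_rq_cmp`, `toy_flush` and `toy_flush_sorted` are hypothetical names, not kernel code, and the back-merge check is a simplified stand-in for what elv_attempt_insert_merge does. It shows that inserting requests in arrival order with back merges only leaves two requests, while sorting by start sector first lets them merge into one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a block request: start sector and length. */
struct toy_rq {
	unsigned long long pos;  /* start sector, playing the role of blk_rq_pos(rq) */
	unsigned int sectors;    /* length in sectors */
};

/* qsort comparator: ascending start sector. */
static int toy_rq_cmp(const void *a, const void *b)
{
	const struct toy_rq *ra = a, *rb = b;

	return (ra->pos > rb->pos) - (ra->pos < rb->pos);
}

/*
 * Insert requests one at a time into out[], attempting only a back merge:
 * a new request is appended to an already inserted request if it starts
 * exactly where that request ends. No front merge is attempted.
 * Returns the number of requests left in out[].
 */
static size_t toy_flush(const struct toy_rq *in, size_t n, struct toy_rq *out)
{
	size_t count = 0;

	assert(in != NULL && out != NULL);
	for (size_t i = 0; i < n; i++) {
		size_t j;

		for (j = 0; j < count; j++) {
			if (out[j].pos + out[j].sectors == in[i].pos) {
				out[j].sectors += in[i].sectors;  /* back merge */
				break;
			}
		}
		if (j == count)
			out[count++] = in[i];  /* no merge candidate: new request */
	}
	return count;
}

/* Same as toy_flush(), but sorts the list by start sector first. */
static size_t toy_flush_sorted(struct toy_rq *in, size_t n, struct toy_rq *out)
{
	qsort(in, n, sizeof(*in), toy_rq_cmp);
	return toy_flush(in, n, out);
}
```

With the two requests from the trace, {7493144, 104} followed by {7493120, 24}, `toy_flush` keeps both requests, whereas `toy_flush_sorted` merges them into a single request starting at sector 7493120 with 128 sectors.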
    
    In my tests, this situation accounted for 25% of cases in our system.
    With this patch, it no longer occurs.
    
    Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
    Cc: Shaohua Li <shli@kernel.org>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
blk-core.c 81 KB