Commit d33e4e14, authored by Matthew Wilcox (Oracle), committed by Andrew Morton

vmscan: convert the writeback handling in shrink_page_list() to folios

Slightly more efficient due to fewer calls to compound_head().

Link: https://lkml.kernel.org/r/20220504182857.4013401-7-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
parent 1bee2c16
@@ -1598,40 +1598,42 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 			stat->nr_congested += nr_pages;
 
 		/*
-		 * If a page at the tail of the LRU is under writeback, there
+		 * If a folio at the tail of the LRU is under writeback, there
 		 * are three cases to consider.
 		 *
-		 * 1) If reclaim is encountering an excessive number of pages
-		 *    under writeback and this page is both under writeback and
-		 *    PageReclaim then it indicates that pages are being queued
-		 *    for IO but are being recycled through the LRU before the
-		 *    IO can complete. Waiting on the page itself risks an
-		 *    indefinite stall if it is impossible to writeback the
-		 *    page due to IO error or disconnected storage so instead
-		 *    note that the LRU is being scanned too quickly and the
-		 *    caller can stall after page list has been processed.
+		 * 1) If reclaim is encountering an excessive number of folios
+		 *    under writeback and this folio is both under
+		 *    writeback and has the reclaim flag set then it
+		 *    indicates that folios are being queued for I/O but
+		 *    are being recycled through the LRU before the I/O
+		 *    can complete. Waiting on the folio itself risks an
+		 *    indefinite stall if it is impossible to writeback
+		 *    the folio due to I/O error or disconnected storage
+		 *    so instead note that the LRU is being scanned too
+		 *    quickly and the caller can stall after the folio
+		 *    list has been processed.
 		 *
-		 * 2) Global or new memcg reclaim encounters a page that is
+		 * 2) Global or new memcg reclaim encounters a folio that is
 		 *    not marked for immediate reclaim, or the caller does not
 		 *    have __GFP_FS (or __GFP_IO if it's simply going to swap,
-		 *    not to fs). In this case mark the page for immediate
+		 *    not to fs). In this case mark the folio for immediate
 		 *    reclaim and continue scanning.
 		 *
 		 *    Require may_enter_fs() because we would wait on fs, which
-		 *    may not have submitted IO yet. And the loop driver might
-		 *    enter reclaim, and deadlock if it waits on a page for
+		 *    may not have submitted I/O yet. And the loop driver might
+		 *    enter reclaim, and deadlock if it waits on a folio for
 		 *    which it is needed to do the write (loop masks off
 		 *    __GFP_IO|__GFP_FS for this reason); but more thought
 		 *    would probably show more reasons.
 		 *
-		 * 3) Legacy memcg encounters a page that is already marked
-		 *    PageReclaim. memcg does not have any dirty pages
+		 * 3) Legacy memcg encounters a folio that already has the
+		 *    reclaim flag set. memcg does not have any dirty folio
 		 *    throttling so we could easily OOM just because too many
-		 *    pages are in writeback and there is nothing else to
+		 *    folios are in writeback and there is nothing else to
 		 *    reclaim. Wait for the writeback to complete.
 		 *
-		 * In cases 1) and 2) we activate the pages to get them out of
-		 * the way while we continue scanning for clean pages on the
+		 * In cases 1) and 2) we activate the folios to get them out of
+		 * the way while we continue scanning for clean folios on the
 		 * inactive list and refilling from the active list. The
 		 * observation here is that waiting for disk writes is more
 		 * expensive than potentially causing reloads down the line.
@@ -1639,38 +1641,42 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 		 * memory pressure on the cache working set any longer than it
 		 * takes to write them to disk.
 		 */
-		if (PageWriteback(page)) {
+		if (folio_test_writeback(folio)) {
 			/* Case 1 above */
 			if (current_is_kswapd() &&
-			    PageReclaim(page) &&
+			    folio_test_reclaim(folio) &&
 			    test_bit(PGDAT_WRITEBACK, &pgdat->flags)) {
 				stat->nr_immediate += nr_pages;
 				goto activate_locked;
 
 			/* Case 2 above */
 			} else if (writeback_throttling_sane(sc) ||
-			    !PageReclaim(page) || !may_enter_fs(page, sc->gfp_mask)) {
+			    !folio_test_reclaim(folio) ||
+			    !may_enter_fs(page, sc->gfp_mask)) {
 				/*
-				 * This is slightly racy - end_page_writeback()
-				 * might have just cleared PageReclaim, then
-				 * setting PageReclaim here end up interpreted
-				 * as PageReadahead - but that does not matter
-				 * enough to care.  What we do want is for this
-				 * page to have PageReclaim set next time memcg
-				 * reclaim reaches the tests above, so it will
-				 * then wait_on_page_writeback() to avoid OOM;
-				 * and it's also appropriate in global reclaim.
+				 * This is slightly racy -
+				 * folio_end_writeback() might have just
+				 * cleared the reclaim flag, then setting
+				 * reclaim here ends up interpreted as
+				 * the readahead flag - but that does
+				 * not matter enough to care.  What we
+				 * do want is for this folio to have
+				 * the reclaim flag set next time memcg
+				 * reclaim reaches the tests above, so
+				 * it will then folio_wait_writeback()
+				 * to avoid OOM; and it's also appropriate
+				 * in global reclaim.
 				 */
-				SetPageReclaim(page);
+				folio_set_reclaim(folio);
 				stat->nr_writeback += nr_pages;
 				goto activate_locked;
 
 			/* Case 3 above */
 			} else {
-				unlock_page(page);
-				wait_on_page_writeback(page);
-				/* then go back and try same page again */
-				list_add_tail(&page->lru, page_list);
+				folio_unlock(folio);
+				folio_wait_writeback(folio);
+				/* then go back and try same folio again */
+				list_add_tail(&folio->lru, page_list);
 				continue;
 			}
 		}
......
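The three cases described in the comment reduce to a small decision function over five predicates. The following is a minimal userspace sketch of that control flow, not kernel code: `struct wb_state`, `enum wb_action` and `classify_writeback()` are hypothetical names introduced only for illustration, and each boolean stands in for the kernel predicate named in its comment. It assumes the folio is already known to be under writeback.

```c
#include <stdbool.h>

/*
 * Hypothetical snapshot of the conditions the hunk tests. None of these
 * names exist in the kernel; each field mirrors one predicate from the diff.
 */
struct wb_state {
	bool is_kswapd;        /* current_is_kswapd() */
	bool reclaim_flag;     /* folio_test_reclaim(folio) */
	bool pgdat_writeback;  /* test_bit(PGDAT_WRITEBACK, &pgdat->flags) */
	bool throttling_sane;  /* writeback_throttling_sane(sc) */
	bool may_enter_fs;     /* may_enter_fs(page, sc->gfp_mask) */
};

/* Hypothetical labels for the three outcomes. */
enum wb_action {
	WB_ACTIVATE_IMMEDIATE, /* case 1: count nr_immediate, activate */
	WB_MARK_AND_ACTIVATE,  /* case 2: set reclaim flag, count nr_writeback, activate */
	WB_WAIT_AND_RETRY      /* case 3: unlock, wait for writeback, requeue */
};

/* Same test order as the if / else if / else chain in the diff. */
static enum wb_action classify_writeback(const struct wb_state *s)
{
	if (s->is_kswapd && s->reclaim_flag && s->pgdat_writeback)
		return WB_ACTIVATE_IMMEDIATE;

	if (s->throttling_sane || !s->reclaim_flag || !s->may_enter_fs)
		return WB_MARK_AND_ACTIVATE;

	return WB_WAIT_AND_RETRY;
}
```

In this model, case 3 is reached only when writeback throttling is not sane, the reclaim flag is already set, the caller may enter the filesystem, and the case 1 test has failed. That matches the comment's description of legacy memcg reclaim revisiting a folio it marked earlier.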