Commit 3c7b8b3c authored by Andrew Morton, committed by Linus Torvalds

[PATCH] Fix interaction between batched lru addition and hot/cold

If a page is "freed" while sitting in the deferred-lru-addition queue, the
final reference to it is held by that queue.  When the queue gets spilled
onto the LRU, the page is actually freed.

Which is all expected and natural and works fine - it's a weird case.

But one of the AIM9 tests was taking a 20% performance hit (relative to
2.4) because it was going into the page allocator for new pages while
cache-hot pages were languishing out in the deferred-addition queue.

So the patch changes things so that we spill the CPU's
deferred-lru-addition queue before starting to free pages.  This way,
the recently-used pages actually make it to the hot/cold lists and are
available for new allocations.
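
For illustration only, here is a minimal standalone sketch of the ordering
the patch enforces - drain the deferred-addition queue first, so a page
whose only remaining reference is the queue gets freed before we go hunting
for fresh pages.  This is not kernel code; the names (toy_page,
pending_queue, drain_pending, free_batch) are invented for the example.

	#include <stdio.h>

	#define QUEUE_MAX 16

	struct toy_page { int id; int refcount; };

	/* Stand-in for a per-CPU queue of pages awaiting LRU insertion. */
	static struct toy_page *pending_queue[QUEUE_MAX];
	static int pending_count;

	/* Spill the queue: drop the reference the queue held on each page. */
	static void drain_pending(void)
	{
		for (int i = 0; i < pending_count; i++) {
			struct toy_page *page = pending_queue[i];
			if (--page->refcount == 0)
				printf("page %d freed (only the queue still held it)\n",
				       page->id);
		}
		pending_count = 0;
	}

	/* Free path: drain first, so cache-warm pages reach the allocator now. */
	static void free_batch(struct toy_page **pages, int nr)
	{
		drain_pending();
		for (int i = 0; i < nr; i++) {
			if (--pages[i]->refcount == 0)
				printf("page %d freed\n", pages[i]->id);
		}
	}

	int main(void)
	{
		static struct toy_page a = { .id = 1, .refcount = 1 };
		static struct toy_page b = { .id = 2, .refcount = 2 };
		struct toy_page *batch[1] = { &b };

		pending_queue[pending_count++] = &a;	/* only reference: the queue */
		pending_queue[pending_count++] = &b;	/* queue ref + caller ref */

		free_batch(batch, 1);
		return 0;
	}

In the kernel this ordering corresponds to calling lru_add_drain() on the
freeing paths, as the hunks below do.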

It gets back 15 of the lost 20%.  The other 5% is lost to the general
additional complexity of all this stuff.  (But we're 250% faster than
2.4 when running four instances of the test on 4-way).
parent 2f83855c
@@ -405,6 +405,8 @@ void unmap_page_range(mmu_gather_t *tlb, struct vm_area_struct *vma, unsigned lo
 	BUG_ON(address >= end);
+	lru_add_drain();
 	dir = pgd_offset(vma->vm_mm, address);
 	tlb_start_vma(tlb, vma);
 	do {
@@ -447,6 +449,8 @@ void zap_page_range(struct vm_area_struct *vma, unsigned long address, unsigned
 		return;
 	}
+	lru_add_drain();
 	spin_lock(&mm->page_table_lock);
 	/*
@@ -211,8 +211,19 @@ void release_pages(struct page **pages, int nr, int cold)
 	pagevec_free(&pages_to_free);
 }
+/*
+ * The pages which we're about to release may be in the deferred lru-addition
+ * queues.  That would prevent them from really being freed right now.  That's
+ * OK from a correctness point of view but is inefficient - those pages may be
+ * cache-warm and we want to give them back to the page allocator ASAP.
+ *
+ * So __pagevec_release() will drain those queues here.  __pagevec_lru_add()
+ * and __pagevec_lru_add_active() call release_pages() directly to avoid
+ * mutual recursion.
+ */
 void __pagevec_release(struct pagevec *pvec)
 {
+	lru_add_drain();
 	release_pages(pvec->pages, pagevec_count(pvec), pvec->cold);
 	pagevec_reinit(pvec);
 }
@@ -265,7 +276,8 @@ void __pagevec_lru_add(struct pagevec *pvec)
 	}
 	if (zone)
 		spin_unlock_irq(&zone->lru_lock);
-	pagevec_release(pvec);
+	release_pages(pvec->pages, pvec->nr, pvec->cold);
+	pagevec_reinit(pvec);
 }
 void __pagevec_lru_add_active(struct pagevec *pvec)
@@ -291,7 +303,8 @@ void __pagevec_lru_add_active(struct pagevec *pvec)
 	}
 	if (zone)
 		spin_unlock_irq(&zone->lru_lock);
-	pagevec_release(pvec);
+	release_pages(pvec->pages, pvec->nr, pvec->cold);
+	pagevec_reinit(pvec);
 }
 /*
@@ -300,6 +300,7 @@ void free_pages_and_swap_cache(struct page **pages, int nr)
 	const int chunk = 16;
 	struct page **pagep = pages;
+	lru_add_drain();
 	while (nr) {
 		int todo = min(chunk, nr);
 		int i;