[PATCH] speed up rmap searching
Several functions in rmap.c search the ptes[] array to find the first non-null entry. Despite the fact that the whole lot is in L1 cache, this is expensive, especially on 128-byte cacheline machines. We can encode the index of the first non-null pte entry inside the pte_chain's `next' field and remove those searches altogether. This reduces the rmap CPU tax by about 25% on a P4, for a total runtime reduction of around 5% in the bash-script-intensive test which I use.
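To illustrate the encoding, here is a minimal userspace sketch, not the patch itself. It assumes each pte_chain is allocated on a 64-byte boundary, so the low bits of the `next' pointer are always zero and can carry the index. NRPTE, the field name next_and_idx, the helper names and the fill-from-the-end order are all assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NRPTE      7                       /* assumed slots per chain node */
#define CHAIN_ALIGN 64                     /* assumed node alignment in bytes */
#define IDX_MASK   ((uintptr_t)(CHAIN_ALIGN - 1))

/* One node: the `next' pointer and the first-used index share one word. */
struct pte_chain {
    _Alignas(CHAIN_ALIGN) uintptr_t next_and_idx;
    uintptr_t ptes[NRPTE];                 /* 0 means "empty slot" */
};

/* Upper bits of the word: pointer to the next node. */
static struct pte_chain *pte_chain_next(const struct pte_chain *pc)
{
    return (struct pte_chain *)(pc->next_and_idx & ~IDX_MASK);
}

/* Lower bits of the word: index of the first non-null entry.
 * A value of NRPTE means the node holds no entries. */
static unsigned pte_chain_idx(const struct pte_chain *pc)
{
    return (unsigned)(pc->next_and_idx & IDX_MASK);
}

static void pte_chain_set(struct pte_chain *pc, struct pte_chain *next,
                          unsigned idx)
{
    assert(idx <= NRPTE);
    assert(((uintptr_t)next & IDX_MASK) == 0);  /* pointer must be aligned */
    pc->next_and_idx = (uintptr_t)next | idx;
}

/* Allocate an empty, suitably aligned node. */
static struct pte_chain *pte_chain_alloc(void)
{
    struct pte_chain *pc = aligned_alloc(CHAIN_ALIGN, sizeof(*pc));
    if (pc) {
        memset(pc, 0, sizeof(*pc));
        pte_chain_set(pc, NULL, NRPTE);
    }
    return pc;
}

/* Slots are filled from the end towards the front, so the stored index
 * always names the first used slot. Returns 0 on success, -1 if full. */
static int pte_chain_add(struct pte_chain *pc, uintptr_t pte)
{
    unsigned idx = pte_chain_idx(pc);
    if (idx == 0)
        return -1;
    pc->ptes[idx - 1] = pte;
    pte_chain_set(pc, pte_chain_next(pc), idx - 1);
    return 0;
}

/* First non-null entry without scanning ptes[]; 0 if the node is empty. */
static uintptr_t pte_chain_first(const struct pte_chain *pc)
{
    unsigned idx = pte_chain_idx(pc);
    return idx < NRPTE ? pc->ptes[idx] : 0;
}
```

With this layout a lookup of the first used slot is a single mask of a word that has to be loaded anyway to follow the chain, instead of a loop over ptes[].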