[PATCH] Infrastructure for correct hugepage refcounting
We currently have a problem when things like ptrace, futexes and direct-io try to pin user pages. If the user's address is in a huge page we're elevting the refcount of a constituent 4k page, not the head page of the high-order allocation unit. To solve this, a generic way of handling higher-order pages has been implemented: - A higher-order page is called a "compound page". Chose this because "huge page", "large page", "super page", etc all seem to mean different things to different people. - The first (controlling) 4k page of a compound page is referred to as the "head" page. - The remaining pages are tail pages. All pages have PG_compound set. All pages have their lru.next pointing at the head page (even the head page has this). The head page's lru.prev, if non-zero, holds the address of the compound page's put_page() function. The order of the allocation is stored in the first tail page's lru.prev. This is only for debug at present. This usage means that zero-order pages may not be compound. The above relationships are established for _all_ higher-order pages in the page allocator. Which has some cost, but not much - another atomic op during fork(), mainly. This functionality is only enabled if CONFIG_HUGETLB_PAGE, although it could be turned on permanently. There's a little extra cost in get_page/put_page. These changes do not preclude adding compound pages to the LRU in the future - we can add a new page flag to the head page and then move all the additional data to the first tail page's lru.next, lru.prev, list.next, list.prev, index, private, etc.
Showing
Please register or sign in to comment