Commit 8d2c0d36 authored by Linus Torvalds's avatar Linus Torvalds

radix tree: fix sibling entry handling in radix_tree_descend()

The fixes to the radix tree test suite show that the multi-order case is
broken.  The basic reason is that the radix tree code uses tagged
pointers with the "internal" bit in the low bits, and calculating the
pointer indices was supposed to mask off those bits.  But gcc will
notice that we then use the index to re-create the pointer, and will
avoid doing the arithmetic and use the tagged pointer directly.

This cleans the code up, using the existing is_sibling_entry() helper to
validate the sibling pointer range (instead of open-coding it), and
using entry_to_node() to mask off the low tag bit from the pointer.  And
once you do that, you might as well just use the now cleaned-up pointer
directly.

[ Side note: the multi-order code isn't actually ever used in the kernel
  right now, and the only reason I didn't just delete all that code is
  that Kirill Shutemov piped up and said:

    "Well, my ext4-with-huge-pages patchset[1] uses multi-order entries.
     It also converts shmem-with-huge-pages and hugetlb to them.

     I'm okay with converting it to other mechanism, but I need
     something.  (I looked into Konstantin's RFC patchset[2].  It looks
     okay, but I don't feel myself qualified to review it as I don't
     know much about radix-tree internals.)"

  [1] http://lkml.kernel.org/r/20160915115523.29737-1-kirill.shutemov@linux.intel.com
  [2] http://lkml.kernel.org/r/147230727479.9957.1087787722571077339.stgit@zurg ]
Reported-by: default avatarMatthew Wilcox <mawilcox@microsoft.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Cedric Blancher <cedric.blancher@gmail.com>
Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
parent 62fd5258
...@@ -105,10 +105,10 @@ static unsigned int radix_tree_descend(struct radix_tree_node *parent, ...@@ -105,10 +105,10 @@ static unsigned int radix_tree_descend(struct radix_tree_node *parent,
#ifdef CONFIG_RADIX_TREE_MULTIORDER #ifdef CONFIG_RADIX_TREE_MULTIORDER
if (radix_tree_is_internal_node(entry)) { if (radix_tree_is_internal_node(entry)) {
unsigned long siboff = get_slot_offset(parent, entry); if (is_sibling_entry(parent, entry)) {
if (siboff < RADIX_TREE_MAP_SIZE) { void **sibentry = (void **) entry_to_node(entry);
offset = siboff; offset = get_slot_offset(parent, sibentry);
entry = rcu_dereference_raw(parent->slots[offset]); entry = rcu_dereference_raw(*sibentry);
} }
} }
#endif #endif
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment