[PATCH] dcache_rcu
Patch from Maneesh Soni <maneesh@in.ibm.com>, Dipankar Sarma <dipankar@in.ibm.com> and probably others. This patch provides dcache_lock free d_lookup() using RCU. Al pointed races with d_move and lockfree d_lookup() while concurrent rename is going on. We tested this with a test doing million renames each in 50 threads on 50 different ramfs filesystems. And simultaneously running millions of "ls". The tests were done on 4-way SMP box. 1. Lookup going to a different bucket as the current dentry is moved to a different bucket due to rename. This is solved by having a list_head pointer in the dentry structure which points to the bucket head it belongs. The bucket pointer is updated when the dentry is added to the hash chain. Lookup checks if the current dentry belongs to a different bucket, the cached lookup is failed and real lookup will be done. This condition occured nearly about 100 times during the heavy_rename test. 2. Lookup has got the dentry it is looking and it is comparing various keys and meanwhile a rename operation moves the dentry. This is solved by using a per dentry counter (d_move_count) which is updated at the end of d_move. Lookup takes a snapshot of the d_move_count before comparing the keys and once the comparision succeeds, it takes the per dentry lock to check the d_move_count again. If move_count differs, then dentry is moved (or renamed) and the lookup is failed. 3. There can be a theoritical race when a dentry keeps coming back to original bucket due to double moves. Due to this lookup may consider that it has never moved and can end up in a infinite loop. This is solved by using a loop_counter which is compared with a approximate maximum number of dentries per bucket. This never got hit during the heavy_rename test. 4. There is one more change regarding the loop termintaion condition in d_lookup, now the next hash pointer is compared with the current dentries bucket pointer (is_bucket()). 5. memcmp() in d_lookup() can go out of bounds if name pointer and length fields are not consistent. For this we used a pointer to qstr to keep length and name pointer in one structre. We also tried solving these by using a rwlock but it could not compete with lockless solution.
Showing
This diff is collapsed.
Please register or sign in to comment