• Kiran Kumar Modukuri's avatar
    cachefiles: Fix missing clear of the CACHEFILES_OBJECT_ACTIVE flag · 5ce83d4b
    Kiran Kumar Modukuri authored
    In cachefiles_mark_object_active(), the new object is marked active and
    then we try to add it to the active object tree.  If a conflicting object
    is already present, we want to wait for that to go away.  After the wait,
    we go round again and try to re-mark the object as being active - but it's
    already marked active from the first time we went through and a BUG is
    issued.
    
    Fix this by clearing the CACHEFILES_OBJECT_ACTIVE flag before we try again.
    
    Analysis from Kiran Kumar Modukuri:
    
    [Impact]
    Oops during heavy NFS + FSCache + Cachefiles
    
    CacheFiles: Error: Overlong wait for old active object to go away.
    
    BUG: unable to handle kernel NULL pointer dereference at 0000000000000002
    
    CacheFiles: Error: Object already active kernel BUG at
    fs/cachefiles/namei.c:163!
    
    [Cause]
    In a heavily loaded system with big files being read and truncated, an
    fscache object for a cookie is being dropped and a new object being
    looked. The new object being looked for has to wait for the old object
    to go away before the new object is moved to active state.
    
    [Fix]
    Clear the flag 'CACHEFILES_OBJECT_ACTIVE' for the new object when
    retrying the object lookup.
    
    [Testcase]
    Have run ~100 hours of NFS stress tests and have not seen this bug recur.
    
    [Regression Potential]
     - Limited to fscache/cachefiles.
    
    Fixes: 9ae326a6 ("CacheFiles: A cache that backs onto a mounted filesystem")
    Signed-off-by: default avatarKiran Kumar Modukuri <kiran.modukuri@gmail.com>
    Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
    5ce83d4b
namei.c 25.6 KB