    fs/netfs/fscache_cookie: add missing "n_accesses" check · f71aa063
    Max Kellermann authored
    This fixes a NULL pointer dereference bug due to a data race which
    looks like this:
    
      BUG: kernel NULL pointer dereference, address: 0000000000000008
      #PF: supervisor read access in kernel mode
      #PF: error_code(0x0000) - not-present page
      PGD 0 P4D 0
      Oops: 0000 [#1] SMP PTI
      CPU: 33 PID: 16573 Comm: kworker/u97:799 Not tainted 6.8.7-cm4all1-hp+ #43
      Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 10/17/2018
      Workqueue: events_unbound netfs_rreq_write_to_cache_work
      RIP: 0010:cachefiles_prepare_write+0x30/0xa0
      Code: 57 41 56 45 89 ce 41 55 49 89 cd 41 54 49 89 d4 55 53 48 89 fb 48 83 ec 08 48 8b 47 08 48 83 7f 10 00 48 89 34 24 48 8b 68 20 <48> 8b 45 08 4c 8b 38 74 45 49 8b 7f 50 e8 4e a9 b0 ff 48 8b 73 10
      RSP: 0018:ffffb4e78113bde0 EFLAGS: 00010286
      RAX: ffff976126be6d10 RBX: ffff97615cdb8438 RCX: 0000000000020000
      RDX: ffff97605e6c4c68 RSI: ffff97605e6c4c60 RDI: ffff97615cdb8438
      RBP: 0000000000000000 R08: 0000000000278333 R09: 0000000000000001
      R10: ffff97605e6c4600 R11: 0000000000000001 R12: ffff97605e6c4c68
      R13: 0000000000020000 R14: 0000000000000001 R15: ffff976064fe2c00
      FS:  0000000000000000(0000) GS:ffff9776dfd40000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000008 CR3: 000000005942c002 CR4: 00000000001706f0
      Call Trace:
       <TASK>
       ? __die+0x1f/0x70
       ? page_fault_oops+0x15d/0x440
       ? search_module_extables+0xe/0x40
       ? fixup_exception+0x22/0x2f0
       ? exc_page_fault+0x5f/0x100
       ? asm_exc_page_fault+0x22/0x30
       ? cachefiles_prepare_write+0x30/0xa0
       netfs_rreq_write_to_cache_work+0x135/0x2e0
       process_one_work+0x137/0x2c0
       worker_thread+0x2e9/0x400
       ? __pfx_worker_thread+0x10/0x10
       kthread+0xcc/0x100
       ? __pfx_kthread+0x10/0x10
       ret_from_fork+0x30/0x50
       ? __pfx_kthread+0x10/0x10
       ret_from_fork_asm+0x1b/0x30
       </TASK>
      Modules linked in:
      CR2: 0000000000000008
      ---[ end trace 0000000000000000 ]---
    
    This happened because fscache_cookie_state_machine() was slow and was
    still running while another process invoked fscache_unuse_cookie().
    That led to a fscache_cookie_lru_do_one() call, which set the
    FSCACHE_COOKIE_DO_LRU_DISCARD flag.  The still-running
    fscache_cookie_state_machine() picked up the flag and withdrew the
    cookie via cachefiles_withdraw_cookie(), clearing cookie->cache_priv.
    
    At the same time, yet another process invoked
    cachefiles_prepare_write(), which found a NULL pointer in this code
    line:
    
      struct cachefiles_object *object = cachefiles_cres_object(cres);
    
    The next line then dereferences the NULL "object" pointer and crashes:
    
      struct cachefiles_cache *cache = object->volume->cache;
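
    As an illustration of why the oops reports address 0x8 rather than
    0x0, here is a minimal user-space sketch.  The struct names and the
    member layout ("volume" as the second pointer-sized member) are
    assumptions made for this example, not copied from the kernel
    headers; the point is only that reading object->volume through a
    NULL "object" touches the member's offset.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures (layout assumed). */
struct model_volume {
    void *cache;
};

struct model_object {
    void *cookie;                 /* assumed first member */
    struct model_volume *volume;  /* assumed second member */
};

/*
 * With object == NULL, loading object->volume reads from address
 * 0 + offsetof(volume).  That offset is what shows up as the fault
 * address (CR2) in an oops of this kind.
 */
static size_t model_fault_address(void)
{
    return offsetof(struct model_object, volume);
}
```

    On a 64-bit build the function returns the size of one pointer,
    which matches the faulting address in the trace above under the
    assumed layout.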
    
    During cachefiles_prepare_write(), the "n_accesses" counter is
    non-zero (via fscache_begin_operation()).  The cookie must not be
    withdrawn until it drops to zero.
    
    The counter is checked by fscache_cookie_state_machine() before
    switching to FSCACHE_COOKIE_STATE_RELINQUISHING and
    FSCACHE_COOKIE_STATE_WITHDRAWING (in "case
    FSCACHE_COOKIE_STATE_FAILED"), but not for
    FSCACHE_COOKIE_STATE_LRU_DISCARDING ("case
    FSCACHE_COOKIE_STATE_ACTIVE").
    
    This patch adds the missing check.  With a non-zero access counter,
    the function returns without acting on the flag, and the next
    fscache_end_cookie_access() call that drops the counter to zero will
    queue another fscache_cookie_state_machine() call to handle the
    still-pending FSCACHE_COOKIE_DO_LRU_DISCARD.
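
    The intended behaviour can be sketched with a small single-threaded
    user-space model.  Everything below (the model_* names, the plain
    int counter, calling the state machine directly instead of queueing
    work) is hypothetical and only mirrors the description above; it is
    not the kernel code and not the actual diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two cookie states relevant here. */
enum model_state { MODEL_ACTIVE, MODEL_LRU_DISCARDING };

struct model_cookie {
    enum model_state state;
    int n_accesses;       /* in-flight cache operations */
    bool do_lru_discard;  /* stands in for FSCACHE_COOKIE_DO_LRU_DISCARD */
    void *cache_priv;     /* backend object; NULL once withdrawn */
};

/*
 * One pass of the modelled state machine.
 * Returns true if the cookie was withdrawn during this pass.
 */
static bool model_state_machine(struct model_cookie *c)
{
    if (c->state != MODEL_ACTIVE || !c->do_lru_discard)
        return false;

    /* The added check: do not withdraw while an operation is running.
     * The flag stays set so a later pass can handle it. */
    if (c->n_accesses != 0)
        return false;

    c->do_lru_discard = false;
    c->state = MODEL_LRU_DISCARDING;
    c->cache_priv = NULL;  /* models the withdrawal clearing cache_priv */
    return true;
}

/*
 * Models the end of a cache operation: when the counter reaches zero,
 * the state machine gets another pass (the kernel queues work instead
 * of calling it directly).
 */
static bool model_end_access(struct model_cookie *c)
{
    if (--c->n_accesses == 0)
        return model_state_machine(c);
    return false;
}
```

    In this model, a pass with n_accesses == 1 leaves cache_priv intact
    and the flag pending; the withdrawal only happens once
    model_end_access() has dropped the counter to zero.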
    
    Fixes: 12bb21a2 ("fscache: Implement cookie user counting and resource pinning")
    Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
    Signed-off-by: David Howells <dhowells@redhat.com>
    Link: https://lore.kernel.org/r/20240729162002.3436763-2-dhowells@redhat.com
    cc: Jeff Layton <jlayton@kernel.org>
    cc: netfs@lists.linux.dev
    cc: linux-fsdevel@vger.kernel.org
    cc: stable@vger.kernel.org
    Signed-off-by: Christian Brauner <brauner@kernel.org>
fscache_cookie.c 34.3 KB