    slab: fix oops when reading /proc/slab_allocators · 03787301
    Joonsoo Kim authored
    Commit b1cb0982 ("change the management method of free objects of
    the slab") introduced a bug in the slab leak detector
    ('/proc/slab_allocators').  The detector works as follows:
    
     1. Traverse all objects on all the slabs.
     2. Determine whether each object is active or not.
     3. If it is active, print who allocated the object.
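
The three steps above can be sketched as a small userspace model. This is only an illustration, not the kernel code: the `walk_*` types and `collect_callers()` are hypothetical names, and the "print" step is modelled as writing the caller into an output array.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel identifiers. */
struct walk_object {
    bool active;          /* true while the object is handed out to a caller */
    unsigned long caller; /* allocation site, meaningful only when active */
};

struct walk_slab {
    struct walk_object *objs;
    size_t nr_objs;
};

/*
 * Step 1: walk every object on every slab.
 * Step 2: skip the objects that are not active.
 * Step 3: record the caller of each active object.
 * Returns the number of callers written to out[], at most out_cap.
 */
size_t collect_callers(const struct walk_slab *slabs, size_t nr_slabs,
                       unsigned long *out, size_t out_cap)
{
    size_t n = 0;

    for (size_t s = 0; s < nr_slabs; s++) {
        for (size_t i = 0; i < slabs[s].nr_objs; i++) {
            const struct walk_object *obj = &slabs[s].objs[i];

            if (!obj->active)
                continue;
            if (n == out_cap)
                return n;
            out[n++] = obj->caller;
        }
    }
    return n;
}
```

The whole scheme depends on step 2 being right: the caller field is only read for objects that pass the activity check.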
    
    However, that commit changed how free objects are managed, so the
    logic that determines whether an object is active changed as well.
    Previously, we regarded an object in the cpu caches as inactive; with
    that commit, we mistakenly regard an object in the cpu caches as
    active.
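
A minimal sketch of how such a misclassification can arise, assuming a simplified model in which a slab keeps an index array of its free objects and a separate per-CPU cache holds objects that have been freed but not yet returned to the slab. The `fl_*` names and the array layout are assumptions made for illustration, not the kernel's data structures.

```c
#include <stdbool.h>

#define FL_NR_OBJS 4

/* Hypothetical model of one slab. */
struct fl_slab {
    unsigned int freelist[FL_NR_OBJS]; /* entries [in_use, FL_NR_OBJS) are free object indices */
    unsigned int in_use;               /* number of objects taken from the slab */
};

/* Hypothetical model of one per-CPU cache of freed objects. */
struct fl_cpu_cache {
    unsigned int entries[FL_NR_OBJS];  /* indices of objects parked in the cache */
    unsigned int avail;
};

/* Activity check that consults only the slab's own freelist. */
bool fl_active_by_freelist(const struct fl_slab *slab, unsigned int idx)
{
    for (unsigned int j = slab->in_use; j < FL_NR_OBJS; j++)
        if (slab->freelist[j] == idx)
            return false;
    return true;
}

/* Ground truth in this model: an object parked in the CPU cache is not allocated. */
bool fl_really_active(const struct fl_slab *slab,
                      const struct fl_cpu_cache *cc, unsigned int idx)
{
    for (unsigned int k = 0; k < cc->avail; k++)
        if (cc->entries[k] == idx)
            return false;
    return fl_active_by_freelist(slab, idx);
}
```

In this model an object that sits in the per-CPU cache is absent from the slab freelist, so `fl_active_by_freelist()` reports it as active even though no caller owns it.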
    
    This introduces a kernel oops if DEBUG_PAGEALLOC is enabled.  If
    DEBUG_PAGEALLOC is enabled, kernel_map_pages() is used to detect who
    corrupts free memory in the slab.  It unmaps the page table mapping
    if an object is free and maps it if the object is active.  When the
    slab leak detector checks an object in the cpu caches, it mistakenly
    thinks the object is active and tries to access the object's memory
    to retrieve the caller of the allocation.  At this point, the page
    table mapping for this object doesn't exist, so an oops occurs.
    
    The following is the oops message reported by Dave.
    
    It blew up when something tried to read /proc/slab_allocators
    (Just cat it, and you should see the oops below)
    
      Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
      Modules linked in:
      [snip...]
      CPU: 1 PID: 9386 Comm: trinity-c33 Not tainted 3.14.0-rc5+ #131
      task: ffff8801aa46e890 ti: ffff880076924000 task.ti: ffff880076924000
      RIP: 0010:[<ffffffffaa1a8f4a>]  [<ffffffffaa1a8f4a>] handle_slab+0x8a/0x180
      RSP: 0018:ffff880076925de0  EFLAGS: 00010002
      RAX: 0000000000001000 RBX: 0000000000000000 RCX: 000000005ce85ce7
      RDX: ffffea00079be100 RSI: 0000000000001000 RDI: ffff880107458000
      RBP: ffff880076925e18 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000000 R11: 000000000000000f R12: ffff8801e6f84000
      R13: ffffea00079be100 R14: ffff880107458000 R15: ffff88022bb8d2c0
      FS:  00007fb769e45740(0000) GS:ffff88024d040000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffff8801e6f84ff8 CR3: 00000000a22db000 CR4: 00000000001407e0
      DR0: 0000000002695000 DR1: 0000000002695000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000070602
      Call Trace:
        leaks_show+0xce/0x240
        seq_read+0x28e/0x490
        proc_reg_read+0x3d/0x80
        vfs_read+0x9b/0x160
        SyS_read+0x58/0xb0
        tracesys+0xd4/0xd9
      Code: f5 00 00 00 0f 1f 44 00 00 48 63 c8 44 3b 0c 8a 0f 84 e3 00 00 00 83 c0 01 44 39 c0 72 eb 41 f6 47 1a 01 0f 84 e9 00 00 00 89 f0 <4d> 8b 4c 04 f8 4d 85 c9 0f 84 88 00 00 00 49 8b 7e 08 4d 8d 46
      RIP   handle_slab+0x8a/0x180
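
As a cross-check of the dump above: decoding by hand the bytes at the faulting instruction marker (`<4d> 8b 4c 04 f8`) appears to give `mov -0x8(%r12,%rax,1),%r9`, so the faulting address should be R12 + RAX - 8. The helper below is a hypothetical name used only to redo that arithmetic. Reading R12 as the object's base and RAX as its size, so that the access targets the object's last word, is an inference from the register values and is not stated in the report.

```c
#include <stdint.h>

/*
 * Effective address of an x86-64 memory operand of the form
 * disp8(base, index, 1): base + index + sign-extended 8-bit displacement.
 */
uint64_t effective_address(uint64_t base, uint64_t index, int8_t disp)
{
    /* Sign-extend the displacement, then rely on unsigned wraparound. */
    return base + index + (uint64_t)(int64_t)disp;
}
```

With R12 = 0xffff8801e6f84000, RAX = 0x1000 and a displacement of -8, the result is 0xffff8801e6f84ff8, which matches the CR2 value in the dump.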
    
    To fix the problem, I introduce an object status buffer on each slab.
    With this, we can track object status precisely, so the slab leak
    detector will not access inactive objects and no kernel oops will
    occur.  The memory overhead caused by this fix is imposed only when
    CONFIG_DEBUG_SLAB_LEAK is enabled, which is mainly used for
    debugging, so the memory overhead isn't a big problem.
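
The idea of a per-slab object status buffer can be sketched in userspace as follows. The `st_*` names, the one-byte-per-object layout and the fixed object count are assumptions made for illustration; the actual change lives in slab.c and may differ in detail.

```c
#include <stddef.h>

#define ST_NR_OBJS 4

/* Hypothetical status values; zero-initialised memory means "free". */
enum st_status { ST_OBJ_FREE = 0, ST_OBJ_ACTIVE = 1 };

/* Hypothetical model of a slab carrying one status byte per object. */
struct st_slab {
    unsigned char status[ST_NR_OBJS];
    unsigned long caller[ST_NR_OBJS];
};

/* Mark the object active at the moment it is handed to a caller. */
void st_alloc(struct st_slab *s, unsigned int idx, unsigned long caller)
{
    s->status[idx] = ST_OBJ_ACTIVE;
    s->caller[idx] = caller;
}

/*
 * Mark the object free as soon as the caller releases it, regardless of
 * whether it goes back to the slab or is parked in a per-CPU cache.
 */
void st_free(struct st_slab *s, unsigned int idx)
{
    s->status[idx] = ST_OBJ_FREE;
}

/* Leak report that consults only the status buffer before reading a caller. */
size_t st_report(const struct st_slab *s, unsigned long *out)
{
    size_t n = 0;

    for (unsigned int i = 0; i < ST_NR_OBJS; i++)
        if (s->status[i] == ST_OBJ_ACTIVE)
            out[n++] = s->caller[i];
    return n;
}
```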

    Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
    Reported-by: Dave Jones <davej@redhat.com>
    Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
    Reviewed-by: Vladimir Davydov <vdavydov@parallels.com>
    Cc: Christoph Lameter <cl@linux.com>
    Cc: Pekka Enberg <penberg@kernel.org>
    Cc: David Rientjes <rientjes@google.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>