    mm: support anonymous stable page · 6ca29ee3
    Minchan Kim authored
    commit f0571429 upstream.
    
    During development of zram-swap asynchronous writeback, I found a strange
    corruption of compressed pages, resulting in:
    
      Modules linked in: zram(E)
      CPU: 3 PID: 1520 Comm: zramd-1 Tainted: G            E   4.8.0-mm1-00320-ge0d4894c9c38-dirty #3274
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
      task: ffff88007620b840 task.stack: ffff880078090000
      RIP: set_freeobj.part.43+0x1c/0x1f
      RSP: 0018:ffff880078093ca8  EFLAGS: 00010246
      RAX: 0000000000000018 RBX: ffff880076798d88 RCX: ffffffff81c408c8
      RDX: 0000000000000018 RSI: 0000000000000000 RDI: 0000000000000246
      RBP: ffff880078093cb0 R08: 0000000000000000 R09: 0000000000000000
      R10: ffff88005bc43030 R11: 0000000000001df3 R12: ffff880076798d88
      R13: 000000000005bc43 R14: ffff88007819d1b8 R15: 0000000000000001
      FS:  0000000000000000(0000) GS:ffff88007e380000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fc934048f20 CR3: 0000000077b01000 CR4: 00000000000406e0
      Call Trace:
        obj_malloc+0x22b/0x260
        zs_malloc+0x1e4/0x580
        zram_bvec_rw+0x4cd/0x830 [zram]
        page_requests_rw+0x9c/0x130 [zram]
        zram_thread+0xe6/0x173 [zram]
        kthread+0xca/0xe0
        ret_from_fork+0x25/0x30
    
    Investigation revealed that stable pages currently do not cover
    anonymous pages.  In other words, reuse_swap_page() can reuse a page
    without waiting for writeback to complete, so it can overwrite a page
    that zram is still compressing.
    
    Unfortunately, zram has used per-cpu compression streams since v4.7.
    The feature aims to increase the cache hit ratio of the scratch buffer
    used for compression.  The downside is that zram must request memory
    for the compressed page in per-cpu context, which requires a stricter
    gfp flag, and that allocation can fail.  If it fails, zram retries the
    allocation outside per-cpu context, where it can succeed, then
    compresses the data again and copies the result into that memory.
    
    In this scenario zram assumes the data never changes, but that is not
    true unless stable pages are supported.  If the data changes under us,
    zram can overrun the buffer: the second compression can produce a
    larger result than the first one did, and zram blindly copies the
    larger object into the buffer sized for the first result.  The overrun
    breaks zsmalloc's free object chaining, so the system crashes as shown
    above.
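
    The failure mode above can be sketched in user space.  This is only an
    illustration, not what this patch does and not zram code: rle_compress()
    is a toy run-length encoder standing in for the real compressor, and
    store_slow_path() and enum store_result are hypothetical names.  The
    sketch bounds the second compression by the size measured in the first
    pass, so a page that changed in between is reported instead of being
    copied past the end of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical result codes for the illustrative store path below. */
enum store_result { STORE_OK, STORE_SIZE_CHANGED, STORE_NOMEM };

/*
 * Toy run-length encoder (assumption: stands in for the real compressor).
 * Emits (run length, byte) pairs.  Returns the number of bytes the full
 * output needs, but never writes more than cap bytes into out.
 */
size_t rle_compress(const unsigned char *src, size_t len,
                    unsigned char *out, size_t cap)
{
    size_t need = 0;
    size_t i = 0;

    assert(src != NULL || len == 0);
    while (i < len) {
        unsigned char b = src[i];
        size_t run = 1;

        while (i + run < len && src[i + run] == b && run < 255)
            run++;
        if (need + 2 <= cap) {          /* only write what fits */
            out[need] = (unsigned char)run;
            out[need + 1] = b;
        }
        need += 2;
        i += run;
    }
    return need;
}

/*
 * Hypothetical slow path: first_len was measured by an earlier compression
 * pass.  Allocate exactly that much, compress again, and refuse to store
 * the result if the page changed and no longer fits.
 */
enum store_result store_slow_path(const unsigned char *page, size_t page_len,
                                  size_t first_len, unsigned char **slot)
{
    unsigned char *buf = malloc(first_len);
    size_t second_len;

    if (!buf)
        return STORE_NOMEM;

    second_len = rle_compress(page, page_len, buf, first_len);
    if (second_len > first_len) {
        /* An unchecked copy here would be the overrun described above. */
        free(buf);
        return STORE_SIZE_CHANGED;
    }
    *slot = buf;
    return STORE_OK;
}
```

    In this toy model a page of 64 identical bytes compresses to 2 bytes;
    if the page is rewritten with alternating bytes before the second pass,
    the encoder needs 128 bytes and store_slow_path() returns
    STORE_SIZE_CHANGED rather than writing beyond the 2-byte buffer.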
    
    I think the report below is the same problem.
    https://bugzilla.suse.com/show_bug.cgi?id=997574
    
    Unfortunately, reuse_swap_page() must be atomic, so we cannot wait on
    writeback there.  The approach in this patch is simply to return false
    if the page needs to be stable.  Although this temporarily increases
    the memory footprint, it happens rarely and the memory should be
    reclaimed easily even when it does happen.  It is also better than
    waiting for IO completion, which is on the critical path for
    application latency.
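
    The decision described above can be modelled roughly as follows.  This
    is a user-space toy, not the kernel change itself: struct toy_page,
    struct toy_swap_dev, their fields, and toy_reuse_swap_page() are
    hypothetical stand-ins for the real page flags and swap device state.
    It only shows the shape of the rule: never sleep, and refuse reuse when
    the page is under writeback to a device that needs stable pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a swap device (not a kernel type). */
struct toy_swap_dev {
    bool stable_writes;     /* device reads the page during writeback */
};

/* Hypothetical stand-in for the per-page state consulted on reuse. */
struct toy_page {
    int  ref_count;         /* mappings plus swap references */
    bool in_swap_cache;
    bool under_writeback;
    bool dirty;
    const struct toy_swap_dev *dev;
};

/*
 * Returns true if the faulting task may write to the page in place.
 * Returning false makes the caller fall back to copying the page, which
 * costs memory for a while but never blocks on I/O.
 */
bool toy_reuse_swap_page(struct toy_page *page)
{
    assert(page != NULL);

    if (page->ref_count > 1)
        return false;                   /* shared: must copy */

    if (page->in_swap_cache) {
        if (!page->under_writeback) {
            /* No I/O in flight: drop the swap-cache copy and reuse. */
            page->in_swap_cache = false;
            page->dirty = true;
        } else if (page->dev != NULL && page->dev->stable_writes) {
            /* Device may still be reading the page: refuse, do not wait. */
            return false;
        }
    }
    return true;
}
```

    With a device that sets stable_writes, a page under writeback is not
    reused; once writeback has finished, the same page is reused as before.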
    
    Fixes: da9556a2 ("zram: user per-cpu compression streams")
    Link: http://lkml.kernel.org/r/20161120233015.GA14113@bbox
    Link: http://lkml.kernel.org/r/1482366980-3782-2-git-send-email-minchan@kernel.org
    Signed-off-by: Minchan Kim <minchan@kernel.org>
    Acked-by: Hugh Dickins <hughd@google.com>
    Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
    Cc: Darrick J. Wong <darrick.wong@oracle.com>
    Cc: Takashi Iwai <tiwai@suse.de>
    Cc: Hyeoncheol Lee <cheol.lee@lge.com>
    Cc: <yjay.kim@lge.com>
    Cc: Sangseok Lee <sangseok.lee@lge.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>