    zram: support idle/huge page writeback · a939888e
    Minchan Kim authored
    Add a new feature "zram idle/huge page writeback".  In the zram-swap use
    case, zram usually has many idle/huge swap pages.  It's pointless to keep
    them in memory (i.e., in zram).
    
    To solve this problem, this feature introduces idle/huge page writeback to
    the backing device, with the goal of saving more memory on embedded
    systems.
    
    The normal sequence for using the idle/huge page writeback feature is as
    follows:
    
    while (1) {
            # mark all allocated zram slots as idle
            echo all > /sys/block/zram0/idle
            # leave the system working for several hours;
            # blocks on zram that are not accessed in the meantime
            # remain marked IDLE

            echo "idle" > /sys/block/zram0/writeback
            # and/or
            echo "huge" > /sys/block/zram0/writeback
            # write the IDLE and/or huge marked slots to the backing device
            # and free the memory
    }
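
    The sequence above could be sketched as a small shell helper.  This is
    only an illustration under stated assumptions: the function name
    zram_writeback_cycle, the ZRAM_SYSFS variable and the idle-period
    argument are hypothetical and not part of this patch.  Only the sysfs
    files "idle" and "writeback" and the values "all", "idle" and "huge" come
    from the description above.  It also assumes a backing device has already
    been configured for the zram device.

```shell
#!/bin/sh
# Hypothetical helper: only the sysfs file names and the values written
# to them are taken from the commit description.
ZRAM_SYSFS="${ZRAM_SYSFS:-/sys/block/zram0}"

zram_writeback_cycle() {
    # Seconds to let the system run before writing back (assumed parameter).
    idle_secs="${1:-0}"

    # Mark every allocated slot as idle.
    echo all > "$ZRAM_SYSFS/idle" || return 1

    # Give the workload time to touch the slots it still needs.
    sleep "$idle_secs"

    # Write slots that are still idle to the backing device, then huge slots.
    echo idle > "$ZRAM_SYSFS/writeback" || return 1
    echo huge > "$ZRAM_SYSFS/writeback" || return 1
}
```

    For example, a periodic job might call "zram_writeback_cycle 3600" to
    give slots an hour to be touched before anything is written back.  Either
    of the two writeback lines can be dropped if only idle or only huge
    writeback is wanted.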
    
    Per the discussion at
    https://lore.kernel.org/lkml/20181122065926.GG3441@jagdpanzerIV/T/#u,
    this patch removes the direct incompressible page writeback feature
    (d2afd25114f4 ("zram: write incompressible pages to backing device")).
    
    Below are the concerns from Sergey:
    == &< ==
    
    "IDLE writeback" is superior to "incompressible writeback".
    
    "incompressible writeback" is completely unpredictable and uncontrollable;
    it depends on data patterns and compression algorithms, while "IDLE
    writeback" is predictable.
    
    I even suspect that, *ideally*, we can remove "incompressible writeback".
    "IDLE pages" is a superset which also includes "incompressible" pages.
    So, technically, we can still do "incompressible writeback" from the "IDLE
    writeback" path, but in a much more reasonable way, based on a page idling
    period.
    
    I understand that you want to keep "direct incompressible writeback"
    around.  ZRAM is especially popular on devices which do suffer from flash
    wearout, so I can see "incompressible writeback" path becoming a dead
    code, long term.
    
    == &< ==
    
    Below are the concerns from Minchan:
    == &< ==
    
    My concern is that if we enable CONFIG_ZRAM_WRITEBACK in this
    implementation, both hugepage and idlepage writeback will turn on.
    However, some users want to enable only idlepage writeback, so we would
    need to introduce an on/off knob for hugepage writeback or a new
    CONFIG_ZRAM_IDLEPAGE_WRITEBACK for those use cases.  I don't want to make
    it complicated *if possible*.
    
    Long term, I imagine we need to make the VM aware of a new swap hierarchy
    that is a little different from the current one.  For example, the first
    high-priority swap device can return -EIO or -ENOCOMP, and swap then tries
    to fall back to the next lower-priority swap device.  With that, hugepage
    writeback will work transparently.
    
    So we could regard it as a regression, because incompressible pages don't
    go to backing storage automatically.  Instead, the user should do it
    manually via "echo huge > /sys/block/zram/writeback".
    
    == &< ==
    
    Link: http://lkml.kernel.org/r/20181127055429.251614-6-minchan@kernel.org
    Signed-off-by: Minchan Kim <minchan@kernel.org>
    Reviewed-by: Joey Pabalinas <joeypabalinas@gmail.com>
    Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
sysfs-block-zram 3.93 KB