• Wanpeng Li's avatar
    mm/memory-failure.c: fix bug triggered by unpoisoning empty zero page · 3ba5eebc
    Wanpeng Li authored
      Injecting memory failure for page 0x19d0 at 0xb77d2000
      MCE 0x19d0: non LRU page recovery: Ignored
      MCE: Software-unpoisoned page 0x19d0
      BUG: Bad page state in process bash  pfn:019d0
      page:f3461a00 count:0 mapcount:0 mapping:  (null) index:0x0
      page flags: 0x40000404(referenced|reserved)
      Modules linked in: nfsd auth_rpcgss i915 nfs_acl nfs lockd video drm_kms_helper drm bnep rfcomm sunrpc bluetooth psmouse parport_pc ppdev lp serio_raw fscache parport gpio_ich lpc_ich mac_hid i2c_algo_bit tpm_tis wmi usb_storage hid_generic usbhid hid e1000e firewire_ohci firewire_core ahci ptp libahci pps_core crc_itu_t
      CPU: 3 PID: 2123 Comm: bash Not tainted 3.11.0-rc6+ #12
      Hardware name: LENOVO 7034DD7/        , BIOS 9HKT47AUS 01//2012
       00000000 00000000 e9625ea0 c15ec49b f3461a00 e9625eb8 c15ea119 c17cbf18
       ef084314 000019d0 f3461a00 e9625ed8 c110dc8a f3461a00 00000001 00000000
       f3461a00 40000404 00000000 e9625ef8 c110dcc1 f3461a00 f3461a00 000019d0
      Call Trace:
        dump_stack+0x41/0x52
        bad_page+0xcf/0xeb
        free_pages_prepare+0x12a/0x140
        free_hot_cold_page+0x21/0x110
        __put_single_page+0x21/0x30
        put_page+0x25/0x40
        unpoison_memory+0x107/0x200
        hwpoison_unpoison+0x20/0x30
        simple_attr_write+0xb6/0xd0
        vfs_write+0xa0/0x1b0
        SyS_write+0x4f/0x90
        sysenter_do_call+0x12/0x22
      Disabling lock debugging due to kernel taint
    
    Testcase:
    
    #define _GNU_SOURCE
    #include <stdlib.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/types.h>
    #include <errno.h>
    
    #define PAGES_TO_TEST 1
    #define PAGE_SIZE	4096
    
    int main(void)
    {
    	char *mem;
    
    	mem = mmap(NULL, PAGES_TO_TEST * PAGE_SIZE,
    			PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
    
    	if (madvise(mem, PAGES_TO_TEST * PAGE_SIZE, MADV_HWPOISON) == -1)
    		return -1;
    
    	munmap(mem, PAGES_TO_TEST * PAGE_SIZE);
    
    	return 0;
    }
    
    There is one page reference count for default empty zero page,
    madvise_hwpoison add another one by get_user_pages_fast.  memory_hwpoison
    reduce one page reference count since it's a non LRU page.
    unpoison_memory release the last page reference count and free empty zero
    page to buddy system which is not correct since empty zero page has
    PG_reserved flag.  This patch fix it by don't reduce the page reference
    count under 1 against empty zero page.
    Signed-off-by: default avatarWanpeng Li <liwanp@linux.vnet.ibm.com>
    Reviewed-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
    Cc: Andi Kleen <andi@firstfloor.org>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    3ba5eebc
memory-failure.c 45.6 KB