    mm/memory-failure: try to send SIGBUS even if unmap failed · aa298fdf
    Jane Chu authored
    Patch series "Enhance soft hwpoison handling and injection", v4.
    
    This series is aimed at the following enhancements:
    
    - Let one hwpoison injector, madvise(MADV_HWPOISON), behave more as if
      a real UE had occurred.  The other two injectors, hwpoison-inject and
      'einj' on x86, can't do so, and it seems to me we need a better
      simulation of the real UE scenario.
    - For years, if the kernel was unable to unmap a hwpoisoned page, it sent
      a SIGKILL instead of a SIGBUS to prevent the user process from
      potentially accessing the page again.  But in doing so, the user
      process also loses important information needed for recovery: the
      vaddr.  Fortunately, the kernel already has code to kill a process
      that re-accesses a hwpoisoned page, so remove the '!unmap_success'
      check.
    - Right now, if a THP under a GUP longterm pin is hwpoisoned and the
      kernel cannot split the THP, memory-failure simply ignores the UE
      and returns.  That's not ideal; it could deliver a SIGBUS with useful
      information for userspace recovery.
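
    To illustrate why the vaddr matters to userspace, here is a minimal
    sketch of a process recording the address reported with a memory-error
    SIGBUS.  It is not part of this series.  The names sigbus_kind(),
    on_sigbus(), install_sigbus_handler(), last_poison_addr and
    last_poison_code are hypothetical; only sigaction(), SA_SIGINFO,
    si_addr, si_code and the BUS_MCEERR_* codes are existing interfaces.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>

/* Hypothetical globals: what the last SIGBUS reported. */
static void *volatile last_poison_addr;
static volatile sig_atomic_t last_poison_code;

/* Map a SIGBUS si_code to a short description. */
static const char *sigbus_kind(int si_code)
{
    switch (si_code) {
    case BUS_MCEERR_AR: return "action required";  /* poison was consumed */
    case BUS_MCEERR_AO: return "action optional";  /* poison found, not yet consumed */
    default:            return "other";
    }
}

/*
 * Handler sketch: only records the information.  A real application
 * would also have to stop using the affected range (for example by
 * remapping it or jumping to recovery code); simply returning after a
 * BUS_MCEERR_AR would re-execute the faulting access.
 */
static void on_sigbus(int sig, siginfo_t *info, void *ucontext)
{
    (void)sig;
    (void)ucontext;
    last_poison_code = info->si_code;
    last_poison_addr = info->si_addr;  /* the vaddr that SIGKILL cannot convey */
}

/* Install the handler with SA_SIGINFO so that siginfo_t is passed in. */
static int install_sigbus_handler(void)
{
    struct sigaction sa = {0};

    sa.sa_flags = SA_SIGINFO;
    sa.sa_sigaction = on_sigbus;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGBUS, &sa, NULL);
}
```

    A SIGKILL can be neither caught nor inspected, so none of this
    information reaches the process when the kernel chooses SIGKILL.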
    
    
    This patch (of 5):
    
    For years, when it came to killing a process due to hwpoison, a SIGBUS
    was delivered only if the unmap had been successful.  Otherwise, a
    SIGKILL was delivered.  The reason was to prevent the involved process
    from accessing the hwpoisoned page again.
    
    Since then a lot has changed: a hwpoisoned page is marked and, upon being
    re-accessed, the memory-failure handler invokes kill_accessing_process()
    to kill the process immediately.  So let's take out the '!unmap_success'
    factor and try to deliver SIGBUS if possible.
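
    The effect on signal selection can be modelled as below.  This is a
    simplified userspace model, not the code in memory-failure.c: struct
    poison_event and both functions are hypothetical, and
    other_forcekill_reason stands in for whatever other conditions still
    force a SIGKILL.

```c
#include <signal.h>
#include <stdbool.h>

/* Hypothetical, simplified inputs; not the kernel's data structures. */
struct poison_event {
    bool other_forcekill_reason;  /* any remaining reason to force a kill */
    bool unmap_success;           /* did unmapping the poisoned page succeed? */
};

/* Before the patch: a failed unmap alone forces SIGKILL. */
static int signal_before_patch(const struct poison_event *ev)
{
    bool forcekill = ev->other_forcekill_reason || !ev->unmap_success;

    return forcekill ? SIGKILL : SIGBUS;
}

/* After the patch: unmap failure no longer takes part in the decision. */
static int signal_after_patch(const struct poison_event *ev)
{
    bool forcekill = ev->other_forcekill_reason;

    return forcekill ? SIGKILL : SIGBUS;
}
```

    In this model the two functions differ only when the unmap failed and
    nothing else forces a kill, which is exactly the case the patch targets.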
    
    Link: https://lkml.kernel.org/r/20240524215306.2705454-1-jane.chu@oracle.com
    Link: https://lkml.kernel.org/r/20240524215306.2705454-2-jane.chu@oracle.com
    Signed-off-by: Jane Chu <jane.chu@oracle.com>
    Reviewed-by: Oscar Salvador <osalvador@suse.de>
    Acked-by: Miaohe Lin <linmiaohe@huawei.com>
    Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
    Cc: Oscar Salvador <osalvador@suse.de>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>