1. 24 Mar, 2023 4 commits
      nilfs2: fix kernel-infoleak in nilfs_ioctl_wrap_copy() · 00358700
      Ryusuke Konishi authored
      The ioctl helper function nilfs_ioctl_wrap_copy(), which exchanges a
      metadata array to/from user space, may copy uninitialized buffer regions
      to user space memory for read-only ioctl commands NILFS_IOCTL_GET_SUINFO
      and NILFS_IOCTL_GET_CPINFO.
      
      This can occur when the element size of the user space metadata given by
      the v_size member of the argument nilfs_argv structure is larger than the
      size of the metadata element (nilfs_suinfo structure or nilfs_cpinfo
      structure) on the file system side.
      
      KMSAN-enabled kernels detect this issue as follows:
      
       BUG: KMSAN: kernel-infoleak in instrument_copy_to_user
       include/linux/instrumented.h:121 [inline]
       BUG: KMSAN: kernel-infoleak in _copy_to_user+0xc0/0x100 lib/usercopy.c:33
        instrument_copy_to_user include/linux/instrumented.h:121 [inline]
        _copy_to_user+0xc0/0x100 lib/usercopy.c:33
        copy_to_user include/linux/uaccess.h:169 [inline]
        nilfs_ioctl_wrap_copy+0x6fa/0xc10 fs/nilfs2/ioctl.c:99
        nilfs_ioctl_get_info fs/nilfs2/ioctl.c:1173 [inline]
        nilfs_ioctl+0x2402/0x4450 fs/nilfs2/ioctl.c:1290
        nilfs_compat_ioctl+0x1b8/0x200 fs/nilfs2/ioctl.c:1343
        __do_compat_sys_ioctl fs/ioctl.c:968 [inline]
        __se_compat_sys_ioctl+0x7dd/0x1000 fs/ioctl.c:910
        __ia32_compat_sys_ioctl+0x93/0xd0 fs/ioctl.c:910
        do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline]
        __do_fast_syscall_32+0xa2/0x100 arch/x86/entry/common.c:178
        do_fast_syscall_32+0x37/0x80 arch/x86/entry/common.c:203
        do_SYSENTER_32+0x1f/0x30 arch/x86/entry/common.c:246
        entry_SYSENTER_compat_after_hwframe+0x70/0x82
      
       Uninit was created at:
        __alloc_pages+0x9f6/0xe90 mm/page_alloc.c:5572
        alloc_pages+0xab0/0xd80 mm/mempolicy.c:2287
        __get_free_pages+0x34/0xc0 mm/page_alloc.c:5599
        nilfs_ioctl_wrap_copy+0x223/0xc10 fs/nilfs2/ioctl.c:74
        nilfs_ioctl_get_info fs/nilfs2/ioctl.c:1173 [inline]
        nilfs_ioctl+0x2402/0x4450 fs/nilfs2/ioctl.c:1290
        nilfs_compat_ioctl+0x1b8/0x200 fs/nilfs2/ioctl.c:1343
        __do_compat_sys_ioctl fs/ioctl.c:968 [inline]
        __se_compat_sys_ioctl+0x7dd/0x1000 fs/ioctl.c:910
        __ia32_compat_sys_ioctl+0x93/0xd0 fs/ioctl.c:910
        do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline]
        __do_fast_syscall_32+0xa2/0x100 arch/x86/entry/common.c:178
        do_fast_syscall_32+0x37/0x80 arch/x86/entry/common.c:203
        do_SYSENTER_32+0x1f/0x30 arch/x86/entry/common.c:246
        entry_SYSENTER_compat_after_hwframe+0x70/0x82
      
       Bytes 16-127 of 3968 are uninitialized
       ...
      
      This eliminates the leak issue by initializing the page allocated as
      buffer using get_zeroed_page().
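
The effect described above can be reproduced in user space with a small sketch. It is only an illustration, not the nilfs2 code: `toy_elem`, `toy_fill_bounce()` and `toy_stale_bytes()` are hypothetical names, `malloc()` plus a 0xAA fill stands in for a page that still holds stale data, and `calloc()` stands in for a zeroed page. The sizes used in the usage note (31 slots of 128 bytes, 16 bytes written per slot) are chosen to be consistent with the KMSAN report, which flags bytes 16-127 of a 3968-byte copy.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical 16-byte stand-in for a file-system-side metadata element. */
struct toy_elem {
    uint64_t first;
    uint64_t second;
};

/*
 * Build a bounce buffer holding n slots of user_size bytes each and write
 * one toy_elem at the start of every slot. When zeroed is 0 the buffer is
 * pre-filled with 0xAA to stand in for stale page contents.
 */
static unsigned char *toy_fill_bounce(size_t n, size_t user_size, int zeroed)
{
    unsigned char *buf;
    size_t i;

    assert(user_size >= sizeof(struct toy_elem));
    buf = zeroed ? calloc(n, user_size) : malloc(n * user_size);
    if (!buf)
        return NULL;
    if (!zeroed)
        memset(buf, 0xAA, n * user_size);

    for (i = 0; i < n; i++) {
        struct toy_elem e = { .first = i, .second = i * 2 };

        /* Only sizeof(e) bytes of each slot are ever written. */
        memcpy(buf + i * user_size, &e, sizeof(e));
    }
    return buf;
}

/* Count non-zero bytes in the tail of each slot, past the element proper. */
static size_t toy_stale_bytes(const unsigned char *buf, size_t n,
                              size_t user_size)
{
    size_t i, j, stale = 0;

    for (i = 0; i < n; i++)
        for (j = sizeof(struct toy_elem); j < user_size; j++)
            if (buf[i * user_size + j] != 0)
                stale++;
    return stale;
}
```

With 31 slots of 128 bytes, the unzeroed buffer carries 31 * 112 stale bytes that a whole-buffer copy would hand to the caller, while the zeroed buffer carries none.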
      
      Link: https://lkml.kernel.org/r/20230307085548.6290-1-konishi.ryusuke@gmail.com
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Reported-by: syzbot+132fdd2f1e1805fdc591@syzkaller.appspotmail.com
        Link: https://lkml.kernel.org/r/000000000000a5bd2d05f63f04ae@google.com
      Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      test_maple_tree: add more testing for mas_empty_area() · 4bd6dded
      Liam R. Howlett authored
      Test robust filling of an entire area of the tree, then test one beyond. 
      This is to test the walking back up the tree at the end of nodes and error
      condition.  Test inspired by the reproducer code provided by Snild Dolkow.
      
      The last test in the function tests for the case of a corrupted maple
      state caused by the incorrect limits set during mas_skip_node().  There
      needs to be a gap in the second last child and last child, but the search
      must rule out the second last child's gap.  This would avoid correcting
      the maple state to the correct max limit and return an error.
      
      Link: https://lkml.kernel.org/r/20230307180247.2220303-3-Liam.Howlett@oracle.com
      Cc: Snild Dolkow <snild@sony.com>
      Link: https://lore.kernel.org/linux-mm/cb8dc31a-fef2-1d09-f133-e9f7b9f9e77a@sony.com/
      Fixes: e15e06a8 ("lib/test_maple_tree: add testing for maple tree")
      Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
      Cc: Peng Zhang <zhangpeng.00@bytedance.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      maple_tree: fix mas_skip_node() end slot detection · 0fa99fdf
      Liam R. Howlett authored
      Patch series "Fix mas_skip_node() for mas_empty_area()", v2.
      
      mas_empty_area() was incorrectly returning an error when there was room. 
      The issue was tracked down to mas_skip_node() using the incorrect
      end-of-slot count.  Instead of using the node's hard limit, the limit of
      data should be used.
      
      mas_skip_node() was also setting the min and max to that of the child
      node, which was unnecessary.  Within these limits being set, there was
      also a bug that corrupted the maple state's max if the offset was set to
      the maximum node pivot.  The bug was without consequence unless there was
      a sufficient gap in the next child node which would cause an error to be
      returned.
      
      This patch set fixes these errors by removing the limit setting from
      mas_skip_node() and uses the mas_data_end() for slot limits, and adds
      tests for all failures discovered.
      
      
      This patch (of 2):
      
      mas_skip_node() is used to move the maple state to the node with a higher
      limit.  It does this by walking up the tree and increasing the slot count.
      Since slot count may not be able to be increased, it may need to walk up
      multiple times to find room to walk right to a higher limit node.  The
      limit of slots that was being used was the node limit and not the last
      location of data in the node.  This would cause the maple state to be
      shifted outside actual data and enter an error state, thus returning
      -EBUSY.
      
      The result of the incorrect error state means that mas_awalk() would
      return an error instead of finding the allocation space.
      
      The fix is to use mas_data_end() in mas_skip_node() to detect the node's
      data end point and continue walking the tree up until it is safe to move
      to a node with a higher limit.
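
The difference between the two slot limits can be sketched with a toy model. This is not the maple tree code: `toy_node`, `toy_state`, `toy_skip_node()` and `TOY_NODE_SLOTS` are hypothetical, and each node is reduced to a parent pointer, its slot index in the parent, and the index of its last slot holding data (the value that mas_data_end() reports for a real node).

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NODE_SLOTS 16   /* hypothetical hard capacity of a node */

struct toy_node {
    struct toy_node *parent;
    unsigned char parent_slot;  /* index of this node in its parent */
    unsigned char data_end;     /* index of the last slot holding data */
};

struct toy_state {
    struct toy_node *node;
    unsigned char offset;
    int error;
};

/*
 * Walk up from s->node until an ancestor has a slot to the right of the
 * one we came from, then point the state at that slot. Returns 1 on
 * success; returns 0 and sets s->error if the root is reached first.
 * use_data_end selects the slot limit: the ancestor's real data end
 * (fixed behaviour) or the hard node capacity (buggy behaviour).
 */
static int toy_skip_node(struct toy_state *s, int use_data_end)
{
    struct toy_node *n;

    assert(s != NULL && s->node != NULL);
    n = s->node;
    while (n->parent) {
        unsigned int slot = n->parent_slot;
        unsigned int limit = use_data_end ? n->parent->data_end
                                          : TOY_NODE_SLOTS - 1;

        n = n->parent;
        if (slot < limit) {
            s->node = n;
            s->offset = (unsigned char)(slot + 1);
            return 1;
        }
    }
    s->error = 1;   /* stands in for the -EBUSY error state */
    return 0;
}
```

For a leaf sitting in the last used slot of its parent, the hard limit moves the state to an offset past the parent's data, whereas the data-end limit keeps walking up until a populated slot to the right exists.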
      
      The walk up the tree also sets the maple state limits so remove the buggy
      code from mas_skip_node().  Setting the limits had the unfortunate side
      effect of triggering another bug if the parent node was full and there
      was no suitable gap in the second last child, but room in the next child.
      
      mas_skip_node() may also be passed a maple state in an error state from
      mas_anode_descend() when no allocations are available.  Return on such an
      error state immediately.
      
      Link: https://lkml.kernel.org/r/20230307180247.2220303-1-Liam.Howlett@oracle.com
      Link: https://lkml.kernel.org/r/20230307180247.2220303-2-Liam.Howlett@oracle.com
      Fixes: 54a611b6 ("Maple Tree: add new data structure")
      Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
      Reported-by: Snild Dolkow <snild@sony.com>
        Link: https://lore.kernel.org/linux-mm/cb8dc31a-fef2-1d09-f133-e9f7b9f9e77a@sony.com/
      Tested-by: Snild Dolkow <snild@sony.com>
      Cc: Peng Zhang <zhangpeng.00@bytedance.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      mm, vmalloc: fix high order __GFP_NOFAIL allocations · e9c3cda4
      Michal Hocko authored
      Gao Xiang has reported that the page allocator complains about high order
      __GFP_NOFAIL request coming from the vmalloc core:
      
       __alloc_pages+0x1cb/0x5b0 mm/page_alloc.c:5549
       alloc_pages+0x1aa/0x270 mm/mempolicy.c:2286
       vm_area_alloc_pages mm/vmalloc.c:2989 [inline]
       __vmalloc_area_node mm/vmalloc.c:3057 [inline]
       __vmalloc_node_range+0x978/0x13c0 mm/vmalloc.c:3227
       kvmalloc_node+0x156/0x1a0 mm/util.c:606
       kvmalloc include/linux/slab.h:737 [inline]
       kvmalloc_array include/linux/slab.h:755 [inline]
       kvcalloc include/linux/slab.h:760 [inline]
      
      It seems that I completely missed the case of high-order allocations
      backing vmalloc areas when implementing __GFP_NOFAIL support.  This means
      that [k]vmalloc et al. can issue higher-order allocations with
      __GFP_NOFAIL, which can easily trigger the OOM killer for non-costly
      orders or cause a lot of reclaim/compaction activity if those requests
      cannot be satisfied.
      
      Fix the issue by falling back to zero order allocations for __GFP_NOFAIL
      requests if the high order request fails.
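
The fallback can be sketched as a toy allocation loop. This is a simplified model, not the vmalloc code: `toy_alloc_fn`, `toy_only_order0()` and `toy_area_alloc_pages()` are hypothetical, the allocator is reduced to a callback that reports whether a block of 2^order pages could be obtained, and nr_pages is assumed to be a multiple of 2^order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical allocator stub: true if a block of 2^order pages is available. */
typedef bool (*toy_alloc_fn)(unsigned int order);

/* Models fragmented memory: only single pages can be obtained. */
static bool toy_only_order0(unsigned int order)
{
    return order == 0;
}

/*
 * Try to obtain nr_pages pages in blocks of 2^order pages and return the
 * number of pages obtained. If a high-order attempt fails and the caller
 * asked for nofail semantics, drop to order 0 and carry on; otherwise
 * give up at the first failure.
 */
static size_t toy_area_alloc_pages(size_t nr_pages, unsigned int order,
                                   bool nofail, toy_alloc_fn try_alloc)
{
    size_t got = 0;

    assert(try_alloc != NULL);
    while (got < nr_pages) {
        if (!try_alloc(order)) {
            /*
             * The toy also stops when order 0 fails, so that the sketch
             * always terminates.
             */
            if (!nofail || order == 0)
                break;
            order = 0;      /* fall back to zero-order allocations */
            continue;
        }
        got += (size_t)1 << order;
    }
    return got;
}
```

With `toy_only_order0` as the allocator, an order-2 request for 8 pages yields all 8 pages when nofail is set and 0 pages when it is not.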
      
      Link: https://lkml.kernel.org/r/ZAXynvdNqcI0f6Us@dhcp22.suse.cz
      Fixes: 9376130c ("mm/vmalloc: add support for __GFP_NOFAIL")
      Reported-by: Gao Xiang <hsiangkao@linux.alibaba.com>
        Link: https://lkml.kernel.org/r/20230305053035.1911-1-hsiangkao@linux.alibaba.com
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  2. 23 Mar, 2023 3 commits
  3. 22 Mar, 2023 3 commits
  4. 21 Mar, 2023 7 commits
  5. 20 Mar, 2023 6 commits
  6. 19 Mar, 2023 17 commits