1. 06 Mar, 2020 23 commits
    • Merge tag 'drm-fixes-2020-03-06-1' of git://anongit.freedesktop.org/drm/drm · 2f501bb1
      Linus Torvalds authored
      Pull vgacon fix from Daniel Vetter:
       "One vgacon input check for stable"
      
      * tag 'drm-fixes-2020-03-06-1' of git://anongit.freedesktop.org/drm/drm:
        vgacon: Fix a UAF in vgacon_invert_region
      2f501bb1
    • Merge tag 'for-5.6-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 30fe0d07
      Linus Torvalds authored
      Pull btrfs fix from David Sterba:
       "One fixup for DIO when in use with the new checksums, a missed case
        where the checksum size was still assuming u32"
      
      * tag 'for-5.6-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: fix RAID direct I/O reads with alternate csums
      30fe0d07
    • Merge tag 'filelock-v5.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux · 0b25d458
      Linus Torvalds authored
      Pull file locking fixes from Jeff Layton:
       "Just a couple of late-breaking patches for the file locking code. The
        second patch (from yangerkun) fixes a rather nasty looking potential
        use-after-free that should go to stable.
      
        The other patch could technically wait for 5.7, but it's fairly
        innocuous so I figured we might as well take it"
      
      * tag 'filelock-v5.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
        locks: fix a potential use-after-free problem when wakeup a waiter
        fcntl: Distribute switch variables for initialization
      0b25d458
    • Merge tag 'spi-fix-v5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · ae24a21b
      Linus Torvalds authored
      Pull spi fixes from Mark Brown:
       "A selection of small fixes, mostly for drivers, that have arrived
        since the merge window. None of them are earth shattering in
        themselves but all useful for affected systems"
      
      * tag 'spi-fix-v5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
        spi: spi_register_controller(): free bus id on error paths
        spi: bcm63xx-hsspi: Really keep pll clk enabled
        spi: atmel-quadspi: fix possible MMIO window size overrun
        spi/zynqmp: remove entry that causes a cs glitch
        spi: pxa2xx: Add CS control clock quirk
        spi: spidev: Fix CS polarity if GPIO descriptors are used
        spi: qup: call spi_qup_pm_resume_runtime before suspending
        spi: spi-omap2-mcspi: Support probe deferral for DMA channels
        spi: spi-omap2-mcspi: Handle DMA size restriction on AM65x
      ae24a21b
    • Merge tag 'regulator-fix-v5.6-rc4' of... · 43c63729
      Linus Torvalds authored
      Merge tag 'regulator-fix-v5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
      
      Pull regulator fixes from Mark Brown:
       "A couple of small fixes, one for a minor issue in the stm32-vrefbuf
        driver and a documentation fix in the Qualcomm code"
      
      * tag 'regulator-fix-v5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
        regulator: stm32-vrefbuf: fix a possible overshoot when re-enabling
        regulator: qcom_spmi: Fix docs for PM8004
      43c63729
    • Merge tag 'hwmon-for-v5.6-rc5' of... · 08e39fcb
      Linus Torvalds authored
      Merge tag 'hwmon-for-v5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
      
      Pull hwmon fixes from Guenter Roeck:
       "Fix an error return in the adt7462 driver, bad voltage limits reported
        by the xdpe12284 driver, and a broken documentation reference in the
        adm1177 driver documentation"
      
      * tag 'hwmon-for-v5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        hwmon: (adt7462) Fix an error return in ADT7462_REG_VOLT()
        hwmon: (pmbus/xdpe12284) Add callback for vout limits conversion
        docs: adm1177: fix a broken reference
      08e39fcb
    • Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · c20c4a08
      Linus Torvalds authored
      Pull arm64 fixes from Will Deacon:
       "Here are another three arm64 fixes for 5.6, all pretty minor. Main
        thing is fixing a silly bug in the fsl_imx8_ddr PMU driver where we
        would zero the counters when disabling them.
      
         - Fix misreporting of ASID limit when KPTI is enabled
      
         - Fix busted NULL pointer checks for GICC structure in ACPI PMU code
      
         - Avoid nobbling the "fsl_imx8_ddr" PMU counters when disabling them"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: context: Fix ASID limit in boot messages
        drivers/perf: arm_pmu_acpi: Fix incorrect checking of gicc pointer
        drivers/perf: fsl_imx8_ddr: Correct the CLEAR bit definition
      c20c4a08
    • vgacon: Fix a UAF in vgacon_invert_region · 513dc792
      Zhang Xiaoxu authored
      When syzkaller tests, there is a UAF:
        BUG: KASan: use after free in vgacon_invert_region+0x9d/0x110 at addr
          ffff880000100000
        Read of size 2 by task syz-executor.1/16489
        page:ffffea0000004000 count:0 mapcount:-127 mapping:          (null)
        index:0x0
        page flags: 0xfffff00000000()
        page dumped because: kasan: bad access detected
        CPU: 1 PID: 16489 Comm: syz-executor.1 Not tainted
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
        rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
        Call Trace:
          [<ffffffffb119f309>] dump_stack+0x1e/0x20
          [<ffffffffb04af957>] kasan_report+0x577/0x950
          [<ffffffffb04ae652>] __asan_load2+0x62/0x80
          [<ffffffffb090f26d>] vgacon_invert_region+0x9d/0x110
          [<ffffffffb0a39d95>] invert_screen+0xe5/0x470
          [<ffffffffb0a21dcb>] set_selection+0x44b/0x12f0
          [<ffffffffb0a3bfae>] tioclinux+0xee/0x490
          [<ffffffffb0a1d114>] vt_ioctl+0xff4/0x2670
          [<ffffffffb0a0089a>] tty_ioctl+0x46a/0x1a10
          [<ffffffffb052db3d>] do_vfs_ioctl+0x5bd/0xc40
          [<ffffffffb052e2f2>] SyS_ioctl+0x132/0x170
          [<ffffffffb11c9b1b>] system_call_fastpath+0x22/0x27
          Memory state around the buggy address:
           ffff8800000fff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00
           00 00
           ffff8800000fff80: 00 00 00 00 00 00 00 00 00 00 00 00 00
           00 00 00
          >ffff880000100000: ff ff ff ff ff ff ff ff ff ff ff ff ff
           ff ff ff
      
      It can be reproduced on mainline Linux with the following program:
        #include <stdio.h>
        #include <stdlib.h>
        #include <unistd.h>
        #include <fcntl.h>
        #include <sys/types.h>
        #include <sys/stat.h>
        #include <sys/ioctl.h>
        #include <linux/vt.h>
      
        struct tiocl_selection {
          unsigned short xs;      /* X start */
          unsigned short ys;      /* Y start */
          unsigned short xe;      /* X end */
          unsigned short ye;      /* Y end */
          unsigned short sel_mode; /* selection mode */
        };
      
        #define TIOCL_SETSEL    2
        struct tiocl {
          unsigned char type;
          unsigned char pad;
          struct tiocl_selection sel;
        };
      
        int main()
        {
          int fd = 0;
          const char *dev = "/dev/char/4:1";
      
          struct vt_consize v = {0};
          struct tiocl tioc = {0};
      
          fd = open(dev, O_RDWR, 0);
      
          v.v_rows = 3346;
          ioctl(fd, VT_RESIZEX, &v);
      
          tioc.type = TIOCL_SETSEL;
          ioctl(fd, TIOCLINUX, &tioc);
      
          return 0;
        }
      
      When the screen is resized, 'vc->vc_size_row' is updated to the new row
      size. But 'set_origin' ('vgacon_set_origin') uses 'vga_vram_base' for
      'vc_origin' and 'vc_visible_origin', not 'vc_screenbuf', and the VGA
      memory may be smaller than 'vc_screenbuf'. TIOCLINUX then uses the new
      row size to calculate the offset, which may exceed 'vga_vram_size' in
      the vgacon driver, resulting in a bad access.
      Also, if a larger screenbuf is set first and an even larger one is set
      afterwards, copying old_origin to new_origin may cause a bad access.

      So, if the screen size is larger than vga_vram, the resize should
      fail. This also fixes CVE-2020-8649 and CVE-2020-8647.
      
      Linus pointed out that overflow checking seems absent. We're saved by
      the existing bounds checks in vc_do_resize() with rather strict
      limits:
      
      	if (cols > VC_RESIZE_MAXCOL || lines > VC_RESIZE_MAXROW)
      		return -EINVAL;
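
The overflow argument above can be checked with a small userspace sketch. This is not the kernel's code: the limit values (32767) and the helper names `screen_bytes` and `resize_fits_vram` are assumptions made for illustration. It only shows that a text screen at two bytes per cell, bounded by those limits, still fits in a signed 32-bit int, so the size comparison against the VGA memory size cannot wrap.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/* Assumed limits; the names come from the check quoted above, the values are an assumption. */
#define VC_RESIZE_MAXCOL 32767
#define VC_RESIZE_MAXROW 32767

/* Bytes needed for a text screen with 2 bytes (character + attribute) per cell. */
static long long screen_bytes(unsigned int cols, unsigned int rows)
{
	/* Callers must have applied the resize limits first. */
	assert(cols <= VC_RESIZE_MAXCOL && rows <= VC_RESIZE_MAXROW);
	return 2LL * cols * rows;
}

/* Hypothetical helper: 0 if the resize is acceptable, -EINVAL otherwise. */
static int resize_fits_vram(unsigned int cols, unsigned int rows,
			    unsigned int vram_size)
{
	if (cols > VC_RESIZE_MAXCOL || rows > VC_RESIZE_MAXROW)
		return -EINVAL;
	/* Reject screens that would not fit into the VGA memory window. */
	if (screen_bytes(cols, rows) > (long long)vram_size)
		return -EINVAL;
	return 0;
}
```

With an assumed 32 KiB window, an 80x25 screen is accepted, while the 3346-row screen from the reproducer is rejected.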
      
      Fixes: 0aec4867 ("[PATCH] SVGATextMode fix")
      Reference: CVE-2020-8647 and CVE-2020-8649
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
      [danvet: augment commit message to point out overflow safety]
      Cc: stable@vger.kernel.org
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200304022429.37738-1-zhangxiaoxu5@huawei.com
      513dc792
    • locks: fix a potential use-after-free problem when wakeup a waiter · 6d390e4b
      yangerkun authored
      Commit 16306a61 ("fs/locks: always delete_block after waiting.") added
      logic to check waiter->fl_blocker without holding blocked_lock_lock. This
      can trigger a UAF when we try to wake up a waiter:

      Thread 1 has created a write flock a on a file. Now thread 2 tries to
      unlock and delete flock a, while thread 3 tries to add flock b on the
      same file.
      
      Thread2                         Thread3
                                      flock syscall(create flock b)
      	                        ...flock_lock_inode_wait
      				    flock_lock_inode(will insert
      				    our fl_blocked_member list
      				    to flock a's fl_blocked_requests)
      				   sleep
      flock syscall(unlock)
      ...flock_lock_inode_wait
          locks_delete_lock_ctx
          ...__locks_wake_up_blocks
              __locks_delete_blocks(
      	b->fl_blocker = NULL)
      	...
                                         break by a signal
      				   locks_delete_block
      				    b->fl_blocker == NULL &&
      				    list_empty(&b->fl_blocked_requests)
      	                            success, return directly
      				 locks_free_lock b
      	wake_up(&b->fl_waiter)
      	trigger UAF
      
      Fix it by removing this logic. This patch may also fix CVE-2019-19769.
      
      Cc: stable@vger.kernel.org
      Fixes: 16306a61 ("fs/locks: always delete_block after waiting.")
      Signed-off-by: yangerkun <yangerkun@huawei.com>
      Signed-off-by: Jeff Layton <jlayton@kernel.org>
      6d390e4b
    • Merge branch 'akpm' (patches from Andrew) · aeb542a1
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "7 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        arch/Kconfig: update HAVE_RELIABLE_STACKTRACE description
        mm, hotplug: fix page online with DEBUG_PAGEALLOC compiled but not enabled
        mm/z3fold.c: do not include rwlock.h directly
        fat: fix uninit-memory access for partial initialized inode
        mm: avoid data corruption on CoW fault into PFN-mapped VMA
        mm: fix possible PMD dirty bit lost in set_pmd_migration_entry()
        mm, numa: fix bad pmd by atomically check for pmd_trans_huge when marking page tables prot_numa
      aeb542a1
    • arch/Kconfig: update HAVE_RELIABLE_STACKTRACE description · 140d7e88
      Miroslav Benes authored
      save_stack_trace_tsk_reliable() is no longer the only function providing
      reliable stack traces.  An architecture might define ARCH_STACKWALK,
      which provides a newer stack walking interface and has the
      arch_stack_walk_reliable() function.  Update the description accordingly.

      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Miroslav Benes <mbenes@suse.cz>
      Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Link: http://lkml.kernel.org/r/20200120154042.9934-1-mbenes@suse.cz
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      140d7e88
    • mm, hotplug: fix page online with DEBUG_PAGEALLOC compiled but not enabled · c87cbc1f
      Vlastimil Babka authored
      Commit cd02cf1a ("mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC")
      fixed memory hotplug with debug_pagealloc enabled, where onlining a page
      goes through page freeing, which removes the direct mapping.  Some arches
      don't like when the page is not mapped in the first place, so
      generic_online_page() maps it first.  This is somewhat wasteful, but
      better than special casing page freeing fast paths.
      
      The commit however missed that DEBUG_PAGEALLOC configured doesn't mean
      it's actually enabled.  One has to test debug_pagealloc_enabled() since
      031bc574 ("mm/debug-pagealloc: make debug-pagealloc boottime
      configurable"), or alternatively debug_pagealloc_enabled_static() since
      8e57f8ac ("mm, debug_pagealloc: don't rely on static keys too early"),
      but this is not done.
      
      As a result, an s390 kernel with DEBUG_PAGEALLOC configured but not
      enabled will crash:
      
      Unable to handle kernel pointer dereference in virtual kernel address space
      Failing address: 0000000000000000 TEID: 0000000000000483
      Fault in home space mode while using kernel ASCE.
      AS:0000001ece13400b R2:000003fff7fd000b R3:000003fff7fcc007 S:000003fff7fd7000 P:000000000000013d
      Oops: 0004 ilc:2 [#1] SMP
      CPU: 1 PID: 26015 Comm: chmem Kdump: loaded Tainted: GX 5.3.18-5-default #1 SLE15-SP2 (unreleased)
      Krnl PSW : 0704e00180000000 0000001ecd281b9e (__kernel_map_pages+0x166/0x188)
      R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
      Krnl GPRS: 0000000000000000 0000000000000800 0000400b00000000 0000000000000100
      0000000000000001 0000000000000000 0000000000000002 0000000000000100
      0000001ece139230 0000001ecdd98d40 0000400b00000100 0000000000000000
      000003ffa17e4000 001fffe0114f7d08 0000001ecd4d93ea 001fffe0114f7b20
      Krnl Code: 0000001ecd281b8e: ec17ffff00d8 ahik %r1,%r7,-1
      0000001ecd281b94: ec111dbc0355 risbg %r1,%r1,29,188,3
      >0000001ecd281b9e: 94fb5006 ni 6(%r5),251
      0000001ecd281ba2: 41505008 la %r5,8(%r5)
      0000001ecd281ba6: ec51fffc6064 cgrj %r5,%r1,6,1ecd281b9e
      0000001ecd281bac: 1a07 ar %r0,%r7
      0000001ecd281bae: ec03ff584076 crj %r0,%r3,4,1ecd281a5e
      Call Trace:
      [<0000001ecd281b9e>] __kernel_map_pages+0x166/0x188
      [<0000001ecd4d9516>] online_pages_range+0xf6/0x128
      [<0000001ecd2a8186>] walk_system_ram_range+0x7e/0xd8
      [<0000001ecda28aae>] online_pages+0x2fe/0x3f0
      [<0000001ecd7d02a6>] memory_subsys_online+0x8e/0xc0
      [<0000001ecd7add42>] device_online+0x5a/0xc8
      [<0000001ecd7d0430>] state_store+0x88/0x118
      [<0000001ecd5b9f62>] kernfs_fop_write+0xc2/0x200
      [<0000001ecd5064b6>] vfs_write+0x176/0x1e0
      [<0000001ecd50676a>] ksys_write+0xa2/0x100
      [<0000001ecda315d4>] system_call+0xd8/0x2c8
      
      Fix this by checking debug_pagealloc_enabled_static() before calling
      kernel_map_pages(). Backports to kernels before 5.5 should use
      debug_pagealloc_enabled() instead. Also add comments.
      
      Fixes: cd02cf1a ("mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC")
      Reported-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Cc: <stable@vger.kernel.org>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Qian Cai <cai@lca.pw>
      Link: http://lkml.kernel.org/r/20200224094651.18257-1-vbabka@suse.cz
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c87cbc1f
    • mm/z3fold.c: do not include rwlock.h directly · a8198fed
      Sebastian Andrzej Siewior authored
      rwlock.h should not be included directly. Instead, linux/spinlock.h
      should be included. Among other things, the direct include breaks the
      RT build.

      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Vitaly Wool <vitaly.wool@konsulko.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20200224133631.1510569-1-bigeasy@linutronix.de
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a8198fed
    • fat: fix uninit-memory access for partial initialized inode · bc87302a
      OGAWA Hirofumi authored
      When an error occurs in the middle of reading an inode, some fields of
      the inode might still be uninitialized.  The evict_inode path may then
      access those fields via iput().

      To fix this, make sure that the inode fields are initialized.
      
      Reported-by: syzbot+9d82b8de2992579da5d0@syzkaller.appspotmail.com
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/871rqnreqx.fsf@mail.parknet.co.jp
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      bc87302a
    • mm: avoid data corruption on CoW fault into PFN-mapped VMA · c3e5ea6e
      Kirill A. Shutemov authored
      Jeff Moyer has reported that one of the xfstests triggers a warning when
      run on a DAX-enabled filesystem:
      
      	WARNING: CPU: 76 PID: 51024 at mm/memory.c:2317 wp_page_copy+0xc40/0xd50
      	...
      	wp_page_copy+0x98c/0xd50 (unreliable)
      	do_wp_page+0xd8/0xad0
      	__handle_mm_fault+0x748/0x1b90
      	handle_mm_fault+0x120/0x1f0
      	__do_page_fault+0x240/0xd70
      	do_page_fault+0x38/0xd0
      	handle_page_fault+0x10/0x30
      
      The warning happens on failed __copy_from_user_inatomic() which tries to
      copy data into a CoW page.
      
      This happens because of a race between MADV_DONTNEED and a CoW page fault:
      
      	CPU0					CPU1
       handle_mm_fault()
         do_wp_page()
           wp_page_copy()
             do_wp_page()
      					madvise(MADV_DONTNEED)
      					  zap_page_range()
      					    zap_pte_range()
      					      ptep_get_and_clear_full()
      					      <TLB flush>
      	 __copy_from_user_inatomic()
      	 sees empty PTE and fails
      	 WARN_ON_ONCE(1)
      	 clear_page()
      
      The solution is to retry __copy_from_user_inatomic() under the PTL after
      checking that the PTE matches orig_pte.

      The second copy attempt can still fail, for example due to a non-readable
      PTE, but there is nothing reasonable we can do about that except clear
      the CoW page.
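
The retry logic described above can be sketched as a single-threaded userspace model. Everything here is hypothetical (`struct fake_vma`, `try_copy()`, `cow_copy()`, the result names and the tiny page size); the lock is a plain flag, so the sketch shows only the order of the decisions, not real concurrency.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define FAKE_PAGE_SIZE 64

enum cow_result { COW_COPIED, COW_RETRY_FAULT, COW_CLEARED };

struct fake_vma {
	bool ptl_held;			/* stand-in for the page table lock */
	unsigned long pte;		/* 0 models a PTE cleared by MADV_DONTNEED */
	const unsigned char *src;	/* NULL models a non-readable mapping */
};

/* Stand-in for the copy from user space: fails if the PTE is gone or unreadable. */
static bool try_copy(unsigned char *dst, const struct fake_vma *v)
{
	if (v->pte == 0 || v->src == NULL)
		return false;
	memcpy(dst, v->src, FAKE_PAGE_SIZE);
	return true;
}

static enum cow_result cow_copy(unsigned char *dst, struct fake_vma *v,
				unsigned long orig_pte)
{
	enum cow_result res;

	/* First attempt, without the lock. */
	if (try_copy(dst, v))
		return COW_COPIED;

	v->ptl_held = true;
	if (v->pte != orig_pte) {
		/* The PTE changed under us: let the fault be retried. */
		res = COW_RETRY_FAULT;
	} else if (try_copy(dst, v)) {
		/* Second attempt, with the PTE known to be stable. */
		res = COW_COPIED;
	} else {
		/* Still failing: hand out a zeroed page rather than stale data. */
		memset(dst, 0, FAKE_PAGE_SIZE);
		res = COW_CLEARED;
	}
	assert(v->ptl_held);
	v->ptl_held = false;
	return res;
}
```

In this model a cleared PTE leads to a retried fault, while an unchanged but unreadable PTE leads to a zeroed CoW page.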

      Reported-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Tested-by: Jeff Moyer <jmoyer@redhat.com>
      Cc: <stable@vger.kernel.org>
      Cc: Justin He <Justin.He@arm.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Link: http://lkml.kernel.org/r/20200218154151.13349-1-kirill.shutemov@linux.intel.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c3e5ea6e
    • mm: fix possible PMD dirty bit lost in set_pmd_migration_entry() · 8a8683ad
      Huang Ying authored
      In set_pmd_migration_entry(), pmdp_invalidate() is used to change the PMD
      atomically.  But the PMD is read before that with an ordinary memory
      read.  If the THP (transparent huge page) is written between the PMD
      read and pmdp_invalidate(), the PMD dirty bit may be lost, causing
      data corruption.  The race window is quite small, but still possible in
      theory, so it needs to be fixed.
      
      The race is fixed by using the return value of pmdp_invalidate() to get
      the original content of the PMD, which is an atomic read/modify/write
      operation.  So no THP write can occur in between.
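
The difference between "read, then invalidate" and "invalidate and use the returned value" can be illustrated with C11 atomics. This is a userspace sketch under stated assumptions: the bit positions, `fake_pmdp_invalidate()`, `fake_thp_write()` and the two `dirty_seen_*` helpers are hypothetical, and the racing write is simulated by a flag rather than a second thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout for the model entry. */
#define FAKE_PMD_PRESENT ((uint64_t)1 << 0)
#define FAKE_PMD_DIRTY   ((uint64_t)1 << 6)

/* Atomically clear PRESENT and return the previous contents of the entry. */
static uint64_t fake_pmdp_invalidate(_Atomic uint64_t *pmd)
{
	uint64_t old = atomic_fetch_and(pmd, ~FAKE_PMD_PRESENT);

	assert(old & FAKE_PMD_PRESENT);	/* only present entries are invalidated */
	return old;
}

/* Models a write to the huge page setting the dirty bit. */
static void fake_thp_write(_Atomic uint64_t *pmd)
{
	atomic_fetch_or(pmd, FAKE_PMD_DIRTY);
}

/* Racy pattern: the dirty state comes from a read done before invalidation. */
static bool dirty_seen_racy(_Atomic uint64_t *pmd, bool write_in_window)
{
	uint64_t snapshot = atomic_load(pmd);

	if (write_in_window)
		fake_thp_write(pmd);	/* lands after the read: missed */
	fake_pmdp_invalidate(pmd);
	return (snapshot & FAKE_PMD_DIRTY) != 0;
}

/* Fixed pattern: the dirty state comes from the value the atomic op returned. */
static bool dirty_seen_fixed(_Atomic uint64_t *pmd, bool write_before)
{
	if (write_before)
		fake_thp_write(pmd);
	return (fake_pmdp_invalidate(pmd) & FAKE_PMD_DIRTY) != 0;
}
```

In the fixed pattern there is no separate read, so any write that completed before the invalidation is reflected in the returned value.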
      
      The race was introduced when THP migration support was added in
      commit 616b8371 ("mm: thp: enable thp migration in generic path").
      But this fix depends on commit d52605d7 ("mm: do not lose dirty
      and accessed bits in pmdp_invalidate()").  So it is easy to backport
      to kernels after v4.16.  But the race window is really small, so it
      may be fine not to backport the fix at all.

      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
      Reviewed-by: Zi Yan <ziy@nvidia.com>
      Reviewed-by: William Kucharski <william.kucharski@oracle.com>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: <stable@vger.kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Link: http://lkml.kernel.org/r/20200220075220.2327056-1-ying.huang@intel.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8a8683ad
    • mm, numa: fix bad pmd by atomically check for pmd_trans_huge when marking page tables prot_numa · 8b272b3c
      Mel Gorman authored
      : A user reported a bug against a distribution kernel while running a
      : proprietary workload described as "memory intensive that is not swapping"
      : that is expected to apply to mainline kernels.  The workload is
      : read/write/modifying ranges of memory and checking the contents.  They
      : reported that within a few hours a bad PMD would be reported, followed
      : by memory corruption where expected data was all zeros.  A partial
      : report of the bad PMD looked like
      :
      :   [ 5195.338482] ../mm/pgtable-generic.c:33: bad pmd ffff8888157ba008(000002e0396009e2)
      :   [ 5195.341184] ------------[ cut here ]------------
      :   [ 5195.356880] kernel BUG at ../mm/pgtable-generic.c:35!
      :   ....
      :   [ 5195.410033] Call Trace:
      :   [ 5195.410471]  [<ffffffff811bc75d>] change_protection_range+0x7dd/0x930
      :   [ 5195.410716]  [<ffffffff811d4be8>] change_prot_numa+0x18/0x30
      :   [ 5195.410918]  [<ffffffff810adefe>] task_numa_work+0x1fe/0x310
      :   [ 5195.411200]  [<ffffffff81098322>] task_work_run+0x72/0x90
      :   [ 5195.411246]  [<ffffffff81077139>] exit_to_usermode_loop+0x91/0xc2
      :   [ 5195.411494]  [<ffffffff81003a51>] prepare_exit_to_usermode+0x31/0x40
      :   [ 5195.411739]  [<ffffffff815e56af>] retint_user+0x8/0x10
      :
      : Decoding revealed that the PMD was a valid prot_numa PMD and the bad PMD
      : was a false detection.  The bug does not trigger if automatic NUMA
      : balancing or transparent huge pages is disabled.
      :
      : The bug is due to a race in change_pmd_range between a pmd_trans_huge and
      : pmd_none_or_clear_bad check without any locks held.  During the
      : pmd_trans_huge check, a parallel protection update under lock can have
      : cleared the PMD and filled it with a prot_numa entry between the transhuge
      : check and the pmd_none_or_clear_bad check.
      :
      : While this could be fixed with heavy locking, it's only necessary to make
      : a copy of the PMD on the stack during change_pmd_range and avoid races.  A
      : new helper is created for this as the check is quite subtle and the
      : existing similar helper is not suitable.  This passed 154 hours of
      : testing (usually triggers between 20 minutes and 24 hours) without
      : detecting bad PMDs or corruption.  A basic test of an autonuma-intensive
      : workload showed no significant change in behaviour.
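
The "copy of the PMD on the stack" idea can be sketched in userspace as follows. This is not the kernel helper: the bit layout, `enum pmd_action` and `classify_pmd()` are hypothetical. The point is only that the entry is read once and every check is made against that one local value, so a parallel update cannot make the huge-page check and the bad-entry check disagree.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit layout for the model entry. */
#define FAKE_PMD_TABLE ((uint64_t)1 << 1)	/* points to a page table */
#define FAKE_PMD_HUGE  ((uint64_t)1 << 7)	/* maps a transparent huge page */

enum pmd_action { PMD_SKIP, PMD_HANDLE_HUGE, PMD_WALK_PTES, PMD_BAD };

static enum pmd_action classify_pmd(_Atomic uint64_t *pmdp)
{
	uint64_t pmdval;

	assert(pmdp != NULL);
	/* Single read: all decisions below use this stack copy only. */
	pmdval = atomic_load(pmdp);

	if (pmdval == 0)
		return PMD_SKIP;		/* empty entry */
	if (pmdval & FAKE_PMD_HUGE)
		return PMD_HANDLE_HUGE;		/* valid huge entry, never "bad" */
	if (!(pmdval & FAKE_PMD_TABLE))
		return PMD_BAD;			/* neither table nor huge */
	return PMD_WALK_PTES;
}
```

Because the huge-page test runs on the same value as the bad-entry test, a valid huge entry can never be reported as bad in this model.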
      
      Although Mel withdrew the patch in the face of the LKML comment
      https://lkml.org/lkml/2017/4/10/922 the race window mentioned above is
      still open, and we have reports of the Linpack test reporting bad
      residuals after the bad PMD warning is observed.  In addition to that, bad
      rss-counter and non-zero pgtables assertions are triggered on mm teardown
      for the task hitting the bad PMD.
      
       host kernel: mm/pgtable-generic.c:40: bad pmd 00000000b3152f68(8000000d2d2008e7)
       ....
       host kernel: BUG: Bad rss-counter state mm:00000000b583043d idx:1 val:512
       host kernel: BUG: non-zero pgtables_bytes on freeing mm: 4096
      
      The issue is observed on a v4.18-based distribution kernel, but the race
      window is expected to be applicable to mainline kernels, as well.
      
      [akpm@linux-foundation.org: fix comment typo, per Rafael]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Rafael Aquini <aquini@redhat.com>
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Cc: <stable@vger.kernel.org>
      Cc: Zi Yan <zi.yan@cs.rutgers.edu>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Michal Hocko <mhocko@suse.com>
      Link: http://lkml.kernel.org/r/20200216191800.22423-1-aquini@redhat.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8b272b3c
    • Merge tag 'devprop-5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · b0b8a945
      Linus Torvalds authored
      Pull device properties framework fix from Rafael Wysocki:
       "Revert a problematic commit from the 5.3 development cycle (Brendan
        Higgins)"
      
      * tag 'devprop-5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        Revert "software node: Simplify software_node_release() function"
      b0b8a945
    • Merge tag 'acpi-5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · fe67d182
      Linus Torvalds authored
      Pull ACPI documentation fix from Rafael Wysocki:
       "Fix Sphinx format warnings in an ACPI fan document added recently
        (Randy Dunlap)"
      
      * tag 'acpi-5.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        Documentation/admin-guide/acpi: fix fan_performance_states.rst warnings
      fe67d182
    • Merge tag 'drm-fixes-2020-03-06' of git://anongit.freedesktop.org/drm/drm · ba0ae9ac
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Weekly fixes round, looks like a few people woke up, got a bunch of
        fixes across the drivers. Bit bigger than I'd like but they all seem
        fine and hopefully it quiets down now.
      
        sun4i, kirin, mediatek and exynos on the ARM side. virtio-gpu and core
        have some mmap fixes, and there is a dma-buf leak. one ttm fence leak
        is also fixed.
      
        Otherwise it's mostly amdgpu and i915.
      
        One of the i915 fixes is for a very long latency I was seeing (using
        latencytop) running gnome-shell locally when using firefox and eating
        nearly all my RAM, it really helps with desktop responsiveness esp
        when firefox is chewing a lot.
      
        dma-buf:
         - fix memory leak
      
        core:
         - shmem object mmap fix.
      
        ttm:
         - Fix fence leak in ttm_buffer_object_transfer().
      
        amdgpu:
         - Gfx reset fix for gfx9, 10
         - Fix for gfx10
         - DP MST fix
         - DCC fix
         - Renoir power fixes
         - Navi power fix
      
        i915:
         - Break up long lists of object reclaim with cond_resched()
         - PSR probe fix
         - TGL workarounds
         - Selftest return value fix
         - Drop timeline mutex while waiting for retirement
         - Wait for OA configuration completion before writes to OA buffer
      
        virtio:
         - Fix resource id creation race in virtio.
         - mmap fixes
      
        sun4i:
         - Fixes for sun4i VI layer format support.
      
        kirin:
         - kirin: Revert "Fix for hikey620 display offset problem"
      
        exynos:
         - fix a kernel oops problem in case that driver is loaded as module.
         - fix a regulator warning issue when I2C DDC adapter cannot be gathered.
         - print out an error message only in error case excepting -EPROBE_DEFER.
      
        mediatek:
         - overlay, cursor and gce fixes"
      
      * tag 'drm-fixes-2020-03-06' of git://anongit.freedesktop.org/drm/drm: (38 commits)
        drm/amdgpu/display: navi1x copy dcn watermark clock settings to smu resume from s3 (v2)
        drm/amd/powerplay: map mclk to fclk for COMBINATIONAL_BYPASS case
        drm/amd/powerplay: fix pre-check condition for setting clock range
        drm/amd/display: fix dcc swath size calculations on dcn1
        drm/amd/display: Clear link settings on MST disable connector
        drm/amdgpu: disable 3D pipe 1 on Navi1x
        drm/amdgpu: clean wptr on wb when gpu recovery
        drm: kirin: Revert "Fix for hikey620 display offset problem"
        drm/i915/gt: Drop the timeline->mutex as we wait for retirement
        drm/i915/perf: Reintroduce wait on OA configuration completion
        drm/sun4i: Fix DE2 VI layer format support
        drm/sun4i: Add separate DE3 VI layer formats
        drm/sun4i: de2/de3: Remove unsupported VI layer formats
        drm/i915/selftests: Fix return in assert_mmap_offset()
        drm/i915: Protect i915_request_await_start from early waits
        drm/i915/tgl: Add Wa_1608008084
        drm/i915/tgl: Add Wa_22010178259:tgl
        drm/i915: Program MBUS with rmw during initialization
        drm/i915/psr: Force PSR probe only after full initialization
        drm/i915/gem: Break up long lists of object reclaim
        ...
      ba0ae9ac
    • Merge branch 'acpi-doc' · 86dfa5be
      Rafael J. Wysocki authored
      * acpi-doc:
        Documentation/admin-guide/acpi: fix fan_performance_states.rst warnings
      86dfa5be
    • Merge tag 'amd-drm-fixes-5.6-2020-03-05' of... · 2ac4853e
      Dave Airlie authored
      Merge tag 'amd-drm-fixes-5.6-2020-03-05' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
      
      amd-drm-fixes-5.6-2020-03-05:
      
      amdgpu:
      - Gfx reset fix for gfx9, 10
      - Fix for gfx10
      - DP MST fix
      - DCC fix
      - Renoir power fixes
      - Navi power fix
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      From: Alex Deucher <alexdeucher@gmail.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200305185957.4268-1-alexander.deucher@amd.com
      2ac4853e
    • Dave Airlie's avatar
      Merge tag 'drm-intel-fixes-2020-03-05' of... · 64c3fd53
      Dave Airlie authored
      Merge tag 'drm-intel-fixes-2020-03-05' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
      
      drm/i915 fixes for v5.6-rc5:
      - Break up long lists of object reclaim with cond_resched()
      - PSR probe fix
      - TGL workarounds
      - Selftest return value fix
      - Drop timeline mutex while waiting for retirement
      - Wait for OA configuration completion before writes to OA buffer
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      
      From: Jani Nikula <jani.nikula@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/87eeu7nl6z.fsf@intel.com
      64c3fd53
  2. 05 Mar, 2020 15 commits
  3. 04 Mar, 2020 2 commits
    • Brendan Higgins's avatar
      Revert "software node: Simplify software_node_release() function" · 7589238a
      Brendan Higgins authored
      This reverts commit 3df85a1a.
      
      The reverted commit says "It's possible to release the node ID
      immediately when fwnode_remove_software_node() is called, no need to
      wait for software_node_release() with that." However, releasing the node
      ID before waiting for software_node_release() to be called causes the
      node ID to be released before the kobject and the underlying sysfs
      entry; this means there is a period of time where a sysfs entry exists
      that is associated with an unallocated node ID.
      
      One consequence of this is that there is a race condition where it is
      possible to call fwnode_create_software_node() with no parent node
      specified (NULL) and have it fail with -EEXIST because the node ID that
      was assigned is still associated with a stale sysfs entry that hasn't
      been cleaned up yet.
      
      Although it is difficult to reproduce this race condition under normal
      conditions, it can be deterministically reproduced with the following
      minconfig on UML:
      
      CONFIG_KUNIT_DRIVER_PE_TEST=y
      CONFIG_DEBUG_KERNEL=y
      CONFIG_DEBUG_OBJECTS=y
      CONFIG_DEBUG_OBJECTS_TIMERS=y
      CONFIG_DEBUG_KOBJECT_RELEASE=y
      CONFIG_KUNIT=y
      
      Running the tests with this configuration causes the following failure:
      
      <snip>
      kobject: 'node0' ((____ptrval____)): kobject_release, parent (____ptrval____) (delayed 400)
      	ok 1 - pe_test_uints
      sysfs: cannot create duplicate filename '/kernel/software_nodes/node0'
      CPU: 0 PID: 28 Comm: kunit_try_catch Not tainted 5.6.0-rc3-next-20200227 #14
      <snip>
      kobject_add_internal failed for node0 with -EEXIST, don't try to register things with the same name in the same directory.
      kobject: 'node0' ((____ptrval____)): kobject_release, parent (____ptrval____) (delayed 100)
      	# pe_test_uint_arrays: ASSERTION FAILED at drivers/base/test/property-entry-test.c:123
      	Expected node is not error, but is: -17
      	not ok 2 - pe_test_uint_arrays
      <snip>
      Reported-by: Heidi Fahim <heidifahim@google.com>
      Signed-off-by: Brendan Higgins <brendanhiggins@google.com>
      Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
      Cc: 5.3+ <stable@vger.kernel.org> # 5.3+
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      7589238a
    • Linus Torvalds's avatar
      Merge tag 'for-5.6/dm-fixes' of... · 776e49e8
      Linus Torvalds authored
      Merge tag 'for-5.6/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
      
      Pull device mapper fixes from Mike Snitzer:
      
       - Fix request-based DM's congestion_fn and actually wire it up to the
         bdi.
      
       - Extend dm-bio-record to track additional struct bio members needed by
         DM integrity target.
      
       - Fix DM core to properly advertise that a device is suspended during
         unload (between the presuspend and postsuspend hooks). This change is
         a prereq for related DM integrity and DM writecache fixes. It
         elevates DM integrity's 'suspending' state tracking to DM core.
      
       - Four stable fixes for DM integrity target.
      
       - Fix crash in DM cache target due to incorrect work item cancelling.
      
       - Fix DM thin metadata lockdep warning that was introduced during 5.6
         merge window.
      
       - Fix DM zoned target's chunk work refcounting that regressed during
         recent conversion to refcount_t.
      
       - Bump the minor version for DM core and all target versions that have
         seen interface changes or important fixes during the 5.6 cycle.
      
      * tag 'for-5.6/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        dm: bump version of core and various targets
        dm: fix congested_fn for request-based device
        dm integrity: use dm_bio_record and dm_bio_restore
        dm bio record: save/restore bi_end_io and bi_integrity
        dm zoned: Fix reference counter initial value of chunk works
        dm writecache: verify watermark during resume
        dm: report suspended device during destroy
        dm thin metadata: fix lockdep complaint
        dm cache: fix a crash due to incorrect work item cancelling
        dm integrity: fix invalid table returned due to argument count mismatch
        dm integrity: fix a deadlock due to offloading to an incorrect workqueue
        dm integrity: fix recalculation when moving from journal mode to bitmap mode
      776e49e8