Commits · 7b6889f54a3c8c4139137a24a3ca12fe52a91dba · Kirill Smelkov / linux

05 Jun, 2021 9 commits

mm/kasan/init.c: fix doc warning · 7b6889f5

Yu Kuai authored Jun 04, 2021

Fix gcc W=1 warning:

mm/kasan/init.c:228: warning: Function parameter or member 'shadow_start' not described in 'kasan_populate_early_shadow'
mm/kasan/init.c:228: warning: Function parameter or member 'shadow_end' not described in 'kasan_populate_early_shadow'

Link: https://lkml.kernel.org/r/20210603140700.3045298-1-yukuai3@huawei.comSigned-off-by: Yu Kuai <yukuai3@huawei.com>
Acked-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

7b6889f5

proc: add .gitignore for proc-subset-pid selftest · 263e88d6

David Matlack authored Jun 04, 2021

This new selftest needs an entry in the .gitignore file otherwise git
will try to track the binary.

Link: https://lkml.kernel.org/r/20210601164305.11776-1-dmatlack@google.com
Fixes: 268af17a ("selftests: proc: test subset=pid")
Signed-off-by: David Matlack <dmatlack@google.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alexey Gladkov <gladkov.alexey@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

263e88d6

hugetlb: pass head page to remove_hugetlb_page() · 0c5da357

Naoya Horiguchi authored Jun 04, 2021

When memory_failure() or soft_offline_page() is called on a tail page of
some hugetlb page, "BUG: unable to handle page fault" error can be
triggered.

remove_hugetlb_page() dereferences page->lru, so it's assumed that the
page points to a head page, but one of the caller,
dissolve_free_huge_page(), provides remove_hugetlb_page() with 'page'
which could be a tail page.  So pass 'head' to it, instead.

Link: https://lkml.kernel.org/r/20210526235257.2769473-1-nao.horiguchi@gmail.com
Fixes: 6eb4e88a ("hugetlb: create remove_hugetlb_page() to separate functionality")
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

0c5da357

drivers/base/memory: fix trying offlining memory blocks with memory holes on aarch64 · 92813053

David Hildenbrand authored Jun 04, 2021

offline_pages() properly checks for memory holes and bails out.
However, we do a page_zone(pfn_to_page(start_pfn)) before calling
offline_pages() when offlining a memory block.

We should not unconditionally call page_zone(pfn_to_page(start_pfn)) on
aarch64 in offlining code, otherwise we can trigger a BUG when hitting a
memory hole:

   kernel BUG at include/linux/mm.h:1383!
   Internal error: Oops - BUG: 0 [#1] SMP
   Modules linked in: loop processor efivarfs ip_tables x_tables ext4 mbcache jbd2 dm_mod igb nvme i2c_algo_bit mlx5_core i2c_core nvme_core firmware_class
   CPU: 13 PID: 1694 Comm: ranbug Not tainted 5.12.0-next-20210524+ #4
   Hardware name: MiTAC RAPTOR EV-883832-X3-0001/RAPTOR, BIOS 1.6 06/28/2020
   pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
   pc : memory_subsys_offline+0x1f8/0x250
   lr : memory_subsys_offline+0x1f8/0x250
   Call trace:
     memory_subsys_offline+0x1f8/0x250
     device_offline+0x154/0x1d8
     online_store+0xa4/0x118
     dev_attr_store+0x44/0x78
     sysfs_kf_write+0xe8/0x138
     kernfs_fop_write_iter+0x26c/0x3d0
     new_sync_write+0x2bc/0x4f8
     vfs_write+0x718/0xc88
     ksys_write+0xf8/0x1e0
     __arm64_sys_write+0x74/0xa8
     invoke_syscall.constprop.0+0x78/0x1e8
     do_el0_svc+0xe4/0x298
     el0_svc+0x20/0x30
     el0_sync_handler+0xb0/0xb8
     el0_sync+0x178/0x180
   Kernel panic - not syncing: Oops - BUG: Fatal exception
   SMP: stopping secondary CPUs
   Kernel Offset: disabled
   CPU features: 0x00000251,20000846
   Memory Limit: none

If nr_vmemmap_pages is set, we know that we are dealing with hotplugged
memory that doesn't have any holes.  So call
page_zone(pfn_to_page(start_pfn)) only when really necessary -- when
nr_vmemmap_pages is set and we actually adjust the present pages.

Link: https://lkml.kernel.org/r/20210526075226.5572-1-david@redhat.com
Fixes: a08a2ae3 ("mm,memory_hotplug: allocate memmap from the added memory range")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: Qian Cai (QUIC) <quic_qiancai@quicinc.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

92813053

mm/page_alloc: fix counting of free pages after take off from buddy · bac9c6fa

Ding Hui authored Jun 04, 2021

Recently we found that there is a lot MemFree left in /proc/meminfo
after do a lot of pages soft offline, it's not quite correct.

Before Oscar's rework of soft offline for free pages [1], if we soft
offline free pages, these pages are left in buddy with HWPoison flag,
and NR_FREE_PAGES is not updated immediately.  So the difference between
NR_FREE_PAGES and real number of available free pages is also even big
at the beginning.

However, with the workload running, when we catch HWPoison page in any
alloc functions subsequently, we will remove it from buddy, meanwhile
update the NR_FREE_PAGES and try again, so the NR_FREE_PAGES will get
more and more closer to the real number of available free pages.
(regardless of unpoison_memory())

Now, for offline free pages, after a successful call
take_page_off_buddy(), the page is no longer belong to buddy allocator,
and will not be used any more, but we missed accounting NR_FREE_PAGES in
this situation, and there is no chance to be updated later.

Do update in take_page_off_buddy() like rmqueue() does, but avoid double
counting if some one already set_migratetype_isolate() on the page.

[1]: commit 06be6ff3 ("mm,hwpoison: rework soft offline for free pages")

Link: https://lkml.kernel.org/r/20210526075247.11130-1-dinghui@sangfor.com.cn
Fixes: 06be6ff3 ("mm,hwpoison: rework soft offline for free pages")
Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
Suggested-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bac9c6fa

mm/debug_vm_pgtable: fix alignment for pmd/pud_advanced_tests() · 04f7ce3f

Gerald Schaefer authored Jun 04, 2021

In pmd/pud_advanced_tests(), the vaddr is aligned up to the next pmd/pud
entry, and so it does not match the given pmdp/pudp and (aligned down)
pfn any more.

For s390, this results in memory corruption, because the IDTE
instruction used e.g.  in xxx_get_and_clear() will take the vaddr for
some calculations, in combination with the given pmdp.  It will then end
up with a wrong table origin, ending on ...ff8, and some of those
wrongly set low-order bits will also select a wrong pagetable level for
the index addition.  IDTE could therefore invalidate (or 0x20) something
outside of the page tables, depending on the wrongly picked index, which
in turn depends on the random vaddr.

As result, we sometimes see "BUG task_struct (Not tainted): Padding
overwritten" on s390, where one 0x5a padding value got overwritten with
0x7a.

Fix this by aligning down, similar to how the pmd/pud_aligned pfns are
calculated.

Link: https://lkml.kernel.org/r/20210525130043.186290-2-gerald.schaefer@linux.ibm.com
Fixes: a5c3b9ff ("mm/debug_vm_pgtable: add tests validating advanced arch page table helpers")
Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: <stable@vger.kernel.org>	[5.9+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

04f7ce3f

pid: take a reference when initializing `cad_pid` · 0711f0d7

Mark Rutland authored Jun 04, 2021

During boot, kernel_init_freeable() initializes `cad_pid` to the init
task's struct pid.  Later on, we may change `cad_pid` via a sysctl, and
when this happens proc_do_cad_pid() will increment the refcount on the
new pid via get_pid(), and will decrement the refcount on the old pid
via put_pid().  As we never called get_pid() when we initialized
`cad_pid`, we decrement a reference we never incremented, can therefore
free the init task's struct pid early.  As there can be dangling
references to the struct pid, we can later encounter a use-after-free
(e.g.  when delivering signals).

This was spotted when fuzzing v5.13-rc3 with Syzkaller, but seems to
have been around since the conversion of `cad_pid` to struct pid in
commit 9ec52099 ("[PATCH] replace cad_pid by a struct pid") from the
pre-KASAN stone age of v2.6.19.

Fix this by getting a reference to the init task's struct pid when we
assign it to `cad_pid`.

Full KASAN splat below.

   ==================================================================
   BUG: KASAN: use-after-free in ns_of_pid include/linux/pid.h:153 [inline]
   BUG: KASAN: use-after-free in task_active_pid_ns+0xc0/0xc8 kernel/pid.c:509
   Read of size 4 at addr ffff23794dda0004 by task syz-executor.0/273

   CPU: 1 PID: 273 Comm: syz-executor.0 Not tainted 5.12.0-00001-g9aef892b2d15 #1
   Hardware name: linux,dummy-virt (DT)
   Call trace:
    ns_of_pid include/linux/pid.h:153 [inline]
    task_active_pid_ns+0xc0/0xc8 kernel/pid.c:509
    do_notify_parent+0x308/0xe60 kernel/signal.c:1950
    exit_notify kernel/exit.c:682 [inline]
    do_exit+0x2334/0x2bd0 kernel/exit.c:845
    do_group_exit+0x108/0x2c8 kernel/exit.c:922
    get_signal+0x4e4/0x2a88 kernel/signal.c:2781
    do_signal arch/arm64/kernel/signal.c:882 [inline]
    do_notify_resume+0x300/0x970 arch/arm64/kernel/signal.c:936
    work_pending+0xc/0x2dc

   Allocated by task 0:
    slab_post_alloc_hook+0x50/0x5c0 mm/slab.h:516
    slab_alloc_node mm/slub.c:2907 [inline]
    slab_alloc mm/slub.c:2915 [inline]
    kmem_cache_alloc+0x1f4/0x4c0 mm/slub.c:2920
    alloc_pid+0xdc/0xc00 kernel/pid.c:180
    copy_process+0x2794/0x5e18 kernel/fork.c:2129
    kernel_clone+0x194/0x13c8 kernel/fork.c:2500
    kernel_thread+0xd4/0x110 kernel/fork.c:2552
    rest_init+0x44/0x4a0 init/main.c:687
    arch_call_rest_init+0x1c/0x28
    start_kernel+0x520/0x554 init/main.c:1064
    0x0

   Freed by task 270:
    slab_free_hook mm/slub.c:1562 [inline]
    slab_free_freelist_hook+0x98/0x260 mm/slub.c:1600
    slab_free mm/slub.c:3161 [inline]
    kmem_cache_free+0x224/0x8e0 mm/slub.c:3177
    put_pid.part.4+0xe0/0x1a8 kernel/pid.c:114
    put_pid+0x30/0x48 kernel/pid.c:109
    proc_do_cad_pid+0x190/0x1b0 kernel/sysctl.c:1401
    proc_sys_call_handler+0x338/0x4b0 fs/proc/proc_sysctl.c:591
    proc_sys_write+0x34/0x48 fs/proc/proc_sysctl.c:617
    call_write_iter include/linux/fs.h:1977 [inline]
    new_sync_write+0x3ac/0x510 fs/read_write.c:518
    vfs_write fs/read_write.c:605 [inline]
    vfs_write+0x9c4/0x1018 fs/read_write.c:585
    ksys_write+0x124/0x240 fs/read_write.c:658
    __do_sys_write fs/read_write.c:670 [inline]
    __se_sys_write fs/read_write.c:667 [inline]
    __arm64_sys_write+0x78/0xb0 fs/read_write.c:667
    __invoke_syscall arch/arm64/kernel/syscall.c:37 [inline]
    invoke_syscall arch/arm64/kernel/syscall.c:49 [inline]
    el0_svc_common.constprop.1+0x16c/0x388 arch/arm64/kernel/syscall.c:129
    do_el0_svc+0xf8/0x150 arch/arm64/kernel/syscall.c:168
    el0_svc+0x28/0x38 arch/arm64/kernel/entry-common.c:416
    el0_sync_handler+0x134/0x180 arch/arm64/kernel/entry-common.c:432
    el0_sync+0x154/0x180 arch/arm64/kernel/entry.S:701

   The buggy address belongs to the object at ffff23794dda0000
    which belongs to the cache pid of size 224
   The buggy address is located 4 bytes inside of
    224-byte region [ffff23794dda0000, ffff23794dda00e0)
   The buggy address belongs to the page:
   page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x4dda0
   head:(____ptrval____) order:1 compound_mapcount:0
   flags: 0x3fffc0000010200(slab|head)
   raw: 03fffc0000010200 dead000000000100 dead000000000122 ffff23794d40d080
   raw: 0000000000000000 0000000000190019 00000001ffffffff 0000000000000000
   page dumped because: kasan: bad access detected

   Memory state around the buggy address:
    ffff23794dd9ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    ffff23794dd9ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
   >ffff23794dda0000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                      ^
    ffff23794dda0080: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
    ffff23794dda0100: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00
   ==================================================================

Link: https://lkml.kernel.org/r/20210524172230.38715-1-mark.rutland@arm.com
Fixes: 9ec52099 ("[PATCH] replace cad_pid by a struct pid")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Cedric Le Goater <clg@fr.ibm.com>
Cc: Christian Brauner <christian@brauner.io>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

0711f0d7

kfence: use TASK_IDLE when awaiting allocation · 8fd0e995

Marco Elver authored Jun 04, 2021

Since wait_event() uses TASK_UNINTERRUPTIBLE by default, waiting for an
allocation counts towards load.  However, for KFENCE, this does not make
any sense, since there is no busy work we're awaiting.

Instead, use TASK_IDLE via wait_event_idle() to not count towards load.

BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1185565
Link: https://lkml.kernel.org/r/20210521083209.3740269-1-elver@google.com
Fixes: 407f1d8c ("kfence: await for allocation using wait_event")
Signed-off-by: Marco Elver <elver@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Hillf Danton <hdanton@sina.com>
Cc: <stable@vger.kernel.org>	[5.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8fd0e995

Revert "MIPS: make userspace mapping young by default" · 50c25ee9

Thomas Bogendoerfer authored Jun 04, 2021

This reverts commit f685a533.

The MIPS cache flush logic needs to know whether the mapping was already
established to decide how to flush caches.  This is done by checking the
valid bit in the PTE.  The commit above breaks this logic by setting the
valid in the PTE in new mappings, which causes kernel crashes.

Link: https://lkml.kernel.org/r/20210526094335.92948-1-tsbogend@alpha.franken.de
Fixes: f685a533 ("MIPS: make userspace mapping young by default")
Reported-by: Zhou Yanjie <zhouyanjie@wanyeetech.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Huang Pei <huangpei@loongson.cn>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

50c25ee9

04 Jun, 2021 3 commits

Merge tag 'sound-5.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 16f0596f

Linus Torvalds authored Jun 04, 2021

Pull sound fixes from Takashi Iwai:
 "A couple of small fixes are found in the ALSA core side at this time;
  a fix in the new LED handling code and a long-standing (and likely no
  one would notice) ioctl bug.

  The rest are usual HD-audio fixes, mostly device-specific quirks but
  also one major regression fix that was introduced in 5.13"

* tag 'sound-5.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda: update the power_state during the direct-complete
  ALSA: timer: Fix master timer notification
  ALSA: control led: fix memory leak in snd_ctl_led_register
  ALSA: hda: Fix for mute key LED for HP Pavilion 15-CK0xx
  ALSA: hda/cirrus: Set Initial DMIC volume to -26 dB
  ALSA: hda: Fix a regression in Capture Switch mixer read
  ALSA: hda: Add AlderLake-M PCI ID

16f0596f

Merge tag 'drm-fixes-2021-06-04-1' of git://anongit.freedesktop.org/drm/drm · 3a3c5ab3

Linus Torvalds authored Jun 04, 2021

Pull drm fixes from Dave Airlie:
 "Two big regression reverts in here, one for fbdev and one i915.
  Otherwise it's mostly amdgpu display fixes, and tegra fixes.

  fb:
   - revert broken fb_defio patch

  amdgpu:
   - Display fixes
   - FRU EEPROM error handling fix
   - RAS fix
   - PSP fix
   - Releasing pinned BO fix

  i915:
   - Revert conversion to io_mapping_map_user() which lead to BUG_ON()
   - Fix check for error valued returns in a selftest

  tegra:
   - SOR power domain race condition fix
   - build warning fix
   - runtime pm ref leak fix
   - modifier fix"

* tag 'drm-fixes-2021-06-04-1' of git://anongit.freedesktop.org/drm/drm:
  amd/display: convert DRM_DEBUG_ATOMIC to drm_dbg_atomic
  drm/amdgpu: make sure we unpin the UVD BO
  drm/amd/amdgpu:save psp ring wptr to avoid attack
  drm/amd/display: Fix potential memory leak in DMUB hw_init
  drm/amdgpu: Don't query CE and UE errors
  drm/amd/display: Fix overlay validation by considering cursors
  drm/amdgpu: refine amdgpu_fru_get_product_info
  drm/amdgpu: add judgement for dc support
  drm/amd/display: Fix GPU scaling regression by FS video support
  drm/amd/display: Allow bandwidth validation for 0 streams.
  Revert "i915: use io_mapping_map_user"
  drm/i915/selftests: Fix return value check in live_breadcrumbs_smoketest()
  Revert "fb_defio: Remove custom address_space_operations"
  drm/tegra: Correct DRM_FORMAT_MOD_NVIDIA_SECTOR_LAYOUT
  drm/tegra: sor: Fix AUX device reference leak
  drm/tegra: Get ref for DP AUX channel, not its ddc adapter
  drm/tegra: Fix shift overflow in tegra_shared_plane_atomic_update
  drm/tegra: sor: Fully initialize SOR before registration
  gpu: host1x: Split up client initalization and registration
  drm/tegra: sor: Do not leak runtime PM reference

3a3c5ab3

Merge tag 'drm/tegra/for-5.13-rc5' of ssh://git.freedesktop.org/git/tegra/linux into drm-fixes · 37e2f2e8

Dave Airlie authored Jun 04, 2021

drm/tegra: Fixes for v5.13-rc5

The most important change here fixes a race condition that causes either
HDA or (more frequently) display to malfunction because they race for
enabling the SOR power domain at probe time.

Other than that, there's a couple of build warnings for issues
introduced in v5.13 as well as some minor fixes, such as reference leak
plugs.
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thierry Reding <thierry.reding@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210603144624.788861-1-thierry.reding@gmail.com

37e2f2e8

03 Jun, 2021 11 commits

Merge tag 'amd-drm-fixes-5.13-2021-06-02' of... · d6273d8f

Dave Airlie authored Jun 04, 2021

Merge tag 'amd-drm-fixes-5.13-2021-06-02' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-5.13-2021-06-02:

amdgpu:
- Display fixes
- FRU EEPROM error handling fix
- RAS fix
- PSP fix
- Releasing pinned BO fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210603040410.4080-1-alexander.deucher@amd.com

d6273d8f

Merge tag 'drm-intel-fixes-2021-06-03' of... · ff7a24a8

Dave Airlie authored Jun 04, 2021

Merge tag 'drm-intel-fixes-2021-06-03' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

drm/i915 fixes for v5.13-rc5:
- Revert conversion to io_mapping_map_user() which lead to BUG_ON()
- Fix check for error valued returns in a selftest
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87lf7rpcmp.fsf@intel.com

ff7a24a8

Merge tag 'drm-misc-fixes-2021-06-03' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes · 59dda702

Dave Airlie authored Jun 04, 2021

One fix for a fb_defio breakage
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210603085321.l5l6flslj632yqse@gilmour

59dda702

Merge tag 'vfio-v5.13-rc5' of git://github.com/awilliam/linux-vfio · f88cd3fb

Linus Torvalds authored Jun 03, 2021

Pull VFIO fixes from Alex Williamson:

 - Fix error path return value (Zhen Lei)

 - Add vfio-pci CONFIG_MMU dependency (Randy Dunlap)

 - Replace open coding with struct_size() (Gustavo A. R. Silva)

 - Fix sample driver error path (Wei Yongjun)

 - Fix vfio-platform error path module_put() (Max Gurtovoy)

* tag 'vfio-v5.13-rc5' of git://github.com/awilliam/linux-vfio:
  vfio/platform: fix module_put call in error flow
  samples: vfio-mdev: fix error handing in mdpy_fb_probe()
  vfio/iommu_type1: Use struct_size() for kzalloc()
  vfio/pci: zap_vma_ptes() needs MMU
  vfio/pci: Fix error return code in vfio_ecap_init()

f88cd3fb

Merge tag 'block-5.13-2021-06-03' of git://git.kernel.dk/linux-block · 143d28dc

Linus Torvalds authored Jun 03, 2021

Pull block fixes from Jens Axboe:
 "NVMe fixes from Christoph:

   - Fix corruption in RDMA in-capsule SGLs (Sagi Grimberg)

   - nvme-loop reset fixes (Hannes Reinecke)

   - nvmet fix for freeing unallocated p2pmem (Max Gurtovoy)"

* tag 'block-5.13-2021-06-03' of git://git.kernel.dk/linux-block:
  nvmet: fix freeing unallocated p2pmem
  nvme-loop: do not warn for deleted controllers during reset
  nvme-loop: check for NVME_LOOP_Q_LIVE in nvme_loop_destroy_admin_queue()
  nvme-loop: clear NVME_LOOP_Q_LIVE when nvme_loop_configure_admin_queue() fails
  nvme-loop: reset queue count to 1 in nvme_loop_destroy_io_queues()
  nvme-rdma: fix in-casule data send for chained sgls

143d28dc

Merge tag 'io_uring-5.13-2021-06-03' of git://git.kernel.dk/linux-block · ec955023

Linus Torvalds authored Jun 03, 2021

Pull io_uring fix from Jens Axboe:
 "Just a single one-liner fix for an accounting regression in this
  release"

* tag 'io_uring-5.13-2021-06-03' of git://git.kernel.dk/linux-block:
  io_uring: fix misaccounting fix buf pinned pages

ec955023

Merge tag 'for-5.13-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · fd2ff277

Linus Torvalds authored Jun 03, 2021

Pull btrfs fixes from David Sterba:
 "Error handling improvements, caught by error injection:

   - handle errors during checksum deletion

   - set error on mapping when ordered extent io cannot be finished

   - inode link count fixup in tree-log

   - missing return value checks for inode updates in tree-log

   - abort transaction in rename exchange if adding second reference
     fails

  Fixes:

   - fix fsync failure after writes to prealloc extents

   - fix deadlock when cloning inline extents and low on available space

   - fix compressed writes that cross stripe boundary"

* tag 'for-5.13-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  MAINTAINERS: add btrfs IRC link
  btrfs: fix deadlock when cloning inline extents and low on available space
  btrfs: fix fsync failure and transaction abort after writes to prealloc extents
  btrfs: abort in rename_exchange if we fail to insert the second ref
  btrfs: check error value from btrfs_update_inode in tree log
  btrfs: fixup error handling in fixup_inode_link_counts
  btrfs: mark ordered extent and inode with error if we fail to finish
  btrfs: return errors from btrfs_del_csums in cleanup_ref_head
  btrfs: fix error handling in btrfs_del_csums
  btrfs: fix compressed writes that cross stripe boundary

fd2ff277

Merge tag 'nvme-5.13-2021-06-03' of git://git.infradead.org/nvme into block-5.13 · e369edbb

Jens Axboe authored Jun 03, 2021

Pull NVMe fixes from Christoph:

"nvme fixes for Linux 5.13:

 - fix corruption in RDMA in-capsule SGLs (Sagi Grimberg)
 - nvme-loop reset fixes (Hannes Reinecke)
 - nvmet fix for freeing unallocated p2pmem (Max Gurtovoy)"

* tag 'nvme-5.13-2021-06-03' of git://git.infradead.org/nvme:
  nvmet: fix freeing unallocated p2pmem
  nvme-loop: do not warn for deleted controllers during reset
  nvme-loop: check for NVME_LOOP_Q_LIVE in nvme_loop_destroy_admin_queue()
  nvme-loop: clear NVME_LOOP_Q_LIVE when nvme_loop_configure_admin_queue() fails
  nvme-loop: reset queue count to 1 in nvme_loop_destroy_io_queues()
  nvme-rdma: fix in-casule data send for chained sgls

e369edbb

MAINTAINERS: add btrfs IRC link · 503d1acb

David Sterba authored Jun 03, 2021

We haven't had an IRC link before but now it's a good time to announce
where to reach the community.
Signed-off-by: David Sterba <dsterba@suse.com>

503d1acb

ALSA: hda: update the power_state during the direct-complete · b8b90c17

Hui Wang authored Jun 02, 2021

The patch_realtek.c needs to check if the power_state.event equals
PM_EVENT_SUSPEND, after using the direct-complete, the suspend() and
resume() will be skipped if the codec is already rt_suspended, in this
case, the patch_realtek.c will always get PM_EVENT_ON even the system
is really resumed from S3.

We could set power_state to PMSG_SUSPEND in the prepare(), if other
PM functions are called before complete(), those functions will
override power_state; if no other PM functions are called before
complete(), we could know the suspend() and resume() are skipped since
only S3 pm functions could be skipped by direct-complete, in this case
set power_state to PMSG_RESUME in the complete(). This could guarantee
the first time of calling hda_codec_runtime_resume() after complete()
has the correct power_state.

Fixes: 215a22ed ("ALSA: hda: Refactor codec PM to use direct-complete optimization")
Cc: <stable@vger.kernel.org>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Link: https://lore.kernel.org/r/20210602145424.3132-1-hui.wang@canonical.comSigned-off-by: Takashi Iwai <tiwai@suse.de>

b8b90c17

ALSA: timer: Fix master timer notification · 9c1fe96b

Takashi Iwai authored Jun 02, 2021

snd_timer_notify1() calls the notification to each slave for a master
event, but it passes a wrong event number.  It should be +10 offset,
corresponding to SNDRV_TIMER_EVENT_MXXX, but it's incorrectly with
+100 offset.  Casually this was spotted by UBSAN check via syzkaller.

Reported-by: syzbot+d102fa5b35335a7e544e@syzkaller.appspotmail.com
Reviewed-by: Jaroslav Kysela <perex@perex.cz>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/000000000000e5560e05c3bd1d63@google.com
Link: https://lore.kernel.org/r/20210602113823.23777-1-tiwai@suse.deSigned-off-by: Takashi Iwai <tiwai@suse.de>

9c1fe96b

02 Jun, 2021 17 commits

amd/display: convert DRM_DEBUG_ATOMIC to drm_dbg_atomic · e7591a8d

Simon Ser authored May 26, 2021

This allows to tie the log message to a specific DRM device.
Signed-off-by: Simon Ser <contact@emersion.fr>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Harry Wentland <hwentlan@amd.com>
Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

e7591a8d

drm/amdgpu: make sure we unpin the UVD BO · 07438603

Nirmoy Das authored May 28, 2021

Releasing pinned BOs is illegal now. UVD 6 was missing from:
commit 2f40801d ("drm/amdgpu: make sure we unpin the UVD BO")

Fixes: 2f40801d ("drm/amdgpu: make sure we unpin the UVD BO")
Cc: stable@vger.kernel.org
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

07438603

drm/amd/amdgpu:save psp ring wptr to avoid attack · 2370eba9

Victor Zhao authored Mar 18, 2021

[Why]
When some tools performing psp mailbox attack, the readback value
of register can be a random value which may break psp.

[How]
Use a psp wptr cache machanism to aovid the change made by attack.

v2: unify change and add detailed reason
Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

2370eba9

drm/amd/display: Fix potential memory leak in DMUB hw_init · c5699e2d

Roman Li authored May 10, 2021

[Why]
On resume we perform DMUB hw_init which allocates memory:
dm_resume->dm_dmub_hw_init->dc_dmub_srv_create->kzalloc
That results in memory leak in suspend/resume scenarios.

[How]
Allocate memory for the DC wrapper to DMUB only if it was not
allocated before.
No need to reallocate it on suspend/resume.
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c5699e2d

drm/amdgpu: Don't query CE and UE errors · dce3d8e1

Luben Tuikov authored May 12, 2021

On QUERY2 IOCTL don't query counts of correctable
and uncorrectable errors, since when RAS is
enabled and supported on Vega20 server boards,
this takes insurmountably long time, in O(n^3),
which slows the system down to the point of it
being unusable when we have GUI up.

Fixes: ae363a21 ("drm/amdgpu: Add a new flag to AMDGPU_CTX_OP_QUERY_STATE2")
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

dce3d8e1

drm/amd/display: Fix overlay validation by considering cursors · 33f409e6

Rodrigo Siqueira authored May 14, 2021

A few weeks ago, we saw a two cursor issue in a ChromeOS system. We
fixed it in the commit:

 drm/amd/display: Fix two cursor duplication when using overlay
 (read the commit message for more details)

After this change, we noticed that some IGT subtests related to
kms_plane and kms_plane_scaling started to fail. After investigating
this issue, we noticed that all subtests that fail have a primary plane
covering the overlay plane, which is currently rejected by amdgpu dm.
Fail those IGT tests highlight that our verification was too broad and
compromises the overlay usage in our drive. This patch fixes this issue
by ensuring that we only reject commits where the primary plane is not
fully covered by the overlay when the cursor hardware is enabled. With
this fix, all IGT tests start to pass again, which means our overlay
support works as expected.

Cc: Tianci.Yin <tianci.yin@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Nicholas Choi <nicholas.choi@amd.com>
Cc: Bhawanpreet Lakha <bhawanpreet.lakha@amd.com>
Cc: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Cc: Mark Yacoub <markyacoub@google.com>
Cc: Daniel Wheeler <daniel.wheeler@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

33f409e6

drm/amdgpu: refine amdgpu_fru_get_product_info · 5cfc9125

Jiansong Chen authored May 25, 2021

1. eliminate potential array index out of bounds.
2. return meaningful value for failure.
Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Jack Gui <Jack.Gui@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

5cfc9125

drm/amdgpu: add judgement for dc support · 147feb00

Asher Song authored May 21, 2021

Drop DC initialization when DCN is harvested in VBIOS. The way
doesn't affect virtual display ip initialization.
Signed-off-by: Likun Gao  <Likun.Gao@amd.com>
Signed-off-by: Asher Song <Asher.Song@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

147feb00

drm/amd/display: Fix GPU scaling regression by FS video support · a53085c1

Nicholas Kazlauskas authored May 19, 2021

[Why]
FS video support regressed GPU scaling and the scaled buffer ends up
stuck in the top left of the screen at native size - full, aspect,
center scaling modes do not function.

This is because decide_crtc_timing_for_drm_display_mode() does not
get called when scaling is enabled.

[How]
Split recalculate timing and scaling into two different flags.

We don't want to call drm_mode_set_crtcinfo() for scaling, but we
do want to call it for FS video.

Optimize and move preferred_refresh calculation next to
decide_crtc_timing_for_drm_display_mode() like it used to be since
that's not used for FS video.

We don't need to copy over the VIC or polarity in the case of FS video
modes because those don't change.

Fixes: 6f59f229 ("drm/amd/display: Skip modeset for front porch change")

Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

a53085c1

drm/amd/display: Allow bandwidth validation for 0 streams. · ba8e5977

Bindu Ramamurthy authored May 20, 2021

[Why]
Bandwidth calculations are triggered for non zero streams, and
in case of 0 streams, these calculations were skipped with
pstate status not being updated.

[How]
As the pstate status is applicable for non zero streams, check
added for allowing 0 streams inline with dcn internal bandwidth
validations.
Signed-off-by: Bindu Ramamurthy <bindu.r@amd.com>
Reviewed-by: Roman Li <Roman.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

ba8e5977

Merge tag 'efi-urgent-2021-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 324c92e5

Linus Torvalds authored Jun 02, 2021

Pull EFI fixes from Ingo Molnar:
 "A handful of EFI fixes:

   - Fix/robustify a diagnostic printk

   - Fix a (normally not triggered) parser bug in the libstub code

   - Allow !EFI_MEMORY_XP && !EFI_MEMORY_RO entries in the memory map

   - Stop RISC-V from crashing on boot if there's no FDT table"

* tag 'efi-urgent-2021-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  efi: cper: fix snprintf() use in cper_dimm_err_location()
  efi/libstub: prevent read overflow in find_file_option()
  efi: Allow EFI_MEMORY_XP and EFI_MEMORY_RO both to be cleared
  efi/fdt: fix panic when no valid fdt found

324c92e5

Merge tag 'acpi-5.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 0372b6dd

Linus Torvalds authored Jun 02, 2021

Pull ACPI fix from Rafael Wysocki:
 "Fix a mutex object memory leak in ACPICA occurring during object
  deletion that was introduced in 5.12-rc1 (Erik Kaneda)"

* tag 'acpi-5.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPICA: Clean up context mutex during object deletion

0372b6dd

Merge tag 'hwmon-for-v5.13-rc4' of... · 3bfc6ffb

Linus Torvalds authored Jun 02, 2021

Merge tag 'hwmon-for-v5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck:
 "The most notable fix is for the q54sj108a2 driver to let it actually
  instantiate.

  Also attribute fixes for pmbus/isl68137, pmbus/fsp-3y, and
  dell-smm-hwmon drivers"

* tag 'hwmon-for-v5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon/pmbus: (q54sj108a2) The PMBUS_MFR_ID is actually 6 chars instead of 5
  hwmon: (pmbus/isl68137) remove READ_TEMPERATURE_3 for RAA228228
  hwmon: (pmbus/fsp-3y) Fix FSP-3Y YH-5151E VOUT
  hwmon: (dell-smm-hwmon) Fix index values

3bfc6ffb

Revert "i915: use io_mapping_map_user" · b87482df

Matthew Auld authored May 27, 2021

This reverts commit b739f125.

We are unfortunately seeing more issues like we did in 293837b9
("Revert "i915: fix remap_io_sg to verify the pgprot""), except this is
now for the vm_fault_gtt path, where we are now hitting the same
BUG_ON(!pte_none(*pte)):

[10887.466150] kernel BUG at mm/memory.c:2183!
[10887.466162] invalid opcode: 0000 [#1] PREEMPT SMP PTI
[10887.466168] CPU: 0 PID: 7775 Comm: ffmpeg Tainted: G     U            5.13.0-rc3-CI-Nightly #1
[10887.466174] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./J4205-ITX, BIOS P1.40 07/14/2017
[10887.466177] RIP: 0010:remap_pfn_range_notrack+0x30f/0x440
[10887.466188] Code: e8 96 d7 e0 ff 84 c0 0f 84 27 01 00 00 48 ba 00 f0 ff ff ff ff 0f 00 4c 89 e0 48 c1 e0 0c 4d 85 ed 75 96 48 21 d0 31 f6 eb a9 <0f> 0b 48 39 37 0f 85 0e 01 00 00 48 8b 0c 24 48 39 4f 08 0f 85 00
[10887.466193] RSP: 0018:ffffc90006e33c50 EFLAGS: 00010286
[10887.466198] RAX: 800000000000002f RBX: 00007f5e01800000 RCX: 0000000000000028
[10887.466201] RDX: 0000000000000001 RSI: ffffea0000000000 RDI: 0000000000000000
[10887.466204] RBP: ffffea000033fea8 R08: 800000000000002f R09: ffff8881072256e0
[10887.466207] R10: ffffc9000b84fff8 R11: 0000000017dab000 R12: 0000000000089f9f
[10887.466210] R13: 800000000000002f R14: 00007f5e017e4000 R15: ffff88800cffaf20
[10887.466213] FS:  00007f5e04849640(0000) GS:ffff888278000000(0000) knlGS:0000000000000000
[10887.466216] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[10887.466220] CR2: 00007fd9b191a2ac CR3: 00000001829ac000 CR4: 00000000003506f0
[10887.466223] Call Trace:
[10887.466233]  vm_fault_gtt+0x1ca/0x5d0 [i915]
[10887.466381]  ? ktime_get+0x38/0x90
[10887.466389]  __do_fault+0x37/0x90
[10887.466395]  __handle_mm_fault+0xc46/0x1200
[10887.466402]  handle_mm_fault+0xce/0x2a0
[10887.466407]  do_user_addr_fault+0x1c5/0x660

Reverting this commit is reported to fix the issue.
Reported-by: Eero Tamminen <eero.t.tamminen@intel.com>
References: https://gitlab.freedesktop.org/drm/intel/-/issues/3519
Fixes: b739f125 ("i915: use io_mapping_map_user")
Cc: Christoph Hellwig <hch@lst.de>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210527185145.458021-1-matthew.auld@intel.com
(cherry picked from commit 0e4fe0c9)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

b87482df

drm/i915/selftests: Fix return value check in live_breadcrumbs_smoketest() · 10c1f0cb

Zhihao Cheng authored Jun 01, 2021

In case of error, the function live_context() returns ERR_PTR() and never
returns NULL. The NULL test in the return value check should be replaced
with IS_ERR().

Fixes: 52c0fdb2 ("drm/i915: Replace global breadcrumbs with per-context interrupt tracking")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/33c46ef24cd547d0ad21dc106441491a@intel.com
[tursulin: Wrap commit text, fix Fixes: tag.]
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
(cherry picked from commit 8f4caef8)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

10c1f0cb

ALSA: control led: fix memory leak in snd_ctl_led_register · 3ae72f6a

Dongliang Mu authored Jun 02, 2021

The snd_ctl_led_sysfs_add and snd_ctl_led_sysfs_remove should contain
the refcount operations in pair. However, snd_ctl_led_sysfs_remove fails
to decrease the refcount to zero, which causes device_release never to
be invoked. This leads to memory leak to some resources, like struct
device_private. In addition, we also free some other similar memory
leaks in snd_ctl_led_init/snd_ctl_led_exit.

Fix this by replacing device_del to device_unregister
in snd_ctl_led_sysfs_remove/snd_ctl_led_init/snd_ctl_led_exit.

Note that, when CONFIG_DEBUG_KOBJECT_RELEASE is enabled, put_device will
call kobject_release and delay the release of kobject, which will cause
use-after-free when the memory backing the kobject is freed at once.

Reported-by: syzbot+08a7d8b51ea048a74ffb@syzkaller.appspotmail.com
Fixes: a135dfb5 ("ALSA: led control - add sysfs kcontrol LED marking layer")
Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Jaroslav Kysela <perex@perex.cz>
Link: https://lore.kernel.org/r/20210602034136.2762497-1-mudongliangabcd@gmail.comSigned-off-by: Takashi Iwai <tiwai@suse.de>

3ae72f6a

nvmet: fix freeing unallocated p2pmem · bcd9a079

Max Gurtovoy authored Jun 01, 2021

In case p2p device was found but the p2p pool is empty, the nvme target
is still trying to free the sgl from the p2p pool instead of the
regular sgl pool and causing a crash (BUG() is called). Instead, assign
the p2p_dev for the request only if it was allocated from p2p pool.

This is the crash that was caused:

[Sun May 30 19:13:53 2021] ------------[ cut here ]------------
[Sun May 30 19:13:53 2021] kernel BUG at lib/genalloc.c:518!
[Sun May 30 19:13:53 2021] invalid opcode: 0000 [#1] SMP PTI
...
[Sun May 30 19:13:53 2021] kernel BUG at lib/genalloc.c:518!
...
[Sun May 30 19:13:53 2021] RIP: 0010:gen_pool_free_owner+0xa8/0xb0
...
[Sun May 30 19:13:53 2021] Call Trace:
[Sun May 30 19:13:53 2021] ------------[ cut here ]------------
[Sun May 30 19:13:53 2021]  pci_free_p2pmem+0x2b/0x70
[Sun May 30 19:13:53 2021]  pci_p2pmem_free_sgl+0x4f/0x80
[Sun May 30 19:13:53 2021]  nvmet_req_free_sgls+0x1e/0x80 [nvmet]
[Sun May 30 19:13:53 2021] kernel BUG at lib/genalloc.c:518!
[Sun May 30 19:13:53 2021]  nvmet_rdma_release_rsp+0x4e/0x1f0 [nvmet_rdma]
[Sun May 30 19:13:53 2021]  nvmet_rdma_send_done+0x1c/0x60 [nvmet_rdma]

Fixes: c6e3f133 ("nvmet: add metadata support for block devices")
Reviewed-by: Israel Rukshin <israelr@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

bcd9a079