- 16 Sep, 2019 40 commits
-
-
Coly Li authored
[ Upstream commit 41508bb7 ] When accessing or modifying BTREE_NODE_dirty bit, it is not always necessary to acquire b->write_lock. In bch_btree_cache_free() and mca_reap() acquiring b->write_lock is necessary, and this patch adds comments to explain why mutex_lock(&b->write_lock) is necessary for checking or clearing BTREE_NODE_dirty bit there. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Coly Li authored
[ Upstream commit e5ec5f47 ] In bch_btree_cache_free() and btree_node_free(), BTREE_NODE_dirty is always set no matter btree node is dirty or not. The code looks like this, if (btree_node_dirty(b)) btree_complete_write(b, btree_current_write(b)); clear_bit(BTREE_NODE_dirty, &b->flags); Indeed if btree_node_dirty(b) returns false, it means BTREE_NODE_dirty bit is cleared, then it is unnecessary to clear the bit again. This patch only clears BTREE_NODE_dirty when btree_node_dirty(b) is true (the bit is set), to save a few CPU cycles. Signed-off-by: Coly Li <colyli@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Trond Myklebust authored
[ Upstream commit 5eb8d18c ] Once we clear the NFS_DELEGATED_STATE flag, we're telling nfs_delegation_claim_opens() that we're done recovering all open state for that stateid, so we really need to ensure that we test for all open modes that are currently cached and recover them before exiting nfs4_open_delegation_recall(). Fixes: 24311f88 ("NFSv4: Recovery of recalled read delegations...") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: stable@vger.kernel.org # v4.3+ Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Arnd Bergmann authored
[ Upstream commit 90c6260c ] gcc-9 complains about a blatant uninitialized variable use that all earlier compiler versions missed: drivers/iio/adc/rcar-gyroadc.c:510:5: warning: 'ret' may be used uninitialized in this function [-Wmaybe-uninitialized] Return -EINVAL instead here and a few lines above it where we accidentally return 0 on failure. Cc: stable@vger.kernel.org Fixes: 059c53b3 ("iio: adc: Add Renesas GyroADC driver") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Ralph Campbell authored
[ Upstream commit 7b358c6f ] When CONFIG_MIGRATE_VMA_HELPER is enabled, migrate_vma() calls migrate_vma_collect() which initializes a struct mm_walk but didn't initialize mm_walk.pud_entry. (Found by code inspection) Use a C structure initialization to make sure it is set to NULL. Link: http://lkml.kernel.org/r/20190719233225.12243-1-rcampbell@nvidia.com Fixes: 8763cb45 ("mm/migrate: new memory migration helper for use with device memory") Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Michał Mirosław authored
[ Upstream commit b1ac6704 ] In SAMA5D2 datasheet, TWIHS_CWGR register rescription mentions clock offset of 3 cycles (compared to 4 in eg. SAMA5D3). Cc: stable@vger.kernel.org # 5.2.x [needs applying to i2c-at91.c instead for earlier kernels] Fixes: 0ef6f321 ("i2c: at91: add support for new alternative command mode") Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Michał Mirosław authored
[ Upstream commit d12e3aae ] Driver was not disabling TXRDY interrupt after last TX byte. This caused interrupt storm until transfer timeouts for slow or broken device on the bus. The patch fixes the interrupt storm on my SAMA5D2-based board. Cc: stable@vger.kernel.org # 5.2.x [v5.2 introduced file split; the patch should apply to i2c-at91.c before the split] Fixes: fac368a0 ("i2c: at91: add new driver") Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com> Tested-by: Raag Jadav <raagjadav@gmail.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Bartosz Golaszewski authored
[ Upstream commit ffe0bbab ] If gpiolib is disabled, we use the inline stubs from gpio/consumer.h instead of regular definitions of GPIO API. The stubs for 'optional' variants of gpiod_get routines return NULL in this case as if the relevant GPIO wasn't found. This is correct so far. Calling other (non-gpio_get) stubs from this header triggers a warning because the GPIO descriptor couldn't have been requested. The warning however is unconditional (WARN_ON(1)) and is emitted even if the passed descriptor pointer is NULL. We don't want to force the users of 'optional' gpio_get to check the returned pointer before calling e.g. gpiod_set_value() so let's only WARN on non-NULL descriptors. Cc: stable@vger.kernel.org Reported-by: Claus H. Stovgaard <cst@phaseone.com> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Chris Wilson authored
[ Upstream commit 9eed17d3 ] Since the cached32_node is allowed to be advanced above dma_32bit_pfn (to provide a shortcut into the limited range), we need to be careful to remove the to be freed node if it is the cached32_node. [ 48.477773] BUG: KASAN: use-after-free in __cached_rbnode_delete_update+0x68/0x110 [ 48.477812] Read of size 8 at addr ffff88870fc19020 by task kworker/u8:1/37 [ 48.477843] [ 48.477879] CPU: 1 PID: 37 Comm: kworker/u8:1 Tainted: G U 5.2.0+ #735 [ 48.477915] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017 [ 48.478047] Workqueue: i915 __i915_gem_free_work [i915] [ 48.478075] Call Trace: [ 48.478111] dump_stack+0x5b/0x90 [ 48.478137] print_address_description+0x67/0x237 [ 48.478178] ? __cached_rbnode_delete_update+0x68/0x110 [ 48.478212] __kasan_report.cold.3+0x1c/0x38 [ 48.478240] ? __cached_rbnode_delete_update+0x68/0x110 [ 48.478280] ? __cached_rbnode_delete_update+0x68/0x110 [ 48.478308] __cached_rbnode_delete_update+0x68/0x110 [ 48.478344] private_free_iova+0x2b/0x60 [ 48.478378] iova_magazine_free_pfns+0x46/0xa0 [ 48.478403] free_iova_fast+0x277/0x340 [ 48.478443] fq_ring_free+0x15a/0x1a0 [ 48.478473] queue_iova+0x19c/0x1f0 [ 48.478597] cleanup_page_dma.isra.64+0x62/0xb0 [i915] [ 48.478712] __gen8_ppgtt_cleanup+0x63/0x80 [i915] [ 48.478826] __gen8_ppgtt_cleanup+0x42/0x80 [i915] [ 48.478940] __gen8_ppgtt_clear+0x433/0x4b0 [i915] [ 48.479053] __gen8_ppgtt_clear+0x462/0x4b0 [i915] [ 48.479081] ? __sg_free_table+0x9e/0xf0 [ 48.479116] ? kfree+0x7f/0x150 [ 48.479234] i915_vma_unbind+0x1e2/0x240 [i915] [ 48.479352] i915_vma_destroy+0x3a/0x280 [i915] [ 48.479465] __i915_gem_free_objects+0xf0/0x2d0 [i915] [ 48.479579] __i915_gem_free_work+0x41/0xa0 [i915] [ 48.479607] process_one_work+0x495/0x710 [ 48.479642] worker_thread+0x4c7/0x6f0 [ 48.479687] ? process_one_work+0x710/0x710 [ 48.479724] kthread+0x1b2/0x1d0 [ 48.479774] ? kthread_create_worker_on_cpu+0xa0/0xa0 [ 48.479820] ret_from_fork+0x1f/0x30 [ 48.479864] [ 48.479907] Allocated by task 631: [ 48.479944] save_stack+0x19/0x80 [ 48.479994] __kasan_kmalloc.constprop.6+0xc1/0xd0 [ 48.480038] kmem_cache_alloc+0x91/0xf0 [ 48.480082] alloc_iova+0x2b/0x1e0 [ 48.480125] alloc_iova_fast+0x58/0x376 [ 48.480166] intel_alloc_iova+0x90/0xc0 [ 48.480214] intel_map_sg+0xde/0x1f0 [ 48.480343] i915_gem_gtt_prepare_pages+0xb8/0x170 [i915] [ 48.480465] huge_get_pages+0x232/0x2b0 [i915] [ 48.480590] ____i915_gem_object_get_pages+0x40/0xb0 [i915] [ 48.480712] __i915_gem_object_get_pages+0x90/0xa0 [i915] [ 48.480834] i915_gem_object_prepare_write+0x2d6/0x330 [i915] [ 48.480955] create_test_object.isra.54+0x1a9/0x3e0 [i915] [ 48.481075] igt_shared_ctx_exec+0x365/0x3c0 [i915] [ 48.481210] __i915_subtests.cold.4+0x30/0x92 [i915] [ 48.481341] __run_selftests.cold.3+0xa9/0x119 [i915] [ 48.481466] i915_live_selftests+0x3c/0x70 [i915] [ 48.481583] i915_pci_probe+0xe7/0x220 [i915] [ 48.481620] pci_device_probe+0xe0/0x180 [ 48.481665] really_probe+0x163/0x4e0 [ 48.481710] device_driver_attach+0x85/0x90 [ 48.481750] __driver_attach+0xa5/0x180 [ 48.481796] bus_for_each_dev+0xda/0x130 [ 48.481831] bus_add_driver+0x205/0x2e0 [ 48.481882] driver_register+0xca/0x140 [ 48.481927] do_one_initcall+0x6c/0x1af [ 48.481970] do_init_module+0x106/0x350 [ 48.482010] load_module+0x3d2c/0x3ea0 [ 48.482058] __do_sys_finit_module+0x110/0x180 [ 48.482102] do_syscall_64+0x62/0x1f0 [ 48.482147] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 48.482190] [ 48.482224] Freed by task 37: [ 48.482273] save_stack+0x19/0x80 [ 48.482318] __kasan_slab_free+0x12e/0x180 [ 48.482363] kmem_cache_free+0x70/0x140 [ 48.482406] __free_iova+0x1d/0x30 [ 48.482445] fq_ring_free+0x15a/0x1a0 [ 48.482490] queue_iova+0x19c/0x1f0 [ 48.482624] cleanup_page_dma.isra.64+0x62/0xb0 [i915] [ 48.482749] __gen8_ppgtt_cleanup+0x63/0x80 [i915] [ 48.482873] __gen8_ppgtt_cleanup+0x42/0x80 [i915] [ 48.482999] __gen8_ppgtt_clear+0x433/0x4b0 [i915] [ 48.483123] __gen8_ppgtt_clear+0x462/0x4b0 [i915] [ 48.483250] i915_vma_unbind+0x1e2/0x240 [i915] [ 48.483378] i915_vma_destroy+0x3a/0x280 [i915] [ 48.483500] __i915_gem_free_objects+0xf0/0x2d0 [i915] [ 48.483622] __i915_gem_free_work+0x41/0xa0 [i915] [ 48.483659] process_one_work+0x495/0x710 [ 48.483704] worker_thread+0x4c7/0x6f0 [ 48.483748] kthread+0x1b2/0x1d0 [ 48.483787] ret_from_fork+0x1f/0x30 [ 48.483831] [ 48.483868] The buggy address belongs to the object at ffff88870fc19000 [ 48.483868] which belongs to the cache iommu_iova of size 40 [ 48.483920] The buggy address is located 32 bytes inside of [ 48.483920] 40-byte region [ffff88870fc19000, ffff88870fc19028) [ 48.483964] The buggy address belongs to the page: [ 48.484006] page:ffffea001c3f0600 refcount:1 mapcount:0 mapping:ffff8888181a91c0 index:0x0 compound_mapcount: 0 [ 48.484045] flags: 0x8000000000010200(slab|head) [ 48.484096] raw: 8000000000010200 ffffea001c421a08 ffffea001c447e88 ffff8888181a91c0 [ 48.484141] raw: 0000000000000000 0000000000120012 00000001ffffffff 0000000000000000 [ 48.484188] page dumped because: kasan: bad access detected [ 48.484230] [ 48.484265] Memory state around the buggy address: [ 48.484314] ffff88870fc18f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 48.484361] ffff88870fc18f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 48.484406] >ffff88870fc19000: fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc [ 48.484451] ^ [ 48.484494] ffff88870fc19080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 48.484530] ffff88870fc19100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108602 Fixes: e60aa7b5 ("iommu/iova: Extend rbtree node caching") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: Joerg Roedel <joro@8bytes.org> Cc: <stable@vger.kernel.org> # v4.15+ Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Suraj Jitindar Singh authored
[ Upstream commit da0ef933 ] The virtual real mode addressing (VRMA) mechanism is used when a partition is using HPT (Hash Page Table) translation and performs real mode accesses (MSR[IR|DR] = 0) in non-hypervisor mode. In this mode effective address bits 0:23 are treated as zero (i.e. the access is aliased to 0) and the access is performed using an implicit 1TB SLB entry. The size of the RMA (Real Memory Area) is communicated to the guest as the size of the first memory region in the device tree. And because of the mechanism described above can be expected to not exceed 1TB. In the event that the host erroneously represents the RMA as being larger than 1TB, guest accesses in real mode to memory addresses above 1TB will be aliased down to below 1TB. This means that a memory access performed in real mode may differ to one performed in virtual mode for the same memory address, which would likely have unintended consequences. To avoid this outcome have the guest explicitly limit the size of the RMA to the current maximum, which is 1TB. This means that even if the first memory block is larger than 1TB, only the first 1TB should be accessed in real mode. Fixes: c610d65c ("powerpc/pseries: lift RTAS limit for hash") Cc: stable@vger.kernel.org # v4.16+ Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Tested-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20190710052018.14628-1-sjitindarsingh@gmail.comSigned-off-by: Sasha Levin <sashal@kernel.org>
-
Takashi Iwai authored
[ Upstream commit 2756d914 ] It turned out that the recent Intel HD-audio controller chips show a significant stall during the system PM resume intermittently. It doesn't happen so often and usually it may read back successfully after one or more seconds, but in some rare worst cases the driver went into fallback mode. After trial-and-error, we found out that the communication stall seems covered by issuing the sync after each verb write, as already done for AMD and other chipsets. So this patch enables the write-sync flag for the recent Intel chips, Skylake and onward, as a workaround. Also, since Broxton and co have the very same driver flags as Skylake, refer to the Skylake driver flags instead of defining the same contents again for simplification. BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=201901Reported-and-tested-by: Todd Brandt <todd.e.brandt@linux.intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Sébastien Szymanski authored
[ Upstream commit c479450f ] This patch adds support for the Armadeus ST0700 Adapt. It comes with a Santek ST0700I5Y-RBSLW 7.0" WVGA (800x480) TFT and an adapter board so that it can be connected on the TFT header of Armadeus Dev boards. Cc: stable@vger.kernel.org # v4.19 Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Sébastien Szymanski <sebastien.szymanski@armadeus.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190507152713.27494-1-sebastien.szymanski@armadeus.comSigned-off-by: Sasha Levin <sashal@kernel.org>
-
Mike Snitzer authored
[ Upstream commit 54fa16ee ] Check if in fail_io mode at start of dm_pool_metadata_set_needs_check(). Otherwise dm_pool_metadata_set_needs_check()'s superblock_lock() can crash in dm_bm_write_lock() while accessing the block manager object that was previously destroyed as part of a failed dm_pool_abort_metadata() that ultimately set fail_io to begin with. Also, update DMERR() message to more accurately describe superblock_lock() failure. Cc: stable@vger.kernel.org Reported-by: Zdenek Kabelac <zkabelac@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Norbert Manthey authored
[ Upstream commit 4c6d80e1 ] The pstore_mkfile() function is passed a pointer to a struct pstore_record. On success it consumes this 'record' pointer and references it from the created inode. On failure, however, it may or may not free the record. There are even two different code paths which return -ENOMEM -- one of which does and the other doesn't free the record. Make the behaviour deterministic by never consuming and freeing the record when returning failure, allowing the caller to do the cleanup consistently. Signed-off-by: Norbert Manthey <nmanthey@amazon.de> Link: https://lore.kernel.org/r/1562331960-26198-1-git-send-email-nmanthey@amazon.de Fixes: 83f70f07 ("pstore: Do not duplicate record metadata") Fixes: 1dfff7dd ("pstore: Pass record contents instead of copying") Cc: stable@vger.kernel.org [kees: also move "private" allocation location, rename inode cleanup label] Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Nadav Amit authored
[ Upstream commit 49f17c26 ] Since resources can be removed, locking should ensure that the resource is not removed while accessing it. However, find_next_iomem_res() does not hold the lock while copying the data of the resource. Keep holding the lock while the data is copied. While at it, change the return value to a more informative value. It is disregarded by the callers. [akpm@linux-foundation.org: fix find_next_iomem_res() documentation] Link: http://lkml.kernel.org/r/20190613045903.4922-2-namit@vmware.com Fixes: ff3cc952 ("resource: Add remove_resource interface") Signed-off-by: Nadav Amit <namit@vmware.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Bjorn Helgaas authored
[ Upstream commit 010a93bf ] Previously find_next_iomem_res() used "*res" as both an input parameter for the range to search and the type of resource to search for, and an output parameter for the resource we found, which makes the interface confusing. The current callers use find_next_iomem_res() incorrectly because they allocate a single struct resource and use it for repeated calls to find_next_iomem_res(). When find_next_iomem_res() returns a resource, it overwrites the start, end, flags, and desc members of the struct. If we call find_next_iomem_res() again, we must update or restore these fields. The previous code restored res.start and res.end, but not res.flags or res.desc. Since the callers did not restore res.flags, if they searched for flags IORESOURCE_MEM | IORESOURCE_BUSY and found a resource with flags IORESOURCE_MEM | IORESOURCE_BUSY | IORESOURCE_SYSRAM, the next search would incorrectly skip resources unless they were also marked as IORESOURCE_SYSRAM. Fix this by restructuring the interface so it takes explicit "start, end, flags" parameters and uses "*res" only as an output parameter. Based on a patch by Lianbo Jiang <lijiang@redhat.com>. [ bp: While at it: - make comments kernel-doc style. - Originally-by: http://lore.kernel.org/lkml/20180921073211.20097-2-lijiang@redhat.comSigned-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Borislav Petkov <bp@suse.de> CC: Andrew Morton <akpm@linux-foundation.org> CC: Brijesh Singh <brijesh.singh@amd.com> CC: Dan Williams <dan.j.williams@intel.com> CC: H. Peter Anvin <hpa@zytor.com> CC: Lianbo Jiang <lijiang@redhat.com> CC: Takashi Iwai <tiwai@suse.de> CC: Thomas Gleixner <tglx@linutronix.de> CC: Tom Lendacky <thomas.lendacky@amd.com> CC: Vivek Goyal <vgoyal@redhat.com> CC: Yaowei Bai <baiyaowei@cmss.chinamobile.com> CC: bhe@redhat.com CC: dan.j.williams@intel.com CC: dyoung@redhat.com CC: kexec@lists.infradead.org CC: mingo@redhat.com CC: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/153805812916.1157.177580438135143788.stgit@bhelgaas-glaptop.roam.corp.google.comSigned-off-by: Sasha Levin <sashal@kernel.org>
-
Bjorn Helgaas authored
[ Upstream commit a98959fd ] find_next_iomem_res() finds an iomem resource that covers part of a range described by "start, end". All callers expect that range to be inclusive, i.e., both start and end are included, but find_next_iomem_res() doesn't handle the end address correctly. If it finds an iomem resource that contains exactly the end address, it skips it, e.g., if "start, end" is [0x0-0x10000] and there happens to be an iomem resource [mem 0x10000-0x10000] (the single byte at 0x10000), we skip it: find_next_iomem_res(...) { start = 0x0; end = 0x10000; for (p = next_resource(...)) { # p->start = 0x10000; # p->end = 0x10000; # we *should* return this resource, but this condition is false: if ((p->end >= start) && (p->start < end)) break; Adjust find_next_iomem_res() so it allows a resource that includes the single byte at the end of the range. This is a corner case that we probably don't see in practice. Fixes: 58c1b5b0 ("[PATCH] memory hotadd fixes: find_next_system_ram catch range fix") Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Borislav Petkov <bp@suse.de> CC: Andrew Morton <akpm@linux-foundation.org> CC: Brijesh Singh <brijesh.singh@amd.com> CC: Dan Williams <dan.j.williams@intel.com> CC: H. Peter Anvin <hpa@zytor.com> CC: Lianbo Jiang <lijiang@redhat.com> CC: Takashi Iwai <tiwai@suse.de> CC: Thomas Gleixner <tglx@linutronix.de> CC: Tom Lendacky <thomas.lendacky@amd.com> CC: Vivek Goyal <vgoyal@redhat.com> CC: Yaowei Bai <baiyaowei@cmss.chinamobile.com> CC: bhe@redhat.com CC: dan.j.williams@intel.com CC: dyoung@redhat.com CC: kexec@lists.infradead.org CC: mingo@redhat.com CC: x86-ml <x86@kernel.org> Link: http://lkml.kernel.org/r/153805812254.1157.16736368485811773752.stgit@bhelgaas-glaptop.roam.corp.google.comSigned-off-by: Sasha Levin <sashal@kernel.org>
-
Johannes Thumshirn authored
[ Upstream commit aa53e3bf ] Nikolay reported the following KASAN splat when running btrfs/048: [ 1843.470920] ================================================================== [ 1843.471971] BUG: KASAN: slab-out-of-bounds in strncmp+0x66/0xb0 [ 1843.472775] Read of size 1 at addr ffff888111e369e2 by task btrfs/3979 [ 1843.473904] CPU: 3 PID: 3979 Comm: btrfs Not tainted 5.2.0-rc3-default #536 [ 1843.475009] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 [ 1843.476322] Call Trace: [ 1843.476674] dump_stack+0x7c/0xbb [ 1843.477132] ? strncmp+0x66/0xb0 [ 1843.477587] print_address_description+0x114/0x320 [ 1843.478256] ? strncmp+0x66/0xb0 [ 1843.478740] ? strncmp+0x66/0xb0 [ 1843.479185] __kasan_report+0x14e/0x192 [ 1843.479759] ? strncmp+0x66/0xb0 [ 1843.480209] kasan_report+0xe/0x20 [ 1843.480679] strncmp+0x66/0xb0 [ 1843.481105] prop_compression_validate+0x24/0x70 [ 1843.481798] btrfs_xattr_handler_set_prop+0x65/0x160 [ 1843.482509] __vfs_setxattr+0x71/0x90 [ 1843.483012] __vfs_setxattr_noperm+0x84/0x130 [ 1843.483606] vfs_setxattr+0xac/0xb0 [ 1843.484085] setxattr+0x18c/0x230 [ 1843.484546] ? vfs_setxattr+0xb0/0xb0 [ 1843.485048] ? __mod_node_page_state+0x1f/0xa0 [ 1843.485672] ? _raw_spin_unlock+0x24/0x40 [ 1843.486233] ? __handle_mm_fault+0x988/0x1290 [ 1843.486823] ? lock_acquire+0xb4/0x1e0 [ 1843.487330] ? lock_acquire+0xb4/0x1e0 [ 1843.487842] ? mnt_want_write_file+0x3c/0x80 [ 1843.488442] ? debug_lockdep_rcu_enabled+0x22/0x40 [ 1843.489089] ? rcu_sync_lockdep_assert+0xe/0x70 [ 1843.489707] ? __sb_start_write+0x158/0x200 [ 1843.490278] ? mnt_want_write_file+0x3c/0x80 [ 1843.490855] ? __mnt_want_write+0x98/0xe0 [ 1843.491397] __x64_sys_fsetxattr+0xba/0xe0 [ 1843.492201] ? trace_hardirqs_off_thunk+0x1a/0x1c [ 1843.493201] do_syscall_64+0x6c/0x230 [ 1843.493988] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 1843.495041] RIP: 0033:0x7fa7a8a7707a [ 1843.495819] Code: 48 8b 0d 21 de 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 be 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ee dd 2b 00 f7 d8 64 89 01 48 [ 1843.499203] RSP: 002b:00007ffcb73bca38 EFLAGS: 00000202 ORIG_RAX: 00000000000000be [ 1843.500210] RAX: ffffffffffffffda RBX: 00007ffcb73bda9d RCX: 00007fa7a8a7707a [ 1843.501170] RDX: 00007ffcb73bda9d RSI: 00000000006dc050 RDI: 0000000000000003 [ 1843.502152] RBP: 00000000006dc050 R08: 0000000000000000 R09: 0000000000000000 [ 1843.503109] R10: 0000000000000002 R11: 0000000000000202 R12: 00007ffcb73bda91 [ 1843.504055] R13: 0000000000000003 R14: 00007ffcb73bda82 R15: ffffffffffffffff [ 1843.505268] Allocated by task 3979: [ 1843.505771] save_stack+0x19/0x80 [ 1843.506211] __kasan_kmalloc.constprop.5+0xa0/0xd0 [ 1843.506836] setxattr+0xeb/0x230 [ 1843.507264] __x64_sys_fsetxattr+0xba/0xe0 [ 1843.507886] do_syscall_64+0x6c/0x230 [ 1843.508429] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 1843.509558] Freed by task 0: [ 1843.510188] (stack is not available) [ 1843.511309] The buggy address belongs to the object at ffff888111e369e0 which belongs to the cache kmalloc-8 of size 8 [ 1843.514095] The buggy address is located 2 bytes inside of 8-byte region [ffff888111e369e0, ffff888111e369e8) [ 1843.516524] The buggy address belongs to the page: [ 1843.517561] page:ffff88813f478d80 refcount:1 mapcount:0 mapping:ffff88811940c300 index:0xffff888111e373b8 compound_mapcount: 0 [ 1843.519993] flags: 0x4404000010200(slab|head) [ 1843.520951] raw: 0004404000010200 ffff88813f48b008 ffff888119403d50 ffff88811940c300 [ 1843.522616] raw: ffff888111e373b8 000000000016000f 00000001ffffffff 0000000000000000 [ 1843.524281] page dumped because: kasan: bad access detected [ 1843.525936] Memory state around the buggy address: [ 1843.526975] ffff888111e36880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 1843.528479] ffff888111e36900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 1843.530138] >ffff888111e36980: fc fc fc fc fc fc fc fc fc fc fc fc 02 fc fc fc [ 1843.531877] ^ [ 1843.533287] ffff888111e36a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 1843.534874] ffff888111e36a80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 1843.536468] ================================================================== This is caused by supplying a too short compression value ('lz') in the test-case and comparing it to 'lzo' with strncmp() and a length of 3. strncmp() read past the 'lz' when looking for the 'o' and thus caused an out-of-bounds read. Introduce a new check 'btrfs_compress_is_valid_type()' which not only checks the user-supplied value against known compression types, but also employs checks for too short values. Reported-by: Nikolay Borisov <nborisov@suse.com> Fixes: 272e5326 ("btrfs: prop: fix vanished compression property after failed set") CC: stable@vger.kernel.org # 5.1+ Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Bart Van Assche authored
[ Upstream commit bcef5b72 ] The function srp_parse_in() is used both for parsing source address specifications and for target address specifications. Target addresses must have a port number. Having to specify a port number for source addresses is inconvenient. Make sure that srp_parse_in() supports again parsing addresses with no port number. Cc: <stable@vger.kernel.org> Fixes: c62adb7d ("IB/srp: Fix IPv6 address parsing") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Bart Van Assche authored
[ Upstream commit e37df2d5 ] This patch avoids that a warning is reported when building with W=1. Cc: Sergey Gorenko <sergeygo@mellanox.com> Cc: Max Gurtovoy <maxg@mellanox.com> Cc: Laurence Oberman <loberman@redhat.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Linus Walleij authored
[ Upstream commit f90b8fda ] The SPI to the display on the DIR-685 is active low, we were just saved by the SPI library enforcing active low on everything before, so set it as active low to avoid ambiguity. Link: https://lore.kernel.org/r/20190715202101.16060-1-linus.walleij@linaro.org Cc: stable@vger.kernel.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Michael Neuling authored
[ Upstream commit 3fefd1cd ] When emulating tsr, treclaim and trechkpt, we incorrectly set CR0. The code currently sets: CR0 <- 00 || MSR[TS] but according to the ISA it should be: CR0 <- 0 || MSR[TS] || 0 This fixes the bit shift to put the bits in the correct location. This is a data integrity issue as CR0 is corrupted. Fixes: 4bb3c7a0 ("KVM: PPC: Book3S HV: Work around transactional memory bugs in POWER9") Cc: stable@vger.kernel.org # v4.17+ Tested-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Paul Mackerras authored
[ Upstream commit fd0944ba ] When the 'regs' field was added to struct kvm_vcpu_arch, the code was changed to use several of the fields inside regs (e.g., gpr, lr, etc.) but not the ccr field, because the ccr field in struct pt_regs is 64 bits on 64-bit platforms, but the cr field in kvm_vcpu_arch is only 32 bits. This changes the code to use the regs.ccr field instead of cr, and changes the assembly code on 64-bit platforms to use 64-bit loads and stores instead of 32-bit ones. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Wanpeng Li authored
[ Upstream commit 4d763b16 ] Raise #GP when guest read/write IA32_XSS, but the CPUID bits say that it shouldn't exist. Fixes: 20300099 (kvm: vmx: add MSR logic for XSAVES) Reported-by: Xiaoyao Li <xiaoyao.li@linux.intel.com> Reported-by: Tao Xu <tao3.xu@intel.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Sean Christopherson authored
[ Upstream commit beb8d93b ] A previous fix to prevent KVM from consuming stale VMCS state after a failed VM-Entry inadvertantly blocked KVM's handling of machine checks that occur during VM-Entry. Per Intel's SDM, a #MC during VM-Entry is handled in one of three ways, depending on when the #MC is recognoized. As it pertains to this bug fix, the third case explicitly states EXIT_REASON_MCE_DURING_VMENTRY is handled like any other VM-Exit during VM-Entry, i.e. sets bit 31 to indicate the VM-Entry failed. If a machine-check event occurs during a VM entry, one of the following occurs: - The machine-check event is handled as if it occurred before the VM entry: ... - The machine-check event is handled after VM entry completes: ... - A VM-entry failure occurs as described in Section 26.7. The basic exit reason is 41, for "VM-entry failure due to machine-check event". Explicitly handle EXIT_REASON_MCE_DURING_VMENTRY as a one-off case in vmx_vcpu_run() instead of binning it into vmx_complete_atomic_exit(). Doing so allows vmx_vcpu_run() to handle VMX_EXIT_REASONS_FAILED_VMENTRY in a sane fashion and also simplifies vmx_complete_atomic_exit() since VMCS.VM_EXIT_INTR_INFO is guaranteed to be fresh. Fixes: b060ca3b ("kvm: vmx: Handle VMLAUNCH/VMRESUME failure properly") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Sean Christopherson authored
[ Upstream commit d28f4290 ] The behavior of WRMSR is in no way dependent on whether or not KVM consumes the value. Fixes: 4566654b ("KVM: vmx: Inject #GP on invalid PAT CR") Cc: stable@vger.kernel.org Cc: Nadav Amit <nadav.amit@gmail.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Paolo Bonzini authored
[ Upstream commit 674ea351 ] This check will soon be done on every nested vmentry and vmexit, "parallelize" it using bitwise operations. Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Yan, Zheng authored
[ Upstream commit 87bc5b89 ] remove_session_caps() relies on __wait_on_freeing_inode(), to wait for freeing inode to remove its caps. But VFS wakes freeing inode waiters before calling destroy_inode(). Cc: stable@vger.kernel.org Link: https://tracker.ceph.com/issues/40102Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Takashi Iwai authored
[ Upstream commit 4914da2f ] We apply the codec resume forcibly at system resume callback for updating and syncing the jack detection state that may have changed during sleeping. This is, however, superfluous for the codec like Intel HDMI/DP, where the jack detection is managed via the audio component notification; i.e. the jack state change shall be reported sooner or later from the graphics side at mode change. This patch changes the codec resume callback to avoid the forcible resume conditionally with a new flag, codec->relaxed_resume, for reducing the resume time. The flag is set in the codec probe. Although this doesn't fix the entire bug mentioned in the bugzilla entry below, it's still a good optimization and some improvements are seen. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201901 Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Paulo Alcantara (SUSE) authored
[ Upstream commit 29fbeb7a ] Fix mount options comparison when serverino option is turned off later in cifs_autodisable_serverino() and thus avoiding mismatch of new cifs mounts. Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (SUSE) <paulo@paulo.ac> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilove@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Benjamin Block authored
[ Upstream commit 106d45f3 ] When tracing instances where we open and close WKA ports, we also pass the request-ID of the respective FSF command. But after successfully sending the FSF command we must not use the request-object anymore, as this might result in an use-after-free (see "zfcp: fix request object use-after-free in send path causing seqno errors" ). To fix this add a new variable that caches the request-ID before sending the request. This won't change during the hand-off to the FCP channel, and so it's safe to trace this cached request-ID later, instead of using the request object. Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Fixes: d27a7cb9 ("zfcp: trace on request for open and close of WKA port") Cc: <stable@vger.kernel.org> #2.6.38+ Reviewed-by: Steffen Maier <maier@linux.ibm.com> Reviewed-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Ajay Singh authored
[ Upstream commit 6419f818 ] For the error path in wilc_wlan_initialize(), the resources are not cleanup in the correct order. Reverted the previous changes and use the correct order to free during error condition. Fixes: b46d6882 ("staging: wilc1000: remove COMPLEMENT_BOOT") Cc: <stable@vger.kernel.org> Signed-off-by: Ajay Singh <ajay.kathat@microchip.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Roman Bolshakov authored
[ Upstream commit 5676234f ] WRITE SAME corrupts data on the block device behind iblock if the command is emulated. The emulation code issues (M - 1) * N times more bios than requested, where M is the number of 512 blocks per real block size and N is the NUMBER OF LOGICAL BLOCKS specified in WRITE SAME command. So, for a device with 4k blocks, 7 * N more LBAs gets written after the requested range. The issue happens because the number of 512 byte sectors to be written is decreased one by one while the real bios are typically from 1 to 8 512 byte sectors per bio. Fixes: c66ac9db ("[SCSI] target: Add LIO target core v4.0.0-rc6") Cc: <stable@vger.kernel.org> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Bart Van Assche authored
[ Upstream commit 80b045b3 ] Instead of duplicating the SECTOR_SHIFT definition from <linux/blkdev.h>, use it. This patch does not change any functionality. Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Nicholas Bellinger <nab@linux-iscsi.org> Cc: Mike Christie <mchristi@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Mike Salvatore authored
[ Upstream commit 156e4299 ] Each function that manipulates the aa_ext struct should reset it's "pos" member on failure. This ensures that, on failure, no changes are made to the state of the aa_ext struct. There are paths were elements are optional and the error path is used to indicate the optional element is not present. This means instead of just aborting on error the unpack stream can become unsynchronized on optional elements, if using one of the affected functions. Cc: stable@vger.kernel.org Fixes: 736ec752 ("AppArmor: policy routines for loading and unpacking policy") Signed-off-by: Mike Salvatore <mike.salvatore@canonical.com> Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Mike Marciniszyn authored
[ Upstream commit cf131a81 ] Heavy contention of the sde flushlist_lock can cause hard lockups at extreme scale when the flushing logic is under stress. Mitigate by replacing the item at a time copy to the local list with an O(1) list_splice_init() and using the high priority work queue to do the flushes. Fixes: 77241056 ("IB/hfi1: add driver files") Cc: <stable@vger.kernel.org> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Jon Hunter authored
[ Upstream commit 9caec662 ] Currently the default clock rates for the HDA and HDA2CODEC_2X clocks are both 19.2MHz. However, the default rates for these clocks should actually be 51MHz and 48MHz, respectively. The current clock settings results in a distorted output during audio playback. Correct the default clock rates for these clocks by specifying them in the clock init table for Tegra210. Cc: stable@vger.kernel.org Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Jon Hunter authored
[ Upstream commit 845d782d ] The maximum frequency supported for I2S on Tegra124 and Tegra210 is 24.576MHz (as stated in the Tegra TK1 data sheet for Tegra124 and the Jetson TX1 module data sheet for Tegra210). However, the maximum I2S frequency is limited to 24MHz because that is the maximum frequency of the audio sync clock. Increase the maximum audio sync clock frequency to 24.576MHz for Tegra124 and Tegra210 in order to support 24.576MHz for I2S. Update the tegra_clk_register_sync_source() function so that it does not set the initial rate for the sync clocks and use the clock init tables to set the initial rate instead. Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Ronnie Sahlberg authored
[ Upstream commit 487317c9 ] We can not depend on the tcon->open_file_lock here since in multiuser mode we may have the same file/inode open via multiple different tcons. The current code is race prone and will crash if one user deletes a file at the same time a different user opens/create the file. To avoid this we need to have a spinlock attached to the inode and not the tcon. RHBZ: 1580165 CC: Stable <stable@vger.kernel.org> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Filipe Manana authored
[ Upstream commit 8eaf40c0 ] If a task is removing the block group that currently has the highest start offset amongst all existing block groups, there is a short time window where it races with a concurrent block group allocation, resulting in a transaction abort with an error code of EEXIST. The following diagram explains the race in detail: Task A Task B btrfs_remove_block_group(bg offset X) remove_extent_mapping(em offset X) -> removes extent map X from the tree of extent maps (fs_info->mapping_tree), so the next call to find_next_chunk() will return offset X btrfs_alloc_chunk() find_next_chunk() --> returns offset X __btrfs_alloc_chunk(offset X) btrfs_make_block_group() btrfs_create_block_group_cache() --> creates btrfs_block_group_cache object with a key corresponding to the block group item in the extent, the key is: (offset X, BTRFS_BLOCK_GROUP_ITEM_KEY, 1G) --> adds the btrfs_block_group_cache object to the list new_bgs of the transaction handle btrfs_end_transaction(trans handle) __btrfs_end_transaction() btrfs_create_pending_block_groups() --> sees the new btrfs_block_group_cache in the new_bgs list of the transaction handle --> its call to btrfs_insert_item() fails with -EEXIST when attempting to insert the block group item key (offset X, BTRFS_BLOCK_GROUP_ITEM_KEY, 1G) because task A has not removed that key yet --> aborts the running transaction with error -EEXIST btrfs_del_item() -> removes the block group's key from the extent tree, key is (offset X, BTRFS_BLOCK_GROUP_ITEM_KEY, 1G) A sample transaction abort trace: [78912.403537] ------------[ cut here ]------------ [78912.403811] BTRFS: Transaction aborted (error -17) [78912.404082] WARNING: CPU: 2 PID: 20465 at fs/btrfs/extent-tree.c:10551 btrfs_create_pending_block_groups+0x196/0x250 [btrfs] (...) [78912.405642] CPU: 2 PID: 20465 Comm: btrfs Tainted: G W 5.0.0-btrfs-next-46 #1 [78912.405941] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014 [78912.406586] RIP: 0010:btrfs_create_pending_block_groups+0x196/0x250 [btrfs] (...) [78912.407636] RSP: 0018:ffff9d3d4b7e3b08 EFLAGS: 00010282 [78912.407997] RAX: 0000000000000000 RBX: ffff90959a3796f0 RCX: 0000000000000006 [78912.408369] RDX: 0000000000000007 RSI: 0000000000000001 RDI: ffff909636b16860 [78912.408746] RBP: ffff909626758a58 R08: 0000000000000000 R09: 0000000000000000 [78912.409144] R10: ffff9095ff462400 R11: 0000000000000000 R12: ffff90959a379588 [78912.409521] R13: ffff909626758ab0 R14: ffff9095036c0000 R15: ffff9095299e1158 [78912.409899] FS: 00007f387f16f700(0000) GS:ffff909636b00000(0000) knlGS:0000000000000000 [78912.410285] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [78912.410673] CR2: 00007f429fc87cbc CR3: 000000014440a004 CR4: 00000000003606e0 [78912.411095] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [78912.411496] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [78912.411898] Call Trace: [78912.412318] __btrfs_end_transaction+0x5b/0x1c0 [btrfs] [78912.412746] btrfs_inc_block_group_ro+0xcf/0x160 [btrfs] [78912.413179] scrub_enumerate_chunks+0x188/0x5b0 [btrfs] [78912.413622] ? __mutex_unlock_slowpath+0x100/0x2a0 [78912.414078] btrfs_scrub_dev+0x2ef/0x720 [btrfs] [78912.414535] ? __sb_start_write+0xd4/0x1c0 [78912.414963] ? mnt_want_write_file+0x24/0x50 [78912.415403] btrfs_ioctl+0x17fb/0x3120 [btrfs] [78912.415832] ? lock_acquire+0xa6/0x190 [78912.416256] ? do_vfs_ioctl+0xa2/0x6f0 [78912.416685] ? btrfs_ioctl_get_supported_features+0x30/0x30 [btrfs] [78912.417116] do_vfs_ioctl+0xa2/0x6f0 [78912.417534] ? __fget+0x113/0x200 [78912.417954] ksys_ioctl+0x70/0x80 [78912.418369] __x64_sys_ioctl+0x16/0x20 [78912.418812] do_syscall_64+0x60/0x1b0 [78912.419231] entry_SYSCALL_64_after_hwframe+0x49/0xbe [78912.419644] RIP: 0033:0x7f3880252dd7 (...) [78912.420957] RSP: 002b:00007f387f16ed68 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [78912.421426] RAX: ffffffffffffffda RBX: 000055f5becc1df0 RCX: 00007f3880252dd7 [78912.421889] RDX: 000055f5becc1df0 RSI: 00000000c400941b RDI: 0000000000000003 [78912.422354] RBP: 0000000000000000 R08: 00007f387f16f700 R09: 0000000000000000 [78912.422790] R10: 00007f387f16f700 R11: 0000000000000246 R12: 0000000000000000 [78912.423202] R13: 00007ffda49c266f R14: 0000000000000000 R15: 00007f388145e040 [78912.425505] ---[ end trace eb9bfe7c426fc4d3 ]--- Fix this by calling remove_extent_mapping(), at btrfs_remove_block_group(), only at the very end, after removing the block group item key from the extent tree (and removing the free space tree entry if we are using the free space tree feature). Fixes: 04216820 ("Btrfs: fix race between fs trimming and block group remove/allocation") CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-