- 13 Jan, 2019 40 commits
-
-
Ondrej Mosnacek authored
commit 5df275cd upstream. Do the LE conversions before doing the Infiniband-related range checks. The incorrect checks are otherwise causing a failure to load any policy with an ibendportcon rule on BE systems. This can be reproduced by running (on e.g. ppc64): cat >my_module.cil <<EOF (type test_ibendport_t) (roletype object_r test_ibendport_t) (ibendportcon mlx4_0 1 (system_u object_r test_ibendport_t ((s0) (s0)))) EOF semodule -i my_module.cil Also, fix loading/storing the 64-bit subnet prefix for OCON_IBPKEY to use a correctly aligned buffer. Finally, do not use the 'nodebuf' (u32) buffer where 'buf' (__le32) should be used instead. Tested internally on a ppc64 machine with a RHEL 7 kernel with this patch applied. Cc: Daniel Jurgens <danielj@mellanox.com> Cc: Eli Cohen <eli@mellanox.com> Cc: James Morris <jmorris@namei.org> Cc: Doug Ledford <dledford@redhat.com> Cc: <stable@vger.kernel.org> # 4.13+ Fixes: a806f7a1 ("selinux: Create policydb version for Infiniband support") Signed-off-by:
Ondrej Mosnacek <omosnace@redhat.com> Acked-by:
Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by:
Paul Moore <paul@paul-moore.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Larry Finger authored
commit 8ea3819c upstream. The cordic routine for calculating sines and cosines that was added in commit 6f98e62a ("b43: update cordic code to match current specs") contains an error whereby a quantity declared u32 can in fact go negative. This problem was detected by Priit Laes who is switching b43 to use the routine in the library functions of the kernel. Fixes: 98650454 ("b43: make cordic common (LP-PHY and N-PHY need it)") Reported-by:
Priit Laes <plaes@plaes.org> Cc: Rafał Miłecki <zajec5@gmail.com> Cc: Stable <stable@vger.kernel.org> # 2.6.34 Signed-off-by:
Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by:
Priit Laes <plaes@plaes.org> Signed-off-by:
Kalle Valo <kvalo@codeaurora.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Andreas Gruenbacher authored
commit 2d29f6b9 upstream. Fix the resource group wrap-around logic in gfs2_rbm_find that commit e579ed4f broke. The bug can lead to unnecessary repeated scanning of the same bitmaps; there is a risk that future changes will turn this into an endless loop. Fixes: e579ed4f ("GFS2: Introduce rbm field bii") Cc: stable@vger.kernel.org # v3.13+ Signed-off-by:
Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by:
Bob Peterson <rpeterso@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Andreas Gruenbacher authored
commit 6ff9b09e upstream. In gfs2_create_inode, after setting and releasing the acl / default_acl, the acl / default_acl pointers are not set to NULL as they should be. In that state, when the function reaches label fail_free_acls, gfs2_create_inode will try to release the same acls again. Fix that by setting the pointers to NULL after releasing the acls. Slightly simplify the logic. Also, posix_acl_release checks for NULL already, so there is no need to duplicate those checks here. Fixes: e01580bf ("gfs2: use generic posix ACL infrastructure") Reported-by:
Pan Bian <bianpan2016@163.com> Cc: Christoph Hellwig <hch@lst.de> Cc: stable@vger.kernel.org # v4.9+ Signed-off-by:
Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by:
Bob Peterson <rpeterso@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vasily Averin authored
commit d47b41ac upstream. According to comment in dlm_user_request() ua should be freed in dlm_free_lkb() after successful attach to lkb. However ua is attached to lkb not in set_lock_args() but later, inside request_lock(). Fixes 597d0cae ("[DLM] dlm: user locks") Cc: stable@kernel.org # 2.6.19 Signed-off-by:
Vasily Averin <vvs@virtuozzo.com> Signed-off-by:
David Teigland <teigland@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vasily Averin authored
commit c0174726 upstream. Fixes 6d40c4a7 ("dlm: improve error and debug messages") Cc: stable@kernel.org # 3.5 Signed-off-by:
Vasily Averin <vvs@virtuozzo.com> Signed-off-by:
David Teigland <teigland@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vasily Averin authored
commit 23851e97 upstream. Fixes 3d6aa675 ("dlm: keep lkbs in idr") Cc: stable@kernel.org # 3.1 Signed-off-by:
Vasily Averin <vvs@virtuozzo.com> Signed-off-by:
David Teigland <teigland@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vasily Averin authored
commit b982896c upstream. If allocation fails on last elements of array need to free already allocated elements. v2: just move existing out_rsbtbl label to right place Fixes 789924ba635f ("dlm: fix race between remove and lookup") Cc: stable@kernel.org # 3.6 Signed-off-by:
Vasily Averin <vvs@virtuozzo.com> Signed-off-by:
David Teigland <teigland@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Damien Le Moal authored
commit 7211aef8 upstream. For a zoned block device using mq-deadline, if a write request for a zone is received while another write was already dispatched for the same zone, dd_dispatch_request() will return NULL and the newly inserted write request is kept in the scheduler queue waiting for the ongoing zone write to complete. With this behavior, when no other request has been dispatched, rq_list in blk_mq_sched_dispatch_requests() is empty and blk_mq_sched_mark_restart_hctx() not called. This in turn leads to __blk_mq_free_request() call of blk_mq_sched_restart() to not run the queue when the already dispatched write request completes. The newly dispatched request stays stuck in the scheduler queue until eventually another request is submitted. This problem does not affect SCSI disk as the SCSI stack handles queue restart on request completion. However, this problem is can be triggered the nullblk driver with zoned mode enabled. Fix this by always requesting a queue restart in dd_dispatch_request() if no request was dispatched while WRITE requests are queued. Fixes: 5700f691 ("mq-deadline: Introduce zone locking support") Cc: <stable@vger.kernel.org> Signed-off-by:
Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Add missing export of blk_mq_sched_restart() Signed-off-by:
Jens Axboe <axboe@kernel.dk>
-
Ming Lei authored
commit 544fbd16 upstream. rwb_enabled() can't be changed when there is any inflight IO. wbt_disable_default() may set rwb->wb_normal as zero, however the blk_stat timer may still be pending, and the timer function will update wrb->wb_normal again. This patch introduces blk_stat_deactivate() and applies it in wbt_disable_default(), then the following IO hang triggered when running parted & switching io scheduler can be fixed: [ 369.937806] INFO: task parted:3645 blocked for more than 120 seconds. [ 369.938941] Not tainted 4.20.0-rc6-00284-g906c801e5248 #498 [ 369.939797] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 369.940768] parted D 0 3645 3239 0x00000000 [ 369.941500] Call Trace: [ 369.941874] ? __schedule+0x6d9/0x74c [ 369.942392] ? wbt_done+0x5e/0x5e [ 369.942864] ? wbt_cleanup_cb+0x16/0x16 [ 369.943404] ? wbt_done+0x5e/0x5e [ 369.943874] schedule+0x67/0x78 [ 369.944298] io_schedule+0x12/0x33 [ 369.944771] rq_qos_wait+0xb5/0x119 [ 369.945193] ? karma_partition+0x1c2/0x1c2 [ 369.945691] ? wbt_cleanup_cb+0x16/0x16 [ 369.946151] wbt_wait+0x85/0xb6 [ 369.946540] __rq_qos_throttle+0x23/0x2f [ 369.947014] blk_mq_make_request+0xe6/0x40a [ 369.947518] generic_make_request+0x192/0x2fe [ 369.948042] ? submit_bio+0x103/0x11f [ 369.948486] ? __radix_tree_lookup+0x35/0xb5 [ 369.949011] submit_bio+0x103/0x11f [ 369.949436] ? blkg_lookup_slowpath+0x25/0x44 [ 369.949962] submit_bio_wait+0x53/0x7f [ 369.950469] blkdev_issue_flush+0x8a/0xae [ 369.951032] blkdev_fsync+0x2f/0x3a [ 369.951502] do_fsync+0x2e/0x47 [ 369.951887] __x64_sys_fsync+0x10/0x13 [ 369.952374] do_syscall_64+0x89/0x149 [ 369.952819] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 369.953492] RIP: 0033:0x7f95a1e729d4 [ 369.953996] Code: Bad RIP value. [ 369.954456] RSP: 002b:00007ffdb570dd48 EFLAGS: 00000246 ORIG_RAX: 000000000000004a [ 369.955506] RAX: ffffffffffffffda RBX: 000055c2139c6be0 RCX: 00007f95a1e729d4 [ 369.956389] RDX: 0000000000000001 RSI: 0000000000001261 RDI: 0000000000000004 [ 369.957325] RBP: 0000000000000002 R08: 0000000000000000 R09: 000055c2139c6ce0 [ 369.958199] R10: 0000000000000000 R11: 0000000000000246 R12: 000055c2139c0380 [ 369.959143] R13: 0000000000000004 R14: 0000000000000100 R15: 0000000000000008 Cc: stable@vger.kernel.org Cc: Paolo Valente <paolo.valente@linaro.org> Signed-off-by:
Ming Lei <ming.lei@redhat.com> Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Matthew Wilcox authored
commit 1a80dade upstream. The failure path removes the allocated PIDs from the wrong namespace. This could lead to us inadvertently reusing PIDs in the leaf namespace and leaking PIDs in parent namespaces. Fixes: 95846ecf ("pid: replace pid bitmap implementation with IDR API") Cc: <stable@vger.kernel.org> Signed-off-by:
Matthew Wilcox <willy@infradead.org> Acked-by:
"Eric W. Biederman" <ebiederm@xmission.com> Reviewed-by:
Oleg Nesterov <oleg@redhat.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Rafael J. Wysocki authored
commit e121a833 upstream. __device_release_driver() has to check dev->bus->need_parent_lock before dropping the parent lock and acquiring it again as it may attempt to drop a lock that hasn't been acquired or lock a device that shouldn't be locked and create a lock imbalance. Fixes: 8c97a46a (driver core: hold dev's parent lock when needed) Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: stable <stable@vger.kernel.org> Reviewed-by:
Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dennis Krein authored
commit eb4c2382 upstream. The srcu_gp_start() function is called with the srcu_struct structure's ->lock held, but not with the srcu_data structure's ->lock. This is problematic because this function accesses and updates the srcu_data structure's ->srcu_cblist, which is protected by that lock. Failing to hold this lock can result in corruption of the SRCU callback lists, which in turn can result in arbitrarily bad results. This commit therefore makes srcu_gp_start() acquire the srcu_data structure's ->lock across the calls to rcu_segcblist_advance() and rcu_segcblist_accelerate(), thus preventing this corruption. Reported-by:
Bart Van Assche <bvanassche@acm.org> Reported-by:
Christoph Hellwig <hch@infradead.org> Reported-by:
Sebastian Kuzminsky <seb.kuzminsky@gmail.com> Signed-off-by:
Dennis Krein <Dennis.Krein@netapp.com> Signed-off-by:
Paul E. McKenney <paulmck@linux.ibm.com> Tested-by:
Dennis Krein <Dennis.Krein@netapp.com> Cc: <stable@vger.kernel.org> # 4.16.x Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Takashi Iwai authored
commit 3e96d728 upstream. There are a few places where we access the data without checking the actual object size from the USB audio descriptor. This may result in OOB access, as recently reported. This patch addresses these missing checks. Most of added codes are simple bLength checks in the caller side. For the input and output terminal parsers, we put the length check in the parser functions. For the input terminal, a new argument is added to distinguish between UAC1 and the rest, as they treat different objects. Reported-by:
Mathias Payer <mathias.payer@nebelwelt.net> Reported-by:
Hui Peng <benquike@163.com> Tested-by:
Hui Peng <benquike@163.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Takashi Iwai <tiwai@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Hui Peng authored
commit cbb2ebf7 upstream. In `create_composite_quirk`, the terminating condition of for loops is `quirk->ifnum < 0`. So any composite quirks should end with `struct snd_usb_audio_quirk` object with ifnum < 0. for (quirk = quirk_comp->data; quirk->ifnum >= 0; ++quirk) { ..... } the data field of Bower's & Wilkins PX headphones usb device device quirks do not end with {.ifnum = -1}, wihch may result in out-of-bound read. This Patch fix the bug by adding an ending quirk object. Fixes: 240a8af9 ("ALSA: usb-audio: Add a quirck for B&W PX headphones") Signed-off-by:
Hui Peng <benquike@163.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Takashi Iwai <tiwai@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Takashi Iwai authored
commit 0bfe5e43 upstream. We've had some sanity checks of the mixer unit descriptors but they are too loose and some corner cases are overlooked. Add more strict checks in uac_mixer_unit_get_channels() for avoiding possible OOB accesses by malformed descriptors. This also changes the semantics of uac_mixer_unit_get_channels() slightly. Now it returns zero for the cases where the descriptor lacks of bmControls instead of -EINVAL. Then the caller side skips the mixer creation for such unit while it keeps parsing it. This corresponds to the case like Maya44. Cc: <stable@vger.kernel.org> Signed-off-by:
Takashi Iwai <tiwai@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Takashi Iwai authored
commit f4351a19 upstream. The parser for the processing unit reads bNrInPins field before the bLength sanity check, which may lead to an out-of-bound access when a malformed descriptor is given. Fix it by assignment after the bLength check. Cc: <stable@vger.kernel.org> Signed-off-by:
Takashi Iwai <tiwai@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dan Carpenter authored
commit 1524f4e4 upstream. The "chip->dsp_spos_instance" can be NULL on some of the ealier error paths in snd_cs46xx_create(). Reported-by:
"Yavuz, Tuba" <tuba@ece.ufl.edu> Signed-off-by:
Dan Carpenter <dan.carpenter@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Takashi Iwai <tiwai@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Brad Love authored
commit 4bd46aa0 upstream. It is reported that commit 95f408bb ("media: cx23885: Ryzen DMA related RiSC engine stall fixes") caused regresssions with other CPUs. Ensure that the quirk will be applied only for the CPUs that are known to cause problems. A module option is added for explicit control of the behaviour. Fixes: 95f408bb ("media: cx23885: Ryzen DMA related RiSC engine stall fixes") Signed-off-by:
Brad Love <brad@nextdimension.cc> Signed-off-by:
Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Lorenzo Bianconi authored
commit 0ae976a1 upstream. Enable hw capabilities supported by mt76-usb layer - fast_xmit - tx/rx amsdu - MFP - non-linear tx skbs [This is one line hw feature backport from 0ae976a1 ("mt76x0: init hw capabilities"), which add also other different features, however those are not supported in 4.19. 802.11w is supported by mac80211 and mt76x0u driver in 4.19 correctly fall-back to software encryption when 802.11w ciphers are used. Without the patch we fail to associate with WPA3 APs, so this is considered as fix.] Signed-off-by:
Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by:
Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by:
Felix Fietkau <nbd@nbd.name> [remove marking non-working features on 4.19, make topic correspond the change] Signed-off-by:
Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Lendacky, Thomas authored
commit c92a54cf upstream. The dma_direct_supported() function intends to check the DMA mask against specific values. However, the phys_to_dma() function includes the SME encryption mask, which defeats the intended purpose of the check. This results in drivers that support less than 48-bit DMA (SME encryption mask is bit 47) from being able to set the DMA mask successfully when SME is active, which results in the driver failing to initialize. Change the function used to check the mask from phys_to_dma() to __phys_to_dma() so that the SME encryption mask is not part of the check. Fixes: c1d0af1a ("kernel/dma/direct: take DMA offset into account in dma_direct_supported") Signed-off-by:
Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Joel Stanley authored
commit e213574a upstream. We cannot build these files with clang as it does not allow altivec instructions in assembly when -msoft-float is passed. Jinsong Ji <jji@us.ibm.com> wrote: > We currently disable Altivec/VSX support when enabling soft-float. So > any usage of vector builtins will break. > > Enable Altivec/VSX with soft-float may need quite some clean up work, so > I guess this is currently a limitation. > > Removing -msoft-float will make it work (and we are lucky that no > floating point instructions will be generated as well). This is a workaround until the issue is resolved in clang. Link: https://bugs.llvm.org/show_bug.cgi?id=31177 Link: https://github.com/ClangBuiltLinux/linux/issues/239Signed-off-by:
Joel Stanley <joel@jms.id.au> Reviewed-by:
Nick Desaulniers <ndesaulniers@google.com> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Signed-off-by:
Nathan Chancellor <natechancellor@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Joel Stanley authored
commit 813af51f upstream. Clang needs to be told which target it is building for when cross compiling. Link: https://github.com/ClangBuiltLinux/linux/issues/259Signed-off-by:
Joel Stanley <joel@jms.id.au> Tested-by: Daniel Axtens <dja@axtens.net> # powerpc 64-bit BE Acked-by:
Michael Ellerman <mpe@ellerman.id.au> Reviewed-by:
Nick Desaulniers <ndesaulniers@google.com> Signed-off-by:
Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by:
Nathan Chancellor <natechancellor@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Joel Stanley authored
commit 3bd98050 upstream. The powerpc makefile will use these in it's boot wrapper. Signed-off-by:
Joel Stanley <joel@jms.id.au> Signed-off-by:
Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by:
Nathan Chancellor <natechancellor@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Masahiro Yamada authored
commit 238bcbc4 upstream. Collect basic Clang options such as --target, --prefix, --gcc-toolchain, -no-integrated-as into a single variable CLANG_FLAGS so that it can be easily reused in other parts of Makefile. Signed-off-by:
Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by:
Nick Desaulniers <ndesaulniers@google.com> Tested-by:
Nick Desaulniers <ndesaulniers@google.com> Acked-by:
Greg Hackmann <ghackmann@google.com> Signed-off-by:
Nathan Chancellor <natechancellor@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Masahiro Yamada authored
commit dbe27a00 upstream. We are still a way off the Clang's integrated assembler support for the kernel. Hence, -no-integrated-as is mandatory to build the kernel with Clang. If you had an ancient version of Clang that does not recognize this option, you would not be able to compile the kernel anyway. Signed-off-by:
Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by:
Nick Desaulniers <ndesaulniers@google.com> Tested-by:
Nick Desaulniers <ndesaulniers@google.com> Signed-off-by:
Nathan Chancellor <natechancellor@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Joel Stanley authored
commit aea44714 upstream. The powerpc kernel uses setjmp which causes a warning when building with clang: In file included from arch/powerpc/xmon/xmon.c:51: ./arch/powerpc/include/asm/setjmp.h:15:13: error: declaration of built-in function 'setjmp' requires inclusion of the header <setjmp.h> [-Werror,-Wbuiltin-requires-header] extern long setjmp(long *); ^ ./arch/powerpc/include/asm/setjmp.h:16:13: error: declaration of built-in function 'longjmp' requires inclusion of the header <setjmp.h> [-Werror,-Wbuiltin-requires-header] extern void longjmp(long *, long); ^ This *is* the header and we're not using the built-in setjump but rather the one in arch/powerpc/kernel/misc.S. As the compiler warning does not make sense, it for the files where setjmp is used. Signed-off-by:
Joel Stanley <joel@jms.id.au> Reviewed-by:
Nick Desaulniers <ndesaulniers@google.com> [mpe: Move subdir-ccflags in xmon/Makefile to not clobber -Werror] Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Signed-off-by:
Nathan Chancellor <natechancellor@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Nicholas Piggin authored
commit 6977f95e upstream. Signed-off-by:
Nicholas Piggin <npiggin@gmail.com> Reviewed-by:
Joel Stanley <joel@jms.id.au> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Signed-off-by:
Nathan Chancellor <natechancellor@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Nicholas Piggin authored
commit 2a056f58 upstream. Signed-off-by:
Nicholas Piggin <npiggin@gmail.com> Reviewed-by:
Joel Stanley <joel@jms.id.au> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Signed-off-by:
Nathan Chancellor <natechancellor@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Nicholas Piggin authored
commit f2910f0e upstream. GCC 4.6 is the minimum supported now. Signed-off-by:
Nicholas Piggin <npiggin@gmail.com> Reviewed-by:
Joel Stanley <joel@jms.id.au> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> [nc: Applied to minimize unnecessary conflicts] Signed-off-by:
Nathan Chancellor <natechancellor@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vasily Averin authored
commit b8be5674 upstream. Signed-off-by:
Vasily Averin <vvs@virtuozzo.com> Cc: stable@vger.kernel.org Signed-off-by:
J. Bruce Fields <bfields@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vasily Averin authored
commit 4ecd55ea upstream. After commit d202cce8, an expired cache_head can be removed from the cache_detail's hash. However, the expired cache_head may be waiting for a reply from a previously submitted request. Such a cache_head has an increased refcounter and therefore it won't be freed after cache_put(freeme). Because the cache_head was removed from the hash it cannot be found during cache_clean() and can be leaked forever, together with stalled cache_request and other taken resources. In our case we noticed it because an entry in the export cache was holding a reference on a filesystem. Fixes d202cce8 ("sunrpc: never return expired entries in sunrpc_cache_lookup") Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Cc: stable@kernel.org # 2.6.35 Signed-off-by:
Vasily Averin <vvs@virtuozzo.com> Reviewed-by:
NeilBrown <neilb@suse.com> Signed-off-by:
J. Bruce Fields <bfields@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Michal Hocko authored
commit 7056d3a3 upstream. Burt Holzman has noticed that memcg v1 doesn't notify about OOM events via eventfd anymore. The reason is that 29ef680a ("memcg, oom: move out_of_memory back to the charge path") has moved the oom handling back to the charge path. While doing so the notification was left behind in mem_cgroup_oom_synchronize. Fix the issue by replicating the oom hierarchy locking and the notification. Link: http://lkml.kernel.org/r/20181224091107.18354-1-mhocko@kernel.org Fixes: 29ef680a ("memcg, oom: move out_of_memory back to the charge path") Signed-off-by:
Michal Hocko <mhocko@suse.com> Reported-by:
Burt Holzman <burt@fnal.gov> Acked-by:
Johannes Weiner <hannes@cmpxchg.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com Cc: <stable@vger.kernel.org> [4.19+] Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Huang Ying authored
commit 7af7a8e1 upstream. KSM pages may be mapped to the multiple VMAs that cannot be reached from one anon_vma. So during swapin, a new copy of the page need to be generated if a different anon_vma is needed, please refer to comments of ksm_might_need_to_copy() for details. During swapoff, unuse_vma() uses anon_vma (if available) to locate VMA and virtual address mapped to the page, so not all mappings to a swapped out KSM page could be found. So in try_to_unuse(), even if the swap count of a swap entry isn't zero, the page needs to be deleted from swap cache, so that, in the next round a new page could be allocated and swapin for the other mappings of the swapped out KSM page. But this contradicts with the THP swap support. Where the THP could be deleted from swap cache only after the swap count of every swap entry in the huge swap cluster backing the THP has reach 0. So try_to_unuse() is changed in commit e0709829 ("mm, THP, swap: support to reclaim swap space for THP swapped out") to check that before delete a page from swap cache, but this has broken KSM swapoff too. Fortunately, KSM is for the normal pages only, so the original behavior for KSM pages could be restored easily via checking PageTransCompound(). That is how this patch works. The bug is introduced by e0709829 ("mm, THP, swap: support to reclaim swap space for THP swapped out"), which is merged by v4.14-rc1. So I think we should backport the fix to from 4.14 on. But Hugh thinks it may be rare for the KSM pages being in the swap device when swapoff, so nobody reports the bug so far. Link: http://lkml.kernel.org/r/20181226051522.28442-1-ying.huang@intel.com Fixes: e0709829 ("mm, THP, swap: support to reclaim swap space for THP swapped out") Signed-off-by:
"Huang, Ying" <ying.huang@intel.com> Reported-by:
Hugh Dickins <hughd@google.com> Tested-by:
Hugh Dickins <hughd@google.com> Acked-by:
Hugh Dickins <hughd@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Shaohua Li <shli@kernel.org> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dan Williams authored
commit 02917e9f upstream. At Maintainer Summit, Greg brought up a topic I proposed around EXPORT_SYMBOL_GPL usage. The motivation was considerations for when EXPORT_SYMBOL_GPL is warranted and the criteria for taking the exceptional step of reclassifying an existing export. Specifically, I wanted to make the case that although the line is fuzzy and hard to specify in abstract terms, it is nonetheless clear that devm_memremap_pages() and HMM (Heterogeneous Memory Management) have crossed it. The devm_memremap_pages() facility should have been EXPORT_SYMBOL_GPL from the beginning, and HMM as a derivative of that functionality should have naturally picked up that designation as well. Contrary to typical rules, the HMM infrastructure was merged upstream with zero in-tree consumers. There was a promise at the time that those users would be merged "soon", but it has been over a year with no drivers arriving. While the Nouveau driver is about to belatedly make good on that promise it is clear that HMM was targeted first and foremost at an out-of-tree consumer. HMM is derived from devm_memremap_pages(), a facility Christoph and I spearheaded to support persistent memory. It combines a device lifetime model with a dynamically created 'struct page' / memmap array for any physical address range. It enables coordination and control of the many code paths in the kernel built to interact with memory via 'struct page' objects. With HMM the integration goes even deeper by allowing device drivers to hook and manipulate page fault and page free events. One interpretation of when EXPORT_SYMBOL is suitable is when it is exporting stable and generic leaf functionality. The devm_memremap_pages() facility continues to see expanding use cases, peer-to-peer DMA being the most recent, with no clear end date when it will stop attracting reworks and semantic changes. It is not suitable to export devm_memremap_pages() as a stable 3rd party driver API due to the fact that it is still changing and manipulates core behavior. Moreover, it is not in the best interest of the long term development of the core memory management subsystem to permit any external driver to effectively define its own system-wide memory management policies with no encouragement to engage with upstream. I am also concerned that HMM was designed in a way to minimize further engagement with the core-MM. That, with these hooks in place, device-drivers are free to implement their own policies without much consideration for whether and how the core-MM could grow to meet that need. Going forward not only should HMM be EXPORT_SYMBOL_GPL, but the core-MM should be allowed the opportunity and stimulus to change and address these new use cases as first class functionality. Original changelog: hmm_devmem_add(), and hmm_devmem_add_resource() duplicated devm_memremap_pages() and are now simple now wrappers around the core facility to inject a dev_pagemap instance into the global pgmap_radix and hook page-idle events. The devm_memremap_pages() interface is base infrastructure for HMM. HMM has more and deeper ties into the kernel memory management implementation than base ZONE_DEVICE which is itself a EXPORT_SYMBOL_GPL facility. Originally, the HMM page structure creation routines copied the devm_memremap_pages() code and reused ZONE_DEVICE. A cleanup to unify the implementations was discussed during the initial review: http://lkml.iu.edu/hypermail/linux/kernel/1701.2/00812.html Recent work to extend devm_memremap_pages() for the peer-to-peer-DMA facility enabled this cleanup to move forward. In addition to the integration with devm_memremap_pages() HMM depends on other GPL-only symbols: mmu_notifier_unregister_no_release percpu_ref region_intersects __class_create It goes further to consume / indirectly expose functionality that is not exported to any other driver: alloc_pages_vma walk_page_range HMM is derived from devm_memremap_pages(), and extends deep core-kernel fundamentals. Similar to devm_memremap_pages(), mark its entry points EXPORT_SYMBOL_GPL(). [logang@deltatee.com: PCI/P2PDMA: match interface changes to devm_memremap_pages()] Link: http://lkml.kernel.org/r/20181130225911.2900-1-logang@deltatee.com Link: http://lkml.kernel.org/r/154275560565.76910.15919297436557795278.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by:
Dan Williams <dan.j.williams@intel.com> Signed-off-by:
Logan Gunthorpe <logang@deltatee.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Balbir Singh <bsingharora@gmail.com>, Cc: Michal Hocko <mhocko@suse.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dan Williams authored
commit 58ef15b7 upstream. devm semantics arrange for resources to be torn down when device-driver-probe fails or when device-driver-release completes. Similar to devm_memremap_pages() there is no need to support an explicit remove operation when the users properly adhere to devm semantics. Note that devm_kzalloc() automatically handles allocating node-local memory. Link: http://lkml.kernel.org/r/154275559545.76910.9186690723515469051.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by:
Dan Williams <dan.j.williams@intel.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Jérôme Glisse <jglisse@redhat.com> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dan Williams authored
commit 69324b8f upstream. In preparation for consolidating all ZONE_DEVICE enabling via devm_memremap_pages(), teach it how to handle the constraints of MEMORY_DEVICE_PRIVATE ranges. [jglisse@redhat.com: call move_pfn_range_to_zone for MEMORY_DEVICE_PRIVATE] Link: http://lkml.kernel.org/r/154275559036.76910.12434636179931292607.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by:
Dan Williams <dan.j.williams@intel.com> Reviewed-by:
Jérôme Glisse <jglisse@redhat.com> Acked-by:
Christoph Hellwig <hch@lst.de> Reported-by:
Logan Gunthorpe <logang@deltatee.com> Reviewed-by:
Logan Gunthorpe <logang@deltatee.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dan Williams authored
commit a95c90f1 upstream. The last step before devm_memremap_pages() returns success is to allocate a release action, devm_memremap_pages_release(), to tear the entire setup down. However, the result from devm_add_action() is not checked. Checking the error from devm_add_action() is not enough. The api currently relies on the fact that the percpu_ref it is using is killed by the time the devm_memremap_pages_release() is run. Rather than continue this awkward situation, offload the responsibility of killing the percpu_ref to devm_memremap_pages_release() directly. This allows devm_memremap_pages() to do the right thing relative to init failures and shutdown. Without this change we could fail to register the teardown of devm_memremap_pages(). The likelihood of hitting this failure is tiny as small memory allocations almost always succeed. However, the impact of the failure is large given any future reconfiguration, or disable/enable, of an nvdimm namespace will fail forever as subsequent calls to devm_memremap_pages() will fail to setup the pgmap_radix since there will be stale entries for the physical address range. An argument could be made to require that the ->kill() operation be set in the @pgmap arg rather than passed in separately. However, it helps code readability, tracking the lifetime of a given instance, to be able to grep the kill routine directly at the devm_memremap_pages() call site. Link: http://lkml.kernel.org/r/154275558526.76910.7535251937849268605.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by:
Dan Williams <dan.j.williams@intel.com> Fixes: e8d51348 ("memremap: change devm_memremap_pages interface...") Reviewed-by:
"Jérôme Glisse" <jglisse@redhat.com> Reported-by:
Logan Gunthorpe <logang@deltatee.com> Reviewed-by:
Logan Gunthorpe <logang@deltatee.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dan Williams authored
commit 06489cfb upstream. Given the fact that devm_memremap_pages() requires a percpu_ref that is torn down by devm_memremap_pages_release() the current support for mapping RAM is broken. Support for remapping "System RAM" has been broken since the beginning and there is no existing user of this this code path, so just kill the support and make it an explicit error. This cleanup also simplifies a follow-on patch to fix the error path when setting a devm release action for devm_memremap_pages_release() fails. Link: http://lkml.kernel.org/r/154275557997.76910.14689813630968180480.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by:
Dan Williams <dan.j.williams@intel.com> Reviewed-by:
"Jérôme Glisse" <jglisse@redhat.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Logan Gunthorpe <logang@deltatee.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dan Williams authored
commit 808153e1 upstream. devm_memremap_pages() is a facility that can create struct page entries for any arbitrary range and give drivers the ability to subvert core aspects of page management. Specifically the facility is tightly integrated with the kernel's memory hotplug functionality. It injects an altmap argument deep into the architecture specific vmemmap implementation to allow allocating from specific reserved pages, and it has Linux specific assumptions about page structure reference counting relative to get_user_pages() and get_user_pages_fast(). It was an oversight and a mistake that this was not marked EXPORT_SYMBOL_GPL from the outset. Again, devm_memremap_pagex() exposes and relies upon core kernel internal assumptions and will continue to evolve along with 'struct page', memory hotplug, and support for new memory types / topologies. Only an in-kernel GPL-only driver is expected to keep up with this ongoing evolution. This interface, and functionality derived from this interface, is not suitable for kernel-external drivers. Link: http://lkml.kernel.org/r/154275557457.76910.16923571232582744134.stgit@dwillia2-desk3.amr.corp.intel.comSigned-off-by:
Dan Williams <dan.j.williams@intel.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Acked-by:
Michal Hocko <mhocko@suse.com> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-