- 20 Dec, 2017 40 commits
-
-
Matthias Brugger authored
[ Upstream commit 395df08d ] There exist two Mediatek iommu drivers for the two different generations of the device. But both drivers have the same name "mtk-iommu". This breaks the registration of the second driver: Error: Driver 'mtk-iommu' is already registered, aborting... Fix this by changing the name for first generation to "mtk-iommu-v1". Fixes: b17336c5 ("iommu/mediatek: add support for mtk iommu generation one HW") Signed-off-by:
Matthias Brugger <matthias.bgg@gmail.com> Signed-off-by:
Alex Williamson <alex.williamson@redhat.com> Signed-off-by:
Sasha Levin <alexander.levin@verizon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Mika Westerberg authored
[ Upstream commit a20c7f36 ] One can ask more buses to be reserved for hotplug bridges by passing pci=hpbussize=N in the kernel command line. If the parent bus does not have enough bus space available we incorrectly create child bus with the requested number of subordinate buses. In the example below hpbussize is set to one more than we have available buses in the root port: pci 0000:07:00.0: [8086:1578] type 01 class 0x060400 pci 0000:07:00.0: scanning [bus 00-00] behind bridge, pass 0 pci 0000:07:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring pci 0000:07:00.0: scanning [bus 00-00] behind bridge, pass 1 pci_bus 0000:08: busn_res: can not insert [bus 08-ff] under [bus 07-3f] (conflicts with (null) [bus 07-3f]) pci_bus 0000:08: scanning bus ... pci_bus 0000:0a: bus scan returning with max=40 pci_bus 0000:0a: busn_res: [bus 0a-ff] end is updated to 40 pci_bus 0000:0a: [bus 0a-40] partially hidden behind bridge 0000:07 [bus 07-3f] pci_bus 0000:08: bus scan returning with max=40 pci_bus 0000:08: busn_res: [bus 08-ff] end is updated to 40 Instead of allowing this, limit the subordinate number to be less than or equal the maximum subordinate number allocated for the parent bus (if it has any). Signed-off-by:
Mika Westerberg <mika.westerberg@linux.intel.com> [bhelgaas: remove irrelevant dmesg messages] Signed-off-by:
Bjorn Helgaas <bhelgaas@google.com> Signed-off-by:
Sasha Levin <alexander.levin@verizon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Shriya authored
[ Upstream commit cd77b5ce ] The call to /proc/cpuinfo in turn calls cpufreq_quick_get() which returns the last frequency requested by the kernel, but may not reflect the actual frequency the processor is running at. This patch makes a call to cpufreq_get() instead which returns the current frequency reported by the hardware. Fixes: fb5153d0 ("powerpc: powernv: Implement ppc_md.get_proc_freq()") Signed-off-by:
Shriya <shriyak@linux.vnet.ibm.com> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Signed-off-by:
Sasha Levin <alexander.levin@verizon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Qiang authored
[ Upstream commit 3ad3f8ce ] PCIe PME and native hotplug share the same interrupt number, so hotplug interrupts are also processed by PME. In some cases, e.g., a Link Down interrupt, a device may be present but unreachable, so when we try to read its Root Status register, the read fails and we get all ones data (0xffffffff). Previously, we interpreted that data as PCI_EXP_RTSTA_PME being set, i.e., "some device has asserted PME," so we scheduled pcie_pme_work_fn(). This caused an infinite loop because pcie_pme_work_fn() tried to handle PME requests until PCI_EXP_RTSTA_PME is cleared, but with the link down, PCI_EXP_RTSTA_PME can't be cleared. Check for the invalid 0xffffffff data everywhere we read the Root Status register. 1469d17d ("PCI: pciehp: Handle invalid data when reading from non-existent devices") added similar checks in the hotplug driver. Signed-off-by:
Qiang Zheng <zhengqiang10@huawei.com> [bhelgaas: changelog, also check in pcie_pme_work_fn(), use "~0" to follow other similar checks] Signed-off-by:
Bjorn Helgaas <bhelgaas@google.com> Signed-off-by:
Sasha Levin <alexander.levin@verizon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Wei Yongjun authored
[ Upstream commit d86fd113 ] Fix to return a negative error code from the VID create error handling case instead of 0, as done elsewhere in this function. Fixes: c57529e1 ("mlxsw: spectrum: Replace vPorts with Port-VLAN") Signed-off-by:
Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by:
Ido Schimmel <idosch@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <alexander.levin@verizon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Ujfalusi authored
[ Upstream commit 288e7560 ] The used 0x1f mask is only valid for am335x family of SoC, different family using this type of crossbar might have different number of electable events. In case of am43xx family 0x3f mask should have been used for example. Instead of trying to handle each family's mask, just use u8 type to store the mux value since the event offsets are aligned to byte offset. Fixes: 42dbdcc6 ("dmaengine: ti-dma-crossbar: Add support for crossbar on AM33xx/AM43xx") Signed-off-by:
Peter Ujfalusi <peter.ujfalusi@ti.com> Signed-off-by:
Vinod Koul <vinod.koul@intel.com> Signed-off-by:
Sasha Levin <alexander.levin@verizon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pankaj Bharadiya authored
[ Upstream commit f8e06652 ] In the loop that adds the uuid_module to the uuid_list list, allocated memory is not properly freed in the error path free uuid_list whenever any of the memory allocation in the loop fails to avoid memory leak. Signed-off-by:
Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> Signed-off-by:
Guneshwor Singh <guneshwor.o.singh@intel.com> Acked-By:
Vinod Koul <vinod.koul@intel.com> Signed-off-by:
Mark Brown <broonie@kernel.org> Signed-off-by:
Sasha Levin <alexander.levin@verizon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Rajat Jain authored
[ Upstream commit 95b982b4 ] Problem: This flag does not get cleared currently in the suspend or resume path in the following cases: * In case some driver's suspend routine returns an error. * Successful s2idle case * etc? Why is this a problem: What happens is that the next suspend attempt could fail even though the user did not enable the flag by writing to /sys/power/wakeup_count. This is 1 use case how the issue can be seen (but similar use case with driver suspend failure can be thought of): 1. Read /sys/power/wakeup_count 2. echo count > /sys/power/wakeup_count 3. echo freeze > /sys/power/wakeup_count 4. Let the system suspend, and wakeup the system using some wake source that calls pm_wakeup_event() e.g. power button or something. 5. Note that the combined wakeup count would be incremented due to the pm_wakeup_event() in the resume path. 6. After resuming the events_check_enabled flag is still set. At this point if the user attempts to freeze again (without writing to /sys/power/wakeup_count), the suspend would fail even though there has been no wake event since the past resume. Address that by clearing the flag just before a resume is completed, so that it is always cleared for the corner cases mentioned above. Signed-off-by:
Rajat Jain <rajatja@google.com> Acked-by:
Pavel Machek <pavel@ucw.cz> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by:
Sasha Levin <alexander.levin@verizon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Pixel Ding authored
[ Upstream commit dce1e131 ] KIQ ring submission is used for register accessing on SRIOV VF that could happen both in irq enabled and irq disabled cases. Inversion lock could happen on adev->ring_lru_list_lock, while this operation is useless and just adds overhead in this use case. Signed-off-by:
Pixel Ding <Pixel.Ding@amd.com> Reviewed-by:
Monk Liu <Monk.Liu@amd.com> Reviewed-by:
Christian König <christian.koenig@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Sasha Levin <alexander.levin@verizon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Arnd Bergmann authored
[ Upstream commit 820f1886 ] aacraid passes the current time to the firmware in one of two ways, either as year/month/day/... or as 32-bit unsigned seconds. The first one is broken on 32-bit architectures as it cannot go past year 2038. Using timespec64 here makes it behave properly on both 32-bit and 64-bit architectures, and avoids relying on signed integer overflow to pass times into the second interface. The interface used in aac_send_hosttime() however is still problematic in year 2106 when 32-bit seconds overflow. Hopefully we don't have to worry about aacraid by that time. Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Reviewed-by:
Dave Carroll <david.carroll@microsemi.com> Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Sasha Levin <alexander.levin@verizon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Philipp Zabel authored
[ Upstream commit a3350f9c ] The pcf8563_clkout_recalc_rate function erroneously ignores the frequency index read from the CLKO register and always returns 32768 Hz. Fixes: a39a6405 ("rtc: pcf8563: add CLKOUT to common clock framework") Signed-off-by:
Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by:
Alexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by:
Sasha Levin <alexander.levin@verizon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Christophe JAILLET authored
[ Upstream commit 8cae353e ] 'ret' is known to be 0 at this point. In case of memory allocation error in 'framebuffer_alloc()', return -ENOMEM instead. Signed-off-by:
Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Tejun Heo <tj@kernel.org> Signed-off-by:
Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by:
Sasha Levin <alexander.levin@verizon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Christophe JAILLET authored
[ Upstream commit 451f1306 ] We should go through the error handling code instead of returning -ENOMEM directly. Signed-off-by:
Christophe JAILLET <christophe.jaillet@wanadoo.fr> Cc: Tejun Heo <tj@kernel.org> Signed-off-by:
Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by:
Sasha Levin <alexander.levin@verizon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ladislav Michl authored
[ Upstream commit c9876947 ] While usb_control_msg function expects timeout in miliseconds, a value of HZ is used. Replace it with USB_CTRL_GET_TIMEOUT and also fix error message which looks like: udlfb: Read EDID byte 78 failed err ffffff92 as error is either negative errno or number of bytes transferred use %d format specifier. Returned EDID is in second byte, so return error when less than two bytes are received. Fixes: 18dffdf8 ("staging: udlfb: enhance EDID and mode handling support") Signed-off-by:
Ladislav Michl <ladis@linux-mips.org> Cc: Bernie Thompson <bernie@plugable.com> Signed-off-by:
Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by:
Sasha Levin <alexander.levin@verizon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Geert Uytterhoeven authored
[ Upstream commit ac831a37 ] Dan's static analysis says: drivers/video/fbdev/controlfb.c:560 control_setup() error: buffer overflow 'control_mac_modes' 20 <= 21 Indeed, control_mac_modes[] has only 20 elements, while VMODE_MAX is 22, which may lead to an out of bounds read when parsing vmode commandline options. The bug was introduced in v2.4.5.6, when 2 new modes were added to macmodes.h, but control_mac_modes[] wasn't updated: https://kernel.opensuse.org/cgit/kernel/diff/include/video/macmodes.h?h=v2.5.2&id=29f279c764808560eaceb88fef36cbc35c529aad Augment control_mac_modes[] with the two new video modes to fix this. Reported-by:
Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
Geert Uytterhoeven <geert@linux-m68k.org> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by:
Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by:
Sasha Levin <alexander.levin@verizon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Robert Stonehouse authored
[ Upstream commit cbad52e9 ] Fixes: 535a6177 ("sfc: suppress handled MCDI failures when changing the MAC address") Signed-off-by:
Bert Kenward <bkenward@solarflare.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <alexander.levin@verizon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Sébastien Szymanski authored
[ Upstream commit 7da85fbf ] When everything goes smoothly, ret is set to 0 which makes the function to return EIO error. Fixes: 8e9faa15 ("HID: cp2112: fix gpio-callback error handling") Signed-off-by:
Sébastien Szymanski <sebastien.szymanski@armadeus.com> Reviewed-by:
Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by:
Jiri Kosina <jkosina@suse.cz> Signed-off-by:
Sasha Levin <alexander.levin@verizon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Guy Levi authored
[ Upstream commit 108809a0 ] In the modify QP handler the base_qpn_udp field in the RSS QPC is overwrite later by irrelevant value assignment. Hence, ingress packets which gets to the RSS QP will be steered then to a garbage QPN. The patch fixes this by skipping the above assignment when a RSS QP is modified, also, the RSS context's attributes assignments are relocated just before the context is posted to avoid future issues like this. Additionally, this patch takes the opportunity to change the code to be disciplined to the device's manual and assigns the RSS QP context just at RESET to INIT transition. Fixes:3078f5f1 ("IB/mlx4: Add support for RSS QP") Signed-off-by:
Guy Levi <guyle@mellanox.com> Reviewed-by:
Yishai Hadas <yishaih@mellanox.com> Signed-off-by:
Leon Romanovsky <leon@kernel.org> Signed-off-by:
Doug Ledford <dledford@redhat.com> Signed-off-by:
Sasha Levin <alexander.levin@verizon.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Chandan Rajendra authored
commit 9d5afec6 upstream. On a ppc64 machine, when mounting a fuzzed ext2 image (generated by fsfuzzer) the following call trace is seen, VFS: brelse: Trying to free free buffer WARNING: CPU: 1 PID: 6913 at /root/repos/linux/fs/buffer.c:1165 .__brelse.part.6+0x24/0x40 .__brelse.part.6+0x20/0x40 (unreliable) .ext4_find_entry+0x384/0x4f0 .ext4_lookup+0x84/0x250 .lookup_slow+0xdc/0x230 .walk_component+0x268/0x400 .path_lookupat+0xec/0x2d0 .filename_lookup+0x9c/0x1d0 .vfs_statx+0x98/0x140 .SyS_newfstatat+0x48/0x80 system_call+0x58/0x6c This happens because the directory that ext4_find_entry() looks up has inode->i_size that is less than the block size of the filesystem. This causes 'nblocks' to have a value of zero. ext4_bread_batch() ends up not reading any of the directory file's blocks. This renders the entries in bh_use[] array to continue to have garbage data. buffer_uptodate() on bh_use[0] can then return a zero value upon which brelse() function is invoked. This commit fixes the bug by returning -ENOENT when the directory file has no associated blocks. Reported-by:
Abdul Haleem <abdhalee@linux.vnet.ibm.com> Signed-off-by:
Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Theodore Ts'o authored
commit 996fc447 upstream. It's possible for ext4_get_acl() to return an ERR_PTR. So we need to add a check for this case in __ext4_new_inode(). Otherwise on an error we can end up oops the kernel. This was getting triggered by xfstests generic/388, which is a test which exercises the shutdown code path. Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eryu Guan authored
commit c894aa97 upstream. Currently, fallocate(2) with KEEP_SIZE followed by a fdatasync(2) then crash, we'll see wrong allocated block number (stat -c %b), the blocks allocated beyond EOF are all lost. fstests generic/468 exposes this bug. Commit 67a7d5f5 ("ext4: fix fdatasync(2) after extent manipulation operations") fixed all the other extent manipulation operation paths such as hole punch, zero range, collapse range etc., but forgot the fallocate case. So similarly, fix it by recording the correct journal tid in ext4 inode in fallocate(2) path, so that ext4_sync_file() will wait for the right tid to be committed on fdatasync(2). This addresses the test failure in xfstests test generic/468. Signed-off-by:
Eryu Guan <eguan@redhat.com> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Andi Kleen authored
commit fc82228a upstream. 407cd7fb (ext4: change fast symlink test to not rely on i_blocks) broke ~10 years old ext3 file systems created by 2.6.17. Any ELF executable fails because the /lib/ld-linux.so.2 fast symlink cannot be read anymore. The patch assumed fast symlinks were created in a specific way, but that's not true on these really old file systems. The new behavior is apparently needed only with the large EA inode feature. Revert to the old behavior if the large EA inode feature is not set. This makes my old VM boot again. Fixes: 407cd7fb (ext4: change fast symlink test to not rely on i_blocks) Signed-off-by:
Andi Kleen <ak@linux.intel.com> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Reviewed-by:
Andreas Dilger <adilger@dilger.ca> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kees Cook authored
commit 779f4e1c upstream. This reverts commit 04e35f44. SELinux runs with secureexec for all non-"noatsecure" domain transitions, which means lots of processes end up hitting the stack hard-limit change that was introduced in order to fix a race with prlimit(). That race fix will need to be redesigned. Reported-by:
Laura Abbott <labbott@redhat.com> Reported-by:
Tomáš Trnka <trnka@scm.com> Signed-off-by:
Kees Cook <keescook@chromium.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Adam Wallis authored
commit 6f6a23a2 upstream. Commit adfa543e ("dmatest: don't use set_freezable_with_signal()") introduced a bug (that is in fact documented by the patch commit text) that leaves behind a dangling pointer. Since the done_wait structure is allocated on the stack, future invocations to the DMATEST can produce undesirable results (e.g., corrupted spinlocks). Commit a9df21e3 ("dmaengine: dmatest: warn user when dma test times out") attempted to WARN the user that the stack was likely corrupted but did not fix the actual issue. This patch fixes the issue by pushing the wait queue and callback structs into the the thread structure. If a failure occurs due to time, dmaengine_terminate_all will force the callback to safely call wake_up_all() without possibility of using a freed pointer. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=197605 Fixes: adfa543e ("dmatest: don't use set_freezable_with_signal()") Reviewed-by:
Sinan Kaya <okaya@codeaurora.org> Suggested-by:
Shunyong Yang <shunyong.yang@hxt-semitech.com> Signed-off-by:
Adam Wallis <awallis@codeaurora.org> Signed-off-by:
Vinod Koul <vinod.koul@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Thomas Gleixner authored
commit cef31d9a upstream. timer_create() specifies via sigevent->sigev_notify the signal delivery for the new timer. The valid modes are SIGEV_NONE, SIGEV_SIGNAL, SIGEV_THREAD and (SIGEV_SIGNAL | SIGEV_THREAD_ID). The sanity check in good_sigevent() is only checking the valid combination for the SIGEV_THREAD_ID bit, i.e. SIGEV_SIGNAL, but if SIGEV_THREAD_ID is not set it accepts any random value. This has no real effects on the posix timer and signal delivery code, but it affects show_timer() which handles the output of /proc/$PID/timers. That function uses a string array to pretty print sigev_notify. The access to that array has no bound checks, so random sigev_notify cause access beyond the array bounds. Add proper checks for the valid notify modes and remove the SIGEV_THREAD_ID masking from various code pathes as SIGEV_NONE can never be set in combination with SIGEV_THREAD_ID. Reported-by:
Eric Biggers <ebiggers3@gmail.com> Reported-by:
Dmitry Vyukov <dvyukov@google.com> Reported-by:
Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
David Lechner authored
commit 7f6d2ecd upstream. Trying to read the MAC address from an eeprom that has an offset that is not a multiple of 4 causes an error currently. Fix it by changing the nvmem stride to 1. Signed-off-by:
David Lechner <david@lechnology.com> [Bartosz: tweaked the commit message] Signed-off-by:
Bartosz Golaszewski <brgl@bgdev.pl> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kirill A. Shutemov authored
commit 6d7e0ba2 upstream. If the machine does not support the paging mode for which the kernel was compiled, the boot process cannot continue. It's not possible to let the kernel detect the mismatch as it does not even reach the point where cpu features can be evaluted due to a triple fault in the KASLR setup. Instead of instantaneous silent reboot, emit an error message which gives the user the information why the boot fails. Fixes: 77ef56e4 ("x86: Enable 5-level paging support via CONFIG_X86_5LEVEL=y") Reported-by:
Borislav Petkov <bp@suse.de> Signed-off-by:
Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Tested-by:
Borislav Petkov <bp@suse.de> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: linux-mm@kvack.org Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lkml.kernel.org/r/20171204124059.63515-3-kirill.shutemov@linux.intel.comSigned-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kirill A. Shutemov authored
commit 08529078 upstream. Prerequisite for fixing the current problem of instantaneous reboots when a 5-level paging kernel is booted on 4-level paging hardware. At the same time this change prepares the decompression code to boot-time switching between 4- and 5-level paging. [ tglx: Folded the GCC < 5 fix. ] Fixes: 77ef56e4 ("x86: Enable 5-level paging support via CONFIG_X86_5LEVEL=y") Signed-off-by:
Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: linux-mm@kvack.org Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lkml.kernel.org/r/20171204124059.63515-2-kirill.shutemov@linux.intel.comSigned-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Steve Wise authored
commit c058ecf6 upstream. Only insert our special drain CQEs to support ib_drain_sq/rq() after the wq is flushed. Otherwise, existing but not yet polled CQEs can be returned out of order to the user application. This can happen when the QP has exited RTS but not yet flushed the QP, which can happen during a normal close (vs abortive close). In addition never count the drain CQEs when determining how many CQEs need to be synthesized during the flush operation. This latter issue should never happen if the QP is properly flushed before inserting the drain CQE, but I wanted to avoid corrupting the CQ state. So we handle it and log a warning once. Fixes: 4fe7c296 ("iw_cxgb4: refactor sq/rq drain logic") Signed-off-by:
Steve Wise <swise@opengridcomputing.com> Signed-off-by:
Jason Gunthorpe <jgg@mellanox.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Trond Myklebust authored
commit 90d91b0c upstream. We must ensure that the call to rpc_sleep_on() in xprt_transmit() cannot race with the call to xprt_complete_rqst(). Reported-by:
Chuck Lever <chuck.lever@oracle.com> Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=317 Fixes: ce7c252a ("SUNRPC: Add a separate spinlock to protect..") Reviewed-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
monty_pavel@sina.com authored
commit 7e6358d2 upstream. A NULL pointer is seen if two concurrent "vgchange -ay -K <vg name>" processes race to load the dm-thin-pool module: PID: 25992 TASK: ffff883cd7d23500 CPU: 4 COMMAND: "vgchange" #0 [ffff883cd743d600] machine_kexec at ffffffff81038fa9 0000001 [ffff883cd743d660] crash_kexec at ffffffff810c5992 0000002 [ffff883cd743d730] oops_end at ffffffff81515c90 0000003 [ffff883cd743d760] no_context at ffffffff81049f1b 0000004 [ffff883cd743d7b0] __bad_area_nosemaphore at ffffffff8104a1a5 0000005 [ffff883cd743d800] bad_area at ffffffff8104a2ce 0000006 [ffff883cd743d830] __do_page_fault at ffffffff8104aa6f 0000007 [ffff883cd743d950] do_page_fault at ffffffff81517bae 0000008 [ffff883cd743d980] page_fault at ffffffff81514f95 [exception RIP: kmem_cache_alloc+108] RIP: ffffffff8116ef3c RSP: ffff883cd743da38 RFLAGS: 00010046 RAX: 0000000000000004 RBX: ffffffff81121b90 RCX: ffff881bf1e78cc0 RDX: 0000000000000000 RSI: 00000000000000d0 RDI: 0000000000000000 RBP: ffff883cd743da68 R8: ffff881bf1a4eb00 R9: 0000000080042000 R10: 0000000000002000 R11: 0000000000000000 R12: 00000000000000d0 R13: 0000000000000000 R14: 00000000000000d0 R15: 0000000000000246 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 0000009 [ffff883cd743da70] mempool_alloc_slab at ffffffff81121ba5 0000010 [ffff883cd743da80] mempool_create_node at ffffffff81122083 0000011 [ffff883cd743dad0] mempool_create at ffffffff811220f4 0000012 [ffff883cd743dae0] pool_ctr at ffffffffa08de049 [dm_thin_pool] 0000013 [ffff883cd743dbd0] dm_table_add_target at ffffffffa0005f2f [dm_mod] 0000014 [ffff883cd743dc30] table_load at ffffffffa0008ba9 [dm_mod] 0000015 [ffff883cd743dc90] ctl_ioctl at ffffffffa0009dc4 [dm_mod] The race results in a NULL pointer because: Process A (vgchange -ay -K): a. send DM_LIST_VERSIONS_CMD ioctl; b. pool_target not registered; c. modprobe dm_thin_pool and wait until end. Process B (vgchange -ay -K): a. send DM_LIST_VERSIONS_CMD ioctl; b. pool_target registered; c. table_load->dm_table_add_target->pool_ctr; d. _new_mapping_cache is NULL and panic. Note: 1. process A and process B are two concurrent processes. 2. pool_target can be detected by process B but _new_mapping_cache initialization has not ended. To fix dm-thin-pool, and other targets (cache, multipath, and snapshot) with the same problem, simply dm_register_target() after all resources created during module init (as labelled with __init) are finished. Signed-off-by:
monty <monty_pavel@sina.com> Signed-off-by:
Mike Snitzer <snitzer@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Steven Rostedt authored
commit f73c52a5 upstream. Daniel Wagner reported a crash on the BeagleBone Black SoC. This is a single CPU architecture, and does not have a functional arch_send_call_function_single_ipi() implementation which can crash the kernel if that is called. As it only has one CPU, it shouldn't be called, but if the kernel is compiled for SMP, the push/pull RT scheduling logic now calls it for irq_work if the one CPU is overloaded, it can use that function to call itself and crash the kernel. Ideally, we should disable the SCHED_FEAT(RT_PUSH_IPI) if the system only has a single CPU. But SCHED_FEAT is a constant if sched debugging is turned off. Another fix can also be used, and this should also help with normal SMP machines. That is, do not initiate the pull code if there's only one RT overloaded CPU, and that CPU happens to be the current CPU that is scheduling in a lower priority task. Even on a system with many CPUs, if there's many RT tasks waiting to run on a single CPU, and that CPU schedules in another RT task of lower priority, it will initiate the PULL logic in case there's a higher priority RT task on another CPU that is waiting to run. But if there is no other CPU with waiting RT tasks, it will initiate the RT pull logic on itself (as it still has RT tasks waiting to run). This is a wasted effort. Not only does this help with SMP code where the current CPU is the only one with RT overloaded tasks, it should also solve the issue that Daniel encountered, because it will prevent the PULL logic from executing, as there's only one CPU on the system, and the check added here will cause it to exit the RT pull code. Reported-by:
Daniel Wagner <wagi@monom.org> Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by:
Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-rt-users <linux-rt-users@vger.kernel.org> Fixes: 4bdced5c ("sched/rt: Simplify the IPI based RT balancing logic") Link: http://lkml.kernel.org/r/20171202130454.4cbbfe8d@vmware.local.homeSigned-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jason Yan authored
commit 621f6401 upstream. The return value of smp_execute_task_sg() is the untransferred residual, but bsg_job_done() requires the length of payload received. This makes SMP passthrough commands from userland by sg ioctl to libsas get a wrong response. The userland tools such as smp_utils failed because of these wrong responses: ~#smp_discover /dev/bsg/expander-2\:13 response too short, len=0 ~#smp_discover /dev/bsg/expander-2\:134 response too short, len=0 Fix this by passing the actual received length to bsg_job_done(). And if smp_execute_task_sg() returns 0, this means received length is exactly the buffer length. [mkp: typo] Fixes: 651a0136 ("scsi: scsi_transport_sas: switch to bsg-lib for SMP passthrough") Signed-off-by:
Jason Yan <yanaijie@huawei.com> Reported-by:
chenqilin <chenqilin2@huawei.com> Tested-by:
chenqilin <chenqilin2@huawei.com> CC: Christoph Hellwig <hch@lst.de> Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Bart Van Assche authored
commit 14e3062f upstream. Avoid that scsi_show_rq() triggers a NULL pointer dereference if called after sd_uninit_command(). Swap the NULL pointer assignment and the mempool_free() call in sd_uninit_command() to make it less likely that scsi_show_rq() triggers a use-after-free. Note: even with these changes scsi_show_rq() can trigger a use-after-free but that's a lesser evil than e.g. suppressing debug information for T10 PI Type 2 commands completely. This patch fixes the following oops: BUG: unable to handle kernel NULL pointer dereference at (null) IP: scsi_format_opcode_name+0x1a/0x1c0 CPU: 1 PID: 1881 Comm: cat Not tainted 4.14.0-rc2.blk_mq_io_hang+ #516 Call Trace: __scsi_format_command+0x27/0xc0 scsi_show_rq+0x5c/0xc0 __blk_mq_debugfs_rq_show+0x116/0x130 blk_mq_debugfs_rq_show+0xe/0x10 seq_read+0xfe/0x3b0 full_proxy_read+0x54/0x90 __vfs_read+0x37/0x160 vfs_read+0x96/0x130 SyS_read+0x55/0xc0 entry_SYSCALL_64_fastpath+0x1a/0xa5 [mkp: added Type 2] Fixes: 0eebd005 ("scsi: Implement blk_mq_ops.show_rq()") Reported-by:
Ming Lei <ming.lei@redhat.com> Signed-off-by:
Bart Van Assche <bart.vanassche@wdc.com> Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Ming Lei <ming.lei@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Mark Rutland authored
commit 1d08a044 upstream. In ptdump_check_wx(), we pass walk_pgd() a start address of 0 (rather than VA_START) for the init_mm. This means that any reported W&X addresses are offset by VA_START, which is clearly wrong and can make them appear like userspace addresses. Fix this by telling the ptdump code that we're walking init_mm starting at VA_START. We don't need to update the addr_markers, since these are still valid bounds regardless. Fixes: 1404d6f1 ("arm64: dump: Add checking for writable and exectuable pages") Signed-off-by:
Mark Rutland <mark.rutland@arm.com> Cc: Kees Cook <keescook@chromium.org> Cc: Laura Abbott <labbott@redhat.com> Reported-by:
Timur Tabi <timur@codeaurora.org> Signed-off-by:
Will Deacon <will.deacon@arm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Steve Capper authored
commit f24e5834 upstream. The high_memory global variable is used by cma_declare_contiguous(.) before it is defined. We don't notice this as we compute __pa(high_memory - 1), and it looks like we're processing a VA from the direct linear map. This problem becomes apparent when we flip the kernel virtual address space and the linear map is moved to the bottom of the kernel VA space. This patch moves the initialisation of high_memory before it used. Fixes: f7426b98 ("mm: cma: adjust address limit to avoid hitting low/high memory boundary") Signed-off-by:
Steve Capper <steve.capper@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Steve Capper authored
commit 8781bcbc upstream. On systems with hardware dirty bit management, the ltp madvise09 unit test fails due to dirty bit information being lost and pages being incorrectly freed. This was bisected to: arm64: Ignore hardware dirty bit updates in ptep_set_wrprotect() Reverting this commit leads to a separate problem, that the unit test retains pages that should have been dropped due to the function madvise_free_pte_range(.) not cleaning pte's properly. Currently pte_mkclean only clears the software dirty bit, thus the following code sequence can appear: pte = pte_mkclean(pte); if (pte_dirty(pte)) // this condition can return true with HW DBM! This patch also adjusts pte_mkclean to set PTE_RDONLY thus effectively clearing both the SW and HW dirty information. In order for this to function on systems without HW DBM, we need to also adjust pte_mkdirty to remove the read only bit from writable pte's to avoid infinite fault loops. Fixes: 64c26841 ("arm64: Ignore hardware dirty bit updates in ptep_set_wrprotect()") Reported-by:
Bhupinder Thakur <bhupinder.thakur@linaro.org> Tested-by:
Bhupinder Thakur <bhupinder.thakur@linaro.org> Reviewed-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Steve Capper <steve.capper@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Scott Mayhew authored
commit dc4fd9ab upstream. If there were no commit requests, then nfs_commit_inode() should not wait on the commit or mark the inode dirty, otherwise the following BUG_ON can be triggered: [ 1917.130762] kernel BUG at fs/inode.c:578! [ 1917.130766] Oops: Exception in kernel mode, sig: 5 [#1] [ 1917.130768] SMP NR_CPUS=2048 NUMA pSeries [ 1917.130772] Modules linked in: iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi blocklayoutdriver rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache sunrpc sg nx_crypto pseries_rng ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod [ 1917.130805] CPU: 2 PID: 14923 Comm: umount.nfs4 Tainted: G ------------ T 3.10.0-768.el7.ppc64 #1 [ 1917.130810] task: c0000005ecd88040 ti: c00000004cea0000 task.ti: c00000004cea0000 [ 1917.130813] NIP: c000000000354178 LR: c000000000354160 CTR: c00000000012db80 [ 1917.130816] REGS: c00000004cea3720 TRAP: 0700 Tainted: G ------------ T (3.10.0-768.el7.ppc64) [ 1917.130820] MSR: 8000000100029032 <SF,EE,ME,IR,DR,RI> CR: 22002822 XER: 20000000 [ 1917.130828] CFAR: c00000000011f594 SOFTE: 1 GPR00: c000000000354160 c00000004cea39a0 c0000000014c4700 c0000000018cc750 GPR04: 000000000000c750 80c0000000000000 0600000000000000 04eeb76bea749a03 GPR08: 0000000000000034 c0000000018cc758 0000000000000001 d000000005e619e8 GPR12: c00000000012db80 c000000007b31200 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR24: 0000000000000000 c000000000dfc3ec 0000000000000000 c0000005eefc02c0 GPR28: d0000000079dbd50 c0000005b94a02c0 c0000005b94a0250 c0000005b94a01c8 [ 1917.130867] NIP [c000000000354178] .evict+0x1c8/0x350 [ 1917.130871] LR [c000000000354160] .evict+0x1b0/0x350 [ 1917.130873] Call Trace: [ 1917.130876] [c00000004cea39a0] [c000000000354160] .evict+0x1b0/0x350 (unreliable) [ 1917.130880] [c00000004cea3a30] [c0000000003558cc] .evict_inodes+0x13c/0x270 [ 1917.130884] [c00000004cea3af0] [c000000000327d20] .kill_anon_super+0x70/0x1e0 [ 1917.130896] [c00000004cea3b80] [d000000005e43e30] .nfs_kill_super+0x20/0x60 [nfs] [ 1917.130900] [c00000004cea3c00] [c000000000328a20] .deactivate_locked_super+0xa0/0x1b0 [ 1917.130903] [c00000004cea3c80] [c00000000035ba54] .cleanup_mnt+0xd4/0x180 [ 1917.130907] [c00000004cea3d10] [c000000000119034] .task_work_run+0x114/0x150 [ 1917.130912] [c00000004cea3db0] [c00000000001ba6c] .do_notify_resume+0xcc/0x100 [ 1917.130916] [c00000004cea3e30] [c00000000000a7b0] .ret_from_except_lite+0x5c/0x60 [ 1917.130919] Instruction dump: [ 1917.130921] 7fc3f378 486734b5 60000000 387f00a0 38800003 4bdcb365 60000000 e95f00a0 [ 1917.130927] 694a0060 7d4a0074 794ad182 694a0001 <0b0a0000> 892d02a4 2f890000 40de0134 Signed-off-by:
Scott Mayhew <smayhew@redhat.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Daniel Jurgens authored
commit 0fbe8f57 upstream. Per the infiniband spec an SMI MAD can have any PKey. Checking the pkey on SMI MADs is not necessary, and it seems that some older adapters using the mthca driver don't follow the convention of using the default PKey, resulting in false denials, or errors querying the PKey cache. SMI MAD security is still enforced, only agents allowed to manage the subnet are able to receive or send SMI MADs. Reported-by:
Chris Blake <chrisrblake93@gmail.com> Fixes: 47a2b338 ("IB/core: Enforce security on management datagrams") Signed-off-by:
Daniel Jurgens <danielj@mellanox.com> Reviewed-by:
Parav Pandit <parav@mellanox.com> Signed-off-by:
Leon Romanovsky <leon@kernel.org> Signed-off-by:
Doug Ledford <dledford@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Daniel Jurgens authored
commit 4cae8ff1 upstream. The alternate port number is used as an array index in the IB security implementation, invalid values can result in a kernel panic. Fixes: d291f1a6 ("IB/core: Enforce PKey security on QPs") Signed-off-by:
Daniel Jurgens <danielj@mellanox.com> Reviewed-by:
Parav Pandit <parav@mellanox.com> Signed-off-by:
Leon Romanovsky <leon@kernel.org> Signed-off-by:
Doug Ledford <dledford@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-