- 10 May, 2019 4 commits
-
-
Christian Brauner authored
Avoid calling cgroup_threadgroup_change_end() without having called cgroup_threadgroup_change_begin() first. During process creation we need to check whether the cgroup we are in allows us to fork. To perform this check the cgroup needs to guard itself against threadgroup changes and takes a lock. Prior to CLONE_PIDFD the cleanup target "bad_fork_free_pid" would also need to call cgroup_threadgroup_change_end() because said lock had already been taken. However, this is not the case anymore with the addition of CLONE_PIDFD. We are now allocating a pidfd before we check whether the cgroup we're in can fork and thus prior to taking the lock. So when copy_process() fails at the right step it would release a lock we haven't taken. This bug is not even very subtle to be honest. It's just not very clear from the naming of cgroup_threadgroup_change_{begin,end}() that a lock is taken. Here's the relevant splat: entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139 RIP: 0023:0xf7fec849 Code: 85 d2 74 02 89 0a 5b 5d c3 8b 04 24 c3 8b 14 24 c3 8b 3c 24 c3 90 90 90 90 90 90 90 90 90 90 90 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90 RSP: 002b:00000000ffed5a8c EFLAGS: 00000246 ORIG_RAX: 0000000000000078 RAX: ffffffffffffffda RBX: 0000000000003ffc RCX: 0000000000000000 RDX: 00000000200005c0 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000012 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 ------------[ cut here ]------------ DEBUG_LOCKS_WARN_ON(depth <= 0) WARNING: CPU: 1 PID: 7744 at kernel/locking/lockdep.c:4052 __lock_release kernel/locking/lockdep.c:4052 [inline] WARNING: CPU: 1 PID: 7744 at kernel/locking/lockdep.c:4052 lock_release+0x667/0xa00 kernel/locking/lockdep.c:4321 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 7744 Comm: syz-executor007 Not tainted 5.1.0+ #4 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 panic+0x2cb/0x65c kernel/panic.c:214 __warn.cold+0x20/0x45 kernel/panic.c:566 report_bug+0x263/0x2b0 lib/bug.c:186 fixup_bug arch/x86/kernel/traps.c:179 [inline] fixup_bug arch/x86/kernel/traps.c:174 [inline] do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:272 do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:291 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:972 RIP: 0010:__lock_release kernel/locking/lockdep.c:4052 [inline] RIP: 0010:lock_release+0x667/0xa00 kernel/locking/lockdep.c:4321 Code: 0f 85 a0 03 00 00 8b 35 77 66 08 08 85 f6 75 23 48 c7 c6 a0 55 6b 87 48 c7 c7 40 25 6b 87 4c 89 85 70 ff ff ff e8 b7 a9 eb ff <0f> 0b 4c 8b 85 70 ff ff ff 4c 89 ea 4c 89 e6 4c 89 c7 e8 52 63 ff RSP: 0018:ffff888094117b48 EFLAGS: 00010086 RAX: 0000000000000000 RBX: 1ffff11012822f6f RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffffff815af236 RDI: ffffed1012822f5b RBP: ffff888094117c00 R08: ffff888092bfc400 R09: fffffbfff113301d R10: fffffbfff113301c R11: ffffffff889980e3 R12: ffffffff8a451df8 R13: ffffffff8142e71f R14: ffffffff8a44cc80 R15: ffff888094117bd8 percpu_up_read.constprop.0+0xcb/0x110 include/linux/percpu-rwsem.h:92 cgroup_threadgroup_change_end include/linux/cgroup-defs.h:712 [inline] copy_process.part.0+0x47ff/0x6710 kernel/fork.c:2222 copy_process kernel/fork.c:1772 [inline] _do_fork+0x25d/0xfd0 kernel/fork.c:2338 __do_compat_sys_x86_clone arch/x86/ia32/sys_ia32.c:240 [inline] __se_compat_sys_x86_clone arch/x86/ia32/sys_ia32.c:236 [inline] __ia32_compat_sys_x86_clone+0xbc/0x140 arch/x86/ia32/sys_ia32.c:236 do_syscall_32_irqs_on arch/x86/entry/common.c:334 [inline] do_fast_syscall_32+0x281/0xd54 arch/x86/entry/common.c:405 entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139 RIP: 0023:0xf7fec849 Code: 85 d2 74 02 89 0a 5b 5d c3 8b 04 24 c3 8b 14 24 c3 8b 3c 24 c3 90 90 90 90 90 90 90 90 90 90 90 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90 RSP: 002b:00000000ffed5a8c EFLAGS: 00000246 ORIG_RAX: 0000000000000078 RAX: ffffffffffffffda RBX: 0000000000003ffc RCX: 0000000000000000 RDX: 00000000200005c0 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000012 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Kernel Offset: disabled Rebooting in 86400 seconds.. Reported-and-tested-by: syzbot+3286e58549edc479faae@syzkaller.appspotmail.com Fixes: b3e58382 ("clone: add CLONE_PIDFD") Signed-off-by: Christian Brauner <christian@brauner.io>
-
Christian Brauner authored
Ignore the pidfd-metadata binary so it doesn't show up in unwanted scenarios. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Christian Brauner <christian@brauner.io>
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds authored
Pull vfs mount fix from Al Viro: "Fix for umount -l/mount --move race caught by syzbot yesterday..." * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: do_move_mount(): fix an unsafe use of is_anon_ns()
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds authored
Pull networking fixes from David Miller: "Several bug fixes, many are quick merge-window regression cures: - When NLM_F_EXCL is not set, allow same fib rule insertion. From Hangbin Liu. - Several cures in sja1105 DSA driver (while loop exit condition fix, return of negative u8, etc.) from Vladimir Oltean. - Handle tx/rx delays in realtek PHY driver properly, from Serge Semin. - Double free in cls_matchall, from Pieter Jansen van Vuuren. - Disable SIOCSHWTSTAMP in macvlan/vlan containers, from Hangbin Liu. - Endainness fixes in aqc111, from Oliver Neukum. - Handle errors in packet_init properly, from Haibing Yue. - Various W=1 warning fixes in kTLS, from Jakub Kicinski" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (34 commits) nfp: add missing kdoc net/tls: handle errors from padding_length() net/tls: remove set but not used variables docs/btf: fix the missing section marks nfp: bpf: fix static check error through tightening shift amount adjustment selftests: bpf: initialize bpf_object pointers where needed packet: Fix error path in packet_init net/tcp: use deferred jump label for TCP acked data hook net: aquantia: fix undefined devm_hwmon_device_register_with_info reference aqc111: fix double endianness swap on BE aqc111: fix writing to the phy on BE aqc111: fix endianness issue in aqc111_change_mtu vlan: disable SIOCSHWTSTAMP in container macvlan: disable SIOCSHWTSTAMP in container tipc: fix hanging clients using poll with EPOLLOUT flag tuntap: synchronize through tfiles array instead of tun->numqueues tuntap: fix dividing by zero in ebpf queue selection dwmac4_prog_mtl_tx_algorithms() missing write operation ptp_qoriq: fix NULL access if ptp dt node missing net/sched: avoid double free on matchall reoffload ...
-
- 09 May, 2019 36 commits
-
-
Jakub Kicinski authored
Add missing kdoc for app member. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Jakub Kicinski says: ==================== net/tls: fix W=1 build warnings This small series cleans up two outstanding W=1 build warnings in tls code. Both are set but not used variables. The first case looks fairly straightforward. In the second I think it's better to propagate the error code, even if not doing some does not lead to a crash with current code. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
At the time padding_length() is called the record header is still part of the message. If malicious TLS 1.3 peer sends an all-zero record padding_length() will stop at the record header, and return full length of the data including the tail_size. Subsequent subtraction of prot->overhead_size from rxm->full_len will cause rxm->full_len to turn negative. skb accessors, however, will always catch resulting out-of-bounds operation, so in practice this fix comes down to returning the correct error code. It also fixes a set but not used warning. This code was added by commit 130b392c ("net: tls: Add tls 1.3 support"). CC: Dave Watson <davejwatson@fb.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jakub Kicinski authored
Commit 4504ab0e ("net/tls: Inform user space about send buffer availability") made us report write_space regardless whether partial record push was successful or not. Remove the now unused return value to clean up the following W=1 warning: net/tls/tls_device.c: In function ‘tls_device_write_space’: net/tls/tls_device.c:546:6: warning: variable ‘rc’ set but not used [-Wunused-but-set-variable] int rc = 0; ^~ CC: Vakul Garg <vakul.garg@nxp.com> CC: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller authored
Alexei Starovoitov says: ==================== pull-request: bpf 2019-05-09 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) three small fixes from Gary, Jiong and Lorenz. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gary Lin authored
The section titles of 3.4 and 3.5 are not marked correctly. Signed-off-by: Gary Lin <glin@suse.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Jiong Wang authored
NFP shift instruction has something special. If shift direction is left then shift amount of 1 to 31 is specified as 32 minus the amount to shift. But no need to do this for indirect shift which has shift amount be 0. Even after we do this subtraction, shift amount 0 will be turned into 32 which will eventually be encoded the same as 0 because only low 5 bits are encoded, but shift amount be 32 will fail the FIELD_PREP check done later on shift mask (0x1f), due to 32 is out of mask range. Such error has been observed when compiling nfp/bpf/jit.c using gcc 8.3 + O3. This issue has started when indirect shift support added after which the incoming shift amount to __emit_shf could be 0, therefore it is at that time shift amount adjustment inside __emit_shf should have been tightened. Fixes: 991f5b36 ("nfp: bpf: support logic indirect shifts (BPF_[L|R]SH | BPF_X)") Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name> Reported-by: Pablo Cascón <pablo.cascon@netronome.com Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
Lorenz Bauer authored
There are a few tests which call bpf_object__close on uninitialized bpf_object*, which may segfault. Explicitly zero-initialise these pointers to avoid this. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds authored
Pull sparc updates from David Miller: "Here we go: - Fix various long standing issues in the sparc 32-bit IOMMU support code, from Christoph Hellwig. - Various other code cleanups and simplifications all over. From Gustavo A. R. Silva, Jagadeesh Pagadala, Masahiro Yamada, Mauro Carvalho Chehab, Mike Rapoport" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc64: simplify reduce_memory() function sparc: use struct_size() in kzalloc() docs: sparc: convert to ReST sparc/iommu: merge iommu_get_one and __sbus_iommu_map_page sparc/iommu: use __sbus_iommu_map_page to implement the map_sg path sparc/iommu: fix __sbus_iommu_map_page for highmem pages sparc/iommu: move per-page flushing into __sbus_iommu_map_page sparc/iommu: pass a physical address to iommu_get_one sparc/iommu: create a common helper for map_sg sparc/iommu: merge iommu_release_one and sbus_iommu_unmap_page sparc/iommu: use sbus_iommu_unmap_page in sbus_iommu_unmap_sg sparc/iommu: use !PageHighMem to check if a page has a kernel mapping sparc: vdso: add FORCE to the build rule of %.so arch:sparc:kernel/uprobes.c : Remove duplicate header
-
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linuxLinus Torvalds authored
Pull clk framework updates from Stephen Boyd: "We have a couple new features and changes in the core clk framework this time around because we've finally gotten around to fixing some long standing issues. There's still work to do though, so this pull request is largely laying down the foundation for all the driver changes to come in the next merge window. The first problem we're alleviating is how parents of clks are specified. With the new method, we should see lots of drivers migrate away from the current design of string comparisons on the entire clk tree to a more direct method where they can use clk_hw pointers or more localized names specified in DT or via clkdev. This should reduce our reliance on string comparisons for all the topology description logic that we've been using for years and hopefully speed some things up while avoiding problems we have with generating clk names. Beyond that we also got rid of the CLK_IS_BASIC flag because it wasn't really helping anyone and we introduced big-endian versions of the basic clk types so that we can get rid of clk_{readl,writel}(). Both of these are things that driver developers have tried to use over the years that I typically bat away during code reviews because they're not useful. It's great to see these two things go away so maintainers can save time not worrying about these things. On the driver side we got the usual collection of new SoC support and non-critical fixes and updates to existing code. The big topics that stand out are the new driver support for Mediatek MT8183 and MT8516 SoCs, Amlogic Meson8b and G12a SoCs, and the SiFive FU540 SoC. The other patches in the driver pile are mostly fixes for things that are being used for the first time or additions for clks that couldn't be tested before because there wasn't a consumer driver that exercised them. Details are below and also in the sub-maintainer tags. Core: - Remove clk_readl() and introduce BE versions of basic clk types - Rewrite how clk parents can be specified to allow DT/clkdev lookups - Removal of the CLK_IS_BASIC clk flag - Framework documentation updates and fixes New Drivers: - Support for STM32F769 - AT91 sam9x60 PMC support - SiFive FU540 PRCI and PLL support - Qualcomm QCS404 CDSP clk support - Qualcomm QCS404 Turing clk support - Mediatek MT8183 clock support - Mediatek MT8516 clock support - Milbeaut M10V clk controller support - Support for Cirrus Logic Lochnagar clks Updates: - Rework AT91 sckc DT bindings - Fix slow RC oscillator issue on sama5d3 - Mark UFS clk as critical on Hi-Silicon hi3660 SoCs - Various static analysis fixes/finds and const markings - Video Engine (ECLK) support on Aspeed SoCs - Xilinx ZynqMP Versal platform support - Convert Xilinx ZynqMP driver to be struct oriented - Fixes for Rockchip rk3328 and rk3288 SoCs - Sub-type for Rockchip SoCs where mux and divider aren't a single register - Remove SNVS clock from i.MX7UPL clock driver and bindings - Improve i.MX5 clock driver for i.MX50 support - Addition of ADC clock definition for Exynos 5410 SoC (Odroid XU) - Export a new clock for the MBUS controller on the A13 - Allwinner H6 fixes to support a finer clocking of the video and VPU engines - Add g12a support in the Amlogic axg audio clock controller - Add missing PCI USB clock on Rensas RZ/N1 - Add Z2 (Cortex-A53) clocks on Rensas R-Car E3 and RZ/G2E - A new helper DIV64_U64_ROUND_CLOSEST() in <linux/math64.h> - VPU and Video Decoder clocks on Amlogic Meson8b - Finally remove the wrong ABP Meson8b clock id - Add Video Decoder, PCIe PLL, and CPU Clocks on Amlogic G12A - Re-expose SAR_ADC_SEL and CTS_OSCIN on Amlogic G12A AO clock controller - Un-expose some Amlogic AXG-Audio input clocks IDs" * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (172 commits) clk: Cache core in clk_fetch_parent_index() without names clk: imx: correct pfdv2 gate_bit/vld_bit operations clk: sifive: add a driver for the SiFive FU540 PRCI IP block clk: analogbits: add Wide-Range PLL library clk: imx: clk-pllv3: mark expected switch fall-throughs clk: imx8mq: Add dsi_ipg_div clk: imx: pllv4: add fractional-N pll support clk: sunxi-ng: Use the correct style for SPDX License Identifier clk: sprd: Use the correct style for SPDX License Identifier clk: renesas: Use the correct style for SPDX License Identifier clk: qcom: Use the correct style for SPDX License Identifier clk: davinci: Use the correct style for SPDX License Identifier clk: actions: Use the correct style for SPDX License Identifier clk: imx: keep uart clock on during system boot clk: imx: correct i.MX7D AV PLL num/denom offset dt-bindings: clk: add documentation for the SiFive PRCI driver clk: stm32mp1: Add ddrperfm clock clk: Remove CLK_IS_BASIC clk flag clock: milbeaut: Add Milbeaut M10V clock controller dt-bindings: clock: milbeaut: add Milbeaut clock description ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linuxLinus Torvalds authored
Pull RTC updates from Alexandre Belloni: "A huge series from me this cycle. I went through many drivers to set the date and time range supported by the RTC which helps solving HW limitation when the time comes (as early as next year for some). This time, I focused on drivers using .set_mms and .set_mmss64, allowing me to remove those callbacks. About a third of the patches got reviews, I actually own the RTCs and I tested another third and the remaining one are unlikely to cause any issues. Other than that, a single new driver and the usual fixes here and there. Summary: Subsystem: - set_mmss and set_mmss64 rtc_ops removal - Fix timestamp value for RTC_TIMESTAMP_BEGIN_1900 - Use SPDX identifier for the core - validate upper bound of tm->tm_year New driver: - Aspeed BMC SoC RTC Drivers: - abx80x: use rtc_add_group - ds3232: nvram support - pcf85063: add alarm, nvram, offset correction and microcrystal rv8263 support - x1205: add of_match_table - Use set_time instead of set_mms/set_mmss64 for: ab3100, coh901331, digicolor, ds1672, ds2404, ep93xx, imxdi, jz4740, lpc32xx, mc13xxx, mxc, pcap, stmp3xxx, test, wm831x, xgene. - Set RTC range for: ab3100, at91sam9, coh901331, da9063, digicolor, dm355evm, ds1672, ds2404, ep39xx, goldfish, imxdi, jz4740, lpc32xx, mc13xxx, mv, mxc, omap, pcap, pcf85063, pcf85363, ps3, sh, stmp3xxx, sun4v, tegra, wm831x, xgene. - Switch to rtc_time64_to_tm/rtc_tm_to_time64 for the driver that properly set the RTC range. - Use dev_get_drvdata instead of multiple indirections" * tag 'rtc-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (177 commits) rtc: snvs: Use __maybe_unused instead of #if CONFIG_PM_SLEEP rtc: imxdi: remove unused variable rtc: drop set_mms and set_mmss64 rtc: pcap: convert to SPDX identifier rtc: pcap: use .set_time rtc: pcap: switch to rtc_time64_to_tm/rtc_tm_to_time64 rtc: pcap: set range rtc: digicolor: convert to SPDX identifier rtc: digicolor: use .set_time rtc: digicolor: set range rtc: digicolor: fix possible race condition rtc: jz4740: convert to SPDX identifier rtc: jz4740: rework invalid time detection rtc: jz4740: use dev_pm_set_wake_irq() to simplify code rtc: jz4740: use .set_time rtc: jz4740: remove useless check rtc: jz4740: switch to rtc_time64_to_tm/rtc_tm_to_time64 rtc: jz4740: set range rtc: 88pm860x: prevent use-after-free on device remove rtc: Use dev_get_drvdata() ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linuxLinus Torvalds authored
Pull i2c updates from Wolfram Sang: - API for late atomic transfers (e.g. to shut down via PMIC). We have a seperate callback now which is called under clearly defined conditions. In-kernel users are converted, too. - new driver for the AMD PCIe MP2 I2C controller - large refactoring for at91 and bcm-iproc (both gain slave support due to this) - and a good share of various driver improvements anf fixes * 'i2c/for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (57 commits) dt-bindings: i2c: riic: document r7s9210 support i2c: imx-lpi2c: Use __maybe_unused instead of #if CONFIG_PM_SLEEP i2c-piix4: Add Hygon Dhyana SMBus support i2c: core: apply 'is_suspended' check for SMBus, too i2c: core: ratelimit 'transfer when suspended' errors i2c: iproc: Change driver to use 'BIT' macro i2c: riic: Add Runtime PM support i2c: mux: demux-pinctrl: use struct_size() in devm_kzalloc() i2c: mux: pca954x: allow management of device idle state via sysfs i2c: mux: pca9541: remove support for unused platform data i2c: mux: pca954x: remove support for unused platform data dt-bindings: i2c: i2c-mtk: add support for MT8516 i2c: axxia: use auto cmd for last message i2c: gpio: flag atomic capability if possible i2c: algo: bit: add flag to whitelist atomic transfers i2c: stu300: use xfer_atomic callback to bail out early i2c: ocores: enable atomic xfers i2c: ocores: refactor setup for polling i2c: tegra-bpmp: convert to use new atomic callbacks i2c: omap: Add the master_xfer_atomic hook ...
-
git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds authored
Pull NFS client updates from Anna Schumaker: "Highlights include: Stable bugfixes: - Fall back to MDS if no deviceid is found rather than aborting # v4.11+ - NFS4: Fix v4.0 client state corruption when mount Features: - Much improved handling of soft mounts with NFS v4.0: - Reduce risk of false positive timeouts - Faster failover of reads and writes after a timeout - Added a "softerr" mount option to return ETIMEDOUT instead of EIO to the application after a timeout - Increase number of xprtrdma backchannel requests - Add additional xprtrdma tracepoints - Improved send completion batching for xprtrdma Other bugfixes and cleanups: - Return -EINVAL when NFS v4.2 is passed an invalid dedup mode - Reduce usage of GFP_ATOMIC pages in SUNRPC - Various minor NFS over RDMA cleanups and bugfixes - Use the correct container namespace for upcalls - Don't share superblocks between user namespaces - Various other container fixes - Make nfs_match_client() killable to prevent soft lockups - Don't mark all open state for recovery when handling recallable state revoked flag" * tag 'nfs-for-5.2-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (69 commits) SUNRPC: Rebalance a kref in auth_gss.c NFS: Fix a double unlock from nfs_match,get_client nfs: pass the correct prototype to read_cache_page NFSv4: don't mark all open state for recovery when handling recallable state revoked flag SUNRPC: Fix an error code in gss_alloc_msg() SUNRPC: task should be exit if encode return EKEYEXPIRED more times NFS4: Fix v4.0 client state corruption when mount PNFS fallback to MDS if no deviceid found NFS: make nfs_match_client killable lockd: Store the lockd client credential in struct nlm_host NFS: When mounting, don't share filesystems between different user namespaces NFS: Convert NFSv2 to use the container user namespace NFSv4: Convert the NFS client idmapper to use the container user namespace NFS: Convert NFSv3 to use the container user namespace SUNRPC: Use namespace of listening daemon in the client AUTH_GSS upcall SUNRPC: Use the client user namespace when encoding creds NFS: Store the credential of the mount process in the nfs_server SUNRPC: Cache cred of process creating the rpc_client xprtrdma: Remove stale comment xprtrdma: Update comments that reference ib_drain_qp ...
-
Mike Rapoport authored
The reduce_memory() function clampls the available memory to a limit defined by the "mem=" command line parameter. It takes into account the amount of already reserved memory and excludes it from the limit calculations. Rather than traverse memblocks and remove them by hand, use memblock_reserved_size() to account the reserved memory and memblock_enforce_memory_limit() to clamp the available memory. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gustavo A. R. Silva authored
One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct foo { int stuff; void *entry[]; }; instance = kzalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL); Instead of leaving these open-coded and prone to type mistakes, we can now use the new struct_size() helper: instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL); This code was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroupLinus Torvalds authored
Pull cgroup updates from Tejun Heo: "This includes Roman's cgroup2 freezer implementation. It's a separate machanism from cgroup1 freezer. Instead of blocking user tasks in arbitrary uninterruptible sleeps, the new implementation extends jobctl stop - frozen tasks are trapped in jobctl stop until thawed and can be killed and ptraced. Lots of thanks to Oleg for sheperding the effort. Other than that, there are a few trivial changes" * 'for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: never call do_group_exit() with task->frozen bit set kernel: cgroup: fix misuse of %x cgroup: get rid of cgroup_freezer_frozen_exit() cgroup: prevent spurious transition into non-frozen state cgroup: Remove unused cgrp variable cgroup: document cgroup v2 freezer interface cgroup: add tracing points for cgroup v2 freezer cgroup: make TRACE_CGROUP_PATH irq-safe kselftests: cgroup: add freezer controller self-tests kselftests: cgroup: don't fail on cg_kill_all() error in cg_destroy() cgroup: cgroup v2 freezer cgroup: protect cgroup->nr_(dying_)descendants by css_set_lock cgroup: implement __cgroup_task_count() helper cgroup: rename freezer.c into legacy_freezer.c cgroup: remove extra cgroup_migrate_finish() call
-
git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds authored
Pull workqueue updates from Tejun Heo: "Only three commits, of which two are trivial. The non-trivial chagne is Thomas's patch to switch workqueue from sched RCU to regular one. The use of sched RCU is mostly historic and doesn't really buy us anything noticeable" * 'for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: Use normal rcu kernel/workqueue: Document wq_worker_last_func() argument kernel/workqueue: Use __printf markup to silence compiler in function 'alloc_workqueue'
-
YueHaibing authored
kernel BUG at lib/list_debug.c:47! invalid opcode: 0000 [#1 CPU: 0 PID: 12914 Comm: rmmod Tainted: G W 5.1.0+ #47 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014 RIP: 0010:__list_del_entry_valid+0x53/0x90 Code: 48 8b 32 48 39 fe 75 35 48 8b 50 08 48 39 f2 75 40 b8 01 00 00 00 5d c3 48 89 fe 48 89 c2 48 c7 c7 18 75 fe 82 e8 cb 34 78 ff <0f> 0b 48 89 fe 48 c7 c7 50 75 fe 82 e8 ba 34 78 ff 0f 0b 48 89 f2 RSP: 0018:ffffc90001c2fe40 EFLAGS: 00010286 RAX: 000000000000004e RBX: ffffffffa0184000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff888237a17788 RDI: 00000000ffffffff RBP: ffffc90001c2fe40 R08: 0000000000000000 R09: 0000000000000000 R10: ffffc90001c2fe10 R11: 0000000000000000 R12: 0000000000000000 R13: ffffc90001c2fe50 R14: ffffffffa0184000 R15: 0000000000000000 FS: 00007f3d83634540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000555c350ea818 CR3: 0000000231677000 CR4: 00000000000006f0 Call Trace: unregister_pernet_operations+0x34/0x120 unregister_pernet_subsys+0x1c/0x30 packet_exit+0x1c/0x369 [af_packet __x64_sys_delete_module+0x156/0x260 ? lockdep_hardirqs_on+0x133/0x1b0 ? do_syscall_64+0x12/0x1f0 do_syscall_64+0x6e/0x1f0 entry_SYSCALL_64_after_hwframe+0x49/0xbe When modprobe af_packet, register_pernet_subsys fails and does a cleanup, ops->list is set to LIST_POISON1, but the module init is considered to success, then while rmmod it, BUG() is triggered in __list_del_entry_valid which is called from unregister_pernet_subsys. This patch fix error handing path in packet_init to avoid possilbe issue if some error occur. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://github.com/c-sky/csky-linuxLinus Torvalds authored
Pull arch/csky perf update from Guo Ren: "Add support for perf unwind-libdw" * tag 'csky-for-linus-5.2-perf-unwind-libdw' of git://github.com/c-sky/csky-linux: csky: Add support for perf unwind-libdw
-
Chuck Lever authored
Restore the kref_get that matches the gss_put_auth(gss_msg->auth) done by gss_release_msg(). Fixes: ac83228a ("SUNRPC: Use namespace of listening daemon ...") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Benjamin Coddington authored
Now that nfs_match_client drops the nfs_client_lock, we should be careful to always return it in the same condition: locked. Fixes: 950a578c ("NFS: make nfs_match_client killable") Reported-by: syzbot+228a82b263b5da91883d@syzkaller.appspotmail.com Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Christoph Hellwig authored
Fix the callbacks NFS passes to read_cache_page to actually have the proper type expected. Casting around function pointers can easily hide typing bugs, and defeats control flow protection. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Scott Mayhew authored
Only delegations and layouts can be recalled, so it shouldn't be necessary to recover all opens when handling the status bit SEQ4_STATUS_RECALLABLE_STATE_REVOKED. We'll still wind up calling nfs41_open_expired() when a TEST_STATEID returns NFS4ERR_DELEG_REVOKED. Signed-off-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Dan Carpenter authored
If kstrdup_const() then this function returns zero (success) but it should return -ENOMEM. Fixes: ac83228a ("SUNRPC: Use namespace of listening daemon in the client AUTH_GSS upcall") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
ZhangXiaoxu authored
If the rpc.gssd always return cred success, but now the cred is expired, then the task will loop in call_refresh and call_transmit. Exit the rpc task after retry. Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
ZhangXiaoxu authored
stat command with soft mount never return after server is stopped. When alloc a new client, the state of the client will be set to NFS4CLNT_LEASE_EXPIRED. When the server is stopped, the state manager will work, and accord the state to recover. But the state is NFS4CLNT_LEASE_EXPIRED, it will drain the slot table and lead other task to wait queue, until the client recovered. Then the stat command is hung. When discover server trunking, the client will renew the lease, but check the client state, it lead the client state corruption. So, we need to call state manager to recover it when detect server ip trunking. Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com> Cc: stable@vger.kernel.org Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Olga Kornievskaia authored
If we fail to find a good deviceid while trying to pnfs instead of propogating an error back fallback to doing IO to the MDS. Currently, code with fals the IO with EINVAL. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Fixes: 8d40b0f1 ("NFS filelayout:call GETDEVICEINFO after pnfs_layout_process completes" Cc: stable@vger.kernel.org # v4.11+ Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
-
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-securityLinus Torvalds authored
Pull smack updates from James Morris: "Bug fixes for IPv6 handling and other issues and two memory use improvements." * 'next-smack' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: Smack: Fix kbuild reported build error smack: Check address length before reading address family Smack: Fix IPv6 handling of 0 secmark Smack: Create smack_rule cache to optimize memory usage smack: removal of global rule list
-
Linus Torvalds authored
Merge branch 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull intgrity updates from James Morris: "This contains just three patches, the remainder were either included in other pull requests (eg. audit, lockdown) or will be upstreamed via other subsystems (eg. kselftests, Power). Included here is one bug fix, one documentation update, and extending the x86 IMA arch policy rules to coordinate the different kernel module signature verification methods" * 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: doc/kernel-parameters.txt: Deprecate ima_appraise_tcb x86/ima: add missing include x86/ima: require signed kernel modules
-
Jakub Kicinski authored
User space can flip the clean_acked_data_enabled static branch on and off with TLS offload when CONFIG_TLS_DEVICE is enabled. jump_label.h suggests we use the delayed version in this case. Deferred branches now also don't take the branch mutex on decrement, so we avoid potential locking issues. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kefeng Wang authored
drivers/net/ethernet/aquantia/atlantic/aq_drvinfo.o: In function `aq_drvinfo_init': aq_drvinfo.c:(.text+0xe8): undefined reference to `devm_hwmon_device_register_with_info' Fix it by using #if IS_REACHABLE(CONFIG_HWMON). Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.open-mesh.org/linux-mergeDavid S. Miller authored
Simon Wunderlich says: ==================== This feature/cleanup patchset includes the following patches: - bump version strings, by Simon Wunderlich (we forgot to include this patch previously ...) - fix multicast tt/tvlv worker locking, by Linus Lüssing ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linuxLinus Torvalds authored
Pull orangefs updates from Mike Marshall: "This includes one fix and our "Orangefs through the pagecache" patch series which greatly improves our small IO performance and helps us pass more xfstests than before. Fix: - orangefs: truncate before updating size Pagecache series: - all the rest" * tag 'for-linus-5.2-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux: (23 commits) orangefs: truncate before updating size orangefs: copy Orangefs-sized blocks into the pagecache if possible. orangefs: pass slot index back to readpage. orangefs: remember count when reading. orangefs: add orangefs_revalidate_mapping orangefs: implement writepages orangefs: write range tracking orangefs: avoid fsync service operation on flush orangefs: skip inode writeout if nothing to write orangefs: move do_readv_writev to direct_IO orangefs: do not return successful read when the client-core disappeared orangefs: implement writepage orangefs: migrate to generic_file_read_iter orangefs: service ops done for writeback are not killable orangefs: remove orangefs_readpages orangefs: reorganize setattr functions to track attribute changes orangefs: let setattr write to cached inode orangefs: set up and use backing_dev_info orangefs: hold i_lock during inode_getattr orangefs: update attributes rather than relying on server ...
-
Oliver Neukum authored
If you are using a function that does a swap in place, you cannot just reuse the buffer on the assumption that it has not been changed. Signed-off-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Oliver Neukum authored
When writing to the phy on BE architectures an internal data structure was directly given, leading to it being byte swapped in the wrong way for the CPU in 50% of all cases. A temporary buffer must be used. Signed-off-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Oliver Neukum authored
If the MTU is large enough, the first write to the device is just repeated. On BE architectures, however, the first word of the command will be swapped a second time and garbage will be written. Avoid that. Signed-off-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-