- 22 Oct, 2018 1 commit
-
-
David S. Miller authored
The current VDSO patch mechanism has several problems: 1) It assumes how gcc will emit a function, with a register window, an initial save instruction and then immediately the %tick read when compiling vread_tick(). There is no such guarantees, code generation could change at any time, gcc could put a nop between the save and the %tick read, etc. So this is extremely fragile and would fail some day. 2) It disallows us to properly inline vread_tick() into the callers and thus get the best possible code sequences. So fix this to patch properly, with location based annotations. We have to be careful because we cannot do it the way we do patches elsewhere in the kernel. Those use a sequence like: 1: insn .section .whatever_patch, "ax" .word 1b replacement_insn .previous This is a dynamic shared object, so that .word cannot be resolved at build time, and thus cannot be used to execute the patches when the kernel initializes the images. Even trying to use label difference equations doesn't work in the above kind of scheme: 1: insn .section .whatever_patch, "ax" .word . - 1b replacement_insn .previous The assembler complains that it cannot resolve that computation. The issue is that this is contained in an executable section. Borrow the sequence used by x86 alternatives, which is: 1: insn .pushsection .whatever_patch, "a" .word . - 1b, . - 1f .popsection .pushsection .whatever_patch_replacements, "ax" 1: replacement_insn .previous This works, allows us to inline vread_tick() as much as we like, and can be used for arbitrary kinds of VDSO patching in the future. Also, reverse the condition for patching. Most systems are %stick based, so if we only patch on %tick systems the patching code will get little or no testing. Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 19 Oct, 2018 6 commits
-
-
David S. Miller authored
If PARPORT_PC_FIFO is not enabled, do not provide the dma lock macros and lock definition. Otherwise: ./arch/sparc/include/asm/parport.h:24:24: warning: ‘dma_spin_lock’ defined but not used [-Wunused-variable] static DEFINE_SPINLOCK(dma_spin_lock); ^~~~~~~~~~~~~ ./include/linux/spinlock_types.h:81:39: note: in definition of macro ‘DEFINE_SPINLOCK’ #define DEFINE_SPINLOCK(x) spinlock_t x = __SPIN_LOCK_UNLOCKED(x) Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netGreg Kroah-Hartman authored
David writes: "Networking 1) Fix gro_cells leak in xfrm layer, from Li RongQing. 2) BPF selftests change RLIMIT_MEMLOCK blindly, don't do that. From Eric Dumazet. 3) AF_XDP calls synchronize_net() under RCU lock, fix from Björn Töpel. 4) Out of bounds packet access in _decode_session6(), from Alexei Starovoitov. 5) Several ethtool bugs, where we copy a struct into the kernel twice and our validations of the values in the first copy can be invalidated by the second copy due to asynchronous updates to the memory by the user. From Wenwen Wang. 6) Missing netlink attribute validation in cls_api, from Davide Caratti. 7) LLC SAP sockets neet to be SOCK_RCU FREE, from Cong Wang. 8) rxrpc operates on wrong kvec, from Yue Haibing. 9) A regression was introduced by the disassosciation of route neighbour references in rt6_probe(), causing probe for neighbourless routes to not be properly rate limited. Fix from Sabrina Dubroca. 10) Unsafe RCU locking in tipc, from Tung Nguyen. 11) Use after free in inet6_mc_check(), from Eric Dumazet. 12) PMTU from icmp packets should update the SCTP transport pathmtu, from Xin Long. 13) Missing peer put on error in rxrpc, from David Howells. 14) Fix pedit in nfp driver, from Pieter Jansen van Vuuren. 15) Fix overflowing shift statement in qla3xxx driver, from Nathan Chancellor. 16) Fix Spectre v1 in ptp code, from Gustavo A. R. Silva. 17) udp6_unicast_rcv_skb() interprets udpv6_queue_rcv_skb() return value in an inverted manner, fix from Paolo Abeni. 18) Fix missed unresolved entries in ipmr dumps, from Nikolay Aleksandrov. 19) Fix NAPI handling under high load, we can completely miss events when NAPI has to loop more than one time in a cycle. From Heiner Kallweit." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (49 commits) ip6_tunnel: Fix encapsulation layout tipc: fix info leak from kernel tipc_event net: socket: fix a missing-check bug net: sched: Fix for duplicate class dump r8169: fix NAPI handling under high load net: ipmr: fix unresolved entry dumps net: mscc: ocelot: Fix comment in ocelot_vlant_wait_for_completion() sctp: fix the data size calculation in sctp_data_size virtio_net: avoid using netif_tx_disable() for serializing tx routine udp6: fix encap return code for resubmitting mlxsw: core: Fix use-after-free when flashing firmware during init sctp: not free the new asoc when sctp_wait_for_connect returns err sctp: fix race on sctp_id2asoc r8169: re-enable MSI-X on RTL8168g net: bpfilter: use get_pid_task instead of pid_task ptp: fix Spectre v1 vulnerability net: qla3xxx: Remove overflowing shift statement geneve, vxlan: Don't set exceptions if skb->len < mtu geneve, vxlan: Don't check skb_dst() twice sctp: get pr_assoc and pr_stream all status with SCTP_PR_SCTP_ALL instead ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcGreg Kroah-Hartman authored
David writes: "Sparc fixes: The main bit here is fixing how fallback system calls are handled in the sparc vDSO. Unfortunately, I fat fingered the commit and some perf debugging hacks slipped into the vDSO fix, which I revert in the very next commit." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc: Revert unintended perf changes. sparc: vDSO: Silence an uninitialized variable warning sparc: Fix syscall fallback bugs in VDSO.
-
git://anongit.freedesktop.org/drm/drmGreg Kroah-Hartman authored
Dave writes: "drm fixes for 4.19 final Just a last set of misc core fixes for final. 4 fixes, one use after free, one fb integration fix, one EDID fix, and one laptop panel quirk," * tag 'drm-fixes-2018-10-19' of git://anongit.freedesktop.org/drm/drm: drm/edid: VSDB yCBCr420 Deep Color mode bit definitions drm: fix use of freed memory in drm_mode_setcrtc drm: fb-helper: Reject all pixel format changing requests drm/edid: Add 6 bpc quirk for BOE panel in HP Pavilion 15-n233sl
-
git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaGreg Kroah-Hartman authored
Doug writes: "Really final for-rc pull request for 4.19 Ok, so last week I thought we had sent our final pull request for 4.19. Well, wouldn't ya know someone went and found a couple Spectre v1 fixes were needed :-/. So, a couple *very* small specter patches for this (hopefully) final -rc week." * tag 'for-gkh' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/ucma: Fix Spectre v1 vulnerability IB/ucm: Fix Spectre v1 vulnerability
-
git://anongit.freedesktop.org/drm/drm-miscDave Airlie authored
drm-misc-fixes for v4.19: - Fix use of freed memory in drm_mode_setcrtc. - Reject pixel format changing requests in fb helper. - Add 6 bpc quirk for HP Pavilion 15-n233sl - Fix VSDB yCBCr420 Deep Color mode bit definitions Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/647fe5d0-4ec5-57cc-9f23-a4836b29e278@linux.intel.com
-
- 18 Oct, 2018 31 commits
-
-
Stefano Brivio authored
Commit 058214a4 ("ip6_tun: Add infrastructure for doing encapsulation") added the ip6_tnl_encap() call in ip6_tnl_xmit(), before the call to ipv6_push_frag_opts() to append the IPv6 Tunnel Encapsulation Limit option (option 4, RFC 2473, par. 5.1) to the outer IPv6 header. As long as the option didn't actually end up in generated packets, this wasn't an issue. Then commit 89a23c8b ("ip6_tunnel: Fix missing tunnel encapsulation limit option") fixed sending of this option, and the resulting layout, e.g. for FoU, is: .-------------------.------------.----------.-------------------.----- - - | Outer IPv6 Header | UDP header | Option 4 | Inner IPv6 Header | Payload '-------------------'------------'----------'-------------------'----- - - Needless to say, FoU and GUE (at least) won't work over IPv6. The option is appended by default, and I couldn't find a way to disable it with the current iproute2. Turn this into a more reasonable: .-------------------.----------.------------.-------------------.----- - - | Outer IPv6 Header | Option 4 | UDP header | Inner IPv6 Header | Payload '-------------------'----------'------------'-------------------'----- - - With this, and with 84dad559 ("udp6: fix encap return code for resubmitting"), FoU and GUE work again over IPv6. Fixes: 058214a4 ("ip6_tun: Add infrastructure for doing encapsulation") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jon Maloy authored
We initialize a struct tipc_event allocated on the kernel stack to zero to avert info leak to user space. Reported-by: syzbot+057458894bc8cada4dee@syzkaller.appspotmail.com Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wenwen Wang authored
In ethtool_ioctl(), the ioctl command 'ethcmd' is checked through a switch statement to see whether it is necessary to pre-process the ethtool structure, because, as mentioned in the comment, the structure ethtool_rxnfc is defined with padding. If yes, a user-space buffer 'rxnfc' is allocated through compat_alloc_user_space(). One thing to note here is that, if 'ethcmd' is ETHTOOL_GRXCLSRLALL, the size of the buffer 'rxnfc' is partially determined by 'rule_cnt', which is actually acquired from the user-space buffer 'compat_rxnfc', i.e., 'compat_rxnfc->rule_cnt', through get_user(). After 'rxnfc' is allocated, the data in the original user-space buffer 'compat_rxnfc' is then copied to 'rxnfc' through copy_in_user(), including the 'rule_cnt' field. However, after this copy, no check is re-enforced on 'rxnfc->rule_cnt'. So it is possible that a malicious user race to change the value in the 'compat_rxnfc->rule_cnt' between these two copies. Through this way, the attacker can bypass the previous check on 'rule_cnt' and inject malicious data. This can cause undefined behavior of the kernel and introduce potential security risk. This patch avoids the above issue via copying the value acquired by get_user() to 'rxnfc->rule_cn', if 'ethcmd' is ETHTOOL_GRXCLSRLALL. Signed-off-by: Wenwen Wang <wang6495@umn.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Phil Sutter authored
When dumping classes by parent, kernel would return classes twice: | # tc qdisc add dev lo root prio | # tc class show dev lo | class prio 8001:1 parent 8001: | class prio 8001:2 parent 8001: | class prio 8001:3 parent 8001: | # tc class show dev lo parent 8001: | class prio 8001:1 parent 8001: | class prio 8001:2 parent 8001: | class prio 8001:3 parent 8001: | class prio 8001:1 parent 8001: | class prio 8001:2 parent 8001: | class prio 8001:3 parent 8001: This comes from qdisc_match_from_root() potentially returning the root qdisc itself if its handle matched. Though in that case, root's classes were already dumped a few lines above. Fixes: cb395b20 ("net: sched: optimize class dumps") Signed-off-by: Phil Sutter <phil@nwl.cc> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Heiner Kallweit authored
rtl_rx() and rtl_tx() are called only if the respective bits are set in the interrupt status register. Under high load NAPI may not be able to process all data (work_done == budget) and it will schedule subsequent calls to the poll callback. rtl_ack_events() however resets the bits in the interrupt status register, therefore subsequent calls to rtl8169_poll() won't call rtl_rx() and rtl_tx() - chip interrupts are still disabled. Fix this by calling rtl_rx() and rtl_tx() independent of the bits set in the interrupt status register. Both functions will detect if there's nothing to do for them. Fixes: da78dbff ("r8169: remove work from irq handler.") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Some local debugging hacks accidently slipped into the VDSO commit. Sorry! Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsecDavid S. Miller authored
Steffen Klassert says: ==================== pull request (net): ipsec 2018-10-18 1) Free the xfrm interface gro_cells when deleting the interface, otherwise we leak it. From Li RongQing. 2) net/core/flow.c does not exist anymore, so remove it from the MAINTAINERS file. 3) Fix a slab-out-of-bounds in _decode_session6. From Alexei Starovoitov. 4) Fix RCU protection when policies inserted into thei bydst lists. From Florian Westphal. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Sandeen authored
fscache_set_key() can incur an out-of-bounds read, reported by KASAN: BUG: KASAN: slab-out-of-bounds in fscache_alloc_cookie+0x5b3/0x680 [fscache] Read of size 4 at addr ffff88084ff056d4 by task mount.nfs/32615 and also reported by syzbot at https://lkml.org/lkml/2018/7/8/236 BUG: KASAN: slab-out-of-bounds in fscache_set_key fs/fscache/cookie.c:120 [inline] BUG: KASAN: slab-out-of-bounds in fscache_alloc_cookie+0x7a9/0x880 fs/fscache/cookie.c:171 Read of size 4 at addr ffff8801d3cc8bb4 by task syz-executor907/4466 This happens for any index_key_len which is not divisible by 4 and is larger than the size of the inline key, because the code allocates exactly index_key_len for the key buffer, but the hashing loop is stepping through it 4 bytes (u32) at a time in the buf[] array. Fix this by calculating how many u32 buffers we'll need by using DIV_ROUND_UP, and then using kcalloc() to allocate a precleared allocation buffer to hold the index_key, then using that same count as the hashing index limit. Fixes: ec0328e4 ("fscache: Maintain a catalogue of allocated cookies") Reported-by: syzbot+a95b989b2dde8e806af8@syzkaller.appspotmail.com Signed-off-by: Eric Sandeen <sandeen@redhat.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
David Howells authored
The inline key in struct rxrpc_cookie is insufficiently initialized, zeroing only 3 of the 4 slots, therefore an index_key_len between 13 and 15 bytes will end up hashing uninitialized memory because the memcpy only partially fills the last buf[] element. Fix this by clearing fscache_cookie objects on allocation rather than using the slab constructor to initialise them. We're going to pretty much fill in the entire struct anyway, so bringing it into our dcache writably shouldn't incur much overhead. This removes the need to do clearance in fscache_set_key() (where we aren't doing it correctly anyway). Also, we don't need to set cookie->key_len in fscache_set_key() as we already did it in the only caller, so remove that. Fixes: ec0328e4 ("fscache: Maintain a catalogue of allocated cookies") Reported-by: syzbot+a95b989b2dde8e806af8@syzkaller.appspotmail.com Reported-by: Eric Sandeen <sandeen@redhat.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Al Viro authored
the victim might've been rmdir'ed just before the lock_rename(); unlike the normal callers, we do not look the source up after the parents are locked - we know it beforehand and just recheck that it's still the child of what used to be its parent. Unfortunately, the check is too weak - we don't spot a dead directory since its ->d_parent is unchanged, dentry is positive, etc. So we sail all the way to ->rename(), with hosting filesystems _not_ expecting to be asked renaming an rmdir'ed subdirectory. The fix is easy, fortunately - the lock on parent is sufficient for making IS_DEADDIR() on child safe. Cc: stable@vger.kernel.org Fixes: 9ae326a6 (CacheFiles: A cache that backs onto a mounted filesystem) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Linus Torvalds authored
Jann Horn points out that our TLB flushing was subtly wrong for the mremap() case. What makes mremap() special is that we don't follow the usual "add page to list of pages to be freed, then flush tlb, and then free pages". No, mremap() obviously just _moves_ the page from one page table location to another. That matters, because mremap() thus doesn't directly control the lifetime of the moved page with a freelist: instead, the lifetime of the page is controlled by the page table locking, that serializes access to the entry. As a result, we need to flush the TLB not just before releasing the lock for the source location (to avoid any concurrent accesses to the entry), but also before we release the destination page table lock (to avoid the TLB being flushed after somebody else has already done something to that page). This also makes the whole "need_flush" logic unnecessary, since we now always end up flushing the TLB for every valid entry. Reported-and-tested-by: Jann Horn <jannh@google.com> Acked-by: Will Deacon <will.deacon@arm.com> Tested-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Christoph Hellwig authored
Using non-GPL licenses for our documentation is rather problematic, as it can directly include other files, which generally are GPLv2 licensed and thus not compatible. Remove this license now that the only user (idr.rst) is gone to avoid people semi-accidentally using it again. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
git://git.infradead.org/users/willy/linux-daxGreg Kroah-Hartman authored
Matthew writes: "IDA/IDR fixes for 4.19 I have two tiny fixes, one for the IDA test-suite and one for the IDR documentation license." * 'ida-fixes-4.19-rc8' of git://git.infradead.org/users/willy/linux-dax: idr: Change documentation license test_ida: Fix lockdep warning
-
Nikolay Aleksandrov authored
If the skb space ends in an unresolved entry while dumping we'll miss some unresolved entries. The reason is due to zeroing the entry counter between dumping resolved and unresolved mfc entries. We should just keep counting until the whole table is dumped and zero when we move to the next as we have a separate table counter. Reported-by: Colin Ian King <colin.king@canonical.com> Fixes: 8fb472c0 ("ipmr: improve hash scalability") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gregory CLEMENT authored
The ocelot_vlant_wait_for_completion() function is very similar to the ocelot_mact_wait_for_completion(). It seemed to have be copied but the comment was not updated, so let's fix it. Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Xin Long authored
sctp data size should be calculated by subtracting data chunk header's length from chunk_hdr->length, not just data header. Fixes: 668c9beb ("sctp: implement assign_number for sctp_stream_interleave") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ake Koomsin authored
Commit 713a98d9 ("virtio-net: serialize tx routine during reset") introduces netif_tx_disable() after netif_device_detach() in order to avoid use-after-free of tx queues. However, there are two issues. 1) Its operation is redundant with netif_device_detach() in case the interface is running. 2) In case of the interface is not running before suspending and resuming, the tx does not get resumed by netif_device_attach(). This results in losing network connectivity. It is better to use netif_tx_lock_bh()/netif_tx_unlock_bh() instead for serializing tx routine during reset. This also preserves the symmetry of netif_device_detach() and netif_device_attach(). Fixes commit 713a98d9 ("virtio-net: serialize tx routine during reset") Signed-off-by: Ake Koomsin <ake@igel.co.jp> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-traceGreg Kroah-Hartman authored
Steven writes: "tracing: Two fixes for 4.19 This fixes two bugs: - Fix size mismatch of tracepoint array - Have preemptirq test module use same clock source of the selftest" * tag 'trace-v4.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Use trace_clock_local() for looping in preemptirq_delay_test.c tracepoint: Fix tracepoint array element size mismatch
-
Paolo Abeni authored
The commit eb63f296 ("udp6: add missing checks on edumux packet processing") used the same return code convention of the ipv4 counterpart, but ipv6 uses the opposite one: positive values means resubmit. This change addresses the issue, using positive return value for resubmitting. Also update the related comment, which was broken, too. Fixes: eb63f296 ("udp6: add missing checks on edumux packet processing") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
When the switch driver (e.g., mlxsw_spectrum) determines it needs to flash a new firmware version it resets the ASIC after the flashing process. The bus driver (e.g., mlxsw_pci) then registers itself again with mlxsw_core which means (among other things) that the device registers itself again with the hwmon subsystem again. Since the device was registered with the hwmon subsystem using devm_hwmon_device_register_with_groups(), then the old hwmon device (registered before the flashing) was never unregistered and was referencing stale data, resulting in a use-after free. Fix by removing reliance on device managed APIs in mlxsw_hwmon_init(). Fixes: c86d62cc ("mlxsw: spectrum: Reset FW after flash") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Alexander Petrovskiy <alexpe@mellanox.com> Tested-by: Alexander Petrovskiy <alexpe@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Xin Long authored
When sctp_wait_for_connect is called to wait for connect ready for sp->strm_interleave in sctp_sendmsg_to_asoc, a panic could be triggered if cpu is scheduled out and the new asoc is freed elsewhere, as it will return err and later the asoc gets freed again in sctp_sendmsg. [ 285.840764] list_del corruption, ffff9f0f7b284078->next is LIST_POISON1 (dead000000000100) [ 285.843590] WARNING: CPU: 1 PID: 8861 at lib/list_debug.c:47 __list_del_entry_valid+0x50/0xa0 [ 285.846193] Kernel panic - not syncing: panic_on_warn set ... [ 285.846193] [ 285.848206] CPU: 1 PID: 8861 Comm: sctp_ndata Kdump: loaded Not tainted 4.19.0-rc7.label #584 [ 285.850559] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [ 285.852164] Call Trace: ... [ 285.872210] ? __list_del_entry_valid+0x50/0xa0 [ 285.872894] sctp_association_free+0x42/0x2d0 [sctp] [ 285.873612] sctp_sendmsg+0x5a4/0x6b0 [sctp] [ 285.874236] sock_sendmsg+0x30/0x40 [ 285.874741] ___sys_sendmsg+0x27a/0x290 [ 285.875304] ? __switch_to_asm+0x34/0x70 [ 285.875872] ? __switch_to_asm+0x40/0x70 [ 285.876438] ? ptep_set_access_flags+0x2a/0x30 [ 285.877083] ? do_wp_page+0x151/0x540 [ 285.877614] __sys_sendmsg+0x58/0xa0 [ 285.878138] do_syscall_64+0x55/0x180 [ 285.878669] entry_SYSCALL_64_after_hwframe+0x44/0xa9 This is a similar issue with the one fixed in Commit ca3af4dd ("sctp: do not free asoc when it is already dead in sctp_sendmsg"). But this one can't be fixed by returning -ESRCH for the dead asoc in sctp_wait_for_connect, as it will break sctp_connect's return value to users. This patch is to simply set err to -ESRCH before it returns to sctp_sendmsg when any err is returned by sctp_wait_for_connect for sp->strm_interleave, so that no asoc would be freed due to this. When users see this error, they will know the packet hasn't been sent. And it also makes sense to not free asoc because waiting connect fails, like the second call for sctp_wait_for_connect in sctp_sendmsg_to_asoc. Fixes: 668c9beb ("sctp: implement assign_number for sctp_stream_interleave") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Marcelo Ricardo Leitner authored
syzbot reported an use-after-free involving sctp_id2asoc. Dmitry Vyukov helped to root cause it and it is because of reading the asoc after it was freed: CPU 1 CPU 2 (working on socket 1) (working on socket 2) sctp_association_destroy sctp_id2asoc spin lock grab the asoc from idr spin unlock spin lock remove asoc from idr spin unlock free(asoc) if asoc->base.sk != sk ... [*] This can only be hit if trying to fetch asocs from different sockets. As we have a single IDR for all asocs, in all SCTP sockets, their id is unique on the system. An application can try to send stuff on an id that matches on another socket, and the if in [*] will protect from such usage. But it didn't consider that as that asoc may belong to another socket, it may be freed in parallel (read: under another socket lock). We fix it by moving the checks in [*] into the protected region. This fixes it because the asoc cannot be freed while the lock is held. Reported-by: syzbot+c7dd55d7aec49d48e49a@syzkaller.appspotmail.com Acked-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Heiner Kallweit authored
Similar to d49c88d7 ("r8169: Enable MSI-X on RTL8106e") after e9d0ba506ea8 ("PCI: Reprogram bridge prefetch registers on resume") we can safely assume that this also fixes the root cause of the issue worked around by 7c53a722 ("r8169: don't use MSI-X on RTL8168g"). So let's revert it. Fixes: 7c53a722 ("r8169: don't use MSI-X on RTL8168g") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Taehee Yoo authored
pid_task() dereferences rcu protected tasks array. But there is no rcu_read_lock() in shutdown_umh() routine so that rcu_read_lock() is needed. get_pid_task() is wrapper function of pid_task. it holds rcu_read_lock() then calls pid_task(). if task isn't NULL, it increases reference count of task. test commands: %modprobe bpfilter %modprobe -rv bpfilter splat looks like: [15102.030932] ============================= [15102.030957] WARNING: suspicious RCU usage [15102.030985] 4.19.0-rc7+ #21 Not tainted [15102.031010] ----------------------------- [15102.031038] kernel/pid.c:330 suspicious rcu_dereference_check() usage! [15102.031063] other info that might help us debug this: [15102.031332] rcu_scheduler_active = 2, debug_locks = 1 [15102.031363] 1 lock held by modprobe/1570: [15102.031389] #0: 00000000580ef2b0 (bpfilter_lock){+.+.}, at: stop_umh+0x13/0x52 [bpfilter] [15102.031552] stack backtrace: [15102.031583] CPU: 1 PID: 1570 Comm: modprobe Not tainted 4.19.0-rc7+ #21 [15102.031607] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 07/08/2015 [15102.031628] Call Trace: [15102.031676] dump_stack+0xc9/0x16b [15102.031723] ? show_regs_print_info+0x5/0x5 [15102.031801] ? lockdep_rcu_suspicious+0x117/0x160 [15102.031855] pid_task+0x134/0x160 [15102.031900] ? find_vpid+0xf0/0xf0 [15102.032017] shutdown_umh.constprop.1+0x1e/0x53 [bpfilter] [15102.032055] stop_umh+0x46/0x52 [bpfilter] [15102.032092] __x64_sys_delete_module+0x47e/0x570 [ ... ] Fixes: d2ba09c1 ("net: add skeleton of bpfilter kernel module") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gustavo A. R. Silva authored
pin_index can be indirectly controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. This issue was detected with the help of Smatch: drivers/ptp/ptp_chardev.c:253 ptp_ioctl() warn: potential spectre issue 'ops->pin_config' [r] (local cap) Fix this by sanitizing pin_index before using it to index ops->pin_config, and before passing it as an argument to function ptp_set_pinfunc(), in which it is used to index info->pin_config. Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2 Cc: stable@vger.kernel.org Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dan Carpenter authored
Smatch complains that "val" would be uninitialized if kstrtoul() fails. Fixes: 9a08862a ("vDSO for sparc") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nathan Chancellor authored
Clang currently warns: drivers/net/ethernet/qlogic/qla3xxx.c:384:24: warning: signed shift result (0xF00000000) requires 37 bits to represent, but 'int' only has 32 bits [-Wshift-overflow] ((ISP_NVRAM_MASK << 16) | qdev->eeprom_cmd_data)); ~~~~~~~~~~~~~~ ^ ~~ 1 warning generated. The warning is certainly accurate since ISP_NVRAM_MASK is defined as (0x000F << 16) which is then shifted by 16, resulting in 64424509440, well above UINT_MAX. Given that this is the only location in this driver where ISP_NVRAM_MASK is shifted again, it seems likely that ISP_NVRAM_MASK was originally defined without a shift and during the move of the shift to the definition, this statement wasn't properly removed (since ISP_NVRAM_MASK is used in the statenent right above this). Only the maintainers can confirm this since this statment has been here since the driver was first added to the kernel. Link: https://github.com/ClangBuiltLinux/linux/issues/127Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Stefano Brivio says: ==================== geneve, vxlan: Don't set exceptions if skb->len < mtu This series fixes the exception abuse described in 2/2, and 1/2 is just a preparatory change to make 2/2 less ugly. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stefano Brivio authored
We shouldn't abuse exceptions: if the destination MTU is already higher than what we're transmitting, no exception should be created. Fixes: 52a589d5 ("geneve: update skb dst pmtu on tx path") Fixes: a93bf0ff ("vxlan: update skb dst pmtu on tx path") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stefano Brivio authored
Commit f15ca723 ("net: don't call update_pmtu unconditionally") avoids that we try updating PMTU for a non-existent destination, but didn't clean up cases where the check was already explicit. Drop those redundant checks. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
First, the trap number for 32-bit syscalls is 0x10. Also, only negate the return value when syscall error is indicated by the carry bit being set. Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 17 Oct, 2018 2 commits
-
-
Steven Rostedt (VMware) authored
The preemptirq_delay_test module is used for the ftrace selftest code that tests the latency tracers. The problem is that it uses ktime for the delay loop, and then checks the tracer to see if the delay loop is caught, but the tracer uses trace_clock_local() which uses various different other clocks to measure the latency. As ktime uses the clock cycles, and the code then converts that to nanoseconds, it causes rounding errors, and the preemptirq latency tests are failing due to being off by 1 (it expects to see a delay of 500000 us, but the delay is only 499999 us). This is happening due to a rounding error in the ktime (which is totally legit). The purpose of the test is to see if it can catch the delay, not to test the accuracy between trace_clock_local() and ktime_get(). Best to use apples to apples, and have the delay loop use the same clock as the latency tracer does. Cc: stable@vger.kernel.org Fixes: f96e8577 ("lib: Add module for testing preemptoff/irqsoff latency tracers") Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
-
Mathieu Desnoyers authored
commit 46e0c9be ("kernel: tracepoints: add support for relative references") changes the layout of the __tracepoint_ptrs section on architectures supporting relative references. However, it does so without turning struct tracepoint * const into const int elsewhere in the tracepoint code, which has the following side-effect: Setting mod->num_tracepoints is done in by module.c: mod->tracepoints_ptrs = section_objs(info, "__tracepoints_ptrs", sizeof(*mod->tracepoints_ptrs), &mod->num_tracepoints); Basically, since sizeof(*mod->tracepoints_ptrs) is a pointer size (rather than sizeof(int)), num_tracepoints is erroneously set to half the size it should be on 64-bit arch. So a module with an odd number of tracepoints misses the last tracepoint due to effect of integer division. So in the module going notifier: for_each_tracepoint_range(mod->tracepoints_ptrs, mod->tracepoints_ptrs + mod->num_tracepoints, tp_module_going_check_quiescent, NULL); the expression (mod->tracepoints_ptrs + mod->num_tracepoints) actually evaluates to something within the bounds of the array, but miss the last tracepoint if the number of tracepoints is odd on 64-bit arch. Fix this by introducing a new typedef: tracepoint_ptr_t, which is either "const int" on architectures that have PREL32 relocations, or "struct tracepoint * const" on architectures that does not have this feature. Also provide a new tracepoint_ptr_defer() static inline to encapsulate deferencing this type rather than duplicate code and ugly idefs within the for_each_tracepoint_range() implementation. This issue appears in 4.19-rc kernels, and should ideally be fixed before the end of the rc cycle. Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Jessica Yu <jeyu@kernel.org> Link: http://lkml.kernel.org/r/20181013191050.22389-1-mathieu.desnoyers@efficios.com Link: http://lkml.kernel.org/r/20180704083651.24360-7-ard.biesheuvel@linaro.org Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Ingo Molnar <mingo@kernel.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morris <james.morris@microsoft.com> Cc: James Morris <jmorris@namei.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Nicolas Pitre <nico@linaro.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Russell King <linux@armlinux.org.uk> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Thomas Garnier <thgarnie@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
-