- 19 May, 2018 16 commits
-
-
Dave Watson authored
[ Upstream commit c212d2c7 ] It is reported that in some cases, write_space may be called in do_tcp_sendpages, such that we recursively invoke do_tcp_sendpages again: [ 660.468802] ? do_tcp_sendpages+0x8d/0x580 [ 660.468826] ? tls_push_sg+0x74/0x130 [tls] [ 660.468852] ? tls_push_record+0x24a/0x390 [tls] [ 660.468880] ? tls_write_space+0x6a/0x80 [tls] ... tls_push_sg already does a loop over all sending sg's, so ignore any tls_write_space notifications until we are done sending. We then have to call the previous write_space to wake up poll() waiters after we are done with the send loop. Reported-by:
Andre Tomt <andre@tomt.net> Signed-off-by:
Dave Watson <davejwatson@fb.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Lance Richardson authored
[ Upstream commit 988bf724 ] For the x32 ABI, struct timeval has two 64-bit fields. However the kernel currently interprets the user-space values used for the SO_RCVTIMEO and SO_SNDTIMEO socket options as having a pair of 32-bit fields. When the seconds portion of the requested timeout is less than 2**32, the seconds portion of the effective timeout is correct but the microseconds portion is zero. When the seconds portion of the requested timeout is zero and the microseconds portion is non-zero, the kernel interprets the timeout as zero (never timeout). Fix by using 64-bit time for SO_RCVTIMEO/SO_SNDTIMEO as required for the ABI. The code included below demonstrates the problem. Results before patch: $ gcc -m64 -Wall -O2 -o socktmo socktmo.c && ./socktmo recv time: 2.008181 seconds send time: 2.015985 seconds $ gcc -m32 -Wall -O2 -o socktmo socktmo.c && ./socktmo recv time: 2.016763 seconds send time: 2.016062 seconds $ gcc -mx32 -Wall -O2 -o socktmo socktmo.c && ./socktmo recv time: 1.007239 seconds send time: 1.023890 seconds Results after patch: $ gcc -m64 -O2 -Wall -o socktmo socktmo.c && ./socktmo recv time: 2.010062 seconds send time: 2.015836 seconds $ gcc -m32 -O2 -Wall -o socktmo socktmo.c && ./socktmo recv time: 2.013974 seconds send time: 2.015981 seconds $ gcc -mx32 -O2 -Wall -o socktmo socktmo.c && ./socktmo recv time: 2.030257 seconds send time: 2.013383 seconds #include <stdio.h> #include <stdlib.h> #include <sys/socket.h> #include <sys/types.h> #include <sys/time.h> void checkrc(char *str, int rc) { if (rc >= 0) return; perror(str); exit(1); } static char buf[1024]; int main(int argc, char **argv) { int rc; int socks[2]; struct timeval tv; struct timeval start, end, delta; rc = socketpair(AF_UNIX, SOCK_STREAM, 0, socks); checkrc("socketpair", rc); /* set timeout to 1.999999 seconds */ tv.tv_sec = 1; tv.tv_usec = 999999; rc = setsockopt(socks[0], SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv); rc = setsockopt(socks[0], SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof tv); checkrc("setsockopt", rc); /* measure actual receive timeout */ gettimeofday(&start, NULL); rc = recv(socks[0], buf, sizeof buf, 0); gettimeofday(&end, NULL); timersub(&end, &start, &delta); printf("recv time: %ld.%06ld seconds\n", (long)delta.tv_sec, (long)delta.tv_usec); /* fill send buffer */ do { rc = send(socks[0], buf, sizeof buf, 0); } while (rc > 0); /* measure actual send timeout */ gettimeofday(&start, NULL); rc = send(socks[0], buf, sizeof buf, 0); gettimeofday(&end, NULL); timersub(&end, &start, &delta); printf("send time: %ld.%06ld seconds\n", (long)delta.tv_sec, (long)delta.tv_usec); exit(0); } Fixes: 515c7af8 ("x32: Use compat shims for {g,s}etsockopt") Reported-by:
Gopal RajagopalSai <gopalsr83@gmail.com> Signed-off-by:
Lance Richardson <lance.richardson.net@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eric Dumazet authored
[ Upstream commit 7df40c26 ] Normally, a socket can not be freed/reused unless all its TX packets left qdisc and were TX-completed. However connect(AF_UNSPEC) allows this to happen. With commit fc59d5bd ("pkt_sched: fq: clear time_next_packet for reused flows") we cleared f->time_next_packet but took no special action if the flow was still in the throttled rb-tree. Since f->time_next_packet is the key used in the rb-tree searches, blindly clearing it might break rb-tree integrity. We need to make sure the flow is no longer in the rb-tree to avoid this problem. Fixes: fc59d5bd ("pkt_sched: fq: clear time_next_packet for reused flows") Signed-off-by:
Eric Dumazet <edumazet@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Roman Mashak authored
[ Upstream commit a52956df ] When application fails to pass flags in netlink TLV when replacing existing skbmod action, the kernel will leak refcnt: $ tc actions get action skbmod index 1 total acts 0 action order 0: skbmod pipe set smac 00:11:22:33:44:55 index 1 ref 1 bind 0 For example, at this point a buggy application replaces the action with index 1 with new smac 00:aa:22:33:44:55, it fails because of zero flags, however refcnt gets bumped: $ tc actions get actions skbmod index 1 total acts 0 action order 0: skbmod pipe set smac 00:11:22:33:44:55 index 1 ref 2 bind 0 $ Tha patch fixes this by calling tcf_idr_release() on existing actions. Fixes: 86da71b5 ("net_sched: Introduce skbmod action") Signed-off-by:
Roman Mashak <mrv@mojatatu.com> Acked-by:
Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Adi Nissim authored
[ Upstream commit 88d725bb ] The host side reporting of VF vport statistics didn't include the VF RDMA traffic. Fixes: 3b751a2a ("net/mlx5: E-Switch, Introduce get vf statistics") Signed-off-by:
Adi Nissim <adin@mellanox.com> Reported-by:
Ariel Almog <ariela@mellanox.com> Reviewed-by:
Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by:
Saeed Mahameed <saeedm@mellanox.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Roi Dayan authored
[ Upstream commit f85900c3 ] The HW doesn't support matching on frag first/later, return error if we are asked to offload that. Fixes: 3f7d0eb4 ("net/mlx5e: Offload TC matching on packets being IP fragments") Signed-off-by:
Roi Dayan <roid@mellanox.com> Reviewed-by:
Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by:
Saeed Mahameed <saeedm@mellanox.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Moshe Shemesh authored
[ Upstream commit 6ad4e91c ] Add check of coalescing parameters received through ethtool are within range of values supported by the HW. Driver gets the coalescing rx/tx-usecs and rx/tx-frames as set by the users through ethtool. The ethtool support up to 32 bit value for each. However, mlx4 modify cq limits the coalescing time parameter and coalescing frames parameters to 16 bits. Return out of range error if user tries to set these parameters to higher values. Change type of sample-interval and adaptive_rx_coal parameters in mlx4 driver to u32 as the ethtool holds them as u32 and these parameters are not limited due to mlx4 HW. Fixes: c27a02cd ('mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC') Signed-off-by:
Moshe Shemesh <moshe@mellanox.com> Signed-off-by:
Tariq Toukan <tariqt@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Christophe JAILLET authored
[ Upstream commit a577d868 ] If an error occurs, 'mlx4_en_destroy_netdev()' is called. It then calls 'mlx4_en_free_resources()' which does the needed resources cleanup. So, doing some explicit kfree in the error handling path would lead to some double kfree. Simplify code to avoid such a case. Fixes: 67f8b1dc ("net/mlx4_en: Refactor the XDP forwarding rings scheme") Signed-off-by:
Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by:
Tariq Toukan <tariqt@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Grygorii Strashko authored
[ Upstream commit 5e5add17 ] In dual_mac mode packets arrived on one port should not be forwarded by switch hw to another port. Only Linux Host can forward packets between ports. The below test case (reported in [1]) shows that packet arrived on one port can be leaked to anoter (reproducible with dual port evms): - connect port 1 (eth0) to linux Host 0 and run tcpdump or Wireshark - connect port 2 (eth1) to linux Host 1 with vlan 1 configured - ping <IPx> from Host 1 through vlan 1 interface. ARP packets will be seen on Host 0. Issue happens because dual_mac mode is implemnted using two vlans: 1 (Port 1+Port 0) and 2 (Port 2+Port 0), so there are vlan records created for for each vlan. By default, the ALE will find valid vlan record in its table when vlan 1 tagged packet arrived on Port 2 and so forwards packet to all ports which are vlan 1 members (like Port. To avoid such behaviorr the ALE VLAN ID Ingress Check need to be enabled for each external CPSW port (ALE_PORTCTLn.VID_INGRESS_CHECK) so ALE will drop ingress packets if Rx port is not VLAN member. Signed-off-by:
Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Rob Taglang authored
[ Upstream commit 14224923 ] Currently, skb->len and skb->data_len are set to the page size, not the packet size. This causes the frame check sequence to not be located at the "end" of the packet resulting in ethernet frame check errors. The driver does work currently, but stricter kernel facing networking solutions like OpenVSwitch will drop these packets as invalid. These changes set the packet size correctly so that these errors no longer occur. The length does not include the frame check sequence, so that subtraction was removed. Tested on Oracle/SUN Multithreaded 10-Gigabit Ethernet Network Controller [108e:abcd] and validated in wireshark. Signed-off-by:
Rob Taglang <rob@taglang.io> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eric Dumazet authored
[ Upstream commit 2c5d5b13 ] syzbot loves to set very small mtu on devices, since it brings joy. We must make llc_ui_sendmsg() fool proof. usercopy: Kernel memory overwrite attempt detected to wrapped address (offset 0, size 18446612139802320068)! kernel BUG at mm/usercopy.c:100! invalid opcode: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 0 PID: 17464 Comm: syz-executor1 Not tainted 4.17.0-rc3+ #36 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:usercopy_abort+0xbb/0xbd mm/usercopy.c:88 RSP: 0018:ffff8801868bf800 EFLAGS: 00010282 RAX: 000000000000006c RBX: ffffffff87d2fb00 RCX: 0000000000000000 RDX: 000000000000006c RSI: ffffffff81610731 RDI: ffffed0030d17ef6 RBP: ffff8801868bf858 R08: ffff88018daa4200 R09: ffffed003b5c4fb0 R10: ffffed003b5c4fb0 R11: ffff8801dae27d87 R12: ffffffff87d2f8e0 R13: ffffffff87d2f7a0 R14: ffffffff87d2f7a0 R15: ffffffff87d2f7a0 FS: 00007f56a14ac700(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b2bc21000 CR3: 00000001abeb1000 CR4: 00000000001426f0 DR0: 0000000020000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000030602 Call Trace: check_bogus_address mm/usercopy.c:153 [inline] __check_object_size+0x5d9/0x5d9 mm/usercopy.c:256 check_object_size include/linux/thread_info.h:108 [inline] check_copy_size include/linux/thread_info.h:139 [inline] copy_from_iter_full include/linux/uio.h:121 [inline] memcpy_from_msg include/linux/skbuff.h:3305 [inline] llc_ui_sendmsg+0x4b1/0x1530 net/llc/af_llc.c:941 sock_sendmsg_nosec net/socket.c:629 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:639 __sys_sendto+0x3d7/0x670 net/socket.c:1789 __do_sys_sendto net/socket.c:1801 [inline] __se_sys_sendto net/socket.c:1797 [inline] __x64_sys_sendto+0xe1/0x1a0 net/socket.c:1797 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x455979 RSP: 002b:00007f56a14abc68 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00007f56a14ac6d4 RCX: 0000000000455979 RDX: 0000000000000000 RSI: 0000000020000000 RDI: 0000000000000018 RBP: 000000000072bea0 R08: 00000000200012c0 R09: 0000000000000010 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff R13: 0000000000000548 R14: 00000000006fbf60 R15: 0000000000000000 Code: 55 c0 e8 c0 55 bb ff ff 75 c8 48 8b 55 c0 4d 89 f9 ff 75 d0 4d 89 e8 48 89 d9 4c 89 e6 41 56 48 c7 c7 80 fa d2 87 e8 a0 0b a3 ff <0f> 0b e8 95 55 bb ff e8 c0 a8 f7 ff 8b 95 14 ff ff ff 4d 89 e8 RIP: usercopy_abort+0xbb/0xbd mm/usercopy.c:88 RSP: ffff8801868bf800 Fixes: 1da177e4 ("Linux-2.6.12-rc2") Signed-off-by:
Eric Dumazet <edumazet@google.com> Reported-by:
syzbot <syzkaller@googlegroups.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Andrey Ignatov authored
[ Upstream commit 1b97013b ] Fix more memory leaks in ip_cmsg_send() callers. Part of them were fixed earlier in 91948309. * udp_sendmsg one was there since the beginning when linux sources were first added to git; * ping_v4_sendmsg one was copy/pasted in c319b4d7. Whenever return happens in udp_sendmsg() or ping_v4_sendmsg() IP options have to be freed if they were allocated previously. Add label so that future callers (if any) can use it instead of kfree() before return that is easy to forget. Fixes: c319b4d7 (net: ipv4: add IPPROTO_ICMP socket kind) Signed-off-by:
Andrey Ignatov <rdna@fb.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Julian Anastasov authored
[ Upstream commit 94720e3a ] Allow some non-cached routes to use non-expired fnhe: 1. ip_del_fnhe: moved above and now called by find_exception. The 4.5+ commit deed49df expires fnhe only when caching routes. Change that to: 1.1. use fnhe for non-cached local output routes, with the help from (2) 1.2. allow __mkroute_input to detect expired fnhe (outdated fnhe_gw, for example) when do_cache is false, eg. when itag!=0 for unicast destinations. 2. __mkroute_output: keep fi to allow local routes with orig_oif != 0 to use fnhe info even when the new route will not be cached into fnhe. After commit 839da4d9 ("net: ipv4: set orig_oif based on fib result for local traffic") it means all local routes will be affected because they are not cached. This change is used to solve a PMTU problem with IPVS (and probably Netfilter DNAT) setups that redirect local clients from target local IP (local route to Virtual IP) to new remote IP target, eg. IPVS TUN real server. Loopback has 64K MTU and we need to create fnhe on the local route that will keep the reduced PMTU for the Virtual IP. Without this change fnhe_pmtu is updated from ICMP but never exposed to non-cached local routes. This includes routes with flowi4_oif!=0 for 4.6+ and with flowi4_oif=any for 4.14+). 3. update_or_create_fnhe: make sure fnhe_expires is not 0 for new entries Fixes: 839da4d9 ("net: ipv4: set orig_oif based on fib result for local traffic") Fixes: d6d5e999 ("route: do not cache fib route info on local routes with oif") Fixes: deed49df ("route: check and remove route cache when we get route") Cc: David Ahern <dsahern@gmail.com> Cc: Xin Long <lucien.xin@gmail.com> Signed-off-by:
Julian Anastasov <ja@ssi.bg> Acked-by:
David Ahern <dsahern@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eric Dumazet authored
[ Upstream commit a8d7aa17 ] syzbot reported a crash in tasklet_action_common() caused by dccp. dccp needs to make sure socket wont disappear before tasklet handler has completed. This patch takes a reference on the socket when arming the tasklet, and moves the sock_put() from dccp_write_xmit_timer() to dccp_write_xmitlet() kernel BUG at kernel/softirq.c:514! invalid opcode: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 1 PID: 17 Comm: ksoftirqd/1 Not tainted 4.17.0-rc3+ #30 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:tasklet_action_common.isra.19+0x6db/0x700 kernel/softirq.c:515 RSP: 0018:ffff8801d9b3faf8 EFLAGS: 00010246 dccp_close: ABORT with 65423 bytes unread RAX: 1ffff1003b367f6b RBX: ffff8801daf1f3f0 RCX: 0000000000000000 RDX: ffff8801cf895498 RSI: 0000000000000004 RDI: 0000000000000000 RBP: ffff8801d9b3fc40 R08: ffffed0039f12a95 R09: ffffed0039f12a94 dccp_close: ABORT with 65423 bytes unread R10: ffffed0039f12a94 R11: ffff8801cf8954a3 R12: 0000000000000000 R13: ffff8801d9b3fc18 R14: dffffc0000000000 R15: ffff8801cf895490 FS: 0000000000000000(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b2bc28000 CR3: 00000001a08a9000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: tasklet_action+0x1d/0x20 kernel/softirq.c:533 __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285 dccp_close: ABORT with 65423 bytes unread run_ksoftirqd+0x86/0x100 kernel/softirq.c:646 smpboot_thread_fn+0x417/0x870 kernel/smpboot.c:164 kthread+0x345/0x410 kernel/kthread.c:238 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412 Code: 48 8b 85 e8 fe ff ff 48 8b 95 f0 fe ff ff e9 94 fb ff ff 48 89 95 f0 fe ff ff e8 81 53 6e 00 48 8b 95 f0 fe ff ff e9 62 fb ff ff <0f> 0b 48 89 cf 48 89 8d e8 fe ff ff e8 64 53 6e 00 48 8b 8d e8 RIP: tasklet_action_common.isra.19+0x6db/0x700 kernel/softirq.c:515 RSP: ffff8801d9b3faf8 Fixes: dc841e30 ("dccp: Extend CCID packet dequeueing interface") Signed-off-by:
Eric Dumazet <edumazet@google.com> Reported-by:
syzbot <syzkaller@googlegroups.com> Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk> Cc: dccp@vger.kernel.org Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Hangbin Liu authored
[ Upstream commit e8238fc2 ] When we set a bond slave's master to bridge via ioctl, we only check the IFF_BRIDGE_PORT flag. Although we will find the slave's real master at netdev_master_upper_dev_link() later, it already does some settings and allocates some resources. It would be better to return as early as possible. v1 -> v2: use netdev_master_upper_dev_get() instead of netdev_has_any_upper_dev() to check if we have a master, because not all upper devs are masters, e.g. vlan device. Reported-by: syzbot+de73361ee4971b6e6f75@syzkaller.appspotmail.com Signed-off-by:
Hangbin Liu <liuhangbin@gmail.com> Acked-by:
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ingo Molnar authored
[ Upstream commit af3e0fcf ] Use disable_irq_nosync() instead of disable_irq() as this might be called in atomic context with netpoll. Signed-off-by:
Ingo Molnar <mingo@elte.hu> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 16 May, 2018 24 commits
-
-
Greg Kroah-Hartman authored
-
Anthoine Bourgeois authored
commit ecf08dad upstream. Since the commit "8003c9ae: add APIC Timer periodic/oneshot mode VMX preemption timer support", a Windows 10 guest has some erratic timer spikes. Here the results on a 150000 times 1ms timer without any load: Before 8003c9ae | After 8003c9ae Max 1834us | 86000us Mean 1100us | 1021us Deviation 59us | 149us Here the results on a 150000 times 1ms timer with a cpu-z stress test: Before 8003c9ae | After 8003c9ae Max 32000us | 140000us Mean 1006us | 1997us Deviation 140us | 11095us The root cause of the problem is starting hrtimer with an expiry time already in the past can take more than 20 milliseconds to trigger the timer function. It can be solved by forward such past timers immediately, rather than submitting them to hrtimer_start(). In case the timer is periodic, update the target expiration and call hrtimer_start with it. v2: Check if the tsc deadline is already expired. Thank you Mika. v3: Execute the past timers immediately rather than submitting them to hrtimer_start(). v4: Rearm the periodic timer with advance_periodic_target_expiration() a simpler version of set_target_expiration(). Thank you Paolo. Cc: Mika Penttilä <mika.penttila@nextfour.com> Cc: Wanpeng Li <kernellwp@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: stable@vger.kernel.org Signed-off-by:
Anthoine Bourgeois <anthoine.bourgeois@blade-group.com> 8003c9ae ("KVM: LAPIC: add APIC Timer periodic/oneshot mode VMX preemption timer support") Signed-off-by:
Radim Krčmář <rkrcmar@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Paul Mackerras authored
commit c3856aeb upstream. This fixes several bugs in the radix page fault handler relating to the way large pages in the memory backing the guest were handled. First, the check for large pages only checked for explicit huge pages and missed transparent huge pages. Then the check that the addresses (host virtual vs. guest physical) had appropriate alignment was wrong, meaning that the code never put a large page in the partition scoped radix tree; it was always demoted to a small page. Fixing this exposed bugs in kvmppc_create_pte(). We were never invalidating a 2MB PTE, which meant that if a page was initially faulted in without write permission and the guest then attempted to store to it, we would never update the PTE to have write permission. If we find a valid 2MB PTE in the PMD, we need to clear it and do a TLB invalidation before installing either the new 2MB PTE or a pointer to a page table page. This also corrects an assumption that get_user_pages_fast would set the _PAGE_DIRTY bit if we are writing, which is not true. Instead we mark the page dirty explicitly with set_page_dirty_lock(). This also means we don't need the dirty bit set on the host PTE when providing write access on a read fault. [paulus@ozlabs.org - use mark_pages_dirty instead of kvmppc_update_dirty_map] Signed-off-by:
Paul Mackerras <paulus@ozlabs.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Zijlstra authored
commit 46b1b577 upstream. > arch/x86/events/intel/cstate.c:307 cstate_pmu_event_init() warn: potential spectre issue 'pkg_msr' (local cap) > arch/x86/events/intel/core.c:337 intel_pmu_event_map() warn: potential spectre issue 'intel_perfmon_event_map' > arch/x86/events/intel/knc.c:122 knc_pmu_event_map() warn: potential spectre issue 'knc_perfmon_event_map' > arch/x86/events/intel/p4.c:722 p4_pmu_event_map() warn: potential spectre issue 'p4_general_events' > arch/x86/events/intel/p6.c:116 p6_pmu_event_map() warn: potential spectre issue 'p6_perfmon_event_map' > arch/x86/events/amd/core.c:132 amd_pmu_event_map() warn: potential spectre issue 'amd_perfmon_event_map' Userspace controls @attr, sanitize @attr->config before passing it on to x86_pmu::event_map(). Reported-by:
Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Zijlstra authored
commit 4411ec1d upstream. > kernel/events/ring_buffer.c:871 perf_mmap_to_page() warn: potential spectre issue 'rb->aux_pages' Userspace controls @pgoff through the fault address. Sanitize the array index before doing the array dereference. Reported-by:
Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Zijlstra authored
commit 06ce6e9b upstream. > arch/x86/events/msr.c:178 msr_event_init() warn: potential spectre issue 'msr' (local cap) Userspace controls @attr, sanitize cfg (attr->config) before using it to index an array. Reported-by:
Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Zijlstra authored
commit a5f81290 upstream. > arch/x86/events/intel/cstate.c:307 cstate_pmu_event_init() warn: potential spectre issue 'pkg_msr' (local cap) Userspace controls @attr, sanitize cfg (attr->config) before using it to index an array. Reported-by:
Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Zijlstra authored
commit ef9ee4ad upstream. > arch/x86/events/core.c:319 set_ext_hw_attr() warn: potential spectre issue 'hw_cache_event_ids[cache_type]' (local cap) > arch/x86/events/core.c:319 set_ext_hw_attr() warn: potential spectre issue 'hw_cache_event_ids' (local cap) > arch/x86/events/core.c:328 set_ext_hw_attr() warn: potential spectre issue 'hw_cache_extra_regs[cache_type]' (local cap) > arch/x86/events/core.c:328 set_ext_hw_attr() warn: potential spectre issue 'hw_cache_extra_regs' (local cap) Userspace controls @config which contains 3 (byte) fields used for a 3 dimensional array deref. Reported-by:
Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Masami Hiramatsu authored
commit 50268a3d upstream. Fix string fetch function to terminate with NUL. It is OK to drop the rest of string. Signed-off-by:
Masami Hiramatsu <mhiramat@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: security@kernel.org Cc: 范龙飞 <long7573@126.com> Fixes: 5baaa59e ("tracing/probes: Implement 'memory' fetch method for uprobes") Signed-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Zijlstra authored
commit 354d7793 upstream. > kernel/sched/autogroup.c:230 proc_sched_autogroup_set_nice() warn: potential spectre issue 'sched_prio_to_weight' Userspace controls @nice, sanitize the array index. Reported-by:
Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@kernel.org> Signed-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Steve French authored
commit 6e70c267 upstream. As with NFS, which ignores sync on directory handles, fsync on a directory handle is a noop for CIFS/SMB3. Do not return an error on it. It breaks some database apps otherwise. Signed-off-by:
Steve French <smfrench@gmail.com> CC: Stable <stable@vger.kernel.org> Reviewed-by:
Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by:
Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jens Axboe authored
commit 9abd68ef upstream. Some P3100 drives have a bug where they think WRRU (weighted round robin) is always enabled, even though the host doesn't set it. Since they think it's enabled, they also look at the submission queue creation priority. We used to set that to MEDIUM by default, but that was removed in commit 81c1cd98. This causes various issues on that drive. Add a quirk to still set MEDIUM priority for that controller. Fixes: 81c1cd98 ("nvme/pci: Don't set reserved SQ create flags") Cc: stable@vger.kernel.org Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Keith Busch <keith.busch@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marek Szyprowski authored
commit c8da6cde upstream. tmu_read() in case of Exynos4210 might return error for out of bound values. Current code ignores such value, what leads to reporting critical temperature value. Add proper error code propagation to exynos_get_temp() function. Signed-off-by:
Marek Szyprowski <m.szyprowski@samsung.com> CC: stable@vger.kernel.org # v4.6+ Signed-off-by:
Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by:
Eduardo Valentin <edubezval@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marek Szyprowski authored
commit 88fc6f73 upstream. When thermal sensor is not yet enabled, reading temperature might return random value. This might even result in stopping system booting when such temperature is higher than the critical value. Fix this by checking if TMU has been actually enabled before reading the temperature. This change fixes booting of Exynos4210-based board with TMU enabled (for example Samsung Trats board), which was broken since v4.4 kernel release. Signed-off-by:
Marek Szyprowski <m.szyprowski@samsung.com> Fixes: 9e4249b4 ("thermal: exynos: Fix first temperature read after registering sensor") CC: stable@vger.kernel.org # v4.6+ Signed-off-by:
Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by:
Eduardo Valentin <edubezval@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Hans de Goede authored
commit fc549102 upstream. Jeremy Cline correctly points out in rhbz#1514836 that a device where the QCA rome chipset needs the USB_QUIRK_RESET_RESUME quirk, may also ship with a different wifi/bt chipset in some configurations. If that is the case then we are needlessly penalizing those other chipsets with a reset-resume quirk, typically causing 0.4W extra power use because this disables runtime-pm. This commit moves the DMI table check to a btusb_check_needs_reset_resume() helper (so that we can easily also call it for other chipsets) and calls this new helper only for QCA_ROME chipsets for now. BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1514836 Cc: stable@vger.kernel.org Cc: Jeremy Cline <jcline@redhat.com> Suggested-by:
Jeremy Cline <jcline@redhat.com> Signed-off-by:
Hans de Goede <hdegoede@redhat.com> Signed-off-by:
Marcel Holtmann <marcel@holtmann.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Hans de Goede authored
commit 596b07a9 upstream. The Dell XPS 13 9360 uses a QCA Rome chip which needs to be reset (and have its firmware reloaded) for bluetooth to work after suspend/resume. BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1514836 Cc: stable@vger.kernel.org Cc: Garrett LeSage <glesage@redhat.com> Reported-and-tested-by:
Garrett LeSage <glesage@redhat.com> Signed-off-by:
Hans de Goede <hdegoede@redhat.com> Signed-off-by:
Marcel Holtmann <marcel@holtmann.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Hans de Goede authored
commit 544a5916 upstream. Commit f44cb4b1 ("Bluetooth: btusb: Fix quirk for Atheros 1525/QCA6174") is causing bluetooth to no longer work for several people, see: https://bugzilla.redhat.com/show_bug.cgi?id=1568911 So lets revert it for now and try to find another solution for devices which need the modified quirk. Cc: stable@vger.kernel.org Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by:
Hans de Goede <hdegoede@redhat.com> Signed-off-by:
Marcel Holtmann <marcel@holtmann.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Rafael J. Wysocki authored
commit 97739501 upstream. If the next_freq field of struct sugov_policy is set to UINT_MAX, it shouldn't be used for updating the CPU frequency (this is a special "invalid" value), but after commit b7eaf1aa (cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely) it may be passed as the new frequency to sugov_update_commit() in sugov_update_single(). Fix that by adding an extra check for the special UINT_MAX value of next_freq to sugov_update_single(). Fixes: b7eaf1aa (cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely) Reported-by:
Viresh Kumar <viresh.kumar@linaro.org> Cc: 4.12+ <stable@vger.kernel.org> # 4.12+ Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by:
Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Rafael J. Wysocki authored
commit cfcadfaa upstream. Commit 0847684c (PCI / PM: Simplify device wakeup settings code) went too far and dropped the device_may_wakeup() check from pci_enable_wake() which causes wakeup to be enabled during system suspend, hibernation or shutdown for some PCI devices that are not allowed by user space to wake up the system from sleep (or power off). As a result of this, excessive power is drawn by some of the affected systems while in sleep states or off. Restore the device_may_wakeup() check in pci_enable_wake(), but make sure that the PCI bus type's runtime suspend callback will not call device_may_wakeup() which is about system wakeup from sleep and not about device wakeup from runtime suspend. Fixes: 0847684c (PCI / PM: Simplify device wakeup settings code) Reported-by:
Joseph Salisbury <joseph.salisbury@canonical.com> Cc: 4.13+ <stable@vger.kernel.org> # 4.13+ Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by:
Bjorn Helgaas <bhelgaas@google.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kai Heng Feng authored
commit 8feaec33 upstream. USB controller ASM1042 stops working after commit de3ef1eb (PM / core: Drop run_wake flag from struct dev_pm_info). The device in question is not power managed by platform firmware, furthermore, it only supports PME# from D3cold: Capabilities: [78] Power Management version 3 Flags: PMEClk- DSI- D1- D2- AuxCurrent=55mA PME(D0-,D1-,D2-,D3hot-,D3cold+) Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME- Before commit de3ef1eb, the device never gets runtime suspended. After that commit, the device gets runtime suspended to D3hot, which can not generate any PME#. usb_hcd_pci_probe() unconditionally calls device_wakeup_enable(), hence device_can_wakeup() in pci_dev_run_wake() always returns true. So pci_dev_run_wake() needs to check PME wakeup capability as its first condition. In addition, change wakeup flag passed to pci_target_state() from false to true, because we want to find the deepest state different from D3cold that the device can still generate PME#. In this case, it's D0 for the device in question. Fixes: de3ef1eb (PM / core: Drop run_wake flag from struct dev_pm_info) Signed-off-by:
Kai-Heng Feng <kai.heng.feng@canonical.com> Cc: 4.13+ <stable@vger.kernel.org> # 4.13+ Acked-by:
Bjorn Helgaas <bhelgaas@google.com> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Gustavo A. R. Silva authored
commit 2be147f7 upstream. pool can be indirectly controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. This issue was detected with the help of Smatch: drivers/atm/zatm.c:1462 zatm_ioctl() warn: potential spectre issue 'zatm_dev->pool_info' (local cap) Fix this by sanitizing pool before using it to index zatm_dev->pool_info Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2 Cc: stable@vger.kernel.org Signed-off-by:
Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Gustavo A. R. Silva authored
commit acf784bd upstream. ioc_data.dev_num can be controlled by user-space, hence leading to a potential exploitation of the Spectre variant 1 vulnerability. This issue was detected with the help of Smatch: net/atm/lec.c:702 lec_vcc_attach() warn: potential spectre issue 'dev_lec' Fix this by sanitizing ioc_data.dev_num before using it to index dev_lec. Also, notice that there is another instance in which array dev_lec is being indexed using ioc_data.dev_num at line 705: lec_vcc_added(netdev_priv(dev_lec[ioc_data.dev_num]), Notice that given that speculation windows are large, the policy is to kill the speculation on the first load and not worry if it can be completed with a dependent load/store [1]. [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2 Cc: stable@vger.kernel.org Signed-off-by:
Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ville Syrjälä authored
commit b5cb2e5a upstream. Clear the old_state and new_state pointers for private objects in drm_atomic_state_default_clear(). We don't actually have functions to get the new/old state for private objects so getting access to the potentially stale pointers requires a bit more manual labour than for other object types. But let's clear the pointers for private objects as well, if only to avoid future surprises when someone decides to add the functions to get at them. v2: Split private objs to a separate patch (Daniel) Cc: <stable@vger.kernel.org> # v4.14+ Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: Abhay Kumar <abhay.kumar@intel.com> Fixes: a4370c77 (drm/atomic: Make private objs proper objects) Signed-off-by:
Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180502183247.5746-1-ville.syrjala@linux.intel.comReviewed-by:
Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by:
Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by:
Sean Paul <seanpaul@chromium.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ville Syrjälä authored
commit f0b408ee upstream. Clear the old_state and new_state pointers for every object in drm_atomic_state_default_clear(). Otherwise drm_atomic_get_{new,old}_*_state() will hand out stale pointers to anyone who hasn't first confirmed that the object is in fact part of the current atomic transcation, if they are called after we've done the ww backoff dance while hanging on to the same drm_atomic_state. For example, handle_conflicting_encoders() looks like it could hit this since it iterates the full connector list and just calls drm_atomic_get_new_connector_state() for each. And I believe we have now witnessed this happening at least once in i915 check_digital_port_conflicts(). Commit 8b69449d ("drm/i915: Remove last references to drm_atomic_get_existing* macros") changed the safe drm_atomic_get_existing_connector_state() to the unsafe drm_atomic_get_new_connector_state(), which opened the doors for this particular bug there as well. v2: Split private objs out to a separate patch (Daniel) Cc: stable@vger.kernel.org Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: Abhay Kumar <abhay.kumar@intel.com> Fixes: 581e49fe ("drm/atomic: Add new iterators over all state, v3.") Signed-off-by:
Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180502183247.5746-1-ville.syrjala@linux.intel.comReviewed-by:
Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by:
Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by:
Sean Paul <seanpaul@chromium.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-