- 26 Apr, 2018 40 commits
-
Tony Lindgren authored
[ Upstream commit 69728051 ] If a device is runtime PM suspended when we enter suspend and has a dedicated wake IRQ, we can get the following warning: WARNING: CPU: 0 PID: 108 at kernel/irq/manage.c:526 enable_irq+0x40/0x94 [ 102.087860] Unbalanced enable for IRQ 147 ... (enable_irq) from [<c06117a8>] (dev_pm_arm_wake_irq+0x4c/0x60) (dev_pm_arm_wake_irq) from [<c0618360>] (device_wakeup_arm_wake_irqs+0x58/0x9c) (device_wakeup_arm_wake_irqs) from [<c0615948>] (dpm_suspend_noirq+0x10/0x48) (dpm_suspend_noirq) from [<c01ac7ac>] (suspend_devices_and_enter+0x30c/0xf14) (suspend_devices_and_enter) from [<c01adf20>] (enter_state+0xad4/0xbd8) (enter_state) from [<c01ad3ec>] (pm_suspend+0x38/0x98) (pm_suspend) from [<c01ab3e8>] (state_store+0x68/0xc8) This is because the dedicated wake IRQ for the device may have been already enabled earlier by dev_pm_enable_wake_irq_check(). Fix the issue by checking for runtime PM suspended status. This issue can be easily reproduced by setting the serial console log level to zero, letting the serial console idle, and suspending the system from an ssh terminal. On resume, dmesg will have the warning above. The reason why I have not run into this issue earlier is that I typically run my PM test cases from a serial console instead of over ssh. Fixes: c8434559 (PM / wakeirq: Enable dedicated wakeirq for suspend) Signed-off-by:
Tony Lindgren <tony@atomide.com> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Rafael J. Wysocki authored
[ Upstream commit 3cd091a7 ] Commit 66259146 (ACPI / EC: Drop EC noirq hooks to fix a regression) modified the ACPI EC driver so that it doesn't switch over to busy polling mode during noirq stages of system suspend and resume in an attempt to fix an issue resulting from that behavior. However, that modification introduced a system resume regression on Thinkpad X240, so make the EC driver switch over to the polling mode during noirq stages of system suspend and resume again, which effectively reverts the problematic commit. Fixes: 66259146 (ACPI / EC: Drop EC noirq hooks to fix a regression) Link: https://bugzilla.kernel.org/show_bug.cgi?id=197863 Reported-by:
Markus Demleitner <m@tfiu.de> Tested-by:
Markus Demleitner <m@tfiu.de> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Daniel Borkmann authored
[ Upstream commit 941ff6f1 ] Fix two issues in the reuseport_bpf selftests that were reported by Linaro CI: [...] + ./reuseport_bpf ---- IPv4 UDP ---- Testing EBPF mod 10... Reprograming, testing mod 5... ./reuseport_bpf: ebpf error. log: 0: (bf) r6 = r1 1: (20) r0 = *(u32 *)skb[0] 2: (97) r0 %= 10 3: (95) exit processed 4 insns : Operation not permitted + echo FAIL [...] ---- IPv4 TCP ---- Testing EBPF mod 10... ./reuseport_bpf: failed to bind send socket: Address already in use + echo FAIL [...] For the former adjust rlimit since this was the cause of failure for loading the BPF prog, and for the latter add SO_REUSEADDR. Reported-by:
Naresh Kamboju <naresh.kamboju@linaro.org> Link: https://bugs.linaro.org/show_bug.cgi?id=3502 Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
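The two selftest fixes described above can be sketched in user space as follows. This is a minimal sketch, not the selftest's own code: the helper names `raise_memlock_limit` and `open_reuseaddr_socket` are hypothetical, and the choice of `RLIMIT_MEMLOCK` is an assumption, since the commit message only says "rlimit".

```c
#include <netinet/in.h>
#include <sys/resource.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper. Assumption: the limit in question is
 * RLIMIT_MEMLOCK. Raising it usually needs privileges.
 * Returns 0 on success, -1 with errno set otherwise. */
int raise_memlock_limit(void)
{
	struct rlimit rlim = { RLIM_INFINITY, RLIM_INFINITY };

	return setrlimit(RLIMIT_MEMLOCK, &rlim);
}

/* Hypothetical helper: create an IPv4 socket of the given type with
 * SO_REUSEADDR set, so a later bind() is not refused just because the
 * address is still held by an earlier socket. Returns the fd or -1. */
int open_reuseaddr_socket(int type)
{
	int one = 1;
	int fd = socket(AF_INET, type, 0);

	if (fd < 0)
		return -1;
	if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

A caller would run `raise_memlock_limit()` once before loading any BPF program and use `open_reuseaddr_socket(SOCK_STREAM)` wherever the sending socket is created.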
-
Niklas Cassel authored
[ Upstream commit 1b84ca18 ] The interrupt status register in both dwmac1000 and dwmac4 ignores interrupt enable (for dwmac4) / interrupt mask (for dwmac1000). Therefore, if we want to check only the bits that can actually trigger an irq, we have to filter the interrupt status register manually. Commit 0a764db1 ("stmmac: Discard masked flags in interrupt status register") fixed this for dwmac1000. Fix the same issue for dwmac4. Just like commit 0a764db1 ("stmmac: Discard masked flags in interrupt status register"), this makes sure that we do not get spurious link up/link down prints. Signed-off-by:
Niklas Cassel <niklas.cassel@axis.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
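The manual filtering described in the stmmac commit above amounts to masking the raw status with the set of enabled interrupts. The sketch below is illustrative only: the function names are hypothetical and the arguments stand in for register reads, with no real dwmac register layout assumed.

```c
#include <stdint.h>

/* "Interrupt enable" style (as the message describes for dwmac4):
 * a 1 bit in enable means the interrupt can fire. Keep only the
 * status bits that could actually have raised the irq. */
uint32_t filter_status_by_enable(uint32_t status, uint32_t enable)
{
	return status & enable;
}

/* "Interrupt mask" style (as the message describes for dwmac1000):
 * a 1 bit in mask means the interrupt is suppressed. */
uint32_t filter_status_by_mask(uint32_t status, uint32_t mask)
{
	return status & ~mask;
}
```

With `status = 0x5`, an enable value of `0x4` keeps only bit 2, while a mask value of `0x4` keeps only bit 0.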
-
Trond Myklebust authored
[ Upstream commit 0afa6b44 ] Calling __UDPX_INC_STATS() from a preemptible context leads to a warning of the form: BUG: using __this_cpu_add() in preemptible [00000000] code: kworker/u5:0/31 caller is xs_udp_data_receive_workfn+0x194/0x270 CPU: 1 PID: 31 Comm: kworker/u5:0 Not tainted 4.15.0-rc8-00076-g90ea9f1b #2 Workqueue: xprtiod xs_udp_data_receive_workfn Call Trace: dump_stack+0x85/0xc1 check_preemption_disabled+0xce/0xe0 xs_udp_data_receive_workfn+0x194/0x270 process_one_work+0x318/0x620 worker_thread+0x20a/0x390 ? process_one_work+0x620/0x620 kthread+0x120/0x130 ? __kthread_bind_mask+0x60/0x60 ret_from_fork+0x24/0x30 Since we're taking a spinlock in those functions anyway, let's fix the issue by moving the call so that it occurs under the spinlock. Reported-by:
kernel test robot <fengguang.wu@intel.com> Signed-off-by:
Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Paul Mackerras authored
[ Upstream commit 05f2bb03 ] This fixes the computation of the HPTE index to use when the HPT resizing code encounters a bolted HPTE which is stored in its secondary HPTE group. The code inverts the HPTE group number, which is correct, but doesn't then mask it with new_hash_mask. As a result, new_pteg will be effectively negative, resulting in new_hptep pointing before the new HPT, which will corrupt memory. In addition, this removes two BUG_ON statements. The condition that the BUG_ONs were testing -- that we have computed the hash value incorrectly -- has never been observed in testing, and if it did occur, would only affect the guest, not the host. Given that BUG_ON should only be used in conditions where the kernel (i.e. the host kernel, in this case) can't possibly continue execution, it is not appropriate here. Reviewed-by:
David Gibson <david@gibson.dropbear.id.au> Signed-off-by:
Paul Mackerras <paulus@ozlabs.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
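The index computation described in the HPT resize commit above can be illustrated with a small sketch. It is not the KVM code: `secondary_pteg` is a hypothetical helper, and only the invert-then-mask step named in the message is shown.

```c
#include <stdint.h>

/* For an HPTE stored in its secondary group, the group number is
 * inverted. Without the mask the result has all high bits set
 * ("effectively negative"); masking with new_hash_mask brings it
 * back into the range of the new HPT. */
uint64_t secondary_pteg(uint64_t pteg, uint64_t new_hash_mask)
{
	return ~pteg & new_hash_mask;
}
```

For example, with `pteg = 0x3` and `new_hash_mask = 0xff`, the unmasked inversion is `0xfffffffffffffffc`, while the masked value is `0xfc`.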
-
Jesper Dangaard Brouer authored
[ Upstream commit e3d91b0c ] V3: More generic skipping of relo-section (suggested by Daniel) If clang >= 4.0.1 is missing the option '-target bpf', it will cause llc/llvm to create two ELF sections for "Exception Frames", with section names '.eh_frame' and '.rel.eh_frame'. The BPF ELF loader library libbpf fails when loading files with these sections. The other in-kernel BPF ELF loader in samples/bpf/bpf_load.c handles this gracefully, and the iproute2 loader also seems to work with these "eh" sections. The issue in libbpf is caused by bpf_object__elf_collect() skipping some sections, and later when performing relocation it will be pointing to a skipped section, as these sections cannot be found by bpf_object__find_prog_by_idx() in bpf_object__collect_reloc(). This is a general issue that also occurs for other sections, like debug sections, which are also skipped and can have a relo section. As suggested by Daniel, to avoid keeping state about all skipped sections, instead perform a direct lookup in the ELF object. Look up the section that the relo-section points to and check if it contains executable machine instructions (denoted by the sh_flags SHF_EXECINSTR). Use this check to also skip irrelevant relo-sections. Note, for samples/bpf/ the '-target bpf' parameter to clang cannot be used due to incompatibility with asm embedded headers that some of the samples include. This is explained in more detail by Yonghong Song in bpf_devel_QA. Signed-off-by:
Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
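The check described in the libbpf commit above (follow the relocation section to its target and test `SHF_EXECINSTR`) can be sketched with the raw structures from `<elf.h>`. This is an illustration under assumptions: libbpf itself goes through libelf, and `relo_targets_code` is a hypothetical name. The sketch relies on the ELF convention that `sh_info` of a `SHT_REL`/`SHT_RELA` section holds the index of the section the relocations apply to.

```c
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true only if relo is a relocation section whose target
 * section contains executable instructions. Relocation sections for
 * .eh_frame or debug sections fail this test and can be skipped. */
bool relo_targets_code(const Elf64_Shdr *shdrs, size_t nr_shdrs,
		       const Elf64_Shdr *relo)
{
	if (relo->sh_type != SHT_REL && relo->sh_type != SHT_RELA)
		return false;
	if (relo->sh_info >= nr_shdrs)
		return false;	/* target index out of range */
	return (shdrs[relo->sh_info].sh_flags & SHF_EXECINSTR) != 0;
}
```

The benefit of this shape is the one the message names: no per-section "skipped" state has to be kept, because the answer is derived from the ELF object on demand.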
-
Mathieu Malaterre authored
[ Upstream commit e728789c ] In commit c7f5d105 ("net: Add eth_platform_get_mac_address() helper."), two declarations were added: int eth_platform_get_mac_address(struct device *dev, u8 *mac_addr); unsigned char *arch_get_platform_get_mac_address(void); An extra '_get' was introduced in arch_get_platform_get_mac_address, remove it. Fix compile warning using W=1: CC net/ethernet/eth.o net/ethernet/eth.c:523:24: warning: no previous prototype for ‘arch_get_platform_mac_address’ [-Wmissing-prototypes] unsigned char * __weak arch_get_platform_mac_address(void) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ AR net/ethernet/built-in.o Signed-off-by:
Mathieu Malaterre <malat@debian.org> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Chuck Lever authored
[ Upstream commit 175e0310 ] A single NFSv4 WRITE compound can often have three operations: PUTFH, WRITE, then GETATTR. When the WRITE payload is sent in a Read chunk, the client places the GETATTR in the inline part of the RPC/RDMA message, just after the WRITE operation (sans payload). The position value in the Read chunk enables the receiver to insert the Read chunk at the correct place in the received XDR stream; that is between the WRITE and GETATTR. According to RFC 8166, an NFS/RDMA client does not have to add XDR round-up to the Read chunk that carries the WRITE payload. The receiver adds XDR round-up padding if it is absent and the receiver's XDR decoder requires it to be present. Commit 193bcb7b ("svcrdma: Populate tail iovec when receiving") attempted to add support for receiving such a compound so that just the WRITE payload appears in rq_arg's page list, and the trailing GETATTR is placed in rq_arg's tail iovec. (TCP just strings the whole compound into the head iovec and page list, without regard to the alignment of the WRITE payload). The server transport logic also had to accommodate the optional XDR round-up of the Read chunk, which it did simply by lengthening the tail iovec when round-up was needed. This approach is adequate for the NFSv2 and NFSv3 WRITE decoders. Unfortunately it is not sufficient for nfsd4_decode_write. When the Read chunk length is a couple of bytes less than PAGE_SIZE, the computation at the end of nfsd4_decode_write allows argp->pagelen to go negative, which breaks the logic in read_buf that looks for the tail iovec. The result is that a WRITE operation whose payload length is just less than a multiple of a page succeeds, but the subsequent GETATTR in the same compound fails with NFS4ERR_OP_ILLEGAL because the XDR decoder can't find it. Clients ignore the error, but they must update their attribute cache via a separate round trip. 
As nfsd4_decode_write appears to expect the payload itself to always have appropriate XDR round-up, have svc_rdma_build_normal_read_chunk add the Read chunk XDR round-up to the page_len rather than lengthening the tail iovec. Reported-by:
Olga Kornievskaia <kolga@netapp.com> Fixes: 193bcb7b ("svcrdma: Populate tail iovec when receiving") Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Tested-by:
Olga Kornievskaia <kolga@netapp.com> Signed-off-by:
J. Bruce Fields <bfields@redhat.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
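The XDR round-up that the svcrdma commit above moves into page_len is ordinary 4-byte alignment arithmetic. A minimal sketch, with hypothetical helper names and no connection to the actual svcrdma code:

```c
#include <stddef.h>

/* XDR encodes data in 4-byte units, so an opaque payload of len
 * bytes occupies len rounded up to the next multiple of 4. */
size_t xdr_round_up(size_t len)
{
	return (len + 3) & ~(size_t)3;
}

/* Number of zero pad bytes the receiver must account for. */
size_t xdr_pad_bytes(size_t len)
{
	return xdr_round_up(len) - len;
}
```

For a payload of 4094 bytes (a couple of bytes short of a 4096-byte page, the case the message calls out), the rounded length is 4096 and the pad is 2 bytes.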
-
David Howells authored
[ Upstream commit 8c2f826d ] Don't put buffers of data to be handed to crypto on the stack as this may cause an assertion failure in the kernel (see below). Fix this by using a kmalloc'd buffer instead. kernel BUG at ./include/linux/scatterlist.h:147! ... RIP: 0010:rxkad_encrypt_response.isra.6+0x191/0x1b0 [rxrpc] RSP: 0018:ffffbe2fc06cfca8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff989277d59900 RCX: 0000000000000028 RDX: 0000259dc06cfd88 RSI: 0000000000000025 RDI: ffffbe30406cfd88 RBP: ffffbe2fc06cfd60 R08: ffffbe2fc06cfd08 R09: ffffbe2fc06cfd08 R10: 0000000000000000 R11: 0000000000000000 R12: 1ffff7c5f80d9f95 R13: ffffbe2fc06cfd88 R14: ffff98927a3f7aa0 R15: ffffbe2fc06cfd08 FS: 0000000000000000(0000) GS:ffff98927fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055b1ff28f0f8 CR3: 000000001b412003 CR4: 00000000003606f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: rxkad_respond_to_challenge+0x297/0x330 [rxrpc] rxrpc_process_connection+0xd1/0x690 [rxrpc] ? process_one_work+0x1c3/0x680 ? __lock_is_held+0x59/0xa0 process_one_work+0x249/0x680 worker_thread+0x3a/0x390 ? process_one_work+0x680/0x680 kthread+0x121/0x140 ? kthread_create_worker_on_cpu+0x70/0x70 ret_from_fork+0x3a/0x50 Reported-by:
Jonathan Billings <jsbillings@jsbillings.org> Reported-by:
Marc Dionne <marc.dionne@auristor.com> Signed-off-by:
David Howells <dhowells@redhat.com> Tested-by:
Jonathan Billings <jsbillings@jsbillings.org> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Steven Rostedt (VMware) authored
[ Upstream commit 97fe22ad ] Al Viro discovered a bug in the glob ftrace filtering code where "*a*b" is treated the same as "a*b", and functions that would be selected by "*a*b" but not "a*b" are not selected with "*a*b". Add tests for patterns "*a*b" and "a*b*" to the glob selftest. Link: http://lkml.kernel.org/r/20180127170748.GF13338@ZenIV.linux.org.uk Cc: Shuah Khan <shuah@kernel.org> Acked-by:
Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
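The difference between the patterns "*a*b" and "a*b" named in the ftrace commit above can be seen with any correct glob matcher. The sketch below is a plain textbook matcher supporting `*` and `?`, not the kernel's ftrace filter code; `glob_match` here is a local illustrative function.

```c
#include <stdbool.h>

/* Match str against pat, where '*' matches any run of characters
 * (including none) and '?' matches exactly one character. */
bool glob_match(const char *pat, const char *str)
{
	for (; *pat; pat++, str++) {
		if (*pat == '*') {
			/* Collapse consecutive stars. */
			while (*pat == '*')
				pat++;
			if (!*pat)
				return true;	/* trailing '*' matches the rest */
			/* Try the remaining pattern at every suffix. */
			for (; *str; str++)
				if (glob_match(pat, str))
					return true;
			return false;
		}
		if (!*str || (*pat != '?' && *pat != *str))
			return false;
	}
	return *str == '\0';
}
```

A name such as "xab" is selected by "*a*b" but not by "a*b", which is exactly the kind of function the bug caused to be missed.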
-
Chen Yu authored
[ Upstream commit 70f6bf2a ] When maxcpus=1 is in the kernel command line, the BP is responsible for re-enabling the HWP - because currently only the APs invoke intel_pstate_hwp_enable() during their online process - which might put the system into an unstable state after resume. Fix this by enabling the HWP explicitly on BP during resume. Reported-by:
Doug Smythies <dsmythies@telus.net> Suggested-by:
Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by:
Yu Chen <yu.c.chen@intel.com> [ rjw: Subject/changelog, minor modifications ] Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tang Junhui authored
[ Upstream commit 7f4fc93d ] When a back-end device is attached to a cache set that is not registered yet, the attach does not succeed, but no error is returned: [root]# echo 87859280-fec6-4bcc-20df7ca8f86b > /sys/block/sde/bcache/attach [root]# In sysfs_attach(), the return value "v" is initialized to "size" at the beginning. If no cache set exists in bch_cache_sets, "v" is never changed and is returned to sysfs, which regards it as success since "size" is a positive number. This patch fixes the issue by initializing "v" to "-ENOENT". Signed-off-by:
Tang Junhui <tang.junhui@zte.com.cn> Reviewed-by:
Michael Lyle <mlyle@lyle.org> Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
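The return-value problem in the bcache attach commit above can be modelled in a few lines of user-space C. This is a simplified stand-in, not the sysfs_attach() code: `attach_to_set` is hypothetical, and a string comparison takes the place of the real attach attempt against each registered cache set.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Walk the registered sets and report how the write should be
 * answered: size on success, a negative errno on failure. */
ssize_t attach_to_set(const char *const *sets, size_t nr_sets,
		      const char *uuid, size_t size)
{
	/* Before the fix this started out as size, so an empty list
	 * fell through and looked like a successful write. */
	ssize_t v = -ENOENT;
	size_t i;

	for (i = 0; i < nr_sets; i++) {
		if (strcmp(sets[i], uuid) == 0) {
			v = (ssize_t)size;	/* whole write consumed */
			break;
		}
	}
	return v;
}
```

With no registered sets the function now returns `-ENOENT`, which a shell would report as a write error instead of silently returning to the prompt.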
-
Tang Junhui authored
[ Upstream commit 73ac105b ] Back-end device sdm is already attached to a cache set with ID f67ebe1f-f8bc-4d73-bfe5-9dc88607f119. Trying to attach it to another cache set returns an error: [root]# cd /sys/block/sdm/bcache [root]# echo 5ccd0a63-148e-48b8-afa2-aca9cbd6279f > attach -bash: echo: write error: Invalid argument After that, execute a command to modify the label of the bcache device: [root]# echo data_disk1 > label Then reboot the system. When the system powers on, the back-end device cannot attach to the cache set, and a message shows in the log: Feb 5 12:05:52 ceph152 kernel: [922385.508498] bcache: bch_cached_dev_attach() couldn't find uuid for sdm in set In sysfs_attach(), dc->sb.set_uuid is assigned the value input through sysfs, no matter whether bch_cached_dev_attach() succeeds or not. For example, if the back-end device is already attached to a cache set, bch_cached_dev_attach() fails, but dc->sb.set_uuid has been changed. Modifying the label of the bcache device then calls bch_write_bdev_super(), which writes dc->sb.set_uuid to the super block, so a wrong cache set ID is recorded in the super block. After the system reboots, the cache set cannot find the uuid of the back-end device, so the bcache device does not exist and cannot be used any more. In this patch, we do not assign the cache set ID to dc->sb.set_uuid in sysfs_attach() directly, but pass it into bch_cached_dev_attach(), and assign it to dc->sb.set_uuid only after the back-end device has attached to the cache set successfully. Signed-off-by:
Tang Junhui <tang.junhui@zte.com.cn> Reviewed-by:
Michael Lyle <mlyle@lyle.org> Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tang Junhui authored
[ Upstream commit 682811b3 ] After a long run of random small IO writes, I rebooted the machine, and after it powered on I found bcache stuck. The stacks are: [root@ceph153 ~]# cat /proc/2510/task/*/stack [<ffffffffa06b2455>] closure_sync+0x25/0x90 [bcache] [<ffffffffa06b6be8>] bch_journal+0x118/0x2b0 [bcache] [<ffffffffa06b6dc7>] bch_journal_meta+0x47/0x70 [bcache] [<ffffffffa06be8f7>] bch_prio_write+0x237/0x340 [bcache] [<ffffffffa06a8018>] bch_allocator_thread+0x3c8/0x3d0 [bcache] [<ffffffff810a631f>] kthread+0xcf/0xe0 [<ffffffff8164c318>] ret_from_fork+0x58/0x90 [<ffffffffffffffff>] 0xffffffffffffffff [root@ceph153 ~]# cat /proc/2038/task/*/stack [<ffffffffa06b1abd>] __bch_btree_map_nodes+0x12d/0x150 [bcache] [<ffffffffa06b1bd1>] bch_btree_insert+0xf1/0x170 [bcache] [<ffffffffa06b637f>] bch_journal_replay+0x13f/0x230 [bcache] [<ffffffffa06c75fe>] run_cache_set+0x79a/0x7c2 [bcache] [<ffffffffa06c0cf8>] register_bcache+0xd48/0x1310 [bcache] [<ffffffff812f702f>] kobj_attr_store+0xf/0x20 [<ffffffff8125b216>] sysfs_write_file+0xc6/0x140 [<ffffffff811dfbfd>] vfs_write+0xbd/0x1e0 [<ffffffff811e069f>] SyS_write+0x7f/0xe0 [<ffffffff8164c3c9>] system_call_fastpath+0x16/0x1 The stacks show that the register thread and the allocator thread got stuck while registering the cache device. I rebooted the machine several times and the issue always exists on this machine. I debugged the code and found the call trace below: register_bcache() ==>run_cache_set() ==>bch_journal_replay() ==>bch_btree_insert() ==>__bch_btree_map_nodes() ==>btree_insert_fn() ==>btree_split() //node need split ==>btree_check_reserve() btree_check_reserve() checks whether there are enough buckets of RESERVE_BTREE type. Since the allocator thread is not working yet, no buckets of RESERVE_BTREE type have been allocated, so the register thread waits on c->btree_cache_wait and goes to sleep. 
Then the allocator thread is initialized, with the call trace below: bch_allocator_thread() ==>bch_prio_write() ==>bch_journal_meta() ==>bch_journal() ==>journal_wait_for_write() journal_wait_for_write() checks whether the journal is full via journal_full(), but the long run of random small IO writes has exhausted the journal buckets (journal.blocks_free=0). In order to release journal buckets, the allocator calls btree_flush_write() to flush keys to btree nodes, and waits on c->journal.wait until the btree node writes are over or some journal bucket space is available; then the allocator thread goes to sleep. But in btree_flush_write(), since bch_journal_replay() has not finished, no btree nodes have a journal (the condition "if (btree_current_write(b)->journal)" is never satisfied), so there is no btree node to flush, no journal bucket is released, and the allocator sleeps forever. Through the above analysis, we can see that: 1) The register thread waits for the allocator thread to allocate buckets of RESERVE_BTREE type; 2) The allocator thread waits for the register thread to replay the journal, so that it can flush btree nodes and get a journal bucket. So they are both stuck waiting for each other. Hua Rui provided a patch for me that allocates some buckets of RESERVE_BTREE type in advance, so the register thread can get a bucket when a btree node splits and does not need to wait for the allocator thread. I tested it and it has an effect: the register thread runs a step forward, but finally both still get stuck. The reason is that only 8 buckets of RESERVE_BTREE type were allocated, and in bch_journal_replay(), after 2 btree node splits, only 4 buckets of RESERVE_BTREE type are left, so btree_check_reserve() is not satisfied anymore and it goes to sleep again. At the same time, the allocator thread did not flush enough btree nodes to release a journal bucket, so they both got stuck again. So we need to allocate more buckets of RESERVE_BTREE type in advance, but how many is enough? 
By experience and test, I think it should be as many as the journal buckets. I then modified the code as in this patch and tested it on the machine, and it works. This patch is based on Hua Rui’s patch, and allocates more buckets of RESERVE_BTREE type in advance to avoid the register thread and the allocator thread waiting for each other. [patch v2] ca->sb.njournal_buckets would be 0 the first time after cache creation, and no journal exists, so just 8 btree buckets is OK. Signed-off-by:
Hua Rui <huarui.dev@gmail.com> Signed-off-by:
Tang Junhui <tang.junhui@zte.com.cn> Reviewed-by:
Michael Lyle <mlyle@lyle.org> Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Coly Li authored
[ Upstream commit 99361bbf ] Kernel thread routine bch_writeback_thread() has the following code block, 447 down_write(&dc->writeback_lock); 448~450 if (check conditions) { 451 up_write(&dc->writeback_lock); 452 set_current_state(TASK_INTERRUPTIBLE); 453 454 if (kthread_should_stop()) 455 return 0; 456 457 schedule(); 458 continue; 459 } If the condition check is true, the task state is set to TASK_INTERRUPTIBLE and schedule() is called to wait for others to wake it up. There are 2 issues in the current code, 1, The task state is set to TASK_INTERRUPTIBLE after the condition checks. If another process changes the condition and calls wake_up_process(dc->writeback_thread), then at line 452 the task state is set back to TASK_INTERRUPTIBLE, and the writeback kernel thread will lose a chance to be woken up. 2, At line 454, if kthread_should_stop() is true, the writeback kernel thread will return to kernel/kthread.c:kthread() with TASK_INTERRUPTIBLE and call do_exit(). It is not good to enter do_exit() with task state TASK_INTERRUPTIBLE; in the following code path might_sleep() is called and a warning message is reported by __might_sleep(): "WARNING: do not call blocking ops when !TASK_RUNNING; state=1 set at [xxxx]". For the first issue, the task state should be set before the condition checks. Indeed, because dc->writeback_lock is required when modifying all the conditions, calling set_current_state() inside the code block where dc->writeback_lock is held is safe. But this is quite implicit, so I still move set_current_state() before all the condition checks. For the second issue, frankly speaking it does not hurt when a kernel thread exits with TASK_INTERRUPTIBLE state, but this warning message scares users and makes them feel there might be something risky with bcache that could hurt their data. Setting the task state to TASK_RUNNING before returning fixes this problem. In alloc.c:allocator_wait(), there is also a similar issue, and it is also fixed in this patch. 
Changelog: v3: merge two similar fixes into one patch v2: fix the race issue in v1 patch. v1: initial buggy fix. Signed-off-by:
Coly Li <colyli@suse.de> Reviewed-by:
Hannes Reinecke <hare@suse.de> Reviewed-by:
Michael Lyle <mlyle@lyle.org> Cc: Michael Lyle <mlyle@lyle.org> Cc: Junhui Tang <tang.junhui@zte.com.cn> Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Arnd Bergmann authored
[ Upstream commit ade7db99 ] This bug was fixed before, but came up again with the latest compiler in another function: fs/cifs/cifssmb.c: In function 'CIFSSMBSetEA': fs/cifs/cifssmb.c:6362:3: error: 'strncpy' offset 8 is out of the bounds [0, 4] [-Werror=array-bounds] strncpy(parm_data->list[0].name, ea_name, name_len); Let's apply the same fix that was used for the other instances. Fixes: b2a3ad9c ("cifs: silence compiler warnings showing up with gcc-4.7.0") Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Steve French <smfrench@gmail.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ulf Hansson authored
[ Upstream commit a3381e3a ] Commit b539cc82 (PM / Domains: Ignore domain-idle-states that are not compatible) made it possible to ignore non-compatible domain-idle-states OF nodes. However, in case that happens while doing the OF parsing, the number of elements in the allocated array would exceed the number actually needed, thus wasting memory. Fix this by pre-iterating the genpd OF node and counting the number of compatible domain-idle-states nodes, before doing the allocation. While doing this, it makes sense to rework the code a bit to avoid open coding of the parts responsible for the OF node iteration. Let's also take the opportunity to clarify the function header for of_genpd_parse_idle_states(), about what is being returned in case of errors. Fixes: b539cc82 (PM / Domains: Ignore domain-idle-states that are not compatible) Signed-off-by:
Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by:
Lina Iyer <ilina@codeaurora.org> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
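The "count first, then allocate" approach described in the genpd commit above follows a common two-pass pattern. The sketch below shows that pattern in user-space C under stated assumptions: `struct idle_node` and `collect_compatible_states` are hypothetical stand-ins, and nothing here uses the real OF or genpd APIs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a domain-idle-states node. */
struct idle_node {
	int id;
	bool compatible;
};

/* Pass 1 counts the compatible nodes, pass 2 fills an array of
 * exactly that size. Returns the number of states, 0 if none
 * (with *states set to NULL), or -1 if allocation fails. */
int collect_compatible_states(const struct idle_node *nodes,
			      size_t nr_nodes, int **states)
{
	size_t i, k = 0, count = 0;
	int *st;

	for (i = 0; i < nr_nodes; i++)
		if (nodes[i].compatible)
			count++;

	*states = NULL;
	if (count == 0)
		return 0;

	st = calloc(count, sizeof(*st));
	if (!st)
		return -1;

	for (i = 0; i < nr_nodes; i++)
		if (nodes[i].compatible)
			st[k++] = nodes[i].id;

	*states = st;
	return (int)count;
}
```

The extra pass costs one more walk over the nodes, but the array never holds slots for entries that were ignored.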
-
Alexey Dobriyan authored
[ Upstream commit ac7f1061 ] Current code does: if (sscanf(dentry->d_name.name, "%lx-%lx", start, end) != 2) However sscanf() is broken garbage. It silently accepts whitespace between format specifiers (did you know that?). It silently accepts valid strings which result in integer overflow. Do not use sscanf() for any even remotely reliable parsing code. OK # readlink '/proc/1/map_files/55a23af39000-55a23b05b000' /lib/systemd/systemd broken # readlink '/proc/1/map_files/ 55a23af39000-55a23b05b000' /lib/systemd/systemd broken # readlink '/proc/1/map_files/55a23af39000-55a23b05b000 ' /lib/systemd/systemd very broken # readlink '/proc/1/map_files/1000000000000000055a23af39000-55a23b05b000' /lib/systemd/systemd Andrei said: : This patch breaks criu. It was a bug in criu. And this bug is on a minor : path, which works when memfd_create() isn't available. It is a reason why : I ask to not backport this patch to stable kernels. : : In CRIU this bug can be triggered, only if this patch will be backported : to a kernel which version is lower than v3.16. Link: http://lkml.kernel.org/r/20171120212706.GA14325@avx2 Signed-off-by:
Alexey Dobriyan <adobriyan@gmail.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Andrei Vagin <avagin@virtuozzo.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
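The sscanf() behaviour criticised in the commit above is easy to reproduce in user space, and a strict alternative is short. The sketch below is an illustration, not the kernel's replacement parser: `parse_range_sscanf` mirrors the old check, and `parse_range_strict` is a hypothetical hand-written parser that accepts only lowercase hex digits, rejects overflow, and requires the whole name to be consumed.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

/* The permissive check quoted in the commit message. */
bool parse_range_sscanf(const char *name, unsigned long *start,
			unsigned long *end)
{
	return sscanf(name, "%lx-%lx", start, end) == 2;
}

/* Parse one run of lowercase hex digits, failing on an empty run or
 * on a value that does not fit in unsigned long. */
static bool parse_hex(const char **s, unsigned long *out)
{
	const char *p = *s;
	unsigned long val = 0;
	int digits = 0;

	for (;; p++, digits++) {
		unsigned int d;

		if (*p >= '0' && *p <= '9')
			d = (unsigned int)(*p - '0');
		else if (*p >= 'a' && *p <= 'f')
			d = (unsigned int)(*p - 'a') + 10;
		else
			break;
		if (val > (ULONG_MAX >> 4))
			return false;	/* next shift would overflow */
		val = (val << 4) | d;
	}
	if (digits == 0)
		return false;
	*s = p;
	*out = val;
	return true;
}

/* Accept exactly "<hex>-<hex>" with nothing before or after. */
bool parse_range_strict(const char *name, unsigned long *start,
			unsigned long *end)
{
	const char *p = name;

	if (!parse_hex(&p, start))
		return false;
	if (*p++ != '-')
		return false;
	if (!parse_hex(&p, end))
		return false;
	return *p == '\0';
}
```

A name with a leading or trailing space passes the sscanf() check but is rejected by the strict parser, as is the overlong first number from the "very broken" example.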
-
Will Deacon authored
[ Upstream commit 202fb4ef ] If the spinlock "next" ticket wraps around between the initial LDR and the cmpxchg in the LSE version of spin_trylock, then we can erroneously think that we have successfully acquired the lock because we only check whether the next ticket returned by the cmpxchg is equal to the owner ticket in our updated lock word. This patch fixes the issue by performing a full 32-bit check of the lock word when trying to determine whether or not the CASA instruction updated memory. Reported-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com> Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
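The difference between the two success checks in the spin_trylock commit above can be modelled on plain integers. This is a sketch under loudly stated assumptions, not the arm64 assembly: it assumes a 32-bit lock word with the owner ticket in the low 16 bits and the next ticket in the high 16 bits, and both function names are hypothetical. `expected` is the word read by the initial load (owner equal to next, so the lock looked free) and `observed` is the old value returned by the compare-and-swap.

```c
#include <stdbool.h>
#include <stdint.h>

#define TICKET_SHIFT 16	/* assumed position of the "next" ticket */

/* Pre-fix style check as the message describes it: compare only the
 * next ticket returned by the compare-and-swap with the owner ticket
 * of the updated lock word we tried to store. */
bool trylock_check_ticket_only(uint32_t expected, uint32_t observed)
{
	uint32_t updated = expected + (1u << TICKET_SHIFT);

	return (observed >> TICKET_SHIFT) == (updated & 0xffff);
}

/* Post-fix style check: the swap happened only if the whole word in
 * memory still equalled what we loaded. */
bool trylock_check_full_word(uint32_t expected, uint32_t observed)
{
	return observed == expected;
}
```

If the lock was read as `0x00050005` and the swap returns `0x00050009` (next has wrapped back to 5 while owner moved on to 9), the ticket-only check reports success even though nothing was stored; the full-word check reports failure.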
-
Guanglei Li authored
[ Upstream commit 2c0aa086 ] Scenario: 1. Port down and do fail over 2. Ap do rds_bind syscall PID: 47039 TASK: ffff89887e2fe640 CPU: 47 COMMAND: "kworker/u:6" #0 [ffff898e35f159f0] machine_kexec at ffffffff8103abf9 #1 [ffff898e35f15a60] crash_kexec at ffffffff810b96e3 #2 [ffff898e35f15b30] oops_end at ffffffff8150f518 #3 [ffff898e35f15b60] no_context at ffffffff8104854c #4 [ffff898e35f15ba0] __bad_area_nosemaphore at ffffffff81048675 #5 [ffff898e35f15bf0] bad_area_nosemaphore at ffffffff810487d3 #6 [ffff898e35f15c00] do_page_fault at ffffffff815120b8 #7 [ffff898e35f15d10] page_fault at ffffffff8150ea95 [exception RIP: unknown or invalid address] RIP: 0000000000000000 RSP: ffff898e35f15dc8 RFLAGS: 00010282 RAX: 00000000fffffffe RBX: ffff889b77f6fc00 RCX:ffffffff81c99d88 RDX: 0000000000000000 RSI: ffff896019ee08e8 RDI:ffff889b77f6fc00 RBP: ffff898e35f15df0 R8: ffff896019ee08c8 R9:0000000000000000 R10: 0000000000000400 R11: 0000000000000000 R12:ffff896019ee08c0 R13: ffff889b77f6fe68 R14: ffffffff81c99d80 R15: ffffffffa022a1e0 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #8 [ffff898e35f15dc8] cma_ndev_work_handler at ffffffffa022a228 [rdma_cm] #9 [ffff898e35f15df8] process_one_work at ffffffff8108a7c6 #10 [ffff898e35f15e58] worker_thread at ffffffff8108bda0 #11 [ffff898e35f15ee8] kthread at ffffffff81090fe6 PID: 45659 TASK: ffff880d313d2500 CPU: 31 COMMAND: "oracle_45659_ap" #0 [ffff881024ccfc98] __schedule at ffffffff8150bac4 #1 [ffff881024ccfd40] schedule at ffffffff8150c2cf #2 [ffff881024ccfd50] __mutex_lock_slowpath at ffffffff8150cee7 #3 [ffff881024ccfdc0] mutex_lock at ffffffff8150cdeb #4 [ffff881024ccfde0] rdma_destroy_id at ffffffffa022a027 [rdma_cm] #5 [ffff881024ccfe10] rds_ib_laddr_check at ffffffffa0357857 [rds_rdma] #6 [ffff881024ccfe50] rds_trans_get_preferred at ffffffffa0324c2a [rds] #7 [ffff881024ccfe80] rds_bind at ffffffffa031d690 [rds] #8 [ffff881024ccfeb0] sys_bind at ffffffff8142a670 PID: 45659 PID: 47039 rds_ib_laddr_check /* create id_priv with 
a null event_handler */ rdma_create_id rdma_bind_addr cma_acquire_dev /* add id_priv to cma_dev->id_list */ cma_attach_to_dev cma_ndev_work_handler /* event_handler is null */ id_priv->id.event_handler Signed-off-by:
Guanglei Li <guanglei.li@oracle.com> Signed-off-by:
Honglei Wang <honglei.wang@oracle.com> Reviewed-by:
Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by:
Yanjun Zhu <yanjun.zhu@oracle.com> Reviewed-by:
Leon Romanovsky <leonro@mellanox.com> Acked-by:
Santosh Shilimkar <santosh.shilimkar@oracle.com> Acked-by:
Doug Ledford <dledford@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
John Fastabend authored
[ Upstream commit 3d9e9526 ] When a program is attached to a map we increment the program refcnt to ensure that the program is not removed while it is potentially being referenced from the sockmap side. However, if this same program also references the map (this is a reasonably common pattern in my programs) then the verifier will also increment the map's refcnt. This is to ensure the map doesn't get garbage collected while the program has a reference to it. So we are left in a state where the map holds the refcnt on the program, stopping it from being removed and releasing the map refcnt. And vice versa, the program holds a refcnt on the map, stopping it from releasing the refcnt on the prog. All this is fine as long as users detach the program while the map fd is still around. But if the user omits this detach command, we are left with a dangling map we can no longer release. To resolve this, when the map fd is released, decrement the program references and remove any reference from the map to the program. This fixes the issue with a possibly dangling map and creates a user side API constraint. That is, the map fd must be held open for programs to be attached to a map. Fixes: 174a79ff ("bpf: sockmap with sk redirect support") Signed-off-by:
John Fastabend <john.fastabend@gmail.com> Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
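The reference cycle described in this commit can be modelled in a few lines of plain C. This is only a sketch under stated assumptions: `toy_prog`, `toy_map` and the `toy_*` helpers are hypothetical names invented for illustration and are not the kernel's sockmap structures or functions. It shows the map pinning the program, the program pinning the map, and the fd release path breaking the cycle by detaching first.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for a BPF program and a sockmap; not kernel structures. */
struct toy_map;

struct toy_prog {
	int refcnt;
	struct toy_map *uses_map;   /* map referenced from the program's code */
};

struct toy_map {
	int refcnt;
	struct toy_prog *attached;  /* program attached to the map */
};

/* Attaching: the map takes a reference on the program. */
static void toy_attach(struct toy_map *map, struct toy_prog *prog)
{
	assert(map != NULL && prog != NULL);
	prog->refcnt++;
	map->attached = prog;
}

/* Drop one program reference; the last put also drops its map reference. */
static void toy_prog_put(struct toy_prog *prog)
{
	assert(prog != NULL);
	if (--prog->refcnt == 0 && prog->uses_map != NULL) {
		prog->uses_map->refcnt--;
		prog->uses_map = NULL;
	}
}

/* Closing the map fd: detach first so the cycle cannot outlive the fd. */
static void toy_map_fd_release(struct toy_map *map)
{
	assert(map != NULL);
	if (map->attached != NULL) {
		struct toy_prog *prog = map->attached;

		map->attached = NULL;
		toy_prog_put(prog);
	}
	map->refcnt--;
}
```

Without the detach step in `toy_map_fd_release()`, closing both fds would leave each object holding the other's last reference, which is the dangling state the commit message describes.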
-
Ross Lagerwall authored
[ Upstream commit 3ac7292a ] The page given to gnttab_end_foreign_access() to free could be a compound page so use put_page() instead of free_page() since it can handle both compound and single pages correctly. This bug was discovered when migrating a Xen VM with several VIFs and CONFIG_DEBUG_VM enabled. It hits a BUG usually after fewer than 10 iterations. All netfront devices disconnect from the backend during a suspend/resume and this will call gnttab_end_foreign_access() if a netfront queue has an outstanding skb. The mismatch between calling get_page() and free_page() on a compound page causes a reference counting error which is detected when DEBUG_VM is enabled. Signed-off-by:
Ross Lagerwall <ross.lagerwall@citrix.com> Reviewed-by:
Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ross Lagerwall authored
[ Upstream commit f599c64f ] When a netfront device is set up it registers a netdev fairly early on, before it has set up the queues and is actually usable. A userspace tool like NetworkManager will immediately try to open it and access its state as soon as it appears. The bug can be reproduced by hotplugging VIFs until the VM runs out of grant refs. It registers the netdev but fails to set up any queues (since there are no more grant refs). In the meantime, NetworkManager opens the device and the kernel crashes trying to access the queues (of which there are none). Fix this in two ways: * For initial setup, register the netdev much later, after the queues are set up. This avoids the race entirely. * During a suspend/resume cycle, the frontend reconnects to the backend and the queues are recreated. It is possible (though highly unlikely) to race with something opening the device and accessing the queues after they have been destroyed but before they have been recreated. Extend the region covered by the rtnl semaphore to protect against this race. There is a possibility that we fail to recreate the queues, so check for this in the open function. Signed-off-by:
Ross Lagerwall <ross.lagerwall@citrix.com> Reviewed-by:
Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by:
Juergen Gross <jgross@suse.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jiri Olsa authored
[ Upstream commit 49c0ae80 ] Stephane reported that we don't properly set the PERIOD sample type for events with a period term defined. Before: $ perf record -e cpu/cpu-cycles,period=1000/u ls $ perf evlist -v cpu/cpu-cycles,period=1000/u: ... sample_type: IP|TID|TIME|PERIOD, ... After: $ perf record -e cpu/cpu-cycles,period=1000/u ls $ perf evlist -v cpu/cpu-cycles,period=1000/u: ... sample_type: IP|TID|TIME, ... Set the PERIOD sample type based on the period term setup. Committer note: When we use -c or a period=N term in the event definition, we don't need to ask the kernel, via perf_event_attr.sample_type |= PERF_SAMPLE_PERIOD, to put the event period in each sample for this event, as we already know it: it is in perf_event_attr.sample_period. Reported-by:
Stephane Eranian <eranian@google.com> Signed-off-by:
Jiri Olsa <jolsa@kernel.org> Tested-by:
Stephane Eranian <eranian@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180201083812.11359-2-jolsa@kernel.org Signed-off-by:
Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
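The committer note in this commit boils down to one decision: request the period in each sample only when it is not already fixed by the user. A minimal sketch of that decision follows. The `TOY_SAMPLE_*` values and `toy_sample_type()` are hypothetical, chosen for illustration; the real `PERF_SAMPLE_*` constants and the perf tool's own setup code are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local toy flag values; the real PERF_SAMPLE_* bits are defined by the kernel. */
enum {
	TOY_SAMPLE_IP     = 1u << 0,
	TOY_SAMPLE_TID    = 1u << 1,
	TOY_SAMPLE_TIME   = 1u << 2,
	TOY_SAMPLE_PERIOD = 1u << 3,
};

/*
 * fixed_period is true when the user gave -c or a period=N term, so the
 * period is already known and need not be recorded in every sample.
 */
static uint32_t toy_sample_type(bool fixed_period)
{
	uint32_t type = TOY_SAMPLE_IP | TOY_SAMPLE_TID | TOY_SAMPLE_TIME;

	if (!fixed_period)
		type |= TOY_SAMPLE_PERIOD;
	return type;
}
```

With `fixed_period` set, the result corresponds to the `IP|TID|TIME` line in the "After" output; without it, the PERIOD bit stays on.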
-
Matt Redfearn authored
[ Upstream commit 7bf8b16d ] The GIC supports running in External Interrupt Controller (EIC) mode, and will signal this via cpu_has_veic if enabled in hardware. Currently the generic kernel will panic if cpu_has_veic is set - but the GIC can legitimately set this flag if either configured to boot in EIC mode, or if the GIC driver enables this mode. Make the kernel not panic in this case, and instead just check if the GIC is present. If so, use its CPU local interrupt routing functions. If an EIC is present, but it is not the GIC, then the kernel does not know how to get the VIRQ for the CPU local interrupts and should panic. Support for alternative EICs being present is needed here for the generic kernel to support them. Suggested-by:
Paul Burton <paul.burton@mips.com> Signed-off-by:
Matt Redfearn <matt.redfearn@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/18191/ Signed-off-by:
James Hogan <jhogan@kernel.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jiri Olsa authored
[ Upstream commit f290aa1f ] Stephane reported we don't unset the PERIOD sample type when --no-period is specified. Add the unset check and reset PERIOD if --no-period is specified. Committer notes: Check the sample_type, it shouldn't have PERF_SAMPLE_PERIOD there when --no-period is used. Before: # perf record --no-period sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.018 MB perf.data (7 samples) ] # perf evlist -v cycles:ppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 # After: [root@jouet ~]# perf record --no-period sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.019 MB perf.data (17 samples) ] [root@jouet ~]# perf evlist -v cycles:ppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 [root@jouet ~]# Reported-by:
Stephane Eranian <eranian@google.com> Signed-off-by:
Jiri Olsa <jolsa@kernel.org> Tested-by:
Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by:
Stephane Eranian <eranian@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180201083812.11359-3-jolsa@kernel.org Signed-off-by:
Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Matt Redfearn authored
[ Upstream commit 0cde5b44 ] When commit b27311e1 ("MIPS: TXx9: Add RBTX4939 board support") added board support for the RBTX4939, it added a call to led_classdev_register even if the LED class is built as a module. Built-in arch code cannot call module code directly like this. Commit b33b4407 ("MIPS: TXX9: use IS_ENABLED() macro") subsequently changed the inclusion of this code to a single check that CONFIG_LEDS_CLASS is either builtin or a module, but the same issue remains. This leads to MIPS allmodconfig builds failing when CONFIG_MACH_TX49XX=y is set: arch/mips/txx9/rbtx4939/setup.o: In function `rbtx4939_led_probe': setup.c:(.init.text+0xc0): undefined reference to `of_led_classdev_register' make: *** [Makefile:999: vmlinux] Error 1 Fix this by using the IS_BUILTIN() macro instead. Fixes: b27311e1 ("MIPS: TXx9: Add RBTX4939 board support") Signed-off-by:
Matt Redfearn <matt.redfearn@mips.com> Reviewed-by:
James Hogan <jhogan@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/18544/ Signed-off-by:
James Hogan <jhogan@kernel.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
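The difference between IS_ENABLED() and IS_BUILTIN() that this fix relies on can be demonstrated outside the kernel. The block below is a simplified re-creation of the Kconfig test macros written for this note; the helper macro names (`ARG_PLACEHOLDER_1`, `TAKE_SECOND_ARG`, `IS_DEFINED_*`) and `toy_can_call_led_code()` are illustrative, and the kernel's own definitions should be consulted for the authoritative form. It assumes the LED class is configured as a module, which is modelled by defining only `CONFIG_LEDS_CLASS_MODULE`.

```c
#include <assert.h>

/* Simplified re-creation of the Kconfig test macros, for illustration only. */
#define ARG_PLACEHOLDER_1 0,
#define TAKE_SECOND_ARG(ignored, val, ...) val
/* If the option is defined to 1, the placeholder expands to "0," and shifts 1 into second place. */
#define IS_DEFINED_3(arg1_or_junk) TAKE_SECOND_ARG(arg1_or_junk 1, 0, 0)
#define IS_DEFINED_2(val) IS_DEFINED_3(ARG_PLACEHOLDER_##val)
#define IS_DEFINED_1(x) IS_DEFINED_2(x)

#define IS_BUILTIN(option) IS_DEFINED_1(option)
#define IS_MODULE(option)  IS_DEFINED_1(option##_MODULE)
#define IS_ENABLED(option) (IS_BUILTIN(option) || IS_MODULE(option))

/* Models CONFIG_LEDS_CLASS=m: only the _MODULE symbol is defined. */
#define CONFIG_LEDS_CLASS_MODULE 1

/* Built-in code may only call into the LED class when it is built in too. */
static int toy_can_call_led_code(void)
{
	return IS_BUILTIN(CONFIG_LEDS_CLASS);
}
```

With the option set to `m`, `IS_ENABLED()` evaluates to 1 while `IS_BUILTIN()` evaluates to 0, which is why guarding the built-in board code with `IS_ENABLED()` still produced an undefined reference at link time.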
-
Yonghong Song authored
[ Upstream commit 09584b40 ] When CONFIG_BPF_JIT_ALWAYS_ON is defined in the config file, tools/testing/selftests/bpf/test_kmod.sh fails as shown below: [root@localhost bpf]# ./test_kmod.sh sysctl: setting key "net.core.bpf_jit_enable": Invalid argument [ JIT enabled:0 hardened:0 ] [ 132.175681] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to prog_create err=-524 len=4096 [ 132.458834] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed] [ JIT enabled:1 hardened:0 ] [ 133.456025] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to prog_create err=-524 len=4096 [ 133.730935] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed] [ JIT enabled:1 hardened:1 ] [ 134.769730] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to prog_create err=-524 len=4096 [ 135.050864] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed] [ JIT enabled:1 hardened:2 ] [ 136.442882] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to prog_create err=-524 len=4096 [ 136.821810] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed] [root@localhost bpf]# test_kmod.sh loads and removes test_bpf.ko multiple times with different settings for sysctl net.core.bpf_jit_{enable,harden}. The failed test #297 of test_bpf.ko is designed such that JIT always fails. Commit 290af866 (bpf: introduce BPF_JIT_ALWAYS_ON config) introduced the following tightening logic: ... if (!bpf_prog_is_dev_bound(fp->aux)) { fp = bpf_int_jit_compile(fp); #ifdef CONFIG_BPF_JIT_ALWAYS_ON if (!fp->jited) { *err = -ENOTSUPP; return fp; } #endif ... With this logic, Test #297 always gets the return value -ENOTSUPP when CONFIG_BPF_JIT_ALWAYS_ON is defined, causing the test failure. This patch fixes the failure by marking Test #297 as an expected failure when CONFIG_BPF_JIT_ALWAYS_ON is defined. Fixes: 290af866 (bpf: introduce BPF_JIT_ALWAYS_ON config) Signed-off-by:
Yonghong Song <yhs@fb.com> Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
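The "expected failure" idea from this commit can be sketched as a small predicate. Everything here is hypothetical and written for illustration: `toy_test`, `toy_test_passes()` and `TOY_ENOTSUPP` are not the names used in test_bpf.c, and the value 524 is taken only from the `err=-524` shown in the log above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Error value seen in the log above (err=-524); defined locally for this sketch. */
#define TOY_ENOTSUPP 524

struct toy_test {
	const char *name;
	bool jit_must_fail;   /* test is built so that JIT compilation always fails */
};

/*
 * A program-creation error counts as a pass only when JIT is mandatory,
 * the test is known to defeat the JIT, and the error is the expected one.
 */
static bool toy_test_passes(const struct toy_test *t, int create_err,
			    bool jit_always_on)
{
	assert(t != NULL);
	if (create_err == 0)
		return true;
	return jit_always_on && t->jit_must_fail && create_err == -TOY_ENOTSUPP;
}
```

Any other error, or the same error on a test that is not marked, still counts as a failure, so the marking does not hide unrelated regressions.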
-
Hans de Goede authored
[ Upstream commit 63347db0 ] The acpi_get_bus_status wrapper for acpi_bus_get_status_handle has some code to handle certain device quirks. In some cases we also need this quirk handling for the initial _STA call. Specifically, on some devices calling _STA before all _DEP dependencies are met results in errors like these: [ 0.123579] ACPI Error: No handler for Region [ECRM] (00000000ba9edc4c) [GenericSerialBus] (20170831/evregion-166) [ 0.123601] ACPI Error: Region GenericSerialBus (ID=9) has no handler (20170831/exfldio-299) [ 0.123618] ACPI Error: Method parse/execution failed \_SB.I2C1.BAT1._STA, AE_NOT_EXIST (20170831/psparse-550) acpi_get_bus_status already has code to avoid this, so by using it we also silence these errors from the initial _STA call. Note that in order for the acpi_get_bus_status handling for this to work, we initialize dep_unmet to 1 until acpi_device_dep_initialize gets called. This means that battery devices will be instantiated with an initial status of 0. This is not a problem: acpi_bus_attach will get called soon after the instantiation anyway, and it will update the status as its first order of business. Signed-off-by:
Hans de Goede <hdegoede@redhat.com> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Hans de Goede authored
[ Upstream commit 54ddce70 ] The battery code uses acpi_device->dep_unmet to check for unmet deps and if there are unmet deps it does not bind to the device to avoid errors about missing OpRegions when calling ACPI methods on the device. The missing OpRegions when there are unmet deps problem also applies to the _STA method of some battery devices and calling it too early results in errors like these: [ 0.123579] ACPI Error: No handler for Region [ECRM] (00000000ba9edc4c) [GenericSerialBus] (20170831/evregion-166) [ 0.123601] ACPI Error: Region GenericSerialBus (ID=9) has no handler (20170831/exfldio-299) [ 0.123618] ACPI Error: Method parse/execution failed \_SB.I2C1.BAT1._STA, AE_NOT_EXIST (20170831/psparse-550) This commit fixes these errors happening when acpi_get_bus_status gets called by checking dep_unmet for battery devices and reporting a status of 0 until all dependencies are met. Signed-off-by:
Hans de Goede <hdegoede@redhat.com> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
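The status gating described in the two ACPI commits above can be summarised as: while a battery device still has unmet dependencies, report status 0 instead of evaluating _STA. A minimal sketch under stated assumptions follows; `toy_acpi_device` and `toy_get_status()` are hypothetical, and the `hw_status` field merely stands in for the result of a real _STA evaluation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of the fields relevant to the fix; not the kernel's struct acpi_device. */
struct toy_acpi_device {
	bool is_battery;
	unsigned int dep_unmet;   /* number of _DEP dependencies not yet met */
	unsigned int hw_status;   /* stands in for the value _STA would return */
};

static unsigned int toy_get_status(const struct toy_acpi_device *dev)
{
	assert(dev != NULL);
	/* Calling _STA too early would touch OpRegions with no handler yet. */
	if (dev->is_battery && dev->dep_unmet)
		return 0;
	return dev->hw_status;
}
```

Starting `dep_unmet` at 1 means a freshly created battery device reports 0 until its dependencies have been counted, after which the real status becomes visible.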
-
Chen Yu authored
[ Upstream commit ba1edb9a ] The following warning was triggered after resuming from S3, if all the nonboot CPUs were put offline before suspend: [ 1840.329515] unchecked MSR access error: RDMSR from 0x771 at rIP: 0xffffffff86061e3a (native_read_msr+0xa/0x30) [ 1840.329516] Call Trace: [ 1840.329521] __rdmsr_on_cpu+0x33/0x50 [ 1840.329525] generic_exec_single+0x81/0xb0 [ 1840.329527] smp_call_function_single+0xd2/0x100 [ 1840.329530] ? acpi_ds_result_pop+0xdd/0xf2 [ 1840.329532] ? acpi_ds_create_operand+0x215/0x23c [ 1840.329534] rdmsrl_on_cpu+0x57/0x80 [ 1840.329536] ? cpumask_next+0x1b/0x20 [ 1840.329538] ? rdmsrl_on_cpu+0x57/0x80 [ 1840.329541] intel_pstate_update_perf_limits+0xf3/0x220 [ 1840.329544] ? notifier_call_chain+0x4a/0x70 [ 1840.329546] intel_pstate_set_policy+0x4e/0x150 [ 1840.329548] cpufreq_set_policy+0xcd/0x2f0 [ 1840.329550] cpufreq_update_policy+0xb2/0x130 [ 1840.329552] ? cpufreq_update_policy+0x130/0x130 [ 1840.329556] acpi_processor_ppc_has_changed+0x65/0x80 [ 1840.329558] acpi_processor_notify+0x80/0x100 [ 1840.329561] acpi_ev_notify_dispatch+0x44/0x5c [ 1840.329563] acpi_os_execute_deferred+0x14/0x20 [ 1840.329565] process_one_work+0x193/0x3c0 [ 1840.329567] worker_thread+0x35/0x3b0 [ 1840.329569] kthread+0x125/0x140 [ 1840.329571] ? process_one_work+0x3c0/0x3c0 [ 1840.329572] ? kthread_park+0x60/0x60 [ 1840.329575] ? do_syscall_64+0x67/0x180 [ 1840.329577] ret_from_fork+0x25/0x30 [ 1840.329585] unchecked MSR access error: WRMSR to 0x774 (tried to write 0x0000000000000000) at rIP: 0xffffffff86061f78 (native_write_msr+0x8/0x30) [ 1840.329586] Call Trace: [ 1840.329587] __wrmsr_on_cpu+0x37/0x40 [ 1840.329589] generic_exec_single+0x81/0xb0 [ 1840.329592] smp_call_function_single+0xd2/0x100 [ 1840.329594] ? acpi_ds_create_operand+0x215/0x23c [ 1840.329595] ? cpumask_next+0x1b/0x20 [ 1840.329597] wrmsrl_on_cpu+0x57/0x70 [ 1840.329598] ? rdmsrl_on_cpu+0x57/0x80 [ 1840.329599] ? wrmsrl_on_cpu+0x57/0x70 [ 1840.329602] intel_pstate_hwp_set+0xd3/0x150 [ 1840.329604] intel_pstate_set_policy+0x119/0x150 [ 1840.329606] cpufreq_set_policy+0xcd/0x2f0 [ 1840.329607] cpufreq_update_policy+0xb2/0x130 [ 1840.329610] ? cpufreq_update_policy+0x130/0x130 [ 1840.329613] acpi_processor_ppc_has_changed+0x65/0x80 [ 1840.329615] acpi_processor_notify+0x80/0x100 [ 1840.329617] acpi_ev_notify_dispatch+0x44/0x5c [ 1840.329619] acpi_os_execute_deferred+0x14/0x20 [ 1840.329620] process_one_work+0x193/0x3c0 [ 1840.329622] worker_thread+0x35/0x3b0 [ 1840.329624] kthread+0x125/0x140 [ 1840.329625] ? process_one_work+0x3c0/0x3c0 [ 1840.329626] ? kthread_park+0x60/0x60 [ 1840.329628] ? do_syscall_64+0x67/0x180 [ 1840.329631] ret_from_fork+0x25/0x30 This is because, if there's only one online CPU, MSR_PM_ENABLE (package wide) cannot be enabled after resume: intel_pstate_hwp_enable() is only invoked while an AP comes online after resume, so if no AP comes online, HWP remains disabled after resume (the BIOS has disabled it in S3). If a _PPC change notification that touches an HWP register arrives during this stage, the warning is triggered. Since we don't call acpi_processor_register_performance() when HWP is enabled, pr->performance will be NULL. When it is NULL we don't need to handle the _PPC change notification. Reported-by:
Doug Smythies <dsmythies@telus.net> Suggested-by:
Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by:
Yu Chen <yu.c.chen@intel.com> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jean Delvare authored
[ Upstream commit a7770ae1 ] The handling of empty DMI strings looks quite broken to me: * Strings from 1 to 7 spaces are not considered empty. * True empty DMI strings (string index set to 0) are not considered empty, and result in allocating a 0-char string. * Strings with invalid index also result in allocating a 0-char string. * Strings starting with 8 spaces are all considered empty, even if non-space characters follow (sounds like a weird thing to do, but I have actually seen occurrences of this in DMI tables before.) * Strings which are considered empty are reported as 8 spaces, instead of being actually empty. Some of these issues are the result of an off-by-one error in memcmp, the rest is incorrect by design. So let's get it square: missing strings and strings made of only spaces, regardless of their length, should be treated as empty and no memory should be allocated for them. All other strings are non-empty and should be allocated. Signed-off-by:
Jean Delvare <jdelvare@suse.de> Fixes: 79da4721 ("x86: fix DMI out of memory problems") Cc: Parag Warudkar <parag.warudkar@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
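The rule this commit settles on (missing strings and strings made only of spaces, of any length, are empty; everything else is not) can be written as a short predicate. This is a sketch of the rule as stated in the message, not the patched kernel function; `toy_dmi_is_blank()` is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true for a missing string, a zero-length string, or a string
 * consisting only of spaces, regardless of how many spaces there are.
 */
static bool toy_dmi_is_blank(const char *s)
{
	if (s == NULL)
		return true;
	while (*s == ' ')
		s++;
	return *s == '\0';
}
```

Scanning to the first non-space character avoids the fixed-length comparison that caused the listed problems: seven spaces count as empty, and eight spaces followed by text do not.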
-
Arnd Bergmann authored
[ Upstream commit ebfc1501 ] In some configurations, 'partial' does not get initialized, as shown by this gcc-8 warning: arch/x86/kernel/dumpstack.c: In function 'show_trace_log_lvl': arch/x86/kernel/dumpstack.c:156:4: error: 'partial' may be used uninitialized in this function [-Werror=maybe-uninitialized] show_regs_if_on_stack(&stack_info, regs, partial); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ This initializes it to false, to get the previous behavior in this case. Fixes: a9cdbe72 ("x86/dumpstack: Fix partial register dumps") Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <ak@linux.intel.com> Cc: Nicolas Pitre <nico@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Link: https://lkml.kernel.org/r/20180202145634.200291-1-arnd@arndb.de Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Arnd Bergmann authored
[ Upstream commit 328008a7 ] The declaration for swsusp_arch_resume marks it as 'asmlinkage', but the definition in x86-32 does not, and it fails to include the header with the declaration. This leads to a warning when building with link-time-optimizations: kernel/power/power.h:108:23: error: type of 'swsusp_arch_resume' does not match original declaration [-Werror=lto-type-mismatch] extern asmlinkage int swsusp_arch_resume(void); ^ arch/x86/power/hibernate_32.c:148:0: note: 'swsusp_arch_resume' was previously declared here int swsusp_arch_resume(void) This moves the declaration into a globally visible header file and fixes up both x86 definitions to match it. Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Cc: Len Brown <len.brown@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Nicolas Pitre <nico@linaro.org> Cc: linux-pm@vger.kernel.org Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Pavel Machek <pavel@ucw.cz> Cc: Bart Van Assche <bart.vanassche@wdc.com> Link: https://lkml.kernel.org/r/20180202145634.200291-2-arnd@arndb.de Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Subash Abhinov Kasiviswanathan authored
[ Upstream commit ea23d5e3 ] Failures were seen in ICMPv6 fragmentation timeout tests if they were run after the RFC2460 failure tests. Kernel was not sending out the ICMPv6 fragment reassembly time exceeded packet after the fragmentation reassembly timeout of 1 minute had elapsed. This happened because the frag queue was not released if an error in IPv6 fragmentation header was detected by RFC2460. Fixes: 83f1999c ("netfilter: ipv6: nf_defrag: Pass on packets to stack per RFC2460") Signed-off-by:
Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Sebastian Ott authored
[ Upstream commit 366b77ae ] Commit 2a842aca ("block: introduce new block status code type") added blk_status_t usage to the eadm subchannel driver. However blk_status_t is unknown when included via <linux/blkdev.h> for CONFIG_BLOCK=n. Only include <linux/blk_types.h> since this is the only dependency eadm has. This fixes build failures like below: In file included from drivers/s390/cio/eadm_sch.c:24:0: ./arch/s390/include/asm/eadm.h:111:4: error: unknown type name 'blk_status_t'; did you mean 'si_status'? blk_status_t error); Reported-by:
Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by:
Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Karol Herbst authored
[ Upstream commit fe9748b7 ] Fixes failure to compile with recent envyas as a result of the 'movw' alias being removed for v5. A bit of history: v3 only has a 16-bit sign-extended immediate mov op. In order to set the high bits, there's a separate 'sethi' op. envyas validates that the value passed to mov(imm) is between -0x8000 and 0x7fff. In order to simplify macros that load both the low and high word, a 'movw' alias was added which takes an unsigned 16-bit immediate. However, the actual hardware op still sign extends. v5 has a full 32-bit immediate mov op. The v3 16-bit immediate mov op is gone (loads 0 into the dst reg). However, due to a bug in envyas, the movw alias still existed, and selected the no-longer-present v3 16-bit immediate mov op. As a result, usage of movw on v5 is the same as mov with a 0x0 argument. The proper fix throughout is to only ever use the 'movw' alias in combination with 'sethi'. Anything else should get the sign-extended validation to ensure that the intended value ends up in the destination register. Changes in fuc3 binaries are the result of a different encoding being selected for a mov with an 8-bit value. v2: added commit message written by Ilia, thanks for that! v3: messed up rebasing, now it should apply Signed-off-by:
Karol Herbst <kherbst@redhat.com> Signed-off-by:
Ben Skeggs <bskeggs@redhat.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
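The reason 'movw' is only safe together with 'sethi' follows from the sign extension described in this commit. The arithmetic can be checked with a small model. This is a sketch, not envyas or hardware documentation: the `toy_*` helpers are hypothetical, and the assumption that 'sethi' replaces the upper 16 bits while keeping the lower 16 is inferred from the message's description of loading the low and high words separately.

```c
#include <assert.h>
#include <stdint.h>

/* v3-style mov with a 16-bit immediate: the value is sign-extended to 32 bits. */
static uint32_t toy_mov_imm16(uint16_t imm)
{
	return (imm & 0x8000u) ? (0xffff0000u | imm) : imm;
}

/* Assumed sethi behaviour: replace the upper 16 bits, keep the lower 16. */
static uint32_t toy_sethi(uint32_t reg, uint16_t hi)
{
	return (reg & 0x0000ffffu) | ((uint32_t)hi << 16);
}

/* The low-word mov followed by sethi, as used to load a full 32-bit constant. */
static uint32_t toy_load32(uint32_t value)
{
	uint32_t reg = toy_mov_imm16((uint16_t)(value & 0xffffu));

	return toy_sethi(reg, (uint16_t)(value >> 16));
}
```

On its own, the low-word mov of 0x8000 yields 0xffff8000 rather than 0x00008000; the following sethi overwrites the upper half, so the pair always produces the intended constant.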
-
Don Hiatt authored
[ Upstream commit 87daac68 ] iWarp devices do not support the creation of address handles so return AH_ATTR_TYPE_UNDEFINED for all iWarp devices. While we are here reduce the size of port_num to u8 and add a comment. Fixes: 44c58487 ("IB/core: Define 'ib' and 'roce' rdma_ah_attr types") Reported-by:
Parav Pandit <parav@mellanox.com> CC: Sean Hefty <sean.hefty@intel.com> Reviewed-by:
Ira Weiny <ira.weiny@intel.com> Reviewed-by:
Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by:
Don Hiatt <don.hiatt@intel.com> Signed-off-by:
Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by:
Jason Gunthorpe <jgg@mellanox.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Alex Estrin authored
[ Upstream commit 10293610 ] On reboot, the SM can program the port pkey table before ipoib has registered its event handler, which could result in a missed pkey event and leave the root interface with the initial pkey value from index 0. Since an OPA port starts with an invalid pkey in index 0, the root interface will fail to initialize and stay down with the no-carrier flag. For IB, the ipoib interface may end up with a pkey different from the value opensm put in pkey table index 0, resulting in connectivity issues (different mcast groups, for example). Close the window by calling the event handler after registration to make sure the ipoib pkey is in sync with the port pkey table. Reviewed-by:
Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by:
Ira Weiny <ira.weiny@intel.com> Signed-off-by:
Alex Estrin <alex.estrin@intel.com> Signed-off-by:
Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by:
Jason Gunthorpe <jgg@mellanox.com> Signed-off-by:
Sasha Levin <alexander.levin@microsoft.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-