1. 05 Oct, 2019 40 commits
    • Nikolay Borisov's avatar
      btrfs: Relinquish CPUs in btrfs_compare_trees · 067f82a0
      Nikolay Borisov authored
      commit 6af112b1 upstream.
      
      When doing any form of incremental send the parent and the child trees
      need to be compared via btrfs_compare_trees. This  can result in long
      loop chains without ever relinquishing the CPU. This causes softlockup
      detector to trigger when comparing trees with a lot of items. Example
      report:
      
      watchdog: BUG: soft lockup - CPU#0 stuck for 24s! [snapperd:16153]
      CPU: 0 PID: 16153 Comm: snapperd Not tainted 5.2.9-1-default #1 openSUSE Tumbleweed (unreleased)
      Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
      pstate: 40000005 (nZcv daif -PAN -UAO)
      pc : __ll_sc_arch_atomic_sub_return+0x14/0x20
      lr : btrfs_release_extent_buffer_pages+0xe0/0x1e8 [btrfs]
      sp : ffff00001273b7e0
      Call trace:
       __ll_sc_arch_atomic_sub_return+0x14/0x20
       release_extent_buffer+0xdc/0x120 [btrfs]
       free_extent_buffer.part.0+0xb0/0x118 [btrfs]
       free_extent_buffer+0x24/0x30 [btrfs]
       btrfs_release_path+0x4c/0xa0 [btrfs]
       btrfs_free_path.part.0+0x20/0x40 [btrfs]
       btrfs_free_path+0x24/0x30 [btrfs]
       get_inode_info+0xa8/0xf8 [btrfs]
       finish_inode_if_needed+0xe0/0x6d8 [btrfs]
       changed_cb+0x9c/0x410 [btrfs]
       btrfs_compare_trees+0x284/0x648 [btrfs]
       send_subvol+0x33c/0x520 [btrfs]
       btrfs_ioctl_send+0x8a0/0xaf0 [btrfs]
       btrfs_ioctl+0x199c/0x2288 [btrfs]
       do_vfs_ioctl+0x4b0/0x820
       ksys_ioctl+0x84/0xb8
       __arm64_sys_ioctl+0x28/0x38
       el0_svc_common.constprop.0+0x7c/0x188
       el0_svc_handler+0x34/0x90
       el0_svc+0x8/0xc
      
      Fix this by adding a call to cond_resched at the beginning of the main
      loop in btrfs_compare_trees.
      
      Fixes: 7069830a ("Btrfs: add btrfs_compare_trees function")
      CC: stable@vger.kernel.org # 4.4+
      Reviewed-by: default avatarJohannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: default avatarNikolay Borisov <nborisov@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      067f82a0
    • Filipe Manana's avatar
      Btrfs: fix use-after-free when using the tree modification log · b08344be
      Filipe Manana authored
      commit efad8a85 upstream.
      
      At ctree.c:get_old_root(), we are accessing a root's header owner field
      after we have freed the respective extent buffer. This results in an
      use-after-free that can lead to crashes, and when CONFIG_DEBUG_PAGEALLOC
      is set, results in a stack trace like the following:
      
        [ 3876.799331] stack segment: 0000 [#1] SMP DEBUG_PAGEALLOC PTI
        [ 3876.799363] CPU: 0 PID: 15436 Comm: pool Not tainted 5.3.0-rc3-btrfs-next-54 #1
        [ 3876.799385] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014
        [ 3876.799433] RIP: 0010:btrfs_search_old_slot+0x652/0xd80 [btrfs]
        (...)
        [ 3876.799502] RSP: 0018:ffff9f08c1a2f9f0 EFLAGS: 00010286
        [ 3876.799518] RAX: ffff8dd300000000 RBX: ffff8dd85a7a9348 RCX: 000000038da26000
        [ 3876.799538] RDX: 0000000000000000 RSI: ffffe522ce368980 RDI: 0000000000000246
        [ 3876.799559] RBP: dae1922adadad000 R08: 0000000008020000 R09: ffffe522c0000000
        [ 3876.799579] R10: ffff8dd57fd788c8 R11: 000000007511b030 R12: ffff8dd781ddc000
        [ 3876.799599] R13: ffff8dd9e6240578 R14: ffff8dd6896f7a88 R15: ffff8dd688cf90b8
        [ 3876.799620] FS:  00007f23ddd97700(0000) GS:ffff8dda20200000(0000) knlGS:0000000000000000
        [ 3876.799643] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        [ 3876.799660] CR2: 00007f23d4024000 CR3: 0000000710bb0005 CR4: 00000000003606f0
        [ 3876.799682] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        [ 3876.799703] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        [ 3876.799723] Call Trace:
        [ 3876.799735]  ? do_raw_spin_unlock+0x49/0xc0
        [ 3876.799749]  ? _raw_spin_unlock+0x24/0x30
        [ 3876.799779]  resolve_indirect_refs+0x1eb/0xc80 [btrfs]
        [ 3876.799810]  find_parent_nodes+0x38d/0x1180 [btrfs]
        [ 3876.799841]  btrfs_check_shared+0x11a/0x1d0 [btrfs]
        [ 3876.799870]  ? extent_fiemap+0x598/0x6e0 [btrfs]
        [ 3876.799895]  extent_fiemap+0x598/0x6e0 [btrfs]
        [ 3876.799913]  do_vfs_ioctl+0x45a/0x700
        [ 3876.799926]  ksys_ioctl+0x70/0x80
        [ 3876.799938]  ? trace_hardirqs_off_thunk+0x1a/0x20
        [ 3876.799953]  __x64_sys_ioctl+0x16/0x20
        [ 3876.799965]  do_syscall_64+0x62/0x220
        [ 3876.799977]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
        [ 3876.799993] RIP: 0033:0x7f23e0013dd7
        (...)
        [ 3876.800056] RSP: 002b:00007f23ddd96ca8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
        [ 3876.800078] RAX: ffffffffffffffda RBX: 00007f23d80210f8 RCX: 00007f23e0013dd7
        [ 3876.800099] RDX: 00007f23d80210f8 RSI: 00000000c020660b RDI: 0000000000000003
        [ 3876.800626] RBP: 000055fa2a2a2440 R08: 0000000000000000 R09: 00007f23ddd96d7c
        [ 3876.801143] R10: 00007f23d8022000 R11: 0000000000000246 R12: 00007f23ddd96d80
        [ 3876.801662] R13: 00007f23ddd96d78 R14: 00007f23d80210f0 R15: 00007f23ddd96d80
        (...)
        [ 3876.805107] ---[ end trace e53161e179ef04f9 ]---
      
      Fix that by saving the root's header owner field into a local variable
      before freeing the root's extent buffer, and then use that local variable
      when needed.
      
      Fixes: 30b0463a ("Btrfs: fix accessing the root pointer in tree mod log functions")
      CC: stable@vger.kernel.org # 3.10+
      Reviewed-by: default avatarNikolay Borisov <nborisov@suse.com>
      Reviewed-by: default avatarAnand Jain <anand.jain@oracle.com>
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b08344be
    • Christophe Leroy's avatar
      btrfs: fix allocation of free space cache v1 bitmap pages · 4874c6fe
      Christophe Leroy authored
      commit 3acd4850 upstream.
      
      Various notifications of type "BUG kmalloc-4096 () : Redzone
      overwritten" have been observed recently in various parts of the kernel.
      After some time, it has been made a relation with the use of BTRFS
      filesystem and with SLUB_DEBUG turned on.
      
      [   22.809700] BUG kmalloc-4096 (Tainted: G        W        ): Redzone overwritten
      
      [   22.810286] INFO: 0xbe1a5921-0xfbfc06cd. First byte 0x0 instead of 0xcc
      [   22.810866] INFO: Allocated in __load_free_space_cache+0x588/0x780 [btrfs] age=22 cpu=0 pid=224
      [   22.811193] 	__slab_alloc.constprop.26+0x44/0x70
      [   22.811345] 	kmem_cache_alloc_trace+0xf0/0x2ec
      [   22.811588] 	__load_free_space_cache+0x588/0x780 [btrfs]
      [   22.811848] 	load_free_space_cache+0xf4/0x1b0 [btrfs]
      [   22.812090] 	cache_block_group+0x1d0/0x3d0 [btrfs]
      [   22.812321] 	find_free_extent+0x680/0x12a4 [btrfs]
      [   22.812549] 	btrfs_reserve_extent+0xec/0x220 [btrfs]
      [   22.812785] 	btrfs_alloc_tree_block+0x178/0x5f4 [btrfs]
      [   22.813032] 	__btrfs_cow_block+0x150/0x5d4 [btrfs]
      [   22.813262] 	btrfs_cow_block+0x194/0x298 [btrfs]
      [   22.813484] 	commit_cowonly_roots+0x44/0x294 [btrfs]
      [   22.813718] 	btrfs_commit_transaction+0x63c/0xc0c [btrfs]
      [   22.813973] 	close_ctree+0xf8/0x2a4 [btrfs]
      [   22.814107] 	generic_shutdown_super+0x80/0x110
      [   22.814250] 	kill_anon_super+0x18/0x30
      [   22.814437] 	btrfs_kill_super+0x18/0x90 [btrfs]
      [   22.814590] INFO: Freed in proc_cgroup_show+0xc0/0x248 age=41 cpu=0 pid=83
      [   22.814841] 	proc_cgroup_show+0xc0/0x248
      [   22.814967] 	proc_single_show+0x54/0x98
      [   22.815086] 	seq_read+0x278/0x45c
      [   22.815190] 	__vfs_read+0x28/0x17c
      [   22.815289] 	vfs_read+0xa8/0x14c
      [   22.815381] 	ksys_read+0x50/0x94
      [   22.815475] 	ret_from_syscall+0x0/0x38
      
      Commit 69d24804 ("btrfs: use copy_page for copying pages instead of
      memcpy") changed the way bitmap blocks are copied. But allthough bitmaps
      have the size of a page, they were allocated with kzalloc().
      
      Most of the time, kzalloc() allocates aligned blocks of memory, so
      copy_page() can be used. But when some debug options like SLAB_DEBUG are
      activated, kzalloc() may return unaligned pointer.
      
      On powerpc, memcpy(), copy_page() and other copying functions use
      'dcbz' instruction which provides an entire zeroed cacheline to avoid
      memory read when the intention is to overwrite a full line. Functions
      like memcpy() are writen to care about partial cachelines at the start
      and end of the destination, but copy_page() assumes it gets pages. As
      pages are naturally cache aligned, copy_page() doesn't care about
      partial lines. This means that when copy_page() is called with a
      misaligned pointer, a few leading bytes are zeroed.
      
      To fix it, allocate bitmaps through kmem_cache instead of using kzalloc()
      The cache pool is created with PAGE_SIZE alignment constraint.
      Reported-by: default avatarErhard F. <erhard_f@mailbox.org>
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204371
      Fixes: 69d24804 ("btrfs: use copy_page for copying pages instead of memcpy")
      Cc: stable@vger.kernel.org # 4.19+
      Signed-off-by: default avatarChristophe Leroy <christophe.leroy@c-s.fr>
      Reviewed-by: default avatarDavid Sterba <dsterba@suse.com>
      [ rename to btrfs_free_space_bitmap ]
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4874c6fe
    • Mark Salyzyn's avatar
      ovl: filter of trusted xattr results in audit · 934243a7
      Mark Salyzyn authored
      commit 5c2e9f34 upstream.
      
      When filtering xattr list for reading, presence of trusted xattr
      results in a security audit log.  However, if there is other content
      no errno will be set, and if there isn't, the errno will be -ENODATA
      and not -EPERM as is usually associated with a lack of capability.
      The check does not block the request to list the xattrs present.
      
      Switch to ns_capable_noaudit to reflect a more appropriate check.
      Signed-off-by: default avatarMark Salyzyn <salyzyn@android.com>
      Cc: linux-security-module@vger.kernel.org
      Cc: kernel-team@android.com
      Cc: stable@vger.kernel.org # v3.18+
      Fixes: a082c6f6 ("ovl: filter trusted xattr for non-admin")
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      934243a7
    • Ding Xiang's avatar
      ovl: Fix dereferencing possible ERR_PTR() · e7265adc
      Ding Xiang authored
      commit 97f024b9 upstream.
      
      if ovl_encode_real_fh() fails, no memory was allocated
      and the error in the error-valued pointer should be returned.
      
      Fixes: 9b6faee0 ("ovl: check ERR_PTR() return value from ovl_encode_fh()")
      Signed-off-by: default avatarDing Xiang <dingxiang@cmss.chinamobile.com>
      Cc: <stable@vger.kernel.org> # v4.16+
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e7265adc
    • Steve French's avatar
      smb3: allow disabling requesting leases · 2e96c933
      Steve French authored
      commit 3e7a02d4 upstream.
      
      In some cases to work around server bugs or performance
      problems it can be helpful to be able to disable requesting
      SMB2.1/SMB3 leases on a particular mount (not to all servers
      and all shares we are mounted to). Add new mount parm
      "nolease" which turns off requesting leases on directory
      or file opens.  Currently the only way to disable leases is
      globally through a module load parameter. This is more
      granular.
      Suggested-by: default avatarPavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: default avatarSteve French <stfrench@microsoft.com>
      Reviewed-by: default avatarRonnie Sahlberg <lsahlber@redhat.com>
      Reviewed-by: default avatarPavel Shilovsky <pshilov@microsoft.com>
      CC: Stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2e96c933
    • Yufen Yu's avatar
      block: fix null pointer dereference in blk_mq_rq_timed_out() · 82652c06
      Yufen Yu authored
      commit 8d699663 upstream.
      
      We got a null pointer deference BUG_ON in blk_mq_rq_timed_out()
      as following:
      
      [  108.825472] BUG: kernel NULL pointer dereference, address: 0000000000000040
      [  108.827059] PGD 0 P4D 0
      [  108.827313] Oops: 0000 [#1] SMP PTI
      [  108.827657] CPU: 6 PID: 198 Comm: kworker/6:1H Not tainted 5.3.0-rc8+ #431
      [  108.829503] Workqueue: kblockd blk_mq_timeout_work
      [  108.829913] RIP: 0010:blk_mq_check_expired+0x258/0x330
      [  108.838191] Call Trace:
      [  108.838406]  bt_iter+0x74/0x80
      [  108.838665]  blk_mq_queue_tag_busy_iter+0x204/0x450
      [  108.839074]  ? __switch_to_asm+0x34/0x70
      [  108.839405]  ? blk_mq_stop_hw_queue+0x40/0x40
      [  108.839823]  ? blk_mq_stop_hw_queue+0x40/0x40
      [  108.840273]  ? syscall_return_via_sysret+0xf/0x7f
      [  108.840732]  blk_mq_timeout_work+0x74/0x200
      [  108.841151]  process_one_work+0x297/0x680
      [  108.841550]  worker_thread+0x29c/0x6f0
      [  108.841926]  ? rescuer_thread+0x580/0x580
      [  108.842344]  kthread+0x16a/0x1a0
      [  108.842666]  ? kthread_flush_work+0x170/0x170
      [  108.843100]  ret_from_fork+0x35/0x40
      
      The bug is caused by the race between timeout handle and completion for
      flush request.
      
      When timeout handle function blk_mq_rq_timed_out() try to read
      'req->q->mq_ops', the 'req' have completed and reinitiated by next
      flush request, which would call blk_rq_init() to clear 'req' as 0.
      
      After commit 12f5b931 ("blk-mq: Remove generation seqeunce"),
      normal requests lifetime are protected by refcount. Until 'rq->ref'
      drop to zero, the request can really be free. Thus, these requests
      cannot been reused before timeout handle finish.
      
      However, flush request has defined .end_io and rq->end_io() is still
      called even if 'rq->ref' doesn't drop to zero. After that, the 'flush_rq'
      can be reused by the next flush request handle, resulting in null
      pointer deference BUG ON.
      
      We fix this problem by covering flush request with 'rq->ref'.
      If the refcount is not zero, flush_end_io() return and wait the
      last holder recall it. To record the request status, we add a new
      entry 'rq_status', which will be used in flush_end_io().
      
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Keith Busch <keith.busch@intel.com>
      Cc: Bart Van Assche <bvanassche@acm.org>
      Cc: stable@vger.kernel.org # v4.18+
      Reviewed-by: default avatarMing Lei <ming.lei@redhat.com>
      Reviewed-by: default avatarBob Liu <bob.liu@oracle.com>
      Signed-off-by: default avatarYufen Yu <yuyufen@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      -------
      v2:
       - move rq_status from struct request to struct blk_flush_queue
      v3:
       - remove unnecessary '{}' pair.
      v4:
       - let spinlock to protect 'fq->rq_status'
      v5:
       - move rq_status after flush_running_idx member of struct blk_flush_queue
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      82652c06
    • Stefan Assmann's avatar
      i40e: check __I40E_VF_DISABLE bit in i40e_sync_filters_subtask · db5b2fe4
      Stefan Assmann authored
      commit a7542b87 upstream.
      
      While testing VF spawn/destroy the following panic occurred.
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000029
      [...]
      Workqueue: i40e i40e_service_task [i40e]
      RIP: 0010:i40e_sync_vsi_filters+0x6fd/0xc60 [i40e]
      [...]
      Call Trace:
       ? __switch_to_asm+0x35/0x70
       ? __switch_to_asm+0x41/0x70
       ? __switch_to_asm+0x35/0x70
       ? _cond_resched+0x15/0x30
       i40e_sync_filters_subtask+0x56/0x70 [i40e]
       i40e_service_task+0x382/0x11b0 [i40e]
       ? __switch_to_asm+0x41/0x70
       ? __switch_to_asm+0x41/0x70
       process_one_work+0x1a7/0x3b0
       worker_thread+0x30/0x390
       ? create_worker+0x1a0/0x1a0
       kthread+0x112/0x130
       ? kthread_bind+0x30/0x30
       ret_from_fork+0x35/0x40
      
      Investigation revealed a race where pf->vf[vsi->vf_id].trusted may get
      accessed by the watchdog via i40e_sync_filters_subtask() although
      i40e_free_vfs() already free'd pf->vf.
      To avoid this the call to i40e_sync_vsi_filters() in
      i40e_sync_filters_subtask() needs to be guarded by __I40E_VF_DISABLE,
      which is also used by i40e_free_vfs().
      
      Note: put the __I40E_VF_DISABLE check after the
      __I40E_MACVLAN_SYNC_PENDING check as the latter is more likely to
      trigger.
      
      CC: stable@vger.kernel.org
      Signed-off-by: default avatarStefan Assmann <sassmann@kpanic.de>
      Tested-by: default avatarAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      db5b2fe4
    • Michal Hocko's avatar
      memcg, kmem: do not fail __GFP_NOFAIL charges · b4a734a5
      Michal Hocko authored
      commit e55d9d9b upstream.
      
      Thomas has noticed the following NULL ptr dereference when using cgroup
      v1 kmem limit:
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
      PGD 0
      P4D 0
      Oops: 0000 [#1] PREEMPT SMP PTI
      CPU: 3 PID: 16923 Comm: gtk-update-icon Not tainted 4.19.51 #42
      Hardware name: Gigabyte Technology Co., Ltd. Z97X-Gaming G1/Z97X-Gaming G1, BIOS F9 07/31/2015
      RIP: 0010:create_empty_buffers+0x24/0x100
      Code: cd 0f 1f 44 00 00 0f 1f 44 00 00 41 54 49 89 d4 ba 01 00 00 00 55 53 48 89 fb e8 97 fe ff ff 48 89 c5 48 89 c2 eb 03 48 89 ca <48> 8b 4a 08 4c 09 22 48 85 c9 75 f1 48 89 6a 08 48 8b 43 18 48 8d
      RSP: 0018:ffff927ac1b37bf8 EFLAGS: 00010286
      RAX: 0000000000000000 RBX: fffff2d4429fd740 RCX: 0000000100097149
      RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff9075a99fbe00
      RBP: 0000000000000000 R08: fffff2d440949cc8 R09: 00000000000960c0
      R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000000
      R13: ffff907601f18360 R14: 0000000000002000 R15: 0000000000001000
      FS:  00007fb55b288bc0(0000) GS:ffff90761f8c0000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000008 CR3: 000000007aebc002 CR4: 00000000001606e0
      Call Trace:
       create_page_buffers+0x4d/0x60
       __block_write_begin_int+0x8e/0x5a0
       ? ext4_inode_attach_jinode.part.82+0xb0/0xb0
       ? jbd2__journal_start+0xd7/0x1f0
       ext4_da_write_begin+0x112/0x3d0
       generic_perform_write+0xf1/0x1b0
       ? file_update_time+0x70/0x140
       __generic_file_write_iter+0x141/0x1a0
       ext4_file_write_iter+0xef/0x3b0
       __vfs_write+0x17e/0x1e0
       vfs_write+0xa5/0x1a0
       ksys_write+0x57/0xd0
       do_syscall_64+0x55/0x160
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Tetsuo then noticed that this is because the __memcg_kmem_charge_memcg
      fails __GFP_NOFAIL charge when the kmem limit is reached.  This is a wrong
      behavior because nofail allocations are not allowed to fail.  Normal
      charge path simply forces the charge even if that means to cross the
      limit.  Kmem accounting should be doing the same.
      
      Link: http://lkml.kernel.org/r/20190906125608.32129-1-mhocko@kernel.orgSigned-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Reported-by: default avatarThomas Lindroth <thomas.lindroth@gmail.com>
      Debugged-by: default avatarTetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Thomas Lindroth <thomas.lindroth@gmail.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b4a734a5
    • Tetsuo Handa's avatar
      memcg, oom: don't require __GFP_FS when invoking memcg OOM killer · d40b3eaf
      Tetsuo Handa authored
      commit f9c64562 upstream.
      
      Masoud Sharbiani noticed that commit 29ef680a ("memcg, oom: move
      out_of_memory back to the charge path") broke memcg OOM called from
      __xfs_filemap_fault() path.  It turned out that try_charge() is retrying
      forever without making forward progress because mem_cgroup_oom(GFP_NOFS)
      cannot invoke the OOM killer due to commit 3da88fb3 ("mm, oom:
      move GFP_NOFS check to out_of_memory").
      
      Allowing forced charge due to being unable to invoke memcg OOM killer will
      lead to global OOM situation.  Also, just returning -ENOMEM will be risky
      because OOM path is lost and some paths (e.g.  get_user_pages()) will leak
      -ENOMEM.  Therefore, invoking memcg OOM killer (despite GFP_NOFS) will be
      the only choice we can choose for now.
      
      Until 29ef680a, we were able to invoke memcg OOM killer when
      GFP_KERNEL reclaim failed [1].  But since 29ef680a, we need to
      invoke memcg OOM killer when GFP_NOFS reclaim failed [2].  Although in the
      past we did invoke memcg OOM killer for GFP_NOFS [3], we might get
      pre-mature memcg OOM reports due to this patch.
      
      [1]
      
       leaker invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
       CPU: 0 PID: 2746 Comm: leaker Not tainted 4.18.0+ #19
       Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
       Call Trace:
        dump_stack+0x63/0x88
        dump_header+0x67/0x27a
        ? mem_cgroup_scan_tasks+0x91/0xf0
        oom_kill_process+0x210/0x410
        out_of_memory+0x10a/0x2c0
        mem_cgroup_out_of_memory+0x46/0x80
        mem_cgroup_oom_synchronize+0x2e4/0x310
        ? high_work_func+0x20/0x20
        pagefault_out_of_memory+0x31/0x76
        mm_fault_error+0x55/0x115
        ? handle_mm_fault+0xfd/0x220
        __do_page_fault+0x433/0x4e0
        do_page_fault+0x22/0x30
        ? page_fault+0x8/0x30
        page_fault+0x1e/0x30
       RIP: 0033:0x4009f0
       Code: 03 00 00 00 e8 71 fd ff ff 48 83 f8 ff 49 89 c6 74 74 48 89 c6 bf c0 0c 40 00 31 c0 e8 69 fd ff ff 45 85 ff 7e 21 31 c9 66 90 <41> 0f be 14 0e 01 d3 f7 c1 ff 0f 00 00 75 05 41 c6 04 0e 2a 48 83
       RSP: 002b:00007ffe29ae96f0 EFLAGS: 00010206
       RAX: 000000000000001b RBX: 0000000000000000 RCX: 0000000001ce1000
       RDX: 0000000000000000 RSI: 000000007fffffe5 RDI: 0000000000000000
       RBP: 000000000000000c R08: 0000000000000000 R09: 00007f94be09220d
       R10: 0000000000000002 R11: 0000000000000246 R12: 00000000000186a0
       R13: 0000000000000003 R14: 00007f949d845000 R15: 0000000002800000
       Task in /leaker killed as a result of limit of /leaker
       memory: usage 524288kB, limit 524288kB, failcnt 158965
       memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
       kmem: usage 2016kB, limit 9007199254740988kB, failcnt 0
       Memory cgroup stats for /leaker: cache:844KB rss:521136KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:132KB writeback:0KB inactive_anon:0KB active_anon:521224KB inactive_file:1012KB active_file:8KB unevictable:0KB
       Memory cgroup out of memory: Kill process 2746 (leaker) score 998 or sacrifice child
       Killed process 2746 (leaker) total-vm:536704kB, anon-rss:521176kB, file-rss:1208kB, shmem-rss:0kB
       oom_reaper: reaped process 2746 (leaker), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
      
      [2]
      
       leaker invoked oom-killer: gfp_mask=0x600040(GFP_NOFS), nodemask=(null), order=0, oom_score_adj=0
       CPU: 1 PID: 2746 Comm: leaker Not tainted 4.18.0+ #20
       Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
       Call Trace:
        dump_stack+0x63/0x88
        dump_header+0x67/0x27a
        ? mem_cgroup_scan_tasks+0x91/0xf0
        oom_kill_process+0x210/0x410
        out_of_memory+0x109/0x2d0
        mem_cgroup_out_of_memory+0x46/0x80
        try_charge+0x58d/0x650
        ? __radix_tree_replace+0x81/0x100
        mem_cgroup_try_charge+0x7a/0x100
        __add_to_page_cache_locked+0x92/0x180
        add_to_page_cache_lru+0x4d/0xf0
        iomap_readpages_actor+0xde/0x1b0
        ? iomap_zero_range_actor+0x1d0/0x1d0
        iomap_apply+0xaf/0x130
        iomap_readpages+0x9f/0x150
        ? iomap_zero_range_actor+0x1d0/0x1d0
        xfs_vm_readpages+0x18/0x20 [xfs]
        read_pages+0x60/0x140
        __do_page_cache_readahead+0x193/0x1b0
        ondemand_readahead+0x16d/0x2c0
        page_cache_async_readahead+0x9a/0xd0
        filemap_fault+0x403/0x620
        ? alloc_set_pte+0x12c/0x540
        ? _cond_resched+0x14/0x30
        __xfs_filemap_fault+0x66/0x180 [xfs]
        xfs_filemap_fault+0x27/0x30 [xfs]
        __do_fault+0x19/0x40
        __handle_mm_fault+0x8e8/0xb60
        handle_mm_fault+0xfd/0x220
        __do_page_fault+0x238/0x4e0
        do_page_fault+0x22/0x30
        ? page_fault+0x8/0x30
        page_fault+0x1e/0x30
       RIP: 0033:0x4009f0
       Code: 03 00 00 00 e8 71 fd ff ff 48 83 f8 ff 49 89 c6 74 74 48 89 c6 bf c0 0c 40 00 31 c0 e8 69 fd ff ff 45 85 ff 7e 21 31 c9 66 90 <41> 0f be 14 0e 01 d3 f7 c1 ff 0f 00 00 75 05 41 c6 04 0e 2a 48 83
       RSP: 002b:00007ffda45c9290 EFLAGS: 00010206
       RAX: 000000000000001b RBX: 0000000000000000 RCX: 0000000001a1e000
       RDX: 0000000000000000 RSI: 000000007fffffe5 RDI: 0000000000000000
       RBP: 000000000000000c R08: 0000000000000000 R09: 00007f6d061ff20d
       R10: 0000000000000002 R11: 0000000000000246 R12: 00000000000186a0
       R13: 0000000000000003 R14: 00007f6ce59b2000 R15: 0000000002800000
       Task in /leaker killed as a result of limit of /leaker
       memory: usage 524288kB, limit 524288kB, failcnt 7221
       memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
       kmem: usage 1944kB, limit 9007199254740988kB, failcnt 0
       Memory cgroup stats for /leaker: cache:3632KB rss:518232KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB inactive_anon:0KB active_anon:518408KB inactive_file:3908KB active_file:12KB unevictable:0KB
       Memory cgroup out of memory: Kill process 2746 (leaker) score 992 or sacrifice child
       Killed process 2746 (leaker) total-vm:536704kB, anon-rss:518264kB, file-rss:1188kB, shmem-rss:0kB
       oom_reaper: reaped process 2746 (leaker), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
      
      [3]
      
       leaker invoked oom-killer: gfp_mask=0x50, order=0, oom_score_adj=0
       leaker cpuset=/ mems_allowed=0
       CPU: 1 PID: 3206 Comm: leaker Not tainted 3.10.0-957.27.2.el7.x86_64 #1
       Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
       Call Trace:
        [<ffffffffaf364147>] dump_stack+0x19/0x1b
        [<ffffffffaf35eb6a>] dump_header+0x90/0x229
        [<ffffffffaedbb456>] ? find_lock_task_mm+0x56/0xc0
        [<ffffffffaee32a38>] ? try_get_mem_cgroup_from_mm+0x28/0x60
        [<ffffffffaedbb904>] oom_kill_process+0x254/0x3d0
        [<ffffffffaee36c36>] mem_cgroup_oom_synchronize+0x546/0x570
        [<ffffffffaee360b0>] ? mem_cgroup_charge_common+0xc0/0xc0
        [<ffffffffaedbc194>] pagefault_out_of_memory+0x14/0x90
        [<ffffffffaf35d072>] mm_fault_error+0x6a/0x157
        [<ffffffffaf3717c8>] __do_page_fault+0x3c8/0x4f0
        [<ffffffffaf371925>] do_page_fault+0x35/0x90
        [<ffffffffaf36d768>] page_fault+0x28/0x30
       Task in /leaker killed as a result of limit of /leaker
       memory: usage 524288kB, limit 524288kB, failcnt 20628
       memory+swap: usage 524288kB, limit 9007199254740988kB, failcnt 0
       kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
       Memory cgroup stats for /leaker: cache:840KB rss:523448KB rss_huge:0KB mapped_file:0KB swap:0KB inactive_anon:0KB active_anon:523448KB inactive_file:464KB active_file:376KB unevictable:0KB
       Memory cgroup out of memory: Kill process 3206 (leaker) score 970 or sacrifice child
       Killed process 3206 (leaker) total-vm:536692kB, anon-rss:523304kB, file-rss:412kB, shmem-rss:0kB
      
      Bisected by Masoud Sharbiani.
      
      Link: http://lkml.kernel.org/r/cbe54ed1-b6ba-a056-8899-2dc42526371d@i-love.sakura.ne.jp
      Fixes: 3da88fb3 ("mm, oom: move GFP_NOFS check to out_of_memory") [necessary after 29ef680a]
      Signed-off-by: default avatarTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reported-by: default avatarMasoud Sharbiani <msharbiani@apple.com>
      Tested-by: default avatarMasoud Sharbiani <msharbiani@apple.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: <stable@vger.kernel.org>	[4.19+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d40b3eaf
    • Bob Peterson's avatar
      gfs2: clear buf_in_tr when ending a transaction in sweep_bh_for_rgrps · e0c1e6e5
      Bob Peterson authored
      commit f0b444b3 upstream.
      
      In function sweep_bh_for_rgrps, which is a helper for punch_hole,
      it uses variable buf_in_tr to keep track of when it needs to commit
      pending block frees on a partial delete that overflows the
      transaction created for the delete. The problem is that the
      variable was initialized at the start of function sweep_bh_for_rgrps
      but it was never cleared, even when starting a new transaction.
      
      This patch reinitializes the variable when the transaction is
      ended, so the next transaction starts out with it cleared.
      
      Fixes: d552a2b9 ("GFS2: Non-recursive delete")
      Cc: stable@vger.kernel.org # v4.12+
      Signed-off-by: default avatarBob Peterson <rpeterso@redhat.com>
      Signed-off-by: default avatarAndreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e0c1e6e5
    • Hans de Goede's avatar
      efifb: BGRT: Improve efifb_bgrt_sanity_check · 3620b06b
      Hans de Goede authored
      commit 51677dfc upstream.
      
      For various reasons, at least with x86 EFI firmwares, the xoffset and
      yoffset in the BGRT info are not always reliable.
      
      Extensive testing has shown that when the info is correct, the
      BGRT image is always exactly centered horizontally (the yoffset variable
      is more variable and not always predictable).
      
      This commit simplifies / improves the bgrt_sanity_check to simply
      check that the BGRT image is exactly centered horizontally and skips
      (re)drawing it when it is not.
      
      This fixes the BGRT image sometimes being drawn in the wrong place.
      
      Cc: stable@vger.kernel.org
      Fixes: 88fe4ceb ("efifb: BGRT: Do not copy the boot graphics for non native resolutions")
      Signed-off-by: default avatarHans de Goede <hdegoede@redhat.com>
      Cc: Peter Jones <pjones@redhat.com>,
      Signed-off-by: default avatarBartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190721131918.10115-1-hdegoede@redhat.comSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3620b06b
    • Mark Brown's avatar
      regulator: Defer init completion for a while after late_initcall · c4f65c2f
      Mark Brown authored
      commit 55576cf1 upstream.
      
      The kernel has no way of knowing when we have finished instantiating
      drivers, between deferred probe and systems that build key drivers as
      modules we might be doing this long after userspace has booted. This has
      always been a bit of an issue with regulator_init_complete since it can
      power off hardware that's not had it's driver loaded which can result in
      user visible effects, the main case is powering off displays. Practically
      speaking it's not been an issue in real systems since most systems that
      use the regulator API are embedded and build in key drivers anyway but
      with Arm laptops coming on the market it's becoming more of an issue so
      let's do something about it.
      
      In the absence of any better idea just defer the powering off for 30s
      after late_initcall(), this is obviously a hack but it should mask the
      issue for now and it's no more arbitrary than late_initcall() itself.
      Ideally we'd have some heuristics to detect if we're on an affected
      system and tune or skip the delay appropriately, and there may be some
      need for a command line option to be added.
      
      Link: https://lore.kernel.org/r/20190904124250.25844-1-broonie@kernel.orgSigned-off-by: default avatarMark Brown <broonie@kernel.org>
      Tested-by: default avatarLee Jones <lee.jones@linaro.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c4f65c2f
    • Thadeu Lima de Souza Cascardo's avatar
      alarmtimer: Use EOPNOTSUPP instead of ENOTSUPP · 3784576f
      Thadeu Lima de Souza Cascardo authored
      commit f18ddc13 upstream.
      
      ENOTSUPP is not supposed to be returned to userspace. This was found on an
      OpenPower machine, where the RTC does not support set_alarm.
      
      On that system, a clock_nanosleep(CLOCK_REALTIME_ALARM, ...) results in
      "524 Unknown error 524"
      
      Replace it with EOPNOTSUPP which results in the expected "95 Operation not
      supported" error.
      
      Fixes: 1c6b39ad (alarmtimers: Return -ENOTSUPP if no RTC device is present)
      Signed-off-by: default avatarThadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190903171802.28314-1-cascardo@canonical.comSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3784576f
    • Shawn Lin's avatar
      arm64: dts: rockchip: limit clock rate of MMC controllers for RK3328 · 174bbcc5
      Shawn Lin authored
      commit 03e61929 upstream.
      
      150MHz is a fundamental limitation of RK3328 Soc, w/o this limitation,
      eMMC, for instance, will run into 200MHz clock rate in HS200 mode, which
      makes the RK3328 boards not always boot properly. By adding it in
      rk3328.dtsi would also obviate the worry of missing it when adding new
      boards.
      
      Fixes: 52e02d37 ("arm64: dts: rockchip: add core dtsi file for RK3328 SoCs")
      Cc: stable@vger.kernel.org
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Liang Chen <cl@rock-chips.com>
      Signed-off-by: default avatarShawn Lin <shawn.lin@rock-chips.com>
      Signed-off-by: default avatarHeiko Stuebner <heiko@sntech.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      174bbcc5
    • Will Deacon's avatar
      arm64: tlb: Ensure we execute an ISB following walk cache invalidation · 8cfe3b8a
      Will Deacon authored
      commit 51696d34 upstream.
      
      05f2d2f8 ("arm64: tlbflush: Introduce __flush_tlb_kernel_pgtable")
      added a new TLB invalidation helper which is used when freeing
      intermediate levels of page table used for kernel mappings, but is
      missing the required ISB instruction after completion of the TLBI
      instruction.
      
      Add the missing barrier.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 05f2d2f8 ("arm64: tlbflush: Introduce __flush_tlb_kernel_pgtable")
      Reviewed-by: default avatarMark Rutland <mark.rutland@arm.com>
      Signed-off-by: default avatarWill Deacon <will@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8cfe3b8a
    • Will Deacon's avatar
      Revert "arm64: Remove unnecessary ISBs from set_{pte,pmd,pud}" · fc7d6bfd
      Will Deacon authored
      commit d0b7a302 upstream.
      
      This reverts commit 24fe1b0e.
      
      Commit 24fe1b0e ("arm64: Remove unnecessary ISBs from
      set_{pte,pmd,pud}") removed ISB instructions immediately following updates
      to the page table, on the grounds that they are not required by the
      architecture and a DSB alone is sufficient to ensure that subsequent data
      accesses use the new translation:
      
        DDI0487E_a, B2-128:
      
        | ... no instruction that appears in program order after the DSB
        | instruction can alter any state of the system or perform any part of
        | its functionality until the DSB completes other than:
        |
        | * Being fetched from memory and decoded
        | * Reading the general-purpose, SIMD and floating-point,
        |   Special-purpose, or System registers that are directly or indirectly
        |   read without causing side-effects.
      
      However, the same document also states the following:
      
        DDI0487E_a, B2-125:
      
        | DMB and DSB instructions affect reads and writes to the memory system
        | generated by Load/Store instructions and data or unified cache
        | maintenance instructions being executed by the PE. Instruction fetches
        | or accesses caused by a hardware translation table access are not
        | explicit accesses.
      
      which appears to claim that the DSB alone is insufficient.  Unfortunately,
      some CPU designers have followed the second clause above, whereas in Linux
      we've been relying on the first. This means that our mapping sequence:
      
      	MOV	X0, <valid pte>
      	STR	X0, [Xptep]	// Store new PTE to page table
      	DSB	ISHST
      	LDR	X1, [X2]	// Translates using the new PTE
      
      can actually raise a translation fault on the load instruction because the
      translation can be performed speculatively before the page table update and
      then marked as "faulting" by the CPU. For user PTEs, this is ok because we
      can handle the spurious fault, but for kernel PTEs and intermediate table
      entries this results in a panic().
      
      Revert the offending commit to reintroduce the missing barriers.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 24fe1b0e ("arm64: Remove unnecessary ISBs from set_{pte,pmd,pud}")
      Reviewed-by: default avatarMark Rutland <mark.rutland@arm.com>
      Signed-off-by: default avatarWill Deacon <will@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fc7d6bfd
    • Luis Araneda's avatar
      ARM: zynq: Use memcpy_toio instead of memcpy on smp bring-up · 881edc16
      Luis Araneda authored
      commit b7005d4e upstream.
      
      This fixes a kernel panic on memcpy when
      FORTIFY_SOURCE is enabled.
      
      The initial smp implementation on commit aa7eb2bb
      ("arm: zynq: Add smp support")
      used memcpy, which worked fine until commit ee333554
      ("ARM: 8749/1: Kconfig: Add ARCH_HAS_FORTIFY_SOURCE")
      enabled overflow checks at runtime, producing a read
      overflow panic.
      
      The computed size of memcpy args are:
      - p_size (dst): 4294967295 = (size_t) -1
      - q_size (src): 1
      - size (len): 8
      
      Additionally, the memory is marked as __iomem, so one of
      the memcpy_* functions should be used for read/write.
      
      Fixes: aa7eb2bb ("arm: zynq: Add smp support")
      Signed-off-by: default avatarLuis Araneda <luaraneda@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarMichal Simek <michal.simek@xilinx.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      881edc16
    • Lihua Yao's avatar
      ARM: samsung: Fix system restart on S3C6410 · 22092794
      Lihua Yao authored
      commit 16986074 upstream.
      
      S3C6410 system restart is triggered by watchdog reset.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 9f55342c ("ARM: dts: s3c64xx: Fix infinite interrupt in soft mode")
      Signed-off-by: default avatarLihua Yao <ylhuajnu@outlook.com>
      Signed-off-by: default avatarKrzysztof Kozlowski <krzk@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      22092794
    • Amadeusz Sławiński's avatar
      ASoC: Intel: Fix use of potentially uninitialized variable · ad884155
      Amadeusz Sławiński authored
      commit 810f3b86 upstream.
      
      If ipc->ops.reply_msg_match is NULL, we may end up using uninitialized
      mask value.
      
      reported by smatch:
      sound/soc/intel/common/sst-ipc.c:266 sst_ipc_reply_find_msg() error: uninitialized symbol 'mask'.
      Signed-off-by: default avatarAmadeusz Sławiński <amadeuszx.slawinski@intel.com>
      Link: https://lore.kernel.org/r/20190827141712.21015-3-amadeuszx.slawinski@linux.intel.comReviewed-by: default avatarPierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ad884155
    • Amadeusz Sławiński's avatar
      ASoC: Intel: Skylake: Use correct function to access iomem space · 7bdab364
      Amadeusz Sławiński authored
      commit 17d29ff9 upstream.
      
      For copying from __iomem, we should use __ioread32_copy.
      
      reported by sparse:
      sound/soc/intel/skylake/skl-debug.c:437:34: warning: incorrect type in argument 1 (different address spaces)
      sound/soc/intel/skylake/skl-debug.c:437:34:    expected void [noderef] <asn:2> *to
      sound/soc/intel/skylake/skl-debug.c:437:34:    got unsigned char *
      sound/soc/intel/skylake/skl-debug.c:437:51: warning: incorrect type in argument 2 (different address spaces)
      sound/soc/intel/skylake/skl-debug.c:437:51:    expected void const *from
      sound/soc/intel/skylake/skl-debug.c:437:51:    got void [noderef] <asn:2> *[assigned] fw_reg_addr
      Signed-off-by: default avatarAmadeusz Sławiński <amadeuszx.slawinski@intel.com>
      Link: https://lore.kernel.org/r/20190827141712.21015-2-amadeuszx.slawinski@linux.intel.comReviewed-by: default avatarPierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7bdab364
    • Amadeusz Sławiński's avatar
      ASoC: Intel: NHLT: Fix debug print format · 3c54f463
      Amadeusz Sławiński authored
      commit 855a06da upstream.
      
      oem_table_id is 8 chars long, so we need to limit it, otherwise it
      may print some unprintable characters into dmesg.
      Signed-off-by: default avatarAmadeusz Sławiński <amadeuszx.slawinski@intel.com>
      Link: https://lore.kernel.org/r/20190827141712.21015-7-amadeuszx.slawinski@linux.intel.comReviewed-by: default avatarPierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3c54f463
    • Kees Cook's avatar
      binfmt_elf: Do not move brk for INTERP-less ET_EXEC · 29ecf8ca
      Kees Cook authored
      commit 7be3cb01 upstream.
      
      When brk was moved for binaries without an interpreter, it should have
      been limited to ET_DYN only. In other words, the special case was an
      ET_DYN that lacks an INTERP, not just an executable that lacks INTERP.
      The bug manifested for giant static executables, where the brk would end
      up in the middle of the text area on 32-bit architectures.
      Reported-and-tested-by: default avatarRichard Kojedzinszky <richard@kojedz.in>
      Fixes: bbdc6076 ("binfmt_elf: move brk out of mmap when doing direct loader exec")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      29ecf8ca
    • Arnd Bergmann's avatar
      media: don't drop front-end reference count for ->detach · 02ef5c29
      Arnd Bergmann authored
      commit 14e3cdbb upstream.
      
      A bugfix introduce a link failure in configurations without CONFIG_MODULES:
      
      In file included from drivers/media/usb/dvb-usb/pctv452e.c:20:0:
      drivers/media/usb/dvb-usb/pctv452e.c: In function 'pctv452e_frontend_attach':
      drivers/media/dvb-frontends/stb0899_drv.h:151:36: error: weak declaration of 'stb0899_attach' being applied to a already existing, static definition
      
      The problem is that the !IS_REACHABLE() declaration of stb0899_attach()
      is a 'static inline' definition that clashes with the weak definition.
      
      I further observed that the bugfix was only done for one of the five users
      of stb0899_attach(), the other four still have the problem.  This reverts
      the bugfix and instead addresses the problem by not dropping the reference
      count when calling '->detach()', instead we call this function directly
      in dvb_frontend_put() before dropping the kref on the front-end.
      
      I first submitted this in early 2018, and after some discussion it
      was apparently discarded.  While there is a long-term plan in place,
      that plan is obviously not nearing completion yet, and the current
      kernel is still broken unless this patch is applied.
      
      Link: https://patchwork.kernel.org/patch/10140175/
      Link: https://patchwork.linuxtv.org/patch/54831/
      
      Cc: Max Kellermann <max.kellermann@gmail.com>
      Cc: Wolfgang Rohdewald <wolfgang@rohdewald.de>
      Cc: stable@vger.kernel.org
      Fixes: f686c143 ("[media] stb0899: move code to "detach" callback")
      Fixes: 6cdeaed3 ("media: dvb_usb_pctv452e: module refcount changes were unbalanced")
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarSean Young <sean@mess.org>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      02ef5c29
    • Hans de Goede's avatar
      media: sn9c20x: Add MSI MS-1039 laptop to flip_dmi_table · 589ca8ec
      Hans de Goede authored
      commit 7e0bb582 upstream.
      
      Like a bunch of other MSI laptops the MS-1039 uses a 0c45:627b
      SN9C201 + OV7660 webcam which is mounted upside down.
      
      Add it to the sn9c20x flip_dmi_table to deal with this.
      
      Cc: stable@vger.kernel.org
      Reported-by: default avatarRui Salvaterra <rsalvaterra@gmail.com>
      Signed-off-by: default avatarHans de Goede <hdegoede@redhat.com>
      Signed-off-by: default avatarHans Verkuil <hverkuil-cisco@xs4all.nl>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      589ca8ec
    • Sean Christopherson's avatar
      KVM: x86: Manually calculate reserved bits when loading PDPTRS · 496cf984
      Sean Christopherson authored
      commit 16cfacc8 upstream.
      
      Manually generate the PDPTR reserved bit mask when explicitly loading
      PDPTRs.  The reserved bits that are being tracked by the MMU reflect the
      current paging mode, which is unlikely to be PAE paging in the vast
      majority of flows that use load_pdptrs(), e.g. CR0 and CR4 emulation,
      __set_sregs(), etc...  This can cause KVM to incorrectly signal a bad
      PDPTR, or more likely, miss a reserved bit check and subsequently fail
      a VM-Enter due to a bad VMCS.GUEST_PDPTR.
      
      Add a one off helper to generate the reserved bits instead of sharing
      code across the MMU's calculations and the PDPTR emulation.  The PDPTR
      reserved bits are basically set in stone, and pushing a helper into
      the MMU's calculation adds unnecessary complexity without improving
      readability.
      
      Oppurtunistically fix/update the comment for load_pdptrs().
      
      Note, the buggy commit also introduced a deliberate functional change,
      "Also remove bit 5-6 from rsvd_bits_mask per latest SDM.", which was
      effectively (and correctly) reverted by commit cd9ae5fe ("KVM: x86:
      Fix page-tables reserved bits").  A bit of SDM archaeology shows that
      the SDM from late 2008 had a bug (likely a copy+paste error) where it
      listed bits 6:5 as AVL and A for PDPTEs used for 4k entries but reserved
      for 2mb entries.  I.e. the SDM contradicted itself, and bits 6:5 are and
      always have been reserved.
      
      Fixes: 20c466b5 ("KVM: Use rsvd_bits_mask in load_pdptrs()")
      Cc: stable@vger.kernel.org
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Reported-by: default avatarDoug Reiland <doug.reiland@intel.com>
      Signed-off-by: default avatarSean Christopherson <sean.j.christopherson@intel.com>
      Reviewed-by: default avatarPeter Xu <peterx@redhat.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      496cf984
    • Jan Dakinevich's avatar
      KVM: x86: set ctxt->have_exception in x86_decode_insn() · 933e3e2b
      Jan Dakinevich authored
      commit c8848cee upstream.
      
      x86_emulate_instruction() takes into account ctxt->have_exception flag
      during instruction decoding, but in practice this flag is never set in
      x86_decode_insn().
      
      Fixes: 6ea6e843 ("KVM: x86: inject exceptions produced by x86_decode_insn")
      Cc: stable@vger.kernel.org
      Cc: Denis Lunev <den@virtuozzo.com>
      Cc: Roman Kagan <rkagan@virtuozzo.com>
      Cc: Denis Plotnikov <dplotnikov@virtuozzo.com>
      Signed-off-by: default avatarJan Dakinevich <jan.dakinevich@virtuozzo.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      933e3e2b
    • Jan Dakinevich's avatar
      KVM: x86: always stop emulation on page fault · 9723e445
      Jan Dakinevich authored
      commit 8530a79c upstream.
      
      inject_emulated_exception() returns true if and only if nested page
      fault happens. However, page fault can come from guest page tables
      walk, either nested or not nested. In both cases we should stop an
      attempt to read under RIP and give guest to step over its own page
      fault handler.
      
      This is also visible when an emulated instruction causes a #GP fault
      and the VMware backdoor is enabled.  To handle the VMware backdoor,
      KVM intercepts #GP faults; with only the next patch applied,
      x86_emulate_instruction() injects a #GP but returns EMULATE_FAIL
      instead of EMULATE_DONE.   EMULATE_FAIL causes handle_exception_nmi()
      (or gp_interception() for SVM) to re-inject the original #GP because it
      thinks emulation failed due to a non-VMware opcode.  This patch prevents
      the issue as x86_emulate_instruction() will return EMULATE_DONE after
      injecting the #GP.
      
      Fixes: 6ea6e843 ("KVM: x86: inject exceptions produced by x86_decode_insn")
      Cc: stable@vger.kernel.org
      Cc: Denis Lunev <den@virtuozzo.com>
      Cc: Roman Kagan <rkagan@virtuozzo.com>
      Cc: Denis Plotnikov <dplotnikov@virtuozzo.com>
      Signed-off-by: default avatarJan Dakinevich <jan.dakinevich@virtuozzo.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9723e445
    • Helge Deller's avatar
      parisc: Disable HP HSC-PCI Cards to prevent kernel crash · 8225db4a
      Helge Deller authored
      commit 5fa16591 upstream.
      
      The HP Dino PCI controller chip can be used in two variants: as on-board
      controller (e.g. in B160L), or on an Add-On card ("Card-Mode") to bridge
      PCI components to systems without a PCI bus, e.g. to a HSC/GSC bus.  One
      such Add-On card is the HP HSC-PCI Card which has one or more DEC Tulip
      PCI NIC chips connected to the on-card Dino PCI controller.
      
      Dino in Card-Mode has a big disadvantage: All PCI memory accesses need
      to go through the DINO_MEM_DATA register, so Linux drivers will not be
      able to use the ioremap() function. Without ioremap() many drivers will
      not work, one example is the tulip driver which then simply crashes the
      kernel if it tries to access the ports on the HP HSC card.
      
      This patch disables the HP HSC card if it finds one, and as such
      fixes the kernel crash on a HP D350/2 machine.
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      Noticed-by: default avatarPhil Scarr <phil.scarr@pm.me>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8225db4a
    • Vasily Averin's avatar
      fuse: fix missing unlock_page in fuse_writepage() · ad411629
      Vasily Averin authored
      commit d5880c7a upstream.
      
      unlock_page() was missing in case of an already in-flight write against the
      same page.
      Signed-off-by: default avatarVasily Averin <vvs@virtuozzo.com>
      Fixes: ff17be08 ("fuse: writepage: skip already in flight")
      Cc: <stable@vger.kernel.org> # v3.13
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ad411629
    • Madhavan Srinivasan's avatar
      powerpc/imc: Dont create debugfs files for cpu-less nodes · ecfe4b5f
      Madhavan Srinivasan authored
      commit 41ba17f2 upstream.
      
      Commit <684d9840> ('powerpc/powernv: Add debugfs interface for
      imc-mode and imc') added debugfs interface for the nest imc pmu
      devices to support changing of different ucode modes. Primarily adding
      this capability for debug. But when doing so, the code did not
      consider the case of cpu-less nodes. So when reading the _cmd_ or
      _mode_ file of a cpu-less node will create this crash.
      
        Faulting instruction address: 0xc0000000000d0d58
        Oops: Kernel access of bad area, sig: 11 [#1]
        ...
        CPU: 67 PID: 5301 Comm: cat Not tainted 5.2.0-rc6-next-20190627+ #19
        NIP:  c0000000000d0d58 LR: c00000000049aa18 CTR:c0000000000d0d50
        REGS: c00020194548f9e0 TRAP: 0300   Not tainted  (5.2.0-rc6-next-20190627+)
        MSR:  9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR:28022822  XER: 00000000
        CFAR: c00000000049aa14 DAR: 000000000003fc08 DSISR:40000000 IRQMASK: 0
        ...
        NIP imc_mem_get+0x8/0x20
        LR  simple_attr_read+0x118/0x170
        Call Trace:
          simple_attr_read+0x70/0x170 (unreliable)
          debugfs_attr_read+0x6c/0xb0
          __vfs_read+0x3c/0x70
           vfs_read+0xbc/0x1a0
          ksys_read+0x7c/0x140
          system_call+0x5c/0x70
      
      Patch fixes the issue with a more robust check for vbase to NULL.
      
      Before patch, ls output for the debugfs imc directory
      
        # ls /sys/kernel/debug/powerpc/imc/
        imc_cmd_0    imc_cmd_251  imc_cmd_253  imc_cmd_255  imc_mode_0    imc_mode_251  imc_mode_253  imc_mode_255
        imc_cmd_250  imc_cmd_252  imc_cmd_254  imc_cmd_8    imc_mode_250  imc_mode_252  imc_mode_254  imc_mode_8
      
      After patch, ls output for the debugfs imc directory
      
        # ls /sys/kernel/debug/powerpc/imc/
        imc_cmd_0  imc_cmd_8  imc_mode_0  imc_mode_8
      
      Actual bug here is that, we have two loops with potentially different
      loop counts. That is, in imc_get_mem_addr_nest(), loop count is
      obtained from the dt entries. But in case of export_imc_mode_and_cmd(),
      loop was based on for_each_nid() count. Patch fixes the loop count in
      latter based on the struct mem_info. Ideally it would be better to
      have array size in struct imc_pmu.
      
      Fixes: 684d9840 ('powerpc/powernv: Add debugfs interface for imc-mode and imc')
      Reported-by: default avatarQian Cai <cai@lca.pw>
      Suggested-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarMadhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20190827101635.6942-1-maddy@linux.vnet.ibm.com
      Cc: Jan Stancek <jstancek@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ecfe4b5f
    • Ming Lei's avatar
      scsi: implement .cleanup_rq callback · e94443fc
      Ming Lei authored
      [ Upstream commit b7e9e1fb ]
      
      Implement .cleanup_rq() callback for freeing driver private part
      of the request. Then we can avoid to leak this part if the request isn't
      completed by SCSI, and freed by blk-mq or upper layer(such as dm-rq) finally.
      
      Cc: Ewan D. Milne <emilne@redhat.com>
      Cc: Bart Van Assche <bvanassche@acm.org>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: dm-devel@redhat.com
      Cc: <stable@vger.kernel.org>
      Fixes: 396eaf21 ("blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback")
      Signed-off-by: default avatarMing Lei <ming.lei@redhat.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      e94443fc
    • Ming Lei's avatar
      blk-mq: add callback of .cleanup_rq · 4ec3ca27
      Ming Lei authored
      [ Upstream commit 226b4fc7 ]
      
      SCSI maintains its own driver private data hooked off of each SCSI
      request, and the pridate data won't be freed after scsi_queue_rq()
      returns BLK_STS_RESOURCE or BLK_STS_DEV_RESOURCE. An upper layer driver
      (e.g. dm-rq) may need to retry these SCSI requests, before SCSI has
      fully dispatched them, due to a lower level SCSI driver's resource
      limitation identified in scsi_queue_rq(). Currently SCSI's per-request
      private data is leaked when the upper layer driver (dm-rq) frees and
      then retries these requests in response to BLK_STS_RESOURCE or
      BLK_STS_DEV_RESOURCE returns from scsi_queue_rq().
      
      This usecase is so specialized that it doesn't warrant training an
      existing blk-mq interface (e.g. blk_mq_free_request) to allow SCSI to
      account for freeing its driver private data -- doing so would add an
      extra branch for handling a special case that all other consumers of
      SCSI (and blk-mq) won't ever need to worry about.
      
      So the most pragmatic way forward is to delegate freeing SCSI driver
      private data to the upper layer driver (dm-rq).  Do so by adding
      new .cleanup_rq callback and calling a new blk_mq_cleanup_rq() method
      from dm-rq.  A following commit will implement the .cleanup_rq() hook
      in scsi_mq_ops.
      
      Cc: Ewan D. Milne <emilne@redhat.com>
      Cc: Bart Van Assche <bvanassche@acm.org>
      Cc: Hannes Reinecke <hare@suse.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: dm-devel@redhat.com
      Cc: <stable@vger.kernel.org>
      Fixes: 396eaf21 ("blk-mq: improve DM's blk-mq IO merging via blk_insert_cloned_request feedback")
      Signed-off-by: default avatarMing Lei <ming.lei@redhat.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      4ec3ca27
    • Jan-Marek Glogowski's avatar
      ALSA: hda/realtek - PCI quirk for Medion E4254 · 4848fb93
      Jan-Marek Glogowski authored
      [ Upstream commit bd9c10bc ]
      
      The laptop has a combined jack to attach headsets on the right.
      The BIOS encodes them as two different colored jacks at the front,
      but otherwise it seems to be configured ok. But any adaption of
      the pins config on its own doesn't fix the jack detection to work
      in Linux. Still Windows works correct.
      
      This is somehow fixed by chaining ALC256_FIXUP_ASUS_HEADSET_MODE,
      which seems to register the microphone jack as a headset part and
      also results in fixing jack sensing, visible in dmesg as:
      
      -snd_hda_codec_realtek hdaudioC0D0:      Mic=0x19
      +snd_hda_codec_realtek hdaudioC0D0:      Headset Mic=0x19
      
      [ Actually the essential change is the location of the jack; the
        driver created "Front Mic Jack" without the matching volume / mute
        control element due to its jack location, which confused PA.
        -- tiwai ]
      Signed-off-by: default avatarJan-Marek Glogowski <glogow@fbihome.de>
      Cc: <stable@vger.kernel.org>
      Link: https://lore.kernel.org/r/8f4f9b20-0aeb-f8f1-c02f-fd53c09679f1@fbihome.deSigned-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      4848fb93
    • Yan, Zheng's avatar
      ceph: use ceph_evict_inode to cleanup inode's resource · e9bcaf82
      Yan, Zheng authored
      [ Upstream commit 87bc5b89 ]
      
      remove_session_caps() relies on __wait_on_freeing_inode(), to wait for
      freeing inode to remove its caps. But VFS wakes freeing inode waiters
      before calling destroy_inode().
      
      [ jlayton: mainline moved to ->free_inode before the original patch was
      	   merged. This backport reinstates ceph_destroy_inode and just
      	   has it do the call_rcu call. ]
      
      Cc: stable@vger.kernel.org
      Link: https://tracker.ceph.com/issues/40102Signed-off-by: default avatar"Yan, Zheng" <zyan@redhat.com>
      Reviewed-by: default avatarJeff Layton <jlayton@redhat.com>
      Signed-off-by: default avatarIlya Dryomov <idryomov@gmail.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      e9bcaf82
    • Sasha Levin's avatar
      Revert "ceph: use ceph_evict_inode to cleanup inode's resource" · 72f0fff3
      Sasha Levin authored
      This reverts commit 81281039.
      
      The backport was incorrect and was causing kernel panics. Revert and
      re-apply a correct backport from Jeff Layton.
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      72f0fff3
    • Joonwon Kang's avatar
      randstruct: Check member structs in is_pure_ops_struct() · 98dc6d95
      Joonwon Kang authored
      commit 60f2c82e upstream.
      
      While no uses in the kernel triggered this case, it was possible to have
      a false negative where a struct contains other structs which contain only
      function pointers because of unreachable code in is_pure_ops_struct().
      Signed-off-by: default avatarJoonwon Kang <kjw1627@gmail.com>
      Link: https://lore.kernel.org/r/20190727155841.GA13586@host
      Fixes: 313dd1b6 ("gcc-plugins: Add the randstruct plugin")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      98dc6d95
    • Ira Weiny's avatar
      IB/hfi1: Define variables as unsigned long to fix KASAN warning · ad6819cd
      Ira Weiny authored
      commit f8659d68 upstream.
      
      Define the working variables to be unsigned long to be compatible with
      for_each_set_bit and change types as needed.
      
      While we are at it remove unused variables from a couple of functions.
      
      This was found because of the following KASAN warning:
       ==================================================================
         BUG: KASAN: stack-out-of-bounds in find_first_bit+0x19/0x70
         Read of size 8 at addr ffff888362d778d0 by task kworker/u308:2/1889
      
         CPU: 21 PID: 1889 Comm: kworker/u308:2 Tainted: G W         5.3.0-rc2-mm1+ #2
         Hardware name: Intel Corporation W2600CR/W2600CR, BIOS SE5C600.86B.02.04.0003.102320141138 10/23/2014
         Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core]
         Call Trace:
          dump_stack+0x9a/0xf0
          ? find_first_bit+0x19/0x70
          print_address_description+0x6c/0x332
          ? find_first_bit+0x19/0x70
          ? find_first_bit+0x19/0x70
          __kasan_report.cold.6+0x1a/0x3b
          ? find_first_bit+0x19/0x70
          kasan_report+0xe/0x12
          find_first_bit+0x19/0x70
          pma_get_opa_portstatus+0x5cc/0xa80 [hfi1]
          ? ret_from_fork+0x3a/0x50
          ? pma_get_opa_port_ectrs+0x200/0x200 [hfi1]
          ? stack_trace_consume_entry+0x80/0x80
          hfi1_process_mad+0x39b/0x26c0 [hfi1]
          ? __lock_acquire+0x65e/0x21b0
          ? clear_linkup_counters+0xb0/0xb0 [hfi1]
          ? check_chain_key+0x1d7/0x2e0
          ? lock_downgrade+0x3a0/0x3a0
          ? match_held_lock+0x2e/0x250
          ib_mad_recv_done+0x698/0x15e0 [ib_core]
          ? clear_linkup_counters+0xb0/0xb0 [hfi1]
          ? ib_mad_send_done+0xc80/0xc80 [ib_core]
          ? mark_held_locks+0x79/0xa0
          ? _raw_spin_unlock_irqrestore+0x44/0x60
          ? rvt_poll_cq+0x1e1/0x340 [rdmavt]
          __ib_process_cq+0x97/0x100 [ib_core]
          ib_cq_poll_work+0x31/0xb0 [ib_core]
          process_one_work+0x4ee/0xa00
          ? pwq_dec_nr_in_flight+0x110/0x110
          ? do_raw_spin_lock+0x113/0x1d0
          worker_thread+0x57/0x5a0
          ? process_one_work+0xa00/0xa00
          kthread+0x1bb/0x1e0
          ? kthread_create_on_node+0xc0/0xc0
          ret_from_fork+0x3a/0x50
      
         The buggy address belongs to the page:
         page:ffffea000d8b5dc0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0
         flags: 0x17ffffc0000000()
         raw: 0017ffffc0000000 0000000000000000 ffffea000d8b5dc8 0000000000000000
         raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
         page dumped because: kasan: bad access detected
      
         addr ffff888362d778d0 is located in stack of task kworker/u308:2/1889 at offset 32 in frame:
          pma_get_opa_portstatus+0x0/0xa80 [hfi1]
      
         this frame has 1 object:
          [32, 36) 'vl_select_mask'
      
         Memory state around the buggy address:
          ffff888362d77780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
          ffff888362d77800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
         >ffff888362d77880: 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 f2 f2 00 00
                                                          ^
          ffff888362d77900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
          ffff888362d77980: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 f2 f2
      
       ==================================================================
      
      Cc: <stable@vger.kernel.org>
      Fixes: 77241056 ("IB/hfi1: add driver files")
      Link: https://lore.kernel.org/r/20190911113053.126040.47327.stgit@awfm-01.aw.intel.comReviewed-by: default avatarMike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: default avatarIra Weiny <ira.weiny@intel.com>
      Signed-off-by: default avatarKaike Wan <kaike.wan@intel.com>
      Signed-off-by: default avatarDennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ad6819cd
    • Danit Goldberg's avatar
      IB/mlx5: Free mpi in mp_slave mode · a924850c
      Danit Goldberg authored
      commit 5d44adeb upstream.
      
      ib_add_slave_port() allocates a multiport struct but never frees it.
      Don't leak memory, free the allocated mpi struct during driver unload.
      
      Cc: <stable@vger.kernel.org>
      Fixes: 32f69e4b ("{net, IB}/mlx5: Manage port association for multiport RoCE")
      Link: https://lore.kernel.org/r/20190916064818.19823-3-leon@kernel.orgSigned-off-by: default avatarDanit Goldberg <danitg@mellanox.com>
      Reviewed-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a924850c
    • Vincent Whitchurch's avatar
      printk: Do not lose last line in kmsg buffer dump · 40b07199
      Vincent Whitchurch authored
      commit c9dccacf upstream.
      
      kmsg_dump_get_buffer() is supposed to select all the youngest log
      messages which fit into the provided buffer.  It determines the correct
      start index by using msg_print_text() with a NULL buffer to calculate
      the size of each entry.  However, when performing the actual writes,
      msg_print_text() only writes the entry to the buffer if the written len
      is lesser than the size of the buffer.  So if the lengths of the
      selected youngest log messages happen to precisely fill up the provided
      buffer, the last log message is not included.
      
      We don't want to modify msg_print_text() to fill up the buffer and start
      returning a length which is equal to the size of the buffer, since
      callers of its other users, such as kmsg_dump_get_line(), depend upon
      the current behaviour.
      
      Instead, fix kmsg_dump_get_buffer() to compensate for this.
      
      For example, with the following two final prints:
      
      [    6.427502] AAAAAAAAAAAAA
      [    6.427769] BBBBBBBB12345
      
      A dump of a 64-byte buffer filled by kmsg_dump_get_buffer(), before this
      patch:
      
       00000000: 3c 30 3e 5b 20 20 20 20 36 2e 35 32 32 31 39 37  <0>[    6.522197
       00000010: 5d 20 41 41 41 41 41 41 41 41 41 41 41 41 41 0a  ] AAAAAAAAAAAAA.
       00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
       00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      
      After this patch:
      
       00000000: 3c 30 3e 5b 20 20 20 20 36 2e 34 35 36 36 37 38  <0>[    6.456678
       00000010: 5d 20 42 42 42 42 42 42 42 42 31 32 33 34 35 0a  ] BBBBBBBB12345.
       00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
       00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      
      Link: http://lkml.kernel.org/r/20190711142937.4083-1-vincent.whitchurch@axis.com
      Fixes: e2ae715d ("kmsg - kmsg_dump() use iterator to receive log buffer content")
      To: rostedt@goodmis.org
      Cc: linux-kernel@vger.kernel.org
      Cc: <stable@vger.kernel.org> # v3.5+
      Signed-off-by: default avatarVincent Whitchurch <vincent.whitchurch@axis.com>
      Reviewed-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: default avatarPetr Mladek <pmladek@suse.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      40b07199