1. 26 Jul, 2019 40 commits
    • Marco Felsch's avatar
      media: coda: fix last buffer handling in V4L2_ENC_CMD_STOP · cc1c3011
      Marco Felsch authored
      [ Upstream commit f3775f89 ]
      
      coda_encoder_cmd() is racy, as the last scheduled picture run worker can
      still be in-flight while the ENC_CMD_STOP command is issued. Depending
      on the exact timing the sequence numbers might already be changed, but
      the last buffer might not have been put on the destination queue yet.
      
      In this case the current implementation would prematurely wake the
      destination queue with last_buffer_dequeued=true, causing userspace to
      call streamoff before the last buffer is handled.
      
      Close this race window by synchronizing with the pic_run_worker before
      doing the sequence check.
      Signed-off-by: default avatarMarco Felsch <m.felsch@pengutronix.de>
      [l.stach@pengutronix.de: switch to flush_work, reword commit message]
      Signed-off-by: default avatarLucas Stach <l.stach@pengutronix.de>
      Signed-off-by: default avatarPhilipp Zabel <p.zabel@pengutronix.de>
      Signed-off-by: default avatarHans Verkuil <hverkuil-cisco@xs4all.nl>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      cc1c3011
    • Philipp Zabel's avatar
      media: coda: fix mpeg2 sequence number handling · b6a853dd
      Philipp Zabel authored
      [ Upstream commit 56d159a4 ]
      
      Sequence number handling assumed that the BIT processor frame number
      starts counting at 1, but this is not true for the MPEG-2 decoder,
      which starts at 0. Fix the sequence counter offset detection to handle
      this.
      Signed-off-by: default avatarPhilipp Zabel <p.zabel@pengutronix.de>
      Signed-off-by: default avatarHans Verkuil <hverkuil-cisco@xs4all.nl>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      b6a853dd
    • Ard Biesheuvel's avatar
      acpi/arm64: ignore 5.1 FADTs that are reported as 5.0 · 67d73cc8
      Ard Biesheuvel authored
      [ Upstream commit 2af22f3e ]
      
      Some Qualcomm Snapdragon based laptops built to run Microsoft Windows
      are clearly ACPI 5.1 based, given that that is the first ACPI revision
      that supports ARM, and introduced the FADT 'arm_boot_flags' field,
      which has a non-zero field on those systems.
      
      So in these cases, infer from the ARM boot flags that the FADT must be
      5.1 or later, and treat it as 5.1.
      Acked-by: default avatarSudeep Holla <sudeep.holla@arm.com>
      Tested-by: default avatarLee Jones <lee.jones@linaro.org>
      Reviewed-by: default avatarGraeme Gregory <graeme.gregory@linaro.org>
      Acked-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Acked-by: default avatarHanjun Guo <guohanjun@huawei.com>
      Signed-off-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      67d73cc8
    • Kuninori Morimoto's avatar
      ASoC: soc-core: call snd_soc_unbind_card() under mutex_lock; · 95783489
      Kuninori Morimoto authored
      [ Upstream commit b545542a ]
      
      commit 34ac3c3e ("ASoC: core: lock client_mutex while removing
      link components") added mutex_lock() at soc_remove_link_components().
      
      Is is called from snd_soc_unbind_card()
      
      	snd_soc_unbind_card()
      =>		soc_remove_link_components()
      		soc_cleanup_card_resources()
      			soc_remove_dai_links()
      =>				soc_remove_link_components()
      
      And, there are 2 way to call it.
      
      (1)
      	snd_soc_unregister_component()
      **		mutex_lock()
      			snd_soc_component_del_unlocked()
      =>				snd_soc_unbind_card()
      **		mutex_unlock()
      
      (2)
      	snd_soc_unregister_card()
      =>		snd_soc_unbind_card()
      
      (1) case is already using mutex_lock() when it calles
      snd_soc_unbind_card(), thus, we will get lockdep warning.
      
      commit 495f926c ("ASoC: core: Fix deadlock in
      snd_soc_instantiate_card()") tried to fixup it, but still not
      enough. We still have lockdep warning when we try unbind/bind.
      
      We need mutex_lock() under snd_soc_unregister_card()
      instead of snd_remove_link_components()/snd_soc_unbind_card().
      
      Fixes: 34ac3c3e ("ASoC: core: lock client_mutex while removing link components")
      Fixes: 495f926c ("ASoC: core: Fix deadlock in snd_soc_instantiate_card()")
      Signed-off-by: default avatarKuninori Morimoto <kuninori.morimoto.gx@renesas.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      95783489
    • Robert Jarzmik's avatar
      media: mt9m111: fix fw-node refactoring · 3b12115f
      Robert Jarzmik authored
      [ Upstream commit 8d4e29a5 ]
      
      In the patch refactoring the fw-node, the mt9m111 was broken for all
      platform_data based platforms, which were the first aim of this
      driver. Only the devicetree platform are still functional, probably
      because the testing was done on these.
      
      The result is that -EINVAL is systematically return for such platforms,
      what this patch fixes.
      
      [Sakari Ailus: Rework this to resolve a merge conflict and use dev_fwnode]
      
      Fixes: 98480d65 ("media: mt9m111: allow to setup pixclk polarity")
      Signed-off-by: default avatarRobert Jarzmik <robert.jarzmik@free.fr>
      Signed-off-by: default avatarSakari Ailus <sakari.ailus@linux.intel.com>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      3b12115f
    • Nathan Huckleberry's avatar
      timer_list: Guard procfs specific code · ee2e483f
      Nathan Huckleberry authored
      [ Upstream commit a9314773 ]
      
      With CONFIG_PROC_FS=n the following warning is emitted:
      
      kernel/time/timer_list.c:361:36: warning: unused variable
      'timer_list_sops' [-Wunused-const-variable]
         static const struct seq_operations timer_list_sops = {
      
      Add #ifdef guard around procfs specific code.
      Signed-off-by: default avatarNathan Huckleberry <nhuck@google.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarNick Desaulniers <ndesaulniers@google.com>
      Cc: john.stultz@linaro.org
      Cc: sboyd@kernel.org
      Cc: clang-built-linux@googlegroups.com
      Link: https://github.com/ClangBuiltLinux/linux/issues/534
      Link: https://lkml.kernel.org/r/20190614181604.112297-1-nhuck@google.comSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      ee2e483f
    • Miroslav Lichvar's avatar
      ntp: Limit TAI-UTC offset · a0d5e56f
      Miroslav Lichvar authored
      [ Upstream commit d897a4ab ]
      
      Don't allow the TAI-UTC offset of the system clock to be set by adjtimex()
      to a value larger than 100000 seconds.
      
      This prevents an overflow in the conversion to int, prevents the CLOCK_TAI
      clock from getting too far ahead of the CLOCK_REALTIME clock, and it is
      still large enough to allow leap seconds to be inserted at the maximum rate
      currently supported by the kernel (once per day) for the next ~270 years,
      however unlikely it is that someone can survive a catastrophic event which
      slowed down the rotation of the Earth so much.
      Reported-by: default avatarWeikang shi <swkhack@gmail.com>
      Signed-off-by: default avatarMiroslav Lichvar <mlichvar@redhat.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Stephen Boyd <sboyd@kernel.org>
      Link: https://lkml.kernel.org/r/20190618154713.20929-1-mlichvar@redhat.comSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      a0d5e56f
    • Anders Roxell's avatar
      media: i2c: fix warning same module names · 10e60db5
      Anders Roxell authored
      [ Upstream commit b2ce5617 ]
      
      When building with CONFIG_VIDEO_ADV7511 and CONFIG_DRM_I2C_ADV7511
      enabled as loadable modules, we see the following warning:
      
        drivers/gpu/drm/bridge/adv7511/adv7511.ko
        drivers/media/i2c/adv7511.ko
      
      Rework so that the file is named adv7511-v4l2.c.
      Signed-off-by: default avatarAnders Roxell <anders.roxell@linaro.org>
      Signed-off-by: default avatarHans Verkuil <hverkuil-cisco@xs4all.nl>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      10e60db5
    • Marek Szyprowski's avatar
      media: s5p-mfc: Make additional clocks optional · c98a3afe
      Marek Szyprowski authored
      [ Upstream commit e08efef8 ]
      
      Since the beginning the second clock ('special', 'sclk') was optional and
      it is not available on some variants of Exynos SoCs (i.e. Exynos5420 with
      v7 of MFC hardware).
      
      However commit 1bce6fb3 ("[media] s5p-mfc: Rework clock handling")
      made handling of all specified clocks mandatory. This patch restores
      original behavior of the driver and fixes its operation on
      Exynos5420 SoCs.
      
      Fixes: 1bce6fb3 ("[media] s5p-mfc: Rework clock handling")
      Signed-off-by: default avatarMarek Szyprowski <m.szyprowski@samsung.com>
      Signed-off-by: default avatarHans Verkuil <hverkuil-cisco@xs4all.nl>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      c98a3afe
    • Julian Anastasov's avatar
      ipvs: defer hook registration to avoid leaks · affa31e6
      Julian Anastasov authored
      [ Upstream commit cf47a0b8 ]
      
      syzkaller reports for memory leak when registering hooks [1]
      
      As we moved the nf_unregister_net_hooks() call into
      __ip_vs_dev_cleanup(), defer the nf_register_net_hooks()
      call, so that hooks are allocated and freed from same
      pernet_operations (ipvs_core_dev_ops).
      
      [1]
      BUG: memory leak
      unreferenced object 0xffff88810acd8a80 (size 96):
       comm "syz-executor073", pid 7254, jiffies 4294950560 (age 22.250s)
       hex dump (first 32 bytes):
         02 00 00 00 00 00 00 00 50 8b bb 82 ff ff ff ff  ........P.......
         00 00 00 00 00 00 00 00 00 77 bb 82 ff ff ff ff  .........w......
       backtrace:
         [<0000000013db61f1>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
         [<0000000013db61f1>] slab_post_alloc_hook mm/slab.h:439 [inline]
         [<0000000013db61f1>] slab_alloc_node mm/slab.c:3269 [inline]
         [<0000000013db61f1>] kmem_cache_alloc_node_trace+0x15b/0x2a0 mm/slab.c:3597
         [<000000001a27307d>] __do_kmalloc_node mm/slab.c:3619 [inline]
         [<000000001a27307d>] __kmalloc_node+0x38/0x50 mm/slab.c:3627
         [<0000000025054add>] kmalloc_node include/linux/slab.h:590 [inline]
         [<0000000025054add>] kvmalloc_node+0x4a/0xd0 mm/util.c:431
         [<0000000050d1bc00>] kvmalloc include/linux/mm.h:637 [inline]
         [<0000000050d1bc00>] kvzalloc include/linux/mm.h:645 [inline]
         [<0000000050d1bc00>] allocate_hook_entries_size+0x3b/0x60 net/netfilter/core.c:61
         [<00000000e8abe142>] nf_hook_entries_grow+0xae/0x270 net/netfilter/core.c:128
         [<000000004b94797c>] __nf_register_net_hook+0x9a/0x170 net/netfilter/core.c:337
         [<00000000d1545cbc>] nf_register_net_hook+0x34/0xc0 net/netfilter/core.c:464
         [<00000000876c9b55>] nf_register_net_hooks+0x53/0xc0 net/netfilter/core.c:480
         [<000000002ea868e0>] __ip_vs_init+0xe8/0x170 net/netfilter/ipvs/ip_vs_core.c:2280
         [<000000002eb2d451>] ops_init+0x4c/0x140 net/core/net_namespace.c:130
         [<000000000284ec48>] setup_net+0xde/0x230 net/core/net_namespace.c:316
         [<00000000a70600fa>] copy_net_ns+0xf0/0x1e0 net/core/net_namespace.c:439
         [<00000000ff26c15e>] create_new_namespaces+0x141/0x2a0 kernel/nsproxy.c:107
         [<00000000b103dc79>] copy_namespaces+0xa1/0xe0 kernel/nsproxy.c:165
         [<000000007cc008a2>] copy_process.part.0+0x11fd/0x2150 kernel/fork.c:2035
         [<00000000c344af7c>] copy_process kernel/fork.c:1800 [inline]
         [<00000000c344af7c>] _do_fork+0x121/0x4f0 kernel/fork.c:2369
      
      Reported-by: syzbot+722da59ccb264bc19910@syzkaller.appspotmail.com
      Fixes: 719c7d56 ("ipvs: Fix use-after-free in ip_vs_in")
      Signed-off-by: default avatarJulian Anastasov <ja@ssi.bg>
      Acked-by: default avatarSimon Horman <horms@verge.net.au>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      affa31e6
    • Arnd Bergmann's avatar
      ipsec: select crypto ciphers for xfrm_algo · a275d049
      Arnd Bergmann authored
      [ Upstream commit 597179b0 ]
      
      kernelci.org reports failed builds on arc because of what looks
      like an old missed 'select' statement:
      
      net/xfrm/xfrm_algo.o: In function `xfrm_probe_algs':
      xfrm_algo.c:(.text+0x1e8): undefined reference to `crypto_has_ahash'
      
      I don't see this in randconfig builds on other architectures, but
      it's fairly clear we want to select the hash code for it, like we
      do for all its other users. As Herbert points out, CRYPTO_BLKCIPHER
      is also required even though it has not popped up in build tests.
      
      Fixes: 17bc1970 ("ipsec: Use skcipher and ahash when probing algorithms")
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Acked-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarSteffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      a275d049
    • Julien Thierry's avatar
      arm64: Do not enable IRQs for ct_user_exit · 7e41783d
      Julien Thierry authored
      [ Upstream commit 9034f625 ]
      
      For el0_dbg and el0_error, DAIF bits get explicitly cleared before
      calling ct_user_exit.
      
      When context tracking is disabled, DAIF gets set (almost) immediately
      after. When context tracking is enabled, among the first things done
      is disabling IRQs.
      
      What is actually needed is:
      - PSR.D = 0 so the system can be debugged (should be already the case)
      - PSR.A = 0 so async error can be handled during context tracking
      
      Do not clear PSR.I in those two locations.
      Reviewed-by: default avatarMarc Zyngier <marc.zyngier@arm.com>
      Acked-by: default avatarMark Rutland <mark.rutland@arm.com>
      Reviewed-by: default avatarJames Morse <james.morse@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarJulien Thierry <julien.thierry@arm.com>
      Signed-off-by: default avatarCatalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      7e41783d
    • Minwoo Im's avatar
      nvme-pci: adjust irq max_vector using num_possible_cpus() · 8d81f2b8
      Minwoo Im authored
      [ Upstream commit dad77d63 ]
      
      If the "irq_queues" are greater than num_possible_cpus(),
      nvme_calc_irq_sets() can have irq set_size for HCTX_TYPE_DEFAULT greater
      than it can be afforded.
      2039         affd->set_size[HCTX_TYPE_DEFAULT] = nrirqs - nr_read_queues;
      
      It might cause a WARN() from the irq_build_affinity_masks() like [1]:
      220         if (nr_present < numvecs)
      221                 WARN_ON(nr_present + nr_others < numvecs);
      
      This patch prevents it from the WARN() by adjusting the max_vector value
      from the nvme_setup_irqs().
      
      [1] WARN messages when modprobe nvme write_queues=32 poll_queues=0:
      root@target:~/nvme# nproc
      8
      root@target:~/nvme# modprobe nvme write_queues=32 poll_queues=0
      [   17.925326] nvme nvme0: pci function 0000:00:04.0
      [   17.940601] WARNING: CPU: 3 PID: 1030 at kernel/irq/affinity.c:221 irq_create_affinity_masks+0x222/0x330
      [   17.940602] Modules linked in: nvme nvme_core [last unloaded: nvme]
      [   17.940605] CPU: 3 PID: 1030 Comm: kworker/u17:4 Tainted: G        W         5.1.0+ #156
      [   17.940605] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
      [   17.940608] Workqueue: nvme-reset-wq nvme_reset_work [nvme]
      [   17.940609] RIP: 0010:irq_create_affinity_masks+0x222/0x330
      [   17.940611] Code: 4c 8d 4c 24 28 4c 8d 44 24 30 e8 c9 fa ff ff 89 44 24 18 e8 c0 38 fa ff 8b 44 24 18 44 8b 54 24 1c 5a 44 01 d0 41 39 c4 76 02 <0f> 0b 48 89 df 44 01 e5 e8 f1 ce 10 00 48 8b 34 24 44 89 f0 44 01
      [   17.940611] RSP: 0018:ffffc90002277c50 EFLAGS: 00010216
      [   17.940612] RAX: 0000000000000008 RBX: ffff88807ca48860 RCX: 0000000000000000
      [   17.940612] RDX: ffff88807bc03800 RSI: 0000000000000020 RDI: 0000000000000000
      [   17.940613] RBP: 0000000000000001 R08: ffffc90002277c78 R09: ffffc90002277c70
      [   17.940613] R10: 0000000000000008 R11: 0000000000000001 R12: 0000000000000020
      [   17.940614] R13: 0000000000025d08 R14: 0000000000000001 R15: ffff88807bc03800
      [   17.940614] FS:  0000000000000000(0000) GS:ffff88807db80000(0000) knlGS:0000000000000000
      [   17.940616] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   17.940617] CR2: 00005635e583f790 CR3: 000000000240a000 CR4: 00000000000006e0
      [   17.940617] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   17.940618] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   17.940618] Call Trace:
      [   17.940622]  __pci_enable_msix_range+0x215/0x540
      [   17.940623]  ? kernfs_put+0x117/0x160
      [   17.940625]  pci_alloc_irq_vectors_affinity+0x74/0x110
      [   17.940626]  nvme_reset_work+0xc30/0x1397 [nvme]
      [   17.940628]  ? __switch_to_asm+0x34/0x70
      [   17.940628]  ? __switch_to_asm+0x40/0x70
      [   17.940629]  ? __switch_to_asm+0x34/0x70
      [   17.940630]  ? __switch_to_asm+0x40/0x70
      [   17.940630]  ? __switch_to_asm+0x34/0x70
      [   17.940631]  ? __switch_to_asm+0x40/0x70
      [   17.940632]  ? nvme_irq_check+0x30/0x30 [nvme]
      [   17.940633]  process_one_work+0x20b/0x3e0
      [   17.940634]  worker_thread+0x1f9/0x3d0
      [   17.940635]  ? cancel_delayed_work+0xa0/0xa0
      [   17.940636]  kthread+0x117/0x120
      [   17.940637]  ? kthread_stop+0xf0/0xf0
      [   17.940638]  ret_from_fork+0x3a/0x50
      [   17.940639] ---[ end trace aca8a131361cd42a ]---
      [   17.942124] nvme nvme0: 7/1/0 default/read/poll queues
      Signed-off-by: default avatarMinwoo Im <minwoo.im.dev@gmail.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      8d81f2b8
    • Heiner Litz's avatar
      lightnvm: pblk: fix freeing of merged pages · dc7cb800
      Heiner Litz authored
      [ Upstream commit 510fd8ea ]
      
      bio_add_pc_page() may merge pages when a bio is padded due to a flush.
      Fix iteration over the bio to free the correct pages in case of a merge.
      Signed-off-by: default avatarHeiner Litz <hlitz@ucsc.edu>
      Reviewed-by: default avatarJavier González <javier@javigon.com>
      Signed-off-by: default avatarMatias Bjørling <mb@lightnvm.io>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      dc7cb800
    • Chaitanya Kulkarni's avatar
      nvme-pci: set the errno on ctrl state change error · e05c9c36
      Chaitanya Kulkarni authored
      [ Upstream commit e71afda4 ]
      
      This patch removes the confusing assignment of the variable result at
      the time of declaration and sets the value in error cases next to the
      places where the actual error is happening.
      
      Here we also set the result value to -ENODEV when we fail at the final
      ctrl state transition in nvme_reset_work(). Without this assignment
      result will hold 0 from nvme_setup_io_queue() and on failure 0 will be
      passed to he nvme_remove_dead_ctrl() from final state transition.
      Signed-off-by: default avatarChaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      e05c9c36
    • Minwoo Im's avatar
      nvme-pci: properly report state change failure in nvme_reset_work · bd3df654
      Minwoo Im authored
      [ Upstream commit cee6c269 ]
      
      If the state change to NVME_CTRL_CONNECTING fails, the dmesg is going to
      be like:
      
        [  293.689160] nvme nvme0: failed to mark controller CONNECTING
        [  293.689160] nvme nvme0: Removing after probe failure status: 0
      
      Even it prints the first line to indicate the situation, the second line
      is not proper because the status is 0 which means normally success of
      the previous operation.
      
      This patch makes it indicate the proper error value when it fails.
        [   25.932367] nvme nvme0: failed to mark controller CONNECTING
        [   25.932369] nvme nvme0: Removing after probe failure status: -16
      
      This situation is able to be easily reproduced by:
        root@target:~# rmmod nvme && modprobe nvme && rmmod nvme
      Signed-off-by: default avatarMinwoo Im <minwoo.im.dev@gmail.com>
      Reviewed-by: default avatarChaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      bd3df654
    • Anton Eidelman's avatar
      nvme: fix possible io failures when removing multipathed ns · 942c09be
      Anton Eidelman authored
      [ Upstream commit 2181e455 ]
      
      When a shared namespace is removed, we call blk_cleanup_queue()
      when the device can still be accessed as the current path and this can
      result in submission to a dying queue. Hence, direct_make_request()
      called by our mpath device may fail (propagating the failure to userspace).
      Instead, we want to failover this I/O to a different path if one exists.
      Thus, before we cleanup the request queue, we make sure that the device is
      cleared from the current path nor it can be selected again as such.
      
      Fix this by:
      - clear the ns from the head->list and synchronize rcu to make sure there is
        no concurrent path search that restores it as the current path
      - clear the mpath current path in order to trigger a subsequent path search
        and sync srcu to wait for any ongoing request submissions
      - safely continue to namespace removal and blk_cleanup_queue
      Signed-off-by: default avatarAnton Eidelman <anton@lightbitslabs.com>
      Signed-off-by: default avatarSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      942c09be
    • Pan Bian's avatar
      EDAC/sysfs: Fix memory leak when creating a csrow object · 3db69908
      Pan Bian authored
      [ Upstream commit 585fb3d9 ]
      
      In edac_create_csrow_object(), the reference to the object is not
      released when adding the device to the device hierarchy fails
      (device_add()). This may result in a memory leak.
      Signed-off-by: default avatarPan Bian <bianpan2016@163.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Reviewed-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: James Morse <james.morse@arm.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Link: https://lkml.kernel.org/r/1555554438-103953-1-git-send-email-bianpan2016@163.comSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      3db69908
    • Greg KH's avatar
      EDAC/sysfs: Drop device references properly · 4b036124
      Greg KH authored
      [ Upstream commit 7adc05d2 ]
      
      Do put_device() if device_add() fails.
      
       [ bp: do device_del() for the successfully created devices in
         edac_create_csrow_objects(), on the unwind path. ]
      Signed-off-by: default avatarGreg KH <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Link: https://lkml.kernel.org/r/20190427214925.GE16338@kroah.comSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      4b036124
    • Tudor Ambarus's avatar
      spi: fix ctrl->num_chipselect constraint · a2fd0a9b
      Tudor Ambarus authored
      [ Upstream commit f9481b08 ]
      
      at91sam9g25ek showed the following error at probe:
      atmel_spi f0000000.spi: Using dma0chan2 (tx) and dma0chan3 (rx)
      for DMA transfers
      atmel_spi: probe of f0000000.spi failed with error -22
      
      Commit 0a919ae4 ("spi: Don't call spi_get_gpio_descs() before device name is set")
      moved the calling of spi_get_gpio_descs() after ctrl->dev is set,
      but didn't move the !ctrl->num_chipselect check. When there are
      chip selects in the device tree, the spi-atmel driver lets the
      SPI core discover them when registering the SPI master.
      The ctrl->num_chipselect is thus expected to be set by
      spi_get_gpio_descs().
      
      Move the !ctlr->num_chipselect after spi_get_gpio_descs() as it was
      before the aforementioned commit. While touching this block, get rid
      of the explicit comparison with 0 and update the commenting style.
      
      Fixes: 0a919ae4 ("spi: Don't call spi_get_gpio_descs() before device name is set")
      Signed-off-by: default avatarTudor Ambarus <tudor.ambarus@microchip.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      a2fd0a9b
    • Rafael J. Wysocki's avatar
      ACPICA: Clear status of GPEs on first direct enable · 467ebe42
      Rafael J. Wysocki authored
      [ Upstream commit 44758baf ]
      
      ACPI GPEs (other than the EC one) can be enabled in two situations.
      First, the GPEs with existing _Lxx and _Exx methods are enabled
      implicitly by ACPICA during system initialization.  Second, the
      GPEs without these methods (like GPEs listed by _PRW objects for
      wakeup devices) need to be enabled directly by the code that is
      going to use them (e.g. ACPI power management or device drivers).
      
      In the former case, if the status of a given GPE is set to start
      with, its handler method (either _Lxx or _Exx) needs to be invoked
      to take care of the events (possibly) signaled before the GPE was
      enabled.  In the latter case, however, the first caller of
      acpi_enable_gpe() for a given GPE should not be expected to care
      about any events that might be signaled through it earlier.  In
      that case, it is better to clear the status of the GPE before
      enabling it, to prevent stale events from triggering unwanted
      actions (like spurious system resume, for example).
      
      For this reason, modify acpi_ev_add_gpe_reference() to take an
      additional boolean argument indicating whether or not the GPE
      status needs to be cleared when its reference counter changes from
      zero to one and make acpi_enable_gpe() pass TRUE to it through
      that new argument.
      
      Fixes: 18996f2d ("ACPICA: Events: Stop unconditionally clearing ACPI IRQs during suspend/resume")
      Reported-by: default avatarFurquan Shaikh <furquan@google.com>
      Tested-by: default avatarFurquan Shaikh <furquan@google.com>
      Tested-by: default avatarMika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      467ebe42
    • Dennis Zhou's avatar
      blk-iolatency: only account submitted bios · c8632e34
      Dennis Zhou authored
      [ Upstream commit a3fb01ba ]
      
      As is, iolatency recognizes done_bio and cleanup as ending paths. If a
      request is marked REQ_NOWAIT and fails to get a request, the bio is
      cleaned up via rq_qos_cleanup() and ended in bio_wouldblock_error().
      This results in underflowing the inflight counter. Fix this by only
      accounting bios that were actually submitted.
      Signed-off-by: default avatarDennis Zhou <dennis@kernel.org>
      Cc: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      c8632e34
    • Qian Cai's avatar
      x86/cacheinfo: Fix a -Wtype-limits warning · 1d4bf99c
      Qian Cai authored
      [ Upstream commit 1b7aebf0 ]
      
      cpuinfo_x86.x86_model is an unsigned type, so comparing against zero
      will generate a compilation warning:
      
        arch/x86/kernel/cpu/cacheinfo.c: In function 'cacheinfo_amd_init_llc_id':
        arch/x86/kernel/cpu/cacheinfo.c:662:19: warning: comparison is always true \
          due to limited range of data type [-Wtype-limits]
      
      Remove the unnecessary lower bound check.
      
       [ bp: Massage. ]
      
      Fixes: 68091ee7 ("x86/CPU/AMD: Calculate last level cache ID from number of sharing threads")
      Signed-off-by: default avatarQian Cai <cai@lca.pw>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Reviewed-by: default avatarSean Christopherson <sean.j.christopherson@intel.com>
      Cc: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Pu Wen <puwen@hygon.cn>
      Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/1560954773-11967-1-git-send-email-cai@lca.pwSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      1d4bf99c
    • Ilias Apalodimas's avatar
      net: netsec: initialize tx ring on ndo_open · a98c1517
      Ilias Apalodimas authored
      [ Upstream commit 39e3622e ]
      
      Since we changed the Tx ring handling and now depends on bit31 to figure
      out the owner of the descriptor, we should initialize this every time
      the device goes down-up instead of doing it once on driver init. If the
      value is not correctly initialized the device won't have any available
      descriptors
      
      Changes since v1:
      - Typo fixes
      
      Fixes: 35e07d23 ("net: socionext: remove mmio reads on Tx")
      Signed-off-by: default avatarIlias Apalodimas <ilias.apalodimas@linaro.org>
      Acked-by: default avatarArd Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      a98c1517
    • Mika Westerberg's avatar
      PCI: Add missing link delays required by the PCIe spec · 3c795a8e
      Mika Westerberg authored
      [ Upstream commit c2bf1fc2 ]
      
      Currently Linux does not follow PCIe spec regarding the required delays
      after reset. A concrete example is a Thunderbolt add-in-card that
      consists of a PCIe switch and two PCIe endpoints:
      
        +-1b.0-[01-6b]----00.0-[02-6b]--+-00.0-[03]----00.0 TBT controller
                                        +-01.0-[04-36]-- DS hotplug port
                                        +-02.0-[37]----00.0 xHCI controller
                                        \-04.0-[38-6b]-- DS hotplug port
      
      The root port (1b.0) and the PCIe switch downstream ports are all PCIe
      gen3 so they support 8GT/s link speeds.
      
      We wait for the PCIe hierarchy to enter D3cold (runtime):
      
        pcieport 0000:00:1b.0: power state changed by ACPI to D3cold
      
      When it wakes up from D3cold, according to the PCIe 4.0 section 5.8 the
      PCIe switch is put to reset and its power is re-applied. This means that
      we must follow the rules in PCIe 4.0 section 6.6.1.
      
      For the PCIe gen3 ports we are dealing with here, the following applies:
      
        With a Downstream Port that supports Link speeds greater than 5.0
        GT/s, software must wait a minimum of 100 ms after Link training
        completes before sending a Configuration Request to the device
        immediately below that Port. Software can determine when Link training
        completes by polling the Data Link Layer Link Active bit or by setting
        up an associated interrupt (see Section 6.7.3.3).
      
      Translating this into the above topology we would need to do this (DLLLA
      stands for Data Link Layer Link Active):
      
        pcieport 0000:00:1b.0: wait for 100ms after DLLLA is set before access to 0000:01:00.0
        pcieport 0000:02:00.0: wait for 100ms after DLLLA is set before access to 0000:03:00.0
        pcieport 0000:02:02.0: wait for 100ms after DLLLA is set before access to 0000:37:00.0
      
      I've instrumented the kernel with additional logging so we can see the
      actual delays the kernel performs:
      
        pcieport 0000:00:1b.0: power state changed by ACPI to D0
        pcieport 0000:00:1b.0: waiting for D3cold delay of 100 ms
        pcieport 0000:00:1b.0: waking up bus
        pcieport 0000:00:1b.0: waiting for D3hot delay of 10 ms
        pcieport 0000:00:1b.0: restoring config space at offset 0x2c (was 0x60, writing 0x60)
        ...
        pcieport 0000:00:1b.0: PME# disabled
        pcieport 0000:01:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
        ...
        pcieport 0000:01:00.0: PME# disabled
        pcieport 0000:02:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
        ...
        pcieport 0000:02:00.0: PME# disabled
        pcieport 0000:02:01.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
        ...
        pcieport 0000:02:01.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
        pcieport 0000:02:01.0: PME# disabled
        pcieport 0000:02:02.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
        ...
        pcieport 0000:02:02.0: PME# disabled
        pcieport 0000:02:04.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
        ...
        pcieport 0000:02:04.0: PME# disabled
        pcieport 0000:02:01.0: PME# enabled
        pcieport 0000:02:01.0: waiting for D3hot delay of 10 ms
        pcieport 0000:02:04.0: PME# enabled
        pcieport 0000:02:04.0: waiting for D3hot delay of 10 ms
        thunderbolt 0000:03:00.0: restoring config space at offset 0x14 (was 0x0, writing 0x8a040000)
        ...
        thunderbolt 0000:03:00.0: PME# disabled
        xhci_hcd 0000:37:00.0: restoring config space at offset 0x10 (was 0x0, writing 0x73f00000)
        ...
        xhci_hcd 0000:37:00.0: PME# disabled
      
      For the switch upstream port (01:00.0) we wait for 100ms but not taking
      into account the DLLLA requirement. We then wait 10ms for D3hot -> D0
      transition of the root port and the two downstream hotplug ports. This
      means that we deviate from what the spec requires.
      
      Performing the same check for system sleep (s2idle) transitions we can
      see following when resuming from s2idle:
      
        pcieport 0000:00:1b.0: power state changed by ACPI to D0
        pcieport 0000:00:1b.0: restoring config space at offset 0x2c (was 0x60, writing 0x60)
        ...
        pcieport 0000:01:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
        ...
        pcieport 0000:02:02.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
        pcieport 0000:02:02.0: restoring config space at offset 0x2c (was 0x0, writing 0x0)
        pcieport 0000:02:01.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
        pcieport 0000:02:04.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
        pcieport 0000:02:02.0: restoring config space at offset 0x28 (was 0x0, writing 0x0)
        pcieport 0000:02:00.0: restoring config space at offset 0x3c (was 0x1ff, writing 0x201ff)
        pcieport 0000:02:02.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1fff1)
        pcieport 0000:02:01.0: restoring config space at offset 0x2c (was 0x0, writing 0x60)
        pcieport 0000:02:02.0: restoring config space at offset 0x20 (was 0x0, writing 0x73f073f0)
        pcieport 0000:02:04.0: restoring config space at offset 0x2c (was 0x0, writing 0x60)
        pcieport 0000:02:01.0: restoring config space at offset 0x28 (was 0x0, writing 0x60)
        pcieport 0000:02:00.0: restoring config space at offset 0x2c (was 0x0, writing 0x0)
        pcieport 0000:02:02.0: restoring config space at offset 0x1c (was 0x101, writing 0x1f1)
        pcieport 0000:02:04.0: restoring config space at offset 0x28 (was 0x0, writing 0x60)
        pcieport 0000:02:01.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1ff10001)
        pcieport 0000:02:00.0: restoring config space at offset 0x28 (was 0x0, writing 0x0)
        pcieport 0000:02:02.0: restoring config space at offset 0x18 (was 0x0, writing 0x373702)
        pcieport 0000:02:04.0: restoring config space at offset 0x24 (was 0x10001, writing 0x49f12001)
        pcieport 0000:02:01.0: restoring config space at offset 0x20 (was 0x0, writing 0x73e05c00)
        pcieport 0000:02:00.0: restoring config space at offset 0x24 (was 0x10001, writing 0x1fff1)
        pcieport 0000:02:04.0: restoring config space at offset 0x20 (was 0x0, writing 0x89f07400)
        pcieport 0000:02:01.0: restoring config space at offset 0x1c (was 0x101, writing 0x5151)
        pcieport 0000:02:00.0: restoring config space at offset 0x20 (was 0x0, writing 0x8a008a00)
        pcieport 0000:02:02.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020)
        pcieport 0000:02:04.0: restoring config space at offset 0x1c (was 0x101, writing 0x6161)
        pcieport 0000:02:01.0: restoring config space at offset 0x18 (was 0x0, writing 0x360402)
        pcieport 0000:02:00.0: restoring config space at offset 0x1c (was 0x101, writing 0x1f1)
        pcieport 0000:02:04.0: restoring config space at offset 0x18 (was 0x0, writing 0x6b3802)
        pcieport 0000:02:02.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
        pcieport 0000:02:00.0: restoring config space at offset 0x18 (was 0x0, writing 0x30302)
        pcieport 0000:02:01.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020)
        pcieport 0000:02:04.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020)
        pcieport 0000:02:00.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020)
        pcieport 0000:02:01.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
        pcieport 0000:02:04.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
        pcieport 0000:02:00.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100407)
        xhci_hcd 0000:37:00.0: restoring config space at offset 0x10 (was 0x0, writing 0x73f00000)
        ...
        thunderbolt 0000:03:00.0: restoring config space at offset 0x14 (was 0x0, writing 0x8a040000)
      
      This is even worse. None of the mandatory delays are performed. If this
      would be S3 instead of s2idle then according to PCI FW spec 3.2 section
      4.6.8.  there is a specific _DSM that allows the OS to skip the delays
      but this platform does not provide the _DSM and does not go to S3 anyway
      so no firmware is involved that could already handle these delays.
      
      In this particular Intel Coffee Lake platform these delays are not
      actually needed because there is an additional delay as part of the ACPI
      power resource that is used to turn on power to the hierarchy but since
      that additional delay is not required by any of standards (PCIe, ACPI)
      it is not present in the Intel Ice Lake, for example where missing the
      mandatory delays causes pciehp to start tearing down the stack too early
      (links are not yet trained).
      
      For this reason, change the PCIe portdrv PM resume hooks so that they
      perform the mandatory delays before the downstream component gets
      resumed. We perform the delays before port services are resumed because
      otherwise pciehp might find that the link is not up (even if it is just
      training) and tears-down the hierarchy.
      Signed-off-by: default avatarMika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      3c795a8e
    • Alexei Starovoitov's avatar
      bpf: fix callees pruning callers · 70cc29db
      Alexei Starovoitov authored
      [ Upstream commit eea1c227 ]
      
      The commit 7640ead9 partially resolved the issue of callees
      incorrectly pruning the callers.
      With introduction of bounded loops and jmps_processed heuristic
      single verifier state may contain multiple branches and calls.
      It's possible that new verifier state (for future pruning) will be
      allocated inside callee. Then callee will exit (still within the same
      verifier state). It will go back to the caller and there R6-R9 registers
      will be read and will trigger mark_reg_read. But the reg->live for all frames
      but the top frame is not set to LIVE_NONE. Hence mark_reg_read will fail
      to propagate liveness into parent and future walking will incorrectly
      conclude that the states are equivalent because LIVE_READ is not set.
      In other words the rule for parent/live should be:
      whenever register parentage chain is set the reg->live should be set to LIVE_NONE.
      is_state_visited logic already follows this rule for spilled registers.
      
      Fixes: 7640ead9 ("bpf: verifier: make sure callees don't prune with caller differences")
      Fixes: f4d7e40a ("bpf: introduce function calls (verification)")
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      70cc29db
    • Nilkanth Ahirrao's avatar
      ASoC: rsnd: fixup mod ID calculation in rsnd_ctu_probe_ · f89b29fa
      Nilkanth Ahirrao authored
      [ Upstream commit ac28ec07 ]
      
      commit c16015f3 ("ASoC: rsnd: add .get_id/.get_id_sub")
      introduces rsnd_ctu_id which calcualates and gives
      the main Device id of the CTU by dividing the id by 4.
      rsnd_mod_id uses this interface to get the CTU main
      Device id. But this commit forgets to revert the main
      Device id calcution previously done in rsnd_ctu_probe_
      which also divides the id by 4. This path corrects the
      same to get the correct main Device id.
      
      The issue is observered when rsnd_ctu_probe_ is done for CTU1
      
      Fixes: c16015f3 ("ASoC: rsnd: add .get_id/.get_id_sub")
      Signed-off-by: default avatarNilkanth Ahirrao <anilkanth@jp.adit-jv.com>
      Signed-off-by: default avatarSuresh Udipi <sudipi@jp.adit-jv.com>
      Signed-off-by: default avatarJiada Wang <jiada_wang@mentor.com>
      Acked-by: default avatarKuninori Morimoto <kuninori.morimoto.gx@renesas.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      f89b29fa
    • Denis Kirjanov's avatar
      ipoib: correcly show a VF hardware address · c545bda3
      Denis Kirjanov authored
      [ Upstream commit 64d701c6 ]
      
      in the case of IPoIB with SRIOV enabled hardware
      ip link show command incorrecly prints
      0 instead of a VF hardware address.
      
      Before:
      11: ib1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast
      state UP mode DEFAULT group default qlen 256
          link/infiniband
      80:00:00:66:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a4:3e:7c brd
      00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
          vf 0 MAC 00:00:00:00:00:00, spoof checking off, link-state disable,
      trust off, query_rss off
      ...
      After:
      11: ib1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast
      state UP mode DEFAULT group default qlen 256
          link/infiniband
      80:00:00:66:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a4:3e:7c brd
      00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
          vf 0     link/infiniband
      80:00:00:66:fe:80:00:00:00:00:00:00:24:8a:07:03:00:a4:3e:7c brd
      00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff, spoof
      checking off, link-state disable, trust off, query_rss off
      
      v1->v2: just copy an address without modifing ifla_vf_mac
      v2->v3: update the changelog
      Signed-off-by: default avatarDenis Kirjanov <kda@linux-powerpc.org>
      Acked-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      c545bda3
    • Mitch Williams's avatar
      iavf: allow null RX descriptors · 2a51e334
      Mitch Williams authored
      [ Upstream commit efa14c39 ]
      
      In some circumstances, the hardware can hand us a null receive
      descriptor, with no data attached but otherwise valid. Unfortunately,
      the driver was ill-equipped to handle such an event, and would stop
      processing packets at that point.
      
      To fix this, use the Descriptor Done bit instead of the size to
      determine whether or not a descriptor is ready to be processed. Add some
      checks to allow for unused buffers.
      Signed-off-by: default avatarMitch Williams <mitch.a.williams@intel.com>
      Tested-by: default avatarAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      2a51e334
    • Jason Wang's avatar
      vhost_net: disable zerocopy by default · 46ffa155
      Jason Wang authored
      [ Upstream commit 098eadce ]
      
      Vhost_net was known to suffer from HOL[1] issues which is not easy to
      fix. Several downstream disable the feature by default. What's more,
      the datapath was split and datacopy path got the support of batching
      and XDP support recently which makes it faster than zerocopy part for
      small packets transmission.
      
      It looks to me that disable zerocopy by default is more
      appropriate. It cold be enabled by default again in the future if we
      fix the above issues.
      
      [1] https://patchwork.kernel.org/patch/3787671/Signed-off-by: default avatarJason Wang <jasowang@redhat.com>
      Acked-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      46ffa155
    • Arnaldo Carvalho de Melo's avatar
      perf evsel: Make perf_evsel__name() accept a NULL argument · 0b0e0d74
      Arnaldo Carvalho de Melo authored
      [ Upstream commit fdbdd7e8 ]
      
      In which case it simply returns "unknown", like when it can't figure out
      the evsel->name value.
      
      This makes this code more robust and fixes a problem in 'perf trace'
      where a NULL evsel was being passed to a routine that only used the
      evsel for printing its name when a invalid syscall id was passed.
      Reported-by: default avatarLeo Yan <leo.yan@linaro.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-f30ztaasku3z935cn3ak3h53@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      0b0e0d74
    • Peter Zijlstra's avatar
      x86/atomic: Fix smp_mb__{before,after}_atomic() · 380ca130
      Peter Zijlstra authored
      [ Upstream commit 69d927bb ]
      
      Recent probing at the Linux Kernel Memory Model uncovered a
      'surprise'. Strongly ordered architectures where the atomic RmW
      primitive implies full memory ordering and
      smp_mb__{before,after}_atomic() are a simple barrier() (such as x86)
      fail for:
      
      	*x = 1;
      	atomic_inc(u);
      	smp_mb__after_atomic();
      	r0 = *y;
      
      Because, while the atomic_inc() implies memory order, it
      (surprisingly) does not provide a compiler barrier. This then allows
      the compiler to re-order like so:
      
      	atomic_inc(u);
      	*x = 1;
      	smp_mb__after_atomic();
      	r0 = *y;
      
      Which the CPU is then allowed to re-order (under TSO rules) like:
      
      	atomic_inc(u);
      	r0 = *y;
      	*x = 1;
      
      And this very much was not intended. Therefore strengthen the atomic
      RmW ops to include a compiler barrier.
      
      NOTE: atomic_{or,and,xor} and the bitops already had the compiler
      barrier.
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      380ca130
    • Geert Uytterhoeven's avatar
      integrity: Fix __integrity_init_keyring() section mismatch · 1d0d6eb4
      Geert Uytterhoeven authored
      [ Upstream commit 8c655784 ]
      
      With gcc-4.6.3:
      
          WARNING: vmlinux.o(.text.unlikely+0x24c64): Section mismatch in reference from the function __integrity_init_keyring() to the function .init.text:set_platform_trusted_keys()
          The function __integrity_init_keyring() references
          the function __init set_platform_trusted_keys().
          This is often because __integrity_init_keyring lacks a __init
          annotation or the annotation of set_platform_trusted_keys is wrong.
      
      Indeed, if the compiler decides not to inline __integrity_init_keyring(),
      a warning is issued.
      
      Fix this by adding the missing __init annotation.
      
      Fixes: 9dc92c45 ("integrity: Define a trusted platform keyring")
      Signed-off-by: default avatarGeert Uytterhoeven <geert@linux-m68k.org>
      Reviewed-by: default avatarNayna Jain <nayna@linux.ibm.com>
      Reviewed-by: default avatarJames Morris <jamorris@linux.microsoft.com>
      Signed-off-by: default avatarMimi Zohar <zohar@linux.ibm.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      1d0d6eb4
    • Kan Liang's avatar
      perf/x86/intel/uncore: Handle invalid event coding for free-running counter · 5536b0ee
      Kan Liang authored
      [ Upstream commit 543ac280 ]
      
      Counting with invalid event coding for free-running counter may cause
      OOPs, e.g. uncore_iio_free_running_0/event=1/.
      
      Current code only validate the event with free-running event format,
      event=0xff,umask=0xXY. Non-free-running event format never be checked
      for the PMU with free-running counters.
      
      Add generic hw_config() to check and reject the invalid event coding
      for free-running PMU.
      Signed-off-by: default avatarKan Liang <kan.liang@linux.intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: acme@kernel.org
      Cc: eranian@google.com
      Fixes: 0f519f03 ("perf/x86/intel/uncore: Support IIO free-running counters on SKX")
      Link: https://lkml.kernel.org/r/1556672028-119221-2-git-send-email-kan.liang@linux.intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      5536b0ee
    • Jiri Olsa's avatar
      perf/x86/intel: Disable check_msr for real HW · 826e73e7
      Jiri Olsa authored
      [ Upstream commit d0e1a507 ]
      
      Tom Vaden reported false failure of the check_msr() function, because
      some servers can do POST tracing and enable LBR tracing during
      bootup.
      
      Kan confirmed that check_msr patch was to fix a bug report in
      guest, so it's ok to disable it for real HW.
      Reported-by: default avatarTom Vaden <tom.vaden@hpe.com>
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: default avatarTom Vaden <tom.vaden@hpe.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Liang Kan <kan.liang@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/20190616141313.GD2500@krava
      [ Readability edits. ]
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      826e73e7
    • Qian Cai's avatar
      sched/fair: Fix "runnable_avg_yN_inv" not used warnings · 7104a8f3
      Qian Cai authored
      [ Upstream commit 509466b7 ]
      
      runnable_avg_yN_inv[] is only used in kernel/sched/pelt.c but was
      included in several other places because they need other macros all
      came from kernel/sched/sched-pelt.h which was generated by
      Documentation/scheduler/sched-pelt. As the result, it causes compilation
      a lot of warnings,
      
        kernel/sched/sched-pelt.h:4:18: warning: 'runnable_avg_yN_inv' defined but not used [-Wunused-const-variable=]
        kernel/sched/sched-pelt.h:4:18: warning: 'runnable_avg_yN_inv' defined but not used [-Wunused-const-variable=]
        kernel/sched/sched-pelt.h:4:18: warning: 'runnable_avg_yN_inv' defined but not used [-Wunused-const-variable=]
        ...
      
      Silence it by appending the __maybe_unused attribute for it, so all
      generated variables and macros can still be kept in the same file.
      Signed-off-by: default avatarQian Cai <cai@lca.pw>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/1559596304-31581-1-git-send-email-cai@lca.pwSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      7104a8f3
    • Gao Xiang's avatar
      sched/core: Add __sched tag for io_schedule() · 1ce5c914
      Gao Xiang authored
      [ Upstream commit e3b929b0 ]
      
      Non-inline io_schedule() was introduced in:
      
        commit 10ab5643 ("sched/core: Separate out io_schedule_prepare() and io_schedule_finish()")
      
      Keep in line with io_schedule_timeout(), otherwise "/proc/<pid>/wchan" will
      report io_schedule() rather than its callers when waiting for IO.
      Reported-by: default avatarJilong Kou <koujilong@huawei.com>
      Signed-off-by: default avatarGao Xiang <gaoxiang25@huawei.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Miao Xie <miaoxie@huawei.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 10ab5643 ("sched/core: Separate out io_schedule_prepare() and io_schedule_finish()")
      Link: https://lkml.kernel.org/r/20190603091338.2695-1-gaoxiang25@huawei.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      1ce5c914
    • Nicolas Dichtel's avatar
      xfrm: fix sa selector validation · db1af6e8
      Nicolas Dichtel authored
      [ Upstream commit b8d6d007 ]
      
      After commit b38ff407, the following command does not work anymore:
      $ ip xfrm state add src 10.125.0.2 dst 10.125.0.1 proto esp spi 34 reqid 1 \
        mode tunnel enc 'cbc(aes)' 0xb0abdba8b782ad9d364ec81e3a7d82a1 auth-trunc \
        'hmac(sha1)' 0xe26609ebd00acb6a4d51fca13e49ea78a72c73e6 96 flag align4
      
      In fact, the selector is not mandatory, allow the user to provide an empty
      selector.
      
      Fixes: b38ff407 ("xfrm: Fix xfrm sel prefix length validation")
      CC: Anirudh Gupta <anirudh.gupta@sophos.com>
      Signed-off-by: default avatarNicolas Dichtel <nicolas.dichtel@6wind.com>
      Acked-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarSteffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      db1af6e8
    • Tejun Heo's avatar
      blkcg, writeback: dead memcgs shouldn't contribute to writeback ownership arbitration · ded52121
      Tejun Heo authored
      [ Upstream commit 66311422 ]
      
      wbc_account_io() collects information on cgroup ownership of writeback
      pages to determine which cgroup should own the inode.  Pages can stay
      associated with dead memcgs but we want to avoid attributing IOs to
      dead blkcgs as much as possible as the association is likely to be
      stale.  However, currently, pages associated with dead memcgs
      contribute to the accounting delaying and/or confusing the
      arbitration.
      
      Fix it by ignoring pages associated with dead memcgs.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      ded52121
    • Bob Liu's avatar
      block: null_blk: fix race condition for null_del_dev · 7b6ee182
      Bob Liu authored
      [ Upstream commit 7602843f ]
      
      Dulicate call of null_del_dev() will trigger null pointer error like below.
      The reason is a race condition between nullb_device_power_store() and
      nullb_group_drop_item().
      
        CPU#0                         CPU#1
        ----------------              -----------------
        do_rmdir()
         >configfs_rmdir()
          >client_drop_item()
           >nullb_group_drop_item()
                                      nullb_device_power_store()
      				>null_del_dev()
      
            >test_and_clear_bit(NULLB_DEV_FL_UP
             >null_del_dev()
             ^^^^^
             Duplicated null_dev_dev() triger null pointer error
      
      				>clear_bit(NULLB_DEV_FL_UP
      
      The fix could be keep the sequnce of clear NULLB_DEV_FL_UP and null_del_dev().
      
      [  698.613600] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
      [  698.613608] #PF error: [normal kernel read fault]
      [  698.613611] PGD 0 P4D 0
      [  698.613619] Oops: 0000 [#1] SMP PTI
      [  698.613627] CPU: 3 PID: 6382 Comm: rmdir Not tainted 5.0.0+ #35
      [  698.613631] Hardware name: LENOVO 20LJS2EV08/20LJS2EV08, BIOS R0SET33W (1.17 ) 07/18/2018
      [  698.613644] RIP: 0010:null_del_dev+0xc/0x110 [null_blk]
      [  698.613649] Code: 00 00 00 5b 41 5c 41 5d 41 5e 41 5f 5d c3 0f 0b eb 97 e8 47 bb 2a e8 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 54 53 <8b> 77 18 48 89 fb 4c 8b 27 48 c7 c7 40 57 1e c1 e8 bf c7 cb e8 48
      [  698.613654] RSP: 0018:ffffb887888bfde0 EFLAGS: 00010286
      [  698.613659] RAX: 0000000000000000 RBX: ffff9d436d92bc00 RCX: ffff9d43a9184681
      [  698.613663] RDX: ffffffffc11e5c30 RSI: 0000000068be6540 RDI: 0000000000000000
      [  698.613667] RBP: ffffb887888bfdf0 R08: 0000000000000001 R09: 0000000000000000
      [  698.613671] R10: ffffb887888bfdd8 R11: 0000000000000f16 R12: ffff9d436d92bc08
      [  698.613675] R13: ffff9d436d94e630 R14: ffffffffc11e5088 R15: ffffffffc11e5000
      [  698.613680] FS:  00007faa68be6540(0000) GS:ffff9d43d14c0000(0000) knlGS:0000000000000000
      [  698.613685] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  698.613689] CR2: 0000000000000018 CR3: 000000042f70c002 CR4: 00000000003606e0
      [  698.613693] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  698.613697] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  698.613700] Call Trace:
      [  698.613712]  nullb_group_drop_item+0x50/0x70 [null_blk]
      [  698.613722]  client_drop_item+0x29/0x40
      [  698.613728]  configfs_rmdir+0x1ed/0x300
      [  698.613738]  vfs_rmdir+0xb2/0x130
      [  698.613743]  do_rmdir+0x1c7/0x1e0
      [  698.613750]  __x64_sys_rmdir+0x17/0x20
      [  698.613759]  do_syscall_64+0x5a/0x110
      [  698.613768]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      Signed-off-by: default avatarBob Liu <bob.liu@oracle.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      7b6ee182