1. 11 Oct, 2019 15 commits
    • Santosh Sivaraj's avatar
      powerpc/mce: Schedule work from irq_work · ba3ca9fc
      Santosh Sivaraj authored
      commit b5bda626 upstream.
      
      schedule_work() cannot be called from MCE exception context as MCE can
      interrupt even in interrupt disabled context.
      
      Fixes: 733e4a4c ("powerpc/mce: hookup memory_failure for UE errors")
      Cc: stable@vger.kernel.org # v4.15+
      Reviewed-by: default avatarMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Reviewed-by: default avatarNicholas Piggin <npiggin@gmail.com>
      Acked-by: default avatarBalbir Singh <bsingharora@gmail.com>
      Signed-off-by: default avatarSantosh Sivaraj <santosh@fossix.org>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20190820081352.8641-2-santosh@fossix.orgSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ba3ca9fc
    • Balbir Singh's avatar
      powerpc/mce: Fix MCE handling for huge pages · ee6eeeb8
      Balbir Singh authored
      commit 99ead78a upstream.
      
      The current code would fail on huge pages addresses, since the shift would
      be incorrect. Use the correct page shift value returned by
      __find_linux_pte() to get the correct physical address. The code is more
      generic and can handle both regular and compound pages.
      
      Fixes: ba41e1e1 ("powerpc/mce: Hookup derror (load/store) UE errors")
      Signed-off-by: default avatarBalbir Singh <bsingharora@gmail.com>
      [arbab@linux.ibm.com: Fixup pseries_do_memory_failure()]
      Signed-off-by: default avatarReza Arbab <arbab@linux.ibm.com>
      Tested-by: default avatarMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: default avatarSantosh Sivaraj <santosh@fossix.org>
      Cc: stable@vger.kernel.org # v4.15+
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20190820081352.8641-3-santosh@fossix.orgSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ee6eeeb8
    • Oleksandr Suvorov's avatar
      ASoC: sgtl5000: Improve VAG power and mute control · 1284f207
      Oleksandr Suvorov authored
      commit b1f373a1 upstream.
      
      VAG power control is improved to fit the manual [1]. This patch fixes as
      minimum one bug: if customer muxes Headphone to Line-In right after boot,
      the VAG power remains off that leads to poor sound quality from line-in.
      
      I.e. after boot:
        - Connect sound source to Line-In jack;
        - Connect headphone to HP jack;
        - Run following commands:
        $ amixer set 'Headphone' 80%
        $ amixer set 'Headphone Mux' LINE_IN
      
      Change VAG power on/off control according to the following algorithm:
        - turn VAG power ON on the 1st incoming event.
        - keep it ON if there is any active VAG consumer (ADC/DAC/HP/Line-In).
        - turn VAG power OFF when there is the latest consumer's pre-down event
          come.
        - always delay after VAG power OFF to avoid pop.
        - delay after VAG power ON if the initiative consumer is Line-In, this
          prevents pop during line-in muxing.
      
      According to the data sheet [1], to avoid any pops/clicks,
      the outputs should be muted during input/output
      routing changes.
      
      [1] https://www.nxp.com/docs/en/data-sheet/SGTL5000.pdf
      
      Cc: stable@vger.kernel.org
      Fixes: 9b34e6cc ("ASoC: Add Freescale SGTL5000 codec support")
      Signed-off-by: default avatarOleksandr Suvorov <oleksandr.suvorov@toradex.com>
      Reviewed-by: default avatarMarcel Ziswiler <marcel.ziswiler@toradex.com>
      Reviewed-by: default avatarFabio Estevam <festevam@gmail.com>
      Reviewed-by: default avatarCezary Rojewski <cezary.rojewski@intel.com>
      Link: https://lore.kernel.org/r/20190719100524.23300-3-oleksandr.suvorov@toradex.comSigned-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1284f207
    • Oleksandr Suvorov's avatar
      ASoC: Define a set of DAPM pre/post-up events · 50090b75
      Oleksandr Suvorov authored
      commit cfc8f568 upstream.
      
      Prepare to use SND_SOC_DAPM_PRE_POST_PMU definition to
      reduce coming code size and make it more readable.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarOleksandr Suvorov <oleksandr.suvorov@toradex.com>
      Reviewed-by: default avatarMarcel Ziswiler <marcel.ziswiler@toradex.com>
      Reviewed-by: default avatarIgor Opaniuk <igor.opaniuk@toradex.com>
      Reviewed-by: default avatarFabio Estevam <festevam@gmail.com>
      Link: https://lore.kernel.org/r/20190719100524.23300-2-oleksandr.suvorov@toradex.comSigned-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      50090b75
    • Dmitry Osipenko's avatar
      PM / devfreq: tegra: Fix kHz to Hz conversion · 42b888f6
      Dmitry Osipenko authored
      commit 62bacb06 upstream.
      
      The kHz to Hz is incorrectly converted in a few places in the code,
      this results in a wrong frequency being calculated because devfreq core
      uses OPP frequencies that are given in Hz to clamp the rate, while
      tegra-devfreq gives to the core value in kHz and then it also expects to
      receive value in kHz from the core. In a result memory freq is always set
      to a value which is close to ULONG_MAX because of the bug. Hence the EMC
      frequency is always capped to the maximum and the driver doesn't do
      anything useful. This patch was tested on Tegra30 and Tegra124 SoC's, EMC
      frequency scaling works properly now.
      
      Cc: <stable@vger.kernel.org> # 4.14+
      Tested-by: default avatarSteev Klimaszewski <steev@kali.org>
      Reviewed-by: default avatarChanwoo Choi <cw00.choi@samsung.com>
      Signed-off-by: default avatarDmitry Osipenko <digetx@gmail.com>
      Acked-by: default avatarThierry Reding <treding@nvidia.com>
      Signed-off-by: default avatarMyungJoo Ham <myungjoo.ham@samsung.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      42b888f6
    • Mike Christie's avatar
      nbd: fix max number of supported devs · 9f0f39c9
      Mike Christie authored
      commit e9e006f5 upstream.
      
      This fixes a bug added in 4.10 with commit:
      
      commit 9561a7ad
      Author: Josef Bacik <jbacik@fb.com>
      Date:   Tue Nov 22 14:04:40 2016 -0500
      
          nbd: add multi-connection support
      
      that limited the number of devices to 256. Before the patch we could
      create 1000s of devices, but the patch switched us from using our
      own thread to using a work queue which has a default limit of 256
      active works.
      
      The problem is that our recv_work function sits in a loop until
      disconnection but only handles IO for one connection. The work is
      started when the connection is started/restarted, but if we end up
      creating 257 or more connections, the queue_work call just queues
      connection257+'s recv_work and that waits for connection 1 - 256's
      recv_work to be disconnected and that work instance completing.
      
      Instead of reverting back to kthreads, this has us allocate a
      workqueue_struct per device, so we can block in the work.
      
      Cc: stable@vger.kernel.org
      Reviewed-by: default avatarJosef Bacik <josef@toxicpanda.com>
      Signed-off-by: default avatarMike Christie <mchristi@redhat.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9f0f39c9
    • Jack Wang's avatar
      KVM: nVMX: handle page fault in vmread fix · eff3a54a
      Jack Wang authored
      During backport f7eea636 ("KVM: nVMX: handle page fault in vmread"),
      there was a mistake the exception reference should be passed to function
      kvm_write_guest_virt_system, instead of NULL, other wise, we will get
      NULL pointer deref, eg
      
      kvm-unit-test triggered a NULL pointer deref below:
      [  948.518437] kvm [24114]: vcpu0, guest rIP: 0x407ef9 kvm_set_msr_common: MSR_IA32_DEBUGCTLMSR 0x3, nop
      [  949.106464] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
      [  949.106707] PGD 0 P4D 0
      [  949.106872] Oops: 0002 [#1] SMP
      [  949.107038] CPU: 2 PID: 24126 Comm: qemu-2.7 Not tainted 4.19.77-pserver #4.19.77-1+feature+daily+update+20191005.1625+a4168bb~deb9
      [  949.107283] Hardware name: Dell Inc. Precision Tower 3620/09WH54, BIOS 2.7.3 01/31/2018
      [  949.107549] RIP: 0010:kvm_write_guest_virt_system+0x12/0x40 [kvm]
      [  949.107719] Code: c0 5d 41 5c 41 5d 41 5e 83 f8 03 41 0f 94 c0 41 c1 e0 02 e9 b0 ed ff ff 0f 1f 44 00 00 48 89 f0 c6 87 59 56 00 00 01 48 89 d6 <49> c7 00 00 00 00 00 89 ca 49 c7 40 08 00 00 00 00 49 c7 40 10 00
      [  949.108044] RSP: 0018:ffffb31b0a953cb0 EFLAGS: 00010202
      [  949.108216] RAX: 000000000046b4d8 RBX: ffff9e9f415b0000 RCX: 0000000000000008
      [  949.108389] RDX: ffffb31b0a953cc0 RSI: ffffb31b0a953cc0 RDI: ffff9e9f415b0000
      [  949.108562] RBP: 00000000d2e14928 R08: 0000000000000000 R09: 0000000000000000
      [  949.108733] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffffffffc8
      [  949.108907] R13: 0000000000000002 R14: ffff9e9f4f26f2e8 R15: 0000000000000000
      [  949.109079] FS:  00007eff8694c700(0000) GS:ffff9e9f51a80000(0000) knlGS:0000000031415928
      [  949.109318] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  949.109495] CR2: 0000000000000000 CR3: 00000003be53b002 CR4: 00000000003626e0
      [  949.109671] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  949.109845] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  949.110017] Call Trace:
      [  949.110186]  handle_vmread+0x22b/0x2f0 [kvm_intel]
      [  949.110356]  ? vmexit_fill_RSB+0xc/0x30 [kvm_intel]
      [  949.110549]  kvm_arch_vcpu_ioctl_run+0xa98/0x1b30 [kvm]
      [  949.110725]  ? kvm_vcpu_ioctl+0x388/0x5d0 [kvm]
      [  949.110901]  kvm_vcpu_ioctl+0x388/0x5d0 [kvm]
      [  949.111072]  do_vfs_ioctl+0xa2/0x620
      Signed-off-by: default avatarJack Wang <jinpu.wang@cloud.ionos.com>
      Acked-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      eff3a54a
    • Wanpeng Li's avatar
      KVM: X86: Fix userspace set invalid CR4 · 21874027
      Wanpeng Li authored
      commit 3ca94192 upstream.
      
      Reported by syzkaller:
      
      	WARNING: CPU: 0 PID: 6544 at /home/kernel/data/kvm/arch/x86/kvm//vmx/vmx.c:4689 handle_desc+0x37/0x40 [kvm_intel]
      	CPU: 0 PID: 6544 Comm: a.out Tainted: G           OE     5.3.0-rc4+ #4
      	RIP: 0010:handle_desc+0x37/0x40 [kvm_intel]
      	Call Trace:
      	 vmx_handle_exit+0xbe/0x6b0 [kvm_intel]
      	 vcpu_enter_guest+0x4dc/0x18d0 [kvm]
      	 kvm_arch_vcpu_ioctl_run+0x407/0x660 [kvm]
      	 kvm_vcpu_ioctl+0x3ad/0x690 [kvm]
      	 do_vfs_ioctl+0xa2/0x690
      	 ksys_ioctl+0x6d/0x80
      	 __x64_sys_ioctl+0x1a/0x20
      	 do_syscall_64+0x74/0x720
      	 entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      When CR4.UMIP is set, guest should have UMIP cpuid flag. Current
      kvm set_sregs function doesn't have such check when userspace inputs
      sregs values. SECONDARY_EXEC_DESC is enabled on writes to CR4.UMIP
      in vmx_set_cr4 though guest doesn't have UMIP cpuid flag. The testcast
      triggers handle_desc warning when executing ltr instruction since
      guest architectural CR4 doesn't set UMIP. This patch fixes it by
      adding valid CR4 and CPUID combination checking in __set_sregs.
      
      syzkaller source: https://syzkaller.appspot.com/x/repro.c?x=138efb99600000
      
      Reported-by: syzbot+0f1819555fbdce992df9@syzkaller.appspotmail.com
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarWanpeng Li <wanpengli@tencent.com>
      Reviewed-by: default avatarSean Christopherson <sean.j.christopherson@intel.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      21874027
    • Paul Mackerras's avatar
      KVM: PPC: Book3S HV: Don't lose pending doorbell request on migration on P9 · 30fbe0d3
      Paul Mackerras authored
      commit ff42df49 upstream.
      
      On POWER9, when userspace reads the value of the DPDES register on a
      vCPU, it is possible for 0 to be returned although there is a doorbell
      interrupt pending for the vCPU.  This can lead to a doorbell interrupt
      being lost across migration.  If the guest kernel uses doorbell
      interrupts for IPIs, then it could malfunction because of the lost
      interrupt.
      
      This happens because a newly-generated doorbell interrupt is signalled
      by setting vcpu->arch.doorbell_request to 1; the DPDES value in
      vcpu->arch.vcore->dpdes is not updated, because it can only be updated
      when holding the vcpu mutex, in order to avoid races.
      
      To fix this, we OR in vcpu->arch.doorbell_request when reading the
      DPDES value.
      
      Cc: stable@vger.kernel.org # v4.13+
      Fixes: 57900694 ("KVM: PPC: Book3S HV: Virtualize doorbell facility on POWER9")
      Signed-off-by: default avatarPaul Mackerras <paulus@ozlabs.org>
      Tested-by: default avatarAlexey Kardashevskiy <aik@ozlabs.ru>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      30fbe0d3
    • Paul Mackerras's avatar
      KVM: PPC: Book3S HV: Check for MMU ready on piggybacked virtual cores · 4faa7f05
      Paul Mackerras authored
      commit d28eafc5 upstream.
      
      When we are running multiple vcores on the same physical core, they
      could be from different VMs and so it is possible that one of the
      VMs could have its arch.mmu_ready flag cleared (for example by a
      concurrent HPT resize) when we go to run it on a physical core.
      We currently check the arch.mmu_ready flag for the primary vcore
      but not the flags for the other vcores that will be run alongside
      it.  This adds that check, and also a check when we select the
      secondary vcores from the preempted vcores list.
      
      Cc: stable@vger.kernel.org # v4.14+
      Fixes: 38c53af8 ("KVM: PPC: Book3S HV: Fix exclusion between HPT resizing and other HPT updates")
      Signed-off-by: default avatarPaul Mackerras <paulus@ozlabs.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4faa7f05
    • Paul Mackerras's avatar
      KVM: PPC: Book3S HV: Fix race in re-enabling XIVE escalation interrupts · 577a5119
      Paul Mackerras authored
      commit 959c5d51 upstream.
      
      Escalation interrupts are interrupts sent to the host by the XIVE
      hardware when it has an interrupt to deliver to a guest VCPU but that
      VCPU is not running anywhere in the system.  Hence we disable the
      escalation interrupt for the VCPU being run when we enter the guest
      and re-enable it when the guest does an H_CEDE hypercall indicating
      it is idle.
      
      It is possible that an escalation interrupt gets generated just as we
      are entering the guest.  In that case the escalation interrupt may be
      using a queue entry in one of the interrupt queues, and that queue
      entry may not have been processed when the guest exits with an H_CEDE.
      The existing entry code detects this situation and does not clear the
      vcpu->arch.xive_esc_on flag as an indication that there is a pending
      queue entry (if the queue entry gets processed, xive_esc_irq() will
      clear the flag).  There is a comment in the code saying that if the
      flag is still set on H_CEDE, we have to abort the cede rather than
      re-enabling the escalation interrupt, lest we end up with two
      occurrences of the escalation interrupt in the interrupt queue.
      
      However, the exit code doesn't do that; it aborts the cede in the sense
      that vcpu->arch.ceded gets cleared, but it still enables the escalation
      interrupt by setting the source's PQ bits to 00.  Instead we need to
      set the PQ bits to 10, indicating that an interrupt has been triggered.
      We also need to avoid setting vcpu->arch.xive_esc_on in this case
      (i.e. vcpu->arch.xive_esc_on seen to be set on H_CEDE) because
      xive_esc_irq() will run at some point and clear it, and if we race with
      that we may end up with an incorrect result (i.e. xive_esc_on set when
      the escalation interrupt has just been handled).
      
      It is extremely unlikely that having two queue entries would cause
      observable problems; theoretically it could cause queue overflow, but
      the CPU would have to have thousands of interrupts targetted to it for
      that to be possible.  However, this fix will also make it possible to
      determine accurately whether there is an unhandled escalation
      interrupt in the queue, which will be needed by the following patch.
      
      Fixes: 9b9b13a6 ("KVM: PPC: Book3S HV: Keep XIVE escalation interrupt masked unless ceded")
      Cc: stable@vger.kernel.org # v4.16+
      Signed-off-by: default avatarPaul Mackerras <paulus@ozlabs.org>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20190813100349.GD9567@blackberrySigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      577a5119
    • Vasily Gorbik's avatar
      s390/cio: exclude subchannels with no parent from pseudo check · 46cb14a5
      Vasily Gorbik authored
      commit ab575884 upstream.
      
      ccw console is created early in start_kernel and used before css is
      initialized or ccw console subchannel is registered. Until then console
      subchannel does not have a parent. For that reason assume subchannels
      with no parent are not pseudo subchannels. This fixes the following
      kasan finding:
      
      BUG: KASAN: global-out-of-bounds in sch_is_pseudo_sch+0x8e/0x98
      Read of size 8 at addr 00000000000005e8 by task swapper/0/0
      
      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.3.0-rc8-07370-g6ac43dd12538 #2
      Hardware name: IBM 2964 NC9 702 (z/VM 6.4.0)
      Call Trace:
      ([<000000000012cd76>] show_stack+0x14e/0x1e0)
       [<0000000001f7fb44>] dump_stack+0x1a4/0x1f8
       [<00000000007d7afc>] print_address_description+0x64/0x3c8
       [<00000000007d75f6>] __kasan_report+0x14e/0x180
       [<00000000018a2986>] sch_is_pseudo_sch+0x8e/0x98
       [<000000000189b950>] cio_enable_subchannel+0x1d0/0x510
       [<00000000018cac7c>] ccw_device_recognition+0x12c/0x188
       [<0000000002ceb1a8>] ccw_device_enable_console+0x138/0x340
       [<0000000002cf1cbe>] con3215_init+0x25e/0x300
       [<0000000002c8770a>] console_init+0x68a/0x9b8
       [<0000000002c6a3d6>] start_kernel+0x4fe/0x728
       [<0000000000100070>] startup_continue+0x70/0xd0
      
      Cc: stable@vger.kernel.org
      Reviewed-by: default avatarSebastian Ott <sebott@linux.ibm.com>
      Signed-off-by: default avatarVasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      46cb14a5
    • Vasily Gorbik's avatar
      s390/topology: avoid firing events before kobjs are created · 9aa823b3
      Vasily Gorbik authored
      commit f3122a79 upstream.
      
      arch_update_cpu_topology is first called from:
      kernel_init_freeable->sched_init_smp->sched_init_domains
      
      even before cpus has been registered in:
      kernel_init_freeable->do_one_initcall->s390_smp_init
      
      Do not trigger kobject_uevent change events until cpu devices are
      actually created. Fixes the following kasan findings:
      
      BUG: KASAN: global-out-of-bounds in kobject_uevent_env+0xb40/0xee0
      Read of size 8 at addr 0000000000000020 by task swapper/0/1
      
      BUG: KASAN: global-out-of-bounds in kobject_uevent_env+0xb36/0xee0
      Read of size 8 at addr 0000000000000018 by task swapper/0/1
      
      CPU: 0 PID: 1 Comm: swapper/0 Tainted: G    B
      Hardware name: IBM 3906 M04 704 (LPAR)
      Call Trace:
      ([<0000000143c6db7e>] show_stack+0x14e/0x1a8)
       [<0000000145956498>] dump_stack+0x1d0/0x218
       [<000000014429fb4c>] print_address_description+0x64/0x380
       [<000000014429f630>] __kasan_report+0x138/0x168
       [<0000000145960b96>] kobject_uevent_env+0xb36/0xee0
       [<0000000143c7c47c>] arch_update_cpu_topology+0x104/0x108
       [<0000000143df9e22>] sched_init_domains+0x62/0xe8
       [<000000014644c94a>] sched_init_smp+0x3a/0xc0
       [<0000000146433a20>] kernel_init_freeable+0x558/0x958
       [<000000014599002a>] kernel_init+0x22/0x160
       [<00000001459a71d4>] ret_from_fork+0x28/0x30
       [<00000001459a71dc>] kernel_thread_starter+0x0/0x10
      
      Cc: stable@vger.kernel.org
      Reviewed-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: default avatarVasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9aa823b3
    • Thomas Huth's avatar
      KVM: s390: Test for bad access register and size at the start of S390_MEM_OP · ddfef75f
      Thomas Huth authored
      commit a13b03bb upstream.
      
      If the KVM_S390_MEM_OP ioctl is called with an access register >= 16,
      then there is certainly a bug in the calling userspace application.
      We check for wrong access registers, but only if the vCPU was already
      in the access register mode before (i.e. the SIE block has recorded
      it). The check is also buried somewhere deep in the calling chain (in
      the function ar_translation()), so this is somewhat hard to find.
      
      It's better to always report an error to the userspace in case this
      field is set wrong, and it's safer in the KVM code if we block wrong
      values here early instead of relying on a check somewhere deep down
      the calling chain, so let's add another check to kvm_s390_guest_mem_op()
      directly.
      
      We also should check that the "size" is non-zero here (thanks to Janosch
      Frank for the hint!). If we do not check the size, we could call vmalloc()
      with this 0 value, and this will cause a kernel warning.
      Signed-off-by: default avatarThomas Huth <thuth@redhat.com>
      Link: https://lkml.kernel.org/r/20190829122517.31042-1-thuth@redhat.comReviewed-by: default avatarCornelia Huck <cohuck@redhat.com>
      Reviewed-by: default avatarJanosch Frank <frankja@linux.ibm.com>
      Reviewed-by: default avatarDavid Hildenbrand <david@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarChristian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ddfef75f
    • Vasily Gorbik's avatar
      s390/process: avoid potential reading of freed stack · 8b41a30f
      Vasily Gorbik authored
      commit 8769f610 upstream.
      
      With THREAD_INFO_IN_TASK (which is selected on s390) task's stack usage
      is refcounted and should always be protected by get/put when touching
      other task's stack to avoid race conditions with task's destruction code.
      
      Fixes: d5c352cd ("s390: move thread_info into task_struct")
      Cc: stable@vger.kernel.org # v4.10+
      Acked-by: default avatarIlya Leoshkevich <iii@linux.ibm.com>
      Signed-off-by: default avatarVasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8b41a30f
  2. 07 Oct, 2019 25 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.19.78 · 58fce206
      Greg Kroah-Hartman authored
      58fce206
    • Bharath Vedartham's avatar
      9p/cache.c: Fix memory leak in v9fs_cache_session_get_cookie · 5b0446c8
      Bharath Vedartham authored
      commit 962a991c upstream.
      
      v9fs_cache_session_get_cookie assigns a random cachetag to v9ses->cachetag,
      if the cachetag is not assigned previously.
      
      v9fs_random_cachetag allocates memory to v9ses->cachetag with kmalloc and uses
      scnprintf to fill it up with a cachetag.
      
      But if scnprintf fails, v9ses->cachetag is not freed in the current
      code causing a memory leak.
      
      Fix this by freeing v9ses->cachetag it v9fs_random_cachetag fails.
      
      This was reported by syzbot, the link to the report is below:
      https://syzkaller.appspot.com/bug?id=f012bdf297a7a4c860c38a88b44fbee43fd9bbf3
      
      Link: http://lkml.kernel.org/r/20190522194519.GA5313@bharath12345-Inspiron-5559
      Reported-by: syzbot+3a030a73b6c1e9833815@syzkaller.appspotmail.com
      Signed-off-by: default avatarBharath Vedartham <linux.bhar@gmail.com>
      Signed-off-by: default avatarDominique Martinet <dominique.martinet@cea.fr>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5b0446c8
    • Tetsuo Handa's avatar
      kexec: bail out upon SIGKILL when allocating memory. · d85bc11a
      Tetsuo Handa authored
      commit 7c3a6aed upstream.
      
      syzbot found that a thread can stall for minutes inside kexec_load() after
      that thread was killed by SIGKILL [1].  It turned out that the reproducer
      was trying to allocate 2408MB of memory using kimage_alloc_page() from
      kimage_load_normal_segment().  Let's check for SIGKILL before doing memory
      allocation.
      
      [1] https://syzkaller.appspot.com/bug?id=a0e3436829698d5824231251fad9d8e998f94f5e
      
      Link: http://lkml.kernel.org/r/993c9185-d324-2640-d061-bed2dd18b1f7@I-love.SAKURA.ne.jpSigned-off-by: default avatarTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reported-by: default avatarsyzbot <syzbot+8ab2d0f39fb79fe6ca40@syzkaller.appspotmail.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Reviewed-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d85bc11a
    • Andrey Konovalov's avatar
      NFC: fix attrs checks in netlink interface · c8a65ec0
      Andrey Konovalov authored
      commit 18917d51 upstream.
      
      nfc_genl_deactivate_target() relies on the NFC_ATTR_TARGET_INDEX
      attribute being present, but doesn't check whether it is actually
      provided by the user. Same goes for nfc_genl_fw_download() and
      NFC_ATTR_FIRMWARE_NAME.
      
      This patch adds appropriate checks.
      
      Found with syzkaller.
      Signed-off-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c8a65ec0
    • Eric Biggers's avatar
      smack: use GFP_NOFS while holding inode_smack::smk_lock · 1b425032
      Eric Biggers authored
      commit e5bfad3d upstream.
      
      inode_smack::smk_lock is taken during smack_d_instantiate(), which is
      called during a filesystem transaction when creating a file on ext4.
      Therefore to avoid a deadlock, all code that takes this lock must use
      GFP_NOFS, to prevent memory reclaim from waiting for the filesystem
      transaction to complete.
      
      Reported-by: syzbot+0eefc1e06a77d327a056@syzkaller.appspotmail.com
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarEric Biggers <ebiggers@google.com>
      Signed-off-by: default avatarCasey Schaufler <casey@schaufler-ca.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1b425032
    • Jann Horn's avatar
      Smack: Don't ignore other bprm->unsafe flags if LSM_UNSAFE_PTRACE is set · ef9744a0
      Jann Horn authored
      commit 3675f052 upstream.
      
      There is a logic bug in the current smack_bprm_set_creds():
      If LSM_UNSAFE_PTRACE is set, but the ptrace state is deemed to be
      acceptable (e.g. because the ptracer detached in the meantime), the other
      ->unsafe flags aren't checked. As far as I can tell, this means that
      something like the following could work (but I haven't tested it):
      
       - task A: create task B with fork()
       - task B: set NO_NEW_PRIVS
       - task B: install a seccomp filter that makes open() return 0 under some
         conditions
       - task B: replace fd 0 with a malicious library
       - task A: attach to task B with PTRACE_ATTACH
       - task B: execve() a file with an SMACK64EXEC extended attribute
       - task A: while task B is still in the middle of execve(), exit (which
         destroys the ptrace relationship)
      
      Make sure that if any flags other than LSM_UNSAFE_PTRACE are set in
      bprm->unsafe, we reject the execve().
      
      Cc: stable@vger.kernel.org
      Fixes: 5663884c ("Smack: unify all ptrace accesses in the smack")
      Signed-off-by: default avatarJann Horn <jannh@google.com>
      Signed-off-by: default avatarCasey Schaufler <casey@schaufler-ca.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ef9744a0
    • Pierre-Louis Bossart's avatar
      soundwire: fix regmap dependencies and align with other serial links · 47035934
      Pierre-Louis Bossart authored
      [ Upstream commit 8676b3ca ]
      
      The existing code has a mixed select/depend usage which makes no sense.
      
      config SOUNDWIRE_BUS
             tristate
             select REGMAP_SOUNDWIRE
      
      config REGMAP_SOUNDWIRE
              tristate
              depends on SOUNDWIRE_BUS
      
      Let's remove one layer of Kconfig definitions and align with the
      solutions used by all other serial links.
      Signed-off-by: default avatarPierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
      Link: https://lore.kernel.org/r/20190718230215.18675-1-pierre-louis.bossart@linux.intel.comSigned-off-by: default avatarVinod Koul <vkoul@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      47035934
    • Pierre-Louis Bossart's avatar
      soundwire: Kconfig: fix help format · 322753c7
      Pierre-Louis Bossart authored
      [ Upstream commit 9d7cd9d5 ]
      
      Move to the regular help format, --help-- is no longer recommended.
      Reviewed-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarPierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      322753c7
    • Eric Dumazet's avatar
      sch_cbq: validate TCA_CBQ_WRROPT to avoid crash · 74e2a311
      Eric Dumazet authored
      [ Upstream commit e9789c7c ]
      
      syzbot reported a crash in cbq_normalize_quanta() caused
      by an out of range cl->priority.
      
      iproute2 enforces this check, but malicious users do not.
      
      kasan: CONFIG_KASAN_INLINE enabled
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] SMP KASAN PTI
      Modules linked in:
      CPU: 1 PID: 26447 Comm: syz-executor.1 Not tainted 5.3+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:cbq_normalize_quanta.part.0+0x1fd/0x430 net/sched/sch_cbq.c:902
      RSP: 0018:ffff8801a5c333b0 EFLAGS: 00010206
      RAX: 0000000020000003 RBX: 00000000fffffff8 RCX: ffffc9000712f000
      RDX: 00000000000043bf RSI: ffffffff83be8962 RDI: 0000000100000018
      RBP: ffff8801a5c33420 R08: 000000000000003a R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000002ef
      R13: ffff88018da95188 R14: dffffc0000000000 R15: 0000000000000015
      FS:  00007f37d26b1700(0000) GS:ffff8801dad00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000004c7cec CR3: 00000001bcd0a006 CR4: 00000000001626f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       [<ffffffff83be9d57>] cbq_normalize_quanta include/net/pkt_sched.h:27 [inline]
       [<ffffffff83be9d57>] cbq_addprio net/sched/sch_cbq.c:1097 [inline]
       [<ffffffff83be9d57>] cbq_set_wrr+0x2d7/0x450 net/sched/sch_cbq.c:1115
       [<ffffffff83bee8a7>] cbq_change_class+0x987/0x225b net/sched/sch_cbq.c:1537
       [<ffffffff83b96985>] tc_ctl_tclass+0x555/0xcd0 net/sched/sch_api.c:2329
       [<ffffffff83a84655>] rtnetlink_rcv_msg+0x485/0xc10 net/core/rtnetlink.c:5248
       [<ffffffff83cadf0a>] netlink_rcv_skb+0x17a/0x460 net/netlink/af_netlink.c:2510
       [<ffffffff83a7db6d>] rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5266
       [<ffffffff83cac2c6>] netlink_unicast_kernel net/netlink/af_netlink.c:1324 [inline]
       [<ffffffff83cac2c6>] netlink_unicast+0x536/0x720 net/netlink/af_netlink.c:1350
       [<ffffffff83cacd4a>] netlink_sendmsg+0x89a/0xd50 net/netlink/af_netlink.c:1939
       [<ffffffff8399d46e>] sock_sendmsg_nosec net/socket.c:673 [inline]
       [<ffffffff8399d46e>] sock_sendmsg+0x12e/0x170 net/socket.c:684
       [<ffffffff8399f1fd>] ___sys_sendmsg+0x81d/0x960 net/socket.c:2359
       [<ffffffff839a2d05>] __sys_sendmsg+0x105/0x1d0 net/socket.c:2397
       [<ffffffff839a2df9>] SYSC_sendmsg net/socket.c:2406 [inline]
       [<ffffffff839a2df9>] SyS_sendmsg+0x29/0x30 net/socket.c:2404
       [<ffffffff8101ccc8>] do_syscall_64+0x528/0x770 arch/x86/entry/common.c:305
       [<ffffffff84400091>] entry_SYSCALL_64_after_hwframe+0x42/0xb7
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      74e2a311
    • Tuong Lien's avatar
      tipc: fix unlimited bundling of small messages · ed9420dd
      Tuong Lien authored
      [ Upstream commit e95584a8 ]
      
      We have identified a problem with the "oversubscription" policy in the
      link transmission code.
      
      When small messages are transmitted, and the sending link has reached
      the transmit window limit, those messages will be bundled and put into
      the link backlog queue. However, bundles of data messages are counted
      at the 'CRITICAL' level, so that the counter for that level, instead of
      the counter for the real, bundled message's level is the one being
      increased.
      Subsequent, to-be-bundled data messages at non-CRITICAL levels continue
      to be tested against the unchanged counter for their own level, while
      contributing to an unrestrained increase at the CRITICAL backlog level.
      
      This leaves a gap in congestion control algorithm for small messages
      that can result in starvation for other users or a "real" CRITICAL
      user. Even that eventually can lead to buffer exhaustion & link reset.
      
      We fix this by keeping a 'target_bskb' buffer pointer at each levels,
      then when bundling, we only bundle messages at the same importance
      level only. This way, we know exactly how many slots a certain level
      have occupied in the queue, so can manage level congestion accurately.
      
      By bundling messages at the same level, we even have more benefits. Let
      consider this:
      - One socket sends 64-byte messages at the 'CRITICAL' level;
      - Another sends 4096-byte messages at the 'LOW' level;
      
      When a 64-byte message comes and is bundled the first time, we put the
      overhead of message bundle to it (+ 40-byte header, data copy, etc.)
      for later use, but the next message can be a 4096-byte one that cannot
      be bundled to the previous one. This means the last bundle carries only
      one payload message which is totally inefficient, as for the receiver
      also! Later on, another 64-byte message comes, now we make a new bundle
      and the same story repeats...
      
      With the new bundling algorithm, this will not happen, the 64-byte
      messages will be bundled together even when the 4096-byte message(s)
      comes in between. However, if the 4096-byte messages are sent at the
      same level i.e. 'CRITICAL', the bundling algorithm will again cause the
      same overhead.
      
      Also, the same will happen even with only one socket sending small
      messages at a rate close to the link transmit's one, so that, when one
      message is bundled, it's transmitted shortly. Then, another message
      comes, a new bundle is created and so on...
      
      We will solve this issue radically by another patch.
      
      Fixes: 365ad353 ("tipc: reduce risk of user starvation during link congestion")
      Reported-by: default avatarHoang Le <hoang.h.le@dektech.com.au>
      Acked-by: default avatarJon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: default avatarTuong Lien <tuong.t.lien@dektech.com.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ed9420dd
    • Dongli Zhang's avatar
      xen-netfront: do not use ~0U as error return value for xennet_fill_frags() · a1afd826
      Dongli Zhang authored
      [ Upstream commit a761129e ]
      
      xennet_fill_frags() uses ~0U as return value when the sk_buff is not able
      to cache extra fragments. This is incorrect because the return type of
      xennet_fill_frags() is RING_IDX and 0xffffffff is an expected value for
      ring buffer index.
      
      In the situation when the rsp_cons is approaching 0xffffffff, the return
      value of xennet_fill_frags() may become 0xffffffff which xennet_poll() (the
      caller) would regard as error. As a result, queue->rx.rsp_cons is set
      incorrectly because it is updated only when there is error. If there is no
      error, xennet_poll() would be responsible to update queue->rx.rsp_cons.
      Finally, queue->rx.rsp_cons would point to the rx ring buffer entries whose
      queue->rx_skbs[i] and queue->grant_rx_ref[i] are already cleared to NULL.
      This leads to NULL pointer access in the next iteration to process rx ring
      buffer entries.
      
      The symptom is similar to the one fixed in
      commit 00b36850 ("xen-netfront: do not assume sk_buff_head list is
      empty in error handling").
      
      This patch changes the return type of xennet_fill_frags() to indicate
      whether it is successful or failed. The queue->rx.rsp_cons will be
      always updated inside this function.
      
      Fixes: ad4f15dc ("xen/netfront: don't bug in case of too many frags")
      Signed-off-by: default avatarDongli Zhang <dongli.zhang@oracle.com>
      Reviewed-by: default avatarJuergen Gross <jgross@suse.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a1afd826
    • Dotan Barak's avatar
      net/rds: Fix error handling in rds_ib_add_one() · 36a4043c
      Dotan Barak authored
      [ Upstream commit d64bf89a ]
      
      rds_ibdev:ipaddr_list and rds_ibdev:conn_list are initialized
      after allocation some resources such as protection domain.
      If allocation of such resources fail, then these uninitialized
      variables are accessed in rds_ib_dev_free() in failure path. This
      can potentially crash the system. The code has been updated to
      initialize these variables very early in the function.
      Signed-off-by: default avatarDotan Barak <dotanb@dev.mellanox.co.il>
      Signed-off-by: default avatarSudhakar Dindukurti <sudhakar.dindukurti@oracle.com>
      Acked-by: default avatarSantosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      36a4043c
    • Josh Hunt's avatar
      udp: only do GSO if # of segs > 1 · 012363f5
      Josh Hunt authored
      [ Upstream commit 4094871d ]
      
      Prior to this change an application sending <= 1MSS worth of data and
      enabling UDP GSO would fail if the system had SW GSO enabled, but the
      same send would succeed if HW GSO offload is enabled. In addition to this
      inconsistency the error in the SW GSO case does not get back to the
      application if sending out of a real device so the user is unaware of this
      failure.
      
      With this change we only perform GSO if the # of segments is > 1 even
      if the application has enabled segmentation. I've also updated the
      relevant udpgso selftests.
      
      Fixes: bec1f6f6 ("udp: generate gso with UDP_SEGMENT")
      Signed-off-by: default avatarJosh Hunt <johunt@akamai.com>
      Reviewed-by: default avatarWillem de Bruijn <willemb@google.com>
      Reviewed-by: default avatarAlexander Duyck <alexander.h.duyck@linux.intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      012363f5
    • Linus Walleij's avatar
      net: dsa: rtl8366: Check VLAN ID and not ports · 5c08d7e4
      Linus Walleij authored
      [ Upstream commit e8521e53 ]
      
      There has been some confusion between the port number and
      the VLAN ID in this driver. What we need to check for
      validity is the VLAN ID, nothing else.
      
      The current confusion came from assigning a few default
      VLANs for default routing and we need to rewrite that
      properly.
      
      Instead of checking if the port number is a valid VLAN
      ID, check the actual VLAN IDs passed in to the callback
      one by one as expected.
      
      Fixes: d8652956 ("net: dsa: realtek-smi: Add Realtek SMI driver")
      Signed-off-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5c08d7e4
    • Dexuan Cui's avatar
      vsock: Fix a lockdep warning in __vsock_release() · 3c1f0704
      Dexuan Cui authored
      [ Upstream commit 0d9138ff ]
      
      Lockdep is unhappy if two locks from the same class are held.
      
      Fix the below warning for hyperv and virtio sockets (vmci socket code
      doesn't have the issue) by using lock_sock_nested() when __vsock_release()
      is called recursively:
      
      ============================================
      WARNING: possible recursive locking detected
      5.3.0+ #1 Not tainted
      --------------------------------------------
      server/1795 is trying to acquire lock:
      ffff8880c5158990 (sk_lock-AF_VSOCK){+.+.}, at: hvs_release+0x10/0x120 [hv_sock]
      
      but task is already holding lock:
      ffff8880c5158150 (sk_lock-AF_VSOCK){+.+.}, at: __vsock_release+0x2e/0xf0 [vsock]
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock(sk_lock-AF_VSOCK);
        lock(sk_lock-AF_VSOCK);
      
       *** DEADLOCK ***
      
       May be due to missing lock nesting notation
      
      2 locks held by server/1795:
       #0: ffff8880c5d05ff8 (&sb->s_type->i_mutex_key#10){+.+.}, at: __sock_release+0x2d/0xa0
       #1: ffff8880c5158150 (sk_lock-AF_VSOCK){+.+.}, at: __vsock_release+0x2e/0xf0 [vsock]
      
      stack backtrace:
      CPU: 5 PID: 1795 Comm: server Not tainted 5.3.0+ #1
      Call Trace:
       dump_stack+0x67/0x90
       __lock_acquire.cold.67+0xd2/0x20b
       lock_acquire+0xb5/0x1c0
       lock_sock_nested+0x6d/0x90
       hvs_release+0x10/0x120 [hv_sock]
       __vsock_release+0x24/0xf0 [vsock]
       __vsock_release+0xa0/0xf0 [vsock]
       vsock_release+0x12/0x30 [vsock]
       __sock_release+0x37/0xa0
       sock_close+0x14/0x20
       __fput+0xc1/0x250
       task_work_run+0x98/0xc0
       do_exit+0x344/0xc60
       do_group_exit+0x47/0xb0
       get_signal+0x15c/0xc50
       do_signal+0x30/0x720
       exit_to_usermode_loop+0x50/0xa0
       do_syscall_64+0x24e/0x270
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x7f4184e85f31
      Tested-by: default avatarStefano Garzarella <sgarzare@redhat.com>
      Signed-off-by: default avatarDexuan Cui <decui@microsoft.com>
      Reviewed-by: default avatarStefano Garzarella <sgarzare@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3c1f0704
    • Josh Hunt's avatar
      udp: fix gso_segs calculations · 544aee54
      Josh Hunt authored
      [ Upstream commit 44b321e5 ]
      
      Commit dfec0ee2 ("udp: Record gso_segs when supporting UDP segmentation offload")
      added gso_segs calculation, but incorrectly got sizeof() the pointer and
      not the underlying data type. In addition let's fix the v6 case.
      
      Fixes: bec1f6f6 ("udp: generate gso with UDP_SEGMENT")
      Fixes: dfec0ee2 ("udp: Record gso_segs when supporting UDP segmentation offload")
      Signed-off-by: default avatarJosh Hunt <johunt@akamai.com>
      Reviewed-by: default avatarAlexander Duyck <alexander.h.duyck@linux.intel.com>
      Acked-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      544aee54
    • Eric Dumazet's avatar
      sch_dsmark: fix potential NULL deref in dsmark_init() · 79fd59ae
      Eric Dumazet authored
      [ Upstream commit 474f0813 ]
      
      Make sure TCA_DSMARK_INDICES was provided by the user.
      
      syzbot reported :
      
      kasan: CONFIG_KASAN_INLINE enabled
      kasan: GPF could be caused by NULL-ptr deref or user memory access
      general protection fault: 0000 [#1] PREEMPT SMP KASAN
      CPU: 1 PID: 8799 Comm: syz-executor235 Not tainted 5.3.0+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:nla_get_u16 include/net/netlink.h:1501 [inline]
      RIP: 0010:dsmark_init net/sched/sch_dsmark.c:364 [inline]
      RIP: 0010:dsmark_init+0x193/0x640 net/sched/sch_dsmark.c:339
      Code: 85 db 58 0f 88 7d 03 00 00 e8 e9 1a ac fb 48 8b 9d 70 ff ff ff 48 b8 00 00 00 00 00 fc ff df 48 8d 7b 04 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 01 38 d0 7c 08 84 d2 0f 85 ca
      RSP: 0018:ffff88809426f3b8 EFLAGS: 00010247
      RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff85c6eb09
      RDX: 0000000000000000 RSI: ffffffff85c6eb17 RDI: 0000000000000004
      RBP: ffff88809426f4b0 R08: ffff88808c4085c0 R09: ffffed1015d26159
      R10: ffffed1015d26158 R11: ffff8880ae930ac7 R12: ffff8880a7e96940
      R13: dffffc0000000000 R14: ffff88809426f8c0 R15: 0000000000000000
      FS:  0000000001292880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020000080 CR3: 000000008ca1b000 CR4: 00000000001406e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       qdisc_create+0x4ee/0x1210 net/sched/sch_api.c:1237
       tc_modify_qdisc+0x524/0x1c50 net/sched/sch_api.c:1653
       rtnetlink_rcv_msg+0x463/0xb00 net/core/rtnetlink.c:5223
       netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
       rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5241
       netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
       netlink_unicast+0x531/0x710 net/netlink/af_netlink.c:1328
       netlink_sendmsg+0x8a5/0xd60 net/netlink/af_netlink.c:1917
       sock_sendmsg_nosec net/socket.c:637 [inline]
       sock_sendmsg+0xd7/0x130 net/socket.c:657
       ___sys_sendmsg+0x803/0x920 net/socket.c:2311
       __sys_sendmsg+0x105/0x1d0 net/socket.c:2356
       __do_sys_sendmsg net/socket.c:2365 [inline]
       __se_sys_sendmsg net/socket.c:2363 [inline]
       __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2363
       do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x440369
      
      Fixes: 758cc43c ("[PKT_SCHED]: Fix dsmark to apply changes consistent")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      79fd59ae
    • David Howells's avatar
      rxrpc: Fix rxrpc_recvmsg tracepoint · 76b55277
      David Howells authored
      [ Upstream commit db9b2e0a ]
      
      Fix the rxrpc_recvmsg tracepoint to handle being called with a NULL call
      parameter.
      
      Fixes: a25e21f0 ("rxrpc, afs: Use debug_ids rather than pointers in traces")
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      76b55277
    • Reinhard Speyerer's avatar
      qmi_wwan: add support for Cinterion CLS8 devices · 7047aae6
      Reinhard Speyerer authored
      [ Upstream commit cf74ac6d ]
      
      Add support for Cinterion CLS8 devices.
      Use QMI_QUIRK_SET_DTR as required for Qualcomm MDM9x07 chipsets.
      
      T:  Bus=01 Lev=03 Prnt=05 Port=01 Cnt=02 Dev#= 25 Spd=480  MxCh= 0
      D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
      P:  Vendor=1e2d ProdID=00b0 Rev= 3.18
      S:  Manufacturer=GEMALTO
      S:  Product=USB Modem
      C:* #Ifs= 5 Cfg#= 1 Atr=80 MxPwr=500mA
      I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
      E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      E:  Ad=83(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
      E:  Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      E:  Ad=85(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
      E:  Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
      E:  Ad=87(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
      E:  Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
      E:  Ad=89(I) Atr=03(Int.) MxPS=   8 Ivl=32ms
      E:  Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      E:  Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      Signed-off-by: default avatarReinhard Speyerer <rspmn@arcor.de>
      Acked-by: default avatarBjørn Mork <bjorn@mork.no>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7047aae6
    • Eric Dumazet's avatar
      nfc: fix memory leak in llcp_sock_bind() · dd9c580a
      Eric Dumazet authored
      [ Upstream commit a0c2dc1f ]
      
      sysbot reported a memory leak after a bind() has failed.
      
      While we are at it, abort the operation if kmemdup() has failed.
      
      BUG: memory leak
      unreferenced object 0xffff888105d83ec0 (size 32):
        comm "syz-executor067", pid 7207, jiffies 4294956228 (age 19.430s)
        hex dump (first 32 bytes):
          00 69 6c 65 20 72 65 61 64 00 6e 65 74 3a 5b 34  .ile read.net:[4
          30 32 36 35 33 33 30 39 37 5d 00 00 00 00 00 00  026533097]......
        backtrace:
          [<0000000036bac473>] kmemleak_alloc_recursive /./include/linux/kmemleak.h:43 [inline]
          [<0000000036bac473>] slab_post_alloc_hook /mm/slab.h:522 [inline]
          [<0000000036bac473>] slab_alloc /mm/slab.c:3319 [inline]
          [<0000000036bac473>] __do_kmalloc /mm/slab.c:3653 [inline]
          [<0000000036bac473>] __kmalloc_track_caller+0x169/0x2d0 /mm/slab.c:3670
          [<000000000cd39d07>] kmemdup+0x27/0x60 /mm/util.c:120
          [<000000008e57e5fc>] kmemdup /./include/linux/string.h:432 [inline]
          [<000000008e57e5fc>] llcp_sock_bind+0x1b3/0x230 /net/nfc/llcp_sock.c:107
          [<000000009cb0b5d3>] __sys_bind+0x11c/0x140 /net/socket.c:1647
          [<00000000492c3bbc>] __do_sys_bind /net/socket.c:1658 [inline]
          [<00000000492c3bbc>] __se_sys_bind /net/socket.c:1656 [inline]
          [<00000000492c3bbc>] __x64_sys_bind+0x1e/0x30 /net/socket.c:1656
          [<0000000008704b2a>] do_syscall_64+0x76/0x1a0 /arch/x86/entry/common.c:296
          [<000000009f4c57a4>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 30cc4587 ("NFC: Move LLCP code to the NFC top level diirectory")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dd9c580a
    • Martin KaFai Lau's avatar
      net: Unpublish sk from sk_reuseport_cb before call_rcu · d5b1db1c
      Martin KaFai Lau authored
      [ Upstream commit 8c7138b3 ]
      
      The "reuse->sock[]" array is shared by multiple sockets.  The going away
      sk must unpublish itself from "reuse->sock[]" before making call_rcu()
      call.  However, this unpublish-action is currently done after a grace
      period and it may cause use-after-free.
      
      The fix is to move reuseport_detach_sock() to sk_destruct().
      Due to the above reason, any socket with sk_reuseport_cb has
      to go through the rcu grace period before freeing it.
      
      It is a rather old bug (~3 yrs).  The Fixes tag is not necessary
      the right commit but it is the one that introduced the SOCK_RCU_FREE
      logic and this fix is depending on it.
      
      Fixes: a4298e45 ("net: add SOCK_RCU_FREE socket flag")
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Suggested-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarMartin KaFai Lau <kafai@fb.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d5b1db1c
    • Navid Emamdoost's avatar
      net: qlogic: Fix memory leak in ql_alloc_large_buffers · 9d0995cc
      Navid Emamdoost authored
      [ Upstream commit 1acb8f2a ]
      
      In ql_alloc_large_buffers, a new skb is allocated via netdev_alloc_skb.
      This skb should be released if pci_dma_mapping_error fails.
      
      Fixes: 0f8ab89e ("qla3xxx: Check return code from pci_map_single() in ql_release_to_lrg_buf_free_list(), ql_populate_free_queue(), ql_alloc_large_buffers(), and ql3xxx_send()")
      Signed-off-by: default avatarNavid Emamdoost <navid.emamdoost@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9d0995cc
    • Paolo Abeni's avatar
      net: ipv4: avoid mixed n_redirects and rate_tokens usage · 124b64fe
      Paolo Abeni authored
      [ Upstream commit b406472b ]
      
      Since commit c09551c6 ("net: ipv4: use a dedicated counter
      for icmp_v4 redirect packets") we use 'n_redirects' to account
      for redirect packets, but we still use 'rate_tokens' to compute
      the redirect packets exponential backoff.
      
      If the device sent to the relevant peer any ICMP error packet
      after sending a redirect, it will also update 'rate_token' according
      to the leaking bucket schema; typically 'rate_token' will raise
      above BITS_PER_LONG and the redirect packets backoff algorithm
      will produce undefined behavior.
      
      Fix the issue using 'n_redirects' to compute the exponential backoff
      in ip_rt_send_redirect().
      
      Note that we still clear rate_tokens after a redirect silence period,
      to avoid changing an established behaviour.
      
      The root cause predates git history; before the mentioned commit in
      the critical scenario, the kernel stopped sending redirects, after
      the mentioned commit the behavior more randomic.
      Reported-by: default avatarXiumei Mu <xmu@redhat.com>
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Fixes: c09551c6 ("net: ipv4: use a dedicated counter for icmp_v4 redirect packets")
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Acked-by: default avatarLorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      124b64fe
    • David Ahern's avatar
      ipv6: Handle missing host route in __ipv6_ifa_notify · 6f8564ed
      David Ahern authored
      [ Upstream commit 2d819d25 ]
      
      Rajendra reported a kernel panic when a link was taken down:
      
          [ 6870.263084] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8
          [ 6870.271856] IP: [<ffffffff8efc5764>] __ipv6_ifa_notify+0x154/0x290
      
          <snip>
      
          [ 6870.570501] Call Trace:
          [ 6870.573238] [<ffffffff8efc58c6>] ? ipv6_ifa_notify+0x26/0x40
          [ 6870.579665] [<ffffffff8efc98ec>] ? addrconf_dad_completed+0x4c/0x2c0
          [ 6870.586869] [<ffffffff8efe70c6>] ? ipv6_dev_mc_inc+0x196/0x260
          [ 6870.593491] [<ffffffff8efc9c6a>] ? addrconf_dad_work+0x10a/0x430
          [ 6870.600305] [<ffffffff8f01ade4>] ? __switch_to_asm+0x34/0x70
          [ 6870.606732] [<ffffffff8ea93a7a>] ? process_one_work+0x18a/0x430
          [ 6870.613449] [<ffffffff8ea93d6d>] ? worker_thread+0x4d/0x490
          [ 6870.619778] [<ffffffff8ea93d20>] ? process_one_work+0x430/0x430
          [ 6870.626495] [<ffffffff8ea99dd9>] ? kthread+0xd9/0xf0
          [ 6870.632145] [<ffffffff8f01ade4>] ? __switch_to_asm+0x34/0x70
          [ 6870.638573] [<ffffffff8ea99d00>] ? kthread_park+0x60/0x60
          [ 6870.644707] [<ffffffff8f01ae77>] ? ret_from_fork+0x57/0x70
          [ 6870.650936] Code: 31 c0 31 d2 41 b9 20 00 08 02 b9 09 00 00 0
      
      addrconf_dad_work is kicked to be scheduled when a device is brought
      up. There is a race between addrcond_dad_work getting scheduled and
      taking the rtnl lock and a process taking the link down (under rtnl).
      The latter removes the host route from the inet6_addr as part of
      addrconf_ifdown which is run for NETDEV_DOWN. The former attempts
      to use the host route in __ipv6_ifa_notify. If the down event removes
      the host route due to the race to the rtnl, then the BUG listed above
      occurs.
      
      Since the DAD sequence can not be aborted, add a check for the missing
      host route in __ipv6_ifa_notify. The only way this should happen is due
      to the previously mentioned race. The host route is created when the
      address is added to an interface; it is only removed on a down event
      where the address is kept. Add a warning if the host route is missing
      AND the device is up; this is a situation that should never happen.
      
      Fixes: f1705ec1 ("net: ipv6: Make address flushing on ifdown optional")
      Reported-by: default avatarRajendra Dendukuri <rajendra.dendukuri@broadcom.com>
      Signed-off-by: default avatarDavid Ahern <dsahern@gmail.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6f8564ed
    • Eric Dumazet's avatar
      ipv6: drop incoming packets having a v4mapped source address · 658d7ee4
      Eric Dumazet authored
      [ Upstream commit 6af1799a ]
      
      This began with a syzbot report. syzkaller was injecting
      IPv6 TCP SYN packets having a v4mapped source address.
      
      After an unsuccessful 4-tuple lookup, TCP creates a request
      socket (SYN_RECV) and calls reqsk_queue_hash_req()
      
      reqsk_queue_hash_req() calls sk_ehashfn(sk)
      
      At this point we have AF_INET6 sockets, and the heuristic
      used by sk_ehashfn() to either hash the IPv4 or IPv6 addresses
      is to use ipv6_addr_v4mapped(&sk->sk_v6_daddr)
      
      For the particular spoofed packet, we end up hashing V4 addresses
      which were not initialized by the TCP IPv6 stack, so KMSAN fired
      a warning.
      
      I first fixed sk_ehashfn() to test both source and destination addresses,
      but then faced various problems, including user-space programs
      like packetdrill that had similar assumptions.
      
      Instead of trying to fix the whole ecosystem, it is better
      to admit that we have a dual stack behavior, and that we
      can not build linux kernels without V4 stack anyway.
      
      The dual stack API automatically forces the traffic to be IPv4
      if v4mapped addresses are used at bind() or connect(), so it makes
      no sense to allow IPv6 traffic to use the same v4mapped class.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Florian Westphal <fw@strlen.de>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      658d7ee4