1. 09 Mar, 2018 40 commits
    • ARM: dts: LogicPD SOM-LV: Fix I2C1 pinmux · 2b844657
      Adam Ford authored
      commit 84c7efd6 upstream.
      
      The pinmuxing was missing for I2C1 which was causing intermittent issues
      with the PMIC which is connected to I2C1.  The bootloader did not quite
      configure the I2C1 either, so when running at 2.6MHz, it was generating
      errors at times.
      
      This correctly sets the I2C1 pinmuxing so it can operate at 2.6MHz.
      
      Fixes: ab8dd3ae ("ARM: DTS: Add minimal Support for Logic PD DM3730
      SOM-LV")
      Signed-off-by: Adam Ford <aford173@gmail.com>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2b844657
    • ACPI / bus: Parse tables as term_list for Dell XPS 9570 and Precision M5530 · b2190cc3
      Kai Heng Feng authored
      commit 36904703 upstream.
      
      The i2c touchpad on Dell XPS 9570 and Precision M5530 doesn't work out
      of the box.
      
      The touchpad relies on its _INI method to update its _HID value from
      XXXX0000 to SYNA2393.
      
      Also, the _STA relies on value of I2CN to report correct status.
      
      Set acpi_gbl_parse_table_as_term_list so the value of I2CN can be
      correctly set up, and _INI can get run. The ACPI table in this machine
      is designed to get parsed this way.
      
      Also, change the quirk table to a more generic name.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=198515
      Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b2190cc3
    • KVM/x86: remove WARN_ON() for when vm_munmap() fails · b95f8ca8
      Eric Biggers authored
      commit 103c763c upstream.
      
      On x86, special KVM memslots such as the TSS region have anonymous
      memory mappings created on behalf of userspace, and these mappings are
      removed when the VM is destroyed.
      
      It is however possible for removing these mappings via vm_munmap() to
      fail.  This can most easily happen if the thread receives SIGKILL while
      it's waiting to acquire ->mmap_sem.   This triggers the 'WARN_ON(r < 0)'
      in __x86_set_memory_region().  syzkaller was able to hit this, using
      'exit()' to send the SIGKILL.  Note that while the vm_munmap() failure
      results in the mapping not being removed immediately, it is not leaked
      forever but rather will be freed when the process exits.
      
      It's not really possible to handle this failure properly, so almost
      every other caller of vm_munmap() doesn't check the return value.  It's
      a limitation of having the kernel manage these mappings rather than
      userspace.
      
      So just remove the WARN_ON() so that users can't spam the kernel log
      with this warning.
      
      Fixes: f0d648bd ("KVM: x86: map/unmap private slots in __x86_set_memory_region")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b95f8ca8
    • KVM/x86: Fix wrong macro references of X86_CR0_PG_BIT and X86_CR4_PAE_BIT in kvm_valid_sregs() · 61546237
      Tianyu Lan authored
      commit 37b95951 upstream.
      
      kvm_valid_sregs() should use X86_CR0_PG and X86_CR4_PAE to check bit
      status rather than X86_CR0_PG_BIT and X86_CR4_PAE_BIT. This patch is
      to fix it.
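
      To illustrate this class of bug, here is a minimal standalone sketch.
      The two macros follow the kernel's naming convention (a _BIT suffix for
      the bit number, no suffix for the mask). The helper names
      paging_enabled_buggy() and paging_enabled_fixed() are hypothetical and
      do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Bit number and the mask derived from it (CR0.PG is bit 31). */
#define X86_CR0_PG_BIT 31
#define X86_CR0_PG     (UINT64_C(1) << X86_CR0_PG_BIT)

/* Buggy: ANDs with the bit number (31 == 0x1f), so it tests bits 0-4. */
static int paging_enabled_buggy(uint64_t cr0)
{
	return (cr0 & X86_CR0_PG_BIT) != 0;
}

/* Fixed: ANDs with the mask, so it tests bit 31 only. */
static int paging_enabled_fixed(uint64_t cr0)
{
	return (cr0 & X86_CR0_PG) != 0;
}

/* Usage: the buggy check is wrong in both directions. */
void example_cr0_check(void)
{
	assert(!paging_enabled_buggy(X86_CR0_PG));  /* PG set, but missed */
	assert(paging_enabled_buggy(UINT64_C(1)));  /* only PE (bit 0) set */
	assert(paging_enabled_fixed(X86_CR0_PG));
	assert(!paging_enabled_fixed(UINT64_C(1)));
}
```

      The same reasoning applies to X86_CR4_PAE_BIT versus X86_CR4_PAE.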
      
      Fixes: f2981033 ("KVM/x86: Check input paging mode when cs.l is set")
      Reported-by: Jeremi Piotrowski <jeremi.piotrowski@gmail.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      61546237
    • PCI/ASPM: Deal with missing root ports in link state handling · db98acd6
      Ard Biesheuvel authored
      commit ee8bdfb6 upstream.
      
      Even though it is unconventional, some PCIe host implementations omit the
      root ports entirely, and simply consist of a host bridge (which is not
      modeled as a device in the PCI hierarchy) and a link.
      
      When the downstream device is an endpoint, our current code does not seem
      to mind this unusual configuration. However, when PCIe switches are
      involved, the ASPM code assumes that any downstream switch port has a
      parent, and blindly dereferences the bus->parent->self field of the pci_dev
      struct to chain the downstream link state to the link state of the root
      port. Given that the root port is missing, the link is not modeled at all,
      and nor is the link state, and attempting to access it results in a NULL
      pointer dereference and a crash.
      
      Avoid this by allowing the link state chain to terminate at the downstream
      port if no root port exists.
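
      The idea can be sketched with simplified stand-in types. The structs
      below are NOT the kernel's pci_dev and pci_bus, and find_parent_link()
      is a hypothetical helper; the sketch only shows the chain ending with
      NULL instead of dereferencing a missing parent.

```c
#include <assert.h>
#include <stddef.h>

struct link_state {
	struct link_state *parent;
};

struct bus;

/* Simplified stand-in for a PCI device. */
struct dev {
	struct bus *bus;
	struct link_state *link_state;
};

/* Simplified stand-in for a PCI bus. */
struct bus {
	struct bus *parent;  /* NULL at the top of the hierarchy */
	struct dev *self;    /* bridge that owns this bus, if any */
};

/*
 * Return the upstream link state for a downstream port, or NULL when
 * there is no upstream bridge (for example, when the root port is
 * missing). NULL means "terminate the chain here".
 */
static struct link_state *find_parent_link(const struct dev *pdev)
{
	const struct bus *b = pdev->bus;

	if (!b || !b->parent || !b->parent->self)
		return NULL;
	return b->parent->self->link_state;
}

/* Usage: one port without an upstream bridge, one with. */
void example_parent_link(void)
{
	struct bus top = { NULL, NULL };
	struct dev orphan_port = { &top, NULL };

	struct link_state up = { NULL };
	struct dev bridge = { NULL, &up };
	struct bus upper = { NULL, &bridge };
	struct bus lower = { &upper, NULL };
	struct dev port = { &lower, NULL };

	assert(find_parent_link(&orphan_port) == NULL);
	assert(find_parent_link(&port) == &up);
}
```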
      
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      db98acd6
    • KVM: x86: fix vcpu initialization with userspace lapic · b4830f3a
      Radim Krčmář authored
      commit b7e31be3 upstream.
      
      Moving the code around broke this rare configuration.
      Use this opportunity to finally call lapic reset from vcpu reset.
      
      Reported-by: syzbot+fb7a33a4b6c35007a72b@syzkaller.appspotmail.com
      Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
      Fixes: 0b2e9904 ("KVM: x86: move LAPIC initialization after VMCS creation")
      Cc: stable@vger.kernel.org
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b4830f3a
    • KVM/VMX: Optimize vmx_vcpu_run() and svm_vcpu_run() by marking the RDMSR path as unlikely() · 1f17daea
      Paolo Bonzini authored
      commit 946fbbc1 upstream.
      
      vmx_vcpu_run() and svm_vcpu_run() are large functions, and giving
      branch hints to the compiler can actually make a substantial cycle
      difference by keeping the fast path contiguous in memory.
      
      With this optimization, the retpoline-guest/retpoline-host case is
      about 50 cycles faster.
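
      As a rough illustration of a branch hint, the sketch below defines
      likely()/unlikely() on top of the GCC/Clang builtin, as the kernel
      does. The functions read_slow_register() and handle_exit() are
      hypothetical stand-ins and are not the code touched by this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Branch hints built on the GCC/Clang builtin. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

static unsigned int slow_path_calls;

/* Hypothetical stand-in for an expensive register read. */
static uint64_t read_slow_register(void)
{
	slow_path_calls++;
	return 0x1;
}

/*
 * The rare case is marked unlikely(), which lets the compiler place it
 * out of line and keep the common path contiguous.
 */
static uint64_t handle_exit(int need_slow_read, uint64_t cached)
{
	if (unlikely(need_slow_read))
		cached = read_slow_register();
	return cached;
}

/* Usage: the hint changes code layout only, never the result. */
void example_branch_hint(void)
{
	slow_path_calls = 0;
	assert(handle_exit(0, 7) == 7);
	assert(slow_path_calls == 0);
	assert(handle_exit(1, 7) == 0x1);
	assert(slow_path_calls == 1);
}
```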
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: Jim Mattson <jmattson@google.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: KarimAllah Ahmed <karahmed@amazon.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: kvm@vger.kernel.org
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/20180222154318.20361-3-pbonzini@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1f17daea
    • KVM: x86: move LAPIC initialization after VMCS creation · 03d62460
      Paolo Bonzini authored
      commit 0b2e9904 upstream.
      
      The initial reset of the local APIC is performed before the VMCS has been
      created, but it tries to do a vmwrite:
      
       vmwrite error: reg 810 value 4a00 (err 18944)
       CPU: 54 PID: 38652 Comm: qemu-kvm Tainted: G        W I      4.16.0-0.rc2.git0.1.fc28.x86_64 #1
       Hardware name: Intel Corporation S2600CW/S2600CW, BIOS SE5C610.86B.01.01.0003.090520141303 09/05/2014
       Call Trace:
        vmx_set_rvi [kvm_intel]
        vmx_hwapic_irr_update [kvm_intel]
        kvm_lapic_reset [kvm]
        kvm_create_lapic [kvm]
        kvm_arch_vcpu_init [kvm]
        kvm_vcpu_init [kvm]
        vmx_create_vcpu [kvm_intel]
        kvm_vm_ioctl [kvm]
      
      Move it later, after the VMCS has been created.
      
      Fixes: 4191db26 ("KVM: x86: Update APICv on APIC reset")
      Cc: stable@vger.kernel.org
      Cc: Liran Alon <liran.alon@oracle.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      03d62460
    • KVM/x86: Remove indirect MSR op calls from SPEC_CTRL · 0d62a56d
      Paolo Bonzini authored
      commit ecb586bd upstream.
      
      Having a paravirt indirect call in the IBRS restore path is not a
      good idea, since we are trying to protect from speculative execution
      of bogus indirect branch targets.  It is also slower, so use
      native_wrmsrl() on the vmentry path too.
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: Jim Mattson <jmattson@google.com>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: KarimAllah Ahmed <karahmed@amazon.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: kvm@vger.kernel.org
      Cc: stable@vger.kernel.org
      Fixes: d28b387f
      Link: http://lkml.kernel.org/r/20180222154318.20361-2-pbonzini@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      0d62a56d
    • KVM: mmu: Fix overlap between public and private memslots · 7135aaf3
      Wanpeng Li authored
      commit b28676bb upstream.
      
      Reported by syzkaller:
      
          pte_list_remove: ffff9714eb1f8078 0->BUG
          ------------[ cut here ]------------
          kernel BUG at arch/x86/kvm/mmu.c:1157!
          invalid opcode: 0000 [#1] SMP
          RIP: 0010:pte_list_remove+0x11b/0x120 [kvm]
          Call Trace:
           drop_spte+0x83/0xb0 [kvm]
           mmu_page_zap_pte+0xcc/0xe0 [kvm]
           kvm_mmu_prepare_zap_page+0x81/0x4a0 [kvm]
           kvm_mmu_invalidate_zap_all_pages+0x159/0x220 [kvm]
           kvm_arch_flush_shadow_all+0xe/0x10 [kvm]
           kvm_mmu_notifier_release+0x6c/0xa0 [kvm]
           ? kvm_mmu_notifier_release+0x5/0xa0 [kvm]
           __mmu_notifier_release+0x79/0x110
           ? __mmu_notifier_release+0x5/0x110
           exit_mmap+0x15a/0x170
           ? do_exit+0x281/0xcb0
           mmput+0x66/0x160
           do_exit+0x2c9/0xcb0
           ? __context_tracking_exit.part.5+0x4a/0x150
           do_group_exit+0x50/0xd0
           SyS_exit_group+0x14/0x20
           do_syscall_64+0x73/0x1f0
           entry_SYSCALL64_slow_path+0x25/0x25
      
      The reason is that when a new memslot is created, there is no guarantee
      that it does not overlap with the private memslots. This can be
      triggered by the following program:
      
         #include <fcntl.h>
         #include <pthread.h>
         #include <setjmp.h>
         #include <signal.h>
         #include <stddef.h>
         #include <stdint.h>
         #include <stdio.h>
         #include <stdlib.h>
         #include <string.h>
         #include <sys/ioctl.h>
         #include <sys/stat.h>
         #include <sys/syscall.h>
         #include <sys/types.h>
         #include <unistd.h>
         #include <linux/kvm.h>
      
         long r[16];
      
         int main()
         {
      	void *p = valloc(0x4000);
      
      	r[2] = open("/dev/kvm", 0);
      	r[3] = ioctl(r[2], KVM_CREATE_VM, 0x0ul);
      
      	uint64_t addr = 0xf000;
      	ioctl(r[3], KVM_SET_IDENTITY_MAP_ADDR, &addr);
      	r[6] = ioctl(r[3], KVM_CREATE_VCPU, 0x0ul);
      	ioctl(r[3], KVM_SET_TSS_ADDR, 0x0ul);
      	ioctl(r[6], KVM_RUN, 0);
      	ioctl(r[6], KVM_RUN, 0);
      
      	struct kvm_userspace_memory_region mr = {
      		.slot = 0,
      		.flags = KVM_MEM_LOG_DIRTY_PAGES,
      		.guest_phys_addr = 0xf000,
      		.memory_size = 0x4000,
      		.userspace_addr = (uintptr_t) p
      	};
      	ioctl(r[3], KVM_SET_USER_MEMORY_REGION, &mr);
      	return 0;
         }
      
      This patch fixes the bug by refusing to add a new memslot if it
      overlaps with private memslots.
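
      A minimal sketch of such an overlap check follows. struct slot,
      slots_overlap() and check_new_slot() are hypothetical simplifications,
      not the KVM memslot code; the numbers in the usage function are
      loosely modeled on the reproducer above (an identity-map page at
      0xf000 and a 0x4000-byte user region at the same address).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memslot: a half-open range of guest frame numbers. */
struct slot {
	uint64_t base_gfn;
	uint64_t npages;
};

/* Two half-open ranges overlap if each starts before the other ends. */
static bool slots_overlap(const struct slot *a, const struct slot *b)
{
	return a->base_gfn < b->base_gfn + b->npages &&
	       b->base_gfn < a->base_gfn + a->npages;
}

/*
 * Return 0 if new_slot may be added, -1 if it overlaps any existing
 * slot. Private slots are deliberately not skipped here.
 */
static int check_new_slot(const struct slot *existing, size_t n,
			  const struct slot *new_slot)
{
	for (size_t i = 0; i < n; i++) {
		if (existing[i].npages &&
		    slots_overlap(&existing[i], new_slot))
			return -1;
	}
	return 0;
}

/* Usage: one private page at gfn 0xf, then two candidate user slots. */
void example_memslot_overlap(void)
{
	const struct slot private_slots[] = { { 0xf, 1 } };
	const struct slot clash = { 0xf, 4 };   /* 0xf000, 0x4000 bytes */
	const struct slot ok = { 0x10, 4 };

	assert(check_new_slot(private_slots, 1, &clash) == -1);
	assert(check_new_slot(private_slots, 1, &ok) == 0);
}
```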
      
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Eric Biggers <ebiggers3@gmail.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
      7135aaf3
    • KVM: X86: Fix SMRAM accessing even if VM is shutdown · 1ebf9ab6
      Wanpeng Li authored
      commit 95e057e2 upstream.
      
      Reported by syzkaller:
      
         WARNING: CPU: 6 PID: 2434 at arch/x86/kvm/vmx.c:6660 handle_ept_misconfig+0x54/0x1e0 [kvm_intel]
         CPU: 6 PID: 2434 Comm: repro_test Not tainted 4.15.0+ #4
         RIP: 0010:handle_ept_misconfig+0x54/0x1e0 [kvm_intel]
         Call Trace:
          vmx_handle_exit+0xbd/0xe20 [kvm_intel]
          kvm_arch_vcpu_ioctl_run+0xdaf/0x1d50 [kvm]
          kvm_vcpu_ioctl+0x3e9/0x720 [kvm]
          do_vfs_ioctl+0xa4/0x6a0
          SyS_ioctl+0x79/0x90
          entry_SYSCALL_64_fastpath+0x25/0x9c
      
      The testcase creates a first thread to issue KVM_SMI ioctl, and then creates
      a second thread to mmap and operate on the same vCPU.  This triggers a race
      condition when running the testcase with multiple threads. Sometimes one thread
      exits with a triple fault while another thread mmaps and operates on the same
      vCPU.  Because CS=0x3000/IP=0x8000 is not mapped, accessing the SMI handler
      results in an EPT misconfig. This patch fixes it by returning RET_PF_EMULATE
      in kvm_handle_bad_page(), which will go on to cause an emulation failure and an
      exit with KVM_EXIT_INTERNAL_ERROR.
      
      Reported-by: syzbot+c1d9517cab094dae65e446c0c5b4de6c40f4dc58@syzkaller.appspotmail.com
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1ebf9ab6
    • KVM: x86: extend usage of RET_MMIO_PF_* constants · f925158c
      Paolo Bonzini authored
      commit 9b8ebbdb upstream.
      
      The x86 MMU is full of code that returns 0 and 1 for retry/emulate.  Use
      the existing RET_MMIO_PF_RETRY/RET_MMIO_PF_EMULATE enum, renaming it to
      drop the MMIO part.
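
      The readability gain can be sketched as follows. The enumerator names
      come from the renaming described above and the values are chosen to
      match the old 0/1 convention; handle_fault() is a hypothetical
      function, not one from the MMU code.

```c
#include <assert.h>
#include <stdbool.h>

/* Named results in place of bare 0 (retry) and 1 (emulate). */
enum {
	RET_PF_RETRY = 0,
	RET_PF_EMULATE = 1,
};

/* Hypothetical fault handler: MMIO accesses need emulation. */
static int handle_fault(bool is_mmio)
{
	if (is_mmio)
		return RET_PF_EMULATE;
	return RET_PF_RETRY;
}

/* Usage: callers compare against names, not magic numbers. */
void example_ret_pf(void)
{
	assert(handle_fault(true) == RET_PF_EMULATE);
	assert(handle_fault(false) == RET_PF_RETRY);
}
```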
      
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Thomas Backlund <tmb@mageia.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      f925158c
    • ARM: kvm: fix building with gcc-8 · e0c7b2b1
      Arnd Bergmann authored
      commit 67870eb1 upstream.
      
      In banked-sr.c, we use a top-level '__asm__(".arch_extension virt")'
      statement to allow compilation of a multi-CPU kernel for ARMv6
      and older ARMv7-A that don't normally support access to the banked
      registers.
      
      This is considered to be a programming error by the gcc developers
      and will no longer work in gcc-8, where we now get a build error:
      
      /tmp/cc4Qy7GR.s:34: Error: Banked registers are not available with this architecture. -- `mrs r3,SP_usr'
      /tmp/cc4Qy7GR.s:41: Error: Banked registers are not available with this architecture. -- `mrs r3,ELR_hyp'
      /tmp/cc4Qy7GR.s:55: Error: Banked registers are not available with this architecture. -- `mrs r3,SP_svc'
      /tmp/cc4Qy7GR.s:62: Error: Banked registers are not available with this architecture. -- `mrs r3,LR_svc'
      /tmp/cc4Qy7GR.s:69: Error: Banked registers are not available with this architecture. -- `mrs r3,SPSR_svc'
      /tmp/cc4Qy7GR.s:76: Error: Banked registers are not available with this architecture. -- `mrs r3,SP_abt'
      
      Passing the '-march=armv7ve' flag to gcc works, and is ok here, because
      we know the functions won't ever be called on pre-ARMv7VE machines.
      Unfortunately, older compiler versions (4.8 and earlier) do not understand
      that flag, so we still need to keep the asm around.
      
      Backporting to stable kernels (4.6+) is needed to allow those to be built
      with future compilers as well.
      
      Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84129
      Fixes: 33280b4c ("ARM: KVM: Add banked registers save/restore")
      Cc: stable@vger.kernel.org
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e0c7b2b1
    • ARM: mvebu: Fix broken PL310_ERRATA_753970 selects · fc6be8bc
      Ulf Magnusson authored
      commit 8aa36a8d upstream.
      
      The MACH_ARMADA_375 and MACH_ARMADA_38X boards select ARM_ERRATA_753970,
      but it was renamed to PL310_ERRATA_753970 by commit fa0ce403 ("ARM:
      7162/1: errata: tidy up Kconfig options for PL310 errata workarounds").
      
      Fix the selects to use the new name.
      
      Discovered with the
      https://github.com/ulfalizer/Kconfiglib/blob/master/examples/list_undefined.py
      script.
      
      Fixes: fa0ce403 ("ARM: 7162/1: errata: tidy up Kconfig options for
      PL310 errata workarounds")
      cc: stable@vger.kernel.org
      Signed-off-by: Ulf Magnusson <ulfalizer@gmail.com>
      Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      fc6be8bc
    • ARM: dts: rockchip: Remove 1.8 GHz operation point from phycore som · 4c02f016
      Daniel Schultz authored
      commit 5ce0bad4 upstream.
      
      Rockchip recommends running the CPU cores only at operating points of
      1.6 GHz or lower.
      
      Remove the cpu0 node with its too-high operating points and use the
      default values instead.
      
      Fixes: 903d31e3 ("ARM: dts: rockchip: Add support for phyCORE-RK3288 SoM")
      Cc: stable@vger.kernel.org
      Signed-off-by: Daniel Schultz <d.schultz@phytec.de>
      Signed-off-by: Heiko Stuebner <heiko@sntech.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      4c02f016
    • ARM: orion: fix orion_ge00_switch_board_info initialization · 8dc356e5
      Arnd Bergmann authored
      commit 8337d083 upstream.
      
      A section type mismatch warning shows up when building with LTO,
      since orion_ge00_mvmdio_bus_name was put in __initconst but not marked
      const itself:
      
      include/linux/of.h: In function 'spear_setup_of_timer':
      arch/arm/mach-spear/time.c:207:34: error: 'timer_of_match' causes a section type conflict with 'orion_ge00_mvmdio_bus_name'
       static const struct of_device_id timer_of_match[] __initconst = {
                                        ^
      arch/arm/plat-orion/common.c:475:32: note: 'orion_ge00_mvmdio_bus_name' was declared here
       static __initconst const char *orion_ge00_mvmdio_bus_name = "orion-mii";
                                      ^
      
      As pointed out by Andrew Lunn, it should in fact be 'const' but not
      '__initconst' because the string is never copied but may be accessed
      after the init sections are freed. To fix that, I get rid of the
      extra symbol and rewrite the initialization in a simpler way that
      assigns both the bus_id and modalias statically.
      
      I spotted another theoretical bug in the same place, where d->netdev[i]
      may be an out-of-bounds access; this can be fixed by moving the device
      assignment into the loop.
      
      Cc: stable@vger.kernel.org
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      8dc356e5
    • x86/mm: Fix {pmd,pud}_{set,clear}_flags() · b20d1086
      Jan Beulich authored
      commit 842cef91 upstream.
      
      Just like pte_{set,clear}_flags() their PMD and PUD counterparts should
      not do any address translation. This was outright wrong under Xen
      (causing a dead boot with no useful output on "suitable" systems), and
      produced needlessly more complicated code (even if just slightly) when
      paravirt was enabled.
      
      Signed-off-by: Jan Beulich <jbeulich@suse.com>
      Reviewed-by: Juergen Gross <jgross@suse.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/5A8AF1BB02000078001A91C3@prv-mh.provo.novell.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b20d1086
    • nospec: Allow index argument to have const-qualified type · 656772cb
      Rasmus Villemoes authored
      commit b98c6a16 upstream.
      
      The last expression in a statement expression need not be a bare
      variable, quoting gcc docs
      
        The last thing in the compound statement should be an expression
        followed by a semicolon; the value of this subexpression serves as the
        value of the entire construct.
      
      and we already use that in e.g. the min/max macros which end with a
      ternary expression.
      
      This way, we can allow index to have const-qualified type, which will in
      some cases avoid the need for introducing a local copy of index of
      non-const qualified type. That, in turn, can prevent readers not
      familiar with the internals of array_index_nospec from wondering about
      the seemingly redundant extra variable, and I think that's worthwhile
      considering how confusing the whole _nospec business is.
      
      The expression _i&_mask has type unsigned long (since that is the type
      of _mask, and the BUILD_BUG_ONs guarantee that _i will get promoted to
      that), so in order not to change the type of the whole expression, add
      a cast back to typeof(_i).
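
      The shape of such a macro can be sketched with GNU C statement
      expressions. clamp_index(), index_mask() and lookup() are hypothetical
      names, and index_mask() is a plain ternary that is NOT
      speculation-safe; it only stands in for the real mask helper so that
      the const-qualified index and the trailing cast can be shown.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified mask: all ones if idx < size, else zero. Assumption: this
 * ternary is for illustration only and gives no speculation barrier.
 */
static unsigned long index_mask(unsigned long idx, unsigned long size)
{
	return idx < size ? ~0UL : 0UL;
}

/*
 * The statement expression ends in a cast expression instead of a bare
 * variable, so _i is never assigned to after initialization and the
 * index may therefore be const-qualified.
 */
#define clamp_index(index, size)				\
({								\
	__typeof__(index) _i = (index);				\
	unsigned long _mask = index_mask(_i, (size));		\
	(__typeof__(_i)) (_i & _mask);				\
})

/* A const-qualified index needs no non-const local copy. */
static int lookup(const int *table, size_t n, const size_t idx)
{
	return table[clamp_index(idx, n)];
}

/* Usage: out-of-range indices are clamped to 0. */
void example_clamp_index(void)
{
	const int table[4] = { 10, 20, 30, 40 };

	assert(lookup(table, 4, 2) == 30);
	assert(lookup(table, 4, 7) == 10);
}
```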
      
      Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linux-arch@vger.kernel.org
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/151881604837.17395.10812767547837568328.stgit@dwillia2-desk3.amr.corp.intel.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      656772cb
    • KVM: s390: consider epoch index on TOD clock syncs · 81a158d2
      David Hildenbrand authored
      commit 1575767e upstream.
      
      For now, we don't take care of over/underflows. Especially underflows
      are critical:
      
      Assume the epoch is currently 0 and we get a sync request for delta=1,
      meaning the TOD is moved forward by 1 and we have to fix it up by
      subtracting 1 from the epoch. Right now, this will leave the epoch
      index untouched, resulting in epoch=-1, epoch_idx=0, which is wrong.
      
      We have to take care of over and underflows, also for the VSIE case. So
      let's factor out calculation into a separate function.
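
      The underflow case can be sketched by treating the epoch and the epoch
      index as one 72-bit two's complement value. struct tod_epoch and
      sync_epoch() are hypothetical names for illustration and are not the
      function introduced by this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 72-bit signed epoch: 64 low bits plus an 8-bit index. */
struct tod_epoch {
	uint64_t epoch;
	uint8_t epoch_idx;
};

/* The TOD moved forward by delta, so add -delta to the 72-bit epoch. */
static void sync_epoch(struct tod_epoch *e, uint64_t delta)
{
	uint64_t neg = -delta;                     /* low 64 bits of -delta */
	uint8_t neg_idx = (neg >> 63) ? 0xff : 0;  /* sign extension */

	e->epoch += neg;
	e->epoch_idx += neg_idx;
	if (e->epoch < neg)     /* carry out of the low 64 bits */
		e->epoch_idx += 1;
}

/* Usage: the case from the text (epoch 0, delta 1) and one without borrow. */
void example_epoch_sync(void)
{
	struct tod_epoch a = { 0, 0 };
	struct tod_epoch b = { 5, 0 };

	sync_epoch(&a, 1);
	assert(a.epoch == UINT64_MAX && a.epoch_idx == 0xff); /* 72-bit -1 */

	sync_epoch(&b, 1);
	assert(b.epoch == 4 && b.epoch_idx == 0);
}
```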
      
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Message-Id: <20180207114647.6220-5-david@redhat.com>
      Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Fixes: 8fa1696e ("KVM: s390: Multiple Epoch Facility support")
      Cc: stable@vger.kernel.org
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      [use u8 for idx]
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      81a158d2
    • KVM: s390: consider epoch index on hotplugged CPUs · dbab3751
      David Hildenbrand authored
      commit d16b52cb upstream.
      
      We must copy both the epoch and the epoch_idx.
      
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Message-Id: <20180207114647.6220-4-david@redhat.com>
      Fixes: 8fa1696e ("KVM: s390: Multiple Epoch Facility support")
      Reviewed-by: Cornelia Huck <cohuck@redhat.com>
      Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Fixes: 8fa1696e ("KVM: s390: Multiple Epoch Facility support")
      Cc: stable@vger.kernel.org
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      dbab3751
    • KVM: s390: provide only a single function for setting the tod (fix SCK) · 58a5d1ac
      David Hildenbrand authored
      commit 0e7def5f upstream.
      
      Right now, SET CLOCK called in the guest does not properly take care of
      the epoch index, as the call goes via the old kvm_s390_set_tod_clock()
      interface. So the epoch index is neither reset to 0, if required, nor
      properly set to e.g. 0xff on negative values.
      
      Fix this by providing a single kvm_s390_set_tod_clock() function. Move
      Multiple-epoch facility handling into it.
      
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Message-Id: <20180207114647.6220-3-david@redhat.com>
      Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Fixes: 8fa1696e ("KVM: s390: Multiple Epoch Facility support")
      Cc: stable@vger.kernel.org
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      58a5d1ac
    • KVM: s390: take care of clock-comparator sign control · c09ea9a8
      David Hildenbrand authored
      commit 5fe01793 upstream.
      
      Missed when enabling the Multiple-epoch facility. If the facility is
      installed and the control is set, a sign-based comparison has to be
      performed.
      
      Right now we would inject wrong interrupts and ignore interrupt
      conditions. Also the sleep time is calculated in a wrong way.
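
      The effect of the sign control can be sketched as below. ckc_pending()
      and its sign_control flag are hypothetical; the flag stands for
      "facility installed and control set". The casts rely on the usual
      two's complement conversion that GCC and Clang provide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true when the clock comparator value lies before the current
 * TOD value, comparing as signed numbers if sign_control is set.
 */
static bool ckc_pending(uint64_t ckc, uint64_t now, bool sign_control)
{
	if (sign_control)
		return (int64_t)ckc < (int64_t)now;
	return ckc < now;
}

/* Usage: a comparator of all ones is "-1" only under sign control. */
void example_ckc_compare(void)
{
	const uint64_t ckc = UINT64_MAX;
	const uint64_t now = 1;

	assert(!ckc_pending(ckc, now, false)); /* unsigned: far in the future */
	assert(ckc_pending(ckc, now, true));   /* signed: already in the past */
}
```

      Picking the wrong variant flips the result, which matches the wrong
      interrupt injection described above.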
      
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Message-Id: <20180207114647.6220-2-david@redhat.com>
      Fixes: 8fa1696e ("KVM: s390: Multiple Epoch Facility support")
      Cc: stable@vger.kernel.org
      Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c09ea9a8
    • EDAC, sb_edac: Fix out of bound writes during DIMM configuration on KNL · bd3ead45
      Anna Karbownik authored
      commit bf848670 upstream.
      
      Commit
      
        3286d3eb ("EDAC, sb_edac: Drop NUM_CHANNELS from 8 back to 4")
      
      decreased NUM_CHANNELS from 8 to 4, but this is not enough for Knights
      Landing which supports up to 6 channels.
      
      This caused out-of-bounds writes to the pvt->mirror_mode and pvt->tolm
      variables, which don't play a critical role on the KNL code path, so the
      memory corruption wasn't causing any visible driver failures.
      
      The easiest way of fixing it is to change NUM_CHANNELS to 6. Do that.
      
      An alternative solution would be to restructure the KNL part of the
      driver to 2MC/3channel representation.
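
      How an undersized array constant leads to this kind of corruption, and
      how to guard against it, can be sketched as follows. The struct layout,
      KNL_MAX_CHANNELS and set_channel() are hypothetical and do not reflect
      the driver's real data structures.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_CHANNELS     6  /* value the commit settles on */
#define KNL_MAX_CHANNELS 6  /* channels KNL supports, per the text above */

/* Hypothetical private data: fields after the array are what an
 * out-of-bounds channel write would clobber. */
struct pvt {
	int channel[NUM_CHANNELS];
	int mirror_mode;
};

/* Compile-time guard: the array must cover every supported channel. */
_Static_assert(KNL_MAX_CHANNELS <= NUM_CHANNELS,
	       "NUM_CHANNELS too small for KNL");

/* Bounds-checked store; returns 0 on success, -1 if ch is out of range. */
static int set_channel(struct pvt *p, size_t ch, int value)
{
	if (ch >= NUM_CHANNELS)
		return -1;
	p->channel[ch] = value;
	return 0;
}

/* Usage: channel 5 fits, channel 6 is rejected instead of overwriting. */
void example_num_channels(void)
{
	struct pvt p = { { 0 }, 0 };

	assert(set_channel(&p, 5, 1) == 0);
	assert(set_channel(&p, 6, 1) == -1);
	assert(p.mirror_mode == 0);
}
```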
      
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Anna Karbownik <anna.karbownik@intel.com>
      Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: jim.m.snow@intel.com
      Cc: krzysztof.paliswiat@intel.com
      Cc: lukasz.odzioba@intel.com
      Cc: qiuxu.zhuo@intel.com
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Cc: <stable@vger.kernel.org>
      Fixes: 3286d3eb ("EDAC, sb_edac: Drop NUM_CHANNELS from 8 back to 4")
      Link: http://lkml.kernel.org/r/1519312693-4789-1-git-send-email-anna.karbownik@intel.com
      [ Massage commit message. ]
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      bd3ead45
    • media: m88ds3103: don't call a non-initalized function · 1ba2b9e0
      Mauro Carvalho Chehab authored
      commit b9c97c67 upstream.
      
      If the m88ds3103 chip ID is not recognized, the device is not initialized.
      
      However, it returns from probe without any error, causing this OOPS:
      
      [    7.689289] Unable to handle kernel NULL pointer dereference at virtual address 00000000
      [    7.689297] pgd = 7b0bd7a7
      [    7.689302] [00000000] *pgd=00000000
      [    7.689318] Internal error: Oops: 80000005 [#1] SMP ARM
      [    7.689322] Modules linked in: dvb_usb_dvbsky(+) m88ds3103 dvb_usb_v2 dvb_core videobuf2_vmalloc videobuf2_memops videobuf2_core crc32_arm_ce videodev media
      [    7.689358] CPU: 3 PID: 197 Comm: systemd-udevd Not tainted 4.15.0-mcc+ #23
      [    7.689361] Hardware name: BCM2835
      [    7.689367] PC is at 0x0
      [    7.689382] LR is at m88ds3103_attach+0x194/0x1d0 [m88ds3103]
      [    7.689386] pc : [<00000000>]    lr : [<bf0ae1ec>]    psr: 60000013
      [    7.689391] sp : ed8e5c20  ip : ed8c1e00  fp : ed8945c0
      [    7.689395] r10: ed894000  r9 : ed894378  r8 : eda736c0
      [    7.689400] r7 : ed894070  r6 : ed8e5c44  r5 : bf0bb040  r4 : eda77600
      [    7.689405] r3 : 00000000  r2 : 00000000  r1 : 00000000  r0 : eda77600
      [    7.689412] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
      [    7.689417] Control: 10c5383d  Table: 2d8e806a  DAC: 00000051
      [    7.689423] Process systemd-udevd (pid: 197, stack limit = 0xe9dbfb63)
      [    7.689428] Stack: (0xed8e5c20 to 0xed8e6000)
      [    7.689439] 5c20: ed853a80 eda73640 ed894000 ed8942c0 ed853a80 bf0b9e98 ed894070 bf0b9f10
      [    7.689449] 5c40: 00000000 00000000 bf08c17c c08dfc50 00000000 00000000 00000000 00000000
      [    7.689459] 5c60: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      [    7.689468] 5c80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      [    7.689479] 5ca0: 00000000 00000000 ed8945c0 ed8942c0 ed894000 ed894830 bf0b9e98 00000000
      [    7.689490] 5cc0: ed894378 bf0a3cb4 bf0bc3b0 0000533b ed920540 00000000 00000034 bf0a6434
      [    7.689500] 5ce0: ee952070 ed826600 bf0a7038 bf0a2dd8 00000001 bf0a6768 bf0a2f90 ed8943c0
      [    7.689511] 5d00: 00000000 c08eca68 ed826620 ed826620 00000000 ee952070 bf0bc034 ee952000
      [    7.689521] 5d20: ed826600 bf0bb080 ffffffed c0aa9e9c c0aa9dac ed826620 c16edf6c c168c2c8
      [    7.689531] 5d40: c16edf70 00000000 bf0bc034 0000000d 00000000 c08e268c bf0bb080 ed826600
      [    7.689541] 5d60: bf0bc034 ed826654 ed826620 bf0bc034 c164c8bc 00000000 00000001 00000000
      [    7.689553] 5d80: 00000028 c08e2948 00000000 bf0bc034 c08e2848 c08e0778 ee9f0a58 ed88bab4
      [    7.689563] 5da0: bf0bc034 ed90ba80 c168c1f0 c08e1934 bf0bb3bc c17045ac bf0bc034 c164c8bc
      [    7.689574] 5dc0: bf0bc034 bf0bb3bc ed91f564 c08e34ec bf0bc000 c164c8bc bf0bc034 c0aa8dc4
      [    7.689584] 5de0: ffffe000 00000000 bf0bf000 ed91f600 ed91f564 c03021e4 00000001 00000000
      [    7.689595] 5e00: c166e040 8040003f ed853a80 bf0bc448 00000000 c1678174 ed853a80 f0f22000
      [    7.689605] 5e20: f0f21fff 8040003f 014000c0 ed91e700 ed91e700 c16d8e68 00000001 ed91e6c0
      [    7.689615] 5e40: bf0bc400 00000001 bf0bc400 ed91f564 00000001 00000000 00000028 c03c9a24
      [    7.689625] 5e60: 00000001 c03c8c94 ed8e5f50 ed8e5f50 00000001 bf0bc400 ed91f540 c03c8cb0
      [    7.689637] 5e80: bf0bc40c 00007fff bf0bc400 c03c60b0 00000000 bf0bc448 00000028 c0e09684
      [    7.689647] 5ea0: 00000002 bf0bc530 c1234bf8 bf0bc5dc bf0bc514 c10ebbe8 ffffe000 bf000000
      [    7.689657] 5ec0: 00011538 00000000 ed8e5f48 00000000 00000000 00000000 00000000 00000000
      [    7.689666] 5ee0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      [    7.689676] 5f00: 00000000 00000000 7fffffff 00000000 00000013 b6e55a18 0000017b c0309104
      [    7.689686] 5f20: ed8e4000 00000000 00510af0 c03c9430 7fffffff 00000000 00000003 00000000
      [    7.689697] 5f40: 00000000 f0f0f000 00011538 00000000 f0f107b0 f0f0f000 00011538 f0f1fdb8
      [    7.689707] 5f60: f0f1fbe8 f0f1b974 00004000 000041e0 bf0bc3d0 00000001 00000000 000024c4
      [    7.689717] 5f80: 0000002d 0000002e 00000019 00000000 00000010 00000000 16894000 00000000
      [    7.689727] 5fa0: 00000000 c0308f20 16894000 00000000 00000013 b6e55a18 00000000 b6e5652c
      [    7.689737] 5fc0: 16894000 00000000 00000000 0000017b 00020000 00508110 00000000 00510af0
      [    7.689748] 5fe0: bef68948 bef68938 b6e4d3d0 b6d32590 60000010 00000013 00000000 00000000
      [    7.689790] [<bf0ae1ec>] (m88ds3103_attach [m88ds3103]) from [<bf0b9f10>] (dvbsky_s960c_attach+0x78/0x280 [dvb_usb_dvbsky])
      [    7.689821] [<bf0b9f10>] (dvbsky_s960c_attach [dvb_usb_dvbsky]) from [<bf0a3cb4>] (dvb_usbv2_probe+0xa3c/0x1024 [dvb_usb_v2])
      [    7.689849] [<bf0a3cb4>] (dvb_usbv2_probe [dvb_usb_v2]) from [<c0aa9e9c>] (usb_probe_interface+0xf0/0x2a8)
      [    7.689869] [<c0aa9e9c>] (usb_probe_interface) from [<c08e268c>] (driver_probe_device+0x2f8/0x4b4)
      [    7.689881] [<c08e268c>] (driver_probe_device) from [<c08e2948>] (__driver_attach+0x100/0x11c)
      [    7.689895] [<c08e2948>] (__driver_attach) from [<c08e0778>] (bus_for_each_dev+0x4c/0x9c)
      [    7.689909] [<c08e0778>] (bus_for_each_dev) from [<c08e1934>] (bus_add_driver+0x1c0/0x264)
      [    7.689919] [<c08e1934>] (bus_add_driver) from [<c08e34ec>] (driver_register+0x78/0xf4)
      [    7.689931] [<c08e34ec>] (driver_register) from [<c0aa8dc4>] (usb_register_driver+0x70/0x134)
      [    7.689946] [<c0aa8dc4>] (usb_register_driver) from [<c03021e4>] (do_one_initcall+0x44/0x168)
      [    7.689963] [<c03021e4>] (do_one_initcall) from [<c03c9a24>] (do_init_module+0x64/0x1f4)
      [    7.689979] [<c03c9a24>] (do_init_module) from [<c03c8cb0>] (load_module+0x20a0/0x25c8)
      [    7.689993] [<c03c8cb0>] (load_module) from [<c03c9430>] (SyS_finit_module+0xb4/0xec)
      [    7.690007] [<c03c9430>] (SyS_finit_module) from [<c0308f20>] (ret_fast_syscall+0x0/0x54)
      [    7.690018] Code: bad PC value
      
      This may happen under normal circumstances if, for some reason, the demod
      hangs and starts returning an invalid chip ID:
      
      [   10.394395] m88ds3103 3-0068: Unknown device. Chip_id=00
      
      So, change the logic to cause probe to fail with -ENODEV, preventing
      the OOPS.
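      
      A minimal sketch of that pattern, not the driver's actual code: the chip
      IDs, struct layout and function names below are hypothetical. Probe
      returns -ENODEV for an unknown chip ID, so the caller never calls through
      a function pointer that was left NULL.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical chip IDs, for illustration only. */
#define DEMO_CHIP_ID_A 0x70
#define DEMO_CHIP_ID_B 0x74

struct demod_ops {
    int (*init)(void);
};

struct demod {
    unsigned int chip_id;
    const struct demod_ops *ops; /* stays NULL until the chip is recognized */
};

static int demod_init(void)
{
    return 0;
}

static const struct demod_ops known_ops = { .init = demod_init };

/* Return 0 on success, -ENODEV if the chip ID is not recognized. */
int demod_probe(struct demod *d, unsigned int chip_id)
{
    d->chip_id = chip_id;
    d->ops = NULL;

    switch (chip_id) {
    case DEMO_CHIP_ID_A:
    case DEMO_CHIP_ID_B:
        d->ops = &known_ops;
        return 0;
    default:
        /* Failing here keeps callers away from a half-initialized device. */
        return -ENODEV;
    }
}

/* Caller side: only call through ops when probe succeeded. */
int demod_attach(struct demod *d, unsigned int chip_id)
{
    int ret = demod_probe(d, chip_id);

    if (ret)
        return ret;
    return d->ops->init();
}
```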
      
      Detected while testing DVB MMAP patches on Raspberry Pi 3 with
      DVBSky S960CI.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1ba2b9e0
    • Ming Lei's avatar
      blk-mq: don't call io sched's .requeue_request when requeueing rq to ->dispatch · ccddee81
      Ming Lei authored
      commit 105976f5 upstream.
      
      __blk_mq_requeue_request() covers two cases:
      
      - one is that the requeued request is added to hctx->dispatch, such as
      blk_mq_dispatch_rq_list()
      
      - another case is that the request is requeued to io scheduler, such as
      blk_mq_requeue_request().
      
      We should call io sched's .requeue_request callback only for the 2nd
      case.
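      
      A toy model of the two requeue paths, assuming a simplified request
      structure; the field and function names are illustrative and are not the
      blk-mq API. The shared helper does no scheduler work, and only the
      "back to the scheduler" path invokes the scheduler callback.

```c
/* Toy model: field and function names are illustrative, not kernel API. */
struct request {
    int sched_requeue_calls; /* times the scheduler callback ran */
    int in_dispatch;         /* 1 if parked on the dispatch list */
};

/* Stand-in for the I/O scheduler's .requeue_request callback. */
static void sched_requeue_request(struct request *rq)
{
    rq->sched_requeue_calls++;
}

/* Bookkeeping shared by both requeue paths; no scheduler callback here. */
static void requeue_common(struct request *rq)
{
    rq->in_dispatch = 0;
}

/* Case 1: the request goes back to the dispatch list. */
void requeue_to_dispatch(struct request *rq)
{
    requeue_common(rq);
    rq->in_dispatch = 1;
}

/* Case 2: the request goes back to the I/O scheduler. */
void requeue_to_scheduler(struct request *rq)
{
    requeue_common(rq);
    sched_requeue_request(rq);
}
```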
      
      Cc: Paolo Valente <paolo.valente@linaro.org>
      Cc: Omar Sandoval <osandov@fb.com>
      Fixes: bd166ef1 ("blk-mq-sched: add framework for MQ capable IO schedulers")
      Cc: stable@vger.kernel.org
      Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
      Acked-by: Paolo Valente <paolo.valente@linaro.org>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ccddee81
    • Julian Wiedmann's avatar
      s390/qeth: fix IPA command submission race · c5f32462
      Julian Wiedmann authored
      
      [ Upstream commit d22ffb5a ]
      
      If multiple IPA commands are built and sent out concurrently,
      fill_ipacmd_header() may assign a seqno value to a command that's
      different from what send_control_data() later assigns to this command's
      reply.
      This is due to other commands passing through send_control_data(),
      and incrementing card->seqno.ipa along the way.
      
      So one IPA command has no reply that's waiting for its seqno, while some
      other IPA command has multiple reply objects waiting for it.
      Only one of those waiting replies wins, and the others time out and
      trigger a recovery via send_ipa_cmd().
      
      Fix this by making sure that the same seqno value is assigned to
      a command and its reply object.
      Do so immediately before submitting the command & while holding the
      irq_pending "lock", to produce nicely ascending seqnos.
      
      As a side effect, *all* IPA commands now use a reply object that's
      waiting for its actual seqno. Previously, early IPA commands that were
      submitted while the card was still DOWN used the "catch-all" IDX seqno.
      
      Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c5f32462
    • Julian Wiedmann's avatar
      s390/qeth: fix IP address lookup for L3 devices · eae17c40
      Julian Wiedmann authored
      
      [ Upstream commit c5c48c58 ]
      
      Current code ("qeth_l3_ip_from_hash()") matches a queried address object
      against objects in the IP table by IP address, Mask/Prefix Length and
      MAC address ("qeth_l3_ipaddrs_is_equal()"). But what callers actually
      require is either
      a) "is this IP address registered" (ie. match by IP address only),
      before adding a new address.
      b) or "is this address object registered" (ie. match all relevant
         attributes), before deleting an address.
      
      Right now
      1. the ADD path is too strict in its lookup, and eg. doesn't detect
      conflicts between an existing NORMAL address and a new VIPA address
      (because the NORMAL address will have mask != 0, while VIPA has
      a mask == 0),
      2. the DELETE path is not strict enough, and eg. allows del_rxip() to
      delete a VIPA address as long as the IP address matches.
      
      Fix all this by adding helpers (_addr_match_ip() and _addr_match_all())
      that do the appropriate checking.
      
      Note that the ADD path for NORMAL addresses is special, as qeth keeps
      track of how many times such an address is in use (and there is no
      immediate way of returning errors to the caller). So when a requested
      NORMAL address _fully_ matches an existing one, it's not considered a
      conflict and we merely increment the refcount.
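      
      The two lookup flavours can be sketched as follows. This is an
      illustration only: the struct layout, the helper names and the exact set
      of attributes compared by the strict match (type, IP, mask, MAC) are
      assumptions, not the driver's definitions.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical address types and layout, for illustration only. */
enum addr_type { ADDR_NORMAL, ADDR_VIPA, ADDR_RXIP };

struct ip_addr {
    enum addr_type type;
    uint32_t ip;
    uint32_t mask;
    uint8_t mac[6];
};

/* "Is this IP address registered?" Used before adding a new address. */
bool addr_match_ip(const struct ip_addr *a, const struct ip_addr *b)
{
    return a->ip == b->ip;
}

/* "Is this exact address object registered?" Used before deleting. */
bool addr_match_all(const struct ip_addr *a, const struct ip_addr *b)
{
    return a->type == b->type &&
           a->ip == b->ip &&
           a->mask == b->mask &&
           memcmp(a->mac, b->mac, sizeof(a->mac)) == 0;
}
```

      With these helpers, a NORMAL address (mask != 0) and a VIPA address
      (mask == 0) that share an IP conflict on the ADD path, while the DELETE
      path refuses to treat them as the same object.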
      
      Fixes: 5f78e29c ("qeth: optimize IP handling in rx_mode callback")
      Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      eae17c40
    • Julian Wiedmann's avatar
      Revert "s390/qeth: fix using of ref counter for rxip addresses" · 87c4789f
      Julian Wiedmann authored
      
      [ Upstream commit 4964c66f ]
      
      This reverts commit cb816192.
      
      The issue this attempted to fix never actually occurs.
      l3_add_rxip() checks (via l3_ip_from_hash()) if the requested address
      was previously added to the card. If so, it returns -EEXIST and doesn't
      call l3_add_ip().
      As a result, the "address exists" path in l3_add_ip() is never taken
      for rxip addresses, and this patch had no effect.
      
      Fixes: cb816192 ("s390/qeth: fix using of ref counter for rxip addresses")
      Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      87c4789f
    • Julian Wiedmann's avatar
      s390/qeth: fix double-free on IP add/remove race · 56f662db
      Julian Wiedmann authored
      
      [ Upstream commit 14d066c3 ]
      
      Registering an IPv4 address with the HW takes quite a while, so we
      temporarily drop the ip_htable lock. Any concurrent add/remove of the
      same IP adjusts the IP's use count, and (on remove) is then blocked by
      addr->in_progress.
      After the register call has completed, we check the use count for
      concurrently attempted add/remove calls - and possibly straight-away
      deregister the IP again. This happens via l3_delete_ip(), which
      1) looks up the queried IP in the htable (getting a reference to the
         *same* queried object),
      2) deregisters the IP from the HW, and
      3) frees the IP object.
      
      The caller in l3_add_ip() then does a second free on the same object.
      
      For this case, skip all the extra checks and lookups in l3_delete_ip()
      and just deregister & free the IP object ourselves.
      
      Fixes: 5f78e29c ("qeth: optimize IP handling in rx_mode callback")
      Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      56f662db
    • Julian Wiedmann's avatar
      s390/qeth: fix IP removal on offline cards · 02763710
      Julian Wiedmann authored
      
      [ Upstream commit 98d823ab ]
      
      If the HW is not reachable, then none of the IPs in qeth's internal
      table has been registered with the HW yet. So when deleting such an IP,
      there's no need to stage it for deregistration - just drop it from
      the table.
      
      This fixes the "add-delete-add" scenario on an offline card, where the
      second "add" merely increments the IP's use count. But as the IP is
      still set to DISP_ADDR_DELETE from the previous "delete" step,
      l3_recover_ip() won't register it with the HW when the card goes online.
      
      Fixes: 5f78e29c ("qeth: optimize IP handling in rx_mode callback")
      Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      02763710
    • Julian Wiedmann's avatar
      s390/qeth: fix overestimated count of buffer elements · fa4919e3
      Julian Wiedmann authored
      
      [ Upstream commit 12472af8 ]
      
      qeth_get_elements_for_range() doesn't know how to handle a 0-length
      range (ie. start == end), and returns 1 when it should return 0.
      Such ranges occur on TSO skbs, where the L2/L3/L4 headers (and thus all
      of the skb's linear data) are skipped when mapping the skb into regular
      buffer elements.
      
      This overestimation may cause several performance-related issues:
      1. sub-optimal IO buffer selection, where the next buffer gets selected
         even though the skb would actually still fit into the current buffer.
      2. forced linearization, if the element count for a non-linear skb
         exceeds QETH_MAX_BUFFER_ELEMENTS.
      
      Rather than modifying qeth_get_elements_for_range() and adding overhead
      to every caller, fix up those callers that are at risk of passing a
      0-length range.
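      
      A sketch of such a caller-side guard, under stated assumptions: a 4 KiB
      page size, a simplified page-count formula standing in for the range
      helper, and hypothetical function and parameter names. The guard returns
      0 for an empty range instead of asking the helper.

```c
/* Assumed 4 KiB pages; macro shapes mirror the usual PFN helpers. */
#define DEMO_PAGE_SHIFT 12UL
#define DEMO_PAGE_SIZE  (1UL << DEMO_PAGE_SHIFT)
#define PFN_UP(x)   (((x) + DEMO_PAGE_SIZE - 1) >> DEMO_PAGE_SHIFT)
#define PFN_DOWN(x) ((x) >> DEMO_PAGE_SHIFT)

/* Simplified stand-in for the range helper: not valid for start == end. */
unsigned long elements_for_range(unsigned long start, unsigned long end)
{
    return PFN_UP(end) - PFN_DOWN(start);
}

/*
 * Hypothetical caller: count elements for the linear data that remains
 * after skipping hdr_len bytes of headers.
 */
unsigned long elements_for_linear(unsigned long data, unsigned long hdr_len,
                                  unsigned long linear_len)
{
    unsigned long start = data + hdr_len;
    unsigned long end = data + linear_len;

    /* All linear data was skipped: nothing to map, so no element needed. */
    if (start == end)
        return 0;
    return elements_for_range(start, end);
}
```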
      
      Fixes: 2863c613 ("qeth: refactor calculation of SBALE count")
      Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      fa4919e3
    • Julian Wiedmann's avatar
      s390/qeth: fix SETIP command handling · 128c7e69
      Julian Wiedmann authored
      
      [ Upstream commit 1c5b2216 ]
      
      send_control_data() applies some special handling to SETIP v4 IPA
      commands. But current code parses *all* command types for the SETIP
      command code. Limit the command code check to IPA commands.
      
      Fixes: 5b54e16f ("qeth: do not spin for SETIP ip assist command")
      Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      128c7e69
    • Ursula Braun's avatar
      s390/qeth: fix underestimated count of buffer elements · fcdfb9d8
      Ursula Braun authored
      
      [ Upstream commit 89271c65 ]
      
      For a memory range/skb where the last byte falls onto a page boundary
      (ie. 'end' is of the form xxx...xxx001), the PFN_UP() part of the
      calculation currently doesn't round up to the next PFN due to an
      off-by-one error.
      Thus qeth believes that the skb occupies one page less than it
      actually does, and may select an IO buffer that doesn't have enough spare
      buffer elements to fit all of the skb's data.
      HW detects this as a malformed buffer descriptor, and raises an
      exception which then triggers device recovery.
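      
      The arithmetic can be illustrated with a small example. It assumes
      4 KiB pages and an exclusive 'end'; pages_off_by_one() is one plausible
      form of the miscalculation, reconstructed for illustration and not
      copied from the driver.

```c
/* Assumed 4 KiB pages; macro shapes mirror the usual PFN helpers. */
#define DEMO_PAGE_SHIFT 12UL
#define DEMO_PAGE_SIZE  (1UL << DEMO_PAGE_SHIFT)
#define PFN_UP(x)   (((x) + DEMO_PAGE_SIZE - 1) >> DEMO_PAGE_SHIFT)
#define PFN_DOWN(x) ((x) >> DEMO_PAGE_SHIFT)

/* Illustrative off-by-one: the last byte at a page start is not rounded up. */
unsigned long pages_off_by_one(unsigned long start, unsigned long end)
{
    return PFN_UP(end - 1) - PFN_DOWN(start);
}

/* Pages touched by the non-empty range [start, end). */
unsigned long pages_spanned(unsigned long start, unsigned long end)
{
    return PFN_UP(end) - PFN_DOWN(start);
}
```

      For start = 0x1000 and end = 0x2001 the last byte sits at 0x2000, the
      first byte of the next page, so the range touches two pages; the
      off-by-one variant reports only one.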
      
      Fixes: 2863c613 ("qeth: refactor calculation of SBALE count")
      Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
      Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      fcdfb9d8
    • Jason Wang's avatar
      virtio-net: disable NAPI only when enabled during XDP set · 99a78194
      Jason Wang authored
      
      [ Upstream commit 4e09ff53 ]
      
      We try to disable NAPI to prevent a single XDP TX queue from being used by
      multiple CPUs. But we don't check whether the device is up (NAPI is
      enabled), which could result in a stall because of an infinite wait in
      napi_disable(). Fix this by checking the device state through
      netif_running() first.
      
      Fixes: 4941d472 ("virtio-net: do not reset during XDP set")
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      99a78194
    • Jason Wang's avatar
      tuntap: disable preemption during XDP processing · 5134b919
      Jason Wang authored
      
      [ Upstream commit 23e43f07 ]
      
      Except for tuntap, all other drivers implement XDP in the NAPI
      poll() routine in a bh. This guarantees that all XDP operations are done
      on the same CPU, which is required by e.g. BPF_MAP_TYPE_PERCPU_ARRAY. But
      for tuntap, we do it in process context and try to protect XDP
      processing with the RCU reader lock. This is insufficient, since with
      CONFIG_PREEMPT_RCU the RCU reader critical section can be preempted, which
      breaks the assumption that all XDP processing happens on the same CPU.
      
      Fix this by simply disabling preemption during XDP processing.
      
      Fixes: 761876c8 ("tap: XDP support")
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      5134b919
    • Jason Wang's avatar
      tuntap: correctly add the missing XDP flush · 1903344b
      Jason Wang authored
      
      [ Upstream commit 1bb4f2e8 ]
      
      We don't flush batched XDP packets through xdp_do_flush_map(), which
      causes packets to stall in the TX queue. Since we don't do XDP in NAPI
      poll(), the only possible fix is to call xdp_do_flush_map()
      immediately after xdp_do_redirect().
      
      Note that this in fact won't try to batch packets through devmap; we
      could address that in the future.
      
      Reported-by: Christoffer Dall <christoffer.dall@linaro.org>
      Fixes: 761876c8 ("tap: XDP support")
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1903344b
    • Soheil Hassas Yeganeh's avatar
      tcp: purge write queue upon RST · abb4a8b8
      Soheil Hassas Yeganeh authored
      
      [ Upstream commit a27fd7a8 ]
      
      When the connection is reset, there is no point in
      keeping the packets on the write queue until the connection
      is closed.
      
      RFC 793 (page 70) and RFC 793-bis (page 64) both suggest
      purging the write queue upon RST:
      https://tools.ietf.org/html/draft-ietf-tcpm-rfc793bis-07
      
      Moreover, this is essential for a correct MSG_ZEROCOPY
      implementation, because userspace cannot call close(fd)
      before receiving zerocopy signals even when the connection
      is reset.
      
      Fixes: f214f915 ("tcp: enable MSG_ZEROCOPY")
      Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      abb4a8b8
    • Jason A. Donenfeld's avatar
      netlink: put module reference if dump start fails · eec434c5
      Jason A. Donenfeld authored
      
      [ Upstream commit b87b6194 ]
      
      Before, if cb->start() failed, the module reference would never be put,
      because cb->cb_running is intentionally false at this point. Users are
      generally annoyed by this because they can no longer unload modules that
      leak references. Also, it may be possible to tediously wrap a reference
      counter back to zero, especially since module.c still uses atomic_inc
      instead of refcount_inc.
      
      This patch expands the error path to simply call module_put if
      cb->start() fails.
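      
      A minimal model of the expanded error path, with hypothetical types and
      helper names standing in for the netlink and module APIs: when the start
      callback fails, the reference taken earlier is dropped before returning.

```c
#include <errno.h>
#include <stddef.h>

/* Toy stand-ins; these are not the kernel's module or netlink types. */
struct fake_module {
    int refcnt;
};

struct dump_cb {
    struct fake_module *module;
    int (*start)(struct dump_cb *cb);
    int running;
};

static void fake_module_get(struct fake_module *m) { m->refcnt++; }
static void fake_module_put(struct fake_module *m) { m->refcnt--; }

/* Example start callbacks used to exercise both paths. */
int start_ok(struct dump_cb *cb)   { (void)cb; return 0; }
int start_fail(struct dump_cb *cb) { (void)cb; return -EINVAL; }

int dump_start(struct dump_cb *cb)
{
    int ret;

    fake_module_get(cb->module);

    if (cb->start) {
        ret = cb->start(cb);
        if (ret) {
            /* running stays 0, so no later path would drop this reference. */
            fake_module_put(cb->module);
            return ret;
        }
    }

    cb->running = 1;
    return 0;
}
```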
      
      Fixes: 41c87425 ("netlink: do not set cb_running if dump's start() errs")
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      eec434c5
    • Ido Schimmel's avatar
      mlxsw: spectrum_router: Do not unconditionally clear route offload indication · abd7663b
      Ido Schimmel authored
      
      [ Upstream commit d1c95af3 ]
      
      When mlxsw replaces (or deletes) a route it removes the offload
      indication from the replaced route. This is problematic for IPv4 routes,
      as the offload indication is stored in the fib_info which is usually
      shared between multiple routes.
      
      Instead of unconditionally clearing the offload indication, only clear
      it if no other route is using the fib_info.
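      
      The idea can be sketched with a shared object and a user count. The
      structure and the counter are assumptions made for illustration; the
      driver may determine "no other route is using the fib_info" differently.

```c
#include <stdbool.h>

/* Toy stand-in for a fib_info shared by several routes. */
struct fake_fib_info {
    bool offloaded;
    unsigned int offload_users; /* offloaded routes sharing this object */
};

/* A route using this fib_info was offloaded. */
void route_offload_set(struct fake_fib_info *fi)
{
    fi->offload_users++;
    fi->offloaded = true;
}

/* A route was replaced or deleted: clear the flag only for the last user. */
void route_offload_unset(struct fake_fib_info *fi)
{
    if (fi->offload_users > 0 && --fi->offload_users == 0)
        fi->offloaded = false;
}
```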
      
      Fixes: 3984d1a8 ("mlxsw: spectrum_router: Provide offload indication using nexthop flags")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reported-by: Alexander Petrovskiy <alexpe@mellanox.com>
      Tested-by: Alexander Petrovskiy <alexpe@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      abd7663b
    • Paolo Abeni's avatar
      cls_u32: fix use after free in u32_destroy_key() · ebadf888
      Paolo Abeni authored
      
      [ Upstream commit d7cdee5e ]
      
      Li Shuang reported an Oops with cls_u32 due to a use-after-free
      in u32_destroy_key(). The use-after-free can be triggered with:
      
      dev=lo
      tc qdisc add dev $dev root handle 1: htb default 10
      tc filter add dev $dev parent 1: prio 5 handle 1: protocol ip u32 divisor 256
      tc filter add dev $dev protocol ip parent 1: prio 5 u32 ht 800:: match ip dst\
       10.0.0.0/8 hashkey mask 0x0000ff00 at 16 link 1:
      tc qdisc del dev $dev root
      
      Which causes the following kasan splat:
      
       ==================================================================
       BUG: KASAN: use-after-free in u32_destroy_key.constprop.21+0x117/0x140 [cls_u32]
       Read of size 4 at addr ffff881b83dae618 by task kworker/u48:5/571
      
       CPU: 17 PID: 571 Comm: kworker/u48:5 Not tainted 4.15.0+ #87
       Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.1.7 06/16/2016
       Workqueue: tc_filter_workqueue u32_delete_key_freepf_work [cls_u32]
       Call Trace:
        dump_stack+0xd6/0x182
        ? dma_virt_map_sg+0x22e/0x22e
        print_address_description+0x73/0x290
        kasan_report+0x277/0x360
        ? u32_destroy_key.constprop.21+0x117/0x140 [cls_u32]
        u32_destroy_key.constprop.21+0x117/0x140 [cls_u32]
        u32_delete_key_freepf_work+0x1c/0x30 [cls_u32]
        process_one_work+0xae0/0x1c80
        ? sched_clock+0x5/0x10
        ? pwq_dec_nr_in_flight+0x3c0/0x3c0
        ? _raw_spin_unlock_irq+0x29/0x40
        ? trace_hardirqs_on_caller+0x381/0x570
        ? _raw_spin_unlock_irq+0x29/0x40
        ? finish_task_switch+0x1e5/0x760
        ? finish_task_switch+0x208/0x760
        ? preempt_notifier_dec+0x20/0x20
        ? __schedule+0x839/0x1ee0
        ? check_noncircular+0x20/0x20
        ? firmware_map_remove+0x73/0x73
        ? find_held_lock+0x39/0x1c0
        ? worker_thread+0x434/0x1820
        ? lock_contended+0xee0/0xee0
        ? lock_release+0x1100/0x1100
        ? init_rescuer.part.16+0x150/0x150
        ? retint_kernel+0x10/0x10
        worker_thread+0x216/0x1820
        ? process_one_work+0x1c80/0x1c80
        ? lock_acquire+0x1a5/0x540
        ? lock_downgrade+0x6b0/0x6b0
        ? sched_clock+0x5/0x10
        ? lock_release+0x1100/0x1100
        ? compat_start_thread+0x80/0x80
        ? do_raw_spin_trylock+0x190/0x190
        ? _raw_spin_unlock_irq+0x29/0x40
        ? trace_hardirqs_on_caller+0x381/0x570
        ? _raw_spin_unlock_irq+0x29/0x40
        ? finish_task_switch+0x1e5/0x760
        ? finish_task_switch+0x208/0x760
        ? preempt_notifier_dec+0x20/0x20
        ? __schedule+0x839/0x1ee0
        ? kmem_cache_alloc_trace+0x143/0x320
        ? firmware_map_remove+0x73/0x73
        ? sched_clock+0x5/0x10
        ? sched_clock_cpu+0x18/0x170
        ? find_held_lock+0x39/0x1c0
        ? schedule+0xf3/0x3b0
        ? lock_downgrade+0x6b0/0x6b0
        ? __schedule+0x1ee0/0x1ee0
        ? do_wait_intr_irq+0x340/0x340
        ? do_raw_spin_trylock+0x190/0x190
        ? _raw_spin_unlock_irqrestore+0x32/0x60
        ? process_one_work+0x1c80/0x1c80
        ? process_one_work+0x1c80/0x1c80
        kthread+0x312/0x3d0
        ? kthread_create_worker_on_cpu+0xc0/0xc0
        ret_from_fork+0x3a/0x50
      
       Allocated by task 1688:
        kasan_kmalloc+0xa0/0xd0
        __kmalloc+0x162/0x380
        u32_change+0x1220/0x3c9e [cls_u32]
        tc_ctl_tfilter+0x1ba6/0x2f80
        rtnetlink_rcv_msg+0x4f0/0x9d0
        netlink_rcv_skb+0x124/0x320
        netlink_unicast+0x430/0x600
        netlink_sendmsg+0x8fa/0xd60
        sock_sendmsg+0xb1/0xe0
        ___sys_sendmsg+0x678/0x980
        __sys_sendmsg+0xc4/0x210
        do_syscall_64+0x232/0x7f0
        return_from_SYSCALL_64+0x0/0x75
      
       Freed by task 112:
        kasan_slab_free+0x71/0xc0
        kfree+0x114/0x320
        rcu_process_callbacks+0xc3f/0x1600
        __do_softirq+0x2bf/0xc06
      
       The buggy address belongs to the object at ffff881b83dae600
        which belongs to the cache kmalloc-4096 of size 4096
       The buggy address is located 24 bytes inside of
        4096-byte region [ffff881b83dae600, ffff881b83daf600)
       The buggy address belongs to the page:
       page:ffffea006e0f6a00 count:1 mapcount:0 mapping:          (null) index:0x0 compound_mapcount: 0
       flags: 0x17ffffc0008100(slab|head)
       raw: 0017ffffc0008100 0000000000000000 0000000000000000 0000000100070007
       raw: dead000000000100 dead000000000200 ffff880187c0e600 0000000000000000
       page dumped because: kasan: bad access detected
      
       Memory state around the buggy address:
        ffff881b83dae500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
        ffff881b83dae580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
       >ffff881b83dae600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                   ^
        ffff881b83dae680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
        ffff881b83dae700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ==================================================================
      
      The problem is that the htnode is freed before the linked knodes and the
      latter will try to access the first at u32_destroy_key() time.
      This change addresses the issue using the htnode refcnt to guarantee
      the correct free order. While at it also add an RCU annotation,
      to keep sparse happy.
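      
      A userspace sketch of refcount-ordered freeing, with hypothetical types
      and names rather than the cls_u32 structures: each knode holds a
      reference on its htnode, so the htnode is freed only after the last
      knode that points at it has been destroyed.

```c
#include <stdlib.h>

/* Toy hash-table node; 'freed' lets the demo observe when it is released. */
struct hnode {
    int refcnt;
    int *freed;
};

/* Toy key node that points back at its hash-table node. */
struct knode {
    struct hnode *ht_up;
};

struct hnode *hnode_alloc(int *freed)
{
    struct hnode *h = malloc(sizeof(*h));

    if (!h)
        return NULL;
    h->refcnt = 1; /* reference held by the owner */
    h->freed = freed;
    return h;
}

/* Drop one reference; free the node when the last one goes away. */
void hnode_put(struct hnode *h)
{
    if (--h->refcnt == 0) {
        *h->freed = 1;
        free(h);
    }
}

struct knode *knode_alloc(struct hnode *h)
{
    struct knode *k = malloc(sizeof(*k));

    if (!k)
        return NULL;
    h->refcnt++; /* the knode pins its htnode */
    k->ht_up = h;
    return k;
}

/* Safe to touch k->ht_up here: the knode's reference keeps it alive. */
void knode_destroy(struct knode *k)
{
    hnode_put(k->ht_up);
    free(k);
}
```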
      
      v1 -> v2: use rtnl_dereference() instead of RCU read locks
      v2 -> v3:
        - don't check refcnt in u32_destroy_hnode()
        - cleaned-up u32_destroy() implementation
        - cleaned-up code comment
      v3 -> v4:
        - dropped unneeded comment
      Reported-by: Li Shuang <shuali@redhat.com>
      Fixes: c0d378ef ("net_sched: use tcf_queue_work() in u32 filter")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ebadf888