1. 13 Feb, 2018 20 commits
    • Eric Dumazet's avatar
      net: igmp: add a missing rcu locking section · ce43c07f
      Eric Dumazet authored
      
      [ Upstream commit e7aadb27 ]
      
      Newly added igmpv3_get_srcaddr() needs to be called under rcu lock.
      
      Timer callbacks do not ensure this locking.
      
      =============================
      WARNING: suspicious RCU usage
      4.15.0+ #200 Not tainted
      -----------------------------
      ./include/linux/inetdevice.h:216 suspicious rcu_dereference_check() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      3 locks held by syzkaller616973/4074:
       #0:  (&mm->mmap_sem){++++}, at: [<00000000bfce669e>] __do_page_fault+0x32d/0xc90 arch/x86/mm/fault.c:1355
       #1:  ((&im->timer)){+.-.}, at: [<00000000619d2f71>] lockdep_copy_map include/linux/lockdep.h:178 [inline]
       #1:  ((&im->timer)){+.-.}, at: [<00000000619d2f71>] call_timer_fn+0x1c6/0x820 kernel/time/timer.c:1316
       #2:  (&(&im->lock)->rlock){+.-.}, at: [<000000005f833c5c>] spin_lock_bh include/linux/spinlock.h:315 [inline]
       #2:  (&(&im->lock)->rlock){+.-.}, at: [<000000005f833c5c>] igmpv3_send_report+0x98/0x5b0 net/ipv4/igmp.c:600
      
      stack backtrace:
      CPU: 0 PID: 4074 Comm: syzkaller616973 Not tainted 4.15.0+ #200
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:17 [inline]
       dump_stack+0x194/0x257 lib/dump_stack.c:53
       lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4592
       __in_dev_get_rcu include/linux/inetdevice.h:216 [inline]
       igmpv3_get_srcaddr net/ipv4/igmp.c:329 [inline]
       igmpv3_newpack+0xeef/0x12e0 net/ipv4/igmp.c:389
       add_grhead.isra.27+0x235/0x300 net/ipv4/igmp.c:432
       add_grec+0xbd3/0x1170 net/ipv4/igmp.c:565
       igmpv3_send_report+0xd5/0x5b0 net/ipv4/igmp.c:605
       igmp_send_report+0xc43/0x1050 net/ipv4/igmp.c:722
       igmp_timer_expire+0x322/0x5c0 net/ipv4/igmp.c:831
       call_timer_fn+0x228/0x820 kernel/time/timer.c:1326
       expire_timers kernel/time/timer.c:1363 [inline]
       __run_timers+0x7ee/0xb70 kernel/time/timer.c:1666
       run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
       __do_softirq+0x2d7/0xb85 kernel/softirq.c:285
       invoke_softirq kernel/softirq.c:365 [inline]
       irq_exit+0x1cc/0x200 kernel/softirq.c:405
       exiting_irq arch/x86/include/asm/apic.h:541 [inline]
       smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052
       apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:938
      
      Fixes: a46182b0 ("net: igmp: Use correct source address on IGMPv3 reports")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ce43c07f
    • Nikolay Aleksandrov's avatar
      ip6mr: fix stale iterator · 7d3d60ef
      Nikolay Aleksandrov authored
      
      [ Upstream commit 4adfa79f ]
      
      When we dump the ip6mr mfc entries via proc, we initialize an iterator
      with the table to dump but we don't clear the cache pointer which might
      be initialized from a prior read on the same descriptor that ended. This
      can result in lock imbalance (an unnecessary unlock) leading to other
      crashes and hangs. Clear the cache pointer like ipmr does to fix the issue.
      Thanks for the reliable reproducer.
      
      Here's syzbot's trace:
       WARNING: bad unlock balance detected!
       4.15.0-rc3+ #128 Not tainted
       syzkaller971460/3195 is trying to release lock (mrt_lock) at:
       [<000000006898068d>] ipmr_mfc_seq_stop+0xe1/0x130 net/ipv6/ip6mr.c:553
       but there are no more locks to release!
      
       other info that might help us debug this:
       1 lock held by syzkaller971460/3195:
        #0:  (&p->lock){+.+.}, at: [<00000000744a6565>] seq_read+0xd5/0x13d0
       fs/seq_file.c:165
      
       stack backtrace:
       CPU: 1 PID: 3195 Comm: syzkaller971460 Not tainted 4.15.0-rc3+ #128
       Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
       Google 01/01/2011
       Call Trace:
        __dump_stack lib/dump_stack.c:17 [inline]
        dump_stack+0x194/0x257 lib/dump_stack.c:53
        print_unlock_imbalance_bug+0x12f/0x140 kernel/locking/lockdep.c:3561
        __lock_release kernel/locking/lockdep.c:3775 [inline]
        lock_release+0x5f9/0xda0 kernel/locking/lockdep.c:4023
        __raw_read_unlock include/linux/rwlock_api_smp.h:225 [inline]
        _raw_read_unlock+0x1a/0x30 kernel/locking/spinlock.c:255
        ipmr_mfc_seq_stop+0xe1/0x130 net/ipv6/ip6mr.c:553
        traverse+0x3bc/0xa00 fs/seq_file.c:135
        seq_read+0x96a/0x13d0 fs/seq_file.c:189
        proc_reg_read+0xef/0x170 fs/proc/inode.c:217
        do_loop_readv_writev fs/read_write.c:673 [inline]
        do_iter_read+0x3db/0x5b0 fs/read_write.c:897
        compat_readv+0x1bf/0x270 fs/read_write.c:1140
        do_compat_preadv64+0xdc/0x100 fs/read_write.c:1189
        C_SYSC_preadv fs/read_write.c:1209 [inline]
        compat_SyS_preadv+0x3b/0x50 fs/read_write.c:1203
        do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
        do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
        entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125
       RIP: 0023:0xf7f73c79
       RSP: 002b:00000000e574a15c EFLAGS: 00000292 ORIG_RAX: 000000000000014d
       RAX: ffffffffffffffda RBX: 000000000000000f RCX: 0000000020a3afb0
       RDX: 0000000000000001 RSI: 0000000000000067 RDI: 0000000000000000
       RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
       R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
       R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
       BUG: sleeping function called from invalid context at lib/usercopy.c:25
       in_atomic(): 1, irqs_disabled(): 0, pid: 3195, name: syzkaller971460
       INFO: lockdep is turned off.
       CPU: 1 PID: 3195 Comm: syzkaller971460 Not tainted 4.15.0-rc3+ #128
       Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
       Google 01/01/2011
       Call Trace:
        __dump_stack lib/dump_stack.c:17 [inline]
        dump_stack+0x194/0x257 lib/dump_stack.c:53
        ___might_sleep+0x2b2/0x470 kernel/sched/core.c:6060
        __might_sleep+0x95/0x190 kernel/sched/core.c:6013
        __might_fault+0xab/0x1d0 mm/memory.c:4525
        _copy_to_user+0x2c/0xc0 lib/usercopy.c:25
        copy_to_user include/linux/uaccess.h:155 [inline]
        seq_read+0xcb4/0x13d0 fs/seq_file.c:279
        proc_reg_read+0xef/0x170 fs/proc/inode.c:217
        do_loop_readv_writev fs/read_write.c:673 [inline]
        do_iter_read+0x3db/0x5b0 fs/read_write.c:897
        compat_readv+0x1bf/0x270 fs/read_write.c:1140
        do_compat_preadv64+0xdc/0x100 fs/read_write.c:1189
        C_SYSC_preadv fs/read_write.c:1209 [inline]
        compat_SyS_preadv+0x3b/0x50 fs/read_write.c:1203
        do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
        do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
        entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125
       RIP: 0023:0xf7f73c79
       RSP: 002b:00000000e574a15c EFLAGS: 00000292 ORIG_RAX: 000000000000014d
       RAX: ffffffffffffffda RBX: 000000000000000f RCX: 0000000020a3afb0
       RDX: 0000000000000001 RSI: 0000000000000067 RDI: 0000000000000000
       RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
       R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
       R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
       WARNING: CPU: 1 PID: 3195 at lib/usercopy.c:26 _copy_to_user+0xb5/0xc0
       lib/usercopy.c:26
      Reported-by: default avatarsyzbot <bot+eceb3204562c41a438fa1f2335e0fe4f6886d669@syzkaller.appspotmail.com>
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7d3d60ef
    • Sebastian Andrzej Siewior's avatar
      serial: core: mark port as initialized after successful IRQ change · ffcf167d
      Sebastian Andrzej Siewior authored
      commit 44117a1d upstream.
      
      setserial changes the IRQ via uart_set_info(). It invokes
      uart_shutdown() which free the current used IRQ and clear
      TTY_PORT_INITIALIZED. It will then update the IRQ number and invoke
      uart_startup() before returning to the caller leaving
      TTY_PORT_INITIALIZED cleared.
      
      The next open will crash with
      |  list_add double add: new=ffffffff839fcc98, prev=ffffffff839fcc98, next=ffffffff839fcc98.
      since the close from the IOCTL won't free the IRQ (and clean the list)
      due to the TTY_PORT_INITIALIZED check in uart_shutdown().
      
      There is same pattern in uart_do_autoconfig() and I *think* it also
      needs to set TTY_PORT_INITIALIZED there.
      Is there a reason why uart_startup() does not set the flag by itself
      after the IRQ has been acquired (since it is cleared in uart_shutdown)?
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ffcf167d
    • Hugh Dickins's avatar
      kaiser: allocate pgd with order 0 when pti=off · 400d3c8b
      Hugh Dickins authored
      The 4.9.77 version of "x86/pti/efi: broken conversion from efi to kernel
      page table" looked nicer than the 4.4.112 version, but was suboptimal on
      machines booted with "pti=off" (or on AMD machines): it allocated pgd
      with an order 1 page whatever the setting of kaiser_enabled.
      
      Fix that by moving the definition of PGD_ALLOCATION_ORDER from
      asm/pgalloc.h to asm/pgtable.h, which already defines kaiser_enabled.
      
      Fixes: 1b92c48a ("x86/pti/efi: broken conversion from efi to kernel page table")
      Reviewed-by: default avatarPavel Tatashin <pasha.tatashin@oracle.com>
      Cc: Steven Sistare <steven.sistare@oracle.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      400d3c8b
    • Dave Hansen's avatar
      x86/pti: Make unpoison of pgd for trusted boot work for real · ae1fc8de
      Dave Hansen authored
      commit 445b69e3 upstream
      
      The inital fix for trusted boot and PTI potentially misses the pgd clearing
      if pud_alloc() sets a PGD.  It probably works in *practice* because for two
      adjacent calls to map_tboot_page() that share a PGD entry, the first will
      clear NX, *then* allocate and set the PGD (without NX clear).  The second
      call will *not* allocate but will clear the NX bit.
      
      Defer the NX clearing to a point after it is known that all top-level
      allocations have occurred.  Add a comment to clarify why.
      
      [ tglx: Massaged changelog ]
      
      [ hughd notes: I have not tested tboot, but this looks to me as necessary
      and as safe in old-Kaiser backports as it is upstream; I'm not submitting
      the commit-to-be-fixed 262b6b30, since it was undone by 445b69e3,
      and makes conflict trouble because of 5-level's p4d versus 4-level's pgd.]
      
      Fixes: 262b6b30 ("x86/tboot: Unbreak tboot with PTI enabled")
      Signed-off-by: default avatarDave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarAndrea Arcangeli <aarcange@redhat.com>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: gnomes@lxorguk.ukuu.org.uk
      Cc: peterz@infradead.org
      Cc: ning.sun@intel.com
      Cc: tboot-devel@lists.sourceforge.net
      Cc: andi@firstfloor.org
      Cc: luto@kernel.org
      Cc: law@redhat.com
      Cc: pbonzini@redhat.com
      Cc: torvalds@linux-foundation.org
      Cc: gregkh@linux-foundation.org
      Cc: dwmw@amazon.co.uk
      Cc: nickc@redhat.com
      Link: https://lkml.kernel.org/r/20180110224939.2695CD47@viggo.jf.intel.com
      Cc: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ae1fc8de
    • Hugh Dickins's avatar
      kaiser: fix intel_bts perf crashes · 0a61cd6c
      Hugh Dickins authored
      Vince reported perf_fuzzer quickly locks up on 4.15-rc7 with PTI;
      Robert reported Bad RIP with KPTI and Intel BTS also on 4.15-rc7:
      honggfuzz -f /tmp/somedirectorywithatleastonefile \
                --linux_perf_bts_edge -s -- /bin/true
      (honggfuzz from https://github.com/google/honggfuzz) crashed with
      BUG: unable to handle kernel paging request at ffff9d3215100000
      (then narrowed it down to
      perf record --per-thread -e intel_bts//u -- /bin/ls).
      
      The intel_bts driver does not use the 'normal' BTS buffer which is
      exposed through kaiser_add_mapping(), but instead uses the memory
      allocated for the perf AUX buffer.
      
      This obviously comes apart when using PTI, because then the kernel
      mapping, which includes that AUX buffer memory, disappears while
      switched to user page tables.
      
      Easily fixed in old-Kaiser backports, by applying kaiser_add_mapping()
      to those pages; perhaps not so easy for upstream, where 4.15-rc8 commit
      99a9dc98 ("x86,perf: Disable intel_bts when PTI") disables for now.
      
      Slightly reorganized surrounding code in bts_buffer_setup_aux(),
      so it can better match bts_buffer_free_aux(): free_aux with an #ifdef
      to avoid the loop when PTI is off, but setup_aux needs to loop anyway
      (and kaiser_add_mapping() is cheap when PTI config is off or "pti=off").
      Reported-by: default avatarVince Weaver <vincent.weaver@maine.edu>
      Reported-by: default avatarRobert Święcki <robert@swiecki.net>
      Analyzed-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Analyzed-by: default avatarStephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Vince Weaver <vince@deater.net>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0a61cd6c
    • Jesse Chan's avatar
      ASoC: pcm512x: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE · 374c84de
      Jesse Chan authored
      commit 0cab20ce upstream.
      
      This change resolves a new compile-time warning
      when built as a loadable module:
      
      WARNING: modpost: missing MODULE_LICENSE() in sound/soc/codecs/snd-soc-pcm512x-spi.o
      see include/linux/module.h for more information
      
      This adds the license as "GPL v2", which matches the header of the file.
      
      MODULE_DESCRIPTION and MODULE_AUTHOR are also added.
      Signed-off-by: default avatarJesse Chan <jc@linux.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      374c84de
    • Jesse Chan's avatar
      pinctrl: pxa: pxa2xx: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE · 0ee4f5e7
      Jesse Chan authored
      commit 0b9335cb upstream.
      
      This change resolves a new compile-time warning
      when built as a loadable module:
      
      WARNING: modpost: missing MODULE_LICENSE() in drivers/pinctrl/pxa/pinctrl-pxa2xx.o
      see include/linux/module.h for more information
      
      This adds the license as "GPL v2", which matches the header of the file.
      
      MODULE_DESCRIPTION and MODULE_AUTHOR are also added.
      Signed-off-by: default avatarJesse Chan <jc@linux.com>
      Signed-off-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0ee4f5e7
    • Jesse Chan's avatar
      auxdisplay: img-ascii-lcd: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE · 781a2d68
      Jesse Chan authored
      commit 09c479f7 upstream.
      
      This change resolves a new compile-time warning
      when built as a loadable module:
      
      WARNING: modpost: missing MODULE_LICENSE() in drivers/auxdisplay/img-ascii-lcd.o
      see include/linux/module.h for more information
      
      This adds the license as "GPL", which matches the header of the file.
      
      MODULE_DESCRIPTION and MODULE_AUTHOR are also added.
      Signed-off-by: default avatarJesse Chan <jc@linux.com>
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      781a2d68
    • Michael Ellerman's avatar
      powerpc/64s: Allow control of RFI flush via debugfs · 9fed3978
      Michael Ellerman authored
      commit 236003e6 upstream.
      
      Expose the state of the RFI flush (enabled/disabled) via debugfs, and
      allow it to be enabled/disabled at runtime.
      
      eg: $ cat /sys/kernel/debug/powerpc/rfi_flush
          1
          $ echo 0 > /sys/kernel/debug/powerpc/rfi_flush
          $ cat /sys/kernel/debug/powerpc/rfi_flush
          0
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: default avatarNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9fed3978
    • Michael Ellerman's avatar
      powerpc/64s: Wire up cpu_show_meltdown() · 1f0c936f
      Michael Ellerman authored
      commit fd6e440f upstream.
      
      The recent commit 87590ce6 ("sysfs/cpu: Add vulnerability folder")
      added a generic folder and set of files for reporting information on
      CPU vulnerabilities. One of those was for meltdown:
      
        /sys/devices/system/cpu/vulnerabilities/meltdown
      
      This commit wires up that file for 64-bit Book3S powerpc.
      
      For now we default to "Vulnerable" unless the RFI flush is enabled.
      That may not actually be true on all hardware, further patches will
      refine the reporting based on the CPU/platform etc. But for now we
      default to being pessimists.
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1f0c936f
    • Oliver O'Halloran's avatar
      powerpc/powernv: Check device-tree for RFI flush settings · 6aec12e1
      Oliver O'Halloran authored
      commit 6e032b35 upstream.
      
      New device-tree properties are available which tell the hypervisor
      settings related to the RFI flush. Use them to determine the
      appropriate flush instruction to use, and whether the flush is
      required.
      Signed-off-by: default avatarOliver O'Halloran <oohall@gmail.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6aec12e1
    • Michael Neuling's avatar
      powerpc/pseries: Query hypervisor for RFI flush settings · 7db0fff6
      Michael Neuling authored
      commit 8989d568 upstream.
      
      A new hypervisor call is available which tells the guest settings
      related to the RFI flush. Use it to query the appropriate flush
      instruction(s), and whether the flush is required.
      Signed-off-by: default avatarMichael Neuling <mikey@neuling.org>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7db0fff6
    • Michael Ellerman's avatar
      powerpc/64s: Support disabling RFI flush with no_rfi_flush and nopti · 0ef9f828
      Michael Ellerman authored
      commit bc9c9304 upstream.
      
      Because there may be some performance overhead of the RFI flush, add
      kernel command line options to disable it.
      
      We add a sensibly named 'no_rfi_flush' option, but we also hijack the
      x86 option 'nopti'. The RFI flush is not the same as KPTI, but if we
      see 'nopti' we can guess that the user is trying to avoid any overhead
      of Meltdown mitigations, and it means we don't have to educate every
      one about a different command line option.
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0ef9f828
    • Michael Ellerman's avatar
      powerpc/64s: Add support for RFI flush of L1-D cache · c3b82ebe
      Michael Ellerman authored
      commit aa8a5e00 upstream.
      
      On some CPUs we can prevent the Meltdown vulnerability by flushing the
      L1-D cache on exit from kernel to user mode, and from hypervisor to
      guest.
      
      This is known to be the case on at least Power7, Power8 and Power9. At
      this time we do not know the status of the vulnerability on other CPUs
      such as the 970 (Apple G5), pasemi CPUs (AmigaOne X1000) or Freescale
      CPUs. As more information comes to light we can enable this, or other
      mechanisms on those CPUs.
      
      The vulnerability occurs when the load of an architecturally
      inaccessible memory region (eg. userspace load of kernel memory) is
      speculatively executed to the point where its result can influence the
      address of a subsequent speculatively executed load.
      
      In order for that to happen, the first load must hit in the L1,
      because before the load is sent to the L2 the permission check is
      performed. Therefore if no kernel addresses hit in the L1 the
      vulnerability can not occur. We can ensure that is the case by
      flushing the L1 whenever we return to userspace. Similarly for
      hypervisor vs guest.
      
      In order to flush the L1-D cache on exit, we add a section of nops at
      each (h)rfi location that returns to a lower privileged context, and
      patch that with some sequence. Newer firmwares are able to advertise
      to us that there is a special nop instruction that flushes the L1-D.
      If we do not see that advertised, we fall back to doing a displacement
      flush in software.
      
      For guest kernels we support migration between some CPU versions, and
      different CPUs may use different flush instructions. So that we are
      prepared to migrate to a machine with a different flush instruction
      activated, we may have to patch more than one flush instruction at
      boot if the hypervisor tells us to.
      
      In the end this patch is mostly the work of Nicholas Piggin and
      Michael Ellerman. However a cast of thousands contributed to analysis
      of the issue, earlier versions of the patch, back ports testing etc.
      Many thanks to all of them.
      Tested-by: default avatarJon Masters <jcm@redhat.com>
      Signed-off-by: default avatarNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      [Balbir - back ported to stable with changes]
      Signed-off-by: default avatarBalbir Singh <bsingharora@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c3b82ebe
    • Nicholas Piggin's avatar
      powerpc/64s: Convert slb_miss_common to use RFI_TO_USER/KERNEL · 48cc95d4
      Nicholas Piggin authored
      commit c7305645 upstream.
      
      In the SLB miss handler we may be returning to user or kernel. We need
      to add a check early on and save the result in the cr4 register, and
      then we bifurcate the return path based on that.
      Signed-off-by: default avatarNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarNicholas Piggin <npiggin@gmail.com>
      [mpe: Backport to 4.4 based on patch from Balbir]
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      48cc95d4
    • Nicholas Piggin's avatar
      powerpc/64: Convert the syscall exit path to use RFI_TO_USER/KERNEL · 00e40620
      Nicholas Piggin authored
      commit b8e90cb7 upstream.
      
      In the syscall exit path we may be returning to user or kernel
      context. We already have a test for that, because we conditionally
      restore r13. So use that existing test and branch, and bifurcate the
      return based on that.
      Signed-off-by: default avatarNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      00e40620
    • Nicholas Piggin's avatar
      powerpc/64: Convert fast_exception_return to use RFI_TO_USER/KERNEL · 9d914324
      Nicholas Piggin authored
      commit a08f828c upstream.
      
      Similar to the syscall return path, in fast_exception_return we may be
      returning to user or kernel context. We already have a test for that,
      because we conditionally restore r13. So use that existing test and
      branch, and bifurcate the return based on that.
      Signed-off-by: default avatarNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9d914324
    • Nicholas Piggin's avatar
      powerpc/64: Add macros for annotating the destination of rfid/hrfid · 8fd3f98d
      Nicholas Piggin authored
      commit 50e51c13 upstream.
      
      The rfid/hrfid ((Hypervisor) Return From Interrupt) instruction is
      used for switching from the kernel to userspace, and from the
      hypervisor to the guest kernel. However it can and is also used for
      other transitions, eg. from real mode kernel code to virtual mode
      kernel code, and it's not always clear from the code what the
      destination context is.
      
      To make it clearer when reading the code, add macros which encode the
      expected destination context.
      Signed-off-by: default avatarNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8fd3f98d
    • Michael Neuling's avatar
      powerpc/pseries: Add H_GET_CPU_CHARACTERISTICS flags & wrapper · be6641a7
      Michael Neuling authored
      commit 191eccb1 upstream.
      
      A new hypervisor call has been defined to communicate various
      characteristics of the CPU to guests. Add definitions for the hcall
      number, flags and a wrapper function.
      Signed-off-by: default avatarMichael Neuling <mikey@neuling.org>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      [Balbir fixed conflicts in backport]
      Signed-off-by: default avatarBalbir Singh <bsingharora@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      be6641a7
  2. 03 Feb, 2018 20 commits