1. 16 Feb, 2018 4 commits
    • Lukas Wunner's avatar
      drm/radeon: Fix deadlock on runtime suspend · 15734fef
      Lukas Wunner authored
      radeon's ->runtime_suspend hook calls drm_kms_helper_poll_disable(),
      which waits for the output poll worker to finish if it's running.
      
      The output poll worker meanwhile calls pm_runtime_get_sync() in
      radeon's ->detect hooks, which waits for the ongoing suspend to finish,
      causing a deadlock.
      
      Fix by not acquiring a runtime PM ref if the ->detect hooks are called
      in the output poll worker's context.  This is safe because the poll
      worker is only enabled while runtime active and we know that
      ->runtime_suspend waits for it to finish.
      
      Stack trace for posterity:
      
        INFO: task kworker/0:3:31847 blocked for more than 120 seconds
        Workqueue: events output_poll_execute [drm_kms_helper]
        Call Trace:
         schedule+0x3c/0x90
         rpm_resume+0x1e2/0x690
         __pm_runtime_resume+0x3f/0x60
         radeon_lvds_detect+0x39/0xf0 [radeon]
         output_poll_execute+0xda/0x1e0 [drm_kms_helper]
         process_one_work+0x14b/0x440
         worker_thread+0x48/0x4a0
      
        INFO: task kworker/2:0:10493 blocked for more than 120 seconds.
        Workqueue: pm pm_runtime_work
        Call Trace:
         schedule+0x3c/0x90
         schedule_timeout+0x1b3/0x240
         wait_for_common+0xc2/0x180
         wait_for_completion+0x1d/0x20
         flush_work+0xfc/0x1a0
         __cancel_work_timer+0xa5/0x1d0
         cancel_delayed_work_sync+0x13/0x20
         drm_kms_helper_poll_disable+0x1f/0x30 [drm_kms_helper]
         radeon_pmops_runtime_suspend+0x3d/0xa0 [radeon]
         pci_pm_runtime_suspend+0x61/0x1a0
         vga_switcheroo_runtime_suspend+0x21/0x70
         __rpm_callback+0x32/0x70
         rpm_callback+0x24/0x80
         rpm_suspend+0x12b/0x640
         pm_runtime_work+0x6f/0xb0
         process_one_work+0x14b/0x440
         worker_thread+0x48/0x4a0
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94147
      Fixes: 10ebc0bc ("drm/radeon: add runtime PM support (v2)")
      Cc: stable@vger.kernel.org # v3.13+: 27d4ee03: workqueue: Allow retrieval of current task's work struct
      Cc: stable@vger.kernel.org # v3.13+: 25c058cc: drm: Allow determining if current task is output poll worker
      Cc: Ismo Toijala <ismo.toijala@gmail.com>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Cc: Dave Airlie <airlied@redhat.com>
      Reviewed-by: default avatarLyude Paul <lyude@redhat.com>
      Signed-off-by: default avatarLukas Wunner <lukas@wunner.de>
      Link: https://patchwork.freedesktop.org/patch/msgid/64ea02c44f91dda19bc563902b97bbc699040392.1518338789.git.lukas@wunner.de
      15734fef
    • Lukas Wunner's avatar
      drm/nouveau: Fix deadlock on runtime suspend · d61a5c10
      Lukas Wunner authored
      nouveau's ->runtime_suspend hook calls drm_kms_helper_poll_disable(),
      which waits for the output poll worker to finish if it's running.
      
      The output poll worker meanwhile calls pm_runtime_get_sync() in
      nouveau_connector_detect() which waits for the ongoing suspend to finish,
      causing a deadlock.
      
      Fix by not acquiring a runtime PM ref if nouveau_connector_detect() is
      called in the output poll worker's context.  This is safe because
      the poll worker is only enabled while runtime active and we know that
      ->runtime_suspend waits for it to finish.
      
      Other contexts calling nouveau_connector_detect() do require a runtime
      PM ref, these comprise:
      
        status_store() drm sysfs interface
        ->fill_modes drm callback
        drm_fb_helper_probe_connector_modes()
        drm_mode_getconnector()
        nouveau_connector_hotplug()
        nouveau_display_hpd_work()
        nv17_tv_set_property()
      
      Stack trace for posterity:
      
        INFO: task kworker/0:1:58 blocked for more than 120 seconds.
        Workqueue: events output_poll_execute [drm_kms_helper]
        Call Trace:
         schedule+0x28/0x80
         rpm_resume+0x107/0x6e0
         __pm_runtime_resume+0x47/0x70
         nouveau_connector_detect+0x7e/0x4a0 [nouveau]
         nouveau_connector_detect_lvds+0x132/0x180 [nouveau]
         drm_helper_probe_detect_ctx+0x85/0xd0 [drm_kms_helper]
         output_poll_execute+0x11e/0x1c0 [drm_kms_helper]
         process_one_work+0x184/0x380
         worker_thread+0x2e/0x390
      
        INFO: task kworker/0:2:252 blocked for more than 120 seconds.
        Workqueue: pm pm_runtime_work
        Call Trace:
         schedule+0x28/0x80
         schedule_timeout+0x1e3/0x370
         wait_for_completion+0x123/0x190
         flush_work+0x142/0x1c0
         nouveau_pmops_runtime_suspend+0x7e/0xd0 [nouveau]
         pci_pm_runtime_suspend+0x5c/0x180
         vga_switcheroo_runtime_suspend+0x1e/0xa0
         __rpm_callback+0xc1/0x200
         rpm_callback+0x1f/0x70
         rpm_suspend+0x13c/0x640
         pm_runtime_work+0x6e/0x90
         process_one_work+0x184/0x380
         worker_thread+0x2e/0x390
      
      Bugzilla: https://bugs.archlinux.org/task/53497
      Bugzilla: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=870523
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70388#c33
      Fixes: 5addcf0a ("nouveau: add runtime PM support (v0.9)")
      Cc: stable@vger.kernel.org # v3.12+: 27d4ee03: workqueue: Allow retrieval of current task's work struct
      Cc: stable@vger.kernel.org # v3.12+: 25c058cc: drm: Allow determining if current task is output poll worker
      Cc: Ben Skeggs <bskeggs@redhat.com>
      Cc: Dave Airlie <airlied@redhat.com>
      Reviewed-by: default avatarLyude Paul <lyude@redhat.com>
      Signed-off-by: default avatarLukas Wunner <lukas@wunner.de>
      Link: https://patchwork.freedesktop.org/patch/msgid/b7d2cbb609a80f59ccabfdf479b9d5907c603ea1.1518338789.git.lukas@wunner.de
      d61a5c10
    • Lukas Wunner's avatar
      drm: Allow determining if current task is output poll worker · 25c058cc
      Lukas Wunner authored
      Introduce a helper to determine if the current task is an output poll
      worker.
      
      This allows us to fix a long-standing deadlock in several DRM drivers
      wherein the ->runtime_suspend callback waits for the output poll worker
      to finish and the worker in turn calls a ->detect callback which waits
      for runtime suspend to finish.  The ->detect callback is invoked from
      multiple call sites and waiting for runtime suspend to finish is the
      correct thing to do except if it's executing in the context of the
      worker.
      
      v2: Expand kerneldoc to specifically mention deadlock between
          output poll worker and autosuspend worker as use case. (Lyude)
      
      Cc: Dave Airlie <airlied@redhat.com>
      Cc: Ben Skeggs <bskeggs@redhat.com>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Reviewed-by: default avatarLyude Paul <lyude@redhat.com>
      Signed-off-by: default avatarLukas Wunner <lukas@wunner.de>
      Link: https://patchwork.freedesktop.org/patch/msgid/3549ce32e7f1467102e70d3e9cbf70c46bfe108e.1518593424.git.lukas@wunner.de
      25c058cc
    • Lukas Wunner's avatar
      workqueue: Allow retrieval of current task's work struct · 27d4ee03
      Lukas Wunner authored
      Introduce a helper to retrieve the current task's work struct if it is
      a workqueue worker.
      
      This allows us to fix a long-standing deadlock in several DRM drivers
      wherein the ->runtime_suspend callback waits for a specific worker to
      finish and that worker in turn calls a function which waits for runtime
      suspend to finish.  That function is invoked from multiple call sites
      and waiting for runtime suspend to finish is the correct thing to do
      except if it's executing in the context of the worker.
      
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Cc: Dave Airlie <airlied@redhat.com>
      Cc: Ben Skeggs <bskeggs@redhat.com>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Reviewed-by: default avatarLyude Paul <lyude@redhat.com>
      Signed-off-by: default avatarLukas Wunner <lukas@wunner.de>
      Link: https://patchwork.freedesktop.org/patch/msgid/2d8f603074131eb87e588d2b803a71765bd3a2fd.1518338788.git.lukas@wunner.de
      27d4ee03
  2. 01 Feb, 2018 1 commit
  3. 31 Jan, 2018 1 commit
  4. 18 Jan, 2018 4 commits
  5. 17 Jan, 2018 2 commits
  6. 14 Jan, 2018 9 commits
    • Linus Torvalds's avatar
      Linux 4.15-rc8 · a8750ddc
      Linus Torvalds authored
      a8750ddc
    • Linus Torvalds's avatar
      Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · aaae98a8
      Linus Torvalds authored
      Pull x86 fixlet from Thomas Gleixner.
      
      Remove a warning about lack of compiler support for retpoline that most
      people can't do anything about, so it just annoys them needlessly.
      
      * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/retpoline: Remove compile time warning
      aaae98a8
    • Linus Torvalds's avatar
      Merge tag 'powerpc-4.15-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · 6bb82119
      Linus Torvalds authored
      Pull powerpc fixes from Michael Ellerman:
       "One fix for an oops at boot if we take a hotplug interrupt before we
        are ready to handle it.
      
        The bulk is patches to implement mitigation for Meltdown, see the
        change logs for more details.
      
        Thanks to: Nicholas Piggin, Michael Neuling, Oliver O'Halloran, Jon
        Masters, Jose Ricardo Ziviani, David Gibson"
      
      * tag 'powerpc-4.15-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/powernv: Check device-tree for RFI flush settings
        powerpc/pseries: Query hypervisor for RFI flush settings
        powerpc/64s: Support disabling RFI flush with no_rfi_flush and nopti
        powerpc/64s: Add support for RFI flush of L1-D cache
        powerpc/64s: Convert slb_miss_common to use RFI_TO_USER/KERNEL
        powerpc/64: Convert fast_exception_return to use RFI_TO_USER/KERNEL
        powerpc/64: Convert the syscall exit path to use RFI_TO_USER/KERNEL
        powerpc/64s: Simple RFI macro conversions
        powerpc/64: Add macros for annotating the destination of rfid/hrfid
        powerpc/pseries: Add H_GET_CPU_CHARACTERISTICS flags & wrapper
        powerpc/pseries: Make RAS IRQ explicitly dependent on DLPAR WQ
      6bb82119
    • Thomas Gleixner's avatar
      x86/retpoline: Remove compile time warning · b8b9ce4b
      Thomas Gleixner authored
      Remove the compile time warning when CONFIG_RETPOLINE=y and the compiler
      does not have retpoline support. Linus rationale for this is:
      
        It's wrong because it will just make people turn off RETPOLINE, and the
        asm updates - and return stack clearing - that are independent of the
        compiler are likely the most important parts because they are likely the
        ones easiest to target.
      
        And it's annoying because most people won't be able to do anything about
        it. The number of people building their own compiler? Very small. So if
        their distro hasn't got a compiler yet (and pretty much nobody does), the
        warning is just annoying crap.
      
        It is already properly reported as part of the sysfs interface. The
        compile-time warning only encourages bad things.
      
      Fixes: 76b04384 ("x86/retpoline: Add initial retpoline support")
      Requested-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: gnomes@lxorguk.ukuu.org.uk
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: thomas.lendacky@amd.com
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Kees Cook <keescook@google.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
      Link: https://lkml.kernel.org/r/CA+55aFzWgquv4i6Mab6bASqYXg3ErV3XDFEYf=GEcCDQg5uAtw@mail.gmail.com
      b8b9ce4b
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.dk/linux-block · 9443c168
      Linus Torvalds authored
      Pull NVMe fix from Jens Axboe:
       "Just a single fix for nvme over fabrics that should go into 4.15"
      
      * 'for-linus' of git://git.kernel.dk/linux-block:
        nvme-fabrics: initialize default host->id in nvmf_host_default()
      9443c168
    • Linus Torvalds's avatar
      Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 40548c6b
      Linus Torvalds authored
      Pull x86 pti updates from Thomas Gleixner:
       "This contains:
      
         - a PTI bugfix to avoid setting reserved CR3 bits when PCID is
           disabled. This seems to cause issues on a virtual machine at least
           and is incorrect according to the AMD manual.
      
         - a PTI bugfix which disables the perf BTS facility if PTI is
           enabled. The BTS AUX buffer is not globally visible and causes the
           CPU to fault when the mapping disappears on switching CR3 to user
           space. A full fix which restores BTS on PTI is non trivial and will
           be worked on.
      
         - PTI bugfixes for EFI and trusted boot which make sure that the user
           space visible page table entries have the NX bit cleared
      
         - removal of dead code in the PTI pagetable setup functions
      
         - add PTI documentation
      
         - add a selftest for vsyscall to verify that the kernel actually
           implements what it advertises.
      
         - a sysfs interface to expose vulnerability and mitigation
           information so there is a coherent way for users to retrieve the
           status.
      
         - the initial spectre_v2 mitigations, aka retpoline:
      
            + The necessary ASM thunk and compiler support
      
            + The ASM variants of retpoline and the conversion of affected ASM
              code
      
            + Make LFENCE serializing on AMD so it can be used as speculation
              trap
      
            + The RSB fill after vmexit
      
         - initial objtool support for retpoline
      
        As I said in the status mail this is the most of the set of patches
        which should go into 4.15 except two straight forward patches still on
        hold:
      
         - the retpoline add on of LFENCE which waits for ACKs
      
         - the RSB fill after context switch
      
        Both should be ready to go early next week and with that we'll have
        covered the major holes of spectre_v2 and go back to normality"
      
      * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (28 commits)
        x86,perf: Disable intel_bts when PTI
        security/Kconfig: Correct the Documentation reference for PTI
        x86/pti: Fix !PCID and sanitize defines
        selftests/x86: Add test_vsyscall
        x86/retpoline: Fill return stack buffer on vmexit
        x86/retpoline/irq32: Convert assembler indirect jumps
        x86/retpoline/checksum32: Convert assembler indirect jumps
        x86/retpoline/xen: Convert Xen hypercall indirect jumps
        x86/retpoline/hyperv: Convert assembler indirect jumps
        x86/retpoline/ftrace: Convert ftrace assembler indirect jumps
        x86/retpoline/entry: Convert entry assembler indirect jumps
        x86/retpoline/crypto: Convert crypto assembler indirect jumps
        x86/spectre: Add boot time option to select Spectre v2 mitigation
        x86/retpoline: Add initial retpoline support
        objtool: Allow alternatives to be ignored
        objtool: Detect jumps to retpoline thunks
        x86/pti: Make unpoison of pgd for trusted boot work for real
        x86/alternatives: Fix optimize_nops() checking
        sysfs/cpu: Fix typos in vulnerability documentation
        x86/cpu/AMD: Use LFENCE_RDTSC in preference to MFENCE_RDTSC
        ...
      40548c6b
    • Peter Zijlstra's avatar
      x86,perf: Disable intel_bts when PTI · 99a9dc98
      Peter Zijlstra authored
      The intel_bts driver does not use the 'normal' BTS buffer which is exposed
      through the cpu_entry_area but instead uses the memory allocated for the
      perf AUX buffer.
      
      This obviously comes apart when using PTI because then the kernel mapping;
      which includes that AUX buffer memory; disappears. Fixing this requires to
      expose a mapping which is visible in all context and that's not trivial.
      
      As a quick fix disable this driver when PTI is enabled to prevent
      malfunction.
      
      Fixes: 385ce0ea ("x86/mm/pti: Add Kconfig")
      Reported-by: default avatarVince Weaver <vincent.weaver@maine.edu>
      Reported-by: default avatarRobert Święcki <robert@swiecki.net>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: greg@kroah.com
      Cc: hughd@google.com
      Cc: luto@amacapital.net
      Cc: Vince Weaver <vince@deater.net>
      Cc: torvalds@linux-foundation.org
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20180114102713.GB6166@worktop.programming.kicks-ass.net
      99a9dc98
    • W. Trevor King's avatar
      security/Kconfig: Correct the Documentation reference for PTI · a237f762
      W. Trevor King authored
      When the config option for PTI was added a reference to documentation was
      added as well. But the documentation did not exist at that point. The final
      documentation has a different file name.
      
      Fix it up to point to the proper file.
      
      Fixes: 385ce0ea ("x86/mm/pti: Add Kconfig")
      Signed-off-by: default avatarW. Trevor King <wking@tremily.us>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: linux-mm@kvack.org
      Cc: linux-security-module@vger.kernel.org
      Cc: James Morris <james.l.morris@oracle.com>
      Cc: "Serge E. Hallyn" <serge@hallyn.com>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/3009cc8ccbddcd897ec1e0cb6dda524929de0d14.1515799398.git.wking@tremily.us
      a237f762
    • Thomas Gleixner's avatar
      x86/pti: Fix !PCID and sanitize defines · f10ee3dc
      Thomas Gleixner authored
      The switch to the user space page tables in the low level ASM code sets
      unconditionally bit 12 and bit 11 of CR3. Bit 12 is switching the base
      address of the page directory to the user part, bit 11 is switching the
      PCID to the PCID associated with the user page tables.
      
      This fails on a machine which lacks PCID support because bit 11 is set in
      CR3. Bit 11 is reserved when PCID is inactive.
      
      While the Intel SDM claims that the reserved bits are ignored when PCID is
      disabled, the AMD APM states that they should be cleared.
      
      This went unnoticed as the AMD APM was not checked when the code was
      developed and reviewed and test systems with Intel CPUs never failed to
      boot. The report is against a Centos 6 host where the guest fails to boot,
      so it's not yet clear whether this is a virt issue or can happen on real
      hardware too, but thats irrelevant as the AMD APM clearly ask for clearing
      the reserved bits.
      
      Make sure that on non PCID machines bit 11 is not set by the page table
      switching code.
      
      Andy suggested to rename the related bits and masks so they are clearly
      describing what they should be used for, which is done as well for clarity.
      
      That split could have been done with alternatives but the macro hell is
      horrible and ugly. This can be done on top if someone cares to remove the
      extra orq. For now it's a straight forward fix.
      
      Fixes: 6fd166aa ("x86/mm: Use/Fix PCID to optimize user/kernel switches")
      Reported-by: default avatarLaura Abbott <labbott@redhat.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: stable <stable@vger.kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Willy Tarreau <w@1wt.eu>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801140009150.2371@nanos
      f10ee3dc
  7. 13 Jan, 2018 13 commits
  8. 12 Jan, 2018 6 commits