1. 30 Nov, 2017 16 commits
    • x86/entry/64: Add missing irqflags tracing to native_load_gs_index() · f9a64e23
      Andy Lutomirski authored
      commit ca37e57b upstream.
      
      Running this code with IRQs enabled (where dummy_lock is a spinlock):
      
      static void check_load_gs_index(void)
      {
      	/* This will fail. */
      	load_gs_index(0xffff);
      
      	spin_lock(&dummy_lock);
      	spin_unlock(&dummy_lock);
      }
      
      Will generate a lockdep warning.  The issue is that the actual write
      to %gs would cause an exception with IRQs disabled, and the exception
      handler would, as an inadvertent side effect, update irqflag tracing
      to reflect the IRQs-off status.  native_load_gs_index() would then
      turn IRQs back on and return with irqflag tracing still thinking that
      IRQs were off.  The dummy lock-and-unlock causes lockdep to notice the
      error and warn.
      
      Fix it by adding the missing tracing.
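The mismatch described above can be modelled in a few lines of userspace C. This is only a toy sketch: `toy_cpu`, `toy_exception()` and `toy_load_gs_index()` are hypothetical names, the real code is x86 assembly, and re-enabling tracing only when IRQs were on at entry is a simplification made for the model.

```c
#include <stdbool.h>

/* Toy CPU state: the real interrupt flag and what the tracer believes. */
struct toy_cpu {
	bool hw_irqs_on;
	bool traced_irqs_on;
};

/*
 * The exception handler runs with IRQs off and, as a side effect,
 * records that state in the tracer.
 */
static void toy_exception(struct toy_cpu *cpu)
{
	cpu->traced_irqs_on = false;
}

/*
 * Model of the %gs load: IRQs are disabled around the write and then
 * restored.  With with_tracing == false the tracer is never told about
 * either transition, which mirrors the behaviour before the fix.
 */
static void toy_load_gs_index(struct toy_cpu *cpu, bool faults,
			      bool with_tracing)
{
	bool saved = cpu->hw_irqs_on;

	cpu->hw_irqs_on = false;		/* disable IRQs */
	if (with_tracing)
		cpu->traced_irqs_on = false;	/* tell the tracer */

	if (faults)
		toy_exception(cpu);		/* the %gs write faults */

	if (with_tracing && saved)
		cpu->traced_irqs_on = true;	/* tell the tracer again */
	cpu->hw_irqs_on = saved;		/* restore IRQs */
}

/* True when the tracer's view matches the real interrupt flag. */
static bool toy_tracing_consistent(const struct toy_cpu *cpu)
{
	return cpu->hw_irqs_on == cpu->traced_irqs_on;
}
```

Starting from a state with IRQs on, a faulting load without tracing leaves `toy_tracing_consistent()` false, which is the condition lockdep later notices; with tracing the two views agree again on return.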
      
      Apparently nothing did this in a context where it mattered.  I haven't
      tried to find a code path that would actually exhibit the warning if
      appropriately nasty user code were running.
      
      I suspect that the security impact of this bug is very, very low --
      production systems don't run with lockdep enabled, and the warning is
      mostly harmless anyway.
      
      Found during a quick audit of the entry code to try to track down an
      unrelated bug that Ingo found in some still-in-development code.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/e1aeb0e6ba8dd430ec36c8a35e63b429698b4132.1511411918.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/entry/64: Fix entry_SYSCALL_64_after_hwframe() IRQ tracing · c91f3fc2
      Andy Lutomirski authored
      commit 548c3050 upstream.
      
      When I added entry_SYSCALL_64_after_hwframe(), I left TRACE_IRQS_OFF
      before it.  This means that users of entry_SYSCALL_64_after_hwframe()
      were responsible for invoking TRACE_IRQS_OFF, and the one and only
      user (Xen, added in the same commit) got it wrong.
      
      I think this would manifest as a warning if a Xen PV guest with
      CONFIG_DEBUG_LOCKDEP=y were used with context tracking.  (The
      context tracking bit is to cause lockdep to get invoked before we
      turn IRQs back on.)  I haven't tested that for real yet because I
      can't get a kernel configured like that to boot at all on Xen PV.
      
      Move TRACE_IRQS_OFF below the label.
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bpetkov@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 8a9949bc ("x86/xen/64: Rearrange the SYSCALL entries")
      Link: http://lkml.kernel.org/r/9150aac013b7b95d62c2336751d5b6e91d2722aa.1511325444.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/decoder: Add new TEST instruction pattern · 00d5e292
      Masami Hiramatsu authored
      commit 12a78d43 upstream.
      
      The kbuild test robot reported this build warning:
      
        Warning: arch/x86/tools/test_get_len found difference at <jump_table>:ffffffff8103dd2c
      
        Warning: ffffffff8103dd82: f6 09 d8 testb $0xd8,(%rcx)
        Warning: objdump says 3 bytes, but insn_get_length() says 2
        Warning: decoded and checked 1569014 instructions with 1 warnings
      
      This sequence seems to be a new instruction not in the opcode map in the Intel SDM.
      
      The instruction sequence is "F6 09 d8", which means Group3(F6), MOD(00)REG(001)RM(001), and 0xd8.
      Intel SDM vol2 A.4 Table A-6 says the table index in the group is "Encoding of Bits 5,4,3 of
      the ModR/M Byte (bits 2,1,0 in parenthesis)"

      In that table, opcodes are listed by the REG bits as:
      
        000         001          010  011  100         101          110         111
        TEST Ib/Iz  (undefined)  NOT  NEG  MUL AL/rAX  IMUL AL/rAX  DIV AL/rAX  IDIV AL/rAX
      
      So, it seems TEST Ib is assigned to 001.
      
      Add the new pattern.
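The ModR/M breakdown above can be reproduced with a short sketch. The helper names below are hypothetical and this is not the kernel's instruction decoder; it only splits the byte into its three fields and applies the rule the message arrives at, namely that REG values 000 and 001 in Group 3 (opcode F6) are TEST Ib and therefore carry a one-byte immediate.

```c
#include <stdint.h>

/* Fields of an x86 ModR/M byte. */
struct modrm_fields {
	uint8_t mod;	/* bits 7-6: addressing mode */
	uint8_t reg;	/* bits 5-3: selects the opcode within a group */
	uint8_t rm;	/* bits 2-0: register or memory operand */
};

/* Split a ModR/M byte into its mod, reg and rm fields. */
static struct modrm_fields split_modrm(uint8_t modrm)
{
	struct modrm_fields f = {
		.mod = (modrm >> 6) & 0x3,
		.reg = (modrm >> 3) & 0x7,
		.rm  = modrm & 0x7,
	};
	return f;
}

/*
 * Hypothetical helper: returns 1 if a Group 3 (opcode F6) instruction
 * with this ModR/M byte is followed by an 8-bit immediate.  REG 000 is
 * TEST Ib in the SDM table; REG 001 is the pattern added by this commit.
 */
static int group3_f6_has_imm8(uint8_t modrm)
{
	uint8_t reg = split_modrm(modrm).reg;

	return reg == 0 || reg == 1;
}
```

For the reported bytes `f6 09 d8`, `split_modrm(0x09)` gives mod 0, reg 1, rm 1, so the instruction is opcode, ModR/M and one immediate byte: 3 bytes, matching what objdump reports.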
      Reported-by: kbuild test robot <fengguang.wu@intel.com>
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/boot: Fix boot failure when SMP MP-table is based at 0 · 46855f80
      Tom Lendacky authored
      commit ac5292e9 upstream.
      
      When crosvm is used to boot a kernel as a VM, the SMP MP-table is found
      at physical address 0x0. This causes mpf_base to be set to 0 and a
      subsequent "if (!mpf_base)" check in default_get_smp_config() results in
      the MP-table not being parsed.  Further into the boot this results in an
      oops when attempting a read_apic_id().
      
      Add a boolean variable that is set to true when the MP-table is found.
      Use this variable for testing if the MP-table was found so that even a
      value of 0 for mpf_base will result in continued parsing of the MP-table.
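A minimal sketch of the idea, assuming a simplified state structure: `mp_table_state` and the helper names are hypothetical and are not the kernel's variables. The point is only that "was a table found?" is tracked separately from "where is it?", because 0 is a legitimate physical address.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the MP-table bookkeeping. */
struct mp_table_state {
	uint64_t base;	/* physical address of the table; 0 is valid */
	bool found;	/* set once a table has actually been located */
};

/* Record a located table, whatever its address is. */
static void record_mp_table(struct mp_table_state *s, uint64_t phys_addr)
{
	s->base = phys_addr;
	s->found = true;
}

/* Old-style check: wrongly treats a table at address 0 as missing. */
static bool should_parse_by_address(const struct mp_table_state *s)
{
	return s->base != 0;
}

/* Flag-based check: correct even when the table sits at address 0. */
static bool should_parse_by_flag(const struct mp_table_state *s)
{
	return s->found;
}
```

With a table recorded at address 0, `should_parse_by_address()` returns false while `should_parse_by_flag()` returns true.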
      
      Fixes: 5997efb9 ("x86/boot: Use memremap() to map the MPF and MPC data")
      Reported-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
      Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: regression@leemhuis.info
      Link: https://lkml.kernel.org/r/20171106201753.23059.86674.stgit@tlendack-t1.amdoffice.net
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • lib/mpi: call cond_resched() from mpi_powm() loop · ce922b7b
      Eric Biggers authored
      commit 1d9ddde1 upstream.
      
      On a non-preemptible kernel, if KEYCTL_DH_COMPUTE is called with the
      largest permitted inputs (16384 bits), the kernel spends 10+ seconds
      doing modular exponentiation in mpi_powm() without rescheduling.  If all
      threads do it, it locks up the system.  Moreover, it can cause
      rcu_sched-stall warnings.
      
      Notwithstanding the insanity of doing this calculation in kernel mode
      rather than in userspace, fix it by calling cond_resched() as each bit
      from the exponent is processed.  It's still noninterruptible, but at
      least it's preemptible now.
      
      Do the cond_resched() once per bit rather than once per MPI limb because
      each limb might still easily take 100+ milliseconds on slow CPUs.
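The "yield once per exponent bit" pattern can be sketched in userspace. This is not mpi_powm(): it works on 32-bit values rather than multi-limb integers, `modpow_yielding()` is a hypothetical name, and POSIX `sched_yield()` is used as a rough stand-in for cond_resched(), which in the kernel only reschedules when a reschedule is pending.

```c
#include <stdint.h>
#include <sched.h>

/*
 * Left-to-right square-and-multiply: computes (base ^ exp) mod mod.
 * mod must be non-zero.  Operands stay below 2^32, so every product
 * fits in a uint64_t.
 */
static uint32_t modpow_yielding(uint32_t base, uint64_t exp, uint32_t mod)
{
	uint64_t result = 1 % mod;
	uint64_t b = base % mod;
	int bit;

	for (bit = 63; bit >= 0; bit--) {
		result = (result * result) % mod;	/* square */
		if ((exp >> bit) & 1)
			result = (result * b) % mod;	/* multiply */

		/* Give other threads a chance once per exponent bit. */
		sched_yield();
	}
	return (uint32_t)result;
}
```

Yielding inside the bit loop bounds the time between scheduling points by the cost of one square-and-multiply step, which is the reasoning the message gives for preferring per-bit over per-limb granularity.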
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sched: Make resched_cpu() unconditional · 9f088f6a
      Paul E. McKenney authored
      commit 7c2102e5 upstream.
      
      The current implementation of synchronize_sched_expedited() incorrectly
      assumes that resched_cpu() is unconditional, which it is not.  This means
      that synchronize_sched_expedited() can hang when resched_cpu()'s trylock
      fails as follows (analysis by Neeraj Upadhyay):
      
      o	CPU1 is waiting for expedited wait to complete:
      
      	sync_rcu_exp_select_cpus
      	     rdp->exp_dynticks_snap & 0x1   // returns 1 for CPU5
      	     IPI sent to CPU5
      
      	synchronize_sched_expedited_wait
      		 ret = swait_event_timeout(rsp->expedited_wq,
      					   sync_rcu_preempt_exp_done(rnp_root),
      					   jiffies_stall);
      
      	expmask = 0x20, CPU 5 in idle path (in cpuidle_enter())
      
      o	CPU5 handles IPI and fails to acquire rq lock.
      
      	Handles IPI
      	     sync_sched_exp_handler
      		 resched_cpu
      		     returns after failing to trylock rq->lock
      		 need_resched is not set
      
      o	CPU5 calls  rcu_idle_enter() and as need_resched is not set, goes to
      	idle (schedule() is not called).
      
      o	CPU 1 reports RCU stall.
      
      Given that resched_cpu() is now used only by RCU, this commit fixes the
      assumption by making resched_cpu() unconditional.
      Reported-by: Neeraj Upadhyay <neeraju@codeaurora.org>
      Suggested-by: Neeraj Upadhyay <neeraju@codeaurora.org>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • serdev: fix registration of second slave · 668a1285
      Johan Hovold authored
      commit 08fcee28 upstream.
      
      Serdev currently only supports a single slave device, but the required
      sanity checks to prevent further registration attempts were missing.
      
      If a serial-port node has two child nodes with compatible properties,
      the OF code would try to register two slave devices using the same id
      and name. Driver core will not allow this (and there will be loud
      complaints), but the controller's slave pointer would already have been
      set to the address of the soon-to-be-deallocated second struct
      serdev_device. As the first slave device remains registered, this can
      lead to later use-after-free issues when the slave callbacks are
      accessed.
      
      Note that while the serdev registration helpers are exported, they are
      typically only called by serdev core. Any other (out-of-tree) callers
      must serialise registration and deregistration themselves.
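A minimal sketch of the kind of sanity check described, using hypothetical `toy_controller` and `toy_device` types rather than the serdev structures. The check refuses a second registration before the controller's slave pointer is overwritten, so the pointer never refers to a device that is about to be freed. Like the real helpers, this sketch does no locking; callers are assumed to serialise registration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for a slave device and its controller. */
struct toy_device {
	const char *name;
};

struct toy_controller {
	struct toy_device *slave;	/* at most one slave is supported */
};

/* Register dev on ctrl; fails with -EBUSY if a slave is already set. */
static int toy_controller_add_device(struct toy_controller *ctrl,
				     struct toy_device *dev)
{
	if (ctrl->slave)
		return -EBUSY;

	ctrl->slave = dev;
	return 0;
}

/* Unregister dev; only clears the pointer if dev is the current slave. */
static void toy_controller_remove_device(struct toy_controller *ctrl,
					 struct toy_device *dev)
{
	if (ctrl->slave == dev)
		ctrl->slave = NULL;
}
```

A second `toy_controller_add_device()` call returns `-EBUSY` and leaves `ctrl->slave` pointing at the first device.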
      
      Fixes: cd6484e1 ("serdev: Introduce new bus for serial attached devices")
      Cc: Rob Herring <robh@kernel.org>
      Signed-off-by: Johan Hovold <johan@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • cpufreq: schedutil: Reset cached_raw_freq when not in sync with next_freq · b7997494
      Viresh Kumar authored
      commit 07458f6a upstream.
      
      'cached_raw_freq' is used to get the next frequency quickly but should
      always be in sync with sg_policy->next_freq. There is a case where it is
      not and in such cases it should be reset to avoid switching to incorrect
      frequencies.
      
      Consider this case for example:
      
       - policy->cur is 1.2 GHz (Max)
       - New request comes for 780 MHz and we store that in cached_raw_freq.
       - Based on 780 MHz, we calculate the effective frequency as 800 MHz.
       - We then see the CPU wasn't idle recently and choose to keep the next
         freq as 1.2 GHz.
       - Now cached_raw_freq is 780 MHz and sg_policy->next_freq is
         1.2 GHz.
       - Now if the utilization doesn't change in the next request, then the
         next target frequency will still be 780 MHz and it will match with
         cached_raw_freq. But we will choose 1.2 GHz instead of 800 MHz here.
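The scenario above can be replayed with a toy model. Everything here is an assumption made for illustration: the `toy_` names are hypothetical, frequencies are plain MHz integers, `toy_resolve()` simply rounds up to the next 100 MHz step, and the `reset_cache` switch exists only to compare behaviour with and without the reset.

```c
/* Hypothetical per-policy state, in MHz. */
struct toy_policy {
	unsigned int next_freq;		/* last frequency chosen */
	unsigned int cached_raw_freq;	/* last raw request seen */
};

/* Assumed driver behaviour: round up to the next 100 MHz step. */
static unsigned int toy_resolve(unsigned int raw)
{
	return ((raw + 99) / 100) * 100;
}

/* Fast path: an unchanged raw request reuses the last chosen frequency. */
static unsigned int toy_get_next_freq(struct toy_policy *p, unsigned int raw)
{
	if (raw == p->cached_raw_freq)
		return p->next_freq;

	p->cached_raw_freq = raw;
	return toy_resolve(raw);
}

/* One update step; a busy CPU is not allowed to drop its frequency. */
static unsigned int toy_update(struct toy_policy *p, unsigned int raw,
			       int busy, int reset_cache)
{
	unsigned int next = toy_get_next_freq(p, raw);

	if (busy && next < p->next_freq) {
		next = p->next_freq;
		/* The cache no longer matches next_freq, so drop it. */
		if (reset_cache)
			p->cached_raw_freq = 0;
	}
	p->next_freq = next;
	return next;
}
```

Starting at 1200 MHz, a busy request for 780 followed by a non-busy request for 780 ends at 1200 MHz without the reset, and at 800 MHz with it.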
      
      Fixes: b7eaf1aa (cpufreq: schedutil: Avoid reducing frequency of busy CPUs prematurely)
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ACPI / EC: Fix regression related to triggering source of EC event handling · 3fe36d0c
      Lv Zheng authored
      commit 53c5eaab upstream.
      
      Originally the Samsung quirks removed by commit 4c237371 can be covered
      by commit e923e8e7 and ec_freeze_events=Y mode. But commit 9c40f956
      changed ec_freeze_events=Y back to N, making this problem re-surface.
      
      Actually, if commit e923e8e7 is robust enough, we can freely change
      ec_freeze_events mode, so this patch fixes the issue by improving
      commit e923e8e7.
      
      Related commits listed in the merged order:
      
       Commit: e923e8e7
       Subject: ACPI / EC: Fix an issue that SCI_EVT cannot be detected
                after event is enabled
      
       Commit: 4c237371
       Subject: ACPI / EC: Remove old CLEAR_ON_RESUME quirk
      
       Commit: 9c40f956
       Subject: Revert "ACPI / EC: Enable event freeze mode..." to fix
                a regression
      
      This patch not only fixes the reported post-resume EC event triggering
      source issue, but also fixes an unreported similar issue related to the
      driver bind by adding EC event triggering source in ec_install_handlers().
      
      Fixes: e923e8e7 (ACPI / EC: Fix an issue that SCI_EVT cannot be detected after event is enabled)
      Fixes: 4c237371 (ACPI / EC: Remove old CLEAR_ON_RESUME quirk)
      Fixes: 9c40f956 (Revert "ACPI / EC: Enable event freeze mode..." to fix a regression)
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=196833
      Signed-off-by: Lv Zheng <lv.zheng@intel.com>
      Reported-by: Alistair Hamilton <ahpatent@gmail.com>
      Tested-by: Alistair Hamilton <ahpatent@gmail.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ACPI / PM: Fix acpi_pm_notifier_lock vs flush_workqueue() deadlock · ef2b11c0
      Ville Syrjälä authored
      commit ff165679 upstream.
      
      acpi_remove_pm_notifier() ends up calling flush_workqueue() while
      holding acpi_pm_notifier_lock, and that same lock is taken by
      the work via acpi_pm_notify_handler(). This can deadlock.
      
      To fix the problem let's split the single lock into two: one to
      protect the dev->wakeup between the work vs. add/remove, and
      another one to handle notifier installation vs. removal.
      
      After commit a1d14934 "workqueue/lockdep: 'Fix' flush_work()
      annotation" I was able to kill the machine (Intel Braswell)
      very easily with 'powertop --auto-tune', runtime suspending i915,
      and trying to wake it up via the USB keyboard. The cases when
      it didn't die are presumably explained by lockdep getting disabled
      by something else (cpu hotplug locking issues usually).
      
      Fortunately I still got a lockdep report over netconsole
      (trickling in very slowly), even though the machine was
      otherwise practically dead:
      
      [  112.179806] ======================================================
      [  114.670858] WARNING: possible circular locking dependency detected
      [  117.155663] 4.13.0-rc6-bsw-bisect-00169-ga1d14934 #119 Not tainted
      [  119.658101] ------------------------------------------------------
      [  121.310242] xhci_hcd 0000:00:14.0: xHCI host not responding to stop endpoint command.
      [  121.313294] xhci_hcd 0000:00:14.0: xHCI host controller not responding, assume dead
      [  121.313346] xhci_hcd 0000:00:14.0: HC died; cleaning up
      [  121.313485] usb 1-6: USB disconnect, device number 3
      [  121.313501] usb 1-6.2: USB disconnect, device number 4
      [  134.747383] kworker/0:2/47 is trying to acquire lock:
      [  137.220790]  (acpi_pm_notifier_lock){+.+.}, at: [<ffffffff813cafdf>] acpi_pm_notify_handler+0x2f/0x80
      [  139.721524]
      [  139.721524] but task is already holding lock:
      [  144.672922]  ((&dpc->work)){+.+.}, at: [<ffffffff8109ce90>] process_one_work+0x160/0x720
      [  147.184450]
      [  147.184450] which lock already depends on the new lock.
      [  147.184450]
      [  154.604711]
      [  154.604711] the existing dependency chain (in reverse order) is:
      [  159.447888]
      [  159.447888] -> #2 ((&dpc->work)){+.+.}:
      [  164.183486]        __lock_acquire+0x1255/0x13f0
      [  166.504313]        lock_acquire+0xb5/0x210
      [  168.778973]        process_one_work+0x1b9/0x720
      [  171.030316]        worker_thread+0x4c/0x440
      [  173.257184]        kthread+0x154/0x190
      [  175.456143]        ret_from_fork+0x27/0x40
      [  177.624348]
      [  177.624348] -> #1 ("kacpi_notify"){+.+.}:
      [  181.850351]        __lock_acquire+0x1255/0x13f0
      [  183.941695]        lock_acquire+0xb5/0x210
      [  186.046115]        flush_workqueue+0xdd/0x510
      [  190.408153]        acpi_os_wait_events_complete+0x31/0x40
      [  192.625303]        acpi_remove_notify_handler+0x133/0x188
      [  194.820829]        acpi_remove_pm_notifier+0x56/0x90
      [  196.989068]        acpi_dev_pm_detach+0x5f/0xa0
      [  199.145866]        dev_pm_domain_detach+0x27/0x30
      [  201.285614]        i2c_device_probe+0x100/0x210
      [  203.411118]        driver_probe_device+0x23e/0x310
      [  205.522425]        __driver_attach+0xa3/0xb0
      [  207.634268]        bus_for_each_dev+0x69/0xa0
      [  209.714797]        driver_attach+0x1e/0x20
      [  211.778258]        bus_add_driver+0x1bc/0x230
      [  213.837162]        driver_register+0x60/0xe0
      [  215.868162]        i2c_register_driver+0x42/0x70
      [  217.869551]        0xffffffffa0172017
      [  219.863009]        do_one_initcall+0x45/0x170
      [  221.843863]        do_init_module+0x5f/0x204
      [  223.817915]        load_module+0x225b/0x29b0
      [  225.757234]        SyS_finit_module+0xc6/0xd0
      [  227.661851]        do_syscall_64+0x5c/0x120
      [  229.536819]        return_from_SYSCALL_64+0x0/0x7a
      [  231.392444]
      [  231.392444] -> #0 (acpi_pm_notifier_lock){+.+.}:
      [  235.124914]        check_prev_add+0x44e/0x8a0
      [  237.024795]        __lock_acquire+0x1255/0x13f0
      [  238.937351]        lock_acquire+0xb5/0x210
      [  240.840799]        __mutex_lock+0x75/0x940
      [  242.709517]        mutex_lock_nested+0x1c/0x20
      [  244.551478]        acpi_pm_notify_handler+0x2f/0x80
      [  246.382052]        acpi_ev_notify_dispatch+0x44/0x5c
      [  248.194412]        acpi_os_execute_deferred+0x14/0x30
      [  250.003925]        process_one_work+0x1ec/0x720
      [  251.803191]        worker_thread+0x4c/0x440
      [  253.605307]        kthread+0x154/0x190
      [  255.387498]        ret_from_fork+0x27/0x40
      [  257.153175]
      [  257.153175] other info that might help us debug this:
      [  257.153175]
      [  262.324392] Chain exists of:
      [  262.324392]   acpi_pm_notifier_lock --> "kacpi_notify" --> (&dpc->work)
      [  262.324392]
      [  267.391997]  Possible unsafe locking scenario:
      [  267.391997]
      [  270.758262]        CPU0                    CPU1
      [  272.431713]        ----                    ----
      [  274.060756]   lock((&dpc->work));
      [  275.646532]                                lock("kacpi_notify");
      [  277.260772]                                lock((&dpc->work));
      [  278.839146]   lock(acpi_pm_notifier_lock);
      [  280.391902]
      [  280.391902]  *** DEADLOCK ***
      [  280.391902]
      [  284.986385] 2 locks held by kworker/0:2/47:
      [  286.524895]  #0:  ("kacpi_notify"){+.+.}, at: [<ffffffff8109ce90>] process_one_work+0x160/0x720
      [  288.112927]  #1:  ((&dpc->work)){+.+.}, at: [<ffffffff8109ce90>] process_one_work+0x160/0x720
      [  289.727725]
      
      Fixes: c072530f (ACPI / PM: Rework the handling of ACPI device wakeup notifications)
      Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • s390/disassembler: increase show_code buffer size · 60479800
      Vasily Gorbik authored
      commit b192571d upstream.
      
      The current buffer size of 64 is too small. objdump shows that there are
      instructions which would require a buffer of up to 75 bytes (with the
      current formatting). 128 bytes "ought to be enough for anybody".
      
      Also replaces 8 spaces with a single tab to reduce the memory footprint.
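As a side illustration of the sizing problem, and not what this patch does (the patch enlarges the buffer), a bounded formatter can report how much space a line really needs. The function name and the line layout below are assumptions; the only library behaviour relied on is that C99 `snprintf()` never writes past the given size and returns the length the full output would have had.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical formatter for one disassembly line: address, raw hex
 * bytes, a tab, then the decoded text.  Returns the length the complete
 * line needs, excluding the terminating NUL; a return value >= size
 * means the output in buf was truncated.
 */
static int format_insn_line(char *buf, size_t size, unsigned long addr,
			    const char *hex, const char *text)
{
	return snprintf(buf, size, "%016lx: %s\t%s", addr, hex, text);
}
```

A caller can compare the return value against the buffer size to detect truncation instead of overrunning the stack, which is the failure KASAN reports here for the unbounded sprintf() path.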
      
      Fixes the following KASAN finding:
      
      BUG: KASAN: stack-out-of-bounds in number+0x3fe/0x538
      Write of size 1 at addr 000000005a4a75a0 by task bash/1282
      
      CPU: 1 PID: 1282 Comm: bash Not tainted 4.14.0+ #215
      Hardware name: IBM 2964 N96 702 (z/VM 6.4.0)
      Call Trace:
      ([<000000000011eeb6>] show_stack+0x56/0x88)
       [<0000000000e1ce1a>] dump_stack+0x15a/0x1b0
       [<00000000004e2994>] print_address_description+0xf4/0x288
       [<00000000004e2cf2>] kasan_report+0x13a/0x230
       [<0000000000e38ae6>] number+0x3fe/0x538
       [<0000000000e3dfe4>] vsnprintf+0x194/0x948
       [<0000000000e3ea42>] sprintf+0xa2/0xb8
       [<00000000001198dc>] print_insn+0x374/0x500
       [<0000000000119346>] show_code+0x4ee/0x538
       [<000000000011f234>] show_registers+0x34c/0x388
       [<000000000011f2ae>] show_regs+0x3e/0xa8
       [<000000000011f502>] die+0x1ea/0x2e8
       [<0000000000138f0e>] do_no_context+0x106/0x168
       [<0000000000139a1a>] do_protection_exception+0x4da/0x7d0
       [<0000000000e55914>] pgm_check_handler+0x16c/0x1c0
       [<000000000090639e>] sysrq_handle_crash+0x46/0x58
      ([<0000000000000007>] 0x7)
       [<00000000009073fa>] __handle_sysrq+0x102/0x218
       [<0000000000907c06>] write_sysrq_trigger+0xd6/0x100
       [<000000000061d67a>] proc_reg_write+0xb2/0x128
       [<0000000000520be6>] __vfs_write+0xee/0x368
       [<0000000000521222>] vfs_write+0x21a/0x278
       [<000000000052156a>] SyS_write+0xda/0x178
       [<0000000000e555cc>] system_call+0xc4/0x270
      
      The buggy address belongs to the page:
      page:000003d1016929c0 count:0 mapcount:0 mapping:          (null) index:0x0
      flags: 0x0()
      raw: 0000000000000000 0000000000000000 0000000000000000 ffffffff00000000
      raw: 0000000000000100 0000000000000200 0000000000000000 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       000000005a4a7480: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
       000000005a4a7500: 00 00 00 00 00 00 00 00 f2 f2 f2 f2 00 00 00 00
      >000000005a4a7580: 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00 00 00
                                     ^
       000000005a4a7600: 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 f8 f8
       000000005a4a7680: f2 f2 f2 f2 f2 f2 f8 f8 f2 f2 f3 f3 f3 f3 00 00
      ==================================================================
      Signed-off-by: Vasily Gorbik <gor@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • s390/disassembler: add missing end marker for e7 table · 15e82cdb
      Heiko Carstens authored
      commit 5c505387 upstream.
      
      The e7 opcode table does not have an end marker. Hence when trying to
      find an unknown e7 instruction the code will access memory beyond the
      table until it finds something that matches the opcode, or the kernel
      crashes, whichever comes first.
      
      This affects not only the in-kernel disassembler but also uprobes and
      kprobes which refuse to set a probe on unknown instructions, and
      therefore search the opcode tables to figure out if instructions are
      known or not.
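The role of the end marker can be shown with a small sentinel-terminated table. The structure layout, the entry names and the use of a NULL name as the marker are all assumptions for this sketch; they are not the s390 disassembler's actual table format.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcode table entry. */
struct toy_insn {
	const char *name;	/* NULL marks the end of the table */
	uint8_t opfrag;		/* opcode fragment to match */
};

/* Placeholder entries; the last one is the end marker. */
static const struct toy_insn toy_table[] = {
	{ "insn_a", 0x00 },
	{ "insn_b", 0x01 },
	{ "insn_c", 0x02 },
	{ NULL, 0 },
};

/*
 * Linear search that stops at the end marker.  Without the marker the
 * loop has no stop condition for an unknown opcode and would keep
 * reading past the end of the array.
 */
static const struct toy_insn *toy_find(const struct toy_insn *table,
				       uint8_t opfrag)
{
	const struct toy_insn *insn;

	for (insn = table; insn->name; insn++) {
		if (insn->opfrag == opfrag)
			return insn;
	}
	return NULL;	/* unknown instruction */
}
```

Looking up an opcode that is not in the table returns NULL, which is the answer callers such as a probe-placement check need in order to reject the instruction.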
      
      Fixes: 3585cb02 ("s390/disassembler: add vector instructions")
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • s390/guarded storage: fix possible memory corruption · 7ee3f026
      Heiko Carstens authored
      commit fa1edf3f upstream.
      
      For PREEMPT enabled kernels the guarded storage (GS) code contains a
      possible use-after-free bug. If a task that makes use of GS exits, it
      will execute do_exit() while still enabled for preemption.
      
      That function will call exit_thread_gs() via exit_thread().
      If exit_thread_gs() gets preempted after the GS control block of the
      task has been freed but before the pointer to it is set to NULL, then
      save_gs_cb(), called from switch_to(), will write to already freed
      memory.
      
      Avoid this and simply disable preemption while freeing the control
      block and setting the pointer to NULL.
      
      Fixes: 916cda1a ("s390: add a system call for guarded storage")
      Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • s390/runtime instrumention: fix possible memory corruption · 27576413
      Heiko Carstens authored
      commit d6e646ad upstream.
      
      For PREEMPT enabled kernels the runtime instrumentation (RI) code
      contains a possible use-after-free bug. If a task that makes use of RI
      exits, it will execute do_exit() while still enabled for preemption.
      
      That function will call exit_thread_runtime_instr() via
      exit_thread(). If exit_thread_runtime_instr() gets preempted after the
      RI control block of the task has been freed but before the pointer to
      it is set to NULL, then save_ri_cb(), called from switch_to(), will
      write to already freed memory.
      
      Avoid this and simply disable preemption while freeing the control
      block and setting the pointer to NULL.
      
      Fixes: e4b8b3f3 ("s390: add support for runtime instrumentation")
      Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • s390/noexec: execute kexec datamover without DAT · 21caac65
      Heiko Carstens authored
      commit d0e810ee upstream.
      
      Rebooting into a new kernel with kexec fails (system dies) if tried on
      a machine that has no-execute support. The reason is that the so-called
      datamover code gets executed with DAT on (MMU is active) and
      the page that contains the datamover is marked as non-executable.
      Therefore when branching into the datamover an unexpected program
      check happens and afterwards the machine is dead.
      
      This can be simply avoided by disabling DAT, which also disables any
      no-execute checks, just before the datamover gets executed.
      
      In fact the first thing done by the datamover is to disable DAT. The
      code in the datamover that disables DAT can be removed as well.
      
      Thanks to Michael Holzheu and Gerald Schaefer for tracking this down.
      Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
      Reviewed-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
      Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Fixes: 57d7f939 ("s390: add no-execute support")
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • s390: fix transactional execution control register handling · 236f6e72
      Heiko Carstens authored
      commit a1c5befc upstream.
      
      Dan Horák reported the following crash related to transactional execution:
      
      User process fault: interruption code 0013 ilc:3 in libpthread-2.26.so[3ff93c00000+1b000]
      CPU: 2 PID: 1 Comm: /init Not tainted 4.13.4-300.fc27.s390x #1
      Hardware name: IBM 2827 H43 400 (z/VM 6.4.0)
      task: 00000000fafc8000 task.stack: 00000000fafc4000
      User PSW : 0705200180000000 000003ff93c14e70
                 R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:1 AS:0 CC:2 PM:0 RI:0 EA:3
      User GPRS: 0000000000000077 000003ff00000000 000003ff93144d48 000003ff93144d5e
                 0000000000000000 0000000000000002 0000000000000000 000003ff00000000
                 0000000000000000 0000000000000418 0000000000000000 000003ffcc9fe770
                 000003ff93d28f50 000003ff9310acf0 000003ff92b0319a 000003ffcc9fe6d0
      User Code: 000003ff93c14e62: 60e0b030            std     %f14,48(%r11)
                 000003ff93c14e66: 60f0b038            std     %f15,56(%r11)
                #000003ff93c14e6a: e5600000ff0e        tbegin  0,65294
                >000003ff93c14e70: a7740006            brc     7,3ff93c14e7c
                 000003ff93c14e74: a7080000            lhi     %r0,0
                 000003ff93c14e78: a7f40023            brc     15,3ff93c14ebe
                 000003ff93c14e7c: b2220000            ipm     %r0
                 000003ff93c14e80: 8800001c            srl     %r0,28
      
      There are several bugs with control register handling with respect to
      transactional execution:
      
      - on task switch update_per_regs() is only called if the next task has
        an mm (is not a kernel thread). This however is incorrect. This
        breaks e.g. for user mode helper handling, where the kernel creates
        a kernel thread and then execve's a user space program. Control
        register contents related to transactional execution won't be
        updated on execve. If the previous task ran with transactional
        execution disabled then the new task will also run with
        transactional execution disabled, which is incorrect. Therefore call
        update_per_regs() unconditionally within switch_to().
      
      - on startup the transactional execution facility is not enabled for
        the idle thread. This is not really a bug, but it is inconsistent
        with other facilities. Therefore enable the facility if it is available.
      
      - on fork the new thread's per_flags field is not cleared. This means
        that a child process inherits the PER_FLAG_NO_TE flag. This flag can
        be set with a ptrace request to disable transactional execution for
        the current process. It should not be inherited by new child
        processes in order to be consistent with the handling of all other
        PER related debugging options. Therefore clear the per_flags field in
        copy_thread_tls().
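The third point, not inheriting per_flags across fork, can be sketched as follows. `toy_thread`, `TOY_PER_FLAG_NO_TE` and `toy_copy_thread()` are hypothetical stand-ins; the real thread structure and flag value are not reproduced here.

```c
/* Hypothetical flag: transactional execution disabled via ptrace. */
#define TOY_PER_FLAG_NO_TE	1UL

/* Hypothetical subset of per-thread state. */
struct toy_thread {
	unsigned long per_flags;	/* debugging-related flags */
	unsigned long user_timer;	/* some other inherited field */
};

/*
 * Model of thread setup on fork: the child starts as a copy of the
 * parent, then the debugging flags are cleared so that a ptrace-set
 * restriction on the parent does not leak into the child.
 */
static void toy_copy_thread(struct toy_thread *child,
			    const struct toy_thread *parent)
{
	*child = *parent;
	child->per_flags = 0;
}
```

After `toy_copy_thread()`, a parent with `TOY_PER_FLAG_NO_TE` set keeps its flag while the child's `per_flags` is 0 and its other fields match the parent's.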
      Reported-and-tested-by: Dan Horák <dan@danny.cz>
      Fixes: d35339a4 ("s390: add support for transactional memory")
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  2. 24 Nov, 2017 20 commits
  3. 21 Nov, 2017 4 commits