1. 27 Jul, 2017 40 commits
    • ipmi: use rcu lock around call to intf->handlers->sender() · 65acfd38
      Tony Camuso authored
      commit cdea4656 upstream.
      
      A vendor with a system having more than 128 CPUs occasionally encounters
      the following crash during shutdown. This is not an easily reproducible
      event, but the vendor was able to provide the following analysis of the
      crash, which exhibits the same footprint each time.
      
      crash> bt
      PID: 0      TASK: ffff88017c70ce70  CPU: 5   COMMAND: "swapper/5"
       #0 [ffff88085c143ac8] machine_kexec at ffffffff81059c8b
       #1 [ffff88085c143b28] __crash_kexec at ffffffff811052e2
       #2 [ffff88085c143bf8] crash_kexec at ffffffff811053d0
       #3 [ffff88085c143c10] oops_end at ffffffff8168ef88
       #4 [ffff88085c143c38] no_context at ffffffff8167ebb3
       #5 [ffff88085c143c88] __bad_area_nosemaphore at ffffffff8167ec49
       #6 [ffff88085c143cd0] bad_area_nosemaphore at ffffffff8167edb3
       #7 [ffff88085c143ce0] __do_page_fault at ffffffff81691d1e
       #8 [ffff88085c143d40] do_page_fault at ffffffff81691ec5
       #9 [ffff88085c143d70] page_fault at ffffffff8168e188
          [exception RIP: unknown or invalid address]
          RIP: ffffffffa053c800  RSP: ffff88085c143e28  RFLAGS: 00010206
          RAX: ffff88017c72bfd8  RBX: ffff88017a8dc000  RCX: ffff8810588b5ac8
          RDX: ffff8810588b5a00  RSI: ffffffffa053c800  RDI: ffff8810588b5a00
          RBP: ffff88085c143e58   R8: ffff88017c70d408   R9: ffff88017a8dc000
          R10: 0000000000000002  R11: ffff88085c143da0  R12: ffff8810588b5ac8
          R13: 0000000000000100  R14: ffffffffa053c800  R15: ffff8810588b5a00
          ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
          <IRQ stack>
          [exception RIP: cpuidle_enter_state+82]
          RIP: ffffffff81514192  RSP: ffff88017c72be50  RFLAGS: 00000202
          RAX: 0000001e4c3c6f16  RBX: 000000000000f8a0  RCX: 0000000000000018
          RDX: 0000000225c17d03  RSI: ffff88017c72bfd8  RDI: 0000001e4c3c6f16
          RBP: ffff88017c72be78   R8: 000000000000237e   R9: 0000000000000018
          R10: 0000000000002494  R11: 0000000000000001  R12: ffff88017c72be20
          R13: ffff88085c14f8e0  R14: 0000000000000082  R15: 0000001e4c3bb400
          ORIG_RAX: ffffffffffffff10  CS: 0010  SS: 0018
      
      This is the corresponding stack trace.
      
      It crashed because the code that RIP points to, an address taken from a
      timer element, had already been removed during the shutdown process.
      
      The function is smi_timeout().
      
      We think ffff8810588b5a00 in RDX is its parameter, a struct smi_info:
      
      crash> rd ffff8810588b5a00 20
      ffff8810588b5a00:  ffff8810588b6000 0000000000000000   .`.X............
      ffff8810588b5a10:  ffff880853264400 ffffffffa05417e0   .D&S......T.....
      ffff8810588b5a20:  24a024a000000000 0000000000000000   .....$.$........
      ffff8810588b5a30:  0000000000000000 0000000000000000   ................
      ffff8810588b5a40:  ffffffffa053a040 ffffffffa053a060   @.S.....`.S.....
      ffff8810588b5a50:  0000000000000000 0000000100000001   ................
      ffff8810588b5a60:  0000000000000000 0000000000000e00   ................
      ffff8810588b5a70:  ffffffffa053a580 ffffffffa053a6e0   ..S.......S.....
      ffff8810588b5a80:  ffffffffa053a4a0 ffffffffa053a250   ..S.....P.S.....
      ffff8810588b5a90:  0000000500000002 0000000000000000   ................
      
      Unfortunately, the top of this area has already been destroyed by someone.
      But for two reasons we think this is struct smi_info:
       1) The addresses between ffff8810588b5a70 and ffff8810588b5a80
        are inside ipmi_si_intf.c; see crash> module ffff88085779d2c0
      
       2) We found the area that points to it.
        It is at offset 0x68 of ffff880859df4000
      
      crash> rd  ffff880859df4000 100
      ffff880859df4000:  0000000000000000 0000000000000001   ................
      ffff880859df4010:  ffffffffa0535290 dead000000000200   .RS.............
      ffff880859df4020:  ffff880859df4020 ffff880859df4020    @.Y.... @.Y....
      ffff880859df4030:  0000000000000002 0000000000100010   ................
      ffff880859df4040:  ffff880859df4040 ffff880859df4040   @@.Y....@@.Y....
      ffff880859df4050:  0000000000000000 0000000000000000   ................
      ffff880859df4060:  0000000000000000 ffff8810588b5a00   .........Z.X....
      ffff880859df4070:  0000000000000001 ffff880859df4078   ........x@.Y....
      
       If we regard it as struct ipmi_smi in the shutdown process,
       it looks consistent.
      
      The remedy for this apparent race is affixed below.
      Signed-off-by: Tony Camuso <tcamuso@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      This was first introduced in 7ea0ed2b ("ipmi: Make the
      message handler easier to use for SMI interfaces"),
      where some code was moved outside of the rcu_read_lock()
      and the lock was not added.
      Signed-off-by: Corey Minyard <cminyard@mvista.com>
    • drm/etnaviv: Expose our reservation object when exporting a dmabuf. · 719829d1
      Eric Anholt authored
      commit 8555137e upstream.
      
      Without this, polling on the dma-buf (and presumably other devices
      synchronizing against our rendering) would return immediately, even
      while the BO was busy.
      Signed-off-by: Eric Anholt <eric@anholt.net>
      Cc: Lucas Stach <l.stach@pengutronix.de>
      Cc: Russell King <linux+etnaviv@armlinux.org.uk>
      Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
      Cc: etnaviv@lists.freedesktop.org
      Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • drm/ttm: Fix use-after-free in ttm_bo_clean_mm · cb5634ea
      John Brooks authored
      commit 8046e195 upstream.
      
      We unref the man->move fence in ttm_bo_clean_mm() and then call
      ttm_bo_force_list_clean() which waits on it, except the refcount is now
      zero so a warning is generated (or worse):
      
      [149492.279301] refcount_t: increment on 0; use-after-free.
      [149492.279309] ------------[ cut here ]------------
      [149492.279315] WARNING: CPU: 3 PID: 18726 at lib/refcount.c:150 refcount_inc+0x2b/0x30
      [149492.279315] Modules linked in: vhost_net vhost tun x86_pkg_temp_thermal crc32_pclmul ghash_clmulni_intel efivarfs amdgpu(-) i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm
      [149492.279326] CPU: 3 PID: 18726 Comm: rmmod Not tainted 4.12.0-rc5-drm-next-4.13-ttmpatch+ #1
      [149492.279326] Hardware name: Gigabyte Technology Co., Ltd. Z97X-UD3H-BK/Z97X-UD3H-BK-CF, BIOS F6 06/17/2014
      [149492.279327] task: ffff8804ddfedcc0 task.stack: ffffc90008d20000
      [149492.279329] RIP: 0010:refcount_inc+0x2b/0x30
      [149492.279330] RSP: 0018:ffffc90008d23c30 EFLAGS: 00010286
      [149492.279331] RAX: 000000000000002b RBX: 0000000000000170 RCX: 0000000000000000
      [149492.279331] RDX: 0000000000000000 RSI: ffff88051ecccbe8 RDI: ffff88051ecccbe8
      [149492.279332] RBP: ffffc90008d23c30 R08: 0000000000000001 R09: 00000000000003ee
      [149492.279333] R10: ffffc90008d23bb0 R11: 00000000000003ee R12: ffff88043aaac960
      [149492.279333] R13: ffff8805005e28a8 R14: 0000000000000002 R15: ffff88050115e178
      [149492.279334] FS:  00007fc540168700(0000) GS:ffff88051ecc0000(0000) knlGS:0000000000000000
      [149492.279335] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [149492.279336] CR2: 00007fc3e8654140 CR3: 000000027ba77000 CR4: 00000000001426e0
      [149492.279337] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [149492.279337] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [149492.279338] Call Trace:
      [149492.279345]  ttm_bo_force_list_clean+0xb9/0x110 [ttm]
      [149492.279348]  ttm_bo_clean_mm+0x7a/0xe0 [ttm]
      [149492.279375]  amdgpu_ttm_fini+0xc9/0x1f0 [amdgpu]
      [149492.279392]  amdgpu_bo_fini+0x12/0x40 [amdgpu]
      [149492.279415]  gmc_v7_0_sw_fini+0x32/0x40 [amdgpu]
      [149492.279430]  amdgpu_fini+0x2c9/0x490 [amdgpu]
      [149492.279445]  amdgpu_device_fini+0x58/0x1b0 [amdgpu]
      [149492.279461]  amdgpu_driver_unload_kms+0x4f/0xa0 [amdgpu]
      [149492.279470]  drm_dev_unregister+0x3c/0xe0 [drm]
      [149492.279485]  amdgpu_pci_remove+0x19/0x30 [amdgpu]
      [149492.279487]  pci_device_remove+0x39/0xc0
      [149492.279490]  device_release_driver_internal+0x155/0x210
      [149492.279491]  driver_detach+0x38/0x70
      [149492.279493]  bus_remove_driver+0x4c/0xa0
      [149492.279494]  driver_unregister+0x2c/0x40
      [149492.279496]  pci_unregister_driver+0x21/0x90
      [149492.279520]  amdgpu_exit+0x15/0x406 [amdgpu]
      [149492.279523]  SyS_delete_module+0x1a8/0x270
      [149492.279525]  ? exit_to_usermode_loop+0x92/0xa0
      [149492.279528]  entry_SYSCALL_64_fastpath+0x13/0x94
      [149492.279529] RIP: 0033:0x7fc53fcb68e7
      [149492.279529] RSP: 002b:00007ffcfbfaabb8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
      [149492.279531] RAX: ffffffffffffffda RBX: 0000563117adb200 RCX: 00007fc53fcb68e7
      [149492.279531] RDX: 000000000000000a RSI: 0000000000000800 RDI: 0000563117adb268
      [149492.279532] RBP: 0000000000000003 R08: 0000000000000000 R09: 1999999999999999
      [149492.279533] R10: 0000000000000883 R11: 0000000000000206 R12: 00007ffcfbfa9ba0
      [149492.279533] R13: 0000000000000000 R14: 0000000000000000 R15: 0000563117adb200
      [149492.279534] Code: 55 48 89 e5 e8 77 fe ff ff 84 c0 74 02 5d c3 80 3d 40 f2 a4 00 00 75 f5 48 c7 c7 20 3c ca 81 c6 05 30 f2 a4 00 01 e8 91 f0 d7 ff <0f> ff 5d c3 90 55 48 89 fe bf 01 00 00 00 48 89 e5 e8 9f fe ff
      [149492.279557] ---[ end trace 2d4e0ffcb66a1016 ]---
      
      Unref the fence *after* waiting for it.
      
      v2: Set man->move to NULL after dropping the last ref (Christian König)
      
      Fixes: aff98ba1 (drm/ttm: wait for eviction in ttm_bo_force_list_clean)
      Signed-off-by: John Brooks <john@fastquake.com>
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
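The ordering described in the ttm fix (wait on the fence, then drop the reference, then clear the pointer) can be illustrated with a small userspace sketch. This is not the kernel code: `toy_fence`, `toy_manager`, and all `toy_*` functions are hypothetical stand-ins invented for illustration, and the "wait" only marks the fence as signaled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy reference-counted fence; a hypothetical stand-in for a kernel fence. */
struct toy_fence {
    int refcount;
    int signaled;
};

struct toy_fence *toy_fence_create(void)
{
    struct toy_fence *f = malloc(sizeof(*f));
    assert(f != NULL);
    f->refcount = 1;
    f->signaled = 0;
    return f;
}

/* Drop one reference and free the fence when the count reaches zero. */
void toy_fence_put(struct toy_fence *f)
{
    if (f && --f->refcount == 0)
        free(f);
}

/* Waiting needs a live object, so the caller must still hold a reference. */
int toy_fence_wait(struct toy_fence *f)
{
    if (!f)
        return 0;               /* nothing to wait for */
    assert(f->refcount > 0);    /* would be a use-after-free otherwise */
    f->signaled = 1;
    return 0;
}

struct toy_manager {
    struct toy_fence *move;
};

/* Teardown in the corrected order: wait, then unref, then clear the pointer. */
int toy_manager_clean(struct toy_manager *man)
{
    int ret = toy_fence_wait(man->move);

    toy_fence_put(man->move);
    man->move = NULL;           /* a later teardown sees no stale pointer */
    return ret;
}
```

Clearing the pointer after the last reference is dropped mirrors the v2 note in the commit message: a second teardown then finds nothing to wait on instead of a freed object.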
    • drm/radeon: Fix eDP for single-display iMac10,1 (v2) · 011a0006
      Mario Kleiner authored
      commit 564d8a2c upstream.
      
      The late 2009, 27 inch Apple iMac10,1 has an
      internal eDP display and an external Mini-
      Displayport output, driven by a DCE-3.2, RV730
      Radeon Mobility HD-4670.
      
      The machine worked fine in a dual-display setup
      with eDP panel + externally connected HDMI
      or DVI-D digital display sink, connected via
      MiniDP to DVI or HDMI adapter.
      
      However, booting the machine single-display with
      only eDP panel results in a completely black
      display - even backlight powering off, as soon as
      the radeon modesetting driver loads.
      
      This patch fixes the single-display eDP case by
      assigning encoders based on dig->linkb, similar
      to DCE-4+. While this should not be generally
      necessary (Alex: "...atom on normal boards
      should be able to handle any mapping."), Apple
      seems to use some special routing here.
      
      One remaining problem not solved by this patch
      is that an external Minidisplayport->DP sink
      does still not work on iMac10,1, whereas external
      DVI and HDMI sinks continue to work.
      
      The problem affects at least all tested kernels
      since Linux 3.13 - didn't test earlier kernels, so
      backporting to stable probably makes sense.
      
      v2: With the original patch from 2016, Alex was worried it
          will break other DCE3.2 systems. Use dmi_match() to
          apply this special encoder assignment only for the
          Apple iMac 10,1 from late 2009.
      Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Cc: Michel Dänzer <michel.daenzer@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • drm/radeon/ci: disable mclk switching for high refresh rates (v2) · b123162f
      Alex Deucher authored
      commit ab03d9fe upstream.
      
      Even if the vblank period would allow it, it still seems to
      be problematic on some cards.
      
      v2: fix logic inversion (Nils)
      
      bug: https://bugs.freedesktop.org/show_bug.cgi?id=96868
      Acked-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • drm/amdgpu: Don't call amd_powerplay_destroy() if we don't have powerplay · e97b3dc3
      John Brooks authored
      commit 7bc7b777 upstream.
      
      amd_powerplay_destroy() expects a handle pointing to a struct pp_instance.
      On chips without PowerPlay, pp_handle points to a struct amdgpu_device. The
      resulting attempt to kfree() fields of the wrong struct ends in fire:
      
      [   91.560405] BUG: unable to handle kernel paging request at ffffebe000000620
      [   91.560414] IP: kfree+0x57/0x160
      [   91.560416] PGD 0
      [   91.560416] P4D 0
      
      [   91.560420] Oops: 0000 [#1] SMP
      [   91.560422] Modules linked in: tun x86_pkg_temp_thermal crc32_pclmul ghash_clmulni_intel efivarfs amdgpu(-) i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm
      [   91.560438] CPU: 6 PID: 3598 Comm: rmmod Not tainted 4.12.0-rc5-drm-next-4.13-ttmpatch+ #1
      [   91.560443] Hardware name: Gigabyte Technology Co., Ltd. Z97X-UD3H-BK/Z97X-UD3H-BK-CF, BIOS F6 06/17/2014
      [   91.560448] task: ffff8805063d6a00 task.stack: ffffc90003400000
      [   91.560451] RIP: 0010:kfree+0x57/0x160
      [   91.560454] RSP: 0018:ffffc90003403cc0 EFLAGS: 00010286
      [   91.560457] RAX: 000077ff80000000 RBX: 00000000000186a0 RCX: 0000000180400035
      [   91.560460] RDX: 0000000180400036 RSI: ffffea001418e740 RDI: ffffea0000000000
      [   91.560463] RBP: ffffc90003403cd8 R08: 000000000639d201 R09: 0000000180400035
      [   91.560467] R10: ffffebe000000600 R11: 0000000000000300 R12: ffff880500530030
      [   91.560470] R13: ffffffffa01e70fc R14: 00000000ffffffff R15: ffff880500530000
      [   91.560473] FS:  00007f7e500c3700(0000) GS:ffff88051ed80000(0000) knlGS:0000000000000000
      [   91.560478] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   91.560480] CR2: ffffebe000000620 CR3: 0000000503103000 CR4: 00000000001406e0
      [   91.560483] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   91.560487] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   91.560489] Call Trace:
      [   91.560530]  amd_powerplay_destroy+0x1c/0x60 [amdgpu]
      [   91.560558]  amdgpu_pp_late_fini+0x44/0x60 [amdgpu]
      [   91.560575]  amdgpu_fini+0x254/0x490 [amdgpu]
      [   91.560593]  amdgpu_device_fini+0x58/0x1b0 [amdgpu]
      [   91.560610]  amdgpu_driver_unload_kms+0x4f/0xa0 [amdgpu]
      [   91.560622]  drm_dev_unregister+0x3c/0xe0 [drm]
      [   91.560638]  amdgpu_pci_remove+0x19/0x30 [amdgpu]
      [   91.560643]  pci_device_remove+0x39/0xc0
      [   91.560648]  device_release_driver_internal+0x155/0x210
      [   91.560651]  driver_detach+0x38/0x70
      [   91.560655]  bus_remove_driver+0x4c/0xa0
      [   91.560658]  driver_unregister+0x2c/0x40
      [   91.560662]  pci_unregister_driver+0x21/0x90
      [   91.560689]  amdgpu_exit+0x15/0x406 [amdgpu]
      [   91.560694]  SyS_delete_module+0x1a8/0x270
      [   91.560698]  ? exit_to_usermode_loop+0x92/0xa0
      [   91.560702]  entry_SYSCALL_64_fastpath+0x13/0x94
      [   91.560705] RIP: 0033:0x7f7e4fc118e7
      [   91.560708] RSP: 002b:00007fff978ca118 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
      [   91.560713] RAX: ffffffffffffffda RBX: 000055afe21bc200 RCX: 00007f7e4fc118e7
      [   91.560716] RDX: 000000000000000a RSI: 0000000000000800 RDI: 000055afe21bc268
      [   91.560719] RBP: 0000000000000003 R08: 0000000000000000 R09: 1999999999999999
      [   91.560722] R10: 0000000000000883 R11: 0000000000000206 R12: 00007fff978c9100
      [   91.560725] R13: 0000000000000000 R14: 0000000000000000 R15: 000055afe21bc200
      [   91.560728] Code: 00 00 00 80 ff 77 00 00 48 bf 00 00 00 00 00 ea ff ff 49 01 da 48 0f 42 05 57 33 bd 00 49 01 c2 49 c1 ea 0c 49 c1 e2 06 49 01 fa <49> 8b 42 20 48 8d 78 ff a8 01 4c 0f 45 d7 49 8b 52 20 48 8d 42
      [   91.560759] RIP: kfree+0x57/0x160 RSP: ffffc90003403cc0
      [   91.560761] CR2: ffffebe000000620
      [   91.560765] ---[ end trace 08a9f3cd82223c1d ]---
      
      Fixes: 1c863802 (drm/amd/powerplay: refine powerplay interface.)
      Signed-off-by: John Brooks <john@fastquake.com>
      Acked-by: Christian König <christian.koenig@amd.com>
      Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • drm/amdgpu: fix the memory corruption on S3 · 3738cd03
      Huang Rui authored
      commit 67bef0f7 upstream.
      
      psp->cmd is used in the resume phase, so we cannot free it in hw_init.
      Otherwise, memory corruption will be triggered.
      Signed-off-by: Huang Rui <ray.huang@amd.com>
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Tested-by: Xiaojie Yuan <Xiaojie.Yuan@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • drm/amd/amdgpu: Return error if initiating read out of range on vram · 9a0b375b
      Tom St Denis authored
      commit 9156e723 upstream.
      
      If you initiate a read that starts outside the VRAM address space, return
      ENXIO instead of 0.
      
      Reads that begin below that point will read up to the VRAM limit as
      before.
      Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
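The read semantics described in the VRAM commit (an error when the read starts out of range, a short read when it merely crosses the limit) can be sketched in plain C. The function name `toy_vram_read` and its parameters are assumptions made for illustration; the real driver works on a file offset and device memory, not a byte array.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy up to `count` bytes starting at `pos` from a buffer of `vram_size`
 * bytes. Returns the number of bytes copied, or -ENXIO when the read
 * starts outside the buffer. Hypothetical helper, not the amdgpu code.
 */
long toy_vram_read(const unsigned char *vram, size_t vram_size,
                   size_t pos, unsigned char *out, size_t count)
{
    assert(vram != NULL && out != NULL);

    if (pos >= vram_size)
        return -ENXIO;              /* start is out of range: report an error */

    if (count > vram_size - pos)
        count = vram_size - pos;    /* clamp to the end of the buffer */

    memcpy(out, vram + pos, count);
    return (long)count;
}
```

Returning an error rather than 0 lets a caller tell "past the end of VRAM" apart from an ordinary end-of-data condition.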
    • drm/amdgpu/cgs: always set reference clock in mode_info · f64c0826
      Alex Deucher authored
      commit 73cc9079 upstream.
      
      It's relevant regardless of whether there are displays
      enabled.  Fixes garbage values for ref clock in powerplay
      leading to incorrect fan speed reporting when displays
      are disabled.
      
      bug: https://bugs.freedesktop.org/show_bug.cgi?id=101653
      Acked-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • drm/amdgpu: fix vblank_time when displays are off · 2dc1889e
      Alex Deucher authored
      commit beb37776 upstream.
      
      If the displays are off, set the vblank time to max to make
      sure mclk switching is enabled.  Avoid mclk getting set
      to high when no displays are attached.
      
      bug: https://bugs.freedesktop.org/show_bug.cgi?id=101528
      fixes: 09be4a52 (drm/amd/powerplay/smu7: add vblank check for mclk switching (v2))
      Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • drm/amdgpu/gfx8: drop per-APU CU limits · 4a0b1855
      Alex Deucher authored
      commit 943c05bd upstream.
      
      Always use the max for the family rather than the per sku limits.
      This makes sure the mask is always the max size to avoid reporting
      the wrong number of CUs.
      Reviewed-by: Alex Xie <AlexBin.Xie@amd.com>
      Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • s390/syscalls: Fix out of bounds arguments access · eee9c161
      Jiri Olsa authored
      commit c46fc042 upstream.
      
      Zorro reported the following crash with syscall tracing
      enabled (CONFIG_FTRACE_SYSCALLS):
      
        Unable to handle kernel pointer dereference at virtual ...
        Oops: 0011 [#1] SMP DEBUG_PAGEALLOC
      
        SNIP
      
        Call Trace:
        ([<000000000024d79c>] ftrace_syscall_enter+0xec/0x1d8)
         [<00000000001099c6>] do_syscall_trace_enter+0x236/0x2f8
         [<0000000000730f1c>] sysc_tracesys+0x1a/0x32
         [<000003fffcf946a2>] 0x3fffcf946a2
        INFO: lockdep is turned off.
        Last Breaking-Event-Address:
         [<000000000022dd44>] rb_event_data+0x34/0x40
        ---[ end trace 8c795f86b1b3f7b9 ]---
      
      The crash happens in the syscall_get_arguments function for
      syscalls with zero arguments: it tries to access the
      first argument (args[0]) in the event entry, but that is not
      allocated.
      
      Bail out if there are no arguments.
      Reported-by: Zorro Lang <zlang@redhat.com>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
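The zero-argument bail-out can be illustrated with a simplified userspace sketch. The `toy_regs` layout (first argument kept in a separate saved register, the rest in a register array) is a simplifying assumption, and `toy_get_arguments` is a hypothetical name, not the s390 function itself.

```c
#include <assert.h>

/* Simplified register file; the layout is an assumption for illustration. */
struct toy_regs {
    unsigned long gprs[16];
    unsigned long orig_gpr2;
};

/* Copy n syscall arguments starting at index i into args. */
void toy_get_arguments(const struct toy_regs *regs, unsigned int i,
                       unsigned int n, unsigned long *args)
{
    assert(i + n <= 6);

    /* No arguments requested: args may have no storage behind it at all. */
    if (n == 0)
        return;

    while (n-- > 0)
        if (i + n > 0)
            args[n] = regs->gprs[2 + i + n];

    /* The first argument lives in a separate saved register. */
    if (i == 0)
        args[0] = regs->orig_gpr2;
}
```

Without the early return, the final `if (i == 0)` branch would write `args[0]` even when the caller asked for zero arguments, which is the out-of-bounds access the commit describes.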
    • Raid5 should update rdev->sectors after reshape · 25b43a86
      Xiao Ni authored
      commit b5d27718 upstream.
      
      The raid5 md device is created from disks whose total size is not fully used. For example,
      each device is 5G and only 3G of each is used to create the raid5 device.
      Then change the chunk size and wait for the reshape to finish. After the reshape finishes, stop
      the array and assemble it again. The assembly fails.
      mdadm -CR /dev/md0 -l5 -n3 /dev/loop[0-2] --size=3G --chunk=32 --assume-clean
      mdadm /dev/md0 --grow --chunk=64
      (wait for the reshape to finish)
      mdadm -S /dev/md0
      mdadm -As
      The error messages:
      [197519.814302] md: loop1 does not have a valid v1.2 superblock, not importing!
      [197519.821686] md: md_import_device returned -22
      
      After the reshape the data offset has changed; the reshape selects the backwards direction in this case.
      The function super_1_load compares the available space of the underlying device with
      sb->data_size. The new data offset is larger after the reshape, so super_1_load returns -EINVAL.
      rdev->sectors is updated in md_finish_reshape, and sb->data_size is then set in super_1_sync based
      on rdev->sectors. So call md_finish_reshape in end_reshape.
      Signed-off-by: Xiao Ni <xni@redhat.com>
      Acked-by: Guoqing Jiang <gqjiang@suse.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dm raid: stop using BUG() in __rdev_sectors() · 47f1b42a
      Heinz Mauelshagen authored
      commit 4d49f1b4 upstream.
      
      Return 0 rather than BUG() if __rdev_sectors() fails and catch invalid
      rdev size in the constructor.
      Reported-by: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
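The pattern named in the dm raid commit (report failure with a 0 return and let the constructor reject it, rather than crashing) might look roughly like the sketch below. The `toy_rdev` structure, the "first usable device" rule, and both function names are assumptions for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_rdev {
    int usable;                     /* non-zero if the device backs the array */
    unsigned long long sectors;
};

/* Return the size of the first usable device, or 0 if there is none. */
unsigned long long toy_rdev_sectors(const struct toy_rdev *devs, size_t ndevs)
{
    assert(devs != NULL || ndevs == 0);

    for (size_t i = 0; i < ndevs; i++)
        if (devs[i].usable)
            return devs[i].sectors;

    return 0;                       /* report failure instead of crashing */
}

/* Constructor-style check: reject the table when no valid size is found. */
int toy_raid_ctr(const struct toy_rdev *devs, size_t ndevs)
{
    if (toy_rdev_sectors(devs, ndevs) == 0)
        return -EINVAL;
    return 0;
}
```

The caller gets an ordinary error code for a bad table, so invalid input no longer brings the whole system down.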
    • ext2: Don't clear SGID when inheriting ACLs · ee31ec07
      Jan Kara authored
      commit a992f2d3 upstream.
      
      When a new directory 'DIR1' is created in a directory 'DIR0' with the SGID
      bit set, DIR1 is expected to have the SGID bit set (and owning group equal to
      the owning group of 'DIR0'). However, when 'DIR0' also has some default
      ACLs that 'DIR1' inherits, setting these ACLs will result in the SGID bit on
      'DIR1' getting cleared if the user is not a member of the owning group.
      
      Fix the problem by creating __ext2_set_acl() function that does not call
      posix_acl_update_mode() and use it when inheriting ACLs. That prevents
      SGID bit clearing and the mode has been properly set by
      posix_acl_create() anyway.
      
      Fixes: 07393101
      CC: linux-ext4@vger.kernel.org
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • libnvdimm: fix badblock range handling of ARS range · 08196c1c
      Toshi Kani authored
      commit 4e3f0701 upstream.
      
      __add_badblock_range() does not account for sector alignment when
      it sets 'num_sectors'.  Therefore, an ARS error record range
      spanning across two sectors is set to a single sector length,
      which leaves the 2nd sector unprotected.
      
      Change __add_badblock_range() to set 'num_sectors' properly.
      
      Fixes: 0caeef63 ("libnvdimm: Add a poison list and export badblocks")
      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
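The alignment problem in the badblock commit comes down to simple arithmetic: a byte range that starts mid-sector can touch one more sector than its length alone suggests. A minimal sketch, assuming 512-byte sectors and using a hypothetical helper name `toy_num_sectors`:

```c
#include <assert.h>
#include <stdint.h>

#define TOY_SECTOR_SIZE 512u

/*
 * Number of 512-byte sectors touched by the byte range
 * [offset, offset + len). The start is rounded down and the end is
 * rounded up, so an unaligned range covers every sector it overlaps.
 */
uint64_t toy_num_sectors(uint64_t offset, uint64_t len)
{
    assert(len > 0);

    uint64_t start_sector = offset / TOY_SECTOR_SIZE;
    uint64_t end_sector = (offset + len + TOY_SECTOR_SIZE - 1) / TOY_SECTOR_SIZE;

    return end_sector - start_sector;
}
```

For example, a 512-byte range starting at byte 256 spans two sectors, whereas dividing the length alone by the sector size would report one and leave the second sector unprotected.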
    • libnvdimm: fix the clear-error check in nsio_rw_bytes · e01c81e8
      Vishal Verma authored
      commit 7e5a21df upstream.
      
      A leftover from the 'bandaid' fix that disabled BTT error clearing in
      rw_bytes resulted in an incorrect check. After we converted these checks
      over to use the NVDIMM_IO_ATOMIC flag, the ndns->claim check was both
      redundant, and incorrect. Remove it.
      
      Fixes: 3ae3d67b ("libnvdimm: add an atomic vs process context flag to rw_bytes")
      Cc: Dave Jiang <dave.jiang@intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • libnvdimm, btt: fix btt_rw_page not returning errors · c3210112
      Vishal Verma authored
      commit c13c43d5 upstream.
      
      btt_rw_page was not propagating errors from btt_do_bvec, resulting in any
      IO errors via the rw_page path going unnoticed. The pmem driver recently
      fixed this in e10624f8 ("pmem: fail io-requests to known bad blocks"),
      but the same problem in BTT went neglected.
      
      Fixes: 5212e11f ("nd_btt: atomic sector updates")
      Cc: Toshi Kani <toshi.kani@hpe.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tools/testing/nvdimm: fix nfit_test buffer overflow · e6259c4a
      Yasunori Goto authored
      commit a117699c upstream.
      
      The root cause of the panic is that the num_pm of nfit_test1 is wrong.
      Though 1 is specified for num_pm at nfit_test_init(), it must be 2,
      because the nfit_test1->spa_set[] array has 2 elements.
      
      Since the array is smaller than expected, the driver corrupts other memory
      (often the linked list of devres).
      
      As a result, a panic occurs, as in the following example.
      
          CPU: 4 PID: 2233 Comm: lt-libndctl Tainted: G           O    4.12.0-rc1+ #12
          RIP: 0010:__list_del_entry_valid+0x6c/0xa0
          Call Trace:
           release_nodes+0x76/0x260
           devres_release_all+0x3c/0x50
           device_release_driver_internal+0x159/0x200
           device_release_driver+0x12/0x20
           bus_remove_device+0xfd/0x170
           device_del+0x1e8/0x330
           platform_device_del+0x28/0x90
           platform_device_unregister+0x12/0x30
           nfit_test_exit+0x2a/0x93b [nfit_test]
      Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • rc-core: fix input repeat handling · d6226fc7
      David Härdeman authored
      commit b2aceb73 upstream.
      
      The call to input_register_device() needs to take place
      before the repeat parameters are set or the input subsystem
      repeat handling will be disabled (as was already noted in
      the comments in that function).
      Signed-off-by: David Härdeman <david@hardeman.nu>
      Signed-off-by: Sean Young <sean@mess.org>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • cx88: Fix regression in initial video standard setting · 735c17de
      Devin Heitmueller authored
      commit 4e0973a9 upstream.
      
      Setting initial standard at the top of cx8800_initdev would cause the
      first call to cx88_set_tvnorm() to return without programming any
      registers (leaving the driver saying it's set to NTSC but the hardware
      isn't programmed).  Even worse, any subsequent attempt to explicitly
      set it to NTSC-M will return success but actually fail to program the
      underlying registers unless first changing the standard to something
      other than NTSC-M.
      
      Set the initial standard later in the process, and make sure the field
      is zero at the beginning to ensure that the call always goes through.
      
      This regression was introduced in the following commit:
      
      commit ccd6f1d4 ("[media] cx88: move width, height and field to core
      struct")
      
      Author: Hans Verkuil <hans.verkuil@cisco.com>
      
      [media] cx88: move width, height and field to core struct
      Signed-off-by: Devin Heitmueller <dheitmueller@kernellabs.com>
      Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Marek Marczykowski-Górecki's avatar
      x86/xen: allow userspace access during hypercalls · dc1e0c2b
      Marek Marczykowski-Górecki authored
      commit c54590ca upstream.
      
      A userspace application can do a hypercall through /dev/xen/privcmd, and
      for some hypercalls an argument is a pointer to a user-provided
      structure. When SMAP is supported and enabled, the hypervisor can't
      access it. So, let's allow it.

      The same applies to HYPERVISOR_dm_op, where additionally the privcmd
      driver carefully verifies buffer addresses.
      Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
      Reviewed-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      dc1e0c2b
    • NeilBrown's avatar
      md: fix deadlock between mddev_suspend() and md_write_start() · a2bfc675
      NeilBrown authored
      commit cc27b0c7 upstream.
      
      If mddev_suspend() races with md_write_start() we can deadlock
      with mddev_suspend() waiting for the request that is currently
      in md_write_start() to complete the ->make_request() call,
      and md_write_start() waiting for the metadata to be updated
      to mark the array as 'dirty'.
      As metadata updates done by md_check_recovery() only happen when
      the mddev_lock() can be claimed, and as mddev_suspend() is often
      called with the lock held, these threads wait indefinitely for each
      other.
      
      We fix this by having md_write_start() abort if mddev_suspend()
      is happening, and ->make_request() aborts if md_write_start()
      aborted.
      md_make_request() can detect this abort, decrease the ->active_io
      count, and wait for mddev_suspend().
      Reported-by: Nix <nix@esperi.org.uk>
      Fix: 68866e42(MD: no sync IO while suspended)
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a2bfc675
    • Mikulas Patocka's avatar
      md: don't use flush_signals in userspace processes · 8d73fe66
      Mikulas Patocka authored
      commit f9c79bc0 upstream.
      
      The function flush_signals clears all pending signals for the process. It
      may be used by kernel threads when we need to prepare a kernel thread for
      responding to signals. However, using this function for userspace
      processes is incorrect - clearing signals without the program expecting it
      can cause misbehavior.
      
      The raid1 and raid5 code uses flush_signals in its request routine because
      it wants to prepare for an interruptible wait. This patch drops
      flush_signals and uses sigprocmask instead to block all signals (including
      SIGKILL) around the schedule() call. The signals are not lost, but the
      schedule() call won't respond to them.
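
      The "blocked, not lost" behaviour can be shown with a userspace analogue
      using POSIX sigprocmask(). This is a sketch under assumptions, not the
      raid1/raid5 code: the in-kernel sigprocmask() is a different function,
      and userspace cannot block SIGKILL, so SIGUSR1 is used here. The
      function and handler names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t handled;

static void on_usr1(int sig)
{
    (void)sig;
    handled = 1;
}

/* Returns 1 if SIGUSR1 stayed pending (neither delivered nor discarded)
 * while blocked and was delivered once the old mask was restored,
 * 0 if not, -1 on a setup error. */
int blocked_signal_survives(void)
{
    struct sigaction sa = {0};
    sigset_t block, old, pending;
    int rc, was_pending;

    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    handled = 0;
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &old) != 0)
        return -1;

    raise(SIGUSR1);                 /* signal arrives during the "wait" */

    rc = sigpending(&pending);
    assert(rc == 0);
    (void)rc;
    was_pending = sigismember(&pending, SIGUSR1) == 1 && handled == 0;

    /* Restoring the mask lets the still-pending signal be delivered. */
    if (sigprocmask(SIG_SETMASK, &old, NULL) != 0)
        return -1;

    return was_pending && handled == 1;
}
```

      Clearing pending signals instead of blocking them would discard the
      signal, and the handler would never run.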
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Acked-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      8d73fe66
    • Dmitry Torokhov's avatar
      HID: multitouch: do not blindly set EV_KEY or EV_ABS bits · b110a29f
      Dmitry Torokhov authored
      commit 4cf56a89 upstream.
      
      Now that input core insists on having dev->absinfo when device claims to
      generate EV_ABS in its dev->evbit, we should not be blindly setting that
      bit.
      
      The code in question might have been needed before input_set_abs_params()
      started setting EV_ABS in device's evbit, but not anymore, and is now
      breaking devices such as SMART SPNL-6075 Touchscreen.
      
      Fixes: 6ecfe51b ("Input: refuse to register absolute devices ...")
      Reported-by: Matthias Fend <Matthias.Fend@wolfvision.net>
      Tested-by: Matthias Fend <Matthias.Fend@wolfvision.net>
      Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
      Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b110a29f
    • Yoshihiro Shimoda's avatar
      usb: renesas_usbhs: gadget: disable all eps when the driver stops · 971443b0
      Yoshihiro Shimoda authored
      commit b8b9c974 upstream.
      
      A gadget driver will not disable eps immediately when ->disconnect()
      is called. But, since this driver assumes all eps stop after
      the ->disconnect(), unexpected behavior happens (especially in system
      suspend).
      So, this patch disables all eps in usbhsg_try_stop(). After disabling
      eps by renesas_usbhs driver, since some functions will be called by
      both a gadget and renesas_usbhs driver, renesas_usbhs driver should
      protect uep->pipe. To protect uep->pipe easily, this patch adds a new
      lock in struct usbhsg_uep.
      
      Fixes: 2f98382d ("usb: renesas_usbhs: Add Renesas USBHS Gadget")
      Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
      Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      971443b0
    • Yoshihiro Shimoda's avatar
      usb: renesas_usbhs: fix usbhsc_resume() for !USBHSF_RUNTIME_PWCTRL · d0cacd66
      Yoshihiro Shimoda authored
      commit 59a0879a upstream.
      
      This patch fixes an issue where some registers may not be initialized
      after resume if USBHSF_RUNTIME_PWCTRL is not set. In that case,
      if a cable is not connected, the driver will not enable INTENB0.VBSE
      after resume, and then the driver cannot detect VBUS.
      
      Fixes: ca8a282a ("usb: gadget: renesas_usbhs: add suspend/resume support")
      Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
      Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d0cacd66
    • Johan Hovold's avatar
      USB: cdc-acm: add device-id for quirky printer · c007b283
      Johan Hovold authored
      commit fe855789 upstream.
      
      Add device-id entry for DATECS FP-2000 fiscal printer needing the
      NO_UNION_NORMAL quirk.
      Reported-by: Anton Avramov <lukav@lukav.com>
      Signed-off-by: Johan Hovold <johan@kernel.org>
      Acked-by: Oliver Neukum <oneukum@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c007b283
    • Colin Ian King's avatar
      usb: storage: return on error to avoid a null pointer dereference · 2aee5d17
      Colin Ian King authored
      commit 446230f5 upstream.
      
      When us->extra is null the driver is not initialized, however, a
      later call to osd200_scsi_to_ata is made that dereferences
      us->extra, causing a null pointer dereference.  The code
      currently detects and reports that the driver is not initialized;
      add a return to avoid the subsequent dereference issue in this
      check.
      
      Thanks to Alan Stern for pointing out that srb->result needs setting
      to DID_ERROR << 16
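
      The shape of the fix, an early return that also records an error result,
      can be sketched as follows. This is a userspace model under assumptions:
      the struct and function names are hypothetical, and DID_ERROR is defined
      locally with the value used by the kernel's SCSI headers (0x07).

```c
#include <assert.h>
#include <stddef.h>

#define DID_ERROR 0x07  /* local copy of the SCSI host-byte error code */

/* Hypothetical stand-ins for the command and device structures. */
struct cmd_model { int result; };
struct dev_model { void *extra; int dereferenced; };

/* Stands in for the translation step that needs the private data. */
void translate_model(struct dev_model *us)
{
    assert(us->extra != NULL);   /* would be a NULL dereference otherwise */
    us->dereferenced = 1;
}

void ata_command_model(struct cmd_model *srb, struct dev_model *us)
{
    if (us->extra == NULL) {
        /* Driver not initialized: report the error and stop here. */
        srb->result = DID_ERROR << 16;
        return;
    }
    translate_model(us);
    srb->result = 0;
}
```

      Without the return, the model would fall through into translate_model()
      with extra still NULL.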
      
      Detected by CoverityScan, CID#100308 ("Dereference after null check")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Acked-by: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2aee5d17
    • Devin Heitmueller's avatar
      mxl111sf: Fix driver to use heap allocate buffers for USB messages · 2f110d39
      Devin Heitmueller authored
      commit d90b336f upstream.
      
      The recent changes in 4.9 to mandate USB buffers be heap allocated
      broke this driver, which was allocating the buffers on the stack.
      This resulted in the device failing at initialization.
      
      Introduce dedicated send/receive buffers as part of the state
      structure, and add a mutex to protect access to them.
      
      Note: we also had to tweak the API to mxl111sf_ctrl_msg to pass
      the pointer to the state struct rather than the device, since
      we need it inside the function to access the buffers and the
      mutex.  This patch adjusts the callers to match the API change.
      Signed-off-by: Devin Heitmueller <dheitmueller@kernellabs.com>
      Reported-by: Doug Lung <dlung0@gmail.com>
      Cc: Michael Ira Krufky <mkrufky@linuxtv.org>
      Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2f110d39
    • Jiahau Chang's avatar
      xhci: Bad Ethernet performance plugged in ASM1042A host · 5cc9b698
      Jiahau Chang authored
      commit 9da5a109 upstream.
      
      When a USB Ethernet adapter is plugged into an ASMEDIA ASM1042A xHCI
      host, poor performance shows up in Web browser use (such as downloading
      a large file like an ISO image). It is a known limitation of the
      ASM1042A that it is not compatible with driver scheduling.
      As a workaround we can modify the flow control handling of the ASM1042A.
      The register we modify changes the behavior.
      
      [use quirk bit 28, usleep_range 40-60us, empty non-pci function -Mathias]
      Signed-off-by: Jiahau Chang <Lars_chang@asmedia.com.tw>
      Signed-off-by: Ian Pilcher <arequipeno@gmail.com>
      Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      5cc9b698
    • Mathias Nyman's avatar
      xhci: Fix NULL pointer dereference when cleaning up streams for removed host · 7b7a1f02
      Mathias Nyman authored
      commit 4b895868 upstream.
      
      This off by one in stream_id indexing caused NULL pointer dereference and
      soft lockup on machines with USB attached SCSI devices connected to a
      hotpluggable xhci controller.
      
      The code that cleans up pending URBs for dead hosts tried to dereference
      a stream ring at the invalid stream_id 0.
      ep->stream_info->stream_rings[0] doesn't point to a ring.
      
      Start looping stream_id from 1 like in all the other places in the driver,
      and check that the ring exists before trying to kill URBs on it.
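
      The corrected loop can be sketched as below. This is a simplified model
      under assumptions, not the xhci code: the struct layout and names are
      hypothetical, and num_streams is assumed to count the unused slot 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the ring and stream-info structures. */
struct ring_model { int pending_urbs; };

struct stream_info_model {
    unsigned int num_streams;          /* entries in stream_rings, slot 0 unused */
    struct ring_model **stream_rings;
};

/* Cancels pending work on every existing stream ring and returns how
 * many rings were cleaned. Stream IDs start at 1; slot 0 is never a ring. */
int kill_stream_urbs_model(struct stream_info_model *info)
{
    int cleaned = 0;

    assert(info != NULL && info->stream_rings != NULL);
    for (unsigned int stream_id = 1; stream_id < info->num_streams; stream_id++) {
        struct ring_model *ring = info->stream_rings[stream_id];

        if (!ring)
            continue;                  /* ring may not exist: skip it */
        ring->pending_urbs = 0;
        cleaned++;
    }
    return cleaned;
}
```

      Starting the loop at 0 would read stream_rings[0], which in this model
      is NULL, matching the dereference described above.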
      Reported-by: rocko r <rockorequin@gmail.com>
      Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      7b7a1f02
    • Mathias Nyman's avatar
      xhci: fix 20000ms port resume timeout · be67d4af
      Mathias Nyman authored
      commit a54408d0 upstream.
      
      An uncleared PLC (port link change) bit will prevent further port event
      interrupts for that port. Leaving it uncleared caused get_port_status()
      to time out after 20000ms while waiting to get the final port event
      interrupt for resume -> U0 state change.
      
      This is a targeted fix for a specific case where we get a port resume event
      racing with xhci resume. The port event interrupt handler notices xHC is
      not yet running and bails out early, leaving PLC uncleared.
      
      The whole xhci port resuming needs more attention, but in the meantime
      it makes sense to always ensure PLC is cleared in get_port_status
      before setting a new link state and waiting for its completion.
      Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      be67d4af
    • Shu Wang's avatar
      xhci: fix memleak in xhci_run() · 01c5b393
      Shu Wang authored
      commit d6f5f071 upstream.
      
      Found by kmemleak: xhci_run() did not check the return value of
      xhci_queue_vendor_command() and free the command when queuing
      failed.
      
      unreferenced object 0xffff88011c0be500 (size 64):
        comm "kworker/0:1", pid 58, jiffies 4294670908 (age 50.420s)
        hex dump (first 32 bytes):
        backtrace:
          [<ffffffff8176166a>] kmemleak_alloc+0x4a/0xa0
          [<ffffffff8121801a>] kmem_cache_alloc_trace+0xca/0x1d0
          [<ffffffff81576bf4>] xhci_alloc_command+0x44/0x130
          [<ffffffff8156f1cc>] xhci_run+0x4cc/0x630
          [<ffffffff8153b84b>] usb_add_hcd+0x3bb/0x950
          [<ffffffff8154eac8>] usb_hcd_pci_probe+0x188/0x500
          [<ffffffff815851ac>] xhci_pci_probe+0x2c/0x220
          [<ffffffff813d2ca5>] local_pci_probe+0x45/0xa0
          [<ffffffff810a54e4>] work_for_cpu_fn+0x14/0x20
          [<ffffffff810a8409>] process_one_work+0x149/0x360
          [<ffffffff810a8d08>] worker_thread+0x1d8/0x3c0
          [<ffffffff810ae7d9>] kthread+0x109/0x140
          [<ffffffff8176d585>] ret_from_fork+0x25/0x30
          [<ffffffffffffffff>] 0xffffffffffffffff
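
      The allocate, queue, free-on-error pattern behind this fix can be
      sketched in userspace. This is a model under assumptions: all names are
      hypothetical, a one-slot "queue" stands in for the command ring, and a
      counter stands in for kmemleak.

```c
#include <assert.h>
#include <stdlib.h>

struct command_model { int type; };

int live_commands;              /* allocations not yet freed */
struct command_model *queued;   /* one-slot stand-in for the command ring */

struct command_model *alloc_command_model(void)
{
    struct command_model *cmd = calloc(1, sizeof(*cmd));

    if (cmd)
        live_commands++;
    return cmd;
}

void free_command_model(struct command_model *cmd)
{
    if (cmd) {
        free(cmd);
        live_commands--;
    }
}

/* On success the queue takes ownership of cmd; on failure the caller
 * still owns it. ring_ok == 0 simulates a queuing failure. */
int queue_vendor_command_model(struct command_model *cmd, int ring_ok)
{
    assert(cmd != NULL);
    if (!ring_ok)
        return -1;
    queued = cmd;
    return 0;
}

int run_model(int ring_ok)
{
    struct command_model *cmd = alloc_command_model();
    int ret;

    if (!cmd)
        return -1;
    ret = queue_vendor_command_model(cmd, ring_ok);
    if (ret)
        free_command_model(cmd);   /* the step whose absence leaks cmd */
    return ret;
}
```

      With the free on the error path, live_commands returns to zero after a
      failed run_model() call; without it the allocation would be orphaned.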
      Signed-off-by: Shu Wang <shuwang@redhat.com>
      Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      01c5b393
    • Peter Chen's avatar
      usb: xhci: fix spinlock recursion for USB2 test mode · 9c97237c
      Peter Chen authored
      commit 576d5546 upstream.
      
      Both xhci_hub_control and xhci_disable_slot try to take the spinlock, so
      spinlock recursion occurs when entering USB2 test mode. Fix it by
      unlocking the spinlock before calling xhci_disable_slot.
      
      Fixes: 0f1d832e ("usb: xhci: Add port test modes support for usb2")
      Signed-off-by: Peter Chen <peter.chen@nxp.com>
      Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      9c97237c
    • Michael Hernandez's avatar
      PCI/MSI: Ignore affinity if pre/post vector count is more than min_vecs · 07c79fd9
      Michael Hernandez authored
      commit 6f9a22bc upstream.
      
      min_vecs is the minimum number of vectors needed to operate in MSI-X mode
      which may just include the vectors that don't need affinity.
      
      Disabling affinity settings causes the qla2xxx driver scsi_add_host() to fail
      when blk_mq is enabled as the blk_mq_pci_map_queues() expects affinity masks
      on each vector.
      
      Fixes: dfef358b ("PCI/MSI: Don't apply affinity if there aren't enough vectors left")
      Signed-off-by: Michael Hernandez <michael.hernandez@cavium.com>
      Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      07c79fd9
    • Chen Yu's avatar
      PCI/PM: Restore the status of PCI devices across hibernation · c1ead164
      Chen Yu authored
      commit e60514bd upstream.
      
      We currently see a lot of "No irq handler" errors during hibernation,
      which eventually cause the system to hang:
      
        ata4.00: qc timeout (cmd 0xec)
        ata4.00: failed to IDENTIFY (I/O error, err_mask=0x4)
        ata4.00: revalidation failed (errno=-5)
        ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
        do_IRQ: 31.151 No irq handler for vector
      
      According to the above logs, there is an interrupt triggered and it is
      dispatched to CPU31 with a vector number 151, but there is no handler for
      it, thus this IRQ will not get acked and will cause an IRQ flood which
      kills the system.  To be more specific, the 31.151 is an interrupt from the
      AHCI host controller.
      
      After some investigation, it turns out this issue is triggered because
      the thaw_noirq() function does not restore the MSI/MSI-X settings across
      hibernation.
      
      The scenario is illustrated below:
      
        1. Before hibernation, IRQ 34 is the handler for the AHCI device, which
           is bound to CPU31.
      
        2. Hibernation starts, the AHCI device is put into low power state.
      
        3. All the nonboot CPUs are put offline, so IRQ 34 has to be migrated to
           the last alive one - CPU0.
      
        4. After the snapshot has been created, all the nonboot CPUs are brought
           up again; IRQ 34 remains bound to CPU0.
      
        5. AHCI devices are put into D0.
      
        6. The snapshot is written to the disk.
      
      The issue is triggered in step 6.  The AHCI interrupt should be delivered
      to CPU0, however it is delivered to the original CPU31 instead, which
      causes the "No irq handler" issue.
      
      Ying Huang has provided a clue that, in step 3 it is possible that writing
      to the register might not take effect as the PCI devices have been
      suspended.
      
      In step 3, the IRQ 34 affinity should be modified from CPU31 to CPU0, but
      in fact it is not.  In __pci_write_msi_msg(), if the device is already in
      low power state, the low level MSI message entry will not be updated but
      cached.  During the device restore process after a normal suspend/resume,
      pci_restore_msi_state() writes the cached MSI back to the hardware.
      
      But this is not the case for hibernation.  pci_restore_msi_state() is not
      currently called in pci_pm_thaw_noirq(), although pci_save_state() has
      saved the necessary PCI cached information in pci_pm_freeze_noirq().
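
      The cached-write behaviour and the missing restore can be sketched as a
      small model of the scenario above. This is an illustration under
      assumptions: the struct and function names are hypothetical, and a
      single target CPU number stands in for the full MSI message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a device with a software-cached MSI target. */
struct msi_dev_model {
    bool in_d0;                  /* false while in a low power state */
    uint32_t cached_target_cpu;  /* software copy of the MSI target */
    uint32_t hw_target_cpu;      /* target the "hardware" really uses */
};

/* In low power the write only updates the cache, not the hardware. */
void write_msi_msg_model(struct msi_dev_model *dev, uint32_t cpu)
{
    assert(dev != NULL);
    dev->cached_target_cpu = cpu;
    if (dev->in_d0)
        dev->hw_target_cpu = cpu;
}

/* Writes the cached message back once the device is in D0 again. */
void restore_msi_state_model(struct msi_dev_model *dev)
{
    assert(dev != NULL && dev->in_d0);
    dev->hw_target_cpu = dev->cached_target_cpu;
}

/* Replays steps 1-5 and returns the CPU the interrupt would reach. */
uint32_t hibernate_model(bool restore_on_thaw)
{
    struct msi_dev_model ahci = { true, 31, 31 };  /* step 1: bound to CPU31 */

    ahci.in_d0 = false;                 /* step 2: device enters low power */
    write_msi_msg_model(&ahci, 0);      /* step 3: migrate IRQ to CPU0 */
    ahci.in_d0 = true;                  /* step 5: device back in D0 */
    if (restore_on_thaw)
        restore_msi_state_model(&ahci); /* the restore the thaw path lacked */
    return ahci.hw_target_cpu;
}
```

      In this model hibernate_model(false) still targets CPU31, while
      hibernate_model(true) targets CPU0 as intended.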
      
      Restore the PCI status for the device during hibernation.  Otherwise the
      status might be lost across hibernation (for example, settings for MSI,
      MSI-X, ATS, ACS, IOV, etc.), which might cause problems during hibernation.
      Suggested-by: Ying Huang <ying.huang@intel.com>
      Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Chen Yu <yu.c.chen@intel.com>
      [bhelgaas: changelog]
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: Ying Huang <ying.huang@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c1ead164
    • Shawn Lin's avatar
      PCI: rockchip: Use normal register bank for config accessors · 18fc66a8
      Shawn Lin authored
      commit dc8cca5e upstream.
      
      Rockchip's RC has two banks of registers for the root port: a normal bank
      that is strictly compatible with the PCIe spec, and a privileged bank that
      can be used to change RO bits of root port registers.
      
      When probing the RC driver, we use the privileged bank to do some basic
      setup work as some RO bits are hw-inited to the wrong value.  But we
      didn't change to the normal bank after probing the driver.
      
      This leads to a serious problem when the PME code tries to clear the PME
      status by writing PCI_EXP_RTSTA_PME to the PCI_EXP_RTSTA register.  Per
      PCIe 3.0 spec, section 7.8.14, the PME status bit is RW1C.  So the PME code
      is doing the right thing to clear the PME status, but we find the RC
      doesn't clear it and actually sets it to one.  As a result the system gets
      trapped in pcie_pme_work_fn(), as PCI_EXP_RTSTA_PME now stays true forever.
      This issue can be reproduced by booting the kernel with pci=nomsi.
      
      Use the normal register bank for the PCI config accessors.  The privileged
      bank is used only internally by this driver.
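
      The difference between the two banks for an RW1C bit can be sketched as
      below. This is a model under assumptions: the bank enum, struct and
      function names are hypothetical, the privileged bank is modelled as a
      plain write to the bit, and PCI_EXP_RTSTA_PME is defined locally with
      the value from the kernel's pci_regs.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_EXP_RTSTA_PME 0x00010000u  /* PME Status bit in Root Status */

enum bank_model { BANK_NORMAL, BANK_PRIVILEGED };

struct rtsta_model { uint32_t value; };

/* Models a write of val to the Root Status register through either bank.
 * Only the PME Status bit is modelled. */
void rtsta_write_model(struct rtsta_model *reg, enum bank_model bank,
                       uint32_t val)
{
    assert(reg != NULL);
    if (bank == BANK_NORMAL) {
        /* RW1C: writing 1 clears the bit, writing 0 leaves it alone. */
        reg->value &= ~(val & PCI_EXP_RTSTA_PME);
    } else {
        /* Plain write: the bit simply takes the written value. */
        reg->value = (reg->value & ~PCI_EXP_RTSTA_PME) |
                     (val & PCI_EXP_RTSTA_PME);
    }
}
```

      Writing PCI_EXP_RTSTA_PME through the privileged bank leaves the bit
      set, while the same write through the normal bank clears it.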
      
      Fixes: e77f847d ("PCI: rockchip: Add Rockchip PCIe controller support")
      Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Jeffy Chen <jeffy.chen@rock-chips.com>
      Cc: Brian Norris <briannorris@chromium.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      18fc66a8
    • Bjorn Helgaas's avatar
      PCI: Work around poweroff & suspend-to-RAM issue on Macbook Pro 11 · 31f8d306
      Bjorn Helgaas authored
      commit 13cfc732 upstream.
      
      Neither soft poweroff (transition to ACPI power state S5) nor
      suspend-to-RAM (transition to state S3) works on the Macbook Pro 11,4 and
      11,5.
      
      The problem is related to the [mem 0x7fa00000-0x7fbfffff] space.  When we
      use that space, e.g., by assigning it to the 00:1c.0 Root Port, the ACPI
      Power Management 1 Control Register (PM1_CNT) at [io 0x1804] doesn't work
      anymore.
      
      Linux does a soft poweroff (transition to S5) by writing to PM1_CNT.  The
      theory about why this doesn't work is:
      
        - The write to PM1_CNT causes an SMI
        - The BIOS SMI handler depends on something in
          [mem 0x7fa00000-0x7fbfffff]
        - When Linux assigns [mem 0x7fa00000-0x7fbfffff] to the 00:1c.0 Port, it
          covers up whatever the SMI handler uses, so the SMI handler no longer
          works correctly
      
      Reserve the [mem 0x7fa00000-0x7fbfffff] space so we don't assign it to
      anything.
      
      This is voodoo programming, since we don't know what the real conflict is,
      but we've failed to find the root cause.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=103211
      Tested-by: thejoe@gmail.com
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Rafael J. Wysocki <rafael@kernel.org>
      Cc: Lukas Wunner <lukas@wunner.de>
      Cc: Chen Yu <yu.c.chen@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      31f8d306
    • Jon Derrick's avatar
      PCI: vmd: Move SRCU cleanup after bus, child device removal · 2e09bcd1
      Jon Derrick authored
      commit 0cb259c4 upstream.
      
      Recent __call_srcu() changes have exposed that we need to clean up SRCU
      structures after pci_stop_root_bus() calls into vmd_msi_free().
      
      Fixes: 3906b918 ("PCI: vmd: Use SRCU as a local RCU to prevent delaying global RCU")
      Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Acked-by: Keith Busch <keith.busch@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2e09bcd1