1. 15 May, 2014 40 commits
    • libata/ahci: accommodate tag ordered controllers · d168685b
      Dan Williams authored
      commit 8a4aeec8 upstream.
      
      The AHCI spec allows implementations to issue commands in tag order
      rather than FIFO order:
      
      	5.3.2.12 P:SelectCmd
      	HBA sets pSlotLoc = (pSlotLoc + 1) mod (CAP.NCS + 1)
      	or HBA selects the command to issue that has had the
      	PxCI bit set to '1' longer than any other command
      	pending to be issued.
      
      The result is that commands posted sequentially (time-wise) may play out
      of sequence when issued by hardware.
      
      This behavior has likely been hidden by drives that arrange for commands
      to complete in issue order.  However, it appears recent drives (two from
      different vendors that we have found so far) inflict out-of-order
      completions as a matter of course.  So, we need to take care to maintain
      ordered submission, otherwise we risk triggering a drive to fall out of
      sequential-io automation and back to random-io processing, which incurs
      large latency and degrades throughput.
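      A minimal sketch of one way to keep submission ordered, assuming tags
      are handed out round-robin starting just after the most recently used
      tag, so that a controller issuing in tag order sees commands in the
      order they were posted. The names demo_port, demo_alloc_tag and
      demo_free_tag are hypothetical and are not libata identifiers:

```c
#include <assert.h>

#define DEMO_MAX_TAGS 32u

/* Hypothetical per-port state: a bitmap of busy tags plus the last tag used. */
struct demo_port {
    unsigned int busy;
    unsigned int last_tag;
};

/* Hand out the next free tag after last_tag, wrapping around; -1 if none. */
static int demo_alloc_tag(struct demo_port *p)
{
    unsigned int i;

    for (i = 0; i < DEMO_MAX_TAGS; i++) {
        unsigned int tag = (p->last_tag + 1u + i) % DEMO_MAX_TAGS;

        if (!(p->busy & (1u << tag))) {
            p->busy |= 1u << tag;
            p->last_tag = tag;
            return (int)tag;
        }
    }
    return -1;
}

/* Release a tag once its command has completed. */
static void demo_free_tag(struct demo_port *p, unsigned int tag)
{
    assert(tag < DEMO_MAX_TAGS);
    p->busy &= ~(1u << tag);
}
```

      With this scheme a freed low tag is not reused until the allocator
      wraps around, unlike a lowest-free-tag search, which would hand tag 0
      out again immediately.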
      
      This issue was found in simple benchmarks where QD=2 seq-write
      performance was 30-50% *greater* than QD=32 seq-write performance.
      
      Tagging for -stable and making the change globally since it has a low
      risk-to-reward ratio.  Also, word is that recent versions of an unnamed
      OS do it this way now.  So, drives in the field are already
      experienced with this tag ordering scheme.
      
      Cc: Dave Jiang <dave.jiang@intel.com>
      Cc: Ed Ciechanowski <ed.ciechanowski@intel.com>
      Reviewed-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • ahci: do not request irq for dummy port · f7f31e5e
      David Milburn authored
      commit 9ae794ac upstream.
      
      The system may crash in ahci_hw_interrupt() or ahci_thread_fn() when
      accessing the interrupt status in a port's private_data if the port is
      actually a DUMMY port.
      
      00:1f.2 SATA controller: Intel Corporation 82801JI (ICH10 Family) SATA AHCI Controller
      
      <snip console output for linux-3.15-rc1>
      [    9.352080] ahci 0000:00:1f.2: AHCI 0001.0200 32 slots 6 ports 3 Gbps 0x1 impl SATA mode
      [    9.352084] ahci 0000:00:1f.2: flags: 64bit ncq sntf pm led clo pio slum part ccc
      [    9.368155] Console: switching to colour frame buffer device 128x48
      [    9.439759] mgag200 0000:11:00.0: fb0: mgadrmfb frame buffer device
      [    9.446765] mgag200 0000:11:00.0: registered panic notifier
      [    9.470166] scsi1 : ahci
      [    9.479166] scsi2 : ahci
      [    9.488172] scsi3 : ahci
      [    9.497174] scsi4 : ahci
      [    9.506175] scsi5 : ahci
      [    9.515174] scsi6 : ahci
      [    9.518181] ata1: SATA max UDMA/133 abar m2048@0x95c00000 port 0x95c00100 irq 91
      [    9.526448] ata2: DUMMY
      [    9.529182] ata3: DUMMY
      [    9.531916] ata4: DUMMY
      [    9.534650] ata5: DUMMY
      [    9.537382] ata6: DUMMY
      [    9.576196] [drm] Initialized mgag200 1.0.0 20110418 for 0000:11:00.0 on minor 0
      [    9.845257] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
      [    9.865161] ata1.00: ATAPI: Optiarc DVD RW AD-7580S, FX04, max UDMA/100
      [    9.891407] ata1.00: configured for UDMA/100
      [    9.900525] scsi 1:0:0:0: CD-ROM            Optiarc  DVD RW AD-7580S  FX04 PQ: 0 ANSI: 5
      [   10.247399] iTCO_vendor_support: vendor-support=0
      [   10.261572] iTCO_wdt: Intel TCO WatchDog Timer Driver v1.11
      [   10.269764] iTCO_wdt: unable to reset NO_REBOOT flag, device disabled by hardware/BIOS
      [   10.301932] sd 0:2:0:0: [sda] 570310656 512-byte logical blocks: (291 GB/271 GiB)
      [   10.317085] sd 0:2:0:0: [sda] Write Protect is off
      [   10.328326] sd 0:2:0:0: [sda] Write cache: disabled, read cache: disabled, supports DPO and FUA
      [   10.375452] BUG: unable to handle kernel NULL pointer dereference at 000000000000003c
      [   10.384217] IP: [<ffffffffa0133df0>] ahci_hw_interrupt+0x100/0x130 [libahci]
      [   10.392101] PGD 0
      [   10.394353] Oops: 0000 [#1] SMP
      [   10.397978] Modules linked in: sr_mod(+) cdrom sd_mod iTCO_wdt crc_t10dif iTCO_vendor_support crct10dif_common ahci libahci libata lpc_ich mfd_core mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm drm i2c_core megaraid_sas dm_mirror dm_region_hash
      dm_log dm_mod
      [   10.426499] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.15.0-rc1 #1
      [   10.433495] Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.S013.032920111005 03/29/2011
      [   10.443886] task: ffffffff81906460 ti: ffffffff818f0000 task.ti: ffffffff818f0000
      [   10.452239] RIP: 0010:[<ffffffffa0133df0>]  [<ffffffffa0133df0>] ahci_hw_interrupt+0x100/0x130 [libahci]
      [   10.462838] RSP: 0018:ffff880033c03d98  EFLAGS: 00010046
      [   10.468767] RAX: 0000000000a400a4 RBX: ffff880029a6bc18 RCX: 00000000fffffffa
      [   10.476731] RDX: 00000000000000a4 RSI: ffff880029bb0000 RDI: ffff880029a6bc18
      [   10.484696] RBP: ffff880033c03dc8 R08: 0000000000000000 R09: ffff88002f800490
      [   10.492661] R10: 0000000000000000 R11: 0000000000000005 R12: 0000000000000000
      [   10.500625] R13: ffff880029a6bd98 R14: 0000000000000000 R15: ffffc90000194000
      [   10.508590] FS:  0000000000000000(0000) GS:ffff880033c00000(0000) knlGS:0000000000000000
      [   10.517623] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [   10.524035] CR2: 000000000000003c CR3: 00000000328ff000 CR4: 00000000000007b0
      [   10.531999] Stack:
      [   10.534241]  0000000000000017 ffff880031ba7d00 000000000000005c ffff880031ba7d00
      [   10.542535]  0000000000000000 000000000000005c ffff880033c03e10 ffffffff810c2a1e
      [   10.550827]  ffff880031ae2900 000000008108fb4f ffff880031ae2900 ffff880031ae2984
      [   10.559121] Call Trace:
      [   10.561849]  <IRQ>
      [   10.563994]  [<ffffffff810c2a1e>] handle_irq_event_percpu+0x3e/0x1a0
      [   10.571309]  [<ffffffff810c2bbd>] handle_irq_event+0x3d/0x60
      [   10.577631]  [<ffffffff810c4fdd>] try_one_irq.isra.6+0x8d/0xf0
      [   10.584142]  [<ffffffff810c5313>] note_interrupt+0x173/0x1f0
      [   10.590460]  [<ffffffff810c2a8e>] handle_irq_event_percpu+0xae/0x1a0
      [   10.597554]  [<ffffffff810c2bbd>] handle_irq_event+0x3d/0x60
      [   10.603872]  [<ffffffff810c5727>] handle_edge_irq+0x77/0x130
      [   10.610199]  [<ffffffff81014b8f>] handle_irq+0xbf/0x150
      [   10.616040]  [<ffffffff8109ff4e>] ? vtime_account_idle+0xe/0x50
      [   10.622654]  [<ffffffff815fca1a>] ? atomic_notifier_call_chain+0x1a/0x20
      [   10.630140]  [<ffffffff816038cf>] do_IRQ+0x4f/0xf0
      [   10.635490]  [<ffffffff815f8aed>] common_interrupt+0x6d/0x6d
      [   10.641805]  <EOI>
      [   10.643950]  [<ffffffff8149ca9f>] ? cpuidle_enter_state+0x4f/0xc0
      [   10.650972]  [<ffffffff8149ca98>] ? cpuidle_enter_state+0x48/0xc0
      [   10.657775]  [<ffffffff8149cb47>] cpuidle_enter+0x17/0x20
      [   10.663807]  [<ffffffff810b0070>] cpu_startup_entry+0x2c0/0x3d0
      [   10.670423]  [<ffffffff815dfcc7>] rest_init+0x77/0x80
      [   10.676065]  [<ffffffff81a60f47>] start_kernel+0x40f/0x41a
      [   10.682190]  [<ffffffff81a60941>] ? repair_env_string+0x5c/0x5c
      [   10.688799]  [<ffffffff81a60120>] ? early_idt_handlers+0x120/0x120
      [   10.695699]  [<ffffffff81a605ee>] x86_64_start_reservations+0x2a/0x2c
      [   10.702889]  [<ffffffff81a60733>] x86_64_start_kernel+0x143/0x152
      [   10.709689] Code: a0 fc ff 85 c0 8b 4d d4 74 c3 48 8b 7b 08 89 ca 48 c7 c6 60 66 13 a0 31 c0 e8 9d 70 28 e1 8b 4d d4 eb aa 0f 1f 84 00 00 00 00 00 <45> 8b 64 24 3c 48 89 df e8 23 47 4c e1 41 83 fc 01 19 c0 48 83
      [   10.731470] RIP  [<ffffffffa0133df0>] ahci_hw_interrupt+0x100/0x130 [libahci]
      [   10.739441]  RSP <ffff880033c03d98>
      [   10.743333] CR2: 000000000000003c
      [   10.747032] ---[ end trace b6e82636970e2690 ]---
      [   10.760190] Kernel panic - not syncing: Fatal exception in interrupt
      [   10.767291] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
      
      Cc: Alexander Gordeev <agordeev@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: David Milburn <dmilburn@redhat.com>
      Fixes: 5ca72c4f ("AHCI: Support multiple MSIs")
    • Revert "net: mvneta: fix usage as a module on RGMII configurations" · 45a96331
      Thomas Petazzoni authored
      commit cc6ca302 upstream.
      
      This reverts commit e3a8786c. While that commit allows the mvneta
      driver to be used as a module on some configurations, it breaks other
      configurations even when mvneta is built in.
      
      This breakage is due to the fact that on some RGMII platforms, the PCS
      bit has to be set, and on some other platforms, it has to be
      cleared. At the moment, we lack the information needed to know the
      exact significance of this bit (the datasheet only says "enables PCS"),
      and so we can't produce a patch that will work on all platforms at this
      point. And since this change is breaking the network completely for
      many users, it's much better to revert it for now. We'll come back
      later with a proper fix that takes into account all platforms.
      
      Basically:
      
       * Armada XP GP is configured as RGMII-ID, and needs the PCS bit to be
         set.
       * Armada 370 Mirabox is configured as RGMII-ID, and needs the PCS bit
         to be cleared.
      
      And at the moment, we don't know how to make the distinction between
      those two cases. One hint is that the Armada XP GP appears in fact to
      be using a QSGMII connection with the PHY (Quad-SGMII), but
      configuring it as SGMII doesn't work, while RGMII-ID works. This needs
      more investigation, but in the meantime, let's unbreak the network
      for all those users.
      Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      Reported-by: Arnaud Ebalard <arno@natisbad.org>
      Reported-by: Alexander Reuter <Alexander.Reuter@gmx.net>
      Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=73401
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • b43: Fix machine check error due to improper access of B43_MMIO_PSM_PHY_HDR · 1347f082
      Rafał Miłecki authored
      commit 12cd43c6 upstream.
      
      Register B43_MMIO_PSM_PHY_HDR is a 16-bit register, so accessing it with
      32-bit functions isn't safe. On my machine it causes a delayed (!) CPU
      exception:
      
      Disabling lock debugging due to kernel taint
      mce: [Hardware Error]: CPU 0: Machine Check Exception: 4 Bank 4: b200000000070f0f
      mce: [Hardware Error]: TSC 164083803dc
      mce: [Hardware Error]: PROCESSOR 2:20fc2 TIME 1396650505 SOCKET 0 APIC 0 microcode 0
      mce: [Hardware Error]: Run the above through 'mcelog --ascii'
      mce: [Hardware Error]: Machine check: Processor context corrupt
      Kernel panic - not syncing: Fatal machine check on current CPU
      Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
      Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
      Acked-by: Larry Finger <Larry.Finger@lwfinger.net>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • mach64: fix cursor when character width is not a multiple of 8 pixels · cff94c14
      Mikulas Patocka authored
      commit 43751a1b upstream.
      
      This patch fixes the hardware cursor on mach64 when font width is not a
      multiple of 8 pixels.
      
      If you load such a font, the cursor is expanded to the next 8-pixel
      boundary and a part of the next character after the cursor is not
      visible.
      For example, when you load a font with 12-pixel width, the cursor width
      is 16 pixels and when the cursor is displayed, 4 pixels of the next
      character are not visible.
      
      The reason is this: atyfb_cursor is called with proper parameters to
      load an image that is 12 pixels wide. However, the number is aligned on
      the next 8-pixel boundary on the line
      "unsigned int width = (cursor->image.width + 7) >> 3;" and the whole
      function acts as if it were loading a 16-pixel image.
      
      This patch fixes it so that the value written to the framebuffer is
      padded with 0xaaaa (the transparent pattern) when the image size is not
      a multiple of 8 pixels. The transparent pattern ensures that the cursor
      does not interfere with the next character.
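      The rounding and padding described above can be sketched as follows.
      This is an illustration only, not the driver code: it assumes each
      16-bit cursor word encodes 8 pixels at 2 bits per pixel with the
      leftmost pixel in the low-order bits, and the helper names
      cursor_width_bytes and pad_cursor_word are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Transparent pattern for a full 8-pixel cursor word (value from the text). */
#define CURSOR_TRANSPARENT 0xaaaau

/* Same rounding as the quoted line: pixels to whole bytes of bitmap data. */
static unsigned int cursor_width_bytes(unsigned int width_px)
{
    return (width_px + 7) >> 3;
}

/*
 * Keep the first valid_px pixels of a cursor word and fill the remaining
 * pixels with the transparent pattern (assumed bit layout, see above).
 */
static uint16_t pad_cursor_word(uint16_t word, unsigned int valid_px)
{
    uint16_t keep;

    assert(valid_px <= 8);
    keep = (uint16_t)((1u << (valid_px * 2)) - 1u);
    return (uint16_t)((word & keep) | (CURSOR_TRANSPARENT & ~keep));
}
```

      For a 12-pixel font, cursor_width_bytes(12) is 2, and the second word
      of each row carries only 12 & 7 = 4 valid pixels, so the other 4 are
      replaced with the transparent pattern.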
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • mach64: use unaligned access · 7aee7576
      Mikulas Patocka authored
      commit c29dd869 upstream.
      
      This patch fixes mach64 to use unaligned access to the font bitmap.
      
      This fixes an unaligned access warning on sparc64 when a 14x8 font is
      loaded.
      
      On x86(64), unaligned access is handled in hardware, so both functions
      le32_to_cpup and get_unaligned_le32 perform the same operation.
      
      On RISC machines, unaligned access is not handled in hardware, so we
      should use get_unaligned_le32 to avoid the unaligned trap and warning.
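      As a userspace illustration of the idea (not the kernel helper
      itself), a little-endian 32-bit value can be read from an arbitrary
      address one byte at a time, which never performs a misaligned word
      load. The function name load_le32_unaligned is hypothetical:

```c
#include <stdint.h>

/*
 * Assemble a little-endian 32-bit value from four single-byte loads, so the
 * pointer may have any alignment.
 */
static uint32_t load_le32_unaligned(const void *p)
{
    const uint8_t *b = p;

    return (uint32_t)b[0] |
           ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) |
           ((uint32_t)b[3] << 24);
}
```

      Reading from an odd offset into a byte buffer, such as a font bitmap
      row that does not start on a 4-byte boundary, is then safe on any
      architecture.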
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • matroxfb: restore the registers M_ACCESS and M_PITCH · ee8ba724
      Mikulas Patocka authored
      commit a772d473 upstream.
      
      When X11 is running and the user switches back to the console, the card
      modifies the content of registers M_MACCESS and M_PITCH at periodic
      intervals.
      
      This patch fixes it by restoring the content of these registers before
      issuing any accelerator command.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • framebuffer: fix cfb_copyarea · d0579cf3
      Mikulas Patocka authored
      commit 00a9d699 upstream.
      
      The function cfb_copyarea is buggy when the copy operation is not aligned
      on a long boundary (4 bytes on 32-bit machines, 8 bytes on 64-bit
      machines).
      
      How to reproduce:
      - use x86-64 machine
      - use a framebuffer driver without acceleration (for example uvesafb)
      - set the framebuffer to 8-bit depth
      	(for example fbset -a 1024x768-60 -depth 8)
      - load a font with character width that is not a multiple of 8 pixels
      	note: the console-tools package cannot load a font that has
      	width different from 8 pixels. You need to install the packages
      	"kbd" and "console-terminus" and use the program "setfont" to
      	set font width (for example: setfont Uni2-Terminus20x10)
      - move some text left and right on the bash command line and you get a
      	screen corruption
      
      To expose more bugs, put this line at the end of uvesafb_init_info:
      info->flags |= FBINFO_HWACCEL_COPYAREA | FBINFO_READS_FAST;
      - Now the framebuffer console will use cfb_copyarea for console scrolling.
      You get screen corruption when the console is scrolled.
      
      This patch is a rewrite of cfb_copyarea. It fixes the bugs; with this
      patch, console scrolling in 8-bit depth with a font width that is not a
      multiple of 8 pixels works fine.
      
      The cfb_copyarea code was very buggy and it looks like it was written
      and never tried with a non-8-pixel font.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • ARC: !PREEMPT: Ensure Return to kernel mode is IRQ safe · ca29d240
      Vineet Gupta authored
      commit 8aa9e85a upstream.
      
      There was a very small race window where resume to kernel mode from an
      Exception Path (or pure kernel mode, which is true for most ARC
      exceptions anyway) was not disabling interrupts in restore_regs,
      clobbering the exception regs.
      
      Anton found the culprit call flow (after many sleepless nights)
      
      | 1. we got a Trap from user land
      | 2. started to service it.
      | 3. While doing some stuff on user-land memory (I think it is padzero()),
      |     we got a DataTlbMiss
      | 4. On return from it we are taking "resume_kernel_mode" path
      | 5. NEED_RESCHED is not set, so we go to "return from exception" path in
      |     restore regs.
      | 6. there seems to be IRQ happening
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      Cc: Anton Kolesov <Anton.Kolesov@synopsys.com>
      Cc: Francois Bedard <Francois.Bedard@synopsys.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • ARC: Remove ARC_HAS_COH_RTSC · 796f316e
      Richard Weinberger authored
      commit d345ea28 upstream.
      
      The symbol is an orphan, get rid of it.
      
      Fixes: 7d0857a5 ("ARC: [SMP] Disallow RTSC")
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Acked-by: Paul Bolle <pebolle@tiscali.nl>
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • ASoC: dapm: Fix widget double free with auto-disable DAPM kcontrol · 79c3ca4f
      Jarkko Nikula authored
      commit 2697e4fb upstream.
      
      Commit 9e1fda4ae158 ("ASoC: dapm: Implement mixer input auto-disable")
      is trying to free the widget it allocated by snd_soc_dapm_new_control()
      call in dapm_kcontrol_data_alloc() by adding kfree(data->widget) to
      dapm_kcontrol_free().
      
      This is causing a widget double free with auto-disabled DAPM kcontrols
      in sound card unregistration because widgets are already freed before
      dapm_kcontrol_free() is called.
      
      Reason for that is all widgets are added into dapm->card->widgets list
      in snd_soc_dapm_new_control() and freed in dapm_free_widgets() during
      execution of snd_soc_dapm_free().
      
      Now the snd_soc_dapm_free() calls for the different DAPM contexts happen
      before the snd_card_free() call, from where the call chain to
      dapm_kcontrol_free() begins:
      
      soc_cleanup_card_resources()
        soc_remove_dai_links()
          soc_remove_link_dais()
            snd_soc_dapm_free(&cpu_dai->dapm)
          soc_remove_link_components()
            soc_remove_platform()
              snd_soc_dapm_free(&platform->dapm)
            soc_remove_codec()
              snd_soc_dapm_free(&codec->dapm)
        snd_soc_dapm_free(&card->dapm)
        snd_card_free()
          snd_card_do_free()
            snd_device_free_all()
              snd_device_free()
                snd_ctl_dev_free()
                  snd_ctl_remove()
                    snd_ctl_free_one()
                      dapm_kcontrol_free()
      
      This did no harm with ordinary DAPM kcontrols since data->widget is NULL
      for them.
      
      Fixes: 9e1fda4ae158 (ASoC: dapm: Implement mixer input auto-disable)
      Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
      Acked-by: Lars-Peter Clausen <lars@metafoo.de>
      Signed-off-by: Mark Brown <broonie@linaro.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • s390/bpf,jit: initialize A register if 1st insn is BPF_S_LDX_B_MSH · 96ac4631
      Martin Schwidefsky authored
      commit 6e0de817 upstream.
      
      The A register needs to be initialized to zero in the prolog if the
      first instruction of the BPF program is BPF_S_LDX_B_MSH to prevent
      leaking the content of %r5 to user space.
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • s390/chsc: fix SEI usage on old FW levels · 1671b4e1
      Sebastian Ott authored
      commit 06cd7a87 upstream.
      
      Using a notification type mask for the store event information chsc
      is unsupported on some firmware levels. Retry SEI with that mask set
      to zero (which is the old way of requesting only channel subsystem
      related events).
      Reported-and-tested-by: Stefan Haberland <stefan.haberland@de.ibm.com>
      Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
      Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • powerpc/tm: Disable IRQ in tm_recheckpoint · 32dbc1eb
      Michael Neuling authored
      commit e6b8fd02 upstream.
      
      We can't take an IRQ when we're about to do a trechkpt as our GPR state is set
      to user GPR values.
      
      We've hit this when running some IBM Java stress tests in the lab resulting in
      the following dump:
      
        cpu 0x3f: Vector: 700 (Program Check) at [c000000007eb3d40]
            pc: c000000000050074: restore_gprs+0xc0/0x148
            lr: 00000000b52a8184
            sp: ac57d360
           msr: 8000000100201030
          current = 0xc00000002c500000
          paca    = 0xc000000007dbfc00     softe: 0     irq_happened: 0x00
            pid   = 34535, comm = Pooled Thread #
        R00 = 00000000b52a8184   R16 = 00000000b3e48fda
        R01 = 00000000ac57d360   R17 = 00000000ade79bd8
        R02 = 00000000ac586930   R18 = 000000000fac9bcc
        R03 = 00000000ade60000   R19 = 00000000ac57f930
        R04 = 00000000f6624918   R20 = 00000000ade79be8
        R05 = 00000000f663f238   R21 = 00000000ac218a54
        R06 = 0000000000000002   R22 = 000000000f956280
        R07 = 0000000000000008   R23 = 000000000000007e
        R08 = 000000000000000a   R24 = 000000000000000c
        R09 = 00000000b6e69160   R25 = 00000000b424cf00
        R10 = 0000000000000181   R26 = 00000000f66256d4
        R11 = 000000000f365ec0   R27 = 00000000b6fdcdd0
        R12 = 00000000f66400f0   R28 = 0000000000000001
        R13 = 00000000ada71900   R29 = 00000000ade5a300
        R14 = 00000000ac2185a8   R30 = 00000000f663f238
        R15 = 0000000000000004   R31 = 00000000f6624918
        pc  = c000000000050074 restore_gprs+0xc0/0x148
        cfar= c00000000004fe28 dont_restore_vec+0x1c/0x1a4
        lr  = 00000000b52a8184
        msr = 8000000100201030   cr  = 24804888
        ctr = 0000000000000000   xer = 0000000000000000   trap =  700
      
      This moves tm_recheckpoint to a C function and moves the tm_restore_sprs into
      that function.  It then adds IRQ disabling over the trechkpt critical section.
      It also sets the TEXASR FS in the signals code to ensure this is never set now
      that we explicitly write the TM sprs in tm_recheckpoint.
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • powerpc/compat: 32-bit little endian machine name is ppcle, not ppc · 9913ed94
      Anton Blanchard authored
      commit 422b9b96 upstream.
      
      I noticed this when testing setarch. No, we don't magically
      support a big endian userspace on a little endian kernel.
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • mpt2sas: Don't disable device twice at suspend. · e2b70781
      Tyler Stachecki authored
      commit af61e27c upstream.
      
      On suspend, _scsih_suspend calls mpt2sas_base_free_resources, which
      in turn calls pci_disable_device if the device is enabled prior to
      suspending. However, _scsih_suspend also calls pci_disable_device
      itself.
      
      Thus, in the event that the device is enabled prior to suspending,
      pci_disable_device will be called twice. This patch removes the
      duplicate call to pci_disable_device in _scsih_suspend as it is both
      unnecessary and results in a kernel oops.
      Signed-off-by: Tyler Stachecki <tstache1@binghamton.edu>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • virtio-scsi: Skip setting affinity on uninitialized vq · 7b600e97
      Fam Zheng authored
      commit 0c8482ac upstream.
      
      virtscsi_init calls virtscsi_remove_vqs on err, even before initializing
      the vqs. The latter calls virtscsi_set_affinity, so let's check the
      pointer there before setting affinity on it.
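      
      The guard described above can be sketched in plain C. The types and
      names here (demo_vq, demo_queue, demo_set_affinity) are hypothetical
      stand-ins and not the virtio-scsi structures; the sketch only assumes
      that an uninitialized queue is represented by a NULL pointer:

```c
#include <stddef.h>

/* Hypothetical stand-in for a virtqueue with an affinity setting. */
struct demo_vq {
    int affinity_cpu;
};

/* Hypothetical per-queue slot; vq stays NULL until the queue is set up. */
struct demo_queue {
    struct demo_vq *vq;
};

/*
 * Apply an affinity to every initialized queue and skip the rest, so a
 * cleanup path that runs before initialization never dereferences NULL.
 * Returns the number of queues that were updated.
 */
static int demo_set_affinity(struct demo_queue *queues, size_t n, int cpu)
{
    size_t i;
    int touched = 0;

    for (i = 0; i < n; i++) {
        if (!queues[i].vq)
            continue;
        queues[i].vq->affinity_cpu = cpu;
        touched++;
    }
    return touched;
}
```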
      
      This fixes a panic when setting device's num_queues=2 on RHEL 6.5:
      
      qemu-system-x86_64 ... \
      -device virtio-scsi-pci,id=scsi0,addr=0x13,...,num_queues=2 \
      -drive file=/stor/vm/dummy.raw,id=drive-scsi-disk,... \
      -device scsi-hd,drive=drive-scsi-disk,...
      
      [    0.354734] scsi0 : Virtio SCSI HBA
      [    0.379504] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
      [    0.380141] IP: [<ffffffff814741ef>] __virtscsi_set_affinity+0x4f/0x120
      [    0.380141] PGD 0
      [    0.380141] Oops: 0000 [#1] SMP
      [    0.380141] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.14.0+ #5
      [    0.380141] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
      [    0.380141] task: ffff88003c9f0000 ti: ffff88003c9f8000 task.ti: ffff88003c9f8000
      [    0.380141] RIP: 0010:[<ffffffff814741ef>]  [<ffffffff814741ef>] __virtscsi_set_affinity+0x4f/0x120
      [    0.380141] RSP: 0000:ffff88003c9f9c08  EFLAGS: 00010256
      [    0.380141] RAX: 0000000000000000 RBX: ffff88003c3a9d40 RCX: 0000000000001070
      [    0.380141] RDX: 0000000000000002 RSI: 0000000000000000 RDI: 0000000000000000
      [    0.380141] RBP: ffff88003c9f9c28 R08: 00000000000136c0 R09: ffff88003c801c00
      [    0.380141] R10: ffffffff81475229 R11: 0000000000000008 R12: 0000000000000000
      [    0.380141] R13: ffffffff81cc7ca8 R14: ffff88003cac3d40 R15: ffff88003cac37a0
      [    0.380141] FS:  0000000000000000(0000) GS:ffff88003e400000(0000) knlGS:0000000000000000
      [    0.380141] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [    0.380141] CR2: 0000000000000020 CR3: 0000000001c0e000 CR4: 00000000000006f0
      [    0.380141] Stack:
      [    0.380141]  ffff88003c3a9d40 0000000000000000 ffff88003cac3d80 ffff88003cac3d40
      [    0.380141]  ffff88003c9f9c48 ffffffff814742e8 ffff88003c26d000 ffff88003c26d000
      [    0.380141]  ffff88003c9f9c68 ffffffff81474321 ffff88003c26d000 ffff88003c3a9d40
      [    0.380141] Call Trace:
      [    0.380141]  [<ffffffff814742e8>] virtscsi_set_affinity+0x28/0x40
      [    0.380141]  [<ffffffff81474321>] virtscsi_remove_vqs+0x21/0x50
      [    0.380141]  [<ffffffff81475231>] virtscsi_init+0x91/0x240
      [    0.380141]  [<ffffffff81365290>] ? vp_get+0x50/0x70
      [    0.380141]  [<ffffffff81475544>] virtscsi_probe+0xf4/0x280
      [    0.380141]  [<ffffffff81363ea5>] virtio_dev_probe+0xe5/0x140
      [    0.380141]  [<ffffffff8144c669>] driver_probe_device+0x89/0x230
      [    0.380141]  [<ffffffff8144c8ab>] __driver_attach+0x9b/0xa0
      [    0.380141]  [<ffffffff8144c810>] ? driver_probe_device+0x230/0x230
      [    0.380141]  [<ffffffff8144c810>] ? driver_probe_device+0x230/0x230
      [    0.380141]  [<ffffffff8144ac1c>] bus_for_each_dev+0x8c/0xb0
      [    0.380141]  [<ffffffff8144c499>] driver_attach+0x19/0x20
      [    0.380141]  [<ffffffff8144bf28>] bus_add_driver+0x198/0x220
      [    0.380141]  [<ffffffff8144ce9f>] driver_register+0x5f/0xf0
      [    0.380141]  [<ffffffff81d27c91>] ? spi_transport_init+0x79/0x79
      [    0.380141]  [<ffffffff8136403b>] register_virtio_driver+0x1b/0x30
      [    0.380141]  [<ffffffff81d27d19>] init+0x88/0xd6
      [    0.380141]  [<ffffffff81d27c18>] ? scsi_init_procfs+0x5b/0x5b
      [    0.380141]  [<ffffffff81ce88a7>] do_one_initcall+0x7f/0x10a
      [    0.380141]  [<ffffffff81ce8aa7>] kernel_init_freeable+0x14a/0x1de
      [    0.380141]  [<ffffffff81ce8b3b>] ? kernel_init_freeable+0x1de/0x1de
      [    0.380141]  [<ffffffff817dec20>] ? rest_init+0x80/0x80
      [    0.380141]  [<ffffffff817dec29>] kernel_init+0x9/0xf0
      [    0.380141]  [<ffffffff817e68fc>] ret_from_fork+0x7c/0xb0
      [    0.380141]  [<ffffffff817dec20>] ? rest_init+0x80/0x80
      [    0.380141] RIP  [<ffffffff814741ef>] __virtscsi_set_affinity+0x4f/0x120
      [    0.380141]  RSP <ffff88003c9f9c08>
      [    0.380141] CR2: 0000000000000020
      [    0.380141] ---[ end trace 8074b70c3d5e1d73 ]---
      [    0.475018] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
      [    0.475018]
      [    0.475068] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
      [    0.475068] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
      
      [jejb: checkpatch fixes]
      Signed-off-by: Fam Zheng <famz@redhat.com>
      Acked-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: James Bottomley <JBottomley@Parallels.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • MIPS: Hibernate: Flush TLB entries in swsusp_arch_resume() · 3a9d307d
      Huacai Chen authored
      commit c14af233 upstream.
      
      The original MIPS hibernate code flushed cache and TLB entries in
      swsusp_arch_resume(). But that was removed in commit 44eeab67
      (MIPS: Hibernation: Remove SMP TLB and cacheflushing code.). A
      cross-CPU flush is surely unnecessary because all but the local CPU have
      already been disabled. But a local flush (at least the TLB flush) is
      needed. When we do hibernation on Loongson-3 with an E1000E NIC, it is
      very easy to produce a kernel panic (kernel page fault, or unaligned
      access). The root cause is that the E1000E driver uses vzalloc_node() to
      allocate pages, so the stale TLB entries of the booting kernel are
      misused by the resumed target kernel.
      Signed-off-by: default avatarHuacai Chen <chenhc@lemote.com>
      Cc: John Crispin <john@phrozen.org>
      Cc: Steven J. Hill <Steven.Hill@imgtec.com>
      Cc: Aurelien Jarno <aurelien@aurel32.net>
      Cc: linux-mips@linux-mips.org
      Cc: Fuxin Zhang <zhangfx@lemote.com>
      Cc: Zhangjin Wu <wuzhangjin@gmail.com>
      Patchwork: https://patchwork.linux-mips.org/patch/6643/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      3a9d307d
    • James Hogan's avatar
      MIPS: KVM: Pass reserved instruction exceptions to guest · 33e4c53a
      James Hogan authored
      commit 15505679 upstream.
      
      Previously a reserved instruction exception while in guest code would
      cause a KVM internal error if kvm_mips_handle_ri() didn't recognise the
      instruction (including a RDHWR from an unrecognised hardware register).
      
      However the guest OS should really have the opportunity to catch the
      exception so that it can take the appropriate actions such as sending a
      SIGILL to the guest user process or emulating the instruction itself.
      
      Therefore in these cases emulate a guest RI exception and only return
      EMULATE_FAIL if that fails, being careful to revert the PC first in case
      the exception occurred in a branch delay slot in which case the PC will
      already point to the branch target.
      
      Also turn the printk messages relating to these cases into kvm_debug
      messages so that they aren't usually visible.
      
      This allows crashme to run in the guest without killing the entire VM.
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Sanjay Lal <sanjayl@kymasys.com>
      Cc: linux-mips@linux-mips.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      33e4c53a
    • Mark Salter's avatar
      arm: KVM: fix possible misalignment of PGDs and bounce page · 723334b9
      Mark Salter authored
      commit 5d4e08c4 upstream.
      
      The kvm/mmu code shared by arm and arm64 uses kmalloc() to allocate
      a bounce page (if hypervisor init code crosses a page boundary) and
      hypervisor PGDs. The problem is that kmalloc() does not guarantee
      the proper alignment. In the case of the bounce page, the page-sized
      buffer allocated may itself cross a page boundary, negating its purpose
      and leading to a hang during kvm initialization. Likewise, the PGDs
      allocated may not meet the minimum alignment requirements of the
      underlying MMU. This patch uses __get_free_page() to guarantee the
      worst-case alignment needs of the bounce page and PGDs on both arm
      and arm64.
      Signed-off-by: Mark Salter <msalter@redhat.com>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      723334b9
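      The alignment hazard described in the commit above can be illustrated
      with a small user-space check. This is only a sketch:
      crosses_page_boundary() is a hypothetical helper, not kernel code, and
      it assumes a non-zero length and a non-zero page size.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the kernel): returns 1 if the byte
 * range [addr, addr + len) touches more than one page, 0 otherwise.
 * Assumes len > 0 and page_size > 0.
 */
static int crosses_page_boundary(uintptr_t addr, size_t len, size_t page_size)
{
    uintptr_t first_page = addr / page_size;
    uintptr_t last_page = (addr + len - 1) / page_size;

    return first_page != last_page;
}
```

      With 4 KiB pages, a page-sized buffer starting at 0x1010 spans two
      pages, while one starting at 0x2000 stays within a single page. A
      page allocator such as __get_free_page() returns only the second kind.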
    • Haibin Wang's avatar
      KVM: ARM: vgic: Fix sgi dispatch problem · dfcb8cde
      Haibin Wang authored
      commit 91021a6c upstream.
      
      When dispatching an SGI with mode == 0, the vcpu of the VM should send
      the SGI only to the CPUs in the target_cpus list.
      So a "break" must be added to the case 0 branch.
      Signed-off-by: Haibin Wang <wanghaibin.wang@huawei.com>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      dfcb8cde
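      The effect of the missing "break" can be sketched with a simplified
      switch. This is a hypothetical model, not the vgic code: the function
      name, its parameters, and the meaning assigned to modes 1 and 2 are
      assumptions made for illustration, and nr_cpus is assumed to be
      smaller than the width of unsigned int.

```c
/*
 * Simplified, hypothetical model of SGI target selection. The mode
 * numbering (0 = use the target list, 1 = all CPUs but the sender,
 * 2 = the sender only) is an assumption of this sketch.
 */
static unsigned int sgi_target_mask(int mode, unsigned int target_cpus,
                                    unsigned int self_id, unsigned int nr_cpus)
{
    unsigned int all_cpus = (1u << nr_cpus) - 1;

    switch (mode) {
    case 0:
        /* Use the caller's list as-is. Without this break, control
         * would fall through and case 1 would overwrite the list. */
        break;
    case 1:
        target_cpus = all_cpus & ~(1u << self_id);
        break;
    case 2:
        target_cpus = 1u << self_id;
        break;
    default:
        target_cpus = 0;
        break;
    }
    return target_cpus;
}
```

      In this model, mode 0 with a target list of 0x4 yields 0x4. If the
      first break were removed, the same call would return the
      "everyone but the sender" mask instead.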
    • Matthew Daley's avatar
      floppy: don't write kernel-only members to FDRAWCMD ioctl output · 3d43edf5
      Matthew Daley authored
      commit 2145e15e upstream.
      
      Do not leak kernel-only floppy_raw_cmd structure members to userspace.
      This includes the linked-list pointer and the pointer to the allocated
      DMA space.
      Signed-off-by: Matthew Daley <mattd@bugfuzz.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      3d43edf5
    • Matthew Daley's avatar
      floppy: ignore kernel-only members in FDRAWCMD ioctl input · 36cdf95d
      Matthew Daley authored
      commit ef87dbe7 upstream.
      
      Always clear out these floppy_raw_cmd struct members after copying the
      entire structure from userspace so that the in-kernel version is always
      valid and never left in an indeterminate state.
      Signed-off-by: Matthew Daley <mattd@bugfuzz.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      36cdf95d
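      The "copy, then clear the kernel-only members" pattern from the two
      floppy commits above can be sketched in user space as follows. The
      structure layout and the names raw_cmd_sketch and import_raw_cmd() are
      hypothetical, and memcpy() stands in for the kernel's copy from
      userspace.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a command structure that mixes user-visible
 * fields with members only the kernel side should ever set. */
struct raw_cmd_sketch {
    unsigned int flags;            /* user-visible */
    long length;                   /* user-visible */
    void *kernel_data;             /* kernel-only: allocated buffer */
    struct raw_cmd_sketch *next;   /* kernel-only: linked-list pointer */
};

/* Copy a structure supplied by an untrusted caller, then clear the
 * kernel-only members so they never hold caller-chosen values. */
static void import_raw_cmd(struct raw_cmd_sketch *dst,
                           const struct raw_cmd_sketch *src)
{
    memcpy(dst, src, sizeof(*dst));   /* stands in for the user copy */
    dst->kernel_data = NULL;
    dst->next = NULL;
}
```

      The output direction is the mirror image: kernel-only members should be
      left out of, or zeroed in, whatever is copied back to the caller.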
    • Peter Hurley's avatar
      n_tty: Fix n_tty_write crash when echoing in raw mode · 61461fa9
      Peter Hurley authored
      commit 4291086b upstream.
      
      The tty atomic_write_lock does not provide an exclusion guarantee for
      the tty driver if the termios settings are LECHO & !OPOST.  And since
      it is unexpected and not allowed to call TTY buffer helpers like
      tty_insert_flip_string concurrently, this may lead to crashes when
      concurrent writers call pty_write. In that case the following two
      writers:
      * the ECHOing from a workqueue and
      * pty_write from the process
      race and can overflow the corresponding TTY buffer as follows.
      
      If we look into tty_insert_flip_string_fixed_flag, there is:
        int space = __tty_buffer_request_room(port, goal, flags);
        struct tty_buffer *tb = port->buf.tail;
        ...
        memcpy(char_buf_ptr(tb, tb->used), chars, space);
        ...
        tb->used += space;
      
      so the race of the two can result in something like this:
                    A                                B
      __tty_buffer_request_room
                                        __tty_buffer_request_room
      memcpy(buf(tb->used), ...)
      tb->used += space;
                                        memcpy(buf(tb->used), ...) ->BOOM
      
      B's memcpy is past the tty_buffer due to the previous A's tb->used
      increment.
      
      Since the N_TTY line discipline input processing can output
      concurrently with a tty write, obtain the N_TTY ldisc output_lock to
      serialize echo output with normal tty writes.  This ensures the tty
      buffer helper tty_insert_flip_string is not called concurrently and
      everything is fine.
      
      Note that this is nicely reproducible by an ordinary user using
      forkpty and some setup around that (raw termios + ECHO). And it is
      present in kernels at least after commit
      d945cb9c (pty: Rework the pty layer to
      use the normal buffering logic) in 2.6.31-rc3.
      
      js: add more info to the commit log
      js: switch to bool
      js: lock unconditionally
      js: lock only the tty->ops->write call
      
      References: CVE-2014-0196
      Reported-and-tested-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      61461fa9
    • Peter Hurley's avatar
      tty: Fix lockless tty buffer race · 88d123f1
      Peter Hurley authored
      commit 62a0d8d7 upstream.
      
      Commit 6a20dbd6,
      "tty: Fix race condition between __tty_buffer_request_room and flush_to_ldisc"
      correctly identifies an unsafe race condition between
      __tty_buffer_request_room() and flush_to_ldisc(), where the consumer
      flush_to_ldisc() prematurely advances the head before consuming the
      last of the data committed. For example:
      
                 CPU 0                     |            CPU 1
      __tty_buffer_request_room            | flush_to_ldisc
        ...                                |   ...
                                           |   count = head->commit - head->read
        n = tty_buffer_alloc()             |
        b->commit = b->used                |
        b->next = n                        |
                                           |   if (!count)                /* T */
                                           |     if (head->next == NULL)  /* F */
                                           |     buf->head = head->next
      
      In this case, buf->head has been advanced but head->commit may have
      been updated with a new value.
      
      Instead of reintroducing an unnecessary lock, fix the race locklessly.
      Read the commit-next pair in the reverse order of writing, which guarantees
      the commit value read is the latest value written if the head is
      advancing.
      Reported-by: Manfred Schlaegl <manfred.schlaegl@gmx.at>
      Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      88d123f1
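      The "read the commit-next pair in the reverse order of writing" idea
      from the commit above can be sketched with C11 atomics. This is a
      user-space analogue under stated assumptions, not the kernel patch:
      the structure, the function names, and the acquire/release orderings
      are all choices made for this sketch.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical single-producer/single-consumer buffer node. */
struct buf_sketch {
    _Atomic(struct buf_sketch *) next;  /* written last by the producer */
    atomic_int commit;                  /* bytes published by the producer */
    int read;                           /* bytes consumed (consumer-only) */
};

/* Producer: publish the final commit count, then link the new buffer. */
static void producer_link(struct buf_sketch *b, int used, struct buf_sketch *n)
{
    atomic_store_explicit(&b->commit, used, memory_order_relaxed);
    atomic_store_explicit(&b->next, n, memory_order_release);
}

/* Consumer: read next first, commit second (reverse of the write order).
 * Returns 1 only when the head is drained and may be advanced. */
static int consumer_may_advance(struct buf_sketch *head)
{
    struct buf_sketch *next =
        atomic_load_explicit(&head->next, memory_order_acquire);
    int count =
        atomic_load_explicit(&head->commit, memory_order_relaxed) - head->read;

    return count == 0 && next != NULL;
}
```

      Because next is read before commit, a consumer that observes a non-NULL
      next is guaranteed to also observe the last commit value written before
      the link was made, so it cannot skip over unread data.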
    • Michael Welling's avatar
      tty: serial: 8250_core.c Bug fix for Exar chips. · 04d5d946
      Michael Welling authored
      commit b790f210 upstream.
      
      The sleep function was updated to put the serial port to sleep only when necessary.
      This appears to resolve the errant behavior of the driver as described in
      Kernel Bug 61961 – "My Exar Corp. XR17C/D152 Dual PCI UART modem does not
      work with 3.8.0".
      Signed-off-by: Michael Welling <mwelling@ieee.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      04d5d946
    • Tomoki Sekiyama's avatar
      drivers/tty/hvc: don't free hvc_console_setup after init · 50e2278a
      Tomoki Sekiyama authored
      commit 501fed45 upstream.
      
      When 'console=hvc0' is passed as a kernel parameter in an x86 KVM guest,
      the hvc console is set up within a kthread. However, that causes a SEGV
      and the boot fails when the driver is built into the kernel,
      because hvc_console_setup() is currently annotated with '__init'. This
      patch removes '__init' so that the guest boots successfully with
      'console=hvc0'.
      Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      50e2278a
    • Leigh Brown's avatar
      ARM: dts: am335x: update USB DT references · 4dba1cd1
      Leigh Brown authored
      commit a2f8d6b3 upstream.
      
      In "ARM: dts: am33xx: correcting dt node unit address for usb", the
      usb_ctrl_mod and cppi41dma nodes were updated with the correct register
      addresses.  However, the dts files that reference these nodes were not
      updated, and those devices are no longer being enabled.
      
      This patch corrects the references for the affected dts files.
      Signed-off-by: Leigh Brown <leigh@solinno.co.uk>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      Cc: Johan Hovold <jhovold@gmail.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      4dba1cd1
    • Aaron Sanders's avatar
      USB: pl2303: add ids for Hewlett-Packard HP POS pole displays · 623e9d09
      Aaron Sanders authored
      commit b16c02fb upstream.
      
      Add device ids to pl2303 for the Hewlett-Packard HP POS pole displays:
      
      LD960: 03f0:0B39
      LCM220: 03f0:3139
      LCM960: 03f0:3239
      
      [ Johan: fix indentation and sort PIDs numerically ]
      Signed-off-by: Aaron Sanders <aaron.sanders@hp.com>
      Signed-off-by: Johan Hovold <jhovold@gmail.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      623e9d09
    • Julius Werner's avatar
      usb: xhci: Prefer endpoint context dequeue pointer over stopped_trb · 29b56e8c
      Julius Werner authored
      commit 1f81b6d2 upstream.
      
      We have observed a rare cycle state desync bug after Set TR Dequeue
      Pointer commands on Intel LynxPoint xHCs (resulting in an endpoint that
      doesn't fetch new TRBs and thus an unresponsive USB device). It always
      triggers when a previous Set TR Dequeue Pointer command has set the
      pointer to the final Link TRB of a segment, and then another URB gets
      enqueued and cancelled again before it can be completed. Further
      investigation showed that the xHC had returned the Link TRB in the TRB
      Pointer field of the Transfer Event (CC == Stopped -- Length Invalid),
      but when xhci_find_new_dequeue_state() later accesses the Endpoint
      Context's TR Dequeue Pointer field it is set to the first TRB of the
      next segment.
      
      The driver expects those two values to be the same in this situation,
      and uses the cycle state of the latter together with the address of the
      former. This should be fine according to the XHCI specification, since
      the endpoint ring should be stopped when returning the Transfer Event
      and thus should not advance over the Link TRB before it gets restarted.
      However, real-world XHCI implementations apparently don't really care
      that much about these details, so the driver should follow a more
      defensive approach to try to work around HC spec violations.
      
      This patch removes the stopped_trb variable that had been used to store
      the TRB Pointer from the last Transfer Event of a stopped TRB. Instead,
      xhci_find_new_dequeue_state() now relies only on the Endpoint Context,
      requiring a small amount of additional processing to find the virtual
      address corresponding to the TR Dequeue Pointer. Some other parts of the
      function were slightly rearranged to better fit into this model.
      
      This patch should be backported to kernels as old as 2.6.31 that contain
      the commit ae636747 "USB: xhci: URB
      cancellation support."
      Signed-off-by: Julius Werner <jwerner@chromium.org>
      Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      29b56e8c
    • Theodore Ts'o's avatar
      ext4: use i_size_read in ext4_unaligned_aio() · ecb80519
      Theodore Ts'o authored
      commit 6e6358fc upstream.
      
      We haven't taken i_mutex yet, so we need to use i_size_read().
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      ecb80519
    • Theodore Ts'o's avatar
      ext4: move ext4_update_i_disksize() into mpage_map_and_submit_extent() · afeeb034
      Theodore Ts'o authored
      commit 622cad13 upstream.
      
      The function ext4_update_i_disksize() is used in only one place, in
      the function mpage_map_and_submit_extent().  Move its code to simplify
      the code paths, and also move the call to ext4_mark_inode_dirty() into
      the i_data_sem's critical region, to be consistent with all of the
      other places where we update i_disksize.  That way, we also keep the
      raw_inode's i_disksize protected, to avoid the following race:
      
            CPU #1                                 CPU #2
      
         down_write(&i_data_sem)
         Modify i_disk_size
         up_write(&i_data_sem)
                                              down_write(&i_data_sem)
                                              Modify i_disk_size
                                              Copy i_disk_size to on-disk inode
                                              up_write(&i_data_sem)
         Copy i_disk_size to on-disk inode
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      afeeb034
    • Jan Kara's avatar
      ext4: fix jbd2 warning under heavy xattr load · 44b9a5ad
      Jan Kara authored
      commit ec4cb1aa upstream.
      
      When heavily exercising xattr code the assertion that
      jbd2_journal_dirty_metadata() shouldn't return error was triggered:
      
      WARNING: at /srv/autobuild-ceph/gitbuilder.git/build/fs/jbd2/transaction.c:1237
      jbd2_journal_dirty_metadata+0x1ba/0x260()
      
      CPU: 0 PID: 8877 Comm: ceph-osd Tainted: G    W 3.10.0-ceph-00049-g68d04c9 #1
      Hardware name: Dell Inc. PowerEdge R410/01V648, BIOS 1.6.3 02/07/2011
       ffffffff81a1d3c8 ffff880214469928 ffffffff816311b0 ffff880214469968
       ffffffff8103fae0 ffff880214469958 ffff880170a9dc30 ffff8802240fbe80
       0000000000000000 ffff88020b366000 ffff8802256e7510 ffff880214469978
      Call Trace:
       [<ffffffff816311b0>] dump_stack+0x19/0x1b
       [<ffffffff8103fae0>] warn_slowpath_common+0x70/0xa0
       [<ffffffff8103fb2a>] warn_slowpath_null+0x1a/0x20
       [<ffffffff81267c2a>] jbd2_journal_dirty_metadata+0x1ba/0x260
       [<ffffffff81245093>] __ext4_handle_dirty_metadata+0xa3/0x140
       [<ffffffff812561f3>] ext4_xattr_release_block+0x103/0x1f0
       [<ffffffff81256680>] ext4_xattr_block_set+0x1e0/0x910
       [<ffffffff8125795b>] ext4_xattr_set_handle+0x38b/0x4a0
       [<ffffffff810a319d>] ? trace_hardirqs_on+0xd/0x10
       [<ffffffff81257b32>] ext4_xattr_set+0xc2/0x140
       [<ffffffff81258547>] ext4_xattr_user_set+0x47/0x50
       [<ffffffff811935ce>] generic_setxattr+0x6e/0x90
       [<ffffffff81193ecb>] __vfs_setxattr_noperm+0x7b/0x1c0
       [<ffffffff811940d4>] vfs_setxattr+0xc4/0xd0
       [<ffffffff8119421e>] setxattr+0x13e/0x1e0
       [<ffffffff811719c7>] ? __sb_start_write+0xe7/0x1b0
       [<ffffffff8118f2e8>] ? mnt_want_write_file+0x28/0x60
       [<ffffffff8118c65c>] ? fget_light+0x3c/0x130
       [<ffffffff8118f2e8>] ? mnt_want_write_file+0x28/0x60
       [<ffffffff8118f1f8>] ? __mnt_want_write+0x58/0x70
       [<ffffffff811946be>] SyS_fsetxattr+0xbe/0x100
       [<ffffffff816407c2>] system_call_fastpath+0x16/0x1b
      
      The reason for the warning is that buffer_head passed into
      jbd2_journal_dirty_metadata() didn't have journal_head attached. This is
      caused by the following race of two ext4_xattr_release_block() calls:
      
      CPU1                                CPU2
      ext4_xattr_release_block()          ext4_xattr_release_block()
      lock_buffer(bh);
      /* False */
      if (BHDR(bh)->h_refcount == cpu_to_le32(1))
      } else {
        le32_add_cpu(&BHDR(bh)->h_refcount, -1);
        unlock_buffer(bh);
                                          lock_buffer(bh);
                                          /* True */
                                          if (BHDR(bh)->h_refcount == cpu_to_le32(1))
                                            get_bh(bh);
                                            ext4_free_blocks()
                                              ...
                                              jbd2_journal_forget()
                                                jbd2_journal_unfile_buffer()
                                                -> JH is gone
        error = ext4_handle_dirty_xattr_block(handle, inode, bh);
        -> triggers the warning
      
      We fix the problem by moving ext4_handle_dirty_xattr_block() under the
      buffer lock. Sadly this cannot be done in nojournal mode as that
      function can call sync_dirty_buffer() which would deadlock. Luckily in
      nojournal mode the race is harmless (we only dirty already freed buffer)
      and thus for nojournal mode we leave the dirtying outside of the buffer
      lock.
      Reported-by: Sage Weil <sage@inktank.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      44b9a5ad
    • Matthew Wilcox's avatar
      ext4: note the error in ext4_end_bio() · 92235297
      Matthew Wilcox authored
      commit 9503c67c upstream.
      
      ext4_end_bio() currently throws away the error that it receives.  Chances
      are this is part of a spate of errors, one of which will end up getting
      the error returned to userspace somehow, but we shouldn't take that risk.
      Also print out the errno to aid in debug.
      Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      92235297
    • Kazuya Mio's avatar
      ext4: FIBMAP ioctl causes BUG_ON due to handle EXT_MAX_BLOCKS · be687f34
      Kazuya Mio authored
      commit 4adb6ab3 upstream.
      
      When we try to get block 2^32-1 of a file which has the extent
      (ee_block=2^32-2, ee_len=1) with the FIBMAP ioctl, it causes a BUG_ON
      in ext4_ext_put_gap_in_cache().
      
      To avoid the problem, ext4_map_blocks() needs to check the file logical block
      number. ext4_ext_put_gap_in_cache() called via ext4_map_blocks() cannot
      handle 2^32-1 because the maximum file logical block number is 2^32-2.
      
      Note that ext4_ind_map_blocks() returns -EIO when the block number is invalid.
      So ext4_map_blocks() should also return the same errno.
      Signed-off-by: Kazuya Mio <k-mio@sx.jp.nec.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      be687f34
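      The range check described in the commit above can be sketched as
      follows. The names EXT_MAX_BLOCKS_SKETCH and check_logical_block() are
      hypothetical. The limit value is inferred from the commit message's
      statement that the maximum logical block number is 2^32-2, and is an
      assumption of this sketch rather than a quote from the ext4 headers.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed value for this sketch: one past the largest valid logical
 * block number, which the commit message gives as 2^32-2. */
#define EXT_MAX_BLOCKS_SKETCH 0xffffffffu

/* Hypothetical range check: returns 0 for a mappable logical block,
 * -EIO for a block number that is out of range. */
static int check_logical_block(uint32_t lblk)
{
    if (lblk >= EXT_MAX_BLOCKS_SKETCH)
        return -EIO;
    return 0;
}
```

      Rejecting the value up front means the later gap-caching code never
      sees a block number it cannot represent.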
    • Krzysztof Kozlowski's avatar
      clk: s2mps11: Fix possible NULL pointer dereference · eb41a94f
      Krzysztof Kozlowski authored
      commit 238e1405 upstream.
      
      If the parent device does not have of_node set, s2mps11_clk_parse_dt()
      returned NULL. This NULL was later passed to of_clk_add_provider(), which
      dereferenced it in a pr_debug() call.
      Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
      Signed-off-by: Mike Turquette <mturquette@linaro.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      eb41a94f
    • Tetsuo Handa's avatar
      ocfs2: fix panic on kfree(xattr->name) · 6c2d09ee
      Tetsuo Handa authored
      commit f81c2015 upstream.
      
      Commit 9548906b ('xattr: Constify ->name member of "struct xattr"')
      missed that ocfs2 is calling kfree(xattr->name).  As a result, a kernel
      panic occurs upon calling kfree(xattr->name) because xattr->name refers
      to static constant names.  This patch removes kfree(xattr->name) from
      ocfs2_mknod() and ocfs2_symlink().
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reported-by: Tariq Saeed <tariq.x.saeed@oracle.com>
      Tested-by: Tariq Saeed <tariq.x.saeed@oracle.com>
      Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      6c2d09ee
    • alex chen's avatar
      ocfs2: do not put bh when buffer_uptodate failed · 26961cab
      alex chen authored
      commit f7cf4f5b upstream.
      
      Do not put bh when buffer_uptodate fails in ocfs2_write_block and
      ocfs2_write_super_or_backup, because bh is already put in b_end_io.
      Otherwise a warning "VFS: brelse: Trying to free free
      buffer" will be hit.
      Signed-off-by: Alex Chen <alex.chen@huawei.com>
      Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
      Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Acked-by: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      26961cab
    • Junxiao Bi's avatar
      ocfs2: dlm: fix recovery hung · a409d747
      Junxiao Bi authored
      commit ded2cf71 upstream.
      
      There is a race window in dlm_do_recovery() between dlm_remaster_locks()
      and dlm_reset_recovery() when the recovery master has nearly finished the
      recovery process for a dead node.  After the master sends the FINALIZE_RECO
      message in dlm_remaster_locks(), another node may become the recovery
      master for another dead node and then send the BEGIN_RECO message to
      all the nodes, including the old master.  In the old master's handler for
      this message, dlm_begin_reco_handler(), dlm->reco.dead_node and
      dlm->reco.new_master will be set to the second dead node and the new
      master; then, in dlm_reset_recovery(), these two variables will be reset
      to their default values.  As a result, the new recovery master cannot
      finish the recovery process and hangs, and eventually the whole cluster
      hangs in recovery.
      
      old recovery master:                                 new recovery master:
      dlm_remaster_locks()
                                                        become recovery master for
                                                        another dead node.
                                                        dlm_send_begin_reco_message()
      dlm_begin_reco_handler()
      {
       if (dlm->reco.state & DLM_RECO_STATE_FINALIZE) {
        return -EAGAIN;
       }
       dlm_set_reco_master(dlm, br->node_idx);
       dlm_set_reco_dead_node(dlm, br->dead_node);
      }
      dlm_reset_recovery()
      {
       dlm_set_reco_dead_node(dlm, O2NM_INVALID_NODE_NUM);
       dlm_set_reco_master(dlm, O2NM_INVALID_NODE_NUM);
      }
                                                        will hang in dlm_remaster_locks() for
                                                        request dlm locks info
      
      Before sending the FINALIZE_RECO message, the recovery master should set
      DLM_RECO_STATE_FINALIZE for itself and clear it after the recovery is done.
      This closes the race window, as BEGIN_RECO messages will not be
      handled before the DLM_RECO_STATE_FINALIZE flag is cleared.
      
      A similar race may happen between the new recovery master and a normal node
      which is in dlm_finalize_reco_handler(); fix that as well.
      Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
      Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
      Reviewed-by: Wengang Wang <wen.gang.wang@oracle.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      a409d747
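      The gating on the FINALIZE flag in the commit above can be sketched as
      a tiny state check. Everything here is hypothetical: the structure, the
      function name, and the numeric flag value are assumptions of this
      sketch, and only the -EAGAIN return on a set flag is taken from the
      diagram in the commit message.

```c
#include <errno.h>

/* Flag value is an assumption of this sketch, not taken from the source. */
#define RECO_STATE_FINALIZE_SKETCH 0x2u

struct reco_sketch {
    unsigned int state;
    int new_master;
    int dead_node;
};

/* Hypothetical BEGIN_RECO handler: refuse to touch the recovery fields
 * while this node is still finalizing a previous recovery. */
static int begin_reco_handler_sketch(struct reco_sketch *r,
                                     int node_idx, int dead_node)
{
    if (r->state & RECO_STATE_FINALIZE_SKETCH)
        return -EAGAIN;   /* fields stay untouched while finalizing */
    r->new_master = node_idx;
    r->dead_node = dead_node;
    return 0;
}
```

      While the flag is set, the old master's recovery fields cannot be
      overwritten by a new BEGIN_RECO message, so the later reset cannot wipe
      out state that belongs to the new recovery.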
    • Junxiao Bi's avatar
      ocfs2: dlm: fix lock migration crash · 7cb96132
      Junxiao Bi authored
      commit 34aa8dac upstream.
      
      This issue was introduced by commit 800deef3 ("ocfs2: use
      list_for_each_entry where benefical") in 2007, which replaced
      list_for_each with list_for_each_entry.  The variable "lock" will point
      to invalid data if the "tmpq" list is empty, and a panic will be triggered
      as a result.  Sunil advised reverting it, but the old version was
      also not right.  At the end of the outer for loop, that
      list_for_each_entry will also set "lock" to invalid data; then, in the
      next iteration, if the "tmpq" list is empty, "lock" will be stale invalid
      data and cause the panic.  So revert to list_for_each and reset
      "lock" to NULL to fix this issue.
      
      Another concern is that this seemingly cannot happen, because the "tmpq"
      list should not be empty.  Let me describe how it can.
      
      old lock resource owner(node 1):                                  migration target(node 2):
      imagine there's a lockres with an EX lock from node 2 in the
      granted list, and an NR lock from node x with convert_type
      EX in the converting list.
      dlm_empty_lockres() {
       dlm_pick_migration_target() {
         pick node 2 as target as its lock is the first one
         in granted list.
       }
       dlm_migrate_lockres() {
         dlm_mark_lockres_migrating() {
           res->state |= DLM_LOCK_RES_BLOCK_DIRTY;
           wait_event(dlm->ast_wq, !dlm_lockres_is_dirty(dlm, res));
      	 //after the above code, we can not dirty lockres any more,
           // so dlm_thread shuffle list will not run
                                                                         downconvert lock from EX to NR
                                                                         upconvert lock from NR to EX
      <<< migration may schedule out here, then
      <<< node 2 send down convert request to convert type from EX to
      <<< NR, then send up convert request to convert type from NR to
      <<< EX, at this time, lockres granted list is empty, and two locks
      <<< in the converting list, node x up convert lock followed by
      <<< node 2 up convert lock.
      
      	 // will set lockres RES_MIGRATING flag, the following
      	 // lock/unlock can not run
           dlm_lockres_release_ast(dlm, res);
         }
      
         dlm_send_one_lockres()
                                                                       dlm_process_recovery_data()
                                                                         for (i=0; i<mres->num_locks; i++)
                                                                           if (ml->node == dlm->node_num)
                                                                             for (j = DLM_GRANTED_LIST; j <= DLM_BLOCKED_LIST; j++) {
                                                                              list_for_each_entry(lock, tmpq, list)
                                                                              if (lock) break; <<< lock is invalid as grant list is empty.
                                                                             }
                                                                             if (lock->ml.node != ml->node)
                                                                               BUG() >>> crash here
       }
      
      I saw the above lock status in a vmcore from an internal bug of ours.
      Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
      Reviewed-by: Wengang Wang <wen.gang.wang@oracle.com>
      Cc: Sunil Mushran <sunil.mushran@gmail.com>
      Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      7cb96132
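      The "iterate raw list nodes and reset the cursor to NULL" pattern from
      the commit above can be sketched in user space. The list helpers below
      are minimal stand-ins written for this sketch (they are not the
      kernel's list.h), and lock_sketch and find_lock() are hypothetical
      names.

```c
#include <stddef.h>

/* Minimal user-space stand-ins for a circular doubly linked list. */
struct list_head { struct list_head *next, *prev; };

#define container_of_sketch(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct lock_sketch {
    int node;
    struct list_head list;
};

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail_sketch(struct list_head *entry, struct list_head *h)
{
    entry->prev = h->prev;
    entry->next = h;
    h->prev->next = entry;
    h->prev = entry;
}

/* Walk the raw list nodes and only assign the result on a match, so an
 * empty queue (or no match) yields NULL rather than a pointer computed
 * from the list head itself. */
static struct lock_sketch *find_lock(struct list_head *queue, int node)
{
    struct lock_sketch *lock = NULL;   /* reset before every search */
    struct list_head *iter;

    for (iter = queue->next; iter != queue; iter = iter->next) {
        struct lock_sketch *cand =
            container_of_sketch(iter, struct lock_sketch, list);
        if (cand->node == node) {
            lock = cand;
            break;
        }
    }
    return lock;
}
```

      With an entry-style iterator, the cursor after walking an empty list is
      derived from the list head and is never NULL, which is why a later
      "if (lock)" test cannot detect the empty case.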