1. 13 Dec, 2015 40 commits
    • proc: actually make proc_fd_permission() thread-friendly · 720ae734
      Oleg Nesterov authored
      commit 54708d28 upstream.
      
      The commit 96d0df79 ("proc: make proc_fd_permission() thread-friendly")
      fixed the access to /proc/self/fd from sub-threads, but introduced another
      problem: a sub-thread can't access /proc/<tid>/fd/ or /proc/thread-self/fd
      if generic_permission() fails.
      
      Change proc_fd_permission() to check same_thread_group(pid_task(), current).
      
      Fixes: 96d0df79 ("proc: make proc_fd_permission() thread-friendly")
      Reported-by: "Jin, Yihua" <yihua.jin@intel.com>
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
    • Input: elantech - add Fujitsu Lifebook U745 to force crc_enabled · edd540a6
      Takashi Iwai authored
      commit 60603950 upstream.
      
      Another Lifebook machine that needs the same quirk as other similar
      models to make the driver work.

      Bugzilla: https://bugzilla.opensuse.org/show_bug.cgi?id=883192
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
    • mm: slab: only move management objects off-slab for sizes larger than KMALLOC_MIN_SIZE · ca317d04
      Catalin Marinas authored
      commit d4322d88 upstream.
      
      On systems with a KMALLOC_MIN_SIZE of 128 (arm64, some mips and powerpc
      configurations defining ARCH_DMA_MINALIGN to 128), the first
      kmalloc_caches[] entry to be initialised after slab_early_init = 0 is
      "kmalloc-128" with index 7.  Depending on the debug kernel configuration,
      sizeof(struct kmem_cache) can be larger than 128 resulting in an
      INDEX_NODE of 8.
      
      Commit 8fc9cf42 ("slab: make more slab management structure off the
      slab") enables off-slab management objects for sizes starting with
      PAGE_SIZE >> 5 (128 bytes for a 4KB page configuration) and the creation
      of the "kmalloc-128" cache would try to place the management objects
      off-slab.  However, since KMALLOC_MIN_SIZE is already 128 and
      freelist_size == 32 in __kmem_cache_create(), kmalloc_slab(freelist_size)
      returns NULL (kmalloc_caches[7] not populated yet).  This triggers the
      following bug on arm64:
      
        kernel BUG at /work/Linux/linux-2.6-aarch64/mm/slab.c:2283!
        Internal error: Oops - BUG: 0 [#1] SMP
        Modules linked in:
        CPU: 0 PID: 0 Comm: swapper Not tainted 4.3.0-rc4+ #540
        Hardware name: Juno (DT)
        PC is at __kmem_cache_create+0x21c/0x280
        LR is at __kmem_cache_create+0x210/0x280
        [...]
        Call trace:
          __kmem_cache_create+0x21c/0x280
          create_boot_cache+0x48/0x80
          create_kmalloc_cache+0x50/0x88
          create_kmalloc_caches+0x4c/0xf4
          kmem_cache_init+0x100/0x118
          start_kernel+0x214/0x33c
      
      This patch introduces an OFF_SLAB_MIN_SIZE definition to avoid off-slab
      management objects for sizes equal to or smaller than KMALLOC_MIN_SIZE.
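The threshold described above can be modelled in a few lines of userspace C. This is only a sketch: the helper names `off_slab_min_size()` and `may_go_off_slab()` are hypothetical, the page size and KMALLOC_MIN_SIZE are passed as plain parameters, and the exact form of the kernel's OFF_SLAB_MIN_SIZE definition may differ.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: smallest object size that may keep its management
 * data off-slab. It is the larger of page_size >> 5 and one byte above
 * kmalloc_min_size, so the smallest kmalloc cache never depends on a
 * smaller cache that has not been created yet. */
static size_t off_slab_min_size(size_t page_size, size_t kmalloc_min_size)
{
    size_t by_page = page_size >> 5;
    size_t by_kmalloc = kmalloc_min_size + 1;

    return by_page > by_kmalloc ? by_page : by_kmalloc;
}

/* True if a cache with objects of `size` bytes may go off-slab. */
static bool may_go_off_slab(size_t size, size_t page_size,
                            size_t kmalloc_min_size)
{
    return size >= off_slab_min_size(page_size, kmalloc_min_size);
}
```

With a 4KB page and a KMALLOC_MIN_SIZE of 128, a 128-byte cache stays on-slab in this model, while a 256-byte cache may go off-slab.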
      
      Fixes: 8fc9cf42 ("slab: make more slab management structure off the slab")
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Acked-by: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
    • scsi: restart list search after unlock in scsi_remove_target · 0c422a84
      Christoph Hellwig authored
      commit 40998193 upstream.
      
      When dropping a lock while iterating a list we must restart the search
      as other threads could have manipulated the list under us.  Without this
      we can get stuck in an endless loop.  This bug was introduced by
      
      commit bc3f02a7
      Author: Dan Williams <djbw@fb.com>
      Date:   Tue Aug 28 22:12:10 2012 -0700
      
          [SCSI] scsi_remove_target: fix softlockup regression on hot remove
      
      Which was itself trying to fix a reported soft lockup issue
      
      http://thread.gmane.org/gmane.linux.kernel/1348679
      
      However, we believe even with this revert of the original patch, the soft
      lockup problem has been fixed by
      
      commit f2495e22
      Author: James Bottomley <JBottomley@Parallels.com>
      Date:   Tue Jan 21 07:01:41 2014 -0800
      
          [SCSI] dual scan thread bug fix
      
      Thanks go to Dan Williams <dan.j.williams@intel.com> for tracking all this
      prior history down.
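The restart-after-unlock pattern this commit describes can be sketched in standalone C. Everything here is illustrative: `struct node`, `list_lock()`, `list_unlock()`, `process_unlocked()` and `remove_matching()` are hypothetical names, and the "lock" is only a counter, not a real spinlock.

```c
#include <stddef.h>

struct node {
    struct node *next;
    int id;
    int dead;   /* set once the node has been processed */
};

/* Stand-ins for a real lock; here they only track the nesting depth. */
static int lock_depth;
static void list_lock(void)   { lock_depth++; }
static void list_unlock(void) { lock_depth--; }

/* Work that has to run with the lock dropped (it might sleep). */
static void process_unlocked(struct node *n) { n->dead = 1; }

/* Process every live node with the given id; returns how many. */
static int remove_matching(struct node *head, int id)
{
    int processed = 0;
    struct node *n;

restart:
    list_lock();
    for (n = head; n != NULL; n = n->next) {
        if (n->id != id || n->dead)
            continue;
        list_unlock();
        process_unlocked(n);
        processed++;
        /* The list may have changed while unlocked, so n->next can
         * no longer be trusted: search again from the head. */
        goto restart;
    }
    list_unlock();
    return processed;
}
```

Each pass marks one node as dead before restarting, so the loop terminates once no live match remains.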
      Reported-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Tested-by: Johannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Fixes: bc3f02a7
      Signed-off-by: James Bottomley <JBottomley@Odin.com>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
    • firewire: ohci: fix JMicron JMB38x IT context discovery · bf60a52f
      Stefan Richter authored
      commit 100ceb66 upstream.
      
      Reported by Clifford and Craig for JMicron OHCI-1394 + SDHCI combo
      controllers:  Often or even most of the time, the controller is
      initialized with the message "added OHCI v1.10 device as card 0, 4 IR +
      0 IT contexts, quirks 0x10".  With 0 isochronous transmit DMA contexts
      (IT contexts), applications like audio output are impossible.
      
      However, OHCI-1394 demands that at least 4 IT contexts are implemented
      by the link layer controller, and indeed JMicron JMB38x do implement
      four of them.  Only their IsoXmitIntMask register is unreliable at early
      access.
      
      With my own JMB381 single function controller I found:
        - I can reproduce the problem with a lower probability than Craig's.
        - If I put a loop around the section which clears and reads
          IsoXmitIntMask, then either the first or the second attempt will
          return the correct initial mask of 0x0000000f.  I never encountered
          a case of needing more than a second attempt.
        - Consequently, if I put a dummy reg_read(...IsoXmitIntMaskSet)
          before the first write, the subsequent read will return the correct
          result.
        - If I merely ignore a wrong read result and force the known real
          result, later isochronous transmit DMA usage works just fine.
      
      So let's just fix this chip bug up by the latter method.  Tested with
      JMB381 on kernel 3.13 and 4.3.
      
      Since OHCI-1394 generally requires 4 IT contexts at a minimum, this
      workaround is simply applied whenever the initial read of IsoXmitIntMask
      returns 0, regardless whether it's a JMicron chip or not.  I never heard
      of this issue together with any other chip though.
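The workaround amounts to substituting the known minimum mask when the first read comes back as zero. A minimal sketch, assuming hypothetical helper names (`fixup_it_context_mask()`, `count_contexts()`) that are not taken from the driver:

```c
#include <stdint.h>

/* OHCI-1394 requires at least four IT contexts, i.e. mask bits 0..3. */
#define MIN_IT_CONTEXT_MASK 0x0000000fu

/* Hypothetical helper: replace an implausible all-zero value read from
 * IsoXmitIntMask with the minimum mask that the specification implies. */
static uint32_t fixup_it_context_mask(uint32_t mask_read)
{
    return mask_read ? mask_read : MIN_IT_CONTEXT_MASK;
}

/* The number of contexts equals the number of set bits in the mask. */
static unsigned int count_contexts(uint32_t mask)
{
    unsigned int n = 0;

    for (; mask; mask &= mask - 1)
        n++;
    return n;
}
```

A non-zero read is passed through unchanged, so controllers that report their mask correctly are unaffected.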
      
      I am not 100% sure that this fix works on the OHCI-1394 part of JMB380
      and JMB388 combo controllers exactly the same as on the JMB381 single-
      function controller, but so far I haven't had a chance to let an owner
      of a combo chip run a patched kernel.
      
      Strangely enough, IsoRecvIntMask is always reported correctly, even
      though it is probed right before IsoXmitIntMask.
      
      Reported-by: Clifford Dunn
      Reported-by: Craig Moore <craig.moore@qenos.com>
      Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
    • ALSA: hda - Add Intel Lewisburg device IDs Audio · 2bec64a0
      Alexandra Yates authored
      commit 5cf92c8b upstream.
      
      Adding Intel codename Lewisburg platform device IDs for audio.
      
      [rearranged the position by tiwai]
      Signed-off-by: Alexandra Yates <alexandra.yates@linux.intel.com>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
    • ALSA: hda - Apply pin fixup for HP ProBook 6550b · 9927f021
      Takashi Iwai authored
      commit c932b98c upstream.
      
      HP ProBook 6550b needs the same pin fixup that is applied to other HP
      B-series laptops with docks to make its headphone and dock headphone
      jacks work properly.  We just need to add the codec SSID to the list.

      Bugzilla: https://bugzilla.kernel.org/attachment.cgi?id=191971
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
    • thermal: exynos: Fix unbalanced regulator disable on probe failure · b6f74a6e
      Krzysztof Kozlowski authored
      commit 824ead03 upstream.
      
      During probe if the regulator could not be enabled, the error exit path
      would still disable it. This could lead to unbalanced counter of
      regulator enable/disable.
      
      The patch moves code for getting and enabling the regulator from
      exynos_map_dt_data() to probe function because it is really not a part
      of getting Device Tree properties.
      Acked-by: Lukasz Majewski <l.majewski@samsung.com>
      Tested-by: Lukasz Majewski <l.majewski@samsung.com>
      Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
      Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
      Fixes: 5f09a5cb ("thermal: exynos: Disable the regulator on probe failure")
      Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
      [ luis: backported to 3.16: adjusted context ]
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
    • KVM: VMX: fix SMEP and SMAP without EPT · 2628ad5d
      Radim Krčmář authored
      commit 656ec4a4 upstream.
      
      The comment in code had it mostly right, but we enable paging for
      emulated real mode regardless of EPT.
      
      Without EPT (which implies emulated real mode), secondary VCPUs won't
      start unless we disable SM[AE]P when the guest doesn't use paging.
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
    • recordmcount: Fix endianness handling bug for nop_mcount · 811f0bd9
      libin authored
      commit c84da8b9 upstream.
      
      In nop_mcount, shdr->sh_offset and welp->r_offset should handle
      endianness properly, otherwise it will trigger Segmentation fault
      if the recordmcount main and file.o have different endianness.
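To illustrate the kind of conversion involved, the sketch below byte-swaps a 32-bit field only when the object file's byte order differs from the host's. The names `swap32()` and `file_to_host32()` are hypothetical and are not the accessors recordmcount itself uses.

```c
#include <stdint.h>

/* Unconditionally reverse the byte order of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Hypothetical accessor: convert a raw 32-bit field read from an object
 * file into host byte order. file_is_foreign_endian is nonzero when the
 * file's byte order differs from the host's. */
static uint32_t file_to_host32(uint32_t raw, int file_is_foreign_endian)
{
    return file_is_foreign_endian ? swap32(raw) : raw;
}
```

Using a raw, unconverted offset from a foreign-endian file as an index into the mapped file is what leads to out-of-range accesses of the kind the commit describes.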
      
      Link: http://lkml.kernel.org/r/563806C7.7070606@huawei.com
      Signed-off-by: Li Bin <huawei.libin@huawei.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
    • xtensa: fix secondary core boot in SMP · fd5507ba
      Max Filippov authored
      commit ab45fb14 upstream.
      
      There are multiple factors adding to the issue in different
      configurations:
      
      - commit 17290231 ("xtensa: add fixup for double exception raised
        in window overflow") added function window_overflow_restore_a0_fixup to
        double exception vector overlapping reset vector location of secondary
        processor cores.
      - on MMUv2 cores RESET_VECTOR1_VADDR may point to uncached kernel memory
        making code overlapping depend on cache type and size, so that without
        cache or with WT cache reset vector code overwrites double exception
        code, making issue even harder to detect.
      - on MMUv3 cores RESET_VECTOR1_VADDR may point to unmapped area, as
        MMUv3 cores change virtual address map to match MMUv2 layout, but
        reset vector virtual address is given for the original MMUv3 mapping.
      - physical memory region of the secondary reset vector is not reserved
        in the physical memory map, and thus may be allocated and overwritten
        at arbitrary moment.
      
      Fix it as follows:
      
      - move window_overflow_restore_a0_fixup code to .text section.
      - define RESET_VECTOR1_VADDR so that it points to reset vector in the
        cacheable MMUv2 map for cores with MMU.
      - reserve reset vector region in the physical memory map. Drop separate
        literal section and build mxhead.S with text section literals.
      Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
    • mac80211: allow null chandef in tracing · 50025953
      Arik Nemtsov authored
      commit 254d3dfe upstream.
      
      In TDLS channel-switch operations the chandef can sometimes be NULL.
      Avoid an oops in the trace code for these cases and just print a
      chandef full of zeros.
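A NULL-tolerant trace helper could look roughly like the following. The structures and the function `trace_chandef()` are simplified, hypothetical stand-ins and do not mirror the real mac80211 trace macros.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical channel definition. */
struct chandef_sketch {
    uint32_t center_freq;
    uint32_t width;
};

/* Fields that a trace point would record. */
struct chandef_trace {
    uint32_t center_freq;
    uint32_t width;
};

/* Record the chandef, or all zeros when the caller passes NULL. */
static struct chandef_trace trace_chandef(const struct chandef_sketch *c)
{
    struct chandef_trace t = { 0, 0 };

    if (c != NULL) {
        t.center_freq = c->center_freq;
        t.width = c->width;
    }
    return t;
}
```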
      
      Fixes: a7a6bdd0 ("mac80211: introduce TDLS channel switch ops")
      Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com>
      Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
    • mac80211: fix divide by zero when NOA update · f5e1cc75
      Janusz.Dziedzic@tieto.com authored
      commit 519ee691 upstream.
      
      In case of one shot NOA the interval can be 0, catch that
      instead of potentially (depending on the driver) crashing
      like this:
      
      divide error: 0000 [#1] SMP
      [...]
      Call Trace:
      <IRQ>
      [<ffffffffc08e891c>] ieee80211_extend_absent_time+0x6c/0xb0 [mac80211]
      [<ffffffffc08e8a17>] ieee80211_update_p2p_noa+0xb7/0xe0 [mac80211]
      [<ffffffffc069cc30>] ath9k_p2p_ps_timer+0x170/0x190 [ath9k]
      [<ffffffffc070adf8>] ath_gen_timer_isr+0xc8/0xf0 [ath9k_hw]
      [<ffffffffc0691156>] ath9k_tasklet+0x296/0x2f0 [ath9k]
      [<ffffffff8107ad65>] tasklet_action+0xe5/0xf0
      [...]
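The guard boils down to checking the interval before dividing. A minimal sketch, where `elapsed_intervals()` is a hypothetical helper and not the mac80211 function named in the trace:

```c
#include <stdint.h>

/* Hypothetical helper: number of whole NoA intervals elapsed between
 * `start` and `now`. A one-shot NoA has interval == 0; return 0 for it
 * instead of dividing by zero. */
static uint32_t elapsed_intervals(uint32_t start, uint32_t interval,
                                  uint32_t now)
{
    if (interval == 0)
        return 0;
    return (now - start) / interval;
}
```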
      Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
    • megaraid_sas : SMAP restriction--do not access user memory from IOCTL code · bdd07ab7
      sumit.saxena@avagotech.com authored
      commit 323c4a02 upstream.
      
      This is an issue on SMAP enabled CPUs and 32 bit apps running on 64 bit
      OS. Do not access user memory from kernel code. The SMAP bit restricts
      accessing user memory from kernel code.
      Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com>
      Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com>
      Reviewed-by: Tomas Henzl <thenzl@redhat.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
    • xtensa: fixes for configs without loop option · 84e3da3b
      Max Filippov authored
      commit 5029615e upstream.
      
      Build-time fixes:
      - make lbeg/lend/lcount save/restore conditional on kernel entry;
      - don't clear lcount in platform_restart functions unconditionally.
      
      Run-time fixes:
      - use correct end of range register in __endla paired with __loopt, not
        the unused temporary register. This fixes .bss zero-initialization.
        Update comments in asmmacro.h;
      - don't clobber a10 in the usercopy that leads to access to unmapped
        memory.
      Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
      [ luis: backported to 3.16: adjusted context ]
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
    • crypto: algif_hash - Only export and import on sockets with data · 214fcf79
      Herbert Xu authored
      commit 4afa5f96 upstream.
      
      The hash_accept call fails to work on sockets that have not received
      any data.  For some algorithm implementations it may cause crashes.
      
      This patch fixes this by ensuring that we only export and import on
      sockets that have received data.
      Reported-by: Harsh Jain <harshjain.prof@gmail.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Tested-by: Stephan Mueller <smueller@chronox.de>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
    • Revert "dm mpath: fix stalls when handling invalid ioctls" · f5699565
      Mauricio Faria de Oliveira authored
      commit 47796938 upstream.
      
      This reverts commit a1989b33.
      
      That commit introduced a regression at least for the case of the SG_IO ioctl()
      running without CAP_SYS_RAWIO capability (e.g., unprivileged users) when there
      are no active paths: the ioctl() fails with the ENOTTY errno immediately rather
      than blocking due to queue_if_no_path until a path becomes active, for example.
      
      That case happens to be exercised by QEMU KVM guests with 'scsi-block' devices
      (qemu "-device scsi-block" [1], libvirt "<disk type='block' device='lun'>" [2])
      from multipath devices; which leads to SCSI/filesystem errors in such a guest.
      
      More general scenarios can hit that regression too. The following demonstration
      employs a SG_IO ioctl() with a standard SCSI INQUIRY command for this objective
      (some output & user changes omitted for brevity and comments added for clarity).
      
      Reverting that commit restores normal operation (queueing) in failing scenarios;
      tested on linux-next (next-20151022).
      
      1) Test-case is based on sg_simple0 [3] (just SG_IO; remove SG_GET_VERSION_NUM)
      
          $ cat sg_simple0.c
          ... see [3] ...
          $ sed '/SG_GET_VERSION_NUM/,/}/d' sg_simple0.c > sgio_inquiry.c
          $ gcc sgio_inquiry.c -o sgio_inquiry
      
      2) The ioctl() works fine with active paths present.
      
          # multipath -l 85ag56
          85ag56 (...) dm-19 IBM     ,2145
          size=60G features='1 queue_if_no_path' hwhandler='0' wp=rw
          |-+- policy='service-time 0' prio=0 status=active
          | |- 8:0:11:0  sdz  65:144  active undef running
          | `- 9:0:9:0   sdbf 67:144  active undef running
          `-+- policy='service-time 0' prio=0 status=enabled
            |- 8:0:12:0  sdae 65:224  active undef running
            `- 9:0:12:0  sdbo 68:32   active undef running
      
          $ ./sgio_inquiry /dev/mapper/85ag56
          Some of the INQUIRY command's response:
              IBM       2145              0000
          INQUIRY duration=0 millisecs, resid=0
      
      3) The ioctl() fails with ENOTTY errno with _no_ active paths present,
         for unprivileged users (rather than blocking due to queue_if_no_path).
      
          # for path in $(multipath -l 85ag56 | grep -o 'sd[a-z]\+'); \
                do multipathd -k"fail path $path"; done
      
          # multipath -l 85ag56
          85ag56 (...) dm-19 IBM     ,2145
          size=60G features='1 queue_if_no_path' hwhandler='0' wp=rw
          |-+- policy='service-time 0' prio=0 status=enabled
          | |- 8:0:11:0  sdz  65:144  failed undef running
          | `- 9:0:9:0   sdbf 67:144  failed undef running
          `-+- policy='service-time 0' prio=0 status=enabled
            |- 8:0:12:0  sdae 65:224  failed undef running
            `- 9:0:12:0  sdbo 68:32   failed undef running
      
          $ ./sgio_inquiry /dev/mapper/85ag56
          sg_simple0: Inquiry SG_IO ioctl error: Inappropriate ioctl for device
      
      4) dmesg shows that scsi_verify_blk_ioctl() failed for SG_IO (0x2285);
         it returns -ENOIOCTLCMD, later replaced with -ENOTTY in vfs_ioctl().
      
          $ dmesg
          <...>
          [] device-mapper: multipath: Failing path 65:144.
          [] device-mapper: multipath: Failing path 67:144.
          [] device-mapper: multipath: Failing path 65:224.
          [] device-mapper: multipath: Failing path 68:32.
          [] sgio_inquiry: sending ioctl 2285 to a partition!
      
      5) The ioctl() only works if the CAP_SYS_RAWIO capability is present
         (then queueing happens -- in this example, queue_if_no_path is set);
         this is due to a conditional check in scsi_verify_blk_ioctl().
      
          # capsh --drop=cap_sys_rawio -- -c './sgio_inquiry /dev/mapper/85ag56'
          sg_simple0: Inquiry SG_IO ioctl error: Inappropriate ioctl for device
      
          # ./sgio_inquiry /dev/mapper/85ag56 &
          [1] 72830
      
          # cat /proc/72830/stack
          [<c00000171c0df700>] 0xc00000171c0df700
          [<c000000000015934>] __switch_to+0x204/0x350
          [<c000000000152d4c>] msleep+0x5c/0x80
          [<c00000000077dfb0>] dm_blk_ioctl+0x70/0x170
          [<c000000000487c40>] blkdev_ioctl+0x2b0/0x9b0
          [<c0000000003128e4>] block_ioctl+0x64/0xd0
          [<c0000000002dd3b0>] do_vfs_ioctl+0x490/0x780
          [<c0000000002dd774>] SyS_ioctl+0xd4/0xf0
          [<c000000000009358>] system_call+0x38/0xd0
      
      6) This is the function call chain exercised in this analysis:
      
      SYSCALL_DEFINE3(ioctl, <...>) @ fs/ioctl.c
          -> do_vfs_ioctl()
              -> vfs_ioctl()
                  ...
                  error = filp->f_op->unlocked_ioctl(filp, cmd, arg);
                  ...
                      -> dm_blk_ioctl() @ drivers/md/dm.c
                          -> multipath_ioctl() @ drivers/md/dm-mpath.c
                              ...
                              (bdev = NULL, due to no active paths)
                              ...
                              if (!bdev || <...>) {
                                  int err = scsi_verify_blk_ioctl(NULL, cmd);
                                  if (err)
                                      r = err;
                              }
                              ...
                                  -> scsi_verify_blk_ioctl() @ block/scsi_ioctl.c
                                      ...
                                      if (bd && bd == bd->bd_contains) // not taken (bd = NULL)
                                          return 0;
                                      ...
                                      if (capable(CAP_SYS_RAWIO)) // not taken (unprivileged user)
                                          return 0;
                                      ...
                                      printk_ratelimited(KERN_WARNING
                                                 "%s: sending ioctl %x to a partition!\n" <...>);
      
                                      return -ENOIOCTLCMD;
                                  <-
                              ...
                              return r ? : <...>
                          <-
                  ...
                  if (error == -ENOIOCTLCMD)
                      error = -ENOTTY;
                   out:
                      return error;
                  ...
      
      Links:
      [1] http://git.qemu.org/?p=qemu.git;a=commit;h=336a6915bc7089fb20fea4ba99972ad9a97c5f52
      [2] https://libvirt.org/formatdomain.html#elementsDisks (see 'disk' -> 'device')
      [3] http://tldp.org/HOWTO/SCSI-Generic-HOWTO/pexample.html (Revision 1.2, 2002-05-03)
      Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
    • mtd: blkdevs: fix potential deadlock + lockdep warnings · 9ea06026
      Brian Norris authored
      commit f3c63795 upstream.
      
      Commit 073db4a5 ("mtd: fix: avoid race condition when accessing
      mtd->usecount") fixed a race condition but due to poor ordering of the
      mutex acquisition, introduced a potential deadlock.
      
      The deadlock can occur, for example, when rmmod'ing the m25p80 module, which
      will delete one or more MTDs, along with any corresponding mtdblock
      devices. This could potentially race with an acquisition of the block
      device as follows.
      
       -> blktrans_open()
          ->  mutex_lock(&dev->lock);
          ->  mutex_lock(&mtd_table_mutex);
      
       -> del_mtd_device()
          ->  mutex_lock(&mtd_table_mutex);
          ->  blktrans_notify_remove() -> del_mtd_blktrans_dev()
             ->  mutex_lock(&dev->lock);
      
      This is a classic (potential) ABBA deadlock, which can be fixed by
      making the A->B ordering consistent everywhere. There was no real
      purpose to the ordering in the original patch, AFAIR, so this shouldn't
      be a problem. This ordering was actually already present in
      del_mtd_blktrans_dev(), for one, where the function tried to ensure that
      its caller already held mtd_table_mutex before it acquired &dev->lock:
      
              if (mutex_trylock(&mtd_table_mutex)) {
                      mutex_unlock(&mtd_table_mutex);
                      BUG();
              }
      
      So, reverse the ordering of acquisition of &dev->lock and &mtd_table_mutex so
      we always acquire mtd_table_mutex first.
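The ordering rule can be expressed as a small rank check: every lock gets a rank, and a lock may only be taken while no lock of equal or higher rank is held. This is a toy model with hypothetical names (`enum lock_rank`, `struct order_checker`, `acquire_in_order()`); it does not take real locks and is not how the kernel's lockdep works internally.

```c
#include <stdbool.h>

/* Hypothetical ranks: a lower rank must always be taken first. */
enum lock_rank {
    RANK_MTD_TABLE = 1,   /* models mtd_table_mutex */
    RANK_DEV = 2          /* models dev->lock */
};

struct order_checker {
    int held_max;   /* highest rank currently held, 0 if none */
};

/* Returns true if taking `rank` now respects the global order.
 * Releasing locks is left out to keep the sketch short. */
static bool acquire_in_order(struct order_checker *c, enum lock_rank rank)
{
    if ((int)rank <= c->held_max)
        return false;   /* would invert the order: ABBA risk */
    c->held_max = (int)rank;
    return true;
}
```

Taking the table lock and then the device lock passes the check; the reverse order, as in the original blktrans_open() path, is rejected.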
      
      Snippets of the lockdep output follow:
      
        # modprobe -r m25p80
        [   53.419251]
        [   53.420838] ======================================================
        [   53.427300] [ INFO: possible circular locking dependency detected ]
        [   53.433865] 4.3.0-rc6 #96 Not tainted
        [   53.437686] -------------------------------------------------------
        [   53.444220] modprobe/372 is trying to acquire lock:
        [   53.449320]  (&new->lock){+.+...}, at: [<c043fe4c>] del_mtd_blktrans_dev+0x80/0xdc
        [   53.457271]
        [   53.457271] but task is already holding lock:
        [   53.463372]  (mtd_table_mutex){+.+.+.}, at: [<c0439994>] del_mtd_device+0x18/0x100
        [   53.471321]
        [   53.471321] which lock already depends on the new lock.
        [   53.471321]
        [   53.479856]
        [   53.479856] the existing dependency chain (in reverse order) is:
        [   53.487660]
        -> #1 (mtd_table_mutex){+.+.+.}:
        [   53.492331]        [<c043fc5c>] blktrans_open+0x34/0x1a4
        [   53.497879]        [<c01afce0>] __blkdev_get+0xc4/0x3b0
        [   53.503364]        [<c01b0bb8>] blkdev_get+0x108/0x320
        [   53.508743]        [<c01713c0>] do_dentry_open+0x218/0x314
        [   53.514496]        [<c0180454>] path_openat+0x4c0/0xf9c
        [   53.519959]        [<c0182044>] do_filp_open+0x5c/0xc0
        [   53.525336]        [<c0172758>] do_sys_open+0xfc/0x1cc
        [   53.530716]        [<c000f740>] ret_fast_syscall+0x0/0x1c
        [   53.536375]
        -> #0 (&new->lock){+.+...}:
        [   53.540587]        [<c063f124>] mutex_lock_nested+0x38/0x3cc
        [   53.546504]        [<c043fe4c>] del_mtd_blktrans_dev+0x80/0xdc
        [   53.552606]        [<c043f164>] blktrans_notify_remove+0x7c/0x84
        [   53.558891]        [<c04399f0>] del_mtd_device+0x74/0x100
        [   53.564544]        [<c043c670>] del_mtd_partitions+0x80/0xc8
        [   53.570451]        [<c0439aa0>] mtd_device_unregister+0x24/0x48
        [   53.576637]        [<c046ce6c>] spi_drv_remove+0x1c/0x34
        [   53.582207]        [<c03de0f0>] __device_release_driver+0x88/0x114
        [   53.588663]        [<c03de19c>] device_release_driver+0x20/0x2c
        [   53.594843]        [<c03dd9e8>] bus_remove_device+0xd8/0x108
        [   53.600748]        [<c03dacc0>] device_del+0x10c/0x210
        [   53.606127]        [<c03dadd0>] device_unregister+0xc/0x20
        [   53.611849]        [<c046d878>] __unregister+0x10/0x20
        [   53.617211]        [<c03da868>] device_for_each_child+0x50/0x7c
        [   53.623387]        [<c046eae8>] spi_unregister_master+0x58/0x8c
        [   53.629578]        [<c03e12f0>] release_nodes+0x15c/0x1c8
        [   53.635223]        [<c03de0f8>] __device_release_driver+0x90/0x114
        [   53.641689]        [<c03de900>] driver_detach+0xb4/0xb8
        [   53.647147]        [<c03ddc78>] bus_remove_driver+0x4c/0xa0
        [   53.652970]        [<c00cab50>] SyS_delete_module+0x11c/0x1e4
        [   53.658976]        [<c000f740>] ret_fast_syscall+0x0/0x1c
        [   53.664621]
        [   53.664621] other info that might help us debug this:
        [   53.664621]
        [   53.672979]  Possible unsafe locking scenario:
        [   53.672979]
        [   53.679169]        CPU0                    CPU1
        [   53.683900]        ----                    ----
        [   53.688633]   lock(mtd_table_mutex);
        [   53.692383]                                lock(&new->lock);
        [   53.698306]                                lock(mtd_table_mutex);
        [   53.704658]   lock(&new->lock);
        [   53.707946]
        [   53.707946]  *** DEADLOCK ***
      
      Fixes: 073db4a5 ("mtd: fix: avoid race condition when accessing mtd->usecount")
      Reported-by: Felipe Balbi <balbi@ti.com>
      Tested-by: Felipe Balbi <balbi@ti.com>
      Signed-off-by: Brian Norris <computersforpeace@gmail.com>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
    • can: Use correct type in sizeof() in nla_put() · a4d95541
      Marek Vasut authored
      commit 562b103a upstream.
      
      The sizeof() is invoked on an incorrect variable, likely due to some
      copy-paste error, and this might result in memory corruption. Fix this.
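This class of bug is easier to avoid when the length is derived from the object being copied. The sketch below uses made-up structures (`struct small_cfg`, `struct big_cfg`) and a hypothetical helper `put_small_cfg()`; it does not reproduce the CAN netlink code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical structures of different sizes. */
struct small_cfg { int a; };
struct big_cfg   { int a; int b[7]; };

/* Copy a small_cfg into buf and return the number of bytes written,
 * or 0 if buf is too small. sizeof(*src) ties the length to the object
 * actually copied, so a slip such as sizeof(struct big_cfg) is harder
 * to make. */
static size_t put_small_cfg(void *buf, size_t buflen,
                            const struct small_cfg *src)
{
    if (buflen < sizeof(*src))
        return 0;
    memcpy(buf, src, sizeof(*src));
    return sizeof(*src);
}
```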
      Signed-off-by: Marek Vasut <marex@denx.de>
      Cc: Wolfgang Grandegger <wg@grandegger.com>
      Cc: netdev@vger.kernel.org
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
    • arm64: Fix compat register mappings · 25a0fff8
      Robin Murphy authored
      commit 5accd17d upstream.
      
      For reasons not entirely apparent, but now enshrined in history, the
      architectural mapping of AArch32 banked registers to AArch64 registers
      actually orders SP_<mode> and LR_<mode> backwards compared to the
      intuitive r13/r14 order, for all modes except FIQ.
      
      Fix the compat_<reg>_<mode> macros accordingly, in the hope of avoiding
      subtle bugs with KVM and AArch32 guests.
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
      25a0fff8
    • David Hildenbrand's avatar
      KVM: s390: SCA must not cross page boundaries · 7c724edb
      David Hildenbrand authored
      commit c5c2c393 upstream.
      
      We seemed to have missed a few corner cases in commit f6c137ff
      ("KVM: s390: randomize sca address").
      
      The SCA has a maximum size of 2112 bytes. By setting the sca_offset to
      some unlucky numbers, we exceed the page.
      
      0x7c0 (1984) -> Fits exactly
      0x7d0 (2000) -> 16 bytes out
      0x7e0 (2016) -> 32 bytes out
      0x7f0 (2032) -> 48 bytes out
      
      One VCPU entry is 32 bytes long.
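
      The overflow figures in the table above follow from simple arithmetic.
      The sketch below reproduces them; it assumes a 4096-byte page (which is
      what the "fits exactly" case implies), and the names SCA_SIZE,
      PAGE_BYTES and sca_bytes_past_page() are illustrative, not kernel
      identifiers.

```c
#include <assert.h>

#define SCA_SIZE   2112u  /* maximum SCA size stated in the commit message */
#define PAGE_BYTES 4096u  /* assumed page size */

/*
 * Number of bytes by which an SCA placed at sca_offset within a page
 * spills into the following page (0 if it fits).
 */
static unsigned int sca_bytes_past_page(unsigned int sca_offset)
{
    assert(sca_offset < PAGE_BYTES);

    unsigned int end = sca_offset + SCA_SIZE;

    return end > PAGE_BYTES ? end - PAGE_BYTES : 0;
}
```

      For example, sca_bytes_past_page(0x7c0) is 0, while offsets 0x7d0,
      0x7e0 and 0x7f0 give 16, 32 and 48 bytes respectively.
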
      
      For the last two cases, we actually write data to the other page.
      1. The address of the VCPU.
      2. Injection/delivery/clearing of SIGP external calls via SIGP IF.
      
      Especially the 2. happens regularly. So this could produce two problems:
      1. The guest losing/getting external calls.
      2. Random memory overwrites in the host.
      
      So this problem happens on every 127 + 128 created VM with 64 VCPUs.
      Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
      7c724edb
    • sumit.saxena@avagotech.com's avatar
      megaraid_sas: Do not use PAGE_SIZE for max_sectors · dea853b1
      sumit.saxena@avagotech.com authored
      commit 357ae967 upstream.
      
      Do not use the PAGE_SIZE macro to calculate max_sectors per I/O
      request. The driver code assumes PAGE_SIZE is always 4096, which can
      lead to a wrongly calculated value if PAGE_SIZE is not 4096. This issue
      was reported in Ubuntu Bugzilla Bug #1475166.
      Signed-off-by: Sumit Saxena <sumit.saxena@avagotech.com>
      Signed-off-by: Kashyap Desai <kashyap.desai@avagotech.com>
      Reviewed-by: Tomas Henzl <thenzl@redhat.com>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
      dea853b1
    • Vineet Gupta's avatar
      MAINTAINERS: Add public mailing list for ARC · fae7751f
      Vineet Gupta authored
      commit 9acdc911 upstream.
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
      fae7751f
    • Kailang Yang's avatar
      ALSA: hda/realtek - Dell XPS one ALC3260 speaker no sound after resume back · c1fc5009
      Kailang Yang authored
      commit 6ed1131f upstream.
      
      This machine has an I2S codec for speaker output.
      The I2S codec's initial verbs need to be refilled after resume.
      Signed-off-by: Kailang Yang <kailang@realtek.com>
      Reported-and-tested-by: George Gugulea <gugulea@gmail.com>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      [ luis: backported to 3.16: adjusted context ]
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
      c1fc5009
    • Chen Yu's avatar
      ACPI: Use correct IRQ when uninstalling ACPI interrupt handler · 79f08032
      Chen Yu authored
      commit 49e4b843 upstream.
      
      Currently when the system is trying to uninstall the ACPI interrupt
      handler, it uses acpi_gbl_FADT.sci_interrupt as the IRQ number.
      However, the IRQ number that the ACPI interrupt handler is installed
      for comes from acpi_gsi_to_irq() and that is the number that should
      be used for the handler removal.
      
      Fix this problem by using the mapped IRQ returned from acpi_gsi_to_irq()
      as appropriate.
      Acked-by: Lv Zheng <lv.zheng@intel.com>
      Signed-off-by: Chen Yu <yu.c.chen@intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      [ luis: backported to 3.16: adjusted context ]
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
      79f08032
    • Larry Finger's avatar
      staging: rtl8712: Add device ID for Sitecom WLA2100 · 9bd08031
      Larry Finger authored
      commit 1e6e6328 upstream.
      
      This adds the USB ID for the Sitecom WLA2100. The Windows 10 inf file
      was checked to verify that the addition is correct.
      Reported-by: Frans van de Wiel <fvdw@fvdw.eu>
      Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
      Cc: Frans van de Wiel <fvdw@fvdw.eu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
      9bd08031
    • Bjørn Mork's avatar
      USB: qcserial: add Sierra Wireless MC74xx/EM74xx · 05e877eb
      Bjørn Mork authored
      commit f504ab18 upstream.
      
      New device IDs shamelessly lifted from the vendor driver.
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Acked-by: Johan Hovold <johan@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
      05e877eb
    • David Mosberger-Tang's avatar
      spi: atmel: Fix DMA-setup for transfers with more than 8 bits per word · ec024552
      David Mosberger-Tang authored
      commit 06515f83 upstream.
      
      The DMA-slave configuration depends on whether <= 8 or > 8 bits
      are transferred per word, so we need to call
      atmel_spi_dma_slave_config() with the correct value.
      Signed-off-by: David Mosberger <davidm@egauge.net>
      Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
      ec024552
    • Dmitry Tunin's avatar
      Bluetooth: ath3k: Add support of AR3012 0cf3:817b device · 6cadc9ae
      Dmitry Tunin authored
      commit 18e0afab upstream.
      
      T: Bus=04 Lev=02 Prnt=02 Port=04 Cnt=01 Dev#= 3 Spd=12 MxCh= 0
      D: Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1
      P: Vendor=0cf3 ProdID=817b Rev=00.02
      C: #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
      I: If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      I: If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      
      BugLink: https://bugs.launchpad.net/bugs/1506615
      Signed-off-by: Dmitry Tunin <hanipouspilot@gmail.com>
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
      6cadc9ae
    • Dmitry Tunin's avatar
      Bluetooth: ath3k: Add new AR3012 0930:021c id · 59d383d8
      Dmitry Tunin authored
      commit cd355ff0 upstream.
      
      This adapter works with the existing linux-firmware.
      
      T:  Bus=01 Lev=01 Prnt=01 Port=03 Cnt=02 Dev#=  3 Spd=12  MxCh= 0
      D:  Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
      P:  Vendor=0930 ProdID=021c Rev=00.01
      C:  #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
      I:  If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      I:  If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      
      BugLink: https://bugs.launchpad.net/bugs/1502781
      Signed-off-by: Dmitry Tunin <hanipouspilot@gmail.com>
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
      59d383d8
    • David Herrmann's avatar
      Bluetooth: hidp: fix device disconnect on idle timeout · 38cfa4dc
      David Herrmann authored
      commit 660f0fc0 upstream.
      
      The HIDP specs define an idle-timeout which automatically disconnects a
      device. This has always been implemented in the HIDP layer and forced a
      synchronous shutdown of the hidp-scheduler. This works just fine, but
      lacks a forced disconnect on the underlying l2cap channels. This has been
      broken since:
      
          commit 5205185d
          Author: David Herrmann <dh.herrmann@gmail.com>
          Date:   Sat Apr 6 20:28:47 2013 +0200
      
              Bluetooth: hidp: remove old session-management
      
      The old session-management always forced an l2cap error on the ctrl/intr
      channels when shutting down. The new session-management skips this, as we
      don't want to enforce channel policy on the caller. In other words, if
      user-space removes an HIDP device, the underlying channels (which are
      *owned* and *referenced* by user-space) are still left active. User-space
      needs to call shutdown(2) or close(2) to release them.
      
      Unfortunately, this does not work with idle-timeouts. There is no way to
      signal user-space that the HIDP layer has been stopped. The API simply
      does not support any event-passing except for poll(2). Hence, we restore
      old behavior and force EUNATCH on the sockets if the HIDP layer is
      disconnected due to idle-timeouts (behavior of explicit disconnects
      remains unmodified). User-space can still call
      
          getsockopt(..., SO_ERROR, ...)
      
      ...to retrieve the EUNATCH error and clear sk_err. Hence, the channels can
      still be re-used (which nobody does so far, though). Therefore, the API
      still supports the new behavior, but with this patch it's also compatible
      with the old implicit channel shutdown.
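
      From user-space, reading the pending socket error looks roughly like
      the sketch below. The helper name fetch_pending_error() is made up for
      illustration; only getsockopt(2) with SOL_SOCKET/SO_ERROR is taken from
      the text above. The sketch works on any socket descriptor and does not
      depend on Bluetooth.

```c
#include <sys/socket.h>

/*
 * Fetch the pending error on a socket. Reading SO_ERROR also clears it,
 * so a second call reports 0 unless a new error has been raised.
 *
 * Returns the pending errno value (0 if there is none), or -1 if
 * getsockopt() itself fails (for example, on an invalid descriptor).
 */
static int fetch_pending_error(int fd)
{
    int err = 0;
    socklen_t len = sizeof(err);

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return -1;
    return err;
}
```

      According to the description above, a caller holding the ctrl/intr
      channels would expect this to return EUNATCH once after an idle-timeout
      disconnect; that expectation comes from the commit message and is not
      exercised by the sketch.
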
      Reported-by: Mark Haun <haunma@keteu.org>
      Reported-by: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
      Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
      38cfa4dc
    • Tiffany Lin's avatar
      [media] media: vb2 dma-contig: Fully cache synchronise buffers in prepare and finish · 5eb6e56e
      Tiffany Lin authored
      commit d9a98588 upstream.
      
      In the prepare and finish ops of the videobuf2 dma-contig memory type,
      the value returned by dma_map_sg() was passed as the "nents" parameter
      to dma_sync_sg_for_device() and dma_sync_sg_for_cpu() instead of the
      number of entries in the original scatterlist. Although this has been
      suggested in comments of some implementations (which have since been
      corrected), it is wrong.
      
      Fixes: 199d101e ("v4l: vb2-dma-contig: add prepare/finish to dma-contig allocator")
      Signed-off-by: Tiffany Lin <tiffany.lin@mediatek.com>
      Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
      5eb6e56e
    • Andy Shevchenko's avatar
      spi: dw: explicitly free IRQ handler in dw_spi_remove_host() · d599fd12
      Andy Shevchenko authored
      commit 02f20387 upstream.
      
      The following warning occurs when DW SPI is compiled as a module and it's a PCI
      device. At the removal stage, pcibios_free_irq() is called earlier than
      free_irq() because the latter is called when managed resources are freed.
      
      ------------[ cut here ]------------
      WARNING: CPU: 1 PID: 1003 at /home/andy/prj/linux/fs/proc/generic.c:575 remove_proc_entry+0x118/0x150()
      remove_proc_entry: removing non-empty directory 'irq/38', leaking at least 'dw_spi1'
      Modules linked in: spi_dw_midpci(-) spi_dw [last unloaded: dw_dmac_core]
      CPU: 1 PID: 1003 Comm: modprobe Not tainted 4.3.0-rc5-next-20151013+ #32
       00000000 00000000 f5535d70 c12dc220 f5535db0 f5535da0 c104e912 c198a6bc
       f5535dcc 000003eb c198a638 0000023f c11b4098 c11b4098 f54f1ec8 f54f1ea0
       f642ba20 f5535db8 c104e96e 00000009 f5535db0 c198a6bc f5535dcc f5535df0
      Call Trace:
       [<c12dc220>] dump_stack+0x41/0x61
       [<c104e912>] warn_slowpath_common+0x82/0xb0
       [<c11b4098>] ? remove_proc_entry+0x118/0x150
       [<c11b4098>] ? remove_proc_entry+0x118/0x150
       [<c104e96e>] warn_slowpath_fmt+0x2e/0x30
       [<c11b4098>] remove_proc_entry+0x118/0x150
       [<c109b96a>] unregister_irq_proc+0xaa/0xc0
       [<c109575e>] free_desc+0x1e/0x60
       [<c10957d2>] irq_free_descs+0x32/0x70
       [<c109b1a0>] irq_domain_free_irqs+0x120/0x150
       [<c1039e8c>] mp_unmap_irq+0x5c/0x60
       [<c16277b0>] intel_mid_pci_irq_disable+0x20/0x40
       [<c1627c7f>] pcibios_free_irq+0xf/0x20
       [<c13189f2>] pci_device_remove+0x52/0xb0
       [<c13f6367>] __device_release_driver+0x77/0x100
       [<c13f6da7>] driver_detach+0x87/0x90
       [<c13f5eaa>] bus_remove_driver+0x4a/0xc0
       [<c128bf0d>] ? selinux_capable+0xd/0x10
       [<c13f7483>] driver_unregister+0x23/0x60
       [<c10bad8a>] ? find_module_all+0x5a/0x80
       [<c1317413>] pci_unregister_driver+0x13/0x60
       [<f80ac654>] dw_spi_driver_exit+0xd/0xf [spi_dw_midpci]
       [<c10bce9a>] SyS_delete_module+0x17a/0x210
      
      Explicitly call free_irq() at removal stage of the DW SPI driver.
      
      Fixes: 04f421e7 (spi: dw: use managed resources)
      Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      [ luis: backported to 3.16: adjusted context ]
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
      d599fd12
    • Hon Ching (Vicky) Lo's avatar
      vTPM: fix memory allocation flag for rtce buffer at kernel boot · 32c939ce
      Hon Ching (Vicky) Lo authored
      commit 60ecd86c upstream.
      
      At ibm vtpm initialization, tpm_ibmvtpm_probe() registers its interrupt
      handler, ibmvtpm_interrupt, which calls ibmvtpm_crq_process to allocate
      memory for rtce buffer.  The current code uses 'GFP_KERNEL' as the
      type of kernel memory allocation, which resulted in a warning at
      kernel/lockdep.c.  This patch uses 'GFP_ATOMIC' instead so that the
      allocation is high-priority and does not sleep.
      Signed-off-by: Hon Ching(Vicky) Lo <honclo@linux.vnet.ibm.com>
      Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
      32c939ce
    • Daeho Jeong's avatar
      ext4, jbd2: ensure entering into panic after recording an error in superblock · abf7bef5
      Daeho Jeong authored
      commit 4327ba52 upstream.
      
      If an ext4 filesystem uses JBD2 journaling and an error occurs, the
      journal is aborted first, the error number is recorded in the JBD2
      superblock and, finally, the system panics if the "errors=panic" option
      is set. In rare cases, however, this sequence is reordered as shown in
      the figure below, and the system panics (which means a system reset in
      a mobile environment) before the error has been recorded in the
      journal superblock. In that case, e2fsck cannot recognize that a
      filesystem failure occurred in the previous run, and the corruption
      is not fixed.
      
      Task A                        Task B
      ext4_handle_error()
      -> jbd2_journal_abort()
        -> __journal_abort_soft()
          -> __jbd2_journal_abort_hard()
          | -> journal->j_flags |= JBD2_ABORT;
          |
          |                         __ext4_abort()
          |                         -> jbd2_journal_abort()
          |                         | -> __journal_abort_soft()
          |                         |   -> if (journal->j_flags & JBD2_ABORT)
          |                         |           return;
          |                         -> panic()
          |
          -> jbd2_journal_update_sb_errno()
      Tested-by: Hobin Woo <hobin.woo@samsung.com>
      Signed-off-by: Daeho Jeong <daeho.jeong@samsung.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
      abf7bef5
    • Andy Leiserson's avatar
      [PATCH] fix calculation of meta_bg descriptor backups · ea9460a6
      Andy Leiserson authored
      commit 904dad47 upstream.
      
      "group" is the group where the backup will be placed, and is
      initialized to zero in the declaration. This meant that backups for
      meta_bg descriptors were erroneously written to the backup block group
      descriptors in groups 1 and (desc_per_block-1).
      
      Reproduction information:
        mke2fs -Fq -t ext4 -b 1024 -O ^resize_inode /tmp/foo.img 16G
        truncate -s 24G /tmp/foo.img
        losetup /dev/loop0 /tmp/foo.img
        mount /dev/loop0 /mnt
        resize2fs /dev/loop0
        umount /dev/loop0
        dd if=/dev/zero of=/dev/loop0 bs=1024 count=2
        e2fsck -fy /dev/loop0
        losetup -d /dev/loop0
      Signed-off-by: Andy Leiserson <andy@leiserson.org>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
      ea9460a6
    • Lukas Czerner's avatar
      ext4: fix potential use after free in __ext4_journal_stop · 13ea5f86
      Lukas Czerner authored
      commit 6934da92 upstream.
      
      There is a use-after-free possibility in __ext4_journal_stop() in the
      case that we free the handle in the first jbd2_journal_stop() because
      we're referencing handle->h_err afterwards. This was introduced in
      9705acd6 and it is wrong. Fix it by storing the handle->h_err value
      beforehand and avoiding any reference to the potentially freed handle.
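
      The pattern behind the fix can be sketched outside the kernel. The code
      below is not the ext4 or jbd2 implementation: struct handle,
      handle_stop() and journal_stop_safe() are hypothetical stand-ins for
      handle_t, jbd2_journal_stop() and __ext4_journal_stop(). It only shows
      the ordering, where the error field is read before the call that may
      free the handle.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a journal handle; not the jbd2 handle_t. */
struct handle {
    int h_err;  /* error recorded on the handle */
    int h_ref;  /* reference count */
};

/* Stand-in for jbd2_journal_stop(): drops a reference, frees on the last. */
static int handle_stop(struct handle *h)
{
    if (--h->h_ref == 0)
        free(h);
    return 0;
}

/*
 * Safe ordering: copy h_err while the handle is still valid, then stop.
 * After handle_stop() returns, h must not be dereferenced.
 */
static int journal_stop_safe(struct handle *h)
{
    assert(h != NULL);

    int err = h->h_err;        /* saved before the handle can be freed */
    int rc = handle_stop(h);   /* h may be freed from here on */

    return err ? err : rc;
}
```

      The unsafe variant would call handle_stop(h) first and read h->h_err
      afterwards, which touches freed memory whenever the first stop drops
      the last reference.
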
      
      Fixes: 9705acd6
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Reviewed-by: Andreas Dilger <adilger@dilger.ca>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
      13ea5f86
    • Filipe Manana's avatar
      Btrfs: fix truncation of compressed and inlined extents · c40009c4
      Filipe Manana authored
      commit 0305cd5f upstream.
      
      When truncating a file to a smaller size which consists of an inline
      extent that is compressed, we did not discard (or make unusable) the
      data between the new file size and the old file size, wasting metadata
      space and allowing for the truncated data to be leaked and the data
      corruption/loss mentioned below.
      We were also not correctly decrementing the number of bytes used by the
      inode, we were setting it to zero, giving a wrong report for callers of
      the stat(2) syscall. The fsck tool also reported an error about a mismatch
      between the nbytes of the file versus the real space used by the file.
      
      Now because we weren't discarding the truncated region of the file, it
      was possible for a caller of the clone ioctl to actually read the data
      that was truncated, allowing for a security breach without requiring root
      access to the system, using only standard filesystem operations. The
      scenario is the following:
      
         1) User A creates a file which consists of an inline and compressed
            extent with a size of 2000 bytes - the file is not accessible to
            any other users (no read, write or execution permission for anyone
            else);
      
         2) The user truncates the file to a size of 1000 bytes;
      
         3) User A makes the file world readable;
      
         4) User B creates a file consisting of an inline extent of 2000 bytes;
      
         5) User B issues a clone operation from user A's file into its own
            file (using a length argument of 0, clone the whole range);
      
         6) User B now gets to see the 1000 bytes that user A truncated from
            its file before it made its file world readable. User B also lost
            the bytes in the range [1000, 2000[ bytes from its own file, but
            that might be ok if his/her intention was reading stale data from
            user A that was never supposed to be public.
      
      Note that this contrasts with the case where we truncate a file from 2000
      bytes to 1000 bytes and then truncate it back from 1000 to 2000 bytes. In
      this case reading any byte from the range [1000, 2000[ will return a value
      of 0x00, instead of the original data.
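
      The expected behaviour of the truncate round trip can be checked from
      user-space on any filesystem. The sketch below is an illustration only:
      it uses a temporary file under /tmp (an assumption about the
      environment), does not involve compressed inline extents or the clone
      ioctl, and the function name is made up.

```c
#define _POSIX_C_SOURCE 200809L

#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Write 2000 bytes of 0xaa to a temporary file, truncate it to 1000 bytes,
 * extend it back to 2000 bytes and check that the range [1000, 2000[ now
 * reads as zeroes.
 *
 * Returns 1 if the range is all zero, 0 if old data is visible,
 * -1 on any I/O error.
 */
static int truncate_roundtrip_reads_zero(void)
{
    char path[] = "/tmp/trunc-demo-XXXXXX";
    unsigned char buf[2000];
    int result = -1;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);  /* the file stays usable through fd until close() */

    memset(buf, 0xaa, sizeof(buf));
    if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf))
        goto out;
    if (ftruncate(fd, 1000) < 0 || ftruncate(fd, 2000) < 0)
        goto out;
    if (pread(fd, buf, 1000, 1000) != 1000)
        goto out;

    result = 1;
    for (size_t i = 0; i < 1000; i++)
        if (buf[i] != 0)
            result = 0;
out:
    close(fd);
    return result;
}
```

      A return value of 1 corresponds to the behaviour described above: the
      re-extended range must never expose the bytes that were truncated away.
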
      
      This problem exists since the clone ioctl was added and happens both with
      and without my recent data loss and file corruption fixes for the clone
      ioctl (patch "Btrfs: fix file corruption and data loss after cloning
      inline extents").
      
      So fix this by truncating the compressed inline extents as we do for the
      non-compressed case, which involves decompressing, if the data isn't already
      in the page cache, compressing the truncated version of the extent, writing
      the compressed content into the inline extent and then truncate it.
      
      The following test case for fstests reproduces the problem. In order for
      the test to pass both this fix and my previous fix for the clone ioctl
      that forbids cloning a smaller inline extent into a larger one,
      which is titled "Btrfs: fix file corruption and data loss after cloning
      inline extents", are needed. Without that other fix the test fails in a
      different way that does not leak the truncated data, instead part of
      destination file gets replaced with zeroes (because the destination file
      has a larger inline extent than the source).
      
        seq=`basename $0`
        seqres=$RESULT_DIR/$seq
        echo "QA output created by $seq"
        tmp=/tmp/$$
        status=1	# failure is the default!
        trap "_cleanup; exit \$status" 0 1 2 3 15
      
        _cleanup()
        {
            rm -f $tmp.*
        }
      
        # get standard environment, filters and checks
        . ./common/rc
        . ./common/filter
      
        # real QA test starts here
        _need_to_be_root
        _supported_fs btrfs
        _supported_os Linux
        _require_scratch
        _require_cloner
      
        rm -f $seqres.full
      
        _scratch_mkfs >>$seqres.full 2>&1
        _scratch_mount "-o compress"
      
        # Create our test files. File foo is going to be the source of a clone operation
        # and consists of a single inline extent with an uncompressed size of 512 bytes,
        # while file bar consists of a single inline extent with an uncompressed size of
        # 256 bytes. For our test's purpose, it's important that file bar has an inline
        # extent with a size smaller than foo's inline extent.
        $XFS_IO_PROG -f -c "pwrite -S 0xa1 0 128"   \
                -c "pwrite -S 0x2a 128 384" \
                $SCRATCH_MNT/foo | _filter_xfs_io
        $XFS_IO_PROG -f -c "pwrite -S 0xbb 0 256" $SCRATCH_MNT/bar | _filter_xfs_io
      
        # Now durably persist all metadata and data. We do this to make sure that we get
        # on disk an inline extent with a size of 512 bytes for file foo.
        sync
      
        # Now truncate our file foo to a smaller size. Because it consists of a
        # compressed and inline extent, btrfs did not shrink the inline extent to the
        # new size (if the extent was not compressed, btrfs would shrink it to 128
        # bytes), it only updates the inode's i_size to 128 bytes.
        $XFS_IO_PROG -c "truncate 128" $SCRATCH_MNT/foo
      
        # Now clone foo's inline extent into bar.
        # This clone operation should fail with errno EOPNOTSUPP because the source
        # file consists only of an inline extent and the file's size is smaller than
        # the inline extent of the destination (128 bytes < 256 bytes). However the
        # clone ioctl was not prepared to deal with a file that has a size smaller
        # than the size of its inline extent (something that happens only for compressed
        # inline extents), resulting in copying the full inline extent from the source
        # file into the destination file.
        #
        # Note that btrfs' clone operation for inline extents consists of removing the
        # inline extent from the destination inode and copy the inline extent from the
        # source inode into the destination inode, meaning that if the destination
        # inode's inline extent is larger (N bytes) than the source inode's inline
        # extent (M bytes), some bytes (N - M bytes) will be lost from the destination
        # file. Btrfs could copy the source inline extent's data into the destination's
        # inline extent so that we would not lose any data, but that's currently not
        # done due to the complexity that would be needed to deal with such cases
        # (especially when one or both extents are compressed), returning EOPNOTSUPP, as
        # it's normally not a very common case to clone very small files (only case
        # where we get inline extents) and copying inline extents does not save any
        # space (unlike for normal, non-inlined extents).
        $CLONER_PROG -s 0 -d 0 -l 0 $SCRATCH_MNT/foo $SCRATCH_MNT/bar
      
        # Now because the above clone operation used to succeed, and due to foo's inline
        # extent not being shrunk by the truncate operation, our file bar got the whole
        # inline extent copied from foo, making us lose the last 128 bytes from bar
        # which got replaced by the bytes in range [128, 256[ from foo before foo was
        # truncated - in other words, data loss from bar and being able to read old and
        # stale data from foo that should not be possible to read anymore through normal
        # filesystem operations. Contrast with the case where we truncate a file from a
        # size N to a smaller size M, truncate it back to size N and then read the range
        # [M, N[, we should always get the value 0x00 for all the bytes in that range.
      
        # We expected the clone operation to fail with errno EOPNOTSUPP and therefore
        # not modify our file's bar data/metadata. So its content should be 256 bytes
        # long with all bytes having the value 0xbb.
        #
        # Without the btrfs bug fix, the clone operation succeeded and resulted in
        # leaking truncated data from foo, the bytes that belonged to its range
        # [128, 256[, and losing data from bar in that same range. So reading the
        # file gave us the following content:
        #
        # 0000000 a1 a1 a1 a1 a1 a1 a1 a1 a1 a1 a1 a1 a1 a1 a1 a1
        # *
        # 0000200 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a
        # *
        # 0000400
        echo "File bar's content after the clone operation:"
        od -t x1 $SCRATCH_MNT/bar
      
        # Also because the foo's inline extent was not shrunk by the truncate
        # operation, btrfs' fsck, which is run by the fstests framework every time a
        # test completes, failed reporting the following error:
        #
        #  root 5 inode 257 errors 400, nbytes wrong
      
        status=0
        exit
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
      c40009c4
    • David Woodhouse's avatar
      iommu/vt-d: Fix ATSR handling for Root-Complex integrated endpoints · ebe0a78e
      David Woodhouse authored
      commit d14053b3 upstream.
      
      The VT-d specification says that "Software must enable ATS on endpoint
      devices behind a Root Port only if the Root Port is reported as
      supporting ATS transactions."
      
      We walk up the tree to find a Root Port, but for integrated devices we
      don't find one — we get to the host bridge. In that case we *should*
      allow ATS. Currently we don't, which means that we are incorrectly
      failing to use ATS for the integrated graphics. Fix that.
      
      We should never break out of this loop "naturally" with bus==NULL,
      since we'll always find bridge==NULL in that case (and now return 1).
      
      So remove the check for (!bridge) after the loop, since it can never
      happen. If it did, it would be worthy of a BUG_ON(!bridge). But since
      it'll oops anyway in that case, that'll do just as well.
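
      The decision described above can be sketched with a toy device tree.
      This is a simplified model, not the intel-iommu code: struct node and
      ats_permitted() are hypothetical, the root_port_supports_ats flag
      stands in for the ATSR lookup, and the real function performs further
      checks on the bridges it walks past.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device node; not struct pci_dev. */
struct node {
    const struct node *bridge;   /* upstream bridge, NULL below the host bridge */
    int is_root_port;            /* non-zero if this node is a Root Port */
    int root_port_supports_ats;  /* stand-in for the ATSR lookup result */
};

/*
 * Walk up from dev. If a Root Port is found, ATS is allowed only when that
 * Root Port is reported as supporting it. If the walk reaches the host
 * bridge without finding a Root Port, dev is a Root-Complex integrated
 * endpoint and ATS is allowed.
 */
static int ats_permitted(const struct node *dev)
{
    const struct node *bridge;

    assert(dev != NULL);
    for (bridge = dev->bridge; bridge; bridge = bridge->bridge) {
        if (bridge->is_root_port)
            return bridge->root_port_supports_ats;
    }
    return 1;  /* no Root Port above: integrated endpoint */
}
```

      In this model an integrated graphics device has bridge == NULL and is
      therefore allowed to use ATS, which is the case the patch fixes.
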
      Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
      ebe0a78e