1. 14 Apr, 2015 40 commits
    • Michael S. Tsirkin's avatar
      virtio_pci: document why we defer kfree · 39930a5d
      Michael S. Tsirkin authored
      commit a1eb03f5 upstream.
      
      The reason we defer kfree until release function is because it's a
      general rule for kobjects: kfree of the reference counter itself is only
      legal in the release function.
      
      Previous patch didn't make this clear, document this in code.
      Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      [lizf: Backported to 3.4: adjust filename]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      39930a5d
    • Sasha Levin's avatar
      virtio_pci: defer kfree until release callback · dfa6201a
      Sasha Levin authored
      commit 63bd62a0 upstream.
      
      A struct device which has just been unregistered can live on past the
      point at which a driver decides to drop it's initial reference to the
      kobject gained on allocation.
      
      This implies that when releasing a virtio device, we can't free a struct
      virtio_device until the underlying struct device has been released,
      which might not happen immediately on device_unregister().
      
      Unfortunately, this is exactly what virtio pci does:
      it has an empty release callback, and frees memory immediately
      after unregistering the device.
      
      This causes an easy to reproduce crash if CONFIG_DEBUG_KOBJECT_RELEASE
      it enabled.
      
      To fix, free the memory only once we know the device is gone in the release
      callback.
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      [lizf: Backported to 3.4: adjust filename]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      dfa6201a
    • Wanlong Gao's avatar
      virtio: use dev_to_virtio wrapper in virtio · d756c73d
      Wanlong Gao authored
      commit 9bffdca8 upstream.
      
      Use dev_to_virtio wrapper in virtio to make code clearly.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Signed-off-by: default avatarWanlong Gao <gaowanlong@cn.fujitsu.com>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      d756c73d
    • Takashi Iwai's avatar
      ALSA: hda - Fix wrong gpio_dir & gpio_mask hint setups for IDT/STAC codecs · 3d8b6705
      Takashi Iwai authored
      commit c507de88 upstream.
      
      stac_store_hints() does utterly wrong for masking the values for
      gpio_dir and gpio_data, likely due to copy&paste errors.  Fortunately,
      this feature is used very rarely, so the impact must be really small.
      Reported-by: default avatarRasmus Villemoes <linux@rasmusvillemoes.dk>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      3d8b6705
    • Daniel Borkmann's avatar
      x86, um: actually mark system call tables readonly · acc0b234
      Daniel Borkmann authored
      commit b485342b upstream.
      
      Commit a074335a ("x86, um: Mark system call tables readonly") was
      supposed to mark the sys_call_table in UML as RO by adding the const,
      but it doesn't have the desired effect as it's nevertheless being placed
      into the data section since __cacheline_aligned enforces sys_call_table
      being placed into .data..cacheline_aligned instead. We need to use
      the ____cacheline_aligned version instead to fix this issue.
      
      Before:
      
      $ nm -v arch/x86/um/sys_call_table_64.o | grep -1 "sys_call_table"
                       U sys_writev
      0000000000000000 D sys_call_table
      0000000000000000 D syscall_table_size
      
      After:
      
      $ nm -v arch/x86/um/sys_call_table_64.o | grep -1 "sys_call_table"
                       U sys_writev
      0000000000000000 R sys_call_table
      0000000000000000 D syscall_table_size
      
      Fixes: a074335a ("x86, um: Mark system call tables readonly")
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Signed-off-by: default avatarRichard Weinberger <richard@nod.at>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      acc0b234
    • Preston Fick's avatar
      USB: cp210x: fix ID for production CEL MeshConnect USB Stick · 07ba58c8
      Preston Fick authored
      commit 90441b4d upstream.
      
      Fixing typo for MeshConnect IDs. The original PID (0x8875) is not in
      production and is not needed. Instead it has been changed to the
      official production PID (0x8857).
      Signed-off-by: default avatarPreston Fick <pffick@gmail.com>
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      07ba58c8
    • Tomi Valkeinen's avatar
      video/logo: prevent use of logos after they have been freed · b3ca896b
      Tomi Valkeinen authored
      commit 92b004d1 upstream.
      
      If the probe of an fb driver has been deferred due to missing
      dependencies, and the probe is later ran when a module is loaded, the
      fbdev framework will try to find a logo to use.
      
      However, the logos are __initdata, and have already been freed. This
      causes sometimes page faults, if the logo memory is not mapped,
      sometimes other random crashes as the logo data is invalid, and
      sometimes nothing, if the fbdev decides to reject the logo (e.g. the
      random value depicting the logo's height is too big).
      
      This patch adds a late_initcall function to mark the logos as freed. In
      reality the logos are freed later, and fbdev probe may be ran between
      this late_initcall and the freeing of the logos. In that case we will
      miss drawing the logo, even if it would be possible.
      Signed-off-by: default avatarTomi Valkeinen <tomi.valkeinen@ti.com>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      b3ca896b
    • Toshiaki Makita's avatar
      net: Fix stacked vlan offload features computation · 3f2bfecb
      Toshiaki Makita authored
      commit 796f2da8 upstream.
      
      When vlan tags are stacked, it is very likely that the outer tag is stored
      in skb->vlan_tci and skb->protocol shows the inner tag's vlan_proto.
      Currently netif_skb_features() first looks at skb->protocol even if there
      is the outer tag in vlan_tci, thus it incorrectly retrieves the protocol
      encapsulated by the inner vlan instead of the inner vlan protocol.
      This allows GSO packets to be passed to HW and they end up being
      corrupted.
      
      Fixes: 58e998c6 ("offloading: Force software GSO for multiple vlan tags.")
      Signed-off-by: default avatarToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [lizf: Backported to 3.4:
       - remove ETH_P_8021AD
       - pass protocol to harmonize_features()]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      3f2bfecb
    • Rabin Vincent's avatar
      crypto: af_alg - fix backlog handling · 6fb66b08
      Rabin Vincent authored
      commit 7e77bdeb upstream.
      
      If a request is backlogged, it's complete() handler will get called
      twice: once with -EINPROGRESS, and once with the final error code.
      
      af_alg's complete handler, unlike other users, does not handle the
      -EINPROGRESS but instead always completes the completion that recvmsg()
      is waiting on.  This can lead to a return to user space while the
      request is still pending in the driver.  If userspace closes the sockets
      before the requests are handled by the driver, this will lead to
      use-after-frees (and potential crashes) in the kernel due to the tfm
      having been freed.
      
      The crashes can be easily reproduced (for example) by reducing the max
      queue length in cryptod.c and running the following (from
      http://www.chronox.de/libkcapi.html) on AES-NI capable hardware:
      
       $ while true; do kcapi -x 1 -e -c '__ecb-aes-aesni' \
          -k 00000000000000000000000000000000 \
          -p 00000000000000000000000000000000 >/dev/null & done
      Signed-off-by: default avatarRabin Vincent <rabin.vincent@axis.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      6fb66b08
    • Jan Kara's avatar
      udf: Check component length before reading it · 7e7154ff
      Jan Kara authored
      commit e237ec37 upstream.
      
      Check that length specified in a component of a symlink fits in the
      input buffer we are reading. Also properly ignore component length for
      component types that do not use it. Otherwise we read memory after end
      of buffer for corrupted udf image.
      Reported-by: default avatarCarl Henrik Lunde <chlunde@ping.uio.no>
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      7e7154ff
    • Andy Lutomirski's avatar
      x86_64, vdso: Fix the vdso address randomization algorithm · bd2b759a
      Andy Lutomirski authored
      commit 394f56fe upstream.
      
      The theory behind vdso randomization is that it's mapped at a random
      offset above the top of the stack.  To avoid wasting a page of
      memory for an extra page table, the vdso isn't supposed to extend
      past the lowest PMD into which it can fit.  Other than that, the
      address should be a uniformly distributed address that meets all of
      the alignment requirements.
      
      The current algorithm is buggy: the vdso has about a 50% probability
      of being at the very end of a PMD.  The current algorithm also has a
      decent chance of failing outright due to incorrect handling of the
      case where the top of the stack is near the top of its PMD.
      
      This fixes the implementation.  The paxtest estimate of vdso
      "randomisation" improves from 11 bits to 18 bits.  (Disclaimer: I
      don't know what the paxtest code is actually calculating.)
      
      It's worth noting that this algorithm is inherently biased: the vdso
      is more likely to end up near the end of its PMD than near the
      beginning.  Ideally we would either nix the PMD sharing requirement
      or jointly randomize the vdso and the stack to reduce the bias.
      
      In the mean time, this is a considerable improvement with basically
      no risk of compatibility issues, since the allowed outputs of the
      algorithm are unchanged.
      
      As an easy test, doing this:
      
      for i in `seq 10000`
        do grep -P vdso /proc/self/maps |cut -d- -f1
      done |sort |uniq -d
      
      used to produce lots of output (1445 lines on my most recent run).
      A tiny subset looks like this:
      
      7fffdfffe000
      7fffe01fe000
      7fffe05fe000
      7fffe07fe000
      7fffe09fe000
      7fffe0bfe000
      7fffe0dfe000
      
      Note the suspicious fe000 endings.  With the fix, I get a much more
      palatable 76 repeated addresses.
      Reviewed-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarAndy Lutomirski <luto@amacapital.net>
      [lizf: Backported to 3.4:
       - adjust context
       - adjust comment]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      bd2b759a
    • Jan Kara's avatar
      udf: Check path length when reading symlink · 99961be4
      Jan Kara authored
      commit 0e5cc9a4 upstream.
      
      Symlink reading code does not check whether the resulting path fits into
      the page provided by the generic code. This isn't as easy as just
      checking the symlink size because of various encoding conversions we
      perform on path. So we have to check whether there is still enough space
      in the buffer on the fly.
      Reported-by: default avatarCarl Henrik Lunde <chlunde@ping.uio.no>
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      [lizf: Backported to 3.4: udf_get_filename() is called in do_udf_readdir()]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      99961be4
    • Jan Kara's avatar
      udf: Verify symlink size before loading it · fb1b2074
      Jan Kara authored
      commit a1d47b26 upstream.
      
      UDF specification allows arbitrarily large symlinks. However we support
      only symlinks at most one block large. Check the length of the symlink
      so that we don't access memory beyond end of the symlink block.
      Reported-by: default avatarCarl Henrik Lunde <chlunde@gmail.com>
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      fb1b2074
    • Jan Kara's avatar
      udf: Verify i_size when loading inode · f9063aff
      Jan Kara authored
      commit e159332b upstream.
      
      Verify that inode size is sane when loading inode with data stored in
      ICB. Otherwise we may get confused later when working with the inode and
      inode size is too big.
      Reported-by: default avatarCarl Henrik Lunde <chlunde@ping.uio.no>
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      [lizf: Backported to 3.4: just return on error, as there's no "out" label]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      f9063aff
    • Jan Kara's avatar
      isofs: Fix unchecked printing of ER records · e44da0a9
      Jan Kara authored
      commit 4e202462 upstream.
      
      We didn't check length of rock ridge ER records before printing them.
      Thus corrupted isofs image can cause us to access and print some memory
      behind the buffer with obvious consequences.
      Reported-and-tested-by: default avatarCarl Henrik Lunde <chlunde@ping.uio.no>
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      e44da0a9
    • Junxiao Bi's avatar
      ocfs2: fix journal commit deadlock · 9939834c
      Junxiao Bi authored
      commit 136f49b9 upstream.
      
      For buffer write, page lock will be got in write_begin and released in
      write_end, in ocfs2_write_end_nolock(), before it unlock the page in
      ocfs2_free_write_ctxt(), it calls ocfs2_run_deallocs(), this will ask
      for the read lock of journal->j_trans_barrier.  Holding page lock and
      ask for journal->j_trans_barrier breaks the locking order.
      
      This will cause a deadlock with journal commit threads, ocfs2cmt will
      get write lock of journal->j_trans_barrier first, then it wakes up
      kjournald2 to do the commit work, at last it waits until done.  To
      commit journal, kjournald2 needs flushing data first, it needs get the
      cache page lock.
      
      Since some ocfs2 cluster locks are holding by write process, this
      deadlock may hung the whole cluster.
      
      unlock pages before ocfs2_run_deallocs() can fix the locking order, also
      put unlock before ocfs2_commit_trans() to make page lock is unlocked
      before j_trans_barrier to preserve unlocking order.
      Signed-off-by: default avatarJunxiao Bi <junxiao.bi@oracle.com>
      Reviewed-by: default avatarWengang Wang <wen.gang.wang@oracle.com>
      Reviewed-by: default avatarMark Fasheh <mfasheh@suse.de>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      9939834c
    • Jiri Jaburek's avatar
      ALSA: usb-audio: extend KEF X300A FU 10 tweak to Arcam rPAC · 46ca2c2c
      Jiri Jaburek authored
      commit d70a1b98 upstream.
      
      The Arcam rPAC seems to have the same problem - whenever anything
      (alsamixer, udevd, 3.9+ kernel from 60af3d03, ..) attempts to
      access mixer / control interface of the card, the firmware "locks up"
      the entire device, resulting in
        SNDRV_PCM_IOCTL_HW_PARAMS failed (-5): Input/output error
      from alsa-lib.
      
      Other operating systems can somehow read the mixer (there seems to be
      playback volume/mute), but any manipulation is ignored by the device
      (which has hardware volume controls).
      Signed-off-by: default avatarJiri Jaburek <jjaburek@redhat.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      46ca2c2c
    • Nicholas Bellinger's avatar
      iscsi-target: Fail connection on short sendmsg writes · 32ba5199
      Nicholas Bellinger authored
      commit 6bf6ca75 upstream.
      
      This patch changes iscsit_do_tx_data() to fail on short writes
      when kernel_sendmsg() returns a value different than requested
      transfer length, returning -EPIPE and thus causing a connection
      reset to occur.
      
      This avoids a potential bug in the original code where a short
      write would result in kernel_sendmsg() being called again with
      the original iovec base + length.
      
      In practice this has not been an issue because iscsit_do_tx_data()
      is only used for transferring 48 byte headers + 4 byte digests,
      along with seldom used control payloads from NOPIN + TEXT_RSP +
      REJECT with less than 32k of data.
      
      So following Al's audit of iovec consumers, go ahead and fail
      the connection on short writes for now, and remove the bogus
      logic ahead of his proper upstream fix.
      Reported-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      32ba5199
    • Jan Kara's avatar
      isofs: Fix infinite looping over CE entries · 96e44adc
      Jan Kara authored
      commit f54e18f1 upstream.
      
      Rock Ridge extensions define so called Continuation Entries (CE) which
      define where is further space with Rock Ridge data. Corrupted isofs
      image can contain arbitrarily long chain of these, including a one
      containing loop and thus causing kernel to end in an infinite loop when
      traversing these entries.
      
      Limit the traversal to 32 entries which should be more than enough space
      to store all the Rock Ridge data.
      Reported-by: default avatarP J P <ppandit@redhat.com>
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      96e44adc
    • Long Li's avatar
      storvsc: ring buffer failures may result in I/O freeze · 68aa0365
      Long Li authored
      commit e86fb5e8 upstream.
      
      When ring buffer returns an error indicating retry, storvsc may not
      return a proper error code to SCSI when bounce buffer is not used.
      This has introduced I/O freeze on RAID running atop storvsc devices.
      This patch fixes it by always returning a proper error code.
      Signed-off-by: default avatarLong Li <longli@microsoft.com>
      Reviewed-by: default avatarK. Y. Srinivasan <kys@microsoft.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      68aa0365
    • Andy Lutomirski's avatar
      x86/tls: Don't validate lm in set_thread_area() after all · 54c93a65
      Andy Lutomirski authored
      commit 3fb2f423 upstream.
      
      It turns out that there's a lurking ABI issue.  GCC, when
      compiling this in a 32-bit program:
      
      struct user_desc desc = {
      	.entry_number    = idx,
      	.base_addr       = base,
      	.limit           = 0xfffff,
      	.seg_32bit       = 1,
      	.contents        = 0, /* Data, grow-up */
      	.read_exec_only  = 0,
      	.limit_in_pages  = 1,
      	.seg_not_present = 0,
      	.useable         = 0,
      };
      
      will leave .lm uninitialized.  This means that anything in the
      kernel that reads user_desc.lm for 32-bit tasks is unreliable.
      
      Revert the .lm check in set_thread_area().  The value never did
      anything in the first place.
      
      Fixes: 0e58af4e ("x86/tls: Disallow unusual TLS segments")
      Signed-off-by: default avatarAndy Lutomirski <luto@amacapital.net>
      Acked-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/d7875b60e28c512f6a6fc0baf5714d58e7eaadbb.1418856405.git.luto@amacapital.netSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      [lizf: Backported to 3.4: adjust filename]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      54c93a65
    • Andy Lutomirski's avatar
      x86/tls: Disallow unusual TLS segments · 3a0dc113
      Andy Lutomirski authored
      commit 0e58af4e upstream.
      
      Users have no business installing custom code segments into the
      GDT, and segments that are not present but are otherwise valid
      are a historical source of interesting attacks.
      
      For completeness, block attempts to set the L bit.  (Prior to
      this patch, the L bit would have been silently dropped.)
      
      This is an ABI break.  I've checked glibc, musl, and Wine, and
      none of them look like they'll have any trouble.
      
      Note to stable maintainers: this is a hardening patch that fixes
      no known bugs.  Given the possibility of ABI issues, this
      probably shouldn't be backported quickly.
      Signed-off-by: default avatarAndy Lutomirski <luto@amacapital.net>
      Acked-by: default avatarH. Peter Anvin <hpa@zytor.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: security@kernel.org <security@kernel.org>
      Cc: Willy Tarreau <w@1wt.eu>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      3a0dc113
    • Thomas Gleixner's avatar
      genirq: Prevent proc race against freeing of irq descriptors · 0d030473
      Thomas Gleixner authored
      commit c291ee62 upstream.
      
      Since the rework of the sparse interrupt code to actually free the
      unused interrupt descriptors there exists a race between the /proc
      interfaces to the irq subsystem and the code which frees the interrupt
      descriptor.
      
      CPU0				CPU1
      				show_interrupts()
      				  desc = irq_to_desc(X);
      free_desc(desc)
        remove_from_radix_tree();
        kfree(desc);
      				  raw_spinlock_irq(&desc->lock);
      
      /proc/interrupts is the only interface which can actively corrupt
      kernel memory via the lock access. /proc/stat can only read from freed
      memory. Extremly hard to trigger, but possible.
      
      The interfaces in /proc/irq/N/ are not affected by this because the
      removal of the proc file is serialized in procfs against concurrent
      readers/writers. The removal happens before the descriptor is freed.
      
      For architectures which have CONFIG_SPARSE_IRQ=n this is a non issue
      as the descriptor is never freed. It's merely cleared out with the irq
      descriptor lock held. So any concurrent proc access will either see
      the old correct value or the cleared out ones.
      
      Protect the lookup and access to the irq descriptor in
      show_interrupts() with the sparse_irq_lock.
      
      Provide kstat_irqs_usr() which is protecting the lookup and access
      with sparse_irq_lock and switch /proc/stat to use it.
      
      Document the existing kstat_irqs interfaces so it's clear that the
      caller needs to take care about protection. The users of these
      interfaces are either not affected due to SPARSE_IRQ=n or already
      protected against removal.
      
      Fixes: 1f5a5b87 "genirq: Implement a sane sparse_irq allocator"
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      [lizf: Backported to 3.4:
       - define kstat_irqs() for CONFIG_GENERIC_HARDIRQS
       - add ifdef/endif CONFIG_SPARSE_IRQ]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      0d030473
    • Andy Lutomirski's avatar
      x86_64, switch_to(): Load TLS descriptors before switching DS and ES · 81cc271d
      Andy Lutomirski authored
      commit f647d7c1 upstream.
      
      Otherwise, if buggy user code points DS or ES into the TLS
      array, they would be corrupted after a context switch.
      
      This also significantly improves the comments and documents some
      gotchas in the code.
      
      Before this patch, the both tests below failed.  With this
      patch, the es test passes, although the gsbase test still fails.
      
       ----- begin es test -----
      
      /*
       * Copyright (c) 2014 Andy Lutomirski
       * GPL v2
       */
      
      static unsigned short GDT3(int idx)
      {
      	return (idx << 3) | 3;
      }
      
      static int create_tls(int idx, unsigned int base)
      {
      	struct user_desc desc = {
      		.entry_number    = idx,
      		.base_addr       = base,
      		.limit           = 0xfffff,
      		.seg_32bit       = 1,
      		.contents        = 0, /* Data, grow-up */
      		.read_exec_only  = 0,
      		.limit_in_pages  = 1,
      		.seg_not_present = 0,
      		.useable         = 0,
      	};
      
      	if (syscall(SYS_set_thread_area, &desc) != 0)
      		err(1, "set_thread_area");
      
      	return desc.entry_number;
      }
      
      int main()
      {
      	int idx = create_tls(-1, 0);
      	printf("Allocated GDT index %d\n", idx);
      
      	unsigned short orig_es;
      	asm volatile ("mov %%es,%0" : "=rm" (orig_es));
      
      	int errors = 0;
      	int total = 1000;
      	for (int i = 0; i < total; i++) {
      		asm volatile ("mov %0,%%es" : : "rm" (GDT3(idx)));
      		usleep(100);
      
      		unsigned short es;
      		asm volatile ("mov %%es,%0" : "=rm" (es));
      		asm volatile ("mov %0,%%es" : : "rm" (orig_es));
      		if (es != GDT3(idx)) {
      			if (errors == 0)
      				printf("[FAIL]\tES changed from 0x%hx to 0x%hx\n",
      				       GDT3(idx), es);
      			errors++;
      		}
      	}
      
      	if (errors) {
      		printf("[FAIL]\tES was corrupted %d/%d times\n", errors, total);
      		return 1;
      	} else {
      		printf("[OK]\tES was preserved\n");
      		return 0;
      	}
      }
      
       ----- end es test -----
      
       ----- begin gsbase test -----
      
      /*
       * gsbase.c, a gsbase test
       * Copyright (c) 2014 Andy Lutomirski
       * GPL v2
       */
      
      static unsigned char *testptr, *testptr2;
      
      static unsigned char read_gs_testvals(void)
      {
      	unsigned char ret;
      	asm volatile ("movb %%gs:%1, %0" : "=r" (ret) : "m" (*testptr));
      	return ret;
      }
      
      int main()
      {
      	int errors = 0;
      
      	testptr = mmap((void *)0x200000000UL, 1, PROT_READ | PROT_WRITE,
      		       MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0);
      	if (testptr == MAP_FAILED)
      		err(1, "mmap");
      
      	testptr2 = mmap((void *)0x300000000UL, 1, PROT_READ | PROT_WRITE,
      		       MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0);
      	if (testptr2 == MAP_FAILED)
      		err(1, "mmap");
      
      	*testptr = 0;
      	*testptr2 = 1;
      
      	if (syscall(SYS_arch_prctl, ARCH_SET_GS,
      		    (unsigned long)testptr2 - (unsigned long)testptr) != 0)
      		err(1, "ARCH_SET_GS");
      
      	usleep(100);
      
      	if (read_gs_testvals() == 1) {
      		printf("[OK]\tARCH_SET_GS worked\n");
      	} else {
      		printf("[FAIL]\tARCH_SET_GS failed\n");
      		errors++;
      	}
      
      	asm volatile ("mov %0,%%gs" : : "r" (0));
      
      	if (read_gs_testvals() == 0) {
      		printf("[OK]\tWriting 0 to gs worked\n");
      	} else {
      		printf("[FAIL]\tWriting 0 to gs failed\n");
      		errors++;
      	}
      
      	usleep(100);
      
      	if (read_gs_testvals() == 0) {
      		printf("[OK]\tgsbase is still zero\n");
      	} else {
      		printf("[FAIL]\tgsbase was corrupted\n");
      		errors++;
      	}
      
      	return errors == 0 ? 0 : 1;
      }
      
       ----- end gsbase test -----
      Signed-off-by: default avatarAndy Lutomirski <luto@amacapital.net>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/509d27c9fec78217691c3dad91cec87e1006b34a.1418075657.git.luto@amacapital.netSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      81cc271d
    • Jan Kara's avatar
      ncpfs: return proper error from NCP_IOC_SETROOT ioctl · 0b3edd6b
      Jan Kara authored
      commit a682e9c2 upstream.
      
      If some error happens in NCP_IOC_SETROOT ioctl, the appropriate error
      return value is then (in most cases) just overwritten before we return.
      This can result in reporting success to userspace although error happened.
      
      This bug was introduced by commit 2e54eb96 ("BKL: Remove BKL from
      ncpfs").  Propagate the errors correctly.
      
      Coverity id: 1226925.
      
      Fixes: 2e54eb96 ("BKL: Remove BKL from ncpfs")
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Cc: Petr Vandrovec <petr@vandrovec.name>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      0b3edd6b
    • Filipe Manana's avatar
      Btrfs: fix fs corruption on transaction abort if device supports discard · 11b814e3
      Filipe Manana authored
      commit 678886bd upstream.
      
      When we abort a transaction we iterate over all the ranges marked as dirty
      in fs_info->freed_extents[0] and fs_info->freed_extents[1], clear them
      from those trees, add them back (unpin) to the free space caches and, if
      the fs was mounted with "-o discard", perform a discard on those regions.
      Also, after adding the regions to the free space caches, a fitrim ioctl call
      can see those ranges in a block group's free space cache and perform a discard
      on the ranges, so the same issue can happen without "-o discard" as well.
      
      This causes corruption, affecting one or multiple btree nodes (in the worst
      case leaving the fs unmountable) because some of those ranges (the ones in
      the fs_info->pinned_extents tree) correspond to btree nodes/leafs that are
      referred by the last committed super block - breaking the rule that anything
      that was committed by a transaction is untouched until the next transaction
      commits successfully.
      
      I ran into this while running in a loop (for several hours) the fstest that
      I recently submitted:
      
        [PATCH] fstests: add btrfs test to stress chunk allocation/removal and fstrim
      
      The corruption always happened when a transaction aborted and then fsck complained
      like this:
      
         _check_btrfs_filesystem: filesystem on /dev/sdc is inconsistent
         *** fsck.btrfs output ***
         Check tree block failed, want=94945280, have=0
         Check tree block failed, want=94945280, have=0
         Check tree block failed, want=94945280, have=0
         Check tree block failed, want=94945280, have=0
         Check tree block failed, want=94945280, have=0
         read block failed check_tree_block
         Couldn't open file system
      
      In this case 94945280 corresponded to the root of a tree.
      Using frace what I observed was the following sequence of steps happened:
      
         1) transaction N started, fs_info->pinned_extents pointed to
            fs_info->freed_extents[0];
      
         2) node/eb 94945280 is created;
      
         3) eb is persisted to disk;
      
         4) transaction N commit starts, fs_info->pinned_extents now points to
            fs_info->freed_extents[1], and transaction N completes;
      
         5) transaction N + 1 starts;
      
         6) eb is COWed, and btrfs_free_tree_block() called for this eb;
      
         7) eb range (94945280 to 94945280 + 16Kb) is added to
            fs_info->pinned_extents (fs_info->freed_extents[1]);
      
         8) Something goes wrong in transaction N + 1, like hitting ENOSPC
            for example, and the transaction is aborted, turning the fs into
            readonly mode. The stack trace I got for example:
      
            [112065.253935]  [<ffffffff8140c7b6>] dump_stack+0x4d/0x66
            [112065.254271]  [<ffffffff81042984>] warn_slowpath_common+0x7f/0x98
            [112065.254567]  [<ffffffffa0325990>] ? __btrfs_abort_transaction+0x50/0x10b [btrfs]
            [112065.261674]  [<ffffffff810429e5>] warn_slowpath_fmt+0x48/0x50
            [112065.261922]  [<ffffffffa032949e>] ? btrfs_free_path+0x26/0x29 [btrfs]
            [112065.262211]  [<ffffffffa0325990>] __btrfs_abort_transaction+0x50/0x10b [btrfs]
            [112065.262545]  [<ffffffffa036b1d6>] btrfs_remove_chunk+0x537/0x58b [btrfs]
            [112065.262771]  [<ffffffffa033840f>] btrfs_delete_unused_bgs+0x1de/0x21b [btrfs]
            [112065.263105]  [<ffffffffa0343106>] cleaner_kthread+0x100/0x12f [btrfs]
            (...)
            [112065.264493] ---[ end trace dd7903a975a31a08 ]---
            [112065.264673] BTRFS: error (device sdc) in btrfs_remove_chunk:2625: errno=-28 No space left
            [112065.264997] BTRFS info (device sdc): forced readonly
      
         9) The clear kthread sees that the BTRFS_FS_STATE_ERROR bit is set in
            fs_info->fs_state and calls btrfs_cleanup_transaction(), which in
            turn calls btrfs_destroy_pinned_extent();
      
         10) Then btrfs_destroy_pinned_extent() iterates over all the ranges
             marked as dirty in fs_info->freed_extents[], and for each one
             it calls discard, if the fs was mounted with "-o discard", and
             adds the range to the free space cache of the respective block
             group;
      
         11) btrfs_trim_block_group(), invoked from the fitrim ioctl code path,
             sees the free space entries and performs a discard;
      
         12) After an umount and mount (or fsck), our eb's location on disk was full
             of zeroes, and it should have been untouched, because it was marked as
             dirty in the fs_info->pinned_extents tree, and therefore used by the
             trees that the last committed superblock points to.
      
      Fix this by not performing a discard and not adding the ranges to the free space
      caches - it's useless from this point since the fs is now in readonly mode and
      we won't write free space caches to disk anymore (otherwise we would leak space)
      nor any new superblock. By not adding the ranges to the free space caches, it
      prevents other code paths from allocating that space and write to it as well,
      therefore being safer and simpler.
      
      This isn't a new problem, as it's been present since 2011 (git commit
      acce952b).
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      11b814e3
    • Takashi Iwai's avatar
      KEYS: Fix stale key registration at error path · 187c38d0
      Takashi Iwai authored
      commit b26bdde5 upstream.
      
      When loading encrypted-keys module, if the last check of
      aes_get_sizes() in init_encrypted() fails, the driver just returns an
      error without unregistering its key type.  This results in the stale
      entry in the list.  In addition to memory leaks, this leads to a kernel
      crash when registering a new key type later.
      
      This patch fixes the problem by swapping the calls of aes_get_sizes()
      and register_key_type(), and releasing resources properly at the error
      paths.
      
      Bugzilla: https://bugzilla.opensuse.org/show_bug.cgi?id=908163Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarMimi Zohar <zohar@linux.vnet.ibm.com>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      187c38d0
    • Takashi Iwai's avatar
      ALSA: usb-audio: Don't resubmit pending URBs at MIDI error recovery · db84e0af
      Takashi Iwai authored
      commit 66139a48 upstream.
      
      In snd_usbmidi_error_timer(), the driver tries to resubmit MIDI input
      URBs to reactivate the MIDI stream, but this causes the error when
      some of URBs are still pending like:
      
       WARNING: CPU: 0 PID: 0 at ../drivers/usb/core/urb.c:339 usb_submit_urb+0x5f/0x70()
       URB ef705c40 submitted while active
       CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.6-2-desktop #1
       Hardware name: FOXCONN TPS01/TPS01, BIOS 080015  03/23/2010
        c0984bfa f4009ed4 c078deaf f4009ee4 c024c884 c09a135c f4009f00 00000000
        c0984bfa 00000153 c061ac4f c061ac4f 00000009 00000001 ef705c40 e854d1c0
        f4009eec c024c8d3 00000009 f4009ee4 c09a135c f4009f00 f4009f04 c061ac4f
       Call Trace:
        [<c0205df6>] try_stack_unwind+0x156/0x170
        [<c020482a>] dump_trace+0x5a/0x1b0
        [<c0205e56>] show_trace_log_lvl+0x46/0x50
        [<c02049d1>] show_stack_log_lvl+0x51/0xe0
        [<c0205eb7>] show_stack+0x27/0x50
        [<c078deaf>] dump_stack+0x45/0x65
        [<c024c884>] warn_slowpath_common+0x84/0xa0
        [<c024c8d3>] warn_slowpath_fmt+0x33/0x40
        [<c061ac4f>] usb_submit_urb+0x5f/0x70
        [<f7974104>] snd_usbmidi_submit_urb+0x14/0x60 [snd_usbmidi_lib]
        [<f797483a>] snd_usbmidi_error_timer+0x6a/0xa0 [snd_usbmidi_lib]
        [<c02570c0>] call_timer_fn+0x30/0x130
        [<c0257442>] run_timer_softirq+0x1c2/0x260
        [<c0251493>] __do_softirq+0xc3/0x270
        [<c0204732>] do_softirq_own_stack+0x22/0x30
        [<c025186d>] irq_exit+0x8d/0xa0
        [<c0795228>] smp_apic_timer_interrupt+0x38/0x50
        [<c0794a3c>] apic_timer_interrupt+0x34/0x3c
        [<c0673d9e>] cpuidle_enter_state+0x3e/0xd0
        [<c028bb8d>] cpu_idle_loop+0x29d/0x3e0
        [<c028bd23>] cpu_startup_entry+0x53/0x60
        [<c0bfac1e>] start_kernel+0x415/0x41a
      
      For avoiding these errors, check the pending URBs and skip
      resubmitting such ones.
      Reported-and-tested-by: default avatarStefan Seyfried <stefan.seyfried@googlemail.com>
      Acked-by: default avatarClemens Ladisch <clemens@ladisch.de>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      db84e0af
    • Stephane Grosjean's avatar
      can: peak_usb: fix cleanup sequence order in case of error during init · 3a95358a
      Stephane Grosjean authored
      commit af35d0f1 upstream.
      
      This patch sets the correct reverse sequence order to the instructions
      set to run, when any failure occurs during the initialization steps.
      It also adds the missing unregistration call of the can device if the
      failure appears after having been registered.
      Signed-off-by: default avatarStephane Grosjean <s.grosjean@peak-system.com>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      3a95358a
    • Stephane Grosjean's avatar
      can: peak_usb: fix memset() usage · b476ea71
      Stephane Grosjean authored
      commit dc50ddcd upstream.
      
      This patchs fixes a misplaced call to memset() that fills the request
      buffer with 0. The problem was with sending PCAN_USBPRO_REQ_FCT
      requests, the content set by the caller was thus lost.
      
      With this patch, the memory area is zeroed only when requesting info
      from the device.
      Signed-off-by: default avatarStephane Grosjean <s.grosjean@peak-system.com>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      b476ea71
    • Alex Deucher's avatar
      drm/radeon: check the right ring in radeon_evict_flags() · b3995df2
      Alex Deucher authored
      commit 5e5c21ca upstream.
      
      Check the that ring we are using for copies is functional
      rather than the GFX ring.  On newer asics we use the DMA
      ring for bo moves.
      Reviewed-by: default avatarChristian König <christian.koenig@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      b3995df2
    • Dominique Leuenberger's avatar
      hp_accel: Add support for HP ZBook 15 · 664c3095
      Dominique Leuenberger authored
      commit 6583659e upstream.
      
      HP ZBook 15 laptop needs a non-standard mapping (x_inverted).
      
      BugLink: http://bugzilla.opensuse.org/show_bug.cgi?id=905329Signed-off-by: default avatarDominique Leuenberger <dimstar@opensuse.org>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarDarren Hart <dvhart@linux.intel.com>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      664c3095
    • Thomas Hellstrom's avatar
      drm/vmwgfx: Fix fence event code · 6fe79f9e
      Thomas Hellstrom authored
      commit 89669e7a upstream.
      
      The commit "vmwgfx: Rework fence event action" introduced a number of bugs
      that are fixed with this commit:
      
      a) A forgotten return stateemnt.
      b) An if statement with identical branches.
      Reported-by: default avatarRob Clark <robdclark@gmail.com>
      Signed-off-by: default avatarThomas Hellstrom <thellstrom@vmware.com>
      Reviewed-by: default avatarJakob Bornecrantz <jakob@vmware.com>
      Reviewed-by: default avatarSinclair Yeh <syeh@vmware.com>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      6fe79f9e
    • Thomas Hellstrom's avatar
      drm/vmwgfx: Don't use memory accounting for kernel-side fence objects · cf1d72fd
      Thomas Hellstrom authored
      commit 1f563a6a upstream.
      
      Kernel side fence objects are used when unbinding resources and may thus be
      created as part of a memory reclaim operation. This might trigger recursive
      memory reclaims and result in the kernel running out of stack space.
      
      So a simple way out is to avoid accounting of these fence objects.
      In principle this is OK since while user-space can trigger the creation of
      such objects, it can't really hold on to them. However, their lifetime is
      quite long, so some form of accounting should perhaps be implemented in the
      future.
      
      Fixes kernel crashes when running, for example viewperf11 ensight-04 test 3
      with low system memory settings.
      Signed-off-by: default avatarThomas Hellstrom <thellstrom@vmware.com>
      Reviewed-by: default avatarJakob Bornecrantz <jakob@vmware.com>
      Reviewed-by: default avatarSinclair Yeh <syeh@vmware.com>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      cf1d72fd
    • Jiang Liu's avatar
      iommu/vt-d: Fix an off-by-one bug in __domain_mapping() · b21fe28e
      Jiang Liu authored
      commit cc4f14aa upstream.
      
      There's an off-by-one bug in function __domain_mapping(), which may
      trigger the BUG_ON(nr_pages < lvl_pages) when
      	(nr_pages + 1) & superpage_mask == 0
      
      The issue was introduced by commit 9051aa02 "intel-iommu: Combine
      domain_pfn_mapping() and domain_sg_mapping()", which sets sg_res to
      "nr_pages + 1" to avoid some of the 'sg_res==0' code paths.
      
      It's safe to remove extra "+1" because sg_res is only used to calculate
      page size now.
      Reported-And-Tested-by: default avatarSudeep Dutt <sudeep.dutt@intel.com>
      Signed-off-by: default avatarJiang Liu <jiang.liu@linux.intel.com>
      Acked-By: default avatarDavid Woodhouse <David.Woodhouse@intel.com>
      Signed-off-by: default avatarJoerg Roedel <jroedel@suse.de>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      b21fe28e
    • Felix Fietkau's avatar
      ath5k: fix hardware queue index assignment · 93083078
      Felix Fietkau authored
      commit 9e4982f6 upstream.
      
      Like with ath9k, ath5k queues also need to be ordered by priority.
      queue_info->tqi_subtype already contains the correct index, so use it
      instead of relying on the order of ath5k_hw_setup_tx_queue calls.
      Signed-off-by: default avatarFelix Fietkau <nbd@openwrt.org>
      Signed-off-by: default avatarJohn W. Linville <linville@tuxdriver.com>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      93083078
    • Felix Fietkau's avatar
      ath9k: fix BE/BK queue order · 82e2db27
      Felix Fietkau authored
      commit 78063d81 upstream.
      
      Hardware queues are ordered by priority. Use queue index 0 for BK, which
      has lower priority than BE.
      Signed-off-by: default avatarFelix Fietkau <nbd@openwrt.org>
      Signed-off-by: default avatarJohn W. Linville <linville@tuxdriver.com>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      82e2db27
    • Felix Fietkau's avatar
      ath9k_hw: fix hardware queue allocation · 37d2cb4f
      Felix Fietkau authored
      commit ad8fdccf upstream.
      
      The driver passes the desired hardware queue index for a WMM data queue
      in qinfo->tqi_subtype. This was ignored in ath9k_hw_setuptxqueue, which
      instead relied on the order in which the function is called.
      Reported-by: default avatarHubert Feurstein <h.feurstein@gmail.com>
      Signed-off-by: default avatarFelix Fietkau <nbd@openwrt.org>
      Signed-off-by: default avatarJohn W. Linville <linville@tuxdriver.com>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      37d2cb4f
    • Michael Halcrow's avatar
      eCryptfs: Remove buggy and unnecessary write in file name decode routine · 04fb28bc
      Michael Halcrow authored
      commit 94208064 upstream.
      
      Dmitry Chernenkov used KASAN to discover that eCryptfs writes past the
      end of the allocated buffer during encrypted filename decoding. This
      fix corrects the issue by getting rid of the unnecessary 0 write when
      the current bit offset is 2.
      Signed-off-by: default avatarMichael Halcrow <mhalcrow@google.com>
      Reported-by: default avatarDmitry Chernenkov <dmitryc@google.com>
      Suggested-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarTyler Hicks <tyhicks@canonical.com>
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      04fb28bc
    • Robert Baldyga's avatar
      serial: samsung: wait for transfer completion before clock disable · e660b2f2
      Robert Baldyga authored
      commit 1ff383a4 upstream.
      
      This patch adds waiting until transmit buffer and shifter will be empty
      before clock disabling.
      
      Without this fix it's possible to have clock disabled while data was
      not transmited yet, which causes unproper state of TX line and problems
      in following data transfers.
      Signed-off-by: default avatarRobert Baldyga <r.baldyga@samsung.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      [lizf: Backported to 3.4: adjust context]
      Signed-off-by: default avatarZefan Li <lizefan@huawei.com>
      e660b2f2