1. 16 Aug, 2016 40 commits
    • mm: memcontrol: fix cgroup creation failure after many small jobs · 8627c775
      Johannes Weiner authored
      commit 73f576c0 upstream.
      
      The memory controller has quite a bit of state that usually outlives the
      cgroup and pins its CSS until said state disappears.  At the same time
      it imposes a 16-bit limit on the CSS ID space to economically store IDs
      in the wild.  Consequently, when we use cgroups to contain frequent but
      small and short-lived jobs that leave behind some page cache, we quickly
      run into the 64k limitations of outstanding CSSs.  Creating a new cgroup
      fails with -ENOSPC while there are only a few, or even no user-visible
      cgroups in existence.
      
      Although pinning CSSs past cgroup removal is common, there are only two
      instances that actually need an ID after a cgroup is deleted: cache
      shadow entries and swapout records.
      
      Cache shadow entries reference the ID weakly and can deal with the CSS
      having disappeared when it's looked up later.  They pose no hurdle.
      
      Swap-out records do need to pin the css to hierarchically attribute
      swapins after the cgroup has been deleted; though the only pages that
      remain swapped out after offlining are tmpfs/shmem pages.  And those
      references are under the user's control, so they are manageable.
      
      This patch introduces a private 16-bit memcg ID and switches swap and
      cache shadow entries over to using that.  This ID can then be recycled
      after offlining when the CSS remains pinned only by objects that don't
      specifically need it.
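
The effect of recycling a private ID at offline time can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the `toy_*` names are hypothetical, the pool is a plain array used as a stack (the commit notes mention an IDR, which is not modeled here), and "pinned" simply means the ID is never returned to the pool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_ID_MAX 65535u	/* largest value a 16-bit ID can hold */

/* Hypothetical pool of 16-bit IDs; ID 0 is treated as reserved. */
struct toy_id_pool {
	uint16_t free_ids[TOY_ID_MAX];	/* stack of IDs not handed out */
	uint32_t nr_free;
};

static void toy_id_pool_init(struct toy_id_pool *pool)
{
	/* Push 65535..1 so that ID 1 is handed out first. */
	pool->nr_free = 0;
	for (uint32_t id = TOY_ID_MAX; id >= 1; id--)
		pool->free_ids[pool->nr_free++] = (uint16_t)id;
}

/* Return a free ID, or -1 when the ID space is exhausted
 * (the condition reported as -ENOSPC in the text above). */
static int toy_id_alloc(struct toy_id_pool *pool)
{
	if (pool->nr_free == 0)
		return -1;
	return pool->free_ids[--pool->nr_free];
}

/* Give an ID back for reuse. Double frees are not detected. */
static void toy_id_free(struct toy_id_pool *pool, int id)
{
	if (id > 0 && (uint32_t)id <= TOY_ID_MAX && pool->nr_free < TOY_ID_MAX)
		pool->free_ids[pool->nr_free++] = (uint16_t)id;
}

/* Create and remove `jobs` short-lived groups. With recycle_on_offline
 * false, the ID stays pinned after removal; with true, it is released
 * when the group goes away. Returns the number of successful creations. */
static unsigned toy_run_jobs(struct toy_id_pool *pool, unsigned jobs,
			     bool recycle_on_offline)
{
	unsigned created = 0;

	for (unsigned i = 0; i < jobs; i++) {
		int id = toy_id_alloc(pool);

		if (id < 0)
			break;	/* ID space exhausted */
		created++;
		if (recycle_on_offline)
			toy_id_free(pool, id);
	}
	return created;
}
```

In this model, a run of 128000 jobs stops after 65535 creations when IDs stay pinned, and completes when IDs are recycled.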
      
      This script demonstrates the problem by faulting one cache page in a new
      cgroup and deleting it again:
      
        set -e
        mkdir -p pages
        for x in `seq 128000`; do
          [ $((x % 1000)) -eq 0 ] && echo $x
          mkdir /cgroup/foo
          echo $$ >/cgroup/foo/cgroup.procs
          echo trex >pages/$x
          echo $$ >/cgroup/cgroup.procs
          rmdir /cgroup/foo
        done
      
      When run on an unpatched kernel, we eventually run out of possible IDs
      even though there are no visible cgroups:
      
        [root@ham ~]# ./cssidstress.sh
        [...]
        65000
        mkdir: cannot create directory '/cgroup/foo': No space left on device
      
      After this patch, the IDs get released upon cgroup destruction and the
      cache and css objects get released once memory reclaim kicks in.
      
      [hannes@cmpxchg.org: init the IDR]
        Link: http://lkml.kernel.org/r/20160621154601.GA22431@cmpxchg.org
      Fixes: b2052564 ("mm: memcontrol: continue cache reclaim from offlined groups")
      Link: http://lkml.kernel.org/r/20160617162516.GD19084@cmpxchg.org
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Reported-by: John Garcia <john.garcia@mesosphere.io>
      Reviewed-by: Vladimir Davydov <vdavydov@virtuozzo.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Cc: Nikolay Borisov <kernel@kyup.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ext4: fix reference counting bug on block allocation error · 3a22cf0c
      Vegard Nossum authored
      commit 554a5ccc upstream.
      
      If we hit this error when mounted with errors=continue or
      errors=remount-ro:
      
          EXT4-fs error (device loop0): ext4_mb_mark_diskspace_used:2940: comm ext4.exe: Allocating blocks 5090-6081 which overlap fs metadata
      
      then ext4_mb_new_blocks() will call ext4_mb_release_context() and try to
      continue. However, ext4_mb_release_context() is the wrong thing to call
      here since we are still actually using the allocation context.
      
      Instead, just error out. We could retry the allocation, but there is a
      possibility of getting stuck in an infinite loop instead, so this seems
      safer.
      
      [ Fixed up so we don't return EAGAIN to userspace. --tytso ]
      
      Fixes: 8556e8f3 ("ext4: Don't allow new groups to be added during block allocation")
      Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ext4: short-cut orphan cleanup on error · db82c747
      Vegard Nossum authored
      commit c65d5c6c upstream.
      
      If we encounter a filesystem error during orphan cleanup, we should stop.
      Otherwise, we may end up in an infinite loop where the same inode is
      processed again and again.
      
          EXT4-fs (loop0): warning: checktime reached, running e2fsck is recommended
          EXT4-fs error (device loop0): ext4_mb_generate_buddy:758: group 2, block bitmap and bg descriptor inconsistent: 6117 vs 0 free clusters
          Aborting journal on device loop0-8.
          EXT4-fs (loop0): Remounting filesystem read-only
          EXT4-fs error (device loop0) in ext4_free_blocks:4895: Journal has aborted
          EXT4-fs error (device loop0) in ext4_do_update_inode:4893: Journal has aborted
          EXT4-fs error (device loop0) in ext4_do_update_inode:4893: Journal has aborted
          EXT4-fs error (device loop0) in ext4_ext_remove_space:3068: IO failure
          EXT4-fs error (device loop0) in ext4_ext_truncate:4667: Journal has aborted
          EXT4-fs error (device loop0) in ext4_orphan_del:2927: Journal has aborted
          EXT4-fs error (device loop0) in ext4_do_update_inode:4893: Journal has aborted
          EXT4-fs (loop0): Inode 16 (00000000618192a0): orphan list check failed!
          [...]
          EXT4-fs (loop0): Inode 16 (0000000061819748): orphan list check failed!
          [...]
          EXT4-fs (loop0): Inode 16 (0000000061819bf0): orphan list check failed!
          [...]
      
      See-also: c9eb13a9 ("ext4: fix hang when processing corrupted orphaned inode list")
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ext4: validate s_reserved_gdt_blocks on mount · f8d4d52c
      Theodore Ts'o authored
      commit 5b9554dc upstream.
      
      If s_reserved_gdt_blocks is extremely large, it's possible for
      ext4_init_block_bitmap(), which is called when ext4 sets up an
      uninitialized block bitmap, to corrupt random kernel memory.  Add the
      same checks which e2fsck has --- it must never be larger than
      blocksize / sizeof(__u32) --- and then add a backup check in
      ext4_init_block_bitmap() in case the superblock gets modified after
      the file system is mounted.
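
The bound quoted above can be written as a one-line check. This is a minimal standalone sketch, not the kernel code: `toy_reserved_gdt_blocks_ok` is a hypothetical name, and `uint32_t` stands in for the kernel's `__u32`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when the reserved GDT block count respects the bound
 * quoted in the commit message: at most blocksize / sizeof(__u32). */
static bool toy_reserved_gdt_blocks_ok(uint32_t blocksize,
				       uint32_t reserved_gdt_blocks)
{
	return reserved_gdt_blocks <= blocksize / sizeof(uint32_t);
}
```

For a 4096-byte block size the limit works out to 1024, so 1024 is accepted and 1025 is rejected.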
      Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ext4: don't call ext4_should_journal_data() on the journal inode · 175f36cb
      Vegard Nossum authored
      commit 6a7fd522 upstream.
      
      If ext4_fill_super() fails early, it's possible for ext4_evict_inode()
      to call ext4_should_journal_data() before superblock options and flags
      are fully set up.  In that case, the iput() on the journal inode can
      end up causing a BUG().
      
      Work around this problem by reordering the tests so we only call
      ext4_should_journal_data() after we know it's not the journal inode.
      
      Fixes: 2d859db3 ("ext4: fix data corruption in inodes with journalled data")
      Fixes: 2b405bfa ("ext4: fix data=journal fast mount/umount hang")
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ext4: fix deadlock during page writeback · 5a7f477c
      Jan Kara authored
      commit 646caa9c upstream.
      
      Commit 06bd3c36 (ext4: fix data exposure after a crash) uncovered a
      deadlock in ext4_writepages() which was previously much harder to hit.
      After this commit xfstest generic/130 reproduces the deadlock on small
      filesystems.
      
      The problem happens when ext4_do_update_inode() sets LARGE_FILE feature
      and marks current inode handle as synchronous. That subsequently results
      in ext4_journal_stop() called from ext4_writepages() to block waiting for
      transaction commit while still holding page locks, reference to io_end,
      and some prepared bio in mpd structure each of which can possibly block
      transaction commit from completing and thus results in deadlock.
      
      Fix the problem by releasing page locks, io_end reference, and
      submitting prepared bio before calling ext4_journal_stop().
      
      [ Changed to defer the call to ext4_journal_stop() only if the handle
        is synchronous.  --tytso ]

      Reported-and-tested-by: Eryu Guan <eguan@redhat.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ext4: check for extents that wrap around · 9e38db20
      Vegard Nossum authored
      commit f70749ca upstream.
      
      An extent with lblock = 4294967295 and len = 1 will pass the
      ext4_valid_extent() test:
      
      	ext4_lblk_t last = lblock + len - 1;
      
      	if (len == 0 || lblock > last)
      		return 0;
      
      since last = 4294967295 + 1 - 1 = 4294967295. This would later trigger
      the BUG_ON(es->es_lblk + es->es_len < es->es_lblk) in ext4_es_end().
      
      We can simplify it by removing the - 1 altogether and changing the test
      to use lblock + len <= lblock, since now if len = 0, then lblock + 0 ==
      lblock and it fails, and if len > 0 then lblock + len > lblock in order
      to pass (i.e. it doesn't overflow).
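
The difference between the two tests can be checked in isolation. The sketch below is a simplified userspace model, assuming 32-bit unsigned logical block numbers (`uint32_t` standing in for `ext4_lblk_t`); the `toy_*` function names are hypothetical and the real function reads these values from the on-disk extent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Old test as quoted above: computes the last block with "- 1". */
static bool toy_valid_extent_old(uint32_t lblock, uint32_t len)
{
	uint32_t last = (uint32_t)(lblock + len - 1);

	if (len == 0 || lblock > last)
		return false;
	return true;
}

/* New test: reject when len is zero or lblock + len wraps around,
 * both of which make lblock + len <= lblock in 32-bit arithmetic. */
static bool toy_valid_extent_new(uint32_t lblock, uint32_t len)
{
	uint32_t end = (uint32_t)(lblock + len);

	return end > lblock;
}
```

With lblock = 4294967295 and len = 1, the old test accepts the extent while the new one rejects it; ordinary extents such as lblock = 100, len = 8 pass both.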
      
      Fixes: 5946d089 ("ext4: check for overlapping extents in ext4_valid_extent_entries()")
      Fixes: 2f974865 ("ext4: check for zero length extent explicitly")
      Cc: Eryu Guan <guaneryu@gmail.com>
      Signed-off-by: Phil Turnbull <phil.turnbull@oracle.com>
      Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • crypto: scatterwalk - Fix test in scatterwalk_done · 08bb036c
      Herbert Xu authored
      commit 5f070e81 upstream.
      
      When there is more data to be processed, the current test in
      scatterwalk_done may prevent us from calling pagedone even when
      we should.
      
      In particular, if we're on an SG entry spanning multiple pages
      where the last page is not a full page, we will incorrectly skip
      calling pagedone on the second last page.
      
      This patch fixes this by adding a separate test for whether we've
      reached the end of a page.

      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • crypto: gcm - Filter out async ghash if necessary · 148fbb96
      Herbert Xu authored
      commit b30bdfa8 upstream.
      
      As it is if you ask for a sync gcm you may actually end up with
      an async one because it does not filter out async implementations
      of ghash.
      
      This patch fixes this by adding the necessary filter when looking
      for ghash.

      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • fs/dcache.c: avoid soft-lockup in dput() · 92f71339
      Wei Fang authored
      commit 47be6184 upstream.
      
      We triggered a soft lockup under a stress test that opens, accesses,
      writes, and closes one file concurrently on more than five
      different CPUs:
      
      WARN: soft lockup - CPU#0 stuck for 11s! [who:30631]
      ...
      [<ffffffc0003986f8>] dput+0x100/0x298
      [<ffffffc00038c2dc>] terminate_walk+0x4c/0x60
      [<ffffffc00038f56c>] path_lookupat+0x5cc/0x7a8
      [<ffffffc00038f780>] filename_lookup+0x38/0xf0
      [<ffffffc000391180>] user_path_at_empty+0x78/0xd0
      [<ffffffc0003911f4>] user_path_at+0x1c/0x28
      [<ffffffc00037d4fc>] SyS_faccessat+0xb4/0x230
      
      The ->d_lock trylock may fail many times because of concurrent
      operations, so dput() may run for a long time.

      Fix this by replacing cpu_relax() with cond_resched().
      dput() used to be sleepable, so making it sleepable again
      should be safe.

      Signed-off-by: Wei Fang <fangwei1@huawei.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • fuse: fix wrong assignment of ->flags in fuse_send_init() · b6e0a217
      Wei Fang authored
      commit 9446385f upstream.
      
      FUSE_HAS_IOCTL_DIR should be assigned to ->flags; the existing
      assignment may be a typo.

      Signed-off-by: Wei Fang <fangwei1@huawei.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Fixes: 69fe05c9 ("fuse: add missing INIT flags")
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • fuse: fuse_flush must check mapping->flags for errors · 9ca5f11d
      Maxim Patlasov authored
      commit 9ebce595 upstream.
      
      fuse_flush() calls write_inode_now() that triggers writeback, but actual
      writeback will happen later, on fuse_sync_writes(). If an error happens,
      fuse_writepage_end() will set error bit in mapping->flags. So, we have to
      check mapping->flags after fuse_sync_writes().

      Signed-off-by: Maxim Patlasov <mpatlasov@virtuozzo.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Fixes: 4d99ff8f ("fuse: Turn writeback cache on")
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • fuse: fsync() did not return IO errors · 3d1c64d8
      Alexey Kuznetsov authored
      commit ac7f052b upstream.
      
      Due to the implementation of fuse writeback,
      filemap_write_and_wait_range() does not catch errors. We have to check
      for them directly after fuse_sync_writes().

      Signed-off-by: Alexey Kuznetsov <kuznet@virtuozzo.com>
      Signed-off-by: Maxim Patlasov <mpatlasov@virtuozzo.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Fixes: 4d99ff8f ("fuse: Turn writeback cache on")
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sysv, ipc: fix security-layer leaking · 62659f0b
      Fabian Frederick authored
      commit 9b24fef9 upstream.
      
      Commit 53dad6d3 ("ipc: fix race with LSMs") updated ipc_rcu_putref()
      to receive rcu freeing function but used generic ipc_rcu_free() instead
      of msg_rcu_free() which does security cleaning.
      
      Running LTP msgsnd06 with kmemleak gives the following:
      
        cat /sys/kernel/debug/kmemleak
      
        unreferenced object 0xffff88003c0a11f8 (size 8):
          comm "msgsnd06", pid 1645, jiffies 4294672526 (age 6.549s)
          hex dump (first 8 bytes):
            1b 00 00 00 01 00 00 00                          ........
          backtrace:
            kmemleak_alloc+0x23/0x40
            kmem_cache_alloc_trace+0xe1/0x180
            selinux_msg_queue_alloc_security+0x3f/0xd0
            security_msg_queue_alloc+0x2e/0x40
            newque+0x4e/0x150
            ipcget+0x159/0x1b0
            SyS_msgget+0x39/0x40
            entry_SYSCALL_64_fastpath+0x13/0x8f
      
      Manfred Spraul suggested fixing sem.c as well, and Davidlohr Bueso
      suggested using ipc_rcu_free only in case of security allocation
      failure in newary().
      
      Fixes: 53dad6d3 ("ipc: fix race with LSMs")
      Link: http://lkml.kernel.org/r/1470083552-22966-1-git-send-email-fabf@skynet.be
      Signed-off-by: Fabian Frederick <fabf@skynet.be>
      Cc: Davidlohr Bueso <dbueso@suse.de>
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • block: fix use-after-free in seq file · 9a95c0cf
      Vegard Nossum authored
      commit 77da1605 upstream.
      
      I got a KASAN report of use-after-free:
      
          ==================================================================
          BUG: KASAN: use-after-free in klist_iter_exit+0x61/0x70 at addr ffff8800b6581508
          Read of size 8 by task trinity-c1/315
          =============================================================================
          BUG kmalloc-32 (Not tainted): kasan: bad access detected
          -----------------------------------------------------------------------------
      
          Disabling lock debugging due to kernel taint
          INFO: Allocated in disk_seqf_start+0x66/0x110 age=144 cpu=1 pid=315
                  ___slab_alloc+0x4f1/0x520
                  __slab_alloc.isra.58+0x56/0x80
                  kmem_cache_alloc_trace+0x260/0x2a0
                  disk_seqf_start+0x66/0x110
                  traverse+0x176/0x860
                  seq_read+0x7e3/0x11a0
                  proc_reg_read+0xbc/0x180
                  do_loop_readv_writev+0x134/0x210
                  do_readv_writev+0x565/0x660
                  vfs_readv+0x67/0xa0
                  do_preadv+0x126/0x170
                  SyS_preadv+0xc/0x10
                  do_syscall_64+0x1a1/0x460
                  return_from_SYSCALL_64+0x0/0x6a
          INFO: Freed in disk_seqf_stop+0x42/0x50 age=160 cpu=1 pid=315
                  __slab_free+0x17a/0x2c0
                  kfree+0x20a/0x220
                  disk_seqf_stop+0x42/0x50
                  traverse+0x3b5/0x860
                  seq_read+0x7e3/0x11a0
                  proc_reg_read+0xbc/0x180
                  do_loop_readv_writev+0x134/0x210
                  do_readv_writev+0x565/0x660
                  vfs_readv+0x67/0xa0
                  do_preadv+0x126/0x170
                  SyS_preadv+0xc/0x10
                  do_syscall_64+0x1a1/0x460
                  return_from_SYSCALL_64+0x0/0x6a
      
          CPU: 1 PID: 315 Comm: trinity-c1 Tainted: G    B           4.7.0+ #62
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
           ffffea0002d96000 ffff880119b9f918 ffffffff81d6ce81 ffff88011a804480
           ffff8800b6581500 ffff880119b9f948 ffffffff8146c7bd ffff88011a804480
           ffffea0002d96000 ffff8800b6581500 fffffffffffffff4 ffff880119b9f970
          Call Trace:
           [<ffffffff81d6ce81>] dump_stack+0x65/0x84
           [<ffffffff8146c7bd>] print_trailer+0x10d/0x1a0
           [<ffffffff814704ff>] object_err+0x2f/0x40
           [<ffffffff814754d1>] kasan_report_error+0x221/0x520
           [<ffffffff8147590e>] __asan_report_load8_noabort+0x3e/0x40
           [<ffffffff83888161>] klist_iter_exit+0x61/0x70
           [<ffffffff82404389>] class_dev_iter_exit+0x9/0x10
           [<ffffffff81d2e8ea>] disk_seqf_stop+0x3a/0x50
           [<ffffffff8151f812>] seq_read+0x4b2/0x11a0
           [<ffffffff815f8fdc>] proc_reg_read+0xbc/0x180
           [<ffffffff814b24e4>] do_loop_readv_writev+0x134/0x210
           [<ffffffff814b4c45>] do_readv_writev+0x565/0x660
           [<ffffffff814b8a17>] vfs_readv+0x67/0xa0
           [<ffffffff814b8de6>] do_preadv+0x126/0x170
           [<ffffffff814b92ec>] SyS_preadv+0xc/0x10
      
      This problem can occur in the following situation:
      
      open()
       - pread()
          - .seq_start()
             - iter = kmalloc() // succeeds
             - seqf->private = iter
          - .seq_stop()
             - kfree(seqf->private)
       - pread()
          - .seq_start()
             - iter = kmalloc() // fails
          - .seq_stop()
             - class_dev_iter_exit(seqf->private) // boom! old pointer
      
      As the comment in disk_seqf_stop() says, stop is called even if start
      failed, so we need to reinitialise the private pointer to NULL when seq
      iteration stops.
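
The sequence above can be replayed in a small userspace model. This is a sketch under stated assumptions, not the block-layer code: the `toy_*` types and functions are hypothetical, the iterator is supplied by the caller instead of coming from kmalloc(), and freeing is modeled by a flag so that the stale access can be observed without undefined behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical iterator; `freed` models whether kfree() already ran. */
struct toy_iter {
	bool freed;
};

/* Hypothetical stand-in for the seq file: only the private pointer. */
struct toy_seq {
	struct toy_iter *private;
};

/* start: `fresh` models the result of kmalloc(); NULL means the
 * allocation failed, in which case ->private is left untouched. */
static bool toy_start(struct toy_seq *seq, struct toy_iter *fresh)
{
	if (!fresh)
		return false;
	fresh->freed = false;
	seq->private = fresh;
	return true;
}

/* stop: called even if start failed. Returns true if it touched an
 * iterator that was already freed (the use-after-free in this model).
 * reset_private selects the behaviour described as the fix. */
static bool toy_stop(struct toy_seq *seq, bool reset_private)
{
	struct toy_iter *iter = seq->private;
	bool stale = false;

	if (iter) {
		if (iter->freed)
			stale = true;	/* old pointer from a previous read */
		iter->freed = true;	/* models kfree() */
		if (reset_private)
			seq->private = NULL;
	}
	return stale;
}
```

Replaying "start succeeds, stop, start fails, stop" reports a stale access only when the private pointer is not reset in stop.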
      
      An alternative would be to set the private pointer to NULL when the
      kmalloc() in disk_seqf_start() fails.

      Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/syscalls/64: Add compat_sys_keyctl for 32-bit userspace · 3cde0e74
      David Howells authored
      commit f7d66562 upstream.
      
      x86_64 needs to use compat_sys_keyctl for 32-bit userspace rather than
      calling sys_keyctl(). The latter will work in a lot of cases, thereby
      hiding the issue.

      Reported-by: Stephan Mueller <smueller@chronox.de>
      Tested-by: Stephan Mueller <smueller@chronox.de>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: keyrings@vger.kernel.org
      Cc: linux-security-module@vger.kernel.org
      Link: http://lkml.kernel.org/r/146961615805.14395.5581949237156769439.stgit@warthog.procyon.org.uk
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • drm/i915: Pretend cursor is always on for ILK-style WM calculations (v2) · 821d5e6b
      Matt Roper authored
      commit e2e407dc upstream.
      
      Due to our lack of two-step watermark programming, our driver has
      historically pretended that the cursor plane is always on for the
      purpose of watermark calculations; this helps avoid serious flickering
      when the cursor turns off/on (e.g., when the user moves the mouse
      pointer to a different screen).  That workaround was accidentally
      dropped as we started working toward atomic watermark updates.  Since we
      still aren't quite there yet with two-stage updates, we need to
      resurrect the workaround and treat the cursor as always active.
      
      v2: Tweak cursor width calculations slightly to more closely match the
          logic we used before the atomic overhaul began.  (Ville)
      
      Cc: simdev11@outlook.com
      Cc: manfred.kitzbichler@gmail.com
      Cc: drm-intel-fixes@lists.freedesktop.org
      Reported-by: simdev11@outlook.com
      Reported-by: manfred.kitzbichler@gmail.com
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93892
      Fixes: 43d59eda ("drm/i915: Eliminate usage of plane_wm_parameters from ILK-style WM code (v2)")
      Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1454479611-6804-1-git-send-email-matthew.d.roper@intel.com
      (cherry picked from commit b2435692)
      Signed-off-by: Jani Nikula <jani.nikula@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1454958328-30129-1-git-send-email-matthew.d.roper@intel.com
      Tested-by: Jay <mymailclone@t-online.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/mm/pat: Fix BUG_ON() in mmap_mem() on QEMU/i386 · fb93281f
      Toshi Kani authored
      commit 1886297c upstream.
      
      The following BUG_ON() crash was reported on QEMU/i386:
      
        kernel BUG at arch/x86/mm/physaddr.c:79!
        Call Trace:
        phys_mem_access_prot_allowed
        mmap_mem
        ? mmap_region
        mmap_region
        do_mmap
        vm_mmap_pgoff
        SyS_mmap_pgoff
        do_int80_syscall_32
        entry_INT80_32
      
      after commit:
      
        edfe63ec ("x86/mtrr: Fix Xorg crashes in Qemu sessions")
      
      PAT is now set to disabled state when MTRRs are disabled,
      which reactivates the __pa(high_memory) check in
      phys_mem_access_prot_allowed().

      When CONFIG_DEBUG_VIRTUAL is set, __pa() calls __phys_addr(),
      which in turn calls slow_virt_to_phys() for 'high_memory'.
      Because 'high_memory' is set to (the max direct mapped virt
      addr + 1), it is not a valid virtual address.  Hence,
      slow_virt_to_phys() returns 0 and hits the BUG_ON.  Using
      __pa_nodebug() instead of __pa() will fix this BUG_ON.
      
      However, this code block, originally written for Pentiums and
      earlier, is no longer adequate since a 32-bit Xen guest has
      MTRRs disabled and supports ZONE_HIGHMEM.  In this setup,
      this code sets UC attribute for accessing RAM in high memory
      range.
      
      Delete this code block as it has been unused for a long time.

      Reported-by: kernel test robot <ying.huang@linux.intel.com>
      Reviewed-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: xen-devel@lists.xenproject.org
      Link: http://lkml.kernel.org/r/1460403360-25441-1-git-send-email-toshi.kani@hpe.com
      Link: https://lkml.org/lkml/2016/4/1/608
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/pat: Document the PAT initialization sequence · e270fdc5
      Toshi Kani authored
      commit b6350c21 upstream.
      
      Update PAT documentation to describe how PAT is initialized under
      various configurations.

      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: elliott@hpe.com
      Cc: konrad.wilk@oracle.com
      Cc: paul.gortmaker@windriver.com
      Cc: xen-devel@lists.xenproject.org
      Link: http://lkml.kernel.org/r/1458769323-24491-8-git-send-email-toshi.kani@hpe.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/xen, pat: Remove PAT table init code from Xen · 26b340ea
      Toshi Kani authored
      commit 88ba2811 upstream.
      
      Xen supports PAT without MTRRs for its guests.  In order to
      enable WC attribute, it was necessary for xen_start_kernel()
      to call pat_init_cache_modes() to update PAT table before
      starting guest kernel.
      
      Now that the kernel initializes PAT table to the BIOS handoff
      state when MTRR is disabled, this Xen-specific PAT init code
      is no longer necessary.  Delete it from xen_start_kernel().
      
      Also change __init_cache_modes() to a static function since
      PAT table should not be tweaked by other modules.

      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Juergen Gross <jgross@suse.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: elliott@hpe.com
      Cc: paul.gortmaker@windriver.com
      Cc: xen-devel@lists.xenproject.org
      Link: http://lkml.kernel.org/r/1458769323-24491-7-git-send-email-toshi.kani@hpe.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/mtrr: Fix PAT init handling when MTRR is disabled · a23b299b
      Toshi Kani authored
      commit ad025a73 upstream.
      
      get_mtrr_state() calls pat_init() on BSP even if MTRR is disabled.
      This results in calling pat_init() on BSP only since APs do not call
      pat_init() when MTRR is disabled.  This inconsistency between BSP
      and APs leads to undefined behavior.
      
      Make BSP's calling condition to pat_init() consistent with AP's,
      mtrr_ap_init() and mtrr_aps_init().

      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: elliott@hpe.com
      Cc: konrad.wilk@oracle.com
      Cc: paul.gortmaker@windriver.com
      Cc: xen-devel@lists.xenproject.org
      Link: http://lkml.kernel.org/r/1458769323-24491-6-git-send-email-toshi.kani@hpe.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a23b299b
    • Toshi Kani's avatar
      x86/mtrr: Fix Xorg crashes in Qemu sessions · 594055cf
      Toshi Kani authored
      commit edfe63ec upstream.
      
      A Xorg failure on qemu32 was reported as a regression [1] caused by
      commit 9cd25aac ("x86/mm/pat: Emulate PAT when it is disabled").
      
      This patch fixes the Xorg crash.
      
      Negative effects of this regression were the following two failures [2]
      in Xorg on QEMU with QEMU CPU model "qemu32" (-cpu qemu32), which were
      triggered by the fact that its virtual CPU does not support MTRRs.
      
       #1. copy_process() failed in the check in reserve_pfn_range()
      
          copy_process
           copy_mm
            dup_mm
             dup_mmap
              copy_page_range
               track_pfn_copy
                reserve_pfn_range
      
       A WC map request was tracked as WC in memtype, which set a PTE as
       UC (pgprot) per __cachemode2pte_tbl[].  This led to this error in
       reserve_pfn_range() called from track_pfn_copy(), which obtained
       a pgprot from a PTE.  It converts pgprot to page_cache_mode, which
       does not necessarily result in the original page_cache_mode since
       __cachemode2pte_tbl[] redirects multiple types to UC.
      
       #2. error path in copy_process() then hit WARN_ON_ONCE in
           untrack_pfn().
      
           x86/PAT: Xorg:509 map pfn expected mapping type uncached-
           minus for [mem 0xfd000000-0xfdffffff], got write-combining
            Call Trace:
           dump_stack
           warn_slowpath_common
           ? untrack_pfn
           ? untrack_pfn
           warn_slowpath_null
           untrack_pfn
           ? __kunmap_atomic
           unmap_single_vma
           ? pagevec_move_tail_fn
           unmap_vmas
           exit_mmap
           mmput
           copy_process.part.47
           _do_fork
           SyS_clone
           do_syscall_32_irqs_on
           entry_INT80_32
      
      These negative effects are caused by two separate bugs, and they
      can be addressed in separate patches.  Fixing the pat_init() issue
      described below addresses the root cause, and keeps Xorg from
      hitting these cases.
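
      The lossy round trip behind failure #1 can be illustrated with a small
      userspace model.  This is only a sketch: the enums, the table and the
      helper names below are hypothetical stand-ins, not the kernel's
      __cachemode2pte_tbl[] or its real encodings.  The only property taken
      from the description above is that several cache modes collapse to UC.

```c
#include <stdbool.h>

/* Hypothetical cache modes; not the kernel's enum page_cache_mode values. */
enum cache_mode { MODE_WB, MODE_WC, MODE_UC_MINUS, MODE_UC, MODE_COUNT };

/* Hypothetical PTE encodings used only by this model. */
enum pte_bits { PTE_WB, PTE_UC };

/* Model of a table that redirects several modes to UC. */
static const enum pte_bits mode_to_pte[MODE_COUNT] = {
    [MODE_WB]       = PTE_WB,
    [MODE_WC]       = PTE_UC,
    [MODE_UC_MINUS] = PTE_UC,
    [MODE_UC]       = PTE_UC,
};

/* The reverse lookup can return only one mode per PTE encoding. */
static enum cache_mode pte_to_mode(enum pte_bits pte)
{
    return pte == PTE_WB ? MODE_WB : MODE_UC;
}

/* True when converting a mode to a PTE and back yields the same mode. */
static bool round_trips(enum cache_mode mode)
{
    return pte_to_mode(mode_to_pte[mode]) == mode;
}
```

      In this model round_trips(MODE_WC) is false: a range tracked as WC
      reads back as UC once it has passed through the PTE, which is the kind
      of mismatch a type check on the copied mapping would reject.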
      
      When the CPU does not support MTRRs, MTRR does not call pat_init(),
      which leaves PAT enabled without initializing PAT.  This pat_init()
      issue is a long-standing issue, but manifested as issue #1 (and then
      hit issue #2) with the above-mentioned commit because the memtype
      now tracks cache attribute with 'page_cache_mode'.
      
      This pat_init() issue existed before the commit, but we used pgprot
      in memtype.  Hence, we did not have issue #1 before.  But a WC request
      resulted in WT in effect because the WC pgprot is actually WT when PAT
      is not initialized.  This is not how it was designed to work.  When
      PAT is properly disabled, WC is converted to UC.  The use of
      WT can result in a system crash if the target range does not support
      WT.  Fortunately, nobody ran into such an issue before.
      
      To fix this pat_init() issue, PAT code has been enhanced to provide
      the pat_disable() interface.  Call this interface when MTRRs are
      disabled.  With PAT properly disabled, PAT bypasses the memtype check,
      and avoids issue #1.
      
        [1]: https://lkml.org/lkml/2016/3/3/828
        [2]: https://lkml.org/lkml/2016/3/4/775

      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: elliott@hpe.com
      Cc: konrad.wilk@oracle.com
      Cc: paul.gortmaker@windriver.com
      Cc: xen-devel@lists.xenproject.org
      Link: http://lkml.kernel.org/r/1458769323-24491-5-git-send-email-toshi.kani@hpe.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      594055cf
    • Toshi Kani's avatar
      x86/mm/pat: Replace cpu_has_pat with boot_cpu_has() · 32c85428
      Toshi Kani authored
      commit d63dcf49 upstream.
      
      Borislav Petkov suggested:
      
       > Please use on init paths boot_cpu_has(X86_FEATURE_PAT) and on fast
       > paths static_cpu_has(X86_FEATURE_PAT). No more of that cpu_has_XXX
       > ugliness.
      
      Replace the use of cpu_has_pat on init paths with boot_cpu_has().

      Suggested-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Elliott <elliott@hpe.com>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: konrad.wilk@oracle.com
      Cc: paul.gortmaker@windriver.com
      Cc: xen-devel@lists.xenproject.org
      Link: http://lkml.kernel.org/r/1458769323-24491-4-git-send-email-toshi.kani@hpe.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      32c85428
    • Toshi Kani's avatar
      x86/mm/pat: Add pat_disable() interface · d50e8b10
      Toshi Kani authored
      commit 224bb1e5 upstream.
      
      In preparation for fixing a regression caused by:
      
        9cd25aac ("x86/mm/pat: Emulate PAT when it is disabled")
      
      ... PAT needs to provide an interface that prevents the OS from
      initializing the PAT MSR.
      
      PAT MSR initialization must be done on all CPUs using the specific
      sequence of operations defined in the Intel SDM.  This requires MTRRs
      to be enabled since pat_init() is called as part of MTRR init
      from mtrr_rendezvous_handler().
      
      Make pat_disable() the interface that prevents the OS from
      initializing the PAT MSR.  MTRR will call this interface when it
      cannot provide the SDM-defined sequence to initialize PAT.
      
      This also ensures that pat_disable() called from pat_bsp_init()
      will set the PAT table properly when the CPU does not support PAT.

      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Robert Elliott <elliott@hpe.com>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: konrad.wilk@oracle.com
      Cc: paul.gortmaker@windriver.com
      Cc: xen-devel@lists.xenproject.org
      Link: http://lkml.kernel.org/r/1458769323-24491-3-git-send-email-toshi.kani@hpe.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d50e8b10
    • Toshi Kani's avatar
      x86/mm/pat: Add support of non-default PAT MSR setting · 8f5b8210
      Toshi Kani authored
      commit 02f037d6 upstream.
      
      In preparation for fixing a regression caused by:
      
        9cd25aac ("x86/mm/pat: Emulate PAT when it is disabled")
      
      ... PAT needs to support the case where the PAT MSR is initialized
      with a non-default value.
      
      When pat_init() is called and PAT is disabled, it initializes the
      PAT table with the BIOS default value. Xen, however, sets the PAT MSR
      to a non-default value to enable WC. This causes inconsistency
      between the PAT table and the PAT MSR when PAT is disabled on Xen.
      
      Change pat_init() to handle the PAT-disabled cases properly.  Add
      init_cache_modes() to handle the two cases in which PAT is disabled.
      
       1. CPU supports PAT: Set PAT table to be consistent with PAT MSR.
       2. CPU does not support PAT: Set PAT table to be consistent with
          PWT and PCD bits in a PTE.
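
      The two cases can be sketched in userspace as follows.  This is an
      illustrative model only: the function names are hypothetical, and the
      memory-type encodings and the reset value 0x0007040600070406 are the
      architectural ones as recalled from the Intel SDM, not something this
      patch defines.

```c
#include <stdint.h>

/* PAT memory-type encodings (low three bits of each PAT MSR byte). */
enum pat_type {
    PAT_UC = 0, PAT_WC = 1, PAT_WT = 4,
    PAT_WP = 5, PAT_WB = 6, PAT_UC_MINUS = 7
};

/* Memory type held in PAT entry `index` (0..7) of a PAT MSR value. */
static unsigned pat_entry(uint64_t pat_msr, unsigned index)
{
    return (unsigned)((pat_msr >> (index * 8)) & 0x7);
}

/* Case 1: the CPU supports PAT, so the table mirrors what the MSR holds. */
static void table_from_msr(uint64_t pat_msr, unsigned table[8])
{
    for (unsigned i = 0; i < 8; i++)
        table[i] = pat_entry(pat_msr, i);
}

/* Case 2: no PAT; only PWT (index bit 0) and PCD (index bit 1) matter. */
static void table_from_pwt_pcd(unsigned table[8])
{
    static const unsigned by_pcd_pwt[4] = {
        PAT_WB, PAT_WT, PAT_UC_MINUS, PAT_UC
    };

    for (unsigned i = 0; i < 8; i++)
        table[i] = by_pcd_pwt[i & 0x3];
}
```

      With the reset value both functions produce the same table, so the
      cases only diverge when something such as a hypervisor has programmed
      the MSR differently, which is the situation described above.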
      
      Note, __init_cache_modes(), renamed from pat_init_cache_modes(),
      will be changed to a static function in a later patch.

      Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Luis R. Rodriguez <mcgrof@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Toshi Kani <toshi.kani@hp.com>
      Cc: elliott@hpe.com
      Cc: konrad.wilk@oracle.com
      Cc: paul.gortmaker@windriver.com
      Cc: xen-devel@lists.xenproject.org
      Link: http://lkml.kernel.org/r/1458769323-24491-2-git-send-email-toshi.kani@hpe.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      8f5b8210
    • Linus Torvalds's avatar
      devpts: clean up interface to pty drivers · 5c7d0f49
      Linus Torvalds authored
      commit 67245ff3 upstream.
      
      This gets rid of the horrible notion of having that
      
          struct inode *ptmx_inode
      
      be the linchpin of the interface between the pty code and devpts.
      
      By de-emphasizing the ptmx inode, a lot of things actually get cleaner,
      and we will have a much saner way forward.  In particular, this will
      allow us to associate with any particular devpts instance at open-time,
      and not be artificially tied to one particular ptmx inode.
      
      The patch itself is actually fairly straightforward, and apart from some
      locking and return path cleanups it's pretty mechanical:
      
       - the interfaces that devpts exposes all take "struct pts_fs_info *"
         instead of "struct inode *ptmx_inode" now.
      
         NOTE! The "struct pts_fs_info" thing is a completely opaque structure
         as far as the pty driver is concerned: it's still declared entirely
         internally to devpts. So the pty code can't actually access it in any
         way, just pass it as a "cookie" to the devpts code.
      
       - the "look up the pts fs info" is now a single clear operation, that
         also does the reference count increment on the pts superblock.
      
         So "devpts_add/del_ref()" is gone, and replaced by a "lookup and get
         ref" operation (devpts_get_ref(inode)), along with a "put ref" op
         (devpts_put_ref()).
      
       - the pty master "tty->driver_data" field now contains the pts_fs_info,
         not the ptmx inode.
      
       - because we don't care about the ptmx inode any more as some kind of
         base index, the ref counting can now drop the inode games - it just
         gets the ref on the superblock.
      
       - the pts_fs_info now has a back-pointer to the super_block. That's so
         that we can easily look up the information we actually need. Although
         quite often, the pts fs info was actually all we wanted, and not having
         to look it up based on some magical inode makes things more
         straightforward.
      
      In particular, now that "devpts_get_ref(inode)" operation should really
      be the *only* place we need to look up what devpts instance we're
      associated with, and we do it exactly once, at ptmx_open() time.
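
      The "opaque cookie plus get/put reference" shape described in the list
      above can be sketched in plain C.  All names here (fs_info,
      fs_info_get_ref, pty_master and so on) are hypothetical and the lookup
      is reduced to a pointer check; this is not the devpts interface, and
      the counter is not thread-safe.

```c
#include <stddef.h>

/* What the pty side sees: an opaque type and two operations. */
struct fs_info;
struct fs_info *fs_info_get_ref(struct fs_info *instance);
void fs_info_put_ref(struct fs_info *info);

/* What only the filesystem side knows. */
struct fs_info {
    int refcount;   /* stands in for the superblock reference count */
};

/* Single "look up and take a reference" operation. */
struct fs_info *fs_info_get_ref(struct fs_info *instance)
{
    if (instance == NULL)
        return NULL;
    instance->refcount++;
    return instance;
}

/* Matching "put reference" operation. */
void fs_info_put_ref(struct fs_info *info)
{
    if (info != NULL)
        info->refcount--;
}

/* Hypothetical pty-side state: stores the cookie, never looks inside. */
struct pty_master {
    struct fs_info *driver_data;
};

/* The instance is resolved exactly once, at open time. */
static int pty_open(struct pty_master *pty, struct fs_info *instance)
{
    pty->driver_data = fs_info_get_ref(instance);
    return pty->driver_data != NULL ? 0 : -1;
}

static void pty_close(struct pty_master *pty)
{
    fs_info_put_ref(pty->driver_data);
    pty->driver_data = NULL;
}
```

      Because the pty side only ever holds the pointer, the filesystem side
      stays free to change what the structure contains.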
      
      The other side of this is that one ptmx node could now be associated
      with multiple different devpts instances - you could have a single
      /dev/ptmx node, and then have multiple mount namespaces with their own
      instances of devpts mounted on /dev/pts/.  And that's all perfectly sane
      in a model where we just look up the pts instance at open time.
      
      This will eventually allow us to get rid of our odd single-vs-multiple
      pts instance model, but this patch in itself changes no semantics, only
      an internal binding model.
      
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: Peter Anvin <hpa@zytor.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Peter Hurley <peter@hurleysoftware.com>
      Cc: Serge Hallyn <serge.hallyn@ubuntu.com>
      Cc: Willy Tarreau <w@1wt.eu>
      Cc: Aurelien Jarno <aurelien@aurel32.net>
      Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
      Cc: Jann Horn <jann@thejh.net>
      Cc: Greg KH <greg@kroah.com>
      Cc: Jiri Slaby <jslaby@suse.com>
      Cc: Florian Weimer <fw@deneb.enyo.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Francesco Ruggeri <fruggeri@arista.com>
      Cc: "Herton R. Krzesinski" <herton@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      5c7d0f49
    • Theodore Ts'o's avatar
      random: strengthen input validation for RNDADDTOENTCNT · 93f84c88
      Theodore Ts'o authored
      commit 86a574de upstream.
      
      Don't allow RNDADDTOENTCNT or RNDADDENTROPY to accept a negative
      entropy value.  It doesn't make any sense to subtract from the entropy
      counter, and it can trigger a warning:
      
      random: negative entropy/overflow: pool input count -40000
      ------------[ cut here ]------------
      WARNING: CPU: 3 PID: 6828 at drivers/char/random.c:670[<      none
       >] credit_entropy_bits+0x21e/0xad0 drivers/char/random.c:670
      Modules linked in:
      CPU: 3 PID: 6828 Comm: a.out Not tainted 4.7.0-rc4+ #4
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
       ffffffff880b58e0 ffff88005dd9fcb0 ffffffff82cc838f ffffffff87158b40
       fffffbfff1016b1c 0000000000000000 0000000000000000 ffffffff87158b40
       ffffffff83283dae 0000000000000009 ffff88005dd9fcf8 ffffffff8136d27f
      Call Trace:
       [<     inline     >] __dump_stack lib/dump_stack.c:15
       [<ffffffff82cc838f>] dump_stack+0x12e/0x18f lib/dump_stack.c:51
       [<ffffffff8136d27f>] __warn+0x19f/0x1e0 kernel/panic.c:516
       [<ffffffff8136d48c>] warn_slowpath_null+0x2c/0x40 kernel/panic.c:551
       [<ffffffff83283dae>] credit_entropy_bits+0x21e/0xad0 drivers/char/random.c:670
       [<     inline     >] credit_entropy_bits_safe drivers/char/random.c:734
       [<ffffffff8328785d>] random_ioctl+0x21d/0x250 drivers/char/random.c:1546
       [<     inline     >] vfs_ioctl fs/ioctl.c:43
       [<ffffffff8185316c>] do_vfs_ioctl+0x18c/0xff0 fs/ioctl.c:674
       [<     inline     >] SYSC_ioctl fs/ioctl.c:689
       [<ffffffff8185405f>] SyS_ioctl+0x8f/0xc0 fs/ioctl.c:680
       [<ffffffff86a995c0>] entry_SYSCALL_64_fastpath+0x23/0xc1
      arch/x86/entry/entry_64.S:207
      ---[ end trace 5d4902b2ba842f1f ]---
      
      This was triggered using the test program:
      
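      // autogenerated by syzkaller (http://github.com/google/syzkaller)
      #include <fcntl.h>
      #include <sys/ioctl.h>
      #include <linux/random.h>
      
      int main() {
              /* RNDADDTOENTCNT requires root */
              int fd = open("/dev/random", O_RDWR);
              int val = -5000;
              ioctl(fd, RNDADDTOENTCNT, &val);
              return 0;
      }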
      
      It's harmless in that (a) only root can trigger it, and (b) after
      complaining the code never does let the entropy count go negative, but
      it's better to simply not allow userspace to pass in a negative
      entropy value at all.
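
      The idea of the check can be sketched in userspace as below.  The
      names, the counter and the pool size are hypothetical stand-ins, not
      the code in drivers/char/random.c; the sketch only shows a negative
      credit being rejected with -EINVAL before it reaches the counter.

```c
#include <errno.h>

#define POOL_BITS 4096          /* hypothetical pool capacity, in bits */

/* Hypothetical stand-in for the pool's entropy counter. */
static int entropy_count;

/* Reject negative credits up front instead of warning about them later. */
static int credit_entropy_checked(int nbits)
{
    if (nbits < 0)
        return -EINVAL;

    /* Saturate at the pool size; nbits >= 0, so this cannot overflow. */
    if (entropy_count > POOL_BITS - nbits)
        entropy_count = POOL_BITS;
    else
        entropy_count += nbits;
    return 0;
}
```

      With a check of this kind the test program's -5000 fails immediately
      and the counter is left untouched.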
      
      Google-Bug-Id: #29575089
      Reported-By: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      93f84c88
    • John Johansen's avatar
    • Michael Holzheu's avatar
      Revert "s390/kdump: Clear subchannel ID to signal non-CCW/SCSI IPL" · 4cf8f0b0
      Michael Holzheu authored
      commit 5419447e upstream.
      
      This reverts commit 852ffd0f.
      
      There are use cases where an intermediate boot kernel (1) uses kexec
      to boot the final production kernel (2). For this scenario we should
      provide the original boot information to the production kernel (2).
      Therefore clearing the boot information during kexec() should not
      be done.

      Reported-by: Steffen Maier <maier@linux.vnet.ibm.com>
      Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
      Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      4cf8f0b0
    • David Howells's avatar
      KEYS: 64-bit MIPS needs to use compat_sys_keyctl for 32-bit userspace · cca36a7d
      David Howells authored
      commit 20f06ed9 upstream.
      
      MIPS64 needs to use compat_sys_keyctl for 32-bit userspace rather than
      calling sys_keyctl.  The latter will work in a lot of cases, thereby hiding
      the issue.

      Reported-by: Stephan Mueller <smueller@chronox.de>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Cc: linux-security-module@vger.kernel.org
      Cc: keyrings@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/13832/
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      cca36a7d
    • Dave Weinstein's avatar
      arm: oabi compat: add missing access checks · 0107ea0e
      Dave Weinstein authored
      commit 7de24996 upstream.
      
      Add access checks to sys_oabi_epoll_wait() and sys_oabi_semtimedop().
      This fixes CVE-2016-3857, a local privilege escalation under
      CONFIG_OABI_COMPAT.

      Reported-by: Chiachih Wu <wuchiachih@gmail.com>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Nicolas Pitre <nico@linaro.org>
      Signed-off-by: Dave Weinstein <olorin@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      0107ea0e
    • Bjørn Mork's avatar
      cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind · 66e5d7b4
      Bjørn Mork authored
      commit 4d06dd53 upstream.
      
      usbnet_link_change will call schedule_work and should be
      avoided if bind is failing. Otherwise we will end up with
      scheduled work referring to a netdev which has gone away.
      
      Instead of making the call conditional, we can just defer
      it to usbnet_probe, using the driver_info flag made for
      this purpose.
      
      Fixes: 8a34b0ae ("usbnet: cdc_ncm: apply usbnet_link_change")
      Reported-by: Andrey Konovalov <andreyknvl@gmail.com>
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      66e5d7b4
    • Mika Westerberg's avatar
      i2c: i801: Allow ACPI SystemIO OpRegion to conflict with PCI BAR · 3088903a
      Mika Westerberg authored
      commit a7ae8195 upstream.
      
      On many Intel systems the BIOS declares a SystemIO OpRegion below the
      SMBus PCI device, as can be seen in the ACPI DSDT table from a Lenovo
      Yoga 900:
      
        Device (SBUS)
        {
            OperationRegion (SMBI, SystemIO, (SBAR << 0x05), 0x10)
            Field (SMBI, ByteAcc, NoLock, Preserve)
            {
                HSTS,   8,
                Offset (0x02),
                HCON,   8,
                HCOM,   8,
                TXSA,   8,
                DAT0,   8,
                DAT1,   8,
                HBDR,   8,
                PECR,   8,
                RXSA,   8,
                SDAT,   16
            }
      
      There are also a bunch of AML methods that the BIOS can use to access
      these fields. On most of the systems in question, the AML methods
      accessing the SMBI OpRegion are never used.
      
      Now, because of this SMBI OpRegion many systems fail to load the SMBus
      driver with an error looking like the one below:
      
        ACPI Warning: SystemIO range 0x0000000000003040-0x000000000000305F
             conflicts with OpRegion 0x0000000000003040-0x000000000000304F
             (\_SB.PCI0.SBUS.SMBI) (20160108/utaddress-255)
        ACPI: If an ACPI driver is available for this device, you should use
             it instead of the native driver
      
      The reason is that this SMBI OpRegion conflicts with the PCI BAR used by
      the SMBus driver.
      
      It turns out that we can install a custom SystemIO address space handler
      for the SMBus device to intercept all accesses through that OpRegion. This
      allows us to share the PCI BAR with the AML code if it for some reason is
      using it. We do not expect that this OpRegion handler will ever be called
      but if it is we print a warning and prevent all access from the SMBus
      driver itself.
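
      The behaviour described above (warn on the first firmware access, then
      keep the driver away from the hardware) can be modelled in a few lines.
      This is a sketch under assumptions: the structure, the function names
      and the choice of -EBUSY are illustrative, and a real driver would also
      need locking around the flag.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-device state shared by handler and driver. */
struct smbus_dev {
    bool acpi_reserved;   /* set once firmware has touched the region */
};

/* Model of the address space handler: runs when AML uses the OpRegion. */
static void opregion_access(struct smbus_dev *dev)
{
    if (!dev->acpi_reserved) {
        fprintf(stderr, "warning: firmware accessed the SMBus region; "
                        "driver access is now blocked\n");
        dev->acpi_reserved = true;
    }
}

/* Model of a driver transfer: refuses once firmware has taken over. */
static int driver_transfer(struct smbus_dev *dev)
{
    if (dev->acpi_reserved)
        return -EBUSY;
    return 0;   /* a real driver would perform the transfer here */
}
```

      Until the handler runs, the driver uses the I/O range as before; after
      the first firmware access every transfer fails instead of racing with
      the AML code.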
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=110041
      Reported-by: Andy Lutomirski <luto@kernel.org>
      Reported-by: Pali Rohár <pali.rohar@gmail.com>
      Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Reviewed-by: Jean Delvare <jdelvare@suse.de>
      Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
      Tested-by: Pali Rohár <pali.rohar@gmail.com>
      Tested-by: Jean Delvare <jdelvare@suse.de>
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3088903a
    • Hector Marco-Gisbert's avatar
      x86/mm/32: Enable full randomization on i386 and X86_32 · 979a61a0
      Hector Marco-Gisbert authored
      commit 8b8addf8 upstream.
      
      Currently on i386 and on X86_64 when emulating X86_32 in legacy mode, only
      the stack and the executable are randomized but not other mmapped files
      (libraries, vDSO, etc.). This patch enables randomization for the
      libraries, vDSO and mmap requests on i386 and in X86_32 in legacy mode.
      
      By default on i386 there are 8 bits for the randomization of the libraries,
      vDSO and mmaps which only uses 1MB of VA.
      
      This patch preserves the original randomness, using 1MB of VA out of 3GB or
      4GB. We think that 1MB out of 3GB is not a big cost for having the ASLR.
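
      The 1MB figure follows from 8 bits of page-granular randomness.  A
      short worked example, assuming 4 KiB pages (the helper name is
      hypothetical):

```c
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u   /* assumes 4 KiB pages */

/* Bytes of address space spanned by `bits` bits of page-granular offset. */
static uint64_t randomized_span(unsigned bits)
{
    return ((uint64_t)1 << bits) * PAGE_SIZE_BYTES;
}
```

      randomized_span(8) is 256 pages of 4096 bytes, i.e. 1048576 bytes
      (1MB), which is 1/3072 of a 3GB address space.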
      
      The first obvious security benefit is that all objects are randomized (not
      only the stack and the executable) in legacy mode which highly increases
      the ASLR effectiveness, otherwise the attackers may use these
      non-randomized areas. But also sensitive setuid/setgid applications are
      more secure because currently, attackers can disable the randomization of
      these applications by setting the ulimit stack to "unlimited". This is a
      very old and widely known trick to disable the ASLR in i386 which has been
      allowed for too long.
      
      Another trick used to disable the ASLR was to set the ADDR_NO_RANDOMIZE
      personality flag, but fortunately this doesn't work on setuid/setgid
      applications because there are security checks which clear
      security-relevant flags.
      
      This patch always randomizes the mmap_legacy_base address, removing the
      possibility to disable the ASLR by setting the stack to "unlimited".

      Signed-off-by: Hector Marco-Gisbert <hecmargi@upv.es>
      Acked-by: Ismael Ripoll Ripoll <iripoll@upv.es>
      Acked-by: Kees Cook <keescook@chromium.org>
      Acked-by: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: akpm@linux-foundation.org
      Cc: kees Cook <keescook@chromium.org>
      Link: http://lkml.kernel.org/r/1457639460-5242-1-git-send-email-hecmargi@upv.es
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Cc: Laura Abbott <labbott@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      979a61a0
    • Benjamin Tissoires's avatar
      HID: sony: do not bail out when the sixaxis refuses the output report · 6e124249
      Benjamin Tissoires authored
      commit 19f4c2ba upstream.
      
      When setting the operational mode, some third party (Speedlink Strike-FX)
      gamepads refuse the output report. Failing here means we refuse to
      initialize the gamepad while this should be harmless.
      
      The weird part is that the initial commit that added this: a7de9b86
      ("HID: sony: Enable Gasia third-party PS3 controllers") mentions this
      very same controller as one requiring this output report.
      Anyway, it's broken for one user at least, so let's change it.
      We will report an error, but at least the controller should work.
      
      And no, these devices present themselves as legacy Sony controllers
      (VID:PID of 054C:0268, as in the official ones) so there is no way
      to distinguish them from the official ones.
      
      https://bugzilla.redhat.com/show_bug.cgi?id=1255325
      
      Reported-and-tested-by: Max Fedotov <thesourcehim@gmail.com>
      Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Cc: Laura Abbott <labbott@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      6e124249
    • Christophe Le Roy's avatar
      PNP: Add Broadwell to Intel MCH size workaround · d71d4ace
      Christophe Le Roy authored
      commit a77060f0 upstream.
      
      Add device ID 0x1604 for Broadwell to commit cb171f7a ("PNP:
      Work around BIOS defects in Intel MCH area reporting").
      
      From a Lenovo ThinkPad T550:
      
        system 00:01: [io  0x1800-0x189f] could not be reserved
        system 00:01: [io  0x0800-0x087f] has been reserved
        system 00:01: [io  0x0880-0x08ff] has been reserved
        system 00:01: [io  0x0900-0x097f] has been reserved
        system 00:01: [io  0x0980-0x09ff] has been reserved
        system 00:01: [io  0x0a00-0x0a7f] has been reserved
        system 00:01: [io  0x0a80-0x0aff] has been reserved
        system 00:01: [io  0x0b00-0x0b7f] has been reserved
        system 00:01: [io  0x0b80-0x0bff] has been reserved
        system 00:01: [io  0x15e0-0x15ef] has been reserved
        system 00:01: [io  0x1600-0x167f] has been reserved
        system 00:01: [io  0x1640-0x165f] has been reserved
        system 00:01: [mem 0xf8000000-0xfbffffff] could not be reserved
        system 00:01: [mem 0xfed1c000-0xfed1ffff] has been reserved
        system 00:01: [mem 0xfed10000-0xfed13fff] has been reserved
        system 00:01: [mem 0xfed18000-0xfed18fff] has been reserved
        system 00:01: [mem 0xfed19000-0xfed19fff] has been reserved
        system 00:01: [mem 0xfed45000-0xfed4bfff] has been reserved
        system 00:01: Plug and Play ACPI device, IDs PNP0c02 (active)
        [...]
        resource sanity check: requesting [mem 0xfed10000-0xfed15fff], which spans more than pnp 00:01 [mem 0xfed10000-0xfed13fff]
        ------------[ cut here ]------------
        WARNING: CPU: 2 PID: 1 at /build/linux-CrHvZ_/linux-4.2.6/arch/x86/mm/ioremap.c:198 __ioremap_caller+0x2ee/0x360()
        Info: mapping multiple BARs. Your kernel is fine.
        Modules linked in:
        CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.2.0-1-amd64 #1 Debian 4.2.6-1
        Hardware name: LENOVO 20CKCTO1WW/20CKCTO1WW, BIOS N11ET34W (1.10 ) 08/20/2015
         0000000000000000 ffffffff817e6868 ffffffff8154e2f6 ffff8802241efbf8
         ffffffff8106e5b1 ffffc90000e98000 0000000000006000 ffffc90000e98000
         0000000000006000 0000000000000000 ffffffff8106e62a ffffffff817e68c8
        Call Trace:
         [<ffffffff8154e2f6>] ? dump_stack+0x40/0x50
         [<ffffffff8106e5b1>] ? warn_slowpath_common+0x81/0xb0
         [<ffffffff8106e62a>] ? warn_slowpath_fmt+0x4a/0x50
         [<ffffffff810742a3>] ? iomem_map_sanity_check+0xb3/0xc0
         [<ffffffff8105dade>] ? __ioremap_caller+0x2ee/0x360
         [<ffffffff81036ae6>] ? snb_uncore_imc_init_box+0x66/0x90
         [<ffffffff810351a8>] ? uncore_pci_probe+0xc8/0x1a0
         [<ffffffff81302d7f>] ? local_pci_probe+0x3f/0xa0
         [<ffffffff81303ea4>] ? pci_device_probe+0xc4/0x110
         [<ffffffff813d9b1e>] ? driver_probe_device+0x1ee/0x450
         [<ffffffff813d9dfb>] ? __driver_attach+0x7b/0x80
         [<ffffffff813d9d80>] ? driver_probe_device+0x450/0x450
         [<ffffffff813d796a>] ? bus_for_each_dev+0x5a/0x90
         [<ffffffff813d9091>] ? bus_add_driver+0x1f1/0x290
         [<ffffffff81b37fa8>] ? uncore_cpu_setup+0xc/0xc
         [<ffffffff813da73f>] ? driver_register+0x5f/0xe0
         [<ffffffff81b38074>] ? intel_uncore_init+0xcc/0x2b0
         [<ffffffff81b37fa8>] ? uncore_cpu_setup+0xc/0xc
         [<ffffffff8100213e>] ? do_one_initcall+0xce/0x200
         [<ffffffff8108a100>] ? parse_args+0x140/0x4e0
         [<ffffffff81b2b0cb>] ? kernel_init_freeable+0x162/0x1e8
         [<ffffffff815443f0>] ? rest_init+0x80/0x80
         [<ffffffff815443fe>] ? kernel_init+0xe/0xf0
         [<ffffffff81553e5f>] ? ret_from_fork+0x3f/0x70
         [<ffffffff815443f0>] ? rest_init+0x80/0x80
        ---[ end trace 472e7959536abf12 ]---
      
        00:00.0 Host bridge: Intel Corporation Broadwell-U Host Bridge -OPI (rev 09)
                Subsystem: Lenovo Device 2223
                Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
                Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
                Latency: 0
                Capabilities: [e0] Vendor Specific Information: Len=0c <?>
                Kernel driver in use: bdw_uncore
        00: 86 80 04 16 06 00 90 20 09 00 00 06 00 00 00 00
        10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        20: 00 00 00 00 00 00 00 00 00 00 00 00 aa 17 23 22
        30: 00 00 00 00 e0 00 00 00 00 00 00 00 00 00 00 00
      Signed-off-by: Christophe Le Roy <christophe.fish@gmail.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: Laura Abbott <labbott@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d71d4ace
    • Josh Boyer's avatar
      PNP: Add Haswell-ULT to Intel MCH size workaround · 02170f4a
      Josh Boyer authored
      commit ed1f0eee upstream.
      
      Add device ID 0x0a04 for Haswell-ULT to the list of devices with MCH
      problems.
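
The workaround is selected by the PCI device ID of the host bridge. The following is a minimal user-space sketch of such a lookup, not the kernel's actual code: the table name `mch_quirk_ids` and the function `needs_mch_workaround` are hypothetical, and the table lists only the two IDs that can be read off the logs on this page (0x0a04 from this commit, 0x1604 from the Broadwell-U config-space dump above).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define INTEL_VENDOR_ID 0x8086 /* vendor half of [8086:0a04] in the lspci output */

/* Hypothetical table: host bridge device IDs that need the MCH size workaround. */
static const uint16_t mch_quirk_ids[] = {
    0x0a04, /* Haswell-ULT, the ID this commit adds */
    0x1604, /* Broadwell-U host bridge, read from the config-space dump above */
};

/* Return true when the given host bridge is on the workaround list. */
static bool needs_mch_workaround(uint16_t vendor, uint16_t device)
{
    if (vendor != INTEL_VENDOR_ID)
        return false;

    for (size_t i = 0; i < sizeof(mch_quirk_ids) / sizeof(mch_quirk_ids[0]); i++) {
        if (mch_quirk_ids[i] == device)
            return true;
    }
    return false;
}
```

Keeping the IDs in one table means that supporting another affected chipset is a one-line change, which is all this commit needs to do.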
      
      From a Lenovo ThinkPad T440S:
      [    0.188604] pnp: PnP ACPI init
      [    0.189044] system 00:00: [mem 0x00000000-0x0009ffff] could not be reserved
      [    0.189048] system 00:00: [mem 0x000c0000-0x000c3fff] could not be reserved
      [    0.189050] system 00:00: [mem 0x000c4000-0x000c7fff] could not be reserved
      [    0.189052] system 00:00: [mem 0x000c8000-0x000cbfff] could not be reserved
      [    0.189054] system 00:00: [mem 0x000cc000-0x000cffff] could not be reserved
      [    0.189056] system 00:00: [mem 0x000d0000-0x000d3fff] has been reserved
      [    0.189058] system 00:00: [mem 0x000d4000-0x000d7fff] has been reserved
      [    0.189060] system 00:00: [mem 0x000d8000-0x000dbfff] has been reserved
      [    0.189061] system 00:00: [mem 0x000dc000-0x000dffff] has been reserved
      [    0.189063] system 00:00: [mem 0x000e0000-0x000e3fff] could not be reserved
      [    0.189065] system 00:00: [mem 0x000e4000-0x000e7fff] could not be reserved
      [    0.189067] system 00:00: [mem 0x000e8000-0x000ebfff] could not be reserved
      [    0.189069] system 00:00: [mem 0x000ec000-0x000effff] could not be reserved
      [    0.189071] system 00:00: [mem 0x000f0000-0x000fffff] could not be reserved
      [    0.189073] system 00:00: [mem 0x00100000-0xdf9fffff] could not be reserved
      [    0.189075] system 00:00: [mem 0xfec00000-0xfed3ffff] could not be reserved
      [    0.189078] system 00:00: [mem 0xfed4c000-0xffffffff] could not be reserved
      [    0.189082] system 00:00: Plug and Play ACPI device, IDs PNP0c01 (active)
      [    0.189216] system 00:01: [io  0x1800-0x189f] could not be reserved
      [    0.189220] system 00:01: [io  0x0800-0x087f] has been reserved
      [    0.189222] system 00:01: [io  0x0880-0x08ff] has been reserved
      [    0.189224] system 00:01: [io  0x0900-0x097f] has been reserved
      [    0.189226] system 00:01: [io  0x0980-0x09ff] has been reserved
      [    0.189229] system 00:01: [io  0x0a00-0x0a7f] has been reserved
      [    0.189231] system 00:01: [io  0x0a80-0x0aff] has been reserved
      [    0.189233] system 00:01: [io  0x0b00-0x0b7f] has been reserved
      [    0.189235] system 00:01: [io  0x0b80-0x0bff] has been reserved
      [    0.189238] system 00:01: [io  0x15e0-0x15ef] has been reserved
      [    0.189240] system 00:01: [io  0x1600-0x167f] has been reserved
      [    0.189242] system 00:01: [io  0x1640-0x165f] has been reserved
      [    0.189246] system 00:01: [mem 0xf8000000-0xfbffffff] could not be reserved
      [    0.189249] system 00:01: [mem 0x00000000-0x00000fff] could not be reserved
      [    0.189251] system 00:01: [mem 0xfed1c000-0xfed1ffff] has been reserved
      [    0.189254] system 00:01: [mem 0xfed10000-0xfed13fff] has been reserved
      [    0.189256] system 00:01: [mem 0xfed18000-0xfed18fff] has been reserved
      [    0.189258] system 00:01: [mem 0xfed19000-0xfed19fff] has been reserved
      [    0.189261] system 00:01: [mem 0xfed45000-0xfed4bfff] has been reserved
      [    0.189264] system 00:01: Plug and Play ACPI device, IDs PNP0c02 (active)
      [....]
      [    0.583653] resource sanity check: requesting [mem 0xfed10000-0xfed15fff], which spans more than pnp 00:01 [mem 0xfed10000-0xfed13fff]
      [    0.583654] ------------[ cut here ]------------
      [    0.583660] WARNING: CPU: 0 PID: 1 at arch/x86/mm/ioremap.c:198 __ioremap_caller+0x2c5/0x380()
      [    0.583661] Info: mapping multiple BARs. Your kernel is fine.
      [    0.583662] Modules linked in:
      
      [    0.583666] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.3.3-303.fc23.x86_64 #1
      [    0.583668] Hardware name: LENOVO 20AR001GXS/20AR001GXS, BIOS GJET86WW (2.36 ) 12/04/2015
      [    0.583670]  0000000000000000 0000000014cf7e59 ffff880214a1baf8 ffffffff813a625f
      [    0.583673]  ffff880214a1bb40 ffff880214a1bb30 ffffffff810a07c2 00000000fed10000
      [    0.583675]  ffffc90000cb8000 0000000000006000 0000000000000000 ffff8800d6381040
      [    0.583678] Call Trace:
      [    0.583683]  [<ffffffff813a625f>] dump_stack+0x44/0x55
      [    0.583686]  [<ffffffff810a07c2>] warn_slowpath_common+0x82/0xc0
      [    0.583688]  [<ffffffff810a085c>] warn_slowpath_fmt+0x5c/0x80
      [    0.583692]  [<ffffffff810a6fba>] ? iomem_map_sanity_check+0xba/0xd0
      [    0.583695]  [<ffffffff81065835>] __ioremap_caller+0x2c5/0x380
      [    0.583698]  [<ffffffff81065907>] ioremap_nocache+0x17/0x20
      [    0.583701]  [<ffffffff8103a119>] snb_uncore_imc_init_box+0x79/0xb0
      [    0.583705]  [<ffffffff81038900>] uncore_pci_probe+0xd0/0x1b0
      [    0.583707]  [<ffffffff813efda5>] local_pci_probe+0x45/0xa0
      [    0.583710]  [<ffffffff813f118d>] pci_device_probe+0xfd/0x140
      [    0.583713]  [<ffffffff814d9b52>] driver_probe_device+0x222/0x480
      [    0.583715]  [<ffffffff814d9e34>] __driver_attach+0x84/0x90
      [    0.583717]  [<ffffffff814d9db0>] ? driver_probe_device+0x480/0x480
      [    0.583720]  [<ffffffff814d762c>] bus_for_each_dev+0x6c/0xc0
      [    0.583722]  [<ffffffff814d930e>] driver_attach+0x1e/0x20
      [    0.583724]  [<ffffffff814d8e4b>] bus_add_driver+0x1eb/0x280
      [    0.583727]  [<ffffffff81d6af1a>] ? uncore_cpu_setup+0x12/0x12
      [    0.583729]  [<ffffffff814da680>] driver_register+0x60/0xe0
      [    0.583733]  [<ffffffff813ef78c>] __pci_register_driver+0x4c/0x50
      [    0.583736]  [<ffffffff81d6affc>] intel_uncore_init+0xe2/0x2e6
      [    0.583738]  [<ffffffff81d6af1a>] ? uncore_cpu_setup+0x12/0x12
      [    0.583741]  [<ffffffff81002123>] do_one_initcall+0xb3/0x200
      [    0.583745]  [<ffffffff810be500>] ? parse_args+0x1a0/0x4a0
      [    0.583749]  [<ffffffff81d5c1c8>] kernel_init_freeable+0x189/0x223
      [    0.583752]  [<ffffffff81775c40>] ? rest_init+0x80/0x80
      [    0.583754]  [<ffffffff81775c4e>] kernel_init+0xe/0xe0
      [    0.583758]  [<ffffffff81781adf>] ret_from_fork+0x3f/0x70
      [    0.583760]  [<ffffffff81775c40>] ? rest_init+0x80/0x80
      [    0.583765] ---[ end trace 077c426a39e018aa ]---
      
      00:00.0 Host bridge [0600]: Intel Corporation Haswell-ULT DRAM Controller [8086:0a04] (rev 0b)
      	Subsystem: Lenovo Device [17aa:220c]
      	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
      	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
      	Latency: 0
      	Capabilities: <access denied>
      	Kernel driver in use: hsw_uncore
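
The "resource sanity check" line in the log fires because the requested mapping starts inside the PNP resource but runs past its end. The following sketch reproduces that arithmetic with the addresses from the log. It is an illustration under stated assumptions, not the kernel's implementation: `struct mem_range`, `range_size` and `spans_more_than` are hypothetical names, and ranges are treated as inclusive, as they are printed in the log.

```c
#include <stdbool.h>
#include <stdint.h>

/* Inclusive address range, matching the [mem start-end] notation in the log. */
struct mem_range {
    uint64_t start;
    uint64_t end;
};

/* Number of bytes covered by an inclusive range. */
static uint64_t range_size(struct mem_range r)
{
    return r.end - r.start + 1;
}

/*
 * True when the request overlaps the resource without being fully
 * contained in it, i.e. the request "spans more than" the resource.
 */
static bool spans_more_than(struct mem_range req, struct mem_range res)
{
    bool overlaps = req.start <= res.end && req.end >= res.start;
    bool contained = req.start >= res.start && req.end <= res.end;

    return overlaps && !contained;
}
```

With the values from the log, the request [0xfed10000-0xfed15fff] covers 0x6000 bytes while the PNP resource [0xfed10000-0xfed13fff] covers only 0x4000, so the check reports a mismatch.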
      
      Link: https://bugzilla.redhat.com/show_bug.cgi?id=1300955
      Tested-by: <robo@tcp.sk>
      Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: Laura Abbott <labbott@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      02170f4a
    • Hannes Reinecke's avatar
      scsi: ignore errors from scsi_dh_add_device() · 5a6f9d06
      Hannes Reinecke authored
      commit 221255ae upstream.
      
      Device handler initialisation might fail for a number of
      reasons. But as device handlers are optional, this shouldn't
      cause us to disable the device entirely.
      So just ignore errors from scsi_dh_add_device().
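
The pattern described here, treating an optional initialisation step as non-fatal, can be sketched in user space as follows. Everything in this sketch is hypothetical (`struct device`, `attach_optional_handler`, `add_device` and the `simulate_failure` flag); it only illustrates the control flow and does not reproduce the SCSI code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical device record for the illustration. */
struct device {
    const char *name;
    bool has_handler; /* optional feature attached? */
    bool registered;  /* device usable? */
};

/* Hypothetical optional step; fails on demand to exercise the error path. */
static int attach_optional_handler(struct device *dev, bool simulate_failure)
{
    if (simulate_failure)
        return -ENOMEM;

    dev->has_handler = true;
    return 0;
}

/* Register the device; a handler failure is logged but does not abort. */
static int add_device(struct device *dev, bool simulate_failure)
{
    int err = attach_optional_handler(dev, simulate_failure);

    if (err)
        fprintf(stderr, "%s: failed to add device handler: %d, continuing\n",
                dev->name, err);

    dev->registered = true;
    return 0;
}
```

The error is still reported, so the failure stays visible in the log, but the device remains available without its handler.
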
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Hannes Reinecke <hare@suse.de>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: Laura Abbott <labbott@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      5a6f9d06
    • Ben Hutchings's avatar
      ipath: Restrict use of the write() interface · 694dfd0e
      Ben Hutchings authored
      Commit e6bd18f5 ("IB/security: Restrict use of the write()
      interface") fixed a security problem with various write()
      implementations in the Infiniband subsystem.  In older kernel versions
      the ipath_write() function has the same problem and needs the same
      restriction.  (The ipath driver has been completely removed upstream.)
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      694dfd0e
    • Soheil Hassas Yeganeh's avatar
      tcp: consider recv buf for the initial window scale · 9c946c93
      Soheil Hassas Yeganeh authored
      [ Upstream commit f626300a ]
      
      tcp_select_initial_window() intends to advertise a window
      scaling for the maximum possible window size. To do so,
      it considers the maximum of net.ipv4.tcp_rmem[2] and
      net.core.rmem_max as the only possible upper-bounds.
      However, users with CAP_NET_ADMIN can use SO_RCVBUFFORCE
      to set the socket's receive buffer size to values
      larger than net.ipv4.tcp_rmem[2] and net.core.rmem_max.
      Thus, SO_RCVBUFFORCE is effectively ignored by
      tcp_select_initial_window().
      
      To fix this, consider the maximum of net.ipv4.tcp_rmem[2],
      net.core.rmem_max and the socket's initial buffer space.
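
A simplified sketch of the resulting calculation is given below. It assumes the scale is the smallest shift that brings the largest candidate window under the 16-bit window field, capped at 14 as in RFC 7323. The names `max3`, `initial_wscale` and `MAX_WSCALE` are illustrative, and the real tcp_select_initial_window() does more than this.

```c
#include <stdint.h>

#define MAX_WSCALE 14 /* largest shift count allowed by RFC 7323 */

/* Largest of three unsigned values. */
static uint32_t max3(uint32_t a, uint32_t b, uint32_t c)
{
    uint32_t m = a > b ? a : b;

    return m > c ? m : c;
}

/*
 * Pick the window scale from the largest of the two sysctl limits and
 * the socket's own buffer space, so a buffer enlarged with
 * SO_RCVBUFFORCE is taken into account.
 */
static int initial_wscale(uint32_t tcp_rmem_max, uint32_t core_rmem_max,
                          uint32_t sk_space)
{
    uint32_t space = max3(tcp_rmem_max, core_rmem_max, sk_space);
    int wscale = 0;

    /* Halve until the value fits into the 16-bit window field. */
    while (space > 65535 && wscale < MAX_WSCALE) {
        space >>= 1;
        wscale++;
    }
    return wscale;
}
```

For example, with limits of 6291456 and 212992 bytes, a socket whose buffer was forced to 64 MiB gets a larger scale than the limits alone would give; dropping the third argument from the maximum reproduces the behaviour the commit fixes.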
      
      Fixes: b0573dea ("[NET]: Introduce SO_{SND,RCV}BUFFORCE socket options")
      Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
      Suggested-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      9c946c93