1. 01 Oct, 2017 3 commits
  2. 30 Sep, 2017 4 commits
  3. 29 Sep, 2017 17 commits
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 99637e42
      Linus Torvalds authored
      Pull waitid fix from Al Viro:
       "Fix infoleak in waitid()"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        fix infoleak in waitid(2)
      99637e42
    • Merge branch 'for-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 5ba88cd6
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
       "We've collected a bunch of isolated fixes, for crashes, user-visible
        behaviour or missing bits from other subsystem cleanups from the past.
      
        The overall number is not small but I was not able to make it
        significantly smaller. Most of the patches are supposed to go to
        stable"
      
      * 'for-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: log csums for all modified extents
        Btrfs: fix unexpected result when dio reading corrupted blocks
        btrfs: Report error on removing qgroup if del_qgroup_item fails
        Btrfs: skip checksum when reading compressed data if some IO have failed
        Btrfs: fix kernel oops while reading compressed data
        Btrfs: use btrfs_op instead of bio_op in __btrfs_map_block
        Btrfs: do not backup tree roots when fsync
        btrfs: remove BTRFS_FS_QUOTA_DISABLING flag
        btrfs: propagate error to btrfs_cmp_data_prepare caller
        btrfs: prevent to set invalid default subvolid
        Btrfs: send: fix error number for unknown inode types
        btrfs: fix NULL pointer dereference from free_reloc_roots()
        btrfs: finish ordered extent cleaning if no progress is found
        btrfs: clear ordered flag on cleaning up ordered extents
        Btrfs: fix incorrect {node,sector}size endianness from BTRFS_IOC_FS_INFO
        Btrfs: do not reset bio->bi_ops while writing bio
        Btrfs: use the new helper wbc_to_write_flags
      5ba88cd6
    • Merge tag 'md/4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md · 7b5ef823
      Linus Torvalds authored
      Pull MD fixes from Shaohua Li:
       "A few fixes for MD. Mainly fix a problem introduced in 4.13, where we
        retry the bio for some code paths but not all in some situations"
      
      * tag 'md/4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
        md/raid5: cap worker count
        dm-raid: fix a race condition in request handling
        md: fix a race condition for flush request handling
        md: separate request handling
      7b5ef823
    • Merge tag 'pci-v4.14-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · 93b5533a
      Linus Torvalds authored
      Pull PCI fixes from Bjorn Helgaas:
      
       - fix CONFIG_PCI=n build error (introduced in v4.14-rc1) (Geert
         Uytterhoeven)
      
       - fix a race in sysfs driver_override store/show (Nicolai Stange)
      
      * tag 'pci-v4.14-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
        PCI: Fix race condition with driver_override
        PCI: Add dummy pci_acs_enabled() for CONFIG_PCI=n build
      93b5533a
    • Merge tag 'drm-fixes-for-v4.14-rc3' of git://people.freedesktop.org/~airlied/linux · a3583202
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Regular fixes pull, some amdkfd, amdgpu, etnaviv, sun4i, qxl, tegra
        fixes.
      
        I've got an outstanding pull for i915 but it wasn't on an rc2 base so
        I wanted to ship these out first, I might get to it before rc3 or I
        might not"
      
      * tag 'drm-fixes-for-v4.14-rc3' of git://people.freedesktop.org/~airlied/linux:
        drm/tegra: trace: Fix path to include
        qxl: fix framebuffer unpinning
        drm/sun4i: cec: Enable back CEC-pin framework
        drm/amdkfd: Print event limit messages only once per process
        drm/amdkfd: Fix kernel-queue wrapping bugs
        drm/amdkfd: Fix incorrect destroy_mqd parameter
        drm/radeon: disable hard reset in hibernate for APUs
        drm/amdgpu: revert tile table update for oland
        etnaviv: fix gem object list corruption
        etnaviv: fix submit error path
        qxl: fix primary surface handling
        drm/amdkfd: check for null dev to avoid a null pointer dereference
      a3583202
    • Merge tag 'iommu-fixes-v4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 35dbba31
      Linus Torvalds authored
      Pull IOMMU fixes from Joerg Roedel:
      
       - A comment fix for 'struct iommu_ops'
      
       - Format string fixes for AMD IOMMU, unfortunately I missed that during
         review.
      
       - Limit mediatek physical addresses to 32 bit for v7s to fix a warning
         triggered in io-page-table code.
      
       - Fix dma-sync in io-pgtable-arm-v7s code
      
      * tag 'iommu-fixes-v4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        iommu: Fix comment for iommu_ops.map_sg
        iommu/amd: pr_err() strings should end with newlines
        iommu/mediatek: Limit the physical address in 32bit for v7s
        iommu/io-pgtable-arm-v7s: Need dma-sync while there is no QUIRK_NO_DMA
      35dbba31
    • Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 06482600
      Linus Torvalds authored
      Pull arm64 fixes from Catalin Marinas:
      
       - SPsel register initialisation on reset as the architecture defines
         its state as unknown
      
       - Use READ_ONCE when dereferencing pmd_t pointers to avoid race
         conditions in page_vma_mapped_walk() (or fast GUP) with concurrent
         modifications of the page table
      
       - Avoid invoking the mm fault handling code for kernel addresses (check
         against TASK_SIZE) which would otherwise result in calling
         might_sleep() in atomic context
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: fault: Route pte translation faults via do_translation_fault
        arm64: mm: Use READ_ONCE when dereferencing pointer to pte table
        arm64: Make sure SPsel is always set
      06482600
    • Merge tag 'for-linus-4.14c-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 9f2a5128
      Linus Torvalds authored
      Pull xen fixes from Juergen Gross:
      
       - avoid a warning when compiling with clang
      
       - consider read-only bits in xen-pciback when writing to a BAR
      
       - fix a boot crash of pv-domains
      
      * tag 'for-linus-4.14c-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen/mmu: Call xen_cleanhighmap() with 4MB aligned for page tables mapping
        xen-pciback: relax BAR sizing write value check
        x86/xen: clean up clang build warning
      9f2a5128
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 42057e18
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "Mixed bugfixes. Perhaps the most interesting one is a latent bug that
        was finally triggered by PCID support"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        kvm/x86: Handle async PF in RCU read-side critical sections
        KVM: nVMX: Fix nested #PF intends to break L1's vmlauch/vmresume
        KVM: VMX: use cmpxchg64
        KVM: VMX: simplify and fix vmx_vcpu_pi_load
        KVM: VMX: avoid double list add with VT-d posted interrupts
        KVM: VMX: extract __pi_post_block
        KVM: PPC: Book3S HV: Check for updated HDSISR on P9 HDSI exception
        KVM: nVMX: fix HOST_CR3/HOST_CR4 cache
      42057e18
    • fix infoleak in waitid(2) · 6c85501f
      Al Viro authored
      kernel_waitid() can return a PID, an error or 0.  rusage is filled in the first
      case and waitid(2) rusage should've been copied out exactly in that case, *not*
      whenever kernel_waitid() has not returned an error.  Compat variant shares that
      braino; none of kernel_wait4() callers do, so the below ought to fix it.
      Reported-and-tested-by: Alexander Potapenko <glider@google.com>
      Fixes: ce72a16f ("wait4(2)/waitid(2): separate copying rusage to userland")
      Cc: stable@vger.kernel.org # v4.13
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      6c85501f
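      The rule stated in this message (copy rusage out only when a PID was
      returned, not merely when no error occurred) can be illustrated with a
      small user-space sketch. It is not the kernel's code: `struct
      sketch_rusage` and `sketch_copy_rusage_if_reaped()` are hypothetical
      names, and a plain memcpy() stands in for the copy to user space.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct rusage; only the copy rule matters here. */
struct sketch_rusage {
    long utime_sec;
    long stime_sec;
};

/*
 * Copy resource usage to the caller only when the wait call returned a
 * PID (ret > 0). A return of 0 (nothing reaped) or a negative error means
 * `src` was never filled in and may still hold stale data, so it must not
 * be copied out. Returns 1 if a copy happened, 0 otherwise.
 */
int sketch_copy_rusage_if_reaped(long ret, const struct sketch_rusage *src,
                                 struct sketch_rusage *dst)
{
    assert(src != NULL);
    if (ret > 0 && dst != NULL) {
        memcpy(dst, src, sizeof(*dst));
        return 1;
    }
    return 0;
}
```

      Testing `ret == 0` as "success" would copy `src` in the nothing-reaped
      case as well, which is the leak the message describes.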
    • Merge branch 'fixes-v4.14-rc3' of... · 95d3652e
      Linus Torvalds authored
      Merge branch 'fixes-v4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
      
      Pull keys fixes from James Morris:
       "Notable here is a rewrite of big_key crypto by Jason Donenfeld to
        address some issues in the original code.
      
        From Jason's commit log:
         "This started out as just replacing the use of crypto/rng with
          get_random_bytes_wait, so that we wouldn't use bad randomness at
          boot time. But, upon looking further, it appears that there were
          even deeper underlying cryptographic problems, and that this seems
          to have been committed with very little crypto review. So, I rewrote
          the whole thing, trying to keep to the conventions introduced by the
          previous author, to fix these cryptographic flaws."
      
        There has been positive review of the new code by Eric Biggers and
        Herbert Xu, and it passes basic testing via the keyutils test suite.
        Eric also manually tested it.
      
        Generally speaking, we likely need to improve the amount of crypto
        review for kernel crypto users including keys (I'll post a note
        separately to ksummit-discuss)"
      
      * 'fixes-v4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
        security/keys: rewrite all of big_key crypto
        security/keys: properly zero out sensitive key material in big_key
        KEYS: use kmemdup() in request_key_auth_new()
        KEYS: restrict /proc/keys by credentials at open time
        KEYS: reset parent each time before searching key_user_tree
        KEYS: prevent KEYCTL_READ on negative key
        KEYS: prevent creating a different user's keyrings
        KEYS: fix writing past end of user-supplied buffer in keyring_read()
        KEYS: fix key refcount leak in keyctl_read_key()
        KEYS: fix key refcount leak in keyctl_assume_authority()
        KEYS: don't revoke uninstantiated key in request_key_auth_new()
        KEYS: fix cred refcount leak in request_key_auth_new()
      95d3652e
    • arm64: fault: Route pte translation faults via do_translation_fault · 760bfb47
      Will Deacon authored
      We currently route pte translation faults via do_page_fault, which elides
      the address check against TASK_SIZE before invoking the mm fault handling
      code. However, this can cause issues with the path walking code in
      conjunction with our word-at-a-time implementation because
      load_unaligned_zeropad can end up faulting in kernel space if it reads
      across a page boundary and runs into a page fault (e.g. by attempting to
      read from a guard region).
      
      In the case of such a fault, load_unaligned_zeropad has registered a
      fixup to shift the valid data and pad with zeroes, however the abort is
      reported as a level 3 translation fault and we dispatch it straight to
      do_page_fault, despite it being a kernel address. This results in calling
      a sleeping function from atomic context:
      
        BUG: sleeping function called from invalid context at arch/arm64/mm/fault.c:313
        in_atomic(): 0, irqs_disabled(): 0, pid: 10290
        Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
        [...]
        [<ffffff8e016cd0cc>] ___might_sleep+0x134/0x144
        [<ffffff8e016cd158>] __might_sleep+0x7c/0x8c
        [<ffffff8e016977f0>] do_page_fault+0x140/0x330
        [<ffffff8e01681328>] do_mem_abort+0x54/0xb0
        Exception stack(0xfffffffb20247a70 to 0xfffffffb20247ba0)
        [...]
        [<ffffff8e016844fc>] el1_da+0x18/0x78
        [<ffffff8e017f399c>] path_parentat+0x44/0x88
        [<ffffff8e017f4c9c>] filename_parentat+0x5c/0xd8
        [<ffffff8e017f5044>] filename_create+0x4c/0x128
        [<ffffff8e017f59e4>] SyS_mkdirat+0x50/0xc8
        [<ffffff8e01684e30>] el0_svc_naked+0x24/0x28
        Code: 36380080 d5384100 f9400800 9402566d (d4210000)
        ---[ end trace 2d01889f2bca9b9f ]---
      
      Fix this by dispatching all translation faults to do_translation_fault,
      which avoids invoking the page fault logic for faults on kernel addresses.
      
      Cc: <stable@vger.kernel.org>
      Reported-by: Ankit Jain <ankijain@codeaurora.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      760bfb47
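      The routing decision described in this message can be sketched in
      user-space C. This is an illustration under stated assumptions, not the
      arm64 fault handler: `SKETCH_TASK_SIZE`, `enum sketch_route` and
      `sketch_route_translation_fault()` are hypothetical names, and the
      boundary value is made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user/kernel boundary; the real TASK_SIZE is configuration dependent. */
#define SKETCH_TASK_SIZE UINT64_C(0x0001000000000000)

enum sketch_route {
    SKETCH_ROUTE_PAGE_FAULT,   /* full mm fault handling, may sleep */
    SKETCH_ROUTE_KERNEL_FIXUP  /* no mm handling: exception fixup or oops */
};

/*
 * Only addresses below the user/kernel boundary are handed to the
 * sleeping page fault path; kernel addresses never reach it.
 */
enum sketch_route sketch_route_translation_fault(uint64_t addr)
{
    if (addr < SKETCH_TASK_SIZE)
        return SKETCH_ROUTE_PAGE_FAULT;
    return SKETCH_ROUTE_KERNEL_FIXUP;
}
```

      With this check in place, a fault taken by load_unaligned_zeropad on a
      kernel address ends up on the fixup side and never calls a function
      that may sleep.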
    • arm64: mm: Use READ_ONCE when dereferencing pointer to pte table · f069faba
      Will Deacon authored
      On kernels built with support for transparent huge pages, different CPUs
      can access the PMD concurrently due to e.g. fast GUP or page_vma_mapped_walk
      and they must take care to use READ_ONCE to avoid value tearing or caching
      of stale values by the compiler. Unfortunately, these functions call into
      our pgtable macros, which don't use READ_ONCE, and compiler caching has
      been observed to cause the following crash during ext4 writeback:
      
      PC is at check_pte+0x20/0x170
      LR is at page_vma_mapped_walk+0x2e0/0x540
      [...]
      Process doio (pid: 2463, stack limit = 0xffff00000f2e8000)
      Call trace:
      [<ffff000008233328>] check_pte+0x20/0x170
      [<ffff000008233758>] page_vma_mapped_walk+0x2e0/0x540
      [<ffff000008234adc>] page_mkclean_one+0xac/0x278
      [<ffff000008234d98>] rmap_walk_file+0xf0/0x238
      [<ffff000008236e74>] rmap_walk+0x64/0xa0
      [<ffff0000082370c8>] page_mkclean+0x90/0xa8
      [<ffff0000081f3c64>] clear_page_dirty_for_io+0x84/0x2a8
      [<ffff00000832f984>] mpage_submit_page+0x34/0x98
      [<ffff00000832fb4c>] mpage_process_page_bufs+0x164/0x170
      [<ffff00000832fc8c>] mpage_prepare_extent_to_map+0x134/0x2b8
      [<ffff00000833530c>] ext4_writepages+0x484/0xe30
      [<ffff0000081f6ab4>] do_writepages+0x44/0xe8
      [<ffff0000081e5bd4>] __filemap_fdatawrite_range+0xbc/0x110
      [<ffff0000081e5e68>] file_write_and_wait_range+0x48/0xd8
      [<ffff000008324310>] ext4_sync_file+0x80/0x4b8
      [<ffff0000082bd434>] vfs_fsync_range+0x64/0xc0
      [<ffff0000082332b4>] SyS_msync+0x194/0x1e8
      
      This is because page_vma_mapped_walk loads the PMD twice before calling
      pte_offset_map: the first time without READ_ONCE (where it gets all zeroes
      due to a concurrent pmdp_invalidate) and the second time with READ_ONCE
      (where it sees a valid table pointer due to a concurrent pmd_populate).
      However, the compiler inlines everything and caches the first value in
      a register, which is subsequently used in pte_offset_phys which returns
      a junk pointer that is later dereferenced when attempting to access the
      relevant pte.
      
      This patch fixes the issue by using READ_ONCE in pte_offset_phys to ensure
      that a stale value is not used. Whilst this is a point fix for a known
      failure (and simple to backport), a full fix moving all of our page table
      accessors over to {READ,WRITE}_ONCE and consistently using READ_ONCE in
      page_vma_mapped_walk is in the works for a future kernel release.
      
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Timur Tabi <timur@codeaurora.org>
      Cc: <stable@vger.kernel.org>
      Fixes: f27176cf ("mm: convert page_mkclean_one() to use page_vma_mapped_walk()")
      Tested-by: Richard Ruigrok <rruigrok@codeaurora.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      f069faba
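      The idea behind the point fix can be shown with a user-space
      approximation of READ_ONCE. Everything here is a sketch:
      `SKETCH_READ_ONCE_U64`, `SKETCH_ADDR_MASK` and
      `sketch_pte_table_phys()` are hypothetical names, the mask is an
      assumed layout, and the kernel's real macro is more general.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Approximation of READ_ONCE for a 64-bit value: the volatile access asks
 * the compiler to perform a fresh load at this point instead of reusing a
 * value it read earlier and kept in a register.
 */
#define SKETCH_READ_ONCE_U64(x) (*(const volatile uint64_t *)&(x))

/* Hypothetical mask selecting the table address bits of an entry. */
#define SKETCH_ADDR_MASK UINT64_C(0x0000fffffffff000)

/*
 * Derive the physical address of the next-level table from a PMD entry,
 * loading the entry exactly where it is needed.
 */
uint64_t sketch_pte_table_phys(uint64_t *pmdp)
{
    assert(pmdp != NULL);
    uint64_t pmd = SKETCH_READ_ONCE_U64(*pmdp);
    return pmd & SKETCH_ADDR_MASK;
}
```

      Because the load happens inside the accessor, an earlier plain read of
      the same entry in the caller cannot be substituted for it by the
      compiler.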
    • kvm/x86: Handle async PF in RCU read-side critical sections · b862789a
      Boqun Feng authored
      Sasha Levin reported a WARNING:
      
      | WARNING: CPU: 0 PID: 6974 at kernel/rcu/tree_plugin.h:329
      | rcu_preempt_note_context_switch kernel/rcu/tree_plugin.h:329 [inline]
      | WARNING: CPU: 0 PID: 6974 at kernel/rcu/tree_plugin.h:329
      | rcu_note_context_switch+0x16c/0x2210 kernel/rcu/tree.c:458
      ...
      | CPU: 0 PID: 6974 Comm: syz-fuzzer Not tainted 4.13.0-next-20170908+ #246
      | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
      | 1.10.1-1ubuntu1 04/01/2014
      | Call Trace:
      ...
      | RIP: 0010:rcu_preempt_note_context_switch kernel/rcu/tree_plugin.h:329 [inline]
      | RIP: 0010:rcu_note_context_switch+0x16c/0x2210 kernel/rcu/tree.c:458
      | RSP: 0018:ffff88003b2debc8 EFLAGS: 00010002
      | RAX: 0000000000000001 RBX: 1ffff1000765bd85 RCX: 0000000000000000
      | RDX: 1ffff100075d7882 RSI: ffffffffb5c7da20 RDI: ffff88003aebc410
      | RBP: ffff88003b2def30 R08: dffffc0000000000 R09: 0000000000000001
      | R10: 0000000000000000 R11: 0000000000000000 R12: ffff88003b2def08
      | R13: 0000000000000000 R14: ffff88003aebc040 R15: ffff88003aebc040
      | __schedule+0x201/0x2240 kernel/sched/core.c:3292
      | schedule+0x113/0x460 kernel/sched/core.c:3421
      | kvm_async_pf_task_wait+0x43f/0x940 arch/x86/kernel/kvm.c:158
      | do_async_page_fault+0x72/0x90 arch/x86/kernel/kvm.c:271
      | async_page_fault+0x22/0x30 arch/x86/entry/entry_64.S:1069
      | RIP: 0010:format_decode+0x240/0x830 lib/vsprintf.c:1996
      | RSP: 0018:ffff88003b2df520 EFLAGS: 00010283
      | RAX: 000000000000003f RBX: ffffffffb5d1e141 RCX: ffff88003b2df670
      | RDX: 0000000000000001 RSI: dffffc0000000000 RDI: ffffffffb5d1e140
      | RBP: ffff88003b2df560 R08: dffffc0000000000 R09: 0000000000000000
      | R10: ffff88003b2df718 R11: 0000000000000000 R12: ffff88003b2df5d8
      | R13: 0000000000000064 R14: ffffffffb5d1e140 R15: 0000000000000000
      | vsnprintf+0x173/0x1700 lib/vsprintf.c:2136
      | sprintf+0xbe/0xf0 lib/vsprintf.c:2386
      | proc_self_get_link+0xfb/0x1c0 fs/proc/self.c:23
      | get_link fs/namei.c:1047 [inline]
      | link_path_walk+0x1041/0x1490 fs/namei.c:2127
      ...
      
      This happened when the host hit a page fault, and delivered it as an
      async page fault, while the guest was in an RCU read-side critical
      section.  The guest then tries to reschedule in kvm_async_pf_task_wait(),
      but rcu_preempt_note_context_switch() would treat the reschedule as a
      sleep in RCU read-side critical section, which is not allowed (even in
      preemptible RCU).  Thus the WARN.
      
      To cure this, make kvm_async_pf_task_wait() go to the halt path if the
      PF happens in an RCU read-side critical section.
      Reported-by: Sasha Levin <levinsasha928@gmail.com>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      b862789a
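      The decision this fix adds can be sketched as a small predicate. It is
      an illustration only: `struct sketch_task_state` and
      `sketch_must_halt()` are hypothetical, and the `already_atomic` flag is
      an assumed placeholder for whatever other conditions already forced
      the halt path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the faulting task's context. */
struct sketch_task_state {
    bool already_atomic;  /* assumed: some other reason scheduling is forbidden */
    int rcu_read_depth;   /* nesting depth of RCU read-side critical sections */
};

/*
 * Return true when the async page fault waiter must halt the CPU until
 * the page is ready, rather than reschedule. Rescheduling inside an RCU
 * read-side critical section is what triggered the warning.
 */
bool sketch_must_halt(const struct sketch_task_state *t)
{
    assert(t != NULL);
    return t->already_atomic || t->rcu_read_depth > 0;
}
```

      A task with a depth of zero and no other restriction may still take
      the scheduling path; any nonzero depth selects the halt path.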
    • KVM: nVMX: Fix nested #PF intends to break L1's vmlauch/vmresume · 305d0ab4
      Wanpeng Li authored
      ------------[ cut here ]------------
       WARNING: CPU: 4 PID: 5280 at /home/kernel/linux/arch/x86/kvm//vmx.c:11394 nested_vmx_vmexit+0xc2b/0xd70 [kvm_intel]
       CPU: 4 PID: 5280 Comm: qemu-system-x86 Tainted: G        W  OE   4.13.0+ #17
       RIP: 0010:nested_vmx_vmexit+0xc2b/0xd70 [kvm_intel]
       Call Trace:
        ? emulator_read_emulated+0x15/0x20 [kvm]
        ? segmented_read+0xae/0xf0 [kvm]
        vmx_inject_page_fault_nested+0x60/0x70 [kvm_intel]
        ? vmx_inject_page_fault_nested+0x60/0x70 [kvm_intel]
        x86_emulate_instruction+0x733/0x810 [kvm]
        vmx_handle_exit+0x2f4/0xda0 [kvm_intel]
        ? kvm_arch_vcpu_ioctl_run+0xd2f/0x1c60 [kvm]
        kvm_arch_vcpu_ioctl_run+0xdab/0x1c60 [kvm]
        ? kvm_arch_vcpu_load+0x62/0x230 [kvm]
        kvm_vcpu_ioctl+0x340/0x700 [kvm]
        ? kvm_vcpu_ioctl+0x340/0x700 [kvm]
        ? __fget+0xfc/0x210
        do_vfs_ioctl+0xa4/0x6a0
        ? __fget+0x11d/0x210
        SyS_ioctl+0x79/0x90
        entry_SYSCALL_64_fastpath+0x23/0xc2
      
      A nested #PF is triggered while L0 is emulating an instruction for L2.
      However, the code does not consider that we should not break L1's
      vmlaunch/vmresume. This patch fixes it by queuing the #PF exception
      instead, requesting an immediate VM exit from L2 and keeping the
      exception for L1 pending for a subsequent nested VM exit.
      
      This should actually work all the time, making vmx_inject_page_fault_nested
      totally unnecessary.  However, that's not working yet, so this patch can work
      around the issue in the meanwhile.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      305d0ab4
    • locking/rwsem-xadd: Fix missed wakeup due to reordering of load · 9c29c318
      Prateek Sood authored
      If a spinner is present, there is a chance that the load of
      rwsem_has_spinner() in rwsem_wake() can be reordered with
      respect to decrement of rwsem count in __up_write() leading
      to wakeup being missed:
      
       spinning writer                  up_write caller
       ---------------                  -----------------------
       [S] osq_unlock()                 [L] osq
        spin_lock(wait_lock)
        sem->count=0xFFFFFFFF00000001
                  +0xFFFFFFFF00000000
        count=sem->count
        MB
                                         sem->count=0xFFFFFFFE00000001
                                                   -0xFFFFFFFF00000001
                                         spin_trylock(wait_lock)
                                         return
       rwsem_try_write_lock(count)
       spin_unlock(wait_lock)
       schedule()
      
      Reordering of atomic_long_sub_return_release() in __up_write()
      and rwsem_has_spinner() in rwsem_wake() can cause a wakeup to be
      missed in up_write() context. In the spinning writer, sem->count
      and the local variable count are both 0xFFFFFFFE00000001. This
      results in rwsem_try_write_lock() failing to acquire the rwsem and
      the spinning writer going to sleep in rwsem_down_write_failed().
      
      The smp_rmb() will make sure that the spinner state is
      consulted after sem->count is updated in up_write context.
      Signed-off-by: Prateek Sood <prsood@codeaurora.org>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dave@stgolabs.net
      Cc: longman@redhat.com
      Cc: parri.andrea@gmail.com
      Cc: sramana@codeaurora.org
      Link: http://lkml.kernel.org/r/1504794658-15397-1-git-send-email-prsood@codeaurora.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      9c29c318
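      The ordering requirement in this message can be sketched with C11
      atomics in user space. This is not the kernel's rwsem: `struct
      sketch_rwsem`, `sketch_up_write()` and `sketch_wake_sees_spinner()` are
      hypothetical, and a full sequentially consistent fence is used for
      simplicity, which is stronger than the smp_rmb() the commit adds.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified lock state. */
struct sketch_rwsem {
    atomic_long count;        /* lock word */
    atomic_bool has_spinner;  /* true while a writer is optimistically spinning */
};

/* Release the write lock by removing the writer's contribution from count. */
long sketch_up_write(struct sketch_rwsem *sem, long write_bias)
{
    assert(sem != NULL);
    /* fetch_sub returns the old value; subtract again to get the new one. */
    return atomic_fetch_sub_explicit(&sem->count, write_bias,
                                     memory_order_release) - write_bias;
}

/* Check for a spinner only after the count update has been ordered first. */
bool sketch_wake_sees_spinner(struct sketch_rwsem *sem)
{
    assert(sem != NULL);
    /*
     * Fence between the count update and the spinner check, so the check
     * cannot be satisfied by a value read before the update. A full fence
     * is an assumption of this sketch, not what the kernel uses.
     */
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&sem->has_spinner, memory_order_relaxed);
}
```

      A caller would run sketch_up_write() and then
      sketch_wake_sees_spinner(); without the fence, the second load could
      be ordered ahead of the first update and report stale spinner state.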
    • Merge tag 'drm-misc-fixes-2017-09-28-1' of... · 2b702e72
      Dave Airlie authored
      Merge tag 'drm-misc-fixes-2017-09-28-1' of git://anongit.freedesktop.org/git/drm-misc into drm-fixes
      
      Driver Changes:
      - qxl: fix primary surface and fb unpinning (Gerd)
      - sun4i: fix CEC_PIN config gate now that media has been merged (Hans)
      - tegra: fix TRACE_INCLUDE_PATH (Thierry)
      
      Cc: Thierry Reding <treding@nvidia.com>
      Cc: Hans Verkuil <hverkuil@xs4all.nl>
      Cc: Gerd Hoffmann <kraxel@redhat.com>
      
      * tag 'drm-misc-fixes-2017-09-28-1' of git://anongit.freedesktop.org/git/drm-misc:
        drm/tegra: trace: Fix path to include
        qxl: fix framebuffer unpinning
        drm/sun4i: cec: Enable back CEC-pin framework
        qxl: fix primary surface handling
      2b702e72
  4. 28 Sep, 2017 16 commits
    • Merge tag 'acpi-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 770b782f
      Linus Torvalds authored
      Pull ACPI fix from Rafael Wysocki:
       "This fixes an APEI problem that may cause a reported error to be
        missed due to a race condition"
      
      * tag 'acpi-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI / APEI: clear error status before acknowledging the error
      770b782f
    • Merge tag 'pm-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 74de8187
      Linus Torvalds authored
      Pull power management fixes from Rafael Wysocki:
       "These fix a deadlock in the operating performance points (OPP)
        framework introduced during the 4.11 cycle, more issues with duplicate
        device objects for cpufreq-dt and cpufreq documentation.
      
        Specifics:
      
         - Fix a deadlock in the operating performance points (OPP) framework
           caused by a notifier callback taking a lock that's already held by
           its caller (Viresh Kumar).
      
         - Prevent the ti-cpufreq and cpufreq-dt-platdev drivers from
           attempting to register conflicting device objects which triggers a
           warning from sysfs (Suniel Mahesh).
      
         - Drop a stale reference to a piece of intel_pstate documentation
           that's not in the tree any more (Rafael Wysocki)"
      
      * tag 'pm-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        cpufreq: docs: Drop intel-pstate.txt from index.txt
        cpufreq: dt: Fix sysfs duplicate filename creation for platform-device
        PM / OPP: Call notifier without holding opp_table->lock
      74de8187
    • Merge tag 'xfs-4.14-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · 02a2b053
      Linus Torvalds authored
      Pull xfs fixes from Darrick Wong:
      
       - fix various problems with the copy-on-write extent maps getting freed
         at the wrong time
      
       - fix printk format specifier problems
      
       - report zeroing operation outcomes instead of dropping them on the
         floor
      
       - fix some crashes when dio operations partially fail
      
       - fix a race condition between unwritten extent conversion & dio read
      
       - fix some incorrect tests in the inode log item processing
      
       - correct the delayed allocation space reservations on rmap filesystems
      
       - fix some problems checking for dax support
      
      * tag 'xfs-4.14-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: revert "xfs: factor rmap btree size into the indlen calculations"
        xfs: Capture state of the right inode in xfs_iflush_done
        xfs: perag initialization should only touch m_ag_max_usable for AG 0
        xfs: update i_size after unwritten conversion in dio completion
        iomap_dio_rw: Allocate AIO completion queue before submitting dio
        xfs: validate bdev support for DAX inode flag
        xfs: remove redundant re-initialization of total_nr_pages
        xfs: Output warning message when discard option was enabled even though the device does not support discard
        xfs: report zeroed or not correctly in xfs_zero_range()
        xfs: kill meaningless variable 'zero'
        fs/xfs: Use %pS printk format for direct addresses
        xfs: evict CoW fork extents when performing finsert/fcollapse
        xfs: don't unconditionally clear the reflink flag on zero-block files
      02a2b053
    • Revert "Bluetooth: Add option for disabling legacy ioctl interfaces" · e49aa15e
      Linus Torvalds authored
      This reverts commit dbbccdc4.
      
      It turns out that the "legacy" users aren't so legacy at all, and that
      turning off the legacy ioctl will break the current Qt bluetooth stack
      for bluetooth LE devices that were released just a couple of months ago.
      
      So it's simply not true that this was a legacy interface that hasn't
      been needed and is only limited to old legacy BT devices.  Because I
      actually read Kconfig help messages, and actively try to turn off
      features that I don't need, I turned the option off.
      
      Then I spent _way_ too much time debugging BLE issues until I realized
      that it wasn't the Qt and subsurface development that had broken one of
      my dive computer BLE downloads, but simply my broken kernel config.
      
      Maybe in a decade it will be true that this is a legacy interface.  And
      maybe with a better help-text and correct dependencies, this kind of
      legacy removal might be acceptable.  But as things are right now both
      the commit message and the Kconfig help text were misleading, and the
      Kconfig option had the wrong dependencies.
      
      There's no reason to keep that broken Kconfig option in the tree.
      
      Cc: Marcel Holtmann <marcel@holtmann.org>
      Cc: Johan Hedberg <johan.hedberg@intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e49aa15e
    • Merge branch 'acpi-apei' · 333d1774
      Rafael J. Wysocki authored
      * acpi-apei:
        ACPI / APEI: clear error status before acknowledging the error
      333d1774
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma · 91735832
      Linus Torvalds authored
      Pull rdma fixes from Doug Ledford:
       "Second -rc update for 4.14.
      
        Both Mellanox and Intel had a series of -rc fixes that landed this
        week. The Mellanox bunch is spread throughout the stack and not just
        in their driver, whereas the Intel bunch was mostly in the hfi1
        driver. And, several of the fixes in the hfi1 driver were more than
        just simple 5 line fixes. As a result, the hfi1 driver fixes have a
        sizable LOC count.
      
        Everything else is as one would expect in an RC cycle in terms of LOC
        count. One item that might jump out and make you think "That's not an
        rc item" is the fix that corrects a typo. But, that change fixes a
        typo in a user visible API that was just added in this merge window,
        so if we fix it now, we can fix it. If we don't, the typo is in the
        API forever. Another that might not appear to be a fix at first glance
        is the Simplify mlx5_ib_cont_pages patch, but the simplification
        allows them to fix a bug in the existing function whenever the length
        of an SGE exceeded page size. We also had to revert one patch from the
        merge window that was wrong.
      
        Summary:
      
         - a few core fixes
         - a few ipoib fixes
         - a few mlx5 fixes
         - a 7-patch hfi1 related series"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
        IB/hfi1: Unsuccessful PCIe caps tuning should not fail driver load
        IB/hfi1: On error, fix use after free during user context setup
        Revert "IB/ipoib: Update broadcast object if PKey value was changed in index 0"
        IB/hfi1: Return correct value in general interrupt handler
        IB/hfi1: Check eeprom config partition validity
        IB/hfi1: Only reset QSFP after link up and turn off AOC TX
        IB/hfi1: Turn off AOC TX after offline substates
        IB/mlx5: Fix NULL deference on mlx5_ib_update_xlt failure
        IB/mlx5: Simplify mlx5_ib_cont_pages
        IB/ipoib: Fix inconsistency with free_netdev and free_rdma_netdev
        IB/ipoib: Fix sysfs Pkey create<->remove possible deadlock
        IB: Correct MR length field to be 64-bit
        IB/core: Fix qp_sec use after free access
        IB/core: Fix typo in the name of the tag-matching cap struct
      91735832
    • Rafael J. Wysocki's avatar
      Merge branches 'pm-opp' and 'pm-cpufreq' · abeb19a2
      Rafael J. Wysocki authored
      * pm-opp:
        PM / OPP: Call notifier without holding opp_table->lock
      
      * pm-cpufreq:
        cpufreq: docs: Drop intel-pstate.txt from index.txt
        cpufreq: dt: Fix sysfs duplicate filename creation for platform-device
      abeb19a2
    • Linus Torvalds's avatar
      Merge tag 'seccomp-v4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 26e811cd
      Linus Torvalds authored
      Pull seccomp fix from Kees Cook:
       "Fix refcounting bug in CRIU interface, noticed by Chris Salls (Oleg &
        Tycho)"
      
      * tag 'seccomp-v4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        seccomp: fix the usage of get/put_seccomp_filter() in seccomp_get_filter()
      26e811cd
    • Paolo Bonzini's avatar
      KVM: VMX: use cmpxchg64 · c0a1666b
      Paolo Bonzini authored
      This fixes a compilation failure on 32-bit systems.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      c0a1666b
    • Zhenzhong Duan's avatar
      xen/mmu: Call xen_cleanhighmap() with 4MB aligned for page tables mapping · 0d805ee7
      Zhenzhong Duan authored
      When booting a PVM guest with large memory (e.g. 240GB), the initial
      mapping provided by Xen overlaps with the kernel module virtual space. When
      the mapping in this space is cleared by xen_cleanhighmap(), in certain cases
      a 2MB mapping can be left behind. This is because Xen sets up a 4MB aligned
      mapping, but xen_cleanhighmap() finishes at a 2MB boundary.
      
      When a module is loaded right on top of that 2MB space, the following
      warning is triggered:
      
      WARNING: at mm/vmalloc.c:106 vmap_pte_range+0x14e/0x190()
      Call Trace:
       [<ffffffff81117083>] warn_alloc_failed+0xf3/0x160
       [<ffffffff81146022>] __vmalloc_area_node+0x182/0x1c0
       [<ffffffff810ac91e>] ? module_alloc_update_bounds+0x1e/0x80
       [<ffffffff81145df7>] __vmalloc_node_range+0xa7/0x110
       [<ffffffff810ac91e>] ? module_alloc_update_bounds+0x1e/0x80
       [<ffffffff8103ca54>] module_alloc+0x64/0x70
       [<ffffffff810ac91e>] ? module_alloc_update_bounds+0x1e/0x80
       [<ffffffff810ac91e>] module_alloc_update_bounds+0x1e/0x80
       [<ffffffff810ac9a7>] move_module+0x27/0x150
       [<ffffffff810aefa0>] layout_and_allocate+0x120/0x1b0
       [<ffffffff810af0a8>] load_module+0x78/0x640
       [<ffffffff811ff90b>] ? security_file_permission+0x8b/0x90
       [<ffffffff810af6d2>] sys_init_module+0x62/0x1e0
       [<ffffffff815154c2>] system_call_fastpath+0x16/0x1b
      
      The 2MB mapping is then cleared, and the kernel finally oopses when a page
      in that space is accessed.
      
      BUG: unable to handle kernel paging request at ffff880022600000
      IP: [<ffffffff81260877>] clear_page_c_e+0x7/0x10
      PGD 1788067 PUD 178c067 PMD 22434067 PTE 0
      Oops: 0002 [#1] SMP
      Call Trace:
       [<ffffffff81116ef7>] ? prep_new_page+0x127/0x1c0
       [<ffffffff81117d42>] get_page_from_freelist+0x1e2/0x550
       [<ffffffff81133010>] ? ii_iovec_copy_to_user+0x90/0x140
       [<ffffffff81119c9d>] __alloc_pages_nodemask+0x12d/0x230
       [<ffffffff81155516>] alloc_pages_vma+0xc6/0x1a0
       [<ffffffff81006ffd>] ? pte_mfn_to_pfn+0x7d/0x100
       [<ffffffff81134cfb>] do_anonymous_page+0x16b/0x350
       [<ffffffff81139c34>] handle_pte_fault+0x1e4/0x200
       [<ffffffff8100712e>] ? xen_pmd_val+0xe/0x10
       [<ffffffff810052c9>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
       [<ffffffff81139dab>] handle_mm_fault+0x15b/0x270
       [<ffffffff81510c10>] do_page_fault+0x140/0x470
       [<ffffffff8150d7d5>] page_fault+0x25/0x30
      
      Call xen_cleanhighmap() with 4MB alignment for the page tables mapping to
      fix it. The unnecessary call of xen_cleanhighmap() in DEBUG mode is also
      removed.
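
      The mismatch described in this commit comes down to alignment arithmetic.
      The following is a minimal userspace sketch, not the kernel code: it
      assumes 2MB and 4MB granularities as stated in the message, and the helper
      names (round_up_pow2, leftover_after_2m_cleanup) and any addresses passed
      to them are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_2M (UINT64_C(2) << 20)
#define SZ_4M (UINT64_C(4) << 20)

/* Round addr up to the next multiple of align (align must be a power of two). */
static uint64_t round_up_pow2(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/*
 * Bytes that stay mapped when the mapping was created up to a 4MB boundary
 * but the cleanup only runs up to the next 2MB boundary.
 */
static uint64_t leftover_after_2m_cleanup(uint64_t mapping_end)
{
    return round_up_pow2(mapping_end, SZ_4M) - round_up_pow2(mapping_end, SZ_2M);
}
```

      With an end offset of 1MB the two boundaries differ by 2MB, while an end
      offset of 3MB gives the same boundary for both, which matches the "in
      certain cases" wording of the message.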
      
      -v2: add comment about XEN alignment from Juergen.
      
      References: https://lists.xen.org/archives/html/xen-devel/2012-07/msg01562.html
      Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
      Reviewed-by: Juergen Gross <jgross@suse.com>
      
      [boris: added 'xen/mmu' tag to commit subject]
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      0d805ee7
    • Jan Beulich's avatar
      xen-pciback: relax BAR sizing write value check · 8c28ef3f
      Jan Beulich authored
      Just like done in d2bd05d8 ("xen-pciback: return proper values during
      BAR sizing") for the ROM BAR, ordinary ones also shouldn't compare the
      written value directly against ~0, but consider the r/o bits at the
      bottom (if any).
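
      The idea of the relaxed check can be sketched as follows. This is an
      illustrative userspace example, not the xen-pciback code: the function
      names and TOY_* macros are hypothetical, and the read-only bit widths
      (4 low bits for memory BARs, 2 for I/O BARs) are assumed from the PCI
      BAR layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Low BAR bits that are read-only flags (assumed widths, see lead-in). */
#define TOY_MEM_BAR_RO_BITS UINT32_C(0x0f)
#define TOY_IO_BAR_RO_BITS  UINT32_C(0x03)

/* Strict check: only an all-ones write counts as a sizing request. */
static bool toy_is_sizing_write_strict(uint32_t value)
{
    return value == UINT32_MAX;
}

/* Relaxed check: ignore the read-only low bits before comparing against ~0. */
static bool toy_is_sizing_write_relaxed(uint32_t value, uint32_t ro_bits)
{
    assert((ro_bits & ~UINT32_C(0x0f)) == 0);
    return (value | ro_bits) == UINT32_MAX;
}
```

      A guest that writes 0xfffffff0 to a memory BAR is rejected by the strict
      comparison but accepted by the relaxed one.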
      Signed-off-by: Jan Beulich <jbeulich@suse.com>
      Reviewed-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      8c28ef3f
    • Jeffy Chen's avatar
      irq/generic-chip: Don't replace domain's name · 72364d32
      Jeffy Chen authored
      When generic irq chips are allocated for an irq domain the domain name is
      set to the irq chip name. That was done to have named domains before the
      recent changes which enforce domain naming were done.
      
      Since then the overwrite causes a memory leak when the domain name is
      dynamically allocated and even worse it would cause the domain free code to
      free the wrong name pointer, which might point to a constant.
      
      Remove the name assignment to prevent this.
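
      The ownership problem can be illustrated with a small userspace sketch.
      It is an assumption-laden toy, not the genirq code: struct toy_domain,
      struct toy_chip and the toy_* functions are hypothetical stand-ins for a
      domain that owns a heap-allocated name and a chip whose name may be a
      string constant.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct toy_domain {
    char *name;         /* heap-allocated, owned and freed by the domain */
};

struct toy_chip {
    const char *name;   /* may point to a string constant */
};

/* Create a domain with its own private copy of the name. */
static struct toy_domain *toy_domain_create(const char *name)
{
    struct toy_domain *d = malloc(sizeof(*d));
    size_t len = strlen(name) + 1;

    if (!d)
        return NULL;
    d->name = malloc(len);
    if (!d->name) {
        free(d);
        return NULL;
    }
    memcpy(d->name, name, len);
    return d;
}

/*
 * Attach a chip without touching d->name. Assigning chip->name here would
 * leak the allocated name and later make toy_domain_free() free a pointer
 * the domain never allocated.
 */
static void toy_attach_chip(struct toy_domain *d, const struct toy_chip *chip)
{
    assert(d && chip);
    (void)chip;
}

/* Free the domain together with the name it owns. */
static void toy_domain_free(struct toy_domain *d)
{
    free(d->name);
    free(d);
}
```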
      
      Fixes: d59f6617 ("genirq: Allow fwnode to carry name information only")
      Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20170928043731.4764-1-jeffy.chen@rock-chips.com
      72364d32
    • Oleg Nesterov's avatar
      seccomp: fix the usage of get/put_seccomp_filter() in seccomp_get_filter() · 66a733ea
      Oleg Nesterov authored
      As Chris explains, get_seccomp_filter() and put_seccomp_filter() can end
      up using different filters. Once we drop ->siglock it is possible for
      task->seccomp.filter to have been replaced by SECCOMP_FILTER_FLAG_TSYNC.
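
      The pattern at issue can be sketched in plain C. This is a simplified,
      single-threaded illustration with no locking, not the seccomp code: the
      toy_* types and functions are hypothetical, and the assignment to
      task.filter in the usage note stands in for a concurrent replacement.
      The point is that the reference is dropped on the pointer that was
      actually taken, not on whatever the task points to later.

```c
#include <assert.h>
#include <stddef.h>

struct toy_filter {
    int refs;
};

struct toy_task {
    struct toy_filter *filter;
};

/* Read the task's filter once, take a reference, and hand back that pointer. */
static struct toy_filter *toy_get_filter(struct toy_task *task)
{
    struct toy_filter *f = task->filter;

    if (f)
        f->refs++;
    return f;
}

/* Drop a reference on exactly the filter the caller holds. */
static void toy_put_filter(struct toy_filter *f)
{
    if (!f)
        return;
    assert(f->refs > 0);
    f->refs--;
}
```

      If the caller keeps the pointer returned by toy_get_filter() and passes
      it to toy_put_filter(), replacing task.filter in between does not
      unbalance either filter's count.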
      
      Fixes: f8e529ed ("seccomp, ptrace: add support for dumping seccomp filters")
      Reported-by: Chris Salls <chrissalls5@gmail.com>
      Cc: stable@vger.kernel.org # needs s/refcount_/atomic_/ for v4.12 and earlier
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      [tycho: add __get_seccomp_filter vs. open coding refcount_inc()]
      Signed-off-by: Tycho Andersen <tycho@docker.com>
      [kees: tweak commit log]
      Signed-off-by: Kees Cook <keescook@chromium.org>
      66a733ea
    • Josh Poimboeuf's avatar
      objtool: Support unoptimized frame pointer setup · 607a4029
      Josh Poimboeuf authored
      Arnd Bergmann reported a bunch of warnings like:
      
        crypto/jitterentropy.o: warning: objtool: jent_fold_time()+0x3b: call without frame pointer save/setup
        crypto/jitterentropy.o: warning: objtool: jent_stuck()+0x1d: call without frame pointer save/setup
        crypto/jitterentropy.o: warning: objtool: jent_unbiased_bit()+0x15: call without frame pointer save/setup
        crypto/jitterentropy.o: warning: objtool: jent_read_entropy()+0x32: call without frame pointer save/setup
        crypto/jitterentropy.o: warning: objtool: jent_entropy_collector_free()+0x19: call without frame pointer save/setup
      
      and
      
        arch/x86/events/core.o: warning: objtool: collect_events uses BP as a scratch register
        arch/x86/events/core.o: warning: objtool: events_ht_sysfs_show()+0x22: call without frame pointer save/setup
      
      With certain rare configurations, GCC sometimes sets up the frame
      pointer with:
      
        lea    (%rsp),%rbp
      
      instead of:
      
        mov    %rsp,%rbp
      
      The instructions are equivalent, so treat the former like the latter.
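
      As a rough illustration of treating the two forms alike, the sketch
      below classifies a decoded instruction as a frame pointer setup. It is
      not objtool's decoder: the toy_* enums, the struct layout and the
      function name are hypothetical, and it assumes a lea with zero
      displacement is the only lea form of interest.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_reg { TOY_REG_RSP, TOY_REG_RBP, TOY_REG_OTHER };
enum toy_op  { TOY_OP_MOV, TOY_OP_LEA };

struct toy_insn {
    enum toy_op op;
    enum toy_reg src;    /* source register, or base register for lea */
    enum toy_reg dest;
    long disp;           /* displacement, only meaningful for lea */
};

/*
 * "mov %rsp,%rbp" and "lea (%rsp),%rbp" both copy the stack pointer into
 * the frame pointer, so both count as frame pointer setup.
 */
static bool toy_is_frame_pointer_setup(const struct toy_insn *insn)
{
    assert(insn);
    if (insn->src != TOY_REG_RSP || insn->dest != TOY_REG_RBP)
        return false;
    if (insn->op == TOY_OP_MOV)
        return true;
    return insn->op == TOY_OP_LEA && insn->disp == 0;
}
```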
      Reported-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/a468af8b28a69b83fffc6d7668be9b6fcc873699.1506526584.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      607a4029
    • Josh Poimboeuf's avatar
      objtool: Skip unreachable warnings for GCC 4.4 and older · da541b20
      Josh Poimboeuf authored
      The kbuild bot occasionally reports warnings like:
      
        drivers/scsi/pcmcia/aha152x_core.o: warning: objtool: seldo_run()+0x130: unreachable instruction
      
      These warnings are always with GCC 4.4.  That version of GCC sometimes
      places unreachable instructions after calls to noreturn functions.
      
      The unreachable warnings aren't very important anyway.  Just ignore them
      for old versions of GCC.
      Reported-by: kbuild test robot <fengguang.wu@intel.com>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/bc89b807d965b98ec18a0bb94f96a594bd58f2f2.1506551639.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      da541b20
    • Shaohua Li's avatar
      md/raid5: cap worker count · 7d5d7b50
      Shaohua Li authored
      A static checker reports a potential integer overflow. Cap the worker count to
      avoid the overflow.
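
      A generic way to apply such a cap is to validate the count before it is
      used in a size computation. The sketch below is a userspace illustration
      only, not the md/raid5 change: the function name and the TOY_MAX_WORKERS
      value are hypothetical and chosen purely for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_WORKERS 8192u   /* hypothetical cap, picked for illustration */

/*
 * Compute count * per_worker bytes. Returns 0 and stores the result in *out,
 * or -1 if count exceeds the cap or the product would not fit in size_t.
 */
static int toy_worker_bytes(unsigned int count, size_t per_worker, size_t *out)
{
    assert(out);
    if (count > TOY_MAX_WORKERS)
        return -1;
    if (per_worker != 0 && count > SIZE_MAX / per_worker)
        return -1;
    *out = (size_t)count * per_worker;
    return 0;
}
```

      Rejecting oversized counts up front keeps every later multiplication
      within a known bound.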
      
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
      7d5d7b50