1. 03 Aug, 2017 5 commits
    • Kees Cook's avatar
      ipc: add missing container_of()s for randstruct · ade9f91b
      Kees Cook authored
      When building with the randstruct gcc plugin, the layout of the IPC
      structs will be randomized, which requires any sub-structure accesses to
      use container_of().  The proc display handlers were missing the needed
      container_of()s since the iterator is passing in the top-level struct
      kern_ipc_perm.
      
      This would lead to crashes when running the "lsipc" program after the
      system had IPC registered (e.g. after starting up Gnome):
      
        general protection fault: 0000 [#1] PREEMPT SMP
        ...
        RIP: 0010:shm_add_rss_swap.isra.1+0x13/0xa0
        ...
        Call Trace:
          sysvipc_shm_proc_show+0x5e/0x150
          sysvipc_proc_show+0x1a/0x30
          seq_read+0x2e9/0x3f0
        ...
      
      Link: http://lkml.kernel.org/r/20170730205950.GA55841@beast
      Fixes: 3859a271
      
       ("randstruct: Mark various structs for randomization")
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Reported-by: default avatarDominik Brodowski <linux@dominikbrodowski.net>
      Acked-by: default avatarDavidlohr Bueso <dave@stgolabs.net>
      Acked-by: Manfred Spraul <manfre...
      ade9f91b
    • Dima Zavin's avatar
      cpuset: fix a deadlock due to incomplete patching of cpusets_enabled() · 89affbf5
      Dima Zavin authored
      In codepaths that use the begin/retry interface for reading
      mems_allowed_seq with irqs disabled, there exists a race condition that
      stalls the patch process after only modifying a subset of the
      static_branch call sites.
      
      This problem manifested itself as a deadlock in the slub allocator,
      inside get_any_partial.  The loop reads mems_allowed_seq value (via
      read_mems_allowed_begin), performs the defrag operation, and then
      verifies the consistency of mem_allowed via the read_mems_allowed_retry
      and the cookie returned by xxx_begin.
      
      The issue here is that both begin and retry first check if cpusets are
      enabled via cpusets_enabled() static branch.  This branch can be
      rewritted dynamically (via cpuset_inc) if a new cpuset is created.  The
      x86 jump label code fully synchronizes across all CPUs for every entry
      it rewrites.  If it rewrites only one of the callsites (specifically the
      one in read_mems_allowed_retry) and then waits for the
      smp_call...
      89affbf5
    • Mike Rapoport's avatar
      userfaultfd_zeropage: return -ENOSPC in case mm has gone · 9d95aa4b
      Mike Rapoport authored
      In the non-cooperative userfaultfd case, the process exit may race with
      outstanding mcopy_atomic called by the uffd monitor.  Returning -ENOSPC
      instead of -EINVAL when mm is already gone will allow uffd monitor to
      distinguish this case from other error conditions.
      
      Unfortunately I overlooked userfaultfd_zeropage when updating
      userfaultd_copy().
      
      Link: http://lkml.kernel.org/r/1501136819-21857-1-git-send-email-rppt@linux.vnet.ibm.com
      Fixes: 96333187
      
       ("userfaultfd_copy: return -ENOSPC in case mm has gone")
      Signed-off-by: default avatarMike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
      Cc: Pavel Emelyanov <xemul@virtuozzo.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      9d95aa4b
    • Heiko Carstens's avatar
      mm: take memory hotplug lock within numa_zonelist_order_handler() · 167d0f25
      Heiko Carstens authored
      Andre Wild reported the following warning:
      
        WARNING: CPU: 2 PID: 1205 at kernel/cpu.c:240 lockdep_assert_cpus_held+0x4c/0x60
        Modules linked in:
        CPU: 2 PID: 1205 Comm: bash Not tainted 4.13.0-rc2-00022-gfd2b2c57 #10
        Hardware name: IBM 2964 N96 702 (z/VM 6.4.0)
        task: 00000000701d8100 task.stack: 0000000073594000
        Krnl PSW : 0704f00180000000 0000000000145e24 (lockdep_assert_cpus_held+0x4c/0x60)
        ...
        Call Trace:
         lockdep_assert_cpus_held+0x42/0x60)
         stop_machine_cpuslocked+0x62/0xf0
         build_all_zonelists+0x92/0x150
         numa_zonelist_order_handler+0x102/0x150
         proc_sys_call_handler.isra.12+0xda/0x118
         proc_sys_write+0x34/0x48
         __vfs_write+0x3c/0x178
         vfs_write+0xbc/0x1a0
         SyS_write+0x66/0xc0
         system_call+0xc4/0x2b0
         locks held by bash/1205:
         #0:  (sb_writers#4){.+.+.+}, at: vfs_write+0xa6/0x1a0
         #1:  (zl_order_mutex){+.+...}, at: numa_zonelist_order_handler+0x44/0x150
         #2:  (zonelists_mutex){+.+...}, at:...
      167d0f25
    • Tetsuo Handa's avatar
      mm/page_io.c: fix oops during block io poll in swapin path · b0ba2d0f
      Tetsuo Handa authored
      When a thread is OOM-killed during swap_readpage() operation, an oops
      occurs because end_swap_bio_read() is calling wake_up_process() based on
      an assumption that the thread which called swap_readpage() is still
      alive.
      
        Out of memory: Kill process 525 (polkitd) score 0 or sacrifice child
        Killed process 525 (polkitd) total-vm:528128kB, anon-rss:0kB, file-rss:4kB, shmem-rss:0kB
        oom_reaper: reaped process 525 (polkitd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
        general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
        Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter coretemp ppdev pcspkr vmw_balloon sg shpchp vmw_vmci parport_pc parport i2c_piix4 ip_tables xfs libcrc32c sd_mod sr_mod cdrom ata_generic pata_acpi vmwgfx ahci libahci drm_kms_helper ata_piix syscopyarea sysfillrect sysimgblt fb_sys_fops mptspi scsi_transport_spi ttm e1000 mptscsih drm mptbase i2c_core libata serio_raw
        CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.13.0-rc2-next-20170725 #129
        Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013
        task: ffffffffb7c16500 task.stack: ffffffffb7c00000
        RIP: 0010:__lock_acquire+0x151/0x12f0
        Call Trace:
         <IRQ>
         lock_acquire+0x59/0x80
         _raw_spin_lock_irqsave+0x3b/0x4f
         try_to_wake_up+0x3b/0x410
         wake_up_process+0x10/0x20
         end_swap_bio_read+0x6f/0xf0
         bio_endio+0x92/0xb0
         blk_update_request+0x88/0x270
         scsi_end_request+0x32/0x1c0
         scsi_io_completion+0x209/0x680
         scsi_finish_command+0xd4/0x120
         scsi_softirq_done+0x120/0x140
         __blk_mq_complete_request_remote+0xe/0x10
         flush_smp_call_function_queue+0x51/0x120
         generic_smp_call_function_single_interrupt+0xe/0x20
         smp_trace_call_function_single_interrupt+0x22/0x30
         smp_call_function_single_interrupt+0x9/0x10
         call_function_single_interrupt+0xa7/0xb0
         </IRQ>
        RIP: 0010:native_safe_halt+0x6/0x10
         default_idle+0xe/0x20
         arch_cpu_idle+0xa/0x10
         default_idle_call+0x1e/0x30
         do_idle+0x187/0x200
         cpu_startup_entry+0x6e/0x70
         rest_init+0xd0/0xe0
         start_kernel+0x456/0x477
         x86_64_start_reservations+0x24/0x26
         x86_64_start_kernel+0xf7/0x11a
         secondary_startup_64+0xa5/0xa5
        Code: c3 49 81 3f 20 9e 0b b8 41 bc 00 00 00 00 44 0f 45 e2 83 fe 01 0f 87 62 ff ff ff 89 f0 49 8b 44 c7 08 48 85 c0 0f 84 52 ff ff ff <f0> ff 80 98 01 00 00 8b 3d 5a 49 c4 01 45 8b b3 18 0c 00 00 85
        RIP: __lock_acquire+0x151/0x12f0 RSP: ffffa01f39e03c50
        ---[ end trace 6c441db499169b1e ]---
        Kernel panic - not syncing: Fatal exception in interrupt
        Kernel Offset: 0x36000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
        ---[ end Kernel panic - not syncing: Fatal exception in interrupt
      
      Fix it by holding a reference to the thread.
      
      [akpm@linux-foundation.org: add comment]
      Fixes: 23955622
      
       ("swap: add block io poll in swapin path")
      Signed-off-by: default avatarTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reviewed-by: default avatarShaohua Li <shli@fb.com>
      Cc: Tim Chen <tim.c.chen@intel.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b0ba2d0f
  2. 02 Aug, 2017 9 commits
    • Minchan Kim's avatar
      zram: do not free pool->size_class · 3189c820
      Minchan Kim authored
      Mike reported kernel goes oops with ltp:zram03 testcase.
      
        zram: Added device: zram0
        zram0: detected capacity change from 0 to 107374182400
        BUG: unable to handle kernel paging request at 0000306d61727a77
        IP: zs_map_object+0xb9/0x260
        PGD 0
        P4D 0
        Oops: 0000 [#1] SMP
        Dumping ftrace buffer:
           (ftrace buffer empty)
        Modules linked in: zram(E) xfs(E) libcrc32c(E) btrfs(E) xor(E) raid6_pq(E) loop(E) ebtable_filter(E) ebtables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) ip_tables(E) x_tables(E) af_packet(E) br_netfilter(E) bridge(E) stp(E) llc(E) iscsi_ibft(E) iscsi_boot_sysfs(E) nls_iso8859_1(E) nls_cp437(E) vfat(E) fat(E) intel_powerclamp(E) coretemp(E) cdc_ether(E) kvm_intel(E) usbnet(E) mii(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) iTCO_wdt(E) ghash_clmulni_intel(E) bnx2(E) iTCO_vendor_support(E) pcbc(E) ioatdma(E) ipmi_ssif(E) aesni_intel(E) i5500_temp(E) i2c_i801(E) aes_x86_64(E) lpc_ich(E) shpchp(E) mfd_core(E) crypto_simd(E) i7core_edac(E) dca(E) glue_helper(E) cryptd(E) ipmi_si(E) button(E) acpi_cpufreq(E) ipmi_devintf(E) pcspkr(E) ipmi_msghandler(E)
         nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) ext4(E) crc16(E) mbcache(E) jbd2(E) sd_mod(E) ata_generic(E) i2c_algo_bit(E) ata_piix(E) drm_kms_helper(E) ahci(E) syscopyarea(E) sysfillrect(E) libahci(E) sysimgblt(E) fb_sys_fops(E) uhci_hcd(E) ehci_pci(E) ttm(E) ehci_hcd(E) libata(E) drm(E) megaraid_sas(E) usbcore(E) sg(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E) efivarfs(E) autofs4(E) [last unloaded: zram]
        CPU: 6 PID: 12356 Comm: swapon Tainted: G            E   4.13.0.g87b2c3fc-default #194
        Hardware name: IBM System x3550 M3 -[7944K3G]-/69Y5698     , BIOS -[D6E150AUS-1.10]- 12/15/2010
        task: ffff880158d2c4c0 task.stack: ffffc90001680000
        RIP: 0010:zs_map_object+0xb9/0x260
        Call Trace:
         zram_bvec_rw.isra.26+0xe8/0x780 [zram]
         zram_rw_page+0x6e/0xa0 [zram]
         bdev_read_page+0x81/0xb0
         do_mpage_readpage+0x51a/0x710
         mpage_readpages+0x122/0x1a0
         blkdev_readpages+0x1d/0x20
         __do_page_cache_readahead+0x1b2/0x270
         ondemand_readahead+0x180/0x2c0
         page_cache_sync_readahead+0x31/0x50
         generic_file_read_iter+0x7e7/0xaf0
         blkdev_read_iter+0x37/0x40
         __vfs_read+0xce/0x140
         vfs_read+0x9e/0x150
         SyS_read+0x46/0xa0
         entry_SYSCALL_64_fastpath+0x1a/0xa5
        Code: 81 e6 00 c0 3f 00 81 fe 00 00 16 00 0f 85 9f 01 00 00 0f b7 13 65 ff 05 5e 07 dc 7e 66 c1 ea 02 81 e2 ff 01 00 00 49 8b 54 d4 08 <8b> 4a 48 41 0f af ce 81 e1 ff 0f 00 00 41 89 c9 48 c7 c3 a0 70
        RIP: zs_map_object+0xb9/0x260 RSP: ffffc90001683988
        CR2: 0000306d61727a77
      
      He bisected the problem is [1].
      
      After commit cf8e0fed ("mm/zsmalloc: simplify zs_max_alloc_size
      handling"), zram doesn't use double pointer for pool->size_class any
      more in zs_create_pool so counter function zs_destroy_pool don't need to
      free it, either.
      
      Otherwise, it does kfree wrong address and then, kernel goes Oops.
      
      Link: http://lkml.kernel.org/r/20170725062650.GA12134@bbox
      Fixes: cf8e0fed
      
       ("mm/zsmalloc: simplify zs_max_alloc_size handling")
      Signed-off-by: default avatarMinchan Kim <minchan@kernel.org>
      Reported-by: default avatarMike Galbraith <efault@gmx.de>
      Tested-by: default avatarMike Galbraith <efault@gmx.de>
      Reviewed-by: default avatarSergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      3189c820
    • Jonathan Corbet's avatar
      kthread: fix documentation build warning · d16977f3
      Jonathan Corbet authored
      The kerneldoc comment for kthread_create() had an incorrect argument
      name, leading to a warning in the docs build.
      
      Correct it, and make one more small step toward a warning-free build.
      
      Link: http://lkml.kernel.org/r/20170724135916.7f486c6f@lwn.net
      
      Signed-off-by: default avatarJonathan Corbet <corbet@lwn.net>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      d16977f3
    • Arnd Bergmann's avatar
      kasan: avoid -Wmaybe-uninitialized warning · e7701557
      Arnd Bergmann authored
      gcc-7 produces this warning:
      
        mm/kasan/report.c: In function 'kasan_report':
        mm/kasan/report.c:351:3: error: 'info.first_bad_addr' may be used uninitialized in this function [-Werror=maybe-uninitialized]
           print_shadow_for_address(info->first_bad_addr);
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        mm/kasan/report.c:360:27: note: 'info.first_bad_addr' was declared here
      
      The code seems fine as we only print info.first_bad_addr when there is a
      shadow, and we always initialize it in that case, but this is relatively
      hard for gcc to figure out after the latest rework.
      
      Adding an intialization to the most likely value together with the other
      struct members shuts up that warning.
      
      Fixes: b235b9808664 ("kasan: unify report headers")
      Link: https://patchwork.kernel.org/patch/9641417/
      Link: http://lkml.kernel.org/r/20170725152739.4176967-1-arnd@arndb.de
      
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Suggested-by: default avatarAlexander Potapenko <glider@google.com>
      Suggested-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Acked-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      e7701557
    • Mike Rapoport's avatar
      userfaultfd: non-cooperative: notify about unmap of destination during mremap · b2282371
      Mike Rapoport authored
      When mremap is called with MREMAP_FIXED it unmaps memory at the
      destination address without notifying userfaultfd monitor.
      
      If the destination were registered with userfaultfd, the monitor has no
      way to distinguish between the old and new ranges and to properly relate
      the page faults that would occur in the destination region.
      
      Fixes: 897ab3e0 ("userfaultfd: non-cooperative: add event for memory unmaps")
      Link: http://lkml.kernel.org/r/1500276876-3350-1-git-send-email-rppt@linux.vnet.ibm.com
      
      Signed-off-by: default avatarMike Rapoport <rppt@linux.vnet.ibm.com>
      Acked-by: default avatarPavel Emelyanov <xemul@virtuozzo.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b2282371
    • Mel Gorman's avatar
      mm, mprotect: flush TLB if potentially racing with a parallel reclaim leaving stale TLB entries · 3ea27719
      Mel Gorman authored
      Nadav Amit identified a theoritical race between page reclaim and
      mprotect due to TLB flushes being batched outside of the PTL being held.
      
      He described the race as follows:
      
              CPU0                            CPU1
              ----                            ----
                                              user accesses memory using RW PTE
                                              [PTE now cached in TLB]
              try_to_unmap_one()
              ==> ptep_get_and_clear()
              ==> set_tlb_ubc_flush_pending()
                                              mprotect(addr, PROT_READ)
                                              ==> change_pte_range()
                                              ==> [ PTE non-present - no flush ]
      
                                              user writes using cached RW PTE
              ...
      
              try_to_unmap_flush()
      
      The same type of race exists for reads when protecting for PROT_NONE and
      also exists for operations that can leave an old TLB entry behind such
      as munmap, mremap and madvise.
      
      For some operations like mprotect, it's not necessarily a data integrity
      issue but it is a correctness issue as there is a window where an
      mprotect that limits access still allows access.  For munmap, it's
      potentially a data integrity issue although the race is massive as an
      munmap, mmap and return to userspace must all complete between the
      window when reclaim drops the PTL and flushes the TLB.  However, it's
      theoritically possible so handle this issue by flushing the mm if
      reclaim is potentially currently batching TLB flushes.
      
      Other instances where a flush is required for a present pte should be ok
      as either the page lock is held preventing parallel reclaim or a page
      reference count is elevated preventing a parallel free leading to
      corruption.  In the case of page_mkclean there isn't an obvious path
      that userspace could take advantage of without using the operations that
      are guarded by this patch.  Other users such as gup as a race with
      reclaim looks just at PTEs.  huge page variants should be ok as they
      don't race with reclaim.  mincore only looks at PTEs.  userfault also
      should be ok as if a parallel reclaim takes place, it will either fault
      the page back in or read some of the data before the flush occurs
      triggering a fault.
      
      Note that a variant of this patch was acked by Andy Lutomirski but this
      was for the x86 parts on top of his PCID work which didn't make the 4.13
      merge window as expected.  His ack is dropped from this version and
      there will be a follow-on patch on top of PCID that will include his
      ack.
      
      [akpm@linux-foundation.org: tweak comments]
      [akpm@linux-foundation.org: fix spello]
      Link: http://lkml.kernel.org/r/20170717155523.emckq2esjro6hf3z@suse.de
      
      Reported-by: default avatarNadav Amit <nadav.amit@gmail.com>
      Signed-off-by: default avatarMel Gorman <mgorman@suse.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: <stable@vger.kernel.org>	[v4.4+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      3ea27719
    • Kefeng Wang's avatar
      pid: kill pidhash_size in pidhash_init() · 27e37d84
      Kefeng Wang authored
      After commit 3d375d78 ("mm: update callers to use HASH_ZERO flag"),
      drop unused pidhash_size in pidhash_init().
      
      Link: http://lkml.kernel.org/r/1500389267-49222-1-git-send-email-wangkefeng.wang@huawei.com
      
      Signed-off-by: default avatarKefeng Wang <wangkefeng.wang@huawei.com>
      Reviewed-by: default avatarPavel Tatashin <Pasha.Tatashin@Oracle.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      27e37d84
    • Daniel Jordan's avatar
      mm/hugetlb.c: __get_user_pages ignores certain follow_hugetlb_page errors · 2be7cfed
      Daniel Jordan authored
      Commit 9a291a7c ("mm/hugetlb: report -EHWPOISON not -EFAULT when
      FOLL_HWPOISON is specified") causes __get_user_pages to ignore certain
      errors from follow_hugetlb_page.  After such error, __get_user_pages
      subsequently calls faultin_page on the same VMA and start address that
      follow_hugetlb_page failed on instead of returning the error immediately
      as it should.
      
      In follow_hugetlb_page, when hugetlb_fault returns a value covered under
      VM_FAULT_ERROR, follow_hugetlb_page returns it without setting nr_pages
      to 0 as __get_user_pages expects in this case, which causes the
      following to happen in __get_user_pages: the "while (nr_pages)" check
      succeeds, we skip the "if (!vma..." check because we got a VMA the last
      time around, we find no page with follow_page_mask, and we call
      faultin_page, which calls hugetlb_fault for the second time.
      
      This issue also slightly changes how __get_user_pages works.  Before, it
      only returned error if it had made no progress (i = 0).  But now,
      follow_hugetlb_page can clobber "i" with an error code since its new
      return path doesn't check for progress.  So if "i" is nonzero before a
      failing call to follow_hugetlb_page, that indication of progress is lost
      and __get_user_pages can return error even if some pages were
      successfully pinned.
      
      To fix this, change follow_hugetlb_page so that it updates nr_pages,
      allowing __get_user_pages to fail immediately and restoring the "error
      only if no progress" behavior to __get_user_pages.
      
      Tested that __get_user_pages returns when expected on error from
      hugetlb_fault in follow_hugetlb_page.
      
      Fixes: 9a291a7c ("mm/hugetlb: report -EHWPOISON not -EFAULT when FOLL_HWPOISON is specified")
      Link: http://lkml.kernel.org/r/1500406795-58462-1-git-send-email-daniel.m.jordan@oracle.com
      
      Signed-off-by: default avatarDaniel Jordan <daniel.m.jordan@oracle.com>
      Acked-by: default avatarPunit Agrawal <punit.agrawal@arm.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: zhong jiang <zhongjiang@huawei.com>
      Cc: <stable@vger.kernel.org>	[4.12.x]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      2be7cfed
    • Linus Torvalds's avatar
      Merge tag 'platform-drivers-x86-v4.13-3' of git://git.infradead.org/linux-platform-drivers-x86 · 4d3f5d04
      Linus Torvalds authored
      Pull x86 platform driver fixes from Darren Hart:
       "Fix two bugs under error or abnormal usage conditions. Correct a
        config dependency:
      
        dell-wmi:
         - Fix driver interface version query
      
        wmi:
         - Fix error handling in acpi_wmi_init()
      
        peaq-wmi:
         - select INPUT_POLLDEV"
      
      * tag 'platform-drivers-x86-v4.13-3' of git://git.infradead.org/linux-platform-drivers-x86:
        platform/x86: dell-wmi: Fix driver interface version query
        platform/x86: wmi: Fix error handling in acpi_wmi_init()
        platform/x86: peaq-wmi: select INPUT_POLLDEV
      4d3f5d04
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 33611ba0
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "These seven patches are mostly minor build, Kconfig and error leg
        fixes"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: qedi: Fix return code in qedi_ep_connect()
        scsi: lpfc: fix linking against modular NVMe support
        scsi: scsi_transport_fc: return -EBUSY for deleted vport
        scsi: libcxgbi: add check for valid cxgbi_task_data
        scsi: aic7xxx: fix firmware build with O=path
        scsi: megaraid_sas: fix memleak in megasas_alloc_cmdlist_fusion
        scsi: qedi: Add ISCSI_BOOT_SYSFS to Kconfig
      33611ba0
  3. 01 Aug, 2017 12 commits
  4. 31 Jul, 2017 9 commits
    • Mark Cave-Ayland's avatar
      sunhme: fix up GREG_STAT and GREG_IMASK register offsets · 96a734ba
      Mark Cave-Ayland authored
      
      Update the values to match those from the STP2002QFP documentation.
      Signed-off-by: default avatarMark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      96a734ba
    • Linus Torvalds's avatar
      Merge branch 'for-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · 2e7ca206
      Linus Torvalds authored
      Pull cgroup fixes from Tejun Heo:
       "Several cgroup bug fixes.
      
         - cgroup core was calling a migration callback on empty migrations,
           which could make cpuset crash.
      
         - There was a very subtle bug where the controller interface files
           aren't created directly when cgroup2 is mounted. Because later
           operations create them, this bug didn't get noticed earlier.
      
         - Failed writes to cgroup.subtree_control were incorrectly returning
           zero"
      
      * 'for-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
        cgroup: fix error return value from cgroup_subtree_control()
        cgroup: create dfl_root files on subsys registration
        cgroup: don't call migration methods if there are no tasks to migrate
      2e7ca206
    • Linus Torvalds's avatar
      Merge branch 'for-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · ff2620f7
      Linus Torvalds authored
      Pull workqueue fixes from Tejun Heo:
       "Two notable fixes.
      
         - While adding NUMA affinity support to unbound workqueues, the
           assumption that an unbound workqueue with max_active == 1 is
           ordered was broken.
      
           The plan was to use explicit alloc_ordered_workqueue() for those
           cases. Unfortunately, I forgot to update the documentation properly
           and we grew a handful of use cases which depend on that assumption.
      
           While we want to convert them to alloc_ordered_workqueue(), we
           don't really lose anything by enforcing ordered execution on
           unbound max_active == 1 workqueues and it doesn't make sense to
           risk subtle bugs. Restore the assumption.
      
         - Workqueue assumes that CPU <-> NUMA node mapping remains static.
      
           This is a general assumption - we don't have any synchronization
           mechanism around CPU <-> node mapping. Unfortunately, powerpc may
           change the mapping dynamically leading to crashes. Michael added a
           workaround so that we at least don't crash while powerpc hotplug
           code gets updated"
      
      * 'for-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
        workqueue: Work around edge cases for calc of pool's cpumask
        workqueue: implicit ordered attribute should be overridable
        workqueue: restore WQ_UNBOUND/max_active==1 to be ordered
      ff2620f7
    • Linus Torvalds's avatar
      Merge branch 'for-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata · 3dcc4c7d
      Linus Torvalds authored
      Pull libata fixes from Tejun Heo:
       "Dan found a really old bug where libata hotplug code wasn't sanitizing
        index value from userland and may end up indexing with a negative
        number. It is scary but fortunately can only be triggered by root.
      
        Other than that, minor fixes"
      
      * 'for-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
        libata: fix a couple of doc build warnings
        libata: array underflow in ata_find_dev()
        ata: sata_rcar: add gen[23] fallback compatibility strings
        libata: remove unused rc in ata_eh_handle_port_resume
        libata: Cleanup ata_read_log_page()
        ata: fix gemini Kconfig dependencies
      3dcc4c7d
    • Babu Moger's avatar
      parisc: Define CONFIG_CPU_BIG_ENDIAN · 74ad3d28
      Babu Moger authored
      
      While working on enabling queued rwlock on SPARC, found this following
      code in include/asm-generic/qrwlock.h which uses CONFIG_CPU_BIG_ENDIAN
      to clear a byte.
      
      static inline u8 *__qrwlock_write_byte(struct qrwlock *lock)
       {
      	return (u8 *)lock + 3 * IS_BUILTIN(CONFIG_CPU_BIG_ENDIAN);
       }
      
      Problem is many of the fixed big endian architectures don't define
      CPU_BIG_ENDIAN and clears the wrong byte.
      
      Define CPU_BIG_ENDIAN for parisc architecture to fix it.
      Signed-off-by: default avatarBabu Moger <babu.moger@oracle.com>
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      74ad3d28
    • Jonathan Corbet's avatar
      libata: fix a couple of doc build warnings · 2f60e1ab
      Jonathan Corbet authored
      
      The kerneldoc comments for a couple of functions in drivers/ata/libata-eh.c
      had fallen behind the current implementation, resulting in these doc build
      warnings:
      
        ./drivers/ata/libata-eh.c:1449: warning: No description found for parameter 'link'
        ./drivers/ata/libata-eh.c:1449: warning: Excess function parameter 'ap' description in 'ata_eh_done'
        ./drivers/ata/libata-eh.c:1590: warning: No description found for parameter 'qc'
        ./drivers/ata/libata-eh.c:1590: warning: Excess function parameter 'dev' description in 'ata_eh_request_sense'
      
      Update the comments and make the warnings go away.
      Signed-off-by: default avatarJonathan Corbet <corbet@lwn.net>
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      2f60e1ab
    • James Bottomley's avatar
      parisc: pdc_stable: Fix locking when creating sysfs links · 93964fd4
      James Bottomley authored
      There's no need to take the write lock when creating sysfs links.
      
      This patch fixes the following BUG:
       BUG: sleeping function called from invalid context at mm/slab.h:416
       in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: swapper/0
       CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc2-00110-g0b5477d9
      
       #111
       Backtrace:
       [<0000000040217ac8>] show_stack+0x20/0x38
       [<00000000406fbbb0>] dump_stack+0xb0/0x128
       [<0000000040274090>] ___might_sleep+0x180/0x1b8
       [<0000000040274144>] __might_sleep+0x7c/0xe8
       [<0000000040373874>] kmem_cache_alloc+0x14c/0x1e0
       [<0000000040419514>] __kernfs_new_node+0x84/0x1b8
       [<000000004041b09c>] kernfs_new_node+0x3c/0x78
       [<000000004041e040>] kernfs_create_link+0x40/0xd8
       [<000000004041f320>] sysfs_do_create_link_sd.isra.0+0xb0/0x130
       [<000000004041f3d4>] sysfs_create_link+0x34/0x58
       [<000000004011b4a4>] pdc_stable_init+0x2c4/0x458
       [<0000000040200250>] do_one_initcall+0x70/0x1d8
       [<0000000040101644>] kernel_init_freeable+0x27c/0x390
       [<000000004020be44>] kernel_init+0x24/0x1c0
      Signed-off-by: default avatarJames Bottomley <James.Bottomley@HansenPartnership.com>
      Reported-by: default avatarMeelis Roos <mroos@linux.ee>
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      93964fd4
    • Helge Deller's avatar
      parisc: Increase thread and stack size to 32kb · 8f8201df
      Helge Deller authored
      
      Since kernel 4.11 the thread and irq stacks on parisc randomly overflow
      the default size of 16k. The reason why stack usage suddenly grew is yet
      unknown.
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      Cc: stable@vger.kernel.org # 4.11+
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      8f8201df
    • John David Anglin's avatar
      parisc: Handle vma's whose context is not current in flush_cache_range · 13d57093
      John David Anglin authored
      
      In testing James' patch to drivers/parisc/pdc_stable.c, I hit the BUG
      statement in flush_cache_range() during a system shutdown:
      
      kernel BUG at arch/parisc/kernel/cache.c:595!
      CPU: 2 PID: 6532 Comm: kworker/2:0 Not tainted 4.13.0-rc2+ #1
      Workqueue: events free_ioctx
      
       IAOQ[0]: flush_cache_range+0x144/0x148
       IAOQ[1]: flush_cache_page+0x0/0x1a8
       RP(r2): flush_cache_range+0xec/0x148
      Backtrace:
       [<00000000402910ac>] unmap_page_range+0x84/0x880
       [<00000000402918f4>] unmap_single_vma+0x4c/0x60
       [<0000000040291a18>] zap_page_range_single+0x110/0x160
       [<0000000040291c34>] unmap_mapping_range+0x174/0x1a8
       [<000000004026ccd8>] truncate_pagecache+0x50/0xa8
       [<000000004026cd84>] truncate_setsize+0x54/0x70
       [<000000004033d534>] put_aio_ring_file+0x44/0xb0
       [<000000004033d5d8>] aio_free_ring+0x38/0x140
       [<000000004033d714>] free_ioctx+0x34/0xa8
       [<00000000401b0028>] process_one_work+0x1b8/0x4d0
       [<00000000401b04f4>] worker_thread+0x1b4/0x648
       [<00000000401b9128>] kthread+0x1b0/0x208
       [<0000000040150020>] end_fault_vector+0x20/0x28
       [<0000000040639518>] nf_ip_reroute+0x50/0xa8
       [<0000000040638ed0>] nf_ip_route+0x10/0x78
       [<0000000040638c90>] xfrm4_mode_tunnel_input+0x180/0x1f8
      
      CPU: 2 PID: 6532 Comm: kworker/2:0 Not tainted 4.13.0-rc2+ #1
      Workqueue: events free_ioctx
      Backtrace:
       [<0000000040163bf0>] show_stack+0x20/0x38
       [<0000000040688480>] dump_stack+0xa8/0x120
       [<0000000040163dc4>] die_if_kernel+0x19c/0x2b0
       [<0000000040164d0c>] handle_interruption+0xa24/0xa48
      
      This patch modifies flush_cache_range() to handle non current contexts.
      In as much as this occurs infrequently, the simplest approach is to
      flush the entire cache when this happens.
      Signed-off-by: default avatarJohn David Anglin <dave.anglin@bell.net>
      Cc: stable@vger.kernel.org # 4.9+
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      13d57093
  5. 30 Jul, 2017 5 commits