1. 13 Dec, 2013 18 commits
    • Add Documentation/module-signing.txt file · 3cafea30
      James Solner authored
      This patch adds the Documentation/module-signing.txt file that is
      currently missing from the Documentation directory. The init/Kconfig
      file references the Documentation/module-signing.txt file to explain
      how kernel module signing works. This patch supplies this documentation.
      Signed-off-by: James Solner <solner@alcatel-lucent.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
    • KEYS: fix uninitialized persistent_keyring_register_sem · 6bd364d8
      Xiao Guangrong authored
      We run into this bug:
      [ 2736.063245] Unable to handle kernel paging request for data at address 0x00000000
      [ 2736.063293] Faulting instruction address: 0xc00000000037efb0
      [ 2736.063300] Oops: Kernel access of bad area, sig: 11 [#1]
      [ 2736.063303] SMP NR_CPUS=2048 NUMA pSeries
      [ 2736.063310] Modules linked in: sg nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_mangle ip6table_security ip6table_raw ip6t_REJECT iptable_nat nf_nat_ipv4 iptable_mangle iptable_security iptable_raw ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ebtable_filter ebtables ip6table_filter iptable_filter ip_tables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 nf_nat nf_conntrack ip6_tables ibmveth pseries_rng nx_crypto nfsd auth_rpcgss nfs_acl lockd sunrpc binfmt_misc xfs libcrc32c dm_service_time sd_mod crc_t10dif crct10dif_common ibmvfc scsi_transport_fc scsi_tgt dm_mirror dm_region_hash dm_log dm_multipath dm_mod
      [ 2736.063383] CPU: 1 PID: 7128 Comm: ssh Not tainted 3.10.0-48.el7.ppc64 #1
      [ 2736.063389] task: c000000131930120 ti: c0000001319a0000 task.ti: c0000001319a0000
      [ 2736.063394] NIP: c00000000037efb0 LR: c0000000006c40f8 CTR: 0000000000000000
      [ 2736.063399] REGS: c0000001319a3870 TRAP: 0300   Not tainted  (3.10.0-48.el7.ppc64)
      [ 2736.063403] MSR: 8000000000009032 <SF,EE,ME,IR,DR,RI>  CR: 28824242  XER: 20000000
      [ 2736.063415] SOFTE: 0
      [ 2736.063418] CFAR: c00000000000908c
      [ 2736.063421] DAR: 0000000000000000, DSISR: 40000000
      [ 2736.063425]
      GPR00: c0000000006c40f8 c0000001319a3af0 c000000001074788 c0000001319a3bf0
      GPR04: 0000000000000000 0000000000000000 0000000000000020 000000000000000a
      GPR08: fffffffe00000002 00000000ffff0000 0000000080000001 c000000000924888
      GPR12: 0000000028824248 c000000007e00400 00001fffffa0f998 0000000000000000
      GPR16: 0000000000000022 00001fffffa0f998 0000010022e92470 0000000000000000
      GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
      GPR24: 0000000000000000 c000000000f4a828 00003ffffe527108 0000000000000000
      GPR28: c000000000f4a730 c000000000f4a828 0000000000000000 c0000001319a3bf0
      [ 2736.063498] NIP [c00000000037efb0] .__list_add+0x30/0x110
      [ 2736.063504] LR [c0000000006c40f8] .rwsem_down_write_failed+0x78/0x264
      [ 2736.063508] PACATMSCRATCH [800000000280f032]
      [ 2736.063511] Call Trace:
      [ 2736.063516] [c0000001319a3af0] [c0000001319a3b80] 0xc0000001319a3b80 (unreliable)
      [ 2736.063523] [c0000001319a3b80] [c0000000006c40f8] .rwsem_down_write_failed+0x78/0x264
      [ 2736.063530] [c0000001319a3c50] [c0000000006c1bb0] .down_write+0x70/0x78
      [ 2736.063536] [c0000001319a3cd0] [c0000000002e5ffc] .keyctl_get_persistent+0x20c/0x320
      [ 2736.063542] [c0000001319a3dc0] [c0000000002e2388] .SyS_keyctl+0x238/0x260
      [ 2736.063548] [c0000001319a3e30] [c000000000009e7c] syscall_exit+0x0/0x7c
      [ 2736.063553] Instruction dump:
      [ 2736.063556] 7c0802a6 fba1ffe8 fbc1fff0 fbe1fff8 7cbd2b78 7c9e2378 7c7f1b78 f8010010
      [ 2736.063566] f821ff71 e8a50008 7fa52040 40de00c0 <e8be0000> 7fbd2840 40de0094 7fbff040
      [ 2736.063579] ---[ end trace 2708241785538296 ]---
      
      It's caused by uninitialized persistent_keyring_register_sem.
      
      The bug was introduced by commit f36f8c75, which contains two typos:
      CONFIG_KEYS_KERBEROS_CACHE should be CONFIG_PERSISTENT_KEYRINGS and
      krb_cache_register_sem should be persistent_keyring_register_sem.
      Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
    • KEYS: Remove files generated when SYSTEM_TRUSTED_KEYRING=y · f46a3cbb
      Kirill Tkhai authored
      Always remove generated SYSTEM_TRUSTED_KEYRING files while doing make mrproper.
      Signed-off-by: Kirill Tkhai <tkhai@yandex.ru>
      Signed-off-by: David Howells <dhowells@redhat.com>
    • X.509: Fix certificate gathering · d7ec435f
      David Howells authored
      Fix the gathering of certificates from both the source tree and the build tree
      to correctly calculate the pathnames of all the certificates.
      
      The problem was that if the default generated cert, signing_key.x509, didn't
      exist then it would not have a path attached and if it did, it would have a
      path attached.
      
      This means that the contents of kernel/.x509.list would change between the
      first compilation in a directory and the second.  After the second it would
      remain stable because the signing_key.x509 file exists.
      
      The consequence was that the kernel would get relinked unconditionally on the
      second recompilation.  The second recompilation would also show something like
      this:
      
         X.509 certificate list changed
           CERTS   kernel/x509_certificate_list
           - Including cert /home/torvalds/v2.6/linux/signing_key.x509
           AS      kernel/system_certificates.o
           LD      kernel/built-in.o
      
      which is why the relink would happen.
      
      
      Unfortunately, it isn't a simple matter of just sticking a path on the front
      of the filename of the certificate in the build directory as make can't then
      work out how to build it.
      
      So the path has to be prepended to the name for sorting and duplicate
      elimination and then removed for the make rule if it is in the build tree.
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: David Howells <dhowells@redhat.com>
    • Merge branch 'akpm' (fixes from Andrew) · 8d276377
      Linus Torvalds authored
      Merge patches from Andrew Morton:
        "13 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        mm: memcg: do not allow task about to OOM kill to bypass the limit
        mm: memcg: fix race condition between memcg teardown and swapin
        thp: move preallocated PTE page table on move_huge_pmd()
        mfd/rtc: s5m: fix register updating by adding regmap for RTC
        rtc: s5m: enable IRQ wake during suspend
        rtc: s5m: limit endless loop waiting for register update
        rtc: s5m: fix unsuccesful IRQ request during probe
        drivers/rtc/rtc-s5m.c: fix info->rtc assignment
        include/linux/kernel.h: make might_fault() a nop for !MMU
        drivers/rtc/rtc-at91rm9200.c: correct alarm over day/month wrap
        procfs: also fix proc_reg_get_unmapped_area() for !MMU case
        mm: memcg: do not declare OOM from __GFP_NOFAIL allocations
        include/linux/hugetlb.h: make isolate_huge_page() an inline
    • mm: memcg: do not allow task about to OOM kill to bypass the limit · 1f14c1ac
      Johannes Weiner authored
      Commit 49426420 ("mm: memcg: handle non-error OOM situations more
      gracefully") allowed tasks that already entered a memcg OOM condition to
      bypass the memcg limit on subsequent allocation attempts hoping this
      would expedite finishing the page fault and executing the kill.
      
      David Rientjes is worried that this breaks memcg isolation guarantees,
      and since there is no evidence that the bypass actually speeds up fault
      processing, just change it so that these subsequent charge attempts fail
      outright.  The notable exception is __GFP_NOFAIL charges, which are
      required to bypass the limit regardless.
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Reported-by: David Rientjes <rientjes@google.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: David Rientjes <rientjes@google.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: memcg: fix race condition between memcg teardown and swapin · 96f1c58d
      Johannes Weiner authored
      There is a race condition between a memcg being torn down and a swapin
      triggered from a different memcg of a page that was recorded to belong
      to the exiting memcg on swapout (with CONFIG_MEMCG_SWAP extension).  The
      result is unreclaimable pages pointing to dead memcgs, which can lead to
      anything from endless loops in later memcg teardown (the page is charged
      to all hierarchical parents but is not on any LRU list) to crashes from
      following the dangling memcg pointer.
      
      Memcgs with tasks in them can not be torn down and usually charges don't
      show up in memcgs without tasks.  Swapin with the CONFIG_MEMCG_SWAP
      extension is the notable exception because it charges the cgroup that
      was recorded as owner during swapout, which may be empty and in the
      process of being torn down when a task in another memcg triggers the
      swapin:
      
        teardown:                 swapin:
      
                                  lookup_swap_cgroup_id()
                                  rcu_read_lock()
                                  mem_cgroup_lookup()
                                  css_tryget()
                                  rcu_read_unlock()
        disable css_tryget()
        call_rcu()
          offline_css()
            reparent_charges()
                                  res_counter_charge() (hierarchical!)
                                  css_put()
                                    css_free()
                                  pc->mem_cgroup = dead memcg
                                  add page to dead lru
      
      Add a final reparenting step into css_free() to make sure any such raced
      charges are moved out of the memcg before it's finally freed.
      
      In the longer term it would be cleaner to have the css_tryget() and the
      res_counter charge under the same RCU lock section so that the charge
      reparenting is deferred until the last charge whose tryget succeeded is
      visible.  But this will require more invasive changes that will be
      harder to evaluate and backport into stable, so better defer them to a
      separate change set.
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Cc: David Rientjes <rientjes@google.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • thp: move preallocated PTE page table on move_huge_pmd() · 3592806c
      Kirill A. Shutemov authored
      Andrey Wagin reported a crash on VM_BUG_ON() in pgtable_pmd_page_dtor()
      with the following backtrace:
      
        free_pgd_range+0x2bf/0x410
        free_pgtables+0xce/0x120
        unmap_region+0xe0/0x120
        do_munmap+0x249/0x360
        move_vma+0x144/0x270
        SyS_mremap+0x3b9/0x510
        system_call_fastpath+0x16/0x1b
      
      The crash can be reproduced with this test case:
      
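        #define _GNU_SOURCE
        #include <sys/mman.h>
        #include <stdio.h>
        #include <unistd.h>
      
        #define MB (1024 * 1024UL)
        #define GB (1024 * MB)
      
        int main(int argc, char **argv)
        {
      	char *p;
      	unsigned long i;
      
      	/* Map 10MB of anonymous memory at a fixed 1GB address. */
      	p = mmap((void *) GB, 10 * MB, PROT_READ | PROT_WRITE,
      			MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
      	if (p == MAP_FAILED) {
      		perror("mmap");
      		return 1;
      	}
      	/* Touch every 4096-byte page so the range is populated. */
      	for (i = 0; i < 10 * MB; i += 4096)
      		p[i] = 1;
      	/* Move the populated range to 2GB; this triggered the crash. */
      	mremap(p, 10 * MB, 10 * MB, MREMAP_FIXED | MREMAP_MAYMOVE,
      			(void *) (2 * GB));
      	return 0;
        }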
      
      Because of the split PMD lock, we now store preallocated PTE tables for
      THP pages per PMD table.  This means we need to move them to the other
      PMD table if the huge PMD is moved there.
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Reported-by: Andrey Vagin <avagin@openvz.org>
      Tested-by: Andrey Vagin <avagin@openvz.org>
      Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mfd/rtc: s5m: fix register updating by adding regmap for RTC · 3e1e4a5f
      Krzysztof Kozlowski authored
      Rename old regmap field of "struct sec_pmic_dev" to "regmap_pmic" and
      add new regmap for RTC.
      
      On S5M8767A, registers were not properly updated and read because the
      RTC used the same regmap as the PMIC.  This could be observed in various
      hangs, e.g. an infinite loop while waiting for the UDR field to change.
      
      On this chip family the RTC has a different I2C address than the PMIC, so
      an additional regmap is needed.
      Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
      Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
      Reviewed-by: Mark Brown <broonie@linaro.org>
      Acked-by: Sangbeom Kim <sbkim73@samsung.com>
      Cc: Samuel Ortiz <sameo@linux.intel.com>
      Cc: Lee Jones <lee.jones@linaro.org>
      Cc: Liam Girdwood <lgirdwood@gmail.com>
      Cc: Alessandro Zummo <a.zummo@towertech.it>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Kyungmin Park <kyungmin.park@samsung.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • rtc: s5m: enable IRQ wake during suspend · 222ead7f
      Krzysztof Kozlowski authored
      Add PM suspend/resume ops to rtc-s5m driver and enable IRQ wake during
      suspend so the RTC would act like a wake up source.  This allows waking
      up from suspend to RAM on RTC alarm interrupt.
      Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
      Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
      Cc: Mark Brown <broonie@linaro.org>
      Acked-by: Sangbeom Kim <sbkim73@samsung.com>
      Cc: Samuel Ortiz <sameo@linux.intel.com>
      Cc: Lee Jones <lee.jones@linaro.org>
      Cc: Liam Girdwood <lgirdwood@gmail.com>
      Cc: Alessandro Zummo <a.zummo@towertech.it>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Kyungmin Park <kyungmin.park@samsung.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • rtc: s5m: limit endless loop waiting for register update · d73238d4
      Krzysztof Kozlowski authored
      After setting the alarm or time, the driver waits for the UDR register to
      be cleared, indicating that the register data has been transferred.
      
      Limit the endless loop to only 5 retries.
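
A bounded wait of the kind described above might look like the following minimal sketch. It is not the driver's code: `wait_for_udr_clear`, `udr_busy_fn`, `countdown_busy` and `UDR_READ_RETRY_CNT` are hypothetical names, the register read is abstracted behind a callback, and only the limit of 5 comes from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define UDR_READ_RETRY_CNT 5	/* retry limit named in the commit message */

/* Hypothetical callback: returns true while the update flag is still set. */
typedef bool (*udr_busy_fn)(void *ctx);

/*
 * Poll the update flag at most UDR_READ_RETRY_CNT times.
 * Returns 0 once the flag clears, -ETIMEDOUT if it never does.
 */
static int wait_for_udr_clear(udr_busy_fn busy, void *ctx)
{
	int retry;

	assert(busy != NULL);

	for (retry = 0; retry < UDR_READ_RETRY_CNT; retry++) {
		if (!busy(ctx))
			return 0;
	}
	return -ETIMEDOUT;
}

/* Example callback for experiments: reports "busy" for *ctx more polls. */
static bool countdown_busy(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return true;
	}
	return false;
}
```

A caller would pass a callback that reads the real register; `countdown_busy` only simulates a flag that clears after a given number of polls. Returning an error after the last retry lets the caller report the failure instead of hanging.
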
      Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
      Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
      Reviewed-by: Mark Brown <broonie@linaro.org>
      Acked-by: Sangbeom Kim <sbkim73@samsung.com>
      Cc: Samuel Ortiz <sameo@linux.intel.com>
      Cc: Lee Jones <lee.jones@linaro.org>
      Cc: Liam Girdwood <lgirdwood@gmail.com>
      Cc: Alessandro Zummo <a.zummo@towertech.it>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Kyungmin Park <kyungmin.park@samsung.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • rtc: s5m: fix unsuccesful IRQ request during probe · 7b003be8
      Krzysztof Kozlowski authored
      Probe failed for rtc-s5m:
      
      	s5m-rtc s5m-rtc: Failed to request alarm IRQ: 12: -22
      	s5m-rtc: probe of s5m-rtc failed with error -22
      
      Fix rtc-s5m interrupt request by using regmap_irq_get_virq() for mapping
      the IRQ.
      Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
      Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
      Reviewed-by: Mark Brown <broonie@linaro.org>
      Acked-by: Sangbeom Kim <sbkim73@samsung.com>
      Cc: Samuel Ortiz <sameo@linux.intel.com>
      Cc: Lee Jones <lee.jones@linaro.org>
      Cc: Liam Girdwood <lgirdwood@gmail.com>
      Cc: Alessandro Zummo <a.zummo@towertech.it>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Kyungmin Park <kyungmin.park@samsung.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • drivers/rtc/rtc-s5m.c: fix info->rtc assignment · 5ccb7d71
      Geert Uytterhoeven authored
      Fix this warning:
      
        drivers/rtc/rtc-s5m.c: In function `s5m_rtc_probe':
        drivers/rtc/rtc-s5m.c:545: warning: assignment from incompatible pointer type
      
      struct s5m_rtc_info.rtc has type "struct regmap *", while
      struct sec_pmic_dev.rtc has type "struct i2c_client *".
      
      Probably the author wanted to assign "struct sec_pmic_dev.regmap", which
      has the correct type.
      
      Also, as "rtc" doesn't make much sense as a name for a regmap, rename it
      to "regmap".
      Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Sangbeom Kim <sbkim73@samsung.com>
      Cc: Sachin Kamat <sachin.kamat@linaro.org>
      Tested-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • include/linux/kernel.h: make might_fault() a nop for !MMU · 386e7906
      Axel Lin authored
      The machine cannot fault if !MMU, so make might_fault() a nop for !MMU.
      
      This fixes the build error below if
      !CONFIG_MMU && (CONFIG_PROVE_LOCKING=y || CONFIG_DEBUG_ATOMIC_SLEEP=y):
      
        arch/arm/kernel/built-in.o: In function `arch_ptrace':
        arch/arm/kernel/ptrace.c:852: undefined reference to `might_fault'
        arch/arm/kernel/built-in.o: In function `restore_sigframe':
        arch/arm/kernel/signal.c:173: undefined reference to `might_fault'
        ...
        arch/arm/kernel/built-in.o:arch/arm/kernel/signal.c:177: more undefined references to `might_fault' follow
        make: *** [vmlinux] Error 1
      Signed-off-by: Axel Lin <axel.lin@ingics.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • drivers/rtc/rtc-at91rm9200.c: correct alarm over day/month wrap · eb3c2272
      Linus Pizunski authored
      Update month and day of month to the alarm month/day instead of current
      day/month when setting the RTC alarm mask.
      Signed-off-by: Linus Pizunski <linus@narrativeteam.com>
      Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • procfs: also fix proc_reg_get_unmapped_area() for !MMU case · ae5758a1
      Jan Beulich authored
      Commit fad1a86e ("procfs: call default get_unmapped_area on
      MMU-present architectures"), as its title says, took care of only the
      MMU case, leaving the !MMU side still in the regressed state (returning
      -EIO in all cases where pde->proc_fops->get_unmapped_area is NULL).
      
      From the fad1a86e changelog:
      
       "Commit c4fe2448 ("sparc: fix PCI device proc file mmap(2)") added
        proc_reg_get_unmapped_area in proc_reg_file_ops and
        proc_reg_file_ops_no_compat, by which now mmap always returns EIO if
        get_unmapped_area method is not defined for the target procfs file, which
        causes regression of mmap on /proc/vmcore.
      
        To address this issue, like get_unmapped_area(), call default
        current->mm->get_unmapped_area on MMU-present architectures if
        pde->proc_fops->get_unmapped_area, i.e.  the one in actual file operation
        in the procfs file, is not defined"
      Signed-off-by: Jan Beulich <jbeulich@suse.com>
      Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: <stable@vger.kernel.org>	[3.12.x]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: memcg: do not declare OOM from __GFP_NOFAIL allocations · a0d8b00a
      Johannes Weiner authored
      Commit 84235de3 ("fs: buffer: move allocation failure loop into the
      allocator") started recognizing __GFP_NOFAIL in memory cgroups but
      forgot to disable the OOM killer.
      
      Any task that does not fail allocation will also not enter the OOM
      completion path.  So don't declare an OOM state in this case or it'll be
      leaked and the task will be able to bypass the limit until the next
      userspace-triggered page fault cleans up the OOM state.
      Reported-by: William Dauchy <wdauchy@gmail.com>
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Cc: David Rientjes <rientjes@google.com>
      Cc: <stable@vger.kernel.org>	[3.12.x]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • include/linux/hugetlb.h: make isolate_huge_page() an inline · f40386a4
      Naoya Horiguchi authored
      With CONFIG_HUGETLBFS=n:
      
        mm/migrate.c: In function `do_move_page_to_node_array':
        include/linux/hugetlb.h:140:33: warning: statement with no effect [-Wunused-value]
         #define isolate_huge_page(p, l) false
                                         ^
        mm/migrate.c:1170:4: note: in expansion of macro `isolate_huge_page'
            isolate_huge_page(page, &pagelist);
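
The warning above comes from a macro that expands to a bare `false;` statement. A rough sketch of the inline-function alternative named in the commit title is shown below; `isolate_huge_page_stub` and `move_page_sketch` are hypothetical names and the two struct types are opaque stand-ins, so this illustrates the idea and is not the contents of include/linux/hugetlb.h.

```c
#include <stdbool.h>
#include <stddef.h>

struct page;		/* opaque stand-ins for the kernel types */
struct list_head;

/*
 * Stub for the CONFIG_HUGETLBFS=n case. Unlike "#define f(p, l) false",
 * a call used as a statement is a real function call, so the compiler
 * has no bare "false;" expression to warn about.
 */
static inline bool isolate_huge_page_stub(struct page *page,
					  struct list_head *list)
{
	(void)page;
	(void)list;
	return false;
}

/* Caller that ignores the result, like the call site in the warning. */
static void move_page_sketch(struct page *page, struct list_head *pagelist)
{
	isolate_huge_page_stub(page, pagelist);
}
```

An inline stub also keeps argument type checking in the disabled configuration, which the macro form silently drops.
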
      Reported-by: Borislav Petkov <bp@alien8.de>
      Tested-by: Borislav Petkov <bp@alien8.de>
      Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 12 Dec, 2013 22 commits
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 54fb723c
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "Four security fixes for KVM on x86.  Thanks to Andrew Honig and Lars
        Bull from Google for reporting them"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: x86: fix guest-initiated crash with x2apic (CVE-2013-6376)
        KVM: x86: Convert vapic synchronization to _cached functions (CVE-2013-6368)
        KVM: x86: Fix potential divide by 0 in lapic (CVE-2013-6367)
        KVM: Improve create VCPU parameter (CVE-2013-4587)
    • Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · ea1e61cb
      Linus Torvalds authored
      Pull ARM SoC fixes from Olof Johansson:
       "Another week, another batch of fixes.
      
        Again, OMAP regressions due to move to DT is the bulk of the changes
        here, but this should be the last of it for 3.13.  There are also a
        handful of OMAP hwmod changes (power management, reset handling) for
        USB on OMAP3 that fixes some longish-standing bugs around USB resets.
      
        There are a couple of other changes that also add up line count a bit:
        One is a long-standing bug with the keyboard layout on one of the PXA
        platforms.  The other is a fix for highbank that moves their
        power-off/reset button handling to be done in-kernel since relying on
        userspace to handle it was fragile and awkward"
      
      * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        ARM: sun6i: dt: Fix interrupt trigger types
        ARM: sun7i: dt: Fix interrupt trigger types
        MAINTAINERS: merge IMX6 entry into IMX
        ARM: tegra: add missing break to fuse initialization code
        ARM: pxa: prevent PXA270 occasional reboot freezes
        ARM: pxa: tosa: fix keys mapping
        ARM: OMAP2+: omap_device: add fail hook for runtime_pm when bad data is detected
        ARM: OMAP2+: hwmod: Fix usage of invalid iclk / oclk when clock node is not present
        ARM: OMAP3: hwmod data: Don't prevent RESET of USB Host module
        ARM: OMAP2+: hwmod: Fix SOFTRESET logic
        ARM: OMAP4+: hwmod data: Don't prevent RESET of USB Host module
        ARM: dts: Fix booting for secure omaps
        ARM: OMAP2+: Fix the machine entry for am3517
        ARM: dts: Fix missing entries for am3517
        ARM: OMAP2+: Fix overwriting hwmod data with data from device tree
        ARM: davinci: Fix McASP mem resource names
        ARM: highbank: handle soft poweroff and reset key events
        ARM: davinci: fix number of resources passed to davinci_gpio_register()
        gpio: davinci: fix check for unbanked gpio
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs · e09f67f1
      Linus Torvalds authored
      Pull btrfs fixes from Chris Mason:
       "This is a small collection of fixes.  It was rebased this morning, but
        I was just fixing signed-off-by tags with the wrong email"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
        Btrfs: fix access_ok() check in btrfs_ioctl_send()
        Btrfs: make sure we cleanup all reloc roots if error happens
        Btrfs: skip building backref tree for uuid and quota tree when doing balance relocation
        Btrfs: fix an oops when doing balance relocation
        Btrfs: don't miss skinny extent items on delayed ref head contention
        btrfs: call mnt_drop_write after interrupted subvol deletion
        Btrfs: don't clear the default compression type
    • Merge branch 'for-3.13' of git://linux-nfs.org/~bfields/linux · c9111b4d
      Linus Torvalds authored
      Pull nfsd reply cache bugfix from Bruce Fields:
       "One bugfix for nfsd crashes"
      
      * 'for-3.13' of git://linux-nfs.org/~bfields/linux:
        nfsd: when reusing an existing repcache entry, unhash it first
    • KVM: x86: fix guest-initiated crash with x2apic (CVE-2013-6376) · 17d68b76
      Gleb Natapov authored
      A guest can cause a BUG_ON() leading to a host kernel crash.
      When the guest writes to the ICR to request an IPI while in x2apic
      mode, the destination is read from ICR2, which is a register that
      the guest can control.
      
      kvm_irq_delivery_to_apic_fast uses the high 16 bits of ICR2 as the
      cluster id.  A BUG_ON is triggered, which is a protection against
      accessing map->logical_map with an out-of-bounds access and manages
      to avoid that anything really unsafe occurs.
      
      The logic in the code is correct from real HW point of view. The problem
      is that KVM supports only one cluster with ID 0 in clustered mode, but
      the code that has the bug does not take this into account.
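
One way to picture the missing check is the sketch below, which rejects any cluster ID that has no backing table before indexing. It is an illustration under stated assumptions, not the KVM fix: `logical_map_sketch`, `dest_cluster_id`, `lookup_cluster` and `NR_CLUSTERS` are hypothetical, and only the "high 16 bits" and "one cluster with ID 0" details come from the commit message.

```c
#include <stddef.h>
#include <stdint.h>

#define NR_CLUSTERS 1	/* assumption: only cluster ID 0 is backed by a table */

/* Hypothetical stand-in for the logical destination map. */
struct logical_map_sketch {
	const void *cluster[NR_CLUSTERS];
};

/* Cluster ID taken from the high 16 bits of the 32-bit destination. */
static uint32_t dest_cluster_id(uint32_t dest)
{
	return dest >> 16;
}

/*
 * Look up the cluster table for a guest-supplied destination.
 * An unsupported cluster ID yields NULL instead of an out-of-bounds
 * read or a fatal assertion.
 */
static const void *lookup_cluster(const struct logical_map_sketch *map,
				  uint32_t dest)
{
	uint32_t cid = dest_cluster_id(dest);

	if (cid >= NR_CLUSTERS)
		return NULL;
	return map->cluster[cid];
}
```

Treating an unsupported value as "no match" matters here because the destination is guest-controlled: a fatal assertion on that path lets a guest take down the host.
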
      Reported-by: Lars Bull <larsbull@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Gleb Natapov <gleb@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Convert vapic synchronization to _cached functions (CVE-2013-6368) · fda4e2e8
      Andy Honig authored
      In kvm_lapic_sync_from_vapic and kvm_lapic_sync_to_vapic there is the
      potential to corrupt kernel memory if userspace provides an address that
      is at the end of a page.  This patch converts those functions to use
      kvm_write_guest_cached and kvm_read_guest_cached.  It also checks the
      vapic_address specified by userspace during ioctl processing and returns
      an error to userspace if the address is not a valid GPA.
      
      This is generally not guest triggerable, because the required write is
      done by firmware that runs before the guest.  Also, it only affects AMD
      processors and oldish Intel that do not have the FlexPriority feature
      (unless you disable FlexPriority, of course; then newer processors are
      also affected).
      
      Fixes: b93463aa ('KVM: Accelerated apic support')
      Reported-by: Andrew Honig <ahonig@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Andrew Honig <ahonig@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: x86: Fix potential divide by 0 in lapic (CVE-2013-6367) · b963a22e
      Andy Honig authored
      Under guest controllable circumstances apic_get_tmcct will execute a
      divide by zero and cause a crash.  If the guest cpuid supports
      tsc deadline timers and the guest performs the following sequence of
      requests, the host will crash:
      - Set the mode to periodic
      - Set the TMICT to 0
      - Set the mode bits to 11 (neither periodic, nor one shot, nor tsc deadline)
      - Set the TMICT to non-zero.
      Then the lapic_timer.period will be 0, but the TMICT will not be.  If the
      guest then reads from the TMCCT then the host will perform a divide by 0.
      
      This patch ensures that if the lapic_timer.period is 0, then the division
      does not occur.
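
      A minimal sketch of the guard described above, assuming a simplified
      timer structure.  The names timer_sketch, tmcct_sketch and
      APIC_BUS_CYCLE_NS_SKETCH are hypothetical and are not the kernel's
      definitions; only the "skip the division when the period is 0" idea
      comes from the commit message.

```c
#include <stdint.h>

/* Hypothetical stand-in for the lapic timer state; not the kernel struct. */
struct timer_sketch {
    uint64_t period_ns;     /* 0 if the period was never programmed */
    uint32_t divide_count;  /* timer divide configuration */
};

/* Assumed bus cycle length, for illustration only. */
#define APIC_BUS_CYCLE_NS_SKETCH 1u

/*
 * Return the current count for a timer that has remaining_ns left to run.
 * The early return is the point of the sketch: with a zero period (or a
 * zero divisor) the modulo and division below would fault.
 */
static uint32_t tmcct_sketch(const struct timer_sketch *t, uint64_t remaining_ns)
{
    if (t->period_ns == 0 || t->divide_count == 0)
        return 0;

    uint64_t ns = remaining_ns % t->period_ns;
    return (uint32_t)(ns / ((uint64_t)APIC_BUS_CYCLE_NS_SKETCH * t->divide_count));
}
```

      With period_ns left at 0 by the sequence listed above, the sketch
      returns 0 instead of dividing.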
      Reported-by: Andrew Honig <ahonig@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Andrew Honig <ahonig@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      b963a22e
    • Andy Honig's avatar
      KVM: Improve create VCPU parameter (CVE-2013-4587) · 338c7dba
      Andy Honig authored
      In multiple functions the vcpu_id is used as an offset into a bitfield.  A
      malicious user could specify a vcpu_id greater than 255 in order to set or
      clear bits in kernel memory.  This could be used to elevate privileges in the
      kernel.  This patch verifies that the vcpu_id provided is less than 255.
      The api documentation already specifies that the vcpu_id must be less than
      max_vcpus, but this is currently not checked.
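
      The check can be sketched as follows, assuming a fixed-size bitmap
      indexed by vcpu_id.  The names vm_sketch, create_vcpu_sketch and
      MAX_VCPUS_SKETCH are hypothetical; the limit of 255 is taken from the
      commit message, not from the kernel source.

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical limit, mirroring the "less than 255" wording above. */
#define MAX_VCPUS_SKETCH 255u
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS ((MAX_VCPUS_SKETCH + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical per-VM state holding one bit per vcpu. */
struct vm_sketch {
    unsigned long vcpu_bitmap[BITMAP_WORDS];
};

/*
 * Reject out-of-range ids before they are used as a bit offset.
 * Without the check, a large id would index past vcpu_bitmap and
 * flip a bit in unrelated memory.
 */
static int create_vcpu_sketch(struct vm_sketch *vm, uint32_t id)
{
    if (id >= MAX_VCPUS_SKETCH)
        return -EINVAL;

    vm->vcpu_bitmap[id / BITS_PER_WORD] |= 1ul << (id % BITS_PER_WORD);
    return 0;
}
```

      Validating once at creation time means later users of the id can
      index the bitmap without repeating the check.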
      Reported-by: Andrew Honig <ahonig@google.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Andrew Honig <ahonig@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      338c7dba
    • Linus Torvalds's avatar
      Merge tag 'sound-3.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 2208f651
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "Still a slightly higher amount of changes than wished, but they are all
        good regression and/or device-specific fixes.  Majority of commits are
        for HD-audio, an HDMI ctl index fix that hits old graphics boards,
        regression fixes for AD codecs and a few quirks.
      
        Other than that, two major fixes are included: a 64bit ABI fix for
        compress offload, and 64bit dma_addr_t truncation fix, which had hit
        on PAE kernels"
      
      * tag 'sound-3.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda - Add static DAC/pin mapping for AD1986A codec
        ALSA: hda - One more Dell headset detection quirk
        ALSA: hda - hdmi: Fix IEC958 ctl indexes for some simple HDMI devices
        ALSA: hda - Mute all aamix inputs as default
        ALSA: compress: Fix 64bit ABI incompatibility
        ALSA: memalloc.h - fix wrong truncation of dma_addr_t
        ALSA: hda - Another Dell headset detection quirk
        ALSA: hda - A Dell headset detection quirk
        ALSA: hda - Remove quirk for Dell Vostro 131
        ALSA: usb-audio: fix uninitialized variable compile warning
        ALSA: hda - fix mic issues on Acer Aspire E-572
      2208f651
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · ea4ebd1c
      Linus Torvalds authored
      Pull input fixes from Dmitry Torokhov:
       "A fix for recent sysfs breakage in serio subsystem plus a fixup to
        adxl34x driver"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Input: adxl34x - Fix bug in definition of ADXL346_2D_ORIENT
        Input: serio - fix sysfs layout
      ea4ebd1c
    • Linus Torvalds's avatar
      Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · 846f29a6
      Linus Torvalds authored
      Pull media fixes from Mauro Carvalho Chehab:
       "A dvb core deadlock fix, a couple of videobuf2 fixes and a series of media
        driver fixes"
      
      * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (30 commits)
        [media] videobuf2-dma-sg: fix possible memory leak
        [media] vb2: regression fix: always set length field.
        [media] mt9p031: Include linux/of.h header
        [media] rtl2830: add parent for I2C adapter
        [media] media: marvell-ccic: use devm to release clk
        [media] ths7303: Declare as static a private function
        [media] em28xx-video: Swap release order to avoid lock nesting
        [media] usbtv: Add support for PAL video source
        [media] media_tree: Fix spelling errors
        [media] videobuf2: Add support for file access mode flags for DMABUF exporting
        [media] radio-shark2: Mark shark_resume_leds() inline to kill compiler warning
        [media] radio-shark: Mark shark_resume_leds() inline to kill compiler warning
        [media] af9035: unlock on error in af9035_i2c_master_xfer()
        [media] af9033: fix broken I2C
        [media] v4l: omap3isp: Don't check for missing get_fmt op on remote subdev
        [media] af9035: fix broken I2C and USB I/O
        [media] wm8775: fix broken audio routing
        [media] marvell-ccic: drop resource free in driver remove
        [media] tef6862/radio-tea5764: actually assign clamp result
        [media] cx231xx: use after free on error path in probe
        ...
      846f29a6
    • Linus Torvalds's avatar
      Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging · 86b581f6
      Linus Torvalds authored
      Pull hwmon fix from Guenter Roeck:
       "Fix HIH-6130 driver to work with BeagleBone"
      
      * tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        hwmon: HIH-6130: Support I2C bus drivers without I2C_FUNC_SMBUS_QUICK
      86b581f6
    • Linus Torvalds's avatar
      Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging · c8469441
      Linus Torvalds authored
      Pull hwmon fixes from Jean Delvare.
      
      * 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
        hwmon: Prevent some divide by zeros in FAN_TO_REG()
        hwmon: (w83l768ng) Fix fan speed control range
        hwmon: (w83l786ng) Fix fan speed control mode setting and reporting
        hwmon: (lm90) Unregister hwmon device if interrupt setup fails
      c8469441
    • Will Deacon's avatar
      word-at-a-time: provide generic big-endian zero_bytemask implementation · 11ec50ca
      Will Deacon authored
      Whilst architectures may be able to do better than this (which they can,
      by simply defining their own macro), this is a generic stab at a
      zero_bytemask implementation for the asm-generic, big-endian
      word-at-a-time implementation.
      
      On arm64, a clz instruction is used to implement the fls efficiently.
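
      As an illustration only, a big-endian zero_bytemask built on fls
      could look like the sketch below.  It assumes the input mask has its
      highest set bit inside the first zero byte (counting from the most
      significant end); fls_index_sketch and zero_bytemask_be_sketch are
      hypothetical names, and the loop stands in for a hardware
      instruction such as clz.

```c
#include <stdint.h>

/* Index of the highest set bit; x must be nonzero. */
static unsigned fls_index_sketch(uint64_t x)
{
    unsigned i = 0;

    while (x >>= 1)
        i++;
    return i;
}

/*
 * Build a mask covering every byte that precedes the first zero byte.
 * On big-endian the string starts in the most significant byte, so the
 * bytes of interest are the bits above the highest set bit of the mask.
 */
static uint64_t zero_bytemask_be_sketch(uint64_t mask)
{
    return ~UINT64_C(1) << fls_index_sketch(mask);
}
```

      For a zero found in the third byte (highest set bit 47), the result
      selects only the two leading bytes.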
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      11ec50ca
    • Will Deacon's avatar
      dcache: allow word-at-a-time name hashing with big-endian CPUs · a5c21dce
      Will Deacon authored
      When explicitly hashing the end of a string with the word-at-a-time
      interface, we have to be careful which end of the word we pick up.
      
      On big-endian CPUs, the upper-bits will contain the data we're after, so
      ensure we generate our masks accordingly (and avoid hashing whatever
      random junk may have been sitting after the string).
      
      This patch adds a new dcache helper, bytemask_from_count, which creates
      a mask appropriate for the CPU endianness.
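
      A rough sketch of what an endian-aware byte mask looks like, assuming
      64-bit words and a count between 0 and 7.  The two function names are
      hypothetical and both variants are shown side by side for comparison;
      a real helper would pick one at build time based on the CPU
      endianness.

```c
#include <stdint.h>

/*
 * Little-endian: the first `cnt` bytes of the string sit in the
 * low-order bytes of the word.  Valid for cnt in 0..7.
 */
static uint64_t bytemask_le_sketch(unsigned cnt)
{
    return ~(~UINT64_C(0) << (cnt * 8));
}

/*
 * Big-endian: the first `cnt` bytes of the string sit in the
 * high-order bytes of the word.  Valid for cnt in 0..7.
 */
static uint64_t bytemask_be_sketch(unsigned cnt)
{
    return ~(~UINT64_C(0) >> (cnt * 8));
}
```

      ANDing the final word of a name with the matching mask drops the
      bytes that follow the string before they reach the hash.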
      
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a5c21dce
    • Linus Torvalds's avatar
      Merge tag 'iommu-fixes-for-v3.13-rc4' of git://github.com/awilliam/linux-vfio · 319720f5
      Linus Torvalds authored
      Pull iommu fixes from Alex Williamson:
       "arm/smmu driver updates via Will Deacon fixing locking around page
        table walks and a couple other issues"
      
      * tag 'iommu-fixes-for-v3.13-rc4' of git://github.com/awilliam/linux-vfio:
        iommu/arm-smmu: fix error return code in arm_smmu_device_dt_probe()
        iommu/arm-smmu: remove potential NULL dereference on mapping path
        iommu/arm-smmu: use mutex instead of spinlock for locking page tables
      319720f5
    • Linus Torvalds's avatar
      Merge tag 'keys-devel-20131210' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs · 5dec682c
      Linus Torvalds authored
      Pull misc keyrings fixes from David Howells:
       "These break down into five sets:
      
         - A patch to error handling in the big_key type for huge payloads.
           If the payload is larger than the "low limit" and the backing store
           allocation fails, then big_key_instantiate() doesn't clear the
           payload pointers in the key, assuming them to have been previously
           cleared - but only one of them is.
      
           Unfortunately, the garbage collector still calls big_key_destroy()
           when it sees one of the pointers with a weird value in it (and not
           NULL) which it then tries to clean up.
      
         - Three patches to fix the keyring type:
      
           * A patch to fix the hash function to correctly divide keyrings off
             from keys in the topology of the tree inside the associative
             array.  This is only a problem if searching through nested
             keyrings - and only if the hash function incorrectly puts a
             keyring outside of the 0 branch of the root node.
      
           * A patch to fix keyrings' use of the associative array.  The
             __key_link_begin() function initially passes a NULL key pointer
             to assoc_array_insert() on the basis that it's holding a place in
             the tree whilst it does more allocation and stuff.
      
             This is only a problem when a node contains 16 keys that match at
             that level and we want to add an also matching 17th.  This should
             easily be manufactured with a keyring full of keyrings (without
             chucking any other sort of key into the mix) - except for (a)
             above which makes it on average adding the 65th keyring.
      
           * A patch to fix searching down through nested keyrings, where any
             keyring in the set has more than 16 keyrings and none of the
             first keyrings we look through has a match (before the tree
             iteration needs to step to a more distal node).
      
           Test in keyutils test suite:
      
              http://git.kernel.org/cgit/linux/kernel/git/dhowells/keyutils.git/commit/?id=8b4ae963ed92523aea18dfbb8cab3f4979e13bd1
      
         - A patch to fix the big_key type's use of a shmem file as its
           backing store causing audit messages and LSM check failures.  This
           is done by setting S_PRIVATE on the file to avoid LSM checks on the
           file (access to the shmem file goes through the keyctl() interface
           and so is gated by the LSM that way).
      
           This isn't normally a problem if a key is used by the context that
           generated it - and it's currently only used by libkrb5.
      
           Test in keyutils test suite:
      
              http://git.kernel.org/cgit/linux/kernel/git/dhowells/keyutils.git/commit/?id=d9a53cbab42c293962f2f78f7190253fc73bd32e
      
         - A patch to add a generated file to .gitignore.
      
         - A patch to fix the alignment of the system certificate data such
           that it works on s390.  As I understand it, on the S390 arch,
           symbols must be 2-byte aligned because loading the address discards
           the least-significant bit"
      
      * tag 'keys-devel-20131210' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
        KEYS: correct alignment of system_certificate_list content in assembly file
        Ignore generated file kernel/x509_certificate_list
        security: shmem: implement kernel private shmem inodes
        KEYS: Fix searching of nested keyrings
        KEYS: Fix multiple key add into associative array
        KEYS: Fix the keyring hash function
        KEYS: Pre-clear struct key on allocation
      5dec682c
    • Linus Torvalds's avatar
      Merge tag 'xfs-for-linus-v3.13-rc4' of git://oss.sgi.com/xfs/xfs · 48a2f0b2
      Linus Torvalds authored
      Pull xfs bugfixes from Ben Myers:
      
       - fix for buffer overrun in agfl with growfs on v4 superblock
      
       - return EINVAL if requested discard length is less than a block
      
       - fix possible memory corruption in xfs_attrlist_by_handle()
      
      * tag 'xfs-for-linus-v3.13-rc4' of git://oss.sgi.com/xfs/xfs:
        xfs: growfs overruns AGFL buffer on V4 filesystems
        xfs: don't perform discard if the given range length is less than block size
        xfs: underflow bug in xfs_attrlist_by_handle()
      48a2f0b2
    • Linus Torvalds's avatar
      futex: move user address verification up to common code · 5cdec2d8
      Linus Torvalds authored
      When debugging the read-only hugepage case, I was confused by the fact
      that get_futex_key() did an access_ok() only for the non-shared futex
      case, since the user address checking really isn't in any way specific
      to the private key handling.
      
      Now, it turns out that the shared key handling does effectively do the
      equivalent checks inside get_user_pages_fast() (it doesn't actually
      check the address range on x86, but does check the page protections for
      being a user page).  So it wasn't actually a bug, but the fact that we
      treat the address differently for private and shared futexes threw me
      for a loop.
      
      Just move the check up, so that it gets done for both cases.  Also, use
      the 'rw' parameter for the type, even if it doesn't actually matter any
      more (it's a historical artifact of the old racy i386 "page faults from
      kernel space don't check write protections").
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5cdec2d8
    • Linus Torvalds's avatar
      futex: fix handling of read-only-mapped hugepages · f12d5bfc
      Linus Torvalds authored
      The hugepage code had the exact same bug that regular pages had in
      commit 7485d0d3 ("futexes: Remove rw parameter from
      get_futex_key()").
      
      The regular page case was fixed by commit 9ea71503 ("futex: Fix
      regression with read only mappings"), but the transparent hugepage case
      (added in a5b338f2: "thp: update futex compound knowledge")
      remained broken.
      
      Found by Dave Jones and his trinity tool.
      Reported-and-tested-by: Dave Jones <davej@fedoraproject.org>
      Cc: stable@kernel.org # v2.6.38+
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Darren Hart <dvhart@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f12d5bfc
    • Dan Carpenter's avatar
      Btrfs: fix access_ok() check in btrfs_ioctl_send() · 700ff4f0
      Dan Carpenter authored
      The closing parenthesis is in the wrong place.  We want to check
      "sizeof(*arg->clone_sources) * arg->clone_sources_count" instead of
      "sizeof(*arg->clone_sources * arg->clone_sources_count)".
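
      The difference between the two expressions can be seen in a small
      sketch.  It assumes that clone_sources points to 64-bit elements and
      that clone_sources_count is a 64-bit integer; send_args_sketch and
      both helper functions are hypothetical names, not the Btrfs
      definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the ioctl argument structure. */
struct send_args_sketch {
    uint64_t *clone_sources;
    uint64_t clone_sources_count;
};

/*
 * Misplaced parenthesis: sizeof is applied to the product, so the
 * result is just the size of one 64-bit value, whatever the count is.
 * The operand of sizeof is not evaluated, so nothing is dereferenced.
 */
static size_t wrong_len_sketch(const struct send_args_sketch *arg)
{
    return sizeof(*arg->clone_sources * arg->clone_sources_count);
}

/* Intended form: the size of one element multiplied by the count. */
static size_t right_len_sketch(const struct send_args_sketch *arg)
{
    return sizeof(*arg->clone_sources) * arg->clone_sources_count;
}
```

      With the misplaced parenthesis the range handed to access_ok()
      covers a single element, so the remaining entries are never checked.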
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: Jie Liu <jeff.liu@oracle.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      cc: stable@vger.kernel.org
      700ff4f0
    • Wang Shilong's avatar
      Btrfs: make sure we cleanup all reloc roots if error happens · 467bb1d2
      Wang Shilong authored
      I hit an oops when merging reloc roots fails.  The reason is that
      new reloc roots may be added in the meantime, so we should make sure
      we clean up all reloc roots.
      Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      467bb1d2