1. 31 Jan, 2014 37 commits
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client · 271bf66d
      Linus Torvalds authored
      Pull some further ceph acl cleanups from Sage Weil:
       "I do have a couple patches on top of what's in your tree, though, that
        clean up a couple duplicated lines in your fix and apply Christoph's
        cleanup"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
        ceph: simplify ceph_{get,init}_acl
        ceph: remove duplicate declaration of ceph_setattr
    • ceph: simplify ceph_{get,init}_acl · 75858236
      Christoph Hellwig authored
       - ->get_acl only gets called after we checked for a cached ACL, so no
         need to call get_cached_acl again.
       - no need to check IS_POSIXACL in ->get_acl, without that it should
         never get set as all the callers that set it already have the check.
       - you should be able to use the full posix_acl_create in CEPH
      Signed-off-by: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Sage Weil <sage@inktank.com>
    • Merge branch 'akpm' (patches from Andrew Morton) · aa2e7100
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "A few hotfixes and various leftovers which were awaiting other merges.
      
        Mainly movement of zram into mm/"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (25 commits)
        memcg: fix mutex not unlocked on memcg_create_kmem_cache fail path
        Documentation/filesystems/vfs.txt: update file_operations documentation
        mm, oom: base root bonus on current usage
        mm: don't lose the SOFT_DIRTY flag on mprotect
        mm/slub.c: fix page->_count corruption (again)
        mm/mempolicy.c: fix mempolicy printing in numa_maps
        zram: remove zram->lock in read path and change it with mutex
        zram: remove workqueue for freeing removed pending slot
        zram: introduce zram->tb_lock
        zram: use atomic operation for stat
        zram: remove unnecessary free
        zram: delay pending free request in read path
        zram: fix race between reset and flushing pending work
        zsmalloc: add maintainers
        zram: add zram maintainers
        zsmalloc: add copyright
        zram: add copyright
        zram: remove old private project comment
        zram: promote zram from staging
        zsmalloc: move it under mm
        ...
    • x86, x32: Correct invalid use of user timespec in the kernel · 2def2ef2
      PaX Team authored
      The x32 case for the recvmmsg() timeout handling is broken:
      
        asmlinkage long compat_sys_recvmmsg(int fd, struct compat_mmsghdr __user *mmsg,
                                            unsigned int vlen, unsigned int flags,
                                            struct compat_timespec __user *timeout)
        {
                int datagrams;
                struct timespec ktspec;
      
                if (flags & MSG_CMSG_COMPAT)
                        return -EINVAL;
      
                if (COMPAT_USE_64BIT_TIME)
                        return __sys_recvmmsg(fd, (struct mmsghdr __user *)mmsg, vlen,
                                              flags | MSG_CMSG_COMPAT,
                                              (struct timespec *) timeout);
                ...
      
      The timeout pointer parameter is provided by userland (hence the __user
      annotation) but for x32 syscalls it's simply cast to a kernel pointer
      and is passed to __sys_recvmmsg which will eventually directly
      dereference it for both reading and writing.  Other callers to
      __sys_recvmmsg properly copy from userland to the kernel first.
      
      The bug was introduced by commit ee4fa23c ("compat: Use
      COMPAT_USE_64BIT_TIME in net/compat.c") and should affect all kernels
      since 3.4 (and perhaps vendor kernels if they backported x32 support
      along with this code).
      
      Note that CONFIG_X86_X32_ABI gets enabled at build time and only if
      CONFIG_X86_X32 is enabled and ld can build x32 executables.
      
      Other uses of COMPAT_USE_64BIT_TIME seem fine.
      
      This addresses CVE-2014-0038.
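
      The copy-in/copy-out discipline that the message attributes to the other
      callers can be modelled in plain userspace C.  This is only a sketch under
      stated assumptions: user_mem, toy_copy_from_user(), toy_copy_to_user(),
      toy_inner() and toy_recvmmsg_timeout() are hypothetical stand-ins, not
      kernel APIs, and the range check is a simplification of what the real
      helpers do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model: this arena is "user memory"; everything else is kernel memory. */
static unsigned char user_mem[64];

struct toy_timespec { long tv_sec; long tv_nsec; };

/* True only if [p, p + len) lies entirely inside the user arena. */
static int in_user_range(const void *p, size_t len)
{
    uintptr_t a = (uintptr_t)p, base = (uintptr_t)user_mem;

    return len <= sizeof(user_mem) && a >= base &&
           a - base <= sizeof(user_mem) - len;
}

/* Both helpers return 0 on success and -1 (standing in for -EFAULT) on a bad pointer. */
static int toy_copy_from_user(void *dst, const void *usrc, size_t len)
{
    if (!in_user_range(usrc, len))
        return -1;
    memcpy(dst, usrc, len);
    return 0;
}

static int toy_copy_to_user(void *udst, const void *src, size_t len)
{
    if (!in_user_range(udst, len))
        return -1;
    memcpy(udst, src, len);
    return 0;
}

/* Stand-in for the inner worker: it reads and updates a kernel-side timespec. */
static void toy_inner(struct toy_timespec *kts)
{
    if (kts->tv_sec > 0)
        kts->tv_sec--;
}

/* The wrapper never dereferences the caller's pointer directly. */
static int toy_recvmmsg_timeout(void *utimeout)
{
    struct toy_timespec kts;

    if (toy_copy_from_user(&kts, utimeout, sizeof(kts)))
        return -1;
    toy_inner(&kts);
    if (toy_copy_to_user(utimeout, &kts, sizeof(kts)))
        return -1;
    return 0;
}
```

      In this model a pointer outside the arena is rejected instead of being
      read or written, which is the property the plain cast loses.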
      Signed-off-by: PaX Team <pageexec@freemail.hu>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Cc: <stable@vger.kernel.org> # v3.4+
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 12f2bbd6
      Linus Torvalds authored
      Pull x86 asmlinkage (LTO) changes from Peter Anvin:
       "This patchset adds more infrastructure for link time optimization
        (LTO).
      
        This patchset was pulled into my tree late because of a
        miscommunication (part of the patchset was picked up by other
        maintainers).  However, the patchset is strictly build-related and
        seems to be okay in testing"
      
      * 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86, asmlinkage, xen: Fix type of NMI
        x86, asmlinkage, xen, kvm: Make {xen,kvm}_lock_spinning global and visible
        x86: Use inline assembler instead of global register variable to get sp
        x86, asmlinkage, paravirt: Make paravirt thunks global
        x86, asmlinkage, paravirt: Don't rely on local assembler labels
        x86, asmlinkage, lguest: Fix C functions used by inline assembler
    • Merge branch 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 10ffe3db
      Linus Torvalds authored
      Pull x86 build bits from Peter Anvin:
       "Various build-related minor bits.
      
        Most of this is work by David Woodhouse to be able to compile the
        early boot code with clang/llvm; we have also managed to push an
        actual -m16 option into gcc 4.9 so this makes us use that option if
        available instead of hacking it.
      
        The balance is a patch from Michael Davidson to the relocs program to
        help manual debugging.
      
        None of these should change the actual compiled binary with currently
        released compilers"
      
      * 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86, build: Build 16-bit code with -m16 where possible
        x86, boot: Fix word-size assumptions in has_eflag() inline asm
        x86, boot: Use __attribute__((used)) to ensure videocard structs are emitted
        x86: Remove duplication of 16-bit CFLAGS
        x86, relocs: Add manual debug mode
    • Merge tag 'late-dt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · f8a504c4
      Linus Torvalds authored
      Pull ARM SoC late changes from Kevin Hilman:
       "These are changes that arrived a little late but were considered
        self-contained enough to still go in for v3.14.
      
        They are all device tree updates this time around, and mainly for
        Broadcom SoCs"
      
      * tag 'late-dt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
        ARM: moxart: move fixed rate clock child node to board level dts
        clk: bcm281xx: define kona clock binding
        ARM: dts: add usb udc support to bcm281xx
        ARM: dts: Specify clocks for timer on bcm11351
        Documentation: dt: kona-timer: Add clocks property
        ARM: dts: Specify clocks for SDHCIs on bcm11351
        Documentation: dt: kona-sdhci: Add clocks property
        ARM: dts: Specify clocks for UARTs on bcm11351
        ARM: dts: bcm281xx: Add i2c busses
        ARM: dts: Declare clocks as fixed on bcm11351
        ARM: dts: bcm28155-ap: Enable all the i2c busses
    • Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus · cdfc8307
      Linus Torvalds authored
      Pull MIPS updates from Ralf Baechle:
       "The most notable new addition inside this pull request is the support
        for MIPS's latest and greatest core called "inter/proAptiv".  The
        patch series describes this core as follows.
      
          "The interAptiv is a power-efficient multi-core microprocessor
           for use in system-on-chip (SoC) applications. The interAptiv combines
           a multi-threading pipeline with a coherence manager to deliver improved
           computational throughput and power efficiency. The interAptiv can
           contain one to four MIPS32R3 interAptiv cores, system level
           coherence manager with L2 cache, optional coherent I/O port,
           and optional floating point unit."
      
        The platform specific patches touch all 3 Broadcom families.  They add
        support for the new Broadcom/Netlogic XLP9xx SoC, build a common
        BCM63XX SMP kernel for all BCM63XX SoCs regardless of core type/count,
        and add full gpio button/led descriptions for BCM47xx.
      
        The rest of the series are cleanups and bug fixes that are MIPS
        generic and consist largely of changes that Imgtec/MIPS had published
        in their linux-mti-3.10.git stable tree.  Random other cleanups and
        patches preparing code to be merged in 3.15"
      
      * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (139 commits)
        mips: select ARCH_MIGHT_HAVE_PC_SERIO
        mips: delete non-required instances of include <linux/init.h>
        MIPS: KVM: remove shadow_tlb code
        MIPS: KVM: use common EHINV aware UNIQUE_ENTRYHI
        mips/ide: flush dcache also if icache does not snoop dcache
        MIPS: BCM47XX: fix position of cpu_wait disabling
        MIPS: BCM63XX: select correct MIPS_L1_CACHE_SHIFT value
        MIPS: update MIPS_L1_CACHE_SHIFT based on MIPS_L1_CACHE_SHIFT_<N>
        MIPS: introduce MIPS_L1_CACHE_SHIFT_<N>
        MIPS: ZBOOT: gather string functions into string.c
        arch/mips/pci: don't check resource with devm_ioremap_resource
        arch/mips/lantiq/xway: don't check resource with devm_ioremap_resource
        bcma: gpio: don't cast u32 to unsigned long
        ssb: gpio: add own IRQ domain
        MIPS: BCM47XX: fix sparse warnings in board.c
        MIPS: BCM47XX: add board detection for Linksys WRT54GS V1
        MIPS: BCM47XX: fix detection for some boards
        MIPS: BCM47XX: Enable buttons support on SSB
        MIPS: BCM47XX: Convert WNDR4500 to new syntax
        MIPS: BCM47XX: Use "timer" trigger for status LEDs
        ...
    • Merge tag 'for-3.14' of git://openrisc.net/~jonas/linux · 04a24ae4
      Linus Torvalds authored
      Pull OpenRISC updates from Jonas Bonn:
       "The interesting change here is a rework of the OpenRISC signal
        handling to make it more like other architectures in the hopes that
        this makes it easier for others to comment on and understand.  This
        rework fixes some real bugs, like the fact that syscall restart did
        not work reliably"
      
      * tag 'for-3.14' of git://openrisc.net/~jonas/linux:
        openrisc: Use get_signal() signal_setup_done()
        openrisc: Rework signal handling
    • Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc · 4bcec913
      Linus Torvalds authored
      Pull more powerpc bits from Ben Herrenschmidt:
       "Here are a few more powerpc bits for this merge window.  The bulk is
        made of two pull requests from Scott and Anatolij that I had missed
        previously (they arrived while I was away).  Since both their branches
        are in -next independently, and the content has been around for a
        little while, they can still go in.
      
        The rest is mostly bug and regression fixes, a small series of
        cleanups to our pseries cpuidle code (including moving it to the right
        place), and one new cpuidle backend for the powernv platform.  I also
        wired up the new sched_attr syscalls"
      
      * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (37 commits)
        powerpc: Wire up sched_setattr and sched_getattr syscalls
        powerpc/hugetlb: Replace __get_cpu_var with get_cpu_var
        powerpc: Make sure "cache" directory is removed when offlining cpu
        powerpc/mm: Fix mmap errno when MAP_FIXED is set and mapping exceeds the allowed address space
        powerpc/powernv/cpuidle: Back-end cpuidle driver for powernv platform.
        powerpc/pseries/cpuidle: smt-snooze-delay cleanup.
        powerpc/pseries/cpuidle: Remove MAX_IDLE_STATE macro.
        powerpc/pseries/cpuidle: Make cpuidle-pseries backend driver a non-module.
        powerpc/pseries/cpuidle: Use cpuidle_register() for initialisation.
        powerpc/pseries/cpuidle: Move processor_idle.c to drivers/cpuidle.
        powerpc: Fix 32-bit frames for signals delivered when transactional
        powerpc/iommu: Fix initialisation of DART iommu table
        powerpc/numa: Fix decimal permissions
        powerpc/mm: Fix compile error of pgtable-ppc64.h
        powerpc: Fix hw breakpoints on !HAVE_HW_BREAKPOINT configurations
        clk: corenet: Adds the clock binding
        powerpc/booke64: Guard e6500 tlb handler with CONFIG_PPC_FSL_BOOK3E
        powerpc/512x: dts: add MPC5125 clock specs
        powerpc/512x: clk: support MPC5121/5123/5125 SoC variants
        powerpc/512x: clk: enforce even SDHC divider values
        ...
    • Merge branch 'drop-time' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild · 03c7287d
      Linus Torvalds authored
      Pull __TIME__/__DATE__ removal from Michal Marek:
       "This series by Josh finishes the removal of __DATE__ and __TIME__ from
        the kernel.  The last patch adds -Werror=date-time to KBUILD_CFLAGS to
        stop these from reappearing.
      
        Part of the series went through Greg's trees during this merge window,
        which is why this pull request is not based on v3.13-rc1"
      
      * 'drop-time' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
        Makefile: Build with -Werror=date-time if the compiler supports it
        x86: math-emu: Drop already-disabled print of build date
        net: wireless: brcm80211: Drop debug version with build date/time
        mtd: denali: Drop print of build date/time
    • Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild · 597690cd
      Linus Torvalds authored
      Pull kbuild changes from Michal Marek:
       - fix make -s detection with make-4.0
       - fix for scripts/setlocalversion when the kernel repository is a
         submodule
       - do not hardcode ';' in macros that expand to assembler code, as some
         architectures' assemblers use a different character for newline
       - Fix passing --gdwarf-2 to the assembler
      
      * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
        frv: Remove redundant debugging info flag
        mn10300: Remove redundant debugging info flag
        kbuild: Fix debugging info generation for .S files
        arch: use ASM_NL instead of ';' for assembler new line character in the macro
        kbuild: Fix silent builds with make-4
        Fix detection of kernel git repository in setlocalversion script [take #2]
    • memcg: fix mutex not unlocked on memcg_create_kmem_cache fail path · 7c094fd6
      Vladimir Davydov authored
      Commit 842e2873 ("memcg: get rid of kmem_cache_dup()") introduced a
      mutex for memcg_create_kmem_cache() to protect the tmp_name buffer that
      holds the memcg name.  It failed to unlock the mutex if this buffer
      could not be allocated.
      
      This patch fixes the issue by appropriately unlocking the mutex if the
      allocation fails.
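
      The shape of such a fix can be sketched in userspace C with a single
      unlock label that every exit path passes through.  The names here
      (toy_mutex, name_lock, tmp_name as a plain static buffer,
      toy_create_cache() and its fail_alloc switch) are hypothetical and only
      model the situation described above; this is not the kernel code.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy lock that only records whether it is held (hypothetical, single-threaded). */
struct toy_mutex { int locked; };

static void toy_lock(struct toy_mutex *m)   { assert(!m->locked); m->locked = 1; }
static void toy_unlock(struct toy_mutex *m) { assert(m->locked);  m->locked = 0; }

static struct toy_mutex name_lock;
static char *tmp_name;              /* shared scratch buffer guarded by name_lock */

/* fail_alloc != 0 simulates the buffer allocation failing. */
static int toy_create_cache(int fail_alloc)
{
    int ret = 0;

    toy_lock(&name_lock);
    if (!tmp_name) {
        tmp_name = fail_alloc ? NULL : malloc(64);
        if (!tmp_name) {
            ret = -1;
            goto out_unlock;        /* the failure path must drop the lock too */
        }
    }
    /* ... tmp_name would be used here while the lock is held ... */
out_unlock:
    toy_unlock(&name_lock);
    return ret;
}
```

      Returning directly from the allocation check instead of jumping to
      out_unlock would leave name_lock held, and the next caller would block
      forever.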
      Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Glauber Costa <glommer@parallels.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Documentation/filesystems/vfs.txt: update file_operations documentation · 46bf16c4
      Richard Yao authored
      ->readv, ->writev and ->sendfile have been removed while ->show_fdinfo
      has been added. The documentation should reflect this.
      Signed-off-by: Richard Yao <ryao@gentoo.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, oom: base root bonus on current usage · 778c14af
      David Rientjes authored
      A bonus of 3% of system memory is sometimes excessive in comparison to
      other processes.
      
      With commit a63d83f4 ("oom: badness heuristic rewrite"), the OOM
      killer tries to avoid killing privileged tasks by subtracting 3% of
      overall memory (system or cgroup) from their per-task consumption.  But
      as a result, all root tasks that consume less than 3% of overall memory
      are considered equal, and so it only takes 33+ privileged tasks pushing
      the system out of memory for the OOM killer to do something stupid and
      kill dhclient or other root-owned processes.  For example, on a 32G
      machine it can't tell the difference between the 1M agetty and the 10G
      fork bomb member.
      
      The changelog describes this 3% boost as the equivalent to the global
      overcommit limit being 3% higher for privileged tasks, but this is not
      the same as discounting 3% of overall memory from _every privileged task
      individually_ during OOM selection.
      
      Replace the 3% of system memory bonus with a 3% of current memory usage
      bonus.
      
      By giving root tasks a bonus that is proportional to their actual size,
      they remain comparable even when relatively small.  In the example
      above, the OOM killer will discount the 1M agetty's 256 badness points
      down to 179, and the 10G fork bomb's 262144 points down to 183500 points
      and make the right choice, instead of discounting both to 0 and killing
      agetty because it's first in the task list.
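
      The difference between the two discount rules can be sketched as
      follows.  The helper names and the page counts used afterwards are made
      up for illustration and are not the figures quoted above; the floor at 0
      in the old rule is a simplification.

```c
#include <assert.h>

/* Old rule as described above: subtract 3% of total memory, floor at 0. */
static long discount_by_total(long points, long totalpages)
{
    long bonus = totalpages * 3 / 100;

    return points > bonus ? points - bonus : 0;
}

/* New rule: subtract 3% of the task's own usage. */
static long discount_by_usage(long points)
{
    return points - points * 3 / 100;
}
```

      With 1,000,000 total pages, root tasks using 1,000 and 20,000 pages both
      drop to 0 under the old rule, so they cannot be told apart.  Under the
      new rule they score 970 and 19,400, so the larger task still ranks
      higher.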
      Signed-off-by: David Rientjes <rientjes@google.com>
      Reported-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: don't lose the SOFT_DIRTY flag on mprotect · 24f91eba
      Andrey Vagin authored
      The SOFT_DIRTY bit shows that the content of memory was changed after a
      defined point in the past.  mprotect() doesn't change the content of
      memory, so it must not change the SOFT_DIRTY bit.
      
      This bug causes a malfunction: on the first iteration all pages are
      dumped.  On other iterations only pages with the SOFT_DIRTY bit are
      dumped.  So if the SOFT_DIRTY bit is cleared from a page by mistake, the
      page is not dumped and its content will be restored incorrectly.
      
      This patch does nothing with _PAGE_SWP_SOFT_DIRTY, because pte_modify()
      is called only for present pages.
      
      Fixes commit 0f8975ec ("mm: soft-dirty bits for user memory changes
      tracking").
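
      A minimal sketch of how a protection change can drop or keep such a
      flag, assuming pte_modify() keeps only the bits named in a "preserve"
      mask and takes every other bit from the new protection.  The bit values
      and the TOY_* names are hypothetical and do not match any real page
      table layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only. */
#define TOY_PFN_MASK    UINT64_C(0xfffffffffffff000)
#define TOY_WRITE       UINT64_C(0x002)
#define TOY_ACCESSED    UINT64_C(0x020)
#define TOY_DIRTY       UINT64_C(0x040)
#define TOY_SOFT_DIRTY  UINT64_C(0x800)

/* Preserve mask without and with the soft-dirty bit. */
#define TOY_CHG_MASK_OLD    (TOY_PFN_MASK | TOY_ACCESSED | TOY_DIRTY)
#define TOY_CHG_MASK_FIXED  (TOY_CHG_MASK_OLD | TOY_SOFT_DIRTY)

/* Keep the bits in chg_mask from the old entry, take the rest from newprot. */
static uint64_t toy_pte_modify(uint64_t pte, uint64_t newprot, uint64_t chg_mask)
{
    return (pte & chg_mask) | (newprot & ~chg_mask);
}
```

      Changing a writable, soft-dirty entry to read-only with
      TOY_CHG_MASK_OLD clears the soft-dirty bit, while TOY_CHG_MASK_FIXED
      carries it over unchanged.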
      Signed-off-by: Andrey Vagin <avagin@openvz.org>
      Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Wen Congyang <wency@cn.fujitsu.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/slub.c: fix page->_count corruption (again) · a0320865
      Dave Hansen authored
      Commit abca7c49 ("mm: fix slab->page _count corruption when using
      slub") notes that we can not _set_ a page->counters directly, except
      when using a real double-cmpxchg.  Doing so can lose updates to
      ->_count.
      
      That is an absolute rule:
      
              You may not *set* page->counters except via a cmpxchg.
      
      Commit abca7c49 fixed this for the folks who have the slub
      cmpxchg_double code turned off at compile time, but it left the bad case
      alone.  It can still be reached, and the same bug triggered in two
      cases:
      
      1. Turning on slub debugging at runtime, which is available on
         the distro kernels that I looked at.
      2. On 64-bit CPUs with no CMPXCHG16B (some early AMD x86-64
         cpus, evidently)
      
      There are at least 3 ways we could fix this:
      
      1. Take all of the existing calls to cmpxchg_double_slab() and
         __cmpxchg_double_slab() and convert them to take an old, new
         and target 'struct page'.
      2. Do (1), but with the newly-introduced 'slub_data'.
      3. Do some magic inside the two cmpxchg...slab() functions to
         pull the counters out of new_counters and only set those
         fields in page->{inuse,frozen,objects}.
      
      I've done (2) as well, but it's a bunch more code.  This patch is an
      attempt at (3).  This was the most straightforward and foolproof way
      that I could think to do this.
      
      This would also technically allow us to get rid of the ugly
      
      #if defined(CONFIG_HAVE_CMPXCHG_DOUBLE) && \
             defined(CONFIG_HAVE_ALIGNED_STRUCT_PAGE)
      
      in 'struct page', but leaving it alone has the added benefit that
      'counters' stays 'unsigned' instead of 'unsigned long', so all the
      copies that the slub code does stay a bit smaller.
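
      Approach (3) can be illustrated with a toy layout in which a 64-bit
      'counters' word overlaps both the slub fields and a reference count.
      struct toy_page and both helpers are hypothetical; the real struct page
      layout differs, and the sketch is single-threaded, so the "concurrent"
      update is simulated by taking a stale snapshot.

```c
#include <assert.h>
#include <stdint.h>

struct toy_page {
    union {
        uint64_t counters;          /* whole word: also covers refcount */
        struct {
            unsigned inuse:16;
            unsigned objects:15;
            unsigned frozen:1;
            int refcount;           /* stands in for page->_count */
        };
    };
};

/* Unsafe: replaces refcount with whatever stale value new_counters carries. */
static void toy_set_counters_whole(struct toy_page *page, uint64_t new_counters)
{
    page->counters = new_counters;
}

/* Approach (3): unpack the new word and copy only the slub-owned fields. */
static void toy_set_counters_fields(struct toy_page *page, uint64_t new_counters)
{
    struct toy_page tmp;

    tmp.counters = new_counters;
    page->inuse = tmp.inuse;
    page->objects = tmp.objects;
    page->frozen = tmp.frozen;
}
```

      If the reference count changes between taking the snapshot and writing
      it back, the whole-word store loses that change, while the field-wise
      copy leaves refcount alone.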
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Matt Mackall <mpm@selenic.com>
      Cc: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/mempolicy.c: fix mempolicy printing in numa_maps · 8790c71a
      David Rientjes authored
      As a result of commit 5606e387 ("mm: numa: Migrate on reference
      policy"), /proc/<pid>/numa_maps prints the mempolicy for any <pid> as
      "prefer:N" for the local node, N, of the process reading the file.
      
      This should only be printed when the mempolicy of <pid> is
      MPOL_PREFERRED for node N.
      
      If the process is actually only using the default mempolicy for local
      node allocation, make sure "default" is printed as expected.
      Signed-off-by: David Rientjes <rientjes@google.com>
      Reported-by: Robert Lippert <rlippert@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Acked-by: Mel Gorman <mgorman@suse.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: <stable@vger.kernel.org>	[3.7+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • zram: remove zram->lock in read path and change it with mutex · e46e3315
      Minchan Kim authored
      Finally, we have separated the zram->lock dependency from 32-bit stat and
      table handling, so there is no reason to use an rw_semaphore between the
      read and write paths.  This patch removes the lock from the read path
      entirely and replaces the rw_semaphore with a mutex.  So, we could do
      
      old:
      
        read-read: OK
        read-write: NO
        write-write: NO
      
      Now:
      
        read-read: OK
        read-write: OK
        write-write: NO
      
      The data below shows that the mixed workload performs about 11 times
      better, and there is also an improvement on the write-write path because
      the current rw_semaphore doesn't support SPIN_ON_OWNER.  It's a side
      effect, but a good thing for us anyway.

      Write-related tests perform better (from 61% to 1058%), while the read
      path is mixed (from -2.22% to 1.45%), but those differences are all
      marginal and within stddev.
      
        CPU 12
        iozone -t -T -l 12 -u 12 -r 16K -s 60M -I +Z -V 0
      
        ==Initial write                ==Initial write
        records: 10                    records: 10
        avg:  516189.16                avg:  839907.96
        std:   22486.53 (4.36%)        std:   47902.17 (5.70%)
        max:  546970.60                max:  909910.35
        min:  481131.54                min:  751148.38
        ==Rewrite                      ==Rewrite
        records: 10                    records: 10
        avg:  509527.98                avg: 1050156.37
        std:   45799.94 (8.99%)        std:   40695.44 (3.88%)
        max:  611574.27                max: 1111929.26
        min:  443679.95                min:  980409.62
        ==Read                         ==Read
        records: 10                    records: 10
        avg: 4408624.17                avg: 4472546.76
        std:  281152.61 (6.38%)        std:  163662.78 (3.66%)
        max: 4867888.66                max: 4727351.03
        min: 4058347.69                min: 4126520.88
        ==Re-read                      ==Re-read
        records: 10                    records: 10
        avg: 4462147.53                avg: 4363257.75
        std:  283546.11 (6.35%)        std:  247292.63 (5.67%)
        max: 4912894.44                max: 4677241.75
        min: 4131386.50                min: 4035235.84
        ==Reverse Read                 ==Reverse Read
        records: 10                    records: 10
        avg: 4565865.97                avg: 4485818.08
        std:  313395.63 (6.86%)        std:  248470.10 (5.54%)
        max: 5232749.16                max: 4789749.94
        min: 4185809.62                min: 3963081.34
        ==Stride read                  ==Stride read
        records: 10                    records: 10
        avg: 4515981.80                avg: 4418806.01
        std:  211192.32 (4.68%)        std:  212837.97 (4.82%)
        max: 4889287.28                max: 4686967.22
        min: 4210362.00                min: 4083041.84
        ==Random read                  ==Random read
        records: 10                    records: 10
        avg: 4410525.23                avg: 4387093.18
        std:  236693.22 (5.37%)        std:  235285.23 (5.36%)
        max: 4713698.47                max: 4669760.62
        min: 4057163.62                min: 3952002.16
        ==Mixed workload               ==Mixed workload
        records: 10                    records: 10
        avg:  243234.25                avg: 2818677.27
        std:   28505.07 (11.72%)       std:  195569.70 (6.94%)
        max:  288905.23                max: 3126478.11
        min:  212473.16                min: 2484150.69
        ==Random write                 ==Random write
        records: 10                    records: 10
        avg:  555887.07                avg: 1053057.79
        std:   70841.98 (12.74%)       std:   35195.36 (3.34%)
        max:  683188.28                max: 1096125.73
        min:  437299.57                min:  992481.93
        ==Pwrite                       ==Pwrite
        records: 10                    records: 10
        avg:  501745.93                avg:  810363.09
        std:   16373.54 (3.26%)        std:   19245.01 (2.37%)
        max:  518724.52                max:  833359.70
        min:  464208.73                min:  765501.87
        ==Pread                        ==Pread
        records: 10                    records: 10
        avg: 4539894.60                avg: 4457680.58
        std:  197094.66 (4.34%)        std:  188965.60 (4.24%)
        max: 4877170.38                max: 4689905.53
        min: 4226326.03                min: 4095739.72
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • zram: remove workqueue for freeing removed pending slot · f614a9f4
      Minchan Kim authored
      Commit a0c516cb ("zram: don't grab mutex in zram_slot_free_notify")
      introduced the pending free request code to avoid sleeping on a mutex
      under a spinlock, but it was a mess that made the code lengthy and
      increased overhead.

      Now we no longer need zram->lock to free a slot, so this patch reverts
      that commit; tb_lock protects the slot instead.
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • zram: introduce zram->tb_lock · 92967471
      Minchan Kim authored
      Currently, the zram table is protected by zram->lock, but that is a
      rather coarse-grained lock and it hurts scalability.

      Let's use a dedicated rwlock instead of depending on zram->lock.  This
      patch adds new locking, so it will obviously make things slower, but it is
      just preparation for removing the coarse-grained rw_semaphore (i.e.
      zram->lock), which is a hurdle for zram scalability.

      The final patch in this series will remove the lock from the read path
      and replace the rw_semaphore with a mutex in the write path.  As a bonus,
      we can drop the pending slot free mess in the next patch.
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • zram: use atomic operation for stat · deb0bdeb
      Minchan Kim authored
      Some of the fields in zram->stats are protected by zram->lock, which is
      rather coarse-grained, so let's use atomic operations without explicit
      locking.

      This patch prepares for removing the read path's dependency on
      zram->lock, which is a very coarse-grained rw_semaphore.  Of course, this
      patch adds new atomic operations, so it might slow things down, but my
      12-CPU test couldn't spot any regression.  All gains and losses are
      marginal and within stddev.
      
        iozone -t -T -l 12 -u 12 -r 16K -s 60M -I +Z -V 0
      
        ==Initial write                ==Initial write
        records: 50                    records: 50
        avg:  412875.17                avg:  415638.23
        std:   38543.12 (9.34%)        std:   36601.11 (8.81%)
        max:  521262.03                max:  502976.72
        min:  343263.13                min:  351389.12
        ==Rewrite                      ==Rewrite
        records: 50                    records: 50
        avg:  416640.34                avg:  397914.33
        std:   60798.92 (14.59%)       std:   46150.42 (11.60%)
        max:  543057.07                max:  522669.17
        min:  304071.67                min:  316588.77
        ==Read                         ==Read
        records: 50                    records: 50
        avg: 4147338.63                avg: 4070736.51
        std:  179333.25 (4.32%)        std:  223499.89 (5.49%)
        max: 4459295.28                max: 4539514.44
        min: 3753057.53                min: 3444686.31
        ==Re-read                      ==Re-read
        records: 50                    records: 50
        avg: 4096706.71                avg: 4117218.57
        std:  229735.04 (5.61%)        std:  171676.25 (4.17%)
        max: 4430012.09                max: 4459263.94
        min: 2987217.80                min: 3666904.28
        ==Reverse Read                 ==Reverse Read
        records: 50                    records: 50
        avg: 4062763.83                avg: 4078508.32
        std:  186208.46 (4.58%)        std:  172684.34 (4.23%)
        max: 4401358.78                max: 4424757.22
        min: 3381625.00                min: 3679359.94
        ==Stride read                  ==Stride read
        records: 50                    records: 50
        avg: 4094933.49                avg: 4082170.22
        std:  185710.52 (4.54%)        std:  196346.68 (4.81%)
        max: 4478241.25                max: 4460060.97
        min: 3732593.23                min: 3584125.78
        ==Random read                  ==Random read
        records: 50                    records: 50
        avg: 4031070.04                avg: 4074847.49
        std:  192065.51 (4.76%)        std:  206911.33 (5.08%)
        max: 4356931.16                max: 4399442.56
        min: 3481619.62                min: 3548372.44
        ==Mixed workload               ==Mixed workload
        records: 50                    records: 50
        avg:  149925.73                avg:  149675.54
        std:    7701.26 (5.14%)        std:    6902.09 (4.61%)
        max:  191301.56                max:  175162.05
        min:  133566.28                min:  137762.87
        ==Random write                 ==Random write
        records: 50                    records: 50
        avg:  404050.11                avg:  393021.47
        std:   58887.57 (14.57%)       std:   42813.70 (10.89%)
        max:  601798.09                max:  524533.43
        min:  325176.99                min:  313255.34
        ==Pwrite                       ==Pwrite
        records: 50                    records: 50
        avg:  411217.70                avg:  411237.96
        std:   43114.99 (10.48%)       std:   33136.29 (8.06%)
        max:  530766.79                max:  471899.76
        min:  320786.84                min:  317906.94
        ==Pread                        ==Pread
        records: 50                    records: 50
        avg: 4154908.65                avg: 4087121.92
        std:  151272.08 (3.64%)        std:  219505.04 (5.37%)
        max: 4459478.12                max: 4435857.38
        min: 3730512.41                min: 3101101.67
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      deb0bdeb
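      As an illustration of the "atomic operation for stat" commit above,
      the sketch below keeps counters that can be updated without holding
      any lock.  It is a userspace analogy using C11 <stdatomic.h> rather
      than the kernel's atomic types, and the struct and function names
      (struct stats, stats_inc_read, and so on) are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical statistics block: each counter is its own atomic object,
 * so no lock is needed to keep individual counters consistent. */
struct stats {
    atomic_uint_fast64_t num_reads;
    atomic_uint_fast64_t num_writes;
};

static void stats_init(struct stats *s)
{
    assert(s);
    atomic_init(&s->num_reads, 0);
    atomic_init(&s->num_writes, 0);
}

/* Relaxed ordering is enough for plain event counters: only the count
 * itself must not be lost, it does not order other memory accesses. */
static void stats_inc_read(struct stats *s)
{
    atomic_fetch_add_explicit(&s->num_reads, 1, memory_order_relaxed);
}

static void stats_inc_write(struct stats *s)
{
    atomic_fetch_add_explicit(&s->num_writes, 1, memory_order_relaxed);
}

static uint_fast64_t stats_reads(struct stats *s)
{
    return atomic_load_explicit(&s->num_reads, memory_order_relaxed);
}

static uint_fast64_t stats_writes(struct stats *s)
{
    return atomic_load_explicit(&s->num_writes, memory_order_relaxed);
}
```

      Each increment is a single atomic read-modify-write, which is the
      small extra cost the commit message refers to.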
    • Minchan Kim's avatar
      zram: remove unnecessary free · 874e3cdd
      Minchan Kim authored
      Commit a0c516cb ("zram: don't grab mutex in zram_slot_free_noity")
      introduced pending zram slot free in zram's write path in case of
      missing slot free by memory allocation failure in zram_slot_free_notify
      but it is not necessary because we have already freed the slot right
      before overwriting.
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      874e3cdd
    • Minchan Kim's avatar
      zram: delay pending free request in read path · 9b353db1
      Minchan Kim authored
      Sergey reported that we don't need to handle pending free requests on
      every I/O, so this patch removes that handling from the read path while
      keeping it in the write path.
      
      Consider the following example.
      
      The swap subsystem asks zram to free block "A" via
      swap_slot_free_notify, but zram only queues the request as pending
      without actually freeing the block.  The swap subsystem then allocates
      block "A" for new data, but the long-pending request is handled only
      afterwards and zram blindly frees the new data in block "A".  :(
      
      That is why we cannot remove the handling of pending free requests right
      before a zram write.
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9b353db1
    • Minchan Kim's avatar
      zram: fix race between reset and flushing pending work · da4a0412
      Minchan Kim authored
      Dan and Sergey reported that there is a race between reset and flushing
      of pending work: reset can free zram->meta while zram_slot_free still
      accesses zram->meta if a new request is added during the race window,
      which can cause an oops.
      
      This patch moves the flush to after init_lock is taken, which prevents
      new requests and thus closes the race.
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      da4a0412
    • Minchan Kim's avatar
      zsmalloc: add maintainers · eae70d06
      Minchan Kim authored
      Add maintainer information for zsmalloc into the MAINTAINERS file.
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      eae70d06
    • Minchan Kim's avatar
      zram: add zram maintainers · 6920f2cc
      Minchan Kim authored
      Add maintainer information for zram into the MAINTAINERS file.
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6920f2cc
    • Minchan Kim's avatar
      zsmalloc: add copyright · 31fc00bb
      Minchan Kim authored
      Add my copyright to the zsmalloc source code which I maintain.
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      31fc00bb
    • Minchan Kim's avatar
      zram: add copyright · 7bfb3de8
      Minchan Kim authored
      Add my copyright to the zram source code which I maintain.
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7bfb3de8
    • Minchan Kim's avatar
      zram: remove old private project comment · 49061236
      Minchan Kim authored
      Remove the old private compcache project address so that upcoming
      patches are sent to LKML, because the Linux kernel community will take
      care of them.
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      49061236
    • Minchan Kim's avatar
      zram: promote zram from staging · cd67e10a
      Minchan Kim authored
      Zram has lived in staging for a LONG LONG time and has been
      fixed/improved by many contributors, so the code is clean and stable
      now.  Of course, there are lots of products using zram in real practice.
      
      The major TV companies have used zram as swap for two years, and
      recently our production team released an Android smartphone with zram
      used as swap, too, and recently Android KitKat started to use zram for
      small-memory smartphones.  There was also a report that Google released
      their ChromeOS with zram, and CyanogenMod has been using zram for a long
      time.  I heard some distros have used a zram block device for tmpfs.
      In addition, I saw many reports from many other people.  For example,
      Lubuntu started to use it.
      
      The benefit of zram is very clear.  In my experience, one of the
      benefits was removing jitter in video applications under background
      memory pressure.  That is partly an effect of efficient memory usage
      through compression, but the bigger issue is whether swap exists in the
      system at all.  Recent mobile platforms use Java, so there are many
      anonymous pages.  But embedded systems are normally reluctant to use
      eMMC or SD cards as swap because of wear-leveling and latency issues, so
      if we do not use swap, we can't reclaim anonymous pages and, in the end,
      we could encounter an OOM kill.  :(
      
      Even when we have real storage as swap, that is a problem, too, because
      it sometimes ends up making the system very unresponsive due to slow
      swap storage performance.
      
      Quote from Luigi at Google:
       "Since Chrome OS was mentioned: the main reason why we don't use swap
        to a disk (rotating or SSD) is because it doesn't degrade gracefully
        and leads to a bad interactive experience.  Generally we prefer to
        manage RAM at a higher level, by transparently killing and restarting
        processes.  But we noticed that zram is fast enough to be competitive
        with the latter, and it lets us make more efficient use of the
        available RAM."
      http://www.spinics.net/lists/linux-mm/msg57717.html
      
      Another use case is zram as a general block device.  Zram is a block
      device, so anyone can format it and mount it; some people on the
      internet have started using zram as /var/tmp.
      http://forums.gentoo.org/viewtopic-t-838198-start-0.html
      
      Let's promote zram and enhance/maintain it instead of removing.
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Acked-by: Nitin Gupta <ngupta@vflare.org>
      Acked-by: Pekka Enberg <penberg@kernel.org>
      Cc: Bob Liu <bob.liu@oracle.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Luigi Semenzato <semenzato@google.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cd67e10a
    • Minchan Kim's avatar
      zsmalloc: move it under mm · bcf1647d
      Minchan Kim authored
      This patch moves zsmalloc under the mm directory.
      
      First, this description explains why we need a custom allocator.
      
      Zsmalloc is a new slab-based memory allocator for storing compressed
      pages.  It is designed for low fragmentation and a high allocation
      success rate on large, but <= PAGE_SIZE, allocations.
      
      zsmalloc differs from the kernel slab allocator in two primary ways to
      achieve these design goals.
      
      zsmalloc never requires high order page allocations to back slabs, or
      "size classes" in zsmalloc terms.  Instead it allows multiple
      single-order pages to be stitched together into a "zspage" which backs
      the slab.  This allows for higher allocation success rate under memory
      pressure.
      
      Also, zsmalloc allows objects to span page boundaries within the zspage.
      This allows for lower fragmentation than could be had with the kernel
      slab allocator for objects between PAGE_SIZE/2 and PAGE_SIZE.  With the
      kernel slab allocator, if a page compresses to 60% of its original size,
      the memory savings gained through compression is lost in fragmentation
      because another object of the same size can't be stored in the leftover
      space.
      
      This ability to span pages results in zsmalloc allocations not being
      directly addressable by the user.  The user is given a
      non-dereferenceable handle in response to an allocation request.  That
      handle must be mapped, using zs_map_object(), which returns a pointer to
      the mapped region that can be used.  The mapping is necessary since the
      object data may reside in two different noncontiguous pages.
      
      Zsmalloc fulfills the allocation needs of zram perfectly.
      
      [sjenning@linux.vnet.ibm.com: borrow Seth's quote]
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Acked-by: Nitin Gupta <ngupta@vflare.org>
      Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Bob Liu <bob.liu@oracle.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Luigi Semenzato <semenzato@google.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      bcf1647d
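      The fragmentation argument in the zsmalloc commit above can be checked
      with a little arithmetic.  The helper below is a hypothetical sketch,
      not zsmalloc code: it counts the bytes left unused when equally sized
      objects are packed into a run of pages and are allowed to cross page
      boundaries inside that run.  The page size (4096), object size (2400,
      roughly 60% of a page) and run length (3 pages) used in the note after
      the code are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes left over when objects of obj_size bytes are packed back to back
 * into `pages` pages of page_size bytes each.  With pages == 1 this models
 * one object run per page; with pages > 1 it models objects that may span
 * page boundaries inside a multi-page run. */
static size_t wasted_bytes(size_t page_size, size_t pages, size_t obj_size)
{
    size_t total = page_size * pages;

    assert(obj_size > 0 && obj_size <= page_size);
    return total - (total / obj_size) * obj_size;
}
```

      With these numbers, a single 4096-byte page holds one 2400-byte object
      and leaves 1696 bytes unused, so three separate pages waste 5088
      bytes.  Treating the same three pages as one 12288-byte run holds five
      objects and leaves only 288 bytes unused.  How zsmalloc actually
      chooses the number of pages per size class is not covered by this
      sketch.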
    • Roman Gushchin's avatar
      kernel/smp.c: remove cpumask_ipi · 73f94550
      Roman Gushchin authored
      After commit 9a46ad6d ("smp: make smp_call_function_many() use logic
      similar to smp_call_function_single()"), cfd->cpumask is accessed only
      in smp_call_function_many().  So there is no more need to copy it into
      cfd->cpumask_ipi before putting csd into the list.  The cpumask_ipi
      field is obsolete and can be removed.
      Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Wang YanQing <udknight@gmail.com>
      Cc: Xie XiuQi <xiexiuqi@huawei.com>
      Cc: Shaohua Li <shli@fusionio.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      73f94550
    • Christoph Hellwig's avatar
      kernel: use lockless list for smp_call_function_single · 6897fc22
      Christoph Hellwig authored
      Make smp_call_function_single and friends more efficient by using a
      lockless list.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6897fc22
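      For readers unfamiliar with lockless lists, the sketch below shows the
      general technique the commit above relies on: producers push nodes
      with a compare-and-swap loop, and the consumer detaches the whole list
      with one atomic exchange.  It is a userspace analogy in C11, not the
      kernel's <linux/llist.h> implementation, and the names lhead, lnode,
      lhead_init, lpush and ltake_all are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct lnode {
    struct lnode *next;
    int value;
};

struct lhead {
    _Atomic(struct lnode *) first;
};

static void lhead_init(struct lhead *h)
{
    atomic_init(&h->first, (struct lnode *)NULL);
}

/* Push a node at the front without taking a lock.  Returns 1 if the list
 * was empty before the push, which a caller could use to decide whether
 * the consumer needs to be notified at all. */
static int lpush(struct lhead *h, struct lnode *n)
{
    struct lnode *old = atomic_load_explicit(&h->first, memory_order_relaxed);

    assert(n);
    do {
        n->next = old; /* on CAS failure, old is reloaded and we retry */
    } while (!atomic_compare_exchange_weak_explicit(&h->first, &old, n,
                                                    memory_order_release,
                                                    memory_order_relaxed));
    return old == NULL;
}

/* Detach every queued node in one step; the caller then owns the chain
 * and can walk it without further synchronization. */
static struct lnode *ltake_all(struct lhead *h)
{
    return atomic_exchange_explicit(&h->first, (struct lnode *)NULL,
                                    memory_order_acquire);
}
```

      Nodes come back in last-in, first-out order, so a consumer that needs
      submission order has to reverse the detached chain before processing.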
    • Levente Kurusa's avatar
      drivers/net/phy/mdio_bus.c: call put_device on device_register() failure · 0c692d07
      Levente Kurusa authored
      It is required to call put_device() if device_register() fails, so that
      we give up the last reference to the device.  Calling put_device allows
      for mdiobus_release to be executed, kfreeing the bus.
      Signed-off-by: Levente Kurusa <levex@linux.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Grant Likely <grant.likely@secretlab.ca>
      Cc: David Daney <david.daney@cavium.com>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0c692d07
    • Levente Kurusa's avatar
      drivers/video/backlight/lcd.c: call put_device if device_register fails · 54f5968d
      Levente Kurusa authored
      Currently we kfree the container of the device which failed to register.
      This is wrong as the last reference is not given up with a put_device
      call.  Also, now that put_device() is called, we no longer need the
      kfree as the new_ld->dev.release function will take care of kfreeing the
      associated memory.
      Signed-off-by: Levente Kurusa <levex@linux.com>
      Acked-by: Jingoo Han <jg1.han@samsung.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      54f5968d
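      The rule behind the two put_device() commits above can be shown with a
      small reference-counting model.  This is a userspace analogy only: the
      names struct dev, dev_register, dev_put, dev_create and released_count
      are hypothetical stand-ins, not the driver core API.  The assumption
      modeled here is the one the commit messages state: a failed
      registration still leaves a reference that must be dropped so that
      the release callback frees the object.

```c
#include <assert.h>
#include <stdlib.h>

struct dev {
    int refcount;
    void (*release)(struct dev *);
};

static int released_count; /* bookkeeping so the sketch can be checked */

/* The release callback is the only place that frees the object. */
static void dev_release(struct dev *d)
{
    released_count++;
    free(d);
}

/* Drop one reference; the last put triggers the release callback. */
static void dev_put(struct dev *d)
{
    assert(d->refcount > 0);
    if (--d->refcount == 0)
        d->release(d);
}

/* Stand-in for registration: it takes the initial reference first and may
 * then fail, leaving that reference for the caller to drop. */
static int dev_register(struct dev *d, int should_fail)
{
    d->refcount = 1;
    return should_fail ? -1 : 0;
}

static struct dev *dev_create(int should_fail)
{
    struct dev *d = calloc(1, sizeof(*d));

    if (!d)
        return NULL;
    d->release = dev_release;
    if (dev_register(d, should_fail) != 0) {
        dev_put(d); /* not free(d): let the release callback clean up */
        return NULL;
    }
    return d;
}
```

      Freeing the object directly on the error path would bypass the
      release callback, and also calling dev_put() afterwards would free it
      twice, which is why the second commit drops the explicit kfree.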
    • Yinghai Lu's avatar
      memblock, bootmem: restore goal for alloc_low · 07bacb38
      Yinghai Lu authored
      We now have memblock_virt_alloc_low to replace the original bootmem API
      in swiotlb.
      
      But we should not use BOOTMEM_LOW_LIMIT for architectures that do not
      support CONFIG_NOBOOTMEM, as the old API takes 0 as the goal.
      
      | #define alloc_bootmem_low(x) \
      |        __alloc_bootmem_low(x, SMP_CACHE_BYTES, 0)
      | #define alloc_bootmem_low_pages_nopanic(x) \
      |        __alloc_bootmem_low_nopanic(x, PAGE_SIZE, 0)
      
      and we have
       #define BOOTMEM_LOW_LIMIT __pa(MAX_DMA_ADDRESS)
      for CONFIG_NOBOOTMEM.
      
      Restore the goal to 0 to fix the ia64 crash that Tony found.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Reported-by: Tony Luck <tony.luck@gmail.com>
      Tested-by: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      07bacb38
  2. 30 Jan, 2014 3 commits
    • Linus Torvalds's avatar
      Merge branch 'for-3.14/drivers' of git://git.kernel.dk/linux-block · 53d8ab29
      Linus Torvalds authored
      Pull block IO driver changes from Jens Axboe:
      
       - bcache update from Kent Overstreet.
      
       - two bcache fixes from Nicholas Swenson.
      
       - cciss pci init error fix from Andrew.
      
       - underflow fix in the parallel IDE pg_write code from Dan Carpenter.
         I'm sure the 1 (or 0) users of that are now happy.
      
       - two PCI related fixes for sx8 from Jingoo Han.
      
       - floppy init fix for first block read from Jiri Kosina.
      
       - pktcdvd error return miss fix from Julia Lawall.
      
       - removal of IRQF_SHARED from the SEGA Dreamcast CD-ROM code from
         Michael Opdenacker.
      
       - comment typo fix for the loop driver from Olaf Hering.
      
       - potential oops fix for null_blk from Raghavendra K T.
      
       - two fixes from Sam Bradshaw (Micron) for the mtip32xx driver, fixing
         an OOM problem and a problem with handling security locked conditions
      
      * 'for-3.14/drivers' of git://git.kernel.dk/linux-block: (47 commits)
        mg_disk: Spelling s/finised/finished/
        null_blk: Null pointer deference problem in alloc_page_buffers
        mtip32xx: Correctly handle security locked condition
        mtip32xx: Make SGL container per-command to eliminate high order dma allocation
        drivers/block/loop.c: fix comment typo in loop_config_discard
        drivers/block/cciss.c:cciss_init_one(): use proper errnos
        drivers/block/paride/pg.c: underflow bug in pg_write()
        drivers/block/sx8.c: remove unnecessary pci_set_drvdata()
        drivers/block/sx8.c: use module_pci_driver()
        floppy: bail out in open() if drive is not responding to block0 read
        bcache: Fix auxiliary search trees for key size > cacheline size
        bcache: Don't return -EINTR when insert finished
        bcache: Improve bucket_prio() calculation
        bcache: Add bch_bkey_equal_header()
        bcache: update bch_bkey_try_merge
        bcache: Move insert_fixup() to btree_keys_ops
        bcache: Convert sorting to btree_keys
        bcache: Convert debug code to btree_keys
        bcache: Convert btree_iter to struct btree_keys
        bcache: Refactor bset_tree sysfs stats
        ...
      53d8ab29
    • Linus Torvalds's avatar
      Merge branch 'for-3.14/core' of git://git.kernel.dk/linux-block · f568849e
      Linus Torvalds authored
      Pull core block IO changes from Jens Axboe:
       "The major piece in here is the immutable bio_vec series from Kent, the
        rest is fairly minor.  It was supposed to go in last round, but
        various issues pushed it to this release instead.  The pull request
        contains:
      
         - Various smaller blk-mq fixes from different folks.  Nothing major
           here, just minor fixes and cleanups.
      
         - Fix for a memory leak in the error path in the block ioctl code
           from Christian Engelmayer.
      
         - Header export fix from CaiZhiyong.
      
         - Finally the immutable biovec changes from Kent Overstreet.  This
           enables some nice future work on making arbitrarily sized bios
           possible, and splitting more efficient.  Related fixes to immutable
           bio_vecs:
      
              - dm-cache immutable fixup from Mike Snitzer.
              - btrfs immutable fixup from Muthu Kumar.
      
         - bio-integrity fix from Nic Bellinger, which is also going to stable"
      
      * 'for-3.14/core' of git://git.kernel.dk/linux-block: (44 commits)
        xtensa: fixup simdisk driver to work with immutable bio_vecs
        block/blk-mq-cpu.c: use hotcpu_notifier()
        blk-mq: for_each_* macro correctness
        block: Fix memory leak in rw_copy_check_uvector() handling
        bio-integrity: Fix bio_integrity_verify segment start bug
        block: remove unrelated header files and export symbol
        blk-mq: uses page->list incorrectly
        blk-mq: use __smp_call_function_single directly
        btrfs: fix missing increment of bi_remaining
        Revert "block: Warn and free bio if bi_end_io is not set"
        block: Warn and free bio if bi_end_io is not set
        blk-mq: fix initializing request's start time
        block: blk-mq: don't export blk_mq_free_queue()
        block: blk-mq: make blk_sync_queue support mq
        block: blk-mq: support draining mq queue
        dm cache: increment bi_remaining when bi_end_io is restored
        block: fixup for generic bio chaining
        block: Really silence spurious compiler warnings
        block: Silence spurious compiler warnings
        block: Kill bio_pair_split()
        ...
      f568849e
    • Linus Torvalds's avatar
      Merge branch 'for-3.14' of git://linux-nfs.org/~bfields/linux · d9894c22
      Linus Torvalds authored
      Pull nfsd updates from Bruce Fields:
       - Handle some loose ends from the vfs read delegation support.
         (For example, nfsd can stop breaking leases on its own in a
          few places where it can now depend on the vfs to.)
       - Make life a little easier for NFSv4-only configurations
         (thanks to Kinglong Mee).
       - Fix some gss-proxy problems (thanks Jeff Layton).
       - miscellaneous bug fixes and cleanup
      
      * 'for-3.14' of git://linux-nfs.org/~bfields/linux: (38 commits)
        nfsd: consider CLAIM_FH when handing out delegation
        nfsd4: fix delegation-unlink/rename race
        nfsd4: delay setting current_fh in open
        nfsd4: minor nfs4_setlease cleanup
        gss_krb5: use lcm from kernel lib
        nfsd4: decrease nfsd4_encode_fattr stack usage
        nfsd: fix encode_entryplus_baggage stack usage
        nfsd4: simplify xdr encoding of nfsv4 names
        nfsd4: encode_rdattr_error cleanup
        nfsd4: nfsd4_encode_fattr cleanup
        minor svcauth_gss.c cleanup
        nfsd4: better VERIFY comment
        nfsd4: break only delegations when appropriate
        NFSD: Fix a memory leak in nfsd4_create_session
        sunrpc: get rid of use_gssp_lock
        sunrpc: fix potential race between setting use_gss_proxy and the upcall rpc_clnt
        sunrpc: don't wait for write before allowing reads from use-gss-proxy file
        nfsd: get rid of unused function definition
        Define op_iattr for nfsd4_open instead using macro
        NFSD: fix compile warning without CONFIG_NFSD_V3
        ...
      d9894c22