1. 20 May, 2015 14 commits
  2. 16 May, 2015 19 commits
  3. 15 May, 2015 7 commits
    • Leon Yu's avatar
      mm: fix anon_vma->degree underflow in anon_vma endless growing prevention · 2b4396f4
      Leon Yu authored
      commit 3fe89b3e upstream.
      
      I have constantly stumbled upon "kernel BUG at mm/rmap.c:399!" after
      upgrading to 3.19 and had no luck with 4.0-rc1 neither.
      
      So, after looking into new logic introduced by commit 7a3ef208 ("mm:
      prevent endless growth of anon_vma hierarchy"), I found chances are that
      unlink_anon_vmas() is called without incrementing dst->anon_vma->degree
      in anon_vma_clone() due to allocation failure.  If dst->anon_vma is not
      NULL in error path, its degree will be incorrectly decremented in
      unlink_anon_vmas() and eventually underflow when exiting as a result of
      another call to unlink_anon_vmas().  That's how "kernel BUG at
      mm/rmap.c:399!" is triggered for me.
      
      This patch fixes the underflow by dropping dst->anon_vma when allocation
      fails.  It's safe to do so regardless of original value of dst->anon_vma
      because dst->anon_vma doesn't have valid meaning if anon_vma_clone()
      fails.  Besides, callers don't care dst->anon_vma in such case neither.
      
      Also suggested by Michal Hocko, we can clean up vma_adjust() a bit as
      anon_vma_clone() now does the work.
      
      [akpm@linux-foundation.org: tweak comment]
      Fixes: 7a3ef208 ("mm: prevent endless growth of anon_vma hierarchy")
      Signed-off-by: default avatarLeon Yu <chianglungyu@gmail.com>
      Signed-off-by: default avatarKonstantin Khlebnikov <koct9i@gmail.com>
      Reviewed-by: default avatarMichal Hocko <mhocko@suse.cz>
      Acked-by: default avatarRik van Riel <riel@redhat.com>
      Acked-by: default avatarDavid Rientjes <rientjes@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      2b4396f4
    • Konstantin Khlebnikov's avatar
      mm: fix corner case in anon_vma endless growing prevention · 4cd01168
      Konstantin Khlebnikov authored
      commit b800c91a upstream.
      
      Fix for BUG_ON(anon_vma->degree) splashes in unlink_anon_vmas() ("kernel
      BUG at mm/rmap.c:399!") caused by commit 7a3ef208 ("mm: prevent
      endless growth of anon_vma hierarchy")
      
      Anon_vma_clone() is usually called for a copy of source vma in
      destination argument.  If source vma has anon_vma it should be already
      in dst->anon_vma.  NULL in dst->anon_vma is used as a sign that it's
      called from anon_vma_fork().  In this case anon_vma_clone() finds
      anon_vma for reusing.
      
      Vma_adjust() calls it differently and this breaks anon_vma reusing
      logic: anon_vma_clone() links vma to old anon_vma and updates degree
      counters but vma_adjust() overrides vma->anon_vma right after that.  As
      a result final unlink_anon_vmas() decrements degree for wrong anon_vma.
      
      This patch assigns ->anon_vma before calling anon_vma_clone().
      Signed-off-by: default avatarKonstantin Khlebnikov <koct9i@gmail.com>
      Reported-and-tested-by: default avatarChris Clayton <chris2553@googlemail.com>
      Reported-and-tested-by: default avatarOded Gabbay <oded.gabbay@amd.com>
      Reported-and-tested-by: default avatarChih-Wei Huang <cwhuang@android-x86.org>
      Acked-by: default avatarRik van Riel <riel@redhat.com>
      Acked-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Cc: Daniel Forrest <dan.forrest@ssec.wisc.edu>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: stable@vger.kernel.org  # to match back-porting of 7a3ef208Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      4cd01168
    • Konstantin Khlebnikov's avatar
      mm: prevent endless growth of anon_vma hierarchy · f3c423eb
      Konstantin Khlebnikov authored
      commit 7a3ef208 upstream.
      
      Constantly forking task causes unlimited grow of anon_vma chain.  Each
      next child allocates new level of anon_vmas and links vma to all
      previous levels because pages might be inherited from any level.
      
      This patch adds heuristic which decides to reuse existing anon_vma
      instead of forking new one.  It adds counter anon_vma->degree which
      counts linked vmas and directly descending anon_vmas and reuses anon_vma
      if counter is lower than two.  As a result each anon_vma has either vma
      or at least two descending anon_vmas.  In such trees half of nodes are
      leafs with alive vmas, thus count of anon_vmas is no more than two times
      bigger than count of vmas.
      
      This heuristic reuses anon_vmas as few as possible because each reuse
      adds false aliasing among vmas and rmap walker ought to scan more ptes
      when it searches where page is might be mapped.
      
      Link: http://lkml.kernel.org/r/20120816024610.GA5350@evergreen.ssec.wisc.edu
      Fixes: 5beb4930 ("mm: change anon_vma linking to fix multi-process server scalability issue")
      [akpm@linux-foundation.org: fix typo, per Rik]
      Signed-off-by: default avatarKonstantin Khlebnikov <koct9i@gmail.com>
      Reported-by: default avatarDaniel Forrest <dan.forrest@ssec.wisc.edu>
      Tested-by: default avatarMichal Hocko <mhocko@suse.cz>
      Tested-by: default avatarJerome Marchand <jmarchan@redhat.com>
      Reviewed-by: default avatarMichal Hocko <mhocko@suse.cz>
      Reviewed-by: default avatarRik van Riel <riel@redhat.com>
      Cc: <stable@vger.kernel.org>	[2.6.34+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      f3c423eb
    • Heiko Carstens's avatar
      fs/seq_file: fallback to vmalloc allocation · 77167581
      Heiko Carstens authored
      commit 058504ed upstream.
      
      There are a couple of seq_files which use the single_open() interface.
      This interface requires that the whole output must fit into a single
      buffer.
      
      E.g.  for /proc/stat allocation failures have been observed because an
      order-4 memory allocation failed due to memory fragmentation.  In such
      situations reading /proc/stat is not possible anymore.
      
      Therefore change the seq_file code to fallback to vmalloc allocations
      which will usually result in a couple of order-0 allocations and hence
      also work if memory is fragmented.
      
      For reference a call trace where reading from /proc/stat failed:
      
        sadc: page allocation failure: order:4, mode:0x1040d0
        CPU: 1 PID: 192063 Comm: sadc Not tainted 3.10.0-123.el7.s390x #1
        [...]
        Call Trace:
          show_stack+0x6c/0xe8
          warn_alloc_failed+0xd6/0x138
          __alloc_pages_nodemask+0x9da/0xb68
          __get_free_pages+0x2e/0x58
          kmalloc_order_trace+0x44/0xc0
          stat_open+0x5a/0xd8
          proc_reg_open+0x8a/0x140
          do_dentry_open+0x1bc/0x2c8
          finish_open+0x46/0x60
          do_last+0x382/0x10d0
          path_openat+0xc8/0x4f8
          do_filp_open+0x46/0xa8
          do_sys_open+0x114/0x1f0
          sysc_tracego+0x14/0x1a
      Signed-off-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Tested-by: default avatarDavid Rientjes <rientjes@google.com>
      Cc: Ian Kent <raven@themaw.net>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Thorsten Diehl <thorsten.diehl@de.ibm.com>
      Cc: Andrea Righi <andrea@betterlinux.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Stefan Bader <stefan.bader@canonical.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      77167581
    • Al Viro's avatar
      seq_file: always clear m->count when we free m->buf · bb1317a3
      Al Viro authored
      commit 801a7605 upstream.
      
      Once we'd freed m->buf, m->count should become zero - we have no valid
      contents reachable via m->buf.
      Reported-by: default avatarCharley (Hao Chuan) Chu <charley.chu@broadcom.com>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      bb1317a3
    • Heiko Carstens's avatar
      /proc/stat: convert to single_open_size() · a58d83a1
      Heiko Carstens authored
      commit f74373a5 upstream.
      
      These two patches are supposed to "fix" failed order-4 memory
      allocations which have been observed when reading /proc/stat.  The
      problem has been observed on s390 as well as on x86.
      
      To address the problem change the seq_file memory allocations to
      fallback to use vmalloc, so that allocations also work if memory is
      fragmented.
      
      This approach seems to be simpler and less intrusive than changing
      /proc/stat to use an interator.  Also it "fixes" other users as well,
      which use seq_file's single_open() interface.
      
      This patch (of 2):
      
      Use seq_file's single_open_size() to preallocate a buffer that is large
      enough to hold the whole output, instead of open coding it.  Also
      calculate the requested size using the number of online cpus instead of
      possible cpus, since the size of the output only depends on the number
      of online cpus.
      Signed-off-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Acked-by: default avatarDavid Rientjes <rientjes@google.com>
      Cc: Ian Kent <raven@themaw.net>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Thorsten Diehl <thorsten.diehl@de.ibm.com>
      Cc: Andrea Righi <andrea@betterlinux.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Stefan Bader <stefan.bader@canonical.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      a58d83a1
    • Vineet Gupta's avatar
      ARC: signal handling robustify · dcaf4bff
      Vineet Gupta authored
      commit e4140819 upstream.
      
      A malicious signal handler / restorer can DOS the system by fudging the
      user regs saved on stack, causing weird things such as sigreturn returning
      to user mode PC but cpu state still being kernel mode....
      
      Ensure that in sigreturn path status32 always has U bit; any other bogosity
      (gargbage PC etc) will be taken care of by normal user mode exceptions mechanisms.
      
      Reproducer signal handler:
      
          void handle_sig(int signo, siginfo_t *info, void *context)
          {
      	ucontext_t *uc = context;
      	struct user_regs_struct *regs = &(uc->uc_mcontext.regs);
      
      	regs->scratch.status32 = 0;
          }
      
      Before the fix, kernel would go off to weeds like below:
      
          --------->8-----------
          [ARCLinux]$ ./signal-test
          Path: /signal-test
          CPU: 0 PID: 61 Comm: signal-test Not tainted 4.0.0-rc5+ #65
          task: 8f177880 ti: 5ffe6000 task.ti: 8f15c000
      
          [ECR   ]: 0x00220200 => Invalid Write @ 0x00000010 by insn @ 0x00010698
          [EFA   ]: 0x00000010
          [BLINK ]: 0x2007c1ee
          [ERET  ]: 0x10698
          [STAT32]: 0x00000000 :                                   <--------
          BTA: 0x00010680	 SP: 0x5ffe7e48	 FP: 0x00000000
          LPS: 0x20003c6c	LPE: 0x20003c70	LPC: 0x00000000
          ...
          --------->8-----------
      Reported-by: default avatarAlexey Brodkin <abrodkin@synopsys.com>
      Signed-off-by: default avatarVineet Gupta <vgupta@synopsys.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      dcaf4bff