1. 22 Feb, 2014 4 commits
    • Rafael Aquini's avatar
      mm: fix page leak at nfs_symlink() · b0306534
      Rafael Aquini authored
      commit a0b54add upstream.
      
      Changes in commit a0b8cab3 ("mm: remove lru parameter from
      __pagevec_lru_add and remove parts of pagevec API") have introduced a
      call to add_to_page_cache_lru() which causes a leak in nfs_symlink() as
      now the page gets an extra refcount that is not dropped.
      
      Jan Stancek observed and reported the leak effect while running test8
      from Connectathon Testsuite.  After several iterations over the test
      case, which creates several symlinks on a NFS mountpoint, the test
      system was quickly getting into an out-of-memory scenario.
      
      This patch fixes the page leak by dropping that extra refcount
      add_to_page_cache_lru() is grabbing.
      Signed-off-by: default avatarJan Stancek <jstancek@redhat.com>
      Signed-off-by: default avatarRafael Aquini <aquini@redhat.com>
      Acked-by: default avatarMel Gorman <mgorman@suse.de>
      Acked-by: default avatarRik van Riel <riel@redhat.com>
      Cc: Jeff Layton <jlayton@redhat.com>
      Cc: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b0306534
    • Eric W. Biederman's avatar
      fs/file.c:fdtable: avoid triggering OOMs from alloc_fdmem · e69755bb
      Eric W. Biederman authored
      commit 96c7a2ff upstream.
      
      Recently due to a spike in connections per second memcached on 3
      separate boxes triggered the OOM killer from accept.  At the time the
      OOM killer was triggered there was 4GB out of 36GB free in zone 1.  The
      problem was that alloc_fdtable was allocating an order 3 page (32KiB) to
      hold a bitmap, and there was sufficient fragmentation that the largest
      page available was 8KiB.
      
      I find the logic that PAGE_ALLOC_COSTLY_ORDER can't fail pretty dubious
      but I do agree that order 3 allocations are very likely to succeed.
      
      There are always pathologies where order > 0 allocations can fail when
      there are copious amounts of free memory available.  Using the pigeon
      hole principle it is easy to show that it requires 1 page more than 50%
      of the pages being free to guarantee an order 1 (8KiB) allocation will
      succeed, 1 page more than 75% of the pages being free to guarantee an
      order 2 (16KiB) allocation will succeed and 1 page more than 87.5% of
      the pages being free to guarantee an order 3 allocate will succeed.
      
      A server churning memory with a lot of small requests and replies like
      memcached is a common case that if anything can will skew the odds
      against large pages being available.
      
      Therefore let's not give external applications a practical way to kill
      linux server applications, and specify __GFP_NORETRY to the kmalloc in
      alloc_fdmem.  Unless I am misreading the code and by the time the code
      reaches should_alloc_retry in __alloc_pages_slowpath (where
      __GFP_NORETRY becomes signification).  We have already tried everything
      reasonable to allocate a page and the only thing left to do is wait.  So
      not waiting and falling back to vmalloc immediately seems like the
      reasonable thing to do even if there wasn't a chance of triggering the
      OOM killer.
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Acked-by: default avatarDavid Rientjes <rientjes@google.com>
      Cc: Cong Wang <cwang@twopensource.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e69755bb
    • David Vrabel's avatar
      xen-blkfront: handle backend CLOSED without CLOSING · e998c107
      David Vrabel authored
      commit 36613717 upstream.
      
      Backend drivers shouldn't transistion to CLOSED unless the frontend is
      CLOSED.  If a backend does transition to CLOSED too soon then the
      frontend may not see the CLOSING state and will not properly shutdown.
      
      So, treat an unexpected backend CLOSED state the same as CLOSING.
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Acked-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e998c107
    • Mel Gorman's avatar
      xen: properly account for _PAGE_NUMA during xen pte translations · 1160ce4b
      Mel Gorman authored
      commit a9c8e4be upstream.
      
      Steven Noonan forwarded a users report where they had a problem starting
      vsftpd on a Xen paravirtualized guest, with this in dmesg:
      
        BUG: Bad page map in process vsftpd  pte:8000000493b88165 pmd:e9cc01067
        page:ffffea00124ee200 count:0 mapcount:-1 mapping:     (null) index:0x0
        page flags: 0x2ffc0000000014(referenced|dirty)
        addr:00007f97eea74000 vm_flags:00100071 anon_vma:ffff880e98f80380 mapping:          (null) index:7f97eea74
        CPU: 4 PID: 587 Comm: vsftpd Not tainted 3.12.7-1-ec2 #1
        Call Trace:
          dump_stack+0x45/0x56
          print_bad_pte+0x22e/0x250
          unmap_single_vma+0x583/0x890
          unmap_vmas+0x65/0x90
          exit_mmap+0xc5/0x170
          mmput+0x65/0x100
          do_exit+0x393/0x9e0
          do_group_exit+0xcc/0x140
          SyS_exit_group+0x14/0x20
          system_call_fastpath+0x1a/0x1f
        Disabling lock debugging due to kernel taint
        BUG: Bad rss-counter state mm:ffff880e9ca60580 idx:0 val:-1
        BUG: Bad rss-counter state mm:ffff880e9ca60580 idx:1 val:1
      
      The issue could not be reproduced under an HVM instance with the same
      kernel, so it appears to be exclusive to paravirtual Xen guests.  He
      bisected the problem to commit 1667918b ("mm: numa: clear numa
      hinting information on mprotect") that was also included in 3.12-stable.
      
      The problem was related to how xen translates ptes because it was not
      accounting for the _PAGE_NUMA bit.  This patch splits pte_present to add
      a pteval_present helper for use by xen so both bare metal and xen use
      the same code when checking if a PTE is present.
      
      [mgorman@suse.de: wrote changelog, proposed minor modifications]
      [akpm@linux-foundation.org: fix typo in comment]
      Reported-by: default avatarSteven Noonan <steven@uplinklabs.net>
      Tested-by: default avatarSteven Noonan <steven@uplinklabs.net>
      Signed-off-by: default avatarElena Ufimtseva <ufimtseva@gmail.com>
      Signed-off-by: default avatarMel Gorman <mgorman@suse.de>
      Reviewed-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Acked-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1160ce4b
  2. 20 Feb, 2014 32 commits
  3. 13 Feb, 2014 4 commits