1. 27 Jan, 2015 31 commits
  2. 16 Jan, 2015 9 commits
    • Linux 3.18.3 · 219b188d
      Greg Kroah-Hartman authored
    • mm: Don't count the stack guard page towards RLIMIT_STACK · f2f5d44b
      Linus Torvalds authored
      commit 690eac53 upstream.
      
      Commit fee7e49d ("mm: propagate error from stack expansion even for
      guard page") made sure that we return the error properly for stack
      growth conditions.  It also theorized that counting the guard page
      towards the stack limit might break something, but also said "Let's see
      if anybody notices".
      
      Somebody did notice.  Apparently android-x86 sets the stack limit very
      close to the limit indeed, and including the guard page in the rlimit
      check causes the android 'zygote' process problems.
      
      So this adds the (fairly trivial) code to make the stack rlimit check be
      against the actual real stack size, rather than the size of the vma that
      includes the guard page.
      Reported-and-tested-by: Chih-Wei Huang <cwhuang@android-x86.org>
      Cc: Jay Foad <jay.foad@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
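
The rlimit comparison described in this commit can be modelled in userspace. The following is a minimal sketch, not the kernel code: the function name, the `exclude_guard_page` switch and the fixed 4096-byte page size are assumptions made for illustration.

```c
#include <stdbool.h>

#define MODEL_PAGE_SIZE 4096UL /* assumed page size for this sketch */

/*
 * Hypothetical model of the stack rlimit test. 'vma_size' is the size of
 * the stack mapping including its guard page, 'limit' is RLIMIT_STACK.
 */
bool stack_growth_allowed(unsigned long vma_size, unsigned long limit,
                          bool exclude_guard_page)
{
    unsigned long actual_size = vma_size;

    /* Discount the guard page so only the usable stack is compared. */
    if (exclude_guard_page && actual_size >= MODEL_PAGE_SIZE)
        actual_size -= MODEL_PAGE_SIZE;

    return actual_size <= limit;
}
```

With a limit of 8 pages and a mapping of 9 pages (8 usable plus the guard page), the check passes only when the guard page is excluded, which matches the situation reported for a limit set very close to the real stack size.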
    • mm: propagate error from stack expansion even for guard page · c03aed64
      Linus Torvalds authored
      commit fee7e49d upstream.
      
      Jay Foad reports that the address sanitizer test (asan) sometimes gets
      confused by a stack pointer that ends up being outside the stack vma
      that is reported by /proc/maps.
      
      This happens due to an interaction between RLIMIT_STACK and the guard
      page: when we do the guard page check, we ignore the potential error
      from the stack expansion, which effectively results in a missing guard
      page, since the expected stack expansion won't have been done.
      
      And since /proc/maps explicitly ignores the guard page (commit
      d7824370: "mm: fix up some user-visible effects of the stack guard
      page"), the stack pointer ends up being outside the reported stack area.
      
      This is the minimal patch: it just propagates the error.  It also
      effectively makes the guard page part of the stack limit, which in turn
      means that the actual real stack is one page less than the stack limit.
      
      Let's see if anybody notices.  We could teach acct_stack_growth() to
      allow an extra page for a grow-up/grow-down stack in the rlimit test,
      but I don't want to add more complexity if it isn't needed.
      Reported-and-tested-by: Jay Foad <jay.foad@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mm, vmscan: prevent kswapd livelock due to pfmemalloc-throttled process being killed · 53bcf5c3
      Vlastimil Babka authored
      commit 9e5e3661 upstream.
      
      Charles Shirron and Paul Cassella from Cray Inc have reported kswapd
      stuck in a busy loop with nothing left to balance, but
      kswapd_try_to_sleep() failing to sleep.  Their analysis found the cause
      to be a combination of several factors:
      
      1. A process is waiting in throttle_direct_reclaim() on pgdat->pfmemalloc_wait
      
      2. The process has been killed (by OOM in this case), but has not yet been
         scheduled to remove itself from the waitqueue and die.
      
      3. kswapd checks for throttled processes in prepare_kswapd_sleep():
      
              if (waitqueue_active(&pgdat->pfmemalloc_wait)) {
                      wake_up(&pgdat->pfmemalloc_wait);
                      return false; // kswapd will not go to sleep
              }
      
         However, for a process that was already killed, wake_up() does not remove
         the process from the waitqueue, since try_to_wake_up() checks its state
         first and returns false when the process is no longer waiting.
      
      4. kswapd is running on the same CPU as the only CPU that the process is
         allowed to run on (through cpus_allowed, or possibly single-cpu system).
      
      5. CONFIG_PREEMPT_NONE=y kernel is used. If there's nothing to balance, kswapd
         encounters no voluntary preemption points and repeatedly fails
         prepare_kswapd_sleep(), blocking the process from running and removing
         itself from the waitqueue, which would let kswapd sleep.
      
      So, the source of the problem is that we prevent kswapd from going to
      sleep as long as there are processes waiting on the pfmemalloc_wait queue,
      and a process waiting on a queue is guaranteed to be removed from the
      queue only when it gets scheduled.  This was done to make sure that no
      process is left sleeping on pfmemalloc_wait when kswapd itself goes to
      sleep.
      
      However, it isn't necessary to postpone kswapd sleep until the
      pfmemalloc_wait queue actually empties.  To prevent processes from being
      left sleeping, it's actually enough to guarantee that all processes
      waiting on pfmemalloc_wait queue have been woken up by the time we put
      kswapd to sleep.
      
      This patch therefore fixes this issue by substituting 'wake_up' with
      'wake_up_all' and removing 'return false' in the code snippet from
      prepare_kswapd_sleep() above.  Note that if any process puts itself in
      the queue after this waitqueue_active() check, or after the wake up
      itself, it means that the process will also wake up kswapd - and since
      we are under prepare_to_wait(), the wake up won't be missed.  Also we
      update the comment in prepare_kswapd_sleep() to hopefully more clearly
      describe the races it is preventing.
      
      Fixes: 5515061d ("mm: throttle direct reclaimers if PF_MEMALLOC reserves are low and swap is backed by network storage")
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: Rik van Riel <riel@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
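
The difference between the old and new sleep condition can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the kernel implementation: `struct waiter` and both function names are hypothetical, and the zone balance checks that the real prepare_kswapd_sleep() performs afterwards are left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one entry on the pfmemalloc_wait queue. */
struct waiter {
    bool queued;   /* still linked on the queue until the task runs */
    bool sleeping; /* false once the task was killed or woken */
};

/* Old condition: refuse to sleep while any entry is still queued. */
bool prepare_sleep_old(const struct waiter *w, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (w[i].queued)
            return false;
    return true;
}

/*
 * New condition: wake every queued sleeper (models wake_up_all()) and
 * then allow the sleep. Entries stay queued until their task is
 * scheduled, but that no longer blocks the caller.
 */
bool prepare_sleep_new(struct waiter *w, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (w[i].queued)
            w[i].sleeping = false;
    return true;
}
```

With a queue holding one already-killed task that never gets CPU time, the old condition stays false on every call, while the new one wakes all remaining sleepers and lets the caller proceed.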
    • mm: protect set_page_dirty() from ongoing truncation · a78e877e
      Johannes Weiner authored
      commit 2d6d7f98 upstream.
      
      Tejun, while reviewing the code, spotted the following race condition
      between the dirtying and truncation of a page:
      
      __set_page_dirty_nobuffers()       __delete_from_page_cache()
        if (TestSetPageDirty(page))
                                         page->mapping = NULL
                                         if (PageDirty())
                                           dec_zone_page_state(page, NR_FILE_DIRTY);
                                           dec_bdi_stat(mapping->backing_dev_info, BDI_RECLAIMABLE);
          if (page->mapping)
            account_page_dirtied(page)
              __inc_zone_page_state(page, NR_FILE_DIRTY);
              __inc_bdi_stat(mapping->backing_dev_info, BDI_RECLAIMABLE);
      
      which results in an imbalance of NR_FILE_DIRTY and BDI_RECLAIMABLE.
      
      Dirtiers usually lock out truncation, either by holding the page lock
      directly, or in case of zap_pte_range(), by pinning the mapcount with
      the page table lock held.  The notable exception to this rule, though,
      is do_wp_page(), for which this race exists.  However, do_wp_page()
      already waits for a locked page to unlock before setting the dirty bit,
      in order to prevent a race where clear_page_dirty() misses the page bit
      in the presence of dirty ptes.  Upgrade that wait to a fully locked
      set_page_dirty() to also cover the situation explained above.
      
      Afterwards, the code in set_page_dirty() dealing with a truncation race
      is no longer needed.  Remove it.
      Reported-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
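
The interleaving in the diagram of this commit can be replayed step by step in a single-threaded model. This is only a sketch that reads the diagram literally: `struct model_page`, the local counter and both function names are hypothetical, and only the NR_FILE_DIRTY side of the accounting is tracked.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the page state that the diagram touches. */
struct model_page {
    bool dirty;
    bool has_mapping;
};

/* Replays the racy order from the diagram; returns the dirty counter. */
long replay_unlocked(void)
{
    struct model_page page = { .dirty = false, .has_mapping = true };
    long nr_file_dirty = 0;

    page.dirty = true;         /* dirtier: TestSetPageDirty() */
    page.has_mapping = false;  /* truncation: page->mapping = NULL */
    if (page.dirty)
        nr_file_dirty--;       /* truncation: decrement for a dirty page */
    if (page.has_mapping)
        nr_file_dirty++;       /* dirtier: skipped, the mapping is gone */

    return nr_file_dirty;
}

/* Same steps with the dirtier finishing before truncation starts. */
long replay_locked(void)
{
    struct model_page page = { .dirty = false, .has_mapping = true };
    long nr_file_dirty = 0;

    page.dirty = true;         /* dirtier, holding the page lock */
    if (page.has_mapping)
        nr_file_dirty++;
    page.has_mapping = false;  /* truncation runs afterwards */
    if (page.dirty)
        nr_file_dirty--;

    return nr_file_dirty;
}
```

In this model the unlocked replay ends with a decrement that has no matching increment, while the serialized replay leaves the counter balanced.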
    • exit: fix race between wait_consider_task() and wait_task_zombie() · d73437ad
      Oleg Nesterov authored
      commit 3245d6ac upstream.
      
      wait_consider_task() checks EXIT_ZOMBIE after EXIT_DEAD/EXIT_TRACE, and
      both checks can fail if we race with an EXIT_ZOMBIE -> EXIT_DEAD/EXIT_TRACE
      change in between, because gcc needs to reload p->exit_state after
      security_task_wait().  In this case ->notask_error will be wrongly
      cleared and do_wait() can hang forever if it was the last eligible
      child.
      
      Many thanks to Arne who carefully investigated the problem.
      
      Note: this bug is very old but it was purely theoretical until commit
      b3ab0316 ("wait: completely ignore the EXIT_DEAD tasks").  Before
      this commit "-O2" was probably enough to guarantee that the compiler
      won't read ->exit_state twice.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Reported-by: Arne Goedeke <el@laramies.com>
      Tested-by: Arne Goedeke <el@laramies.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
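
One common way to keep two checks from seeing different values of a concurrently changing field is to load it once into a local variable and test only that copy. The sketch below illustrates the idea in userspace; the macro, the enum values, the struct and the function are all hypothetical stand-ins, and it is not claimed to be the exact change made by this commit.

```c
/* Assumed stand-ins for the exit states discussed above. */
enum { MODEL_EXIT_DEAD = 1, MODEL_EXIT_ZOMBIE = 2 };

enum verdict { VERDICT_IGNORE, VERDICT_REAP, VERDICT_RUNNING };

struct model_task {
    int exit_state; /* may be changed by another thread in real code */
};

/* Forces exactly one load of an int through a volatile access. */
#define MODEL_READ_ONCE(x) (*(volatile int *)&(x))

enum verdict consider_task(struct model_task *p)
{
    /* Single snapshot: every test below sees the same value. */
    int exit_state = MODEL_READ_ONCE(p->exit_state);

    if (exit_state == MODEL_EXIT_DEAD)
        return VERDICT_IGNORE;
    if (exit_state == MODEL_EXIT_ZOMBIE)
        return VERDICT_REAP;
    return VERDICT_RUNNING;
}
```

Because both comparisons use the local copy, a state change between them cannot make both fail for a task that was a zombie when the function started.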
    • mmc: sdhci: Fix sleep in atomic after inserting SD card · 0324896e
      Krzysztof Kozlowski authored
      commit 2836766a upstream.
      
      Sleep in atomic context happened on Trats2 board after inserting or
      removing SD card because mmc_gpio_get_cd() was called under spin lock.
      
      Fix this by moving card detection earlier, before acquiring spin lock.
      The mmc_gpio_get_cd() call does not have to be protected by spin lock
      because it does not access any sdhci internal data.
      The sdhci_do_get_cd() call accesses host flags (SDHCI_DEVICE_DEAD). After
      moving it outside of the spin lock it could theoretically race with driver
      removal, but there is still no actual protection against manual card
      eject.
      
      Dmesg after inserting SD card:
      [   41.663414] BUG: sleeping function called from invalid context at drivers/gpio/gpiolib.c:1511
      [   41.670469] in_atomic(): 1, irqs_disabled(): 128, pid: 30, name: kworker/u8:1
      [   41.677580] INFO: lockdep is turned off.
      [   41.681486] irq event stamp: 61972
      [   41.684872] hardirqs last  enabled at (61971): [<c0490ee0>] _raw_spin_unlock_irq+0x24/0x5c
      [   41.693118] hardirqs last disabled at (61972): [<c04907ac>] _raw_spin_lock_irq+0x18/0x54
      [   41.701190] softirqs last  enabled at (61648): [<c0026fd4>] __do_softirq+0x234/0x2c8
      [   41.708914] softirqs last disabled at (61631): [<c00273a0>] irq_exit+0xd0/0x114
      [   41.716206] Preemption disabled at:[<  (null)>]   (null)
      [   41.721500]
      [   41.722985] CPU: 3 PID: 30 Comm: kworker/u8:1 Tainted: G        W      3.18.0-rc5-next-20141121 #883
      [   41.732111] Workqueue: kmmcd mmc_rescan
      [   41.735945] [<c0014d2c>] (unwind_backtrace) from [<c0011c80>] (show_stack+0x10/0x14)
      [   41.743661] [<c0011c80>] (show_stack) from [<c0489d14>] (dump_stack+0x70/0xbc)
      [   41.750867] [<c0489d14>] (dump_stack) from [<c0228b74>] (gpiod_get_raw_value_cansleep+0x18/0x30)
      [   41.759628] [<c0228b74>] (gpiod_get_raw_value_cansleep) from [<c03646e8>] (mmc_gpio_get_cd+0x38/0x58)
      [   41.768821] [<c03646e8>] (mmc_gpio_get_cd) from [<c036d378>] (sdhci_request+0x50/0x1a4)
      [   41.776808] [<c036d378>] (sdhci_request) from [<c0357934>] (mmc_start_request+0x138/0x268)
      [   41.785051] [<c0357934>] (mmc_start_request) from [<c0357cc8>] (mmc_wait_for_req+0x58/0x1a0)
      [   41.793469] [<c0357cc8>] (mmc_wait_for_req) from [<c0357e68>] (mmc_wait_for_cmd+0x58/0x78)
      [   41.801714] [<c0357e68>] (mmc_wait_for_cmd) from [<c0361c00>] (mmc_io_rw_direct_host+0x98/0x124)
      [   41.810480] [<c0361c00>] (mmc_io_rw_direct_host) from [<c03620f8>] (sdio_reset+0x2c/0x64)
      [   41.818641] [<c03620f8>] (sdio_reset) from [<c035a3d8>] (mmc_rescan+0x254/0x2e4)
      [   41.826028] [<c035a3d8>] (mmc_rescan) from [<c003a0e0>] (process_one_work+0x180/0x3f4)
      [   41.833920] [<c003a0e0>] (process_one_work) from [<c003a3bc>] (worker_thread+0x34/0x4b0)
      [   41.841991] [<c003a3bc>] (worker_thread) from [<c003fed8>] (kthread+0xe4/0x104)
      [   41.849285] [<c003fed8>] (kthread) from [<c000f268>] (ret_from_fork+0x14/0x2c)
      [   42.038276] mmc0: new high speed SDHC card at address 1234
      Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
      Fixes: 94144a46 ("mmc: sdhci: add get_cd() implementation")
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
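
The reordering described in this commit, calling a function that may sleep before taking the lock and using the cached result under it, can be sketched with a small model. Everything here is hypothetical: the lock is a plain counter rather than a real spin lock, and `model_get_cd()` merely records whether it was called while the modelled lock was held.

```c
#include <stdbool.h>

static int atomic_depth;     /* > 0 while the modelled lock is held */
static int sleep_violations; /* sleeping calls made while atomic */

void model_lock(void)   { atomic_depth++; }
void model_unlock(void) { atomic_depth--; }

/* Stand-in for a card-detect read that is allowed to sleep. */
bool model_get_cd(void)
{
    if (atomic_depth > 0)
        sleep_violations++;
    return true;
}

/* Old order: card detection happens under the lock. */
int request_old(void)
{
    sleep_violations = 0;
    model_lock();
    bool present = model_get_cd();
    (void)present;
    model_unlock();
    return sleep_violations;
}

/* New order: detect first, then use the cached value under the lock. */
int request_new(void)
{
    sleep_violations = 0;
    bool present = model_get_cd();
    model_lock();
    (void)present; /* only the cached result is consulted here */
    model_unlock();
    return sleep_violations;
}
```

In this model `request_old()` reports one violation and `request_new()` reports none, mirroring the "sleeping function called from invalid context" warning in the log above and its absence after the reordering.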
    • regulator: s2mps11: Fix dw_mmc failure on Gear 2 · 9abaccf3
      Krzysztof Kozlowski authored
      commit 1222d8fe upstream.
      
      Invalid buck4 configuration for linear mapping of voltage in S2MPS14
      regulators caused boot failure on Gear 2 (dw_mmc-exynos):
      
      [    3.569137] EXT4-fs (mmcblk0p15): mounted filesystem with ordered data mode. Opts: (null)
      [    3.571716] VFS: Mounted root (ext4 filesystem) readonly on device 179:15.
      [    3.629842] mmcblk0: error -110 sending status command, retrying
      [    3.630244] mmcblk0: error -110 sending status command, retrying
      [    3.636292] mmcblk0: error -110 sending status command, aborting
      
      Buck4 voltage regulator has a different minimal voltage value than the
      other bucks. The commit merging multiple regulator description macros
      caused linear_min_sel from the buck[1235] regulators to be used as the
      value for buck4. This led to a lower buck4 voltage than required.
      
      Output of the buck4 is used internally as power source for
      LDO{3,4,7,11,19,20,21,23}. On Gear 2 board LDO11 is used as MMC
      regulator (V_EMMC_1.8V).
      
      Fixes: 5a867cf2 ("regulator: s2mps11: Optimize the regulator description macro")
      Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
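
How a wrong linear_min_sel turns into a lower output voltage can be shown with a simplified model of a linear voltage mapping. This is a sketch only: the struct and function names are hypothetical, and the numbers used in the note below are made up for illustration, not taken from the S2MPS14 datasheet or driver.

```c
/* Hypothetical descriptor for one linear voltage range. */
struct model_desc {
    int min_uV;         /* voltage at the first valid selector */
    int uV_step;        /* voltage increase per selector step */
    int linear_min_sel; /* register value that corresponds to min_uV */
};

/* Selector to voltage, as the hardware would interpret the register. */
int model_list_voltage(const struct model_desc *d, int selector)
{
    return d->min_uV + d->uV_step * (selector - d->linear_min_sel);
}

/* Requested voltage to selector, rounding up to the next step. */
int model_map_voltage(const struct model_desc *d, int target_uV)
{
    int steps = (target_uV - d->min_uV + d->uV_step - 1) / d->uV_step;

    return steps + d->linear_min_sel;
}
```

Assume the hardware range is `{1400000, 12500, 0x40}` but the driver describes it with a linear_min_sel of `0x20`. A request for 1800000 uV then maps to selector 64 instead of 96, and the hardware interprets selector 64 as 1400000 uV, which is below the requested value.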
    • nouveau: bring back legacy mmap handler · cc01e9c0
      Dave Airlie authored
      commit 2036eaa7 upstream.
      
      nouveau userspace back at 1.0.1 used to call the X server
      DRIOpenDRMMaster interface even for DRI2 (doh!), this attempts
      to map the sarea and fails if it can't.
      
      Since 884c6dab from Daniel, this fails, but only ancient drivers would
      see it.
      
      Revert the nouveau bits of that fix.
      Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>