1. 28 Mar, 2015 5 commits
  2. 27 Mar, 2015 6 commits
  3. 26 Mar, 2015 12 commits
    • Linus Torvalds's avatar
      Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · 3c435c1e
      Linus Torvalds authored
      Pull drm refcounting fixes from Dave Airlie:
       "Here is the complete set of i915 bug/warn/refcounting fixes"
      
      * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
        drm/i915: Fixup legacy plane->crtc link for initial fb config
        drm/i915: Fix atomic state when reusing the firmware fb
        drm/i915: Keep ring->active_list and ring->requests_list consistent
        drm/i915: Don't try to reference the fb in get_initial_plane_config()
        drm: Fixup racy refcounting in plane_force_disable
      3c435c1e
    • Linus Torvalds's avatar
      Merge tag 'dm-4.0-fix-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm · be8a9bc6
      Linus Torvalds authored
      Pull device mapper fix from Mike Snitzer:
       "Fix DM core device cleanup regression -- due to a latent race that was
        exposed by the bdi changes that were introduced during the 4.0 merge"
      
      * tag 'dm-4.0-fix-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        dm: fix add_disk() NULL pointer due to race with free_dev()
      be8a9bc6
    • Linus Torvalds's avatar
      Merge tag 'linux-kselftest-4.0-rc6' of... · 0e536e25
      Linus Torvalds authored
      Merge tag 'linux-kselftest-4.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
      
      Pull kselftest fix from Shuah Khan.
      
      * tag 'linux-kselftest-4.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
        selftests: Fix build failures when invoked from kselftest target
      0e536e25
    • Dave Airlie's avatar
      Merge tag 'drm-intel-fixes-2015-03-26' of git://anongit.freedesktop.org/drm-intel into drm-fixes · 9822393d
      Dave Airlie authored
      This should cover the final warnings in -rc5 with two more backports
      from our development branch (drm-intel-next-queued). They're the ones
      from Daniel and Damien, with references to the reports.
      
      This is on top of drm-fixes because of the dependency on the two earlier
      fixes not yet in Linus' tree.
      
      There's an additional regression fix from Chris.
      
      * tag 'drm-intel-fixes-2015-03-26' of git://anongit.freedesktop.org/drm-intel:
        drm/i915: Fixup legacy plane->crtc link for initial fb config
        drm/i915: Fix atomic state when reusing the firmware fb
        drm/i915: Keep ring->active_list and ring->requests_list consistent
      9822393d
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · d6702d84
      Linus Torvalds authored
      Pull s390 fixes from Martin Schwidefsky:
       "A couple of bug fixes for s390.
      
        The ftrace comile fix is quite large for a -rc6 release, but it would
        be nice to have it in 4.0"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
        s390/smp: reenable smt after resume
        s390/mm: limit STACK_RND_MASK for compat tasks
        s390/ftrace: fix compile error if CONFIG_KPROBES is disabled
        s390/cpum_sf: add diagnostic sampling event only if it is authorized
      d6702d84
    • Daniel Vetter's avatar
      drm/i915: Fixup legacy plane->crtc link for initial fb config · 5f407751
      Daniel Vetter authored
      This is a very similar bug in the load detect code fixed in
      
      commit 9128b040
      Author: Daniel Vetter <daniel.vetter@ffwll.ch>
      Date:   Tue Mar 3 17:31:21 2015 +0100
      
          drm/i915: Fix modeset state confusion in the load detect code
      
      But this time around it was the initial fb code that forgot to update
      the plane->crtc pointer. Otherwise it's the exact same bug, with the
      exact same restrains (any set_config call/ioctl that doesn't disable
      the pipe papers over the bug for free, so fairly hard to hit in normal
      testing). So if you want the full explanation just go read that one
      over there - it's rather long ...
      
      Cc: Matt Roper <matthew.d.roper@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Josh Boyer <jwboyer@fedoraproject.org>
      Cc: Jani Nikula <jani.nikula@linux.intel.com>
      Reported-and-tested-by: default avatarJosh Boyer <jwboyer@fedoraproject.org>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      [Jani: backported to drm-intel-fixes for v4.0-rc]
      Reference: http://mid.gmane.org/CA+5PVA7ChbtJrknqws1qvZcbrg1CW2pQAFkSMURWWgyASRyGXg@mail.gmail.comSigned-off-by: default avatarJani Nikula <jani.nikula@intel.com>
      5f407751
    • Damien Lespiau's avatar
      drm/i915: Fix atomic state when reusing the firmware fb · 3164a803
      Damien Lespiau authored
      Right now, we get a warning when taking over the firmware fb:
      
        [drm:drm_atomic_plane_check] FB set but no CRTC
      
      with the following backtrace:
      
        [<ffffffffa010339d>] drm_atomic_check_only+0x35d/0x510 [drm]
        [<ffffffffa0103567>] drm_atomic_commit+0x17/0x60 [drm]
        [<ffffffffa00a6ccd>] drm_atomic_helper_plane_set_property+0x8d/0xd0 [drm_kms_helper]
        [<ffffffffa00f1fed>] drm_mode_plane_set_obj_prop+0x2d/0x90 [drm]
        [<ffffffffa00a8a1b>] restore_fbdev_mode+0x6b/0xf0 [drm_kms_helper]
        [<ffffffffa00aa969>] drm_fb_helper_restore_fbdev_mode_unlocked+0x29/0x80 [drm_kms_helper]
        [<ffffffffa00aa9e2>] drm_fb_helper_set_par+0x22/0x50 [drm_kms_helper]
        [<ffffffffa050a71a>] intel_fbdev_set_par+0x1a/0x60 [i915]
        [<ffffffff813ad444>] fbcon_init+0x4f4/0x580
      
      That's because we update the plane state with the fb from the firmware, but we
      never associate the plane to that CRTC.
      
      We don't quite have the full DRM take over from HW state just yet, so
      fake enough of the plane atomic state to pass the checks.
      
      v2: Fix the state on which we set the CRTC in the case we're sharing the
          initial fb with another pipe. (Matt)
      Signed-off-by: default avatarDamien Lespiau <damien.lespiau@intel.com>
      Reviewed-by: default avatarMatt Roper <matthew.d.roper@intel.com>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      [Jani: backported to drm-intel-fixes for v4.0-rc]
      Reference: http://mid.gmane.org/CA+5PVA7yXH=U757w8V=Zj2U1URG4nYNav20NpjtQ4svVueyPNw@mail.gmail.com
      Reference: http://lkml.kernel.org/r/CA+55aFweWR=nDzc2Y=rCtL_H8JfdprQiCimN5dwc+TgyD4Bjsg@mail.gmail.comSigned-off-by: default avatarJani Nikula <jani.nikula@intel.com>
      3164a803
    • Hui Wang's avatar
      ALSA: hda - Add one more node in the EAPD supporting candidate list · af95b414
      Hui Wang authored
      We have a HP machine which use the codec node 0x17 connecting the
      internal speaker, and from the node capability, we saw the EAPD,
      if we don't set the EAPD on for this node, the internal speaker
      can't output any sound.
      
      Cc: <stable@vger.kernel.org>
      BugLink: https://bugs.launchpad.net/bugs/1436745Signed-off-by: default avatarHui Wang <hui.wang@canonical.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      af95b414
    • Chris Wilson's avatar
      drm/i915: Keep ring->active_list and ring->requests_list consistent · 832a3aad
      Chris Wilson authored
      If we retire requests last, we may use a later seqno and so clear
      the requests lists without clearing the active list, leading to
      confusion. Hence we should retire requests first for consistency with
      the early return. The order used to be important as the lifecycle for
      the object on the active list was determined by request->seqno. However,
      the requests themselves are now reference counted removing the
      constraint from the order of retirement.
      
      Fixes regression from
      
      commit 1b5a433a
      Author: John Harrison <John.C.Harrison@Intel.com>
      Date:   Mon Nov 24 18:49:42 2014 +0000
      
          drm/i915: Convert 'i915_seqno_passed' calls into 'i915_gem_request_completed
      '
      
      and a
      
      	WARNING: CPU: 0 PID: 1383 at drivers/gpu/drm/i915/i915_gem_evict.c:279 i915_gem_evict_vm+0x10c/0x140()
      	WARN_ON(!list_empty(&vm->active_list))
      
      Identified by updating WATCH_LISTS:
      
      	[drm:i915_verify_lists] *ERROR* blitter ring: active list not empty, but no requests
      	WARNING: CPU: 0 PID: 681 at drivers/gpu/drm/i915/i915_gem.c:2751 i915_gem_retire_requests_ring+0x149/0x230()
      	WARN_ON(i915_verify_lists(ring->dev))
      
      Note that this is only a problem in evict_vm where the following happens
      after a retire_request has cleaned out all requests, but not all active
      bo:
      - intel_ring_idle called from i915_gpu_idle notices that no requests are
        outstanding and immediately returns.
      - i915_gem_retire_requests_ring called from i915_gem_retire_requests also
        immediately returns when there's no request, still leaving the bo on the
        active list.
      - evict_vm hits the WARN_ON(!list_empty(&vm->active_list)) after evicting
        all active objects that there's still stuff left that shouldn't be
        there.
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: John Harrison <John.C.Harrison@Intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Reviewed-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: default avatarJani Nikula <jani.nikula@intel.com>
      832a3aad
    • Libin Yang's avatar
      ALSA: hda_intel: apply the Seperate stream_tag for Sunrise Point · db48abf4
      Libin Yang authored
      The total stream number of Sunrise Point's input and output stream
      exceeds 15, which will cause some streams do not work because
      of the overflow on SDxCTL.STRM field if using the legacy
      stream tag allocation method.
      
      This patch uses the new stream tag allocation method by add
      the flag AZX_DCAPS_SEPARATE_STREAM_TAG for Skylake platform.
      Signed-off-by: default avatarLibin Yang <libin.yang@intel.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      db48abf4
    • Vineet Gupta's avatar
      ARC: signal handling robustify · e4140819
      Vineet Gupta authored
      A malicious signal handler / restorer can DOS the system by fudging the
      user regs saved on stack, causing weird things such as sigreturn returning
      to user mode PC but cpu state still being kernel mode....
      
      Ensure that in sigreturn path status32 always has U bit; any other bogosity
      (gargbage PC etc) will be taken care of by normal user mode exceptions mechanisms.
      
      Reproducer signal handler:
      
          void handle_sig(int signo, siginfo_t *info, void *context)
          {
      	ucontext_t *uc = context;
      	struct user_regs_struct *regs = &(uc->uc_mcontext.regs);
      
      	regs->scratch.status32 = 0;
          }
      
      Before the fix, kernel would go off to weeds like below:
      
          --------->8-----------
          [ARCLinux]$ ./signal-test
          Path: /signal-test
          CPU: 0 PID: 61 Comm: signal-test Not tainted 4.0.0-rc5+ #65
          task: 8f177880 ti: 5ffe6000 task.ti: 8f15c000
      
          [ECR   ]: 0x00220200 => Invalid Write @ 0x00000010 by insn @ 0x00010698
          [EFA   ]: 0x00000010
          [BLINK ]: 0x2007c1ee
          [ERET  ]: 0x10698
          [STAT32]: 0x00000000 :                                   <--------
          BTA: 0x00010680	 SP: 0x5ffe7e48	 FP: 0x00000000
          LPS: 0x20003c6c	LPE: 0x20003c70	LPC: 0x00000000
          ...
          --------->8-----------
      Reported-by: default avatarAlexey Brodkin <abrodkin@synopsys.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarVineet Gupta <vgupta@synopsys.com>
      e4140819
    • Vineet Gupta's avatar
      ARC: SA_SIGINFO ucontext regs off-by-one · 6914e1e3
      Vineet Gupta authored
      The regfile provided to SA_SIGINFO signal handler as ucontext was off by
      one due to pt_regs gutter cleanups in 2013.
      
      Before handling signal, user pt_regs are copied onto user_regs_struct and copied
      back later. Both structs are binary compatible. This was all fine until
      commit 2fa91904 (ARC: pt_regs update #2) which removed the empty stack slot
      at top of pt_regs (corresponding to first pad) and made the corresponding
      fixup in struct user_regs_struct (the pad in there was moved out of
      @scratch - not removed altogether as it is part of ptrace ABI)
      
       struct user_regs_struct {
      +       long pad;
              struct {
      -               long pad;
                      long bta, lp_start, lp_end,....
              } scratch;
       ...
       }
      
      This meant that now user_regs_struct was off by 1 reg w.r.t pt_regs and
      signal code needs to user_regs_struct.scratch to reflect it as pt_regs,
      which is what this commit does.
      
      This problem was hidden for 2 years, because both save/restore, despite
      using wrong location, were using the same location. Only an interim
      inspection (reproducer below) exposed the issue.
      
           void handle_segv(int signo, siginfo_t *info, void *context)
           {
       	ucontext_t *uc = context;
      	struct user_regs_struct *regs = &(uc->uc_mcontext.regs);
      
      	printf("regs %x %x\n",               <=== prints 7 8 (vs. 8 9)
                     regs->scratch.r8, regs->scratch.r9);
           }
      
           int main()
           {
      	struct sigaction sa;
      
      	sa.sa_sigaction = handle_segv;
      	sa.sa_flags = SA_SIGINFO;
      	sigemptyset(&sa.sa_mask);
      	sigaction(SIGSEGV, &sa, NULL);
      
      	asm volatile(
      	"mov	r7, 7	\n"
      	"mov	r8, 8	\n"
      	"mov	r9, 9	\n"
      	"mov	r10, 10	\n"
      	:::"r7","r8","r9","r10");
      
      	*((unsigned int*)0x10) = 0;
           }
      
      Fixes: 2fa91904 "ARC: pt_regs update #2: Remove unused gutter at start of pt_regs"
      CC: <stable@vger.kernel.org>
      Signed-off-by: default avatarVineet Gupta <vgupta@synopsys.com>
      6914e1e3
  4. 25 Mar, 2015 17 commits
    • Linus Torvalds's avatar
      Merge tag 'metag-fixes-v4.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag · 4c4fe4c2
      Linus Torvalds authored
      Pull arch/metag fix from James Hogan:
       "Another metag architecture fix for v4.0
      
        This is another single fix, for an include dependency problem when
        using ioremap_wc() from asm/io.h without also including asm/pgtable.h"
      
      * tag 'metag-fixes-v4.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag:
        metag: Fix ioremap_wc/ioremap_cached build errors
      4c4fe4c2
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · 9c8e30d1
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "15 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        mm: numa: mark huge PTEs young when clearing NUMA hinting faults
        mm: numa: slow PTE scan rate if migration failures occur
        mm: numa: preserve PTE write permissions across a NUMA hinting fault
        mm: numa: group related processes based on VMA flags instead of page table flags
        hfsplus: fix B-tree corruption after insertion at position 0
        MAINTAINERS: add Jan as DMI/SMBIOS support maintainer
        fs/affs/file.c: unlock/release page on error
        mm/page_alloc.c: call kernel_map_pages in unset_migrateype_isolate
        mm/slub: fix lockups on PREEMPT && !SMP kernels
        mm/memory hotplug: postpone the reset of obsolete pgdat
        MAINTAINERS: correct rtc armada38x pattern entry
        mm/pagewalk.c: prevent positive return value of walk_page_test() from being passed to callers
        mm: fix anon_vma->degree underflow in anon_vma endless growing prevention
        drivers/rtc/rtc-mrst: fix suspend/resume
        aoe: update aoe maintainer information
      9c8e30d1
    • Marcelo Tosatti's avatar
      Merge tag 'signed-for-4.0' of git://github.com/agraf/linux-2.6 · 27bfc6cf
      Marcelo Tosatti authored
      Patch queue for 4.0 - 2015-03-25
      
      A few bug fixes for Book3S HV KVM:
      
        - Fix spinlock ordering
        - Fix idle guests on LE hosts
        - Fix instruction emulation
      27bfc6cf
    • Mel Gorman's avatar
      mm: numa: mark huge PTEs young when clearing NUMA hinting faults · b7b04004
      Mel Gorman authored
      Base PTEs are marked young when the NUMA hinting information is cleared
      but the same does not happen for huge pages which this patch addresses.
      
      Note that migrated pages are not marked young as the base page migration
      code does not assume that migrated pages have been referenced.  This
      could be addressed but beyond the scope of this series which is aimed at
      Dave Chinners shrink workload that is unlikely to be affected by this
      issue.
      Signed-off-by: default avatarMel Gorman <mgorman@suse.de>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b7b04004
    • Mel Gorman's avatar
      mm: numa: slow PTE scan rate if migration failures occur · 074c2381
      Mel Gorman authored
      Dave Chinner reported the following on https://lkml.org/lkml/2015/3/1/226
      
        Across the board the 4.0-rc1 numbers are much slower, and the degradation
        is far worse when using the large memory footprint configs. Perf points
        straight at the cause - this is from 4.0-rc1 on the "-o bhash=101073" config:
      
         -   56.07%    56.07%  [kernel]            [k] default_send_IPI_mask_sequence_phys
            - default_send_IPI_mask_sequence_phys
               - 99.99% physflat_send_IPI_mask
                  - 99.37% native_send_call_func_ipi
                       smp_call_function_many
                     - native_flush_tlb_others
                        - 99.85% flush_tlb_page
                             ptep_clear_flush
                             try_to_unmap_one
                             rmap_walk
                             try_to_unmap
                             migrate_pages
                             migrate_misplaced_page
                           - handle_mm_fault
                              - 99.73% __do_page_fault
                                   trace_do_page_fault
                                   do_async_page_fault
                                 + async_page_fault
                    0.63% native_send_call_func_single_ipi
                       generic_exec_single
                       smp_call_function_single
      
      This is showing excessive migration activity even though excessive
      migrations are meant to get throttled.  Normally, the scan rate is tuned
      on a per-task basis depending on the locality of faults.  However, if
      migrations fail for any reason then the PTE scanner may scan faster if
      the faults continue to be remote.  This means there is higher system CPU
      overhead and fault trapping at exactly the time we know that migrations
      cannot happen.  This patch tracks when migration failures occur and
      slows the PTE scanner.
      Signed-off-by: default avatarMel Gorman <mgorman@suse.de>
      Reported-by: default avatarDave Chinner <david@fromorbit.com>
      Tested-by: default avatarDave Chinner <david@fromorbit.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      074c2381
    • Mel Gorman's avatar
      mm: numa: preserve PTE write permissions across a NUMA hinting fault · b191f9b1
      Mel Gorman authored
      Protecting a PTE to trap a NUMA hinting fault clears the writable bit
      and further faults are needed after trapping a NUMA hinting fault to set
      the writable bit again.  This patch preserves the writable bit when
      trapping NUMA hinting faults.  The impact is obvious from the number of
      minor faults trapped during the basis balancing benchmark and the system
      CPU usage;
      
        autonumabench
                                                   4.0.0-rc4             4.0.0-rc4
                                                    baseline              preserve
        Time System-NUMA01                  107.13 (  0.00%)      103.13 (  3.73%)
        Time System-NUMA01_THEADLOCAL       131.87 (  0.00%)       83.30 ( 36.83%)
        Time System-NUMA02                    8.95 (  0.00%)       10.72 (-19.78%)
        Time System-NUMA02_SMT                4.57 (  0.00%)        3.99 ( 12.69%)
        Time Elapsed-NUMA01                 515.78 (  0.00%)      517.26 ( -0.29%)
        Time Elapsed-NUMA01_THEADLOCAL      384.10 (  0.00%)      384.31 ( -0.05%)
        Time Elapsed-NUMA02                  48.86 (  0.00%)       48.78 (  0.16%)
        Time Elapsed-NUMA02_SMT              47.98 (  0.00%)       48.12 ( -0.29%)
      
                     4.0.0-rc4   4.0.0-rc4
                      baseline    preserve
        User          44383.95    43971.89
        System          252.61      201.24
        Elapsed         998.68     1000.94
      
        Minor Faults   2597249     1981230
        Major Faults       365         364
      
      There is a similar drop in system CPU usage using Dave Chinner's xfsrepair
      workload
      
                                            4.0.0-rc4             4.0.0-rc4
                                             baseline              preserve
        Amean    real-xfsrepair      454.14 (  0.00%)      442.36 (  2.60%)
        Amean    syst-xfsrepair      277.20 (  0.00%)      204.68 ( 26.16%)
      
      The patch looks hacky but the alternatives looked worse.  The tidest was
      to rewalk the page tables after a hinting fault but it was more complex
      than this approach and the performance was worse.  It's not generally
      safe to just mark the page writable during the fault if it's a write
      fault as it may have been read-only for COW so that approach was
      discarded.
      Signed-off-by: default avatarMel Gorman <mgorman@suse.de>
      Reported-by: default avatarDave Chinner <david@fromorbit.com>
      Tested-by: default avatarDave Chinner <david@fromorbit.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b191f9b1
    • Mel Gorman's avatar
      mm: numa: group related processes based on VMA flags instead of page table flags · bea66fbd
      Mel Gorman authored
      These are three follow-on patches based on the xfsrepair workload Dave
      Chinner reported was problematic in 4.0-rc1 due to changes in page table
      management -- https://lkml.org/lkml/2015/3/1/226.
      
      Much of the problem was reduced by commit 53da3bc2 ("mm: fix up numa
      read-only thread grouping logic") and commit ba68bc01 ("mm: thp:
      Return the correct value for change_huge_pmd").  It was known that the
      performance in 3.19 was still better even if is far less safe.  This
      series aims to restore the performance without compromising on safety.
      
      For the test of this mail, I'm comparing 3.19 against 4.0-rc4 and the
      three patches applied on top
      
        autonumabench
                                                      3.19.0             4.0.0-rc4             4.0.0-rc4             4.0.0-rc4             4.0.0-rc4
                                                     vanilla               vanilla          vmwrite-v5r8         preserve-v5r8         slowscan-v5r8
        Time System-NUMA01                  124.00 (  0.00%)      161.86 (-30.53%)      107.13 ( 13.60%)      103.13 ( 16.83%)      145.01 (-16.94%)
        Time System-NUMA01_THEADLOCAL       115.54 (  0.00%)      107.64 (  6.84%)      131.87 (-14.13%)       83.30 ( 27.90%)       92.35 ( 20.07%)
        Time System-NUMA02                    9.35 (  0.00%)       10.44 (-11.66%)        8.95 (  4.28%)       10.72 (-14.65%)        8.16 ( 12.73%)
        Time System-NUMA02_SMT                3.87 (  0.00%)        4.63 (-19.64%)        4.57 (-18.09%)        3.99 ( -3.10%)        3.36 ( 13.18%)
        Time Elapsed-NUMA01                 570.06 (  0.00%)      567.82 (  0.39%)      515.78 (  9.52%)      517.26 (  9.26%)      543.80 (  4.61%)
        Time Elapsed-NUMA01_THEADLOCAL      393.69 (  0.00%)      384.83 (  2.25%)      384.10 (  2.44%)      384.31 (  2.38%)      380.73 (  3.29%)
        Time Elapsed-NUMA02                  49.09 (  0.00%)       49.33 ( -0.49%)       48.86 (  0.47%)       48.78 (  0.63%)       50.94 ( -3.77%)
        Time Elapsed-NUMA02_SMT              47.51 (  0.00%)       47.15 (  0.76%)       47.98 ( -0.99%)       48.12 ( -1.28%)       49.56 ( -4.31%)
      
                      3.19.0   4.0.0-rc4   4.0.0-rc4   4.0.0-rc4   4.0.0-rc4
                     vanilla     vanillavmwrite-v5r8preserve-v5r8slowscan-v5r8
        User        46334.60    46391.94    44383.95    43971.89    44372.12
        System        252.84      284.66      252.61      201.24      249.00
        Elapsed      1062.14     1050.96      998.68     1000.94     1026.78
      
      Overall the system CPU usage is comparable and the test is naturally a
      bit variable.  The slowing of the scanner hurts numa01 but on this
      machine it is an adverse workload and patches that dramatically help it
      often hurt absolutely everything else.
      
      Due to patch 2, the fault activity is interesting
      
                                        3.19.0   4.0.0-rc4   4.0.0-rc4   4.0.0-rc4   4.0.0-rc4
                                       vanilla     vanillavmwrite-v5r8preserve-v5r8slowscan-v5r8
        Minor Faults                   2097811     2656646     2597249     1981230     1636841
        Major Faults                       362         450         365         364         365
      
      Note the impact preserving the write bit across protection updates and
      fault reduces faults.
      
        NUMA alloc hit                 1229008     1217015     1191660     1178322     1199681
        NUMA alloc miss                      0           0           0           0           0
        NUMA interleave hit                  0           0           0           0           0
        NUMA alloc local               1228514     1216317     1190871     1177448     1199021
        NUMA base PTE updates        245706197   240041607   238195516   244704842   115012800
        NUMA huge PMD updates           479530      468448      464868      477573      224487
        NUMA page range updates      491225557   479886983   476207932   489222218   229950144
        NUMA hint faults                659753      656503      641678      656926      294842
        NUMA hint local faults          381604      373963      360478      337585      186249
        NUMA hint local percent             57          56          56          51          63
        NUMA pages migrated            5412140     6374899     6266530     5277468     5755096
        AutoNUMA cost                    5121%       5083%       4994%       5097%       2388%
      
      Here the impact of slowing the PTE scanner on migratrion failures is
      obvious as "NUMA base PTE updates" and "NUMA huge PMD updates" are
      massively reduced even though the headline performance is very similar.
      
      As xfsrepair was the reported workload here is the impact of the series
      on it.
      
        xfsrepair
                                               3.19.0             4.0.0-rc4             4.0.0-rc4             4.0.0-rc4             4.0.0-rc4
                                              vanilla               vanilla          vmwrite-v5r8         preserve-v5r8         slowscan-v5r8
        Min      real-fsmark        1183.29 (  0.00%)     1165.73 (  1.48%)     1152.78 (  2.58%)     1153.64 (  2.51%)     1177.62 (  0.48%)
        Min      syst-fsmark        4107.85 (  0.00%)     4027.75 (  1.95%)     3986.74 (  2.95%)     3979.16 (  3.13%)     4048.76 (  1.44%)
        Min      real-xfsrepair      441.51 (  0.00%)      463.96 ( -5.08%)      449.50 ( -1.81%)      440.08 (  0.32%)      439.87 (  0.37%)
        Min      syst-xfsrepair      195.76 (  0.00%)      278.47 (-42.25%)      262.34 (-34.01%)      203.70 ( -4.06%)      143.64 ( 26.62%)
        Amean    real-fsmark        1188.30 (  0.00%)     1177.34 (  0.92%)     1157.97 (  2.55%)     1158.21 (  2.53%)     1182.22 (  0.51%)
        Amean    syst-fsmark        4111.37 (  0.00%)     4055.70 (  1.35%)     3987.19 (  3.02%)     3998.72 (  2.74%)     4061.69 (  1.21%)
        Amean    real-xfsrepair      450.88 (  0.00%)      468.32 ( -3.87%)      454.14 ( -0.72%)      442.36 (  1.89%)      440.59 (  2.28%)
        Amean    syst-xfsrepair      199.66 (  0.00%)      290.60 (-45.55%)      277.20 (-38.84%)      204.68 ( -2.51%)      150.55 ( 24.60%)
        Stddev   real-fsmark           4.12 (  0.00%)       10.82 (-162.29%)       4.14 ( -0.28%)        5.98 (-45.05%)        4.60 (-11.53%)
        Stddev   syst-fsmark           2.63 (  0.00%)       20.32 (-671.82%)       0.37 ( 85.89%)       16.47 (-525.59%)      15.05 (-471.79%)
        Stddev   real-xfsrepair        6.87 (  0.00%)        4.55 ( 33.75%)        3.46 ( 49.58%)        1.78 ( 74.12%)        0.52 ( 92.50%)
        Stddev   syst-xfsrepair        3.02 (  0.00%)       10.30 (-241.37%)      13.17 (-336.37%)       0.71 ( 76.63%)        5.00 (-65.61%)
        CoeffVar real-fsmark           0.35 (  0.00%)        0.92 (-164.73%)       0.36 ( -2.91%)        0.52 (-48.82%)        0.39 (-12.10%)
        CoeffVar syst-fsmark           0.06 (  0.00%)        0.50 (-682.41%)       0.01 ( 85.45%)        0.41 (-543.22%)       0.37 (-478.78%)
        CoeffVar real-xfsrepair        1.52 (  0.00%)        0.97 ( 36.21%)        0.76 ( 49.94%)        0.40 ( 73.62%)        0.12 ( 92.33%)
        CoeffVar syst-xfsrepair        1.51 (  0.00%)        3.54 (-134.54%)       4.75 (-214.31%)       0.34 ( 77.20%)        3.32 (-119.63%)
        Max      real-fsmark        1193.39 (  0.00%)     1191.77 (  0.14%)     1162.90 (  2.55%)     1166.66 (  2.24%)     1188.50 (  0.41%)
        Max      syst-fsmark        4114.18 (  0.00%)     4075.45 (  0.94%)     3987.65 (  3.08%)     4019.45 (  2.30%)     4082.80 (  0.76%)
        Max      real-xfsrepair      457.80 (  0.00%)      474.60 ( -3.67%)      457.82 ( -0.00%)      444.42 (  2.92%)      441.03 (  3.66%)
        Max      syst-xfsrepair      203.11 (  0.00%)      303.65 (-49.50%)      294.35 (-44.92%)      205.33 ( -1.09%)      155.28 ( 23.55%)
      
      The really relevant lines as syst-xfsrepair which is the system CPU
      usage when running xfsrepair.  Note that on my machine the overhead was
      45% higher on 4.0-rc4 which may be part of what Dave is seeing.  Once we
      preserve the write bit across faults, it's only 2.51% higher on average.
      With the full series applied, system CPU usage is 24.6% lower on
      average.
      
      Again, the impact of preserving the write bit on minor faults is obvious
      and the impact of slowing scanning after migration failures is obvious
      on the PTE updates.  Note also that the number of pages migrated is much
      reduced even though the headline performance is comparable.
      
                                        3.19.0   4.0.0-rc4   4.0.0-rc4   4.0.0-rc4   4.0.0-rc4
                                       vanilla     vanillavmwrite-v5r8preserve-v5r8slowscan-v5r8
        Minor Faults                 153466827   254507978   249163829   153501373   105737890
        Major Faults                       610         702         690         649         724
        NUMA base PTE updates        217735049   210756527   217729596   216937111   144344993
        NUMA huge PMD updates           129294       85044      106921      127246       79887
        NUMA pages migrated           21938995    29705270    28594162    22687324    16258075
      
                              3.19.0   4.0.0-rc4   4.0.0-rc4   4.0.0-rc4   4.0.0-rc4
                             vanilla     vanillavmwrite-v5r8preserve-v5r8slowscan-v5r8
        Mean sdb-avgqusz       13.47        2.54        2.55        2.47        2.49
        Mean sdb-avgrqsz      202.32      140.22      139.50      139.02      138.12
        Mean sdb-await         25.92        5.09        5.33        5.02        5.22
        Mean sdb-r_await        4.71        0.19        0.83        0.51        0.11
        Mean sdb-w_await      104.13        5.21        5.38        5.05        5.32
        Mean sdb-svctm          0.59        0.13        0.14        0.13        0.14
        Mean sdb-rrqm           0.16        0.00        0.00        0.00        0.00
        Mean sdb-wrqm           3.59     1799.43     1826.84     1812.21     1785.67
        Max  sdb-avgqusz      111.06       12.13       14.05       11.66       15.60
        Max  sdb-avgrqsz      255.60      190.34      190.01      187.33      191.78
        Max  sdb-await        168.24       39.28       49.22       44.64       65.62
        Max  sdb-r_await      660.00       52.00      280.00       76.00       12.00
        Max  sdb-w_await     7804.00       39.28       49.22       44.64       65.62
        Max  sdb-svctm          4.00        2.82        2.86        1.98        2.84
        Max  sdb-rrqm           8.30        0.00        0.00        0.00        0.00
        Max  sdb-wrqm          34.20     5372.80     5278.60     5386.60     5546.15
      
      FWIW, I also checked SPECjbb in different configurations but it's
      similar observations -- minor faults lower, PTE update activity lower
      and performance is roughly comparable against 3.19.
      
      This patch (of 3):
      
      Threads that share writable data within pages are grouped together as
      related tasks.  This decision is based on whether the PTE is marked
      dirty which is subject to timing races between the PTE scanner update
      and when the application writes the page.  If the page is file-backed,
      then background flushes and sync also affect placement.  This is
      unpredictable behaviour which is impossible to reason about so this
      patch makes grouping decisions based on the VMA flags.
      Signed-off-by: default avatarMel Gorman <mgorman@suse.de>
      Reported-by: default avatarDave Chinner <david@fromorbit.com>
      Tested-by: default avatarDave Chinner <david@fromorbit.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      bea66fbd
    • Sergei Antonov's avatar
      hfsplus: fix B-tree corruption after insertion at position 0 · 98cf21c6
      Sergei Antonov authored
      Fix B-tree corruption when a new record is inserted at position 0 in the
      node in hfs_brec_insert().  In this case a hfs_brec_update_parent() is
      called to update the parent index node (if exists) and it is passed
      hfs_find_data with a search_key containing a newly inserted key instead
      of the key to be updated.  This results in an inconsistent index node.
      The bug reproduces on my machine after an extents overflow record for
      the catalog file (CNID=4) is inserted into the extents overflow B-tree.
      Because of a low (reserved) value of CNID=4, it has to become the first
      record in the first leaf node.
      
      The resulting first leaf node is correct:
      
        ----------------------------------------------------
        | key0.CNID=4 | key1.CNID=123 | key2.CNID=456, ... |
        ----------------------------------------------------
      
      But the parent index key0 still contains the previous key CNID=123:
      
        -----------------------
        | key0.CNID=123 | ... |
        -----------------------
      
      A change in hfs_brec_insert() makes hfs_brec_update_parent() work
      correctly by preventing it from getting fd->record=-1 value from
      __hfs_brec_find().
      
      Along the way, I removed duplicate code with unification of the if
      condition.  The resulting code is equivalent to the original code
      because node is never 0.
      
      Also hfs_brec_update_parent() will now return an error after getting a
      negative fd->record value.  However, the return value of
      hfs_brec_update_parent() is not checked anywhere in the file and I'm
      leaving it unchanged by this patch.  brec.c lacks error checking after
      some other calls too, but this issue is of less importance than the one
      being fixed by this patch.
      Signed-off-by: default avatarSergei Antonov <saproj@gmail.com>
      Cc: Joe Perches <joe@perches.com>
      Reviewed-by: default avatarVyacheslav Dubeyko <slava@dubeyko.com>
      Acked-by: default avatarHin-Tak Leung <htl10@users.sourceforge.net>
      Cc: Anton Altaparmakov <aia21@cam.ac.uk>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      98cf21c6
    • Jean Delvare's avatar
      MAINTAINERS: add Jan as DMI/SMBIOS support maintainer · 1f31e1b1
      Jean Delvare authored
      I am familiar with these drivers and I care about them so let me add
      myself as their maintainer.
      Signed-off-by: default avatarJean Delvare <jdelvare@suse.de>
      Acked-by: default avatarMatt Fleming <matt.fleming@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      1f31e1b1
    • Taesoo Kim's avatar
      fs/affs/file.c: unlock/release page on error · 3d5d472c
      Taesoo Kim authored
      When affs_bread_ino() fails, correctly unlock the page and release the
      page cache with proper error value.  All write_end() should
      unlock/release the page that was locked by write_beg().
      Signed-off-by: default avatarTaesoo Kim <tsgatesv@gmail.com>
      Cc: Fabian Frederick <fabf@skynet.be>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      3d5d472c
    • Laura Abbott's avatar
      mm/page_alloc.c: call kernel_map_pages in unset_migrateype_isolate · cfa86943
      Laura Abbott authored
      Commit 3c605096 ("mm/page_alloc: restrict max order of merging on
      isolated pageblock") changed the logic of unset_migratetype_isolate to
      check the buddy allocator and explicitly call __free_pages to merge.
      
      The page that is being freed in this path never had prep_new_page called
      so set_page_refcounted is called explicitly but there is no call to
      kernel_map_pages.  With the default kernel_map_pages this is mostly
      harmless but if kernel_map_pages does any manipulation of the page
      tables (unmapping or setting pages to read only) this may trigger a
      fault:
      
          alloc_contig_range test_pages_isolated(ceb00, ced00) failed
          Unable to handle kernel paging request at virtual address ffffffc0cec00000
          pgd = ffffffc045fc4000
          [ffffffc0cec00000] *pgd=0000000000000000
          Internal error: Oops: 9600004f [#1] PREEMPT SMP
          Modules linked in: exfatfs
          CPU: 1 PID: 23237 Comm: TimedEventQueue Not tainted 3.10.49-gc72ad36-dirty #1
          task: ffffffc03de52100 ti: ffffffc015388000 task.ti: ffffffc015388000
          PC is at memset+0xc8/0x1c0
          LR is at kernel_map_pages+0x1ec/0x244
      
      Fix this by calling kernel_map_pages to ensure the page is set in the
      page table properly
      
      Fixes: 3c605096 ("mm/page_alloc: restrict max order of merging on isolated pageblock")
      Signed-off-by: default avatarLaura Abbott <lauraa@codeaurora.org>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Acked-by: default avatarRik van Riel <riel@redhat.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Cc: Xishi Qiu <qiuxishi@huawei.com>
      Cc: Vladimir Davydov <vdavydov@parallels.com>
      Acked-by: default avatarJoonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Gioh Kim <gioh.kim@lge.com>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      cfa86943
    • Mark Rutland's avatar
      mm/slub: fix lockups on PREEMPT && !SMP kernels · 859b7a0e
      Mark Rutland authored
      Commit 9aabf810 ("mm/slub: optimize alloc/free fastpath by removing
      preemption on/off") introduced an occasional hang for kernels built with
      CONFIG_PREEMPT && !CONFIG_SMP.
      
      The problem is the following loop the patch introduced to
      slab_alloc_node and slab_free:
      
          do {
              tid = this_cpu_read(s->cpu_slab->tid);
              c = raw_cpu_ptr(s->cpu_slab);
          } while (IS_ENABLED(CONFIG_PREEMPT) && unlikely(tid != c->tid));
      
      GCC 4.9 has been observed to hoist the load of c and c->tid above the
      loop for !SMP kernels (as in this case raw_cpu_ptr(x) is compile-time
      constant and does not force a reload).  On arm64 the generated assembly
      looks like:
      
               ldr     x4, [x0,#8]
        loop:
               ldr     x1, [x0,#8]
               cmp     x1, x4
               b.ne    loop
      
      If the thread is preempted between the load of c->tid (into x1) and tid
      (into x4), and an allocation or free occurs in another thread (bumping
      the cpu_slab's tid), the thread will be stuck in the loop until
      s->cpu_slab->tid wraps, which may be forever in the absence of
      allocations/frees on the same CPU.
      
      This patch changes the loop condition to access c->tid with READ_ONCE.
      This ensures that the value is reloaded even when the compiler would
      otherwise assume it could cache the value, and also ensures that the
      load will not be torn.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: default avatarChristoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Steve Capper <steve.capper@linaro.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      859b7a0e
    • Gu Zheng's avatar
      mm/memory hotplug: postpone the reset of obsolete pgdat · b0dc3a34
      Gu Zheng authored
      Qiu Xishi reported the following BUG when testing hot-add/hot-remove node under
      stress condition:
      
        BUG: unable to handle kernel paging request at 0000000000025f60
        IP: next_online_pgdat+0x1/0x50
        PGD 0
        Oops: 0000 [#1] SMP
        ACPI: Device does not support D3cold
        Modules linked in: fuse nls_iso8859_1 nls_cp437 vfat fat loop dm_mod coretemp mperf crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 pcspkr microcode igb dca i2c_algo_bit ipv6 megaraid_sas iTCO_wdt i2c_i801 i2c_core iTCO_vendor_support tg3 sg hwmon ptp lpc_ich pps_core mfd_core acpi_pad rtc_cmos button ext3 jbd mbcache sd_mod crc_t10dif scsi_dh_alua scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh ahci libahci libata scsi_mod [last unloaded: rasf]
        CPU: 23 PID: 238 Comm: kworker/23:1 Tainted: G           O 3.10.15-5885-euler0302 #1
        Hardware name: HUAWEI TECHNOLOGIES CO.,LTD. Huawei N1/Huawei N1, BIOS V100R001 03/02/2015
        Workqueue: events vmstat_update
        task: ffffa800d32c0000 ti: ffffa800d32ae000 task.ti: ffffa800d32ae000
        RIP: 0010: next_online_pgdat+0x1/0x50
        RSP: 0018:ffffa800d32afce8  EFLAGS: 00010286
        RAX: 0000000000001440 RBX: ffffffff81da53b8 RCX: 0000000000000082
        RDX: 0000000000000000 RSI: 0000000000000082 RDI: 0000000000000000
        RBP: ffffa800d32afd28 R08: ffffffff81c93bfc R09: ffffffff81cbdc96
        R10: 00000000000040ec R11: 00000000000000a0 R12: ffffa800fffb3440
        R13: ffffa800d32afd38 R14: 0000000000000017 R15: ffffa800e6616800
        FS:  0000000000000000(0000) GS:ffffa800e6600000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000000025f60 CR3: 0000000001a0b000 CR4: 00000000001407e0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        Call Trace:
          refresh_cpu_vm_stats+0xd0/0x140
          vmstat_update+0x11/0x50
          process_one_work+0x194/0x3d0
          worker_thread+0x12b/0x410
          kthread+0xc6/0xd0
          ret_from_fork+0x7c/0xb0
      
      The cause is the "memset(pgdat, 0, sizeof(*pgdat))" at the end of
      try_offline_node, which will reset all the content of pgdat to 0, as the
      pgdat is accessed lock-free, so that the users still using the pgdat
      will panic, such as the vmstat_update routine.
      
      process A:				offline node XX:
      
      vmstat_updat()
         refresh_cpu_vm_stats()
           for_each_populated_zone()
             find online node XX
           cond_resched()
      					offline cpu and memory, then try_offline_node()
      					node_set_offline(nid), and memset(pgdat, 0, sizeof(*pgdat))
             zone = next_zone(zone)
               pg_data_t *pgdat = zone->zone_pgdat;  // here pgdat is NULL now
                 next_online_pgdat(pgdat)
                   next_online_node(pgdat->node_id);  // NULL pointer access
      
      So the solution here is postponing the reset of obsolete pgdat from
      try_offline_node() to hotadd_new_pgdat(), and just resetting
      pgdat->nr_zones and pgdat->classzone_idx to be 0 rather than the memset
      0 to avoid breaking pointer information in pgdat.
      Signed-off-by: default avatarGu Zheng <guz.fnst@cn.fujitsu.com>
      Reported-by: default avatarXishi Qiu <qiuxishi@huawei.com>
      Suggested-by: default avatarKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
      Cc: Tang Chen <tangchen@cn.fujitsu.com>
      Cc: Xie XiuQi <xiexiuqi@huawei.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b0dc3a34
    • Joe Perches's avatar
      MAINTAINERS: correct rtc armada38x pattern entry · 59ec9671
      Joe Perches authored
      Commit c6a95dbe ("MAINTAINERS: add the RTC driver for the
      Armada38x") typoed the pattern, fix it.
      Signed-off-by: default avatarJoe Perches <joe@perches.com>
      Acked-by: default avatarGregory CLEMENT <gregory.clement@free-electrons.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      59ec9671
    • Naoya Horiguchi's avatar
      mm/pagewalk.c: prevent positive return value of walk_page_test() from being passed to callers · f6837395
      Naoya Horiguchi authored
      walk_page_test() is purely pagewalk's internal stuff, and its positive
      return values are not intended to be passed to the callers of pagewalk.
      
      However, in the current code if the last vma in the do-while loop in
      walk_page_range() happens to return a positive value, it leaks outside
      walk_page_range().  So the user visible effect is invalid/unexpected
      return value (according to the reporter, mbind() causes it.)
      
      This patch fixes it simply by reinitializing the return value after
      checked.
      
      Another exposed interface, walk_page_vma(), already returns 0 for such
      cases so no problem.
      
      Fixes: fafaa426 ("pagewalk: improve vma handling")
      Signed-off-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Signed-off-by: default avatarKazutomo Yoshii <kazutomo.yoshii@gmail.com>
      Reported-by: default avatarKazutomo Yoshii <kazutomo.yoshii@gmail.com>
      Acked-by: default avatarDavid Rientjes <rientjes@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      f6837395
    • Leon Yu's avatar
      mm: fix anon_vma->degree underflow in anon_vma endless growing prevention · 3fe89b3e
      Leon Yu authored
      I have constantly stumbled upon "kernel BUG at mm/rmap.c:399!" after
      upgrading to 3.19 and had no luck with 4.0-rc1 neither.
      
      So, after looking into new logic introduced by commit 7a3ef208 ("mm:
      prevent endless growth of anon_vma hierarchy"), I found chances are that
      unlink_anon_vmas() is called without incrementing dst->anon_vma->degree
      in anon_vma_clone() due to allocation failure.  If dst->anon_vma is not
      NULL in error path, its degree will be incorrectly decremented in
      unlink_anon_vmas() and eventually underflow when exiting as a result of
      another call to unlink_anon_vmas().  That's how "kernel BUG at
      mm/rmap.c:399!" is triggered for me.
      
      This patch fixes the underflow by dropping dst->anon_vma when allocation
      fails.  It's safe to do so regardless of original value of dst->anon_vma
      because dst->anon_vma doesn't have valid meaning if anon_vma_clone()
      fails.  Besides, callers don't care dst->anon_vma in such case neither.
      
      Also suggested by Michal Hocko, we can clean up vma_adjust() a bit as
      anon_vma_clone() now does the work.
      
      [akpm@linux-foundation.org: tweak comment]
      Fixes: 7a3ef208 ("mm: prevent endless growth of anon_vma hierarchy")
      Signed-off-by: default avatarLeon Yu <chianglungyu@gmail.com>
      Signed-off-by: default avatarKonstantin Khlebnikov <koct9i@gmail.com>
      Reviewed-by: default avatarMichal Hocko <mhocko@suse.cz>
      Acked-by: default avatarRik van Riel <riel@redhat.com>
      Acked-by: default avatarDavid Rientjes <rientjes@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      3fe89b3e
    • Lars-Peter Clausen's avatar
      drivers/rtc/rtc-mrst: fix suspend/resume · ddd2a30d
      Lars-Peter Clausen authored
      The Moorestown RTC driver implements suspend and resume callbacks and
      assigns them to the suspend and resume fields of the device_driver
      struct.  These callbacks are never actually called by anything though.
      
      Modify the driver to properly use dev_pm_ops so that the suspend and
      resume functions are actually executed upon suspend/resume.
      
      [akpm@linux-foundation.org: device_driver.name is const char *]
      Signed-off-by: default avatarLars-Peter Clausen <lars@metafoo.de>
      Cc: Alessandro Zummo <a.zummo@towertech.it>
      Cc: Feng Tang <feng.tang@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ddd2a30d