1. 21 Feb, 2019 14 commits
    • psi: avoid divide-by-zero crash inside virtual machines · 4e37504d
      Johannes Weiner authored
      We've been seeing hard-to-trigger psi crashes when running inside VM
      instances:
      
          divide error: 0000 [#1] SMP PTI
          Modules linked in: [...]
          CPU: 0 PID: 212 Comm: kworker/0:2 Not tainted 4.16.18-119_fbk9_3817_gfe944c98d695 #119
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
          Workqueue: events psi_clock
          RIP: 0010:psi_update_stats+0x270/0x490
          RSP: 0018:ffffc90001117e10 EFLAGS: 00010246
          RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8800a35a13f8
          RDX: 0000000000000000 RSI: ffff8800a35a1340 RDI: 0000000000000000
          RBP: 0000000000000658 R08: ffff8800a35a1470 R09: 0000000000000000
          R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
          R13: 0000000000000000 R14: 0000000000000000 R15: 00000000000f8502
          FS:  0000000000000000(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 00007fbe370fa000 CR3: 00000000b1e3a000 CR4: 00000000000006f0
          DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
          DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
          Call Trace:
           psi_clock+0x12/0x50
           process_one_work+0x1e0/0x390
           worker_thread+0x2b/0x3c0
           ? rescuer_thread+0x330/0x330
           kthread+0x113/0x130
           ? kthread_create_worker_on_cpu+0x40/0x40
           ? SyS_exit_group+0x10/0x10
           ret_from_fork+0x35/0x40
          Code: 48 0f 47 c7 48 01 c2 45 85 e4 48 89 16 0f 85 e6 00 00 00 4c 8b 49 10 4c 8b 51 08 49 69 d9 f2 07 00 00 48 6b c0 64 4c 8b 29 31 d2 <48> f7 f7 49 69 d5 8d 06 00 00 48 89 c5 4c 69 f0 00 98 0b 00 48
      
      The Code-line points to `period` being 0 inside update_stats(), and we
      divide by that when calculating that period's pressure percentage.
      
      The elapsed period should never be 0.  The reason this can happen is due
      to an off-by-one in the idle time / missing period calculation combined
      with a coarse sched_clock() in the virtual machine.
      
      The target time for aggregation is advanced into the future on a fixed
      grid to prevent clock drift.  So when an aggregation runs after some idle
      period, we can not just set it to "now + psi_period", but have to
      calculate the downtime and advance the target time relative to itself.
      
      However, if the aggregator was disabled for exactly one psi_period (ns),
      we drop one idle period in the calculation because of a > where there
      should be a >=.
      In that case, next_update will be advanced from 'now - psi_period' to
      'now' when it should be moved to 'now + psi_period'.  The run finishes
      with last_update == next_update == sched_clock().
      
      With hardware clocks, this exact nanosecond match isn't likely in the
      first place; but if it does happen, the clock will still have moved on and
      the period non-zero by the time the worker runs.  A pointlessly short
      period, but besides the extra work, no harm no foul.  However, a slow
      sched_clock() like we have on VMs might not have advanced either by the
      time the worker runs again.  And when we calculate the elapsed period, the
      result, our pressure divisor, will be 0.  Ouch.
      
      Fix this by correctly handling the situation when the elapsed time between
      aggregation runs is precisely two periods, and advance the expiration
      timestamp one period into the future.
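
      The off-by-one can be reproduced in a small user-space sketch.  The names
      below (struct agg, advance(), PSI_PERIOD and its value) are hypothetical
      stand-ins chosen for illustration and are not the kernel's code; the
      `strict` flag selects the old > comparison, otherwise >= is used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the aggregation period, in nanoseconds. */
#define PSI_PERIOD 2000000000ULL

struct agg {
    uint64_t last_update;   /* time of the previous aggregation run */
    uint64_t next_update;   /* target time of the next run, on a fixed grid */
};

/*
 * Run one aggregation at time `now` and return the elapsed period that
 * would later serve as the pressure divisor.  With strict == true the
 * idle check uses '>' (the off-by-one); with strict == false it uses
 * '>=' (the corrected comparison).
 */
static uint64_t advance(struct agg *a, uint64_t now, bool strict)
{
    uint64_t expires = a->next_update;
    uint64_t idle, missed = 0, period;

    assert(now >= expires);     /* a run never fires before its target */
    idle = now - expires;

    if (strict ? idle > PSI_PERIOD : idle >= PSI_PERIOD)
        missed = idle / PSI_PERIOD;

    /* Advance the target relative to itself to stay on the grid. */
    a->next_update = expires + (1 + missed) * PSI_PERIOD;
    period = now - (a->last_update + missed * PSI_PERIOD);
    a->last_update = now;
    return period;
}
```

      Starting from last_update = 0 and next_update = PSI_PERIOD, a run at
      2 * PSI_PERIOD with the strict comparison leaves next_update equal to
      now, and a second run at that same timestamp returns a period of 0.
      With >= both runs return PSI_PERIOD.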
      
      Link: http://lkml.kernel.org/r/20190214193157.15788-1-hannes@cmpxchg.org
      Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
      Reported-by: Łukasz Siudut <lsiudut@fb.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: handle lru_add_drain_all for UP properly · 6ea183d6
      Michal Hocko authored
      Since for_each_cpu(cpu, mask) added by commit 2d3854a3
      ("cpumask: introduce new API, without changing anything") did not
      evaluate the mask argument if NR_CPUS == 1 due to CONFIG_SMP=n,
      lru_add_drain_all() is hitting WARN_ON() at __flush_work() added by
      commit 4d43d395 ("workqueue: Try to catch flush_work() without
      INIT_WORK().") by unconditionally calling flush_work() [1].
      
      Work around this issue by using a CONFIG_SMP=n specific lru_add_drain_all
      implementation.  There is no real need to defer the implementation to
      the workqueue as the draining is going to happen on the local cpu.  So
      alias lru_add_drain_all to lru_add_drain which does all the necessary
      work.
      
      [akpm@linux-foundation.org: fix various build warnings]
      [1] https://lkml.kernel.org/r/18a30387-6aa5-6123-e67c-57579ecc3f38@roeck-us.net
      Link: http://lkml.kernel.org/r/20190213124334.GH4525@dhcp22.suse.cz
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Debugged-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm, page_alloc: fix a division by zero error when boosting watermarks v2 · 94b3334c
      Mel Gorman authored
      Yury Norov reported that an arm64 KVM instance could not boot after
      v5.0-rc1 and that this could be addressed by reverting the patches

        1c30844d ("mm: reclaim small amounts of memory when an external
                  fragmentation event occurs")
        73444bc4 ("mm, page_alloc: do not wake kswapd with zone lock held")

      The problem is that a division by zero error is possible if boosting
      occurs very early in boot on a system with very little memory.  This
      patch avoids the division by zero error.
      
      Link: http://lkml.kernel.org/r/20190213143012.GT9565@techsingularity.net
      Fixes: 1c30844d ("mm: reclaim small amounts of memory when an external fragmentation event occurs")
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Reported-by: Yury Norov <yury.norov@gmail.com>
      Tested-by: Yury Norov <yury.norov@gmail.com>
      Tested-by: Will Deacon <will.deacon@arm.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/debug.c: fix __dump_page() for poisoned pages · 311ade0e
      Robin Murphy authored
      Evaluating page_mapping() on a poisoned page ends up dereferencing junk
      and making PF_POISONED_CHECK() considerably crashier than intended:
      
          Unable to handle kernel NULL pointer dereference at virtual address 0000000000000006
          Mem abort info:
            ESR = 0x96000005
            Exception class = DABT (current EL), IL = 32 bits
            SET = 0, FnV = 0
            EA = 0, S1PTW = 0
          Data abort info:
            ISV = 0, ISS = 0x00000005
            CM = 0, WnR = 0
          user pgtable: 4k pages, 39-bit VAs, pgdp = 00000000c2f6ac38
          [0000000000000006] pgd=0000000000000000, pud=0000000000000000
          Internal error: Oops: 96000005 [#1] PREEMPT SMP
          Modules linked in:
          CPU: 2 PID: 491 Comm: bash Not tainted 5.0.0-rc1+ #1
          Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Dec 17 2018
          pstate: 00000005 (nzcv daif -PAN -UAO)
          pc : page_mapping+0x18/0x118
          lr : __dump_page+0x1c/0x398
          Process bash (pid: 491, stack limit = 0x000000004ebd4ecd)
          Call trace:
           page_mapping+0x18/0x118
           __dump_page+0x1c/0x398
           dump_page+0xc/0x18
           remove_store+0xbc/0x120
           dev_attr_store+0x18/0x28
           sysfs_kf_write+0x40/0x50
           kernfs_fop_write+0x130/0x1d8
           __vfs_write+0x30/0x180
           vfs_write+0xb4/0x1a0
           ksys_write+0x60/0xd0
           __arm64_sys_write+0x18/0x20
           el0_svc_common+0x94/0xf8
           el0_svc_handler+0x68/0x70
           el0_svc+0x8/0xc
          Code: f9400401 d1000422 f240003f 9a801040 (f9400402)
          ---[ end trace cdb5eb5bf435cecb ]---
      
      Fix that by not inspecting the mapping until we've determined that it's
      likely to be valid.  Now the above condition still ends up stopping the
      kernel, but in the correct manner:
      
          page:ffffffbf20000000 is uninitialized and poisoned
          raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
          raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
          page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
          ------------[ cut here ]------------
          kernel BUG at ./include/linux/mm.h:1006!
          Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
          Modules linked in:
          CPU: 1 PID: 483 Comm: bash Not tainted 5.0.0-rc1+ #3
          Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Dec 17 2018
          pstate: 40000005 (nZcv daif -PAN -UAO)
          pc : remove_store+0xbc/0x120
          lr : remove_store+0xbc/0x120
          ...
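
      The ordering described above, checking for poison before touching the
      mapping, can be sketched in user space.  The types and helpers here
      (struct page with two fields, struct mapping, POISON_PATTERN,
      dump_page_mapping_id()) are hypothetical, heavily reduced stand-ins, and
      the all-ones pattern is an assumption based on the "raw: ffff..." lines in
      the dump.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-ins for the kernel structures. */
struct mapping {
    int id;
};

struct page {
    uintptr_t flags;
    struct mapping *mapping;
};

/* Assumed poison value: every bit set, as in the raw dump lines. */
#define POISON_PATTERN UINTPTR_MAX

static bool page_poisoned(const struct page *page)
{
    return page->flags == POISON_PATTERN;
}

/*
 * Return the mapping id to report, or -1 when the page is poisoned or
 * has no mapping and only a raw hex dump would be safe.  The poison
 * check comes first, so a junk mapping pointer is never dereferenced.
 */
static int dump_page_mapping_id(const struct page *page)
{
    assert(page != NULL);

    if (page_poisoned(page))
        return -1;
    if (page->mapping == NULL)
        return -1;
    return page->mapping->id;
}
```

      Swapping the two checks would dereference the junk pointer of a poisoned
      page, which is the failure mode shown in the first trace.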
      
      Link: http://lkml.kernel.org/r/03b53ee9d7e76cda4b9b5e1e31eea080db033396.1550071778.git.robin.murphy@arm.com
      Fixes: 1c6fb1d8 ("mm: print more information about mapping in __dump_page")
      Signed-off-by: Robin Murphy <robin.murphy@arm.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • proc, oom: do not report alien mms when setting oom_score_adj · b2b46993
      Michal Hocko authored
      Tetsuo has reported that creating thousands of processes sharing an MM
      without SIGHAND (aka alien threads) and setting
      /proc/<pid>/oom_score_adj will swamp the kernel log and take ages [1]
      to finish.  This is especially worrisome because all that printing is
      done under the RCU lock and can potentially trigger the RCU stall or
      softlockup detector.
      
      The primary reason for the printk was to catch potential users who might
      depend on the behavior prior to 44a70ade ("mm, oom_adj: make sure
      processes sharing mm have same view of oom_score_adj") but after more
      than 2 years without a single report I guess it is safe to simply remove
      the printk altogether.
      
      The next step should be moving oom_score_adj over to the mm struct and
      removing all the task crawling, as suggested by [2].
      
      [1] http://lkml.kernel.org/r/97fce864-6f75-bca5-14bc-12c9f890e740@i-love.sakura.ne.jp
      [2] http://lkml.kernel.org/r/20190117155159.GA4087@dhcp22.suse.cz
      
      Link: http://lkml.kernel.org/r/20190212102129.26288-1-mhocko@kernel.org
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Yong-Taek Lee <ytk.lee@samsung.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • slub: fix SLAB_CONSISTENCY_CHECKS + KASAN_SW_TAGS · 338cfaad
      Qian Cai authored
      Enabling SLUB_DEBUG's SLAB_CONSISTENCY_CHECKS with KASAN_SW_TAGS
      triggers endless false positives during boot, as shown below, because
      check_valid_pointer() checks tagged pointers, whose addresses are not
      valid within slab pages:
      
        BUG radix_tree_node (Tainted: G    B            ): Freelist Pointer check fails
        -----------------------------------------------------------------------------
      
        INFO: Slab objects=69 used=69 fp=0x          (null) flags=0x7ffffffc000200
        INFO: Object @offset=15060037153926966016 fp=0x
      
        Redzone: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 18 6b 06 00 08 80 ff d0  .........k......
        Object : 18 6b 06 00 08 80 ff d0 00 00 00 00 00 00 00 00  .k..............
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        Redzone: bb bb bb bb bb bb bb bb                          ........
        Padding: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a  ZZZZZZZZZZZZZZZZ
        CPU: 0 PID: 0 Comm: swapper/0 Tainted: G    B             5.0.0-rc5+ #18
        Call trace:
          dump_backtrace+0x0/0x450
          show_stack+0x20/0x2c
          __dump_stack+0x20/0x28
          dump_stack+0xa0/0xfc
          print_trailer+0x1bc/0x1d0
          object_err+0x40/0x50
          alloc_debug_processing+0xf0/0x19c
          ___slab_alloc+0x554/0x704
          kmem_cache_alloc+0x2f8/0x440
          radix_tree_node_alloc+0x90/0x2fc
          idr_get_free+0x1e8/0x6d0
          idr_alloc_u32+0x11c/0x2a4
          idr_alloc+0x74/0xe0
          worker_pool_assign_id+0x5c/0xbc
          workqueue_init_early+0x49c/0xd50
          start_kernel+0x52c/0xac4
        FIX radix_tree_node: Marking all objects used
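
      Why a tagged pointer fails a slab range check can be pictured with a
      short user-space sketch.  It assumes a tag stored in the top byte of a
      64-bit pointer with 0xff meaning "untagged"; reset_tag(), in_slab() and
      the slab base and size used with them are hypothetical illustrations and
      not the kernel's helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TAG_SHIFT  56
#define TAG_KERNEL 0xffULL  /* assumed "no tag" value for kernel pointers */

/* Replace the top byte with the untagged value. */
static uint64_t reset_tag(uint64_t addr)
{
    return addr | (TAG_KERNEL << TAG_SHIFT);
}

/*
 * Range check in the spirit of check_valid_pointer(): does addr lie
 * inside [base, base + size)?  With untag == true the tag is stripped
 * before comparing.
 */
static bool in_slab(uint64_t base, uint64_t size, uint64_t addr, bool untag)
{
    assert(size > 0);

    if (untag)
        addr = reset_tag(addr);
    return addr >= base && addr - base < size;
}
```

      Reading the bytes "18 6b 06 00 08 80 ff d0" from the object dump as a
      little-endian word gives 0xd0ff800800066b18, a pointer that appears to
      carry the tag 0xd0.  Against a hypothetical slab starting at
      0xffff800800066000, the check only succeeds once the tag is reset.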
      
      Link: http://lkml.kernel.org/r/20190209044128.3290-1-cai@lca.pw
      Signed-off-by: Qian Cai <cai@lca.pw>
      Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kasan, slub: fix more conflicts with CONFIG_SLAB_FREELIST_HARDENED · d36a63a9
      Andrey Konovalov authored
      When CONFIG_KASAN_SW_TAGS is enabled, ptr_addr might be tagged.  Normally,
      this doesn't cause any issues, as both set_freepointer() and
      get_freepointer() are called with a pointer with the same tag.  However,
      there are some issues with CONFIG_SLUB_DEBUG code.  For example, when
      __free_slab() iterates over objects in a cache, it passes untagged
      pointers to check_object().  check_object() in turn calls
      get_freepointer() with an untagged pointer, which causes the freepointer
      to be restored incorrectly.
      
      Add kasan_reset_tag to freelist_ptr(). Also add a detailed comment.
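
      The effect of a tag mismatch on a hardened freelist pointer can be shown
      with plain integers.  This is a sketch under assumptions: the XOR of
      pointer, per-cache secret and slot address follows the description of
      CONFIG_SLAB_FREELIST_HARDENED, while the signature, the `untag` switch,
      reset_tag() and all values used with them are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_SHIFT  56
#define TAG_KERNEL 0xffULL  /* assumed "no tag" value for kernel pointers */

/* Replace the top byte with the untagged value. */
static uint64_t reset_tag(uint64_t addr)
{
    return addr | (TAG_KERNEL << TAG_SHIFT);
}

/*
 * Obfuscate or recover a freelist pointer.  XOR is its own inverse, so
 * decoding only works when ptr_addr contributes the same bits on both
 * calls.  With untag != 0 the tag is stripped from ptr_addr first.
 */
static uint64_t freelist_ptr(uint64_t secret, uint64_t ptr,
                             uint64_t ptr_addr, int untag)
{
    /* Sketch assumption: freelist slots are 8-byte aligned. */
    assert((ptr_addr & (sizeof(uint64_t) - 1)) == 0);

    if (untag)
        ptr_addr = reset_tag(ptr_addr);
    return ptr ^ secret ^ ptr_addr;
}
```

      Encoding with a tagged slot address and decoding with the untagged one
      yields a different pointer unless the tag is reset inside the helper, in
      which case both calls see the same address and the round trip succeeds.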
      
      Link: http://lkml.kernel.org/r/bf858f26ef32eb7bd24c665755b3aee4bc58d0e4.1550103861.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Reported-by: Qian Cai <cai@lca.pw>
      Tested-by: Qian Cai <cai@lca.pw>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kasan, slub: fix conflicts with CONFIG_SLAB_FREELIST_HARDENED · 18e50661
      Andrey Konovalov authored
      CONFIG_SLAB_FREELIST_HARDENED hashes freelist pointer with the address of
      the object where the pointer gets stored.  With tag based KASAN we don't
      account for that when building freelist, as we call set_freepointer() with
      the first argument untagged.  This patch changes the code to properly
      propagate tags throughout the loop.
      
      Link: http://lkml.kernel.org/r/3df171559c52201376f246bf7ce3184fe21c1dc7.1549921721.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Reported-by: Qian Cai <cai@lca.pw>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Kostya Serebryany <kcc@google.com>
      Cc: Evgeniy Stepanov <eugenis@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kasan, slub: move kasan_poison_slab hook before page_address · a7101224
      Andrey Konovalov authored
      With tag based KASAN page_address() looks at the page flags to see whether
      the resulting pointer needs to have a tag set.  Since we don't want to set
      a tag when page_address() is called on SLAB pages, we call
      page_kasan_tag_reset() in kasan_poison_slab().  However in allocate_slab()
      page_address() is called before kasan_poison_slab().  Fix it by changing
      the order.
      
      [andreyknvl@google.com: fix compilation error when CONFIG_SLUB_DEBUG=n]
        Link: http://lkml.kernel.org/r/ac27cc0bbaeb414ed77bcd6671a877cf3546d56e.1550066133.git.andreyknvl@google.com
      Link: http://lkml.kernel.org/r/cd895d627465a3f1c712647072d17f10883be2a1.1549921721.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Evgeniy Stepanov <eugenis@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Kostya Serebryany <kcc@google.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Qian Cai <cai@lca.pw>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kmemleak: account for tagged pointers when calculating pointer range · a2f77575
      Andrey Konovalov authored
      kmemleak keeps two global variables, min_addr and max_addr, which store
      the range of valid (encountered by kmemleak) pointer values, which it
      later uses to speed up pointer lookup when scanning blocks.
      
      With tagged pointers this range will get bigger than it needs to be.  This
      patch makes kmemleak untag pointers before saving them to min_addr and
      max_addr and when performing a lookup.
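
      A small sketch shows how one tagged pointer widens the tracked range and
      how untagging keeps it tight.  Only min_addr and max_addr come from the
      text above; struct range, track(), reset_tag(), the `untag` switch and the
      top-byte tag layout with 0xff as "untagged" are assumptions made for the
      example.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_SHIFT  56
#define TAG_KERNEL 0xffULL  /* assumed "no tag" value for kernel pointers */

struct range {
    uint64_t min_addr;
    uint64_t max_addr;
};

/* Replace the top byte with the untagged value. */
static uint64_t reset_tag(uint64_t addr)
{
    return addr | (TAG_KERNEL << TAG_SHIFT);
}

/*
 * Widen the range so that it covers [ptr, ptr + size).  With untag != 0
 * the tag is stripped first, so differently tagged pointers into the
 * same region do not stretch the range.
 */
static void track(struct range *r, uint64_t ptr, uint64_t size, int untag)
{
    assert(size > 0);

    if (untag)
        ptr = reset_tag(ptr);
    if (ptr < r->min_addr)
        r->min_addr = ptr;
    if (ptr + size > r->max_addr)
        r->max_addr = ptr + size;
}
```

      With two 64-byte objects one page apart, one of them tagged 0xd0, the
      untagged range spans little more than that page, while the tagged variant
      drags min_addr down to the tagged value.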
      
      Link: http://lkml.kernel.org/r/16e887d442986ab87fe87a755815ad92fa431a5f.1550066133.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Tested-by: Qian Cai <cai@lca.pw>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Evgeniy Stepanov <eugenis@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Kostya Serebryany <kcc@google.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kasan, kmemleak: pass tagged pointers to kmemleak · 53128245
      Andrey Konovalov authored
      Right now we call kmemleak hooks before assigning tags to pointers in
      KASAN hooks.  As a result, when an object gets allocated, kmemleak sees a
      differently tagged pointer, compared to the one it sees when the object
      gets freed.  Fix it by calling KASAN hooks before kmemleak's ones.
      
      Link: http://lkml.kernel.org/r/cd825aa4897b0fc37d3316838993881daccbe9f5.1549921721.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Reported-by: Qian Cai <cai@lca.pw>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Evgeniy Stepanov <eugenis@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Kostya Serebryany <kcc@google.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kasan: fix assigning tags twice · e1db95be
      Andrey Konovalov authored
      When an object is kmalloc()'ed, two hooks are called: kasan_slab_alloc()
      and kasan_kmalloc().  Right now we assign a tag twice, once in each of the
      hooks.  Fix it by assigning a tag only in the former hook.
      
      Link: http://lkml.kernel.org/r/ce8c6431da735aa7ec051fd6497153df690eb021.1549921721.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Evgeniy Stepanov <eugenis@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Kostya Serebryany <kcc@google.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Qian Cai <cai@lca.pw>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • numa: change get_mempolicy() to use nr_node_ids instead of MAX_NUMNODES · 050c17f2
      Ralph Campbell authored
      The system call, get_mempolicy() [1], passes an unsigned long *nodemask
      pointer and an unsigned long maxnode argument which specifies the length
      of the user's nodemask array in bits (which is rounded up).  The manual
      page says that if the maxnode value is too small, get_mempolicy will
      return EINVAL but there is no system call to return this minimum value.
      To determine this value, some programs search /proc/<pid>/status for a
      line starting with "Mems_allowed:" and use the number of digits in the
      mask to determine the minimum value.  A recent change to the way this line
      is formatted [2] causes these programs to compute a value less than
      MAX_NUMNODES so get_mempolicy() returns EINVAL.
      
      Change get_mempolicy(), the older compat version of get_mempolicy(), and
      the copy_nodes_to_user() function to use nr_node_ids instead of
      MAX_NUMNODES, thus preserving the de facto method of computing the minimum
      size for the nodemask array and the maxnode argument.
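
      The digit-counting approach mentioned above might look roughly like the
      following.  This is a guess at what such programs do, not code taken from
      any of them; the function name maxnode_from_status_line() is hypothetical,
      and the sketch assumes the mask is printed as hexadecimal digits separated
      by commas.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Given one line of /proc/<pid>/status, return the number of mask bits
 * implied by the hex digits of a "Mems_allowed:" line, or 0 if the line
 * is some other field (including "Mems_allowed_list:").
 */
static unsigned long maxnode_from_status_line(const char *line)
{
    static const char key[] = "Mems_allowed:";
    unsigned long digits = 0;
    const char *p;

    assert(line != NULL);

    if (strncmp(line, key, sizeof(key) - 1) != 0)
        return 0;

    /* Count hex digits only; tabs and comma separators are skipped. */
    for (p = line + sizeof(key) - 1; *p != '\0' && *p != '\n'; p++)
        if (isxdigit((unsigned char)*p))
            digits++;

    return digits * 4;  /* each hex digit covers four node bits */
}
```

      A line such as "Mems_allowed:\t00000000,00000001" has sixteen digits and
      therefore yields 64, while a mask printed with a single digit yields 4.
      A caller sized this way passes a maxnode that tracks the printed mask
      width rather than MAX_NUMNODES, which is the case the change accommodates.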
      
      [1] http://man7.org/linux/man-pages/man2/get_mempolicy.2.html
      [2] https://lore.kernel.org/lkml/1545405631-6808-1-git-send-email-longman@redhat.com
      
      Link: http://lkml.kernel.org/r/20190211180245.22295-1-rcampbell@nvidia.com
      Fixes: 4fb8e5b89bcbbbb ("include/linux/nodemask.h: use nr_node_ids (not MAX_NUMNODES) in __nodemask_pr_numnodes()")
      Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
      Suggested-by: Alexander Duyck <alexander.duyck@gmail.com>
      Cc: Waiman Long <longman@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • revert "initramfs: cleanup incomplete rootfs" · a841c673
      Andrew Morton authored
      Revert ff1522bb ("initramfs: cleanup incomplete rootfs").
      
      Andy reports
      
      : This breaks my setup where I have U-boot provided more size of initramfs
      : than needed.  This allows a bit of flexibility to increase or decrease
      : initramfs compressed image without taking care of bootloader.  The proper
      : solution is to do this if we sure that we didn't get enough memory,
      : otherwise I can't consider the error fatal to clean up rootfs.
      
      Fixes: ff1522bb ("initramfs: cleanup incomplete rootfs")
      Reported-by: Andy Shevchenko <andy.shevchenko@gmail.com>
      Tested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
      Cc: David Engraf <david.engraf@sysgo.com>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Philippe Ombredanne <pombredanne@nexb.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 20 Feb, 2019 6 commits
    • Merge tag 'sound-5.0' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 2137397c
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "Here are a few last-minute fixes for 5.0.
      
        The most significant one is the OF-node refcount fix for ASoC
        simple-card, which could be triggered on many boards. Another fix for
        ASoC core is for the error handling in topology, while others are
        device-specific fixes for Samsung and HD-audio"
      
      * tag 'sound-5.0' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ASoC: simple-card: fixup refcount_t underflow
        ASoC: topology: free created components in tplg load error
        ALSA: hda/realtek: Disable PC beep in passthrough on alc285
        ALSA: hda/realtek - Headset microphone and internal speaker support for System76 oryp5
        ASoC: samsung: i2s: Fix prescaler setting for the secondary DAI
    • Merge tag 'pinctrl-v5.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · fb83f15e
      Linus Torvalds authored
      Pull pin control fixes from Linus Walleij:
       "Some final pin control fixes (I hope) to round off the v5.0 pin
        control development cycle.
      
        Only driver fixes, one for stable:
      
         - Meson8B fixup for the sdc pins
      
         - Fix SDC tile position for Qualcomm QCS404"
      
      * tag 'pinctrl-v5.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: meson: meson8b: fix the sdxc_a data 1..3 pins
        pinctrl: qcom: qcs404: Correct SDC tile
    • Merge tag 'gpio-v5.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio · c828c265
      Linus Torvalds authored
      Pull GPIO fixes from Linus Walleij:
       "Two GPIO fixes for the v5.0 series:
      
         - Per-instance irqchip on the MT7621
      
         - Avoid direction setting using pin control on MMP2"
      
      * tag 'gpio-v5.0-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
        gpio: pxa: avoid attempting to set pin direction via pinctrl on MMP2
        gpio: MT7621: use a per instance irq_chip structure
      c828c265
    • Linus Torvalds's avatar
      Merge tag 'mtd/fixes-for-5.0-rc8' of git://git.infradead.org/linux-mtd · 7d9d592c
      Linus Torvalds authored
      Pull MTD fixes from Boris Brezillon:
      
       - Don't add a digit to MTD-backed nvmem device names
      
       - Make sure powernv flash names are unique
      
      * tag 'mtd/fixes-for-5.0-rc8' of git://git.infradead.org/linux-mtd:
        mtd: powernv_flash: Fix device registration error
        mtd: Use mtd->name when registering nvmem device
      7d9d592c
    • Linus Torvalds's avatar
      Merge branch 'fixes-v5.1-rc6' of... · 1f5a018c
      Linus Torvalds authored
      Merge branch 'fixes-v5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
      
      Pull keys fixes from James Morris:
      
       - Handle quotas better, allowing full quota to be reached.
      
       - Fix the creation of shortcuts in the assoc_array internal
         representation when the index key needs to be an exact multiple of
         the machine word size.
      
       - Fix a dependency loop between the request_key construction record and
         the request_key authentication key. The construction record isn't
         really necessary and can be dispensed with.
      
       - Set the timestamp on a new key rather than leaving it as 0. This
         would ordinarily be fine - provided the system clock is never set to
         a time before 1970.
      
      * 'fixes-v5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
        keys: Timestamp new keys
        keys: Fix dependency loop between construction record and auth key
        assoc_array: Fix shortcut creation
        KEYS: allow reaching the keys quotas exactly
      1f5a018c
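The quota item above ("allowing full quota to be reached") describes a boundary condition in a limit check. The following is a minimal userspace sketch of that kind of off-by-one, not the kernel's code: `struct quota_user` and both helper functions are hypothetical names introduced here for illustration, and only the key-count limit is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-user accounting; field names are illustrative only. */
struct quota_user {
	unsigned int nkeys;   /* keys currently charged to the user */
	unsigned int maxkeys; /* configured limit */
};

/* Off-by-one variant: refuses the allocation that would land exactly on
 * the limit, so the last slot of the quota can never be used. */
bool may_alloc_strict(const struct quota_user *u)
{
	assert(u != NULL);
	return !(u->nkeys + 1 >= u->maxkeys);
}

/* Inclusive variant: usage may reach the limit, but not exceed it. */
bool may_alloc_inclusive(const struct quota_user *u)
{
	assert(u != NULL);
	return !(u->nkeys + 1 > u->maxkeys);
}
```

With 9 keys charged against a limit of 10, the strict check refuses the tenth key while the inclusive check allows it; both refuse an eleventh.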
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 40e196a9
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix suspend and resume in mt76x0u USB driver, from Stanislaw
          Gruszka.
      
       2) Missing memory barriers in xsk, from Magnus Karlsson.
      
       3) rhashtable fixes in mac80211 from Herbert Xu.
      
       4) 32-bit MIPS eBPF JIT fixes from Paul Burton.
      
       5) Fix for_each_netdev_feature() on big endian, from Hauke Mehrtens.
      
       6) GSO validation fixes from Willem de Bruijn.
      
       7) Endianness fix for dwmac4 timestamp handling, from Alexandre Torgue.
      
       8) More strict checks in tcp_v4_err(), from Eric Dumazet.
      
       9) af_alg_release should NULL out the sk after the sock_put(), from Mao
          Wenan.
      
      10) Missing unlock in mac80211 mesh error path, from Wei Yongjun.
      
      11) Missing device put in hns driver, from Salil Mehta.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits)
        sky2: Increase D3 delay again
        vhost: correctly check the return value of translate_desc() in log_used()
        net: netcp: Fix ethss driver probe issue
        net: hns: Fixes the missing put_device in positive leg for roce reset
        net: stmmac: Fix a race in EEE enable callback
        qed: Fix iWARP syn packet mac address validation.
        qed: Fix iWARP buffer size provided for syn packet processing.
        r8152: Add support for MAC address pass through on RTL8153-BD
        mac80211: mesh: fix missing unlock on error in table_path_del()
        net/mlx4_en: fix spelling mistake: "quiting" -> "quitting"
        net: crypto set sk to NULL when af_alg_release.
        net: Do not allocate page fragments that are not skb aligned
        mm: Use fixed constant in page_frag_alloc instead of size + 1
        tcp: tcp_v4_err() should be more careful
        tcp: clear icsk_backoff in tcp_write_queue_purge()
        net: mv643xx_eth: disable clk on error path in mv643xx_eth_shared_probe()
        qmi_wwan: apply SET_DTR quirk to Sierra WP7607
        net: stmmac: handle endianness in dwmac4_get_timestamp
        doc: Mention MSG_ZEROCOPY implementation for UDP
        mlxsw: __mlxsw_sp_port_headroom_set(): Fix a use of local variable
        ...
      40e196a9
  3. 19 Feb, 2019 14 commits
  4. 18 Feb, 2019 6 commits
    • Colin Ian King's avatar
      net/mlx4_en: fix spelling mistake: "quiting" -> "quitting" · 21d2cb49
      Colin Ian King authored
      There is a spelling mistake in an en_err error message. Fix it.
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      21d2cb49
    • Mao Wenan's avatar
      net: crypto set sk to NULL when af_alg_release. · 9060cb71
      Mao Wenan authored
      KASAN found a use-after-free in sockfs_setattr().
      The existing commit 6d8c50dc ("socket: close race condition between sock_close()
      and sockfs_setattr()") fixed a similar issue, but it overlooked that the
      crypto module does not set sk to NULL in af_alg_release().
      
      KASAN report details are below:
      BUG: KASAN: use-after-free in sockfs_setattr+0x120/0x150
      Write of size 4 at addr ffff88837b956128 by task syz-executor0/4186
      
      CPU: 2 PID: 4186 Comm: syz-executor0 Not tainted xxx + #1
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
      1.10.2-1ubuntu1 04/01/2014
      Call Trace:
       dump_stack+0xca/0x13e
       print_address_description+0x79/0x330
       ? vprintk_func+0x5e/0xf0
       kasan_report+0x18a/0x2e0
       ? sockfs_setattr+0x120/0x150
       sockfs_setattr+0x120/0x150
       ? sock_register+0x2d0/0x2d0
       notify_change+0x90c/0xd40
       ? chown_common+0x2ef/0x510
       chown_common+0x2ef/0x510
       ? chmod_common+0x3b0/0x3b0
       ? __lock_is_held+0xbc/0x160
       ? __sb_start_write+0x13d/0x2b0
       ? __mnt_want_write+0x19a/0x250
       do_fchownat+0x15c/0x190
       ? __ia32_sys_chmod+0x80/0x80
       ? trace_hardirqs_on_thunk+0x1a/0x1c
       __x64_sys_fchownat+0xbf/0x160
       ? lockdep_hardirqs_on+0x39a/0x5e0
       do_syscall_64+0xc8/0x580
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x462589
      Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89
      f7 48 89 d6 48 89
      ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3
      48 c7 c1 bc ff ff
      ff f7 d8 64 89 01 48
      RSP: 002b:00007fb4b2c83c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000104
      RAX: ffffffffffffffda RBX: 000000000072bfa0 RCX: 0000000000462589
      RDX: 0000000000000000 RSI: 00000000200000c0 RDI: 0000000000000007
      RBP: 0000000000000005 R08: 0000000000001000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007fb4b2c846bc
      R13: 00000000004bc733 R14: 00000000006f5138 R15: 00000000ffffffff
      
      Allocated by task 4185:
       kasan_kmalloc+0xa0/0xd0
       __kmalloc+0x14a/0x350
       sk_prot_alloc+0xf6/0x290
       sk_alloc+0x3d/0xc00
       af_alg_accept+0x9e/0x670
       hash_accept+0x4a3/0x650
       __sys_accept4+0x306/0x5c0
       __x64_sys_accept4+0x98/0x100
       do_syscall_64+0xc8/0x580
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Freed by task 4184:
       __kasan_slab_free+0x12e/0x180
       kfree+0xeb/0x2f0
       __sk_destruct+0x4e6/0x6a0
       sk_destruct+0x48/0x70
       __sk_free+0xa9/0x270
       sk_free+0x2a/0x30
       af_alg_release+0x5c/0x70
       __sock_release+0xd3/0x280
       sock_close+0x1a/0x20
       __fput+0x27f/0x7f0
       task_work_run+0x136/0x1b0
       exit_to_usermode_loop+0x1a7/0x1d0
       do_syscall_64+0x461/0x580
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Syzkaller reproducer:
      r0 = perf_event_open(&(0x7f0000000000)={0x0, 0x70, 0x0, 0x0, 0x0, 0x0,
      0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
      0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
      0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, @perf_config_ext}, 0x0, 0x0,
      0xffffffffffffffff, 0x0)
      r1 = socket$alg(0x26, 0x5, 0x0)
      getrusage(0x0, 0x0)
      bind(r1, &(0x7f00000001c0)=@alg={0x26, 'hash\x00', 0x0, 0x0,
      'sha256-ssse3\x00'}, 0x80)
      r2 = accept(r1, 0x0, 0x0)
      r3 = accept4$unix(r2, 0x0, 0x0, 0x0)
      r4 = dup3(r3, r0, 0x0)
      fchownat(r4, &(0x7f00000000c0)='\x00', 0x0, 0x0, 0x1000)
      
      Fixes: 6d8c50dc ("socket: close race condition between sock_close() and sockfs_setattr()")
      Signed-off-by: Mao Wenan <maowenan@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9060cb71
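The commit above describes a release path that frees the sock but leaves the socket's `sk` pointer dangling, so a later attribute update writes into freed memory. The following is a userspace sketch of the "drop the reference, then clear the pointer" pattern, not the kernel code: every `mock_*` type and function is a hypothetical stand-in. It assumes the later accessor checks the pointer for NULL before using it, as the reference to commit 6d8c50dc suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins for struct sock / struct socket; not kernel types. */
struct mock_sock { int refcnt; int uid; };
struct mock_socket { struct mock_sock *sk; };

/* Allocate a sock holding a single reference (NULL on allocation failure). */
struct mock_sock *mock_sock_alloc(void)
{
	struct mock_sock *sk = calloc(1, sizeof(*sk));

	if (sk)
		sk->refcnt = 1;
	return sk;
}

/* Drop one reference and free the object when the count reaches zero. */
void mock_sock_put(struct mock_sock *sk)
{
	assert(sk->refcnt > 0);
	if (--sk->refcnt == 0)
		free(sk);
}

/* Release handler: drop the reference, then clear the stale pointer so
 * that later users of the socket cannot reach freed memory. */
int mock_release(struct mock_socket *sock)
{
	if (sock->sk) {
		mock_sock_put(sock->sk);
		sock->sk = NULL;
	}
	return 0;
}

/* Later accessor modeled on a setattr-style path: it relies on sk being
 * NULL once the socket has been released. */
int mock_setattr(struct mock_socket *sock, int uid)
{
	if (!sock->sk)
		return -1; /* socket already released */
	sock->sk->uid = uid;
	return 0;
}
```

Without the `sock->sk = NULL` assignment, the NULL check in `mock_setattr()` would pass after release and the store to `uid` would hit freed memory, which is the shape of the KASAN report. Clearing the pointer also makes a second `mock_release()` call harmless.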
    • Kuninori Morimoto's avatar
      ASoC: simple-card: fixup refcount_t underflow · 19dd0777
      Kuninori Morimoto authored
      Commit da215354 ("ASoC: simple-card: merge simple-scu-card")
      merged simple-card and simple-scu-card, which introduced a refcount
      underflow bug. This patch fixes it.
      Without this patch, we get the error below.
      
      	OF: ERROR: Bad of_node_put() on /sound
      	CPU: 3 PID: 237 Comm: kworker/3:1 Not tainted 5.0.0-rc6+ #1514
      	Hardware name: Renesas H3ULCB Kingfisher board based on r8a7795 ES2.0+ (DT)
      	Workqueue: events deferred_probe_work_func
      	Call trace:
      	 dump_backtrace+0x0/0x150
      	 show_stack+0x24/0x30
      	 dump_stack+0xb0/0xec
      	 of_node_release+0xd0/0xd8
      	 kobject_put+0x74/0xe8
      	 of_node_put+0x24/0x30
      	 __of_get_next_child+0x50/0x70
      	 of_get_next_child+0x40/0x68
      	 asoc_simple_card_probe+0x604/0x730
      	 platform_drv_probe+0x58/0xa8
      	 ...
      Reported-by: Vicente Bergas <vicencb@gmail.com>
      Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      19dd0777
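In the call trace above, the bad put comes from the child iterator (`of_get_next_child()`) dropping its reference on the previous node. The following userspace sketch shows one way such an underflow can arise, namely a loop body that also drops the reference the iterator owns. It is an assumption for illustration, not a reconstruction of the actual simple-card bug, and all `mock_*` names and `walk_children()` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a device tree node; the tree itself holds one reference. */
struct mock_node {
	int refcount;
	struct mock_node *sibling;
};

static int bad_puts; /* puts that dropped the count of a linked node to 0 */

void mock_get(struct mock_node *n)
{
	if (n)
		n->refcount++;
}

void mock_put(struct mock_node *n)
{
	if (!n)
		return;
	assert(n->refcount > 0);
	if (--n->refcount == 0)
		bad_puts++; /* node is still linked into the tree */
}

/* Iterator step: take a reference on the next child and drop the one
 * held on the previous child, in the style of of_get_next_child(). */
struct mock_node *mock_next_child(struct mock_node *first,
				  struct mock_node *prev)
{
	struct mock_node *next = prev ? prev->sibling : first;

	mock_get(next);
	mock_put(prev);
	return next;
}

/* Walk all children and return the number of bad puts. With extra_put
 * set, the loop body wrongly drops the reference the iterator owns. */
int walk_children(struct mock_node *first, int extra_put)
{
	struct mock_node *np = NULL;

	bad_puts = 0;
	while ((np = mock_next_child(first, np)) != NULL) {
		if (extra_put)
			mock_put(np); /* unbalanced: the iterator puts np again */
	}
	return bad_puts;
}
```

In this model a balanced walk leaves every node with the single reference held by the tree, while the unbalanced walk drives each child's count to zero on the iterator's own put, which matches where the trace reports the error.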
    • Bard liao's avatar
      ASoC: topology: free created components in tplg load error · 304017d3
      Bard liao authored
      Topology resources are no longer needed if any element failed to load.
      Signed-off-by: Bard liao <yung-chuan.liao@linux.intel.com>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      304017d3
    • Linus Torvalds's avatar
      Merge tag 'mailbox-fixes-v5.0-rc7' of... · 301e3610
      Linus Torvalds authored
      Merge tag 'mailbox-fixes-v5.0-rc7' of git://git.linaro.org/landing-teams/working/fujitsu/integration
      
      Pull mailbox fixes from Jassi Brar:
      
       - API: Fix build breakage by exporting the function mbox_flush
      
       - BRCM: Fix FlexRM ring flush timeout issue
      
      * tag 'mailbox-fixes-v5.0-rc7' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
        mailbox: bcm-flexrm-mailbox: Fix FlexRM ring flush timeout issue
        mailbox: Export mbox_flush()
      301e3610
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm · 3ddc14e2
      Linus Torvalds authored
      Pull ARM fixes from Russell King:
       "A few ARM fixes:
      
         - Dietmar Eggemann noticed an issue with IRQ migration during CPU
           hotplug stress testing.
      
         - Mathieu Desnoyers noticed that a previous fix broke optimised
           kprobes.
      
         - Robin Murphy noticed a case where we were not clearing the dma_ops"
      
      * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
        ARM: 8835/1: dma-mapping: Clear DMA ops on teardown
        ARM: 8834/1: Fix: kprobes: optimized kprobes illegal instruction
        ARM: 8824/1: fix a migrating irq bug when hotplug cpu
      3ddc14e2