1. 25 Mar, 2021 24 commits
    • net: dsa: only unset VLAN filtering when last port leaves last VLAN-aware bridge · 479dc497
      Vladimir Oltean authored
      DSA is aware of switches with global VLAN filtering since the blamed
      commit, but it makes a bad decision when multiple bridges are spanning
      the same switch:
      
      ip link add br0 type bridge vlan_filtering 1
      ip link add br1 type bridge vlan_filtering 1
      ip link set swp2 master br0
      ip link set swp3 master br0
      ip link set swp4 master br1
      ip link set swp5 master br1
      ip link set swp5 nomaster
      ip link set swp4 nomaster
      [138665.939930] sja1105 spi0.1: port 3: dsa_core: VLAN filtering is a global setting
      [138665.947514] DSA: failed to notify DSA_NOTIFIER_BRIDGE_LEAVE
      
      When all ports leave br1, DSA blindly attempts to disable VLAN filtering
      on the switch, ignoring the fact that br0 still exists and is VLAN-aware
      too. It fails while doing that.
      
      This patch checks whether any port exists at all and is under a
      VLAN-aware bridge.
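      The intended condition can be sketched as a toy Python model. This is
      not the kernel code: the Bridge class and both function names are
      hypothetical, and only the decision "may VLAN filtering be unset?" is
      modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Bridge:
    """Toy stand-in for a Linux bridge (hypothetical model, not kernel code)."""
    name: str
    vlan_filtering: bool
    ports: set = field(default_factory=set)


def may_unset_vlan_filtering(bridges):
    """Return True only if no port is left under any VLAN-aware bridge.

    With a switch-global VLAN filtering setting, filtering may be turned
    off only when the last port has left the last VLAN-aware bridge.
    """
    return not any(br.vlan_filtering and br.ports for br in bridges)


def leave_bridge(bridges, bridge, port):
    """Remove `port` from `bridge`; report whether filtering may be unset."""
    bridge.ports.discard(port)
    return may_unset_vlan_filtering(bridges)


# Replay the sequence from the commit message.
br0 = Bridge("br0", True, {"swp2", "swp3"})
br1 = Bridge("br1", True, {"swp4", "swp5"})
bridges = [br0, br1]

after_swp5 = leave_bridge(bridges, br1, "swp5")  # br1 still holds swp4
after_swp4 = leave_bridge(bridges, br1, "swp4")  # br0 still holds swp2, swp3
```

      In this model both calls report that filtering must stay enabled,
      because br0 is still VLAN-aware and still has ports.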
      
      Fixes: d371b7c9 ("net: dsa: Unset vlan_filtering when ports leave the bridge")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Tested-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'akpm' (patches from Andrew) · 00232240
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "14 patches.
      
        Subsystems affected by this patch series: mm (hugetlb, kasan, gup,
        selftests, z3fold, kfence, memblock, and highmem), squashfs, ia64,
        gcov, and mailmap"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        mailmap: update Andrey Konovalov's email address
        mm/highmem: fix CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP
        mm: memblock: fix section mismatch warning again
        kfence: make compatible with kmemleak
        gcov: fix clang-11+ support
        ia64: fix format strings for err_inject
        ia64: mca: allocate early mca with GFP_ATOMIC
        squashfs: fix xattr id and id lookup sanity checks
        squashfs: fix inode lookup sanity checks
        z3fold: prevent reclaim/free race for headless pages
        selftests/vm: fix out-of-tree build
        mm/mmu_notifiers: ensure range_end() is paired with range_start()
        kasan: fix per-page tags for non-page_alloc pages
        hugetlb_cgroup: fix imbalanced css_get and css_put pair for shared mappings
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · 2ba9bea2
      Linus Torvalds authored
      Pull rdma fixes from Jason Gunthorpe:
       "Not much going on, just some small bug fixes:
      
         - Typo causing a regression in mlx5 devx
      
         - Regression in the recent hns rework causing the HW to get out of
           sync
      
         - Long-standing cxgb4 adaptor crash when destroying cm ids"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
        RDMA/cxgb4: Fix adapter LE hash errors while destroying ipv6 listening server
        RDMA/hns: Fix bug during CMDQ initialization
        RDMA/mlx5: Fix typo in destroy_mkey inbox
    • Merge tag 'mfd-fixes-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd · 58e4b9de
      Linus Torvalds authored
      Pull mfd fix from Lee Jones:
       "Unconstify editable placeholder structures"
      
      * tag 'mfd-fixes-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
        mfd: intel_quark_i2c_gpio: Revert "Constify static struct resources"
    • Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 43f0b562
      Linus Torvalds authored
      Pull arm64 fixes from Will Deacon:
       "Minor fixes all over, ranging from typos to tests to errata
        workarounds:
      
         - Fix possible memory hotplug failure with KASLR
      
         - Fix FFR value in SVE kselftest
      
         - Fix backtraces reported in /proc/$pid/stack
      
         - Disable broken CnP implementation on NVIDIA Carmel
      
         - Typo fixes and ACPI documentation clarification
      
         - Fix some W=1 warnings"
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: kernel: disable CNP on Carmel
        arm64/process.c: fix Wmissing-prototypes build warnings
        kselftest/arm64: sve: Do not use non-canonical FFR register value
        arm64: mm: correct the inside linear map range during hotplug check
        arm64: kdump: update ppos when reading elfcorehdr
        arm64: cpuinfo: Fix a typo
        Documentation: arm64/acpi : clarify arm64 support of IBFT
        arm64: stacktrace: don't trace arch_stack_walk()
        arm64: csum: cast to the proper type
    • mailmap: update the email address for Chris Chiu · 7aae5432
      Chris Chiu authored
      Redirect my older email addresses in the git logs.

      Signed-off-by: Chris Chiu <chris.chiu@canonical.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mailmap: update Andrey Konovalov's email address
      Andrey Konovalov authored
    • mm/highmem: fix CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP · 487cfade
      Ira Weiny authored
      The kernel test robot found that __kmap_local_sched_out() was not
      correctly skipping the guard pages when DEBUG_KMAP_LOCAL_FORCE_MAP was
      set [1].  This was because the DEBUG_HIGHMEM check was being used instead.
      
      Change the configuration check to be correct.
      
      [1] https://lore.kernel.org/lkml/20210304083825.GB17830@xsang-OptiPlex-9020/
      
      Link: https://lkml.kernel.org/r/20210318230657.1497881-1-ira.weiny@intel.com
      Fixes: 0e91a0c6 ("mm/highmem: Provide CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP")
      Signed-off-by: Ira Weiny <ira.weiny@intel.com>
      Reported-by: kernel test robot <oliver.sang@intel.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Oliver Sang <oliver.sang@intel.com>
      Cc: Chaitanya Kulkarni <Chaitanya.Kulkarni@wdc.com>
      Cc: David Sterba <dsterba@suse.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: memblock: fix section mismatch warning again · a024b7c2
      Mike Rapoport authored
      Commit 34dc2efb ("memblock: fix section mismatch warning") marked
      memblock_bottom_up() and memblock_set_bottom_up() as __init, but they
      could be referenced from non-init functions like
      memblock_find_in_range_node() on architectures that enable
      CONFIG_ARCH_KEEP_MEMBLOCK.
      
      For such builds kernel test robot reports:
      
         WARNING: modpost: vmlinux.o(.text+0x74fea4): Section mismatch in reference from the function memblock_find_in_range_node() to the function .init.text:memblock_bottom_up()
         The function memblock_find_in_range_node() references the function __init memblock_bottom_up().
         This is often because memblock_find_in_range_node lacks a __init  annotation or the annotation of memblock_bottom_up is wrong.
      
      Replace __init annotations with __init_memblock annotations so that the
      appropriate section will be selected depending on
      CONFIG_ARCH_KEEP_MEMBLOCK.
      
      Link: https://lore.kernel.org/lkml/202103160133.UzhgY0wt-lkp@intel.com
      Link: https://lkml.kernel.org/r/20210316171347.14084-1-rppt@kernel.org
      Fixes: 34dc2efb ("memblock: fix section mismatch warning")
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Reviewed-by: Arnd Bergmann <arnd@arndb.de>
      Reported-by: kernel test robot <lkp@intel.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Acked-by: Nick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kfence: make compatible with kmemleak · 95511580
      Marco Elver authored
      Because memblock allocations are registered with kmemleak, the KFENCE
      pool was seen by kmemleak as one large object.  Later allocations
      through kfence_alloc() that were registered with kmemleak via
      slab_post_alloc_hook() would then overlap and trigger a warning.
      Therefore, once the pool is initialized, we can remove (free) it from
      kmemleak again, since it should be treated as allocator-internal and be
      seen as "free memory".
      
      The second problem is that kmemleak is passed the rounded size, and not
      the originally requested size, which is also the size of KFENCE objects.
      To avoid kmemleak scanning past the end of an object and triggering a
      KFENCE out-of-bounds error, fix the size if it is a KFENCE object.
      
      For simplicity, to avoid a call to kfence_ksize() in
      slab_post_alloc_hook() (and avoid new IS_ENABLED(CONFIG_DEBUG_KMEMLEAK)
      guard), just call kfence_ksize() in mm/kmemleak.c:create_object().
      
      Link: https://lkml.kernel.org/r/20210317084740.3099921-1-elver@google.com
      Signed-off-by: Marco Elver <elver@google.com>
      Reported-by: Luis Henriques <lhenriques@suse.de>
      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Tested-by: Luis Henriques <lhenriques@suse.de>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Jann Horn <jannh@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • gcov: fix clang-11+ support · 60bcf728
      Nick Desaulniers authored
      LLVM changed the expected function signatures for llvm_gcda_start_file()
      and llvm_gcda_emit_function() in the clang-11 release.  Users of
      clang-11 or newer may have noticed their kernels failing to boot due to
      a panic when enabling CONFIG_GCOV_KERNEL=y + CONFIG_GCOV_PROFILE_ALL=y.
      Fix up the function signatures so calling these functions doesn't panic
      the kernel.
      
      Link: https://reviews.llvm.org/rGcdd683b516d147925212724b09ec6fb792a40041
      Link: https://reviews.llvm.org/rG13a633b438b6500ecad9e4f936ebadf3411d0f44
      Link: https://lkml.kernel.org/r/20210312224132.3413602-2-ndesaulniers@google.com
      Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
      Reported-by: Prasad Sodagudi <psodagud@quicinc.com>
      Suggested-by: Nathan Chancellor <nathan@kernel.org>
      Reviewed-by: Fangrui Song <maskray@google.com>
      Tested-by: Nathan Chancellor <nathan@kernel.org>
      Acked-by: Peter Oberparleiter <oberpar@linux.ibm.com>
      Reviewed-by: Nathan Chancellor <nathan@kernel.org>
      Cc: <stable@vger.kernel.org>	[5.4+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ia64: fix format strings for err_inject · 95d44a47
      Sergei Trofimovich authored
      Fix warning with %lx / u64 mismatch:
      
        arch/ia64/kernel/err_inject.c: In function 'show_resources':
        arch/ia64/kernel/err_inject.c:62:22: warning:
          format '%lx' expects argument of type 'long unsigned int',
          but argument 3 has type 'u64' {aka 'long long unsigned int'}
           62 |  return sprintf(buf, "%lx", name[cpu]);   \
              |                      ^~~~~~~
      
      Link: https://lkml.kernel.org/r/20210313104312.1548232-1-slyfox@gentoo.org
      Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ia64: mca: allocate early mca with GFP_ATOMIC · f2a419cf
      Sergei Trofimovich authored
      The sleep warning happens during early boot, right at secondary CPU
      activation:
      
          smp: Bringing up secondary CPUs ...
          BUG: sleeping function called from invalid context at mm/page_alloc.c:4942
          in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/1
          CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.12.0-rc2-00007-g79e228d0b611-dirty #99
          ..
          Call Trace:
            show_stack+0x90/0xc0
            dump_stack+0x150/0x1c0
            ___might_sleep+0x1c0/0x2a0
            __might_sleep+0xa0/0x160
            __alloc_pages_nodemask+0x1a0/0x600
            alloc_page_interleave+0x30/0x1c0
            alloc_pages_current+0x2c0/0x340
            __get_free_pages+0x30/0xa0
            ia64_mca_cpu_init+0x2d0/0x3a0
            cpu_init+0x8b0/0x1440
            start_secondary+0x60/0x700
            start_ap+0x750/0x780
          Fixed BSP b0 value from CPU 1
      
      As I understand it, interrupts are not enabled yet and the system has a
      lot of memory.  There is little chance of sleeping, so the switch to
      GFP_ATOMIC should be a no-op.
      
      Link: https://lkml.kernel.org/r/20210315085045.204414-1-slyfox@gentoo.org
      Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • squashfs: fix xattr id and id lookup sanity checks · 8b44ca2b
      Phillip Lougher authored
      The checks for maximum metadata block size are missing
      SQUASHFS_BLOCK_OFFSET (the two byte length count).
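      A minimal sketch of the corrected bound, written as a Python model
      rather than the kernel's C. The function name is hypothetical, and the
      constant values (8192 and 2) are assumptions about the kernel headers
      that should be checked against the source tree.

```python
SQUASHFS_METADATA_SIZE = 8192  # assumed maximum payload of one metadata block
SQUASHFS_BLOCK_OFFSET = 2      # the two-byte length count before the payload


def metadata_block_length_ok(start, end):
    """Check the on-disk distance between two consecutive metadata blocks.

    An uncompressed block occupies the full payload plus its length
    header, so the upper bound has to include SQUASHFS_BLOCK_OFFSET.
    """
    length = end - start
    return 0 < length <= SQUASHFS_METADATA_SIZE + SQUASHFS_BLOCK_OFFSET
```

      Under these assumptions a block spanning 8194 bytes passes the check,
      whereas a bound of SQUASHFS_METADATA_SIZE alone would reject it.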
      
      Link: https://lkml.kernel.org/r/2069685113.2081245.1614583677427@webmail.123-reg.co.uk
      Fixes: f37aa4c7 ("squashfs: add more sanity checks in id lookup")
      Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
      Cc: Sean Nyekjaer <sean@geanix.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • squashfs: fix inode lookup sanity checks · c1b20283
      Sean Nyekjaer authored
      When mounting a squashfs image created without inode compression, it fails
      with: "unable to read inode lookup table"

      It turns out that the BLOCK_OFFSET is missing when checking the
      SQUASHFS_METADATA_SIZE against the actual size.
      
      Link: https://lkml.kernel.org/r/20210226092903.1473545-1-sean@geanix.com
      Fixes: eabac19e ("squashfs: add more sanity checks in inode lookup")
      Signed-off-by: Sean Nyekjaer <sean@geanix.com>
      Acked-by: Phillip Lougher <phillip@squashfs.org.uk>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • z3fold: prevent reclaim/free race for headless pages · 6d679578
      Thomas Hebb authored
      Commit ca0246bb ("z3fold: fix possible reclaim races") introduced
      the PAGE_CLAIMED flag "to avoid racing on a z3fold 'headless' page
      release." By atomically testing and setting the bit in each of
      z3fold_free() and z3fold_reclaim_page(), a double-free was avoided.
      
      However, commit dcf5aedb ("z3fold: stricter locking and more careful
      reclaim") appears to have unintentionally broken this behavior by moving
      the PAGE_CLAIMED check in z3fold_reclaim_page() to after the page lock
      gets taken, which only happens for non-headless pages.  For headless
      pages, the check is now skipped entirely and races can occur again.
      
      I have observed such a race on my system:
      
          page:00000000ffbd76b7 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x165316
          flags: 0x2ffff0000000000()
          raw: 02ffff0000000000 ffffea0004535f48 ffff8881d553a170 0000000000000000
          raw: 0000000000000000 0000000000000011 00000000ffffffff 0000000000000000
          page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0)
          ------------[ cut here ]------------
          kernel BUG at include/linux/mm.h:707!
          invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
          CPU: 2 PID: 291928 Comm: kworker/2:0 Tainted: G    B             5.10.7-arch1-1-kasan #1
          Hardware name: Gigabyte Technology Co., Ltd. H97N-WIFI/H97N-WIFI, BIOS F9b 03/03/2016
          Workqueue: zswap-shrink shrink_worker
          RIP: 0010:__free_pages+0x10a/0x130
          Code: c1 e7 06 48 01 ef 45 85 e4 74 d1 44 89 e6 31 d2 41 83 ec 01 e8 e7 b0 ff ff eb da 48 c7 c6 e0 32 91 88 48 89 ef e8 a6 89 f8 ff <0f> 0b 4c 89 e7 e8 fc 79 07 00 e9 33 ff ff ff 48 89 ef e8 ff 79 07
          RSP: 0000:ffff88819a2ffb98 EFLAGS: 00010296
          RAX: 0000000000000000 RBX: ffffea000594c5a8 RCX: 0000000000000000
          RDX: 1ffffd4000b298b7 RSI: 0000000000000000 RDI: ffffea000594c5b8
          RBP: ffffea000594c580 R08: 000000000000003e R09: ffff8881d5520bbb
          R10: ffffed103aaa4177 R11: 0000000000000001 R12: ffffea000594c5b4
          R13: 0000000000000000 R14: ffff888165316000 R15: ffffea000594c588
          FS:  0000000000000000(0000) GS:ffff8881d5500000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 00007f7c8c3654d8 CR3: 0000000103f42004 CR4: 00000000001706e0
          Call Trace:
           z3fold_zpool_shrink+0x9b6/0x1240
           shrink_worker+0x35/0x90
           process_one_work+0x70c/0x1210
           worker_thread+0x539/0x1200
           kthread+0x330/0x400
           ret_from_fork+0x22/0x30
          Modules linked in: rfcomm ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ccm algif_aead des_generic libdes ecb algif_skcipher cmac bnep md4 algif_hash af_alg vfat fat intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel iwlmvm hid_logitech_hidpp kvm at24 mac80211 snd_hda_codec_realtek iTCO_wdt snd_hda_codec_generic intel_pmc_bxt snd_hda_codec_hdmi ledtrig_audio iTCO_vendor_support mei_wdt mei_hdcp snd_hda_intel snd_intel_dspcfg libarc4 soundwire_intel irqbypass iwlwifi soundwire_generic_allocation rapl soundwire_cadence intel_cstate snd_hda_codec intel_uncore btusb joydev mousedev snd_usb_audio pcspkr btrtl uvcvideo nouveau btbcm i2c_i801 btintel snd_hda_core videobuf2_vmalloc i2c_smbus snd_usbmidi_lib videobuf2_memops bluetooth snd_hwdep soundwire_bus snd_soc_rt5640 videobuf2_v4l2 cfg80211 snd_soc_rl6231 videobuf2_common snd_rawmidi lpc_ich alx videodev mdio snd_seq_device snd_soc_core mc ecdh_generic mxm_wmi mei_me
           hid_logitech_dj wmi snd_compress e1000e ac97_bus mei ttm rfkill snd_pcm_dmaengine ecc snd_pcm snd_timer snd soundcore mac_hid acpi_pad pkcs8_key_parser it87 hwmon_vid crypto_user fuse ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 dm_crypt cbc encrypted_keys trusted tpm rng_core usbhid dm_mod crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper xhci_pci xhci_pci_renesas i915 video intel_gtt i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm agpgart
          ---[ end trace 126d646fc3dc0ad8 ]---
      
      To fix the issue, re-add the earlier test and set in the case where we
      have a headless page.
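      The test-and-set idea can be illustrated with a small Python model.
      It is only a sketch: HeadlessPage and try_release are hypothetical
      names, a lock stands in for the atomic bit operation, and the two
      threads stand in for the free and reclaim paths.

```python
import threading


class HeadlessPage:
    """Toy model of a headless z3fold page (hypothetical, not kernel code)."""

    def __init__(self):
        self._lock = threading.Lock()  # makes test-and-set atomic in this model
        self.claimed = False           # stands in for the PAGE_CLAIMED bit
        self.release_count = 0         # how many times the page was released

    def test_and_set_claimed(self):
        """Atomically set the claimed bit and return its previous value."""
        with self._lock:
            old = self.claimed
            self.claimed = True
            return old


def try_release(page):
    """Shared shape of the free and reclaim paths for a headless page."""
    if page.test_and_set_claimed():
        return False  # the other path already owns the page; back off
    page.release_count += 1
    return True


page = HeadlessPage()
results = []
threads = [threading.Thread(target=lambda: results.append(try_release(page)))
           for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

      Whichever thread runs first claims the page; the other sees the bit
      already set and backs off, so the page is released exactly once.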
      
      Link: https://lkml.kernel.org/r/c8106dbe6d8390b290cd1d7f873a2942e805349e.1615452048.git.tommyhebb@gmail.com
      Fixes: dcf5aedb ("z3fold: stricter locking and more careful reclaim")
      Signed-off-by: Thomas Hebb <tommyhebb@gmail.com>
      Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com>
      Cc: Jongseok Kim <ks77sj@gmail.com>
      Cc: Snild Dolkow <snild@sony.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • selftests/vm: fix out-of-tree build · 19ec368c
      Rong Chen authored
      When building out-of-tree, make attempts to build the target from the $(OUTPUT) directory and fails:
      
        make[1]: *** No rule to make target '$(OUTPUT)/protection_keys.c', needed by '$(OUTPUT)/protection_keys_32'.
      
      Link: https://lkml.kernel.org/r/20210315094700.522753-1-rong.a.chen@intel.com
      Signed-off-by: Rong Chen <rong.a.chen@intel.com>
      Reported-by: kernel test robot <lkp@intel.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm/mmu_notifiers: ensure range_end() is paired with range_start() · c2655835
      Sean Christopherson authored
      If one or more notifiers fails .invalidate_range_start(), invoke
      .invalidate_range_end() for "all" notifiers.  If there are multiple
      notifiers, those that did not fail are expecting _start() and _end() to
      be paired, e.g.  KVM's mmu_notifier_count would become imbalanced.
      Disallow notifiers that can fail _start() from implementing _end() so
      that it's unnecessary to either track which notifiers rejected _start(),
      or had already succeeded prior to a failed _start().
      
      Note, the existing behavior of calling _start() on all notifiers even
      after a previous notifier failed _start() was an unintended "feature".
      Make it canon now that the behavior is depended on for correctness.
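      The pairing rule can be sketched as a toy Python model. The Notifier
      class and invalidate_range_start() below are hypothetical stand-ins,
      not the kernel API, and rejecting an invalid notifier at construction
      time is a simplification chosen for this model.

```python
class Notifier:
    """Toy notifier (hypothetical model of an mmu_notifier subscription)."""

    def __init__(self, name, start_fails=False, has_end=True):
        # Mirror the rule from the commit message: a notifier that can fail
        # _start() must not implement _end().
        if start_fails and has_end:
            raise ValueError("a notifier that can fail _start() must not implement _end()")
        self.name = name
        self.start_fails = start_fails
        self.has_end = has_end
        self.count = 0  # analogous to KVM's mmu_notifier_count

    def range_start(self):
        if self.start_fails:
            return False
        self.count += 1
        return True

    def range_end(self):
        self.count -= 1


def invalidate_range_start(notifiers):
    """Call _start() on every notifier; on any failure, unwind with _end()."""
    ok = True
    for n in notifiers:  # keep going after a failure, as the commit notes
        if not n.range_start():
            ok = False
    if not ok:
        for n in notifiers:
            if n.has_end:
                n.range_end()
    return ok
```

      Because _start() runs on every notifier regardless of earlier
      failures, and a failing notifier never has an _end(), calling _end()
      on all notifiers that implement it restores every counter without
      tracking which notifier failed.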
      
      As of today, the bug is likely benign:
      
        1. The only caller of the non-blocking notifier is OOM kill.
        2. The only notifiers that can fail _start() are the i915 and Nouveau
           drivers.
        3. The only notifiers that utilize _end() are the SGI UV GRU driver
           and KVM.
        4. The GRU driver will never coincide with the i915/Nouveau drivers.
        5. An imbalanced kvm->mmu_notifier_count only causes soft lockup in the
           _guest_, and the guest is already doomed due to being an OOM victim.
      
      Fix the bug now to play nice with future usage, e.g.  KVM has a
      potential use case for blocking memslot updates in KVM while an
      invalidation is in-progress, and failure to unblock would result in said
      updates being blocked indefinitely and hanging.
      
      Found by inspection.  Verified by adding a second notifier in KVM that
      periodically returns -EAGAIN on non-blockable ranges, triggering OOM,
      and observing that KVM exits with an elevated notifier count.
      
      Link: https://lkml.kernel.org/r/20210311180057.1582638-1-seanjc@google.com
      Fixes: 93065ac7 ("mm, oom: distinguish blockable mode for mmu notifiers")
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
      Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Ben Gardon <bgardon@google.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: "Jérôme Glisse" <jglisse@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kasan: fix per-page tags for non-page_alloc pages · cf10bd4c
      Andrey Konovalov authored
      To allow performing tag checks on page_alloc addresses obtained via
      page_address(), tag-based KASAN modes store tags for page_alloc
      allocations in page->flags.
      
      Currently, the default tag value stored in page->flags is 0x00.
      Therefore, page_address() returns a 0x00ffff...  address for pages that
      were not allocated via page_alloc.
      
      This might cause problems.  A particular case we encountered is a
      conflict with KFENCE.  If a KFENCE-allocated slab object is being freed
      via kfree(page_address(page) + offset), the address passed to kfree()
      will get tagged with 0x00 (as slab pages keep the default per-page
      tags).  This leads to is_kfence_address() check failing, and a KFENCE
      object ending up in normal slab freelist, which causes memory
      corruptions.
      
      This patch changes the way KASAN stores tags in page->flags: they are now
      stored xor'ed with 0xff.  This way, KASAN doesn't need to initialize
      per-page flags for every created page, which might be slow.
      
      With this change, page_address() returns natively-tagged (with 0xff)
      pointers for pages that didn't have tags set explicitly.
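      The effect of the XOR encoding can be shown with a short Python
      sketch. The helper names encode_tag and decode_tag are hypothetical;
      only the "stored xor'ed with 0xff" rule comes from the text above.

```python
NATIVE_TAG = 0xFF  # the native tag mentioned in the commit message


def encode_tag(tag):
    """Value kept in the page flags field: the tag XOR'ed with 0xff."""
    return (tag ^ 0xFF) & 0xFF


def decode_tag(stored):
    """Recover the tag from the stored field value."""
    return (stored ^ 0xFF) & 0xFF


# A page whose flags were never touched holds 0x00 in the tag field.
untouched_page_tag = decode_tag(0x00)
```

      With this encoding an untouched field of 0x00 decodes to the native
      tag 0xff, so no per-page initialization is needed, and explicitly set
      tags still round-trip unchanged.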
      
      This patch fixes the encountered conflict with KFENCE and prevents more
      similar issues that can occur in the future.
      
      Link: https://lkml.kernel.org/r/1a41abb11c51b264511d9e71c303bb16d5cb367b.1615475452.git.andreyknvl@google.com
      Fixes: 2813b9c0 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc")
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Reviewed-by: Marco Elver <elver@google.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Peter Collingbourne <pcc@google.com>
      Cc: Evgenii Stepanov <eugenis@google.com>
      Cc: Branislav Rankov <Branislav.Rankov@arm.com>
      Cc: Kevin Brodsky <kevin.brodsky@arm.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • hugetlb_cgroup: fix imbalanced css_get and css_put pair for shared mappings · d85aecf2
      Miaohe Lin authored
      The current implementation of hugetlb_cgroup for shared mappings could
      have different behavior.  Consider the following two scenarios:
      
       1. Assume initial css reference count of hugetlb_cgroup is 1:
        1.1 Call hugetlb_reserve_pages with from = 1, to = 2. So css reference
            count is 2 associated with 1 file_region.
        1.2 Call hugetlb_reserve_pages with from = 2, to = 3. So css reference
            count is 3 associated with 2 file_region.
        1.3 coalesce_file_region will coalesce these two file_regions into
            one. So css reference count is 3 associated with 1 file_region
            now.
      
       2. Assume initial css reference count of hugetlb_cgroup is 1 again:
        2.1 Call hugetlb_reserve_pages with from = 1, to = 3. So css reference
            count is 2 associated with 1 file_region.
      
      Therefore, we might have one file_region while holding one or more css
      reference counts. This inconsistency could lead to imbalanced css_get()
      and css_put() pair. If we do css_put one by one (e.g. hole punch case),
      scenario 2 would put one css reference too many. If we do css_put all
      together (e.g. truncate case), scenario 1 will leak one css reference.
      
      The imbalanced css_get() and css_put() pair would result in a non-zero
      reference when we try to destroy the hugetlb cgroup. The hugetlb cgroup
      directory is removed __but__ the associated resource is not freed. This
      might ultimately result in OOM, or in failure to create a new hugetlb
      cgroup under a busy workload.
      
      In order to fix this, we have to make sure that one file_region must
      hold exactly one css reference. So in coalesce_file_region case, we
      should release one css reference before coalescence. Also only put css
      reference when the entire file_region is removed.
      
      The last thing to note is that the caller of region_add() will only hold
      one reference to h_cg->css for the whole contiguous reservation region.
      But this area might be scattered when there are already some
      file_regions reside in it. As a result, many file_regions may share only
      one h_cg->css reference. In order to ensure that one file_region must
      hold exactly one css reference, we should do css_get() for each
      file_region and release the reference held by caller when they are done.
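      The invariant "one file_region holds exactly one css reference" can
      be sketched as a toy Python model. The Css class and the reserve() and
      coalesce() helpers are hypothetical simplifications that only handle
      adjacent, non-overlapping regions; they are not the kernel functions.

```python
class Css:
    """Toy css reference counter (hypothetical model)."""

    def __init__(self):
        self.refs = 1  # initial reference, as in the scenarios above

    def get(self):
        self.refs += 1

    def put(self):
        self.refs -= 1


def coalesce(regions, css):
    """Merge adjacent regions, dropping one reference per merge."""
    i = 0
    while i + 1 < len(regions):
        (s1, e1), (s2, e2) = regions[i], regions[i + 1]
        if e1 == s2:
            regions[i] = (s1, e2)
            del regions[i + 1]
            css.put()  # the merged region keeps exactly one reference
        else:
            i += 1


def reserve(regions, css, start, end):
    """Add [start, end) as a new file_region holding one css reference."""
    css.get()
    regions.append((start, end))
    regions.sort()
    coalesce(regions, css)
```

      In this model both scenarios end with a single file_region and a
      reference count of 2, so one css_put() per removed file_region
      balances the count either way.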
      
      [linmiaohe@huawei.com: fix imbalanced css_get and css_put pair for shared mappings]
        Link: https://lkml.kernel.org/r/20210316023002.53921-1-linmiaohe@huawei.com
      
      Link: https://lkml.kernel.org/r/20210301120540.37076-1-linmiaohe@huawei.com
      Fixes: 075a61d0 ("hugetlb_cgroup: add accounting for shared mappings")
      Reported-by: kernel test robot <lkp@intel.com> (auto build test ERROR)
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Wanpeng Li <liwp.linux@gmail.com>
      Cc: Mina Almasry <almasrymina@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • RDMA/cxgb4: Fix adapter LE hash errors while destroying ipv6 listening server · 3408be14
      Potnuri Bharat Teja authored
      Not setting the ipv6 bit while destroying ipv6 listening servers may
      result in potential fatal adapter errors due to lookup engine memory hash
      errors. Therefore always set ipv6 field while destroying ipv6 listening
      servers.
      
      Fixes: 830662f6 ("RDMA/cxgb4: Add support for active and passive open connection with IPv6 address")
      Link: https://lore.kernel.org/r/20210324190453.8171-1-bharat@chelsio.com
      Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    • arm64: kernel: disable CNP on Carmel · 20109a85
      Rich Wiley authored
      On NVIDIA Carmel cores, CNP behaves differently than it does on standard
      ARM cores. On Carmel, if two cores have CNP enabled and share an L2 TLB
      entry created by core0 for a specific ASID, a non-shareable TLBI from
      core1 may still see the shared entry. On standard ARM cores, that TLBI
      will invalidate the shared entry as well.
      
      This causes issues with patchsets that attempt to do local TLBIs based
      on cpumasks instead of broadcast TLBIs. Avoid these issues by disabling
      CNP support for NVIDIA Carmel cores.
      
      Signed-off-by: Rich Wiley <rwiley@nvidia.com>
      Link: https://lore.kernel.org/r/20210324002809.30271-1-rwiley@nvidia.com
      [will: Fix pre-existing whitespace issue]
      Signed-off-by: Will Deacon <will@kernel.org>
      20109a85
    • Maninder Singh's avatar
      arm64/process.c: fix Wmissing-prototypes build warnings · baa96377
      Maninder Singh authored
      Fix GCC warnings reported when building with "-Wmissing-prototypes":
      
        arch/arm64/kernel/process.c:261:6: warning: no previous prototype for '__show_regs' [-Wmissing-prototypes]
            261 | void __show_regs(struct pt_regs *regs)
                |      ^~~~~~~~~~~
        arch/arm64/kernel/process.c:307:6: warning: no previous prototype for '__show_regs_alloc_free' [-Wmissing-prototypes]
            307 | void __show_regs_alloc_free(struct pt_regs *regs)
                |      ^~~~~~~~~~~~~~~~~~~~~~
        arch/arm64/kernel/process.c:365:5: warning: no previous prototype for 'arch_dup_task_struct' [-Wmissing-prototypes]
            365 | int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
                |     ^~~~~~~~~~~~~~~~~~~~
        arch/arm64/kernel/process.c:546:41: warning: no previous prototype for '__switch_to' [-Wmissing-prototypes]
            546 | __notrace_funcgraph struct task_struct *__switch_to(struct task_struct *prev,
                |                                         ^~~~~~~~~~~
        arch/arm64/kernel/process.c:710:25: warning: no previous prototype for 'arm64_preempt_schedule_irq' [-Wmissing-prototypes]
            710 | asmlinkage void __sched arm64_preempt_schedule_irq(void)
                |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~
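      
      As a hedged illustration of the warning class above (not the code from
      this patch): GCC emits -Wmissing-prototypes when a non-static function
      is defined without a prior prototype in scope. The usual remedies are to
      declare the function in a header that the defining file includes, or to
      make it static when it is only used locally. All names below
      (regs_sketch, show_pc_sketch, next_pc_sketch, bump_sketch) are
      hypothetical and exist only for this sketch.

```c
/* Hypothetical stand-in for a declaration that would normally live in a
 * shared header; none of these names come from the kernel sources. */
struct regs_sketch {
	unsigned long pc;
};

/* Prototypes visible before the definitions: this is what keeps
 * -Wmissing-prototypes quiet for functions with external linkage. */
unsigned long show_pc_sketch(const struct regs_sketch *regs);
unsigned long next_pc_sketch(const struct regs_sketch *regs);

/* Alternative remedy: a helper used only in this file is made static,
 * so no prototype is required and no warning is raised. */
static unsigned long bump_sketch(unsigned long value)
{
	return value + 1;
}

unsigned long show_pc_sketch(const struct regs_sketch *regs)
{
	return regs->pc;
}

unsigned long next_pc_sketch(const struct regs_sketch *regs)
{
	return bump_sketch(regs->pc);
}
```

      Building a file like this with "gcc -Wmissing-prototypes -c" should
      produce no such warning; deleting the two prototype lines would bring
      it back for both non-static functions.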
      
      Link: https://lore.kernel.org/lkml/202103192250.AennsfXM-lkp@intel.com
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
      Link: https://lore.kernel.org/r/1616568899-986-1-git-send-email-maninder1.s@samsung.com
      Signed-off-by: Will Deacon <will@kernel.org>
      baa96377
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · e1381380
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "Various fixes, all over:
      
         1) Fix overflow in ptp_qoriq_adjfine(), from Yangbo Lu.
      
         2) Always store the rx queue mapping in veth, from Maciej
            Fijalkowski.
      
         3) Don't allow vmlinux btf in map_create, from Alexei Starovoitov.
      
         4) Fix memory leak in octeontx2-af, from Colin Ian King.
      
         5) Use kvalloc in bpf x86 JIT for storing jit'd addresses, from
            Yonghong Song.
      
         6) Fix tx ptp stats in mlx5, from Aya Levin.
      
         7) Check correct ip version in tun decap, from Roi Dayan.
      
         8) Fix rate calculation in mlx5 E-Switch code, from Parav Pandit.
      
         9) Work item memory leak in mlx5, from Shay Drory.
      
        10) Fix ip6ip6 tunnel crash with bpf, from Daniel Borkmann.
      
        11) Lack of preemption awareness in macvlan, from Eric Dumazet.
      
        12) Fix data race in pxa168_eth, from Pavel Andrianov.
      
        13) Range validate stab in red_check_params(), from Eric Dumazet.
      
        14) Inherit vlan filtering setting properly in b53 driver, from
            Florian Fainelli.
      
        15) Fix rtnl locking in igc driver, from Sasha Neftin.
      
        16) Pause handling fixes in igc driver, from Muhammad Husaini
            Zulkifli.
      
        17) Missing rtnl locking in e1000_reset_task, from Vitaly Lifshits.
      
        18) Use after free in qlcnic, from Lv Yunlong.
      
        19) Fix crash in fritzpci mISDN, from Tong Zhang.
      
        20) Premature rx buffer reuse in igb, from Li RongQing.
      
        21) Missing termination of ipa driver message handler arrays, from
            Alex Elder.
      
        22) Fix race between "x25_close" and "x25_xmit"/"x25_rx" in hdlc_x25
            driver, from Xie He.
      
        23) Use after free in c_can_pci_remove(), from Tong Zhang.
      
        24) Uninitialized variable use in nl80211, from Jarod Wilson.
      
        25) Off by one size calc in bpf verifier, from Piotr Krysiuk.
      
        26) Use delayed work instead of deferrable for flowtable GC, from
            Yinjun Zhang.
      
        27) Fix infinite loop in NPC unmap of octeontx2 driver, from
            Hariprasad Kelam.
      
        28) Fix being unable to change MTU of dwmac-sun8i devices due to lack
            of fifo sizes, from Corentin Labbe.
      
        29) DMA use after free in r8169 with WoL, from Heiner Kallweit.
      
        30) Mismatched prototypes in isdn-capi, from Arnd Bergmann.
      
        31) Fix psample UAPI breakage, from Ido Schimmel"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (171 commits)
        psample: Fix user API breakage
        math: Export mul_u64_u64_div_u64
        ch_ktls: fix enum-conversion warning
        octeontx2-af: Fix memory leak of object buf
        ptp_qoriq: fix overflow in ptp_qoriq_adjfine() u64 calcalation
        net: bridge: don't notify switchdev for local FDB addresses
        net/sched: act_ct: clear post_ct if doing ct_clear
        net: dsa: don't assign an error value to tag_ops
        isdn: capi: fix mismatched prototypes
        net/mlx5: SF, do not use ecpu bit for vhca state processing
        net/mlx5e: Fix division by 0 in mlx5e_select_queue
        net/mlx5e: Fix error path for ethtool set-priv-flag
        net/mlx5e: Offload tuple rewrite for non-CT flows
        net/mlx5e: Allow to match on MPLS parameters only for MPLS over UDP
        net/mlx5: Add back multicast stats for uplink representor
        net: ipconfig: ic_dev can be NULL in ic_close_devs
        MAINTAINERS: Combine "QLOGIC QLGE 10Gb ETHERNET DRIVER" sections into one
        docs: networking: Fix a typo
        r8169: fix DMA being used after buffer free if WoL is enabled
        net: ipa: fix init header command validation
        ...
      e1381380
  2. 24 Mar, 2021 9 commits
  3. 23 Mar, 2021 7 commits