1. 25 Mar, 2021 5 commits
    • Thomas Hebb's avatar
      z3fold: prevent reclaim/free race for headless pages · 6d679578
      Thomas Hebb authored
      Commit ca0246bb ("z3fold: fix possible reclaim races") introduced
      the PAGE_CLAIMED flag "to avoid racing on a z3fold 'headless' page
      release." By atomically testing and setting the bit in each of
      z3fold_free() and z3fold_reclaim_page(), a double-free was avoided.
      
      However, commit dcf5aedb ("z3fold: stricter locking and more careful
      reclaim") appears to have unintentionally broken this behavior by moving
      the PAGE_CLAIMED check in z3fold_reclaim_page() to after the page lock
      gets taken, which only happens for non-headless pages.  For headless
      pages, the check is now skipped entirely and races can occur again.
      
      I have observed such a race on my system:
      
          page:00000000ffbd76b7 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x165316
          flags: 0x2ffff0000000000()
          raw: 02ffff0000000000 ffffea0004535f48 ffff8881d553a170 0000000000000000
          raw: 0000000000000000 0000000000000011 00000000ffffffff 0000000000000000
          page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0)
          ------------[ cut here ]------------
          kernel BUG at include/linux/mm.h:707!
          invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
          CPU: 2 PID: 291928 Comm: kworker/2:0 Tainted: G    B             5.10.7-arch1-1-kasan #1
          Hardware name: Gigabyte Technology Co., Ltd. H97N-WIFI/H97N-WIFI, BIOS F9b 03/03/2016
          Workqueue: zswap-shrink shrink_worker
          RIP: 0010:__free_pages+0x10a/0x130
          Code: c1 e7 06 48 01 ef 45 85 e4 74 d1 44 89 e6 31 d2 41 83 ec 01 e8 e7 b0 ff ff eb da 48 c7 c6 e0 32 91 88 48 89 ef e8 a6 89 f8 ff <0f> 0b 4c 89 e7 e8 fc 79 07 00 e9 33 ff ff ff 48 89 ef e8 ff 79 07
          RSP: 0000:ffff88819a2ffb98 EFLAGS: 00010296
          RAX: 0000000000000000 RBX: ffffea000594c5a8 RCX: 0000000000000000
          RDX: 1ffffd4000b298b7 RSI: 0000000000000000 RDI: ffffea000594c5b8
          RBP: ffffea000594c580 R08: 000000000000003e R09: ffff8881d5520bbb
          R10: ffffed103aaa4177 R11: 0000000000000001 R12: ffffea000594c5b4
          R13: 0000000000000000 R14: ffff888165316000 R15: ffffea000594c588
          FS:  0000000000000000(0000) GS:ffff8881d5500000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 00007f7c8c3654d8 CR3: 0000000103f42004 CR4: 00000000001706e0
          Call Trace:
           z3fold_zpool_shrink+0x9b6/0x1240
           shrink_worker+0x35/0x90
           process_one_work+0x70c/0x1210
           worker_thread+0x539/0x1200
           kthread+0x330/0x400
           ret_from_fork+0x22/0x30
          Modules linked in: rfcomm ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ccm algif_aead des_generic libdes ecb algif_skcipher cmac bnep md4 algif_hash af_alg vfat fat intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel iwlmvm hid_logitech_hidpp kvm at24 mac80211 snd_hda_codec_realtek iTCO_wdt snd_hda_codec_generic intel_pmc_bxt snd_hda_codec_hdmi ledtrig_audio iTCO_vendor_support mei_wdt mei_hdcp snd_hda_intel snd_intel_dspcfg libarc4 soundwire_intel irqbypass iwlwifi soundwire_generic_allocation rapl soundwire_cadence intel_cstate snd_hda_codec intel_uncore btusb joydev mousedev snd_usb_audio pcspkr btrtl uvcvideo nouveau btbcm i2c_i801 btintel snd_hda_core videobuf2_vmalloc i2c_smbus snd_usbmidi_lib videobuf2_memops bluetooth snd_hwdep soundwire_bus snd_soc_rt5640 videobuf2_v4l2 cfg80211 snd_soc_rl6231 videobuf2_common snd_rawmidi lpc_ich alx videodev mdio snd_seq_device snd_soc_core mc ecdh_generic mxm_wmi mei_me
           hid_logitech_dj wmi snd_compress e1000e ac97_bus mei ttm rfkill snd_pcm_dmaengine ecc snd_pcm snd_timer snd soundcore mac_hid acpi_pad pkcs8_key_parser it87 hwmon_vid crypto_user fuse ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 dm_crypt cbc encrypted_keys trusted tpm rng_core usbhid dm_mod crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper xhci_pci xhci_pci_renesas i915 video intel_gtt i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm agpgart
          ---[ end trace 126d646fc3dc0ad8 ]---
      
      To fix the issue, re-add the earlier test and set in the case where we
      have a headless page.
      
      Link: https://lkml.kernel.org/r/c8106dbe6d8390b290cd1d7f873a2942e805349e.1615452048.git.tommyhebb@gmail.com
      Fixes: dcf5aedb ("z3fold: stricter locking and more careful reclaim")
      Signed-off-by: default avatarThomas Hebb <tommyhebb@gmail.com>
      Reviewed-by: default avatarVitaly Wool <vitaly.wool@konsulko.com>
      Cc: Jongseok Kim <ks77sj@gmail.com>
      Cc: Snild Dolkow <snild@sony.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      6d679578
    • Rong Chen's avatar
      selftests/vm: fix out-of-tree build · 19ec368c
      Rong Chen authored
      When building out-of-tree, attempting to make target from $(OUTPUT) directory:
      
        make[1]: *** No rule to make target '$(OUTPUT)/protection_keys.c', needed by '$(OUTPUT)/protection_keys_32'.
      
      Link: https://lkml.kernel.org/r/20210315094700.522753-1-rong.a.chen@intel.comSigned-off-by: default avatarRong Chen <rong.a.chen@intel.com>
      Reported-by: default avatarkernel test robot <lkp@intel.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      19ec368c
    • Sean Christopherson's avatar
      mm/mmu_notifiers: ensure range_end() is paired with range_start() · c2655835
      Sean Christopherson authored
      If one or more notifiers fails .invalidate_range_start(), invoke
      .invalidate_range_end() for "all" notifiers.  If there are multiple
      notifiers, those that did not fail are expecting _start() and _end() to
      be paired, e.g.  KVM's mmu_notifier_count would become imbalanced.
      Disallow notifiers that can fail _start() from implementing _end() so
      that it's unnecessary to either track which notifiers rejected _start(),
      or had already succeeded prior to a failed _start().
      
      Note, the existing behavior of calling _start() on all notifiers even
      after a previous notifier failed _start() was an unintented "feature".
      Make it canon now that the behavior is depended on for correctness.
      
      As of today, the bug is likely benign:
      
        1. The only caller of the non-blocking notifier is OOM kill.
        2. The only notifiers that can fail _start() are the i915 and Nouveau
           drivers.
        3. The only notifiers that utilize _end() are the SGI UV GRU driver
           and KVM.
        4. The GRU driver will never coincide with the i195/Nouveau drivers.
        5. An imbalanced kvm->mmu_notifier_count only causes soft lockup in the
           _guest_, and the guest is already doomed due to being an OOM victim.
      
      Fix the bug now to play nice with future usage, e.g.  KVM has a
      potential use case for blocking memslot updates in KVM while an
      invalidation is in-progress, and failure to unblock would result in said
      updates being blocked indefinitely and hanging.
      
      Found by inspection.  Verified by adding a second notifier in KVM that
      periodically returns -EAGAIN on non-blockable ranges, triggering OOM,
      and observing that KVM exits with an elevated notifier count.
      
      Link: https://lkml.kernel.org/r/20210311180057.1582638-1-seanjc@google.com
      Fixes: 93065ac7 ("mm, oom: distinguish blockable mode for mmu notifiers")
      Signed-off-by: default avatarSean Christopherson <seanjc@google.com>
      Suggested-by: default avatarJason Gunthorpe <jgg@ziepe.ca>
      Reviewed-by: default avatarJason Gunthorpe <jgg@nvidia.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Ben Gardon <bgardon@google.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: "Jérôme Glisse" <jglisse@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      c2655835
    • Andrey Konovalov's avatar
      kasan: fix per-page tags for non-page_alloc pages · cf10bd4c
      Andrey Konovalov authored
      To allow performing tag checks on page_alloc addresses obtained via
      page_address(), tag-based KASAN modes store tags for page_alloc
      allocations in page->flags.
      
      Currently, the default tag value stored in page->flags is 0x00.
      Therefore, page_address() returns a 0x00ffff...  address for pages that
      were not allocated via page_alloc.
      
      This might cause problems.  A particular case we encountered is a
      conflict with KFENCE.  If a KFENCE-allocated slab object is being freed
      via kfree(page_address(page) + offset), the address passed to kfree()
      will get tagged with 0x00 (as slab pages keep the default per-page
      tags).  This leads to is_kfence_address() check failing, and a KFENCE
      object ending up in normal slab freelist, which causes memory
      corruptions.
      
      This patch changes the way KASAN stores tag in page-flags: they are now
      stored xor'ed with 0xff.  This way, KASAN doesn't need to initialize
      per-page flags for every created page, which might be slow.
      
      With this change, page_address() returns natively-tagged (with 0xff)
      pointers for pages that didn't have tags set explicitly.
      
      This patch fixes the encountered conflict with KFENCE and prevents more
      similar issues that can occur in the future.
      
      Link: https://lkml.kernel.org/r/1a41abb11c51b264511d9e71c303bb16d5cb367b.1615475452.git.andreyknvl@google.com
      Fixes: 2813b9c0 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc")
      Signed-off-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Reviewed-by: default avatarMarco Elver <elver@google.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Peter Collingbourne <pcc@google.com>
      Cc: Evgenii Stepanov <eugenis@google.com>
      Cc: Branislav Rankov <Branislav.Rankov@arm.com>
      Cc: Kevin Brodsky <kevin.brodsky@arm.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      cf10bd4c
    • Miaohe Lin's avatar
      hugetlb_cgroup: fix imbalanced css_get and css_put pair for shared mappings · d85aecf2
      Miaohe Lin authored
      The current implementation of hugetlb_cgroup for shared mappings could
      have different behavior.  Consider the following two scenarios:
      
       1.Assume initial css reference count of hugetlb_cgroup is 1:
        1.1 Call hugetlb_reserve_pages with from = 1, to = 2. So css reference
            count is 2 associated with 1 file_region.
        1.2 Call hugetlb_reserve_pages with from = 2, to = 3. So css reference
            count is 3 associated with 2 file_region.
        1.3 coalesce_file_region will coalesce these two file_regions into
            one. So css reference count is 3 associated with 1 file_region
            now.
      
       2.Assume initial css reference count of hugetlb_cgroup is 1 again:
        2.1 Call hugetlb_reserve_pages with from = 1, to = 3. So css reference
            count is 2 associated with 1 file_region.
      
      Therefore, we might have one file_region while holding one or more css
      reference counts. This inconsistency could lead to imbalanced css_get()
      and css_put() pair. If we do css_put one by one (i.g. hole punch case),
      scenario 2 would put one more css reference. If we do css_put all
      together (i.g. truncate case), scenario 1 will leak one css reference.
      
      The imbalanced css_get() and css_put() pair would result in a non-zero
      reference when we try to destroy the hugetlb cgroup. The hugetlb cgroup
      directory is removed __but__ associated resource is not freed. This
      might result in OOM or can not create a new hugetlb cgroup in a busy
      workload ultimately.
      
      In order to fix this, we have to make sure that one file_region must
      hold exactly one css reference. So in coalesce_file_region case, we
      should release one css reference before coalescence. Also only put css
      reference when the entire file_region is removed.
      
      The last thing to note is that the caller of region_add() will only hold
      one reference to h_cg->css for the whole contiguous reservation region.
      But this area might be scattered when there are already some
      file_regions reside in it. As a result, many file_regions may share only
      one h_cg->css reference. In order to ensure that one file_region must
      hold exactly one css reference, we should do css_get() for each
      file_region and release the reference held by caller when they are done.
      
      [linmiaohe@huawei.com: fix imbalanced css_get and css_put pair for shared mappings]
        Link: https://lkml.kernel.org/r/20210316023002.53921-1-linmiaohe@huawei.com
      
      Link: https://lkml.kernel.org/r/20210301120540.37076-1-linmiaohe@huawei.com
      Fixes: 075a61d0 ("hugetlb_cgroup: add accounting for shared mappings")
      Reported-by: kernel test robot <lkp@intel.com> (auto build test ERROR)
      Signed-off-by: default avatarMiaohe Lin <linmiaohe@huawei.com>
      Reviewed-by: default avatarMike Kravetz <mike.kravetz@oracle.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Wanpeng Li <liwp.linux@gmail.com>
      Cc: Mina Almasry <almasrymina@google.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      d85aecf2
  2. 23 Mar, 2021 1 commit
  3. 22 Mar, 2021 1 commit
    • Linus Torvalds's avatar
      Merge tag 'selinux-pr-20210322' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux · 84196390
      Linus Torvalds authored
      Pull selinux fixes from Paul Moore:
       "Three SELinux patches:
      
         - Fix a problem where a local variable is used outside its associated
           function. Thankfully this can only be triggered by reloading the
           SELinux policy, which is a restricted operation for other obvious
           reasons.
      
         - Fix some incorrect, and inconsistent, audit and printk messages
           when loading the SELinux policy.
      
        All three patches are relatively minor and have been through our
        testing with no failures"
      
      * tag 'selinux-pr-20210322' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
        selinuxfs: unify policy load error reporting
        selinux: fix variable scope issue in live sidtab conversion
        selinux: don't log MAC_POLICY_LOAD record on failed policy load
      84196390
  4. 21 Mar, 2021 23 commits
  5. 20 Mar, 2021 7 commits
    • Thomas Gleixner's avatar
      genirq: Disable interrupts for force threaded handlers · 81e2073c
      Thomas Gleixner authored
      With interrupt force threading all device interrupt handlers are invoked
      from kernel threads. Contrary to hard interrupt context the invocation only
      disables bottom halfs, but not interrupts. This was an oversight back then
      because any code like this will have an issue:
      
      thread(irq_A)
        irq_handler(A)
          spin_lock(&foo->lock);
      
      interrupt(irq_B)
        irq_handler(B)
          spin_lock(&foo->lock);
      
      This has been triggered with networking (NAPI vs. hrtimers) and console
      drivers where printk() happens from an interrupt which interrupted the
      force threaded handler.
      
      Now people noticed and started to change the spin_lock() in the handler to
      spin_lock_irqsave() which affects performance or add IRQF_NOTHREAD to the
      interrupt request which in turn breaks RT.
      
      Fix the root cause and not the symptom and disable interrupts before
      invoking the force threaded handler which preserves the regular semantics
      and the usefulness of the interrupt force threading as a general debugging
      tool.
      
      For not RT this is not changing much, except that during the execution of
      the threaded handler interrupts are delayed until the handler
      returns. Vs. scheduling and softirq processing there is no difference.
      
      For RT kernels there is no issue.
      
      Fixes: 8d32a307 ("genirq: Provide forced interrupt threading")
      Reported-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarJohan Hovold <johan@kernel.org>
      Acked-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Link: https://lore.kernel.org/r/20210317143859.513307808@linutronix.de
      81e2073c
    • Linus Torvalds's avatar
      Merge tag 'riscv-for-linus-5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 812da4d3
      Linus Torvalds authored
      Pull RISC-V fixes from Palmer Dabbelt:
       "A handful of fixes for 5.12:
      
         - fix the SBI remote fence numbers for hypervisor fences, which had
           been transcribed in the wrong order in Linux. These fences are only
           used with the KVM patches applied.
      
         - fix a whole host of build warnings, these should have no functional
           change.
      
         - fix init_resources() to prevent an off-by-one error from causing an
           out-of-bounds array reference. This was manifesting during boot on
           vexriscv.
      
         - ensure the KASAN mappings are visible before proceeding to use
           them"
      
      * tag 'riscv-for-linus-5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
        riscv: Correct SPARSEMEM configuration
        RISC-V: kasan: Declare kasan_shallow_populate() static
        riscv: Ensure page table writes are flushed when initializing KASAN vmalloc
        RISC-V: Fix out-of-bounds accesses in init_resources()
        riscv: Fix compilation error with Canaan SoC
        ftrace: Fix spelling mistake "disabed" -> "disabled"
        riscv: fix bugon.cocci warnings
        riscv: process: Fix no prototype for arch_dup_task_struct
        riscv: ftrace: Use ftrace_get_regs helper
        riscv: process: Fix no prototype for show_regs
        riscv: syscall_table: Reduce W=1 compilation warnings noise
        riscv: time: Fix no prototype for time_init
        riscv: ptrace: Fix no prototype warnings
        riscv: sbi: Fix comment of __sbi_set_timer_v01
        riscv: irq: Fix no prototype warning
        riscv: traps: Fix no prototype warnings
        RISC-V: correct enum sbi_ext_rfence_fid
      812da4d3
    • Linus Torvalds's avatar
      Merge tag '5.12-rc3-smb3' of git://git.samba.org/sfrench/cifs-2.6 · bfdc4aa9
      Linus Torvalds authored
      Pull cifs fixes from Steve French:
       "Five cifs/smb3 fixes - three for stable, including an important ACL
        fix and security signature fix"
      
      * tag '5.12-rc3-smb3' of git://git.samba.org/sfrench/cifs-2.6:
        cifs: fix allocation size on newly created files
        cifs: warn and fail if trying to use rootfs without the config option
        fs/cifs/: fix misspellings using codespell tool
        cifs: Fix preauth hash corruption
        cifs: update new ACE pointer after populate_new_aces.
      bfdc4aa9
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · af97713d
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Eight fixes, all in drivers, all fairly minor either being fixes in
        error legs, memory leaks on teardown, context errors or semantic
        problems"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: mpt3sas: Do not use GFP_KERNEL in atomic context
        scsi: ufs: ufs-mediatek: Correct operator & -> &&
        scsi: sd_zbc: Update write pointer offset cache
        scsi: lpfc: Fix some error codes in debugfs
        scsi: qla2xxx: Fix broken #endif placement
        scsi: st: Fix a use after free in st_open()
        scsi: myrs: Fix a double free in myrs_cleanup()
        scsi: ibmvfc: Free channel_setup_buf during device tear down
      af97713d
    • Linus Torvalds's avatar
      Merge tag 'zonefs-5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs · 1c273e10
      Linus Torvalds authored
      Pull zonefs fixes from Damien Le Moal:
      
       - fix inode write open reference count (Chao)
      
       - Fix wrong write offset for asynchronous O_APPEND writes (me)
      
       - Prevent use of sequential zone file as swap files (me)
      
      * tag 'zonefs-5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
        zonefs: fix to update .i_wr_refcnt correctly in zonefs_open_zone()
        zonefs: Fix O_APPEND async write handling
        zonefs: prevent use of seq files as swap file
      1c273e10
    • Linus Torvalds's avatar
      Merge tag 'block-5.12-2021-03-19' of git://git.kernel.dk/linux-block · d626c692
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "Just an NVMe pull request this week:
      
         - fix tag allocation for keep alive
      
         - fix a unit mismatch for the Write Zeroes limits
      
         - various TCP transport fixes (Sagi Grimberg, Elad Grupi)
      
         - fix iosqes and iocqes validation for discovery controllers (Sagi Grimberg)"
      
      * tag 'block-5.12-2021-03-19' of git://git.kernel.dk/linux-block:
        nvmet-tcp: fix kmap leak when data digest in use
        nvmet: don't check iosqes,iocqes for discovery controllers
        nvme-rdma: fix possible hang when failing to set io queues
        nvme-tcp: fix possible hang when failing to set io queues
        nvme-tcp: fix misuse of __smp_processor_id with preemption enabled
        nvme-tcp: fix a NULL deref when receiving a 0-length r2t PDU
        nvme: fix Write Zeroes limitations
        nvme: allocate the keep alive request using BLK_MQ_REQ_NOWAIT
        nvme: merge nvme_keep_alive into nvme_keep_alive_work
        nvme-fabrics: only reserve a single tag
      d626c692
    • Linus Torvalds's avatar
      Merge tag 'io_uring-5.12-2021-03-19' of git://git.kernel.dk/linux-block · 0ada2dad
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
       "Quieter week this time, which was both expected and desired. About
        half of the below is fixes for this release, the other half are just
        fixes in general. In detail:
      
         - Fix the freezing of IO threads, by making the freezer not send them
           fake signals. Make them freezable by default.
      
         - Like we did for personalities, move the buffer IDR to xarray. Kills
           some code and avoids a use-after-free on teardown.
      
         - SQPOLL cleanups and fixes (Pavel)
      
         - Fix linked timeout race (Pavel)
      
         - Fix potential completion post use-after-free (Pavel)
      
         - Cleanup and move internal structures outside of general kernel view
           (Stefan)
      
         - Use MSG_SIGNAL for send/recv from io_uring (Stefan)"
      
      * tag 'io_uring-5.12-2021-03-19' of git://git.kernel.dk/linux-block:
        io_uring: don't leak creds on SQO attach error
        io_uring: use typesafe pointers in io_uring_task
        io_uring: remove structures from include/linux/io_uring.h
        io_uring: imply MSG_NOSIGNAL for send[msg]()/recv[msg]() calls
        io_uring: fix sqpoll cancellation via task_work
        io_uring: add generic callback_head helpers
        io_uring: fix concurrent parking
        io_uring: halt SQO submission on ctx exit
        io_uring: replace sqd rw_semaphore with mutex
        io_uring: fix complete_post use ctx after free
        io_uring: fix ->flags races by linked timeouts
        io_uring: convert io_buffer_idr to XArray
        io_uring: allow IO worker threads to be frozen
        kernel: freezer should treat PF_IO_WORKER like PF_KTHREAD for freezing
      0ada2dad
  6. 19 Mar, 2021 3 commits
    • Johan Hovold's avatar
      x86/apic/of: Fix CPU devicetree-node lookups · dd926880
      Johan Hovold authored
      Architectures that describe the CPU topology in devicetree and do not have
      an identity mapping between physical and logical CPU ids must override the
      default implementation of arch_match_cpu_phys_id().
      
      Failing to do so breaks CPU devicetree-node lookups using of_get_cpu_node()
      and of_cpu_device_node_get() which several drivers rely on. It also causes
      the CPU struct devices exported through sysfs to point to the wrong
      devicetree nodes.
      
      On x86, CPUs are described in devicetree using their APIC ids and those
      do not generally coincide with the logical ids, even if CPU0 typically
      uses APIC id 0.
      
      Add the missing implementation of arch_match_cpu_phys_id() so that CPU-node
      lookups work also with SMP.
      
      Apart from fixing the broken sysfs devicetree-node links this likely does
      not affect current users of mainline kernels on x86.
      
      Fixes: 4e07db9c ("x86/devicetree: Use CPU description from Device Tree")
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Link: https://lore.kernel.org/r/20210312092033.26317-1-johan@kernel.org
      dd926880
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · ecd8ee7f
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "Fixes for kvm on x86:
      
         - new selftests
      
         - fixes for migration with HyperV re-enlightenment enabled
      
         - fix RCU/SRCU usage
      
         - fixes for local_irq_restore misuse false positive"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        documentation/kvm: additional explanations on KVM_SET_BOOT_CPU_ID
        x86/kvm: Fix broken irq restoration in kvm_wait
        KVM: X86: Fix missing local pCPU when executing wbinvd on all dirty pCPUs
        KVM: x86: Protect userspace MSR filter with SRCU, and set atomically-ish
        selftests: kvm: add set_boot_cpu_id test
        selftests: kvm: add _vm_ioctl
        selftests: kvm: add get_msr_index_features
        selftests: kvm: Add basic Hyper-V clocksources tests
        KVM: x86: hyper-v: Don't touch TSC page values when guest opted for re-enlightenment
        KVM: x86: hyper-v: Track Hyper-V TSC page status
        KVM: x86: hyper-v: Prevent using not-yet-updated TSC page by secondary CPUs
        KVM: x86: hyper-v: Limit guest to writing zero to HV_X64_MSR_TSC_EMULATION_STATUS
        KVM: x86/mmu: Store the address space ID in the TDP iterator
        KVM: x86/mmu: Factor out tdp_iter_return_to_root
        KVM: x86/mmu: Fix RCU usage when atomically zapping SPTEs
        KVM: x86/mmu: Fix RCU usage in handle_removed_tdp_mmu_page
      ecd8ee7f
    • Linus Torvalds's avatar
      Merge tag 'gpio-fixes-for-v5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux · 3149860d
      Linus Torvalds authored
      Pull gpio fixes from Bartosz Golaszewski:
       "Two fixes for the GPIO subsystem. Both address issues in the core GPIO
        code:
      
         - fix the return value in error path in gpiolib_dev_init()
      
         - fix the 'gpio-line-names' property handling correctly this time"
      
      * tag 'gpio-fixes-for-v5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
        gpiolib: Assign fwnode to parent's if no primary one provided
        gpiolib: Fix error return code in gpiolib_dev_init()
      3149860d