1. 07 Sep, 2021 14 commits
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 192ad3c2
      Linus Torvalds authored
      Pull KVM updates from Paolo Bonzini:
       "ARM:
         - Page ownership tracking between host EL1 and EL2
         - Rely on userspace page tables to create large stage-2 mappings
         - Fix incompatibility between pKVM and kmemleak
         - Fix the PMU reset state, and improve the performance of the virtual
           PMU
         - Move over to the generic KVM entry code
         - Address PSCI reset issues w.r.t. save/restore
         - Preliminary rework for the upcoming pKVM fixed feature
         - A bunch of MM cleanups
         - a vGIC fix for timer spurious interrupts
         - Various cleanups
      
        s390:
         - enable interpretation of specification exceptions
         - fix a vcpu_idx vs vcpu_id mixup
      
        x86:
         - fast (lockless) page fault support for the new MMU
         - new MMU now the default
         - increased maximum allowed VCPU count
         - allow inhibiting IRQs on KVM_RUN while debugging guests
         - let Hyper-V-enabled guests run with virtualized LAPIC as long as
           they do not enable the Hyper-V "AutoEOI" feature
         - fixes and optimizations for the toggling of AMD AVIC (virtualized
           LAPIC)
         - tuning for the case when two-dimensional paging (EPT/NPT) is
           disabled
         - bugfixes and cleanups, especially with respect to vCPU reset and
           choosing a paging mode based on CR0/CR4/EFER
         - support for 5-level page table on AMD processors
      
        Generic:
         - MMU notifier invalidation callbacks do not take mmu_lock unless
           necessary
         - improved caching of LRU kvm_memory_slot
         - support for histogram statistics
         - add statistics for halt polling and remote TLB flush requests"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (210 commits)
        KVM: Drop unused kvm_dirty_gfn_invalid()
        KVM: x86: Update vCPU's hv_clock before back to guest when tsc_offset is adjusted
        KVM: MMU: mark role_regs and role accessors as maybe unused
        KVM: MIPS: Remove a "set but not used" variable
        x86/kvm: Don't enable IRQ when IRQ enabled in kvm_wait
        KVM: stats: Add VM stat for remote tlb flush requests
        KVM: Remove unnecessary export of kvm_{inc,dec}_notifier_count()
        KVM: x86/mmu: Move lpage_disallowed_link further "down" in kvm_mmu_page
        KVM: x86/mmu: Relocate kvm_mmu_page.tdp_mmu_page for better cache locality
        Revert "KVM: x86: mmu: Add guest physical address check in translate_gpa()"
        KVM: x86/mmu: Remove unused field mmio_cached in struct kvm_mmu_page
        kvm: x86: Increase KVM_SOFT_MAX_VCPUS to 710
        kvm: x86: Increase MAX_VCPUS to 1024
        kvm: x86: Set KVM_MAX_VCPU_ID to 4*KVM_MAX_VCPUS
        KVM: VMX: avoid running vmx_handle_exit_irqoff in case of emulation
        KVM: x86/mmu: Don't freak out if pml5_root is NULL on 4-level host
        KVM: s390: index kvm->arch.idle_mask by vcpu_idx
        KVM: s390: Enable specification exception interpretation
        KVM: arm64: Trim guest debug exception handling
        KVM: SVM: Add 5-level page table support for SVM
        ...
    • Merge branch 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging · a2b28235
      Linus Torvalds authored
      Pull dmi fix from Jean Delvare.
      
      Unbreak some existing udev/hwdb modalias matches due to misplaced
      product_sku field.
      
      * 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
        firmware: dmi: Move product_sku info to the end of the modalias
    • Merge tag 'ntb-5.15' of git://github.com/jonmason/ntb · 1735715e
      Linus Torvalds authored
      Pull NTB updates from Jon Mason:
       "Bug fixes and clean-ups for Linux v5.15"
      
      * tag 'ntb-5.15' of git://github.com/jonmason/ntb:
        NTB: switch from 'pci_' to 'dma_' API
        ntb: ntb_pingpong: remove redundant initialization of variables msg_data and spad_data
        NTB: perf: Fix an error code in perf_setup_inbuf()
        NTB: Fix an error code in ntb_msit_probe()
        ntb: intel: remove invalid email address in header comment
    • Merge tag 'rproc-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc · 21f577b0
      Linus Torvalds authored
      Pull remoteproc updates from Bjorn Andersson:
      
       - move the crash recovery worker to the freezable work queue to avoid
         interaction with other drivers during suspend & resume
      
       - fix a couple of typos in comments
      
       - add support for handling the audio DSP on SDM660
      
       - fix a race between the Qualcomm wireless subsystem driver and the
         associated driver for the RF chip
      
      * tag 'rproc-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc:
        remoteproc: q6v5_pas: Add sdm660 ADSP PIL compatible
        dt-bindings: remoteproc: qcom: adsp: Add SDM660 ADSP
        remoteproc: use freezable workqueue for crash notifications
        remoteproc: fix kernel doc for struct rproc_ops
        remoteproc: fix an typo in fw_elf_get_class code comments
        remoteproc: qcom: wcnss: Fix race with iris probe
    • Merge tag 'backlight-next-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight · 2d7b4cdb
      Linus Torvalds authored
      Pull backlight updates from Lee Jones:
       "Fix-ups:
         - Improve bootloader/kernel device handover
      
        Bug Fixes:
         - Stabilise backlight in ktd253 driver"
      
      * tag 'backlight-next-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight:
        backlight: pwm_bl: Improve bootloader/kernel device handover
        backlight: ktd253: Stabilize backlight
    • Merge tag 'mfd-next-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd · 86406a9e
      Linus Torvalds authored
      Pull MFD updates from Lee Jones:
       "Core Frameworks:
         - Add support for registering devices via MFD cells to Simple MFD (I2C)
      
        New Drivers:
         - Add support for Renesas Synchronization Management Unit (SMU)
      
        New Device Support:
         - Add support for N5010 to Intel M10 BMC
         - Add support for Cannon Lake to Intel LPSS ACPI
         - Add support for Samsung SSG{1,2} to ST-Ericsson's U8500 family
         - Add support for TQMx110EB and TQMxE40x to TQ-Systems PLD TQMx86
      
        New Functionality:
         - Add support for GPIO to Intel LPC ICH
         - Add support for Reset to Texas Instruments TPS65086
      
        Fix-ups:
         - Trivial, sorting, whitespace, renaming, etc; mt6360-core, db8500-prcmu-regs, tqmx86
         - Device Tree fiddling; syscon, axp20x, qcom,pm8008, ti,tps65086, brcm,cru
         - Use proper APIs for IRQ map resolution; ab8500-core, stmpe, tc3589x, wm8994-irq
         - Pass 'supplied-from' property through axp288_fuel_gauge via swnode
         - Remove unused file entry; MAINTAINERS
         - Make interrupt line optional; tps65086
         - Rename db8500-cpuidle driver symbol; db8500-prcmu
         - Remove support for unused hardware; tqmx86
         - Provide a standard LPC clock frequency for unknown boards; tqmx86
         - Remove unused code; ti_am335x_tscadc
         - Use of_iomap() instead of ioremap(); syscon
      
        Bug Fixes:
         - Clear GPIO IRQ resource flags when no IRQ is set; tqmx86
         - Fix incorrect/misleading frequencies; db8500-prcmu
         - Mitigate namespace clash with other GPIOBASE users"
      
      * tag 'mfd-next-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (31 commits)
        mfd: lpc_sch: Rename GPIOBASE to prevent build error
        mfd: syscon: Use of_iomap() instead of ioremap()
        dt-bindings: mfd: Add Broadcom CRU
        mfd: ti_am335x_tscadc: Delete superfluous error message
        mfd: tqmx86: Assume 24MHz LPC clock for unknown boards
        mfd: tqmx86: Add support for TQ-Systems DMI IDs
        mfd: tqmx86: Add support for TQMx110EB and TQMxE40x
        mfd: tqmx86: Fix typo in "platform"
        mfd: tqmx86: Remove incorrect TQMx90UC board ID
        mfd: tqmx86: Clear GPIO IRQ resource when no IRQ is set
        mfd: simple-mfd-i2c: Add support for registering devices via MFD cells
        mfd/cpuidle: ux500: Rename driver symbol
        mfd: tps65086: Add cell entry for reset driver
        mfd: tps65086: Make interrupt line optional
        dt-bindings: mfd: Convert tps65086.txt to YAML
        MAINTAINERS: Adjust ARM/NOMADIK/Ux500 ARCHITECTURES to file renaming
        mfd: db8500-prcmu: Handle missing FW variant
        mfd: db8500-prcmu: Rename register header
        mfd: axp20x: Add supplied-from property to axp288_fuel_gauge cell
        mfd: Don't use irq_create_mapping() to resolve a mapping
        ...
    • Merge tag 'gpio-updates-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux · 5e6a5845
      Linus Torvalds authored
      Pull gpio updates from Bartosz Golaszewski:
       "We mostly have various improvements and refactoring all over the place
        but also some interesting new features - like the virtio GPIO driver
        that allows guest VMs to use host's GPIOs. We also have a new/old GPIO
        driver for rockchip - this one has been split out of the pinctrl
        driver.
      
        Summary:
      
         - new driver: gpio-virtio allowing a guest VM running linux to access
           GPIO lines provided by the host
      
         - split the GPIO driver out of the rockchip pin control driver
      
         - add support for a new model to gpio-aspeed-sgpio, refactor the
           driver and use generic device property interfaces, improve property
           sanitization
      
         - add ACPI support to gpio-tegra186
      
         - improve the code setting the line names to support multiple GPIO
           banks per device
      
         - constify a bunch of OF functions in the core GPIO code and make the
           declaration for one of the core OF functions we use consistent
           within its header
      
         - use software nodes in intel_quark_i2c_gpio
      
         - add support for the gpio-line-names property in gpio-mt7621
      
         - use the standard GPIO function for setting the GPIO names in
           gpio-brcmstb
      
         - fix a bunch of leaks and other bugs in gpio-mpc8xxx
      
         - use generic pm callbacks in gpio-ml-ioh
      
         - improve resource management and PM handling in gpio-mlxbf2
      
         - modernize and improve the gpio-dwapb driver
      
         - coding style improvements in gpio-rcar
      
         - documentation fixes and improvements
      
         - update the MAINTAINERS entry for gpio-zynq
      
         - minor tweaks in several drivers"
      
      * tag 'gpio-updates-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: (35 commits)
        gpio: mpc8xxx: Use 'devm_gpiochip_add_data()' to simplify the code and avoid a leak
        gpio: mpc8xxx: Fix a potential double iounmap call in 'mpc8xxx_probe()'
        gpio: mpc8xxx: Fix a resources leak in the error handling path of 'mpc8xxx_probe()'
        gpio: viperboard: remove platform_set_drvdata() call in probe
        gpio: virtio: Add missing mailings lists in MAINTAINERS entry
        gpio: virtio: Fix sparse warnings
        gpio: remove the obsolete MX35 3DS BOARD MC9S08DZ60 GPIO functions
        gpio: max730x: Use the right include
        gpio: Add virtio-gpio driver
        gpio: mlxbf2: Use DEFINE_RES_MEM_NAMED() helper macro
        gpio: mlxbf2: Use devm_platform_ioremap_resource()
        gpio: mlxbf2: Drop wrong use of ACPI_PTR()
        gpio: mlxbf2: Convert to device PM ops
        gpio: dwapb: Get rid of legacy platform data
        mfd: intel_quark_i2c_gpio: Convert GPIO to use software nodes
        gpio: dwapb: Read GPIO base from gpio-base property
        gpio: dwapb: Unify ACPI enumeration checks in get_irq() and configure_irqs()
        gpiolib: Deduplicate forward declaration in the consumer.h header
        MAINTAINERS: update gpio-zynq.yaml reference
        gpio: tegra186: Add ACPI support
        ...
    • Merge tag 'fuse-update-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse · 75b96f0e
      Linus Torvalds authored
      Pull fuse updates from Miklos Szeredi:
      
       - Allow mounting an active fuse device. Previously the fuse device
         would always be mounted during initialization, and sharing a fuse
         superblock was only possible through mount or namespace cloning
      
       - Fix data flushing in syncfs (virtiofs only)
      
       - Fix data flushing in copy_file_range()
      
       - Fix a possible deadlock in atomic O_TRUNC
      
       - Misc fixes and cleanups
      
      * tag 'fuse-update-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
        fuse: remove unused arg in fuse_write_file_get()
        fuse: wait for writepages in syncfs
        fuse: flush extending writes
        fuse: truncate pagecache on atomic_o_trunc
        fuse: allow sharing existing sb
        fuse: move fget() to fuse_get_tree()
        fuse: move option checking into fuse_fill_super()
        fuse: name fs_context consistently
        fuse: fix use after free in fuse_read_interrupt()
    • Merge tag 'kgdb-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux · 996fe061
      Linus Torvalds authored
      Pull kgdb updates from Daniel Thompson:
       "Changes for kgdb/kdb this cycle are dominated by a change from Sumit
        that removes a small (256K) private heap from kdb. This is a change
        I've hoped for ever since I discovered how few users of this heap
        remained in the kernel, so many thanks to Sumit for hunting these
        down.
      
        The other change is an incremental step towards SPDX headers"
      
      * tag 'kgdb-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux:
        kernel: debug: Convert to SPDX identifier
        kdb: Rename members of struct kdbtab_t
        kdb: Simplify kdb_defcmd macro logic
        kdb: Get rid of redundant kdb_register_flags()
        kdb: Rename struct defcmd_set to struct kdb_macro
        kdb: Get rid of custom debug heap allocator
    • Revert "memcg: enable accounting for pollfd and select bits arrays" · 0bcfe68b
      Linus Torvalds authored
      This reverts commit b6558434.
      
      Just like with the memcg lock accounting, the kernel test robot reports
      a sizeable performance regression for this commit, and while it clearly
      does the right thing in theory, we'll need to look at just how to avoid
      or minimize the performance overhead of the memcg accounting.
      
      People already have suggestions on how to do that, but it's "future
      work".
      
      So revert it for now.
      
      [ Note: the first link below is for this same commit but a different
        commit ID, because the kernel test robot ended up noticing it in
        Andrew Morton's patch queue ]
      
      Link: https://lore.kernel.org/lkml/20210905132732.GC15026@xsang-OptiPlex-9020/
      Link: https://lore.kernel.org/lkml/20210907150757.GE17617@xsang-OptiPlex-9020/
      Acked-by: Jens Axboe <axboe@kernel.dk>
      Acked-by: Shakeel Butt <shakeelb@google.com>
      Acked-by: Roman Gushchin <guro@fb.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Revert "memcg: enable accounting for file lock caches" · 3754707b
      Linus Torvalds authored
      This reverts commit 0f12156d.
      
      The kernel test robot reports a sizeable performance regression for this
      commit, and while it clearly does the right thing in theory, we'll need
      to look at just how to avoid or minimize the performance overhead of the
      memcg accounting.
      
      People already have suggestions on how to do that, but it's "future
      work".
      
      So revert it for now.
      
      Link: https://lore.kernel.org/lkml/20210907150757.GE17617@xsang-OptiPlex-9020/
      Acked-by: Jens Axboe <axboe@kernel.dk>
      Acked-by: Shakeel Butt <shakeelb@google.com>
      Acked-by: Roman Gushchin <guro@fb.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Revert "mm/gup: remove try_get_page(), call try_get_compound_head() directly" · cd1adf1b
      Linus Torvalds authored
      This reverts commit 9857a17f.
      
      That commit was completely broken, and I should have caught on to it
      earlier.  But happily, the kernel test robot noticed the breakage fairly
      quickly.
      
      The breakage is because "try_get_page()" is about avoiding the page
      reference count overflow case, but is otherwise the exact same as a
      plain "get_page()".
      
      In contrast, "try_get_compound_head()" is an entirely different beast,
      and uses __page_cache_add_speculative() because it's not just about the
      page reference count, but also about possibly racing with the underlying
      page going away.
      
      So all the commentary about how
      
       "try_get_page() has fallen a little behind in terms of maintenance,
        try_get_compound_head() handles speculative page references more
        thoroughly"
      
      was just completely wrong: yes, try_get_compound_head() handles
      speculative page references, but the point is that try_get_page() does
      not, and must not.
      
      So there's no lack of maintenance - there are fundamentally different
      semantics.
      
      A speculative page reference would be entirely wrong in "get_page()",
      and it's entirely wrong in "try_get_page()".  It's not about
      speculation, it's purely about "uhhuh, you can't get this page because
      you've tried to increment the reference count too much already".
      
      The reason the kernel test robot noticed this bug was that it hit the
      VM_BUG_ON() in __page_cache_add_speculative(), which is all about
      verifying that the context of any speculative page access is correct.
      But since that isn't what try_get_page() is all about, the VM_BUG_ON()
      tests things that are not correct to test for try_get_page().

      Reported-by: kernel test robot <oliver.sang@intel.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
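
      The difference in semantics described in this revert can be sketched with a
      small user-space model. This is not the kernel code: struct model_page,
      model_try_get_page(), model_get_speculative() and the ctx_ok flag are
      hypothetical names, and a plain int stands in for the kernel's atomic page
      reference count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: just a reference count. */
struct model_page {
	int refcount;
};

enum spec_result { SPEC_OK, SPEC_PAGE_GONE, SPEC_BAD_CONTEXT };

/*
 * try_get_page()-style: the caller already holds a reference, so the
 * page cannot go away. The only reason to fail is a count that has
 * already overflowed to zero or below.
 */
static bool model_try_get_page(struct model_page *p)
{
	if (p->refcount <= 0)
		return false;
	/* Unsigned add keeps this model free of signed-overflow UB. */
	p->refcount = (int)((unsigned int)p->refcount + 1u);
	return true;
}

/*
 * Speculative-style: the caller holds no reference, so the page may be
 * on its way out. The ctx_ok flag stands in for the context check that
 * the commit message attributes to VM_BUG_ON().
 */
static enum spec_result model_get_speculative(struct model_page *p, bool ctx_ok)
{
	if (!ctx_ok)
		return SPEC_BAD_CONTEXT;
	if (p->refcount == 0)
		return SPEC_PAGE_GONE;
	p->refcount = (int)((unsigned int)p->refcount + 1u);
	return SPEC_OK;
}
```

      In this model, a caller of model_try_get_page() never has to care about its
      calling context, while a caller of model_get_speculative() can be rejected
      for context alone, which mirrors why substituting one for the other tripped
      the check.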
    • mfd: lpc_sch: Rename GPIOBASE to prevent build error · cdff1eda
      Randy Dunlap authored
      One MIPS platform (mach-rc32434) defines GPIOBASE. This macro
      conflicts with one of the same name in lpc_sch.c. Rename the latter one
      to prevent the build error.
      
      ../drivers/mfd/lpc_sch.c:25: error: "GPIOBASE" redefined [-Werror]
         25 | #define GPIOBASE        0x44
      ../arch/mips/include/asm/mach-rc32434/rb.h:32: note: this is the location of the previous definition
         32 | #define GPIOBASE        0x050000
      
      Cc: Denis Turischev <denis@compulab.co.il>
      Fixes: e82c60ae ("mfd: Introduce lpc_sch for Intel SCH LPC bridge")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: Lee Jones <lee.jones@linaro.org>
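
      A minimal sketch of the kind of rename described here, assuming a prefixed
      name. MODEL_SCH_GPIOBASE and the two helper functions are hypothetical and
      are not the identifiers used in the actual patch; the two values are the
      ones shown in the compiler output above.

```c
#include <assert.h>

/* Stand-in for a platform header that already defines the generic name. */
#define GPIOBASE 0x050000

/*
 * Driver-local register offset under a prefixed (hypothetical) name,
 * so it can no longer collide with the platform's GPIOBASE.
 */
#define MODEL_SCH_GPIOBASE 0x44

/* Both constants now coexist in one translation unit. */
static unsigned int model_sch_gpio_offset(void)
{
	return MODEL_SCH_GPIOBASE;
}

static unsigned int model_platform_gpio_base(void)
{
	return GPIOBASE;
}
```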
    • mfd: syscon: Use of_iomap() instead of ioremap() · 452d0741
      Hector Martin authored
      This automatically selects between ioremap() and ioremap_np() on
      platforms that require it, such as Apple SoCs.

      Signed-off-by: Hector Martin <marcan@marcan.st>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Lee Jones <lee.jones@linaro.org>
  2. 06 Sep, 2021 26 commits
    • thunderbolt: test: split up test cases in tb_test_credit_alloc_all · 4b93c544
      Linus Torvalds authored
      The tb_test_credit_alloc_all() function had a huge number of
      KUNIT_ASSERT() statements, all of which (through the magic of many many
      layers of inscrutable macros) ended up allocating and initializing
      various test assertion structures on the stack.
      
      Don't do that.  The kernel stack isn't infinite, and we have compiler
      warnings (now errors) for the case where a stack frame grows too large.
      
      Like it did here, by not an inconsiderable margin:
      
         drivers/thunderbolt/test.c: In function ‘tb_test_credit_alloc_all’:
         drivers/thunderbolt/test.c:2367:1: error: the frame size of 4500 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
          2367 | }
               | ^
      
      Solve this similarly to the lib/test_scanf case: split out the tests
      into several smaller functions, each just testing one particular tunnel
      credit allocation.
      
      This makes the i386 allyesconfig build work for me again.

      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • lib/test_scanf: split up number parsing test routines · ba7b1f86
      Linus Torvalds authored
      It turns out that gcc has real trouble merging all the temporary
      on-stack buffer allocation.  So despite the fact that their lifetimes do
      not overlap, gcc will allocate stack for all of them when they have
      different types.  Which they do in the number scanning test routines.
      
      This is unfortunate in general, but with lots of test-cases in one
      function, it becomes a real problem.  gcc will allocate a huge stack
      frame for no actual good reason.
      
      We have tried to counteract this tendency of gcc not merging stack slots
      (see "-fconserve-stack"), but that has limited effect (and should be on
      by default these days, iirc).
      
      So with all the debug options enabled on an i386 allmodconfig build, we
      end up with overly big stack frames, and the resulting stack frame size
      warnings (now errors):
      
         lib/test_scanf.c: In function ‘numbers_list_field_width_val_width’:
         lib/test_scanf.c:530:1: error: the frame size of 2088 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
           530 | }
               | ^
         lib/test_scanf.c: In function ‘numbers_list_field_width_typemax’:
         lib/test_scanf.c:488:1: error: the frame size of 2568 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
           488 | }
               | ^
         lib/test_scanf.c: In function ‘numbers_list’:
         lib/test_scanf.c:437:1: error: the frame size of 2088 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
           437 | }
               | ^
      
      In this particular case, the reasonably straightforward solution is to
      just split out the test routines into multiple more targeted versions.
      That way we don't have one huge stack, but several smaller ones, and
      they aren't active all at the same time.

      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
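
      The splitting approach can be illustrated in user space. This is a sketch
      under assumptions, not the code from lib/test_scanf.c: the function names
      are hypothetical, and the noinline attribute (a GCC/Clang extension) is
      added here only to make sure each helper keeps its own short-lived stack
      frame instead of being merged back into the caller.

```c
#include <assert.h>
#include <stdio.h>

/* Keep each helper out of line so its buffers live in a separate frame. */
#define MODEL_NOINLINE __attribute__((noinline))

/* One targeted routine per type: its buffers exist only while it runs. */
static MODEL_NOINLINE int check_int(void)
{
	char buf[128];
	int val = 0;

	snprintf(buf, sizeof(buf), "%d", 12345);
	return sscanf(buf, "%d", &val) == 1 && val == 12345;
}

static MODEL_NOINLINE int check_long_long(void)
{
	char buf[128];
	long long val = 0;

	snprintf(buf, sizeof(buf), "%lld", 9000000000LL);
	return sscanf(buf, "%lld", &val) == 1 && val == 9000000000LL;
}

/* The former "one big function" is reduced to a thin dispatcher. */
static int run_number_checks(void)
{
	return check_int() && check_long_long();
}
```

      The dispatcher's own frame stays small, and the per-type frames are never
      active at the same time.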
    • iwl: fix debug printf format strings · 1476ff21
      Linus Torvalds authored
      The variable 'package_size' is an unsigned long, and should be printed
      out using '%lu', not '%zd' (that would be for a size_t).
      
      Yes, on many architectures (including x86-64), 'size_t' is in fact the
      same type as 'long', but that's a fairly random architecture definition,
      and on some platforms 'size_t' is in fact 'int' rather than 'long'.
      
      That is the case on traditional 32-bit x86.  Yes, both types are the
      exact same 32-bit size, and it would all print out perfectly correctly,
      but '%zd' ends up still being wrong.
      
      And we can't make 'package_size' be a 'size_t', because we get the
      actual value using efivar_entry_get() that takes a pointer to an
      'unsigned long'.  So '%lu' it is.
      
      This fixes two of the i386 allmodconfig build warnings (that are now
      errors due to -Werror).

      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
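
      A short user-space illustration of matching the conversion to the declared
      type rather than to its size. The helper format_sizes() and the count
      parameter are hypothetical; only package_size and its 'unsigned long' type
      come from the commit message. The sketch uses '%zu', the standard C
      conversion for an unsigned size_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Format an 'unsigned long' with %lu and a 'size_t' with %zu. The two
 * types may have the same width on a given platform, but the format
 * checker compares types, not sizes.
 */
static int format_sizes(char *buf, size_t len,
			unsigned long package_size, size_t count)
{
	return snprintf(buf, len, "package_size=%lu count=%zu",
			package_size, count);
}
```

      Swapping the two conversions would still print the same digits on many
      platforms, which is exactly why the mismatch only shows up as a compiler
      diagnostic.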
    • Merge tag 'block-5.15-2021-09-05' of git://git.kernel.dk/linux-block · 1dbe7e38
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "Was going to send this one in later this week, but given that -Werror
        is now enabled (or at least available), the mq-deadline fix really
        should go in for the folks hitting that.
      
         - Ensure dd_queued() is only there if needed (Geert)
      
         - Fix a kerneldoc warning for bio_alloc_kiocb()
      
         - BFQ fix for queue merging
      
         - loop locking fix (Tetsuo)"
      
      * tag 'block-5.15-2021-09-05' of git://git.kernel.dk/linux-block:
        loop: reduce the loop_ctl_mutex scope
        bio: fix kerneldoc documentation for bio_alloc_kiocb()
        block, bfq: honor already-setup queue merges
        block/mq-deadline: Move dd_queued() to fix defined but not used warning
    • Merge tag 'misc-5.15-2021-09-05' of git://git.kernel.dk/linux-block · 03085b3d
      Linus Torvalds authored
      Pull CDROM maintainer update from Jens Axboe:
       "It's been about 22 years since I originally started maintaining the
        CDROM code, and I just haven't been able to even get reviews done in a
        timely fashion the last handful of years.
      
        Time to pass it on, and Phillip has volunteered to take over these
        duties. I'll be helping as needed for the foreseeable future"
      
      * tag 'misc-5.15-2021-09-05' of git://git.kernel.dk/linux-block:
        cdrom: update uniform CD-ROM maintainership in MAINTAINERS file
    • Merge tag 'libata-5.15-2021-09-05' of git://git.kernel.dk/linux-block · eebb4159
      Linus Torvalds authored
      Pull libata fixes from Jens Axboe:
       "Fixes for queued trim on certain Samsung SSDs, in conjunction with
        certain ATI controllers"
      
      * tag 'libata-5.15-2021-09-05' of git://git.kernel.dk/linux-block:
        libata: Add ATA_HORKAGE_NO_NCQ_ON_ATI for Samsung 860 and 870 SSD.
        libata: add ATA_HORKAGE_NO_NCQ_TRIM for Samsung 860 and 870 SSDs
    • Merge tag 'for-5.15/io_uring-2021-09-04' of git://git.kernel.dk/linux-block · 60f8fbaa
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
       "As sometimes happens, two reports came in around the merge window open
        that led to some fixes. Hence this one is a bit bigger than usual
        followup fixes, but most of it will be going towards stable, outside
        of the fixes that are addressing regressions from this merge window.
      
        In detail:
      
         - postgres is a heavy user of signals between tasks, and if we're
           unlucky this can interfere with io-wq worker creation. Make sure
           we're resilient against unrelated signal handling. This set of
           changes also includes hardening against allocation failures, which
           could previously have led to stalls.
      
         - Some use cases that end up having a mix of bounded and unbounded
           work would have starvation issues related to that. Split the
           pending work lists to handle that better.
      
         - Completion trace int -> unsigned -> long fix
      
         - Fix issue with REGISTER_IOWQ_MAX_WORKERS and SQPOLL
      
         - Fix regression with hash wait lock in this merge window
      
         - Fix retry issued on block devices (Ming)
      
         - Fix regression with links in this merge window (Pavel)
      
         - Fix race with multi-shot poll and completions (Xiaoguang)
      
         - Ensure regular file IO doesn't inadvertently skip completion
           batching (Pavel)
      
         - Ensure submissions are flushed after running task_work (Pavel)"
      
      * tag 'for-5.15/io_uring-2021-09-04' of git://git.kernel.dk/linux-block:
        io_uring: io_uring_complete() trace should take an integer
        io_uring: fix possible poll event lost in multi shot mode
        io_uring: prolong tctx_task_work() with flushing
        io_uring: don't disable kiocb_done() CQE batching
        io_uring: ensure IORING_REGISTER_IOWQ_MAX_WORKERS works with SQPOLL
        io-wq: make worker creation resilient against signals
        io-wq: get rid of FIXED worker flag
        io-wq: only exit on fatal signals
        io-wq: split bounded and unbounded work into separate lists
        io-wq: fix queue stalling race
        io_uring: don't submit half-prepared drain request
        io_uring: fix queueing half-created requests
        io-wq: ensure that hash wait lock is IRQ disabling
        io_uring: retry in case of short read on block device
        io_uring: IORING_OP_WRITE needs hash_reg_file set
        io-wq: fix race between adding work and activating a free worker
    • kernel: debug: Convert to SPDX identifier · f8416aa2
      Cai Huoqing authored
      Use SPDX-License-Identifier instead of a verbose license text.

      Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
      Link: https://lore.kernel.org/r/20210906112302.937-1-caihuoqing@baidu.com
      Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
    • KVM: Drop unused kvm_dirty_gfn_invalid() · 109bbba5
      Peter Xu authored
      Drop the unused function as reported by the test bot.

      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Message-Id: <20210901230904.15164-1-peterx@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • fuse: remove unused arg in fuse_write_file_get() · a9667ac8
      Miklos Szeredi authored
      The struct fuse_conn argument is not used and can be removed.

      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • fuse: wait for writepages in syncfs · 660585b5
      Miklos Szeredi authored
      In case of fuse the MM subsystem doesn't guarantee that page writeback
      completes by the time ->sync_fs() is called.  This is because fuse
      completes page writeback immediately to prevent DoS of memory reclaim by
      the userspace file server.
      
      This means that fuse itself must ensure that writes are synced before
      sending the SYNCFS request to the server.
      
      Introduce sync buckets, that hold a counter for the number of outstanding
      write requests.  On syncfs replace the current bucket with a new one and
      wait until the old bucket's counter goes down to zero.
      
      It is possible to have multiple syncfs calls in parallel, in which case
      there could be more than one waited-on bucket.  Descendant buckets must
      not complete until the parent completes.  Add a count to the child (new)
      bucket until the (parent) old bucket completes.
      
      Use RCU protection to dereference the current bucket and to wake up an
      emptied bucket.  Use fc->lock to protect against parallel assignments to
      the current bucket.
      
      This leaves just the counter to be a possible scalability issue.  The
      fc->num_waiting counter has a similar issue, so both should be addressed at
      the same time.
      Reported-by: Amir Goldstein <amir73il@gmail.com>
      Fixes: 2d82ab25 ("virtiofs: propagate sync() to file server")
      Cc: <stable@vger.kernel.org> # v5.14
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      660585b5
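The bucket scheme described in this commit can be modeled in user space. The following is a minimal Python sketch under stated assumptions, not the fuse implementation: the names `SyncBucket`, `Conn`, `write_begin`, `write_end` and `syncfs` are hypothetical, a `threading.Condition` stands in for the kernel wait queue, and a plain lock replaces the RCU dereference of the current bucket.

```python
import threading


class SyncBucket:
    """Counts outstanding writes; starts at 1 (the 'active' reference)."""

    def __init__(self):
        self.count = 1
        self.cond = threading.Condition()

    def inc(self):
        with self.cond:
            self.count += 1

    def dec(self):
        with self.cond:
            self.count -= 1
            if self.count == 0:
                self.cond.notify_all()  # wake the syncfs waiter

    def wait_empty(self):
        with self.cond:
            self.cond.wait_for(lambda: self.count == 0)


class Conn:
    """Hypothetical stand-in for the connection holding the current bucket."""

    def __init__(self):
        self.lock = threading.Lock()  # protects assignments to self.curr
        self.curr = SyncBucket()

    def write_begin(self):
        # Assumption: a lock is used here where the commit uses RCU.
        with self.lock:
            bucket = self.curr
            bucket.inc()
        return bucket

    def write_end(self, bucket):
        bucket.dec()

    def syncfs(self):
        new = SyncBucket()
        with self.lock:
            old = self.curr
            if old.count == 1:  # only the active reference: nothing to wait for
                return
            new.inc()           # child must not complete before the parent
            self.curr = new
        old.dec()               # drop the active reference of the old bucket
        old.wait_empty()        # wait for writes issued before this syncfs
        new.dec()               # parent completed: release the child
```

A write started before `syncfs()` keeps the old bucket's counter above zero, so the caller blocks until `write_end()` is called for it. Writes started afterwards are counted in the new bucket and do not delay the current sync.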
    • Zelin Deng's avatar
      KVM: x86: Update vCPU's hv_clock before back to guest when tsc_offset is adjusted · d9130a2d
      Zelin Deng authored
      When the guest writes MSR_IA32_TSC_ADJUST (the TSC ADJUST feature),
      especially when there is a big TSC warp (for example, a new vCPU is
      hot-added to a VM that has been up for a long time), a large value is
      added to tsc_offset before going back to the guest. This causes a
      system time jump, because tsc_timestamp is not adjusted in the meantime
      and pvclock is expected to be monotonic.
      To fix this, notify KVM to update the vCPU's guest time before going
      back to the guest.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Zelin Deng <zelin.deng@linux.alibaba.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Message-Id: <1619576521-81399-2-git-send-email-zelin.deng@linux.alibaba.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      d9130a2d
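The time jump can be illustrated with the pvclock conversion the guest performs on the values published by the host. This is a minimal Python sketch, assuming a 1 GHz TSC so that one tick equals one nanosecond. The function name and the sample numbers are illustrative and are not taken from the patch.

```python
def pvclock_ns(guest_tsc, tsc_timestamp, system_time, mul, shift):
    """Guest-side pvclock conversion: scale the TSC delta, add the base time."""
    delta = guest_tsc - tsc_timestamp
    delta = delta << shift if shift >= 0 else delta >> -shift
    return system_time + ((delta * mul) >> 32)


# Assumed scaling for a 1 GHz TSC: 1 tick == 1 ns.
MUL, SHIFT = 1 << 32, 0

# Snapshot published by the host: guest TSC 1_000 corresponds to 5_000 ns.
tsc_timestamp, system_time = 1_000, 5_000

# Guest time shortly after the snapshot was taken.
before = pvclock_ns(2_000, tsc_timestamp, system_time, MUL, SHIFT)

# The guest writes TSC_ADJUST and its TSC moves forward by `warp` ticks,
# but the snapshot (tsc_timestamp, system_time) is left stale.
warp = 1_000_000
stale = pvclock_ns(2_000 + warp, tsc_timestamp, system_time, MUL, SHIFT)

# With a snapshot refreshed at the same instant, guest time is continuous.
fresh = pvclock_ns(2_000 + warp, 2_000 + warp, before, MUL, SHIFT)
```

With the stale snapshot the guest clock jumps by the full warp, while the refreshed snapshot yields the same time as before the write. That refresh is what the fix requests before re-entering the guest.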
    • Paolo Bonzini's avatar
      KVM: MMU: mark role_regs and role accessors as maybe unused · 4ac21457
      Paolo Bonzini authored
      It is reasonable for these functions to be used only in some configurations,
      for example only if the host is 64-bit (and therefore supports 64-bit
      guests).  It is also reasonable to keep the role_regs and role accessors
      in sync even though some of the accessors may be used only for one of the
      two sets (as is the case currently for CR4.LA57).
      
      Because clang reports warnings for unused inlines declared in a .c file,
      mark both sets of accessors as __maybe_unused.
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      4ac21457
    • Huacai Chen's avatar
      KVM: MIPS: Remove a "set but not used" variable · a3cf527e
      Huacai Chen authored
      This fixes a build warning:
      
         arch/mips/kvm/vz.c: In function '_kvm_vz_restore_htimer':
      >> arch/mips/kvm/vz.c:392:10: warning: variable 'freeze_time' set but not used [-Wunused-but-set-variable]
           392 |  ktime_t freeze_time;
               |          ^~~~~~~~~~~
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
      Message-Id: <20210406024911.2008046-1-chenhuacai@loongson.cn>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      a3cf527e
    • Paolo Bonzini's avatar
      Merge tag 'kvmarm-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD · e99314a3
      Paolo Bonzini authored
      KVM/arm64 updates for 5.15
      
      - Page ownership tracking between host EL1 and EL2
      
      - Rely on userspace page tables to create large stage-2 mappings
      
      - Fix incompatibility between pKVM and kmemleak
      
      - Fix the PMU reset state, and improve the performance of the virtual PMU
      
      - Move over to the generic KVM entry code
      
      - Address PSCI reset issues w.r.t. save/restore
      
      - Preliminary rework for the upcoming pKVM fixed feature
      
      - A bunch of MM cleanups
      
      - a vGIC fix for timer spurious interrupts
      
      - Various cleanups
      e99314a3
    • Paolo Bonzini's avatar
      Merge tag 'kvm-s390-next-5.15-1' of... · 0d0a1939
      Paolo Bonzini authored
      Merge tag 'kvm-s390-next-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
      
      KVM: s390: Fix and feature for 5.15
      
      - enable interpretation of specification exceptions
      - fix a vcpu_idx vs vcpu_id mixup
      0d0a1939
    • Lai Jiangshan's avatar
      x86/kvm: Don't enable IRQ when IRQ enabled in kvm_wait · a40b2fd0
      Lai Jiangshan authored
      Commit f4e61f0c ("x86/kvm: Fix broken irq restoration in kvm_wait")
      replaced "local_irq_restore() when IRQ enabled" with "local_irq_enable()
      when IRQ enabled" to suppress a warning.
      
      There is no comparable debugging warning for calling local_irq_enable()
      when IRQs are enabled, as there is for calling local_irq_restore() in the
      same situation.  But calling local_irq_enable() when IRQs are already
      enabled is no less broken than calling local_irq_restore(), so it is
      better to avoid it.
      
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
      Message-Id: <20210814035129.154242-1-jiangshanlai@gmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      a40b2fd0
    • Jing Zhang's avatar
      KVM: stats: Add VM stat for remote tlb flush requests · 3cc4e148
      Jing Zhang authored
      Add a new stat that counts the number of times a remote TLB flush is
      requested, regardless of whether it kicks vCPUs out of guest mode. This
      allows us to look at how often flushes are initiated.
      
      Unlike remote_tlb_flush, this one applies to ARM's instruction-set-based
      TLB flush implementation, so apply it there too.
      Original-by: David Matlack <dmatlack@google.com>
      Signed-off-by: Jing Zhang <jingzhangos@google.com>
      Message-Id: <20210817002639.3856694-1-jingzhangos@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      3cc4e148
    • Sean Christopherson's avatar
      KVM: Remove unnecessary export of kvm_{inc,dec}_notifier_count() · fdde13c1
      Sean Christopherson authored
      Don't export KVM's MMU notifier count helpers; under no circumstances
      should any downstream module, including x86's vendor code, have a
      legitimate reason to piggyback on KVM's MMU notifier logic.  E.g. in the
      x86 case, only KVM's MMU should be elevating the notifier count, and that
      code is always built into the core kvm.ko module.
      
      Fixes: edb298c6 ("KVM: x86/mmu: bump mmu notifier count in kvm_zap_gfn_range")
      Cc: Maxim Levitsky <mlevitsk@redhat.com>
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20210902175951.1387989-1-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      fdde13c1
    • Sean Christopherson's avatar
      KVM: x86/mmu: Move lpage_disallowed_link further "down" in kvm_mmu_page · 1148bfc4
      Sean Christopherson authored
      Move "lpage_disallowed_link" out of the first 64 bytes, i.e. out of the
      first cache line, of kvm_mmu_page so that "spt" and to a lesser extent
      "gfns" land in the first cache line.  "lpage_disallowed_link" is accessed
      relatively infrequently compared to "spt", which is accessed any time KVM
      is walking and/or manipulating the shadow page tables.
      
      No functional change intended.
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20210901221023.1303578-4-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      1148bfc4
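The effect of moving a cold field out of the first cache line can be checked with field offsets. This is a minimal Python sketch using `ctypes`. The two layouts are hypothetical and much smaller than the real kvm_mmu_page; only the field names `lpage_disallowed_link`, `spt` and `gfns` come from the commit message, and every field is modeled as a fixed 8-byte value so the offsets do not depend on the host.

```python
import ctypes

CACHE_LINE = 64  # bytes, the cache line size named in the commit message


class ListHead(ctypes.Structure):
    """Two 8-byte links, standing in for a 64-bit struct list_head."""
    _fields_ = [("next", ctypes.c_uint64), ("prev", ctypes.c_uint64)]


class Before(ctypes.Structure):
    """Hypothetical layout with the rarely used link near the top."""
    _fields_ = [
        ("link", ListHead),
        ("hash_link", ListHead),
        ("lpage_disallowed_link", ListHead),
        ("flags", ctypes.c_uint64),
        ("gfn", ctypes.c_uint64),
        ("spt", ctypes.c_uint64),
        ("gfns", ctypes.c_uint64),
    ]


class After(ctypes.Structure):
    """Same fields, with the rarely used link moved further down."""
    _fields_ = [
        ("link", ListHead),
        ("hash_link", ListHead),
        ("flags", ctypes.c_uint64),
        ("gfn", ctypes.c_uint64),
        ("spt", ctypes.c_uint64),
        ("gfns", ctypes.c_uint64),
        ("lpage_disallowed_link", ListHead),
    ]


def in_first_cache_line(struct, name):
    """True if the whole field lies within the first CACHE_LINE bytes."""
    field = getattr(struct, name)
    return field.offset + field.size <= CACHE_LINE
```

In this toy layout, `in_first_cache_line(Before, "spt")` is false and `in_first_cache_line(After, "spt")` is true, while `ctypes.sizeof()` reports the same size for both structures. This mirrors the "no functional change" nature of such a reordering.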
    • Sean Christopherson's avatar
      KVM: x86/mmu: Relocate kvm_mmu_page.tdp_mmu_page for better cache locality · ca41c34c
      Sean Christopherson authored
      Move "tdp_mmu_page" into the 1-byte void left by the recently removed
      "mmio_cached" so that it resides in the first 64 bytes of kvm_mmu_page,
      i.e. in the same cache line as the most commonly accessed fields.
      
      Don't bother wrapping tdp_mmu_page in CONFIG_X86_64; including the field in
      32-bit builds doesn't affect the size of kvm_mmu_page, and a future patch
      can always wrap the field in the unlikely event KVM gains a 1-byte flag
      that is 32-bit specific.
      
      Note, the size of kvm_mmu_page is also unchanged on CONFIG_X86_64=y due
      to it previously sharing an 8-byte chunk with write_flooding_count.
      
      No functional change intended.
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20210901221023.1303578-3-seanjc@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      ca41c34c
    • Sean Christopherson's avatar
      Revert "KVM: x86: mmu: Add guest physical address check in translate_gpa()" · e7177339
      Sean Christopherson authored
      Revert a misguided illegal GPA check when "translating" a non-nested GPA.
      The check is woefully incomplete as it does not fill in @exception as
      expected by all callers, which leads to KVM attempting to inject a bogus
      exception, potentially exposing kernel stack information in the process.
      
       WARNING: CPU: 0 PID: 8469 at arch/x86/kvm/x86.c:525 exception_type+0x98/0xb0 arch/x86/kvm/x86.c:525
       CPU: 1 PID: 8469 Comm: syz-executor531 Not tainted 5.14.0-rc7-syzkaller #0
       RIP: 0010:exception_type+0x98/0xb0 arch/x86/kvm/x86.c:525
       Call Trace:
        x86_emulate_instruction+0xef6/0x1460 arch/x86/kvm/x86.c:7853
        kvm_mmu_page_fault+0x2f0/0x1810 arch/x86/kvm/mmu/mmu.c:5199
        handle_ept_misconfig+0xdf/0x3e0 arch/x86/kvm/vmx/vmx.c:5336
        __vmx_handle_exit arch/x86/kvm/vmx/vmx.c:6021 [inline]
        vmx_handle_exit+0x336/0x1800 arch/x86/kvm/vmx/vmx.c:6038
        vcpu_enter_guest+0x2a1c/0x4430 arch/x86/kvm/x86.c:9712
        vcpu_run arch/x86/kvm/x86.c:9779 [inline]
        kvm_arch_vcpu_ioctl_run+0x47d/0x1b20 arch/x86/kvm/x86.c:10010
        kvm_vcpu_ioctl+0x49e/0xe50 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3652
      
      The bug has escaped notice because practically speaking the GPA check is
      useless.  The GPA check in question only comes into play when KVM is
      walking guest page tables (or "translating" CR3), and KVM already handles
      illegal GPA checks by setting reserved bits in rsvd_bits_mask for each
      PxE, or in the case of CR3 for loading PDPTRs, manually checks for an
      illegal CR3.  This particular failure doesn't hit the existing reserved
      bits checks because syzbot sets guest.MAXPHYADDR=1, and IA32 architecture
      simply doesn't allow for such an absurd MAXPHYADDR, e.g. 32-bit paging
      doesn't define any reserved PA bits checks, which KVM emulates by only
      incorporating the reserved PA bits into the "high" bits, i.e. bits 63:32.
      
      Simply remove the bogus check.  There is zero meaningful value and no
      architectural justification for supporting guest.MAXPHYADDR < 32, and
      properly filling the exception would introduce non-trivial complexity.
      
      This reverts commit ec7771ab.
      
      Fixes: ec7771ab ("KVM: x86: mmu: Add guest physical address check in translate_gpa()")
      Cc: stable@vger.kernel.org
      Reported-by: syzbot+200c08e88ae818f849ce@syzkaller.appspotmail.com
      Signed-off-by: Sean Christopherson <seanjc@google.com>
      Message-Id: <20210831164224.1119728-2-seanjc@google.com>
      Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      e7177339
    • Jia He's avatar
      KVM: x86/mmu: Remove unused field mmio_cached in struct kvm_mmu_page · 678a305b
      Jia He authored
      After reverting and restoring the fast TLB invalidation patch series,
      mmio_cached was not removed. Hence an unused field is left in
      kvm_mmu_page.
      
      Cc: Sean Christopherson <seanjc@google.com>
      Signed-off-by: Jia He <justin.he@arm.com>
      Message-Id: <20210830145336.27183-1-justin.he@arm.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      678a305b
    • Eduardo Habkost's avatar
      kvm: x86: Increase KVM_SOFT_MAX_VCPUS to 710 · 1dbaf04c
      Eduardo Habkost authored
      Support for 710 VCPUs has been tested by Red Hat since RHEL-8.4,
      so increase KVM_SOFT_MAX_VCPUS to 710.
      Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
      Message-Id: <20210903211600.2002377-4-ehabkost@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      1dbaf04c
    • Eduardo Habkost's avatar
      kvm: x86: Increase MAX_VCPUS to 1024 · 074c82c8
      Eduardo Habkost authored
      Increase KVM_MAX_VCPUS to 1024, so we can test larger VMs.
      
      I'm not changing KVM_SOFT_MAX_VCPUS yet because I'm afraid it
      might involve complicated questions around the meaning of
      "supported" and "recommended" in the upstream tree.
      KVM_SOFT_MAX_VCPUS will be changed in a separate patch.
      
      For reference, visible effects of this change are:
      - KVM_CAP_MAX_VCPUS will now return 1024 (of course)
      - Default value for CPUID[HYPERV_CPUID_IMPLEMENT_LIMITS (0x40000005)].EAX
        will now be 1024
      - KVM_MAX_VCPU_ID will change from 1151 to 4096
      - Size of struct kvm will increase from 19328 to 22272 bytes
        (in x86_64)
      - Size of struct kvm_ioapic will increase from 1780 to 5084 bytes
        (in x86_64)
      - Bitmap stack variables that will grow:
        - At kvm_hv_flush_tlb() and kvm_hv_send_ipi(),
          vp_bitmap[] and vcpu_bitmap[] will now be 128 bytes long
        - vcpu_bitmap at ioapic_write_indirect() will be 128 bytes long
          once patch "KVM: x86: Fix stack-out-of-bounds memory access
          from ioapic_write_indirect()" is applied
      Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
      Message-Id: <20210903211600.2002377-3-ehabkost@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      074c82c8
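The 128-byte figure quoted for the on-stack bitmaps follows from one bit per possible vCPU, rounded up to whole machine words. This is a minimal Python sketch of that arithmetic, assuming 64-bit longs as on x86_64. The helper names mirror the kernel's BITS_TO_LONGS() idea but are written here for illustration only.

```python
BITS_PER_LONG = 64  # assumption: x86_64, where unsigned long is 8 bytes


def bits_to_longs(nbits):
    """Number of longs needed to hold nbits bits, rounded up."""
    return (nbits + BITS_PER_LONG - 1) // BITS_PER_LONG


def bitmap_bytes(nbits):
    """Storage in bytes for a bitmap of nbits bits kept in whole longs."""
    return bits_to_longs(nbits) * (BITS_PER_LONG // 8)


KVM_MAX_VCPUS = 1024  # new value introduced by this commit
vcpu_bitmap_size = bitmap_bytes(KVM_MAX_VCPUS)  # 16 longs of 8 bytes each
```

For 1024 vCPUs this gives 16 longs, i.e. 128 bytes per bitmap, matching the stack usage listed in the commit message.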