1. 09 Aug, 2020 3 commits
    • x86: Expose SERIALIZE for supported cpuid · 43bd9ef4
      Paolo Bonzini authored
      The SERIALIZE instruction is supported by Intel processors, like
      Sapphire Rapids.  SERIALIZE is a faster serializing instruction which
      does not modify registers, arithmetic flags or memory, and will not cause
      a VM exit. Its availability is indicated by CPUID.(EAX=7,ECX=0):EDX[bit 14].
      
      Expose it in KVM supported CPUID.  This way, KVM can pass this
      information to guests, which can then make use of the feature accordingly.
      Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      43bd9ef4
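A minimal sketch of the feature test described in the commit above, assuming the caller has already executed CPUID with EAX=7, ECX=0 and holds the returned EDX value (the Intel SDM lists SERIALIZE at bit 14 of EDX for this leaf). The helper name and macro are hypothetical and are not taken from the KVM patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit position of SERIALIZE in CPUID.(EAX=7,ECX=0):EDX. */
#define CPUID_7_0_EDX_SERIALIZE_BIT 14

/*
 * Hypothetical helper: edx is the EDX register value returned by
 * CPUID leaf 7, subleaf 0. Returns true if SERIALIZE is advertised.
 */
static bool cpuid_edx_has_serialize(uint32_t edx)
{
	return (edx >> CPUID_7_0_EDX_SERIALIZE_BIT) & 1u;
}
```

A guest would see the bit set only if the hypervisor chooses to expose it, which is what the commit enables for KVM.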
    • Merge tag 'kvmarm-5.9' of... · 0378daef
      Paolo Bonzini authored
      Merge tag 'kvmarm-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-next-5.6
      
      KVM/arm64 updates for Linux 5.9:
      
      - Split the VHE and nVHE hypervisor code bases, build the EL2 code
        separately, allowing for the VHE code to now be built with instrumentation
      
      - Level-based TLB invalidation support
      
      - Restructure of the vcpu register storage to accommodate the NV code
      
      - Pointer Authentication available for guests on nVHE hosts
      
      - Simplification of the system register table parsing
      
      - MMU cleanups and fixes
      
      - A number of post-32bit cleanups and other fixes
      0378daef
    • KVM: x86: Don't attempt to load PDPTRs when 64-bit mode is enabled · 05487215
      Sean Christopherson authored
      Don't attempt to load PDPTRs if EFER.LME=1, i.e. if 64-bit mode is
      enabled.  A recent change to reload the PDPTRs when CR0.CD or CR0.NW is
      toggled botched the EFER.LME handling and sends KVM down the PDPTR path
      when is_paging() is true, i.e. when the guest toggles CD/NW in 64-bit
      mode.
      
      Split the CR0 checks for 64-bit vs. 32-bit PAE into separate paths.  The
      64-bit path is specifically checking state when paging is toggled on,
      i.e. CR0.PG transitions from 0->1.  The PDPTR path now needs to run if
      the new CR0 state has paging enabled, irrespective of whether paging was
      already enabled.  Trying to shave a few cycles to make the PDPTR path an
      "else if" case is a mess.
      
      Fixes: d42e3fae ("kvm: x86: Read PDPTEs on CR0.CD and CR0.NW changes")
      Cc: Jim Mattson <jmattson@google.com>
      Cc: Oliver Upton <oupton@google.com>
      Cc: Peter Shier <pshier@google.com>
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Reviewed-by: Jim Mattson <jmattson@google.com>
      Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
      Message-Id: <20200714015732.32426-1-sean.j.christopherson@intel.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      05487215
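The split between the two CR0 checks described in the commit above can be modelled as two independent predicates. This is a simplified user-space sketch, not the KVM code: the function names are hypothetical, and the set of CR0 bits whose change triggers a PDPTR reload (CD, NW and PG here) is an assumption made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural CR0 bit positions. */
#define CR0_NW (1u << 29)
#define CR0_CD (1u << 30)
#define CR0_PG (1u << 31)

/* 64-bit path: only relevant when paging is toggled on (CR0.PG 0->1)
 * while EFER.LME is set. */
static bool enters_long_mode(uint32_t old_cr0, uint32_t new_cr0, bool efer_lme)
{
	return efer_lme && !(old_cr0 & CR0_PG) && (new_cr0 & CR0_PG);
}

/* PDPTR path: runs whenever the new CR0 has paging enabled on a 32-bit PAE
 * guest and a tracked bit changed, whether or not paging was already on.
 * It is never taken when EFER.LME is set. */
static bool needs_pdptr_load(uint32_t old_cr0, uint32_t new_cr0,
			     bool efer_lme, bool pae)
{
	const uint32_t pdptr_bits = CR0_CD | CR0_NW | CR0_PG; /* assumed set */

	return !efer_lme && pae && (new_cr0 & CR0_PG) &&
	       ((old_cr0 ^ new_cr0) & pdptr_bits);
}
```

Keeping the predicates separate avoids the "else if" coupling that the commit message calls a mess: a 64-bit guest toggling CD or NW with paging enabled matches neither path.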
  2. 08 Aug, 2020 10 commits
    • Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 06a81c1c
      Linus Torvalds authored
      Pull arm64 fixes from Catalin Marinas:
      
       - Fix tegra194-cpufreq module build failure caused by __cpu_logical_map
         not being exported.
      
       - Improve fixed_addresses comment regarding the fixmap buffer sizes.
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: Fix __cpu_logical_map undefined issue
        arm64/fixmap: make notes of fixed_addresses more precisely
      06a81c1c
    • arm64: Fix __cpu_logical_map undefined issue · eaecca9e
      Kefeng Wang authored
      The __cpu_logical_map undefined issue occurred when building the new
      tegra194-cpufreq driver as a module.
      
      ERROR: modpost: "__cpu_logical_map" [drivers/cpufreq/tegra194-cpufreq.ko] undefined!
      
      The driver uses the cpu_logical_map() macro, which expands to
      __cpu_logical_map, a symbol that a modular driver cannot access
      because it is not exported. Turn cpu_logical_map() into a C wrapper
      and export it to fix the build issue.

      Also create a function set_cpu_logical_map(cpu, hwid) for assigning
      a value to cpu_logical_map(cpu).
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      eaecca9e
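The macro-to-wrapper change in the commit above can be illustrated with a small user-space model. This is a sketch under stated assumptions: the array size, the table name and the `uint64_t` element type are hypothetical, and the kernel's symbol export step is omitted because it has no user-space equivalent.

```c
#include <stdint.h>

#define NR_CPUS 8 /* hypothetical CPU count for the model */

/* Private table; in the kernel this is the array that modules could not
 * reach while cpu_logical_map() was a macro expanding to it. */
static uint64_t logical_map_table[NR_CPUS];

/* Getter: a real function, so only this symbol needs to be exported. */
uint64_t cpu_logical_map(unsigned int cpu)
{
	return logical_map_table[cpu];
}

/* Setter replacing direct assignment through the old macro. */
void set_cpu_logical_map(unsigned int cpu, uint64_t hwid)
{
	logical_map_table[cpu] = hwid;
}
```

Because callers no longer touch the array directly, a function result cannot be assigned to, which is why a dedicated setter is needed alongside the getter.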
    • arm64/fixmap: make notes of fixed_addresses more precisely · 489577d7
      Pingfan Liu authored
      These 'compile-time allocated' memory buffers can occupy more than one
      page and each enum increment is page-sized. So improve the note about it.
      Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
      Cc: Will Deacon <will@kernel.org>
      Link: https://lore.kernel.org/r/1596460720-19243-1-git-send-email-kernelfans@gmail.com
      To: linux-arm-kernel@lists.infradead.org
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      489577d7
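To illustrate the point of the commit above, that each enum increment is one page and a buffer may span several enum slots, here is a small sketch. The enum entries, `BUF_PAGES` and the `FIXADDR_TOP` value are hypothetical; the index-to-address formula mirrors the usual fixmap convention of counting pages downward from the top address.

```c
#include <stdint.h>

#define PAGE_SHIFT  12
#define PAGE_SIZE   (UINT64_C(1) << PAGE_SHIFT)
#define FIXADDR_TOP UINT64_C(0x100000000) /* hypothetical top address */
#define BUF_PAGES   4                     /* hypothetical buffer size in pages */

enum fixed_addresses {
	FIX_HOLE,
	FIX_BUF_END,                           /* highest page of the buffer */
	FIX_BUF = FIX_BUF_END + BUF_PAGES - 1, /* lowest page of the buffer */
	FIX_NEXT,                              /* first slot after the buffer */
};

/* Each index step moves the address down by exactly one page. */
static uint64_t fix_to_virt(unsigned int idx)
{
	return FIXADDR_TOP - ((uint64_t)idx << PAGE_SHIFT);
}
```

With these values the buffer occupies four consecutive slots, so the address of `FIX_BUF_END` lies three pages above the address of `FIX_BUF`.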
    • Merge tag 'for-linus-5.9-1' of git://github.com/cminyard/linux-ipmi · 11030fe9
      Linus Torvalds authored
      Pull IPMI updates from Corey Minyard:
       "Minor cleanups to the IPMI driver for 5.9
      
        Nothing of any major consequence. Duplicate code, some missing \n's in
        sysfs files, some documentation and comment changes"
      
      * tag 'for-linus-5.9-1' of git://github.com/cminyard/linux-ipmi:
        ipmi/watchdog: add missing newlines when printing parameters by sysfs
        ipmi: remve duplicate code in __ipmi_bmc_register()
        ipmi: ssif: Remove finished TODO comment about SMBus alert
        Doc: driver-api: ipmi: Add description of alerts_broken module param
      11030fe9
    • Merge tag 'for-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply · 449dc8c9
      Linus Torvalds authored
      Pull power supply and reset updates from Sebastian Reichel:
       "Power-supply core:
         - add COOL/WARM/HOT state from JEITA JISC8712:2015 specification
         - convert simple-battery DT binding to YAML
         - add long-life charging mode
      
       Battery/charger drivers:
         - bq25150: new charger driver
         - bq27xxx: add support for BQ27z561 and BQ28z610
         - max17040: support CAPACITY_ALERT_MIN
         - sbs-battery: add PEC support
         - wilco-ec: support long-life charging mode
         - bq25890: fix DT binding
         - misc. fixes and cleanups
      
       Reset drivers:
         - linkstation: new reset driver"
      
      * tag 'for-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: (32 commits)
        power: supply: wilco_ec: Add long life charging mode
        power: supply: bq27xxx_battery: Add the BQ28z610 Battery monitor
        dt-bindings: power: Add BQ28z610 compatible
        power: supply: bq27xxx_battery: Add the BQ27Z561 Battery monitor
        dt-bindings: power: Add BQ27Z561 compatible
        power: supply: test_power: Fix battery_current initial value
        power: supply: Fix kerneldoc of power_supply_temp2resist_simple()
        power: supply: cpcap-battery: Fix kerneldoc of cpcap_battery_read_accumulated()
        dt-bindings: power: Convert battery.txt to battery.yaml
        power: supply: rt5033_battery: Fix error code in rt5033_battery_probe()
        power: supply: max17040: Add POWER_SUPPLY_PROP_CAPACITY_ALERT_MIN
        power: supply: check if calc_soc succeeded in pm860x_init_battery
        power: supply: bq2xxxx: Replace HTTP links with HTTPS ones
        power: reset: add driver for LinkStation power off
        power: supply: sc27xx: prevent adc * 1000 from overflow
        math64: New DIV_S64_ROUND_CLOSEST helper
        power: fix duplicated words in bq2415x_charger.h
        power: Convert to DEFINE_SHOW_ATTRIBUTE
        power: reset: keystone-reset: Replace HTTP links with HTTPS ones
        power: supply: bq25150 introduce the bq25150
        ...
      449dc8c9
    • Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · b79675e1
      Linus Torvalds authored
      Pull misc vfs updates from Al Viro:
       "No common topic whatsoever in those, sorry"
      
      * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        fs: define inode flags using bit numbers
        iov_iter: Move unnecessary inclusion of crypto/hash.h
        dlmfs: clean up dlmfs_file_{read,write}() a bit
      b79675e1
    • Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · d57b2b5b
      Linus Torvalds authored
      Pull mount leak fix from Al Viro:
       "Regression fix for the syscalls-for-init series - fix a leak of a 'struct path'"
      
      * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        fs: fix a struct path leak in path_umount
      d57b2b5b
    • Merge tag 'pci-v5.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · 049eb096
      Linus Torvalds authored
      Pull PCI updates from Bjorn Helgaas:
       "Enumeration:
         - Fix pci_cfg_wait queue locking problem (Bjorn Helgaas)
         - Convert PCIe capability PCIBIOS errors to errno (Bolarinwa Olayemi
           Saheed)
         - Align PCIe capability and PCI accessor return values (Bolarinwa
           Olayemi Saheed)
         - Fix pci_create_slot() reference count leak (Qiushi Wu)
         - Announce device after early fixups (Tiezhu Yang)
      
        PCI device hotplug:
         - Make rpadlpar functions static (Wei Yongjun)
      
        Driver binding:
         - Add device even if driver attach failed (Rajat Jain)
      
        Virtualization:
         - xen: Remove redundant initialization of irq (Colin Ian King)
      
        IOMMU:
         - Add pci_pri_supported() to check device or associated PF (Ashok Raj)
         - Release IVRS table in AMD ACS quirk (Hanjun Guo)
         - Mark AMD Navi10 GPU rev 0x00 ATS as broken (Kai-Heng Feng)
         - Treat "external-facing" devices themselves as internal (Rajat Jain)
      
        MSI:
         - Forward MSI-X error code in pci_alloc_irq_vectors_affinity() (Piotr
           Stankiewicz)
      
        Error handling:
         - Clear PCIe Device Status errors only if OS owns AER (Jonathan
           Cameron)
         - Log correctable errors as warning, not error (Matt Jolly)
         - Use 'pci_channel_state_t' instead of 'enum pci_channel_state' (Luc
           Van Oostenryck)
      
        Peer-to-peer DMA:
         - Allow P2PDMA on AMD Zen and newer CPUs (Logan Gunthorpe)
      
        ASPM:
         - Add missing newline in sysfs 'policy' (Xiongfeng Wang)
      
        Native PCIe controllers:
         - Convert to devm_platform_ioremap_resource_byname() (Dejin Zheng)
         - Convert to devm_platform_ioremap_resource() (Dejin Zheng)
         - Remove duplicate error message from devm_pci_remap_cfg_resource()
           callers (Dejin Zheng)
         - Fix runtime PM imbalance on error (Dinghao Liu)
         - Remove dev_err() when handing an error from platform_get_irq()
           (Krzysztof Wilczyński)
         - Use pci_host_bridge.windows list directly instead of splicing in a
           temporary list for cadence, mvebu, host-common (Rob Herring)
         - Use pci_host_probe() instead of open-coding all the pieces for
           altera, brcmstb, iproc, mobiveil, rcar, rockchip, tegra, v3,
           versatile, xgene, xilinx, xilinx-nwl (Rob Herring)
         - Default host bridge parent device to the platform device (Rob
           Herring)
         - Use pci_is_root_bus() instead of tracking root bus number
           separately in aardvark, designware (imx6, keystone,
           designware-host), mobiveil, xilinx-nwl, xilinx, rockchip, rcar (Rob
           Herring)
         - Set host bridge bus number in pci_scan_root_bus_bridge() instead of
           each driver for aardvark, designware-host, host-common, mediatek,
           rcar, tegra, v3-semi (Rob Herring)
         - Move DT resource setup into devm_pci_alloc_host_bridge() (Rob
           Herring)
         - Set bridge map_irq and swizzle_irq to default functions; drivers
           that don't support legacy IRQs (iproc) need to undo this (Rob
           Herring)
      
        ARM Versatile PCIe controller driver:
         - Drop flag PCI_ENABLE_PROC_DOMAINS (Rob Herring)
      
        Cadence PCIe controller driver:
         - Use "dma-ranges" instead of "cdns,no-bar-match-nbits" property
           (Kishon Vijay Abraham I)
         - Remove "mem" from reg binding (Kishon Vijay Abraham I)
         - Fix cdns_pcie_{host|ep}_setup() error path (Kishon Vijay Abraham I)
         - Convert all r/w accessors to perform only 32-bit accesses (Kishon
           Vijay Abraham I)
         - Add support to start link and verify link status (Kishon Vijay
           Abraham I)
         - Allow pci_host_bridge to have custom pci_ops (Kishon Vijay Abraham I)
         - Add new *ops* for CPU addr fixup (Kishon Vijay Abraham I)
         - Fix updating Vendor ID and Subsystem Vendor ID register (Kishon
           Vijay Abraham I)
         - Use bridge resources for outbound window setup (Rob Herring)
         - Remove private bus number and range storage (Rob Herring)
      
        Cadence PCIe endpoint driver:
         - Add MSI-X support (Alan Douglas)
      
        HiSilicon PCIe controller driver:
         - Remove non-ECAM HiSilicon hip05/hip06 driver (Rob Herring)
      
        Intel VMD host bridge driver:
         - Use Shadow MEMBAR registers for QEMU/KVM guests (Jon Derrick)
      
        Loongson PCIe controller driver:
         - Use DECLARE_PCI_FIXUP_EARLY for bridge_class_quirk() (Tiezhu Yang)
      
        Marvell Aardvark PCIe controller driver:
         - Indicate error in 'val' when config read fails (Pali Rohár)
         - Don't touch PCIe registers if no card connected (Pali Rohár)
      
        Marvell MVEBU PCIe controller driver:
         - Setup BAR0 in order to fix MSI (Shmuel Hazan)
      
        Microsoft Hyper-V host bridge driver:
         - Fix a timing issue which causes kdump to fail occasionally (Wei Hu)
         - Make some functions static (Wei Yongjun)
      
        NVIDIA Tegra PCIe controller driver:
         - Revert tegra124 raw_violation_fixup (Nicolas Chauvet)
         - Remove PLL power supplies (Thierry Reding)
      
        Qualcomm PCIe controller driver:
         - Change duplicate PCI reset to phy reset (Abhishek Sahu)
         - Add missing ipq806x clocks in PCIe driver (Ansuel Smith)
         - Add missing reset for ipq806x (Ansuel Smith)
         - Add ext reset (Ansuel Smith)
         - Use bulk clk API and assert on error (Ansuel Smith)
         - Add support for tx term offset for rev 2.1.0 (Ansuel Smith)
         - Define some PARF params needed for ipq8064 SoC (Ansuel Smith)
         - Add ipq8064 rev2 variant (Ansuel Smith)
         - Support PCI speed set for ipq806x (Sham Muthayyan)
      
        Renesas R-Car PCIe controller driver:
         - Use devm_pci_alloc_host_bridge() (Rob Herring)
         - Use struct pci_host_bridge.windows list directly (Rob Herring)
         - Convert rcar-gen2 to use modern host bridge probe functions (Rob
           Herring)
      
        TI J721E PCIe driver:
         - Add TI J721E PCIe host and endpoint driver (Kishon Vijay Abraham I)
      
        Xilinx Versal CPM PCIe controller driver:
         - Add Versal CPM Root Port driver and YAML schema (Bharat Kumar
           Gogada)
      
        MicroSemi Switchtec management driver:
         - Add missing __iomem and __user tags to fix sparse warnings (Logan
           Gunthorpe)
      
        Miscellaneous:
         - Replace http:// links with https:// (Alexander A. Klimov)
         - Replace lkml.org, spinics, gmane with lore.kernel.org (Bjorn
           Helgaas)
         - Remove unused pci_lost_interrupt() (Heiner Kallweit)
         - Move PCI_VENDOR_ID_REDHAT definition to pci_ids.h (Huacai Chen)
         - Fix kerneldoc warnings (Krzysztof Kozlowski)"
      
      * tag 'pci-v5.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (113 commits)
        PCI: Fix kerneldoc warnings
        PCI: xilinx-cpm: Add Versal CPM Root Port driver
        PCI: xilinx-cpm: Add YAML schemas for Versal CPM Root Port
        PCI: Set bridge map_irq and swizzle_irq to default functions
        PCI: Move DT resource setup into devm_pci_alloc_host_bridge()
        PCI: rcar-gen2: Convert to use modern host bridge probe functions
        PCI: Remove dev_err() when handing an error from platform_get_irq()
        MAINTAINERS: Add Kishon Vijay Abraham I for TI J721E SoC PCIe
        misc: pci_endpoint_test: Add J721E in pci_device_id table
        PCI: j721e: Add TI J721E PCIe driver
        PCI: switchtec: Add missing __iomem tag to fix sparse warnings
        PCI: switchtec: Add missing __iomem and __user tags to fix sparse warnings
        PCI: rpadlpar: Make functions static
        PCI/P2PDMA: Allow P2PDMA on AMD Zen and newer CPUs
        PCI: Release IVRS table in AMD ACS quirk
        PCI: Announce device after early fixups
        PCI: Mark AMD Navi10 GPU rev 0x00 ATS as broken
        PCI: Remove unused pci_lost_interrupt()
        dt-bindings: PCI: Add EP mode dt-bindings for TI's J721E SoC
        dt-bindings: PCI: Add host mode dt-bindings for TI's J721E SoC
        ...
      049eb096
    • Merge tag 'trace-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 32663c78
      Linus Torvalds authored
      Pull tracing updates from Steven Rostedt:
      
       - The biggest news is that the tracing ring buffer can now time events
         that interrupted other ring buffer events.
      
         Before this change, if an interrupt came in while recording another
         event, and that interrupt also had an event, those events would all
         have the same time stamp as the event it interrupted.
      
         Now, with the new design, those events will have a unique time stamp
         and rightfully display the time for those events that were recorded
         while interrupting another event.
      
       - Bootconfig now has an "override" operator that lets the users have a
         default config, but then add options to override the default.
      
       - A fix was made to properly filter function graph tracing to the
         ftrace PIDs. This came in at the end of the -rc cycle, and needs to
         be backported.
      
       - Several clean ups, performance updates, and minor fixes as well.
      
      * tag 'trace-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (39 commits)
        tracing: Add trace_array_init_printk() to initialize instance trace_printk() buffers
        kprobes: Fix compiler warning for !CONFIG_KPROBES_ON_FTRACE
        tracing: Use trace_sched_process_free() instead of exit() for pid tracing
        bootconfig: Fix to find the initargs correctly
        Documentation: bootconfig: Add bootconfig override operator
        tools/bootconfig: Add testcases for value override operator
        lib/bootconfig: Add override operator support
        kprobes: Remove show_registers() function prototype
        tracing/uprobe: Remove dead code in trace_uprobe_register()
        kprobes: Fix NULL pointer dereference at kprobe_ftrace_handler
        ftrace: Fix ftrace_trace_task return value
        tracepoint: Use __used attribute definitions from compiler_attributes.h
        tracepoint: Mark __tracepoint_string's __used
        trace : Have tracing buffer info use kvzalloc instead of kzalloc
        tracing: Remove outdated comment in stack handling
        ftrace: Do not let direct or IPMODIFY ftrace_ops be added to module and set trampolines
        ftrace: Setup correct FTRACE_FL_REGS flags for module
        tracing/hwlat: Honor the tracing_cpumask
        tracing/hwlat: Drop the duplicate assignment in start_kthread()
        tracing: Save one trace_event->type by using __TRACE_LAST_TYPE
        ...
      32663c78
    • powerpc/ptrace: Fix build error in pkey_get() · 7b9de977
      Michael Ellerman authored
      The merge resolution in commit 25d8d4ee left ret no longer used,
      leading to:
      
        arch/powerpc/kernel/ptrace/ptrace-view.c: In function ‘pkey_get’:
        arch/powerpc/kernel/ptrace/ptrace-view.c:473:6: error: unused variable ‘ret’
          473 |  int ret;
      
      Fix it by removing ret.
      
      Fixes: 25d8d4ee ("Merge tag 'powerpc-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux")
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7b9de977
  3. 07 Aug, 2020 27 commits
    • fs: fix a struct path leak in path_umount · 25ccd24f
      Christoph Hellwig authored
      Make sure we also put the dentry and vfsmnt in the illegal flags
      and !may_umount cases.
      
      Fixes: 41525f56 ("fs: refactor ksys_umount")
      Reported-by: Vikas Kumar <vikas.kumar2@arm.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      25ccd24f
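The leak fixed by the commit above is the classic pattern of an early error return that skips releasing references. The sketch below models the fixed shape with a hypothetical `struct ref` counter standing in for the dentry and vfsmount references; the function and parameter names are illustrative and are not the kernel's.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted object. */
struct ref {
	int count;
};

static void ref_put(struct ref *r)
{
	r->count--;
}

/*
 * Model of an unmount helper that owns one reference on each object.
 * Every path, including the illegal-flags and !may_umount cases, falls
 * through to the same release code instead of returning early.
 */
static int umount_checked(struct ref *dentry, struct ref *mnt,
			  bool flags_ok, bool may_umount)
{
	int ret = 0;

	if (!flags_ok)
		ret = -EINVAL;
	else if (!may_umount)
		ret = -EPERM;
	/* The actual unmount work would run here when ret == 0. */

	ref_put(dentry);
	ref_put(mnt);
	return ret;
}
```

Funnelling all exits through one release point makes it hard for a later refactoring to reintroduce an unbalanced reference.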
    • tracing: Add trace_array_init_printk() to initialize instance trace_printk() buffers · 38ce2a9e
      Steven Rostedt (VMware) authored
      As trace_array_printk() used with non-global instances will not add noise to
      the main buffer, such calls are OK to have in the kernel (unlike trace_printk()).
      This requires the subsystem to create its own tracing instance, and
      trace_array_printk() only writes into those instances.
      
      Add trace_array_init_printk() to initialize the trace_printk() buffers
      without printing out the WARNING message.
      Reported-by: Sean Paul <sean@poorly.run>
      Reviewed-by: Sean Paul <sean@poorly.run>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      38ce2a9e
    • Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 30185b69
      Linus Torvalds authored
      Pull clk updates from Stephen Boyd:
       "It looks like a smaller batch of clk updates this time around.
      
        In the core framework we just have some minor tweaks and a debugfs
        feature, so not much to see there. The driver updates are fairly well
        split between AT91 and Qualcomm clk support. Adding those two drivers
        together equals about 50% of the diffstat.
      
        Otherwise, the big amount of work this time was on supporting
        Broadcom's Raspberry Pi firmware clks.
      
        Highlights:
      
        Core:
         - Document clk_hw_round_rate() so it gets some more use
         - Remove unused __clk_get_flags()
         - Add a prepare/enable debugfs feature similar to rate setting
      
        New Drivers:
         - Add support for SAMA7G5 SoC clks
         - Enable CPU clks on Qualcomm IPQ6018 SoCs
         - Enable CPU clks on Qualcomm MSM8996 SoCs
         - GPU clk support for Qualcomm SM8150 and SM8250 SoCs
         - Audio clks on Qualcomm SC7180 SoCs
         - Microchip Sparx5 DPLL clk
         - Add support for the new Renesas RZ/G2H (R8A774E1) SoC
      
        Updates:
         - Make defines for bcm63xx-gate clks to use in DT
         - Support BCM2711 SoC firmware clks
         - Add HDMI clks for BCM2711 SoCs
         - Add RTC related clks on Ingenic SoCs
         - Support USB PHY clks on Ingenic SoCs
         - Support gate clks on BCM6318 SoCs
         - RMU and DMAC/GPIO clock support for Actions Semi S500 SoCs
         - Use poll_timeout functions in Rockchip clk driver
         - Support Rockchip rk3288w SoC variant
         - Mark mac_lbtest critical on Rockchip rk3188
         - Add CAAM clock support for i.MX vf610 driver
         - Add MU root clock support for i.MX imx8mp driver
         - Amlogic g12: add neural network accelerator clock sources
         - Amlogic meson8: remove critical flag for main PLL divider
         - Amlogic meson8: add video decoder clock gates
         - Convert one more Renesas DT binding to json-schema
         - Enhance critical clock handling on Renesas platforms to only
           consider clocks that were enabled at boot time"
      
      * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (79 commits)
        clk: qcom: gcc: Make disp gpll0 branch aon for sc7180/sdm845
        ipq806x: gcc: add support for child probe
        clk: qcom: msm8996: Make symbol 'cpu_msm8996_clks' static
        clk: qcom: ipq8074: Add correct index for PCIe clocks
        clk: <linux/clk-provider.h>: drop a duplicated word
        clk: renesas: cpg-mssr: Add r8a774e1 support
        dt-bindings: clock: renesas,cpg-mssr: Document r8a774e1
        clk: Drop duplicate selection in Kconfig
        clk: qcom: smd: Add support for MSM8992/4 rpm clocks
        clk: qcom: ipq8074: Add missing clocks for pcie
        dt-bindings: clock: qcom: ipq8074: Add missing bindings for PCIe
        Replace HTTP links with HTTPS ones: Common CLK framework
        clk: qcom: Add CPU clock driver for msm8996
        dt-bindings: clk: qcom: Add bindings for CPU clock for msm8996
        soc: qcom: Separate kryo l2 accessors from PMU driver
        clk: meson: meson8b: add the vclk2_en gate clock
        clk: meson: meson8b: add the vclk_en gate clock
        clk: qcom: Fix return value check in apss_ipq6018_probe()
        clk: bcm: dvp: Add missing module informations
        clk: meson: meson8b: Drop CLK_IS_CRITICAL from fclk_div2
        ...
      30185b69
    • Merge branch 'work.fdpic' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 0f43283b
      Linus Torvalds authored
      Pull fdpic coredump update from Al Viro:
       "Switches fdpic coredumps away from original aout dumping primitives to
        the same kind of regset use as regular elf coredumps do"
      
      * 'work.fdpic' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        [elf-fdpic] switch coredump to regsets
        [elf-fdpic] use elf_dump_thread_status() for the dumper thread as well
        [elf-fdpic] move allocation of elf_thread_status into elf_dump_thread_status()
        [elf-fdpic] coredump: don't bother with cyclic list for per-thread objects
        kill elf_fpxregs_t
        take fdpic-related parts of elf_prstatus out
        unexport linux/elfcore.h
      0f43283b
    • Merge tag 'kallsyms_show_value-fix-v5.9-rc1' of... · 6ba0d2e4
      Linus Torvalds authored
      Merge tag 'kallsyms_show_value-fix-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
      
      Pull sysfs module section fix from Kees Cook:
       "Fix sysfs module section output overflow.
      
        About a month after my kallsyms_show_value() refactoring landed, 0day
        noticed that there was a path through the kernfs binattr read handlers
        that did not have PAGE_SIZEd buffers, and the module "sections" read
        handler made a bad assumption about this, resulting in it stomping on
        memory when reached through small-sized splice() calls.
      
        I've added a set of tests to find these kinds of regressions more
        quickly in the future as well"
      Selftests-acked-by: Shuah Khan <skhan@linuxfoundation.org>
      
      * tag 'kallsyms_show_value-fix-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        selftests: splice: Check behavior of full and short splices
        module: Correctly truncate sysfs sections output
      6ba0d2e4
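The overflow described in the merge above comes from a read handler assuming its output buffer is always page-sized. A minimal user-space sketch of the safer pattern follows: format into a fixed local buffer, then copy no more than the caller asked for. The function name, the size macro and the output format are assumptions for illustration, not the kernel's handler.

```c
#include <stdio.h>
#include <string.h>

/* "0x" + 16 hex digits + '\n' for a 64-bit address. */
#define SECT_READ_SIZE 19

/*
 * Hypothetical read handler: writes at most count bytes into buf and
 * returns the number of bytes written. The caller's buffer may be much
 * smaller than a page, e.g. when reached through a short splice().
 */
static size_t read_section_addr(char *buf, size_t count,
				unsigned long long addr)
{
	char bounce[SECT_READ_SIZE + 1]; /* +1 for the terminating NUL */
	int len = snprintf(bounce, sizeof(bounce), "0x%016llx\n", addr);
	size_t n = (size_t)len;

	if (n > count)
		n = count; /* truncate instead of overrunning buf */
	memcpy(buf, bounce, n);
	return n;
}
```

The bounce buffer keeps the formatting step independent of the caller's buffer size, so a short read simply yields a truncated result.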
    • Merge tag 'seccomp-v5.9-rc1-fix1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 1fa2c0a0
      Linus Torvalds authored
      Pull seccomp fix from Kees Cook:
       "This fixes my typo in the SCM_RIGHTS refactoring that broke compat
        handling.
      
        Thanks to Thadeu Lima de Souza Cascardo for tracking it down, and to
        Christian Zigotzky and Alex Xu for their reports"
      
      * tag 'seccomp-v5.9-rc1-fix1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        net/scm: Fix typo in SCM_RIGHTS compat refactoring
      1fa2c0a0
    • Merge tag 'pm-5.9-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · f6235eb1
      Linus Torvalds authored
      Pull more power management updates from Rafael Wysocki:
       "These are mostly ARM cpufreq driver updates plus a cpufreq core
        cleanup, an ARM-wide change to make schedutil the default scaling
        governor, an intel_pstate driver fix and some runtime PM changes
        regarding kerneldoc comments.
      
        Specifics:
      
         - Add adaptive voltage scaling (AVS) support to the brcmstb cpufreq
           driver and clean it up (Florian Fainelli, Markus Mayer).
      
         - Add a new Tegra cpufreq driver and clean up the existing one (Jon
           Hunter, Sumit Gupta).
      
         - Add bandwidth level support to the Qcom cpufreq driver along with
           OPP changes (Sibi Sankar).
      
         - Clean up the sti, cpufreq-dt, ap806, CPPC cpufreq drivers (Viresh
           Kumar, Lee Jones, Ivan Kokshaysky, Sven Auhagen, Xin Hao).
      
         - Make schedutil the default governor for ARM (Valentin Schneider).
      
         - Fix dependency issues for the imx cpufreq driver (Walter Lozano).
      
         - Clean up cached_resolved_idx handling in the cpufreq core (Viresh
           Kumar).
      
         - Fix the intel_pstate driver to use the correct maximum frequency
           value when MSR_TURBO_RATIO_LIMIT is 0 (Srinivas Pandruvada).
      
         - Provide kerneldoc comments for multiple runtime PM helpers and
           improve the pm_runtime_get_if_active() kerneldoc (Rafael Wysocki)"
      
      * tag 'pm-5.9-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (22 commits)
        cpufreq: intel_pstate: Fix cpuinfo_max_freq when MSR_TURBO_RATIO_LIMIT is 0
        PM: runtime: Improve kerneldoc of pm_runtime_get_if_active()
        PM: runtime: Add kerneldoc comments to multiple helpers
        cpufreq: make schedutil the default for arm and arm64
        cpufreq: cached_resolved_idx can not be negative
        cpufreq: Add Tegra194 cpufreq driver
        dt-bindings: arm: Add NVIDIA Tegra194 CPU Complex binding
        cpufreq: imx: Select NVMEM_IMX_OCOTP
        cpufreq: sti-cpufreq: Fix some formatting and misspelling issues
        cpufreq: tegra186: Simplify probe return path
        cpufreq: CPPC: Reuse caps variable in few routines
        cpufreq: ap806: fix cpufreq driver needs ap cpu clk
        cpufreq: cppc: Reorder code and remove apply_hisi_workaround variable
        cpufreq: dt: fix oops on armada37xx
        cpufreq: brcmstb-avs-cpufreq: send S2_ENTER / S2_EXIT commands to AVS
        cpufreq: brcmstb-avs-cpufreq: Support polling AVS firmware
        cpufreq: brcmstb-avs-cpufreq: more flexible interface for __issue_avs_command()
        cpufreq: qcom: Disable fast switch when scaling DDR/L3
        cpufreq: qcom: Update the bandwidth levels on frequency change
        OPP: Add and export helper to set bandwidth
        ...
      f6235eb1
    • Linus Torvalds's avatar
      Merge tag 'for-5.9/dm-changes' of... · 2f12d440
      Linus Torvalds authored
      Merge tag 'for-5.9/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
      
      Pull device mapper updates from Mike Snitzer:
      
       - DM multipath locking fixes around m->flags tests and improvements to
         bio-based code so that it follows patterns established by
         request-based code.
      
       - Request-based DM core improvement to eliminate unnecessary call to
         blk_mq_queue_stopped().
      
       - Add "panic_on_corruption" error handling mode to DM verity target.
      
       - DM bufio fix to perform buffer cleanup from a workqueue rather
         than wait for IO in reclaim context from shrinker.
      
       - DM crypt improvement to optionally avoid async processing via
         workqueues for reads and/or writes -- via "no_read_workqueue" and
         "no_write_workqueue" features. This more direct IO processing
         improves latency and throughput with faster storage. Avoiding
         workqueue IO submission for writes (DM_CRYPT_NO_WRITE_WORKQUEUE) is a
         requirement for adding zoned block device support to DM crypt.
      
       - Add zoned block device support to DM crypt. Makes use of
         DM_CRYPT_NO_WRITE_WORKQUEUE and a new optional feature
         (DM_CRYPT_WRITE_INLINE) that allows write completion to wait for
         encryption to complete. This allows write ordering to be preserved,
         which is needed for zoned block devices.
      
       - Fix DM ebs target's check for REQ_OP_FLUSH.
      
       - Fix DM core's report zones support to not report more zones than were
         requested.
      
       - A few small compiler warning fixes.
      
       - DM dust improvements to return output directly to the user rather
         than require they scrape the system log for output.
      
      * tag 'for-5.9/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        dm: don't call report zones for more than the user requested
        dm ebs: Fix incorrect checking for REQ_OP_FLUSH
        dm init: Set file local variable static
        dm ioctl: Fix compilation warning
        dm raid: Remove empty if statement
        dm verity: Fix compilation warning
        dm crypt: Enable zoned block device support
        dm crypt: add flags to optionally bypass kcryptd workqueues
        dm bufio: do buffer cleanup from a workqueue
        dm rq: don't call blk_mq_queue_stopped() in dm_stop_queue()
        dm dust: add interface to list all badblocks
        dm dust: report some message results directly back to user
        dm verity: add "panic_on_corruption" error handling mode
        dm mpath: use double checked locking in fast path
        dm mpath: rename current_pgpath to pgpath in multipath_prepare_ioctl
        dm mpath: rework __map_bio()
        dm mpath: factor out multipath_queue_bio
        dm mpath: push locking down to must_push_back_rq()
        dm mpath: take m->lock spinlock when testing QUEUE_IF_NO_PATH
        dm mpath: changes from initial m->flags locking audit
      2f12d440
    • Linus Torvalds's avatar
      Merge tag 'media/v5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · fa73e212
      Linus Torvalds authored
      Pull media updates from Mauro Carvalho Chehab:
      
       - Legacy soc_camera driver was removed from staging
      
       - New I2C sensor related drivers: dw9768, ch7322, max9271, rdacm20
      
       - TI vpe driver code was re-organized and had new features added
      
       - Added Xilinx MIPI CSI-2 Rx Subsystem driver
      
       - Added support for Infrared Toy and IR Droid devices
      
       - Lots of random driver fixes, new features and cleanups
      
      * tag 'media/v5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (318 commits)
        media: camss: fix memory leaks on error handling paths in probe
        media: davinci: vpif_capture: fix potential double free
        media: radio: remove redundant assignment to variable retval
        media: allegro: fix potential null dereference on header
        media: mtk-mdp: Fix a refcounting bug on error in init
        media: allegro: fix an error pointer vs NULL check
        media: meye: fix missing pm_mchip_mode field
        media: cafe-driver: use generic power management
        media: saa7164: use generic power management
        media: v4l2-dev/ioctl: Fix document for VIDIOC_QUERYCAP
        media: v4l2: Correct kernel-doc inconsistency
        media: v4l2: Correct kernel-doc inconsistency
        media: dvbdev.h: keep * together with the type
        media: v4l2-subdev.h: keep * together with the type
        media: videobuf2: Print videobuf2 buffer state by name
        media: colorspaces-details.rst: fix V4L2_COLORSPACE_JPEG description
        media: tw68: use generic power management
        media: meye: use generic power management
        media: cx88: use generic power management
        media: cx25821: use generic power management
        ...
      fa73e212
    • Linus Torvalds's avatar
      Merge tag 'mailbox-v5.9' of git://git.linaro.org/landing-teams/working/fujitsu/integration · 75dee3b6
      Linus Torvalds authored
      Pull mailbox updates from Jassi Brar:
       "mediatek:
         - add support for mt6779 gce
         - shutdown cleanup and address shift support
      
        qcom:
         - add msm8994 apcs and sdm660 hmss compatibility
      
        imx:
         - mark PM funcs __maybe_unused
      
        pcc:
         - put acpi table before bailout
      
        misc:
         - replace http with https links"
      
      * tag 'mailbox-v5.9' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
        mailbox: mediatek: cmdq: clear task in channel before shutdown
        mailbox: cmdq: support mt6779 gce platform definition
        mailbox: cmdq: variablize address shift in platform
        dt-binding: gce: add gce header file for mt6779
        mailbox: qcom: Add msm8994 apcs compatible
        mailbox: qcom: Add sdm660 hmss compatible
        mailbox: imx: Mark PM functions as __maybe_unused
        mailbox: pcc: Put the PCCT table for error path
        mailbox: Replace HTTP links with HTTPS ones
      75dee3b6
    • Kees Cook's avatar
      net/scm: Fix typo in SCM_RIGHTS compat refactoring · 16b89f69
      Kees Cook authored
      When refactoring the SCM_RIGHTS code, I accidentally mis-merged my
      native/compat diffs, which entirely broke using SCM_RIGHTS in compat
      mode. Use the correct helper.
      Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de>
      Link: https://lists.ozlabs.org/pipermail/linuxppc-dev/2020-August/216156.html
      Reported-by: "Alex Xu (Hello71)" <alex_y_xu@yahoo.ca>
      Link: https://lore.kernel.org/lkml/1596812929.lz7fuo8r2w.none@localhost/
      Suggested-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Fixes: c0029de5 ("net/scm: Regularize compat handling of scm_detach_fds()")
      Tested-by: Alex Xu (Hello71) <alex_y_xu@yahoo.ca>
      Acked-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
      16b89f69
    • Linus Torvalds's avatar
      Merge tag 'dmaengine-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine · ce615f5c
      Linus Torvalds authored
      Pull dmaengine updates from Vinod Koul:
       "Core:
         - Support out of order dma completion
         - Support for repeating transaction
      
        New controllers:
         - Support for Actions S700 DMA engine
         - Renesas R8A774E1, r8a7742 controller binding
         - New driver for Xilinx DPDMA controller
      
        Other:
         - Support of out of order dma completion in idxd driver
         - W=1 warning cleanup of subsystem
         - Updates to ti-k3-dma, dw, idxd drivers"
      
      * tag 'dmaengine-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (68 commits)
        dmaengine: dw: Don't include unneeded header to platform data header
        dmaengine: Actions: Add support for S700 DMA engine
        dmaengine: Actions: get rid of bit fields from dma descriptor
        dt-bindings: dmaengine: convert Actions Semi Owl SoCs bindings to yaml
        dmaengine: idxd: add missing invalid flags field to completion
        dmaengine: dw: Initialize max_sg_burst capability
        dmaengine: dw: Introduce max burst length hw config
        dmaengine: dw: Initialize min and max burst DMA device capability
        dmaengine: dw: Set DMA device max segment size parameter
        dmaengine: dw: Take HC_LLP flag into account for noLLP auto-config
        dmaengine: Introduce DMA-device device_caps callback
        dmaengine: Introduce max SG burst capability
        dmaengine: Introduce min burst length capability
        dt-bindings: dma: dw: Add max burst transaction length property
        dt-bindings: dma: dw: Convert DW DMAC to DT binding
        dmaengine: ti: k3-udma: Query throughput level information from hardware
        dmaengine: ti: k3-udma: Use defines for capabilities register parsing
        dmaengine: xilinx: dpdma: Fix kerneldoc warning
        dmaengine: xilinx: dpdma: add missing kernel doc
        dmaengine: xilinx: dpdma: remove comparison of unsigned expression
        ...
      ce615f5c
    • Linus Torvalds's avatar
      Merge branch 'akpm' (patches from Andrew) · 81e11336
      Linus Torvalds authored
      Merge misc updates from Andrew Morton:
      
       - a few MM hotfixes
      
       - kthread, tools, scripts, ntfs and ocfs2
      
       - some of MM
      
      Subsystems affected by this patch series: kthread, tools, scripts, ntfs,
      ocfs2 and mm (hotfixes, pagealloc, slab-generic, slab, slub, kcsan,
      debug, pagecache, gup, swap, shmem, memcg, pagemap, mremap, mincore,
      sparsemem, vmalloc, kasan, pagealloc, hugetlb and vmscan).
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (162 commits)
        mm: vmscan: consistent update to pgrefill
        mm/vmscan.c: fix typo
        khugepaged: khugepaged_test_exit() check mmget_still_valid()
        khugepaged: retract_page_tables() remember to test exit
        khugepaged: collapse_pte_mapped_thp() protect the pmd lock
        khugepaged: collapse_pte_mapped_thp() flush the right range
        mm/hugetlb: fix calculation of adjust_range_if_pmd_sharing_possible
        mm: thp: replace HTTP links with HTTPS ones
        mm/page_alloc: fix memalloc_nocma_{save/restore} APIs
        mm/page_alloc.c: skip setting nodemask when we are in interrupt
        mm/page_alloc: fallbacks at most has 3 elements
        mm/page_alloc: silence a KASAN false positive
        mm/page_alloc.c: remove unnecessary end_bitidx for [set|get]_pfnblock_flags_mask()
        mm/page_alloc.c: simplify pageblock bitmap access
        mm/page_alloc.c: extract the common part in pfn_to_bitidx()
        mm/page_alloc.c: replace the definition of NR_MIGRATETYPE_BITS with PB_migratetype_bits
        mm/shuffle: remove dynamic reconfiguration
        mm/memory_hotplug: document why shuffle_zone() is relevant
        mm/page_alloc: remove nr_free_pagecache_pages()
        mm: remove vm_total_pages
        ...
      81e11336
    • Shakeel Butt's avatar
      mm: vmscan: consistent update to pgrefill · 912c0572
      Shakeel Butt authored
      The vmstat pgrefill is useful together with pgscan and pgsteal stats to
      measure the reclaim efficiency.  However vmstat's pgrefill is not updated
      consistently at system level.  It gets updated for both global and memcg
      reclaim however pgscan and pgsteal are updated for only global reclaim.
      So, update pgrefill only for global reclaim.  If someone is interested in
      the stats representing both system level as well as memcg level reclaim,
      then consult the root memcg's memory.stat instead of /proc/vmstat.
      Signed-off-by: Shakeel Butt <shakeelb@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Yafang Shao <laoar.shao@gmail.com>
      Acked-by: Roman Gushchin <guro@fb.com>
      Acked-by: Chris Down <chris@chrisdown.name>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Link: http://lkml.kernel.org/r/20200711011459.1159929-1-shakeelb@google.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      912c0572
    • Hugh Dickins's avatar
      khugepaged: khugepaged_test_exit() check mmget_still_valid() · bbe98f9c
      Hugh Dickins authored
      Move collapse_huge_page()'s mmget_still_valid() check into
      khugepaged_test_exit() itself.  collapse_huge_page() is used for anon THP
      only, and earned its mmget_still_valid() check because it inserts a huge
      pmd entry in place of the page table's pmd entry; whereas
      collapse_file()'s retract_page_tables() or collapse_pte_mapped_thp()
      merely clears the page table's pmd entry.  But core dumping without mmap
      lock must have been as open to mistaking a racily cleared pmd entry for a
      page table at physical page 0, as exit_mmap() was.  And we certainly have
      no interest in mapping as a THP once dumping core.
      
      Fixes: 59ea6d06 ("coredump: fix race condition between collapse_huge_page() and core dumping")
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: <stable@vger.kernel.org>	[4.8+]
      Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008021217020.27773@eggly.anvils
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      bbe98f9c
    • Hugh Dickins's avatar
      khugepaged: retract_page_tables() remember to test exit · 18e77600
      Hugh Dickins authored
      Only once have I seen this scenario (and forgot even to notice what forced
      the eventual crash): a sequence of "BUG: Bad page map" alerts from
      vm_normal_page(), from zap_pte_range() servicing exit_mmap();
      pmd:00000000, pte values corresponding to data in physical page 0.
      
      The pte mappings being zapped in this case were supposed to be from a huge
      page of ext4 text (but could as well have been shmem): my belief is that
      it was racing with collapse_file()'s retract_page_tables(), found *pmd
      pointing to a page table, locked it, but *pmd had become 0 by the time
      start_pte was decided.
      
      In most cases, that possibility is excluded by holding mmap lock; but
      exit_mmap() proceeds without mmap lock.  Most of what's run by khugepaged
      checks khugepaged_test_exit() after acquiring mmap lock:
      khugepaged_collapse_pte_mapped_thps() and hugepage_vma_revalidate() do so,
      for example.  But retract_page_tables() did not: fix that.
      
      The fix is for retract_page_tables() to check khugepaged_test_exit(),
      after acquiring mmap lock, before doing anything to the page table.
      Getting the mmap lock serializes with __mmput(), which briefly takes and
      drops it in __khugepaged_exit(); then the khugepaged_test_exit() check on
      mm_users makes sure we don't touch the page table once exit_mmap() might
      reach it, since exit_mmap() will be proceeding without mmap lock, not
      expecting anyone to be racing with it.
      
      Fixes: f3f0e1d2 ("khugepaged: add support of collapse for tmpfs/shmem pages")
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: <stable@vger.kernel.org>	[4.8+]
      Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008021215400.27773@eggly.anvils
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      18e77600
    • Hugh Dickins's avatar
      khugepaged: collapse_pte_mapped_thp() protect the pmd lock · 119a5fc1
      Hugh Dickins authored
      When retract_page_tables() removes a page table to make way for a huge
      pmd, it holds huge page lock, i_mmap_lock_write, mmap_write_trylock and
      pmd lock; but when collapse_pte_mapped_thp() does the same (to handle the
      case when the original mmap_write_trylock had failed), only
      mmap_write_trylock and pmd lock are held.
      
      That's not enough.  One machine has twice crashed under load, with "BUG:
      spinlock bad magic" and GPF on 6b6b6b6b6b6b6b6b.  Examining the second
      crash, page_vma_mapped_walk_done()'s spin_unlock of pvmw->ptl (serving
      page_referenced() on a file THP, that had found a page table at *pmd)
      discovers that the page table page and its lock have already been freed by
      the time it comes to unlock.
      
      Follow the example of retract_page_tables(), but we only need one of huge
      page lock or i_mmap_lock_write to secure against this: because it's the
      narrower lock, and because it simplifies collapse_pte_mapped_thp() to know
      the hpage earlier, choose to rely on huge page lock here.
      
      Fixes: 27e1f827 ("khugepaged: enable collapse pmd for pte-mapped THP")
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: <stable@vger.kernel.org>	[5.4+]
      Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008021213070.27773@eggly.anvils
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      119a5fc1
    • Hugh Dickins's avatar
      khugepaged: collapse_pte_mapped_thp() flush the right range · 723a80da
      Hugh Dickins authored
      pmdp_collapse_flush() should be given the start address at which the huge
      page is mapped, haddr: it was given addr, which at that point has been
      used as a local variable, incremented to the end address of the extent.
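
To make the mix-up concrete, here is a minimal userspace sketch of a cursor
variable being advanced across a huge-page extent. The page sizes (4 KiB
pages, 2 MiB huge pages) and the function name cursor_after_walk are
assumptions chosen for illustration; this is not the kernel code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for illustration: 4 KiB pages and 2 MiB huge pages. */
#define PAGE_SIZE      UINT64_C(4096)
#define HPAGE_PMD_SIZE (UINT64_C(2) << 20)
#define HPAGE_PMD_NR   (HPAGE_PMD_SIZE / PAGE_SIZE)

/*
 * Walk every small page of the huge-page extent starting at haddr,
 * reusing addr as the cursor, and return the cursor value left behind.
 */
static uint64_t cursor_after_walk(uint64_t haddr)
{
	uint64_t addr = haddr;
	uint64_t i;

	/* haddr must be aligned to the huge-page size. */
	assert((haddr & (HPAGE_PMD_SIZE - 1)) == 0);

	for (i = 0; i < HPAGE_PMD_NR; i++)
		addr += PAGE_SIZE; /* per-page work would happen here */

	return addr; /* the end of the extent, not its start */
}
```

After the walk the cursor equals haddr + HPAGE_PMD_SIZE, so any later call
that needs the start of the mapping has to use haddr rather than addr.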
      
      Found by source inspection while chasing a hugepage locking bug, which I
      then could not explain by this.  At first I thought this was very bad;
      then saw that all of the page translations that were not flushed would
      actually still point to the right pages afterwards, so harmless; then
      realized that I know nothing of how different architectures and models
      cache intermediate paging structures, so maybe it matters after all -
      particularly since the page table concerned is immediately freed.
      
      Much easier to fix than to think about.
      
      Fixes: 27e1f827 ("khugepaged: enable collapse pmd for pte-mapped THP")
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: <stable@vger.kernel.org>	[5.4+]
      Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008021204390.27773@eggly.anvils
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      723a80da
    • Peter Xu's avatar
      mm/hugetlb: fix calculation of adjust_range_if_pmd_sharing_possible · 75802ca6
      Peter Xu authored
      This is found by code observation only.
      
      Firstly, the worst case scenario should assume the whole range was covered
      by pmd sharing.  The old algorithm might not work as expected for ranges
      like (1g-2m, 1g+2m), where the adjusted range comes out as (0, 1g+2m) but
      the expected range is (0, 2g).
      
      While at it, remove the loop, since it should not be required.  With that,
      the new code should be faster too when the invalidating range is huge.
      
      Mike said:
      
      : With range (1g-2m, 1g+2m) within a vma (0, 2g) the existing code will only
      : adjust to (0, 1g+2m) which is incorrect.
      :
      : We should cc stable.  The original reason for adjusting the range was to
      : prevent data corruption (getting wrong page).  Since the range is not
      : always adjusted correctly, the potential for corruption still exists.
      :
      : However, I am fairly confident that adjust_range_if_pmd_sharing_possible
      : is only going to be called in two cases:
      :
      : 1) for a single page
      : 2) for range == entire vma
      :
      : In those cases, the current code should produce the correct results.
      :
      : To be safe, let's just cc stable.
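
The range arithmetic discussed above can be sketched in userspace C as
follows. This is only an illustration of rounding a range out to
PUD_SIZE boundaries and clamping it to the vma. It assumes PUD_SIZE is
1 GiB, and struct range and adjust_range() are hypothetical names rather
than the kernel's own.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_2M    (UINT64_C(2) << 20)
#define SZ_1G    (UINT64_C(1) << 30)
/* Assumption: PUD_SIZE is 1 GiB, as on x86-64 with 4 KiB pages. */
#define PUD_SIZE SZ_1G

/* Half-open address range [start, end). Hypothetical helper type. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Round x down or up to a multiple of a (a must be a power of two). */
static uint64_t align_down(uint64_t x, uint64_t a) { return x & ~(a - 1); }
static uint64_t align_up(uint64_t x, uint64_t a) { return (x + a - 1) & ~(a - 1); }

/*
 * Worst case: assume the whole range may be covered by pmd sharing, so
 * widen it to PUD_SIZE boundaries, then clamp the result to the vma.
 */
static struct range adjust_range(struct range r, struct range vma)
{
	struct range out;
	uint64_t a_start, a_end;

	assert(r.start < r.end && vma.start <= r.start && r.end <= vma.end);

	a_start = align_down(r.start, PUD_SIZE);
	a_end = align_up(r.end, PUD_SIZE);

	out.start = a_start > vma.start ? a_start : vma.start;
	out.end = a_end < vma.end ? a_end : vma.end;
	return out;
}
```

For the range (1g-2m, 1g+2m) inside a vma (0, 2g), this sketch yields
(0, 2g), the expected result named in the commit message, and it needs no
loop regardless of how large the range is.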
      
      Fixes: 017b1660 ("mm: migration: fix migration of huge PMD shared pages")
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: <stable@vger.kernel.org>
      Link: http://lkml.kernel.org/r/20200730201636.74778-1-peterx@redhat.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      75802ca6
    • Alexander A. Klimov's avatar
      mm: thp: replace HTTP links with HTTPS ones · 42742d9b
      Alexander A. Klimov authored
      Rationale:
      Reduces attack surface on kernel devs opening the links for MITM
      as HTTPS traffic is much harder to manipulate.
      
      Deterministic algorithm:
      For each file:
        If not .svg:
          For each line:
            If doesn't contain `xmlns`:
              For each link, `http://[^# 	]*(?:\w|/)`:
                If neither `gnu\.org/license`, nor `mozilla\.org/MPL`:
                  If both the HTTP and HTTPS versions
                  return 200 OK and serve the same content:
                    Replace HTTP with HTTPS.
      
      [akpm@linux-foundation.org: fix amd.com URL, per Vlastimil]
      Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Link: http://lkml.kernel.org/r/20200713164345.36088-1-grandmaster@al2klimov.de
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      42742d9b
    • Joonsoo Kim's avatar
      mm/page_alloc: fix memalloc_nocma_{save/restore} APIs · 8510e69c
      Joonsoo Kim authored
      Currently, the memalloc_nocma_{save/restore} API, which keeps page
      allocation out of the CMA area, is implemented using
      current_gfp_context(). However, this implementation has two problems.
      
      First, it doesn't work for the allocation fastpath. The fastpath uses the
      original gfp_mask, since current_gfp_context() was introduced to control
      reclaim and is applied only on the slowpath. So the CMA area can still be
      allocated from through the fastpath even when the
      memalloc_nocma_{save/restore} APIs are used. Currently, there is just
      one user of these APIs, and it has a fallback method that prevents an
      actual problem.
      
      Second, clearing __GFP_MOVABLE in current_gfp_context() has the side
      effect of excluding memory in ZONE_MOVABLE as an allocation target.
      
      To fix these problems, this patch changes how the CMA area is excluded
      from page allocation. The main point of the change is to use alloc_flags,
      which is mainly used to control allocation and so is a good fit for
      excluding the CMA area.
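
The idea of deciding CMA eligibility through alloc_flags rather than by
editing the gfp mask can be sketched as below. This is a simplified
userspace model. The flag values, the names GFP_MOVABLE_BIT and
ALLOC_CMA_BIT, and the function compute_alloc_flags() are hypothetical and
do not reproduce the kernel's definitions.

```c
#include <stdbool.h>

/* Hypothetical flag values; the kernel's real definitions differ. */
#define GFP_MOVABLE_BIT 0x1u /* request is for movable memory */
#define ALLOC_CMA_BIT   0x1u /* allocator may use the CMA area */

/*
 * Derive allocation flags without touching gfp_mask. Because the movable
 * bit stays set in gfp_mask, movable zones remain valid targets; only
 * the permission to use the CMA area is withheld when the task has asked
 * for no-CMA allocations.
 */
static unsigned int compute_alloc_flags(unsigned int gfp_mask, bool task_nocma)
{
	unsigned int alloc_flags = 0;

	if ((gfp_mask & GFP_MOVABLE_BIT) && !task_nocma)
		alloc_flags |= ALLOC_CMA_BIT;

	return alloc_flags;
}
```

Since the decision is made once from the caller's flags, the same result
would apply on both the fastpath and the slowpath in this model.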
      
      Fixes: d7fefcc8 (mm/cma: add PF flag to force non cma alloc)
      Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
      Link: http://lkml.kernel.org/r/1595468942-29687-1-git-send-email-iamjoonsoo.kim@lge.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8510e69c
    • Muchun Song's avatar
      mm/page_alloc.c: skip setting nodemask when we are in interrupt · 182f3d7a
      Muchun Song authored
      In interrupt context, an allocation is unrelated to the current task's
      context.  If we use the current task's mems_allowed, the fast path can
      fail to allocate pages and fall back to slow path memory allocation when
      the current node (that is, the current task's mems_allowed) does not have
      enough memory to allocate.  This slows down memory allocation in
      interrupt context.  So skip setting the nodemask, which allows any node
      to allocate memory, so that fast path allocation can succeed.
      Signed-off-by: Muchun Song <songmuchun@bytedance.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Pekka Enberg <penberg@kernel.org>
      Cc: David Hildenbrand <david@redhat.com>
      Link: http://lkml.kernel.org/r/20200706025921.53683-1-songmuchun@bytedance.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      182f3d7a
    • Wei Yang's avatar
      mm/page_alloc: fallbacks at most has 3 elements · da415663
      Wei Yang authored
      MIGRATE_TYPES is used as the end-of-list marker, so each one-dimensional
      array has at most 3 elements.
      
      Reduce the size to 3 to save a little memory.
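
A sentinel-terminated fallback table of this shape can be sketched as
follows. The enum here is deliberately reduced to three types plus the
MIGRATE_TYPES marker, and the table contents and count_fallbacks() are
assumptions for illustration, not a copy of the kernel's table.

```c
#include <assert.h>

/* Simplified for illustration; the kernel's enum has more entries. */
enum migratetype {
	MIGRATE_UNMOVABLE,
	MIGRATE_MOVABLE,
	MIGRATE_RECLAIMABLE,
	MIGRATE_TYPES /* count of types, also used as the end marker */
};

/*
 * Each row lists fallback types in order of preference and ends with
 * MIGRATE_TYPES, so two fallbacks plus the marker fit in 3 slots.
 */
static const int fallbacks[MIGRATE_TYPES][3] = {
	[MIGRATE_UNMOVABLE]   = { MIGRATE_RECLAIMABLE, MIGRATE_MOVABLE,   MIGRATE_TYPES },
	[MIGRATE_MOVABLE]     = { MIGRATE_RECLAIMABLE, MIGRATE_UNMOVABLE, MIGRATE_TYPES },
	[MIGRATE_RECLAIMABLE] = { MIGRATE_UNMOVABLE,   MIGRATE_MOVABLE,   MIGRATE_TYPES },
};

/* Count the usable fallbacks for a type by scanning up to the marker. */
static int count_fallbacks(int mt)
{
	int n = 0;

	assert(mt >= 0 && mt < MIGRATE_TYPES);
	while (fallbacks[mt][n] != MIGRATE_TYPES)
		n++;
	return n;
}
```

Because every scan stops at the marker, a fourth slot per row would never
be read in this sketch.
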
      Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Link: http://lkml.kernel.org/r/20200625231022.18784-1-richard.weiyang@linux.alibaba.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      da415663
    • Qian Cai's avatar
      mm/page_alloc: silence a KASAN false positive · 9e15afa5
      Qian Cai authored
      kernel_init_free_pages() will use memset() on s390 to clear all pages from
      kmalloc_order() which will override KASAN redzones because a redzone was
      setup from the end of the allocation size to the end of the last page.
      Silence it by not reporting it there.  An example of the report is,
      
       BUG: KASAN: slab-out-of-bounds in __free_pages_ok
       Write of size 4096 at addr 000000014beaa000
       Call Trace:
       show_stack+0x152/0x210
       dump_stack+0x1f8/0x248
       print_address_description.isra.13+0x5e/0x4d0
       kasan_report+0x130/0x178
       check_memory_region+0x190/0x218
       memset+0x34/0x60
       __free_pages_ok+0x894/0x12f0
       kfree+0x4f2/0x5e0
       unpack_to_rootfs+0x60e/0x650
       populate_rootfs+0x56/0x358
       do_one_initcall+0x1f4/0xa20
       kernel_init_freeable+0x758/0x7e8
       kernel_init+0x1c/0x170
       ret_from_fork+0x24/0x28
       Memory state around the buggy address:
       000000014bea9f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       000000014bea9f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      >000000014beaa000: 03 fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
                          ^
       000000014beaa080: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
       000000014beaa100: fe fe fe fe fe fe fe fe fe fe fe fe fe fe
      
      Fixes: 6471384a ("mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options")
      Signed-off-by: Qian Cai <cai@lca.pw>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Tested-by: Vasily Gorbik <gor@linux.ibm.com>
      Acked-by: Vasily Gorbik <gor@linux.ibm.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Christian Borntraeger <borntraeger@de.ibm.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Link: http://lkml.kernel.org/r/20200610052154.5180-1-cai@lca.pw
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9e15afa5
    • Wei Yang's avatar
      mm/page_alloc.c: remove unnecessary end_bitidx for [set|get]_pfnblock_flags_mask() · 535b81e2
      Wei Yang authored
      After previous cleanup, the end_bitidx is not necessary any more.
      Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Link: http://lkml.kernel.org/r/20200623124201.8199-4-richard.weiyang@linux.alibaba.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      535b81e2
    • Wei Yang's avatar
      mm/page_alloc.c: simplify pageblock bitmap access · d93d5ab9
      Wei Yang authored
      Due to commit e58469ba ("mm: page_alloc: use word-based accesses for
      get/set pageblock bitmaps"), pageblock bitmap is accessed with word-based
      access.  This operation could be simplified a little.
      
      Intuitively, if we want to get a bit range [start_idx, end_idx] in a word,
      we can do it like this:
      
          mask = (1 << (end_idx - start_idx + 1)) - 1;
          ret = (word >> start_idx) & mask;
      
      Likewise, if we want to set a bit range [start_idx, end_idx] with flags,
      we can do the same by just shifting by start_idx.
      
      By doing so we reduce some instructions for these two helper functions:
      
                                      Before   Patched
          set_pfnblock_flags_mask     209      198(-5%)
          get_pfnblock_flags_mask     101      87(-13%)
      
      Since the syntax is changed a little, we need to check the whole 4-bit
      migrate_type instead of part of it.
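
The shift-and-mask pattern described above can be written out as a small
standalone sketch. The helper names range_mask(), get_bit_range() and
set_bit_range() are hypothetical, and a 64-bit word is assumed; this is not
the code of the two kernel helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Build a mask with (end_idx - start_idx + 1) low bits set. */
static uint64_t range_mask(unsigned int start_idx, unsigned int end_idx)
{
	unsigned int width;

	assert(start_idx <= end_idx && end_idx < 64);
	width = end_idx - start_idx + 1;
	/* Shifting a 64-bit value by 64 is undefined, so handle full width. */
	return width == 64 ? UINT64_MAX : ((UINT64_C(1) << width) - 1);
}

/* Read bits [start_idx, end_idx] of word, right-aligned. */
static uint64_t get_bit_range(uint64_t word, unsigned int start_idx,
			      unsigned int end_idx)
{
	return (word >> start_idx) & range_mask(start_idx, end_idx);
}

/* Return word with bits [start_idx, end_idx] replaced by flags. */
static uint64_t set_bit_range(uint64_t word, unsigned int start_idx,
			      unsigned int end_idx, uint64_t flags)
{
	uint64_t mask = range_mask(start_idx, end_idx);

	return (word & ~(mask << start_idx)) | ((flags & mask) << start_idx);
}
```

Both helpers share one mask computation and differ only in the direction of
the shift by start_idx.
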
      Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Link: http://lkml.kernel.org/r/20200623124201.8199-3-richard.weiyang@linux.alibaba.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d93d5ab9