1. 23 Jun, 2023 16 commits
    • Merge tag 'arm-fixes-6.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc · 0f56e657
      Linus Torvalds authored
      Pull ARM SoC fixes from Arnd Bergmann:
       "The final bug fixes for Qualcomm and Rockchips came in, all of them
        for devicetree files:
      
         - Devices on Qualcomm SC7180/SC7280 that are cache coherent are now
           marked so correctly to fix a regression after a change in kernel
           behavior
      
         - Rockchips has a few minor changes for correctness of regulator and
           cache properties, as well as fixes for incorrect behavior of the
           RK3568 PCI controller and reset pins on two boards"
      
      * tag 'arm-fixes-6.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
        arm64: dts: qcom: sc7280: Mark SCM as dma-coherent for chrome devices
        arm64: dts: qcom: sc7180: Mark SCM as dma-coherent for trogdor
        arm64: dts: qcom: sc7180: Mark SCM as dma-coherent for IDP
        dt-bindings: firmware: qcom,scm: Document that SCM can be dma-coherent
        arm64: dts: rockchip: Fix rk356x PCIe register and range mappings
        arm64: dts: rockchip: fix button reset pin for nanopi r5c
        arm64: dts: rockchip: fix nEXTRST on SOQuartz
        arm64: dts: rockchip: add missing cache properties
        arm64: dts: rockchip: fix USB regulator on ROCK64
    • Merge tag 'for-6.4-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 569fa939
      Linus Torvalds authored
      Pull btrfs fix from David Sterba:
       "Unfortunately the recent u32 overflow fix was not complete, there was
        one conversion left, assertion not triggered by my tests but caught by
        Qu's fstests case.
      
        The "cleanup for later" has been promoted to a proper fix and wraps
        all uses of the stripe left shift so the diffstat has grown but leaves
        no potentially problematic uses.
      
        We should have done it that way before, sorry"
      
      * tag 'for-6.4-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: fix remaining u32 overflows when left shifting stripe_nr
    • Merge tag 'block-6.4-2023-06-23' of git://git.kernel.dk/linux · 9cb38381
      Linus Torvalds authored
      Pull block fix from Jens Axboe:
       "It's apparently the week of 'fixup something from last week', because
        the same is true for this block pull request.
      
        Fix up a lock grab that needs to be IRQ saving, rather than just IRQ
        disabling, in the block cgroup code"
      
      * tag 'block-6.4-2023-06-23' of git://git.kernel.dk/linux:
        block: make sure local irq is disabled when calling __blkcg_rstat_flush
    • Merge tag 'iommu-fix-v6.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 917b3c7c
      Linus Torvalds authored
      Pull iommu fix from Joerg Roedel:
      
       - Fix potential memory leak in AMD IOMMU domain allocation path
      
      * tag 'iommu-fix-v6.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        iommu/amd: Fix possible memory leak of 'domain'
    • Merge tag 'sound-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 61dabacd
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "Three oneliner fixes: one for a thinko in SOF SoundWire code and two
        HD-audio quirks for ASUS laptops. All device-specific and should be
        safe to apply"
      
      * tag 'sound-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda/realtek: Add quirk for ASUS ROG GV601V
        ALSA: hda/realtek: Add quirk for ASUS ROG G634Z
        ASoC: intel: sof_sdw: Fixup typo in device link checking
    • Merge tag 'gpio-fixes-for-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux · 6edecb99
      Linus Torvalds authored
      Pull gpio fixes from Bartosz Golaszewski:
      
       - fix IRQ initialization in gpiochip_irqchip_add_domain()
      
       - add a missing return value check for platform_get_irq() in
         gpio-sifive
      
       - don't free irq_domains which GPIOLIB does not manage
      
      * tag 'gpio-fixes-for-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
        gpiolib: Fix irq_domain resource tracking for gpiochip_irqchip_add_domain()
        gpio: sifive: add missing check for platform_get_irq
        gpiolib: Fix GPIO chip IRQ initialization restriction
    • Merge tag 'qcom-arm64-fixes-for-6.4-2' of... · ed8ff046
      Arnd Bergmann authored
      Merge tag 'qcom-arm64-fixes-for-6.4-2' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes
      
      One last Qualcomm ARM64 DeviceTree fix for v6.4
      
      Changes related to cache management for DMA memory caused WiFi to stop
      working on SC7180 and SC7280 based products using TF-A. These changes
      mark the relevant device dma-coherent to correct the behavior.
      
      * tag 'qcom-arm64-fixes-for-6.4-2' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
        arm64: dts: qcom: sc7280: Mark SCM as dma-coherent for chrome devices
        arm64: dts: qcom: sc7180: Mark SCM as dma-coherent for trogdor
        arm64: dts: qcom: sc7180: Mark SCM as dma-coherent for IDP
        dt-bindings: firmware: qcom,scm: Document that SCM can be dma-coherent
      
      Link: https://lore.kernel.org/r/20230622203248.106422-1-andersson@kernel.org
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    • workqueue: clean up WORK_* constant types, clarify masking · afa4bb77
      Linus Torvalds authored
      Dave Airlie reports that gcc-13.1.1 has started complaining about some
      of the workqueue code in 32-bit arm builds:
      
        kernel/workqueue.c: In function ‘get_work_pwq’:
        kernel/workqueue.c:713:24: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
          713 |                 return (void *)(data & WORK_STRUCT_WQ_DATA_MASK);
              |                        ^
        [ ... a couple of other cases ... ]
      
      and while it's not immediately clear exactly why gcc started complaining
      about it now, I suspect that some C23-induced enum type handling fixup in
      gcc-13 is the cause.
      
      Whatever the reason for starting to complain, the code and data types
      are indeed disgusting enough that the complaint is warranted.
      
      The wq code ends up creating various "helper constants" (like that
      WORK_STRUCT_WQ_DATA_MASK) using an enum type, which is all kinds of
      confused.  The mask needs to be 'unsigned long', not some unspecified
      enum type.
      
      To make matters worse, the actual "mask and cast to a pointer" is
      repeated a couple of times, and the cast isn't even always done to the
      right pointer, but - as the error case above - to a 'void *' with then
      the compiler finishing the job.
      
      That's not how we roll in the kernel.
      
      So create the masks using the proper types rather than some ambiguous
      enumeration, and use a nice helper that actually does the type
      conversion in one well-defined place.
      
      Incidentally, this magically makes clang generate better code.  That,
      admittedly, is really just a sign of clang having been seriously
      confused before, and cleaning up the typing unconfuses the compiler too.

      Reported-by: Dave Airlie <airlied@gmail.com>
      Link: https://lore.kernel.org/lkml/CAPM=9twNnV4zMCvrPkw3H-ajZOH-01JVh_kDrxdPYQErz8ZTdA@mail.gmail.com/
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
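The cleanup described in this commit can be illustrated with a small userspace sketch. It is not the kernel's code: `DEMO_FLAG_BITS`, `struct demo_pwq`, `demo_pack()`, `demo_unpack()` and `demo_roundtrip()` are hypothetical names, and the sketch assumes `unsigned long` is as wide as a pointer (checked by the static assertion). The point it tries to show is that the masks are plain `unsigned long` constants rather than enum members, and that the mask-and-cast happens in one helper that returns the right pointer type.

```c
#include <assert.h>

/* Assumption for this sketch: the low DEMO_FLAG_BITS bits of the word hold flags. */
#define DEMO_FLAG_BITS 8
#define DEMO_FLAG_MASK ((1ul << DEMO_FLAG_BITS) - 1)	/* unsigned long, not an enum */
#define DEMO_DATA_MASK (~DEMO_FLAG_MASK)		/* unsigned long, not an enum */

static_assert(sizeof(unsigned long) == sizeof(void *),
	      "sketch assumes a pointer-sized unsigned long");

struct demo_pwq {
	int id;
};

/* Combine an aligned pointer and some flag bits into one word. */
static inline unsigned long demo_pack(struct demo_pwq *pwq, unsigned long flags)
{
	return (unsigned long)pwq | (flags & DEMO_FLAG_MASK);
}

/* The single place where the word is masked and cast back to a typed pointer. */
static inline struct demo_pwq *demo_unpack(unsigned long data)
{
	return (struct demo_pwq *)(data & DEMO_DATA_MASK);
}

/* Returns 1 if both the pointer and the flags survive a pack/unpack round trip. */
int demo_roundtrip(void)
{
	/* The object must be aligned so that its low DEMO_FLAG_BITS bits are zero. */
	static _Alignas(1 << DEMO_FLAG_BITS) struct demo_pwq pwq = { .id = 7 };
	unsigned long word = demo_pack(&pwq, 0x5);

	return demo_unpack(word) == &pwq && (word & DEMO_FLAG_MASK) == 0x5;
}
```

Keeping the cast inside `demo_unpack()` means callers never see a `void *` or an enum-typed mask, which is the kind of confusion the commit message complains about.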
    • Merge tag 'net-6.4-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 8a28a0b6
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from ipsec, bpf, mptcp and netfilter.
      
        Current release - regressions:
      
         - netfilter: add NFT_TRANS_PREPARE_ERROR to deal with bound set/chain
      
         - eth: mlx5e:
            - fix scheduling of IPsec ASO query while in atomic
            - free IRQ rmap and notifier on kernel shutdown
      
        Current release - new code bugs:
      
         - phy: manually remove LEDs to ensure correct ordering
      
        Previous releases - regressions:
      
         - mptcp: fix possible divide by zero in recvmsg()
      
         - dsa: revert "net: phy: dp83867: perform soft reset and retain
           established link"
      
        Previous releases - always broken:
      
         - sched: netem: acquire qdisc lock in netem_change()
      
         - bpf:
            - fix verifier id tracking of scalars on spill
            - fix NULL dereference on exceptions
            - accept function names that contain dots
      
         - netfilter: disallow element updates of bound anonymous sets
      
         - mptcp: ensure listener is unhashed before updating the sk status
      
         - xfrm:
            - add missed call to delete offloaded policies
            - fix inbound ipv4/udp/esp packets to UDPv6 dualstack sockets
      
         - selftests: fixes for FIPS mode
      
         - dsa: mt7530: fix multiple CPU ports, BPDU and LLDP handling
      
         - eth: sfc: use budget for TX completions
      
        Misc:
      
         - wifi: iwlwifi: add support for SO-F device with PCI id 0x7AF0"
      
      * tag 'net-6.4-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (74 commits)
        revert "net: align SO_RCVMARK required privileges with SO_MARK"
        net: wwan: iosm: Convert single instance struct member to flexible array
        sch_netem: acquire qdisc lock in netem_change()
        selftests: forwarding: Fix race condition in mirror installation
        wifi: mac80211: report all unusable beacon frames
        mptcp: ensure listener is unhashed before updating the sk status
        mptcp: drop legacy code around RX EOF
        mptcp: consolidate fallback and non fallback state machine
        mptcp: fix possible list corruption on passive MPJ
        mptcp: fix possible divide by zero in recvmsg()
        mptcp: handle correctly disconnect() failures
        bpf: Force kprobe multi expected_attach_type for kprobe_multi link
        bpf/btf: Accept function names that contain dots
        Revert "net: phy: dp83867: perform soft reset and retain established link"
        net: mdio: fix the wrong parameters
        netfilter: nf_tables: Fix for deleting base chains with payload
        netfilter: nfnetlink_osf: fix module autoload
        netfilter: nf_tables: drop module reference after updating chain
        netfilter: nf_tables: disallow timeout for anonymous sets
        netfilter: nf_tables: disallow updates of anonymous sets
        ...
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 412d070b
      Linus Torvalds authored
      Pull kvm fixes from Paolo Bonzini:
       "ARM:
      
         - Correctly save/restore PMUSERENR_EL0 when host userspace is using
           PMU counters directly
      
         - Fix GICv2 emulation on GICv3 after the locking rework
      
         - Don't use smp_processor_id() in kvm_pmu_probe_armpmu(), and
           document why
      
        Generic:
      
         - Avoid setting page table entries pointing to a deleted memslot if a
           host page table entry is changed concurrently with the deletion"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        KVM: Avoid illegal stage2 mapping on invalid memory slot
        KVM: arm64: Use raw_smp_processor_id() in kvm_pmu_probe_armpmu()
        KVM: arm64: Restore GICv2-on-GICv3 functionality
        KVM: arm64: PMU: Don't overwrite PMUSERENR with vcpu loaded
        KVM: arm64: PMU: Restore the host's PMUSERENR_EL0
    • Merge tag 'powerpc-6.4-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · e7758c0d
      Linus Torvalds authored
      Pull powerpc fix from Michael Ellerman:
      
       - Disable IRQs when switching mm in exit_lazy_flush_tlb() called from
         exit_mmap()
      
      Thanks to Nicholas Piggin and Sachin Sant.
      
      * tag 'powerpc-6.4-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/64s/radix: Fix exit lazy tlb mm switch with irqs enabled
    • Merge tag 'pci-v6.4-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci · 4a426aa1
      Linus Torvalds authored
      Pull pci fix from Bjorn Helgaas:
      
       - Transfer Intel LGM GW PCIe maintenance from Rahul Tanwar to Chuanhua
         Lei (Zhu YiXin)
      
      * tag 'pci-v6.4-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
        MAINTAINERS: Add Chuanhua Lei as Intel LGM GW PCIe maintainer
    • Merge tag 'mmc-v6.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · 93765002
      Linus Torvalds authored
      Pull MMC fixes from Ulf Hansson:
      
       - Fix support for deferred probing for several host drivers
      
       - litex_mmc: Use async probe as it's common for all mmc hosts
      
       - meson-gx: Fix bug when scheduling while atomic
      
       - mmci_stm32: Fix max busy timeout calculation
      
       - sdhci-msm: Disable broken 64-bit DMA on MSM8916
      
      * tag 'mmc-v6.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
        mmc: usdhi60rol0: fix deferred probing
        mmc: sunxi: fix deferred probing
        mmc: sh_mmcif: fix deferred probing
        mmc: sdhci-spear: fix deferred probing
        mmc: sdhci-acpi: fix deferred probing
        mmc: owl: fix deferred probing
        mmc: omap_hsmmc: fix deferred probing
        mmc: omap: fix deferred probing
        mmc: mvsdio: fix deferred probing
        mmc: mtk-sd: fix deferred probing
        mmc: meson-gx: fix deferred probing
        mmc: bcm2835: fix deferred probing
        mmc: litex_mmc: set PROBE_PREFER_ASYNCHRONOUS
        mmc: meson-gx: remove redundant mmc_request_done() call from irq context
        mmc: mmci: stm32: fix max busy timeout calculation
        mmc: sdhci-msm: Disable broken 64-bit DMA on MSM8916
    • Merge tag 'platform-drivers-x86-v6.4-5' of... · 65d48989
      Linus Torvalds authored
      Merge tag 'platform-drivers-x86-v6.4-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
      
      Pull x86 platform driver fix from Hans de Goede:
       "One small fix for an AMD PMF driver issue which is causing issues for
        users of just released AMD laptop models"
      
      * tag 'platform-drivers-x86-v6.4-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
        platform/x86/amd/pmf: Register notify handler only if SPS is enabled
    • Merge tag 'io_uring-6.4-2023-06-21' of git://git.kernel.dk/linux · c213de63
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
       "A fix for a race condition with poll removal and linked timeouts, and
        then a few followup fixes/tweaks for the msg_control patch from last
        week.
      
        Not super important, particularly the sparse fixup, as it was broken
        before that recent commit. But let's get it sorted for real for this
        release, rather than just have it broken a bit differently"
      
      * tag 'io_uring-6.4-2023-06-21' of git://git.kernel.dk/linux:
        io_uring/net: use the correct msghdr union member in io_sendmsg_copy_hdr
        io_uring/net: disable partial retries for recvmsg with cmsg
        io_uring/net: clear msg_controllen on partial sendmsg retry
        io_uring/poll: serialize poll linked timer start with poll removal
    • Merge tag 'cgroup-for-6.4-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · 5950a006
      Linus Torvalds authored
      Pull cgroup fixes from Tejun Heo:
       "It's late but here are two bug fixes. Both fix problems which can be
        severe but are very confined in scope. The risk to most use cases
        should be minimal.
      
         - Fix for an old bug which triggers if a cgroup subsystem is
           remounted to a different hierarchy while someone is reading its
           cgroup.procs/tasks file. The risk is pretty low given how seldom
           cgroup subsystems are moved across hierarchies.
      
         - We moved cpus_read_lock() outside of cgroup internal locks a while
           ago but forgot to update the legacy_freezer leading to lockdep
           triggers. Fixed"
      
      * tag 'cgroup-for-6.4-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
        cgroup: Do not corrupt task iteration when rebinding subsystem
        cgroup,freezer: hold cpu_hotplug_lock before freezer_mutex in freezer_css_{online,offline}()
  2. 22 Jun, 2023 22 commits
    • Merge tag 'kvmarm-fixes-6.4-4' of... · 2623b3dc
      Paolo Bonzini authored
      Merge tag 'kvmarm-fixes-6.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
      
      KVM/arm64 fixes for 6.4, take #4
      
      - Correctly save/restore PMUSERENR_EL0 when host userspace is using
        PMU counters directly
      
      - Fix GICv2 emulation on GICv3 after the locking rework
      
      - Don't use smp_processor_id() in kvm_pmu_probe_armpmu(), and
        document why...
    • arm64: dts: qcom: sc7280: Mark SCM as dma-coherent for chrome devices · 7b59e8ae
      Douglas Anderson authored
      Just like for sc7180 devices using the Chrome bootflow (AKA trogdor
      and IDP), sc7280 devices using the Chrome bootflow also need their
      firmware marked dma-coherent. On sc7280 this wasn't causing WiFi to
      fail to start up, since WiFi works differently there. However, on
      sc7280 devices we were still getting the message at bootup after
      commit 7bd6680b ("Revert "Revert "arm64: dma: Drop cache
      invalidation from arch_dma_prep_coherent()"""):
      
       qcom_scm firmware:scm: Assign memory protection call failed -22
       qcom_rmtfs_mem 9c900000.memory: assign memory failed
       qcom_rmtfs_mem: probe of 9c900000.memory failed with error -22
      
      We should mark SCM properly just like we did for trogdor.
      
      Fixes: 7bd6680b ("Revert "Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()""")
      Fixes: 7a1f4e7f ("arm64: dts: qcom: sc7280: Add basic dts/dtsi files for sc7280 soc")
      Signed-off-by: Douglas Anderson <dianders@chromium.org>
      Link: https://lore.kernel.org/r/20230616081440.v2.4.I21dc14a63327bf81c6bb58fe8ed91dbdc9849ee2@changeid
      Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    • arm64: dts: qcom: sc7180: Mark SCM as dma-coherent for trogdor · a54b7fa6
      Douglas Anderson authored
      Trogdor devices use firmware backed by TF-A instead of Qualcomm's
      normal TZ. On TF-A we end up mapping memory as cacheable.
      Specifically, you can see in Trogdor's TF-A code [1] in
      qti_sip_mem_assign() that we call qti_mmap_add_dynamic_region() with
      MT_RO_DATA. This translates down to MT_MEMORY instead of
      MT_NON_CACHEABLE or MT_DEVICE. Apparently Qualcomm's normal TZ
      implementation maps the memory as non-cacheable.
      
      Let's add the "dma-coherent" attribute to the SCM for trogdor.
      
      Adding "dma-coherent" like this fixes WiFi on sc7180-trogdor
      devices. WiFi was broken as of commit 7bd6680b ("Revert "Revert
      "arm64: dma: Drop cache invalidation from
      arch_dma_prep_coherent()"""). Specifically at bootup we'd get:
      
       qcom_scm firmware:scm: Assign memory protection call failed -22
       qcom_rmtfs_mem 94600000.memory: assign memory failed
       qcom_rmtfs_mem: probe of 94600000.memory failed with error -22
      
      From discussion on the mailing lists [2] and over IRC [3], it was
      determined that we should always have been tagging the SCM as
      dma-coherent on trogdor but that the old "invalidate" happened to make
      things work most of the time. Tagging it properly like this is a much
      more robust solution.
      
      [1] https://chromium.googlesource.com/chromiumos/third_party/arm-trusted-firmware/+/refs/heads/firmware-trogdor-13577.B/plat/qti/common/src/qti_syscall.c
      [2] https://lore.kernel.org/r/20230614165904.1.I279773c37e2c1ed8fbb622ca6d1397aea0023526@changeid
      [3] https://oftc.irclog.whitequark.org/linux-msm/2023-06-15
      
      Fixes: 7bd6680b ("Revert "Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()""")
      Fixes: 7ec3e673 ("arm64: dts: qcom: sc7180-trogdor: add initial trogdor and lazor dt")
      Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
      Signed-off-by: Douglas Anderson <dianders@chromium.org>
      Link: https://lore.kernel.org/r/20230616081440.v2.3.Ic62daa649b47b656b313551d646c4de9a7da4bd4@changeid
      Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    • arm64: dts: qcom: sc7180: Mark SCM as dma-coherent for IDP · 9a5f0b11
      Douglas Anderson authored
      sc7180-idp is, for most intents and purposes, a trogdor device.
      Specifically, sc7180-idp is designed to run the same style of firmware
      as trogdor devices. This can be seen from the fact that IDP has the
      same "Reserved memory changes" in its device tree that trogdor has.
      
      Recently it was realized that we need to mark SCM as dma-coherent to
      match what trogdor's style of firmware (based on TF-A) does [1]. That
      means we need this dma-coherent tag on IDP as well.
      
      Without this, on newer versions of Linux, specifically those with
      commit 7bd6680b ("Revert "Revert "arm64: dma: Drop cache
      invalidation from arch_dma_prep_coherent()"""), WiFi will fail to
      work. At bootup you'll see:
      
        qcom_scm firmware:scm: Assign memory protection call failed -22
        qcom_rmtfs_mem 94600000.memory: assign memory failed
        qcom_rmtfs_mem: probe of 94600000.memory failed with error -22
      
      [1] https://lore.kernel.org/r/20230615145253.1.Ic62daa649b47b656b313551d646c4de9a7da4bd4@changeid
      
      Fixes: 7bd6680b ("Revert "Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()""")
      Fixes: f5ab220d ("arm64: dts: qcom: sc7180: Add remoteproc enablers")
      Signed-off-by: Douglas Anderson <dianders@chromium.org>
      Link: https://lore.kernel.org/r/20230616081440.v2.2.I3c17d546d553378aa8a0c68c3fe04bccea7cba17@changeid
      Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    • dt-bindings: firmware: qcom,scm: Document that SCM can be dma-coherent · c0877829
      Douglas Anderson authored
      Trogdor devices use firmware backed by TF-A instead of Qualcomm's
      normal TZ. On TF-A we end up mapping memory as cacheable. Specifically,
      you can see in Trogdor's TF-A code [1] in qti_sip_mem_assign() that we
      call qti_mmap_add_dynamic_region() with MT_RO_DATA. This translates
      down to MT_MEMORY instead of MT_NON_CACHEABLE or MT_DEVICE.
      
      Let's allow devices like trogdor to be described properly by allowing
      "dma-coherent" in the SCM node.

      Signed-off-by: Douglas Anderson <dianders@chromium.org>
      Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
      Link: https://lore.kernel.org/r/20230616081440.v2.1.Ie79b5f0ed45739695c9970df121e11d724909157@changeid
      Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    • KVM: Avoid illegal stage2 mapping on invalid memory slot · 2230f9e1
      Gavin Shan authored
      We run into a guest hang in edk2 firmware when KSM is kept running on
      the host. The edk2 firmware is waiting for status 0x80 from QEMU's pflash
      device (TYPE_PFLASH_CFI01) during the operation of sector erasing or
      buffered write. The status is returned by reading the memory region of
      the pflash device and the read request should have been forwarded to QEMU
      and emulated by it. Unfortunately, the read request is covered by an
      illegal stage2 mapping when the guest hang issue occurs. The read request
      is completed with QEMU bypassed and wrong status is fetched. The edk2
      firmware runs into an infinite loop with the wrong status.
      
      The illegal stage2 mapping is populated due to same page sharing by KSM
      at (C) even though the associated memory slot has been marked as invalid
      at (B) when the memory slot is requested to be deleted. It's notable that the
      active and inactive memory slots can't be swapped when we're in the middle
      of kvm_mmu_notifier_change_pte() because kvm->mn_active_invalidate_count
      is elevated, and kvm_swap_active_memslots() will busy loop until it reaches
      zero again. Besides, the swapping from the active to the inactive memory
      slots is also avoided by holding &kvm->srcu in __kvm_handle_hva_range(),
      corresponding to synchronize_srcu_expedited() in kvm_swap_active_memslots().
      
        CPU-A                    CPU-B
        -----                    -----
                                 ioctl(kvm_fd, KVM_SET_USER_MEMORY_REGION)
                                 kvm_vm_ioctl_set_memory_region
                                 kvm_set_memory_region
                                 __kvm_set_memory_region
                                 kvm_set_memslot(kvm, old, NULL, KVM_MR_DELETE)
                                   kvm_invalidate_memslot
                                     kvm_copy_memslot
                                     kvm_replace_memslot
                                     kvm_swap_active_memslots        (A)
                                     kvm_arch_flush_shadow_memslot   (B)
        same page sharing by KSM
        kvm_mmu_notifier_invalidate_range_start
              :
        kvm_mmu_notifier_change_pte
          kvm_handle_hva_range
          __kvm_handle_hva_range
          kvm_set_spte_gfn            (C)
              :
        kvm_mmu_notifier_invalidate_range_end
      
      Fix the issue by skipping the invalid memory slot at (C) to avoid the
      illegal stage2 mapping so that the read request for the pflash's status
      is forwarded to QEMU and emulated by it. In this way, the correct pflash's
      status can be returned from QEMU to break the infinite loop in the edk2
      firmware.
      
      We tried a git-bisect and the first problematic commit is cd4c7183
      ("KVM: arm64: Convert to the gfn-based MMU notifier callbacks"). With this,
      clean_dcache_guest_page() is called after the memory slots are iterated
      in kvm_mmu_notifier_change_pte(). clean_dcache_guest_page() is called
      before the iteration on the memory slots before this commit. This change
      literally enlarges the racy window between kvm_mmu_notifier_change_pte()
      and memory slot removal so that we're able to reproduce the issue in a
      practical test case. However, the issue exists since commit d5d8184d
      ("KVM: ARM: Memory virtualization setup").
      
      Cc: stable@vger.kernel.org # v3.9+
      Fixes: d5d8184d ("KVM: ARM: Memory virtualization setup")
      Reported-by: Shuai Hu <hshuai@redhat.com>
      Reported-by: Zhenyu Zhang <zhenyzha@redhat.com>
      Signed-off-by: Gavin Shan <gshan@redhat.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
      Reviewed-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: Sean Christopherson <seanjc@google.com>
      Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
      Message-Id: <20230615054259.14911-1-gshan@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
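The fix described in this commit ("skip the invalid memory slot at (C)") can be modelled in userspace roughly as follows. This is only a sketch of the idea, not KVM code: `struct demo_memslot`, `DEMO_MEMSLOT_INVALID`, `demo_set_spte()` and `demo_change_pte()` are hypothetical names, and the counter stands in for installing a stage2 mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag marking a slot that is in the middle of being deleted. */
#define DEMO_MEMSLOT_INVALID (1u << 16)

struct demo_memslot {
	unsigned long base_gfn;			/* first guest frame covered */
	unsigned long npages;			/* number of frames covered */
	unsigned int flags;
	unsigned long mappings_installed;	/* stand-in for stage2 state */
};

/* Step (C) in the diagram: refuse to install a mapping for an invalid slot. */
bool demo_set_spte(struct demo_memslot *slot)
{
	if (slot->flags & DEMO_MEMSLOT_INVALID)
		return false;

	slot->mappings_installed++;
	return true;
}

/* Walk every slot that covers gfn; return how many mappings were installed. */
size_t demo_change_pte(struct demo_memslot *slots, size_t nslots, unsigned long gfn)
{
	size_t installed = 0;

	for (size_t i = 0; i < nslots; i++) {
		struct demo_memslot *slot = &slots[i];

		if (gfn < slot->base_gfn || gfn >= slot->base_gfn + slot->npages)
			continue;
		if (demo_set_spte(slot))
			installed++;
	}
	return installed;
}

/* One valid slot gets a mapping, one slot under deletion does not. */
void demo_selfcheck(void)
{
	struct demo_memslot slots[2] = {
		{ 0x100, 16, 0, 0 },
		{ 0x200, 16, DEMO_MEMSLOT_INVALID, 0 },
	};

	assert(demo_change_pte(slots, 2, 0x105) == 1);
	assert(demo_change_pte(slots, 2, 0x205) == 0);
}
```

With the check in place, an access to a frame in the slot being deleted finds no mapping, which in the real system is what lets the read be forwarded to QEMU.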
    • btrfs: fix remaining u32 overflows when left shifting stripe_nr · cb091225
      Qu Wenruo authored
      There was a regression caused by a97699d1 ("btrfs: replace
      map_lookup->stripe_len by BTRFS_STRIPE_LEN") and supposedly fixed by
      a7299a18 ("btrfs: fix u32 overflows when left shifting stripe_nr").
      To avoid code churn the fix was open coding the type casts but
      unfortunately missed one which was still possible to hit [1].
      
      The missing place was assignment of bioc->full_stripe_logical inside
      btrfs_map_block().
      
      Fix it by adding a helper that does the safe calculation of the offset
      and use it everywhere even though it may not be strictly necessary due
      to already using u64 types.  This replaces all remaining
      "<< BTRFS_STRIPE_LEN_SHIFT" calls.
      
      [1] https://lore.kernel.org/linux-btrfs/20230622065438.86402-1-wqu@suse.com/
      
      Fixes: a7299a18 ("btrfs: fix u32 overflows when left shifting stripe_nr")
      Signed-off-by: Qu Wenruo <wqu@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      [ update changelog ]
      Signed-off-by: David Sterba <dsterba@suse.com>
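The overflow this commit guards against, and the shape of a helper that avoids it, can be sketched in a few lines of standalone C. The function names are hypothetical, and the shift value of 16 (64 KiB stripes) is an assumption made for the sketch rather than something stated in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: 64 KiB stripes, i.e. a shift of 16. */
#define DEMO_STRIPE_LEN_SHIFT 16

/* Overflow-prone form: the shift is evaluated in 32 bits, then widened. */
static inline uint64_t demo_offset_truncating(uint32_t stripe_nr)
{
	return stripe_nr << DEMO_STRIPE_LEN_SHIFT;
}

/* Safe form: widen to 64 bits first, then shift. */
static inline uint64_t demo_stripe_nr_to_offset(uint32_t stripe_nr)
{
	return (uint64_t)stripe_nr << DEMO_STRIPE_LEN_SHIFT;
}

void demo_selfcheck(void)
{
	/* Stripe 65536 starts at byte offset 2^32, which does not fit in 32 bits. */
	assert(demo_stripe_nr_to_offset(65536u) == (1ull << 32));
	assert(demo_offset_truncating(65536u) == 0);
}
```

Routing every conversion through one helper, as the commit does, removes the need to remember the cast at each call site.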
    • block: make sure local irq is disabled when calling __blkcg_rstat_flush · 9c39b7a9
      Ming Lei authored
      When __blkcg_rstat_flush() is called from the cgroup_rstat_flush*() code
      path, interrupts are always disabled.
      
      Since commit 20cb1c2f ("blk-cgroup: Flush stats before releasing
      blkcg_gq"), we also flush the blkcg per-cpu stats list in
      __blkg_release() to avoid leaking blkcg_gq's reference. Local irqs are
      not disabled on that path, so a lockdep warning may be triggered because
      the dependent cgroup locks may be acquired from an irq (soft irq) handler.
      
      Fix the issue by always disabling local irqs.
      
      Fixes: 20cb1c2f ("blk-cgroup: Flush stats before releasing blkcg_gq")
      Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Closes: https://lore.kernel.org/linux-block/pz2wzwnmn5tk3pwpskmjhli6g3qly7eoknilb26of376c7kwxy@qydzpvt6zpis/T/#u
      Cc: stable@vger.kernel.org
      Cc: Jay Shin <jaeshin@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Waiman Long <longman@redhat.com>
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Reviewed-by: Waiman Long <longman@redhat.com>
      Link: https://lore.kernel.org/r/20230622084249.1208005-1-ming.lei@redhat.com
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
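Because the flush function is reached both with interrupts already off and with interrupts on, the interrupt state has to be saved and restored rather than unconditionally re-enabled. The userspace model below tries to show that general difference; it does not reproduce the patch. The `demo_*` names are hypothetical and the boolean stands in for one CPU's interrupt-enable state.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of one CPU's "interrupts enabled" state; not a kernel API. */
static bool demo_irqs_enabled = true;

static inline void demo_irq_disable(void) { demo_irqs_enabled = false; }
static inline void demo_irq_enable(void)  { demo_irqs_enabled = true; }

/* Disable interrupts and report whether they were enabled before. */
static inline bool demo_irq_save(void)
{
	bool was_enabled = demo_irqs_enabled;

	demo_irqs_enabled = false;
	return was_enabled;
}

/* Put the interrupt state back to what demo_irq_save() observed. */
static inline void demo_irq_restore(bool was_enabled)
{
	demo_irqs_enabled = was_enabled;
}

void demo_flush_disable_enable(void)
{
	demo_irq_disable();
	/* ... critical section ... */
	demo_irq_enable();	/* wrong if the caller already had interrupts off */
}

void demo_flush_save_restore(void)
{
	bool flags = demo_irq_save();

	/* ... critical section ... */
	demo_irq_restore(flags);
}

/* Call flush with interrupts off; return true if they are still off afterwards. */
bool demo_caller_stays_protected(void (*flush)(void))
{
	bool still_off;

	demo_irqs_enabled = false;
	flush();
	still_off = !demo_irqs_enabled;
	demo_irqs_enabled = true;	/* reset the model for the next call */
	return still_off;
}

void demo_selfcheck(void)
{
	assert(!demo_caller_stays_protected(demo_flush_disable_enable));
	assert(demo_caller_stays_protected(demo_flush_save_restore));
}
```

The save/restore variant is safe from either calling context, which is why it suits a function with more than one kind of caller.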
    • Merge tag 'nf-23-06-21' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · 2ba7e7eb
      Paolo Abeni authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter/IPVS fixes for net
      
      This is v3, including a crash fix for patch 01/14.
      
      The following patchset contains Netfilter/IPVS fixes for net:
      
      1) Fix UDP segmentation with IPVS tunneled traffic, from Terin Stock.
      
      2) Fix chain binding transaction logic, add a bound flag to rule
         transactions. Remove incorrect logic in nft_data_hold() and
         nft_data_release().
      
      3) Add a NFT_TRANS_PREPARE_ERROR deactivate state to deal with releasing
         the set/chain as a follow up to 1240eb93 ("netfilter: nf_tables:
         incorrect error path handling with NFT_MSG_NEWRULE")
      
      4) Drop map element references from preparation phase instead of
         set destroy path, otherwise bogus EBUSY with transactions such as:
      
              flush chain ip x y
              delete chain ip x w
      
         where chain ip x y contains jump/goto from set elements.
      
      5) The pipapo set type does not take the generation mask into account
         during walk iteration.
      
      6) Fix reference count underflow in set element reference to
         stateful object.
      
      7) Several patches to tighten the nf_tables API:
         - disallow set element updates of bound anonymous set
         - disallow unbound anonymous set/chain at the end of transaction.
         - disallow updates of anonymous set.
         - disallow timeout configuration for anonymous sets.
      
      8) Fix module reference leak in chain updates.
      
      9) Fix nfnetlink_osf module autoload.
      
      10) Fix deletion of basechain when NFTA_CHAIN_HOOK is specified as
          in iptables-nft.
      
      This Netfilter batch is larger than usual at this stage. I am aware we
      are fairly late in the -rc cycle; if you prefer to route them through
      net-next, please let me know.
      
      netfilter pull request 23-06-21
      
      * tag 'nf-23-06-21' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
        netfilter: nf_tables: Fix for deleting base chains with payload
        netfilter: nfnetlink_osf: fix module autoload
        netfilter: nf_tables: drop module reference after updating chain
        netfilter: nf_tables: disallow timeout for anonymous sets
        netfilter: nf_tables: disallow updates of anonymous sets
        netfilter: nf_tables: reject unbound chain set before commit phase
        netfilter: nf_tables: reject unbound anonymous set before commit phase
        netfilter: nf_tables: disallow element updates of bound anonymous sets
        netfilter: nf_tables: fix underflow in object reference counter
        netfilter: nft_set_pipapo: .walk does not deal with generations
        netfilter: nf_tables: drop map element references from preparation phase
        netfilter: nf_tables: add NFT_TRANS_PREPARE_ERROR to deal with bound set/chain
        netfilter: nf_tables: fix chain binding transaction logic
        ipvs: align inner_mac_header for encapsulation
      ====================
      
      Link: https://lore.kernel.org/r/20230621100731.68068-1-pablo@netfilter.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      2ba7e7eb
    • Maciej Żenczykowski's avatar
      revert "net: align SO_RCVMARK required privileges with SO_MARK" · a9628e88
      Maciej Żenczykowski authored
      This reverts commit 1f86123b ("net: align SO_RCVMARK required
      privileges with SO_MARK") because the reasoning in the commit message
      is not really correct:
        SO_RCVMARK is used for 'reading' incoming skb mark (via cmsg), as such
        it is more equivalent to 'getsockopt(SO_MARK)' which has no priv check
        and retrieves the socket mark, rather than 'setsockopt(SO_MARK)' which
        sets the socket mark and does require privs.
      
        Additionally incoming skb->mark may already be visible if
        sysctl_fwmark_reflect and/or sysctl_tcp_fwmark_accept are enabled.
      
        Furthermore, it is easier to block the getsockopt via bpf
        (either cgroup setsockopt hook, or via syscall filters)
        than to unblock it if it requires CAP_NET_RAW/ADMIN.
      
      On Android the socket mark is (among other things) used to store
      the network identifier a socket is bound to.  Setting it is privileged,
      but retrieving it is not.  We'd like unprivileged userspace to be able
      to read the network id of incoming packets (where mark is set via
      iptables [to be moved to bpf])...
      
      An alternative would be to add another sysctl to control whether
      setting SO_RCVMARK is privileged or not
      (or even a mask of which bits in the mark can be exposed),
      but this seems like over-engineering...
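For readers unfamiliar with how the mark is read "via cmsg", here is a minimal userspace sketch of the receiving side. It assumes the mark is delivered as a SOL_SOCKET/SO_MARK control message carrying a 32-bit value; the helper name extract_rcvmark() is hypothetical and the fallback definition of SO_MARK is the asm-generic value.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

#ifndef SO_MARK
#define SO_MARK 36  /* asm-generic value, used only if the headers lack it */
#endif

/* Scan the ancillary data of a received message for a SOL_SOCKET/SO_MARK
 * control message and copy out the 32-bit mark.
 * Returns 1 if a mark was found, 0 otherwise. */
static int extract_rcvmark(struct msghdr *msg, uint32_t *mark)
{
    struct cmsghdr *cmsg;

    for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
        if (cmsg->cmsg_level == SOL_SOCKET &&
            cmsg->cmsg_type == SO_MARK &&
            cmsg->cmsg_len >= CMSG_LEN(sizeof(*mark))) {
            memcpy(mark, CMSG_DATA(cmsg), sizeof(*mark));
            return 1;
        }
    }
    return 0;
}
```

A caller would enable SO_RCVMARK with setsockopt() on the socket, pass a control buffer to recvmsg(), and then hand the filled msghdr to a helper like this one.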
      
      Note: This is a non-trivial revert, due to later merged commit e42c7bee
      ("bpf: net: Consider has_current_bpf_ctx() when testing capable() in sk_setsockopt()")
      which changed both 'ns_capable' into 'sockopt_ns_capable' calls.
      
      Fixes: 1f86123b ("net: align SO_RCVMARK required privileges with SO_MARK")
      Cc: Larysa Zaremba <larysa.zaremba@intel.com>
      Cc: Simon Horman <simon.horman@corigine.com>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: Eyal Birger <eyal.birger@gmail.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Patrick Rohr <prohr@google.com>
      Signed-off-by: Maciej Żenczykowski <maze@google.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://lore.kernel.org/r/20230618103130.51628-1-maze@google.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      a9628e88
    • Kees Cook's avatar
      net: wwan: iosm: Convert single instance struct member to flexible array · dec24b3b
      Kees Cook authored
      struct mux_adth actually ends with multiple struct mux_adth_dg members.
      This is seen both in the comments about the member:
      
      /**
       * struct mux_adth - Structure of the Aggregated Datagram Table Header.
       ...
       * @dg:		datagramm table with variable length
       */
      
      and in the preparation for populating it:
      
                              adth_dg_size = offsetof(struct mux_adth, dg) +
                                              ul_adb->dg_count[i] * sizeof(*dg);
      			...
                              adth_dg_size -= offsetof(struct mux_adth, dg);
                              memcpy(&adth->dg, ul_adb->dg[i], adth_dg_size);
      
      This was reported as a run-time false positive warning:
      
      memcpy: detected field-spanning write (size 16) of single field "&adth->dg" at drivers/net/wwan/iosm/iosm_ipc_mux_codec.c:852 (size 8)
      
      Adjust the struct mux_adth definition and associated sizeof() math; no binary
      output differences are observed in the resulting object file.
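
The general pattern behind this kind of conversion can be sketched in plain C. The struct layouts and names below are hypothetical stand-ins, not the iosm definitions; the point is only that a trailing flexible array member lets the size math and the copy agree with the declared type.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical 8-byte table entry, standing in for struct mux_adth_dg. */
struct example_dg {
    uint32_t index;
    uint32_t length;
};

/* Hypothetical header ending in a flexible array member rather than a
 * single trailing element. */
struct example_hdr {
    uint32_t signature;
    uint16_t count;
    uint16_t reserved;
    struct example_dg dg[];
};

/* Total allocation size for a header followed by n entries. */
static size_t example_hdr_size(size_t n)
{
    return offsetof(struct example_hdr, dg) + n * sizeof(struct example_dg);
}

/* Allocate a header and copy n entries into its trailing array.
 * The caller releases the result with free(). */
static struct example_hdr *example_hdr_build(const struct example_dg *src,
                                             uint16_t n)
{
    struct example_hdr *hdr = calloc(1, example_hdr_size(n));

    if (!hdr)
        return NULL;
    hdr->count = n;
    if (n)
        memcpy(hdr->dg, src, n * sizeof(*src));
    return hdr;
}
```

With a flexible array member the header's sizeof() no longer includes one entry, so any size computation that previously relied on sizeof(struct) has to be revisited, which matches the "associated sizeof() math" mentioned above.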
      Reported-by: Florian Klink <flokli@flokli.de>
      Closes: https://lore.kernel.org/lkml/dbfa25f5-64c8-5574-4f5d-0151ba95d232@gmail.com/
      Fixes: 1f52d7b6 ("net: wwan: iosm: Enable M.2 7360 WWAN card support")
      Cc: M Chetan Kumar <m.chetan.kumar@intel.com>
      Cc: Bagas Sanjaya <bagasdotme@gmail.com>
      Cc: Intel Corporation <linuxwwan@intel.com>
      Cc: Loic Poulain <loic.poulain@linaro.org>
      Cc: Sergey Ryazanov <ryazanov.s.a@gmail.com>
      Cc: Johannes Berg <johannes@sipsolutions.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
      Cc: netdev@vger.kernel.org
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Link: https://lore.kernel.org/r/20230620194234.never.023-kees@kernel.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      dec24b3b
    • Eric Dumazet's avatar
      sch_netem: acquire qdisc lock in netem_change() · 2174a08d
      Eric Dumazet authored
      syzbot managed to trigger a divide error [1] in netem.
      
      It could happen if q->rate changes while netem_enqueue()
      is running, since q->rate is read twice.
      
      It turns out netem_change() always lacked proper synchronization.
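
As a userspace analogy of the problem (not the kernel change itself), the sketch below shows why the writer and the reader must share a lock: the zero check and the division then observe the same value. The struct, the function names, and the use of a pthread mutex in place of the qdisc lock are all assumptions made for illustration.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical shaper state; the mutex stands in for the qdisc lock. */
struct example_shaper {
    pthread_mutex_t lock;
    uint64_t rate;          /* bytes per second, 0 means "no rate limit" */
};

/* Writer: update the configuration while holding the lock. */
static void example_shaper_change(struct example_shaper *s, uint64_t rate)
{
    pthread_mutex_lock(&s->lock);
    s->rate = rate;
    pthread_mutex_unlock(&s->lock);
}

/* Reader: the zero check and the division see the same value because
 * the writer cannot run in between. Returns 0 when no rate is set. */
static uint64_t example_packet_time_ns(struct example_shaper *s, uint64_t len)
{
    uint64_t ns = 0;

    pthread_mutex_lock(&s->lock);
    if (s->rate)
        ns = len * 1000000000ULL / s->rate;
    pthread_mutex_unlock(&s->lock);
    return ns;
}
```

Without the lock in the writer, the rate could become zero between the check and the division, which is the shape of the divide error shown in the trace.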
      
      [1]
      divide error: 0000 [#1] SMP KASAN
      CPU: 1 PID: 7867 Comm: syz-executor.1 Not tainted 6.1.30-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/25/2023
      RIP: 0010:div64_u64 include/linux/math64.h:69 [inline]
      RIP: 0010:packet_time_ns net/sched/sch_netem.c:357 [inline]
      RIP: 0010:netem_enqueue+0x2067/0x36d0 net/sched/sch_netem.c:576
      Code: 89 e2 48 69 da 00 ca 9a 3b 42 80 3c 28 00 4c 8b a4 24 88 00 00 00 74 0d 4c 89 e7 e8 c3 4f 3b fd 48 8b 4c 24 18 48 89 d8 31 d2 <49> f7 34 24 49 01 c7 4c 8b 64 24 48 4d 01 f7 4c 89 e3 48 c1 eb 03
      RSP: 0018:ffffc9000dccea60 EFLAGS: 00010246
      RAX: 000001a442624200 RBX: 000001a442624200 RCX: ffff888108a4f000
      RDX: 0000000000000000 RSI: 000000000000070d RDI: 000000000000070d
      RBP: ffffc9000dcceb90 R08: ffffffff849c5e26 R09: fffffbfff10e1297
      R10: 0000000000000000 R11: dffffc0000000001 R12: ffff888108a4f358
      R13: dffffc0000000000 R14: 0000001a8cd9a7ec R15: 0000000000000000
      FS: 00007fa73fe18700(0000) GS:ffff8881f6b00000(0000) knlGS:0000000000000000
      CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fa73fdf7718 CR3: 000000011d36e000 CR4: 0000000000350ee0
      Call Trace:
      <TASK>
      [<ffffffff84714385>] __dev_xmit_skb net/core/dev.c:3931 [inline]
      [<ffffffff84714385>] __dev_queue_xmit+0xcf5/0x3370 net/core/dev.c:4290
      [<ffffffff84d22df2>] dev_queue_xmit include/linux/netdevice.h:3030 [inline]
      [<ffffffff84d22df2>] neigh_hh_output include/net/neighbour.h:531 [inline]
      [<ffffffff84d22df2>] neigh_output include/net/neighbour.h:545 [inline]
      [<ffffffff84d22df2>] ip_finish_output2+0xb92/0x10d0 net/ipv4/ip_output.c:235
      [<ffffffff84d21e63>] __ip_finish_output+0xc3/0x2b0
      [<ffffffff84d10a81>] ip_finish_output+0x31/0x2a0 net/ipv4/ip_output.c:323
      [<ffffffff84d10f14>] NF_HOOK_COND include/linux/netfilter.h:298 [inline]
      [<ffffffff84d10f14>] ip_output+0x224/0x2a0 net/ipv4/ip_output.c:437
      [<ffffffff84d123b5>] dst_output include/net/dst.h:444 [inline]
      [<ffffffff84d123b5>] ip_local_out net/ipv4/ip_output.c:127 [inline]
      [<ffffffff84d123b5>] __ip_queue_xmit+0x1425/0x2000 net/ipv4/ip_output.c:542
      [<ffffffff84d12fdc>] ip_queue_xmit+0x4c/0x70 net/ipv4/ip_output.c:556
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Link: https://lore.kernel.org/r/20230620184425.1179809-1-edumazet@google.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      2174a08d
    • Shyam Sundar S K's avatar
      platform/x86/amd/pmf: Register notify handler only if SPS is enabled · 146b6f68
      Shyam Sundar S K authored
      The power source notify handler is getting registered even when none of
      the PMF features is enabled, leading to a crash.
      
      ...
      [   22.592162] Call Trace:
      [   22.592164]  <TASK>
      [   22.592164]  ? rcu_note_context_switch+0x5e0/0x660
      [   22.592166]  ? __warn+0x81/0x130
      [   22.592171]  ? rcu_note_context_switch+0x5e0/0x660
      [   22.592172]  ? report_bug+0x171/0x1a0
      [   22.592175]  ? prb_read_valid+0x1b/0x30
      [   22.592177]  ? handle_bug+0x3c/0x80
      [   22.592178]  ? exc_invalid_op+0x17/0x70
      [   22.592179]  ? asm_exc_invalid_op+0x1a/0x20
      [   22.592182]  ? rcu_note_context_switch+0x5e0/0x660
      [   22.592183]  ? acpi_ut_delete_object_desc+0x86/0xb0
      [   22.592186]  ? acpi_ut_update_ref_count.part.0+0x22d/0x930
      [   22.592187]  __schedule+0xc0/0x1410
      [   22.592189]  ? ktime_get+0x3c/0xa0
      [   22.592191]  ? lapic_next_event+0x1d/0x30
      [   22.592193]  ? hrtimer_start_range_ns+0x25b/0x350
      [   22.592196]  schedule+0x5e/0xd0
      [   22.592197]  schedule_hrtimeout_range_clock+0xbe/0x140
      [   22.592199]  ? __pfx_hrtimer_wakeup+0x10/0x10
      [   22.592200]  usleep_range_state+0x64/0x90
      [   22.592203]  amd_pmf_send_cmd+0x106/0x2a0 [amd_pmf bddfe0fe3712aaa99acce3d5487405c5213c6616]
      [   22.592207]  amd_pmf_update_slider+0x56/0x1b0 [amd_pmf bddfe0fe3712aaa99acce3d5487405c5213c6616]
      [   22.592210]  amd_pmf_set_sps_power_limits+0x72/0x80 [amd_pmf bddfe0fe3712aaa99acce3d5487405c5213c6616]
      [   22.592213]  amd_pmf_pwr_src_notify_call+0x49/0x90 [amd_pmf bddfe0fe3712aaa99acce3d5487405c5213c6616]
      [   22.592216]  notifier_call_chain+0x5a/0xd0
      [   22.592218]  atomic_notifier_call_chain+0x32/0x50
      ...
      
      Fix this by registering the power source change notify handler only
      when SPS (Static Slider) is advertised as supported.
      Reported-by: Allen Zhong <allen@atr.me>
      Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217571
      Fixes: 4c71ae41 ("platform/x86/amd/pmf: Add support SPS PMF feature")
      Tested-by: Patil Rajesh Reddy <Patil.Reddy@amd.com>
      Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
      Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
      Link: https://lore.kernel.org/r/20230622060309.310001-1-Shyam-sundar.S-k@amd.com
      Reviewed-by: Hans de Goede <hdegoede@redhat.com>
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
      146b6f68
    • Danielle Ratson's avatar
      selftests: forwarding: Fix race condition in mirror installation · c7c059fb
      Danielle Ratson authored
      When mirroring to a gretap in hardware the device expects to be
      programmed with the egress port and all the encapsulating headers. This
      requires the driver to resolve the path the packet will take in the
      software data path and program the device accordingly.
      
      If the path cannot be resolved (in this case because of an unresolved
      neighbor), then mirror installation fails until the path is resolved.
      This results in a race that causes the test to sometimes fail.
      
      Fix this by setting the neighbor's state to permanent in a couple of
      tests, so that it is always valid.
      
      Fixes: 35c31d5c ("selftests: forwarding: Test mirror-to-gretap w/ UL 802.1d")
      Fixes: 239e754a ("selftests: forwarding: Test mirror-to-gretap w/ UL 802.1q")
      Signed-off-by: Danielle Ratson <danieller@nvidia.com>
      Reviewed-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Link: https://lore.kernel.org/r/268816ac729cb6028c7a34d4dda6f4ec7af55333.1687264607.git.petrm@nvidia.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      c7c059fb
    • Benjamin Berg's avatar
      wifi: mac80211: report all unusable beacon frames · 7f4e0970
      Benjamin Berg authored
      Properly check for RX_DROP_UNUSABLE now that the new drop reason
      infrastructure is used. Without this change, the comparison will always
      be false as a more specific reason is given in the lower bits of result.
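
The comparison problem can be sketched as follows. The constants and helper names are hypothetical and chosen only to show the idea that a result carrying a specific reason in its lower bits never equals the bare class value, so the class has to be extracted with a mask first.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoding: class in the upper 16 bits, specific reason in
 * the lower 16 bits. */
#define EXAMPLE_DROP_CLASS_MASK 0xffff0000u
#define EXAMPLE_DROP_MONITOR    0x00010000u
#define EXAMPLE_DROP_UNUSABLE   0x00020000u

/* Broken check: fails as soon as a specific reason is set in the low bits. */
static bool example_is_unusable_exact(uint32_t result)
{
    return result == EXAMPLE_DROP_UNUSABLE;
}

/* Working check: compare only the class bits. */
static bool example_is_unusable_masked(uint32_t result)
{
    return (result & EXAMPLE_DROP_CLASS_MASK) == EXAMPLE_DROP_UNUSABLE;
}
```

For a result such as EXAMPLE_DROP_UNUSABLE | 5, the exact comparison reports false while the masked one reports true.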
      
      Fixes: baa951a1 ("mac80211: use the new drop reasons infrastructure")
      Signed-off-by: Benjamin Berg <benjamin.berg@intel.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Link: https://lore.kernel.org/r/20230621120543.412920-2-johannes@sipsolutions.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      7f4e0970
    • Jakub Kicinski's avatar
      Merge branch 'mptcp-fixes-for-6-4' · 533aa0ba
      Jakub Kicinski authored
      Matthieu Baerts says:
      
      ====================
      mptcp: fixes for 6.4
      
      Patch 1 correctly handles disconnect() failures that can happen in some
      specific cases: now the socket state is set as unconnected as expected.
      That fixes an issue introduced in v6.2.
      
      Patch 2 fixes a divide by zero bug in mptcp_recvmsg() with a fix similar
      to a recent one from Eric Dumazet for TCP introducing sk_wait_pending
      flag. It should address an issue present in MPTCP from almost the
      beginning, from v5.9.
      
      Patch 3 fixes a possible list corruption on passive MPJ even if the race
      seems very unlikely, better be safe than sorry. The possible issue is
      present from v5.17.
      
      Patch 4 consolidates fallback and non fallback state machines to avoid
      leaking some MPTCP sockets. The fix is likely needed for versions from
      v5.11.
      
      Patch 5 drops code that is no longer used after the introduction of
      patch 4/6. This is not really a fix but this patch can probably land in
      the -net tree as well not to leave unused code.
      
      Patch 6 ensures listeners are unhashed before updating their sk status
      to avoid possible deadlocks when diag info are going to be retrieved
      with a lock. Even if it should not be visible with the way we are
      currently getting diag info, the issue is present from v5.17.
      ====================
      
      Link: https://lore.kernel.org/r/20230620-upstream-net-20230620-misc-fixes-for-v6-4-v1-0-f36aa5eae8b9@tessares.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      533aa0ba
    • Paolo Abeni's avatar
      mptcp: ensure listener is unhashed before updating the sk status · 57fc0f1c
      Paolo Abeni authored
      The MPTCP protocol accesses the listener subflow in a lockless
      manner in a couple of places (poll, diag). That works only if
      the msk itself leaves the listener status only after the
      subflow itself has been closed/disconnected. Otherwise we risk
      a deadlock in diag, as reported by Christoph.
      
      Address the issue by ensuring that the first subflow (the listener
      one) is always disconnected before updating the msk socket status.
      Reported-by: Christoph Paasch <cpaasch@apple.com>
      Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/407
      Fixes: b29fcfb5 ("mptcp: full disconnect implementation")
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      57fc0f1c
    • Paolo Abeni's avatar
      mptcp: drop legacy code around RX EOF · b7535cfe
      Paolo Abeni authored
      Thanks to the previous patch -- "mptcp: consolidate fallback and non
      fallback state machine" -- we can finally drop the "temporary hack"
      used to detect rx eof.
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Mat Martineau <martineau@kernel.org>
      Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      b7535cfe
    • Paolo Abeni's avatar
      mptcp: consolidate fallback and non fallback state machine · 81c1d029
      Paolo Abeni authored
      An orphaned msk releases the used resources via the worker,
      when the latter first sees the msk in CLOSED status.
      
      If the msk status transitions to TCP_CLOSE in the release callback
      invoked by the worker's final release_sock(), such instance of the
      workqueue will not take any action.
      
      Additionally the MPTCP code prevents scheduling the worker once the
      socket reaches the CLOSE status: such msk resources will be leaked.
      
      The only code path that can trigger the above scenario is the
      __mptcp_check_send_data_fin() in fallback mode.
      
      Address the issue by removing the special handling of fallback sockets
      in __mptcp_check_send_data_fin(), consolidating the state machine
      for fallback and non fallback sockets.
      
      Since non-fallback sockets do not send and do not receive data_fin,
      the mptcp code can update the msk internal status to match the next
      step in the SM every time data fin (ack) should be generated or
      received.
      
      As a consequence we can remove a bunch of checks for fallback from
      the fastpath.
      
      Fixes: 6e628cd3 ("mptcp: use mptcp release_cb for delayed tasks")
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Mat Martineau <martineau@kernel.org>
      Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      81c1d029
    • Paolo Abeni's avatar
      mptcp: fix possible list corruption on passive MPJ · 56a666c4
      Paolo Abeni authored
      At passive MPJ time, if the msk socket lock is held by the user,
      the new subflow is appended to the msk->join_list under the msk
      data lock.
      
      In mptcp_release_cb()/__mptcp_flush_join_list(), the subflows in
      that list are moved from the join_list into the conn_list under the
      msk socket lock.
      
      Append and removal could race, possibly corrupting such list.
      Address the issue by splicing the join list into a temporary one while
      still under the msk data lock.
      
      Found by code inspection, the race itself should be almost impossible
      to trigger in practice.
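
The "splice into a temporary list under the lock" idea can be sketched in userspace C. Everything here is an assumption for illustration: the node and msk types are hypothetical, a pthread mutex stands in for the msk data lock, and the caller is assumed to hold whatever protects conn_list.

```c
#include <pthread.h>
#include <stddef.h>

struct example_node {
    struct example_node *next;
    int id;
};

struct example_msk {
    pthread_mutex_t data_lock;
    struct example_node *join_list;  /* modified only under data_lock */
    struct example_node *conn_list;  /* assumed protected by the caller */
};

/* Producer: add a new subflow to the join list under the data lock. */
static void example_join_add(struct example_msk *m, struct example_node *n)
{
    pthread_mutex_lock(&m->data_lock);
    n->next = m->join_list;
    m->join_list = n;
    pthread_mutex_unlock(&m->data_lock);
}

/* Consumer: detach the whole join list while holding the data lock, then
 * move the detached nodes to conn_list without touching join_list again.
 * Returns the number of nodes moved. */
static int example_flush_join_list(struct example_msk *m)
{
    struct example_node *tmp, *n;
    int moved = 0;

    pthread_mutex_lock(&m->data_lock);
    tmp = m->join_list;
    m->join_list = NULL;
    pthread_mutex_unlock(&m->data_lock);

    while ((n = tmp) != NULL) {
        tmp = n->next;
        n->next = m->conn_list;
        m->conn_list = n;
        moved++;
    }
    return moved;
}
```

Because the consumer only walks its private temporary list, a concurrent producer can keep adding to join_list without the two sides ever modifying the same list at once.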
      
      Fixes: 3e501490 ("mptcp: cleanup MPJ subflow list handling")
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      56a666c4
    • Paolo Abeni's avatar
      mptcp: fix possible divide by zero in recvmsg() · 0ad529d9
      Paolo Abeni authored
      Christoph reported a divide by zero bug in mptcp_recvmsg():
      
      divide error: 0000 [#1] PREEMPT SMP
      CPU: 1 PID: 19978 Comm: syz-executor.6 Not tainted 6.4.0-rc2-gffcc7899081b #20
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
      RIP: 0010:__tcp_select_window+0x30e/0x420 net/ipv4/tcp_output.c:3018
      Code: 11 ff 0f b7 cd c1 e9 0c b8 ff ff ff ff d3 e0 89 c1 f7 d1 01 cb 21 c3 eb 17 e8 2e 83 11 ff 31 db eb 0e e8 25 83 11 ff 89 d8 99 <f7> 7c 24 04 29 d3 65 48 8b 04 25 28 00 00 00 48 3b 44 24 10 75 60
      RSP: 0018:ffffc90000a07a18 EFLAGS: 00010246
      RAX: 000000000000ffd7 RBX: 000000000000ffd7 RCX: 0000000000040000
      RDX: 0000000000000000 RSI: 000000000003ffff RDI: 0000000000040000
      RBP: 000000000000ffd7 R08: ffffffff820cf297 R09: 0000000000000001
      R10: 0000000000000000 R11: ffffffff8103d1a0 R12: 0000000000003f00
      R13: 0000000000300000 R14: ffff888101cf3540 R15: 0000000000180000
      FS:  00007f9af4c09640(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000001b33824000 CR3: 000000012f241001 CR4: 0000000000170ee0
      Call Trace:
       <TASK>
       __tcp_cleanup_rbuf+0x138/0x1d0 net/ipv4/tcp.c:1611
       mptcp_recvmsg+0xcb8/0xdd0 net/mptcp/protocol.c:2034
       inet_recvmsg+0x127/0x1f0 net/ipv4/af_inet.c:861
       ____sys_recvmsg+0x269/0x2b0 net/socket.c:1019
       ___sys_recvmsg+0xe6/0x260 net/socket.c:2764
       do_recvmmsg+0x1a5/0x470 net/socket.c:2858
       __do_sys_recvmmsg net/socket.c:2937 [inline]
       __se_sys_recvmmsg net/socket.c:2953 [inline]
       __x64_sys_recvmmsg+0xa6/0x130 net/socket.c:2953
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x47/0xa0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x72/0xdc
      RIP: 0033:0x7f9af58fc6a9
      Code: 5c c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 4f 37 0d 00 f7 d8 64 89 01 48
      RSP: 002b:00007f9af4c08cd8 EFLAGS: 00000246 ORIG_RAX: 000000000000012b
      RAX: ffffffffffffffda RBX: 00000000006bc050 RCX: 00007f9af58fc6a9
      RDX: 0000000000000001 RSI: 0000000020000140 RDI: 0000000000000004
      RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000f00 R11: 0000000000000246 R12: 00000000006bc05c
      R13: fffffffffffffea8 R14: 00000000006bc050 R15: 000000000001fe40
       </TASK>
      
      mptcp_recvmsg is allowed to release the msk socket lock when
      blocking, and before re-acquiring it another thread could have
      switched the sock to TCP_LISTEN status - with a prior
      connect(AF_UNSPEC) - also clearing icsk_ack.rcv_mss.
      
      Address the issue by preventing the disconnect if some other process is
      concurrently performing a blocking syscall on the same socket, like
      commit 4faeee0c ("tcp: deny tcp_disconnect() when threads are waiting").
      
      Fixes: a6b118fe ("mptcp: add receive buffer auto-tuning")
      Cc: stable@vger.kernel.org
      Reported-by: Christoph Paasch <cpaasch@apple.com>
      Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/404
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Tested-by: Christoph Paasch <cpaasch@apple.com>
      Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      0ad529d9
    • Paolo Abeni's avatar
      mptcp: handle correctly disconnect() failures · c2b2ae39
      Paolo Abeni authored
      Currently the mptcp code assumes that disconnect() can fail only
      at mptcp_sendmsg_fastopen() time - to avoid a deadlock scenario - and
      doesn't even bother returning an error code.
      
      Soon mptcp_disconnect() will handle more error conditions: let's track
      them explicitly.
      
      As a bonus, explicitly annotate TCP-level disconnect as not failing:
      the mptcp code never blocks for events on the subflows.
      
      Fixes: 7d803344 ("mptcp: fix deadlock in fastopen error path")
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Tested-by: Christoph Paasch <cpaasch@apple.com>
      Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      c2b2ae39
  3. 21 Jun, 2023 2 commits
    • Jakub Kicinski's avatar
      Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 59bb14bd
      Jakub Kicinski authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2023-06-21
      
      We've added 7 non-merge commits during the last 14 day(s) which contain
      a total of 7 files changed, 181 insertions(+), 15 deletions(-).
      
      The main changes are:
      
      1) Fix a verifier id tracking issue with scalars upon spill,
         from Maxim Mikityanskiy.
      
      2) Fix NULL dereference if an exception is generated while a BPF
         subprogram is running, from Krister Johansen.
      
      3) Fix a BTF verification failure when compiling kernel with LLVM_IAS=0,
         from Florent Revest.
      
      4) Fix expected_attach_type enforcement for kprobe_multi link,
         from Jiri Olsa.
      
      5) Fix a bpf_jit_dump issue for x86_64 to pick the correct JITed image,
         from Yonghong Song.
      
      * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
        bpf: Force kprobe multi expected_attach_type for kprobe_multi link
        bpf/btf: Accept function names that contain dots
        selftests/bpf: add a test for subprogram extables
        bpf: ensure main program has an extable
        bpf: Fix a bpf_jit_dump issue for x86_64 with sysctl bpf_jit_enable.
        selftests/bpf: Add test cases to assert proper ID tracking on spill
        bpf: Fix verifier id tracking of scalars on spill
      ====================
      
      Link: https://lore.kernel.org/r/20230621101116.16122-1-daniel@iogearbox.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      59bb14bd
    • Linus Torvalds's avatar
      Merge tag 'timers-urgent-2023-06-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · dad9774d
      Linus Torvalds authored
      Pull timer fix from Thomas Gleixner:
       "A single regression fix for a regression fix:
      
        For a long time the tick was aligned to clock MONOTONIC so that the
        tick event happened at a multiple of nanoseconds per tick starting
        from clock MONOTONIC = 0.
      
        At some point this changed as the refined jiffies clocksource which is
        used during boot before the TSC or other clocksources becomes usable,
        was adjusted with a boot offset, so that time 0 is closer to the point
        where the kernel starts.
      
        This broke the assumption in the tick code that when the tick setup
        happens early on ktime_get() will return a multiple of nanoseconds per
        tick. As a consequence applications which aligned their periodic
        execution so that it does not collide with the tick were no longer
        guaranteed that the tick period starts from time 0.
      
        The fix for this regression was to realign the tick when it is
        initially set up to a multiple of tick periods. That works as long as
        the underlying tick device supports periodic mode, but breaks under
        certain conditions when the tick device supports only one shot mode.
      
        Depending on the offset, the alignment delta to clock MONOTONIC can
        get in a range where the minimal programming delta of the underlying
        clock event device is larger than the calculated delta to the next
        tick. This results in a boot hang as the tick code tries to play catch
        up, but as the tick never fires jiffies are not advanced so it keeps
        trying forever.
      
        Solve this by moving the tick alignment into the NOHZ / HIGHRES
        enablement code because at that point it is guaranteed that the
        underlying clocksource is high resolution capable and no longer
        depending on the tick.
      
        This is far before user space starts, so at the point where
        applications try to align their timers, the old behaviour of the tick
        happening at a multiple of nanoseconds per tick starting from clock
        MONOTONIC = 0 is restored"
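
The alignment arithmetic described in the pull message can be sketched as a small helper. This is an illustration of "a multiple of nanoseconds per tick", not the kernel's implementation; the function name is hypothetical and tick_ns is assumed to be nonzero.

```c
#include <stdint.h>

/* Round a timestamp up to the next multiple of the tick period.
 * A timestamp that is already aligned is returned unchanged.
 * tick_ns must be nonzero. */
static uint64_t example_align_to_tick(uint64_t now_ns, uint64_t tick_ns)
{
    uint64_t rem = now_ns % tick_ns;

    return rem ? now_ns + (tick_ns - rem) : now_ns;
}
```

For example, with a 4 ms tick (an assumed HZ of 250), a timestamp of 10.5 ms is moved to 12 ms, while 8 ms stays at 8 ms.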
      
      * tag 'timers-urgent-2023-06-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        tick/common: Align tick period during sched_timer setup
      dad9774d