1. 01 Jul, 2024 3 commits
    • sched: Move psi_account_irqtime() out of update_rq_clock_task() hotpath · ddae0ca2
      John Stultz authored
      It was reported that in moving to 6.1, a larger than 10%
      regression was seen in the performance of
      clock_gettime(CLOCK_THREAD_CPUTIME_ID,...).
      
      Using a simple reproducer, I found:
      5.10:
      100000000 calls in 24345994193 ns => 243.460 ns per call
      100000000 calls in 24288172050 ns => 242.882 ns per call
      100000000 calls in 24289135225 ns => 242.891 ns per call
      
      6.1:
      100000000 calls in 28248646742 ns => 282.486 ns per call
      100000000 calls in 28227055067 ns => 282.271 ns per call
      100000000 calls in 28177471287 ns => 281.775 ns per call
      
      The cause of this was finally narrowed down to the addition of
      psi_account_irqtime() in update_rq_clock_task(), in commit
      52b1364b ("sched/psi: Add PSI_IRQ to track IRQ/SOFTIRQ
      pressure").
      
      In my initial attempt to resolve this, I leaned towards moving
      all accounting work out of the clock_gettime() call path, but it
      wasn't very pretty, so it will have to wait for a later deeper
      rework. Instead, Peter shared this approach:
      
      Rework psi_account_irqtime() to use its own psi_irq_time base
      for accounting, and move it out of the hotpath, calling it
      instead from sched_tick() and __schedule().
      
      In testing this, we found the importance of ensuring
      psi_account_irqtime() is run under the rq_lock, which Johannes
      Weiner helpfully explained, so also add some lockdep annotations
      to make that requirement clear.
      
      With this change the performance is back in line with 5.10:
      6.1+fix:
      100000000 calls in 24297324597 ns => 242.973 ns per call
      100000000 calls in 24318869234 ns => 243.189 ns per call
      100000000 calls in 24291564588 ns => 242.916 ns per call
      Reported-by: Jimmy Shiu <jimmyshiu@google.com>
      Originally-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: John Stultz <jstultz@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev>
      Reviewed-by: Qais Yousef <qyousef@layalina.io>
      Link: https://lore.kernel.org/r/20240618215909.4099720-1-jstultz@google.com
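The commit message quotes per-call timings but does not include the reproducer itself. A minimal sketch of such a measurement loop, assuming a Unix system and using hypothetical helper names (`run_benchmark`, `format_result`), might look like the following. Interpreter overhead adds a roughly constant cost to every call, so the absolute numbers are not comparable with the figures in the commit; only the difference between two kernels measured the same way is meaningful.

```python
import time


def run_benchmark(calls):
    """Time `calls` reads of the calling thread's CPU clock.

    Returns the elapsed wall-clock time in nanoseconds.
    """
    clock_id = time.CLOCK_THREAD_CPUTIME_ID  # Unix-only constant
    read_clock = time.clock_gettime_ns
    start = time.monotonic_ns()
    for _ in range(calls):
        read_clock(clock_id)
    return time.monotonic_ns() - start


def format_result(calls, elapsed_ns):
    """Render one result line in the same layout as the commit message."""
    return f"{calls} calls in {elapsed_ns} ns => {elapsed_ns / calls:.3f} ns per call"


if __name__ == "__main__":
    calls = 1_000_000  # far fewer than the 100000000 calls used in the commit
    print(format_result(calls, run_benchmark(calls)))
```

Running the script several times on each kernel, as the commit does, helps show whether the run-to-run spread is small compared with the regression being measured.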
    • sched/deadline: Fix task_struct reference leak · b58652db
      Wander Lairson Costa authored
      During the execution of the following stress test with linux-rt:
      
      stress-ng --cyclic 30 --timeout 30 --minimize --quiet
      
      kmemleak frequently reported a memory leak concerning the task_struct:
      
      unreferenced object 0xffff8881305b8000 (size 16136):
        comm "stress-ng", pid 614, jiffies 4294883961 (age 286.412s)
        object hex dump (first 32 bytes):
          02 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00  .@..............
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
        debug hex dump (first 16 bytes):
          53 09 00 00 00 00 00 00 00 00 00 00 00 00 00 00  S...............
        backtrace:
          [<00000000046b6790>] dup_task_struct+0x30/0x540
          [<00000000c5ca0f0b>] copy_process+0x3d9/0x50e0
          [<00000000ced59777>] kernel_clone+0xb0/0x770
          [<00000000a50befdc>] __do_sys_clone+0xb6/0xf0
          [<000000001dbf2008>] do_syscall_64+0x5d/0xf0
          [<00000000552900ff>] entry_SYSCALL_64_after_hwframe+0x6e/0x76
      
      The issue occurs in start_dl_timer(), which increments the task_struct
      reference count and sets a timer. The timer callback, dl_task_timer,
      is supposed to decrement the reference count upon expiration. However,
      if enqueue_task_dl() is called before the timer expires and cancels it,
      the reference count is not decremented, leading to the leak.
      
      This patch fixes the reference leak by ensuring the task_struct
      reference count is properly decremented when the timer is canceled.
      
      Fixes: feff2e65 ("sched/deadline: Unthrottle PI boosted threads while enqueuing")
      Signed-off-by: Wander Lairson Costa <wander@redhat.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Juri Lelli <juri.lelli@redhat.com>
      Link: https://lore.kernel.org/r/20240620125618.11419-1-wander@redhat.com
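The leak described in this commit follows a common pattern: a reference taken when a timer is armed must be dropped on every path that disarms it, not only in the expiry callback. The toy model below illustrates that pattern only. It is not kernel code, and the class and function names (`Task`, `start_timer`, `timer_expired`, `cancel_timer`) are hypothetical stand-ins for the roles played by start_dl_timer(), dl_task_timer and the cancellation in enqueue_task_dl().

```python
class Task:
    """Toy stand-in for a reference-counted task_struct."""

    def __init__(self):
        self.refcount = 1        # reference held by the creator
        self.timer_armed = False


def start_timer(task):
    """Arm the timer; the pending timer holds its own reference."""
    task.refcount += 1
    task.timer_armed = True


def timer_expired(task):
    """Expiry callback: drops the reference taken in start_timer()."""
    if task.timer_armed:
        task.timer_armed = False
        task.refcount -= 1


def cancel_timer(task, drop_ref):
    """Cancel a pending timer; return True if one was actually pending.

    With drop_ref=False the timer's reference is never released, which
    models the leak. With drop_ref=True the canceller releases it, which
    models the fix.
    """
    if not task.timer_armed:
        return False
    task.timer_armed = False
    if drop_ref:
        task.refcount -= 1
    return True


leaky = Task()
start_timer(leaky)
cancel_timer(leaky, drop_ref=False)
print("leaky refcount:", leaky.refcount)   # stays above 1: leaked reference

fixed = Task()
start_timer(fixed)
cancel_timer(fixed, drop_ref=True)
print("fixed refcount:", fixed.refcount)   # back to the creator's reference
```

In the model, the reference is only dropped when the cancellation really removed a pending timer, so a timer that already fired is not accounted for twice.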
    • Revert "sched/fair: Make sure to try to detach at least one movable task" · 2feab249
      Josh Don authored
      This reverts commit b0defa7a.
      
      b0defa7a changed the load balancing logic to ignore env.max_loop if
      all tasks examined to that point were pinned. The goal of the patch was
      to make it more likely to be able to detach a task buried in a long list
      of pinned tasks. However, this has the unfortunate side effect of
      creating an O(n) iteration in detach_tasks(), as we now must fully
      iterate every task on a cpu if all or most are pinned. Since this load
      balance code is done with rq lock held, and often in softirq context, it
      is very easy to trigger hard lockups. We observed such hard lockups with
      a user who affined O(10k) threads to a single cpu.
      
      When I discussed this with Vincent he initially suggested that we keep
      the limit on the number of tasks to detach, but increase the number of
      tasks we can search. However, after some back and forth on the mailing
      list, he recommended we instead revert the original patch, as it seems
      likely no one was actually getting hit by the original issue.
      
      Fixes: b0defa7a ("sched/fair: Make sure to try to detach at least one movable task")
      Signed-off-by: Josh Don <joshdon@google.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
      Link: https://lore.kernel.org/r/20240620214450.316280-1-joshdon@google.com
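The cost difference described in this revert can be illustrated with a toy scan over a run queue. This is a simplified model, not the logic of detach_tasks(): it stops at the first movable task, the function name `scan_for_movable` is hypothetical, and the limit of 32 is only an illustrative value standing in for env.max_loop.

```python
def scan_for_movable(pinned_flags, loop_max, ignore_limit_if_all_pinned):
    """Scan tasks in order, looking for the first one that is not pinned.

    pinned_flags: one boolean per task, True if the task is pinned.
    Returns (tasks_examined, index_of_movable_task_or_None).

    Because the scan stops at the first movable task, every task examined
    before that point is pinned, so ignore_limit_if_all_pinned=True means
    the loop_max bound never applies in this model.
    """
    examined = 0
    for index, pinned in enumerate(pinned_flags):
        if examined >= loop_max and not ignore_limit_if_all_pinned:
            break
        examined += 1
        if not pinned:
            return examined, index
    return examined, None


# 10000 pinned tasks on one CPU, as in the reported lockup scenario.
all_pinned = [True] * 10_000

bounded = scan_for_movable(all_pinned, loop_max=32, ignore_limit_if_all_pinned=False)
unbounded = scan_for_movable(all_pinned, loop_max=32, ignore_limit_if_all_pinned=True)
print("bounded scan examined:", bounded[0])
print("unbounded scan examined:", unbounded[0])
```

The bounded scan may miss a movable task buried behind many pinned ones, which is the case the reverted patch tried to handle. The unbounded scan finds it, but does work proportional to the queue length while the lock is held.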
  2. 23 Jun, 2024 8 commits
  3. 22 Jun, 2024 19 commits
  4. 21 Jun, 2024 10 commits
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 35bb670d
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Two fixes: one in the ufs driver fixing an obvious memory leak and the
        other (with a core flag based update) trying to prevent USB crashes by
        stopping the core from issuing a request for the I/O Hints mode page"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: usb: uas: Do not query the IO Advice Hints Grouping mode page for USB/UAS devices
        scsi: core: Introduce the BLIST_SKIP_IO_HINTS flag
        scsi: ufs: core: Free memory allocated for model before reinit
    • Merge tag 'drm-fixes-2024-06-22' of https://gitlab.freedesktop.org/drm/kernel · d6c94157
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Still pretty quiet, two weeks worth of amdgpu fixes, with one i915 and
        one xe. I didn't get the drm-misc-fixes tree PR this week, but there
        was only one fix queued and I think it can wait another week, so seems
        pretty normal.
      
        xe:
         - Fix for invalid register access
      
        i915:
         - Fix conditions for joiner usage, it's not possible with eDP MSO
      
        amdgpu:
         - Fix display idle optimization race
         - Fix GPUVM TLB flush locking scope
         - IPS fix
         - GFX 9.4.3 harvesting fix
         - Runtime pm fix for shared buffers
         - DCN 3.5.x fixes
         - USB4 fix
         - RISC-V clang fix
         - Silence UBSAN warnings
         - MES11 fix
         - PSP 14.0.x fix"
      
      * tag 'drm-fixes-2024-06-22' of https://gitlab.freedesktop.org/drm/kernel:
        drm/xe/vf: Don't touch GuC irq registers if using memory irqs
        drm/amdgpu: init TA fw for psp v14
        drm/amdgpu: cleanup MES11 command submission
        drm/amdgpu: fix UBSAN warning in kv_dpm.c
        drm/radeon: fix UBSAN warning in kv_dpm.c
        drm/amd/display: Disable CONFIG_DRM_AMD_DC_FP for RISC-V with clang
        drm/amd/display: Attempt to avoid empty TUs when endpoint is DPIA
        drm/amd/display: change dram_clock_latency to 34us for dcn35
        drm/amd/display: Change dram_clock_latency to 34us for dcn351
        drm/amdgpu: revert "take runtime pm reference when we attach a buffer" v2
        drm/amdgpu: Indicate CU havest info to CP
        drm/amd/display: prevent register access while in IPS
        drm/amdgpu: fix locking scope when flushing tlb
        drm/amd/display: Remove redundant idle optimization check
        drm/i915/mso: using joiner is not possible with eDP MSO
    • Merge tag 'ovl-fixes-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs · 264efe48
      Linus Torvalds authored
      Pull overlayfs fixes from Miklos Szeredi:
       "Fix two bugs, one originating in this cycle and one from 6.6"
      
      * tag 'ovl-fixes-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs:
        ovl: fix encoding fid for lower only root
        ovl: fix copy-up in tmpfile
    • Merge tag 'io_uring-6.10-20240621' of git://git.kernel.dk/linux · a502e727
      Linus Torvalds authored
      Pull io_uring fix from Jens Axboe:
       "Just a single cleanup for the fixed buffer iov_iter import.
      
        More cosmetic than anything else, but let's get it cleaned up as it's
        confusing"
      
      * tag 'io_uring-6.10-20240621' of git://git.kernel.dk/linux:
        io_uring/rsrc: fix incorrect assignment of iter->nr_segs in io_import_fixed
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · ffdf504c
      Linus Torvalds authored
      Pull rdma fixes from Jason Gunthorpe:
       "Small bug fixes:
      
         - Prevent a crash in bnxt if the en and rdma drivers disagree on the
           MSI vectors
      
         - Have rxe memcpy inline data from the correct address
      
         - Fix rxe's validation of UD packets
      
         - Several mlx5 mr cache issues: bad lock balancing on error, missing
           propagation of the ATS property to the HW, wrong bucketing of freed
           mrs in some cases
      
         - Incorrect goto error unwind in mlx5 driver probe
      
         - Missed userspace input validation in mlx5 SRQ create
      
         - Incorrect uABI in MANA rejecting valid optional MR creation flags"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
        RDMA/mana_ib: Ignore optional access flags for MRs
        RDMA/mlx5: Add check for srq max_sge attribute
        RDMA/mlx5: Fix unwind flow as part of mlx5_ib_stage_init_init
        RDMA/mlx5: Ensure created mkeys always have a populated rb_key
        RDMA/mlx5: Follow rb_key.ats when creating new mkeys
        RDMA/mlx5: Remove extra unlock on error path
        RDMA/rxe: Fix responder length checking for UD request packets
        RDMA/rxe: Fix data copy for IB_SEND_INLINE
        RDMA/bnxt_re: Fix the max msix vectors macro
    • Merge tag 'sound-6.10-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 4545981f
      Linus Torvalds authored
      Pull more sound fixes from Takashi Iwai:
       "A follow-up fix for a random build issue, as well as another trivial
        HD-audio quirk"
      
      * tag 'sound-6.10-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda: Use imply for suggesting CONFIG_SERIAL_MULTI_INSTANTIATE
        ALSA: hda/realtek: Add quirk for Lenovo Yoga Pro 7 14AHP9
    • Merge tag 'acpi-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 36c07583
      Linus Torvalds authored
      Pull ACPI fixes from Rafael Wysocki:
       "These address a possible NULL pointer dereference in the ACPICA code
        and quirk camera enumeration on multiple platforms where incorrect
        data are present in the platform firmware.
      
        Specifics:
      
         - Undo an ACPICA code change that attempted to keep operation regions
           within a page boundary, but allowed accesses to unmapped memory to
           occur (Raju Rangoju)
      
         - Ignore MIPI camera graph port nodes created with the help of the
           information from the ACPI tables on all Dell Tiger, Alder and
           Raptor Lake models as that information is reported to be invalid on
           the platforms in question (Hans de Goede)
      
         - Use new Intel CPU model matching macros in the MIPI DisCo for
           Imaging part of ACPI device enumeration (Hans de Goede)"
      
      * tag 'acpi-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI: mipi-disco-img: Switch to new Intel CPU model defines
        ACPI: scan: Ignore camera graph port nodes on all Dell Tiger, Alder and Raptor Lake models
        ACPICA: Revert "ACPICA: avoid Info: mapping multiple BARs. Your kernel is fine."
    • Merge tag 'thermal-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · fbe7ef3f
      Linus Torvalds authored
      Pull thermal control fixes from Rafael Wysocki:
       "These fix the Mediatek lvts_thermal driver, the Intel int340x driver,
        and the thermal core (two issues related to system suspend).
      
        Specifics:
      
         - Remove the filtered mode for mt8188 from lvts_thermal as it is not
           supported on this platform and fail the lvts_thermal initialization
           when the golden temperature is zero as that means the efuse data is
           not correctly set (Julien Panis)
      
         - Update the processor_thermal part of the Intel int340x driver to
           support shared interrupts as the processor thermal device interrupt
           may in fact be shared with PCI devices (Srinivas Pandruvada)
      
         - Synchronize the suspend-prepare and post-suspend actions of the
           thermal PM notifier to avoid a destructive race condition and
           change the priority of that notifier to the minimum to avoid
           interference between the work items spawned by it and the other
           PM notifiers during system resume (Rafael Wysocki)"
      
      * tag 'thermal-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        thermal: int340x: processor_thermal: Support shared interrupts
        thermal: core: Change PM notifier priority to the minimum
        thermal: core: Synchronize suspend-prepare and post-suspend actions
        thermal/drivers/mediatek/lvts_thermal: Return error in case of invalid efuse data
        thermal/drivers/mediatek/lvts_thermal: Remove filtered mode for mt8188
    • Merge tag 'dmaengine-fix-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine · 66cc544f
      Linus Torvalds authored
      Pull dmaengine fixes from Vinod Koul:
      
       - kmemleak, error path handling and missing kmem_cache_destroy() fixes
         for ioatdma driver
      
       - use after free fix for idxd driver
      
       - data synchronisation fix for xdma isr handling
      
       - fsl driver channel constraints and linking two fsl module fixes
      
      * tag 'dmaengine-fix-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
        dmaengine: ioatdma: Fix missing kmem_cache_destroy()
        dt-bindings: dma: fsl-edma: fix dma-channels constraints
        dmaengine: fsl-edma: avoid linking both modules
        dmaengine: ioatdma: Fix kmemleak in ioat_pci_probe()
        dmaengine: ioatdma: Fix error path in ioat3_dma_probe()
        dmaengine: ioatdma: Fix leaking on version mismatch
        dmaengine: ti: k3-udma-glue: Fix of_k3_udma_glue_parse_chn_by_id()
        dmaengine: idxd: Fix possible Use-After-Free in irq_process_work_list
        dmaengine: xilinx: xdma: Fix data synchronisation in xdma_channel_isr()
    • Merge tag 'phy-fixes-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy · a21b52aa
      Linus Torvalds authored
      Pull phy fixes from Vinod Koul:
      
       - Qualcomm QMP driver fixes for missing register offsets and correct N4
         offsets for registers
      
      * tag 'phy-fixes-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy:
        phy: qcom: qmp-combo: Switch from V6 to V6 N4 register offsets
        phy: qcom-qmp: pcs: Add missing v6 N4 register offsets
        phy: qcom-qmp: qserdes-txrx: Add missing registers offsets