1. 03 Jul, 2024 1 commit
  2. 02 Jul, 2024 1 commit
    • cgroup_misc: add kernel-doc comments for enum misc_res_type · 7a447968
      Randy Dunlap authored
      Fully document enum misc_res_type with kernel-doc comments to prevent
      kernel-doc warnings:
      
      misc_cgroup.h:12: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
       * Types of misc cgroup entries supported by the host.
      misc_cgroup.h:12: warning: missing initial short description on line:
       * Types of misc cgroup entries supported by the host.
      
      Fixes: a72232ea ("cgroup: Add misc cgroup controller")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: cgroups@vger.kernel.org
      Signed-off-by: Tejun Heo <tj@kernel.org>
      7a447968
  3. 19 Jun, 2024 6 commits
    • selftest/cgroup: Update test_cpuset_prs.sh to match changes · 1c0be3f7
      Waiman Long authored
      Unlike the list of isolated CPUs, it is not easy to programmatically
      determine what sched domains are being created by the scheduler just
      by examining the data in various kernfs filesystems. The easiest way
      to get this information is to enable the /sys/kernel/debug/sched/verbose
      file so that the information is displayed on the console. This is also
      what the test_cpuset_prs.sh script does when the -v flag is given.
      
      It is rather hard to fetch the data from the console and compare it to
      the expected result. An easier way is to dump the expected sched-domain
      information out to the console so that they can be visually compared
      with the actual sched domain data. However, this has to be done manually
      by visual inspection and so will only be done once in a while.
      
      Moreover, the preceding cpuset commits also change the cpuset behavior,
      requiring corresponding changes in some test cases as well as new test
      cases to test the newly added functionality.
      Signed-off-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      1c0be3f7
    • cgroup/cpuset: Make cpuset.cpus.exclusive independent of cpuset.cpus · 737bb142
      Waiman Long authored
      The "cpuset.cpus.exclusive.effective" value is currently limited to a
      subset of its "cpuset.cpus". This makes the exclusive CPUs distribution
      hierarchy subsumed within the larger "cpuset.cpus" hierarchy. We have to
      decide on what CPUs are used locally and what CPUs can be passed down as
      exclusive CPUs down the hierarchy and combine them into "cpuset.cpus".
      
      The advantage of the current scheme is to have only one hierarchy to
      worry about. However, it makes it harder to use as all the "cpuset.cpus"
      values have to be properly set along the way down to the designated remote
      partition root. It also makes it more cumbersome to find out what CPUs
      can be used locally.
      
      Make creation of remote partition simpler by breaking the
      dependency of "cpuset.cpus.exclusive" on "cpuset.cpus" and make
      them independent entities. Now we have two separate hierarchies -
      one for setting "cpuset.cpus.effective" and the other one for setting
      "cpuset.cpus.exclusive.effective". We may not need to set "cpuset.cpus"
      when we activate a partition root anymore.
      
      Also update Documentation/admin-guide/cgroup-v2.rst and cpuset.c comment
      to document this change.
      Suggested-by: Petr Malat <oss@malat.biz>
      Signed-off-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      737bb142
    • cgroup/cpuset: Delay setting of CS_CPU_EXCLUSIVE until valid partition · fe8cd273
      Waiman Long authored
      The CS_CPU_EXCLUSIVE flag is currently set whenever cpuset.cpus.exclusive
      is set to make sure that the exclusivity test will be run to ensure its
      exclusiveness. At the same time, this flag can be changed whenever the
      partition root state is changed. For example, the CS_CPU_EXCLUSIVE flag
      will be reset whenever a partition root becomes invalid. This makes
      using CS_CPU_EXCLUSIVE to ensure exclusiveness a bit fragile.
      
      The current scheme also makes setting up a cpuset.cpus.exclusive
      hierarchy to enable remote partition harder as cpuset.cpus.exclusive
      cannot overlap with any cpuset.cpus of sibling cpusets if their
      cpuset.cpus.exclusive aren't set.
      
      Solve these issues by deferring the setting of the CS_CPU_EXCLUSIVE flag
      until the cpuset becomes a valid partition root while adding new checks
      in validate_change() to ensure that cpuset.cpus.exclusive of sibling
      cpusets cannot overlap.
      
      An additional check is also added to validate_change() to make sure that
      cpuset.cpus of one cpuset cannot be a subset of cpuset.cpus.exclusive
      of a sibling cpuset to avoid the problem that none of those CPUs will
      be available when these exclusive CPUs are extracted out to a newly
      enabled partition root. The Documentation/admin-guide/cgroup-v2.rst
      file is updated to document the new constraints.
      Signed-off-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      fe8cd273
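The two sibling constraints described in the fe8cd273 commit message can be illustrated with a small Python model in which CPU masks are plain sets. This is only a sketch of the stated rules, not the kernel's validate_change(); the names `cpuset` and `sibling_conflict` and the dict layout are hypothetical, and ignoring an empty cpuset.cpus is an assumption.

```python
def cpuset(cpus=(), exclusive=()):
    """Build a toy cpuset: 'cpus' models cpuset.cpus,
    'exclusive' models cpuset.cpus.exclusive."""
    return {"cpus": set(cpus), "exclusive": set(exclusive)}


def sibling_conflict(a, b):
    """Return a reason string if sibling cpusets a and b conflict, else None."""
    # Rule 1: cpuset.cpus.exclusive of sibling cpusets must not overlap.
    if a["exclusive"] & b["exclusive"]:
        return "cpuset.cpus.exclusive overlaps"

    # Rule 2: cpuset.cpus of one cpuset must not be a subset of a sibling's
    # cpuset.cpus.exclusive, otherwise it would be left with no CPUs once the
    # sibling becomes a partition root. Checked in both directions.
    # Assumption: an empty cpuset.cpus is ignored here.
    for x, y in ((a, b), (b, a)):
        if x["cpus"] and x["cpus"] <= y["exclusive"]:
            return "cpuset.cpus is a subset of a sibling's cpuset.cpus.exclusive"

    return None
```

For instance, a sibling whose cpuset.cpus is {2, 3} conflicts with a cpuset whose cpuset.cpus.exclusive is {2, 3}, while a sibling with cpuset.cpus {1, 2, 3} does not, because CPU 1 would remain available to it.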
    • selftest/cgroup: Fix test_cpuset_prs.sh problems reported by test robot · 43ee4014
      Waiman Long authored
      The test robot reported two different problems when running the
      test_cpuset_prs.sh test.
      
       # ./test_cpuset_prs.sh: line 106: echo: write error: Input/output error
       #  :
       # Effective cpus changed to 0-1,4-7 after test 4!
      
      The write error is caused by writing to /dev/console. It looks like
      some systems may not have /dev/console configured or in a writeable
      state. Fix this by checking the existence of /dev/console before
      attempting to write to it.
      
      After the completion of each test run, the script will check if the
      cpuset state is reset back to the original state. That usually takes a
      while to happen. The test script inserts some artificial delay to make
      sure that the reset has completed. The current setting is about 80ms.
      That may not be enough in some cases especially if the test system is
      slow. Double it to 160ms to minimize the chance of this type of failure.
      Reported-by: kernel test robot <oliver.sang@intel.com>
      Closes: https://lore.kernel.org/oe-lkp/202406141712.dbbaa8fd-oliver.sang@intel.com
      Signed-off-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      43ee4014
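The two fixes described in the 43ee4014 commit message (guarding the console write and lengthening the settle delay) can be sketched in Python as follows. The real test is a shell script, so this is only an illustration of the idea; `write_console`, `wait_for_reset` and `RESET_DELAY_S` are hypothetical names, and catching the write error in addition to the existence check is an assumption that goes beyond what the commit describes.

```python
import os
import time

# The commit doubles the artificial delay from about 80ms to 160ms.
RESET_DELAY_S = 0.16


def write_console(msg, path="/dev/console"):
    """Write msg to the console device only if it exists.

    Returns True if the message was written, False if it was skipped.
    """
    if not os.path.exists(path):
        return False
    try:
        with open(path, "w") as console:
            console.write(msg + "\n")
    except OSError:
        # Assumption: a console that exists but is not writable
        # (e.g. "Input/output error") is treated as absent.
        return False
    return True


def wait_for_reset(state_is_reset, delay=RESET_DELAY_S):
    """Sleep for the settle delay, then report whether the state was reset."""
    time.sleep(delay)
    return state_is_reset()
```

A caller would pass a function that compares the current effective CPUs with the values recorded before the test, and treat a False result as the "Effective cpus changed" failure quoted above.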
    • cgroup/cpuset: Fix remote root partition creation problem · ccac8e8d
      Waiman Long authored
      Since commit 181c8e09 ("cgroup/cpuset: Introduce remote partition"),
      a remote partition can be created underneath a non-partition root cpuset
      as long as its exclusive_cpus are set to distribute exclusive CPUs down
      to its children. The generate_sched_domains() function, however, doesn't
      take into account this new behavior and hence will fail to create the
      sched domain needed for a remote root (non-isolated) partition.
      
      There are two issues related to remote partition support. First of
      all, generate_sched_domains() has a fast path that is activated if
      root_load_balance is true and top_cpuset.nr_subparts is non-zero. The
      latter condition isn't quite correct for remote partitions as nr_subparts
      just shows the number of local child partitions underneath it. There
      can be no local child partition under top_cpuset even if there are
      remote partitions further down the hierarchy. Fix that by checking
      for subpartitions_cpus which contains exclusive CPUs allocated to both
      local and remote partitions.
      
      Secondly, the valid partition check for subtree skipping in the csa[]
      generation loop isn't enough as a remote partition does not need to
      have a partition root parent. Fix this problem by breaking csa[] array
      generation loop of generate_sched_domains() into v1 and v2 specific parts
      and checking a cpuset's exclusive_cpus before skipping its subtree in
      the v2 case.
      
      Also simplify generate_sched_domains() for cgroup v2 as only
      non-isolating partition roots should be included in building the cpuset
      array and none of the v1 scheduling attributes other than a different
      way to create an isolated partition are supported.
      
      Fixes: 181c8e09 ("cgroup/cpuset: Introduce remote partition")
      Signed-off-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      ccac8e8d
    • cgroup: avoid the unnecessary list_add(dying_tasks) in cgroup_exit() · 6fe96014
      Oleg Nesterov authored
      cgroup_exit() needs to do this only if the exiting task is a leader and it
      is not the last live thread.  The patch doesn't use delay_group_leader()
      because atomic_read(signal->live) matches the code in
      css_task_iter_advance() more closely.
      
      cgroup_release() can now check list_empty(task->cg_list) before it takes
      css_set_lock and calls css_set_skip_task_iters().
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      6fe96014
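The condition stated in the 6fe96014 commit message (queue the exiting task on dying_tasks only if it is a group leader and not the last live thread) can be written as a tiny Python model. The `Task` class and `needs_dying_list` are hypothetical names, and `live_threads` stands in for atomic_read(signal->live) on the assumption that the exiting thread has already been subtracted from the count.

```python
from dataclasses import dataclass


@dataclass
class Task:
    is_group_leader: bool  # models thread_group_leader(task)
    live_threads: int      # models atomic_read(signal->live), exiting thread excluded


def needs_dying_list(task: Task) -> bool:
    """True if the exiting task should be added to the dying_tasks list."""
    return task.is_group_leader and task.live_threads > 0
```

In this model a leader that exits while two other threads are still alive is queued, whereas a non-leader thread, or a leader that is the last live thread, is not.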
  4. 03 Jun, 2024 1 commit
  5. 01 Jun, 2024 1 commit
    • cgroup/cpuset: Reduce the lock protecting CS_SCHED_LOAD_BALANCE · 018ee567
      Xiu Jianfeng authored
      In cpuset_css_online(), clearing the CS_SCHED_LOAD_BALANCE bit
      of cs->flags is guarded by callback_lock and cpuset_mutex. This is
      not a problem in itself, because it is consistent with the description
      of these two global locks at the beginning of this file. However, since
      the operations of checking, setting and clearing the flag bit are atomic,
      protection by callback_lock is unnecessary here (see CS_SPREAD_*). So,
      to make it more consistent with the other code, move the operation
      outside the critical section of callback_lock.
      
      No functional changes intended.
      Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
      Acked-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      018ee567
  6. 26 May, 2024 11 commits
  7. 25 May, 2024 12 commits
    • Merge tag 'mm-hotfixes-stable-2024-05-25-09-13' of... · 9b62e02e
      Linus Torvalds authored
      Merge tag 'mm-hotfixes-stable-2024-05-25-09-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
      
      Pull misc fixes from Andrew Morton:
       "16 hotfixes, 11 of which are cc:stable.
      
        A few nilfs2 fixes, the remainder are for MM: a couple of selftests
        fixes, various singletons fixing various issues in various parts"
      
      * tag 'mm-hotfixes-stable-2024-05-25-09-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
        mm/ksm: fix possible UAF of stable_node
        mm/memory-failure: fix handling of dissolved but not taken off from buddy pages
        mm: /proc/pid/smaps_rollup: avoid skipping vma after getting mmap_lock again
        nilfs2: fix potential hang in nilfs_detach_log_writer()
        nilfs2: fix unexpected freezing of nilfs_segctor_sync()
        nilfs2: fix use-after-free of timer for log writer thread
        selftests/mm: fix build warnings on ppc64
        arm64: patching: fix handling of execmem addresses
        selftests/mm: compaction_test: fix bogus test success and reduce probability of OOM-killer invocation
        selftests/mm: compaction_test: fix incorrect write of zero to nr_hugepages
        selftests/mm: compaction_test: fix bogus test success on Aarch64
        mailmap: update email address for Satya Priya
        mm/huge_memory: don't unpoison huge_zero_folio
        kasan, fortify: properly rename memintrinsics
        lib: add version into /proc/allocinfo output
        mm/vmalloc: fix vmalloc which may return null if called with __GFP_NOFAIL
      9b62e02e
    • Merge tag 'irq-urgent-2024-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a0db36ed
      Linus Torvalds authored
      Pull irq fixes from Ingo Molnar:
      
       - Fix x86 IRQ vector leak caused by a CPU offlining race
      
       - Fix build failure in the riscv-imsic irqchip driver
         caused by an API-change semantic conflict
      
       - Fix use-after-free in irq_find_at_or_after()
      
      * tag 'irq-urgent-2024-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        genirq/irqdesc: Prevent use-after-free in irq_find_at_or_after()
        genirq/cpuhotplug, x86/vector: Prevent vector leak during CPU offline
        irqchip/riscv-imsic: Fixup riscv_ipi_set_virq_range() conflict
      a0db36ed
    • Merge tag 'x86-urgent-2024-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3a390f24
      Linus Torvalds authored
      Pull x86 fixes from Ingo Molnar:
      
       - Fix regressions of the new x86 CPU VFM (vendor/family/model)
         enumeration/matching code
      
       - Fix crash kernel detection on buggy firmware with
         non-compliant ACPI MADT tables
      
       - Address Kconfig warning
      
      * tag 'x86-urgent-2024-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/cpu: Fix x86_match_cpu() to match just X86_VENDOR_INTEL
        crypto: x86/aes-xts - switch to new Intel CPU model defines
        x86/topology: Handle bogus ACPI tables correctly
        x86/kconfig: Select ARCH_WANT_FRAME_POINTERS again when UNWINDER_FRAME_POINTER=y
      3a390f24
    • Merge tag 'for-linus-6.10-1' of https://github.com/cminyard/linux-ipmi · 56676c4c
      Linus Torvalds authored
      Pull ipmi updates from Corey Minyard:
       "Mostly updates for deprecated interfaces, platform.remove and
        converting from a tasklet to a BH workqueue.
      
        Also use HAS_IOPORT for disabling inb()/outb()"
      
      * tag 'for-linus-6.10-1' of https://github.com/cminyard/linux-ipmi:
        ipmi: kcs_bmc_npcm7xx: Convert to platform remove callback returning void
        ipmi: kcs_bmc_aspeed: Convert to platform remove callback returning void
        ipmi: ipmi_ssif: Convert to platform remove callback returning void
        ipmi: ipmi_si_platform: Convert to platform remove callback returning void
        ipmi: ipmi_powernv: Convert to platform remove callback returning void
        ipmi: bt-bmc: Convert to platform remove callback returning void
        char: ipmi: handle HAS_IOPORT dependencies
        ipmi: Convert from tasklet to BH workqueue
      56676c4c
    • Merge tag 'ceph-for-6.10-rc1' of https://github.com/ceph/ceph-client · 74eca356
      Linus Torvalds authored
      Pull ceph updates from Ilya Dryomov:
       "A series from Xiubo that adds support for additional access checks
        based on MDS auth caps which were recently made available to clients.
      
        This is needed to prevent scenarios where the MDS quietly discards
        updates that a UID-restricted client previously (wrongfully) acked to
        the user.
      
        Other than that, just a documentation fixup"
      
      * tag 'ceph-for-6.10-rc1' of https://github.com/ceph/ceph-client:
        doc: ceph: update userspace command to get CephFS metadata
        ceph: add CEPHFS_FEATURE_MDS_AUTH_CAPS_CHECK feature bit
        ceph: check the cephx mds auth access for async dirop
        ceph: check the cephx mds auth access for open
        ceph: check the cephx mds auth access for setattr
        ceph: add ceph_mds_check_access() helper
        ceph: save cap_auths in MDS client when session is opened
      74eca356
    • Merge tag 'ntfs3_for_6.10' of https://github.com/Paragon-Software-Group/linux-ntfs3 · 89b61ca4
      Linus Torvalds authored
      Pull ntfs3 updates from Konstantin Komarov:
       "Fixes:
         - reusing of the file index (could cause the file to be trimmed)
         - infinite dir enumeration
         - taking DOS names into account during link counting
         - le32_to_cpu conversion, 32 bit overflow, NULL check
         - some code was refactored
      
        Changes:
         - removed max link count info display during driver init
      
        Remove:
         - atomic_open has been removed for lack of use"
      
      * tag 'ntfs3_for_6.10' of https://github.com/Paragon-Software-Group/linux-ntfs3:
        fs/ntfs3: Break dir enumeration if directory contents error
        fs/ntfs3: Fix case when index is reused during tree transformation
        fs/ntfs3: Mark volume as dirty if xattr is broken
        fs/ntfs3: Always make file nonresident on fallocate call
        fs/ntfs3: Redesign ntfs_create_inode to return error code instead of inode
        fs/ntfs3: Use variable length array instead of fixed size
        fs/ntfs3: Use 64 bit variable to avoid 32 bit overflow
        fs/ntfs3: Check 'folio' pointer for NULL
        fs/ntfs3: Missed le32_to_cpu conversion
        fs/ntfs3: Remove max link count info display during driver init
        fs/ntfs3: Taking DOS names into account during link counting
        fs/ntfs3: remove atomic_open
        fs/ntfs3: use kcalloc() instead of kzalloc()
      89b61ca4
    • Merge tag '6.10-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd · 6c8b1a2d
      Linus Torvalds authored
      Pull smb server fixes from Steve French:
       "Two ksmbd server fixes, both for stable"
      
      * tag '6.10-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
        ksmbd: ignore trailing slashes in share paths
        ksmbd: avoid to send duplicate oplock break notifications
      6c8b1a2d
    • Merge tag 'rtc-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux · 54f71b03
      Linus Torvalds authored
      Pull RTC updates from Alexandre Belloni:
       "There is one new driver and then most of the changes are the device
        tree bindings conversions to yaml.
      
        New driver:
         - Epson RX8111
      
        Drivers:
         - Many Device Tree bindings conversions to dtschema
         - pcf8563: wakeup-source support"
      
      * tag 'rtc-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
        pcf8563: add wakeup-source support
        rtc: rx8111: handle VLOW flag
        rtc: rx8111: demote warnings to debug level
        rtc: rx6110: Constify struct regmap_config
        dt-bindings: rtc: convert trivial devices into dtschema
        dt-bindings: rtc: stmp3xxx-rtc: convert to dtschema
        dt-bindings: rtc: pxa-rtc: convert to dtschema
        rtc: Add driver for Epson RX8111
        dt-bindings: rtc: Add Epson RX8111
        rtc: mcp795: drop unneeded MODULE_ALIAS
        rtc: nuvoton: Modify part number value
        rtc: test: Split rtc unit test into slow and normal speed test
        dt-bindings: rtc: nxp,lpc1788-rtc: convert to dtschema
        dt-bindings: rtc: digicolor-rtc: move to trivial-rtc
        dt-bindings: rtc: alphascale,asm9260-rtc: convert to dtschema
        dt-bindings: rtc: armada-380-rtc: convert to dtschema
        rtc: cros-ec: provide ID table for avoiding fallback match
      54f71b03
    • Merge tag 'i3c/for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux · 4286e1fc
      Linus Torvalds authored
      Pull i3c updates from Alexandre Belloni:
       "Runtime PM (power management) is improved and hot-join support has
        been added to the dw controller driver.
      
        Core:
         - Allow device driver to trigger controller runtime PM
      
        Drivers:
         - dw: hot-join support
         - svc: better IBI handling"
      
      * tag 'i3c/for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux:
        i3c: dw: Add hot-join support.
        i3c: master: Enable runtime PM for master controller
        i3c: master: svc: fix invalidate IBI type and miss call client IBI handler
        i3c: master: svc: change ENXIO to EAGAIN when IBI occurs during start frame
        i3c: Add comment for -EAGAIN in i3c_device_do_priv_xfers()
      4286e1fc
    • Merge tag 'jffs2-for-linus-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs · 6951abe8
      Linus Torvalds authored
      Pull jffs2 updates from Richard Weinberger:
      
       - Fix illegal memory access in jffs2_free_inode()
      
       - Kernel-doc fixes
      
       - print symbolic error names
      
      * tag 'jffs2-for-linus-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs:
        jffs2: Fix potential illegal address access in jffs2_free_inode
        jffs2: Simplify the allocation of slab caches
        jffs2: nodemgmt: fix kernel-doc comments
        jffs2: print symbolic error name instead of error code
      6951abe8
    • Merge tag 'uml-for-linus-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux · 2313022e
      Linus Torvalds authored
      Pull UML updates from Richard Weinberger:
      
       - Fixes for -Wmissing-prototypes warnings and further cleanup
      
       - Remove callback returning void from rtc and virtio drivers
      
       - Fix bash location
      
      * tag 'uml-for-linus-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: (26 commits)
        um: virtio_uml: Convert to platform remove callback returning void
        um: rtc: Convert to platform remove callback returning void
        um: Remove unused do_get_thread_area function
        um: Fix -Wmissing-prototypes warnings for __vdso_*
        um: Add an internal header shared among the user code
        um: Fix the declaration of kasan_map_memory
        um: Fix the -Wmissing-prototypes warning for get_thread_reg
        um: Fix the -Wmissing-prototypes warning for __switch_mm
        um: Fix -Wmissing-prototypes warnings for (rt_)sigreturn
        um: Stop tracking host PID in cpu_tasks
        um: process: remove unused 'n' variable
        um: vector: remove unused len variable/calculation
        um: vector: fix bpfflash parameter evaluation
        um: slirp: remove set but unused variable 'pid'
        um: signal: move pid variable where needed
        um: Makefile: use bash from the environment
        um: Add winch to winch_handlers before registering winch IRQ
        um: Fix -Wmissing-prototypes warnings for __warp_* and foo
        um: Fix -Wmissing-prototypes warnings for text_poke*
        um: Move declarations to proper headers
        ...
      2313022e
    • Merge tag 'drm-next-2024-05-25' of https://gitlab.freedesktop.org/drm/kernel · 56fb6f92
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Some fixes for the end of the merge window, mostly amdgpu and panthor,
        with one nouveau uAPI change that fixes a bad decision we made a few
        months back.
      
        nouveau:
         - fix bo metadata uAPI for vm bind
      
        panthor:
         - Fixes for panthor's heap logical block.
         - Reset on unrecoverable fault
         - Fix VM references.
         - Reset fix.
      
        xlnx:
         - xlnx compile and doc fixes.
      
        amdgpu:
         - Handle vbios table integrated info v2.3
      
        amdkfd:
         - Handle duplicate BOs in reserve_bo_and_cond_vms
         - Handle memory limitations on small APUs
      
        dp/mst:
         - MST null deref fix.
      
        bridge:
         - Don't let next bridge create connector in adv7511 to make probe
           work"
      
      * tag 'drm-next-2024-05-25' of https://gitlab.freedesktop.org/drm/kernel:
        drm/amdgpu/atomfirmware: add intergrated info v2.3 table
        drm/mst: Fix NULL pointer dereference at drm_dp_add_payload_part2
        drm/amdkfd: Let VRAM allocations go to GTT domain on small APUs
        drm/amdkfd: handle duplicate BOs in reserve_bo_and_cond_vms
        drm/bridge: adv7511: Attach next bridge without creating connector
        drm/buddy: Fix the warn on's during force merge
        drm/nouveau: use tile_mode and pte_kind for VM_BIND bo allocations
        drm/panthor: Call panthor_sched_post_reset() even if the reset failed
        drm/panthor: Reset the FW VM to NULL on unplug
        drm/panthor: Keep a ref to the VM at the panthor_kernel_bo level
        drm/panthor: Force an immediate reset on unrecoverable faults
        drm/panthor: Document drm_panthor_tiler_heap_destroy::handle validity constraints
        drm/panthor: Fix an off-by-one in the heap context retrieval logic
        drm/panthor: Relax the constraints on the tiler chunk size
        drm/panthor: Make sure the tiler initial/max chunks are consistent
        drm/panthor: Fix tiler OOM handling to allow incremental rendering
        drm: xlnx: zynqmp_dpsub: Fix compilation error
        drm: xlnx: zynqmp_dpsub: Fix few function comments
      56fb6f92
  8. 24 May, 2024 7 commits
    • cifs: Fix missing set of remote_i_size · 93a43155
      David Howells authored
      Occasionally, the generic/001 xfstest will fail indicating corruption in
      one of the copy chains when run on cifs against a server that supports
      FSCTL_DUPLICATE_EXTENTS_TO_FILE (e.g. Samba with a share on btrfs).  The
      problem is that the remote_i_size value isn't updated by cifs_setsize()
      when called by smb2_duplicate_extents(), but i_size *is*.
      
      This may cause cifs_remap_file_range() to then skip the bit after calling
      ->duplicate_extents() that sets sizes.
      
      Fix this by calling netfs_resize_file() in smb2_duplicate_extents() before
      calling cifs_setsize() to set i_size.
      
      This means we don't then need to call netfs_resize_file() upon return from
      ->duplicate_extents(), but we also fix the test to compare against the pre-dup
      inode size.
      
      [Note that this goes back before the addition of remote_i_size with the
      netfs_inode struct.  It should probably have been setting cifsi->server_eof
      previously.]
      
      Fixes: cfc63fc8 ("smb3: fix cached file size problems in duplicate extents (reflink)")
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Steve French <sfrench@samba.org>
      cc: Paulo Alcantara <pc@manguebit.com>
      cc: Shyam Prasad N <nspmangalore@gmail.com>
      cc: Rohith Surabattula <rohiths.msft@gmail.com>
      cc: Jeff Layton <jlayton@kernel.org>
      cc: linux-cifs@vger.kernel.org
      cc: netfs@lists.linux.dev
      Signed-off-by: Steve French <stfrench@microsoft.com>
      93a43155
    • cifs: Fix smb3_insert_range() to move the zero_point · 8a160723
      David Howells authored
      Fix smb3_insert_range() to move the zero_point over to the new EOF.
      Without this, generic/147 fails as reads of data beyond the old EOF point
      return zeroes.
      
      Fixes: 3ee1a1fc ("cifs: Cut over to using netfslib")
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Shyam Prasad N <nspmangalore@gmail.com>
      cc: Rohith Surabattula <rohiths.msft@gmail.com>
      cc: Jeff Layton <jlayton@kernel.org>
      cc: linux-cifs@vger.kernel.org
      cc: netfs@lists.linux.dev
      Signed-off-by: Steve French <stfrench@microsoft.com>
      8a160723
    • Merge tag 'mm-stable-2024-05-24-11-49' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm · 0b32d436
      Linus Torvalds authored
      Pull more mm updates from Andrew Morton:
       "Jeff Xu's implementation of the mseal() syscall"
      
      * tag 'mm-stable-2024-05-24-11-49' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
        selftest mm/mseal read-only elf memory segment
        mseal: add documentation
        selftest mm/mseal memory sealing
        mseal: add mseal syscall
        mseal: wire up mseal syscall
      0b32d436
    • mm/ksm: fix possible UAF of stable_node · 90e82349
      Chengming Zhou authored
      Commit 2c653d0e ("ksm: introduce ksm_max_page_sharing per page
      deduplication limit") introduced a possible failure case in
      stable_tree_insert(), where we may free the newly allocated
      stable_node_dup if we fail to prepare the missing chain node.
      
      The kfolio is then returned and unlocked with a freed stable_node set,
      and any MM activity can come in and access kfolio->mapping, causing a UAF.
      
      Fix it by moving folio_set_stable_node() to the end after stable_node
      is inserted successfully.
      
      Link: https://lkml.kernel.org/r/20240513-b4-ksm-stable-node-uaf-v1-1-f687de76f452@linux.dev
      Fixes: 2c653d0e ("ksm: introduce ksm_max_page_sharing per page deduplication limit")
      Signed-off-by: Chengming Zhou <chengming.zhou@linux.dev>
      Acked-by: David Hildenbrand <david@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Stefan Roesch <shr@devkernel.io>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      90e82349
    • mm/memory-failure: fix handling of dissolved but not taken off from buddy pages · 8cf360b9
      Miaohe Lin authored
      When I ran memory failure tests recently, the panic below occurred:
      
      page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00
      flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff)
      raw: 06fffe0000000000 dead000000000100 dead000000000122 0000000000000000
      raw: 0000000000000000 0000000000000009 00000000ffffffff 0000000000000000
      page dumped because: VM_BUG_ON_PAGE(!PageBuddy(page))
      ------------[ cut here ]------------
      kernel BUG at include/linux/page-flags.h:1009!
      invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
      RIP: 0010:__del_page_from_free_list+0x151/0x180
      RSP: 0018:ffffa49c90437998 EFLAGS: 00000046
      RAX: 0000000000000035 RBX: 0000000000000009 RCX: ffff8dd8dfd1c9c8
      RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8dd8dfd1c9c0
      RBP: ffffd901233b8000 R08: ffffffffab5511f8 R09: 0000000000008c69
      R10: 0000000000003c15 R11: ffffffffab5511f8 R12: ffff8dd8fffc0c80
      R13: 0000000000000001 R14: ffff8dd8fffc0c80 R15: 0000000000000009
      FS:  00007ff916304740(0000) GS:ffff8dd8dfd00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000055eae50124c8 CR3: 00000008479e0000 CR4: 00000000000006f0
      Call Trace:
       <TASK>
       __rmqueue_pcplist+0x23b/0x520
       get_page_from_freelist+0x26b/0xe40
       __alloc_pages_noprof+0x113/0x1120
       __folio_alloc_noprof+0x11/0xb0
       alloc_buddy_hugetlb_folio.isra.0+0x5a/0x130
       __alloc_fresh_hugetlb_folio+0xe7/0x140
       alloc_pool_huge_folio+0x68/0x100
       set_max_huge_pages+0x13d/0x340
       hugetlb_sysctl_handler_common+0xe8/0x110
       proc_sys_call_handler+0x194/0x280
       vfs_write+0x387/0x550
       ksys_write+0x64/0xe0
       do_syscall_64+0xc2/0x1d0
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      RIP: 0033:0x7ff916114887
      RSP: 002b:00007ffec8a2fd78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      RAX: ffffffffffffffda RBX: 000055eae500e350 RCX: 00007ff916114887
      RDX: 0000000000000004 RSI: 000055eae500e390 RDI: 0000000000000003
      RBP: 000055eae50104c0 R08: 0000000000000000 R09: 000055eae50104c0
      R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000004
      R13: 0000000000000004 R14: 00007ff916216b80 R15: 00007ff916216a00
       </TASK>
      Modules linked in: mce_inject hwpoison_inject
      ---[ end trace 0000000000000000 ]---
      
      Before the panic, there was a warning about a bad page state:
      
      BUG: Bad page state in process page-types  pfn:8cee00
      page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00
      flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff)
      page_type: 0xffffff7f(buddy)
      raw: 06fffe0000000000 ffffd901241c0008 ffffd901240f8008 0000000000000000
      raw: 0000000000000000 0000000000000009 00000000ffffff7f 0000000000000000
      page dumped because: nonzero mapcount
      Modules linked in: mce_inject hwpoison_inject
      CPU: 8 PID: 154211 Comm: page-types Not tainted 6.9.0-rc4-00499-g5544ec3178e2-dirty #22
      Call Trace:
       <TASK>
       dump_stack_lvl+0x83/0xa0
       bad_page+0x63/0xf0
       free_unref_page+0x36e/0x5c0
       unpoison_memory+0x50b/0x630
       simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
       debugfs_attr_write+0x42/0x60
       full_proxy_write+0x5b/0x80
       vfs_write+0xcd/0x550
       ksys_write+0x64/0xe0
       do_syscall_64+0xc2/0x1d0
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
      RIP: 0033:0x7f189a514887
      RSP: 002b:00007ffdcd899718 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f189a514887
      RDX: 0000000000000009 RSI: 00007ffdcd899730 RDI: 0000000000000003
      RBP: 00007ffdcd8997a0 R08: 0000000000000000 R09: 00007ffdcd8994b2
      R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcda199a8
      R13: 0000000000404af1 R14: 000000000040ad78 R15: 00007f189a7a5040
       </TASK>
      
      The root cause appears to be the following race:
      
       memory_failure
        try_memory_failure_hugetlb
         me_huge_page
          __page_handle_poison
           dissolve_free_hugetlb_folio
           drain_all_pages -- Buddy page can be isolated e.g. for compaction.
           take_page_off_buddy -- Failed as page is not in the buddy list.
                                -- Page can be put back into buddy after compaction.
          page_ref_inc -- Leads to buddy page with refcnt = 1.
      
      Then unpoison_memory() can unpoison the page and send the buddy page back
      into the buddy list again, leading to the bad page state warning above.
      bad_page() then calls page_mapcount_reset(), which clears PageBuddy on
      the buddy page and triggers the later VM_BUG_ON_PAGE(!PageBuddy(page))
      when this page is allocated.
      
      Fix this issue by only treating __page_handle_poison() as successful when
      it returns 1.
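
      The changed success check can be sketched as below.  This is a
      userspace illustration only, not the kernel code: the enum and both
      helper names are hypothetical, the meaning given to 0 and to negative
      values is inferred from the race diagram above, and the pre-fix
      condition (any non-negative value counts as success) is an assumption.

```c
#include <stdbool.h>

/* Hypothetical labels for the possible outcomes (assumed semantics). */
enum poison_result {
	POISON_DISSOLVE_FAILED = -1, /* hugetlb folio could not be dissolved */
	POISON_NOT_OFF_BUDDY   = 0,  /* dissolved, but not taken off buddy */
	POISON_TAKEN_OFF_BUDDY = 1,  /* dissolved and taken off buddy */
};

/* Assumed pre-fix check: any non-negative return is treated as success. */
bool handled_before_fix(int ret)
{
	return ret >= 0;
}

/* Post-fix check as described in the commit message: only 1 is success. */
bool handled_after_fix(int ret)
{
	return ret == 1;
}
```

      Under these assumptions, the racy case (return value 0) no longer
      counts as handled, so the extra page reference is not taken on a page
      that may still end up in the buddy list.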
      
      Link: https://lkml.kernel.org/r/20240523071217.1696196-1-linmiaohe@huawei.com
      Fixes: ceaf8fbe ("mm, hwpoison: skip raw hwpoison page in freeing 1GB hugepage")
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      8cf360b9
    • mm: /proc/pid/smaps_rollup: avoid skipping vma after getting mmap_lock again · 6d065f50
      Yuanyuan Zhong authored
      After switching smaps_rollup to use VMA iterator, searching for next entry
      is part of the condition expression of the do-while loop.  So the current
      VMA needs to be addressed before the continue statement.
      
      Otherwise, with some VMAs skipped, userspace observed memory
      consumption from /proc/pid/smaps_rollup will be smaller than the sum of
      the corresponding fields from /proc/pid/smaps.
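
      The pitfall can be reproduced in a small userspace model.  This is a
      sketch only: struct region, struct iter, iter_next() and rollup() are
      hypothetical stand-ins for the VMA, the VMA iterator and the rollup
      loop, and the relock_at parameter merely simulates the step at which
      the lock is dropped and taken again.

```c
#include <stddef.h>

/* Hypothetical stand-in for a VMA: only its size matters here. */
struct region { unsigned long size; };

/* Hypothetical array iterator standing in for the VMA iterator. */
struct iter { const struct region *r; size_t n, pos; };

static const struct region *iter_next(struct iter *it)
{
	return it->pos < it->n ? &it->r[it->pos++] : NULL;
}

/*
 * Sum all region sizes.  At loop step 'relock_at' the lock is assumed to
 * be retaken and the next region is looked up by hand.  'continue' in a
 * do-while jumps to the loop condition, which advances the iterator again,
 * so the region just looked up is lost unless it is accounted first
 * (fixed != 0).
 */
unsigned long rollup(const struct region *r, size_t n,
		     size_t relock_at, int fixed)
{
	struct iter it = { r, n, 0 };
	const struct region *cur = iter_next(&it);
	unsigned long total = 0;
	size_t step = 0;

	if (!cur)
		return 0;
	do {
		total += cur->size;
		if (step++ == relock_at) {
			cur = iter_next(&it);	/* lookup after relocking */
			if (!cur)
				break;
			if (fixed)
				total += cur->size; /* account before continue */
			continue;
		}
	} while ((cur = iter_next(&it)) != NULL);
	return total;
}
```

      With four regions of sizes 10, 20, 30 and 40 and relock_at set to 1,
      the model returns 70 without the extra accounting and 100 with it,
      matching the under-reporting described above.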
      
      Link: https://lkml.kernel.org/r/20240523183531.2535436-1-yzhong@purestorage.com
      Fixes: c4c84f06 ("fs/proc/task_mmu: stop using linked list and highest_vm_end")
      Signed-off-by: Yuanyuan Zhong <yzhong@purestorage.com>
      Reviewed-by: Mohamed Khalfella <mkhalfella@purestorage.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      6d065f50
    • nilfs2: fix potential hang in nilfs_detach_log_writer() · eb85dace
      Ryusuke Konishi authored
      Syzbot has reported a potential hang in nilfs_detach_log_writer() called
      during nilfs2 unmount.
      
      Analysis revealed that this is because nilfs_segctor_sync(), which
      synchronizes with the log writer thread, can be called after
      nilfs_segctor_destroy() terminates that thread, as shown in the call trace
      below:
      
      nilfs_detach_log_writer
        nilfs_segctor_destroy
          nilfs_segctor_kill_thread  --> Shut down log writer thread
          flush_work
            nilfs_iput_work_func
              nilfs_dispose_list
                iput
                  nilfs_evict_inode
                    nilfs_transaction_commit
                      nilfs_construct_segment (if inode needs sync)
                        nilfs_segctor_sync  --> Attempt to synchronize with
                                                log writer thread
                                 *** DEADLOCK ***
      
      Fix this issue by changing nilfs_segctor_sync() so that it returns
      normally without synchronizing once the log writer thread has
      terminated, and by forcing tasks that are already waiting to complete
      once after the thread terminates.
      
      The skipped inode metadata flushout will then be processed together in the
      subsequent cleanup work in nilfs_segctor_destroy().
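
      The idea can be modelled in a few lines of single-threaded C.  This is
      a simplified sketch, not the nilfs2 implementation: struct writer,
      writer_sync() and writer_kill() are hypothetical names, and real
      waiting is reduced to a counter of pending waiters.

```c
#include <stdbool.h>

/* Hypothetical, simplified state of a log writer. */
struct writer {
	bool thread_running;	/* false once the writer thread is shut down */
	int pending_waiters;	/* tasks waiting for a log write to finish */
};

enum sync_result { SYNC_WOULD_WAIT, SYNC_SKIPPED };

/* Sync path: never start waiting on a thread that no longer exists. */
enum sync_result writer_sync(struct writer *w)
{
	if (!w->thread_running)
		return SYNC_SKIPPED;	/* return normally, cleanup flushes later */
	w->pending_waiters++;
	return SYNC_WOULD_WAIT;
}

/*
 * Shutdown path: stop the thread, then force every task that is already
 * waiting to complete.  Returns how many waiters were released.
 */
int writer_kill(struct writer *w)
{
	int released = w->pending_waiters;

	w->thread_running = false;
	w->pending_waiters = 0;
	return released;
}
```

      In this model a sync request issued after writer_kill() returns
      immediately instead of waiting forever, which is the situation the
      call trace above ends in.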
      
      Link: https://lkml.kernel.org/r/20240520132621.4054-4-konishi.ryusuke@gmail.com
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Reported-by: syzbot+e3973c409251e136fdd0@syzkaller.appspotmail.com
      Closes: https://syzkaller.appspot.com/bug?extid=e3973c409251e136fdd0
      Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Cc: <stable@vger.kernel.org>
      Cc: "Bai, Shuangpeng" <sjb7183@psu.edu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      eb85dace