Commits · 4ba4f1afb6a9fed8ef896c2363076e36572f71da · Kirill Smelkov / linux

10 Sep, 2024 1 commit

perf: Generic hotplug support for a PMU with a scope · 4ba4f1af

Kan Liang authored Aug 02, 2024

The perf subsystem assumes that the counters of a PMU are per-CPU. So
the user space tool reads a counter from each CPU in the system wide
mode. However, many PMUs don't have a per-CPU counter. The counter is
effective for a scope, e.g., a die or a socket. To address this, a
cpumask is exposed by the kernel driver to restrict to one CPU to stand
for a specific scope. In case the given CPU is removed,
the hotplug support has to be implemented for each such driver.

The codes to support the cpumask and hotplug are very similar.
- Expose a cpumask into sysfs
- Pickup another CPU in the same scope if the given CPU is removed.
- Invoke the perf_pmu_migrate_context() to migrate to a new CPU.
- In event init, always set the CPU in the cpumask to event->cpu

Similar duplicated codes are implemented for each such PMU driver. It
would be good to introduce a generic infrastructure to avoid such
duplication.

5 popular scopes are implemented here, core, die, cluster, pkg, and
the system-wide. The scope can be set when a PMU is registered. If so, a
"cpumask" is automatically exposed for the PMU.

The "cpumask" is from the perf_online_<scope>_mask, which is to track
the active CPU for each scope. They are set when the first CPU of the
scope is online via the generic perf hotplug support. When a
corresponding CPU is removed, the perf_online_<scope>_mask is updated
accordingly and the PMU will be moved to a new CPU from the same scope
if possible.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240802151643.1691631-2-kan.liang@linux.intel.com

4ba4f1af

05 Sep, 2024 10 commits

uprobes: perform lockless SRCU-protected uprobes_tree lookup · cd7bdd9d

Andrii Nakryiko authored Sep 03, 2024

Another big bottleneck to scalablity is uprobe_treelock that's taken in
a very hot path in handle_swbp(). Now that uprobes are SRCU-protected,
take advantage of that and make uprobes_tree RB-tree look up lockless.

To make RB-tree RCU-protected lockless lookup correct, we need to take
into account that such RB-tree lookup can return false negatives if there
are parallel RB-tree modifications (rotations) going on. We use seqcount
lock to detect whether RB-tree changed, and if we find nothing while
RB-tree got modified inbetween, we just retry. If uprobe was found, then
it's guaranteed to be a correct lookup.

With all the lock-avoiding changes done, we get a pretty decent
improvement in performance and scalability of uprobes with number of
CPUs, even though we are still nowhere near linear scalability. This is
due to SRCU not really scaling very well with number of CPUs on
a particular hardware that was used for testing (80-core Intel Xeon Gold
6138 CPU @ 2.00GHz), but also due to the remaning mmap_lock, which is
currently taken to resolve interrupt address to inode+offset and then
uprobe instance. And, of course, uretprobes still need similar RCU to
avoid refcount in the hot path, which will be addressed in the follow up
patches.

Nevertheless, the improvement is good. We used BPF selftest-based
uprobe-nop and uretprobe-nop benchmarks to get the below numbers,
varying number of CPUs on which uprobes and uretprobes are triggered.

BASELINE
========
uprobe-nop      ( 1 cpus):    3.032 ± 0.023M/s  (  3.032M/s/cpu)
uprobe-nop      ( 2 cpus):    3.452 ± 0.005M/s  (  1.726M/s/cpu)
uprobe-nop      ( 4 cpus):    3.663 ± 0.005M/s  (  0.916M/s/cpu)
uprobe-nop      ( 8 cpus):    3.718 ± 0.038M/s  (  0.465M/s/cpu)
uprobe-nop      (16 cpus):    3.344 ± 0.008M/s  (  0.209M/s/cpu)
uprobe-nop      (32 cpus):    2.288 ± 0.021M/s  (  0.071M/s/cpu)
uprobe-nop      (64 cpus):    3.205 ± 0.004M/s  (  0.050M/s/cpu)

uretprobe-nop   ( 1 cpus):    1.979 ± 0.005M/s  (  1.979M/s/cpu)
uretprobe-nop   ( 2 cpus):    2.361 ± 0.005M/s  (  1.180M/s/cpu)
uretprobe-nop   ( 4 cpus):    2.309 ± 0.002M/s  (  0.577M/s/cpu)
uretprobe-nop   ( 8 cpus):    2.253 ± 0.001M/s  (  0.282M/s/cpu)
uretprobe-nop   (16 cpus):    2.007 ± 0.000M/s  (  0.125M/s/cpu)
uretprobe-nop   (32 cpus):    1.624 ± 0.003M/s  (  0.051M/s/cpu)
uretprobe-nop   (64 cpus):    2.149 ± 0.001M/s  (  0.034M/s/cpu)

SRCU CHANGES
============
uprobe-nop      ( 1 cpus):    3.276 ± 0.005M/s  (  3.276M/s/cpu)
uprobe-nop      ( 2 cpus):    4.125 ± 0.002M/s  (  2.063M/s/cpu)
uprobe-nop      ( 4 cpus):    7.713 ± 0.002M/s  (  1.928M/s/cpu)
uprobe-nop      ( 8 cpus):    8.097 ± 0.006M/s  (  1.012M/s/cpu)
uprobe-nop      (16 cpus):    6.501 ± 0.056M/s  (  0.406M/s/cpu)
uprobe-nop      (32 cpus):    4.398 ± 0.084M/s  (  0.137M/s/cpu)
uprobe-nop      (64 cpus):    6.452 ± 0.000M/s  (  0.101M/s/cpu)

uretprobe-nop   ( 1 cpus):    2.055 ± 0.001M/s  (  2.055M/s/cpu)
uretprobe-nop   ( 2 cpus):    2.677 ± 0.000M/s  (  1.339M/s/cpu)
uretprobe-nop   ( 4 cpus):    4.561 ± 0.003M/s  (  1.140M/s/cpu)
uretprobe-nop   ( 8 cpus):    5.291 ± 0.002M/s  (  0.661M/s/cpu)
uretprobe-nop   (16 cpus):    5.065 ± 0.019M/s  (  0.317M/s/cpu)
uretprobe-nop   (32 cpus):    3.622 ± 0.003M/s  (  0.113M/s/cpu)
uretprobe-nop   (64 cpus):    3.723 ± 0.002M/s  (  0.058M/s/cpu)

Peak througput increased from 3.7 mln/s (uprobe triggerings) up to about
8 mln/s. For uretprobes it's a bit more modest with bump from 2.4 mln/s
to 5mln/s.
Suggested-by: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20240903174603.3554182-8-andrii@kernel.org

cd7bdd9d

rbtree: provide rb_find_rcu() / rb_find_add_rcu() · 50a38035

Peter Zijlstra authored Sep 03, 2024

Much like latch_tree, add two RCU methods for the regular RB-tree,
which can be used in conjunction with a seqcount to provide lockless
lookups.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20240903174603.3554182-7-andrii@kernel.org

50a38035

perf/uprobe: split uprobe_unregister() · 04b01625

Peter Zijlstra authored Sep 03, 2024

With uprobe_unregister() having grown a synchronize_srcu(), it becomes
fairly slow to call. Esp. since both users of this API call it in a
loop.

Peel off the sync_srcu() and do it once, after the loop.

We also need to add uprobe_unregister_sync() into uprobe_register()'s
error handling path, as we need to be careful about returning to the
caller before we have a guarantee that partially attached consumer won't
be called anymore. This is an unlikely slow path and this should be
totally fine to be slow in the case of a failed attach.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Co-developed-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20240903174603.3554182-6-andrii@kernel.org

04b01625

uprobes: travers uprobe's consumer list locklessly under SRCU protection · cc01bd04

Andrii Nakryiko authored Sep 03, 2024

uprobe->register_rwsem is one of a few big bottlenecks to scalability of
uprobes, so we need to get rid of it to improve uprobe performance and
multi-CPU scalability.

First, we turn uprobe's consumer list to a typical doubly-linked list
and utilize existing RCU-aware helpers for traversing such lists, as
well as adding and removing elements from it.

For entry uprobes we already have SRCU protection active since before
uprobe lookup. For uretprobe we keep refcount, guaranteeing that uprobe
won't go away from under us, but we add SRCU protection around consumer
list traversal.

Lastly, to keep handler_chain()'s UPROBE_HANDLER_REMOVE handling simple,
we remember whether any removal was requested during handler calls, but
then we double-check the decision under a proper register_rwsem using
consumers' filter callbacks. Handler removal is very rare, so this extra
lock won't hurt performance, overall, but we also avoid the need for any
extra protection (e.g., seqcount locks).
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20240903174603.3554182-5-andrii@kernel.org

cc01bd04

uprobes: get rid of enum uprobe_filter_ctx in uprobe filter callbacks · 59da880a

Andrii Nakryiko authored Sep 03, 2024

It serves no purpose beyond adding unnecessray argument passed to the
filter callback. Just get rid of it, no one is actually using it.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20240903174603.3554182-4-andrii@kernel.org

59da880a

uprobes: protected uprobe lifetime with SRCU · 8617408f

Andrii Nakryiko authored Sep 03, 2024

To avoid unnecessarily taking a (brief) refcount on uprobe during
breakpoint handling in handle_swbp for entry uprobes, make find_uprobe()
not take refcount, but protect the lifetime of a uprobe instance with
RCU. This improves scalability, as refcount gets quite expensive due to
cache line bouncing between multiple CPUs.

Specifically, we utilize our own uprobe-specific SRCU instance for this
RCU protection. put_uprobe() will delay actual kfree() using call_srcu().

For now, uretprobe and single-stepping handling will still acquire
refcount as necessary. We'll address these issues in follow up patches
by making them use SRCU with timeout.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20240903174603.3554182-3-andrii@kernel.org

8617408f

uprobes: revamp uprobe refcounting and lifetime management · 3f7f1a64

Andrii Nakryiko authored Sep 03, 2024

Revamp how struct uprobe is refcounted, and thus how its lifetime is
managed.

Right now, there are a few possible "owners" of uprobe refcount:
  - uprobes_tree RB tree assumes one refcount when uprobe is registered
    and added to the lookup tree;
  - while uprobe is triggered and kernel is handling it in the breakpoint
    handler code, temporary refcount bump is done to keep uprobe from
    being freed;
  - if we have uretprobe requested on a given struct uprobe instance, we
    take another refcount to keep uprobe alive until user space code
    returns from the function and triggers return handler.

The uprobe_tree's extra refcount of 1 is confusing and problematic. No
matter how many actual consumers are attached, they all share the same
refcount, and we have an extra logic to drop the "last" (which might not
really be last) refcount once uprobe's consumer list becomes empty.

This is unconventional and has to be kept in mind as a special case all
the time. Further, because of this design we have the situations where
find_uprobe() will find uprobe, bump refcount, return it to the caller,
but that uprobe will still need uprobe_is_active() check, after which
the caller is required to drop refcount and try again. This is just too
many details leaking to the higher level logic.

This patch changes refcounting scheme in such a way as to not have
uprobes_tree keeping extra refcount for struct uprobe. Instead, each
uprobe_consumer is assuming its own refcount, which will be dropped
when consumer is unregistered. Other than that, all the active users of
uprobe (entry and return uprobe handling code) keeps exactly the same
refcounting approach.

With the above setup, once uprobe's refcount drops to zero, we need to
make sure that uprobe's "destructor" removes uprobe from uprobes_tree,
of course. This, though, races with uprobe entry handling code in
handle_swbp(), which, through find_active_uprobe()->find_uprobe() lookup,
can race with uprobe being destroyed after refcount drops to zero (e.g.,
due to uprobe_consumer unregistering). So we add try_get_uprobe(), which
will attempt to bump refcount, unless it already is zero. Caller needs
to guarantee that uprobe instance won't be freed in parallel, which is
the case while we keep uprobes_treelock (for read or write, doesn't
matter).

Note also, we now don't leak the race between registration and
unregistration, so we remove the retry logic completely. If
find_uprobe() returns valid uprobe, it's guaranteed to remain in
uprobes_tree with properly incremented refcount. The race is handled
inside __insert_uprobe() and put_uprobe() working together:
__insert_uprobe() will remove uprobe from RB-tree, if it can't bump
refcount and will retry to insert the new uprobe instance. put_uprobe()
won't attempt to remove uprobe from RB-tree, if it's already not there.
All that is protected by uprobes_treelock, which keeps things simple.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20240903174603.3554182-2-andrii@kernel.org

3f7f1a64

bpf: Fix use-after-free in bpf_uprobe_multi_link_attach() · 5fe6e308

Oleg Nesterov authored Aug 13, 2024

If bpf_link_prime() fails, bpf_uprobe_multi_link_attach() goes to the
error_free label and frees the array of bpf_uprobe's without calling
bpf_uprobe_unregister().

This leaks bpf_uprobe->uprobe and worse, this frees bpf_uprobe->consumer
without removing it from the uprobe->consumers list.

Fixes: 89ae89f5 ("bpf: Add multi uprobe link")
Closes: https://lore.kernel.org/all/000000000000382d39061f59f2dd@google.com/
Reported-by: syzbot+f7a1c2c2711e4a780f19@syzkaller.appspotmail.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: syzbot+f7a1c2c2711e4a780f19@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240813152524.GA7292@redhat.com

5fe6e308

perf/core: Fix small negative period being ignored · 62c0b106

Luo Gengkun authored Aug 31, 2024

In perf_adjust_period, we will first calculate period, and then use
this period to calculate delta. However, when delta is less than 0,
there will be a deviation compared to when delta is greater than or
equal to 0. For example, when delta is in the range of [-14,-1], the
range of delta = delta + 7 is between [-7,6], so the final value of
delta/8 is 0. Therefore, the impact of -1 and -2 will be ignored.
This is unacceptable when the target period is very short, because
we will lose a lot of samples.

Here are some tests and analyzes:
before:
  # perf record -e cs -F 1000  ./a.out
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.022 MB perf.data (518 samples) ]

  # perf script
  ...
  a.out     396   257.956048:         23 cs:  ffffffff81f4eeec schedul>
  a.out     396   257.957891:         23 cs:  ffffffff81f4eeec schedul>
  a.out     396   257.959730:         23 cs:  ffffffff81f4eeec schedul>
  a.out     396   257.961545:         23 cs:  ffffffff81f4eeec schedul>
  a.out     396   257.963355:         23 cs:  ffffffff81f4eeec schedul>
  a.out     396   257.965163:         23 cs:  ffffffff81f4eeec schedul>
  a.out     396   257.966973:         23 cs:  ffffffff81f4eeec schedul>
  a.out     396   257.968785:         23 cs:  ffffffff81f4eeec schedul>
  a.out     396   257.970593:         23 cs:  ffffffff81f4eeec schedul>
  ...

after:
  # perf record -e cs -F 1000  ./a.out
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.058 MB perf.data (1466 samples) ]

  # perf script
  ...
  a.out     395    59.338813:         11 cs:  ffffffff81f4eeec schedul>
  a.out     395    59.339707:         12 cs:  ffffffff81f4eeec schedul>
  a.out     395    59.340682:         13 cs:  ffffffff81f4eeec schedul>
  a.out     395    59.341751:         13 cs:  ffffffff81f4eeec schedul>
  a.out     395    59.342799:         12 cs:  ffffffff81f4eeec schedul>
  a.out     395    59.343765:         11 cs:  ffffffff81f4eeec schedul>
  a.out     395    59.344651:         11 cs:  ffffffff81f4eeec schedul>
  a.out     395    59.345539:         12 cs:  ffffffff81f4eeec schedul>
  a.out     395    59.346502:         13 cs:  ffffffff81f4eeec schedul>
  ...

test.c

int main() {
        for (int i = 0; i < 20000; i++)
                usleep(10);

        return 0;
}

  # time ./a.out
  real    0m1.583s
  user    0m0.040s
  sys     0m0.298s

The above results were tested on x86-64 qemu with KVM enabled using
test.c as test program. Ideally, we should have around 1500 samples,
but the previous algorithm had only about 500, whereas the modified
algorithm now has about 1400. Further more, the new version shows 1
sample per 0.001s, while the previous one is 1 sample per 0.002s.This
indicates that the new algorithm is more sensitive to small negative
values compared to old algorithm.

Fixes: bd2b5b12 ("perf_counter: More aggressive frequency adjustment")
Signed-off-by: Luo Gengkun <luogengkun@huaweicloud.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20240831074316.2106159-2-luogengkun@huaweicloud.com

62c0b106

Merge branch 'perf/urgent' into perf/core, to pick up fixes · 95c13662
Ingo Molnar authored Sep 05, 2024
```
This also refreshes the -rc1 based branch to -rc5.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
```
95c13662

04 Sep, 2024 1 commit

perf/aux: Fix AUX buffer serialization · 2ab9d830

Peter Zijlstra authored Sep 02, 2024

Ole reported that event->mmap_mutex is strictly insufficient to
serialize the AUX buffer, add a per RB mutex to fully serialize it.

Note that in the lock order comment the perf_event::mmap_mutex order
was already wrong, that is, it nesting under mmap_lock is not new with
this patch.

Fixes: 45bfb2e5 ("perf: Add AUX area to ring buffer for raw data streams")
Reported-by: Ole <ole@binarygecko.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>

2ab9d830

03 Sep, 2024 1 commit

uprobes: Use kzalloc to allocate xol area · e240b0fd

Sven Schnelle authored Sep 03, 2024

To prevent unitialized members, use kzalloc to allocate
the xol area.

Fixes: b059a453 ("x86/vdso: Add mremap hook to vm_special_mapping")
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20240903102313.3402529-1-svens@linux.ibm.com

e240b0fd

25 Aug, 2024 6 commits

perf/x86/intel: Limit the period on Haswell · 25dfc9e3

Kan Liang authored Aug 19, 2024

Running the ltp test cve-2015-3290 concurrently reports the following
warnings.

perfevents: irq loop stuck!
  WARNING: CPU: 31 PID: 32438 at arch/x86/events/intel/core.c:3174
  intel_pmu_handle_irq+0x285/0x370
  Call Trace:
   <NMI>
   ? __warn+0xa4/0x220
   ? intel_pmu_handle_irq+0x285/0x370
   ? __report_bug+0x123/0x130
   ? intel_pmu_handle_irq+0x285/0x370
   ? __report_bug+0x123/0x130
   ? intel_pmu_handle_irq+0x285/0x370
   ? report_bug+0x3e/0xa0
   ? handle_bug+0x3c/0x70
   ? exc_invalid_op+0x18/0x50
   ? asm_exc_invalid_op+0x1a/0x20
   ? irq_work_claim+0x1e/0x40
   ? intel_pmu_handle_irq+0x285/0x370
   perf_event_nmi_handler+0x3d/0x60
   nmi_handle+0x104/0x330

Thanks to Thomas Gleixner's analysis, the issue is caused by the low
initial period (1) of the frequency estimation algorithm, which triggers
the defects of the HW, specifically erratum HSW11 and HSW143. (For the
details, please refer https://lore.kernel.org/lkml/87plq9l5d2.ffs@tglx/)

The HSW11 requires a period larger than 100 for the INST_RETIRED.ALL
event, but the initial period in the freq mode is 1. The erratum is the
same as the BDM11, which has been supported in the kernel. A minimum
period of 128 is enforced as well on HSW.

HSW143 is regarding that the fixed counter 1 may overcount 32 with the
Hyper-Threading is enabled. However, based on the test, the hardware
has more issues than it tells. Besides the fixed counter 1, the message
'interrupt took too long' can be observed on any counter which was armed
with a period < 32 and two events expired in the same NMI. A minimum
period of 32 is enforced for the rest of the events.
The recommended workaround code of the HSW143 is not implemented.
Because it only addresses the issue for the fixed counter. It brings
extra overhead through extra MSR writing. No related overcounting issue
has been reported so far.

Fixes: 3a632cb2 ("perf/x86/intel: Add simple Haswell PMU support")
Reported-by: Li Huafei <lihuafei1@huawei.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20240819183004.3132920-1-kan.liang@linux.intel.com
Closes: https://lore.kernel.org/lkml/20240729223328.327835-1-lihuafei1@huawei.com/

25dfc9e3

Linux 6.11-rc5 · 5be63fc1
Linus Torvalds authored Aug 25, 2024

5be63fc1

Merge tag 'bcachefs-2024-08-24' of git://evilpiepirate.org/bcachefs · 72bea05c

Linus Torvalds authored Aug 25, 2024

Pull bcachefs fixes from Kent Overstreet:

 - assorted syzbot fixes

 - some upgrade fixes for old (pre 1.0) filesystems

 - fix for moving data off a device that was switched to durability=0
   after data had been written to it.

 - nocow deadlock fix

 - fix for new rebalance_work accounting

* tag 'bcachefs-2024-08-24' of git://evilpiepirate.org/bcachefs: (28 commits)
  bcachefs: Fix rebalance_work accounting
  bcachefs: Fix failure to flush moves before sleeping in copygc
  bcachefs: don't use rht_bucket() in btree_key_cache_scan()
  bcachefs: add missing inode_walker_exit()
  bcachefs: clear path->should_be_locked in bch2_btree_key_cache_drop()
  bcachefs: Fix double assignment in check_dirent_to_subvol()
  bcachefs: Fix refcounting in discard path
  bcachefs: Fix compat issue with old alloc_v4 keys
  bcachefs: Fix warning in bch2_fs_journal_stop()
  fs/super.c: improve get_tree() error message
  bcachefs: Fix missing validation in bch2_sb_journal_v2_validate()
  bcachefs: Fix replay_now_at() assert
  bcachefs: Fix locking in bch2_ioc_setlabel()
  bcachefs: fix failure to relock in btree_node_fill()
  bcachefs: fix failure to relock in bch2_btree_node_mem_alloc()
  bcachefs: unlock_long() before resort in journal replay
  bcachefs: fix missing bch2_err_str()
  bcachefs: fix time_stats_to_text()
  bcachefs: Fix bch2_bucket_gens_init()
  bcachefs: Fix bch2_trigger_alloc assert
  ...

72bea05c

Merge tag '6.11-rc5-server-fixes' of git://git.samba.org/ksmbd · 780bdc1b

Linus Torvalds authored Aug 25, 2024

Pull smb server fixes from Steve French:

 - query directory flex array fix

 - fix potential null ptr reference in open

 - fix error message in some open cases

 - two minor cleanups

* tag '6.11-rc5-server-fixes' of git://git.samba.org/ksmbd:
  smb/server: update misguided comment of smb2_allocate_rsp_buf()
  smb/server: remove useless assignment of 'file_present' in smb2_open()
  smb/server: fix potential null-ptr-deref of lease_ctx_info in smb2_open()
  smb/server: fix return value of smb2_open()
  ksmbd: the buffer of smb2 query dir response has at least 1 byte

780bdc1b

Merge tag 's390-6.11-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · 48fb4b3d

Linus Torvalds authored Aug 25, 2024

Pull s390 fixes from Vasily Gorbik:

 - Fix KASLR base offset to account for symbol offsets in the vmlinux
   ELF file, preventing tool breakages like the drgn debugger

 - Fix potential memory corruption of physmem_info during kernel
   physical address randomization

 - Fix potential memory corruption due to overlap between the relocated
   lowcore and identity mapping by correctly reserving lowcore memory

 - Fix performance regression and avoid randomizing identity mapping
   base by default

 - Fix unnecessary delay of AP bus binding complete uevent to prevent
   startup lag in KVM guests using AP

* tag 's390-6.11-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/boot: Fix KASLR base offset off by __START_KERNEL bytes
  s390/boot: Avoid possible physmem_info segment corruption
  s390/ap: Refine AP bus bindings complete processing
  s390/mm: Pin identity mapping base to zero
  s390/mm: Prevent lowcore vs identity mapping overlap

48fb4b3d

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 891e811a

Linus Torvalds authored Aug 25, 2024

Pull SCSI fixes from James Bottomley:
 "The important core fix is another tweak to our discard discovery
  issues. The off by 512 in logical block count seems bad, but in fact
  the inline was only ever used in debug prints, which is why no-one
  noticed"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: sd: Do not attempt to configure discard unless LBPME is set
  scsi: MAINTAINERS: Add header files to SCSI SUBSYSTEM
  scsi: ufs: qcom: Add UFSHCD_QUIRK_BROKEN_LSDBS_CAP for SM8550 SoC
  scsi: ufs: core: Add a quirk for handling broken LSDBS field in controller capabilities register
  scsi: core: Fix the return value of scsi_logical_block_count()
  scsi: MAINTAINERS: Update HiSilicon SAS controller driver maintainer

891e811a

24 Aug, 2024 10 commits

bcachefs: Fix rebalance_work accounting · 49aa7830

Kent Overstreet authored Aug 23, 2024

rebalance_work was keying off of the presence of rebelance_opts in the
extent - but that was incorrect, we keep those around after rebalance
for indirect extents since the inode's options are not directly
available

Fixes: 20ac515a ("bcachefs: bch_acct_rebalance_work")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

49aa7830

bcachefs: Fix failure to flush moves before sleeping in copygc · d3204616

Kent Overstreet authored Aug 23, 2024

This fixes an apparent deadlock - rebalance would get stuck trying to
take nocow locks because they weren't being released by copygc.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>

d3204616

Merge tag 'cgroup-for-6.11-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · d2bafcf2

Linus Torvalds authored Aug 24, 2024

Pull cgroup fixes from Tejun Heo:
 "Three patches addressing cpuset corner cases"

* tag 'cgroup-for-6.11-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup/cpuset: Eliminate unncessary sched domains rebuilds in hotplug
  cgroup/cpuset: Clear effective_xcpus on cpus_allowed clearing only if cpus.exclusive not set
  cgroup/cpuset: fix panic caused by partcmd_update

d2bafcf2

Merge tag 'wq-for-6.11-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · cb2c84b3

Linus Torvalds authored Aug 24, 2024

Pull workqueue fixes from Tejun Heo:
 "Nothing too interesting. One patch to remove spurious warning and
  others to address static checker warnings"

* tag 'wq-for-6.11-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: Correct declaration of cpu_pwq in struct workqueue_struct
  workqueue: Fix spruious data race in __flush_work()
  workqueue: Remove incorrect "WARN_ON_ONCE(!list_empty(&worker->entry));" from dying worker
  workqueue: Fix UBSAN 'subtraction overflow' error in shift_and_mask()
  workqueue: doc: Fix function name, remove markers

cb2c84b3

Merge tag 'mips-fixes_6.11_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux · 5bd6cf00

Linus Torvalds authored Aug 24, 2024

Pull MIPS fixes from Thomas Bogendoerfer:

 - Set correct timer mode on Loongson64

 - Only request r4k clockevent interrupt on one CPU

* tag 'mips-fixes_6.11_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: cevt-r4k: Don't call get_c0_compare_int if timer irq is installed
  MIPS: Loongson64: Set timer mode in cpu-probe

5bd6cf00

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · a8a8dcbd

Linus Torvalds authored Aug 24, 2024

Pull arm64 kvm fixes from Catalin Marinas:

 - Don't drop references on LPIs that weren't visited by the vgic-debug
   iterator

 - Cure lock ordering issue when unregistering vgic redistributors

 - Fix for misaligned stage-2 mappings when VMs are backed by hugetlb
   pages

 - Treat SGI registers as UNDEFINED if a VM hasn't been configured for
   GICv3

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  KVM: arm64: Make ICC_*SGI*_EL1 undef in the absence of a vGICv3
  KVM: arm64: Ensure canonical IPA is hugepage-aligned when handling fault
  KVM: arm64: vgic: Don't hold config_lock while unregistering redistributors
  KVM: arm64: vgic-debug: Don't put unmarked LPIs

a8a8dcbd

Merge tag 'nfs-for-6.11-2' of git://git.linux-nfs.org/projects/anna/linux-nfs · 60f0560f

Linus Torvalds authored Aug 24, 2024

Pull NFS client fixes from Anna Schumaker:

 - Fix rpcrdma refcounting in xa_alloc

 - Fix rpcrdma usage of XA_FLAGS_ALLOC

 - Fix requesting FATTR4_WORD2_OPEN_ARGUMENTS

 - Fix attribute bitmap decoder to handle a 3rd word

 - Add reschedule points when returning delegations to avoid soft lockups

 - Fix clearing layout segments in layoutreturn

 - Avoid unnecessary rescanning of the per-server delegation list

* tag 'nfs-for-6.11-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  NFS: Avoid unnecessary rescanning of the per-server delegation list
  NFSv4: Fix clearing of layout segments in layoutreturn
  NFSv4: Add missing rescheduling points in nfs_client_return_marked_delegations
  nfs: fix bitmap decoder to handle a 3rd word
  nfs: fix the fetch of FATTR4_OPEN_ARGUMENTS
  rpcrdma: Trace connection registration and unregistration
  rpcrdma: Use XA_FLAGS_ALLOC instead of XA_FLAGS_ALLOC1
  rpcrdma: Device kref is over-incremented on error from xa_alloc

60f0560f

Merge tag 'v6.11-rc4-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 · 66ace9a8

Linus Torvalds authored Aug 24, 2024

Pull smb client fixes from Steve French:

 - fix refcount leak (can cause rmmod fail)

 - fix byte range locking problem with cached reads

 - fix for mount failure if reparse point unrecognized

 - minor typo

* tag 'v6.11-rc4-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb/client: fix typo: GlobalMid_Sem -> GlobalMid_Lock
  smb: client: ignore unhandled reparse tags
  smb3: fix problem unloading module due to leaked refcount on shutdown
  smb3: fix broken cached reads when posix locks

66ace9a8

Merge tag 'input-for-v6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 7eb61cc6

Linus Torvalds authored Aug 24, 2024

Pull input fixes from Dmitry Torokhov:

 - a tweak to uinput interface to reject requests with abnormally large
   number of slots. 100 slots/contacts should be enough for real devices

 - support for FocalTech FT8201 added to the edt-ft5x06 driver

 - tweaks to i8042 to handle more devices that have issue with its
   emulation

 - Synaptics touchpad switched to native SMbus/RMI mode on HP Elitebook
   840 G2

 - other minor fixes

* tag 'input-for-v6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: himax_hx83112b - fix incorrect size when reading product ID
  Input: i8042 - use new forcenorestore quirk to replace old buggy quirk combination
  Input: i8042 - add forcenorestore quirk to leave controller untouched even on s3
  Input: i8042 - add Fujitsu Lifebook E756 to i8042 quirk table
  Input: uinput - reject requests with unreasonable number of slots
  Input: edt-ft5x06 - add support for FocalTech FT8201
  dt-bindings: input: touchscreen: edt-ft5x06: Document FT8201 support
  Input: adc-joystick - fix optional value handling
  Input: synaptics - enable SMBus for HP Elitebook 840 G2
  Input: ads7846 - ratelimit the spi_sync error message

7eb61cc6

Merge tag 'drm-fixes-2024-08-24' of https://gitlab.freedesktop.org/drm/kernel · 79a899e3

Linus Torvalds authored Aug 24, 2024

Pull drm fixes from Dave Airlie:
 "Weekly fixes. xe and msm are the major groups, with
  amdgpu/i915/nouveau having smaller bits. xe has a bunch of hw
  workaround fixes that were found to be missing, so that is why there
  are a bunch of scattered fixes, and one larger one. But overall size
  doesn't look too out of the ordinary.

  msm:
   - virtual plane fixes:
      - drop yuv on hw where not supported
      - csc vs yuv format fix
      - rotation fix
   - fix fb cleanup on close
   - reset phy before link training
   - fix visual corruption at 4K
   - fix NULL ptr crash on hotplug
   - simplify debug macros
   - sc7180 fix
   - adreno firmware name error path fix

  amdgpu:
   - GFX10 firmware loading fix
   - SDMA 5.2 fix
   - Debugfs parameter validation fix
   - eGPU hotplug fix

  i915:
   - fix HDCP timeouts

  nouveau:
   - fix SG_DEBUG crash

  xe:
   - Fix OA format masks which were breaking build with gcc-5
   - Fix opregion leak (Lucas)
   - Fix OA sysfs entry (Ashutosh)
   - Fix VM dma-resv lock (Brost)
   - Fix tile fini sequence (Brost)
   - Prevent UAF around preempt fence (Auld)
   - Fix DGFX display suspend/resume (Maarten)
   - Many Xe/Xe2 critical workarounds (Auld, Ngai-Mint, Bommu, Tejas, Daniele)
   - Fix devm/drmm issues (Daniele)
   - Fix missing workqueue destroy in xe_gt_pagefault (Stuart)
   - Drop HW fence pointer to HW fence ctx (Brost)
   - Free job before xe_exec_queue_put (Brost)"

* tag 'drm-fixes-2024-08-24' of https://gitlab.freedesktop.org/drm/kernel: (35 commits)
  drm/xe: Free job before xe_exec_queue_put
  drm/xe: Drop HW fence pointer to HW fence ctx
  drm/xe: Fix missing workqueue destroy in xe_gt_pagefault
  drm/amdgpu: fix eGPU hotplug regression
  drm/amdgpu: Validate TA binary size
  drm/amdgpu/sdma5.2: limit wptr workaround to sdma 5.2.1
  drm/amdgpu: fixing rlc firmware loading failure issue
  drm/xe/uc: Use devm to register cleanup that includes exec_queues
  drm/xe: use devm instead of drmm for managed bo
  drm/xe/xe2hpg: Add Wa_14021821874
  drm/xe: fix WA 14018094691
  drm/xe/xe2: Add Wa_15015404425
  drm/xe/xe2: Make subsequent L2 flush sequential
  drm/xe/xe2lpg: Extend workaround 14021402888
  drm/xe/xe2lpm: Extend Wa_16021639441
  drm/xe/bmg: implement Wa_16023588340
  drm/xe/oa/uapi: Make bit masks unsigned
  drm/xe/display: Make display suspend/resume work on discrete
  drm/xe: prevent UAF around preempt fence
  drm/xe: Fix tile fini sequence
  ...

79a899e3

23 Aug, 2024 10 commits

Merge tag 'block-6.11-20240823' of git://git.kernel.dk/linux · d5afaf91

Linus Torvalds authored Aug 24, 2024

Pull block fixes from Jens Axboe:

 - NVMe pull request via Keith
     - Remove unused struct field (Nilay)
     - Fix fabrics keep-alive teardown order (Ming)

 - Write zeroes fixes (John)

* tag 'block-6.11-20240823' of git://git.kernel.dk/linux:
  nvme: Remove unused field
  nvme: move stopping keep-alive into nvme_uninit_ctrl()
  block: Drop NULL check in bdev_write_zeroes_sectors()
  block: Read max write zeroes once for __blkdev_issue_write_zeroes()

d5afaf91

Merge tag 'io_uring-6.11-20240823' of git://git.kernel.dk/linux · 489270f4

Linus Torvalds authored Aug 24, 2024

Pull io_uring fix from Jens Axboe:
 "Just a single fix for provided buffer validation"

* tag 'io_uring-6.11-20240823' of git://git.kernel.dk/linux:
  io_uring/kbuf: sanitize peek buffer setup

489270f4

Merge tag 'acpi-6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · b09f6ca9

Linus Torvalds authored Aug 24, 2024

Pull ACPI fix from Rafael Wysocki:
 "Fix backlight control on a Dell All In One system where a backlight
  controller board is attached to a UART port and the dell-uart
  backlight driver binds to it, but the backlight is actually controlled
  by other means (Hans de Goede)"

* tag 'acpi-6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: video: Add backlight=native quirk for Dell OptiPlex 7760 AIO
  platform/x86: dell-uart-backlight: Use acpi_video_get_backlight_type()
  ACPI: video: Add Dell UART backlight controller detection

b09f6ca9

Merge tag 'thermal-6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 6ae4e48b

Linus Torvalds authored Aug 24, 2024

Pull thermal control fixes from Rafael Wysocki:
 "These fix error handling in the thermal debug code and OF node
  reference leaks in the thermal OF driver.

  Specifics:

   - Use IS_ERR() in checks of debugfs_create_dir() return value instead
     of checking it against NULL in the thermal debug code (Yang Ruibin)

   - Fix three OF node reference leaks in thermal_of (Krzysztof
     Kozlowski)"

* tag 'thermal-6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal: of: Fix OF node leak in of_thermal_zone_find() error paths
  thermal: of: Fix OF node leak in thermal_of_zone_register()
  thermal: of: Fix OF node leak in thermal_of_trips_init() error path
  thermal/debugfs: Fix the NULL vs IS_ERR() confusion in debugfs_create_dir()

6ae4e48b

Merge tag 'mmc-v6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · f76a30a9

Linus Torvalds authored Aug 24, 2024

Pull mmc fixes from Ulf Hansson:
 "MMC core:
   - Fix NULL dereference for mmc_test on allocation failure

  MMC host:
   - dw_mmc: Fix support for deferred probe for biu/ciu clocks
   - mtk-sd: Fix CMD8 support when fragile tuning settings"

* tag 'mmc-v6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: mmc_test: Fix NULL dereference on allocation failure
  mmc: dw_mmc: allow biu and ciu clocks to defer
  mmc: mtk-sd: receive cmd8 data when hs400 tuning fail

f76a30a9

Merge tag 'spi-fix-v6.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · c2a905a6

Linus Torvalds authored Aug 24, 2024

Pull spi fixes from Mark Brown:
 "A small collection of fixes here, all driver specific and none of them
  too serious. For whatever reason runtime PM seems to have been causing
  a bunch of issues recently"

* tag 'spi-fix-v6.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: pxa2xx: Move PM runtime handling to the glue drivers
  spi: pxa2xx: Do not override dev->platform_data on probe
  spi: spi-fsl-lpspi: limit PRESCALE bit in TCR register
  spi: spi-cadence-quadspi: Fix OSPI NOR failures during system resume
  spi: zynqmp-gqspi: Scale timeout by data size

c2a905a6

Merge tag 'pwrseq-fixes-for-v6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux · 3d5f968a

Linus Torvalds authored Aug 23, 2024

Pull power sequencing fix from Bartosz Golaszewski:

 - request the wlan-enable GPIO "as-is" to fix an issue with the wifi
   module being already powered up before linux boots

* tag 'pwrseq-fixes-for-v6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  power: sequencing: request the WLAN enable GPIO as-is

3d5f968a

Merge tag 'pmdomain-v6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm · 3902f60b

Linus Torvalds authored Aug 23, 2024

Pull pmdomain fixes from Ulf Hansson:

 - imx: Remove duplicated clocks for scu power domain

 - imx: Wait for SSAR to complete power-on for i.MX93 power domain

* tag 'pmdomain-v6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
  pmdomain: imx: wait SSAR when i.MX93 power domain on
  pmdomain: imx: scu-pd: Remove duplicated clocks

3902f60b

Merge tag 'kvmarm-fixes-6.11-2' of... · 75c8f387

Catalin Marinas authored Aug 23, 2024

Merge tag 'kvmarm-fixes-6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into for-next/fixes

KVM/arm64 fixes for 6.11, round #2

 - Don't drop references on LPIs that weren't visited by the
   vgic-debug iterator

 - Cure lock ordering issue when unregistering vgic redistributors

 - Fix for misaligned stage-2 mappings when VMs are backed by hugetlb
   pages

 - Treat SGI registers as UNDEFINED if a VM hasn't been configured for
   GICv3

* tag 'kvmarm-fixes-6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm:
  KVM: arm64: Make ICC_*SGI*_EL1 undef in the absence of a vGICv3
  KVM: arm64: Ensure canonical IPA is hugepage-aligned when handling fault
  KVM: arm64: vgic: Don't hold config_lock while unregistering redistributors
  KVM: arm64: vgic-debug: Don't put unmarked LPIs
  KVM: arm64: vgic: Hold config_lock while tearing down a CPU interface
  KVM: selftests: arm64: Correct feature test for S1PIE in get-reg-list
  KVM: arm64: Tidying up PAuth code in KVM
  KVM: arm64: vgic-debug: Exit the iterator properly w/o LPI
  KVM: arm64: Enforce dependency on an ARMv8.4-aware toolchain
  docs: KVM: Fix register ID of SPSR_FIQ
  KVM: arm64: vgic: fix unexpected unlock sparse warnings
  KVM: arm64: fix kdoc warnings in W=1 builds
  KVM: arm64: fix override-init warnings in W=1 builds
  KVM: arm64: free kvm->arch.nested_mmus with kvfree()

75c8f387

Merge tag 'ata-6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux · b78b25f6

Linus Torvalds authored Aug 23, 2024

Pull ata fixes from Damien Le Moal:

 - Fix the max segment size and max number of segments supported by the
   pata_macio driver to fix issues with BIO splitting leading to an
   overflow of the adapter DMA table (from Michael)

 - Related to the previous fix, change BUG_ON() calls for incorrect
   command buffer segmentation into WARN_ON() and an error return (from
   Michael)

* tag 'ata-6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
  ata: pata_macio: Use WARN instead of BUG
  ata: pata_macio: Fix DMA table overflow

b78b25f6

22 Aug, 2024 1 commit

Merge tag 'net-6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · aa0743a2

Linus Torvalds authored Aug 23, 2024

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bluetooth and netfilter.

  Current release - regressions:

   - virtio_net: avoid crash on resume - move netdev_tx_reset_queue()
     call before RX napi enable

  Current release - new code bugs:

   - net/mlx5e: fix page leak and incorrect header release w/ HW GRO

  Previous releases - regressions:

   - udp: fix receiving fraglist GSO packets

   - tcp: prevent refcount underflow due to concurrent execution of
     tcp_sk_exit_batch()

  Previous releases - always broken:

   - ipv6: fix possible UAF when incrementing error counters on output

   - ip6: tunnel: prevent merging of packets with different L2

   - mptcp: pm: fix IDs not being reusable

   - bonding: fix potential crashes in IPsec offload handling

   - Bluetooth: HCI:
      - MGMT: add error handling to pair_device() to avoid a crash
      - invert LE State quirk to be opt-out rather then opt-in
      - fix LE quote calculation

   - drv: dsa: VLAN fixes for Ocelot driver

   - drv: igb: cope with large MAX_SKB_FRAGS Kconfig settings

   - drv: ice: fi Rx data path on architectures with PAGE_SIZE >= 8192

  Misc:

   - netpoll: do not export netpoll_poll_[disable|enable]()

   - MAINTAINERS: update the list of networking headers"

* tag 'net-6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (82 commits)
  s390/iucv: Fix vargs handling in iucv_alloc_device()
  net: ovs: fix ovs_drop_reasons error
  net: xilinx: axienet: Fix dangling multicast addresses
  net: xilinx: axienet: Always disable promiscuous mode
  MAINTAINERS: Mark JME Network Driver as Odd Fixes
  MAINTAINERS: Add header files to NETWORKING sections
  MAINTAINERS: Add limited globs for Networking headers
  MAINTAINERS: Add net_tstamp.h to SOCKET TIMESTAMPING section
  MAINTAINERS: Add sonet.h to ATM section of MAINTAINERS
  octeontx2-af: Fix CPT AF register offset calculation
  net: phy: realtek: Fix setting of PHY LEDs Mode B bit on RTL8211F
  net: ngbe: Fix phy mode set to external phy
  netfilter: flowtable: validate vlan header
  bnxt_en: Fix double DMA unmapping for XDP_REDIRECT
  ipv6: prevent possible UAF in ip6_xmit()
  ipv6: fix possible UAF in ip6_finish_output2()
  ipv6: prevent UAF in ip6_send_skb()
  netpoll: do not export netpoll_poll_[disable|enable]()
  selftests: mlxsw: ethtool_lanes: Source ethtool lib from correct path
  udp: fix receiving fraglist GSO packets
  ...

aa0743a2