Commits · 186873c549df11b63e17062f863654e1501e1524 · Kirill Smelkov / linux

21 Feb, 2022 15 commits

random: use simpler fast key erasure flow on per-cpu keys · 186873c5

Jason A. Donenfeld authored Feb 07, 2022

Rather than the clunky NUMA full ChaCha state system we had prior, this
commit is closer to the original "fast key erasure RNG" proposal from
<https://blog.cr.yp.to/20170723-random.html>, by simply treating ChaCha
keys on a per-cpu basis.

All entropy is extracted to a base crng key of 32 bytes. This base crng
has a birthdate and a generation counter. When we go to take bytes from
the crng, we first check if the birthdate is too old; if it is, we
reseed per usual. Then we start working on a per-cpu crng.

This per-cpu crng makes sure that it has the same generation counter as
the base crng. If it doesn't, it does fast key erasure with the base
crng key and uses the output as its new per-cpu key, and then updates
its local generation counter. Then, using this per-cpu state, we do
ordinary fast key erasure. Half of this first block is used to overwrite
the per-cpu crng key for the next call -- this is the fast key erasure
RNG idea -- and the other half, along with the ChaCha state, is returned
to the caller. If the caller desires more than this remaining half, it
can generate more ChaCha blocks, unlocked, using the now detached ChaCha
state that was just returned. Crypto-wise, this is more or less what we
were doing before, but this simply makes it more explicit and ensures
that we always have backtrack protection by not playing games with a
shared block counter.

The flow looks like this:

──extract()──► base_crng.key ◄──memcpy()───┐
                   │                       │
                   └──chacha()──────┬─► new_base_key
                                    └─► crngs[n].key ◄──memcpy()───┐
                                              │                    │
                                              └──chacha()───┬─► new_key
                                                            └─► random_bytes
                                                                      │
                                                                      └────►

There are a few hairy details around early init. Just as was done
before, prior to having gathered enough entropy, crng_fast_load() and
crng_slow_load() dump bytes directly into the base crng, and when we go
to take bytes from the crng, in that case, we're doing fast key erasure
with the base crng rather than the fast unlocked per-cpu crngs. This is
fine as that's only the state of affairs during very early boot; once
the crng initializes we never use these paths again.

In the process of all this, the APIs into the crng become a bit simpler:
we have get_random_bytes(buf, len) and get_random_bytes_user(buf, len),
which both do what you'd expect. All of the details of fast key erasure
and per-cpu selection happen only in a very short critical section of
crng_make_state(), which selects the right per-cpu key, does the fast
key erasure, and returns a local state to the caller's stack. So, we no
longer have a need for a separate backtrack function, as this happens
all at once here. The API then allows us to extend backtrack protection
to batched entropy without really having to do much at all.

The result is a bit simpler than before and has fewer foot guns. The
init time state machine also gets a lot simpler as we don't need to wait
for workqueues to come online and do deferred work. And the multi-core
performance should be increased significantly, by virtue of having hardly
any locking on the fast path.

Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Jann Horn <jannh@google.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

186873c5

random: absorb fast pool into input pool after fast load · c30c575d

Jason A. Donenfeld authored Feb 09, 2022

During crng_init == 0, we never credit entropy in add_interrupt_
randomness(), but instead dump it directly into the primary_crng. That's
fine, except for the fact that we then wind up throwing away that
entropy later when we switch to extracting from the input pool and
xoring into (and later in this series overwriting) the primary_crng key.
The two other early init sites -- add_hwgenerator_randomness()'s use
crng_fast_load() and add_device_ randomness()'s use of crng_slow_load()
-- always additionally give their inputs to the input pool. But not
add_interrupt_randomness().

This commit fixes that shortcoming by calling mix_pool_bytes() after
crng_fast_load() in add_interrupt_randomness(). That's partially
verboten on PREEMPT_RT, where it implies taking spinlock_t from an IRQ
handler. But this also only happens during early boot and then never
again after that. Plus it's a trylock so it has the same considerations
as calling crng_fast_load(), which we're already using.

Cc: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Suggested-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

c30c575d

random: do not xor RDRAND when writing into /dev/random · 91c2afca

Jason A. Donenfeld authored Feb 08, 2022

Continuing the reasoning of "random: ensure early RDSEED goes through
mixer on init", we don't want RDRAND interacting with anything without
going through the mixer function, as a backdoored CPU could presumably
cancel out data during an xor, which it'd have a harder time doing when
being forced through a cryptographic hash function. There's actually no
need at all to be calling RDRAND in write_pool(), because before we
extract from the pool, we always do so with 32 bytes of RDSEED hashed in
at that stage. Xoring at this stage is needless and introduces a minor
liability.

Cc: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

91c2afca

random: ensure early RDSEED goes through mixer on init · a02cf3d0

Jason A. Donenfeld authored Feb 08, 2022

Continuing the reasoning of "random: use RDSEED instead of RDRAND in
entropy extraction" from this series, at init time we also don't want to
be xoring RDSEED directly into the crng. Instead it's safer to put it
into our entropy collector and then re-extract it, so that it goes
through a hash function with preimage resistance. As a matter of hygiene,
we also order these now so that the RDSEED byte are hashed in first,
followed by the bytes that are likely more predictable (e.g. utsname()).

Cc: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

a02cf3d0

random: inline leaves of rand_initialize() · 85664172

Jason A. Donenfeld authored Feb 08, 2022

This is a preparatory commit for the following one. We simply inline the
various functions that rand_initialize() calls that have no other
callers. The compiler was doing this anyway before. Doing this will
allow us to reorganize this after. We can then move the trust_cpu and
parse_trust_cpu definitions a bit closer to where they're actually used,
which makes the code easier to read.

Cc: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

85664172

random: get rid of secondary crngs · a9412d51

Jason A. Donenfeld authored Feb 06, 2022

As the comment said, this is indeed a "hack". Since it was introduced,
it's been a constant state machine nightmare, with lots of subtle early
boot issues and a wildly complex set of machinery to keep everything in
sync. Rather than continuing to play whack-a-mole with this approach,
this commit simply removes it entirely. This commit is preparation for
"random: use simpler fast key erasure flow on per-cpu keys" in this
series, which introduces a simpler (and faster) mechanism to accomplish
the same thing.

Cc: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

a9412d51

random: use RDSEED instead of RDRAND in entropy extraction · 28f425e5

Jason A. Donenfeld authored Feb 08, 2022

When /dev/random was directly connected with entropy extraction, without
any expansion stage, extract_buf() was called for every 10 bytes of data
read from /dev/random. For that reason, RDRAND was used rather than
RDSEED. At the same time, crng_reseed() was still only called every 5
minutes, so there RDSEED made sense.

Those olden days were also a time when the entropy collector did not use
a cryptographic hash function, which meant most bets were off in terms
of real preimage resistance. For that reason too it didn't matter
_that_ much whether RDSEED was mixed in before or after entropy
extraction; both choices were sort of bad.

But now we have a cryptographic hash function at work, and with that we
get real preimage resistance. We also now only call extract_entropy()
every 5 minutes, rather than every 10 bytes. This allows us to do two
important things.

First, we can switch to using RDSEED in extract_entropy(), as Dominik
suggested. Second, we can ensure that RDSEED input always goes into the
cryptographic hash function with other things before being used
directly. This eliminates a category of attacks in which the CPU knows
the current state of the crng and knows that we're going to xor RDSEED
into it, and so it computes a malicious RDSEED. By going through our
hash function, it would require the CPU to compute a preimage on the
fly, which isn't going to happen.

Cc: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Suggested-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

28f425e5

random: fix locking in crng_fast_load() · 7c2fe2b3

Dominik Brodowski authored Feb 05, 2022

crng_init is protected by primary_crng->lock, so keep holding that lock
when incrementing crng_init from 0 to 1 in crng_fast_load(). The call to
pr_notice() can wait until the lock is released; this code path cannot
be reached twice, as crng_fast_load() aborts early if crng_init > 0.
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

7c2fe2b3

random: remove batched entropy locking · 77760fd7

Jason A. Donenfeld authored Jan 28, 2022

Rather than use spinlocks to protect batched entropy, we can instead
disable interrupts locally, since we're dealing with per-cpu data, and
manage resets with a basic generation counter. At the same time, we
can't quite do this on PREEMPT_RT, where we still want spinlocks-as-
mutexes semantics. So we use a local_lock_t, which provides the right
behavior for each. Because this is a per-cpu lock, that generation
counter is still doing the necessary CPU-to-CPU communication.

This should improve performance a bit. It will also fix the linked splat
that Jonathan received with a PROVE_RAW_LOCK_NESTING=y.
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Suggested-by: Andy Lutomirski <luto@kernel.org>
Reported-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Tested-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Link: https://lore.kernel.org/lkml/YfMa0QgsjCVdRAvJ@latitude/Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

77760fd7

random: remove use_input_pool parameter from crng_reseed() · 5d58ea3a

Eric Biggers authored Feb 04, 2022

The primary_crng is always reseeded from the input_pool, while the NUMA
crngs are always reseeded from the primary_crng.  Remove the redundant
'use_input_pool' parameter from crng_reseed() and just directly check
whether the crng is the primary_crng.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

5d58ea3a

random: make credit_entropy_bits() always safe · a49c010e

Jason A. Donenfeld authored Feb 04, 2022

This is called from various hwgenerator drivers, so rather than having
one "safe" version for userspace and one "unsafe" version for the
kernel, just make everything safe; the checks are cheap and sensible to
have anyway.
Reported-by: Sultan Alsawaf <sultan@kerneltoast.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

a49c010e

random: always wake up entropy writers after extraction · 489c7fc4

Jason A. Donenfeld authored Feb 05, 2022

Now that POOL_BITS == POOL_MIN_BITS, we must unconditionally wake up
entropy writers after every extraction. Therefore there's no point of
write_wakeup_threshold, so we can move it to the dustbin of unused
compatibility sysctls. While we're at it, we can fix a small comparison
where we were waking up after <= min rather than < min.

Cc: Theodore Ts'o <tytso@mit.edu>
Suggested-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

489c7fc4

random: use linear min-entropy accumulation crediting · c5704490

Jason A. Donenfeld authored Feb 03, 2022

30e37ec5 ("random: account for entropy loss due to overwrites")
assumed that adding new entropy to the LFSR pool probabilistically
cancelled out old entropy there, so entropy was credited asymptotically,
approximating Shannon entropy of independent sources (rather than a
stronger min-entropy notion) using 1/8th fractional bits and replacing
a constant 2-2/√𝑒 term (~0.786938) with 3/4 (0.75) to slightly
underestimate it. This wasn't superb, but it was perhaps better than
nothing, so that's what was done. Which entropy specifically was being
cancelled out and how much precisely each time is hard to tell, though
as I showed with the attack code in my previous commit, a motivated
adversary with sufficient information can actually cancel out
everything.

Since we're no longer using an LFSR for entropy accumulation, this
probabilistic cancellation is no longer relevant. Rather, we're now
using a computational hash function as the accumulator and we've
switched to working in the random oracle model, from which we can now
revisit the question of min-entropy accumulation, which is done in
detail in <https://eprint.iacr.org/2019/198>.

Consider a long input bit string that is built by concatenating various
smaller independent input bit strings. Each one of these inputs has a
designated min-entropy, which is what we're passing to
credit_entropy_bits(h). When we pass the concatenation of these to a
random oracle, it means that an adversary trying to receive back the
same reply as us would need to become certain about each part of the
concatenated bit string we passed in, which means becoming certain about
all of those h values. That means we can estimate the accumulation by
simply adding up the h values in calls to credit_entropy_bits(h);
there's no probabilistic cancellation at play like there was said to be
for the LFSR. Incidentally, this is also what other entropy accumulators
based on computational hash functions do as well.

So this commit replaces credit_entropy_bits(h) with essentially `total =
min(POOL_BITS, total + h)`, done with a cmpxchg loop as before.

What if we're wrong and the above is nonsense? It's not, but let's
assume we don't want the actual _behavior_ of the code to change much.
Currently that behavior is not extracting from the input pool until it
has 128 bits of entropy in it. With the old algorithm, we'd hit that
magic 128 number after roughly 256 calls to credit_entropy_bits(1). So,
we can retain more or less the old behavior by waiting to extract from
the input pool until it hits 256 bits of entropy using the new code. For
people concerned about this change, it means that there's not that much
practical behavioral change. And for folks actually trying to model
the behavior rigorously, it means that we have an even higher margin
against attacks.

Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

c5704490

random: simplify entropy debiting · 9c07f578

Jason A. Donenfeld authored Feb 02, 2022

Our pool is 256 bits, and we only ever use all of it or don't use it at
all, which is decided by whether or not it has at least 128 bits in it.
So we can drastically simplify the accounting and cmpxchg loop to do
exactly this.  While we're at it, we move the minimum bit size into a
constant so it can be shared between the two places where it matters.

The reason we want any of this is for the case in which an attacker has
compromised the current state, and then bruteforces small amounts of
entropy added to it. By demanding a particular minimum amount of entropy
be present before reseeding, we make that bruteforcing difficult.

Note that this rationale no longer includes anything about /dev/random
blocking at the right moment, since /dev/random no longer blocks (except
for at ~boot), but rather uses the crng. In a former life, /dev/random
was different and therefore required a more nuanced account(), but this
is no longer.

Behaviorally, nothing changes here. This is just a simplification of
the code.

Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

9c07f578

random: use computational hash for entropy extraction · 6e8ec255

Jason A. Donenfeld authored Jan 16, 2022

The current 4096-bit LFSR used for entropy collection had a few
desirable attributes for the context in which it was created. For
example, the state was huge, which meant that /dev/random would be able
to output quite a bit of accumulated entropy before blocking. It was
also, in its time, quite fast at accumulating entropy byte-by-byte,
which matters given the varying contexts in which mix_pool_bytes() is
called. And its diffusion was relatively high, which meant that changes
would ripple across several words of state rather quickly.

However, it also suffers from a few security vulnerabilities. In
particular, inputs learned by an attacker can be undone, but moreover,
if the state of the pool leaks, its contents can be controlled and
entirely zeroed out. I've demonstrated this attack with this SMT2
script, <https://xn--4db.cc/5o9xO8pb>, which Boolector/CaDiCal solves in
a matter of seconds on a single core of my laptop, resulting in little
proof of concept C demonstrators such as <https://xn--4db.cc/jCkvvIaH/c>.

For basically all recent formal models of RNGs, these attacks represent
a significant cryptographic flaw. But how does this manifest
practically? If an attacker has access to the system to such a degree
that he can learn the internal state of the RNG, arguably there are
other lower hanging vulnerabilities -- side-channel, infoleak, or
otherwise -- that might have higher priority. On the other hand, seed
files are frequently used on systems that have a hard time generating
much entropy on their own, and these seed files, being files, often leak
or are duplicated and distributed accidentally, or are even seeded over
the Internet intentionally, where their contents might be recorded or
tampered with. Seen this way, an otherwise quasi-implausible
vulnerability is a bit more practical than initially thought.

Another aspect of the current mix_pool_bytes() function is that, while
its performance was arguably competitive for the time in which it was
created, it's no longer considered so. This patch improves performance
significantly: on a high-end CPU, an i7-11850H, it improves performance
of mix_pool_bytes() by 225%, and on a low-end CPU, a Cortex-A7, it
improves performance by 103%.

This commit replaces the LFSR of mix_pool_bytes() with a straight-
forward cryptographic hash function, BLAKE2s, which is already in use
for pool extraction. Universal hashing with a secret seed was considered
too, something along the lines of <https://eprint.iacr.org/2013/338>,
but the requirement for a secret seed makes for a chicken & egg problem.
Instead we go with a formally proven scheme using a computational hash
function, described in sections 5.1, 6.4, and B.1.8 of
<https://eprint.iacr.org/2019/198>.

BLAKE2s outputs 256 bits, which should give us an appropriate amount of
min-entropy accumulation, and a wide enough margin of collision
resistance against active attacks. mix_pool_bytes() becomes a simple
call to blake2s_update(), for accumulation, while the extraction step
becomes a blake2s_final() to generate a seed, with which we can then do
a HKDF-like or BLAKE2X-like expansion, the first part of which we fold
back as an init key for subsequent blake2s_update()s, and the rest we
produce to the caller. This then is provided to our CRNG like usual. In
that expansion step, we make opportunistic use of 32 bytes of RDRAND
output, just as before. We also always reseed the crng with 32 bytes,
unconditionally, or not at all, rather than sometimes with 16 as before,
as we don't win anything by limiting beyond the 16 byte threshold.

Going for a hash function as an entropy collector is a conservative,
proven approach. The result of all this is a much simpler and much less
bespoke construction than what's there now, which not only plugs a
vulnerability but also improves performance considerably.

Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

6e8ec255

20 Feb, 2022 13 commits

Linux 5.17-rc5 · cfb92440
Linus Torvalds authored Feb 20, 2022

cfb92440

Merge tag 'locking_urgent_for_v5.17_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 3324e6e8

Linus Torvalds authored Feb 20, 2022

Pull locking fix from Borislav Petkov:
 "Fix a NULL ptr dereference when dumping lockdep chains through
  /proc/lockdep_chains"

* tag 'locking_urgent_for_v5.17_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  lockdep: Correct lock_classes index mapping

3324e6e8

Merge tag 'x86_urgent_for_v5.17_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 22217739

Linus Torvalds authored Feb 20, 2022

Pull x86 fixes from Borislav Petkov:

 - Fix the ptrace regset xfpregs_set() callback to behave according to
   the ABI

 - Handle poisoned pages properly in the SGX reclaimer code

* tag 'x86_urgent_for_v5.17_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/ptrace: Fix xfpregs_set()'s incorrect xmm clearing
  x86/sgx: Fix missing poison handling in reclaimer

22217739

Merge tag 'sched_urgent_for_v5.17_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 0b0894ff

Linus Torvalds authored Feb 20, 2022

Pull scheduler fix from Borislav Petkov:
 "Fix task exposure order when forking tasks"

* tag 'sched_urgent_for_v5.17_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched: Fix yet more sched_fork() races

0b0894ff

Merge tag 'edac_urgent_for_v5.17_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras · 6e8e752f

Linus Torvalds authored Feb 20, 2022

Pull EDAC fix from Borislav Petkov:
 "Fix a long-standing struct alignment bug in the EDAC struct allocation
  code"

* tag 'edac_urgent_for_v5.17_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC: Fix calculation of returned address and next offset in edac_align_ptr()

6e8e752f

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · e268d708

Linus Torvalds authored Feb 20, 2022

Pull SCSI fixes from James Bottomley:
 "Three fixes, all in drivers.

  The ufs and qedi fixes are minor; the lpfc one is a bit bigger because
  it involves adding a heuristic to detect and deal with common but not
  standards compliant behaviour"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ufs: core: Fix divide by zero in ufshcd_map_queues()
  scsi: lpfc: Fix pt2pt NVMe PRLI reject LOGO loop
  scsi: qedi: Fix ABBA deadlock in qedi_process_tmf_resp() and qedi_process_cmd_cleanup_resp()

e268d708

Merge tag 'dmaengine-fix-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine · 77478077

Linus Torvalds authored Feb 20, 2022

Pull dmaengine fixes from Vinod Koul:
 "A bunch of driver fixes for:

   - ptdma error handling in init

   - lock fix in at_hdmac

   - error path and error num fix for sh dma

   - pm balance fix for stm32"

* tag 'dmaengine-fix-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
  dmaengine: shdma: Fix runtime PM imbalance on error
  dmaengine: sh: rcar-dmac: Check for error num after dma_set_max_seg_size
  dmaengine: stm32-dmamux: Fix PM disable depth imbalance in stm32_dmamux_probe
  dmaengine: sh: rcar-dmac: Check for error num after setting mask
  dmaengine: at_xdmac: Fix missing unlock in at_xdmac_tasklet()
  dmaengine: ptdma: Fix the error handling path in pt_core_init()

77478077

Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · dacec3e7

Linus Torvalds authored Feb 20, 2022

Pull i2c fixes from Wolfram Sang:
 "Some driver updates, a MAINTAINERS fix, and additions to COMPILE_TEST
  (so we won't miss build problems again)"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  MAINTAINERS: remove duplicate entry for i2c-qcom-geni
  i2c: brcmstb: fix support for DSL and CM variants
  i2c: qup: allow COMPILE_TEST
  i2c: imx: allow COMPILE_TEST
  i2c: cadence: allow COMPILE_TEST
  i2c: qcom-cci: don't put a device tree node before i2c_add_adapter()
  i2c: qcom-cci: don't delete an unregistered adapter
  i2c: bcm2835: Avoid clock stretching timeouts

dacec3e7

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 961af9db

Linus Torvalds authored Feb 20, 2022

Pull input fixes from Dmitry Torokhov:

 - a fix for Synaptics touchpads in RMI4 mode failing to suspend/resume
   properly because I2C client devices are now being suspended and
   resumed asynchronously which changed the ordering

 - a change to make sure we do not set right and middle buttons
   capabilities on touchpads that are "buttonpads" (i.e. do not have
   separate physical buttons)

 - a change to zinitix touchscreen driver adding more compatible
   strings/IDs

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: psmouse - set up dependency between PS/2 and SMBus companions
  Input: zinitix - add new compatible strings
  Input: clear BTN_RIGHT/MIDDLE on buttonpads

961af9db

Merge tag 'for-v5.17-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply · 70d2bec7

Linus Torvalds authored Feb 20, 2022

Pull power supply fixes from Sebastian Reichel:
 "Three regression fixes for the 5.17 cycle:

   - build warning fix for power-supply documentation

   - pointer size fix in cw2015 battery driver

   - OOM handling in bq256xx charger driver"

* tag 'for-v5.17-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
  power: supply: bq256xx: Handle OOM correctly
  power: supply: core: fix application of sizeof to pointer
  power: supply: fix table problem in sysfs-class-power

70d2bec7

Merge tag 'fs.mount_setattr.v5.17-rc4' of... · 7f25f041

Linus Torvalds authored Feb 20, 2022

Merge tag 'fs.mount_setattr.v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull mount_setattr test/doc fixes from Christian Brauner:
 "This contains a fix for one of the selftests for the mount_setattr
  syscall to create idmapped mounts, an entry for idmapped mounts for
  maintainers, and missing kernel documentation for the helper we split
  out some time ago to get and yield write access to a mount when
  changing mount properties"

* tag 'fs.mount_setattr.v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  fs: add kernel doc for mnt_{hold,unhold}_writers()
  MAINTAINERS: add entry for idmapped mounts
  tests: fix idmapped mount_setattr test

7f25f041

Merge tag 'pidfd.v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux · c1034d24

Linus Torvalds authored Feb 20, 2022

Pull pidfd fix from Christian Brauner:
 "This fixes a problem reported by lockdep when installing a pidfd via
  fd_install() with siglock and the tasklisk write lock held in
  copy_process() when calling clone()/clone3() with CLONE_PIDFD.

  Originally a pidfd was created prior to holding any of these locks but
  this required a call to ksys_close(). So quite some time ago in
  6fd2fe49 ("copy_process(): don't use ksys_close() on cleanups") we
  switched to a get_unused_fd_flags() + fd_install() model.

  As part of that we moved fd_install() as late as possible. This was
  done for two main reasons. First, because we needed to ensure that we
  call fd_install() past the point of no return as once that's called
  the fd is live in the task's file table. Second, because we tried to
  ensure that the fd is visible in /proc/<pid>/fd/<pidfd> right when the
  task is visible.

  This fix moves the fd_install() to an even later point which means
  that a task will be visible in proc while the pidfd isn't yet under
  /proc/<pid>/fd/<pidfd>.

  While this is a user visible change it's very unlikely that this will
  have any impact. Nobody should be relying on that and if they do we
  need to come up with something better but again, it's doubtful this is
  relevant"

* tag 'pidfd.v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  copy_process(): Move fd_install() out of sighand->siglock critical section

c1034d24

Merge branch 'ucount-rlimit-fixes-for-v5.17' of... · 2d3409eb

Linus Torvalds authored Feb 20, 2022

Merge branch 'ucount-rlimit-fixes-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace

Pull ucounts fixes from Eric Biederman:
 "Michal Koutný recently found some bugs in the enforcement of
  RLIMIT_NPROC in the recent ucount rlimit implementation.

  In this set of patches I have developed a very conservative approach
  changing only what is necessary to fix the bugs that I can see
  clearly. Cleanups and anything that is making the code more consistent
  can follow after we have the code working as it has historically.

  The problem is not so much inconsistencies (although those exist) but
  that it is very difficult to figure out what the code should be doing
  in the case of RLIMIT_NPROC.

  All other rlimits are only enforced where the resource is acquired
  (allocated). RLIMIT_NPROC by necessity needs to be enforced in an
  additional location, and our current implementation stumbled it's way
  into that implementation"

* 'ucount-rlimit-fixes-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  ucounts: Handle wrapping in is_ucounts_overlimit
  ucounts: Move RLIMIT_NPROC handling after set_user
  ucounts: Base set_cred_ucounts changes on the real user
  ucounts: Enforce RLIMIT_NPROC not RLIMIT_NPROC+1
  rlimit: Fix RLIMIT_NPROC enforcement failure caused by capability calls in set_user

2d3409eb

19 Feb, 2022 5 commits

MAINTAINERS: remove duplicate entry for i2c-qcom-geni · 2428766e

Wolfram Sang authored Feb 18, 2022

The driver is already covered in the ARM/QUALCOMM section. Also, Akash
Asthana's email bounces meanwhile and Mukesh Savaliya has never
responded to mails regarding this driver.
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Wolfram Sang <wsa@kernel.org>

2428766e

sched: Fix yet more sched_fork() races · b1e82065

Peter Zijlstra authored Feb 14, 2022

Where commit 4ef0c5c6 ("kernel/sched: Fix sched_fork() access an
invalid sched_task_group") fixed a fork race vs cgroup, it opened up a
race vs syscalls by not placing the task on the runqueue before it
gets exposed through the pidhash.

Commit 13765de8 ("sched/fair: Fix fault in reweight_entity") is
trying to fix a single instance of this, instead fix the whole class
of issues, effectively reverting this commit.

Fixes: 4ef0c5c6 ("kernel/sched: Fix sched_fork() access an invalid sched_task_group")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Tested-by: Zhang Qiao <zhangqiao22@huawei.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lkml.kernel.org/r/YgoeCbwj5mbCR0qA@hirez.programming.kicks-ass.net

b1e82065

Merge tag 'nfs-for-5.17-3' of git://git.linux-nfs.org/projects/anna/linux-nfs · 4f12b742

Linus Torvalds authored Feb 18, 2022

Pull NFS client bugfixes from Anna Schumaker:

 - Fix unnecessary changeattr revalidations

 - Fix resolving symlinks during directory lookups

 - Don't report writeback errors in nfs_getattr()

* tag 'nfs-for-5.17-3' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  NFS: Do not report writeback errors in nfs_getattr()
  NFS: LOOKUP_DIRECTORY is also ok with symlinks
  NFS: Remove an incorrect revalidation in nfs4_update_changeattr_locked()

4f12b742

Merge tag 'acpi-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 1c2a33d0

Linus Torvalds authored Feb 18, 2022

Pull ACPI fixes from Rafael Wysocki:
 "These make an excess warning message go away and fix a recently
  introduced boot failure on a vintage machine.

  Specifics:

   - Change the log level of the "table not found" message in
     acpi_table_parse_entries_array() to debug to prevent it from
     showing up in the logs unnecessarily (Dan Williams)

   - Add a C-state limit quirk for 32-bit ThinkPad T40 to prevent it
     from crashing on boot after recent changes in the ACPI processor
     driver (Woody Suwalski)"

* tag 'acpi-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: processor: idle: fix lockup regression on 32-bit ThinkPad T40
  ACPI: tables: Quiet ACPI table not found warning

1c2a33d0

Merge tag 'riscv-for-linus-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 241c32d8

Linus Torvalds authored Feb 18, 2022

Pull RISC-V fixes from Palmer Dabbelt:
 "A set of three fixes, all aimed at fixing some fallout from the recent
  sparse hart ID support"

* tag 'riscv-for-linus-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  RISC-V: Fix IPI/RFENCE hmask on non-monotonic hartid ordering
  RISC-V: Fix handling of empty cpu masks
  RISC-V: Fix hartid mask handling for hartid 31 and up

241c32d8

18 Feb, 2022 7 commits

Input: psmouse - set up dependency between PS/2 and SMBus companions · 7b1f781f

Dmitry Torokhov authored Feb 15, 2022

When we switch from emulated PS/2 to native (RMI4 or Elan) protocols, we
create SMBus companion devices that are attached to I2C/SMBus controllers.
However, when suspending and resuming, we also need to make sure that we
take into account the PS/2 device they are associated with, so that PS/2
device is suspended after the companion and resumed before it, otherwise
companions will not work properly. Before I2C devices were marked for
asynchronous suspend/resume, this ordering happened naturally, but now we
need to enforce it by establishing device links, with PS/2 devices being
suppliers and SMBus companions being consumers.

Fixes: 172d9319 ("i2c: enable async suspend/resume on i2c client devices")
Reported-and-tested-by: Hugh Dickins <hughd@google.com>
Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Link: https://lore.kernel.org/r/89456fcd-a113-4c82-4b10-a9bcaefac68f@google.com
Link: https://lore.kernel.org/r/YgwQN8ynO88CPMju@google.comSigned-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>

7b1f781f

Merge branch 'acpi-processor' · 82926564

Rafael J. Wysocki authored Feb 18, 2022

Merge fix for a recent boot lockup regression on 32-bit ThinkPad T40.

* acpi-processor:
  ACPI: processor: idle: fix lockup regression on 32-bit ThinkPad T40

82926564

Merge tag 'mtd/fixes-for-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux · 7993e65f

Linus Torvalds authored Feb 18, 2022

Pull MTD fixes from Miquel Raynal:
 "MTD changes:

   - Qcom:
      - Don't print error message on -EPROBE_DEFER
      - Fix kernel panic on skipped partition
      - Fix missing free for pparts in cleanup

   - phram: Prevent divide by zero bug in phram_setup()

  Raw NAND controller changes:

   - ingenic: Fix missing put_device in ingenic_ecc_get

   - qcom: Fix clock sequencing in qcom_nandc_probe()

   - omap2: Prevent invalid configuration and build error

   - gpmi: Don't leak PM reference in error path

   - brcmnand: Fix incorrect sub-page ECC status"

* tag 'mtd/fixes-for-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
  mtd: rawnand: brcmnand: Fixed incorrect sub-page ECC status
  mtd: rawnand: gpmi: don't leak PM reference in error path
  mtd: phram: Prevent divide by zero bug in phram_setup()
  mtd: rawnand: omap2: Prevent invalid configuration and build error
  mtd: parsers: qcom: Fix missing free for pparts in cleanup
  mtd: parsers: qcom: Fix kernel panic on skipped partition
  mtd: parsers: qcom: Don't print error message on -EPROBE_DEFER
  mtd: rawnand: qcom: Fix clock sequencing in qcom_nandc_probe()
  mtd: rawnand: ingenic: Fix missing put_device in ingenic_ecc_get

7993e65f

Merge tag 'block-5.17-2022-02-17' of git://git.kernel.dk/linux-block · b9889768

Linus Torvalds authored Feb 18, 2022

Pull block fixes from Jens Axboe:

 - Surprise removal fix (Christoph)

 - Ensure that pages are zeroed before submitted for userspace IO
   (Haimin)

 - Fix blk-wbt accounting issue with BFQ (Laibin)

 - Use bsize for discard granularity in loop (Ming)

 - Fix missing zone handling in blk_complete_request() (Pankaj)

* tag 'block-5.17-2022-02-17' of git://git.kernel.dk/linux-block:
  block/wbt: fix negative inflight counter when remove scsi device
  block: fix surprise removal for drivers calling blk_set_queue_dying
  block-map: add __GFP_ZERO flag for alloc_page in function bio_copy_kern
  block: loop:use kstatfs.f_bsize of backing file to set discard granularity
  block: Add handling for zone append command in blk_complete_request

b9889768

Merge tag 'sound-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · 2848551b

Linus Torvalds authored Feb 18, 2022

Pull sound fixes from Takashi Iwai:
 "A collection of small patches, mostly for old and new regressions and
  device-specific fixes.

   - Regression fixes regarding ALSA core SG-buffer helpers

   - Regression fix for Realtek HD-audio mutex deadlock

   - Regression fix for USB-audio PM resume error

   - More coverage of ASoC core control API notification fixes

   - Old regression fixes for HD-audio probe mask

   - Fixes for ASoC Realtek codec work handling

   - Other device-specific quirks / fixes"

* tag 'sound-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (24 commits)
  ASoC: intel: skylake: Set max DMA segment size
  ASoC: SOF: hda: Set max DMA segment size
  ALSA: hda: Set max DMA segment size
  ALSA: hda/realtek: Fix deadlock by COEF mutex
  ALSA: usb-audio: Don't abort resume upon errors
  ALSA: hda: Fix missing codec probe on Shenker Dock 15
  ALSA: hda: Fix regression on forced probe mask option
  ALSA: hda/realtek: Add quirk for Legion Y9000X 2019
  ALSA: usb-audio: revert to IMPLICIT_FB_FIXED_DEV for M-Audio FastTrack Ultra
  ASoC: wm_adsp: Correct control read size when parsing compressed buffer
  ASoC: qcom: Actually clear DMA interrupt register for HDMI
  ALSA: memalloc: invalidate SG pages before sync
  ALSA: memalloc: Fix dma_need_sync() checks
  MAINTAINERS: update cros_ec_codec maintainers
  ASoC: rt5682: do not block workqueue if card is unbound
  ASoC: rt5668: do not block workqueue if card is unbound
  ASoC: rt5682s: do not block workqueue if card is unbound
  ASoC: tas2770: Insert post reset delay
  ASoC: Revert "ASoC: mediatek: Check for error clk pointer"
  ASoC: amd: acp: Set gpio_spkr_en to None for max speaker amplifer in machine driver
  ...

2848551b

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 45a98a71

Linus Torvalds authored Feb 18, 2022

Pull arm64 fix from Catalin Marinas:
 "Fix wrong branch label in the EL2 GICv3 initialisation code"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Correct wrong label in macro __init_el2_gicv3

45a98a71

Merge tag 'powerpc-5.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · ea4b3d29

Linus Torvalds authored Feb 18, 2022

Pull powerpc fixes from Michael Ellerman:

 - Fix boot failure on 603 with DEBUG_PAGEALLOC and KFENCE

 - Fix 32-build with newer binutils that rejects 'ptesync' etc

Thanks to Anders Roxell, Christophe Leroy, and Maxime Bizon.

* tag 'powerpc-5.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/lib/sstep: fix 'ptesync' build error
  powerpc/603: Fix boot failure with DEBUG_PAGEALLOC and KFENCE

ea4b3d29