1. 06 Jan, 2022 18 commits
    • random: don't reset crng_init_cnt on urandom_read() · 6c8e11e0
      Jann Horn authored
      At the moment, urandom_read() (used for /dev/urandom) resets crng_init_cnt
      to zero when it is called at crng_init<2. This is inconsistent: We do it
      for /dev/urandom reads, but not for the equivalent
      getrandom(GRND_INSECURE).
      
      (And worse, as Jason pointed out, we're only doing this as long as
      maxwarn>0.)
      
      crng_init_cnt is only read in crng_fast_load(); it is relevant at
      crng_init==0 for determining when to switch to crng_init==1 (and where in
      the RNG state array to write).
      
      As far as I understand:
      
       - crng_init==0 means "we have nothing, we might just be returning the same
         exact numbers on every boot on every machine, we don't even have
         non-cryptographic randomness; we should shove every bit of entropy we
         can get into the RNG immediately"
       - crng_init==1 means "well we have something, it might not be
         cryptographic, but at least we're not gonna return the same data every
         time or whatever, it's probably good enough for TCP and ASLR and stuff;
         we now have time to build up actual cryptographic entropy in the input
         pool"
       - crng_init==2 means "this is supposed to be cryptographically secure now,
         but we'll keep adding more entropy just to be sure".
      
      The current code means that if someone is pulling data from /dev/urandom
      fast enough at crng_init==0, we'll keep resetting crng_init_cnt, and we'll
      never make forward progress to crng_init==1. It seems to be intended to
      prevent an attacker from bruteforcing the contents of small individual RNG
      inputs on the way from crng_init==0 to crng_init==1, but that's misguided;
      crng_init==1 isn't supposed to provide proper cryptographic security
      anyway, RNG users who care about getting secure RNG output have to wait
      until crng_init==2.
      
      This code was inconsistent, and it probably made things worse - just get
      rid of it.
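The stall described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `crng_model`, `model_fast_load()`, `model_urandom_read()` and the 64-byte threshold are hypothetical stand-ins and not the code in random.c.

```c
#include <stdbool.h>
#include <stddef.h>

#define FAST_INIT_THRESH 64  /* assumed: bytes needed to leave crng_init==0 */

struct crng_model {
    int init;          /* 0, 1 or 2, mirroring crng_init */
    size_t init_cnt;   /* mirrors crng_init_cnt */
};

/* Model of crng_fast_load(): count input bytes until the threshold is met. */
static void model_fast_load(struct crng_model *m, size_t len)
{
    if (m->init != 0)
        return;
    m->init_cnt += len;
    if (m->init_cnt >= FAST_INIT_THRESH)
        m->init = 1;
}

/* Model of a /dev/urandom read; reset_on_read selects the old behaviour. */
static void model_urandom_read(struct crng_model *m, bool reset_on_read)
{
    if (reset_on_read && m->init < 2)
        m->init_cnt = 0;
}

/* Interleave 8-byte inputs with reads and return the resulting init level. */
static int simulate(bool reset_on_read, int rounds)
{
    struct crng_model m = { 0, 0 };

    for (int i = 0; i < rounds; i++) {
        model_fast_load(&m, 8);
        model_urandom_read(&m, reset_on_read);
    }
    return m.init;
}
```

In this model, a reader that interleaves with every small input keeps the counter at zero indefinitely when the reset is enabled, while without the reset eight 8-byte inputs are enough to reach level 1.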
      Signed-off-by: Jann Horn <jannh@google.com>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: avoid superfluous call to RDRAND in CRNG extraction · 2ee25b69
      Jason A. Donenfeld authored
      RDRAND is not fast. RDRAND is actually quite slow. We've known this for
      a while, which is why functions like get_random_u{32,64} were converted
      to use batching of our ChaCha-based CRNG instead.
      
      Yet CRNG extraction still includes a call to RDRAND, in the hot path of
      every call to get_random_bytes(), /dev/urandom, and getrandom(2).
      
      This call to RDRAND here seems quite superfluous. CRNG is already
      extracting things based on a 256-bit key, based on good entropy, which
      is then reseeded periodically, updated, backtrack-mutated, and so
      forth. The CRNG extraction construction is something that we're already
      relying on to be secure and solid. If it's not, that's a serious
      problem, and it's unlikely that mixing in a measly 32 bits from RDRAND
      is going to alleviate things.
      
      And in the case where the CRNG doesn't have enough entropy yet, we're
      already initializing the ChaCha key row with RDRAND in
      crng_init_try_arch_early().
      
      Removing the call to RDRAND improves performance on an i7-11850H by
      370%. In other words, the vast majority of the work done by
      extract_crng() prior to this commit was devoted to fetching 32 bits of
      RDRAND.
      Reviewed-by: Theodore Ts'o <tytso@mit.edu>
      Acked-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: early initialization of ChaCha constants · 96562f28
      Dominik Brodowski authored
      Previously, the ChaCha constants for the primary pool were only
      initialized in crng_initialize_primary(), called by rand_initialize().
      However, some randomness is actually extracted from the primary pool
      beforehand, e.g. by kmem_cache_create(). Therefore, statically
      initialize the ChaCha constants for the primary pool.
      
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: <linux-crypto@vger.kernel.org>
      Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: use IS_ENABLED(CONFIG_NUMA) instead of ifdefs · 7b873241
      Jason A. Donenfeld authored
      Rather than an awkward combination of ifdefs and __maybe_unused, we can
      ensure more source gets parsed, regardless of the configuration, by
      using IS_ENABLED for the CONFIG_NUMA conditional code. This makes things
      cleaner and easier to follow.
      
      I've confirmed that on !CONFIG_NUMA, we don't wind up with excess code
      by accident; the generated object file is the same.
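The pattern can be sketched in userspace as follows. The macros below are a simplified re-creation of the preprocessor trick behind IS_ENABLED() (the real kernel macro also accounts for `=m` options), and `CONFIG_DEMO_NUMA`, `CONFIG_DEMO_OTHER` and `nodes_to_init()` are hypothetical names used only for illustration.

```c
/* Simplified stand-in for the kernel's IS_ENABLED(): expands to 1 when the
 * option is defined to 1, and to 0 when it is not defined at all. */
#define ARG_PLACEHOLDER_1 0,
#define TAKE_SECOND_ARG(ignored, val, ...) val
#define IS_DEFINED(x) IS_DEFINED_1(x)
#define IS_DEFINED_1(val) IS_DEFINED_2(ARG_PLACEHOLDER_##val)
#define IS_DEFINED_2(arg1_or_junk) TAKE_SECOND_ARG(arg1_or_junk 1, 0, 0)
#define IS_ENABLED(option) IS_DEFINED(option)

#define CONFIG_DEMO_NUMA 1   /* hypothetical option, enabled */
/* CONFIG_DEMO_OTHER is deliberately left undefined */

static int nodes_to_init(int possible_nodes)
{
    /* Both branches are parsed and type-checked in every configuration;
     * the compiler can discard the one that is statically dead. */
    if (!IS_ENABLED(CONFIG_DEMO_NUMA))
        return 1;
    return possible_nodes;
}
```

Compared with `#ifdef`, the disabled branch is still seen by the compiler, so it cannot silently rot when the option is off.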
      Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: harmonize "crng init done" messages · 161212c7
      Dominik Brodowski authored
      We print out "crng init done" for !TRUST_CPU, so we should also print
      out the same for TRUST_CPU.
      Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: mix bootloader randomness into pool · 57826fee
      Jason A. Donenfeld authored
      If we're trusting bootloader randomness, crng_fast_load() is called by
      add_hwgenerator_randomness(), which sets us to crng_init==1. However,
      usually it is only called once for an initial 64-byte push, so bootloader
      entropy will not mix any bytes into the input pool. So it's conceivable
      that crng_init==1 when crng_initialize_primary() is called later, but
      then the input pool is empty. When that happens, the crng state key will
      be overwritten with extracted output from the empty input pool. That's
      bad.
      
      In contrast, if we're not trusting bootloader randomness, we call
      crng_slow_load() *and* we call mix_pool_bytes(), so that later
      crng_initialize_primary() isn't drawing on nothing.
      
      In order to prevent crng_initialize_primary() from extracting an empty
      pool, have the trusted bootloader case mirror that of the untrusted
      bootloader case, mixing the input into the pool.
      
      [linux@dominikbrodowski.net: rewrite commit message]
      Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: do not throw away excess input to crng_fast_load · 73c7733f
      Jason A. Donenfeld authored
      When crng_fast_load() is called by add_hwgenerator_randomness(), we
      currently will advance to crng_init==1 once we've acquired 64 bytes, and
      then throw away the rest of the buffer. Usually, that is not a problem:
      When add_hwgenerator_randomness() gets called via EFI or DT during
      setup_arch(), there won't be any IRQ randomness. Therefore, the 64 bytes
      passed by EFI exactly matches what is needed to advance to crng_init==1.
      Usually, DT seems to pass 64 bytes as well -- with one notable exception
      being kexec, which hands over 128 bytes of entropy to the kexec'd kernel.
      In that case, we'll advance to crng_init==1 once 64 of those bytes are
      consumed by crng_fast_load(), but won't continue onward feeding in bytes
      to progress to crng_init==2. This commit fixes the issue by feeding
      any leftover bytes into the next phase in add_hwgenerator_randomness().
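The byte accounting described above can be sketched as a small model. It assumes a 64-byte first phase, as stated in the message; `seed_split` and `split_seed()` are hypothetical names and do not appear in random.c.

```c
#include <stddef.h>

#define FAST_LOAD_BYTES 64  /* assumed size consumed to reach crng_init==1 */

struct seed_split {
    size_t fast;   /* bytes consumed by the early, fast-load phase */
    size_t pool;   /* leftover bytes handed on to the next phase */
};

/* Split an incoming seed so that nothing past the first phase is dropped. */
static struct seed_split split_seed(size_t already_loaded, size_t len)
{
    struct seed_split s = { 0, 0 };
    size_t room = already_loaded < FAST_LOAD_BYTES
                ? FAST_LOAD_BYTES - already_loaded : 0;

    s.fast = len < room ? len : room;
    s.pool = len - s.fast;
    return s;
}
```

With a 128-byte kexec seed, this model hands 64 bytes to the first phase and passes the remaining 64 onward rather than discarding them.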
      
      [linux@dominikbrodowski.net: rewrite commit message]
      Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: do not re-init if crng_reseed completes before primary init · 9c3ddde3
      Jason A. Donenfeld authored
      If the bootloader supplies sufficient material and crng_reseed() is called
      very early on, but not so early that wqs aren't available yet, then we
      might transition to crng_init==2 before rand_initialize()'s call to
      crng_initialize_primary() is made. Then, when crng_initialize_primary() is
      called, if we're trusting the CPU's RDRAND instructions, we'll
      needlessly reinitialize the RNG and emit a message about it. This is
      mostly harmless, as numa_crng_init() will allocate and then free what it
      just allocated, and excessive calls to invalidate_batched_entropy()
      aren't so harmful. But it is funky and the extra message is confusing,
      so avoid the re-initialization altogether by checking for crng_init <
      2 in crng_initialize_primary(), just as we already do in crng_reseed().
      Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: fix crash on multiple early calls to add_bootloader_randomness() · f7e67b8e
      Dominik Brodowski authored
      Currently, if CONFIG_RANDOM_TRUST_BOOTLOADER is enabled, multiple calls
      to add_bootloader_randomness() are broken and can cause a NULL pointer
      dereference, as noted by Ivan T. Ivanov. This is not only a hypothetical
      problem, as qemu on arm64 may provide bootloader entropy via EFI and via
      devicetree.
      
      On the first call to add_hwgenerator_randomness(), crng_fast_load() is
      executed, and if the seed is long enough, crng_init will be set to 1.
      On subsequent calls to add_bootloader_randomness() and then to
      add_hwgenerator_randomness(), crng_fast_load() will be skipped. Instead,
      wait_event_interruptible() and then credit_entropy_bits() will be called.
      If the entropy count for that second seed is large enough, that proceeds
      to crng_reseed().
      
      However, both wait_event_interruptible() and crng_reseed() depend
      (at least in numa_crng_init()) on workqueues. Therefore, test whether
      system_wq is already initialized, which is a sufficient indicator that
      workqueue_init_early() has progressed far enough.
      
      If we wind up hitting the !system_wq case, we later want to do what
      would have been done there when wqs are up, so set a flag, and do that
      work later from the rand_initialize() call.
      Reported-by: Ivan T. Ivanov <iivanov@suse.de>
      Fixes: 18b915ac ("efi/random: Treat EFI_RNG_PROTOCOL output as bootloader randomness")
      Cc: stable@vger.kernel.org
      Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
      [Jason: added crng_need_done state and related logic.]
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: do not sign extend bytes for rotation when mixing · 0d9488ff
      Jason A. Donenfeld authored
      By using `char` instead of `unsigned char`, certain platforms will sign
      extend the byte when `w = rol32(*bytes++, input_rotate)` is called,
      meaning that bit 7 is overrepresented when mixing. This isn't a real
      problem (unless the mixer itself is already broken) since it's still
      invertible, but it's not quite correct either. Fix this by using an
      explicit unsigned type.
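The effect is easy to reproduce in userspace. The sketch below re-implements a 32-bit rotate locally. `mix_word_signed()` and `mix_word_unsigned()` are hypothetical helpers that only isolate the conversion step, not the kernel's mixing function.

```c
#include <stdint.h>

/* Rotate left by `shift` bits; the shift is masked to avoid undefined shifts. */
static uint32_t rol32(uint32_t word, unsigned int shift)
{
    return (word << (shift & 31)) | (word >> ((-shift) & 31));
}

/* A byte read through a signed type: 0x80 is -128 and widens to 0xffffff80. */
static uint32_t mix_word_signed(const signed char *bytes, unsigned int rot)
{
    return rol32((uint32_t)*bytes, rot);
}

/* The same byte read through an unsigned type widens to 0x00000080. */
static uint32_t mix_word_unsigned(const unsigned char *bytes, unsigned int rot)
{
    return rol32(*bytes, rot);
}
```

For the byte 0x80 and a rotation of 7, the signed variant yields 0xffffc07f while the unsigned variant yields 0x00004000, which shows how bit 7 is smeared across the upper 24 bits before the rotation.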
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: use BLAKE2s instead of SHA1 in extraction · 9f9eff85
      Jason A. Donenfeld authored
      This commit addresses one of the lower hanging fruits of the RNG: its
      usage of SHA1.
      
      BLAKE2s is generally faster, and certainly more secure, than SHA1, which
      has [1] been [2] really [3] very [4] broken [5]. Additionally, the
      current construction in the RNG doesn't use the full SHA1 function, as
      specified, and allows overwriting the IV with RDRAND output in an
      undocumented way, even in the case when RDRAND isn't set to "trusted",
      which means potential malicious IV choices. And its short length means
      that keeping only half of it secret when feeding back into the mixer
      gives us only 2^80 bits of forward secrecy. In other words, not only is
      the choice of hash function dated, but the use of it isn't really great
      either.
      
      This commit aims to fix both of these issues while also keeping the
      general structure and semantics as close to the original as possible.
      Specifically:
      
         a) Rather than overwriting the hash IV with RDRAND, we put it into
            BLAKE2's documented "salt" and "personal" fields, which were
            specifically created for this type of usage.
         b) Since this function feeds the full hash result back into the
            entropy collector, we only return from it half the length of the
            hash, just as it was done before. This increases the
            construction's forward secrecy from 2^80 to a much more
            comfortable 2^128.
         c) Rather than using the raw "sha1_transform" function alone, we
            instead use the full proper BLAKE2s function, with finalization.
      
      This also has the advantage of supplying 16 bytes at a time rather than
      SHA1's 10 bytes, which, in addition to having a faster compression
      function to begin with, means faster extraction in general. On an Intel
      i7-11850H, this commit makes initial seeding around 131% faster.
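The byte counts and exponents quoted above follow from returning half of each digest, assuming the standard output sizes of 160 bits for SHA1 and 256 bits for BLAKE2s:

```latex
\[
\text{SHA1: } \frac{160}{2} = 80 \text{ bits} = 10 \text{ bytes},
\qquad
\text{BLAKE2s: } \frac{256}{2} = 128 \text{ bits} = 16 \text{ bytes}.
\]
```

The half that is kept secret therefore grows from 80 to 128 bits, which matches the change from 2^80 to 2^128 in item b).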
      
      BLAKE2s itself has the nice property of internally being based on the
      ChaCha permutation, which the RNG is already using for expansion, so
      there shouldn't be any issue with newness, funkiness, or surprising CPU
      behavior, since it's based on something already in use.
      
      [1] https://eprint.iacr.org/2005/010.pdf
      [2] https://www.iacr.org/archive/crypto2005/36210017/36210017.pdf
      [3] https://eprint.iacr.org/2015/967.pdf
      [4] https://shattered.io/static/shattered.pdf
      [5] https://www.usenix.org/system/files/sec20-leurent.pdf
      Reviewed-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Eric Biggers <ebiggers@google.com>
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Reviewed-by: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • lib/crypto: blake2s: include as built-in · 6048fdcc
      Jason A. Donenfeld authored
      In preparation for using blake2s in the RNG, we change the way that it
      is wired-in to the build system. Instead of using ifdefs to select the
      right symbol, we use weak symbols. And because ARM doesn't need the
      generic implementation, we make the generic one default only if an arch
      library doesn't need it already, and then have arch libraries that do
      need it opt-in. So that the arch libraries can remain tristate rather
      than bool, we then split the shash part from the glue code.
      Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
      Acked-by: Ard Biesheuvel <ardb@kernel.org>
      Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: linux-kbuild@vger.kernel.org
      Cc: linux-crypto@vger.kernel.org
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: fix data race on crng init time · 009ba856
      Eric Biggers authored
      _extract_crng() does plain loads of crng->init_time and
      crng_global_init_time, which causes undefined behavior if
      crng_reseed() and RNDRESEEDCRNG modify these concurrently.
      
      Use READ_ONCE() and WRITE_ONCE() to make the behavior defined.
      
      Don't fix the race on crng->init_time by protecting it with crng->lock,
      since it's not a problem for duplicate reseedings to occur.  I.e., the
      lockless access with READ_ONCE() is fine.
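A minimal userspace sketch of the marked-access pattern follows. The two macros are simplified stand-ins built on GNU C `__typeof__` and volatile accesses, in the spirit of the kernel macros. They follow the kernel's memory-model convention and do not make concurrent access well-defined under ISO C. `crng_model`, `model_reseed()`, `model_needs_reseed()` and `RESEED_INTERVAL` are hypothetical.

```c
/* Simplified userspace stand-ins for the kernel macros (GNU C typeof). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

#define RESEED_INTERVAL 300UL  /* hypothetical interval, in arbitrary ticks */

struct crng_model {
    unsigned long init_time;   /* last reseed time; written by the reseeder */
};

/* Writer side: record the reseed time with a single marked store. */
static void model_reseed(struct crng_model *crng, unsigned long now)
{
    WRITE_ONCE(crng->init_time, now);
}

/* Reader side: take one marked load, then decide on that snapshot. */
static int model_needs_reseed(struct crng_model *crng, unsigned long now)
{
    unsigned long init_time = READ_ONCE(crng->init_time);

    return now - init_time > RESEED_INTERVAL;
}
```

Because the reader works on a single snapshot, the worst outcome of racing with a writer in this model is an extra reseed, which the message above explicitly accepts.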
      
      Fixes: d848e5f8 ("random: add new ioctl RNDRESEEDCRNG")
      Fixes: e192be9d ("random: replace non-blocking pool with a Chacha20-based CRNG")
      Cc: stable@vger.kernel.org
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Acked-by: Paul E. McKenney <paulmck@kernel.org>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: fix data race on crng_node_pool · 5d73d1e3
      Eric Biggers authored
      extract_crng() and crng_backtrack_protect() load crng_node_pool with a
      plain load, which causes undefined behavior if do_numa_crng_init()
      modifies it concurrently.
      
      Fix this by using READ_ONCE().  Note: as per the previous discussion
      https://lore.kernel.org/lkml/20211219025139.31085-1-ebiggers@kernel.org/T/#u,
      READ_ONCE() is believed to be sufficient here, and it was requested that
      it be used here instead of smp_load_acquire().
      
      Also change do_numa_crng_init() to set crng_node_pool using
      cmpxchg_release() instead of mb() + cmpxchg(), as the former is
      sufficient here but is more lightweight.
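The publish-once idea can be approximated in userspace with C11 atomics. This is an analogue and not the kernel code: a release compare-exchange stands in for cmpxchg_release(), an acquire load stands in for the dependency-ordered READ_ONCE() on the reader side, and `node_pool`, `publish_pool()` and `get_pool()` are hypothetical names.

```c
#include <stdatomic.h>
#include <stdlib.h>

struct node_pool { int nodes; };

static _Atomic(struct node_pool *) pool_ptr;   /* NULL until published */

/* Reader: the acquire load pairs with the release in publish_pool(). */
static struct node_pool *get_pool(void)
{
    return atomic_load_explicit(&pool_ptr, memory_order_acquire);
}

/* Publish a freshly built pool exactly once; a loser frees its own copy. */
static struct node_pool *publish_pool(int nodes)
{
    struct node_pool *expected = NULL;
    struct node_pool *fresh = malloc(sizeof(*fresh));

    if (!fresh)
        return NULL;
    fresh->nodes = nodes;

    /* Release ordering makes the initialisation above visible to any
     * reader that observes the new pointer. */
    if (!atomic_compare_exchange_strong_explicit(&pool_ptr, &expected, fresh,
                                                 memory_order_release,
                                                 memory_order_relaxed)) {
        free(fresh);          /* somebody else published first */
        return get_pool();
    }
    return fresh;
}
```

A second call finds the pointer already set, discards its own allocation and returns the pool that was published first.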
      
      Fixes: 1e7f583a ("random: make /dev/urandom scalable for silly userspace programs")
      Cc: stable@vger.kernel.org
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Acked-by: Paul E. McKenney <paulmck@kernel.org>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • irq: remove unused flags argument from __handle_irq_event_percpu() · 5320eb42
      Sebastian Andrzej Siewior authored
      The __IRQF_TIMER bit from the flags argument was used in
      add_interrupt_randomness() to distinguish the timer interrupt from other
      interrupts. This is no longer the case.
      
      Remove the flags argument from __handle_irq_event_percpu().
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: remove unused irq_flags argument from add_interrupt_randomness() · 703f7066
      Sebastian Andrzej Siewior authored
      Since commit
         ee3e00e9 ("random: use registers from interrupted code for CPU's w/o a cycle counter")
      
      the irq_flags argument is no longer used.
      
      Remove unused irq_flags.
      
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dexuan Cui <decui@microsoft.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: K. Y. Srinivasan <kys@microsoft.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wei Liu <wei.liu@kernel.org>
      Cc: linux-hyperv@vger.kernel.org
      Cc: x86@kernel.org
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Acked-by: Wei Liu <wei.liu@kernel.org>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • random: document add_hwgenerator_randomness() with other input functions · 2b6c6e3d
      Mark Brown authored
      The section at the top of random.c which documents the available input
      functions does not document add_hwgenerator_randomness(), which might
      lead a reader to overlook it. Add a brief note about it.
      Signed-off-by: Mark Brown <broonie@kernel.org>
      [Jason: reorganize position of function in doc comment and also document
       add_bootloader_randomness() while we're at it.]
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
    • MAINTAINERS: add git tree for random.c · 9bafaa93
      Jason A. Donenfeld authored
      This is handy not just for humans, but also so that the 0-day bot can
      automatically test posted mailing list patches against the right tree.
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
  2. 05 Jan, 2022 7 commits
  3. 04 Jan, 2022 15 commits
    • iavf: Fix limit of total number of queues to active queues of VF · b712941c
      Karen Sornek authored
      Without this validation, if the user requests more queues than are
      enabled, the requested number of queues is sent to the kernel stack
      (due to the asynchronous nature of the VF response). The stack might
      then pick a queue for transmission that is not enabled, resulting in
      a Tx hang. Fix this bug by limiting the total number of queues
      allocated for the VF to the active queues of the VF.
      
      Fixes: d5b33d02 ("i40evf: add ndo_setup_tc callback to i40evf")
      Signed-off-by: Ashwin Vijayavel <ashwin.vijayavel@intel.com>
      Signed-off-by: Karen Sornek <karen.sornek@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • i40e: Fix incorrect netdev's real number of RX/TX queues · e738451d
      Jedrzej Jagielski authored
      The queue representation in sysfs was wrong during driver
      reinitialization when the number of online CPUs was less than the
      number of combined queues. It was caused by NetworkManager being
      stopped, since NetworkManager is responsible for calling the
      vsi_open function during driver initialization.
      In a specific situation (e.g. 12 CPUs online) there were 16 queues
      in /sys/class/net/<iface>/queues. Setting the queue count to a
      value higher than the number of online CPUs then caused write
      errors and other errors.
      Update the sysfs queue representation during driver
      initialization.
      
      Fixes: 41c445ff ("i40e: main driver core")
      Signed-off-by: Lukasz Cieplicki <lukaszx.cieplicki@intel.com>
      Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
      Tested-by: Gurucharan G <gurucharanx.g@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • i40e: Fix for displaying message regarding NVM version · 40feded8
      Mateusz Palczewski authored
      When loading the i40e driver, it prints a message like: 'The driver for the
      device detected a newer version of the NVM image v1.x than expected v1.y.
      Please install the most recent version of the network driver.' This is
      misleading as the driver is working as expected.
      
      Fix that by removing the second part of message and changing it from
      dev_info to dev_dbg.
      
      Fixes: 4fb29bdd ("i40e: The driver now prints the API version in error message")
      Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
      Tested-by: Gurucharan G <gurucharanx.g@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • i40e: fix use-after-free in i40e_sync_filters_subtask() · 3116f59c
      Di Zhu authored
      Using the ifconfig command to delete an IPv6 address causes the
      i40e network card driver to delete its internal mac_filter while the
      i40e_service_task kernel thread concurrently accesses the mac_filter.
      These two paths are not protected by a lock, which leads to the
      following use-after-free:
      
       print_address_description+0x70/0x360
       ? vprintk_func+0x5e/0xf0
       kasan_report+0x1b2/0x330
       i40e_sync_vsi_filters+0x4f0/0x1850 [i40e]
       i40e_sync_filters_subtask+0xe3/0x130 [i40e]
       i40e_service_task+0x195/0x24c0 [i40e]
       process_one_work+0x3f5/0x7d0
       worker_thread+0x61/0x6c0
       ? process_one_work+0x7d0/0x7d0
       kthread+0x1c3/0x1f0
       ? kthread_park+0xc0/0xc0
       ret_from_fork+0x35/0x40
      
      Allocated by task 2279810:
       kasan_kmalloc+0xa0/0xd0
       kmem_cache_alloc_trace+0xf3/0x1e0
       i40e_add_filter+0x127/0x2b0 [i40e]
       i40e_add_mac_filter+0x156/0x190 [i40e]
       i40e_addr_sync+0x2d/0x40 [i40e]
       __hw_addr_sync_dev+0x154/0x210
       i40e_set_rx_mode+0x6d/0xf0 [i40e]
       __dev_set_rx_mode+0xfb/0x1f0
       __dev_mc_add+0x6c/0x90
       igmp6_group_added+0x214/0x230
       __ipv6_dev_mc_inc+0x338/0x4f0
       addrconf_join_solict.part.7+0xa2/0xd0
       addrconf_dad_work+0x500/0x980
       process_one_work+0x3f5/0x7d0
       worker_thread+0x61/0x6c0
       kthread+0x1c3/0x1f0
       ret_from_fork+0x35/0x40
      
      Freed by task 2547073:
       __kasan_slab_free+0x130/0x180
       kfree+0x90/0x1b0
       __i40e_del_filter+0xa3/0xf0 [i40e]
       i40e_del_mac_filter+0xf3/0x130 [i40e]
       i40e_addr_unsync+0x85/0xa0 [i40e]
       __hw_addr_sync_dev+0x9d/0x210
       i40e_set_rx_mode+0x6d/0xf0 [i40e]
       __dev_set_rx_mode+0xfb/0x1f0
       __dev_mc_del+0x69/0x80
       igmp6_group_dropped+0x279/0x510
       __ipv6_dev_mc_dec+0x174/0x220
       addrconf_leave_solict.part.8+0xa2/0xd0
       __ipv6_ifa_notify+0x4cd/0x570
       ipv6_ifa_notify+0x58/0x80
       ipv6_del_addr+0x259/0x4a0
       inet6_addr_del+0x188/0x260
       addrconf_del_ifaddr+0xcc/0x130
       inet6_ioctl+0x152/0x190
       sock_do_ioctl+0xd8/0x2b0
       sock_ioctl+0x2e5/0x4c0
       do_vfs_ioctl+0x14e/0xa80
       ksys_ioctl+0x7c/0xa0
       __x64_sys_ioctl+0x42/0x50
       do_syscall_64+0x98/0x2c0
       entry_SYSCALL_64_after_hwframe+0x65/0xca
      
      Fixes: 41c445ff ("i40e: main driver core")
      Signed-off-by: Di Zhu <zhudi2@huawei.com>
      Signed-off-by: Rui Zhang <zhangrui182@huawei.com>
      Tested-by: Gurucharan G <gurucharanx.g@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • i40e: Fix to not show opcode msg on unsuccessful VF MAC change · 01cbf508
      Mateusz Palczewski authored
      Hide the i40e opcode information sent in the response to a VF when an
      untrusted VF tries to change the MAC address on the VF interface.

      This is implemented by adding a 'hide' parameter to the function that
      sends the response to the VF. The parameter suppresses the display of
      error information but still forwards the error code to the VF.

      Previously it was not possible to send a response with an error code
      to the VF without displaying opcode information.
      
      Fixes: 5c3c48ac ("i40e: implement virtual device interface")
      Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
      Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
      Reviewed-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
      Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
      Tested-by: Tony Brelinski <tony.brelinski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • ieee802154: atusb: fix uninit value in atusb_set_extended_addr · 754e4382
      Pavel Skripkin authored
      Alexander reported a use of an uninitialized value in
      atusb_set_extended_addr(), caused by reading 0 bytes via
      usb_control_msg().

      Fix it by validating that the number of bytes transferred is actually
      correct, since usb_control_msg() may read fewer bytes than the caller
      requested.
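The check can be sketched in userspace with a stand-in for the transfer function. `fake_control_read()` and `read_exact()` are hypothetical and only model a call that returns the number of bytes actually transferred. They are not the USB core API or the driver's code.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a control transfer: returns the number of
 * bytes actually read (possibly fewer than requested) or -errno. */
static int fake_control_read(unsigned char *buf, size_t want, size_t device_has)
{
    size_t n = want < device_has ? want : device_has;

    memset(buf, 0xab, n);
    return (int)n;
}

/* Read exactly `len` bytes; treat a short transfer as an I/O error. */
static int read_exact(unsigned char *buf, size_t len, size_t device_has)
{
    int ret = fake_control_read(buf, len, device_has);

    if (ret < 0)
        return ret;           /* transport error, pass it through */
    if ((size_t)ret != len)
        return -EIO;          /* short read: buf is only partly initialised */
    return 0;
}
```

Checking only for a negative return value would let a zero-byte transfer pass as success and leave the caller working on an uninitialised buffer.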
      
      Fail log:
      
      BUG: KASAN: uninit-cmp in ieee802154_is_valid_extended_unicast_addr include/linux/ieee802154.h:310 [inline]
      BUG: KASAN: uninit-cmp in atusb_set_extended_addr drivers/net/ieee802154/atusb.c:1000 [inline]
      BUG: KASAN: uninit-cmp in atusb_probe.cold+0x29f/0x14db drivers/net/ieee802154/atusb.c:1056
      Uninit value used in comparison: 311daa649a2003bd stack handle: 000000009a2003bd
       ieee802154_is_valid_extended_unicast_addr include/linux/ieee802154.h:310 [inline]
       atusb_set_extended_addr drivers/net/ieee802154/atusb.c:1000 [inline]
       atusb_probe.cold+0x29f/0x14db drivers/net/ieee802154/atusb.c:1056
       usb_probe_interface+0x314/0x7f0 drivers/usb/core/driver.c:396
      
      Fixes: 7490b008 ("ieee802154: add support for atusb transceiver")
      Reported-by: Alexander Potapenko <glider@google.com>
      Acked-by: Alexander Aring <aahringo@redhat.com>
      Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
      Link: https://lore.kernel.org/r/20220104182806.7188-1-paskripkin@gmail.com
      Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
    • Merge tag 'mac80211-for-net-2022-01-04' of... · 6f89ecf1
      Jakub Kicinski authored
      Merge tag 'mac80211-for-net-2022-01-04' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
      
      Johannes Berg says:
      
      ====================
      Two more changes:
       - mac80211: initialize a variable to avoid using it uninitialized
       - mac80211 mesh: put some data structures into the container to
         fix bugs with and not have to deal with allocation failures
      
      * tag 'mac80211-for-net-2022-01-04' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211:
        mac80211: mesh: embedd mesh_paths and mpp_paths into ieee80211_if_mesh
        mac80211: initialize variable have_higher_than_11mbit
      ====================
      
      Link: https://lore.kernel.org/r/20220104144449.64937-1-johannes@sipsolutions.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • mac80211: mesh: embedd mesh_paths and mpp_paths into ieee80211_if_mesh · 8b5cb7e4
      Pavel Skripkin authored
      Syzbot hit NULL deref in rhashtable_free_and_destroy(). The problem was
      in mesh_paths and mpp_paths being NULL.
      
      mesh_pathtbl_init() could fail in case of memory allocation failure, but
      nobody cared, since ieee80211_mesh_init_sdata() returns void. It led to
      leaving 2 pointers as NULL. Syzbot has found null deref on exit path,
      but it could happen anywhere else, because code assumes these pointers are
      valid.
      
      Since all ieee80211_*_setup_sdata functions are void and do not fail,
      let's embed mesh_paths and mpp_paths into the parent struct to avoid
      adding error handling on higher levels and follow the pattern of the
      other setup_sdata functions.
      
      Fixes: 60854fd9 ("mac80211: mesh: convert path table to rhashtable")
      Reported-and-tested-by: syzbot+860268315ba86ea6b96b@syzkaller.appspotmail.com
      Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
      Link: https://lore.kernel.org/r/20211230195547.23977-1-paskripkin@gmail.com
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
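The embedding idea from the commit above can be illustrated with a small userspace sketch. Everything here (`path_table`, `mesh_iface`, and the helper functions) is a hypothetical stand-in, not the mac80211 structures or the rhashtable API; it only shows why a void setup function becomes safe once the tables are members of the parent struct rather than separately allocated pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a hash table head; not the kernel rhashtable. */
struct path_table {
    size_t entries;
    int ready;
};

/* The tables live inside the parent, so in this sketch there is no separate
 * allocation that could fail and leave a NULL pointer behind. */
struct mesh_iface {
    struct path_table mesh_paths;
    struct path_table mpp_paths;
};

static void path_table_init(struct path_table *t)
{
    t->entries = 0;
    t->ready = 1;
}

/* A void setup function is now honest: nothing in it can fail. */
static void mesh_iface_setup(struct mesh_iface *m)
{
    assert(m != NULL);
    path_table_init(&m->mesh_paths);
    path_table_init(&m->mpp_paths);
}

/* Teardown can rely on both tables existing whenever the parent exists. */
static void mesh_iface_teardown(struct mesh_iface *m)
{
    m->mesh_paths.ready = 0;
    m->mpp_paths.ready = 0;
}
```

With pointer members, every user of the tables would need a NULL check or the setup function would need a return value; embedding avoids both at the cost of a larger parent struct.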
    • Tom Rix's avatar
      mac80211: initialize variable have_higher_than_11mbit · 68a18ad7
      Tom Rix authored
      Clang static analysis reports this warning:
      
      mlme.c:5332:7: warning: Branch condition evaluates to a
        garbage value
          have_higher_than_11mbit)
          ^~~~~~~~~~~~~~~~~~~~~~~
      
      have_higher_than_11mbit is only set to true some of the time in
      ieee80211_get_rates() but is checked all of the time.  So
      have_higher_than_11mbit needs to be initialized to false.
      
      Fixes: 5d6a1b06 ("mac80211: set basic rates earlier")
      Signed-off-by: Tom Rix <trix@redhat.com>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
      Link: https://lore.kernel.org/r/20211223162848.3243702-1-trix@redhat.com
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
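The pattern behind the warning in the commit above can be reproduced in a few lines of userspace C. This is a minimal sketch, not the mlme.c code: `scan_rates()` is a hypothetical stand-in for ieee80211_get_rates(), and the rate units (100 kbit/s) are an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for ieee80211_get_rates(): it writes *have_higher
 * only when it sees a rate above 11 Mbit/s (110 in units of 100 kbit/s)
 * and leaves the flag untouched otherwise. */
static void scan_rates(const int *rates_100kbps, size_t n, bool *have_higher)
{
    for (size_t i = 0; i < n; i++) {
        if (rates_100kbps[i] > 110)
            *have_higher = true;
    }
}

/* Caller pattern after the fix: the flag is initialized before the call,
 * so the later read never sees an indeterminate value. */
static bool any_rate_above_11mbit(const int *rates_100kbps, size_t n)
{
    bool have_higher = false; /* without "= false" this read could be garbage */

    assert(rates_100kbps != NULL || n == 0);
    scan_rates(rates_100kbps, n, &have_higher);
    return have_higher;
}
```

Because the callee only ever sets the flag to true, the caller is the only place that can supply the false default.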
    • Eric Dumazet's avatar
      sch_qfq: prevent shift-out-of-bounds in qfq_init_qdisc · 7d18a078
      Eric Dumazet authored
      tx_queue_len can be set to ~0U, we need to be more
      careful about overflows.
      
      __fls(0) is undefined, as this report shows:
      
      UBSAN: shift-out-of-bounds in net/sched/sch_qfq.c:1430:24
      shift exponent 51770272 is too large for 32-bit type 'int'
      CPU: 0 PID: 25574 Comm: syz-executor.0 Not tainted 5.16.0-rc7-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       <TASK>
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0x201/0x2d8 lib/dump_stack.c:106
       ubsan_epilogue lib/ubsan.c:151 [inline]
       __ubsan_handle_shift_out_of_bounds+0x494/0x530 lib/ubsan.c:330
       qfq_init_qdisc+0x43f/0x450 net/sched/sch_qfq.c:1430
       qdisc_create+0x895/0x1430 net/sched/sch_api.c:1253
       tc_modify_qdisc+0x9d9/0x1e20 net/sched/sch_api.c:1660
       rtnetlink_rcv_msg+0x934/0xe60 net/core/rtnetlink.c:5571
       netlink_rcv_skb+0x200/0x470 net/netlink/af_netlink.c:2496
       netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
       netlink_unicast+0x814/0x9f0 net/netlink/af_netlink.c:1345
       netlink_sendmsg+0xaea/0xe60 net/netlink/af_netlink.c:1921
       sock_sendmsg_nosec net/socket.c:704 [inline]
       sock_sendmsg net/socket.c:724 [inline]
       ____sys_sendmsg+0x5b9/0x910 net/socket.c:2409
       ___sys_sendmsg net/socket.c:2463 [inline]
       __sys_sendmsg+0x280/0x370 net/socket.c:2492
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Fixes: 462dbc91 ("pkt_sched: QFQ Plus: fair-queueing service at DRR cost")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
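The overflow described in the commit above can be shown in isolation. The sketch below is a userspace illustration under stated assumptions, not the sch_qfq.c code: `MAX_AGG_CLASSES` is a hypothetical cap, `floor_log2()` is a portable substitute for __fls(), and widening to 64 bits before the increment is one possible way to avoid the wrap.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_AGG_CLASSES 8u /* hypothetical cap for this example */

/* Portable floor(log2(x)); like __fls(), it has no meaningful result for 0,
 * so the precondition is checked explicitly here. */
static unsigned int floor_log2(uint64_t x)
{
    unsigned int bit = 0;

    assert(x != 0);
    while (x >>= 1)
        bit++;
    return bit;
}

/* 32-bit arithmetic: tx_queue_len + 1 wraps to 0 when tx_queue_len is ~0U,
 * so the result can be 0 and must not be passed to floor_log2(). */
static uint32_t max_classes_wrapping(uint32_t tx_queue_len)
{
    uint32_t n = tx_queue_len + 1;

    return n > MAX_AGG_CLASSES ? MAX_AGG_CLASSES : n;
}

/* Widen first so the increment cannot wrap; the result is always in the
 * range 1..MAX_AGG_CLASSES. */
static uint32_t max_classes_safe(uint32_t tx_queue_len)
{
    uint64_t n = (uint64_t)tx_queue_len + 1;

    return n > MAX_AGG_CLASSES ? MAX_AGG_CLASSES : (uint32_t)n;
}
```

Calling `floor_log2(max_classes_safe(len))` is well defined for every 32-bit `len`, whereas the wrapping variant yields 0 for `len == UINT32_MAX`.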
    • Christoph Hellwig's avatar
      netrom: fix copying in user data in nr_setsockopt · 3087a6f3
      Christoph Hellwig authored
      This code used to copy in an unsigned long worth of data before
      the sockptr_t conversion, so restore that.
      
      Fixes: a7b75c5a ("net: pass a sockptr_t into ->setsockopt")
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • David S. Miller's avatar
      Merge branch 'srv6-traceroute' · d2d9a6d0
      David S. Miller authored
      Andrew Lunn says:
      
      ====================
      Fix traceroute in the presence of SRv6
      
      When using SRv6, the destination IP address in the IPv6 header is not
      always the true destination; it can be a router along the path that
      SRv6 is using.
      
      When ICMP reports an error, e.g., time exceeded, which is what
      traceroute uses, it includes the packet which invoked the error in
      the ICMP message body. Upon receiving such an ICMP packet, the
      invoking packet is examined and an attempt is made to find the socket
      which sent the packet, so the error can be reported. Lookup is
      performed using the source and destination address. If the
      intermediary router IP address from the IP header is used, the lookup
      fails. It is necessary to dig into the header and find the true
      destination address in the Segment Routing Header (SRH).
      
      v2:
      Play games with the skb->network_header rather than clone the skb
      v3:
      Move helpers into seg6.c
      v4:
      Move short helper into header file.
      Rework getting SRH destination address
      v5:
      Fix comment to describe function, not caller
      
      Patch 1 exports a helper which can find the SRH in a packet
      Patch 2 does the actual examination of the invoking packet
      Patch 3 makes use of the results when trying to find the socket.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Andrew Lunn's avatar
      udp6: Use Segment Routing Header for dest address if present · 222a011e
      Andrew Lunn authored
      When finding the socket to report an error on, if the invoking packet
      is using Segment Routing, the IPv6 destination address is that of an
      intermediate router, not the end destination. Extract the ultimate
      destination address from the segment address.
      
      This change allows traceroute to function in the presence of Segment
      Routing.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Andrew Lunn's avatar
      icmp: ICMPV6: Examine invoking packet for Segment Route Headers. · e4129440
      Andrew Lunn authored
      RFC8754 says:
      
      ICMP error packets generated within the SR domain are sent to source
      nodes within the SR domain.  The invoking packet in the ICMP error
      message may contain an SRH.  Since the destination address of a packet
      with an SRH changes as each segment is processed, it may not be the
      destination used by the socket or application that generated the
      invoking packet.
      
      For the source of an invoking packet to process the ICMP error
      message, the ultimate destination address of the IPv6 header may be
      required.  The following logic is used to determine the destination
      address for use by protocol-error handlers.
      
      *  Walk all extension headers of the invoking IPv6 packet to the
         routing extension header preceding the upper-layer header.
      
         -  If routing header is type 4 Segment Routing Header (SRH)
      
            o  The SID at Segment List[0] may be used as the destination
               address of the invoking packet.
      
      Mangle the skb so the network header points to the invoking packet
      inside the ICMP packet. The seg6 helpers can then be used on the skb
      to find any segment routing headers. If found, mark this fact in the
      IPv6 control block of the skb, and store the offset into the packet of
      the SRH. Then restore the skb back to its old state.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
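The RFC 8754 logic quoted in the commit above (walk the extension headers, and if the routing header is a type 4 SRH, use Segment List[0]) can be sketched over a raw byte buffer. This is a simplified userspace illustration, not the kernel's skb-based code: `invoking_dst()` and its return convention are hypothetical, only the common extension headers are handled, and the header layouts are assumed to follow the standard IPv6 and SRH formats.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { NH_HOPOPTS = 0, NH_ROUTING = 43, NH_FRAGMENT = 44, NH_DSTOPTS = 60 };
enum { IPV6_HDR_LEN = 40, IPV6_DST_OFF = 24, SRH_TYPE = 4, ADDR_LEN = 16 };

static int is_ext_hdr(uint8_t nh)
{
    return nh == NH_HOPOPTS || nh == NH_ROUTING ||
           nh == NH_FRAGMENT || nh == NH_DSTOPTS;
}

/* Copies into dst the address an error handler could use as the invoking
 * packet's destination. Returns 1 if it came from Segment List[0] of an SRH,
 * 0 if the plain IPv6 destination was used, -1 if the buffer is too short. */
static int invoking_dst(const uint8_t *pkt, size_t len, uint8_t dst[ADDR_LEN])
{
    const uint8_t *rh = NULL; /* last routing header seen */
    size_t rh_len = 0;
    size_t off = IPV6_HDR_LEN;
    uint8_t nh;

    assert(pkt != NULL && dst != NULL);
    if (len < IPV6_HDR_LEN)
        return -1;
    nh = pkt[6]; /* Next Header field of the fixed IPv6 header */

    while (is_ext_hdr(nh)) {
        size_t hlen;

        if (len - off < 8)
            return -1;
        /* Fragment header is fixed size; others encode length in 8-octet
         * units, not counting the first 8 octets. */
        hlen = (nh == NH_FRAGMENT) ? 8 : ((size_t)pkt[off + 1] + 1) * 8;
        if (len - off < hlen)
            return -1;
        if (nh == NH_ROUTING) {
            rh = pkt + off;
            rh_len = hlen;
        }
        nh = pkt[off];
        off += hlen;
    }

    /* Routing type is byte 2; the segment list starts after 8 octets. */
    if (rh != NULL && rh[2] == SRH_TYPE && rh_len >= 8 + ADDR_LEN) {
        memcpy(dst, rh + 8, ADDR_LEN);
        return 1;
    }
    memcpy(dst, pkt + IPV6_DST_OFF, ADDR_LEN);
    return 0;
}
```

A socket lookup would then use the returned address in place of the destination from the IPv6 header whenever the function reports that an SRH was found.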
    • Andrew Lunn's avatar
      seg6: export get_srh() for ICMP handling · fa55a7d7
      Andrew Lunn authored
      An ICMP error message can contain in its message body part of an IPv6
      packet which invoked the error. Such a packet might contain a segment
      routing header. Export get_srh() so the ICMP code can make use of it.
      
      Since this changes the scope of the function from local to global, add
      the seg6_ prefix to keep the namespace clean. And move it into seg6.c
      so it is always available, not just when IPV6_SEG6_LWTUNNEL is
      enabled.
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>