1. 07 Jul, 2023 9 commits
    • gve: Set default duplex configuration to full · 0503efea
      Junfeng Guo authored
      The duplex mode was left unset in the driver, so the parameter
      defaulted to 0, which corresponds to half duplex. This could mislead
      users into incorrect expectations about the driver's transmission
      capabilities.
      Set the default duplex configuration to full, as the driver runs in
      full duplex mode at this point.
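
The effect of an unset duplex field can be sketched as follows. This is a minimal model, not the gve driver code: `LinkSettings`, `report_before_fix` and `report_after_fix` are hypothetical names, and only the `DUPLEX_*` values are taken from the Linux ethtool UAPI header.

```python
from dataclasses import dataclass

# Values as defined in the Linux ethtool UAPI header.
DUPLEX_HALF = 0x00
DUPLEX_FULL = 0x01
DUPLEX_UNKNOWN = 0xFF

DUPLEX_NAMES = {
    DUPLEX_HALF: "Half",
    DUPLEX_FULL: "Full",
    DUPLEX_UNKNOWN: "Unknown",
}


@dataclass
class LinkSettings:
    """Hypothetical stand-in for the link settings a driver reports."""
    speed_mbps: int = 0
    duplex: int = 0  # zero-initialised, like a cleared C struct


def report_before_fix(speed_mbps: int) -> LinkSettings:
    # Only the speed is filled in; duplex silently keeps its zero default.
    return LinkSettings(speed_mbps=speed_mbps)


def report_after_fix(speed_mbps: int) -> LinkSettings:
    # The duplex field is set explicitly to full.
    return LinkSettings(speed_mbps=speed_mbps, duplex=DUPLEX_FULL)


if __name__ == "__main__":
    for report in (report_before_fix, report_after_fix):
        settings = report(100_000)
        print(report.__name__, DUPLEX_NAMES[settings.duplex])
```

Because `DUPLEX_HALF` is 0, a field that is never written is indistinguishable from a deliberate half-duplex report, which is why the default has to be set explicitly.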
      
      Fixes: 7e074d5a ("gve: Enable Link Speed Reporting in the driver.")
      Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Message-ID: <20230706044128.2726747-1-junfeng.guo@intel.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · 41b9eff0
      Jakub Kicinski authored
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2023-07-05 (ice)
      
      This series contains updates to ice driver only.
      
      Sridhar fixes the max Tx rate limit check so that the comparison is
      made against each TC value rather than the aggregate. He also resolves
      an issue with the wrong VSI being used when setting the max Tx rate
      when TCs are enabled.
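
The difference between an aggregate check and a per-TC check can be sketched like this. It is only an illustration under assumptions: the function names are hypothetical, the rates use one arbitrary common unit, and treating "the aggregate" as the sum of all TC rates compared with the link speed is an interpretation of the summary above, not code from the ice driver.

```python
from typing import Sequence


def validate_aggregate(max_rates: Sequence[int], link_speed: int) -> bool:
    """Assumed old behaviour: reject when the sum over all TCs exceeds the link speed."""
    return sum(max_rates) <= link_speed


def validate_per_tc(max_rates: Sequence[int], link_speed: int) -> bool:
    """Per-TC behaviour: each TC limit is checked on its own against the link speed."""
    return all(rate <= link_speed for rate in max_rates)


if __name__ == "__main__":
    rates = [60, 60]  # two TCs, each individually below the link speed
    print("aggregate check accepts:", validate_aggregate(rates, 100))
    print("per-TC check accepts:   ", validate_per_tc(rates, 100))
```

With two TCs limited to 60 each on a link of 100, the aggregate check rejects the configuration while the per-TC check accepts it.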
      
      * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
        ice: Fix tx queue rate limit when TCs are configured
        ice: Fix max_rate check while configuring TX rate limits
      ====================
      
      Link: https://lore.kernel.org/r/20230705201346.49370-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'mlx5-fixes-2023-07-05' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 4863b57b
      Jakub Kicinski authored
      Saeed Mahameed says:
      
      ====================
      mlx5 fixes 2023-07-05
      
      This series provides bug fixes to mlx5 driver.
      
      * tag 'mlx5-fixes-2023-07-05' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
        net/mlx5e: RX, Fix page_pool page fragment tracking for XDP
        net/mlx5: Query hca_cap_2 only when supported
        net/mlx5e: TC, CT: Offload ct clear only once
        net/mlx5e: Check for NOT_READY flag state after locking
        net/mlx5: Register a unique thermal zone per device
        net/mlx5e: RX, Fix flush and close release flow of regular rq for legacy rq
        net/mlx5e: fix memory leak in mlx5e_ptp_open
        net/mlx5e: fix memory leak in mlx5e_fs_tt_redirect_any_create
        net/mlx5e: fix double free in mlx5e_destroy_flow_table
      ====================
      
      Link: https://lore.kernel.org/r/20230705175757.284614-1-saeed@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net/sched: cls_fw: Fix improper refcount update leads to use-after-free · 0323bce5
      M A Ramdhan authored
      In the event of a failure in tcf_change_indev(), fw_set_parms() will
      immediately return an error after incrementing or decrementing the
      reference counter in tcf_bind_filter(). An attacker who can drive the
      reference counter to zero can cause the reference to be freed, leading
      to a use-after-free.
      
      In order to prevent this, move the point of possible failure above the
      point where the TC_FW_CLASSID is handled.
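
The reordering described above can be modelled with a small sketch. Everything here is hypothetical (`TrafficClass`, `Filter`, `bind_filter`, `set_parms_buggy`, `set_parms_fixed`, `attempt_update`), and the assumption that the candidate filter starts as a copy of the existing filter's binding is a simplification for illustration, not a transcription of cls_fw.

```python
EINVAL = 22


class TrafficClass:
    def __init__(self, name, filter_cnt=0):
        self.name = name
        self.filter_cnt = filter_cnt  # number of filters bound to this class


class Filter:
    def __init__(self, bound_class=None):
        self.bound_class = bound_class


def bind_filter(filt, new_class):
    """Bind filt to new_class: take a reference on it, drop the one on the old class."""
    new_class.filter_cnt += 1
    old = filt.bound_class
    filt.bound_class = new_class
    if old is not None:
        old.filter_cnt -= 1


def set_parms_buggy(candidate, new_class, indev_ok):
    bind_filter(candidate, new_class)  # counters are already modified here
    if not indev_ok:
        return -EINVAL                 # error return with side effects left behind
    return 0


def set_parms_fixed(candidate, new_class, indev_ok):
    if not indev_ok:
        return -EINVAL                 # fail first, before touching any counter
    bind_filter(candidate, new_class)
    return 0


def attempt_update(set_parms):
    """Run a failing update and report what state the old class is left in."""
    old_class = TrafficClass("old", filter_cnt=1)
    existing = Filter(old_class)
    new_class = TrafficClass("new")
    candidate = Filter(existing.bound_class)  # assumed: starts as a copy
    err = set_parms(candidate, new_class, indev_ok=False)
    # On error the candidate is discarded and the existing filter stays in use.
    return err, old_class.filter_cnt, existing.bound_class is old_class


if __name__ == "__main__":
    print("buggy:", attempt_update(set_parms_buggy))
    print("fixed:", attempt_update(set_parms_fixed))
```

In the buggy ordering the failed update leaves the old class with a count of zero while the existing filter still points at it, which is the precondition for a use-after-free. In the fixed ordering the error path leaves the count untouched.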
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: M A Ramdhan <ramdhan@starlabs.sg>
      Signed-off-by: M A Ramdhan <ramdhan@starlabs.sg>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
      Message-ID: <20230705161530.52003-1-ramdhan@starlabs.sg>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • wifi: mt76: mt7921e: fix init command fail with enabled device · 525c469e
      Quan Zhou authored
      In some cases, such as those below, the chip may be in an unpredictable
      state in driver probe():
      * The system reboot flow does not work properly, for example because of
        a kernel oops while rebooting, so the driver does not return to its
        default state at that point.
      * Similarly, if the device was enabled in BIOS or UEFI, the system may
        switch to Linux without the driver having fully shut down.
      
      To avoid the problem, force the device back to its default state in probe():
      * mt7921e_mcu_fw_pmctrl() : return control privilege to chip side.
      * mt7921_wfsys_reset()    : cleanup chip config before resource init.
      
      Error log
      [59007.600714] mt7921e 0000:02:00.0: ASIC revision: 79220010
      [59010.889773] mt7921e 0000:02:00.0: Message 00000010 (seq 1) timeout
      [59010.889786] mt7921e 0000:02:00.0: Failed to get patch semaphore
      [59014.217839] mt7921e 0000:02:00.0: Message 00000010 (seq 2) timeout
      [59014.217852] mt7921e 0000:02:00.0: Failed to get patch semaphore
      [59017.545880] mt7921e 0000:02:00.0: Message 00000010 (seq 3) timeout
      [59017.545893] mt7921e 0000:02:00.0: Failed to get patch semaphore
      [59020.874086] mt7921e 0000:02:00.0: Message 00000010 (seq 4) timeout
      [59020.874099] mt7921e 0000:02:00.0: Failed to get patch semaphore
      [59024.202019] mt7921e 0000:02:00.0: Message 00000010 (seq 5) timeout
      [59024.202033] mt7921e 0000:02:00.0: Failed to get patch semaphore
      [59027.530082] mt7921e 0000:02:00.0: Message 00000010 (seq 6) timeout
      [59027.530096] mt7921e 0000:02:00.0: Failed to get patch semaphore
      [59030.857888] mt7921e 0000:02:00.0: Message 00000010 (seq 7) timeout
      [59030.857904] mt7921e 0000:02:00.0: Failed to get patch semaphore
      [59034.185946] mt7921e 0000:02:00.0: Message 00000010 (seq 8) timeout
      [59034.185961] mt7921e 0000:02:00.0: Failed to get patch semaphore
      [59037.514249] mt7921e 0000:02:00.0: Message 00000010 (seq 9) timeout
      [59037.514262] mt7921e 0000:02:00.0: Failed to get patch semaphore
      [59040.842362] mt7921e 0000:02:00.0: Message 00000010 (seq 10) timeout
      [59040.842375] mt7921e 0000:02:00.0: Failed to get patch semaphore
      [59040.923845] mt7921e 0000:02:00.0: hardware init failed
      
      Cc: stable@vger.kernel.org
      Fixes: 5c14a5f9 ("mt76: mt7921: introduce mt7921e support")
      Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
      Tested-by: Juan Martinez <juan.martinez@amd.com>
      Co-developed-by: Leon Yen <leon.yen@mediatek.com>
      Signed-off-by: Leon Yen <leon.yen@mediatek.com>
      Signed-off-by: Quan Zhou <quan.zhou@mediatek.com>
      Signed-off-by: Deren Wu <deren.wu@mediatek.com>
      Message-ID: <39fcb7cee08d4ab940d38d82f21897483212483f.1688569385.git.deren.wu@mediatek.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'fix-dropping-of-oversize-preemptible-frames-with-felix-dsa-driver' · 1ce1a745
      Jakub Kicinski authored
      Vladimir Oltean says:
      
      ====================
      Fix dropping of oversize preemptible frames with felix DSA driver
      
      It has been reported that preemptible traffic doesn't completely behave
      as expected. Namely, large packets should be able to be squeezed
      (through fragmentation) through taprio time slots smaller than the
      transmission time of the full frame. That does not happen due to logic
      in the driver (for oversize frame dropping with taprio) that was not
      updated in order for this use case to work.
      
      I am not sure whether it qualifies as "net" material, because some
      structural changes are involved, and it is a "never worked" scenario.
      OTOH, this is a complaint coming from users for a v6.4 kernel.
      It's up to maintainers to decide whether this series can be considered;
      I've submitted it as non-RFC in the optimistic case that it will be :)
      
      Demo script illustrating the issue below.
      
      add_taprio()
      {
      	local ifname=$1
      
      	echo "Creating root taprio"
      	tc qdisc replace dev $ifname handle 8001: parent root stab overhead 24 taprio \
      		num_tc 8 \
      		map 0 1 2 3 4 5 6 7 \
      		queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
      		base-time 0 \
      		sched-entry S 01 1216 \
      		sched-entry S fe 12368 \
      		fp P E E E E E E E \
      		flags 0x2
      }
      
      remove_taprio()
      {
      	local ifname=$1
      
      	echo "Removing taprio"
      	tc qdisc del dev $ifname root
      }
      
      ip netns add ns0
      ip link set eno0 netns ns0 && ip -n ns0 link set eno0 up && ip -n ns0 addr add 192.168.100.1/24 dev eno0
      ip addr add 192.168.100.2/24 dev swp0 && ip link set swp0 up
      ip netns exec ns0 ethtool --set-mm eno0 pmac-enabled on verify-enabled off tx-enabled on
      ethtool --set-mm swp0 pmac-enabled on verify-enabled off tx-enabled on
      add_taprio swp0
      
      ping 192.168.100.1 -s 1000 -c 5 # sent through TC0
      ethtool -I --show-mm swp0 | grep MACMergeFragCountTx # should increase
      
      ip addr flush swp0 && ip link set swp0 down
      remove_taprio swp0
      ethtool --set-mm swp0 pmac-enabled off verify-enabled off tx-enabled off
      ip netns exec ns0 ethtool --set-mm eno0 pmac-enabled off verify-enabled off tx-enabled off
      ip netns del ns0
      ====================
      
      Link: https://lore.kernel.org/r/20230705104422.49025-1-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: mscc: ocelot: fix oversize frame dropping for preemptible TCs · c6efb4ae
      Vladimir Oltean authored
      This switch implements Hold/Release in a strange way, with no control
      from the user as required by IEEE 802.1Q-2018 through Set-And-Hold-MAC
      and Set-And-Release-MAC, but rather, it emits HOLD requests implicitly
      based on the schedule.
      
      Namely, when the gate of a preemptible TC is about to close (actually
      QSYS::PREEMPTION_CFG.HOLD_ADVANCE octet times in advance of this event),
      the QSYS seems to emit a HOLD request pulse towards the MAC which
      preempts the currently transmitted packet, and further packets are held
      back in the queue system.
      
      This allows large frames to be squeezed through small time slots,
      because HOLD requests initiated by the gate events result in the frame
      being segmented in multiple fragments, the bit time of which is equal to
      the size of the time slot.
      
      It has been reported that the vsc9959_tas_guard_bands_update() logic
      breaks this, because it doesn't take preemptible TCs into account, and
      enables oversized frame dropping when the time slot doesn't allow a full
      MTU to be sent, but it does allow 2*minFragSize to be sent (128B).
      Packets larger than 128B are dropped instead of being sent in multiple
      fragments.
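
The arithmetic behind "the time slot doesn't allow a full MTU but does allow 2*minFragSize" can be worked through with a short sketch. The slot lengths 1216 ns and 12368 ns come from the demo script in the cover letter; the 1 Gbps link speed, the 1538-byte on-wire size of a full frame and the helper names are assumptions made for illustration, and the classification is not the driver's guard band logic.

```python
MIN_FRAG_SIZE = 64           # bytes, so that 2 * minFragSize is the 128B quoted above
FULL_FRAME_ON_WIRE = 1538    # bytes, assumed: 1514-byte frame plus 24 bytes of overhead


def slot_capacity_bytes(slot_ns: int, link_speed_mbps: int) -> int:
    """Number of whole bytes that can be transmitted within one time slot."""
    bits = slot_ns * link_speed_mbps // 1000  # Mbps * ns / 1000 gives bits
    return bits // 8


def classify_slot(slot_ns: int, link_speed_mbps: int) -> str:
    capacity = slot_capacity_bytes(slot_ns, link_speed_mbps)
    if capacity >= FULL_FRAME_ON_WIRE:
        return "full frame fits"
    if capacity >= 2 * MIN_FRAG_SIZE:
        return "only fragments fit"
    return "too small even for fragments"


if __name__ == "__main__":
    for slot_ns in (1216, 12368):
        print(slot_ns, "ns:",
              slot_capacity_bytes(slot_ns, 1000), "bytes,",
              classify_slot(slot_ns, 1000))
```

Under these assumptions the 1216 ns slot holds 152 bytes: too little for a full frame but more than 128B, which is the case where a preemptible frame should be fragmented rather than dropped.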
      
      Confusingly, the manual says:
      
      | For guard band, SDU calculation of a traffic class of a port, if
      | preemption is enabled (through 'QSYS::PREEMPTION_CFG.P_QUEUES') then
      | QSYS::PREEMPTION_CFG.HOLD_ADVANCE is used, otherwise
      | QSYS::QMAXSDU_CFG_*.QMAXSDU_* is used.
      
      but this only refers to the static guard band durations, and the
      QMAXSDU_CFG_* registers have dual purpose - the other being oversized
      frame dropping, which takes place irrespective of whether frames are
      preemptible or express.
      
      So, to fix the problem, we need to call vsc9959_tas_guard_bands_update()
      from ocelot_port_update_active_preemptible_tcs(), and modify the guard
      band logic to consider a different (lower) oversize limit for
      preemptible traffic classes.
      
      Fixes: 403ffc2c ("net: mscc: ocelot: add support for preemptible traffic classes")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Message-ID: <20230705104422.49025-4-vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: dsa: felix: make vsc9959_tas_guard_bands_update() visible to ocelot->ops · c6081914
      Vladimir Oltean authored
      In a future change we will need to make
      ocelot_port_update_active_preemptible_tcs() call
      vsc9959_tas_guard_bands_update(), but that is currently not possible,
      since the ocelot switch lib does not have access to functions private to
      the DSA wrapper.
      
      Move the pointer to vsc9959_tas_guard_bands_update() from felix->info
      (which is private to the DSA driver) to ocelot->ops (which is also
      visible to the ocelot switch lib).
      
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Message-ID: <20230705104422.49025-3-vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: mscc: ocelot: extend ocelot->fwd_domain_lock to cover ocelot->tas_lock · 009d30f1
      Vladimir Oltean authored
      In a future commit we will have to call vsc9959_tas_guard_bands_update()
      from ocelot_port_update_active_preemptible_tcs(), and that will be
      impossible due to the AB/BA locking dependencies between
      ocelot->tas_lock and ocelot->fwd_domain_lock.
      
      Just like we did in commit 3ff468ef ("net: mscc: ocelot: remove
      struct ocelot_mm_state :: lock"), the only solution is to expand the
      scope of ocelot->fwd_domain_lock for it to also serialize changes made
      to the Time-Aware Shaper, because those will have to result in a
      recalculation of cut-through TCs, which is something that depends on the
      forwarding domain.
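
An AB/BA dependency of this kind can be illustrated with a small lock-order check. This is a sketch only: the path names `tas_config` and `preemptible_update`, the helper `find_inversions`, and the exact nesting shown for each path are assumptions chosen to mirror the description above, not an extract from the driver.

```python
from typing import Dict, List, Tuple


def find_inversions(paths: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Return pairs of code paths that take the same two locks in opposite order.

    Each path maps to the locks it acquires, listed outermost first.
    """
    first_seen: Dict[Tuple[str, str], str] = {}
    inversions: List[Tuple[str, str]] = []
    for name, locks in paths.items():
        for i, outer in enumerate(locks):
            for inner in locks[i + 1:]:
                if (inner, outer) in first_seen:
                    inversions.append((first_seen[(inner, outer)], name))
                first_seen.setdefault((outer, inner), name)
    return inversions


# Assumed nesting with two separate locks.
two_locks = {
    "tas_config": ["tas_lock", "fwd_domain_lock"],
    "preemptible_update": ["fwd_domain_lock", "tas_lock"],
}

# With a single lock serializing both kinds of change there is nothing to invert.
one_lock = {
    "tas_config": ["fwd_domain_lock"],
    "preemptible_update": ["fwd_domain_lock"],
}

if __name__ == "__main__":
    print("two locks:", find_inversions(two_locks))
    print("one lock: ", find_inversions(one_lock))
```

Widening the scope of one lock removes the second lock from the picture entirely, so no ordering between the two can be violated.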
      
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Message-ID: <20230705104422.49025-2-vladimir.oltean@nxp.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  2. 06 Jul, 2023 2 commits
  3. 05 Jul, 2023 29 commits
    • netfilter: nf_tables: prevent OOB access in nft_byteorder_eval · caf3ef74
      Thadeu Lima de Souza Cascardo authored
      When evaluating byteorder expressions with size 2, a union with 32-bit and
      16-bit members is used. Since the 16-bit members are aligned to 32-bit,
      the array accesses will be out-of-bounds.
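
The stride problem can be reproduced with `ctypes`. This is a sketch under assumptions: `ByteorderSlot` and the two helper functions are hypothetical names, and the premise that the size-2 path visits `length // 2` array elements is an assumption about the loop, stated here only to show why the union's alignment matters.

```python
import ctypes


class ByteorderSlot(ctypes.Union):
    """Union with a 32-bit and a 16-bit member, as described above."""
    _fields_ = [("u32", ctypes.c_uint32), ("u16", ctypes.c_uint16)]


def bytes_touched_union(length: int) -> int:
    """Byte span covered when indexing an array of unions for 16-bit accesses."""
    count = length // 2                    # assumed: one element per 16-bit value
    if count == 0:
        return 0
    stride = ctypes.sizeof(ByteorderSlot)  # 4: the 16-bit member is padded to 32 bits
    return (count - 1) * stride + ctypes.sizeof(ctypes.c_uint16)


def bytes_touched_u16(length: int) -> int:
    """Byte span covered when indexing through a plain 16-bit pointer."""
    return (length // 2) * ctypes.sizeof(ctypes.c_uint16)


if __name__ == "__main__":
    length = 8  # bytes of data to convert
    print("union stride:", ctypes.sizeof(ByteorderSlot))
    print("union span:  ", bytes_touched_union(length), "bytes for a", length, "byte buffer")
    print("u16 span:    ", bytes_touched_u16(length), "bytes for a", length, "byte buffer")
```

Because each union element occupies 4 bytes, stepping through it once per 16-bit value walks roughly twice as far as the data extends, whereas a 16-bit pointer stays within the buffer.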
      
      It may lead to a stack-out-of-bounds access like the one below:
      
      [   23.095215] ==================================================================
      [   23.095625] BUG: KASAN: stack-out-of-bounds in nft_byteorder_eval+0x13c/0x320
      [   23.096020] Read of size 2 at addr ffffc90000007948 by task ping/115
      [   23.096358]
      [   23.096456] CPU: 0 PID: 115 Comm: ping Not tainted 6.4.0+ #413
      [   23.096770] Call Trace:
      [   23.096910]  <IRQ>
      [   23.097030]  dump_stack_lvl+0x60/0xc0
      [   23.097218]  print_report+0xcf/0x630
      [   23.097388]  ? nft_byteorder_eval+0x13c/0x320
      [   23.097577]  ? kasan_addr_to_slab+0xd/0xc0
      [   23.097760]  ? nft_byteorder_eval+0x13c/0x320
      [   23.097949]  kasan_report+0xc9/0x110
      [   23.098106]  ? nft_byteorder_eval+0x13c/0x320
      [   23.098298]  __asan_load2+0x83/0xd0
      [   23.098453]  nft_byteorder_eval+0x13c/0x320
      [   23.098659]  nft_do_chain+0x1c8/0xc50
      [   23.098852]  ? __pfx_nft_do_chain+0x10/0x10
      [   23.099078]  ? __kasan_check_read+0x11/0x20
      [   23.099295]  ? __pfx___lock_acquire+0x10/0x10
      [   23.099535]  ? __pfx___lock_acquire+0x10/0x10
      [   23.099745]  ? __kasan_check_read+0x11/0x20
      [   23.099929]  nft_do_chain_ipv4+0xfe/0x140
      [   23.100105]  ? __pfx_nft_do_chain_ipv4+0x10/0x10
      [   23.100327]  ? lock_release+0x204/0x400
      [   23.100515]  ? nf_hook.constprop.0+0x340/0x550
      [   23.100779]  nf_hook_slow+0x6c/0x100
      [   23.100977]  ? __pfx_nft_do_chain_ipv4+0x10/0x10
      [   23.101223]  nf_hook.constprop.0+0x334/0x550
      [   23.101443]  ? __pfx_ip_local_deliver_finish+0x10/0x10
      [   23.101677]  ? __pfx_nf_hook.constprop.0+0x10/0x10
      [   23.101882]  ? __pfx_ip_rcv_finish+0x10/0x10
      [   23.102071]  ? __pfx_ip_local_deliver_finish+0x10/0x10
      [   23.102291]  ? rcu_read_lock_held+0x4b/0x70
      [   23.102481]  ip_local_deliver+0xbb/0x110
      [   23.102665]  ? __pfx_ip_rcv+0x10/0x10
      [   23.102839]  ip_rcv+0x199/0x2a0
      [   23.102980]  ? __pfx_ip_rcv+0x10/0x10
      [   23.103140]  __netif_receive_skb_one_core+0x13e/0x150
      [   23.103362]  ? __pfx___netif_receive_skb_one_core+0x10/0x10
      [   23.103647]  ? mark_held_locks+0x48/0xa0
      [   23.103819]  ? process_backlog+0x36c/0x380
      [   23.103999]  __netif_receive_skb+0x23/0xc0
      [   23.104179]  process_backlog+0x91/0x380
      [   23.104350]  __napi_poll.constprop.0+0x66/0x360
      [   23.104589]  ? net_rx_action+0x1cb/0x610
      [   23.104811]  net_rx_action+0x33e/0x610
      [   23.105024]  ? _raw_spin_unlock+0x23/0x50
      [   23.105257]  ? __pfx_net_rx_action+0x10/0x10
      [   23.105485]  ? mark_held_locks+0x48/0xa0
      [   23.105741]  __do_softirq+0xfa/0x5ab
      [   23.105956]  ? __dev_queue_xmit+0x765/0x1c00
      [   23.106193]  do_softirq.part.0+0x49/0xc0
      [   23.106423]  </IRQ>
      [   23.106547]  <TASK>
      [   23.106670]  __local_bh_enable_ip+0xf5/0x120
      [   23.106903]  __dev_queue_xmit+0x789/0x1c00
      [   23.107131]  ? __pfx___dev_queue_xmit+0x10/0x10
      [   23.107381]  ? find_held_lock+0x8e/0xb0
      [   23.107585]  ? lock_release+0x204/0x400
      [   23.107798]  ? neigh_resolve_output+0x185/0x350
      [   23.108049]  ? mark_held_locks+0x48/0xa0
      [   23.108265]  ? neigh_resolve_output+0x185/0x350
      [   23.108514]  neigh_resolve_output+0x246/0x350
      [   23.108753]  ? neigh_resolve_output+0x246/0x350
      [   23.109003]  ip_finish_output2+0x3c3/0x10b0
      [   23.109250]  ? __pfx_ip_finish_output2+0x10/0x10
      [   23.109510]  ? __pfx_nf_hook+0x10/0x10
      [   23.109732]  __ip_finish_output+0x217/0x390
      [   23.109978]  ip_finish_output+0x2f/0x130
      [   23.110207]  ip_output+0xc9/0x170
      [   23.110404]  ip_push_pending_frames+0x1a0/0x240
      [   23.110652]  raw_sendmsg+0x102e/0x19e0
      [   23.110871]  ? __pfx_raw_sendmsg+0x10/0x10
      [   23.111093]  ? lock_release+0x204/0x400
      [   23.111304]  ? __mod_lruvec_page_state+0x148/0x330
      [   23.111567]  ? find_held_lock+0x8e/0xb0
      [   23.111777]  ? find_held_lock+0x8e/0xb0
      [   23.111993]  ? __rcu_read_unlock+0x7c/0x2f0
      [   23.112225]  ? aa_sk_perm+0x18a/0x550
      [   23.112431]  ? filemap_map_pages+0x4f1/0x900
      [   23.112665]  ? __pfx_aa_sk_perm+0x10/0x10
      [   23.112880]  ? find_held_lock+0x8e/0xb0
      [   23.113098]  inet_sendmsg+0xa0/0xb0
      [   23.113297]  ? inet_sendmsg+0xa0/0xb0
      [   23.113500]  ? __pfx_inet_sendmsg+0x10/0x10
      [   23.113727]  sock_sendmsg+0xf4/0x100
      [   23.113924]  ? move_addr_to_kernel.part.0+0x4f/0xa0
      [   23.114190]  __sys_sendto+0x1d4/0x290
      [   23.114391]  ? __pfx___sys_sendto+0x10/0x10
      [   23.114621]  ? __pfx_mark_lock.part.0+0x10/0x10
      [   23.114869]  ? lock_release+0x204/0x400
      [   23.115076]  ? find_held_lock+0x8e/0xb0
      [   23.115287]  ? rcu_is_watching+0x23/0x60
      [   23.115503]  ? __rseq_handle_notify_resume+0x6e2/0x860
      [   23.115778]  ? __kasan_check_write+0x14/0x30
      [   23.116008]  ? blkcg_maybe_throttle_current+0x8d/0x770
      [   23.116285]  ? mark_held_locks+0x28/0xa0
      [   23.116503]  ? do_syscall_64+0x37/0x90
      [   23.116713]  __x64_sys_sendto+0x7f/0xb0
      [   23.116924]  do_syscall_64+0x59/0x90
      [   23.117123]  ? irqentry_exit_to_user_mode+0x25/0x30
      [   23.117387]  ? irqentry_exit+0x77/0xb0
      [   23.117593]  ? exc_page_fault+0x92/0x140
      [   23.117806]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
      [   23.118081] RIP: 0033:0x7f744aee2bba
      [   23.118282] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89
      [   23.119237] RSP: 002b:00007ffd04a7c9f8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      [   23.119644] RAX: ffffffffffffffda RBX: 00007ffd04a7e0a0 RCX: 00007f744aee2bba
      [   23.120023] RDX: 0000000000000040 RSI: 000056488e9e6300 RDI: 0000000000000003
      [   23.120413] RBP: 000056488e9e6300 R08: 00007ffd04a80320 R09: 0000000000000010
      [   23.120809] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000040
      [   23.121219] R13: 00007ffd04a7dc38 R14: 00007ffd04a7ca00 R15: 00007ffd04a7e0a0
      [   23.121617]  </TASK>
      [   23.121749]
      [   23.121845] The buggy address belongs to the virtual mapping at
      [   23.121845]  [ffffc90000000000, ffffc90000009000) created by:
      [   23.121845]  irq_init_percpu_irqstack+0x1cf/0x270
      [   23.122707]
      [   23.122803] The buggy address belongs to the physical page:
      [   23.123104] page:0000000072ac19f0 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x24a09
      [   23.123609] flags: 0xfffffc0001000(reserved|node=0|zone=1|lastcpupid=0x1fffff)
      [   23.123998] page_type: 0xffffffff()
      [   23.124194] raw: 000fffffc0001000 ffffea0000928248 ffffea0000928248 0000000000000000
      [   23.124610] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
      [   23.125023] page dumped because: kasan: bad access detected
      [   23.125326]
      [   23.125421] Memory state around the buggy address:
      [   23.125682]  ffffc90000007800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [   23.126072]  ffffc90000007880: 00 00 00 00 00 f1 f1 f1 f1 f1 f1 00 00 f2 f2 00
      [   23.126455] >ffffc90000007900: 00 00 00 00 00 00 00 00 00 f2 f2 f2 f2 00 00 00
      [   23.126840]                                               ^
      [   23.127138]  ffffc90000007980: 00 00 00 00 00 00 00 00 00 00 00 00 00 f3 f3 f3
      [   23.127522]  ffffc90000007a00: f3 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
      [   23.127906] ==================================================================
      [   23.128324] Disabling lock debugging due to kernel taint
      
      Using simple s16 pointers for the 16-bit accesses fixes the problem. For
      the 32-bit accesses, src and dst can be used directly.
      
      Fixes: 96518518 ("netfilter: add nftables")
      Cc: stable@vger.kernel.org
      Reported-by: Tanguy DUBROCA (@SidewayRE) from @Synacktiv working with ZDI
      Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Reviewed-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • Merge tag 'net-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 68433066
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from bluetooth, bpf and wireguard.
      
        Current release - regressions:
      
         - nvme-tcp: fix comma-related oops after sendpage changes
      
        Current release - new code bugs:
      
         - ptp: make max_phase_adjustment sysfs device attribute invisible
           when not supported
      
        Previous releases - regressions:
      
         - sctp: fix potential deadlock on &net->sctp.addr_wq_lock
      
         - mptcp:
            - ensure subflow is unhashed before cleaning the backlog
            - do not rely on implicit state check in mptcp_listen()
      
        Previous releases - always broken:
      
         - net: fix net_dev_start_xmit trace event vs skb_transport_offset()
      
         - Bluetooth:
            - fix use-bdaddr-property quirk
            - L2CAP: fix multiple UaFs
            - ISO: use hci_sync for setting CIG parameters
            - hci_event: fix Set CIG Parameters error status handling
            - hci_event: fix parsing of CIS Established Event
            - MGMT: fix marking SCAN_RSP as not connectable
      
         - wireguard: queuing: use saner cpu selection wrapping
      
         - sched: act_ipt: various bug fixes for iptables <> TC interactions
      
         - sched: act_pedit: add size check for TCA_PEDIT_PARMS_EX
      
         - dsa: fixes for receiving PTP packets with 8021q and sja1105 tagging
      
         - eth: sfc: fix null-deref in devlink port without MAE access
      
         - eth: ibmvnic: do not reset dql stats on NON_FATAL err
      
        Misc:
      
         - xsk: honor SO_BINDTODEVICE on bind"
      
      * tag 'net-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (70 commits)
        nfp: clean mc addresses in application firmware when closing port
        selftests: mptcp: pm_nl_ctl: fix 32-bit support
        selftests: mptcp: depend on SYN_COOKIES
        selftests: mptcp: userspace_pm: report errors with 'remove' tests
        selftests: mptcp: userspace_pm: use correct server port
        selftests: mptcp: sockopt: return error if wrong mark
        selftests: mptcp: sockopt: use 'iptables-legacy' if available
        selftests: mptcp: connect: fail if nft supposed to work
        mptcp: do not rely on implicit state check in mptcp_listen()
        mptcp: ensure subflow is unhashed before cleaning the backlog
        s390/qeth: Fix vipa deletion
        octeontx-af: fix hardware timestamp configuration
        net: dsa: sja1105: always enable the send_meta options
        net: dsa: tag_sja1105: fix MAC DA patching from meta frames
        net: Replace strlcpy with strscpy
        pptp: Fix fib lookup calls.
        mlxsw: spectrum_router: Fix an IS_ERR() vs NULL check
        net/sched: act_pedit: Add size check for TCA_PEDIT_PARMS_EX
        xsk: Honor SO_BINDTODEVICE on bind
        ptp: Make max_phase_adjustment sysfs device attribute invisible when not supported
        ...
    • Merge tag 'f2fs-for-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs · 73a3fcda
      Linus Torvalds authored
      Pull f2fs updates from Jaegeuk Kim:
       "In this cycle, we've mainly investigated the zoned block device
        support along with patches such as correcting write pointers between
        f2fs and storage, adding asynchronous zone reset flow, and managing
        the number of open zones.
      
        Other than them, f2fs adds another mount option, "errors=x" to specify
        how to handle when it detects an unexpected behavior at runtime.
      
        Enhancements:
         - support 'errors=remount-ro|continue|panic' mount option
         - enforce some inode flag policies
         - allow .tmp compression given extensions
         - add some ioctls to manage the f2fs compression
         - improve looped node chain flow
         - avoid issuing small-sized discard commands during checkpoint
         - implement an asynchronous zone reset
      
        Bug fixes:
         - fix deadlock in xattr and inode page lock
         - fix and add sanity check in some error paths
         - fix to avoid NULL pointer dereference f2fs_write_end_io() along
           with put_super
         - set proper flags to quota files
         - fix potential deadlock due to unpaired node_write lock use
         - fix over-estimating free section during FG GC
         - fix the wrong condition to determine atomic context
      
        As usual, also there are a number of patches with code refactoring and
        minor clean-ups"
      
      * tag 'f2fs-for-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (46 commits)
        f2fs: fix to do sanity check on direct node in truncate_dnode()
        f2fs: only set release for file that has compressed data
        f2fs: fix compile warning in f2fs_destroy_node_manager()
        f2fs: fix error path handling in truncate_dnode()
        f2fs: fix deadlock in i_xattr_sem and inode page lock
        f2fs: remove unneeded page uptodate check/set
        f2fs: update mtime and ctime in move file range method
        f2fs: compress tmp files given extension
        f2fs: refactor struct f2fs_attr macro
        f2fs: convert to use sbi directly
        f2fs: remove redundant assignment to variable err
        f2fs: do not issue small discard commands during checkpoint
        f2fs: check zone write pointer points to the end of zone
        f2fs: add f2fs_ioc_get_compress_blocks
        f2fs: cleanup MIN_INLINE_XATTR_SIZE
        f2fs: add helper to check compression level
        f2fs: set FMODE_CAN_ODIRECT instead of a dummy direct_IO method
        f2fs: do more sanity check on inode
        f2fs: compress: fix to check validity of i_compress_flag field
        f2fs: add sanity compress level check for compressed file
        ...
    • Merge tag 'xfs-6.5-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · bb8e7e9f
      Linus Torvalds authored
      Pull more xfs updates from Darrick Wong:
      
       - Fix some ordering problems with log items during log recovery
      
       - Don't deadlock the system by trying to flush busy freed extents while
         holding on to busy freed extents
      
       - Improve validation of log geometry parameters when reading the
         primary superblock
      
       - Validate the length field in the AGF header
      
       - Fix recordset filtering bugs when re-calling GETFSMAP to return more
         results when the resultset didn't previously fit in the caller's
         buffer
      
       - Fix integer overflows in GETFSMAP when working with rt volumes larger
         than 2^32 fsblocks
      
       - Fix GETFSMAP reporting the undefined space beyond the last rtextent
      
       - Fix filtering bugs in GETFSMAP's log device backend if the log ever
         becomes longer than 2^32 fsblocks
      
       - Improve validation of file offsets in the GETFSMAP range parameters
      
       - Fix an off by one bug in the pmem media failure notification
         computation
      
       - Validate the length field in the AGI header too
      
      * tag 'xfs-6.5-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        xfs: Remove unneeded semicolon
        xfs: AGI length should be bounds checked
        xfs: fix the calculation for "end" and "length"
        xfs: fix xfs_btree_query_range callers to initialize btree rec fully
        xfs: validate fsmap offsets specified in the query keys
        xfs: fix logdev fsmap query result filtering
        xfs: clean up the rtbitmap fsmap backend
        xfs: fix getfsmap reporting past the last rt extent
        xfs: fix integer overflows in the fsmap rtbitmap and logdev backends
        xfs: fix interval filtering in multi-step fsmap queries
        xfs: fix bounds check in xfs_defer_agfl_block()
        xfs: AGF length has never been bounds checked
        xfs: journal geometry is not properly bounds checked
        xfs: don't block in busy flushing when freeing extents
        xfs: allow extent free intents to be retried
        xfs: pass alloc flags through to xfs_extent_busy_flush()
        xfs: use deferred frees for btree block freeing
        xfs: don't reverse order of items in bulk AIL insertion
        xfs: remove redundant initializations of pointers drop_leaf and save_leaf
      bb8e7e9f
    • Linus Torvalds's avatar
      Merge tag 'pwm/for-6.5-rc1' of... · ace1ba1c
      Linus Torvalds authored
      Merge tag 'pwm/for-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm
      
      Pull pwm updates from Thierry Reding:
       "There's a little bit of everything in here: we've got various
        improvements and cleanups to drivers, some fixes across the board and
        a bit of new hardware support"
      
      * tag 'pwm/for-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: (22 commits)
        dt-bindings: pwm: convert pwm-bcm2835 bindings to YAML
        pwm: Add Renesas RZ/G2L MTU3a PWM driver
        pwm: mtk_disp: Fix the disable flow of disp_pwm
        dt-bindings: pwm: restrict node name suffixes
        pwm: pca9685: Switch i2c driver back to use .probe()
        pwm: ab8500: Fix error code in probe()
        MAINTAINERS: add pwm to PolarFire SoC entry
        pwm: add microchip soft ip corePWM driver
        pwm: sysfs: Do not apply state to already disabled PWMs
        pwm: imx-tpm: force 'real_period' to be zero in suspend
        pwm: meson: make full use of common clock framework
        pwm: meson: don't use hdmi/video clock as mux parent
        pwm: meson: switch to using struct clk_parent_data for mux parents
        pwm: meson: remove not needed check in meson_pwm_calc
        pwm: meson: fix handling of period/duty if greater than UINT_MAX
        pwm: meson: modify and simplify calculation in meson_pwm_get_state
        dt-bindings: pwm: Add R-Car V3U device tree bindings
        dt-bindings: pwm: imx: add i.MX8QXP compatible
        pwm: mediatek: Add support for MT7981
        dt-bindings: pwm: mediatek: Add mediatek,mt7981 compatible
        ...
      ace1ba1c
    • Linus Torvalds's avatar
      Merge tag 'devicetree-for-6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · b9861581
      Linus Torvalds authored
      Pull more devicetree updates from Rob Herring:
      
       - Whitespace clean-ups in binding examples
      
       - Restrict node name suffixes to "-[0-9]+" for cases of multiple
         instances which don't have unit-addresses
      
       - Convert brcm,kona-wdt and cdns,wdt-r1p2 watchdog bindings to DT
         schema
      
      * tag 'devicetree-for-6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
        dt-bindings: soc: qcom: stats: Update maintainer email
        dt-bindings: cleanup DTS example whitespaces
        dt-bindings: timestamp: restrict node name suffixes
        dt-bindings: slimbus: restrict node name suffixes
        dt-bindings: watchdog: restrict node name suffixes
        dt-bindings: watchdog: brcm,kona-wdt: convert txt file to yaml
        dt-bindings: watchdog: cdns,wdt-r1p2: Convert cadence watchdog to yaml
      b9861581
    • Yinjun Zhang's avatar
      nfp: clean mc addresses in application firmware when closing port · cc7eab25
      Yinjun Zhang authored
      When moving devices from one namespace to another, mc addresses are
      cleaned in software but not removed from the application firmware. Thus
      the mc addresses remain there and cause a resource leak.
      
      Now use `__dev_mc_unsync` to clean mc addresses when closing port.
      
      Fixes: e20aa071 ("nfp: fix schedule in atomic context when sync mc address")
      Cc: stable@vger.kernel.org
      Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
      Acked-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: Louis Peens <louis.peens@corigine.com>
      Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
      Message-ID: <20230705052818.7122-1-louis.peens@corigine.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      cc7eab25
    • Jakub Kicinski's avatar
      Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · fdaff05b
      Jakub Kicinski authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2023-07-05
      
      We've added 2 non-merge commits during the last 1 day(s) which contain
      a total of 3 files changed, 16 insertions(+), 4 deletions(-).
      
      The main changes are:
      
      1) Fix BTF to warn but not return an error for a NULL BTF, so that
         modules can still be loaded under CONFIG_DEBUG_INFO_BTF, from SeongJae Park.
      
      2) Fix xsk sockets to honor SO_BINDTODEVICE in bind(), from Ilya Maximets.
      
      * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
        xsk: Honor SO_BINDTODEVICE on bind
        bpf, btf: Warn but return no error for NULL btf from __register_btf_kfunc_id_set()
      ====================
      
      Link: https://lore.kernel.org/r/20230705171716.6494-1-daniel@iogearbox.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      fdaff05b
    • Dragos Tatulea's avatar
      net/mlx5e: RX, Fix page_pool page fragment tracking for XDP · 7abd955a
      Dragos Tatulea authored
      Currently mlx5e releases pages directly to the page_pool for XDP_TX and
      does page fragment counting for XDP_REDIRECT. RX pages from the
      page_pool are leaking on XDP_REDIRECT because the xdp core will release
      only one fragment out of MLX5E_PAGECNT_BIAS_MAX and subsequently the page
      is marked as "skip release" which avoids the driver release.
      
      A fix would be to take an extra fragment for XDP_REDIRECT and not set the
      "skip release" bit so that the release on the driver side can handle the
      remaining bias fragments. But this would be a shortsighted solution.
      Instead, this patch converges the two XDP paths (XDP_TX and XDP_REDIRECT) to
      always do fragment tracking. The "skip release" bit is no longer
      necessary for XDP.
      
      Fixes: 6f574284 ("net/mlx5e: RX, Enable skb page recycling through the page_pool")
      Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      7abd955a
    • Maher Sanalla's avatar
      net/mlx5: Query hca_cap_2 only when supported · 6496357a
      Maher Sanalla authored
      On vport enable, where fw's hca caps are queried, the driver queries
      hca_caps_2 without checking if fw truly supports them, causing a false
      failure of vfs vport load and blocking SRIOV enablement on old devices
      such as CX4 where hca_caps_2 support is missing.
      
      Thus, add a check for the said caps support before accessing them.
      
      Fixes: e5b9642a ("net/mlx5: E-Switch, Implement devlink port function cmds to control migratable")
      Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
      Reviewed-by: Shay Drory <shayd@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      6496357a
    • Yevgeny Kliteynik's avatar
      net/mlx5e: TC, CT: Offload ct clear only once · f7a48511
      Yevgeny Kliteynik authored
      Non-clear CT action causes a flow rule split, while CT clear action
      doesn't and is just a header-rewrite to the current flow rule.
      But ct offload is done in post_parse and is per ct action instance,
      so ct clear offload is parsed multiple times, while it is deleted only once.

      Fix this by post_parsing the ct action only once per flow attribute
      (which is per flow rule) by using an offloaded ct_attr flag.
      
      Fixes: 08fe94ec ("net/mlx5e: TC, Remove special handling of CT action")
      Signed-off-by: Paul Blakey <paulb@nvidia.com>
      Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      f7a48511
    • Vlad Buslov's avatar
      net/mlx5e: Check for NOT_READY flag state after locking · 65e64640
      Vlad Buslov authored
      Currently the check for the NOT_READY flag is performed before obtaining
      the necessary lock. This opens the possibility of a race condition when
      the flow is concurrently removed from the unready_flows list by the
      workqueue task, which causes a double removal from the list and a
      crash[0]. Fix the issue by moving the flag check inside the section
      protected by the uplink_priv->unready_flows_lock mutex.
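
As a rough user-space illustration of the pattern described above (not the mlx5e code itself), the sketch below tests a flag only while holding the lock that protects the list. The names `struct flow`, `struct unready_list` and `remove_unready()` are hypothetical, and a pthread mutex stands in for the kernel mutex.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a flow tracked on an "unready" list. */
struct flow {
	bool not_ready;		/* true while the flow is on the unready list */
};

/* Hypothetical list head; only the lock and a removal counter are modelled. */
struct unready_list {
	pthread_mutex_t lock;
	int removals;		/* number of list removals performed */
};

/*
 * Remove the flow from the unready list if it is still on it.
 * The flag is tested only after the lock is taken, so a concurrent
 * remover cannot clear it between the test and the removal.
 * Returns true if this call performed the removal.
 */
static bool remove_unready(struct unready_list *l, struct flow *f)
{
	bool removed = false;

	pthread_mutex_lock(&l->lock);
	if (f->not_ready) {
		f->not_ready = false;
		l->removals++;
		removed = true;
	}
	pthread_mutex_unlock(&l->lock);

	return removed;
}
```

With this shape, a second caller that loses the race sees the flag already cleared and skips the removal instead of removing the entry twice.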
      
      [0]:
      [44376.389654] general protection fault, probably for non-canonical address 0xdead000000000108: 0000 [#1] SMP
      [44376.391665] CPU: 7 PID: 59123 Comm: tc Not tainted 6.4.0-rc4+ #1
      [44376.392984] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
      [44376.395342] RIP: 0010:mlx5e_tc_del_fdb_flow+0xb3/0x340 [mlx5_core]
      [44376.396857] Code: 00 48 8b b8 68 ce 02 00 e8 8a 4d 02 00 4c 8d a8 a8 01 00 00 4c 89 ef e8 8b 79 88 e1 48 8b 83 98 06 00 00 48 8b 93 90 06 00 00 <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 83 90 06
      [44376.399167] RSP: 0018:ffff88812cc97570 EFLAGS: 00010246
      [44376.399680] RAX: dead000000000122 RBX: ffff8881088e3800 RCX: ffff8881881bac00
      [44376.400337] RDX: dead000000000100 RSI: ffff88812cc97500 RDI: ffff8881242f71b0
      [44376.401001] RBP: ffff88811cbb0940 R08: 0000000000000400 R09: 0000000000000001
      [44376.401663] R10: 0000000000000001 R11: 0000000000000000 R12: ffff88812c944000
      [44376.402342] R13: ffff8881242f71a8 R14: ffff8881222b4000 R15: 0000000000000000
      [44376.402999] FS:  00007f0451104800(0000) GS:ffff88852cb80000(0000) knlGS:0000000000000000
      [44376.403787] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [44376.404343] CR2: 0000000000489108 CR3: 0000000123a79003 CR4: 0000000000370ea0
      [44376.405004] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [44376.405665] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [44376.406339] Call Trace:
      [44376.406651]  <TASK>
      [44376.406939]  ? die_addr+0x33/0x90
      [44376.407311]  ? exc_general_protection+0x192/0x390
      [44376.407795]  ? asm_exc_general_protection+0x22/0x30
      [44376.408292]  ? mlx5e_tc_del_fdb_flow+0xb3/0x340 [mlx5_core]
      [44376.408876]  __mlx5e_tc_del_fdb_peer_flow+0xbc/0xe0 [mlx5_core]
      [44376.409482]  mlx5e_tc_del_flow+0x42/0x210 [mlx5_core]
      [44376.410055]  mlx5e_flow_put+0x25/0x50 [mlx5_core]
      [44376.410529]  mlx5e_delete_flower+0x24b/0x350 [mlx5_core]
      [44376.411043]  tc_setup_cb_reoffload+0x22/0x80
      [44376.411462]  fl_reoffload+0x261/0x2f0 [cls_flower]
      [44376.411907]  ? mlx5e_rep_indr_setup_ft_cb+0x160/0x160 [mlx5_core]
      [44376.412481]  ? mlx5e_rep_indr_setup_ft_cb+0x160/0x160 [mlx5_core]
      [44376.413044]  tcf_block_playback_offloads+0x76/0x170
      [44376.413497]  tcf_block_unbind+0x7b/0xd0
      [44376.413881]  tcf_block_setup+0x17d/0x1c0
      [44376.414269]  tcf_block_offload_cmd.isra.0+0xf1/0x130
      [44376.414725]  tcf_block_offload_unbind+0x43/0x70
      [44376.415153]  __tcf_block_put+0x82/0x150
      [44376.415532]  ingress_destroy+0x22/0x30 [sch_ingress]
      [44376.415986]  qdisc_destroy+0x3b/0xd0
      [44376.416343]  qdisc_graft+0x4d0/0x620
      [44376.416706]  tc_get_qdisc+0x1c9/0x3b0
      [44376.417074]  rtnetlink_rcv_msg+0x29c/0x390
      [44376.419978]  ? rep_movs_alternative+0x3a/0xa0
      [44376.420399]  ? rtnl_calcit.isra.0+0x120/0x120
      [44376.420813]  netlink_rcv_skb+0x54/0x100
      [44376.421192]  netlink_unicast+0x1f6/0x2c0
      [44376.421573]  netlink_sendmsg+0x232/0x4a0
      [44376.421980]  sock_sendmsg+0x38/0x60
      [44376.422328]  ____sys_sendmsg+0x1d0/0x1e0
      [44376.422709]  ? copy_msghdr_from_user+0x6d/0xa0
      [44376.423127]  ___sys_sendmsg+0x80/0xc0
      [44376.423495]  ? ___sys_recvmsg+0x8b/0xc0
      [44376.423869]  __sys_sendmsg+0x51/0x90
      [44376.424226]  do_syscall_64+0x3d/0x90
      [44376.424587]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
      [44376.425046] RIP: 0033:0x7f045134f887
      [44376.425403] Code: 0a 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
      [44376.426914] RSP: 002b:00007ffd63a82b98 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      [44376.427592] RAX: ffffffffffffffda RBX: 000000006481955f RCX: 00007f045134f887
      [44376.428195] RDX: 0000000000000000 RSI: 00007ffd63a82c00 RDI: 0000000000000003
      [44376.428796] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
      [44376.429404] R10: 00007f0451208708 R11: 0000000000000246 R12: 0000000000000001
      [44376.430039] R13: 0000000000409980 R14: 000000000047e538 R15: 0000000000485400
      [44376.430644]  </TASK>
      [44376.430907] Modules linked in: mlx5_ib mlx5_core act_mirred act_tunnel_key cls_flower vxlan dummy sch_ingress openvswitch nsh rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm ib_uverbs ib_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay zram zsmalloc fuse [last unloaded: mlx5_core]
      [44376.433936] ---[ end trace 0000000000000000 ]---
      [44376.434373] RIP: 0010:mlx5e_tc_del_fdb_flow+0xb3/0x340 [mlx5_core]
      [44376.434951] Code: 00 48 8b b8 68 ce 02 00 e8 8a 4d 02 00 4c 8d a8 a8 01 00 00 4c 89 ef e8 8b 79 88 e1 48 8b 83 98 06 00 00 48 8b 93 90 06 00 00 <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 83 90 06
      [44376.436452] RSP: 0018:ffff88812cc97570 EFLAGS: 00010246
      [44376.436924] RAX: dead000000000122 RBX: ffff8881088e3800 RCX: ffff8881881bac00
      [44376.437530] RDX: dead000000000100 RSI: ffff88812cc97500 RDI: ffff8881242f71b0
      [44376.438179] RBP: ffff88811cbb0940 R08: 0000000000000400 R09: 0000000000000001
      [44376.438786] R10: 0000000000000001 R11: 0000000000000000 R12: ffff88812c944000
      [44376.439393] R13: ffff8881242f71a8 R14: ffff8881222b4000 R15: 0000000000000000
      [44376.439998] FS:  00007f0451104800(0000) GS:ffff88852cb80000(0000) knlGS:0000000000000000
      [44376.440714] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [44376.441225] CR2: 0000000000489108 CR3: 0000000123a79003 CR4: 0000000000370ea0
      [44376.441843] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [44376.442471] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      
      Fixes: ad86755b ("net/mlx5e: Protect unready flows with dedicated lock")
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Reviewed-by: Roi Dayan <roid@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      65e64640
    • Saeed Mahameed's avatar
      net/mlx5: Register a unique thermal zone per device · 631079e0
      Saeed Mahameed authored
      Prior to this patch only one "mlx5" thermal zone could have been
      registered regardless of the number of individual mlx5 devices in the
      system.
      
      To fix this, set up a unique name per device so that each device
      registers its own thermal zone.
      
      In order to not register a thermal zone for a virtual device (VF/SF) add
      a check for PF device type.
      
      The new name is a concatenation of "mlx5_" and "<PCI_DEV_BDF>", which
      will also help associate a thermal zone with its PCI device.
      
      $ lspci | grep ConnectX
      00:04.0 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
      00:05.0 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
      
      $ cat /sys/devices/virtual/thermal/thermal_zone0/type
      mlx5_0000:00:04.0
      $ cat /sys/devices/virtual/thermal/thermal_zone1/type
      mlx5_0000:00:05.0
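
The naming scheme shown above can be sketched in plain C as follows. This is only an illustration of the "mlx5_" + BDF concatenation; the helper name `thermal_zone_name()` and its error convention are assumptions, not the driver's code.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Build a per-device thermal zone name of the form "mlx5_<PCI_DEV_BDF>".
 * Hypothetical helper: returns 0 on success, -1 if the name does not fit
 * in the buffer or formatting fails.
 */
static int thermal_zone_name(char *buf, size_t len, const char *pci_bdf)
{
	int n = snprintf(buf, len, "mlx5_%s", pci_bdf);

	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

Called with "0000:00:04.0", the helper yields "mlx5_0000:00:04.0", matching the first `type` entry in the listing above.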
      
      Fixes: c1fef618 ("net/mlx5: Implement thermal zone")
      CC: Sandipan Patra <spatra@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      631079e0
    • Dragos Tatulea's avatar
      net/mlx5e: RX, Fix flush and close release flow of regular rq for legacy rq · 2e2d1965
      Dragos Tatulea authored
      Regular (non-XSK) RQs get flushed on XSK setup and re-activated on XSK
      close. If the same regular RQ is closed (a config change for example)
      soon after the XSK close, a double release occurs because the missing
      wqes get released a second time.
      
      Fixes: 3f93f829 ("net/mlx5e: RX, Defer page release in legacy rq for better recycling")
      Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      2e2d1965
    • Zhengchao Shao's avatar
      net/mlx5e: fix memory leak in mlx5e_ptp_open · d543b649
      Zhengchao Shao authored
      When kvzalloc_node or kvzalloc fails in mlx5e_ptp_open, the memory
      pointed to by "c" or "cparams" is not freed, which can lead to a memory
      leak. Fix by freeing the array in the error path.
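
The shape of such a fix can be sketched in user space as below. This is a minimal illustration of a shared error label that frees both allocations, not the mlx5e code; `sketch_alloc()`, `sketch_free()`, `chan_open()` and the two structs are hypothetical, and the allocation hook exists only so that a failure can be simulated.

```c
#include <errno.h>
#include <stdlib.h>

struct chan { int id; };		/* hypothetical stand-in for "c" */
struct chan_params { int mtu; };	/* hypothetical stand-in for "cparams" */

/* Hypothetical allocation hook so that one allocation can be made to fail. */
static int fail_at = -1;	/* index of the allocation that fails, -1 = none */
static int alloc_calls;		/* allocations attempted so far */
static int live_allocs;		/* allocations not yet freed */

static void *sketch_alloc(size_t size)
{
	void *p;

	if (alloc_calls++ == fail_at)
		return NULL;
	p = calloc(1, size);
	if (p)
		live_allocs++;
	return p;
}

static void sketch_free(void *p)
{
	if (p) {
		live_allocs--;
		free(p);
	}
}

static int chan_open(struct chan **out)
{
	struct chan_params *cparams;
	struct chan *c;

	c = sketch_alloc(sizeof(*c));
	cparams = sketch_alloc(sizeof(*cparams));
	if (!c || !cparams)
		goto err_free;	/* a bare "return -ENOMEM" here would leak */

	/* ... setup using cparams would go here ... */

	sketch_free(cparams);	/* parameters are only needed during setup */
	*out = c;
	return 0;

err_free:
	/* Freeing NULL is a no-op, so one label covers both failure cases. */
	sketch_free(cparams);
	sketch_free(c);
	return -ENOMEM;
}
```

Whichever of the two allocations fails, the other one is released before the function returns.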
      
      Fixes: 145e5637 ("net/mlx5e: Add TX PTP port object support")
      Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
      Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
      Reviewed-by: Gal Pressman <gal@nvidia.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      d543b649
    • Zhengchao Shao's avatar
      net/mlx5e: fix memory leak in mlx5e_fs_tt_redirect_any_create · 3250affd
      Zhengchao Shao authored
      The memory pointed to by the fs->any pointer is not freed in the error
      path of mlx5e_fs_tt_redirect_any_create, which can lead to a memory leak.
      Fix by freeing the memory in the error path, thereby making the error path
      identical to mlx5e_fs_tt_redirect_any_destroy().
      
      Fixes: 0f575c20 ("net/mlx5e: Introduce Flow Steering ANY API")
      Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      3250affd
    • Zhengchao Shao's avatar
      net/mlx5e: fix double free in mlx5e_destroy_flow_table · 884abe45
      Zhengchao Shao authored
      In function accel_fs_tcp_create_groups(), when the ft->g memory is
      successfully allocated but the 'in' memory fails to be allocated, the
      memory pointed to by ft->g is released once. Then, in function
      accel_fs_tcp_create_table, mlx5e_destroy_flow_table is called and releases
      the memory pointed to by ft->g again. This causes a double free.
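
One common way to make such a teardown safe is to clear the pointer after freeing it, so that a later release becomes a no-op. The sketch below illustrates that idea in user space under that assumption; it is not claimed to be the change this commit makes, and `struct flow_table`, `table_free_groups()`, `create_groups()` and `create_table()` are hypothetical names.

```c
#include <stdlib.h>

struct flow_table {
	int *g;			/* group array, owned by the table */
};

static int frees;		/* counts releases of a non-NULL group array */

/* Release the group array and clear the pointer so a repeat call is safe. */
static void table_free_groups(struct flow_table *ft)
{
	if (ft->g)
		frees++;
	free(ft->g);		/* free(NULL) is a no-op */
	ft->g = NULL;
}

/* Allocates ft->g and a scratch buffer; the second allocation may "fail". */
static int create_groups(struct flow_table *ft, int fail_second_alloc)
{
	void *in;

	ft->g = calloc(4, sizeof(*ft->g));
	in = fail_second_alloc ? NULL : malloc(64);
	if (!ft->g || !in) {
		free(in);
		table_free_groups(ft);	/* first release, pointer is cleared */
		return -1;
	}

	free(in);
	return 0;
}

static int create_table(struct flow_table *ft, int fail_second_alloc)
{
	int err = create_groups(ft, fail_second_alloc);

	if (err)
		table_free_groups(ft);	/* stands in for the destroy call */
	return err;
}
```

Because the first release clears `ft->g`, the second call in the error path of `create_table()` finds a NULL pointer and frees nothing.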
      
      Fixes: c062d52a ("net/mlx5e: Receive flow steering framework for accelerated TCP flows")
      Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      884abe45
    • Linus Torvalds's avatar
      Merge tag 'soundwire-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire · fe1de551
      Linus Torvalds authored
      Pull soundwire updates from Vinod Koul:
      
       - Stream handling and slave alert handling
      
       - Qualcomm Soundwire v2.0.0 controller support
      
       - Intel ACE2.x initial support and code reorganization
      
      * tag 'soundwire-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire: (55 commits)
        soundwire: stream: Make master_list ordered to prevent deadlocks
        soundwire: bus: Prevent lockdep asserts when stream has multiple buses
        soundwire: qcom: fix storing port config out-of-bounds
        soundwire: intel_ace2x: fix SND_SOC_SOF_HDA_MLINK dependency
        soundwire: debugfs: Add missing SCP registers
        soundwire: stream: Remove unnecessary gotos
        soundwire: stream: Invert logic on runtime alloc flags
        soundwire: stream: Remove unneeded checks for NULL bus
        soundwire: bandwidth allocation: Remove pointless variable
        soundwire: cadence: revisit parity injection
        soundwire: intel/cadence: update hardware reset sequence
        soundwire: intel_bus_common: enable interrupts last
        soundwire: intel_bus_common: update error log
        soundwire: amd: Improve error message in remove callback
        soundwire: debugfs: fix unbalanced pm_runtime_put()
        soundwire: qcom: fix unbalanced pm_runtime_put()
        soundwire: qcom: set clk stop need reset flag at runtime
        soundwire: qcom: add software workaround for bus clash interrupt assertion
        soundwire: qcom: wait for fifo to be empty before suspend
        soundwire: qcom: drop unused struct qcom_swrm_ctrl members
        ...
      fe1de551
    • Linus Torvalds's avatar
      Merge tag 'media/v6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · 15ac4686
      Linus Torvalds authored
      Pull media updates from Mauro Carvalho Chehab:
      
       - Lots of improvements to the atomisp driver, which is starting to look
         in good shape
      
       - Mediatek vcodec driver has gained support for av1 and hevc stateless
         codecs
      
       - New sensor driver: ov01a10
      
       - verisilicon driver has gained AV1 entropy helpers
      
       - tegra-video has gained support for Tegra20 parallel input
      
       - dvb core has gained an extra property to better support DVB-S2X
      
       - as usual, lots of cleanups, fixes and improvements on media drivers
      
      * tag 'media/v6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (253 commits)
        media: wl128x: fix a clang warning
        media: dvb: mb86a20s: get rid of a clang-15 warning
        media: cec: i2c: ch7322: also select REGMAP
        media: add HAS_IOPORT dependencies
        media: tc358746: select CONFIG_GENERIC_PHY
        media: mediatek: vcodec: Add dbgfs help function
        media: mediatek: vcodec: Add encode to support dbgfs
        media: mediatek: vcodec: Change dbgfs interface to support encode
        media: mediatek: vcodec: Get each instance format type
        media: mediatek: vcodec: Get each context resolution information
        media: mediatek: vcodec: Add a debugfs file to get different useful information
        media: mediatek: vcodec: Add debug params to control different log level
        media: mediatek: vcodec: Add debugfs interface to get debug information
        media: mediatek: vcodec: support stateless AV1 decoder
        media: verisilicon: Conditionally ignore native formats
        media: verisilicon: Enable AV1 decoder on rk3588
        media: verisilicon: Add film grain feature to AV1 driver
        media: verisilicon: Add Rockchip AV1 decoder
        media: verisilicon: Add AV1 entropy helpers
        media: verisilicon: Compute motion vectors size for AV1 frames
        ...
      15ac4686
    • Linus Torvalds's avatar
      Merge tag 'trace-tools-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · 2784d74b
      Linus Torvalds authored
      Pull tracing tooling updates from Steven Rostedt:
      
       - Add cgroup support for rtla via the -C option
      
       - Add --house-keeping option that tells rtla where to place the
         housekeeping threads
      
       - Have rtla/timerlat have its own tracing instance instead of using the
         top level tracing instance that is the default for other tracing
         users to use
      
       - Add auto analysis to timerlat_hist
      
       - Have rtla start the tracers after creating the instances
      
       - Reduce rtla hwnoise down to 75% from 100% as it runs with preemption
         disabled and can cause system instability at 100%
      
       - Add support to run timerlat_top and timerlat_hist threads in
         user-space instead of just using the kernel tasks
      
       - Some minor clean ups and documentation changes
      
      * tag 'trace-tools-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        Documentation: Add tools/rtla timerlat -u option documentation
        rtla/timerlat_hist: Add timerlat user-space support
        rtla/timerlat_top: Add timerlat user-space support
        rtla/hwnoise: Reduce runtime to 75%
        rtla: Start the tracers after creating all instances
        rtla/timerlat_hist: Add auto-analysis support
        rtla/timerlat: Give timerlat auto analysis its own instance
        rtla: Automatically move rtla to a house-keeping cpu
        rtla: Change monitored_cpus from char * to cpu_set_t
        rtla: Add --house-keeping option
        rtla: Add -C cgroup support
      2784d74b
    • Linus Torvalds's avatar
      Merge tag 'parisc-for-6.5-rc1-2' of... · 2a95b03d
      Linus Torvalds authored
      Merge tag 'parisc-for-6.5-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
      
      Pull more parisc architecture updates from Helge Deller:
      
       -  Fix all compiler warnings in arch/parisc and drivers/parisc when
          compiled with W=1
      
      * tag 'parisc-for-6.5-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
        parisc: syscalls: Avoid compiler warnings with W=1
        parisc: math-emu: Avoid compiler warnings with W=1
        parisc: Raise minimal GCC version to 12.0.0
        parisc: unwind: Avoid missing prototype warning for handle_interruption()
        parisc: smp: Add declaration for start_cpu_itimer()
        parisc: pdt: Get prototype for arch_report_meminfo()
      2a95b03d
    • Linus Torvalds's avatar
      gup: make the stack expansion warning a bit more targeted · 6cd06ab1
      Linus Torvalds authored
      I added a warning about GUP no longer expanding the stack in
      commit a425ac53 ("gup: add warning if some caller would seem to want
      stack expansion"), but didn't really expect anybody to hit it.
      
      And it's true that nobody seems to have hit a _real_ case yet, but we
      certainly have a number of reports of false positives.  Which not only
      causes extra noise in itself, but might also end up hiding any real
      cases if they do exist.
      
      So let's tighten up the warning condition, and replace the simplistic
      
      	vma = find_vma(mm, start);
      	if (vma && (start < vma->vm_start)) {
      		WARN_ON_ONCE(vma->vm_flags & VM_GROWSDOWN);
      
      with a
      
      	vma = gup_vma_lookup(mm, start);
      
      helper function which works otherwise like just "vma_lookup()", but with
      some heuristics for when to warn about gup no longer causing stack
      expansion.
      
      In particular, don't just warn for "below the stack", but warn if it's
      _just_ below the stack (with "just below" arbitrarily defined as 64kB,
      because why not?).  And rate-limit it to at most once per hour, which
      means that any false positives shouldn't completely hide subsequent
      reports, but we won't be flooding the logs about it either.
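
The heuristic described above can be sketched as a small stand-alone predicate. This is an illustration only, not the body of gup_vma_lookup(): `struct warn_state`, `should_warn()` and the caller-supplied clock in seconds are hypothetical, and treating an address exactly 64kB below the stack as "just below" is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define JUST_BELOW_BYTES	(64u * 1024u)	/* "just below" = 64kB, per the text */
#define WARN_INTERVAL_SECS	3600		/* at most one warning per hour */

/* Hypothetical rate-limit state; the caller supplies the clock. */
struct warn_state {
	bool warned_before;
	int64_t last_warn;	/* time of the last warning, in seconds */
};

/*
 * Decide whether a lookup at "start" deserves a stack expansion warning,
 * given the start address of the stack vma that lies above it.
 */
static bool should_warn(struct warn_state *st, uint64_t start,
			uint64_t stack_vm_start, int64_t now)
{
	/* Addresses inside or above the vma are ordinary lookups. */
	if (start >= stack_vm_start)
		return false;

	/* Far below the stack: not a plausible stack expansion attempt. */
	if (stack_vm_start - start > JUST_BELOW_BYTES)
		return false;

	/* Rate limit: stay quiet if a warning was issued within the hour. */
	if (st->warned_before && now - st->last_warn < WARN_INTERVAL_SECS)
		return false;

	st->warned_before = true;
	st->last_warn = now;
	return true;
}
```

An access one page below the stack warns once, a repeat a few minutes later is suppressed, and an access a megabyte below the stack never warns.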
      
      The previous code triggered when some GUP user (chromium crashpad)
      accessed past the end of the previous vma, for example.  That has never
      expanded the stack, it just causes GUP to return early, and as such we
      shouldn't be warning about it.
      
      This is still going to trigger the randomized testers, but to mitigate the
      noise from that, use "dump_stack()" instead of "WARN_ON_ONCE()" to get
      the kernel call chain.  We'll get the relevant information, but syzbot
      shouldn't get too upset about it.
      
      Also, don't even bother with the GROWSUP case, which would be using
      different heuristics entirely, but only happens on parisc.

      Reported-by: kernel test robot <oliver.sang@intel.com>
      Reported-by: John Hubbard <jhubbard@nvidia.com>
      Reported-by: syzbot+6cf44e127903fdf9d929@syzkaller.appspotmail.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6cd06ab1
    • Sridhar Samudrala's avatar
      ice: Fix tx queue rate limit when TCs are configured · 479cdfe3
      Sridhar Samudrala authored
      Configuring tx_maxrate via the sysfs interface
      /sys/class/net/eth0/queues/tx-1/tx_maxrate was not working when
      TCs are configured because the main VSI was always being used. Fix by
      using the correct VSI in ice_set_tx_maxrate when TCs are configured.
      
      Fixes: 1ddef455 ("ice: Add NDO callback to set the maximum per-queue bitrate")
      Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
      Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
      Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      479cdfe3
    • Sridhar Samudrala's avatar
      ice: Fix max_rate check while configuring TX rate limits · 5f16da6e
      Sridhar Samudrala authored
      Remove the incorrect check in ice_validate_mqprio_opt() that limits
      filter configuration when the sum of max_rates of all TCs exceeds
      the link speed. The max rate of each TC is unrelated to the values
      used by other TCs and is valid as long as it is less than the link
      speed.
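
The difference between the two checks can be sketched as follows. This is not the body of ice_validate_mqprio_opt(); `tc_max_rates_valid()` is a hypothetical helper, the units are arbitrary, and accepting a rate exactly equal to the link speed is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Validate per-TC maximum rates against the link speed.
 * Each TC is compared on its own; the sum across TCs is deliberately
 * not checked, since the limits are independent of one another.
 */
static bool tc_max_rates_valid(const uint64_t *max_rate, size_t num_tc,
			       uint64_t link_speed)
{
	for (size_t i = 0; i < num_tc; i++) {
		if (max_rate[i] > link_speed)
			return false;
	}
	return true;
}
```

With a link speed of 100, the rates {60, 70} pass even though they add up to 130, while {60, 120} is rejected because one TC alone exceeds the link speed.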
      
      Fixes: fbc7b27a ("ice: enable ndo_setup_tc support for mqprio_qdisc")
      Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
      Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
      Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      5f16da6e
    • Maulik Shah's avatar
    • Krzysztof Kozlowski's avatar
      dt-bindings: cleanup DTS example whitespaces · ad5d9601
      Krzysztof Kozlowski authored
      The DTS coding style expects spaces around the '=' sign.

      Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
      Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
      Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
      Acked-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> #display/msm
      Acked-by: Neil Armstrong <neil.armstrong@linaro.org>
      Acked-by: Mike Leach <mike.leach@linaro.org>
      Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Acked-by: Vinod Koul <vkoul@kernel.org>
      Link: https://lore.kernel.org/r/20230702182308.7583-1-krzysztof.kozlowski@linaro.org
      Signed-off-by: Rob Herring <robh@kernel.org>
      ad5d9601
    • Thadeu Lima de Souza Cascardo's avatar
      netfilter: nf_tables: do not ignore genmask when looking up chain by id · 515ad530
      Thadeu Lima de Souza Cascardo authored
      When adding a rule to a chain referring to its ID, if that chain had been
      deleted in the same batch, the rule might end up referring to a deleted
      chain.
      
      This will lead to a WARNING like the following:
      
      [   33.098431] ------------[ cut here ]------------
      [   33.098678] WARNING: CPU: 5 PID: 69 at net/netfilter/nf_tables_api.c:2037 nf_tables_chain_destroy+0x23d/0x260
      [   33.099217] Modules linked in:
      [   33.099388] CPU: 5 PID: 69 Comm: kworker/5:1 Not tainted 6.4.0+ #409
      [   33.099726] Workqueue: events nf_tables_trans_destroy_work
      [   33.100018] RIP: 0010:nf_tables_chain_destroy+0x23d/0x260
      [   33.100306] Code: 8b 7c 24 68 e8 64 9c ed fe 4c 89 e7 e8 5c 9c ed fe 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d 31 c0 89 c6 89 c7 c3 cc cc cc cc <0f> 0b 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d 31 c0 89 c6 89 c7
      [   33.101271] RSP: 0018:ffffc900004ffc48 EFLAGS: 00010202
      [   33.101546] RAX: 0000000000000001 RBX: ffff888006fc0a28 RCX: 0000000000000000
      [   33.101920] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
      [   33.102649] RBP: ffffc900004ffc78 R08: 0000000000000000 R09: 0000000000000000
      [   33.103018] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8880135ef500
      [   33.103385] R13: 0000000000000000 R14: dead000000000122 R15: ffff888006fc0a10
      [   33.103762] FS:  0000000000000000(0000) GS:ffff888024c80000(0000) knlGS:0000000000000000
      [   33.104184] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   33.104493] CR2: 00007fe863b56a50 CR3: 00000000124b0001 CR4: 0000000000770ee0
      [   33.104872] PKRU: 55555554
      [   33.104999] Call Trace:
      [   33.105113]  <TASK>
      [   33.105214]  ? show_regs+0x72/0x90
      [   33.105371]  ? __warn+0xa5/0x210
      [   33.105520]  ? nf_tables_chain_destroy+0x23d/0x260
      [   33.105732]  ? report_bug+0x1f2/0x200
      [   33.105902]  ? handle_bug+0x46/0x90
      [   33.106546]  ? exc_invalid_op+0x19/0x50
      [   33.106762]  ? asm_exc_invalid_op+0x1b/0x20
      [   33.106995]  ? nf_tables_chain_destroy+0x23d/0x260
      [   33.107249]  ? nf_tables_chain_destroy+0x30/0x260
      [   33.107506]  nf_tables_trans_destroy_work+0x669/0x680
      [   33.107782]  ? mark_held_locks+0x28/0xa0
      [   33.107996]  ? __pfx_nf_tables_trans_destroy_work+0x10/0x10
      [   33.108294]  ? _raw_spin_unlock_irq+0x28/0x70
      [   33.108538]  process_one_work+0x68c/0xb70
      [   33.108755]  ? lock_acquire+0x17f/0x420
      [   33.108977]  ? __pfx_process_one_work+0x10/0x10
      [   33.109218]  ? do_raw_spin_lock+0x128/0x1d0
      [   33.109435]  ? _raw_spin_lock_irq+0x71/0x80
      [   33.109634]  worker_thread+0x2bd/0x700
      [   33.109817]  ? __pfx_worker_thread+0x10/0x10
      [   33.110254]  kthread+0x18b/0x1d0
      [   33.110410]  ? __pfx_kthread+0x10/0x10
      [   33.110581]  ret_from_fork+0x29/0x50
      [   33.110757]  </TASK>
      [   33.110866] irq event stamp: 1651
      [   33.111017] hardirqs last  enabled at (1659): [<ffffffffa206a209>] __up_console_sem+0x79/0xa0
      [   33.111379] hardirqs last disabled at (1666): [<ffffffffa206a1ee>] __up_console_sem+0x5e/0xa0
      [   33.111740] softirqs last  enabled at (1616): [<ffffffffa1f5d40e>] __irq_exit_rcu+0x9e/0xe0
      [   33.112094] softirqs last disabled at (1367): [<ffffffffa1f5d40e>] __irq_exit_rcu+0x9e/0xe0
      [   33.112453] ---[ end trace 0000000000000000 ]---
      
      This happens because nft_chain_lookup_byid() ignores the genmask. After this
      change, adding the new rule fails because the chain is no longer found.
      
      Fixes: 837830a4 ("netfilter: nf_tables: add NFTA_RULE_CHAIN_ID attribute")
      Cc: stable@vger.kernel.org
      Reported-by: Mingi Cho of Theori working with ZDI
      Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
      Reviewed-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      515ad530
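      The genmask check described in the commit above can be illustrated with a
      small userspace model. This is only a sketch under stated assumptions:
      struct model_chain, model_active_genmask() and both lookup helpers are
      hypothetical names invented here, and the real lookup walks the pending
      transaction list rather than a plain array. The one idea carried over is
      that an object counts as active in a generation when none of that
      generation's bits are set in its genmask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a chain created in the current batch. */
struct model_chain {
	uint32_t id;      /* batch-local chain ID */
	uint8_t genmask;  /* a set bit means "inactive in that generation" */
};

/* Active when no bit of the requested generation is set on the object. */
static int model_active_genmask(const struct model_chain *chain, uint8_t genmask)
{
	return !(chain->genmask & genmask);
}

/* Lookup that ignores the generation mask (models the behaviour before the fix). */
static struct model_chain *lookup_byid_ignoring_genmask(struct model_chain *chains,
							size_t n, uint32_t id)
{
	assert(chains != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (chains[i].id == id)
			return &chains[i];
	return NULL;
}

/* Lookup that also requires the chain to be active in the given generation. */
static struct model_chain *lookup_byid(struct model_chain *chains, size_t n,
				       uint32_t id, uint8_t genmask)
{
	assert(chains != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (chains[i].id == id &&
		    model_active_genmask(&chains[i], genmask))
			return &chains[i];
	return NULL;
}
```

      In this model, a chain whose genmask has the next-generation bit set
      (that is, a chain deleted earlier in the same batch) is still returned by
      the first helper but not by the second, so a rule that refers to it by ID
      would be rejected.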
    • Florian Westphal's avatar
      netfilter: conntrack: don't fold port numbers into addresses before hashing · eaf9e719
      Florian Westphal authored
      Originally this used jhash2() over the tuple and folded the zone id,
      the pernet hash value, destination port and l4 protocol number into the
      32-bit seed value.
      
      When the switch to siphash was done, I used an on-stack temporary
      buffer to build a suitable key to be hashed via siphash().
      
      But this showed up as a performance regression, so I got rid of
      the temporary copy and collected to-be-hashed data in 4 u64 variables.
      
      Folding the data that way makes it easy to build tuples that produce the
      same hash, which isn't desirable even though chain lengths are limited.
      
      Switch back to plain siphash, but just like with jhash2(), take advantage
      of the fact that most of the to-be-hashed data is already in a suitable order.
      
      Use an empty struct as an annotation in 'struct nf_conntrack_tuple' to mark
      the last member that can be used as hash input.
      
      The only remaining data that isn't present in the tuple structure are the
      zone identifier and the pernet hash: fold those into the key.
      
      Fixes: d2c806ab ("netfilter: conntrack: use siphash_4u64")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      eaf9e719
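      The hashing approach described in the commit above (hash a contiguous
      prefix of the tuple in one pass, and fold the values that live outside
      the tuple into the key) can be sketched in userspace as follows. All
      names here are hypothetical. struct model_tuple is much smaller than the
      real struct nf_conntrack_tuple, offsetof() of the first excluded member
      stands in for the empty-struct marker, and toy_keyed_hash() is an
      FNV-1a style placeholder, not siphash, so it offers none of siphash's
      collision resistance.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified tuple; the real struct nf_conntrack_tuple differs. */
struct model_tuple {
	uint32_t src_addr;
	uint32_t dst_addr;
	uint16_t src_port;
	uint16_t dst_port;
	uint8_t protonum;
	/* Everything above is hash input; everything below is not. */
	uint8_t dir;
};

/* Length of the contiguous prefix that is fed to the hash. */
#define MODEL_TUPLE_HASH_LEN offsetof(struct model_tuple, dir)

/* The hashed prefix must end right after protonum, with no padding. */
static_assert(offsetof(struct model_tuple, protonum) + sizeof(uint8_t) ==
	      MODEL_TUPLE_HASH_LEN, "unexpected padding in hashed prefix");

/* Placeholder keyed hash (FNV-1a style). This is NOT siphash. */
static uint64_t toy_keyed_hash(const void *data, size_t len,
			       const uint64_t key[2])
{
	const unsigned char *p = data;
	uint64_t h = 0xcbf29ce484222325ULL ^ key[0];

	/* Mix in the second key word byte by byte, then the data. */
	for (int i = 0; i < 8; i++) {
		h ^= (key[1] >> (8 * i)) & 0xff;
		h *= 0x100000001b3ULL;
	}
	for (size_t i = 0; i < len; i++) {
		h ^= p[i];
		h *= 0x100000001b3ULL;
	}
	return h;
}

/* Fold zone id and per-netns mix into the key, then hash the tuple prefix. */
static uint64_t model_hash_tuple(const struct model_tuple *t, uint32_t zone_id,
				 uint32_t net_mix, const uint64_t secret[2])
{
	uint64_t key[2] = { secret[0] ^ zone_id, secret[1] ^ net_mix };

	return toy_keyed_hash(t, MODEL_TUPLE_HASH_LEN, key);
}
```

      In this sketch, ports and addresses are hashed as separate bytes of the
      prefix instead of being combined arithmetically first, and members placed
      after the marker (here, dir) do not influence the result.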
    • Florent Revest's avatar
      netfilter: conntrack: Avoid nf_ct_helper_hash uses after free · 6eef7a2b
      Florent Revest authored
      If nf_conntrack_init_start() fails (for example due to a
      register_nf_conntrack_bpf() failure), the nf_conntrack_helper_fini()
      clean-up path frees the nf_ct_helper_hash map.
      
      When built with NF_CONNTRACK=y, further netfilter modules (e.g.
      netfilter_conntrack_ftp) can still be loaded and call
      nf_conntrack_helpers_register(), independently of whether nf_conntrack
      initialized correctly. This accesses the dangling nf_ct_helper_hash
      pointer and causes a use-after-free, possibly leading to random memory
      corruption.
      
      This patch guards nf_conntrack_helper_register() from accessing a freed
      or uninitialized nf_ct_helper_hash pointer and fixes possible
      uses-after-free when loading a conntrack module.
      
      Cc: stable@vger.kernel.org
      Fixes: 12f7a505 ("netfilter: add user-space connection tracking helper infrastructure")
      Signed-off-by: Florent Revest <revest@chromium.org>
      Reviewed-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      6eef7a2b
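      The guard described in the commit above follows a common pattern: the
      clean-up path resets the table pointer after freeing it, and the
      registration path refuses to run while the pointer is unset. A minimal
      userspace sketch, assuming hypothetical names (model_helper_hash,
      model_helper_init(), model_helper_fini(), model_helper_register()) and
      an assumed -ENOENT return value; the kernel code uses its own allocator,
      locking and hashing, none of which is modelled here.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define MODEL_HSIZE 16 /* hypothetical table size */

struct model_helper {
	const char *name;
	struct model_helper *next;
};

/* NULL until init succeeds, and NULL again after clean-up. */
static struct model_helper **model_helper_hash;

static int model_helper_init(void)
{
	model_helper_hash = calloc(MODEL_HSIZE, sizeof(*model_helper_hash));
	return model_helper_hash ? 0 : -ENOMEM;
}

static void model_helper_fini(void)
{
	free(model_helper_hash);
	model_helper_hash = NULL; /* leave no dangling pointer behind */
}

static int model_helper_register(struct model_helper *me, unsigned int bucket)
{
	assert(me != NULL);

	/* Refuse to touch a table that was never set up or is already freed. */
	if (!model_helper_hash)
		return -ENOENT;

	me->next = model_helper_hash[bucket % MODEL_HSIZE];
	model_helper_hash[bucket % MODEL_HSIZE] = me;
	return 0;
}
```

      With this shape, a caller that registers before initialization or after
      a failed initialization gets an error instead of writing through a stale
      pointer.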