1. 28 Aug, 2024 2 commits
    • ice: Add netif_device_attach/detach into PF reset flow · d11a6763
      Dawid Osuchowski authored
      Ethtool callbacks can be executed while reset is in progress and try to
      access deleted resources, e.g. getting coalesce settings can result in a
      NULL pointer dereference seen below.
      
      Reproduction steps:
      Once the driver is fully initialized, trigger reset:
      	# echo 1 > /sys/class/net/<interface>/device/reset
      While the reset is in progress, try to get coalesce settings using ethtool:
      	# ethtool -c <interface>
      
      BUG: kernel NULL pointer dereference, address: 0000000000000020
      PGD 0 P4D 0
      Oops: Oops: 0000 [#1] PREEMPT SMP PTI
      CPU: 11 PID: 19713 Comm: ethtool Tainted: G S                 6.10.0-rc7+ #7
      RIP: 0010:ice_get_q_coalesce+0x2e/0xa0 [ice]
      RSP: 0018:ffffbab1e9bcf6a8 EFLAGS: 00010206
      RAX: 000000000000000c RBX: ffff94512305b028 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: ffff9451c3f2e588 RDI: ffff9451c3f2e588
      RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
      R10: ffff9451c3f2e580 R11: 000000000000001f R12: ffff945121fa9000
      R13: ffffbab1e9bcf760 R14: 0000000000000013 R15: ffffffff9e65dd40
      FS:  00007faee5fbe740(0000) GS:ffff94546fd80000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000020 CR3: 0000000106c2e005 CR4: 00000000001706f0
      Call Trace:
      <TASK>
      ice_get_coalesce+0x17/0x30 [ice]
      coalesce_prepare_data+0x61/0x80
      ethnl_default_doit+0xde/0x340
      genl_family_rcv_msg_doit+0xf2/0x150
      genl_rcv_msg+0x1b3/0x2c0
      netlink_rcv_skb+0x5b/0x110
      genl_rcv+0x28/0x40
      netlink_unicast+0x19c/0x290
      netlink_sendmsg+0x222/0x490
      __sys_sendto+0x1df/0x1f0
      __x64_sys_sendto+0x24/0x30
      do_syscall_64+0x82/0x160
      entry_SYSCALL_64_after_hwframe+0x76/0x7e
      RIP: 0033:0x7faee60d8e27
      
      Calling netif_device_detach() before reset stops the net core from
      calling into the driver when an ethtool command is issued. An attempt
      to execute an ethtool command during reset then results in the
      following message:
      
          netlink error: No such device
      
      instead of a NULL pointer dereference. Once reset is done and
      ice_rebuild() is executing, netif_device_attach() is called so that
      ethtool operations can safely occur again.
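The detach/attach gating described above can be sketched as a small userspace model. This is not the ice driver code: every `model_*` name is a hypothetical stand-in, and the `present` flag only approximates the check the net core makes before invoking a driver callback.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are real kernel types. */
struct model_ring {
	int itr_setting;
};

struct model_netdev {
	bool present;             /* models the "device present" state */
	struct model_ring *ring;  /* NULL while a reset has torn the rings down */
};

/* Models netif_device_detach(): mark the device as not present. */
static void model_detach(struct model_netdev *dev)
{
	dev->present = false;
}

/* Models netif_device_attach(): mark the device as present again. */
static void model_attach(struct model_netdev *dev)
{
	dev->present = true;
}

/* Models the core-side check made before the driver callback runs. */
static int model_get_coalesce(const struct model_netdev *dev, int *itr)
{
	if (!dev->present)
		return -ENODEV;              /* userspace sees "No such device" */
	*itr = dev->ring->itr_setting;       /* only safe while the ring exists */
	return 0;
}

/* Models the reset flow: detach, tear down, rebuild, attach.
 * Returns the result of a coalesce query issued in the middle of the reset. */
static int model_reset(struct model_netdev *dev, struct model_ring *new_ring)
{
	int itr = 0;
	int during;

	model_detach(dev);
	dev->ring = NULL;                        /* resources deleted */
	during = model_get_coalesce(dev, &itr);  /* "ethtool -c" mid-reset */
	dev->ring = new_ring;                    /* rebuild */
	model_attach(dev);
	return during;
}
```

In this model a query issued mid-reset returns -ENODEV instead of dereferencing the NULL ring pointer, and queries succeed again once the rebuild has attached the device.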
      
      Fixes: fcea6f3d ("ice: Add stats and ethtool support")
      Suggested-by: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: Igor Bagnucki <igor.bagnucki@intel.com>
      Signed-off-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com>
      Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
      Reviewed-by: Michal Schmidt <mschmidt@redhat.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    • igb: Fix not clearing TimeSync interrupts for 82580 · ba8cf807
      Daiwei Li authored
      82580 NICs have a hardware bug that makes it
      necessary to write into the TSICR (TimeSync Interrupt Cause) register
      to clear it:
      https://lore.kernel.org/all/CDCB8BE0.1EC2C%25matthew.vick@intel.com/
      
      Add a conditional so that the TSICR register is written only on 82580,
      which avoids the risk of losing events on other models.
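The conditional can be illustrated with a toy register model. It is a sketch under assumptions, not the igb code: the `model_*` names are hypothetical, other models are assumed to clear the cause bits on read, and a write is assumed to clear the bits that are set in the written value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of an interrupt-cause register. */
struct model_tsicr {
	uint32_t bits;
	bool clear_on_read;   /* false models the 82580 behaviour described above */
};

/* Read the cause bits; on well-behaved parts the read also clears them. */
static uint32_t model_read(struct model_tsicr *r)
{
	uint32_t v = r->bits;

	if (r->clear_on_read)
		r->bits = 0;
	return v;
}

/* Assumed write semantics: every bit set in v is cleared in the register. */
static void model_write(struct model_tsicr *r, uint32_t v)
{
	r->bits &= ~v;
}

/* Acknowledge the interrupt: always read, write back only on 82580. */
static uint32_t model_ack(struct model_tsicr *r, bool is_82580)
{
	uint32_t cause = model_read(r);

	if (is_82580)
		model_write(r, cause);
	return cause;
}
```

In the model, an unconditional write-back on a clear-on-read part could wipe a bit that was raised between the read and the write, which is the kind of event loss the conditional is meant to avoid.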
      
      Without this change, when running ptp4l with an Intel 82580 card,
      I get the following output:
      
      > timed out while polling for tx timestamp
      > increasing tx_timestamp_timeout or increasing kworker priority may
      > correct this issue, but a driver bug likely causes it
      
      This goes away with this change.
      
      This (partially) reverts commit ee14cc9e ("igb: Fix missing time sync events").
      
      Fixes: ee14cc9e ("igb: Fix missing time sync events")
      Closes: https://lore.kernel.org/intel-wired-lan/CAN0jFd1kO0MMtOh8N2Ztxn6f7vvDKp2h507sMryobkBKe=xk=w@mail.gmail.com/
      Tested-by: Daiwei Li <daiweili@google.com>
      Suggested-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Signed-off-by: Daiwei Li <daiweili@google.com>
      Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
      Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
  2. 27 Aug, 2024 16 commits
  3. 26 Aug, 2024 4 commits
  4. 23 Aug, 2024 6 commits
    • Bluetooth: hci_core: Fix not handling hibernation actions · 18b3256d
      Luiz Augusto von Dentz authored
      Handle hibernation actions in the suspend notifier so that they are
      treated in the same way as regular suspend actions.
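A notifier that treats hibernation like suspend could look roughly like the following userspace sketch. The enum values and function names are hypothetical stand-ins rather than the kernel's PM notifier constants or the hci_core code, and the exact set of events handled is an assumption.

```c
/* Hypothetical stand-ins for power-management notifier events. */
enum model_pm_event {
	MODEL_PM_HIBERNATION_PREPARE,
	MODEL_PM_POST_HIBERNATION,
	MODEL_PM_SUSPEND_PREPARE,
	MODEL_PM_POST_SUSPEND,
	MODEL_PM_OTHER
};

/* What the notifier decides to do with the controller. */
enum model_action {
	MODEL_ACTION_NONE,
	MODEL_ACTION_SUSPEND,
	MODEL_ACTION_RESUME
};

/* Map each event to an action; hibernation shares the suspend paths. */
static enum model_action model_suspend_notifier(enum model_pm_event ev)
{
	switch (ev) {
	case MODEL_PM_HIBERNATION_PREPARE:
	case MODEL_PM_SUSPEND_PREPARE:
		return MODEL_ACTION_SUSPEND;
	case MODEL_PM_POST_HIBERNATION:
	case MODEL_PM_POST_SUSPEND:
		return MODEL_ACTION_RESUME;
	default:
		return MODEL_ACTION_NONE;
	}
}
```

Grouping the case labels keeps the two flows on a single code path, so a later change to the suspend handling applies to hibernation as well.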
      
      Fixes: 9952d90e ("Bluetooth: Handle PM_SUSPEND_PREPARE and PM_POST_SUSPEND")
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    • Bluetooth: btnxpuart: Fix random crash seen while removing driver · 35237475
      Neeraj Sanjay Kale authored
      This fixes the random kernel crash seen while removing the driver, when
      running the load/unload test over multiple iterations.
      
      1) modprobe btnxpuart
      2) hciconfig hci0 reset
      3) hciconfig (check hci0 interface up with valid BD address)
      4) modprobe -r btnxpuart
      Repeat steps 1 to 4
      
      The ps_wakeup() call in btnxpuart_close() schedules psdata->work,
      which then runs after the module has been removed, causing a kernel crash.

      This hidden issue was exposed after Power Save was enabled by default
      in 4183a7be (Bluetooth: btnxpuart: Enable Power Save feature on
      startup).
      
      The new ps_cleanup() deasserts UART break immediately while closing
      serdev device, cancels any scheduled ps_work and destroys the ps_lock
      mutex.
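The effect of cancelling the pending work during close can be sketched with a single-threaded model of a work item. All names are hypothetical; the real driver uses the kernel workqueue API, and this model only tracks whether a handler would still run after close.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical single-threaded model of a deferred work item. */
struct model_work {
	void (*fn)(void *arg);
	void *arg;
	bool pending;
};

static int model_handler_runs;   /* counts how often the handler executed */

/* Stand-in for the power-save work handler. */
static void model_ps_work(void *arg)
{
	(void)arg;
	model_handler_runs++;
}

/* Queue the work item for later execution. */
static void model_schedule(struct model_work *w)
{
	w->pending = true;
}

/* Cancel the work item; returns true if it was still pending. */
static bool model_cancel(struct model_work *w)
{
	bool was_pending = w->pending;

	w->pending = false;
	return was_pending;
}

/* Models the worker thread picking up the item; true if the handler ran. */
static bool model_run_pending(struct model_work *w)
{
	if (!w->pending)
		return false;
	w->pending = false;
	w->fn(w->arg);
	return true;
}

/* Models close: the wakeup schedules work; with cleanup it is cancelled. */
static void model_close(struct model_work *w, bool with_cleanup)
{
	model_schedule(w);
	if (with_cleanup)
		model_cancel(w);
}
```

Without the cleanup step the handler still runs after close, which in the real driver means running code from a module that has already been unloaded.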
      
      [   85.884604] Unable to handle kernel paging request at virtual address ffffd4a61638f258
      [   85.884624] Mem abort info:
      [   85.884625]   ESR = 0x0000000086000007
      [   85.884628]   EC = 0x21: IABT (current EL), IL = 32 bits
      [   85.884633]   SET = 0, FnV = 0
      [   85.884636]   EA = 0, S1PTW = 0
      [   85.884638]   FSC = 0x07: level 3 translation fault
      [   85.884642] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000041dd0000
      [   85.884646] [ffffd4a61638f258] pgd=1000000095fff003, p4d=1000000095fff003, pud=100000004823d003, pmd=100000004823e003, pte=0000000000000000
      [   85.884662] Internal error: Oops: 0000000086000007 [#1] PREEMPT SMP
      [   85.890932] Modules linked in: algif_hash algif_skcipher af_alg overlay fsl_jr_uio caam_jr caamkeyblob_desc caamhash_desc caamalg_desc crypto_engine authenc libdes crct10dif_ce polyval_ce polyval_generic snd_soc_imx_spdif snd_soc_imx_card snd_soc_ak5558 snd_soc_ak4458 caam secvio error snd_soc_fsl_spdif snd_soc_fsl_micfil snd_soc_fsl_sai snd_soc_fsl_utils gpio_ir_recv rc_core fuse [last unloaded: btnxpuart(O)]
      [   85.927297] CPU: 1 PID: 67 Comm: kworker/1:3 Tainted: G           O       6.1.36+g937b1be4345a #1
      [   85.936176] Hardware name: FSL i.MX8MM EVK board (DT)
      [   85.936182] Workqueue: events 0xffffd4a61638f380
      [   85.936198] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      [   85.952817] pc : 0xffffd4a61638f258
      [   85.952823] lr : 0xffffd4a61638f258
      [   85.952827] sp : ffff8000084fbd70
      [   85.952829] x29: ffff8000084fbd70 x28: 0000000000000000 x27: 0000000000000000
      [   85.963112] x26: ffffd4a69133f000 x25: ffff4bf1c8540990 x24: ffff4bf215b87305
      [   85.963119] x23: ffff4bf215b87300 x22: ffff4bf1c85409d0 x21: ffff4bf1c8540970
      [   85.977382] x20: 0000000000000000 x19: ffff4bf1c8540880 x18: 0000000000000000
      [   85.977391] x17: 0000000000000000 x16: 0000000000000133 x15: 0000ffffe2217090
      [   85.977399] x14: 0000000000000001 x13: 0000000000000133 x12: 0000000000000139
      [   85.977407] x11: 0000000000000001 x10: 0000000000000a60 x9 : ffff8000084fbc50
      [   85.977417] x8 : ffff4bf215b7d000 x7 : ffff4bf215b83b40 x6 : 00000000000003e8
      [   85.977424] x5 : 00000000410fd030 x4 : 0000000000000000 x3 : 0000000000000000
      [   85.977432] x2 : 0000000000000000 x1 : ffff4bf1c4265880 x0 : 0000000000000000
      [   85.977443] Call trace:
      [   85.977446]  0xffffd4a61638f258
      [   85.977451]  0xffffd4a61638f3e8
      [   85.977455]  process_one_work+0x1d4/0x330
      [   85.977464]  worker_thread+0x6c/0x430
      [   85.977471]  kthread+0x108/0x10c
      [   85.977476]  ret_from_fork+0x10/0x20
      [   85.977488] Code: bad PC value
      [   85.977491] ---[ end trace 0000000000000000 ]---
      
      Present since v6.9.11
      Fixes: 86d55f12 ("Bluetooth: btnxpuart: Deasset UART break before closing serdev device")
      Signed-off-by: Neeraj Sanjay Kale <neeraj.sanjaykale@nxp.com>
      Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    • Bluetooth: btintel: Allow configuring drive strength of BRI · eb9e749c
      Kiran K authored
      BRI (Bluetooth Radio Interface) traffic from CNVr to CNVi was found to
      cause cross-talk step errors on WiFi. To avoid this potential issue, OEM
      platforms can replace the BRI resistor to adjust the drive strength of
      the BRI response line. During *setup*, the driver reads the drive
      strength value from a UEFI variable and passes it to the controller via
      a vendor-specific command with opcode 0xfc0a.
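For readers unfamiliar with HCI opcodes, the value 0xfc0a can be unpacked with the standard layout: the upper 6 bits are the opcode group field (OGF) and the lower 10 bits are the opcode command field (OCF). The helper names below are hypothetical and are not taken from the btintel driver.

```c
#include <stdint.h>

/* Pack a 6-bit OGF and a 10-bit OCF into a 16-bit HCI opcode. */
static uint16_t model_opcode_pack(uint8_t ogf, uint16_t ocf)
{
	return (uint16_t)(((ogf & 0x3f) << 10) | (ocf & 0x03ff));
}

/* Extract the opcode group field (upper 6 bits). */
static uint8_t model_opcode_ogf(uint16_t opcode)
{
	return (uint8_t)(opcode >> 10);
}

/* Extract the opcode command field (lower 10 bits). */
static uint16_t model_opcode_ocf(uint16_t opcode)
{
	return (uint16_t)(opcode & 0x03ff);
}
```

Opcode 0xfc0a decodes to OGF 0x3f, the group reserved for vendor-specific commands, and OCF 0x00a.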
      
      dmesg:
      
      ..
      [21.982720] Bluetooth: hci0: Bootloader timestamp 2023.33 buildtype 1 build 45995
      [21.984250] Bluetooth: hci0: Found device firmware: intel/ibt-0190-0291-iml.sfi
      [21.984255] Bluetooth: hci0: Boot Address: 0x30099000
      [21.984256] Bluetooth: hci0: Firmware Version: 160-24.24
      [22.011501] Bluetooth: hci0: Waiting for firmware download to complete
      [22.011518] Bluetooth: hci0: Firmware loaded in 26624 usecs
      [22.011584] Bluetooth: hci0: Waiting for device to boot
      [22.013546] Bluetooth: hci0: Malformed MSFT vendor event: 0x02
      [22.013552] Bluetooth: hci0: Device booted in 1967 usecs
      ...
      [22.013792] Bluetooth: hci0: dsbr: enable: 0x01 value: 0x0b
      ...
      [22.015027] Bluetooth: hci0: Found device firmware: intel/ibt-0190-0291.sfi
      [22.015041] Bluetooth: hci0: Boot Address: 0x10000800
      [22.015043] Bluetooth: hci0: Firmware Version: 160-24.24
      [22.395821] Bluetooth: BNEP (Ethernet Emulation) ver 1.3
      [22.395828] Bluetooth: BNEP filters: protocol multicast
      ...
      Signed-off-by: Kiran K <kiran.k@intel.com>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    • net: mana: Fix race of mana_hwc_post_rx_wqe and new hwc response · 8af174ea
      Haiyang Zhang authored
      mana_hwc_rx_event_handler() / mana_hwc_handle_resp() calls
      complete(&ctx->comp_event) before posting the wqe back. It is
      possible that other callers, like mana_create_txq(), start the
      next round of mana_hwc_send_request() before the wqe is posted.
      If the HW responds fast enough, it can hit a no_wqe error
      on the HW channel, and the response message is lost. The mana
      driver may then fail to create queues and open, because it times
      out waiting for the HW response.
      Sample dmesg:
      [  528.610840] mana 39d4:00:02.0: HWC: Request timed out!
      [  528.614452] mana 39d4:00:02.0: Failed to send mana message: -110, 0x0
      [  528.618326] mana 39d4:00:02.0 enP14804s2: Failed to create WQ object: -110
      
      To fix it, move posting of rx wqe before complete(&ctx->comp_event).
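The ordering problem can be reproduced in a deterministic, single-threaded model. This is a sketch with hypothetical `model_*` names, not the mana driver code: the waiter woken by the completion is assumed to send its next request immediately, and the hardware is assumed to respond before the handler continues.

```c
#include <stdbool.h>

/* Hypothetical model of the hardware channel's receive side. */
struct model_hwc {
	int rx_wqes;            /* receive WQEs currently posted */
	int delivered;          /* responses that found a WQE */
	int lost;               /* responses dropped for lack of a WQE */
	bool waiter_resends;    /* woken caller sends another request at once */
};

/* The hardware delivers a response; it needs a posted receive WQE. */
static void model_hw_respond(struct model_hwc *c)
{
	if (c->rx_wqes == 0) {
		c->lost++;      /* models the no_wqe error */
		return;
	}
	c->rx_wqes--;
	c->delivered++;
}

/* Models complete(): the waiter wakes up, sends the next request,
 * and a fast device answers before the handler continues. */
static void model_complete(struct model_hwc *c)
{
	if (c->waiter_resends) {
		c->waiter_resends = false;
		model_hw_respond(c);
	}
}

/* Give a receive WQE back to the hardware channel. */
static void model_post_rx_wqe(struct model_hwc *c)
{
	c->rx_wqes++;
}

/* Response handler with selectable ordering of the two steps. */
static void model_handle_resp(struct model_hwc *c, bool post_first)
{
	if (post_first) {
		model_post_rx_wqe(c);
		model_complete(c);
	} else {
		model_complete(c);
		model_post_rx_wqe(c);
	}
}
```

Starting with one posted WQE and one incoming response, the complete-first ordering loses the second response in this model, while the post-first ordering delivers both.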
      
      Cc: stable@vger.kernel.org
      Fixes: ca9c54d2 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
      Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
      Reviewed-by: Long Li <longli@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: drop special comment style · 82b8000c
      Johannes Berg authored
      As we discussed in the room at netdevconf earlier this week,
      drop the requirement for special comment style for netdev.
      
      For checkpatch, the general check accepts both right now, so
      simply drop the special request there as well.
      Acked-by: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Acked-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pktgen: use cpus_read_lock() in pg_net_init() · 979b581e
      Eric Dumazet authored
      I have seen the WARN_ON(smp_processor_id() != cpu) firing
      in pktgen_thread_worker() during tests.
      
      We must use cpus_read_lock()/cpus_read_unlock()
      around the for_each_online_cpu(cpu) loop.
      
      While we are at it use WARN_ON_ONCE() to avoid a possible syslog flood.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Link: https://patch.msgid.link/20240821175339.1191779-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  5. 22 Aug, 2024 12 commits