1. 12 Apr, 2023 1 commit
    • Jakub Kicinski's avatar
      Merge tag 'for-net-2023-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth · 160c1317
      Jakub Kicinski authored
      Luiz Augusto von Dentz says:
      
      ====================
      bluetooth pull request for net:
      
       - Fix not setting Dath Path for broadcast sink
       - Fix not cleaning up on LE Connection failure
       - SCO: Fix possible circular locking dependency
       - L2CAP: Fix use-after-free in l2cap_disconnect_{req,rsp}
       - Fix race condition in hidp_session_thread
       - btbcm: Fix logic error in forming the board name
       - btbcm: Fix use after free in btsdio_remove
      
      * tag 'for-net-2023-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
        Bluetooth: L2CAP: Fix use-after-free in l2cap_disconnect_{req,rsp}
        Bluetooth: Set ISO Data Path on broadcast sink
        Bluetooth: hci_conn: Fix possible UAF
        Bluetooth: SCO: Fix possible circular locking dependency sco_sock_getsockopt
        Bluetooth: SCO: Fix possible circular locking dependency on sco_connect_cfm
        bluetooth: btbcm: Fix logic error in forming the board name.
        Bluetooth: btsdio: fix use after free bug in btsdio_remove due to race condition
        Bluetooth: Fix race condition in hidp_session_thread
        Bluetooth: Fix printing errors if LE Connection times out
        Bluetooth: hci_conn: Fix not cleaning up on LE Connection failure
      ====================
      
      Link: https://lore.kernel.org/r/20230410172718.4067798-1-luiz.dentz@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      160c1317
  2. 11 Apr, 2023 1 commit
  3. 10 Apr, 2023 10 commits
    • Luiz Augusto von Dentz's avatar
      Bluetooth: L2CAP: Fix use-after-free in l2cap_disconnect_{req,rsp} · a2a9339e
      Luiz Augusto von Dentz authored
      Similar to commit d0be8347 ("Bluetooth: L2CAP: Fix use-after-free
      caused by l2cap_chan_put"), just use l2cap_chan_hold_unless_zero to
      prevent referencing a channel that is about to be destroyed.
      
      Cc: stable@kernel.org
      Signed-off-by: default avatarLuiz Augusto von Dentz <luiz.von.dentz@intel.com>
      Signed-off-by: default avatarMin Li <lm0963hack@gmail.com>
      a2a9339e
    • Claudia Draghicescu's avatar
      Bluetooth: Set ISO Data Path on broadcast sink · d2e4f1b1
      Claudia Draghicescu authored
      This patch enables ISO data rx on broadcast sink.
      
      Fixes: eca0ae4a ("Bluetooth: Add initial implementation of BIS connections")
      Signed-off-by: default avatarClaudia Draghicescu <claudia.rosu@nxp.com>
      Signed-off-by: default avatarLuiz Augusto von Dentz <luiz.von.dentz@intel.com>
      d2e4f1b1
    • Luiz Augusto von Dentz's avatar
      Bluetooth: hci_conn: Fix possible UAF · 5dc7d23e
      Luiz Augusto von Dentz authored
      This fixes the following trace:
      
      ==================================================================
      BUG: KASAN: slab-use-after-free in hci_conn_del+0xba/0x3a0
      Write of size 8 at addr ffff88800208e9c8 by task iso-tester/31
      
      CPU: 0 PID: 31 Comm: iso-tester Not tainted 6.3.0-rc2-g991aa4a69a47
       #4716
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.1-2.fc36
      04/01/2014
      Call Trace:
       <TASK>
       dump_stack_lvl+0x1d/0x70
       print_report+0xce/0x610
       ? __virt_addr_valid+0xd4/0x150
       ? hci_conn_del+0xba/0x3a0
       kasan_report+0xdd/0x110
       ? hci_conn_del+0xba/0x3a0
       hci_conn_del+0xba/0x3a0
       hci_conn_hash_flush+0xf2/0x120
       hci_dev_close_sync+0x388/0x920
       hci_unregister_dev+0x122/0x260
       vhci_release+0x4f/0x90
       __fput+0x102/0x430
       task_work_run+0xf1/0x160
       ? __pfx_task_work_run+0x10/0x10
       ? mark_held_locks+0x24/0x90
       exit_to_user_mode_prepare+0x170/0x180
       syscall_exit_to_user_mode+0x19/0x50
       do_syscall_64+0x4e/0x90
       entry_SYSCALL_64_after_hwframe+0x70/0xda
      
      Fixes: 0f00cd32 ("Bluetooth: Free potentially unfreed SCO connection")
      Link: https://syzkaller.appspot.com/bug?extid=8bb72f86fc823817bc5d
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarLuiz Augusto von Dentz <luiz.von.dentz@intel.com>
      5dc7d23e
    • Luiz Augusto von Dentz's avatar
      Bluetooth: SCO: Fix possible circular locking dependency sco_sock_getsockopt · 975abc0c
      Luiz Augusto von Dentz authored
      This attempts to fix the following trace:
      
      ======================================================
      WARNING: possible circular locking dependency detected
      6.3.0-rc2-g68fcb3a7bf97 #4706 Not tainted
      ------------------------------------------------------
      sco-tester/31 is trying to acquire lock:
      ffff8880025b8070 (&hdev->lock){+.+.}-{3:3}, at:
      sco_sock_getsockopt+0x1fc/0xa90
      
      but task is already holding lock:
      ffff888001eeb130 (sk_lock-AF_BLUETOOTH-BTPROTO_SCO){+.+.}-{0:0}, at:
      sco_sock_getsockopt+0x104/0xa90
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #2 (sk_lock-AF_BLUETOOTH-BTPROTO_SCO){+.+.}-{0:0}:
             lock_sock_nested+0x32/0x80
             sco_connect_cfm+0x118/0x4a0
             hci_sync_conn_complete_evt+0x1e6/0x3d0
             hci_event_packet+0x55c/0x7c0
             hci_rx_work+0x34c/0xa00
             process_one_work+0x575/0x910
             worker_thread+0x89/0x6f0
             kthread+0x14e/0x180
             ret_from_fork+0x2b/0x50
      
      -> #1 (hci_cb_list_lock){+.+.}-{3:3}:
             __mutex_lock+0x13b/0xcc0
             hci_sync_conn_complete_evt+0x1ad/0x3d0
             hci_event_packet+0x55c/0x7c0
             hci_rx_work+0x34c/0xa00
             process_one_work+0x575/0x910
             worker_thread+0x89/0x6f0
             kthread+0x14e/0x180
             ret_from_fork+0x2b/0x50
      
      -> #0 (&hdev->lock){+.+.}-{3:3}:
             __lock_acquire+0x18cc/0x3740
             lock_acquire+0x151/0x3a0
             __mutex_lock+0x13b/0xcc0
             sco_sock_getsockopt+0x1fc/0xa90
             __sys_getsockopt+0xe9/0x190
             __x64_sys_getsockopt+0x5b/0x70
             do_syscall_64+0x42/0x90
             entry_SYSCALL_64_after_hwframe+0x70/0xda
      
      other info that might help us debug this:
      
      Chain exists of:
        &hdev->lock --> hci_cb_list_lock --> sk_lock-AF_BLUETOOTH-BTPROTO_SCO
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(sk_lock-AF_BLUETOOTH-BTPROTO_SCO);
                                     lock(hci_cb_list_lock);
                                     lock(sk_lock-AF_BLUETOOTH-BTPROTO_SCO);
        lock(&hdev->lock);
      
       *** DEADLOCK ***
      
      1 lock held by sco-tester/31:
       #0: ffff888001eeb130 (sk_lock-AF_BLUETOOTH-BTPROTO_SCO){+.+.}-{0:0},
       at: sco_sock_getsockopt+0x104/0xa90
      
      Fixes: 248733e8 ("Bluetooth: Allow querying of supported offload codecs over SCO socket")
      Signed-off-by: default avatarLuiz Augusto von Dentz <luiz.von.dentz@intel.com>
      975abc0c
    • Luiz Augusto von Dentz's avatar
      Bluetooth: SCO: Fix possible circular locking dependency on sco_connect_cfm · 9a8ec9e8
      Luiz Augusto von Dentz authored
      This attempts to fix the following trace:
      
      ======================================================
      WARNING: possible circular locking dependency detected
      6.3.0-rc2-g0b93eeba4454 #4703 Not tainted
      ------------------------------------------------------
      kworker/u3:0/46 is trying to acquire lock:
      ffff888001fd9130 (sk_lock-AF_BLUETOOTH-BTPROTO_SCO){+.+.}-{0:0}, at:
      sco_connect_cfm+0x118/0x4a0
      
      but task is already holding lock:
      ffffffff831e3340 (hci_cb_list_lock){+.+.}-{3:3}, at:
      hci_sync_conn_complete_evt+0x1ad/0x3d0
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #2 (hci_cb_list_lock){+.+.}-{3:3}:
             __mutex_lock+0x13b/0xcc0
             hci_sync_conn_complete_evt+0x1ad/0x3d0
             hci_event_packet+0x55c/0x7c0
             hci_rx_work+0x34c/0xa00
             process_one_work+0x575/0x910
             worker_thread+0x89/0x6f0
             kthread+0x14e/0x180
             ret_from_fork+0x2b/0x50
      
      -> #1 (&hdev->lock){+.+.}-{3:3}:
             __mutex_lock+0x13b/0xcc0
             sco_sock_connect+0xfc/0x630
             __sys_connect+0x197/0x1b0
             __x64_sys_connect+0x37/0x50
             do_syscall_64+0x42/0x90
             entry_SYSCALL_64_after_hwframe+0x70/0xda
      
      -> #0 (sk_lock-AF_BLUETOOTH-BTPROTO_SCO){+.+.}-{0:0}:
             __lock_acquire+0x18cc/0x3740
             lock_acquire+0x151/0x3a0
             lock_sock_nested+0x32/0x80
             sco_connect_cfm+0x118/0x4a0
             hci_sync_conn_complete_evt+0x1e6/0x3d0
             hci_event_packet+0x55c/0x7c0
             hci_rx_work+0x34c/0xa00
             process_one_work+0x575/0x910
             worker_thread+0x89/0x6f0
             kthread+0x14e/0x180
             ret_from_fork+0x2b/0x50
      
      other info that might help us debug this:
      
      Chain exists of:
        sk_lock-AF_BLUETOOTH-BTPROTO_SCO --> &hdev->lock --> hci_cb_list_lock
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(hci_cb_list_lock);
                                     lock(&hdev->lock);
                                     lock(hci_cb_list_lock);
        lock(sk_lock-AF_BLUETOOTH-BTPROTO_SCO);
      
       *** DEADLOCK ***
      
      4 locks held by kworker/u3:0/46:
       #0: ffff8880028d1130 ((wq_completion)hci0#2){+.+.}-{0:0}, at:
       process_one_work+0x4c0/0x910
       #1: ffff8880013dfde0 ((work_completion)(&hdev->rx_work)){+.+.}-{0:0},
       at: process_one_work+0x4c0/0x910
       #2: ffff8880025d8070 (&hdev->lock){+.+.}-{3:3}, at:
       hci_sync_conn_complete_evt+0xa6/0x3d0
       #3: ffffffffb79e3340 (hci_cb_list_lock){+.+.}-{3:3}, at:
       hci_sync_conn_complete_evt+0x1ad/0x3d0
      Signed-off-by: default avatarLuiz Augusto von Dentz <luiz.von.dentz@intel.com>
      9a8ec9e8
    • Sasha Finkelstein's avatar
      bluetooth: btbcm: Fix logic error in forming the board name. · b76abe46
      Sasha Finkelstein authored
      This patch fixes an incorrect loop exit condition in code that replaces
      '/' symbols in the board name. There might also be a memory corruption
      issue here, but it is unlikely to be a real problem.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarSasha Finkelstein <fnkl.kernel@gmail.com>
      Signed-off-by: default avatarLuiz Augusto von Dentz <luiz.von.dentz@intel.com>
      b76abe46
    • Zheng Wang's avatar
      Bluetooth: btsdio: fix use after free bug in btsdio_remove due to race condition · 73f7b171
      Zheng Wang authored
      In btsdio_probe, the data->work is bound with btsdio_work. It will be
      started in btsdio_send_frame.
      
      If the btsdio_remove runs with a unfinished work, there may be a race
      condition that hdev is freed but used in btsdio_work. Fix it by
      canceling the work before do cleanup in btsdio_remove.
      Signed-off-by: default avatarZheng Wang <zyytlz.wz@163.com>
      Signed-off-by: default avatarLuiz Augusto von Dentz <luiz.von.dentz@intel.com>
      73f7b171
    • Min Li's avatar
      Bluetooth: Fix race condition in hidp_session_thread · c95930ab
      Min Li authored
      There is a potential race condition in hidp_session_thread that may
      lead to use-after-free. For instance, the timer is active while
      hidp_del_timer is called in hidp_session_thread(). After hidp_session_put,
      then 'session' will be freed, causing kernel panic when hidp_idle_timeout
      is running.
      
      The solution is to use del_timer_sync instead of del_timer.
      
      Here is the call trace:
      
      ? hidp_session_probe+0x780/0x780
      call_timer_fn+0x2d/0x1e0
      __run_timers.part.0+0x569/0x940
      hidp_session_probe+0x780/0x780
      call_timer_fn+0x1e0/0x1e0
      ktime_get+0x5c/0xf0
      lapic_next_deadline+0x2c/0x40
      clockevents_program_event+0x205/0x320
      run_timer_softirq+0xa9/0x1b0
      __do_softirq+0x1b9/0x641
      __irq_exit_rcu+0xdc/0x190
      irq_exit_rcu+0xe/0x20
      sysvec_apic_timer_interrupt+0xa1/0xc0
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarMin Li <lm0963hack@gmail.com>
      Signed-off-by: default avatarLuiz Augusto von Dentz <luiz.von.dentz@intel.com>
      c95930ab
    • Luiz Augusto von Dentz's avatar
      Bluetooth: Fix printing errors if LE Connection times out · b62e7220
      Luiz Augusto von Dentz authored
      This fixes errors like bellow when LE Connection times out since that
      is actually not a controller error:
      
       Bluetooth: hci0: Opcode 0x200d failed: -110
       Bluetooth: hci0: request failed to create LE connection: err -110
      
      Instead the code shall properly detect if -ETIMEDOUT is returned and
      send HCI_OP_LE_CREATE_CONN_CANCEL to give up on the connection.
      
      Link: https://github.com/bluez/bluez/issues/340
      Fixes: 8e8b92ee ("Bluetooth: hci_sync: Add hci_le_create_conn_sync")
      Signed-off-by: default avatarLuiz Augusto von Dentz <luiz.von.dentz@intel.com>
      b62e7220
    • Luiz Augusto von Dentz's avatar
      Bluetooth: hci_conn: Fix not cleaning up on LE Connection failure · 19cf60bf
      Luiz Augusto von Dentz authored
      hci_connect_le_scan_cleanup shall always be invoked to cleanup the
      states and re-enable passive scanning if necessary, otherwise it may
      cause the pending action to stay active causing multiple attempts to
      connect.
      
      Fixes: 9b3628d7 ("Bluetooth: hci_sync: Cleanup hci_conn if it cannot be aborted")
      Signed-off-by: default avatarLuiz Augusto von Dentz <luiz.von.dentz@intel.com>
      19cf60bf
  4. 09 Apr, 2023 3 commits
  5. 08 Apr, 2023 4 commits
    • Douglas Anderson's avatar
      r8152: Add __GFP_NOWARN to big allocations · 5cc33f13
      Douglas Anderson authored
      When memory is a little tight on my system, it's pretty easy to see
      warnings that look like this.
      
        ksoftirqd/0: page allocation failure: order:3, mode:0x40a20(GFP_ATOMIC|__GFP_COMP), nodemask=(null),cpuset=/,mems_allowed=0
        ...
        Call trace:
         dump_backtrace+0x0/0x1e8
         show_stack+0x20/0x2c
         dump_stack_lvl+0x60/0x78
         dump_stack+0x18/0x38
         warn_alloc+0x104/0x174
         __alloc_pages+0x588/0x67c
         alloc_rx_agg+0xa0/0x190 [r8152 ...]
         r8152_poll+0x270/0x760 [r8152 ...]
         __napi_poll+0x44/0x1ec
         net_rx_action+0x100/0x300
         __do_softirq+0xec/0x38c
         run_ksoftirqd+0x38/0xec
         smpboot_thread_fn+0xb8/0x248
         kthread+0x134/0x154
         ret_from_fork+0x10/0x20
      
      On a fragmented system it's normal that order 3 allocations will
      sometimes fail, especially atomic ones. The driver handles these
      failures fine and the WARN just creates spam in the logs for this
      case. The __GFP_NOWARN flag is exactly for this situation, so add it
      to the allocation.
      
      NOTE: my testing is on a 5.15 system, but there should be no reason
      that this would be fundamentally different on a mainline kernel.
      Signed-off-by: default avatarDouglas Anderson <dianders@chromium.org>
      Acked-by: default avatarHayes Wang <hayeswang@realtek.com>
      Link: https://lore.kernel.org/r/20230406171411.1.I84dbef45786af440fd269b71e9436a96a8e7a152@changeidSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      5cc33f13
    • Radu Pirea (OSS)'s avatar
      net: phy: nxp-c45-tja11xx: fix unsigned long multiplication overflow · bdaaecc1
      Radu Pirea (OSS) authored
      Any multiplication between GENMASK(31, 0) and a number bigger than 1
      will be truncated because of the overflow, if the size of unsigned long
      is 32 bits.
      
      Replaced GENMASK with GENMASK_ULL to make sure that multiplication will
      be between 64 bits values.
      
      Cc: <stable@vger.kernel.org> # 5.15+
      Fixes: 514def5d ("phy: nxp-c45-tja11xx: add timestamping support")
      Signed-off-by: default avatarRadu Pirea (OSS) <radu-nicolae.pirea@oss.nxp.com>
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/20230406095953.75622-1-radu-nicolae.pirea@oss.nxp.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      bdaaecc1
    • Felix Huettner's avatar
      net: openvswitch: fix race on port output · 066b8678
      Felix Huettner authored
      assume the following setup on a single machine:
      1. An openvswitch instance with one bridge and default flows
      2. two network namespaces "server" and "client"
      3. two ovs interfaces "server" and "client" on the bridge
      4. for each ovs interface a veth pair with a matching name and 32 rx and
         tx queues
      5. move the ends of the veth pairs to the respective network namespaces
      6. assign ip addresses to each of the veth ends in the namespaces (needs
         to be the same subnet)
      7. start some http server on the server network namespace
      8. test if a client in the client namespace can reach the http server
      
      when following the actions below the host has a chance of getting a cpu
      stuck in a infinite loop:
      1. send a large amount of parallel requests to the http server (around
         3000 curls should work)
      2. in parallel delete the network namespace (do not delete interfaces or
         stop the server, just kill the namespace)
      
      there is a low chance that this will cause the below kernel cpu stuck
      message. If this does not happen just retry.
      Below there is also the output of bpftrace for the functions mentioned
      in the output.
      
      The series of events happening here is:
      1. the network namespace is deleted calling
         `unregister_netdevice_many_notify` somewhere in the process
      2. this sets first `NETREG_UNREGISTERING` on both ends of the veth and
         then runs `synchronize_net`
      3. it then calls `call_netdevice_notifiers` with `NETDEV_UNREGISTER`
      4. this is then handled by `dp_device_event` which calls
         `ovs_netdev_detach_dev` (if a vport is found, which is the case for
         the veth interface attached to ovs)
      5. this removes the rx_handlers of the device but does not prevent
         packages to be sent to the device
      6. `dp_device_event` then queues the vport deletion to work in
         background as a ovs_lock is needed that we do not hold in the
         unregistration path
      7. `unregister_netdevice_many_notify` continues to call
         `netdev_unregister_kobject` which sets `real_num_tx_queues` to 0
      8. port deletion continues (but details are not relevant for this issue)
      9. at some future point the background task deletes the vport
      
      If after 7. but before 9. a packet is send to the ovs vport (which is
      not deleted at this point in time) which forwards it to the
      `dev_queue_xmit` flow even though the device is unregistering.
      In `skb_tx_hash` (which is called in the `dev_queue_xmit`) path there is
      a while loop (if the packet has a rx_queue recorded) that is infinite if
      `dev->real_num_tx_queues` is zero.
      
      To prevent this from happening we update `do_output` to handle devices
      without carrier the same as if the device is not found (which would
      be the code path after 9. is done).
      
      Additionally we now produce a warning in `skb_tx_hash` if we will hit
      the infinite loop.
      
      bpftrace (first word is function name):
      
      __dev_queue_xmit server: real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
      netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
      dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 2, reg_state: 1
      synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
      synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
      synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
      synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
      dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 6, reg_state: 2
      ovs_netdev_detach_dev server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, reg_state: 2
      netdev_rx_handler_unregister server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
      synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
      netdev_rx_handler_unregister ret server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
      dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 27, reg_state: 2
      dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 22, reg_state: 2
      dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 18, reg_state: 2
      netdev_unregister_kobject: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024
      synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
      ovs_vport_send server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
      __dev_queue_xmit server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
      netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
      broken device server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024
      ovs_dp_detach_port server: real_num_tx_queues: 0 cpu 9, pid: 9124, tid: 9124, reg_state: 2
      synchronize_rcu_expedited: cpu 9, pid: 33604, tid: 33604
      
      stuck message:
      
      watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [curl:1929279]
      Modules linked in: veth pktgen bridge stp llc ip_set_hash_net nft_counter xt_set nft_compat nf_tables ip_set_hash_ip ip_set nfnetlink_cttimeout nfnetlink openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tls binfmt_misc nls_iso8859_1 input_leds joydev serio_raw dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua sch_fq_codel drm efi_pstore virtio_rng ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel virtio_net ahci net_failover crypto_simd cryptd psmouse libahci virtio_blk failover
      CPU: 5 PID: 1929279 Comm: curl Not tainted 5.15.0-67-generic #74-Ubuntu
      Hardware name: OpenStack Foundation OpenStack Nova, BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
      RIP: 0010:netdev_pick_tx+0xf1/0x320
      Code: 00 00 8d 48 ff 0f b7 c1 66 39 ca 0f 86 e9 01 00 00 45 0f b7 ff 41 39 c7 0f 87 5b 01 00 00 44 29 f8 41 39 c7 0f 87 4f 01 00 00 <eb> f2 0f 1f 44 00 00 49 8b 94 24 28 04 00 00 48 85 d2 0f 84 53 01
      RSP: 0018:ffffb78b40298820 EFLAGS: 00000246
      RAX: 0000000000000000 RBX: ffff9c8773adc2e0 RCX: 000000000000083f
      RDX: 0000000000000000 RSI: ffff9c8773adc2e0 RDI: ffff9c870a25e000
      RBP: ffffb78b40298858 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff9c870a25e000
      R13: ffff9c870a25e000 R14: ffff9c87fe043480 R15: 0000000000000000
      FS:  00007f7b80008f00(0000) GS:ffff9c8e5f740000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f7b80f6a0b0 CR3: 0000000329d66000 CR4: 0000000000350ee0
      Call Trace:
       <IRQ>
       netdev_core_pick_tx+0xa4/0xb0
       __dev_queue_xmit+0xf8/0x510
       ? __bpf_prog_exit+0x1e/0x30
       dev_queue_xmit+0x10/0x20
       ovs_vport_send+0xad/0x170 [openvswitch]
       do_output+0x59/0x180 [openvswitch]
       do_execute_actions+0xa80/0xaa0 [openvswitch]
       ? kfree+0x1/0x250
       ? kfree+0x1/0x250
       ? kprobe_perf_func+0x4f/0x2b0
       ? flow_lookup.constprop.0+0x5c/0x110 [openvswitch]
       ovs_execute_actions+0x4c/0x120 [openvswitch]
       ovs_dp_process_packet+0xa1/0x200 [openvswitch]
       ? ovs_ct_update_key.isra.0+0xa8/0x120 [openvswitch]
       ? ovs_ct_fill_key+0x1d/0x30 [openvswitch]
       ? ovs_flow_key_extract+0x2db/0x350 [openvswitch]
       ovs_vport_receive+0x77/0xd0 [openvswitch]
       ? __htab_map_lookup_elem+0x4e/0x60
       ? bpf_prog_680e8aff8547aec1_kfree+0x3b/0x714
       ? trace_call_bpf+0xc8/0x150
       ? kfree+0x1/0x250
       ? kfree+0x1/0x250
       ? kprobe_perf_func+0x4f/0x2b0
       ? kprobe_perf_func+0x4f/0x2b0
       ? __mod_memcg_lruvec_state+0x63/0xe0
       netdev_port_receive+0xc4/0x180 [openvswitch]
       ? netdev_port_receive+0x180/0x180 [openvswitch]
       netdev_frame_hook+0x1f/0x40 [openvswitch]
       __netif_receive_skb_core.constprop.0+0x23d/0xf00
       __netif_receive_skb_one_core+0x3f/0xa0
       __netif_receive_skb+0x15/0x60
       process_backlog+0x9e/0x170
       __napi_poll+0x33/0x180
       net_rx_action+0x126/0x280
       ? ttwu_do_activate+0x72/0xf0
       __do_softirq+0xd9/0x2e7
       ? rcu_report_exp_cpu_mult+0x1b0/0x1b0
       do_softirq+0x7d/0xb0
       </IRQ>
       <TASK>
       __local_bh_enable_ip+0x54/0x60
       ip_finish_output2+0x191/0x460
       __ip_finish_output+0xb7/0x180
       ip_finish_output+0x2e/0xc0
       ip_output+0x78/0x100
       ? __ip_finish_output+0x180/0x180
       ip_local_out+0x5e/0x70
       __ip_queue_xmit+0x184/0x440
       ? tcp_syn_options+0x1f9/0x300
       ip_queue_xmit+0x15/0x20
       __tcp_transmit_skb+0x910/0x9c0
       ? __mod_memcg_state+0x44/0xa0
       tcp_connect+0x437/0x4e0
       ? ktime_get_with_offset+0x60/0xf0
       tcp_v4_connect+0x436/0x530
       __inet_stream_connect+0xd4/0x3a0
       ? kprobe_perf_func+0x4f/0x2b0
       ? aa_sk_perm+0x43/0x1c0
       inet_stream_connect+0x3b/0x60
       __sys_connect_file+0x63/0x70
       __sys_connect+0xa6/0xd0
       ? setfl+0x108/0x170
       ? do_fcntl+0xe8/0x5a0
       __x64_sys_connect+0x18/0x20
       do_syscall_64+0x5c/0xc0
       ? __x64_sys_fcntl+0xa9/0xd0
       ? exit_to_user_mode_prepare+0x37/0xb0
       ? syscall_exit_to_user_mode+0x27/0x50
       ? do_syscall_64+0x69/0xc0
       ? __sys_setsockopt+0xea/0x1e0
       ? exit_to_user_mode_prepare+0x37/0xb0
       ? syscall_exit_to_user_mode+0x27/0x50
       ? __x64_sys_setsockopt+0x1f/0x30
       ? do_syscall_64+0x69/0xc0
       ? irqentry_exit+0x1d/0x30
       ? exc_page_fault+0x89/0x170
       entry_SYSCALL_64_after_hwframe+0x61/0xcb
      RIP: 0033:0x7f7b8101c6a7
      Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 18 89 54 24 0c 48 89 34 24 89
      RSP: 002b:00007ffffd6b2198 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7b8101c6a7
      RDX: 0000000000000010 RSI: 00007ffffd6b2360 RDI: 0000000000000005
      RBP: 0000561f1370d560 R08: 00002795ad21d1ac R09: 0030312e302e302e
      R10: 00007ffffd73f080 R11: 0000000000000246 R12: 0000561f1370c410
      R13: 0000000000000000 R14: 0000000000000005 R15: 0000000000000000
       </TASK>
      
      Fixes: 7f8a436e ("openvswitch: Add conntrack action")
      Co-developed-by: default avatarLuca Czesla <luca.czesla@mail.schwarz>
      Signed-off-by: default avatarLuca Czesla <luca.czesla@mail.schwarz>
      Signed-off-by: default avatarFelix Huettner <felix.huettner@mail.schwarz>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
      Link: https://lore.kernel.org/r/ZC0pBXBAgh7c76CA@kernel-bug-kernel-bugSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      066b8678
    • Jakub Kicinski's avatar
      Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 029294d0
      Jakub Kicinski authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2023-04-08
      
      We've added 4 non-merge commits during the last 11 day(s) which contain
      a total of 5 files changed, 39 insertions(+), 6 deletions(-).
      
      The main changes are:
      
      1) Fix BPF TCP socket iterator to use correct helper for dropping
         socket's refcount, that is, sock_gen_put instead of sock_put,
         from Martin KaFai Lau.
      
      2) Fix a BTI exception splat in BPF trampoline-generated code on arm64,
         from Xu Kuohai.
      
      3) Fix a LongArch JIT error from missing BPF_NOSPEC no-op, from George Guo.
      
      4) Fix dynamic XDP feature detection of veth in xdp_redirect selftest,
         from Lorenzo Bianconi.
      
      * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
        selftests/bpf: fix xdp_redirect xdp-features selftest for veth driver
        bpf, arm64: Fixed a BTI error on returning to patched function
        LoongArch, bpf: Fix jit to skip speculation barrier opcode
        bpf: tcp: Use sock_gen_put instead of sock_put in bpf_iter_tcp
      ====================
      
      Link: https://lore.kernel.org/r/20230407224642.30906-1-daniel@iogearbox.netSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      029294d0
  6. 07 Apr, 2023 6 commits
    • David S. Miller's avatar
      Merge branch 'bonding-ns-validation-fixes' · b9881d9a
      David S. Miller authored
      Hangbin Liu says:
      
      ====================
      bonding: fix ns validation on backup slaves
      
      The first patch fixed a ns validation issue on backup slaves. The second
      patch re-format the bond option test and add a test lib file. The third
      patch add the arp validate regression test for the kernel patch.
      
      Here is the new bonding option test without the kernel fix:
      
      ]# ./bond_options.sh
      TEST: prio (active-backup miimon primary_reselect 0)           [ OK ]
      TEST: prio (active-backup miimon primary_reselect 1)           [ OK ]
      TEST: prio (active-backup miimon primary_reselect 2)           [ OK ]
      TEST: prio (active-backup arp_ip_target primary_reselect 0)    [ OK ]
      TEST: prio (active-backup arp_ip_target primary_reselect 1)    [ OK ]
      TEST: prio (active-backup arp_ip_target primary_reselect 2)    [ OK ]
      TEST: prio (active-backup ns_ip6_target primary_reselect 0)    [ OK ]
      TEST: prio (active-backup ns_ip6_target primary_reselect 1)    [ OK ]
      TEST: prio (active-backup ns_ip6_target primary_reselect 2)    [ OK ]
      TEST: prio (balance-tlb miimon primary_reselect 0)             [ OK ]
      TEST: prio (balance-tlb miimon primary_reselect 1)             [ OK ]
      TEST: prio (balance-tlb miimon primary_reselect 2)             [ OK ]
      TEST: prio (balance-tlb arp_ip_target primary_reselect 0)      [ OK ]
      TEST: prio (balance-tlb arp_ip_target primary_reselect 1)      [ OK ]
      TEST: prio (balance-tlb arp_ip_target primary_reselect 2)      [ OK ]
      TEST: prio (balance-tlb ns_ip6_target primary_reselect 0)      [ OK ]
      TEST: prio (balance-tlb ns_ip6_target primary_reselect 1)      [ OK ]
      TEST: prio (balance-tlb ns_ip6_target primary_reselect 2)      [ OK ]
      TEST: prio (balance-alb miimon primary_reselect 0)             [ OK ]
      TEST: prio (balance-alb miimon primary_reselect 1)             [ OK ]
      TEST: prio (balance-alb miimon primary_reselect 2)             [ OK ]
      TEST: prio (balance-alb arp_ip_target primary_reselect 0)      [ OK ]
      TEST: prio (balance-alb arp_ip_target primary_reselect 1)      [ OK ]
      TEST: prio (balance-alb arp_ip_target primary_reselect 2)      [ OK ]
      TEST: prio (balance-alb ns_ip6_target primary_reselect 0)      [ OK ]
      TEST: prio (balance-alb ns_ip6_target primary_reselect 1)      [ OK ]
      TEST: prio (balance-alb ns_ip6_target primary_reselect 2)      [ OK ]
      TEST: arp_validate (active-backup arp_ip_target arp_validate 0)  [ OK ]
      TEST: arp_validate (active-backup arp_ip_target arp_validate 1)  [ OK ]
      TEST: arp_validate (active-backup arp_ip_target arp_validate 2)  [ OK ]
      TEST: arp_validate (active-backup arp_ip_target arp_validate 3)  [ OK ]
      TEST: arp_validate (active-backup arp_ip_target arp_validate 4)  [ OK ]
      TEST: arp_validate (active-backup arp_ip_target arp_validate 5)  [ OK ]
      TEST: arp_validate (active-backup arp_ip_target arp_validate 6)  [ OK ]
      TEST: arp_validate (active-backup ns_ip6_target arp_validate 0)  [ OK ]
      TEST: arp_validate (active-backup ns_ip6_target arp_validate 1)  [ OK ]
      TEST: arp_validate (interface eth1 mii_status DOWN)                 [FAIL]
      TEST: arp_validate (interface eth2 mii_status DOWN)                 [FAIL]
      TEST: arp_validate (active-backup ns_ip6_target arp_validate 2)  [FAIL]
      TEST: arp_validate (interface eth1 mii_status DOWN)                 [FAIL]
      TEST: arp_validate (interface eth2 mii_status DOWN)                 [FAIL]
      TEST: arp_validate (active-backup ns_ip6_target arp_validate 3)  [FAIL]
      TEST: arp_validate (active-backup ns_ip6_target arp_validate 4)  [ OK ]
      TEST: arp_validate (active-backup ns_ip6_target arp_validate 5)  [ OK ]
      TEST: arp_validate (interface eth1 mii_status DOWN)                 [FAIL]
      TEST: arp_validate (interface eth2 mii_status DOWN)                 [FAIL]
      TEST: arp_validate (active-backup ns_ip6_target arp_validate 6)  [FAIL]
      
      Here is the test result after the kernel fix:
      TEST: arp_validate (active-backup arp_ip_target arp_validate 0)  [ OK ]
      TEST: arp_validate (active-backup arp_ip_target arp_validate 1)  [ OK ]
      TEST: arp_validate (active-backup arp_ip_target arp_validate 2)  [ OK ]
      TEST: arp_validate (active-backup arp_ip_target arp_validate 3)  [ OK ]
      TEST: arp_validate (active-backup arp_ip_target arp_validate 4)  [ OK ]
      TEST: arp_validate (active-backup arp_ip_target arp_validate 5)  [ OK ]
      TEST: arp_validate (active-backup arp_ip_target arp_validate 6)  [ OK ]
      TEST: arp_validate (active-backup ns_ip6_target arp_validate 0)  [ OK ]
      TEST: arp_validate (active-backup ns_ip6_target arp_validate 1)  [ OK ]
      TEST: arp_validate (active-backup ns_ip6_target arp_validate 2)  [ OK ]
      TEST: arp_validate (active-backup ns_ip6_target arp_validate 3)  [ OK ]
      TEST: arp_validate (active-backup ns_ip6_target arp_validate 4)  [ OK ]
      TEST: arp_validate (active-backup ns_ip6_target arp_validate 5)  [ OK ]
      TEST: arp_validate (active-backup ns_ip6_target arp_validate 6)  [ OK ]
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b9881d9a
    • Hangbin Liu's avatar
      selftests: bonding: add arp validate test · 2e825f8a
      Hangbin Liu authored
      This patch add bonding arp validate tests with mode active backup,
      monitor arp_ip_target and ns_ip6_target. It also checks mii_status
      to make sure all slaves are UP.
      Acked-by: default avatarJonathan Toppins <jtoppins@redhat.com>
      Acked-by: default avatarJay Vosburgh <jay.vosburgh@canonical.com>
      Signed-off-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2e825f8a
    • Hangbin Liu's avatar
      selftests: bonding: re-format bond option tests · 481b56e0
      Hangbin Liu authored
      To improve the testing process for bond options, A new bond topology lib
      is added to our testing setup. The current option_prio.sh file will be
      renamed to bond_options.sh so that all bonding options can be tested here.
      Specifically, for priority testing, we will run all tests using modes
      1, 5, and 6. These changes will help us streamline the testing process
      and ensure that our bond options are rigorously evaluated.
      Acked-by: default avatarJay Vosburgh <jay.vosburgh@canonical.com>
      Signed-off-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Acked-by: default avatarJonathan Toppins <jtoppins@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      481b56e0
    • Hangbin Liu's avatar
      bonding: fix ns validation on backup slaves · 4598380f
      Hangbin Liu authored
      When arp_validate is set to 2, 3, or 6, validation is performed for
      backup slaves as well. As stated in the bond documentation, validation
      involves checking the broadcast ARP request sent out via the active
      slave. This helps determine which slaves are more likely to function in
      the event of an active slave failure.
      
      However, when the target is an IPv6 address, the NS message sent from
      the active interface is not checked on backup slaves. Additionally,
      based on the bond_arp_rcv() rule b, we must reverse the saddr and daddr
      when checking the NS message.
      
      Note that when checking the NS message, the destination address is a
      multicast address. Therefore, we must convert the target address to
      solicited multicast in the bond_get_targets_ip6() function.
      
      Prior to the fix, the backup slaves had a mii status of "down", but
      after the fix, all of the slaves' mii status was updated to "UP".
      
      Fixes: 4e24be01 ("bonding: add new parameter ns_targets")
      Reviewed-by: default avatarJonathan Toppins <jtoppins@redhat.com>
      Acked-by: default avatarJay Vosburgh <jay.vosburgh@canonical.com>
      Signed-off-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4598380f
    • YueHaibing's avatar
      tcp: restrict net.ipv4.tcp_app_win · dc5110c2
      YueHaibing authored
      UBSAN: shift-out-of-bounds in net/ipv4/tcp_input.c:555:23
      shift exponent 255 is too large for 32-bit type 'int'
      CPU: 1 PID: 7907 Comm: ssh Not tainted 6.3.0-rc4-00161-g62bad54b-dirty #206
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
      Call Trace:
       <TASK>
       dump_stack_lvl+0x136/0x150
       __ubsan_handle_shift_out_of_bounds+0x21f/0x5a0
       tcp_init_transfer.cold+0x3a/0xb9
       tcp_finish_connect+0x1d0/0x620
       tcp_rcv_state_process+0xd78/0x4d60
       tcp_v4_do_rcv+0x33d/0x9d0
       __release_sock+0x133/0x3b0
       release_sock+0x58/0x1b0
      
      'maxwin' is int, shifting int for 32 or more bits is undefined behaviour.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarYueHaibing <yuehaibing@huawei.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Reviewed-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      dc5110c2
    • Harshit Mogalapalli's avatar
      niu: Fix missing unwind goto in niu_alloc_channels() · 8ce07be7
      Harshit Mogalapalli authored
      Smatch reports: drivers/net/ethernet/sun/niu.c:4525
      	niu_alloc_channels() warn: missing unwind goto?
      
      If niu_rbr_fill() fails, then we are directly returning 'err' without
      freeing the channels.
      
      Fix this by changing direct return to a goto 'out_err'.
      
      Fixes: a3138df9 ("[NIU]: Add Sun Neptune ethernet driver.")
      Signed-off-by: default avatarHarshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8ce07be7
  7. 06 Apr, 2023 15 commits