1. 07 Feb, 2020 6 commits
    • taprio: Fix enabling offload with wrong number of traffic classes · 5652e63d
      Vinicius Costa Gomes authored
      If the driver implementing taprio offloading depends on the
      network device's number of traffic classes (dev->num_tc) for
      whatever reason, it would receive the value zero, because the value
      was only set after the offloading function was called.
      
      Setting the number of traffic classes before the offloading function
      is called fixes this issue. This is safe because it only happens
      when taprio is instantiated (we don't allow this configuration to be
      changed without first removing taprio).
      
      Fixes: 9c66d156 ("taprio: Add support for hardware offloading")
      Reported-by: Po Liu <po.liu@nxp.com>
      Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
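
The ordering problem described in 5652e63d can be illustrated with a small userspace sketch. This is not the taprio code: `fake_netdev`, `fake_offload_enable`, `setup_buggy` and `setup_fixed` are hypothetical names, and only the `num_tc` field is modelled.

```c
#include <assert.h>

/* Hypothetical stand-in for a network device; only num_tc is modelled. */
struct fake_netdev {
    int num_tc;
};

/* Hypothetical driver offload hook: records the num_tc value it observes. */
static int fake_offload_enable(const struct fake_netdev *dev, int *seen_num_tc)
{
    *seen_num_tc = dev->num_tc;
    return 0;
}

/* Buggy order: the offload hook runs first, so it observes num_tc == 0. */
static int setup_buggy(struct fake_netdev *dev, int num_tc, int *seen)
{
    int err = fake_offload_enable(dev, seen);

    if (err)
        return err;
    dev->num_tc = num_tc;
    return 0;
}

/* Fixed order: num_tc is set before the offload hook is called. */
static int setup_fixed(struct fake_netdev *dev, int num_tc, int *seen)
{
    assert(num_tc > 0);
    dev->num_tc = num_tc;
    return fake_offload_enable(dev, seen);
}
```

With a zero-initialised device and three traffic classes, the hook in `setup_buggy` sees 0 while the hook in `setup_fixed` sees 3.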
    • net: dsa: bcm_sf2: Only 7278 supports 2Gb/sec IMP port · de34d708
      Florian Fainelli authored
      The 7445 switch clocking profiles do not allow us to run the IMP port at
      2Gb/sec reliably and consistently. Make sure that the setting is only
      applied to the 7278 family.
      
      Fixes: 8f1880cb ("net: dsa: bcm_sf2: Configure IMP port for 2Gb/sec")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: b53: Always use dev->vlan_enabled in b53_configure_vlan() · df373702
      Florian Fainelli authored
      b53_configure_vlan() is called by the bcm_sf2 driver upon setup and
      indirectly through resume as well. During the initial setup, we are
      guaranteed that dev->vlan_enabled is false, so there is no change in
      behavior. On resume, however, VLANs may have been enabled before the
      suspend, so we do want to restore that setting.
      
      Fixes: dad8d7c6 ("net: dsa: b53: Properly account for VLAN filtering")
      Fixes: 967dd82f ("net: dsa: b53: Add support for Broadcom RoboSwitch")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: fix a possible endless loop · 7d10f077
      Dejin Zheng authored
      The while loop in the ethqos_configure() function never decrements
      the retry variable, so the loop may never terminate and the timeout
      never takes effect.
      
      Fixes: a7c30e62 ("net: stmmac: Add driver for Qualcomm ethqos")
      Signed-off-by: Dejin Zheng <zhengdejin5@gmail.com>
      Acked-by: Vinod Koul <vkoul@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
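
The retry-loop problem in 7d10f077 can be sketched in plain C as follows. This is a minimal model, not the ethqos driver: `fake_dll_locked` and `wait_for_lock` are hypothetical names, and the hardware status read is replaced by a simple countdown.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical status poll: reports "locked" once *polls_left reaches zero. */
static bool fake_dll_locked(int *polls_left)
{
    if (*polls_left > 0) {
        (*polls_left)--;
        return false;
    }
    return true;
}

/*
 * Bounded polling loop. Returns 0 once the lock is observed, or -1 when
 * the retry budget is exhausted. Without the retry-- statement the loop
 * condition could never become false.
 */
static int wait_for_lock(int polls_until_lock, int max_retries)
{
    int retry = max_retries;

    assert(max_retries > 0);
    do {
        if (fake_dll_locked(&polls_until_lock))
            return 0;
        retry--; /* the decrement that bounds the loop */
    } while (retry > 0);

    return -1;
}
```

A lock that appears after three polls succeeds within a budget of ten retries, while one that would need a hundred polls times out.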
    • rxrpc: Fix call RCU cleanup using non-bh-safe locks · 963485d4
      David Howells authored
      rxrpc_rcu_destroy_call(), which is called as an RCU callback to clean up a
      put call, calls rxrpc_put_connection() which, deep in its bowels, takes a
      number of spinlocks in a non-BH-safe way, including rxrpc_conn_id_lock and
      local->client_conns_lock.  RCU callbacks, however, are normally called from
      softirq context, which can cause lockdep to notice the locking
      inconsistency.
      
      To get lockdep to detect this, it's necessary to have the connection
      cleaned up on the put at the end of the last of its calls, though normally
      the clean up is deferred.  This can be induced, however, by starting a call
      on an AF_RXRPC socket and then closing the socket without reading the
      reply.
      
      Fix this by having rxrpc_rcu_destroy_call() punt the destruction to a
      workqueue if in softirq-mode and defer the destruction to process context.
      
      Note that another way to fix this could be to add a bunch of bh-disable
      annotations to the spinlocks concerned - and there might be more than just
      those two - but that means spending more time with BHs disabled.
      
      Note also that some of these places were covered by bh-disable spinlocks
      belonging to the rxrpc_transport object, but these got removed without the
      _bh annotation being retained on the next lock in.
      
      Fixes: 999b69f8 ("rxrpc: Kill the client connection bundle concept")
      Reported-by: syzbot+d82f3ac8d87e7ccbb2c9@syzkaller.appspotmail.com
      Reported-by: syzbot+3f1fd6b8cbf8702d134e@syzkaller.appspotmail.com
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Hillf Danton <hdanton@sina.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rxrpc: Fix service call disconnection · b39a934e
      David Howells authored
      The recent patch that substituted a flag on an rxrpc_call for the
      connection pointer being NULL as an indication that a call was disconnected
      puts the set_bit in the wrong place for service calls.  This is only a
      problem if a call is implicitly terminated by a new call coming in on the
      same connection channel instead of a terminating ACK packet.
      
      In such a case, rxrpc_input_implicit_end_call() calls
      __rxrpc_disconnect_call(), which is now (incorrectly) setting the
      disconnection bit, meaning that when rxrpc_release_call() is later called,
      it doesn't call rxrpc_disconnect_call() and so the call isn't removed from
      the peer's error distribution list and the list gets corrupted.
      
      KASAN finds the issue as an access after release on a call, but the
      position at which it occurs is confusing as it appears to be related to a
      different call (the call site is where the latter call is being removed
      from the error distribution list and either the next or pprev pointer
      points to a previously released call).
      
      Fix this by moving the setting of the flag from __rxrpc_disconnect_call()
      to rxrpc_disconnect_call() in the same place that the connection pointer
      was being cleared.
      
      Fixes: 5273a191 ("rxrpc: Fix NULL pointer deref due to call->conn being cleared on disconnect")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 06 Feb, 2020 13 commits
    • Merge tag 'mlx5-fixes-2020-02-06' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · f798a5a0
      David S. Miller authored
      Saeed Mahameed says:
      
      ====================
      Mellanox, mlx5 fixes 2020-02-06
      
      This series introduces some fixes to mlx5 driver.
      
      Please pull and let me know if there is any problem.
      
      For -stable v4.19:
       ('net/mlx5: IPsec, Fix esp modify function attribute')
       ('net/mlx5: IPsec, fix memory leak at mlx5_fpga_ipsec_delete_sa_ctx')
      
      For -stable v5.4:
         ('net/mlx5: Deprecate usage of generic TLS HW capability bit')
         ('net/mlx5: Fix deadlock in fs_core')
      
      For -stable v5.5:
         ('net/mlx5e: TX, Error completion is for last WQE in batch')
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5: Deprecate usage of generic TLS HW capability bit · 61c00cca
      Tariq Toukan authored
      Deprecate the generic TLS cap bit, use the new TX-specific
      TLS cap bit instead.
      
      Fixes: a12ff35e ("net/mlx5: Introduce TLS TX offload hardware bits and structures")
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: TX, Error completion is for last WQE in batch · b57e66ad
      Tariq Toukan authored
      For a cyclic work queue, when not requesting a completion per WQE,
      a single CQE might indicate the completion of several WQEs.
      However, in case some WQE in the batch causes an error, then an error
      completion is issued, breaking the batch, and pointing to the offending
      WQE in the wqe_counter field.
      
      Hence, WQE-specific error CQE handling (like printing, breaking, etc...)
      should be performed only for the last WQE in batch.
      
      Fixes: 130c7b46 ("net/mlx5e: TX, Dump WQs wqe descriptors on CQE with error events")
      Fixes: fd9b4be8 ("net/mlx5e: RX, Support multiple outstanding UMR posts")
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Reviewed-by: Aya Levin <ayal@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: IPsec, fix memory leak at mlx5_fpga_ipsec_delete_sa_ctx · 08db2cf5
      Raed Salem authored
      SA context is allocated at mlx5_fpga_ipsec_create_sa_ctx,
      however the counterpart mlx5_fpga_ipsec_delete_sa_ctx function
      nullifies sa_ctx pointer without freeing the memory allocated,
      hence the memory leak.
      
      Fix by freeing the SA context when the SA is released.
      
      Fixes: d6c4f029 ("net/mlx5: Refactor accel IPSec code")
      Signed-off-by: Raed Salem <raeds@mellanox.com>
      Reviewed-by: Boris Pismenny <borisp@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: IPsec, Fix esp modify function attribute · 0dc2c534
      Raed Salem authored
      The return value of mlx5_fpga_esp_validate_xfrm_attrs is wrongly
      negated: a zero return value indicates success, but the negation
      causes it to be treated as a failure instead.
      
      Fix by removing the unary NOT operator.
      
      Fixes: 05564d0a ("net/mlx5: Add flow-steering commands for FPGA IPSec implementation")
      Signed-off-by: Raed Salem <raeds@mellanox.com>
      Reviewed-by: Boris Pismenny <borisp@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
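
The inverted check described in 0dc2c534 can be reproduced in a few lines of C. This is a sketch under assumptions: `fake_validate_attrs`, `modify_buggy` and `modify_fixed` are hypothetical names, and the accepted key lengths are made up for illustration. Only the convention "0 means success, negative errno means failure" is taken from the commit message.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical validator: returns 0 on success, negative errno on failure. */
static int fake_validate_attrs(int key_len)
{
    assert(key_len >= 0);
    return (key_len == 16 || key_len == 32) ? 0 : -EOPNOTSUPP;
}

/* Buggy caller: the unary NOT turns "0 == success" into a failure path. */
static int modify_buggy(int key_len)
{
    if (!fake_validate_attrs(key_len))
        return -EOPNOTSUPP;
    return 0;
}

/* Fixed caller: any non-zero return from the validator is an error. */
static int modify_fixed(int key_len)
{
    if (fake_validate_attrs(key_len))
        return -EOPNOTSUPP;
    return 0;
}
```

In this model the buggy caller rejects a valid key length and accepts an invalid one, which is exactly the inversion the fix removes.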
    • net/mlx5: Fix deadlock in fs_core · c1948390
      Maor Gottlieb authored
      free_match_list could be called when the flow table is already
      locked. We need to pass this notation to tree_put_node.
      
      It fixes the following lockdep warning:
      
      [ 1797.268537] ============================================
      [ 1797.276837] WARNING: possible recursive locking detected
      [ 1797.285101] 5.5.0-rc5+ #10 Not tainted
      [ 1797.291641] --------------------------------------------
      [ 1797.299917] handler10/9296 is trying to acquire lock:
      [ 1797.307885] ffff889ad399a0a0 (&node->lock){++++}, at:
      tree_put_node+0x1d5/0x210 [mlx5_core]
      [ 1797.319694]
      [ 1797.319694] but task is already holding lock:
      [ 1797.330904] ffff889ad399a0a0 (&node->lock){++++}, at:
      nested_down_write_ref_node.part.33+0x1a/0x60 [mlx5_core]
      [ 1797.344707]
      [ 1797.344707] other info that might help us debug this:
      [ 1797.356952]  Possible unsafe locking scenario:
      [ 1797.356952]
      [ 1797.368333]        CPU0
      [ 1797.373357]        ----
      [ 1797.378364]   lock(&node->lock);
      [ 1797.384222]   lock(&node->lock);
      [ 1797.390031]
      [ 1797.390031]  *** DEADLOCK ***
      [ 1797.390031]
      [ 1797.403003]  May be due to missing lock nesting notation
      [ 1797.403003]
      [ 1797.414691] 3 locks held by handler10/9296:
      [ 1797.421465]  #0: ffff889cf2c5a110 (&block->cb_lock){++++}, at:
      tc_setup_cb_add+0x70/0x250
      [ 1797.432810]  #1: ffff88a030081490 (&comp->sem){++++}, at:
      mlx5_devcom_get_peer_data+0x4c/0xb0 [mlx5_core]
      [ 1797.445829]  #2: ffff889ad399a0a0 (&node->lock){++++}, at:
      nested_down_write_ref_node.part.33+0x1a/0x60 [mlx5_core]
      [ 1797.459913]
      [ 1797.459913] stack backtrace:
      [ 1797.469436] CPU: 1 PID: 9296 Comm: handler10 Kdump: loaded Not
      tainted 5.5.0-rc5+ #10
      [ 1797.480643] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS
      2.4.3 01/17/2017
      [ 1797.491480] Call Trace:
      [ 1797.496701]  dump_stack+0x96/0xe0
      [ 1797.502864]  __lock_acquire.cold.63+0xf8/0x212
      [ 1797.510301]  ? lockdep_hardirqs_on+0x250/0x250
      [ 1797.517701]  ? mark_held_locks+0x55/0xa0
      [ 1797.524547]  ? quarantine_put+0xb7/0x160
      [ 1797.531422]  ? lockdep_hardirqs_on+0x17d/0x250
      [ 1797.538913]  lock_acquire+0xd6/0x1f0
      [ 1797.545529]  ? tree_put_node+0x1d5/0x210 [mlx5_core]
      [ 1797.553701]  down_write+0x94/0x140
      [ 1797.560206]  ? tree_put_node+0x1d5/0x210 [mlx5_core]
      [ 1797.568464]  ? down_write_killable_nested+0x170/0x170
      [ 1797.576925]  ? del_hw_flow_group+0xde/0x1f0 [mlx5_core]
      [ 1797.585629]  tree_put_node+0x1d5/0x210 [mlx5_core]
      [ 1797.593891]  ? free_match_list.part.25+0x147/0x170 [mlx5_core]
      [ 1797.603389]  free_match_list.part.25+0xe0/0x170 [mlx5_core]
      [ 1797.612654]  _mlx5_add_flow_rules+0x17e2/0x20b0 [mlx5_core]
      [ 1797.621838]  ? lock_acquire+0xd6/0x1f0
      [ 1797.629028]  ? esw_get_prio_table+0xb0/0x3e0 [mlx5_core]
      [ 1797.637981]  ? alloc_insert_flow_group+0x420/0x420 [mlx5_core]
      [ 1797.647459]  ? try_to_wake_up+0x4c7/0xc70
      [ 1797.654881]  ? lock_downgrade+0x350/0x350
      [ 1797.662271]  ? __mutex_unlock_slowpath+0xb1/0x3f0
      [ 1797.670396]  ? find_held_lock+0xac/0xd0
      [ 1797.677540]  ? mlx5_add_flow_rules+0xdc/0x360 [mlx5_core]
      [ 1797.686467]  mlx5_add_flow_rules+0xdc/0x360 [mlx5_core]
      [ 1797.695134]  ? _mlx5_add_flow_rules+0x20b0/0x20b0 [mlx5_core]
      [ 1797.704270]  ? irq_exit+0xa5/0x170
      [ 1797.710764]  ? retint_kernel+0x10/0x10
      [ 1797.717698]  ? mlx5_eswitch_set_rule_source_port.isra.9+0x122/0x230
      [mlx5_core]
      [ 1797.728708]  mlx5_eswitch_add_offloaded_rule+0x465/0x6d0 [mlx5_core]
      [ 1797.738713]  ? mlx5_eswitch_get_prio_range+0x30/0x30 [mlx5_core]
      [ 1797.748384]  ? mlx5_fc_stats_work+0x670/0x670 [mlx5_core]
      [ 1797.757400]  mlx5e_tc_offload_fdb_rules.isra.27+0x24/0x90 [mlx5_core]
      [ 1797.767665]  mlx5e_tc_add_fdb_flow+0xaf8/0xd40 [mlx5_core]
      [ 1797.776886]  ? mlx5e_encap_put+0xd0/0xd0 [mlx5_core]
      [ 1797.785562]  ? mlx5e_alloc_flow.isra.43+0x18c/0x1c0 [mlx5_core]
      [ 1797.795353]  __mlx5e_add_fdb_flow+0x2e2/0x440 [mlx5_core]
      [ 1797.804558]  ? mlx5e_tc_update_neigh_used_value+0x8c0/0x8c0
      [mlx5_core]
      [ 1797.815093]  ? wait_for_completion+0x260/0x260
      [ 1797.823272]  mlx5e_configure_flower+0xe94/0x1620 [mlx5_core]
      [ 1797.832792]  ? __mlx5e_add_fdb_flow+0x440/0x440 [mlx5_core]
      [ 1797.842096]  ? down_read+0x11a/0x2e0
      [ 1797.849090]  ? down_write+0x140/0x140
      [ 1797.856142]  ? mlx5e_rep_indr_setup_block_cb+0xc0/0xc0 [mlx5_core]
      [ 1797.866027]  tc_setup_cb_add+0x11a/0x250
      [ 1797.873339]  fl_hw_replace_filter+0x25e/0x320 [cls_flower]
      [ 1797.882385]  ? fl_hw_destroy_filter+0x1c0/0x1c0 [cls_flower]
      [ 1797.891607]  fl_change+0x1d54/0x1fb6 [cls_flower]
      [ 1797.899772]  ? __rhashtable_insert_fast.constprop.50+0x9f0/0x9f0
      [cls_flower]
      [ 1797.910728]  ? lock_downgrade+0x350/0x350
      [ 1797.918187]  ? __radix_tree_lookup+0xa5/0x130
      [ 1797.926046]  ? fl_set_key+0x1590/0x1590 [cls_flower]
      [ 1797.934611]  ? __rhashtable_insert_fast.constprop.50+0x9f0/0x9f0
      [cls_flower]
      [ 1797.945673]  tc_new_tfilter+0xcd1/0x1240
      [ 1797.953138]  ? tc_del_tfilter+0xb10/0xb10
      [ 1797.960688]  ? avc_has_perm_noaudit+0x92/0x320
      [ 1797.968721]  ? avc_has_perm_noaudit+0x1df/0x320
      [ 1797.976816]  ? avc_has_extended_perms+0x990/0x990
      [ 1797.985090]  ? mark_lock+0xaa/0x9e0
      [ 1797.991988]  ? match_held_lock+0x1b/0x240
      [ 1797.999457]  ? match_held_lock+0x1b/0x240
      [ 1798.006859]  ? find_held_lock+0xac/0xd0
      [ 1798.014045]  ? symbol_put_addr+0x40/0x40
      [ 1798.021317]  ? rcu_read_lock_sched_held+0xd0/0xd0
      [ 1798.029460]  ? tc_del_tfilter+0xb10/0xb10
      [ 1798.036810]  rtnetlink_rcv_msg+0x4d5/0x620
      [ 1798.044236]  ? rtnl_bridge_getlink+0x460/0x460
      [ 1798.052034]  ? lockdep_hardirqs_on+0x250/0x250
      [ 1798.059837]  ? match_held_lock+0x1b/0x240
      [ 1798.067146]  ? find_held_lock+0xac/0xd0
      [ 1798.074246]  netlink_rcv_skb+0xc6/0x1f0
      [ 1798.081339]  ? rtnl_bridge_getlink+0x460/0x460
      [ 1798.089104]  ? netlink_ack+0x440/0x440
      [ 1798.096061]  netlink_unicast+0x2d4/0x3b0
      [ 1798.103189]  ? netlink_attachskb+0x3f0/0x3f0
      [ 1798.110724]  ? _copy_from_iter_full+0xda/0x370
      [ 1798.118415]  netlink_sendmsg+0x3ba/0x6a0
      [ 1798.125478]  ? netlink_unicast+0x3b0/0x3b0
      [ 1798.132705]  ? netlink_unicast+0x3b0/0x3b0
      [ 1798.139880]  sock_sendmsg+0x94/0xa0
      [ 1798.146332]  ____sys_sendmsg+0x36c/0x3f0
      [ 1798.153251]  ? copy_msghdr_from_user+0x165/0x230
      [ 1798.160941]  ? kernel_sendmsg+0x30/0x30
      [ 1798.167738]  ___sys_sendmsg+0xeb/0x150
      [ 1798.174411]  ? sendmsg_copy_msghdr+0x30/0x30
      [ 1798.181649]  ? lock_downgrade+0x350/0x350
      [ 1798.188559]  ? rcu_read_lock_sched_held+0xd0/0xd0
      [ 1798.196239]  ? __fget+0x21d/0x320
      [ 1798.202335]  ? do_dup2+0x2a0/0x2a0
      [ 1798.208499]  ? lock_downgrade+0x350/0x350
      [ 1798.215366]  ? __fget_light+0xd6/0xf0
      [ 1798.221808]  ? syscall_trace_enter+0x369/0x5d0
      [ 1798.229112]  __sys_sendmsg+0xd3/0x160
      [ 1798.235511]  ? __sys_sendmsg_sock+0x60/0x60
      [ 1798.242478]  ? syscall_trace_enter+0x233/0x5d0
      [ 1798.249721]  ? syscall_slow_exit_work+0x280/0x280
      [ 1798.257211]  ? do_syscall_64+0x1e/0x2e0
      [ 1798.263680]  do_syscall_64+0x72/0x2e0
      [ 1798.269950]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Fixes: bd71b08e ("net/mlx5: Support multiple updates of steering rules in parallel")
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
      Reviewed-by: Mark Bloch <markb@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net: systemport: Avoid RBUF stuck in Wake-on-LAN mode · 263a425a
      Florian Fainelli authored
      After a number of suspend and resume cycles, it is possible for the RBUF
      to be stuck in Wake-on-LAN mode, despite the MPD enable bit being
      cleared which instructed the RBUF to exit that mode.
      
      Avoid creating that problematic condition by clearing the RX_EN and
      TX_EN bits in the UniMAC prior to disabling the Magic Packet Detector
      logic, which is guaranteed to make the RBUF exit Wake-on-LAN mode.
      
      Fixes: 83e82f4c ("net: systemport: add Wake-on-LAN support")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8169: fix performance regression related to PCIe max read request size · 21b5f672
      Heiner Kallweit authored
      It turned out that on low performance systems the original change can
      cause lower tx performance. On a N3450-based mini-PC tx performance
      in iperf3 was reduced from 950Mbps to ~900Mbps. Therefore effectively
      revert the original change, just use pcie_set_readrq() now instead of
      changing the PCIe capability register directly.
      
      Fixes: 2df49d36 ("r8169: remove fiddling with the PCIe max read request size")
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: prevent a use after free · 7a02ea65
      Dan Carpenter authored
      The bug is that we call kfree_skb(skb) and then pass "skb" to
      qdisc_pkt_len(skb) on the next line, which is a use after free.
      Also Cong Wang points out that it's better to delay the actual
      frees until we drop the rtnl lock so we should use rtnl_kfree_skbs()
      instead of kfree_skb().
      
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Fixes: ec97ecf1 ("net: sched: add Flow Queue PIE packet scheduler")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
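
The use-after-free pattern from 7a02ea65 (free a buffer, then read its length) has a simple userspace analogue. The sketch below is an illustration only: `fake_pkt`, `fake_pkt_alloc` and `drop_and_account` are hypothetical names, and it frees immediately with `free()` rather than modelling the deferred freeing via rtnl_kfree_skbs() that the commit adopts.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical packet; only the length used for accounting is modelled. */
struct fake_pkt {
    unsigned int len;
};

static struct fake_pkt *fake_pkt_alloc(unsigned int len)
{
    struct fake_pkt *p = malloc(sizeof(*p));

    if (p)
        p->len = len;
    return p;
}

/*
 * Drops a packet and returns its length for backlog accounting.
 * The length is read while the packet is still valid, and only then
 * is the memory released. Reversing these two steps is the bug.
 */
static unsigned int drop_and_account(struct fake_pkt *p)
{
    unsigned int len;

    assert(p != NULL);
    len = p->len; /* read first */
    free(p);      /* free second */
    return len;
}
```

The caller must not touch the packet pointer after `drop_and_account()` returns.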
    • skbuff: fix a data race in skb_queue_len() · 86b18aaa
      Qian Cai authored
      sk_buff.qlen can be accessed concurrently as noticed by KCSAN,
      
       BUG: KCSAN: data-race in __skb_try_recv_from_queue / unix_dgram_sendmsg
      
       read to 0xffff8a1b1d8a81c0 of 4 bytes by task 5371 on cpu 96:
        unix_dgram_sendmsg+0x9a9/0xb70 include/linux/skbuff.h:1821
      				 net/unix/af_unix.c:1761
        ____sys_sendmsg+0x33e/0x370
        ___sys_sendmsg+0xa6/0xf0
        __sys_sendmsg+0x69/0xf0
        __x64_sys_sendmsg+0x51/0x70
        do_syscall_64+0x91/0xb47
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
       write to 0xffff8a1b1d8a81c0 of 4 bytes by task 1 on cpu 99:
        __skb_try_recv_from_queue+0x327/0x410 include/linux/skbuff.h:2029
        __skb_try_recv_datagram+0xbe/0x220
        unix_dgram_recvmsg+0xee/0x850
        ____sys_recvmsg+0x1fb/0x210
        ___sys_recvmsg+0xa2/0xf0
        __sys_recvmsg+0x66/0xf0
        __x64_sys_recvmsg+0x51/0x70
        do_syscall_64+0x91/0xb47
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Since only the read is operating as lockless, it could introduce a logic
      bug in unix_recvq_full() due to the load tearing. Fix it by adding
      a lockless variant of skb_queue_len() and unix_recvq_full() where
      READ_ONCE() is on the read while WRITE_ONCE() is on the write similar to
      the commit d7d16a89 ("net: add skb_queue_empty_lockless()").

      Signed-off-by: Qian Cai <cai@lca.pw>
      Signed-off-by: David S. Miller <davem@davemloft.net>
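
The READ_ONCE()/WRITE_ONCE() pairing mentioned in 86b18aaa can be approximated in userspace as below. This is a sketch, not the kernel implementation: the two macros are simplified volatile-cast stand-ins that rely on the GCC/Clang `__typeof__` extension, and `fake_queue` with its helpers is a hypothetical model of a queue length counter. The volatile accesses only constrain the compiler (no torn or repeated loads and stores of the counter); they do not provide ordering or locking.

```c
#include <assert.h>

/* Simplified stand-ins for the kernel macros (assumes GCC or Clang). */
#define READ_ONCE(x)       (*(volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical queue; only the length counter is modelled. */
struct fake_queue {
    unsigned int qlen;
};

/* Writers (assumed to run under the queue lock) publish with WRITE_ONCE(). */
static void fake_enqueue(struct fake_queue *q)
{
    WRITE_ONCE(q->qlen, q->qlen + 1);
}

static void fake_dequeue(struct fake_queue *q)
{
    assert(q->qlen > 0);
    WRITE_ONCE(q->qlen, q->qlen - 1);
}

/* Lockless reader: a single, untorn load of the counter. */
static unsigned int fake_queue_len_lockless(const struct fake_queue *q)
{
    return READ_ONCE(q->qlen);
}

/* Lockless "receive queue full" check built on the lockless length read. */
static int fake_recvq_full_lockless(const struct fake_queue *q,
                                    unsigned int max_backlog)
{
    return fake_queue_len_lockless(q) > max_backlog;
}
```

The value returned by the lockless reader may already be stale when it is used, so it is suitable for heuristics such as a fullness check but not for decisions that require the lock.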
    • net: mvneta: move rx_dropped and rx_errors in per-cpu stats · c35947b8
      Lorenzo Bianconi authored
      Move the rx_dropped and rx_errors counters into mvneta_pcpu_stats in order
      to avoid possible races when updating statistics.
      
      Fixes: 562e2f46 ("net: mvneta: Improve the buffer allocation method for SWBM")
      Fixes: dc35a10f ("net: mvneta: bm: add support for hardware buffer management")
      Fixes: c5aff182 ("net: mvneta: driver for Marvell Armada 370/XP network unit")
      Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cxgb4: Added tls stats prints. · 45a8317e
      Devulapally Shiva Krishna authored
      Added debugfs entry to show the tls stats.

      Signed-off-by: Devulapally Shiva Krishna <shiva@chelsio.com>
      Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mptcp: fix use-after-free for ipv6 · b0519de8
      Florian Westphal authored
      Turns out that when we accept a new subflow, the newly created
      inet_sk(tcp_sk)->pinet6 points at the ipv6_pinfo structure of the
      listener socket.
      
      This wasn't caught by the selftest because it closes the accepted fd
      before the listening one.
      
      Adding a close(listenfd) after accept returns is enough to trigger it:
       BUG: KASAN: use-after-free in inet6_getname+0x6ba/0x790
       Read of size 1 at addr ffff88810e310866 by task mptcp_connect/2518
       Call Trace:
        inet6_getname+0x6ba/0x790
        __sys_getpeername+0x10b/0x250
        __x64_sys_getpeername+0x6f/0xb0
      
      Also alter the test program to exercise this.

      Reported-by: Christoph Paasch <cpaasch@apple.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 05 Feb, 2020 19 commits
    • qed: Fix timestamping issue for L2 unicast ptp packets. · 0202d293
      Sudarsana Reddy Kalluru authored
      commit cedeac9d ("qed: Add support for Timestamping the unicast
      PTP packets.") handles the timestamping of L4 ptp packets only.
      This patch adds driver changes to detect/timestamp both L2/L4 unicast
      PTP packets.
      
      Fixes: cedeac9d ("qed: Add support for Timestamping the unicast PTP packets.")
      Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
      Signed-off-by: Ariel Elior <aelior@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'macb-TSO-bug-fixes' · 83576e32
      David S. Miller authored
      Harini Katakam says:
      
      ====================
      macb: TSO bug fixes
      
      An IP erratum was recently discovered when testing TSO-enabled versions
      with perf test tools, where a false amba error is reported by the IP.
      One way to reproduce it is to use iperf or applications with payload
      descriptor sizes very close to 16K. Once the error is observed, TXERR (or
      bit 6 of ISR) is triggered constantly, leading to a series of tx path
      error handling and cleanup. Work around this by limiting the size to
      0x3FC0 as recommended by Cadence. There was no performance impact on the
      1G system that I tested with.
      
      Note on patch 1: The alignment code may be unused but leaving it there
      in case anyone is using UFO.
      
      Added Fixes tag to patch 1.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: macb: Limit maximum GEM TX length in TSO · f822e9c4
      Harini Katakam authored
      GEM_MAX_TX_LEN currently resolves to 0x3FF8 for any IP version supporting
      TSO with the full 14 bits of length field in the payload descriptor. But an
      IP erratum causes a false amba_error (bit 6 of ISR) when the length in
      payload descriptors is specified above 16387. The error occurs because the
      DMA falsely concludes that there is not enough space in SRAM for the
      incoming payload. These errors were observed continuously under stress of
      large packets using iperf on a version where SRAM was 16K for each queue.
      This erratum will be documented shortly and affects all versions since TSO
      functionality was added. Hence limit the max length to 0x3FC0 (rounded).

      Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
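
The length limits quoted in f822e9c4 can be checked with a little arithmetic. The sketch below is illustrative: the macro and function names are hypothetical (they are not the driver's identifiers), and `descs_needed` is simply a ceiling division showing how a lower per-descriptor limit changes the number of payload descriptors.

```c
#include <assert.h>

#define FRMLEN_BITS    14u     /* width of the length field, per the commit text */
#define OLD_MAX_TX_LEN 0x3FF8u /* 16376 bytes, previous limit */
#define NEW_MAX_TX_LEN 0x3FC0u /* 16320 bytes, new limit */

/* Number of payload descriptors needed for payload_len bytes (ceiling division). */
static unsigned int descs_needed(unsigned int payload_len, unsigned int max_len)
{
    assert(max_len > 0);
    return (payload_len + max_len - 1) / max_len;
}
```

Both limits fit in the 14-bit field (maximum 0x3FFF = 16383), and 0x3FC0 is a multiple of 64. A 16376-byte payload that used to fit in one descriptor now needs two.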
    • net: macb: Remove unnecessary alignment check for TSO · 41c1ef97
      Harini Katakam authored
      The IP TSO implementation does NOT require the length to be a
      multiple of 8. That is only a requirement for UFO as per IP
      documentation. Hence, exit the macb_features_check function early
      if the protocol is not UDP, and proceed to the alignment checks
      only when it is UDP. Update comments to reflect the same. Also
      remove dead code checking for protocol TCP when calculating
      header length.
      
      Fixes: 1629dd4f ("cadence: Add LSO support.")
      Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bonding/alb: properly access headers in bond_alb_xmit() · 38f88c45
      Eric Dumazet authored
      syzbot managed to send an IPX packet through bond_alb_xmit()
      and af_packet and triggered a use-after-free.
      
      First, bond_alb_xmit() was using ipx_hdr() helper to reach
      the IPX header, but ipx_hdr() was using the transport offset
      instead of the network offset. In the particular syzbot
      report transport offset was 0xFFFF.
      
      This patch removes ipx_hdr() since it was only (mis)used from bonding.
      
      Then we need to make sure IPv4/IPv6/IPX headers are pulled
      in skb->head before dereferencing anything.
      
      BUG: KASAN: use-after-free in bond_alb_xmit+0x153a/0x1590 drivers/net/bonding/bond_alb.c:1452
      Read of size 2 at addr ffff8801ce56dfff by task syz-executor.2/18108
       (if (ipx_hdr(skb)->ipx_checksum != IPX_NO_CHECKSUM) ...)
      
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       [<ffffffff8441fc42>] __dump_stack lib/dump_stack.c:17 [inline]
       [<ffffffff8441fc42>] dump_stack+0x14d/0x20b lib/dump_stack.c:53
       [<ffffffff81a7dec4>] print_address_description+0x6f/0x20b mm/kasan/report.c:282
       [<ffffffff81a7e0ec>] kasan_report_error mm/kasan/report.c:380 [inline]
       [<ffffffff81a7e0ec>] kasan_report mm/kasan/report.c:438 [inline]
       [<ffffffff81a7e0ec>] kasan_report.cold+0x8c/0x2a0 mm/kasan/report.c:422
       [<ffffffff81a7dc4f>] __asan_report_load_n_noabort+0xf/0x20 mm/kasan/report.c:469
       [<ffffffff82c8c00a>] bond_alb_xmit+0x153a/0x1590 drivers/net/bonding/bond_alb.c:1452
       [<ffffffff82c60c74>] __bond_start_xmit drivers/net/bonding/bond_main.c:4199 [inline]
       [<ffffffff82c60c74>] bond_start_xmit+0x4f4/0x1570 drivers/net/bonding/bond_main.c:4224
       [<ffffffff83baa558>] __netdev_start_xmit include/linux/netdevice.h:4525 [inline]
       [<ffffffff83baa558>] netdev_start_xmit include/linux/netdevice.h:4539 [inline]
       [<ffffffff83baa558>] xmit_one net/core/dev.c:3611 [inline]
       [<ffffffff83baa558>] dev_hard_start_xmit+0x168/0x910 net/core/dev.c:3627
       [<ffffffff83bacf35>] __dev_queue_xmit+0x1f55/0x33b0 net/core/dev.c:4238
       [<ffffffff83bae3a8>] dev_queue_xmit+0x18/0x20 net/core/dev.c:4278
       [<ffffffff84339189>] packet_snd net/packet/af_packet.c:3226 [inline]
       [<ffffffff84339189>] packet_sendmsg+0x4919/0x70b0 net/packet/af_packet.c:3252
       [<ffffffff83b1ac0c>] sock_sendmsg_nosec net/socket.c:673 [inline]
       [<ffffffff83b1ac0c>] sock_sendmsg+0x12c/0x160 net/socket.c:684
       [<ffffffff83b1f5a2>] __sys_sendto+0x262/0x380 net/socket.c:1996
       [<ffffffff83b1f700>] SYSC_sendto net/socket.c:2008 [inline]
       [<ffffffff83b1f700>] SyS_sendto+0x40/0x60 net/socket.c:2004
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Jay Vosburgh <j.vosburgh@gmail.com>
      Cc: Veaceslav Falico <vfalico@gmail.com>
      Cc: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • devlink: report 0 after hitting end in region read · d5b90e99
      Jacob Keller authored
      commit fdd41ec2 ("devlink: Return right error code in case of errors
      for region read") modified the region read code to report errors
      properly in unexpected cases.
      
      In the case where the start_offset and ret_offset match, it unilaterally
      converted this into an error. This causes an issue for the "dump"
      version of the command. In this case, the devlink region dump will
      always report an invalid argument:
      
      000000000000ffd0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
      000000000000ffe0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
      devlink answers: Invalid argument
      000000000000fff0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
      
      This occurs because the expected flow for the dump is to return 0 after
      there is no further data.
      
      The simplest fix would be to stop converting the error code to -EINVAL
      if start_offset == ret_offset. However, avoid unnecessary work by
      checking for when start_offset is larger than the region size and
      returning 0 upfront.
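
The end-of-dump handling described above can be sketched in plain C. This is a minimal userspace illustration, not the devlink code: the function name `region_dump_step`, its parameters, and the `room` argument (how many bytes the current message can still hold) are all assumptions made for the example.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for one step of a region dump.
 * Returns the number of bytes emitted, 0 when the dump is complete,
 * or a negative errno value on error.
 */
static int region_dump_step(size_t region_size, size_t start_offset,
                            size_t room, size_t *ret_offset)
{
    size_t len;

    /* Nothing left to dump: report 0 upfront instead of an error. */
    if (start_offset >= region_size) {
        *ret_offset = start_offset;
        return 0;
    }

    /* Emit as much as fits in the current message. */
    len = region_size - start_offset;
    if (len > room)
        len = room;
    *ret_offset = start_offset + len;

    /* Data was expected but no progress was made: still an error. */
    if (*ret_offset == start_offset)
        return -EINVAL;

    return (int)len;
}
```

With this ordering, a dump loop that keeps calling the step function until it returns 0 terminates cleanly once the offset reaches the region size, while a step that makes no progress inside the region is still reported as -EINVAL.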
      
      Fixes: fdd41ec2 ("devlink: Return right error code in case of errors for region read")
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d5b90e99
    • Moritz Fischer's avatar
      net: ethernet: dec: tulip: Fix length mask in receive length calculation · 33e2b32b
      Moritz Fischer authored
      The receive frame length calculation uses a wrong mask to calculate the
      length of the received frames.
      
      Per spec table 4-1 the length is contained in the FL (Frame Length)
      field in bits 30:16.
      
      This hasn't shown up as an issue so far, since frames were limited to
      1500 bytes, which fits within the 11-bit window.
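
To illustrate why the narrower mask went unnoticed, here is a small C sketch that extracts a frame length from bits 30:16 of a status word and compares it with an 11-bit mask. The macro and function names are hypothetical; the commit message does not show the driver's actual definitions.

```c
#include <stdint.h>

/* Hypothetical names, chosen for this example only. */
#define RX_FL_SHIFT      16
#define RX_FL_MASK_15BIT 0x7fffu /* bits 30:16 span 15 bits */
#define RX_FL_MASK_11BIT 0x07ffu /* too narrow: only 11 bits */

/* Frame length taken from the full FL field (bits 30:16). */
static inline uint32_t rx_frame_len(uint32_t status)
{
    return (status >> RX_FL_SHIFT) & RX_FL_MASK_15BIT;
}

/* Frame length taken with an 11-bit mask, which truncates large frames. */
static inline uint32_t rx_frame_len_narrow(uint32_t status)
{
    return (status >> RX_FL_SHIFT) & RX_FL_MASK_11BIT;
}
```

Both helpers agree for a 1500-byte frame, because 1500 is below 2048. They diverge once the length needs more than 11 bits, which is where the narrow mask silently drops the high bits.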
      Signed-off-by: Moritz Fischer <mdf@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      33e2b32b
    • David S. Miller's avatar
      Merge branch 'wg-fixes' · 7bb77d4b
      David S. Miller authored
      Jason A. Donenfeld says:
      
      ====================
      wireguard fixes for 5.6-rc1
      
      Here are fixes for WireGuard before 5.6-rc1 is tagged. It includes:
      
      1) A fix for a UaF (caused by kmalloc failing during a very small
         allocation) that syzkaller found, from Eric Dumazet.
      
      2) A fix for a deadlock that syzkaller found, along with an additional
         selftest to ensure that the bug fix remains correct, from me.
      
      3) Two little fixes/cleanups to the selftests from Krzysztof Kozlowski
         and me.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7bb77d4b
    • Jason A. Donenfeld's avatar
      wireguard: selftests: tie socket waiting to target pid · 88f404a9
      Jason A. Donenfeld authored
      Without this, we wind up proceeding too early sometimes when the
      previous process has just used the same listening port. So, we tie the
      listening socket query to the specific pid we're interested in.
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      88f404a9
    • Krzysztof Kozlowski's avatar
      wireguard: selftests: cleanup CONFIG_ENABLE_WARN_DEPRECATED · 4a2ef721
      Krzysztof Kozlowski authored
      CONFIG_ENABLE_WARN_DEPRECATED is gone since commit 771c0353
      ("deprecate the '__deprecated' attribute warnings entirely and for
      good").
      Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4a2ef721
    • Jason A. Donenfeld's avatar
      wireguard: selftests: ensure non-addition of peers with failed precomputation · f9398acb
      Jason A. Donenfeld authored
      Ensure that peers with low order points are ignored, both in the case
      where we already have a device private key and in the case where we do
      not. This adds points that naturally give a zero output.
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f9398acb
    • Jason A. Donenfeld's avatar
      wireguard: noise: reject peers with low order public keys · ec31c267
      Jason A. Donenfeld authored
      Our static-static calculation returns a failure if the public key is of
      low order. We check for this when peers are added, and don't allow them
      to be added if they're low order, except in the case where we haven't
      yet been given a private key. In that case, we would defer the removal
      of the peer until we're given a private key, since at that point we're
      doing new static-static calculations which incur failures we can act on.
      This meant, however, that we wound up removing peers rather late in the
      configuration flow.
      
      Syzkaller points out that peer_remove calls flush_workqueue, which in
      turn might then wait for sending a handshake initiation to complete.
      Since handshake initiation needs the static identity lock, holding the
      static identity lock while calling peer_remove can result in a rare
      deadlock. We have precisely this case in this situation of late-stage
      peer removal based on an invalid public key. We can't drop the lock when
      removing, because then incoming handshakes might interact with a bogus
      static-static calculation.
      
      While the band-aid patch for this would involve breaking up the peer
      removal into two steps like wg_peer_remove_all does, in order to solve
      the locking issue, there's actually a much more elegant way of fixing
      this:
      
      If the static-static calculation succeeds with one private key, it
      *must* succeed with all others, because all 32-byte strings map to valid
      private keys, thanks to clamping. That means we can get rid of this
      silly dance and locking headaches of removing peers late in the
      configuration flow, and instead just reject them early on, regardless of
      whether the device has yet been assigned a private key. For the case
      where the device doesn't yet have a private key, we safely use zeros
      just for the purposes of checking for low order points by way of
      checking the output of the calculation.
      
      The following PoC will trigger the deadlock:
      
      ip link add wg0 type wireguard
      ip addr add 10.0.0.1/24 dev wg0
      ip link set wg0 up
      ping -f 10.0.0.2 &
      while true; do
              wg set wg0 private-key /dev/null peer AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA= allowed-ips 10.0.0.0/24 endpoint 10.0.0.3:1234
              wg set wg0 private-key <(echo AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=)
      done
      
      [    0.949105] ======================================================
      [    0.949550] WARNING: possible circular locking dependency detected
      [    0.950143] 5.5.0-debug+ #18 Not tainted
      [    0.950431] ------------------------------------------------------
      [    0.950959] wg/89 is trying to acquire lock:
      [    0.951252] ffff8880333e2128 ((wq_completion)wg-kex-wg0){+.+.}, at: flush_workqueue+0xe3/0x12f0
      [    0.951865]
      [    0.951865] but task is already holding lock:
      [    0.952280] ffff888032819bc0 (&wg->static_identity.lock){++++}, at: wg_set_device+0x95d/0xcc0
      [    0.953011]
      [    0.953011] which lock already depends on the new lock.
      [    0.953011]
      [    0.953651]
      [    0.953651] the existing dependency chain (in reverse order) is:
      [    0.954292]
      [    0.954292] -> #2 (&wg->static_identity.lock){++++}:
      [    0.954804]        lock_acquire+0x127/0x350
      [    0.955133]        down_read+0x83/0x410
      [    0.955428]        wg_noise_handshake_create_initiation+0x97/0x700
      [    0.955885]        wg_packet_send_handshake_initiation+0x13a/0x280
      [    0.956401]        wg_packet_handshake_send_worker+0x10/0x20
      [    0.956841]        process_one_work+0x806/0x1500
      [    0.957167]        worker_thread+0x8c/0xcb0
      [    0.957549]        kthread+0x2ee/0x3b0
      [    0.957792]        ret_from_fork+0x24/0x30
      [    0.958234]
      [    0.958234] -> #1 ((work_completion)(&peer->transmit_handshake_work)){+.+.}:
      [    0.958808]        lock_acquire+0x127/0x350
      [    0.959075]        process_one_work+0x7ab/0x1500
      [    0.959369]        worker_thread+0x8c/0xcb0
      [    0.959639]        kthread+0x2ee/0x3b0
      [    0.959896]        ret_from_fork+0x24/0x30
      [    0.960346]
      [    0.960346] -> #0 ((wq_completion)wg-kex-wg0){+.+.}:
      [    0.960945]        check_prev_add+0x167/0x1e20
      [    0.961351]        __lock_acquire+0x2012/0x3170
      [    0.961725]        lock_acquire+0x127/0x350
      [    0.961990]        flush_workqueue+0x106/0x12f0
      [    0.962280]        peer_remove_after_dead+0x160/0x220
      [    0.962600]        wg_set_device+0xa24/0xcc0
      [    0.962994]        genl_rcv_msg+0x52f/0xe90
      [    0.963298]        netlink_rcv_skb+0x111/0x320
      [    0.963618]        genl_rcv+0x1f/0x30
      [    0.963853]        netlink_unicast+0x3f6/0x610
      [    0.964245]        netlink_sendmsg+0x700/0xb80
      [    0.964586]        __sys_sendto+0x1dd/0x2c0
      [    0.964854]        __x64_sys_sendto+0xd8/0x1b0
      [    0.965141]        do_syscall_64+0x90/0xd9a
      [    0.965408]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [    0.965769]
      [    0.965769] other info that might help us debug this:
      [    0.965769]
      [    0.966337] Chain exists of:
      [    0.966337]   (wq_completion)wg-kex-wg0 --> (work_completion)(&peer->transmit_handshake_work) --> &wg->static_identity.lock
      [    0.966337]
      [    0.967417]  Possible unsafe locking scenario:
      [    0.967417]
      [    0.967836]        CPU0                    CPU1
      [    0.968155]        ----                    ----
      [    0.968497]   lock(&wg->static_identity.lock);
      [    0.968779]                                lock((work_completion)(&peer->transmit_handshake_work));
      [    0.969345]                                lock(&wg->static_identity.lock);
      [    0.969809]   lock((wq_completion)wg-kex-wg0);
      [    0.970146]
      [    0.970146]  *** DEADLOCK ***
      [    0.970146]
      [    0.970531] 5 locks held by wg/89:
      [    0.970908]  #0: ffffffff827433c8 (cb_lock){++++}, at: genl_rcv+0x10/0x30
      [    0.971400]  #1: ffffffff82743480 (genl_mutex){+.+.}, at: genl_rcv_msg+0x642/0xe90
      [    0.971924]  #2: ffffffff827160c0 (rtnl_mutex){+.+.}, at: wg_set_device+0x9f/0xcc0
      [    0.972488]  #3: ffff888032819de0 (&wg->device_update_lock){+.+.}, at: wg_set_device+0xb0/0xcc0
      [    0.973095]  #4: ffff888032819bc0 (&wg->static_identity.lock){++++}, at: wg_set_device+0x95d/0xcc0
      [    0.973653]
      [    0.973653] stack backtrace:
      [    0.973932] CPU: 1 PID: 89 Comm: wg Not tainted 5.5.0-debug+ #18
      [    0.974476] Call Trace:
      [    0.974638]  dump_stack+0x97/0xe0
      [    0.974869]  check_noncircular+0x312/0x3e0
      [    0.975132]  ? print_circular_bug+0x1f0/0x1f0
      [    0.975410]  ? __kernel_text_address+0x9/0x30
      [    0.975727]  ? unwind_get_return_address+0x51/0x90
      [    0.976024]  check_prev_add+0x167/0x1e20
      [    0.976367]  ? graph_lock+0x70/0x160
      [    0.976682]  __lock_acquire+0x2012/0x3170
      [    0.976998]  ? register_lock_class+0x1140/0x1140
      [    0.977323]  lock_acquire+0x127/0x350
      [    0.977627]  ? flush_workqueue+0xe3/0x12f0
      [    0.977890]  flush_workqueue+0x106/0x12f0
      [    0.978147]  ? flush_workqueue+0xe3/0x12f0
      [    0.978410]  ? find_held_lock+0x2c/0x110
      [    0.978662]  ? lock_downgrade+0x6e0/0x6e0
      [    0.978919]  ? queue_rcu_work+0x60/0x60
      [    0.979166]  ? netif_napi_del+0x151/0x3b0
      [    0.979501]  ? peer_remove_after_dead+0x160/0x220
      [    0.979871]  peer_remove_after_dead+0x160/0x220
      [    0.980232]  wg_set_device+0xa24/0xcc0
      [    0.980516]  ? deref_stack_reg+0x8e/0xc0
      [    0.980801]  ? set_peer+0xe10/0xe10
      [    0.981040]  ? __ww_mutex_check_waiters+0x150/0x150
      [    0.981430]  ? __nla_validate_parse+0x163/0x270
      [    0.981719]  ? genl_family_rcv_msg_attrs_parse+0x13f/0x310
      [    0.982078]  genl_rcv_msg+0x52f/0xe90
      [    0.982348]  ? genl_family_rcv_msg_attrs_parse+0x310/0x310
      [    0.982690]  ? register_lock_class+0x1140/0x1140
      [    0.983049]  netlink_rcv_skb+0x111/0x320
      [    0.983298]  ? genl_family_rcv_msg_attrs_parse+0x310/0x310
      [    0.983645]  ? netlink_ack+0x880/0x880
      [    0.983888]  genl_rcv+0x1f/0x30
      [    0.984168]  netlink_unicast+0x3f6/0x610
      [    0.984443]  ? netlink_detachskb+0x60/0x60
      [    0.984729]  ? find_held_lock+0x2c/0x110
      [    0.984976]  netlink_sendmsg+0x700/0xb80
      [    0.985220]  ? netlink_broadcast_filtered+0xa60/0xa60
      [    0.985533]  __sys_sendto+0x1dd/0x2c0
      [    0.985763]  ? __x64_sys_getpeername+0xb0/0xb0
      [    0.986039]  ? sockfd_lookup_light+0x17/0x160
      [    0.986397]  ? __sys_recvmsg+0x8c/0xf0
      [    0.986711]  ? __sys_recvmsg_sock+0xd0/0xd0
      [    0.987018]  __x64_sys_sendto+0xd8/0x1b0
      [    0.987283]  ? lockdep_hardirqs_on+0x39b/0x5a0
      [    0.987666]  do_syscall_64+0x90/0xd9a
      [    0.987903]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      [    0.988223] RIP: 0033:0x7fe77c12003e
      [    0.988508] Code: c3 8b 07 85 c0 75 24 49 89 fb 48 89 f0 48 89 d7 48 89 ce 4c 89 c2 4d 89 ca 4c 8b 44 24 08 4c 8b 4c 24 10 4c 4
      [    0.989666] RSP: 002b:00007fffada2ed58 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      [    0.990137] RAX: ffffffffffffffda RBX: 00007fe77c159d48 RCX: 00007fe77c12003e
      [    0.990583] RDX: 0000000000000040 RSI: 000055fd1d38e020 RDI: 0000000000000004
      [    0.991091] RBP: 000055fd1d38e020 R08: 000055fd1cb63358 R09: 000000000000000c
      [    0.991568] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000002c
      [    0.992014] R13: 0000000000000004 R14: 000055fd1d38e020 R15: 0000000000000001
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ec31c267
    • Eric Dumazet's avatar
      wireguard: allowedips: fix use-after-free in root_remove_peer_lists · 9981159f
      Eric Dumazet authored
      In the unlikely case a new node could not be allocated, we need to
      remove @newnode from @peer->allowedips_list before freeing it.
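
The ordering requirement (unlink before free) can be shown with a small userspace C sketch. The list helpers below are minimal stand-ins written for this example, and `insert_two_nodes` with its `second_alloc_fails` knob is a hypothetical simplification of the insert path, not the WireGuard code.

```c
#include <errno.h>
#include <stdlib.h>

/* Minimal circular doubly linked list, written for this example. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add(struct list_head *n, struct list_head *h)
{
    n->next = h->next;
    n->prev = h;
    h->next->prev = n;
    h->next = n;
}

static void list_del(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->next = n->prev = NULL;
}

static int list_empty(const struct list_head *h) { return h->next == h; }

struct node { struct list_head peer_list; };

/* Hypothetical insert that needs two nodes; the second may fail. */
static int insert_two_nodes(struct list_head *peer_list, int second_alloc_fails)
{
    struct node *newnode, *node;

    newnode = calloc(1, sizeof(*newnode));
    if (!newnode)
        return -ENOMEM;
    list_add(&newnode->peer_list, peer_list);

    node = second_alloc_fails ? NULL : calloc(1, sizeof(*node));
    if (!node) {
        /* Unlink first; freeing a still-linked node leaves the list
         * pointing at freed memory for a later traversal to trip over. */
        list_del(&newnode->peer_list);
        free(newnode);
        return -ENOMEM;
    }
    list_add(&node->peer_list, peer_list);
    return 0;
}
```

After the failure path runs, the peer list is empty again, so a later walk over it never touches the freed node.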
      
      syzbot reported:
      
      BUG: KASAN: use-after-free in __list_del_entry_valid+0xdc/0xf5 lib/list_debug.c:54
      Read of size 8 at addr ffff88809881a538 by task syz-executor.4/30133
      
      CPU: 0 PID: 30133 Comm: syz-executor.4 Not tainted 5.5.0-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x197/0x210 lib/dump_stack.c:118
       print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374
       __kasan_report.cold+0x1b/0x32 mm/kasan/report.c:506
       kasan_report+0x12/0x20 mm/kasan/common.c:639
       __asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:135
       __list_del_entry_valid+0xdc/0xf5 lib/list_debug.c:54
       __list_del_entry include/linux/list.h:132 [inline]
       list_del include/linux/list.h:146 [inline]
       root_remove_peer_lists+0x24f/0x4b0 drivers/net/wireguard/allowedips.c:65
       wg_allowedips_free+0x232/0x390 drivers/net/wireguard/allowedips.c:300
       wg_peer_remove_all+0xd5/0x620 drivers/net/wireguard/peer.c:187
       wg_set_device+0xd01/0x1350 drivers/net/wireguard/netlink.c:542
       genl_family_rcv_msg_doit net/netlink/genetlink.c:672 [inline]
       genl_family_rcv_msg net/netlink/genetlink.c:717 [inline]
       genl_rcv_msg+0x67d/0xea0 net/netlink/genetlink.c:734
       netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
       genl_rcv+0x29/0x40 net/netlink/genetlink.c:745
       netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
       netlink_unicast+0x59e/0x7e0 net/netlink/af_netlink.c:1328
       netlink_sendmsg+0x91c/0xea0 net/netlink/af_netlink.c:1917
       sock_sendmsg_nosec net/socket.c:652 [inline]
       sock_sendmsg+0xd7/0x130 net/socket.c:672
       ____sys_sendmsg+0x753/0x880 net/socket.c:2343
       ___sys_sendmsg+0x100/0x170 net/socket.c:2397
       __sys_sendmsg+0x105/0x1d0 net/socket.c:2430
       __do_sys_sendmsg net/socket.c:2439 [inline]
       __se_sys_sendmsg net/socket.c:2437 [inline]
       __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2437
       do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x45b399
      Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f99a9bcdc78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00007f99a9bce6d4 RCX: 000000000045b399
      RDX: 0000000000000000 RSI: 0000000020001340 RDI: 0000000000000003
      RBP: 000000000075bf20 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
      R13: 00000000000009ba R14: 00000000004cb2b8 R15: 0000000000000009
      
      Allocated by task 30103:
       save_stack+0x23/0x90 mm/kasan/common.c:72
       set_track mm/kasan/common.c:80 [inline]
       __kasan_kmalloc mm/kasan/common.c:513 [inline]
       __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:486
       kasan_kmalloc+0x9/0x10 mm/kasan/common.c:527
       kmem_cache_alloc_trace+0x158/0x790 mm/slab.c:3551
       kmalloc include/linux/slab.h:556 [inline]
       kzalloc include/linux/slab.h:670 [inline]
       add+0x70a/0x1970 drivers/net/wireguard/allowedips.c:236
       wg_allowedips_insert_v4+0xf6/0x160 drivers/net/wireguard/allowedips.c:320
       set_allowedip drivers/net/wireguard/netlink.c:343 [inline]
       set_peer+0xfb9/0x1150 drivers/net/wireguard/netlink.c:468
       wg_set_device+0xbd4/0x1350 drivers/net/wireguard/netlink.c:591
       genl_family_rcv_msg_doit net/netlink/genetlink.c:672 [inline]
       genl_family_rcv_msg net/netlink/genetlink.c:717 [inline]
       genl_rcv_msg+0x67d/0xea0 net/netlink/genetlink.c:734
       netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
       genl_rcv+0x29/0x40 net/netlink/genetlink.c:745
       netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
       netlink_unicast+0x59e/0x7e0 net/netlink/af_netlink.c:1328
       netlink_sendmsg+0x91c/0xea0 net/netlink/af_netlink.c:1917
       sock_sendmsg_nosec net/socket.c:652 [inline]
       sock_sendmsg+0xd7/0x130 net/socket.c:672
       ____sys_sendmsg+0x753/0x880 net/socket.c:2343
       ___sys_sendmsg+0x100/0x170 net/socket.c:2397
       __sys_sendmsg+0x105/0x1d0 net/socket.c:2430
       __do_sys_sendmsg net/socket.c:2439 [inline]
       __se_sys_sendmsg net/socket.c:2437 [inline]
       __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2437
       do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Freed by task 30103:
       save_stack+0x23/0x90 mm/kasan/common.c:72
       set_track mm/kasan/common.c:80 [inline]
       kasan_set_free_info mm/kasan/common.c:335 [inline]
       __kasan_slab_free+0x102/0x150 mm/kasan/common.c:474
       kasan_slab_free+0xe/0x10 mm/kasan/common.c:483
       __cache_free mm/slab.c:3426 [inline]
       kfree+0x10a/0x2c0 mm/slab.c:3757
       add+0x12d2/0x1970 drivers/net/wireguard/allowedips.c:266
       wg_allowedips_insert_v4+0xf6/0x160 drivers/net/wireguard/allowedips.c:320
       set_allowedip drivers/net/wireguard/netlink.c:343 [inline]
       set_peer+0xfb9/0x1150 drivers/net/wireguard/netlink.c:468
       wg_set_device+0xbd4/0x1350 drivers/net/wireguard/netlink.c:591
       genl_family_rcv_msg_doit net/netlink/genetlink.c:672 [inline]
       genl_family_rcv_msg net/netlink/genetlink.c:717 [inline]
       genl_rcv_msg+0x67d/0xea0 net/netlink/genetlink.c:734
       netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
       genl_rcv+0x29/0x40 net/netlink/genetlink.c:745
       netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
       netlink_unicast+0x59e/0x7e0 net/netlink/af_netlink.c:1328
       netlink_sendmsg+0x91c/0xea0 net/netlink/af_netlink.c:1917
       sock_sendmsg_nosec net/socket.c:652 [inline]
       sock_sendmsg+0xd7/0x130 net/socket.c:672
       ____sys_sendmsg+0x753/0x880 net/socket.c:2343
       ___sys_sendmsg+0x100/0x170 net/socket.c:2397
       __sys_sendmsg+0x105/0x1d0 net/socket.c:2430
       __do_sys_sendmsg net/socket.c:2439 [inline]
       __se_sys_sendmsg net/socket.c:2437 [inline]
       __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2437
       do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      The buggy address belongs to the object at ffff88809881a500
       which belongs to the cache kmalloc-64 of size 64
      The buggy address is located 56 bytes inside of
       64-byte region [ffff88809881a500, ffff88809881a540)
      The buggy address belongs to the page:
      page:ffffea0002620680 refcount:1 mapcount:0 mapping:ffff8880aa400380 index:0x0
      raw: 00fffe0000000200 ffffea000250b748 ffffea000254bac8 ffff8880aa400380
      raw: 0000000000000000 ffff88809881a000 0000000100000020 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff88809881a400: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
       ffff88809881a480: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc
      >ffff88809881a500: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
                                              ^
       ffff88809881a580: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
       ffff88809881a600: 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc
      
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Jason A. Donenfeld <Jason@zx2c4.com>
      Cc: wireguard@lists.zx2c4.com
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9981159f
    • Cong Wang's avatar
      net_sched: fix a resource leak in tcindex_set_parms() · 52b5ae50
      Cong Wang authored
      Jakub noticed there is a potential resource leak in
      tcindex_set_parms(): when tcindex_filter_result_init() fails,
      it jumps to 'errout1', which doesn't release the memory
      and resources allocated by tcindex_alloc_perfect_hash().
      
      We should just jump to 'errout_alloc' which calls
      tcindex_free_perfect_hash().
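
The label choice matters because each error label undoes only what was set up before it. A minimal C sketch of that pattern follows; the struct, helper names, and the `live_allocs` counter are hypothetical and exist only to make the leak observable in a test.

```c
#include <errno.h>
#include <stdlib.h>

struct table { int *perfect; };

static int live_allocs; /* outstanding allocations, for demonstration */

static int alloc_perfect_hash(struct table *t)
{
    t->perfect = calloc(16, sizeof(*t->perfect));
    if (!t->perfect)
        return -ENOMEM;
    live_allocs++;
    return 0;
}

static void free_perfect_hash(struct table *t)
{
    free(t->perfect);
    t->perfect = NULL;
    live_allocs--;
}

/* Stand-in for an init step that can fail after the allocation. */
static int filter_result_init(int should_fail)
{
    return should_fail ? -ENOMEM : 0;
}

static int set_parms(struct table *t, int init_fails)
{
    int err;

    err = alloc_perfect_hash(t);
    if (err < 0)
        goto errout;

    err = filter_result_init(init_fails);
    if (err < 0)
        goto errout_alloc; /* jumping to 'errout' here would leak t->perfect */

    return 0;

errout_alloc:
    free_perfect_hash(t);
errout:
    return err;
}
```

Stacking the labels in reverse order of acquisition lets a later failure fall through every cleanup step it needs.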
      
      Fixes: b9a24bb7 ("net_sched: properly handle failure case of tcf_exts_init()")
      Reported-by: Jakub Kicinski <kuba@kernel.org>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      52b5ae50
    • Florian Westphal's avatar
      mptcp: fix use-after-free on tcp fallback · 2c22c06c
      Florian Westphal authored
      When an mptcp socket connects to a tcp peer or when a middlebox interferes
      with tcp options, mptcp needs to fall back to plain tcp.
      Problem is that mptcp is trying to be too clever in this case:
      
      It attempts to close the mptcp meta sk and transparently replace it with
      the (only) subflow tcp sk.
      
      Unfortunately, this is racy -- the socket is already exposed to userspace.
      Any parallel calls to send/recv/setsockopt etc. can cause use-after-free:
      
      BUG: KASAN: use-after-free in atomic_try_cmpxchg include/asm-generic/atomic-instrumented.h:693 [inline]
      CPU: 1 PID: 2083 Comm: syz-executor.1 Not tainted 5.5.0 #2
       atomic_try_cmpxchg include/asm-generic/atomic-instrumented.h:693 [inline]
       queued_spin_lock include/asm-generic/qspinlock.h:78 [inline]
       do_raw_spin_lock include/linux/spinlock.h:181 [inline]
       __raw_spin_lock_bh include/linux/spinlock_api_smp.h:136 [inline]
       _raw_spin_lock_bh+0x71/0xd0 kernel/locking/spinlock.c:175
       spin_lock_bh include/linux/spinlock.h:343 [inline]
       __lock_sock+0x105/0x190 net/core/sock.c:2414
       lock_sock_nested+0x10f/0x140 net/core/sock.c:2938
       lock_sock include/net/sock.h:1516 [inline]
       mptcp_setsockopt+0x2f/0x1f0 net/mptcp/protocol.c:800
       __sys_setsockopt+0x152/0x240 net/socket.c:2130
       __do_sys_setsockopt net/socket.c:2146 [inline]
       __se_sys_setsockopt net/socket.c:2143 [inline]
       __x64_sys_setsockopt+0xba/0x150 net/socket.c:2143
       do_syscall_64+0xb7/0x3d0 arch/x86/entry/common.c:294
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      While the use-after-free can be resolved, there is another problem:
      sock->ops and sock->sk assignments are not atomic, i.e. we may get calls
      into mptcp functions with sock->sk already pointing at the subflow socket,
      or calls into tcp functions with a mptcp meta sk.
      
      Remove the fallback code and call the relevant functions for the (only)
      subflow in case the mptcp socket is connected to a tcp peer.
      Reported-by: Christoph Paasch <cpaasch@apple.com>
      Diagnosed-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Tested-by: Christoph Paasch <cpaasch@apple.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2c22c06c
    • Andy Shevchenko's avatar
      8b7a07c7
    • Andy Shevchenko's avatar
      net: dsa: b53: Platform data shan't include kernel.h · e22e0790
      Andy Shevchenko authored
      Replace with appropriate types.h.
      Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e22e0790
    • kbuild test robot's avatar
      netdevsim: fix ptr_ret.cocci warnings · 34611e69
      kbuild test robot authored
      drivers/net/netdevsim/dev.c:937:1-3: WARNING: PTR_ERR_OR_ZERO can be used
      
       Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR
      
      Generated by: scripts/coccinelle/api/ptr_ret.cocci
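
For readers unfamiliar with the idiom, the equivalence the warning relies on can be sketched in userspace C. The helpers below are simplified re-implementations written for this example, not the kernel's own definitions, and `check_long` / `check_short` are hypothetical callers.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified userspace versions of the error-pointer helpers. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }

static inline int IS_ERR(const void *ptr)
{
    /* Error pointers occupy the top MAX_ERRNO values of the address space. */
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline int PTR_ERR_OR_ZERO(const void *ptr)
{
    return IS_ERR(ptr) ? (int)PTR_ERR(ptr) : 0;
}

/* The two-step form the script flags. */
static int check_long(const void *p)
{
    if (IS_ERR(p))
        return (int)PTR_ERR(p);
    return 0;
}

/* The one-call form it suggests instead. */
static int check_short(const void *p)
{
    return PTR_ERR_OR_ZERO(p);
}
```

Both callers return the same value for valid pointers and for error pointers, so the replacement is purely a readability change.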
      
      Fixes: 6556ff32 ("netdevsim: use IS_ERR instead of IS_ERR_OR_NULL for debugfs")
      CC: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: kbuild test robot <lkp@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      34611e69
    • Thomas Bogendoerfer's avatar
      net: sgi: ioc3-eth: Remove leftover free_irq() · 9784e619
      Thomas Bogendoerfer authored
      Commit 0ce5ebd2 ("mfd: ioc3: Add driver for SGI IOC3 chip") moved
      request_irq() from ioc3_open into probe function, but forgot to remove
      free_irq() from ioc3_close.
      
      Fixes: 0ce5ebd2 ("mfd: ioc3: Add driver for SGI IOC3 chip")
      Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9784e619
  4. 04 Feb, 2020 2 commits
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 33b40134
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Use after free in rxrpc_put_local(), from David Howells.
      
       2) Fix 64-bit division error in mlxsw, from Nathan Chancellor.
      
       3) Make sure we clear various bits of TCP state in response to
          tcp_disconnect(). From Eric Dumazet.
      
       4) Fix netlink attribute policy in cls_rsvp, from Eric Dumazet.
      
       5) txtimer must be deleted in stmmac suspend(), from Nicolin Chen.
      
       6) Fix TC queue mapping in bnxt_en driver, from Michael Chan.
      
       7) Various netdevsim fixes from Taehee Yoo (use of uninitialized data,
          snapshot panics, stack out of bounds, etc.)
      
       8) cls_tcindex changes hash table size after allocating the table, fix
          from Cong Wang.
      
       9) Fix regression in the enforcement of session ID uniqueness in l2tp.
          We only have to enforce uniqueness for IP based tunnels not UDP
          ones. From Ridge Kennedy.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (46 commits)
        gtp: use __GFP_NOWARN to avoid memalloc warning
        l2tp: Allow duplicate session creation with UDP
        r8152: Add MAC passthrough support to new device
        net_sched: fix an OOB access in cls_tcindex
        qed: Remove set but not used variable 'p_link'
        tc-testing: add missing 'nsPlugin' to basic.json
        tc-testing: fix eBPF tests failure on linux fresh clones
        net: hsr: fix possible NULL deref in hsr_handle_frame()
        netdevsim: remove unused sdev code
        netdevsim: use __GFP_NOWARN to avoid memalloc warning
        netdevsim: use IS_ERR instead of IS_ERR_OR_NULL for debugfs
        netdevsim: fix stack-out-of-bounds in nsim_dev_debugfs_init()
        netdevsim: fix panic in nsim_dev_take_snapshot_write()
        netdevsim: disable devlink reload when resources are being used
        netdevsim: fix using uninitialized resources
        bnxt_en: Fix TC queue mapping.
        bnxt_en: Fix logic that disables Bus Master during firmware reset.
        bnxt_en: Fix RDMA driver failure with SRIOV after firmware reset.
        bnxt_en: Refactor logic to re-enable SRIOV after firmware reset detected.
        net: stmmac: Delete txtimer in suspend()
        ...
      33b40134
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm · d60ddd24
      Linus Torvalds authored
      Pull ARM updates from Russell King:
      
       - decompressor updates
      
       - prevention of out-of-bounds access while stacktracing
      
       - fix a section mismatch warning with free_memmap()
      
       - make kexec depend on MMU to avoid some build errors
      
       - remove swapops stubs
      
      * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
        ARM: 8954/1: NOMMU: remove stubs for swapops
        ARM: 8952/1: Disable kmemleak on XIP kernels
        ARM: 8951/1: Fix Kexec compilation issue.
        ARM: 8949/1: mm: mark free_memmap as __init
        ARM: 8948/1: Prevent OOB access in stacktrace
        ARM: 8945/1: decompressor: use CONFIG option instead of cc-option
        ARM: 8942/1: Revert "8857/1: efi: enable CP15 DMB instructions before cleaning the cache"
        ARM: 8941/1: decompressor: enable CP15 barrier instructions in v7 cache setup code
      d60ddd24