1. 09 Jan, 2021 7 commits
    • Jakub Kicinski's avatar
      net: make sure devices go through netdev_wait_all_refs · 766b0515
      Jakub Kicinski authored
      If register_netdevice() fails at the very last stage - the
      notifier call - some subsystems may have already seen it and
      grabbed a reference. struct net_device can't be freed right
      away without calling netdev_wait_all_refs().
      
      Now that we have a clean interface in form of dev->needs_free_netdev
      and lenient free_netdev() we can undo what commit 93ee31f1 ("[NET]:
      Fix free_netdev on register_netdev failure.") has done and complete
      the unregistration path by bringing the net_set_todo() call back.
      
      After registration fails user is still expected to explicitly
      free the net_device, so make sure ->needs_free_netdev is cleared,
      otherwise rolling back the registration will cause the old double
      free for callers who release rtnl_lock before the free.
      
      This also solves the problem of priv_destructor not being called
      on notifier error.
      
      net_set_todo() will be moved back into unregister_netdevice_queue()
      in a follow up.
      Reported-by: default avatarHulk Robot <hulkci@huawei.com>
      Reported-by: default avatarYang Yingliang <yangyingliang@huawei.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      766b0515
    • Jakub Kicinski's avatar
      net: make free_netdev() more lenient with unregistering devices · c269a24c
      Jakub Kicinski authored
      There are two flavors of handling netdev registration:
       - ones called without holding rtnl_lock: register_netdev() and
         unregister_netdev(); and
       - those called with rtnl_lock held: register_netdevice() and
         unregister_netdevice().
      
      While the semantics of the former are pretty clear, the same can't
      be said about the latter. The netdev_todo mechanism is utilized to
      perform some of the device unregistering tasks and it hooks into
      rtnl_unlock() so the locked variants can't actually finish the work.
      In general free_netdev() does not mix well with locked calls. Most
      drivers operating under rtnl_lock set dev->needs_free_netdev to true
      and expect core to make the free_netdev() call some time later.
      
      The part where this becomes most problematic is error paths. There is
      no way to unwind the state cleanly after a call to register_netdevice(),
      since unreg can't be performed fully without dropping locks.
      
      Make free_netdev() more lenient, and defer the freeing if device
      is being unregistered. This allows error paths to simply call
      free_netdev() both after register_netdevice() failed, and after
      a call to unregister_netdevice() but before dropping rtnl_lock.
      
      Simplify the error paths which are currently doing gymnastics
      around free_netdev() handling.
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      c269a24c
    • Jakub Kicinski's avatar
      docs: net: explain struct net_device lifetime · 2b446e65
      Jakub Kicinski authored
      Explain the two basic flows of struct net_device's operation.
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      2b446e65
    • Tom Parkin's avatar
      ppp: fix refcount underflow on channel unbridge · c1787ffd
      Tom Parkin authored
      When setting up a channel bridge, ppp_bridge_channels sets the
      pch->bridge field before taking the associated reference on the bridge
      file instance.
      
      This opens up a refcount underflow bug if ppp_bridge_channels called
      via. iotcl runs concurrently with ppp_unbridge_channels executing via.
      file release.
      
      The bug is triggered by ppp_bridge_channels taking the error path
      through the 'err_unset' label.  In this scenario, pch->bridge is set,
      but the reference on the bridged channel will not be taken because
      the function errors out.  If ppp_unbridge_channels observes pch->bridge
      before it is unset by the error path, it will erroneously drop the
      reference on the bridged channel and cause a refcount underflow.
      
      To avoid this, ensure that ppp_bridge_channels holds a reference on
      each channel in advance of setting the bridge pointers.
      Signed-off-by: default avatarTom Parkin <tparkin@katalix.com>
      Fixes: 4cf476ce ("ppp: add PPPIOCBRIDGECHAN and PPPIOCUNBRIDGECHAN ioctls")
      Acked-by: default avatarGuillaume Nault <gnault@redhat.com>
      Link: https://lore.kernel.org/r/20210107181315.3128-1-tparkin@katalix.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      c1787ffd
    • Baptiste Lepers's avatar
      udp: Prevent reuseport_select_sock from reading uninitialized socks · fd2ddef0
      Baptiste Lepers authored
      reuse->socks[] is modified concurrently by reuseport_add_sock. To
      prevent reading values that have not been fully initialized, only read
      the array up until the last known safe index instead of incorrectly
      re-reading the last index of the array.
      
      Fixes: acdcecc6 ("udp: correct reuseport selection with connected sockets")
      Signed-off-by: default avatarBaptiste Lepers <baptiste.lepers@gmail.com>
      Acked-by: default avatarWillem de Bruijn <willemb@google.com>
      Link: https://lore.kernel.org/r/20210107051110.12247-1-baptiste.lepers@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      fd2ddef0
    • Dongseok Yi's avatar
      net: fix use-after-free when UDP GRO with shared fraglist · 53475c5d
      Dongseok Yi authored
      skbs in fraglist could be shared by a BPF filter loaded at TC. If TC
      writes, it will call skb_ensure_writable -> pskb_expand_head to create
      a private linear section for the head_skb. And then call
      skb_clone_fraglist -> skb_get on each skb in the fraglist.
      
      skb_segment_list overwrites part of the skb linear section of each
      fragment itself. Even after skb_clone, the frag_skbs share their
      linear section with their clone in PF_PACKET.
      
      Both sk_receive_queue of PF_PACKET and PF_INET (or PF_INET6) can have
      a link for the same frag_skbs chain. If a new skb (not frags) is
      queued to one of the sk_receive_queue, multiple ptypes can see and
      release this. It causes use-after-free.
      
      [ 4443.426215] ------------[ cut here ]------------
      [ 4443.426222] refcount_t: underflow; use-after-free.
      [ 4443.426291] WARNING: CPU: 7 PID: 28161 at lib/refcount.c:190
      refcount_dec_and_test_checked+0xa4/0xc8
      [ 4443.426726] pstate: 60400005 (nZCv daif +PAN -UAO)
      [ 4443.426732] pc : refcount_dec_and_test_checked+0xa4/0xc8
      [ 4443.426737] lr : refcount_dec_and_test_checked+0xa0/0xc8
      [ 4443.426808] Call trace:
      [ 4443.426813]  refcount_dec_and_test_checked+0xa4/0xc8
      [ 4443.426823]  skb_release_data+0x144/0x264
      [ 4443.426828]  kfree_skb+0x58/0xc4
      [ 4443.426832]  skb_queue_purge+0x64/0x9c
      [ 4443.426844]  packet_set_ring+0x5f0/0x820
      [ 4443.426849]  packet_setsockopt+0x5a4/0xcd0
      [ 4443.426853]  __sys_setsockopt+0x188/0x278
      [ 4443.426858]  __arm64_sys_setsockopt+0x28/0x38
      [ 4443.426869]  el0_svc_common+0xf0/0x1d0
      [ 4443.426873]  el0_svc_handler+0x74/0x98
      [ 4443.426880]  el0_svc+0x8/0xc
      
      Fixes: 3a1296a3 (net: Support GRO/GSO fraglist chaining.)
      Signed-off-by: default avatarDongseok Yi <dseok.yi@samsung.com>
      Acked-by: default avatarWillem de Bruijn <willemb@google.com>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/r/1610072918-174177-1-git-send-email-dseok.yi@samsung.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      53475c5d
    • Stephan Gerhold's avatar
      net: ipa: modem: add missing SET_NETDEV_DEV() for proper sysfs links · afba9dc1
      Stephan Gerhold authored
      At the moment it is quite hard to identify the network interface
      provided by IPA in userspace components: The network interface is
      created as virtual device, without any link to the IPA device.
      The interface name ("rmnet_ipa%d") is the only indication that the
      network interface belongs to IPA, but this is not very reliable.
      
      Add SET_NETDEV_DEV() to associate the network interface with the
      IPA parent device. This allows userspace services like ModemManager
      to properly identify that this network interface is provided by IPA
      and belongs to the modem.
      
      Cc: Alex Elder <elder@kernel.org>
      Fixes: a646d6ec ("soc: qcom: ipa: modem and microcontroller")
      Signed-off-by: default avatarStephan Gerhold <stephan@gerhold.net>
      Link: https://lore.kernel.org/r/20210106100755.56800-1-stephan@gerhold.netSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      afba9dc1
  2. 08 Jan, 2021 25 commits
  3. 07 Jan, 2021 8 commits