1. 16 Mar, 2021 17 commits
  2. 15 Mar, 2021 5 commits
    • mptcp: fix ADD_ADDR HMAC in case port is specified · 13832ae2
      Davide Caratti authored
      Currently, Linux computes the HMAC contained in the ADD_ADDR sub-option using
      the Address ID and the IP address, and hardcodes a destination port equal
      to zero. This is incorrect for ADD_ADDR with a port: account for the
      endpoint port when computing the HMAC, in compliance with RFC8684 §3.4.1.
      
      Fixes: 22fb85ff ("mptcp: add port support for ADD_ADDR suboption writing")
      Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Acked-by: Geliang Tang <geliangtang@gmail.com>
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
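The HMAC input described in the mptcp commit above can be sketched as follows. This is a minimal illustration for the IPv4 case only, assuming the message is the Address ID, the IP address, and a two-octet port in network byte order (zero when no port is announced), as RFC 8684 §3.4.1 describes. The helper name `add_addr4_hmac_msg` and the constant `ADD_ADDR4_MSG_LEN` are hypothetical, and the HMAC computation itself is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 1 octet Address ID + 4 octets IPv4 address + 2 octets port. */
#define ADD_ADDR4_MSG_LEN 7

/*
 * Hypothetical helper: build the byte string that is fed to the HMAC.
 * `port` is in host byte order; pass 0 when the option carries no port.
 * Returns the number of bytes written to `msg`.
 */
static size_t add_addr4_hmac_msg(uint8_t addr_id, const uint8_t ip[4],
                                 uint16_t port,
                                 uint8_t msg[ADD_ADDR4_MSG_LEN])
{
    size_t i = 0;

    msg[i++] = addr_id;
    memcpy(&msg[i], ip, 4);
    i += 4;
    /* The port goes in big-endian, instead of a hardcoded zero. */
    msg[i++] = (uint8_t)(port >> 8);
    msg[i++] = (uint8_t)(port & 0xff);
    return i;
}
```

With a hardcoded zero in the last two octets, the two peers would hash different messages whenever the option carries a port, so the receiver would reject the announcement.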
    • tcp: relookup sock for RST+ACK packets handled by obsolete req sock · 7233da86
      Alexander Ovechkin authored
      Currently tcp_check_req can be called with an obsolete req socket for which
      the big socket has already been created (because of a CPU race or early
      demux assigning the req socket to multiple packets in a GRO batch).
      
      Commit e0f9759f ("tcp: try to keep packet if SYN_RCV race
      is lost") added a retry for the case where tcp_check_req is called for a
      PSH|ACK packet. But if the client sends RST+ACK immediately after the
      connection is established (it is performing a healthcheck, for example),
      the retry does not occur. In that case tcp_check_req tries to close the
      req socket, leaving the big socket active.
      
      Fixes: e0f9759f ("tcp: try to keep packet if SYN_RCV race is lost")
      Signed-off-by: Alexander Ovechkin <ovov@yandex-team.ru>
      Reported-by: Oleg Senin <olegsenin@yandex-team.ru>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: better validate user input in tipc_nl_retrieve_key() · 0217ed28
      Eric Dumazet authored
      Before calling tipc_aead_key_size(ptr), we need to ensure
      we have enough data to dereference ptr->keylen.
      
      We probably also want to make sure tipc_aead_key_size()
      won't overflow with malicious ptr->keylen values.
      
      Syzbot reported:
      
      BUG: KMSAN: uninit-value in __tipc_nl_node_set_key net/tipc/node.c:2971 [inline]
      BUG: KMSAN: uninit-value in tipc_nl_node_set_key+0x9bf/0x13b0 net/tipc/node.c:3023
      CPU: 0 PID: 21060 Comm: syz-executor.5 Not tainted 5.11.0-rc7-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:79 [inline]
       dump_stack+0x21c/0x280 lib/dump_stack.c:120
       kmsan_report+0xfb/0x1e0 mm/kmsan/kmsan_report.c:118
       __msan_warning+0x5f/0xa0 mm/kmsan/kmsan_instr.c:197
       __tipc_nl_node_set_key net/tipc/node.c:2971 [inline]
       tipc_nl_node_set_key+0x9bf/0x13b0 net/tipc/node.c:3023
       genl_family_rcv_msg_doit net/netlink/genetlink.c:739 [inline]
       genl_family_rcv_msg net/netlink/genetlink.c:783 [inline]
       genl_rcv_msg+0x1319/0x1610 net/netlink/genetlink.c:800
       netlink_rcv_skb+0x6fa/0x810 net/netlink/af_netlink.c:2494
       genl_rcv+0x63/0x80 net/netlink/genetlink.c:811
       netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
       netlink_unicast+0x11d6/0x14a0 net/netlink/af_netlink.c:1330
       netlink_sendmsg+0x1740/0x1840 net/netlink/af_netlink.c:1919
       sock_sendmsg_nosec net/socket.c:652 [inline]
       sock_sendmsg net/socket.c:672 [inline]
       ____sys_sendmsg+0xcfc/0x12f0 net/socket.c:2345
       ___sys_sendmsg net/socket.c:2399 [inline]
       __sys_sendmsg+0x714/0x830 net/socket.c:2432
       __compat_sys_sendmsg net/compat.c:347 [inline]
       __do_compat_sys_sendmsg net/compat.c:354 [inline]
       __se_compat_sys_sendmsg+0xa7/0xc0 net/compat.c:351
       __ia32_compat_sys_sendmsg+0x4a/0x70 net/compat.c:351
       do_syscall_32_irqs_on arch/x86/entry/common.c:79 [inline]
       __do_fast_syscall_32+0x102/0x160 arch/x86/entry/common.c:141
       do_fast_syscall_32+0x6a/0xc0 arch/x86/entry/common.c:166
       do_SYSENTER_32+0x73/0x90 arch/x86/entry/common.c:209
       entry_SYSENTER_compat_after_hwframe+0x4d/0x5c
      RIP: 0023:0xf7f60549
      Code: 03 74 c0 01 10 05 03 74 b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00 00 00 00
      RSP: 002b:00000000f555a5fc EFLAGS: 00000296 ORIG_RAX: 0000000000000172
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000020000200
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
      RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
      R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
      
      Uninit was created at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:121 [inline]
       kmsan_internal_poison_shadow+0x5c/0xf0 mm/kmsan/kmsan.c:104
       kmsan_slab_alloc+0x8d/0xe0 mm/kmsan/kmsan_hooks.c:76
       slab_alloc_node mm/slub.c:2907 [inline]
       __kmalloc_node_track_caller+0xa37/0x1430 mm/slub.c:4527
       __kmalloc_reserve net/core/skbuff.c:142 [inline]
       __alloc_skb+0x2f8/0xb30 net/core/skbuff.c:210
       alloc_skb include/linux/skbuff.h:1099 [inline]
       netlink_alloc_large_skb net/netlink/af_netlink.c:1176 [inline]
       netlink_sendmsg+0xdbc/0x1840 net/netlink/af_netlink.c:1894
       sock_sendmsg_nosec net/socket.c:652 [inline]
       sock_sendmsg net/socket.c:672 [inline]
       ____sys_sendmsg+0xcfc/0x12f0 net/socket.c:2345
       ___sys_sendmsg net/socket.c:2399 [inline]
       __sys_sendmsg+0x714/0x830 net/socket.c:2432
       __compat_sys_sendmsg net/compat.c:347 [inline]
       __do_compat_sys_sendmsg net/compat.c:354 [inline]
       __se_compat_sys_sendmsg+0xa7/0xc0 net/compat.c:351
       __ia32_compat_sys_sendmsg+0x4a/0x70 net/compat.c:351
       do_syscall_32_irqs_on arch/x86/entry/common.c:79 [inline]
       __do_fast_syscall_32+0x102/0x160 arch/x86/entry/common.c:141
       do_fast_syscall_32+0x6a/0xc0 arch/x86/entry/common.c:166
       do_SYSENTER_32+0x73/0x90 arch/x86/entry/common.c:209
       entry_SYSENTER_compat_after_hwframe+0x4d/0x5c
      
      Fixes: e1f32190 ("tipc: add support for AEAD key setting via netlink")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Tuong Lien <tuong.t.lien@dektech.com.au>
      Cc: Jon Maloy <jmaloy@redhat.com>
      Cc: Ying Xue <ying.xue@windriver.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
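The two checks the tipc commit above asks for (enough data to read the length field, then a bounded length) can be sketched in plain C. This is an illustration of the validation order only, not the kernel code: `struct key_hdr`, `KEY_MAX_LEN`, and `validate_key` are hypothetical names, and the real structure layout may differ.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define KEY_MAX_LEN 64u /* hypothetical upper bound on key material */

/* Hypothetical wire format: fixed header followed by keylen key bytes. */
struct key_hdr {
    char     alg_name[32];
    uint32_t keylen;
};

/* Return 0 if buf (len bytes, untrusted) holds a complete key, else -1. */
static int validate_key(const void *buf, size_t len)
{
    struct key_hdr hdr;

    /* 1. Do not read keylen unless the whole header is present. */
    if (len < sizeof(hdr))
        return -1;
    memcpy(&hdr, buf, sizeof(hdr));

    /* 2. Bound keylen so the size computation below cannot overflow. */
    if (hdr.keylen > KEY_MAX_LEN)
        return -1;

    /* 3. Only now check that the key bytes themselves are present. */
    if (len < sizeof(hdr) + hdr.keylen)
        return -1;
    return 0;
}
```

Skipping the first check is what produces a read of uninitialized memory like the one in the KMSAN report: the length field is dereferenced even when the sender supplied fewer bytes than the header.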
    • net: phylink: Fix phylink_err() function name error in phylink_major_config · d82c6c1a
      Ong Boon Leong authored
      If pl->mac_ops->mac_finish() fails, phylink_err() should report
      "mac_finish" instead of "mac_prepare".
      
      Fixes: b7ad14c2 ("net: phylink: re-implement interface configuration with PCS")
      Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: hdlc_x25: Prevent racing between "x25_close" and "x25_xmit"/"x25_rx" · bf0ffea3
      Xie He authored
      "x25_close" is called by "hdlc_close" in "hdlc.c", which is called by
      hardware drivers' "ndo_stop" function.
      "x25_xmit" is called by "hdlc_start_xmit" in "hdlc.c", which is hardware
      drivers' "ndo_start_xmit" function.
      "x25_rx" is called by "hdlc_rcv" in "hdlc.c", which receives HDLC frames
      from "net/core/dev.c".
      
      "x25_close" races with "x25_xmit" and "x25_rx" because their callers race.
      
      However, we need to ensure that the LAPB APIs called in "x25_xmit" and
      "x25_rx" are called before "lapb_unregister" is called in "x25_close".
      
      This patch adds locking to ensure when "x25_xmit" and "x25_rx" are doing
      their work, "lapb_unregister" is not yet called in "x25_close".
      
      Reasons for not solving the racing between "x25_close" and "x25_xmit" by
      calling "netif_tx_disable" in "x25_close":
      1. We still need to solve the racing between "x25_close" and "x25_rx";
      2. The design of the HDLC subsystem assumes the HDLC hardware drivers
      have full control over the TX queue, and the HDLC protocol drivers (like
      this driver) have no control. Controlling the queue here in the protocol
      driver may interfere with hardware drivers' control of the queue.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Xie He <xie.he.0141@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
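The locking idea in the hdlc_x25 commit above can be sketched in userspace as a flag guarded by a lock. This is a minimal sketch under stated assumptions: a pthread mutex stands in for whatever kernel lock the patch uses, and `struct proto_state`, its fields, and the three functions are hypothetical names rather than the driver's.

```c
#include <pthread.h>
#include <stdbool.h>

struct proto_state {
    pthread_mutex_t up_lock;
    bool            up;          /* true between open and close */
    int             frames_sent; /* stands in for work done through LAPB */
};

static void proto_open(struct proto_state *st)
{
    pthread_mutex_lock(&st->up_lock);
    st->up = true;
    pthread_mutex_unlock(&st->up_lock);
}

static void proto_close(struct proto_state *st)
{
    pthread_mutex_lock(&st->up_lock);
    st->up = false;
    pthread_mutex_unlock(&st->up_lock);
    /* Only after this point would the lower layer be unregistered. */
}

/* Returns 0 if the frame was handled, -1 if it was dropped. */
static int proto_xmit(struct proto_state *st)
{
    pthread_mutex_lock(&st->up_lock);
    if (!st->up) {
        pthread_mutex_unlock(&st->up_lock);
        return -1; /* closing or closed: do not touch the lower layer */
    }
    st->frames_sent++; /* lower-layer call happens under the lock */
    pthread_mutex_unlock(&st->up_lock);
    return 0;
}
```

Because the transmit path holds the lock while it works, close cannot clear the flag (and go on to unregister) in the middle of a transmit. A receive path would use the same check.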
  3. 14 Mar, 2021 4 commits
    • flow_dissector: fix byteorder of dissected ICMP ID · a25f8222
      Alexander Lobakin authored
      flow_dissector_key_icmp::id is of type u16 (CPU byteorder), while the
      ICMP header carries its ID field in network byteorder.
      Sparse says:
      
      net/core/flow_dissector.c:178:43: warning: restricted __be16 degrades to integer
      
      Convert ID value to CPU byteorder when storing it into
      flow_dissector_key_icmp.
      
      Fixes: 5dec597e ("flow_dissector: extract more ICMP information")
      Signed-off-by: Alexander Lobakin <alobakin@pm.me>
      Signed-off-by: David S. Miller <davem@davemloft.net>
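The byteorder conversion in the flow_dissector commit above can be illustrated with a small userspace sketch. The structures and the function `dissect_icmp` are hypothetical stand-ins, not the kernel's definitions; only `ntohs()` is a real library call.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* ICMP echo header as it appears on the wire. */
struct icmp_echo_hdr {
    uint8_t  type;
    uint8_t  code;
    uint16_t checksum; /* network byte order */
    uint16_t id;       /* network byte order */
    uint16_t seq;      /* network byte order */
};

/* Hypothetical dissected key: id is kept in CPU byte order. */
struct icmp_key {
    uint8_t  type;
    uint8_t  code;
    uint16_t id;
};

static void dissect_icmp(const uint8_t *pkt, struct icmp_key *key)
{
    struct icmp_echo_hdr hdr;

    memcpy(&hdr, pkt, sizeof(hdr));
    key->type = hdr.type;
    key->code = hdr.code;
    /* Convert on store, so later comparisons use CPU byte order. */
    key->id = ntohs(hdr.id);
}
```

Without the conversion, a little-endian host would store the wire bytes 0x12 0x34 as 0x3412, which is the kind of mismatch the sparse warning points at.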
    • net: qrtr: fix a kernel-infoleak in qrtr_recvmsg() · 50535249
      Eric Dumazet authored
      struct sockaddr_qrtr has a 2-byte hole, and qrtr_recvmsg() currently
      does not clear it before copying kernel data to user space.
      
      It might be too late to name the hole since sockaddr_qrtr structure is uapi.
      
      BUG: KMSAN: kernel-infoleak in kmsan_copy_to_user+0x9c/0xb0 mm/kmsan/kmsan_hooks.c:249
      CPU: 0 PID: 29705 Comm: syz-executor.3 Not tainted 5.11.0-rc7-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:79 [inline]
       dump_stack+0x21c/0x280 lib/dump_stack.c:120
       kmsan_report+0xfb/0x1e0 mm/kmsan/kmsan_report.c:118
       kmsan_internal_check_memory+0x202/0x520 mm/kmsan/kmsan.c:402
       kmsan_copy_to_user+0x9c/0xb0 mm/kmsan/kmsan_hooks.c:249
       instrument_copy_to_user include/linux/instrumented.h:121 [inline]
       _copy_to_user+0x1ac/0x270 lib/usercopy.c:33
       copy_to_user include/linux/uaccess.h:209 [inline]
       move_addr_to_user+0x3a2/0x640 net/socket.c:237
       ____sys_recvmsg+0x696/0xd50 net/socket.c:2575
       ___sys_recvmsg net/socket.c:2610 [inline]
       do_recvmmsg+0xa97/0x22d0 net/socket.c:2710
       __sys_recvmmsg net/socket.c:2789 [inline]
       __do_sys_recvmmsg net/socket.c:2812 [inline]
       __se_sys_recvmmsg+0x24a/0x410 net/socket.c:2805
       __x64_sys_recvmmsg+0x62/0x80 net/socket.c:2805
       do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      RIP: 0033:0x465f69
      Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007f43659d6188 EFLAGS: 00000246 ORIG_RAX: 000000000000012b
      RAX: ffffffffffffffda RBX: 000000000056bf60 RCX: 0000000000465f69
      RDX: 0000000000000008 RSI: 0000000020003e40 RDI: 0000000000000003
      RBP: 00000000004bfa8f R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000010060 R11: 0000000000000246 R12: 000000000056bf60
      R13: 0000000000a9fb1f R14: 00007f43659d6300 R15: 0000000000022000
      
      Local variable ----addr@____sys_recvmsg created at:
       ____sys_recvmsg+0x168/0xd50 net/socket.c:2550
       ____sys_recvmsg+0x168/0xd50 net/socket.c:2550
      
      Bytes 2-3 of 12 are uninitialized
      Memory access of size 12 starts at ffff88817c627b40
      Data copied to user address 0000000020000140
      
      Fixes: bdabad3e ("net: Add Qualcomm IPC router")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Courtney Cavin <courtney.cavin@sonymobile.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
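The padding hole described in the qrtr commit above can be reproduced with any structure that places a 2-byte field before a 4-byte one. The sketch below assumes a common ABI where `uint32_t` is 4-byte aligned; `struct addr_with_hole` and `fill_addr` are hypothetical names used for illustration, not the uapi definition.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* On common ABIs there are 2 padding bytes between family and node. */
struct addr_with_hole {
    uint16_t family;
    uint32_t node;
    uint32_t port;
};

static void fill_addr(struct addr_with_hole *a, uint16_t family,
                      uint32_t node, uint32_t port)
{
    /*
     * Zero the whole object first: assigning the members one by one
     * leaves the padding with whatever was in memory before.
     */
    memset(a, 0, sizeof(*a));
    a->family = family;
    a->node = node;
    a->port = port;
}
```

Copying such a structure to another protection domain byte for byte, as a `copy_to_user()` of `sizeof(struct)` bytes does, also copies the padding, which is why it has to be cleared explicitly.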
    • net: arcnet: com20020 fix error handling · 6577b9a5
      Tong Zhang authored
      There are two issues in the error handling of com20020pci_probe():
      
      1. priv might not be initialized yet when com20020pci_remove() is called
      from com20020pci_probe(): priv is set at the very end, but the probe can
      jump to error handling in the middle, leaving priv NULL.
      2. Memory leak: the net device is allocated in alloc_arcdev() but not
      properly released if an error happens in the middle of the big for loop.
      
      [    1.529110] BUG: kernel NULL pointer dereference, address: 0000000000000008
      [    1.531447] RIP: 0010:com20020pci_remove+0x15/0x60 [com20020_pci]
      [    1.536805] Call Trace:
      [    1.536939]  com20020pci_probe+0x3f2/0x48c [com20020_pci]
      [    1.537226]  local_pci_probe+0x48/0x80
      [    1.539918]  com20020pci_init+0x3f/0x1000 [com20020_pci]
      Signed-off-by: Tong Zhang <ztong0001@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • devlink: fix typo in documentation · ad236ccd
      Eva Dengler authored
      This commit fixes three spelling typos in devlink-dpipe.rst and
      devlink-port.rst.
      Signed-off-by: Eva Dengler <eva.dengler@fau.de>
      Acked-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 13 Mar, 2021 9 commits
  5. 12 Mar, 2021 5 commits
    • net: correct sk_acceptq_is_full() · f211ac15
      liuyacan authored
      The "backlog" argument in listen() specifies
      the maximum length of the queue of pending connections,
      so the accept queue should be considered full
      if there are exactly "backlog" elements.
      Signed-off-by: liuyacan <yacanliu@163.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
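The off-by-one in the sk_acceptq_is_full() commit above comes down to a single comparison operator. The sketch below uses a hypothetical `struct listener` and two hypothetical functions to contrast a strict comparison with the inclusive one the message argues for; it is not the kernel's definition.

```c
#include <stdbool.h>

struct listener {
    unsigned int ack_backlog;     /* connections currently queued */
    unsigned int max_ack_backlog; /* the "backlog" passed to listen() */
};

/* Strict comparison: reports "not full" with exactly backlog entries. */
static bool acceptq_is_full_strict(const struct listener *l)
{
    return l->ack_backlog > l->max_ack_backlog;
}

/* Inclusive comparison: full as soon as backlog entries are queued. */
static bool acceptq_is_full(const struct listener *l)
{
    return l->ack_backlog >= l->max_ack_backlog;
}
```

With a backlog of 3 and 3 queued connections, the strict version still admits one more connection, so the queue can grow to backlog + 1 entries.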
    • Revert "net: bonding: fix error return code of bond_neigh_init()" · 080bfa1e
      David S. Miller authored
      This reverts commit 2055a99d.
      
      This change rejects legitimate configurations.
      
      A slave doesn't need to exist nor implement ndo_slave_setup.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • igb: avoid premature Rx buffer reuse · 98dfb02a
      Li RongQing authored
      igb needs a fix similar to commit 75aab4e1 ("i40e: avoid
      premature Rx buffer reuse").
      
      The page recycle code incorrectly relied on the assumption that a page
      fragment could not be freed inside xdp_do_redirect(). As a result, page
      fragments that are still used by the stack/XDP redirect can be
      reused and overwritten.
      
      To avoid this, store the page count prior to invoking xdp_do_redirect().
      
      Longer explanation:
      
      Intel NICs have a recycle mechanism. The main idea is that a page is
      split into two parts. One part is owned by the driver, one part might
      be owned by someone else, such as the stack.
      
      t0: Page is allocated, and put on the Rx ring
                    +---------------
      used by NIC ->| upper buffer
      (rx_buffer)   +---------------
                    | lower buffer
                    +---------------
        page count  == USHRT_MAX
        rx_buffer->pagecnt_bias == USHRT_MAX
      
      t1: Buffer is received, and passed to the stack (e.g.)
                    +---------------
                    | upper buff (skb)
                    +---------------
      used by NIC ->| lower buffer
      (rx_buffer)   +---------------
        page count  == USHRT_MAX
        rx_buffer->pagecnt_bias == USHRT_MAX - 1
      
      t2: Buffer is received, and redirected
                    +---------------
                    | upper buff (skb)
                    +---------------
      used by NIC ->| lower buffer
      (rx_buffer)   +---------------
      
      Now, prior to calling xdp_do_redirect():
        page count  == USHRT_MAX
        rx_buffer->pagecnt_bias == USHRT_MAX - 2
      
      This means that buffer *cannot* be flipped/reused, because the skb is
      still using it.
      
      The problem arises when xdp_do_redirect() actually frees the
      segment. Then we get:
        page count  == USHRT_MAX - 1
        rx_buffer->pagecnt_bias == USHRT_MAX - 2
      
      From a recycle perspective, the buffer can be flipped and reused,
      which means that the skb data area is passed to the Rx HW ring!
      
      To work around this, the page count is stored prior to calling
      xdp_do_redirect().
      
      Fixes: 9cbc948b ("igb: add XDP support")
      Signed-off-by: Li RongQing <lirongqing@baidu.com>
      Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
      Tested-by: Vishakha Jambekar <vishakha.jambekar@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
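The arithmetic in the igb commit above can be replayed with the numbers from the t2 step. The sketch assumes the reuse test is "page count minus pagecnt_bias is at most 1", which matches the values in the message. `struct rx_buffer` and `can_reuse` here are simplified, hypothetical stand-ins for the driver's code.

```c
#include <limits.h>
#include <stdbool.h>

struct rx_buffer {
    unsigned int pagecnt_bias;
};

/*
 * Assumed reuse rule: the buffer may be flipped and reused only if at
 * most one reference beyond the driver's bias is outstanding.
 */
static bool can_reuse(const struct rx_buffer *b, unsigned int page_count)
{
    return (page_count - b->pagecnt_bias) <= 1;
}

/* Page count sampled before the redirect (what the fix stores). */
static unsigned int count_before_redirect(void)
{
    return USHRT_MAX;
}

/* Page count read after xdp_do_redirect() has freed the segment. */
static unsigned int count_after_redirect(void)
{
    return USHRT_MAX - 1;
}
```

With `pagecnt_bias == USHRT_MAX - 2`, the count sampled before the redirect gives a difference of 2 and blocks reuse. The count read afterwards gives a difference of 1 and allows it while the skb still owns the other half of the page.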
    • ixgbe: move headroom initialization to ixgbe_configure_rx_ring · 76064573
      Maciej Fijalkowski authored
      ixgbe_rx_offset(), which is supposed to initialize the Rx buffer headroom,
      relies on the __IXGBE_RX_BUILD_SKB_ENABLED flag.
      
      Currently, the call site of this function is placed incorrectly
      within ixgbe_setup_rx_resources(), where the Rx ring's build skb flag is
      not set yet. This causes XDP_REDIRECT to be partially broken due to the
      inability to create an xdp_frame in the headroom space, as the headroom
      is 0.
      
      Fix this by moving ixgbe_rx_offset() to ixgbe_configure_rx_ring(), after
      the flag is set, which happens in ixgbe_set_rx_buffer_len().
      
      Fixes: c0d4e9d2 ("ixgbe: store the result of ixgbe_rx_offset() onto ixgbe_ring")
      Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
      Tested-by: Vishakha Jambekar <vishakha.jambekar@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
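The ordering problem in the ixgbe commit above (the same pattern applies to the ice commit that follows) can be shown with a toy ring structure. Everything here is hypothetical: the flag name `RING_BUILD_SKB`, the headroom value `SKB_HEADROOM`, and the functions only illustrate computing a flag-dependent value before and after the flag is set.

```c
#define RING_BUILD_SKB 0x1u
#define SKB_HEADROOM   192u /* hypothetical headroom in bytes */

struct rx_ring {
    unsigned int flags;
    unsigned int rx_offset; /* cached headroom */
};

/* Headroom depends on the build-skb flag being set already. */
static unsigned int rx_offset(const struct rx_ring *r)
{
    return (r->flags & RING_BUILD_SKB) ? SKB_HEADROOM : 0;
}

/* Buggy order: cache the offset first, set the flag afterwards. */
static void init_ring_offset_first(struct rx_ring *r)
{
    r->rx_offset = rx_offset(r);
    r->flags |= RING_BUILD_SKB;
}

/* Fixed order: set the flag, then cache the offset. */
static void init_ring_flag_first(struct rx_ring *r)
{
    r->flags |= RING_BUILD_SKB;
    r->rx_offset = rx_offset(r);
}
```

In the buggy order the cached headroom stays 0 even though the flag ends up set, which leaves no room in front of the packet data for an xdp_frame.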
    • ice: move headroom initialization to ice_setup_rx_ctx · 89861c48
      Maciej Fijalkowski authored
      ice_rx_offset(), which is supposed to initialize the Rx buffer headroom,
      relies on the ICE_RX_FLAGS_RING_BUILD_SKB flag as well as on XDP prog
      presence.
      
      Currently, the call site of this function is placed incorrectly
      within ice_setup_rx_ring(), where the Rx ring's build skb flag is not
      set yet. This causes XDP_REDIRECT to be partially broken due to the
      inability to create an xdp_frame in the headroom space, as the headroom
      is 0.
      
      Fix this by moving ice_rx_offset() to ice_setup_rx_ctx(), after the flag
      is set.
      
      Fixes: f1b1f409 ("ice: store the result of ice_rx_offset() onto ice_ring")
      Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
      Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>