1. 19 Aug, 2021 7 commits
    • Jakub Kicinski's avatar
      Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 31674900
      Jakub Kicinski authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2021-08-19
      
      We've added 3 non-merge commits during the last 3 day(s) which contain
      a total of 3 files changed, 29 insertions(+), 6 deletions(-).
      
      The main changes are:
      
      1) Fix to clear zext_dst for dead instructions which was causing invalid program
         rejections on JITs with bpf_jit_needs_zext such as s390x, from Ilya Leoshkevich.
      
      2) Fix RCU splat in bpf_get_current_{ancestor_,}cgroup_id() helpers when they are
         invoked from sleepable programs, from Yonghong Song.
      
      * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
        selftests, bpf: Test that dead ldx_w insns are accepted
        bpf: Clear zext_dst of dead insns
        bpf: Add rcu_read_lock in bpf_get_current_[ancestor_]cgroup_id() helpers
      ====================
      
      Link: https://lore.kernel.org/r/20210819144904.20069-1-daniel@iogearbox.netSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      31674900
    • David S. Miller's avatar
      Merge branch 'r8152-bp-settings' · c15128c9
      David S. Miller authored
      Hayes Wang says:
      
      ====================
      r8152: fix bp settings
      
      Fix the wrong bp settings of the firmware.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c15128c9
    • Hayes Wang's avatar
      r8152: fix the maximum number of PLA bp for RTL8153C · 6633fb83
      Hayes Wang authored
      The maximum PLA bp number of RTL8153C is 16, not 8. That is, the
      bp 0 ~ 15 are at 0xfc28 ~ 0xfc46, and the bp_en is at 0xfc48.
      
      Fixes: 195aae32 ("r8152: support new chips")
      Signed-off-by: default avatarHayes Wang <hayeswang@realtek.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6633fb83
    • Hayes Wang's avatar
      r8152: fix writing USB_BP2_EN · a876a33d
      Hayes Wang authored
      The register of USB_BP2_EN is 16 bits, so we should use
      ocp_write_word(), not ocp_write_byte().
      
      Fixes: 9370f2d0 ("support request_firmware for RTL8153")
      Signed-off-by: default avatarHayes Wang <hayeswang@realtek.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a876a33d
    • David S. Miller's avatar
      Merge branch 'mptcp-fixes' · d98c8210
      David S. Miller authored
      Mat Martineau says:
      
      ====================
      mptcp: Bug fixes
      
      Here are two bug fixes for the net tree:
      
      Patch 1 fixes a memory leak that could be encountered when clearing the
      list of advertised MPTCP addresses.
      
      Patch 2 fixes a protocol issue early in an MPTCP connection, to ensure
      both peers correctly understand that the full MPTCP connection handshake
      has completed even when the server side quickly sends an ADD_ADDR
      option.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d98c8210
    • Matthieu Baerts's avatar
      mptcp: full fully established support after ADD_ADDR · 67b12f79
      Matthieu Baerts authored
      If directly after an MP_CAPABLE 3WHS, the client receives an ADD_ADDR
      with HMAC from the server, it is enough to switch to a "fully
      established" mode because it has received more MPTCP options.
      
      It was then OK to enable the "fully_established" flag on the MPTCP
      socket. Still, best to check if the ADD_ADDR looks valid by looking if
      it contains an HMAC (no 'echo' bit). If an ADD_ADDR echo is received
      while we are not in "fully established" mode, it is strange and then
      we should not switch to this mode now.
      
      But that is not enough. On one hand, the path-manager has be notified
      the state has changed. On the other hand, the "fully_established" flag
      on the subflow socket should be turned on as well not to re-send the
      MP_CAPABLE 3rd ACK content with the next ACK.
      
      Fixes: 84dfe367 ("mptcp: send out dedicated ADD_ADDR packet")
      Signed-off-by: default avatarMatthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: default avatarMat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      67b12f79
    • Paolo Abeni's avatar
      mptcp: fix memory leak on address flush · a0eea5f1
      Paolo Abeni authored
      The endpoint cleanup path is prone to a memory leak, as reported
      by syzkaller:
      
       BUG: memory leak
       unreferenced object 0xffff88810680ea00 (size 64):
         comm "syz-executor.6", pid 6191, jiffies 4295756280 (age 24.138s)
         hex dump (first 32 bytes):
           58 75 7d 3c 80 88 ff ff 22 01 00 00 00 00 ad de  Xu}<....".......
           01 00 02 00 00 00 00 00 ac 1e 00 07 00 00 00 00  ................
         backtrace:
           [<0000000072a9f72a>] kmalloc include/linux/slab.h:591 [inline]
           [<0000000072a9f72a>] mptcp_nl_cmd_add_addr+0x287/0x9f0 net/mptcp/pm_netlink.c:1170
           [<00000000f6e931bf>] genl_family_rcv_msg_doit.isra.0+0x225/0x340 net/netlink/genetlink.c:731
           [<00000000f1504a2c>] genl_family_rcv_msg net/netlink/genetlink.c:775 [inline]
           [<00000000f1504a2c>] genl_rcv_msg+0x341/0x5b0 net/netlink/genetlink.c:792
           [<0000000097e76f6a>] netlink_rcv_skb+0x148/0x430 net/netlink/af_netlink.c:2504
           [<00000000ceefa2b8>] genl_rcv+0x24/0x40 net/netlink/genetlink.c:803
           [<000000008ff91aec>] netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline]
           [<000000008ff91aec>] netlink_unicast+0x537/0x750 net/netlink/af_netlink.c:1340
           [<0000000041682c35>] netlink_sendmsg+0x846/0xd80 net/netlink/af_netlink.c:1929
           [<00000000df3aa8e7>] sock_sendmsg_nosec net/socket.c:704 [inline]
           [<00000000df3aa8e7>] sock_sendmsg+0x14e/0x190 net/socket.c:724
           [<000000002154c54c>] ____sys_sendmsg+0x709/0x870 net/socket.c:2403
           [<000000001aab01d7>] ___sys_sendmsg+0xff/0x170 net/socket.c:2457
           [<00000000fa3b1446>] __sys_sendmsg+0xe5/0x1b0 net/socket.c:2486
           [<00000000db2ee9c7>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
           [<00000000db2ee9c7>] do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
           [<000000005873517d>] entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      We should not require an allocation to cleanup stuff.
      
      Rework the code a bit so that the additional RCU work is no more needed.
      
      Fixes: 1729cf18 ("mptcp: create the listening socket for new port")
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarMat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a0eea5f1
  2. 18 Aug, 2021 11 commits
  3. 17 Aug, 2021 4 commits
  4. 16 Aug, 2021 10 commits
    • Lahav Schlesinger's avatar
      vrf: Reset skb conntrack connection on VRF rcv · 09e856d5
      Lahav Schlesinger authored
      To fix the "reverse-NAT" for replies.
      
      When a packet is sent over a VRF, the POST_ROUTING hooks are called
      twice: Once from the VRF interface, and once from the "actual"
      interface the packet will be sent from:
      1) First SNAT: l3mdev_l3_out() -> vrf_l3_out() -> .. -> vrf_output_direct()
           This causes the POST_ROUTING hooks to run.
      2) Second SNAT: 'ip_output()' calls POST_ROUTING hooks again.
      
      Similarly for replies, first ip_rcv() calls PRE_ROUTING hooks, and
      second vrf_l3_rcv() calls them again.
      
      As an example, consider the following SNAT rule:
      > iptables -t nat -A POSTROUTING -p udp -m udp --dport 53 -j SNAT --to-source 2.2.2.2 -o vrf_1
      
      In this case sending over a VRF will create 2 conntrack entries.
      The first is from the VRF interface, which performs the IP SNAT.
      The second will run the SNAT, but since the "expected reply" will remain
      the same, conntrack randomizes the source port of the packet:
      e..g With a socket bound to 1.1.1.1:10000, sending to 3.3.3.3:53, the conntrack
      rules are:
      udp      17 29 src=2.2.2.2 dst=3.3.3.3 sport=10000 dport=53 packets=1 bytes=68 [UNREPLIED] src=3.3.3.3 dst=2.2.2.2 sport=53 dport=61033 packets=0 bytes=0 mark=0 use=1
      udp      17 29 src=1.1.1.1 dst=3.3.3.3 sport=10000 dport=53 packets=1 bytes=68 [UNREPLIED] src=3.3.3.3 dst=2.2.2.2 sport=53 dport=10000 packets=0 bytes=0 mark=0 use=1
      
      i.e. First SNAT IP from 1.1.1.1 --> 2.2.2.2, and second the src port is
      SNAT-ed from 10000 --> 61033.
      
      But when a reply is sent (3.3.3.3:53 -> 2.2.2.2:61033) only the later
      conntrack entry is matched:
      udp      17 29 src=2.2.2.2 dst=3.3.3.3 sport=10000 dport=53 packets=1 bytes=68 src=3.3.3.3 dst=2.2.2.2 sport=53 dport=61033 packets=1 bytes=49 mark=0 use=1
      udp      17 28 src=1.1.1.1 dst=3.3.3.3 sport=10000 dport=53 packets=1 bytes=68 [UNREPLIED] src=3.3.3.3 dst=2.2.2.2 sport=53 dport=10000 packets=0 bytes=0 mark=0 use=1
      
      And a "port 61033 unreachable" ICMP packet is sent back.
      
      The issue is that when PRE_ROUTING hooks are called from vrf_l3_rcv(),
      the skb already has a conntrack flow attached to it, which means
      nf_conntrack_in() will not resolve the flow again.
      
      This means only the dest port is "reverse-NATed" (61033 -> 10000) but
      the dest IP remains 2.2.2.2, and since the socket is bound to 1.1.1.1 it's
      not received.
      This can be verified by logging the 4-tuple of the packet in '__udp4_lib_rcv()'.
      
      The fix is then to reset the flow when skb is received on a VRF, to let
      conntrack resolve the flow again (which now will hit the earlier flow).
      
      To reproduce: (Without the fix "Got pkt_to_nat_port" will not be printed by
        running 'bash ./repro'):
        $ cat run_in_A1.py
        import logging
        logging.getLogger("scapy.runtime").setLevel(logging.ERROR)
        from scapy.all import *
        import argparse
      
        def get_packet_to_send(udp_dst_port, msg_name):
            return Ether(src='11:22:33:44:55:66', dst=iface_mac)/ \
                IP(src='3.3.3.3', dst='2.2.2.2')/ \
                UDP(sport=53, dport=udp_dst_port)/ \
                Raw(f'{msg_name}\x0012345678901234567890')
      
        parser = argparse.ArgumentParser()
        parser.add_argument('-iface_mac', dest="iface_mac", type=str, required=True,
                            help="From run_in_A3.py")
        parser.add_argument('-socket_port', dest="socket_port", type=str,
                            required=True, help="From run_in_A3.py")
        parser.add_argument('-v1_mac', dest="v1_mac", type=str, required=True,
                            help="From script")
      
        args, _ = parser.parse_known_args()
        iface_mac = args.iface_mac
        socket_port = int(args.socket_port)
        v1_mac = args.v1_mac
      
        print(f'Source port before NAT: {socket_port}')
      
        while True:
            pkts = sniff(iface='_v0', store=True, count=1, timeout=10)
            if 0 == len(pkts):
                print('Something failed, rerun the script :(', flush=True)
                break
            pkt = pkts[0]
            if not pkt.haslayer('UDP'):
                continue
      
            pkt_sport = pkt.getlayer('UDP').sport
            print(f'Source port after NAT: {pkt_sport}', flush=True)
      
            pkt_to_send = get_packet_to_send(pkt_sport, 'pkt_to_nat_port')
            sendp(pkt_to_send, '_v0', verbose=False) # Will not be received
      
            pkt_to_send = get_packet_to_send(socket_port, 'pkt_to_socket_port')
            sendp(pkt_to_send, '_v0', verbose=False)
            break
      
        $ cat run_in_A2.py
        import socket
        import netifaces
      
        print(f"{netifaces.ifaddresses('e00000')[netifaces.AF_LINK][0]['addr']}",
              flush=True)
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BINDTODEVICE,
                     str('vrf_1' + '\0').encode('utf-8'))
        s.connect(('3.3.3.3', 53))
        print(f'{s. getsockname()[1]}', flush=True)
        s.settimeout(5)
      
        while True:
            try:
                # Periodically send in order to keep the conntrack entry alive.
                s.send(b'a'*40)
                resp = s.recvfrom(1024)
                msg_name = resp[0].decode('utf-8').split('\0')[0]
                print(f"Got {msg_name}", flush=True)
            except Exception as e:
                pass
      
        $ cat repro.sh
        ip netns del A1 2> /dev/null
        ip netns del A2 2> /dev/null
        ip netns add A1
        ip netns add A2
      
        ip -n A1 link add _v0 type veth peer name _v1 netns A2
        ip -n A1 link set _v0 up
      
        ip -n A2 link add e00000 type bond
        ip -n A2 link add lo0 type dummy
        ip -n A2 link add vrf_1 type vrf table 10001
        ip -n A2 link set vrf_1 up
        ip -n A2 link set e00000 master vrf_1
      
        ip -n A2 addr add 1.1.1.1/24 dev e00000
        ip -n A2 link set e00000 up
        ip -n A2 link set _v1 master e00000
        ip -n A2 link set _v1 up
        ip -n A2 link set lo0 up
        ip -n A2 addr add 2.2.2.2/32 dev lo0
      
        ip -n A2 neigh add 1.1.1.10 lladdr 77:77:77:77:77:77 dev e00000
        ip -n A2 route add 3.3.3.3/32 via 1.1.1.10 dev e00000 table 10001
      
        ip netns exec A2 iptables -t nat -A POSTROUTING -p udp -m udp --dport 53 -j \
      	SNAT --to-source 2.2.2.2 -o vrf_1
      
        sleep 5
        ip netns exec A2 python3 run_in_A2.py > x &
        XPID=$!
        sleep 5
      
        IFACE_MAC=`sed -n 1p x`
        SOCKET_PORT=`sed -n 2p x`
        V1_MAC=`ip -n A2 link show _v1 | sed -n 2p | awk '{print $2'}`
        ip netns exec A1 python3 run_in_A1.py -iface_mac ${IFACE_MAC} -socket_port \
                ${SOCKET_PORT} -v1_mac ${SOCKET_PORT}
        sleep 5
      
        kill -9 $XPID
        wait $XPID 2> /dev/null
        ip netns del A1
        ip netns del A2
        tail x -n 2
        rm x
        set +x
      
      Fixes: 73e20b76 ("net: vrf: Add support for PREROUTING rules on vrf device")
      Signed-off-by: default avatarLahav Schlesinger <lschlesinger@drivenets.com>
      Reviewed-by: default avatarDavid Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20210815120002.2787653-1-lschlesinger@drivenets.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      09e856d5
    • Dan Carpenter's avatar
      net: iosm: Prevent underflow in ipc_chnl_cfg_get() · 4f3f2e3f
      Dan Carpenter authored
      The bounds check on "index" doesn't catch negative values.  Using
      ARRAY_SIZE() directly is more readable and more robust because it prevents
      negative values for "index".  Fortunately we only pass valid values to
      ipc_chnl_cfg_get() so this patch does not affect runtime.
      Reported-by: default avatarSolomon Ucko <solly.ucko@gmail.com>
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: default avatarM Chetan Kumar <m.chetan.kumar@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4f3f2e3f
    • David S. Miller's avatar
      Merge branch 'bnxt_en-fixes' · 517c54d2
      David S. Miller authored
      Michael Chan says:
      
      ====================
      bnxt_en: 2 bug fixes
      
      The first one disables aRFS/NTUPLE on an older broken firmware version.
      The second one adds missing memory barriers related to completion ring
      handling.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      517c54d2
    • Michael Chan's avatar
      bnxt_en: Add missing DMA memory barriers · 828affc2
      Michael Chan authored
      Each completion ring entry has a valid bit to indicate that the entry
      contains a valid completion event.  The driver's main poll loop
      __bnxt_poll_work() has the proper dma_rmb() to make sure the valid
      bit of the next entry has been checked before proceeding further.
      But when we call bnxt_rx_pkt() to process the RX event, the RX
      completion event consists of two completion entries and only the
      first entry has been checked to be valid.  We need the same barrier
      after checking the next completion entry.  Add missing dma_rmb()
      barriers in bnxt_rx_pkt() and other similar locations.
      
      Fixes: 67a95e20 ("bnxt_en: Need memory barrier when processing the completion ring.")
      Reported-by: default avatarLance Richardson <lance.richardson@broadcom.com>
      Reviewed-by: default avatarAndy Gospodarek <gospo@broadcom.com>
      Reviewed-by: default avatarLance Richardson <lance.richardson@broadcom.com>
      Signed-off-by: default avatarMichael Chan <michael.chan@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      828affc2
    • Michael Chan's avatar
      bnxt_en: Disable aRFS if running on 212 firmware · 976e52b7
      Michael Chan authored
      212 firmware broke aRFS, so disable it.  Traffic may stop after ntuple
      filters are inserted and deleted by the 212 firmware.
      
      Fixes: ae10ae74 ("bnxt_en: Add new hardware RFS mode.")
      Reviewed-by: default avatarPavan Chebbi <pavan.chebbi@broadcom.com>
      Signed-off-by: default avatarMichael Chan <michael.chan@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      976e52b7
    • Shai Malin's avatar
      qed: Fix null-pointer dereference in qed_rdma_create_qp() · d33d19d3
      Shai Malin authored
      Fix a possible null-pointer dereference in qed_rdma_create_qp().
      
      Changes from V2:
      - Revert checkpatch fixes.
      Reported-by: default avatarTOTE Robot <oslab@tsinghua.edu.cn>
      Signed-off-by: default avatarAriel Elior <aelior@marvell.com>
      Signed-off-by: default avatarShai Malin <smalin@marvell.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d33d19d3
    • Shai Malin's avatar
      qed: qed ll2 race condition fixes · 37110237
      Shai Malin authored
      Avoiding qed ll2 race condition and NULL pointer dereference as part
      of the remove and recovery flows.
      
      Changes form V1:
      - Change (!p_rx->set_prod_addr).
      - qed_ll2.c checkpatch fixes.
      
      Change from V2:
      - Revert "qed_ll2.c checkpatch fixes".
      Signed-off-by: default avatarAriel Elior <aelior@marvell.com>
      Signed-off-by: default avatarShai Malin <smalin@marvell.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      37110237
    • Xin Long's avatar
      tipc: call tipc_wait_for_connect only when dlen is not 0 · 7387a72c
      Xin Long authored
      __tipc_sendmsg() is called to send SYN packet by either tipc_sendmsg()
      or tipc_connect(). The difference is in tipc_connect(), it will call
      tipc_wait_for_connect() after __tipc_sendmsg() to wait until connecting
      is done. So there's no need to wait in __tipc_sendmsg() for this case.
      
      This patch is to fix it by calling tipc_wait_for_connect() only when dlen
      is not 0 in __tipc_sendmsg(), which means it's called by tipc_connect().
      
      Note this also fixes the failure in tipcutils/test/ptts/:
      
        # ./tipcTS &
        # ./tipcTC 9
        (hang)
      
      Fixes: 36239dab6da7 ("tipc: fix implicit-connect for SYN+")
      Reported-by: default avatarShuang Li <shuali@redhat.com>
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Acked-by: default avatarJon Maloy <jmaloy@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7387a72c
    • Andy Shevchenko's avatar
      ptp_pch: Restore dependency on PCI · 55c8fca1
      Andy Shevchenko authored
      During the swap dependency on PCH_GBE to selection PTP_1588_CLOCK_PCH
      incidentally dropped the implicit dependency on the PCI. Restore it.
      
      Fixes: 18d359ce ("pch_gbe, ptp_pch: Fix the dependency direction between these drivers")
      Reported-by: default avatarkernel test robot <lkp@intel.com>
      Signed-off-by: default avatarAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      55c8fca1
    • Pavel Skripkin's avatar
      net: 6pack: fix slab-out-of-bounds in decode_data · 19d1532a
      Pavel Skripkin authored
      Syzbot reported slab-out-of bounds write in decode_data().
      The problem was in missing validation checks.
      
      Syzbot's reproducer generated malicious input, which caused
      decode_data() to be called a lot in sixpack_decode(). Since
      rx_count_cooked is only 400 bytes and noone reported before,
      that 400 bytes is not enough, let's just check if input is malicious
      and complain about buffer overrun.
      
      Fail log:
      ==================================================================
      BUG: KASAN: slab-out-of-bounds in drivers/net/hamradio/6pack.c:843
      Write of size 1 at addr ffff888087c5544e by task kworker/u4:0/7
      
      CPU: 0 PID: 7 Comm: kworker/u4:0 Not tainted 5.6.0-rc3-syzkaller #0
      ...
      Workqueue: events_unbound flush_to_ldisc
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x197/0x210 lib/dump_stack.c:118
       print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374
       __kasan_report.cold+0x1b/0x32 mm/kasan/report.c:506
       kasan_report+0x12/0x20 mm/kasan/common.c:641
       __asan_report_store1_noabort+0x17/0x20 mm/kasan/generic_report.c:137
       decode_data.part.0+0x23b/0x270 drivers/net/hamradio/6pack.c:843
       decode_data drivers/net/hamradio/6pack.c:965 [inline]
       sixpack_decode drivers/net/hamradio/6pack.c:968 [inline]
      
      Reported-and-tested-by: syzbot+fc8cd9a673d4577fb2e4@syzkaller.appspotmail.com
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarPavel Skripkin <paskripkin@gmail.com>
      Reviewed-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      19d1532a
  5. 14 Aug, 2021 1 commit
  6. 13 Aug, 2021 7 commits