1. 09 Aug, 2022 5 commits
  2. 08 Aug, 2022 2 commits
  3. 06 Aug, 2022 16 commits
  4. 05 Aug, 2022 6 commits
    • selftests: add few test cases for tap driver · 2e64fe46
      Cezar Bulinaru authored
      Add a few test cases related to the fix in 924a9bc3:
      "net: check if protocol extracted by virtio_net_hdr_set_proto is correct"
      
      A test is needed for the case where a non-standard packet (GSO without
      NEEDS_CSUM) sent to the tap device causes a BUG check in the tap driver.
      Signed-off-by: Cezar Bulinaru <cbulinaru@gmail.com>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: tap: NULL pointer derefence in dev_parse_header_protocol when skb->dev is null · 4f61f133
      Cezar Bulinaru authored
      Fixes a NULL pointer dereference bug triggered from the tap driver.
      When tap_get_user calls virtio_net_hdr_to_skb, skb->dev is NULL
      (in tap.c, skb->dev is set after the call to virtio_net_hdr_to_skb).
      virtio_net_hdr_to_skb calls dev_parse_header_protocol, which
      needs the skb->dev field to be valid.
      
      The line that triggers the bug is in dev_parse_header_protocol
      (dev is at offset 0x10 from skb and is stored in the RAX register):
        if (!dev->header_ops || !dev->header_ops->parse_protocol)
        22e1:   mov    0x10(%rbx),%rax
        22e5:   mov    0x230(%rax),%rax
      
      Setting skb->dev before the call in tap.c fixes the issue.
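
The ordering problem described above can be illustrated with a small user-space model. This is only a sketch, not the kernel code: the `*_model` structures and the `get_user_before_fix` / `get_user_after_fix` functions are hypothetical stand-ins, and the model returns -1 at the point where the kernel would dereference a NULL `dev`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; not the kernel's real definitions. */
struct header_ops_model {
	int (*parse_protocol)(void);
};

struct net_device_model {
	const struct header_ops_model *header_ops;
};

struct sk_buff_model {
	struct net_device_model *dev;
};

/*
 * Models dev_parse_header_protocol(): it reads dev->header_ops, so
 * skb->dev must be valid. The model returns -1 where the kernel would
 * dereference NULL.
 */
static int parse_protocol_model(const struct sk_buff_model *skb)
{
	const struct net_device_model *dev = skb->dev;

	if (!dev)
		return -1; /* the kernel oopses at this point */
	if (!dev->header_ops || !dev->header_ops->parse_protocol)
		return 0;
	return dev->header_ops->parse_protocol();
}

/* Ordering before the fix: parse first, assign skb->dev afterwards. */
static int get_user_before_fix(struct sk_buff_model *skb,
			       struct net_device_model *tap_dev)
{
	int ret = parse_protocol_model(skb);

	skb->dev = tap_dev;
	return ret;
}

/* Ordering after the fix: assign skb->dev, then parse. */
static int get_user_after_fix(struct sk_buff_model *skb,
			      struct net_device_model *tap_dev)
{
	assert(tap_dev != NULL);
	skb->dev = tap_dev;
	return parse_protocol_model(skb);
}
```

In this model the pre-fix ordering hits the NULL case, while the post-fix ordering reaches the `header_ops` check with a valid device.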
      
      BUG: kernel NULL pointer dereference, address: 0000000000000230
      RIP: 0010:virtio_net_hdr_to_skb.constprop.0+0x335/0x410 [tap]
      Code: c0 0f 85 b7 fd ff ff eb d4 41 39 c6 77 cf 29 c6 48 89 df 44 01 f6 e8 7a 79 83 c1 48 85 c0 0f 85 d9 fd ff ff eb b7 48 8b 43 10 <48> 8b 80 30 02 00 00 48 85 c0 74 55 48 8b 40 28 48 85 c0 74 4c 48
      RSP: 0018:ffffc90005c27c38 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: ffff888298f25300 RCX: 0000000000000010
      RDX: 0000000000000005 RSI: ffffc90005c27cb6 RDI: ffff888298f25300
      RBP: ffffc90005c27c80 R08: 00000000ffffffea R09: 00000000000007e8
      R10: ffff88858ec77458 R11: 0000000000000000 R12: 0000000000000001
      R13: 0000000000000014 R14: ffffc90005c27e08 R15: ffffc90005c27cb6
      FS:  0000000000000000(0000) GS:ffff88858ec40000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000230 CR3: 0000000281408006 CR4: 00000000003706e0
      Call Trace:
       tap_get_user+0x3f1/0x540 [tap]
       tap_sendmsg+0x56/0x362 [tap]
       ? get_tx_bufs+0xc2/0x1e0 [vhost_net]
       handle_tx_copy+0x114/0x670 [vhost_net]
       handle_tx+0xb0/0xe0 [vhost_net]
       handle_tx_kick+0x15/0x20 [vhost_net]
       vhost_worker+0x7b/0xc0 [vhost]
       ? vhost_vring_call_reset+0x40/0x40 [vhost]
       kthread+0xfa/0x120
       ? kthread_complete_and_exit+0x20/0x20
       ret_from_fork+0x1f/0x30
      
      Fixes: 924a9bc3 ("net: check if protocol extracted by virtio_net_hdr_set_proto is correct")
      Signed-off-by: Cezar Bulinaru <cbulinaru@gmail.com>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'mptcp-fixes' · 9f05f9ad
      David S. Miller authored
      Mat Martineau says:
      
      ====================
      mptcp: Fixes for mptcp cleanup/close and a selftest
      
      Patch 1 fixes an issue with leaking subflow sockets if there's a failure
      in a CGROUP_INET_SOCK_CREATE eBPF program.
      
      Patch 2 fixes a syzkaller-detected race at MPTCP socket close.
      
      Patch 3 is a fix for one mode of the mptcp_connect.sh selftest.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: mptcp: make sendfile selftest work · df9e03ae
      Florian Westphal authored
      When the selftest got added, sendfile() on mptcp sockets returned
      -EOPNOTSUPP, so running 'mptcp_connect.sh -m sendfile' failed
      immediately.
      
      This is no longer the case, but the script fails anyway due to timeout.
      Let the receiver know once the sender has sent all data, just like
      with '-m mmap' mode.
      
      v2: need to respect cfg_wait too, as pm_userspace.sh relied
      on -m sendfile to keep the connection open (Mat Martineau)
      
      Fixes: 048d19d4 ("mptcp: add basic kselftest for mptcp")
      Reported-by: Xiumei Mu <xmu@redhat.com>
      Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mptcp: do not queue data on closed subflows · c886d702
      Paolo Abeni authored
      Dipanjan reported a syzbot splat at close time:
      
      WARNING: CPU: 1 PID: 10818 at net/ipv4/af_inet.c:153
      inet_sock_destruct+0x6d0/0x8e0 net/ipv4/af_inet.c:153
      Modules linked in: uio_ivshmem(OE) uio(E)
      CPU: 1 PID: 10818 Comm: kworker/1:16 Tainted: G           OE
      5.19.0-rc6-g2eae0556bb9d #2
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
      1.13.0-1ubuntu1.1 04/01/2014
      Workqueue: events mptcp_worker
      RIP: 0010:inet_sock_destruct+0x6d0/0x8e0 net/ipv4/af_inet.c:153
      Code: 21 02 00 00 41 8b 9c 24 28 02 00 00 e9 07 ff ff ff e8 34 4d 91
      f9 89 ee 4c 89 e7 e8 4a 47 60 ff e9 a6 fc ff ff e8 20 4d 91 f9 <0f> 0b
      e9 84 fe ff ff e8 14 4d 91 f9 0f 0b e9 d4 fd ff ff e8 08 4d
      RSP: 0018:ffffc9001b35fa78 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: 00000000002879d0 RCX: ffff8881326f3b00
      RDX: 0000000000000000 RSI: ffff8881326f3b00 RDI: 0000000000000002
      RBP: ffff888179662674 R08: ffffffff87e983a0 R09: 0000000000000000
      R10: 0000000000000005 R11: 00000000000004ea R12: ffff888179662400
      R13: ffff888179662428 R14: 0000000000000001 R15: ffff88817e38e258
      FS:  0000000000000000(0000) GS:ffff8881f5f00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020007bc0 CR3: 0000000179592000 CR4: 0000000000150ee0
      Call Trace:
       <TASK>
       __sk_destruct+0x4f/0x8e0 net/core/sock.c:2067
       sk_destruct+0xbd/0xe0 net/core/sock.c:2112
       __sk_free+0xef/0x3d0 net/core/sock.c:2123
       sk_free+0x78/0xa0 net/core/sock.c:2134
       sock_put include/net/sock.h:1927 [inline]
       __mptcp_close_ssk+0x50f/0x780 net/mptcp/protocol.c:2351
       __mptcp_destroy_sock+0x332/0x760 net/mptcp/protocol.c:2828
       mptcp_worker+0x5d2/0xc90 net/mptcp/protocol.c:2586
       process_one_work+0x9cc/0x1650 kernel/workqueue.c:2289
       worker_thread+0x623/0x1070 kernel/workqueue.c:2436
       kthread+0x2e9/0x3a0 kernel/kthread.c:376
       ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:302
       </TASK>
      
      The root cause of the problem is that an mptcp-level (re)transmit can
      race with mptcp_close() and the packet scheduler checks the subflow
      state before acquiring the socket lock: we can try to (re)transmit on
      an already closed ssk.
      
      Fix the issue by checking the subflow socket status again under the
      subflow socket lock. Additionally, add the missing check for the
      fallback-to-TCP case.
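
The "check again under the lock" pattern can be sketched in user space as follows. This is a minimal model under stated assumptions, not the MPTCP implementation: `struct subflow_model`, `subflow_queue_data()` and `subflow_close()` are hypothetical names, and a pthread mutex stands in for the subflow socket lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a subflow; not the kernel's data structures. */
enum subflow_state_model { SSK_ESTABLISHED, SSK_CLOSED };

struct subflow_model {
	pthread_mutex_t lock;           /* stands in for the subflow socket lock */
	enum subflow_state_model state;
	size_t queued;                  /* bytes queued for (re)transmission */
};

/* Close path: the state only changes while the lock is held. */
static void subflow_close(struct subflow_model *ssk)
{
	pthread_mutex_lock(&ssk->lock);
	ssk->state = SSK_CLOSED;
	pthread_mutex_unlock(&ssk->lock);
}

/*
 * Transmit path: the unlocked check is only a cheap hint and may be stale,
 * so the state is checked again once the lock is held. Data is queued only
 * if the subflow is still open at that point.
 */
static bool subflow_queue_data(struct subflow_model *ssk, size_t len)
{
	bool queued = false;

	assert(ssk != NULL);

	/* Unlocked pre-check, as a scheduler might do when picking a subflow. */
	if (ssk->state == SSK_CLOSED)
		return false;

	pthread_mutex_lock(&ssk->lock);
	/* A concurrent close may have run between the pre-check and the lock. */
	if (ssk->state != SSK_CLOSED) {
		ssk->queued += len;
		queued = true;
	}
	pthread_mutex_unlock(&ssk->lock);

	return queued;
}
```

Without the second check, a close that slips in between the pre-check and the lock acquisition would leave data queued on a subflow that is already closed.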
      
      Fixes: d5f49190 ("mptcp: allow picking different xmit subflows")
      Reported-by: Dipanjan Das <mail.dipanjan.das@gmail.com>
      Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mptcp: move subflow cleanup in mptcp_destroy_common() · c0bf3c6a
      Paolo Abeni authored
      If the mptcp socket creation fails due to a CGROUP_INET_SOCK_CREATE
      eBPF program, the MPTCP protocol ends up leaking all the subflows:
      the related cleanup happens in __mptcp_destroy_sock(), which is not
      invoked in that code path.
      
      Address the issue by moving the subflow sockets cleanup into the
      mptcp_destroy_common() helper, which is invoked in every msk cleanup
      path.
      
      Additionally, get rid of the intermediate list_splice_init step, which
      is an unneeded relic from the past.
      
      The issue has been present since before the reported root cause commit,
      but any attempt to backport the fix before that hash will require a
      complete rewrite.
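
The idea of moving cleanup into a helper shared by every teardown path can be sketched as below. This is an illustrative user-space model only: `struct msk_model`, `struct ssk_node` and all function names are hypothetical, and a singly linked list stands in for the msk's subflow list.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model types; not the kernel's structures. */
struct ssk_node {
	struct ssk_node *next;
};

struct msk_model {
	struct ssk_node *subflows;
};

/* Attach one subflow to the msk; returns 0 on success, -1 on allocation failure. */
static int msk_add_subflow(struct msk_model *msk)
{
	struct ssk_node *ssk = malloc(sizeof(*ssk));

	if (!ssk)
		return -1;
	ssk->next = msk->subflows;
	msk->subflows = ssk;
	return 0;
}

/*
 * Common cleanup helper: walks the subflow list directly (no intermediate
 * splice onto a temporary list) and releases every entry. Returns the
 * number of subflows released.
 */
static size_t msk_destroy_common(struct msk_model *msk)
{
	size_t released = 0;

	assert(msk != NULL);
	while (msk->subflows) {
		struct ssk_node *ssk = msk->subflows;

		msk->subflows = ssk->next;
		free(ssk);
		released++;
	}
	return released;
}

/* Regular close path. */
static size_t msk_close_path(struct msk_model *msk)
{
	return msk_destroy_common(msk);
}

/* Error path taken when socket creation is rejected after setup. */
static size_t msk_create_failed_path(struct msk_model *msk)
{
	return msk_destroy_common(msk);
}
```

Because both paths funnel through the same helper, the creation-failure path can no longer skip the subflow release.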
      
      Fixes: e16163b6 ("mptcp: refactor shutdown and close")
      Reported-by: Nguyen Dinh Phi <phind.uet@gmail.com>
      Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Co-developed-by: Nguyen Dinh Phi <phind.uet@gmail.com>
      Signed-off-by: Nguyen Dinh Phi <phind.uet@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. 04 Aug, 2022 7 commits
  6. 03 Aug, 2022 4 commits
    • Merge tag 'net-next-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next · f86d1fbb
      Linus Torvalds authored
      Pull networking changes from Paolo Abeni:
       "Core:
      
         - Refactor the forward memory allocation to better cope with memory
           pressure with many open sockets, moving from a per socket cache to
           a per-CPU one
      
         - Replace rwlocks with RCU for better fairness in ping, raw sockets
           and IP multicast router.
      
         - Network-side support for IO uring zero-copy send.
      
         - A few skb drop reason improvements, including using codegen for the
           source file with the string mapping instead of macro magic.
      
         - Rename reference tracking helpers to a more consistent netdev_*
           schema.
      
         - Adapt u64_stats_t type to address load/store tearing issues.
      
         - Refine debug helper usage to reduce the log noise caused by bots.
      
        BPF:
      
         - Improve socket map performance, avoiding skb cloning on read
           operation.
      
         - Add support for 64-bit enums, to match types exposed by the kernel.
      
         - Introduce support for sleepable uprobe programs.
      
         - Introduce support for enum textual representation in libbpf.
      
         - New helpers to implement synproxy with eBPF/XDP.
      
         - Improve loop performance, inlining indirect calls when possible.
      
         - Removed all the deprecated libbpf APIs.
      
         - Implement new eBPF-based LSM flavor.
      
         - Add type match support, which allows accurate queries to the eBPF
           used types.
      
         - A few TCP congestion control framework usability improvements.
      
         - Add new infrastructure to manipulate CT entries via eBPF programs.
      
         - Allow for livepatch (KLP) and BPF trampolines to attach to the same
           kernel function.
      
        Protocols:
      
         - Introduce per network namespace lookup tables for unix sockets,
           increasing scalability and reducing contention.
      
         - Preparation work for Wi-Fi 7 Multi-Link Operation (MLO) support.
      
         - Add support to forcibly close TIME_WAIT TCP sockets via user-space
           tools.
      
         - Significant performance improvement for the TLS 1.3 receive path,
           both for zero-copy and not-zero-copy.
      
         - Support for changing the initial MPTCP subflow priority/backup
           status
      
         - Introduce virtually contiguous buffers for sockets over RDMA, to
           cope better with memory pressure.
      
         - Extend CAN ethtool support with timestamping capabilities
      
         - Refactor CAN build infrastructure to allow building only the needed
           features.
      
        Driver API:
      
         - Remove devlink mutex to allow parallel commands on multiple links.
      
         - Add support for pause stats in distributed switch.
      
         - Implement devlink helpers to query and flash line cards.
      
         - New helper for phy mode to register conversion.
      
        New hardware / drivers:
      
         - Ethernet DSA driver for the rockchip mt7531 on BPI-R2 Pro.
      
         - Ethernet DSA driver for the Renesas RZ/N1 A5PSW switch.
      
         - Ethernet DSA driver for the Microchip LAN937x switch.
      
         - Ethernet PHY driver for the Aquantia AQR113C EPHY.
      
         - CAN driver for the OBD-II ELM327 interface.
      
         - CAN driver for RZ/N1 SJA1000 CAN controller.
      
         - Bluetooth: Infineon CYW55572 Wi-Fi plus Bluetooth combo device.
      
        Drivers:
      
         - Intel Ethernet NICs:
            - i40e: add support for vlan pruning
            - i40e: add support for XDP fragmented packets
            - ice: improved vlan offload support
            - ice: add support for PPPoE offload
      
         - Mellanox Ethernet (mlx5)
            - refactor packet steering offload for performance and scalability
            - extend support for TC offload
            - refactor devlink code to clean-up the locking schema
            - support stacked vlans for bridge offloads
            - use TLS objects pool to improve connection rate
      
         - Netronome Ethernet NICs (nfp):
            - extend support for IPv6 fields mangling offload
            - add support for vepa mode in HW bridge
            - better support for virtio data path acceleration (VDPA)
            - enable TSO by default
      
         - Microsoft vNIC driver (mana)
            - add support for XDP redirect
      
         - Other Ethernet drivers:
            - bonding: add per-port priority support
            - microchip lan743x: extend phy support
            - Fungible funeth: support UDP segmentation offload and XDP xmit
            - Solarflare EF100: add support for virtual function representors
            - MediaTek SoC: add XDP support
      
         - Mellanox Ethernet/IB switch (mlxsw):
            - dropped support for unreleased H/W (XM router).
            - improved stats accuracy
            - unified bridge model conversion improving scalability (parts 1-6)
            - support for PTP in Spectrum-2 asics
      
         - Broadcom PHYs
            - add PTP support for BCM54210E
            - add support for the BCM53128 internal PHY
      
         - Marvell Ethernet switches (prestera):
            - implement support for multicast forwarding offload
      
         - Embedded Ethernet switches:
            - refactor OcteonTx MAC filter for better scalability
            - improve TC H/W offload for the Felix driver
            - refactor the Microchip ksz8 and ksz9477 drivers to share the
              probe code (parts 1, 2), add support for phylink mac
              configuration
      
         - Other WiFi:
            - Microchip wilc1000: disable WEP support and enable WPA3
            - Atheros ath10k: encapsulation offload support
      
        Old code removal:
      
         - Neterion vxge ethernet driver: this has been untouched for more than 10 years"
      
      * tag 'net-next-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1890 commits)
        doc: sfp-phylink: Fix a broken reference
        wireguard: selftests: support UML
        wireguard: allowedips: don't corrupt stack when detecting overflow
        wireguard: selftests: update config fragments
        wireguard: ratelimiter: use hrtimer in selftest
        net/mlx5e: xsk: Discard unaligned XSK frames on striding RQ
        net: usb: ax88179_178a: Bind only to vendor-specific interface
        selftests: net: fix IOAM test skip return code
        net: usb: make USB_RTL8153_ECM non user configurable
        net: marvell: prestera: remove reduntant code
        octeontx2-pf: Reduce minimum mtu size to 60
        net: devlink: Fix missing mutex_unlock() call
        net/tls: Remove redundant workqueue flush before destroy
        net: txgbe: Fix an error handling path in txgbe_probe()
        net: dsa: Fix spelling mistakes and cleanup code
        Documentation: devlink: add add devlink-selftests to the table of contents
        dccp: put dccp_qpolicy_full() and dccp_qpolicy_push() in the same lock
        net: ionic: fix error check for vlan flags in ionic_set_nic_features()
        net: ice: fix error NETIF_F_HW_VLAN_CTAG_FILTER check in ice_vsi_sync_fltr()
        nfp: flower: add support for tunnel offload without key ID
        ...
    • Merge tag 'ata-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata · 526942b8
      Linus Torvalds authored
      Pull ATA updates from Damien Le Moal:
      
       - Some code refactoring for the pata_hpt37x and pata_hpt3x2n drivers,
         from Sergei.
      
       - Several cleanup patches for the libata-core, libata-scsi and
         libata-eh code: fix argument and variable types, change some
         function declarations to static and fix a typo in a comment.
         From Sergey and Xiang.
      
       - Fix a compilation warning in the pata_macio driver, from me.
      
       - A fix for the expected number of resources in the sata_mv driver,
         from Andrew.
      
      * tag 'ata-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
        ata: sata_mv: Fixes expected number of resources now IRQs are gone
        ata: libata-scsi: fix result type of ata_ioc32()
        ata: pata_macio: Fix compilation warning
        ata: libata-eh: fix sloppy result type of ata_internal_cmd_timeout()
        ata: libata-core: fix sloppy parameter type in ata_exec_internal[_sg]()
        ata: make ata_port::fastdrain_cnt *unsigned int*
        ata: libata-eh: fix sloppy result type of ata_eh_nr_in_flight()
        ata: libata-core: make ata_exec_internal_sg() *static*
        ata: make transfer mode masks *unsigned int*
        ata: libata-core: get rid of *else* branches in ata_id_n_sectors()
        ata: libata-core: fix sloppy typing in ata_id_n_sectors()
        ata: pata_hpt3x2n: pass base DPLL frequency to hpt3x2n_pci_clock()
        ata: pata_hpt37x: merge hpt374_read_freq() to hpt37x_pci_clock()
        ata: pata_hpt37x: factor out hpt37x_pci_clock()
        ata: pata_hpt37x: move claculating PCI clock from hpt37x_clock_slot()
        ata: libata: Fix syntax errors in comments
    • Merge tag 'zonefs-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs · a39b5dbd
      Linus Torvalds authored
      Pull zonefs update from Damien Le Moal:
       "A single change for this cycle to simplify handling of the memory page
        used as super block buffer during mount (from Fabio)"
      
      * tag 'zonefs-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
        zonefs: Call page_address() on page acquired with GFP_KERNEL flag
    • Merge tag 'iomap-5.20-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · f18d7309
      Linus Torvalds authored
      Pull iomap updates from Darrick Wong:
       "The most notable change in this first batch is that we no longer
        schedule pages beyond i_size for writeback, preferring instead to let
        truncate deal with those pages.
      
        Next week, there may be a second pull request to remove
        iomap_writepage from the other two filesystems (gfs2/zonefs) that use
        iomap for buffered IO. This follows in the same vein as the recent
        removal of writepage from XFS, since it hasn't been triggered in a few
        years; it does nothing during direct reclaim; and as far as the people
        who examined the patchset can tell, it's moving the codebase in the
        right direction.
      
        However, as it was a late addition to for-next, I'm holding off on
        that section for another week of testing to see if anyone can come up
        with a solid reason for holding off in the meantime.
      
        Summary:
      
         - Skip writeback for pages that are completely beyond EOF
      
         - Minor code cleanups"
      
      * tag 'iomap-5.20-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        dax: set did_zero to true when zeroing successfully
        iomap: set did_zero to true when zeroing successfully
        iomap: skip pages past eof in iomap_do_writepage()