1. 19 Dec, 2023 2 commits
    • Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · c49b292d
      Jakub Kicinski authored
      Alexei Starovoitov says:
      
      ====================
      pull-request: bpf-next 2023-12-18
      
      This PR is larger than usual and contains changes in various parts
      of the kernel.
      
      The main changes are:
      
      1) Fix kCFI bugs in BPF, from Peter Zijlstra.
      
      End result: all forms of indirect calls from BPF into kernel
      and from kernel into BPF work with CFI enabled. This allows BPF
      to work with CONFIG_FINEIBT=y.
      
      2) Introduce BPF token object, from Andrii Nakryiko.
      
      It adds an ability to delegate a subset of BPF features from privileged
      daemon (e.g., systemd) through special mount options for userns-bound
      BPF FS to a trusted unprivileged application. The design accommodates
      suggestions from Christian Brauner and Paul Moore.
      
      Example:
      $ sudo mkdir -p /sys/fs/bpf/token
      $ sudo mount -t bpf bpffs /sys/fs/bpf/token \
                   -o delegate_cmds=prog_load:MAP_CREATE \
                   -o delegate_progs=kprobe \
                   -o delegate_attachs=xdp
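
      The delegate_* values above are colon-separated lists. Below is a
      minimal Python sketch of how such a value could be split and
      normalized. It assumes names are matched case-insensitively, as the
      mixed-case `prog_load:MAP_CREATE` example suggests. `parse_delegate`
      is a hypothetical helper and `KNOWN_CMDS` is a small illustrative
      subset, not the kernel's parser or its full list.

```python
# Illustrative subset of BPF command names (assumption: not the full list).
KNOWN_CMDS = {"map_create", "prog_load", "btf_load", "link_create"}


def parse_delegate(value, known):
    """Split a colon-separated delegate_* value into normalized names.

    Assumption: matching is case-insensitive, as 'prog_load:MAP_CREATE'
    in the example implies. Unknown names raise ValueError.
    """
    names = set()
    for raw in value.split(":"):
        name = raw.strip().lower()
        if name not in known:
            raise ValueError(f"unknown name: {raw!r}")
        names.add(name)
    return names


if __name__ == "__main__":
    print(sorted(parse_delegate("prog_load:MAP_CREATE", KNOWN_CMDS)))
```

      Rejecting unknown names outright, rather than ignoring them, is a
      choice of this sketch: a typo in a delegation list should not
      silently grant less (or more) than intended.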
      
      3) Various verifier improvements and fixes, from Andrii Nakryiko, Andrei Matei.
      
       - Complete precision tracking support for register spills
       - Fix verification of possibly-zero-sized stack accesses
       - Fix access to uninit stack slots
       - Track aligned STACK_ZERO cases as imprecise spilled registers.
         It improves the verifier "instructions processed" metric by anywhere
         from single-digit percentages to 50-60% for some programs.
       - Fix verifier retval logic
      
      4) Support for VLAN tag in XDP hints, from Larysa Zaremba.
      
      5) Allocate BPF trampoline via bpf_prog_pack mechanism, from Song Liu.
      
      End result: better memory utilization and lower I$ miss for calls to BPF
      via BPF trampoline.
      
      6) Fix race between BPF prog accessing inner map and parallel delete,
      from Hou Tao.
      
      7) Add bpf_xdp_get_xfrm_state() kfunc, from Daniel Xu.
      
      It allows BPF to interact with IPSEC infra. The intent is to support
      software RSS (via XDP) for the upcoming ipsec pcpu work.
      Experiments on AWS demonstrate single tunnel pcpu ipsec reaching
      line rate on 100G ENA nics.
      
      8) Expand bpf_cgrp_storage to support cgroup1 non-attach, from Yafang Shao.
      
      9) BPF file verification via fsverity, from Song Liu.
      
      It allows BPF progs to get the fsverity digest.
      
      * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (164 commits)
        bpf: Ensure precise is reset to false in __mark_reg_const_zero()
        selftests/bpf: Add more uprobe multi fail tests
        bpf: Fail uprobe multi link with negative offset
        selftests/bpf: Test the release of map btf
        s390/bpf: Fix indirect trampoline generation
        selftests/bpf: Temporarily disable dummy_struct_ops test on s390
        x86/cfi,bpf: Fix bpf_exception_cb() signature
        bpf: Fix dtor CFI
        cfi: Add CFI_NOSEAL()
        x86/cfi,bpf: Fix bpf_struct_ops CFI
        x86/cfi,bpf: Fix bpf_callback_t CFI
        x86/cfi,bpf: Fix BPF JIT call
        cfi: Flip headers
        selftests/bpf: Add test for abnormal cnt during multi-kprobe attachment
        selftests/bpf: Don't use libbpf_get_error() in kprobe_multi_test
        selftests/bpf: Add test for abnormal cnt during multi-uprobe attachment
        bpf: Limit the number of kprobes when attaching program to multiple kprobes
        bpf: Limit the number of uprobes when attaching program to multiple uprobes
        bpf: xdp: Register generic_kfunc_set with XDP programs
        selftests/bpf: utilize string values for delegate_xxx mount options
        ...
      ====================
      
      Link: https://lore.kernel.org/r/20231219000520.34178-1-alexei.starovoitov@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'wireless-next-2023-12-18' of... · 0ee28c9a
      Jakub Kicinski authored
      Merge tag 'wireless-next-2023-12-18' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
      
      Kalle Valo says:
      
      ====================
      wireless-next patches for v6.8
      
      The second features pull request for v6.8. A bigger one this time with
      changes both to stack and drivers. We have a new Wifi band RFI (WBRF)
      mitigation feature for which we pulled an immutable branch shared with
      other subsystems. And, as always, other new features and bug fixes all
      over.
      
      Major changes:
      
      cfg80211/mac80211
       * AMD ACPI based Wifi band RFI (WBRF) mitigation feature
       * Basic Service Set (BSS) usage reporting
       * TID to link mapping support
       * mac80211 hardware flag to disallow puncturing
      
      iwlwifi
       * new debugfs file fw_dbg_clear
      
      mt76
       * NVMEM EEPROM improvements
       * mt7996 Extremely High Throughput (EHT) improvements
       * mt7996 Wireless Ethernet Dispatcher (WED) support
       * mt7996 36-bit DMA support
      
      ath12k
       * support one MSI vector
       * WCN7850: support AP mode
      
      * tag 'wireless-next-2023-12-18' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (207 commits)
        wifi: mt76: mt7996: Use DECLARE_FLEX_ARRAY() and fix -Warray-bounds warnings
        wifi: ath11k: workaround too long expansion sparse warnings
        Revert "wifi: ath12k: use ATH12K_PCI_IRQ_DP_OFFSET for DP IRQ"
        wifi: rt2x00: remove useless code in rt2x00queue_create_tx_descriptor()
        wifi: rtw89: only reset BB/RF for existing WiFi 6 chips while starting up
        wifi: rtw89: add DBCC H2C to notify firmware the status
        wifi: rtw89: mac: add suffix _ax to MAC functions
        wifi: rtw89: mac: add flags to check if CMAC and DMAC are enabled
        wifi: rtw89: 8922a: add power on/off functions
        wifi: rtw89: add XTAL SI for WiFi 7 chips
        wifi: rtw89: phy: print out RFK log with formatted string
        wifi: rtw89: parse and print out RFK log from C2H events
        wifi: rtw89: add C2H event handlers of RFK log and report
        wifi: rtw89: load RFK log format string from firmware file
        wifi: rtw89: fw: add version field to BB MCU firmware element
        wifi: rtw89: fw: load TX power track tables from fw_element
        wifi: mwifiex: configure BSSID consistently when starting AP
        wifi: mwifiex: add extra delay for firmware ready
        wifi: mac80211: sta_info.c: fix sentence grammar
        wifi: mac80211: rx.c: fix sentence grammar
        ...
      ====================
      
      Link: https://lore.kernel.org/r/20231218163900.C031DC433C9@smtp.kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  2. 18 Dec, 2023 23 commits
  3. 17 Dec, 2023 15 commits
    • Merge branch 'phy-ackage-addr-mmd-apis' · 54f4c257
      David S. Miller authored
      Christian Marangi says:
      
      ====================
      net: phy: add PHY package base addr + mmd APIs
      
      This small series is required for the upcoming qca807x PHY that
      will make use of PHY package mmd API and the new implementation
      with read/write based on base addr.
      
      The MMD PHY package patch currently has no user, but it will be
      used in an upcoming patch, and it completes what a PHY package
      may require in addition to basic read/write to set up the global
      PHY address.

      (The changelog for all revisions is present in each individual patch.)
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: add support for PHY package MMD read/write · d63710fc
      Christian Marangi authored
      Some PHYs in a PHY package may need to read/write MMD regs to correctly
      configure the PHY package.

      Add support for these additional required functions in both locked and
      unlocked variants.

      It's assumed that the entire PHY package is either C22 or C45. We use
      the C22 or C45 way of writing/reading MMD regs depending on whether
      the passed phydev is C22 or C45.
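
      The difference between the two access methods can be sketched with a
      toy Python model. This is not the kernel code: `FakeBus` and
      `read_mmd` are hypothetical names, and the bus models a single PHY,
      so the address argument is ignored. The register numbers (13 and 14)
      and the 0x4000 "data, no post-increment" function code follow the
      standard Clause 22 indirect MMD access sequence.

```python
MII_MMD_CTRL = 0x0D           # C22 register 13: MMD access control
MII_MMD_DATA = 0x0E           # C22 register 14: MMD access address/data
MII_MMD_CTRL_NOINCR = 0x4000  # function = data, no post-increment


class FakeBus:
    """Toy MDIO bus modelling the MMD register space of one PHY."""

    def __init__(self):
        self.mmd = {}            # (devad, regnum) -> value
        self._devad = 0
        self._regnum = 0
        self._data_mode = False

    # Clause 45 frames address the MMD register directly.
    def c45_read(self, addr, devad, regnum):
        return self.mmd.get((devad, regnum), 0)

    # Clause 22 frames only reach registers 0..31, so MMD access is indirect.
    def c22_write(self, addr, reg, val):
        if reg == MII_MMD_CTRL:
            self._data_mode = bool(val & MII_MMD_CTRL_NOINCR)
            self._devad = val & 0x1F
        elif reg == MII_MMD_DATA:
            if self._data_mode:
                self.mmd[(self._devad, self._regnum)] = val
            else:
                self._regnum = val

    def c22_read(self, addr, reg):
        if reg == MII_MMD_DATA and self._data_mode:
            return self.mmd.get((self._devad, self._regnum), 0)
        return 0


def read_mmd(bus, addr, is_c45, devad, regnum):
    """Pick the access method from the PHY type, as the commit describes."""
    if is_c45:
        return bus.c45_read(addr, devad, regnum)
    bus.c22_write(addr, MII_MMD_CTRL, devad)                        # select MMD
    bus.c22_write(addr, MII_MMD_DATA, regnum)                       # set address
    bus.c22_write(addr, MII_MMD_CTRL, devad | MII_MMD_CTRL_NOINCR)  # data mode
    return bus.c22_read(addr, MII_MMD_DATA)
```

      Both paths return the same value for the same (devad, regnum) pair;
      only the number of bus transactions differs.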
      Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: restructure __phy_write/read_mmd to helper and phydev user · 028672bd
      Christian Marangi authored
      Restructure phy_write_mmd and phy_read_mmd to implement generic helpers
      for direct mdiobus access for MMD, and use these helpers for the phydev
      user.

      This is needed in preparation for the PHY package API, which requires
      generic access to the mdiobus that is detached from the phydev struct
      and is instead based on the PHY package base_addr and offsets.
      Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
      Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: extend PHY package API to support multiple global address · 9eea577e
      Christian Marangi authored
      The current API for PHY packages is limited to a single address to
      configure global settings for the PHY package.

      It was found that some PHY packages (for example the qca807x, a PHY
      package that ships as a bundle of 5 PHYs) require multiple PHY
      addresses to configure global settings. An example scenario is a
      package that has a dedicated PHY for PSGMII/serdes calibration and a
      specific PHY in the package where the global PHY mode is set and
      affects every other PHY in the package.

      Change the API in the following way:
      - Change phy_package_join() to take the base addr of the PHY package
        instead of the global PHY addr.
      - Make __/phy_package_write/read() require an additional arg that
        selects which global PHY address to use by passing the offset from
        the base addr passed to phy_package_join().

      Each user of this API is updated to follow this new implementation,
      following a pattern where an enum is defined to declare the offsets of
      the addrs.

      We also drop the check for whether shared is defined, as any user of
      phy_package_read/write is expected to call phy_package_join first.
      Misuse of this will correctly trigger a kernel panic due to a NULL
      pointer dereference.
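
      The base-address-plus-offset scheme can be illustrated with a short
      Python model. This is a sketch, not the kernel API: `PhyPackage`,
      `PkgAddr`, and the offset values are hypothetical, and the bus is a
      plain dict. The range check against PHY_MAX_ADDR (32 addresses on an
      MDIO bus) is part of the sketch.

```python
from enum import IntEnum

PHY_MAX_ADDR = 32  # MDIO bus addresses are 0..31


class PkgAddr(IntEnum):
    """Hypothetical offsets from the package base address."""
    GLOBAL_MODE = 0   # PHY where the global mode is set
    CALIBRATION = 4   # PHY dedicated to serdes calibration


class PhyPackage:
    def __init__(self, bus, base_addr):
        if not 0 <= base_addr < PHY_MAX_ADDR:
            raise ValueError("base_addr out of range")
        self.bus = bus              # dict: (addr, regnum) -> value
        self.base_addr = base_addr

    def _addr(self, offset):
        # The global PHY address is derived, not stored: base + offset.
        addr = self.base_addr + offset
        if addr >= PHY_MAX_ADDR:
            raise ValueError("address past end of bus")
        return addr

    def read(self, offset, regnum):
        return self.bus.get((self._addr(offset), regnum), 0)

    def write(self, offset, regnum, val):
        self.bus[(self._addr(offset), regnum)] = val
```

      Declaring the offsets in an enum, as the commit message describes for
      the updated drivers, keeps the per-package layout in one place
      instead of scattering literal addresses across call sites.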
      Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: make addr type u8 in phy_package_shared struct · ebb30ccb
      Christian Marangi authored
      Switch addr type in phy_package_shared struct to u8.
      
      The value is already checked to be non negative and to be less than
      PHY_MAX_ADDR, hence u8 is better suited than using int.
      Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
      Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeontx2-af: Add new devlink param to configure maximum usable NIX block LFs · dd784287
      Suman Ghosh authored
      On some silicon variants the number of available CAM entries is
      smaller. Reserving one entry for each NIX-LF for default DMAC based pkt
      forwarding rules reduces the number of available CAM entries
      further. Hence add configurability via devlink to set the maximum
      number of NIX-LFs needed, which in turn frees up some CAM entries.
      Signed-off-by: Suman Ghosh <sumang@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'ath-next-20231215' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath · c5a3f56f
      Kalle Valo authored
      ath.git patches for v6.8.
      
      We have new features only for ath12k but lots of small cleanup for
      ath10k, ath11k and ath12k. And of course smaller fixes to several
      drivers.
      
      Major changes:
      
      ath12k
      
      * support one MSI vector
      
      * WCN7850: support AP mode
    • wifi: mt76: mt7996: Use DECLARE_FLEX_ARRAY() and fix -Warray-bounds warnings · 40d51f70
      Gustavo A. R. Silva authored
      Transform zero-length arrays `rate`, `adm_stat` and `msdu_cnt` into
      proper flexible-array members in anonymous union in `struct
      mt7996_mcu_all_sta_info_event` via the DECLARE_FLEX_ARRAY()
      helper; and fix multiple -Warray-bounds warnings:
      
      drivers/net/wireless/mediatek/mt76/mt7996/mcu.c:544:61: warning: array subscript <unknown> is outside array bounds of 'struct <anonymous>[0]' [-Warray-bounds=]
      drivers/net/wireless/mediatek/mt76/mt7996/mcu.c:551:58: warning: array subscript <unknown> is outside array bounds of 'struct <anonymous>[0]' [-Warray-bounds=]
      drivers/net/wireless/mediatek/mt76/mt7996/mcu.c:553:58: warning: array subscript <unknown> is outside array bounds of 'struct <anonymous>[0]' [-Warray-bounds=]
      drivers/net/wireless/mediatek/mt76/mt7996/mcu.c:530:61: warning: array subscript <unknown> is outside array bounds of 'struct <anonymous>[0]' [-Warray-bounds=]
      drivers/net/wireless/mediatek/mt76/mt7996/mcu.c:538:66: warning: array subscript <unknown> is outside array bounds of 'struct <anonymous>[0]' [-Warray-bounds=]
      drivers/net/wireless/mediatek/mt76/mt7996/mcu.c:540:66: warning: array subscript <unknown> is outside array bounds of 'struct <anonymous>[0]' [-Warray-bounds=]
      drivers/net/wireless/mediatek/mt76/mt7996/mcu.c:520:57: warning: array subscript <unknown> is outside array bounds of 'struct all_sta_trx_rate[0]' [-Warray-bounds=]
      drivers/net/wireless/mediatek/mt76/mt7996/mcu.c:526:76: warning: array subscript <unknown> is outside array bounds of 'struct all_sta_trx_rate[0]' [-Warray-bounds=]
      drivers/net/wireless/mediatek/mt76/mt7996/mcu.c:526:76: warning: array subscript <unknown> is outside array bounds of 'struct all_sta_trx_rate[0]' [-Warray-bounds=]
      drivers/net/wireless/mediatek/mt76/mt7996/mcu.c:526:76: warning: array subscript <unknown> is outside array bounds of 'struct all_sta_trx_rate[0]' [-Warray-bounds=]
      drivers/net/wireless/mediatek/mt76/mt7996/mcu.c:526:76: warning: array subscript <unknown> is outside array bounds of 'struct all_sta_trx_rate[0]' [-Warray-bounds=]
      
      This results in no differences in binary output and helps with the
      ongoing efforts to globally enable -Warray-bounds.
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Signed-off-by: Kalle Valo <kvalo@kernel.org>
      Link: https://msgid.link/ZXiU9ayVCslt3qiI@work
    • Merge branch 'skb-coalescing-page_pool' · 3a3af3ae
      David S. Miller authored
      Liang Chen says:
      
      ====================
      skbuff: Optimize SKB coalescing for page pool
      
      The combination of the following conditions was excluded from skb coalescing:
      
      from->pp_recycle = 1
      from->cloned = 1
      to->pp_recycle = 1
      
      With page pool in use, this combination can be quite common (e.g.,
      NetworkManager may lead to an additional packet_type being registered,
      and thus to cloning). In scenarios with a higher number of small packets,
      it can significantly affect the success rate of coalescing.

      This patchset aims to optimize this scenario and enable coalescing of this
      particular combination. That also involves supporting multiple users
      referencing the same fragment of a pp page to accommodate the need to
      increment the "from" SKB page's pp page reference count.
      
      Changes from v10:
      - re-number patches to 1/3, 2/3, 3/3
      
      Changes from v9:
      - patch 1 was already applied
      - improve description for patch 2
      - make sure skb_pp_frag_ref only works for pp aware skbs
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • skbuff: Optimization of SKB coalescing for page pool · f7dc3248
      Liang Chen authored
      In order to address the issues encountered with commit 1effe8ca
      ("skbuff: fix coalescing for page_pool fragment recycling"), the
      combination of the following conditions was excluded from skb coalescing:
      
      from->pp_recycle = 1
      from->cloned = 1
      to->pp_recycle = 1
      
      However, in page pool environments, the aforementioned combination can
      be quite common (e.g., NetworkManager may lead to an additional
      packet_type being registered, and thus to cloning). In scenarios with a
      higher number of small packets, it can significantly affect the success
      rate of coalescing. For example, considering packets of 256 bytes,
      our comparison of coalescing success rates is as follows:
      
      Without page pool: 70%
      With page pool: 13%
      
      Consequently, this has an impact on performance:
      
      Without page pool: 2.57 Gbits/sec
      With page pool: 2.26 Gbits/sec
      
      Therefore, it seems worthwhile to optimize this scenario and enable
      coalescing of this particular combination. To achieve this, we need to
      ensure the correct increment of the "from" SKB page's page pool
      reference count (pp_ref_count).
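
      The change in the coalescing rule can be modelled as two predicates.
      This is a minimal Python sketch based on the description above, not
      the kernel source: the `Skb` class and both function names are
      hypothetical, and only the three flags named in this message are
      modelled.

```python
from dataclasses import dataclass


@dataclass
class Skb:
    pp_recycle: bool
    cloned: bool


def may_coalesce_before(to, frm):
    # A cloned page-pool "from" is treated as non-recyclable, so it no
    # longer matches a page-pool "to" and the merge is refused.
    return to.pp_recycle == (frm.pp_recycle and not frm.cloned)


def may_coalesce_after(to, frm):
    # Only the page-pool flags must agree; the cloned case is handled by
    # taking an extra page-pool reference on each "from" fragment.
    return to.pp_recycle == frm.pp_recycle


if __name__ == "__main__":
    to = Skb(pp_recycle=True, cloned=False)
    frm = Skb(pp_recycle=True, cloned=True)
    print(may_coalesce_before(to, frm), may_coalesce_after(to, frm))
```

      For the excluded combination (both skbs page-pool backed, "from"
      cloned) the first predicate refuses the merge and the second allows
      it, which is the case this patch aims to recover.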
      
      Following this optimization, the success rate of coalescing measured in
      our environment has improved as follows:
      
      With page pool: 60%
      
      This success rate is approaching the rate achieved without using page
      pool, and the performance has also been improved:
      
      With page pool: 2.52 Gbits/sec
      
      Below is the performance comparison for small packets before and after
      this optimization. We observe no impact to packets larger than 4K.
      
      packet size     before      after       improved
      (bytes)         (Gbits/sec) (Gbits/sec)
      128             1.19        1.27        7.13%
      256             2.26        2.52        11.75%
      512             4.13        4.81        16.50%
      1024            6.17        6.73        9.05%
      2048            14.54       15.47       6.45%
      4096            25.44       27.87       9.52%
      Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
      Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
      Suggested-by: Jason Wang <jasowang@redhat.com>
      Reviewed-by: Mina Almasry <almasrymina@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • skbuff: Add a function to check if a page belongs to page_pool · 8cfa2dee
      Liang Chen authored
      Wrap code for checking if a page is a page_pool page into a
      function for better readability and ease of reuse.
      Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
      Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
      Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
      Reviewed-by: Mina Almasry <almarsymina@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • page_pool: halve BIAS_MAX for multiple user references of a fragment · aaf153ae
      Liang Chen authored
      Up to now, we were only subtracting from the number of used page fragments
      to figure out when a page could be freed or recycled. A following patch
      introduces support for multiple users referencing the same fragment. So
      reduce the initial page fragments value to half to avoid overflowing.
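
      The overflow argument is plain arithmetic and can be checked with a
      few lines of Python. This sketch assumes a 64-bit signed long, an
      old bias of LONG_MAX, and a new bias of LONG_MAX >> 1 (the message
      only says the value is halved). `fits_in_long` and `headroom` are
      hypothetical helper names.

```python
LONG_MAX = 2**63 - 1   # assumption: 64-bit signed long
LONG_MIN = -(2**63)


def fits_in_long(value):
    """True if value is representable in a signed 64-bit long."""
    return LONG_MIN <= value <= LONG_MAX


def headroom(bias):
    """How many references can be added on top of bias without overflow."""
    return LONG_MAX - bias


if __name__ == "__main__":
    old_bias = LONG_MAX
    new_bias = LONG_MAX >> 1
    print(headroom(old_bias), headroom(new_bias))
```

      With the counter starting at LONG_MAX there is no room for even one
      increment; starting at half of that leaves 2**62 increments of
      headroom while still allowing subtraction as before.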
      Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
      Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
      Reviewed-by: Mina Almasry <almarsymina@google.com>
      Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'tcp-ao-selftests' · 66fe8963
      David S. Miller authored
      Dmitry Safonov says:
      
      ====================
      selftests/net: Add TCP-AO tests
      
      An essential part of any big kernel submission is selftests.
      At the beginning of the TCP-AO project, I made patches to fcnal-test.sh
      and nettest.c to have the benefits of easy refactoring, noticing
      breakages early, putting a moat around the code, documenting
      and designing uAPI.
      
      While tests based on fcnal-test.sh/nettest.c provided initial testing*
      and were very easy to add, the pile of TCP-AO quickly grew out of
      one-binary + shell-script testing.
      
      The design of the TCP-AO testing is a bit different than one-big
      selftest binary as I did previously in net/ipsec.c. I found it
      beneficial to avoid implementing a tests runner/scheduler and delegate
      it to the user or Makefile. The approach is very influenced
      by CRIU/ZDTM testing[1]: it provides a static library with helper
      functions and selftest binaries that create specific scenarios.
      I also tried to utilize kselftest.h.
      
      The test_init() function does all the needed preparations. To not leave
      any traces after a selftest exits, it creates a network namespace
      and, if the test wants to establish a TCP connection, a child netns.
      The parent and child netns have a veth pair with proper IP addresses
      and routes set up. Both peers, the client and the server, are different
      pthreads. The threading model was chosen over forking mostly for ease
      of cleanup on failure: no need to search for children, handle SIGCHLD,
      make sure not to wait for a dead peer to perform anything, etc.
      Any thread that does exit() naturally kills the tests, sweet!
      The selftests are compiled currently in two variants: ipv4 and ipv6.
      Ipv4-mapped-ipv6 addresses might be a third variant to add, but it's not
      there in this version. As pretty much all tests are shared between two
      address families, most of the code can be shared, too. To distinguish
      in code which kind of test is running, the Makefile supplies -DIPV6_TEST
      to the compiler, and ifdeffery in tests can do things that have to differ
      between address families. This is similar to TARGETS_C_BOTHBITS in x86
      selftests and also to test code sharing in CRIU/ZDTM.
      
      The total number of tests is 832.
      Of them, rst_ipv{4,6} currently has one flaky subtest that may fail:
      > not ok 9 client connection was not reset: 0
      I'll investigate what happens there. Also, unsigned-md5_ipv{4,6}
      are flaky because of netns counter checks: it doesn't expect that
      there may be retransmitted TCP segments from a previous sub-selftest.
      That will be fixed. Besides, key-management_ipv{4,6} has 3 sub-tests
      passing with XFAIL:
      > ok 15 # XFAIL listen() after current/rnext keys set: the socket has current/rnext keys: 100:200
      > ok 16 # XFAIL listen socket, delete current key from before listen(): failed to delete the key 100:100 -16
      > ok 17 # XFAIL listen socket, delete rnext key from before listen(): failed to delete the key 200:200 -16
      ...
      > # Totals: pass:117 fail:0 xfail:3 xpass:0 skip:0 error:0
      Those need some more kernel work to pass instead of xfail.
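
      The totals line above can be reproduced from the individual result
      lines. Below is a minimal Python sketch of such a tally, assuming
      the result-line shapes quoted in this message ("ok N ...",
      "not ok N ...", and "ok N # XFAIL ..."). `tally` is a hypothetical
      helper, not part of kselftest.

```python
import re

# Matches "ok 15 # XFAIL description" or "not ok 9 description".
LINE = re.compile(r"^(not ok|ok)\s+\d+(?:\s+#\s*(\w+))?")


def tally(lines):
    """Count pass/fail/xfail/skip over TAP-style result lines."""
    totals = {"pass": 0, "fail": 0, "xfail": 0, "skip": 0}
    for line in lines:
        # Drop the "> " quoting used in this message, if present.
        m = LINE.match(line.lstrip("> ").strip())
        if not m:
            continue  # plan line, comments, diagnostics
        status = m.group(1)
        directive = (m.group(2) or "").upper()
        if directive == "XFAIL":
            totals["xfail"] += 1
        elif directive == "SKIP":
            totals["skip"] += 1
        elif status == "ok":
            totals["pass"] += 1
        else:
            totals["fail"] += 1
    return totals
```

      An expected failure is reported as "ok" with an XFAIL directive, so
      it has to be checked before the plain "ok" case; otherwise it would
      be counted as a pass.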
      
      The overview of selftests (see the diffstat at the bottom):
      ├── lib
      │   ├── aolib.h
      │   │   The header for all selftests to include.
      │   ├── kconfig.c
      │   │   Kernel kconfig detector to SKIP tests that depend on something.
      │   ├── netlink.c
      │   │   Netlink helper to add/modify/delete VETH/IPs/routes/VRFs
      │   │   I considered just using libmnl, but this is around 400 lines
      │   │   and avoids selftests dependency on out-of-tree sources/packages.
      │   ├── proc.c
      │   │   SNMP/netstat procfs parser and the counters comparator.
      │   ├── repair.c
      │   │   Heavily influenced by libsoccr and reduced to minimum TCP
      │   │   socket checkpoint/repair. Shouldn't be used out of selftests,
      │   │   though.
      │   ├── setup.c
      │   │   All the needed netns/veth/ips/etc preparations for test init.
      │   ├── sock.c
      │   │   Socket helpers: {s,g}etsockopt()s/connect()/listen()/etc.
      │   └── utils.c
      │       Random stuff (a pun intended).
      ├── bench-lookups.c
      │   The only benchmark in selftests currently: checks how well TCP-AO
      │   setsockopt()s perform, depending on the amount of keys on a socket.
      ├── connect.c
      │   Trivial sample, can be used as a boilerplate to write a new test.
      ├── connect-deny.c
      │   More-or-less what could be expected for TCP-AO in fcnal-test.sh
      ├── icmps-accept.c -> icmps-discard.c
      ├── icmps-discard.c
      │   Verifies RFC5925 (7.8) by checking that TCP-AO connection can be
      │   broken if ICMPs are accepted and survives when ::accept_icmps = 0
      ├── key-management.c
      │   Key manipulations, rotations between randomized hashing algorithms
      │   and counter checks for those scenarios.
      ├── restore.c
      │   TCP_AO_REPAIR: verifies that a socket can be re-created without
      │   TCP-AO connection being interrupted.
      ├── rst.c
      │   As RST segments are signed on a separate code-path in kernel,
      │   verifies passive/active TCP send_reset().
      ├── self-connect.c
      │   Verifies that TCP self-connect and also simultaneous open work.
      ├── seq-ext.c
      │   Utilizes TCP_AO_REPAIR to check that on SEQ roll-over SNE
      │   increment is performed and segments with different SNEs fail to
      │   pass verification.
      ├── setsockopt-closed.c
      │   Checks that {s,g}etsockopt()s are extendable syscalls and common
      │   error-paths for them.
      └── unsigned-md5.c
          Checks listen() socket for (non-)matching peers with: AO/MD5/none
          keys. As well as their interaction with VRFs and AO_REQUIRED flag.
      
      There are certainly more test scenarios that can be added, but even so,
      I'm pretty happy that this much of TCP-AO functionality and uAPIs got
      covered. These selftests were iteratively developed by me during TCP-AO
      kernel upstreaming and the resulting kernel patches would have been
      worse without having these tests. They provided the user-side
      perspective but also allowed safer refactoring with less possibility
      of introducing a regression. Now it's time to use them to dig
      a moat around the TCP-AO code!
      
      There are also people from other network companies that work on TCP-AO
      (+testing), so sharing these selftests will allow them to contribute
      and may benefit from their efforts.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests/net: Add TCP-AO key-management test · 3c3ead55
      Dmitry Safonov authored
      Check multiple keys on a socket:
      - rotation on closed socket
      - current/rnext operations shouldn't be possible on listen sockets
      - the current/rnext key set should be the one that's used on connect()
      - key rotations with pseudo-random generated keys
      - copying matching keys on connect() and on accept()
      
      At this moment there are 3 tests that are "expected" to fail: a kernel
      fix is needed to improve the situation; they are marked XFAIL.
      
      Sample output:
      > # ./key-management_ipv4
      > 1..120
      > # 1601[lib/setup.c:239] rand seed 1700526653
      > TAP version 13
      > ok 1 closed socket, delete a key: the key was deleted
      > ok 2 closed socket, delete all keys: the key was deleted
      > ok 3 closed socket, delete current key: key deletion was prevented
      > ok 4 closed socket, delete rnext key: key deletion was prevented
      > ok 5 closed socket, delete a key + set current/rnext: the key was deleted
      > ok 6 closed socket, force-delete current key: the key was deleted
      > ok 7 closed socket, force-delete rnext key: the key was deleted
      > ok 8 closed socket, delete current+rnext key: key deletion was prevented
      > ok 9 closed socket, add + change current key
      > ok 10 closed socket, add + change rnext key
      > ok 11 listen socket, delete a key: the key was deleted
      > ok 12 listen socket, delete all keys: the key was deleted
      > ok 13 listen socket, setting current key not allowed
      > ok 14 listen socket, setting rnext key not allowed
      > ok 15 # XFAIL listen() after current/rnext keys set: the socket has current/rnext keys: 100:200
      > ok 16 # XFAIL listen socket, delete current key from before listen(): failed to delete the key 100:100 -16
      > ok 17 # XFAIL listen socket, delete rnext key from before listen(): failed to delete the key 200:200 -16
      > ok 18 listen socket, getsockopt(TCP_AO_REPAIR) is restricted
      > ok 19 listen socket, setsockopt(TCP_AO_REPAIR) is restricted
      > ok 20 listen socket, delete a key + set current/rnext: key deletion was prevented
      > ok 21 listen socket, force-delete current key: key deletion was prevented
      > ok 22 listen socket, force-delete rnext key: key deletion was prevented
      > ok 23 listen socket, delete a key: the key was deleted
      > ok 24 listen socket, add + change current key
      > ok 25 listen socket, add + change rnext key
      > ok 26 server: Check current/rnext keys unset before connect(): The socket keys are consistent with the expectations
      > ok 27 client: Check current/rnext keys unset before connect(): current key 19 as expected
      > ok 28 client: Check current/rnext keys unset before connect(): rnext key 146 as expected
      > ok 29 server: Check current/rnext keys unset before connect(): server alive
      > ok 30 server: Check current/rnext keys unset before connect(): passed counters checks
      > ok 31 client: Check current/rnext keys unset before connect(): The socket keys are consistent with the expectations
      > ok 32 server: Check current/rnext keys unset before connect(): The socket keys are consistent with the expectations
      > ok 33 server: Check current/rnext keys unset before connect(): passed counters checks
      > ok 34 client: Check current/rnext keys unset before connect(): passed counters checks
      > ok 35 server: Check current/rnext keys set before connect(): The socket keys are consistent with the expectations
      > ok 36 server: Check current/rnext keys set before connect(): server alive
      > ok 37 server: Check current/rnext keys set before connect(): passed counters checks
      > ok 38 client: Check current/rnext keys set before connect(): current key 10 as expected
      > ok 39 client: Check current/rnext keys set before connect(): rnext key 137 as expected
      > ok 40 server: Check current/rnext keys set before connect(): The socket keys are consistent with the expectations
      > ok 41 client: Check current/rnext keys set before connect(): The socket keys are consistent with the expectations
      > ok 42 client: Check current/rnext keys set before connect(): passed counters checks
      > ok 43 server: Check current/rnext keys set before connect(): passed counters checks
      > ok 44 server: Check current != rnext keys set before connect(): The socket keys are consistent with the expectations
      > ok 45 server: Check current != rnext keys set before connect(): server alive
      > ok 46 server: Check current != rnext keys set before connect(): passed counters checks
      > ok 47 client: Check current != rnext keys set before connect(): current key 10 as expected
      > ok 48 client: Check current != rnext keys set before connect(): rnext key 132 as expected
      > ok 49 server: Check current != rnext keys set before connect(): The socket keys are consistent with the expectations
      > ok 50 client: Check current != rnext keys set before connect(): The socket keys are consistent with the expectations
      > ok 51 client: Check current != rnext keys set before connect(): passed counters checks
      > ok 52 server: Check current != rnext keys set before connect(): passed counters checks
      > ok 53 server: Check current flapping back on peer's RnextKey request: The socket keys are consistent with the expectations
      > ok 54 server: Check current flapping back on peer's RnextKey request: server alive
      > ok 55 server: Check current flapping back on peer's RnextKey request: passed counters checks
      > ok 56 client: Check current flapping back on peer's RnextKey request: current key 10 as expected
      > ok 57 client: Check current flapping back on peer's RnextKey request: rnext key 132 as expected
      > ok 58 server: Check current flapping back on peer's RnextKey request: The socket keys are consistent with the expectations
      > ok 59 client: Check current flapping back on peer's RnextKey request: The socket keys are consistent with the expectations
      > ok 60 server: Check current flapping back on peer's RnextKey request: passed counters checks
      > ok 61 client: Check current flapping back on peer's RnextKey request: passed counters checks
      > ok 62 server: Rotate over all different keys: The socket keys are consistent with the expectations
      > ok 63 server: Rotate over all different keys: server alive
      > ok 64 server: Rotate over all different keys: passed counters checks
      > ok 65 server: Rotate over all different keys: current key 128 as expected
      > ok 66 client: Rotate over all different keys: rnext key 128 as expected
      > ok 67 server: Rotate over all different keys: current key 129 as expected
      > ok 68 client: Rotate over all different keys: rnext key 129 as expected
      > ok 69 server: Rotate over all different keys: current key 130 as expected
      > ok 70 client: Rotate over all different keys: rnext key 130 as expected
      > ok 71 server: Rotate over all different keys: current key 131 as expected
      > ok 72 client: Rotate over all different keys: rnext key 131 as expected
      > ok 73 server: Rotate over all different keys: current key 132 as expected
      > ok 74 client: Rotate over all different keys: rnext key 132 as expected
      > ok 75 server: Rotate over all different keys: current key 133 as expected
      > ok 76 client: Rotate over all different keys: rnext key 133 as expected
      > ok 77 server: Rotate over all different keys: current key 134 as expected
      > ok 78 client: Rotate over all different keys: rnext key 134 as expected
      > ok 79 server: Rotate over all different keys: current key 135 as expected
      > ok 80 client: Rotate over all different keys: rnext key 135 as expected
      > ok 81 server: Rotate over all different keys: current key 136 as expected
      > ok 82 client: Rotate over all different keys: rnext key 136 as expected
      > ok 83 server: Rotate over all different keys: current key 137 as expected
      > ok 84 client: Rotate over all different keys: rnext key 137 as expected
      > ok 85 server: Rotate over all different keys: current key 138 as expected
      > ok 86 client: Rotate over all different keys: rnext key 138 as expected
      > ok 87 server: Rotate over all different keys: current key 139 as expected
      > ok 88 client: Rotate over all different keys: rnext key 139 as expected
      > ok 89 server: Rotate over all different keys: current key 140 as expected
      > ok 90 client: Rotate over all different keys: rnext key 140 as expected
      > ok 91 server: Rotate over all different keys: current key 141 as expected
      > ok 92 client: Rotate over all different keys: rnext key 141 as expected
      > ok 93 server: Rotate over all different keys: current key 142 as expected
      > ok 94 client: Rotate over all different keys: rnext key 142 as expected
      > ok 95 server: Rotate over all different keys: current key 143 as expected
      > ok 96 client: Rotate over all different keys: rnext key 143 as expected
      > ok 97 server: Rotate over all different keys: current key 144 as expected
      > ok 98 client: Rotate over all different keys: rnext key 144 as expected
      > ok 99 server: Rotate over all different keys: current key 145 as expected
      > ok 100 client: Rotate over all different keys: rnext key 145 as expected
      > ok 101 server: Rotate over all different keys: current key 146 as expected
      > ok 102 client: Rotate over all different keys: rnext key 146 as expected
      > ok 103 server: Rotate over all different keys: current key 127 as expected
      > ok 104 client: Rotate over all different keys: rnext key 127 as expected
      > ok 105 client: Rotate over all different keys: current key 0 as expected
      > ok 106 client: Rotate over all different keys: rnext key 127 as expected
      > ok 107 server: Rotate over all different keys: The socket keys are consistent with the expectations
      > ok 108 client: Rotate over all different keys: The socket keys are consistent with the expectations
      > ok 109 client: Rotate over all different keys: passed counters checks
      > ok 110 server: Rotate over all different keys: passed counters checks
      > ok 111 server: Check accept() => established key matching: The socket keys are consistent with the expectations
      > ok 112 Can't add a key with non-matching ip-address for established sk
      > ok 113 Can't add a key with non-matching VRF for established sk
      > ok 114 server: Check accept() => established key matching: server alive
      > ok 115 server: Check accept() => established key matching: passed counters checks
      > ok 116 client: Check connect() => established key matching: current key 0 as expected
      > ok 117 client: Check connect() => established key matching: rnext key 128 as expected
      > ok 118 client: Check connect() => established key matching: The socket keys are consistent with the expectations
      > ok 119 server: Check accept() => established key matching: The socket keys are consistent with the expectations
      > ok 120 server: Check accept() => established key matching: passed counters checks
      > # Totals: pass:120 fail:0 xfail:0 xpass:0 skip:0 error:0
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3c3ead55
    • Dmitry Safonov's avatar
      selftests/net: Add TCP-AO selfconnect/simultaneous connect test · 8c4e8dd0
      Dmitry Safonov authored
      Check that self-connect, a rarely used TCP feature, works with TCP-AO.
      Under the hood this also exercises TCP simultaneous connect (the
      TCP_SYN_RECV socket state), which would be harder to check in other ways.
      
      To verify that the connection really goes through TCP simultaneous
      connect, check the TCPChallengeACK and TCPSYNChallenge counters.
      
      Sample of the output:
      > # ./self-connect_ipv6
      > 1..4
      > # 1738[lib/setup.c:254] rand seed 1696451931
      > TAP version 13
      > ok 1 self-connect(same keyids): connect TCPAOGood 0 => 24
      > ok 2 self-connect(different keyids): connect TCPAOGood 26 => 50
      > ok 3 self-connect(restore): connect TCPAOGood 52 => 97
      > ok 4 self-connect(restore, different keyids): connect TCPAOGood 99 => 144
      > # Totals: pass:4 fail:0 xfail:0 xpass:0 skip:0 error:0
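
The self-connect idea described above can be sketched outside the selftest as follows. This is a minimal illustration in Python, not the selftest's code: it uses plain TCP without TCP-AO keys, assumes Linux (where a socket connecting to its own bound address goes through simultaneous open) and assumes the two counters are exposed on the `TcpExt` lines of `/proc/net/netstat`. The helper names `parse_netstat` and `self_connect` are hypothetical.

```python
import socket


def parse_netstat(text, prefix="TcpExt"):
    """Parse the header/value line pair for one prefix of /proc/net/netstat."""
    rows = [line.split() for line in text.splitlines()
            if line.startswith(prefix + ":")]
    if len(rows) < 2:
        return {}
    names, values = rows[0][1:], rows[1][1:]
    return {name: int(value) for name, value in zip(names, values)}


def self_connect(payload=b"ping"):
    """Connect a TCP socket to its own address and echo data through it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(5)
        sock.bind(("127.0.0.1", 0))        # let the kernel pick a free port
        sock.connect(sock.getsockname())   # same address and port as the source
        sock.sendall(payload)
        return sock.recv(len(payload))     # the socket receives its own data


def read_counters(path="/proc/net/netstat"):
    with open(path) as f:
        return parse_netstat(f.read())


if __name__ == "__main__":
    before = read_counters()
    echoed = self_connect()
    after = read_counters()
    print("echoed:", echoed)
    # Print how the two counters named in the commit message changed.
    for name in ("TCPChallengeACK", "TCPSYNChallenge"):
        print(name, before.get(name, 0), "=>", after.get(name, 0))
```

Printing the before and after values mirrors the "TCPAOGood 0 => 24" style of the sample output; the sketch makes no claim about how much each counter changes.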
      Signed-off-by: Dmitry Safonov <dima@arista.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8c4e8dd0