1. 16 May, 2022 30 commits
    • ipv6: add READ_ONCE(sk->sk_bound_dev_if) in INET6_MATCH() · 5d368f03
      Eric Dumazet authored
      INET6_MATCH() runs without holding a lock on the socket.
      
      We probably need to annotate most reads.
      
      This patch makes INET6_MATCH() an inline function
      to ease our changes.
      
      v2: inline function only defined if IS_ENABLED(CONFIG_IPV6)
          Change the name to inet6_match(), this is no longer a macro.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • l2tp: use READ_ONCE() to fetch sk->sk_bound_dev_if · ff009403
      Eric Dumazet authored
      Use READ_ONCE() in paths not holding the socket lock.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net_sched: em_meta: add READ_ONCE() in var_sk_bound_if() · 70f87de9
      Eric Dumazet authored
      sk->sk_bound_dev_if can change under us, use READ_ONCE() annotation.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • inet: add READ_ONCE(sk->sk_bound_dev_if) in inet_csk_bind_conflict() · d2c13561
      Eric Dumazet authored
      inet_csk_bind_conflict() can access sk->sk_bound_dev_if for
      unlocked sockets.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dccp: use READ_ONCE() to read sk->sk_bound_dev_if · 36f7cec4
      Eric Dumazet authored
      When reading listener sk->sk_bound_dev_if locklessly,
      we must use READ_ONCE().
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: core: add READ_ONCE/WRITE_ONCE annotations for sk->sk_bound_dev_if · e5fccaa1
      Eric Dumazet authored
      sock_bindtoindex_locked() needs to use WRITE_ONCE(sk->sk_bound_dev_if, val),
      because other cpus/threads might locklessly read this field.
      
      sock_getbindtodevice(), sock_getsockopt() need READ_ONCE()
      because they run without socket lock held.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: read sk->sk_bound_dev_if once in inet_request_bound_dev_if() · fdb5fd7f
      Eric Dumazet authored
      inet_request_bound_dev_if() reads sk->sk_bound_dev_if twice
      while listener socket is not locked.
      
      Another cpu could change this field under us.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
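      A minimal userspace sketch of the pattern this patch and its siblings
      apply, assuming GCC or Clang for __typeof__. The READ_ONCE()/WRITE_ONCE()
      macros below are simplified stand-ins for the kernel's, and struct
      fake_sock and the three helpers are hypothetical names, not kernel code.

```c
#include <stdbool.h>

/* Simplified userspace stand-ins for the kernel macros (assumption:
 * a volatile access is enough to force exactly one load or store). */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical, trimmed-down socket holding only the field under discussion. */
struct fake_sock {
	int bound_dev_if;
};

/* Writer side: publish the new binding with a single store. */
static void fake_bind_to_index(struct fake_sock *sk, int ifindex)
{
	WRITE_ONCE(sk->bound_dev_if, ifindex);
}

/* Racy shape: the field is loaded twice, so the value tested and the
 * value returned can differ if another thread writes in between. */
static int bound_dev_if_twice(struct fake_sock *sk, int fallback)
{
	return sk->bound_dev_if ? sk->bound_dev_if : fallback;
}

/* Annotated shape: load once into a local, then use only the local. */
static int bound_dev_if_once(struct fake_sock *sk, int fallback)
{
	int bound = READ_ONCE(sk->bound_dev_if);

	return bound ? bound : fallback;
}
```

      Both readers return the same value when nothing races; only the second
      one is guaranteed to test and return the same snapshot of the field.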
    • sctp: read sk->sk_bound_dev_if once in sctp_rcv() · a20ea298
      Eric Dumazet authored
      sctp_rcv() reads sk->sk_bound_dev_if twice while the socket
      is not locked. Another cpu could change this field under us.
      
      Fixes: 0fd9a65a ("[SCTP] Support SO_BINDTODEVICE socket option on incoming packets.")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: annotate races around sk->sk_bound_dev_if · 4c971d2f
      Eric Dumazet authored
      UDP sendmsg() is lockless, and reads sk->sk_bound_dev_if while
      this field can be changed by another thread.
      
      Adds minimal annotations to avoid KCSAN splats for UDP.
      Following patches will add more annotations to potential lockless readers.
      
      BUG: KCSAN: data-race in __ip6_datagram_connect / udpv6_sendmsg
      
      write to 0xffff888136d47a94 of 4 bytes by task 7681 on cpu 0:
       __ip6_datagram_connect+0x6e2/0x930 net/ipv6/datagram.c:221
       ip6_datagram_connect+0x2a/0x40 net/ipv6/datagram.c:272
       inet_dgram_connect+0x107/0x190 net/ipv4/af_inet.c:576
       __sys_connect_file net/socket.c:1900 [inline]
       __sys_connect+0x197/0x1b0 net/socket.c:1917
       __do_sys_connect net/socket.c:1927 [inline]
       __se_sys_connect net/socket.c:1924 [inline]
       __x64_sys_connect+0x3d/0x50 net/socket.c:1924
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x2b/0x50 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      read to 0xffff888136d47a94 of 4 bytes by task 7670 on cpu 1:
       udpv6_sendmsg+0xc60/0x16e0 net/ipv6/udp.c:1436
       inet6_sendmsg+0x5f/0x80 net/ipv6/af_inet6.c:652
       sock_sendmsg_nosec net/socket.c:705 [inline]
       sock_sendmsg net/socket.c:725 [inline]
       ____sys_sendmsg+0x39a/0x510 net/socket.c:2413
       ___sys_sendmsg net/socket.c:2467 [inline]
       __sys_sendmmsg+0x267/0x4c0 net/socket.c:2553
       __do_sys_sendmmsg net/socket.c:2582 [inline]
       __se_sys_sendmmsg net/socket.c:2579 [inline]
       __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2579
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x2b/0x50 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      value changed: 0x00000000 -> 0xffffff9b
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 7670 Comm: syz-executor.3 Tainted: G        W         5.18.0-rc1-syzkaller-dirty #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      
      I chose not to add a Fixes: tag because the race has minor consequences
      and the stable teams are busy enough.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'big-tcp' · 7fa2e481
      David S. Miller authored
      Eric Dumazet says:
      
      ====================
      tcp: BIG TCP implementation
      
      This series implements BIG TCP as presented in netdev 0x15:
      
      https://netdevconf.info/0x15/session.html?BIG-TCP
      
      Jonathan Corbet made a nice summary: https://lwn.net/Articles/884104/
      
      The standard TSO/GRO packet limit is 64KB.
      
      With BIG TCP, we allow bigger TSO/GRO packet sizes for IPv6 traffic.
      
      Note that this feature is by default not enabled, because it might
      break some eBPF programs assuming TCP header immediately follows IPv6 header.
      
      While tcpdump recognizes the HBH/Jumbo header, standard pcap filters
      are unable to skip over IPv6 extension headers.
      
      Reducing the number of packets traversing the networking stack usually improves
      performance, as shown in this experiment using a 100Gbit NIC and a 4K MTU.
      
      'Standard' performance with current (64KB) limits:
      for i in {1..10}; do ./netperf -t TCP_RR -H iroa23  -- -r80000,80000 -O MIN_LATENCY,P90_LATENCY,P99_LATENCY,THROUGHPUT|tail -1; done
      77           138          183          8542.19
      79           143          178          8215.28
      70           117          164          9543.39
      80           144          176          8183.71
      78           126          155          9108.47
      80           146          184          8115.19
      71           113          165          9510.96
      74           113          164          9518.74
      79           137          178          8575.04
      73           111          171          9561.73
      
      Now enable BIG TCP on both hosts.
      
      ip link set dev eth0 gro_max_size 185000 gso_max_size 185000
      for i in {1..10}; do ./netperf -t TCP_RR -H iroa23  -- -r80000,80000 -O MIN_LATENCY,P90_LATENCY,P99_LATENCY,THROUGHPUT|tail -1; done
      57           83           117          13871.38
      64           118          155          11432.94
      65           116          148          11507.62
      60           105          136          12645.15
      60           103          135          12760.34
      60           102          134          12832.64
      62           109          132          10877.68
      58           82           115          14052.93
      57           83           124          14212.58
      57           82           119          14196.01
      
      We see an increase of transactions per second, and lower latencies as well.
      
      v7: adopt unsafe_memcpy() in mlx5 to avoid FORTIFY warnings.
      
      v6: fix a compilation error for CONFIG_IPV6=n in
          "net: allow gso_max_size to exceed 65536", reported by kernel bots.
      
      v5: Replaced two patches (that were adding new attributes) with patches
          from Alexander Duyck. Idea is to reuse existing gso_max_size/gro_max_size
      
      v4: Rebased on top of Jakub series (Merge branch 'tso-gso-limit-split')
          max_tso_size is now family independent.
      
      v3: Fixed a typo in RFC number (Alexander)
          Added Reviewed-by: tags from Tariq on mlx4/mlx5 parts.
      
      v2: Removed the MAX_SKB_FRAGS change, this belongs to a different series.
          Addressed feedback from Alexander and the NVIDIA folks.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
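      The two netperf runs quoted in the cover letter above can be summarised
      by averaging the THROUGHPUT column of each run. The sketch below only
      re-uses the ten values per run printed there; the array and function
      names are illustrative.

```c
#include <stddef.h>

#define RUNS 10

/* THROUGHPUT column (transactions per second), standard limits. */
static const double standard_tps[RUNS] = {
	8542.19, 8215.28, 9543.39, 8183.71, 9108.47,
	8115.19, 9510.96, 9518.74, 8575.04, 9561.73,
};

/* THROUGHPUT column after gro_max_size/gso_max_size were set to 185000. */
static const double big_tcp_tps[RUNS] = {
	13871.38, 11432.94, 11507.62, 12645.15, 12760.34,
	12832.64, 10877.68, 14052.93, 14212.58, 14196.01,
};

/* Arithmetic mean of n samples; returns 0 for an empty set. */
static double mean(const double *v, size_t n)
{
	double sum = 0.0;

	for (size_t i = 0; i < n; i++)
		sum += v[i];
	return n ? sum / (double)n : 0.0;
}

/* Ratio of the mean BIG TCP rate to the mean standard rate. */
static double big_tcp_speedup(void)
{
	return mean(big_tcp_tps, RUNS) / mean(standard_tps, RUNS);
}
```

      On these samples the ratio comes out at a little over 1.4, which is
      consistent with the "increase of transactions per second" noted in the
      cover letter. Ten samples per run is a small set, so treat the figure as
      indicative only.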
    • mlx5: support BIG TCP packets · de78960e
      Eric Dumazet authored
      mlx5 supports LSOv2.
      
      IPv6 gro/tcp stacks insert a temporary Hop-by-Hop header
      with JUMBO TLV for big packets.
      
      We need to ignore/skip this HBH header when populating TX descriptor.
      
      Note that ipv6_has_hopopt_jumbo() only recognizes very specific packet
      layout, thus mlx5e_sq_xmit_wqe() is taking care of this layout only.
      
      v7: adopt unsafe_memcpy() and MLX5_UNSAFE_MEMCPY_DISCLAIMER
      v2: clear hopbyhop in mlx5e_tx_get_gso_ihs()
      v4: fix compile error for CONFIG_MLX5_CORE_IPOIB=y
      Signed-off-by: Coco Li <lixiaoyan@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Cc: Saeed Mahameed <saeedm@nvidia.com>
      Cc: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlx4: support BIG TCP packets · 1169a642
      Eric Dumazet authored
      mlx4 supports LSOv2 just fine.
      
      IPv6 stack inserts a temporary Hop-by-Hop header
      with JUMBO TLV for big packets.
      
      We need to ignore the HBH header when populating TX descriptor.
      
      Tested:
      
      Before: (not enabling bigger TSO/GRO packets)
      
      ip link set dev eth0 gso_max_size 65536 gro_max_size 65536
      
      netperf -H lpaa18 -t TCP_RR -T2,2 -l 10 -Cc -- -r 70000,70000
      MIGRATED TCP REQUEST/RESPONSE TEST from ::0 (::) port 0 AF_INET6 to lpaa18.prod.google.com () port 0 AF_INET6 : first burst 0 : cpu bind
      Local /Remote
      Socket Size   Request Resp.  Elapsed Trans.   CPU    CPU    S.dem   S.dem
      Send   Recv   Size    Size   Time    Rate     local  remote local   remote
      bytes  bytes  bytes   bytes  secs.   per sec  % S    % S    us/Tr   us/Tr
      
      262144 540000 70000   70000  10.00   6591.45  0.86   1.34   62.490  97.446
      262144 540000
      
      After: (enabling bigger TSO/GRO packets)
      
      ip link set dev eth0 gso_max_size 185000 gro_max_size 185000
      
      netperf -H lpaa18 -t TCP_RR -T2,2 -l 10 -Cc -- -r 70000,70000
      MIGRATED TCP REQUEST/RESPONSE TEST from ::0 (::) port 0 AF_INET6 to lpaa18.prod.google.com () port 0 AF_INET6 : first burst 0 : cpu bind
      Local /Remote
      Socket Size   Request Resp.  Elapsed Trans.   CPU    CPU    S.dem   S.dem
      Send   Recv   Size    Size   Time    Rate     local  remote local   remote
      bytes  bytes  bytes   bytes  secs.   per sec  % S    % S    us/Tr   us/Tr
      
      262144 540000 70000   70000  10.00   8383.95  0.95   1.01   54.432  57.584
      262144 540000
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Acked-by: Alexander Duyck <alexanderduyck@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • veth: enable BIG TCP packets · d406099d
      Eric Dumazet authored
      Set the TSO driver limit to GSO_MAX_SIZE (512 KB).
      
      This allows the admin/user to set a GSO limit up to this value.
      
      ip link set dev veth10 gso_max_size 200000
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Alexander Duyck <alexanderduyck@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: loopback: enable BIG TCP packets · d6f938ce
      Eric Dumazet authored
      Set the driver limit to GSO_MAX_SIZE (512 KB).
      
      This allows the admin/user to set a GSO limit up to this value.
      
      Tested:
      
      ip link set dev lo gso_max_size 200000
      netperf -H ::1 -t TCP_RR -l 100 -- -r 80000,80000 &
      
      tcpdump shows :
      
      18:28:42.962116 IP6 ::1 > ::1: HBH 40051 > 63780: Flags [P.], seq 3626480001:3626560001, ack 3626560001, win 17743, options [nop,nop,TS val 3771179265 ecr 3771179265], length 80000
      18:28:42.962138 IP6 ::1.63780 > ::1.40051: Flags [.], ack 3626560001, win 17743, options [nop,nop,TS val 3771179265 ecr 3771179265], length 0
      18:28:42.962152 IP6 ::1 > ::1: HBH 63780 > 40051: Flags [P.], seq 3626560001:3626640001, ack 3626560001, win 17743, options [nop,nop,TS val 3771179265 ecr 3771179265], length 80000
      18:28:42.962157 IP6 ::1.40051 > ::1.63780: Flags [.], ack 3626640001, win 17743, options [nop,nop,TS val 3771179265 ecr 3771179265], length 0
      18:28:42.962180 IP6 ::1 > ::1: HBH 40051 > 63780: Flags [P.], seq 3626560001:3626640001, ack 3626640001, win 17743, options [nop,nop,TS val 3771179265 ecr 3771179265], length 80000
      18:28:42.962214 IP6 ::1.63780 > ::1.40051: Flags [.], ack 3626640001, win 17743, options [nop,nop,TS val 3771179266 ecr 3771179265], length 0
      18:28:42.962228 IP6 ::1 > ::1: HBH 63780 > 40051: Flags [P.], seq 3626640001:3626720001, ack 3626640001, win 17743, options [nop,nop,TS val 3771179266 ecr 3771179265], length 80000
      18:28:42.962233 IP6 ::1.40051 > ::1.63780: Flags [.], ack 3626720001, win 17743, options [nop,nop,TS val 3771179266 ecr 3771179266], length 0
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Alexander Duyck <alexanderduyck@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Add hop-by-hop header to jumbograms in ip6_output · 80e425b6
      Coco Li authored
      Instead of simply forcing a 0 payload_len in IPv6 header,
      implement RFC 2675 and insert a custom extension header.
      
      Note that only the TCP stack can currently generate jumbograms,
      and that this extension header is purely local:
      it won't be sent on a physical link.
      
      This is needed so that packet capture (tcpdump and friends)
      can properly dissect these large packets.
      Signed-off-by: Coco Li <lixiaoyan@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Alexander Duyck <alexanderduyck@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: allow gro_max_size to exceed 65536 · 0fe79f28
      Alexander Duyck authored
      Allow gro_max_size to exceed 65536.
      
      There weren't really any external limitations that prevented this other
      than the fact that IPv4 only supports a 16 bit length field. Since we have
      the option of adding a hop-by-hop header for IPv6 we can allow IPv6 to
      exceed this value and for IPv4 and non-TCP flows we can cap things at 65536
      via a constant rather than relying on gro_max_size.
      
      [edumazet] limit GRO_MAX_SIZE to (8 * 65535) to avoid overflows.
      Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6/gro: insert temporary HBH/jumbo header · 81fbc812
      Eric Dumazet authored
      Following patch will add GRO_IPV6_MAX_SIZE, allowing gro to build
      BIG TCP ipv6 packets (bigger than 64K).
      
      This patch changes ipv6_gro_complete() to insert a HBH/jumbo header
      so that resulting packet can go through IPv6/TCP stacks.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Alexander Duyck <alexanderduyck@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6/gso: remove temporary HBH/jumbo header · 09f3d1a3
      Eric Dumazet authored
      ipv6 tcp and gro stacks will soon be able to build big TCP packets,
      with an added temporary Hop By Hop header.
      
      If GSO is involved for these large packets, we need to remove
      the temporary HBH header before segmentation happens.
      
      v2: perform HBH removal from ipv6_gso_segment() instead of
          skb_segment() (Alexander feedback)
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Alexander Duyck <alexanderduyck@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: add struct hop_jumbo_hdr definition · 7c96d8ec
      Eric Dumazet authored
      Following patches will need to add and remove local IPv6 jumbogram
      options to enable BIG TCP.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Alexander Duyck <alexanderduyck@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
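      As a rough illustration of what such a header carries, here is a
      userspace sketch of an 8-byte Hop-by-Hop header holding the RFC 2675
      Jumbo Payload option. The struct layout, the sketch_ names and the
      helper functions are assumptions made for this example; the commit
      message does not show the kernel definition.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NEXTHDR_TCP 6    /* IANA protocol number for TCP */
#define SKETCH_TLV_JUMBO   0xC2 /* RFC 2675 Jumbo Payload option type */

/* Assumed layout: generic Hop-by-Hop header followed by one Jumbo option. */
struct sketch_hop_jumbo_hdr {
	uint8_t nexthdr;              /* protocol that follows this header */
	uint8_t hdrlen;               /* length in 8-octet units beyond the first 8 */
	uint8_t tlv_type;             /* SKETCH_TLV_JUMBO */
	uint8_t tlv_len;              /* option data length: 4 */
	uint8_t jumbo_payload_len[4]; /* 32-bit length, network byte order */
};

/* Fill the header for an upper-layer payload of l4_len bytes. Per RFC 2675
 * the jumbo length counts everything after the IPv6 header, so it includes
 * this Hop-by-Hop header itself. */
static void sketch_fill_hop_jumbo(struct sketch_hop_jumbo_hdr *h,
				  uint8_t nexthdr, uint32_t l4_len)
{
	uint32_t total = l4_len + (uint32_t)sizeof(*h);

	h->nexthdr = nexthdr;
	h->hdrlen = 0;
	h->tlv_type = SKETCH_TLV_JUMBO;
	h->tlv_len = 4;
	h->jumbo_payload_len[0] = (uint8_t)(total >> 24);
	h->jumbo_payload_len[1] = (uint8_t)(total >> 16);
	h->jumbo_payload_len[2] = (uint8_t)(total >> 8);
	h->jumbo_payload_len[3] = (uint8_t)total;
}

/* Decode the big-endian jumbo length back into host order. */
static uint32_t sketch_jumbo_len(const struct sketch_hop_jumbo_hdr *h)
{
	return ((uint32_t)h->jumbo_payload_len[0] << 24) |
	       ((uint32_t)h->jumbo_payload_len[1] << 16) |
	       ((uint32_t)h->jumbo_payload_len[2] << 8) |
	       (uint32_t)h->jumbo_payload_len[3];
}
```

      Storing the length as four bytes keeps the sketch independent of host
      endianness and of any system networking headers.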
    • tcp_cubic: make hystart_ack_delay() aware of BIG TCP · 9957b38b
      Eric Dumazet authored
      hystart_ack_delay() had the assumption that a TSO packet
      would not be bigger than GSO_MAX_SIZE.
      
      This will no longer be true.
      
      We should use sk->sk_gso_max_size instead.
      
      This reduces chances of spurious Hystart ACK train detections.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Alexander Duyck <alexanderduyck@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
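      To see why the per-socket limit matters here, consider a sketch that
      estimates the ACK delay as the time needed to send a few maximum-size
      TSO packets at the pacing rate, capped at 1 ms. The factor of 4, the
      cap and the function name are assumptions for illustration; the commit
      message only states that sk->sk_gso_max_size replaces GSO_MAX_SIZE.

```c
#include <stdint.h>

#define SKETCH_USEC_PER_MSEC 1000ULL
#define SKETCH_USEC_PER_SEC  1000000ULL

/* Estimated ACK delay in microseconds for a flow paced at pacing_rate
 * bytes per second whose TSO packets can reach gso_max_size bytes.
 * Returns 0 when no pacing rate is known. */
static uint64_t sketch_hystart_ack_delay(uint64_t pacing_rate,
					 uint32_t gso_max_size)
{
	uint64_t delay;

	if (!pacing_rate)
		return 0;

	/* Time to transmit four maximum-size TSO packets (assumed factor). */
	delay = (uint64_t)gso_max_size * 4 * SKETCH_USEC_PER_SEC / pacing_rate;

	return delay < SKETCH_USEC_PER_MSEC ? delay : SKETCH_USEC_PER_MSEC;
}
```

      Under these assumptions, a larger per-socket limit yields a larger
      tolerated delay at the same pacing rate, which is the direction the
      commit message describes.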
    • net: limit GSO_MAX_SIZE to 524280 bytes · 34b92e8d
      Eric Dumazet authored
      Make sure we will not overflow shinfo->gso_segs.
      
      The minimal TCP MSS is 8 bytes and shinfo->gso_segs is a 16-bit field,
      so the limit is 8 * 65535 = 524280 bytes.
      
      TCP_MIN_GSO_SIZE is currently defined in include/net/tcp.h;
      it seems cleaner not to bring TCP details into include/linux/netdevice.h.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Alexander Duyck <alexanderduyck@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
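      The bound in the title follows from the two numbers given in the
      message: an 8-byte minimal MSS and a 16-bit segment counter. A short
      sketch of that arithmetic, with illustrative macro and function names:

```c
#include <stdint.h>

#define SKETCH_MIN_MSS      8u         /* minimal TCP MSS, in bytes */
#define SKETCH_GSO_SEGS_MAX UINT16_MAX /* largest value a 16-bit field holds */

/* Largest packet whose segment count still fits in 16 bits. */
#define SKETCH_GSO_MAX_SIZE (SKETCH_MIN_MSS * SKETCH_GSO_SEGS_MAX)

/* Number of segments needed to carry `bytes` at `mss` bytes per segment,
 * rounding up for a final partial segment. */
static uint32_t sketch_gso_segs(uint32_t bytes, uint32_t mss)
{
	return (bytes + mss - 1) / mss;
}
```

      A packet one byte larger than the limit would need 65536 segments at
      the minimal MSS, which no longer fits in the 16-bit counter.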
    • net: allow gso_max_size to exceed 65536 · 7c4e983c
      Alexander Duyck authored
      The code for gso_max_size was originally added to allow debugging and
      working around buggy devices that couldn't support TSO with blocks 64K in
      size. The original reason for limiting it to 64K was that this was the
      existing limit of the IPv4 and non-jumbogram IPv6 length fields.
      
      With the addition of Big TCP we can remove this limit and allow the value
      to potentially go up to UINT_MAX and instead be limited by the tso_max_size
      value.
      
      So in order to support this we need to go through and clean up the
      remaining users of the gso_max_size value so that the values will cap at
      64K for non-TCPv6 flows. In addition we can clean up the GSO_MAX_SIZE value
      so that 64K becomes GSO_LEGACY_MAX_SIZE and UINT_MAX will now be the upper
      limit for GSO_MAX_SIZE.
      
      v6: (edumazet) fixed a compile error if CONFIG_IPV6=n,
                     in a new sk_trim_gso_size() helper.
                     netif_set_tso_max_size() caps the requested TSO size
                     with GSO_MAX_SIZE.
      Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
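      The capping rule described above can be sketched as a small pure
      function. This is a hypothetical helper written for illustration, not
      the sk_trim_gso_size() helper mentioned in the v6 note, whose body is
      not shown here; the boolean parameter stands in for whatever checks the
      kernel performs to identify a TCP over IPv6 flow.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_GSO_LEGACY_MAX_SIZE 65536u /* the former 64K ceiling */

/* Return the GSO size a flow may use given the device's gso_max_size.
 * Only TCP over IPv6 flows may exceed the legacy 64K limit. */
static uint32_t sketch_trim_gso_size(uint32_t dev_gso_max_size,
				     bool is_tcp_over_ipv6)
{
	if (dev_gso_max_size <= SKETCH_GSO_LEGACY_MAX_SIZE)
		return dev_gso_max_size;

	return is_tcp_over_ipv6 ? dev_gso_max_size
				: SKETCH_GSO_LEGACY_MAX_SIZE;
}
```

      With a device limit of 185000, a TCP over IPv6 flow keeps the full
      value while any other flow is capped at 65536; a device limit already
      below 64K passes through unchanged.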
    • net: add IFLA_TSO_{MAX_SIZE|SEGS} attributes · 89527be8
      Eric Dumazet authored
      New netlink attributes IFLA_TSO_MAX_SIZE and IFLA_TSO_MAX_SEGS
      are used to report to user-space the device TSO limits.
      
      ip -d link sh dev eth1
      ...
         tso_max_size 65536 tso_max_segs 65535
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Alexander Duyck <alexanderduyck@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'Renesas-RSZ-V2M-support' · 5cf15ce3
      David S. Miller authored
      Phil Edworthy says:
      
      ====================
      Add Renesas RZ/V2M Ethernet support
      
      The RZ/V2M Ethernet is very similar to R-Car Gen3 Ethernet-AVB, though
      some small parts are the same as R-Car Gen2.
      Other differences are:
      * It has separate data (DI), error (Line 1) and management (Line 2) irqs
        rather than one irq for all three.
      * Instead of using the High-speed peripheral bus clock for gPTP, it has
        a separate gPTP reference clock.
      
      v4:
       * Add clk_disable_unprepare() for gptp ref clk
      
      v3:
       * Really renamed irq_en_dis_regs to irq_en_dis this time
       * Modified ravb_ptp_extts() to use irq_en_dis
       * Added Reviewed-by tags
      
      v2:
       * Just net patches in this series
       * Instead of reusing ch22 and ch24 interrupt names, use the proper names
       * Renamed irq_en_dis_regs to irq_en_dis
       * Squashed use of GIC reg versus GIE/GID and got rid of separate gptp_ptm_gic feature.
       * Move err_mgmt_irqs code under multi_irqs
       * Minor editing of the commit msgs
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ravb: Add support for RZ/V2M · e1154be7
      Phil Edworthy authored
      RZ/V2M Ethernet is very similar to R-Car Gen3 Ethernet-AVB, though
      some small parts are the same as R-Car Gen2.
      Other differences to R-Car Gen3 and Gen2 are:
      * It has separate data (DI), error (Line 1) and management (Line 2) irqs
        rather than one irq for all three.
      * Instead of using the High-speed peripheral bus clock for gPTP, it has a
        separate gPTP reference clock.
      Signed-off-by: Phil Edworthy <phil.edworthy@renesas.com>
      Reviewed-by: Biju Das <biju.das.jz@bp.renesas.com>
      Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ravb: Use separate clock for gPTP · 72069a7b
      Phil Edworthy authored
      RZ/V2M has a separate gPTP reference clock that is used when the
      AVB-DMAC Mode Register (CCC) gPTP Clock Select (CSEL) bits are
      set to "01: High-speed peripheral bus clock".
      Therefore, add a feature that allows this clock to be used for
      gPTP.
      Signed-off-by: Phil Edworthy <phil.edworthy@renesas.com>
      Reviewed-by: Biju Das <biju.das.jz@bp.renesas.com>
      Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ravb: Support separate Line0 (Desc), Line1 (Err) and Line2 (Mgmt) irqs · b0265dcb
      Phil Edworthy authored
      R-Car has a combined interrupt line, ch22 = Line0_DiA | Line1_A | Line2_A.
      RZ/V2M has separate interrupt lines for each of these, so add a feature
      that allows the driver to get these interrupts and call the common handler.
      Signed-off-by: Phil Edworthy <phil.edworthy@renesas.com>
      Reviewed-by: Biju Das <biju.das.jz@bp.renesas.com>
      Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ravb: Separate handling of irq enable/disable regs into feature · cb99badd
      Phil Edworthy authored
      Currently, when the HW has a single interrupt, the driver uses the
      GIC, TIC, RIC0 registers to enable and disable interrupts.
      When the HW has multiple interrupts, it uses the GIE, GID, TIE, TID,
      RIE0, RID0 registers.
      
      However, other devices, e.g. RZ/V2M, have multiple irqs and only have
      the GIC, TIC, RIC0 registers.
      Therefore, split this into a separate feature.
      Signed-off-by: Phil Edworthy <phil.edworthy@renesas.com>
      Reviewed-by: Biju Das <biju.das.jz@bp.renesas.com>
      Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dt-bindings: net: renesas,etheravb: Document RZ/V2M SoC · a7931ac1
      Phil Edworthy authored
      Document the Ethernet AVB IP found on RZ/V2M SoC.
      It includes the Ethernet controller (E-MAC) and Dedicated Direct memory
      access controller (DMAC) for transferring transmitted Ethernet frames
      to and received Ethernet frames from respective storage areas in the
      RAM at high speed.
      The AVB-DMAC is compliant with IEEE 802.1BA, IEEE 802.1AS timing and
      synchronization protocol, IEEE 802.1Qav real-time transfer, and the
      IEEE 802.1Qat stream reservation protocol.
      
      R-Car has a pair of combined interrupt lines:
       ch22 = Line0_DiA | Line1_A | Line2_A
       ch23 = Line0_DiB | Line1_B | Line2_B
      Line0 for descriptor interrupts (which we call dia and dib).
      Line1 for error related interrupts (which we call err_a and err_b).
      Line2 for management and gPTP related interrupts (mgmt_a and mgmt_b).
      
      RZ/V2M hardware has separate interrupt lines for each of these.
      
      It has 3 clocks; the main AXI clock, the AMBA CHI (Coherent Hub
      Interface) clock and a gPTP reference clock.
      Signed-off-by: Phil Edworthy <phil.edworthy@renesas.com>
      Reviewed-by: Biju Das <biju.das.jz@bp.renesas.com>
      Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next · 1a01a075
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter updates for net-next
      
      This is v2 including deadlock fix in conntrack ecache rework
      reported by Jakub Kicinski.
      
      The following patchset contains Netfilter updates for net-next,
      mostly updates to conntrack from Florian Westphal.
      
      1) Add a dedicated list for conntrack event redelivery.
      
      2) Include event redelivery list in conntrack dumps of dying type.
      
      3) Remove per-cpu dying list for event redelivery, not used anymore.
      
      4) Add netns .pre_exit to cttimeout to zap timeout objects before
         synchronize_rcu() call.
      
      5) Remove nf_ct_unconfirmed_destroy.
      
      6) Add generation id for conntrack extensions for conntrack
         timeout and helpers.
      
      7) Detach timeout policy from conntrack on cttimeout module removal.
      
      8) Remove __nf_ct_unconfirmed_destroy.
      
      9) Remove unconfirmed list.
      
      10) Remove unconditional local_bh_disable in init_conntrack().
      
      11) Consolidate conntrack iterator nf_ct_iterate_cleanup().
      
      12) Detect if ctnetlink listeners exist to short-circuit event
          path early.
      
      13) Un-inline nf_ct_ecache_ext_add().
      
      14) Add nf_conntrack_events autodetect ctnetlink listener mode
          and make it default.
      
      15) Add nf_ct_ecache_exist() to check for event cache extension.
      
      16) Extend flowtable reverse route lookup to include source, iif,
          tos and mark, from Sven Auhagen.
      
      17) Do not verify zero checksum UDP packets in nf_reject,
          from Kevin Mitchell.
      
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 14 May, 2022 3 commits
  3. 13 May, 2022 7 commits