Commit 5db5b395 authored by David S. Miller's avatar David S. Miller

Merge branch 'ipv6-sr'

David Lebrun says:

====================
net: add support for IPv6 Segment Routing

v5:
 - Check SRH validity when adding a new route with lwtunnels and
   when setting an IPV6_RTHDR socket option.
 - Check that hdr->segments_left is not out of bounds when processing
   an SR-enabled packet.
 - Add __ro_after_init attribute to seg6_genl_policy structure.
 - Add CONFIG_IPV6_SEG6_INLINE option to enable or disable
   direct header insertion.

v4:
 - Change @cleanup in ipv6_srh_rcv() from int to bool
 - Move checksum helper functions into header file
 - Add common definition for SR TLVs
 - Add comments for HMAC computation algorithm
 - Use rhashtable to store HMAC infos instead of linked list
 - Remove packed attribute for struct sr6_tlv_hmac
 - Use dst cache only if CONFIG_DST_CACHE is enabled

v3:
 - Fix compilation for CONFIG_IPV6={n,m}

v2:
 - Remove packed attribute from sr6 struct and replaced unaligned
   16-bit flags with two 8-bit flags.
 - SR code now included by default. Option CONFIG_IPV6_SEG6_HMAC
   exists for HMAC support (which requires crypto dependencies).
 - Replace "hidden" calls to mutex_{un,}lock to direct calls.
 - Fix reverse xmas tree coding style.
 - Fix cast-from-void*'s.
 - Update skb->csum to account for SR modifications.
 - Add dst_cache in seg6_output.

Segment Routing (SR) is a source routing paradigm, architecturally
defined in draft-ietf-spring-segment-routing-09 [1]. The IPv6 flavor of
SR is defined in draft-ietf-6man-segment-routing-header-02 [2].

The main idea is that an SR-enabled packet contains a list of segments,
which represent mandatory waypoints. Each waypoint is called a segment
endpoint. The SR-enabled packet is routed normally (e.g. shortest path)
between the segment endpoints. A node that inserts an SRH into a packet
is called an ingress node, and a node that is the last segment endpoint
is called an egress node.

From an IPv6 viewpoint, an SR-enabled packet contains an IPv6 extension
header, which is a Routing Header type 4, defined as follows:

struct ipv6_sr_hdr {
        __u8    nexthdr;
        __u8    hdrlen;
        __u8    type;
        __u8    segments_left;
        __u8    first_segment;
        __u8    flag_1;
        __u8    flag_2;
        __u8    reserved;

        struct in6_addr segments[0];
};

The first 4 bytes of the SRH is consistent with the Routing Header
definition in RFC 2460. The type is set to `4' (SRH).

Each segment is encoded as an IPv6 address. The segments are encoded in
reverse order: segments[0] is the last segment of the path, and
segments[first_segment] is the first segment of the path.

segments[segments_left] points to the currently active segment and
segments_left is decremented at each segment endpoint.

There exist two ways for a packet to receive an SRH, we call them
encap mode and inline mode. In the encap mode, the packet is encapsulated
in an outer IPv6 header that contains the SRH. The inner (original) packet
is not modified. A virtual tunnel is thus created between the ingress node
(the node that encapsulates) and the egress node (the last segment of the path).
Once an encapsulated SR packet reaches the egress node, the node decapsulates
the packet and performs a routing decision on the inner packet. This kind of
SRH insertion is intended to use for routers that encapsulates in-transit
packet.

The second SRH insertion method, the inline mode, acts by directly inserting
the SRH right after the IPv6 header of the original packet. For this method,
if a particular flag (SR6_FLAG_CLEANUP) is set, then the penultimate segment
endpoint must strip the SRH from the packet before forwarding it to the last
segment endpoint. This insertion method is intended to use for endhosts,
however it is also used for in-transit packets by some industry actors.
Note that directly inserting extension headers may break several mechanisms
such as Path MTU Discovery, IPSec AH, etc. For this reason, this insertion
method is only available if CONFIG_IPV6_SEG6_INLINE is enabled.

Finally, the SRH may contain TLVs after the segments list. Several types of
TLVs are defined, but we currently consider only the HMAC TLV. This TLV is
an answer to the deprecation of the RH0 and enables to ensure the authenticity
and integrity of the SRH. The HMAC text contains the flags, the first_segment
index, the full list of segments, and the source address of the packet. While
SR is intended to use mostly within a single administrative domain, the HMAC
TLV allows to verify SR packets coming from an untrusted source.

This patches series implements support for the IPv6 flavor of SR and is
logically divided into the following components:

        (1) Data plane support (patch 01). This patch adds a function
            in net/ipv6/exthdrs.c to handle the Routing Header type 4.
            It enables the kernel to act as a segment endpoint, by supporting
            the following operations: decrementation of the segments_left field,
            cleanup flag support (removal of the SRH if we are the penultimate
            segment endpoint) and decapsulation of the inner packet as an egress
            node.

        (2) Control plane support (patches 02..03 and 07..09). These patches enables
            to insert SRH on locally emitted and/or forwarded packets, both with
            encap mode and with inline mode. The SRH insertion is controlled through
            the lightweight tunnels mechanism. Furthermore, patch 08 enables the
            applications to insert an SRH on a per-socket basis, through the
            setsockopt() system call. The mechanism to specify a per-socket
            Routing Header was already defined for RH0 and no special modification
            was performed on this side. However, the code to actually push the RH
            onto the packets had to be adapted for the SRH specifications.

        (3) HMAC support (patches 04..06). These patches adds the support of the
            HMAC TLV verification for the dataplane part, and generation for
            the control plane part. Two hashing algorithms are supported
            (SHA-1 as legacy and SHA-256 as required by the IETF draft), but
            additional algorithms can be easily supported by simply adding an
            entry into an array.

[1] https://tools.ietf.org/html/draft-ietf-spring-segment-routing-09
[2] https://tools.ietf.org/html/draft-ietf-6man-segment-routing-header-02
====================
Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
parents dc0b2c9c 8bc66a44
/proc/sys/net/conf/<iface>/seg6_* variables:
seg6_enabled - BOOL
Accept or drop SR-enabled IPv6 packets on this interface.
Relevant packets are those with SRH present and DA = local.
0 - disabled (default)
not 0 - enabled
seg6_require_hmac - INTEGER
Define HMAC policy for ingress SR-enabled packets on this interface.
-1 - Ignore HMAC field
0 - Accept SR packets without HMAC, validate SR packets with HMAC
1 - Drop SR packets without HMAC, validate SR packets with HMAC
Default is 0.
......@@ -64,6 +64,10 @@ struct ipv6_devconf {
} stable_secret;
__s32 use_oif_addrs_only;
__s32 keep_addr_on_down;
__s32 seg6_enabled;
#ifdef CONFIG_IPV6_SEG6_HMAC
__s32 seg6_require_hmac;
#endif
struct ctl_table_header *sysctl_header;
};
......
#ifndef _LINUX_SEG6_H
#define _LINUX_SEG6_H
#include <uapi/linux/seg6.h>
#endif
#ifndef _LINUX_SEG6_GENL_H
#define _LINUX_SEG6_GENL_H
#include <uapi/linux/seg6_genl.h>
#endif
#ifndef _LINUX_SEG6_HMAC_H
#define _LINUX_SEG6_HMAC_H
#include <uapi/linux/seg6_hmac.h>
#endif
#ifndef _LINUX_SEG6_IPTUNNEL_H
#define _LINUX_SEG6_IPTUNNEL_H
#include <uapi/linux/seg6_iptunnel.h>
#endif
......@@ -932,7 +932,8 @@ int ip6_local_out(struct net *net, struct sock *sk, struct sk_buff *skb);
*/
void ipv6_push_nfrag_opts(struct sk_buff *skb, struct ipv6_txoptions *opt,
u8 *proto, struct in6_addr **daddr_p);
u8 *proto, struct in6_addr **daddr_p,
struct in6_addr *saddr);
void ipv6_push_frag_opts(struct sk_buff *skb, struct ipv6_txoptions *opt,
u8 *proto);
......
......@@ -85,6 +85,7 @@ struct netns_ipv6 {
#endif
atomic_t dev_addr_genid;
atomic_t fib6_sernum;
struct seg6_pernet_data *seg6_data;
};
#if IS_ENABLED(CONFIG_NF_DEFRAG_IPV6)
......
/*
* SR-IPv6 implementation
*
* Author:
* David Lebrun <david.lebrun@uclouvain.be>
*
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version
* 2 of the License, or (at your option) any later version.
*/
#ifndef _NET_SEG6_H
#define _NET_SEG6_H
#include <linux/net.h>
#include <linux/ipv6.h>
#include <net/lwtunnel.h>
#include <linux/seg6.h>
#include <linux/rhashtable.h>
static inline void update_csum_diff4(struct sk_buff *skb, __be32 from,
__be32 to)
{
__be32 diff[] = { ~from, to };
skb->csum = ~csum_partial((char *)diff, sizeof(diff), ~skb->csum);
}
static inline void update_csum_diff16(struct sk_buff *skb, __be32 *from,
__be32 *to)
{
__be32 diff[] = {
~from[0], ~from[1], ~from[2], ~from[3],
to[0], to[1], to[2], to[3],
};
skb->csum = ~csum_partial((char *)diff, sizeof(diff), ~skb->csum);
}
struct seg6_pernet_data {
struct mutex lock;
struct in6_addr __rcu *tun_src;
#ifdef CONFIG_IPV6_SEG6_HMAC
struct rhashtable hmac_infos;
#endif
};
static inline struct seg6_pernet_data *seg6_pernet(struct net *net)
{
return net->ipv6.seg6_data;
}
extern int seg6_init(void);
extern void seg6_exit(void);
extern int seg6_iptunnel_init(void);
extern void seg6_iptunnel_exit(void);
extern bool seg6_validate_srh(struct ipv6_sr_hdr *srh, int len);
#endif
/*
* SR-IPv6 implementation
*
* Author:
* David Lebrun <david.lebrun@uclouvain.be>
*
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version
* 2 of the License, or (at your option) any later version.
*/
#ifndef _NET_SEG6_HMAC_H
#define _NET_SEG6_HMAC_H
#include <net/flow.h>
#include <net/ip6_fib.h>
#include <net/sock.h>
#include <linux/ip.h>
#include <linux/ipv6.h>
#include <linux/route.h>
#include <net/seg6.h>
#include <linux/seg6_hmac.h>
#include <linux/rhashtable.h>
#define SEG6_HMAC_MAX_DIGESTSIZE 160
#define SEG6_HMAC_RING_SIZE 256
struct seg6_hmac_info {
struct rhash_head node;
struct rcu_head rcu;
u32 hmackeyid;
char secret[SEG6_HMAC_SECRET_LEN];
u8 slen;
u8 alg_id;
};
struct seg6_hmac_algo {
u8 alg_id;
char name[64];
struct crypto_shash * __percpu *tfms;
struct shash_desc * __percpu *shashs;
};
extern int seg6_hmac_compute(struct seg6_hmac_info *hinfo,
struct ipv6_sr_hdr *hdr, struct in6_addr *saddr,
u8 *output);
extern struct seg6_hmac_info *seg6_hmac_info_lookup(struct net *net, u32 key);
extern int seg6_hmac_info_add(struct net *net, u32 key,
struct seg6_hmac_info *hinfo);
extern int seg6_hmac_info_del(struct net *net, u32 key);
extern int seg6_push_hmac(struct net *net, struct in6_addr *saddr,
struct ipv6_sr_hdr *srh);
extern bool seg6_hmac_validate_skb(struct sk_buff *skb);
extern int seg6_hmac_init(void);
extern void seg6_hmac_exit(void);
extern int seg6_hmac_net_init(struct net *net);
extern void seg6_hmac_net_exit(struct net *net);
#endif
......@@ -39,6 +39,7 @@ struct in6_ifreq {
#define IPV6_SRCRT_STRICT 0x01 /* Deprecated; will be removed */
#define IPV6_SRCRT_TYPE_0 0 /* Deprecated; will be removed */
#define IPV6_SRCRT_TYPE_2 2 /* IPv6 type 2 Routing Header */
#define IPV6_SRCRT_TYPE_4 4 /* Segment Routing with IPv6 */
/*
* routing header
......@@ -178,6 +179,8 @@ enum {
DEVCONF_DROP_UNSOLICITED_NA,
DEVCONF_KEEP_ADDR_ON_DOWN,
DEVCONF_RTR_SOLICIT_MAX_INTERVAL,
DEVCONF_SEG6_ENABLED,
DEVCONF_SEG6_REQUIRE_HMAC,
DEVCONF_MAX
};
......
......@@ -9,6 +9,7 @@ enum lwtunnel_encap_types {
LWTUNNEL_ENCAP_IP,
LWTUNNEL_ENCAP_ILA,
LWTUNNEL_ENCAP_IP6,
LWTUNNEL_ENCAP_SEG6,
__LWTUNNEL_ENCAP_MAX,
};
......
/*
* SR-IPv6 implementation
*
* Author:
* David Lebrun <david.lebrun@uclouvain.be>
*
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version
* 2 of the License, or (at your option) any later version.
*/
#ifndef _UAPI_LINUX_SEG6_H
#define _UAPI_LINUX_SEG6_H
/*
* SRH
*/
struct ipv6_sr_hdr {
__u8 nexthdr;
__u8 hdrlen;
__u8 type;
__u8 segments_left;
__u8 first_segment;
__u8 flag_1;
__u8 flag_2;
__u8 reserved;
struct in6_addr segments[0];
};
#define SR6_FLAG1_CLEANUP (1 << 7)
#define SR6_FLAG1_PROTECTED (1 << 6)
#define SR6_FLAG1_OAM (1 << 5)
#define SR6_FLAG1_ALERT (1 << 4)
#define SR6_FLAG1_HMAC (1 << 3)
#define SR6_TLV_INGRESS 1
#define SR6_TLV_EGRESS 2
#define SR6_TLV_OPAQUE 3
#define SR6_TLV_PADDING 4
#define SR6_TLV_HMAC 5
#define sr_has_cleanup(srh) ((srh)->flag_1 & SR6_FLAG1_CLEANUP)
#define sr_has_hmac(srh) ((srh)->flag_1 & SR6_FLAG1_HMAC)
struct sr6_tlv {
__u8 type;
__u8 len;
__u8 data[0];
};
#endif
#ifndef _UAPI_LINUX_SEG6_GENL_H
#define _UAPI_LINUX_SEG6_GENL_H
#define SEG6_GENL_NAME "SEG6"
#define SEG6_GENL_VERSION 0x1
enum {
SEG6_ATTR_UNSPEC,
SEG6_ATTR_DST,
SEG6_ATTR_DSTLEN,
SEG6_ATTR_HMACKEYID,
SEG6_ATTR_SECRET,
SEG6_ATTR_SECRETLEN,
SEG6_ATTR_ALGID,
SEG6_ATTR_HMACINFO,
__SEG6_ATTR_MAX,
};
#define SEG6_ATTR_MAX (__SEG6_ATTR_MAX - 1)
enum {
SEG6_CMD_UNSPEC,
SEG6_CMD_SETHMAC,
SEG6_CMD_DUMPHMAC,
SEG6_CMD_SET_TUNSRC,
SEG6_CMD_GET_TUNSRC,
__SEG6_CMD_MAX,
};
#define SEG6_CMD_MAX (__SEG6_CMD_MAX - 1)
#endif
#ifndef _UAPI_LINUX_SEG6_HMAC_H
#define _UAPI_LINUX_SEG6_HMAC_H
#include <linux/seg6.h>
#define SEG6_HMAC_SECRET_LEN 64
#define SEG6_HMAC_FIELD_LEN 32
struct sr6_tlv_hmac {
struct sr6_tlv tlvhdr;
__u16 reserved;
__be32 hmackeyid;
__u8 hmac[SEG6_HMAC_FIELD_LEN];
};
enum {
SEG6_HMAC_ALGO_SHA1 = 1,
SEG6_HMAC_ALGO_SHA256 = 2,
};
#endif
/*
* SR-IPv6 implementation
*
* Author:
* David Lebrun <david.lebrun@uclouvain.be>
*
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version
* 2 of the License, or (at your option) any later version.
*/
#ifndef _UAPI_LINUX_SEG6_IPTUNNEL_H
#define _UAPI_LINUX_SEG6_IPTUNNEL_H
enum {
SEG6_IPTUNNEL_UNSPEC,
SEG6_IPTUNNEL_SRH,
__SEG6_IPTUNNEL_MAX,
};
#define SEG6_IPTUNNEL_MAX (__SEG6_IPTUNNEL_MAX - 1)
struct seg6_iptunnel_encap {
int mode;
struct ipv6_sr_hdr srh[0];
};
#define SEG6_IPTUN_ENCAP_SIZE(x) ((sizeof(*x)) + (((x)->srh->hdrlen + 1) << 3))
enum {
SEG6_IPTUN_MODE_INLINE,
SEG6_IPTUN_MODE_ENCAP,
};
static inline size_t seg6_lwt_headroom(struct seg6_iptunnel_encap *tuninfo)
{
int encap = (tuninfo->mode == SEG6_IPTUN_MODE_ENCAP);
return ((tuninfo->srh->hdrlen + 1) << 3) +
(encap * sizeof(struct ipv6hdr));
}
#endif
......@@ -39,6 +39,8 @@ static const char *lwtunnel_encap_str(enum lwtunnel_encap_types encap_type)
return "MPLS";
case LWTUNNEL_ENCAP_ILA:
return "ILA";
case LWTUNNEL_ENCAP_SEG6:
return "SEG6";
case LWTUNNEL_ENCAP_IP6:
case LWTUNNEL_ENCAP_IP:
case LWTUNNEL_ENCAP_NONE:
......
......@@ -289,4 +289,28 @@ config IPV6_PIMSM_V2
Support for IPv6 PIM multicast routing protocol PIM-SMv2.
If unsure, say N.
config IPV6_SEG6_INLINE
bool "IPv6: direct Segment Routing Header insertion "
depends on IPV6
---help---
Support for direct insertion of the Segment Routing Header,
also known as inline mode. Be aware that direct insertion of
extension headers (as opposed to encapsulation) may break
multiple mechanisms such as PMTUD or IPSec AH. Use this feature
only if you know exactly what you are doing.
If unsure, say N.
config IPV6_SEG6_HMAC
bool "IPv6: Segment Routing HMAC support"
depends on IPV6
select CRYPTO_HMAC
select CRYPTO_SHA1
select CRYPTO_SHA256
---help---
Support for HMAC signature generation and verification
of SR-enabled packets.
If unsure, say N.
endif # IPV6
......@@ -9,7 +9,7 @@ ipv6-objs := af_inet6.o anycast.o ip6_output.o ip6_input.o addrconf.o \
route.o ip6_fib.o ipv6_sockglue.o ndisc.o udp.o udplite.o \
raw.o icmp.o mcast.o reassembly.o tcp_ipv6.o ping.o \
exthdrs.o datagram.o ip6_flowlabel.o inet6_connection_sock.o \
udp_offload.o
udp_offload.o seg6.o seg6_iptunnel.o
ipv6-offload := ip6_offload.o tcpv6_offload.o exthdrs_offload.o
......@@ -44,6 +44,7 @@ obj-$(CONFIG_IPV6_SIT) += sit.o
obj-$(CONFIG_IPV6_TUNNEL) += ip6_tunnel.o
obj-$(CONFIG_IPV6_GRE) += ip6_gre.o
obj-$(CONFIG_IPV6_FOU) += fou6.o
obj-$(CONFIG_IPV6_SEG6_HMAC) += seg6_hmac.o
obj-y += addrconf_core.o exthdrs_core.o ip6_checksum.o ip6_icmp.o
obj-$(CONFIG_INET) += output_core.o protocol.o $(ipv6-offload)
......
......@@ -238,6 +238,10 @@ static struct ipv6_devconf ipv6_devconf __read_mostly = {
.use_oif_addrs_only = 0,
.ignore_routes_with_linkdown = 0,
.keep_addr_on_down = 0,
.seg6_enabled = 0,
#ifdef CONFIG_IPV6_SEG6_HMAC
.seg6_require_hmac = 0,
#endif
};
static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = {
......@@ -284,6 +288,10 @@ static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = {
.use_oif_addrs_only = 0,
.ignore_routes_with_linkdown = 0,
.keep_addr_on_down = 0,
.seg6_enabled = 0,
#ifdef CONFIG_IPV6_SEG6_HMAC
.seg6_require_hmac = 0,
#endif
};
/* Check if a valid qdisc is available */
......@@ -4944,6 +4952,10 @@ static inline void ipv6_store_devconf(struct ipv6_devconf *cnf,
array[DEVCONF_DROP_UNICAST_IN_L2_MULTICAST] = cnf->drop_unicast_in_l2_multicast;
array[DEVCONF_DROP_UNSOLICITED_NA] = cnf->drop_unsolicited_na;
array[DEVCONF_KEEP_ADDR_ON_DOWN] = cnf->keep_addr_on_down;
array[DEVCONF_SEG6_ENABLED] = cnf->seg6_enabled;
#ifdef CONFIG_IPV6_SEG6_HMAC
array[DEVCONF_SEG6_REQUIRE_HMAC] = cnf->seg6_require_hmac;
#endif
}
static inline size_t inet6_ifla6_size(void)
......@@ -6035,6 +6047,22 @@ static const struct ctl_table addrconf_sysctl[] = {
.proc_handler = proc_dointvec,
},
{
.procname = "seg6_enabled",
.data = &ipv6_devconf.seg6_enabled,
.maxlen = sizeof(int),
.mode = 0644,
.proc_handler = proc_dointvec,
},
#ifdef CONFIG_IPV6_SEG6_HMAC
{
.procname = "seg6_require_hmac",
.data = &ipv6_devconf.seg6_require_hmac,
.maxlen = sizeof(int),
.mode = 0644,
.proc_handler = proc_dointvec,
},
#endif
{
/* sentinel */
}
......
......@@ -61,6 +61,7 @@
#include <net/ip6_tunnel.h>
#endif
#include <net/calipso.h>
#include <net/seg6.h>
#include <asm/uaccess.h>
#include <linux/mroute6.h>
......@@ -991,6 +992,10 @@ static int __init inet6_init(void)
if (err)
goto calipso_fail;
err = seg6_init();
if (err)
goto seg6_fail;
#ifdef CONFIG_SYSCTL
err = ipv6_sysctl_register();
if (err)
......@@ -1001,8 +1006,10 @@ static int __init inet6_init(void)
#ifdef CONFIG_SYSCTL
sysctl_fail:
calipso_exit();
seg6_exit();
#endif
seg6_fail:
calipso_exit();
calipso_fail:
pingv6_exit();
pingv6_fail:
......
......@@ -47,6 +47,11 @@
#if IS_ENABLED(CONFIG_IPV6_MIP6)
#include <net/xfrm.h>
#endif
#include <linux/seg6.h>
#include <net/seg6.h>
#ifdef CONFIG_IPV6_SEG6_HMAC
#include <net/seg6_hmac.h>
#endif
#include <linux/uaccess.h>
......@@ -286,6 +291,182 @@ static int ipv6_destopt_rcv(struct sk_buff *skb)
return -1;
}
static void seg6_update_csum(struct sk_buff *skb)
{
struct ipv6_sr_hdr *hdr;
struct in6_addr *addr;
__be32 from, to;
/* srh is at transport offset and seg_left is already decremented
* but daddr is not yet updated with next segment
*/
hdr = (struct ipv6_sr_hdr *)skb_transport_header(skb);
addr = hdr->segments + hdr->segments_left;
hdr->segments_left++;
from = *(__be32 *)hdr;
hdr->segments_left--;
to = *(__be32 *)hdr;
/* update skb csum with diff resulting from seg_left decrement */
update_csum_diff4(skb, from, to);
/* compute csum diff between current and next segment and update */
update_csum_diff16(skb, (__be32 *)(&ipv6_hdr(skb)->daddr),
(__be32 *)addr);
}
static int ipv6_srh_rcv(struct sk_buff *skb)
{
struct inet6_skb_parm *opt = IP6CB(skb);
struct net *net = dev_net(skb->dev);
struct ipv6_sr_hdr *hdr;
struct inet6_dev *idev;
struct in6_addr *addr;
bool cleanup = false;
int accept_seg6;
hdr = (struct ipv6_sr_hdr *)skb_transport_header(skb);
idev = __in6_dev_get(skb->dev);
accept_seg6 = net->ipv6.devconf_all->seg6_enabled;
if (accept_seg6 > idev->cnf.seg6_enabled)
accept_seg6 = idev->cnf.seg6_enabled;
if (!accept_seg6) {
kfree_skb(skb);
return -1;
}
#ifdef CONFIG_IPV6_SEG6_HMAC
if (!seg6_hmac_validate_skb(skb)) {
kfree_skb(skb);
return -1;
}
#endif
looped_back:
if (hdr->segments_left > 0) {
if (hdr->nexthdr != NEXTHDR_IPV6 && hdr->segments_left == 1 &&
sr_has_cleanup(hdr))
cleanup = true;
} else {
if (hdr->nexthdr == NEXTHDR_IPV6) {
int offset = (hdr->hdrlen + 1) << 3;
skb_postpull_rcsum(skb, skb_network_header(skb),
skb_network_header_len(skb));
if (!pskb_pull(skb, offset)) {
kfree_skb(skb);
return -1;
}
skb_postpull_rcsum(skb, skb_transport_header(skb),
offset);
skb_reset_network_header(skb);
skb_reset_transport_header(skb);
skb->encapsulation = 0;
__skb_tunnel_rx(skb, skb->dev, net);
netif_rx(skb);
return -1;
}
opt->srcrt = skb_network_header_len(skb);
opt->lastopt = opt->srcrt;
skb->transport_header += (hdr->hdrlen + 1) << 3;
opt->nhoff = (&hdr->nexthdr) - skb_network_header(skb);
return 1;
}
if (hdr->segments_left >= (hdr->hdrlen >> 1)) {
__IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)),
IPSTATS_MIB_INHDRERRORS);
icmpv6_param_prob(skb, ICMPV6_HDR_FIELD,
((&hdr->segments_left) -
skb_network_header(skb)));
kfree_skb(skb);
return -1;
}
if (skb_cloned(skb)) {
if (pskb_expand_head(skb, 0, 0, GFP_ATOMIC)) {
__IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)),
IPSTATS_MIB_OUTDISCARDS);
kfree_skb(skb);
return -1;
}
}
hdr = (struct ipv6_sr_hdr *)skb_transport_header(skb);
hdr->segments_left--;
addr = hdr->segments + hdr->segments_left;
skb_push(skb, sizeof(struct ipv6hdr));
if (skb->ip_summed == CHECKSUM_COMPLETE)
seg6_update_csum(skb);
ipv6_hdr(skb)->daddr = *addr;
if (cleanup) {
int srhlen = (hdr->hdrlen + 1) << 3;
int nh = hdr->nexthdr;
skb_pull_rcsum(skb, sizeof(struct ipv6hdr) + srhlen);
memmove(skb_network_header(skb) + srhlen,
skb_network_header(skb),
(unsigned char *)hdr - skb_network_header(skb));
skb->network_header += srhlen;
ipv6_hdr(skb)->nexthdr = nh;
ipv6_hdr(skb)->payload_len = htons(skb->len -
sizeof(struct ipv6hdr));
skb_push_rcsum(skb, sizeof(struct ipv6hdr));
}
skb_dst_drop(skb);
ip6_route_input(skb);
if (skb_dst(skb)->error) {
dst_input(skb);
return -1;
}
if (skb_dst(skb)->dev->flags & IFF_LOOPBACK) {
if (ipv6_hdr(skb)->hop_limit <= 1) {
__IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)),
IPSTATS_MIB_INHDRERRORS);
icmpv6_send(skb, ICMPV6_TIME_EXCEED,
ICMPV6_EXC_HOPLIMIT, 0);
kfree_skb(skb);
return -1;
}
ipv6_hdr(skb)->hop_limit--;
/* be sure that srh is still present before reinjecting */
if (!cleanup) {
skb_pull(skb, sizeof(struct ipv6hdr));
goto looped_back;
}
skb_set_transport_header(skb, sizeof(struct ipv6hdr));
IP6CB(skb)->nhoff = offsetof(struct ipv6hdr, nexthdr);
}
dst_input(skb);
return -1;
}
/********************************
Routing header.
********************************/
......@@ -326,6 +507,10 @@ static int ipv6_rthdr_rcv(struct sk_buff *skb)
return -1;
}
/* segment routing */
if (hdr->type == IPV6_SRCRT_TYPE_4)
return ipv6_srh_rcv(skb);
looped_back:
if (hdr->segments_left == 0) {
switch (hdr->type) {
......@@ -679,9 +864,9 @@ int ipv6_parse_hopopts(struct sk_buff *skb)
* for headers.
*/
static void ipv6_push_rthdr(struct sk_buff *skb, u8 *proto,
static void ipv6_push_rthdr0(struct sk_buff *skb, u8 *proto,
struct ipv6_rt_hdr *opt,
struct in6_addr **addr_p)
struct in6_addr **addr_p, struct in6_addr *saddr)
{
struct rt0_hdr *phdr, *ihdr;
int hops;
......@@ -704,6 +889,62 @@ static void ipv6_push_rthdr(struct sk_buff *skb, u8 *proto,
*proto = NEXTHDR_ROUTING;
}
static void ipv6_push_rthdr4(struct sk_buff *skb, u8 *proto,
struct ipv6_rt_hdr *opt,
struct in6_addr **addr_p, struct in6_addr *saddr)
{
struct ipv6_sr_hdr *sr_phdr, *sr_ihdr;
int plen, hops;
sr_ihdr = (struct ipv6_sr_hdr *)opt;
plen = (sr_ihdr->hdrlen + 1) << 3;
sr_phdr = (struct ipv6_sr_hdr *)skb_push(skb, plen);
memcpy(sr_phdr, sr_ihdr, sizeof(struct ipv6_sr_hdr));
hops = sr_ihdr->first_segment + 1;
memcpy(sr_phdr->segments + 1, sr_ihdr->segments + 1,
(hops - 1) * sizeof(struct in6_addr));
sr_phdr->segments[0] = **addr_p;
*addr_p = &sr_ihdr->segments[hops - 1];
#ifdef CONFIG_IPV6_SEG6_HMAC
if (sr_has_hmac(sr_phdr)) {
struct net *net = NULL;
if (skb->dev)
net = dev_net(skb->dev);
else if (skb->sk)
net = sock_net(skb->sk);
WARN_ON(!net);
if (net)
seg6_push_hmac(net, saddr, sr_phdr);
}
#endif
sr_phdr->nexthdr = *proto;
*proto = NEXTHDR_ROUTING;
}
static void ipv6_push_rthdr(struct sk_buff *skb, u8 *proto,
struct ipv6_rt_hdr *opt,
struct in6_addr **addr_p, struct in6_addr *saddr)
{
switch (opt->type) {
case IPV6_SRCRT_TYPE_0:
ipv6_push_rthdr0(skb, proto, opt, addr_p, saddr);
break;
case IPV6_SRCRT_TYPE_4:
ipv6_push_rthdr4(skb, proto, opt, addr_p, saddr);
break;
default:
break;
}
}
static void ipv6_push_exthdr(struct sk_buff *skb, u8 *proto, u8 type, struct ipv6_opt_hdr *opt)
{
struct ipv6_opt_hdr *h = (struct ipv6_opt_hdr *)skb_push(skb, ipv6_optlen(opt));
......@@ -715,10 +956,10 @@ static void ipv6_push_exthdr(struct sk_buff *skb, u8 *proto, u8 type, struct ipv
void ipv6_push_nfrag_opts(struct sk_buff *skb, struct ipv6_txoptions *opt,
u8 *proto,
struct in6_addr **daddr)
struct in6_addr **daddr, struct in6_addr *saddr)
{
if (opt->srcrt) {
ipv6_push_rthdr(skb, proto, opt->srcrt, daddr);
ipv6_push_rthdr(skb, proto, opt->srcrt, daddr, saddr);
/*
* IPV6_RTHDRDSTOPTS is ignored
* unless IPV6_RTHDR is set (RFC3542).
......@@ -945,7 +1186,22 @@ struct in6_addr *fl6_update_dst(struct flowi6 *fl6,
return NULL;
*orig = fl6->daddr;
switch (opt->srcrt->type) {
case IPV6_SRCRT_TYPE_0:
fl6->daddr = *((struct rt0_hdr *)opt->srcrt)->addr;
break;
case IPV6_SRCRT_TYPE_4:
{
struct ipv6_sr_hdr *srh = (struct ipv6_sr_hdr *)opt->srcrt;
fl6->daddr = srh->segments[srh->first_segment];
break;
}
default:
return NULL;
}
return orig;
}
EXPORT_SYMBOL_GPL(fl6_update_dst);
......@@ -203,7 +203,8 @@ int ip6_xmit(const struct sock *sk, struct sk_buff *skb, struct flowi6 *fl6,
if (opt->opt_flen)
ipv6_push_frag_opts(skb, opt, &proto);
if (opt->opt_nflen)
ipv6_push_nfrag_opts(skb, opt, &proto, &first_hop);
ipv6_push_nfrag_opts(skb, opt, &proto, &first_hop,
&fl6->saddr);
}
skb_push(skb, sizeof(struct ipv6hdr));
......@@ -1672,7 +1673,7 @@ struct sk_buff *__ip6_make_skb(struct sock *sk,
if (opt && opt->opt_flen)
ipv6_push_frag_opts(skb, opt, &proto);
if (opt && opt->opt_nflen)
ipv6_push_nfrag_opts(skb, opt, &proto, &final_dst);
ipv6_push_nfrag_opts(skb, opt, &proto, &final_dst, &fl6->saddr);
skb_push(skb, sizeof(struct ipv6hdr));
skb_reset_network_header(skb);
......
......@@ -1157,7 +1157,7 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield,
if (encap_limit >= 0) {
init_tel_txopt(&opt, encap_limit);
ipv6_push_nfrag_opts(skb, &opt.ops, &proto, NULL);
ipv6_push_nfrag_opts(skb, &opt.ops, &proto, NULL, NULL);
}
/* Calculate max headroom for all the headers and adjust
......
......@@ -52,6 +52,7 @@
#include <net/udplite.h>
#include <net/xfrm.h>
#include <net/compat.h>
#include <net/seg6.h>
#include <asm/uaccess.h>
......@@ -430,6 +431,15 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
break;
#endif
case IPV6_SRCRT_TYPE_4:
{
struct ipv6_sr_hdr *srh = (struct ipv6_sr_hdr *)
opt->srcrt;
if (!seg6_validate_srh(srh, optlen))
goto sticky_done;
break;
}
default:
goto sticky_done;
}
......
This diff is collapsed.
This diff is collapsed.
/*
* SR-IPv6 implementation
*
* Author:
* David Lebrun <david.lebrun@uclouvain.be>
*
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version
* 2 of the License, or (at your option) any later version.
*/
#include <linux/types.h>
#include <linux/skbuff.h>
#include <linux/net.h>
#include <linux/module.h>
#include <net/ip.h>
#include <net/lwtunnel.h>
#include <net/netevent.h>
#include <net/netns/generic.h>
#include <net/ip6_fib.h>
#include <net/route.h>
#include <net/seg6.h>
#include <linux/seg6.h>
#include <linux/seg6_iptunnel.h>
#include <net/addrconf.h>
#include <net/ip6_route.h>
#ifdef CONFIG_DST_CACHE
#include <net/dst_cache.h>
#endif
#ifdef CONFIG_IPV6_SEG6_HMAC
#include <net/seg6_hmac.h>
#endif
struct seg6_lwt {
#ifdef CONFIG_DST_CACHE
struct dst_cache cache;
#endif
struct seg6_iptunnel_encap tuninfo[0];
};
static inline struct seg6_lwt *seg6_lwt_lwtunnel(struct lwtunnel_state *lwt)
{
return (struct seg6_lwt *)lwt->data;
}
static inline struct seg6_iptunnel_encap *
seg6_encap_lwtunnel(struct lwtunnel_state *lwt)
{
return seg6_lwt_lwtunnel(lwt)->tuninfo;
}
static const struct nla_policy seg6_iptunnel_policy[SEG6_IPTUNNEL_MAX + 1] = {
[SEG6_IPTUNNEL_SRH] = { .type = NLA_BINARY },
};
int nla_put_srh(struct sk_buff *skb, int attrtype,
struct seg6_iptunnel_encap *tuninfo)
{
struct seg6_iptunnel_encap *data;
struct nlattr *nla;
int len;
len = SEG6_IPTUN_ENCAP_SIZE(tuninfo);
nla = nla_reserve(skb, attrtype, len);
if (!nla)
return -EMSGSIZE;
data = nla_data(nla);
memcpy(data, tuninfo, len);
return 0;
}
static void set_tun_src(struct net *net, struct net_device *dev,
struct in6_addr *daddr, struct in6_addr *saddr)
{
struct seg6_pernet_data *sdata = seg6_pernet(net);
struct in6_addr *tun_src;
rcu_read_lock();
tun_src = rcu_dereference(sdata->tun_src);
if (!ipv6_addr_any(tun_src)) {
memcpy(saddr, tun_src, sizeof(struct in6_addr));
} else {
ipv6_dev_get_saddr(net, dev, daddr, IPV6_PREFER_SRC_PUBLIC,
saddr);
}
rcu_read_unlock();
}
/* encapsulate an IPv6 packet within an outer IPv6 header with a given SRH */
static int seg6_do_srh_encap(struct sk_buff *skb, struct ipv6_sr_hdr *osrh)
{
struct net *net = dev_net(skb_dst(skb)->dev);
struct ipv6hdr *hdr, *inner_hdr;
struct ipv6_sr_hdr *isrh;
int hdrlen, tot_len, err;
hdrlen = (osrh->hdrlen + 1) << 3;
tot_len = hdrlen + sizeof(*hdr);
err = pskb_expand_head(skb, tot_len, 0, GFP_ATOMIC);
if (unlikely(err))
return err;
inner_hdr = ipv6_hdr(skb);
skb_push(skb, tot_len);
skb_reset_network_header(skb);
skb_mac_header_rebuild(skb);
hdr = ipv6_hdr(skb);
/* inherit tc, flowlabel and hlim
* hlim will be decremented in ip6_forward() afterwards and
* decapsulation will overwrite inner hlim with outer hlim
*/
ip6_flow_hdr(hdr, ip6_tclass(ip6_flowinfo(inner_hdr)),
ip6_flowlabel(inner_hdr));
hdr->hop_limit = inner_hdr->hop_limit;
hdr->nexthdr = NEXTHDR_ROUTING;
isrh = (void *)hdr + sizeof(*hdr);
memcpy(isrh, osrh, hdrlen);
isrh->nexthdr = NEXTHDR_IPV6;
hdr->daddr = isrh->segments[isrh->first_segment];
set_tun_src(net, skb->dev, &hdr->daddr, &hdr->saddr);
#ifdef CONFIG_IPV6_SEG6_HMAC
if (sr_has_hmac(isrh)) {
err = seg6_push_hmac(net, &hdr->saddr, isrh);
if (unlikely(err))
return err;
}
#endif
skb_postpush_rcsum(skb, hdr, tot_len);
return 0;
}
/* insert an SRH within an IPv6 packet, just after the IPv6 header */
#ifdef CONFIG_IPV6_SEG6_INLINE
static int seg6_do_srh_inline(struct sk_buff *skb, struct ipv6_sr_hdr *osrh)
{
struct ipv6hdr *hdr, *oldhdr;
struct ipv6_sr_hdr *isrh;
int hdrlen, err;
hdrlen = (osrh->hdrlen + 1) << 3;
err = pskb_expand_head(skb, hdrlen, 0, GFP_ATOMIC);
if (unlikely(err))
return err;
oldhdr = ipv6_hdr(skb);
skb_pull(skb, sizeof(struct ipv6hdr));
skb_postpull_rcsum(skb, skb_network_header(skb),
sizeof(struct ipv6hdr));
skb_push(skb, sizeof(struct ipv6hdr) + hdrlen);
skb_reset_network_header(skb);
skb_mac_header_rebuild(skb);
hdr = ipv6_hdr(skb);
memmove(hdr, oldhdr, sizeof(*hdr));
isrh = (void *)hdr + sizeof(*hdr);
memcpy(isrh, osrh, hdrlen);
isrh->nexthdr = hdr->nexthdr;
hdr->nexthdr = NEXTHDR_ROUTING;
isrh->segments[0] = hdr->daddr;
hdr->daddr = isrh->segments[isrh->first_segment];
#ifdef CONFIG_IPV6_SEG6_HMAC
if (sr_has_hmac(isrh)) {
struct net *net = dev_net(skb_dst(skb)->dev);
err = seg6_push_hmac(net, &hdr->saddr, isrh);
if (unlikely(err))
return err;
}
#endif
skb_postpush_rcsum(skb, hdr, sizeof(struct ipv6hdr) + hdrlen);
return 0;
}
#endif
static int seg6_do_srh(struct sk_buff *skb)
{
struct dst_entry *dst = skb_dst(skb);
struct seg6_iptunnel_encap *tinfo;
int err = 0;
tinfo = seg6_encap_lwtunnel(dst->lwtstate);
if (likely(!skb->encapsulation)) {
skb_reset_inner_headers(skb);
skb->encapsulation = 1;
}
switch (tinfo->mode) {
#ifdef CONFIG_IPV6_SEG6_INLINE
case SEG6_IPTUN_MODE_INLINE:
err = seg6_do_srh_inline(skb, tinfo->srh);
skb_reset_inner_headers(skb);
break;
#endif
case SEG6_IPTUN_MODE_ENCAP:
err = seg6_do_srh_encap(skb, tinfo->srh);
break;
}
if (err)
return err;
ipv6_hdr(skb)->payload_len = htons(skb->len - sizeof(struct ipv6hdr));
skb_set_transport_header(skb, sizeof(struct ipv6hdr));
skb_set_inner_protocol(skb, skb->protocol);
return 0;
}
int seg6_input(struct sk_buff *skb)
{
int err;
err = seg6_do_srh(skb);
if (unlikely(err)) {
kfree_skb(skb);
return err;
}
skb_dst_drop(skb);
ip6_route_input(skb);
return dst_input(skb);
}
int seg6_output(struct net *net, struct sock *sk, struct sk_buff *skb)
{
struct dst_entry *orig_dst = skb_dst(skb);
struct dst_entry *dst = NULL;
struct seg6_lwt *slwt;
int err = -EINVAL;
err = seg6_do_srh(skb);
if (unlikely(err))
goto drop;
slwt = seg6_lwt_lwtunnel(orig_dst->lwtstate);
#ifdef CONFIG_DST_CACHE
dst = dst_cache_get(&slwt->cache);
#endif
if (unlikely(!dst)) {
struct ipv6hdr *hdr = ipv6_hdr(skb);
struct flowi6 fl6;
fl6.daddr = hdr->daddr;
fl6.saddr = hdr->saddr;
fl6.flowlabel = ip6_flowinfo(hdr);
fl6.flowi6_mark = skb->mark;
fl6.flowi6_proto = hdr->nexthdr;
dst = ip6_route_output(net, NULL, &fl6);
if (dst->error) {
err = dst->error;
dst_release(dst);
goto drop;
}
#ifdef CONFIG_DST_CACHE
dst_cache_set_ip6(&slwt->cache, dst, &fl6.saddr);
#endif
}
skb_dst_drop(skb);
skb_dst_set(skb, dst);
return dst_output(net, sk, skb);
drop:
kfree_skb(skb);
return err;
}
static int seg6_build_state(struct net_device *dev, struct nlattr *nla,
unsigned int family, const void *cfg,
struct lwtunnel_state **ts)
{
struct nlattr *tb[SEG6_IPTUNNEL_MAX + 1];
struct seg6_iptunnel_encap *tuninfo;
struct lwtunnel_state *newts;
int tuninfo_len, min_size;
struct seg6_lwt *slwt;
int err;
err = nla_parse_nested(tb, SEG6_IPTUNNEL_MAX, nla,
seg6_iptunnel_policy);
if (err < 0)
return err;
if (!tb[SEG6_IPTUNNEL_SRH])
return -EINVAL;
tuninfo = nla_data(tb[SEG6_IPTUNNEL_SRH]);
tuninfo_len = nla_len(tb[SEG6_IPTUNNEL_SRH]);
/* tuninfo must contain at least the iptunnel encap structure,
* the SRH and one segment
*/
min_size = sizeof(*tuninfo) + sizeof(struct ipv6_sr_hdr) +
sizeof(struct in6_addr);
if (tuninfo_len < min_size)
return -EINVAL;
switch (tuninfo->mode) {
#ifdef CONFIG_IPV6_SEG6_INLINE
case SEG6_IPTUN_MODE_INLINE:
break;
#endif
case SEG6_IPTUN_MODE_ENCAP:
break;
default:
return -EINVAL;
}
/* verify that SRH is consistent */
if (!seg6_validate_srh(tuninfo->srh, tuninfo_len - sizeof(*tuninfo)))
return -EINVAL;
newts = lwtunnel_state_alloc(tuninfo_len + sizeof(*slwt));
if (!newts)
return -ENOMEM;
slwt = seg6_lwt_lwtunnel(newts);
#ifdef CONFIG_DST_CACHE
err = dst_cache_init(&slwt->cache, GFP_KERNEL);
if (err) {
kfree(newts);
return err;
}
#endif
memcpy(&slwt->tuninfo, tuninfo, tuninfo_len);
newts->type = LWTUNNEL_ENCAP_SEG6;
newts->flags |= LWTUNNEL_STATE_OUTPUT_REDIRECT |
LWTUNNEL_STATE_INPUT_REDIRECT;
newts->headroom = seg6_lwt_headroom(tuninfo);
*ts = newts;
return 0;
}
#ifdef CONFIG_DST_CACHE
static void seg6_destroy_state(struct lwtunnel_state *lwt)
{
dst_cache_destroy(&seg6_lwt_lwtunnel(lwt)->cache);
}
#endif
static int seg6_fill_encap_info(struct sk_buff *skb,
struct lwtunnel_state *lwtstate)
{
struct seg6_iptunnel_encap *tuninfo = seg6_encap_lwtunnel(lwtstate);
if (nla_put_srh(skb, SEG6_IPTUNNEL_SRH, tuninfo))
return -EMSGSIZE;
return 0;
}
static int seg6_encap_nlsize(struct lwtunnel_state *lwtstate)
{
struct seg6_iptunnel_encap *tuninfo = seg6_encap_lwtunnel(lwtstate);
return nla_total_size(SEG6_IPTUN_ENCAP_SIZE(tuninfo));
}
static int seg6_encap_cmp(struct lwtunnel_state *a, struct lwtunnel_state *b)
{
struct seg6_iptunnel_encap *a_hdr = seg6_encap_lwtunnel(a);
struct seg6_iptunnel_encap *b_hdr = seg6_encap_lwtunnel(b);
int len = SEG6_IPTUN_ENCAP_SIZE(a_hdr);
if (len != SEG6_IPTUN_ENCAP_SIZE(b_hdr))
return 1;
return memcmp(a_hdr, b_hdr, len);
}
static const struct lwtunnel_encap_ops seg6_iptun_ops = {
.build_state = seg6_build_state,
#ifdef CONFIG_DST_CACHE
.destroy_state = seg6_destroy_state,
#endif
.output = seg6_output,
.input = seg6_input,
.fill_encap = seg6_fill_encap_info,
.get_encap_size = seg6_encap_nlsize,
.cmp_encap = seg6_encap_cmp,
};
int __init seg6_iptunnel_init(void)
{
return lwtunnel_encap_add_ops(&seg6_iptun_ops, LWTUNNEL_ENCAP_SEG6);
}
void seg6_iptunnel_exit(void)
{
lwtunnel_encap_del_ops(&seg6_iptun_ops, LWTUNNEL_ENCAP_SEG6);
}
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment