Commit 29b3881b authored by Jakub Kicinski

Merge branch 'ipv4-fix-accidental-rto_onlink-flags-passed-to-ip_route_output_key_hash'

Guillaume Nault says:

====================
ipv4: Fix accidental RTO_ONLINK flags passed to ip_route_output_key_hash()

The IPv4 stack generally uses the last bit of ->flowi4_tos as a flag
indicating link scope for route lookups (RTO_ONLINK). Therefore, we
have to be careful when copying a TOS value to ->flowi4_tos. In
particular, the ->tos field of IPv4 packets may have this bit set
because of ECN. Also tunnel keys generally accept any user value for
the tos.

This series fixes several places where ->flowi4_tos was set from
non-sanitised values and the flowi4 structure was later used by
ip_route_output_key_hash().

Note that the IPv4 stack usually clears the RTO_ONLINK bit using
RT_TOS(). However this macro is based on an obsolete interpretation of
the old IPv4 TOS field (RFC 1349) and clears the three high order bits
too. Since we don't need to clear these bits and since it doesn't make
sense to clear only one of the ECN bits, this patch series uses
INET_ECN_MASK instead.

All patches were compile tested only.
====================

Link: https://lore.kernel.org/r/cover.1641821242.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
parents 274c2240 48d67543
@@ -32,6 +32,7 @@
 #include <linux/tcp.h>
 #include <linux/ipv6.h>
+#include <net/inet_ecn.h>
 #include <net/route.h>
 #include <net/ip6_route.h>
@@ -99,7 +100,7 @@ cxgb_find_route(struct cxgb4_lld_info *lldi,
 rt = ip_route_output_ports(&init_net, &fl4, NULL, peer_ip, local_ip,
 peer_port, local_port, IPPROTO_TCP,
-tos, 0);
+tos & ~INET_ECN_MASK, 0);
 if (IS_ERR(rt))
 return NULL;
 n = dst_neigh_lookup(&rt->dst, &peer_ip);
......
 /* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
 /* Copyright (c) 2018 Mellanox Technologies. */
+#include <net/inet_ecn.h>
 #include <net/vxlan.h>
 #include <net/gre.h>
 #include <net/geneve.h>
@@ -235,7 +236,7 @@ int mlx5e_tc_tun_create_header_ipv4(struct mlx5e_priv *priv,
 int err;
 /* add the IP fields */
-attr.fl.fl4.flowi4_tos = tun_key->tos;
+attr.fl.fl4.flowi4_tos = tun_key->tos & ~INET_ECN_MASK;
 attr.fl.fl4.daddr = tun_key->u.ipv4.dst;
 attr.fl.fl4.saddr = tun_key->u.ipv4.src;
 attr.ttl = tun_key->ttl;
@@ -350,7 +351,7 @@ int mlx5e_tc_tun_update_header_ipv4(struct mlx5e_priv *priv,
 int err;
 /* add the IP fields */
-attr.fl.fl4.flowi4_tos = tun_key->tos;
+attr.fl.fl4.flowi4_tos = tun_key->tos & ~INET_ECN_MASK;
 attr.fl.fl4.daddr = tun_key->u.ipv4.dst;
 attr.fl.fl4.saddr = tun_key->u.ipv4.src;
 attr.ttl = tun_key->ttl;
......
@@ -604,8 +604,9 @@ static int gre_fill_metadata_dst(struct net_device *dev, struct sk_buff *skb)
 key = &info->key;
 ip_tunnel_init_flow(&fl4, IPPROTO_GRE, key->u.ipv4.dst, key->u.ipv4.src,
-tunnel_id_to_key32(key->tun_id), key->tos, 0,
-skb->mark, skb_get_hash(skb));
+tunnel_id_to_key32(key->tun_id),
+key->tos & ~INET_ECN_MASK, 0, skb->mark,
+skb_get_hash(skb));
 rt = ip_route_output_key(dev_net(dev), &fl4);
 if (IS_ERR(rt))
 return PTR_ERR(rt);
......
@@ -31,6 +31,7 @@
 #include <linux/if_tunnel.h>
 #include <net/dst.h>
 #include <net/flow.h>
+#include <net/inet_ecn.h>
 #include <net/xfrm.h>
 #include <net/ip.h>
 #include <net/gre.h>
@@ -3295,7 +3296,7 @@ decode_session4(struct sk_buff *skb, struct flowi *fl, bool reverse)
 fl4->flowi4_proto = iph->protocol;
 fl4->daddr = reverse ? iph->saddr : iph->daddr;
 fl4->saddr = reverse ? iph->daddr : iph->saddr;
-fl4->flowi4_tos = iph->tos;
+fl4->flowi4_tos = iph->tos & ~INET_ECN_MASK;
 if (!ip_is_fragment(iph)) {
 switch (iph->protocol) {
......