BUG: NetworkManager kills re6stnet on Debian 8 ([PATCH] daemon: Force NetworkManager on Debian 8 to ignore re6stnet* interfaces)
A problem was discovered on several production machines of a project, installed by a client with Debian 8.9 and with eth0 managed by NetworkManager (to get an IP via DHCP):
Even though re6stnet was starting OK, it either failed completely to bring any of its interfaces up, or lost the interfaces it had brought up very quickly. Symptoms were:
- openvpn links are broken after 60s (we use --ping-exit 60 on both openvpn ends)
- I forced links to stay alive with O--ping O10
- I increased the babel debug level with B-d B1
- it indeed shows babel does not send hello

Although the openvpn processes are running, the interfaces themselves go down:
root@...:/etc/re6stnet# ip link |grep re6st
145: re6stnet-tcp: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 100
146: re6stnet1: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 100
147: re6stnet2: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 100
148: re6stnet3: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 100
149: re6stnet4: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 100
150: re6stnet5: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 100
151: re6stnet6: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 100
152: re6stnet7: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 100
153: re6stnet8: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 100
154: re6stnet9: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 100
155: re6stnet10: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 100
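The check above can be reduced to a list of affected interface names. The following is a minimal sketch, not part of the patch: the function name `re6st_down_ifaces` is hypothetical, and it assumes the `ip link` line layout shown in the listing (interface name in the second field, `state DOWN` on the same line).

```shell
#!/bin/sh
# Hypothetical helper: read `ip link` / `ip -o link` output on stdin and
# print the names of re6stnet* interfaces whose operational state is DOWN.
re6st_down_ifaces() {
    # Field 2 is the interface name followed by a colon, e.g. "re6stnet1:".
    awk '$2 ~ /^re6stnet/ && / state DOWN / { sub(/:$/, "", $2); print $2 }'
}

# Usage on a live system (requires iproute2):
#   ip -o link | re6st_down_ifaces
```

An empty result would mean no re6stnet interface is currently reported as DOWN.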
nmcli says NetworkManager does not manage them:
root@...:/etc/re6stnet# nmcli d
DEVICE        TYPE      STATE      CONNECTION
eth0          ethernet  connected  Wired connection 1
lo            loopback  unmanaged  --
re6stnet-tcp  tap       unmanaged  --
re6stnet1     tap       unmanaged  --
re6stnet10    tap       unmanaged  --
re6stnet2     tap       unmanaged  --
re6stnet3     tap       unmanaged  --
re6stnet4     tap       unmanaged  --
re6stnet5     tap       unmanaged  --
re6stnet6     tap       unmanaged  --
re6stnet7     tap       unmanaged  --
re6stnet8     tap       unmanaged  --
re6stnet9     tap       unmanaged  --
However, I see a lot of churn about re6stnetX in the NM logs, e.g.:
<info> (re6stnet6): found matching connection 're6stnet6'
<info> (re6stnet6): device state change: unmanaged -> unavailable (reason 'connection-assumed') [10 20 41]
<info> (re6stnet6): device state change: unavailable -> disconnected (reason 'connection-assumed') [20 30 41]
<info> Activation (re6stnet6) starting connection 're6stnet6'
<info> Activation (re6stnet6) Stage 1 of 5 (Device Prepare) scheduled...
<info> Activation (re6stnet6) Stage 1 of 5 (Device Prepare) started...
<info> (re6stnet6): device state change: disconnected -> prepare (reason 'none') [30 40 0]
<info> Activation (re6stnet6) Stage 2 of 5 (Device Configure) scheduled...
<info> Activation (re6stnet6) Stage 1 of 5 (Device Prepare) complete.
<info> Activation (re6stnet6) Stage 2 of 5 (Device Configure) starting...
<info> (re6stnet6): device state change: prepare -> config (reason 'none') [40 50 0]
<info> Activation (re6stnet6) Stage 2 of 5 (Device Configure) successful.
<info> Activation (re6stnet6) Stage 3 of 5 (IP Configure Start) scheduled.
<info> Activation (re6stnet6) Stage 2 of 5 (Device Configure) complete.
<info> Activation (re6stnet6) Stage 3 of 5 (IP Configure Start) started...
<info> (re6stnet6): device state change: config -> ip-config (reason 'none') [50 70 0]
<info> (re6stnet6): device state change: ip-config -> failed (reason 'ip-config-unavailable') [70 120 5]
<info> Disabling autoconnect for connection 're6stnet6'.
<warn> Activation (re6stnet6) failed for connection 're6stnet6'
<info> Activation (re6stnet6) Stage 3 of 5 (IP Configure Start) complete.
<info> (re6stnet6): device state change: failed -> disconnected (reason 'none') [120 30 0]
<info> (re6stnet6): deactivating device (reason 'none') [0]
<info> (re6stnet6): device state change: disconnected -> unmanaged (reason 'none') [30 10 0]
<info> (re6stnet6): link disconnected
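To see only the state transitions in a log like the one above, a small filter can help. This is a rough sketch under assumptions: the function name `nm_re6st_transitions` is hypothetical, the message format is taken from the excerpt above, and where the NetworkManager log lives (for example the systemd journal or /var/log/syslog) depends on the machine.

```shell
#!/bin/sh
# Hypothetical helper: read NetworkManager log lines on stdin and print one
# line per re6stnet* device state change: "<iface>: <old> -> <new> (<reason>)".
nm_re6st_transitions() {
    sed -n "s/.*(\(re6stnet[^)]*\)): device state change: \([a-z-]*\) -> \([a-z-]*\) (reason '\([^']*\)').*/\1: \2 -> \3 (\4)/p"
}

# Usage, assuming the log is in the systemd journal:
#   journalctl -u NetworkManager | nm_re6st_transitions
```

A device that cycles from unmanaged back to unmanaged with a `failed` step in between matches the churn shown in the excerpt.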
As a result, re6stnet does not work:
root@...:/etc/re6stnet# ip -6 r
unreachable 2001:67c:1254:e:70::/80 dev lo proto kernel metric 256 error -101
unreachable 2001:67c:1254::/48 dev lo metric 1024 error -101
fe80::/64 dev eth0 proto kernel metric 256
On a couple of other machines I see similar symptoms, but with e.g. only one re6stnet interface staying up. re6stnet thus appears to be working, but if that interface goes down, everything will be down.
Julien suggested that it is really NetworkManager that messes things up, and
advised putting "iface re6stnetX inet manual" lines into
/etc/network/interfaces. That worked: the re6stnetX interfaces were no longer
brought down and IPv6 started to work everywhere.
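For reference, the manual workaround can be sketched as a small generator. This is an illustration, not the code of the patch itself: the function name `re6st_manual_stanzas` is hypothetical, and the interface names (re6stnet-tcp plus re6stnet1..re6stnet10) are assumed from the `ip link` listing above rather than taken from re6stnet's configuration.

```shell
#!/bin/sh
# Hypothetical helper: print "iface <name> inet manual" lines for the
# re6stnet interfaces. $1 is the number of numbered interfaces
# (default 10, matching the listing above).
re6st_manual_stanzas() {
    count=${1:-10}
    echo "iface re6stnet-tcp inet manual"
    i=1
    while [ "$i" -le "$count" ]; do
        echo "iface re6stnet$i inet manual"
        i=$((i + 1))
    done
}

# Print the stanzas for review; to apply them, append as root:
#   re6st_manual_stanzas 10 >> /etc/network/interfaces
re6st_manual_stanzas 10
```

Printing to stdout first keeps the step reviewable before /etc/network/interfaces is touched.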
So let's do this kind of setup by default, so that the next poor soul does not
need to debug all of this: discovering that, with the old NetworkManager from
Debian 8, the nmcli d report does not tell the full truth, and working out how
to make re6stnet work.
I mean, when people install the re6st-node package on a Debian 8 box with NetworkManager enabled, it should just work out of the box.
P.S. with recent NetworkManager (1.10.2, from current Debian testing) re6st-node works out of the box without this patch.