    vxlan: fix oops when delete netns containing vxlan · 9cb6cb7e
    Zang MingJie authored
    The following script will produce a kernel oops:
    
        sudo ip netns add v
        sudo ip netns exec v ip ad add 127.0.0.1/8 dev lo
        sudo ip netns exec v ip link set lo up
        sudo ip netns exec v ip ro add 224.0.0.0/4 dev lo
        sudo ip netns exec v ip li add vxlan0 type vxlan id 42 group 239.1.1.1 dev lo
        sudo ip netns exec v ip link set vxlan0 up
        sudo ip netns del v
    
    Inspecting the crash with gdb shows:
    
        Program received signal SIGSEGV, Segmentation fault.
        [Switching to Thread 107]
        0xffffffffa0289e33 in ?? ()
        (gdb) bt
        #0  vxlan_leave_group (dev=0xffff88001bafa000) at drivers/net/vxlan.c:533
        #1  vxlan_stop (dev=0xffff88001bafa000) at drivers/net/vxlan.c:1087
        #2  0xffffffff812cc498 in __dev_close_many (head=head@entry=0xffff88001f2e7dc8) at net/core/dev.c:1299
        #3  0xffffffff812cd920 in dev_close_many (head=head@entry=0xffff88001f2e7dc8) at net/core/dev.c:1335
        #4  0xffffffff812cef31 in rollback_registered_many (head=head@entry=0xffff88001f2e7dc8) at net/core/dev.c:4851
        #5  0xffffffff812cf040 in unregister_netdevice_many (head=head@entry=0xffff88001f2e7dc8) at net/core/dev.c:5752
        #6  0xffffffff812cf1ba in default_device_exit_batch (net_list=0xffff88001f2e7e18) at net/core/dev.c:6170
        #7  0xffffffff812cab27 in cleanup_net (work=<optimized out>) at net/core/net_namespace.c:302
        #8  0xffffffff810540ef in process_one_work (worker=0xffff88001ba9ed40, work=0xffffffff8167d020) at kernel/workqueue.c:2157
        #9  0xffffffff810549d0 in worker_thread (__worker=__worker@entry=0xffff88001ba9ed40) at kernel/workqueue.c:2276
        #10 0xffffffff8105870c in kthread (_create=0xffff88001f2e5d68) at kernel/kthread.c:168
        #11 <signal handler called>
        #12 0x0000000000000000 in ?? ()
        #13 0x0000000000000000 in ?? ()
        (gdb) fr 0
        #0  vxlan_leave_group (dev=0xffff88001bafa000) at drivers/net/vxlan.c:533
        533		struct sock *sk = vn->sock->sk;
        (gdb) l
        528	static int vxlan_leave_group(struct net_device *dev)
        529	{
        530		struct vxlan_dev *vxlan = netdev_priv(dev);
        531		struct vxlan_net *vn = net_generic(dev_net(dev), vxlan_net_id);
        532		int err = 0;
        533		struct sock *sk = vn->sock->sk;
        534		struct ip_mreqn mreq = {
        535			.imr_multiaddr.s_addr	= vxlan->gaddr,
        536			.imr_ifindex		= vxlan->link,
        537		};
        (gdb) p vn->sock
        $4 = (struct socket *) 0x0
    
    When a netns is deleted, the kernel calls `vxlan_exit_net`, which releases
    `vn->sock`, before the vxlan interfaces in that netns are shut down. When
    those interfaces are removed afterwards, `vxlan_stop` calls
    `vxlan_leave_group`, which dereferences the now-NULL `vn->sock` and causes
    the oops. This patch fixes it by explicitly shutting down all vxlan
    interfaces in `vxlan_exit_net` before releasing `vn->sock`.
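
The ordering problem can be illustrated with a small userspace model. This is only a sketch, not the kernel patch: every `model_*` type and function, as well as `teardown_sock_first`, `teardown_devices_first` and `run_model`, is a hypothetical stand-in for the kernel structures named above. Where the kernel would oops on the NULL dereference, the model returns an error instead so the two orderings can be compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel objects; not the real definitions. */
struct model_sock { int joined_groups; };
struct model_vxlan_net { struct model_sock *sock; };   /* models vn->sock */
struct model_vxlan_dev { struct model_vxlan_net *vn; bool up; };

/* Models vxlan_leave_group(): needs the per-netns socket to leave the group. */
static int model_leave_group(struct model_vxlan_dev *dev)
{
    if (dev->vn->sock == NULL)
        return -1;              /* the kernel dereferences NULL here and oopses */
    dev->vn->sock->joined_groups--;
    return 0;
}

/* Models vxlan_stop(): an interface that is up leaves its multicast group. */
static int model_stop(struct model_vxlan_dev *dev)
{
    if (!dev->up)
        return 0;
    dev->up = false;
    return model_leave_group(dev);
}

static void model_release_sock(struct model_vxlan_net *vn)
{
    free(vn->sock);
    vn->sock = NULL;
}

/* Ordering described as buggy: socket released, interfaces stopped later. */
static int teardown_sock_first(struct model_vxlan_net *vn,
                               struct model_vxlan_dev *devs, size_t n)
{
    int failures = 0;
    model_release_sock(vn);
    for (size_t i = 0; i < n; i++)
        if (model_stop(&devs[i]) != 0)
            failures++;
    return failures;
}

/* Ordering described as the fix: stop every interface, then release. */
static int teardown_devices_first(struct model_vxlan_net *vn,
                                  struct model_vxlan_dev *devs, size_t n)
{
    int failures = 0;
    assert(vn->sock != NULL);   /* socket must still exist while stopping */
    for (size_t i = 0; i < n; i++)
        if (model_stop(&devs[i]) != 0)
            failures++;
    model_release_sock(vn);
    return failures;
}

/* Builds one model netns with two "up" interfaces and tears it down.
 * Returns the number of failed stops, or -1 if allocation fails. */
static int run_model(bool devices_first)
{
    struct model_vxlan_net vn = { .sock = calloc(1, sizeof(struct model_sock)) };
    if (vn.sock == NULL)
        return -1;
    vn.sock->joined_groups = 2;
    struct model_vxlan_dev devs[2] = { { &vn, true }, { &vn, true } };
    return devices_first ? teardown_devices_first(&vn, devs, 2)
                         : teardown_sock_first(&vn, devs, 2);
}
```

In this model, `run_model(false)` reports two failed stops because both interfaces try to leave their group after the socket is gone, while `run_model(true)` reports none.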

    Signed-off-by: Zang MingJie <zealot0630@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>