1. 20 Oct, 2010 1 commit
    • netpoll: Remove netpoll blocking from uninit path · 9ff76c95
      Neil Horman authored
      Some recent testing of netpoll with bonding showed this backtrace:
      
       ------------[ cut here ]------------
       kernel BUG at drivers/net/bonding/bonding.h:134!
       invalid opcode: 0000 [#1] SMP
       last sysfs file: /sys/devices/pci0000:00/0000:00:1d.2/usb7/devnum
       CPU 0
       Pid: 1876, comm: rmmod Not tainted 2.6.36-rc3+ #10 D26928/
       RIP: 0010:[<ffffffffa0514ba4>]  [<ffffffffa0514ba4>] bond_uninit+0x6f4/0x7a0
       RSP: 0018:ffff88003b1b5d58  EFLAGS: 00010296
       RAX: ffff88003b9b6200 RBX: ffff8800373e8e00 RCX: 00000000000f4240
       RDX: 00000000ffffffff RSI: 0000000000000286 RDI: 0000000000000286
       RBP: ffff88003b1b5dc8 R08: 0000000000000000 R09: 00000001af7de920
       R10: 0000000000000000 R11: ffff880002495e98 R12: ffff880037922700
       R13: ffff880038c31000 R14: ffff880037922730 R15: 0000000000000286
       FS:  00007f90e6d72700(0000) GS:ffff880002400000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
       CR2: 000000346f0d9ad0 CR3: 000000003b263000 CR4: 00000000000006f0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
       Process rmmod (pid: 1876, threadinfo ffff88003b1b4000, task ffff88003b36aa80)
       Stack:
       00000000ffffffff ffff88003b1b5d7a ffff8800379221e8 ffff880037922000
       <0> ffff88003b1b5dc8 ffffffff813eb5fb ffff88003b1b5da8 0000000031b177a3
       <0> ffff88003b1b5da8 ffff880037922000 ffff88003b1b5e48 ffff88003b1b5e48
       Call Trace:
       [<ffffffff813eb5fb>] ? rtmsg_ifinfo+0xcb/0xf0
       [<ffffffff813daad8>] rollback_registered_many+0x168/0x280
       [<ffffffff813dac09>] unregister_netdevice_many+0x19/0x80
       [<ffffffff813e97b3>] __rtnl_kill_links+0x63/0x90
       [<ffffffff813e980b>] __rtnl_link_unregister+0x2b/0x60
       [<ffffffff813e9bde>] rtnl_link_unregister+0x1e/0x30
       [<ffffffffa052124b>] bonding_exit+0x37/0x51 [bonding]
       [<ffffffff81098b2e>] sys_delete_module+0x19e/0x270
       [<ffffffff810bb2b2>] ? audit_syscall_entry+0x252/0x280
       [<ffffffff8100b0b2>] system_call_fastpath+0x16/0x1b
       RIP  [<ffffffffa0514ba4>] bond_uninit+0x6f4/0x7a0 [bonding]
       RSP <ffff88003b1b5d58>
       ---[ end trace 1395ad691cea24d1 ]---
      
      It occurs because of my recent netpoll blocking patches, which I added to avoid
      recursive deadlock in the bonding driver.  They rely on some per-cpu bits, but
      the shutdown path forces some rescheduling as we cancel workqueues for the
      driver and wait for some device refcounts.  If, after the forced reschedule, we
      wind up on a different cpu, we trigger the BUG in unblock_netpoll_tx.
      
      The fix is to remove the netpoll block/unblock calls from bond_release_all.
      This is safe to do because bond_uninit, which is called via ndo_uninit in
      rollback_registered_many, doesn't occur until we send a NETDEV_UNREGISTER event,
      which triggers netconsole to remove us as a netpoll client, so we are guaranteed
      not to recurse into our own tx path here.
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Reviewed-by: WANG Cong <amwang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 19 Oct, 2010 6 commits
  3. 18 Oct, 2010 30 commits
  4. 17 Oct, 2010 3 commits
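
The failure described in commit 9ff76c95 can be illustrated with a small userspace simulation. This is only a sketch, assuming the block/unblock pair sets and clears a flag tied to the CPU the caller is running on, as the commit message describes. Every name here (`sim_blocked`, `sim_cpu`, `sim_block_tx`, `sim_unblock_tx`, `sim_migrate`, `sim_release_all`) is hypothetical and is not the bonding driver's code; the real helpers would hit a kernel BUG where this sketch returns `false`.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_NR_CPUS 4               /* hypothetical CPU count for the simulation */

bool sim_blocked[SIM_NR_CPUS];      /* stand-in for the per-cpu "tx blocked" bits */
int sim_cpu;                        /* CPU the simulated task currently runs on */

/* Mark the current CPU as blocked; returns false if it was already marked. */
bool sim_block_tx(void)
{
    assert(sim_cpu >= 0 && sim_cpu < SIM_NR_CPUS);
    bool was_set = sim_blocked[sim_cpu];
    sim_blocked[sim_cpu] = true;
    return !was_set;
}

/* Clear the mark on the current CPU; returns false where the kernel would BUG. */
bool sim_unblock_tx(void)
{
    assert(sim_cpu >= 0 && sim_cpu < SIM_NR_CPUS);
    bool was_set = sim_blocked[sim_cpu];
    sim_blocked[sim_cpu] = false;
    return was_set;
}

/* Simulate the task being rescheduled onto another CPU after sleeping. */
void sim_migrate(int cpu)
{
    assert(cpu >= 0 && cpu < SIM_NR_CPUS);
    sim_cpu = cpu;
}

/*
 * Simulated teardown: block, optionally sleep and migrate, then unblock.
 * Returns true when the unblock finds the mark it expects.
 */
bool sim_release_all(bool sleeps_and_migrates)
{
    sim_cpu = 0;
    for (int i = 0; i < SIM_NR_CPUS; i++)
        sim_blocked[i] = false;

    sim_block_tx();                 /* mark set on CPU 0 */
    if (sleeps_and_migrates)
        sim_migrate(1);             /* e.g. waiting on workqueues or refcounts */
    return sim_unblock_tx();        /* checks the mark on the *current* CPU */
}
```

Calling `sim_release_all(false)` models a teardown that never sleeps, so the unblock finds its mark. Calling `sim_release_all(true)` models the path in the backtrace: the mark stays on CPU 0 while the unblock runs on CPU 1. Dropping the block/unblock pair from the teardown path, as the commit does, removes the check that can fail this way.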