• Alan Stern's avatar
    USB: core: Fix bug caused by duplicate interface PM usage counter · c2b71462
    Alan Stern authored
    The syzkaller fuzzer reported a bug in the USB hub driver which turned
    out to be caused by a negative runtime-PM usage counter.  This allowed
    a hub to be runtime suspended at a time when the driver did not expect
    it.  The symptom is a WARNING issued because the hub's status URB is
    submitted while it is already active:
    
    	URB 0000000031fb463e submitted while active
    	WARNING: CPU: 0 PID: 2917 at drivers/usb/core/urb.c:363
    
    The negative runtime-PM usage count was caused by an unfortunate
    design decision made when runtime PM was first implemented for USB.
    At that time, USB class drivers were allowed to unbind from their
    interfaces without balancing the usage counter (i.e., leaving it with
    a positive count).  The core code would take care of setting the
    counter back to 0 before allowing another driver to bind to the
    interface.
    
    Later on when runtime PM was implemented for the entire kernel, the
    opposite decision was made: Drivers were required to balance their
    runtime-PM get and put calls.  In order to maintain backward
    compatibility, however, the USB subsystem adapted to the new
    implementation by keeping an independent usage counter for each
    interface and using it to automatically adjust the normal usage
    counter back to 0 whenever a driver was unbound.
    
    This approach involves duplicating information, but what is worse, it
    doesn't work properly in cases where a USB class driver delays
    decrementing the usage counter until after the driver's disconnect()
    routine has returned and the counter has been adjusted back to 0.
    Doing so would cause the usage counter to become negative.  There's
    even a warning about this in the USB power management documentation!
    
    As it happens, this is exactly what the hub driver does.  The
    kick_hub_wq() routine increments the runtime-PM usage counter, and the
    corresponding decrement is carried out by hub_event() in the context
    of the hub_wq work-queue thread.  This work routine may sometimes run
    after the driver has been unbound from its interface, and when it does
    it causes the usage counter to go negative.
    
    It is not possible for hub_disconnect() to wait for a pending
    hub_event() call to finish, because hub_disconnect() is called with
    the device lock held and hub_event() acquires that lock.  The only
    feasible fix is to reverse the original design decision: remove the
    duplicate interface-specific usage counter and require USB drivers to
    balance their runtime PM gets and puts.  As far as I know, all
    existing drivers currently do this.
    Signed-off-by: default avatarAlan Stern <stern@rowland.harvard.edu>
    Reported-and-tested-by: syzbot+7634edaea4d0b341c625@syzkaller.appspotmail.com
    CC: <stable@vger.kernel.org>
    Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
    c2b71462
power-management.rst 34.6 KB