• Vladimir Oltean's avatar
    net: dsa: realtek: register the MDIO bus under devres · 74b6d7d1
    Vladimir Oltean authored
    The Linux device model permits both the ->shutdown and ->remove driver
    methods to get called during a shutdown procedure. Example: a DSA switch
    which sits on an SPI bus, and the SPI bus driver calls this on its
    ->shutdown method:
    
    spi_unregister_controller
    -> device_for_each_child(&ctlr->dev, NULL, __unregister);
       -> spi_unregister_device(to_spi_device(dev));
          -> device_del(&spi->dev);
    
    So this is a simple pattern which can theoretically appear on any bus,
    although the only other buses on which I've been able to find it are
    I2C:
    
    i2c_del_adapter
    -> device_for_each_child(&adap->dev, NULL, __unregister_client);
       -> i2c_unregister_device(client);
          -> device_unregister(&client->dev);
    
    The implication of this pattern is that devices on these buses can be
    unregistered after having been shut down. The drivers for these devices
    might choose to return early either from ->remove or ->shutdown if the
    other callback has already run once, and they might choose that the
    ->shutdown method should only perform a subset of the teardown done by
    ->remove (to avoid unnecessary delays when rebooting).
    
    So in other words, the device driver may choose on ->remove to not
    do anything (therefore to not unregister an MDIO bus it has registered
    on ->probe), because this ->remove is actually triggered by the
    device_shutdown path, and its ->shutdown method has already run and done
    the minimally required cleanup.
    
    This used to be fine until the blamed commit, but now, the following
    BUG_ON triggers:
    
    void mdiobus_free(struct mii_bus *bus)
    {
    	/* For compatibility with error handling in drivers. */
    	if (bus->state == MDIOBUS_ALLOCATED) {
    		kfree(bus);
    		return;
    	}
    
    	BUG_ON(bus->state != MDIOBUS_UNREGISTERED);
    	bus->state = MDIOBUS_RELEASED;
    
    	put_device(&bus->dev);
    }
    
    In other words, there is an attempt to free an MDIO bus which was not
    unregistered. The attempt to free it comes from the devres release
    callbacks of the SPI device, which are executed after the device is
    unregistered.
    
    I'm not saying that the fact that MDIO buses allocated using devres
    would automatically get unregistered wasn't strange. I'm just saying
    that the commit didn't care about auditing existing call paths in the
    kernel, and now, the following code sequences are potentially buggy:
    
    (a) devm_mdiobus_alloc followed by plain mdiobus_register, for a device
        located on a bus that unregisters its children on shutdown. After
        the blamed patch, either both the alloc and the register should use
        devres, or none should.
    
    (b) devm_mdiobus_alloc followed by plain mdiobus_register, and then no
        mdiobus_unregister at all in the remove path. After the blamed
        patch, nobody unregisters the MDIO bus anymore, so this is even more
        buggy than the previous case which needs a specific bus
        configuration to be seen, this one is an unconditional bug.
    
    In this case, the Realtek drivers fall under category (b). To solve it,
    we can register the MDIO bus under devres too, which restores the
    previous behavior.
    
    Fixes: ac3a68d5 ("net: phy: don't abuse devres in devm_mdiobus_register()")
    Reported-by: default avatarLino Sanfilippo <LinoSanfilippo@gmx.de>
    Reported-by: default avatarAlvin Šipraga <alsi@bang-olufsen.dk>
    Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
    Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    74b6d7d1
realtek-smi-core.c 11.9 KB