• Alvin Šipraga's avatar
    net: dsa: realtek: rtl8365mb: serialize indirect PHY register access · 27967284
    Alvin Šipraga authored
    Realtek switches in the rtl8365mb family can access the PHY registers of
    the internal PHYs via the switch registers. This method is called
    indirect access. At a high level, the indirect PHY register access
    method involves reading and writing some special switch registers in a
    particular sequence. This works for both SMI and MDIO connected
    switches.
    
    Currently the rtl8365mb driver does not take any care to serialize the
    aforementioned access to the switch registers. In particular, it is
    permitted for other driver code to access other switch registers while
    the indirect PHY register access is ongoing. Locking is only done at the
    regmap level. This, however, is a bug: concurrent register access, even
    to unrelated switch registers, risks corrupting the PHY register value
    read back via the indirect access method described above.
    
    Arınç reported that the switch sometimes returns nonsense data when
    reading the PHY registers. In particular, a value of 0 causes the
    kernel's PHY subsystem to think that the link is down, but since most
    reads return correct data, the link then flip-flops between up and down
    over a period of time.
    
    The aforementioned bug can be readily observed by:
    
     1. Enabling ftrace events for regmap and mdio
     2. Polling BSMR PHY register for a connected port;
        it should always read the same (e.g. 0x79ed)
     3. Wait for step 2 to give a different value
    
    Example command for step 2:
    
        while true; do phytool read swp2/2/0x01; done
    
    On my i.MX8MM, the above steps will yield a bogus value for the BSMR PHY
    register within a matter of seconds. The interleaved register access it
    then evident in the trace log:
    
     kworker/3:4-70      [003] .......  1927.139849: regmap_reg_write: ethernet-switch reg=1004 val=bd
         phytool-16816   [002] .......  1927.139979: regmap_reg_read: ethernet-switch reg=1f01 val=0
     kworker/3:4-70      [003] .......  1927.140381: regmap_reg_read: ethernet-switch reg=1005 val=0
         phytool-16816   [002] .......  1927.140468: regmap_reg_read: ethernet-switch reg=1d15 val=a69
     kworker/3:4-70      [003] .......  1927.140864: regmap_reg_read: ethernet-switch reg=1003 val=0
         phytool-16816   [002] .......  1927.140955: regmap_reg_write: ethernet-switch reg=1f02 val=2041
     kworker/3:4-70      [003] .......  1927.141390: regmap_reg_read: ethernet-switch reg=1002 val=0
         phytool-16816   [002] .......  1927.141479: regmap_reg_write: ethernet-switch reg=1f00 val=1
     kworker/3:4-70      [003] .......  1927.142311: regmap_reg_write: ethernet-switch reg=1004 val=be
         phytool-16816   [002] .......  1927.142410: regmap_reg_read: ethernet-switch reg=1f01 val=0
     kworker/3:4-70      [003] .......  1927.142534: regmap_reg_read: ethernet-switch reg=1005 val=0
         phytool-16816   [002] .......  1927.142618: regmap_reg_read: ethernet-switch reg=1f04 val=0
         phytool-16816   [002] .......  1927.142641: mdio_access: SMI-0 read  phy:0x02 reg:0x01 val:0x0000 <- ?!
     kworker/3:4-70      [003] .......  1927.143037: regmap_reg_read: ethernet-switch reg=1001 val=0
     kworker/3:4-70      [003] .......  1927.143133: regmap_reg_read: ethernet-switch reg=1000 val=2d89
     kworker/3:4-70      [003] .......  1927.143213: regmap_reg_write: ethernet-switch reg=1004 val=be
     kworker/3:4-70      [003] .......  1927.143291: regmap_reg_read: ethernet-switch reg=1005 val=0
     kworker/3:4-70      [003] .......  1927.143368: regmap_reg_read: ethernet-switch reg=1003 val=0
     kworker/3:4-70      [003] .......  1927.143443: regmap_reg_read: ethernet-switch reg=1002 val=6
    
    The kworker here is polling MIB counters for stats, as evidenced by the
    register 0x1004 that we are writing to (RTL8365MB_MIB_ADDRESS_REG). This
    polling is performed every 3 seconds, but is just one example of such
    unsynchronized access. In Arınç's case, the driver was not using the
    switch IRQ, so the PHY subsystem was itself doing polling analogous to
    phytool in the above example.
    
    A test module was created [see second Link] to simulate such spurious
    switch register accesses while performing indirect PHY register reads
    and writes. Realtek was also consulted to confirm whether this is a
    known issue or not. The conclusion of these lines of inquiry is as
    follows:
    
    1. Reading of PHY registers via indirect access will be aborted if,
       after executing the read operation (via a write to the
       INDIRECT_ACCESS_CTRL_REG), any register is accessed, other than
       INDIRECT_ACCESS_STATUS_REG.
    
    2. The PHY register indirect read is only complete when
       INDIRECT_ACCESS_STATUS_REG reads zero.
    
    3. The INDIRECT_ACCESS_DATA_REG, which is read to get the result of the
       PHY read, will contain the result of the last successful read
       operation. If there was spurious register access and the indirect
       read was aborted, then this register is not guaranteed to hold
       anything meaningful and the PHY read will silently fail.
    
    4. PHY writes do not appear to be affected by this mechanism.
    
    5. Other similar access routines, such as for MIB counters, although
       similar to the PHY indirect access method, are actually table access.
       Table access is not affected by spurious reads or writes of other
       registers. However, concurrent table access is not allowed. Currently
       this is protected via mib_lock, so there is nothing to fix.
    
    The above statements are corroborated both via the test module and
    through consultation with Realtek. In particular, Realtek states that
    this is simply a property of the hardware design and is not a hardware
    bug.
    
    To fix this problem, one must guard against regmap access while the
    PHY indirect register read is executing. Fix this by using the newly
    introduced "nolock" regmap in all PHY-related functions, and by aquiring
    the regmap mutex at the top level of the PHY register access callbacks.
    Although no issue has been observed with PHY register _writes_, this
    change also serializes the indirect access method there. This is done
    purely as a matter of convenience and for reasons of symmetry.
    
    Fixes: 4af2950c ("net: dsa: realtek-smi: add rtl8365mb subdriver for RTL8365MB-VC")
    Link: https://lore.kernel.org/netdev/CAJq09z5FCgG-+jVT7uxh1a-0CiiFsoKoHYsAWJtiKwv7LXKofQ@mail.gmail.com/
    Link: https://lore.kernel.org/netdev/871qzwjmtv.fsf@bang-olufsen.dk/Reported-by: default avatarArınç ÜNAL <arinc.unal@arinc9.com>
    Reported-by: default avatarLuiz Angelo Daros de Luca <luizluca@gmail.com>
    Signed-off-by: default avatarAlvin Šipraga <alsi@bang-olufsen.dk>
    Reviewed-by: default avatarVladimir Oltean <olteanv@gmail.com>
    Reviewed-by: default avatarLinus Walleij <linus.walleij@linaro.org>
    Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
    27967284
rtl8365mb.c 61.9 KB