serial: 8250_dw: Avoid "too much work" from bogus rx timeout interrupt
On a Rockchip rk3399-based board during suspend/resume testing, we found that we could get the console UART into a state where it would print this to the console a lot: serial8250: too much work for irq42 Followed eventually by: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 11s! Upon debugging I found that we're in this state: iir = 0x000000cc lsr = 0x00000060 It appears that somehow we have a RX Timeout interrupt but there is no actual data present to receive. When we're in this state the UART driver claims that it handled the interrupt but it actually doesn't really do anything. This means that we keep getting the interrupt over and over again. Normally we don't actually need to do anything special to handle a RX Timeout interrupt. We'll notice that there is some data ready and we'll read it, which will end up clearing the RX Timeout. In this case we have a problem specifically because we got the RX TImeout without any data. Reading a bogus byte is confirmed to get us out of this state. It's unclear how exactly the UART got into this state, but it is known that the UART lines are essentially undriven and unpowered during suspend, so possibly during resume some garbage / half transmitted bits are seen on the line and put the UART into this state. The UART on the rk3399 is a DesignWare based 8250 UART. From mailing list posts, it appears that other people have run into similar problems with DesignWare based IP. Presumably this problem is unique to that IP, so I have placed the workaround there to avoid possibly of accidentally triggering bad behavior on other IP. Also note the RX Timeout behaves very differently in the DMA case, for for now the workaround is only applied to the non-DMA case. Signed-off-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Showing
Please register or sign in to comment