    drm/msm/dp: Avoid unpowered AUX xfers that caused crashes · d03fcc1d
    Douglas Anderson authored
    If you happened to try to access `/dev/drm_dp_aux` devices provided by
    the MSM DP AUX driver too early at bootup you could go boom. Let's
    avoid that by only allowing AUX transfers when the controller is
    powered up.
    
    Specifically the crash that was seen (on Chrome OS 5.4 tree with
    relevant backports):
      Kernel panic - not syncing: Asynchronous SError Interrupt
      CPU: 0 PID: 3131 Comm: fwupd Not tainted 5.4.144-16620-g28af11b73efb #1
      Hardware name: Google Lazor (rev3+) with KB Backlight (DT)
      Call trace:
       dump_backtrace+0x0/0x14c
       show_stack+0x20/0x2c
       dump_stack+0xac/0x124
       panic+0x150/0x390
       nmi_panic+0x80/0x94
       arm64_serror_panic+0x78/0x84
       do_serror+0x0/0x118
       do_serror+0xa4/0x118
       el1_error+0xbc/0x160
       dp_catalog_aux_write_data+0x1c/0x3c
       dp_aux_cmd_fifo_tx+0xf0/0x1b0
       dp_aux_transfer+0x1b0/0x2bc
       drm_dp_dpcd_access+0x8c/0x11c
       drm_dp_dpcd_read+0x64/0x10c
       auxdev_read_iter+0xd4/0x1c4
    
    I did a little bit of tracing and found that:
    * We register the AUX device very early at bootup.
    * Power isn't actually turned on for my system until
      hpd_event_thread() -> dp_display_host_init() -> dp_power_init()
    * You can see that dp_power_init() calls dp_aux_init() which is where
      we start allowing AUX channel requests to go through.
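
The ordering traced above can be mimicked in a small userspace C sketch. This is not the driver code: `mock_aux`, its `powered` and `hw_writes` fields, and the `mock_aux_*` helpers are hypothetical names made up for illustration, and returning `-EIO` for a rejected transfer is an assumption. A real driver would also need locking around the flag. The sketch only shows the idea of refusing a transfer before the power-on path has run, so the simulated hardware is never touched while unpowered.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's private AUX state. */
struct mock_aux {
	bool powered;           /* set on power-up, cleared on power-down */
	unsigned int hw_writes; /* counts simulated register accesses */
};

/* Models registering the AUX device early at boot: not powered yet. */
static void mock_aux_register(struct mock_aux *aux)
{
	aux->powered = false;
	aux->hw_writes = 0;
}

/* Models the later power-up path that starts allowing AUX requests. */
static void mock_aux_power_on(struct mock_aux *aux)
{
	aux->powered = true;
}

/* Models powering the controller back down. */
static void mock_aux_power_off(struct mock_aux *aux)
{
	aux->powered = false;
}

/*
 * Returns the number of bytes "transferred", or -EIO (an assumed error
 * code) if the controller is not powered. The early return is the whole
 * point: no simulated register access happens while unpowered.
 */
static long mock_aux_transfer(struct mock_aux *aux, size_t len)
{
	if (!aux->powered)
		return -EIO;

	aux->hw_writes++; /* stands in for the FIFO write that crashed */
	return (long)len;
}

/* Walks the boot ordering once; returns 0 if the guard behaved as described. */
static int mock_aux_selfcheck(void)
{
	struct mock_aux aux;

	mock_aux_register(&aux);
	if (mock_aux_transfer(&aux, 1) != -EIO || aux.hw_writes != 0)
		return -1; /* early access must be rejected untouched */

	mock_aux_power_on(&aux);
	if (mock_aux_transfer(&aux, 1) != 1 || aux.hw_writes != 1)
		return -1; /* powered access must go through */

	return 0;
}
```

In this model an early reader such as fwupd would see an error from the read instead of triggering an access to unpowered hardware.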
    
    In general this patch is a bit of a bandaid but at least it gets us
    out of the current state where userspace acting at the wrong time can
    fully crash the system.
    * I think the more proper fix (which requires quite a bit more
      changes) is to power stuff on while an AUX transfer is
      happening. This is like the solution we did for ti-sn65dsi86. This
      might be required for us to move to populating the panel via the
      DP-AUX bus.
    * Another fix considered was to dynamically register / unregister. I
      tried that at <https://crrev.com/c/3169431/3> but it got
      ugly. Currently there's a bug where the pm_runtime() state isn't
      tracked properly and that causes us to just keep registering more
      and more.

    Signed-off-by: Douglas Anderson <dianders@chromium.org>
    Reviewed-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
    Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
    Link: https://lore.kernel.org/r/20211109100403.1.I4e23470d681f7efe37e2e7f1a6466e15e9bb1d72@changeid
    Signed-off-by: Rob Clark <robdclark@chromium.org>