• Stephen Boyd's avatar
    soc: qcom: rpmh: Avoid accessing freed memory from batch API · baef1c90
    Stephen Boyd authored
    Using the batch API from the interconnect driver sometimes leads to a
    KASAN error due to an access to freed memory. This is easier to trigger
    with threadirqs on the kernel commandline.
    
     BUG: KASAN: use-after-free in rpmh_tx_done+0x114/0x12c
     Read of size 1 at addr fffffff51414ad84 by task irq/110-apps_rs/57
    
     CPU: 0 PID: 57 Comm: irq/110-apps_rs Tainted: G        W         4.19.10 #72
     Call trace:
      dump_backtrace+0x0/0x2f8
      show_stack+0x20/0x2c
      __dump_stack+0x20/0x28
      dump_stack+0xcc/0x10c
      print_address_description+0x74/0x240
      kasan_report+0x250/0x26c
      __asan_report_load1_noabort+0x20/0x2c
      rpmh_tx_done+0x114/0x12c
      tcs_tx_done+0x450/0x768
      irq_forced_thread_fn+0x58/0x9c
      irq_thread+0x120/0x1dc
      kthread+0x248/0x260
      ret_from_fork+0x10/0x18
    
     Allocated by task 385:
      kasan_kmalloc+0xac/0x148
      __kmalloc+0x170/0x1e4
      rpmh_write_batch+0x174/0x540
      qcom_icc_set+0x8dc/0x9ac
      icc_set+0x288/0x2e8
      a6xx_gmu_stop+0x320/0x3c0
      a6xx_pm_suspend+0x108/0x124
      adreno_suspend+0x50/0x60
      pm_generic_runtime_suspend+0x60/0x78
      __rpm_callback+0x214/0x32c
      rpm_callback+0x54/0x184
      rpm_suspend+0x3f8/0xa90
      pm_runtime_work+0xb4/0x178
      process_one_work+0x544/0xbc0
      worker_thread+0x514/0x7d0
      kthread+0x248/0x260
      ret_from_fork+0x10/0x18
    
     Freed by task 385:
      __kasan_slab_free+0x12c/0x1e0
      kasan_slab_free+0x10/0x1c
      kfree+0x134/0x588
      rpmh_write_batch+0x49c/0x540
      qcom_icc_set+0x8dc/0x9ac
      icc_set+0x288/0x2e8
      a6xx_gmu_stop+0x320/0x3c0
      a6xx_pm_suspend+0x108/0x124
      adreno_suspend+0x50/0x60
     cr50_spi spi5.0: SPI transfer timed out
      pm_generic_runtime_suspend+0x60/0x78
      __rpm_callback+0x214/0x32c
      rpm_callback+0x54/0x184
      rpm_suspend+0x3f8/0xa90
      pm_runtime_work+0xb4/0x178
      process_one_work+0x544/0xbc0
      worker_thread+0x514/0x7d0
      kthread+0x248/0x260
      ret_from_fork+0x10/0x18
    
     The buggy address belongs to the object at fffffff51414ac80
      which belongs to the cache kmalloc-512 of size 512
     The buggy address is located 260 bytes inside of
      512-byte region [fffffff51414ac80, fffffff51414ae80)
     The buggy address belongs to the page:
     page:ffffffbfd4505200 count:1 mapcount:0 mapping:fffffff51e00c680 index:0x0 compound_mapcount: 0
     flags: 0x4000000000008100(slab|head)
     raw: 4000000000008100 ffffffbfd4529008 ffffffbfd44f9208 fffffff51e00c680
     raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
     page dumped because: kasan: bad access detected
    
     Memory state around the buggy address:
      fffffff51414ac80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      fffffff51414ad00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
     >fffffff51414ad80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                        ^
      fffffff51414ae00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      fffffff51414ae80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    
    The batch API sets the same completion for each rpmh message that's sent
    and then loops through all the messages and waits for that single
    completion declared on the stack to be completed before returning from
    the function and freeing the message structures. Unfortunately, some
    messages may still be in process and 'stuck' in the TCS. At some later
    point, the tcs_tx_done() interrupt will run and try to process messages
    that have already been freed at the end of rpmh_write_batch(). This will
    in turn access the 'needs_free' member of the rpmh_request structure and
    cause KASAN to complain. Furthermore, if there's a message that's
    completed in rpmh_tx_done() and freed immediately after the complete()
    call is made we'll be racing with potentially freed memory when
    accessing the 'needs_free' member:
    
    	CPU0                         CPU1
    	----                         ----
    	rpmh_tx_done()
    	 complete(&compl)
    	                             wait_for_completion(&compl)
    	                             kfree(rpm_msg)
    	 if (rpm_msg->needs_free)
    	 <KASAN warning splat>
    
    Let's fix this by allocating a chunk of completions for each message and
    waiting for all of them to be completed before returning from the batch
    API. Alternatively, we could wait for the last message in the batch, but
    that may be a more complicated change because it looks like
    tcs_tx_done() just iterates through the indices of the queue and
    completes each message instead of tracking the last inserted message and
    completing that first.
    
    Fixes: c8790cb6 ("drivers: qcom: rpmh: add support for batch RPMH request")
    Cc: Lina Iyer <ilina@codeaurora.org>
    Cc: "Raju P.L.S.S.S.N" <rplsssn@codeaurora.org>
    Cc: Matthias Kaehlcke <mka@chromium.org>
    Cc: Evan Green <evgreen@chromium.org>
    Cc: stable@vger.kernel.org
    Reviewed-by: default avatarLina Iyer <ilina@codeaurora.org>
    Reviewed-by: default avatarEvan Green <evgreen@chromium.org>
    Signed-off-by: default avatarStephen Boyd <swboyd@chromium.org>
    Signed-off-by: default avatarBjorn Andersson <bjorn.andersson@linaro.org>
    Signed-off-by: default avatarAndy Gross <andy.gross@linaro.org>
    baef1c90
rpmh.c 11.9 KB