    drm/amdkfd: avoid recursive lock in migrations back to RAM · a6283010
    Alex Sierra authored
    [Why]:
    When we call hmm_range_fault to map memory after a migration, we don't
    expect memory to be migrated again as a result of hmm_range_fault. The
    driver ensures that all memory is in GPU-accessible locations so that
    no migration should be needed. However, there is one corner case where
    hmm_range_fault can unexpectedly cause a migration from DEVICE_PRIVATE
    back to system memory due to a write-fault when a system memory page in
    the same range was mapped read-only (e.g. COW). Ranges with individual
    pages in different locations are usually the result of failed page
    migrations (e.g. page lock contention). The unexpected migration back
    to system memory causes a deadlock from recursive locking in our
    driver.
    
    [How]:
    Add a new task reference member to struct svm_range_list. Set it to
    "current" right before hmm_range_fault is called. The
    svm_migrate_to_ram callback checks this member against "current"; if
    they are equal, the migration is ignored.
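
    The guard described above can be sketched in plain userspace C. This is
    a simplified model under stated assumptions, not the code from this
    patch: struct task, current_task, faulting_task, lock_held,
    range_fault, map_after_migration and the result enum are all
    hypothetical stand-ins, and the would-be deadlock is reported as a
    return value instead of actually hanging.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's task_struct; only pointer identity matters. */
struct task { int pid; };

/* Stand-in for the kernel's "current" macro (hypothetical). */
struct task *current_task;

/* Reduced, hypothetical model of svm_range_list. */
struct svm_range_list {
    int lock_held;               /* models a non-recursive driver lock */
    struct task *faulting_task;  /* hypothetical name for the new member */
};

enum cb_result {
    CB_NOT_CALLED,      /* fault did not trigger a migration */
    CB_MIGRATED,        /* callback migrated pages normally */
    CB_SKIPPED,         /* callback saw its own task and bailed out */
    CB_WOULD_DEADLOCK   /* callback would re-take a lock already held */
};

/* Model of the migrate-to-RAM callback. */
enum cb_result migrate_to_ram(struct svm_range_list *svms)
{
    /* The guard: the fault was raised by our own mapping path. */
    if (svms->faulting_task == current_task)
        return CB_SKIPPED;

    /* Without the guard, taking the lock again would never return. */
    if (svms->lock_held)
        return CB_WOULD_DEADLOCK;

    svms->lock_held = 1;
    /* ... migrate pages back to system memory ... */
    svms->lock_held = 0;
    return CB_MIGRATED;
}

/* Model of hmm_range_fault: in the corner case it re-enters the driver. */
enum cb_result range_fault(struct svm_range_list *svms, int corner_case)
{
    assert(svms->lock_held);  /* the mapping path calls this under its lock */
    return corner_case ? migrate_to_ram(svms) : CB_NOT_CALLED;
}

/* Model of the mapping path that runs after a migration. */
enum cb_result map_after_migration(struct svm_range_list *svms,
                                   int use_guard, int corner_case)
{
    enum cb_result r;

    svms->lock_held = 1;
    if (use_guard)
        svms->faulting_task = current_task;  /* set right before the fault */
    r = range_fault(svms, corner_case);
    svms->faulting_task = NULL;              /* clear once the fault returns */
    svms->lock_held = 0;
    return r;
}
```

    In this model, the corner case with the guard enabled ends in
    CB_SKIPPED, while the same call without the guard reaches the
    CB_WOULD_DEADLOCK branch. A migration requested by a different task
    while no mapping is in progress still proceeds normally.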

    Signed-off-by: Alex Sierra <alex.sierra@amd.com>
    Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
kfd_migrate.c 24.8 KB