    accel/habanalabs: set device status 'malfunction' while in rmmod · e4a97d6b
    Koby Elbaz authored
    hl_device_status() returns the status of an acquired device.
    If a device is going down (following an rmmod command), it should be
    marked as unusable/malfunctioning, and hence must not be acquired.

    Until now this was not the case: a device going down would
    inaccurately report 'in reset' status, which allowed the user to
    acquire it. This introduced a bug in the reset flow: the driver
    could not kill processes that had not run yet, and since those
    processes were not blocked from reacquiring the device, the driver
    ended up attempting to kill all processes in a list that could
    never become empty.
    Signed-off-by: Koby Elbaz <kelbaz@habana.ai>
    Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
    Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
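
The ordering of checks described in the message above can be illustrated with a minimal, self-contained sketch. This is not the driver's code: every name here (`sketch_status`, `sketch_device`, `removal_pending`, `sketch_device_status`, `sketch_can_acquire`) is hypothetical, and the acquisition rule simply mirrors the commit message's description, in which only a 'malfunction' status blocks acquisition. The real open path may behave differently.

```c
#include <stdbool.h>

/* Hypothetical, simplified status values; not the driver's real enum. */
enum sketch_status {
    SKETCH_STATUS_OPERATIONAL,
    SKETCH_STATUS_IN_RESET,
    SKETCH_STATUS_MALFUNCTION,
};

/* Hypothetical device state; the field names are assumptions. */
struct sketch_device {
    bool removal_pending; /* set once teardown (e.g. rmmod) has started */
    bool in_reset;        /* set while a reset flow is running */
};

/*
 * The removal check comes first, so a device that is going down is
 * reported as malfunctioning even if a reset is in progress as well.
 */
enum sketch_status sketch_device_status(const struct sketch_device *dev)
{
    if (dev->removal_pending)
        return SKETCH_STATUS_MALFUNCTION;
    if (dev->in_reset)
        return SKETCH_STATUS_IN_RESET;
    return SKETCH_STATUS_OPERATIONAL;
}

/*
 * Assumption taken from the commit message: 'in reset' does not block
 * acquisition, while 'malfunction' does.
 */
bool sketch_can_acquire(const struct sketch_device *dev)
{
    return sketch_device_status(dev) != SKETCH_STATUS_MALFUNCTION;
}
```

With the removal check placed before the reset check, a process that tries to reacquire the device during teardown is refused, so in this model the list of processes to kill can eventually drain.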
device.c 73 KB