• Chris Wilson's avatar
    drm/i915: Annotate timeline lock nesting · 0adb90d3
    Chris Wilson authored
    CI noticed
    
    <4>[   23.430701] ============================================
    <4>[   23.430706] WARNING: possible recursive locking detected
    <4>[   23.430713] 4.17.0-rc4-CI-CI_DRM_4156+ #1 Not tainted
    <4>[   23.430720] --------------------------------------------
    <4>[   23.430725] systemd-udevd/169 is trying to acquire lock:
    <4>[   23.430732]         (ptrval) (&(&timeline->lock)->rlock){....}, at: move_to_timeline+0x48/0x12c [i915]
    <4>[   23.430888]
                      but task is already holding lock:
    <4>[   23.430894]         (ptrval) (&(&timeline->lock)->rlock){....}, at: i915_request_submit+0x1a/0x40 [i915]
    <4>[   23.430995]
                      other info that might help us debug this:
    <4>[   23.431002]  Possible unsafe locking scenario:
    
    <4>[   23.431007]        CPU0
    <4>[   23.431010]        ----
    <4>[   23.431013]   lock(&(&timeline->lock)->rlock);
    <4>[   23.431021]   lock(&(&timeline->lock)->rlock);
    <4>[   23.431028]
                       *** DEADLOCK ***
    
    <4>[   23.431036]  May be due to missing lock nesting notation
    
    <4>[   23.431044] 5 locks held by systemd-udevd/169:
    <4>[   23.431049]  #0:         (ptrval) (&dev->mutex){....}, at: __driver_attach+0x42/0xe0
    <4>[   23.431065]  #1:         (ptrval) (&dev->mutex){....}, at: __driver_attach+0x50/0xe0
    <4>[   23.431078]  #2:         (ptrval) (&dev->struct_mutex){+.+.}, at: i915_gem_init+0xca/0x630 [i915]
    <4>[   23.431174]  #3:         (ptrval) (rcu_read_lock){....}, at: submit_notify+0x35/0x124 [i915]
    <4>[   23.431271]  #4:         (ptrval) (&(&timeline->lock)->rlock){....}, at: i915_request_submit+0x1a/0x40 [i915]
    <4>[   23.431369]
                      stack backtrace:
    <4>[   23.431377] CPU: 0 PID: 169 Comm: systemd-udevd Not tainted 4.17.0-rc4-CI-CI_DRM_4156+ #1
    <4>[   23.431385] Hardware name: Dell Inc.                 OptiPlex GX280               /0G8310, BIOS A04 02/09/2005
    <4>[   23.431394] Call Trace:
    <4>[   23.431403]  dump_stack+0x67/0x9b
    <4>[   23.431411]  __lock_acquire+0xc67/0x1b50
    <4>[   23.431421]  ? ring_buffer_lock_reserve+0x154/0x3f0
    <4>[   23.431429]  ? lock_acquire+0xa6/0x210
    <4>[   23.431435]  lock_acquire+0xa6/0x210
    <4>[   23.431530]  ? move_to_timeline+0x48/0x12c [i915]
    <4>[   23.431540]  _raw_spin_lock+0x2a/0x40
    <4>[   23.431634]  ? move_to_timeline+0x48/0x12c [i915]
    <4>[   23.431730]  move_to_timeline+0x48/0x12c [i915]
    <4>[   23.431826]  __i915_request_submit+0xfa/0x280 [i915]
    <4>[   23.431923]  i915_request_submit+0x25/0x40 [i915]
    <4>[   23.432024]  i9xx_submit_request+0x11/0x140 [i915]
    <4>[   23.432120]  submit_notify+0x8d/0x124 [i915]
    <4>[   23.432202]  __i915_sw_fence_complete+0x81/0x250 [i915]
    <4>[   23.432300]  __i915_request_add+0x31c/0x7c0 [i915]
    <4>[   23.432395]  i915_gem_init+0x621/0x630 [i915]
    <4>[   23.432476]  i915_driver_load+0xbee/0x10b0 [i915]
    <4>[   23.432485]  ? trace_hardirqs_on_caller+0xe0/0x1b0
    <4>[   23.432566]  i915_pci_probe+0x29/0x90 [i915]
    <4>[   23.432574]  pci_device_probe+0xa1/0x130
    <4>[   23.432582]  driver_probe_device+0x306/0x480
    <4>[   23.432589]  __driver_attach+0xb7/0xe0
    <4>[   23.432596]  ? driver_probe_device+0x480/0x480
    <4>[   23.432602]  ? driver_probe_device+0x480/0x480
    <4>[   23.432609]  bus_for_each_dev+0x74/0xc0
    <4>[   23.432616]  bus_add_driver+0x15f/0x250
    <4>[   23.432623]  ? 0xffffffffa02d7000
    <4>[   23.432629]  driver_register+0x52/0xc0
    <4>[   23.432635]  ? 0xffffffffa02d7000
    <4>[   23.432642]  do_one_initcall+0x58/0x370
    <4>[   23.432653]  ? do_init_module+0x1d/0x1ea
    <4>[   23.432660]  ? rcu_read_lock_sched_held+0x6f/0x80
    <4>[   23.432667]  ? kmem_cache_alloc_trace+0x282/0x2e0
    <4>[   23.432675]  do_init_module+0x56/0x1ea
    <4>[   23.432682]  load_module+0x2435/0x2b20
    <4>[   23.432694]  ? __se_sys_finit_module+0xd3/0xf0
    <4>[   23.432701]  __se_sys_finit_module+0xd3/0xf0
    <4>[   23.432710]  do_syscall_64+0x55/0x190
    <4>[   23.432717]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
    <4>[   23.432724] RIP: 0033:0x7fa780782839
    <4>[   23.432729] RSP: 002b:00007ffcea73e668 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
    <4>[   23.432738] RAX: ffffffffffffffda RBX: 0000561a472a4b30 RCX: 00007fa780782839
    <4>[   23.432745] RDX: 0000000000000000 RSI: 00007fa7804610e5 RDI: 000000000000000e
    <4>[   23.432752] RBP: 00007fa7804610e5 R08: 0000000000000000 R09: 00007ffcea73e780
    <4>[   23.432758] R10: 000000000000000e R11: 0000000000000246 R12: 0000000000000000
    <4>[   23.432765] R13: 0000561a47296450 R14: 0000000000020000 R15: 0000561a472a4b30
    
    but did not report it as an issue as it only occurred during the first
    module on boot. This is due to the removal of the distinct global
    timeline, and its separate lock class. So instead mark up the expected
    nesting. An alternative would be to define a separate lock class for the
    engine, but since we only expect to have a single point of nesting, we
    can avoid having multiple lock classes for the struct.
    
    Fixes: a89d1f92 ("drm/i915: Split i915_gem_timeline into individual timelines")
    Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
    Cc: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Tested-by: default avatarMichel Thierry <michel.thierry@intel.com>
    Reviewed-by: default avatarTvrtko Ursulin <tvrtko.ursulin@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20180508153514.20251-1-chris@chris-wilson.co.uk
    0adb90d3
i915_request.c 42 KB