    x86/intel_rdt: Fix potential deadlock during resctrl mount · 87943db7
    Reinette Chatre authored
    Sai reported a warning during some MBA tests:
    
    [  236.755559] ======================================================
    [  236.762443] WARNING: possible circular locking dependency detected
    [  236.769328] 4.14.0-rc4-yocto-standard #8 Not tainted
    [  236.774857] ------------------------------------------------------
    [  236.781738] mount/10091 is trying to acquire lock:
    [  236.787071]  (cpu_hotplug_lock.rw_sem){++++}, at: [<ffffffff8117f892>] static_key_enable+0x12/0x30
    [  236.797058]
                   but task is already holding lock:
    [  236.803552]  (&type->s_umount_key#37/1){+.+.}, at: [<ffffffff81208b2f>] sget_userns+0x32f/0x520
    [  236.813247]
                   which lock already depends on the new lock.
    
    [  236.822353]
                   the existing dependency chain (in reverse order) is:
    [  236.830686]
                   -> #4 (&type->s_umount_key#37/1){+.+.}:
    [  236.837756]        __lock_acquire+0x1100/0x11a0
    [  236.842799]        lock_acquire+0xdf/0x1d0
    [  236.847363]        down_write_nested+0x46/0x80
    [  236.852310]        sget_userns+0x32f/0x520
    [  236.856873]        kernfs_mount_ns+0x7e/0x1f0
    [  236.861728]        rdt_mount+0x30c/0x440
    [  236.866096]        mount_fs+0x38/0x150
    [  236.870262]        vfs_kern_mount+0x67/0x150
    [  236.875015]        do_mount+0x1df/0xd50
    [  236.879286]        SyS_mount+0x95/0xe0
    [  236.883464]        entry_SYSCALL_64_fastpath+0x18/0xad
    [  236.889183]
                   -> #3 (rdtgroup_mutex){+.+.}:
    [  236.895292]        __lock_acquire+0x1100/0x11a0
    [  236.900337]        lock_acquire+0xdf/0x1d0
    [  236.904899]        __mutex_lock+0x80/0x8f0
    [  236.909459]        mutex_lock_nested+0x1b/0x20
    [  236.914407]        intel_rdt_online_cpu+0x3b/0x4a0
    [  236.919745]        cpuhp_invoke_callback+0xce/0xb80
    [  236.925177]        cpuhp_thread_fun+0x1c5/0x230
    [  236.930222]        smpboot_thread_fn+0x11a/0x1e0
    [  236.935362]        kthread+0x152/0x190
    [  236.939536]        ret_from_fork+0x27/0x40
    [  236.944097]
                   -> #2 (cpuhp_state-up){+.+.}:
    [  236.950199]        __lock_acquire+0x1100/0x11a0
    [  236.955241]        lock_acquire+0xdf/0x1d0
    [  236.959800]        cpuhp_issue_call+0x12e/0x1c0
    [  236.964845]        __cpuhp_setup_state_cpuslocked+0x13b/0x2f0
    [  236.971242]        __cpuhp_setup_state+0xa7/0x120
    [  236.976483]        page_writeback_init+0x43/0x67
    [  236.981623]        pagecache_init+0x38/0x3b
    [  236.986281]        start_kernel+0x3c6/0x41a
    [  236.990931]        x86_64_start_reservations+0x2a/0x2c
    [  236.996650]        x86_64_start_kernel+0x72/0x75
    [  237.001793]        verify_cpu+0x0/0xfb
    [  237.005966]
                   -> #1 (cpuhp_state_mutex){+.+.}:
    [  237.012364]        __lock_acquire+0x1100/0x11a0
    [  237.017408]        lock_acquire+0xdf/0x1d0
    [  237.021969]        __mutex_lock+0x80/0x8f0
    [  237.026527]        mutex_lock_nested+0x1b/0x20
    [  237.031475]        __cpuhp_setup_state_cpuslocked+0x54/0x2f0
    [  237.037777]        __cpuhp_setup_state+0xa7/0x120
    [  237.043013]        page_alloc_init+0x28/0x30
    [  237.047769]        start_kernel+0x148/0x41a
    [  237.052425]        x86_64_start_reservations+0x2a/0x2c
    [  237.058145]        x86_64_start_kernel+0x72/0x75
    [  237.063284]        verify_cpu+0x0/0xfb
    [  237.067456]
                   -> #0 (cpu_hotplug_lock.rw_sem){++++}:
    [  237.074436]        check_prev_add+0x401/0x800
    [  237.079286]        __lock_acquire+0x1100/0x11a0
    [  237.084330]        lock_acquire+0xdf/0x1d0
    [  237.088890]        cpus_read_lock+0x42/0x90
    [  237.093548]        static_key_enable+0x12/0x30
    [  237.098496]        rdt_mount+0x406/0x440
    [  237.102862]        mount_fs+0x38/0x150
    [  237.107035]        vfs_kern_mount+0x67/0x150
    [  237.111787]        do_mount+0x1df/0xd50
    [  237.116058]        SyS_mount+0x95/0xe0
    [  237.120233]        entry_SYSCALL_64_fastpath+0x18/0xad
    [  237.125952]
                   other info that might help us debug this:
    
    [  237.134867] Chain exists of:
                     cpu_hotplug_lock.rw_sem --> rdtgroup_mutex --> &type->s_umount_key#37/1
    
    [  237.148425]  Possible unsafe locking scenario:
    
    [  237.155015]        CPU0                    CPU1
    [  237.160057]        ----                    ----
    [  237.165100]   lock(&type->s_umount_key#37/1);
    [  237.169952]                                lock(rdtgroup_mutex);
    [  237.176641]                                lock(&type->s_umount_key#37/1);
    [  237.184287]   lock(cpu_hotplug_lock.rw_sem);
    [  237.189041]
                    *** DEADLOCK ***
    
    When the resctrl filesystem is mounted the locks must be acquired in the
    same order as was done when the cpus came online:
    
         cpu_hotplug_lock before rdtgroup_mutex.
    
    This also requires switching the static_branch_enable() calls to the
    _cpuslocked variant because the cpu hotplug lock is already held.
    
    [ tglx: Switched to cpus_read_[un]lock ]
    Reported-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
    Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
    Tested-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
    Acked-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
    Cc: fenghua.yu@intel.com
    Cc: tony.luck@intel.com
    Link: https://lkml.kernel.org/r/9c41b91bc2f47d9e95b62b213ecdb45623c47a9f.1508490116.git.reinette.chatre@intel.com
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
intel_rdt_rdtgroup.c 47.1 KB