    arm64: fix soft lockup due to large tlb flush range · 05ac6530
    Mark Salter authored
    Under certain loads, this soft lockup has been observed:
    
       BUG: soft lockup - CPU#2 stuck for 22s! [ip6tables:1016]
       Modules linked in: ip6t_rpfilter ip6t_REJECT cfg80211 rfkill xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw vfat fat efivarfs xfs libcrc32c
    
       CPU: 2 PID: 1016 Comm: ip6tables Not tainted 3.13.0-0.rc7.30.sa2.aarch64 #1
       task: fffffe03e81d1400 ti: fffffe03f01f8000 task.ti: fffffe03f01f8000
       PC is at __cpu_flush_kern_tlb_range+0xc/0x40
       LR is at __purge_vmap_area_lazy+0x28c/0x3ac
       pc : [<fffffe000009c5cc>] lr : [<fffffe0000182710>] pstate: 80000145
       sp : fffffe03f01fbb70
       x29: fffffe03f01fbb70 x28: fffffe03f01f8000
       x27: fffffe0000b19000 x26: 00000000000000d0
       x25: 000000000000001c x24: fffffe03f01fbc50
       x23: fffffe03f01fbc58 x22: fffffe03f01fbc10
       x21: fffffe0000b2a3f8 x20: 0000000000000802
       x19: fffffe0000b2a3c8 x18: 000003fffdf52710
       x17: 000003ff9d8bb910 x16: fffffe000050fbfc
       x15: 0000000000005735 x14: 000003ff9d7e1a5c
       x13: 0000000000000000 x12: 000003ff9d7e1a5c
       x11: 0000000000000007 x10: fffffe0000c09af0
       x9 : fffffe0000ad1000 x8 : 000000000000005c
       x7 : fffffe03e8624000 x6 : 0000000000000000
       x5 : 0000000000000000 x4 : 0000000000000000
       x3 : fffffe0000c09cc8 x2 : 0000000000000000
       x1 : 000fffffdfffca80 x0 : 000fffffcd742150
    
    The __cpu_flush_kern_tlb_range() function looks like:
    
      ENTRY(__cpu_flush_kern_tlb_range)
    	dsb	sy
    	lsr	x0, x0, #12
    	lsr	x1, x1, #12
      1:	tlbi	vaae1is, x0
    	add	x0, x0, #1
    	cmp	x0, x1
    	b.lo	1b
    	dsb	sy
    	isb
    	ret
      ENDPROC(__cpu_flush_kern_tlb_range)
    
    The above soft lockup shows the PC at the tlbi instruction with:
    
      x0 = 0x000fffffcd742150
      x1 = 0x000fffffdfffca80
    
    So __cpu_flush_kern_tlb_range() still has 0x128ba930 tlbi flushes
    left after it has already been looping for at least 22 seconds!
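
The remaining count follows directly from the two register values. As a minimal user-space sketch (the function name `tlbi_flushes_left` is hypothetical and not part of the kernel), the loop in the listing issues one tlbi per value of x0 until x0 reaches x1:

```c
#include <stdint.h>

/*
 * Model of the loop in __cpu_flush_kern_tlb_range() when the PC is at
 * the tlbi instruction. x0 and x1 are the already-shifted (lsr #12)
 * start and end values. The loop tests its condition after the tlbi,
 * so at least one more tlbi is always issued.
 */
uint64_t tlbi_flushes_left(uint64_t x0, uint64_t x1)
{
	return x1 > x0 ? x1 - x0 : 1;
}
```

Calling `tlbi_flushes_left(0x000fffffcd742150, 0x000fffffdfffca80)` yields 0x128ba930, which is roughly 311 million tlbi operations still to go.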
    
    Looking up one frame at __purge_vmap_area_lazy(), there is:
    
    	...
    	list_for_each_entry_rcu(va, &vmap_area_list, list) {
    		if (va->flags & VM_LAZY_FREE) {
    			if (va->va_start < *start)
    				*start = va->va_start;
    			if (va->va_end > *end)
    				*end = va->va_end;
    			nr += (va->va_end - va->va_start) >> PAGE_SHIFT;
    			list_add_tail(&va->purge_list, &valist);
    			va->flags |= VM_LAZY_FREEING;
    			va->flags &= ~VM_LAZY_FREE;
    		}
    	}
    	...
    	if (nr || force_flush)
    		flush_tlb_kernel_range(*start, *end);
    
    So if two widely separated areas are being freed, the single
    range passed to flush_tlb_kernel_range() covers everything
    between them and may be as large as the whole vmalloc space.
    For arm64, this is ~240GB with 4k pages and ~2TB with 64k pages.
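
To make the effect of the min/max accumulation concrete, here is a small user-space sketch. It is not kernel code: `struct area`, `struct purge_summary` and `summarize()` are hypothetical names, and 4k pages are assumed. It compares the number of pages actually freed (`nr` in the loop above) with the number of pages covered by the merged range:

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumption: 4k pages */

/* Hypothetical stand-in for the va_start/va_end fields of a vmap area. */
struct area {
	uint64_t start;
	uint64_t end;
};

struct purge_summary {
	uint64_t start;		/* lowest start address seen */
	uint64_t end;		/* highest end address seen */
	uint64_t nr_freed;	/* pages that are actually being freed */
	uint64_t nr_flushed;	/* pages covered by [start, end) */
};

/* Mirror the accumulation done in the list_for_each_entry_rcu() loop. */
struct purge_summary summarize(const struct area *areas, size_t n)
{
	struct purge_summary s = { UINT64_MAX, 0, 0, 0 };

	for (size_t i = 0; i < n; i++) {
		if (areas[i].start < s.start)
			s.start = areas[i].start;
		if (areas[i].end > s.end)
			s.end = areas[i].end;
		s.nr_freed += (areas[i].end - areas[i].start) >> PAGE_SHIFT;
	}
	if (n)
		s.nr_flushed = (s.end - s.start) >> PAGE_SHIFT;
	return s;
}
```

With two one-page areas placed 1GB apart, `nr_freed` is 2 while `nr_flushed` is 262145, so the per-page flush loop does far more work than the amount of freed memory suggests.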
    
    This patch works around this problem by adding a loop limit.
    If the range is larger than the limit, use flush_tlb_all()
    rather than flushing based on individual pages. The limit
    chosen is arbitrary as the TLB size is implementation
    specific and not accessible in an architected way. The aim
    of the arbitrary limit is to avoid soft lockup.
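
The workaround described above can be sketched in user space as follows. This is an illustration under stated assumptions, not the patch itself: the name MAX_TLB_RANGE comes from the commit notes, but its value here (1024 pages of 4k) is assumed, and the stub functions and counters are hypothetical replacements for the real per-page flush and flush_tlb_all():

```c
#include <stdint.h>

#define PAGE_SHIFT	12				/* assumption: 4k pages */
#define MAX_TLB_RANGE	((uint64_t)1024 << PAGE_SHIFT)	/* assumed arbitrary limit */

/* Counters that let the sketch be observed without real TLB operations. */
uint64_t per_page_flushes;
uint64_t full_flushes;

/* Hypothetical stub: stands in for the per-page tlbi loop. */
void stub_flush_range(uint64_t start, uint64_t end)
{
	per_page_flushes += (end - start) >> PAGE_SHIFT;
}

/* Hypothetical stub: stands in for flush_tlb_all(). */
void stub_flush_all(void)
{
	full_flushes++;
}

/* Flush page by page only when the range is within the loop limit. */
void flush_kernel_range_limited(uint64_t start, uint64_t end)
{
	if (end - start <= MAX_TLB_RANGE)
		stub_flush_range(start, end);
	else
		stub_flush_all();
}
```

In this sketch a two-page range goes through the per-page path, while a 1TB range results in a single full flush, which bounds the time spent in the loop regardless of how far apart the freed areas are.
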
    
    Signed-off-by: Mark Salter <msalter@redhat.com>
    [catalin.marinas@arm.com: commit log update]
    [catalin.marinas@arm.com: marginal optimisation]
    [catalin.marinas@arm.com: changed to MAX_TLB_RANGE and added comment]
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>