rhashtable: avoid large lock-array allocations
Sander reports following splat after netfilter nat bysrc table got converted to rhashtable: swapper/0: page allocation failure: order:3, mode:0x2084020(GFP_ATOMIC|__GFP_COMP) CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.0-rc1 [..] [<ffffffff811633ed>] warn_alloc_failed+0xdd/0x140 [<ffffffff811638b1>] __alloc_pages_nodemask+0x3e1/0xcf0 [<ffffffff811a72ed>] alloc_pages_current+0x8d/0x110 [<ffffffff8117cb7f>] kmalloc_order+0x1f/0x70 [<ffffffff811aec19>] __kmalloc+0x129/0x140 [<ffffffff8146d561>] bucket_table_alloc+0xc1/0x1d0 [<ffffffff8146da1d>] rhashtable_insert_rehash+0x5d/0xe0 [<ffffffff819fcfff>] nf_nat_setup_info+0x2ef/0x400 The failure happens when allocating the spinlock array. Even with GFP_KERNEL its unlikely for such a large allocation to succeed. Thomas Graf pointed me at inet_ehash_locks_alloc(), so in addition to adding NOWARN for atomic allocations this also makes the bucket-array sizing more conservative. In commit 095dc8e0 ("tcp: fix/cleanup inet_ehash_locks_alloc()"), Eric Dumazet says: "Budget 2 cache lines per cpu worth of 'spinlocks'". IOW, consider size needed by a single spinlock when determining number of locks per cpu. So with 64 byte per cacheline and 4 byte per spinlock this gives 32 locks per cpu. Resulting size of the lock-array (sizeof(spinlock) == 4): cpus: 1 2 4 8 16 32 64 old: 1k 1k 4k 8k 16k 16k 16k new: 128 256 512 1k 2k 4k 8k 8k allocation should have decent chance of success even with GFP_ATOMIC, and should not fail with GFP_KERNEL. With 72-byte spinlock (LOCKDEP): cpus : 1 2 old: 9k 18k new: ~2k ~4k Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Suggested-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Showing
Please register or sign in to comment