    net: Force inlining of checksum functions in net/checksum.h · 5486f5bf
    Christophe Leroy authored
    All functions defined as static inline in net/checksum.h are
    meant to be inlined for performance reasons.
    
    But since commit ac7c3e4f ("compiler: enable
    CONFIG_OPTIMIZE_INLINING forcibly") the compiler is allowed to
    uninline functions when it wants.
    
    Fair enough in the general case, but for tiny performance-critical
    checksum helpers that's counter-productive.
    
    The problem mainly arises when selecting CONFIG_CC_OPTIMIZE_FOR_SIZE.
    Since those helpers are 'static inline' in header files, you suddenly
    find them duplicated many times in the resulting vmlinux.
    
    Here is a typical example when building powerpc pmac32_defconfig
    with CONFIG_CC_OPTIMIZE_FOR_SIZE. csum_sub() appears 4 times:
    
    	c04a23cc <csum_sub>:
    	c04a23cc:	7c 84 20 f8 	not     r4,r4
    	c04a23d0:	7c 63 20 14 	addc    r3,r3,r4
    	c04a23d4:	7c 63 01 94 	addze   r3,r3
    	c04a23d8:	4e 80 00 20 	blr
    		...
    	c04a2ce8:	4b ff f6 e5 	bl      c04a23cc <csum_sub>
    		...
    	c04a2d2c:	4b ff f6 a1 	bl      c04a23cc <csum_sub>
    		...
    	c04a2d54:	4b ff f6 79 	bl      c04a23cc <csum_sub>
    		...
    	c04a754c <csum_sub>:
    	c04a754c:	7c 84 20 f8 	not     r4,r4
    	c04a7550:	7c 63 20 14 	addc    r3,r3,r4
    	c04a7554:	7c 63 01 94 	addze   r3,r3
    	c04a7558:	4e 80 00 20 	blr
    		...
    	c04ac930:	4b ff ac 1d 	bl      c04a754c <csum_sub>
    		...
    	c04ad264:	4b ff a2 e9 	bl      c04a754c <csum_sub>
    		...
    	c04e3b08 <csum_sub>:
    	c04e3b08:	7c 84 20 f8 	not     r4,r4
    	c04e3b0c:	7c 63 20 14 	addc    r3,r3,r4
    	c04e3b10:	7c 63 01 94 	addze   r3,r3
    	c04e3b14:	4e 80 00 20 	blr
    		...
    	c04e5788:	4b ff e3 81 	bl      c04e3b08 <csum_sub>
    		...
    	c04e65c8:	4b ff d5 41 	bl      c04e3b08 <csum_sub>
    		...
    	c0512d34 <csum_sub>:
    	c0512d34:	7c 84 20 f8 	not     r4,r4
    	c0512d38:	7c 63 20 14 	addc    r3,r3,r4
    	c0512d3c:	7c 63 01 94 	addze   r3,r3
    	c0512d40:	4e 80 00 20 	blr
    		...
    	c0512dfc:	4b ff ff 39 	bl      c0512d34 <csum_sub>
    		...
    	c05138bc:	4b ff f4 79 	bl      c0512d34 <csum_sub>
    		...
    
    Restore the expected behaviour by using __always_inline for all
    functions defined in net/checksum.h
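
    As a minimal userspace sketch of what forcing inlining looks like
    (not the actual patch): the macro sketch_always_inline and the
    sketch_-prefixed helpers are hypothetical names chosen for
    illustration, and plain uint32_t stands in for the kernel's __wsum
    type. The macro is assumed to mirror how __always_inline is spelled
    for GCC and Clang.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's __always_inline (GCC/Clang only). */
#define sketch_always_inline inline __attribute__((__always_inline__))

/* 32-bit ones' complement add: fold the carry out of bit 31 back in. */
static sketch_always_inline uint32_t sketch_csum_add(uint32_t csum,
						     uint32_t addend)
{
	uint32_t res = csum + addend;

	/* res < addend is true exactly when the addition wrapped around. */
	return res + (res < addend);
}

/* Ones' complement subtract: add the bitwise complement of addend. */
static sketch_always_inline uint32_t sketch_csum_sub(uint32_t csum,
						     uint32_t addend)
{
	return sketch_csum_add(csum, ~addend);
}
```

    With the attribute in place the compiler is no longer free to emit an
    out-of-line copy of either helper, even when optimizing for size, so
    each call site should reduce to a short sequence like the
    not/addc/addze instructions in the listing above.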
    
    vmlinux size is even reduced by 256 bytes with this patch
    (160 bytes of text and 96 bytes of data):
    
    	   text	   data	    bss	    dec	    hex	filename
    	6980022	2515362	 194384	9689768	 93daa8	vmlinux.before
    	6979862	2515266	 194384	9689512	 93d9a8	vmlinux.now
    
    Fixes: ac7c3e4f ("compiler: enable CONFIG_OPTIMIZE_INLINING forcibly")
    Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
    Cc: Nick Desaulniers <ndesaulniers@google.com>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Signed-off-by: David S. Miller <davem@davemloft.net>