Commit 4ad096cc authored by Eric Biggers, committed by Herbert Xu

crypto: x86/nh-avx2 - add missing vzeroupper

Since nh_avx2() uses ymm registers, execute vzeroupper before returning
from it.  This is necessary to avoid reducing the performance of SSE
code.

Fixes: 0f961f9f ("crypto: x86/nhpoly1305 - add AVX2 accelerated NHPoly1305")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
parent 8f0e0cf7
@@ -154,5 +154,6 @@ SYM_TYPED_FUNC_START(nh_avx2)
 vpaddq T1, T0, T0
 vpaddq T4, T0, T0
 vmovdqu T0, (HASH)
+vzeroupper
 RET
 SYM_FUNC_END(nh_avx2)
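
The rule in the commit message (clear the upper halves of the ymm registers before leaving AVX2 code) can be illustrated outside the kernel with C intrinsics. The following is a minimal sketch, not the kernel's code: `add4_u64_avx2`, `add4_u64_scalar`, and `add4_u64` are hypothetical names, and it assumes an x86-64 target built with GCC or Clang. `_mm256_zeroupper()` is the intrinsic that emits `vzeroupper`. Compilers commonly insert this instruction on their own at the end of AVX functions, so the explicit call here is mostly illustrative; hand-written assembly such as `nh_avx2` has to issue it itself.

```c
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper, not from the kernel: adds four 64-bit lanes using
 * ymm registers, then clears the upper ymm halves before returning. */
__attribute__((target("avx2")))
static void add4_u64_avx2(const uint64_t *a, const uint64_t *b, uint64_t *out)
{
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    __m256i sum = _mm256_add_epi64(va, vb);    /* 256-bit vpaddq */

    _mm256_storeu_si256((__m256i *)out, sum);  /* 256-bit vmovdqu store */
    _mm256_zeroupper();                        /* vzeroupper before returning */
}

/* Scalar fallback for CPUs without AVX2. */
static void add4_u64_scalar(const uint64_t *a, const uint64_t *b, uint64_t *out)
{
    for (int i = 0; i < 4; i++)
        out[i] = a[i] + b[i];
}

/* Picks the AVX2 path only when the running CPU reports support for it. */
static void add4_u64(const uint64_t *a, const uint64_t *b, uint64_t *out)
{
    if (__builtin_cpu_supports("avx2"))
        add4_u64_avx2(a, b, out);
    else
        add4_u64_scalar(a, b, out);
}
```

Calling `add4_u64(a, b, out)` with three arrays of four `uint64_t` values writes the lane-wise sums into `out`. The result is the same with or without the `_mm256_zeroupper()` call; its purpose is performance, since it keeps SSE code that runs afterwards from being slowed by ymm upper halves left in use.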