    crypto: aria-avx - add AES-NI/AVX/x86_64/GFNI assembler implementation of aria cipher · ba3579e6
    Taehee Yoo authored
    The implementation is based on the 32-bit implementation of ARIA.
    The aria-avx processing steps are similar to those of camellia-avx:
    1. Byteslice (16-way).
    2. Add-round-key.
    3. S-box.
    4. Diffusion layer.
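
As an illustration of step 1, byteslicing 16 blocks can be viewed as transposing a 16x16 byte matrix, so that row i holds byte i of every block and later steps can treat all 16 blocks in parallel. The Python sketch below only models that idea; the function name and test data are hypothetical, and the register layout used by the assembler code may differ.

```python
def byteslice(blocks):
    """Transpose 16 blocks of 16 bytes: output row i holds byte i of every block."""
    if len(blocks) != 16 or any(len(b) != 16 for b in blocks):
        raise ValueError("expected 16 blocks of 16 bytes")
    return [bytes(block[i] for block in blocks) for i in range(16)]


# Hypothetical test data: block n is filled with bytes 16*n .. 16*n + 15.
blocks = [bytes(range(16 * n, 16 * n + 16)) for n in range(16)]
sliced = byteslice(blocks)

# Applying the same transpose again restores the original block layout.
restored = byteslice(sliced)
```

Because the transform is its own inverse, the same routine models both the byteslice step on input and the reverse step on output.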
    
    Except for the s-box step, all steps are the same as in the
    aria-generic implementation. The s-box step is very similar to the
    camellia and sm4 implementations.
    
    There are 2 implementations of the s-box step.
    One uses AES-NI and an affine transformation, which is the same
    approach as camellia, sm4, and others.
    The other uses GFNI.
    The GFNI implementation is faster than the AES-NI implementation,
    so it is used if the running CPU supports GFNI.
    
    ARIA has 4 s-boxes, and 2 of them are the same as the AES s-boxes.
    
    To calculate the first s-box, it just uses aesenclast and then
    inverts shift_row.
    No further processing is needed because the first s-box is the same
    as the AES encryption s-box.
    
    To calculate the second s-box (the inverse of s1), it just uses
    aesdeclast and then inverts shift_row.
    No further processing is needed because the second s-box is the same
    as the AES decryption s-box.
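
The two AES-based s-boxes can be modelled in Python as a rough sketch. It emulates aesenclast (ShiftRows, SubBytes, XOR with the round key) and aesdeclast (InvShiftRows, InvSubBytes, XOR with the round key) on a 16-byte state, then undoes the row permutation so that only the per-byte substitution remains. The all-zero round key and every helper name here are assumptions made for illustration, not taken from the assembler source.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def gf_inv(a):
    """Multiplicative inverse computed as a^254; 0 maps to 0."""
    r = 1
    for _ in range(254):
        r = gf_mul(r, a)
    return r


def rotl8(x, n):
    return ((x << n) | (x >> (8 - n))) & 0xFF


def aes_affine(x):
    """AES forward affine transformation."""
    return x ^ rotl8(x, 1) ^ rotl8(x, 2) ^ rotl8(x, 3) ^ rotl8(x, 4) ^ 0x63


# AES encryption s-box and its inverse (the AES decryption s-box).
SBOX = [aes_affine(gf_inv(x)) for x in range(256)]
INV_SBOX = [0] * 256
for x, y in enumerate(SBOX):
    INV_SBOX[y] = x


def shift_rows(state):
    """AES ShiftRows on a column-major 16-byte state."""
    return bytes(state[(i % 4) + 4 * ((i // 4 + i % 4) % 4)] for i in range(16))


def inv_shift_rows(state):
    """AES InvShiftRows on a column-major 16-byte state."""
    return bytes(state[(i % 4) + 4 * ((i // 4 - i % 4) % 4)] for i in range(16))


def aesenclast(state, key):
    """Emulate AESENCLAST: ShiftRows, SubBytes, then XOR with the round key."""
    return bytes(SBOX[b] ^ k for b, k in zip(shift_rows(state), key))


def aesdeclast(state, key):
    """Emulate AESDECLAST: InvShiftRows, InvSubBytes, then XOR with the round key."""
    return bytes(INV_SBOX[b] ^ k for b, k in zip(inv_shift_rows(state), key))


ZERO_KEY = bytes(16)  # assumption: a zero round key leaves only the substitution


def s1_lanes(state):
    """First s-box on all 16 bytes: aesenclast, then undo its ShiftRows."""
    return inv_shift_rows(aesenclast(state, ZERO_KEY))


def s1_inv_lanes(state):
    """Second s-box on all 16 bytes: aesdeclast, then undo its InvShiftRows."""
    return shift_rows(aesdeclast(state, ZERO_KEY))


sample = bytes(range(16))
substituted = s1_lanes(sample)
```

Since the substitution acts on each byte independently, it commutes with the byte permutation, which is why a single shuffle after the AES instruction is enough to recover a plain per-byte s-box lookup.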
    
    To calculate the third s-box, it uses aesenclast followed by an
    affine transformation that combines the AES inverse affine and
    ARIA S2.
    
    To calculate the last s-box, it uses aesdeclast followed by an
    affine transformation that combines X2 and the AES forward affine.
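
The idea of folding two affine transformations into one can be sketched in Python. Two affine byte maps over GF(2), x -> A1*x + c1 followed by x -> A2*x + c2, collapse into the single map x -> (A2*A1)*x + (A2*c1 + c2), so only one transformation has to be applied after the AES instruction. The ARIA S2 and X2 constants are not reproduced here; as a stand-in, the sketch combines the AES inverse affine with the AES forward affine, whose composition must be the identity. All helper names are hypothetical.

```python
def rotl8(x, n):
    return ((x << n) | (x >> (8 - n))) & 0xFF


def aes_affine(x):
    """AES forward affine transformation."""
    return x ^ rotl8(x, 1) ^ rotl8(x, 2) ^ rotl8(x, 3) ^ rotl8(x, 4) ^ 0x63


def aes_inv_affine(x):
    """AES inverse affine transformation."""
    return rotl8(x, 1) ^ rotl8(x, 3) ^ rotl8(x, 6) ^ 0x05


def to_matrix(f):
    """Recover (columns, constant) of an affine byte map by probing it."""
    const = f(0)
    cols = [f(1 << j) ^ const for j in range(8)]
    return cols, const


def apply_affine(m, x):
    """Evaluate the affine map m = (columns, constant) on byte x."""
    cols, const = m
    y = const
    for j in range(8):
        if (x >> j) & 1:
            y ^= cols[j]
    return y


def compose(outer, inner):
    """Return the single affine map equal to x -> outer(inner(x))."""
    ocols, oconst = outer
    icols, iconst = inner
    linear = lambda v: apply_affine((ocols, 0), v)  # linear part of outer only
    return [linear(c) for c in icols], linear(iconst) ^ oconst


forward = to_matrix(aes_affine)
inverse = to_matrix(aes_inv_affine)

# Stand-in for the combined transformation: inverse affine after forward affine.
combined = compose(inverse, forward)
```

With real ARIA constants in place of the stand-in, the same `compose` step would yield one matrix and one constant per s-box, which is the form that a single affine instruction or a pair of table-based shuffles can evaluate.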
    
    The optimized third and last s-box logic and GFNI s-box logic are
    implemented by Jussi Kivilinna.
    
    The aria-generic implementation is based on a 32-bit implementation,
    not an 8-bit implementation. The aria-avx diffusion layer is based on
    the aria-generic implementation because an 8-bit implementation does
    not suit parallel processing, whereas a 32-bit implementation does.

    Signed-off-by: Taehee Yoo <ap420073@gmail.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
aria-aesni-avx-asm_64.S 37.6 KB