1. 03 Feb, 2017 10 commits
  2. 02 Feb, 2017 2 commits
  3. 23 Jan, 2017 16 commits
  4. 13 Jan, 2017 9 commits
  5. 12 Jan, 2017 3 commits
    • crypto: arm64/aes - reimplement bit-sliced ARM/NEON implementation for arm64 · 1abee99e
      Ard Biesheuvel authored
      This is a reimplementation of the NEON version of the bit-sliced AES
      algorithm. This code is heavily based on Andy Polyakov's OpenSSL version
      for ARM, which is also available in the kernel. This is an alternative for
      the existing NEON implementation for arm64 authored by me, which suffers
      from poor performance due to its reliance on the pathologically slow
      four-register variant of the tbl/tbx NEON instruction.
      
      This version is about 30% (*) faster than the generic C code, but only in
      cases where the input can be 8x interleaved (this is a fundamental property
      of bit slicing). For this reason, only the chaining modes ECB, XTS and CTR
      are implemented. (ECB is included because other chaining modes could
      potentially be built on top of it.)
      
      * Measured on Cortex-A57. Note that this is still an order of magnitude
        slower than the implementations that use the dedicated AES instructions
        introduced in ARMv8, but those are part of an optional extension, and so
        it is good to have a fallback.
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: arm/aes - replace scalar AES cipher · 81edb426
      Ard Biesheuvel authored
      This replaces the scalar AES cipher that originates in the OpenSSL project
      with a new implementation that is ~15% (*) faster (on modern cores), and
      reuses the lookup tables and the key schedule generation routines from the
      generic C implementation (which is usually compiled in anyway due to
      networking and other subsystems depending on it).
      
      Note that the bit-sliced NEON code for AES still depends on the scalar cipher
      that this patch replaces, so it is not removed entirely yet.
      
      * On Cortex-A57, the cost drops from 17.0 to 14.9 cycles per byte
        for 128-bit keys.
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    • crypto: arm64/aes - add scalar implementation · bed593c0
      Ard Biesheuvel authored
      This adds a scalar implementation of AES, based on the precomputed tables
      that are exposed by the generic AES code. Since rotates are cheap on arm64,
      this implementation only uses the 4 core tables (of 1 KB each), and avoids
      the prerotated ones, reducing the D-cache footprint by 75%.
      
      On Cortex-A57, this code manages 13.0 cycles per byte, which is ~34% faster
      than the generic C code. (Note that this is still >13x slower than the code
      that uses the optional ARMv8 Crypto Extensions, which manages <1 cycle per
      byte.)
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
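
The table trade-off described in the arm64 scalar commit (bed593c0) can be illustrated with a small C sketch. It is not the kernel code: the names `te`, `column_prerotated` and `column_rotated` are hypothetical, the S-box is computed at run time rather than taken from the generic AES tables, and the round key XOR is left out. It only shows that one 1 KB forward table plus rotates yields the same column as four prerotated tables, assuming the little-endian word layout used by the generic C tables.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1. */
static uint8_t gf_mul(uint8_t a, uint8_t b)
{
	uint8_t r = 0;

	while (b) {
		if (b & 1)
			r ^= a;
		a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1b : 0));
		b >>= 1;
	}
	return r;
}

/* Rotate helpers; n must be in 1..7 and 1..31 respectively. */
static uint8_t rotl8(uint8_t x, unsigned n)
{
	return (uint8_t)((x << n) | (x >> (8 - n)));
}

static uint32_t rol32(uint32_t x, unsigned n)
{
	return (x << n) | (x >> (32 - n));
}

/* AES S-box: multiplicative inverse (brute force) plus affine transform. */
static uint8_t sbox_byte(uint8_t x)
{
	uint8_t inv = 0;
	unsigned c;

	for (c = 1; x && c < 256; c++) {
		if (gf_mul(x, (uint8_t)c) == 1) {
			inv = (uint8_t)c;
			break;
		}
	}
	return (uint8_t)(inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^
			 rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63);
}

/* te[0] is the core table; te[1..3] are its prerotated copies (hypothetical). */
static uint32_t te[4][256];

static void build_tables(void)
{
	unsigned i, k;

	for (i = 0; i < 256; i++) {
		uint8_t s = sbox_byte((uint8_t)i);

		/* Bytes from low to high: 2*S, S, S, 3*S (SubBytes + MixColumns). */
		te[0][i] = (uint32_t)gf_mul(s, 2) | ((uint32_t)s << 8) |
			   ((uint32_t)s << 16) | ((uint32_t)gf_mul(s, 3) << 24);
		for (k = 1; k < 4; k++)
			te[k][i] = rol32(te[0][i], 8 * k);
	}
}

/* One output column using four tables (4 KB of lookups). */
static uint32_t column_prerotated(const uint8_t b[4])
{
	return te[0][b[0]] ^ te[1][b[1]] ^ te[2][b[2]] ^ te[3][b[3]];
}

/* The same column using only te[0] (1 KB) and three rotates. */
static uint32_t column_rotated(const uint8_t b[4])
{
	return te[0][b[0]] ^ rol32(te[0][b[1]], 8) ^
	       rol32(te[0][b[2]], 16) ^ rol32(te[0][b[3]], 24);
}

/* Returns 1 if both variants agree on a sweep of input bytes. */
static int columns_agree(void)
{
	unsigned i;

	build_tables();
	for (i = 0; i < 256; i++) {
		uint8_t b[4] = { (uint8_t)i, (uint8_t)(i * 7), (uint8_t)(i ^ 0x5a),
				 (uint8_t)(255 - i) };

		if (column_prerotated(b) != column_rotated(b))
			return 0;
	}
	return 1;
}
```

Calling `columns_agree()` checks the two variants against each other. Whether the rotate variant is actually faster depends on the core; the commit message states only that rotates are cheap on arm64.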