- 16 Mar, 2018 23 commits
-
-
Ard Biesheuvel authored
CBC MAC is strictly sequential, and so the current AES code simply processes the input one block at a time. However, we are about to add yield support, which adds a bit of overhead, and which we prefer to align with other modes in terms of granularity (i.e., it is better to have all routines yield every 64 bytes and not have an exception for CBC MAC which yields every 16 bytes). So unroll the loop by 4. We still cannot perform the AES algorithm in parallel, but we can at least merge the loads and stores. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
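A minimal C sketch of the unroll-by-4 idea (illustrative only, not the actual arm64 assembly; aes_encrypt_block is a stand-in for the hardware AES instruction sequence): the AES invocations stay strictly sequential, but the input loads can be grouped four blocks at a time.

#include <string.h>

#define AES_BLOCK_SIZE 16

/* Stand-in for the single-block AES primitive used by the asm code. */
void aes_encrypt_block(const void *key, unsigned char out[AES_BLOCK_SIZE],
                       const unsigned char in[AES_BLOCK_SIZE]);

/* CBC-MAC over whole blocks, unrolled by 4. */
static void cbcmac_update(const void *key, unsigned char mac[AES_BLOCK_SIZE],
                          const unsigned char *in, unsigned long blocks)
{
        unsigned char buf[4][AES_BLOCK_SIZE];
        int i, j;

        while (blocks >= 4) {
                memcpy(buf, in, sizeof(buf));        /* merged load of 4 blocks */
                for (i = 0; i < 4; i++) {
                        for (j = 0; j < AES_BLOCK_SIZE; j++)
                                mac[j] ^= buf[i][j];
                        aes_encrypt_block(key, mac, mac);   /* still sequential */
                }
                in += sizeof(buf);
                blocks -= 4;
        }
        while (blocks--) {                           /* handle the tail */
                for (j = 0; j < AES_BLOCK_SIZE; j++)
                        mac[j] ^= in[j];
                aes_encrypt_block(key, mac, mac);
                in += AES_BLOCK_SIZE;
        }
}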
-
Ard Biesheuvel authored
CBC encryption is strictly sequential, and so the current AES code simply processes the input one block at a time. However, we are about to add yield support, which adds a bit of overhead, and which we prefer to align with other modes in terms of granularity (i.e., it is better to have all routines yield every 64 bytes and not have an exception for CBC encrypt which yields every 16 bytes). So unroll the loop by 4. We still cannot perform the AES algorithm in parallel, but we can at least merge the loads and stores. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
The AES block mode implementation using Crypto Extensions or plain NEON was written before real hardware existed, and so its interleave factor was made build time configurable (as well as an option to instantiate all interleaved sequences inline rather than as subroutines). We ended up using INTERLEAVE=4 with inlining disabled for both flavors of the core AES routines, so let's stick with that, and remove the option to configure this at build time. This makes the code easier to modify, which is nice now that we're adding yield support. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
When kernel mode NEON was first introduced on arm64, the preserve and restore of the userland NEON state was completely unoptimized, and involved saving all registers on each call to kernel_neon_begin(), and restoring them on each call to kernel_neon_end(). For this reason, the NEON crypto code that was introduced at the time keeps the NEON enabled throughout the execution of the crypto API methods, which may include calls back into the crypto API that could result in memory allocation or other actions that we should avoid when running with preemption disabled. Since then, we have optimized the kernel mode NEON handling, which now restores lazily (upon return to userland), and so the preserve action is only costly the first time it is called after entering the kernel. So let's put the kernel_neon_begin() and kernel_neon_end() calls around the actual invocations of the NEON crypto code, and run the remainder of the code with kernel mode NEON disabled (and preemption enabled). Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
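A simplified C sketch of the resulting pattern (the aes_ecb_encrypt_neon helper name and the glue code around it are illustrative, not the driver's exact code): NEON is claimed only around the SIMD call, so skcipher_walk_done() and any allocation it triggers run with preemption enabled.

#include <asm/neon.h>
#include <crypto/aes.h>
#include <crypto/internal/skcipher.h>

/* Illustrative stand-in for the assembly entry point. */
void aes_ecb_encrypt_neon(u8 *dst, const u8 *src, const u32 *rk,
                          int rounds, int blocks);

static int ecb_encrypt(struct skcipher_request *req)
{
        struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
        struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
        int rounds = 6 + ctx->key_length / 4;
        struct skcipher_walk walk;
        unsigned int blocks;
        int err;

        /* Previously kernel_neon_begin() was called here, covering the
         * whole walk, including calls that may sleep or allocate. */
        err = skcipher_walk_virt(&walk, req, false);

        while ((blocks = walk.nbytes / AES_BLOCK_SIZE)) {
                kernel_neon_begin();            /* NEON held only here... */
                aes_ecb_encrypt_neon(walk.dst.virt.addr, walk.src.virt.addr,
                                     ctx->key_enc, rounds, blocks);
                kernel_neon_end();              /* ...and released again  */

                err = skcipher_walk_done(&walk, walk.nbytes % AES_BLOCK_SIZE);
        }
        return err;
}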
-
Ard Biesheuvel authored
When kernel mode NEON was first introduced on arm64, the preserve and restore of the userland NEON state was completely unoptimized, and involved saving all registers on each call to kernel_neon_begin(), and restoring them on each call to kernel_neon_end(). For this reason, the NEON crypto code that was introduced at the time keeps the NEON enabled throughout the execution of the crypto API methods, which may include calls back into the crypto API that could result in memory allocation or other actions that we should avoid when running with preemption disabled. Since then, we have optimized the kernel mode NEON handling, which now restores lazily (upon return to userland), and so the preserve action is only costly the first time it is called after entering the kernel. So let's put the kernel_neon_begin() and kernel_neon_end() calls around the actual invocations of the NEON crypto code, and run the remainder of the code with kernel mode NEON disabled (and preemption enabled). Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
When kernel mode NEON was first introduced on arm64, the preserve and restore of the userland NEON state was completely unoptimized, and involved saving all registers on each call to kernel_neon_begin(), and restoring them on each call to kernel_neon_end(). For this reason, the NEON crypto code that was introduced at the time keeps the NEON enabled throughout the execution of the crypto API methods, which may include calls back into the crypto API that could result in memory allocation or other actions that we should avoid when running with preemption disabled. Since then, we have optimized the kernel mode NEON handling, which now restores lazily (upon return to userland), and so the preserve action is only costly the first time it is called after entering the kernel. So let's put the kernel_neon_begin() and kernel_neon_end() calls around the actual invocations of the NEON crypto code, and run the remainder of the code with kernel mode NEON disabled (and preemption enabled). Note that this requires some reshuffling of the registers in the asm code, because the XTS routines can no longer rely on the registers to retain their contents between invocations. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
When kernel mode NEON was first introduced on arm64, the preserve and restore of the userland NEON state was completely unoptimized, and involved saving all registers on each call to kernel_neon_begin(), and restoring them on each call to kernel_neon_end(). For this reason, the NEON crypto code that was introduced at the time keeps the NEON enabled throughout the execution of the crypto API methods, which may include calls back into the crypto API that could result in memory allocation or other actions that we should avoid when running with preemption disabled. Since then, we have optimized the kernel mode NEON handling, which now restores lazily (upon return to userland), and so the preserve action is only costly the first time it is called after entering the kernel. So let's put the kernel_neon_begin() and kernel_neon_end() calls around the actual invocations of the NEON crypto code, and run the remainder of the code with kernel mode NEON disabled (and preemption enabled). Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
In order to be able to test yield support under preempt, add a test vector for CRC-T10DIF that is long enough to take multiple iterations (and thus possible preemption between them) of the primary loop of the accelerated x86 and arm64 implementations. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Kees Cook authored
On the quest to remove all VLAs from the kernel[1], this switches to a pair of kmalloc regions instead of using the stack. This also moves the get_random_bytes() after all allocations (and drops the needless "nbytes" variable). [1] https://lkml.org/lkml/2018/3/7/621 Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Tudor Ambarus <tudor.ambarus@microchip.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
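A hedged sketch of the pattern (names and sizes are illustrative, not the driver's actual code): the two on-stack VLAs become kmalloc'd regions, and get_random_bytes() only runs after every allocation has succeeded.

#include <linux/random.h>
#include <linux/slab.h>

static int example_op(unsigned int keylen, unsigned int blocklen)
{
        u8 *key, *block;
        int ret = -ENOMEM;

        /* Was: u8 key[keylen], block[blocklen];  -- VLAs on the stack. */
        key = kmalloc(keylen, GFP_KERNEL);
        if (!key)
                return -ENOMEM;
        block = kmalloc(blocklen, GFP_KERNEL);
        if (!block)
                goto free_key;

        /* Touch the RNG only after all allocations have succeeded. */
        get_random_bytes(key, keylen);

        /* ... use key and block ... */
        ret = 0;

        kfree(block);
free_key:
        kfree(key);
        return ret;
}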
-
Gary R Hook authored
The CCP driver copies data between scatter/gather lists and DMA buffers. The length of the requested copy operation must be checked against the available destination buffer length. Reported-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name> Signed-off-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Kamil Konieczny authored
Prevent improper use of the req->result field in ahash update, init, export and import functions in driver code. A driver should use the ahash request context if it needs to save internal state. Signed-off-by: Kamil Konieczny <k.konieczny@partner.samsung.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Peter Wu authored
virtio_crypto does not use the crypto_authenc_extractkeys function, so remove this unnecessary dependency. Compiles fine and passes cryptodev-linux cipher and speed tests from https://wiki.qemu.org/Features/VirtioCrypto Fixes: dbaf0624 ("crypto: add virtio-crypto driver") Signed-off-by: Peter Wu <peter@lekensteyn.nl> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Gilad Ben-Yossef authored
Add testmgr tests for the newly introduced SM4 ECB symmetric cipher. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Gilad Ben-Yossef authored
Introduce the SM4 cipher algorithms (OSCCA GB/T 32907-2016). SM4 (GBT.32907-2016) is a cryptographic standard issued by the Organization of State Commercial Administration of China (OSCCA) as an authorized cryptographic algorithm for use within China. SMS4 was originally created for use in protecting wireless networks, and is mandated in the Chinese National Standard for Wireless LAN WAPI (Wired Authentication and Privacy Infrastructure) (GB.15629.11-2003). Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Harsh Jain authored
Send multiple WRs to the hardware when the number of entries received in the scatter list cannot be sent in a single request. Signed-off-by: Harsh Jain <harsh@chelsio.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Harsh Jain authored
We use ctr(aes) as the fallback for rfc3686(ctr) requests. Send the updated IV to the fallback path. Signed-off-by: Harsh Jain <harsh@chelsio.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Harsh Jain authored
CBC decryption requires the last ciphertext block as the IV. If the src/dst buffers are the same, the last block will be overwritten by plain text. This patch copies the last block before sending the request to the hardware. Signed-off-by: Harsh Jain <harsh@chelsio.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
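A minimal sketch of the idea using the generic scatterwalk helper (surrounding request handling omitted, names illustrative): the last ciphertext block is copied out of the source scatterlist before the hardware overwrites it, so it can later serve as the IV.

#include <crypto/aes.h>
#include <crypto/scatterwalk.h>

/* Save the last ciphertext block of an in-place CBC decryption
 * before handing the request to the hardware. */
static void save_last_block(struct scatterlist *src, unsigned int nbytes,
                            u8 iv_out[AES_BLOCK_SIZE])
{
        scatterwalk_map_and_copy(iv_out, src, nbytes - AES_BLOCK_SIZE,
                                 AES_BLOCK_SIZE, 0 /* copy out of the sg */);
}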
-
Harsh Jain authored
The ulptx header cannot have a length > 64k. Adjust the length accordingly. Signed-off-by: Harsh Jain <harsh@chelsio.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Harsh Jain authored
Replace DIV_ROUND_UP with roundup or rounddown. Signed-off-by: Harsh Jain <harsh@chelsio.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
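For reference, the helpers differ in what they return (a count of units vs. an aligned length); an illustrative comparison, not the driver's actual call sites:

#include <linux/kernel.h>

/*
 * For len = 50 and a 16-byte unit:
 *   DIV_ROUND_UP(50, 16) == 4    number of 16-byte units needed
 *   roundup(50, 16)      == 64   length rounded up to a multiple of 16
 *   rounddown(50, 16)    == 48   length rounded down to a multiple of 16
 */
static inline unsigned int padded_len(unsigned int len)
{
        /* instead of DIV_ROUND_UP(len, 16) * 16 */
        return roundup(len, 16);
}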
-
Vladimir Zapolskiy authored
The driver works well on i.MX31 powered boards with the device description taken from the board device tree; the only change needed in the driver is the missing OF device id. The affected list of included headers and the indentation in the platform driver struct are beautified a little. Signed-off-by: Vladimir Zapolskiy <vz@mleia.com> Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com> Reviewed-by: Kim Phillips <kim.phillips@arm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Vladimir Zapolskiy authored
Freescale i.MX21 and i.MX31 SoCs contain a Random Number Generator Accelerator module (RNGA), which is replaced by the RNGB and RNGC modules on later i.MX SoC series; the change adds a new compatible property to describe the controller. Since all versions of the Freescale RNG modules are legacy, the documentation file apparently has no more potential for further extensions; nevertheless, generalize it by removing explicit RNGC specifics. Signed-off-by: Vladimir Zapolskiy <vz@mleia.com> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Eric Biggers authored
Add a NEON-accelerated implementation of Speck128-XTS and Speck64-XTS for ARM64. This is ported from the 32-bit version. It may be useful on devices with 64-bit ARM CPUs that don't have the Cryptography Extensions, so cannot do AES efficiently -- e.g. the Cortex-A53 processor on the Raspberry Pi 3. It generally works the same way as the 32-bit version, but there are some slight differences due to the different instructions, registers, and syntax available in ARM64 vs. in ARM32. For example, in the 64-bit version there are enough registers to hold the XTS tweaks for each 128-byte chunk, so they don't need to be saved on the stack. Benchmarks on a Raspberry Pi 3 running a 64-bit kernel:

Algorithm                        Encryption    Decryption
---------                        ----------    ----------
Speck64/128-XTS (NEON)            92.2 MB/s     92.2 MB/s
Speck128/256-XTS (NEON)           75.0 MB/s     75.0 MB/s
Speck128/256-XTS (generic)        47.4 MB/s     35.6 MB/s
AES-128-XTS (NEON bit-sliced)     33.4 MB/s     29.6 MB/s
AES-256-XTS (NEON bit-sliced)     24.6 MB/s     21.7 MB/s

The code performs well on higher-end ARM64 processors as well, though such processors tend to have the Crypto Extensions which make AES preferred. For example, here are the same benchmarks run on a HiKey960 (with CPU affinity set for the A73 cores), with the Crypto Extensions implementation of AES-256-XTS added:

Algorithm                        Encryption     Decryption
---------                        ----------     ----------
AES-256-XTS (Crypto Extensions)  1273.3 MB/s    1274.7 MB/s
Speck64/128-XTS (NEON)            359.8 MB/s     348.0 MB/s
Speck128/256-XTS (NEON)           292.5 MB/s     286.1 MB/s
Speck128/256-XTS (generic)        186.3 MB/s     181.8 MB/s
AES-128-XTS (NEON bit-sliced)     142.0 MB/s     124.3 MB/s
AES-256-XTS (NEON bit-sliced)     104.7 MB/s      91.1 MB/s

Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Markus Elfring authored
Reuse existing functionality from memdup_user() instead of keeping duplicate source code. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Reviewed-by: Brijesh Singh <brijesh.singh@amd.com> Acked-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- 09 Mar, 2018 17 commits
-
-
Gary R Hook authored
Any change to the result buffer should only happen on final, finup and digest operations. Changes to the buffer for update, import, export, etc, are not allowed. Fixes: 66d7b9f6175e ("crypto: testmgr - test misuse of result in ahash") Signed-off-by: Gary R Hook <gary.hook@amd.com> Cc: <stable@vger.kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Wu Fengguang authored
Fixes: 09c0f03b ("crypto: x86/des3_ede - convert to skcipher interface") Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Acked-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
James Bottomley authored
Apparently the ecdh use case was in Bluetooth, which always has single-element scatterlists, so the ecdh module was hard coded to expect them. Now that we're using this in the TPM code, we need multi-element scatterlists, so remove this limitation. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
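A hedged sketch of the general idea (the helper name is illustrative): rather than assuming a single entry and dereferencing sg_virt() directly, a possibly multi-element scatterlist can be linearized with sg_copy_to_buffer().

#include <linux/scatterlist.h>
#include <linux/slab.h>

/* Gather a possibly multi-element scatterlist into one linear buffer. */
static void *sg_linearize(struct scatterlist *sgl, unsigned int len)
{
        void *buf = kmalloc(len, GFP_KERNEL);

        if (!buf)
                return NULL;
        if (sg_copy_to_buffer(sgl, sg_nents(sgl), buf, len) != len) {
                kfree(buf);
                return NULL;
        }
        return buf;
}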
-
James Bottomley authored
TPM security routines require encryption and decryption with AES in CFB mode, so add it to the Linux Crypto schemes. CFB is basically a one-time pad where the pad is generated initially from the encrypted IV and then subsequently from the encrypted previous block of ciphertext. The pad is XOR'd into the plain text to get the final ciphertext. https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#CFB Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
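A standalone C sketch of the construction described above (illustrative only, not the kernel's actual cfb template; block_encrypt stands in for the underlying cipher's single-block encryption):

#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 16

/* Single-block encryption of the underlying cipher (e.g. AES). */
typedef void (*block_encrypt_t)(const void *key, uint8_t out[BLOCK_SIZE],
                                const uint8_t in[BLOCK_SIZE]);

/* CFB encryption over whole blocks: pad = E(prev); ciphertext = plaintext ^ pad. */
static void cfb_encrypt(block_encrypt_t encrypt, const void *key,
                        const uint8_t iv[BLOCK_SIZE],
                        const uint8_t *pt, uint8_t *ct, size_t blocks)
{
        uint8_t pad[BLOCK_SIZE];
        const uint8_t *prev = iv;               /* first pad comes from the IV */
        size_t i;

        while (blocks--) {
                encrypt(key, pad, prev);        /* generate the pad */
                for (i = 0; i < BLOCK_SIZE; i++)
                        ct[i] = pt[i] ^ pad[i]; /* XOR pad into the plaintext */
                prev = ct;                      /* next pad: this ciphertext block */
                pt += BLOCK_SIZE;
                ct += BLOCK_SIZE;
        }
}

Decryption generates the same pad sequence and XORs it into the ciphertext, which is why only the cipher's encryption direction is ever needed.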
-
Krzysztof Kozlowski authored
Improve the code (safety and readability) by indicating that data passed through pointer is not modified. This adds const keyword in many places, most notably: - the driver data (pointer to struct samsung_aes_variant), - scatterlist addresses written as value to device registers, - key and IV arrays. Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Krzysztof Kozlowski authored
ahash_request 'req' argument passed by the caller s5p_hash_handle_queue() cannot be NULL here because it is obtained from non-NULL pointer via container_of(). This fixes smatch warning: drivers/crypto/s5p-sss.c:1213 s5p_hash_prepare_request() warn: variable dereferenced before check 'req' (see line 1208) Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Krzysztof Kozlowski authored
Commit 8043bb1a ("crypto: omap-sham - convert driver logic to use sgs for data xmit") removed the if() clause, leaving the statement as is. The intention in that case was to always finish the request, so the goto instruction seems sensible. Remove the indentation to fix the Smatch warning: drivers/crypto/omap-sham.c:1761 omap_sham_done_task() warn: inconsistent indenting Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Tero Kristo <t-kristo@ti.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Krzysztof Kozlowski authored
ahash_request 'req' argument passed by the caller omap_sham_handle_queue() cannot be NULL here because it is obtained from non-NULL pointer via container_of(). This fixes smatch warning: drivers/crypto/omap-sham.c:812 omap_sham_prepare_request() warn: variable dereferenced before check 'req' (see line 805) Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Acked-by: Tero Kristo <t-kristo@ti.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Atul Gupta authored
The Inline IPSec driver does not offload csum. Signed-off-by: Atul Gupta <atul.gupta@chelsio.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Gregory CLEMENT authored
On Armada 7K/8K we need to explicitly enable the register clock. This clock is optional because not all the SoCs using this IP need it, but at least for Armada 7K/8K it is actually mandatory. The binding documentation is updated accordingly. Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
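A hedged sketch of the usual optional-clock pattern in a probe path (the clock name "reg" and the surrounding structure are assumptions for illustration): a missing clock is tolerated, while any other error, including probe deferral, is propagated.

#include <linux/clk.h>
#include <linux/device.h>
#include <linux/err.h>

static int example_enable_reg_clk(struct device *dev, struct clk **reg_clk)
{
        *reg_clk = devm_clk_get(dev, "reg");
        if (IS_ERR(*reg_clk)) {
                /* The clock is optional on most SoCs: only a missing
                 * clock is tolerated; other errors are propagated. */
                if (PTR_ERR(*reg_clk) != -ENOENT)
                        return PTR_ERR(*reg_clk);
                *reg_clk = NULL;
        }

        return clk_prepare_enable(*reg_clk);    /* a NULL clk is a no-op */
}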
-
Gregory CLEMENT authored
clk_disable_unprepare() already checks that the clock pointer is valid. No need to test it before calling it. Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Tero Kristo authored
Crypto driver queue size can now be configured from userspace. This allows optimizing the queue usage based on use case. Default queue size is still 10 entries. Signed-off-by: Tero Kristo <t-kristo@ti.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
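A hedged sketch of one way such a knob can be exposed to userspace, via a sysfs device attribute (the attribute name and storage are illustrative; the driver's actual mechanism may differ):

#include <linux/device.h>
#include <linux/kernel.h>
#include <linux/sysfs.h>

static unsigned int queue_len = 10;     /* default queue size */

static ssize_t queue_len_show(struct device *dev,
                              struct device_attribute *attr, char *buf)
{
        return sprintf(buf, "%u\n", queue_len);
}

static ssize_t queue_len_store(struct device *dev,
                               struct device_attribute *attr,
                               const char *buf, size_t count)
{
        unsigned int val;

        if (kstrtouint(buf, 0, &val) || !val)
                return -EINVAL;
        queue_len = val;
        return count;
}
static DEVICE_ATTR_RW(queue_len);       /* register with device_create_file() */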
-
Tero Kristo authored
Crypto driver fallback size can now be configured from userspace. This allows optimizing the DMA usage based on use case. Default fallback size of 200 is still used. Signed-off-by: Tero Kristo <t-kristo@ti.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Tero Kristo authored
Crypto driver queue size can now be configured from userspace. This allows optimizing the queue usage based on use case. Default queue size is still 10 entries. Signed-off-by: Tero Kristo <t-kristo@ti.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Tero Kristo authored
Crypto driver fallback size can now be configured from userspace. This allows optimizing the DMA usage based on use case. Default fallback size of 256 is still used. Signed-off-by: Tero Kristo <t-kristo@ti.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Tero Kristo authored
On certain platforms like DRA7xx, having memory > 2GB with LPAE enabled imposes a constraint that DMA can only be done within the initial 2GB, which is marked as ZONE_DMA. But openssl, when used with cryptodev, does not make sure that the input buffer is DMA capable. So add a check to verify that the input buffer is capable of DMA. Signed-off-by: Tero Kristo <t-kristo@ti.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
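An illustrative sketch of such a check (not the driver's exact implementation): walk the scatterlist and verify that every segment lies below the device's DMA mask; otherwise fall back to a copy path.

#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>

static bool buf_is_dma_capable(struct device *dev, struct scatterlist *sgl)
{
        u64 mask = dma_get_mask(dev);
        struct scatterlist *sg;
        int i;

        for_each_sg(sgl, sg, sg_nents(sgl), i) {
                u64 end = sg_phys(sg) + sg->length - 1;

                if (end & ~mask)
                        return false;   /* not reachable by the DMA engine */
        }
        return true;
}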
-
Tero Kristo authored
On certain platforms like DRA7xx, having memory > 2GB with LPAE enabled imposes a constraint that DMA can only be done within the initial 2GB, which is marked as ZONE_DMA. But openssl, when used with cryptodev, does not make sure that the input buffer is DMA capable. So add a check to verify that the input buffer is capable of DMA. Signed-off-by: Tero Kristo <t-kristo@ti.com> Reported-by: Aparna Balasubramanian <aparnab@ti.com> Reviewed-by: Lokesh Vutla <lokeshvutla@ti.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-