- 14 Jan, 2021 36 commits
-
-
Ard Biesheuvel authored
Replace the glue helper dependency with implementations of ECB and CBC based on the new CPP macros, which avoid the need for indirect calls.

Acked-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Replace the glue helper dependency with implementations of ECB and CBC based on the new CPP macros, which avoid the need for indirect calls.

Acked-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Replace the glue helper dependency with implementations of ECB and CBC based on the new CPP macros, which avoid the need for indirect calls.

Acked-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Replace the glue helper dependency with implementations of ECB and CBC based on the new CPP macros, which avoid the need for indirect calls.

Acked-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Replace the glue helper dependency with implementations of ECB and CBC based on the new CPP macros, which avoid the need for indirect calls.

Acked-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
The x86 glue helper module is starting to show its age:

- It relies heavily on function pointers to invoke asm helper functions that operate on fixed input sizes that are relatively small. This means the performance is severely impacted by retpolines.

- It goes to great lengths to amortize the cost of kernel_fpu_begin()/end() over as much work as possible, which is no longer necessary now that FPU save/restore is done lazily, and doing so may cause unbounded scheduling blackouts due to the fact that enabling the FPU in kernel mode disables preemption.

- The CBC mode decryption helper makes backward strides through the input, in order to avoid a single block size memcpy() between chunks. Consuming the input in this manner is highly likely to defeat any hardware prefetchers, so it is better to go through the data linearly, and perform the extra memcpy() where needed (which is turned into direct loads and stores by the compiler anyway). Note that benchmarks won't show this effect, given that the memory they use is always cache hot.

- It implements blockwise XOR in terms of le128 pointers, which imply an alignment that is not guaranteed by the API, violating the C standard.

GCC does not seem to be smart enough to elide the indirect calls when the function pointers are passed as arguments to static inline helper routines modeled after the existing ones. So instead, let's create some CPP macros that encapsulate the core of the ECB and CBC processing, so we can wire them up for existing users of the glue helper module, i.e., Camellia, Serpent, Twofish and CAST6.

Acked-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
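For context, a minimal sketch of the macro approach, assuming local variables nbytes, src, dst and ctx from an enclosing skcipher walk loop (the macro and helper names below are illustrative, and the in-tree macros are more elaborate):

```c
/*
 * Because the macro body expands at the call site, 'fn' is a direct
 * call to the asm helper -- no function pointer, hence no retpoline.
 */
#define ECB_PROCESS(blocks, bsize, fn) do {				\
	while (nbytes >= (blocks) * (bsize)) {				\
		(fn)(ctx, dst, src);		/* direct call */	\
		src    += (blocks) * (bsize);				\
		dst    += (blocks) * (bsize);				\
		nbytes -= (blocks) * (bsize);				\
	}								\
} while (0)

/* Wiring it up for a cipher with an 8-way asm helper plus a 1-way
 * fallback (helper names are hypothetical): */
ECB_PROCESS(8, SERPENT_BLOCK_SIZE, serpent_ecb_enc_8way);
ECB_PROCESS(1, SERPENT_BLOCK_SIZE, __serpent_encrypt);
```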
-
Ard Biesheuvel authored
Blowfish in counter mode is never used in the kernel, so there is no point in keeping an accelerated implementation around.

Acked-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
DES or Triple DES in counter mode is never used in the kernel, so there is no point in keeping an accelerated implementation around.

Acked-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
The glue helper's CTR routines are no longer used, so drop them.

Acked-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Twofish in CTR mode is never used by the kernel directly, and is highly unlikely to be relied upon by dm-crypt or algif_skcipher. So let's drop the accelerated CTR mode implementation, and instead, rely on the CTR template and the bare cipher.

Acked-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
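After a change like this, a request for "ctr(twofish)" is satisfied by the generic CTR template wrapped around the bare (possibly still accelerated) cipher. A minimal allocation sketch, assuming process context (the function name is ours):

```c
#include <crypto/skcipher.h>
#include <linux/err.h>

static int demo_ctr_twofish(void)
{
	/* With no dedicated ctr-twofish driver registered, the crypto API
	 * instantiates the CTR template around the twofish cipher. */
	struct crypto_skcipher *tfm =
		crypto_alloc_skcipher("ctr(twofish)", 0, 0);

	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	crypto_free_skcipher(tfm);
	return 0;
}
```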
-
Ard Biesheuvel authored
CAST6 in CTR mode is never used by the kernel directly, and is highly unlikely to be relied upon by dm-crypt or algif_skcipher. So let's drop the accelerated CTR mode implementation, and instead, rely on the CTR template and the bare cipher.

Acked-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
CAST5 in CTR mode is never used by the kernel directly, and is highly unlikely to be relied upon by dm-crypt or algif_skcipher. So let's drop the accelerated CTR mode implementation, and instead, rely on the CTR template and the bare cipher.

Acked-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Serpent in CTR mode is never used by the kernel directly, and is highly unlikely to be relied upon by dm-crypt or algif_skcipher. So let's drop the accelerated CTR mode implementation, and instead, rely on the CTR template and the bare cipher.

Acked-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Camellia in CTR mode is never used by the kernel directly, and is highly unlikely to be relied upon by dm-crypt or algif_skcipher. So let's drop the accelerated CTR mode implementation, and instead, rely on the CTR template and the bare cipher.

Acked-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
The glue helper's XTS routines are no longer used, so drop them.

Acked-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Now that the XTS template can wrap accelerated ECB modes, it can be used to implement Twofish in XTS mode as well, which turns out to be at least as fast, and sometimes even faster.

Acked-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
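In practice this means "xts(twofish)" is now instantiated by the XTS template over the fastest available ECB implementation; a hedged allocation sketch (the function name is ours):

```c
#include <crypto/skcipher.h>
#include <linux/err.h>

static int demo_xts_twofish(void)
{
	/* The XTS template handles the tweak and ciphertext stealing
	 * itself, delegating the bulk work to whichever accelerated ECB
	 * driver the crypto API selects. */
	struct crypto_skcipher *tfm =
		crypto_alloc_skcipher("xts(twofish)", 0, 0);

	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	crypto_free_skcipher(tfm);
	return 0;
}
```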
-
Ard Biesheuvel authored
Now that the XTS template can wrap accelerated ECB modes, it can be used to implement Serpent in XTS mode as well, which turns out to be at least as fast, and sometimes even faster.

Acked-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Now that the XTS template can wrap accelerated ECB modes, it can be used to implement CAST6 in XTS mode as well, which turns out to be at least as fast, and sometimes even faster.

Acked-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Now that the XTS template can wrap accelerated ECB modes, it can be used to implement Camellia in XTS mode as well, which turns out to be at least as fast, and sometimes even faster.

Acked-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Bhaskar Chowdhury authored
s/fautly/faulty/p

Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Kai Ye authored
Register the SEC device with the uacce framework for user space access.

Signed-off-by: Kai Ye <yekai13@huawei.com>
Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com>
Reviewed-by: Zaibo Xu <xuzaibo@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Kai Ye authored
Register the HPRE device with the uacce framework for user space access.

Signed-off-by: Kai Ye <yekai13@huawei.com>
Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com>
Reviewed-by: Zaibo Xu <xuzaibo@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Kai Ye authored
Add a 'uacce_mode' parameter for ZIP, which can be set to 0 (default) or 1: '0' means ZIP is registered only to the kernel crypto subsystem, while '1' means it is registered to both kernel crypto and UACCE.

Signed-off-by: Kai Ye <yekai13@huawei.com>
Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com>
Reviewed-by: Zaibo Xu <xuzaibo@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
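For illustration, a minimal sketch of how such a module parameter is typically declared (the in-tree driver validates the value through a custom kernel_param_ops handler, so this plain declaration is an assumption):

```c
#include <linux/module.h>
#include <linux/moduleparam.h>

/* 0: register to kernel crypto only; 1: register to both kernel
 * crypto and UACCE. Read-only via sysfs; illustrative only. */
static unsigned int uacce_mode;
module_param(uacce_mode, uint, 0444);
MODULE_PARM_DESC(uacce_mode, "0 (default): crypto only; 1: crypto and UACCE");
```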
-
Kai Ye authored
Kunpeng920 SEC/HPRE/ZIP cannot support running user space SVA and kernel Crypto at the same time. Therefore, the algorithms should not be registered to Crypto when user space SVA is enabled.

Signed-off-by: Kai Ye <yekai13@huawei.com>
Reviewed-by: Zaibo Xu <xuzaibo@huawei.com>
Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Jiri Olsa authored
Rename 'struct device_private' to 'struct bcm_device_private', because it clashes with 'struct device_private' from 'drivers/base/base.h'. While it's not a functional problem, it's causing two distinct type hierarchies in BTF data. It also breaks the build with these options:

CONFIG_DEBUG_INFO_BTF=y
CONFIG_CRYPTO_DEV_BCM_SPU=y

as reported by Qais Yousef [1].

[1] https://lore.kernel.org/lkml/20201229151352.6hzmjvu3qh6p2qgg@e107158-lin/

Fixes: 9d12ba86 ("crypto: brcm - Add Broadcom SPU driver")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Adam Guerin authored
Restrict the size of the field to what is required by the operation. This issue was detected by smatch:

drivers/crypto/qat/qat_common/qat_asym_algs.c:328 qat_dh_compute_value() error: dma_map_single_attrs() '&qat_req->in.dh.in.b' too small (8 vs 64)

Signed-off-by: Adam Guerin <adam.guerin@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
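A hedged sketch of the fix pattern the warning points at; only the size argument is the point, and the variable receiving the handle is hypothetical:

```c
/* Map exactly the 8-byte field the DH operation uses, rather than a
 * 64-byte size derived from the surrounding structure. */
dma_addr_t in_handle = dma_map_single(dev, &qat_req->in.dh.in.b,
				      sizeof(qat_req->in.dh.in.b),
				      DMA_TO_DEVICE);
```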
-
Adam Guerin authored
Cast ADF_SIZE_TO_RING_SIZE_IN_BYTES() so it can return a 64 bit value. This issue was detected by smatch:

drivers/crypto/qat/qat_common/adf_transport_debug.c:65 adf_ring_show() warn: should '(1 << (ring->ring_size - 1)) << 7' be a 64 bit type?

Signed-off-by: Adam Guerin <adam.guerin@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
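The before/after shape of such a fix, sketched from the smatch output (the _OLD name is ours for side-by-side comparison; the real patch edits the macro in place):

```c
/* Before: evaluated in 32-bit int arithmetic, so large ring sizes
 * overflow before the result is widened to 64 bits. */
#define ADF_SIZE_TO_RING_SIZE_IN_BYTES_OLD(SIZE) ((1 << ((SIZE) - 1)) << 7)

/* After: 1ULL forces the whole expression to 64-bit arithmetic. */
#define ADF_SIZE_TO_RING_SIZE_IN_BYTES(SIZE)     ((1ULL << ((SIZE) - 1)) << 7)
```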
-
Adam Guerin authored
Sanitize the ring_num value coming from configuration (and potentially from user space) before it is used as an index in the banks array. This issue was detected by smatch:

drivers/crypto/qat/qat_common/adf_transport.c:233 adf_create_ring() warn: potential spectre issue 'bank->rings' [r] (local cap)

Signed-off-by: Adam Guerin <adam.guerin@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
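The standard mitigation for this class of warning is array_index_nospec(); a sketch under the assumption that num_rings_per_bank is the bound (variable names follow the report):

```c
#include <linux/nospec.h>

/* Clamp ring_num under speculation so a mispredicted bounds check
 * cannot leak data via a speculative out-of-bounds read of
 * bank->rings[]. */
ring_num = array_index_nospec(ring_num, num_rings_per_bank);
ring = &bank->rings[ring_num];
```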
-
Wojciech Ziemba authored
The hardware specific function adf_get_arbiter_mapping() modifies the static array thrd_to_arb_map to disable mappings for AEs that are disabled. This static array is used for each device of the same type. If the AE mask is not identical for all devices of the same type, then the arbiter mapping returned by adf_get_arbiter_mapping() may be wrong.

This patch fixes the problem by ensuring the static arbiter mapping is unchanged, and the device arbiter mapping is re-calculated each time based on the static mapping.

Signed-off-by: Wojciech Ziemba <wojciech.ziemba@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Replace the function pointers in the GCM implementation with static branches, which are based on code patching that occurs only at module load time. This avoids the severe performance penalty caused by the use of retpolines.

In order to retain the ability to switch between different versions of the implementation based on the input size on cores that support AVX and AVX2, use static branches instead of static calls.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
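A minimal sketch of the static-branch pattern; the key name and the helpers (gcm_enc_avx2, gcm_enc_generic) are hypothetical, and the real driver defines several keys and also dispatches on request size:

```c
#include <linux/jump_label.h>
#include <linux/types.h>
#include <asm/cpufeature.h>

/* Hypothetical key; patched exactly once, at module load time. */
static DEFINE_STATIC_KEY_FALSE(gcm_use_avx2);

static int __init gcm_demo_init(void)
{
	if (boot_cpu_has(X86_FEATURE_AVX2))
		static_branch_enable(&gcm_use_avx2);
	return 0;
}

static void gcm_demo_crypt(void *ctx, u8 *dst, const u8 *src, unsigned int len)
{
	/* Compiles to a patched jmp/nop: a direct call either way, so no
	 * retpoline is involved. */
	if (static_branch_likely(&gcm_use_avx2))
		gcm_enc_avx2(ctx, dst, src, len);	/* hypothetical */
	else
		gcm_enc_generic(ctx, dst, src, len);	/* hypothetical */
}
```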
-
Ard Biesheuvel authored
Currently, the gcm(aes-ni) driver open codes the scatterlist handling that is encapsulated by the skcipher walk API. So let's switch to that instead.

Also, move the handling at the end of gcmaes_crypt_by_sg() that is dependent on whether we are encrypting or decrypting into the callers, which always do one or the other.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
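For reference, the general shape of a skcipher walk loop, simplified (gcm(aes-ni) uses the AEAD variants, skcipher_walk_aead_encrypt()/_decrypt(), with the same structure; the function name is ours):

```c
#include <crypto/internal/skcipher.h>

static int demo_walk(struct skcipher_request *req)
{
	struct skcipher_walk walk;
	int err = skcipher_walk_virt(&walk, req, false);

	while (walk.nbytes) {
		/* Process walk.nbytes bytes from walk.src.virt.addr into
		 * walk.dst.virt.addr; the walk maps and unmaps each
		 * scatterlist segment for us. */
		err = skcipher_walk_done(&walk, 0);
	}
	return err;
}
```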
-
Ard Biesheuvel authored
The gcm(aes-ni) driver is only built for x86_64, which does not make use of highmem. So testing for PageHighMem is pointless and can be omitted.

While at it, replace GFP_ATOMIC with the appropriate runtime decided value based on the context.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
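The GFP flag change follows the usual crypto-API idiom of deriving allocation flags from the request flags; a minimal sketch (the function name is ours):

```c
#include <crypto/aead.h>
#include <linux/slab.h>

static void *demo_alloc(struct aead_request *req, size_t len)
{
	/* Sleepable allocations are only safe when the caller set
	 * CRYPTO_TFM_REQ_MAY_SLEEP on the request. */
	gfp_t flags = (req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP) ?
		      GFP_KERNEL : GFP_ATOMIC;

	return kmalloc(len, flags);
}
```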
-
Ard Biesheuvel authored
Drop some prototypes that are declared but never called.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
The GCM mode driver uses 16 byte aligned buffers on the stack to pass the IV to the asm helpers, but unfortunately, the x86 port does not guarantee that the stack pointer is 16 byte aligned upon entry in the first place. Since the compiler is not aware of this, it will not emit the additional stack realignment sequence that is needed, and so the alignment is not guaranteed to be more than 8 bytes.

So instead, allocate some padding on the stack, and realign the IV pointer by hand.

Cc: <stable@vger.kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
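The fix pattern, sketched; buffer sizing mirrors the description (8-byte guaranteed stack alignment, 16-byte requirement), and the function name is ours:

```c
#include <linux/kernel.h>	/* PTR_ALIGN() */
#include <linux/string.h>
#include <linux/types.h>

#define AESNI_ALIGN	16

static void demo_iv_realign(const u8 *req_iv)
{
	/* Only 8-byte stack alignment is guaranteed, so over-allocate by
	 * the worst-case slack and realign the pointer by hand. */
	u8 ivbuf[16 + (AESNI_ALIGN - 8)] __aligned(8);
	u8 *iv = PTR_ALIGN(&ivbuf[0], AESNI_ALIGN);

	memcpy(iv, req_iv, 16);	/* iv is now 16-byte aligned */
}
```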
-
Marco Chiappero authored
Use CRYPTO_LIB_AES in place of CRYPTO_AES in the dependencies for the QAT common code.

Fixes: c0e583ab ("crypto: qat - add CRYPTO_AES to Kconfig dependencies")
Reported-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Marco Chiappero <marco.chiappero@intel.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Herbert Xu authored
This patch changes the cast in stm32_cryp_check_ctr_counter from u32 to __be32 to match the prototype of stm32_cryp_hw_write_iv correctly.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- 08 Jan, 2021 3 commits
-
-
Herbert Xu authored
This patch moves the extern algorithm declarations into a header file so that a number of compiler warnings are silenced.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
The AES-NI driver implements XTS via the glue helper, which consumes a struct with sets of function pointers that are invoked on chunks of input data of the appropriate size, as annotated in the struct. Let's get rid of this indirection, so that we can perform direct calls to the assembler helpers. Instead, let's adopt the arm64 strategy, i.e., provide a helper which can consume inputs of any size, provided that the penultimate, full block is passed via the last call if ciphertext stealing needs to be applied.

This also allows us to enable the XTS mode for i386.

Tested-by: Eric Biggers <ebiggers@google.com> # x86_64
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
The XTS asm helper arrangement is a bit odd: the 8-way stride helper consists of back-to-back calls to the 4-way core transforms, which are called indirectly, based on a boolean that indicates whether we are performing encryption or decryption.

Given how costly indirect calls are on x86, let's switch to direct calls, and given how the 8-way stride doesn't really add anything substantial, use a 4-way stride instead, and make the asm core routine deal with any multiple of 4 blocks. Since 512 byte sectors or 4 KB blocks are the typical quantities XTS operates on, increase the stride exported to the glue helper to 512 bytes as well.

As a result, the number of indirect calls is reduced from 3 per 64 bytes of in/output to 1 per 512 bytes of in/output, which produces a 65% speedup when operating on 1 KB blocks (measured on an Intel(R) Core(TM) i7-8650U CPU).

Fixes: 9697fa39 ("x86/retpoline/crypto: Convert crypto assembler indirect jumps")
Tested-by: Eric Biggers <ebiggers@google.com> # x86_64
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- 02 Jan, 2021 1 commit
-
-
Rob Herring authored
PicoXcell has had nothing but treewide cleanups for at least the last 8 years and no signs of activity. The most recent activity is a yocto vendor kernel based on v3.0 in 2015.

Cc: Jamie Iles <jamie@jamieiles.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-crypto@vger.kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-