Commit cb906953 authored by Linus Torvalds

Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto update from Herbert Xu:
 "Here is the crypto update for 4.1:

  New interfaces:
   - user-space interface for AEAD
   - user-space interface for RNG (i.e., pseudo RNG)

  New hashes:
   - ARMv8 SHA1/256
   - ARMv8 AES
   - ARMv8 GHASH
   - ARM assembler and NEON SHA256
   - MIPS OCTEON SHA1/256/512
   - MIPS img-hash SHA1/256 and MD5
   - Power 8 VMX AES/CBC/CTR/GHASH
   - PPC assembler AES, SHA1/256 and MD5
   - Broadcom IPROC RNG driver

  Cleanups/fixes:
   - prevent internal helper algos from being exposed to user-space
   - merge common code from assembly/C SHA implementations
   - misc fixes"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (169 commits)
  crypto: arm - workaround for building with old binutils
  crypto: arm/sha256 - avoid sha256 code on ARMv7-M
  crypto: x86/sha512_ssse3 - move SHA-384/512 SSSE3 implementation to base layer
  crypto: x86/sha256_ssse3 - move SHA-224/256 SSSE3 implementation to base layer
  crypto: x86/sha1_ssse3 - move SHA-1 SSSE3 implementation to base layer
  crypto: arm64/sha2-ce - move SHA-224/256 ARMv8 implementation to base layer
  crypto: arm64/sha1-ce - move SHA-1 ARMv8 implementation to base layer
  crypto: arm/sha2-ce - move SHA-224/256 ARMv8 implementation to base layer
  crypto: arm/sha256 - move SHA-224/256 ASM/NEON implementation to base layer
  crypto: arm/sha1-ce - move SHA-1 ARMv8 implementation to base layer
  crypto: arm/sha1_neon - move SHA-1 NEON implementation to base layer
  crypto: arm/sha1 - move SHA-1 ARM asm implementation to base layer
  crypto: sha512-generic - move to generic glue implementation
  crypto: sha256-generic - move to generic glue implementation
  crypto: sha1-generic - move to generic glue implementation
  crypto: sha512 - implement base layer for SHA-512
  crypto: sha256 - implement base layer for SHA-256
  crypto: sha1 - implement base layer for SHA-1
  crypto: api - remove instance when test failed
  crypto: api - Move alg ref count init to crypto_check_alg
  ...
parents 6c373ca8 3abafaf2
Introduction
============
The concepts of the kernel crypto API visible to kernel space are fully
applicable to the user space interface as well. Therefore, the high-level
discussion of the kernel crypto API for the in-kernel use cases applies here
as well.
The major difference, however, is that user space can only act as a consumer
and never as a provider of a transformation or cipher algorithm.
The following covers the user space interface exported by the kernel crypto
API. A working example of this description is libkcapi that can be obtained from
[1]. That library can be used by user space applications that require
cryptographic services from the kernel.
Some details of the in-kernel crypto API do not apply to user space, however.
This includes the difference between synchronous and asynchronous
invocations. The user space API call is fully synchronous.
In addition, only a subset of all cipher types is available, as documented
below.
User space API general remarks
==============================
The kernel crypto API is accessible from user space. Currently, the following
ciphers are accessible:
* Message digest including keyed message digest (HMAC, CMAC)
* Symmetric ciphers
Note, AEAD ciphers are currently not supported via the symmetric cipher
interface.
The interface is provided via a socket of type AF_ALG. In addition, the
setsockopt option level is SOL_ALG. In case the user space header files do not
export these constants yet, use the following macros:
#ifndef AF_ALG
#define AF_ALG 38
#endif
#ifndef SOL_ALG
#define SOL_ALG 279
#endif
A cipher is accessed with the same name as done for the in-kernel API calls.
This includes the generic vs. unique naming schema for ciphers as well as the
enforcement of priorities for generic names.
To interact with the kernel crypto API, an AF_ALG socket must be created by
the user space application. User space invokes the cipher operation with the
send/write system call family. The result of the cipher operation is obtained
with the read/recv system call family.
The following API calls assume that the socket descriptor is already
opened by the user space application and discuss only the kernel crypto API
specific invocations.
To initialize the socket interface, the following sequence has to be performed
by the consumer:
1. Create a socket of type AF_ALG.
2. Invoke bind with the socket descriptor and the struct sockaddr_alg
parameter specified below for the different cipher types.
3. Invoke accept with the socket descriptor. The accept system call
returns a new file descriptor that is to be used to interact with
the particular cipher instance. When invoking send/write or recv/read
system calls to send data to the kernel or obtain data from the
kernel, the file descriptor returned by accept must be used.
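The three steps above can be sketched in C as follows. This is only a minimal sketch and is not taken from libkcapi: the helper names alg_fill_sockaddr and alg_open are hypothetical, and SOCK_SEQPACKET is assumed as the socket type for AF_ALG sockets.

```c
#include <assert.h>
#include <linux/if_alg.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef AF_ALG
#define AF_ALG 38
#endif

/* Hypothetical helper: fill a struct sockaddr_alg for bind().
 * Returns 0 on success, -1 if either string does not fit. */
int alg_fill_sockaddr(struct sockaddr_alg *sa, const char *type,
                      const char *name)
{
    assert(sa && type && name);
    if (strlen(type) >= sizeof(sa->salg_type) ||
        strlen(name) >= sizeof(sa->salg_name))
        return -1;
    memset(sa, 0, sizeof(*sa));
    sa->salg_family = AF_ALG;
    strcpy((char *)sa->salg_type, type);
    strcpy((char *)sa->salg_name, name);
    return 0;
}

/* Hypothetical helper: socket(), bind(), accept().
 * On success, returns the descriptor obtained from accept() and stores
 * the bound socket in *tfmfd. On failure, returns -1. */
int alg_open(const char *type, const char *name, int *tfmfd)
{
    struct sockaddr_alg sa;
    int opfd;

    if (alg_fill_sockaddr(&sa, type, name) < 0)
        return -1;
    *tfmfd = socket(AF_ALG, SOCK_SEQPACKET, 0);      /* step 1 */
    if (*tfmfd < 0)
        return -1;
    if (bind(*tfmfd, (struct sockaddr *)&sa, sizeof(sa)) < 0 ||  /* step 2 */
        (opfd = accept(*tfmfd, NULL, NULL)) < 0) {               /* step 3 */
        close(*tfmfd);
        *tfmfd = -1;
        return -1;
    }
    return opfd;
}
```

A caller would use the returned descriptor for send/recv and close both descriptors when done. alg_open returns -1 on kernels that lack AF_ALG or the requested algorithm.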
In-place cipher operation
=========================
Just like the in-kernel operation of the kernel crypto API, the user space
interface allows the cipher operation in-place. That means that the input buffer
used for the send/write system call and the output buffer used by the read/recv
system call may be one and the same. This is of particular interest for
symmetric cipher operations where a copying of the output data to its final
destination can be avoided.
If a consumer on the other hand wants to maintain the plaintext and the
ciphertext in different memory locations, all a consumer needs to do is to
provide different memory pointers for the encryption and decryption operation.
Message digest API
==================
The message digest type to be used for the cipher operation is selected when
invoking the bind syscall. bind requires the caller to provide a filled
struct sockaddr data structure. This data structure must be filled as follows:
struct sockaddr_alg sa = {
	.salg_family = AF_ALG,
	.salg_type = "hash", /* this selects the hash logic in the kernel */
	.salg_name = "sha1" /* this is the cipher name */
};
The salg_type value "hash" applies to message digests and keyed message digests.
A keyed message digest, however, is selected with the appropriate salg_name
(for example "hmac(sha1)"). Please see below for the setsockopt interface that
explains how the key can be set for a keyed message digest.
Using the send() system call, the application provides the data that should be
processed with the message digest. The send system call allows the following
flags to be specified:
* MSG_MORE: If this flag is set, the send system call acts like a
message digest update function where the final hash is not
yet calculated. If the flag is not set, the send system call
calculates the final message digest immediately.
With the recv() system call, the application can read the message digest from
the kernel crypto API. If the buffer is too small for the message digest, the
flag MSG_TRUNC is set by the kernel.
In order to set a message digest key, the calling application must use the
setsockopt() option of ALG_SET_KEY. If the key is not set the HMAC operation is
performed without the initial HMAC state change caused by the key.
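The send/recv flow for a message digest might look like the following. This is a minimal sketch under stated assumptions: the helper name alg_sha1_two_chunks is hypothetical, SOCK_SEQPACKET is assumed as the socket type, and the function returns -1 when the running kernel does not offer the interface.

```c
#include <assert.h>
#include <linux/if_alg.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef AF_ALG
#define AF_ALG 38
#endif

#define SHA1_DIGEST_LEN 20

/* Hypothetical helper: hash two chunks with "sha1" through AF_ALG.
 * Returns 0 on success, -1 if the interface is unavailable or a call fails. */
int alg_sha1_two_chunks(const void *a, size_t alen, const void *b, size_t blen,
                        unsigned char digest[SHA1_DIGEST_LEN])
{
    struct sockaddr_alg sa = {
        .salg_family = AF_ALG,
        .salg_type = "hash",
        .salg_name = "sha1",
    };
    int tfmfd, opfd, ret = -1;

    assert(digest);
    tfmfd = socket(AF_ALG, SOCK_SEQPACKET, 0);
    if (tfmfd < 0)
        return -1;
    if (bind(tfmfd, (struct sockaddr *)&sa, sizeof(sa)) < 0)
        goto out_tfm;
    opfd = accept(tfmfd, NULL, NULL);
    if (opfd < 0)
        goto out_tfm;

    /* MSG_MORE: acts like an update, the final digest is not calculated yet */
    if (send(opfd, a, alen, MSG_MORE) != (ssize_t)alen)
        goto out_op;
    /* no MSG_MORE: this chunk finalizes the digest */
    if (send(opfd, b, blen, 0) != (ssize_t)blen)
        goto out_op;
    /* read the digest back from the kernel */
    if (recv(opfd, digest, SHA1_DIGEST_LEN, 0) != SHA1_DIGEST_LEN)
        goto out_op;
    ret = 0;
out_op:
    close(opfd);
out_tfm:
    close(tfmfd);
    return ret;
}
```

Splitting the input across two send calls only illustrates the MSG_MORE semantics. A single send without the flag would hash the whole message at once.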
Symmetric cipher API
====================
The operation is very similar to the message digest discussion. During
initialization, the struct sockaddr data structure must be filled as follows:
struct sockaddr_alg sa = {
	.salg_family = AF_ALG,
	.salg_type = "skcipher", /* this selects the symmetric cipher */
	.salg_name = "cbc(aes)" /* this is the cipher name */
};
Before data can be sent to the kernel using the write/send system call family,
the consumer must set the key. The key setting is described with the setsockopt
invocation below.
Using the sendmsg() system call, the application provides the data that should
be processed for encryption or decryption. In addition, the IV is specified
with the data structure provided by the sendmsg() system call.
The struct msghdr parameter of the sendmsg system call references one or more
struct cmsghdr control messages. See recv(2) and cmsg(3) for more information
on how the cmsghdr data structure is used together with the send/recv system
call family. The cmsghdr data carries the following information, each
specified with a separate header instance:
* specification of the cipher operation type with one of these flags:
    ALG_OP_ENCRYPT - encryption of data
    ALG_OP_DECRYPT - decryption of data
* specification of the IV information marked with the flag ALG_SET_IV
The send system call family allows the following flag to be specified:
* MSG_MORE: If this flag is set, the send system call acts like a
cipher update function where more input data is expected
with a subsequent invocation of the send system call.
Note: The kernel reports -EINVAL for any unexpected data. The caller must
make sure that all data matches the constraints given in /proc/crypto for the
selected cipher.
With the recv() system call, the application can read the result of the
cipher operation from the kernel crypto API. The output buffer must be large
enough to hold all blocks of the encrypted or decrypted data. If the output
buffer is smaller, only as many blocks are returned as fit into that buffer.
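Building the control messages for sendmsg could be sketched as follows. This is a minimal sketch, assuming a 16-byte IV as used by "cbc(aes)": the helper name alg_prepare_msg and the macros AES_IV_LEN and ALG_CTRL_LEN are hypothetical, while struct af_alg_iv, ALG_SET_OP and ALG_SET_IV come from linux/if_alg.h.

```c
#include <assert.h>
#include <linux/if_alg.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

#ifndef SOL_ALG
#define SOL_ALG 279
#endif

#define AES_IV_LEN 16
/* room for one ALG_SET_OP and one ALG_SET_IV control message */
#define ALG_CTRL_LEN (CMSG_SPACE(sizeof(uint32_t)) + \
                      CMSG_SPACE(sizeof(struct af_alg_iv) + AES_IV_LEN))

/* Hypothetical helper: attach the operation type and the IV to msg.
 * ctrl must point to ALG_CTRL_LEN suitably aligned bytes that stay
 * valid until sendmsg() has returned. */
void alg_prepare_msg(struct msghdr *msg, void *ctrl, uint32_t op,
                     const unsigned char iv[AES_IV_LEN])
{
    struct cmsghdr *cmsg;
    struct af_alg_iv *alg_iv;

    assert(msg && ctrl && iv);
    memset(ctrl, 0, ALG_CTRL_LEN);
    msg->msg_control = ctrl;
    msg->msg_controllen = ALG_CTRL_LEN;

    /* first header: ALG_OP_ENCRYPT or ALG_OP_DECRYPT */
    cmsg = CMSG_FIRSTHDR(msg);
    cmsg->cmsg_level = SOL_ALG;
    cmsg->cmsg_type = ALG_SET_OP;
    cmsg->cmsg_len = CMSG_LEN(sizeof(uint32_t));
    memcpy(CMSG_DATA(cmsg), &op, sizeof(op));

    /* second header: the IV, prefixed by its length */
    cmsg = CMSG_NXTHDR(msg, cmsg);
    cmsg->cmsg_level = SOL_ALG;
    cmsg->cmsg_type = ALG_SET_IV;
    cmsg->cmsg_len = CMSG_LEN(sizeof(struct af_alg_iv) + AES_IV_LEN);
    alg_iv = (struct af_alg_iv *)CMSG_DATA(cmsg);
    alg_iv->ivlen = AES_IV_LEN;
    memcpy(alg_iv->iv, iv, AES_IV_LEN);
}
```

The caller would then point msg_iov at the data to process and pass the message to sendmsg on the descriptor returned by accept, followed by recv to collect the result.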
Setsockopt interface
====================
In addition to the read/recv and send/write system call handling to send and
retrieve data subject to the cipher operation, a consumer also needs to set
the additional information for the cipher operation. This additional information
is set using the setsockopt system call that must be invoked with the file
descriptor of the socket bound to the cipher (i.e. the descriptor passed to
the bind system call, not the one returned by accept).
Each setsockopt invocation must use the level SOL_ALG.
The setsockopt interface allows setting the following data using the mentioned
optname:
* ALG_SET_KEY -- Setting the key. Key setting is applicable to:
- the skcipher cipher type (symmetric ciphers)
- the hash cipher type (keyed message digests)
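Setting a key could be sketched as follows. This is a minimal sketch: the helper name alg_set_key is hypothetical, and it assumes the descriptor is the bound AF_ALG socket, which as far as we can tell is the one on which the kernel accepts ALG_SET_KEY.

```c
#include <assert.h>
#include <linux/if_alg.h>
#include <stddef.h>
#include <sys/socket.h>

#ifndef SOL_ALG
#define SOL_ALG 279
#endif

/* Hypothetical helper: set the key for a keyed message digest or a
 * symmetric cipher. Assumption: tfmfd is the bound AF_ALG socket.
 * Returns 0 on success, -1 with errno set on failure. */
int alg_set_key(int tfmfd, const void *key, size_t keylen)
{
    assert(key || keylen == 0);
    return setsockopt(tfmfd, SOL_ALG, ALG_SET_KEY, key, (socklen_t)keylen);
}
```

For "cbc(aes)" the key length would be 16, 24 or 32 bytes. An unsuitable key is reported through the return value of setsockopt.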
User space API example
======================
Please see [1] for libkcapi which provides an easy-to-use wrapper around the
aforementioned AF_ALG socket interface. [1] also contains a test application
that invokes all libkcapi API calls.
[1] http://www.chronox.de/libkcapi.html
Author
======
Stephan Mueller <smueller@chronox.de>
Imagination Technologies hardware hash accelerator
The hash accelerator provides hardware hashing acceleration for
SHA1, SHA224, SHA256 and MD5 hashes
Required properties:
- compatible : "img,hash-accelerator"
- reg : Offset and length of the register set for the module, and the DMA port
- interrupts : The designated IRQ line for the hashing module.
- dmas : DMA specifier as per Documentation/devicetree/bindings/dma/dma.txt
- dma-names : Should be "tx"
- clocks : Clock specifiers
- clock-names : "sys" Used to clock the hash block registers
"hash" Used to clock data through the accelerator
Example:
hash: hash@18149600 {
	compatible = "img,hash-accelerator";
	reg = <0x18149600 0x100>, <0x18101100 0x4>;
	interrupts = <GIC_SHARED 59 IRQ_TYPE_LEVEL_HIGH>;
	dmas = <&dma 8 0xffffffff 0>;
	dma-names = "tx";
	clocks = <&cr_periph SYS_CLK_HASH>, <&clk_periph PERIPH_CLK_ROM>;
	clock-names = "sys", "hash";
};
HWRNG support for the iproc-rng200 driver
Required properties:
- compatible : "brcm,iproc-rng200"
- reg : base address and size of control register block
Example:
rng {
	compatible = "brcm,iproc-rng200";
	reg = <0x18032000 0x28>;
};
......@@ -2825,6 +2825,7 @@ L: linux-crypto@vger.kernel.org
T: git git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6.git
S: Maintained
F: Documentation/crypto/
F: Documentation/DocBook/crypto-API.tmpl
F: arch/*/crypto/
F: crypto/
F: drivers/crypto/
......
......@@ -2175,6 +2175,9 @@ source "arch/arm/Kconfig.debug"
source "security/Kconfig"
source "crypto/Kconfig"
if CRYPTO
source "arch/arm/crypto/Kconfig"
endif
source "lib/Kconfig"
......
menuconfig ARM_CRYPTO
bool "ARM Accelerated Cryptographic Algorithms"
depends on ARM
help
Say Y here to choose from a selection of cryptographic algorithms
implemented using ARM specific CPU features or instructions.
if ARM_CRYPTO
config CRYPTO_SHA1_ARM
tristate "SHA1 digest algorithm (ARM-asm)"
select CRYPTO_SHA1
select CRYPTO_HASH
help
SHA-1 secure hash standard (FIPS 180-1/DFIPS 180-2) implemented
using optimized ARM assembler.
config CRYPTO_SHA1_ARM_NEON
tristate "SHA1 digest algorithm (ARM NEON)"
depends on KERNEL_MODE_NEON
select CRYPTO_SHA1_ARM
select CRYPTO_SHA1
select CRYPTO_HASH
help
SHA-1 secure hash standard (FIPS 180-1/DFIPS 180-2) implemented
using optimized ARM NEON assembly, when NEON instructions are
available.
config CRYPTO_SHA1_ARM_CE
tristate "SHA1 digest algorithm (ARM v8 Crypto Extensions)"
depends on KERNEL_MODE_NEON
select CRYPTO_SHA1_ARM
select CRYPTO_HASH
help
SHA-1 secure hash standard (FIPS 180-1/DFIPS 180-2) implemented
using special ARMv8 Crypto Extensions.
config CRYPTO_SHA2_ARM_CE
tristate "SHA-224/256 digest algorithm (ARM v8 Crypto Extensions)"
depends on KERNEL_MODE_NEON
select CRYPTO_SHA256_ARM
select CRYPTO_HASH
help
SHA-256 secure hash standard (DFIPS 180-2) implemented
using special ARMv8 Crypto Extensions.
config CRYPTO_SHA256_ARM
tristate "SHA-224/256 digest algorithm (ARM-asm and NEON)"
select CRYPTO_HASH
depends on !CPU_V7M
help
SHA-256 secure hash standard (DFIPS 180-2) implemented
using optimized ARM assembler and NEON, when available.
config CRYPTO_SHA512_ARM_NEON
tristate "SHA384 and SHA512 digest algorithm (ARM NEON)"
depends on KERNEL_MODE_NEON
select CRYPTO_SHA512
select CRYPTO_HASH
help
SHA-512 secure hash standard (DFIPS 180-2) implemented
using ARM NEON instructions, when available.
This version of SHA implements a 512 bit hash with 256 bits of
security against collision attacks.
This code also includes SHA-384, a 384 bit hash with 192 bits
of security against collision attacks.
config CRYPTO_AES_ARM
tristate "AES cipher algorithms (ARM-asm)"
depends on ARM
select CRYPTO_ALGAPI
select CRYPTO_AES
help
Use optimized AES assembler routines for ARM platforms.
AES cipher algorithms (FIPS-197). AES uses the Rijndael
algorithm.
Rijndael appears to be consistently a very good performer in
both hardware and software across a wide range of computing
environments regardless of its use in feedback or non-feedback
modes. Its key setup time is excellent, and its key agility is
good. Rijndael's very low memory requirements make it very well
suited for restricted-space environments, in which it also
demonstrates excellent performance. Rijndael's operations are
among the easiest to defend against power and timing attacks.
The AES specifies three key sizes: 128, 192 and 256 bits
See <http://csrc.nist.gov/encryption/aes/> for more information.
config CRYPTO_AES_ARM_BS
tristate "Bit sliced AES using NEON instructions"
depends on KERNEL_MODE_NEON
select CRYPTO_ALGAPI
select CRYPTO_AES_ARM
select CRYPTO_ABLK_HELPER
help
Use a faster and more secure NEON based implementation of AES in CBC,
CTR and XTS modes
Bit sliced AES gives around 45% speedup on Cortex-A15 for CTR mode
and for XTS mode encryption, CBC and XTS mode decryption speedup is
around 25%. (CBC encryption speed is not affected by this driver.)
This implementation does not rely on any lookup tables so it is
believed to be invulnerable to cache timing attacks.
config CRYPTO_AES_ARM_CE
tristate "Accelerated AES using ARMv8 Crypto Extensions"
depends on KERNEL_MODE_NEON
select CRYPTO_ALGAPI
select CRYPTO_ABLK_HELPER
help
Use an implementation of AES in CBC, CTR and XTS modes that uses
ARMv8 Crypto Extensions
config CRYPTO_GHASH_ARM_CE
tristate "PMULL-accelerated GHASH using ARMv8 Crypto Extensions"
depends on KERNEL_MODE_NEON
select CRYPTO_HASH
select CRYPTO_CRYPTD
help
Use an implementation of GHASH (used by the GCM AEAD chaining mode)
that uses the 64x64 to 128 bit polynomial multiplication (vmull.p64)
that is part of the ARMv8 Crypto Extensions
endif
......@@ -6,13 +6,35 @@ obj-$(CONFIG_CRYPTO_AES_ARM) += aes-arm.o
obj-$(CONFIG_CRYPTO_AES_ARM_BS) += aes-arm-bs.o
obj-$(CONFIG_CRYPTO_SHA1_ARM) += sha1-arm.o
obj-$(CONFIG_CRYPTO_SHA1_ARM_NEON) += sha1-arm-neon.o
obj-$(CONFIG_CRYPTO_SHA256_ARM) += sha256-arm.o
obj-$(CONFIG_CRYPTO_SHA512_ARM_NEON) += sha512-arm-neon.o
ce-obj-$(CONFIG_CRYPTO_AES_ARM_CE) += aes-arm-ce.o
ce-obj-$(CONFIG_CRYPTO_SHA1_ARM_CE) += sha1-arm-ce.o
ce-obj-$(CONFIG_CRYPTO_SHA2_ARM_CE) += sha2-arm-ce.o
ce-obj-$(CONFIG_CRYPTO_GHASH_ARM_CE) += ghash-arm-ce.o
ifneq ($(ce-obj-y)$(ce-obj-m),)
ifeq ($(call as-instr,.fpu crypto-neon-fp-armv8,y,n),y)
obj-y += $(ce-obj-y)
obj-m += $(ce-obj-m)
else
$(warning These ARMv8 Crypto Extensions modules need binutils 2.23 or higher)
$(warning $(ce-obj-y) $(ce-obj-m))
endif
endif
aes-arm-y := aes-armv4.o aes_glue.o
aes-arm-bs-y := aesbs-core.o aesbs-glue.o
sha1-arm-y := sha1-armv4-large.o sha1_glue.o
sha1-arm-neon-y := sha1-armv7-neon.o sha1_neon_glue.o
sha256-arm-neon-$(CONFIG_KERNEL_MODE_NEON) := sha256_neon_glue.o
sha256-arm-y := sha256-core.o sha256_glue.o $(sha256-arm-neon-y)
sha512-arm-neon-y := sha512-armv7-neon.o sha512_neon_glue.o
sha1-arm-ce-y := sha1-ce-core.o sha1-ce-glue.o
sha2-arm-ce-y := sha2-ce-core.o sha2-ce-glue.o
aes-arm-ce-y := aes-ce-core.o aes-ce-glue.o
ghash-arm-ce-y := ghash-ce-core.o ghash-ce-glue.o
quiet_cmd_perl = PERL $@
cmd_perl = $(PERL) $(<) > $(@)
......@@ -20,4 +42,7 @@ quiet_cmd_perl = PERL $@
$(src)/aesbs-core.S_shipped: $(src)/bsaes-armv7.pl
$(call cmd,perl)
-.PRECIOUS: $(obj)/aesbs-core.S
+$(src)/sha256-core.S_shipped: $(src)/sha256-armv4.pl
+	$(call cmd,perl)
+
+.PRECIOUS: $(obj)/aesbs-core.S $(obj)/sha256-core.S
......@@ -301,7 +301,8 @@ static struct crypto_alg aesbs_algs[] = { {
.cra_name = "__cbc-aes-neonbs",
.cra_driver_name = "__driver-cbc-aes-neonbs",
.cra_priority = 0,
-.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER,
+.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER |
+	     CRYPTO_ALG_INTERNAL,
.cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct aesbs_cbc_ctx),
.cra_alignmask = 7,
......@@ -319,7 +320,8 @@ static struct crypto_alg aesbs_algs[] = { {
.cra_name = "__ctr-aes-neonbs",
.cra_driver_name = "__driver-ctr-aes-neonbs",
.cra_priority = 0,
-.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER,
+.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER |
+	     CRYPTO_ALG_INTERNAL,
.cra_blocksize = 1,
.cra_ctxsize = sizeof(struct aesbs_ctr_ctx),
.cra_alignmask = 7,
......@@ -337,7 +339,8 @@ static struct crypto_alg aesbs_algs[] = { {
.cra_name = "__xts-aes-neonbs",
.cra_driver_name = "__driver-xts-aes-neonbs",
.cra_priority = 0,
-.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER,
+.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER |
+	     CRYPTO_ALG_INTERNAL,
.cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct aesbs_xts_ctx),
.cra_alignmask = 7,
......
/*
* Accelerated GHASH implementation with ARMv8 vmull.p64 instructions.
*
* Copyright (C) 2015 Linaro Ltd. <ard.biesheuvel@linaro.org>
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License version 2 as published
* by the Free Software Foundation.
*/
#include <linux/linkage.h>
#include <asm/assembler.h>
SHASH .req q0
SHASH2 .req q1
T1 .req q2
T2 .req q3
MASK .req q4
XL .req q5
XM .req q6
XH .req q7
IN1 .req q7
SHASH_L .req d0
SHASH_H .req d1
SHASH2_L .req d2
T1_L .req d4
MASK_L .req d8
XL_L .req d10
XL_H .req d11
XM_L .req d12
XM_H .req d13
XH_L .req d14
.text
.fpu crypto-neon-fp-armv8
/*
* void pmull_ghash_update(int blocks, u64 dg[], const char *src,
* struct ghash_key const *k, const char *head)
*/
ENTRY(pmull_ghash_update)
vld1.64 {SHASH}, [r3]
vld1.64 {XL}, [r1]
vmov.i8 MASK, #0xe1
vext.8 SHASH2, SHASH, SHASH, #8
vshl.u64 MASK, MASK, #57
veor SHASH2, SHASH2, SHASH
/* do the head block first, if supplied */
ldr ip, [sp]
teq ip, #0
beq 0f
vld1.64 {T1}, [ip]
teq r0, #0
b 1f
0: vld1.64 {T1}, [r2]!
subs r0, r0, #1
1: /* multiply XL by SHASH in GF(2^128) */
#ifndef CONFIG_CPU_BIG_ENDIAN
vrev64.8 T1, T1
#endif
vext.8 T2, XL, XL, #8
vext.8 IN1, T1, T1, #8
veor T1, T1, T2
veor XL, XL, IN1
vmull.p64 XH, SHASH_H, XL_H @ a1 * b1
veor T1, T1, XL
vmull.p64 XL, SHASH_L, XL_L @ a0 * b0
vmull.p64 XM, SHASH2_L, T1_L @ (a1 + a0)(b1 + b0)
vext.8 T1, XL, XH, #8
veor T2, XL, XH
veor XM, XM, T1
veor XM, XM, T2
vmull.p64 T2, XL_L, MASK_L
vmov XH_L, XM_H
vmov XM_H, XL_L
veor XL, XM, T2
vext.8 T2, XL, XL, #8
vmull.p64 XL, XL_L, MASK_L
veor T2, T2, XH
veor XL, XL, T2
bne 0b
vst1.64 {XL}, [r1]
bx lr
ENDPROC(pmull_ghash_update)
/*
* Accelerated GHASH implementation with ARMv8 vmull.p64 instructions.
*
* Copyright (C) 2015 Linaro Ltd. <ard.biesheuvel@linaro.org>
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License version 2 as published
* by the Free Software Foundation.
*/
#include <asm/hwcap.h>
#include <asm/neon.h>
#include <asm/simd.h>
#include <asm/unaligned.h>
#include <crypto/cryptd.h>
#include <crypto/internal/hash.h>
#include <crypto/gf128mul.h>
#include <linux/crypto.h>
#include <linux/module.h>
MODULE_DESCRIPTION("GHASH secure hash using ARMv8 Crypto Extensions");
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
MODULE_LICENSE("GPL v2");
#define GHASH_BLOCK_SIZE 16
#define GHASH_DIGEST_SIZE 16
struct ghash_key {
u64 a;
u64 b;
};
struct ghash_desc_ctx {
u64 digest[GHASH_DIGEST_SIZE/sizeof(u64)];
u8 buf[GHASH_BLOCK_SIZE];
u32 count;
};
struct ghash_async_ctx {
struct cryptd_ahash *cryptd_tfm;
};
asmlinkage void pmull_ghash_update(int blocks, u64 dg[], const char *src,
struct ghash_key const *k, const char *head);
static int ghash_init(struct shash_desc *desc)
{
struct ghash_desc_ctx *ctx = shash_desc_ctx(desc);
*ctx = (struct ghash_desc_ctx){};
return 0;
}
static int ghash_update(struct shash_desc *desc, const u8 *src,
unsigned int len)
{
struct ghash_desc_ctx *ctx = shash_desc_ctx(desc);
unsigned int partial = ctx->count % GHASH_BLOCK_SIZE;
ctx->count += len;
if ((partial + len) >= GHASH_BLOCK_SIZE) {
struct ghash_key *key = crypto_shash_ctx(desc->tfm);
int blocks;
if (partial) {
int p = GHASH_BLOCK_SIZE - partial;
memcpy(ctx->buf + partial, src, p);
src += p;
len -= p;
}
blocks = len / GHASH_BLOCK_SIZE;
len %= GHASH_BLOCK_SIZE;
kernel_neon_begin();
pmull_ghash_update(blocks, ctx->digest, src, key,
partial ? ctx->buf : NULL);
kernel_neon_end();
src += blocks * GHASH_BLOCK_SIZE;
partial = 0;
}
if (len)
memcpy(ctx->buf + partial, src, len);
return 0;
}
static int ghash_final(struct shash_desc *desc, u8 *dst)
{
struct ghash_desc_ctx *ctx = shash_desc_ctx(desc);
unsigned int partial = ctx->count % GHASH_BLOCK_SIZE;
if (partial) {
struct ghash_key *key = crypto_shash_ctx(desc->tfm);
memset(ctx->buf + partial, 0, GHASH_BLOCK_SIZE - partial);
kernel_neon_begin();
pmull_ghash_update(1, ctx->digest, ctx->buf, key, NULL);
kernel_neon_end();
}
put_unaligned_be64(ctx->digest[1], dst);
put_unaligned_be64(ctx->digest[0], dst + 8);
*ctx = (struct ghash_desc_ctx){};
return 0;
}
static int ghash_setkey(struct crypto_shash *tfm,
const u8 *inkey, unsigned int keylen)
{
struct ghash_key *key = crypto_shash_ctx(tfm);
u64 a, b;
if (keylen != GHASH_BLOCK_SIZE) {
crypto_shash_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
return -EINVAL;
}
/* perform multiplication by 'x' in GF(2^128) */
b = get_unaligned_be64(inkey);
a = get_unaligned_be64(inkey + 8);
key->a = (a << 1) | (b >> 63);
key->b = (b << 1) | (a >> 63);
if (b >> 63)
key->b ^= 0xc200000000000000UL;
return 0;
}
static struct shash_alg ghash_alg = {
.digestsize = GHASH_DIGEST_SIZE,
.init = ghash_init,
.update = ghash_update,
.final = ghash_final,
.setkey = ghash_setkey,
.descsize = sizeof(struct ghash_desc_ctx),
.base = {
.cra_name = "ghash",
.cra_driver_name = "__driver-ghash-ce",
.cra_priority = 0,
.cra_flags = CRYPTO_ALG_TYPE_SHASH | CRYPTO_ALG_INTERNAL,
.cra_blocksize = GHASH_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct ghash_key),
.cra_module = THIS_MODULE,
},
};
static int ghash_async_init(struct ahash_request *req)
{
struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
struct ghash_async_ctx *ctx = crypto_ahash_ctx(tfm);
struct ahash_request *cryptd_req = ahash_request_ctx(req);
struct cryptd_ahash *cryptd_tfm = ctx->cryptd_tfm;
if (!may_use_simd()) {
memcpy(cryptd_req, req, sizeof(*req));
ahash_request_set_tfm(cryptd_req, &cryptd_tfm->base);
return crypto_ahash_init(cryptd_req);
} else {
struct shash_desc *desc = cryptd_shash_desc(cryptd_req);
struct crypto_shash *child = cryptd_ahash_child(cryptd_tfm);
desc->tfm = child;
desc->flags = req->base.flags;
return crypto_shash_init(desc);
}
}
static int ghash_async_update(struct ahash_request *req)
{
struct ahash_request *cryptd_req = ahash_request_ctx(req);
if (!may_use_simd()) {
struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
struct ghash_async_ctx *ctx = crypto_ahash_ctx(tfm);
struct cryptd_ahash *cryptd_tfm = ctx->cryptd_tfm;
memcpy(cryptd_req, req, sizeof(*req));
ahash_request_set_tfm(cryptd_req, &cryptd_tfm->base);
return crypto_ahash_update(cryptd_req);
} else {
struct shash_desc *desc = cryptd_shash_desc(cryptd_req);
return shash_ahash_update(req, desc);
}
}
static int ghash_async_final(struct ahash_request *req)
{
struct ahash_request *cryptd_req = ahash_request_ctx(req);
if (!may_use_simd()) {
struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
struct ghash_async_ctx *ctx = crypto_ahash_ctx(tfm);
struct cryptd_ahash *cryptd_tfm = ctx->cryptd_tfm;
memcpy(cryptd_req, req, sizeof(*req));
ahash_request_set_tfm(cryptd_req, &cryptd_tfm->base);
return crypto_ahash_final(cryptd_req);
} else {
struct shash_desc *desc = cryptd_shash_desc(cryptd_req);
return crypto_shash_final(desc, req->result);
}
}
static int ghash_async_digest(struct ahash_request *req)
{
struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
struct ghash_async_ctx *ctx = crypto_ahash_ctx(tfm);
struct ahash_request *cryptd_req = ahash_request_ctx(req);
struct cryptd_ahash *cryptd_tfm = ctx->cryptd_tfm;
if (!may_use_simd()) {
memcpy(cryptd_req, req, sizeof(*req));
ahash_request_set_tfm(cryptd_req, &cryptd_tfm->base);
return crypto_ahash_digest(cryptd_req);
} else {
struct shash_desc *desc = cryptd_shash_desc(cryptd_req);
struct crypto_shash *child = cryptd_ahash_child(cryptd_tfm);
desc->tfm = child;
desc->flags = req->base.flags;
return shash_ahash_digest(req, desc);
}
}
static int ghash_async_setkey(struct crypto_ahash *tfm, const u8 *key,
unsigned int keylen)
{
struct ghash_async_ctx *ctx = crypto_ahash_ctx(tfm);
struct crypto_ahash *child = &ctx->cryptd_tfm->base;
int err;
crypto_ahash_clear_flags(child, CRYPTO_TFM_REQ_MASK);
crypto_ahash_set_flags(child, crypto_ahash_get_flags(tfm)
& CRYPTO_TFM_REQ_MASK);
err = crypto_ahash_setkey(child, key, keylen);
crypto_ahash_set_flags(tfm, crypto_ahash_get_flags(child)
& CRYPTO_TFM_RES_MASK);
return err;
}
static int ghash_async_init_tfm(struct crypto_tfm *tfm)
{
struct cryptd_ahash *cryptd_tfm;
struct ghash_async_ctx *ctx = crypto_tfm_ctx(tfm);
cryptd_tfm = cryptd_alloc_ahash("__driver-ghash-ce",
CRYPTO_ALG_INTERNAL,
CRYPTO_ALG_INTERNAL);
if (IS_ERR(cryptd_tfm))
return PTR_ERR(cryptd_tfm);
ctx->cryptd_tfm = cryptd_tfm;
crypto_ahash_set_reqsize(__crypto_ahash_cast(tfm),
sizeof(struct ahash_request) +
crypto_ahash_reqsize(&cryptd_tfm->base));
return 0;
}
static void ghash_async_exit_tfm(struct crypto_tfm *tfm)
{
struct ghash_async_ctx *ctx = crypto_tfm_ctx(tfm);
cryptd_free_ahash(ctx->cryptd_tfm);
}
static struct ahash_alg ghash_async_alg = {
.init = ghash_async_init,
.update = ghash_async_update,
.final = ghash_async_final,
.setkey = ghash_async_setkey,
.digest = ghash_async_digest,
.halg.digestsize = GHASH_DIGEST_SIZE,
.halg.base = {
.cra_name = "ghash",
.cra_driver_name = "ghash-ce",
.cra_priority = 300,
.cra_flags = CRYPTO_ALG_TYPE_AHASH | CRYPTO_ALG_ASYNC,
.cra_blocksize = GHASH_BLOCK_SIZE,
.cra_type = &crypto_ahash_type,
.cra_ctxsize = sizeof(struct ghash_async_ctx),
.cra_module = THIS_MODULE,
.cra_init = ghash_async_init_tfm,
.cra_exit = ghash_async_exit_tfm,
},
};
static int __init ghash_ce_mod_init(void)
{
int err;
if (!(elf_hwcap2 & HWCAP2_PMULL))
return -ENODEV;
err = crypto_register_shash(&ghash_alg);
if (err)
return err;
err = crypto_register_ahash(&ghash_async_alg);
if (err)
goto err_shash;
return 0;
err_shash:
crypto_unregister_shash(&ghash_alg);
return err;
}
static void __exit ghash_ce_mod_exit(void)
{
crypto_unregister_ahash(&ghash_async_alg);
crypto_unregister_shash(&ghash_alg);
}
module_init(ghash_ce_mod_init);
module_exit(ghash_ce_mod_exit);
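The key pre-processing in ghash_setkey above can be tried out in user space. The following is a minimal sketch that mirrors its arithmetic: the two 64-bit halves are rotated left by one bit as a single 128-bit value, and the constant 0xc200000000000000 is XORed into the upper half when the top bit of the input was set. The names ghash_key_sketch, load_be64 and ghash_key_mul_x are hypothetical stand-ins for the kernel's struct ghash_key and get_unaligned_be64.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the kernel's struct ghash_key. */
struct ghash_key_sketch {
    uint64_t a;
    uint64_t b;
};

/* Read a big-endian 64-bit value from possibly unaligned memory. */
static uint64_t load_be64(const uint8_t *p)
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Mirror of the arithmetic in ghash_setkey(): what the source calls
 * multiplication by 'x' in GF(2^128). */
void ghash_key_mul_x(struct ghash_key_sketch *key, const uint8_t inkey[16])
{
    uint64_t b = load_be64(inkey);      /* first 8 key bytes */
    uint64_t a = load_be64(inkey + 8);  /* last 8 key bytes */

    assert(key);
    key->a = (a << 1) | (b >> 63);
    key->b = (b << 1) | (a >> 63);
    if (b >> 63)
        key->b ^= 0xc200000000000000ULL;
}
```

For a key whose only set bit is the lowest bit of the last byte, the result is a == 2 and b == 0. For a key whose only set bit is the top bit of the first byte, the result is a == 1 and b == 0xc200000000000000.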
/*
* sha1-ce-core.S - SHA-1 secure hash using ARMv8 Crypto Extensions
*
* Copyright (C) 2015 Linaro Ltd.
* Author: Ard Biesheuvel <ard.biesheuvel@linaro.org>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <linux/linkage.h>
#include <asm/assembler.h>
.text
.fpu crypto-neon-fp-armv8
k0 .req q0
k1 .req q1
k2 .req q2
k3 .req q3
ta0 .req q4
ta1 .req q5
tb0 .req q5
tb1 .req q4
dga .req q6
dgb .req q7
dgbs .req s28
dg0 .req q12
dg1a0 .req q13
dg1a1 .req q14
dg1b0 .req q14
dg1b1 .req q13
.macro add_only, op, ev, rc, s0, dg1
.ifnb \s0
vadd.u32 tb\ev, q\s0, \rc
.endif
sha1h.32 dg1b\ev, dg0
.ifb \dg1
sha1\op\().32 dg0, dg1a\ev, ta\ev
.else
sha1\op\().32 dg0, \dg1, ta\ev
.endif
.endm
.macro add_update, op, ev, rc, s0, s1, s2, s3, dg1
sha1su0.32 q\s0, q\s1, q\s2
add_only \op, \ev, \rc, \s1, \dg1
sha1su1.32 q\s0, q\s3
.endm
.align 6
.Lsha1_rcon:
.word 0x5a827999, 0x5a827999, 0x5a827999, 0x5a827999
.word 0x6ed9eba1, 0x6ed9eba1, 0x6ed9eba1, 0x6ed9eba1
.word 0x8f1bbcdc, 0x8f1bbcdc, 0x8f1bbcdc, 0x8f1bbcdc
.word 0xca62c1d6, 0xca62c1d6, 0xca62c1d6, 0xca62c1d6
/*
* void sha1_ce_transform(struct sha1_state *sst, u8 const *src,
* int blocks);
*/
ENTRY(sha1_ce_transform)
/* load round constants */
adr ip, .Lsha1_rcon
vld1.32 {k0-k1}, [ip, :128]!
vld1.32 {k2-k3}, [ip, :128]
/* load state */
vld1.32 {dga}, [r0]
vldr dgbs, [r0, #16]
/* load input */
0: vld1.32 {q8-q9}, [r1]!
vld1.32 {q10-q11}, [r1]!
subs r2, r2, #1
#ifndef CONFIG_CPU_BIG_ENDIAN
vrev32.8 q8, q8
vrev32.8 q9, q9
vrev32.8 q10, q10
vrev32.8 q11, q11
#endif
vadd.u32 ta0, q8, k0
vmov dg0, dga
add_update c, 0, k0, 8, 9, 10, 11, dgb
add_update c, 1, k0, 9, 10, 11, 8
add_update c, 0, k0, 10, 11, 8, 9
add_update c, 1, k0, 11, 8, 9, 10
add_update c, 0, k1, 8, 9, 10, 11
add_update p, 1, k1, 9, 10, 11, 8
add_update p, 0, k1, 10, 11, 8, 9
add_update p, 1, k1, 11, 8, 9, 10
add_update p, 0, k1, 8, 9, 10, 11
add_update p, 1, k2, 9, 10, 11, 8
add_update m, 0, k2, 10, 11, 8, 9
add_update m, 1, k2, 11, 8, 9, 10
add_update m, 0, k2, 8, 9, 10, 11
add_update m, 1, k2, 9, 10, 11, 8
add_update m, 0, k3, 10, 11, 8, 9
add_update p, 1, k3, 11, 8, 9, 10
add_only p, 0, k3, 9
add_only p, 1, k3, 10
add_only p, 0, k3, 11
add_only p, 1
/* update state */
vadd.u32 dga, dga, dg0
vadd.u32 dgb, dgb, dg1a0
bne 0b
/* store new state */
vst1.32 {dga}, [r0]
vstr dgbs, [r0, #16]
bx lr
ENDPROC(sha1_ce_transform)
/*
* sha1-ce-glue.c - SHA-1 secure hash using ARMv8 Crypto Extensions
*
* Copyright (C) 2015 Linaro Ltd <ard.biesheuvel@linaro.org>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <crypto/internal/hash.h>
#include <crypto/sha.h>
#include <crypto/sha1_base.h>
#include <linux/crypto.h>
#include <linux/module.h>
#include <asm/hwcap.h>
#include <asm/neon.h>
#include <asm/simd.h>
#include "sha1.h"
MODULE_DESCRIPTION("SHA1 secure hash using ARMv8 Crypto Extensions");
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
MODULE_LICENSE("GPL v2");
asmlinkage void sha1_ce_transform(struct sha1_state *sst, u8 const *src,
int blocks);
static int sha1_ce_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
struct sha1_state *sctx = shash_desc_ctx(desc);
if (!may_use_simd() ||
(sctx->count % SHA1_BLOCK_SIZE) + len < SHA1_BLOCK_SIZE)
return sha1_update_arm(desc, data, len);
kernel_neon_begin();
sha1_base_do_update(desc, data, len, sha1_ce_transform);
kernel_neon_end();
return 0;
}
static int sha1_ce_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
{
if (!may_use_simd())
return sha1_finup_arm(desc, data, len, out);
kernel_neon_begin();
if (len)
sha1_base_do_update(desc, data, len, sha1_ce_transform);
sha1_base_do_finalize(desc, sha1_ce_transform);
kernel_neon_end();
return sha1_base_finish(desc, out);
}
static int sha1_ce_final(struct shash_desc *desc, u8 *out)
{
return sha1_ce_finup(desc, NULL, 0, out);
}
static struct shash_alg alg = {
.init = sha1_base_init,
.update = sha1_ce_update,
.final = sha1_ce_final,
.finup = sha1_ce_finup,
.descsize = sizeof(struct sha1_state),
.digestsize = SHA1_DIGEST_SIZE,
.base = {
.cra_name = "sha1",
.cra_driver_name = "sha1-ce",
.cra_priority = 200,
.cra_flags = CRYPTO_ALG_TYPE_SHASH,
.cra_blocksize = SHA1_BLOCK_SIZE,
.cra_module = THIS_MODULE,
}
};
static int __init sha1_ce_mod_init(void)
{
if (!(elf_hwcap2 & HWCAP2_SHA1))
return -ENODEV;
return crypto_register_shash(&alg);
}
static void __exit sha1_ce_mod_fini(void)
{
crypto_unregister_shash(&alg);
}
module_init(sha1_ce_mod_init);
module_exit(sha1_ce_mod_fini);
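
The glue code above falls back to the scalar routine when `(sctx->count % SHA1_BLOCK_SIZE) + len < SHA1_BLOCK_SIZE`, that is, when the new data only tops up the partial-block buffer and no full block would reach the transform. The following is a minimal user-space sketch of that buffering logic, not the kernel's `sha1_base_do_update()` itself. `struct model_state`, `count_blocks()`, `fits_in_buffer()` and `model_do_update()` are hypothetical names introduced for illustration, and the block function only counts blocks instead of hashing them.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 64u	/* SHA-1 block size in bytes */

/* Hypothetical stand-in for struct sha1_state; blocks_seen is demo bookkeeping. */
struct model_state {
	uint64_t count;			/* total bytes fed so far */
	uint8_t buffer[BLOCK_SIZE];	/* bytes of an incomplete block */
	unsigned int blocks_seen;	/* blocks handed to the block function */
};

typedef void (block_fn)(struct model_state *st, const uint8_t *src, int blocks);

/* Dummy block function: records how many blocks it was asked to process. */
void count_blocks(struct model_state *st, const uint8_t *src, int blocks)
{
	(void)src;
	st->blocks_seen += (unsigned int)blocks;
}

/* Same shape as the fast-path test in the glue code. */
int fits_in_buffer(const struct model_state *st, unsigned int len)
{
	return (st->count % BLOCK_SIZE) + len < BLOCK_SIZE;
}

/* Buffer partial data; call fn only for complete 64-byte blocks. */
void model_do_update(struct model_state *st, const uint8_t *data,
		     unsigned int len, block_fn *fn)
{
	unsigned int partial = (unsigned int)(st->count % BLOCK_SIZE);

	st->count += len;

	if (partial + len >= BLOCK_SIZE) {
		unsigned int blocks;

		if (partial) {
			/* complete the buffered block first */
			unsigned int p = BLOCK_SIZE - partial;

			memcpy(st->buffer + partial, data, p);
			data += p;
			len -= p;
			fn(st, st->buffer, 1);
		}

		blocks = len / BLOCK_SIZE;
		len %= BLOCK_SIZE;
		if (blocks) {
			fn(st, data, (int)blocks);
			data += blocks * BLOCK_SIZE;
		}
		partial = 0;
	}

	/* whatever is left always fits in the buffer */
	assert(partial + len < BLOCK_SIZE);
	if (len)
		memcpy(st->buffer + partial, data, len);
}
```

In this model, an update for which `fits_in_buffer()` is true never invokes the block function, which is why the real glue code can skip `kernel_neon_begin()`/`kernel_neon_end()` for such calls.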
......@@ -7,4 +7,7 @@
extern int sha1_update_arm(struct shash_desc *desc, const u8 *data,
unsigned int len);
extern int sha1_finup_arm(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out);
#endif
......@@ -22,127 +22,47 @@
#include <linux/cryptohash.h>
#include <linux/types.h>
#include <crypto/sha.h>
#include <crypto/sha1_base.h>
#include <asm/byteorder.h>
#include <asm/crypto/sha1.h>
#include "sha1.h"
asmlinkage void sha1_block_data_order(u32 *digest,
const unsigned char *data, unsigned int rounds);
static int sha1_init(struct shash_desc *desc)
{
struct sha1_state *sctx = shash_desc_ctx(desc);
*sctx = (struct sha1_state){
.state = { SHA1_H0, SHA1_H1, SHA1_H2, SHA1_H3, SHA1_H4 },
};
return 0;
}
static int __sha1_update(struct sha1_state *sctx, const u8 *data,
unsigned int len, unsigned int partial)
{
unsigned int done = 0;
sctx->count += len;
if (partial) {
done = SHA1_BLOCK_SIZE - partial;
memcpy(sctx->buffer + partial, data, done);
sha1_block_data_order(sctx->state, sctx->buffer, 1);
}
if (len - done >= SHA1_BLOCK_SIZE) {
const unsigned int rounds = (len - done) / SHA1_BLOCK_SIZE;
sha1_block_data_order(sctx->state, data + done, rounds);
done += rounds * SHA1_BLOCK_SIZE;
}
memcpy(sctx->buffer, data + done, len - done);
return 0;
}
int sha1_update_arm(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
struct sha1_state *sctx = shash_desc_ctx(desc);
unsigned int partial = sctx->count % SHA1_BLOCK_SIZE;
int res;
/* make sure casting to sha1_block_fn() is safe */
BUILD_BUG_ON(offsetof(struct sha1_state, state) != 0);
/* Handle the fast case right here */
if (partial + len < SHA1_BLOCK_SIZE) {
sctx->count += len;
memcpy(sctx->buffer + partial, data, len);
return 0;
}
res = __sha1_update(sctx, data, len, partial);
return res;
return sha1_base_do_update(desc, data, len,
(sha1_block_fn *)sha1_block_data_order);
}
EXPORT_SYMBOL_GPL(sha1_update_arm);
/* Add padding and return the message digest. */
static int sha1_final(struct shash_desc *desc, u8 *out)
{
struct sha1_state *sctx = shash_desc_ctx(desc);
unsigned int i, index, padlen;
__be32 *dst = (__be32 *)out;
__be64 bits;
static const u8 padding[SHA1_BLOCK_SIZE] = { 0x80, };
bits = cpu_to_be64(sctx->count << 3);
/* Pad out to 56 mod 64 and append length */
index = sctx->count % SHA1_BLOCK_SIZE;
padlen = (index < 56) ? (56 - index) : ((SHA1_BLOCK_SIZE+56) - index);
/* We need to fill a whole block for __sha1_update() */
if (padlen <= 56) {
sctx->count += padlen;
memcpy(sctx->buffer + index, padding, padlen);
} else {
__sha1_update(sctx, padding, padlen, index);
}
__sha1_update(sctx, (const u8 *)&bits, sizeof(bits), 56);
/* Store state in digest */
for (i = 0; i < 5; i++)
dst[i] = cpu_to_be32(sctx->state[i]);
/* Wipe context */
memset(sctx, 0, sizeof(*sctx));
return 0;
sha1_base_do_finalize(desc, (sha1_block_fn *)sha1_block_data_order);
return sha1_base_finish(desc, out);
}
static int sha1_export(struct shash_desc *desc, void *out)
int sha1_finup_arm(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
{
struct sha1_state *sctx = shash_desc_ctx(desc);
memcpy(out, sctx, sizeof(*sctx));
return 0;
sha1_base_do_update(desc, data, len,
(sha1_block_fn *)sha1_block_data_order);
return sha1_final(desc, out);
}
static int sha1_import(struct shash_desc *desc, const void *in)
{
struct sha1_state *sctx = shash_desc_ctx(desc);
memcpy(sctx, in, sizeof(*sctx));
return 0;
}
EXPORT_SYMBOL_GPL(sha1_finup_arm);
static struct shash_alg alg = {
.digestsize = SHA1_DIGEST_SIZE,
.init = sha1_init,
.init = sha1_base_init,
.update = sha1_update_arm,
.final = sha1_final,
.export = sha1_export,
.import = sha1_import,
.finup = sha1_finup_arm,
.descsize = sizeof(struct sha1_state),
.statesize = sizeof(struct sha1_state),
.base = {
.cra_name = "sha1",
.cra_driver_name= "sha1-asm",
......
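
The removed `sha1_final()` body pads the message "out to 56 mod 64" and then appends the 8-byte bit count. As a small sketch of that arithmetic only, the helper below reproduces the `padlen` expression from the removed code in user space. `sha_padlen()` and `padded_size()` are hypothetical names, and the constants are redefined locally rather than taken from kernel headers.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_SIZE 64u	/* SHA-1 block size in bytes */
#define LEN_FIELD   8u	/* size of the big-endian bit count appended last */

/* Number of padding bytes (0x80 followed by zeroes) for a message of count bytes. */
unsigned int sha_padlen(uint64_t count)
{
	unsigned int index = (unsigned int)(count % BLOCK_SIZE);
	unsigned int padlen = (index < 56) ? (56 - index)
					   : ((BLOCK_SIZE + 56) - index);

	/* at least the 0x80 byte, at most one full block */
	assert(padlen >= 1 && padlen <= BLOCK_SIZE);
	return padlen;
}

/* Total bytes hashed once padding and the length field are appended. */
uint64_t padded_size(uint64_t count)
{
	return count + sha_padlen(count) + LEN_FIELD;
}
```

For any input length, `padded_size()` is a multiple of 64, so the last call into the block function always sees whole blocks.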
......@@ -25,147 +25,60 @@
#include <linux/cryptohash.h>
#include <linux/types.h>
#include <crypto/sha.h>
#include <asm/byteorder.h>
#include <crypto/sha1_base.h>
#include <asm/neon.h>
#include <asm/simd.h>
#include <asm/crypto/sha1.h>
#include "sha1.h"
asmlinkage void sha1_transform_neon(void *state_h, const char *data,
unsigned int rounds);
static int sha1_neon_init(struct shash_desc *desc)
{
struct sha1_state *sctx = shash_desc_ctx(desc);
*sctx = (struct sha1_state){
.state = { SHA1_H0, SHA1_H1, SHA1_H2, SHA1_H3, SHA1_H4 },
};
return 0;
}
static int __sha1_neon_update(struct shash_desc *desc, const u8 *data,
unsigned int len, unsigned int partial)
{
struct sha1_state *sctx = shash_desc_ctx(desc);
unsigned int done = 0;
sctx->count += len;
if (partial) {
done = SHA1_BLOCK_SIZE - partial;
memcpy(sctx->buffer + partial, data, done);
sha1_transform_neon(sctx->state, sctx->buffer, 1);
}
if (len - done >= SHA1_BLOCK_SIZE) {
const unsigned int rounds = (len - done) / SHA1_BLOCK_SIZE;
sha1_transform_neon(sctx->state, data + done, rounds);
done += rounds * SHA1_BLOCK_SIZE;
}
memcpy(sctx->buffer, data + done, len - done);
return 0;
}
static int sha1_neon_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
unsigned int len)
{
struct sha1_state *sctx = shash_desc_ctx(desc);
unsigned int partial = sctx->count % SHA1_BLOCK_SIZE;
int res;
/* Handle the fast case right here */
if (partial + len < SHA1_BLOCK_SIZE) {
sctx->count += len;
memcpy(sctx->buffer + partial, data, len);
if (!may_use_simd() ||
(sctx->count % SHA1_BLOCK_SIZE) + len < SHA1_BLOCK_SIZE)
return sha1_update_arm(desc, data, len);
return 0;
}
if (!may_use_simd()) {
res = sha1_update_arm(desc, data, len);
} else {
kernel_neon_begin();
res = __sha1_neon_update(desc, data, len, partial);
kernel_neon_end();
}
return res;
}
/* Add padding and return the message digest. */
static int sha1_neon_final(struct shash_desc *desc, u8 *out)
{
struct sha1_state *sctx = shash_desc_ctx(desc);
unsigned int i, index, padlen;
__be32 *dst = (__be32 *)out;
__be64 bits;
static const u8 padding[SHA1_BLOCK_SIZE] = { 0x80, };
bits = cpu_to_be64(sctx->count << 3);
/* Pad out to 56 mod 64 and append length */
index = sctx->count % SHA1_BLOCK_SIZE;
padlen = (index < 56) ? (56 - index) : ((SHA1_BLOCK_SIZE+56) - index);
if (!may_use_simd()) {
sha1_update_arm(desc, padding, padlen);
sha1_update_arm(desc, (const u8 *)&bits, sizeof(bits));
} else {
kernel_neon_begin();
/* We need to fill a whole block for __sha1_neon_update() */
if (padlen <= 56) {
sctx->count += padlen;
memcpy(sctx->buffer + index, padding, padlen);
} else {
__sha1_neon_update(desc, padding, padlen, index);
}
__sha1_neon_update(desc, (const u8 *)&bits, sizeof(bits), 56);
kernel_neon_end();
}
/* Store state in digest */
for (i = 0; i < 5; i++)
dst[i] = cpu_to_be32(sctx->state[i]);
/* Wipe context */
memset(sctx, 0, sizeof(*sctx));
kernel_neon_begin();
sha1_base_do_update(desc, data, len,
(sha1_block_fn *)sha1_transform_neon);
kernel_neon_end();
return 0;
}
static int sha1_neon_export(struct shash_desc *desc, void *out)
static int sha1_neon_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
{
struct sha1_state *sctx = shash_desc_ctx(desc);
if (!may_use_simd())
return sha1_finup_arm(desc, data, len, out);
memcpy(out, sctx, sizeof(*sctx));
kernel_neon_begin();
if (len)
sha1_base_do_update(desc, data, len,
(sha1_block_fn *)sha1_transform_neon);
sha1_base_do_finalize(desc, (sha1_block_fn *)sha1_transform_neon);
kernel_neon_end();
return 0;
return sha1_base_finish(desc, out);
}
static int sha1_neon_import(struct shash_desc *desc, const void *in)
static int sha1_neon_final(struct shash_desc *desc, u8 *out)
{
struct sha1_state *sctx = shash_desc_ctx(desc);
memcpy(sctx, in, sizeof(*sctx));
return 0;
return sha1_neon_finup(desc, NULL, 0, out);
}
static struct shash_alg alg = {
.digestsize = SHA1_DIGEST_SIZE,
.init = sha1_neon_init,
.init = sha1_base_init,
.update = sha1_neon_update,
.final = sha1_neon_final,
.export = sha1_neon_export,
.import = sha1_neon_import,
.finup = sha1_neon_finup,
.descsize = sizeof(struct sha1_state),
.statesize = sizeof(struct sha1_state),
.base = {
.cra_name = "sha1",
.cra_driver_name = "sha1-neon",
......
/*
* sha2-ce-core.S - SHA-224/256 secure hash using ARMv8 Crypto Extensions
*
* Copyright (C) 2015 Linaro Ltd.
* Author: Ard Biesheuvel <ard.biesheuvel@linaro.org>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <linux/linkage.h>
#include <asm/assembler.h>
.text
.fpu crypto-neon-fp-armv8
k0 .req q7
k1 .req q8
rk .req r3
ta0 .req q9
ta1 .req q10
tb0 .req q10
tb1 .req q9
dga .req q11
dgb .req q12
dg0 .req q13
dg1 .req q14
dg2 .req q15
.macro add_only, ev, s0
vmov dg2, dg0
.ifnb \s0
vld1.32 {k\ev}, [rk, :128]!
.endif
sha256h.32 dg0, dg1, tb\ev
sha256h2.32 dg1, dg2, tb\ev
.ifnb \s0
vadd.u32 ta\ev, q\s0, k\ev
.endif
.endm
.macro add_update, ev, s0, s1, s2, s3
sha256su0.32 q\s0, q\s1
add_only \ev, \s1
sha256su1.32 q\s0, q\s2, q\s3
.endm
.align 6
.Lsha256_rcon:
.word 0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5
.word 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5
.word 0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3
.word 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174
.word 0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc
.word 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da
.word 0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7
.word 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967
.word 0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13
.word 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85
.word 0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3
.word 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070
.word 0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5
.word 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3
.word 0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208
.word 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2
/*
* void sha2_ce_transform(struct sha256_state *sst, u8 const *src,
* int blocks);
*/
ENTRY(sha2_ce_transform)
/* load state */
vld1.32 {dga-dgb}, [r0]
/* load input */
0: vld1.32 {q0-q1}, [r1]!
vld1.32 {q2-q3}, [r1]!
subs r2, r2, #1
#ifndef CONFIG_CPU_BIG_ENDIAN
vrev32.8 q0, q0
vrev32.8 q1, q1
vrev32.8 q2, q2
vrev32.8 q3, q3
#endif
/* load first round constant */
adr rk, .Lsha256_rcon
vld1.32 {k0}, [rk, :128]!
vadd.u32 ta0, q0, k0
vmov dg0, dga
vmov dg1, dgb
add_update 1, 0, 1, 2, 3
add_update 0, 1, 2, 3, 0
add_update 1, 2, 3, 0, 1
add_update 0, 3, 0, 1, 2
add_update 1, 0, 1, 2, 3
add_update 0, 1, 2, 3, 0
add_update 1, 2, 3, 0, 1
add_update 0, 3, 0, 1, 2
add_update 1, 0, 1, 2, 3
add_update 0, 1, 2, 3, 0
add_update 1, 2, 3, 0, 1
add_update 0, 3, 0, 1, 2
add_only 1, 1
add_only 0, 2
add_only 1, 3
add_only 0
/* update state */
vadd.u32 dga, dga, dg0
vadd.u32 dgb, dgb, dg1
bne 0b
/* store new state */
vst1.32 {dga-dgb}, [r0]
bx lr
ENDPROC(sha2_ce_transform)
/*
* sha2-ce-glue.c - SHA-224/SHA-256 using ARMv8 Crypto Extensions
*
* Copyright (C) 2015 Linaro Ltd <ard.biesheuvel@linaro.org>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <crypto/internal/hash.h>
#include <crypto/sha.h>
#include <crypto/sha256_base.h>
#include <linux/crypto.h>
#include <linux/module.h>
#include <asm/hwcap.h>
#include <asm/simd.h>
#include <asm/neon.h>
#include <asm/unaligned.h>
#include "sha256_glue.h"
MODULE_DESCRIPTION("SHA-224/SHA-256 secure hash using ARMv8 Crypto Extensions");
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
MODULE_LICENSE("GPL v2");
asmlinkage void sha2_ce_transform(struct sha256_state *sst, u8 const *src,
int blocks);
static int sha2_ce_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
struct sha256_state *sctx = shash_desc_ctx(desc);
if (!may_use_simd() ||
(sctx->count % SHA256_BLOCK_SIZE) + len < SHA256_BLOCK_SIZE)
return crypto_sha256_arm_update(desc, data, len);
kernel_neon_begin();
sha256_base_do_update(desc, data, len,
(sha256_block_fn *)sha2_ce_transform);
kernel_neon_end();
return 0;
}
static int sha2_ce_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
{
if (!may_use_simd())
return crypto_sha256_arm_finup(desc, data, len, out);
kernel_neon_begin();
if (len)
sha256_base_do_update(desc, data, len,
(sha256_block_fn *)sha2_ce_transform);
sha256_base_do_finalize(desc, (sha256_block_fn *)sha2_ce_transform);
kernel_neon_end();
return sha256_base_finish(desc, out);
}
static int sha2_ce_final(struct shash_desc *desc, u8 *out)
{
return sha2_ce_finup(desc, NULL, 0, out);
}
static struct shash_alg algs[] = { {
.init = sha224_base_init,
.update = sha2_ce_update,
.final = sha2_ce_final,
.finup = sha2_ce_finup,
.descsize = sizeof(struct sha256_state),
.digestsize = SHA224_DIGEST_SIZE,
.base = {
.cra_name = "sha224",
.cra_driver_name = "sha224-ce",
.cra_priority = 300,
.cra_flags = CRYPTO_ALG_TYPE_SHASH,
.cra_blocksize = SHA256_BLOCK_SIZE,
.cra_module = THIS_MODULE,
}
}, {
.init = sha256_base_init,
.update = sha2_ce_update,
.final = sha2_ce_final,
.finup = sha2_ce_finup,
.descsize = sizeof(struct sha256_state),
.digestsize = SHA256_DIGEST_SIZE,
.base = {
.cra_name = "sha256",
.cra_driver_name = "sha256-ce",
.cra_priority = 300,
.cra_flags = CRYPTO_ALG_TYPE_SHASH,
.cra_blocksize = SHA256_BLOCK_SIZE,
.cra_module = THIS_MODULE,
}
} };
static int __init sha2_ce_mod_init(void)
{
if (!(elf_hwcap2 & HWCAP2_SHA2))
return -ENODEV;
return crypto_register_shashes(algs, ARRAY_SIZE(algs));
}
static void __exit sha2_ce_mod_fini(void)
{
crypto_unregister_shashes(algs, ARRAY_SIZE(algs));
}
module_init(sha2_ce_mod_init);
module_exit(sha2_ce_mod_fini);
/*
* Glue code for the SHA256 Secure Hash Algorithm assembly implementation
* using optimized ARM assembler and NEON instructions.
*
* Copyright 2015 Google Inc.
*
* This file is based on sha256_ssse3_glue.c:
* Copyright (C) 2013 Intel Corporation
* Author: Tim Chen <tim.c.chen@linux.intel.com>
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License as published by the Free
* Software Foundation; either version 2 of the License, or (at your option)
* any later version.
*
*/
#include <crypto/internal/hash.h>
#include <linux/crypto.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/mm.h>
#include <linux/cryptohash.h>
#include <linux/types.h>
#include <linux/string.h>
#include <crypto/sha.h>
#include <crypto/sha256_base.h>
#include <asm/simd.h>
#include <asm/neon.h>
#include "sha256_glue.h"
asmlinkage void sha256_block_data_order(u32 *digest, const void *data,
unsigned int num_blks);
int crypto_sha256_arm_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
/* make sure casting to sha256_block_fn() is safe */
BUILD_BUG_ON(offsetof(struct sha256_state, state) != 0);
return sha256_base_do_update(desc, data, len,
(sha256_block_fn *)sha256_block_data_order);
}
EXPORT_SYMBOL(crypto_sha256_arm_update);
static int sha256_final(struct shash_desc *desc, u8 *out)
{
sha256_base_do_finalize(desc,
(sha256_block_fn *)sha256_block_data_order);
return sha256_base_finish(desc, out);
}
int crypto_sha256_arm_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
{
sha256_base_do_update(desc, data, len,
(sha256_block_fn *)sha256_block_data_order);
return sha256_final(desc, out);
}
EXPORT_SYMBOL(crypto_sha256_arm_finup);
static struct shash_alg algs[] = { {
.digestsize = SHA256_DIGEST_SIZE,
.init = sha256_base_init,
.update = crypto_sha256_arm_update,
.final = sha256_final,
.finup = crypto_sha256_arm_finup,
.descsize = sizeof(struct sha256_state),
.base = {
.cra_name = "sha256",
.cra_driver_name = "sha256-asm",
.cra_priority = 150,
.cra_flags = CRYPTO_ALG_TYPE_SHASH,
.cra_blocksize = SHA256_BLOCK_SIZE,
.cra_module = THIS_MODULE,
}
}, {
.digestsize = SHA224_DIGEST_SIZE,
.init = sha224_base_init,
.update = crypto_sha256_arm_update,
.final = sha256_final,
.finup = crypto_sha256_arm_finup,
.descsize = sizeof(struct sha256_state),
.base = {
.cra_name = "sha224",
.cra_driver_name = "sha224-asm",
.cra_priority = 150,
.cra_flags = CRYPTO_ALG_TYPE_SHASH,
.cra_blocksize = SHA224_BLOCK_SIZE,
.cra_module = THIS_MODULE,
}
} };
static int __init sha256_mod_init(void)
{
int res = crypto_register_shashes(algs, ARRAY_SIZE(algs));
if (res < 0)
return res;
if (IS_ENABLED(CONFIG_KERNEL_MODE_NEON) && cpu_has_neon()) {
res = crypto_register_shashes(sha256_neon_algs,
ARRAY_SIZE(sha256_neon_algs));
if (res < 0)
crypto_unregister_shashes(algs, ARRAY_SIZE(algs));
}
return res;
}
static void __exit sha256_mod_fini(void)
{
crypto_unregister_shashes(algs, ARRAY_SIZE(algs));
if (IS_ENABLED(CONFIG_KERNEL_MODE_NEON) && cpu_has_neon())
crypto_unregister_shashes(sha256_neon_algs,
ARRAY_SIZE(sha256_neon_algs));
}
module_init(sha256_mod_init);
module_exit(sha256_mod_fini);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("SHA256 Secure Hash Algorithm (ARM), including NEON");
MODULE_ALIAS_CRYPTO("sha256");
#ifndef _CRYPTO_SHA256_GLUE_H
#define _CRYPTO_SHA256_GLUE_H
#include <linux/crypto.h>
extern struct shash_alg sha256_neon_algs[2];
int crypto_sha256_arm_update(struct shash_desc *desc, const u8 *data,
unsigned int len);
int crypto_sha256_arm_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *hash);
#endif /* _CRYPTO_SHA256_GLUE_H */
/*
* Glue code for the SHA256 Secure Hash Algorithm assembly implementation
* using NEON instructions.
*
* Copyright 2015 Google Inc.
*
* This file is based on sha512_neon_glue.c:
* Copyright 2014 Jussi Kivilinna <jussi.kivilinna@iki.fi>
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License as published by the Free
* Software Foundation; either version 2 of the License, or (at your option)
* any later version.
*
*/
#include <crypto/internal/hash.h>
#include <linux/cryptohash.h>
#include <linux/types.h>
#include <linux/string.h>
#include <crypto/sha.h>
#include <crypto/sha256_base.h>
#include <asm/byteorder.h>
#include <asm/simd.h>
#include <asm/neon.h>
#include "sha256_glue.h"
asmlinkage void sha256_block_data_order_neon(u32 *digest, const void *data,
unsigned int num_blks);
static int sha256_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
struct sha256_state *sctx = shash_desc_ctx(desc);
if (!may_use_simd() ||
(sctx->count % SHA256_BLOCK_SIZE) + len < SHA256_BLOCK_SIZE)
return crypto_sha256_arm_update(desc, data, len);
kernel_neon_begin();
sha256_base_do_update(desc, data, len,
(sha256_block_fn *)sha256_block_data_order_neon);
kernel_neon_end();
return 0;
}
static int sha256_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
{
if (!may_use_simd())
return crypto_sha256_arm_finup(desc, data, len, out);
kernel_neon_begin();
if (len)
sha256_base_do_update(desc, data, len,
(sha256_block_fn *)sha256_block_data_order_neon);
sha256_base_do_finalize(desc,
(sha256_block_fn *)sha256_block_data_order_neon);
kernel_neon_end();
return sha256_base_finish(desc, out);
}
static int sha256_final(struct shash_desc *desc, u8 *out)
{
return sha256_finup(desc, NULL, 0, out);
}
struct shash_alg sha256_neon_algs[] = { {
.digestsize = SHA256_DIGEST_SIZE,
.init = sha256_base_init,
.update = sha256_update,
.final = sha256_final,
.finup = sha256_finup,
.descsize = sizeof(struct sha256_state),
.base = {
.cra_name = "sha256",
.cra_driver_name = "sha256-neon",
.cra_priority = 250,
.cra_flags = CRYPTO_ALG_TYPE_SHASH,
.cra_blocksize = SHA256_BLOCK_SIZE,
.cra_module = THIS_MODULE,
}
}, {
.digestsize = SHA224_DIGEST_SIZE,
.init = sha224_base_init,
.update = sha256_update,
.final = sha256_final,
.finup = sha256_finup,
.descsize = sizeof(struct sha256_state),
.base = {
.cra_name = "sha224",
.cra_driver_name = "sha224-neon",
.cra_priority = 250,
.cra_flags = CRYPTO_ALG_TYPE_SHASH,
.cra_blocksize = SHA224_BLOCK_SIZE,
.cra_module = THIS_MODULE,
}
} };
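
The three ARM SHA-256 drivers in this diff register under the same `cra_name` with different `cra_priority` values (150 for `sha256-asm`, 250 for `sha256-neon`, 300 for `sha256-ce`). When a caller asks for an algorithm by name, the crypto API prefers the registered implementation with the highest priority. The sketch below models only that selection rule in user space. `struct model_alg` and `pick_alg()` are hypothetical and are not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for the .base fields used above. */
struct model_alg {
	const char *name;	/* cra_name */
	const char *driver;	/* cra_driver_name */
	int priority;		/* cra_priority */
};

/* Return the highest-priority entry matching name, or NULL if none matches. */
const struct model_alg *pick_alg(const struct model_alg *algs, size_t n,
				 const char *name)
{
	const struct model_alg *best = NULL;
	size_t i;

	assert(name != NULL);

	for (i = 0; i < n; i++) {
		if (strcmp(algs[i].name, name) != 0)
			continue;
		if (!best || algs[i].priority > best->priority)
			best = &algs[i];
	}
	return best;
}
```

With all three drivers present, a request for "sha256" resolves to `sha256-ce` in this model. On a CPU without the Crypto Extensions the `-ce` module returns `-ENODEV` at init and never registers, so the NEON variant would win instead.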
......@@ -284,7 +284,8 @@ static struct crypto_alg aes_algs[] = { {
.cra_name = "__ecb-aes-" MODE,
.cra_driver_name = "__driver-ecb-aes-" MODE,
.cra_priority = 0,
.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER,
.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER |
CRYPTO_ALG_INTERNAL,
.cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct crypto_aes_ctx),
.cra_alignmask = 7,
......@@ -302,7 +303,8 @@ static struct crypto_alg aes_algs[] = { {
.cra_name = "__cbc-aes-" MODE,
.cra_driver_name = "__driver-cbc-aes-" MODE,
.cra_priority = 0,
.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER,
.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER |
CRYPTO_ALG_INTERNAL,
.cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct crypto_aes_ctx),
.cra_alignmask = 7,
......@@ -320,7 +322,8 @@ static struct crypto_alg aes_algs[] = { {
.cra_name = "__ctr-aes-" MODE,
.cra_driver_name = "__driver-ctr-aes-" MODE,
.cra_priority = 0,
.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER,
.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER |
CRYPTO_ALG_INTERNAL,
.cra_blocksize = 1,
.cra_ctxsize = sizeof(struct crypto_aes_ctx),
.cra_alignmask = 7,
......@@ -338,7 +341,8 @@ static struct crypto_alg aes_algs[] = { {
.cra_name = "__xts-aes-" MODE,
.cra_driver_name = "__driver-xts-aes-" MODE,
.cra_priority = 0,
.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER,
.cra_flags = CRYPTO_ALG_TYPE_BLKCIPHER |
CRYPTO_ALG_INTERNAL,
.cra_blocksize = AES_BLOCK_SIZE,
.cra_ctxsize = sizeof(struct crypto_aes_xts_ctx),
.cra_alignmask = 7,
......
......@@ -66,8 +66,8 @@
.word 0x5a827999, 0x6ed9eba1, 0x8f1bbcdc, 0xca62c1d6
/*
* void sha1_ce_transform(int blocks, u8 const *src, u32 *state,
* u8 *head, long bytes)
* void sha1_ce_transform(struct sha1_ce_state *sst, u8 const *src,
* int blocks)
*/
ENTRY(sha1_ce_transform)
/* load round constants */
......@@ -78,25 +78,22 @@ ENTRY(sha1_ce_transform)
ld1r {k3.4s}, [x6]
/* load state */
ldr dga, [x2]
ldr dgb, [x2, #16]
ldr dga, [x0]
ldr dgb, [x0, #16]
/* load partial state (if supplied) */
cbz x3, 0f
ld1 {v8.4s-v11.4s}, [x3]
b 1f
/* load sha1_ce_state::finalize */
ldr w4, [x0, #:lo12:sha1_ce_offsetof_finalize]
/* load input */
0: ld1 {v8.4s-v11.4s}, [x1], #64
sub w0, w0, #1
sub w2, w2, #1
1:
CPU_LE( rev32 v8.16b, v8.16b )
CPU_LE( rev32 v9.16b, v9.16b )
CPU_LE( rev32 v10.16b, v10.16b )
CPU_LE( rev32 v11.16b, v11.16b )
2: add t0.4s, v8.4s, k0.4s
1: add t0.4s, v8.4s, k0.4s
mov dg0v.16b, dgav.16b
add_update c, ev, k0, 8, 9, 10, 11, dgb
......@@ -127,15 +124,15 @@ CPU_LE( rev32 v11.16b, v11.16b )
add dgbv.2s, dgbv.2s, dg1v.2s
add dgav.4s, dgav.4s, dg0v.4s
cbnz w0, 0b
cbnz w2, 0b
/*
* Final block: add padding and total bit count.
* Skip if we have no total byte count in x4. In that case, the input
* size was not a round multiple of the block size, and the padding is
* handled by the C code.
* Skip if the input size was not a round multiple of the block size,
* the padding is handled by the C code in that case.
*/
cbz x4, 3f
ldr x4, [x0, #:lo12:sha1_ce_offsetof_count]
movi v9.2d, #0
mov x8, #0x80000000
movi v10.2d, #0
......@@ -144,10 +141,10 @@ CPU_LE( rev32 v11.16b, v11.16b )
mov x4, #0
mov v11.d[0], xzr
mov v11.d[1], x7
b 2b
b 1b
/* store new state */
3: str dga, [x2]
str dgb, [x2, #16]
3: str dga, [x0]
str dgb, [x0, #16]
ret
ENDPROC(sha1_ce_transform)
......@@ -12,144 +12,81 @@
#include <asm/unaligned.h>
#include <crypto/internal/hash.h>
#include <crypto/sha.h>
#include <crypto/sha1_base.h>
#include <linux/cpufeature.h>
#include <linux/crypto.h>
#include <linux/module.h>
#define ASM_EXPORT(sym, val) \
asm(".globl " #sym "; .set " #sym ", %0" :: "I"(val));
MODULE_DESCRIPTION("SHA1 secure hash using ARMv8 Crypto Extensions");
MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>");
MODULE_LICENSE("GPL v2");
asmlinkage void sha1_ce_transform(int blocks, u8 const *src, u32 *state,
u8 *head, long bytes);
struct sha1_ce_state {
struct sha1_state sst;
u32 finalize;
};
static int sha1_init(struct shash_desc *desc)
{
struct sha1_state *sctx = shash_desc_ctx(desc);
asmlinkage void sha1_ce_transform(struct sha1_ce_state *sst, u8 const *src,
int blocks);
*sctx = (struct sha1_state){
.state = { SHA1_H0, SHA1_H1, SHA1_H2, SHA1_H3, SHA1_H4 },
};
return 0;
}
static int sha1_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
static int sha1_ce_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
struct sha1_state *sctx = shash_desc_ctx(desc);
unsigned int partial = sctx->count % SHA1_BLOCK_SIZE;
sctx->count += len;
if ((partial + len) >= SHA1_BLOCK_SIZE) {
int blocks;
if (partial) {
int p = SHA1_BLOCK_SIZE - partial;
struct sha1_ce_state *sctx = shash_desc_ctx(desc);
memcpy(sctx->buffer + partial, data, p);
data += p;
len -= p;
}
blocks = len / SHA1_BLOCK_SIZE;
len %= SHA1_BLOCK_SIZE;
kernel_neon_begin_partial(16);
sha1_ce_transform(blocks, data, sctx->state,
partial ? sctx->buffer : NULL, 0);
kernel_neon_end();
sctx->finalize = 0;
kernel_neon_begin_partial(16);
sha1_base_do_update(desc, data, len,
(sha1_block_fn *)sha1_ce_transform);
kernel_neon_end();
data += blocks * SHA1_BLOCK_SIZE;
partial = 0;
}
if (len)
memcpy(sctx->buffer + partial, data, len);
return 0;
}
static int sha1_final(struct shash_desc *desc, u8 *out)
static int sha1_ce_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
{
static const u8 padding[SHA1_BLOCK_SIZE] = { 0x80, };
struct sha1_ce_state *sctx = shash_desc_ctx(desc);
bool finalize = !sctx->sst.count && !(len % SHA1_BLOCK_SIZE);
struct sha1_state *sctx = shash_desc_ctx(desc);
__be64 bits = cpu_to_be64(sctx->count << 3);
__be32 *dst = (__be32 *)out;
int i;
u32 padlen = SHA1_BLOCK_SIZE
- ((sctx->count + sizeof(bits)) % SHA1_BLOCK_SIZE);
sha1_update(desc, padding, padlen);
sha1_update(desc, (const u8 *)&bits, sizeof(bits));
for (i = 0; i < SHA1_DIGEST_SIZE / sizeof(__be32); i++)
put_unaligned_be32(sctx->state[i], dst++);
*sctx = (struct sha1_state){};
return 0;
}
static int sha1_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
{
struct sha1_state *sctx = shash_desc_ctx(desc);
__be32 *dst = (__be32 *)out;
int blocks;
int i;
if (sctx->count || !len || (len % SHA1_BLOCK_SIZE)) {
sha1_update(desc, data, len);
return sha1_final(desc, out);
}
ASM_EXPORT(sha1_ce_offsetof_count,
offsetof(struct sha1_ce_state, sst.count));
ASM_EXPORT(sha1_ce_offsetof_finalize,
offsetof(struct sha1_ce_state, finalize));
/*
* Use a fast path if the input is a multiple of 64 bytes. In
* this case, there is no need to copy data around, and we can
* perform the entire digest calculation in a single invocation
* of sha1_ce_transform()
* Allow the asm code to perform the finalization if there is no
* partial data and the input is a round multiple of the block size.
*/
blocks = len / SHA1_BLOCK_SIZE;
sctx->finalize = finalize;
kernel_neon_begin_partial(16);
sha1_ce_transform(blocks, data, sctx->state, NULL, len);
sha1_base_do_update(desc, data, len,
(sha1_block_fn *)sha1_ce_transform);
if (!finalize)
sha1_base_do_finalize(desc, (sha1_block_fn *)sha1_ce_transform);
kernel_neon_end();
for (i = 0; i < SHA1_DIGEST_SIZE / sizeof(__be32); i++)
put_unaligned_be32(sctx->state[i], dst++);
*sctx = (struct sha1_state){};
return 0;
return sha1_base_finish(desc, out);
}
static int sha1_export(struct shash_desc *desc, void *out)
static int sha1_ce_final(struct shash_desc *desc, u8 *out)
{
struct sha1_state *sctx = shash_desc_ctx(desc);
struct sha1_state *dst = out;
*dst = *sctx;
return 0;
}
static int sha1_import(struct shash_desc *desc, const void *in)
{
struct sha1_state *sctx = shash_desc_ctx(desc);
struct sha1_state const *src = in;
*sctx = *src;
return 0;
kernel_neon_begin_partial(16);
sha1_base_do_finalize(desc, (sha1_block_fn *)sha1_ce_transform);
kernel_neon_end();
return sha1_base_finish(desc, out);
}
static struct shash_alg alg = {
.init = sha1_init,
.update = sha1_update,
.final = sha1_final,
.finup = sha1_finup,
.export = sha1_export,
.import = sha1_import,
.descsize = sizeof(struct sha1_state),
.init = sha1_base_init,
.update = sha1_ce_update,
.final = sha1_ce_final,
.finup = sha1_ce_finup,
.descsize = sizeof(struct sha1_ce_state),
.digestsize = SHA1_DIGEST_SIZE,
.statesize = sizeof(struct sha1_state),
.base = {
.cra_name = "sha1",
.cra_driver_name = "sha1-ce",
......
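
In the new arm64 `sha1_ce_finup()`, the `finalize` flag decides whether the assembly routine appends the padding itself or leaves it to `sha1_base_do_finalize()`. The predicate can be sketched in isolation as follows. `asm_may_finalize()` is a hypothetical name, and `SHA1_BLOCK_SIZE` is redefined locally as an assumption so the snippet builds outside the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SHA1_BLOCK_SIZE 64u	/* local redefinition for this sketch */

/*
 * Mirrors: !sctx->sst.count && !(len % SHA1_BLOCK_SIZE)
 * count is the number of bytes hashed before this call,
 * len is the number of bytes passed to finup.
 */
bool asm_may_finalize(uint64_t count, unsigned int len)
{
	bool finalize = !count && !(len % SHA1_BLOCK_SIZE);

	/* finalize implies nothing was buffered and len is block-aligned */
	assert(!finalize || (count == 0 && len % SHA1_BLOCK_SIZE == 0));
	return finalize;
}
```

Note that the expression as written is also true for `len == 0` on a fresh context, since zero is a multiple of the block size.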
......@@ -73,8 +73,8 @@
.word 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2
/*
* void sha2_ce_transform(int blocks, u8 const *src, u32 *state,
* u8 *head, long bytes)
* void sha2_ce_transform(struct sha256_ce_state *sst, u8 const *src,
* int blocks)
*/
ENTRY(sha2_ce_transform)
/* load round constants */
......@@ -85,24 +85,21 @@ ENTRY(sha2_ce_transform)
ld1 {v12.4s-v15.4s}, [x8]
/* load state */
ldp dga, dgb, [x2]
ldp dga, dgb, [x0]
/* load partial input (if supplied) */
cbz x3, 0f
ld1 {v16.4s-v19.4s}, [x3]
b 1f
/* load sha256_ce_state::finalize */
ldr w4, [x0, #:lo12:sha256_ce_offsetof_finalize]
/* load input */
0: ld1 {v16.4s-v19.4s}, [x1], #64
sub w0, w0, #1
sub w2, w2, #1
1:
CPU_LE( rev32 v16.16b, v16.16b )
CPU_LE( rev32 v17.16b, v17.16b )
CPU_LE( rev32 v18.16b, v18.16b )
CPU_LE( rev32 v19.16b, v19.16b )
2: add t0.4s, v16.4s, v0.4s
1: add t0.4s, v16.4s, v0.4s
mov dg0v.16b, dgav.16b
mov dg1v.16b, dgbv.16b
......@@ -131,15 +128,15 @@ CPU_LE( rev32 v19.16b, v19.16b )
add dgbv.4s, dgbv.4s, dg1v.4s
/* handled all input blocks? */
cbnz w0, 0b
cbnz w2, 0b
/*
* Final block: add padding and total bit count.
* Skip if we have no total byte count in x4. In that case, the input
* size was not a round multiple of the block size, and the padding is
* handled by the C code.
* Skip if the input size was not a round multiple of the block size,
* the padding is handled by the C code in that case.
*/
cbz x4, 3f
ldr x4, [x0, #:lo12:sha256_ce_offsetof_count]
movi v17.2d, #0
mov x8, #0x80000000
movi v18.2d, #0
......@@ -148,9 +145,9 @@ CPU_LE( rev32 v19.16b, v19.16b )
mov x4, #0
mov v19.d[0], xzr
mov v19.d[1], x7
b 2b
b 1b
/* store new state */
3: stp dga, dgb, [x2]
3: stp dga, dgb, [x0]
ret
ENDPROC(sha2_ce_transform)
......@@ -4,4 +4,7 @@
obj-y += octeon-crypto.o
obj-$(CONFIG_CRYPTO_MD5_OCTEON) += octeon-md5.o
obj-$(CONFIG_CRYPTO_MD5_OCTEON) += octeon-md5.o
obj-$(CONFIG_CRYPTO_SHA1_OCTEON) += octeon-sha1.o
obj-$(CONFIG_CRYPTO_SHA256_OCTEON) += octeon-sha256.o
obj-$(CONFIG_CRYPTO_SHA512_OCTEON) += octeon-sha512.o
......@@ -17,7 +17,7 @@
* crypto operations in calls to octeon_crypto_enable/disable in order to make
* sure the state of COP2 isn't corrupted if userspace is also performing
* hardware crypto operations. Allocate the state parameter on the stack.
* Preemption must be disabled to prevent context switches.
* Returns with preemption disabled.
*
* @state: Pointer to state structure to store current COP2 state in.
*
......@@ -28,6 +28,7 @@ unsigned long octeon_crypto_enable(struct octeon_cop2_state *state)
int status;
unsigned long flags;
preempt_disable();
local_irq_save(flags);
status = read_c0_status();
write_c0_status(status | ST0_CU2);
......@@ -62,5 +63,6 @@ void octeon_crypto_disable(struct octeon_cop2_state *state,
else
write_c0_status(read_c0_status() & ~ST0_CU2);
local_irq_restore(flags);
+preempt_enable();
}
EXPORT_SYMBOL_GPL(octeon_crypto_disable);
......@@ -97,8 +97,6 @@ static int octeon_md5_update(struct shash_desc *desc, const u8 *data,
memcpy((char *)mctx->block + (sizeof(mctx->block) - avail), data,
avail);
-local_bh_disable();
-preempt_disable();
flags = octeon_crypto_enable(&state);
octeon_md5_store_hash(mctx);
......@@ -114,8 +112,6 @@ static int octeon_md5_update(struct shash_desc *desc, const u8 *data,
octeon_md5_read_hash(mctx);
octeon_crypto_disable(&state, flags);
-preempt_enable();
-local_bh_enable();
memcpy(mctx->block, data, len);
......@@ -133,8 +129,6 @@ static int octeon_md5_final(struct shash_desc *desc, u8 *out)
*p++ = 0x80;
-local_bh_disable();
-preempt_disable();
flags = octeon_crypto_enable(&state);
octeon_md5_store_hash(mctx);
......@@ -152,8 +146,6 @@ static int octeon_md5_final(struct shash_desc *desc, u8 *out)
octeon_md5_read_hash(mctx);
octeon_crypto_disable(&state, flags);
-preempt_enable();
-local_bh_enable();
memcpy(out, mctx->hash, sizeof(mctx->hash));
memset(mctx, 0, sizeof(*mctx));
......
This diff is collapsed.
......@@ -1258,20 +1258,6 @@
#define M2M_SRCID_REG(x) ((x) * 0x40 + 0x14)
#define M2M_DSTID_REG(x) ((x) * 0x40 + 0x18)
-/*************************************************************************
- * _REG relative to RSET_RNG
- *************************************************************************/
-#define RNG_CTRL 0x00
-#define RNG_EN (1 << 0)
-#define RNG_STAT 0x04
-#define RNG_AVAIL_MASK (0xff000000)
-#define RNG_DATA 0x08
-#define RNG_THRES 0x0c
-#define RNG_MASK 0x10
/*************************************************************************
* _REG relative to RSET_SPI
*************************************************************************/
......
......@@ -4,6 +4,14 @@
# Arch-specific CryptoAPI modules.
#
+obj-$(CONFIG_CRYPTO_AES_PPC_SPE) += aes-ppc-spe.o
+obj-$(CONFIG_CRYPTO_MD5_PPC) += md5-ppc.o
obj-$(CONFIG_CRYPTO_SHA1_PPC) += sha1-powerpc.o
+obj-$(CONFIG_CRYPTO_SHA1_PPC_SPE) += sha1-ppc-spe.o
+obj-$(CONFIG_CRYPTO_SHA256_PPC_SPE) += sha256-ppc-spe.o
+aes-ppc-spe-y := aes-spe-core.o aes-spe-keys.o aes-tab-4k.o aes-spe-modes.o aes-spe-glue.o
+md5-ppc-y := md5-asm.o md5-glue.o
sha1-powerpc-y := sha1-powerpc-asm.o sha1.o
+sha1-ppc-spe-y := sha1-spe-asm.o sha1-spe-glue.o
+sha256-ppc-spe-y := sha256-spe-asm.o sha256-spe-glue.o
This diff is collapsed.
......@@ -232,7 +232,6 @@ static void glue_ctr_crypt_final_128bit(const common_glue_ctr_func_t fn_ctr,
le128_to_be128((be128 *)walk->iv, &ctrblk);
}
-EXPORT_SYMBOL_GPL(glue_ctr_crypt_final_128bit);
static unsigned int __glue_ctr_crypt_128bit(const struct common_glue_ctx *gctx,
struct blkcipher_desc *desc,
......
This diff is collapsed.