Commit 93e220a6 authored by Linus Torvalds's avatar Linus Torvalds

Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto updates from Herbert Xu:
 "API:
   - hwrng core now credits for low-quality RNG devices.

  Algorithms:
   - Optimisations for neon aes on arm/arm64.
   - Add accelerated crc32_be on arm64.
   - Add ffdheXYZ(dh) templates.
   - Disallow hmac keys < 112 bits in FIPS mode.
   - Add AVX assembly implementation for sm3 on x86.

  Drivers:
   - Add missing local_bh_disable calls for crypto_engine callback.
   - Ensure BH is disabled in crypto_engine callback path.
   - Fix zero length DMA mappings in ccree.
   - Add synchronization between mailbox accesses in octeontx2.
   - Add Xilinx SHA3 driver.
   - Add support for the TDES IP available on sama7g5 SoC in atmel"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (137 commits)
  crypto: xilinx - Turn SHA into a tristate and allow COMPILE_TEST
  MAINTAINERS: update HPRE/SEC2/TRNG driver maintainers list
  crypto: dh - Remove the unused function dh_safe_prime_dh_alg()
  hwrng: nomadik - Change clk_disable to clk_disable_unprepare
  crypto: arm64 - cleanup comments
  crypto: qat - fix initialization of pfvf rts_map_msg structures
  crypto: qat - fix initialization of pfvf cap_msg structures
  crypto: qat - remove unneeded assignment
  crypto: qat - disable registration of algorithms
  crypto: hisilicon/qm - fix memset during queues clearing
  crypto: xilinx: prevent probing on non-xilinx hardware
  crypto: marvell/octeontx - Use swap() instead of open coding it
  crypto: ccree - Fix use after free in cc_cipher_exit()
  crypto: ccp - ccp_dmaengine_unregister release dma channels
  crypto: octeontx2 - fix missing unlock
  hwrng: cavium - fix NULL but dereferenced coccicheck error
  crypto: cavium/nitrox - don't cast parameter in bit operations
  crypto: vmx - add missing dependencies
  MAINTAINERS: Add maintainer for Xilinx ZynqMP SHA3 driver
  crypto: xilinx - Add Xilinx SHA3 driver
  ...
parents 5628b8de 0e03b8fd
This diff is collapsed.
What: /sys/kernel/debug/hisi_sec2/<bdf>/clear_enable
Date: Oct 2019
Contact: linux-crypto@vger.kernel.org
Description: Enabling/disabling of clear action after reading
What: /sys/kernel/debug/hisi_sec2/<bdf>/clear_enable
Date: Oct 2019
Contact: linux-crypto@vger.kernel.org
Description: Enabling/disabling of clear action after reading
the SEC debug registers.
0: disable, 1: enable.
Only available for PF, and take no other effect on SEC.
What: /sys/kernel/debug/hisi_sec2/<bdf>/current_qm
Date: Oct 2019
Contact: linux-crypto@vger.kernel.org
Description: One SEC controller has one PF and multiple VFs, each function
What: /sys/kernel/debug/hisi_sec2/<bdf>/current_qm
Date: Oct 2019
Contact: linux-crypto@vger.kernel.org
Description: One SEC controller has one PF and multiple VFs, each function
has a QM. This file can be used to select the QM which below
qm refers to.
Only available for PF.
What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/qm_regs
Date: Oct 2019
Contact: linux-crypto@vger.kernel.org
Description: Dump of QM related debug registers.
What: /sys/kernel/debug/hisi_sec2/<bdf>/alg_qos
Date: Jun 2021
Contact: linux-crypto@vger.kernel.org
Description: The <bdf> is related the function for PF and VF.
SEC driver supports to configure each function's QoS, the driver
supports to write <bdf> value to alg_qos in the host. Such as
"echo <bdf> value > alg_qos". The qos value is 1~1000, means
1/1000~1000/1000 of total QoS. The driver reading alg_qos to
get related QoS in the host and VM, Such as "cat alg_qos".
What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/qm_regs
Date: Oct 2019
Contact: linux-crypto@vger.kernel.org
Description: Dump of QM related debug registers.
Available for PF and VF in host. VF in guest currently only
has one debug register.
What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/current_q
Date: Oct 2019
Contact: linux-crypto@vger.kernel.org
Description: One QM of SEC may contain multiple queues. Select specific
What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/current_q
Date: Oct 2019
Contact: linux-crypto@vger.kernel.org
Description: One QM of SEC may contain multiple queues. Select specific
queue to show its debug registers in above 'regs'.
Only available for PF.
What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/clear_enable
Date: Oct 2019
Contact: linux-crypto@vger.kernel.org
Description: Enabling/disabling of clear action after reading
What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/clear_enable
Date: Oct 2019
Contact: linux-crypto@vger.kernel.org
Description: Enabling/disabling of clear action after reading
the SEC's QM debug registers.
0: disable, 1: enable.
Only available for PF, and take no other effect on SEC.
What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/err_irq
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the number of invalid interrupts for
What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/err_irq
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the number of invalid interrupts for
QM task completion.
Available for both PF and VF, and take no other effect on SEC.
What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/aeq_irq
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the number of QM async event queue interrupts.
What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/aeq_irq
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the number of QM async event queue interrupts.
Available for both PF and VF, and take no other effect on SEC.
What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/abnormal_irq
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the number of interrupts for QM abnormal event.
What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/abnormal_irq
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the number of interrupts for QM abnormal event.
Available for both PF and VF, and take no other effect on SEC.
What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/create_qp_err
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the number of queue allocation errors.
What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/create_qp_err
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the number of queue allocation errors.
Available for both PF and VF, and take no other effect on SEC.
What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/mb_err
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the number of failed QM mailbox commands.
What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/mb_err
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the number of failed QM mailbox commands.
Available for both PF and VF, and take no other effect on SEC.
What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/status
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the status of the QM.
What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/status
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the status of the QM.
Four states: initiated, started, stopped and closed.
Available for both PF and VF, and take no other effect on SEC.
What: /sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/send_cnt
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the total number of sent requests.
What: /sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/send_cnt
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the total number of sent requests.
Available for both PF and VF, and take no other effect on SEC.
What: /sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/recv_cnt
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the total number of received requests.
What: /sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/recv_cnt
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the total number of received requests.
Available for both PF and VF, and take no other effect on SEC.
What: /sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/send_busy_cnt
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the total number of requests sent with returning busy.
What: /sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/send_busy_cnt
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the total number of requests sent with returning busy.
Available for both PF and VF, and take no other effect on SEC.
What: /sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/err_bd_cnt
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the total number of BD type error requests
What: /sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/err_bd_cnt
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the total number of BD type error requests
to be received.
Available for both PF and VF, and take no other effect on SEC.
What: /sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/invalid_req_cnt
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the total number of invalid requests being received.
What: /sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/invalid_req_cnt
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the total number of invalid requests being received.
Available for both PF and VF, and take no other effect on SEC.
What: /sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/done_flag_cnt
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the total number of completed but marked error requests
What: /sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/done_flag_cnt
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the total number of completed but marked error requests
to be received.
Available for both PF and VF, and take no other effect on SEC.
What: /sys/kernel/debug/hisi_zip/<bdf>/comp_core[01]/regs
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org
Description: Dump of compression cores related debug registers.
What: /sys/kernel/debug/hisi_zip/<bdf>/comp_core[01]/regs
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org
Description: Dump of compression cores related debug registers.
Only available for PF.
What: /sys/kernel/debug/hisi_zip/<bdf>/decomp_core[0-5]/regs
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org
Description: Dump of decompression cores related debug registers.
What: /sys/kernel/debug/hisi_zip/<bdf>/decomp_core[0-5]/regs
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org
Description: Dump of decompression cores related debug registers.
Only available for PF.
What: /sys/kernel/debug/hisi_zip/<bdf>/clear_enable
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org
Description: Compression/decompression core debug registers read clear
What: /sys/kernel/debug/hisi_zip/<bdf>/clear_enable
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org
Description: Compression/decompression core debug registers read clear
control. 1 means enable register read clear, otherwise 0.
Writing to this file has no functional effect, only enable or
disable counters clear after reading of these registers.
Only available for PF.
What: /sys/kernel/debug/hisi_zip/<bdf>/current_qm
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org
Description: One ZIP controller has one PF and multiple VFs, each function
What: /sys/kernel/debug/hisi_zip/<bdf>/current_qm
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org
Description: One ZIP controller has one PF and multiple VFs, each function
has a QM. Select the QM which below qm refers to.
Only available for PF.
What: /sys/kernel/debug/hisi_zip/<bdf>/qm/regs
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org
Description: Dump of QM related debug registers.
What: /sys/kernel/debug/hisi_zip/<bdf>/alg_qos
Date: Jun 2021
Contact: linux-crypto@vger.kernel.org
Description: The <bdf> is related the function for PF and VF.
ZIP driver supports to configure each function's QoS, the driver
supports to write <bdf> value to alg_qos in the host. Such as
"echo <bdf> value > alg_qos". The qos value is 1~1000, means
1/1000~1000/1000 of total QoS. The driver reading alg_qos to
get related QoS in the host and VM, Such as "cat alg_qos".
What: /sys/kernel/debug/hisi_zip/<bdf>/qm/regs
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org
Description: Dump of QM related debug registers.
Available for PF and VF in host. VF in guest currently only
has one debug register.
What: /sys/kernel/debug/hisi_zip/<bdf>/qm/current_q
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org
Description: One QM may contain multiple queues. Select specific queue to
What: /sys/kernel/debug/hisi_zip/<bdf>/qm/current_q
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org
Description: One QM may contain multiple queues. Select specific queue to
show its debug registers in above regs.
Only available for PF.
What: /sys/kernel/debug/hisi_zip/<bdf>/qm/clear_enable
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org
Description: QM debug registers(regs) read clear control. 1 means enable
What: /sys/kernel/debug/hisi_zip/<bdf>/qm/clear_enable
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org
Description: QM debug registers(regs) read clear control. 1 means enable
register read clear, otherwise 0.
Writing to this file has no functional effect, only enable or
disable counters clear after reading of these registers.
Only available for PF.
What: /sys/kernel/debug/hisi_zip/<bdf>/qm/err_irq
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the number of invalid interrupts for
What: /sys/kernel/debug/hisi_zip/<bdf>/qm/err_irq
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the number of invalid interrupts for
QM task completion.
Available for both PF and VF, and take no other effect on ZIP.
What: /sys/kernel/debug/hisi_zip/<bdf>/qm/aeq_irq
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the number of QM async event queue interrupts.
What: /sys/kernel/debug/hisi_zip/<bdf>/qm/aeq_irq
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the number of QM async event queue interrupts.
Available for both PF and VF, and take no other effect on ZIP.
What: /sys/kernel/debug/hisi_zip/<bdf>/qm/abnormal_irq
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the number of interrupts for QM abnormal event.
What: /sys/kernel/debug/hisi_zip/<bdf>/qm/abnormal_irq
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the number of interrupts for QM abnormal event.
Available for both PF and VF, and take no other effect on ZIP.
What: /sys/kernel/debug/hisi_zip/<bdf>/qm/create_qp_err
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the number of queue allocation errors.
What: /sys/kernel/debug/hisi_zip/<bdf>/qm/create_qp_err
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the number of queue allocation errors.
Available for both PF and VF, and take no other effect on ZIP.
What: /sys/kernel/debug/hisi_zip/<bdf>/qm/mb_err
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the number of failed QM mailbox commands.
What: /sys/kernel/debug/hisi_zip/<bdf>/qm/mb_err
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the number of failed QM mailbox commands.
Available for both PF and VF, and take no other effect on ZIP.
What: /sys/kernel/debug/hisi_zip/<bdf>/qm/status
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the status of the QM.
What: /sys/kernel/debug/hisi_zip/<bdf>/qm/status
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the status of the QM.
Four states: initiated, started, stopped and closed.
Available for both PF and VF, and take no other effect on ZIP.
What: /sys/kernel/debug/hisi_zip/<bdf>/zip_dfx/send_cnt
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the total number of sent requests.
What: /sys/kernel/debug/hisi_zip/<bdf>/zip_dfx/send_cnt
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the total number of sent requests.
Available for both PF and VF, and take no other effect on ZIP.
What: /sys/kernel/debug/hisi_zip/<bdf>/zip_dfx/recv_cnt
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the total number of received requests.
What: /sys/kernel/debug/hisi_zip/<bdf>/zip_dfx/recv_cnt
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the total number of received requests.
Available for both PF and VF, and take no other effect on ZIP.
What: /sys/kernel/debug/hisi_zip/<bdf>/zip_dfx/send_busy_cnt
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the total number of requests received
What: /sys/kernel/debug/hisi_zip/<bdf>/zip_dfx/send_busy_cnt
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the total number of requests received
with returning busy.
Available for both PF and VF, and take no other effect on ZIP.
What: /sys/kernel/debug/hisi_zip/<bdf>/zip_dfx/err_bd_cnt
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the total number of BD type error requests
What: /sys/kernel/debug/hisi_zip/<bdf>/zip_dfx/err_bd_cnt
Date: Apr 2020
Contact: linux-crypto@vger.kernel.org
Description: Dump the total number of BD type error requests
to be received.
Available for both PF and VF, and take no other effect on ZIP.
......@@ -8644,7 +8644,7 @@ S: Maintained
F: drivers/gpio/gpio-hisi.c
HISILICON HIGH PERFORMANCE RSA ENGINE DRIVER (HPRE)
M: Zaibo Xu <xuzaibo@huawei.com>
M: Longfang Liu <liulongfang@huawei.com>
L: linux-crypto@vger.kernel.org
S: Maintained
F: Documentation/ABI/testing/debugfs-hisi-hpre
......@@ -8724,8 +8724,8 @@ F: Documentation/devicetree/bindings/scsi/hisilicon-sas.txt
F: drivers/scsi/hisi_sas/
HISILICON SECURITY ENGINE V2 DRIVER (SEC2)
M: Zaibo Xu <xuzaibo@huawei.com>
M: Kai Ye <yekai13@huawei.com>
M: Longfang Liu <liulongfang@huawei.com>
L: linux-crypto@vger.kernel.org
S: Maintained
F: Documentation/ABI/testing/debugfs-hisi-sec
......@@ -8756,7 +8756,7 @@ F: Documentation/devicetree/bindings/mfd/hisilicon,hi6421-spmi-pmic.yaml
F: drivers/mfd/hi6421-spmi-pmic.c
HISILICON TRUE RANDOM NUMBER GENERATOR V2 SUPPORT
M: Zaibo Xu <xuzaibo@huawei.com>
M: Weili Qian <qianweili@huawei.com>
S: Maintained
F: drivers/crypto/hisilicon/trng/trng.c
......@@ -21302,6 +21302,11 @@ T: git https://github.com/Xilinx/linux-xlnx.git
F: Documentation/devicetree/bindings/phy/xlnx,zynqmp-psgtr.yaml
F: drivers/phy/xilinx/phy-zynqmp.c
XILINX ZYNQMP SHA3 DRIVER
M: Harsha <harsha.harsha@xilinx.com>
S: Maintained
F: drivers/crypto/xilinx/zynqmp-sha.c
XILINX EVENT MANAGEMENT DRIVER
M: Abhyuday Godhasara <abhyuday.godhasara@xilinx.com>
S: Maintained
......
......@@ -5,24 +5,43 @@
* Optimized RAID-5 checksumming functions for alpha EV5 and EV6
*/
extern void xor_alpha_2(unsigned long, unsigned long *, unsigned long *);
extern void xor_alpha_3(unsigned long, unsigned long *, unsigned long *,
unsigned long *);
extern void xor_alpha_4(unsigned long, unsigned long *, unsigned long *,
unsigned long *, unsigned long *);
extern void xor_alpha_5(unsigned long, unsigned long *, unsigned long *,
unsigned long *, unsigned long *, unsigned long *);
extern void
xor_alpha_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2);
extern void
xor_alpha_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3);
extern void
xor_alpha_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4);
extern void
xor_alpha_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5);
extern void xor_alpha_prefetch_2(unsigned long, unsigned long *,
unsigned long *);
extern void xor_alpha_prefetch_3(unsigned long, unsigned long *,
unsigned long *, unsigned long *);
extern void xor_alpha_prefetch_4(unsigned long, unsigned long *,
unsigned long *, unsigned long *,
unsigned long *);
extern void xor_alpha_prefetch_5(unsigned long, unsigned long *,
unsigned long *, unsigned long *,
unsigned long *, unsigned long *);
extern void
xor_alpha_prefetch_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2);
extern void
xor_alpha_prefetch_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3);
extern void
xor_alpha_prefetch_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4);
extern void
xor_alpha_prefetch_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5);
asm(" \n\
.text \n\
......
......@@ -758,29 +758,24 @@ ENTRY(aesbs_cbc_decrypt)
ENDPROC(aesbs_cbc_decrypt)
.macro next_ctr, q
vmov.32 \q\()h[1], r10
vmov \q\()h, r9, r10
adds r10, r10, #1
vmov.32 \q\()h[0], r9
adcs r9, r9, #0
vmov.32 \q\()l[1], r8
vmov \q\()l, r7, r8
adcs r8, r8, #0
vmov.32 \q\()l[0], r7
adc r7, r7, #0
vrev32.8 \q, \q
.endm
/*
* aesbs_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[],
* int rounds, int blocks, u8 ctr[], u8 final[])
* int rounds, int bytes, u8 ctr[])
*/
ENTRY(aesbs_ctr_encrypt)
mov ip, sp
push {r4-r10, lr}
ldm ip, {r5-r7} // load args 4-6
teq r7, #0
addne r5, r5, #1 // one extra block if final != 0
ldm ip, {r5, r6} // load args 4-5
vld1.8 {q0}, [r6] // load counter
vrev32.8 q1, q0
vmov r9, r10, d3
......@@ -792,20 +787,19 @@ ENTRY(aesbs_ctr_encrypt)
adc r7, r7, #0
99: vmov q1, q0
sub lr, r5, #1
vmov q2, q0
adr ip, 0f
vmov q3, q0
and lr, lr, #112
vmov q4, q0
cmp r5, #112
vmov q5, q0
sub ip, ip, lr, lsl #1
vmov q6, q0
add ip, ip, lr, lsr #2
vmov q7, q0
adr ip, 0f
sub lr, r5, #1
and lr, lr, #7
cmp r5, #8
sub ip, ip, lr, lsl #5
sub ip, ip, lr, lsl #2
movlt pc, ip // computed goto if blocks < 8
movle pc, ip // computed goto if bytes < 112
next_ctr q1
next_ctr q2
......@@ -820,12 +814,14 @@ ENTRY(aesbs_ctr_encrypt)
bl aesbs_encrypt8
adr ip, 1f
and lr, r5, #7
cmp r5, #8
movgt r4, #0
ldrle r4, [sp, #40] // load final in the last round
sub ip, ip, lr, lsl #2
movlt pc, ip // computed goto if blocks < 8
sub lr, r5, #1
cmp r5, #128
bic lr, lr, #15
ands r4, r5, #15 // preserves C flag
teqcs r5, r5 // set Z flag if not last iteration
sub ip, ip, lr, lsr #2
rsb r4, r4, #16
movcc pc, ip // computed goto if bytes < 128
vld1.8 {q8}, [r1]!
vld1.8 {q9}, [r1]!
......@@ -834,46 +830,70 @@ ENTRY(aesbs_ctr_encrypt)
vld1.8 {q12}, [r1]!
vld1.8 {q13}, [r1]!
vld1.8 {q14}, [r1]!
teq r4, #0 // skip last block if 'final'
1: bne 2f
1: subne r1, r1, r4
vld1.8 {q15}, [r1]!
2: adr ip, 3f
cmp r5, #8
sub ip, ip, lr, lsl #3
movlt pc, ip // computed goto if blocks < 8
add ip, ip, #2f - 1b
veor q0, q0, q8
vst1.8 {q0}, [r0]!
veor q1, q1, q9
vst1.8 {q1}, [r0]!
veor q4, q4, q10
vst1.8 {q4}, [r0]!
veor q6, q6, q11
vst1.8 {q6}, [r0]!
veor q3, q3, q12
vst1.8 {q3}, [r0]!
veor q7, q7, q13
vst1.8 {q7}, [r0]!
veor q2, q2, q14
bne 3f
veor q5, q5, q15
movcc pc, ip // computed goto if bytes < 128
vst1.8 {q0}, [r0]!
vst1.8 {q1}, [r0]!
vst1.8 {q4}, [r0]!
vst1.8 {q6}, [r0]!
vst1.8 {q3}, [r0]!
vst1.8 {q7}, [r0]!
vst1.8 {q2}, [r0]!
teq r4, #0 // skip last block if 'final'
W(bne) 5f
3: veor q5, q5, q15
2: subne r0, r0, r4
vst1.8 {q5}, [r0]!
4: next_ctr q0
next_ctr q0
subs r5, r5, #8
subs r5, r5, #128
bgt 99b
vst1.8 {q0}, [r6]
pop {r4-r10, pc}
5: vst1.8 {q5}, [r4]
b 4b
3: adr lr, .Lpermute_table + 16
cmp r5, #16 // Z flag remains cleared
sub lr, lr, r4
vld1.8 {q8-q9}, [lr]
vtbl.8 d16, {q5}, d16
vtbl.8 d17, {q5}, d17
veor q5, q8, q15
bcc 4f // have to reload prev if R5 < 16
vtbx.8 d10, {q2}, d18
vtbx.8 d11, {q2}, d19
mov pc, ip // branch back to VST sequence
4: sub r0, r0, r4
vshr.s8 q9, q9, #7 // create mask for VBIF
vld1.8 {q8}, [r0] // reload
vbif q5, q8, q9
vst1.8 {q5}, [r0]
pop {r4-r10, pc}
ENDPROC(aesbs_ctr_encrypt)
.align 6
.Lpermute_table:
.byte 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
.byte 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
.byte 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07
.byte 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f
.byte 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
.byte 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
.macro next_tweak, out, in, const, tmp
vshr.s64 \tmp, \in, #63
vand \tmp, \tmp, \const
......@@ -888,6 +908,7 @@ ENDPROC(aesbs_ctr_encrypt)
* aesbs_xts_decrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
* int blocks, u8 iv[], int reorder_last_tweak)
*/
.align 6
__xts_prepare8:
vld1.8 {q14}, [r7] // load iv
vmov.i32 d30, #0x87 // compose tweak mask vector
......
......@@ -37,7 +37,7 @@ asmlinkage void aesbs_cbc_decrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 iv[]);
asmlinkage void aesbs_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 ctr[], u8 final[]);
int rounds, int blocks, u8 ctr[]);
asmlinkage void aesbs_xts_encrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 iv[], int);
......@@ -243,32 +243,25 @@ static int ctr_encrypt(struct skcipher_request *req)
err = skcipher_walk_virt(&walk, req, false);
while (walk.nbytes > 0) {
unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
u8 *final = (walk.total % AES_BLOCK_SIZE) ? buf : NULL;
const u8 *src = walk.src.virt.addr;
u8 *dst = walk.dst.virt.addr;
int bytes = walk.nbytes;
if (walk.nbytes < walk.total) {
blocks = round_down(blocks,
walk.stride / AES_BLOCK_SIZE);
final = NULL;
}
if (unlikely(bytes < AES_BLOCK_SIZE))
src = dst = memcpy(buf + sizeof(buf) - bytes,
src, bytes);
else if (walk.nbytes < walk.total)
bytes &= ~(8 * AES_BLOCK_SIZE - 1);
kernel_neon_begin();
aesbs_ctr_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->rk, ctx->rounds, blocks, walk.iv, final);
aesbs_ctr_encrypt(dst, src, ctx->rk, ctx->rounds, bytes, walk.iv);
kernel_neon_end();
if (final) {
u8 *dst = walk.dst.virt.addr + blocks * AES_BLOCK_SIZE;
u8 *src = walk.src.virt.addr + blocks * AES_BLOCK_SIZE;
if (unlikely(bytes < AES_BLOCK_SIZE))
memcpy(walk.dst.virt.addr,
buf + sizeof(buf) - bytes, bytes);
crypto_xor_cpy(dst, src, final,
walk.total % AES_BLOCK_SIZE);
err = skcipher_walk_done(&walk, 0);
break;
}
err = skcipher_walk_done(&walk,
walk.nbytes - blocks * AES_BLOCK_SIZE);
err = skcipher_walk_done(&walk, walk.nbytes - bytes);
}
return err;
......
......@@ -44,7 +44,8 @@
: "0" (dst), "r" (a1), "r" (a2), "r" (a3), "r" (a4))
static void
xor_arm4regs_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
xor_arm4regs_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{
unsigned int lines = bytes / sizeof(unsigned long) / 4;
register unsigned int a1 __asm__("r4");
......@@ -64,8 +65,9 @@ xor_arm4regs_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
}
static void
xor_arm4regs_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3)
xor_arm4regs_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{
unsigned int lines = bytes / sizeof(unsigned long) / 4;
register unsigned int a1 __asm__("r4");
......@@ -86,8 +88,10 @@ xor_arm4regs_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
xor_arm4regs_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4)
xor_arm4regs_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{
unsigned int lines = bytes / sizeof(unsigned long) / 2;
register unsigned int a1 __asm__("r8");
......@@ -105,8 +109,11 @@ xor_arm4regs_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
xor_arm4regs_5(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4, unsigned long *p5)
xor_arm4regs_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{
unsigned int lines = bytes / sizeof(unsigned long) / 2;
register unsigned int a1 __asm__("r8");
......@@ -146,7 +153,8 @@ static struct xor_block_template xor_block_arm4regs = {
extern struct xor_block_template const xor_block_neon_inner;
static void
xor_neon_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
xor_neon_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{
if (in_interrupt()) {
xor_arm4regs_2(bytes, p1, p2);
......@@ -158,8 +166,9 @@ xor_neon_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
}
static void
xor_neon_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3)
xor_neon_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{
if (in_interrupt()) {
xor_arm4regs_3(bytes, p1, p2, p3);
......@@ -171,8 +180,10 @@ xor_neon_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
xor_neon_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4)
xor_neon_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{
if (in_interrupt()) {
xor_arm4regs_4(bytes, p1, p2, p3, p4);
......@@ -184,8 +195,11 @@ xor_neon_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
xor_neon_5(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4, unsigned long *p5)
xor_neon_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{
if (in_interrupt()) {
xor_arm4regs_5(bytes, p1, p2, p3, p4, p5);
......
......@@ -17,17 +17,11 @@ MODULE_LICENSE("GPL");
/*
* Pull in the reference implementations while instructing GCC (through
* -ftree-vectorize) to attempt to exploit implicit parallelism and emit
* NEON instructions.
* NEON instructions. Clang does this by default at O2 so no pragma is
* needed.
*/
#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
#ifdef CONFIG_CC_IS_GCC
#pragma GCC optimize "tree-vectorize"
#else
/*
* While older versions of GCC do not generate incorrect code, they fail to
* recognize the parallel nature of these functions, and emit plain ARM code,
* which is known to be slower than the optimized ARM code in asm-arm/xor.h.
*/
#warning This code requires at least version 4.6 of GCC
#endif
#pragma GCC diagnostic ignored "-Wunused-variable"
......
......@@ -45,7 +45,7 @@ config CRYPTO_SM3_ARM64_CE
tristate "SM3 digest algorithm (ARMv8.2 Crypto Extensions)"
depends on KERNEL_MODE_NEON
select CRYPTO_HASH
select CRYPTO_SM3
select CRYPTO_LIB_SM3
config CRYPTO_SM4_ARM64_CE
tristate "SM4 symmetric cipher (ARMv8.2 Crypto Extensions)"
......
......@@ -24,7 +24,6 @@
#ifdef USE_V8_CRYPTO_EXTENSIONS
#define MODE "ce"
#define PRIO 300
#define STRIDE 5
#define aes_expandkey ce_aes_expandkey
#define aes_ecb_encrypt ce_aes_ecb_encrypt
#define aes_ecb_decrypt ce_aes_ecb_decrypt
......@@ -42,7 +41,6 @@ MODULE_DESCRIPTION("AES-ECB/CBC/CTR/XTS using ARMv8 Crypto Extensions");
#else
#define MODE "neon"
#define PRIO 200
#define STRIDE 4
#define aes_ecb_encrypt neon_aes_ecb_encrypt
#define aes_ecb_decrypt neon_aes_ecb_decrypt
#define aes_cbc_encrypt neon_aes_cbc_encrypt
......@@ -89,7 +87,7 @@ asmlinkage void aes_cbc_cts_decrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int bytes, u8 const iv[]);
asmlinkage void aes_ctr_encrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int bytes, u8 ctr[], u8 finalbuf[]);
int rounds, int bytes, u8 ctr[]);
asmlinkage void aes_xts_encrypt(u8 out[], u8 const in[], u32 const rk1[],
int rounds, int bytes, u32 const rk2[], u8 iv[],
......@@ -458,26 +456,21 @@ static int __maybe_unused ctr_encrypt(struct skcipher_request *req)
unsigned int nbytes = walk.nbytes;
u8 *dst = walk.dst.virt.addr;
u8 buf[AES_BLOCK_SIZE];
unsigned int tail;
if (unlikely(nbytes < AES_BLOCK_SIZE))
src = memcpy(buf, src, nbytes);
src = dst = memcpy(buf + sizeof(buf) - nbytes,
src, nbytes);
else if (nbytes < walk.total)
nbytes &= ~(AES_BLOCK_SIZE - 1);
kernel_neon_begin();
aes_ctr_encrypt(dst, src, ctx->key_enc, rounds, nbytes,
walk.iv, buf);
walk.iv);
kernel_neon_end();
tail = nbytes % (STRIDE * AES_BLOCK_SIZE);
if (tail > 0 && tail < AES_BLOCK_SIZE)
/*
* The final partial block could not be returned using
* an overlapping store, so it was passed via buf[]
* instead.
*/
memcpy(dst + nbytes - tail, buf, tail);
if (unlikely(nbytes < AES_BLOCK_SIZE))
memcpy(walk.dst.virt.addr,
buf + sizeof(buf) - nbytes, nbytes);
err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
}
......@@ -983,6 +976,7 @@ module_cpu_feature_match(AES, aes_init);
module_init(aes_init);
EXPORT_SYMBOL(neon_aes_ecb_encrypt);
EXPORT_SYMBOL(neon_aes_cbc_encrypt);
EXPORT_SYMBOL(neon_aes_ctr_encrypt);
EXPORT_SYMBOL(neon_aes_xts_encrypt);
EXPORT_SYMBOL(neon_aes_xts_decrypt);
#endif
......
......@@ -321,7 +321,7 @@ AES_FUNC_END(aes_cbc_cts_decrypt)
/*
* aes_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
* int bytes, u8 ctr[], u8 finalbuf[])
* int bytes, u8 ctr[])
*/
AES_FUNC_START(aes_ctr_encrypt)
......@@ -414,8 +414,8 @@ ST5( st1 {v4.16b}, [x0], #16 )
.Lctrtail:
/* XOR up to MAX_STRIDE * 16 - 1 bytes of in/output with v0 ... v3/v4 */
mov x16, #16
ands x13, x4, #0xf
csel x13, x13, x16, ne
ands x6, x4, #0xf
csel x13, x6, x16, ne
ST5( cmp w4, #64 - (MAX_STRIDE << 4) )
ST5( csel x14, x16, xzr, gt )
......@@ -424,10 +424,10 @@ ST5( csel x14, x16, xzr, gt )
cmp w4, #32 - (MAX_STRIDE << 4)
csel x16, x16, xzr, gt
cmp w4, #16 - (MAX_STRIDE << 4)
ble .Lctrtail1x
adr_l x12, .Lcts_permute_table
add x12, x12, x13
ble .Lctrtail1x
ST5( ld1 {v5.16b}, [x1], x14 )
ld1 {v6.16b}, [x1], x15
......@@ -462,11 +462,19 @@ ST5( st1 {v5.16b}, [x0], x14 )
b .Lctrout
.Lctrtail1x:
csel x0, x0, x6, eq // use finalbuf if less than a full block
sub x7, x6, #16
csel x6, x6, x7, eq
add x1, x1, x6
add x0, x0, x6
ld1 {v5.16b}, [x1]
ld1 {v6.16b}, [x0]
ST5( mov v3.16b, v4.16b )
encrypt_block v3, w3, x2, x8, w7
ld1 {v10.16b-v11.16b}, [x12]
tbl v3.16b, {v3.16b}, v10.16b
sshr v11.16b, v11.16b, #7
eor v5.16b, v5.16b, v3.16b
bif v5.16b, v6.16b, v11.16b
st1 {v5.16b}, [x0]
b .Lctrout
AES_FUNC_END(aes_ctr_encrypt)
......
......@@ -735,119 +735,67 @@ SYM_FUNC_END(aesbs_cbc_decrypt)
* int blocks, u8 iv[])
*/
SYM_FUNC_START_LOCAL(__xts_crypt8)
mov x6, #1
lsl x6, x6, x23
subs w23, w23, #8
csel x23, x23, xzr, pl
csel x6, x6, xzr, mi
movi v18.2s, #0x1
movi v19.2s, #0x87
uzp1 v18.4s, v18.4s, v19.4s
ld1 {v0.16b-v3.16b}, [x1], #64
ld1 {v4.16b-v7.16b}, [x1], #64
next_tweak v26, v25, v18, v19
next_tweak v27, v26, v18, v19
next_tweak v28, v27, v18, v19
next_tweak v29, v28, v18, v19
next_tweak v30, v29, v18, v19
next_tweak v31, v30, v18, v19
next_tweak v16, v31, v18, v19
next_tweak v17, v16, v18, v19
ld1 {v0.16b}, [x20], #16
next_tweak v26, v25, v30, v31
eor v0.16b, v0.16b, v25.16b
tbnz x6, #1, 0f
ld1 {v1.16b}, [x20], #16
next_tweak v27, v26, v30, v31
eor v1.16b, v1.16b, v26.16b
tbnz x6, #2, 0f
ld1 {v2.16b}, [x20], #16
next_tweak v28, v27, v30, v31
eor v2.16b, v2.16b, v27.16b
tbnz x6, #3, 0f
ld1 {v3.16b}, [x20], #16
next_tweak v29, v28, v30, v31
eor v3.16b, v3.16b, v28.16b
tbnz x6, #4, 0f
ld1 {v4.16b}, [x20], #16
str q29, [sp, #.Lframe_local_offset]
eor v4.16b, v4.16b, v29.16b
next_tweak v29, v29, v30, v31
tbnz x6, #5, 0f
ld1 {v5.16b}, [x20], #16
str q29, [sp, #.Lframe_local_offset + 16]
eor v5.16b, v5.16b, v29.16b
next_tweak v29, v29, v30, v31
tbnz x6, #6, 0f
ld1 {v6.16b}, [x20], #16
str q29, [sp, #.Lframe_local_offset + 32]
eor v6.16b, v6.16b, v29.16b
next_tweak v29, v29, v30, v31
tbnz x6, #7, 0f
eor v5.16b, v5.16b, v30.16b
eor v6.16b, v6.16b, v31.16b
eor v7.16b, v7.16b, v16.16b
ld1 {v7.16b}, [x20], #16
str q29, [sp, #.Lframe_local_offset + 48]
eor v7.16b, v7.16b, v29.16b
next_tweak v29, v29, v30, v31
stp q16, q17, [sp, #16]
0: mov bskey, x21
mov rounds, x22
mov bskey, x2
mov rounds, x3
br x16
SYM_FUNC_END(__xts_crypt8)
.macro __xts_crypt, do8, o0, o1, o2, o3, o4, o5, o6, o7
frame_push 6, 64
mov x19, x0
mov x20, x1
mov x21, x2
mov x22, x3
mov x23, x4
mov x24, x5
stp x29, x30, [sp, #-48]!
mov x29, sp
movi v30.2s, #0x1
movi v25.2s, #0x87
uzp1 v30.4s, v30.4s, v25.4s
ld1 {v25.16b}, [x24]
ld1 {v25.16b}, [x5]
99: adr x16, \do8
0: adr x16, \do8
bl __xts_crypt8
ldp q16, q17, [sp, #.Lframe_local_offset]
ldp q18, q19, [sp, #.Lframe_local_offset + 32]
eor v16.16b, \o0\().16b, v25.16b
eor v17.16b, \o1\().16b, v26.16b
eor v18.16b, \o2\().16b, v27.16b
eor v19.16b, \o3\().16b, v28.16b
eor \o0\().16b, \o0\().16b, v25.16b
eor \o1\().16b, \o1\().16b, v26.16b
eor \o2\().16b, \o2\().16b, v27.16b
eor \o3\().16b, \o3\().16b, v28.16b
ldp q24, q25, [sp, #16]
st1 {\o0\().16b}, [x19], #16
mov v25.16b, v26.16b
tbnz x6, #1, 1f
st1 {\o1\().16b}, [x19], #16
mov v25.16b, v27.16b
tbnz x6, #2, 1f
st1 {\o2\().16b}, [x19], #16
mov v25.16b, v28.16b
tbnz x6, #3, 1f
st1 {\o3\().16b}, [x19], #16
mov v25.16b, v29.16b
tbnz x6, #4, 1f
eor v20.16b, \o4\().16b, v29.16b
eor v21.16b, \o5\().16b, v30.16b
eor v22.16b, \o6\().16b, v31.16b
eor v23.16b, \o7\().16b, v24.16b
eor \o4\().16b, \o4\().16b, v16.16b
eor \o5\().16b, \o5\().16b, v17.16b
eor \o6\().16b, \o6\().16b, v18.16b
eor \o7\().16b, \o7\().16b, v19.16b
st1 {v16.16b-v19.16b}, [x0], #64
st1 {v20.16b-v23.16b}, [x0], #64
st1 {\o4\().16b}, [x19], #16
tbnz x6, #5, 1f
st1 {\o5\().16b}, [x19], #16
tbnz x6, #6, 1f
st1 {\o6\().16b}, [x19], #16
tbnz x6, #7, 1f
st1 {\o7\().16b}, [x19], #16
subs x4, x4, #8
b.gt 0b
cbz x23, 1f
st1 {v25.16b}, [x24]
b 99b
1: st1 {v25.16b}, [x24]
frame_pop
st1 {v25.16b}, [x5]
ldp x29, x30, [sp], #48
ret
.endm
......@@ -869,133 +817,51 @@ SYM_FUNC_END(aesbs_xts_decrypt)
/*
* aesbs_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[],
* int rounds, int blocks, u8 iv[], u8 final[])
* int rounds, int blocks, u8 iv[])
*/
SYM_FUNC_START(aesbs_ctr_encrypt)
frame_push 8
mov x19, x0
mov x20, x1
mov x21, x2
mov x22, x3
mov x23, x4
mov x24, x5
mov x25, x6
stp x29, x30, [sp, #-16]!
mov x29, sp
cmp x25, #0
cset x26, ne
add x23, x23, x26 // do one extra block if final
ldp x7, x8, [x24]
ld1 {v0.16b}, [x24]
ldp x7, x8, [x5]
ld1 {v0.16b}, [x5]
CPU_LE( rev x7, x7 )
CPU_LE( rev x8, x8 )
adds x8, x8, #1
adc x7, x7, xzr
99: mov x9, #1
lsl x9, x9, x23
subs w23, w23, #8
csel x23, x23, xzr, pl
csel x9, x9, xzr, le
tbnz x9, #1, 0f
next_ctr v1
tbnz x9, #2, 0f
0: next_ctr v1
next_ctr v2
tbnz x9, #3, 0f
next_ctr v3
tbnz x9, #4, 0f
next_ctr v4
tbnz x9, #5, 0f
next_ctr v5
tbnz x9, #6, 0f
next_ctr v6
tbnz x9, #7, 0f
next_ctr v7
0: mov bskey, x21
mov rounds, x22
mov bskey, x2
mov rounds, x3
bl aesbs_encrypt8
lsr x9, x9, x26 // disregard the extra block
tbnz x9, #0, 0f
ld1 {v8.16b}, [x20], #16
eor v0.16b, v0.16b, v8.16b
st1 {v0.16b}, [x19], #16
tbnz x9, #1, 1f
ld1 { v8.16b-v11.16b}, [x1], #64
ld1 {v12.16b-v15.16b}, [x1], #64
ld1 {v9.16b}, [x20], #16
eor v1.16b, v1.16b, v9.16b
st1 {v1.16b}, [x19], #16
tbnz x9, #2, 2f
eor v8.16b, v0.16b, v8.16b
eor v9.16b, v1.16b, v9.16b
eor v10.16b, v4.16b, v10.16b
eor v11.16b, v6.16b, v11.16b
eor v12.16b, v3.16b, v12.16b
eor v13.16b, v7.16b, v13.16b
eor v14.16b, v2.16b, v14.16b
eor v15.16b, v5.16b, v15.16b
ld1 {v10.16b}, [x20], #16
eor v4.16b, v4.16b, v10.16b
st1 {v4.16b}, [x19], #16
tbnz x9, #3, 3f
st1 { v8.16b-v11.16b}, [x0], #64
st1 {v12.16b-v15.16b}, [x0], #64
ld1 {v11.16b}, [x20], #16
eor v6.16b, v6.16b, v11.16b
st1 {v6.16b}, [x19], #16
tbnz x9, #4, 4f
ld1 {v12.16b}, [x20], #16
eor v3.16b, v3.16b, v12.16b
st1 {v3.16b}, [x19], #16
tbnz x9, #5, 5f
ld1 {v13.16b}, [x20], #16
eor v7.16b, v7.16b, v13.16b
st1 {v7.16b}, [x19], #16
tbnz x9, #6, 6f
next_ctr v0
subs x4, x4, #8
b.gt 0b
ld1 {v14.16b}, [x20], #16
eor v2.16b, v2.16b, v14.16b
st1 {v2.16b}, [x19], #16
tbnz x9, #7, 7f
ld1 {v15.16b}, [x20], #16
eor v5.16b, v5.16b, v15.16b
st1 {v5.16b}, [x19], #16
8: next_ctr v0
st1 {v0.16b}, [x24]
cbz x23, .Lctr_done
b 99b
.Lctr_done:
frame_pop
st1 {v0.16b}, [x5]
ldp x29, x30, [sp], #16
ret
/*
* If we are handling the tail of the input (x6 != NULL), return the
* final keystream block back to the caller.
*/
0: cbz x25, 8b
st1 {v0.16b}, [x25]
b 8b
1: cbz x25, 8b
st1 {v1.16b}, [x25]
b 8b
2: cbz x25, 8b
st1 {v4.16b}, [x25]
b 8b
3: cbz x25, 8b
st1 {v6.16b}, [x25]
b 8b
4: cbz x25, 8b
st1 {v3.16b}, [x25]
b 8b
5: cbz x25, 8b
st1 {v7.16b}, [x25]
b 8b
6: cbz x25, 8b
st1 {v2.16b}, [x25]
b 8b
7: cbz x25, 8b
st1 {v5.16b}, [x25]
b 8b
SYM_FUNC_END(aesbs_ctr_encrypt)
......@@ -34,7 +34,7 @@ asmlinkage void aesbs_cbc_decrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 iv[]);
asmlinkage void aesbs_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 iv[], u8 final[]);
int rounds, int blocks, u8 iv[]);
asmlinkage void aesbs_xts_encrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 iv[]);
......@@ -46,6 +46,8 @@ asmlinkage void neon_aes_ecb_encrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int blocks);
asmlinkage void neon_aes_cbc_encrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int blocks, u8 iv[]);
asmlinkage void neon_aes_ctr_encrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int bytes, u8 ctr[]);
asmlinkage void neon_aes_xts_encrypt(u8 out[], u8 const in[],
u32 const rk1[], int rounds, int bytes,
u32 const rk2[], u8 iv[], int first);
......@@ -58,7 +60,7 @@ struct aesbs_ctx {
int rounds;
} __aligned(AES_BLOCK_SIZE);
struct aesbs_cbc_ctx {
struct aesbs_cbc_ctr_ctx {
struct aesbs_ctx key;
u32 enc[AES_MAX_KEYLENGTH_U32];
};
......@@ -128,10 +130,10 @@ static int ecb_decrypt(struct skcipher_request *req)
return __ecb_crypt(req, aesbs_ecb_decrypt);
}
static int aesbs_cbc_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
static int aesbs_cbc_ctr_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
unsigned int key_len)
{
struct aesbs_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
struct aesbs_cbc_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
struct crypto_aes_ctx rk;
int err;
......@@ -154,7 +156,7 @@ static int aesbs_cbc_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
static int cbc_encrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct aesbs_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
struct aesbs_cbc_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk;
int err;
......@@ -177,7 +179,7 @@ static int cbc_encrypt(struct skcipher_request *req)
static int cbc_decrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct aesbs_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
struct aesbs_cbc_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk;
int err;
......@@ -205,40 +207,32 @@ static int cbc_decrypt(struct skcipher_request *req)
static int ctr_encrypt(struct skcipher_request *req)
{
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct aesbs_ctx *ctx = crypto_skcipher_ctx(tfm);
struct aesbs_cbc_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk;
u8 buf[AES_BLOCK_SIZE];
int err;
err = skcipher_walk_virt(&walk, req, false);
while (walk.nbytes > 0) {
unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
u8 *final = (walk.total % AES_BLOCK_SIZE) ? buf : NULL;
if (walk.nbytes < walk.total) {
blocks = round_down(blocks,
walk.stride / AES_BLOCK_SIZE);
final = NULL;
}
int blocks = (walk.nbytes / AES_BLOCK_SIZE) & ~7;
int nbytes = walk.nbytes % (8 * AES_BLOCK_SIZE);
const u8 *src = walk.src.virt.addr;
u8 *dst = walk.dst.virt.addr;
kernel_neon_begin();
aesbs_ctr_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
ctx->rk, ctx->rounds, blocks, walk.iv, final);
kernel_neon_end();
if (final) {
u8 *dst = walk.dst.virt.addr + blocks * AES_BLOCK_SIZE;
u8 *src = walk.src.virt.addr + blocks * AES_BLOCK_SIZE;
crypto_xor_cpy(dst, src, final,
walk.total % AES_BLOCK_SIZE);
err = skcipher_walk_done(&walk, 0);
break;
if (blocks >= 8) {
aesbs_ctr_encrypt(dst, src, ctx->key.rk, ctx->key.rounds,
blocks, walk.iv);
dst += blocks * AES_BLOCK_SIZE;
src += blocks * AES_BLOCK_SIZE;
}
err = skcipher_walk_done(&walk,
walk.nbytes - blocks * AES_BLOCK_SIZE);
if (nbytes && walk.nbytes == walk.total) {
neon_aes_ctr_encrypt(dst, src, ctx->enc, ctx->key.rounds,
nbytes, walk.iv);
nbytes = 0;
}
kernel_neon_end();
err = skcipher_walk_done(&walk, nbytes);
}
return err;
}
......@@ -308,23 +302,18 @@ static int __xts_crypt(struct skcipher_request *req, bool encrypt,
return err;
while (walk.nbytes >= AES_BLOCK_SIZE) {
unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
if (walk.nbytes < walk.total || walk.nbytes % AES_BLOCK_SIZE)
blocks = round_down(blocks,
walk.stride / AES_BLOCK_SIZE);
int blocks = (walk.nbytes / AES_BLOCK_SIZE) & ~7;
out = walk.dst.virt.addr;
in = walk.src.virt.addr;
nbytes = walk.nbytes;
kernel_neon_begin();
if (likely(blocks > 6)) { /* plain NEON is faster otherwise */
if (first)
if (blocks >= 8) {
if (first == 1)
neon_aes_ecb_encrypt(walk.iv, walk.iv,
ctx->twkey,
ctx->key.rounds, 1);
first = 0;
first = 2;
fn(out, in, ctx->key.rk, ctx->key.rounds, blocks,
walk.iv);
......@@ -333,10 +322,17 @@ static int __xts_crypt(struct skcipher_request *req, bool encrypt,
in += blocks * AES_BLOCK_SIZE;
nbytes -= blocks * AES_BLOCK_SIZE;
}
if (walk.nbytes == walk.total && nbytes > 0)
goto xts_tail;
if (walk.nbytes == walk.total && nbytes > 0) {
if (encrypt)
neon_aes_xts_encrypt(out, in, ctx->cts.key_enc,
ctx->key.rounds, nbytes,
ctx->twkey, walk.iv, first);
else
neon_aes_xts_decrypt(out, in, ctx->cts.key_dec,
ctx->key.rounds, nbytes,
ctx->twkey, walk.iv, first);
nbytes = first = 0;
}
kernel_neon_end();
err = skcipher_walk_done(&walk, nbytes);
}
......@@ -361,13 +357,12 @@ static int __xts_crypt(struct skcipher_request *req, bool encrypt,
nbytes = walk.nbytes;
kernel_neon_begin();
xts_tail:
if (encrypt)
neon_aes_xts_encrypt(out, in, ctx->cts.key_enc, ctx->key.rounds,
nbytes, ctx->twkey, walk.iv, first ?: 2);
nbytes, ctx->twkey, walk.iv, first);
else
neon_aes_xts_decrypt(out, in, ctx->cts.key_dec, ctx->key.rounds,
nbytes, ctx->twkey, walk.iv, first ?: 2);
nbytes, ctx->twkey, walk.iv, first);
kernel_neon_end();
return skcipher_walk_done(&walk, 0);
......@@ -402,14 +397,14 @@ static struct skcipher_alg aes_algs[] = { {
.base.cra_driver_name = "cbc-aes-neonbs",
.base.cra_priority = 250,
.base.cra_blocksize = AES_BLOCK_SIZE,
.base.cra_ctxsize = sizeof(struct aesbs_cbc_ctx),
.base.cra_ctxsize = sizeof(struct aesbs_cbc_ctr_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE,
.walksize = 8 * AES_BLOCK_SIZE,
.ivsize = AES_BLOCK_SIZE,
.setkey = aesbs_cbc_setkey,
.setkey = aesbs_cbc_ctr_setkey,
.encrypt = cbc_encrypt,
.decrypt = cbc_decrypt,
}, {
......@@ -417,7 +412,7 @@ static struct skcipher_alg aes_algs[] = { {
.base.cra_driver_name = "ctr-aes-neonbs",
.base.cra_priority = 250,
.base.cra_blocksize = 1,
.base.cra_ctxsize = sizeof(struct aesbs_ctx),
.base.cra_ctxsize = sizeof(struct aesbs_cbc_ctr_ctx),
.base.cra_module = THIS_MODULE,
.min_keysize = AES_MIN_KEY_SIZE,
......@@ -425,7 +420,7 @@ static struct skcipher_alg aes_algs[] = { {
.chunksize = AES_BLOCK_SIZE,
.walksize = 8 * AES_BLOCK_SIZE,
.ivsize = AES_BLOCK_SIZE,
.setkey = aesbs_setkey,
.setkey = aesbs_cbc_ctr_setkey,
.encrypt = ctr_encrypt,
.decrypt = ctr_encrypt,
}, {
......
/* SPDX-License-Identifier: GPL-2.0 */
// SPDX-License-Identifier: GPL-2.0
/*
* sha3-ce-glue.c - core SHA-3 transform using v8.2 Crypto Extensions
*
......
......@@ -43,7 +43,7 @@
# on Cortex-A53 (or by 4 cycles per round).
# (***) Super-impressive coefficients over gcc-generated code are
# indication of some compiler "pathology", most notably code
# generated with -mgeneral-regs-only is significanty faster
# generated with -mgeneral-regs-only is significantly faster
# and the gap is only 40-90%.
#
# October 2016.
......
/* SPDX-License-Identifier: GPL-2.0 */
// SPDX-License-Identifier: GPL-2.0
/*
* sha512-ce-glue.c - SHA-384/SHA-512 using ARMv8 Crypto Extensions
*
......
......@@ -26,8 +26,10 @@ asmlinkage void sm3_ce_transform(struct sm3_state *sst, u8 const *src,
static int sm3_ce_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
if (!crypto_simd_usable())
return crypto_sm3_update(desc, data, len);
if (!crypto_simd_usable()) {
sm3_update(shash_desc_ctx(desc), data, len);
return 0;
}
kernel_neon_begin();
sm3_base_do_update(desc, data, len, sm3_ce_transform);
......@@ -38,8 +40,10 @@ static int sm3_ce_update(struct shash_desc *desc, const u8 *data,
static int sm3_ce_final(struct shash_desc *desc, u8 *out)
{
if (!crypto_simd_usable())
return crypto_sm3_finup(desc, NULL, 0, out);
if (!crypto_simd_usable()) {
sm3_final(shash_desc_ctx(desc), out);
return 0;
}
kernel_neon_begin();
sm3_base_do_finalize(desc, sm3_ce_transform);
......@@ -51,14 +55,22 @@ static int sm3_ce_final(struct shash_desc *desc, u8 *out)
static int sm3_ce_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
{
if (!crypto_simd_usable())
return crypto_sm3_finup(desc, data, len, out);
if (!crypto_simd_usable()) {
struct sm3_state *sctx = shash_desc_ctx(desc);
if (len)
sm3_update(sctx, data, len);
sm3_final(sctx, out);
return 0;
}
kernel_neon_begin();
sm3_base_do_update(desc, data, len, sm3_ce_transform);
if (len)
sm3_base_do_update(desc, data, len, sm3_ce_transform);
sm3_base_do_finalize(desc, sm3_ce_transform);
kernel_neon_end();
return sm3_ce_final(desc, out);
return sm3_base_finish(desc, out);
}
static struct shash_alg sm3_alg = {
......
......@@ -16,7 +16,8 @@
extern struct xor_block_template const xor_block_inner_neon;
static void
xor_neon_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
xor_neon_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{
kernel_neon_begin();
xor_block_inner_neon.do_2(bytes, p1, p2);
......@@ -24,8 +25,9 @@ xor_neon_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
}
static void
xor_neon_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3)
xor_neon_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{
kernel_neon_begin();
xor_block_inner_neon.do_3(bytes, p1, p2, p3);
......@@ -33,8 +35,10 @@ xor_neon_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
xor_neon_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4)
xor_neon_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{
kernel_neon_begin();
xor_block_inner_neon.do_4(bytes, p1, p2, p3, p4);
......@@ -42,8 +46,11 @@ xor_neon_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
xor_neon_5(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4, unsigned long *p5)
xor_neon_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{
kernel_neon_begin();
xor_block_inner_neon.do_5(bytes, p1, p2, p3, p4, p5);
......
......@@ -11,7 +11,44 @@
.arch armv8-a+crc
.macro __crc32, c
.macro byteorder, reg, be
.if \be
CPU_LE( rev \reg, \reg )
.else
CPU_BE( rev \reg, \reg )
.endif
.endm
.macro byteorder16, reg, be
.if \be
CPU_LE( rev16 \reg, \reg )
.else
CPU_BE( rev16 \reg, \reg )
.endif
.endm
.macro bitorder, reg, be
.if \be
rbit \reg, \reg
.endif
.endm
.macro bitorder16, reg, be
.if \be
rbit \reg, \reg
lsr \reg, \reg, #16
.endif
.endm
.macro bitorder8, reg, be
.if \be
rbit \reg, \reg
lsr \reg, \reg, #24
.endif
.endm
.macro __crc32, c, be=0
bitorder w0, \be
cmp x2, #16
b.lt 8f // less than 16 bytes
......@@ -24,10 +61,14 @@
add x8, x8, x1
add x1, x1, x7
ldp x5, x6, [x8]
CPU_BE( rev x3, x3 )
CPU_BE( rev x4, x4 )
CPU_BE( rev x5, x5 )
CPU_BE( rev x6, x6 )
byteorder x3, \be
byteorder x4, \be
byteorder x5, \be
byteorder x6, \be
bitorder x3, \be
bitorder x4, \be
bitorder x5, \be
bitorder x6, \be
tst x7, #8
crc32\c\()x w8, w0, x3
......@@ -55,33 +96,43 @@ CPU_BE( rev x6, x6 )
32: ldp x3, x4, [x1], #32
sub x2, x2, #32
ldp x5, x6, [x1, #-16]
CPU_BE( rev x3, x3 )
CPU_BE( rev x4, x4 )
CPU_BE( rev x5, x5 )
CPU_BE( rev x6, x6 )
byteorder x3, \be
byteorder x4, \be
byteorder x5, \be
byteorder x6, \be
bitorder x3, \be
bitorder x4, \be
bitorder x5, \be
bitorder x6, \be
crc32\c\()x w0, w0, x3
crc32\c\()x w0, w0, x4
crc32\c\()x w0, w0, x5
crc32\c\()x w0, w0, x6
cbnz x2, 32b
0: ret
0: bitorder w0, \be
ret
8: tbz x2, #3, 4f
ldr x3, [x1], #8
CPU_BE( rev x3, x3 )
byteorder x3, \be
bitorder x3, \be
crc32\c\()x w0, w0, x3
4: tbz x2, #2, 2f
ldr w3, [x1], #4
CPU_BE( rev w3, w3 )
byteorder w3, \be
bitorder w3, \be
crc32\c\()w w0, w0, w3
2: tbz x2, #1, 1f
ldrh w3, [x1], #2
CPU_BE( rev16 w3, w3 )
byteorder16 w3, \be
bitorder16 w3, \be
crc32\c\()h w0, w0, w3
1: tbz x2, #0, 0f
ldrb w3, [x1]
bitorder8 w3, \be
crc32\c\()b w0, w0, w3
0: ret
0: bitorder w0, \be
ret
.endm
.align 5
......@@ -99,3 +150,11 @@ alternative_if_not ARM64_HAS_CRC32
alternative_else_nop_endif
__crc32 c
SYM_FUNC_END(__crc32c_le)
.align 5
SYM_FUNC_START(crc32_be)
alternative_if_not ARM64_HAS_CRC32
b crc32_be_base
alternative_else_nop_endif
__crc32 be=1
SYM_FUNC_END(crc32_be)
......@@ -10,8 +10,8 @@
#include <linux/module.h>
#include <asm/neon-intrinsics.h>
void xor_arm64_neon_2(unsigned long bytes, unsigned long *p1,
unsigned long *p2)
void xor_arm64_neon_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{
uint64_t *dp1 = (uint64_t *)p1;
uint64_t *dp2 = (uint64_t *)p2;
......@@ -37,8 +37,9 @@ void xor_arm64_neon_2(unsigned long bytes, unsigned long *p1,
} while (--lines > 0);
}
void xor_arm64_neon_3(unsigned long bytes, unsigned long *p1,
unsigned long *p2, unsigned long *p3)
void xor_arm64_neon_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{
uint64_t *dp1 = (uint64_t *)p1;
uint64_t *dp2 = (uint64_t *)p2;
......@@ -72,8 +73,10 @@ void xor_arm64_neon_3(unsigned long bytes, unsigned long *p1,
} while (--lines > 0);
}
void xor_arm64_neon_4(unsigned long bytes, unsigned long *p1,
unsigned long *p2, unsigned long *p3, unsigned long *p4)
void xor_arm64_neon_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{
uint64_t *dp1 = (uint64_t *)p1;
uint64_t *dp2 = (uint64_t *)p2;
......@@ -115,9 +118,11 @@ void xor_arm64_neon_4(unsigned long bytes, unsigned long *p1,
} while (--lines > 0);
}
void xor_arm64_neon_5(unsigned long bytes, unsigned long *p1,
unsigned long *p2, unsigned long *p3,
unsigned long *p4, unsigned long *p5)
void xor_arm64_neon_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{
uint64_t *dp1 = (uint64_t *)p1;
uint64_t *dp2 = (uint64_t *)p2;
......@@ -186,8 +191,10 @@ static inline uint64x2_t eor3(uint64x2_t p, uint64x2_t q, uint64x2_t r)
return res;
}
static void xor_arm64_eor3_3(unsigned long bytes, unsigned long *p1,
unsigned long *p2, unsigned long *p3)
static void xor_arm64_eor3_3(unsigned long bytes,
unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{
uint64_t *dp1 = (uint64_t *)p1;
uint64_t *dp2 = (uint64_t *)p2;
......@@ -219,9 +226,11 @@ static void xor_arm64_eor3_3(unsigned long bytes, unsigned long *p1,
} while (--lines > 0);
}
static void xor_arm64_eor3_4(unsigned long bytes, unsigned long *p1,
unsigned long *p2, unsigned long *p3,
unsigned long *p4)
static void xor_arm64_eor3_4(unsigned long bytes,
unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{
uint64_t *dp1 = (uint64_t *)p1;
uint64_t *dp2 = (uint64_t *)p2;
......@@ -261,9 +270,12 @@ static void xor_arm64_eor3_4(unsigned long bytes, unsigned long *p1,
} while (--lines > 0);
}
static void xor_arm64_eor3_5(unsigned long bytes, unsigned long *p1,
unsigned long *p2, unsigned long *p3,
unsigned long *p4, unsigned long *p5)
static void xor_arm64_eor3_5(unsigned long bytes,
unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{
uint64_t *dp1 = (uint64_t *)p1;
uint64_t *dp2 = (uint64_t *)p2;
......
......@@ -4,13 +4,20 @@
*/
extern void xor_ia64_2(unsigned long, unsigned long *, unsigned long *);
extern void xor_ia64_3(unsigned long, unsigned long *, unsigned long *,
unsigned long *);
extern void xor_ia64_4(unsigned long, unsigned long *, unsigned long *,
unsigned long *, unsigned long *);
extern void xor_ia64_5(unsigned long, unsigned long *, unsigned long *,
unsigned long *, unsigned long *, unsigned long *);
extern void xor_ia64_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2);
extern void xor_ia64_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3);
extern void xor_ia64_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4);
extern void xor_ia64_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5);
static struct xor_block_template xor_block_ia64 = {
.name = "ia64",
......
......@@ -3,17 +3,20 @@
#define _ASM_POWERPC_XOR_ALTIVEC_H
#ifdef CONFIG_ALTIVEC
void xor_altivec_2(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in);
void xor_altivec_3(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in, unsigned long *v3_in);
void xor_altivec_4(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in, unsigned long *v3_in,
unsigned long *v4_in);
void xor_altivec_5(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in, unsigned long *v3_in,
unsigned long *v4_in, unsigned long *v5_in);
void xor_altivec_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2);
void xor_altivec_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3);
void xor_altivec_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4);
void xor_altivec_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5);
#endif
#endif /* _ASM_POWERPC_XOR_ALTIVEC_H */
......@@ -49,8 +49,9 @@ typedef vector signed char unative_t;
V1##_3 = vec_xor(V1##_3, V2##_3); \
} while (0)
void __xor_altivec_2(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in)
void __xor_altivec_2(unsigned long bytes,
unsigned long * __restrict v1_in,
const unsigned long * __restrict v2_in)
{
DEFINE(v1);
DEFINE(v2);
......@@ -67,8 +68,10 @@ void __xor_altivec_2(unsigned long bytes, unsigned long *v1_in,
} while (--lines > 0);
}
void __xor_altivec_3(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in, unsigned long *v3_in)
void __xor_altivec_3(unsigned long bytes,
unsigned long * __restrict v1_in,
const unsigned long * __restrict v2_in,
const unsigned long * __restrict v3_in)
{
DEFINE(v1);
DEFINE(v2);
......@@ -89,9 +92,11 @@ void __xor_altivec_3(unsigned long bytes, unsigned long *v1_in,
} while (--lines > 0);
}
void __xor_altivec_4(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in, unsigned long *v3_in,
unsigned long *v4_in)
void __xor_altivec_4(unsigned long bytes,
unsigned long * __restrict v1_in,
const unsigned long * __restrict v2_in,
const unsigned long * __restrict v3_in,
const unsigned long * __restrict v4_in)
{
DEFINE(v1);
DEFINE(v2);
......@@ -116,9 +121,12 @@ void __xor_altivec_4(unsigned long bytes, unsigned long *v1_in,
} while (--lines > 0);
}
void __xor_altivec_5(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in, unsigned long *v3_in,
unsigned long *v4_in, unsigned long *v5_in)
void __xor_altivec_5(unsigned long bytes,
unsigned long * __restrict v1_in,
const unsigned long * __restrict v2_in,
const unsigned long * __restrict v3_in,
const unsigned long * __restrict v4_in,
const unsigned long * __restrict v5_in)
{
DEFINE(v1);
DEFINE(v2);
......
......@@ -6,16 +6,17 @@
* outside of the enable/disable altivec block.
*/
void __xor_altivec_2(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in);
void __xor_altivec_3(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in, unsigned long *v3_in);
void __xor_altivec_4(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in, unsigned long *v3_in,
unsigned long *v4_in);
void __xor_altivec_5(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in, unsigned long *v3_in,
unsigned long *v4_in, unsigned long *v5_in);
void __xor_altivec_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2);
void __xor_altivec_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3);
void __xor_altivec_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4);
void __xor_altivec_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5);
......@@ -12,47 +12,51 @@
#include <asm/xor_altivec.h>
#include "xor_vmx.h"
void xor_altivec_2(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in)
void xor_altivec_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{
preempt_disable();
enable_kernel_altivec();
__xor_altivec_2(bytes, v1_in, v2_in);
__xor_altivec_2(bytes, p1, p2);
disable_kernel_altivec();
preempt_enable();
}
EXPORT_SYMBOL(xor_altivec_2);
void xor_altivec_3(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in, unsigned long *v3_in)
void xor_altivec_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{
preempt_disable();
enable_kernel_altivec();
__xor_altivec_3(bytes, v1_in, v2_in, v3_in);
__xor_altivec_3(bytes, p1, p2, p3);
disable_kernel_altivec();
preempt_enable();
}
EXPORT_SYMBOL(xor_altivec_3);
void xor_altivec_4(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in, unsigned long *v3_in,
unsigned long *v4_in)
void xor_altivec_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{
preempt_disable();
enable_kernel_altivec();
__xor_altivec_4(bytes, v1_in, v2_in, v3_in, v4_in);
__xor_altivec_4(bytes, p1, p2, p3, p4);
disable_kernel_altivec();
preempt_enable();
}
EXPORT_SYMBOL(xor_altivec_4);
void xor_altivec_5(unsigned long bytes, unsigned long *v1_in,
unsigned long *v2_in, unsigned long *v3_in,
unsigned long *v4_in, unsigned long *v5_in)
void xor_altivec_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{
preempt_disable();
enable_kernel_altivec();
__xor_altivec_5(bytes, v1_in, v2_in, v3_in, v4_in, v5_in);
__xor_altivec_5(bytes, p1, p2, p3, p4, p5);
disable_kernel_altivec();
preempt_enable();
}
......
......@@ -11,7 +11,8 @@
#include <linux/raid/xor.h>
#include <asm/xor.h>
static void xor_xc_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
static void xor_xc_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{
asm volatile(
" larl 1,2f\n"
......@@ -32,8 +33,9 @@ static void xor_xc_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
: "0", "1", "cc", "memory");
}
static void xor_xc_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3)
static void xor_xc_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{
asm volatile(
" larl 1,2f\n"
......@@ -58,8 +60,10 @@ static void xor_xc_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
: : "0", "1", "cc", "memory");
}
static void xor_xc_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4)
static void xor_xc_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{
asm volatile(
" larl 1,2f\n"
......@@ -88,8 +92,11 @@ static void xor_xc_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
: : "0", "1", "cc", "memory");
}
static void xor_xc_5(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4, unsigned long *p5)
static void xor_xc_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{
asm volatile(
" larl 1,2f\n"
......
......@@ -13,7 +13,8 @@
*/
static void
sparc_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
sparc_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{
int lines = bytes / (sizeof (long)) / 8;
......@@ -50,8 +51,9 @@ sparc_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
}
static void
sparc_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3)
sparc_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{
int lines = bytes / (sizeof (long)) / 8;
......@@ -101,8 +103,10 @@ sparc_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
sparc_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4)
sparc_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{
int lines = bytes / (sizeof (long)) / 8;
......@@ -165,8 +169,11 @@ sparc_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
sparc_5(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4, unsigned long *p5)
sparc_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{
int lines = bytes / (sizeof (long)) / 8;
......
......@@ -12,13 +12,20 @@
#include <asm/spitfire.h>
void xor_vis_2(unsigned long, unsigned long *, unsigned long *);
void xor_vis_3(unsigned long, unsigned long *, unsigned long *,
unsigned long *);
void xor_vis_4(unsigned long, unsigned long *, unsigned long *,
unsigned long *, unsigned long *);
void xor_vis_5(unsigned long, unsigned long *, unsigned long *,
unsigned long *, unsigned long *, unsigned long *);
void xor_vis_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2);
void xor_vis_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3);
void xor_vis_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4);
void xor_vis_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5);
/* XXX Ugh, write cheetah versions... -DaveM */
......@@ -30,13 +37,20 @@ static struct xor_block_template xor_block_VIS = {
.do_5 = xor_vis_5,
};
void xor_niagara_2(unsigned long, unsigned long *, unsigned long *);
void xor_niagara_3(unsigned long, unsigned long *, unsigned long *,
unsigned long *);
void xor_niagara_4(unsigned long, unsigned long *, unsigned long *,
unsigned long *, unsigned long *);
void xor_niagara_5(unsigned long, unsigned long *, unsigned long *,
unsigned long *, unsigned long *, unsigned long *);
void xor_niagara_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2);
void xor_niagara_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3);
void xor_niagara_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4);
void xor_niagara_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5);
static struct xor_block_template xor_block_niagara = {
.name = "Niagara",
......
......@@ -90,6 +90,9 @@ nhpoly1305-avx2-y := nh-avx2-x86_64.o nhpoly1305-avx2-glue.o
obj-$(CONFIG_CRYPTO_CURVE25519_X86) += curve25519-x86_64.o
obj-$(CONFIG_CRYPTO_SM3_AVX_X86_64) += sm3-avx-x86_64.o
sm3-avx-x86_64-y := sm3-avx-asm_64.o sm3_avx_glue.o
obj-$(CONFIG_CRYPTO_SM4_AESNI_AVX_X86_64) += sm4-aesni-avx-x86_64.o
sm4-aesni-avx-x86_64-y := sm4-aesni-avx-asm_64.o sm4_aesni_avx_glue.o
......
/* SPDX-License-Identifier: GPL-2.0-only OR BSD-3-Clause */
/*
* Implement AES CTR mode by8 optimization with AVX instructions. (x86_64)
*
* This is AES128/192/256 CTR mode optimization implementation. It requires
* the support of Intel(R) AESNI and AVX instructions.
*
* This work was inspired by the AES CTR mode optimization published
* in Intel Optimized IPSEC Cryptograhpic library.
* Additional information on it can be found at:
* http://downloadcenter.intel.com/Detail_Desc.aspx?agr=Y&DwnldID=22972
*
* This file is provided under a dual BSD/GPLv2 license. When using or
* redistributing this file, you may do so under either license.
*
* GPL LICENSE SUMMARY
* AES CTR mode by8 optimization with AVX instructions. (x86_64)
*
* Copyright(c) 2014 Intel Corporation.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful, but
* WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* Contact Information:
* James Guilford <james.guilford@intel.com>
* Sean Gulley <sean.m.gulley@intel.com>
* Chandramouli Narayanan <mouli@linux.intel.com>
*/
/*
* This is AES128/192/256 CTR mode optimization implementation. It requires
* the support of Intel(R) AESNI and AVX instructions.
*
* BSD LICENSE
*
* Copyright(c) 2014 Intel Corporation.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
* Neither the name of Intel Corporation nor the names of its
* contributors may be used to endorse or promote products derived
* from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*
* This work was inspired by the AES CTR mode optimization published
* in Intel Optimized IPSEC Cryptographic library.
* Additional information on it can be found at:
* https://github.com/intel/intel-ipsec-mb
*/
#include <linux/linkage.h>
......
......@@ -32,24 +32,12 @@ static inline void blowfish_enc_blk(struct bf_ctx *ctx, u8 *dst, const u8 *src)
__blowfish_enc_blk(ctx, dst, src, false);
}
static inline void blowfish_enc_blk_xor(struct bf_ctx *ctx, u8 *dst,
const u8 *src)
{
__blowfish_enc_blk(ctx, dst, src, true);
}
static inline void blowfish_enc_blk_4way(struct bf_ctx *ctx, u8 *dst,
const u8 *src)
{
__blowfish_enc_blk_4way(ctx, dst, src, false);
}
static inline void blowfish_enc_blk_xor_4way(struct bf_ctx *ctx, u8 *dst,
const u8 *src)
{
__blowfish_enc_blk_4way(ctx, dst, src, true);
}
static void blowfish_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
{
blowfish_enc_blk(crypto_tfm_ctx(tfm), dst, src);
......
......@@ -45,14 +45,6 @@ static inline void des3_ede_dec_blk(struct des3_ede_x86_ctx *ctx, u8 *dst,
des3_ede_x86_64_crypt_blk(dec_ctx, dst, src);
}
static inline void des3_ede_enc_blk_3way(struct des3_ede_x86_ctx *ctx, u8 *dst,
const u8 *src)
{
u32 *enc_ctx = ctx->enc.expkey;
des3_ede_x86_64_crypt_blk_3way(enc_ctx, dst, src);
}
static inline void des3_ede_dec_blk_3way(struct des3_ede_x86_ctx *ctx, u8 *dst,
const u8 *src)
{
......
This diff is collapsed.
/* SPDX-License-Identifier: GPL-2.0-or-later */
/*
* SM3 Secure Hash Algorithm, AVX assembler accelerated.
* specified in: https://datatracker.ietf.org/doc/html/draft-sca-cfrg-sm3-02
*
* Copyright (C) 2021 Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
*/
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <crypto/internal/hash.h>
#include <crypto/internal/simd.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/types.h>
#include <crypto/sm3.h>
#include <crypto/sm3_base.h>
#include <asm/simd.h>
asmlinkage void sm3_transform_avx(struct sm3_state *state,
const u8 *data, int nblocks);
static int sm3_avx_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
struct sm3_state *sctx = shash_desc_ctx(desc);
if (!crypto_simd_usable() ||
(sctx->count % SM3_BLOCK_SIZE) + len < SM3_BLOCK_SIZE) {
sm3_update(sctx, data, len);
return 0;
}
/*
* Make sure struct sm3_state begins directly with the SM3
* 256-bit internal state, as this is what the asm functions expect.
*/
BUILD_BUG_ON(offsetof(struct sm3_state, state) != 0);
kernel_fpu_begin();
sm3_base_do_update(desc, data, len, sm3_transform_avx);
kernel_fpu_end();
return 0;
}
static int sm3_avx_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
{
if (!crypto_simd_usable()) {
struct sm3_state *sctx = shash_desc_ctx(desc);
if (len)
sm3_update(sctx, data, len);
sm3_final(sctx, out);
return 0;
}
kernel_fpu_begin();
if (len)
sm3_base_do_update(desc, data, len, sm3_transform_avx);
sm3_base_do_finalize(desc, sm3_transform_avx);
kernel_fpu_end();
return sm3_base_finish(desc, out);
}
static int sm3_avx_final(struct shash_desc *desc, u8 *out)
{
if (!crypto_simd_usable()) {
sm3_final(shash_desc_ctx(desc), out);
return 0;
}
kernel_fpu_begin();
sm3_base_do_finalize(desc, sm3_transform_avx);
kernel_fpu_end();
return sm3_base_finish(desc, out);
}
static struct shash_alg sm3_avx_alg = {
.digestsize = SM3_DIGEST_SIZE,
.init = sm3_base_init,
.update = sm3_avx_update,
.final = sm3_avx_final,
.finup = sm3_avx_finup,
.descsize = sizeof(struct sm3_state),
.base = {
.cra_name = "sm3",
.cra_driver_name = "sm3-avx",
.cra_priority = 300,
.cra_blocksize = SM3_BLOCK_SIZE,
.cra_module = THIS_MODULE,
}
};
static int __init sm3_avx_mod_init(void)
{
const char *feature_name;
if (!boot_cpu_has(X86_FEATURE_AVX)) {
pr_info("AVX instruction are not detected.\n");
return -ENODEV;
}
if (!boot_cpu_has(X86_FEATURE_BMI2)) {
pr_info("BMI2 instruction are not detected.\n");
return -ENODEV;
}
if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM,
&feature_name)) {
pr_info("CPU feature '%s' is not supported.\n", feature_name);
return -ENODEV;
}
return crypto_register_shash(&sm3_avx_alg);
}
static void __exit sm3_avx_mod_exit(void)
{
crypto_unregister_shash(&sm3_avx_alg);
}
module_init(sm3_avx_mod_init);
module_exit(sm3_avx_mod_exit);
MODULE_LICENSE("GPL v2");
MODULE_AUTHOR("Tianjia Zhang <tianjia.zhang@linux.alibaba.com>");
MODULE_DESCRIPTION("SM3 Secure Hash Algorithm, AVX assembler accelerated");
MODULE_ALIAS_CRYPTO("sm3");
MODULE_ALIAS_CRYPTO("sm3-avx");
......@@ -57,7 +57,8 @@
op(i + 3, 3)
static void
xor_sse_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
xor_sse_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{
unsigned long lines = bytes >> 8;
......@@ -108,7 +109,8 @@ xor_sse_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
}
static void
xor_sse_2_pf64(unsigned long bytes, unsigned long *p1, unsigned long *p2)
xor_sse_2_pf64(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{
unsigned long lines = bytes >> 8;
......@@ -142,8 +144,9 @@ xor_sse_2_pf64(unsigned long bytes, unsigned long *p1, unsigned long *p2)
}
static void
xor_sse_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3)
xor_sse_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{
unsigned long lines = bytes >> 8;
......@@ -201,8 +204,9 @@ xor_sse_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
xor_sse_3_pf64(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3)
xor_sse_3_pf64(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{
unsigned long lines = bytes >> 8;
......@@ -238,8 +242,10 @@ xor_sse_3_pf64(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
xor_sse_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4)
xor_sse_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{
unsigned long lines = bytes >> 8;
......@@ -304,8 +310,10 @@ xor_sse_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
xor_sse_4_pf64(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4)
xor_sse_4_pf64(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{
unsigned long lines = bytes >> 8;
......@@ -343,8 +351,11 @@ xor_sse_4_pf64(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
xor_sse_5(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4, unsigned long *p5)
xor_sse_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{
unsigned long lines = bytes >> 8;
......@@ -416,8 +427,11 @@ xor_sse_5(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
xor_sse_5_pf64(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4, unsigned long *p5)
xor_sse_5_pf64(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{
unsigned long lines = bytes >> 8;
......
......@@ -21,7 +21,8 @@
#include <asm/fpu/api.h>
static void
xor_pII_mmx_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
xor_pII_mmx_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{
unsigned long lines = bytes >> 7;
......@@ -64,8 +65,9 @@ xor_pII_mmx_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
}
static void
xor_pII_mmx_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3)
xor_pII_mmx_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{
unsigned long lines = bytes >> 7;
......@@ -113,8 +115,10 @@ xor_pII_mmx_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
xor_pII_mmx_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4)
xor_pII_mmx_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{
unsigned long lines = bytes >> 7;
......@@ -168,8 +172,11 @@ xor_pII_mmx_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
static void
xor_pII_mmx_5(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4, unsigned long *p5)
xor_pII_mmx_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{
unsigned long lines = bytes >> 7;
......@@ -248,7 +255,8 @@ xor_pII_mmx_5(unsigned long bytes, unsigned long *p1, unsigned long *p2,
#undef BLOCK
static void
xor_p5_mmx_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
xor_p5_mmx_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{
unsigned long lines = bytes >> 6;
......@@ -295,8 +303,9 @@ xor_p5_mmx_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
}
static void
xor_p5_mmx_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3)
xor_p5_mmx_3(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{
unsigned long lines = bytes >> 6;
......@@ -352,8 +361,10 @@ xor_p5_mmx_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
xor_p5_mmx_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4)
xor_p5_mmx_4(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{
unsigned long lines = bytes >> 6;
......@@ -418,8 +429,11 @@ xor_p5_mmx_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
}
static void
xor_p5_mmx_5(unsigned long bytes, unsigned long *p1, unsigned long *p2,
unsigned long *p3, unsigned long *p4, unsigned long *p5)
xor_p5_mmx_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{
unsigned long lines = bytes >> 6;
......
......@@ -26,7 +26,8 @@
BLOCK4(8) \
BLOCK4(12)
static void xor_avx_2(unsigned long bytes, unsigned long *p0, unsigned long *p1)
static void xor_avx_2(unsigned long bytes, unsigned long * __restrict p0,
const unsigned long * __restrict p1)
{
unsigned long lines = bytes >> 9;
......@@ -52,8 +53,9 @@ do { \
kernel_fpu_end();
}
static void xor_avx_3(unsigned long bytes, unsigned long *p0, unsigned long *p1,
unsigned long *p2)
static void xor_avx_3(unsigned long bytes, unsigned long * __restrict p0,
const unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{
unsigned long lines = bytes >> 9;
......@@ -82,8 +84,10 @@ do { \
kernel_fpu_end();
}
static void xor_avx_4(unsigned long bytes, unsigned long *p0, unsigned long *p1,
unsigned long *p2, unsigned long *p3)
static void xor_avx_4(unsigned long bytes, unsigned long * __restrict p0,
const unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{
unsigned long lines = bytes >> 9;
......@@ -115,8 +119,11 @@ do { \
kernel_fpu_end();
}
static void xor_avx_5(unsigned long bytes, unsigned long *p0, unsigned long *p1,
unsigned long *p2, unsigned long *p3, unsigned long *p4)
static void xor_avx_5(unsigned long bytes, unsigned long * __restrict p0,
const unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{
unsigned long lines = bytes >> 9;
......
......@@ -231,6 +231,13 @@ config CRYPTO_DH
help
Generic implementation of the Diffie-Hellman algorithm.
config CRYPTO_DH_RFC7919_GROUPS
bool "Support for RFC 7919 FFDHE group parameters"
depends on CRYPTO_DH
select CRYPTO_RNG_DEFAULT
help
Provide support for RFC 7919 FFDHE group parameters. If unsure, say N.
config CRYPTO_ECC
tristate
select CRYPTO_RNG_DEFAULT
......@@ -267,7 +274,7 @@ config CRYPTO_ECRDSA
config CRYPTO_SM2
tristate "SM2 algorithm"
select CRYPTO_SM3
select CRYPTO_LIB_SM3
select CRYPTO_AKCIPHER
select CRYPTO_MANAGER
select MPILIB
......@@ -425,6 +432,7 @@ config CRYPTO_LRW
select CRYPTO_SKCIPHER
select CRYPTO_MANAGER
select CRYPTO_GF128MUL
select CRYPTO_ECB
help
LRW: Liskov Rivest Wagner, a tweakable, non malleable, non movable
narrow block cipher mode for dm-crypt. Use it with cipher
......@@ -999,6 +1007,7 @@ config CRYPTO_SHA3
config CRYPTO_SM3
tristate "SM3 digest algorithm"
select CRYPTO_HASH
select CRYPTO_LIB_SM3
help
SM3 secure hash function as defined by OSCCA GM/T 0004-2012 SM3).
It is part of the Chinese Commercial Cryptography suite.
......@@ -1007,6 +1016,19 @@ config CRYPTO_SM3
http://www.oscca.gov.cn/UpFile/20101222141857786.pdf
https://datatracker.ietf.org/doc/html/draft-shen-sm3-hash
config CRYPTO_SM3_AVX_X86_64
tristate "SM3 digest algorithm (x86_64/AVX)"
depends on X86 && 64BIT
select CRYPTO_HASH
select CRYPTO_LIB_SM3
help
SM3 secure hash function as defined by OSCCA GM/T 0004-2012 SM3).
It is part of the Chinese Commercial Cryptography suite. This is
SM3 optimized implementation using Advanced Vector Extensions (AVX)
when available.
If unsure, say N.
config CRYPTO_STREEBOG
tristate "Streebog Hash Function"
select CRYPTO_HASH
......@@ -1847,6 +1869,7 @@ config CRYPTO_JITTERENTROPY
config CRYPTO_KDF800108_CTR
tristate
select CRYPTO_HMAC
select CRYPTO_SHA256
config CRYPTO_USER_API
......
......@@ -6,6 +6,7 @@
*/
#include <crypto/algapi.h>
#include <crypto/internal/simd.h>
#include <linux/err.h>
#include <linux/errno.h>
#include <linux/fips.h>
......@@ -21,6 +22,11 @@
static LIST_HEAD(crypto_template_list);
#ifdef CONFIG_CRYPTO_MANAGER_EXTRA_TESTS
DEFINE_PER_CPU(bool, crypto_simd_disabled_for_test);
EXPORT_PER_CPU_SYMBOL_GPL(crypto_simd_disabled_for_test);
#endif
static inline void crypto_check_module_sig(struct module *mod)
{
if (fips_enabled && mod && !module_sig_ok(mod))
......@@ -322,9 +328,17 @@ void crypto_alg_tested(const char *name, int err)
found:
q->cra_flags |= CRYPTO_ALG_DEAD;
alg = test->adult;
if (err || list_empty(&alg->cra_list))
if (list_empty(&alg->cra_list))
goto complete;
if (err == -ECANCELED)
alg->cra_flags |= CRYPTO_ALG_FIPS_INTERNAL;
else if (err)
goto complete;
else
alg->cra_flags &= ~CRYPTO_ALG_FIPS_INTERNAL;
alg->cra_flags |= CRYPTO_ALG_TESTED;
/* Only satisfy larval waiters if we are the best. */
......@@ -604,6 +618,7 @@ int crypto_register_instance(struct crypto_template *tmpl,
{
struct crypto_larval *larval;
struct crypto_spawn *spawn;
u32 fips_internal = 0;
int err;
err = crypto_check_alg(&inst->alg);
......@@ -626,11 +641,15 @@ int crypto_register_instance(struct crypto_template *tmpl,
spawn->inst = inst;
spawn->registered = true;
fips_internal |= spawn->alg->cra_flags;
crypto_mod_put(spawn->alg);
spawn = next;
}
inst->alg.cra_flags |= (fips_internal & CRYPTO_ALG_FIPS_INTERNAL);
larval = __crypto_register_alg(&inst->alg);
if (IS_ERR(larval))
goto unlock;
......@@ -683,7 +702,8 @@ int crypto_grab_spawn(struct crypto_spawn *spawn, struct crypto_instance *inst,
if (IS_ERR(name))
return PTR_ERR(name);
alg = crypto_find_alg(name, spawn->frontend, type, mask);
alg = crypto_find_alg(name, spawn->frontend,
type | CRYPTO_ALG_FIPS_INTERNAL, mask);
if (IS_ERR(alg))
return PTR_ERR(alg);
......@@ -1002,7 +1022,13 @@ void __crypto_xor(u8 *dst, const u8 *src1, const u8 *src2, unsigned int len)
}
while (IS_ENABLED(CONFIG_64BIT) && len >= 8 && !(relalign & 7)) {
*(u64 *)dst = *(u64 *)src1 ^ *(u64 *)src2;
if (IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)) {
u64 l = get_unaligned((u64 *)src1) ^
get_unaligned((u64 *)src2);
put_unaligned(l, (u64 *)dst);
} else {
*(u64 *)dst = *(u64 *)src1 ^ *(u64 *)src2;
}
dst += 8;
src1 += 8;
src2 += 8;
......@@ -1010,7 +1036,13 @@ void __crypto_xor(u8 *dst, const u8 *src1, const u8 *src2, unsigned int len)
}
while (len >= 4 && !(relalign & 3)) {
*(u32 *)dst = *(u32 *)src1 ^ *(u32 *)src2;
if (IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)) {
u32 l = get_unaligned((u32 *)src1) ^
get_unaligned((u32 *)src2);
put_unaligned(l, (u32 *)dst);
} else {
*(u32 *)dst = *(u32 *)src1 ^ *(u32 *)src2;
}
dst += 4;
src1 += 4;
src2 += 4;
......@@ -1018,7 +1050,13 @@ void __crypto_xor(u8 *dst, const u8 *src1, const u8 *src2, unsigned int len)
}
while (len >= 2 && !(relalign & 1)) {
*(u16 *)dst = *(u16 *)src1 ^ *(u16 *)src2;
if (IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)) {
u16 l = get_unaligned((u16 *)src1) ^
get_unaligned((u16 *)src2);
put_unaligned(l, (u16 *)dst);
} else {
*(u16 *)dst = *(u16 *)src1 ^ *(u16 *)src2;
}
dst += 2;
src1 += 2;
src2 += 2;
......
......@@ -223,6 +223,8 @@ static struct crypto_alg *crypto_larval_wait(struct crypto_alg *alg)
else if (crypto_is_test_larval(larval) &&
!(alg->cra_flags & CRYPTO_ALG_TESTED))
alg = ERR_PTR(-EAGAIN);
else if (alg->cra_flags & CRYPTO_ALG_FIPS_INTERNAL)
alg = ERR_PTR(-EAGAIN);
else if (!crypto_mod_get(alg))
alg = ERR_PTR(-EAGAIN);
crypto_mod_put(&larval->alg);
......@@ -233,6 +235,7 @@ static struct crypto_alg *crypto_larval_wait(struct crypto_alg *alg)
static struct crypto_alg *crypto_alg_lookup(const char *name, u32 type,
u32 mask)
{
const u32 fips = CRYPTO_ALG_FIPS_INTERNAL;
struct crypto_alg *alg;
u32 test = 0;
......@@ -240,8 +243,20 @@ static struct crypto_alg *crypto_alg_lookup(const char *name, u32 type,
test |= CRYPTO_ALG_TESTED;
down_read(&crypto_alg_sem);
alg = __crypto_alg_lookup(name, type | test, mask | test);
if (!alg && test) {
alg = __crypto_alg_lookup(name, (type | test) & ~fips,
(mask | test) & ~fips);
if (alg) {
if (((type | mask) ^ fips) & fips)
mask |= fips;
mask &= fips;
if (!crypto_is_larval(alg) &&
((type ^ alg->cra_flags) & mask)) {
/* Algorithm is disallowed in FIPS mode. */
crypto_mod_put(alg);
alg = ERR_PTR(-ENOENT);
}
} else if (test) {
alg = __crypto_alg_lookup(name, type, mask);
if (alg && !crypto_is_larval(alg)) {
/* Test failed */
......
......@@ -35,7 +35,7 @@ void public_key_signature_free(struct public_key_signature *sig)
EXPORT_SYMBOL_GPL(public_key_signature_free);
/**
* query_asymmetric_key - Get information about an aymmetric key.
* query_asymmetric_key - Get information about an asymmetric key.
* @params: Various parameters.
* @info: Where to put the information.
*/
......
......@@ -22,7 +22,7 @@ struct x509_certificate {
time64_t valid_to;
const void *tbs; /* Signed data */
unsigned tbs_size; /* Size of signed data */
unsigned raw_sig_size; /* Size of sigature */
unsigned raw_sig_size; /* Size of signature */
const void *raw_sig; /* Signature data */
const void *raw_serial; /* Raw serial number in ASN.1 */
unsigned raw_serial_size;
......
......@@ -170,8 +170,8 @@ dma_xor_aligned_offsets(struct dma_device *device, unsigned int offset,
*
* xor_blocks always uses the dest as a source so the
* ASYNC_TX_XOR_ZERO_DST flag must be set to not include dest data in
* the calculation. The assumption with dma eninges is that they only
* use the destination buffer as a source when it is explicity specified
* the calculation. The assumption with dma engines is that they only
* use the destination buffer as a source when it is explicitly specified
* in the source list.
*
* src_list note: if the dest is also a source it must be at index zero.
......@@ -261,8 +261,8 @@ EXPORT_SYMBOL_GPL(async_xor_offs);
*
* xor_blocks always uses the dest as a source so the
* ASYNC_TX_XOR_ZERO_DST flag must be set to not include dest data in
* the calculation. The assumption with dma eninges is that they only
* use the destination buffer as a source when it is explicity specified
* the calculation. The assumption with dma engines is that they only
* use the destination buffer as a source when it is explicitly specified
* in the source list.
*
* src_list note: if the dest is also a source it must be at index zero.
......
......@@ -217,7 +217,7 @@ static int raid6_test(void)
err += test(12, &tests);
}
/* the 24 disk case is special for ioatdma as it is the boudary point
/* the 24 disk case is special for ioatdma as it is the boundary point
* at which it needs to switch from 8-source ops to 16-source
* ops for continuation (assumes DMA_HAS_PQ_CONTINUE is not set)
*/
......@@ -241,7 +241,7 @@ static void raid6_test_exit(void)
}
/* when compiled-in wait for drivers to load first (assumes dma drivers
* are also compliled-in)
* are also compiled-in)
*/
late_initcall(raid6_test);
module_exit(raid6_test_exit);
......
......@@ -253,7 +253,7 @@ static int crypto_authenc_decrypt_tail(struct aead_request *req,
dst = scatterwalk_ffwd(areq_ctx->dst, req->dst, req->assoclen);
skcipher_request_set_tfm(skreq, ctx->enc);
skcipher_request_set_callback(skreq, aead_request_flags(req),
skcipher_request_set_callback(skreq, flags,
req->base.complete, req->base.data);
skcipher_request_set_crypt(skreq, src, dst,
req->cryptlen - authsize, req->iv);
......
//SPDX-License-Identifier: GPL-2.0
// SPDX-License-Identifier: GPL-2.0
/*
* CFB: Cipher FeedBack mode
*
......
......@@ -53,6 +53,7 @@ static void crypto_finalize_request(struct crypto_engine *engine,
dev_err(engine->dev, "failed to unprepare request\n");
}
}
lockdep_assert_in_softirq();
req->complete(req, err);
kthread_queue_work(engine->kworker, &engine->pump_requests);
......
This diff is collapsed.
......@@ -10,7 +10,7 @@
#include <crypto/dh.h>
#include <crypto/kpp.h>
#define DH_KPP_SECRET_MIN_SIZE (sizeof(struct kpp_secret) + 4 * sizeof(int))
#define DH_KPP_SECRET_MIN_SIZE (sizeof(struct kpp_secret) + 3 * sizeof(int))
static inline u8 *dh_pack_data(u8 *dst, u8 *end, const void *src, size_t size)
{
......@@ -28,7 +28,7 @@ static inline const u8 *dh_unpack_data(void *dst, const void *src, size_t size)
static inline unsigned int dh_data_size(const struct dh *p)
{
return p->key_size + p->p_size + p->q_size + p->g_size;
return p->key_size + p->p_size + p->g_size;
}
unsigned int crypto_dh_key_len(const struct dh *p)
......@@ -53,11 +53,9 @@ int crypto_dh_encode_key(char *buf, unsigned int len, const struct dh *params)
ptr = dh_pack_data(ptr, end, &params->key_size,
sizeof(params->key_size));
ptr = dh_pack_data(ptr, end, &params->p_size, sizeof(params->p_size));
ptr = dh_pack_data(ptr, end, &params->q_size, sizeof(params->q_size));
ptr = dh_pack_data(ptr, end, &params->g_size, sizeof(params->g_size));
ptr = dh_pack_data(ptr, end, params->key, params->key_size);
ptr = dh_pack_data(ptr, end, params->p, params->p_size);
ptr = dh_pack_data(ptr, end, params->q, params->q_size);
ptr = dh_pack_data(ptr, end, params->g, params->g_size);
if (ptr != end)
return -EINVAL;
......@@ -65,7 +63,7 @@ int crypto_dh_encode_key(char *buf, unsigned int len, const struct dh *params)
}
EXPORT_SYMBOL_GPL(crypto_dh_encode_key);
int crypto_dh_decode_key(const char *buf, unsigned int len, struct dh *params)
int __crypto_dh_decode_key(const char *buf, unsigned int len, struct dh *params)
{
const u8 *ptr = buf;
struct kpp_secret secret;
......@@ -79,28 +77,36 @@ int crypto_dh_decode_key(const char *buf, unsigned int len, struct dh *params)
ptr = dh_unpack_data(&params->key_size, ptr, sizeof(params->key_size));
ptr = dh_unpack_data(&params->p_size, ptr, sizeof(params->p_size));
ptr = dh_unpack_data(&params->q_size, ptr, sizeof(params->q_size));
ptr = dh_unpack_data(&params->g_size, ptr, sizeof(params->g_size));
if (secret.len != crypto_dh_key_len(params))
return -EINVAL;
/* Don't allocate memory. Set pointers to data within
* the given buffer
*/
params->key = (void *)ptr;
params->p = (void *)(ptr + params->key_size);
params->g = (void *)(ptr + params->key_size + params->p_size);
return 0;
}
int crypto_dh_decode_key(const char *buf, unsigned int len, struct dh *params)
{
int err;
err = __crypto_dh_decode_key(buf, len, params);
if (err)
return err;
/*
* Don't permit the buffer for 'key' or 'g' to be larger than 'p', since
* some drivers assume otherwise.
*/
if (params->key_size > params->p_size ||
params->g_size > params->p_size || params->q_size > params->p_size)
params->g_size > params->p_size)
return -EINVAL;
/* Don't allocate memory. Set pointers to data within
* the given buffer
*/
params->key = (void *)ptr;
params->p = (void *)(ptr + params->key_size);
params->q = (void *)(ptr + params->key_size + params->p_size);
params->g = (void *)(ptr + params->key_size + params->p_size +
params->q_size);
/*
* Don't permit 'p' to be 0. It's not a prime number, and it's subject
* to corner cases such as 'mod 0' being undefined or
......@@ -109,10 +115,6 @@ int crypto_dh_decode_key(const char *buf, unsigned int len, struct dh *params)
if (memchr_inv(params->p, 0, params->p_size) == NULL)
return -EINVAL;
/* It is permissible to not provide Q. */
if (params->q_size == 0)
params->q = NULL;
return 0;
}
EXPORT_SYMBOL_GPL(crypto_dh_decode_key);
......@@ -15,6 +15,7 @@
#include <crypto/internal/hash.h>
#include <crypto/scatterwalk.h>
#include <linux/err.h>
#include <linux/fips.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
......@@ -51,6 +52,9 @@ static int hmac_setkey(struct crypto_shash *parent,
SHASH_DESC_ON_STACK(shash, hash);
unsigned int i;
if (fips_enabled && (keylen < 112 / 8))
return -EINVAL;
shash->tfm = hash;
if (keylen > bs) {
......
......@@ -68,9 +68,17 @@ static int crypto_kpp_init_tfm(struct crypto_tfm *tfm)
return 0;
}
static void crypto_kpp_free_instance(struct crypto_instance *inst)
{
struct kpp_instance *kpp = kpp_instance(inst);
kpp->free(kpp);
}
static const struct crypto_type crypto_kpp_type = {
.extsize = crypto_alg_extsize,
.init_tfm = crypto_kpp_init_tfm,
.free = crypto_kpp_free_instance,
#ifdef CONFIG_PROC_FS
.show = crypto_kpp_show,
#endif
......@@ -87,6 +95,15 @@ struct crypto_kpp *crypto_alloc_kpp(const char *alg_name, u32 type, u32 mask)
}
EXPORT_SYMBOL_GPL(crypto_alloc_kpp);
int crypto_grab_kpp(struct crypto_kpp_spawn *spawn,
struct crypto_instance *inst,
const char *name, u32 type, u32 mask)
{
spawn->base.frontend = &crypto_kpp_type;
return crypto_grab_spawn(&spawn->base, inst, name, type, mask);
}
EXPORT_SYMBOL_GPL(crypto_grab_kpp);
static void kpp_prepare_alg(struct kpp_alg *alg)
{
struct crypto_alg *base = &alg->base;
......@@ -111,5 +128,17 @@ void crypto_unregister_kpp(struct kpp_alg *alg)
}
EXPORT_SYMBOL_GPL(crypto_unregister_kpp);
int kpp_register_instance(struct crypto_template *tmpl,
struct kpp_instance *inst)
{
if (WARN_ON(!inst->free))
return -EINVAL;
kpp_prepare_alg(&inst->alg);
return crypto_register_instance(tmpl, kpp_crypto_instance(inst));
}
EXPORT_SYMBOL_GPL(kpp_register_instance);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Key-agreement Protocol Primitives");
......@@ -428,3 +428,4 @@ module_exit(lrw_module_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("LRW block cipher mode");
MODULE_ALIAS_CRYPTO("lrw");
MODULE_SOFTDEP("pre: ecb");
......@@ -60,6 +60,7 @@
*/
#include <crypto/algapi.h>
#include <asm/unaligned.h>
#ifndef __HAVE_ARCH_CRYPTO_MEMNEQ
......@@ -71,7 +72,8 @@ __crypto_memneq_generic(const void *a, const void *b, size_t size)
#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
while (size >= sizeof(unsigned long)) {
neq |= *(unsigned long *)a ^ *(unsigned long *)b;
neq |= get_unaligned((unsigned long *)a) ^
get_unaligned((unsigned long *)b);
OPTIMIZER_HIDE_VAR(neq);
a += sizeof(unsigned long);
b += sizeof(unsigned long);
......@@ -95,18 +97,24 @@ static inline unsigned long __crypto_memneq_16(const void *a, const void *b)
#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
if (sizeof(unsigned long) == 8) {
neq |= *(unsigned long *)(a) ^ *(unsigned long *)(b);
neq |= get_unaligned((unsigned long *)a) ^
get_unaligned((unsigned long *)b);
OPTIMIZER_HIDE_VAR(neq);
neq |= *(unsigned long *)(a+8) ^ *(unsigned long *)(b+8);
neq |= get_unaligned((unsigned long *)(a + 8)) ^
get_unaligned((unsigned long *)(b + 8));
OPTIMIZER_HIDE_VAR(neq);
} else if (sizeof(unsigned int) == 4) {
neq |= *(unsigned int *)(a) ^ *(unsigned int *)(b);
neq |= get_unaligned((unsigned int *)a) ^
get_unaligned((unsigned int *)b);
OPTIMIZER_HIDE_VAR(neq);
neq |= *(unsigned int *)(a+4) ^ *(unsigned int *)(b+4);
neq |= get_unaligned((unsigned int *)(a + 4)) ^
get_unaligned((unsigned int *)(b + 4));
OPTIMIZER_HIDE_VAR(neq);
neq |= *(unsigned int *)(a+8) ^ *(unsigned int *)(b+8);
neq |= get_unaligned((unsigned int *)(a + 8)) ^
get_unaligned((unsigned int *)(b + 8));
OPTIMIZER_HIDE_VAR(neq);
neq |= *(unsigned int *)(a+12) ^ *(unsigned int *)(b+12);
neq |= get_unaligned((unsigned int *)(a + 12)) ^
get_unaligned((unsigned int *)(b + 12));
OPTIMIZER_HIDE_VAR(neq);
} else
#endif /* CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS */
......
......@@ -385,15 +385,15 @@ static int pkcs1pad_sign(struct akcipher_request *req)
struct pkcs1pad_inst_ctx *ictx = akcipher_instance_ctx(inst);
const struct rsa_asn1_template *digest_info = ictx->digest_info;
int err;
unsigned int ps_end, digest_size = 0;
unsigned int ps_end, digest_info_size = 0;
if (!ctx->key_size)
return -EINVAL;
if (digest_info)
digest_size = digest_info->size;
digest_info_size = digest_info->size;
if (req->src_len + digest_size > ctx->key_size - 11)
if (req->src_len + digest_info_size > ctx->key_size - 11)
return -EOVERFLOW;
if (req->dst_len < ctx->key_size) {
......@@ -406,7 +406,7 @@ static int pkcs1pad_sign(struct akcipher_request *req)
if (!req_ctx->in_buf)
return -ENOMEM;
ps_end = ctx->key_size - digest_size - req->src_len - 2;
ps_end = ctx->key_size - digest_info_size - req->src_len - 2;
req_ctx->in_buf[0] = 0x01;
memset(req_ctx->in_buf + 1, 0xff, ps_end - 1);
req_ctx->in_buf[ps_end] = 0x00;
......@@ -441,6 +441,8 @@ static int pkcs1pad_verify_complete(struct akcipher_request *req, int err)
struct akcipher_instance *inst = akcipher_alg_instance(tfm);
struct pkcs1pad_inst_ctx *ictx = akcipher_instance_ctx(inst);
const struct rsa_asn1_template *digest_info = ictx->digest_info;
const unsigned int sig_size = req->src_len;
const unsigned int digest_size = req->dst_len;
unsigned int dst_len;
unsigned int pos;
u8 *out_buf;
......@@ -476,6 +478,8 @@ static int pkcs1pad_verify_complete(struct akcipher_request *req, int err)
pos++;
if (digest_info) {
if (digest_info->size > dst_len - pos)
goto done;
if (crypto_memneq(out_buf + pos, digest_info->data,
digest_info->size))
goto done;
......@@ -485,20 +489,19 @@ static int pkcs1pad_verify_complete(struct akcipher_request *req, int err)
err = 0;
if (req->dst_len != dst_len - pos) {
if (digest_size != dst_len - pos) {
err = -EKEYREJECTED;
req->dst_len = dst_len - pos;
goto done;
}
/* Extract appended digest. */
sg_pcopy_to_buffer(req->src,
sg_nents_for_len(req->src,
req->src_len + req->dst_len),
sg_nents_for_len(req->src, sig_size + digest_size),
req_ctx->out_buf + ctx->key_size,
req->dst_len, ctx->key_size);
digest_size, sig_size);
/* Do the actual verification step. */
if (memcmp(req_ctx->out_buf + ctx->key_size, out_buf + pos,
req->dst_len) != 0)
digest_size) != 0)
err = -EKEYREJECTED;
done:
kfree_sensitive(req_ctx->out_buf);
......@@ -534,14 +537,15 @@ static int pkcs1pad_verify(struct akcipher_request *req)
struct crypto_akcipher *tfm = crypto_akcipher_reqtfm(req);
struct pkcs1pad_ctx *ctx = akcipher_tfm_ctx(tfm);
struct pkcs1pad_request *req_ctx = akcipher_request_ctx(req);
const unsigned int sig_size = req->src_len;
const unsigned int digest_size = req->dst_len;
int err;
if (WARN_ON(req->dst) ||
WARN_ON(!req->dst_len) ||
!ctx->key_size || req->src_len < ctx->key_size)
if (WARN_ON(req->dst) || WARN_ON(!digest_size) ||
!ctx->key_size || sig_size != ctx->key_size)
return -EINVAL;
req_ctx->out_buf = kmalloc(ctx->key_size + req->dst_len, GFP_KERNEL);
req_ctx->out_buf = kmalloc(ctx->key_size + digest_size, GFP_KERNEL);
if (!req_ctx->out_buf)
return -ENOMEM;
......@@ -554,8 +558,7 @@ static int pkcs1pad_verify(struct akcipher_request *req)
/* Reuse input buffer, output to a new buffer */
akcipher_request_set_crypt(&req_ctx->child_req, req->src,
req_ctx->out_sg, req->src_len,
ctx->key_size);
req_ctx->out_sg, sig_size, ctx->key_size);
err = crypto_akcipher_encrypt(&req_ctx->child_req);
if (err != -EINPROGRESS && err != -EBUSY)
......@@ -621,6 +624,11 @@ static int pkcs1pad_create(struct crypto_template *tmpl, struct rtattr **tb)
rsa_alg = crypto_spawn_akcipher_alg(&ctx->spawn);
if (strcmp(rsa_alg->base.cra_name, "rsa") != 0) {
err = -EINVAL;
goto err_free_inst;
}
err = -ENAMETOOLONG;
hash_name = crypto_attr_alg_name(tb[2]);
if (IS_ERR(hash_name)) {
......
/* SPDX-License-Identifier: GPL-2.0-or-later */
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* SM2 asymmetric public-key algorithm
* as specified by OSCCA GM/T 0003.1-2012 -- 0003.5-2012 SM2 and
......@@ -13,7 +13,7 @@
#include <crypto/internal/akcipher.h>
#include <crypto/akcipher.h>
#include <crypto/hash.h>
#include <crypto/sm3_base.h>
#include <crypto/sm3.h>
#include <crypto/rng.h>
#include <crypto/sm2.h>
#include "sm2signature.asn1.h"
......@@ -213,7 +213,7 @@ int sm2_get_signature_s(void *context, size_t hdrlen, unsigned char tag,
return 0;
}
static int sm2_z_digest_update(struct shash_desc *desc,
static int sm2_z_digest_update(struct sm3_state *sctx,
MPI m, unsigned int pbytes)
{
static const unsigned char zero[32];
......@@ -226,20 +226,20 @@ static int sm2_z_digest_update(struct shash_desc *desc,
if (inlen < pbytes) {
/* padding with zero */
crypto_sm3_update(desc, zero, pbytes - inlen);
crypto_sm3_update(desc, in, inlen);
sm3_update(sctx, zero, pbytes - inlen);
sm3_update(sctx, in, inlen);
} else if (inlen > pbytes) {
/* skip the starting zero */
crypto_sm3_update(desc, in + inlen - pbytes, pbytes);
sm3_update(sctx, in + inlen - pbytes, pbytes);
} else {
crypto_sm3_update(desc, in, inlen);
sm3_update(sctx, in, inlen);
}
kfree(in);
return 0;
}
static int sm2_z_digest_update_point(struct shash_desc *desc,
static int sm2_z_digest_update_point(struct sm3_state *sctx,
MPI_POINT point, struct mpi_ec_ctx *ec, unsigned int pbytes)
{
MPI x, y;
......@@ -249,8 +249,8 @@ static int sm2_z_digest_update_point(struct shash_desc *desc,
y = mpi_new(0);
if (!mpi_ec_get_affine(x, y, point, ec) &&
!sm2_z_digest_update(desc, x, pbytes) &&
!sm2_z_digest_update(desc, y, pbytes))
!sm2_z_digest_update(sctx, x, pbytes) &&
!sm2_z_digest_update(sctx, y, pbytes))
ret = 0;
mpi_free(x);
......@@ -265,7 +265,7 @@ int sm2_compute_z_digest(struct crypto_akcipher *tfm,
struct mpi_ec_ctx *ec = akcipher_tfm_ctx(tfm);
uint16_t bits_len;
unsigned char entl[2];
SHASH_DESC_ON_STACK(desc, NULL);
struct sm3_state sctx;
unsigned int pbytes;
if (id_len > (USHRT_MAX / 8) || !ec->Q)
......@@ -278,17 +278,17 @@ int sm2_compute_z_digest(struct crypto_akcipher *tfm,
pbytes = MPI_NBYTES(ec->p);
/* ZA = H256(ENTLA | IDA | a | b | xG | yG | xA | yA) */
sm3_base_init(desc);
crypto_sm3_update(desc, entl, 2);
crypto_sm3_update(desc, id, id_len);
if (sm2_z_digest_update(desc, ec->a, pbytes) ||
sm2_z_digest_update(desc, ec->b, pbytes) ||
sm2_z_digest_update_point(desc, ec->G, ec, pbytes) ||
sm2_z_digest_update_point(desc, ec->Q, ec, pbytes))
sm3_init(&sctx);
sm3_update(&sctx, entl, 2);
sm3_update(&sctx, id, id_len);
if (sm2_z_digest_update(&sctx, ec->a, pbytes) ||
sm2_z_digest_update(&sctx, ec->b, pbytes) ||
sm2_z_digest_update_point(&sctx, ec->G, ec, pbytes) ||
sm2_z_digest_update_point(&sctx, ec->Q, ec, pbytes))
return -EINVAL;
crypto_sm3_final(desc, dgst);
sm3_final(&sctx, dgst);
return 0;
}
EXPORT_SYMBOL(sm2_compute_z_digest);
......
......@@ -5,6 +5,7 @@
*
* Copyright (C) 2017 ARM Limited or its affiliates.
* Written by Gilad Ben-Yossef <gilad@benyossef.com>
* Copyright (C) 2021 Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
*/
#include <crypto/internal/hash.h>
......@@ -26,143 +27,29 @@ const u8 sm3_zero_message_hash[SM3_DIGEST_SIZE] = {
};
EXPORT_SYMBOL_GPL(sm3_zero_message_hash);
static inline u32 p0(u32 x)
{
return x ^ rol32(x, 9) ^ rol32(x, 17);
}
static inline u32 p1(u32 x)
{
return x ^ rol32(x, 15) ^ rol32(x, 23);
}
static inline u32 ff(unsigned int n, u32 a, u32 b, u32 c)
{
return (n < 16) ? (a ^ b ^ c) : ((a & b) | (a & c) | (b & c));
}
static inline u32 gg(unsigned int n, u32 e, u32 f, u32 g)
{
return (n < 16) ? (e ^ f ^ g) : ((e & f) | ((~e) & g));
}
static inline u32 t(unsigned int n)
{
return (n < 16) ? SM3_T1 : SM3_T2;
}
static void sm3_expand(u32 *t, u32 *w, u32 *wt)
{
int i;
unsigned int tmp;
/* load the input */
for (i = 0; i <= 15; i++)
w[i] = get_unaligned_be32((__u32 *)t + i);
for (i = 16; i <= 67; i++) {
tmp = w[i - 16] ^ w[i - 9] ^ rol32(w[i - 3], 15);
w[i] = p1(tmp) ^ (rol32(w[i - 13], 7)) ^ w[i - 6];
}
for (i = 0; i <= 63; i++)
wt[i] = w[i] ^ w[i + 4];
}
static void sm3_compress(u32 *w, u32 *wt, u32 *m)
{
u32 ss1;
u32 ss2;
u32 tt1;
u32 tt2;
u32 a, b, c, d, e, f, g, h;
int i;
a = m[0];
b = m[1];
c = m[2];
d = m[3];
e = m[4];
f = m[5];
g = m[6];
h = m[7];
for (i = 0; i <= 63; i++) {
ss1 = rol32((rol32(a, 12) + e + rol32(t(i), i & 31)), 7);
ss2 = ss1 ^ rol32(a, 12);
tt1 = ff(i, a, b, c) + d + ss2 + *wt;
wt++;
tt2 = gg(i, e, f, g) + h + ss1 + *w;
w++;
d = c;
c = rol32(b, 9);
b = a;
a = tt1;
h = g;
g = rol32(f, 19);
f = e;
e = p0(tt2);
}
m[0] = a ^ m[0];
m[1] = b ^ m[1];
m[2] = c ^ m[2];
m[3] = d ^ m[3];
m[4] = e ^ m[4];
m[5] = f ^ m[5];
m[6] = g ^ m[6];
m[7] = h ^ m[7];
a = b = c = d = e = f = g = h = ss1 = ss2 = tt1 = tt2 = 0;
}
static void sm3_transform(struct sm3_state *sst, u8 const *src)
{
unsigned int w[68];
unsigned int wt[64];
sm3_expand((u32 *)src, w, wt);
sm3_compress(w, wt, sst->state);
memzero_explicit(w, sizeof(w));
memzero_explicit(wt, sizeof(wt));
}
static void sm3_generic_block_fn(struct sm3_state *sst, u8 const *src,
int blocks)
{
while (blocks--) {
sm3_transform(sst, src);
src += SM3_BLOCK_SIZE;
}
}
int crypto_sm3_update(struct shash_desc *desc, const u8 *data,
static int crypto_sm3_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
return sm3_base_do_update(desc, data, len, sm3_generic_block_fn);
sm3_update(shash_desc_ctx(desc), data, len);
return 0;
}
EXPORT_SYMBOL(crypto_sm3_update);
int crypto_sm3_final(struct shash_desc *desc, u8 *out)
static int crypto_sm3_final(struct shash_desc *desc, u8 *out)
{
sm3_base_do_finalize(desc, sm3_generic_block_fn);
return sm3_base_finish(desc, out);
sm3_final(shash_desc_ctx(desc), out);
return 0;
}
EXPORT_SYMBOL(crypto_sm3_final);
int crypto_sm3_finup(struct shash_desc *desc, const u8 *data,
static int crypto_sm3_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *hash)
{
sm3_base_do_update(desc, data, len, sm3_generic_block_fn);
return crypto_sm3_final(desc, hash);
struct sm3_state *sctx = shash_desc_ctx(desc);
if (len)
sm3_update(sctx, data, len);
sm3_final(sctx, hash);
return 0;
}
EXPORT_SYMBOL(crypto_sm3_finup);
static struct shash_alg sm3_alg = {
.digestsize = SM3_DIGEST_SIZE,
......@@ -174,6 +61,7 @@ static struct shash_alg sm3_alg = {
.base = {
.cra_name = "sm3",
.cra_driver_name = "sm3-generic",
.cra_priority = 100,
.cra_blocksize = SM3_BLOCK_SIZE,
.cra_module = THIS_MODULE,
}
......
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
......@@ -466,3 +466,4 @@ MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("XTS block cipher mode");
MODULE_ALIAS_CRYPTO("xts");
MODULE_IMPORT_NS(CRYPTO_INTERNAL);
MODULE_SOFTDEP("pre: ecb");
......@@ -401,7 +401,7 @@ config HW_RANDOM_MESON
config HW_RANDOM_CAVIUM
tristate "Cavium ThunderX Random Number Generator support"
depends on HW_RANDOM && PCI && ARM64
depends on HW_RANDOM && PCI && ARCH_THUNDER
default HW_RANDOM
help
This driver provides kernel-side support for the Random Number
......
This diff is collapsed.
......@@ -179,7 +179,7 @@ static int cavium_map_pf_regs(struct cavium_rng *rng)
pdev = pci_get_device(PCI_VENDOR_ID_CAVIUM,
PCI_DEVID_CAVIUM_RNG_PF, NULL);
if (!pdev) {
dev_err(&pdev->dev, "Cannot find RNG PF device\n");
pr_err("Cannot find RNG PF device\n");
return -EIO;
}
......
This diff is collapsed.
......@@ -65,14 +65,14 @@ static int nmk_rng_probe(struct amba_device *dev, const struct amba_id *id)
out_release:
amba_release_regions(dev);
out_clk:
clk_disable(rng_clk);
clk_disable_unprepare(rng_clk);
return ret;
}
static void nmk_rng_remove(struct amba_device *dev)
{
amba_release_regions(dev);
clk_disable(rng_clk);
clk_disable_unprepare(rng_clk);
}
static const struct amba_id nmk_rng_ids[] = {
......
This diff is collapsed.
......@@ -47,7 +47,7 @@ obj-$(CONFIG_CRYPTO_DEV_VMX) += vmx/
obj-$(CONFIG_CRYPTO_DEV_BCM_SPU) += bcm/
obj-$(CONFIG_CRYPTO_DEV_SAFEXCEL) += inside-secure/
obj-$(CONFIG_CRYPTO_DEV_ARTPEC6) += axis/
obj-$(CONFIG_CRYPTO_DEV_ZYNQMP_AES) += xilinx/
obj-y += xilinx/
obj-y += hisilicon/
obj-$(CONFIG_CRYPTO_DEV_AMLOGIC_GXL) += amlogic/
obj-y += keembay/
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment