Commit 93e220a6 authored by Linus Torvalds

Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto updates from Herbert Xu:
 "API:
   - hwrng core now credits for low-quality RNG devices.

  Algorithms:
   - Optimisations for neon aes on arm/arm64.
   - Add accelerated crc32_be on arm64.
   - Add ffdheXYZ(dh) templates.
   - Disallow hmac keys < 112 bits in FIPS mode.
   - Add AVX assembly implementation for sm3 on x86.

  Drivers:
   - Add missing local_bh_disable calls for crypto_engine callback.
   - Ensure BH is disabled in crypto_engine callback path.
   - Fix zero length DMA mappings in ccree.
   - Add synchronization between mailbox accesses in octeontx2.
   - Add Xilinx SHA3 driver.
   - Add support for the TDES IP available on sama7g5 SoC in atmel"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (137 commits)
  crypto: xilinx - Turn SHA into a tristate and allow COMPILE_TEST
  MAINTAINERS: update HPRE/SEC2/TRNG driver maintainers list
  crypto: dh - Remove the unused function dh_safe_prime_dh_alg()
  hwrng: nomadik - Change clk_disable to clk_disable_unprepare
  crypto: arm64 - cleanup comments
  crypto: qat - fix initialization of pfvf rts_map_msg structures
  crypto: qat - fix initialization of pfvf cap_msg structures
  crypto: qat - remove unneeded assignment
  crypto: qat - disable registration of algorithms
  crypto: hisilicon/qm - fix memset during queues clearing
  crypto: xilinx: prevent probing on non-xilinx hardware
  crypto: marvell/octeontx - Use swap() instead of open coding it
  crypto: ccree - Fix use after free in cc_cipher_exit()
  crypto: ccp - ccp_dmaengine_unregister release dma channels
  crypto: octeontx2 - fix missing unlock
  hwrng: cavium - fix NULL but dereferenced coccicheck error
  crypto: cavium/nitrox - don't cast parameter in bit operations
  crypto: vmx - add missing dependencies
  MAINTAINERS: Add maintainer for Xilinx ZynqMP SHA3 driver
  crypto: xilinx - Add Xilinx SHA3 driver
  ...
parents 5628b8de 0e03b8fd
What:		/sys/kernel/debug/hisi_sec2/<bdf>/clear_enable
Date:		Oct 2019
Contact:	linux-crypto@vger.kernel.org
Description:	Enabling/disabling of clear action after reading
		the SEC debug registers.
		0: disable, 1: enable.
		Only available for PF, and takes no other effect on SEC.

What:		/sys/kernel/debug/hisi_sec2/<bdf>/current_qm
Date:		Oct 2019
Contact:	linux-crypto@vger.kernel.org
Description:	One SEC controller has one PF and multiple VFs, and each
		function has a QM. This file can be used to select the QM
		that the qm directory below refers to.
		Only available for PF.

What:		/sys/kernel/debug/hisi_sec2/<bdf>/alg_qos
Date:		Jun 2021
Contact:	linux-crypto@vger.kernel.org
Description:	The <bdf> identifies the function (PF or VF). The SEC driver
		supports configuring each function's QoS. In the host, write
		the <bdf> and QoS value to alg_qos, e.g.
		"echo <bdf> value > alg_qos". The QoS value ranges from 1 to
		1000, meaning 1/1000 to 1000/1000 of the total QoS. Reading
		alg_qos returns the function's current QoS in the host and
		the VM, e.g. "cat alg_qos".
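A minimal sketch of the documented alg_qos usage, assuming debugfs is mounted at /sys/kernel/debug; the <bdf> strings and the value 500 below are placeholders, not values taken from this commit:

	# Configure a function's QoS from the host: write "<bdf> <value>"
	# to alg_qos, where <value> is 1..1000 (fraction of the total QoS).
	echo "<bdf> 500" > /sys/kernel/debug/hisi_sec2/<bdf>/alg_qos

	# Read back the currently configured QoS (host or VM).
	cat /sys/kernel/debug/hisi_sec2/<bdf>/alg_qos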
What:		/sys/kernel/debug/hisi_sec2/<bdf>/qm/qm_regs
Date:		Oct 2019
Contact:	linux-crypto@vger.kernel.org
Description:	Dump of QM related debug registers.
		Available for PF and VF in host. VF in guest currently only
		has one debug register.

What:		/sys/kernel/debug/hisi_sec2/<bdf>/qm/current_q
Date:		Oct 2019
Contact:	linux-crypto@vger.kernel.org
Description:	One QM of SEC may contain multiple queues. Select the
		specific queue whose debug registers are shown in the above
		'regs'.
		Only available for PF.

What:		/sys/kernel/debug/hisi_sec2/<bdf>/qm/clear_enable
Date:		Oct 2019
Contact:	linux-crypto@vger.kernel.org
Description:	Enabling/disabling of clear action after reading
		the SEC's QM debug registers.
		0: disable, 1: enable.
		Only available for PF, and takes no other effect on SEC.

What:		/sys/kernel/debug/hisi_sec2/<bdf>/qm/err_irq
Date:		Apr 2020
Contact:	linux-crypto@vger.kernel.org
Description:	Dump the number of invalid interrupts for QM task completion.
		Available for both PF and VF, and takes no other effect on SEC.

What:		/sys/kernel/debug/hisi_sec2/<bdf>/qm/aeq_irq
Date:		Apr 2020
Contact:	linux-crypto@vger.kernel.org
Description:	Dump the number of QM async event queue interrupts.
		Available for both PF and VF, and takes no other effect on SEC.

What:		/sys/kernel/debug/hisi_sec2/<bdf>/qm/abnormal_irq
Date:		Apr 2020
Contact:	linux-crypto@vger.kernel.org
Description:	Dump the number of interrupts for QM abnormal events.
		Available for both PF and VF, and takes no other effect on SEC.

What:		/sys/kernel/debug/hisi_sec2/<bdf>/qm/create_qp_err
Date:		Apr 2020
Contact:	linux-crypto@vger.kernel.org
Description:	Dump the number of queue allocation errors.
		Available for both PF and VF, and takes no other effect on SEC.

What:		/sys/kernel/debug/hisi_sec2/<bdf>/qm/mb_err
Date:		Apr 2020
Contact:	linux-crypto@vger.kernel.org
Description:	Dump the number of failed QM mailbox commands.
		Available for both PF and VF, and takes no other effect on SEC.

What:		/sys/kernel/debug/hisi_sec2/<bdf>/qm/status
Date:		Apr 2020
Contact:	linux-crypto@vger.kernel.org
Description:	Dump the status of the QM.
		Four states: initiated, started, stopped and closed.
		Available for both PF and VF, and takes no other effect on SEC.

What:		/sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/send_cnt
Date:		Apr 2020
Contact:	linux-crypto@vger.kernel.org
Description:	Dump the total number of sent requests.
		Available for both PF and VF, and takes no other effect on SEC.

What:		/sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/recv_cnt
Date:		Apr 2020
Contact:	linux-crypto@vger.kernel.org
Description:	Dump the total number of received requests.
		Available for both PF and VF, and takes no other effect on SEC.

What:		/sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/send_busy_cnt
Date:		Apr 2020
Contact:	linux-crypto@vger.kernel.org
Description:	Dump the total number of sent requests that returned busy.
		Available for both PF and VF, and takes no other effect on SEC.

What:		/sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/err_bd_cnt
Date:		Apr 2020
Contact:	linux-crypto@vger.kernel.org
Description:	Dump the total number of received requests with a BD type
		error.
		Available for both PF and VF, and takes no other effect on SEC.

What:		/sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/invalid_req_cnt
Date:		Apr 2020
Contact:	linux-crypto@vger.kernel.org
Description:	Dump the total number of invalid requests received.
		Available for both PF and VF, and takes no other effect on SEC.

What:		/sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/done_flag_cnt
Date:		Apr 2020
Contact:	linux-crypto@vger.kernel.org
Description:	Dump the total number of received requests that completed
		but were marked as errors.
		Available for both PF and VF, and takes no other effect on SEC.
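The sec_dfx entries above are plain read-only counters, so a snapshot for one function can be taken in a single pass; a sketch, with <bdf> as a placeholder:

	for f in send_cnt recv_cnt send_busy_cnt err_bd_cnt invalid_req_cnt done_flag_cnt; do
		printf '%-16s %s\n' "$f" "$(cat /sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/$f)"
	done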
What:		/sys/kernel/debug/hisi_zip/<bdf>/comp_core[01]/regs
Date:		Nov 2018
Contact:	linux-crypto@vger.kernel.org
Description:	Dump of compression cores related debug registers.
		Only available for PF.

What:		/sys/kernel/debug/hisi_zip/<bdf>/decomp_core[0-5]/regs
Date:		Nov 2018
Contact:	linux-crypto@vger.kernel.org
Description:	Dump of decompression cores related debug registers.
		Only available for PF.

What:		/sys/kernel/debug/hisi_zip/<bdf>/clear_enable
Date:		Nov 2018
Contact:	linux-crypto@vger.kernel.org
Description:	Compression/decompression core debug registers read clear
		control. 1 means enable register read clear, otherwise 0.
		Writing to this file has no functional effect; it only enables
		or disables clearing of the counters after these registers are
		read.
		Only available for PF.

What:		/sys/kernel/debug/hisi_zip/<bdf>/current_qm
Date:		Nov 2018
Contact:	linux-crypto@vger.kernel.org
Description:	One ZIP controller has one PF and multiple VFs, and each
		function has a QM. Select the QM that the qm directory below
		refers to.
		Only available for PF.

What:		/sys/kernel/debug/hisi_zip/<bdf>/alg_qos
Date:		Jun 2021
Contact:	linux-crypto@vger.kernel.org
Description:	The <bdf> identifies the function (PF or VF). The ZIP driver
		supports configuring each function's QoS. In the host, write
		the <bdf> and QoS value to alg_qos, e.g.
		"echo <bdf> value > alg_qos". The QoS value ranges from 1 to
		1000, meaning 1/1000 to 1000/1000 of the total QoS. Reading
		alg_qos returns the function's current QoS in the host and
		the VM, e.g. "cat alg_qos".
What:		/sys/kernel/debug/hisi_zip/<bdf>/qm/regs
Date:		Nov 2018
Contact:	linux-crypto@vger.kernel.org
Description:	Dump of QM related debug registers.
		Available for PF and VF in host. VF in guest currently only
		has one debug register.

What:		/sys/kernel/debug/hisi_zip/<bdf>/qm/current_q
Date:		Nov 2018
Contact:	linux-crypto@vger.kernel.org
Description:	One QM may contain multiple queues. Select the specific queue
		whose debug registers are shown in the above regs.
		Only available for PF.

What:		/sys/kernel/debug/hisi_zip/<bdf>/qm/clear_enable
Date:		Nov 2018
Contact:	linux-crypto@vger.kernel.org
Description:	QM debug registers (regs) read clear control. 1 means enable
		register read clear, otherwise 0.
		Writing to this file has no functional effect; it only enables
		or disables clearing of the counters after these registers are
		read.
		Only available for PF.

What:		/sys/kernel/debug/hisi_zip/<bdf>/qm/err_irq
Date:		Apr 2020
Contact:	linux-crypto@vger.kernel.org
Description:	Dump the number of invalid interrupts for QM task completion.
		Available for both PF and VF, and takes no other effect on ZIP.

What:		/sys/kernel/debug/hisi_zip/<bdf>/qm/aeq_irq
Date:		Apr 2020
Contact:	linux-crypto@vger.kernel.org
Description:	Dump the number of QM async event queue interrupts.
		Available for both PF and VF, and takes no other effect on ZIP.

What:		/sys/kernel/debug/hisi_zip/<bdf>/qm/abnormal_irq
Date:		Apr 2020
Contact:	linux-crypto@vger.kernel.org
Description:	Dump the number of interrupts for QM abnormal events.
		Available for both PF and VF, and takes no other effect on ZIP.

What:		/sys/kernel/debug/hisi_zip/<bdf>/qm/create_qp_err
Date:		Apr 2020
Contact:	linux-crypto@vger.kernel.org
Description:	Dump the number of queue allocation errors.
		Available for both PF and VF, and takes no other effect on ZIP.

What:		/sys/kernel/debug/hisi_zip/<bdf>/qm/mb_err
Date:		Apr 2020
Contact:	linux-crypto@vger.kernel.org
Description:	Dump the number of failed QM mailbox commands.
		Available for both PF and VF, and takes no other effect on ZIP.

What:		/sys/kernel/debug/hisi_zip/<bdf>/qm/status
Date:		Apr 2020
Contact:	linux-crypto@vger.kernel.org
Description:	Dump the status of the QM.
		Four states: initiated, started, stopped and closed.
		Available for both PF and VF, and takes no other effect on ZIP.

What:		/sys/kernel/debug/hisi_zip/<bdf>/zip_dfx/send_cnt
Date:		Apr 2020
Contact:	linux-crypto@vger.kernel.org
Description:	Dump the total number of sent requests.
		Available for both PF and VF, and takes no other effect on ZIP.

What:		/sys/kernel/debug/hisi_zip/<bdf>/zip_dfx/recv_cnt
Date:		Apr 2020
Contact:	linux-crypto@vger.kernel.org
Description:	Dump the total number of received requests.
		Available for both PF and VF, and takes no other effect on ZIP.

What:		/sys/kernel/debug/hisi_zip/<bdf>/zip_dfx/send_busy_cnt
Date:		Apr 2020
Contact:	linux-crypto@vger.kernel.org
Description:	Dump the total number of received requests that returned busy.
		Available for both PF and VF, and takes no other effect on ZIP.

What:		/sys/kernel/debug/hisi_zip/<bdf>/zip_dfx/err_bd_cnt
Date:		Apr 2020
Contact:	linux-crypto@vger.kernel.org
Description:	Dump the total number of received requests with a BD type
		error.
		Available for both PF and VF, and takes no other effect on ZIP.
@@ -8644,7 +8644,7 @@ S:	Maintained
 F:	drivers/gpio/gpio-hisi.c
 
 HISILICON HIGH PERFORMANCE RSA ENGINE DRIVER (HPRE)
-M:	Zaibo Xu <xuzaibo@huawei.com>
+M:	Longfang Liu <liulongfang@huawei.com>
 L:	linux-crypto@vger.kernel.org
 S:	Maintained
 F:	Documentation/ABI/testing/debugfs-hisi-hpre
@@ -8724,8 +8724,8 @@ F:	Documentation/devicetree/bindings/scsi/hisilicon-sas.txt
 F:	drivers/scsi/hisi_sas/
 
 HISILICON SECURITY ENGINE V2 DRIVER (SEC2)
-M:	Zaibo Xu <xuzaibo@huawei.com>
 M:	Kai Ye <yekai13@huawei.com>
+M:	Longfang Liu <liulongfang@huawei.com>
 L:	linux-crypto@vger.kernel.org
 S:	Maintained
 F:	Documentation/ABI/testing/debugfs-hisi-sec
@@ -8756,7 +8756,7 @@ F:	Documentation/devicetree/bindings/mfd/hisilicon,hi6421-spmi-pmic.yaml
 F:	drivers/mfd/hi6421-spmi-pmic.c
 
 HISILICON TRUE RANDOM NUMBER GENERATOR V2 SUPPORT
-M:	Zaibo Xu <xuzaibo@huawei.com>
+M:	Weili Qian <qianweili@huawei.com>
 S:	Maintained
 F:	drivers/crypto/hisilicon/trng/trng.c
@@ -21302,6 +21302,11 @@ T:	git https://github.com/Xilinx/linux-xlnx.git
 F:	Documentation/devicetree/bindings/phy/xlnx,zynqmp-psgtr.yaml
 F:	drivers/phy/xilinx/phy-zynqmp.c
 
+XILINX ZYNQMP SHA3 DRIVER
+M:	Harsha <harsha.harsha@xilinx.com>
+S:	Maintained
+F:	drivers/crypto/xilinx/zynqmp-sha.c
+
 XILINX EVENT MANAGEMENT DRIVER
 M:	Abhyuday Godhasara <abhyuday.godhasara@xilinx.com>
 S:	Maintained
......
@@ -5,24 +5,43 @@
  * Optimized RAID-5 checksumming functions for alpha EV5 and EV6
  */
 
-extern void xor_alpha_2(unsigned long, unsigned long *, unsigned long *);
-extern void xor_alpha_3(unsigned long, unsigned long *, unsigned long *,
-			unsigned long *);
-extern void xor_alpha_4(unsigned long, unsigned long *, unsigned long *,
-			unsigned long *, unsigned long *);
-extern void xor_alpha_5(unsigned long, unsigned long *, unsigned long *,
-			unsigned long *, unsigned long *, unsigned long *);
+extern void
+xor_alpha_2(unsigned long bytes, unsigned long * __restrict p1,
+	    const unsigned long * __restrict p2);
+extern void
+xor_alpha_3(unsigned long bytes, unsigned long * __restrict p1,
+	    const unsigned long * __restrict p2,
+	    const unsigned long * __restrict p3);
+extern void
+xor_alpha_4(unsigned long bytes, unsigned long * __restrict p1,
+	    const unsigned long * __restrict p2,
+	    const unsigned long * __restrict p3,
+	    const unsigned long * __restrict p4);
+extern void
+xor_alpha_5(unsigned long bytes, unsigned long * __restrict p1,
+	    const unsigned long * __restrict p2,
+	    const unsigned long * __restrict p3,
+	    const unsigned long * __restrict p4,
+	    const unsigned long * __restrict p5);
 
-extern void xor_alpha_prefetch_2(unsigned long, unsigned long *,
-				 unsigned long *);
-extern void xor_alpha_prefetch_3(unsigned long, unsigned long *,
-				 unsigned long *, unsigned long *);
-extern void xor_alpha_prefetch_4(unsigned long, unsigned long *,
-				 unsigned long *, unsigned long *,
-				 unsigned long *);
-extern void xor_alpha_prefetch_5(unsigned long, unsigned long *,
-				 unsigned long *, unsigned long *,
-				 unsigned long *, unsigned long *);
+extern void
+xor_alpha_prefetch_2(unsigned long bytes, unsigned long * __restrict p1,
+		     const unsigned long * __restrict p2);
+extern void
+xor_alpha_prefetch_3(unsigned long bytes, unsigned long * __restrict p1,
+		     const unsigned long * __restrict p2,
+		     const unsigned long * __restrict p3);
+extern void
+xor_alpha_prefetch_4(unsigned long bytes, unsigned long * __restrict p1,
+		     const unsigned long * __restrict p2,
+		     const unsigned long * __restrict p3,
+		     const unsigned long * __restrict p4);
+extern void
+xor_alpha_prefetch_5(unsigned long bytes, unsigned long * __restrict p1,
+		     const unsigned long * __restrict p2,
+		     const unsigned long * __restrict p3,
+		     const unsigned long * __restrict p4,
+		     const unsigned long * __restrict p5);
 
 asm("								\n\
 	.text							\n\
......
...@@ -758,29 +758,24 @@ ENTRY(aesbs_cbc_decrypt) ...@@ -758,29 +758,24 @@ ENTRY(aesbs_cbc_decrypt)
ENDPROC(aesbs_cbc_decrypt) ENDPROC(aesbs_cbc_decrypt)
.macro next_ctr, q .macro next_ctr, q
vmov.32 \q\()h[1], r10 vmov \q\()h, r9, r10
adds r10, r10, #1 adds r10, r10, #1
vmov.32 \q\()h[0], r9
adcs r9, r9, #0 adcs r9, r9, #0
vmov.32 \q\()l[1], r8 vmov \q\()l, r7, r8
adcs r8, r8, #0 adcs r8, r8, #0
vmov.32 \q\()l[0], r7
adc r7, r7, #0 adc r7, r7, #0
vrev32.8 \q, \q vrev32.8 \q, \q
.endm .endm
/* /*
* aesbs_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[], * aesbs_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[],
* int rounds, int blocks, u8 ctr[], u8 final[]) * int rounds, int bytes, u8 ctr[])
*/ */
ENTRY(aesbs_ctr_encrypt) ENTRY(aesbs_ctr_encrypt)
mov ip, sp mov ip, sp
push {r4-r10, lr} push {r4-r10, lr}
ldm ip, {r5-r7} // load args 4-6 ldm ip, {r5, r6} // load args 4-5
teq r7, #0
addne r5, r5, #1 // one extra block if final != 0
vld1.8 {q0}, [r6] // load counter vld1.8 {q0}, [r6] // load counter
vrev32.8 q1, q0 vrev32.8 q1, q0
vmov r9, r10, d3 vmov r9, r10, d3
...@@ -792,20 +787,19 @@ ENTRY(aesbs_ctr_encrypt) ...@@ -792,20 +787,19 @@ ENTRY(aesbs_ctr_encrypt)
adc r7, r7, #0 adc r7, r7, #0
99: vmov q1, q0 99: vmov q1, q0
sub lr, r5, #1
vmov q2, q0 vmov q2, q0
adr ip, 0f
vmov q3, q0 vmov q3, q0
and lr, lr, #112
vmov q4, q0 vmov q4, q0
cmp r5, #112
vmov q5, q0 vmov q5, q0
sub ip, ip, lr, lsl #1
vmov q6, q0 vmov q6, q0
add ip, ip, lr, lsr #2
vmov q7, q0 vmov q7, q0
movle pc, ip // computed goto if bytes < 112
adr ip, 0f
sub lr, r5, #1
and lr, lr, #7
cmp r5, #8
sub ip, ip, lr, lsl #5
sub ip, ip, lr, lsl #2
movlt pc, ip // computed goto if blocks < 8
next_ctr q1 next_ctr q1
next_ctr q2 next_ctr q2
...@@ -820,12 +814,14 @@ ENTRY(aesbs_ctr_encrypt) ...@@ -820,12 +814,14 @@ ENTRY(aesbs_ctr_encrypt)
bl aesbs_encrypt8 bl aesbs_encrypt8
adr ip, 1f adr ip, 1f
and lr, r5, #7 sub lr, r5, #1
cmp r5, #8 cmp r5, #128
movgt r4, #0 bic lr, lr, #15
ldrle r4, [sp, #40] // load final in the last round ands r4, r5, #15 // preserves C flag
sub ip, ip, lr, lsl #2 teqcs r5, r5 // set Z flag if not last iteration
movlt pc, ip // computed goto if blocks < 8 sub ip, ip, lr, lsr #2
rsb r4, r4, #16
movcc pc, ip // computed goto if bytes < 128
vld1.8 {q8}, [r1]! vld1.8 {q8}, [r1]!
vld1.8 {q9}, [r1]! vld1.8 {q9}, [r1]!
...@@ -834,46 +830,70 @@ ENTRY(aesbs_ctr_encrypt) ...@@ -834,46 +830,70 @@ ENTRY(aesbs_ctr_encrypt)
vld1.8 {q12}, [r1]! vld1.8 {q12}, [r1]!
vld1.8 {q13}, [r1]! vld1.8 {q13}, [r1]!
vld1.8 {q14}, [r1]! vld1.8 {q14}, [r1]!
teq r4, #0 // skip last block if 'final' 1: subne r1, r1, r4
1: bne 2f
vld1.8 {q15}, [r1]! vld1.8 {q15}, [r1]!
2: adr ip, 3f add ip, ip, #2f - 1b
cmp r5, #8
sub ip, ip, lr, lsl #3
movlt pc, ip // computed goto if blocks < 8
veor q0, q0, q8 veor q0, q0, q8
vst1.8 {q0}, [r0]!
veor q1, q1, q9 veor q1, q1, q9
vst1.8 {q1}, [r0]!
veor q4, q4, q10 veor q4, q4, q10
vst1.8 {q4}, [r0]!
veor q6, q6, q11 veor q6, q6, q11
vst1.8 {q6}, [r0]!
veor q3, q3, q12 veor q3, q3, q12
vst1.8 {q3}, [r0]!
veor q7, q7, q13 veor q7, q7, q13
vst1.8 {q7}, [r0]!
veor q2, q2, q14 veor q2, q2, q14
bne 3f
veor q5, q5, q15
movcc pc, ip // computed goto if bytes < 128
vst1.8 {q0}, [r0]!
vst1.8 {q1}, [r0]!
vst1.8 {q4}, [r0]!
vst1.8 {q6}, [r0]!
vst1.8 {q3}, [r0]!
vst1.8 {q7}, [r0]!
vst1.8 {q2}, [r0]! vst1.8 {q2}, [r0]!
teq r4, #0 // skip last block if 'final' 2: subne r0, r0, r4
W(bne) 5f
3: veor q5, q5, q15
vst1.8 {q5}, [r0]! vst1.8 {q5}, [r0]!
4: next_ctr q0 next_ctr q0
subs r5, r5, #8 subs r5, r5, #128
bgt 99b bgt 99b
vst1.8 {q0}, [r6] vst1.8 {q0}, [r6]
pop {r4-r10, pc} pop {r4-r10, pc}
5: vst1.8 {q5}, [r4] 3: adr lr, .Lpermute_table + 16
b 4b cmp r5, #16 // Z flag remains cleared
sub lr, lr, r4
vld1.8 {q8-q9}, [lr]
vtbl.8 d16, {q5}, d16
vtbl.8 d17, {q5}, d17
veor q5, q8, q15
bcc 4f // have to reload prev if R5 < 16
vtbx.8 d10, {q2}, d18
vtbx.8 d11, {q2}, d19
mov pc, ip // branch back to VST sequence
4: sub r0, r0, r4
vshr.s8 q9, q9, #7 // create mask for VBIF
vld1.8 {q8}, [r0] // reload
vbif q5, q8, q9
vst1.8 {q5}, [r0]
pop {r4-r10, pc}
ENDPROC(aesbs_ctr_encrypt) ENDPROC(aesbs_ctr_encrypt)
.align 6
.Lpermute_table:
.byte 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
.byte 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
.byte 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07
.byte 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f
.byte 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
.byte 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
.macro next_tweak, out, in, const, tmp .macro next_tweak, out, in, const, tmp
vshr.s64 \tmp, \in, #63 vshr.s64 \tmp, \in, #63
vand \tmp, \tmp, \const vand \tmp, \tmp, \const
...@@ -888,6 +908,7 @@ ENDPROC(aesbs_ctr_encrypt) ...@@ -888,6 +908,7 @@ ENDPROC(aesbs_ctr_encrypt)
* aesbs_xts_decrypt(u8 out[], u8 const in[], u8 const rk[], int rounds, * aesbs_xts_decrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
* int blocks, u8 iv[], int reorder_last_tweak) * int blocks, u8 iv[], int reorder_last_tweak)
*/ */
.align 6
__xts_prepare8: __xts_prepare8:
vld1.8 {q14}, [r7] // load iv vld1.8 {q14}, [r7] // load iv
vmov.i32 d30, #0x87 // compose tweak mask vector vmov.i32 d30, #0x87 // compose tweak mask vector
......
@@ -37,7 +37,7 @@ asmlinkage void aesbs_cbc_decrypt(u8 out[], u8 const in[], u8 const rk[],
 				  int rounds, int blocks, u8 iv[]);
 
 asmlinkage void aesbs_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[],
-				  int rounds, int blocks, u8 ctr[], u8 final[]);
+				  int rounds, int blocks, u8 ctr[]);
 
 asmlinkage void aesbs_xts_encrypt(u8 out[], u8 const in[], u8 const rk[],
 				  int rounds, int blocks, u8 iv[], int);
@@ -243,32 +243,25 @@ static int ctr_encrypt(struct skcipher_request *req)
 	err = skcipher_walk_virt(&walk, req, false);
 
 	while (walk.nbytes > 0) {
-		unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE;
-		u8 *final = (walk.total % AES_BLOCK_SIZE) ? buf : NULL;
+		const u8 *src = walk.src.virt.addr;
+		u8 *dst = walk.dst.virt.addr;
+		int bytes = walk.nbytes;
 
-		if (walk.nbytes < walk.total) {
-			blocks = round_down(blocks,
-					    walk.stride / AES_BLOCK_SIZE);
-			final = NULL;
-		}
+		if (unlikely(bytes < AES_BLOCK_SIZE))
+			src = dst = memcpy(buf + sizeof(buf) - bytes,
+					   src, bytes);
+		else if (walk.nbytes < walk.total)
+			bytes &= ~(8 * AES_BLOCK_SIZE - 1);
 
 		kernel_neon_begin();
-		aesbs_ctr_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
-				  ctx->rk, ctx->rounds, blocks, walk.iv, final);
+		aesbs_ctr_encrypt(dst, src, ctx->rk, ctx->rounds, bytes, walk.iv);
 		kernel_neon_end();
 
-		if (final) {
-			u8 *dst = walk.dst.virt.addr + blocks * AES_BLOCK_SIZE;
-			u8 *src = walk.src.virt.addr + blocks * AES_BLOCK_SIZE;
-
-			crypto_xor_cpy(dst, src, final,
-				       walk.total % AES_BLOCK_SIZE);
-			err = skcipher_walk_done(&walk, 0);
-			break;
-		}
-		err = skcipher_walk_done(&walk,
-					 walk.nbytes - blocks * AES_BLOCK_SIZE);
+		if (unlikely(bytes < AES_BLOCK_SIZE))
+			memcpy(walk.dst.virt.addr,
+			       buf + sizeof(buf) - bytes, bytes);
+
+		err = skcipher_walk_done(&walk, walk.nbytes - bytes);
 	}
 
 	return err;
......
@@ -44,7 +44,8 @@
 	: "0" (dst), "r" (a1), "r" (a2), "r" (a3), "r" (a4))
 
 static void
-xor_arm4regs_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
+xor_arm4regs_2(unsigned long bytes, unsigned long * __restrict p1,
+	       const unsigned long * __restrict p2)
 {
 	unsigned int lines = bytes / sizeof(unsigned long) / 4;
 	register unsigned int a1 __asm__("r4");
@@ -64,8 +65,9 @@ xor_arm4regs_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
 }
 
 static void
-xor_arm4regs_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
-	       unsigned long *p3)
+xor_arm4regs_3(unsigned long bytes, unsigned long * __restrict p1,
+	       const unsigned long * __restrict p2,
+	       const unsigned long * __restrict p3)
 {
 	unsigned int lines = bytes / sizeof(unsigned long) / 4;
 	register unsigned int a1 __asm__("r4");
@@ -86,8 +88,10 @@ xor_arm4regs_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
 }
 
 static void
-xor_arm4regs_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
-	       unsigned long *p3, unsigned long *p4)
+xor_arm4regs_4(unsigned long bytes, unsigned long * __restrict p1,
+	       const unsigned long * __restrict p2,
+	       const unsigned long * __restrict p3,
+	       const unsigned long * __restrict p4)
 {
 	unsigned int lines = bytes / sizeof(unsigned long) / 2;
 	register unsigned int a1 __asm__("r8");
@@ -105,8 +109,11 @@ xor_arm4regs_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
 }
 
 static void
-xor_arm4regs_5(unsigned long bytes, unsigned long *p1, unsigned long *p2,
-	       unsigned long *p3, unsigned long *p4, unsigned long *p5)
+xor_arm4regs_5(unsigned long bytes, unsigned long * __restrict p1,
+	       const unsigned long * __restrict p2,
+	       const unsigned long * __restrict p3,
+	       const unsigned long * __restrict p4,
+	       const unsigned long * __restrict p5)
 {
 	unsigned int lines = bytes / sizeof(unsigned long) / 2;
 	register unsigned int a1 __asm__("r8");
@@ -146,7 +153,8 @@ static struct xor_block_template xor_block_arm4regs = {
 extern struct xor_block_template const xor_block_neon_inner;
 
 static void
-xor_neon_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
+xor_neon_2(unsigned long bytes, unsigned long * __restrict p1,
+	   const unsigned long * __restrict p2)
 {
 	if (in_interrupt()) {
 		xor_arm4regs_2(bytes, p1, p2);
@@ -158,8 +166,9 @@ xor_neon_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
 }
 
 static void
-xor_neon_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
-	   unsigned long *p3)
+xor_neon_3(unsigned long bytes, unsigned long * __restrict p1,
+	   const unsigned long * __restrict p2,
+	   const unsigned long * __restrict p3)
 {
 	if (in_interrupt()) {
 		xor_arm4regs_3(bytes, p1, p2, p3);
@@ -171,8 +180,10 @@ xor_neon_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
 }
 
 static void
-xor_neon_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
-	   unsigned long *p3, unsigned long *p4)
+xor_neon_4(unsigned long bytes, unsigned long * __restrict p1,
+	   const unsigned long * __restrict p2,
+	   const unsigned long * __restrict p3,
+	   const unsigned long * __restrict p4)
 {
 	if (in_interrupt()) {
 		xor_arm4regs_4(bytes, p1, p2, p3, p4);
@@ -184,8 +195,11 @@ xor_neon_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
 }
 
 static void
-xor_neon_5(unsigned long bytes, unsigned long *p1, unsigned long *p2,
-	   unsigned long *p3, unsigned long *p4, unsigned long *p5)
+xor_neon_5(unsigned long bytes, unsigned long * __restrict p1,
+	   const unsigned long * __restrict p2,
+	   const unsigned long * __restrict p3,
+	   const unsigned long * __restrict p4,
+	   const unsigned long * __restrict p5)
 {
 	if (in_interrupt()) {
 		xor_arm4regs_5(bytes, p1, p2, p3, p4, p5);
......
@@ -17,17 +17,11 @@ MODULE_LICENSE("GPL");
 /*
  * Pull in the reference implementations while instructing GCC (through
  * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
- * NEON instructions.
+ * NEON instructions. Clang does this by default at O2 so no pragma is
+ * needed.
  */
-#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
+#ifdef CONFIG_CC_IS_GCC
 #pragma GCC optimize "tree-vectorize"
-#else
-/*
- * While older versions of GCC do not generate incorrect code, they fail to
- * recognize the parallel nature of these functions, and emit plain ARM code,
- * which is known to be slower than the optimized ARM code in asm-arm/xor.h.
- */
-#warning This code requires at least version 4.6 of GCC
 #endif
 
 #pragma GCC diagnostic ignored "-Wunused-variable"
......
@@ -45,7 +45,7 @@ config CRYPTO_SM3_ARM64_CE
 	tristate "SM3 digest algorithm (ARMv8.2 Crypto Extensions)"
 	depends on KERNEL_MODE_NEON
 	select CRYPTO_HASH
-	select CRYPTO_SM3
+	select CRYPTO_LIB_SM3
 
 config CRYPTO_SM4_ARM64_CE
 	tristate "SM4 symmetric cipher (ARMv8.2 Crypto Extensions)"
......
...@@ -24,7 +24,6 @@ ...@@ -24,7 +24,6 @@
#ifdef USE_V8_CRYPTO_EXTENSIONS #ifdef USE_V8_CRYPTO_EXTENSIONS
#define MODE "ce" #define MODE "ce"
#define PRIO 300 #define PRIO 300
#define STRIDE 5
#define aes_expandkey ce_aes_expandkey #define aes_expandkey ce_aes_expandkey
#define aes_ecb_encrypt ce_aes_ecb_encrypt #define aes_ecb_encrypt ce_aes_ecb_encrypt
#define aes_ecb_decrypt ce_aes_ecb_decrypt #define aes_ecb_decrypt ce_aes_ecb_decrypt
...@@ -42,7 +41,6 @@ MODULE_DESCRIPTION("AES-ECB/CBC/CTR/XTS using ARMv8 Crypto Extensions"); ...@@ -42,7 +41,6 @@ MODULE_DESCRIPTION("AES-ECB/CBC/CTR/XTS using ARMv8 Crypto Extensions");
#else #else
#define MODE "neon" #define MODE "neon"
#define PRIO 200 #define PRIO 200
#define STRIDE 4
#define aes_ecb_encrypt neon_aes_ecb_encrypt #define aes_ecb_encrypt neon_aes_ecb_encrypt
#define aes_ecb_decrypt neon_aes_ecb_decrypt #define aes_ecb_decrypt neon_aes_ecb_decrypt
#define aes_cbc_encrypt neon_aes_cbc_encrypt #define aes_cbc_encrypt neon_aes_cbc_encrypt
...@@ -89,7 +87,7 @@ asmlinkage void aes_cbc_cts_decrypt(u8 out[], u8 const in[], u32 const rk[], ...@@ -89,7 +87,7 @@ asmlinkage void aes_cbc_cts_decrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int bytes, u8 const iv[]); int rounds, int bytes, u8 const iv[]);
asmlinkage void aes_ctr_encrypt(u8 out[], u8 const in[], u32 const rk[], asmlinkage void aes_ctr_encrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int bytes, u8 ctr[], u8 finalbuf[]); int rounds, int bytes, u8 ctr[]);
asmlinkage void aes_xts_encrypt(u8 out[], u8 const in[], u32 const rk1[], asmlinkage void aes_xts_encrypt(u8 out[], u8 const in[], u32 const rk1[],
int rounds, int bytes, u32 const rk2[], u8 iv[], int rounds, int bytes, u32 const rk2[], u8 iv[],
...@@ -458,26 +456,21 @@ static int __maybe_unused ctr_encrypt(struct skcipher_request *req) ...@@ -458,26 +456,21 @@ static int __maybe_unused ctr_encrypt(struct skcipher_request *req)
unsigned int nbytes = walk.nbytes; unsigned int nbytes = walk.nbytes;
u8 *dst = walk.dst.virt.addr; u8 *dst = walk.dst.virt.addr;
u8 buf[AES_BLOCK_SIZE]; u8 buf[AES_BLOCK_SIZE];
unsigned int tail;
if (unlikely(nbytes < AES_BLOCK_SIZE)) if (unlikely(nbytes < AES_BLOCK_SIZE))
src = memcpy(buf, src, nbytes); src = dst = memcpy(buf + sizeof(buf) - nbytes,
src, nbytes);
else if (nbytes < walk.total) else if (nbytes < walk.total)
nbytes &= ~(AES_BLOCK_SIZE - 1); nbytes &= ~(AES_BLOCK_SIZE - 1);
kernel_neon_begin(); kernel_neon_begin();
aes_ctr_encrypt(dst, src, ctx->key_enc, rounds, nbytes, aes_ctr_encrypt(dst, src, ctx->key_enc, rounds, nbytes,
walk.iv, buf); walk.iv);
kernel_neon_end(); kernel_neon_end();
tail = nbytes % (STRIDE * AES_BLOCK_SIZE); if (unlikely(nbytes < AES_BLOCK_SIZE))
if (tail > 0 && tail < AES_BLOCK_SIZE) memcpy(walk.dst.virt.addr,
/* buf + sizeof(buf) - nbytes, nbytes);
* The final partial block could not be returned using
* an overlapping store, so it was passed via buf[]
* instead.
*/
memcpy(dst + nbytes - tail, buf, tail);
err = skcipher_walk_done(&walk, walk.nbytes - nbytes); err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
} }
...@@ -983,6 +976,7 @@ module_cpu_feature_match(AES, aes_init); ...@@ -983,6 +976,7 @@ module_cpu_feature_match(AES, aes_init);
module_init(aes_init); module_init(aes_init);
EXPORT_SYMBOL(neon_aes_ecb_encrypt); EXPORT_SYMBOL(neon_aes_ecb_encrypt);
EXPORT_SYMBOL(neon_aes_cbc_encrypt); EXPORT_SYMBOL(neon_aes_cbc_encrypt);
EXPORT_SYMBOL(neon_aes_ctr_encrypt);
EXPORT_SYMBOL(neon_aes_xts_encrypt); EXPORT_SYMBOL(neon_aes_xts_encrypt);
EXPORT_SYMBOL(neon_aes_xts_decrypt); EXPORT_SYMBOL(neon_aes_xts_decrypt);
#endif #endif
......
...@@ -321,7 +321,7 @@ AES_FUNC_END(aes_cbc_cts_decrypt) ...@@ -321,7 +321,7 @@ AES_FUNC_END(aes_cbc_cts_decrypt)
/* /*
* aes_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[], int rounds, * aes_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[], int rounds,
* int bytes, u8 ctr[], u8 finalbuf[]) * int bytes, u8 ctr[])
*/ */
AES_FUNC_START(aes_ctr_encrypt) AES_FUNC_START(aes_ctr_encrypt)
...@@ -414,8 +414,8 @@ ST5( st1 {v4.16b}, [x0], #16 ) ...@@ -414,8 +414,8 @@ ST5( st1 {v4.16b}, [x0], #16 )
.Lctrtail: .Lctrtail:
/* XOR up to MAX_STRIDE * 16 - 1 bytes of in/output with v0 ... v3/v4 */ /* XOR up to MAX_STRIDE * 16 - 1 bytes of in/output with v0 ... v3/v4 */
mov x16, #16 mov x16, #16
ands x13, x4, #0xf ands x6, x4, #0xf
csel x13, x13, x16, ne csel x13, x6, x16, ne
ST5( cmp w4, #64 - (MAX_STRIDE << 4) ) ST5( cmp w4, #64 - (MAX_STRIDE << 4) )
ST5( csel x14, x16, xzr, gt ) ST5( csel x14, x16, xzr, gt )
...@@ -424,10 +424,10 @@ ST5( csel x14, x16, xzr, gt ) ...@@ -424,10 +424,10 @@ ST5( csel x14, x16, xzr, gt )
cmp w4, #32 - (MAX_STRIDE << 4) cmp w4, #32 - (MAX_STRIDE << 4)
csel x16, x16, xzr, gt csel x16, x16, xzr, gt
cmp w4, #16 - (MAX_STRIDE << 4) cmp w4, #16 - (MAX_STRIDE << 4)
ble .Lctrtail1x
adr_l x12, .Lcts_permute_table adr_l x12, .Lcts_permute_table
add x12, x12, x13 add x12, x12, x13
ble .Lctrtail1x
ST5( ld1 {v5.16b}, [x1], x14 ) ST5( ld1 {v5.16b}, [x1], x14 )
ld1 {v6.16b}, [x1], x15 ld1 {v6.16b}, [x1], x15
...@@ -462,11 +462,19 @@ ST5( st1 {v5.16b}, [x0], x14 ) ...@@ -462,11 +462,19 @@ ST5( st1 {v5.16b}, [x0], x14 )
b .Lctrout b .Lctrout
.Lctrtail1x: .Lctrtail1x:
csel x0, x0, x6, eq // use finalbuf if less than a full block sub x7, x6, #16
csel x6, x6, x7, eq
add x1, x1, x6
add x0, x0, x6
ld1 {v5.16b}, [x1] ld1 {v5.16b}, [x1]
ld1 {v6.16b}, [x0]
ST5( mov v3.16b, v4.16b ) ST5( mov v3.16b, v4.16b )
encrypt_block v3, w3, x2, x8, w7 encrypt_block v3, w3, x2, x8, w7
ld1 {v10.16b-v11.16b}, [x12]
tbl v3.16b, {v3.16b}, v10.16b
sshr v11.16b, v11.16b, #7
eor v5.16b, v5.16b, v3.16b eor v5.16b, v5.16b, v3.16b
bif v5.16b, v6.16b, v11.16b
st1 {v5.16b}, [x0] st1 {v5.16b}, [x0]
b .Lctrout b .Lctrout
AES_FUNC_END(aes_ctr_encrypt) AES_FUNC_END(aes_ctr_encrypt)
......
...@@ -735,119 +735,67 @@ SYM_FUNC_END(aesbs_cbc_decrypt) ...@@ -735,119 +735,67 @@ SYM_FUNC_END(aesbs_cbc_decrypt)
* int blocks, u8 iv[]) * int blocks, u8 iv[])
*/ */
SYM_FUNC_START_LOCAL(__xts_crypt8) SYM_FUNC_START_LOCAL(__xts_crypt8)
mov x6, #1 movi v18.2s, #0x1
lsl x6, x6, x23 movi v19.2s, #0x87
subs w23, w23, #8 uzp1 v18.4s, v18.4s, v19.4s
csel x23, x23, xzr, pl
csel x6, x6, xzr, mi ld1 {v0.16b-v3.16b}, [x1], #64
ld1 {v4.16b-v7.16b}, [x1], #64
next_tweak v26, v25, v18, v19
next_tweak v27, v26, v18, v19
next_tweak v28, v27, v18, v19
next_tweak v29, v28, v18, v19
next_tweak v30, v29, v18, v19
next_tweak v31, v30, v18, v19
next_tweak v16, v31, v18, v19
next_tweak v17, v16, v18, v19
ld1 {v0.16b}, [x20], #16
next_tweak v26, v25, v30, v31
eor v0.16b, v0.16b, v25.16b eor v0.16b, v0.16b, v25.16b
tbnz x6, #1, 0f
ld1 {v1.16b}, [x20], #16
next_tweak v27, v26, v30, v31
eor v1.16b, v1.16b, v26.16b eor v1.16b, v1.16b, v26.16b
tbnz x6, #2, 0f
ld1 {v2.16b}, [x20], #16
next_tweak v28, v27, v30, v31
eor v2.16b, v2.16b, v27.16b eor v2.16b, v2.16b, v27.16b
tbnz x6, #3, 0f
ld1 {v3.16b}, [x20], #16
next_tweak v29, v28, v30, v31
eor v3.16b, v3.16b, v28.16b eor v3.16b, v3.16b, v28.16b
tbnz x6, #4, 0f
ld1 {v4.16b}, [x20], #16
str q29, [sp, #.Lframe_local_offset]
eor v4.16b, v4.16b, v29.16b eor v4.16b, v4.16b, v29.16b
next_tweak v29, v29, v30, v31 eor v5.16b, v5.16b, v30.16b
tbnz x6, #5, 0f eor v6.16b, v6.16b, v31.16b
eor v7.16b, v7.16b, v16.16b
ld1 {v5.16b}, [x20], #16
str q29, [sp, #.Lframe_local_offset + 16]
eor v5.16b, v5.16b, v29.16b
next_tweak v29, v29, v30, v31
tbnz x6, #6, 0f
ld1 {v6.16b}, [x20], #16
str q29, [sp, #.Lframe_local_offset + 32]
eor v6.16b, v6.16b, v29.16b
next_tweak v29, v29, v30, v31
tbnz x6, #7, 0f
ld1 {v7.16b}, [x20], #16 stp q16, q17, [sp, #16]
str q29, [sp, #.Lframe_local_offset + 48]
eor v7.16b, v7.16b, v29.16b
next_tweak v29, v29, v30, v31
0: mov bskey, x21 mov bskey, x2
mov rounds, x22 mov rounds, x3
br x16 br x16
SYM_FUNC_END(__xts_crypt8) SYM_FUNC_END(__xts_crypt8)
.macro __xts_crypt, do8, o0, o1, o2, o3, o4, o5, o6, o7 .macro __xts_crypt, do8, o0, o1, o2, o3, o4, o5, o6, o7
frame_push 6, 64 stp x29, x30, [sp, #-48]!
mov x29, sp
mov x19, x0
mov x20, x1
mov x21, x2
mov x22, x3
mov x23, x4
mov x24, x5
movi v30.2s, #0x1 ld1 {v25.16b}, [x5]
movi v25.2s, #0x87
uzp1 v30.4s, v30.4s, v25.4s
ld1 {v25.16b}, [x24]
99: adr x16, \do8 0: adr x16, \do8
bl __xts_crypt8 bl __xts_crypt8
ldp q16, q17, [sp, #.Lframe_local_offset] eor v16.16b, \o0\().16b, v25.16b
ldp q18, q19, [sp, #.Lframe_local_offset + 32] eor v17.16b, \o1\().16b, v26.16b
eor v18.16b, \o2\().16b, v27.16b
eor v19.16b, \o3\().16b, v28.16b
eor \o0\().16b, \o0\().16b, v25.16b ldp q24, q25, [sp, #16]
eor \o1\().16b, \o1\().16b, v26.16b
eor \o2\().16b, \o2\().16b, v27.16b
eor \o3\().16b, \o3\().16b, v28.16b
st1 {\o0\().16b}, [x19], #16 eor v20.16b, \o4\().16b, v29.16b
mov v25.16b, v26.16b eor v21.16b, \o5\().16b, v30.16b
tbnz x6, #1, 1f eor v22.16b, \o6\().16b, v31.16b
st1 {\o1\().16b}, [x19], #16 eor v23.16b, \o7\().16b, v24.16b
mov v25.16b, v27.16b
tbnz x6, #2, 1f
st1 {\o2\().16b}, [x19], #16
mov v25.16b, v28.16b
tbnz x6, #3, 1f
st1 {\o3\().16b}, [x19], #16
mov v25.16b, v29.16b
tbnz x6, #4, 1f
eor \o4\().16b, \o4\().16b, v16.16b st1 {v16.16b-v19.16b}, [x0], #64
eor \o5\().16b, \o5\().16b, v17.16b st1 {v20.16b-v23.16b}, [x0], #64
eor \o6\().16b, \o6\().16b, v18.16b
eor \o7\().16b, \o7\().16b, v19.16b
st1 {\o4\().16b}, [x19], #16 subs x4, x4, #8
tbnz x6, #5, 1f b.gt 0b
st1 {\o5\().16b}, [x19], #16
tbnz x6, #6, 1f
st1 {\o6\().16b}, [x19], #16
tbnz x6, #7, 1f
st1 {\o7\().16b}, [x19], #16
cbz x23, 1f st1 {v25.16b}, [x5]
st1 {v25.16b}, [x24] ldp x29, x30, [sp], #48
b 99b
1: st1 {v25.16b}, [x24]
frame_pop
ret ret
.endm .endm
...@@ -869,133 +817,51 @@ SYM_FUNC_END(aesbs_xts_decrypt) ...@@ -869,133 +817,51 @@ SYM_FUNC_END(aesbs_xts_decrypt)
/* /*
* aesbs_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[], * aesbs_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[],
* int rounds, int blocks, u8 iv[], u8 final[]) * int rounds, int blocks, u8 iv[])
*/ */
SYM_FUNC_START(aesbs_ctr_encrypt) SYM_FUNC_START(aesbs_ctr_encrypt)
frame_push 8 stp x29, x30, [sp, #-16]!
mov x29, sp
mov x19, x0
mov x20, x1
mov x21, x2
mov x22, x3
mov x23, x4
mov x24, x5
mov x25, x6
cmp x25, #0 ldp x7, x8, [x5]
cset x26, ne ld1 {v0.16b}, [x5]
add x23, x23, x26 // do one extra block if final
ldp x7, x8, [x24]
ld1 {v0.16b}, [x24]
CPU_LE( rev x7, x7 ) CPU_LE( rev x7, x7 )
CPU_LE( rev x8, x8 ) CPU_LE( rev x8, x8 )
adds x8, x8, #1 adds x8, x8, #1
adc x7, x7, xzr adc x7, x7, xzr
99: mov x9, #1 0: next_ctr v1
lsl x9, x9, x23
subs w23, w23, #8
csel x23, x23, xzr, pl
csel x9, x9, xzr, le
tbnz x9, #1, 0f
next_ctr v1
tbnz x9, #2, 0f
next_ctr v2 next_ctr v2
tbnz x9, #3, 0f
next_ctr v3 next_ctr v3
tbnz x9, #4, 0f
next_ctr v4 next_ctr v4
tbnz x9, #5, 0f
next_ctr v5 next_ctr v5
tbnz x9, #6, 0f
next_ctr v6 next_ctr v6
tbnz x9, #7, 0f
next_ctr v7 next_ctr v7
0: mov bskey, x21 mov bskey, x2
mov rounds, x22 mov rounds, x3
bl aesbs_encrypt8 bl aesbs_encrypt8
lsr x9, x9, x26 // disregard the extra block ld1 { v8.16b-v11.16b}, [x1], #64
tbnz x9, #0, 0f ld1 {v12.16b-v15.16b}, [x1], #64
ld1 {v8.16b}, [x20], #16
eor v0.16b, v0.16b, v8.16b
st1 {v0.16b}, [x19], #16
tbnz x9, #1, 1f
ld1 {v9.16b}, [x20], #16 eor v8.16b, v0.16b, v8.16b
eor v1.16b, v1.16b, v9.16b eor v9.16b, v1.16b, v9.16b
st1 {v1.16b}, [x19], #16 eor v10.16b, v4.16b, v10.16b
tbnz x9, #2, 2f eor v11.16b, v6.16b, v11.16b
eor v12.16b, v3.16b, v12.16b
eor v13.16b, v7.16b, v13.16b
eor v14.16b, v2.16b, v14.16b
eor v15.16b, v5.16b, v15.16b
ld1 {v10.16b}, [x20], #16 st1 { v8.16b-v11.16b}, [x0], #64
eor v4.16b, v4.16b, v10.16b st1 {v12.16b-v15.16b}, [x0], #64
st1 {v4.16b}, [x19], #16
tbnz x9, #3, 3f
ld1 {v11.16b}, [x20], #16 next_ctr v0
eor v6.16b, v6.16b, v11.16b subs x4, x4, #8
st1 {v6.16b}, [x19], #16 b.gt 0b
tbnz x9, #4, 4f
ld1 {v12.16b}, [x20], #16
eor v3.16b, v3.16b, v12.16b
st1 {v3.16b}, [x19], #16
tbnz x9, #5, 5f
ld1 {v13.16b}, [x20], #16
eor v7.16b, v7.16b, v13.16b
st1 {v7.16b}, [x19], #16
tbnz x9, #6, 6f
ld1 {v14.16b}, [x20], #16 st1 {v0.16b}, [x5]
eor v2.16b, v2.16b, v14.16b ldp x29, x30, [sp], #16
st1 {v2.16b}, [x19], #16
tbnz x9, #7, 7f
ld1 {v15.16b}, [x20], #16
eor v5.16b, v5.16b, v15.16b
st1 {v5.16b}, [x19], #16
8: next_ctr v0
st1 {v0.16b}, [x24]
cbz x23, .Lctr_done
b 99b
.Lctr_done:
frame_pop
ret ret
/*
* If we are handling the tail of the input (x6 != NULL), return the
* final keystream block back to the caller.
*/
0: cbz x25, 8b
st1 {v0.16b}, [x25]
b 8b
1: cbz x25, 8b
st1 {v1.16b}, [x25]
b 8b
2: cbz x25, 8b
st1 {v4.16b}, [x25]
b 8b
3: cbz x25, 8b
st1 {v6.16b}, [x25]
b 8b
4: cbz x25, 8b
st1 {v3.16b}, [x25]
b 8b
5: cbz x25, 8b
st1 {v7.16b}, [x25]
b 8b
6: cbz x25, 8b
st1 {v2.16b}, [x25]
b 8b
7: cbz x25, 8b
st1 {v5.16b}, [x25]
b 8b
SYM_FUNC_END(aesbs_ctr_encrypt) SYM_FUNC_END(aesbs_ctr_encrypt)
...@@ -34,7 +34,7 @@ asmlinkage void aesbs_cbc_decrypt(u8 out[], u8 const in[], u8 const rk[], ...@@ -34,7 +34,7 @@ asmlinkage void aesbs_cbc_decrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 iv[]); int rounds, int blocks, u8 iv[]);
asmlinkage void aesbs_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[], asmlinkage void aesbs_ctr_encrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 iv[], u8 final[]); int rounds, int blocks, u8 iv[]);
asmlinkage void aesbs_xts_encrypt(u8 out[], u8 const in[], u8 const rk[], asmlinkage void aesbs_xts_encrypt(u8 out[], u8 const in[], u8 const rk[],
int rounds, int blocks, u8 iv[]); int rounds, int blocks, u8 iv[]);
...@@ -46,6 +46,8 @@ asmlinkage void neon_aes_ecb_encrypt(u8 out[], u8 const in[], u32 const rk[], ...@@ -46,6 +46,8 @@ asmlinkage void neon_aes_ecb_encrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int blocks); int rounds, int blocks);
asmlinkage void neon_aes_cbc_encrypt(u8 out[], u8 const in[], u32 const rk[], asmlinkage void neon_aes_cbc_encrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int blocks, u8 iv[]); int rounds, int blocks, u8 iv[]);
asmlinkage void neon_aes_ctr_encrypt(u8 out[], u8 const in[], u32 const rk[],
int rounds, int bytes, u8 ctr[]);
asmlinkage void neon_aes_xts_encrypt(u8 out[], u8 const in[], asmlinkage void neon_aes_xts_encrypt(u8 out[], u8 const in[],
u32 const rk1[], int rounds, int bytes, u32 const rk1[], int rounds, int bytes,
u32 const rk2[], u8 iv[], int first); u32 const rk2[], u8 iv[], int first);
...@@ -58,7 +60,7 @@ struct aesbs_ctx { ...@@ -58,7 +60,7 @@ struct aesbs_ctx {
int rounds; int rounds;
} __aligned(AES_BLOCK_SIZE); } __aligned(AES_BLOCK_SIZE);
struct aesbs_cbc_ctx { struct aesbs_cbc_ctr_ctx {
struct aesbs_ctx key; struct aesbs_ctx key;
u32 enc[AES_MAX_KEYLENGTH_U32]; u32 enc[AES_MAX_KEYLENGTH_U32];
}; };
...@@ -128,10 +130,10 @@ static int ecb_decrypt(struct skcipher_request *req) ...@@ -128,10 +130,10 @@ static int ecb_decrypt(struct skcipher_request *req)
return __ecb_crypt(req, aesbs_ecb_decrypt); return __ecb_crypt(req, aesbs_ecb_decrypt);
} }
static int aesbs_cbc_setkey(struct crypto_skcipher *tfm, const u8 *in_key, static int aesbs_cbc_ctr_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
unsigned int key_len) unsigned int key_len)
{ {
struct aesbs_cbc_ctx *ctx = crypto_skcipher_ctx(tfm); struct aesbs_cbc_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
struct crypto_aes_ctx rk; struct crypto_aes_ctx rk;
int err; int err;
...@@ -154,7 +156,7 @@ static int aesbs_cbc_setkey(struct crypto_skcipher *tfm, const u8 *in_key, ...@@ -154,7 +156,7 @@ static int aesbs_cbc_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
static int cbc_encrypt(struct skcipher_request *req) static int cbc_encrypt(struct skcipher_request *req)
{ {
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct aesbs_cbc_ctx *ctx = crypto_skcipher_ctx(tfm); struct aesbs_cbc_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk; struct skcipher_walk walk;
int err; int err;
...@@ -177,7 +179,7 @@ static int cbc_encrypt(struct skcipher_request *req) ...@@ -177,7 +179,7 @@ static int cbc_encrypt(struct skcipher_request *req)
static int cbc_decrypt(struct skcipher_request *req) static int cbc_decrypt(struct skcipher_request *req)
{ {
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct aesbs_cbc_ctx *ctx = crypto_skcipher_ctx(tfm); struct aesbs_cbc_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk; struct skcipher_walk walk;
int err; int err;
...@@ -205,40 +207,32 @@ static int cbc_decrypt(struct skcipher_request *req) ...@@ -205,40 +207,32 @@ static int cbc_decrypt(struct skcipher_request *req)
static int ctr_encrypt(struct skcipher_request *req) static int ctr_encrypt(struct skcipher_request *req)
{ {
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
struct aesbs_ctx *ctx = crypto_skcipher_ctx(tfm); struct aesbs_cbc_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
struct skcipher_walk walk; struct skcipher_walk walk;
u8 buf[AES_BLOCK_SIZE];
int err; int err;
err = skcipher_walk_virt(&walk, req, false); err = skcipher_walk_virt(&walk, req, false);
while (walk.nbytes > 0) { while (walk.nbytes > 0) {
unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE; int blocks = (walk.nbytes / AES_BLOCK_SIZE) & ~7;
u8 *final = (walk.total % AES_BLOCK_SIZE) ? buf : NULL; int nbytes = walk.nbytes % (8 * AES_BLOCK_SIZE);
const u8 *src = walk.src.virt.addr;
if (walk.nbytes < walk.total) { u8 *dst = walk.dst.virt.addr;
blocks = round_down(blocks,
walk.stride / AES_BLOCK_SIZE);
final = NULL;
}
kernel_neon_begin(); kernel_neon_begin();
aesbs_ctr_encrypt(walk.dst.virt.addr, walk.src.virt.addr, if (blocks >= 8) {
ctx->rk, ctx->rounds, blocks, walk.iv, final); aesbs_ctr_encrypt(dst, src, ctx->key.rk, ctx->key.rounds,
kernel_neon_end(); blocks, walk.iv);
dst += blocks * AES_BLOCK_SIZE;
if (final) { src += blocks * AES_BLOCK_SIZE;
u8 *dst = walk.dst.virt.addr + blocks * AES_BLOCK_SIZE;
u8 *src = walk.src.virt.addr + blocks * AES_BLOCK_SIZE;
crypto_xor_cpy(dst, src, final,
walk.total % AES_BLOCK_SIZE);
err = skcipher_walk_done(&walk, 0);
break;
} }
err = skcipher_walk_done(&walk, if (nbytes && walk.nbytes == walk.total) {
walk.nbytes - blocks * AES_BLOCK_SIZE); neon_aes_ctr_encrypt(dst, src, ctx->enc, ctx->key.rounds,
nbytes, walk.iv);
nbytes = 0;
}
kernel_neon_end();
err = skcipher_walk_done(&walk, nbytes);
} }
return err; return err;
} }
...@@ -308,23 +302,18 @@ static int __xts_crypt(struct skcipher_request *req, bool encrypt, ...@@ -308,23 +302,18 @@ static int __xts_crypt(struct skcipher_request *req, bool encrypt,
return err; return err;
while (walk.nbytes >= AES_BLOCK_SIZE) { while (walk.nbytes >= AES_BLOCK_SIZE) {
unsigned int blocks = walk.nbytes / AES_BLOCK_SIZE; int blocks = (walk.nbytes / AES_BLOCK_SIZE) & ~7;
if (walk.nbytes < walk.total || walk.nbytes % AES_BLOCK_SIZE)
blocks = round_down(blocks,
walk.stride / AES_BLOCK_SIZE);
out = walk.dst.virt.addr; out = walk.dst.virt.addr;
in = walk.src.virt.addr; in = walk.src.virt.addr;
nbytes = walk.nbytes; nbytes = walk.nbytes;
kernel_neon_begin(); kernel_neon_begin();
if (likely(blocks > 6)) { /* plain NEON is faster otherwise */ if (blocks >= 8) {
if (first) if (first == 1)
neon_aes_ecb_encrypt(walk.iv, walk.iv, neon_aes_ecb_encrypt(walk.iv, walk.iv,
ctx->twkey, ctx->twkey,
ctx->key.rounds, 1); ctx->key.rounds, 1);
first = 0; first = 2;
fn(out, in, ctx->key.rk, ctx->key.rounds, blocks, fn(out, in, ctx->key.rk, ctx->key.rounds, blocks,
walk.iv); walk.iv);
...@@ -333,10 +322,17 @@ static int __xts_crypt(struct skcipher_request *req, bool encrypt, ...@@ -333,10 +322,17 @@ static int __xts_crypt(struct skcipher_request *req, bool encrypt,
in += blocks * AES_BLOCK_SIZE; in += blocks * AES_BLOCK_SIZE;
nbytes -= blocks * AES_BLOCK_SIZE; nbytes -= blocks * AES_BLOCK_SIZE;
} }
if (walk.nbytes == walk.total && nbytes > 0) {
if (walk.nbytes == walk.total && nbytes > 0) if (encrypt)
goto xts_tail; neon_aes_xts_encrypt(out, in, ctx->cts.key_enc,
ctx->key.rounds, nbytes,
ctx->twkey, walk.iv, first);
else
neon_aes_xts_decrypt(out, in, ctx->cts.key_dec,
ctx->key.rounds, nbytes,
ctx->twkey, walk.iv, first);
nbytes = first = 0;
}
kernel_neon_end(); kernel_neon_end();
err = skcipher_walk_done(&walk, nbytes); err = skcipher_walk_done(&walk, nbytes);
} }
...@@ -361,13 +357,12 @@ static int __xts_crypt(struct skcipher_request *req, bool encrypt, ...@@ -361,13 +357,12 @@ static int __xts_crypt(struct skcipher_request *req, bool encrypt,
nbytes = walk.nbytes; nbytes = walk.nbytes;
kernel_neon_begin(); kernel_neon_begin();
xts_tail:
if (encrypt) if (encrypt)
neon_aes_xts_encrypt(out, in, ctx->cts.key_enc, ctx->key.rounds, neon_aes_xts_encrypt(out, in, ctx->cts.key_enc, ctx->key.rounds,
nbytes, ctx->twkey, walk.iv, first ?: 2); nbytes, ctx->twkey, walk.iv, first);
else else
neon_aes_xts_decrypt(out, in, ctx->cts.key_dec, ctx->key.rounds, neon_aes_xts_decrypt(out, in, ctx->cts.key_dec, ctx->key.rounds,
nbytes, ctx->twkey, walk.iv, first ?: 2); nbytes, ctx->twkey, walk.iv, first);
kernel_neon_end(); kernel_neon_end();
return skcipher_walk_done(&walk, 0); return skcipher_walk_done(&walk, 0);
...@@ -402,14 +397,14 @@ static struct skcipher_alg aes_algs[] = { { ...@@ -402,14 +397,14 @@ static struct skcipher_alg aes_algs[] = { {
.base.cra_driver_name = "cbc-aes-neonbs", .base.cra_driver_name = "cbc-aes-neonbs",
.base.cra_priority = 250, .base.cra_priority = 250,
.base.cra_blocksize = AES_BLOCK_SIZE, .base.cra_blocksize = AES_BLOCK_SIZE,
.base.cra_ctxsize = sizeof(struct aesbs_cbc_ctx), .base.cra_ctxsize = sizeof(struct aesbs_cbc_ctr_ctx),
.base.cra_module = THIS_MODULE, .base.cra_module = THIS_MODULE,
.min_keysize = AES_MIN_KEY_SIZE, .min_keysize = AES_MIN_KEY_SIZE,
.max_keysize = AES_MAX_KEY_SIZE, .max_keysize = AES_MAX_KEY_SIZE,
.walksize = 8 * AES_BLOCK_SIZE, .walksize = 8 * AES_BLOCK_SIZE,
.ivsize = AES_BLOCK_SIZE, .ivsize = AES_BLOCK_SIZE,
.setkey = aesbs_cbc_setkey, .setkey = aesbs_cbc_ctr_setkey,
.encrypt = cbc_encrypt, .encrypt = cbc_encrypt,
.decrypt = cbc_decrypt, .decrypt = cbc_decrypt,
}, { }, {
...@@ -417,7 +412,7 @@ static struct skcipher_alg aes_algs[] = { { ...@@ -417,7 +412,7 @@ static struct skcipher_alg aes_algs[] = { {
.base.cra_driver_name = "ctr-aes-neonbs", .base.cra_driver_name = "ctr-aes-neonbs",
.base.cra_priority = 250, .base.cra_priority = 250,
.base.cra_blocksize = 1, .base.cra_blocksize = 1,
.base.cra_ctxsize = sizeof(struct aesbs_ctx), .base.cra_ctxsize = sizeof(struct aesbs_cbc_ctr_ctx),
.base.cra_module = THIS_MODULE, .base.cra_module = THIS_MODULE,
.min_keysize = AES_MIN_KEY_SIZE, .min_keysize = AES_MIN_KEY_SIZE,
...@@ -425,7 +420,7 @@ static struct skcipher_alg aes_algs[] = { { ...@@ -425,7 +420,7 @@ static struct skcipher_alg aes_algs[] = { {
.chunksize = AES_BLOCK_SIZE, .chunksize = AES_BLOCK_SIZE,
.walksize = 8 * AES_BLOCK_SIZE, .walksize = 8 * AES_BLOCK_SIZE,
.ivsize = AES_BLOCK_SIZE, .ivsize = AES_BLOCK_SIZE,
.setkey = aesbs_setkey, .setkey = aesbs_cbc_ctr_setkey,
.encrypt = ctr_encrypt, .encrypt = ctr_encrypt,
.decrypt = ctr_encrypt, .decrypt = ctr_encrypt,
}, { }, {
......
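The reworked ctr_encrypt() above batches whole multiples of eight blocks into the bit-sliced NEON routine and, only on the final walk step, lets a plain AES-CTR helper absorb the sub-eight-block tail. A compressed sketch of that split; bs_ctr_8blocks() and plain_ctr_tail() are hypothetical stand-ins for the real asm entry points:

#include <stddef.h>

#define AES_BLOCK_SIZE 16

/* Hypothetical stand-ins for aesbs_ctr_encrypt()/neon_aes_ctr_encrypt(). */
void bs_ctr_8blocks(unsigned char *dst, const unsigned char *src,
                    size_t blocks, unsigned char *iv);
void plain_ctr_tail(unsigned char *dst, const unsigned char *src,
                    size_t nbytes, unsigned char *iv);

/* Sketch: hand 8-block multiples to the bit-sliced path, then let a
 * single-block CTR helper finish whatever is left (including a partial
 * final block).
 */
static void ctr_split_sketch(unsigned char *dst, const unsigned char *src,
                             size_t nbytes, unsigned char *iv)
{
        size_t blocks = (nbytes / AES_BLOCK_SIZE) & ~(size_t)7;
        size_t tail = nbytes - blocks * AES_BLOCK_SIZE;

        if (blocks) {
                bs_ctr_8blocks(dst, src, blocks, iv);
                dst += blocks * AES_BLOCK_SIZE;
                src += blocks * AES_BLOCK_SIZE;
        }
        if (tail)
                plain_ctr_tail(dst, src, tail, iv);
}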
/* SPDX-License-Identifier: GPL-2.0 */ // SPDX-License-Identifier: GPL-2.0
/* /*
* sha3-ce-glue.c - core SHA-3 transform using v8.2 Crypto Extensions * sha3-ce-glue.c - core SHA-3 transform using v8.2 Crypto Extensions
* *
......
...@@ -43,7 +43,7 @@ ...@@ -43,7 +43,7 @@
# on Cortex-A53 (or by 4 cycles per round). # on Cortex-A53 (or by 4 cycles per round).
# (***) Super-impressive coefficients over gcc-generated code are # (***) Super-impressive coefficients over gcc-generated code are
# indication of some compiler "pathology", most notably code # indication of some compiler "pathology", most notably code
# generated with -mgeneral-regs-only is significanty faster # generated with -mgeneral-regs-only is significantly faster
# and the gap is only 40-90%. # and the gap is only 40-90%.
# #
# October 2016. # October 2016.
......
/* SPDX-License-Identifier: GPL-2.0 */ // SPDX-License-Identifier: GPL-2.0
/* /*
* sha512-ce-glue.c - SHA-384/SHA-512 using ARMv8 Crypto Extensions * sha512-ce-glue.c - SHA-384/SHA-512 using ARMv8 Crypto Extensions
* *
......
...@@ -26,8 +26,10 @@ asmlinkage void sm3_ce_transform(struct sm3_state *sst, u8 const *src, ...@@ -26,8 +26,10 @@ asmlinkage void sm3_ce_transform(struct sm3_state *sst, u8 const *src,
static int sm3_ce_update(struct shash_desc *desc, const u8 *data, static int sm3_ce_update(struct shash_desc *desc, const u8 *data,
unsigned int len) unsigned int len)
{ {
if (!crypto_simd_usable()) if (!crypto_simd_usable()) {
return crypto_sm3_update(desc, data, len); sm3_update(shash_desc_ctx(desc), data, len);
return 0;
}
kernel_neon_begin(); kernel_neon_begin();
sm3_base_do_update(desc, data, len, sm3_ce_transform); sm3_base_do_update(desc, data, len, sm3_ce_transform);
...@@ -38,8 +40,10 @@ static int sm3_ce_update(struct shash_desc *desc, const u8 *data, ...@@ -38,8 +40,10 @@ static int sm3_ce_update(struct shash_desc *desc, const u8 *data,
static int sm3_ce_final(struct shash_desc *desc, u8 *out) static int sm3_ce_final(struct shash_desc *desc, u8 *out)
{ {
if (!crypto_simd_usable()) if (!crypto_simd_usable()) {
return crypto_sm3_finup(desc, NULL, 0, out); sm3_final(shash_desc_ctx(desc), out);
return 0;
}
kernel_neon_begin(); kernel_neon_begin();
sm3_base_do_finalize(desc, sm3_ce_transform); sm3_base_do_finalize(desc, sm3_ce_transform);
...@@ -51,14 +55,22 @@ static int sm3_ce_final(struct shash_desc *desc, u8 *out) ...@@ -51,14 +55,22 @@ static int sm3_ce_final(struct shash_desc *desc, u8 *out)
static int sm3_ce_finup(struct shash_desc *desc, const u8 *data, static int sm3_ce_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out) unsigned int len, u8 *out)
{ {
if (!crypto_simd_usable()) if (!crypto_simd_usable()) {
return crypto_sm3_finup(desc, data, len, out); struct sm3_state *sctx = shash_desc_ctx(desc);
if (len)
sm3_update(sctx, data, len);
sm3_final(sctx, out);
return 0;
}
kernel_neon_begin(); kernel_neon_begin();
sm3_base_do_update(desc, data, len, sm3_ce_transform); if (len)
sm3_base_do_update(desc, data, len, sm3_ce_transform);
sm3_base_do_finalize(desc, sm3_ce_transform);
kernel_neon_end(); kernel_neon_end();
return sm3_ce_final(desc, out); return sm3_base_finish(desc, out);
} }
static struct shash_alg sm3_alg = { static struct shash_alg sm3_alg = {
......
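The !crypto_simd_usable() branches above now call straight into the SM3 library instead of the removed crypto_sm3_* wrappers. A minimal sketch of hashing a buffer with those helpers; sm3_init() is assumed to be the initialiser that <crypto/sm3.h> provides alongside sm3_update()/sm3_final():

#include <crypto/sm3.h>
#include <linux/types.h>

/* Sketch: one-shot SM3 over a buffer using the library fallback path. */
static void sm3_digest_sketch(const u8 *data, unsigned int len,
                              u8 out[SM3_DIGEST_SIZE])
{
        struct sm3_state sctx;

        sm3_init(&sctx);          /* assumed helper, see note above */
        sm3_update(&sctx, data, len);
        sm3_final(&sctx, out);
}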
...@@ -16,7 +16,8 @@ ...@@ -16,7 +16,8 @@
extern struct xor_block_template const xor_block_inner_neon; extern struct xor_block_template const xor_block_inner_neon;
static void static void
xor_neon_2(unsigned long bytes, unsigned long *p1, unsigned long *p2) xor_neon_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{ {
kernel_neon_begin(); kernel_neon_begin();
xor_block_inner_neon.do_2(bytes, p1, p2); xor_block_inner_neon.do_2(bytes, p1, p2);
...@@ -24,8 +25,9 @@ xor_neon_2(unsigned long bytes, unsigned long *p1, unsigned long *p2) ...@@ -24,8 +25,9 @@ xor_neon_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
} }
static void static void
xor_neon_3(unsigned long bytes, unsigned long *p1, unsigned long *p2, xor_neon_3(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *p3) const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{ {
kernel_neon_begin(); kernel_neon_begin();
xor_block_inner_neon.do_3(bytes, p1, p2, p3); xor_block_inner_neon.do_3(bytes, p1, p2, p3);
...@@ -33,8 +35,10 @@ xor_neon_3(unsigned long bytes, unsigned long *p1, unsigned long *p2, ...@@ -33,8 +35,10 @@ xor_neon_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
} }
static void static void
xor_neon_4(unsigned long bytes, unsigned long *p1, unsigned long *p2, xor_neon_4(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *p3, unsigned long *p4) const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{ {
kernel_neon_begin(); kernel_neon_begin();
xor_block_inner_neon.do_4(bytes, p1, p2, p3, p4); xor_block_inner_neon.do_4(bytes, p1, p2, p3, p4);
...@@ -42,8 +46,11 @@ xor_neon_4(unsigned long bytes, unsigned long *p1, unsigned long *p2, ...@@ -42,8 +46,11 @@ xor_neon_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
} }
static void static void
xor_neon_5(unsigned long bytes, unsigned long *p1, unsigned long *p2, xor_neon_5(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *p3, unsigned long *p4, unsigned long *p5) const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{ {
kernel_neon_begin(); kernel_neon_begin();
xor_block_inner_neon.do_5(bytes, p1, p2, p3, p4, p5); xor_block_inner_neon.do_5(bytes, p1, p2, p3, p4, p5);
......
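The prototype changes above mark the source operands const and __restrict, telling the compiler the inputs never alias the destination, so it can vectorise the inner loops and avoid reloading data after every store. The effect can be shown with a stand-alone XOR loop using the C99 restrict keyword (the kernel spells it __restrict):

#include <stddef.h>

/* Sketch: with restrict the compiler may assume dst and src do not
 * overlap, so it is free to load and store wide, reordered chunks.
 */
static void xor_words(size_t words, unsigned long *restrict dst,
                      const unsigned long *restrict src)
{
        size_t i;

        for (i = 0; i < words; i++)
                dst[i] ^= src[i];
}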
...@@ -11,7 +11,44 @@ ...@@ -11,7 +11,44 @@
.arch armv8-a+crc .arch armv8-a+crc
.macro __crc32, c .macro byteorder, reg, be
.if \be
CPU_LE( rev \reg, \reg )
.else
CPU_BE( rev \reg, \reg )
.endif
.endm
.macro byteorder16, reg, be
.if \be
CPU_LE( rev16 \reg, \reg )
.else
CPU_BE( rev16 \reg, \reg )
.endif
.endm
.macro bitorder, reg, be
.if \be
rbit \reg, \reg
.endif
.endm
.macro bitorder16, reg, be
.if \be
rbit \reg, \reg
lsr \reg, \reg, #16
.endif
.endm
.macro bitorder8, reg, be
.if \be
rbit \reg, \reg
lsr \reg, \reg, #24
.endif
.endm
.macro __crc32, c, be=0
bitorder w0, \be
cmp x2, #16 cmp x2, #16
b.lt 8f // less than 16 bytes b.lt 8f // less than 16 bytes
...@@ -24,10 +61,14 @@ ...@@ -24,10 +61,14 @@
add x8, x8, x1 add x8, x8, x1
add x1, x1, x7 add x1, x1, x7
ldp x5, x6, [x8] ldp x5, x6, [x8]
CPU_BE( rev x3, x3 ) byteorder x3, \be
CPU_BE( rev x4, x4 ) byteorder x4, \be
CPU_BE( rev x5, x5 ) byteorder x5, \be
CPU_BE( rev x6, x6 ) byteorder x6, \be
bitorder x3, \be
bitorder x4, \be
bitorder x5, \be
bitorder x6, \be
tst x7, #8 tst x7, #8
crc32\c\()x w8, w0, x3 crc32\c\()x w8, w0, x3
...@@ -55,33 +96,43 @@ CPU_BE( rev x6, x6 ) ...@@ -55,33 +96,43 @@ CPU_BE( rev x6, x6 )
32: ldp x3, x4, [x1], #32 32: ldp x3, x4, [x1], #32
sub x2, x2, #32 sub x2, x2, #32
ldp x5, x6, [x1, #-16] ldp x5, x6, [x1, #-16]
CPU_BE( rev x3, x3 ) byteorder x3, \be
CPU_BE( rev x4, x4 ) byteorder x4, \be
CPU_BE( rev x5, x5 ) byteorder x5, \be
CPU_BE( rev x6, x6 ) byteorder x6, \be
bitorder x3, \be
bitorder x4, \be
bitorder x5, \be
bitorder x6, \be
crc32\c\()x w0, w0, x3 crc32\c\()x w0, w0, x3
crc32\c\()x w0, w0, x4 crc32\c\()x w0, w0, x4
crc32\c\()x w0, w0, x5 crc32\c\()x w0, w0, x5
crc32\c\()x w0, w0, x6 crc32\c\()x w0, w0, x6
cbnz x2, 32b cbnz x2, 32b
0: ret 0: bitorder w0, \be
ret
8: tbz x2, #3, 4f 8: tbz x2, #3, 4f
ldr x3, [x1], #8 ldr x3, [x1], #8
CPU_BE( rev x3, x3 ) byteorder x3, \be
bitorder x3, \be
crc32\c\()x w0, w0, x3 crc32\c\()x w0, w0, x3
4: tbz x2, #2, 2f 4: tbz x2, #2, 2f
ldr w3, [x1], #4 ldr w3, [x1], #4
CPU_BE( rev w3, w3 ) byteorder w3, \be
bitorder w3, \be
crc32\c\()w w0, w0, w3 crc32\c\()w w0, w0, w3
2: tbz x2, #1, 1f 2: tbz x2, #1, 1f
ldrh w3, [x1], #2 ldrh w3, [x1], #2
CPU_BE( rev16 w3, w3 ) byteorder16 w3, \be
bitorder16 w3, \be
crc32\c\()h w0, w0, w3 crc32\c\()h w0, w0, w3
1: tbz x2, #0, 0f 1: tbz x2, #0, 0f
ldrb w3, [x1] ldrb w3, [x1]
bitorder8 w3, \be
crc32\c\()b w0, w0, w3 crc32\c\()b w0, w0, w3
0: ret 0: bitorder w0, \be
ret
.endm .endm
.align 5 .align 5
...@@ -99,3 +150,11 @@ alternative_if_not ARM64_HAS_CRC32 ...@@ -99,3 +150,11 @@ alternative_if_not ARM64_HAS_CRC32
alternative_else_nop_endif alternative_else_nop_endif
__crc32 c __crc32 c
SYM_FUNC_END(__crc32c_le) SYM_FUNC_END(__crc32c_le)
.align 5
SYM_FUNC_START(crc32_be)
alternative_if_not ARM64_HAS_CRC32
b crc32_be_base
alternative_else_nop_endif
__crc32 be=1
SYM_FUNC_END(crc32_be)
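The ARMv8 crc32 instructions work on bit-reflected data, so the new crc32_be entry point bit-reverses the inputs and the running value around them; that is all the bitorder*/byteorder macros do. For reference, the result has to match the plain MSB-first CRC-32 over polynomial 0x04c11db7, which can be written bitwise in C as:

#include <stddef.h>
#include <stdint.h>

#define CRCPOLY_BE 0x04c11db7u

/* Bitwise reference for crc32_be(): MSB-first, no reflection, no
 * final inversion; sketch for illustration only.
 */
static uint32_t crc32_be_ref(uint32_t crc, const unsigned char *p, size_t len)
{
        int i;

        while (len--) {
                crc ^= (uint32_t)*p++ << 24;
                for (i = 0; i < 8; i++)
                        crc = (crc & 0x80000000u) ? (crc << 1) ^ CRCPOLY_BE
                                                  : crc << 1;
        }
        return crc;
}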
...@@ -10,8 +10,8 @@ ...@@ -10,8 +10,8 @@
#include <linux/module.h> #include <linux/module.h>
#include <asm/neon-intrinsics.h> #include <asm/neon-intrinsics.h>
void xor_arm64_neon_2(unsigned long bytes, unsigned long *p1, void xor_arm64_neon_2(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *p2) const unsigned long * __restrict p2)
{ {
uint64_t *dp1 = (uint64_t *)p1; uint64_t *dp1 = (uint64_t *)p1;
uint64_t *dp2 = (uint64_t *)p2; uint64_t *dp2 = (uint64_t *)p2;
...@@ -37,8 +37,9 @@ void xor_arm64_neon_2(unsigned long bytes, unsigned long *p1, ...@@ -37,8 +37,9 @@ void xor_arm64_neon_2(unsigned long bytes, unsigned long *p1,
} while (--lines > 0); } while (--lines > 0);
} }
void xor_arm64_neon_3(unsigned long bytes, unsigned long *p1, void xor_arm64_neon_3(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *p2, unsigned long *p3) const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{ {
uint64_t *dp1 = (uint64_t *)p1; uint64_t *dp1 = (uint64_t *)p1;
uint64_t *dp2 = (uint64_t *)p2; uint64_t *dp2 = (uint64_t *)p2;
...@@ -72,8 +73,10 @@ void xor_arm64_neon_3(unsigned long bytes, unsigned long *p1, ...@@ -72,8 +73,10 @@ void xor_arm64_neon_3(unsigned long bytes, unsigned long *p1,
} while (--lines > 0); } while (--lines > 0);
} }
void xor_arm64_neon_4(unsigned long bytes, unsigned long *p1, void xor_arm64_neon_4(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *p2, unsigned long *p3, unsigned long *p4) const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{ {
uint64_t *dp1 = (uint64_t *)p1; uint64_t *dp1 = (uint64_t *)p1;
uint64_t *dp2 = (uint64_t *)p2; uint64_t *dp2 = (uint64_t *)p2;
...@@ -115,9 +118,11 @@ void xor_arm64_neon_4(unsigned long bytes, unsigned long *p1, ...@@ -115,9 +118,11 @@ void xor_arm64_neon_4(unsigned long bytes, unsigned long *p1,
} while (--lines > 0); } while (--lines > 0);
} }
void xor_arm64_neon_5(unsigned long bytes, unsigned long *p1, void xor_arm64_neon_5(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *p2, unsigned long *p3, const unsigned long * __restrict p2,
unsigned long *p4, unsigned long *p5) const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{ {
uint64_t *dp1 = (uint64_t *)p1; uint64_t *dp1 = (uint64_t *)p1;
uint64_t *dp2 = (uint64_t *)p2; uint64_t *dp2 = (uint64_t *)p2;
...@@ -186,8 +191,10 @@ static inline uint64x2_t eor3(uint64x2_t p, uint64x2_t q, uint64x2_t r) ...@@ -186,8 +191,10 @@ static inline uint64x2_t eor3(uint64x2_t p, uint64x2_t q, uint64x2_t r)
return res; return res;
} }
static void xor_arm64_eor3_3(unsigned long bytes, unsigned long *p1, static void xor_arm64_eor3_3(unsigned long bytes,
unsigned long *p2, unsigned long *p3) unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{ {
uint64_t *dp1 = (uint64_t *)p1; uint64_t *dp1 = (uint64_t *)p1;
uint64_t *dp2 = (uint64_t *)p2; uint64_t *dp2 = (uint64_t *)p2;
...@@ -219,9 +226,11 @@ static void xor_arm64_eor3_3(unsigned long bytes, unsigned long *p1, ...@@ -219,9 +226,11 @@ static void xor_arm64_eor3_3(unsigned long bytes, unsigned long *p1,
} while (--lines > 0); } while (--lines > 0);
} }
static void xor_arm64_eor3_4(unsigned long bytes, unsigned long *p1, static void xor_arm64_eor3_4(unsigned long bytes,
unsigned long *p2, unsigned long *p3, unsigned long * __restrict p1,
unsigned long *p4) const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{ {
uint64_t *dp1 = (uint64_t *)p1; uint64_t *dp1 = (uint64_t *)p1;
uint64_t *dp2 = (uint64_t *)p2; uint64_t *dp2 = (uint64_t *)p2;
...@@ -261,9 +270,12 @@ static void xor_arm64_eor3_4(unsigned long bytes, unsigned long *p1, ...@@ -261,9 +270,12 @@ static void xor_arm64_eor3_4(unsigned long bytes, unsigned long *p1,
} while (--lines > 0); } while (--lines > 0);
} }
static void xor_arm64_eor3_5(unsigned long bytes, unsigned long *p1, static void xor_arm64_eor3_5(unsigned long bytes,
unsigned long *p2, unsigned long *p3, unsigned long * __restrict p1,
unsigned long *p4, unsigned long *p5) const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{ {
uint64_t *dp1 = (uint64_t *)p1; uint64_t *dp1 = (uint64_t *)p1;
uint64_t *dp2 = (uint64_t *)p2; uint64_t *dp2 = (uint64_t *)p2;
......
...@@ -4,13 +4,20 @@ ...@@ -4,13 +4,20 @@
*/ */
extern void xor_ia64_2(unsigned long, unsigned long *, unsigned long *); extern void xor_ia64_2(unsigned long bytes, unsigned long * __restrict p1,
extern void xor_ia64_3(unsigned long, unsigned long *, unsigned long *, const unsigned long * __restrict p2);
unsigned long *); extern void xor_ia64_3(unsigned long bytes, unsigned long * __restrict p1,
extern void xor_ia64_4(unsigned long, unsigned long *, unsigned long *, const unsigned long * __restrict p2,
unsigned long *, unsigned long *); const unsigned long * __restrict p3);
extern void xor_ia64_5(unsigned long, unsigned long *, unsigned long *, extern void xor_ia64_4(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *, unsigned long *, unsigned long *); const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4);
extern void xor_ia64_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5);
static struct xor_block_template xor_block_ia64 = { static struct xor_block_template xor_block_ia64 = {
.name = "ia64", .name = "ia64",
......
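These prototypes feed the per-architecture struct xor_block_template instances (such as xor_block_ia64 just above), whose .do_2 .. .do_5 members the RAID code benchmarks and calls, so the member types change in lock-step with the functions. A hedged sketch of wiring a routine with the new shape into such a template; only .do_2 is filled in here, a real template also supplies .do_3 .. .do_5:

#include <linux/raid/xor.h>

/* Hypothetical 2-source routine using the updated const/__restrict shape. */
static void sketch_xor_2(unsigned long bytes, unsigned long * __restrict p1,
                         const unsigned long * __restrict p2)
{
        unsigned long i;

        for (i = 0; i < bytes / sizeof(unsigned long); i++)
                p1[i] ^= p2[i];
}

/* Sketch only; not a complete template. */
static struct xor_block_template xor_block_sketch = {
        .name = "sketch",
        .do_2 = sketch_xor_2,
};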
...@@ -3,17 +3,20 @@ ...@@ -3,17 +3,20 @@
#define _ASM_POWERPC_XOR_ALTIVEC_H #define _ASM_POWERPC_XOR_ALTIVEC_H
#ifdef CONFIG_ALTIVEC #ifdef CONFIG_ALTIVEC
void xor_altivec_2(unsigned long bytes, unsigned long * __restrict p1,
void xor_altivec_2(unsigned long bytes, unsigned long *v1_in, const unsigned long * __restrict p2);
unsigned long *v2_in); void xor_altivec_3(unsigned long bytes, unsigned long * __restrict p1,
void xor_altivec_3(unsigned long bytes, unsigned long *v1_in, const unsigned long * __restrict p2,
unsigned long *v2_in, unsigned long *v3_in); const unsigned long * __restrict p3);
void xor_altivec_4(unsigned long bytes, unsigned long *v1_in, void xor_altivec_4(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *v2_in, unsigned long *v3_in, const unsigned long * __restrict p2,
unsigned long *v4_in); const unsigned long * __restrict p3,
void xor_altivec_5(unsigned long bytes, unsigned long *v1_in, const unsigned long * __restrict p4);
unsigned long *v2_in, unsigned long *v3_in, void xor_altivec_5(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *v4_in, unsigned long *v5_in); const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5);
#endif #endif
#endif /* _ASM_POWERPC_XOR_ALTIVEC_H */ #endif /* _ASM_POWERPC_XOR_ALTIVEC_H */
...@@ -49,8 +49,9 @@ typedef vector signed char unative_t; ...@@ -49,8 +49,9 @@ typedef vector signed char unative_t;
V1##_3 = vec_xor(V1##_3, V2##_3); \ V1##_3 = vec_xor(V1##_3, V2##_3); \
} while (0) } while (0)
void __xor_altivec_2(unsigned long bytes, unsigned long *v1_in, void __xor_altivec_2(unsigned long bytes,
unsigned long *v2_in) unsigned long * __restrict v1_in,
const unsigned long * __restrict v2_in)
{ {
DEFINE(v1); DEFINE(v1);
DEFINE(v2); DEFINE(v2);
...@@ -67,8 +68,10 @@ void __xor_altivec_2(unsigned long bytes, unsigned long *v1_in, ...@@ -67,8 +68,10 @@ void __xor_altivec_2(unsigned long bytes, unsigned long *v1_in,
} while (--lines > 0); } while (--lines > 0);
} }
void __xor_altivec_3(unsigned long bytes, unsigned long *v1_in, void __xor_altivec_3(unsigned long bytes,
unsigned long *v2_in, unsigned long *v3_in) unsigned long * __restrict v1_in,
const unsigned long * __restrict v2_in,
const unsigned long * __restrict v3_in)
{ {
DEFINE(v1); DEFINE(v1);
DEFINE(v2); DEFINE(v2);
...@@ -89,9 +92,11 @@ void __xor_altivec_3(unsigned long bytes, unsigned long *v1_in, ...@@ -89,9 +92,11 @@ void __xor_altivec_3(unsigned long bytes, unsigned long *v1_in,
} while (--lines > 0); } while (--lines > 0);
} }
void __xor_altivec_4(unsigned long bytes, unsigned long *v1_in, void __xor_altivec_4(unsigned long bytes,
unsigned long *v2_in, unsigned long *v3_in, unsigned long * __restrict v1_in,
unsigned long *v4_in) const unsigned long * __restrict v2_in,
const unsigned long * __restrict v3_in,
const unsigned long * __restrict v4_in)
{ {
DEFINE(v1); DEFINE(v1);
DEFINE(v2); DEFINE(v2);
...@@ -116,9 +121,12 @@ void __xor_altivec_4(unsigned long bytes, unsigned long *v1_in, ...@@ -116,9 +121,12 @@ void __xor_altivec_4(unsigned long bytes, unsigned long *v1_in,
} while (--lines > 0); } while (--lines > 0);
} }
void __xor_altivec_5(unsigned long bytes, unsigned long *v1_in, void __xor_altivec_5(unsigned long bytes,
unsigned long *v2_in, unsigned long *v3_in, unsigned long * __restrict v1_in,
unsigned long *v4_in, unsigned long *v5_in) const unsigned long * __restrict v2_in,
const unsigned long * __restrict v3_in,
const unsigned long * __restrict v4_in,
const unsigned long * __restrict v5_in)
{ {
DEFINE(v1); DEFINE(v1);
DEFINE(v2); DEFINE(v2);
......
...@@ -6,16 +6,17 @@ ...@@ -6,16 +6,17 @@
* outside of the enable/disable altivec block. * outside of the enable/disable altivec block.
*/ */
void __xor_altivec_2(unsigned long bytes, unsigned long *v1_in, void __xor_altivec_2(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *v2_in); const unsigned long * __restrict p2);
void __xor_altivec_3(unsigned long bytes, unsigned long * __restrict p1,
void __xor_altivec_3(unsigned long bytes, unsigned long *v1_in, const unsigned long * __restrict p2,
unsigned long *v2_in, unsigned long *v3_in); const unsigned long * __restrict p3);
void __xor_altivec_4(unsigned long bytes, unsigned long * __restrict p1,
void __xor_altivec_4(unsigned long bytes, unsigned long *v1_in, const unsigned long * __restrict p2,
unsigned long *v2_in, unsigned long *v3_in, const unsigned long * __restrict p3,
unsigned long *v4_in); const unsigned long * __restrict p4);
void __xor_altivec_5(unsigned long bytes, unsigned long * __restrict p1,
void __xor_altivec_5(unsigned long bytes, unsigned long *v1_in, const unsigned long * __restrict p2,
unsigned long *v2_in, unsigned long *v3_in, const unsigned long * __restrict p3,
unsigned long *v4_in, unsigned long *v5_in); const unsigned long * __restrict p4,
const unsigned long * __restrict p5);
...@@ -12,47 +12,51 @@ ...@@ -12,47 +12,51 @@
#include <asm/xor_altivec.h> #include <asm/xor_altivec.h>
#include "xor_vmx.h" #include "xor_vmx.h"
void xor_altivec_2(unsigned long bytes, unsigned long *v1_in, void xor_altivec_2(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *v2_in) const unsigned long * __restrict p2)
{ {
preempt_disable(); preempt_disable();
enable_kernel_altivec(); enable_kernel_altivec();
__xor_altivec_2(bytes, v1_in, v2_in); __xor_altivec_2(bytes, p1, p2);
disable_kernel_altivec(); disable_kernel_altivec();
preempt_enable(); preempt_enable();
} }
EXPORT_SYMBOL(xor_altivec_2); EXPORT_SYMBOL(xor_altivec_2);
void xor_altivec_3(unsigned long bytes, unsigned long *v1_in, void xor_altivec_3(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *v2_in, unsigned long *v3_in) const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{ {
preempt_disable(); preempt_disable();
enable_kernel_altivec(); enable_kernel_altivec();
__xor_altivec_3(bytes, v1_in, v2_in, v3_in); __xor_altivec_3(bytes, p1, p2, p3);
disable_kernel_altivec(); disable_kernel_altivec();
preempt_enable(); preempt_enable();
} }
EXPORT_SYMBOL(xor_altivec_3); EXPORT_SYMBOL(xor_altivec_3);
void xor_altivec_4(unsigned long bytes, unsigned long *v1_in, void xor_altivec_4(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *v2_in, unsigned long *v3_in, const unsigned long * __restrict p2,
unsigned long *v4_in) const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{ {
preempt_disable(); preempt_disable();
enable_kernel_altivec(); enable_kernel_altivec();
__xor_altivec_4(bytes, v1_in, v2_in, v3_in, v4_in); __xor_altivec_4(bytes, p1, p2, p3, p4);
disable_kernel_altivec(); disable_kernel_altivec();
preempt_enable(); preempt_enable();
} }
EXPORT_SYMBOL(xor_altivec_4); EXPORT_SYMBOL(xor_altivec_4);
void xor_altivec_5(unsigned long bytes, unsigned long *v1_in, void xor_altivec_5(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *v2_in, unsigned long *v3_in, const unsigned long * __restrict p2,
unsigned long *v4_in, unsigned long *v5_in) const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{ {
preempt_disable(); preempt_disable();
enable_kernel_altivec(); enable_kernel_altivec();
__xor_altivec_5(bytes, v1_in, v2_in, v3_in, v4_in, v5_in); __xor_altivec_5(bytes, p1, p2, p3, p4, p5);
disable_kernel_altivec(); disable_kernel_altivec();
preempt_enable(); preempt_enable();
} }
......
...@@ -11,7 +11,8 @@ ...@@ -11,7 +11,8 @@
#include <linux/raid/xor.h> #include <linux/raid/xor.h>
#include <asm/xor.h> #include <asm/xor.h>
static void xor_xc_2(unsigned long bytes, unsigned long *p1, unsigned long *p2) static void xor_xc_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{ {
asm volatile( asm volatile(
" larl 1,2f\n" " larl 1,2f\n"
...@@ -32,8 +33,9 @@ static void xor_xc_2(unsigned long bytes, unsigned long *p1, unsigned long *p2) ...@@ -32,8 +33,9 @@ static void xor_xc_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
: "0", "1", "cc", "memory"); : "0", "1", "cc", "memory");
} }
static void xor_xc_3(unsigned long bytes, unsigned long *p1, unsigned long *p2, static void xor_xc_3(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *p3) const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{ {
asm volatile( asm volatile(
" larl 1,2f\n" " larl 1,2f\n"
...@@ -58,8 +60,10 @@ static void xor_xc_3(unsigned long bytes, unsigned long *p1, unsigned long *p2, ...@@ -58,8 +60,10 @@ static void xor_xc_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
: : "0", "1", "cc", "memory"); : : "0", "1", "cc", "memory");
} }
static void xor_xc_4(unsigned long bytes, unsigned long *p1, unsigned long *p2, static void xor_xc_4(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *p3, unsigned long *p4) const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{ {
asm volatile( asm volatile(
" larl 1,2f\n" " larl 1,2f\n"
...@@ -88,8 +92,11 @@ static void xor_xc_4(unsigned long bytes, unsigned long *p1, unsigned long *p2, ...@@ -88,8 +92,11 @@ static void xor_xc_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
: : "0", "1", "cc", "memory"); : : "0", "1", "cc", "memory");
} }
static void xor_xc_5(unsigned long bytes, unsigned long *p1, unsigned long *p2, static void xor_xc_5(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *p3, unsigned long *p4, unsigned long *p5) const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{ {
asm volatile( asm volatile(
" larl 1,2f\n" " larl 1,2f\n"
......
...@@ -13,7 +13,8 @@ ...@@ -13,7 +13,8 @@
*/ */
static void static void
sparc_2(unsigned long bytes, unsigned long *p1, unsigned long *p2) sparc_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{ {
int lines = bytes / (sizeof (long)) / 8; int lines = bytes / (sizeof (long)) / 8;
...@@ -50,8 +51,9 @@ sparc_2(unsigned long bytes, unsigned long *p1, unsigned long *p2) ...@@ -50,8 +51,9 @@ sparc_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
} }
static void static void
sparc_3(unsigned long bytes, unsigned long *p1, unsigned long *p2, sparc_3(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *p3) const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{ {
int lines = bytes / (sizeof (long)) / 8; int lines = bytes / (sizeof (long)) / 8;
...@@ -101,8 +103,10 @@ sparc_3(unsigned long bytes, unsigned long *p1, unsigned long *p2, ...@@ -101,8 +103,10 @@ sparc_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
} }
static void static void
sparc_4(unsigned long bytes, unsigned long *p1, unsigned long *p2, sparc_4(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *p3, unsigned long *p4) const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{ {
int lines = bytes / (sizeof (long)) / 8; int lines = bytes / (sizeof (long)) / 8;
...@@ -165,8 +169,11 @@ sparc_4(unsigned long bytes, unsigned long *p1, unsigned long *p2, ...@@ -165,8 +169,11 @@ sparc_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
} }
static void static void
sparc_5(unsigned long bytes, unsigned long *p1, unsigned long *p2, sparc_5(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *p3, unsigned long *p4, unsigned long *p5) const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{ {
int lines = bytes / (sizeof (long)) / 8; int lines = bytes / (sizeof (long)) / 8;
......
...@@ -12,13 +12,20 @@ ...@@ -12,13 +12,20 @@
#include <asm/spitfire.h> #include <asm/spitfire.h>
void xor_vis_2(unsigned long, unsigned long *, unsigned long *); void xor_vis_2(unsigned long bytes, unsigned long * __restrict p1,
void xor_vis_3(unsigned long, unsigned long *, unsigned long *, const unsigned long * __restrict p2);
unsigned long *); void xor_vis_3(unsigned long bytes, unsigned long * __restrict p1,
void xor_vis_4(unsigned long, unsigned long *, unsigned long *, const unsigned long * __restrict p2,
unsigned long *, unsigned long *); const unsigned long * __restrict p3);
void xor_vis_5(unsigned long, unsigned long *, unsigned long *, void xor_vis_4(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *, unsigned long *, unsigned long *); const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4);
void xor_vis_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5);
/* XXX Ugh, write cheetah versions... -DaveM */ /* XXX Ugh, write cheetah versions... -DaveM */
...@@ -30,13 +37,20 @@ static struct xor_block_template xor_block_VIS = { ...@@ -30,13 +37,20 @@ static struct xor_block_template xor_block_VIS = {
.do_5 = xor_vis_5, .do_5 = xor_vis_5,
}; };
void xor_niagara_2(unsigned long, unsigned long *, unsigned long *); void xor_niagara_2(unsigned long bytes, unsigned long * __restrict p1,
void xor_niagara_3(unsigned long, unsigned long *, unsigned long *, const unsigned long * __restrict p2);
unsigned long *); void xor_niagara_3(unsigned long bytes, unsigned long * __restrict p1,
void xor_niagara_4(unsigned long, unsigned long *, unsigned long *, const unsigned long * __restrict p2,
unsigned long *, unsigned long *); const unsigned long * __restrict p3);
void xor_niagara_5(unsigned long, unsigned long *, unsigned long *, void xor_niagara_4(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *, unsigned long *, unsigned long *); const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4);
void xor_niagara_5(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5);
static struct xor_block_template xor_block_niagara = { static struct xor_block_template xor_block_niagara = {
.name = "Niagara", .name = "Niagara",
......
...@@ -90,6 +90,9 @@ nhpoly1305-avx2-y := nh-avx2-x86_64.o nhpoly1305-avx2-glue.o ...@@ -90,6 +90,9 @@ nhpoly1305-avx2-y := nh-avx2-x86_64.o nhpoly1305-avx2-glue.o
obj-$(CONFIG_CRYPTO_CURVE25519_X86) += curve25519-x86_64.o obj-$(CONFIG_CRYPTO_CURVE25519_X86) += curve25519-x86_64.o
obj-$(CONFIG_CRYPTO_SM3_AVX_X86_64) += sm3-avx-x86_64.o
sm3-avx-x86_64-y := sm3-avx-asm_64.o sm3_avx_glue.o
obj-$(CONFIG_CRYPTO_SM4_AESNI_AVX_X86_64) += sm4-aesni-avx-x86_64.o obj-$(CONFIG_CRYPTO_SM4_AESNI_AVX_X86_64) += sm4-aesni-avx-x86_64.o
sm4-aesni-avx-x86_64-y := sm4-aesni-avx-asm_64.o sm4_aesni_avx_glue.o sm4-aesni-avx-x86_64-y := sm4-aesni-avx-asm_64.o sm4_aesni_avx_glue.o
......
/* SPDX-License-Identifier: GPL-2.0-only OR BSD-3-Clause */
/* /*
* Implement AES CTR mode by8 optimization with AVX instructions. (x86_64) * AES CTR mode by8 optimization with AVX instructions. (x86_64)
*
* This is AES128/192/256 CTR mode optimization implementation. It requires
* the support of Intel(R) AESNI and AVX instructions.
*
* This work was inspired by the AES CTR mode optimization published
* in Intel Optimized IPSEC Cryptograhpic library.
* Additional information on it can be found at:
* http://downloadcenter.intel.com/Detail_Desc.aspx?agr=Y&DwnldID=22972
*
* This file is provided under a dual BSD/GPLv2 license. When using or
* redistributing this file, you may do so under either license.
*
* GPL LICENSE SUMMARY
* *
* Copyright(c) 2014 Intel Corporation. * Copyright(c) 2014 Intel Corporation.
* *
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful, but
* WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* Contact Information: * Contact Information:
* James Guilford <james.guilford@intel.com> * James Guilford <james.guilford@intel.com>
* Sean Gulley <sean.m.gulley@intel.com> * Sean Gulley <sean.m.gulley@intel.com>
* Chandramouli Narayanan <mouli@linux.intel.com> * Chandramouli Narayanan <mouli@linux.intel.com>
*/
/*
* This is AES128/192/256 CTR mode optimization implementation. It requires
* the support of Intel(R) AESNI and AVX instructions.
* *
* BSD LICENSE * This work was inspired by the AES CTR mode optimization published
* * in Intel Optimized IPSEC Cryptographic library.
* Copyright(c) 2014 Intel Corporation. * Additional information on it can be found at:
* * https://github.com/intel/intel-ipsec-mb
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in
* the documentation and/or other materials provided with the
* distribution.
* Neither the name of Intel Corporation nor the names of its
* contributors may be used to endorse or promote products derived
* from this software without specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*
*/ */
#include <linux/linkage.h> #include <linux/linkage.h>
......
...@@ -32,24 +32,12 @@ static inline void blowfish_enc_blk(struct bf_ctx *ctx, u8 *dst, const u8 *src) ...@@ -32,24 +32,12 @@ static inline void blowfish_enc_blk(struct bf_ctx *ctx, u8 *dst, const u8 *src)
__blowfish_enc_blk(ctx, dst, src, false); __blowfish_enc_blk(ctx, dst, src, false);
} }
static inline void blowfish_enc_blk_xor(struct bf_ctx *ctx, u8 *dst,
const u8 *src)
{
__blowfish_enc_blk(ctx, dst, src, true);
}
static inline void blowfish_enc_blk_4way(struct bf_ctx *ctx, u8 *dst, static inline void blowfish_enc_blk_4way(struct bf_ctx *ctx, u8 *dst,
const u8 *src) const u8 *src)
{ {
__blowfish_enc_blk_4way(ctx, dst, src, false); __blowfish_enc_blk_4way(ctx, dst, src, false);
} }
static inline void blowfish_enc_blk_xor_4way(struct bf_ctx *ctx, u8 *dst,
const u8 *src)
{
__blowfish_enc_blk_4way(ctx, dst, src, true);
}
static void blowfish_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src) static void blowfish_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
{ {
blowfish_enc_blk(crypto_tfm_ctx(tfm), dst, src); blowfish_enc_blk(crypto_tfm_ctx(tfm), dst, src);
......
...@@ -45,14 +45,6 @@ static inline void des3_ede_dec_blk(struct des3_ede_x86_ctx *ctx, u8 *dst, ...@@ -45,14 +45,6 @@ static inline void des3_ede_dec_blk(struct des3_ede_x86_ctx *ctx, u8 *dst,
des3_ede_x86_64_crypt_blk(dec_ctx, dst, src); des3_ede_x86_64_crypt_blk(dec_ctx, dst, src);
} }
static inline void des3_ede_enc_blk_3way(struct des3_ede_x86_ctx *ctx, u8 *dst,
const u8 *src)
{
u32 *enc_ctx = ctx->enc.expkey;
des3_ede_x86_64_crypt_blk_3way(enc_ctx, dst, src);
}
static inline void des3_ede_dec_blk_3way(struct des3_ede_x86_ctx *ctx, u8 *dst, static inline void des3_ede_dec_blk_3way(struct des3_ede_x86_ctx *ctx, u8 *dst,
const u8 *src) const u8 *src)
{ {
......
This diff is collapsed.
/* SPDX-License-Identifier: GPL-2.0-or-later */
/*
* SM3 Secure Hash Algorithm, AVX assembler accelerated.
* specified in: https://datatracker.ietf.org/doc/html/draft-sca-cfrg-sm3-02
*
* Copyright (C) 2021 Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
*/
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <crypto/internal/hash.h>
#include <crypto/internal/simd.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/types.h>
#include <crypto/sm3.h>
#include <crypto/sm3_base.h>
#include <asm/simd.h>
asmlinkage void sm3_transform_avx(struct sm3_state *state,
const u8 *data, int nblocks);
static int sm3_avx_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
struct sm3_state *sctx = shash_desc_ctx(desc);
if (!crypto_simd_usable() ||
(sctx->count % SM3_BLOCK_SIZE) + len < SM3_BLOCK_SIZE) {
sm3_update(sctx, data, len);
return 0;
}
/*
* Make sure struct sm3_state begins directly with the SM3
* 256-bit internal state, as this is what the asm functions expect.
*/
BUILD_BUG_ON(offsetof(struct sm3_state, state) != 0);
kernel_fpu_begin();
sm3_base_do_update(desc, data, len, sm3_transform_avx);
kernel_fpu_end();
return 0;
}
static int sm3_avx_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out)
{
if (!crypto_simd_usable()) {
struct sm3_state *sctx = shash_desc_ctx(desc);
if (len)
sm3_update(sctx, data, len);
sm3_final(sctx, out);
return 0;
}
kernel_fpu_begin();
if (len)
sm3_base_do_update(desc, data, len, sm3_transform_avx);
sm3_base_do_finalize(desc, sm3_transform_avx);
kernel_fpu_end();
return sm3_base_finish(desc, out);
}
static int sm3_avx_final(struct shash_desc *desc, u8 *out)
{
if (!crypto_simd_usable()) {
sm3_final(shash_desc_ctx(desc), out);
return 0;
}
kernel_fpu_begin();
sm3_base_do_finalize(desc, sm3_transform_avx);
kernel_fpu_end();
return sm3_base_finish(desc, out);
}
static struct shash_alg sm3_avx_alg = {
.digestsize = SM3_DIGEST_SIZE,
.init = sm3_base_init,
.update = sm3_avx_update,
.final = sm3_avx_final,
.finup = sm3_avx_finup,
.descsize = sizeof(struct sm3_state),
.base = {
.cra_name = "sm3",
.cra_driver_name = "sm3-avx",
.cra_priority = 300,
.cra_blocksize = SM3_BLOCK_SIZE,
.cra_module = THIS_MODULE,
}
};
static int __init sm3_avx_mod_init(void)
{
const char *feature_name;
if (!boot_cpu_has(X86_FEATURE_AVX)) {
pr_info("AVX instructions are not detected.\n");
return -ENODEV;
}
if (!boot_cpu_has(X86_FEATURE_BMI2)) {
pr_info("BMI2 instructions are not detected.\n");
return -ENODEV;
}
if (!cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM,
&feature_name)) {
pr_info("CPU feature '%s' is not supported.\n", feature_name);
return -ENODEV;
}
return crypto_register_shash(&sm3_avx_alg);
}
static void __exit sm3_avx_mod_exit(void)
{
crypto_unregister_shash(&sm3_avx_alg);
}
module_init(sm3_avx_mod_init);
module_exit(sm3_avx_mod_exit);
MODULE_LICENSE("GPL v2");
MODULE_AUTHOR("Tianjia Zhang <tianjia.zhang@linux.alibaba.com>");
MODULE_DESCRIPTION("SM3 Secure Hash Algorithm, AVX assembler accelerated");
MODULE_ALIAS_CRYPTO("sm3");
MODULE_ALIAS_CRYPTO("sm3-avx");
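With the driver above registered at priority 300, the crypto API resolves the generic name "sm3" to "sm3-avx" whenever it is the best implementation available. A user-space sketch that exercises it through AF_ALG (error handling omitted; assumes the kernel was built with the user-space hash interface enabled):

#include <linux/if_alg.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
        struct sockaddr_alg sa = {
                .salg_family = AF_ALG,
                .salg_type   = "hash",
                .salg_name   = "sm3",   /* resolved to sm3-avx if best */
        };
        unsigned char digest[32];       /* SM3_DIGEST_SIZE */
        int tfm, op, i;

        tfm = socket(AF_ALG, SOCK_SEQPACKET, 0);
        bind(tfm, (struct sockaddr *)&sa, sizeof(sa));
        op = accept(tfm, NULL, 0);

        write(op, "abc", 3);            /* message to hash */
        read(op, digest, sizeof(digest));

        for (i = 0; i < (int)sizeof(digest); i++)
                printf("%02x", digest[i]);
        printf("\n");

        close(op);
        close(tfm);
        return 0;
}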
...@@ -57,7 +57,8 @@ ...@@ -57,7 +57,8 @@
op(i + 3, 3) op(i + 3, 3)
static void static void
xor_sse_2(unsigned long bytes, unsigned long *p1, unsigned long *p2) xor_sse_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{ {
unsigned long lines = bytes >> 8; unsigned long lines = bytes >> 8;
...@@ -108,7 +109,8 @@ xor_sse_2(unsigned long bytes, unsigned long *p1, unsigned long *p2) ...@@ -108,7 +109,8 @@ xor_sse_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
} }
static void static void
xor_sse_2_pf64(unsigned long bytes, unsigned long *p1, unsigned long *p2) xor_sse_2_pf64(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{ {
unsigned long lines = bytes >> 8; unsigned long lines = bytes >> 8;
...@@ -142,8 +144,9 @@ xor_sse_2_pf64(unsigned long bytes, unsigned long *p1, unsigned long *p2) ...@@ -142,8 +144,9 @@ xor_sse_2_pf64(unsigned long bytes, unsigned long *p1, unsigned long *p2)
} }
static void static void
xor_sse_3(unsigned long bytes, unsigned long *p1, unsigned long *p2, xor_sse_3(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *p3) const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{ {
unsigned long lines = bytes >> 8; unsigned long lines = bytes >> 8;
...@@ -201,8 +204,9 @@ xor_sse_3(unsigned long bytes, unsigned long *p1, unsigned long *p2, ...@@ -201,8 +204,9 @@ xor_sse_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
} }
static void static void
xor_sse_3_pf64(unsigned long bytes, unsigned long *p1, unsigned long *p2, xor_sse_3_pf64(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *p3) const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{ {
unsigned long lines = bytes >> 8; unsigned long lines = bytes >> 8;
...@@ -238,8 +242,10 @@ xor_sse_3_pf64(unsigned long bytes, unsigned long *p1, unsigned long *p2, ...@@ -238,8 +242,10 @@ xor_sse_3_pf64(unsigned long bytes, unsigned long *p1, unsigned long *p2,
} }
static void static void
xor_sse_4(unsigned long bytes, unsigned long *p1, unsigned long *p2, xor_sse_4(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *p3, unsigned long *p4) const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{ {
unsigned long lines = bytes >> 8; unsigned long lines = bytes >> 8;
...@@ -304,8 +310,10 @@ xor_sse_4(unsigned long bytes, unsigned long *p1, unsigned long *p2, ...@@ -304,8 +310,10 @@ xor_sse_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
} }
static void static void
xor_sse_4_pf64(unsigned long bytes, unsigned long *p1, unsigned long *p2, xor_sse_4_pf64(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *p3, unsigned long *p4) const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{ {
unsigned long lines = bytes >> 8; unsigned long lines = bytes >> 8;
...@@ -343,8 +351,11 @@ xor_sse_4_pf64(unsigned long bytes, unsigned long *p1, unsigned long *p2, ...@@ -343,8 +351,11 @@ xor_sse_4_pf64(unsigned long bytes, unsigned long *p1, unsigned long *p2,
} }
static void static void
xor_sse_5(unsigned long bytes, unsigned long *p1, unsigned long *p2, xor_sse_5(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *p3, unsigned long *p4, unsigned long *p5) const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{ {
unsigned long lines = bytes >> 8; unsigned long lines = bytes >> 8;
...@@ -416,8 +427,11 @@ xor_sse_5(unsigned long bytes, unsigned long *p1, unsigned long *p2, ...@@ -416,8 +427,11 @@ xor_sse_5(unsigned long bytes, unsigned long *p1, unsigned long *p2,
} }
static void static void
xor_sse_5_pf64(unsigned long bytes, unsigned long *p1, unsigned long *p2, xor_sse_5_pf64(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *p3, unsigned long *p4, unsigned long *p5) const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{ {
unsigned long lines = bytes >> 8; unsigned long lines = bytes >> 8;
......
...@@ -21,7 +21,8 @@ ...@@ -21,7 +21,8 @@
#include <asm/fpu/api.h> #include <asm/fpu/api.h>
static void static void
xor_pII_mmx_2(unsigned long bytes, unsigned long *p1, unsigned long *p2) xor_pII_mmx_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{ {
unsigned long lines = bytes >> 7; unsigned long lines = bytes >> 7;
...@@ -64,8 +65,9 @@ xor_pII_mmx_2(unsigned long bytes, unsigned long *p1, unsigned long *p2) ...@@ -64,8 +65,9 @@ xor_pII_mmx_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
} }
static void static void
xor_pII_mmx_3(unsigned long bytes, unsigned long *p1, unsigned long *p2, xor_pII_mmx_3(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *p3) const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{ {
unsigned long lines = bytes >> 7; unsigned long lines = bytes >> 7;
...@@ -113,8 +115,10 @@ xor_pII_mmx_3(unsigned long bytes, unsigned long *p1, unsigned long *p2, ...@@ -113,8 +115,10 @@ xor_pII_mmx_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
} }
static void static void
xor_pII_mmx_4(unsigned long bytes, unsigned long *p1, unsigned long *p2, xor_pII_mmx_4(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *p3, unsigned long *p4) const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{ {
unsigned long lines = bytes >> 7; unsigned long lines = bytes >> 7;
...@@ -168,8 +172,11 @@ xor_pII_mmx_4(unsigned long bytes, unsigned long *p1, unsigned long *p2, ...@@ -168,8 +172,11 @@ xor_pII_mmx_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
static void static void
xor_pII_mmx_5(unsigned long bytes, unsigned long *p1, unsigned long *p2, xor_pII_mmx_5(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *p3, unsigned long *p4, unsigned long *p5) const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{ {
unsigned long lines = bytes >> 7; unsigned long lines = bytes >> 7;
...@@ -248,7 +255,8 @@ xor_pII_mmx_5(unsigned long bytes, unsigned long *p1, unsigned long *p2, ...@@ -248,7 +255,8 @@ xor_pII_mmx_5(unsigned long bytes, unsigned long *p1, unsigned long *p2,
#undef BLOCK #undef BLOCK
static void static void
xor_p5_mmx_2(unsigned long bytes, unsigned long *p1, unsigned long *p2) xor_p5_mmx_2(unsigned long bytes, unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{ {
unsigned long lines = bytes >> 6; unsigned long lines = bytes >> 6;
...@@ -295,8 +303,9 @@ xor_p5_mmx_2(unsigned long bytes, unsigned long *p1, unsigned long *p2) ...@@ -295,8 +303,9 @@ xor_p5_mmx_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
} }
static void static void
xor_p5_mmx_3(unsigned long bytes, unsigned long *p1, unsigned long *p2, xor_p5_mmx_3(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *p3) const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{ {
unsigned long lines = bytes >> 6; unsigned long lines = bytes >> 6;
...@@ -352,8 +361,10 @@ xor_p5_mmx_3(unsigned long bytes, unsigned long *p1, unsigned long *p2, ...@@ -352,8 +361,10 @@ xor_p5_mmx_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
} }
static void static void
xor_p5_mmx_4(unsigned long bytes, unsigned long *p1, unsigned long *p2, xor_p5_mmx_4(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *p3, unsigned long *p4) const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{ {
unsigned long lines = bytes >> 6; unsigned long lines = bytes >> 6;
...@@ -418,8 +429,11 @@ xor_p5_mmx_4(unsigned long bytes, unsigned long *p1, unsigned long *p2, ...@@ -418,8 +429,11 @@ xor_p5_mmx_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
} }
static void static void
xor_p5_mmx_5(unsigned long bytes, unsigned long *p1, unsigned long *p2, xor_p5_mmx_5(unsigned long bytes, unsigned long * __restrict p1,
unsigned long *p3, unsigned long *p4, unsigned long *p5) const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4,
const unsigned long * __restrict p5)
{ {
unsigned long lines = bytes >> 6; unsigned long lines = bytes >> 6;
......
...@@ -26,7 +26,8 @@ ...@@ -26,7 +26,8 @@
BLOCK4(8) \ BLOCK4(8) \
BLOCK4(12) BLOCK4(12)
static void xor_avx_2(unsigned long bytes, unsigned long *p0, unsigned long *p1) static void xor_avx_2(unsigned long bytes, unsigned long * __restrict p0,
const unsigned long * __restrict p1)
{ {
unsigned long lines = bytes >> 9; unsigned long lines = bytes >> 9;
...@@ -52,8 +53,9 @@ do { \ ...@@ -52,8 +53,9 @@ do { \
kernel_fpu_end(); kernel_fpu_end();
} }
static void xor_avx_3(unsigned long bytes, unsigned long *p0, unsigned long *p1, static void xor_avx_3(unsigned long bytes, unsigned long * __restrict p0,
unsigned long *p2) const unsigned long * __restrict p1,
const unsigned long * __restrict p2)
{ {
unsigned long lines = bytes >> 9; unsigned long lines = bytes >> 9;
...@@ -82,8 +84,10 @@ do { \ ...@@ -82,8 +84,10 @@ do { \
kernel_fpu_end(); kernel_fpu_end();
} }
static void xor_avx_4(unsigned long bytes, unsigned long *p0, unsigned long *p1, static void xor_avx_4(unsigned long bytes, unsigned long * __restrict p0,
unsigned long *p2, unsigned long *p3) const unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3)
{ {
unsigned long lines = bytes >> 9; unsigned long lines = bytes >> 9;
...@@ -115,8 +119,11 @@ do { \ ...@@ -115,8 +119,11 @@ do { \
kernel_fpu_end(); kernel_fpu_end();
} }
static void xor_avx_5(unsigned long bytes, unsigned long *p0, unsigned long *p1, static void xor_avx_5(unsigned long bytes, unsigned long * __restrict p0,
unsigned long *p2, unsigned long *p3, unsigned long *p4) const unsigned long * __restrict p1,
const unsigned long * __restrict p2,
const unsigned long * __restrict p3,
const unsigned long * __restrict p4)
{ {
unsigned long lines = bytes >> 9; unsigned long lines = bytes >> 9;
......
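The AVX helpers above keep every touch of vector state between kernel_fpu_begin() and kernel_fpu_end(); in-kernel SSE/AVX use is only legal inside such a section, where preemption is disabled and sleeping is not allowed. A minimal sketch of that bracketing around a stand-in worker:

#include <asm/fpu/api.h>

/* Stand-in for the real AVX BLOCK macros; a scalar body keeps the
 * sketch self-contained.
 */
static void avx_worker(unsigned long longs, unsigned long *p0,
                       const unsigned long *p1)
{
        while (longs--)
                *p0++ ^= *p1++;
}

/* Sketch: keep the FPU section short and do nothing that can sleep. */
static void xor_avx_sketch(unsigned long bytes, unsigned long *p0,
                           const unsigned long *p1)
{
        kernel_fpu_begin();
        avx_worker(bytes / sizeof(unsigned long), p0, p1);
        kernel_fpu_end();
}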
...@@ -231,6 +231,13 @@ config CRYPTO_DH ...@@ -231,6 +231,13 @@ config CRYPTO_DH
help help
Generic implementation of the Diffie-Hellman algorithm. Generic implementation of the Diffie-Hellman algorithm.
config CRYPTO_DH_RFC7919_GROUPS
bool "Support for RFC 7919 FFDHE group parameters"
depends on CRYPTO_DH
select CRYPTO_RNG_DEFAULT
help
Provide support for RFC 7919 FFDHE group parameters. If unsure, say N.
config CRYPTO_ECC config CRYPTO_ECC
tristate tristate
select CRYPTO_RNG_DEFAULT select CRYPTO_RNG_DEFAULT
...@@ -267,7 +274,7 @@ config CRYPTO_ECRDSA ...@@ -267,7 +274,7 @@ config CRYPTO_ECRDSA
config CRYPTO_SM2 config CRYPTO_SM2
tristate "SM2 algorithm" tristate "SM2 algorithm"
select CRYPTO_SM3 select CRYPTO_LIB_SM3
select CRYPTO_AKCIPHER select CRYPTO_AKCIPHER
select CRYPTO_MANAGER select CRYPTO_MANAGER
select MPILIB select MPILIB
...@@ -425,6 +432,7 @@ config CRYPTO_LRW ...@@ -425,6 +432,7 @@ config CRYPTO_LRW
select CRYPTO_SKCIPHER select CRYPTO_SKCIPHER
select CRYPTO_MANAGER select CRYPTO_MANAGER
select CRYPTO_GF128MUL select CRYPTO_GF128MUL
select CRYPTO_ECB
help help
LRW: Liskov Rivest Wagner, a tweakable, non malleable, non movable LRW: Liskov Rivest Wagner, a tweakable, non malleable, non movable
narrow block cipher mode for dm-crypt. Use it with cipher narrow block cipher mode for dm-crypt. Use it with cipher
...@@ -999,6 +1007,7 @@ config CRYPTO_SHA3 ...@@ -999,6 +1007,7 @@ config CRYPTO_SHA3
config CRYPTO_SM3 config CRYPTO_SM3
tristate "SM3 digest algorithm" tristate "SM3 digest algorithm"
select CRYPTO_HASH select CRYPTO_HASH
select CRYPTO_LIB_SM3
help help
SM3 secure hash function as defined by OSCCA GM/T 0004-2012 SM3). SM3 secure hash function as defined by OSCCA GM/T 0004-2012 SM3).
It is part of the Chinese Commercial Cryptography suite. It is part of the Chinese Commercial Cryptography suite.
...@@ -1007,6 +1016,19 @@ config CRYPTO_SM3 ...@@ -1007,6 +1016,19 @@ config CRYPTO_SM3
http://www.oscca.gov.cn/UpFile/20101222141857786.pdf http://www.oscca.gov.cn/UpFile/20101222141857786.pdf
https://datatracker.ietf.org/doc/html/draft-shen-sm3-hash https://datatracker.ietf.org/doc/html/draft-shen-sm3-hash
config CRYPTO_SM3_AVX_X86_64
tristate "SM3 digest algorithm (x86_64/AVX)"
depends on X86 && 64BIT
select CRYPTO_HASH
select CRYPTO_LIB_SM3
help
SM3 secure hash function as defined by OSCCA GM/T 0004-2012 SM3).
It is part of the Chinese Commercial Cryptography suite. This is
SM3 optimized implementation using Advanced Vector Extensions (AVX)
when available.
If unsure, say N.
config CRYPTO_STREEBOG config CRYPTO_STREEBOG
tristate "Streebog Hash Function" tristate "Streebog Hash Function"
select CRYPTO_HASH select CRYPTO_HASH
...@@ -1847,6 +1869,7 @@ config CRYPTO_JITTERENTROPY ...@@ -1847,6 +1869,7 @@ config CRYPTO_JITTERENTROPY
config CRYPTO_KDF800108_CTR config CRYPTO_KDF800108_CTR
tristate tristate
select CRYPTO_HMAC
select CRYPTO_SHA256 select CRYPTO_SHA256
config CRYPTO_USER_API config CRYPTO_USER_API
......
@@ -6,6 +6,7 @@
 */
#include <crypto/algapi.h>
+#include <crypto/internal/simd.h>
#include <linux/err.h>
#include <linux/errno.h>
#include <linux/fips.h>

@@ -21,6 +22,11 @@
static LIST_HEAD(crypto_template_list);

+#ifdef CONFIG_CRYPTO_MANAGER_EXTRA_TESTS
+DEFINE_PER_CPU(bool, crypto_simd_disabled_for_test);
+EXPORT_PER_CPU_SYMBOL_GPL(crypto_simd_disabled_for_test);
+#endif
+
static inline void crypto_check_module_sig(struct module *mod)
{
	if (fips_enabled && mod && !module_sig_ok(mod))

@@ -322,9 +328,17 @@ void crypto_alg_tested(const char *name, int err)
found:
	q->cra_flags |= CRYPTO_ALG_DEAD;
	alg = test->adult;
-	if (err || list_empty(&alg->cra_list))
+	if (list_empty(&alg->cra_list))
		goto complete;
+	if (err == -ECANCELED)
+		alg->cra_flags |= CRYPTO_ALG_FIPS_INTERNAL;
+	else if (err)
+		goto complete;
+	else
+		alg->cra_flags &= ~CRYPTO_ALG_FIPS_INTERNAL;
	alg->cra_flags |= CRYPTO_ALG_TESTED;

	/* Only satisfy larval waiters if we are the best. */

@@ -604,6 +618,7 @@ int crypto_register_instance(struct crypto_template *tmpl,
{
	struct crypto_larval *larval;
	struct crypto_spawn *spawn;
+	u32 fips_internal = 0;
	int err;

	err = crypto_check_alg(&inst->alg);

@@ -626,11 +641,15 @@ int crypto_register_instance(struct crypto_template *tmpl,
		spawn->inst = inst;
		spawn->registered = true;
+		fips_internal |= spawn->alg->cra_flags;
		crypto_mod_put(spawn->alg);
		spawn = next;
	}
+	inst->alg.cra_flags |= (fips_internal & CRYPTO_ALG_FIPS_INTERNAL);
	larval = __crypto_register_alg(&inst->alg);
	if (IS_ERR(larval))
		goto unlock;

@@ -683,7 +702,8 @@ int crypto_grab_spawn(struct crypto_spawn *spawn, struct crypto_instance *inst,
	if (IS_ERR(name))
		return PTR_ERR(name);

-	alg = crypto_find_alg(name, spawn->frontend, type, mask);
+	alg = crypto_find_alg(name, spawn->frontend,
+			      type | CRYPTO_ALG_FIPS_INTERNAL, mask);
	if (IS_ERR(alg))
		return PTR_ERR(alg);

@@ -1002,7 +1022,13 @@ void __crypto_xor(u8 *dst, const u8 *src1, const u8 *src2, unsigned int len)
	}

	while (IS_ENABLED(CONFIG_64BIT) && len >= 8 && !(relalign & 7)) {
-		*(u64 *)dst = *(u64 *)src1 ^ *(u64 *)src2;
+		if (IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)) {
+			u64 l = get_unaligned((u64 *)src1) ^
+				get_unaligned((u64 *)src2);
+			put_unaligned(l, (u64 *)dst);
+		} else {
+			*(u64 *)dst = *(u64 *)src1 ^ *(u64 *)src2;
+		}
		dst += 8;
		src1 += 8;
		src2 += 8;

@@ -1010,7 +1036,13 @@ void __crypto_xor(u8 *dst, const u8 *src1, const u8 *src2, unsigned int len)
	}

	while (len >= 4 && !(relalign & 3)) {
-		*(u32 *)dst = *(u32 *)src1 ^ *(u32 *)src2;
+		if (IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)) {
+			u32 l = get_unaligned((u32 *)src1) ^
+				get_unaligned((u32 *)src2);
+			put_unaligned(l, (u32 *)dst);
+		} else {
+			*(u32 *)dst = *(u32 *)src1 ^ *(u32 *)src2;
+		}
		dst += 4;
		src1 += 4;
		src2 += 4;

@@ -1018,7 +1050,13 @@ void __crypto_xor(u8 *dst, const u8 *src1, const u8 *src2, unsigned int len)
	}

	while (len >= 2 && !(relalign & 1)) {
-		*(u16 *)dst = *(u16 *)src1 ^ *(u16 *)src2;
+		if (IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)) {
+			u16 l = get_unaligned((u16 *)src1) ^
+				get_unaligned((u16 *)src2);
+			put_unaligned(l, (u16 *)dst);
+		} else {
+			*(u16 *)dst = *(u16 *)src1 ^ *(u16 *)src2;
+		}
		dst += 2;
		src1 += 2;
		src2 += 2;
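The __crypto_xor() hunks above swap direct word-sized loads and stores for get_unaligned()/put_unaligned() accessors whenever the architecture advertises efficient unaligned access. The standalone sketch below illustrates the same pattern outside the kernel: memcpy-based accesses stand in for the kernel helpers, and the function name is purely illustrative.

#include <stdint.h>
#include <string.h>

/* Illustration only: XOR two buffers word-at-a-time using memcpy() as a
 * portable stand-in for get_unaligned()/put_unaligned(). */
static void xor_words_unaligned(uint8_t *dst, const uint8_t *src1,
				const uint8_t *src2, size_t len)
{
	while (len >= 8) {
		uint64_t a, b;

		memcpy(&a, src1, 8);	/* unaligned load */
		memcpy(&b, src2, 8);
		a ^= b;
		memcpy(dst, &a, 8);	/* unaligned store */
		dst += 8;
		src1 += 8;
		src2 += 8;
		len -= 8;
	}
	while (len--)			/* byte-wise tail */
		*dst++ = *src1++ ^ *src2++;
}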
......
@@ -223,6 +223,8 @@ static struct crypto_alg *crypto_larval_wait(struct crypto_alg *alg)
	else if (crypto_is_test_larval(larval) &&
		 !(alg->cra_flags & CRYPTO_ALG_TESTED))
		alg = ERR_PTR(-EAGAIN);
+	else if (alg->cra_flags & CRYPTO_ALG_FIPS_INTERNAL)
+		alg = ERR_PTR(-EAGAIN);
	else if (!crypto_mod_get(alg))
		alg = ERR_PTR(-EAGAIN);
	crypto_mod_put(&larval->alg);

@@ -233,6 +235,7 @@ static struct crypto_alg *crypto_larval_wait(struct crypto_alg *alg)
static struct crypto_alg *crypto_alg_lookup(const char *name, u32 type,
					    u32 mask)
{
+	const u32 fips = CRYPTO_ALG_FIPS_INTERNAL;
	struct crypto_alg *alg;
	u32 test = 0;

@@ -240,8 +243,20 @@ static struct crypto_alg *crypto_alg_lookup(const char *name, u32 type,
		test |= CRYPTO_ALG_TESTED;

	down_read(&crypto_alg_sem);
-	alg = __crypto_alg_lookup(name, type | test, mask | test);
-	if (!alg && test) {
+	alg = __crypto_alg_lookup(name, (type | test) & ~fips,
+				  (mask | test) & ~fips);
+	if (alg) {
+		if (((type | mask) ^ fips) & fips)
+			mask |= fips;
+		mask &= fips;
+		if (!crypto_is_larval(alg) &&
+		    ((type ^ alg->cra_flags) & mask)) {
+			/* Algorithm is disallowed in FIPS mode. */
+			crypto_mod_put(alg);
+			alg = ERR_PTR(-ENOENT);
+		}
+	} else if (test) {
		alg = __crypto_alg_lookup(name, type, mask);
		if (alg && !crypto_is_larval(alg)) {
			/* Test failed */
......
@@ -35,7 +35,7 @@ void public_key_signature_free(struct public_key_signature *sig)
EXPORT_SYMBOL_GPL(public_key_signature_free);

/**
- * query_asymmetric_key - Get information about an aymmetric key.
+ * query_asymmetric_key - Get information about an asymmetric key.
 * @params: Various parameters.
 * @info: Where to put the information.
 */
......
@@ -22,7 +22,7 @@ struct x509_certificate {
	time64_t valid_to;
	const void *tbs;		/* Signed data */
	unsigned tbs_size;		/* Size of signed data */
-	unsigned raw_sig_size;		/* Size of sigature */
+	unsigned raw_sig_size;		/* Size of signature */
	const void *raw_sig;		/* Signature data */
	const void *raw_serial;		/* Raw serial number in ASN.1 */
	unsigned raw_serial_size;
......
@@ -170,8 +170,8 @@ dma_xor_aligned_offsets(struct dma_device *device, unsigned int offset,
 *
 * xor_blocks always uses the dest as a source so the
 * ASYNC_TX_XOR_ZERO_DST flag must be set to not include dest data in
- * the calculation. The assumption with dma eninges is that they only
- * use the destination buffer as a source when it is explicity specified
+ * the calculation. The assumption with dma engines is that they only
+ * use the destination buffer as a source when it is explicitly specified
 * in the source list.
 *
 * src_list note: if the dest is also a source it must be at index zero.

@@ -261,8 +261,8 @@ EXPORT_SYMBOL_GPL(async_xor_offs);
 *
 * xor_blocks always uses the dest as a source so the
 * ASYNC_TX_XOR_ZERO_DST flag must be set to not include dest data in
- * the calculation. The assumption with dma eninges is that they only
- * use the destination buffer as a source when it is explicity specified
+ * the calculation. The assumption with dma engines is that they only
+ * use the destination buffer as a source when it is explicitly specified
 * in the source list.
 *
 * src_list note: if the dest is also a source it must be at index zero.
......
@@ -217,7 +217,7 @@ static int raid6_test(void)
		err += test(12, &tests);
	}

-	/* the 24 disk case is special for ioatdma as it is the boudary point
+	/* the 24 disk case is special for ioatdma as it is the boundary point
	 * at which it needs to switch from 8-source ops to 16-source
	 * ops for continuation (assumes DMA_HAS_PQ_CONTINUE is not set)
	 */

@@ -241,7 +241,7 @@ static void raid6_test_exit(void)
}

/* when compiled-in wait for drivers to load first (assumes dma drivers
- * are also compliled-in)
+ * are also compiled-in)
 */
late_initcall(raid6_test);
module_exit(raid6_test_exit);
......
@@ -253,7 +253,7 @@ static int crypto_authenc_decrypt_tail(struct aead_request *req,
	dst = scatterwalk_ffwd(areq_ctx->dst, req->dst, req->assoclen);

	skcipher_request_set_tfm(skreq, ctx->enc);
-	skcipher_request_set_callback(skreq, aead_request_flags(req),
+	skcipher_request_set_callback(skreq, flags,
				      req->base.complete, req->base.data);
	skcipher_request_set_crypt(skreq, src, dst,
				   req->cryptlen - authsize, req->iv);
......
-//SPDX-License-Identifier: GPL-2.0
+// SPDX-License-Identifier: GPL-2.0
/*
 * CFB: Cipher FeedBack mode
 *
......
@@ -53,6 +53,7 @@ static void crypto_finalize_request(struct crypto_engine *engine,
			dev_err(engine->dev, "failed to unprepare request\n");
		}
	}
+	lockdep_assert_in_softirq();
	req->complete(req, err);

	kthread_queue_work(engine->kworker, &engine->pump_requests);
......
@@ -10,7 +10,7 @@
#include <crypto/dh.h>
#include <crypto/kpp.h>

-#define DH_KPP_SECRET_MIN_SIZE (sizeof(struct kpp_secret) + 4 * sizeof(int))
+#define DH_KPP_SECRET_MIN_SIZE (sizeof(struct kpp_secret) + 3 * sizeof(int))

static inline u8 *dh_pack_data(u8 *dst, u8 *end, const void *src, size_t size)
{

@@ -28,7 +28,7 @@ static inline const u8 *dh_unpack_data(void *dst, const void *src, size_t size)
static inline unsigned int dh_data_size(const struct dh *p)
{
-	return p->key_size + p->p_size + p->q_size + p->g_size;
+	return p->key_size + p->p_size + p->g_size;
}

unsigned int crypto_dh_key_len(const struct dh *p)

@@ -53,11 +53,9 @@ int crypto_dh_encode_key(char *buf, unsigned int len, const struct dh *params)
	ptr = dh_pack_data(ptr, end, &params->key_size,
			   sizeof(params->key_size));
	ptr = dh_pack_data(ptr, end, &params->p_size, sizeof(params->p_size));
-	ptr = dh_pack_data(ptr, end, &params->q_size, sizeof(params->q_size));
	ptr = dh_pack_data(ptr, end, &params->g_size, sizeof(params->g_size));
	ptr = dh_pack_data(ptr, end, params->key, params->key_size);
	ptr = dh_pack_data(ptr, end, params->p, params->p_size);
-	ptr = dh_pack_data(ptr, end, params->q, params->q_size);
	ptr = dh_pack_data(ptr, end, params->g, params->g_size);
	if (ptr != end)
		return -EINVAL;

@@ -65,7 +63,7 @@ int crypto_dh_encode_key(char *buf, unsigned int len, const struct dh *params)
}
EXPORT_SYMBOL_GPL(crypto_dh_encode_key);

-int crypto_dh_decode_key(const char *buf, unsigned int len, struct dh *params)
+int __crypto_dh_decode_key(const char *buf, unsigned int len, struct dh *params)
{
	const u8 *ptr = buf;
	struct kpp_secret secret;

@@ -79,28 +77,36 @@ int crypto_dh_decode_key(const char *buf, unsigned int len, struct dh *params)
	ptr = dh_unpack_data(&params->key_size, ptr, sizeof(params->key_size));
	ptr = dh_unpack_data(&params->p_size, ptr, sizeof(params->p_size));
-	ptr = dh_unpack_data(&params->q_size, ptr, sizeof(params->q_size));
	ptr = dh_unpack_data(&params->g_size, ptr, sizeof(params->g_size));
	if (secret.len != crypto_dh_key_len(params))
		return -EINVAL;

+	/* Don't allocate memory. Set pointers to data within
+	 * the given buffer
+	 */
+	params->key = (void *)ptr;
+	params->p = (void *)(ptr + params->key_size);
+	params->g = (void *)(ptr + params->key_size + params->p_size);
+	return 0;
+}
+
+int crypto_dh_decode_key(const char *buf, unsigned int len, struct dh *params)
+{
+	int err;
+
+	err = __crypto_dh_decode_key(buf, len, params);
+	if (err)
+		return err;
+
	/*
	 * Don't permit the buffer for 'key' or 'g' to be larger than 'p', since
	 * some drivers assume otherwise.
	 */
	if (params->key_size > params->p_size ||
-	    params->g_size > params->p_size || params->q_size > params->p_size)
+	    params->g_size > params->p_size)
		return -EINVAL;

-	/* Don't allocate memory. Set pointers to data within
-	 * the given buffer
-	 */
-	params->key = (void *)ptr;
-	params->p = (void *)(ptr + params->key_size);
-	params->q = (void *)(ptr + params->key_size + params->p_size);
-	params->g = (void *)(ptr + params->key_size + params->p_size +
-			     params->q_size);
	/*
	 * Don't permit 'p' to be 0. It's not a prime number, and it's subject
	 * to corner cases such as 'mod 0' being undefined or

@@ -109,10 +115,6 @@ int crypto_dh_decode_key(const char *buf, unsigned int len, struct dh *params)
	if (memchr_inv(params->p, 0, params->p_size) == NULL)
		return -EINVAL;

-	/* It is permissible to not provide Q. */
-	if (params->q_size == 0)
-		params->q = NULL;
-
	return 0;
}
EXPORT_SYMBOL_GPL(crypto_dh_decode_key);
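With the optional 'q' component gone, a packed DH secret now carries three length fields (key, p, g) followed by the corresponding byte strings. The hedged sketch below shows how a caller might size and fill such a buffer with the helpers above before handing it to crypto_kpp_set_secret(); the input pointers and lengths are placeholders supplied by the caller, not values taken from the patch.

#include <crypto/dh.h>
#include <linux/slab.h>

/* Minimal sketch: encode a software DH key in the new three-field layout. */
static void *dh_pack_example(const u8 *my_priv, unsigned int priv_len,
			     const u8 *my_p, unsigned int p_len,
			     const u8 *my_g, unsigned int g_len,
			     unsigned int *out_len)
{
	struct dh params = {
		.key = my_priv, .key_size = priv_len,
		.p = my_p, .p_size = p_len,
		.g = my_g, .g_size = g_len,
	};
	unsigned int len = crypto_dh_key_len(&params);
	void *buf = kmalloc(len, GFP_KERNEL);

	if (!buf)
		return NULL;
	if (crypto_dh_encode_key(buf, len, &params)) {
		kfree(buf);
		return NULL;
	}
	*out_len = len;
	return buf;		/* pass to crypto_kpp_set_secret() */
}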
@@ -15,6 +15,7 @@
#include <crypto/internal/hash.h>
#include <crypto/scatterwalk.h>
#include <linux/err.h>
+#include <linux/fips.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>

@@ -51,6 +52,9 @@ static int hmac_setkey(struct crypto_shash *parent,
	SHASH_DESC_ON_STACK(shash, hash);
	unsigned int i;

+	if (fips_enabled && (keylen < 112 / 8))
+		return -EINVAL;
+
	shash->tfm = hash;

	if (keylen > bs) {
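The new check enforces a minimum HMAC key length of 112 bits (112 / 8 = 14 bytes) when the kernel runs in FIPS mode, so setkey with anything shorter now fails. A hedged sketch of the visible effect; the algorithm name and key size are chosen only for the example.

#include <crypto/hash.h>
#include <linux/err.h>

/* Sketch: with fips=1, a 96-bit (12-byte) key is rejected by the check
 * added above; 14 bytes or more still succeed. */
static int hmac_fips_keylen_demo(void)
{
	static const u8 short_key[12];	/* 96 bits, below the 112-bit floor */
	struct crypto_shash *tfm;
	int err;

	tfm = crypto_alloc_shash("hmac(sha256)", 0, 0);
	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	err = crypto_shash_setkey(tfm, short_key, sizeof(short_key));
	/* err == -EINVAL when fips_enabled, 0 otherwise. */
	crypto_free_shash(tfm);
	return err;
}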
......
@@ -68,9 +68,17 @@ static int crypto_kpp_init_tfm(struct crypto_tfm *tfm)
	return 0;
}

+static void crypto_kpp_free_instance(struct crypto_instance *inst)
+{
+	struct kpp_instance *kpp = kpp_instance(inst);
+
+	kpp->free(kpp);
+}
+
static const struct crypto_type crypto_kpp_type = {
	.extsize = crypto_alg_extsize,
	.init_tfm = crypto_kpp_init_tfm,
+	.free = crypto_kpp_free_instance,
#ifdef CONFIG_PROC_FS
	.show = crypto_kpp_show,
#endif

@@ -87,6 +95,15 @@ struct crypto_kpp *crypto_alloc_kpp(const char *alg_name, u32 type, u32 mask)
}
EXPORT_SYMBOL_GPL(crypto_alloc_kpp);

+int crypto_grab_kpp(struct crypto_kpp_spawn *spawn,
+		    struct crypto_instance *inst,
+		    const char *name, u32 type, u32 mask)
+{
+	spawn->base.frontend = &crypto_kpp_type;
+	return crypto_grab_spawn(&spawn->base, inst, name, type, mask);
+}
+EXPORT_SYMBOL_GPL(crypto_grab_kpp);
+
static void kpp_prepare_alg(struct kpp_alg *alg)
{
	struct crypto_alg *base = &alg->base;

@@ -111,5 +128,17 @@ void crypto_unregister_kpp(struct kpp_alg *alg)
}
EXPORT_SYMBOL_GPL(crypto_unregister_kpp);

+int kpp_register_instance(struct crypto_template *tmpl,
+			  struct kpp_instance *inst)
+{
+	if (WARN_ON(!inst->free))
+		return -EINVAL;
+
+	kpp_prepare_alg(&inst->alg);
+
+	return crypto_register_instance(tmpl, kpp_crypto_instance(inst));
+}
+EXPORT_SYMBOL_GPL(kpp_register_instance);
+
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Key-agreement Protocol Primitives");
@@ -428,3 +428,4 @@ module_exit(lrw_module_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("LRW block cipher mode");
MODULE_ALIAS_CRYPTO("lrw");
+MODULE_SOFTDEP("pre: ecb");
@@ -60,6 +60,7 @@
 */

#include <crypto/algapi.h>
+#include <asm/unaligned.h>

#ifndef __HAVE_ARCH_CRYPTO_MEMNEQ

@@ -71,7 +72,8 @@ __crypto_memneq_generic(const void *a, const void *b, size_t size)

#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
	while (size >= sizeof(unsigned long)) {
-		neq |= *(unsigned long *)a ^ *(unsigned long *)b;
+		neq |= get_unaligned((unsigned long *)a) ^
+		       get_unaligned((unsigned long *)b);
		OPTIMIZER_HIDE_VAR(neq);
		a += sizeof(unsigned long);
		b += sizeof(unsigned long);

@@ -95,18 +97,24 @@ static inline unsigned long __crypto_memneq_16(const void *a, const void *b)
#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
	if (sizeof(unsigned long) == 8) {
-		neq |= *(unsigned long *)(a) ^ *(unsigned long *)(b);
+		neq |= get_unaligned((unsigned long *)a) ^
+		       get_unaligned((unsigned long *)b);
		OPTIMIZER_HIDE_VAR(neq);
-		neq |= *(unsigned long *)(a+8) ^ *(unsigned long *)(b+8);
+		neq |= get_unaligned((unsigned long *)(a + 8)) ^
+		       get_unaligned((unsigned long *)(b + 8));
		OPTIMIZER_HIDE_VAR(neq);
	} else if (sizeof(unsigned int) == 4) {
-		neq |= *(unsigned int *)(a) ^ *(unsigned int *)(b);
+		neq |= get_unaligned((unsigned int *)a) ^
+		       get_unaligned((unsigned int *)b);
		OPTIMIZER_HIDE_VAR(neq);
-		neq |= *(unsigned int *)(a+4) ^ *(unsigned int *)(b+4);
+		neq |= get_unaligned((unsigned int *)(a + 4)) ^
+		       get_unaligned((unsigned int *)(b + 4));
		OPTIMIZER_HIDE_VAR(neq);
-		neq |= *(unsigned int *)(a+8) ^ *(unsigned int *)(b+8);
+		neq |= get_unaligned((unsigned int *)(a + 8)) ^
+		       get_unaligned((unsigned int *)(b + 8));
		OPTIMIZER_HIDE_VAR(neq);
-		neq |= *(unsigned int *)(a+12) ^ *(unsigned int *)(b+12);
+		neq |= get_unaligned((unsigned int *)(a + 12)) ^
+		       get_unaligned((unsigned int *)(b + 12));
		OPTIMIZER_HIDE_VAR(neq);
	} else
#endif /* CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS */
......
@@ -385,15 +385,15 @@ static int pkcs1pad_sign(struct akcipher_request *req)
	struct pkcs1pad_inst_ctx *ictx = akcipher_instance_ctx(inst);
	const struct rsa_asn1_template *digest_info = ictx->digest_info;
	int err;
-	unsigned int ps_end, digest_size = 0;
+	unsigned int ps_end, digest_info_size = 0;

	if (!ctx->key_size)
		return -EINVAL;

	if (digest_info)
-		digest_size = digest_info->size;
+		digest_info_size = digest_info->size;

-	if (req->src_len + digest_size > ctx->key_size - 11)
+	if (req->src_len + digest_info_size > ctx->key_size - 11)
		return -EOVERFLOW;

	if (req->dst_len < ctx->key_size) {

@@ -406,7 +406,7 @@ static int pkcs1pad_sign(struct akcipher_request *req)
	if (!req_ctx->in_buf)
		return -ENOMEM;

-	ps_end = ctx->key_size - digest_size - req->src_len - 2;
+	ps_end = ctx->key_size - digest_info_size - req->src_len - 2;
	req_ctx->in_buf[0] = 0x01;
	memset(req_ctx->in_buf + 1, 0xff, ps_end - 1);
	req_ctx->in_buf[ps_end] = 0x00;

@@ -441,6 +441,8 @@ static int pkcs1pad_verify_complete(struct akcipher_request *req, int err)
	struct akcipher_instance *inst = akcipher_alg_instance(tfm);
	struct pkcs1pad_inst_ctx *ictx = akcipher_instance_ctx(inst);
	const struct rsa_asn1_template *digest_info = ictx->digest_info;
+	const unsigned int sig_size = req->src_len;
+	const unsigned int digest_size = req->dst_len;
	unsigned int dst_len;
	unsigned int pos;
	u8 *out_buf;

@@ -476,6 +478,8 @@ static int pkcs1pad_verify_complete(struct akcipher_request *req, int err)
	pos++;

	if (digest_info) {
+		if (digest_info->size > dst_len - pos)
+			goto done;
		if (crypto_memneq(out_buf + pos, digest_info->data,
				  digest_info->size))
			goto done;

@@ -485,20 +489,19 @@ static int pkcs1pad_verify_complete(struct akcipher_request *req, int err)

	err = 0;

-	if (req->dst_len != dst_len - pos) {
+	if (digest_size != dst_len - pos) {
		err = -EKEYREJECTED;
		req->dst_len = dst_len - pos;
		goto done;
	}
	/* Extract appended digest. */
	sg_pcopy_to_buffer(req->src,
-			   sg_nents_for_len(req->src,
-					    req->src_len + req->dst_len),
+			   sg_nents_for_len(req->src, sig_size + digest_size),
			   req_ctx->out_buf + ctx->key_size,
-			   req->dst_len, ctx->key_size);
+			   digest_size, sig_size);
	/* Do the actual verification step. */
	if (memcmp(req_ctx->out_buf + ctx->key_size, out_buf + pos,
-		   req->dst_len) != 0)
+		   digest_size) != 0)
		err = -EKEYREJECTED;
done:
	kfree_sensitive(req_ctx->out_buf);

@@ -534,14 +537,15 @@ static int pkcs1pad_verify(struct akcipher_request *req)
	struct crypto_akcipher *tfm = crypto_akcipher_reqtfm(req);
	struct pkcs1pad_ctx *ctx = akcipher_tfm_ctx(tfm);
	struct pkcs1pad_request *req_ctx = akcipher_request_ctx(req);
+	const unsigned int sig_size = req->src_len;
+	const unsigned int digest_size = req->dst_len;
	int err;

-	if (WARN_ON(req->dst) ||
-	    WARN_ON(!req->dst_len) ||
-	    !ctx->key_size || req->src_len < ctx->key_size)
+	if (WARN_ON(req->dst) || WARN_ON(!digest_size) ||
+	    !ctx->key_size || sig_size != ctx->key_size)
		return -EINVAL;

-	req_ctx->out_buf = kmalloc(ctx->key_size + req->dst_len, GFP_KERNEL);
+	req_ctx->out_buf = kmalloc(ctx->key_size + digest_size, GFP_KERNEL);
	if (!req_ctx->out_buf)
		return -ENOMEM;

@@ -554,8 +558,7 @@ static int pkcs1pad_verify(struct akcipher_request *req)
	/* Reuse input buffer, output to a new buffer */
	akcipher_request_set_crypt(&req_ctx->child_req, req->src,
-				   req_ctx->out_sg, req->src_len,
-				   ctx->key_size);
+				   req_ctx->out_sg, sig_size, ctx->key_size);

	err = crypto_akcipher_encrypt(&req_ctx->child_req);
	if (err != -EINPROGRESS && err != -EBUSY)

@@ -621,6 +624,11 @@ static int pkcs1pad_create(struct crypto_template *tmpl, struct rtattr **tb)

	rsa_alg = crypto_spawn_akcipher_alg(&ctx->spawn);

+	if (strcmp(rsa_alg->base.cra_name, "rsa") != 0) {
+		err = -EINVAL;
+		goto err_free_inst;
+	}
+
	err = -ENAMETOOLONG;
	hash_name = crypto_attr_alg_name(tb[2]);
	if (IS_ERR(hash_name)) {
......
-/* SPDX-License-Identifier: GPL-2.0-or-later */
+// SPDX-License-Identifier: GPL-2.0-or-later
/*
 * SM2 asymmetric public-key algorithm
 * as specified by OSCCA GM/T 0003.1-2012 -- 0003.5-2012 SM2 and

@@ -13,7 +13,7 @@
#include <crypto/internal/akcipher.h>
#include <crypto/akcipher.h>
#include <crypto/hash.h>
-#include <crypto/sm3_base.h>
+#include <crypto/sm3.h>
#include <crypto/rng.h>
#include <crypto/sm2.h>
#include "sm2signature.asn1.h"

@@ -213,7 +213,7 @@ int sm2_get_signature_s(void *context, size_t hdrlen, unsigned char tag,
	return 0;
}

-static int sm2_z_digest_update(struct shash_desc *desc,
+static int sm2_z_digest_update(struct sm3_state *sctx,
			       MPI m, unsigned int pbytes)
{
	static const unsigned char zero[32];

@@ -226,20 +226,20 @@ static int sm2_z_digest_update(struct shash_desc *desc,
	if (inlen < pbytes) {
		/* padding with zero */
-		crypto_sm3_update(desc, zero, pbytes - inlen);
-		crypto_sm3_update(desc, in, inlen);
+		sm3_update(sctx, zero, pbytes - inlen);
+		sm3_update(sctx, in, inlen);
	} else if (inlen > pbytes) {
		/* skip the starting zero */
-		crypto_sm3_update(desc, in + inlen - pbytes, pbytes);
+		sm3_update(sctx, in + inlen - pbytes, pbytes);
	} else {
-		crypto_sm3_update(desc, in, inlen);
+		sm3_update(sctx, in, inlen);
	}

	kfree(in);
	return 0;
}

-static int sm2_z_digest_update_point(struct shash_desc *desc,
+static int sm2_z_digest_update_point(struct sm3_state *sctx,
		MPI_POINT point, struct mpi_ec_ctx *ec, unsigned int pbytes)
{
	MPI x, y;

@@ -249,8 +249,8 @@ static int sm2_z_digest_update_point(struct shash_desc *desc,
	y = mpi_new(0);

	if (!mpi_ec_get_affine(x, y, point, ec) &&
-	    !sm2_z_digest_update(desc, x, pbytes) &&
-	    !sm2_z_digest_update(desc, y, pbytes))
+	    !sm2_z_digest_update(sctx, x, pbytes) &&
+	    !sm2_z_digest_update(sctx, y, pbytes))
		ret = 0;

	mpi_free(x);

@@ -265,7 +265,7 @@ int sm2_compute_z_digest(struct crypto_akcipher *tfm,
	struct mpi_ec_ctx *ec = akcipher_tfm_ctx(tfm);
	uint16_t bits_len;
	unsigned char entl[2];
-	SHASH_DESC_ON_STACK(desc, NULL);
+	struct sm3_state sctx;
	unsigned int pbytes;

	if (id_len > (USHRT_MAX / 8) || !ec->Q)

@@ -278,17 +278,17 @@ int sm2_compute_z_digest(struct crypto_akcipher *tfm,
	pbytes = MPI_NBYTES(ec->p);

	/* ZA = H256(ENTLA | IDA | a | b | xG | yG | xA | yA) */
-	sm3_base_init(desc);
-	crypto_sm3_update(desc, entl, 2);
-	crypto_sm3_update(desc, id, id_len);
+	sm3_init(&sctx);
+	sm3_update(&sctx, entl, 2);
+	sm3_update(&sctx, id, id_len);

-	if (sm2_z_digest_update(desc, ec->a, pbytes) ||
-	    sm2_z_digest_update(desc, ec->b, pbytes) ||
-	    sm2_z_digest_update_point(desc, ec->G, ec, pbytes) ||
-	    sm2_z_digest_update_point(desc, ec->Q, ec, pbytes))
+	if (sm2_z_digest_update(&sctx, ec->a, pbytes) ||
+	    sm2_z_digest_update(&sctx, ec->b, pbytes) ||
+	    sm2_z_digest_update_point(&sctx, ec->G, ec, pbytes) ||
+	    sm2_z_digest_update_point(&sctx, ec->Q, ec, pbytes))
		return -EINVAL;

-	crypto_sm3_final(desc, dgst);
+	sm3_final(&sctx, dgst);

	return 0;
}
EXPORT_SYMBOL(sm2_compute_z_digest);
......
@@ -5,6 +5,7 @@
 *
 * Copyright (C) 2017 ARM Limited or its affiliates.
 * Written by Gilad Ben-Yossef <gilad@benyossef.com>
+ * Copyright (C) 2021 Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
 */

#include <crypto/internal/hash.h>

@@ -26,143 +27,29 @@ const u8 sm3_zero_message_hash[SM3_DIGEST_SIZE] = {
};
EXPORT_SYMBOL_GPL(sm3_zero_message_hash);

-static inline u32 p0(u32 x)
-{
-	return x ^ rol32(x, 9) ^ rol32(x, 17);
-}
-
-static inline u32 p1(u32 x)
-{
-	return x ^ rol32(x, 15) ^ rol32(x, 23);
-}
-
-static inline u32 ff(unsigned int n, u32 a, u32 b, u32 c)
-{
-	return (n < 16) ? (a ^ b ^ c) : ((a & b) | (a & c) | (b & c));
-}
-
-static inline u32 gg(unsigned int n, u32 e, u32 f, u32 g)
-{
-	return (n < 16) ? (e ^ f ^ g) : ((e & f) | ((~e) & g));
-}
-
-static inline u32 t(unsigned int n)
-{
-	return (n < 16) ? SM3_T1 : SM3_T2;
-}
-
-static void sm3_expand(u32 *t, u32 *w, u32 *wt)
-{
-	int i;
-	unsigned int tmp;
-
-	/* load the input */
-	for (i = 0; i <= 15; i++)
-		w[i] = get_unaligned_be32((__u32 *)t + i);
-
-	for (i = 16; i <= 67; i++) {
-		tmp = w[i - 16] ^ w[i - 9] ^ rol32(w[i - 3], 15);
-		w[i] = p1(tmp) ^ (rol32(w[i - 13], 7)) ^ w[i - 6];
-	}
-
-	for (i = 0; i <= 63; i++)
-		wt[i] = w[i] ^ w[i + 4];
-}
-
-static void sm3_compress(u32 *w, u32 *wt, u32 *m)
-{
-	u32 ss1;
-	u32 ss2;
-	u32 tt1;
-	u32 tt2;
-	u32 a, b, c, d, e, f, g, h;
-	int i;
-
-	a = m[0];
-	b = m[1];
-	c = m[2];
-	d = m[3];
-	e = m[4];
-	f = m[5];
-	g = m[6];
-	h = m[7];
-
-	for (i = 0; i <= 63; i++) {
-		ss1 = rol32((rol32(a, 12) + e + rol32(t(i), i & 31)), 7);
-		ss2 = ss1 ^ rol32(a, 12);
-		tt1 = ff(i, a, b, c) + d + ss2 + *wt;
-		wt++;
-		tt2 = gg(i, e, f, g) + h + ss1 + *w;
-		w++;
-		d = c;
-		c = rol32(b, 9);
-		b = a;
-		a = tt1;
-		h = g;
-		g = rol32(f, 19);
-		f = e;
-		e = p0(tt2);
-	}
-
-	m[0] = a ^ m[0];
-	m[1] = b ^ m[1];
-	m[2] = c ^ m[2];
-	m[3] = d ^ m[3];
-	m[4] = e ^ m[4];
-	m[5] = f ^ m[5];
-	m[6] = g ^ m[6];
-	m[7] = h ^ m[7];
-
-	a = b = c = d = e = f = g = h = ss1 = ss2 = tt1 = tt2 = 0;
-}
-
-static void sm3_transform(struct sm3_state *sst, u8 const *src)
-{
-	unsigned int w[68];
-	unsigned int wt[64];
-
-	sm3_expand((u32 *)src, w, wt);
-	sm3_compress(w, wt, sst->state);
-
-	memzero_explicit(w, sizeof(w));
-	memzero_explicit(wt, sizeof(wt));
-}
-
-static void sm3_generic_block_fn(struct sm3_state *sst, u8 const *src,
-				 int blocks)
-{
-	while (blocks--) {
-		sm3_transform(sst, src);
-		src += SM3_BLOCK_SIZE;
-	}
-}
-
-int crypto_sm3_update(struct shash_desc *desc, const u8 *data,
+static int crypto_sm3_update(struct shash_desc *desc, const u8 *data,
		      unsigned int len)
{
-	return sm3_base_do_update(desc, data, len, sm3_generic_block_fn);
+	sm3_update(shash_desc_ctx(desc), data, len);
+	return 0;
}
-EXPORT_SYMBOL(crypto_sm3_update);

-int crypto_sm3_final(struct shash_desc *desc, u8 *out)
+static int crypto_sm3_final(struct shash_desc *desc, u8 *out)
{
-	sm3_base_do_finalize(desc, sm3_generic_block_fn);
-	return sm3_base_finish(desc, out);
+	sm3_final(shash_desc_ctx(desc), out);
+	return 0;
}
-EXPORT_SYMBOL(crypto_sm3_final);

-int crypto_sm3_finup(struct shash_desc *desc, const u8 *data,
+static int crypto_sm3_finup(struct shash_desc *desc, const u8 *data,
		     unsigned int len, u8 *hash)
{
-	sm3_base_do_update(desc, data, len, sm3_generic_block_fn);
-	return crypto_sm3_final(desc, hash);
+	struct sm3_state *sctx = shash_desc_ctx(desc);
+
+	if (len)
+		sm3_update(sctx, data, len);
+	sm3_final(sctx, hash);
+	return 0;
}
-EXPORT_SYMBOL(crypto_sm3_finup);

static struct shash_alg sm3_alg = {
	.digestsize	= SM3_DIGEST_SIZE,

@@ -174,6 +61,7 @@ static struct shash_alg sm3_alg = {
	.base = {
		.cra_name	 = "sm3",
		.cra_driver_name = "sm3-generic",
+		.cra_priority	 = 100,
		.cra_blocksize	 = SM3_BLOCK_SIZE,
		.cra_module	 = THIS_MODULE,
	}
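With the block function moved into lib/crypto, other kernel code can hash data directly with the sm3_init()/sm3_update()/sm3_final() helpers from <crypto/sm3.h>, exactly as sm2.c now does above. A minimal sketch of that library-style usage:

#include <crypto/sm3.h>

/* Minimal sketch: one-shot SM3 over a buffer via the CRYPTO_LIB_SM3 API,
 * no crypto_shash transform needed. */
static void sm3_lib_demo(const u8 *data, unsigned int len,
			 u8 digest[SM3_DIGEST_SIZE])
{
	struct sm3_state sctx;

	sm3_init(&sctx);
	sm3_update(&sctx, data, len);
	sm3_final(&sctx, digest);
}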
......
@@ -466,3 +466,4 @@ MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("XTS block cipher mode");
MODULE_ALIAS_CRYPTO("xts");
MODULE_IMPORT_NS(CRYPTO_INTERNAL);
+MODULE_SOFTDEP("pre: ecb");
@@ -401,7 +401,7 @@ config HW_RANDOM_MESON
config HW_RANDOM_CAVIUM
	tristate "Cavium ThunderX Random Number Generator support"
-	depends on HW_RANDOM && PCI && ARM64
+	depends on HW_RANDOM && PCI && ARCH_THUNDER
	default HW_RANDOM
	help
	  This driver provides kernel-side support for the Random Number
......
@@ -179,7 +179,7 @@ static int cavium_map_pf_regs(struct cavium_rng *rng)
	pdev = pci_get_device(PCI_VENDOR_ID_CAVIUM,
			      PCI_DEVID_CAVIUM_RNG_PF, NULL);
	if (!pdev) {
-		dev_err(&pdev->dev, "Cannot find RNG PF device\n");
+		pr_err("Cannot find RNG PF device\n");
		return -EIO;
	}
......
@@ -65,14 +65,14 @@ static int nmk_rng_probe(struct amba_device *dev, const struct amba_id *id)
out_release:
	amba_release_regions(dev);
out_clk:
-	clk_disable(rng_clk);
+	clk_disable_unprepare(rng_clk);
	return ret;
}

static void nmk_rng_remove(struct amba_device *dev)
{
	amba_release_regions(dev);
-	clk_disable(rng_clk);
+	clk_disable_unprepare(rng_clk);
}

static const struct amba_id nmk_rng_ids[] = {
......
@@ -47,7 +47,7 @@ obj-$(CONFIG_CRYPTO_DEV_VMX) += vmx/
obj-$(CONFIG_CRYPTO_DEV_BCM_SPU) += bcm/
obj-$(CONFIG_CRYPTO_DEV_SAFEXCEL) += inside-secure/
obj-$(CONFIG_CRYPTO_DEV_ARTPEC6) += axis/
-obj-$(CONFIG_CRYPTO_DEV_ZYNQMP_AES) += xilinx/
+obj-y += xilinx/
obj-y += hisilicon/
obj-$(CONFIG_CRYPTO_DEV_AMLOGIC_GXL) += amlogic/
obj-y += keembay/