crypto: talitos - chain in buffered data for ahash on SEC1
SEC1 doesn't support S/G in descriptors so for hash operations, the CPU has to build a buffer containing the buffered block and the incoming data. This generates a lot of memory copies which represents more than 50% of CPU time of a md5sum operation as shown below with a 'perf record'. |--86.24%-- kcapi_md_digest | | | |--86.18%-- _kcapi_common_vmsplice_chunk_fd | | | | | |--83.68%-- splice | | | | | | | |--83.59%-- ret_from_syscall | | | | | | | | | |--83.52%-- sys_splice | | | | | | | | | | | |--83.49%-- splice_from_pipe | | | | | | | | | | | | | |--83.04%-- __splice_from_pipe | | | | | | | | | | | | | | | |--80.67%-- pipe_to_sendpage | | | | | | | | | | | | | | | | | |--78.25%-- hash_sendpage | | | | | | | | | | | | | | | | | | | |--60.08%-- ahash_process_req | | | | | | | | | | | | | | | | | | | | | |--56.36%-- sg_copy_buffer | | | | | | | | | | | | | | | | | | | | | | | |--55.29%-- memcpy | | | | | | | | | | | | However, unlike SEC2+, SEC1 offers the possibility to chain descriptors. It is therefore possible to build a first descriptor pointing to the buffered data and a second descriptor pointing to the incoming data, hence avoiding the memory copy to a single buffer. With this patch, the time necessary for a md5sum on a 90Mbytes file is approximately 3 seconds. Without the patch it takes 6 seconds. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Showing
This diff is collapsed.