PQC-AMX: Accelerating Saber and FrodoKEM on the Apple M1 and M3 SoCs.
IEEE Symposium on Computer Arithmetic (2023)
Abstract
As CPU performance growth can no longer keep up with the dramatic pace of the past few decades, CPU architects are turning to domain-specific architectures to accelerate certain tasks. A recent trend is the introduction of matrix-multiplication accelerators to CPUs by manufacturers such as IBM, Intel and ARM, some of which have yet to launch commercially. Apple’s systems-on-chip (SoCs) for its mobile phones, tablets and personal computers include a proprietary, undocumented CPU-coupled matrix multiplication coprocessor called AMX. We leverage AMX to accelerate the post-quantum lattice-based cryptosystems Saber and FrodoKEM, and benchmark their performance on Apple M1 and M3 SoCs. We propose a variant of the Toeplitz Matrix-Vector Product algorithm for polynomial multiplication, which sets new speed records for Saber using AMX, improving by up to 20% for the main KEM operations, and by 152% for matrix-vector multiplication of polynomials, over the current state-of-the-art. We also set new FrodoKEM speed records using AMX, gaining up to 21% for the main KEM operations and 124% for matrix multiplication (with further improvements for 4×-batching), over our optimized NEON implementation, also introduced here, which already improves upon the previous state-of-the-art for ARMv8 CPUs.
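To illustrate the general idea behind a Toeplitz matrix-vector product (TMVP), the sketch below writes polynomial multiplication in Z_q[x]/(x^n + 1) as one matrix-vector product and checks it against schoolbook multiplication. It is a minimal illustration of the textbook formulation, not the variant proposed in the paper and not AMX code. The function names, the toy size n = 4 and the test inputs are assumptions chosen for readability; the modulus q = 2^13 matches Saber's, whose real polynomials have n = 256 coefficients.

```python
# Sketch only: textbook Toeplitz formulation of negacyclic polynomial
# multiplication. All names here are hypothetical, not from the paper.

def negacyclic_toeplitz(a, q):
    """Build the n x n Toeplitz matrix T with T[k][j] = a[k-j] for k >= j
    and T[k][j] = -a[n+k-j] for k < j (the wrap-around sign comes from
    x^n = -1). Each entry depends only on k - j, so T is Toeplitz."""
    n = len(a)
    return [
        [a[k - j] % q if k >= j else (-a[n + k - j]) % q for j in range(n)]
        for k in range(n)
    ]

def tmvp_multiply(a, b, q):
    """Compute a * b mod (x^n + 1, q) as the matrix-vector product T(a) . b."""
    T = negacyclic_toeplitz(a, q)
    return [sum(t * x for t, x in zip(row, b)) % q for row in T]

def schoolbook_negacyclic(a, b, q):
    """Reference implementation: schoolbook product reduced by x^n = -1."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            k = i + j
            if k < n:
                c[k] = (c[k] + a[i] * b[j]) % q
            else:
                c[k - n] = (c[k - n] - a[i] * b[j]) % q
    return c

Q = 2**13            # Saber's modulus
a = [1, 2, 3, 4]     # toy polynomial 1 + 2x + 3x^2 + 4x^3 (assumed input)
b = [5, 6, 7, 8]     # toy polynomial 5 + 6x + 7x^2 + 8x^3 (assumed input)
result = tmvp_multiply(a, b, Q)
print(result)        # [8136, 8156, 2, 60]
```

Casting the multiplication as a matrix-vector product is what makes it a plausible fit for matrix-multiplication hardware: the inner loops become dense multiply-accumulate operations over a structured matrix.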
Key words
Post-quantum cryptography, AMX, ARM, NEON, FrodoKEM, Saber