SHA-3
SHA-3 Secure Hash Crypto Engine

The SHA-3 is a high-throughput, area-efficient hardware accelerator for the SHA-3 cryptographic hashing functions, compliant to NIST’s FIPS 180-4 and FIPS 202 standards. 
The accelerator core requires no assistance from a host processor and uses standard AMBA® AXI4-Stream interfaces for input and output data. An AXI4-Stream to AXI4 Memory Mapped bridge, with or without DMA capabilities, can be used with the core and is separately available from CAST. A single instance of the core implements all fixed-length and extendable-output hash functions. The cryptographic function, the length of the extendable output function (up to 2GB) is chosen at run time via AXI4-Stream side-band signals and can be different for every input message. 
The SHA-3 core is also highly configurable at synthesis time, to ease integration in systems with different requirements. The data-bus width of the input and output interfaces is configurable at synthesis time. The number of SHA-3 permutation rounds per clock cycle is also configurable at synthesis time, allowing users to trade throughput for silicon resources. Under its minimum configuration of one permutation per cycle, the core processes 50 bits per cycle depending on the hashing function. Its throughput can scale by implementing 2, 3, or 4 permutations per cycle respectively, enabling throughputs in excess of 100Gbps in modern ASIC technologies.
The core is designed for ease of use and integration and adheres to industry-best coding and verification practices. Technology mapping, and timing closure are trouble-free, as the core contains no multi-cycle or false paths, and uses only ris-ing-edge-triggered D-type flip-flops, no tri-states, and a single-clock/reset domain.

Applications

The SHA-3 IP core can be used to ensure data integrity and/or verify authentication in a wide range of applications including IP-sec and TLS/SSL protocol engines, secure boot engines, encrypted data storage, e-commerce, and financial transaction systems.

Sample implementation results for a limited set of the SHA3 core configurations are provided in the following table.

Target
Technology
Configuration Logic
Resources
Memory
Resources
Freq.
(MHz)
Input / Output
Bitwidth
Rounds
per Cycle
Number
of Buffers
TSMC 7nm 32 1 0 32,803 Gates 1,600
TSMC 7nm 64 1 1 48,665 Gates 1,700
TSMC 7nm 64 1 2 60,598 Gates 1,700
TSMC 7nm 128 2 2 97,787 Gates 1,300
TSMC 7nm 512 4 2 149,687 Gates 700

Note that these sample implementation figures do not represent the highest speed or smallest area possible for the core.  

The SHA-3 IP core can be implemented in any Altera FPGA provided sufficient resources are available. Sample implementation results for a limited set of configurations are provided in the following table. Please, note that the list of configurations is not exhaustive and that the indicated clock frequency is not the highest possible.;

FPGA Device Configuration Logic
Resources
Memory
Resources
Freq.
(MHz)
Input / Output
Bitwidth
Rounds
per Cycle
Number
of Buffers
Cyclone V (-7) 32 1 0 3,291 ALMs 100
Arria 10 GX (-1) 64 1 1 3,852 ALMs 300
Ailex (-1) 64 1 1 4,087 ALMs 500
Ailex (-1) 128 2 2 6,357 ALMs 250
Ailex (-1) 512 4 2 14,218 ALMs 125

The SHA-3 IP core can be implemented in any AMD FPGA provided sufficient resources are available. Sample implementation results for a limited set of configurations are provided in the following table. Please, note that the list of configurations is not exhaustive and that the indicated clock frequency is not the highest possible.

FPGA Device Configuration Logic
Resources
Memory
Resources
Freq.
(MHz)
Input / Output
Bitwidth
Rounds
per Cycle
Number
of Buffers
Spartan-7 (-1) 64 1 1 4,767 LUTs 125
Artix-7 (-1) 64 1 1 4,772 LUTs 150
Kintex Ultrascale+ (-1) 64 1 1 4,808 LUTs 375
Kintex Ultrascale+ (-1) 128 2 2 10,340 LUTs 200
Kintex Ultrascale+ (-1) 512 4 2 20,672 LUTs 100

The SHA-3 IP core can be implemented in any Microchip FPGA provided sufficient resources are available. Sample implementation results for a limited set of configurations are provided in the following table. Please, note that the list of configurations is not exhaustive and that the indicated clock frequency is not the highest possible.

FPGA Device Configuration Logic
Resources
Memory
Resources
Freq.
(MHz)
Input / Output
Bitwidth
Rounds
per Cycle
Number
of Buffers
RTG4 rt4g150-std 32 1 0 7,162 4LUT 85
RTG4 rt4g150-std 64 1 1 9,302 4LUT 85
RTG4 rt4g150-std 64 1 2 9,687 4LUT 80
RTG4 rt4g150-std 128 2 2 15,185 4LUT 65

Related Content

Features List

Standards Support 

  • FIPS 202: SHA-3 - Permutation-Based Hash and Extendable-Output Function  
  • FIPS 180-4: Secure Hash Functions (limited to SHA-3 use) 
  • All four fixed-length SHA-3 Hash Functions: 
    • SHA3-224 
    • SHA3-256 
    • SHA3-384 
    • SHA3-512 
  • Both SHA-3 Extendable Output Functions (XOF): 
    • SHAKE-128 
    • SHAKE-256 
  • NIST-Validated

Performance  

  • User-selectable (1 to 4) permutation rounds per clock cycle, resulting in a throughput of:
    • Up to 50 Mbits/MHz for one permutation per cycle 
    • Up to 150 Mbits/MHz for four permutations per cycle
  • Intelligent buffers management optionally allows receiving new input while processing the previous message 
  • Optional dynamic control of the number of permutation rounds

Interfaces  

  • AMBA® AXI4-Stream  

Fully autonomous operation  

  • Requires no assistance from the host processor 
  • Automatic padding insertion 

Configuration Options  

  • Hashing function (bit-rate, capacity, number of permutation rounds)
  • Input & output bus bit-width  
  • Number of input buffers 
  • Number of permutations per cycle 
  • Enable/disable dynamic control of permutation rounds 

Deliverables 

  • Verilog RTL source code or targeted FPGA netlist
  • Integration Test-Bench
  • Simulation & synthesis scripts
  • Bit Accurate C Model
  • User documentation

Let's talk about your project and our IP solutions

Request Info