AES-GCM
AES-GCM Authenticated Encrypt/Decrypt Engine

The AES-GCM encryption IP core implements Rijndael encoding and decoding in compliance with the NIST Advanced Encryption Standard. It processes 128-bit blocks, and is programmable for 128-, 192-, and 256-bit key lengths. 

Four architectural versions are available to suit system requirements. The Standard version (AES-GCM-S) is more compact, using a 32-bit datapath and requiring 44/52/60 clock cycles for each data block (128/192/256-bit cipher key, respectively). The Fast version (AES-GCM-F) achieves higher throughput using a 128-bit datapath and requiring 11/13/15 clock cycles for each data block depending on key size.

For applications where throughput is critical, two additional versions are available. The High Throughput AES-GCM-X can process 128 bits/cycle and the Higher Throughput AES-GCM-X2 can process 256 bits/cycle, in both cases independent of the key size.
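The throughput figures in the implementation results tables follow from these cycle counts. As a rough sketch (the function and table names below are illustrative and not part of the core's deliverables), steady-state throughput is the block size multiplied by the clock frequency and divided by the cycles per block:

```python
# Illustrative only: names are hypothetical, not part of the core's interface.
BLOCK_BITS = 128

# Clock cycles per 128-bit block for the Standard and Fast versions,
# taken from the figures quoted in the text.
CYCLES_PER_BLOCK = {
    "S": {128: 44, 192: 52, 256: 60},
    "F": {128: 11, 192: 13, 256: 15},
}

# Bits processed per clock cycle for the -X and -X2 versions
# (independent of key size).
BITS_PER_CYCLE = {"X": 128, "X2": 256}


def throughput_mbps(version: str, key_bits: int, fmax_mhz: float) -> float:
    """Estimate steady-state throughput in Mbps at the given clock (MHz)."""
    if version in BITS_PER_CYCLE:
        return BITS_PER_CYCLE[version] * fmax_mhz
    return BLOCK_BITS * fmax_mhz / CYCLES_PER_BLOCK[version][key_bits]


if __name__ == "__main__":
    for version in ("S", "F", "X", "X2"):
        print(version, round(throughput_mbps(version, 128, 500.0), 1), "Mbps at 500 MHz")
```

For example, the Standard version at 500 MHz with a 128-bit key comes out at roughly 1,455 Mbps, which agrees with the 1.455 Gbps listed in the ASIC results.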

GCM stands for Galois/Counter Mode, a generic authenticate-and-encrypt block cipher mode. A Galois Field (GF) multiplier/accumulator generates an authentication tag, while CTR (Counter) mode performs the encryption.

The AES-GCM cores are fully synchronous designs that have been evaluated in a variety of technologies, and are available optimized for ASICs or FPGAs.

An AES encryption operation transforms a 128-bit block into a block of the same size. The encryption key can be chosen among three different sizes: 128, 192 or 256 bits. The key is expanded during cryptographic operations.  

The AES algorithm consists of a series of steps repeated a number of times (rounds). In Rijndael the number of rounds depends on both the key size and the block size; because AES fixes the block at 128 bits, the number of rounds here depends only on the key size. The intermediate cipher result is known as the state. Initially, the incoming data and the key are added together in the AddRoundKey module. The result is stored in the State Storage area.

Number of rounds as a function of key size

          KSIZE = 00    KSIZE = 01    KSIZE = 10
Rounds    10            12            14
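The round counts in the table can be reproduced with a one-line rule from FIPS 197: the key length in 32-bit words plus 6. The sketch below assumes that KSIZE codes 00, 01 and 10 select 128-, 192- and 256-bit keys (inferred from the order of the table, not stated explicitly), and the helper names are hypothetical. The second function captures a pattern observed in the cycle counts quoted for the Standard and Fast versions; it is an observation about those numbers, not a documented property of the core.

```python
# Assumption: KSIZE 0b00/0b01/0b10 select 128/192/256-bit keys,
# matching the left-to-right order of the rounds table.
KSIZE_TO_KEY_BITS = {0b00: 128, 0b01: 192, 0b10: 256}


def rounds_for_ksize(ksize: int) -> int:
    """AES round count: key length in 32-bit words plus 6 (FIPS 197)."""
    key_bits = KSIZE_TO_KEY_BITS[ksize]
    return key_bits // 32 + 6


def cycles_per_block(ksize: int, datapath_bits: int) -> int:
    """Observed pattern in the quoted cycle counts: (rounds + 1) passes
    over the state, each taking 128 / datapath_bits clock cycles."""
    return (rounds_for_ksize(ksize) + 1) * (128 // datapath_bits)


if __name__ == "__main__":
    for ksize in (0b00, 0b01, 0b10):
        print(format(ksize, "02b"), rounds_for_ksize(ksize),
              cycles_per_block(ksize, 32), cycles_per_block(ksize, 128))
```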

The state is then retrieved and the ByteSub, ShiftRow, MixColumn and AddRoundKey functions are applied to it in that order. At the end of each round, the new state is stored in the State Storage area. These operations are repeated according to the number of rounds.

The final round differs from the others in that the MixColumn step is skipped. The ciphertext is output after the final round.
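The order of operations described above can be written out as a small schedule generator. This is only a structural sketch: it lists step names per round and performs no cryptography, and the function name is hypothetical.

```python
def round_schedule(rounds: int) -> list[list[str]]:
    """List the AES steps per round for an encryption with `rounds` rounds.

    Entry 0 is the initial key addition; entries 1..rounds are the rounds.
    The final round omits MixColumn.
    """
    schedule = [["AddRoundKey"]]
    for r in range(1, rounds + 1):
        steps = ["ByteSub", "ShiftRow"]
        if r != rounds:  # MixColumn is skipped in the final round only
            steps.append("MixColumn")
        steps.append("AddRoundKey")
        schedule.append(steps)
    return schedule


if __name__ == "__main__":
    for index, steps in enumerate(round_schedule(10)):
        print(index, " -> ".join(steps))
```

Counting the AddRoundKey steps gives rounds + 1 round keys, which is why the expanded key is larger than the cipher key itself.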

GCM mode 

In GCM mode, the counter used for CTR encryption is initialized from an IV supplied by the user.
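As a minimal sketch of how the counter is derived in standard GCM (NIST SP 800-38D), assuming a 96-bit IV: the initial counter block is the IV followed by a 32-bit counter set to 1, and only the last 32 bits are incremented for each subsequent block. How the core's IV input maps onto this is not described here, so treat the helper names and the 96-bit restriction as assumptions of the example.

```python
def initial_counter_block(iv: bytes) -> bytes:
    """Build the initial 128-bit counter block from a 96-bit IV."""
    if len(iv) != 12:
        raise ValueError("this sketch only handles 96-bit IVs")
    return iv + (1).to_bytes(4, "big")


def inc32(block: bytes) -> bytes:
    """Increment the rightmost 32 bits of a counter block, modulo 2^32."""
    counter = (int.from_bytes(block[12:], "big") + 1) & 0xFFFFFFFF
    return block[:12] + counter.to_bytes(4, "big")


if __name__ == "__main__":
    j0 = initial_counter_block(bytes(12))
    print(j0.hex())         # counter block reserved for the tag
    print(inc32(j0).hex())  # counter block for the first data block
```

Other IV lengths are handled in the standard by hashing the IV first, which is outside the scope of this sketch.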

During encryption or decryption, CTR mode is used to process the incoming data. An authentication tag of up to 128 bits is also produced: a GF multiplier/accumulator hashes the additional data together with the ciphertext (the output of CTR-mode encryption, or the input to CTR-mode decryption).
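The multiply/accumulate hashing step can be sketched in software as follows. This follows the GF(2^128) multiplication and GHASH construction of NIST SP 800-38D and is meant only to illustrate the arithmetic; it says nothing about how the core implements its multiplier, and the function names are hypothetical. The hash key `h` is the AES encryption of an all-zero block under the cipher key, which is assumed to be computed elsewhere.

```python
R = 0xE1 << 120  # reduction constant for GCM's bit ordering


def gf128_mul(x: int, y: int) -> int:
    """Multiply two 128-bit field elements (leftmost bit = coefficient of x^0)."""
    z, v = 0, y
    for i in range(128):
        if (x >> (127 - i)) & 1:
            z ^= v
        v = (v >> 1) ^ R if v & 1 else v >> 1
    return z


def _blocks(data: bytes):
    """Yield 128-bit integers from data, zero-padding the last block."""
    for i in range(0, len(data), 16):
        yield int.from_bytes(data[i:i + 16].ljust(16, b"\x00"), "big")


def ghash(h: bytes, aad: bytes, ciphertext: bytes) -> bytes:
    """Accumulate AAD, ciphertext and their bit lengths under hash key h."""
    h_int = int.from_bytes(h, "big")
    lengths = (len(aad) * 8).to_bytes(8, "big") + (len(ciphertext) * 8).to_bytes(8, "big")
    acc = 0
    for part in (aad, ciphertext, lengths):
        for block in _blocks(part):
            acc = gf128_mul(acc ^ block, h_int)  # multiply/accumulate step
    return acc.to_bytes(16, "big")
```

The tag is then obtained by XORing this hash with the encryption of the initial counter block and truncating to the required tag length.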

At the end of decryption, the user should verify that the computed authentication tag matches the one received with the message. If the two differ, authentication has failed. In that case nothing other than the failure itself should be revealed: neither the decrypted data nor the computed value of the authentication tag.
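On the software side, this check might look like the sketch below. It assumes the host receives the decrypted data and the computed tag from the core and decides whether to release the data; the function and exception names are hypothetical. The comparison uses `hmac.compare_digest` from the Python standard library so that the time taken does not depend on where the tags differ.

```python
import hmac


class AuthenticationError(Exception):
    """Raised when the computed tag does not match the received tag."""


def release_plaintext(plaintext: bytes, computed_tag: bytes, received_tag: bytes) -> bytes:
    """Return the plaintext only if the tags match.

    On mismatch, raise an error that carries neither the decrypted data
    nor the computed tag.
    """
    if not hmac.compare_digest(computed_tag, received_tag):
        raise AuthenticationError("authentication failed")
    return plaintext


if __name__ == "__main__":
    tag = bytes(16)
    print(release_plaintext(b"payload", tag, tag))
```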

Key Expansion 

The AES algorithm requires an expanded key for encryption or decryption. The KEXP AES key expander core is available as an AES-GCM core option for the Standard and Fast versions, and is included in the higher throughput versions. During encryption, the key expander can produce the expanded key on the fly while the AES core is consuming it. For decryption, though, the key must be pre-expanded and stored in an appropriate memory before being used by the AES core. This is because the core uses the expanded key backwards during decryption.

In some cases a key expander is not required. This might be the case when the key does not need to be changed (and so it can be stored in its expanded form) or when the key does not change very often (and thus it can be expanded more slowly in software). 
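For cases where the key is expanded in software, the following is a minimal sketch of the FIPS 197 key expansion. It is not the C model shipped with the core, and the function names are hypothetical. Each round key depends on the previous ones, so round keys appear in the order encryption needs them; decryption starts from the last round key, which is why the full expansion must be available before decryption begins.

```python
def _gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def _rotl8(x: int, n: int) -> int:
    return ((x << n) | (x >> (8 - n))) & 0xFF


def _build_sbox() -> list[int]:
    """S-box: multiplicative inverse in GF(2^8) followed by the affine map."""
    box = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if _gf_mul(x, y) == 1), 0)
        box.append(inv ^ _rotl8(inv, 1) ^ _rotl8(inv, 2)
                   ^ _rotl8(inv, 3) ^ _rotl8(inv, 4) ^ 0x63)
    return box


SBOX = _build_sbox()


def expand_key(key: bytes) -> list[bytes]:
    """Return the round keys (16 bytes each), in encryption order."""
    nk = len(key) // 4
    if len(key) not in (16, 24, 32):
        raise ValueError("key must be 128, 192 or 256 bits")
    rounds = nk + 6
    words = [key[4 * i:4 * i + 4] for i in range(nk)]
    rcon = 1
    for i in range(nk, 4 * (rounds + 1)):
        temp = words[i - 1]
        if i % nk == 0:
            # Rotate, substitute, then add the round constant
            temp = bytes(SBOX[b] for b in temp[1:] + temp[:1])
            temp = bytes([temp[0] ^ rcon]) + temp[1:]
            rcon = _gf_mul(rcon, 2)
        elif nk > 6 and i % nk == 4:
            temp = bytes(SBOX[b] for b in temp)  # extra step for 256-bit keys
        words.append(bytes(a ^ b for a, b in zip(words[i - nk], temp)))
    return [b"".join(words[4 * r:4 * r + 4]) for r in range(rounds + 1)]


def round_keys_in_use_order(key: bytes, decrypt: bool) -> list[bytes]:
    """Round keys in the order the cipher consumes them."""
    keys = expand_key(key)
    return keys[::-1] if decrypt else keys
```

A 128-bit key expands to 11 round keys (1,408 bits), a 192-bit key to 13 and a 256-bit key to 15, which gives an idea of the memory needed to hold a pre-expanded key.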

The core has been verified through extensive synthesis, place and route and simulation runs. It has also been embedded in several products, and is proven in FPGA technologies. 
 

Support 

The core as delivered is warranted against defects for ninety days from purchase. Thirty days of phone and email technical support are included, starting with the first interaction. Additional maintenance and support options are available. 

Deliverables 

The core is available in ASIC (RTL) or FPGA (netlist) forms, and includes everything required for successful implementation. The ASIC version includes:

  •     HDL RTL source 
  •     Sophisticated HDL Testbench (self checking) 
  •     C Model & test vector generator 
  •     Simulation script, vectors & expected results 
  •     Synthesis script 
  •     User documentation 

The AES-GCM can be mapped to any ASIC technology or FPGA device (provided sufficient silicon resources are available). The following are sample ASIC pre-layout results reported from synthesis with a silicon vendor design kit under typical conditions, with all core I/Os assumed to be routed on-chip. The figures provided do not represent the highest speed or smallest area achievable for the core. Please contact CAST to get characterization data for your target configuration and technology.

AES-GCM Standard Core ASIC Implementation Results

ASIC Technology    Number of eq. gates    Fmax (MHz)    Throughput (Gbps)
TSMC 16nm          10,556                 500           1.455
TSMC 28nm HPM      11,041                 500           1.455
TSMC 40nm G        14,993                 500           1.455

Throughput for a 128-bit key size

AES-GCM Fast Core ASIC Implementation Results

ASIC Technology    Number of eq. gates    Fmax (MHz)    Throughput (Gbps)
TSMC 16nm          21,909                 500           5.818
TSMC 28nm HPM      21,368                 500           5.818
TSMC 40nm G        26,659                 500           5.818

Throughput for a 128-bit key size

AES-GCM High Throughput (-X) ASIC Implementation Results

ASIC Technology    Number of eq. gates    Fmax (MHz)    Throughput (Gbps)
TSMC 40nm          384,786                800           102.40
TSMC 28nm          270,39                 800           102.40
TSMC 16nm          233,200                800           102.40

AES-GCM Higher Throughput (-X2) ASIC Implementation Results

ASIC Technology    Number of eq. gates    Fmax (MHz)    Throughput (Gbps)
TSMC 40nm          760,757                800           204.80
TSMC 28nm          532,089                800           204.80
TSMC 16nm          451,539                800           204.80

The AES-GCM can be mapped to any ASIC technology or FPGA device (provided sufficient silicon resources are available). The following are sample Intel results with all core I/Os assumed to be routed on-chip. The figures provided do not represent the highest speed or smallest area achievable for the core. Please contact CAST to get characterization data for your target configuration and technology.

AES-GCM Standard Core Intel Implementation Results

Family              ALMs     RAM bits    Freq. (MHz)    Throughput (Mbps)
Arria 10 GX (-2)    778      0           70             204
Stratix V (-1)      768      0           150            436
MAX 10 (-7)         1,604    0           50             145

Throughput for a 128-bit key size

AES-GCM Fast Core Intel Implementation Results

Family              ALMs     RAM bits    Freq. (MHz)    Throughput (Mbps)
Arria 10 GX (-2)    2,187    0           100            1,164
Stratix V (-1)      2,312    0           150            1,745
MAX 10 (-7)         5,812    0           75             873

Throughput for a 128-bit key size

AES-GCM High Throughput (-X) Intel Implementation Results

Family              ALMs     RAM bits    Freq. (MHz)    Throughput (Gbps)
Arria 10 GX (-1)    9,543    868,352     200            25.60
Stratix V (-1)      9,652    868,352     225            28.80

AES-GCM Higher Throughput (-X2) Intel Implementation Results

Family              ALMs      RAM bits     Freq. (MHz)    Throughput (Gbps)
Arria 10 GX (-1)    18,607    1,736,704    100            25.60
Stratix V (-1)      17,935    1,736,704    200            51.20

The AES-GCM can be mapped to any ASIC technology or FPGA device (provided sufficient silicon resources are available). The following are sample Xilinx results with all core I/Os assumed to be routed on-chip. The figures provided do not represent the highest speed or smallest area achievable for the core. Please contact CAST to get characterization data for your target configuration and technology.

AES-GCM Standard Core Xilinx Implementation Results

Family                     LUTs     BRAMs    Freq. (MHz)    Throughput (Mbps)
Virtex-7 (-3)              1,098    0        200            582
Kintex-7 (-2)              1,063    0        150            436
Kintex UltraScale (-1)     920      2        200            582
Kintex UltraScale (-2)     1,108    0        200            582
Kintex UltraScale+ (-1)    930      2        400            1,164

Throughput for a 128-bit key size

AES-GCM Fast Core Xilinx Implementation Results

Family                     LUTs     BRAMs    Freq. (MHz)    Throughput (Mbps)
Virtex-7 (-3)              2,486    0        300            3,491
Kintex UltraScale (-1)     2,172    8        250            2,908
Kintex UltraScale+ (-1)    2,169    8        350            4,071

Throughput for a 128-bit key size

AES-GCM High Throughput (-X) Xilinx Implementation Results

Family                     LUTs      BRAMs    Freq. (MHz)    Throughput (Gbps)
Virtex-7 (-3)              9,348     108      200            25.6
Virtex UltraScale (-3)     9,651     108      300            38.4
Kintex UltraScale (-1)     11,619    108      200            25.6
Kintex UltraScale+ (-1)    11,612    108      300            38.4
Kintex UltraScale+ (-3)    11,624    108      400            51.2

AES-GCM Higher Throughput (-X2) Xilinx Implementation Results

Family                     LUTs      BRAMs    Freq. (MHz)    Throughput (Gbps)
Virtex-7 (-3)              25,246    216      200            51.2
Virtex UltraScale (-3)     25,194    216      250            64.0
Kintex UltraScale (-1)     28,612    216      200            51.2
Kintex UltraScale+ (-1)    28,618    216      250            64.0
Kintex UltraScale+ (-3)    25,279    216      350            89.6

Related Content

This product is sourced from Technology Partner Ocean Logic.

Features List

 

  • Encrypts and decrypts using the AES Rijndael Block Cipher Algorithm 
  • Satisfies Federal Information Processing Standard (FIPS) Publication 197 from the US National Institute of Standards and Technology (NIST)  
  • Processes data in 128-bit blocks 
  • Employs user-programmable key size of 128, 192, or 256 bits 
  • Four architectural versions: 
    • AES-GCM-S is more compact: 32-bit data path size. Processes each 128-bit data block in 44/52/60 clock cycles for 128/192/256-bit cipher keys, respectively 
    • AES-GCM-F yields higher transmission rates: 128-bit data path. Processes each 128-bit block in 11/13/15 clock cycles for 128/192/256-bit cipher keys, respectively 
    • Higher throughput versions (AES-GCM-X or AES-GCM-X2) can process 128 bits/cycle or 256 bits/cycle and have a 128-bit datapath size 
  • Arbitrary IV length for fast version 
  • Works with a pre-expanded key or can integrate the optional key expansion function 
  • NIST Certified
  • Simple, fully synchronous, reusable design  
  • Available as fully functional and synthesizable VHDL or Verilog, or as a netlist for popular programmable devices 
  • Complete deliverables include test benches, C model and test vector generator 

 

Resources

NIST: Approved Block Ciphers

FIPS 197, Advanced Encryption Standard (AES): download PDF

AES test suite: The Advanced Encryption Standard Algorithm Validation Suite (AESAVS): download PDF


This core implements encryption functions and as such it is subject to export control regulations. Export to your country may or may not require a special export license. Please contact CAST to determine what applies in your specific case.