UDPIP-100G
100G UDP/IP Hardware Protocol Stack


The UDPIP-100G core implements a UDP/IP hardware protocol stack that enables high-speed communication over a LAN or a point-to-point connection. Designed for standalone operation, the core offloads the demanding task of UDP/IP encapsulation from the host processor and enables media streaming at speeds of up to 100 Gbps, even in processor-less SoC designs.

Trouble-free network operation is ensured through run-time programmability of all the required network parameters (local, destination and gateway IP addresses; UDP ports; and MAC address). The core implements the Address Resolution Protocol (ARP), which is critical for multiple-access networks, and the Echo Request and Echo Reply messages ("ping") of the Internet Control Message Protocol (ICMP), which are widely used to test network connectivity. It can use a static IP address or automatically request and acquire an IP address from a Dynamic Host Configuration Protocol (DHCP) server. Furthermore, the core supports IEEE 802.1Q tagging and is suitable for operation in a Virtual LAN.
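As an illustration of how the local, gateway, and subnet-mask parameters are conventionally used by an IPv4 host, the following minimal Python sketch picks the address that must be resolved through ARP before a unicast packet can be sent. The function name `next_hop` and the example addresses are hypothetical, and the sketch shows standard host behavior rather than the core's documented internal logic.

```python
import ipaddress


def next_hop(local_ip: str, subnet_mask: str, gateway_ip: str, dest_ip: str) -> str:
    """Return the IP address whose MAC address must be resolved via ARP.

    Destinations on the local subnet are reached directly; all others
    are sent to the gateway (conventional IPv4 host behavior).
    """
    # strict=False lets us pass a host address plus mask instead of a network address
    network = ipaddress.ip_network(f"{local_ip}/{subnet_mask}", strict=False)
    if ipaddress.ip_address(dest_ip) in network:
        return dest_ip
    return gateway_ip


# Example with made-up addresses
print(next_hop("192.168.1.10", "255.255.255.0", "192.168.1.1", "192.168.1.77"))
print(next_hop("192.168.1.10", "255.255.255.0", "192.168.1.1", "10.0.0.5"))
```

Multicast destinations are not covered here because they do not require ARP resolution.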

The core is easy to integrate into systems with or without a host processor. Packet data can be read from or written to the core via dedicated streaming-capable interfaces, or optionally via registers mapped on an SoC bus. Up to 32 streaming interfaces are available for transmit data, and up to 32 for receive data. Each pair of receive and transmit interfaces (a "channel") is configured independently with its source UDP port, destination IP address and UDP port, multicast receive address, and transmit mode (unicast or multicast). The packet-data interfaces are 512 bits wide and comply with either the AMBA® AXI4-Stream or the Avalon®-ST protocol. The register interface is 32 bits wide, operates on an independent clock, and complies with either the AXI4-Lite or the Avalon-MM protocol. The core is delivered with a wrapper that allows direct connection to Xilinx’s 100G Ethernet MAC.
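To give a feel for what a 512-bit streaming interface means for application logic, the sketch below computes how many bus transfers ("beats") a payload occupies and which bytes are valid in the final beat. It assumes an AXI4-Stream style byte-qualifier (`TKEEP`) with payload bytes packed contiguously from the least significant byte lane; that packing, and the function name `stream_beats`, are assumptions for illustration and are not taken from the core's interface specification. Avalon-ST expresses the same information differently, through an `empty` count.

```python
BUS_BYTES = 512 // 8  # 64 byte lanes per beat on a 512-bit interface


def stream_beats(payload_len: int) -> tuple[int, int]:
    """Return (number of beats, TKEEP mask of the last beat) for a payload."""
    if payload_len <= 0:
        raise ValueError("payload length must be positive")
    beats = -(-payload_len // BUS_BYTES)            # ceiling division
    last_valid = payload_len - (beats - 1) * BUS_BYTES
    tkeep_last = (1 << last_valid) - 1              # low 'last_valid' lanes set
    return beats, tkeep_last


beats, tkeep = stream_beats(100)
print(beats, hex(tkeep))
```

A 100-byte payload therefore needs two beats, with 36 valid byte lanes in the second one.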

Applications

Backbone networks; video, image and audio streaming or broadcasting over Ethernet; high-frequency trading systems; storage servers; high-speed communication between servers or LAN nodes; remote device monitoring; and control over IP networks.

Block Diagram

FEATURES

- Complete UDP/IP Hardware Stack
  - 40G, 50G, and 100G Ethernet
  - IPv4 support without packet fragmentation
  - Jumbo and Super Jumbo Frames
  - Transmit and Receive
  - ARP with Cache
  - ICMP (Ping Reply)
  - IGMPv3 (Multicast)
  - UDP/IP Unicast and Multicast
  - UDP Port Filtering
  - UDP/IP Checksums generation and validation, and optional Ethernet CRC validation
  - VLAN (IEEE 802.1Q) support
  - 1 to 32 UDP transmit and 1 to 32 UDP receive channels
  - Ethernet Framing processing for non-UDP user-provided packets
  - DHCP client

- Trouble-Free Operation
  - Run time programmable network parameters
    - Local MAC address, Local IP address, Gateway IP address, and IP subnet mask
    - Per-channel: Destination IP address; source, destination, and filtered UDP ports; multicast enable/disable and receive group
  - ARP support for operation in networks with Dynamic IP allocation

- Easy SoC Integration
  - Flexible interfaces:
    - Packet Data: 512-bit streaming capable using Avalon-ST or AXI4-Stream
    - Control/Status Registers: Generic 32-bit SRAM-like, 32-bit AXI4-Lite or Avalon-MM
  - Separate clock domains for packet processing and control/status interfaces
  - Configurable buffer sizes
  - Rich interrupt support for system events
  - Interface logic for Xilinx 100G Ethernet MAC IP core

Functional Description

The UDPIP-100G core receives and transmits UDP packet data, and forwards other traffic from the Ethernet MAC to the application and vice versa. It also receives and transmits ARP requests and responses, and responds to ICMP echo request messages. The core generates the UDP and IP checksums of outgoing packets and validates those of incoming packets. It can be programmed to either discard corrupted packets or forward them to the user application.
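The IP and UDP checksums mentioned above both use the standard Internet checksum (the ones'-complement sum of 16-bit words, as described in RFC 1071). The following Python sketch is a software reference for that algorithm, useful for example when checking testbench vectors; it is not a description of the core's hardware implementation, and the sample header is an arbitrary illustrative IPv4 header.

```python
def internet_checksum(data: bytes) -> int:
    """Ones'-complement checksum over 16-bit big-endian words (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Generation: compute over the header with the checksum field set to zero
header = bytes.fromhex("4500 0073 0000 4000 4011 0000 c0a8 0001 c0a8 00c7")
csum = internet_checksum(header)
print(hex(csum))

# Validation: a header carrying a correct checksum sums to zero
filled = header[:10] + csum.to_bytes(2, "big") + header[12:]
print(internet_checksum(filled) == 0)
```

Note that the UDP checksum additionally covers a pseudo-header (source and destination IP addresses, protocol, and UDP length), which is omitted here for brevity.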

The core consists of the following modules:

The **Ethernet Frame Decoder** receives Ethernet frames from an external Ethernet MAC, detects the frame type and sends frames to the ARP or the IP packet decoder.
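The frame-type detection can be pictured as a dispatch on the EtherType field at byte offset 12 of the Ethernet header. The Python sketch below models this behavior for an untagged frame using the standard EtherType values for IPv4 (0x0800) and ARP (0x0806); the function name `classify_frame` and the returned labels are hypothetical.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_ARP = 0x0806


def classify_frame(frame: bytes) -> str:
    """Classify an untagged Ethernet frame by its EtherType field."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    # Bytes 0-5: destination MAC, 6-11: source MAC, 12-13: EtherType
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype == ETHERTYPE_ARP:
        return "arp"
    if ethertype == ETHERTYPE_IPV4:
        return "ipv4"
    return "other"  # e.g. traffic forwarded to the application unchanged


arp_frame = bytes(12) + b"\x08\x06" + bytes(28)
print(classify_frame(arp_frame))
```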

The **Ethernet Frame Transmitter** provides the external Ethernet MAC interface. The Transmitter also multiplexes ARP and IP transmit packets from the core subsystems.

The **VLAN Receiver** receives Ethernet frames from an external Ethernet MAC and, when enabled, detects the VLAN tag of each frame, compares it against the configured tag, and passes only the frames that match.

The **VLAN Transmitter** receives Ethernet frames from the Ethernet Frame Transmitter and adds the VLAN Tag to the frames when enabled.
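For reference, IEEE 802.1Q tagging inserts four bytes between the source MAC address and the original EtherType: a Tag Protocol Identifier of 0x8100 followed by a 16-bit Tag Control Information field (3-bit priority, 1-bit drop-eligible indicator, 12-bit VLAN ID). The sketch below shows the operation on a frame held as Python bytes; the function name `add_vlan_tag` and its parameters are illustrative only and do not correspond to registers of the core.

```python
import struct

TPID_8021Q = 0x8100


def add_vlan_tag(frame: bytes, vid: int, pcp: int = 0, dei: int = 0) -> bytes:
    """Insert an 802.1Q tag after the source MAC address of an untagged frame."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    if not (0 <= vid <= 0xFFF and 0 <= pcp <= 7 and dei in (0, 1)):
        raise ValueError("VLAN field out of range")
    tci = (pcp << 13) | (dei << 12) | vid
    return frame[:12] + struct.pack("!HH", TPID_8021Q, tci) + frame[12:]


untagged = bytes(range(12)) + b"\x08\x00" + b"payload"
tagged = add_vlan_tag(untagged, vid=100, pcp=5)
print(tagged[12:16].hex())
```

The tagged frame is four bytes longer, and the original EtherType now sits at byte offset 16.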

The **Protocol Decoder and Checker** receives IP packets and handles them according to the packet type. The module decodes ICMP, IGMP, and UDP packets carried over IP and saves each packet to the related receive packet buffer. The module also checks packets for errors.

The **Received Packet Buffers** implement separate data storage for each protocol and each UDP channel. A buffer is implemented only if the related protocol or UDP channel is enabled. The buffer sizes are configurable at synthesis time.

The **Transmit Packet Buffer** stores UDP application data as well as the ICMP and IGMP packet data. The size of the buffer is configurable at synthesis time.

The **Transmit Packet Generator** assembles ICMP, IGMP, UDP packets based on data received from the Transmit Packet Buffer.

The **ARP Module** sends and receives ARP packets and handles each received packet according to the operation (request or reply) it carries.
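For readers unfamiliar with the ARP packet layout, the sketch below builds the 28-byte body of an ARP request for IPv4 over Ethernet (hardware type 1, protocol type 0x0800, operation 1). It is a software illustration of the standard packet format, not of the module's implementation; the function name `build_arp_request` and the example addresses are made up.

```python
import socket
import struct

ARP_OP_REQUEST = 1
ARP_OP_REPLY = 2


def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    """Build the ARP payload asking 'who has target_ip?' (Ethernet/IPv4)."""
    if len(sender_mac) != 6:
        raise ValueError("MAC address must be 6 bytes")
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,                      # hardware type: Ethernet
        0x0800,                 # protocol type: IPv4
        6,                      # hardware address length
        4,                      # protocol address length
        ARP_OP_REQUEST,
        sender_mac,
        socket.inet_aton(sender_ip),
        bytes(6),               # target MAC unknown in a request
        socket.inet_aton(target_ip),
    )


request = build_arp_request(bytes.fromhex("021122334455"), "192.168.1.10", "192.168.1.1")
print(len(request), request[:8].hex())
```

On the wire this payload is carried in an Ethernet frame with EtherType 0x0806, normally addressed to the broadcast MAC address.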

The **DHCP Module** automatically requests and acquires an IP address from a DHCP server.

Finally, the **Control and Status Registers** control the core’s functionality and report its status.

Support

The core as delivered is warranted against defects for ninety days from purchase. Thirty days of phone and email technical support are included, starting with the first interaction. Additional maintenance and support options are available.

Implementation Results

UDPIP-100G reference designs have been evaluated in a variety of technologies. The following sample implementation figures are indicative of the core capabilities and their corresponding utilization metrics.

<table>
<thead>
<tr>
<th>Family / Device</th>
<th>UDP Channels</th>
<th>LUTs</th>
<th>BRAM Tiles</th>
<th>Freq. (MHz)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Kintex UltraScale Xc7u060-2-e</td>
<td>1</td>
<td>13,611</td>
<td>45</td>
<td>250</td>
</tr>
<tr>
<td></td>
<td>4</td>
<td>16,864</td>
<td>68</td>
<td>250</td>
</tr>
<tr>
<td></td>
<td>8</td>
<td>22,232</td>
<td>98</td>
<td>250</td>
</tr>
<tr>
<td></td>
<td>16</td>
<td>32,774</td>
<td>159</td>
<td>250</td>
</tr>
<tr>
<td>Zynq UltraScale+ Xc7u011e1-e</td>
<td>1</td>
<td>13,575</td>
<td>45</td>
<td>250</td>
</tr>
<tr>
<td></td>
<td>4</td>
<td>16,816</td>
<td>68</td>
<td>250</td>
</tr>
<tr>
<td></td>
<td>8</td>
<td>22,184</td>
<td>98</td>
<td>250</td>
</tr>
<tr>
<td></td>
<td>16</td>
<td>32,718</td>
<td>159</td>
<td>250</td>
</tr>
</tbody>
</table>

Table 1: UDPIP-100G sample results for the core configured with 32 kB transmit and receive buffers, and with the Statistics Counters, Multicast, VLAN, and DHCP support enabled.

<table>
<thead>
<tr>
<th>Family / Device</th>
<th>UDP Channels</th>
<th>LUTs</th>
<th>BRAM Tiles</th>
<th>Freq. (MHz)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Versal Xcvkc190-2mp</td>
<td>4</td>
<td>16,852</td>
<td>53</td>
<td>250</td>
</tr>
<tr>
<td></td>
<td>1</td>
<td>9,929</td>
<td>30</td>
<td>250</td>
</tr>
<tr>
<td>Kintex UltraScale Xc7u060-2-e</td>
<td>4</td>
<td>15,923</td>
<td>53</td>
<td>250</td>
</tr>
<tr>
<td></td>
<td>8</td>
<td>18,011</td>
<td>83</td>
<td>250</td>
</tr>
<tr>
<td></td>
<td>16</td>
<td>27,341</td>
<td>143</td>
<td>250</td>
</tr>
<tr>
<td>Zynq UltraScale+ Xc7u011e1-e</td>
<td>1</td>
<td>9,928</td>
<td>30</td>
<td>250</td>
</tr>
<tr>
<td></td>
<td>4</td>
<td>15,923</td>
<td>53</td>
<td>250</td>
</tr>
<tr>
<td></td>
<td>8</td>
<td>18,011</td>
<td>83</td>
<td>250</td>
</tr>
<tr>
<td></td>
<td>16</td>
<td>27,337</td>
<td>143</td>
<td>250</td>
</tr>
</tbody>
</table>

Table 2: UDPIP-100G sample results for the core configured with 32 kB transmit and receive buffers, and with the Statistics Counters, Multicast, VLAN, and DHCP support disabled.
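As a quick sanity check on the figures above, the short calculation below relates the 512-bit packet-data width to the 250 MHz frequency reported in the tables. It assumes, without confirmation from this datasheet, that the reported frequency is the clock of the 512-bit packet datapath and that one bus word is transferred per cycle; the helper names are hypothetical.

```python
def datapath_gbps(width_bits: int, clock_mhz: float) -> float:
    """Raw datapath bandwidth in Gbit/s, assuming one word per clock cycle."""
    return width_bits * clock_mhz / 1000


def min_clock_mhz(line_rate_gbps: float, width_bits: int) -> float:
    """Lowest clock (MHz) at which the datapath matches a given line rate."""
    return line_rate_gbps * 1000 / width_bits


print(datapath_gbps(512, 250))   # raw bandwidth at the reported frequency
print(min_clock_mhz(100, 512))   # clock needed to just match 100 Gbit/s
```

Under these assumptions the datapath offers 128 Gbit/s of raw bandwidth, leaving headroom above the 100 Gbit/s line rate for partially filled final bus words and inter-packet gaps.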

Deliverables

The core is available in synthesizable RTL and FPGA netlist forms, and includes everything required for successful implementation: a sophisticated self-checking testbench, simulation scripts, test vectors and expected results, synthesis scripts, and comprehensive user documentation.