# H264-LD-BP

# Low-Power AVC/H.264 Baseline Profile Decoder



The H264-LD-BP IP core implements a silicon- and energy-efficient hardware video decoder that processes H.264 streams produced by the H264-E-BPS, H264-E-BPF, and H264-E-BIS video encoder cores available from CAST.

The H264-LD-BP is extremely small, requiring less than 70k gates and about 60k bits of internal memory. Its small silicon footprint, low bandwidth requirements, and zero software overhead enable extremely cost-effective and low-power ASIC and FPGA implementations.

The H264-LD-BP is designed for straightforward, trouble-free SoC integration. It operates stand-alone: decoding proceeds without any assistance or input from the host processor. The decoder's memory interface, which is used to store reconstructed video data, is extremely flexible: it operates on a separate clock domain, is independent of the external memory type and memory controller, and tolerates relatively large latencies. The decoder reports decompressed video parameters, detects and reports bit stream errors to the system, and simplifies video cropping at its output. The core is optionally delivered with a raster-to-block converter, and wrappers for AMBA® AHB, AXI, or AXI-Streaming buses are available.

Customers can further decrease their time to market by using CAST's integration services to receive complete video encoding/decoding subsystems. These integrate the decoder core with video encoders, video and networking interface controllers, networking stacks, or other CAST or third-party IP cores.

The H264-LD-BP IP core is designed using industry best practices and has been production proven multiple times. Its deliverables include a complete verification environment and a bit-accurate software model.

## **Block Diagram**



#### **FEATURES**

- Low-power AVC/H.264 decoder with a small silicon footprint, optimized for low-latency, low-bit-rate video streaming
- Decodes streams produced by the H264-E-BPS, H264-E-BPF, and H264-E-BIS cores

#### Video Formats

- Progressive or Interlaced, 4:2:0 YCbCr with 8 bits per color sample
- Single-channel SD, ED, and Full-HD capable even in low-cost FPGAs
- Optional multichannel decoding
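As a rough illustration of what the 4:2:0, 8-bit format implies for the external frame store, the sketch below computes the size of one reconstructed frame. The function name and the resolutions used for SD, 720p, and Full-HD are assumptions made for this example, not figures from the datasheet, and the calculation ignores any padding to macroblock boundaries that an actual implementation may apply.

```python
def frame_bytes_420(width: int, height: int, bits_per_sample: int = 8) -> int:
    """Bytes in one 4:2:0 frame (hypothetical helper).

    4:2:0 stores a full-resolution luma plane plus two chroma planes,
    each subsampled by 2 horizontally and vertically.
    """
    luma_samples = width * height
    chroma_samples = 2 * (width // 2) * (height // 2)
    return (luma_samples + chroma_samples) * bits_per_sample // 8


# Assumed active resolutions, for illustration only.
formats = {
    "SD (720x480)": (720, 480),
    "HD (1280x720)": (1280, 720),
    "Full-HD (1920x1080)": (1920, 1080),
}

for name, (w, h) in formats.items():
    print(f"{name}: {frame_bytes_420(w, h):,} bytes per frame")
```

With 8-bit samples this works out to 1.5 bytes per pixel, so the assumed Full-HD frame occupies 3,110,400 bytes.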

#### **Small and Low-Power**

- Less than 70k Gates and about 60k bits of RAM
- Less than half the typical silicon footprint and small external memory bandwidth mean it uses less power than competitive hardware H.264 decoders
- Consumes much less power than any equivalent software or software-hardware decoder

#### **Ease of Integration**

- Zero CPU overhead, stand-alone operation
- Flexible external memory interface: uses a separate clock, is independent of memory type, and tolerates latencies
- AMBA® interface options: DMA-capable AMBA® AHB, AXI, or AXI-Streaming

#### **Supported Coding Tools**

- I and P Slices
- Single Reference Frame
- Motion vectors from -32.00 to +31.75 pixels, with ¼-pel accuracy
- All Intra 16x16 and most Intra 4x4 modes
- Multiple slices per frame
- Block skipping
- Deblocking filter
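The motion vector range listed above maps directly onto integer quarter-pel units: -32.00 to +31.75 pixels in steps of 0.25 corresponds to -128 to +127, which is the range of a signed 8-bit value. The sketch below only illustrates that arithmetic. The function names are hypothetical, it assumes the same limits apply to each vector component, and it does not describe how the core represents vectors internally.

```python
QUARTER_PEL_PER_PIXEL = 4  # quarter-pel accuracy: 4 units per pixel

MV_MIN_PIXELS = -32.00  # lower limit from the feature list
MV_MAX_PIXELS = 31.75   # upper limit from the feature list


def to_quarter_pel(pixels: float) -> int:
    """Convert a displacement in pixels to integer quarter-pel units."""
    units = pixels * QUARTER_PEL_PER_PIXEL
    if units != int(units):
        raise ValueError("displacement is not a multiple of 1/4 pixel")
    return int(units)


def component_in_range(pixels: float) -> bool:
    """Check one motion vector component against the listed limits."""
    return MV_MIN_PIXELS <= pixels <= MV_MAX_PIXELS


print(to_quarter_pel(MV_MIN_PIXELS), to_quarter_pel(MV_MAX_PIXELS))  # -128 127
```

An encoder feeding this decoder would need to keep each motion vector component within these limits; for example, `component_in_range(32.0)` is false.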





#### Silicon Resources Utilization

The H264-LD-BP can be mapped to any Intel/Altera family (provided sufficient silicon resources are available) and optimized to suit the particular project's requirements. The following table provides sample resource utilization data for different Intel/Altera device families.

|          | Area    | Memory Bits | DSPs / MULs |
|----------|---------|-------------|-------------|
| StratixV | 4k ALMs | 59,278      | 2           |
| Arria10  | 4k ALMs | 59,278      | 2           |
| CycloneV | 4k ALMs | 59,278      | 2           |
| Max10    | 10K LEs | 59,364      | 2           |

Multiple H264-LD-BP cores can be combined to decode streams produced by the H264-E-BPF core. The following table indicates the number of H264-LD-BP cores that would be required for different video formats in different Intel/Altera families.

|          | 480p30 | 720p30 | 720p60 | 1080p30 | 1080p60 |
|----------|--------|--------|--------|---------|---------|
| StratixV | ✓ (1)  | ✓ (1)  | ✓ (2)  | ✓ (2)   | ✓ (3)   |
| Arria10  | ✓ (1)  | ✓ (1)  | ✓ (2)  | ✓ (2)   | ✓ (3)   |
| CycloneV | ✓ (1)  | ✓ (2)  | ✓ (3)  | ✓ (3)   | ×       |
| Max10    | ✓ (1)  | ×      | ×      | ×       | ×       |

Note: List of video formats is not exhaustive.

#### **Evaluation**

Potential customers can readily evaluate the video decoder's low-latency characteristics by using the Video over IP reference design, in which the compressed stream is captured over Ethernet and the decoded video drives an HDMI interface.

#### **Deliverables**

The core deliverables include everything required for successful implementation:

- Targeted netlist for Intel FPGAs
- Sophisticated self-checking testbench
- Synthesis scripts
- Simulation script, vectors, and expected results
- Comprehensive user documentation

### **H.264 Cores Family**

The H264-LD-BP is one member of the family of H.264 cores that CAST offers. The low-latency H264-D-BP and the encoders summarized in the following table are also available.

| H.264 Encoder<br>Cores  | H264-E-BIS<br>Intra-Only<br>Baseline Profile | H264-E-BPS<br>Low-Power<br>Baseline Profile | H264-E-MPS<br>Low-Power<br>Main Profile | H264-E-CFS<br>Ultra-Low-Power<br>Baseline Profile | H264-E-HIS<br>Intra-Only<br>High Profile | H264-E-BPF<br>Ultra-Fast<br>Baseline Profile |
|-------------------------|----------------------------------------------|---------------------------------------------|-----------------------------------------|---------------------------------------------------|------------------------------------------|----------------------------------------------|
| Cycles/Pixel            | 4                                            | 4                                           | 4                                       | 4                                                 | 2.5                                      | 2 or 1                                       |
| Silicon Resources \*    | Very Small                                   | Small                                       | Small                                   | Small                                             | Moderate                                 | Moderate-High                                |
| Profile                 | Constrained<br>Baseline                      | Constrained<br>Baseline                     | Main                                    | Constrained<br>Baseline                           | High 10 Intra                            | Constrained<br>Baseline                      |
| Slice Types             | IDR                                          | IDR, P                                      | IDR, P                                  | IDR, P                                            | IDR                                      | IDR, P                                       |
| Chroma Formats          | 4:2:0                                        | 4:2:0                                       | 4:2:0                                   | 4:2:0                                             | 4:2:0                                    | 4:2:0                                        |
| Bits per sample         | 8                                            | 8                                           | 8                                       | 8                                                 | 8, 10                                    | 8                                            |
| Progressive/Interlaced  | ✓/✓                                          | ✓/✓                                         | ✓/×                                     | ✓/×                                               | ✓/×                                      | ✓/✓                                          |
| Multiple video channels | Optional                                     | Optional                                    | Optional                                | Optional                                          | ×                                        | Optional                                     |
| CAVLC / CABAC           | ✓/×                                          | ✓/×                                         | ×/✓                                     | ✓/×                                               | ✓/×                                      | ✓/×                                          |
| CBR and VBR             | ✓                                            | ✓                                           | ✓                                       | ✓                                                 | ×                                        | ✓                                            |
| Intra-Refresh           | N/A                                          | ✓                                           | ✓                                       | ✓                                                 | N/A                                      | ✓                                            |
| Multiple Slices         | ✓                                            | ✓                                           | ✓                                       | ✓                                                 | ×                                        | ✓                                            |
| Compressed Frame Store  | ×                                            | ×                                           | ×                                       | ✓                                                 | N/A                                      | N/A                                          |

\* Very Small: < 100k gates; Small: < 200k gates; Moderate: < 500k gates; High: > 500k gates
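The Cycles/Pixel row can be turned into a rough clock-frequency estimate for the encoder cores: multiplying the cycles per pixel by the pixel rate gives an approximate lower bound on the required clock. The sketch below shows this arithmetic under stated assumptions. It takes "pixel" to mean one active luma sample, ignores blanking and any processing overhead, and uses a hypothetical function name. Actual requirements should be taken from the respective core's documentation.

```python
def min_clock_mhz(width: int, height: int, fps: float,
                  cycles_per_pixel: float) -> float:
    """Approximate minimum clock (MHz) to sustain a video format.

    Hypothetical helper: assumes cycles_per_pixel applies to every
    active pixel and that there is no other per-frame overhead.
    """
    pixels_per_second = width * height * fps
    return pixels_per_second * cycles_per_pixel / 1e6


# Illustrative cases using Cycles/Pixel values from the table above.
print(f"1080p30 at 4 cycles/pixel: {min_clock_mhz(1920, 1080, 30, 4):.1f} MHz")
print(f"1080p30 at 2 cycles/pixel: {min_clock_mhz(1920, 1080, 30, 2):.1f} MHz")
print(f"1080p60 at 1 cycle/pixel:  {min_clock_mhz(1920, 1080, 60, 1):.1f} MHz")
```

Under these assumptions, a 4 cycles/pixel core would need roughly 249 MHz for 1080p30, which suggests why the faster cores in the table target the higher resolutions and frame rates.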



