PDF datasheets:

Datasheet

Related information:

News Release

06/04/07 CAST Offers PCI Express Model, Highlights APS 32-bit Processor Cores at DAC 2007
05/22/07 CAST Releases DSP Coprocessor for Cortus APS 32-bit Processor Cores

32-bit Processor Cores

APS2 high-performance
APS3 code-optimized

For more information:

contact Sales @ CAST

APS-DSP Digital Signal Processing Coprocessor Core

The APS-DSP implements a fixed-point, 16-bit RISC DSP coprocessor extension to the APS family of processors. It extends both the hardware and the instruction set to provide fast math and optimized data handling for analog or mixed-signal applications.

The core integrates with an APS2 or APS3 processor core through the patented APS coprocessor interface. The DSP and main CPU operate in parallel: instructions complete in a single cycle and at the same speed, with out-of-order instruction completion in both processors. Up to six operations can be executed per cycle, for example: multiply, accumulate, load data and update pointer (with wraparound), and load coefficient and update pointer.

The core adds three ALUs operating in parallel, enabling one arithmetic operation and two address calculations on every clock cycle. The two address ALUs support a number of addressing modes, implementing circular buffers and bit-reverse arithmetic without additional programming. The core also adds two 32k by 32-bit memory interfaces, using a Harvard bus architecture for simple memory design. This dual memory interface ensures that each instruction can perform two memory accesses—including pointer updates and wrap around—as well as the instruction fetch in a single cycle.

A bit-reverse arithmetic feature facilitates calculations like Fast Fourier Transforms (FFTs), and a special Zero Overhead Loop (ZOL) feature enables smarter, more efficient data processing.

Instructions for the APS-DSP are written in assembly language using a simple set of constructs. Provided macros make it easy to work with the DSP routines from the C and C++ programming environments of the APS processors. An included library offers pre-coded solutions for typical DSP challenges such as Fast Fourier Transforms (FFTs) and Finite and Infinite Impulse Response filters (FIRs and IIRs), plus numerous test cases. Designers can use these without having the DSP expertise required to implement them from scratch.

Like the APS processor family, the APS-DSP is suitable for implementation in ASICs, structured ASICs, and many FPGAs. It is fast and relatively compact—running at 250 MHz and requiring just 14,000 gates in a 0.13 µm ASIC process—and its efficiency complements the low-power nature of APS processors. The core has been rigorously verified through thousands of test cases, and has been implemented in FPGAs.

Features

  • Fixed-point, 16-bit RISC DSP processor for APS main processors
  • Extends hardware and language for easier DSP programming and faster DSP execution
  • Operates in parallel with main APS CPU, with independent register set and memory access
  • Executes nearly all instructions in a single cycle, including multiplies and multiply-accumulates
  • Integrates with main APS CPU through patented APS coprocessor interface
Hardware Extensions
  • Adds three ALUs and two memory interfaces
  • Arithmetic ALU
    • Two 20-bit Accumulators
    • Four 16-bit general purpose data registers
    • 16 x 16 Multiplier gives 20-bit results for MAC, Multiply, Shift, etc.
  • Two Address ALUs
    • Eight Address Pointers
    • Eight Offset Registers
    • Eight Circular Buffer Registers
    • Support for Bit-Reverse Arithmetic, as used for FFTs
  • Two 128 kB memory interfaces, seen by software as 64k x 16 bits wide each
Language Extensions
  • Assembly language instruction set includes:
    • dsp_add
    • dsp_clr
    • dsp_mac
    • dsp_macn
    • dsp_mul
    • dsp_muln
    • dsp_mov
    • parallel moves
    • dsp_nop
    • dsp_sub
    • dsp_shl
    • dsp_shr
    • dsp_zol
    • dsp_zol_end
  • Zero Overhead Loop (ZOL) construct enables algorithm iteration without instruction fetches
  • Special macros facilitate DSP programming in the C and C++ development environments of the APS processor family
Included Solutions Library
  • Pre-coded and verified routines for over 60 DSP functions and test cases

Applications

Designed for demanding signal processing applications such as Internet telephony, audio processing, automotive systems, and voice recognition.

Block Diagram

See to the right.

 

Functional Description

The APS-DSP adds dedicated math and memory hardware and signal processing language extensions to an APS family main processor.

Hardware Extensions

The core adds three ALUs and two memory interfaces.

The main ALU has four 16-bit general purpose registers and two 20-bit accumulators. A 16 x 16 multiplier gives 20-bit results. Each of the two address ALUs has four address pointers, each of which is associated with an offset register and a circular buffer register, all 16 bits wide.
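The interplay of an address pointer, its offset register, and its circular buffer register can be sketched in software. The following is a minimal Python model, assuming conventional modulo addressing; the function name, the `base` argument, and the convention that a length of zero means linear addressing are hypothetical and not taken from the core's documentation.

```python
MASK16 = 0xFFFF  # the address registers are 16 bits wide


def post_update(pointer, offset, base, length):
    """Return the pointer after adding `offset`, wrapped so that it
    stays inside the circular buffer [base, base + length).

    Assumption: length == 0 means no circular buffer is configured.
    """
    if length == 0:
        # Plain linear addressing, truncated to 16 bits.
        return (pointer + offset) & MASK16
    # Python's % always yields a non-negative result, so negative
    # offsets wrap around the start of the buffer correctly.
    return (base + (pointer - base + offset) % length) & MASK16
```

For a four-entry buffer at address 0x100, stepping forward from 0x103 returns to 0x100, and stepping backward from 0x100 lands on 0x103.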

The two additional memory interfaces are each 32k by 32 bits.
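The 32k by 32-bit figure and the feature list's "128 kB, seen by software as 64k x 16 bits" describe the same capacity. A short arithmetic check, with variable names chosen only for illustration:

```python
WORDS_32 = 32 * 1024  # 32k physical words per interface
BITS_PER_WORD = 32

# Total capacity of one interface in bytes.
bytes_per_interface = WORDS_32 * BITS_PER_WORD // 8

# The same capacity viewed as 16-bit words, as software sees it.
words_16 = bytes_per_interface * 8 // 16

print(bytes_per_interface // 1024, "kB per interface")
print(words_16 // 1024, "k 16-bit words per interface")
```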

A bit-reverse arithmetic mode flips an address pointer register’s MSB and LSB and causes the carry to be propagated in the opposite direction from normal processing. This facilitates algorithms such as FFTs that benefit from being able to read tables of parameters in both directions.
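Reverse-carry addition is the usual way bit-reversed addressing is described for FFTs. The sketch below is a software model of that general technique, not of the core's actual logic: it adds two values with the carry travelling from the MSB toward the LSB by reversing the operands, adding normally, and reversing the sum. All function names are hypothetical.

```python
def bit_reverse(value, bits):
    """Reverse the lowest `bits` bits of `value`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def reverse_carry_add(pointer, step, bits):
    """Add `step` to `pointer` with the carry propagating MSB to LSB."""
    mask = (1 << bits) - 1
    total = bit_reverse(pointer & mask, bits) + bit_reverse(step & mask, bits)
    return bit_reverse(total & mask, bits)


def bit_reversed_sequence(points):
    """Index order visited for an FFT of `points` (a power of two)."""
    bits = points.bit_length() - 1
    pointer, order = 0, []
    for _ in range(points):
        order.append(pointer)
        # Stepping by points/2 with reversed carry walks the table
        # in bit-reversed order.
        pointer = reverse_carry_add(pointer, points // 2, bits)
    return order
```

For an 8-point transform, `bit_reversed_sequence(8)` visits the indices in the order 0, 4, 2, 6, 1, 5, 3, 7.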

Language Extensions

The DSP coprocessor is programmed in assembly language using a special set of DSP constructs. These include parallel move operations and thirteen other special functions, for arithmetic, multiplication, multiply-accumulates, moving, and shifting. Nearly all of these execute within a single cycle.

A Zero Overhead Loop (ZOL) construct allows part of an algorithm to be iterated automatically. The instructions are stored in an internal buffer, and need be fetched just once for the entire loop. This loop then executes in parallel with the application running on the main CPU.
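The fetch savings can be illustrated with a simple count. This is a rough model under assumptions that do not come from the core's documentation: a conventional loop is assumed to refetch every body instruction plus a hypothetical `overhead` of loop-control instructions on each pass, while the ZOL fetches the body once.

```python
def fetches_conventional(body_len, iterations, overhead=2):
    """Instruction fetches for a software loop.

    `overhead` is a hypothetical count of loop-control instructions
    (for example a counter update and a branch) fetched on every pass.
    """
    return (body_len + overhead) * iterations


def fetches_zol(body_len, iterations):
    """Instruction fetches when the loop body is buffered once."""
    return body_len if iterations > 0 else 0
```

Under these assumptions, a four-instruction body iterated 256 times costs 1,536 fetches as a conventional loop and 4 fetches as a ZOL.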

Pre-Coded Library

Delivered with the core is a set of more than 60 frequently used DSP functions and test cases that are pre-coded, verified, and ready to execute. These include common architectures for FFTs, FIR and IIR digital filters, arithmetic and multiplication functions, data manipulation operations, and typical combinations of specific functions.

DSP Code Example: A Simple FIR

Executes multiple operations in a single cycle: multiply, accumulate, load data, update data pointer with wraparound, load coefficient, update coefficient pointer.
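The operations listed above map onto the inner loop of a FIR filter. The following Python sketch models that loop in software; it is not APS-DSP assembly, and the function names are hypothetical. Each loop iteration stands for what the text describes as a single cycle. The accumulator is an unbounded Python integer, so the 20-bit accumulator width is not modeled.

```python
def fir_sample(new_sample, delay_line, write_index, coeffs):
    """Process one input sample; return (output, next_write_index)."""
    taps = len(coeffs)
    delay_line[write_index] = new_sample  # overwrite the oldest sample
    acc = 0
    data_index = write_index
    for coeff in coeffs:  # load coefficient, advance coefficient pointer
        acc += coeff * delay_line[data_index]  # multiply and accumulate
        data_index = (data_index - 1) % taps   # data pointer, wraparound
    return acc, (write_index + 1) % taps


def fir(samples, coeffs):
    """Filter a whole sequence using a circular delay line."""
    delay_line = [0] * len(coeffs)
    write_index = 0
    outputs = []
    for sample in samples:
        out, write_index = fir_sample(sample, delay_line, write_index, coeffs)
        outputs.append(out)
    return outputs
```

Feeding a unit impulse through the model returns the coefficients in order, which is a quick way to check the pointer arithmetic.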
 

Coprocessor Performance

The APS-DSP operates in parallel with the main APS processor and can execute most signal processing instructions in a single clock cycle. Test cases show a significant reduction in processing time when the APS-DSP executes an algorithm instead of programming that algorithm in C for the main CPU. Results for some examples are shown here (see chart above).

DSP Algorithm          C cycles   DSP cycles   Execution Time Reduction
FIR 1 (256 taps)          8,279          283   71%
FIR 2 (256 taps)          8,023          283   72%
FFT (256 pts)           205,556       12,807   84%
FFT SHIFT (256 pts)     199,412       21,509   91%
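The cycle counts listed above can also be compared directly. This sketch only derives ratios from the C and DSP cycle figures; it makes no assumption about how the Execution Time Reduction figures were measured. The dictionary and function names are illustrative.

```python
# (C cycles, DSP cycles) as listed for each algorithm.
cycles = {
    "FIR 1 (256 taps)": (8279, 283),
    "FIR 2 (256 taps)": (8023, 283),
    "FFT (256 pts)": (205556, 12807),
    "FFT SHIFT (256 pts)": (199412, 21509),
}


def cycle_ratio(c_cycles, dsp_cycles):
    """How many C cycles are spent per DSP cycle for the same task."""
    return c_cycles / dsp_cycles


ratios = {name: cycle_ratio(*pair) for name, pair in cycles.items()}

# DSP cycles per filter tap for the 256-tap FIR cases.
fir_cycles_per_tap = cycles["FIR 1 (256 taps)"][1] / 256

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x fewer cycles")
print(f"FIR: {fir_cycles_per_tap:.2f} DSP cycles per tap")
```

The FIR figure of roughly 1.1 DSP cycles per tap is consistent with one multiply-accumulate per cycle plus a small setup cost.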

Support

The core as delivered is warranted against defects for ninety days from purchase. Thirty days of phone and email technical support are included, starting with the first interaction. Additional maintenance and support options are available.

Verification

The core has been verified through extensive simulation and rigorous code coverage measurements. FPGA demonstration units have also been implemented and evaluated.

Deliverables

The core is intended for use with an APS family main processor, and includes everything else required for successful implementation:

  • HDL RTL source code or optimized netlist for structured ASICs and FPGAs
  • Sophisticated self-checking HDL Testbench including everything needed to test the core
  • Simulation scripts, vectors, and expected results; Synthesis scripts
  • Comprehensive user documentation, including detailed specifications and a programmer's reference manual

 


 

 
