Today, DSP processing is a hot technology arena. The growth rate
for programmable DSP processors has been running at some 35+% for a
number of years. That growth has attracted a number of alternative
technologies, including reconfigurable DSP functions. The FPGA
vendors have recognized DSP as a specialized growth area. Every
major FPGA vendor (Xilinx, Altera, QuickLogic, and Atmel) is
addressing the DSP designer's needs.
Fueling the FPGA charge into DSP processing is the ever-present
silicon curve. Silicon developments have moved FPGA process
geometries down to 0.15 micron and lower, enabling FPGA-based
designers to pack more and faster logic onto a single chip.
Driving this FPGA move toward DSP functions has been the
emergence of the wireless market, in particular the equipment
needed to deploy wireless infrastructure. "Wireless is in
many cases the driver for DSP processing on reconfigurable logic,"
said Will Strauss, president of Forward Concepts, the leading
analysts on the DSP marketplace and technology. "The processing
power needed for the wireless infrastructure has been increasing,
and in many instances is greater than past and most current DSPs
could supply. Thus, designers have turned to FPGAs to do the
on-the-fly, front-end processing."
Consequently, the market numbers for FPGA-based DSP
processing are rising. According to Will Strauss, reconfigurable
(FPGA-based) DSP logic will show a compound annual growth rate of
some 30%. Sales are projected to grow from $84.5 M in 1999 to $185 M
in 2002. That said, we should recognize that the DSP market is
much, much larger and that reconfigurable DSP logic will represent
at most some 10% of that market.
Growth of Reconfigurable DSP Logic and DSPs
| YEAR | RECONFIGURABLE LOGIC (FPGAs) ($M) | DSP PROGRAMMABLE PROCESSORS |
| 1999 | $84.5 M | $4.4 B |
| 2000 | $109 M | $6.1 B |
| 2001 | $142 M | $8.2 B |
| 2002 | $185 M | $19.2 B |
Source: Forward Concepts 2000
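The growth figure quoted above can be checked against the table with
a few lines of Python. This is only a back-of-the-envelope sketch:
the dollar figures are copied from the Forward Concepts table as
printed, while the variable and function names are our own.

```python
# Sales figures from the Forward Concepts table, expressed in $M.
reconfigurable_sales = {1999: 84.5, 2000: 109.0, 2001: 142.0, 2002: 185.0}
dsp_processor_sales = {1999: 4400.0, 2000: 6100.0, 2001: 8200.0, 2002: 19200.0}


def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


growth = cagr(reconfigurable_sales[1999], reconfigurable_sales[2002], 2002 - 1999)
print(f"Reconfigurable logic CAGR, 1999-2002: {growth:.1%}")

# Ratio of reconfigurable logic sales to programmable DSP sales per year.
shares = {
    year: reconfigurable_sales[year] / dsp_processor_sales[year]
    for year in sorted(reconfigurable_sales)
}
for year, share in shares.items():
    print(f"{year}: {share:.1%} of programmable DSP processor sales")
```

The computed growth rate comes out just under 30% a year, in line
with the "some 30%" quoted, and the yearly ratios stay well below the
10% ceiling mentioned above.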
FPGAs Accommodate DSP Processing
FPGA vendors have recognized the size and shape of the DSP
processing market. Consequently, FPGA architectures have evolved
to serve DSP processing needs. Today, wireless, telecom, and
Internet applications are driving a good deal of the electronics
market, much as the PC did in its heyday. Thus, FPGAs are evolving
to provide DSP front-end processing capabilities. The three major
approaches are more gates/cells, hardwired specialty blocks, and
hardwired blocks with dynamic FPGA cells. Many vendors are mixing
these approaches, borrowing from two or more.
Approach 1—More, Faster Gates
No designer wants fewer gates or slower logic. One way to handle
complex processing problems is to throw lots of fast logic and
state blocks at them. Lots of fast gates and state elements can
handle a surprising number of problems. Faster logic lets you do
more between clocks and more logic lets you spread the problem out
vertically, putting more logic to work in the same clock period.
With more logic modules or engines to do the work in parallel, you
get even more done in the same clock time.
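As a rough illustration of the parallelism argument above, the Python
sketch below models one output of an FIR filter whose taps are shared
among several multiply-accumulate (MAC) engines. The function names,
the 8-tap filter, the 200 MHz clock, and the one-MAC-per-engine-per-
clock assumption are ours for illustration; they are not taken from
any vendor's architecture.

```python
import math


def clocks_per_output(num_taps, num_mac_engines):
    """Clocks needed for one FIR output if each engine does one MAC per
    clock. The final summing of partial results is ignored here."""
    return math.ceil(num_taps / num_mac_engines)


def fir_output(window, coeffs, num_mac_engines):
    """Compute one FIR output as a set of partial sums, one per engine."""
    partial_sums = []
    for engine in range(num_mac_engines):
        # Engine k handles taps k, k + P, k + 2P, ... (P = engine count).
        taps = range(engine, len(coeffs), num_mac_engines)
        partial_sums.append(sum(window[i] * coeffs[i] for i in taps))
    return sum(partial_sums)


coeffs = [1, 2, 3, 4, 4, 3, 2, 1]
window = [5, -1, 0, 2, 7, 3, -4, 6]
clock_mhz = 200  # assumed system clock

for engines in (1, 2, 4, 8):
    y = fir_output(window, coeffs, engines)
    rate = clock_mhz / clocks_per_output(len(coeffs), engines)
    print(f"{engines} engine(s): output={y}, up to {rate:.0f} Msamples/s")
```

The result is the same however the taps are split. Only the number of
clocks per output changes, which is the gain the extra logic buys.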
Today's top-of-the-line FPGAs have already passed the 1 M
usable gate barrier. They are also driving internal system clock
rates to 200 MHz, 400 MHz and beyond, while supporting I/O rates on
the order of 600 to 800 MHz to support Telecom OC-12 and higher
links. The leading FPGA architecture for DSPs today is Xilinx's
Virtex I, which packs up to 1 M available gates into a single chip.
And that's not all. Xilinx just announced its next-generation FPGA
family, Virtex II, which goes to 10 M gates, providing a veritable
bonanza of available gates for DSP processing. Altera is also
pushing the gate count limits with its APEX 20K family. These FPGAs
deliver up to 1.52 M usable gates.
Approach 2—Hybrid FPGAs With Special Blocks
More and more gates and logic cells may not be the answer. Many
FPGA vendors have realized that you can't build everything with the
basic FPGA logic/state cell, and that the FPGA tinker-toy approach
doesn't always work for large datapath elements. So they've shifted
to adding specialized blocks for higher datapath element
efficiencies.
The first and most important blocks added were RAM blocks. These
blocks of RAM are placed strategically in the silicon die data-flow
and provide high-density RAM without the logic wastage that was
common in earlier FPGAs when the logic cell flip-flops were used as
RAM, leaving much of the cell logic unused.
More, and more efficient, RAM is critical to many of the newer
applications, which are "data-flow" oriented, rather than being
control oriented. Data-flow applications can be characterized as
involving the flow of data that is passed through one or more
processing stages. These stages process the data and move the
results on to the next stage.
Many of the growth applications (Telecom, Datacom, and wireless
base stations) are examples of such data-flow processing.
For these applications, RAM is needed to hold the incoming data,
hold intermediate results, and to buffer output flows. Buffering
inputs, intermediate results, and outputs enables the logic to
maintain the processing flow, producing high throughput
efficiencies.
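The data-flow idea described above, with processing stages separated
by RAM buffers, can be sketched in software as a chain of stages
connected by bounded FIFOs. This is a minimal behavioral model under
our own assumptions: the function name, the FIFO depth, and the three
example stages are hypothetical and do not describe any particular
device.

```python
from collections import deque


def run_pipeline(samples, stages, fifo_depth=4):
    """Push samples through `stages`, which are joined by bounded FIFOs."""
    # fifos[0] feeds stage 0; fifos[i + 1] carries stage i's results.
    fifos = [deque() for _ in range(len(stages) + 1)]
    pending = deque(samples)
    results = []
    while pending or any(fifos):
        # Drain the output buffer first.
        while fifos[-1]:
            results.append(fifos[-1].popleft())
        # Service stages from last to first so each stage sees the
        # space freed downstream during this tick.
        for i in reversed(range(len(stages))):
            if fifos[i] and len(fifos[i + 1]) < fifo_depth:
                fifos[i + 1].append(stages[i](fifos[i].popleft()))
        # Accept new input only while the input buffer has room.
        if pending and len(fifos[0]) < fifo_depth:
            fifos[0].append(pending.popleft())
    return results


stages = [
    lambda x: x * 2,                 # e.g. a gain stage
    lambda x: x - 3,                 # e.g. offset removal
    lambda x: max(-10, min(10, x)),  # e.g. saturation
]
print(run_pipeline(range(8), stages))
```

Each stage stalls only when the buffer ahead of it is full, so the
buffers absorb short-term mismatches between stages and keep the data
moving.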
Today, the FPGA-based memory is there. For example, QuickLogic
deploys 83 Kbits of block RAM in its QuickDSP FPGA family; these are
RAM blocks at the top and bottom of its FPGA array. Altera packs in
some 442 Kbits of block memory. This memory is distributed in
Embedded System Blocks (ESBs) of 2 Kbits; the ESBs can be
configured as RAM, ROM, FIFO, or CAM memory. And, last but not
least, Xilinx's Virtex I integrates some 131 Kbits of block
SRAM.
FPGA vendors add more than memory to the FPGA DSP mix. They're
adding processing blocks as well. These are specialized arithmetic
functional blocks that combine ALU, MPU, and MAC operations.
Generally, they can be scaled, i.e., cascaded to provide larger
field arithmetic.
For example, QuickLogic is taking an almost ASIC-like flow
approach to its blocks. Its Eclipse family of FPGAs integrates I/O,
RAM blocks, and FPGA logic cells with up to 18 Embedded
Computation Units (ECUs), which are hardwired logic not built on
FPGA cells. These ECUs are mathematical processing units with 8-bit
multipliers, 16-bit adders, and internal registers. They can be
cascaded for larger arithmetic processing. The ECUs are
supplemented with up to 662 K usable gates and 82 Kbits of block
RAM.
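To see what cascading small arithmetic blocks for wider arithmetic
means, the Python sketch below builds a 16 x 16 unsigned multiply out
of four 8 x 8 multiplies and some additions. It shows only the
arithmetic. How the ECUs are actually wired together is not described
here, and the function names are hypothetical.

```python
def mul8(a, b):
    """Model of one 8 x 8 unsigned hardware multiplier (16-bit product)."""
    assert 0 <= a < 256 and 0 <= b < 256
    return a * b


def mul16_from_8bit_units(x, y):
    """16 x 16 unsigned multiply built from four 8 x 8 multipliers."""
    assert 0 <= x < 65536 and 0 <= y < 65536
    x_hi, x_lo = x >> 8, x & 0xFF
    y_hi, y_lo = y >> 8, y & 0xFF
    # Four partial products, one per 8 x 8 unit.
    lo_lo = mul8(x_lo, y_lo)
    lo_hi = mul8(x_lo, y_hi)
    hi_lo = mul8(x_hi, y_lo)
    hi_hi = mul8(x_hi, y_hi)
    # Shift each partial product to its bit weight and add them up.
    # In hardware these additions would be spread over cascaded adders.
    return lo_lo + ((lo_hi + hi_lo) << 8) + (hi_hi << 16)


print(hex(mul16_from_8bit_units(0x1234, 0xABCD)))
```

The same decomposition extends to wider operands, at the cost of more
multiplier blocks and a deeper chain of additions.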
Approach 3—Reconfigurable Logic Supported by Fixed
Blocks
A third approach to FPGA DSP processing is to integrate the DSP
functions in FPGA logic with an on-chip standard processor and
support blocks. With this approach, you'd use the FPGA structure to
build the specialized DSP processing functions, but also rely on
fixed support blocks to handle the normal I/O, coordination, state
and data storage functions.
Thus, you have relatively fixed hardware architecture integrated
with the flexibility of FPGA logic blocks. This architecture relies
more on software control, via an on-chip processor, than on
hardwired control. This software also provides a level of
flexibility, while gaining hardware efficiencies from fixed,
hardwired blocks that are not built from FPGA logic cells.
An even greater advantage can come from using the fixed control
resources (the on-chip processor and memory) to dynamically
reconfigure the FPGA cell resources. The processor can control the
swapping of logic into FPGA banks, using the on-chip memory
resources to hold or cache the FPGA logic cell descriptions that
are loaded into the FPGA's cell-defining RAM. Specialized DSP
processing or signal conditioning functions can be loaded in to
process the incoming data, and then be swapped out for the next DSP
processing function. A design can use two FPGA banks, one running
the current logic while the second is loaded for the next logic
iteration.
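The two-bank scheme can be modeled in a few lines of Python. This is
a toy behavioral sketch, not a description of any real device: the
class, the "logic cache" dictionary, and the three example functions
are all hypothetical, and each FPGA configuration is stood in for by
an ordinary Python function.

```python
class DualBankFabric:
    """Toy model of two FPGA banks: one active, one being loaded."""

    def __init__(self):
        self.banks = [None, None]
        self.active = 0

    def load_standby(self, function):
        # Reconfigure the idle bank while the active one keeps running.
        self.banks[1 - self.active] = function

    def swap(self):
        if self.banks[1 - self.active] is None:
            raise RuntimeError("standby bank has no configuration loaded")
        self.active = 1 - self.active

    def process(self, block):
        return [self.banks[self.active](x) for x in block]


# Stand-in for configurations cached in on-chip memory.
logic_cache = {
    "rectify": abs,
    "scale": lambda x: x * 2,
    "offset": lambda x: x + 1,
}


def run_schedule(fabric, data, schedule):
    """Apply the named functions in order, preloading the next one."""
    fabric.load_standby(logic_cache[schedule[0]])
    fabric.swap()
    for step, _name in enumerate(schedule):
        has_next = step + 1 < len(schedule)
        if has_next:
            fabric.load_standby(logic_cache[schedule[step + 1]])
        data = fabric.process(data)
        if has_next:
            fabric.swap()
    return data


print(run_schedule(DualBankFabric(), [-3, 1, 2], ["rectify", "scale", "offset"]))
```

In the model, the load of the next function is issued before the
current block is processed, which stands in for the overlap of
reconfiguration and processing that the second bank makes possible.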
Atmel has created such a combined FPGA product, called FPSLIC.
This architecture integrates Atmel's AVR 8-bit, 40 MHz RISC
microcontroller, with block RAM and a set of RAM-based FPGA
blocks.
The microprocessor controls operations and comes with a set of
on-chip peripherals. It has an on-chip hardware multiplier
accelerator (8 x 8). The microprocessor runs from up to 32 KB of
on-chip RAM and can support external memory as well. The FPSLIC
includes up to 40 K FPGA system gates that are RAM based.
Additionally, the FPGA logic incorporates up to 18.4 Kbits of
distributed, single-port RAM.
Interestingly, each logic cell can be configured into a number
of modes. These include a 2-bit adder/counter, 4-bit random logic,
a 2-bit multiplexer (with select and enable) or a 2-bit multiplier
cell. The multiplier cell configurations can be connected together
to form large multiplier arrays for DSP processing.
The Atmel engineers took a very straightforward approach to
reconfigurable logic, or as they refer to it: Cached Logic. The
underlying RAM that defines a block of FPGA logic cells can be
loaded to reconfigure the logic. The RAM can be loaded from the
chip's block RAM, which can be said to be "caching the logic," much
as a RISC CPU's cache memory "caches" executable code and
programs.
Representative FPGAs for DSP Processing
| FPGA | # Usable Gates | RAM Blocks (Kbits) | System Clock (MHz) | Function Blocks | Comments |
| Altera APEX | 1.5 M (RAM based) | 442.3 Kbits | 200 MHz | | Mixed cells with PLD logic, LUT-based logic |
| Atmel FPSLIC | 40 K (RAM based) | 18.4 Kbits | | 8-bit uP; 36 KB I & D SRAM; uP peripherals | SOC with CPU, memory, & reconfigurable FPGA logic |
| QuickLogic QuickDSP | 660 K (antifuse) | 82.9 Kbits, dual-ported | 400 MHz | 18 ECUs (8-bit Adder, 16-bit MPY/MAC) | Flow-thru arch with layers: RAM, CLB, FPGA logic, RAM |
| Xilinx Virtex I | 1.124 M (RAM based) | 131 Kbits | 200 MHz | | RAM based, opens underlying LUTs, logic to use |
| Xilinx Virtex II | 10 M (RAM based) | 262 Kbits | 200 MHz | | 1.5 V CMOS; 0.12-micron transistors; 8-layer metal |
FPGA To DSP Tool Link
DSP system engineers targeting FPGA-based designs do not have to
roll their designs by hand or with mismatched tools. The DSP design
tools are now bridging or migrating to the FPGA design base. The
connection is now solid between top level DSP design and FPGA DSP
function implementation.
MathWorks, the leading vendor of DSP algorithm design tools, has
linked up with FPGAs. Its MATLAB tools, which many DSP developers
use to create and test their algorithms, are now integrated with
the FPGA vendor toolsets. For example, MathWorks, in conjunction
with Xilinx, has brought out the Xilinx System Generator. This tool
bridges between the System Design Domain (MATLAB and Simulink), and
the Hardware Design Domain (HDL design, hardware simulation,
synthesis/placement/routing, and timing verification).
The Xilinx System Generator provides a link and a feedback path
between the two domains, between system and algorithm design and
FPGA hardware implementation. The tool automatically calls the
Xilinx CORE generator and maps the MATLAB design into FPGA terms.
This tool implements fully parameterized, high-performance LogiCORE
DSP cores for the Xilinx FPGAs. These cores are implemented in FPGA
logic cells, both at the logic level and at the inner cell LUT
level (using the FPGA's internal logic mapping resources).
Designers can build a full DSP system and then automatically
generate an HDL representation that is compatible with the Xilinx
FPGA architecture. Designs can be generated from the MATLAB side, or
HDL-based designs can be pushed up to the MATLAB interface for
algorithm development and checking. The Xilinx System Generator
will be available from Xilinx in 4Q00.