For FIR filters, the DSP block combines the four-multipliers adder mode with the shift register inputs. One set of shift inputs contains the filter data, while the other holds the coefficients loaded in serial or parallel. The input shift register eliminates the need for shift registers external to the DSP block (i.e., implemented in LEs). This architecture simplifies filter design since the DSP block implements all of the filter circuitry.
One DSP block can implement an entire 18-bit FIR filter with up to four taps. For FIR filters larger than four taps, DSP blocks can be cascaded with additional adder stages implemented in LEs.
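The tap arithmetic described above can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the function names (`dsp_block_sum`, `dsp_blocks_needed`, `fir_filter`) are hypothetical, integer samples stand in for 18-bit data, and the model ignores bit widths, pipelining, and timing. Each group of four taps is treated as one DSP block in four-multipliers adder mode, and the final sum of the partial results stands in for the adder stages implemented in LEs.

```python
from typing import List

TAPS_PER_BLOCK = 4  # from the text: one DSP block covers up to four 18-bit taps


def dsp_block_sum(samples: List[int], coeffs: List[int]) -> int:
    """Model one DSP block: the sum of up to four multiplier products."""
    assert len(samples) == len(coeffs) <= TAPS_PER_BLOCK
    return sum(s * c for s, c in zip(samples, coeffs))


def dsp_blocks_needed(n_taps: int) -> int:
    """Number of cascaded DSP blocks for an n-tap filter (ceiling of n/4)."""
    return -(-n_taps // TAPS_PER_BLOCK)


def fir_filter(data: List[int], coeffs: List[int]) -> List[int]:
    """Behavioral FIR model: shift register feeding groups of four taps."""
    n_taps = len(coeffs)
    shift_reg = [0] * n_taps  # models the cascaded input shift register chain
    out = []
    for x in data:
        shift_reg = [x] + shift_reg[:-1]  # shift the new sample in
        partials = [
            dsp_block_sum(shift_reg[i:i + TAPS_PER_BLOCK],
                          coeffs[i:i + TAPS_PER_BLOCK])
            for i in range(0, n_taps, TAPS_PER_BLOCK)
        ]
        out.append(sum(partials))  # extra adder stage (LEs in the text)
    return out
```

Feeding a unit impulse into a six-tap filter returns the coefficients in order, and `dsp_blocks_needed(6)` reports that two cascaded blocks are required.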
Table 36 shows the number of multipliers available in each DSP block mode for each multiplier size. These modes allow the DSP blocks to implement numerous DSP applications, including FFTs, complex FIR, FIR, and 2D FIR filters, equalizers, IIR filters, correlators, matrix multiplication, and many other functions.
Table 36. Multiplier Size & Configurations per DSP Block

| DSP Block Mode | 9 × 9 | 18 × 18 | 36 × 36 (1) |
| Multiplier | Eight multipliers with eight product outputs | Four multipliers with four product outputs | One multiplier with one product output |
| Multiply-accumulator | Two multiply and accumulate (52 bits) | Two multiply and accumulate (52 bits) | – |
| Two-multipliers adder | Four sums of two multiplier products each | Two sums of two multiplier products each | – |
| Four-multipliers adder | Two sums of four multiplier products each | One sum of four multiplier products each | – |

Note to Table 36:
(1) The number of supported multiply functions shown is based on signed/signed or unsigned/unsigned implementations.
DSP Block Interface
Stratix GX device DSP block outputs can cascade down within the same DSP block column. Dedicated connections between DSP blocks provide fast connections between the shift register inputs to cascade the shift register chains. The designer can cascade DSP blocks for 9 × 9- or 18 × 18-bit FIR filters larger than four taps, with additional adder stages implemented in LEs. If the DSP block is configured as 36 × 36 bits, the adder, subtractor, or accumulator stages are implemented in LEs. Each DSP block can route the shift register chain out of the block to cascade two full columns of DSP blocks.
Stratix GX FPGA Family
Figure 88. EP1SGX40 Device Fast Regional Clock Pin Connections to Fast Regional Clocks
(Figure labels: fclk[1..0]; Fast Clock [0] through Fast Clock [7])
Combined Resources
Within each region, there are 22 distinct dedicated clocking resources consisting of 16 global clock lines, 4 regional clock lines, and 2 fast regional clock lines. Multiplexers are used with these clocks to form 8-bit buses to drive LAB row clocks, column IOE clocks, or row IOE clocks. Another multiplexer is used at the LAB level to select two of the eight row clocks to feed the LE registers within the LAB. See Figure 89.
Figure 94. Global & Regional Clock Connections From Top Clock Pins & Enhanced PLL Outputs, Note (1)
(Figure labels: clock pins CLK4 through CLK7 and CLK12 through CLK15; PLL 5, PLL 6, PLL 11, and PLL 12 with outputs G0 through G3, L0, and L1; PLL5_OUT[3..0], PLL5_FB, PLL6_OUT[3..0], PLL6_FB, PLL11_OUT, PLL12_OUT; regional clocks RCLK4 through RCLK7 and RCLK12 through RCLK15; global clocks G4 through G7 and G12 through G15)
Note to Figure 94:
(1) PLLs 5, 6, 11, and 12 are enhanced PLLs.
PLLs & Clock Networks
Enhanced PLLs
Stratix GX devices contain up to four enhanced PLLs with advanced clock management features. Figure 95 shows a diagram of the enhanced PLL.
Figure 95. Stratix GX Enhanced PLL
(Figure labels: clock switch-over circuitry; pre-scale counter /n; phase frequency detector (PFD); charge pump; loop filter; VCO; spread spectrum; feedback counter /m; FBIN (1); lock detect & filter; post-scale counters /l0 and /l1 to regional clocks, /g0 through /g3 to global clocks, and /e0 through /e3 to I/O buffers; programmable time delay on each PLL port; VCO phase selection selectable at each PLL output port; VCO phase selection affecting all outputs; from adjacent PLL; I/O buffers (2), (3); to I/O or general routing)
Notes to Figure 95:
(1) External feedback is available in PLLs 5 and 6.
(2) This external output is available from the g0 counter for PLLs 11 and 12.
(3) These counters and external outputs are available in PLLs 5 and 6.
Clock Multiplication & Division
Each Stratix GX device enhanced PLL provides clock synthesis for PLL output ports using m/(n × post-scale counter) scaling factors. The input clock is divided by a pre-scale divider, n, and is then multiplied by the m feedback factor. The control loop drives the VCO to match f_IN × (m/n). Each output port has a unique post-scale counter that divides down the high-frequency VCO. For multiple PLL outputs with different frequencies, the VCO is set to the least common multiple of the output frequencies that meets its frequency specifications. Then, the post-scale dividers scale down the output frequency for each output port. For example, if the output frequencies required from one PLL are 33 and 66 MHz, set the VCO to 330 MHz (the lowest common multiple of both frequencies that falls within the VCO's range); the post-scale counters then divide by 10 and 5, respectively. There is one pre-scale divider, n, and one feedback multiplier, m, per PLL, with a range of 1 to 512 on each. There are two post-scale dividers (l) for regional clock output ports, four counters (g) for global clock output ports, and up to four counters (e) for external clock outputs, all ranging from 1 to 512. The Quartus II software automatically chooses the appropriate scaling factors according to the input frequency, multiplication, and division values entered.
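The scaling arithmetic can be sketched in a few lines of Python. This is an illustration of the relationships stated above, not the algorithm the Quartus II software uses. The function names are hypothetical, frequencies are whole numbers of MHz, and the VCO limits are passed in as parameters because the text does not state them; the 300 to 800 MHz range used in the usage note is an assumption chosen only so that the 33/66 MHz example lands on 330 MHz.

```python
from fractions import Fraction
from functools import reduce
from math import gcd
from typing import Dict, List, Tuple

COUNTER_MAX = 512  # from the text: every counter ranges from 1 to 512


def lcm(a: int, b: int) -> int:
    return a * b // gcd(a, b)


def choose_vco(outputs_mhz: List[int], vco_min: int, vco_max: int) -> int:
    """Lowest common multiple of all outputs that is >= vco_min."""
    base = reduce(lcm, outputs_mhz)
    k = max(1, -(-vco_min // base))  # ceiling division
    vco = k * base
    if vco > vco_max:
        raise ValueError("no common multiple inside the VCO range")
    return vco


def post_scale_dividers(vco: int, outputs_mhz: List[int]) -> Dict[int, int]:
    """Post-scale counter value for each requested output frequency."""
    dividers = {f: vco // f for f in outputs_mhz}
    if any(not 1 <= d <= COUNTER_MAX for d in dividers.values()):
        raise ValueError("post-scale counter out of range")
    return dividers


def feedback_and_prescale(f_in: int, vco: int) -> Tuple[int, int]:
    """Smallest (m, n) such that f_in * m / n == vco."""
    ratio = Fraction(vco, f_in)
    m, n = ratio.numerator, ratio.denominator
    if not (1 <= m <= COUNTER_MAX and 1 <= n <= COUNTER_MAX):
        raise ValueError("m or n out of range")
    return m, n
```

With the assumed limits, `choose_vco([33, 66], 300, 800)` gives 330, `post_scale_dividers(330, [33, 66])` gives dividers of 10 and 5, and a 33 MHz input needs m = 10 and n = 1.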