FPGA RTL Coding Style
FPGAs are very versatile and give you a lot of flexibility - IO, logic, embedded memory, DSP, and standard interfaces like PCIe, Ethernet, DDRx and so on. Advanced features like Dynamic Reconfiguration and Partial Reconfiguration allow even more flexibility. The biggest challenges are compile time and timing closure.
Today's FPGAs, like the Agilex-9 or even the Agilex-7 series, are large devices with a few million ALMs, thousands of M20K embedded memory blocks, and DSP blocks - which, when translated to transistor count, amount to billions! Now imagine having to synthesize the design, plan the clocking and IO, place, route, and close timing across 5-7 corners. I'd confidently say 6-8 hours of compile time is excellent. The Altera Quartus Engineering team has made significant progress towards reducing compile times and improving out-of-the-box timing closure over the last several years, but we will talk about that in a later post. Today the focus is on RTL coding style that takes advantage of the FPGA architecture to help you achieve timing closure.
Over the years, I have had the opportunity to interact with brilliant Engineers. You and I probably see Verilog/VHDL code as it is - RTL, but there are Engineers who can "see" the synthesized netlist just by looking at the RTL. These Engineers are the ones who can spot RTL coding style issues that hamper timing closure, and who will even rewrite RTL code to squeeze out the few extra MHz of performance that determine whether you ship the product on time or not! Cool stuff, eh? Let's see an example.
-- NUM_FRAMES is 850
FOR ind IN 0 TO (NUM_FRAMES-1) LOOP
  IF (addr(17 DOWNTO 1) = CONV_STD_LOGIC_VECTOR(ind,17)) THEN
    rdata <= data_in(ind);
  END IF;
END LOOP;
END IF;  -- closes an enclosing IF that is not shown

-- after adding USE ieee.std_logic_unsigned.ALL;
variable index : INTEGER;  -- declared in the process declarative part
index := CONV_INTEGER(addr(17 DOWNTO 1));
IF (index < NUM_FRAMES) THEN
  rdata <= data_in(index);
END IF;
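The modified RTL, which converts the address to an integer (using the ieee.std_logic_unsigned package) and indexes the array directly, results in 6 fewer LUT levels. The loop describes 850 separate 17-bit address comparisons feeding a chain of conditional assignments, whereas the direct index describes a single multiplexer with one range check. Fewer logic levels mean lower data delay and hence higher Fmax.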
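For readers who want to compile the indexed version on its own, here is a minimal, self-contained sketch. It is not the original design: the entity name frame_read, the package frame_pkg, the 32-bit word width, the clock port and the array name data_in are all assumptions made for illustration. It also swaps CONV_INTEGER from ieee.std_logic_unsigned for to_integer(unsigned(...)) from the standard ieee.numeric_std package, which performs the same conversion.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical package: the array type and 32-bit word width are assumptions.
package frame_pkg is
  constant NUM_FRAMES : positive := 850;  -- value taken from the article
  subtype word_t is std_logic_vector(31 downto 0);
  type frame_array_t is array (0 to NUM_FRAMES - 1) of word_t;
end package frame_pkg;

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use work.frame_pkg.all;

-- Hypothetical wrapper entity around the indexed read.
entity frame_read is
  port (
    clk     : in  std_logic;
    addr    : in  std_logic_vector(17 downto 0);
    data_in : in  frame_array_t;  -- stand-in for the array read in the article
    rdata   : out word_t
  );
end entity frame_read;

architecture rtl of frame_read is
begin
  process (clk)
    -- 17 address bits are used, so the index fits in 0 .. 2**17 - 1.
    variable index : natural range 0 to 2**17 - 1;
  begin
    if rising_edge(clk) then
      -- Convert the address slice once, then index the array directly.
      index := to_integer(unsigned(addr(17 downto 1)));
      -- Out-of-range addresses leave rdata unchanged, as in the loop version.
      if index < NUM_FRAMES then
        rdata <= data_in(index);
      end if;
    end if;
  end process;
end architecture rtl;
```

The range check is kept because 17 address bits can encode values well beyond the 850 valid entries.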
Conclusion:
The Quartus Prime Pro Design Software provides all the tools you need to identify RTL coding style issues. In the synthesis stage, review and address all the critical warnings and warnings, and aim to have as few of them left as possible. The synthesis engine also has a robust DRC report. Review and address the DRCs to minimize timing closure challenges. Use the RTL Viewer and the Technology Map Viewer to review the logic levels in your post-synthesis netlist.
As a rule of thumb, irrespective of the technology node, assume 500 ps of delay per logic level. A 500 MHz clock has a period of 2 ns, so if you want your design to run at 500 MHz, the majority of your timing paths need to have fewer than 4 levels of logic.
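The same rule of thumb can be written as a rough budget for any target frequency. This is only a sketch: it assumes the 500 ps per level figure given above and ignores clock-to-output delay, setup time, routing detours and clock skew, which is why the practical target sits below the bound. The symbols are introduced here for illustration.

```latex
\[
  N_{\max} \;=\; \left\lfloor \frac{T_{\text{clk}}}{t_{\text{level}}} \right\rfloor
           \;=\; \left\lfloor \frac{1}{f_{\text{clk}}\; t_{\text{level}}} \right\rfloor ,
  \qquad t_{\text{level}} \approx 500\ \text{ps}
\]
\[
  f_{\text{clk}} = 500\ \text{MHz}:\quad
  T_{\text{clk}} = 2\ \text{ns},\quad
  N_{\max} = \left\lfloor \frac{2\ \text{ns}}{0.5\ \text{ns}} \right\rfloor = 4
\]
\[
  f_{\text{clk}} = 250\ \text{MHz}:\quad
  T_{\text{clk}} = 4\ \text{ns},\quad
  N_{\max} = \left\lfloor \frac{4\ \text{ns}}{0.5\ \text{ns}} \right\rfloor = 8
\]
```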
With the High-Effort compiler setting, the Quartus Prime Pro design software will retime aggressively, both forward and backward. For retiming to happen, though, there needs to be sufficient slack on the paths before or after the critical path, that is, the path that is actually failing timing.
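To make the slack requirement concrete, here is a minimal sketch of a path with room for backward retiming. The entity name mac_pipe, the port names and the widths are hypothetical. The multiply-add between the input registers and sum_r is the long path, while the path from sum_r to sum_rr is only a wire, so it has plenty of slack and a retimer is free to move that spare register back into the arithmetic. Whether the tool actually does so depends on the compiler settings and the rest of the design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical multiply-add with one spare output register stage.
entity mac_pipe is
  port (
    clk     : in  std_logic;
    a, b, c : in  unsigned(15 downto 0);
    y       : out unsigned(31 downto 0)
  );
end entity mac_pipe;

architecture rtl of mac_pipe is
  signal a_r, b_r, c_r : unsigned(15 downto 0);
  signal sum_r, sum_rr : unsigned(31 downto 0);
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Input registers.
      a_r <= a;
      b_r <= b;
      c_r <= c;
      -- Long combinational path: 16x16 multiply followed by a 32-bit add.
      sum_r <= a_r * b_r + resize(c_r, 32);
      -- Spare stage with no logic in front of it: a candidate for retiming.
      sum_rr <= sum_r;
    end if;
  end process;

  y <= sum_rr;
end architecture rtl;
```

The registers are written without resets to keep the sketch minimal.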
Reach out to your FAE for help. In most cases, the FAEs will help you rewrite your RTL and constraints to help you achieve timing closure.
I also encourage you to submit your design to Altera. Submitting your design along with the RTL lets us add it to our daily and weekly regression suites and helps us fine-tune the algorithms so that you achieve timing closure out-of-the-box. These regressions help us monitor the QoR - Fmax, compile time and memory - and debug outliers.
To submit your design, reach out to your Altera FAE.