Author(s): Peter Safir
Conference section: Section 14. Technical sciences. Specialty 05.00.00
Article DOI: 10.32743/SpainConf.2023.1.27.351343
Bibliographic description
Peter S. COMPARISON OF FAST BINARY ADDERS // Proceedings of the XXVII International Multidisciplinary Conference «Prospects and Key Tendencies of Science in Contemporary World». Bubok Publishing S.L., Madrid, Spain. 2023. DOI:10.32743/SpainConf.2023.1.27.351343



Peter Safir

Bachelor of Science, The Azrieli College of Engineering in Jerusalem (JCE),

Israel, Jerusalem



Adders are a basic building block of logic circuits implemented on an FPGA. In this project I compare high-speed parallel-prefix adders (Kogge-Stone, Brent-Kung, and Sklansky) with a ripple carry adder and the default adder. The synthesis results are presented as graphs.


Keywords: Kogge-Stone, Brent-Kung, Sklansky, FPGA, RTL, adder.



Data processing speed is central to any microprocessor, and the element that most affects it is the arithmetic block, which includes the adders. Parallel-prefix adders such as Kogge-Stone, Brent-Kung, and Sklansky offer a highly efficient solution to the binary addition problem, and they are well suited to FPGA implementation. This project compares the high-speed parallel-prefix adders Kogge-Stone [3], Brent-Kung [4], and Sklansky [2] with a ripple carry adder and the default adder. I went through the full design cycle, from initial concept to structural RTL [1] coding, simulation, and synthesis for the Cyclone II FPGA family, device EP2C35F672C6.

Common Adder Structure

1-bit Full Adder

A full adder is a circuit that adds three one-bit binary numbers (two operand bits and a carry-in) and outputs two one-bit binary numbers (a sum and a carry-out). The block diagram and equations are shown in Figure 1:





Figure 1. 1-bit Full Adder
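
As a minimal sketch of the full adder described above, the following VHDL uses the standard sum and carry equations. The entity name `full_adder` and the port names are illustrative assumptions, not taken from the project sources.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 1-bit full adder; names are chosen for illustration only.
entity full_adder is
  port (
    a, b, cin : in  std_logic;   -- two operand bits and the carry-in
    s, cout   : out std_logic    -- sum bit and carry-out
  );
end entity full_adder;

architecture dataflow of full_adder is
begin
  -- The sum is the parity of the three input bits.
  s    <= a xor b xor cin;
  -- A carry is produced when both operands are 1, or when exactly one
  -- operand is 1 and the incoming carry is 1.
  cout <= (a and b) or (cin and (a xor b));
end architecture dataflow;
```

With `a = '1'`, `b = '1'`, and `cin = '0'`, this gives `s = '0'` and `cout = '1'`, matching the 1 + 1 rule of binary addition.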

Default adder (Binary addition)

Adding binary numbers is a very simple task, and very similar to the longhand addition of decimal numbers. As with decimal numbers, you start by adding the bits (digits) one column, or place weight, at a time, from right to left. Unlike decimal addition, there is little to memorize in the way of rules for the addition of binary bits:

0 + 0 = 0

1 + 0 = 1

0 + 1 = 1

1 + 1 = 0, carry = 1

Just as with decimal addition, when the sum in one column is a two-bit (two-digit) number, the least significant figure is written as part of the total sum and the most significant figure is "carried" to the next left column. Consider the following example:

               11  1     Carry bits




Implementation in VHDL:

-- sumint is one bit wider than a and b, so bit n holds the carry-out
sumint <= ('0' & a) + ('0' & b) + Cin;
Cout <= sumint(n);
s <= sumint(n-1 downto 0);
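
The three assignments above are only the body of the adder. A complete entity around them might look like the sketch below, which assumes the `ieee.numeric_std` package and a generic width `n`. The entity name `default_adder`, the helper signal `cin_v`, and the default width of 16 are assumptions for illustration; the project's actual declarations are not shown in the text.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical wrapper around the "+" operator; the synthesis tool
-- chooses the adder structure.
entity default_adder is
  generic (n : positive := 16);  -- operand width (assumed default)
  port (
    a, b : in  std_logic_vector(n-1 downto 0);
    cin  : in  std_logic;
    s    : out std_logic_vector(n-1 downto 0);
    cout : out std_logic
  );
end entity default_adder;

architecture rtl of default_adder is
  signal sumint : unsigned(n downto 0);   -- one extra bit for the carry-out
  signal cin_v  : unsigned(0 downto 0);   -- carry-in as a 1-bit vector
begin
  cin_v(0) <= cin;
  -- Extend both operands by one bit so the carry-out is not lost.
  sumint <= resize(unsigned(a), n + 1) + resize(unsigned(b), n + 1) + cin_v;
  cout   <= sumint(n);
  s      <= std_logic_vector(sumint(n-1 downto 0));
end architecture rtl;
```

Leaving the addition to the `+` operator lets the synthesis tool map it onto whatever carry logic the target device provides.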

N-bit ripple carry adder

The ripple carry adder is one of the simplest adders. It consists of a cascaded series of full adders. For example, a 4-bit adder can be constructed by cascading four full adders together as shown in Figure 2. The ripple carry adder is relatively slow as each full adder must wait for the carry bit to be calculated from the previous full adder.




Figure 2. n-bit Carry Propagate Adder
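
The cascade of full adders can be sketched in VHDL with a generate loop, as below. This is an illustration under assumed names (`ripple_carry_adder`, the internal carry vector `c`) and not the project's source code. The full adder equations are written inline so that the block is self-contained.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical n-bit ripple carry adder built from n full adder stages.
entity ripple_carry_adder is
  generic (n : positive := 4);  -- operand width (assumed default)
  port (
    a, b : in  std_logic_vector(n-1 downto 0);
    cin  : in  std_logic;
    s    : out std_logic_vector(n-1 downto 0);
    cout : out std_logic
  );
end entity ripple_carry_adder;

architecture structural of ripple_carry_adder is
  -- c(i) is the carry into stage i; c(n) is the final carry-out.
  signal c : std_logic_vector(n downto 0);
begin
  c(0) <= cin;

  stage : for i in 0 to n-1 generate
    -- One full adder per bit; stage i depends on the carry from stage i-1.
    s(i)     <= a(i) xor b(i) xor c(i);
    c(i + 1) <= (a(i) and b(i)) or (c(i) and (a(i) xor b(i)));
  end generate stage;

  cout <= c(n);
end architecture structural;
```

The dependence of `c(i + 1)` on `c(i)` is the carry chain whose length grows with `n`, which is why this structure is slow for wide operands.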



Optimization was carried out in Balanced mode, for the best trade-off between area and speed. The clock frequency was left at the default of 10 MHz. The maximum width is 128 bits, since the area of the device does not allow wider adders.

The calculation was done for 16, 32, 64, and 128 bits, and the results were compared across all adders. The Quartus II sp2 Web Edition design software [8] was used to simulate the BKA, KSA, Sklansky, RCA [7], and default adder designs. Writing the VHDL [1] source code is the most important part of this project. The VHDL source code uses elements such as entity, library, architecture, function, and array. Each design file has to be analyzed, synthesized, and compiled before it can be simulated.

The simulation results come in three forms: the Register Transfer Level (RTL) diagram, the functional vector waveform, and the classic timing analysis. The RTL diagram is obtained with the RTL viewer, which is one of the netlist viewers. The functional vector waveforms are produced by applying random bit values to the inputs and adding them to produce the sum and carry bits. The timing analysis is read from the classic timing analysis summary after the whole project is compiled. All simulations use functions included in the Quartus II design software. Once the desired simulation outcome is obtained, the results are analyzed, and finally the adders are compared. The comparison is based on computational speed (also known as propagation delay), area, and the delay*area product.
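
The project checked the adders with Quartus II vector waveforms. The same idea, random operands compared against the expected sum, can also be sketched as a self-checking VHDL testbench. The sketch below swaps in that technique and is not the author's flow. All names (`adder_tb`, `stimulus`, the inline ripple carry chain standing in for the design under test) are assumptions, and the carry-in is held at '0' for brevity.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use ieee.math_real.all;

entity adder_tb is
end entity adder_tb;

architecture sim of adder_tb is
  constant n : positive := 16;
  signal a, b : std_logic_vector(n-1 downto 0) := (others => '0');
  signal s    : std_logic_vector(n-1 downto 0);
  signal cin  : std_logic := '0';
  signal cout : std_logic;
  signal c    : std_logic_vector(n downto 0);
begin
  -- Design under test: an inline ripple carry chain (stand-in for any adder).
  c(0) <= cin;
  dut : for i in 0 to n-1 generate
    s(i)     <= a(i) xor b(i) xor c(i);
    c(i + 1) <= (a(i) and b(i)) or (c(i) and (a(i) xor b(i)));
  end generate dut;
  cout <= c(n);

  stimulus : process
    variable seed1    : positive := 1;
    variable seed2    : positive := 2;
    variable r        : real;
    variable va, vb   : natural;
    variable expected : unsigned(n downto 0);
    variable actual   : unsigned(n downto 0);
  begin
    for t in 1 to 100 loop
      -- Draw two pseudo-random operands in the range 0 .. 2**n - 1.
      uniform(seed1, seed2, r);
      va := integer(floor(r * real(2**n)));
      uniform(seed1, seed2, r);
      vb := integer(floor(r * real(2**n)));
      a <= std_logic_vector(to_unsigned(va, n));
      b <= std_logic_vector(to_unsigned(vb, n));
      wait for 10 ns;  -- let the combinational logic settle

      -- Compare carry-out and sum against integer addition.
      expected := to_unsigned(va + vb, n + 1);
      actual(n) := cout;
      actual(n-1 downto 0) := unsigned(s);
      assert actual = expected
        report "adder mismatch for a=" & integer'image(va) &
               " b=" & integer'image(vb)
        severity error;
    end loop;
    report "random test finished";
    wait;
  end process stimulus;
end architecture sim;
```

Replacing the inline chain with an instance of another adder would let the same stimulus check each design in turn.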

Result and analysis

All adders are compared on two main aspects: area and computational delay. The area comparison is based on the total number of logic elements in each adder, while the delay comparison is based on the timing analysis of each adder.

Area Comparison

Bar chart 1 shows the number of logic elements versus bit size for all adders. For every adder except the KSA, the number of logic elements grows roughly in proportion to the bit size, and this growth is small compared with that of the KSA, whose logic element count rises much more steeply as the bit size increases. At 128 bits, the KSA uses over 70% more logic elements than the BKA and over 90% more than the default adder. The number of logic elements strongly affects the area of the adder. In practice, more logic elements also require more wiring to connect them in the circuit. Therefore, the more logic elements a design has, the higher its cost, because cost is directly proportional to the area of the circuit. As a result, the KSA occupies more area than the default adder as the number of bits increases, and the default adder has the fewest logic elements of all.


Bar chart 1. Generic logic elements

Table 1.

Total logic elements



Computational delay comparison

Bar chart 2 shows the propagation delay in nanoseconds versus bit size for all adders. At 16 bits, the delay of the RCA is slightly higher than that of the other adders. The RCA delay keeps growing over the whole measured range, and at 128 bits it is very large compared with the other adders. This is not due to the number of logic elements: as Bar chart 1 shows, the RCA has few logic elements. It is primarily because each full adder in the chain must wait for the carry from the previous one.

A prefix adder with more “o” stages has a longer propagation time, because the logic of each stage depends on the results of the previous stage(s). But as the bar chart shows, the fastest prefix adder is the one that contains the most logic elements and, accordingly, the most FCO blocks [9]. This is because each FCO block in the prefix adder produces the carry for a more significant bit (MSB) without waiting for it to be computed step by step from the less significant bits (LSB). In terms of propagation delay, the KSA is the better choice.
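
To make the stage structure concrete, the following is a minimal sketch of a Kogge-Stone prefix adder in VHDL. It is not the project's source code. The entity name, the signal names, and the generic `levels` (which must be set to the ceiling of log2(n), and is passed in explicitly here as a simplifying assumption) are all illustrative. Each level applies the “o” operator to pairs of (generate, propagate) signals whose distance doubles from one level to the next.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical Kogge-Stone adder; all names are chosen for illustration.
entity kogge_stone_adder is
  generic (
    n      : positive := 16;  -- operand width
    levels : positive := 4    -- number of prefix stages, ceil(log2(n))
  );
  port (
    a, b : in  std_logic_vector(n-1 downto 0);
    cin  : in  std_logic;
    s    : out std_logic_vector(n-1 downto 0);
    cout : out std_logic
  );
end entity kogge_stone_adder;

architecture structural of kogge_stone_adder is
  type level_array is array (0 to levels) of std_logic_vector(n-1 downto 0);
  signal g, p : level_array;               -- group generate / propagate
  signal c    : std_logic_vector(n downto 0);
begin
  assert 2**levels >= n
    report "levels must be at least ceil(log2(n))" severity failure;

  -- Pre-processing: bitwise generate and propagate.
  g(0) <= a and b;
  p(0) <= a xor b;

  -- Prefix tree: at level k, bit i combines with bit i - 2**(k-1).
  level_gen : for k in 1 to levels generate
    bit_gen : for i in 0 to n-1 generate
      combine : if i >= 2**(k-1) generate
        g(k)(i) <= g(k-1)(i) or (p(k-1)(i) and g(k-1)(i - 2**(k-1)));
        p(k)(i) <= p(k-1)(i) and p(k-1)(i - 2**(k-1));
      end generate combine;
      pass : if i < 2**(k-1) generate
        -- These bits already cover the range down to bit 0.
        g(k)(i) <= g(k-1)(i);
        p(k)(i) <= p(k-1)(i);
      end generate pass;
    end generate bit_gen;
  end generate level_gen;

  -- Post-processing: carries from the final group signals, then the sum.
  c(0) <= cin;
  carry_gen : for i in 0 to n-1 generate
    c(i + 1) <= g(levels)(i) or (p(levels)(i) and cin);
  end generate carry_gen;

  s    <= p(0) xor c(n-1 downto 0);
  cout <= c(n);
end architecture structural;
```

Every carry is available after `levels` stages regardless of bit position, at the price of roughly n combine cells per level. That trade is consistent with the KSA being both the fastest and the largest adder in the comparison.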


Bar chart 2. Propagation delay


Table 2.

Propagation delay




In this part of the project I multiplied the propagation delay by the total number of logic elements to obtain a combined measure of area and performance (Bar chart 3). As expected, the best result belongs to the default adder, which has the fewest logic elements and the smallest propagation delay. Among the parallel-prefix adders (PPA), the best result was obtained by the Sklansky adder, which offers the best compromise between area and propagation delay.


Bar chart 3. Generic Propagation delay*Total logic elements


Table 3.

Propagation delay*Total logic elements




References

  1. I. Hassoune, D. Flandre, I. O'Connor, and J.-D. Legat, “ULPFA: a new efficient design of a power-aware full adder,” IEEE Trans. Circ. Syst. I: Reg. Papers, vol. 57, no. 8, pp. 2066–2074, Aug. 2010.
  2. A. Neve, H. Schettler, T. Ludwig, and D. Flandre, “Power-delay product minimization in high-performance 64-bit carry-select adders,” IEEE Trans. Very Large Scale Integration (VLSI) Syst., vol. 12, no. 3, pp. 235–244, March 2004.
  3. A. Shams and M. Bayoumi, “A novel high-performance CMOS 1-bit full-adder cell,” IEEE Trans. Circuits Syst. II, vol. 47, pp. 478–481, May 2000.
  4. S.M.Y. Sherazi, S. Asif, E. Backenius, and M. Vesterbacka, “Reduction of Substrate Noise in Sub Clock Frequency Range,” IEEE Trans. Circ. Syst. I: Reg. Papers, vol. 57, no. 6, pp. 1287–1297, June 2010.
  5. M.A. Manzoul, “Parallel CLA algorithm for fast addition,” in Proc. Int. Conf. Parallel Computing in Electrical Engineering, 2000, pp. 55–58.
  6. Youngjoon Kim and Lee-Sup Kim, “A low power carry select adder with reduced area,” in Proc. IEEE Int. Symp. Circuits and Systems, May 2001, vol. 4, pp. 218–221.
  7. M. Alioto and G. Palumbo, “A simple strategy for optimized design of one-level carry-skip adders,” IEEE Trans. Circ. Syst. I: Fundamental Theory Appl., vol. 50, no. 1, pp. 141–148, Jan. 2003.
  8. K. Rawat, T. Darwish, and M. Bayoumi, “A low power and reduced area carry select adder,” in Proc. 45th Midwest Symp. Circuits and Systems, Aug. 2002, vol. 1, pp. 467–470.
  9. R. Ward and T. Molteno, Table of linear feedback shift registers, Dunedin, New Zealand: Datasheet, Department of Physics, University of Otago, 2007.
  10. M. Alioto and G. Palumbo, “Analysis and comparison on full adder block in submicron technology,” IEEE Trans. Very Large Scale Integration (VLSI) Systems, vol. 10, no. 6, pp. 806–823, Dec. 2002.
  11. Frank K. Gurkaynak, Yusuf Leblebici, Laurent Chaouat, and Patrik J. McGuinness, “Higher Radix Kogge-Stone Parallel Prefix Adder Architectures,” in Proc. of ISCAS 2000.
  12. J.M. Rabaey, A. Chandrakasan, and B. Nikolić, Digital Integrated Circuits: A Design Perspective, 2nd Ed., Prentice Hall, 2003.