# Power Analysis and Implementation of the 8-bit Toggle Clock Gated ALU

Vandana Prajapati Department of Electronics and Communication Sagar Institute of Research & Technology Bhopal, India

## ABSTRACT

Power dissipation is a major drawback in the design of digital sequential circuits for low-power electronic devices. The clock signal is the one input common to all sequential circuits, and it accounts for a large share of the power dissipated at high frequencies. Clock gating can be applied at the architectural level to reduce dynamic and clock power. The aim of this paper is to analyze, implement and compare clock gating techniques for an 8-bit ALU on an Artix-7 FPGA (xc7a100t-3csg324, 45 nm technology) and a Spartan-6 FPGA (xc6slx41-1Ltqg144, 40 nm technology). Two clock gating techniques are used in the design: T flip-flop based gating and latch-based gating. The design is implemented using Xilinx 14.1. The T flip-flop is the better choice for this design because it requires fewer gates and less area. The proposed design performs operations using 11 instructions and is built at the RTL level as a gated-clock ALU based on a T flip-flop. At operating frequencies of 100 MHz, 200 MHz, 300 MHz, 400 MHz and 500 MHz, the dissipated power is 5 mW, 9 mW, 14 mW, 19 mW and 24 mW respectively.

#### **Keywords**

Sequential circuit, T-FF, Clock Gating, Implementation, Instruction, Gated ALU

#### 1. INTRODUCTION

In digital electronics, high-performance, low-power devices are dominant, which drives continuous research on low-power design techniques. Low power is a basic need of any digital signal processing device, from simple applications such as data acquisition to complex ones such as network-on-chip systems, memory read/write processes, counters and registers. The unit that performs arithmetic and logical operations requires more power than the other peripherals, so reducing the power of this unit is the real challenge. Research on minimizing ALU power is expanding, and various low-power approaches are being proposed. Power dissipation has become the bottleneck in achieving high efficiency and small device size, hence the need for research on lower power dissipation in high-speed VLSI systems. Power dissipation is the sum of short-circuit power, dynamic power and leakage power [1], which are influenced by frequency, supply voltage, switching activity, and the charging and discharging of the load capacitance. The power dissipation equation is given as:

$P = P_{dynamic} + P_{shortcircuit} + P_{leakage}$ (1)

Uday Panwar Department of Electronics and Communication Sagar Institute of Research & Technology Bhopal, India

Dynamic power, also called switching power, is defined as:

$P_{dynamic} = \alpha \, C \, V^2 f$ (2)

In equation 2, $\alpha$ is the switching activity, C is the capacitance, V is the supply voltage and f is the frequency. Power optimization techniques can be applied at different levels, such as the logic level, the architecture level and the design level. Dynamic power loss is a major factor in all sequential circuits; it accounts for up to 60% of the total power.
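As a minimal illustration of equation 2, the sketch below evaluates the dynamic power at several clock frequencies. The values chosen for the switching activity, capacitance and supply voltage are hypothetical and serve only to show that dynamic power scales linearly with f and quadratically with V; they are not measurements from the design in this paper.

```python
def dynamic_power(alpha: float, c_farads: float, v_volts: float, f_hz: float) -> float:
    """Dynamic (switching) power in watts: P = alpha * C * V^2 * f."""
    return alpha * c_farads * v_volts ** 2 * f_hz


# Hypothetical example values (assumptions, not taken from the paper).
ALPHA = 0.2        # switching activity
C_LOAD = 50e-12    # switched capacitance in farads
V_DD = 1.0         # supply voltage in volts

if __name__ == "__main__":
    for f_mhz in (100, 200, 300, 400, 500):
        p_mw = dynamic_power(ALPHA, C_LOAD, V_DD, f_mhz * 1e6) * 1e3
        print(f"{f_mhz} MHz -> {p_mw:.2f} mW")
```

Because f enters the expression linearly, suppressing clock transitions while a block is idle lowers the effective switching activity of that block, which is the effect clock gating relies on.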

# 2. RELATED WORK

Dynamic power dissipation can be reduced by clock gating techniques at different levels. Reference [2] presents an AND-gate based clock gating circuit for a three-bit full-adder system, in which the clock-gate enable function is identified by Boolean analysis of the logic inputs of all adders. In reference [1], clock gating is implemented with various techniques on a small circuit (a D flip-flop) and on a larger circuit (a 16-bit register), and the percentage of dynamic power attributable to clock power is evaluated at different device operating frequencies. In reference [3], flip-flops are used to design a 10-bit binary counter and a 14-bit successive approximation register. A new clock-gated flip-flop is presented which reduces the switching power consumed by the clock signal. It operates with no redundant clock cycles and has a reduced number of transistors to minimize the overhead, which makes it suitable for data signals with higher switching activity. Reference [4] reports practical results based on the toggling activity and correlation of flip-flops and on their physical adjacency constraints in the layout. In this work, the arithmetic and logic units, as coexistent circuits, are incorporated into the implementation from which the results are obtained. Sections 6 and 9 present the results and the conclusions.

# 3. CLOCK GATING

Clock gating (CG) is one of the best-known low-power techniques. It is very effective at reducing power consumption in digital and VLSI circuits. CG is implemented to remove unwanted transitions of the clock pulse, which in turn reduces the dynamic power loss and the power lost in switching the load capacitance. The goal of this technique is to disable or suppress transitions from propagating to parts of the clock path (i.e., the clock network and flip-flops) under a certain condition computed by the clock-gating circuit. Figure 1 illustrates clock gating, which blocks the clock signal of each sequential logic unit while that unit is idle. The gating condition is computed by the function Fcg; CLK is the system clock and CLKG is the gated clock of the functional unit.



Fig 1: Clock Gated Function

Clock gating prevents the clock from reaching functional modules that are idle; that is, the clock is turned off when it is not needed. Various clock gating styles [5] are used to optimize power, including latch-free CG, latch-based CG and flip-flop based CG [1][2]. In latch-free CG, the clock input to the functional block is provided through a basic logic gate such as an AND, NAND or NOR gate. The basic block representation is shown in figure 2.



Fig 2: Latch Free Clock Gating.

The problem with this technique is that if the enable signal goes inactive in the middle of a clock pulse, the gated clock pulse is terminated prematurely.
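The truncation problem can be illustrated with a small sample-based sketch. The waveforms below are hypothetical (4 samples high and 4 samples low per clock period), and the helper names are introduced only for this illustration; the gate itself is modelled as an ideal AND with no propagation delay.

```python
def and_gate_clock(clk, en):
    """Latch-free gating: the gated clock is CLK AND EN, sample by sample."""
    return [c & e for c, e in zip(clk, en)]


def high_pulse_widths(signal):
    """Return the length (in samples) of each run of consecutive 1s."""
    widths, run = [], 0
    for bit in signal:
        if bit:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


# Two clock periods: 4 samples high, 4 samples low (hypothetical waveform).
CLK = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]
# EN drops halfway through the second high phase.
EN = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

GATED = and_gate_clock(CLK, EN)
print(high_pulse_widths(CLK))    # widths of the ungated clock pulses
print(high_pulse_widths(GATED))  # the second pulse comes out shorter
```

In this sketch the second gated pulse is only half as wide as a normal clock pulse, which is the kind of shortened pulse that can violate the timing of the downstream flip-flops.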

#### 3.1 Latch Based Clock Gating

The latch-based clock gating technique adds a level-sensitive latch to the design, which holds the EN signal from the active edge of the clock until its inactive edge. Since the latch captures the state of the EN signal and keeps it stable until the entire clock pulse has been produced, the EN signal need only be stable around the rising edge of the gated clock, as in the traditional gated design style (figure 3).

In some applications, latch-based designs are applied to D flip-flop (DFF) based designs [5][6]. The basic concept is that a D-FF can be split into two latches, each clocked with a separate clock signal.

This form of clock gating is simple to design. A simple AND gate is employed to generate the gated clock. This technique (figure 4) is glitch-free because the control signal, generated when EN is high, is stable and remains stable.



Fig 3: D-Flip Flop Latch Based ALU.
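The behaviour of latch-based gating described above can be sketched with the same kind of sample-based model. This is a simplified illustration under stated assumptions: the latch is taken to be transparent while CLK is low and to hold while CLK is high, the waveforms are hypothetical, and gate delays are ignored.

```python
def latch_gated_clock(clk, en, initial_latch=0):
    """Latch-based gating: a level-sensitive latch follows EN while CLK is low
    and holds its value while CLK is high; gated clock = CLK AND latched EN."""
    latched = initial_latch
    gated = []
    for c, e in zip(clk, en):
        if c == 0:
            # The latch is transparent during the low phase of the clock.
            latched = e
        gated.append(c & latched)
    return gated


# Hypothetical waveform: 4 samples low, 4 samples high per period.
CLK = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]
# EN falls in the middle of the first high phase.
EN = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

GATED = latch_gated_clock(CLK, EN)
print(GATED)
```

Here the first clock pulse is delivered at full width even though EN falls while the clock is high, and the next pulse is suppressed entirely, so no shortened pulse appears on the gated clock.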

### 4. FLOWCHART OF PROPOSED WORK

In this design, a pulse-enable concept is added to the clock gating technique, so that the clock signal remains idle whenever the enable pulse equals zero.



Fig 4: Flow Chart of Design

## 5. IMPLEMENTATION OF CLOCK GATED ALU

#### 5.1 T- Flip Flop Based Clock Gating

Many techniques can be used to generate a clock gating signal, and the T flip-flop is one circuit that can generate it. The T flip-flop produces logic 1 at its output when its input is 0, which is its foremost advantage for lowering power loss. It generates the clock-gated signal at logic 0, which reduces the power losses related to clock transitions. The implementation of the clock gating signal is shown in figure 5.



Fig 5: Gated Clock With T Flip Flop.

The T flip-flop is a sequential circuit, and in this design it generates the clock that is applied to the ALU. In figure 5, when the enable (T) is equal to 0, the T flip-flop holds a high value at its output (Q), which is applied to an AND gate to produce the clock signal for the ALU.
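The gating behaviour described above can be approximated with a cycle-level behavioural sketch. It is not the RTL used in this work: the class and function names are hypothetical, the initial value Q = 1 is an assumption made so that T = 0 leaves the clock enabled, and clock-to-Q timing is not modelled.

```python
class TFlipFlop:
    """Behavioural T flip-flop: toggles Q on a clock edge when T is 1, else holds."""

    def __init__(self, q_init: int = 1):
        # Assumption: Q starts high so that T = 0 keeps the clock enabled.
        self.q = q_init

    def clock_edge(self, t: int) -> int:
        if t:
            self.q ^= 1
        return self.q


def gated_clock_pulses(t_inputs, q_init=1):
    """Return, per clock cycle, 1 if a pulse reaches the ALU (CLK AND Q), else 0."""
    ff = TFlipFlop(q_init)
    return [ff.clock_edge(t) for t in t_inputs]


# T held at 0: Q stays high and every clock pulse is passed to the ALU.
print(gated_clock_pulses([0, 0, 0, 0]))
# T held at 1: Q toggles every cycle, so only every other pulse is passed.
print(gated_clock_pulses([1, 1, 1, 1]))
```

With T held high the output toggles on every cycle, so the ALU sees a clock at half the input frequency; with T low the flip-flop simply holds its previous output.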

Table 1. Device Utilization with T-FF Based GC

| RESOURCE        | USED | AVAILABLE | UTILIZATION |
|-----------------|------|-----------|-------------|
| Slice LUTs      | 4    | 46,560    | 1%          |
| Used as logic   | 4    | 46,560    | 1%          |
| Used as memory  | 0    | 16,120    | 0%          |
| Slice registers | 5    | 93,120    | 1%          |

# 5.1.1 RTL View Of Gated ALU

Figure 6 shows the architecture of the clock-gated ALU. The RTL schematic was obtained for the Virtex-6 FPGA family in 40 nm technology.



Fig 6: RTL View Of Gated ALU With T Flip Flop.



Fig 7: Internal View of Gated ALU

# 5.2 Generation of Gated Clock With D-Flip Flop

The D flip-flop is the circuit most frequently used to produce a gated clock pulse. This design provides negative and positive latch-based gated clock generation with low static, dynamic and total power consumption.



Fig 8: Gated ALU with D Flip Flop

A low-power gated clock design is implemented on a Spartan-6 FPGA in 40 nm technology. The RTL view of the D flip-flop based design is presented in figure 9.



Fig 9: Internal View of D-FF ALU

#### 6. RESULT

# 6.1 Power Consumed In T- Flip Flop Based ALU

The power consumption of the gated ALU is evaluated on the Artix-7 xc7a100t-3csg324 (45 nm FPGA technology) and on the Spartan-6 xc6slx41-1Ltqg144 (40 nm technology). Static and dynamic power consumption are calculated at temperatures of 60 °C and 50 °C for both devices. The tables show the total power consumed during overall operation of the ALU with clock gating.

Table 2. Total Power Consumption In Gated ALU At Artix7

| Clock signal (ns) | Frequency (MHz) | Static power | Dynamic power | Total power |
|-------------------|-----------------|--------------|---------------|-------------|
| 1                 | 500             | 42 mW        | 24 mW         | 66 mW       |
| 2                 | 400             | 42 mW        | 19 mW         | 61 mW       |
| 3                 | 300             | 43 mW        | 14 mW         | 57 mW       |
| 5                 | 200             | 43 mW        | 9 mW          | 52 mW       |
| 10                | 100             | 42 mW        | 5 mW          | 47 mW       |

Table 3. Total Power Consumption In Gated ALU At 60 °C With Artix7

| Clock signal (ns) | Frequency (MHz) | Static power | Dynamic power | Total power |
|-------------------|-----------------|--------------|---------------|-------------|
| 1                 | 500             | 82 mW        | 24 mW         | 106 mW      |
| 2                 | 400             | 92 mW        | 19 mW         | 111 mW      |
| 3                 | 300             | 102 mW       | 14 mW         | 116 mW      |
| 5                 | 200             | 111 mW       | 9 mW          | 120 mW      |
| 10                | 100             | 120 mW       | 5 mW          | 125 mW      |
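Equation 2 predicts that dynamic power grows linearly with clock frequency. As a rough check, the sketch below fits a straight line to the dynamic power values reported for the Artix-7 device in Tables 2 and 3. The least-squares helper is written out by hand for illustration and is not part of the authors' tool flow.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


FREQ_MHZ = [100, 200, 300, 400, 500]
DYNAMIC_MW = [5, 9, 14, 19, 24]  # dynamic power column of Tables 2 and 3

SLOPE, INTERCEPT = linear_fit(FREQ_MHZ, DYNAMIC_MW)
print(f"slope = {SLOPE:.3f} mW/MHz, intercept = {INTERCEPT:.1f} mW")
```

The fitted slope is about 0.048 mW per MHz with an intercept close to zero, which is consistent with the linear dependence on f in equation 2.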

# 6.2 Measurement of Static and Dynamic Power Consumption By Spartan 6, xc6slx41-1Ltqg144 Technology With 40 nm FPGA

Table 4. Total Power Consumption In Gated ALU With

| Clock signal (ns) | Frequency (MHz) | Static power | Dynamic power | Total power |
|-------------------|-----------------|--------------|---------------|-------------|
| 1                 | 500             | 12 mW        | 30 mW         | 42 mW       |
| 2                 | 400             | 12 mW        | 24 mW         | 36 mW       |
| 3                 | 300             | 11 mW        | 18 mW         | 29 mW       |
| 5                 | 200             | 12 mW        | 12 mW         | 23 mW       |
| 10                | 100             | 12 mW        | 6 mW          | 17 mW       |

Table 5. Power Consumption at 50 °C With Spartan 6.

| Clock signal (ns) | Frequency (MHz) | Static power | Dynamic power | Total power |
|-------------------|-----------------|--------------|---------------|-------------|
| 1                 | 500             | 18 mW        | 30 mW         | 48 mW       |
| 2                 | 400             | 20 mW        | 24 mW         | 42 mW       |
| 3                 | 300             | 18 mW        | 18 mW         | 36 mW       |
| 5                 | 200             | 17 mW        | 12 mW         | 29 mW       |
| 10                | 100             | 17 mW        | 6 mW          | 23 mW       |

## 6.3 Power Consumption in D-FF Based ALU

Power consumption analysis is also carried out for the D-FF based design. For this design, the total power over a complete instruction is obtained with the help of the XPower Analyzer. The total power, including static and dynamic power consumption, is presented in the table below.

Table 5. Power Consumption in D-FF Based ALU

| Clock signal (ns) | Frequency (MHz) | Static power | Dynamic power | Total power |
|-------------------|-----------------|--------------|---------------|-------------|
| 1                 | 500             | 197 mW       | 465 mW        | 664 mW      |
| 2                 | 400             | 198 mW       | 259 mW        | 457 mW      |
| 3                 | 300             | 198 mW       | 120 mW        | 318 mW      |
| 5                 | 200             | 198 mW       | 76 mW         | 274 mW      |
| 10                | 100             | 199 mW       | 46 mW         | 243 mW      |

#### 7. SIMULATED RESULT

The simulated waveform of the 8-bit ALU using T flip-flop clock gating is shown in Fig. 10. From the simulated result, it is observed that the 8-bit inputs are applied to $A_{in}$ and $B_{in}$, and the inputs to the select lines of the logical and arithmetic functional units are also provided. The power analysis of the 8-bit ALU is carried out for various resource types, namely clock power, dynamic power, signal power, IO power and total power.

The enable input EN of the T flip-flop is initially made HIGH, and the flip-flop generates a signal at half the frequency of the input signal. Enable is then made LOW, which maintains the previous state of the output. Finally, the gated clock is applied to the logical and arithmetic units. The output is obtained at the Logic out stage. The clock pulse remains constant until the next operation is selected for execution.



Fig 10: Simulated Result With T Flip Flop Gated ALU



Fig 11: Graphic Representations Of Total Power At Various Technology Level With T- Flip Flop Based ALU.

# 8. COMPARISON OF VARIOUS CLOCK GATING TECHNIQUES

Table 6 shows the dynamic power comparison of the clock gating techniques; the results improve with the newer technology and the new design.

Table 6. Dynamic Power Comparisons In ALU.

| Clock signal (ns) | Frequency (MHz) | Existing technique (dynamic power) | T flip-flop based (dynamic power) |
|-------------------|-----------------|------------------------------------|-----------------------------------|
| 1                 | 500             | 465 mW                             | 24 mW                             |
| 2                 | 400             | 259 mW                             | 19 mW                             |
| 3                 | 300             | 120 mW                             | 14 mW                             |
| 5                 | 200             | 76 mW                              | 9 mW                              |
| 10                | 100             | 46 mW                              | 5 mW                              |
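To make the comparison in Table 6 easier to read, the short sketch below converts the two dynamic power columns into a percentage reduction at each frequency. The numbers are copied from Table 6; the helper name is introduced only for this illustration.

```python
FREQ_MHZ = [500, 400, 300, 200, 100]
EXISTING_MW = [465, 259, 120, 76, 46]  # existing technique, from Table 6
TFF_MW = [24, 19, 14, 9, 5]            # T flip-flop based design, from Table 6


def reduction_percent(baseline_mw: float, gated_mw: float) -> float:
    """Relative saving of the gated design with respect to the baseline, in percent."""
    return 100.0 * (baseline_mw - gated_mw) / baseline_mw


REDUCTIONS = [reduction_percent(b, g) for b, g in zip(EXISTING_MW, TFF_MW)]

for f_mhz, r in zip(FREQ_MHZ, REDUCTIONS):
    print(f"{f_mhz} MHz: {r:.1f}% lower dynamic power")
```

For the values in Table 6 the reduction lies between roughly 88% and 95%, with the largest saving at 500 MHz.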

## 9. CONCLUSION

Power optimization, traditionally relegated to the synthesis and place-and-route stages, has moved up to the system level and RTL stages. Hardware designers use clock gating to turn off inactive sections of the design and reduce overall dynamic power consumption. The RTL approach is important because developers generally test power only at the gate level, and any change to the RTL then needs many design iterations to reduce power. Addressing power in the RTL form of the design thus saves weeks of effort by fixing potential power issues up front. The RTL coding step is not too early in the design flow to address power optimization. For each source of consumption and each type of digital block, appropriate solutions can be implemented. Although the theory behind some of these techniques can be complex, they are often easy to implement. RTL designers should be aware of these techniques and use their knowledge of the system not only to optimize speed, but also to reduce unnecessary switching activity.

#### **10. REFERENCES**

- [1] Mahendra Pratap and Deepak Baghel, "Clock gated low power sequential circuit design," in Proc. 2013 IEEE Conference on Information and Communication Technologies (ICT 2013).
- [2] Padmini G. Kaushik, Sanjay M. Gulhane, and Athar Ravish Khan, "Dynamic power reduction of digital circuits by clock gating," ijict.org, vol. 4, no. 1, March 2013.
- [3] Mohamed O. Shanker and Magdy A. Bayoumi, "Clock gated FF for low power application in 90 nm CMOS," *IEEE Trans. Circuits Syst.*
- [4] Shmuel Wimer and Israel Koren, "Design flow for FF grouping in data-driven clock gating," IEEE Trans. on VLSI, 1063-8210, 2012.
- [5] L. Benini, A. Bogliolo, and G. De Micheli, "A survey on design techniques for system-level dynamic power management," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 8, no. 3, pp. 299–316, Jun. 2000.
- [6] M. S. Hosny and W. Yuejian, "Low power clocking strategies in deep submicron technologies," in Proc. IEEE Intl. Conf. Integr. Circuit Design Technol., Jun. 2008, pp. 143–146.
- [7] C. Chunhong, K. Changjun, and S. Majid, "Activity-sensitive clock tree construction for low power," in Proc. Int. Symp. Low Power Electron. Design, 2002, pp. 279–282.
- [8] A. Farrahi, C. Chen, A. Srivastava, G. Tellez, and M. Sarrafzadeh, "Activity-driven clock design," IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 20, no. 6, pp. 705–714, Jun. 2001.
- [9] W. Shen, Y. Cai, X. Hong, and J. Hu, "Activity and register placement aware gated clock network design," in Proc. Int. Symp. Phys. Design, 2008, pp. 182–189.
- [10] M. Donno, E. Macii, and L. Mazzoni, "Power-aware clock tree planning," in Proc. Int. Symp. Phys. Design, 2004, pp. 138–147.
- [11] SpyGlass Power [Online]. Available: http://www.atrenta.com/solutions/spyglass family/spyglasspower.htm
- [12] S. Wimer and I. Koren, "The optimal fan-out of clock network for power minimization by adaptive gating," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 20, no. 10, pp. 1772–1780, Oct. 2012.
- [13] Y.-T. Chang, C.-C. Hsu, M. P.-H. Lin, Y.-W. Tsai, and S.-F. Chen, "Post-placement power optimization with multi-bit flip-flops," in Proc. IEEE/ACM Int. Conf. Comput.-Aided Design, Nov. 2010, pp. 218–223.
- [14] I. H.-R. Jiang, C.-L. Chang, Y.-M. Yang, E. Y.-W. Tsai, and L. S.-F. Cheng, "INTEGRA: Fast multi-bit flip-flop clustering for clock power saving based on interval graphs," in Proc. Int. Symp. Phys. Design, 2011, pp. 115–121.
- [15] N. Magen, A. Kolodny, U. Weiser, and N. Shamir, "Interconnect-power dissipation in a microprocessor," in Proc. Int. Workshop Syst. Level Int. Predict., 2004, pp. 7–13.