## Design of IIR Systolic Array Architecture by using Linear Mapping Technique

Manoj Kumar Department of ECE, NIT Manipur, Imphal, India

#### ABSTRACT

Design of IIR (infinite impulse response) systolic array architecture by using linear mapping technique is proposed in this paper. Systolic array architecture maps high level computations into hardware structures. In a systolic array, all the processing elements (PEs) are uniform and fully pipelined. On regular dependence graph, systolic architectures are designed by using linear mapping techniques. IIR filters are used in digital signal and image processing applications. IIR filters are recursive filters. IIR filters have high selectivity and less number of coefficients than the FIR (finite impulse response) filters. Various IIR systolic arrays architectures such as design B1, design B2, and design F is proposed in this paper. By selecting the projection vector, processor vector and scheduling vector these designs are derived.

#### General Terms

Linear mapping technique, Hardware utilization efficiency (HUE)

#### Keywords

Systolic array, DG, PEs, FIR, IIR, DSP

#### **1. INTRODUCTION**

Dependence of the computations in an algorithm (program) is represented by the dependence graph. It is a directed graph. Computations in an algorithm are represented by nodes in the dependence graph (DG). And precedence constraints among nodes are represented by edges in a dependence graph. For a given algorithm a new node is created whenever new computation is called in a dependence graph. In a DG, no node is ever reused on a single computation basis. DG does not contain any delay elements for storing and passing data from the current iteration to subsequent iteration. We can use the dependence graph for designing a systolic array. Systolic arrays consist set of processing elements (PEs) to represent a network. In such system data move between processing elements in a rhythmic fashion [1, 2].Each processor are independent for storing and computing data. To maintain a regular flow in a systolic array, processing elements regularly pump data into and outside of the systolic system. The coprocessor can be used as a systolic array. The coprocessor can be combined with a host computer for passing data through processing elements. And the final result is returned to the host computer for maintaining the regular flow of data in the systolic array. A general block diagram of systolic array is shown in the Fig. 1. Important properties of systolic arrays are modularity and regularity. IIR filters are recursive filter with fewer design parameters, less memory requirements and lower computations complexity than FIR (Finite impulse response) filters. IIR filters are more attractive for designing filters where there is no requirement for a linear -phase characteristics within the pass-band of a digital filter [3]. General IIR filter equation is given below [4].

$$y(n) = \sum_{k=1}^{N} a_{K} y(n-k) + \sum_{K=0}^{M} b_{K} x(n-k) (1.1)$$

In Eq.(1.1), N represents order of the filter. Many researchers have used IIR filters to implement image processing and digital processing systems. FPGA implementation of 1st-order 4D IIR frequency-hyperplane digital filters was proposed in 2011[8]. D.Dansereau and L.Bruton proposed a 4D frequency-planar IIR filter for light filed processing application [9]. Siji P.V and Manju Manuel used 2D IIR filters for directional filtering of temporally broadband bandpass space time plane waves [10]. An IIR filter design using CSA (Carry save adder) for DSP (Digital signal processing) applications was proposed in 2015[11]. In this design, high speed 2nd order infinite impulse response (IIR) filter was used to remove noise in signals. This paper is organized as follows. Section II discusses linear mapping technique. Section III presents the proposed different types of 1st order 1D IIR systolic arrays architectures. Section IV presents the proposed 2nd order 1D IIR systolic arrays architectures. In section V, the conclusion is presented.



#### Fig. 1: General block diagram of a systolic system

# 2. IIR SYSTOLIC ARRAY DESIGN METHODOLOGY

Linear mapping technique is used to design systolic array architecture [5, 6]. Linear mapping technique maps an Ndimensional dependence graph to a lower dimensional systolic architecture. If the presence of an edge in a certain direction at any node in the dependence graph represents the presence of an edge in the same direction at all nodes in the DG, then dependence graph is said to be regular. In a DG, at t=0, plane corresponds to a space representation where no time instance is assigned to any computation. Space representation is transformed to a space-time representation by using linear mapping technique. In space-time representation, each node is mapped to a certain processing element, and each node is executed at a certain time instance. Projection, processor space, and scheduling vectors are used to design systolic array and they are defined below [7].

Projection vector (d) is represented by the below matrix

$$d = \begin{pmatrix} d_1 \\ d_2 \end{pmatrix}$$

The same processor is used to execute two nodes, which are displaced by d or multiples of d.

Processor space vector (PT) is defined below

PT = [P1, P2]

A node IT = (i,j) will be executed by processor

 $PTI = [P1 P2]\binom{i}{i}$ 

Scheduling vector (ST) is represented by ST = (S1 S2). A node IT = (i,j) will be executed at time STI.

Hardware utilization efficiency (HUE) is defined as

$$HUE = \frac{1}{|S^{T}d|}$$

Two tasks executed by the same processor are spaced  $\left|ST \; d\right|$  time units.

By selecting different projection, processor space, and scheduling vectors many systolic architectures can be designed. Following constraints must be satisfied by these vectors.

PTd = 0, i.e. two nodes are executed by the same processor if they are differ by projection vector d.  $STd \neq 0$ , i.e. two nodes cannot be executed at the same time if they mapped to the same processor.

#### 3. 1<sup>St</sup> ORDER IIR SYSTOLIC ARRAYS

Different design based on linear mapping technique for 1<sup>st</sup> order IIR filter is discussed in this section.

#### 3.1 Design B<sub>1</sub>

Following values for projection vector, processor space and scheduling vector are taken for deriving the systolic design  $B_1$ .

$$\begin{split} & d = \binom{1}{0} \quad , PT = (0 \ 1) \ , ST = (1 \ 0) \\ & \text{So, for any given node } IT = (i,j) \\ & PTI = (0 \ 1) \binom{i}{j} = j \\ & \text{STI} = (1 \ 0) \binom{i}{j} = i \\ & \text{and } STd = (1 \ 0) \binom{1}{0} = 1 \\ & \text{So, } HUE = \frac{1}{|S^{T}d|} = 1 \end{split}$$

1st order 1D IIR filter is defined as

Y(n) = a1 y(n-1) + b0 x(n) + b1 x(n-1)

Block diagram for this filter is shown in Figure 2. Table 1 represents edge mapping for the design B1. For 1st order IIR filter, the block diagram for B1 systolic array design is shown in Figure 3 .Systolic array architecture for the design B1 is shown in Figure 4. And Figure 5, shows space-time representations diagram for the designing B1.From this figure, it is clear that input data is broadcast to the processors, weight values stay and output is appearing at the processors at different space and time.



Fig. 2: First order 1D IIR Filter



Fig. 3: Block diagram of the B<sub>1</sub> design



Fig. 4: Systolic array architecture of the design B<sub>1</sub>

 Table 1. Edge Mapping Table for Design B1



Fig. 5: space-time representation of the B<sub>1</sub> design

#### 3.2 Design B<sub>2</sub>

Following values for projection vector, processor space and scheduling vector are taken for deriving the systolic design B2

Table 2 represents edge mapping for the design  $B_2$ . For 1<sup>st</sup> order IIR filter, the block diagram for  $B_2$  design is shown in Figure 6. Systolic array architecture for the design  $B_2$  is shown in Figure 7. And Figure 8, shows space-time representations diagram for the designing B2. From this figure, it is clear that input data is broadcast to the processors, output stay and weight value appear to the processors at different space and time.

 e
  $\mathbf{P}^{T}$  e
  $\mathbf{S}^{T}$  e

 wt(1, 0)
 1
 1

 i/p (0, 1)
 1
 0

 result (1, -1)
 0
 1

Table 2: Edge Mapping for Design B<sub>2</sub>







Fig. 7: Systolic array architecture of the design B<sub>2</sub>



Fig. 8: Space-time representation of the B<sub>2</sub> design

#### 3.3 Design F

Following values for projection vector, processor space and scheduling vector are taken for deriving the systolic design F.

 $d = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, PT = (0 \ 1), ST = (1 \ 1)$ So, for any given node IT = (i, j) PTI = (0 \ 1) \begin{pmatrix} i \\ j \end{pmatrix} = j STI=(11)  $\begin{pmatrix} i \\ j \end{pmatrix}$ =i+j and STd = (1 \ 1)  $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$  = 1 So, HUE = 1 For the given 1st order 1D IIR filter

Y(n) = a1 y(n-1) + b0 x(n) + b1 x(n-1)

Table 3 represents edge mapping for the design F. For 1<sup>st</sup> order IIR filter, the block diagram for systolic array design F is shown in the Figure 9. Systolic array architecture for the design F is shown in Figure 10. And Figure 11, shows space-

time representations diagram for the designing F. From Table 3, it is clear that weight value stays, input vector moves from left to right with 1 delay element and output moves from right to left with no delay elements.

| Table 3: Edge Mapping for Design F |                  |                  |  |
|------------------------------------|------------------|------------------|--|
| e                                  | P <sup>T</sup> e | S <sup>T</sup> e |  |
| wt(1, 0)                           | 0                | 1                |  |
| i/p (0, 1)                         | 1                | 1                |  |
| result (1, -1)                     | -1               | 0                |  |



Fig.9: Block diagram of the F design



Fig. 10: Systolic array architecture of the design F



Fig. 11: Space-time representation of the F design

### 4. 2<sup>nd</sup> ORDER 1D IIR SYSTOLIC

#### ARRAYS

Different design based on linear mapping technique for 2nd order IIR filter is discussed in this section.

#### 4.1 Design B1

Following values for projection vector, processor space and scheduling vector are taken for deriving the systolic design B1.

 $d = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, PT = (0 \ 1), ST = (1 \ 0)$ So, for any given node IT = (i,j) PTI = (0 \ 1) \begin{pmatrix} i \\ j \end{pmatrix} = j  $STI=(10)\binom{i}{j}=i$ and  $STd = (1 \ 0)\binom{1}{0} = 1$ So,  $HUE = \frac{1}{|S^{T}d|} = 1$ 2nd order 1D IIR filter is defined as  $Y(n) = a1y(n-1)+a2 \ y(n-2)+b0 \ x(n)+b1x(n-1)$ 

Block diagram for this filter is shown in Figure 12. Table 4 represents edge mapping for the design B1. For 2nd order IIR filter, the block diagram for B1 design is shown in Figure 13. Systolic array architecture for the design B1 is shown in Figure 14. And Figure 15, shows space-time representations diagram for the designing B1.From this figure, it is clear that input data is broadcast to the processors, weight values stay and output is appearing at the processors at different space and time.



Fig.12: Second order 1D IIR Filter

 Table 4: Edge Mapping Table for Design B1 (2<sup>nd</sup> order)



Fig.13: Block diagram of the B<sub>1</sub> design (2<sup>nd</sup> order)



Fig. 14: Systolic array architecture of the design B<sub>1</sub>(2<sup>nd</sup> order)



Fig. 15: Space-time representation of the B<sub>1</sub>(2<sup>nd</sup> order) design

#### 4.2 Design B<sub>2</sub>

Following values for projection vector, processor space and scheduling vector are taken for deriving the systolic design B2.

d =  $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$ , PT = (1 1), ST = (1 0) So, for any given node IT = (i,j) PTI = (1 1) $\begin{pmatrix} i \\ j \end{pmatrix}$  = i+j STI=(10) $\begin{pmatrix} i \\ j \end{pmatrix}$ =i and STd = (1 0) $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$  = 1 So, HUE = 1 For the given 2nd order 1D IIR filter Y(n) = a1y(n-1)+a2 y(n-2)+b0 x(n)+ b1x(n-1) Table 5 represents edge mapping for the desi

Table 5 represents edge mapping for the design B2. For 2nd order IIR filter, the block diagram for B2 design is shown in Figure 16. Systolic array architecture for the design B2 is shown in Figure 17. And Figure 18, shows space-time representations diagram for the designing B2. From this figure, it is clear that input data is broadcast to the processors, output stay and weight value appear to the processors at different space and time.

 Table 5: Edge Mapping Table for Design B<sub>2</sub> (2<sup>nd</sup> order)

| e               | P <sup>T</sup> e | S <sup>T</sup> e |  |  |
|-----------------|------------------|------------------|--|--|
| wt(1, 0)        | 1                | 1                |  |  |
| i/p (0, 1)      | 1                | 0                |  |  |
| result (1, -1)  | 0                | 1                |  |  |
| weight<br>input |                  |                  |  |  |

Fig. 16: Block diagram of the B<sub>2</sub> design (2<sup>nd</sup> order)



Fig. 17: Systolic array architecture of the design  $B_2(2^{nd}$ 



Fig. 18: Space-time representation of the  $B_2(2^{nd} \text{ order})$ design

#### 4.3 Design F

Following values for projection vector, processor space and scheduling vector are taken for deriving the systolic design F.  $d = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ ,  $PT = (0 \ 1)$ ,  $ST = (1 \ 1)$ So, for any given node IT = (i,j)

PTI =  $(0 \ 1) {i \choose i} = j$ 

 $STI=(11)\binom{i}{i}=i+j$ 

and STd =  $(1 \ 1) \begin{pmatrix} 1 \\ 0 \end{pmatrix} = 1$ 

So, HUE = 1

For the given 2nd order 1D IIR filter

$$\begin{split} Y(n) &= a1y(n-1) + a2\ y(n-2) + b0\ x(n) + \ b1x(n-1) \\ \text{Table 6 represents edge mapping for the design F. For 2nd} \end{split}$$

order IIR filter, the block diagram for design F is shown in Figure 19.Systolic array architecture for the design F is shown in Figure 20. And Figure 21, shows space-time representations diagram for the designing F. From Table 6, it is clear that weight value stays, input vector moves from left to right with 1 delay element and output moves from right to left with no delay elements.

| Table 6: Edge Mapping Table for Design F (2 <sup></sup> order) |                  |                  |  |
|----------------------------------------------------------------|------------------|------------------|--|
| e                                                              | P <sup>T</sup> e | S <sup>T</sup> e |  |
| wt(1, 0)                                                       | 0                | 1                |  |
| i/p (0, 1)                                                     | 1                | 1                |  |
| result $(1, -1)$                                               | -1               | 0                |  |



Fig. 19: Block diagram of the F design (2<sup>nd</sup> order)



Fig. 20: Systolic array architecture of the design F (2<sup>nd</sup> order)



Fig. 21: Space-time representation of the F (2<sup>nd</sup> order) design

#### 5. CONCLUSION

In this paper, different IIR systolic arrays are proposed.1<sup>st</sup> order and 2<sup>nd</sup> order IIR filter examples are taken for designing the systolic arrays. Systolic array architectures for the different design are shown in the figures. For the proposed designs, space -time representation figures are presented. Also, edge mapping tables for IIR systolic array design are presented. Hardware Utilization Efficiency of the proposed IIR systolic array was one.

#### 6. REFERENCES

- H.T.Kung and C.E.Leiserson, "Systolic arrays (for VLSI)," in Sparse Matrix Symposium, SIAM, pp.256-282, 1978.
- [2] H.T.Kung," Why systolic architectures? IEEE Commuters Magazine, vol.15, pp.37-45, Jan. 1982.
- [3] Dimitris G. Manolakis, John G. Proakis, Digital Signal Processing: Principles, Algorithm, and Applications", Macmillan Publishing Company, 1992.
- [4] G.Ramana Murthy, C.Senthilpari, P.velrajkumar, Lim Tien sze, "Design of Low Power and High Speed Digital IIR Filter in 45nm with Optimized CSA for Digital Signal Processing Applications", International Journal of Electrical and Computer Engineering, vol.7, no.4, 2013.
- [5] S.Y.Kung, VLSI Array Processors. Prentice Hall, 1998.
- [6] H. V. Jagadish, S. K.Rao, and T. Kailath, "Array architecture for iterative algorithm,"Proc. IEEE, vol.75, no.9, pp.1304-1321, Sept.1987.

International Journal of Computer Applications (0975 – 8887) Volume 182 – No. 39, February 2019

- [7] K.K. Parhi, VLSI Digital Signal Processing Systems: Design and Implementation. New York: Wiley, 1999.
- [8] A.Madanayake,Randeel Wimalagunarathne, Donald G. Dansereau and Len T.Bruton ,"Design and FPGAimplementation of 1<sup>st</sup>-order 4D IIR frequencyhyperplane digital filters, in Circuits and systems(MWSCAS),2011 IEEE 54<sup>th</sup> International Midwest Symposium on,pp.1-4,2011.
- [9] D.Dansereau and L.Bruton," A 4D freq\*uency –planar IIR filter and its application to light filed processing", ISCAS 2003, vol.4,pp.476-479,2003.
- [10] Siji P.V and Manju Manuel,"2D IIR Spatially Bandpass Beam Filter-A Multiplierless Realization" International Advanced Research Journal in Science, Engineering and Technology, vol.2, no.10, 2015.
- [11] Sagara K.S and Ravi L. S, "IIR filter design using CSA for DSP applications "International Journal for Research in Applied Science & Engineering Technology(IJRASET),vol.3,no.VI,2015.