# Design and Implementation of OFDM Transceiver in FPGA-in-the-Loop Configuration onto an FPGA using MATLAB/SIMULINK

Rehan Muzammil, Senior Member, IEEE, Electronics Engineering Department, ZHCET, Aligarh Muslim University, Aligarh, India

# ABSTRACT

OFDM multicarrier modulation is well suited to channels with a particular kind of fading, known as frequency-selective fading. OFDM systems also provide spectral efficiency, the most crucial factor for the new radio waveform of beyond-5G mobile systems. This paper describes the implementation of an OFDM transceiver on an FPGA using a MATLAB/SIMULINK model with 16-QAM multiplexing. This model-based design and development saves much of the time otherwise spent on coding and debugging, because the tool automatically generates efficient HDL code to be run on the FPGA.

### **General Terms**

Software Defined Radio, Quadrature Amplitude Modulation, Field Programmable Gate Arrays, Hardware Description Language, Orthogonal Frequency Division Multiplexing.

### **Keywords**

Model-Based Design, OFDM, 16-QAM, FPGA, MATLAB, SIMULINK, HDL.

## **1. INTRODUCTION**

After decades of research and development, the applications of and demand for 5<sup>th</sup>-generation wireless networks are increasing daily. The millimeter-wave and lower frequency bands are expected to deliver short-range and long-range high-speed radio access of the order of tens of Gbps. The next-generation wireless networks pose a challenge for wireless communication engineers and scientists. LTE functionality and data transfer rates have been enhanced to meet the demand for data. Researchers recently proposed 10-, 16-, or 20-antenna systems to increase the channel capacity and perform massive MIMO operations. Likewise, for advanced 4x4 MIMO operation, four-antenna systems are proposed for 5G-enabled smartphones. Cyclic-prefix OFDM (CP-OFDM) is the most important factor in the success of 4G LTE systems [1].

Because of the growth of mobile devices and multimedia services, mobile traffic is also increasing dramatically. Hence, it is becoming challenging for contemporary mobile communication systems to support the mobile traffic that will be required in the future. Studies on beyond-5G (B5G) mobile communications have been conducted to address this. For this, several multicarrier schemes have been developed, since conventional orthogonal frequency division multiplexing (OFDM) has high out-of-band (OOB) power [2].

5G and beyond mobile communication systems are expected to support applications requiring high-speed data communication, such as mobile broadband, as well as many connected devices and ultra-low-latency, high-reliability applications. Advanced modulation, coding, and multiple-access schemes are developing technologies that improve the spectral efficiency of 5G and beyond waveforms [3].

Because of its resistance to the inter-symbol interference (ISI) caused by frequency-selective channels, OFDM is widely used in wired and wireless broadband communication systems; it requires only a simple one-tap equalizer at the receiver. Moreover, OFDM has gained popularity as the modulation scheme for optical systems since it has better optical power efficiency than conventional schemes. Since bipolar signals cannot be sent in intensity-modulated optical wireless systems, because light intensity cannot be negative, OFDM signals designed for such applications must be real and non-negative [4].

Coherent optical OFDM (CO-OFDM) provides a more practical solution for high-capacity, long-haul transmission. The disadvantage of such a scheme is its significant overhead, which lowers the spectral efficiency and is mainly caused by the cyclic prefix added to tolerate ISI [5].

The use of different types of wavelet transforms (WTs) is an advanced modulation technique for OFDM that can be applied to various wired and wireless communication systems. Conventional OFDM exclusively uses Fourier transforms to generate the orthogonal subcarriers, while WT-based OFDM uses wavelet filter banks (FBs) for the same objective. Instead of the Discrete Fourier Transform (DFT) and the Inverse Discrete Fourier Transform (IDFT), the wavelet packet transform (WPT) is used in coherent optical OFDM (CO-OFDM) systems, which permits the exclusion of a cyclic prefix (CP) and significantly increases both the transmission distance and the system spectral efficiency [6].

Next-generation wireless systems are proposed for Intelligent Transportation Systems (ITS). Applications such as broadband wireless internet access, digital television, audio broadcasting, and video conferencing use ITS for wideband digital communications. OFDM transmits data at high rates over extremely hostile channels with comparatively low complexity. Various combinations of OFDM with antenna configurations have emerged: single-input single-output (SISO-OFDM), single-input multiple-output (SIMO-OFDM), multiple-input single-output (MISO-OFDM), and multiple-input multiple-output (MIMO-OFDM) [7]. To multiplex the signals and transmit them simultaneously over several subcarriers, a conventional OFDM system uses IFFT and FFT algorithms at the transmitter and receiver, respectively. To minimize ISI, it adds a cyclic prefix (CP) that is longer than the delay spread of the channel. However, the CP has the disadvantage of reducing the spectral containment of the channels [8].

Owing to its flexible resource allocation and simple equalization processing in the time and frequency domains, OFDM is used in many wireless communication systems, such as 4th-generation (4G) mobile communications, wireless local area networks (WLAN), and digital video broadcasting-satellite-second generation (DVB-S2). Moreover, OFDM has been adopted in non-standalone 5G mobile communication [9].

OFDM is used for transmitting high-rate data streams with power efficiency and fading immunity. Conventional OFDM systems use IFFT and FFT algorithms at the transmitter and receiver, respectively, to multiplex the signals and transmit them simultaneously over many subcarriers. To minimize inter-symbol interference (ISI), the system employs a cyclic prefix (CP) that is longer than the delay spread of the channel. However, the CP has the disadvantage of reducing the spectral containment of the channels [10].

Section 2 describes the system architecture. Section 3 illustrates the 16-QAM multiplexing, and Section 4 illustrates the 16-QAM demultiplexing. Section 5 describes the hardware implementation of the system on the Artix-7 FPGA, and Section 6 presents the real-time results. Section 7 draws the conclusions.

## **2. SYSTEM ARCHITECTURE**

The OFDM transceiver is first modeled in MATLAB-SIMULINK, as illustrated in Figures 1 & 2.



Fig 1: OFDM-QAM-16 Top Model.

Figure 1 shows that the data is generated randomly using a pseudo-random noise generator. The data is then passed through the primary transceiver subsystem, depicted in Figure 1 as OFDM-QAM-16. The data detected by this subsystem is passed to the vector scope for viewing. The input and output bitstreams are then observed on the vector scope to find any errors in the transmission.

Figure 2 shows that the data input is passed through the serial-to-parallel converter with an output-to-input data ratio of 8/1. This 8-bit output is then passed through the BLOCK-CODER subsystem, illustrated in Figure 3. The 8-bit input is broken up into two 4-bit inputs that are fed to two (8, 4) Block Coders running in parallel. Theoretically, the number of bits that can be corrected in the output 8-bit word from the BLOCK-DECODER block is two; hence, up to 25% of the bits in a word can be corrected. Figure 4 illustrates the BLOCK-DECODER, where the input 16-bit stream is first broken up into two 8-bit streams that are input to the two (8, 4) Block Decoders running in parallel. Each decoder output is a 4-bit wide word; the two words are concatenated into an 8-bit word and fed to the parallel-to-serial converter with an input-to-output ratio of 8/1. This single bitstream is the received bitstream provided to the vector scope.
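As a rough illustration of the coder and decoder data path described above, the following Python sketch splits an 8-bit word into two 4-bit halves, encodes each half with an (8, 4) block code, and decodes the resulting 16-bit word. The specific code used here, an extended Hamming code that corrects one bit error per 8-bit codeword, is an assumption made for illustration only; the actual Block Coder and Decoder are those described in [11], and all function names are hypothetical.

```python
# Sketch only: an extended Hamming (8, 4) code is ASSUMED here; the
# Block Coder/Decoder of the paper may use a different (8, 4) code.


def encode84(d):
    """Encode 4 data bits into an 8-bit extended Hamming codeword."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p4 = d2 ^ d3 ^ d4
    c = [p1, p2, d1, p4, d2, d3, d4]  # Hamming positions 1..7
    p0 = 0
    for b in c:
        p0 ^= b  # overall parity bit
    return c + [p0]


def decode84(r):
    """Decode an 8-bit codeword, correcting at most one bit error."""
    c = list(r[:7])
    syndrome = ((c[0] ^ c[2] ^ c[4] ^ c[6])
                + 2 * (c[1] ^ c[2] ^ c[5] ^ c[6])
                + 4 * (c[3] ^ c[4] ^ c[5] ^ c[6]))
    overall = 0
    for b in r:
        overall ^= b
    if syndrome and overall:
        c[syndrome - 1] ^= 1  # single error at position `syndrome`
    return [c[2], c[4], c[5], c[6]]


def block_coder(word8):
    """8-bit word -> 16-bit word (two codewords, as if run in parallel)."""
    return encode84(word8[:4]) + encode84(word8[4:])


def block_decoder(word16):
    """16-bit word -> 8-bit word (two decoders, as if run in parallel)."""
    return decode84(word16[:8]) + decode84(word16[8:])


word = [1, 0, 1, 1, 0, 0, 1, 0]
coded = block_coder(word)
coded[2] ^= 1    # one bit error in the first codeword
coded[13] ^= 1   # one bit error in the second codeword
assert block_decoder(coded) == word
```

Under this assumption, one error per codeword is corrected, which gives the two corrected bits per decoded 8-bit word mentioned above.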

## **3. 16-QAM MULTIPLEXING**

The up-sampling block before the DMUX block is used to insert the cyclic prefix into the input stream. The 2-bit chunks from the DMUX block are fed to the in-phase and quadrature-phase multiplexing subsystems. These are identical blocks and are depicted in Figure 5. As the figure shows, the input to each of the in-phase and quadrature-phase subsystems is a 2-bit data stream. The multiplexing is performed as shown in Table 1.

Table 1. The 16-QAM Multiplexing (in-phase & quadrature phase)

| Gray Coded Data | Multiplexed Output |
|-----------------|--------------------|
| 00              | -3                 |
| 01              | -1                 |
| 11              | +1                 |
| 10              | +3                 |

The output of this subsystem takes one of the values shown in Table 1. It is then fed to the next block, the S/P converter, which converts the input stream into a parallel stream of order 16, because the IFFT block used here has a length of 16 (with the Streaming Radix 2^2 architecture). This block is depicted in Figure 6. Here it can be assumed that the data is first moved from the time domain into the frequency domain by the effect of multiplexing and is moved back to the time domain after passing through the IFFT block. This time-domain data is transmitted over the channel.
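The mapping of Table 1 followed by the 16-point IFFT can be sketched in Python as follows. This is a minimal floating-point sketch, not the fixed-point HDL implementation: the assignment of the first di-bit to the in-phase branch, the 1/N scaling of the inverse transform, and the function names are assumptions made for illustration.

```python
import cmath

# Table 1: Gray-coded di-bit -> amplitude level (same for I and Q branches).
GRAY_TO_LEVEL = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}


def map_16qam(bits4):
    """Map 4 bits to one complex 16-QAM symbol.

    ASSUMPTION: the first di-bit drives the in-phase branch and the
    second di-bit drives the quadrature branch.
    """
    i_level = GRAY_TO_LEVEL[tuple(bits4[:2])]
    q_level = GRAY_TO_LEVEL[tuple(bits4[2:])]
    return complex(i_level, q_level)


def idft(freq):
    """Direct N-point inverse DFT with 1/N scaling (scaling is assumed)."""
    n_pts = len(freq)
    return [
        sum(freq[k] * cmath.exp(2j * cmath.pi * k * n / n_pts)
            for k in range(n_pts)) / n_pts
        for n in range(n_pts)
    ]


def ofdm_symbol(bits64):
    """Map 64 bits onto 16 subcarriers and return 16 time-domain samples."""
    symbols = [map_16qam(bits64[i:i + 4]) for i in range(0, 64, 4)]
    return idft(symbols)


assert map_16qam([0, 0, 1, 0]) == complex(-3, 3)
samples = ofdm_symbol([1, 1, 1, 1] * 16)
assert len(samples) == 16
```

A direct DFT sum is used instead of a radix-2^2 FFT because, for 16 points, it computes the same values while keeping the sketch short.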

The channel here is taken to be ideal. If the data were passed through a Rayleigh fading channel, the channel would corrupt the first few symbols in the input data stream, because the channel's impulse response convolves with the data stream. To prevent this, a cyclic prefix is added to the input data stream: the channel is allowed to corrupt the prefix, and the corrupted first few symbols are removed when the data stream is down-sampled. This is achieved by up-sampling sixteen times. The next block, which starts the receiver portion of the system, is the FFT block depicted in Figure 7. Here, the data is assumed to move from the time domain into the frequency domain. This data is then fed to the in-phase and quadrature-phase demultiplexers to move the data back to the time domain and recover the original bit stream.
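The role of the cyclic prefix can be illustrated with the following Python sketch. It uses the textbook form of the cyclic prefix (the last samples of each block copied to the front), not the up-sampling arrangement of this model, and a hypothetical 3-tap channel; the function names and the one-tap equalizer are likewise assumptions for illustration. The sketch shows that, once the prefix is discarded, the channel reduces to one complex gain per subcarrier.

```python
import cmath


def dft(x, inverse=False):
    """Direct N-point DFT; the inverse transform carries the 1/N scaling."""
    n_pts = len(x)
    sign = 1 if inverse else -1
    scale = n_pts if inverse else 1
    return [
        sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n_pts)
            for m in range(n_pts)) / scale
        for k in range(n_pts)
    ]


def add_cp(block, cp_len):
    """Prepend the last cp_len samples of the block (textbook cyclic prefix)."""
    return block[-cp_len:] + block


def channel(samples, taps):
    """Linear convolution with the channel taps, truncated to the input length."""
    return [
        sum(taps[j] * samples[n - j] for j in range(len(taps)) if n - j >= 0)
        for n in range(len(samples))
    ]


def receive(samples, taps, cp_len, n_pts=16):
    """Drop the prefix, take the DFT, and apply a one-tap equalizer."""
    block = samples[cp_len:cp_len + n_pts]
    freq = dft(block)
    h_freq = dft(list(taps) + [0] * (n_pts - len(taps)))
    return [y / h for y, h in zip(freq, h_freq)]


# 16 subcarrier symbols drawn from the 16-QAM levels -3, -1, +1, +3.
sent = [complex(-3 + 2 * (k % 4), 3 - 2 * (k % 4)) for k in range(16)]
taps = [1.0, 0.5, 0.25]  # hypothetical channel impulse response
tx = add_cp(dft(sent, inverse=True), cp_len=4)
recovered = receive(channel(tx, taps), taps, cp_len=4)
assert all(abs(a - b) < 1e-9 for a, b in zip(sent, recovered))
```

The prefix length of 4 is chosen only so that it exceeds the channel memory of two samples; only the prefix samples are affected by the start-up transient of the convolution.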

International Journal of Computer Applications (0975 – 8887) Volume 185 – No. 33, September 2023















Fig 5: The 16-QAM Multiplexing for the In-phase and Quadrature-phase subsystems.

| IFFT block parameter            | Value                          |
|---------------------------------|--------------------------------|
| FFT length                      | 16                             |
| Architecture                    | Streaming Radix 2^2            |
| Complex multiplication          | Use 4 multipliers and 2 adders |
| Output in bit-reversed order    | (checkbox option)              |
| Input in bit-reversed order     | (checkbox option)              |
| Divide butterfly outputs by two | (checkbox option)              |

Fig 6: IFFT-16 Block.

| FFT block parameter             | Value                          |
|---------------------------------|--------------------------------|
| FFT length                      | 16                             |
| Architecture                    | Streaming Radix 2^2            |
| Complex multiplication          | Use 4 multipliers and 2 adders |
| Output in bit-reversed order    | (checkbox option)              |
| Input in bit-reversed order     | (checkbox option)              |
| Divide butterfly outputs by two | (checkbox option)              |

Fig 7: FFT-16 Block.

The multiplexed data thus received is passed through the demultiplexing block described in the next section.

## **4. 16-QAM DEMULTIPLEXING**

After the MUX in the receiver section of the transceiver, the down-sampling block is used to remove the cyclic prefix and, with it, the corrupted data in the data stream. Demultiplexing is performed as shown in Table 2. This table shows the three thresholds used for detecting the data: -2, 0, and +2. The in-phase and quadrature-phase demultiplexing subsystem is depicted in Figure 8.



Fig 8: The In-phase and Quadrature phase Demultiplexer.

Table 2. The 16-QAM Demultiplexing (in-phase & quadrature phase)

| Multiplexed Input | Demultiplexed Data |
|-------------------|--------------------|
| symbol <= -2      | 00                 |
| -2 < symbol <= 0  | 01                 |
| 0 < symbol <= +2  | 11                 |
| symbol > +2       | 10                 |

Figure 8 shows that the input symbol is passed through a series of logic operations, and the subsystem output is obtained as shown in Table 2. This output is a di-bit; concatenating the di-bits from the in-phase and quadrature-phase subsystems makes up the 4-bit 16-QAM symbol. This bit stream is passed through the down-sampler, which removes the cyclic prefix. The input to the demultiplexing subsystem is signed fixed-point data, which is converted into Boolean data. The output from this subsystem is then passed on to the S/P converter, which converts it into a 16-bit stream that is fed to the Block Decoder subsystem for error removal. The Block Coder and Decoder sections are described in [11].
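The threshold logic of Table 2 can be written compactly, as in the following Python sketch. The function names are hypothetical, and the ordering of the output bits (in-phase di-bit first) is an assumption; the model itself operates on signed fixed-point data rather than floating-point values.

```python
def demap_level(symbol):
    """Threshold detector of Table 2: amplitude -> Gray-coded di-bit."""
    if symbol <= -2:
        return [0, 0]
    if symbol <= 0:
        return [0, 1]
    if symbol <= 2:
        return [1, 1]
    return [1, 0]


def demap_16qam(symbol):
    """Complex symbol -> 4 bits (ASSUMPTION: I di-bit first, then Q di-bit)."""
    return demap_level(symbol.real) + demap_level(symbol.imag)


# Ideal constellation points map back to their Gray-coded di-bits.
assert demap_16qam(complex(-3, 3)) == [0, 0, 1, 0]
# A perturbed symbol is still detected correctly while it stays
# on the same side of the thresholds.
assert demap_16qam(complex(0.8, -1.4)) == [1, 1, 0, 1]
```

Because the thresholds lie midway between adjacent levels, a perturbation of less than one amplitude unit on either branch does not change the detected di-bit.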

## **5. HARDWARE IMPLEMENTATION ON ARTIX-7 FPGA**

The processes involved in the hardware implementation of the Model are illustrated in Figure 9.



Fig 9: Hardware Workflow Advisor for FPGA-in-the-Loop.

The first process, 1.1, in this series is depicted in Figure 10. This figure shows the synthesis tool used for generating the VHDL files for this system and the target device on which the ".bit" file is run. Here, the synthesis tool is Xilinx Vivado 2020.2, and the target device is the Nexys4 Artix-7 FPGA board.

All the processes shown in Figure 9 must run successfully before the final bit file is produced for downloading onto the FPGA board. The final stage of the process is shown in Figure 11.

| 1.1. Set Target Device and Synthesis Tool: input parameter | Value                     |
|------------------------------------------------------------|---------------------------|
| Target workflow                                            | FPGA-in-the-Loop          |
| Target platform                                            | Nexys4 Artix-7 FPGA board |
| Synthesis tool                                             | Xilinx Vivado             |
| Tool version                                               | 2020.2                    |
| Family                                                     | Artix7                    |
| Device                                                     | xc7a100t                  |
| Package                                                    | csg324                    |
| Speed                                                      | -1                        |
| Project folder                                             | hdl_prj                   |

Fig 10: Set Target Device and Synthesis Tool.

Figure 11 illustrates that all the processes under FPGA-in-the-Loop are run successfully.



Fig 11: Hardware Workflow Advisor for FPGA-in-the-Loop Completed processes.



The generated Model at the last stage is illustrated in Figure 12.

Fig 12: FPGA-in-the-Loop Model Generated for Implementation on the FPGA.

When this model is run, real-time results are obtained; these are illustrated in the next section.

## **6. REAL-TIME RESULTS**

The real-time results of the OFDM transceiver are illustrated in Figure 13. This figure shows that the transmitted and the received bitstreams are identical, so there is no error in transmission.

In the figure, the transmitted bitstream is the middle waveform and the received bitstream is the upper waveform. The lowermost waveform is the error between the two, which is zero throughout.
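The error waveform described above corresponds to a bitwise comparison of the two streams. A minimal Python sketch of such a check is given below; it assumes that the transmitted and received bitstreams have already been aligned in time (any processing delay of the receiver has been compensated), and the function name is hypothetical.

```python
def bit_errors(tx_bits, rx_bits):
    """Return the error sequence (XOR of the streams) and the bit error rate.

    ASSUMPTION: both streams are already time-aligned and of equal length.
    """
    if len(tx_bits) != len(rx_bits):
        raise ValueError("bitstreams must have the same length")
    error = [t ^ r for t, r in zip(tx_bits, rx_bits)]
    return error, sum(error) / len(error)


tx = [1, 0, 1, 1, 0, 0, 1, 0]
error, ber = bit_errors(tx, list(tx))
assert error == [0] * 8 and ber == 0.0
```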



Fig 13: Real-time Results.

## **7. CONCLUSIONS**

The OFDM transceiver is realized in MATLAB/SIMULINK, and the hardware implementation of the transceiver is performed on the Nexys4 Artix-7 FPGA. The results show that the transceiver works correctly: the FEC and the 16-QAM multiplexing and demultiplexing systems realized on the FPGA operate as intended, and the real-time result does not show any error. The model-based design and development saves a lot of time in this process. In the future, this model can be extended to higher-order QAM such as 64-QAM or 256-QAM, which will require FPGAs with more resources. Moreover, longer IFFT-FFT lengths could be used for better performance, but these would also require FPGAs with more resources.

## **8. REFERENCES**

- [1] Md. Hasan Mahmud, MD. Maruf Hossain, Abid Anjum Khan, and Shanjid Ahmed, "Performance Analysis of OFDM, W-OFDM and F-OFDM Under Rayleigh Fading Channel for 5G Wireless Communication", 2020 3rd International Conference on Intelligent Sustainable Systems (ICISS), 03-05 December 2020, Thoothukudi, India.
- [2] Changyoung An and Heung-Gyoon Ryu, "CPW-OFDM (Cyclic Postfix Windowing OFDM) for the B5G (Beyond 5th Generation) Waveform", 2018 IEEE 10th Latin-American Conference on Communications (LATINCOM), 14-16 November 2018, Guadalajara, Mexico.
- [3] Keiichi Mizutani, Takeshi Matsumura, and Hiroshi Harada, "A comprehensive study of universal time-domain windowed OFDM-based LTE downlink system", 2017 20th International Symposium on Wireless Personal Multimedia Communications (WPMC), 17-20 December 2017, Bali, Indonesia.
- [4] Sarangi Devasmitha Dissanayake and Jean Armstrong, "Comparison of ACO-OFDM, DCO-OFDM and ADO-OFDM in IM/DD Systems", Journal of Lightwave Technology (Volume: 31, Issue: 7, April 2013), pp. 1063 – 1072.
- [5] Shanqin Deng, Xingwen Yi, Mingliang Deng, Zhengyu Luo, Qi Yang, Ming Luo, and Kun Qiu, "Reduced-Guard-Interval OFDM Using Digital Sub-Band-Demultiplexing", IEEE Photonics Technology Letters (Volume: 25, Issue: 22, November 2013), pp. 2174 – 2177.

- [6] Y. Ben-Ezra, D. Brodeski, and B. I. Lembrikov, "High spectral efficiency OFDM based on complex wavelet packets", 2014 16th International Conference on Transparent Optical Networks (ICTON), 06-10 July 2014, Graz, Austria.
- [7] M.M. Kamruzzaman, "Performance of Turbo coded wireless link for SISO-OFDM, SIMO-OFDM, MISO-OFDM and MIMO-OFDM system", 14th International Conference on Computer and Information Technology (ICCIT 2011), 22-24 December 2011, Dhaka, Bangladesh.
- [8] Khaizuran Abdullah, Saidatul Izyanie Kamarudin, Nadiatul Fatiha Hussin, and Sigit P. W. Jarrot, "Impulsive noise effects on DWT-OFDM versus FFT-OFDM", The 17th Asia Pacific Conference on Communications, 02-05 October 2011, Sabah, Malaysia.

- [9] Changyoung An, Dayoung Kim, and Heung-Gyoon Ryu, "Design and Performance Evaluation of Dual Mode OFDM-DIM and OFDM-CDIM Systems", 2018 21st International Symposium on Wireless Personal Multimedia Communications (WPMC), 25-28 November 2018, Chiang Rai, Thailand.
- [10] Khaizuran Abdullah, Amin Z. Sadik, and Zahir M. Hussain, "On the DWT- and WPT-OFDM versus FFT-OFDM", 2009 5th IEEE GCC Conference & Exhibition, 17-19 March 2009, Kuwait, Kuwait.
- [11] Rehan Muzammil, "Design and Implementation of 4-QAM Transceiver on FPGA using MATLAB/SIMULINK Model", 2023 International Conference for Advancement in Technology (ICONAT), 24-26 January 2023, Goa, India.