Research Article

Energy-Efficient Transformer Inference: Optimization Strategies for Time Series Classification

by Arshia Kermani, Ehsan Zeraatkar, Habib Irani
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 186 - Number 81
Year of Publication: 2025
Authors: Arshia Kermani, Ehsan Zeraatkar, Habib Irani
DOI: 10.5120/ijca2025924771

Arshia Kermani, Ehsan Zeraatkar, Habib Irani. Energy-Efficient Transformer Inference: Optimization Strategies for Time Series Classification. International Journal of Computer Applications. 186, 81 (Apr 2025), 1-9. DOI=10.5120/ijca2025924771

@article{10.5120/ijca2025924771,
author = {Arshia Kermani and Ehsan Zeraatkar and Habib Irani},
title = {Energy-Efficient Transformer Inference: Optimization Strategies for Time Series Classification},
journal = {International Journal of Computer Applications},
issue_date = {Apr 2025},
volume = {186},
number = {81},
month = {Apr},
year = {2025},
issn = {0975-8887},
pages = {1-9},
numpages = {9},
url = {https://ijcaonline.org/archives/volume186/number81/energy-efficient-transformer-inference-optimization-strategies-for-time-series-classification/},
doi = {10.5120/ijca2025924771},
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
Abstract

The increasing computational demands of transformer models in time series classification necessitate effective optimization strategies for energy-efficient deployment. Our study presents a systematic investigation of optimization techniques, focusing on structured pruning and quantization methods for transformer architectures. Through extensive experimentation on three distinct datasets (RefrigerationDevices, ElectricDevices, and PLAID), model performance and energy efficiency are quantitatively evaluated across different transformer configurations. Our experimental results demonstrate that static quantization reduces energy consumption by 29.14% while maintaining classification performance, and that L1 pruning achieves a 63% improvement in inference speed with minimal accuracy degradation. Our findings provide valuable insights into the effectiveness of optimization strategies for transformer-based time series classification, establishing a foundation for efficient model deployment in resource-constrained environments.
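
The two strategies highlighted in the abstract, structured L1 pruning and post-training quantization, are both available as standard PyTorch utilities. The sketch below illustrates how they could be applied to a small transformer encoder for time series classification; the model dimensions, the 40% pruning ratio, and the use of dynamic (rather than the paper's static, calibration-based) quantization are illustrative assumptions, not the authors' reported configuration.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune


class EncoderLayer(nn.Module):
    """Plain transformer encoder layer: self-attention followed by a feed-forward block."""

    def __init__(self, d_model=64, n_heads=4, d_ff=128):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.linear1 = nn.Linear(d_model, d_ff)
        self.linear2 = nn.Linear(d_ff, d_model)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):
        a, _ = self.attn(x, x, x, need_weights=False)
        x = self.norm1(x + a)
        f = self.linear2(torch.relu(self.linear1(x)))
        return self.norm2(x + f)


class TSTransformerClassifier(nn.Module):
    """Minimal transformer classifier for univariate time series (hypothetical sizes)."""

    def __init__(self, d_model=64, n_layers=2, n_classes=3):
        super().__init__()
        self.input_proj = nn.Linear(1, d_model)           # embed each time step
        self.layers = nn.ModuleList([EncoderLayer(d_model) for _ in range(n_layers)])
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x):                                 # x: (batch, seq_len, 1)
        z = self.input_proj(x)
        for layer in self.layers:
            z = layer(z)
        return self.head(z.mean(dim=1))                   # mean-pool over time


model = TSTransformerClassifier().eval()                  # assume weights are already trained

# Structured L1 pruning: zero the 40% of feed-forward weight rows with the
# smallest L1 norm, then make the pruning permanent (the ratio is an assumption).
for module in model.modules():
    if isinstance(module, EncoderLayer):
        for lin in (module.linear1, module.linear2):
            prune.ln_structured(lin, name="weight", amount=0.4, n=1, dim=0)
            prune.remove(lin, "weight")

# Post-training int8 quantization of the remaining nn.Linear layers. The paper
# evaluates static quantization; dynamic quantization is shown here because it is
# the simplest runnable PyTorch recipe for transformer linear layers.
quantized = torch.ao.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(8, 96, 1)                                  # dummy batch: 8 series of length 96
print(quantized(x).shape)                                  # torch.Size([8, 3])
```

A static-quantization variant would replace the `quantize_dynamic` call with PyTorch's prepare-calibrate-convert workflow, which additionally requires a representative calibration set. The energy and latency figures quoted in the abstract come from the authors' own measurements, not from this sketch.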

Index Terms

Computer Science
Information Sciences

Keywords

Energy Efficiency, Time Series Classification, Optimization, Quantization, Pruning