Research Article

Exploration and Exploitation Tradeoff using Fuzzy Reinforcement Learning

by Seyed Mohammad Hossein Nabavi, Somayeh Hajforoosh
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 59 - Number 5
Year of Publication: 2012
Authors: Seyed Mohammad Hossein Nabavi, Somayeh Hajforoosh
DOI: 10.5120/9545-3994

Seyed Mohammad Hossein Nabavi, Somayeh Hajforoosh. Exploration and Exploitation Tradeoff using Fuzzy Reinforcement Learning. International Journal of Computer Applications 59, 5 (December 2012), 26-31. DOI=10.5120/9545-3994

@article{10.5120/9545-3994,
  author     = {Seyed Mohammad Hossein Nabavi and Somayeh Hajforoosh},
  title      = {Exploration and Exploitation Tradeoff using Fuzzy Reinforcement Learning},
  journal    = {International Journal of Computer Applications},
  issue_date = {December 2012},
  volume     = {59},
  number     = {5},
  month      = {December},
  year       = {2012},
  issn       = {0975-8887},
  pages      = {26-31},
  numpages   = {6},
  url        = {https://ijcaonline.org/archives/volume59/number5/9545-3994/},
  doi        = {10.5120/9545-3994},
  publisher  = {Foundation of Computer Science (FCS), NY, USA},
  address    = {New York, USA}
}
%0 Journal Article
%A Seyed Mohammad Hossein Nabavi
%A Somayeh Hajforoosh
%T Exploration and Exploitation Tradeoff using Fuzzy Reinforcement Learning
%J International Journal of Computer Applications
%@ 0975-8887
%V 59
%N 5
%P 26-31
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Striking a balance between exploration and exploitation in a multiagent environment is a dilemma with no clear answer, and many different methods continue to be investigated to address it. In this paper, we provide a method based on fuzzy variables for managing exploration and exploitation in a multiagent environment. In this method, an effective exploration parameter (ε in the ε-greedy method) is obtained and updated using fuzzy variables at each step to manage the tradeoff between exploration and exploitation. The proposed algorithm is evaluated on finding an optimized path in the Grid World, where agents attempt to reach the locations with the highest gain in a cooperative environment. Outcomes of the suggested fuzzy-based algorithm are compared with the results of the conventional ε-greedy method. In addition, the improvement in the quality of the interaction between exploration and exploitation is discussed.
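The core mechanism described in the abstract, an ε-greedy policy whose ε is adapted on every step by fuzzy variables rather than held fixed, can be illustrated with a small sketch. The sketch below is only a minimal illustration of that general idea: the 4x4 grid layout, reward values, membership functions, and rule base (keyed to the recent average absolute temporal-difference error) are assumptions made for the example, not the authors' exact formulation.

import random

# Triangular and shoulder membership functions for the fuzzy sets.
def tri(x, a, b, c):
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def left_shoulder(x, a, b):
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    return (b - x) / (b - a)

def right_shoulder(x, a, b):
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

def fuzzy_epsilon(avg_abs_td):
    # Assumed rule base: LOW recent |TD error| -> exploit (small epsilon),
    # MEDIUM -> moderate epsilon, HIGH -> explore (large epsilon).
    low = left_shoulder(avg_abs_td, 0.1, 0.5)
    med = tri(avg_abs_td, 0.1, 0.6, 1.2)
    high = right_shoulder(avg_abs_td, 0.6, 1.2)
    # Weighted-average defuzzification over singleton outputs 0.05, 0.25, 0.5.
    den = low + med + high
    return (0.05 * low + 0.25 * med + 0.5 * high) / den if den > 0 else 0.1

# Toy 4x4 Grid World with a single rewarding goal cell (assumed layout).
SIZE, GOAL = 4, (3, 3)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(state, action):
    r, c = state
    dr, dc = ACTIONS[action]
    nxt = (min(max(r + dr, 0), SIZE - 1), min(max(c + dc, 0), SIZE - 1))
    reward = 10.0 if nxt == GOAL else -1.0
    return nxt, reward, nxt == GOAL

def run(episodes=300, alpha=0.1, gamma=0.95):
    Q = {(r, c): [0.0] * len(ACTIONS) for r in range(SIZE) for c in range(SIZE)}
    epsilon, recent = 0.3, []
    for _ in range(episodes):
        state, done = (0, 0), False
        while not done:
            # epsilon-greedy selection with the fuzzily adapted epsilon.
            if random.random() < epsilon:
                action = random.randrange(len(ACTIONS))
            else:
                action = max(range(len(ACTIONS)), key=lambda a: Q[state][a])
            nxt, reward, done = step(state, action)
            td = reward + gamma * max(Q[nxt]) * (not done) - Q[state][action]
            Q[state][action] += alpha * td
            recent = (recent + [abs(td)])[-50:]   # sliding window of |TD errors|
            epsilon = fuzzy_epsilon(sum(recent) / len(recent))
            state = nxt
    return Q, epsilon

if __name__ == "__main__":
    Q, final_eps = run()
    print("final adapted epsilon:", round(final_eps, 3))

In this sketch the fuzzy input is the recent average absolute TD error, so ε grows while the value estimates are still unreliable and shrinks as they stabilize; the paper's actual fuzzy variables may differ, but the structure of updating ε on every step matches what the abstract describes.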

Index Terms

Computer Science
Information Sciences

Keywords

Reinforcement learning, Multiagent environment, Balance between exploration and exploitation, Q-Learning