
Transfer Learning Approach for Fast Convergence of Deep Q Networks in Game Pong

International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Year of Publication: 2018
Authors:
Baomin Shao, Xue Jiang, Qiuling Li
DOI: 10.5120/ijca2018917925

Baomin Shao, Xue Jiang and Qiuling Li. Transfer Learning Approach for Fast Convergence of Deep Q Networks in Game Pong. International Journal of Computer Applications 181(21):11-14, October 2018.

BibTeX

@article{10.5120/ijca2018917925,
	author = {Baomin Shao and Xue Jiang and Qiuling Li},
	title = {Transfer Learning Approach for Fast Convergence of Deep Q Networks in Game Pong},
	journal = {International Journal of Computer Applications},
	issue_date = {October 2018},
	volume = {181},
	number = {21},
	month = {Oct},
	year = {2018},
	issn = {0975-8887},
	pages = {11-14},
	numpages = {4},
	url = {http://www.ijcaonline.org/archives/volume181/number21/30008-2018917925},
	doi = {10.5120/ijca2018917925},
	publisher = {Foundation of Computer Science (FCS), NY, USA},
	address = {New York, USA}
}

Abstract

By simulating psychological and neurological learning processes, deep reinforcement learning has come to play an important role in the development and application of artificial intelligence, aided by the powerful feature-representation capability of deep neural networks. The deep Q network (DQN), which improves on traditional RL methods by moving beyond value-function approximation and policy search built on shallow structures, combines hierarchical feature extraction with accurate Q-value approximation in high-dimensional sensing environments.
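As a rough sketch of such a network (an assumption based on the widely used Atari DQN design, not an architecture taken from this paper), convolutional layers perform the hierarchical feature extraction while a small fully connected head produces one Q-value per action, trained against the standard one-step Bellman target:

import torch
import torch.nn as nn

class DQN(nn.Module):
    """Convolutional Q-network: stacked game frames in, one Q-value per action out."""
    def __init__(self, n_actions, in_channels=4):
        super().__init__()
        self.features = nn.Sequential(            # hierarchical feature extraction
            nn.Conv2d(in_channels, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
        )
        self.head = nn.Sequential(                # Q-value approximation
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),  # 7x7 assumes 84x84 input frames
            nn.Linear(512, n_actions),
        )

    def forward(self, x):
        return self.head(self.features(x))

def q_loss(online, target, batch, gamma=0.99):
    """Standard DQN loss: fit Q(s, a) toward r + gamma * max_a' Q_target(s', a')."""
    s, a, r, s2, done = batch
    q = online(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        q_next = target(s2).max(dim=1).values
        y = r + gamma * (1.0 - done) * q_next   # bootstrap only for non-terminal s'
    return nn.functional.smooth_l1_loss(q, y)

It is exactly this Q-loss whose slow convergence, as reported below, motivates the transfer learning approach.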

In this paper, DQN is applied to playing the game Pong. It was found, however, that even after adjusting hyperparameters (network architecture, exploration rate, learning rate), the Q-values did not converge easily, and this lack of convergence of the Q-loss may be the factor limiting game-playing performance. A transfer learning approach is therefore adopted for fast convergence of DQN in Pong, with several image-evaluation measures used as rewards for training. Experiments show that this approach yields fast convergence of DQN training and that the resulting network plays Pong well.
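One common way to realise such a transfer (a minimal sketch under assumed details; the paper's exact scheme may differ) is to initialise the Pong network from weights trained on a source task and fine-tune only the Q-value head, which sharply reduces the number of parameters the Q-loss must fit. Reusing the DQN class sketched above, and assuming the checkpoint path and saved state_dict are hypothetical:

import torch

# Hypothetical checkpoint: a state_dict saved from a DQN trained on a source task.
pretrained = torch.load("dqn_source_task.pt")

net = DQN(n_actions=6)                           # the ALE version of Pong has 6 actions
net.load_state_dict(pretrained, strict=False)    # copy matching layers, skip the rest

# Freeze the transferred convolutional features; train only the Q-value head,
# so training adapts far fewer parameters and the Q-loss converges faster.
for p in net.features.parameters():
    p.requires_grad = False
optimizer = torch.optim.Adam(
    (p for p in net.parameters() if p.requires_grad), lr=1e-4)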


Keywords

DQN, Transfer Learning, Game Pong, Image Evaluation