Structure-Preserving Imitation Learning With Delayed Reward: An Evaluation Within the RoboCup Soccer 2D Simulation Environment

Nguyen, Quang Dang and Prokopenko, Mikhail (2020) Structure-Preserving Imitation Learning With Delayed Reward: An Evaluation Within the RoboCup Soccer 2D Simulation Environment. Frontiers in Robotics and AI, 7. ISSN 2296-9144

Official URL: https://doi.org/10.3389/frobt.2020.00123

Abstract

We describe and evaluate a neural network-based architecture designed to imitate and improve the performance of a fully autonomous soccer team in the RoboCup Soccer 2D Simulation environment. The approach uses a deep Q-network architecture for action determination and a deep neural network for parameter learning. The proposed solution is shown to be a feasible replacement for a selected behavioral module in a well-established RoboCup base team, Gliders2d, in which behavioral modules have been evolved with human experts in the loop. Furthermore, we introduce an additional performance-correlated signal (a delayed reward signal), enabling a search for local maxima during the training phase. The extension is compared against a known benchmark. Finally, we investigate the extent to which preserving the structure of expert-designed behaviors affects the performance of a neural network-based solution.

Item Type:	Article
Subjects:	Middle Asian Archive > Mathematical Science
Depositing User:	Managing Editor
Date Deposited:	30 Jun 2023 05:43
Last Modified:	19 Jun 2024 12:34
URI:	http://library.eprintglobalarchived.com/id/eprint/925
