A policy gradient reinforcement learning algorithm with fuzzy function approximation

Gu, Dongbing and Yang, Erfu; (2004) A policy gradient reinforcement learning algorithm with fuzzy function approximation. In: IEEE International Conference on Robotics and Biomimetics, 2004. ROBIO 2004. IEEE, CHN, pp. 936-940. ISBN 0780386148 (https://doi.org/10.1109/ROBIO.2004.1521910)

Full text not available in this repository.

Abstract

For complex systems, reinforcement learning has to be generalised from a discrete form to a continuous form because the state or action spaces are large. This paper investigates the generalisation of reinforcement learning to a continuous state space using a policy gradient approach, with fuzzy logic serving as the function approximator. To guarantee learning convergence, two approximators are employed, a policy approximator and a state-action value approximator, both based on fuzzy logic. The convergence of the learning algorithm is justified.
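
As a rough illustration of the two-approximator setup described in the abstract, the sketch below pairs a fuzzy policy approximator with a fuzzy state-action value approximator over a scalar continuous state. It is not taken from the paper: the normalised Gaussian membership functions, the softmax policy over a discrete action set, the SARSA-style critic update, the step sizes, and every function and parameter name are assumptions made for illustration only.

```python
import numpy as np


def fuzzy_features(s, centres, width):
    """Normalised Gaussian membership degrees of a scalar state s.

    Assumption: one Gaussian membership function per fuzzy rule.
    """
    mu = np.exp(-0.5 * ((s - centres) / width) ** 2)
    return mu / mu.sum()


def policy(theta, phi):
    """Softmax action probabilities; theta has shape (n_rules, n_actions)."""
    prefs = phi @ theta
    prefs = prefs - prefs.max()  # shift for numerical stability
    p = np.exp(prefs)
    return p / p.sum()


def actor_critic_step(theta, w, s, a, r, s_next, a_next, centres, width,
                      alpha=0.05, beta=0.1, gamma=0.95):
    """One hypothetical update of both fuzzy approximators (in place).

    theta: policy parameters, shape (n_rules, n_actions)
    w:     state-action value parameters, same shape
    """
    phi = fuzzy_features(s, centres, width)
    phi_next = fuzzy_features(s_next, centres, width)

    # Critic: Q(s, a) is linear in the fuzzy features (SARSA-style TD update).
    q = phi @ w[:, a]
    q_next = phi_next @ w[:, a_next]
    td_error = r + gamma * q_next - q
    w[:, a] += beta * td_error * phi

    # Actor: gradient of log pi(a | s) for a softmax policy,
    # d/d theta[i, b] = phi[i] * (1[a == b] - pi[b]).
    pi = policy(theta, phi)
    grad_log_pi = -np.outer(phi, pi)
    grad_log_pi[:, a] += phi
    theta += alpha * q * grad_log_pi
    return theta, w
```

Here the critic's estimate of Q(s, a) weights the policy gradient step, so actions the critic values positively become more probable in the states whose fuzzy rules fire most strongly.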

ORCID iDs

Gu, Dongbing and Yang, Erfu (ORCID: https://orcid.org/0000-0003-1813-5950)