Abstract
The autonomous landing of an unmanned aerial
vehicle (UAV) is still an open problem. Previous work focused
on the use of hand-crafted geometric features and sensor-data
fusion to identify a fiducial marker and guide the UAV
toward it. In this article we propose a method based on deep
reinforcement learning that requires only low-resolution images
from a downward-looking camera in order to drive the
vehicle. The proposed approach is based on a hierarchy of Deep
Q-Networks (DQNs) used as the high-level control policy
for navigation in the different phases of the landing. We implemented
various technical solutions, such as the combination of vanilla and
double DQNs trained with a form of prioritized experience replay
that separates experiences into multiple containers. The optimal
control policy is learned without any human supervision,
providing the agent only with a sparse reward signal indicating
the success or failure of the landing. The results show that
the quadrotor can land autonomously in a large variety of
simulated environments and under significant noise, demonstrating that
the underlying DQNs are able to generalise effectively to unseen
scenarios. Furthermore, in some conditions the networks
outperformed human pilots.
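To make two of these ingredients concrete, the following is a minimal Python sketch of a replay buffer that separates experiences into multiple containers, together with a double-DQN target computation driven by a sparse terminal reward. It is an illustration under stated assumptions, not the paper's implementation: the partition-by-reward-sign scheme, the class and function names, and the `q_online`/`q_target` callables are all hypothetical.

```python
import random
from collections import deque

import numpy as np


class PartitionedReplayBuffer:
    """Replay buffer split into separate containers so that rare
    experiences (e.g. successful landings) are not crowded out by
    common ones. Partitioning by reward sign is an assumption made
    for illustration; the paper's exact scheme may differ."""

    def __init__(self, capacity_per_partition=10_000):
        self.partitions = {
            "positive": deque(maxlen=capacity_per_partition),
            "negative": deque(maxlen=capacity_per_partition),
            "neutral": deque(maxlen=capacity_per_partition),
        }

    def add(self, state, action, reward, next_state, done):
        key = "positive" if reward > 0 else "negative" if reward < 0 else "neutral"
        self.partitions[key].append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Draw an equal share from each non-empty container.
        pools = [p for p in self.partitions.values() if p]
        per_pool = max(1, batch_size // len(pools))
        batch = []
        for pool in pools:
            batch.extend(random.sample(pool, min(per_pool, len(pool))))
        return batch


def double_dqn_targets(batch, q_online, q_target, gamma=0.99):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it. q_online/q_target are hypothetical
    callables mapping a state to a vector of Q-values."""
    targets = []
    for state, action, reward, next_state, done in batch:
        if done:
            # Sparse terminal feedback, e.g. +1 for landing, -1 for crashing.
            y = reward
        else:
            best_a = int(np.argmax(q_online(next_state)))
            y = reward + gamma * q_target(next_state)[best_a]
        targets.append((state, action, y))
    return targets
```

Sampling a balanced batch from separate containers is one simple way to keep the rare, high-value landing experiences present in every update, which is the stated motivation for splitting the buffer.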
| Original language | English |
| --- | --- |
| Publication status | Published - 2018 |