Neurocognitive Shared Visuomotor Network for End-to-end Learning of Object Identification, Localization and Grasping on a Humanoid

Matthias Kerzel, Manfred Eppe, Stefan Heinrich, Fares Abawi, Stefan Wermter

Conference: Proceedings of the Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics (ICDL-EpiRob2019), pp. 19-25, Oslo, Norway, Aug 2019

Abstract: We present a unified visuomotor neural architecture for the robotic task of identifying, localizing, and grasping a goal object in a cluttered scene. The RetinaNet-based neural architecture enables end-to-end training of visuomotor abilities in a biologically inspired developmental approach. We demonstrate the successful development and evaluation of the method on a humanoid robot platform. The proposed architecture outperforms previous work on single-object grasping as well as a modular architecture for object picking. An analysis of grasp errors suggests similarities to infant grasp learning: while the end-to-end architecture successfully learns grasp configurations, object confusions sometimes occur. When multiple objects are presented, salient objects are picked instead of the intended object.