Semantic scene understanding for intelligent robotics
Abstract
This dissertation focuses on improving robots' semantic understanding of scenes and on developing a new human-robot interaction (HRI) interface based on augmented reality (AR). To achieve deep scene understanding, the proposed semantic scene understanding method consists of three parts: object detection, object semantic comprehension, and feedback on the robot's comprehension. The method analyzes each detected object's category, function, properties, and composition, enabling robots to understand object semantics and reason about relations between objects.

Additionally, the dissertation proposes a method for an intelligent industrial robot to comprehend spatial constraints for model assembly. The method uses an extended generative adversarial network (GAN) with a 3D long short-term memory (LSTM) network to compose complete 3D point clouds from a single depth scan or a few scans taken from multiple viewpoints. The spatial constraints among the segmented point clouds are identified by a neural-logic network that incorporates general knowledge of spatial constraints expressed in first-order logic.

The proposed HRI interface superimposes robot-centered and human-centered reality onto the workspace to construct an environment of mutual understanding. It enables humans to communicate with robots through speech and immersive touch, building mutual understanding from the user's commands, object localization and recognition, object semantics, and augmented trajectories. The user's vocal commands are interpreted into formal logic, and finger touches are detected and mapped to workspace coordinates. Real-world experiments demonstrate the effectiveness of the proposed interface.
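To make the first-order treatment of spatial constraints concrete, below is a minimal sketch of how one grounded assembly rule might be evaluated over segmented point clouds. This is not the dissertation's neural-logic network; the predicate names (above, axis_aligned, insertable) and the 5 mm alignment tolerance are illustrative assumptions.

```python
# Minimal sketch (illustrative assumptions, not the dissertation's method):
# evaluating the grounded first-order rule
#   above(peg, hole) AND axis_aligned(peg, hole) -> insertable(peg, hole)
# over two segmented (N, 3) point clouds.
import numpy as np

def centroid(points: np.ndarray) -> np.ndarray:
    """Centroid of an (N, 3) point cloud."""
    return points.mean(axis=0)

def above(a: np.ndarray, b: np.ndarray) -> bool:
    """True if part a's lowest point sits at or above part b's highest point."""
    return a[:, 2].min() >= b[:, 2].max() - 1e-3

def axis_aligned(a: np.ndarray, b: np.ndarray, tol: float = 0.005) -> bool:
    """True if the parts share a vertical axis within `tol` metres (assumed 5 mm)."""
    return np.linalg.norm(centroid(a)[:2] - centroid(b)[:2]) <= tol

def insertable(peg: np.ndarray, hole: np.ndarray) -> bool:
    """Conjunction of the two spatial predicates above."""
    return above(peg, hole) and axis_aligned(peg, hole)

# Two toy segmented parts: a peg hovering directly over a hole.
rng = np.random.default_rng(0)
peg = rng.normal([0.0, 0.0, 0.10], 0.002, size=(200, 3))
hole = rng.normal([0.0, 0.0, 0.02], 0.002, size=(200, 3))
print(insertable(peg, hole))  # True for this configuration
```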
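Similarly, interpreting vocal commands into formal logic can be pictured with a toy parser that maps a transcribed utterance to a logical goal. The tiny pattern grammar and the predicate names On and Grasp are assumptions for illustration, not the interface's actual language pipeline.

```python
# Illustrative sketch only: mapping a transcribed voice command to a
# first-order-logic goal. The grammar and predicates are assumed examples.
import re

PATTERNS = [
    (re.compile(r"(?:put|place) the (\w+ \w+) on the (\w+ \w+)"), "On"),
    (re.compile(r"(?:pick up|grasp) the (\w+ \w+)"), "Grasp"),
]

def interpret(command: str) -> str:
    """Return a logical form such as On(red_cube, blue_tray), or raise."""
    text = command.lower().strip()
    for pattern, predicate in PATTERNS:
        match = pattern.search(text)
        if match:
            args = ", ".join(g.replace(" ", "_") for g in match.groups())
            return f"{predicate}({args})"
    raise ValueError(f"no interpretation for: {command!r}")

print(interpret("Put the red cube on the blue tray"))  # On(red_cube, blue_tray)
print(interpret("Grasp the green gear"))               # Grasp(green_gear)
```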
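Finally, one standard way to turn a detected fingertip pixel into workspace coordinates is to deproject it through the camera model. The sketch below assumes a pinhole RGB-D camera; the intrinsics and the camera-to-workspace transform are made-up example values, not parameters from the dissertation.

```python
# Hedged sketch: fingertip pixel + depth -> 3D workspace point, assuming a
# pinhole RGB-D camera. All numeric values below are example assumptions.
import numpy as np

FX, FY, CX, CY = 615.0, 615.0, 320.0, 240.0   # example camera intrinsics
T_WORK_CAM = np.eye(4)                         # example extrinsics (identity)

def deproject(u: float, v: float, depth_m: float) -> np.ndarray:
    """Pixel (u, v) plus metric depth -> 3D point in the camera frame."""
    x = (u - CX) * depth_m / FX
    y = (v - CY) * depth_m / FY
    return np.array([x, y, depth_m])

def touch_to_workspace(u: float, v: float, depth_m: float) -> np.ndarray:
    """Fingertip pixel -> camera frame -> workspace frame via extrinsics."""
    p_cam = np.append(deproject(u, v, depth_m), 1.0)  # homogeneous point
    return (T_WORK_CAM @ p_cam)[:3]

print(touch_to_workspace(350.0, 260.0, 0.80))  # e.g. [0.039 0.026 0.8]
```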

