Master's Thesis on “Natural Language Commands for Robotic Navigation”

Supervisor: Prof. Ville Kyrki (ville.kyrki@aalto.fi)

Advisors: Dr. Tsvetomila Mihaylova (tsvetomila.mihaylova@aalto.fi), Dr. Francesco Verdoja (francesco.verdoja@aalto.fi)

Keywords: robotic navigation, natural language processing

Project Description

Recent breakthroughs in natural language processing allow robots to interpret human language far more reliably than before. In particular, recent work proposes using large visual-language models to let humans command robots to navigate an environment and execute different tasks using open-vocabulary instructions.

The thesis will address the problem of translating human language to executable robot skills. It will aim to answer the following research questions: What are different methods for translating text in natural language to executable robotic skills? At what level are robotic skills currently represented? What are different levels of skill specification where navigation can be addressed?

Based on the answers to these questions, the goal is to develop a prototype of a software component that serves as the entry point to a system for open-vocabulary robotic navigation. The component should be able to process any given text and determine whether it contains a navigation command. If a navigation command is found, it should identify and return a list of pre-defined skills to execute, or objects to navigate to, taking into account the constraints specified in the command. The processor should be flexible enough to handle an arbitrary list of robotic skills and objects.
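To make the intended input and output of such a component concrete, the following is a minimal Python sketch of one possible interface. It is an illustration only: the names NavigationQueryResult and process_query, the trigger phrases, and the phrase-matching logic are all assumptions made for this example, not part of the project. A real prototype would presumably replace the matching step with a learned classifier or a language model, and the sketch does not handle constraints in the command.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass
class NavigationQueryResult:
    """Hypothetical return type of the query processor."""
    is_navigation: bool
    skills: list[str] = field(default_factory=list)   # names of skills to execute
    targets: list[str] = field(default_factory=list)  # objects to navigate to


def _first_match(phrase: str, text: str) -> int:
    """Return the index of the first whole-word match of phrase, or -1."""
    match = re.search(r"\b" + re.escape(phrase.lower()) + r"\b", text)
    return match.start() if match else -1


def process_query(
    text: str,
    skills: dict[str, list[str]],
    objects: list[str],
) -> NavigationQueryResult:
    """Placeholder processor based on phrase matching.

    skills maps a caller-supplied skill name to trigger phrases, and objects
    is a caller-supplied list of object names, so both lists are arbitrary.
    """
    lowered = text.lower()

    # Collect (position, name) pairs so results follow the order in the text.
    skill_hits = []
    for name, triggers in skills.items():
        positions = [p for p in (_first_match(t, lowered) for t in triggers) if p >= 0]
        if positions:
            skill_hits.append((min(positions), name))

    # Assumption: a text counts as a navigation command only if a skill matched.
    if not skill_hits:
        return NavigationQueryResult(is_navigation=False)

    object_hits = [(p, o) for o in objects if (p := _first_match(o, lowered)) >= 0]
    return NavigationQueryResult(
        is_navigation=True,
        skills=[name for _, name in sorted(skill_hits)],
        targets=[name for _, name in sorted(object_hits)],
    )


# Example skill and object lists, invented for illustration.
example_skills = {"navigate_to": ["go to", "navigate to"], "follow": ["follow"]}
example_objects = ["kitchen table", "sofa"]
result = process_query("Please go to the kitchen table", example_skills, example_objects)
```

Because the skill and object lists are arguments rather than constants, the same function can be called with a different robot's capabilities without code changes, which is the flexibility the description asks for.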

Optionally, the system could be tested on a real robot, the Hello Robot Stretch 2, using maps obtained by the robot.

The project offers an opportunity to gain experience in machine learning and familiarity with large (visual-)language models and robotic navigation.

Deliverables

  • Literature review of commonly used navigation commands.
  • Specification of how abstract robotic skills can be defined and passed to the query processor.
  • A text classifier for open-vocabulary queries.
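As a rough illustration of the skill specification deliverable, abstract skills could be described as plain data and passed to the query processor in a serialized form. The sketch below assumes a hypothetical SkillSpec record with a name, a free-text description, and named parameters, exchanged as JSON. The field names and the JSON format are assumptions made for this example; defining the actual specification is part of the thesis work.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class SkillSpec:
    """Hypothetical description of one abstract robotic skill."""
    name: str
    description: str
    # Maps a parameter name to a short description of its expected type.
    parameters: dict[str, str] = field(default_factory=dict)


def skills_to_json(skills: list[SkillSpec]) -> str:
    """Serialize skill specifications so they can be handed to a processor."""
    return json.dumps([asdict(skill) for skill in skills], indent=2)


def skills_from_json(payload: str) -> list[SkillSpec]:
    """Rebuild skill specifications from their JSON form."""
    return [SkillSpec(**item) for item in json.loads(payload)]


# Example skills, invented for illustration.
catalogue = [
    SkillSpec(
        name="navigate_to",
        description="Drive to a named object or location on the map.",
        parameters={"target": "object name"},
    ),
    SkillSpec(name="stop", description="Stop all motion immediately."),
]
payload = skills_to_json(catalogue)
restored = skills_from_json(payload)
```

Keeping the specification as data rather than code means the list of skills can change per robot or per experiment, and the same text could also be embedded in a language model prompt.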

Practical Information

Pre-requisites: Machine Learning; some familiarity with large language models or robotic navigation can be useful

Programming languages: Python

Start: Available immediately

References