The goal of this master's thesis is to integrate a large vision-language model (VLM) with a manipulation policy to control a robotic hand in predefined manipulation tasks, such as grasping or pushing.
The goal of this master's thesis is to explore existing approaches, datasets, and models that provide textual explanations of driving situations, to implement a state-of-the-art model, and to validate it on predefined driving conflict situations.
As robotics research advances, there is a growing need for people-friendly communication between robots and humans. On the one hand, the decisions of an autonomous system need to be understandable to humans; on the other, humans need to be able to specify commands in a way that is natural to them. […]