Meta-learning, or ‘learning how to learn’, has been immensely popular in the deep learning community in recent years. In this thesis, we will investigate how meta-learning approaches and ideas can be used to learn latent policy embeddings for reinforcement learning. One potential approach is hypernetworks, i.e. networks that generate other networks (see references). The thesis will investigate the application of these approaches and evaluate their usefulness in robot learning and continuous control tasks. The use of policy embeddings and hypernetworks is currently an active area of research with potential applications to real-world robotics. This is a research-oriented thesis in which the student will have the opportunity to work on state-of-the-art problems and propose novel methods and algorithms for robot learning.
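To make the idea of a network that generates another network concrete, the following is a minimal sketch and not the method the thesis will develop. It assumes a purely linear hypernetwork with random, untrained weights and a linear policy with a tanh output; all names (make_hypernetwork, generate_policy, act) and dimensions are hypothetical and chosen only for illustration.

```python
import math
import random


def make_hypernetwork(embed_dim, obs_dim, act_dim, seed=0):
    """Build a random (untrained) linear hypernetwork.

    It is a matrix H and bias c that map an embedding z to the flat
    parameter vector of a linear policy. Illustrative assumption only.
    """
    rng = random.Random(seed)
    n_params = act_dim * obs_dim + act_dim  # policy weights W plus bias b
    H = [[rng.gauss(0.0, 0.1) for _ in range(embed_dim)] for _ in range(n_params)]
    c = [0.0] * n_params
    return H, c


def generate_policy(hyper, z, obs_dim, act_dim):
    """Run the hypernetwork on embedding z and reshape its output into
    the weights W (act_dim x obs_dim) and bias b (act_dim) of a policy."""
    H, c = hyper
    flat = [sum(h * zi for h, zi in zip(row, z)) + ci for row, ci in zip(H, c)]
    W = [flat[i * obs_dim:(i + 1) * obs_dim] for i in range(act_dim)]
    b = flat[act_dim * obs_dim:]
    return W, b


def act(policy, obs):
    """Evaluate the generated policy: a = tanh(W obs + b), bounded in [-1, 1]."""
    W, b = policy
    return [math.tanh(sum(w * o for w, o in zip(row, obs)) + bi)
            for row, bi in zip(W, b)]


# The same hypernetwork turns two different embeddings into two different policies.
hyper = make_hypernetwork(embed_dim=3, obs_dim=4, act_dim=2)
obs = [0.5, -1.0, 0.25, 2.0]
action_a = act(generate_policy(hyper, [1.0, 0.0, 0.0], 4, 2), obs)
action_b = act(generate_policy(hyper, [0.0, 1.0, 0.0], 4, 2), obs)
```

In this sketch the embedding z plays the role of a latent policy embedding: moving through the embedding space changes the generated policy while the hypernetwork itself stays fixed. In practice the hypernetwork would be trained, for example with a reinforcement learning objective, and both networks would typically be nonlinear.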
This thesis is oriented more towards engineering and software development, and aims to provide open-source implementations for the research community. However, the student will gain some exposure to deep learning algorithms and their application to practical research problems.
This research thesis will further investigate how imitation learning methods and algorithms can be used to improve existing co-design algorithms. The aim is to build systems that develop both the body and the mind of robots.
The thesis will start with a literature review to identify the space of potential hypotheses to investigate, and will then apply the developed method to continuous control tasks. It is well suited for students interested in a research-oriented Master's thesis with some scope to develop their own ideas.
The goal of this Master's thesis is to develop the simulation tools necessary to evaluate co-adaptation techniques, and to develop new approaches for learning the behaviour and design of robots using deep learning and deep reinforcement learning.
In this thesis, constrained differential dynamic programming (DDP) methods will be investigated extensively, and a selection of the major ones will be implemented in a simulation environment for trajectory optimization of different robots, such as a simple point robot, a 2D car-like robot, a 3D quadrotor and a cart-pole system. The methods will then be compared in terms of convergence speed, computational complexity, and sensitivity to initialization and parameter selection.