Master Thesis on “Development of data-driven driver model”

Supervisor: Prof. Ville Kyrki (ville.kyrki@aalto.fi)
Advisors: Daulet Baimukashev (daulet.baimukashev@aalto.fi), Shoaib Azam (shoaib.azam@aalto.fi)
Keywords: imitation learning, autonomous driving

Data-driven driver models are superior to rule-based models in interactive multi-agent scenarios where it is essential to consider agents’ behavior. For example, humans have diverse driving styles, such as aggressive, neutral, or defensive [1], and it is challenging to […]

Master Thesis on “Interactive Bayesian Multiobjective Evolutionary Optimization in Reinforcement Learning Problems with Conflicting Reward Functions”

In many real-world problems, there are multiple conflicting objective functions that need to be optimized simultaneously. For example, an investment company wants to create an optimum portfolio of stocks to maximize profits and minimize risk simultaneously. However, most reinforcement learning (RL) problems do not explicitly consider the tradeoff between multiple conflicting reward functions and assume a scalarized single objective reward function to be optimized. Multiobjective evolutionary optimization algorithms (MOEAs) can be used to find Pareto optimal policies by considering multiple reward functions as objectives.
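As a minimal illustration of what “Pareto optimal” means here, the sketch below filters a set of candidate policies down to the non-dominated ones. It assumes every policy has already been evaluated into a vector of returns, one per reward function, and that all objectives are maximized. The function names and the example numbers are hypothetical, and the brute-force filter stands in for the selection step of an MOEA rather than reproducing any particular algorithm.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if return vector a Pareto-dominates b (all objectives maximized)."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(returns: List[Sequence[float]]) -> List[int]:
    """Return the indices of the return vectors that no other vector dominates."""
    return [
        i
        for i, candidate in enumerate(returns)
        if not any(
            dominates(other, candidate)
            for j, other in enumerate(returns)
            if j != i
        )
    ]


# Hypothetical evaluations of four policies: (expected profit, negative risk).
# Risk is negated so that both objectives are maximized.
policy_returns = [(10.0, -5.0), (8.0, -2.0), (7.0, -3.0), (10.0, -6.0)]
front = pareto_front(policy_returns)
print(front)
```

In this example the third policy is dominated by the second and the fourth by the first, so only the first two remain as trade-off candidates.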

Energy-Efficiency of Reinforcement Learning

This thesis will investigate the energy consumption of reinforcement learning algorithms during both training and inference, using monitoring capabilities. The goal is to find out how different algorithms compare in performance versus energy consumption on practical applications, and whether energy consumption can be reduced, for example by trading performance for lower computational complexity.

Meta-Learning Embeddings for Reinforcement Learning

Meta-learning, or ‘learning how to learn’, has been immensely popular in the deep learning community in recent years. In this thesis, we will investigate how meta-learning approaches and ideas can be used to learn latent policy embeddings for reinforcement learning. A potential approach for this is hyper networks, i.e., networks which can generate other networks (see references). The thesis will investigate the application of these approaches and evaluate their usefulness in robot learning and continuous control tasks. Currently, the use of policy embeddings and hyper networks is an active area of research with potential applications to real-world robotics. This is a research-oriented thesis where the student will have the possibility to work on state-of-the-art problems and propose novel methods and algorithms for their use in robot learning.
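The idea of a network that generates another network can be sketched in a few lines. The example below is an untrained toy under stated assumptions: the hyper network is a single linear map from a latent policy embedding to the weights and biases of a one-layer tanh policy. All names, dimensions, and the random initialization are hypothetical and are not taken from the referenced work.

```python
import math
import random
from typing import Callable, List


def make_hypernetwork(embed_dim: int, obs_dim: int, act_dim: int,
                      seed: int = 0) -> List[List[float]]:
    """Create a random linear hyper network: one row per generated policy parameter."""
    rng = random.Random(seed)
    n_params = act_dim * obs_dim + act_dim  # policy weights plus biases
    return [[rng.gauss(0.0, 0.1) for _ in range(embed_dim)] for _ in range(n_params)]


def generate_policy(hyper: List[List[float]], z: List[float],
                    obs_dim: int, act_dim: int) -> Callable[[List[float]], List[float]]:
    """Map the embedding z to policy parameters and return the resulting policy."""
    flat = [sum(h * zi for h, zi in zip(row, z)) for row in hyper]
    weights = [flat[i * obs_dim:(i + 1) * obs_dim] for i in range(act_dim)]
    biases = flat[act_dim * obs_dim:]

    def policy(obs: List[float]) -> List[float]:
        # One-layer policy with tanh squashing, so actions lie in [-1, 1].
        return [
            math.tanh(sum(w * o for w, o in zip(row, obs)) + b)
            for row, b in zip(weights, biases)
        ]

    return policy


OBS_DIM, ACT_DIM, EMBED_DIM = 3, 2, 2
hyper = make_hypernetwork(EMBED_DIM, OBS_DIM, ACT_DIM)

# Two different embeddings yield two different policies from the same hyper network.
policy_a = generate_policy(hyper, [1.0, 0.0], OBS_DIM, ACT_DIM)
policy_b = generate_policy(hyper, [0.0, 1.0], OBS_DIM, ACT_DIM)

obs = [0.5, -0.2, 0.1]
action_a = policy_a(obs)
action_b = policy_b(obs)
print(action_a, action_b)
```

In a meta-learning setting, the hyper network parameters and the embeddings would be learned, so that nearby embeddings correspond to related behaviors; here they are random and serve only to show the data flow.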

Creating tool-boxes for the Co-Design of Robots in Simulation

This is a more engineering- and software-development-oriented thesis, aiming at providing open-source implementations for the research community. However, the thesis student will get some exposure to the use of deep learning algorithms and their application to practical research problems.

Improving Co-Design with Imitation Learning

This research thesis will further investigate how imitation learning methods and algorithms can be used to improve existing Co-Design algorithms. The aim of this thesis is to develop systems that develop both the body and the mind of robots.