Master's thesis on “Evaluating Domain Randomization”

Supervisor: Prof. Ville Kyrki

Advisor: Karol Arndt (karol.arndt@aalto.fi)

Keywords: domain transfer, sim-to-real transfer, deep learning, reinforcement learning

Project description

Deep learning techniques require large amounts of training data to produce models that perform well. Data collection has been a major bottleneck in applying deep learning to robotics, where collecting training data on the physical system is slow and can endanger the robot and its surroundings. The problem is especially pronounced in reinforcement learning, where random (and often unconstrained) exploration is an inherent part of the learning process. A potential solution is to train machine learning models in simulation and then deploy the trained model on the physical system. However, tuning the simulation to closely match the physical system is a long, manual process, which may involve drastic measures such as disassembling the robot and carefully measuring the physical properties of each part. Similar problems occur on the image-based perception side with computer vision techniques, as rendering photorealistic observations that closely match the real environment is beyond the reach of most physics simulators used in robotics.

These generalization problems can, to some extent, be solved by training a model in a wide variety of simulated environments with varying dynamics and appearance. This approach, known as domain randomization, results in models that are robust to modelling inaccuracies on both the vision and the dynamics side. Such models can either generalize to the real world straight away or be trained to adapt to new conditions.
However, it is difficult to estimate the range of environments required to capture the real domain. If the randomization range is too small, the model may not generalize to real conditions. If it is unnecessarily large, a lot of data must be generated, and the resulting model provides weaker priors on dynamics in case further domain adaptation to real conditions is required.
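
As a rough illustration of what a randomization range means in practice, the sketch below draws simulator parameters uniformly from fixed intervals. The parameter names and interval bounds are hypothetical and chosen only for illustration; they are not taken from any particular robot or simulator, and uniform sampling is just one possible choice of distribution.

```python
import numpy as np

# Hypothetical randomization ranges (low, high); the names and values
# are illustrative assumptions, not measured robot properties.
RANGES = {
    "link_mass_scale": (0.8, 1.2),
    "joint_damping": (0.05, 0.5),
    "friction": (0.5, 1.5),
}


def sample_environment(ranges, rng):
    """Draw one set of simulator parameters uniformly from the given ranges."""
    return {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}


rng = np.random.default_rng(0)

# Each dictionary would configure one simulated training environment.
envs = [sample_environment(RANGES, rng) for _ in range(1000)]
```

Widening the intervals in `RANGES` makes it more likely that the real system is covered, at the cost of more environments to simulate, which is the trade-off described above.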

The goal of this thesis is to design a method to quantitatively evaluate whether the generated synthetic data captures reference data collected on the physical setup. The method should be able to detect whether the real data falls within the range of the synthetic data.
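
To make the question concrete, the following is a minimal sketch of one possible baseline, not the method to be developed in the thesis. It assumes that both datasets are available as fixed-length feature vectors and uses a k-nearest-neighbour distance test: a real sample counts as covered if its distance to its k-th nearest synthetic sample is no larger than what is typical among the synthetic samples themselves. The function names, the choice of k, and the quantile threshold are all assumptions made for illustration.

```python
import numpy as np


def knn_distance(queries, reference, k, exclude_self=False):
    """Euclidean distance from each query to its k-th nearest reference point."""
    # Pairwise distances, shape (n_queries, n_reference).
    d = np.linalg.norm(queries[:, None, :] - reference[None, :, :], axis=-1)
    d.sort(axis=1)
    # When the queries are the reference set itself, column 0 holds the
    # zero distance of each point to itself, so skip it.
    return d[:, k] if exclude_self else d[:, k - 1]


def coverage(real, synthetic, k=5, quantile=0.95):
    """Fraction of real samples lying within the synthetic data's typical k-NN distance."""
    within_synthetic = knn_distance(synthetic, synthetic, k, exclude_self=True)
    threshold = np.quantile(within_synthetic, quantile)
    return float(np.mean(knn_distance(real, synthetic, k) <= threshold))


rng = np.random.default_rng(0)
synthetic = rng.normal(size=(500, 3))            # stand-in for simulated data
real_inside = rng.normal(scale=0.5, size=(50, 3))  # lies within the synthetic range
real_outside = rng.normal(loc=6.0, size=(50, 3))   # lies far outside it

score_inside = coverage(real_inside, synthetic)
score_outside = coverage(real_outside, synthetic)
```

In this toy setup, `score_inside` should be close to 1 and `score_outside` close to 0. A raw Euclidean distance is unlikely to be meaningful for images or long trajectories, so a practical method would probably need a learned or hand-designed feature space first.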

Deliverables

  • Reviewing relevant state-of-the-art literature,
  • Developing a method for evaluating whether real data lies within the simulated range,
  • Implementing the developed algorithms,
  • Selecting/designing appropriate evaluation setups,
  • Obtaining simulated and real data for evaluation,
  • Evaluating the method using the collected data.

Practical information

Prerequisites: deep learning, Python, Linux, basics of reinforcement learning

Suggested tools: PyTorch, MuJoCo

Platform: KUKA LWR 4+ or Franka Panda robotic arm

Start: Available immediately

References

OpenAI, Learning Dexterous In-Hand Manipulation, IJRR 2019

Josh Tobin, Real-World Robotic Perception and Control Using Synthetic Data, PhD thesis, 2019

Aleksi Hämäläinen, Karol Arndt, Ali Ghadirzadeh, and Ville Kyrki, Affordance Learning for End-to-end Visuomotor Control, IROS 2019

Karol Arndt, Murtaza Hazara, Ali Ghadirzadeh, and Ville Kyrki, Meta Reinforcement Learning for Sim-to-real Domain Adaptation, ICRA 2020