Midas: A Machine Learning Model for Depth Estimation
This is an introduction to "Midas", a machine learning model that can be used with ailia SDK. You can easily use this model, along with many other ready-to-use ailia MODELS, to create AI applications with ailia SDK.
Overview
Midas is a machine learning model that estimates depth from an arbitrary input image.


Source: https://arxiv.org/pdf/1907.01341v3.pdf
Architecture
Datasets that contain depth information are not mutually compatible in scale and bias, because they were captured with different measuring tools, including stereo cameras, laser scanners, and light sensors. Midas introduces a new loss function that is invariant to these differences in scale and shift, which eliminates the compatibility issues and allows multiple datasets to be used for training simultaneously.
Midas uses multiple datasets for training, as shown in the table below. Therefore, it can estimate the depth of images in various conditions and environments.

Source: https://arxiv.org/pdf/1907.01341v3.pdf
In addition, 3D movies were used for training to complement the existing datasets.

Source: https://arxiv.org/pdf/1907.01341v3.pdf
Below is the loss function introduced by Midas.

Source: https://arxiv.org/pdf/1907.01341v3.pdf
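To make the idea of a loss that ignores scale and bias more concrete, here is a minimal NumPy sketch. It is not the authors' implementation: it assumes the simplest squared-error form, in which the prediction is first aligned to the ground truth with a least-squares scale and shift, and only the remaining error is penalized. The function names `align_scale_shift` and `ssi_mse_loss` are hypothetical, and any robust variants or extra regularization terms described in the paper are omitted.

```python
import numpy as np


def align_scale_shift(pred, target):
    """Return the scale s and shift t that minimize ||s * pred + t - target||^2."""
    pred = np.asarray(pred, dtype=np.float64).ravel()
    target = np.asarray(target, dtype=np.float64).ravel()
    # Design matrix with columns [pred, 1]; solve the least-squares problem.
    design = np.stack([pred, np.ones_like(pred)], axis=1)
    (s, t), *_ = np.linalg.lstsq(design, target, rcond=None)
    return float(s), float(t)


def ssi_mse_loss(pred, target):
    """Squared-error loss computed after the optimal scale and shift alignment.

    Illustrative assumption only: the loss used to train Midas may differ
    in its details (robust terms, regularizers, choice of depth space).
    """
    pred = np.asarray(pred, dtype=np.float64).ravel()
    target = np.asarray(target, dtype=np.float64).ravel()
    s, t = align_scale_shift(pred, target)
    residual = s * pred + t - target
    return float(np.mean(residual ** 2) / 2.0)


if __name__ == "__main__":
    pred = np.array([0.1, 0.5, 0.9, 0.3])
    # Same structure as pred, but with a different scale and offset.
    target = 3.0 * pred + 2.0
    print(ssi_mse_loss(pred, target))
```

Because the alignment absorbs any global scale and offset, two depth maps that differ only in those two factors produce a loss close to zero. This is what lets datasets captured with different sensors be mixed during training.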
The architecture of the network is based on ResNet.

Source: https://arxiv.org/pdf/1907.01341v3.pdf
Usage
Use the following command to run Midas on a webcam video stream with ailia SDK.
$ python3 midas.py -v 0
You can also choose the higher-precision v2.1 model or the faster v2.1 small model. The small model runs five times faster than the regular model and enables real-time processing.
$ python3 midas.py -v 0 -v21
$ python3 midas.py -v 0 -v21 -t small
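Because the model is trained with a loss that ignores scale and bias, its output is best read as relative depth rather than as distances in physical units. If you want to view or save a raw prediction yourself, a common approach is to rescale it to an 8-bit grayscale image. The sketch below assumes the prediction is already available as a 2D NumPy array; the helper name `depth_to_uint8` is hypothetical, and this is not necessarily how `midas.py` renders its results.

```python
import numpy as np


def depth_to_uint8(depth):
    """Min-max normalize a 2D relative depth map to the range 0..255."""
    depth = np.asarray(depth, dtype=np.float32)
    d_min = float(depth.min())
    d_max = float(depth.max())
    # A constant map has no contrast to show; return an all-black image.
    if d_max - d_min < 1e-8:
        return np.zeros(depth.shape, dtype=np.uint8)
    normalized = (depth - d_min) / (d_max - d_min)
    return (normalized * 255.0).round().astype(np.uint8)


if __name__ == "__main__":
    # Stand-in for a model prediction (assumption: any 2D float array).
    fake_depth = np.random.rand(4, 6).astype(np.float32) * 1000.0
    image = depth_to_uint8(fake_depth)
    print(image.dtype, image.shape, image.min(), image.max())
```

The resulting array can be written out with any image library. Since the values are relative, the brightness of two different frames should not be compared directly.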
Here are some results.
Related topic
ailia Inc. has developed ailia SDK, which enables cross-platform, GPU-based rapid inference.
ailia Inc. provides a wide range of services from consulting and model creation, to the development of AI-based applications and SDKs. Feel free to contact us for any inquiry.
ailia Tech BLOG