Cognitive data interpretation

Monitoring the condition of large structures such as traffic routes, buildings or agricultural land generates enormous quantities of 3D and image data, which is currently analyzed mostly by hand. Fraunhofer IPM relies on automated data interpretation using a »deep learning« approach, which is faster and more cost-effective. The method performs a semantic segmentation of the data: each pixel or 3D point is assigned to a specific object class.
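To make the per-pixel assignment concrete, here is a minimal sketch (not the IPM framework): a segmentation network outputs a score per class for every pixel, and the label map is the per-pixel argmax over those scores. The class names are purely illustrative.

```python
import numpy as np

# Illustrative object classes (assumed for this sketch).
CLASSES = ["road", "traffic_sign", "vegetation"]

def label_map(scores: np.ndarray) -> np.ndarray:
    """Turn per-pixel class scores (H x W x C) into a per-pixel
    class-index map (H x W) by taking the argmax over the class axis."""
    return scores.argmax(axis=-1)

# Toy example: a 2 x 2 image with 3 class scores per pixel.
scores = np.array([
    [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]],
    [[0.2, 0.3, 0.5], [0.9, 0.05, 0.05]],
])
labels = label_map(scores)
# labels[0, 1] == 1, i.e. that pixel is assigned to "traffic_sign"
```

Every pixel thus ends up with exactly one class label, which is what distinguishes semantic segmentation from whole-image classification.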

»Deep learning« is a machine learning method and thus a subfield of artificial intelligence. Training data sets are used to teach an algorithm to identify objects in an image, e.g. prototypical objects such as traffic signs. Deep learning is based on artificial neural networks (ANNs) and has been shown to be superior to traditional methods of object recognition.

Only a few years ago, training such algorithms took weeks, if not months; thanks to massive parallelization, this process has shrunk to a few hours. Data interpretation with a trained ANN is then carried out in real time. In an ANN, the input information passes through a large number of interconnected artificial neurons, where it is processed and transmitted to further neurons.

Graphics: © Fraunhofer IPM
A new artificial neural network (ANN) is trained within a few hours. Data interpretation with the help of a trained ANN is carried out in real-time.

With the help of manually annotated training data, ANNs learn the output patterns that correspond to specific input patterns. On the basis of this »experience«, new input data can then be analyzed in real time. ANNs have proven very robust against variations in characteristic colors, edges and shapes.

2D camera data, 3D scanner data, or merged scanner and camera data form a suitable basis for automated object recognition. The Fraunhofer IPM framework transfers the georeferenced scanner points into a grid format containing depth information before linking them with the RGB camera data. The resulting pixel-based RGB-D(epth) data set contains a corresponding depth image for each RGB camera image, which makes it an ideal input format for ANNs.
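The fusion step can be sketched as follows. This is an assumption-laden simplification of the described data format, not the IPM framework itself: once the scanner points have been gridded into a depth image that is pixel-aligned with the camera image, the two are simply stacked into a four-channel RGB-D array that an ANN can consume directly.

```python
import numpy as np

def to_rgbd(rgb: np.ndarray, depth: np.ndarray) -> np.ndarray:
    """Stack an H x W x 3 RGB image and a pixel-aligned H x W depth
    image into a single H x W x 4 RGB-D array."""
    assert rgb.shape[:2] == depth.shape, "depth must be pixel-aligned to RGB"
    return np.concatenate([rgb, depth[..., None]], axis=-1)

# Toy example: a 4 x 6 camera image plus a gridded depth image
# (e.g. distances in meters derived from the georeferenced scan).
rgb = np.zeros((4, 6, 3), dtype=np.float32)
depth = np.ones((4, 6), dtype=np.float32)
rgbd = to_rgbd(rgb, depth)
# rgbd.shape == (4, 6, 4): three color channels plus one depth channel
```

In practice the hard work lies in the georeferencing and gridding that makes the depth image pixel-aligned in the first place; the stacking itself is trivial once that alignment exists.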