»Deep learning« is a method of machine learning and thus a subfield of artificial intelligence, relying on algorithms that learn from data. Training data sets are used to teach these algorithms to identify objects in an image, e.g. prototype objects such as traffic signs. Deep learning is based on artificial neural networks (ANNs) and has been shown to outperform traditional methods of object recognition.
Only a few years ago, training such algorithms took weeks, if not months. Thanks to massive parallelization, this process has shrunk to a few hours. Data interpretation with a trained ANN is then carried out in real time. In ANNs, the input information passes through a large number of interconnected artificial neurons, each of which processes it and transmits the result to other neurons.
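How information passes through interconnected neurons can be illustrated with a minimal sketch. It is not the network used by Fraunhofer IPM: the layer sizes, weights and the sigmoid activation below are assumptions chosen purely for illustration. Each neuron forms a weighted sum of its inputs, applies an activation function and hands the result on to the next layer.

```python
import math


def sigmoid(x):
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """One layer: every neuron sums its weighted inputs, adds a bias, applies sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Pass the input through all layers; each layer's output feeds the next."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Hypothetical toy network: 3 inputs -> 2 hidden neurons -> 1 output neuron.
toy_layers = [
    ([[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.0]),
]
output = network_forward([1.0, 0.5, -1.0], toy_layers)
```

Real networks for object recognition contain many more layers and neurons, and the per-layer arithmetic is what gets parallelized on suitable hardware.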
With the help of manually annotated training data, ANNs learn which output patterns correspond to specific input patterns. On the basis of this »experience«, new types of input data can then be analyzed in real time. ANNs have proven to be very robust when confronted with variations in characteristic colors, edges and shapes.
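Learning from annotated examples can be sketched with a single artificial neuron trained by gradient descent. This is a deliberately reduced illustration, not the training procedure used for deep networks in practice: the synthetic two-feature data, the labelling rule standing in for manual annotation, and the parameters `epochs` and `learning_rate` are all assumptions.

```python
import math
import random


def predict(weights, bias, features):
    """Output of one sigmoid neuron: a value between 0 and 1."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, epochs=200, learning_rate=0.5):
    """Adjust weights step by step so that outputs move towards the annotated labels."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(samples, labels):
            error = predict(weights, bias, features) - label  # gradient of the log loss
            weights = [w - learning_rate * error * x for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


def accuracy(weights, bias, samples, labels):
    """Fraction of samples whose thresholded prediction matches the label."""
    hits = sum((predict(weights, bias, f) > 0.5) == bool(l) for f, l in zip(samples, labels))
    return hits / len(samples)


# Synthetic stand-in for annotated training data (assumption for illustration).
rng = random.Random(0)
samples = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(200)]
labels = [1 if x0 + x1 > 0 else 0 for x0, x1 in samples]

weights, bias = train(samples, labels)
```

Once trained, `predict` can be applied to inputs that were not part of the training set, which mirrors the step from »experience« to the analysis of new data.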
2D camera data, 3D scanner data, or merged scanner and camera data form a suitable data basis for automated object recognition. The Fraunhofer IPM framework transfers the georeferenced scanner data points to a grid format containing depth information before linking them with RGB camera data. The resulting pixel-based RGB-D(epth) data set contains a corresponding depth image for each RGB camera image, which makes it an ideal input format for ANNs.
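The idea of an RGB-D data set can be sketched as follows. This is not the Fraunhofer IPM framework: it assumes, for illustration only, that the scanner points have already been projected into the pixel coordinates of the camera image, and the function names `rasterize_depth` and `merge_rgbd` are hypothetical. Points are sorted into a grid that keeps the nearest depth per pixel, and the grid is then linked pixel by pixel with the RGB image.

```python
import math


def rasterize_depth(points, width, height):
    """Sort (col, row, depth) points into a grid, keeping the nearest depth per pixel.

    Pixels without any measurement keep the value math.inf.
    """
    depth = [[math.inf] * width for _ in range(height)]
    for col, row, d in points:
        c, r = math.floor(col), math.floor(row)
        if 0 <= c < width and 0 <= r < height and d < depth[r][c]:
            depth[r][c] = d
    return depth


def merge_rgbd(rgb_image, depth_image):
    """Link each RGB pixel with the depth value at the same grid position."""
    return [
        [(r, g, b, d) for (r, g, b), d in zip(rgb_row, depth_row)]
        for rgb_row, depth_row in zip(rgb_image, depth_image)
    ]


# Hypothetical 2 x 2 example: two points fall into pixel (0, 0), one lies outside the image.
points = [(0.2, 0.7, 5.0), (0.9, 0.1, 3.0), (1.5, 1.5, 7.0), (5.0, 5.0, 1.0)]
rgb = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (10, 10, 10)],
]
depth = rasterize_depth(points, width=2, height=2)
rgbd = merge_rgbd(rgb, depth)
```

Each pixel then carries four channels, so a network can use color and geometry together. How empty depth pixels are filled or masked is a design choice that this sketch leaves open.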