Neural nets work much like linear regression: they generalize a data set with a "best" fit. Unlike simple regression, neural nets can take much higher-dimensional data as input. They can match arbitrary best-fit "shapes", where higher dimensions mean the fitted line generalizes to surfaces, solids and their higher-dimensional analogues! But they are prone to overfitting the data, and need a large amount of data to prevent it.
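To make the "best fit" idea concrete, here is a minimal sketch of the simplest case: fitting a straight line by ordinary least squares. The function name `fit_line` and the sample points are made up for illustration. A neural net solves a far more flexible version of this problem, but the goal of minimizing the error between prediction and data is the same.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both left unnormalized,
    # since the common factor cancels in the ratio).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical noisy samples scattered around y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]
slope, intercept = fit_line(xs, ys)
print(f"y = {slope:.2f} * x + {intercept:.2f}")
```

With only two parameters, a line cannot bend to follow the noise in individual points. A neural net has many more parameters, which is what lets it match complicated shapes and also what lets it overfit when data is scarce.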
Neural nets have been found to excel at image recognition (convolutional neural nets) and speech recognition (long short-term memory networks). That adaptability makes them a promising fit for stereo images as well.
Where do you find a large data set for such a neural net to train on? The plan is to input the stereo images and output a depth map of the space: an image where the brightness correlates with the distance from the camera. The data set must contain matching pairs of input and output to train the neural net, and it needs millions of samples to train well. There are 2 options:
- Set up a Microsoft Kinect or specialized LIDAR in conjunction with stereo cameras and take a million photo scans.
- Use a video game with realistic lighting, set up the view with 2 cameras, and calculate the depth map from the virtual world.
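As a rough illustration of the depth map described above, the sketch below turns a grid of distances into 8-bit brightness values. The function name `depth_to_brightness`, the `near`/`far` clipping range, and the choice that farther means brighter are all assumptions made for this example. A real game engine's depth buffer may use a different, often non-linear, encoding.

```python
def depth_to_brightness(depth_rows, near, far):
    """Map distances to 8-bit gray levels (assumed: 0 = near, 255 = far)."""
    if far <= near:
        raise ValueError("far must be greater than near")
    image = []
    for row in depth_rows:
        out_row = []
        for d in row:
            # Clamp to [near, far] so out-of-range distances saturate.
            clamped = min(max(d, near), far)
            # Linear scaling of the clamped distance onto 0..255.
            out_row.append(round(255 * (clamped - near) / (far - near)))
        image.append(out_row)
    return image


# Hypothetical 2x2 grid of distances from the camera, in arbitrary units.
depth = [[1.0, 2.0], [4.0, 10.0]]
gray = depth_to_brightness(depth, near=1.0, far=10.0)
print(gray)
```

Each training sample would then pair the two camera images (the input) with one such brightness image (the output).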