CSE/EE 486: Computer Vision I
Computer Project Report: Project 4
Disparity Maps and Depth Computations
Group #4: Isaac Gerg, Adam Ickes, Jamie McCulloch
Date: November 13, 2003
A. Objectives
B. Methods

There are six 'M' files for this project. All coordinate references are defined with the origin (+) at the top left-hand corner of the image and the -Y axis pointing up.

corner_code.m
1. Determines interesting points by detecting corners utilizing eigenvalues and eigenvectors.

DisparityOfInterestingPoints.m
1. Reads in the two images of a scene along with the interesting points for both the right and left images of the scene.

allpoints.m
1. Reads in the two images of a scene along with the disparity map output by DisparityOfInterestingPoints.m.

calculate_depth.m
1. Analyzes the mat file output by allpoints.m. Computes depth information and outputs it to an image (e.g. CastleDepth.png).

FundementalMatrix.m
1. Reads in the output mat file from DisparityOfInterestingPoints.m.

CorrelationFunction.m
1. Contains the SSD correlation function used during correlation.

Executing this project from Matlab: for a file entitled "file.m", enter the filename without its extension (file) at the command prompt.

***For best results and ease, the files should be run in the order they are described above. To switch between images and scenes, the variables in the files must be renamed appropriately. This allows much flexibility in using this code in future projects.
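To illustrate the eigenvalue-based corner detector that corner_code.m implements, here is a minimal Python/NumPy sketch (the project code itself is MATLAB; the function names, window size, and threshold below are our own assumptions, not the course-provided code). The score is the smaller eigenvalue of the 2x2 gradient structure tensor summed over a window:

```python
import numpy as np

def box_sum(a, win):
    """Sum of a over each win x win neighborhood (zero-padded borders)."""
    pad = win // 2
    ap = np.pad(a, pad)
    ii = np.pad(ap, ((1, 0), (1, 0))).cumsum(0).cumsum(1)  # integral image
    h, w = a.shape
    return (ii[win:win + h, win:win + w] - ii[:h, win:win + w]
            - ii[win:win + h, :w] + ii[:h, :w])

def corner_strength(img, win=5):
    """Smaller eigenvalue of the windowed gradient structure tensor."""
    Iy, Ix = np.gradient(img.astype(float))
    Sxx = box_sum(Ix * Ix, win)
    Syy = box_sum(Iy * Iy, win)
    Sxy = box_sum(Ix * Iy, win)
    # Closed-form minimum eigenvalue of [[Sxx, Sxy], [Sxy, Syy]].
    half_tr = (Sxx + Syy) / 2.0
    det = Sxx * Syy - Sxy ** 2
    return half_tr - np.sqrt(np.maximum(half_tr ** 2 - det, 0.0))

def interesting_points(img, threshold):
    """(row, col) coordinates whose corner score exceeds the threshold."""
    return np.argwhere(corner_strength(img) > threshold)
```

Flat regions score near zero, straight edges stay small (one large eigenvalue only), and corners, where both eigenvalues are large, survive the threshold.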
C. Results

Results are described in the order of the Methods section above. *Note: images are not labeled with figure numbers. The images in this experiment are meant to be inspected, not directly referenced; the discussion of each image is adjacent to it in the flow of the document.

Original Images (top scene is Castle, bottom scene is Villa)

Finding Interesting Points
The corner_code.m file that was given to us for this project was utilized. The threshold for the Villa scene was 200,000 and for the Castle scene was 150,000.
Disparity of Interesting Points (mapping from left to right image)
From the interesting points previously computed, an algorithm was constructed to find corresponding points. We found that a large correlation window with a tight epipolar constraint yielded the greatest number of corresponding points. Had we used a looser epipolar constraint to account for error, we would still have needed a large correlation window, and the time to perform the computation would have been unreasonable. Below is a table of the parameters used for each scene.
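The matching scheme just described, an SSD comparison over a one-pixel-high epipolar band, can be sketched in Python as follows (a stand-in for the MATLAB code; `match_point` and the window half-sizes are hypothetical, not the report's actual parameters):

```python
import numpy as np

def ssd(wl, wr):
    """Sum of Squared Differences between two equally sized windows."""
    return float(((wl.astype(float) - wr.astype(float)) ** 2).sum())

def match_point(left, right, y, x, search_half=30, corr_half=7):
    """Best horizontal match for left-image point (y, x), searched along a
    one-pixel-high band of the same row in the right image (tight epipolar
    constraint)."""
    wl = left[y - corr_half:y + corr_half + 1, x - corr_half:x + corr_half + 1]
    lo = max(corr_half, x - search_half)
    hi = min(right.shape[1] - corr_half - 1, x + search_half)
    best, best_xr = np.inf, x
    for xr in range(lo, hi + 1):
        wr = right[y - corr_half:y + corr_half + 1,
                   xr - corr_half:xr + corr_half + 1]
        s = ssd(wl, wr)
        if s < best:
            best, best_xr = s, xr
    return best_xr
```

The disparity of the matched pair is then simply x - best_xr along the row.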
The width of the search window was determined by computing the maximum disparity of each scene, mostly by inspection. While this code runs, it outputs the percentage complete and the number of points found; this helped gauge the time it would take to compute the mappings with different search-window and correlation-window parameters. The Sum of Squared Differences (SSD) was used as the correlation function.

Disparity of All Points (mapping of all points in left to all points in right)
From the mappings already created by the previously explained code, we compute a mapping for all points using the following constraints:
Because the epipolar lines in these two scenes are close to parallel with the image top and bottom (the epipoles are close to infinity because both cameras face 'straight ahead'), we could assume that points in the left image have a high probability of mapping to positions to the left of the same location in the right image. It was as if all the points in a certain depth plane were shifted to the left in the right image. This limited our search window to one half of the search window previously used, which made our code run much faster without sacrificing accuracy. We continued to use the tight epipolar constraint of a one-pixel-high search window; we found this yielded the best results when used with a large correlation window. Below are the images displaying the disparities in the X direction: the higher the disparity, the higher the intensity in the image. No Y disparity component was computed, since with such a tight epipolar constraint the result would simply be an all-black image (no Y disparity).
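The dense, one-sided search described above (every left pixel, candidates only to the left along the same row) can be sketched as follows; this is a Python illustration of the approach, with hypothetical names and window sizes, not the project's allpoints.m itself:

```python
import numpy as np

def disparity_all_points(left, right, max_disp=16, half=7):
    """Per-pixel horizontal disparity map. For each left pixel, candidate
    matches lie only leftward in the right image (d = 0..max_disp) along a
    one-pixel-high epipolar band; the shift with the lowest SSD wins."""
    left, right = left.astype(float), right.astype(float)
    h, w = left.shape
    disp = np.zeros((h, w))
    for y in range(half, h - half):
        for x in range(half + max_disp, w - half):
            wl = left[y - half:y + half + 1, x - half:x + half + 1]
            best, best_d = np.inf, 0
            for d in range(max_disp + 1):   # candidate column is x - d
                wr = right[y - half:y + half + 1,
                           x - d - half:x - d + half + 1]
                s = ((wl - wr) ** 2).sum()
                if s < best:
                    best, best_d = s, d
            disp[y, x] = best_d
    return disp
```

Halving the search direction roughly halves the SSD evaluations per pixel, which is the speed-up the text describes.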
Notice how the different depth planes of each structure correspond to different disparities. This indicates that these structures are not round, but flat along the plane.

Depth Computation
With the disparities computed for all pixels by the previous code, the depth could be computed with some extrinsic camera parameters. These parameters for each scene are in the table below.
The equation for depth was: Depth = f * (T / disparity). The images realized from these depth calculations did not show anything very useful. This is because the imagesc function in Matlab applies a linear scaling, while the depth equation is non-linear; linearly normalizing all the points in the image 'compresses' many of the depths to values close to 0 or close to 1. A better approach would be to discretize the different disparities and then display those values, which is exactly what the disparity mapping image of the previous section displays. To get a true sense of depth, the depth values should be scaled by a non-linear transform (C/x or a similar equation, perhaps). This would show a true sense of depth and not just display information about the different image planes. The output of the depths is shown below.
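The depth equation and the suggested C/x display rescaling can be sketched as follows (Python stand-in for the MATLAB code; the f and T values in the example are placeholders, since the actual calibration table is not reproduced here):

```python
import numpy as np

def depth_from_disparity(disp, f, T):
    """Depth = f * (T / disparity); zero disparity maps to infinite depth."""
    disp = np.asarray(disp, dtype=float)
    with np.errstate(divide='ignore'):
        return f * T / disp

def display_scale(depth, C):
    """The C/x style non-linear rescaling suggested above: maps depth back
    to a disparity-like quantity a linear color map (imagesc) can display."""
    return C / np.asarray(depth, dtype=float)

# Hypothetical parameters: focal length f in pixels, baseline T in meters.
depth = depth_from_disparity(np.array([[4.0, 8.0], [16.0, 2.0]]), f=700.0, T=0.1)
```

Because depth is inversely proportional to disparity, a linear stretch of depth crushes the near range; dividing back by depth (C/x) restores an evenly spread quantity for display.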
Fundamental Matrix Computation
To ensure the fundamental matrix, F, was computed as well as possible, the disparities (mappings) used in the computation were the ones derived from the interesting points. Had we used all the points, there would have been much error, as many of the points mapped incorrectly due to constraints in processing power and time. F was computed using a normalization method to prevent numerical instabilities: the points in each image of a scene, left and right, were multiplied by a transformation matrix which translated and scaled them so that their centroid was at the origin and their average distance from it was 1. F was then computed using the EIGHT_POINT algorithm demonstrated on page 156 of Trucco and Verri (Introductory Techniques for 3-D Computer Vision). After F was computed, the left and right transforms were applied to F to denormalize it. F is then output along with its null space (null(F), the epipole) to verify that the epipole lies at infinity. The output from this operation is below.
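The normalized eight-point procedure described above can be sketched in Python/NumPy (a stand-in for FundementalMatrix.m; function names are our own, and the normalization follows the text's choice of unit mean distance rather than Hartley's common sqrt(2) variant):

```python
import numpy as np

def normalize(pts):
    """Similarity transform taking the points' centroid to the origin and
    their mean distance from it to 1, as described in the text."""
    c = pts.mean(axis=0)
    s = 1.0 / np.linalg.norm(pts - c, axis=1).mean()
    return np.array([[s, 0, -s * c[0]],
                     [0, s, -s * c[1]],
                     [0, 0, 1.0]])

def fundamental_matrix(left_pts, right_pts):
    """Normalized eight-point algorithm: returns the 3x3 rank-2 matrix F
    with x_r^T F x_l = 0 for corresponding points (n x 2 arrays)."""
    Tl, Tr = normalize(left_pts), normalize(right_pts)
    n = left_pts.shape[0]
    l = (Tl @ np.column_stack([left_pts, np.ones(n)]).T).T
    r = (Tr @ np.column_stack([right_pts, np.ones(n)]).T).T
    # Each correspondence contributes one row of the system A f = 0.
    A = np.column_stack([r[:, 0:1] * l, r[:, 1:2] * l, l])
    F = np.linalg.svd(A)[2][-1].reshape(3, 3)
    # Enforce rank 2 by zeroing the smallest singular value.
    U, S, Vt = np.linalg.svd(F)
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt
    # Undo the normalization: x_r^T (Tr^T Fn Tl) x_l = (Tr x_r)^T Fn (Tl x_l).
    return Tr.T @ F @ Tl
```

The epipole is the null vector of the denormalized F; for cameras translated parallel to the image plane its third homogeneous coordinate is near zero, i.e. it lies at infinity, which is the check the report performs.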
As shown, the epipole (null(F)) lies at infinity.
Summary
D. Conclusions

Stereo vision is a fundamental concept in computer vision, and depth computation can be extremely useful and necessary. In the real world, the algorithms utilized in this experiment would be heavily optimized and probably executed on special hardware, with the code modified to tightly match the hardware's abilities. The better the algorithm for realizing interesting points, the better the depth computations. Ideally, a sophisticated method of computing correspondences would be utilized; such a system might use object recognition and chain codes to better detect correspondences in left and right images. With our algorithm, though, it would be ideal if the search window could be much larger in height and the correlation window much larger in both dimensions. This would realize highly accurate correspondences; however, due to the amount of time it would take to run such an algorithm, it is not feasible. As mentioned in the Results section, a different non-linear scaling scheme should be used for the values computed by the depth algorithm. The fundamental matrix computation as stated in the book (Trucco and Verri) can have some numerical instabilities associated with it; these can be removed with normalization techniques. There may, however, be other techniques to determine such a matrix. Ideally, one would want a transform such that, given the transform matrix T, multiplying a point in one image by T would give the corresponding point in the other image. This would make depth computations very easy; however, such a matrix, with its associated computational complexity, is not realizable with elementary techniques.
E. Appendix

Source Code
Time Management