Human beings do it every day, but reconstructing a 3D environment using only 2D images taken from that environment is a baffling problem. This document defines the 3D reconstruction problem as that of generating a dense 3D model of an object or scene using only 2D images taken of that object or scene, together with previously obtained knowledge of the geometry and internal parameters of the cameras used. Research into this problem is ongoing, but many obstacles must be overcome before a computer can "see" in 3D as well as any human. Most of these obstacles arise because the 3D reconstruction problem is highly ill-posed: the solution space for nearly all instances is huge.
Many constraints exist to reduce this solution space. Most are based on "rules" derived from experimental research into human vision.
There has been very little quantitative research into the effects of these constraints on the solution space. Typically, one or more constraints are applied and the results evaluated by comparing, often by eye, the resultant 3D model with the real-life environment. Another problem is that some constraints or assumptions are built into traditional reconstruction methods (e.g. area-based matching) at a fundamental level, which degrades the quality of reconstruction.
This document presents a novel formulation of the 3D reconstruction problem. The 3D space of interest is divided into voxels (volume elements with colour and opacity), and a set of logic equations is used to describe the 3D reconstruction problem. No additional constraints or assumptions are incorporated into the logic equations. Because of the discrete nature of the formulation, the solution space can easily be calculated by generating all the solutions satisfying the logic equations. This means that the effect on the solution space of any constraint added later can also easily be calculated.
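The idea of enumerating every voxel configuration consistent with the input images can be sketched in miniature. The toy below is a hypothetical illustration, not the document's actual logic equations: it uses a 3×3 grid of voxels viewed by two orthogonal 1D "cameras", where each pixel records the colour of the first opaque voxel along its ray. The grid size, colour set, and camera model are all assumptions made for the sake of the example.

```python
from itertools import product

# Toy solution-space enumeration: a 3x3 grid of voxels, each empty (None)
# or filled with one of two colours. Two orthogonal 1D "cameras" view it:
# one along rows (from the left), one along columns (from the top).
BACKGROUND = "."
COLOURS = (None, "r", "g")  # None = transparent/empty voxel
N = 3

def first_hit(ray):
    """Colour seen along a ray of voxels (first opaque voxel wins)."""
    for v in ray:
        if v is not None:
            return v
    return BACKGROUND

def render(grid):
    """Project the voxel grid onto the two 1D images."""
    left = tuple(first_hit(row) for row in grid)
    top = tuple(first_hit([grid[r][c] for r in range(N)]) for c in range(N))
    return left, top

def solution_space(left_img, top_img):
    """All voxel configurations whose projections match both images."""
    sols = []
    for cells in product(COLOURS, repeat=N * N):
        grid = [list(cells[r * N:(r + 1) * N]) for r in range(N)]
        if render(grid) == (left_img, top_img):
            sols.append(grid)
    return sols

# A "true" scene, the images it produces, and the resulting solution space:
truth = [["r", None, None],
         [None, "g", None],
         [None, None, "r"]]
left, top = render(truth)
sols = solution_space(left, top)
print(len(sols))              # size of the (typically large) solution space
print(truth in sols)          # the ideal solution is always a member
```

Even at this tiny scale, many configurations reproduce the same images, which is the ill-posedness the document describes; the point of the formulation is that this whole set, not just one member, is computed.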
A program has been implemented that will calculate the solution space for a given input voxel space and input images. Also implemented are a number of commonly used constraints to apply to the solution space. Experiments have been performed using this program that involve the reconstruction of four different voxel configurations, first using the base formulation, then with each constraint added independently. From these experiments, a number of conclusions have been reached:
• If the ideal solution (the best reconstruction obtainable from the input images) satisfies all the added constraints, then not only is that solution guaranteed to be within the solution space, but the general quality of reconstruction in the solution space is linked to the size of the solution space: the smaller the space, the better the quality of reconstruction. This means that the accuracy of the reconstructed scene can be determined automatically, without human intervention.
• Because the size of the solution space can be determined numerically, it is easy to compare the relative strengths of constraints and to determine in which situations each constraint performs better or worse. However, problems arise with ambiguous constraints that do not specify their exact conditions (e.g. the smoothness constraint, which requires the reconstructed surfaces to be smooth but does not specify exactly how smooth they must be). Because it is unknown how well the ideal solution fits such a constraint (continuing the example, it is unknown how smooth the ideal solution is), these ambiguous constraints can only be properly evaluated if a 3D model of the reconstructed scene is already available.
• Although the presented formulation provides valuable insights into the field of 3D reconstruction and shows clear advantages over existing methods, its practical use is currently limited. Using only the constraints presented in this document, the solution space would be too large, and the program therefore too slow, for real-sized images and voxel spaces (a possible strategy for obtaining and using many more constraints, which could work around this problem, is presented in Section 9.1.1). Also, there is the problem of picking a final solution out of the solution space, although a few methods of doing so are given. Lastly, the formulation currently assumes that each pixel has exactly the same colour as the voxel it projects onto; in practice this is not always the case.
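The counting argument behind these conclusions can be made concrete with a deliberately tiny, hypothetical example, unrelated to the document's actual constraint set: candidate "scenes" are 1D strips of four voxels with exactly two filled, and a toy "smoothness" constraint requires the filled voxels to be adjacent. Comparing the space's size before and after filtering is the numerical measure of a constraint's strength described above.

```python
from itertools import product

# Base solution space: every 4-voxel strip with exactly two filled voxels.
def base_space():
    return [s for s in product((0, 1), repeat=4) if sum(s) == 2]

# Toy "smoothness" constraint: filled voxels must be contiguous.
def smooth(strip):
    filled = [i for i, v in enumerate(strip) if v]
    return all(b - a == 1 for a, b in zip(filled, filled[1:]))

space = base_space()
constrained = [s for s in space if smooth(s)]
print(len(space))        # 6 strips in the base space (C(4,2))
print(len(constrained))  # 3 strips survive the smoothness constraint
```

The reduction from 6 to 3 is the kind of quantitative comparison the formulation enables; the ambiguity problem noted above corresponds to not knowing how strict `smooth` should have been made in the first place.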
This document concludes its discussion by describing many avenues of future research using the logical formulation. This formulation is still in its infancy, but early results and insights into the field of 3D reconstruction are very promising.