
3D models contribute significant opportunity and value to the design, construction, and renovation of existing infrastructure and the building of new infrastructure. Aerial photogrammetry has emerged as a popular alternative to common laser scanning methods for building 3D models. We assess the accuracy of photogrammetric reconstruction by comparing outcomes when capturing and processing various scenes using ContextCapture software and laser scanner-generated point clouds.

In the last decade, the evolution in computing power – both in CPUs and GPUs – has allowed the development of photogrammetry software that can turn a set of photos of an object, such as a building, into a 3D model. These software products compete with laser scanners, which are widely used throughout design, construction, and operations of infrastructure assets due to their fast acquisition time and remarkable accuracy. In this paper, we assess the accuracy of photogrammetry reconstruction by comparing outcomes when capturing and processing various scenes using ContextCapture software and laser scanner-generated point clouds.

Introduction
When undertaking the protection or renovation of ancient archeological marvels, it is important to accurately survey the site in order to create detailed documentation. This documentation, which is typically produced digitally, helps to ensure the fidelity of the future renovation. Equally important, by analyzing and researching the digital documentation, restoration teams can understand and replicate the techniques used for the initial construction.

For years, laser scanners have been widely used to capture digital data for archeological sites because they are fast, versatile for surveying purposes, and are accurate within millimeters. However, the scanning process can be tedious, and the technology requires highly skilled, well-trained individuals in order to obtain a virtual 3D representation of the real world. In contrast, producing an accurate, 3D model using photogrammetry requires only a camera of reasonable quality. Photogrammetric software can then automatically build the model using multiple pictures of the same object. [1]

Much has been written about photogrammetry recently, including:

• An evaluation of the photogrammetry process in laboratory conditions [2]
• The first comparison of the technology using real scenes [3]
• An extensive review of optical 3D measurement sensors and 3D modeling techniques for cultural heritage use cases [4, 5]
• A review of the pipeline for 3D modeling from terrestrial data [6]

We compare 3D models reconstructed using ContextCapture software with a 3D model created from terrestrial LiDAR point clouds, with the goal of assessing the accuracy of the photogrammetric reconstruction. All of the datasets discussed were captured for a professional purpose, which allowed us to understand whether photogrammetry can be used professionally to digitally document the kinds of complex projects usually completed with laser scanners.

The castle of Penne, located on the top of a steep hill in the south of France, was selected as the scene for the comparison. The pictures for the reconstruction were taken with a typical digital camera, and the LiDAR data was captured using a terrestrial laser scanner.

Acquiring Site Data

Figure 1 – A typical aerial picture of the castle of Penne.

For the photogrammetry data collection process, Gerpho, a team of professional aerial photographers, took pictures of the castle from a plane at a distance of roughly 400 meters.

They used a Nikon D800 camera with a 36MP full frame CMOS sensor and a Nikon 70-200 lens. A total of 249 pictures were taken – all with a focal length ranging between 139 millimeters and 201 millimeters.

The picture dimensions were 7,360 by 4,912 pixels. Figure 1 provides an example of a typical picture captured of the castle.

For the laser scanning data collection, Sompayrac Cianferani Prieu, a professional land surveyor, acquired a LiDAR point cloud that was then processed by Geovast 3D.

The 3D model features approximately 1 billion points after data unification. The surveyor used a Leica HDS 7000 scanner, which has ranging errors within 2 millimeters.

Forty-six different scans were registered – all with a maximum constraint of 6 millimeters deviation.

Creating 3D Models

Georeferencing was done using RTK and the TERIA network on 18 specific targets in order to precisely align the two models.


3D model creation using a LiDAR point cloud 

Using traditional laser scanning, the castle structure was captured in three parts:

• The first part was the corner wall, with 4,991,330 points in the point cloud (see Extract 1 in Figure 2).
• The second piece of the structure captured was a wall and a rooftop viewed from the bottom of the tower of the castle. This point cloud contained 27,831,695 points (see Extract 2 in Figure 2).
• The third piece was another wall from the tower, in a point cloud containing 9,802,203 points (see Extract 3 in Figure 2).

Figure 2 – Extract 1: error ranges from 0 to 0.87 meters; Extract 2: error ranges from 0 to 0.71 meters; Extract 3: error ranges from 0 to 0.94 meters.


3D Reconstruction of Pictures Using ContextCapture

The reconstruction of the traditional pictures into a 3D model was performed using ContextCapture (see Figure 3). The parameters for the aerotriangulation process (for computing the orientation and position of the pictures) were set to the defaults, and the parameters for the 3D reconstruction were set to “Highest,” the default preset. (Note that ContextCapture also provides an “Ultra” preset mode, which allows for a denser reconstruction of the model, but it was determined to be inappropriate for this dataset.)

Figure 3 – A 3D textured model of the castle of Penne created using ContextCapture’s reconstruction algorithms.

All 249 pictures were used for the reconstruction. The aerotriangulation functionality of the software makes it possible to estimate the pixel resolution of the pictures – also called the “Ground Sample Distance.”

In this case, the pixel resolution ranges from 8 millimeters to 1.5 centimeters, which means that a pixel in a picture covers roughly 1 centimeter on the ground.
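This range can be sanity-checked from the capture parameters given above. The sketch below uses the standard Ground Sample Distance formula; the 35.9 mm sensor width is an assumption for the Nikon D800 full-frame sensor, and the distance and focal lengths are the approximate values reported in the text.

```python
# Ground Sample Distance (GSD): the ground footprint of one image pixel.
# Assumed capture parameters from the text: ~400 m distance, focal lengths
# of 139-201 mm, 7,360 px image width; 35.9 mm sensor width is assumed.

def gsd(distance_m, focal_mm, sensor_width_mm=35.9, image_width_px=7360):
    """GSD in meters: the pixel pitch on the sensor scaled by distance/focal."""
    pixel_pitch_mm = sensor_width_mm / image_width_px  # ~0.0049 mm per pixel
    return distance_m * pixel_pitch_mm / focal_mm

print(round(gsd(400, 201) * 1000, 1), "mm/px at the 201 mm end")
print(round(gsd(400, 139) * 1000, 1), "mm/px at the 139 mm end")
```

The computed values of roughly 10 mm and 14 mm per pixel are consistent with the 8 mm to 1.5 cm range that the aerotriangulation estimated.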

The images were georeferenced several weeks after the laser scans, using the seven of the original 18 ground control points that remained usable (selected based on the quality of the real-time kinematic [RTK] observations). The coordinates of the control points were computed by Geovast 3D.

The result of the photogrammetric processing was good; as shown in Figure 3, the model does not have holes, and there are no other obvious errors in the reconstruction of the castle.

Comparing the Results

CloudCompare (www.danielgm.net/cc/) was used to compare the 3D photogrammetric reconstruction and the LiDAR point cloud. The 3D photogrammetric model was set as the reference, and CloudCompare computed the distance from each point of the LiDAR point cloud to the surface mesh of the 3D model. The mesh (M) is made up of interconnected triangles (t). The distance between a point (p) and the mesh (M) is computed as:

dist(p, M) = min over triangles t in M of dist(p, t),

where the distance between a point and a triangle is defined as the distance between the point and the plane containing the triangle.
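A minimal sketch of this computation follows, brute-forcing the minimum over all triangles and using the point-to-plane simplification stated above; production tools such as CloudCompare instead use exact point-to-triangle distances (clamped to edges and vertices) with spatial indexing for speed.

```python
import numpy as np

def point_to_plane_distance(p, tri):
    """Unsigned distance from point p to the plane of triangle tri
    (a 3x3 array of vertices), per the simplification in the text."""
    a, b, c = tri
    n = np.cross(b - a, c - a)          # plane normal of the triangle
    n = n / np.linalg.norm(n)
    return abs(np.dot(p - a, n))

def point_to_mesh_distance(p, triangles):
    """dist(p, M) = min over triangles t of dist(p, t)."""
    return min(point_to_plane_distance(p, t) for t in triangles)

# Toy example: one unit triangle in the z = 0 plane, point 0.5 above it.
tris = [np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])]
print(point_to_mesh_distance(np.array([0.2, 0.2, 0.5]), tris))  # 0.5
```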

Deviation Comparison

Figure 4 – An example of people “noise” in the laser scan.

No attempt was made to align the point cloud and the 3D model directly; both were georeferenced in the RGF93 CC43 coordinate system. Rather, the alignment was made using the seven control points.
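The text does not state how the control-point alignment was computed; a standard way to derive a rigid transform from matched control points is a least-squares fit via the Kabsch algorithm, sketched below with illustrative function names and toy data.

```python
import numpy as np

def rigid_align(src, dst):
    """Least-squares rigid transform (R, t) mapping src onto dst
    (Kabsch algorithm). src and dst are (N, 3) arrays of matched points."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)          # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

# Toy check: recover a known 30-degree rotation about z plus a translation.
theta = np.deg2rad(30)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
pts = np.random.default_rng(0).random((7, 3))    # seven "control points"
R, t = rigid_align(pts, pts @ R_true.T + np.array([1.0, 2.0, 3.0]))
print(np.allclose(R, R_true), np.allclose(t, [1.0, 2.0, 3.0]))
```

With seven control points the fit is over-determined, which is what makes residual deviations between the two georeferenced models meaningful.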

It is important to note that the photos and the laser point cloud were not captured at the same time. As a result, there are differences between the two data captures: some objects, such as planks, had been moved between captures, and there is “noise” (for example, people) featured in the LiDAR data but not in the photographs (see Figure 4). The scan data also contains other outliers caused by the weather and various material properties.

For these reasons, the comparison between the photogrammetric 3D model and the LiDAR data was made (see Figure 5) only on the parts of the clouds that were the same in the two captures, enabling the elimination of noise and other artifacts such as moving objects.

The results of the deviation comparison are summarized in Table 1. CloudCompare provided the distance between the point cloud and the mesh, which was treated as the error. The columns give the average error, the median error, the smallest range containing 90 percent of the points, and the root mean square error.
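The Table 1 columns can be reproduced from a vector of point-to-mesh distances. The sketch below uses synthetic data rather than the paper’s measurements, and interprets the “smallest range containing 90 percent of the points” as [0, 90th percentile], since the distances are unsigned.

```python
import numpy as np

def deviation_stats(distances):
    """Summary statistics for point-to-mesh distances (in meters):
    mean, median, the 90th percentile of absolute distances (the upper
    end of the smallest [0, x] range holding 90% of points), and RMSE."""
    d = np.abs(np.asarray(distances, dtype=float))
    return {
        "mean": d.mean(),
        "median": np.median(d),
        "range_90": np.percentile(d, 90),
        "rmse": np.sqrt(np.mean(d ** 2)),
    }

# Illustrative synthetic distances (not the paper's data): noise with a
# 2 cm standard deviation, comparable in scale to the reported errors.
rng = np.random.default_rng(1)
stats = deviation_stats(rng.normal(0.0, 0.02, 100_000))
print({k: round(v, 4) for k, v in stats.items()})
```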
Figure 5 – Distance between the point cloud and the mesh for Extract 2 (above) and for Extract 3 (below).

Two primary factors can explain the deviations. First, outliers are present in the scene: because we decided to work on the raw point cloud, these outliers influence the results. Figure 6 shows these outliers.

As shown in the three extracts (see Figure 2), 90 percent of the points are no more than 3 centimeters away from the surface. The average error is 1.5 centimeters, and the root mean square error is about 2.5 centimeters.

Since the resolution of the pictures is 1 centimeter per pixel, we can see that the root mean square error of 2.5 centimeters is equivalent to just 2-3 pixels. This is an excellent result, and we can extrapolate that if we had used much higher resolution imagery, the deviation between the two 3D models would be equivalent to just 2-3 pixels of the input photography.

Another factor to consider when comparing deviations between the two models is the distance of the data capture devices from the castle during data acquisition.

Laser scans were shot from the ground, whereas the pictures were shot from roughly 400 meters in the air, looking down on the castle. Thus, photogrammetry cannot reconstruct the bottoms of the arches of the castle, as shown in Figure 6. If the photographers had walked around the castle to capture ground-based photography, the photogrammetric model could have been complete there as well.

Table 1 – Comparison results for the extracts of Penne Castle.

Acquisition and Processing Duration

We also compared the time required to reconstruct the castle using the LiDAR scan versus the photogrammetry:

• The 46 LiDAR scans were taken in four hours, and two hours of additional processing were needed for the registration of the different scans. The total time to produce the complete point cloud was therefore six hours.


• For the photogrammetry, the flight time needed to acquire the pictures was one hour, and the production of the 3D model required two hours and 30 minutes. The total time to produce the 3D model was therefore three hours and 30 minutes.

This comparison indicates that photogrammetry enables faster production of 3D models – and with less labor and expensive equipment required.

Figure 6 – On the left, note the outliers. On the right, note that the bottoms of the arches are not precise, because they do not appear in the pictures.

Key Findings
When comparing the 3D models created from photogrammetry and from LiDAR, we found photogrammetry to be a viable method for generating useful site surveys and documentation. While the LiDAR point cloud produces accurate models, it requires more time and more sophisticated, expensive, and difficult-to-operate equipment than the photogrammetric reconstruction.

Photogrammetric reconstructions achieve accuracy similar to point clouds when very high-resolution photography is captured. Photogrammetry also allows for quicker data acquisition and easier processing, because users do not need any particular training. Finally, photogrammetry produces a photo-textured mesh that is much easier to understand visually. Reality meshes are also easier for CAD and GIS applications to analyze and process, since they carry a real shape and information on existing conditions.

Figure 7 – Extracts of the 3D model created from pictures.

These benefits make photogrammetry a viable and cost-effective option for large and small projects that could benefit from 3D models of existing conditions and sites. Now, smaller projects that would not normally warrant the costs of a point cloud capture can be captured using a digital camera.


To learn more about ContextCapture and all of Bentley’s reality modeling products, visit www.bentley.com/RealityModeling

-------------------------------------------------------------------------------------------------------------------------------------------------------------------

Prepared by:


Cyril Novel – Senior Software Engineer, Bentley Systems
Renaud Keriven – Director, Software Development, Bentley Systems
Philippe Graindorge – Gerpho
Florent Poux – Geovast 3D

References:

1. Hartley, R. I. and Zisserman, A., Multiple View Geometry in Computer Vision, 2nd ed., Cambridge University Press, ISBN: 0521540518, 2004.

2. Seitz, S. M., Curless, B., Diebel, J., Scharstein, D., and Szeliski, R., A comparison and evaluation of multi-view stereo reconstruction algorithms. Computer Vision and Pattern Recognition, 2006.

3. Remondino, F., Heritage recording and 3D modeling with photogrammetry and 3D scanning. Remote Sensing, 3(6), 1104-1138, 2011.

4. Strecha, C., von Hansen, W., Van Gool, L., Fua, P., and Thoennessen, U., On benchmarking camera calibration and multi-view stereo for high resolution imagery. Computer Vision and Pattern Recognition, 2008.

5. El-Hakim, S. F., Beraldin, J. A., Picard, M., and Godin, G., Detailed 3D reconstruction of large-scale heritage sites with integrated techniques. IEEE Computer Graphics and Applications, 24(3), 21-29, 2004.

6. Remondino, F. and El-Hakim, S., Image-based 3D modelling: a review. The Photogrammetric Record, 21, 269-291, 2006.
