Internship Report

Real-time Change Detection Between Drone Video and 3D Reference Model

Internship by Esteban Mateos, in cooperation with Polytech Sorbonne, part of Sorbonne University in Paris, France.

Abstract

This report presents the findings and outcomes of my internship, focusing on the feasibility of real-time change detection using a 3D reference model and drone-captured videos. The objective was to rapidly identify changes in the environment, such as fallen trees or suspicious objects.

During the familiarization phase, extensive research was conducted on existing algorithms and on feature detection and matching techniques. This involved studying various approaches to identifying keypoints in images, which are then matched across images to locate differences.
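
As a concrete illustration, the sketch below detects and matches keypoints between a reference image and a video frame with OpenCV's ORB detector. This is a minimal example with placeholder file names; the report surveys several such techniques and does not prescribe ORB specifically.

```python
import cv2

# Load a reference image and a video frame (placeholder file names).
ref = cv2.imread("reference.png", cv2.IMREAD_GRAYSCALE)
frame = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)

# Detect keypoints and compute binary descriptors with ORB.
orb = cv2.ORB_create(nfeatures=2000)
kp_ref, des_ref = orb.detectAndCompute(ref, None)
kp_frame, des_frame = orb.detectAndCompute(frame, None)

# Brute-force matching with Hamming distance (suited to ORB's binary
# descriptors), best matches first.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des_ref, des_frame), key=lambda m: m.distance)

# Regions with few or no matches are candidates for differences between
# the two images.
vis = cv2.drawMatches(ref, kp_ref, frame, kp_frame, matches[:50], None)
cv2.imwrite("matches.png", vis)
```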

The internship involved working with drone-captured videos, a 3D model generated through photogrammetry, and telemetry data. Aligning the 3D model with the video frames enabled the creation of an image database for further analysis. Challenges such as geographic offsets were encountered during this process and had to be corrected manually.
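
A hypothetical sketch of this alignment step is shown below: drone telemetry (position and attitude) places a virtual camera over the 3D model, and a constant manual offset compensates for the observed geographic misalignment. The conventions (angle order, units) and the offset values are illustrative assumptions, not taken from the report.

```python
import numpy as np

def rotation_from_attitude(yaw, pitch, roll):
    """Build a camera rotation matrix from drone attitude angles (radians)."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    return Rz @ Ry @ Rx

# Constant offset correcting the geographic misalignment between telemetry
# and the photogrammetry model (values are purely illustrative).
manual_offset = np.array([1.5, -0.8, 0.0])

def camera_pose(position, yaw, pitch, roll):
    """Return a 4x4 pose matrix used to render a reference view of the model."""
    pose = np.eye(4)
    pose[:3, :3] = rotation_from_attitude(yaw, pitch, roll)
    pose[:3, 3] = np.asarray(position) + manual_offset
    return pose
```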

To address the limitations of existing algorithms, the Segment Anything Model (SAM) was explored. SAM, a promptable segmentation model, provides high-quality object masks and flexible segmentation based on specific prompts. SAM was used to identify and compare objects in the images.
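
The snippet below shows how SAM's automatic mask generator is typically invoked through Meta's segment-anything library; the checkpoint file name matches the published ViT-H weights, while the input image path is a placeholder.

```python
import cv2
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

# Load the ViT-H SAM checkpoint and wrap it in the automatic mask generator.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
mask_generator = SamAutomaticMaskGenerator(sam)

# SAM expects an RGB image as a HxWx3 uint8 array.
image = cv2.cvtColor(cv2.imread("frame.png"), cv2.COLOR_BGR2RGB)
masks = mask_generator.generate(image)
# Each entry holds a binary 'segmentation' mask plus metadata such as
# 'area' and 'bbox'.
```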

The comparison process involved masking, sorting masks based on area and color, and determining differences between the reference image and video frames. Results were visually presented, highlighting the detected differences on an interactive 3D map.
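
A simplified sketch of that comparison logic, assuming SAM-style mask dictionaries with 'segmentation' and 'area' keys: each mask is summarized by its area and mean color, and frame masks with no plausible counterpart in the reference are flagged as differences. The matching tolerances are illustrative assumptions.

```python
import numpy as np

def summarize(masks, image):
    """Return (area, mean RGB color) summaries, largest masks first."""
    masks = sorted(masks, key=lambda m: m["area"], reverse=True)
    return [(m["area"], image[m["segmentation"]].mean(axis=0)) for m in masks]

def differences(ref_summaries, frame_summaries, area_tol=0.2, color_tol=30.0):
    """Indices of frame masks with no matching reference mask."""
    unmatched = []
    for i, (area_f, color_f) in enumerate(frame_summaries):
        found = any(
            abs(area_f - area_r) / max(area_r, 1) < area_tol
            and np.linalg.norm(color_f - color_r) < color_tol
            for area_r, color_r in ref_summaries
        )
        if not found:
            unmatched.append(i)
    return unmatched
```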

Throughout the internship, various challenges were encountered, including differences between images, the unusual aerial perspective, limited datasets, and learning to use new software. Collaboration and knowledge-sharing within the company played a crucial role in overcoming these challenges. GPU computing posed a further challenge due to the resource-intensive nature of the SAM model: without CUDA access on the personal computer, mask generation was considerably slower. The computational demands of SAM also made it impractical to embed the solution in a drone, necessitating a division of tasks between onboard real-time image generation and comparison on a more powerful system.
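
In practice, this constraint reduces to a device-selection fallback like the sketch below, using PyTorch's standard CUDA check; the `sam` object refers to the model loaded in the earlier SAM snippet.

```python
import torch

# Use the GPU when CUDA is available; otherwise accept much slower
# CPU-based mask generation.
device = "cuda" if torch.cuda.is_available() else "cpu"
sam.to(device=device)
```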

In conclusion, the internship provided valuable insights into real-time change detection using 3D models and drone-captured videos. The study highlighted the effectiveness of the SAM model for promptable segmentation. The findings contribute to the understanding of image comparison techniques and of the challenges associated with using large machine learning models in resource-constrained environments.

To go further

During the internship, several potential avenues for improving the algorithm and enhancing the capabilities of the system were explored. Although no specific methods have been identified at this stage, continuous exploration of potential approaches is crucial to address existing challenges and optimize the project’s performance.

One aspect that requires attention is reducing the details missed because of lighting variations and improving accuracy by mitigating false positives and false negatives. A promising solution is to train a specialized machine learning model dedicated to this purpose. Such a model could improve the algorithm’s ability to compare elements, minimizing errors and yielding more reliable results.

Another area of interest is the recognition of overlapping elements. By developing a method to identify when two or more elements overlap, we can enhance the system’s ability to display a single, clear representation of these elements. This improvement is particularly valuable in scenarios where multiple objects overlap, as it enhances recognition and understanding. Additionally, the exploration of photogrammetry techniques can be considered to generate detailed 3D representations of the overlapping elements, providing a more immersive visualization.
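
One possible way to detect such overlaps, sketched below under assumed inputs, is to compute the intersection-over-union (IoU) of two binary masks and merge them above a threshold; the threshold value is an illustrative assumption.

```python
import numpy as np

def masks_overlap(mask_a, mask_b, iou_threshold=0.1):
    """True if two boolean masks overlap significantly (by IoU)."""
    intersection = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return union > 0 and intersection / union > iou_threshold

def merge_overlapping(mask_a, mask_b):
    """Combine two overlapping masks into a single representation."""
    return np.logical_or(mask_a, mask_b)
```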

Optimizing the model to run onboard the drone presents a significant challenge due to current hardware limitations. Real-time comparison, a crucial requirement, necessitates CUDA, which relies on a high-performance graphics card that is not compatible with our embedded drone solution. To overcome this obstacle, a hybrid approach can be adopted: a portion of the process runs onboard the drone itself, achieving real-time image generation during the flight over the area of interest, while the remaining comparison is conducted on a computer once the drone is connected to it. This division allows for efficient use of resources, with the drone handling image generation and the computer performing the comparison.

Interactions with elements can be further improved by implementing an ‘OnClick’ functionality. This feature would enable users to interact with the system by removing elements or retrieving additional information simply by clicking on them.

These potential developments and enhancements offer exciting prospects for the future of the project. By continuing to explore and implement these ideas, we can strive for increased accuracy, real-time performance, and a more interactive user experience.

More information

Mateos, Esteban. Real-time Change Detection Between Drone Video and 3D Reference Model (2023).

Read more in the full report below.

Open the full report as PDF

Related knowledge

Master’s Thesis: Enabling geospatial hybrid-collaboration

Collaborative single-display teamwork without any widget or button: a case study using Carmenta Engine. This thesis introduced a new way of interacting with Carmenta Engine on a touch screen. Its results show that, by using hand chords, several users can work on one large touchscreen simultaneously.

Read more