Transfer learning and performance enhancement techniques for deep semantic segmentation of built heritage point clouds

Francesca Matrone, Massimo Martini

Abstract

The growing availability of three-dimensional (3D) data, such as point clouds, from Light Detection and Ranging (LiDAR), Mobile Mapping Systems (MMSs) or Unmanned Aerial Vehicles (UAVs) provides the opportunity to rapidly generate 3D models that support restoration, conservation, and safeguarding activities for cultural heritage (CH). The so-called scan-to-BIM process can benefit from such data, and the data can themselves be a source for further analyses or activities on the archaeological and built heritage. There are several ways to exploit this type of data, such as Historic Building Information Modelling (HBIM), mesh creation, rasterisation, classification, and semantic segmentation. The latter, applied to point clouds, is a trending topic not only in the CH domain but also in other fields such as autonomous navigation, medicine or retail. It is precisely in these sectors that semantic segmentation has mainly been exploited and developed with artificial intelligence techniques. In particular, machine learning (ML) algorithms, and their deep learning (DL) subset, are increasingly applied and have established a solid state of the art over the last five years. However, applications of DL techniques to heritage point clouds are still scarce; we therefore propose to address this task within the built heritage field. Starting from previous tests with the Dynamic Graph Convolutional Neural Network (DGCNN), this contribution pays close attention to: i) the investigation of fine-tuned models, used as a transfer learning technique, ii) the combination of external classifiers, such as Random Forest (RF), with the artificial neural network, and iii) the evaluation of the data augmentation results for the domain-specific ArCH dataset. Finally, after taking into account the main advantages and limitations, we consider whether non-programmers and domain experts can also benefit from this methodology.

Highlights:

  • Semantic segmentation of built heritage point clouds through deep neural networks can achieve performance comparable to that of more consolidated state-of-the-art ML classifiers.

  • Transfer learning approaches, such as fine-tuning, can considerably reduce computational time for CH domain-specific datasets as well, and can improve metrics for some challenging categories (e.g. windows or mouldings).

  • Data augmentation techniques do not significantly improve overall performance.
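
For readers unfamiliar with data augmentation on point clouds, the idea can be illustrated with a minimal sketch. The abstract does not state which transformations were evaluated, so the rotation about the vertical axis, the uniform scaling and the Gaussian jitter below are illustrative assumptions, as are the function name `augment_point_cloud` and its default parameters. Per-point class labels would be left unchanged by these transformations.

```python
import numpy as np


def augment_point_cloud(points, rng, max_scale_delta=0.1, jitter_sigma=0.01):
    """Return a randomly rotated, scaled and jittered copy of an (N, 3) array.

    Hypothetical helper: the transformations and default ranges are
    illustrative assumptions, not the settings used in the paper.
    """
    points = np.asarray(points, dtype=float)

    # Random rotation about the vertical (z) axis, which keeps walls upright.
    theta = rng.uniform(0.0, 2.0 * np.pi)
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    rotation = np.array([[cos_t, -sin_t, 0.0],
                         [sin_t, cos_t, 0.0],
                         [0.0, 0.0, 1.0]])

    # Uniform scaling by a factor close to 1.
    scale = rng.uniform(1.0 - max_scale_delta, 1.0 + max_scale_delta)

    # Small Gaussian noise added independently to every coordinate.
    jitter = rng.normal(0.0, jitter_sigma, size=points.shape)

    return scale * (points @ rotation.T) + jitter


# Synthetic stand-in for a scene block; real input would be scanned points.
rng = np.random.default_rng(0)
cloud = rng.uniform(-1.0, 1.0, size=(1024, 3))
augmented = augment_point_cloud(cloud, rng)
```

Each call produces a differently perturbed copy of the same scene, so a network sees more geometric variety during training without any additional annotation effort.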


Keywords

cultural heritage; semantic segmentation; deep learning; deep neural networks; point clouds




Creative Commons License

This journal is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Universitat Politècnica de València

Official journal of the Spanish Society of Virtual Archaeology

e-ISSN: 1989-9947   https://dx.doi.org/10.4995/var