Transfer learning and performance enhancement techniques for deep semantic segmentation of built heritage point clouds
The growing availability of three-dimensional (3D) data, such as point clouds, coming from Light Detection and Ranging (LiDAR), Mobile Mapping Systems (MMSs) or Unmanned Aerial Vehicles (UAVs), provides the opportunity to rapidly generate 3D models to support the restoration, conservation, and safeguarding activities of cultural heritage (CH). The so-called scan-to-BIM process can, in fact, benefit from such data, and they can themselves be a source for further analyses or activities on the archaeological and built heritage. There are several ways to exploit this type of data, such as Historic Building Information Modelling (HBIM), mesh creation, rasterisation, classification, and semantic segmentation. The latter, referring to point clouds, is a trending topic not only in the CH domain but also in other fields like autonomous navigation, medicine or retail. Precisely in these sectors, the task of semantic segmentation has been mainly exploited and developed with artificial intelligence techniques. In particular, machine learning (ML) algorithms, and their deep learning (DL) subset, are increasingly applied and have established a solid state-of-the-art in the last half-decade. However, applications of DL techniques on heritage point clouds are still scarce; therefore, we propose to tackle this framework within the built heritage field. Starting from some previous tests with the Dynamic Graph Convolutional Neural Network (DGCNN), in this contribution close attention is paid to: i) the investigation of fine-tuned models, used as a transfer learning technique, ii) the combination of external classifiers, such as Random Forest (RF), with the artificial neural network, and iii) the evaluation of the data augmentation results for the domain-specific ArCH dataset. Finally, after taking into account the main advantages and criticalities, considerations are made on the possibility to profit by this methodology also for non-programming or domain experts.
Semantic segmentation of built heritage point clouds through deep neural networks can provide performances comparable to those of more consolidated state-of-the-art ML classifiers.
Transfer learning approaches, as fine-tuning, can considerably reduce computational time also for CH domain-specific datasets, as well as improve metrics for some challenging categories (i.e. windows or mouldings).
Data augmentation techniques do not significantly improve overall performances.
Armeni, I., Sener, O., Zamir, A. R., Jiang, H., Brilakis, I., Fischer, M., & Savarese, S. (2016). 3D semantic parsing of large-scale indoor spaces. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1534-1543. https://doi.org/10.1109/CVPR.2016.170
Baraldi, L., Cornia, M., Grana, C., & Cucchiara, R. (2018). Aligning text and document illustrations: towards visually explainable digital humanities. In 24th International Conference on Pattern Recognition (ICPR), 1097-1102. IEEE. https://doi.org/10.1109/ICPR.2018.8545064
Bassier, M., Yousefzadeh, M., & Vergauwen, M. (2020). Comparison of 2D and 3D wall reconstruction algorithms from point cloud data for as-built BIM. Journal of Information Technology in Construction (ITcon), 25(11), 173-192. https://doi.org/10.36680/j.itcon.2020.011
Boulch, A., Guerry, J., Le Saux, B., & Audebert, N. (2018). SnapNet: 3D point cloud semantic labeling with 2D deep segmentation networks. Computers & Graphics, 71, 189-198. https://doi.org/10.1016/j.cag.2017.11.010
Chadwick, J., (2020). Google launches hieroglyphics translator that uses AI to decipher images of Ancient Egyptian script. Available at https://www.dailymail.co.uk/sciencetech/article-8540329/Google-launches-hieroglyphics-translator-uses-AI-decipher-Ancient-Egyptian-script.html Last access 24/11/2020
Fiorucci, M., Khoroshiltseva, M., Pontil, M., Traviglia, A., Del Bue, A., & James, S. (2020). Machine learning for cultural heritage: a survey. Pattern Recognition Letters, 133, 102-108. https://doi.org/10.1016/j.patrec.2020.02.017
Geiger, A., Lenz, P., Stiller, C., & Urtasun, R. (2013). Vision meets robotics: The KITTI dataset. The International Journal of Robotics Research, 32(11), 1231-1237. https://doi.org/10.1177/0278364913491297
Grilli, E., & Remondino, F. (2019). Classification of 3D digital heritage. Remote Sensing, 11(7), 847. https://doi.org/10.3390/rs11070847
Grilli, E., & Remondino, F. (2020). Machine learning generalisation across different 3D architectural heritage. ISPRS International Journal of Geo-Information, 9(6), 379. https://doi.org/10.3390/ijgi9060379
Grilli, E., Özdemir, E., & Remondino, F. (2019a). Application Of Machine And Deep Learning Strategies For The Classification Of Heritage Point Clouds. International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XLII-4/W18, 447–454, 2019. https://doi.org/10.5194/isprs-archives-XLII-4-W18-447-2019
Grilli, E., Farella, E. M., Torresani, A., & Remondino, F. (2019b). Geometric features analysis for the classification of cultural heritage point clouds. International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XLII-2/W15, 541–548, 2019 https://doi.org/10.5194/isprs-archives-XLII-2-W15-541-2019
Hackel, T., Savinov, N., Ladicky, L., Wegner, J. D., Schindler, K., & Pollefeys, M. (2017). Semantic3d.net: A new large-scale point cloud classification benchmark. arXiv:1704.03847
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, 770-778. arXiv:1512.03385
Korc, F., & Förstner, W. (2009). eTRIMS Image Database for interpreting images of man-made scenes. Dept. of Photogrammetry, University of Bonn, Tech. Rep. TR-IGG-P-2009-01.
Landrieu, L., & Simonovsky, M. (2018). Large-scale point cloud semantic segmentation with superpoint graphs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 4558-4567. arXiv:1711.09869
Llamas, J., M Lerones, P., Medina, R., Zalama, E., & Gómez-García-Bermejo, J. (2017). Classification of architectural heritage images using deep learning techniques. Applied Sciences, 7(10), 992. https://doi.org/10.3390/app7100992
Mathias, M., Martinovic, A., Weissenberg, J., Haegler, S., & Van
Gool, L. (2011). Automatic architectural style recognition. International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XXXVIII-5/W16, 171–176 3. https://doi.org/10.3390/app7100992
Matrone, F., Grilli, E., Martini, M., Paolanti, M., Pierdicca, R., & Remondino, F. (2020a). Comparing machine and deep learning methods for large 3D heritage semantic segmentation. ISPRS International Journal of Geo-Information, 9(9), 535. https://doi.org/10.3390/ijgi9090535
Matrone, F., Lingua, A., Pierdicca, R., Malinverni, E. S., Paolanti, M., Grilli, E., Remondino, F., Murtiyoso, A., & Landes, T. (2020b). A benchmark for large-scale heritage point cloud semantic segmentation. International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XLIII-B2-2020, 1419–1426. https://doi.org/10.5194/isprs-archives-XLIII-B2-2020-1419-2020
Murtiyoso, A., & Grussenmeyer, P. (2019a). Automatic heritage building point cloud segmentation and classification using geometrical rules. International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XLII-2/W15, 821-827. https://doi.org/10.5194/isprs-archives-XLII-2-W15-821-2019
Murtiyoso, A., & Grussenmeyer, P. (2019b). Point cloud segmentation and semantic annotation aided by GIS data for heritage complexes. International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XLII-2/W9, 523–528, 2019. https://doi.org/10.5194/isprs-archives-XLII-2-W9-523-2019
Oses, N., Dornaika, F., & Moujahid, A. (2014). Image-based delineation and classification of built heritage masonry. Remote Sensing, 6(3), 1863-1889. https://doi.org/10.3390/rs6031863
Park, Y., & Guldmann, J. M. (2019). Creating 3D city models with building footprints and LIDAR point cloud classification: A machine learning approach. Computers, Environment and Urban Systems, 75, 76-89. https://doi.org/10.1016/j.compenvurbsys.2019.01.004
Pierdicca, R., Paolanti, M., Matrone, F., Martini, M., Morbidoni, C., Malinverni, E. S. & Lingua, A. M. (2020). Point cloud semantic segmentation using a deep learning framework for cultural heritage. Remote Sensing, 12(6), 1005. https://doi.org/10.3390/rs12061005
Qi, C. R., Su, H., Mo, K., & Guibas, L. J. (2017). Pointnet: Deep learning on point sets for 3D classification and segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, 652-660. arXiv:1612.00593
Sharafi, S., Fouladvand, S., Simpson, I., & Alvarez, J. A. B. (2016). Application of pattern recognition in detection of buried archaeological sites based on analysing environmental variables, Khorramabad Plain, West Iran. Journal of Archaeological Science: Reports, 8, 206-215. https://doi.org/10.1016/j.jasrep.2016.06.024
Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556
Stathopoulou, E. K., & Remondino, F. (2019). Semantic photogrammetry: boosting image-based 3D reconstruction with semantic labeling. International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 42(2), W9. https://doi.org/10.5194/isprs-archives-XLII-2-W9-685-2019
Teboul, O., Kokkinos, I., Simon, L., Koutsourakis, P., & Paragios, N. (2012). Parsing facades with shape grammars and reinforcement learning. IEEE transactions on pattern analysis and machine intelligence, 35(7), 1744-1756. https://doi.org/10.1109/TPAMI.2012.252.
Teruggi, S., Grilli, E., Russo, M., Fassi, F., & Remondino, F. (2020). A hierarchical machine learning approach for multi-level and multi-resolution 3D point cloud classification. Remote Sensing, 12(16), 2598. https://doi.org/10.3390/rs12162598
Tyleček, R., & Šára, R. (2013). Spatial pattern templates for recognition of objects with regular structure. In German Conference on Pattern Recognition, Springer, Berlin, Heidelberg, 364-374. https://doi.org/10.1007/978-3-642-40602-7_39
Verschoof-van der Vaart, W. B., & Lambers, K. (2019). Learning to Look at LiDAR: the use of R-CNN in the automated detection of archaeological objects in LiDAR data from the Netherlands. Journal of Computer Applications in Archaeology, 2(1). https://doi.org/10.5334/jcaa.32
Wang, Y., Sun, Y., Liu, Z., Sarma, S. E., Bronstein, M. M., &
Solomon, J. M. (2019). Dynamic graph CNN for learning on point clouds. ACM Transactions On Graphics, 38(5), 1-12. arXiv:1801.07829
Weinmann, M., Jutzi, B., Hinz, S., & Mallet, C. (2015). Semantic point cloud interpretation based on optimal neighborhoods, relevant features and efficient classifiers. ISPRS Journal of Photogrammetry and Remote Sensing, 105, 286-304. https://doi.org/10.1016/j.isprsjprs.2015.01.016
Xie, Y., Tian, J., & Zhu, X. X. (2019). Linking points with labels in 3D: a review of point cloud semantic segmentation. arXiv:1908.08854
Yan, H., Ding, Y., Li, P., Wang, Q., Xu, Y., & Zuo, W. (2017). Mind the class weight bias: Weighted maximum mean discrepancy for unsupervised domain adaptation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, (pp. 2272-2281). arXiv:1705.00609 https://doi.org/10.1109/CVPR.2017.107
Metrics powered by PLOS ALM
- There are currently no refbacks.
This journal is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Universitat Politècnica de València
Official journal of Spanish Society of Virtual Archaeology
e-ISSN: 1989-9947 https://dx.doi.org/10.4995/var