Analysis of cross-validation methods for robust retrieval of biophysical parameters

Authors

  • Ll. Pérez-Planells, Universitat de València
  • J. Delegido, Universitat de València
  • J.P. Rivera-Caicedo, Universitat de València
  • J. Verrelst, Universitat de València

DOI:

https://doi.org/10.4995/raet.2015.4153

Keywords:

Hold-Out, k-fold, Cross-validation, MLRA, Gaussian Process Regression, Kernel Ridge Regression

Abstract

Non-parametric regression methods are powerful statistical tools for retrieving biophysical parameters from remote sensing measurements. However, their performance depends on the samples presented during the training phase. To ensure robust retrievals, cross-validation sub-sampling methods are often used, which evaluate the model on subsets of the field dataset. Here, two cross-validation techniques were analyzed in the development of non-parametric regression models: hold-out and k-fold. The linear non-parametric regression methods selected were least squares Linear Regression (LR) and Partial Least Squares Regression (PLSR), and the nonlinear methods were Kernel Ridge Regression (KRR) and Gaussian Process Regression (GPR). Cross-validation results showed that LR was the least stable method, while KRR and GPR led to more robust results. This work recommends using a nonlinear regression algorithm (e.g., KRR, GPR) in combination with k-fold cross-validation with k=10 to achieve robust retrievals.
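
The two sub-sampling schemes compared above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, a one-predictor least squares line stands in for the regression methods of the study, the 70/30 hold-out fraction is an assumption, and all function names (`fit_line`, `hold_out_split`, `k_fold_splits`, `evaluate`) are hypothetical. Only the k=10 setting comes from the abstract.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x for a single predictor."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def rmse(model, xs, ys):
    """Root mean squared error of the fitted line on the given samples."""
    a, b = model
    return (sum((a + b * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)) ** 0.5


def hold_out_split(n, train_fraction, rng):
    """Hold-out: one random partition into a training and a validation part."""
    idx = list(range(n))
    rng.shuffle(idx)
    cut = int(round(train_fraction * n))
    return idx[:cut], idx[cut:]


def k_fold_splits(n, k, rng):
    """k-fold: every sample is used for validation exactly once."""
    idx = list(range(n))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def evaluate(xs, ys, train, test):
    """Train on the training indices, report RMSE on the validation indices."""
    model = fit_line([xs[i] for i in train], [ys[i] for i in train])
    return rmse(model, [xs[i] for i in test], [ys[i] for i in test])


# Synthetic stand-in for a field dataset (assumed, not from the study).
rng = random.Random(0)
n = 100
xs = [rng.uniform(0, 6) for _ in range(n)]
ys = [0.5 + 0.8 * x + rng.gauss(0, 0.3) for x in xs]

# Hold-out: a single 70/30 split, so the score depends on one random draw.
train, test = hold_out_split(n, 0.7, rng)
hold_out_rmse = evaluate(xs, ys, train, test)

# k-fold with k=10: ten scores whose mean and spread can both be inspected.
fold_rmses = [evaluate(xs, ys, tr, te) for tr, te in k_fold_splits(n, 10, rng)]
k_fold_rmse = statistics.fmean(fold_rmses)

print(f"hold-out RMSE: {hold_out_rmse:.3f}")
print(f"10-fold RMSE:  {k_fold_rmse:.3f} (std {statistics.stdev(fold_rmses):.3f})")
```

The spread of the per-fold scores gives a rough indication of how sensitive a model is to the choice of training samples, which is the kind of stability the abstract refers to.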

Author Biographies

Ll. Pérez-Planells, Universitat de València

Dpto. Física de la Tierra y Termodinámica, Facultad de Física

J. Delegido, Universitat de València

Image Processing Laboratory (IPL)

J.P. Rivera-Caicedo, Universitat de València

Image Processing Laboratory (IPL)

J. Verrelst, Universitat de València

Image Processing Laboratory (IPL)

Published

2015-12-22

Section

Practical cases