Statistical analyses
A set of statistical analyses was performed on the data (Figure 1). A Random Forest (RF) model was applied to
identify the spectral bands carrying the most relevant information for predicting SOC at the plot scale. This model
showed that the band at 1697 cm⁻¹ can be used to model SOC content. Partial Least Squares (PLS) regression was then
used to determine the relationships between the absorbance of that band and a set of easily accessible environmental
covariates in the form of raster maps (climate, land use and geology). The PLS model was used to create a map of
the estimated absorbance of this band across the whole study region. We then calibrated a linear regression model
(MLR) to quantify the contribution of the 1697 cm⁻¹ absorbance to SOC values and, finally, used this model to
predict SOC across the whole study area.