Skip to main content
  • Research Paper
  • Published:

Integration of field sampling and LiDAR data in forest inventories: comparison of area-based approach and (lognormal) universal kriging

Abstract

Key message

We compared (lognormal) universal kriging with the area-based approach for estimation of forest inventory variables using LiDAR data as auxiliary information and showed that universal kriging could be an accurate alternative when there is spatial autocorrelation.

Context

Forest inventories supported by geospatial technologies are essential to achieve a spatially informed assessment of forest structure. LiDAR technology supplies comprehensive and spatially explicit data enabling the estimation of wide-scale forest variables.

Aims

To compare the area-based approach with universal kriging for estimation of the stem density, basal area, and quadratic mean diameter using LiDAR data as auxiliary information.

Methods

We used data from 202 inventory plots, distributed in four Forest Management Units with differences in structure and management, and a 6-points/m2 resolution LiDAR dataset from a Pinus sylvestris L. forest in Spain to test the accuracy of the (lognormal) universal kriging and the area-based approaches.

Results

In those Forest Management Units where the analyzed variables showed spatial autocorrelation, kriging showed better results than the area-based approach in terms of RMSE and Pearson coefficient between observed and estimated values, although lognormal universal kriging provided slightly biased estimations (up to 2%).

Conclusion

Universal kriging is an accurate method for estimation of forest inventory variables with LiDAR data as auxiliary information for those variable exhibiting spatial autocorrelation.

1 Introduction

Forest monitoring is essential to design management strategies and conservation policies (Scott and Gove 2002; Köhl et al. 2006). Forest inventories are the main sources of information to assess forest structure and for monitoring forest dynamics. Geographic information systems (GIS) and remote sensing technologies enable forest resources to be mapped through wall-to-wall estimation of variables from plot measurements and have transformed how forest inventory data are employed (Tomppo et al. 2008).

Light Detection and Ranging (LiDAR) provides accurate terrain elevation models and comprehensive and spatially explicit information of vegetation height (Lindberg et al. 2012), enabling the estimation of forest attributes over large areas (Ayrey and Hayes 2018). LiDAR has become an essential tool for forest managers (White et al. 2013), over recent decades, with LiDAR-derived metrics supported by field inventory data proving to be a useful approach for the evaluation of forest structure (Blomdahl et al. 2019), estimation of fire-related variables (Hall et al. 2005; Marino et al. 2018), biomass assessment (Hall et al. 2005), and measurement of individual tree attributes or species characterization (Lin and Hyyppä 2016; Harikumar et al. 2017). LiDAR height metrics are highly correlated with growing stock variables, such as basal area (BA), volume, and aboveground biomass in forests with relatively simple structure (Lefsky et al. 1999; Means et al. 2000). But relationships between height and BA and biomass are not so straightforward for complex forest structures (Palace et al. 2015).

Statistical inference is required to evaluate the field measurements over large areas, usually incorporating LiDAR information through regression models (White et al. 2013). The area-based approach (ABA) (Næsset 2002) for the prediction of forest variables is a successful technique for the integration of LiDAR and field inventory data in forest inventories (White et al. 2013). Under ABA, linear models can be built and applied to obtain estimates of forest attributes over an entire study area using wall-to-wall LiDAR metrics as auxiliary variables. Other modeling approaches have been applied with ABA, such as the k-nearest neighbor method (Tomppo and Halme 2004) or the linear mixed-effects models (Mauro et al. 2017). The ABA has been widely applied for estimating forest stand attributes at fine scale (Means et al. 2000; Magnussen et al. 2012) or for generalized prediction models from forest inventory data (Næsset 2004; Bouvier et al. 2015).

Spatial autocorrelation is a common characteristic of many forest inventory variables linked to site and climate factors as well as ecological processes driving the stand development (Legendre 1993). In this sense, kriging exploits the spatial autocorrelation of forest inventory variables to increase the accuracy of estimations (Montes et al. 2005). When there is an underlying variable distribution explained by auxiliary variables, universal kriging (UK) incorporates auxiliary information as a mean function (Montes and Ledo 2010), and the residuals between the observed values and the mean function exhibit spatial autocorrelation. Nowadays, many forest inventories combine information from two phases: information from remotely sensed data, such as satellite images, aerial photogrammetry, or LiDAR, is collected in the first phase, whereas the second phase comprises terrestrial plots (Mandallaz and RongHua 1999). Universal kriging employing variables derived from aerial photogrammetry as auxiliary information has proved to be effective in reducing estimation error in forest inventories (Mandallaz 2000). However, kriging estimator optimality is based on normality assumptions (Matheron 1971) that the target variables in a forest inventory often do not meet. Even though logarithmic transformations are often required for these variables, the difficulties involved in back-transforming lognormal kriging interpolations (Dowd 1982) have prevented the use of lognormal kriging in operational forest inventories. However, robust lognormal kriging estimators as proposed by Cressie (2006) or Nussbaum et al. (2014) have demonstrated their viability for interpolation of soil variables.

Our aim in this work is to evaluate and compare the performance of two techniques for estimation of stem density per ha (N), basal area (BA), and quadratic mean diameter (QMD) in dense stands of pines: (i) ABA, where field-based response variables are linked to LiDAR metrics through linear models and (ii) the UK approach for interpolation of field-based variables with LiDAR metrics as auxiliary variables. We also aim to assess the effect of spatial autocorrelation on the UK estimation error and to provide scientific rationale for the choice of ABA or UK in LiDAR-supported forest inventories.

2 Materials and methods

2.1 Study area

The study took place in Valsaín pinewood, a 7622-ha Pinus sylvestris L. forest located in Sierra de Guadarrama, Spain (Fig. 1) at an altitude between 1200 and 2130 m. Valsaín yields good-quality timber and plays an important role as a carbon sink and in the protection of a high-mountain area (Donés and Garrido 2001; Moreno-Fernández et al. 2015).

Fig. 1
figure 1

Location of Valsaín pinewood in Europe and study area delimitation (Valsaín pinewood). Forest Management Units (FMUs) are defined, and field inventory plots are represented with triangles

The study area has an average N of 604 stems/ha with a minimum of 75 stems/ha and a maximum of 2147 stems/ha. The average BA is 42 m2/ha with a minimum of 14 m2/ha and a maximum of 91 m2/ha. QMD shows an average of 33 cm/ha, a minimum of 14 cm/ha, and a maximum of 63 cm/ha. As described in the forest management plan, the area is divided into three Forest Management Units (FMUs). The group shelterwood system is implemented during a 40-year regeneration period in those forest compartments that have reached the rotation age (120 years), while in the other compartments, a moderate thinning regime is applied. Regeneration areas and polewoods predominate in FMU 1, whereas dense mature stands and polewoods predominate in FMUs 2 and 3, respectively. We considered a fourth FMU comprising the pine stands located at altitude > 1800 m since they show specific high-mountain stand structure and are managed for protection. The distribution and characteristics of each FMU are shown in Fig. 1 and Table 1.

Table 1 Characteristics of Forest Management Units (FMUs) in the study area

2.2 Data

2.2.1 Field inventory

Two hundred two plots of 13-m radius were positioned using a Leica SR530 GPS receiver with sub-metric accuracy. Within the plots, the diameter at breast height (DBH) and tree height (H) of all trees with DBH ≥ 7.5 cm were manually measured with a caliper and a Vertex III dendrometer. When modeling with LiDAR, circular plots are recommended (White et al., 2013) since they are easier to establish (Adams et al. 2011) and the edge effects are reduced (Wulder et al. 2012). We evaluated N, BA, and QMD stand variables. The inventory was carried out at the same time as the LiDAR acquisition survey (2009).

2.2.2 LiDAR data

The LiDAR flight covered the entire study area with an average density of 6 points/m2 and a minimum of 1.6 points/m2 which has proven to be appropriate to derive a digital terrain model (DTM), stand volume, or canopy metrics (Jakubowski et al. 2013). The flight details are summarized in Table 2.

Table 2 LiDAR flight plan parameters

Matching and adjustment of LiDAR flights were carried out using TerraMatch software (Terrasolid 2021a), and point classification was performed using TerraScan software (Terrasolid 2021b). A DTM and a digital surface model (DSM) were built with a 1-m cell grid following two steps: (i) automatic and manual detection of atypical points and (ii) automatic ground/not-ground point classification. For the second step, the point cloud was filtered out, bare ground points were automatically extracted with TerraScan software and visually checked for verification, and the point cloud was classified into two single classes: ground and vegetation points.

2.2.3 LiDAR metrics

A set of elevation metrics were derived from the LiDAR point cloud using FUSION software (McGaughey 2018). LiDAR point cloud metrics corresponding to field inventory plots were extracted based on the circular areas of 13-m radius.

We initially defined 43 LiDAR metrics related to height, height variability, vegetation cover (Lefsky et al., 2005), and number of returns (Hawbaker et al. 2010; Bater et al. 2011; Woods et al. 2011) (Table 3). A 2-m threshold was applied to remove ground and understory (Bouvier et al. 2015), considering an appropriate approach when studying mature, coniferous forests (White et al. 2013). A post-mission quality check assessment to evaluate the final product was performed for LiDAR point-cloud metrics as recommended in White et al. (2013).

Table 3 LiDAR metrics considered for modeling

2.3 Methods

2.3.1 Area-based approach

For the prediction of forest variables in the ABA, we applied linear regression models with N, BA, and QMD as Z variables, while 43 preselected LiDAR metrics were tested as auxiliary variables (Table 3). Linear relationships were built with no variable transformation or by using (i) the exponential model (Eq. 1), which assumes a linear relationship between the log transformation of the Z variable and the auxiliary variables (Næsset 2002; Frazer et al. 2011; Bouvier et al. 2015):

$$ \ln \left(\hat{Z}\right)={\beta}_0+\sum \limits_{j=1}^p{\beta}_j{X}_j $$
(1)

and (ii) the potential model, which assumes a linear relationship between the log transformation of both Z and auxiliary variables (Eq. 2).

$$ \ln \left(\hat{Z}\right)={\beta}_0+\sum \limits_{j=1}^p{\beta}_j\ln \left({X}_j\right) $$
(2)

In each FMU, we carried out a preselection of those LiDAR metrics that present a priori relationship with each response variable based on the literature and the expert knowledge of stand structure and dynamics. We then analyzed the Pearson correlation coefficient between LiDAR metrics and the Z variables to identify those metrics with the highest correlation with each response variable. This preselection step provided a set of ten LiDAR variables for each variable and FMU. To obtain the definitive set of predictors for each response variable (N, BA, and QMD), different combinations of no more than four variables were tested by comparing the Akaike information criterion (AIC) calculated by the StepAIC function implemented in MASS R package (Venables and Ripley 2002), discarding redundant variables (Ayrey and Hayes 2018). Seven outliers were detected and filtered out in six models.

In order to determine the final model to be used, we studied the performance of the three models, selected for each response variable and FMU: without variable transformation, with log transformation of Z variable, and with log transformation of Z and LiDAR variables, by comparing the coefficient of determination (R2), Pearson coefficient (ρZ,X) and bias. These coefficients were calculated as follows:

$$ {R}^2=1-\frac{\sum \limits_{i=1}^n{\left({Z}_i-{\hat{Z}}_i\right)}^2}{\sum \limits_{i=1}^n{\left({Z}_i-\overline{Z_i}\right)}^2} $$
(3)
$$ {\rho}_{\mathrm{Z},\mathrm{X}}=\frac{\sigma_{\mathrm{Z}\mathrm{X}}}{\sigma_{\mathrm{Z}}{\sigma}_{\mathrm{X}}} $$
(4)
$$ \mathrm{bias}\left(\%\right)=\frac{\frac{1}{n}\times \sum \limits_{i=1}^n\left({\hat{Z}}_i-{Z}_i\right)}{\overline{Z_i}}\cdotp 100 $$
(5)

where n is the sampling size; Zi is the observed value of variable Z; \( {\hat{Z}}_{\mathrm{i}} \)is the predicted value of variable Z; σZ,X is the covariance of Z and auxiliary variable X; σZ is the standard deviation of Z variable; and σX is the standard deviation of X.

Back-transformation was necessary previous to validation. As a systematic bias is introduced with log-transformation, the application of a model correction factor (CF) was necessary (Bouvier et al. 2015):

$$ \mathrm{CF}=\exp \left(\frac{{\mathrm{SEE}}^2}{2}\right) $$
(6)

where SEE is the standard error of regression estimates obtained with lm function in stats R package (Wilkinson and Rogers 1973; Chambers 1992; R Core Team 2019). Therefore, the back-transformation would be (Eq. 9) for the exponential model, and (Eq. 10) for the potential model.

$$ \hat{Z}=\left({e}^{\beta_0}\prod \limits_{j=1}^p{e}^{\beta_j\cdotp {X}_j}\right)\cdotp CF $$
(7)
$$ \hat{Z}=\left({e}^{\beta_0}\prod \limits_{j=1}^p{X_j}^{\beta_j}\right)\cdotp CF $$
(8)

A leave-one-out cross-validation was carried out to assess model accuracy, estimating the Pearson coefficient, the bias, and the RMSE (%) through Eqs. 4, 5, and 9.

$$ \mathrm{RMSE}\left(\%\right)=\frac{\sqrt{\frac{1}{n}\sum \limits_{i=1}^n{\left({\hat{Z}}_i-{Z}_i\right)}^2}}{\overline{Z_i}}\cdotp 100 $$
(9)

2.3.2 Universal kriging approach

In the UK model, the variable value Z(s0) at the location s0 is expressed as the sum of a mean function of the auxiliary variables Xj (s0) and a spatially autocorrelated residual δ(s0):

$$ Z\left({s}_0\right)=\sum \limits_{\mathrm{j}=0}^p{\beta}_j{X}_j\left({s}_0\right)+\delta \left({s}_0\right) $$
(10)

We used as Z variable the field-based N, BA, or QMD (or the log transformation of these variables) and as auxiliary variables, Xj (s0), the intercept and the LiDAR-derived variables. As the mean function establishes linear relationships between the Z variable and the auxiliary variables, analogous to those of ABA models, the same variable selection and transformations applied for ABA were applied as well for UK (Table 4).

Table 4 Final models for stand structure variables (Z variables) applying area-based approach (ABA) for stem density (N), basal area (BA), and quadratic mean diameter (QMD)

The UK prediction of variable Z at location s0 is a linear combination of values Z in sampling locations si corresponding to the field plots:

$$ P\left(Z;{\boldsymbol{s}}_0\right)=\sum \limits_{\mathrm{i}=1}^n{\lambda}_{\mathrm{i}}Z\left({\boldsymbol{s}}_{\mathrm{i}}\right) $$
(11)

The coefficients λi that weight each observed value Z(si) are calculated minimizing the prediction error under the unbiasedness condition (Cressie 1993). The kriging variance can be calculated as (Cressie 1993):

$$ {\sigma}_{\mathrm{k}}\left({\boldsymbol{s}}_0\right)=\sum \limits_{\mathrm{i}=1}^n{\lambda}_{\mathrm{i}}\gamma \left({\boldsymbol{s}}_0-{\boldsymbol{s}}_{\mathrm{i}}\right)+\sum \limits_{\mathrm{j}=0}^p{m}_{\mathrm{j}}{X}_{\mathrm{j}}\left({\boldsymbol{s}}_0\right) $$
(12)

To calculate λi, the variogram (d) is needed (where d is the distance between locations). Under UK (d) can be estimated from the residual semivariance:

$$ \gamma (d)=\frac{1}{2n(d)}\sum \limits_i^{n(d)}{\left(\delta \left({s}_i\right)-\delta \left({s}_i+d\right)\right)}^2 $$
(13)

Restricted maximum likelihood (REML) was applied for the simultaneous estimation of the coefficients of the mean function and the parameters of the variogram (Harville 1974; Mardia and Marshall 1984). Visual inspection of the residual variogram was used for selection of the variogram model. Finally, the spherical variogram model was fitted for all the variables analyzed.

In those models where log-transformations were applied, lognormal kriging (Dowd 1982) was used. The unbiased estimator of Z(si) when the UK prediction is carried out using the log-transformed variable is given in Eq. 14 (Cressie 2006):

$$ \hat{Z}\left({\boldsymbol{s}}_0\right)=\exp \left(\mathrm{P}\left(\ln (Z);{\boldsymbol{s}}_0\right)+{\sigma}_{\mathrm{k}}\left({\boldsymbol{s}}_0\right)/2-\sum \limits_{\mathrm{j}=0}^p{m}_{\mathrm{j}}{X}_{\mathrm{j}}\left({\boldsymbol{s}}_0\right)\right) $$
(14)

where \( \hat{Z}\left({\boldsymbol{s}}_0\right) \) is the unbiased back transformation of P(ln(Z); s0), which is the UK leave-one-out prediction at validation plot s0, and mj are the Lagrange multipliers to ensure the condition \( \sum \limits_{\mathrm{i}=1}^n{\lambda}_{\mathrm{i}}{X}_{\mathrm{j}}\left({\boldsymbol{s}}_{\mathrm{i}}\right)={X}_{\mathrm{j}}\left({\boldsymbol{s}}_0\right) \).

Leave-one-out cross-validation of the UK model was carried out in order to compare the results of the ABA and the UK approaches. The geostatistical analysis was implemented using the software package Geostat developed in Matlab® by the authors.

2.3.3 Comparison of ABA and UK

The bias (Eq. 5), Pearson coefficient between estimated and observed values (Eq. 4), and RMSE (Eq. 9) for the leave-one-out cross-validation residuals of the ABA and UK models were used to compare the results from both approaches. Box plots were used to compare the prediction errors of both ABA and UK models for each FMU and variable, estimated through SEE and σk, respectively.

3 Results

3.1 Area-based approach

Exponential models were fitted for N in FMUs 1, 2, and 3, and BA in FMUs 1 and 3. QMD models were linear in all the FMUs, whereas in FMU 4, the models were linear for all the variables analyzed (Table 4). In every N model, regardless of the FMU, the LiDAR variables included were related with cover fraction (PFRA2 or PARA2) and mid to high height elevation percentiles (missing in the FMU 2 where we found height mode instead). In BA models, the most frequent variables included were h20 and PFRA2 (FMUs 1 and 3). In the case of QMD, the variables included were h40 in forest units I and II and h70 in FMUs 3 and 4 (Table 4).

3.2 Universal kriging approach

REML estimations of variogram parameters for N, BA, and QMD for each FMU are shown in Table 5. The ratio from the residual variogram partial sill to the empirical variogram sill represents the variance linked to spatial autocorrelation (Montes and Ledo 2010), which is maximum in FMU 4 for all the analyzed variables and minimum in FMU 3 (Fig. 2 and Table 5). The spatial correlation ranged from 141 m in FMU 1 to 2101 m in FMU 4 for N, from 978 m in FMU 2 to 11556 m in FMU 4 for BA and from 1299 in FMU 1 to 4576 m in FMU 2 for QMD. The variance explained by the LiDAR variables (estimated as a ratio from the linear combination of variograms and cross-variograms of the auxiliary variables weighted by their β coefficients to the empirical variogram sill) is lower than the variance linked to spatial autocorrelation, for QMD in FMUs 1 and 3, and BA in FMU 3, whereas the LiDAR metrics stand for more than 50% of the variance of N in FMUs 2 and 3, QMD in FMU 2, and BA in FMU 4 (Fig. 2).

Table 5 Spherical variogram model parameters and trend function coefficients with their respective p-values for stem density (N), basal area (BA), and quadratic mean diameter (QMD) in universal kriging models
Fig. 2
figure 2

Universal kriging variogram model γ (discontinuous line) fitted to the residual variogram (hollow circles) and linear combination of the fitted variogram model and \( \sum \limits_{\mathrm{i}}^{n(d)}\sum \limits_{\mathrm{m}=1}^p\sum \limits_{\mathrm{j}=1}^p{\beta}_{\mathrm{j}}{\beta}_{\mathrm{m}}{\left({X}_{\mathrm{j}}\left({\boldsymbol{s}}_{\mathrm{i}}\right)-{X}_{\mathrm{m}}\left({\boldsymbol{s}}_{\mathrm{i}}+h\right)\right)}^2 \) (continuous line) vs empirical semivariogram (filled circles) of the stem density (N), basal area (BA), and quadratic mean diameter (QMD) in a Forest Management Unit 1 (FMU 1), b FMU 2, c FMU 3, and d FMU 4

3.3 Comparison of ABA and UK approach

FMU 4 shows the lowest correlation between estimates and field observations in both approaches (Fig. 3, Table 6) whereas FMU 2 showed the highest correlation, reaching R2 values up to 73% in QMD. UK obtains better R2 and RMSE results (and similar bias) than ABA for those variables exhibiting strong spatial autocorrelation in those FMUs where linear models were fitted (QMD in FMU 2 and all variables—N, BA, and QMD—in FMU 4). ABA and UK showed very similar results for the remaining combination of FMU and variables where linear relationships were fitted. Lognormal kriging for BA in FMU 1, where this variable also exhibited spatial autocorrelation, improved R2 and RMSE over ABA (from R2 = 0.66 and RMSE = 0.25 with ABA to 0.72 and 0.23 respectively with lognormal universal kriging) at the expense of an increase of bias (from 0.31 with ABA to 1.89 with lognormal universal kriging). Lognormal kriging point prediction bias was less than 2% for all the variables.

Fig. 3
figure 3

Scatterplot of predicted vs observed values in every Forest Management Unit (FMU) for stem density (N), basal area (BA), and quadratic mean diameter (QMD) under area-based approach (ABA) (a, c, e) and universal kriging (UK) approach (b, d, f). The continuous line represents the 1:1 line

Table 6 Comparison of area-based approach and universal kriging approach for stem density (N), basal area (BA), and quadratic mean diameter (QMD), in terms of Pearson coefficient between estimated and observed values, RMSE, R2, and bias

Regarding the prediction errors (Fig. 4), we observe that in every studied model, UK presents a lower error average than ABA, as well as a wider interquartile range of prediction variances due to the effect that the varying distance between plots produces in the final UK prediction.

Fig. 4
figure 4

Boxplot of the prediction errors for stem density (N), basal area (BA), and quadratic mean diameter (QMD) under area-based approach (ABA) and universal kriging (UK) approach, estimated through the standard error of regression estimates (SEE) and kriging error (σk), respectively. a N exponential models. b N linear models. c BA exponential models. d BA linear models. e QMD linear models

4 Discussion

Both the ABA and UK approaches have proved to be useful techniques for integrating field sampling and LiDAR data in forest inventories showing reasonable bias, RMSE, and Pearson coefficient between observed and estimated values for modeling N, BA, and QMD. Our results show that the UK approach may increase the accuracy for those variables exhibiting spatial autocorrelation beyond the distance between sampling plots. The spatial correlation of the analyzed variables depends on the spatial variations of site conditions, disturbance regime, stand regeneration, and development processes and management. FMU 4 presents the greater amount of variance explained by autocorrelation for the analyzed variables, possibly due to the lack of silvicultural treatments in this FMU covering high altitude protection areas, where variability in forest structure is due to changes in the ecological conditions and the natural dynamics, in contrast to the man-induced forest structure in the other managed FMUs, where group shelterwood is applied at forest compartment level, leading to a shorter spatial autocorrelation range.

The Pearson coefficients from the cross-validation for N, BA, and QMD attained through ABA are in general within the range reported in other studies (from 0.38 to 0.67 for N (Silva et al., 2018; Hall et al. 2005), from 0.78 to 0.93 for BA (Hall et al. 2005; Reutebuch et al., 2005; Silva et al., 2017), or from 0.39 to 0.78 for QMD (Næsset 2002). However, in FMU 4, which covers the high elevation areas with a more complex forest structure (due to the natural disturbance regime), ABA showed poorer correlations. In this FMU, UK performs better than ABA in terms of Pearson correlation and RMSE (Table 6). LiDAR cover and height percentiles are often the metrics that show the greatest correlation with those variables related to stand growing stock (Means et al. 2000). In our case, h20 had the greatest explanatory power in the BA models and medium-to-high percentiles (h50, h70, and h95) in N models. Height percentiles are usually correlated, and the inclusion of one or another may depend on the stand structure and the LiDAR characteristics. Furthermore, metrics associated with crown cover, such as PFRA2, PARA2, PFRAhmean, and PARAhmean, were also relevant for N and BA. The inclusion of spatial pattern measures of pulse returns by height strata may improve the estimations of N and QMD.

In most studies, N estimates show the weakest correlation with LiDAR metrics (Næsset 2002; Goerndt et al. 2010; Hayashi et al. 2014). This may explain the lower Pearson coefficient and RMSE both for ABA and UK (Table 6). The lack of spatial autocorrelation may explain the poor performance of UK for in FMUs 1 and 3, where managed polewoods and regenerating stands predominate (Table 6). On the contrary, in FMU 4, a high mountain forest under natural disturbance regime, the variograms show a noticeable spatial autocorrelation of the analyzed variables beyond 2000 m, and the UK performs much better than the ABA. The bias problems with lognormal kriging have been widely treated in the literature (Journel 1980; Dowd 1982; Cressie and Pavlicová 2005; Cressie 2006; Yamamoto 2007). The RMSE based on cross-validation residuals reported includes the effect of bias, showing that this effect is compensated by an estimation variance reduction when there is spatial autocorrelation (BA in FMU 2). It should be noticed that bias is less than 2% in all the lognormal UK models indicating the suitability of Eq. 14 to bound bias for lognormal UK predictions in our dataset. Other approaches as proposed in Tolosana-Delgado and Pawlowsky-Glahn (2003) may overcome the problems linked to error sensitivity in the variance estimation of the adjustment used to preserve unbiasedness when the kriging prediction is transformed back to the original scale (Clark and Harper 2000).

In the UK models (as seen in Fig. 2), the variance increases with the distance between plots; the plots nearest to the prediction points have significant weight in the prediction (Oliver and Webster 2014). The effect of spatial autocorrelation of both field sampling data and LiDAR metrics over the UK estimators depends on the plot size, the cell size, and the distance between plots (Mauro et al. 2017). Therefore, increasing the number of sampling plots may improve the estimation of the coefficients of both ABA models and UK mean function, but for UK, there should be an additional uncertainty reduction because the kriging variance decrease as the number of nearby sampling plots within the range of spatial correlation increases (Mauro et al. 2017). Increasing the number of plots in those FMUs where the target variables show strong spatial autocorrelation, with a range proximal to the minimum distance between plots, would produce a greater reduction of the UK error. This increase in the number of plots shall be accompanied by a plot size reduction to avoid an inventory cost rise. Another option to reduce the distance between plots would be a two-stage sampling design (Mandallaz and Ronghua 1999) combining a denser network of terrain data points assessed through an inexpensive methodology, such as 2D Terrestrial Laser scanner (Ringdahl et al. 2013) or ForeStereo (Montes et al. 2019) and a second stage of sparser accurately measured sampling plots. It must be taken into account that point estimates are analyzed in our study, whereas operational forest inventories usually estimate variables at forest compartment or forest unit level. Block predictions (referred to UK predictions at forest compartment or forest unit level) should reduce substantially the estimation error concerning the point predictions, especially in small area units, due to the spatial correlation with nearby sampling plots (Lappi 2001). Similarly, using linear mixed-effects models with ABA may reduce the uncertainty of estimations for small-scale (5 to 50 ha) management unit (Mauro et al. 2017).

5 Conclusions

Universal kriging (UK) integrating field sampling and LiDAR data increased modeling accuracy with respect to the area-based approach (ABA) in terms of RMSE and Pearson coefficient for stem density (N), basal area (BA), and quadratic mean diameter (QMD) in Forest Management Units (FMUs) where these variables exhibited spatial autocorrelation (e.g., protection areas subjected to natural disturbance regime). In those cases, the UK approach reduced the estimation error up to 16%.

Lognormal universal kriging was superior to ABA in terms of RMSE and Pearson correlation between cross-validation estimations and observed values when modeling BA in FMU 1, where this variable shows spatial auto-correlation, but bias was, in general, higher with lognormal UK than with ABA.

In those FMU where regeneration areas and managed polewoods predominate, we generally found low autocorrelation and UK did not improve the ABA results.

Under the UK approach, an increment of the sampling intensity along with plot size reduction would increase cost-efficiency in those FMUs where the spatial autocorrelation range gets closer to the plot distance.

Data Availability

Data subject to third-party restrictions: the data that support the findings of this study are available from Spanish National Parks Autonomous Agency (OAPN) but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are however available from the authors upon reasonable request and with permission of Spanish National Parks Autonomous Agency (OAPN).

References

  • Adams T, Brack C, Farrier T et al (2011) So you want to use LiDAR? A guide on how to use LiDAR in forestry. New Zeal J For 55:19–23

    Google Scholar 

  • Ayrey E, Hayes DJ (2018) The use of three-dimensional convolutional neural networks to interpret LiDAR for forest inventory. Remote Sens 10:1–16. https://doi.org/10.3390/rs10040649

    Article  Google Scholar 

  • Bater CW, Wulder MA, Coops NC, Nelson RF, Hilker T, Nasset E (2011) Stability of sample-based scanning LiDAR-derived vegetation metrics for forest monitoring. IEEE Trans Geosci Remote Sens 49:2385–2392

    Article  Google Scholar 

  • Blomdahl EM, Thompson CM, Kane JR, Kane VR, Churchill D, Moskal LM, Lutz JA (2019) Forest structure predictive of fisher (Pekania pennanti) dens exists in recently burned forest in Yosemite, California, USA. For Ecol Manage 444:174–186. https://doi.org/10.1016/j.foreco.2019.04.024

    Article  Google Scholar 

  • Bouvier M, Durrieu S, Fournier RA, Renaud JP (2015) Generalizing predictive models of forest inventory attributes using an area-based approach with airborne LiDAR data. Remote Sens Environ 156:322–334. https://doi.org/10.1016/j.rse.2014.10.004

    Article  Google Scholar 

  • Chambers JM (1992) Linear models. In: Chambers JM, Hastie TJ (eds) Statistical Models in S. Wadsworth & Brooks/Cole

  • Clark I, Harper WV (2000) Practical geostatistics 2000. Ecosse North America Llc, Columbus Ohio (USA)

    Google Scholar 

  • Cressie N (1993) Statistics for spatial data, revised edn. A Wiley-Interscience Publication

  • Cressie N (2006) Block kriging for lognormal spatial processes. Math Geol 38:413–443. https://doi.org/10.1007/s11004-005-9022-8

    Article  Google Scholar 

  • Cressie N, Pavlicová M (2005) Lognormal kriging: bias adjustment and kriging variances. 1027–1036. https://doi.org/10.1007/978-1-4020-3610-1_107

  • Donés J, Garrido M (2001) Daños por temporales en el monte pinar de Valsaín. Datos históricos y problemas generados por el temporal de enero de 1996. In: II Congreso Forestal Espanol, Granada. pp 315–320

  • Dowd PA (1982) Lognormal kriging-the general case. Math Geol 14:475–499

    Article  Google Scholar 

  • Frazer GW, Magnussen S, Wulder MA, Niemann KO (2011) Simulated impact of sample plot size and co-registration error on the accuracy and uncertainty of LiDAR-derived estimates of forest stand biomass. Remote Sens Environ 115:636–649. https://doi.org/10.1016/j.rse.2010.10.008

    Article  Google Scholar 

  • Goerndt ME, Monleon VJ, Temesgen H (2010) Relating forest attributes with area- and tree-based light detection and ranging metrics for western Oregon. West J Appl For 25:105–111

    Article  Google Scholar 

  • Hall SA, Burke IC, Box DO, Kaufmann MR, Stoker JM (2005) Estimating stand structure using discrete-return lidar: an example from low density, fire prone ponderosa pine forests. For Ecol Manage 208:189–209. https://doi.org/10.1016/j.foreco.2004.12.001

    Article  Google Scholar 

  • Harikumar A, Bovolo F, Bruzzone L (2017) An internal crown geometric model for conifer species classification with high-density lidar data. IEEE Trans Geosci Remote Sens 55:2924–2940

    Article  Google Scholar 

  • Harville DA (1974) Bayesian inference for variance components using only error contrasts. Biometrika 61:383–385. https://doi.org/10.1093/biomet/61.2.383

    Article  Google Scholar 

  • Hawbaker TJ, Gobakken T, Lesak A et al (2010) Light detection and ranging-based measures of mixed hardwood forest structure. For Sci 56:313–326

    Google Scholar 

  • Hayashi R, Weiskittel A, Sader S (2014) Assessing the feasibility of low-density LiDAR for stand inventory attribute predictions in complex and managed forests of Northern Maine, USA. Forests 5:363–383. https://doi.org/10.3390/f5020363

    Article  Google Scholar 

  • Jakubowski MK, Li W, Guo Q, Kelly M (2013) Delineating individual trees from lidar data: a comparison of vector- and raster-based segmentation approaches. Remote Sens 5:4163–4186. https://doi.org/10.3390/rs5094163

    Article  Google Scholar 

  • Journel AG (1980) The lognormal approach to predicting local distributions of selective mining unit grades. Math Geol 12:285–303

    Article  Google Scholar 

  • Köhl M, Magnussen S, Marchetti M (2006) Sampling methods, remote sensing and GIS multiresource forest inventory. Springer Berlin Heidelberg, Berlin

    Book  Google Scholar 

  • Lappi L (2001) Forest inventory of small areas combining the calibration estimator and a spatial model. Can J For Res 31:1551–1560

    Article  Google Scholar 

  • Lefsky MA, Harding D, Cohen WB, Parker G, Shugart HH (1999) Surface lidar remote sensing of basal area and biomass in deciduous forests of eastern Maryland, USA. Remote Sens Environ 67:83–98. https://doi.org/10.1016/S0034-4257(98)00071-6

    Article  Google Scholar 

  • Lefsky M, Hudak AT, Cohen WB, Acker SA (2005) Patterns of covariance between forest stand and canopy structure in the Pacific Northwest. Remote Sens Environ 95:517–531

    Article  Google Scholar 

  • Legendre P (1993) Spatial autocorrelation: trouble or new paradigm? Ecology 74:1659–1673. https://doi.org/10.2307/1939924

    Article  Google Scholar 

  • Lin Y, Hyyppä J (2016) A comprehensive but efficient framework of proposing and validating feature parameters from airborne LiDAR data for tree species classification. Int J Appl Earth Obs Geoinf 46:45–55

    Article  Google Scholar 

  • Lindberg E, Holmgren J, Olofsson K, Olsson H (2012) Estimation of stem attributes using a combination of terrestrial and airborne laser scanning. Eur J For Res 131:1917–1931. https://doi.org/10.1007/s10342-012-0642-5

    Article  Google Scholar 

  • Magnussen S, Næsset E, Gobakken T, Frazer G (2012) A fine-scale model for area-based predictions of tree-size-related attributes derived from LiDAR canopy heights. Scand J For Res 27:312–322. https://doi.org/10.1080/02827581.2011.624116

    Article  Google Scholar 

  • Mandallaz D (2000) Estimation of the spatial covariance in Universal Kriging: application to forest inventory. Environ Ecol Stat 7:263–284. https://doi.org/10.1023/A:1009619117138

    Article  Google Scholar 

  • Mandallaz D, Ronghua Y (1999) Forest inventory with optimal two-phase, two-stage sampling schemes based on the anticipated variance. Can J For Res 29:1691–1708. https://doi.org/10.1139/x99-124

  • Mardia KV, Marshall RJ (1984) Maximum likelihood estimation of models for residual covariance in spatial regression. Biometrika 71:135–146. https://doi.org/10.1093/biomet/71.1.135

    Article  Google Scholar 

  • Marino E, Montes F, Tomé JL, Navarro JA, Hernando C (2018) Vertical forest structure analysis for wildfire prevention: comparing airborne laser scanning data and stereoscopic hemispherical images. Int J Appl Earth Obs Geoinf 73:438–449. https://doi.org/10.1016/j.jag.2018.07.015

    Article  Google Scholar 

  • Matheron G (1971) The theory of regionalized variables and its applications. Ecole des Mines de Paris

  • Mauro F, Monleon VJ, Temesgen H, Ruiz LA (2017) Analysis of spatial correlation in predictive models of forest variables that use LiDAR auxiliary information. Can J For Res 47:788–799. https://doi.org/10.1139/cjfr-2016-0296

    Article  Google Scholar 

  • McGaughey RJ (2018) FUSION/LDV: software for LIDAR data analysis and visualization. 209

  • Means J, Acker S, Fitt B et al (2000) Predicting forest stand characteristics with airborne scanning lidar. Photogramm Eng Remote Sens 66:1367–1371. https://doi.org/10.1016/S0034-4257(01)00290-5

    Article  Google Scholar 

  • Montes F, Hernández MJ, Cañellas I (2005) A geostatistical approach to cork production sampling estimation in Quercus suber forests. Can J For Res 35:2787–2796. https://doi.org/10.1139/x05-197

    Article  Google Scholar 

  • Montes F, Ledo A (2010) Incorporating environmental and geographical information in forest data analysis: a new fitting approach for universal kriging. Can J For Res 40:1852–1861. https://doi.org/10.1139/X10-131

    Article  Google Scholar 

  • Montes F, Rubio-Cuadrado Á, Sánchez-González M d l O et al (2019) Occlusion probability in operational forest inventory field sampling with ForeStereo. Photogramm Eng Remote Sensing 85:493–508. https://doi.org/10.14358/PERS.85.7.493

    Article  Google Scholar 

  • Moreno-Fernández D, Díaz-Pinés E, Barbeito I, Sánchez-González M, Montes F, Rubio A, Cañellas I (2015) Temporal carbon dynamics over the rotation period of two alternative management systems in Mediterranean mountain Scots pine forests. For Ecol Manage 348:186–195. https://doi.org/10.1016/j.foreco.2015.03.043

    Article  Google Scholar 

  • Næsset E (2002) Predicting forest stand characteristics with airborne scanning laser using a practical two-stage procedure and field data. Remote Sens Environ 80:88–99. https://doi.org/10.1016/S0034-4257(01)00290-5

    Article  Google Scholar 

  • Næsset E (2004) Practical large-scale forest stand inventory using a small-footprint airborne scanning laser. Scand J For Res 19:164–179. https://doi.org/10.1080/02827580310019257

    Article  Google Scholar 

  • Nussbaum M, Papritz A, Baltensweiler A, Walthert L (2014) Estimating soil organic carbon stocks of Swiss forest soils by robust external-drift kriging. Geosci Model Dev 7:1197–1210

    Article  Google Scholar 

  • Oliver MA, Webster R (2014) A tutorial guide to geostatistics: computing and modelling variograms and kriging. Catena 113:56–69. https://doi.org/10.1016/j.catena.2013.09.006

    Article  Google Scholar 

  • Palace MW, Sullivan FB, Ducey MJ et al (2015) Estimating forest structure in a tropical forest using field measurements, a synthetic model and discrete return lidar data. For Ecol Manage 161:1–11

    Google Scholar 

  • R Core Team (2019) R: A Language and Environment for Statistical Computing

  • Reutebuch SE, Andersen H, Mcgaughey RJ (2005) Light detection and ranging (LIDAR): an emerging tool for multiple resource inventory. J For 27:286–292. https://doi.org/10.1080/01431160500396493

    Article  Google Scholar 

  • Ringdahl O, Hohnloser P, Hellström T, Holmgren J, Lindroos O (2013) Enhanced algorithms for estimating tree trunk diameter using 2D laser scanner. Remote Sens 5:4839–4856

    Article  Google Scholar 

  • Scott CT, Gove JH (2002) Forest inventory. Encycl. Environmetrics 2:814–820

    Google Scholar 

  • Silva CA, Klauberg C, Hudak AT et al (2017) Modeling and mapping basal area of Pinus taeda L. plantation using airborne LiDAR data. An Acad Bras Cienc 89:1895–1905. https://doi.org/10.1590/0001-3765201720160324

    Article  PubMed  Google Scholar 

  • Silva CA, Klauberg K, Hudak AT et al (2018) Estimating stand height and tree density in Pinus taeda plantations using in-situ data, airborne LiDAR and k-Nearest Neighbor Imputation. An Acad Bras Cienc 90:295–309

    Article  Google Scholar 

  • Terrasolid (2021a) TerraMatch: calibration and strip adjustment. https://terrasolid.com/products/terramatch/

  • Terrasolid (2021b) TerraScan: managing and processing point clouds. https://terrasolid.com/products/terrascan/

  • Tolosana-Delgado R, Pawlowsky-Glahn V (2003) Kriging coordinates: what does that mean? In: CODAWORK’03. Girona: La Universitat. http://hdl.handle.net/10256/676

  • Tomppo E, Halme M (2004) Using coarse scale forest variables as ancillary information and weighting of variables in k-NN estimation: a genetic algorithm approach. Remote Sens Environ 92:1–20. https://doi.org/10.1016/j.rse.2004.04.003

    Article  Google Scholar 

  • Tomppo E, Olsson H, Ståhl G, Nilsson M, Hagner O, Katila M (2008) Combining national forest inventory field plots and remote sensing data for forest databases. Remote Sens Environ 112:1982–1999. https://doi.org/10.1016/j.rse.2007.03.032

    Article  Google Scholar 

  • Venables WN, Ripley BD (2002) Modern applied statistics with S, 4th edn. Springer-Verlag, New York

    Book  Google Scholar 

  • White JC, Wulder MA, Varhola A, et al (2013) A best practices guide for generating forest inventory attributes from airborne laser scanning data using an area-based approach

  • White JC, Wulder MA, Vastaranta M, Coops N, Pitt D, Woods M (2013) The utility of image-based point clouds for forest inventory: a comparison with airborne laser scanning. Forests 4:518–536. https://doi.org/10.3390/f4030518

    Article  Google Scholar 

  • Wilkinson G, Rogers C (1973) Symbolic descriptions of factorial models for analysis of variance. Appl Stat 22:392–399

    Article  Google Scholar 

  • Woods M, Pitt D, Penner M, Lim K, Nesbitt D, Etheridge D, Treitz P (2011) Operational implementation of a LiDAR inventory in boreal Ontario. For Chron 87:512–528

    Article  Google Scholar 

  • Wulder MA, White JC, Nelson RF, Næsset E, Ørka HO, Coops NC, Hilker T, Bater CW, Gobakken T (2012) Lidar sampling for large-area forest characterization: a review. Remote Sens Environ 121:196–209. https://doi.org/10.1016/j.rse.2012.02.001

    Article  Google Scholar 

  • Yamamoto JK (2007) On unbiased backtransform of lognormal kriging estimates. Comput Geosci 11:219–234. https://doi.org/10.1007/s10596-007-9046-x

    Article  Google Scholar 

Download references

Acknowledgements

The authors wish to thank Javier Donés and the staff of National Parks Autonomous Agency (OAPN) for their support in the fieldwork, and also to José Luis Tomé, AGRESTA, for his support in data analysis.

Funding

This work was supported by the Spanish Ministry of Economy and Business (formerly Ministry of Economy, Industry, and Competitiveness) through the FPI program (BES-2017-081606), and through the AGL2016-76769-C2-1-R project.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Isabel Aulló-Maestro.

Ethics declarations

Consent for publication

All authors gave their informed consent to this publication and its content.

Conflict of interest

The authors declare no competing interests.

Additional information

Handling Editor: Jean-Michel Leban

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contribution of the co-authors

Conceptualization: Isabel Aulló-Maestro, Fernando Montes; methodology: Isabel Aulló-Maestro, Fernando Montes; software: Isabel Aulló-Maestro, Fernando Montes; validation: Isabel Aulló-Maestro, Fernando Montes; formal analysis: Isabel Aulló-Maestro; investigation: Isabel Aulló-Maestro; resources: Eva Marino, Miguel Cabrera ; data curation: Isabel Aulló-Maestro; writing—original draft: Isabel Aulló-Maestro, Fernando Montes; writing—review and editing: Cristina Gómez, Antonio Vázquez De La Cueva, Eva Marino, Miguel Cabrera; visualization: Isabel Aulló-Maestro, Cristina Gómez, Fernando Montes; supervision: Cristina Gómez, Antonio Vázquez De La Cueva, Fernando Montes; project administration: Fernando Montes; funding acquisition: Fernando Montes, Antonio Vázquez De La Cueva

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Aulló-Maestro, I., Gómez, C., Marino, E. et al. Integration of field sampling and LiDAR data in forest inventories: comparison of area-based approach and (lognormal) universal kriging. Annals of Forest Science 78, 39 (2021). https://doi.org/10.1007/s13595-021-01056-1

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s13595-021-01056-1

Keywords