  • Original Paper

Propagating uncertainty through individual tree volume model predictions to large-area volume estimates

Abstract

Key message

The effects on large-area volume estimates of uncertainty in individual tree volume model predictions were negligible when using simple random sampling estimators for large-area estimation, but non-negligible when using stratified estimators which reduced the effects of sampling variability.

Context

Forest inventory estimates of tree volume for large areas are typically calculated by adding model predictions of volumes for individual trees at the plot level and calculating the per unit area mean over plots. The uncertainty in the model predictions is generally ignored with the result that the precision of the large-area volume estimate is optimistic.

Aims

The primary objective was to estimate the effects on large-area volume estimates of volume model prediction uncertainty due to diameter and height measurement error, parameter uncertainty, and model residual variance.

Methods

Monte Carlo simulation approaches were used because of the complexities associated with multiple sources of uncertainty, the non-linear nature of the models, and heteroskedasticity.

Results

The effects of model prediction uncertainty on large-area estimates of growing stock volume were negligible when using simple random sampling estimators. However, with stratified estimators that reduce the effects of sampling variability, the effects of model prediction uncertainty were not necessarily negligible. The adverse effects of parameter uncertainty and residual variance were greater than the effects of diameter and height measurement errors.

Conclusion

The uncertainty of large-area volume estimates that do not account for model prediction uncertainty should be regarded with caution.

1 Introduction

Forest inventory and monitoring programs predict the volume, biomass, or carbon content of individual trees using statistical models based on observations and measurements of tree attributes such as species, diameter, and height. The individual tree model predictions are aggregated at plot level, and plot-level estimates are averaged over plots to produce large-area estimates, often expressed on a per unit area basis. The uncertainty in the model predictions is routinely ignored with the result that precision estimates are erroneously optimistic.

Uncertainty in model predictions can be attributed to four primary sources: (1) model misspecification, (2) uncertainty in values of the independent variables, (3) uncertainty in the model parameter estimates, and (4) residual variability around model predictions. Model misspecification is due to the lack of appropriate model calibration data and/or the lack of modeling expertise. The effects of uncertainty from this source are not considered for this study because model misspecification is typically not a problem for allometric volume models for which reported \( R^2 \) and pseudo-\( R^2 \) values are usually greater than 0.85 (e.g., Brown et al. 1989; Chave et al. 2005; McRoberts and Westfall 2014; McRoberts et al. 2014b). The effects of uncertainty in values of the predictor variables have been studied extensively (Gertner and Dzialowy 1984; Gertner 1990; Gertner and Köhl 1992; McRoberts et al. 1994; McRoberts 1996; Kangas 1996; Westfall and Patterson 2007; Berger et al. 2014; Qi et al. 2015), but few generalizations are possible. Uncertainty in model parameter estimates is often expressed by the model parameter covariance matrix, and for a correctly specified model, residual variability is typically summarized by the root mean square error. McRoberts and Westfall (2014) review approaches that have been used to estimate the effects of uncertainty in model predictions on large-area estimates including sampling theory (Cunia 1965, 1987; Ståhl et al. 2014; Qi et al. 2015), Taylor series approximations (Gertner 1990; Berger et al. 2014), and Monte Carlo simulations (Gertner and Dzialowy 1984; Gertner 1987; McRoberts 1996; McRoberts and Westfall 2014; Breidenbach et al. 2014).

The overall study objective was to assess the effects of uncertainty in allometric model predictions of individual tree volumes on large-area, sample-based estimates of mean volume per unit area. The technical objective was to estimate the particular effects of diameter and height measurement error, parameter uncertainty, and model residual variance for a single, non-specific allometric volume model. A Monte Carlo approach was used with a model that was constructed specifically for this study so that parameter uncertainty and model residual variance could be rigorously quantified.

2 Material and Methods

2.1 Study area

The study area was Minnesota Survey Unit 1 of the Forest Inventory and Analysis (FIA) program of the Northern Research Station, US Forest Service (Fig. 1). The study area includes approximately 33,353 km² (12,877 mi²) and consists of forest land dominated by aspen-birch and spruce-fir associations, agricultural land, wetlands, and water.

Fig. 1 Minnesota Survey Unit 1 (black), area of Northern Research Station inventory responsibility (gray), and geographic source of model calibration data (Minnesota, Wisconsin, Michigan)

2.2 Data

The FIA program conducts the National Forest Inventory (NFI) of the United States of America (USA) and has established field plot centers in permanent locations using a quasi-systematic sampling design that is regarded as producing an equal probability sample (McRoberts et al. 2010). Field crews observe species and measure diameter at breast height (dbh) (1.37 m, 4.5 ft) and height (ht) for all trees with dbh of at least 12.7 cm (5 in.). Volumes (V) for individual trees are predicted using statistical models, aggregated at plot level, expressed as volume per unit area, and typically considered to be observations without error. For this study, data were used for 2178 FIA plots on forest land with 50,176 trees representing 38 species. McRoberts and Westfall (2014, Table 1) describe these data in greater detail. For future reference, these data are characterized as the estimation dataset.

Table 1 Effects on large-area volume estimates of model prediction uncertainty due to uncertainty from underlying sources

The data used to calibrate the allometric volume model were originally acquired for a taper model study (Westfall and Scott 2010) encompassing 24 northeastern states of the USA (Fig. 1). For the current study, the geographic source of the calibration data was restricted to the States of Michigan, Minnesota, and Wisconsin which span the ecological province that includes the study area. For the 2398 individual trees in the dataset, diameter measurements were obtained using a Barr and Stroud dendrometer at heights of 0.3, 0.6, 0.9, 1.4, and 1.8 m and at approximately 2.5-cm diameter taper intervals up to total tree height. Volumes of sections between height measurements were calculated using Smalian’s formula (Avery and Burkhart 2002, p. 101) as the product of mean cross-sectional area and section length, and total stem volumes for individual trees were calculated by adding volumes for all sections. McRoberts and Westfall (2014) describe the sampling procedure and protocols for individual tree measurements in detail. For future reference, these data are characterized as the calibration dataset.

2.3 Volume models

An allometric model of the relationship between V as the response variable and dbh and ht as the predictor variables was formulated as

$$ {V}_i={\beta}_0\times {\mathrm{dbh}}_i^{\beta_1}\times {\mathrm{ht}}_i^{\beta_2}+{\varepsilon}_i, $$
(1)

where i indexes individual trees, \( {\varepsilon}_i \) is a random residual, and the βs are parameters to be estimated. McRoberts and Westfall (2014) documented the global popularity of this model form. They also demonstrated that a set of species-specific models and a non-specific model were nearly indistinguishable with respect to estimates of large-area volume means and their standard errors (SE); McRoberts et al. (2014b) confirmed this result for a sub-tropical Brazilian dataset. Further, the boundaries for Minnesota Survey Unit 1, which are also the boundaries of the study area, were specifically selected to ensure a large degree of within-unit homogeneity with respect to climate, topography, and forest types. Therefore, only a single, non-specific model was considered for this study.

Before model fitting, natural logarithmic (ln) transformations of the response and predictor variables were calculated, and the model was reformulated as

$$ \ln \left({V}_i\right)={\alpha}_0+{\alpha}_1\times \ln \left({\mathrm{dbh}}_i\right)+{\alpha}_2\times \ln \left({\mathrm{ht}}_i\right)+{\varepsilon}_i, $$
(2)

where the αs are parameters to be estimated, and the \( {\varepsilon}_i \) of Eq. (2) are not necessarily the same \( {\varepsilon}_i \) as for Eq. (1). The advantages of the transformations were that the model could be expressed in linear form which facilitates estimation of the parameters, and heteroskedasticity was removed, thereby eliminating the necessity of weighted regressions. On the original scale, predictions were calculated as

$$ {\widehat{V}}_i= \exp \left[{\widehat{\alpha}}_0+{\widehat{\alpha}}_1\times \ln \left({\mathrm{dbh}}_i\right)+{\widehat{\alpha}}_2\times \ln \left({\mathrm{ht}}_i\right)+\frac{{\widehat{\sigma}}_{\varepsilon}^2}{2}\right], $$
(3)

where \( {\widehat{\sigma}}_{\varepsilon } \) is the residual standard deviation on the ln-ln scale, and the term \( \frac{{\widehat{\sigma}}_{\varepsilon}^2}{2} \), half the residual variance, compensates for bias that accrues when transforming from the ln-ln scale back to the original scale (Baskerville 1972).
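
As an illustration of the back-transformation in Eq. (3), the following minimal Python sketch computes an original-scale prediction from ln-ln parameter estimates. The function name and argument names are hypothetical, and no parameter values from this study are implied. The correction term is written as half the residual variance on the ln-ln scale, the usual lognormal form associated with Baskerville (1972).

```python
import math


def predict_volume(dbh, ht, a0, a1, a2, sigma_ln):
    """Back-transform a ln-ln prediction to the original scale.

    a0, a1, a2 are ln-ln parameter estimates and sigma_ln is the residual
    standard deviation on the ln-ln scale (all hypothetical inputs).
    """
    ln_v = a0 + a1 * math.log(dbh) + a2 * math.log(ht)
    # Half the ln-ln residual variance offsets the back-transformation bias.
    return math.exp(ln_v + sigma_ln ** 2 / 2.0)
```

With `sigma_ln = 0` the function reduces to the uncorrected allometric form of Eq. (1); a positive `sigma_ln` inflates every prediction by the same factor, `exp(sigma_ln**2 / 2)`.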

The quality of model fit on both the ln-ln and original scales was assessed in terms of the proportion of the variability explained by the model. On the ln-ln scale, the proportion was calculated and denoted \( R^2 \). The predictions were also transformed from the ln-ln scale back to the original scale using Eq. (3), and the proportion of variability explained by the model was denoted pseudo-\( R^2 \) because the assumptions underlying \( R^2 \) are not completely satisfied when using non-linear models (Anderson-Sprecher 1994).

2.4 Estimators

The simplest approach for estimating large-area parameters is to use the familiar simple random sampling (SRS) estimators,

$$ {\widehat{\mu}}_{\mathrm{SRS}}=\frac{1}{n}{\displaystyle \sum_{j=1}^n{y}_j} $$
(4a)

and

$$ \mathrm{V}\widehat{\mathrm{a}}\mathrm{r}\left({\widehat{\mu}}_{\mathrm{SRS}}\right)=\frac{{\displaystyle \sum_{j=1}^n{\left({y}_j-{\widehat{\mu}}_{\mathrm{SRS}}\right)}^2}}{n\left(n-1\right)}, $$
(4b)

where n is the total sample size and \( {y}_j \) is the observation for the jth plot selected for the sample. The primary advantages of the SRS estimators are that they are intuitive, simple, and unbiased when used with an SRS design; the disadvantage is that variances are frequently large, particularly for highly variable populations and/or small sample sizes. Although \( \mathrm{V}\widehat{\mathrm{a}}\mathrm{r}\left({\widehat{\mu}}_{\mathrm{SRS}}\right) \) from Eq. (4b) may be biased when used with systematic sampling, it is usually conservative in the sense that it overestimates the variance (Särndal et al. 1992, p. 83). For this study, the finite population correction factor was ignored because of the small sampling intensity of approximately one 670-m² plot per 1200 ha of study area.
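
The SRS estimators of Eqs. (4a) and (4b) can be sketched in a few lines of Python. The function name `srs_estimates` is hypothetical, and the input is assumed to be a list of plot-level volumes per unit area; the finite population correction is omitted, as in the text.

```python
def srs_estimates(y):
    """Return the SRS mean (Eq. 4a) and the variance of the mean (Eq. 4b)."""
    n = len(y)
    mean = sum(y) / n
    # Sum of squared deviations divided by n(n - 1); no finite population correction.
    var_mean = sum((v - mean) ** 2 for v in y) / (n * (n - 1))
    return mean, var_mean
```

For example, `srs_estimates([1, 2, 3, 4, 5])` gives a mean of 3.0 and a variance of the mean of 0.5.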

Because the uncertainty in volume model predictions is independent of the particular large-area estimators, the relative effects of model prediction uncertainty will be greater for estimators that reduce the effects of population variability than for the SRS estimators. Multiple regional FIA programs use post-stratified estimators to reduce variances of estimates with strata based on satellite spectral data. McRoberts et al. (2012) showed that stratifications derived from lidar-based maps of growing stock volume reduced the variance of mean volume per unit area by factors as great as 3.5 relative to the SRS estimators. Therefore, for this study, the effects of volume model uncertainty on large-area estimates of mean volume per unit area were also assessed using post-stratified estimation.

Post-stratified (STR) estimates of means and variances are calculated using estimators provided by Cochran (1977, pp. 134–135),

$$ {\widehat{\mu}}_{\mathrm{STR}}={\displaystyle \sum_{h=1}^H{w}_h{\widehat{\mu}}_h} $$
(5a)

and

$$ \mathrm{V}\widehat{\mathrm{a}}\mathrm{r}\left({\widehat{\mu}}_{\mathrm{STR}}\right)={\displaystyle \sum_{h=1}^H\left[{w}_h\cdot \frac{{\widehat{\sigma}}_h^2}{n}+\left(1-{w}_h\right)\frac{{\widehat{\sigma}}_h^2}{n^2}\right]}, $$
(5b)

where

$$ \begin{array}{c}\hfill {\widehat{\mu}}_h=\frac{1}{n_h}{\displaystyle \sum_{j=1}^{n_h}{y}_{hj}},\hfill \\ {}\hfill {\widehat{\sigma}}_h^2=\frac{1}{n_h-1}{\displaystyle \sum_{j=1}^{n_h}{\left({y}_{hj}-{\widehat{\mu}}_h\right)}^2},\hfill \end{array} $$

n is the total sample size; h = 1,…, H denotes strata; and for the hth stratum, \( {y}_{hj} \) is the jth sample observation, \( {w}_h \) is the weight calculated as the proportional area of the stratum, \( {n}_h \) is the sample size, and \( {\widehat{\mu}}_h \) and \( {\widehat{\sigma}}_h^2 \) are the sample estimates of the mean and variance, respectively.

Because lidar data were not available for the entire study area, stratifications were simulated by ordering the predicted plot volumes from smallest to largest and dividing the range into intervals with approximately the same number of plots per interval. These intervals simulated strata for which strata weights were estimated as the proportions of plots in the strata. The consequences on estimates of using these simulated strata rather than strata based on an actual lidar-based volume map were twofold. First, when using the same data, the SRS and STR estimates of the study area mean will be exactly the same, regardless of the number of strata used. Operationally, however, differences between stratum weights and proportions of plots per stratum cause SRS and STR estimates of means to differ. Second, the simulated approach assumes that each plot is assigned to the correct stratum, whereas operationally map prediction and geo-location errors cause some plots to be assigned to incorrect strata. These incorrect assignments do not induce bias into the estimators, but they increase the within-stratum variances and the overall variance. The overall result is that the relative effects of volume model prediction uncertainty for this study will be slightly overestimated for the simulated stratifications.
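
A minimal Python sketch of the simulated stratification and the post-stratified estimators of Eqs. (5a) and (5b) follows. The function names are hypothetical, and the rule used to cut the ordered plots into groups of nearly equal size is an assumption because the text does not specify one. Stratum weights are the proportions of plots per stratum, as described above, and each stratum is assumed to contain at least two plots.

```python
def simulate_strata(y, n_strata):
    """Order plot values and split them into n_strata groups of near-equal size."""
    ordered = sorted(y)
    n = len(ordered)
    bounds = [round(h * n / n_strata) for h in range(n_strata + 1)]
    return [ordered[bounds[h]:bounds[h + 1]] for h in range(n_strata)]


def str_estimates(strata):
    """Return the post-stratified mean (Eq. 5a) and its variance (Eq. 5b)."""
    n = sum(len(s) for s in strata)
    mean = 0.0
    var = 0.0
    for s in strata:
        n_h = len(s)
        w_h = n_h / n  # weight = proportion of plots in the stratum
        mu_h = sum(s) / n_h
        s2_h = sum((v - mu_h) ** 2 for v in s) / (n_h - 1)
        mean += w_h * mu_h
        var += w_h * s2_h / n + (1.0 - w_h) * s2_h / n ** 2
    return mean, var
```

Because the weights are plot proportions, the stratified mean returned here equals the simple mean of the plot values, which matches the first consequence noted in the text; only the variance changes with the number of strata.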

Cochran (1977, pp. 132–134) suggests that more than six strata are usually not useful; McRoberts et al. (2012) reported that little was gained when using more than six lidar-based strata; and the FIA program uses five spectral-based strata in the study area. For this study, stratifications based on 2, 4, and 8 strata were considered as representative of the range of possibilities.

2.5 Simulating uncertainty

The study focused on the effects on estimates of large-area mean volume per unit area of model prediction uncertainty arising from four sources: dbh measurement error, height measurement error, parameter uncertainty, and model residual variance. The tolerance for dbh measurement errors specified by the FIA protocols is that 95 % of measurements are to be within 0.5 % of the true dbh (U.S. Forest Service 2012). Assuming that dbh measurement errors follow a Gaussian distribution with mean 0, the standard deviation of the distribution is \( {\sigma}_{\varepsilon}^{\mathrm{dbh}}=\frac{0.005\times \mathrm{d}\mathrm{b}\mathrm{h}}{1.96}\approx 0.00255\times \mathrm{d}\mathrm{b}\mathrm{h} \). The tolerance for height measurement errors specified by the FIA protocols is that 90 % of measurements are to be within 10 % of the true height (U.S. Forest Service 2012). Assuming that the height measurement errors follow a Gaussian distribution with mean 0, the standard deviation of the distribution is \( {\sigma}_{\varepsilon}^{\mathrm{ht}}=\frac{0.10\times \mathrm{h}\mathrm{t}}{1.645}\approx 0.06079\times \mathrm{h}\mathrm{t} \).
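
The two standard deviations follow directly from the stated tolerances, as the short Python sketch below shows. The function names are hypothetical; the constants 1.96 and 1.645 are the two-sided Gaussian quantiles for 95 % and 90 % coverage used in the text.

```python
def dbh_error_sd(dbh):
    """SD of dbh measurement error: 95 % of errors within 0.5 % of dbh."""
    return 0.005 * dbh / 1.96


def ht_error_sd(ht):
    """SD of height measurement error: 90 % of errors within 10 % of height."""
    return 0.10 * ht / 1.645
```

For a tree with dbh of 20 cm and height of 15 m, these functions give standard deviations of roughly 0.05 cm and 0.91 m, respectively.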

Uncertainty in the linear model parameter estimates on the ln-ln scale was assessed using a 3-step Monte Carlo approach: (i) the transformed calibration dataset was aggregated into 10 dbh size classes, each with approximately the same number of observations; (ii) each dbh size class was resampled with replacement until the original class sample size was achieved; and (iii) the model was fit to the resampled data and the parameters were estimated. Steps (i)–(iii) were then replicated until the means and standard deviations of the distributions of parameter estimates stabilized. The resulting multiparameter distribution of parameter estimates represented the uncertainty in estimates of the linear model parameters on the ln-ln scale.
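
The resampling procedure can be sketched in Python as follows. This is an illustration under stated assumptions, not the code used for the study: the function names, the fixed number of replications `n_boot`, the seed, and the rule for forming size classes are all hypothetical, and the model of Eq. (2) is fit by solving the ordinary least squares normal equations directly. Each input row is assumed to hold (ln dbh, ln ht, ln V) for one calibration tree.

```python
import random


def fit_ols(rows):
    """Least squares fit of ln(V) = a0 + a1*ln(dbh) + a2*ln(ht)."""
    X = [(1.0, r[0], r[1]) for r in rows]
    y = [r[2] for r in rows]
    # Augmented normal-equation matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(3)]
         + [sum(x[i] * v for x, v in zip(X, y))]
         for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(3):
        p = max(range(c, 3), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(3):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][3] / A[i][i] for i in range(3)]


def bootstrap_parameters(rows, n_classes=10, n_boot=200, seed=1):
    """Resample within dbh size classes and refit the model n_boot times."""
    rng = random.Random(seed)
    ordered = sorted(rows, key=lambda r: r[0])  # order by ln(dbh)
    n = len(ordered)
    bounds = [round(c * n / n_classes) for c in range(n_classes + 1)]
    classes = [ordered[bounds[c]:bounds[c + 1]] for c in range(n_classes)]
    draws = []
    for _ in range(n_boot):
        sample = []
        for cls in classes:
            # Resample with replacement up to the original class size.
            sample.extend(rng.choices(cls, k=len(cls)))
        draws.append(fit_ols(sample))
    return draws
```

The list returned by `bootstrap_parameters` plays the role of the multiparameter distribution described above. In practice the number of replications would be increased until the means and standard deviations of the parameter estimates stabilize, rather than fixed in advance.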

Residual uncertainty was assessed on the original scale where the models were applied using a 4-step procedure that accommodated heteroskedasticity: (i) the pairs \( \left({V}_i,{\widehat{V}}_i\right) \) were ordered with respect to the model prediction, \( {\widehat{V}}_i \); (ii) the pairs were aggregated into groups of size 25; (iii) within each group, g, the mean of the observations \( {\overline{V}}_g \), the mean of the predictions \( {\overline{\widehat{V}}}_g \), and the standard deviation \( {\widehat{\sigma}}_g \) of the residuals \( {\varepsilon}_i={V}_i\hbox{-} {\widehat{V}}_i \) were calculated; and (iv) the relationship between the group standard deviations, \( {\widehat{\sigma}}_g \), and the group prediction means, \( {\overline{\widehat{V}}}_g \), was represented using the model,

$$ {\widehat{\sigma}}_g={\gamma}_1\times {\overline{\widehat{V}}}_g^{\gamma_2}+{\varepsilon}_g, $$
(6)

where the γs are parameters to be estimated.
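
The grouping procedure and a fit of Eq. (6) can be sketched in Python as below. The function names are hypothetical. The text does not state how the γs were estimated, so this sketch substitutes a simple stand-in: ordinary least squares on the logarithms of the group standard deviations and group prediction means. That approach assumes all group standard deviations and prediction means are positive and will generally not reproduce a direct non-linear fit exactly.

```python
import math
import statistics


def group_residual_sds(obs, pred, group_size=25):
    """Return (mean prediction, SD of residuals) for groups ordered by prediction."""
    pairs = sorted(zip(pred, obs))
    out = []
    for i in range(0, len(pairs), group_size):
        g = pairs[i:i + group_size]
        if len(g) < 2:
            continue  # a standard deviation needs at least two residuals
        mean_pred = statistics.fmean([p for p, _ in g])
        sd_resid = statistics.stdev([o - p for p, o in g])
        out.append((mean_pred, sd_resid))
    return out


def fit_power_sd(groups):
    """Estimate gamma1 and gamma2 in sd = gamma1 * mean_pred**gamma2 (log-log fit)."""
    xs = [math.log(m) for m, _ in groups]
    ys = [math.log(s) for _, s in groups]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return math.exp(my - slope * mx), slope
```

The pair returned by `fit_power_sd` can then be used to predict a residual standard deviation for any predicted volume.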

2.6 Uncertainty in large-area volume estimates

A 6-step Monte Carlo simulation procedure was used to estimate the effects of model prediction uncertainty on the uncertainty of large-area estimates of mean volume per unit area.

  1. Step 1.

    For the kth replication, a set of model parameter estimates, \( {\widehat{\beta}}^k \), was randomly selected from the distribution constructed in “Section 2.5”.

  2. Step 2.

    For the ith tree on the jth plot in the estimation dataset, a random number, ε, was drawn from a Gaussian (0,1) distribution; if |ε| > 2.5, ε was redrawn. A dbh observation was then simulated as

    $$ {\mathrm{dbh}}_{ij}={\mathrm{dbh}}_{ij}^0+\varepsilon \times {\sigma}_{\varepsilon}^{\mathrm{dbh}}, $$

    where \( {\mathrm{dbh}}_{ij}^0 \) was the observation from the estimation dataset, and \( {\sigma}_{\varepsilon}^{\mathrm{dbh}} \) was as described in “Section 2.5.”

  3. Step 3.

    For the ith tree on the jth plot in the estimation dataset, a random number, ε, was drawn from a Gaussian (0,1) distribution; if |ε| > 2.5, ε was redrawn. A height observation was then simulated as

    $$ {\mathrm{ht}}_{ij}={\mathrm{ht}}_{ij}^0+\varepsilon \times {\sigma}_{\varepsilon}^{\mathrm{ht}}, $$

    where \( {\mathrm{ht}}_{ij}^0 \) was the observation from the estimation dataset, and \( {\sigma}_{\varepsilon}^{\mathrm{ht}} \) was as described in “Section 2.5.”

  4. Step 4.

    For the ith tree on the jth plot in the estimation dataset, an initial volume observation was calculated using the parameter values from step (1) and the simulated dbh and height observations from steps (2) and (3) as

    $$ {V}_{ij}^{k,0}={\widehat{\beta}}_0^k\times {\mathrm{dbh}}_{ij}^{{\widehat{\beta}}_1^k}\times {\mathrm{ht}}_{ij}^{{\widehat{\beta}}_2^k}. $$

    A random number, ε, was drawn from a Gaussian (0,1) distribution; if |ε| > 2.5, ε was redrawn. The residual standard deviation, \( {\widehat{\sigma}}_{ij} \), was then calculated using Eq. (6) with \( {V}_{ij}^{k,0} \) as the value of the predictor variable. The individual tree volume was then simulated as

    $$ {V}_{ij}^k={V}_{ij}^{k,0}+\varepsilon \times {\widehat{\sigma}}_{ij}. $$

  5. Step 5.

    The total volume for the jth plot in the estimation dataset was calculated as \( {V}_j^k={\displaystyle \sum_{i=1}^{n_j}{V}_{ij}^k} \) where \( {n}_j \) is the number of trees on the plot.

  6. Step 6.

    The overall study area mean, \( {\overline{V}}^k \), and variance of the mean, \( \mathrm{V}\widehat{\mathrm{a}}\mathrm{r}\left({\overline{V}}^k\right) \), for the kth replication were estimated using both the SRS and STR estimators as described in “Section 2.4.”
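
Steps 2 through 5 for a single plot and a single replication can be sketched in Python as follows. All function and argument names are hypothetical: `b0`, `b1`, and `b2` stand for the allometric parameter values drawn in Step 1, and `g1` and `g2` stand for the parameters of Eq. (6). Errors are treated as independent here, and the conversion of the plot total to a per unit area basis is left out.

```python
import random


def truncated_std_normal(rng, limit=2.5):
    """Draw from a Gaussian (0,1) distribution, redrawing if |e| exceeds limit."""
    while True:
        e = rng.gauss(0.0, 1.0)
        if abs(e) <= limit:
            return e


def simulate_plot_volume(trees, b0, b1, b2, g1, g2, rng):
    """Simulate the total volume of one plot; trees is a list of (dbh0, ht0)."""
    total = 0.0
    for dbh0, ht0 in trees:
        # Steps 2 and 3: perturb the recorded dbh and height.
        dbh = dbh0 + truncated_std_normal(rng) * 0.005 * dbh0 / 1.96
        ht = ht0 + truncated_std_normal(rng) * 0.10 * ht0 / 1.645
        # Step 4: initial prediction, then a heteroskedastic residual.
        v0 = b0 * dbh ** b1 * ht ** b2
        sigma = g1 * v0 ** g2
        total += v0 + truncated_std_normal(rng) * sigma
        # Step 5: the running sum over trees is the plot total.
    return total
```

Calling `simulate_plot_volume` once per plot, with a fresh parameter draw for each replication, yields the plot-level values that enter the estimators in Step 6.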

Steps (1)–(6) were replicated, and the mean and variance over replications were estimated as per Rubin (1987, pp. 76–77),

$$ \widehat{\mu}=\frac{1}{n_{\mathrm{rep}}}{\displaystyle \sum_{k=1}^{n_{\mathrm{rep}}}{\overline{V}}^k}, $$
(7)

and

$$ \mathrm{V}\widehat{\mathrm{a}}\mathrm{r}\left(\widehat{\mu}\right)=\left(1+\frac{1}{n_{\mathrm{rep}}}\right)\times {W}_1+{W}_2, $$
(8)

where \( {W}_1=\frac{1}{n_{\mathrm{rep}}-1}{\displaystyle \sum_{k=1}^{n_{\mathrm{rep}}}{\left({\overline{V}}^k-\widehat{\mu}\right)}^2} \) is the among-replications variance, \( {W}_2=\frac{1}{n_{\mathrm{rep}}}{\displaystyle \sum_{k=1}^{n_{\mathrm{rep}}}\mathrm{V}\widehat{\mathrm{a}}\mathrm{r}\left({\overline{V}}^k\right)} \) is the mean within-replication variance, and \( {n}_{\mathrm{rep}} \) is the number of replications. The replications continued until \( \widehat{\mu} \) and \( \mathrm{S}\mathrm{E}\left(\widehat{\mu}\right)=\sqrt{\mathrm{V}\widehat{\mathrm{a}}\mathrm{r}\left(\widehat{\mu}\right)} \) stabilized.
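
Equations (7) and (8) combine the replication results as in the following Python sketch. The function name is hypothetical; the inputs are assumed to be the per-replication estimates of the mean and of the variance of the mean from Step 6.

```python
import math


def combine_replications(means, variances):
    """Combine replication results using Eqs. (7) and (8)."""
    n_rep = len(means)
    mu = sum(means) / n_rep
    w1 = sum((m - mu) ** 2 for m in means) / (n_rep - 1)  # among replications
    w2 = sum(variances) / n_rep                           # mean within replication
    var = (1.0 + 1.0 / n_rep) * w1 + w2
    return mu, var, math.sqrt(var)
```

When model prediction uncertainty is absent, every replication returns the same mean, the among-replications term is zero, and the combined variance reduces to the ordinary sampling variance. The among-replications term is therefore the part of the variance attributable to model prediction uncertainty.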

In Steps (2) and (3), dbh and height measurement errors for the same tree were assumed to be independent, as were dbh errors for trees on the same plot. However, because of the greater difficulty in accurately measuring height and because plot canopy conditions tend to affect measurements of the heights of all trees on the same plot in a similar manner, height measurement errors for trees on the same plot may not be independent. Therefore, simulations were conducted separately for height correlations of ρ = 0.00 and ρ = 0.25. In Step (4), spatial correlations among residuals for trees on the same plot were ignored based on Berger et al. (2014), Breidenbach et al. (2014), and McRoberts and Westfall (2014) who all reported that the effects were negligible.
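
The text does not describe how correlated height errors were generated. One common construction, shown here purely as an assumption, adds a plot-level component shared by all trees on the plot to an independent tree-level component, with weights chosen so that each error has unit variance and any two errors on the same plot have correlation ρ. The function name is hypothetical, and the redraw rule for |ε| > 2.5 is not applied in this sketch.

```python
import math
import random


def correlated_plot_errors(n_trees, rho, rng):
    """Standardized errors for one plot with within-plot correlation rho (assumed scheme)."""
    shared = rng.gauss(0.0, 1.0)  # component common to all trees on the plot
    return [math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
            for _ in range(n_trees)]
```

Each returned value would be multiplied by the height error standard deviation of the corresponding tree before being added to the recorded height. Setting `rho = 0.0` recovers independent errors.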

Parameter uncertainty and residual variance are highly correlated because each necessarily affects the other. This phenomenon is easily understood by considering the parametric form of the parameter covariance matrix for a linear model; in particular,

$$ \mathrm{V}\mathrm{a}\mathrm{r}\left(\widehat{\boldsymbol{\upbeta}}\right)={\sigma}^2\times {\left({\mathbf{X}}^{\prime}\mathbf{X}\right)}^{-1}, $$

where X is the matrix of values of the independent variables and \( {\sigma}^2 \) is the residual variance (Bates and Watts 1988, p. 5). Thus, if \( {\sigma}^2=0 \), the covariances of the parameter estimates are necessarily 0; conversely, the only way the covariances can all simultaneously be 0 is if \( {\sigma}^2=0 \). Therefore, for this study, neither parameter uncertainty nor residual variance was incorporated into the simulation procedure apart from the other.
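
As a simple special case, offered only as an illustration and not taken from the study, consider a model with a single intercept parameter fit to n observations, so that X is a column of n ones. Then

$$ {\mathbf{X}}^{\prime}\mathbf{X}=n\kern1em \mathrm{and}\kern1em \mathrm{V}\mathrm{a}\mathrm{r}\left(\widehat{\beta}\right)=\frac{\sigma^2}{n}. $$

The variance of the parameter estimate is directly proportional to the residual variance and is 0 only when the residual variance is 0, which is why the two sources cannot be separated.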

3 Results

The fit of the model to the data on the ln-ln scale produced \( R^2 \) = 0.98 with corresponding pseudo-\( R^2 \) = 0.97 on the original scale (Figs. 2 and 3). These large values justify the initial decision not to consider model misspecification for this study (“Section 1”) and also suggest that other model forms would likely not have produced more accurate predictions. However, as with most similar datasets, the number of observations for large trees is smaller than for other trees. Although this phenomenon could contribute to serious lack of fit of the model for large trees, such was not the case for this study (Figs. 2 and 3).

Fig. 2 Group observation means versus group prediction means on ln-ln scale

Fig. 3 Group observation means versus group prediction means on original scale

On the ln-ln scale, the distribution of simulated parameter estimates exhibited an ellipsoidal pattern as expected for linear models (Fig. 4). Parameter uncertainty was simulated by random draws from this distribution. The approach to estimating the relationship between heteroskedastic residual standard deviations and volume model predictions as described in “Section 2.5” was somewhat arbitrary, but the relationship was well estimated (Fig. 5).

Fig. 4 Distribution of parameter estimates used to simulate parameter uncertainty

Fig. 5 Observed versus predicted heteroskedastic residual standard deviations

For all combinations of dbh measurement error, height measurement error, and parameter uncertainty and residual variance, 5000 replications of the simulation procedure were sufficient for estimates of both means and SEs to stabilize (Fig. 6). Further, for each of the more than 50,000 trees in the estimation dataset, the ratio of the prediction in any of the 5000 replications to the prediction with the original parameter estimates was never less than 0.87 or greater than 1.20. For large trees, the corresponding ratios ranged from 0.95 to 1.05.

Fig. 6 Simulated mean biomass per unit area (Mg/ha) and standard error versus replications for four strata and incorporating all sources of uncertainty

Means of tree-level dbh measurement errors ranged from approximately −0.12 to 0.12 cm with nearly 98 % between −0.05 and 0.05 cm, and means of tree-level height measurement errors ranged from approximately −2.4 to 2.1 m with nearly 98 % between −1.0 and 1.0 m. These relatively small tree-level errors have minimal effects at the population level.

For the STR estimators, the effects of uncertainty from the four sources on SEs increased as the number of strata increased, although nearly all the increase is attributed to the combined effects of parameter uncertainty and residual variance. For two strata, the proportional increase in SE was 0.036; for four strata, the proportional increase was 0.092; and for eight strata, the proportional increase was 0.148. Although the increase for two strata would likely be considered negligible, such may not be the case for four and eight strata. As previously noted, the simulated stratifications likely reduce SEs more than would be realized with actual stratifications with the result that the actual increases in SE may be slightly less than those reported in Table 1.

4 Discussion

The simulated stratifications accomplished the intended objective by grouping plots into strata with greater homogeneity than the population as a whole and thereby reducing the variance of the estimate of the population mean relative to the variance of the SRS mean (Tables 1 and 2). As previously noted, the lack of differences among the SRS and STR estimates of the means is attributed to using proportions of plots per stratum as stratum weights rather than proportions of the population.

Table 2 Stratified estimates

When using the SRS estimators, no source of uncertainty individually or in combination had a meaningful effect on SEs (Table 1). This result is consistent with the results reported by McRoberts et al. (2014b) for a sub-tropical dataset. For both the SRS and STR estimators, the effects of dbh and height measurement errors, including correlated height measurement errors, were negligible. This result can be attributed to the fairly large number of trees per plot (mean 23, maximum 134) which resulted in relatively small mean dbh and height measurement errors at the plot level. Berger et al. (2014) and Qi et al. (2015) reported similar results. Overall, these results suggest that as long as height measurements satisfy FIA protocols, these sources of uncertainty produce no meaningful adverse consequences. However, experience suggests that height measurements may fail to satisfy the protocols. Nevertheless, results obtained using a 20 % rather than 10 % tolerance and an 80 % rather than 90 % satisfaction rate were essentially unchanged.

The combined effects of parameter uncertainty and residual variance were greater than the combined effects of dbh and height measurement error. This result can be at least partially attributed to how parameter uncertainty affects plot-level estimates. Whereas measurement errors and prediction residuals are incorporated separately for individual trees and compensate for each other, parameter uncertainty is realized at the population level and therefore should be expected to have a greater population-level effect.

The important finding is that as the effects on SEs of population variability are reduced by using the STR estimators rather than the SRS estimators, the relative effects of underlying sources of model prediction uncertainty increase. Model-assisted estimators, which are receiving increasing attention for inventory applications, often reduce the effects of population variability even more than do stratified estimators (McRoberts et al. 2013, 2014a). Thus, the proportional adverse effects of measurement error, parameter uncertainty, and residual variance on the uncertainty of the large-area volume estimates may be even greater when the effects of sampling variable populations are further reduced.

5 Conclusions

Three conclusions may be drawn from the study. First, when using the simple random sampling estimators, the effects on large-area volume estimates of uncertainty in individual tree volume model predictions due to diameter and height measurement error, parameter uncertainty, and residual variance were negligible. Second, when the effects of variability in the population on uncertainty were reduced via stratified estimation, the effects of model prediction uncertainty on the large-area volume estimates increased as the number of strata increased. For four and eight strata, the proportional increases in the stratified SEs were as great as 0.092 and 0.148, respectively, which may not be negligible. Third, nearly all the effects of model prediction uncertainty can be attributed to parameter uncertainty and residual variance. As a caveat, all results for this study are contingent on the calibration dataset sample size and the quality of fit of the model to the data, both of which directly affect parameter uncertainty and residual variance.

References

  • Anderson-Sprecher R (1994) Model comparisons and R2. Am Stat 48:113–117

  • Avery TE, Burkhart HE (2002) Forest measurements, 5th edn. McGraw-Hill, New York, p 456

  • Baskerville GL (1972) Use of logarithmic regression in the estimation of plant biomass. Can J Forest Res 2:49–53

  • Bates DM, Watts DG (1988) Nonlinear regression analysis and its applications. Wiley, New York, p 365

  • Berger A, Gschwantner T, McRoberts RE, Schadauer K (2014) Effects of measurement errors on single tree stem volume estimates for the Austrian National Forest Inventory. For Sci 60:14–24

  • Breidenbach J, Antón-Fernández C, Petersson H, Astrup P, McRoberts RE (2014) Quantifying the contribution of biomass model errors to the uncertainty of biomass stock and change estimates in Norway. For Sci 60:25–33

  • Brown S, Gillespie AJR, Lugo AE (1989) Biomass estimation methods for tropical forests with application to forest inventory data. For Sci 35:881–902

  • Chave J, Andalo C, Brown S, Cairns MA, Chambers JQ, Eamus D, Fölster H, Fromard F, Higuchi N, Kira T, Lescure J, Nelson BW, Ogawa H, Puig H, Riéra B, Yamakura T (2005) Tree allometry and improved estimation of carbon stocks and balance in tropical forests. Oecol 145:87–99

  • Cochran WG (1977) Sampling techniques, 3rd edn. Wiley, New York, p 428

  • Cunia T (1965) Some theory on reliability of volume estimates in a forest inventory sample. For Sci 11:115–128

  • Cunia T (1987) On the error of continuous forest inventory estimates. Can J Forest Res 17:436–441

    Article  Google Scholar 

  • Gertner GZ (1987) Approximating precision in simulation projections: an efficient alternative to Monte Carlo. For Sci 33:230–239

    Google Scholar 

  • Gertner GZ (1990) The sensitivity of measurement error in stand volume estimation. Can J Forest Res 20:800–804

    Article  Google Scholar 

  • Gertner GZ, Dzialowy PJ (1984) Effects of measurement error on an individual tree-based growth projection system. Can J Forest Res 14:311–316

    Article  Google Scholar 

  • Gertner G, Köhl M (1992) An assessment of some nonsampling errors in a national survey using an error budget. For Sci 38:525–538

    Google Scholar 

  • Kangas A (1996) On the bias and variance in tree volume predictions due to model and measurement errors. Scand J For Res 11:281–290

    Article  Google Scholar 

  • McRoberts RE (1996) Estimating variation in field crew estimates of site index. Can J Forest Res 26:560–565

    Article  Google Scholar 

  • McRoberts RE, Westfall JA (2014) The effects of uncertainty in model predictions of individual tree volume on large area volume estimates. For Sci 60:34–43

    Google Scholar 

  • McRoberts RE, Hahn JT, Hefty GJ, Van Cleve JR (1994) Variation in forest inventory field measurements. Can J Forest Res 24:1766–1770

    Article  Google Scholar 

  • McRoberts RE, Hansen MH, Smith WB (2010) United States of America. In: Tomppo E, Gschwantner T, Lawrence M, McRoberts RE (eds) National forest inventories, pathways for common reporting. Springer, Heidelberg, pp 567–582

    Google Scholar 

  • McRoberts RE, Gobakken T, Næsset E (2012) Post-stratified estimation of forest area and growing stock volume using lidar-based stratifications. Remote Sens Environ 125:157–166

    Article  Google Scholar 

  • McRoberts RE, Næsset E, Gobakken T (2013) Inference for lidar-assisted estimation of forest growing stock volume. Remote Sens Environ 128:268–275

    Article  Google Scholar 

  • McRoberts RE, Liknes GC, Domke GM (2014a) Using a remote sensing-based, percent tree cover map to enhance forest inventory estimation. For Ecol Manage 312:2–18

    Google Scholar 

  • McRoberts RE, Moser P, Oliveira Z, Vibrans AC (2014b) The effects of uncertainty in individual tree volume model predictions on large area volume estimates for the Brazilian State of Santa Catarina. Can J Forest Res 45:44–51

    Article  Google Scholar 

  • Qi C, Vaglio GL, Valentini R (2015) Uncertainty of remote sensed aboveground biomass over an African tropical forest: propagating errors from trees to plots to pixels. Remote Sens Environ

  • Rubin DB (1987) Multiple imputation in non-response surveys. Wiley, New York, p 287

    Book  Google Scholar 

  • Särndal C-E, Swensson B, Wretman J (1992) Model assisted survey sampling. Springer-Verlag, Inc, New York, p 694

    Book  Google Scholar 

  • Ståhl G, Heikkinen J, Petersson H, Repola J, Holm S (2014) Adapting uncertainty assessments from sample based forest inventories to include the effects of model errors. For Sci 60:3–13

    Google Scholar 

  • US Forest Service (2012) Forest inventory and analysis national field guide. Volume 1: field data collection procedures for phase 2 plots, version 6.0. Available at: http://www.fia.fs.fed.us/library/field-guides-methods-proc/. Accessed: December 2014

  • Westfall JA, Patterson PL (2007) Measurement variability error for estimates of volume change. Can J Forest Res 37:2201–2210

    Article  Google Scholar 

  • Westfall JA, Scott CT (2010) Taper models for commercial tree species in the northeastern United States. For Sci 56:515–528

    Google Scholar 

Download references

Author information

Corresponding author

Correspondence to Ronald E. McRoberts.

Additional information

Handling Editor: Jean-Michel Leban

Contribution of the co-authors

Ronald E. McRoberts: developed and conducted the analyses and wrote most of the paper.

James A. Westfall: provided the data, consulted on the analyses, and reviewed and revised the manuscript.

About this article

Cite this article

McRoberts, R.E., Westfall, J.A. Propagating uncertainty through individual tree volume model predictions to large-area volume estimates. Annals of Forest Science 73, 625–633 (2016). https://doi.org/10.1007/s13595-015-0473-x

  • DOI: https://doi.org/10.1007/s13595-015-0473-x

Keywords