Skip to main content
  • Research Paper
  • Open access
  • Published:

Two new methods applied to crown width additive models: a case study for three tree species in Northeastern China

Abstract

Key message

The non-linear seemingly unrelated regression mixed-effects model (NSURMEM) and generalized additive model (GAM) were applied for the first time in crown width (CW) additive models of larch (Larix gmelinii Rupr.), birch (Betula platyphylla Suk.), and poplar (Populus davidiana Dode). The crown radii in four directions (CR) exhibited different growth trends and responded differently to tree size and competition variables. In the absence of calibration, GAM was more accurate than NSURMEM for CR and CW predictions.

Context

Crown radii in four directions (CR) and crown width (CW) are fundamental indicators used to describe tree crowns. The complexity of the CR growth in four directions of different tree species in natural forests is often ignored. There is logical additivity among CR and CW that is also often overlooked. Furthermore, the existing methods applied to CW additive models have some drawbacks.

Aims

We aim to: (i) evaluate the utility of two new methods in developing CW additive models for larch (Larix gmelinii Rupr.), birch (Betula platyphylla Suk.), and poplar (Populus davidiana Dode) in natural secondary forests of Northeastern China; and (ii) explore the growth patterns of CR in four directions to gain important ecological insights.

Methods

The non-linear seemingly unrelated regression mixed-effects model (NSURMEM) and generalized additive model (GAM) were used to develop CW additive models and to explore crown growth patterns. The predictive ability of the additive models was evaluated using leave-one-plot-out cross-validation (LOOCV).

Results

At a fair level without calibration, GAM provided slightly better results than NSURMEM. The response of the four CR to tree size and competition variables is different and may be non-uniform due to complex stand conditions and tree growth strategies.

Conclusion

The newly provided methods applied to additive models are available for external datasets. GAM is recommended in the absence of calibration. This study has important implications for the understanding of natural forest dynamics and decision-making for critical stand management.

1 Introduction

Crown radii (CR) are basic and essential variables in forestry surveys, which are usually measured in four directions, i.e., eastern crown radius (CRE), southern crown radius (CRS), western crown radius (CRW), and northern crown radius (CRN) (Bragg 2001). Tree distribution is often stochastic in heterogeneous stands, which results in individual trees being subject to irregular prevailing winds, directional solar radiation, and competition from neighboring trees (Grote 2003; Kong et al. 2021). The plasticity and phototropism of the crown allow for optimal canopy filling by growing into vacant growth spaces (Jucker et al. 2015; Krůček et al. 2019; Longuetaud et al. 2013). Therefore, the four CR usually possess different growth and development characteristics that tend to form asymmetrical or irregular crowns (Fu et al. 2017b; Kong et al. 2021; Thorpe et al. 2010). The understanding of the growth characteristics of the CR in four directions is critical for assessing the health and competitiveness of trees (Thorpe et al. 2010), simulating canopy cover and radiative transfer (Cescatti 1997; Gill et al. 2000), and understanding the susceptibility of trees to windthrow (Skatter and Kucera 2000). However, most of the developed models related to the crown still assumed consistent developmental trends in all directions of the crown, leading to identical models being used to simulate crown properties in all directions (Fu et al. 2017b; Fu et al. 2017c; Gao et al. 2021; Lei et al. 2018).

Crown width (CW) is also an important variable that has great value for research areas such as tree condition assessment (Zarnoch et al. 2004), stand model simulation (Jucker et al. 2017), and forest management (Hemery et al. 2005). CW is usually obtained from the four CR, i.e., CW = (CRE + CRS + CRW + CRN)/2 (Fu et al. 2013). Therefore, logical additivity exists among the four CR and CW (Fu et al. 2017c; Lei et al. 2018). Several parameter estimation methods of CR or CW models have been used in the literature, including ordinary non-linear least squares (ONLS) and non-linear mixed-effects models (NMEM) (Chen et al. 2021; Fu et al. 2017a; Sharma et al. 2016; Sharma et al. 2017; Yang and Huang 2017). However, these estimation methods ignore the logical additivity of the four CR and CW, results in statistically inefficient and inconsistent results (half of the sum of the predicted values of the four CR models is not equal to the predicted values of the CW model) (Fu et al. 2017c; Lei et al. 2018). In addition, ONLS violates the assumption of independent errors, leading to invalid hypothesis testing (Sharma et al. 2017).

The CW additive models can predict CW and the four CR simultaneously, ensuring their logical additivity (Fu et al. 2017c; Lei et al. 2018). Typically, CW additive models can be developed in two forms: (i) 4 + 1 (four CR and one CW) models; and (ii) four CR models, and then calculate the predicted values of CW using the predicted values of the four CR models. The latter form eliminates the parameter constraints in the model structure and is shown to yield better predictive performance (Xie et al. 2022; Zhao et al. 2019). The main estimation methods for CW additive models include non-linear seemingly unrelated regression (NSUR) and two-stage error-in-variable model (TSEM). NSUR and TSEM have been shown to perform slightly better compared to traditional methods (Lei et al. 2018). However, the use of NSUR and TSEM estimation still has drawbacks, such as ignoring the hierarchical nested structure of crown data, violates the independence assumption of observation errors (Raptis et al. 2018; West et al. 1984).

Non-linear seemingly unrelated mixed-effects model (NSURMEM) should be an appropriate solution to ensure both the additivity of CR and CW and to analyse the hierarchical structured data (Mehtätalo and Lappi 2020). NSURMEM is a flexible method for jointly modeling different dependent variables by imposing a joint multivariate distribution on the random effects (Fieuws and Verbeke 2006). NSURMEM is suitable for a variety of situations and is currently used in the medical field in both longitudinal and non-longitudinal environments (Hamza et al. 2009; Schluchter and Piccorelli 2019) but is not widely used in forestry (Mehtätalo and Lappi 2020). NSURMEM may provide a more flexible random effects calibration through sub-model correlations (Mehtätalo and Lappi 2020). However, supplementary data is often not readily available and is prone to high costs, especially in the case of large-scale forest surveys (Calama and Montero 2004). In addition, the prediction accuracy depends on the amount and properties of the supplementary data (Calama and Montero 2004; Chen et al. 2021). Bias in the supplementary data may result in lower accuracy of the calibration than predicted using only fixed-effects parameters (mean response) (Westfall and Scott 2010). Therefore, it may be sensible to fit and predict using only the fixed-effects parameters of NSURMEM (He et al. 2021).

Advances in the modeling approach have allowed us to evolve from parametric models to non-parametric or semi-parametric models in forest inventory, which are notable for being data-driven rather than model-driven and having no strict assumptions (Albert and Schmidt 2010; Byun et al. 2013; Frescino et al. 2001; Hastie and Tibshirani 1990; Zhang and Gove 2005). As a semi-parametric model, the generalized additive model (GAM) (Hasenauer 1997) uses a link function to establish the relationship between the mean of the response variable and the smoothing functions of the explanatory variables. GAM has relaxed assumptions, with the only underlying assumptions being that the functions are additive and that the explanatory variables are smoothed (Guisan et al. 2002; Wood 2017). GAM avoids problems such as model form filtering of parametric models, making it one of the most innovative and successful techniques in many forestry studies (Adamec and Drápela 2016; Fichtner et al. 2013; Guisan et al. 2002; He et al. 2021; Moisen et al. 2006; Robinson et al. 2011; Wang et al. 2005; Zang et al. 2016). Further, GAM could handle highly non-linear and non-monotonic relationships between response and explanatory variables by quantifying partial effects, contributing to deeper insights into the complexity of forest stands (Albert and Schmidt 2010; Schmidt et al. 2011; Wernicke et al. 2020). However, to our knowledge, no previous studies have applied GAM to CR or CW modeling. In addition, given the compelling features of GAM and the desirable additivity characteristics among CR and CW, we considered whether GAM could be applied to the CW additive models to provide high statistical efficiency with proper interpretation of additivity.

Larch (Larix gmelinii Rupr.), birch (Betula platyphylla Suk.), and poplar (Populus davidiana Dode) are important tree species in natural secondary forests in Northeastern China, providing essential regional and national benefits related to economic timber, carbon storage, biodiversity, and other ecosystem services (Dong et al. 2019). Therefore, the objectives of this study were to: (i) develop CW additive models for larch, birch, and poplar using two new methods, NSURMEM and GAM. In this process, each of the four CR was tested for appropriate basic models and covariates to accurately understand the growth and response to covariates of the four CR; and (ii) gain insight and evaluate the strengths and weaknesses of NSURMEM and GAM in application to CW addictive models. For a fair comparison, neither of the two additive models considered calibration. In addition, we did not perform statistical comparisons with traditional models such as ONLS and NMEM because the aim of this study was to develop new methods that are more consistent with the characteristics and statistical assumptions of the crown data. NSURMEM and GAM are expected to be widely applicable to other species and forest types in CW additive models.

2 Material and methods

2.1 Research area and data collection

Our study area is located in the Daxing’an Mountains Forest region in Northeastern China, which is the forest ecological function zone and wood resource reserve base with the largest area, highest latitude, and foremost ecological status. Data for this study were collected from 10 forest farms of Xinlin (123°41′E–125°25′E, 51°21′N–52°10′N), Songling (123°29′E–125°11′E, 50°9′N–51°23′N), and Huzhong (122°39′E–124°20'E, 51°14′N–52°25′N) Forestry Bureaus in Daxing’an Mountains during 2012–2014 and 2017–2018 without repeated measurements. The research areas belong to the continental monsoon climate zone in the cold temperate zone. For the Xinlin Forestry Bureau, the extreme temperature range is – 47 °C to + 36 °C, the annual average temperature is − 3 °C, the annual precipitation is 550.7 mm, and the frost-free period is 80–100 days. For Songling Forestry Bureau, the extreme temperature range is – 48 °C to + 30 °C, the annual average temperature is – 3 °C, the annual average precipitation is 600 mm, and the annual frost-free period is 100–110 days. For the Huzhong Forestry Bureau, the extreme temperature range is – 52 °C to + 32 °C, the annual average temperature is − 4.3 °C, the average annual precipitation is 497.7 mm, and the annual frost-free period is 87 days. The presence of low temperatures and frozen soil significantly affects soil formation in the three Forestry Bureaus.

A total of 121 temporary plots were set up in natural secondary stands that were regenerated naturally without human intervention. The plots were set up by stratified sampling according to site type and dominant tree species. The accessibility of the plots due to topography, terrain, rivers, and water areas was also considered. The plots ranged in area from 400 to 3600 m2 and covered a wide range of growing conditions as reflected by the various slopes, aspects, and elevation differences. In addition to the three main tree species mentioned above, there are also a small number of other accompanying tree species such as Betula davurica Pall., Quercus mongolica Fisch. ex Ledeb, etc. Hence, only larch, birch, and poplar were included in the detailed analysis, while other tree species cooperated with the calculation of stand-level variables (Thorpe et al. 2010). The data of larch and birch were measured from the Xinlin, Songling, and Huzhong Forestry Bureaus, while for poplar, only the Xinlin and Songling Forestry Bureaus. The prior statistical test (ANOVA) confirmed that there were no significant differences between geographic regions of the CR of the three species; hence, the source of the sampled trees was ignored for data analysis and model development (Dong et al. 2014).

In each sample plot, all living trees without top damage and extremely lopsided crown were measured, including over-bark diameter at breast height (DBH, 1.3 m above ground), total height (THT), and height to crown base (HCB). CR was measured as the maximum horizontal distance from the center of the trunk to the crown in four vertical directions. CW was calculated as the arithmetic mean of the two crown diameters obtained from the four CR (Fu et al. 2013). Differences in measuring instruments, technicians, and stand slopes may lead to extreme data points. To detect possible outliers and improve processing efficiency, a method like the abnormal data detection system proposed by Bi (2000) was used with smoothing parameters of 0.5 for the CR data. The detected extreme points were finally excluded from the original database (24, 29, and 2 trees for larch, birch, and poplar, respectively). The total numbers of larch, birch, and poplar left in the dataset were 11216, 4863, and 1039, respectively. Scatter plots and marginal histograms of CW and DBH for larch, birch, and poplar are shown in Fig. 1.

Fig. 1
figure 1

Scatter plots and marginal histograms of CW and DBH for a larch, b birch, and c poplar

2.2 Preparation before modeling

In the first place, we considered whether a generic model could be developed for the four CR of each species as in other studies (Fu et al. 2017b; Fu et al. 2017c; Lei et al. 2018). However, we believe that various tree species or even directions of CR may have diverse growth trends, especially in natural forests. To prove this hypothesis, we randomly selected three plots to map the relative positions of trees and the size of CW in the plots (Fig. 2a). The distributions of trees in the plots were scattered and random. The ideal situation of uniform growth of the CR may be difficult to achieve and does not correspond to biological realism.

Fig. 2
figure 2

a Relative positions of trees in the three plots, with the size of the circles representing CW; and b frequency distributions of crown asymmetry indices of trees in the three plots

In addition, we calculated the crown asymmetry indices (Eq. 1) of the trees in the three plots (Kong et al. 2021).

$${\text{CAI}}=\frac{1}{N_p}\sum \limits_{i=1}^{N_p}\frac{\left|{\overleftarrow{R}}_i-{\overrightarrow{R}}_i\right|}{{\overleftarrow{R}}_i+\overrightarrow{R}_i}$$
(1)

where Np is the number of paired CR measurements, \(\overleftarrow{R_i}\) and \(\overrightarrow{R_i}\) represent the ith pair of CR measured on two opposite sides of the crown. CAI = 0 indicates perfect symmetry and CAI = 1 indicates extreme asymmetry. The frequency distributions of crown asymmetry indices of trees in the three plots are shown in Fig. 2b. We found that most of the trees showed varying degrees of crown asymmetry.

Furthermore, we calculated the Pearson correlation coefficients of the four CR with some important variables using the whole data. The radar plots (Fig. 3) show that for all species, the four CR have differential correlation coefficients for the same variables. For example, for larch, the correlation coefficients of CRE, CRS, CRW, and CRN with DBH were 0.55, 0.56, 0.55, and 0.54, respectively; and with THT were 0.41, 0.42, 0.38, and 0.39, respectively.

Fig. 3
figure 3

Radar plots of Pearson correlation coefficients between CR and A DBH, B THT, C HCB, D BA, and E BAL for a) larch, b) birch, and c) poplar

In view of the above, we decided to develop separate models for each CR of each tree species. Although developing models separately may reduce the predictive and application ability of the additive models, it will be more consistent with the natural law that tree crown growth may be asymmetric in heterogeneous stands. In addition, separate modeling would provide a more rigorous and robust account of the relationship among CR and factors such as tree size and competition.

2.3 Selection of basic models

Given that the non-linear model form is theoretically more biologically logical and appears to yield more reliable extrapolation results (Gill et al. 2000), a total of eight popular non-linear models were selected as candidate basic models (Table 1). With the assumption of independent observations, the entire data for CR of larch, birch, and poplar were fitted separately using ONLS to determine the most appropriate basic models based on mean error (ME, Eq. 4) and root mean squared error (RMSE, Eq. 5) (Bronisz and Mehtätalo 2020a; Temesgen et al. 2008). Four main criteria were used to select basic models for each species: easy convergence, lower ME and RMSE, general applicability, and biological realism.

Table 1 Candidate basic models analyzed for the CR of larch, birch, and poplar

2.4 Addition of covariates

Accurate quantification of factors influencing CR not only allows for the development of generalized models with broader geographic applicability and higher predictive accuracy, but also provides insight into crown growth patterns in stands. Several landmark studies have observed that crown growth is closely related to biotic factors such as tree size (Raptis et al. 2018; Russell and Weiskittel 2011) and competition (Sharma et al. 2016; Sharma et al. 2017), as well as abiotic factors such as site conditions (Fu et al. 2013) and topography (Bechtold 2004). Given the objectives of this study, two types of covariates that are convenient to calculate and apply, including tree size and competition, were tested to capture the variability of CR among stands and optimize models (Hasenauer et al. 1998; Qin et al. 2022). Site quality variables were not considered in this study because they are more applicable to pure even-aged forests rather than mixed-uneven aged forests (Huang and Titus 1993). The covariates associated with tree size were DBH, THT, HCB, and the THT-to-DBH ratio (HDR) which represents the stability of the trees (Zhang et al. 2020). Variables describing competition incorporated quadratic mean diameter (QMD), DBH-to-QMD ratio (DQR), basal area (BA), the basal area proportion of target species (BApor), the basal area of trees larger than the subject tree (BAL), and stem numbers (N). Detailed summary statistics are shown in Table 2. All variables and their transformations, including square, natural logarithm, root, and power forms, were examined for their role in improving the CR equations.

Table 2 Descriptive statistics of the tree size and competition variables of larch, birch, and poplar

Several approaches have been successfully employed for adding covariates to forest models, including stepwise regression (Bechtold 2004), principal component analysis (Lei et al. 2016), two-stage approach, etc. (Calama and Montero 2004). The two-stage approach has been competently utilized in recent years and has proven to be reliable since it is biologically more relevant and interpretable than other methods (Chen et al. 2021; Sharma et al. 2016). Therefore, the two-stage approach was used in this study to incorporate different high-contribution covariables into the optimal basic models of each CR for each tree species. The specific processes were (i) fitting basic models to the data of each plot using the nlsList function in R software (Pinheiro and Bates 2006); and (ii) examining the relationship between the model coefficients and each potential covariable as well as its transformations by graphical and correlation analysis. Each pre-selected combination of variables was tested for multicollinearity based on a variance inflation factor (VIF), and covariates were removed using a VIF = 5 decision threshold. The maximum number of covariates for each CR equation was set to 2 to avoid problems such as over-parameterization, poor convergence, and slow computational speed of parameter estimation (Wang et al. 2021).

2.5 Non-linear seemingly unrelated mixed-effects model (NSURMEM)

After determining the generalized models of each CR for larch, birch, and poplar, the next step was the development of NSURMEM. Notice that the above covariable selection process was based on ONLS fitting, and significant variables may be insignificant due to the inherent correlation and random parameters in NSURMEM fitting, which should be deleted in this case. The structural formulation of m contemporaneously correlated non-linear equations could be specified as (Mehtätalo and Lappi 2020):

$${\displaystyle \begin{array}{c}{\textbf{y}}_i^{(1)}={f}_1\left({\textbf{x}}_1,{\boldsymbol{\upbeta}}_1,{\textbf{b}}_i^{(1)}\right)+{\boldsymbol{\upvarepsilon}}_i^{(1)}\\ {}{\textbf{y}}_i^{(2)}={f}_2\left({\textbf{x}}_2,{\boldsymbol{\upbeta}}_2,{\textbf{b}}_i^{(2)}\right)+{\boldsymbol{\upvarepsilon}}_i^{(2)}\\ {}\vdots \\ {}{\textbf{y}}_i^{(M)}={f}_M\left({\textbf{x}}_M,{\boldsymbol{\upbeta}}_M,{\textbf{b}}_i^{(M)}\right)+{\boldsymbol{\upvarepsilon}}_i^{(M)}\end{array}}$$
(2)

where \({\textbf{y}}_i^{(M)}\) is the response variable vector (M = 1, 2, …m), fM() are non-linear functions of the predictor variable matrix xM, βM is the fixed parameter vector common to all subjects, \({\textbf{b}}_i^{(M)}\sim \textrm{MVN}\left(\textbf{0},\textbf{D}\right)\) where D is the stacked covariance matrix of the random parameter vector \({\textbf{b}}_i^{(M)}\). \({\boldsymbol{\upvarepsilon}}_i^{(M)}\sim \textrm{MVN}\left(\textbf{0},{\textbf{R}}_i\right)\) with Ri = IΣ where is the Kronecker product, Σ is the stacked variance-covariance matrix of m response variables conditional on \({\textbf{b}}_i^{(M)}\).

Theoretically, NSURMEM could be extended by non-linear mixed-effects models for joint modeling. However, convergence problems often arise for high-dimensional calculations (Mehtätalo and Lappi 2020). Therefore, we used the solution proposed by Fieuws and Verbeke (2006) to fit the P = m(m – 1)/2 pairs of bivariate models instead of the full multivariate model (m = 4, P = 6 in this study). In a maximum likelihood framework, each pairwise model produces estimates with classical optimal asymptotic properties, including consistency and asymptotic normality. The fitted pairwise models were averaged over the obtained unbiased estimates to generate the final estimates (Fieuws and Verbeke 2006; Mehtätalo and Lappi 2020). All pairwise models were jointly fitted to the data for each tree species through the SAS/ETS NLMIXED Procedure (SAS Institute, Inc. 2011).

2.6 Generalized additive model (GAM)

GAM (Hastie and Tibshirani 1990) is specified only on the basis of smoothing functions rather than detailed parameter relationships, allowing flexibility in specifying the dependence of the response on covariates (Wood 2017). Generally, the structure of the GAM could be expressed as (Guisan et al. 2002):

$$g\left({\mu}_i\right)={\textbf{A}}_i\boldsymbol{\theta} +{f}_1\left({x}_{1i}\right)+{f}_2\left({x}_{2i}\right)+{f}_3\left({x}_{3i}\right)+\dots$$
(3)

where g() is an invertible link function, µi is the expected value of the response variable yi, Ai is the design matrix for any strictly parametric model component, θ is the corresponding parameter vector, fj() are smooth functions of the covariates xki. None of the interaction terms were considered in the model since the preliminary results showed no significant improvement.

Variable selection for GAM could be performed using either predefined rules including bias reduction measured by the χ2 statistic, and methods to minimize AIC; or more automatic procedures including stepwise regression, and shrinkage rules such as ridge regression or lasso (Wang et al. 2006). However, the model comparison is valid only when the models have the same covariables (Dong et al. 2014). Hence, the independent variables of GAM were consistent with the variables used in NSURMEM of each CR. After determining the model forms for all CR of each tree species, the CW model was defined as half of the sum of the GAM for the separately calculated CR.

The GAM was estimated by gamm function via package mgcv in the R-library, which allows access to a wide range of random effects and correlation structures (Wood 2004). Despite various smoothing functions or combinations thereof that could be used in GAM, we used the default thin plate regression splines (TP) for all model covariates as it provided more stable and excellent results for CR of all tree species after extensive testing. It is reassuring to attribute that it would be unfortunate if the model relied heavily on details such as the precise choice of basis (Wood 2017). For details on smooth splines see Wood (2017).

2.7 Model evaluation

Since independent data sets were not available, a method named leave-one-plot-out cross-validation (LOOCV) was used to measure the predictive performance of additive models (Yang and Huang 2014). NSURMEM was not calibrated for a fair comparison with GAM. For the sake of brevity, the LOOCV process was not described in detail in this paper. Various statistics were used for model fitting and validation, which were calculated as follows:

$$\text{ME}=\frac{1}{n}\sum_{i=1}^n\left(y_i-{\widehat y}_i\right)$$
(4)
$$\text{RMSE}=\sqrt{\frac{1}{n-1}\sum_{i=1}^n\left(y_i-{\widehat y}_i\right)^2}$$
(5)
$$\text{MAE}=\frac{1}{n}\sum_{i=1}^n\left|y_i-{\widehat y}_i\right|$$
(6)
$$\text{FI}=1-\left[\sum_{i=1}^n\left(y_i-{\widehat y}_i\right)^2/\sum_{i=1}^n\left(y_i-\overline y\right)^2\right]$$
(7)

where yi and \({\hat{y}}_i\) are observed and predicted values of CR or CW for the ith observation; \(\overline{y}\) is the average of yi; n is the total number of observations. ME is mean error; RMSE is the root mean squared error; MAE is mean absolute error; and FI is the fit index. ME and MAE deal with ‘bias’ or ‘accuracy’, while RMSE and FI deal with the “precision” of estimates (Amaro et al. 2003). For a more intuitive understanding of the accuracy of the corresponding two methods, the following FI% was also calculated, representing the percentage increase in FI due to GAM estimation compared to NSURMEM.

$$\textrm{FI}\%=\left({\textrm{FI}}_{\textrm{GAM}}{\textrm{-FI}}_{\textrm{NSURMEM}}\right)/{\textrm{FI}}_{\textrm{NSURMEM}}\times 100\%$$
(8)

3 Results

3.1 Determination of basic models

An overall summary of the comparison of candidate basic models is given in Table 3. As expected, the eight candidate models yielded different applicability for CR of various tree species or even various directions of the same tree species. For all CR of larch and birch, Model 8 provided the smallest ME and RMSE. Yet for poplar, model 4, model 8, model 2, and model 4 were considered the optimal basic models based on a combination of easy convergence, smallest ME and RMSE, general applicability, and biological realism. These basic models have previously been widely applied to several CW models and additive models (cf. Table 1).

Table 3 Fit statistics of the candidate basic models for CR of larch, birch, and poplar

3.2 NSURMEM

The preliminary assessment of the two-stage approach showed that tree size and competition variables including HCB, BA, THT, and BAL, rather than their transformed forms, contributed more to the CR of the three tree species. The best-performing generalized CR equations composed the NSURMEM for three species. The random parameters were determined by comparing the model performance for different combinations of parameters. The NSURMEM of birch only contained the size variables because the competition (BA and QMD, not shown) became insignificant with the addition of the random parameters. Eventually, the NSURMEM of larch (Eq. 9), birch (Eq. 10), and poplar (Eq. 11) held cross-equation correlations of the following detailed forms:

$$\left\{\begin{array}{c}{\textrm{CRE}}_{ij}={f}_{\textrm{E}}\left(\textbf{x}\right)={\beta}_{1\textrm{E}}/\left[1+\left({\beta}_{2\textrm{E}}+{\beta}_{4\textrm{E}} {\textrm{HCB}}_{ij}\right)\exp \left(-\left({\beta}_{3\textrm{E}}+{b}_E+{\beta}_{5\textrm{E}} {\textrm{BA}}_i\right){\textrm{DBH}}_{ij}\right)\right]+{\varepsilon}_{\textrm{E}}\\ {}{\textrm{CRS}}_{ij}={f}_{\textrm{S}}\left(\textbf{x}\right)=\left({\beta}_{1\textrm{S}}+{b}_S\right)/\left[1+\left({\beta}_{2\textrm{S}}+{\beta}_{4\textrm{S}} {\textrm{HCB}}_{ij}\right)\exp \left(-\left({\beta}_{3\textrm{S}}+{\beta}_{5\textrm{S}} {\textrm{BA}}_i\right){\textrm{DBH}}_{ij}\right)\right]+{\varepsilon}_{\textrm{S}}\\ {}{\textrm{CRW}}_{ij}={f}_{\textrm{W}}\left(\textbf{x}\right)={\beta}_{1\textrm{W}}/\left[1+\left({\beta}_{2\textrm{W}}+{\beta}_{4\textrm{W}} {\textrm{HCB}}_{ij}\right)\exp \left(-\left({\beta}_{3\textrm{W}}+{b}_W\right){\textrm{DBH}}_{ij}\right)\right]+{\varepsilon}_{\textrm{W}}\\ {}{\textrm{CRN}}_{ij}={f}_{\textrm{N}}\left(\textbf{x}\right)=\left({\beta}_{1\textrm{N}}+{b}_N\right)/\left[1+\left({\beta}_{2\textrm{N}}+{\beta}_{4\textrm{N}} {\textrm{HCB}}_{ij}\right)\exp \left(-\left({\beta}_{3\textrm{N}}+{\beta}_{5\textrm{N}} {\textrm{BA}}_i\right){\textrm{DBH}}_{ij}\right)\right]+{\varepsilon}_{\textrm{N}}\end{array}\right.$$
(9)
$$\left\{\begin{array}{c}{\textrm{CRE}}_{ij}={f}_{\textrm{E}}\left(\textbf{x}\right)=\left({\beta}_{1\textrm{E}}+{b}_E\right)/\left[1+\left({\beta}_{2\textrm{E}}+{\beta}_{4\textrm{E}} {\textrm{HCB}}_{ij}\right)\exp \left(-{\beta}_{3\textrm{E}}{\textrm{DBH}}_{ij}\right)\right]+{\varepsilon}_{\textrm{E}}\\ {}{\textrm{CRS}}_{ij}={f}_{\textrm{S}}\left(\textbf{x}\right)=\left({\beta}_{1\textrm{S}}+{b}_S\right)/\left[1+\left({\beta}_{2\textrm{S}}+{\beta}_{4\textrm{S}} {\textrm{HCB}}_{ij}\right)\exp \left(-{\beta}_{3\textrm{S}}{\textrm{DBH}}_{ij}\right)\right]+{\varepsilon}_{\textrm{S}}\\ {}{\textrm{CRW}}_{ij}={f}_{\textrm{W}}\left(\textbf{x}\right)=\left({\beta}_{1\textrm{W}}+{b}_W\right)/\left[1+\left({\beta}_{2\textrm{W}}+{\beta}_{4\textrm{W}} {\textrm{HCB}}_{ij}\right)\exp \left(-{\beta}_{3\textrm{W}}{\textrm{DBH}}_{ij}\right)\right]+{\varepsilon}_{\textrm{W}}\\ {}{\textrm{CRN}}_{ij}={f}_{\textrm{N}}\left(\textbf{x}\right)={\beta}_{1\textrm{N}}/\left[1+\left({\beta}_{2\textrm{N}}+{b}_N+{\beta}_{4\textrm{N}} {\textrm{HCB}}_{ij}\right)\exp \left(-{\beta}_{3\textrm{N}}{\textrm{DBH}}_{ij}\right)\right]+{\varepsilon}_{\textrm{N}}\end{array}\right.$$
(10)
$$\left\{\begin{array}{c}{\textrm{CRE}}_{ij}={f}_{\textrm{E}}\left(\textbf{x}\right)={\left(\frac{{\textrm{DBH}}_{ij}}{\beta_{1\textrm{E}}+{b}_E+\left({\beta}_{2\textrm{E}}+{\beta}_{3\textrm{E}} {\textrm{BA}}_i\right){\textrm{DBH}}_{ij}}\right)}^2+{\varepsilon}_{\textrm{E}}\\ {}{\textrm{CRS}}_{ij}={f}_{\textrm{S}}\left(\textbf{x}\right)=\left({\beta}_{1\textrm{S}}+{b}_S\right)/\left[1+{\beta_{2\textrm{S}}}\exp \left(-{\beta}_{3\textrm{S}}{\textrm{DBH}}_{ij}\right)\right]+{\varepsilon}_{\textrm{S}}\\ {}{\textrm{CRW}}_{ij}={f}_{\textrm{W}}\left(\textbf{x}\right)=\left({\beta}_{1\textrm{W}}+{b}_W+{\beta}_{3\textrm{W}} {\textrm{THT}}_{ij}\right){{\textrm{DBH}}_{ij}}^{\beta_{2\textrm{W}}}+{\varepsilon}_{\textrm{W}}\\ {}{\textrm{CRN}}_{ij}={f}_{\textrm{N}}\left(\textbf{x}\right)={\left(\frac{{\textrm{DBH}}_{ij}}{\beta_{1\textrm{N}}+\left({\beta}_{2\textrm{N}}+{b}_N+{\beta}_{3\textrm{N}} {\textrm{HCB}}_{ij}+{\beta}_{4\textrm{N}} {\textrm{BAL}}_{ij}\right){\textrm{DBH}}_{ij}}\right)}^2+{\varepsilon}_{\textrm{N}}\end{array}\right.$$
(11)

The CW predictions for the three tree species were derived from the CR predictions:

$${\textrm{CW}}_{ij}=\left({\textrm{CRE}}_{ij}+{\textrm{CRS}}_{ij}+{\textrm{CRW}}_{ij}+{\textrm{CRN}}_{ij}\right)/2$$
(12)

where CREij, CRSij, CRWij, CRNij, CWij, DBHij, HCBij, BALij, and THTij are the CRE, CRS, CRW, CRN, DBH, HCB, BAL, and THT for tree j in plot i; BAi is the BA for plot i; β1•, β2•, β3•, β4•, and β5• are fixed parameters for each CR of the three tree species; b are random parameters for four CR; ε are the model error terms. The scatter plots of the CR versus covariates in the final NSURMEM are shown in Appendix: Fig. 9.

Tables 4 and 5 report the estimated fixed parameters and random effects variance-covariance matrix of NSURMEM, respectively. All the fixed parameters were statistically significant at P = 0.05 with negative or positive biological significance.

Table 4 Fixed parameter estimates of NSURMEM systems for larch, birch, and poplar
Table 5 Random effects variance-covariance matrix of NSURMEM systems

For greater clarity, the contribution of each covariate to the variations of each CR for three species is illustrated in Fig. 4, showing the varying degrees of variation in the CR equations across the three species. All CR of all tree species increased with increasing DBH and decreased with increasing HCB, BA, THT, and BAL.

Fig. 4
figure 4

Contribution of different covariates including tree size (HCB and THT) and competition (BA and BAL) to CR for larch, birch, and poplar. The curves were generated by using the average values of the covariates in Table 2, the parameters of NSURMEM systems in Table 4, and the equal intervals of the test covariates from the minimum to the maximum range

3.3 GAM

The detailed forms of the final GAM for each species, which include covariables consistent with the corresponding NSURMEM, are shown in Table 6.

Table 6 Formulae of the final GAM systems for larch, birch, and poplar

The estimated coefficients and statistical characteristics of GAM are given in Table 7. All non-linear relationships between covariables and response variables were biologically plausible and significant at P = 0.05. The partial residual plots are shown in Fig. 5. Similar conclusions to the NSURMEM could be drawn from the trends in partial residual plots of GAM. All CR of all tree species increased with increasing DBH. The CR of larch and birch decreased with increasing HCB except for the CRE of birch, which showed a trend of first decreasing and then slightly increasing (b2 in Fig. 5). For poplar, the CRW and CRN responses to tree size and competition were consistent with those reported by the NSURMEM. Yet differences still existed in NSURMEM and GAM, such as all CR of larch and birch, and CRE of poplar showed non-monotonic trends in response to competition variables.

Table 7 Estimated coefficients and statistical characteristics of GAM systems for the CR of larch, birch, and poplar
Fig. 5
figure 5

Partial residual plots of covariables to each CR for a larch, b birch, and c poplar. The solid line in each plot is the estimate of the smooth function, while the dashed lines are roughly 95% confidence limits. At the base of each plot is a univariate histogram (rugplot) showing the distribution of data points

3.4 Intraspecific and interspecific variability of crown

NSURMEM and GAM were used to simulate the relationship among the predicted CR and CW against DBH (cf. Fig. 6). The fitted curves of the two additive models combined showed that the four CR of larch were quite similar. For birch, CRS was significantly larger than the other three directions in terms of DBH from small to large. As for poplar, the growth trends of CRS and CRW were similar. As DBH increased, the CRE and CRN were significantly shorter than the other two directions. In general, larch produced the largest CW, and the two broadleaf species produced similar CW. The size of CR for broadleaf species varied considerably, especially poplar.

Fig. 6
figure 6

Intraspecific and interspecific variability in CR and CW for the three species generated using NSURMEM and GAM systems

3.5 Fitting evaluation

A further graphical inspection (Fig. 7) showed that all methods fitted the data well, and there was no significant deviation from the homogeneous error variance even if without applying the weighting function in NSURMEM.

Fig. 7
figure 7

Multi-panel display of residuals versus predicted values of NSURMEM and GAM systems for larch, birch, and poplar

Table 8 summarizes the fit statistics of the NSURMEM and GAM by species. Apparently, all methods provided reasonably good fits to the data. The statistics showed that in most cases, the GAM produced slightly lower ME, MAE, and RMSE and higher FI (FI% from 0.66% to 39.56%). The CR of the NSURMEM for birch with the similar but lower MAE than the GAM. For CW of all species, the FI values of the GAM were higher than those of the NSURMEM by 4.14% for larch, 2.84% for birch, and 4.15% for poplar.

Table 8 The goodness-of-fit statistics of the NSURMEM and GAM systems for larch, birch, and poplar

3.6 Model validation

An overall summary of the LOOCV is given in Table 9. For most CR and all CW of all tree species, GAM provided slightly higher precision than NSURMEM based on validation statistics. For CW of all species, the FI values of the GAM were higher than those of the NSURMEM by 2.66% for larch, 0.94% for birch, and 1.61% for poplar.

Table 9 The statistics of LOOCV of the NSURMEM and GAM systems for larch, birch, and poplar

Furthermore, the boxplots of residuals of the NSURMEM and GAM were plotted against DBH classes of 8 cm width (cf. Fig. 8), while the 41 cm DBH classes were further pooled with the previous DBH classes since there were very few observations (12, 3, and 2 trees for larch, birch, and poplar, respectively). Generally, neither method showed serious bias. The bias of NSURMEM and GAM occurred mainly in the largest DBH class, i.e., the class with fewer observations, and behaved consistently in the smaller DBH classes with more observations.

Fig. 8
figure 8

Boxplots of residuals in different DBH classes of NSURMEM and GAM systems for larch, birch, and poplar. The grey numbers below each subplot indicate the observations of each DBH class for each tree species

4 Discussion

4.1 The growth trend of the four CR

Appropriate models were developed for each CR of larch, birch, and poplar in natural secondary forests. It is noteworthy that the CR of identical tree species may be influenced by different variables. Due to the plasticity and adaptive strategies of the crown, growth trends in different directions may depend on complex stand conditions, intense competition with neighboring trees, and irregular light, water, and nutrients in natural forest communities (Grote 2003; Kong et al. 2021). The resulting crown asymmetry will allow crown development to be adapted to the local environment to maximize access to resources (Attocchi and Skovsgaard 2015; Gao et al. 2021; Krůček et al. 2019; Pretzsch 2019). In addition, estimating the CR in different basic directions of each tree will be of great value due to the need for explicit crown asymmetry information in numerous research areas such as windthrow susceptibility estimation, and wood quality simulation. (Grote 2003; Kellomäki et al. 1999; Krůček et al. 2019; Skatter and Kucera 2000; Sun et al. 2022). The results indicated that building separate models for each CR avoided the problem of neglecting the asymmetry of the tree crown in each direction. Although this practice may reduce the applicability of the models, it provided a more robust and precise explanation for the relationships among CR and the covariables (Figs. 4 and 5).

Each of the selected covariables accurately reflects tree growth, resource allocation relationships, or competition with neighboring trees. Moreover, these covariates are readily available through field surveys or calculations, facilitating the practical application of additive models (Chen et al. 2021; Raptis et al. 2018; Sharma et al. 2016). Consistent with previous studies (Chen et al. 2021; Fu et al. 2013; Raptis et al. 2018; Sharma et al. 2016), we found that tree size (HCB and THT) and competition (BA and BAL) were the significant influencing variables for the crown (Fig. 4). The negative influence of HCB and THT on the CR may be attributed to resource allocation strategies related to biomechanical and hydraulic constraints (Zhang et al. 2019). The significant negative influences of competition on the crown may be since, with limited resource availability, the larger the competition, the fewer resources available to individual trees (Gill et al. 2000; Hemery et al. 2005; Qin et al. 2022). Canopies could optimize crown filling by changing structure and growth strategies through their plasticity and vertical stratification to efficiently use above-ground space (Morin et al. 2011; Pretzsch 2014), adapt to the changing environment of competing neighbors (Gao et al. 2021; Kaitaniemi and Lintunen 2010; Longuetaud et al. 2013), and efficiently utilize light resources (Jucker et al. 2015; Pretzsch 2019). However, these characteristics may not be sufficient to counteract the negative effects of competition. Competition variables were not significant in the NSURMEM of birch. It is probably because birch is a pioneer species, and the smaller influence of competition was explained by the random parameters (Huang et al. 2009; Wang et al. 2018; Zhou et al. 1989).

Moreover, the effects of the covariables on the identical CR may not be monotonic, as indicated by the partial residual plots of GAM (Fig. 5), with possible attribution to ecological niche differentiation due to the complex stand structure of natural secondary forests (Buchacher and Ledermann 2020; Ciceu et al. 2020; del Río et al. 2014; Jucker et al. 2015). Further support was given by Thorpe et al. (2010) who found that in mixed-species scenarios, trees growing in dense communities were often associated with larger predicted canopies. However, as the statistics indicated (Tables 8 and 9), there may be other covariables that significantly affect crown development not incorporated. Future work may consider quantifying other important covariables such as climate and soil to better understand their influence on the CR of individual trees.

Our research has shown that there are clear differences in crown development among different tree species and even among different directions of the same tree species (Fig. 6). As expected, the four CR of all tree species increased with tree size gradually and showed significant differences. In the sapling stage, lateral crown expansion is rarely limited by neighboring trees, and light is usually the most limiting resource for growth, with the uniform growth of the CR allowing them to survive in low light conditions (Ricard et al. 2003). As tree size increases, the crown preferentially expands to the side with gaps to maximize the accumulation of photosynthetic products for further investment in stem growth for mechanical safety, resulting in an asymmetrical crown (Krůček et al. 2019; Xu et al. 2022). Larch consistently has a larger and more uniform crown than birch and poplar, which may be attributed to its growth characteristics and less plasticity than broadleaf trees (Buchacher and Ledermann 2020; Holdaway 1986). Continued research on the crown growth of these three important tree species that account for a relatively large proportion of natural secondary forests in Northeastern China is warranted.

4.2 Remarks on the performance of the models

As expected, the NSURMEM and GAM adequately predicted the CR and CW of each tree species in the natural secondary forests studied, indicating that both additive models have passed the statistical test. In addition, additive models consider not only the crown asymmetric information, but also the additivity of CW and CR (Fu et al. 2017b; Lei et al. 2018). NSURMEM and GAM are by far the first time to model CR data for multiple tree species in a systematic and large-scale manner. Our results showed that GAM produced similar or even better prediction accuracy than NSURMEM for the CR and CW in the absence of calibration (Tables 8 and 9).

Compared to the traditional methods (ONLS and NMEM), the characteristics of NSURMEM are evident. Firstly, NSURMEM with strong logic could consider cross-model correlation and produce biologically meaningful parameters and robust predictions (Fu et al. 2017c; Kangas et al. 2016). Secondly, NSURMEM could handle the lack of independence due to the hierarchically nested structure (Calama and Montero 2004; Chen et al. 2021). Finally, NSURMEM allows flexible calibration for random effects using correlations among sub-models. However, NSURMEM is necessary to face the problems of determining the model form, checking the relationship between covariables and parameters, and estimating the initial values of the function parameters, which still seem cumbersome and awkward nowadays (Wang et al. 2005). In recent years, the application of the linear seemingly unrelated mixed-effects model (SURMEM) is becoming more and more widespread in forestry (Bronisz and Mehtätalo 2020b; Mehtätalo and Lappi 2020). Previous studies have demonstrated the more flexible calibration of SURMEM (Hao et al. 2022; Xie et al. 2022). For comparison with GAM in a fair perspective, we did not calibrate NSURMEM. It also avoids large deviations caused by inaccurate calibration data (Hussain et al. 2021; Westfall and Scott 2010).

GAM is a powerful exploratory and flexible tool for detecting simple linear relationships and complex patterns in the distribution of forest properties (Albert and Schmidt 2010; Frescino et al. 2001). Compared to the traditional methods of ONLS and NMEM, the GAM simplifies those processes by automatically identifying the appropriate relationships between predictors and response variables (Albert and Schmidt 2010; Di Salvatore et al. 2021; Levine et al. 2021). Moreover, due to the data-driven characteristic, GAM allows to reveal variability that is masked by parametric models and thus provides key ecological insights (e.g., the non-monotonic relationship between competition and CR) (Levine et al. 2021; Moisen et al. 2006; Wernicke et al. 2020). Currently, many scholars have concluded that GAM is more resistant to extreme values than other models (Byun et al. 2013), which is certainly remarkable in practice if valuable data could not be authenticated and excluded. Thus, a wider range of application areas for GAM is waiting to be actively explored.

Nonetheless, all methods face the limitation of hardly giving accurate estimates if applied to data ranges and forest conditions beyond equations (Bi et al. 2010; Kangas et al. 2016; Levine et al. 2021). In the work of Frescino et al. (2001), this problem was solved by specifying the values in the validation dataset that were outside the range as the maximum or minimum values of the corresponding variables in the fitted dataset. Other scholars have suggested expanding the fitted data set to cover a larger range of data to overcome this issue (Albert and Schmidt 2010; He et al. 2021). Fortunately, the NSURMEM and GAM in this study have achieved satisfactory accuracy. It should be noted, however, that the variable range of the data behind the models needs to be carefully reviewed before using them for prediction, as it will ultimately determine the accuracy of the sample-based equations (Bi et al. 2010).

5 Conclusion

In this study, we used two new methods, NSURMEM and GAM, to establish CW additive models for larch, birch, and poplar in the natural secondary forests of Northeastern China. Through the development of the additive models, we found that the CR of each tree species exhibited different growth trends and was influenced to varying degrees by tree size and competition. The birch and poplar were more likely to exhibit asymmetric crowns than larch. NSURMEM considered the inherent additivity between CR and CW and the hierarchical nesting structure of crown data. GAM simplified the model form selection process and could handle non-linear and non-monotonic relationships between variables. In the future, calibrations of NSURMEM and GAM can be considered for comparison. However, large-scale sampling calibration is not recommended due to the high cost.

Availability of data and materials

The data used in the current study can be obtained from the corresponding author with reasonable requirements.

References

Download references

Acknowledgements

The authors are deeply grateful to the editor, the associate editor, and all anonymous reviewers for their helpful feedback and valuable suggestions. We would also like to thank the researchers, MSc, and PhD students who contributed to this article.

Code availability

The code used in the current study can be obtained from the corresponding author with reasonable requirements.

Funding

This research was supported by the National Natural Science Foundation of China (Grant Nos. 31170591 and 32271866); Heilongjiang Touyan Innovation Team Program; and National Forestry and Grassland Data Center-Heilongjiang platform (2005DKA32200-OH).

Author information

Authors and Affiliations

Authors

Contributions

Conceptualization: Junjie Wang, Lichun Jiang, Youzhu Wang; methodology: Junjie Wang, Shidong Xin, Pei He; formal analysis and investigation: Junjie Wang, Shidong Xin, Pei He, Yunfei Yan; writing—original draft preparation: Junjie Wang; writing—review and editing: Junjie Wang, Youzhu Wang, Lichun Jiang; funding acquisition: Lichun Jiang; supervision: Lichun Jiang. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Lichun Jiang.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

All authors gave their informed consent to this publication and its content.

Competing of interests

The authors declare that they have no conflict of interest.

Additional information

Handling Editor: John Lhotka

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Appendix

Figure 9

Fig. 9
figure 9

Scatter plots of crown components versus covariates in the final additive models for a larch, b birch, and c poplar

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Wang, J., Jiang, L., Xin, S. et al. Two new methods applied to crown width additive models: a case study for three tree species in Northeastern China. Annals of Forest Science 80, 11 (2023). https://doi.org/10.1186/s13595-022-01165-5

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1186/s13595-022-01165-5

Keywords