Spatial optimization of genetic thinning in seed orchards

We provide a mathematical model to determine which trees should be ruled out from the grid to promote random mating in seed orchards under genetic thinning. Genetic thinning (roguing) is a common practice in forest tree breeding to remove inferior genotypes in seed orchards, thus boosting the genetic worth of the seed crop. To develop a general methodology for spatial optimization of genetic thinning. It should promote random mating and consider any existing seed orchard layout. The model is based on the Optimum-Neighborhood Allocation algorithm (Chaloupková et al., Forests 10:1-6, 2019). The algorithm’s efficiency was evaluated using computer simulation. A fully randomized scheme was used as a reference. In addition, the study provides a demonstration on an actual seed orchard. Simulations confirm the method’s efficiency in promoting random mating compared to the fully randomized allocation across a wide range of selection intensities. We suggest Linear Deployment as a preferred method for calculating optimum deployment contributions at higher thinning intensities. The algorithm was programmed in R and is publicly available. Breeders can use the software and follow the example to implement genetic thinning in different practical scenarios assuming any seed orchard layout. The approach enhances random mating while maximizing genetic response to selection.


Introduction
Seed orchards connect tree breeding programs and operational forestry. They are the source of genetically improved seed for afforestation. In the initial breeding cycle, phenotypically superior individuals (designated as plus trees) are identified in forest stands. They are typically grafted to establish the first-generation (clonal) seed orchards. As breeding progresses through a typical sequence of breeding activities (mating, testing, and selection), new seed orchards are planted with grafted top-ranking genetically improved individuals (clones). Due to repeated cycles of selection, advanced-generation seed orchards are associated with higher genetic gains in important quantitative and qualitative traits (White et al. 2007;Funda and El-Kassaby 2013).
Seed orchard management is complex, including the establishment, long-term maintenance, and periodic harvest of the seed crop. During the planning period, a decision is made on the appropriate spatial distribution of vegetative propagules (ramets) of the respective clones on the orchard's grid (spatial design). During the 1950s to early 1970s, numerous spatial configurations were developed primarily under the assumption of crossbreeding Page 2 of 8 Chaloupková and Lstibůrek Annals of Forest Science (2022) 79:37 occurring among neighboring individuals; see Giertych (1975) for a comprehensive review. While this assumption was confirmed by case studies (Burczyk and Prat 1997;Chen et al. 2018), additional factors should be considered while choosing the appropriate orchard design, primarily in advanced breeding cycles. Hodge and White (1993) list four such criteria: (1) to maximize genetic gain while maintaining acceptable genetic diversity, (2) to provide for random mating, (3) to minimize inbreeding, and (4) to minimize the impact of selection errors (maintaining flexibility for roguing, denoted as the genetic thinning). Contemporary designs address most of these concerns (Lstibůrek and El-Kassaby 2010;El-Kassaby et al. 2014;Liesebach et al. 2021). In the current study, we utilize the Optimum Neighbourhood Allocation (ONA) scheme (Chaloupková et al. 2019), which promotes panmixia by minimizing the variance of the counts of adjacencies among all clones across the orchard's grid.
In the initial breeding cycle, the genetic worth of clones is unknown prior to progeny testing. Once the parental genetic values become available from progeny trials, one can rank and remove inferior parents from the orchard. This operation is called "genetic thinning" or "roguing. " These genetically improved seed orchards are regarded as the 1.5 generation. They are genetically superior to the initial first-generation orchards due to the effect of selection. They serve their purpose before the 2 nd generation orchards provide a sufficient quantity of seed (White et al. 2007).
To keep things simple, breeders would often opt for roguing a specific number of clones systematically from the orchard, e.g., every other tree in a row. There is a more efficient approach considering genetic gain and diversity in the seed orchard crop. Optimum clonal proportions (expected gametic contributions following the thinning) can be calculated from predicted additive genetic (breeding) values (Prescher et al. 2008;Bondesson and Lindgren 1993). The approach is based on the Linear Deployment concept by Lindgren and Matheson (1986). Theoretically, a more general selection optimization protocol (e.g., Funda et al. 2009) could also be extended for genetic thinning. However, Linear Deployment is the method of choice during initial breeding cycles (assumption of unrelated and non-inbred clones), when genetic thinning is typically considered.
In most cases, genetic thinning alters the original objective of the seed orchard layout. Breeders are faced with the problem of identifying the exact ramets that should be removed from the grid, keeping in mind the main criteria of seed orchard designs. Currently, there is no methodology to optimize the spatial distribution of genetic thinning across the orchard's grid. For this purpose, we provide a new method based on the ONA scheme. The optimum allocation is acquired from the assessment of removing a given candidate ramet from every position across the grid and all candidate clones, minimizing the ONA's criterion function. Following the genetic thinning, the resulting scheme is thus optimized to promote random mating and reduce the level of inbreeding. We demonstrate the approach using computer simulation and the actual seed orchard layout.

Conceptual plan
We start with any seed orchard spatial configuration, i.e., the spatial distribution of N C clones on a grid. It is assumed that breeders determine the intensity of genetic thinning using one of the approaches introduced earlier (e.g., Linear Deployment).
Next, we optimize the spatial distribution of genetic thinning, pointing at particular trees on the grid that should be removed. After conducting the thinning, the idea is to promote panmixia among all remaining entries. Therefore, we adopted (with some modifications) the Optimum-Neighborhood Allocation scheme by Chaloupková et al. (2016).

Mathematical algorithm
To assist tree breeders with implementing the algorithm using their unique input parameters, we provide R-package "thinONA" (see "Code availability" section, R Core Team 2022).
The existing seed orchard layout (subjected to genetic thinning) is provided in matrix ORCHARD (m rows and n columns). Elements of the matrix are labeled using an integer sequence from the first clone, denoted 1, to the last clone, denoted N C . The goal is to remove a given subset REMOVE of genetic entries from the existing grid. There are no distributional assumptions on REMOVE . Note that clonal identification numbers in REMOVE must correspond to those in ORCHARD . For example, REMOVE = {1, 1, 3, 4}designates that two ramets of the clone one and one ramet of clones 3 and 4 should be removed from the grid.
In the first step, a matrix PAIRS is formed. It consists of the sums of direct neighborhood positions across all possible clonal pairs across ORCHARD . (For example, assuming clones A and B, the corresponding value in PAIRS has a value of two if their respective ramets are present as direct neighbors two times across the entire grid). A single candidate is then selected from REMOVE (based on the clonal size; clones with a higher number of ramets are selected first; if there are more candidate clones of the same size, the choice is random). The candidate is then located in PAIRS to identify its most frequent direct neighbor. All such positions in ORCHARD are evaluated to determine which elimination minimizes the criterion function (introduced in the next paragraph). The candidate is then removed from the matrix ORCHARD position and the candidate list in the vector REMOVE.
Candidate positions on the grid are individually evaluated from the top-left to the bottom-right corner assuming eight closest neighboring positions (two horizontal, two vertical, and four diagonal). The algorithm is flexible to evaluate more distant positions on the grid if requested by the breeder. We developed a criterion function to quantify the impact of removing entries in REMOVE in each location across the grid for the Var min criterion (see Chaloupková et al. 2016 for details). The algorithm is based on the minimization of the criterion function: where n ii and n ij depict the number of single and different clones' adjacencies, respectively; N G is the total number of adjacencies existing within the rectangular orchard grid, w is a weight parameter to penalize the placement of identical clone's ramets as direct neighbors, and p is the sum of all n ii elements. The component w is a higher-order value than the first addend (ex. 99999), leading to the effective elimination of direct neighborhoods of identical genetic entries.
The optimization goal is to minimize the variance of matrix PAIRS while constraining the placement of identical clones' ramets as direct neighbors.

Computer simulation
We adopted the ONA scheme and optimized the spatial distribution of 40 clones with equal sizes (number of ramets) on a square grid of 20 rows and 20 columns (Chaloupková et al. 2016). Respective breeding values were sampled from a standardized normal distribution. Two scenarios of genetic thinning were assumed. First, the inferior one-half of clones were subjected to simple thinning of equal intensity (scenario Genetic Thinning Simple, GTS). Second, we followed the Linear Deployment algorithm (Lindgren and Matheson 1986) to optimize selection intensity across the assumed 40 clones (scenario Genetic Thinning Linear Deployment, GTL). The selected proportion during genetic thinning ranged from 5% to 40%. In each scenario, vector REMOVE was (1) created, consisting of clonal identification numbers subjected to removal on the grid.

Scots pine seed orchard
The Scots pine (Pinus sylvestris, L.) seed orchard "Doubrava" is managed by the Forest of the Czech Republic (state enterprise), district Plasy. It was established in 1980 using a simplistic randomization protocol with a spacing of 6 × 6 m across 6.5 ha. The orchard consists of 1165 ramets of 87 clones with highly variable sizes (ranging from 1 to 28 ramets per clone). Geographic coordinates are 49°54′31′′N, 13°26′34′′E; altitude 380 m. We utilized breeding values from half-sib progeny trials (Kaňák 2011). We followed the Linear Deployment algorithm (Lindgren and Matheson 1986) to derive optimum clonal sizes. The selection intensity of genetic thinning (expressed as a proportion) ranged from 5 to 40%. In each scenario, assuming the optimum clonal sizes, vector REMOVE was created, consisting of specific ramets subjected to removal on the grid. The spatial distribution of genetic thinning was then optimized using the mathematical algorithm (Section 2.2).
All schemes mentioned above were evaluated using the R package OrchSTAT (Lstibůrek 2022). As a reference line (Section 2.3), averages of the above statistics were calculated for a fully randomized seed orchard layout, assuming 100 stochastic iterations (designated as the "random reference line").

Results
Assuming efficient baseline seed orchard layout (ONA) with equal clonal sizes, genetic thinning (ONA algorithm) promotes Var min , diversifying gametic contributions among the parental genotypes to the seed orchard's crop (departure from panmixia). Interestingly, this increment in Var min is more significant under Linear Deployment (GTL scenario) at lower selection intensity than in the simplistic GTS scenario (p up to 20%). Both approaches perform similarly regarding Var min at intermediate thinning intensities. GTL results in lower Varmin at higher thinning intensities (over 20%), which is undoubtedly beneficial. This is likely due to more open spaces on the grid combined with a more gradual change of clonal proportions under the GTL (see ONA lines in Fig. 1).
On the contrary, a reduction in Var min is associated with the initial fully randomized scheme (see RRL lines in Fig. 1). GTS seems to be more efficient in reducing Var min at p up to 30%, yet it is inferior to GTL at p > 30%. In line with the reduction of Var min , i.e., the minimization of the PAIRS matrix, the optimization is constrained by the initial spatial distribution with multiple neighborhoods among the ramets of the same clone.
Page 4 of 8 Chaloupková and Lstibůrek Annals of Forest Science (2022) 79:37 The algorithm efficiently reduces these identical clones' adjacencies ( Fig. 2). It seems that both GTS and GTL are similarly efficient in this regard. On the contrary, the sum of diagonal elements of PAIRS was always zero under the ONA schemes (not shown in Fig. 2), which is attributable to the second addend in Eq. 1. The actual spatial distribution of genetic thinning, considering the GTL option, is provided in Figs. 3, 4, and 5 for three thinning intensities (proportions 10, 20, and 30%). The efficiency of Var min minimization is very apparent in the seed orchard Doubrava (primary Y-axis, blue line, Fig. 6). Var min drops exponentially with p across the whole studied interval. Variation in clonal sizes is reduced with p (secondary Y-axis, box plots).

Discussion
This study demonstrates that genetic thinning could be spatially optimized across the orchard's grid. Breeders should evaluate the impact of genetic thinning under specific circumstances in line with the main criteria of seed orchard designs (Hodge and White 1993). As presented in our study, interactions among the main variables are not trivial. The effect of genetic thinning on expected panmixia (Var min ) is proportional to the selection intensity, the original seed orchard layout, and the distribution of clonal sizes. Thorough roguing is typically advocated in seed orchards since the genetic response to selection depends on high selection intensity (Falconer and Mackay 1996). Seed orchards that have been established with an excessive number of plus trees highly benefit from intensive roguing (a prerequisite to capturing meaningful genetic response). High genetic gain (selection intensity) is, however, associated with unavoidable loss of seed production.
While the build-up of Var min is minimized using the modified ONA algorithm, its relative magnitude is still substantial, provided the original design was efficient (ONA lines in Fig. 1). In numerous practical scenarios (less efficient spatial designs, highly variable clonal sizes), the ONA-based genetic thinning preserves or even reduces the Var min of the original scheme, as seen in the Doubrava orchard (Fig. 6). Thus, genetic thinning could be used to "improve" original less efficient seed orchard layouts. Breeders can evaluate this effect by comparing the Var min before and after the thinning using the R package OrchSTAT ("Code availability" section). Our method also reduces the same clone's adjacency (if present in the initial orchard), limiting the selfing rate (Fig. 2). Fig. 1 The Var min criterion (Y-axis) of the seed orchard layout following the genetic thinning, with 0-40% ramets removed from the orchard (X-axis). Genetic thinning was applied to either the randomized seed orchard scheme ("random reference line, RRL"; averages of 100 stochastic iterations) or the declared ONA scheme ("ONA"; optimum layout). Both GTS and GTL deployment options are shown in the graph Chaloupková and Lstibůrek Annals of Forest Science (2022)   (5.) Actual gametic contributions differ from calculated "ideal" proportions (complex reproductive biology in variable environmental conditions). Most of the above issues are equally relevant in spatial layout optimization. Next, we discuss specific assumptions of the presented methodology (genetic thinning utilizing the ONA criterion function). The algorithm utilizes local heuristics. Therefore, reaching the true optimum is not guaranteed. Hovewer, Chaloupková et al. (2016) derived theoretical minimum variance, which can be used as a reference optimality measure. While we focused exclusively on the same clone's adjacencies, further penalties could be introduced to trace pair-wise genetic relationships across adjacent positions (assumption 1. above). Each would be given a separate weight. One may set up relative weights using the genetic coancestry, i.e., the equivalent of the "flow" matrix C in the Minimum Inbreeding scheme (Lstibůrek and El-Kassaby 2010, pp. 604). As stated in the methodical section, no distributional assumptions exist on REMOVE and ORCHARD . They are assumed to be calculated error-free (assumptions 2. and 3.). Provided the premise of preferential mating among neighboring individuals holds (i.e., spatially constrained pollen dispersion, see Funda and El-Kassaby 2013), random mating should be evaluated in proximity, e.g., nine adjacent positions as presented here (assumption 4.). Higher intensity of genetic thinning unavoidably produces multiple adjacent empty blocks, as seen in Fig. 5. Should this be of concern, an extra penalty can be integrated into the objective function (Eq. 1). The Varmin could be calculated across a broader range, considering more distant neighborhood positions. Information on reproductive biology and its dynamics may streamline the optimization protocol (assumption 5.). In practical terminology, layouts are established for the entire lifespan of seed orchards, and they are never considered ideal, given the ever-present reproductive variation.

Conclusion
Breeders conduct genetic thinning in seed orchards by removing genetically inferior trees from the grid after their genetic merit is evaluated in progeny trials. This operation enhances the genetic worth of seed orchards' crop, Fig. 6 The Var min criterion (Y-axis) of the seed orchard Doubrava in the Czech Republic following the genetic thinning, with 0-40% ramets removed from the orchard (proportion selected, X-axis). The number of remaining trees (grafts) is provided on the secondary X-axis. The distribution of clonal sizes is shown on the secondary Y-axis