Stochastic models in seed dispersals: random walks and birth–death processes

ABSTRACT Seed dispersal involves complex systems, and the data collected with advanced seed tracking facilities pose challenges to conventional approaches such as empirical and deterministic models. The use of stochastic models in current seed dispersal studies is therefore encouraged. This review describes three existing stochastic models: the birth–death process (BDP), the two-dimensional (2D) symmetric random walk and the intermittent random walk. All three models possess the Markov property, which makes them flexible for studying natural phenomena. Only a few of their applications in ecology concern seed dispersal. The review illustrates how the models can be used in the seed dispersal context. Using the nonlinear BDP, we formulate an individual-based model for two competing plant species, while the cover time model is formulated using the symmetric and intermittent random walks. We also show that all three stochastic models can be formulated using the Gillespie algorithm. The full cover time obtained from the symmetric random walk approximates the Gumbel distribution, as other searching strategies do. We suggest that applying these models to seed dispersal may improve the understanding of many complex systems, such as seed removal experiments and the behaviour of foraging agents, among others.


Introduction
The seedling survival, growth and development of many plant species depend on seed dispersal, through which agents such as rodents move seeds away from the seed-producing trees. This reduces intraspecific competition among the many seeds that would otherwise remain under the parent trees. The agents also benefit from seed dispersal, since they often consume the seeds as food. This mutualistic relationship is a major life-history feature for the population growth of most plant and animal species [26]. Although empirical studies contribute immensely to seed dispersal research, many complex systems, such as the determination of foraging patterns, cannot be understood by field studies alone. This necessitates the use of different approaches, including mathematics, to understand seed dispersal mechanisms. Even though seed dispersal models exist [41,50], field studies continue to pose challenges to mathematical modelling because of the emergence of advanced seed tracking facilities. Technology provides a wide range of equipment that is currently used in data collection experiments. For over a decade, motion-sensitive camera traps and radio transmitters have been used to track and monitor dispersal agents [33].
Computational ecology therefore develops the tools necessary for confronting the large amounts of data generated through animal tracking facilities [23]. Seed dispersal requires effective mathematical tools for understanding and predicting the hidden traits of moving seeds. These tools range from temporal to spatial models. The choice of an appropriate method depends on many factors, such as the availability of data and management objectives, among others [37]. Even when data are available, obtaining a model that is suitable for the analyses is not guaranteed. This motivates the application to seed dispersal of a variety of methods, such as individual-based models (IBMs), whose basic entities are individual plants and animals, among others. When real data are not available, IBMs are flexible enough to generate simulated data. For example, empirical studies of animal movement record the time and location of species in a 50 ha plot [68]; nevertheless, sampling longer distances in a forest may sometimes be difficult [16]. This problem arises because larger seeds are often dispersed by large animal seed dispersers over a thousand metres [57], whereas small-bodied frugivores disperse seeds over a few hundred metres [59]. Carlo and Morales [12] found that the majority of large-seeded plant species are dispersed by frugivores, such as birds and large animal seed dispersers, which can cover long distances searching for food. Simulated models of animal movement therefore help in understanding and estimating the distances over which seeds are dispersed by both small and large animal seed dispersers [70]. The survival of regional plant species depends on long-distance dispersal, in which seeds are carried by their dispersal agents [65], mostly large animals.
Even though many IBMs exist in the ecological literature, only a few have been applied to investigate the population dynamics of dispersal agents and their foraging behaviour. Mathematical models [53,54], mostly formulated deterministically, have been applied to study some traits of foraging agents. Deterministic models alone are not enough to understand the characteristics of each individual agent, because agents move and interact with their potential targets under random variation. This necessitates the use of stochastic models in seed dispersal. Even though tools [62] such as birth–death processes (BDPs) and symmetric and intermittent random walks have been applied separately in many studies, a single work that covers all three models and relates them to seed dispersal remains elusive. The three models share the Markov property, whereby the future does not depend on the past given the present [40]. This review does not aim to replace the existing literature on stochastic modelling, but rather supports it and relates it to seed dispersal. By doing so, we hope to promote the application of these models to different seed dispersal problems. We show that the interaction of plant species that require dispersal agents, the time taken by an agent to remove a seed and the agents' foraging patterns can all be studied with the three models.

Birth-death processes
We consider BDPs formulated as continuous-time Markov chains (CTMCs), in which the population count of the process is discrete [1]. A variety of problems have been studied using CTMCs. The CTMC of infectious salmon anaemia (a virus infecting fish), for example, was formulated through a BDP [46]. In modelling the population growth of a species, the BDP is often more suitable than the other two forms of this CTMC, the pure birth and pure death processes. In the former, individuals only reproduce and never die, while the latter allows individuals only to die [40]. The advantage of the BDP over these two models can be seen in a model of waterfowl movement [41], which was formulated to estimate the model parameters. Furthermore, zu Dohna and Pineda-Krch [72] fitted the parameters of a BDP to infectious disease data. Subsequently, data from BDPs were found to fit many probability distributions [44]. In addition, a flexible method for finding the transition probabilities of BDPs was developed to handle sophisticated ecological models whose birth and death transition rates cannot be treated analytically [18]. Results obtained from BDPs are often compared with their deterministic analogues. For example, to understand the dynamics of pathogens spread by wild rodents, both BDPs and their deterministic counterparts were studied, and some realistic dynamics of the rodents' virus were captured [2]. Moreover, BDPs serve as analogues of many well-known differential equations, such as the logistic-growth-dispersal model [42], and to some extent they explore changes in demographic stochasticity [39]. More recently, BDPs have been used to approximate fractional differential equations, in which the birth and death rates are linked to two regimes, supercritical and subcritical [28]. In infectious disease immunization, the fractional BDP identified that the targeted scheme was more efficient than uniform control [34].
The BDP is therefore a robust tool for studying seed dispersal problems, such as determining the delayed germination of annual organisms [38], finding the seed dispersal distance [64] and evaluating the cost (seed predation) and benefit (seed dispersal) of seed dispersal by ants [3]. However, this review found that only a few such tools have been applied to seed dispersal. We therefore illustrate how the BDP can be used to formulate the IBM of two competing species that reproduce through the seed dispersal process.

Two species IBM
A model of spatio-temporal dynamics that captures the birth, death and behaviour of individuals is called an IBM [67]. IBMs are often formulated through BDPs with the aid of the Gillespie algorithm [29,30]. Such mathematical models [22,45] investigate interactions between two or more species, such as predation and competition, among others. Applying interacting-population models to seed dispersal may therefore play an important role in understanding the mechanisms that structure ecological communities. As an example, we describe how two competing species can be formulated using a nonlinear BDP. This illustration may help identify a potential model for understanding mechanisms of seed dispersal, such as the time to extinction of each species and the effects of different initial population sizes and dispersal rates on two competing species, among others.
Here, Equations (13) and (14) are the deterministic models of the process. The coupled ordinary differential equations take the Lotka–Volterra form [6]. Using these equations alone, we cannot be sure of the predictions obtained from the models. Deterministic models lack random variation [20], which is important in studying population dynamics. Such variation occurs naturally through fluctuations in birth, death or environmental factors [40]. Therefore, to capture the random variation associated with each individual agent in the two competing species, a computer simulation of the process was formulated. The Gillespie algorithm [29,30], which is efficient in formulating BDPs, was used to build the stochastic IBM. The algorithm uses two random numbers, $r_1$ and $r_2$, drawn from the uniform distribution. The first random number, $r_1$, updates the time in the process, while $r_2$ selects the next event according to the transition rates. The whole computer simulation of the two competing species is summarized in the flowchart in Figure 1. After the formulation, the deterministic and stochastic IBMs are usually combined to answer questions that may lead to an understanding of species coexistence. Although this study does not answer any question about the competing species, it combines the two models for the purpose of illustration. The same arbitrary parameters were used in both the deterministic and stochastic IBMs. Figure 2 shows that the two models agree closely. This illustration therefore serves as an introduction that can be modified to study a related seed dispersal problem.
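The simulation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the formulation behind Figures 1 and 2: Equations (13) and (14) are not reproduced in this section, so the transition rates below are hypothetical and are chosen only so that their mean-field limit has Lotka–Volterra competition form. The function name `gillespie_competition` and all parameter names and values are likewise assumptions.

```python
import bisect
import itertools
import math
import random


def gillespie_competition(n1, n2, rates, t_max, seed=0):
    """Simulate one realization of a two-species nonlinear birth-death process.

    `rates` holds hypothetical parameters (not taken from the source):
      b1, b2    per-capita birth (seed establishment) rates
      d1, d2    per-capita baseline death rates
      a11..a22  competition coefficients that raise the death rates
    """
    rng = random.Random(seed)
    b1, b2, d1, d2, a11, a12, a21, a22 = rates
    t = 0.0
    history = [(t, n1, n2)]
    while t < t_max:
        # Assumed transition rates of the four possible events.
        propensities = [
            b1 * n1,                          # birth in species 1
            n1 * (d1 + a11 * n1 + a12 * n2),  # death in species 1
            b2 * n2,                          # birth in species 2
            n2 * (d2 + a21 * n1 + a22 * n2),  # death in species 2
        ]
        cumulative = list(itertools.accumulate(propensities))
        total = cumulative[-1]
        if total == 0:
            break  # both species are extinct, so no event can occur
        r1 = 1.0 - rng.random()  # uniform on (0, 1], keeps log(r1) finite
        r2 = rng.random()        # uniform on [0, 1)
        t += -math.log(r1) / total  # r1 updates the time
        # r2 selects the next event in proportion to its rate.
        event = bisect.bisect_right(cumulative, r2 * total)
        if event == 0:
            n1 += 1
        elif event == 1:
            n1 -= 1
        elif event == 2:
            n2 += 1
        else:
            n2 -= 1
        history.append((t, n1, n2))
    return history


# Arbitrary illustrative parameters and initial population sizes.
params = (1.0, 1.0, 0.1, 0.1, 0.01, 0.005, 0.005, 0.01)
history = gillespie_competition(n1=20, n2=40, rates=params, t_max=20.0)
print(history[-1])
```

Each call produces one stochastic realization; averaging many realizations with different seeds is one way to compare the stochastic model with a deterministic solution computed from the same parameters.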

Random walk models
Random walk theory provides some of the most widely used models for studying natural phenomena. For more than five decades, these models have been applied in ecology [63] and physics [58], among other related fields. Owing to the wide scope of random walks, one study [14] discussed the models in detail and identified their applications in biology. In this context, the most notable contribution of random walks is the important role they play in understanding spatial ecology. Studies of random walks in ecology have addressed how species move [36], where a species can find a target [7] and how long it takes a species to find a first target [24]. Other applications of random walk models include determining the long distances covered by a species [10], finding the foraging patterns of a species [43,60], predicting a species' perceptual range [61], that is, the distance from which a species detects a target, and estimating the effect of seed dispersal in forests [31]. This review further discusses regular and irregular random walks, namely the symmetric and intermittent walks. Although they have few applications in seed dispersal so far, the symmetric and intermittent random walks can be considered potential tools for understanding dispersal mechanisms. Many agents performing irregular random walks, such as intermittent walks, have the ability to locate many targets. Later in this review, however, we show how the symmetric random walk can approximate the intermittent walk, one of the efficient searching strategies.

Symmetric random walks
Because the agent has an equal chance of moving in any given direction [9], the symmetric walk is considered the simplest form of random walk model. For example, in one dimension (1D), the agent moves either left or right, each with probability 1/2. This can be formulated in the same way as the linear BDP [51]. If $\delta$ is the distance covered by the agent in one step of duration $\tau$, the location of the agent after one step has $(\mu, \sigma^2) \approx (0, 1/2)$, where $\mu$ is the mean and $\sigma^2$ is the variance. After $n$ steps of the walk, by the central limit theorem the location of the agent follows the normal distribution $N(n\mu, n\sigma^2) \approx N(0, n/2)$ [32]. Other important properties of both 1D and 2D symmetric walks, such as the continuum limit of the models, are discussed in Codling et al. [14] and Allen [1], among other related literature.

[Figure caption fragment: (13) and (14), while the stochastic simulation is obtained by one realization of the Gillespie algorithm.]
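The central limit statement for the 1D walk can be checked numerically with a short sketch. It assumes that each step is $+\delta$ or $-\delta$ with probability 1/2, so that a single step has mean 0 and variance $\delta^2$ and the position after $n$ steps has variance $n\delta^2$; under this reading, the values quoted in the text correspond to $\delta^2 = 1/2$. The function name `final_positions` and the chosen numbers of steps and walkers are illustrative assumptions.

```python
import random
import statistics


def final_positions(n_steps, n_walkers, delta=1.0, seed=0):
    """Return the end points of independent 1D symmetric random walks."""
    rng = random.Random(seed)
    positions = []
    for _ in range(n_walkers):
        x = 0.0
        for _ in range(n_steps):
            # Move right or left by delta with probability 1/2 each.
            x += delta if rng.random() < 0.5 else -delta
        positions.append(x)
    return positions


positions = final_positions(n_steps=100, n_walkers=5000)
mean = statistics.fmean(positions)
variance = statistics.pvariance(positions)
# With delta = 1 the sample mean should be near 0 and the variance near n = 100.
print(f"mean = {mean:.3f}, variance = {variance:.1f}")
```

Passing `delta=0.5 ** 0.5` makes the variance after $n$ steps close to $n/2$.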
Although the 1D symmetric random walk forms the basis for many higher-dimensional random walks, a 2D symmetric model is more likely to reproduce the temporal and spatial patterns associated with the movement of agents. This property makes the 2D symmetric walk robust for investigating the movement and foraging of the majority of animal seed dispersers [35]. The spatial pattern influences the distribution of seeds within a given habitat [69]. As before, the 2D symmetric walk can be formulated with the Gillespie algorithm [29,30], and the computer simulation is summarized in the flowchart in Figure 3.
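A Gillespie-style 2D symmetric walk might look like the sketch below. The flowchart in Figure 3 is not available here, so several details are assumptions: the four lattice moves are given equal rates, one uniform random number sets the exponential waiting time and a second one picks the direction, and the lattice is periodic. The function name `symmetric_walk_2d` and the parameters `side` and `rate` are hypothetical.

```python
import math
import random

# The four nearest-neighbour moves on a square lattice.
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def symmetric_walk_2d(n_steps, side, rate=1.0, seed=0):
    """Return a list of (time, x, y) for a walk on a periodic side x side lattice.

    `rate` is the assumed total jump rate, shared equally by the four moves.
    """
    rng = random.Random(seed)
    x = y = side // 2
    t = 0.0
    path = [(t, x, y)]
    for _ in range(n_steps):
        r1 = 1.0 - rng.random()  # uniform on (0, 1], keeps log(r1) finite
        r2 = rng.random()        # uniform on [0, 1)
        t += -math.log(r1) / rate        # r1 updates the time
        dx, dy = MOVES[int(r2 * 4)]      # r2 selects one of four equally likely moves
        x = (x + dx) % side              # periodic boundary in x
        y = (y + dy) % side              # periodic boundary in y
        path.append((t, x, y))
    return path


path = symmetric_walk_2d(n_steps=1000, side=10)
distinct_sites = len({(x, y) for _, x, y in path})
print(f"final time = {path[-1][0]:.2f}, distinct sites visited = {distinct_sites}")
```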

Intermittent random walks
Another random walk described in this review that has potential for investigating animal seed dispersal problems is the intermittent walk, a model in which an agent performs two different kinds of step, selected with the aid of random numbers. These are called the searching and relocation steps. An agent switches to a searching step (with a certain rate) when it thoroughly explores the sites it visits, and alternates to a relocation step (with a relocation rate) when it only lightly visits the sites of a given lattice. The intermittent walk has been used to investigate the foraging behaviour of lizards and fish, among others [56]. Sometimes agents use their cognitive skills to develop more efficient searching strategies, as in the case of human beings looking for jobs or shelter [55].
The intermittent random walk is formulated by implementing the Gillespie algorithm [29,30], as explained earlier; we briefly describe it here in relation to our intermittent walk. We chose $\rho$ and $\lambda_1$ to be our searching and relocation rates, respectively. The total rate is given by $\lambda_0 = \lambda_1 + \rho$. Therefore, the agent performs a searching step with probability $\rho/\lambda_0$ and a relocation step with probability $\lambda_1/\lambda_0$. The time of a relocation step is the time of the searching step plus $\kappa$, where $\kappa$ is a random number drawn from the exponential distribution with mean $1/\lambda_2$, and $\lambda_2$ is added to the relocation time: $t_{i+1} = t_i + r_2 + \kappa + \lambda_2$. This completes one step of the intermittent random walk, and the process continues in this way until the number of steps is exhausted. The flowchart in Figure 4 summarizes the computer simulation for the intermittent walk. Note that a periodic barrier is often used to control the movement of agents at the boundaries.
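One possible reading of this procedure is sketched below. It is an interpretation rather than the simulation summarized in Figure 4, and several details are assumptions: `lam1` and `lam2` are hypothetical names for the relocation rate and for the rate of the extra exponential delay, the waiting time of a searching step is drawn from an exponential distribution with the total rate, a relocation is modelled as a straight jump of `jump_length` sites in a random lattice direction, and the lattice is periodic.

```python
import random

MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def intermittent_walk(n_steps, side, rho, lam1, lam2, jump_length=3, seed=0):
    """Return a list of (time, x, y, kind) on a periodic side x side lattice.

    rho  : searching rate
    lam1 : relocation rate (hypothetical name)
    lam2 : rate of the extra delay kappa, also added to the relocation
           time as a constant, following a literal reading of the text
    """
    rng = random.Random(seed)
    lam0 = lam1 + rho  # total rate
    x = y = side // 2
    t = 0.0
    path = [(t, x, y, "start")]
    for _ in range(n_steps):
        wait = rng.expovariate(lam0)  # assumed time of a searching step
        dx, dy = rng.choice(MOVES)
        if rng.random() < rho / lam0:
            kind, length = "search", 1  # explore a neighbouring site
            t += wait
        else:
            kind, length = "relocate", jump_length  # assumed long jump
            kappa = rng.expovariate(lam2)  # exponential delay with mean 1 / lam2
            t += wait + kappa + lam2
        x = (x + length * dx) % side  # periodic boundary in x
        y = (y + length * dy) % side  # periodic boundary in y
        path.append((t, x, y, kind))
    return path


walk = intermittent_walk(n_steps=20000, side=16, rho=3.0, lam1=1.0, lam2=2.0, seed=1)
kinds = [kind for _, _, _, kind in walk[1:]]
search_fraction = kinds.count("search") / len(kinds)
# The fraction of searching steps should be close to rho / (rho + lam1) = 0.75.
print(f"fraction of searching steps = {search_fraction:.3f}")
```

The jump length and the landing rule for relocations are design choices of this sketch; a different relocation rule can be substituted without changing how the step type and the time update are drawn.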

Cover time models of animal movements
In this section, we describe one application of the random walk models formulated previously. Whenever an agent moves on a lattice domain of size N, the study is interested in two quantities: (i) the number of distinct sites visited by the agent, and (ii) the time taken by the agent to visit those sites. The latter, called the cover time, is divided into the full cover time $t_f$ and the partial cover time $t_p$. The time taken by an agent to visit all distinct sites of a given lattice is called the full cover time [71], whereas the time taken to visit only a fraction of the lattice is termed the partial cover time [17]. Furthermore, the time taken by the agent to reach an absorbing barrier is called the first passage time [66]. One example is the simulation model that studied gut-passage time to determine the effects of plant distribution [48]. When many targets are distributed on the lattice domain, averaging the first passage times to them over the starting points gives the global mean first passage time of the process, $E(T)$ [8].
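The three quantities defined above can be measured directly from a simulated walk. The sketch below is a simplified illustration that counts time in steps of a symmetric walk on a periodic square lattice instead of using Gillespie waiting times. The function names `cover_times` and `first_passage_time`, the lattice side and the covered fraction are assumptions made for the example.

```python
import math
import random

MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def cover_times(side, fraction=0.5, seed=0):
    """Return (t_p, t_f) in steps for a symmetric walker on a periodic lattice.

    t_p is the first step at which at least `fraction` of the sites have been
    visited, and t_f is the first step at which all sites have been visited.
    """
    rng = random.Random(seed)
    n_sites = side * side
    partial_target = max(1, math.ceil(fraction * n_sites))
    x = y = 0
    visited = {(x, y)}
    steps = 0
    t_partial = 0 if partial_target == 1 else None
    while len(visited) < n_sites:
        dx, dy = rng.choice(MOVES)
        x, y = (x + dx) % side, (y + dy) % side
        steps += 1
        visited.add((x, y))
        if t_partial is None and len(visited) >= partial_target:
            t_partial = steps
    return t_partial, steps


def first_passage_time(side, start, target, seed=0):
    """Return the number of steps a walker from `start` needs to reach `target`."""
    rng = random.Random(seed)
    x, y = start
    steps = 0
    while (x, y) != target:
        dx, dy = rng.choice(MOVES)
        x, y = (x + dx) % side, (y + dy) % side
        steps += 1
    return steps


t_p, t_f = cover_times(side=8, fraction=0.5, seed=3)
fpt = first_passage_time(side=8, start=(0, 0), target=(4, 4), seed=3)
print(f"partial cover time = {t_p}, full cover time = {t_f}, first passage time = {fpt}")
```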
Cover time problems have been studied [52,71], but few have been applied in other fields, one example being the grand tour, a statistical tool for describing multivariate data [4]. The well-known results on cover time problems are perhaps limited to 1D random walks, as in the cases of trapping problems [27,47] and first visit problems [17,21]. Both of these cover time problems are in 1D; if the dimension of the cover time problem is two or higher, the results can hardly be reduced to any well-known random walk problem. The algebra becomes quite involved as the number of dimensions increases, so computing the cover time analytically in two or more dimensions is complex [49]. This compels us to formulate our cover time problems numerically using computer simulations. Since the literature on applications of cover time problems in seed dispersal is lacking, this review shows that the 2D symmetric random walk can be as good as the intermittent walk in approximating some cover time problems. Both the 2D symmetric and intermittent walks are good candidates for determining the time taken by animal seed dispersers to encounter seeds and for estimating the time costs of foraging species [25], among other related applications.
The mean cover time of a 2D random walk, which measures the time taken by a dispersal agent to cover a given foraging plot, is discussed here. Comparing Figure 5(b,d) shows that a symmetric walker takes more steps than an intermittent walker to cover a given domain. Although an agent performing a symmetric walk may take longer to visit a given domain, its cover time distribution converges to that of most irregular random walks, such as the intermittent walk. To find the cover time distribution, $t_f$ is obtained from the random walks, and $E(T)$ is then determined either asymptotically or numerically from the formulation. Next, $t_f$ is rescaled as $t_f/E(T) - \log N$ [13], where $N$ is the lattice domain size. The distribution of the rescaled cover time for both the symmetric and intermittent random walks is then given by the Gumbel distribution [13,15], where $v$ is the number of unvisited sites. To assess the accuracy of the cover time distribution obtained from our symmetric random walk, we adopt the technique used by Chupeau et al. [13], in which Equation (15) served as a theoretical prediction for many irregular random walks. The distributions of the 2D symmetric and 2D intermittent walks approximate the Gumbel pattern for several lattice domain sizes, N = 64, 256, 400 (see Figure 6). Although the domain sizes vary, the distributions approximately collapse onto the theoretical prediction, Equation (15). This indicates how the cover time distribution depends on the global mean first passage time of the random walks, obtained on periodic domains. All the distributions presented in Figure 6 were obtained when a searcher explored all the sites of a given lattice (v = 0).
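The rescaling step can be illustrated with a small simulation. This sketch makes several assumptions that are not taken from the source: time is counted in steps of a symmetric walk on a periodic lattice, $E(T)$ is estimated by Monte Carlo as the mean first passage time to one target averaged over the other starting sites, "log" is read as the natural logarithm, and, because Equation (15) is not reproduced in this section, the comparison uses the standard Gumbel cumulative distribution $\exp(-e^{-x})$ for the full cover case (v = 0). All function names are hypothetical.

```python
import math
import random

MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def full_cover_time(side, rng):
    """Steps needed by a symmetric walker to visit every site of a periodic lattice."""
    x = y = 0
    visited = {(0, 0)}
    steps = 0
    while len(visited) < side * side:
        dx, dy = rng.choice(MOVES)
        x, y = (x + dx) % side, (y + dy) % side
        visited.add((x, y))
        steps += 1
    return steps


def estimate_gmfpt(side, rng, samples=2000):
    """Monte Carlo estimate of E(T) for side >= 2: mean first passage time to
    the origin, averaged over uniformly chosen starting sites other than the origin."""
    total_steps = 0
    for _ in range(samples):
        x, y = 0, 0
        while (x, y) == (0, 0):  # draw a starting site different from the target
            x, y = rng.randrange(side), rng.randrange(side)
        while (x, y) != (0, 0):
            dx, dy = rng.choice(MOVES)
            x, y = (x + dx) % side, (y + dy) % side
            total_steps += 1
    return total_steps / samples


def gumbel_cdf(x):
    """Standard Gumbel cumulative distribution, assumed here for v = 0."""
    return math.exp(-math.exp(-x))


def rescaled_cover_times(side, runs=500, seed=0):
    """Return sorted values of t_f / E(T) - log N over independent runs."""
    rng = random.Random(seed)
    n_sites = side * side
    mean_fpt = estimate_gmfpt(side, rng)
    return sorted(full_cover_time(side, rng) / mean_fpt - math.log(n_sites)
                  for _ in range(runs))


xs = rescaled_cover_times(side=8)  # N = 64
for x0 in (-1.0, 0.0, 1.0, 2.0):
    empirical = sum(x <= x0 for x in xs) / len(xs)
    print(f"x = {x0:+.1f}  empirical CDF = {empirical:.3f}  Gumbel CDF = {gumbel_cdf(x0):.3f}")
```

The printed comparison is only indicative: on a lattice as small as N = 64, and with a Monte Carlo estimate of $E(T)$, some deviation from the Gumbel curve is to be expected.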

Conclusion
Seed dispersal involves many complex systems, some of which are difficult to analyse with empirical or deterministic models alone. This review therefore described three stochastic models: the BDP and the 2D symmetric and intermittent random walks. An encouraging feature of these models is that they can all be formulated through the same Gillespie algorithm, with slight variations. The models have been applied to other real-world problems, but their applications in seed dispersal are few. To confront the challenges posed by the large amounts of data generated by advanced seed tracking facilities, this review described the models together with illustrations related to seed dispersal. The IBM for two competing species was taken as an application of the BDP, while the cover time problem was introduced to seed dispersal as an application of both the symmetric and intermittent random walks. We have seen that stochastic IBMs approximate their deterministic counterparts. In the spatial models, although an agent performing a 2D symmetric walk takes longer to visit a given domain, the cover time distribution it produces converges faster and approximates the distributions obtained from irregular random walks. We suggest that if these three models are modified to capture the features of a given seed dispersal (spatial) problem, many important predictions that address the challenges of seed tracking data may be made from the formulation. For example, the complex systems of the seed removal process and the agents' foraging behaviour in scatter-hoarding studies can be studied using the described models and illustrations.

Disclosure statement
No potential conflict of interest was reported by the authors.

Funding
This work is supported by the Geran Putra IPS, Project number: GP-IPs/2018/9657400.