Probability Distribution. Case study with a bacterial population
Let us ignore in this answer the odd feature that this model entails some negative populations of bacteria and let us start from the fact that the sum of $n$ i.i.d. normal random variables with mean $\mu$ and variance $\sigma^2$ has mean $n\mu$ and variance $n\sigma^2$. Hence, $E[X_{t+1}\mid X_t]=M\cdot X_t$ and, since $X_0=1$, $E[X_t]=M^t$ for every integer $t\geqslant0$.
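The same conditioning argument also gives the variance. As a sketch (this step is not in the original answer), use the conditional variance $\mathrm{var}(X_{t+1}\mid X_t)=\mathrm{SD}^2X_t$ implied by the model, together with the law of total variance:

$$
\begin{aligned}
\mathrm{var}(X_{t+1}) &= E[\mathrm{var}(X_{t+1}\mid X_t)]+\mathrm{var}(E[X_{t+1}\mid X_t])\\
&= \mathrm{SD}^2M^t+M^2\,\mathrm{var}(X_t).
\end{aligned}
$$

Starting from $\mathrm{var}(X_0)=0$, this recursion yields $\mathrm{var}(X_t)=\mathrm{SD}^2M^{t-1}\,\dfrac{M^t-1}{M-1}$ for $M\ne1$, which reduces to $\mathrm{SD}^2\cdot t$ when $M=1$.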
Note however that $X_t$ is far from being normal when $t\geqslant2$. To wit, recall that any normal random variable $\xi$ with mean $\mu$ and variance $\sigma^2$ is such that $E[\mathrm e^{z\xi}]=\mathrm e^{z\mu+z^2\sigma^2/2}$ for every complex number $z$. Here, $E[\mathrm e^{zX_{t+1}}\mid X_t]=\mathrm e^{zMX_t+z^2\mathrm{SD}^2X_t/2}$, that is $E[\mathrm e^{zX_{t+1}}]=E[\mathrm e^{u(z)X_{t}}]$, where $u$ denotes the function defined by $u(z)=Mz+\frac12\mathrm{SD}^2z^2$. In particular, $X_1$ is normal since $u(z)$ is polynomial with degree $2$ but $X_t$ is not normal for $t\geqslant2$ since $u^{\circ t}(z)$ is polynomial but with degree $2^t\gt2$.
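To make the degree argument concrete for $t=2$, here is a worked expansion (a sketch that uses only $X_0=1$ and the definition of $u$). Since $E[\mathrm e^{zX_2}]=\mathrm e^{u(u(z))}$,

$$
\begin{aligned}
u(u(z)) &= M\,u(z)+\tfrac12\mathrm{SD}^2\,u(z)^2\\
&= M^2z+\tfrac12\mathrm{SD}^2M(1+M)\,z^2+\tfrac12M\,\mathrm{SD}^4z^3+\tfrac18\mathrm{SD}^6z^4.
\end{aligned}
$$

The coefficient of $z$ recovers $E[X_2]=M^2$, and the $z^4$ term is nonzero as soon as $\mathrm{SD}\ne0$, so the exponent is not a quadratic polynomial in $z$ and $X_2$ cannot be normal.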
Likewise, $E[X_t]\ne M\cdot t$ in general and $\mathrm{var}(X_t)\ne \mathrm{SD}^2\cdot t$ in general.
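These two inequalities can be checked with a quick Monte Carlo experiment. The following is a minimal sketch, not a definitive implementation, written in Python with the standard library only. The function names and the values $M=1.5$, $\mathrm{SD}=0.2$, $t=8$ are illustrative assumptions. The population is treated as a real number with $X_{t+1}\mid X_t$ normal with mean $M X_t$ and variance $\mathrm{SD}^2X_t$, and it is clamped at zero so that the square root is defined. The closed-form variance used for comparison follows from the law of total variance and is not stated in the original answer.

```python
import random
import statistics


def simulate_population(t, M, SD, rng):
    """One run of X_0 = 1, X_{s+1} | X_s ~ Normal(M * X_s, SD^2 * X_s)."""
    x = 1.0
    for _ in range(t):
        # Assumption: clamp at 0 so the square root below is defined.
        # With the parameters used here this branch is practically never hit.
        x = max(x, 0.0)
        x = rng.gauss(M * x, SD * x ** 0.5)
    return x


def simulate_many(t, M, SD, runs=20000, seed=0):
    """Independent runs of the chain, with a fixed seed for reproducibility."""
    rng = random.Random(seed)
    return [simulate_population(t, M, SD, rng) for _ in range(runs)]


def closed_form_variance(t, M, SD):
    """var(X_t) obtained from the law of total variance (hypothetical helper)."""
    if M == 1:
        return SD ** 2 * t
    return SD ** 2 * M ** (t - 1) * (M ** t - 1) / (M - 1)


M, SD, t = 1.5, 0.2, 8  # illustrative values, not taken from the question
samples = simulate_many(t, M, SD)

print("simulated mean     :", statistics.fmean(samples))
print("M**t               :", M ** t)
print("M*t                :", M * t)
print("simulated variance :", statistics.variance(samples))
print("closed-form var    :", closed_form_variance(t, M, SD))
print("SD**2 * t          :", SD ** 2 * t)
```

With these parameters the simulated mean should land near $M^t$ rather than $M\cdot t$, and the simulated variance should be far from $\mathrm{SD}^2\cdot t$.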
Remi.b
Updated on August 01, 2022
Comments
-
Remi.b over 1 year
Let's imagine we start with a single bacterium. At each time step (generation), each bacterium has $x$ offspring and then dies (semelparous species), where $x$ is drawn from a normal distribution with mean $M$ and standard deviation $SD$.
Question 1:
What is the probability distribution of the number of bacteria (population size) after $t$ generations?
Question 2:
Same question, but assuming that nobody ever dies, so that after the very first reproductive event there are $n$ bacteria, where $n$ is drawn from a normal distribution with mean $M+1$ and standard deviation $SD$.
FastingGuy says the probability distribution is normal with mean $M\cdot t$ and standard deviation $\sqrt{t}\cdot SD$.
Below is a very simple R script showing that the population size is almost always lower when $SD$ is higher, so the mean should depend on $SD$. What am I misunderstanding? In my code I use a uniform distribution to avoid a crash, because all individuals reproduce equally at each generation. Is this the reason that a higher $SD$ yields a lower population size?
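    # Per-generation growth factors: narrow spread (a) and wide spread (b)
    a=c()
    for (i in 1:500){a[i]=runif(1,min=1.45,max=1.55)}
    b=c()
    for (i in 1:500){b[i]=runif(1,min=1,max=2)}

    # Population size over generations is the cumulative product of the factors
    plot(cumprod(a),log='y',xlab='generation',ylab='number of inds')
    points(cumprod(b),col='red')

    # Mean fecundity per generation
    a.fec=mean(a)
    b.fec=mean(b)

    # Mean growth over non-overlapping pairs of consecutive generations
    a.fit=c()
    b.fit=c()
    for (i in 1:250){
      a.fit[i]=a[i*2]*a[(i*2)-1]
      b.fit[i]=b[i*2]*b[(i*2)-1]
    }
    a.fit=mean(a.fit)
    b.fit=mean(b.fit)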
The y-axis is on a logarithmic scale!
Thank you.
-
Samrat Mukhopadhyay over 10 years
Seems to be a branching process.
-
Remi.b over 10 years
Does this mean that the mean of my new normal distribution $Z_n$ is $M\cdot t$ and the standard deviation is $SD\cdot t$? Is that correct?
-
NebulousReveal over 10 years
The standard deviation would be $\sqrt{t} \cdot SD$.
-
Remi.b over 10 years
And the mean is $M\cdot t$? If so, there's a result from a very simple simulation that I don't get. I'll add the R script to the post.
-
NebulousReveal over 10 years
Yes, it would be.
-
Did over 8 years
PEV: Do you really think the distribution of the sum of a random number of i.i.d. normal random variables is normal? Your answer is designed to make us believe that, as well as your comments, but this is wrong.