Norm statement considered harmful: comment on ‘evolution of unconditional dispersal in periodic environments’

ABSTRACT The mathematical symbol for the norm, which is heavily overloaded with multiple definitions that have both universal and specific properties, lends itself to confusion. This is manifest in the proof of an important theorem for population dynamics by Schreiber and Li on how dispersal increases population growth in a periodic environment. Here the theorem is placed in context, the proof is clarified, and the confusing but inconsequential errors corrected.

First, let us have some context for their theorem. Karlin [17] proved a deep theorem on the asymptotic growth rate of populations that combine (1) a Markov chain with (2) heterogeneous growth rates. His motivation for the theorem was to analyse the effect of dispersal in structured populations upon the protection of genetic diversity. In the theorem, α is the dispersal rate, P is the matrix of probabilities P_ij of moving from deme j to deme i given that an organism disperses, and M(α) := (1 − α)I + αP. D is the diagonal matrix of growth rates of the rare allele in each deme. The spectral radius r(M(α)D) represents the asymptotic growth rate of a rare allele in the metapopulation, and the increase in r(M(α)D) as α decreases means that reducing dispersal gives rare alleles greater protection against extinction.
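Karlin's monotonicity claim is easy to probe numerically. The following is a minimal sketch, in which the column-stochastic matrix P and the growth-rate matrix D are arbitrary illustrative choices (not taken from Karlin's paper); P is irreducible and D is not a scalar multiple of I, as the theorem requires:

```python
import numpy as np

def spectral_radius(A):
    """Largest absolute value among the eigenvalues of A."""
    return max(abs(np.linalg.eigvals(A)))

# Illustrative column-stochastic dispersal matrix P (entry P[i, j] is the
# probability of moving from deme j to deme i) and diagonal growth-rate
# matrix D -- arbitrary choices for demonstration only.
P = np.array([[0.9, 0.2],
              [0.1, 0.8]])
D = np.diag([2.0, 0.5])

I = np.eye(2)
alphas = np.linspace(0.0, 1.0, 11)
growth = [spectral_radius(((1 - a) * I + a * P) @ D) for a in alphas]

# Karlin's theorem: the asymptotic growth rate r(M(alpha) D) is strictly
# decreasing in the dispersal rate alpha.
assert all(g1 > g2 for g1, g2 in zip(growth, growth[1:]))
```

At α = 0 the growth rate is r(D) = 2, and it declines monotonically as α increases, in accordance with the theorem.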
The generality of his theorem allows it to be employed on a seemingly unrelated problem - the evolution of information transmission in organisms - to prove that evolution would favour reductions in migration rates, mutation rates, recombination rates, and even rates of cultural change [1,4], producing a unification of the 'Reduction Principle' found for specific models by Feldman and coworkers (cf. [10,11]). Karlin's theorem and its application to the evolution of dispersal rates were independently rediscovered by Kirkland, Li and Schreiber [18]. The theorem's extension to linear operators on Banach spaces is provided in [3], which unifies the result that 'the slower diffuser wins' found in a number of reaction-diffusion models for the evolution of dispersal […].
Empirically, it is clear that dispersal, mutation, and recombination rates have not evolved to zero in organisms, so one is directed to look for mathematical sources of departure from the reduction principle. The characterization of these conditions remains largely an open question. Just as the reduction principle appears in many contexts, we are seeing new situations in which departures from reduction hold, for example, for linear stochastic differential equations [9] (S. Schreiber, personal communication).
For linear systems, departure from reduction means that the spectral radius increases with the mixing rate α. One such departure occurs when the variation in the stochastic matrix has the general form M(α) = B[(1 − α)I + αP], where B and P are specially related stochastic matrices representing multiple transformation processes [2].
Schreiber and Li [26] prove a general result for another source of departure: temporally changing environments, where the growth rate matrix D alternates every time step with its inverse D −1 . They generalize, to arbitrary n × n transition matrices of reversible Markov chains, the behaviour found for 2 × 2 matrices in [6,16,19,20,Result 2], and 4 × 4 in [25]. To prove this theorem, they provide another theorem in which norm notation enters:
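Their departure can be illustrated numerically for a symmetric (hence reversible) P. The sketch below uses arbitrary illustrative values of P and D; it checks that the two-step growth rate r(M(α)D M(α)D⁻¹) never falls below its no-dispersal value r(D D⁻¹) = 1, so that dispersal can only help in this alternating environment:

```python
import numpy as np

def spectral_radius(A):
    return max(abs(np.linalg.eigvals(A)))

# Symmetric stochastic P gives a reversible Markov chain; the growth-rate
# matrix D alternates each time step with its inverse.  Both matrices are
# arbitrary illustrative choices.
P = np.array([[0.7, 0.3],
              [0.3, 0.7]])
D = np.diag([2.0, 0.5])
D_inv = np.linalg.inv(D)
I = np.eye(2)

def two_step_growth(a):
    M = (1 - a) * I + a * P
    return spectral_radius(M @ D @ M @ D_inv)

rates = [two_step_growth(a) for a in np.linspace(0.0, 1.0, 11)]

# Without dispersal (alpha = 0) the two-step growth rate is r(D D^{-1}) = 1;
# with dispersal it is >= 1 in this reversible setting.
assert abs(rates[0] - 1.0) < 1e-12
assert all(r >= 1.0 - 1e-12 for r in rates)
```

The lower bound follows because, for symmetric M(α), the product M(α)D M(α)D⁻¹ is similar to B⊤B with B = D^(1/2)M(α)D^(−1/2), whose spectral norm is at least the spectral radius of M(α), which is 1.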

Theorem 1.3 ([26, Appendix 2]): Denote by ‖A‖ the operator norm of the matrix A.
Suppose A ∈ M_n is non-zero and satisfies ‖I + A‖ ≥ 1. Then ‖I + tA‖ ≥ ‖I + A‖ for all t ≥ 1.
The term 'operator norm' is polymorphic in the literature. In some uses (e.g. [23]) 'operator norm' is synonymous with the spectral norm, ‖A‖₂ := r(A*A)^(1/2), where A* is the conjugate transpose of A, and r(·) is the spectral radius.
Others use 'operator norm' more generally as the norm of a matrix induced by a chosen vector norm ‖·‖ on C^n (e.g. [13,21]): ‖A‖ := max_{‖x‖=1} ‖Ax‖.
Throughout the main text of Schreiber and Li [26] and in Theorem 1.2, the vector norm used is ‖x‖ = Σ_{i=1}^m x_i for x_i ≥ 0. However, in Theorem 1.3, used to prove Theorem 1.2, we find this inequality:
‖I + A‖ = ‖(I + A)u‖ = |1 + α|² + |β|² ≥ 1,    (2)
where u and v are vectors such that ‖u‖ = ‖v‖ = 1, ‖I + A‖ = ‖(I + A)u‖, Au = αu + βv, and {u, v} is an orthonormal family. Two things are clear from Equation (2): (1) The vector norm is no longer ‖x‖ = Σ_{i=1}^m x_i; the appeal to orthonormality shows it to be the Euclidean norm ‖x‖₂ = (Σ_i |x_i|²)^(1/2). (2) Squares are missing, and Equation (2) should read ‖I + A‖² = ‖(I + A)u‖² = |1 + α|² + |β|² ≥ 1. The squares were also dropped from the next sequence of calculations, which should read ‖I + tA‖² ≥ ‖(I + tA)u‖² = |1 + tα|² + |tβ|² = · · · ≥ ‖I + A‖².
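The corrected Pythagorean step is easy to verify directly: with {u, v} orthonormal under the Euclidean norm and Au = αu + βv, we have (I + tA)u = (1 + tα)u + tβv, so ‖(I + tA)u‖² = |1 + tα|² + |tβ|². A minimal numerical sketch, in which the values of α, β and the second column of A are arbitrary illustrative choices:

```python
import numpy as np

# Orthonormal pair {u, v} in R^2 and a matrix A with Au = alpha*u + beta*v.
u = np.array([1.0, 0.0])
v = np.array([0.0, 1.0])
alpha, beta = 0.4, -0.7
A = np.array([[alpha, 0.3],   # first column is Au = alpha*u + beta*v;
              [beta,  0.2]])  # second column is arbitrary

for t in (1.0, 1.5, 2.0, 5.0):
    lhs = np.linalg.norm((np.eye(2) + t * A) @ u) ** 2   # ||(I + tA)u||^2
    rhs = (1 + t * alpha) ** 2 + (t * beta) ** 2         # |1+t*alpha|^2 + |t*beta|^2
    assert abs(lhs - rhs) < 1e-12
```

Without the squares the two sides are not equal in general, which is exactly the slip the comment identifies.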
Fortunately, the final inequality ‖I + tA‖² ≥ ‖I + A‖² remains true when the squares are dropped, so the errors are inconsequential to the proof.
Norm notation allows Theorem 1.3 to be stated quite simply, but non-transparently, and the polymorphism in the literature for the usage of 'operator norm' makes us unsure of exactly which vector norm is being used (it is, however, hard to imagine a vector norm that would provide a counterexample to Theorem 1.3). The theorem could have been expressed by unpacking the spectral norm explicitly as r((I + tA)(I + tA)*) ≥ r((I + A)(I + A)*) for all t ≥ 1, which leaves no room for ambiguity or error, and segues directly into Schreiber and Li's sagacious decomposition (p. 134), r(F(t)) = r(D^(−1/2) F(t) D^(1/2)) = r(B(t)B(t)*).
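The unpacked statement can be checked numerically. The sketch below uses an arbitrary illustrative matrix A whose choice satisfies the hypothesis ‖I + A‖ ≥ 1, and verifies r((I + tA)(I + tA)*) ≥ r((I + A)(I + A)*) for several t ≥ 1:

```python
import numpy as np

def spectral_radius(M):
    return max(abs(np.linalg.eigvals(M)))

# Arbitrary illustrative non-zero A; for this choice ||I + A|| >= 1 holds.
A = np.array([[0.5, 0.2],
              [0.1, -0.3]])
I = np.eye(2)

# r((I + A)(I + A)*) is the squared spectral norm of I + A.
base = spectral_radius((I + A) @ (I + A).conj().T)
assert base >= 1.0  # hypothesis of Theorem 1.3: ||I + A|| >= 1

for t in (1.0, 2.0, 3.0, 10.0):
    val = spectral_radius((I + t * A) @ (I + t * A).conj().T)
    assert val >= base - 1e-12   # conclusion of Theorem 1.3, squared form
```

Because the spectral radius of (I + tA)(I + tA)* is exactly ‖I + tA‖₂², this is the same inequality as the theorem's, with no norm left implicit.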
Norms are of course fundamental to mathematical analysis. The 'information hiding' achieved by ‖·‖ makes norm notation concise, but both the reader - and the writer - may benefit from always explicitly stating its content.

Disclosure statement
No potential conflict of interest was reported by the author.

Funding information
This work was supported by the Konrad Lorenz Institute for Evolution and Cognition Research, Klosterneuburg, Austria, and the Mathematical Biosciences Institute at The Ohio State University, USA, through National Science Foundation Award #DMS 0931642.