A study of lactation curves in dairy cattle using the optimal design of experiments methodology

Abstract In this paper, we have worked with some of the most popular models to fit the lactation curve. Lactation curves have received considerable attention in modelling dairy cattle milk production. These curves express the evolution over time of a cattle head’s milk yield. Observations of milk production over a cycle have usually been carried out periodically (i.e. daily, weekly or monthly). Optimal experiment design allows one to know the times when a herd’s milk production must be recorded. An experiment design is optimal when it achieves the best parameter estimates for the curve model. This work aims to study optimal designs for certain models such as quadratic polynomial, hyperbolic, exponentials and Wilmink models (WMs). Designs are observed to depend on models and must be compared to get to know if a design for a specific model has a good behaviour with another model. Design efficiency is the tool used for this comparison. Efficiency is a value between 0 and 1 – a design is good if its efficiency is close to 1. The WM designs perform well and are an improvement on those designs that are routinely used in daily observations. Highlights Optimal experiment design allows one to know the times when a herd‘s milk production must be recorded. The use of optimal design methodology greatly reduces the number of instances when dairy production has to be recorded. Optimal designs reduce the costs of experimentation by allowing parameters to be estimated with fewer experimental runs.


Introduction
A lactation curve represents the evolution over time of a herd's milk production during a specific lactation cycle. This cycle is the period from lactation onset after calving until the cow's milk dries up. Lactation curves generally reach their peak yield after calving and then decrease steadily from then until drying up (Figure 1).
Lactation curves allow the evaluation of important milk production characteristics such as maximum production and times to maximum production (Gipson and Grossman, 1990) or persistency (Gengler, 1996). These curves can help in herd management, particularly in assessing cattle heads' nutritional and health status (Dudouet, 1982). These curves are also useful in predicting a cattle head's total milk yield when only early lactation cycle observations are known.
Multiple statistical models have been used to describe lactation curves. Quintero et al. (2007), Cankaya et al. (2011), Macciotta et al. (2011), Graesbøll et al. (2016), Hossein-Zadeh (2016) and Melzer et al. (2017) reported on different lactation curve models, which are compared for dairy cattle. These papers show some models that can generally be expressed as ð Þ; where Y stands for milk yield at week or day t of lactation, and time t is on a time interval T. Here b is a vector which contains the unknown model parameters to be estimated from data. Function gðt; bÞ can be either linear or nonlinear on b. Some of the models used to fit lactation curve data are: linear where parameter b 0 is associated with the level of production, b 1 with the production decrease after peak yield, b 2 with the production increase prior to peak yield, and k is a parameter associated with the time of peak yield. Cankaya et al. (2011) characterised peak yield and the maximum production time. In Wilmink (1987) and many subsequent papers, parameter k is fixed at 0:05. Nevertheless, Torshizi et al. (2011) used values of 0:05, 0:065; 0:1, and 0:61 for k. The WM has three parameters when k is fixed. Olori et al. (1999) concluded that the WM is the best three-parameter model for predicting mean herd yield. If k is an unknown parameter, the WM has four parameters, k being a nonlinear parameter and is called the Wilmink extended model (WEM).
Some papers dealing with lactation models researched the use of several models to estimate the data and compared them using the coefficient of determination. Different conclusions were drawn regarding the best model for lactation period modelling. Figure 2 shows the plot of some of these functions.
The WM is a modification of the Cobby-Le Du model (see Cobby and Le Du, 1978). This model includes particularity that post-peak milk yield is modelled as a linear decline function and combines exponential and linear models. Both models are related to a model proposed by Wood (1967), which is expressed as Y ¼ b 0 t b 1 e Àb 2 t . The Wood model is widely used for lactation curve descriptions in dairy cattle and lactation predictions in cows and goats. Nonetheless, the results from the Wood model are improved by the WM.  To estimate these models' parameters, the researcher requires certain observations; namely, a sample of t and Y values. The optimal design of experiments methodology can be applied because it aims to find the best t values where variable Y is to be observed (i.e. the days or weeks when milk production must be registered). In general, model estimation includes several steps in which the researcher can control the independent variable t. Firstly, given a theoretical model, one attempts to find 'optimum' values for the design variable, t, in the range of experimental interest, T. This search is carried out with respect to an optimality criterion which is chosen according to the final analysis goalparameter estimation or hypothesis testing. Secondly, variable Y is observed in the optimum t allocations given by the optimum design. Finally, the model is estimated using these observations.
Optimal designs have been applied to a wide range of fields such as biomedicine (Campos-Barreiro and L opez-Fidalgo, 2013), pharmacokinetics and pharmacodynamics (Kitsos and Kolovos, 2013), and agricultural studies (Mart ınez-L opez et al., 2009). This paper researches optimal experimental designs for lactation curves. However, these designs depend on the particular model and the optimality criterion. Therefore, several lactation models are considered in the paper. With the objective of estimating the model parameters, a commonly adopted criterion is D-optimality, which obtains D-optimal designs. By using such designs, the best estimates for a model's unknown coefficients (parameters) are achieved.

Material and methods
The statistical regression model is expressed as: ð Þþ e; t 2 T; where Y is the response variable, t is the independent variable, and b is a vector of p unknown parameters that we have to estimate from the observations. The e term stands for the random errors with zero mean and constant variance.

Some definitions of optimal experimental design theory
An experimental design n is a collection of t values where Y is to be observed. The design is made-up of n different support points t 1 , t 2 , . . ., t n , and their respective weights w 1 , w 2 , . . ., w n (relative frequencies, w 1 þ w 2 þ . . . þ w n ¼ 1). A weight, w i , is the proportion of the total observations taken at time t i , and this can be considered as a probability measure (Kiefer, 1974). If we want N observations to estimate the model, then we round up the Nw i values to know the number of observations at points t i .
An optimal design, n Ã , includes the best points and weights with respect to an optimality criterion. Of the different criteria in the literature, we are going to consider the D-optimality criterion because it relates to the estimation quality of the model parameters.
The advantage of using optimal designs is that we control the times when the observations are taken, leading to the best possible parameter estimation. The accuracy of an estimator is measured by its variance, and this is part of a matrix known as the information matrix. This matrix is the main tool when looking for optimal designs.
When the function g t; b ð Þ is differentiable with a continuous derivative for every parameter, b i , with rgðt; bÞ as its gradient vector, the information matrix is defined by M n; b ð Þ¼ P n i¼1 w i rgðt i ; bÞrgðt i ; bÞ T : The inverse of the information matrix is asymptotically proportional to the covariance matrix of the parameter estimators. For this reason, many optimality criteria are formulated in terms of this matrix; for example, the D-optimality criterion is related to the determinant of the information matrix. A D-optimal design maximises this determinant, which is equivalent to minimising that of the covariance matrix, so D-optimal designs minimise the volume of the confidence region of the regression parameter vector, b.
The General Equivalence Theorem is a useful tool to check whether a design is optimal. This theorem states that a design n Ã is D-optimal if and only if d t; for all values of t in T, with p being the number of model parameters. Furthermore, the maxima of dðt; n Ã ; bÞ occur at the n Ã support points. Function dðt; n Ã ; bÞ is called the variance function.
For non-linear models, the variance function and the information matrix depend on unknown parameters. The simplest way to deal with this problem is by using a single prior guess for the unknown parameters, obtaining locally optimal designs. Such knowledge is often available from preliminary studies. Optimal designs will depend on this initial information.

Results
The results of this study are the optimal experimental designs for the Wilmink lactation curves. These designs contain the times when observations must to be recorded. The D-optimality criterion is used. The optimal designs are presented in several tables.
We will consider the general design space T ¼ ½a; b, 0 < a < b < þ1, and two cases depending on parameter k. If k is a known value, we employ the usual WM, which has three parameters If k is an unknown parameter, we employ the WEM, which has Consequently, the optimal designs differ in each case.

Wilmink model
The D-optimal design, n Ã , for WM in the interval For the proof of the above results, we have taken into account the gradient vector rg t; b ð Þ ¼ ð1; t; e Àkt Þ T and the results in Karlin and Studden (1966). In this case the information matrix and the variance function do not depend on unknown parameters, so they can be denoted by dðt; nÞ and MðnÞ, respectively. Studying the maxima and minima of dðt; nÞ, and applying the General Equivalence Theorem, we conclude that t 1 ¼ a, t 3 ¼ b and t 2 is an interior point in ½a; b, which is the solution of ojMðnÞj=ot 2 ¼ 0.
We can observe that interior support point t 2 tends to the midpoint of the interval ða þ bÞ=2; when k ! 0 and t 2 tends to a; when k ! þ1.
According to Wilmink (1987), the lactation cycle length was 305 days, making T ¼ ½1; 305. Consequently, the observations must to be recorded at days t 1 ¼ 1, t 2 ¼ À 1 k log e Àk Àe À305 k 304 k and t 3 ¼ 305. For k ¼ 0:05, considered in Wilmink (1987), the optimal value of t 2 is 55:426. Nevertheless, in the Wilmink work milk production was observed on each of the 305 days; that is, a uniform design with 305 support points t 1 ¼ 1; t 2 ¼ 2; . . . ; t 305 ¼ 305 was considered; this will be denoted by n 305 .
We can compare designs n 305 and n Ã through the determinant of the information matrix: jMðn 305 Þj ¼ 138:18 and jMðn Ã Þj ¼ 1766:25; the second determinant being greater than the first because it corresponds to the D-optimal design. Applying the D-optimality criterion, design n Ã is better than design n 305 because its determinant is greater as is the meaningful information. Therefore, the estimated parameters will be more precise if we use design n Ã .
A measurement of a given design's behaviour, n, relative to the D-optimal design, n Ã , is the D-efficiency, which is calculated as: where p is the number of unknown parameters. This value is within ½0; 1 and Eff D n ð Þ ¼ 0:5 meaning that design n needs to double the total number of observations to perform as well as optimal design n Ã . The D-efficiency of n 305 is only 0.43, which is a very low value. Torshizi et al. (2011) considered other values for k; namely, 0:065, 0:1 and 0:61. For these values, the Doptimal designs in T ¼ ½1; 305 are equally supported at three pointsthese are the ends of this interval and the points 46.9, 35.14, and 9.56, respectively.
Since D-optimal design depends on the value of k, we need to analyse whether the D-optimal design for a given k value behaves well for another value. Table 1 reports the D-efficiencies of the optimal design for k ¼ k 2 when it is used to estimate another model with a different value, k ¼ k 1 . For this, we considered the values of k in Torshizi et al. (2011). For instance, the value 0:994 in Table 1 implies that the D-optimal design for k 2 ¼ 0:065 behaves very well in estimating the WM with k 1 ¼ 0:05. However, the Doptimal design for k ¼ 0:61 does not behave well in estimating the model with k 1 ¼ 0:05, because its Defficiency is 0:544. Table 1 also includes the efficiencies of daily design n 305 . We can see that the efficiencies for the various D-optimal designs are higher than for the daily design used in Wilmink (1987).

Wilmink extended model
Here, we consider that k is another unknown parameter. The WEM is nonlinear with four parameters b ¼ ðb 0 ; b 1 ; b 2 ; kÞ T . The information matrix depends on parameter k and is denoted by M n; k ð Þ. Locally Doptimal designs will be obtained because there is a need for prior information regarding k; for example, from previous experiments, this will be denoted by k 0 . Efficiency is the behaviour of D-optimal designs for Wilmink models (WM) with k 2 when used to estimate WM with true value k 1 .
The locally D-optimal design n Ã for WEM in T ¼ ½a; b, 0 < a < b < þ1, and k ¼ k 0 is equally supported at exactly four points. Two of the support points are the boundary points of the interval (t 1 ¼ a and t 4 ¼ b). For the other points, t 2 and t 3 , we do not have an explicit expression but they are the solutions to the equations of oM n; k 0 ð Þ=ot 2 ¼ 0 and oM n; k 0 ð Þ=ot 3 ¼ 0. Table 2 reports the support points of locally D-optimal designs in ½1; 305 for WEM and the values of k in Torshizi et al. (2011). Table 3 compares the behaviour of the different locally D-optimal designs in Table 2. Efficiencies for daily design n 305 are included. In this case, efficiencies are strongly influenced by parameter k. Significantly lower efficiency is obtained when k 2 differs from the true k 1 value. As in WM, the daily design efficiencies are very small.
In Table 4, we include the efficiencies of the locally D-optimal designs for WEM when they are used to estimate the WM. We can use a four-point optimal design for WEM to estimate a three-parameter WM but it is not possible to use an optimal design for a WM (three points) to estimate a WEM (four parameters).

Discussion
A variety of different mathematical models have been used to research the lactation curve shape. Working with these models, a particular problem is collecting data to estimate the parameters as precisely as possible. To do this, it is necessary to design the experiment. There is relevant literature regarding D-optimal designs for regression models. We then set the results for those models applied to lactationnamely, the polynomial, hyperbolic, inverse quadratic and exponential models. Firstly, we offer the theoretical insight and, after that, the numerical results will be included in Table 5. Depending on the model, we will have to record the observations at different times, so we compare the behaviour of a model-specific D-optimal design when used to estimate another model. Finally, a web application is presented to obtain the optimal designs for this research.

Polynomial model
According to Karlin and Studden (1966), the D-optimal design for the polynomial model of degree r, on interval ½À1; 1, is equally supported at r þ 1 points, which are the roots of the polynomial ð1 À t 2 ÞL' r ðtÞ, where L' r ðtÞ is the derivative of the Legendre polynomial of degree r. If z i are the support points in ½À1; 1, then are the support points on interval ½1; b. Both ends of the design space are always support points.   D-optimal designs are equally supported at t 1 , t 2 , t 3 , and t 4 . WEM: Wilmink extended model. Lactation models namely, linear (LM), exponential (EM), hyperbolic (HM), inverse quadratic (IM), Wood (WDM), exponential with constant term (EMC), quadratic (QM), Wilmink (WM), and Wilmink extended (WEM), in ½1; 14. With best guesses from Landete-Castillejos and Gallego (2000):

Hyperbolic model
The D-optimal design in ½a; b, with 0 < a < b, is equally supported at t 1 ¼ a, t 2 ¼ ffiffiffiffiffi ffi ab p and t 3 ¼ b. In particular, the D-optimal design is equally supported at 1;

Inverse quadratic polynomial model
From Haines (1992), the locally D-optimal design is equally supported at t 1 ¼ s=q, t 2 ¼ s and t 3 ¼ sq, If t 1 or t 3 are not included in the design interval, the support point is then fixed at the end of the interval and the other support points are calculated as shown in Haines (1992). Han and Chaloner (2003) obtained the locally D-optimal design in interval ½1; b is equally supported at two points:

Exponential model
Exponential model with constant term Han and Chaloner (2003) also calculated the locally Doptimal design for the exponential model with a constant term in ½1; b. For b 2 ¼ b 0 2 it is equally supported at three points t 1 ¼ 1, and t 3 ¼ b. Table 5 reports the optimal designs for the above models in the interval T ¼ ½1; 14 considered by Landete-Castillejos and Gallego (2000). This design space corresponds to monthly observations of lactation yield throughout a cycle. All designs in Table 5 are equally supported. Some of the designs are locally optimal designs so they are obtained for initial parameter values found in Landete- Castillejos and Gallego (2000). For example, EM (1) is the locally D-optimal design for the exponential model with b 0 2 ¼ 0:03307. IM (2) is the locally D-optimal design for the inverse quadratic polynomial model with b 0 0 ¼ 0:0002865, b 0 1 ¼ 0:0002117 and b 0 2 ¼ 0:0000301. Table 5 also includes the locally D-optimal design for WEM with k ¼ 0:05 and the locally D-optimal design for the Wood model (WDM) with b 0 1 ¼ 0:33269 and b 0 2 ¼ 0:09319. Table 6 summarises the performances of the optimal designs in Table 5 in terms of their efficiencies. We must emphasise the good performance of some of these designs for different lactation curves, with efficiencies close to 1, and the low efficiency always shown by the monthly design n 14 .
We have created a Web application with the R Shiny software. This application allows one to obtain the optimal design for any of the models discussed, and for an interval T ¼ ½1; b. We can estimate and draw the models for our experimental data. The Web application is available at https://imor.shinyapps. io/lactancy/

Conclusions
Milk yield over the course of the lactation cycle follows a curvilinear pattern, so a suitable function is required to model this curve. Eight functions (namely: linear, exponential, hyperbolic, inverse quadratic, exponential with constant term, quadratic, Wilmink and Wilmink extended) are investigated to model the lactation curve. The optimal designs for these models are studied in this paper.
The use of optimal design methodology greatly reduces the number of instances when dairy production has to be recorded. Consequently, milk production does not have to be observed every day or weekly. To obtain the best parameter estimates, one only has to study the days proposed by the optimal design. Optimal designs reduce the costs of experimentation by allowing parameters to be estimated with fewer experimental runs.
Efficiency was used to compare optimal design performance. This method allows one to compare models under different conditions. Table 5 shows two groups within the 3-support-point D-optimal designs. The hyperbolic, the inverse quadratic polynomial and the  Wood models show a similar behaviour and high efficiency. Another group includes the exponential model with constant term, the quadratic polynomial model, and the WM, which likewise share similar characteristics.
The results of this work support the idea that the WM provides good performance. Its efficiencies are high for the cases considered. It greatly improves the design with routinely used with daily observations. The main disadvantage of the WM is the importance of having an accurate parameter value for the exponent. Inappropriate specifications for this parameter can lead to low efficiencies.