Effects of β- and κ-casein, and β-lactoglobulin single and composite genotypes on milk composition and milk coagulation properties of Italian Holsteins assessed by FT-MIR

Abstract The present study aimed to investigate the effect of β- and κ- casein, and β-lactoglobulin genotypes on milk composition and milk coagulation properties predicted using Fourier-transform mid-infrared spectroscopy (FT-MIR). The final dataset provided 74,721 observations from 5316 Holstein–Friesian cows reared in 122 herds in the Veneto region. Individual cow milk composition was obtained from Dairy Herd Improvement (DHI) testing between the years 2012 and 2021. Genotypes were analysed through customised bovine genotyping chip. The effect of both separate and combined milk protein genotype on major milk components, SCS, urea, and predicted milk coagulation properties, were investigated applying a linear mixed model. β-lactoglobulin BB resulted in the higher fat content, whereas protein and casein content was mostly enhanced by the k-casein BB haplotype. The urea content was significantly lower in β-lactoglobulin AA and k-casein EE genotypes and SCS was significantly lower in β-casein A1A1, β-lactoglobulin AA, and k-casein BB genotypes. Milk coagulation properties performance was significantly improved by β-casein A1A1, β-LG BB and k-casein BB, both as single variants and as composite genotype. The effect of genotypes on milk composition and milk coagulation properties showed five and four principal hierarchical clusters, respectively. Clustering highlighted unexpected relationships among genotypes in terms of final milk quality, suggesting it could be possible to select favourable genotypes maintaining sufficient diversity. HIGHLIGHTS Milk protein variants have an impact on milk technological properties, but phenotype collection is laborious and expensive. Data collected using mid-infra-red spectroscopy can be used to estimate milk coagulation properties and perform population studies. Milk protein variants revealed unexpected relationships with milk composition and technological traits.


Introduction
Among milk proteins, caseins represent the main component of the entire milk cheese-making process (Mariani and Pecorari 1991). These are a family of phosphoproteins, synthesised in the mammary gland, responsible for many of the technological properties of milk. b-casein and j-casein (b-CN and j-CN, respectively) differ in the amino acid sequences and the distribution of the electrical charges, given both by the different amino acids and the phosphorylation on serine and threonine residues (Walstra and Jenness 1984;Rollema 1992;Holt et al. 2013). b-CN is the most hydrophobic casein that represents 34% of total caseins, and it is constituted by a polypeptide chain of 209 amino acids (Holt et al. 2013). j-CN represents about 13% of caseins, and it is composed of a polypeptide chain of 169 amino acids of which valine represents the carboxy-terminus (Holt et al. 2013).
Main whey proteins include b-lactoglobulin (b-LG), synthesised within the udder; moreover, in smaller quantities, there are also serum albumin, immunoglobulins, proteose-peptones, lactoferrin, and transferrin (Kuzdzal-Savoie 1980;Davies et al. 1983;Jenness 1985). b-LG is a globular protein formed by a peptide chain of 162 amino acids and is the most represented among ruminant whey proteins (P erez and Calvo 1995). The milk caseins b and j, and also the whey proteins as b-LG have several genetic polymorphisms (Farrell et al. 2004). The protein variants, indeed, are distinguished from each other by the substitution or deletion of amino acids within the polypeptide chain (Martin et al. 1999).
Over the years, the interest in the genetic polymorphisms of milk proteins mainly concerned the influence that these can exert on the characteristics of milk and on human health (Caroli et al. 2009;Massella et al. 2017). In addition rennet clotting time (RCT), curd firming rate (k 20 ), and curd firmness (a 30 ) are affected by milk protein composition (Wedholm et al. 2006;Penasa et al. 2010). Several studies investigated the effects of milk protein polymorphisms on MCP (Buchberger and Dovc 2000) and protein composition (Heck et al. 2009) ; indeed and significant association between allele B of j-CN and enhanced MCP has been reported (Lodes et al. 1996;Walsh, Guinee, Reville, et al. 1998;Buchberger and Dovc 2000;Hall en et al. 2008;Heck et al. 2009). The genetic variant B of b-CN improves the total content of casein in milk, promotes the formation of smaller casein micelles, and improves the coagulation properties of milk and cheese yield (Poulsen et al. 2013). The study of these polymorphisms turns out to be a valid practical application to improve the efficiency of dairy production, considering it could be possible to carry out animal selection for a specific variant within a protein. Nevertheless, it is not fully understood whether the interaction among protein genotypes and MCP is only determined by the protein fraction content or also by different three-dimensional conformation and chemical properties of protein variants. Moreover, cheese-making properties of individual cow milk are influenced by different factors such as herd management, parity, stage of lactation, and season, raising difficulties in disentangling co-variance of such effects with milk protein polymorphisms (Wedholm et al. 2006;Jensen et al. 2012;Penasa et al. 2016).
The common limitation of previous studies dealing with the relationship between milk protein genotypes and MCP was the limited number of observations for each genotype, due to the labor-intensive collection of MCP phenotypes. For the same reason, environmental factors such as the herd effect or seasonality were difficult to be assessed. The present work took advantage of the possibility to predict milk coagulation properties from milk mid-infra-red spectra within the frame of Dairy Herd Improvement programs (Penasa et al. 2015;Tiezzi et al. 2015). Therefore, the aim of the present study was to estimate the effects of b-CN, j-CN, and b-LG single and composite genotype on milk quality traits and MCP of Italian Holstein-Friesians cows, using mid-infra-red spectroscopy (MIRS) data and chemometrics predictions.

Data origin
Information on individual cow milk samples from Dairy Herd Improvement (DHI) testing was provided by the Breeders Association of Veneto Region (Vicenza, Italy). Information about cow genotype was provided by the National Association of Friesian, Brown, and Jersey Breeders (Cremona, Italy). The genotypes of milk protein polymorphisms were carried out through Custom Bovine SNP50K Bead-Chip (Illumina Inc., USA). All sampling activities were carried on within routinary DHI activities by Breeders Association of Veneto Region (Vicenza, Italy) and Italian Holstein, Brown and Jersey Association (Cremona, Italy) in full compliance with current legislation, in particular referring to animal welfare.
Samples were collected in the Veneto region between 2012 and 2021. The distribution of sampled cows over Veneto provinces is reported in Figure 1. Farms in the Veneto region are characterised by medium herd size (43 col), added with 20 ml of preservative (Bronysolv; ANA.LI.TIK Austria, Vienna, Austria), were analysed in the laboratory of the Breeders Association of Veneto Region (Padova, Italy) according to guidelines of International Committee for Animal Recording (ICAR) for milk quality analyses. Somatic cell count (SCC) was assessed by Fossomatic (FOSS Electric A/S, Hillerød, Denmark) and transformed to somatic cell score (SCS) through the equation SCS ¼ log 2 (SCC/ 100) þ 3 (Wiggans and Shook 1987). Milk fat, protein, and casein percentage, milk urea (mg/dL), and predicted milk coagulation properties were assessed using different versions of Fourier Transformed -Mid Infra-red MilkoScan instruments (Foss, Hillerød, Denmark). According to the manufacturer instructions and ICAR committee, MilkoScan instruments were standardised to ensure the reliability of predicted traits (Feudale et al. 2002;Juhl 2017). Predictive equations for milk fat, protein, casein and lactose percentage, urea, and coagulation properties were purchased from the instrument producer and continuously updated at the newest available version.

Data editing
The original dataset (n ¼ 132,835), including observations from 7783 genotyped animals, was edited to retain records from purebred HF, between 5 and 305 days in milk (DIM) reared in the same herd during the complete period of the study (2012 to 2021). Contemporary groups were defined as cows tested in the same herd and date (herd-test-date, HTD) and HTD with less than 3 animals, and cows with less than 3 observations within lactation were removed. Observation traits deviating more than 3 SD from the overall mean of the trait were set as missing. After editing, 74,721 observations from 5316 cows and 122 herds were available for further analyses. Parity and DIM averaged 1.96 ± 1.77 and 148.15 ± 84.47 days, respectively. Herd size ranged from 6 to 198 cows.

Statistical analysis
Milk composition, SCS, urea, and predicted milk coagulation properties were analysed through a linear mixed model using PROC HPMIXED of SAS ver. 9.4 (SAS Institute Inc., Cary, NC). The model was: where y ijklm is the dependent variable; m is the overall intercept of the model; S i is the fixed effect of the ith class of stage of lactation (l ¼ 1 to 30, being all classes of 10 DIM from 6 to 305 DIM); P j is the fixed effect of the jth parity of the cow (j ¼ 1 or 2, with 1 being primiparous cows and 2 being multiparous cows); b-CN k is the fixed effect for cow b-CN genotype (k ¼ A1A1, A1A2, A2A2); j-CN l is the fixed effect for cow j-CN genotype (l ¼ AA, AB, AE, BB, BE, EE); b-LG m is the fixed effect for cow b-LG genotype (m ¼ AA, AB, BB); (S Â P) ij is the fixed interaction effect between stage of lactation and parity; HTD n is the random effect of the nth contemporary group is the random effect of the oth cow nested within the composite genotype (o ¼ 1-5316) $ n(0,r 2 C(G) ), where r 2 C(G) is the cow variance tested in within composite genotype variance; and e ijklmno is the random residual $ N(0,r 2 e ), where r 2 e is the residual variance. Moreover, the combined effect of genotypes was investigated separately substituting the fixed effect of single genotypes, that is, b-CN k , j-CN l , and b-LG m , with G p fixed effect for composite genotype (p ¼ 1-40). Composite genotypes with less than 150 observations or present in less than five cows were excluded. Multiple comparisons of least squares means were performed using Bonferroni's adjustment. Significance was set at p < .05 unless otherwise stated. Heatmap and graphical representations were created using the package ggplot2 and Hierarchical Clustering Algorithms through the statistical software R (R Core Team 2013).

Genotype and allele frequencies
Genotype and allele frequencies of analysed samples are reported in Table 1. Notably, even if in the last years the attention towards A2 milk increased globally due to potential positive effects on human health compared to other variants (Thiruvengadam et al. 2021), 39.13% of b-CN alleles in analysed samples was A1. Accordingly, only 36.49% of analysed cows were homozygous for A2 b-CN, whereas the remaining animals had at least one A1 allele. The correspondence between actual genotype frequency and the one expected from alleles percentage suggested no segregation between A1 and A2 cows is ongoing in Italian Holsteins. Previously, Chessa et al. (2020) investigated the temporal trend of allele frequencies in the Italian Holstein population, highlighting small differences between 2000 and 2010 and the period 2010 and 2017. During the latter period, A1 and A2 represented 36.7% and 55.8% of population alleles, respectively, in accordance with the results of the present study. The same authors reported the remaining alleles being mostly b-CN I, B, and A3. Such minor alleles were not investigated in the present study. Sebastiani et al. (2020) examined the frequencies of b-CN alleles in Holstein cows reared in 17 herds of central Italy and reported very similar A2 allele frequency compared to the present study (60.65%) and slightly lower A1 frequency (30.39%). Lower frequencies compared to the present study were expected, considering the aforementioned authors took in account also minor alleles such as B and I. Similar investigations in Holstein population, from several countries in Europe and Asia, reported A1 and A2 to represent about 36.9% (range from 26% to 50%), and about 57.1% (range from 26% to 50%) of b-CN alleles, respectively (Dai et al. 2016;Cie sli nska et al. 2019).
Frequencies of A allele of b-LG in analysed samples was 54.45%, in accordance with previous studies which reported frequencies between 27% and 57% in different Holstein populations, between late 90 s and early 2000s (Heidari et al. 2012). Results differed from a previous investigation in Italian Holstein, which reported a frequency of the A allele of 43.6% (Caroli et al. 2004).
The most represented j-CN allele was A (56.81%), followed by B (33.31%) and E (9.88%). Similar to Chessa et al. (2020), no individual carried j-CN C variant. Even if the positive effect of B variant of j-CN on milk technological properties was well-known, and tests for quantification of B protein variant of j-CN are commercially available (Rossoni et al. 2011), only 10.89% of tested cows were homozygous for B allele. The frequency of the B allele from the present study was confirmed by previous investigations in Holstein-Friesian cows predominantly from US, Italy, and Canada (Chessa et al. 2020).

Effect of b-CN genotype on milk composition and predicted MCP
The fat content of milk was not significantly influenced by b-CN genotype (Table 2). However, the statistical analysis highlighted significant, even if minor, differences in protein and casein content among b-CN genotypes, in agreement with findings of Lund en et al. (1997). Similarly, a recent investigation on Agerolese cattle breed found significantly higher protein and casein content in b-CN A2A2 milk compared to other genotypes. Previously, Ng-  found opposite results in Holstein-Friesian, even if the statistical significance of the differences between b-CN A2A2 and A1A1 genotypes was not reported. Moreover, GWAS meta-analysis highlighted the chromosomic region associated with caseins genes had a significant effect on milk protein content (van den Berg et al. 2020). Among b-CN genotypes, A1A1 showed the lowest SCS content, followed by A1A2 and A2A2. Previously, Vallas et al. (2012) did not find a significant association between SCS and b-j-CN composite genotype.
Considering predicted milk coagulation properties, b-CN genotype did not significantly affect RCT, while b-CN A1A1 individuals showed significantly improved k 20 and a 30 compared to A1A2 and A2A2 (Table 2). Previous studies described dissimilar and not significant differences between b-CN A1 and A2 variants on RCT. Marziali and Ng-Kwai-Hang (1986) reported shorter RCT for A1A2 milk compared to A1A1, meanwhile, Ketto et al. (2017) findings suggested A2A2 milk having the less favourable RCT and k 20 compared to A1A2. The favourable effect of A1 allele on RCT, compared to A2 allele, was supported by studies on Holstein and Jersey cows (Jensen et al. 2012;Poulsen et al. 2013), and on Finnish Ayrshire and Finnish Friesian cows (Ikonen et al. 1999). Moreover, Lodes et al. (1996) demonstrated that b-CN A1A1 had shorter RCT and k 20 than b-CN A1A2 and b-CN A2A2 genotypes. The favourable effect of b-CN A1 on a 30 was reported by Ketto et al. (2017), who demonstrated significantly superior performance in terms of a 30 for A1A2 compared to A2A2 milk. Different behaviour among b-CN variants on MCP have been attributed to the different net charge of protein molecules, with A1 and B variants having one and two positively charged amino acids more than A2, respectively (McLean et al. 1987).
Effect of b-LG genotype on milk composition and predicted MCP Milk fat content was higher for b-LG genotype BB compared to AA (Table 3) LG genotypes when CN was expressed as a fraction of the major proteins (quantified using HPLC). Such differences were essentially linked to a lower level of total protein in b-LG BB samples and a slightly higher concentration of j-CN compared to other genotypes.
The milk content of urea is usually referred to as an index of efficient transformation of feed nitrogen in the protein of milk (Bobbo et al. 2020). Cows carrying b-LG AA genotype resulted in the lower urea content in milk. Botaro et al. (2008) found similar, but not statistically significant, results in Girolando but not in Holstein breeds. Singh et al. (2015) highlighted a significantly lower SCS in Holstein cows (n ¼ 126) carrying b-LG AA genotype compared to BB, in accordance with the results of the present study. Such results could be linked to different susceptibility to mastitis or to a different response of the immune system to udder infections among b-LG genotypes.  b-LG AA showed a less favourable RCT, k 20 and a 30 compared with the other investigated genotypes (Table 3). Previous research is not in accordance with the effect of b-LG on RCT. Reporting only studies with statistically significant differences among b-LG genotypes, K€ ubarsepp et al. (2005), Ng- Kwai-Hang et al. (2002), and Ketto et al. (2017) reported that the most favourable genotypes were AA, AB, and BB using different statistical approaches, respectively. Celik (2003) supported AA being the most favourable genotype analysing milk from 50 Brown Swiss cows. In the present study b-LG BB had shorter k 20 (7.19 min) than AB and AA genotypes. Such a result is in accordance with K€ ubarsepp et al. (2005).
The stability of the complex between j-CN and b-LG has been reported to be influenced by protein variants (Imafidon et al. 1991). In this view, a deeper understanding of the effect of such interaction on the coagulation process from a molecular perspective and under controlled conditions could be useful to fully determine which is the b-LG variant having the most favourable effect on MCP.

Effect of j-CN genotype on milk composition and predicted MCP
Even if previous researches dealt with the effect of j-CN on fat content, most data focussed only on A and B variants (Bovenhuis et al. 1992;Mayer et al. 1997). According to findings presented in table 4, Heck et al. (2009) did not find significant differences in milk fat percentage among j-CN genotypes.
Protein content, as well as casein content of milk, were influenced by j-CN genotypes, with BB being the most favourable one (Table 4). Similar results were found by Hall en et al. (2008), Heck et al. (2009), andMackle et al. (1999) using different computational approaches. In particular, Heck et al. (2009) reported a 0.19% difference in milk protein content between BB and EE genotypes, similar to the 0.18% of the present study. On the other hand, Albarella et al. (2020) did not find significant differences in protein and casein contents among Agerolese breed individuals carrying j-CN AB and BB genotypes.
j-CN EE genotype demonstrated significantly lower urea compared to AB, BB, and BE genotypes (Table 4). To the authors' knowledge, this is the first time j-CN protein variants are associated with changes in milk urea content. SCS in the analysed individuals was lower for j-CN BB genotype compared to AA. Previously, Vallas et al. (2012) did not find a significant association between SCS and b-j-CN composite genotype.
A significant effect of j-CN on MCP was found, with the BB genotype performing significantly better in terms of RCT compared to AA (Jakob and Puhan 1992; (Table 4). Accordingly, Vallas et al. (2012) findings demonstrated allele B of j-CN having the best coagulation characteristics compared to A variant. Moreover, Ikonen et al. (1999Ikonen et al. ( ), K€ ubarsepp et al. (2005, and Jensen et al. (2012) supported present findings suggesting j-CN BB genotype was linked to improved RCT. A possible explanation for such difference is that j-CN B determines smaller casein micelles compared to j-CN A ). This possibly leads to a lower critical proportion of j-CN to be proteolysed in order to initiate micelles aggregation, essentially due to increased diffusion of micelles (McMahon and Brown 1984;Dalgleish 1993). Individuals carrying j-CN BB genotypes were characterised by a significantly shorter k 20 compared to all other j-CN genotypes.  investigated the effect of j-CN variants on low-moisture mozzarella cheese manufacturing and reported a significantly improved rate of curd firming in milk with j-CN BB compared to AA. Finally, the effect of j-CN genotype on a 30 was in accordance with previous studies (Lodes et al. 1996;).

Effect of composite genotype on milk composition
The effect of genotypes on milk composition is represented as a heatmap in Figure 2. To generate the heatmap, standardised least squares means and standard errors were imputed. The map highlighted five principal clusters. The first cluster is characterised by average SCS, high urea levels, and the highest percentages of fat, protein, and casein. Such cluster is mostly represented by genotypes composed of b-CN A2A2, b-LG BB and j-CN BB. Such results are in agreement with the findings of Mayer et al. (1997), even if a direct comparison is not possible considering cited authors did not include b-CN A1 variant among investigated alleles. The second cluster was characterised by low SCS levels, average urea, and good protein and casein levels. Such characteristics suggested genotypes belonging to this cluster to be favourable for udder health, efficiency, and production. Genotypes of the cluster are represented by b-CN A1A2 followed by A1A1, b-LG AA and AB, and j-CN AB followed by BB. In this context, b-CN A1, b-LG A, and j-CN B alleles are the most represented. The third cluster is composed of a single genotype, A1A1, BB, and BB for b-CN, b-LG, and j-CN, respectively. This genotype is characterised by an average level of SCS, extremely low urea, and high levels of protein, casein and fat. Nevertheless, such results could be biased by the low number of animals carrying such genotype (n ¼ 17, Supplementary Table 1). The fourth cluster is characterised by medium-high SCS and average to low protein and casein content. b-CN and b-LG genotypes are almost equally represented, while the presence of j-CN E allele is particularly high in this cluster. Finally, the fifth cluster is represented by genotypes with low protein, casein, and fat contents. All genotypes carry at least one copy of b-CN A1 allele, and most of them also have a j-CN E copy. To the authors' knowledge, this is the first time a clustering approach was applied to milk compositional characteristics in order to highlight composite genotypes with similar milk compositions. Perna et al. (2016) studied the effect of caseins haplotype on milk composition and suggested BB-A2A2-BB genotype for a-CN-b-CN-j-CN having good compositional characteristics and average SCS levels, in accordance to present findings. The same authors reported the highest levels of fat for BB-A1A1-AA casein haplotype despite average levels of protein, and the highest SCS. Accordingly, the same haplotype for b-CN-j-CN in the present study clustered in the fourth group, characterised by similar composition except for the genotype carrying b-LG AB variants, having lower SCS levels.

Effect of composite genotype on MCP
The effect of milk protein variants on MCP was represented as a heatmap in Figure 3. Four principal clusters were observed. The first cluster is mainly represented by genotypes with poor coagulation properties in terms of RCT, k 20 , and a 30 . Four genotypes belong to this cluster, all presenting at least one j-CN A allele. The second and largest cluster presented lower a 30 and average to poor RCT and k 20 . Notably, most genotypes carrying j-CN A and E alleles belong to cluster, while no j-CN BB genotype is clustered in this group. The third cluster gathered genotypes with good a 30 and average to good k 20 and RCT. In this group, most of the genotypes carrying j-CN BB and AB are present, while j-CN E allele was not present. Accordingly, the favourable effect of b-CN-j-CN A1A1-AB genotype on a 30 was previously demonstrated in Finnish Ayrshire cows, even if the same results were not confirmed in Finnish Friesian. Considering k 20 , Finnish Friesian b-CN-j-CN genotypes A1A1-AB and A1A2-AB were proposed as the most favourable. However, the same study did not take into consideration homozygous animals for j-CN B (Ikonen et al. 1997). Finally, the fourth cluster is characterised by very good MCP. All genotypes belonging to such cluster were uncommon in Holstein population, being present in 0.32%, 1.8%, and 0.49% of analysed cows for genotypes A1A1-BB-BB, A2A2-AB-BB, and A1A1-BB-BE of b-CN, b-LG, j-CN, respectively. Notably, one of the genotypes represented in such cluster is A1A1-BB-BB, the same demonstrating extremely good compositional traits. It is important to notice that in this cluster is present a composite genotype with A2A2 b-CN. The good performance in terms of technological traits of A2A2-AB-BB composite genotype, similar to results reported by Perna et al. (2016) for BB-A2A2-BB (CSN1S1-CSN2-CSN3) genotype, could be attributed to a synergistic effect of b-CN A2 and j-CN B on micelle size (Walsh, Guinee, Reville, et al. 1998;Ketto et al. 2017).

Conclusions
The present study demonstrated the effect of single and composite milk protein genotypes on milk composition and milk coagulation properties, predicted using milk infra-red spectroscopy, from a large database of Holstein cows. b-CN A2A2 cows produced milk with higher protein and casein content compared to other genotypes, while b-CN A1A1 reported the most favourable urea level, SCS, k 20 , and a 30 , while no significant differences among genotypes were found for fat content and RCT. The most favourable b-LG genotype was BB for fat content and milk coagulation properties, while the AA genotype resulted in lower urea and SCS. Finally, j-CN BB was the best genotype for most of the analysed traits, except for fat content, resulting in no significant differences among genotypes, and urea, being EE and AE and AA the most favourable genotypes. Moreover, the relationships in terms of milk composition and milk coagulation properties among milk protein composite genotypes were investigated using hierarchical clustering, suggesting the existence of genotypes sharing common characteristics. This aspect could be taken into consideration for future selection plans. Finally, considering the increasing interest in b-CN A2A2 milk from a health point of view, the detrimental effect of the A2 allele on milk technological traits should be taken into account. The present study highlighted the existence of composite genotypes carrying b-CN A2A2 and having average to good MCP. Such genotypes should be considered to counteract the possible detrimental effects of A2A2 selection.

Ethical approval
Present study was developed in the framework of national dairy herd improvement program. All activities involving animals were conducted by state-recognized Dairy Herd Improvement Associations. All applicable international, national, and/or institutional guidelines for the care of animals were followed.