Evaluating signatures of selection through variation in linkage disequilibrium among different cattle breeds

The aim of the presented investigation was to assess the overlap between selection signatures discovered through analysis of variation in linkage disequilibrium and reported genomic regions associated with economically and biologically important traits in cattle populations. Differences across Slovak Pinzgau, Austrian Pinzgau, Simmental and Holstein cattle, and thus genome signatures of production and adaptation, were found. The highest peaks (top 0.01 percentile) were observed between Slovak and Austrian Pinzgau on chromosome 23, between Slovak Pinzgau and Simmental on chromosome 4 and between Slovak Pinzgau and Holstein on chromosomes 1, 7 and 20. Many of the candidate genes found have a known role in milk production (casein genes CSN1S1, CSN2, CSN1S2, CSN3; ABCG2, HBEGF, CAPN3, DGAT1, TG, GHR), reproduction (MGAT1, FGF1), feed efficiency (R3HDM1, ZRANB3), fertility (SPOCK1) and immune response (HSPA9, CD14, ARAP3, PCDH). The results of this study could serve as a basis for the implementation of genomic selection programs in Slovak Pinzgau cattle.


Introduction
Since domestication, significant genetic improvement has been achieved for many traits of commercial importance in cattle, including adaptation, appearance and production. In response to such intense selection pressures, the bovine genome has undergone changes at the underlying regions of functional genetic variants, which are termed "selection signatures" (Randhawa et al., 2016). Domestication and selection are processes that alter the pattern of within- and between-population genetic variability. Recently, sequence polymorphisms at the genome-wide level have been investigated in a wide range of animals. A common approach to detect selection signatures is to compare breeds that have been selected for different breeding goals (e.g. dairy and beef cattle). However, genetic variations in different breeds with similar production aptitudes and similar phenotypes can be related to differences in their selection history (Sorbolini et al., 2015). As the technology improves and the cost of low-density genotyping platforms decreases, mating designs that utilize genomic information could assist producers in managing their herd at the genomic level (Howard et al., 2015).
A traditional method to examine the degree of differentiation is to compute Wright's FST statistic across two populations. This measure is advantageous when large differences in allele frequencies occur, such as across cattle breeds. Within a breed, only small differences in allele frequencies are expected across populations, particularly when there is some degree of genetic exchange. As a result, FST is less useful for identifying regions that differ within a breed, and alternative methods have been used (Howard et al., 2015). One such alternative for characterizing genomic differences across populations is to compute the frequency of runs of homozygosity (ROH), either averaged across the genome or for a specific region (Ferenčaković et al., 2013; Kim et al., 2013).
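The contrast between large and small allele frequency differences can be illustrated with a minimal sketch of Wright's FST for a single biallelic locus. It assumes two populations of equal weight and known allele frequencies; the function name `wright_fst` and the example frequencies are hypothetical and are not taken from the data of this study.

```python
def wright_fst(p1, p2):
    """FST for one biallelic locus from allele frequencies in two populations.

    Assumes equal population weights (a simplification for illustration).
    """
    p_bar = (p1 + p2) / 2.0
    # Expected heterozygosity in the pooled (total) population.
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within the two subpopulations.
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_t == 0.0:
        # Locus fixed for the same allele in both populations;
        # returning 0 is one possible convention.
        return 0.0
    return (h_t - h_s) / h_t


# Hypothetical frequencies: strongly diverged breeds vs. two lines of one breed.
between_breeds = wright_fst(0.9, 0.1)    # about 0.64
within_breed = wright_fst(0.55, 0.45)    # about 0.01
```

With these illustrative values the statistic separates the diverged pair clearly but is close to zero for the similar pair, which is the situation in which LD-based or ROH-based measures are preferred.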
Signatures of selection are regions of the genome containing variants that have preferentially increased in frequency and become fixed in a population because of their functional importance in specific processes. These regions can be detected because of their lower genetic variability and specific regional linkage disequilibrium (LD) patterns. The varLD method is a powerful tool to identify differences in LD between cattle populations and putative signatures of selection with potential adaptive and productive importance (Perez O´Brian et al., 2014). This method has been used successfully to detect genes associated with civilization diseases (Teo et al., 2009; Ong et al., 2010) as well as with production and reproduction traits in cattle (Perez O´Brian et al., 2014; Kadri et al., 2015; Sorbolini et al., 2015; Randhawa et al., 2016).
A change of the breeding goal and the use of a total merit index to preserve the dual-purpose character of Slovak Pinzgau have long been proposed, and a positive impact on population structure is expected. The use of the most recent results of quantitative (Krupova et al., 2016) and molecular genetics should help to manage the population sustainably. The routine collection of genomic information would be an invaluable resource for effective management of breeding programs in small, endangered populations (Mészáros et al., 2015; Šidlová et al., 2015). From the point of view of molecular genetics, the aim is to identify the most valuable lines, families and individuals with an impact on population diversity, to achieve competitive genetic progress and to produce progeny with higher added genetic value. In addition, knowledge of the selection signatures of Slovak Pinzgau, of the breed's relationship with other European cattle populations including Austrian Pinzgau, and of the most characteristic regions in the genome will further contribute to the development of breeding programs.
The ultimate objectives of genomic scans for evolving patterns of genetic diversity are to detect causative variants and their functional relevance to particular traits. The use of high-throughput assays for generating dense genotypes and genomic sequences will aid the fine mapping of candidate variants. Finally, functional analysis of the detected variants in the regions and genes under hotspots of positive selection will be an active area of future research to understand the biological significance of molecular variations in adaptation, appearance and production in cattle (Randhawa et al., 2016).
In this study, regional variation in linkage disequilibrium across the autosomal genome was measured between breeds, including Slovak Pinzgau, Austrian Pinzgau, Simmental and Holstein cattle, genotyped with the Illumina Bovine50K BeadChip. By comparing the differences across dairy and dual-purpose cattle types, we aimed to find genome signatures of production and adaptation.

Material and methods
All analyses performed in this study were based on a dataset consisting of 308 cattle (4 breeds) genotyped by high-throughput technology. [...] 2013) also for populations from other sources was performed. Afterwards, individuals with >10 % missing data across the markers on the 29 autosomes were removed from the analyses. Similarly, markers that were missing in >10 % of individuals were discarded using PLINK v1.9 (Purcell et al., 2007). A total of 287 individuals and 41 135 SNPs were retained for further analyses.
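The two missingness filters described above can be sketched as follows. This is an illustration of the logic only, not the PLINK implementation: the function name `qc_filter`, the genotype coding (0/1/2 with NaN for a missing call) and the toy matrix are assumptions. PLINK provides comparable per-sample and per-variant filters through its `--mind` and `--geno` options, although the exact command line used here is not stated.

```python
import numpy as np


def qc_filter(genotypes, max_missing=0.10):
    """Drop individuals, then markers, whose missing rate exceeds max_missing.

    genotypes: 2-D array (individuals x SNPs); np.nan marks a missing call.
    Returns the filtered matrix and boolean masks of kept rows and columns.
    """
    g = np.asarray(genotypes, dtype=float)
    # Step 1: per-individual missing rate across all markers.
    keep_ind = np.isnan(g).mean(axis=1) <= max_missing
    g = g[keep_ind]
    # Step 2: per-marker missing rate among the remaining individuals.
    keep_snp = np.isnan(g).mean(axis=0) <= max_missing
    return g[:, keep_snp], keep_ind, keep_snp


# Toy data (hypothetical): 6 individuals x 20 SNPs, all genotypes set to 0.
toy = np.zeros((6, 20))
toy[2, :10] = np.nan   # individual 2 lacks 50 % of its calls
toy[0, 7] = np.nan     # marker 7 is missing in two individuals
toy[1, 7] = np.nan

filtered, kept_ind, kept_snp = qc_filter(toy)
# Individual 2 and marker 7 are removed; 5 individuals x 19 SNPs remain.
```

The order of the two steps matters in this toy example: had markers been filtered first, the first ten markers would each show 1 missing call in 6 individuals (about 17 %) and would have been discarded as well.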
Data were processed using the varLD method, which quantifies the regional variation of LD between two populations by assigning a score to each window and then highlights the regions where the two breeds differ most. Variation in linkage disequilibrium detects candidate regions under positive selection by comparing genome-wide LD variation between populations (Teo et al., 2009), and the method is implemented in the varLD program (Ong et al., 2010). The regions containing the top 5, 1, 0.1 and 0.01 percentile of signals were detected and compared with previously reported bovine genomic regions associated with production and reproduction traits to find the overlap.
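The window-based comparison and the percentile cut-offs can be illustrated with a simplified sketch. It is not the varLD program: it assumes that each window is summarized by a signed r² matrix computed from genotype correlations, that the window score is the sum of absolute differences between the sorted eigenvalues of the two matrices, and that all SNPs are polymorphic in both populations. The function names, the window size and step, and the random toy genotypes are hypothetical.

```python
import numpy as np


def signed_r2(genos):
    """Signed r^2 matrix for one window (individuals x SNPs, coded 0/1/2)."""
    r = np.corrcoef(genos, rowvar=False)
    return np.sign(r) * r ** 2


def window_score(genos_a, genos_b):
    """Sum of absolute differences between the sorted eigenvalues."""
    eig_a = np.sort(np.linalg.eigvalsh(signed_r2(genos_a)))[::-1]
    eig_b = np.sort(np.linalg.eigvalsh(signed_r2(genos_b)))[::-1]
    return float(np.abs(eig_a - eig_b).sum())


def scan(genos_a, genos_b, window=50, step=1):
    """Score every sliding window of `window` SNPs shared by both populations."""
    n_snps = genos_a.shape[1]
    starts = range(0, n_snps - window + 1, step)
    return np.array([
        window_score(genos_a[:, s:s + window], genos_b[:, s:s + window])
        for s in starts
    ])


def percentile_flags(scores, top_percent=(5, 1, 0.1, 0.01)):
    """Map each tier to a boolean mask of windows at or above its cut-off."""
    return {p: scores >= np.percentile(scores, 100 - p) for p in top_percent}


# Toy genotypes (hypothetical): two populations, 40 individuals x 30 SNPs each.
rng = np.random.default_rng(0)
pop_a = rng.integers(0, 3, size=(40, 30))
pop_b = rng.integers(0, 3, size=(40, 30))

scores = scan(pop_a, pop_b, window=10, step=5)   # 5 windows
flags = percentile_flags(scores)
```

Because the tiers are nested, every window flagged in the top 0.01 percentile is also flagged in the top 0.1, 1 and 5 percentile, which mirrors the way the signals are reported by tier.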

Results and Discussion
The top 0.01 percentile of signals between Slovak Pinzgau (SKP) and Holstein (HOL) peaked at 40-58 Mb on chromosome 7 (Table 1). This region contains many genes associated with milk production and disease resistance (SAR1B, HBEGF), reproduction traits (MGAT1, FGF1), fertility (PCSK4, SPOCK1) and immune response (HSPA9, CD14, ARAP3 and multiple members of the PCDH group). The gene MGAT1 was also observed between SKP and Austrian Pinzgau (ATP) as well as between SKP and Simmental (SIM), and the gene FGF1 between SKP and SIM, although in a lower percentile. Genomic regions of low genetic diversity harbouring genes for feed efficiency (R3HDM1, ZRANB3) were found between SKP and HOL on chromosome 2, while casein genes (CSN1S1, CSN2, CSN1S2, CSN3) were observed between SKP-ATP and SKP-SIM on chromosome 6. Thyroglobulin (TG, chromosome 14), which affects meat tenderness and intramuscular fat distribution, is also associated with milk yield and composition; in our study it was found only in the SKP-HOL comparison. Candidate genes on chromosomes 2, 6, 7, 14 and others are described in the study of Randhawa et al. (2016). Intense selection to increase milk yield has had negative consequences for mastitis incidence in dairy cattle. According to Kadri et al. (2015), the association signal for clinical mastitis and milk yield in HOL peaked in the 26-40 Mb region on chromosome 20, which was also confirmed by the results of our study (SKP-HOL).
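Checking whether a detected window falls within a previously reported region amounts to an interval overlap test per chromosome. The sketch below assumes regions are given as chromosome, start and end in Mb. The reported interval is the chromosome 20 region cited above, whereas the two "detected" windows are hypothetical placeholders and not results of this study.

```python
from typing import NamedTuple


class Region(NamedTuple):
    chrom: int
    start_mb: float
    end_mb: float


def overlaps(a, b):
    """True if two regions lie on the same chromosome and share any position."""
    return a.chrom == b.chrom and a.start_mb <= b.end_mb and b.start_mb <= a.end_mb


def find_overlaps(detected, reported):
    """Pair every detected region with the label of each reported region it overlaps."""
    return [
        (d, label)
        for d in detected
        for label, r in reported.items()
        if overlaps(d, r)
    ]


# Reported region taken from the text (Kadri et al., 2015).
reported = {
    "clinical mastitis and milk yield": Region(20, 26.0, 40.0),
}
# Hypothetical detected windows, for illustration only.
detected = [Region(20, 31.5, 33.0), Region(7, 45.0, 46.0)]

hits = find_overlaps(detected, reported)
# Only the chromosome 20 window overlaps the reported region.
```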
The candidate genes ABCG2 (SKP-SIM), CAPN3 (SKP-SIM, SKP-HOL) and DGAT1 (SKP-HOL) on chromosomes 6, 10 and 14, respectively, were found in agreement with Sorbolini et al. (2015). In addition, these authors observed the MSTN and FTO genes, which are connected with beef production and thus were not detected in our study of dairy and dual-purpose cattle. Kasarda et al. (2015) found only 2 SNPs close to the FTO gene using the integrated Haplotype Score (iHS) in Slovak and Austrian cattle.
The highest peaks (top 0.01 %) were observed between Slovak and Austrian Pinzgau on chromosome 23, between Slovak Pinzgau and Simmental on chromosome 4 and between Slovak Pinzgau and Holstein on chromosomes 1, 7 and 20. Similarly, Perez O´Brian et al. (2014) noticed that the largest varLD scores are found when comparing different production types within a subspecies. According to Table 1, only the variance in LD between Slovak and Austrian Pinzgau showed signals on all autosomes (at least in the top 5 %), most likely because SKP has been selected mainly for milk production and ATP mainly for beef production.
Table 1 Regions with different signal between all evaluated pairs of breeds containing the top 5, 1, 0. [...]

Signatures of selection for adaptation are mainly attributed to tolerance of new climates and feed resources and to resistance to different disease agents in various cattle breeds. Changes in the genetic aspects of behavioural control for new adaptations, from survive to thrive, and in the use of available resources have also been detected under positive selection in several populations (Randhawa et al., 2016). Kim et al. (2013) found that several of the regions that had differing levels of ROH across populations were associated with economically important traits including milk, fat and protein yield.
The same approach was used in this study to detect signatures of selection that are common to, or differ across, populations. Kasarda et al. (2015) identified evidence of recent selection based on estimation of iHS and FST and characterized the affected regions near QTL associated with traits under strong selection in Pinzgau cattle. Recent investigations of structural variation in the cattle genome suggest that selective forces, in addition to the genotypic and haplotypic patterns, operate on copy number variation (CNV) in candidate genes and can be helpful to characterize the effects of domestication, breed formation and artificial selection (Perez O´Brian et al., 2014).

Conclusions
The highest peaks were observed between Slovak and Austrian Pinzgau on chromosome 23, between Slovak Pinzgau and Simmental on chromosome 4 and between Slovak Pinzgau and Holstein on chromosomes 1, 7 and 20. Many candidate genes associated with milk production, reproduction and health were identified, whereas genes associated with meat production were not found. The results of this study could serve as a basis for the implementation of genomic selection programs in Slovak Pinzgau cattle.
[...] differences among pairs of breeds were obtained in expression of a specific region of the genome. Between Slovak and Austrian Pinzgau there was only one region considered the most significant signal, on chromosome 23 (Figures 1, 2). Šidlová et al. (2015) found the KHDRBS2 gene on chromosome 23, while the results of this study showed signals on the same chromosome (top 0.01 % between SKP-ATP, top 1 % between SKP-SIM) in a different region. Richardson et al. (2016) identified a candidate region, and imputation of this region to sequence data pinpointed the association signal in the introns of the FKBP5 gene, which is involved in immune response. The variance of LD between SKP-ATP (top 1 %) and SKP-SIM (top 5 %) also peaked in this region (Figures 1, 2A, 2B, 2C).

Figure 1 Comparison between Slovak and Austrian Pinzgau based on analysis of variation in linkage disequilibrium. The Manhattan plot shows significant signals in regions on several autosomes. The upper horizontal line represents the whole-genome significance threshold

Figure 2 Differences among breed pairs on chromosome 23: A - comparison of all evaluated breed pairs, B - Slovak Pinzgau v. Austrian Pinzgau (blue), C - Slovak Pinzgau v. Simmental (red) and D - Slovak Pinzgau v. Holstein (green)