Screening of Markers Related to Growth and High-Salinity Tolerance Traits and Research on Genomic Breeding Technology in Litopenaeus vannamei
罗正
Subtype: Doctoral
Thesis Advisor: 李富花
2022-05
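Degree Grantor: University of Chinese Academy of Sciences
Place of Conferral: Institute of Oceanology, Chinese Academy of Sciences
Degree Name: Doctor of Science
Keyword: genome-wide association study; genomic selection; growth traits; high-salinity tolerance; Litopenaeus vannamei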
Abstract

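Genome-wide association study (GWAS) and genomic selection (GS) are key techniques of the breeding 3.0 stage and have greatly advanced the genetic dissection of traits and the development of molecular breeding in animals and plants. The decoding of the Litopenaeus vannamei genome brought shrimp research into the post-genomic era, and GWAS has become an important means of dissecting the genetic basis of economic traits in shrimp. Research on molecular breeding technology for L. vannamei, which accelerates genetic improvement, is an important technical route to resolving the shortage of improved varieties and the dependence on foreign broodstock sources. Our group and other research teams at home and abroad have previously carried out GWAS and GS studies on growth, Vibrio resistance and other traits of L. vannamei, but progress has been relatively slow. Focusing on the growth and high-salinity tolerance traits of L. vannamei, this study used a high-density SNP chip and comparative transcriptome analysis to screen trait-associated markers, identified trait-associated genes, and established a genomic selection breeding strategy suited to aquatic animals. The work lays an important foundation for the genetic dissection of important economic traits in shrimp and offers important ideas for applying GS in aquatic animals. The main progress is as follows: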

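1. Screening of markers and genes related to growth traits in shrimp: A mixed multi-family population was SNP-genotyped with the L. vannamei 600 K high-density chip, and linkage disequilibrium decay (LD decay) analysis was performed. The physical distance between markers was 20 Kb at r² = 0.2, indicating that LD decays rapidly in this population. GWAS identified 11 SNP markers significantly associated with body weight, and a region significantly associated with growth was identified on linkage group 40, within which the candidate gene c76794 was located. Association analysis of the candidate gene identified 11 variant sites significantly associated with body weight; a single marker explained up to 11.23% of the body weight phenotype, and the 11 markers cumulatively explained 24%. This indicates that the gene is closely related to growth traits, and its discovery lays an important foundation for the genetic dissection of growth traits in shrimp.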

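2. Screening of genes and pathways related to high-salinity tolerance in shrimp: Comparative transcriptome analysis of tolerant and sensitive families under normal salinity and after high-salinity treatment showed that, in both conditions, the genes differentially expressed between the two types of family were enriched in "response to stimulus". This term includes several genes encoding crustacyanin, all of which tended to be significantly more highly expressed in the tolerant families, suggesting that they play an important role in high-salinity tolerance. By comparing how the tolerant and sensitive families responded to high-salinity treatment, genes that responded only in the tolerant families were obtained; these were mainly enriched in three GO terms: "serine-type endopeptidase activity", "serine-type peptidase activity" and "serine hydrolase activity". Genes of the serine protease family were shared by all three GO terms, suggesting that this type of gene plays an important role in high-salinity tolerance. At the genome level, association analysis of the high-salinity tolerance trait that integrated GWAS and machine learning found a significantly associated marker on linkage group 21, and a functional gene was located within 20 Kb upstream and downstream of this marker. This gene matches the serine protease genes found in the transcriptome analysis, further indicating its important role in high-salinity tolerance in shrimp.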

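3. Evaluation of the effect of GWAS-selected markers on the predictive ability of genomic selection: Developing GS breeding strategies suited to the characteristics of aquatic animals and reducing genotyping cost are the basis for promoting genomic selection breeding in aquatic animals. This study proposed a strategy that performs GS with trait-associated markers screened by GWAS, and evaluated its effect on the GS predictive ability for disease resistance traits in aquatic animals. The predictive ability with GWAS-selected SNPs was higher than with randomly selected SNPs or with all SNPs, and the BayesB model outperformed the GBLUP model. The optimal number of SNPs for GS varies with trait and species. The strategy can both reduce genotyping cost and improve the accuracy of GS prediction, which will help accelerate the application of GS in the selective breeding of aquatic animals.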

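4. Establishment of a machine-learning-based GS analysis method for shrimp: Using genotype and phenotype data from 1565 L. vannamei individuals, the predictive abilities of different models were compared, including the traditional GS models GBLUP and BayesB and the machine learning models KAML, Catboost, XGBoost, ExtraTrees, RF, KNN, LightGBM, NeuralNet and WE. The phenotype data comprised two phenotypes: body length and the ratio of abdomen length to cephalothorax length. NeuralNet performed best among all models; compared with GBLUP, its predictive ability improved by about 10% for both phenotypes, indicating that the NeuralNet machine learning model has better application prospects in genomic breeding of L. vannamei.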

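5. Genomic selection analysis of high-salinity tolerance in shrimp: The genetic parameters of high-salinity tolerance in L. vannamei were estimated using the shrimp breeding chip designed in-house by our group, and genomic selection analysis of the trait was carried out with the established GS method. Survival time, lethal salinity and survival status in the high-salinity tolerance test were used as phenotypes; their marker-based heritabilities were 0.38 ± 0.01, 0.40 ± 0.01 and 0.5 ± 0.01, respectively. Among the GS models, NeuralNet had the highest prediction accuracy: 0.67 ± 0.05, 0.66 ± 0.06 and 0.77 ± 0.03 for survival time, lethal salinity and survival status, respectively. Compared with traditional pedigree-based selection (PBLUP), genomic selection improved prediction accuracy by 13.8%. Further analysis showed that when the reference population reaches about 1000 individuals, the prediction accuracy can rise to 0.9. Genomic selection therefore has a clear advantage over traditional pedigree selection, and the results provide important guidance for genomic selection of high-salinity tolerance in L. vannamei.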
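
Item 1 above reports LD decay as the physical distance at which r² falls to 0.2. The abstract does not say which software or estimator was used, so the following is only a minimal sketch of the idea, assuming genotypes coded 0/1/2 and using the squared genotype correlation as a stand-in for haplotype-based r². The function names (`ld_r2`, `ld_decay`, `distance_at_threshold`), the bin width and the toy data are all hypothetical.

```python
import numpy as np

def ld_r2(g1, g2):
    """Squared Pearson correlation between two genotype vectors coded 0/1/2."""
    r = np.corrcoef(g1, g2)[0, 1]
    return float(r ** 2)

def ld_decay(genotypes, positions, bin_width=5_000, max_dist=100_000):
    """Mean pairwise r2 per distance bin.

    genotypes: (individuals x SNPs) array, all SNPs assumed polymorphic.
    positions: physical position (bp) of each SNP on one linkage group.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    positions = np.asarray(positions)
    r2 = np.corrcoef(genotypes, rowvar=False) ** 2
    i, j = np.triu_indices(len(positions), k=1)   # every SNP pair once
    dist = np.abs(positions[j] - positions[i])
    keep = dist < max_dist
    bins = (dist[keep] // bin_width).astype(int)
    pair_r2 = r2[i, j][keep]
    n_bins = int(np.ceil(max_dist / bin_width))
    centres = (np.arange(n_bins) + 0.5) * bin_width
    means = np.full(n_bins, np.nan)               # NaN marks empty bins
    for b in range(n_bins):
        in_bin = bins == b
        if in_bin.any():
            means[b] = pair_r2[in_bin].mean()
    return centres, means

def distance_at_threshold(centres, means, threshold=0.2):
    """Centre of the first non-empty bin whose mean r2 is <= threshold, else None."""
    for c, v in zip(centres, means):
        if not np.isnan(v) and v <= threshold:
            return float(c)
    return None

# Toy data (hypothetical): SNP 0 and SNP 1 are identical, SNP 2 is far away.
toy_genotypes = np.array([
    [0, 0, 1],
    [1, 1, 1],
    [2, 2, 0],
    [1, 1, 2],
    [0, 0, 2],
    [2, 2, 0],
])
toy_positions = np.array([0, 1_000, 50_000])
centres, means = ld_decay(toy_genotypes, toy_positions)
print(distance_at_threshold(centres, means))
```

With real chip data the same binning would be run per linkage group and the curves averaged; a smoothed fit of r² against distance is also common, but that choice is not stated in the abstract.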

Other Abstract

Genome-wide association study (GWAS) and genomic selection (GS) are important techniques of the breeding 3.0 stage and have greatly promoted the genetic dissection of important traits and molecular breeding in animals and plants. With the decoding of the Litopenaeus vannamei (L. vannamei) genome, shrimp research has entered the post-genomic era, and GWAS has become an important method for analyzing the genetic basis of economic traits in shrimp. Molecular breeding technology is a fundamental approach to a problem facing shrimp culture: the lack of high-quality L. vannamei broodstock and the resulting dependence on foreign sources. Our group and other research teams have carried out some GWAS and GS research in L. vannamei, but progress is still limited. In this study, high-density SNP chips and comparative transcriptome analysis were used to screen markers and genes related to growth and high-salinity tolerance traits, and a strategy for genomic selection breeding in aquatic animals was developed. This work lays an important foundation for the genetic analysis of important economic traits in shrimp and provides theoretical guidance and technical support for the application of GS in aquatic animals, especially shrimp. The main progress is as follows:

1. Identification of markers and genes related to growth traits in L. vannamei: A mixed population composed of multiple families was genotyped with a 600 K high-density chip. Linkage disequilibrium decay (LD decay) analysis showed that the estimated physical distance between markers was 20 Kb at r² = 0.2, which indicates that LD decays rapidly in the analyzed population. A total of 11 SNPs significantly associated with body weight were identified by GWAS, and a chromosomal region significantly associated with growth was localized on linkage group 40. A functional gene named c76794 was identified in this region. Candidate gene association analysis further showed that 11 SNPs in this gene were significantly associated with body weight. A single locus explained up to 11.23% of the body weight phenotype, and the 11 SNPs in this gene cumulatively explained 24%. These data suggest that the candidate gene c76794 plays an important role in shrimp growth. The discovery of this gene lays an important foundation for the genetic dissection of growth traits in shrimp.

2. Screening of genes and pathways related to high-salinity tolerance in L. vannamei: Comparing the transcriptomes of high-salinity-tolerant and sensitive families, we found that "response to stimulus" was the most enriched Gene Ontology (GO) biological process term. Genes encoding crustacyanin (CRCN) were expressed at significantly higher levels in tolerant families under both normal and high-salinity conditions, which suggests that they are relevant to shrimp tolerance of high-salinity stress. By comparing the responses of tolerant and sensitive families to high-salinity treatment, we obtained genes that responded to high salinity only in tolerant families. These genes were mainly enriched in three GO terms: serine-type endopeptidase activity, serine-type peptidase activity and serine hydrolase activity. Genes of the serine protease family were shared by all three GO terms, which suggests that they play important roles in shrimp tolerance of high salinity. Association analysis of the high-salinity tolerance trait by GWAS and random forest (RF) methods found a significantly associated marker on linkage group 21. A candidate gene was localized near this marker; it is one of the serine protease genes identified in the transcriptome data. These data suggest that this gene may play a key role in shrimp tolerance of high salinity.

3. Evaluation of the accuracy of genomic selection using markers selected by GWAS: Developing a GS breeding strategy suited to the characteristics of aquatic animals and reducing the cost of genotyping are the basis for promoting the application of GS in aquatic animals. In this study, we proposed a strategy for GS using a subset of markers selected by GWAS and evaluated its prediction accuracy for disease resistance traits in different aquaculture species. The prediction accuracy using SNPs selected by GWAS was higher than that using randomly selected SNPs or all SNPs, and the BayesB model performed better than the GBLUP model. The optimal number of SNPs for GS varied with species and trait. The proposed strategy can both reduce genotyping cost and improve the prediction accuracy of GS, which will help accelerate the application of GS in aquaculture breeding programs.

4. Establishment of a GS method based on machine learning for L. vannamei: To compare the predictive ability of different GS methods in L. vannamei, we used a population of 1565 individuals. Body length and the ratio of abdomen length to cephalothorax length were used as the two phenotypes. The models compared were the traditional GS models GBLUP and BayesB and the machine learning models KAML, Catboost, XGBoost, ExtraTrees, RF, KNN, LightGBM, NeuralNet and WE. NeuralNet had the highest predictive ability for both growth phenotypes. Its predictive ability was about 10% higher than that of GBLUP for both phenotypes, which indicates that NeuralNet has better application prospects for genomic selection breeding in shrimp.

5. Genomic selection of high-salinity tolerance in shrimp: The genetic parameters and genomic prediction accuracy of the high-salinity tolerance trait were evaluated using the shrimp breeding chip designed by our research group. The survival time, lethal salinity and survival status of each individual during high-salinity treatment were recorded and used as phenotypes. The heritabilities of the three phenotypes were 0.38 ± 0.01, 0.40 ± 0.01 and 0.5 ± 0.01, respectively. Among the GS models, NeuralNet showed the highest prediction accuracy: 0.67 ± 0.05, 0.66 ± 0.06 and 0.77 ± 0.03 for survival time, lethal salinity and survival status, respectively. Compared with traditional pedigree-based selection (PBLUP), GS improved prediction accuracy by up to 13.8%. In addition, with a training population of around 1000 individuals, the prediction accuracy was predicted to rise to around 0.9. These results indicate that high-salinity tolerance in shrimp has medium to high heritability and that genomic selection is superior to the traditional selection approach. This study provides important guidance for genomic selection of shrimp tolerant to high salinity.
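
Item 3 above describes running GS on a marker subset pre-selected by GWAS and comparing it with using all SNPs. The sketch below illustrates that workflow on simulated data only. It makes several assumptions not taken from the thesis: a single-marker squared correlation stands in for the GWAS test, ridge regression on SNPs stands in for GBLUP, and the default ridge penalty is an arbitrary choice. Markers are ranked inside each training fold, so the validation animals never influence the selection. All function names and simulation settings are hypothetical.

```python
import numpy as np

def single_marker_scores(X, y):
    """Squared correlation of each SNP with the phenotype (simple GWAS-like score)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    denom[denom == 0] = np.inf            # monomorphic SNPs get a score of 0
    return (Xc.T @ yc / denom) ** 2

def ridge_predict(X_train, y_train, X_test, lam):
    """Ridge regression on centred genotypes (assumed stand-in for GBLUP)."""
    mu_x, mu_y = X_train.mean(axis=0), y_train.mean()
    Z = X_train - mu_x
    # Dual form: an n x n solve, cheaper when markers outnumber animals.
    alpha = np.linalg.solve(Z @ Z.T + lam * np.eye(Z.shape[0]), y_train - mu_y)
    return (X_test - mu_x) @ (Z.T @ alpha) + mu_y

def cv_accuracy(X, y, n_top=None, n_folds=5, lam=None, seed=0):
    """Mean correlation between predicted and observed phenotypes over folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    accs = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        cols = np.arange(X.shape[1])
        if n_top is not None:
            # Rank markers on the training fold only, then keep the best n_top.
            scores = single_marker_scores(X[train_idx], y[train_idx])
            cols = np.argsort(scores)[-n_top:]
        penalty = float(len(cols)) if lam is None else lam  # arbitrary default
        pred = ridge_predict(X[train_idx][:, cols], y[train_idx],
                             X[test_idx][:, cols], penalty)
        accs.append(np.corrcoef(pred, y[test_idx])[0, 1])
    return float(np.mean(accs))

# Simulated data (hypothetical): 500 animals, 1000 SNPs, 20 causal SNPs.
rng = np.random.default_rng(42)
n, m, n_causal = 500, 1000, 20
freqs = rng.uniform(0.1, 0.9, size=m)
X = rng.binomial(2, freqs, size=(n, m)).astype(float)
effects = np.zeros(m)
effects[rng.choice(m, n_causal, replace=False)] = rng.normal(size=n_causal)
g = X @ effects
y = g + rng.normal(scale=g.std(), size=n)   # heritability of roughly 0.5

acc_all = cv_accuracy(X, y)                 # all SNPs
acc_top = cv_accuracy(X, y, n_top=50)       # 50 top-ranked SNPs per fold
print(f"all SNPs: {acc_all:.2f}, top 50: {acc_top:.2f}")
```

Which setting scores higher in this toy example depends on the simulated architecture and says nothing about the results reported in the thesis. The point of the sketch is the fold structure: ranking markers on the full data set before cross-validation would leak validation phenotypes into the selection and inflate the apparent accuracy.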

MOST Discipline Catalogue: Science
Language: Chinese
Table of Contents

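Contents

Chapter 1  Introduction

1.1  Overview of selective breeding

1.1.1  Development of selective breeding methods

1.1.2  Current application of traditional selective breeding methods

1.1.3  Progress and problems in traditional selective breeding of Litopenaeus vannamei

1.2  Overview of the development of marker-assisted selection technology

1.2.1  Development and current application of marker-assisted breeding

1.2.2  Problems with marker-assisted breeding

1.2.3  Overview of the development of genome-wide association analysis

1.2.4  Applications of genome-wide association analysis

1.2.5  Current application of machine learning methods in association analysis

1.3  Overview of the development of genomic selection breeding

1.3.1  Concept of genomic selection breeding

1.3.2  Analysis methods for genomic selection breeding

1.3.3  Research progress of genomic selection breeding in livestock, poultry and crops

1.3.4  Current application of machine learning methods in genomic selection breeding

1.3.5  Application and challenges of genomic selection breeding in aquatic animals

1.4  Research objectives and significance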

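Chapter 2  Genome-wide association analysis and fine mapping of body weight in Litopenaeus vannamei

2.1  Materials and methods

2.1.1  Preparation of experimental materials

2.1.2  DNA extraction

2.1.3  SNP genotyping with the gene chip

2.1.4  Distribution of markers on chromosomes and location of genes around markers

2.1.5  Principal component analysis (PCA) and individual kinship analysis

2.1.6  Linkage disequilibrium analysis

2.1.7  Genome-wide association analysis

2.1.8  Validation of body-weight-associated SNP loci obtained by GWAS

2.1.9  Third-generation targeted sequencing genotyping of the growth trait candidate gene

2.1.10  Association analysis between the candidate gene and growth traits

2.2  Results

2.2.1  Phenotype measurement

2.2.2  600K chip genotyping results

2.2.3  Distribution of SNP markers on linkage groups

2.2.4  Principal component analysis (PCA) of population structure

2.2.5  Kinship analysis of the GWAS population

2.2.6  Population linkage disequilibrium analysis

2.2.7  Genome-wide association analysis

2.2.8  Location of genes around significant markers

2.2.9  Validation of marker AX-249832331 in the A13 and C16 populations

2.2.10  Validation of marker AX-249790475 in the A13 and C16 populations

2.2.11  Linkage disequilibrium analysis of markers around AX-249832331 on the 600K chip

2.2.12  Association analysis of the candidate gene

2.2.13  Phenotypic explanation rate of markers significantly associated with body weight in the candidate gene (c76794) sequence

2.2.14  Effect of sites significantly associated with body weight in the candidate gene (c76794) on protein coding

2.3  Discussion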

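Chapter 3  Comparative transcriptome analysis and gene mapping of high-salinity tolerance in Litopenaeus vannamei

3.1  Comparative transcriptome analysis of high-salinity-tolerant and sensitive families of Litopenaeus vannamei

3.1.1  Materials and methods

3.1.1.1  Selection and sampling of high-salinity-tolerant and sensitive families

3.1.1.2  Sampling for transcriptome analysis

3.1.1.3  Sample RNA extraction and cDNA synthesis

3.1.1.4  RNA sequencing and data preprocessing

3.1.1.5  Screening of differentially expressed genes (DEGs)

3.1.1.6  Functional enrichment analysis of differentially expressed genes

3.1.1.7  Validation of transcriptome sequencing results

3.1.2  Results

3.1.2.1  RNA sequencing data

3.1.2.2  Comparison of tolerant and sensitive families under normal salinity

3.1.2.3  Comparison of tolerant and sensitive families under high salinity

3.1.2.4  Differentially expressed genes shared between tolerant and sensitive families under normal and high salinity

3.1.2.5  Transcriptome analysis of the responses of tolerant and sensitive families to high-salinity stress

3.1.2.6  Validation of transcriptome data

3.1.3  Discussion

3.2  Genome-wide association analysis of high-salinity tolerance in Litopenaeus vannamei

3.2.1  Materials and methods

3.2.1.1  High-salinity tolerance testing and sample collection

3.2.1.2  Phenotype measurement and DNA extraction

3.2.1.3  Marker genotyping

3.2.1.4  Principal component analysis (PCA) and individual kinship analysis

3.2.1.5  Genome-wide association analysis

3.2.1.6  Assessment of marker importance with machine learning methods

3.2.1.7  Screening of candidate genes

3.2.1.8  GO enrichment analysis of candidate genes

3.2.2  Results

3.2.2.1  Phenotypic data for high-salinity tolerance

3.2.2.2  Principal component analysis of population structure

3.2.2.3  Kinship analysis among individuals

3.2.2.4  Genome-wide association analysis of the survival time phenotype

3.2.2.5  Genome-wide association analysis of the lethal salinity phenotype

3.2.2.6  Genome-wide association analysis of the survival status phenotype

3.2.2.7  Markers significantly associated with all three phenotypes

3.2.2.8  Assessment of marker importance with machine learning methods

3.2.2.9  Markers ranked in the top 50 by importance for all three phenotypes

3.2.2.10  GO enrichment analysis of candidate genes for the survival time phenotype

3.2.3  Discussion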

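Document Type: Thesis
Identifier: http://ir.qdio.ac.cn/handle/337002/178352
Collection: Key Laboratory of Experimental Marine Biology
Recommended Citation
GB/T 7714
罗正. Screening of Markers Related to Growth and High-Salinity Tolerance Traits and Research on Genomic Breeding Technology in Litopenaeus vannamei[D]. Institute of Oceanology, Chinese Academy of Sciences. University of Chinese Academy of Sciences, 2022.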
Files in This Item:
File Name/Size: 罗正毕业论文.pdf (5201 KB)
DocType: Thesis
Access: Not yet open
License: CC BY-NC-SA