
A chromosomal-level genome assembly of Begonia fimbristipula (Begoniaceae)

Published by Nature and other sources, 2025-03-12 18:09


Abstract

Begonia fimbristipula Hance (Begoniaceae) is a valuable medicinal herb that is classified as a protected species in Guangdong Province, China. In this study, we present a chromosome-level genome assembly of B. fimbristipula, aiming to facilitate its conservation and utilization. The genome was assembled using a combination of Oxford Nanopore long-read data and Illumina short-read data. The assembled genome size of B. fimbristipula is 462.11 Mb, with a scaffold N50 of 38.22 Mb. A total of 91.96% (424.94 Mb) of the sequences were anchored to 11 pseudochromosomes using Hi-C technology. The genome assembly exhibits a BUSCO completeness of 90.3% and an LTR Assembly Index (LAI) of 17.73. Genome annotation revealed 25,563 protein-coding genes and 274 tRNA genes. The high-quality chromosome-level assembly and annotation provide valuable insights into the genomic characteristics of B. fimbristipula, thereby offering essential resources for its conservation and economic utilization.

Background & Summary

The family Begoniaceae C. Agardh consists of two genera: Hillebrandia Oliv. and Begonia L. Hillebrandia is a monotypic genus, while Begonia is one of the ten largest angiosperm genera, comprising over 2,000 species [1]. Species within Begonia are perennial herbs widely distributed in moist tropical and subtropical regions worldwide, with some species extending into the warm temperate zone (e.g., B. grandis Dryand.) [2]. Members of this genus display a wide range of phenotypic diversity and possess significant ornamental value, with some species also having medicinal properties. B. fimbristipula Hance (2n = 22) is indigenous to southeastern China, specifically Zhejiang, Jiangxi, Hunan, Fujian, Guangdong, Guangxi, Hainan, and Hong Kong [3], with a subspecies endemic to Thailand (B. fimbristipula subsp. siamensis Phutthai & Radbouch.) [4]. This species typically has a solitary leaf and grows on rock or soil slopes in forest areas (Fig. 1). It holds considerable economic significance as both a medicinal and food source; for example, the essential oil derived from B. fimbristipula has shown inhibitory effects against Streptococcus iniae Pier in tilapia [5]. However, this species faces threats from climate change and human activities, particularly the local demand for its use in herbal tea. This species was designated as a protected wild species by Guangdong Province, China, in 2023.

Fig. 1 Photographs of the plant and fruits.

Despite the utilization of complete genomes for plant conservation over the past few decades, only four Begonia genomes have been published to date: B. loranthoides Hook.f., B. masoniana Irmsch. ex Ziesenh., B. darthvaderiana C.W.Lin & C.I Peng, and B. peltatifolia Li [6]. Therefore, the generation of a high-quality genome of B. fimbristipula is essential for promoting its conservation and utilization, as well as for elucidating species relationships and evolutionary histories within this megadiverse genus.

In this study, we assembled and annotated the genome of B. fimbristipula using Oxford Nanopore Technology (ONT) reads, next-generation sequencing (NGS) reads, high-throughput chromosome conformation capture (Hi-C) reads, and RNA-seq reads. The assembled genome has a total size of 462.11 Mb and a scaffold N50 of 38.22 Mb. Annotation of repeat elements revealed that 64.63% (274.62 Mb) of the genome comprises repeat elements, with long terminal repeats (LTRs) accounting for 53.85% (146.60 Mb). Our analyses predicted a total of 25,563 protein-coding genes and 274 tRNAs. The high-quality genome of B. fimbristipula will advance our understanding of the evolutionary relationships within the genus Begonia and contribute to the conservation of this economically valuable species.

Methods

Sample collection and sequencing

Samples of B. fimbristipula were collected from Dinghu Mountain in Zhaoqing city, Guangdong Province, China (23°10′48″ N, 112°31′53″ E). Tissues, including fresh young leaves and fruits, were frozen in liquid nitrogen immediately after collection and subsequently stored at −80 °C for DNA and RNA extraction. A voucher specimen (ID: gexj230012) has been deposited in the herbarium of the South China Botanical Garden, Chinese Academy of Sciences (IBSC).

Genomic DNA extraction and sequencing were performed according to the protocols described in our previous study [7]. Specifically, total DNA was extracted using the Grandomics Genomic DNA Kit (GrandOmics Biosciences, Wuhan, China). DNA degradation was assessed by electrophoresis on a 0.75% gel, while DNA purity and concentration were evaluated using a NanoDrop One UV–Vis spectrophotometer and a Qubit 3.0 fluorometer (Thermo Fisher Scientific, Waltham, MA, USA), respectively. For NGS sequencing, a short-read library (2 × 150 bp) with an insert size of 200–300 bp was prepared using the TruSeq Nano DNA HT Sample Preparation Kit, and sequencing was conducted on the Illumina HiSeq X Ten platform (Illumina, San Diego, CA, USA), generating 143.04 Gb (~325×) of raw data (Table 1).

Table 1 Summary of the sequencing data.

For ONT long-read sequencing, the Nanopore library was prepared with the LSK109 Ligation Sequencing Kit, following the manufacturer’s instructions, and sequenced on a Nanopore PromethION sequencer (Oxford Nanopore Technologies, Oxford, UK). This process yielded approximately 125.68 Gb (~284×) of ONT data with a mean read length of 14.84 kb and a read N50 length of 23.43 kb (Table 1). For Hi-C sequencing, the library was constructed from cross-linked DNA after digestion, biotinylation, ligation, enrichment, shearing, blunt-end repair, and additional steps. The final Hi-C library was sequenced with paired-end 150 bp reads on the Illumina HiSeq X Ten platform, producing a total of 148.93 Gb (~338×) of Hi-C data (Table 1). For transcriptome sequencing, a paired-end RNA library (2 × 150 bp) was prepared according to the TruSeq RNA Library Preparation Kit instructions and sequenced on the Illumina HiSeq X Ten platform, yielding 34.48 Gb and 37.24 Gb of RNA-seq data for the leaf and fruit samples, respectively (Table 1). All sequencing was conducted at GrandOmics Co., Ltd (Wuhan, China).

Adapters in ONT data were trimmed using Porechop v0.2.4 [8]. For NGS, Hi-C, and RNA-seq data, adapters and low-quality reads were removed using fastp v0.23.3 [9].
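The fold-coverage figures quoted for each library follow from dividing data yield by genome size. The sketch below is a minimal illustration, assuming the depths were computed against the k-mer-based estimate of 439.94 Mb given under Genome size estimation; the text does not state which genome size was used, and the function name is illustrative.

```python
def sequencing_depth(yield_gb: float, genome_size_mb: float) -> float:
    """Return fold-coverage given data yield in Gb and genome size in Mb."""
    if genome_size_mb <= 0:
        raise ValueError("genome size must be positive")
    return yield_gb * 1000.0 / genome_size_mb


# Assumption: depth is measured against the 439.94 Mb k-mer estimate.
GENOME_SIZE_MB = 439.94
yields_gb = {"Illumina": 143.04, "ONT": 125.68, "Hi-C": 148.93}

for library, gb in yields_gb.items():
    print(f"{library}: {sequencing_depth(gb, GENOME_SIZE_MB):.1f}x")
```

Under this assumption the Illumina and Hi-C values land close to the reported ~325× and ~338×, while the ONT value comes out slightly above the reported ~284×, so a different denominator or a filtered yield may have been used for that library.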

Genome size estimation

The k-mer frequency distribution was calculated from the NGS data using Jellyfish v2.3.0 (-m 21) [10]. The resulting file was then used to predict genome features with GenomeScope v1.0 [11], with the k-mer size and read length set to 21 and 150, respectively. Based on this analysis, the genome of B. fimbristipula was estimated to be 439.94 Mb, with a heterozygosity rate of 1.45% (Fig. 2).

Fig. 2 Genome survey of Begonia fimbristipula based on 21-mer analysis.
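As a rough illustration of the idea behind k-mer-based genome size estimation, the sketch below divides the total number of k-mers by the depth of the main peak in a k-mer histogram. This is a simplified stand-in, not the mixture-model fit that GenomeScope performs; the histogram values, the error cutoff, and the function name are all made up for illustration.

```python
from typing import Dict


def estimate_genome_size(histogram: Dict[int, int], min_depth: int = 5) -> float:
    """Estimate genome size as (total k-mers) / (main peak depth).

    histogram maps k-mer depth -> number of distinct k-mers at that depth,
    i.e. the two columns of a k-mer histogram file.
    Depths below min_depth are treated as sequencing errors and ignored.
    """
    kept = {d: n for d, n in histogram.items() if d >= min_depth}
    if not kept:
        raise ValueError("no k-mers above the error cutoff")
    peak_depth = max(kept, key=kept.get)  # depth with the most distinct k-mers
    total_kmers = sum(d * n for d, n in kept.items())
    return total_kmers / peak_depth


# Toy histogram (hypothetical): error k-mers at depth 1-2, main peak at depth 20.
toy = {1: 900, 2: 300, 18: 50, 19: 80, 20: 100, 21: 80, 22: 50}
print(estimate_genome_size(toy))
```

For a heterozygous genome, the histogram typically shows a second peak at about half the main depth, which is one reason a model-based tool is preferable to this shortcut.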

Chromosome-level genome assembly

The genome was assembled using NextDenovo v2.5.1 [12], which has been shown to outperform other assemblers when applied to highly repetitive and heterozygous genomes with ONT data [13]. The assembled haploid draft genome was 756.92 Mb, consisting of 980 contigs with a contig N50 length of 6.02 Mb. ONT reads were subsequently aligned to the draft genome using minimap2 v2.24-r1122 [14]. The aligned BAM file was processed with Purge Haplotigs v1.1.2 (-l 10 -m 110 -h 300) [15] to remove haplotypic duplications. After purging, 197 contigs were retained, totaling 460.91 Mb. The purged genome assembly was polished for two rounds with Racon v1.5.0 [16] using ONT data, followed by two rounds of polishing with Polypolish v0.5.0 [17] using NGS data. Hi-C reads were used to scaffold the polished genome assembly with Juicer v1.6 [18] and 3d-dna v180922 [19] (using the ‘-r 0’ option), with manual adjustments made in Juicebox v1.11.08 [20]. Gaps in the genome assembly were filled using TGS-GapCloser v1.1.1 [21] with ONT long reads. Initially, 63 gaps were present; after gap-filling, only three remained, on chromosomes 6, 7, and 8. The gap-filled genome was then polished for two additional rounds with Racon and Polypolish, as described above.
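The contig and scaffold N50 values reported in this section follow the usual definition: the length L such that sequences of length at least L together contain at least half of the total assembly. A minimal sketch with made-up lengths (the function name and toy data are illustrative, not taken from the study):

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no sequence lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for non-empty input


# Toy contig lengths (hypothetical); total = 300, so half = 150.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))
```

Because N50 depends only on the sorted length distribution, removing many short haplotigs while keeping the long contigs can raise it without any change to the long sequences themselves.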

The final genome assembly was 462.11 Mb in length, consisting of 67 scaffolds with a scaffold N50 length of 38.22 Mb. A total of 91.96% (424.94 Mb) of the sequences were anchored to 11 pseudochromosomes. The lengths of individual chromosomes ranged from 23.65 Mb (chr3) to 74.55 Mb (chr8) (Fig. 3a). The circular plot of chromosomes and the Hi-C interaction heatmap were generated with circos v0.69-9 [22] and HiCExplorer v3 [23], respectively. Tandem repeats were identified using Tandem Repeats Finder v4.09 [24] in quarTeT v1.2.2 [25]. The counts of tandem repeats varied from 11,339 on chr1 to 74,946 on chr8. The GC content of the genome was calculated using bedtools v2.30.0 [26], revealing that individual chromosomes had GC contents ranging from 37.39% (chr7) to 38.63% (chr3), with an overall mean of 38.00%.

Fig. 3 Chromosome features of Begonia fimbristipula. (a) The tracks from outer to inner (I–VII) represent the chromosome, tandem repeat density, Gypsy density, Copia density, GC content, gene density, and sequence synteny within the genome, respectively (window size = 700 kb). (b) Hi-C interaction heatmap (bin size = 10 kb).
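Per-chromosome GC content of the kind reported for this assembly can be computed directly from a FASTA file. The sketch below uses only the Python standard library rather than bedtools, so it illustrates the calculation and not the authors' exact procedure; the function names and toy sequences are assumptions. Ambiguous bases (N) are excluded from the denominator, which is one common convention.

```python
import io
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(handle: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def gc_percent(sequence: str) -> float:
    """Return GC content in percent, ignoring bases other than A, C, G, T."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        return 0.0
    return 100.0 * gc / (gc + at)


def gc_by_sequence(handle: Iterable[str]) -> Dict[str, float]:
    """Map each FASTA record name to its GC percentage."""
    return {name: gc_percent(seq) for name, seq in read_fasta(handle)}


# Toy FASTA (hypothetical sequences, not from the assembly).
toy_fasta = io.StringIO(">chr1 toy\nATGCATGC\nNNNN\n>chr2\nGGCCAT\n")
print(gc_by_sequence(toy_fasta))
```

To run this on a real assembly, pass an open file handle instead of the `io.StringIO` object.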

Repeat and gene annotation

The Extensive de novo TE Annotator (EDTA) v2.1.0 [27] was employed to identify transposable elements. A total of 398,385 repetitive sequences were identified, representing 64.63% (274.62 Mb) of the genome (Table 2). Among these repeats, LTRs were the most prevalent, constituting 53.85% (146.60 Mb) of the genome (Table 2). Gypsy elements (28.43%) were the predominant LTRs, followed by Copia elements (21.50%). The total lengths of terminal inverted repeats (TIRs) and non-LTR elements were 5.32 Mb and 1.87 Mb, respectively, accounting for 6.27% and 1.32% of the genome (Table 2).

Table 2 Summary of repeat classes identified by EDTA.

The transposable element (TE) library generated by EDTA served as input for RepeatMasker v4.1.2 [28] to produce a soft-masked genome. Gene prediction and functional annotation for the soft-masked genome were performed using de novo, homology-based (protein), and transcriptome-based methods via the funannotate pipeline v1.8.15 [29]. RNA-seq data were used to train the gene prediction models with the ‘funannotate train’ function. Subsequently, Augustus v3.5.0 [30], GeneMark-ET v4.72 [31], GlimmerHMM v3.0.1 [32], and SNAP v2013-02-16 [33] were employed for protein-coding gene prediction via the ‘funannotate predict’ function. At this stage, the protein-coding sequences of B. loranthoides [34], B. masoniana [35], B. darthvaderiana [36], and B. peltatifolia [37] were obtained from the China National GeneBank DataBase and used as protein evidence. tRNAs were annotated using tRNAscan-SE v2.0.11 [38]. Subsequently, the gene model predictions were refined and untranslated regions (UTRs) were added using the ‘funannotate update’ function. For functional annotation, the predicted genes were queried against public databases, including pfam v32.0, gene2product v1.45, interpro v76.0, dbCAN v8.0, busco_outgroups v1.0, merops v12.0, mibig v1.4, go v2023-05-10, repeats v1.0, uniprot v2023_02, and eggNOG v5.0, using the InterProScan v5.62-94.0 [39] and EggNOG-mapper v2.1.11 [40] pipelines. The functional annotations were further processed using the ‘funannotate annotate’ function. In total, 25,563 genes encoding 27,671 proteins were predicted, with an average gene length of 3,281 bp. Furthermore, 274 tRNAs were annotated. Among the protein-coding genes, 24,871 (97.29%) were functionally annotated by the eggNOG database, while 22,443 (87.79%) were annotated by InterProScan.

Synteny blocks were identified using jcvi v1.3.8 [41].
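Summary statistics such as the gene count and average gene length can be recomputed from an annotation file. The following is a minimal sketch, assuming a standard GFF3 file in which gene features carry `gene` in column 3 and 1-based inclusive start and end coordinates in columns 4 and 5; the function name and the toy records are hypothetical.

```python
import io
from typing import Iterable, Tuple


def gene_stats(gff_lines: Iterable[str]) -> Tuple[int, float]:
    """Return (gene count, mean gene length in bp) from GFF3 lines."""
    lengths = []
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments, and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue
        start, end = int(fields[3]), int(fields[4])
        lengths.append(end - start + 1)  # GFF3 coordinates are 1-based, inclusive
    if not lengths:
        return 0, 0.0
    return len(lengths), sum(lengths) / len(lengths)


# Toy annotation (hypothetical records): two genes of 300 bp and 100 bp.
toy_gff = io.StringIO(
    "##gff-version 3\n"
    "chr1\tdemo\tgene\t101\t400\t.\t+\t.\tID=g1\n"
    "chr1\tdemo\tmRNA\t101\t400\t.\t+\t.\tID=g1.t1;Parent=g1\n"
    "chr2\tdemo\tgene\t1\t100\t.\t-\t.\tID=g2\n"
)
print(gene_stats(toy_gff))
```

Only `gene` rows are counted, so alternative transcripts (mRNA rows) do not inflate the gene total; this matches the distinction in the text between genes and the larger number of encoded proteins.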

Data Records

The raw data, including ONT long reads, Illumina short reads, Hi-C reads, and RNA short reads, have been deposited in the Genome Sequence Archive in the National Genomics Data Center (NGDC), China National Center for Bioinformation (CNCB), under accession number CRA019543 and BioProject PRJCA031018 [42]. The final genome assembly, annotation, and protein-coding sequences are accessible via Figshare [43]. The genome assembly has been submitted to the National Center for Biotechnology Information (NCBI) under accession number JBIQHB000000000 and BioProject PRJNA1173897 [44].

Technical Validation

The completeness of the genome assembly was evaluated using Benchmarking Universal Single-Copy Orthologs (BUSCO) v5.3.2 [45] with the embryophyta_odb10.2020-09-10 database. Of the 1,614 core conserved plant genes evaluated, 90.3% were complete in B. fimbristipula (87.0% complete and single-copy, 3.3% complete and duplicated), 1.2% were fragmented, and 8.5% were missing. Genome quality was also evaluated by calculating the LTR assembly index (LAI) using the LAI program [46], yielding an LAI value of 17.73. Base accuracy was assessed with Merqury v1.3 [47] based on the NGS data, which gave a k-mer-based QV of 24.61 and a k-mer completeness of 61.41%. The Hi-C heatmap revealed that the 11 pseudochromosomes of B. fimbristipula exhibited strong interaction signals along the diagonals (Fig. 3b). ONT reads and RNA-seq reads were aligned to the final genome assembly using minimap2 v2.24-r1122 [14] and HISAT2 [48], respectively. The mapping rates of ONT and RNA-seq reads, calculated using the ‘stats’ function in bamtools v2.5.1 [49], were 89.30% and 87.16%, respectively. Moreover, the BUSCO completeness of the genome annotation was assessed with BUSCO v5.3.2, yielding a value of 84.6%. Overall, these metrics indicate that the genome assembly of B. fimbristipula is of high quality and well annotated.
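Two of the metrics in this section are easy to sanity-check by hand. A Phred-scaled consensus quality value (QV) relates to the per-base error probability by QV = −10·log10(error), and the BUSCO categories should sum to 100%. The sketch below applies both to the figures quoted in this section; the function names are illustrative.

```python
import math


def qv_to_error_rate(qv: float) -> float:
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate: float) -> float:
    """Convert a per-base error probability to a Phred-scaled quality value."""
    if not 0 < error_rate <= 1:
        raise ValueError("error rate must be in (0, 1]")
    return -10 * math.log10(error_rate)


# BUSCO percentages reported for the assembly.
busco = {"single": 87.0, "duplicated": 3.3, "fragmented": 1.2, "missing": 8.5}
complete = busco["single"] + busco["duplicated"]

print(f"QV 24.61 -> error rate {qv_to_error_rate(24.61):.5f}")
print(f"complete BUSCOs: {complete:.1f}%")
print(f"all categories: {sum(busco.values()):.1f}%")
```

A QV of 24.61 therefore corresponds to roughly 3.5 errors per 1,000 bases.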

Code availability

All software and pipelines used in this study were run following the guidelines of the published tools. The parameters and version numbers of software and databases are detailed in the Methods section. Any parameters not specified in the Methods were left at their defaults. No custom scripts were employed.

References

1. Moonlight, P. W. et al. Dividing and conquering the fastest–growing genus: Towards a natural sectional classification of the mega–diverse genus Begonia (Begoniaceae). Taxon 67, 267–323, https://doi.org/10.12705/672.3 (2018).

2. Kubitzki, K. The Families and Genera of Vascular Plants Vol. X, Flowering Plants - Eudicots - Sapindales, Cucurbitales, Myrtaceae. (Springer, 2011).

3. Wu, Z.-Y., Raven, P. H. & Hong, D.-Y. Flora of China. (Science Press, 1999).

4. Radbouchoom, S., Phutthai, T. & Schneider, H. Begonia fimbristipula subsp. siamensis (sect. Diploclinium, Begoniaceae), a new taxon of the megadiverse genus endemic to Thailand. PhytoKeys 218, 1–10, https://doi.org/10.3897/phytokeys.218.85699 (2023).

5. Yang, X. et al. Inhibitory activity of essential oil from Begonia fimbristipula against Streptococcus iniae in tilapia. Jiangxi Fishery Science and Technology 4, 11–14 (2018).

6. Li, L. et al. Genomes shed light on the evolution of Begonia, a mega-diverse genus. New Phytol. 234, 295–310, https://doi.org/10.1111/nph.17949 (2022).

7. Xiao, T.-W. et al. Chromosome-level genome assemblies of Musa ornata and Musa velutina provide insights into pericarp dehiscence and anthocyanin biosynthesis in banana. Hortic. Res. 11, uhae079, https://doi.org/10.1093/hr/uhae079 (2024).

8. Wick, R. R., Judd, L. M., Gorrie, C. L. & Holt, K. E. Completing bacterial genome assemblies with multiplex MinION sequencing. Microb. Genom. 3, e000132, https://doi.org/10.1099/mgen.0.000132 (2017).

9. Chen, S., Zhou, Y., Chen, Y. & Gu, J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34, i884–i890, https://doi.org/10.1093/bioinformatics/bty560 (2018).

10. Marçais, G. & Kingsford, C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 27, 764–770, https://doi.org/10.1093/bioinformatics/btr011 (2011).

11. Vurture, G. W. et al. GenomeScope: fast reference-free genome profiling from short reads. Bioinformatics 33, 2202–2204, https://doi.org/10.1093/bioinformatics/btx153 (2017).

12. Hu, J. et al. NextDenovo: an efficient error correction and accurate assembly tool for noisy long reads. Genome Biol. 25, 107, https://doi.org/10.1186/s13059-024-03252-4 (2024).

13. Sun, J., Li, R., Chen, C., Sigwart, J. D. & Kocot, K. M. Benchmarking Oxford Nanopore read assemblers for high-quality molluscan genomes. Philos. Trans. R. Soc. B, Biol. Sci. 376, 20200160, https://doi.org/10.1098/rstb.2020.0160 (2021).

14. Li, H. New strategies to improve minimap2 alignment accuracy. Bioinformatics 37, 4572–4574, https://doi.org/10.1093/bioinformatics/btab705 (2021).

15. Roach, M. J., Schmidt, S. A. & Borneman, A. R. Purge Haplotigs: allelic contig reassignment for third-gen diploid genome assemblies. BMC Bioinform. 19, 460, https://doi.org/10.1186/s12859-018-2485-7 (2018).

16. Vaser, R., Sovic, I., Nagarajan, N. & Sikic, M. Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res. 27, 737–746, https://doi.org/10.1101/gr.214270.116 (2017).

17. Wick, R. R. & Holt, K. E. Polypolish: Short-read polishing of long-read bacterial genome assemblies. PLoS Comp. Biol. 18, e1009802, https://doi.org/10.1371/journal.pcbi.1009802 (2022).

18. Durand, N. C. et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 3, 95–98, https://doi.org/10.1016/j.cels.2016.07.002 (2016).

19. Dudchenko, O. et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science 356, 92–95, https://doi.org/10.1126/science.aal3327 (2017).

20. Durand, N. C. et al. Juicebox provides a visualization system for Hi-C contact maps with unlimited zoom. Cell Syst. 3, 99–101, https://doi.org/10.1016/j.cels.2015.07.012 (2016).

21. Xu, M. et al. TGS-GapCloser: a fast and accurate gap closer for large genomes with low coverage of error-prone long reads. GigaScience 9, giaa094, https://doi.org/10.1093/gigascience/giaa094 (2020).

22. Krzywinski, M. et al. Circos: an information aesthetic for comparative genomics. Genome Res. 19, 1639–1645, https://doi.org/10.1101/gr.092759.109 (2009).

23. Wolff, J. et al. Galaxy HiCExplorer 3: a web server for reproducible Hi-C, capture Hi-C and single-cell Hi-C data analysis, quality control and visualization. Nucleic Acids Res. 48, W177–W184, https://doi.org/10.1093/nar/gkaa220 (2020).

24. Benson, G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27, 573–580, https://doi.org/10.1093/nar/27.2.573 (1999).

25. Lin, Y. et al. quarTeT: a telomere-to-telomere toolkit for gap-free genome assembly and centromeric repeat identification. Hortic. Res. 10, uhad127, https://doi.org/10.1093/hr/uhad127 (2023).

26. Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842, https://doi.org/10.1093/bioinformatics/btq033 (2010).

27. Ou, S. et al. Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline. Genome Biol. 20, 275, https://doi.org/10.1186/s13059-019-1905-y (2019).

28. Tarailo-Graovac, M. & Chen, N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr. Protoc. Bioinformatics 25, 4.10.1–4.10.14, https://doi.org/10.1002/0471250953.bi0410s25 (2009).

Zenodo

Zenodo

https://zenodo.org/records/4054262

https://zenodo.org/records/4054262

(2020).

(2020)。

Hoff, K. J. & Stanke, M. Predicting genes in single genomes with AUGUSTUS. Curr. Protoc. Bioinformatics 65, e57, https://doi.org/10.1002/cpbi.57 (2019).

Lomsadze, A., Ter-Hovhannisyan, V., Chernoff, Y. O. & Borodovsky, M. Gene identification in novel eukaryotic genomes by self-training algorithm. Nucleic Acids Res. 33, 6494–6506, https://doi.org/10.1093/nar/gki937 (2005).

Majoros, W. H., Pertea, M. & Salzberg, S. L. TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders. Bioinformatics 20, 2878–2879, https://doi.org/10.1093/bioinformatics/bth315 (2004).

Korf, I. Gene finding in novel genomes. BMC Bioinform. 5, 59, https://doi.org/10.1186/1471-2105-5-59 (2004).

China National GeneBank DataBase https://db.cngb.org/search/assembly/CNA0013974 (2021).

China National GeneBank DataBase https://db.cngb.org/search/assembly/CNA0013975 (2021).

China National GeneBank DataBase https://db.cngb.org/search/assembly/CNA0013973 (2021).

China National GeneBank DataBase https://db.cngb.org/search/assembly/CNA0013976 (2021).

Lowe, T. M. & Eddy, S. R. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25, 955–964, https://doi.org/10.1093/nar/25.5.955 (1997).

Jones, P. et al. InterProScan 5: genome-scale protein function classification. Bioinformatics 30, 1236–1240, https://doi.org/10.1093/bioinformatics/btu031 (2014).

Huerta-Cepas, J. et al. Fast genome-wide functional annotation through orthology assignment by eggNOG-Mapper. Mol. Biol. Evol. 34, 2115–2122, https://doi.org/10.1093/molbev/msx148 (2017).

Tang, H. et al. JCVI: A versatile toolkit for comparative genomics analysis. iMeta 3, e211, https://doi.org/10.1002/imt2.211 (2024).

National Genomics Data Center, China National Center for Bioinformation https://ngdc.cncb.ac.cn/gsa/browse/CRA019543 (2024).

Figshare https://doi.org/10.6084/m9.figshare.27247158 (2024).

NCBI GenBank https://identifiers.org/ncbi/insdc:JBIQHB000000000 (2024).

Manni, M., Berkeley, M. R., Seppey, M., Simão, F. A. & Zdobnov, E. M. BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol. Biol. Evol. 38, 4647–4654, https://doi.org/10.1093/molbev/msab199 (2021).

Ou, S., Chen, J. & Jiang, N. Assessing genome assembly quality using the LTR Assembly Index (LAI). Nucleic Acids Res. 46, e126–e126, https://doi.org/10.1093/nar/gky730 (2018).

Rhie, A., Walenz, B. P., Koren, S. & Phillippy, A. M. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biol. 21, 245, https://doi.org/10.1186/s13059-020-02134-9 (2020).

Kim, D., Paggi, J. M., Park, C., Bennett, C. & Salzberg, S. L. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. 37, 907–915, https://doi.org/10.1038/s41587-019-0201-4 (2019).

Barnett, D. W., Garrison, E. K., Quinlan, A. R., Strömberg, M. P. & Marth, G. T. BamTools: a C++ API and toolkit for analyzing and managing BAM files. Bioinformatics 27, 1691–1692, https://doi.org/10.1093/bioinformatics/btr174 (2011).


Acknowledgements

This work was supported by the Science & Technology Fundamental Resources Investigation Program (Grant No. 2022FY100500).

Author information

Authors and Affiliations

Key Laboratory of National Forestry and Grassland Administration on Plant Conservation and Utilization in Southern China, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China

Tian-Wen Xiao, Zheng-Feng Wang & Hai-Fei Yan

South China National Botanical Garden, Guangzhou, 510650, China

Tian-Wen Xiao, Zheng-Feng Wang & Hai-Fei Yan

Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China

Zheng-Feng Wang & Hai-Fei Yan

Key Laboratory of Vegetation Restoration and Management of Degraded Ecosystems, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China

Zheng-Feng Wang

State Key Laboratory of Plant Diversity and Specialty Crops, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China

Hai-Fei Yan


Contributions

H.F.Y. and Z.F.W. designed this study, collected the samples, and revised the manuscript. T.W.X. performed data analyses and wrote the manuscript. All authors approved the final manuscript for publication.

Corresponding authors

Correspondence to Zheng-Feng Wang or Hai-Fei Yan.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access

This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material.

You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.


About this article

Cite this article

Xiao, TW., Wang, ZF. & Yan, HF. A chromosomal-level genome assembly of Begonia fimbristipula (Begoniaceae). Sci Data 12, 429 (2025). https://doi.org/10.1038/s41597-025-04768-5


Received: 08 November 2024

Accepted: 06 March 2025

Published: 12 March 2025

DOI: https://doi.org/10.1038/s41597-025-04768-5
