Abstract
Single-cell RNA sequencing (scRNA-seq) technologies have become essential tools for characterizing cellular landscapes within complex tissues. Large-scale single-cell transcriptomics holds great potential for identifying rare cell types critical to the pathogenesis of diseases and biological processes.
Existing methods for identifying rare cell types often rely on one-time clustering using partial or global gene expression. However, these rare cell types may be overlooked during the clustering phase, posing challenges for their accurate identification. In this paper, we propose a Cluster decomposition-based Anomaly Detection method (scCAD), which iteratively decomposes clusters based on the most differential signals in each cluster to effectively separate rare cell types and achieve accurate identification.
We benchmark scCAD on 25 real-world scRNA-seq datasets, demonstrating its superior performance compared to 10 state-of-the-art methods. In-depth case studies across diverse datasets, including mouse airway, brain, intestine, human pancreas, immunology data, and clear cell renal cell carcinoma, showcase scCAD’s efficiency in identifying rare cell types in complex biological scenarios.
Furthermore, scCAD can correct the annotation of rare cell types and identify immune cell subtypes associated with disease, thereby offering valuable insights into disease progression.
Introduction
Single-cell RNA sequencing (scRNA-seq) technologies have enabled researchers to analyze gene expression patterns at the single-cell level [1], thereby dissecting cellular heterogeneity [2] and providing new insights into the composition and function of cell types within complex tissues [3].
With the advancement of sequencing technology, larger datasets have become available [4], making it possible not only to characterize the major cell types but also to capture low-frequency cell types [5,6,7]. These rare cell types have low abundance, and their significant roles in disease pathogenesis and in biological processes such as angiogenesis and immune response mediation have been extensively validated.
For example, circulating tumor cells (CTCs) are very rare in peripheral blood, yet their metastasis is closely associated with cancer-related death. It is estimated that CTCs account for 1 or fewer cells in every \(10^{5}\)–\(10^{6}\) peripheral blood mononuclear cells (PBMCs) [8]. This scarcity poses a substantial challenge to their detection and characterization in cancer research [9,10].
Therefore, in addition to commonly used tools such as Seurat [11] that comprehensively identify major cell types, developing specialized methods to accurately and effectively identify and characterize these rare cell types has become a major challenge in single-cell research.
Prominent algorithms used in recent years for the identification and analysis of rare cell types include Finder of Rare Entities (FiRE) [12], CellSIUS [13], Ensemble method for simultaneous Dimensionality reduction and feature Gene Extraction (EDGE) [14], GapClust [15], the GiniClust series [16,17,18], the RaceID series [19,20], SCISSORS [21], CIARA [22], and surprisal component analysis (SCA) [23].
These methods identify rare cells from four.
1. The ensemble feature selection process involves selecting highly variable genes [11] and highly discriminative genes [27]. Specifically, scCAD calculates the mean and a dispersion measure (variance/mean) for each gene across all single cells and selects the 2,000 genes that are most variable relative to genes with similar average expression.
At the same time, a random forest model is trained using the preprocessed gene expression matrix and cluster labels, and the importance of each gene is calculated based on the Gini impurity obtained from a set of decision trees [26]. Next, scCAD selects the 2,000 genes with the highest importance. Finally, the union of the two selections is retained for subsequent analysis.
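The highly-variable-gene half of this step, and the final union, can be sketched in plain Python. This is an illustration under simplifying assumptions rather than scCAD's implementation: genes are ranked by the dispersion measure (variance/mean) alone, the comparison against genes with similar average expression is omitted, and the random-forest selection is taken as a given input list. The names `dispersion`, `top_dispersion_genes` and `ensemble_gene_set` are hypothetical.

```python
from statistics import mean, pvariance

def dispersion(values):
    """Dispersion measure (variance / mean) of one gene across all cells."""
    m = mean(values)
    return pvariance(values) / m if m > 0 else 0.0

def top_dispersion_genes(expr, k):
    """expr maps gene name -> list of expression values (one per cell).
    Returns the k genes with the largest dispersion."""
    ranked = sorted(expr, key=lambda g: dispersion(expr[g]), reverse=True)
    return ranked[:k]

def ensemble_gene_set(variable_genes, discriminative_genes):
    """Union of the two selections, kept for downstream analysis."""
    return sorted(set(variable_genes) | set(discriminative_genes))

# Toy data: three genes measured in four cells.
expr = {
    "geneA": [0, 0, 0, 12],  # expressed in a single cell: high dispersion
    "geneB": [3, 3, 3, 3],   # constant: zero dispersion
    "geneC": [1, 2, 3, 2],   # mild variation
}
hvg = top_dispersion_genes(expr, k=2)
rf_genes = ["geneB"]  # placeholder for the random-forest selection (assumed given)
selected = ensemble_gene_set(hvg, rf_genes)
```

Because the two selections can overlap, the union holds at most 4,000 genes when each selection contributes 2,000.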
2. Using the preprocessed expression matrix with selected genes, scCAD performs cell clustering and initially partitions \(n\) cells into several clusters. The set of clusters obtained from the initial clustering is defined as I-clusters (initial clusters). Then, scCAD iteratively decomposes each cluster containing more than \(R \,*\, n\) cells (\(R=1\%\) of the total number of cells) by re-clustering it with the Louvain method, until no new clusters are generated or until all clusters are smaller than \(R \,*\, n\).
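The control flow of the decomposition can be sketched as a loop over a work queue. This is a minimal sketch, not the authors' code: the clustering routine is passed in as a function, and the toy `split_at_largest_gap` used below is a stand-in for Louvain clustering that cuts one-dimensional values at their widest gap. All function and variable names are hypothetical.

```python
def decompose(clusters, n_total, cluster_fn, ratio=0.01):
    """Iteratively split every cluster larger than ratio * n_total.

    clusters   : list of lists of cell indices (the I-clusters)
    cluster_fn : callable taking a list of cell indices and returning a
                 list of sub-clusters (stand-in for Louvain clustering)
    Returns the D-clusters.
    """
    min_size = ratio * n_total
    queue = list(clusters)
    done = []
    while queue:
        cluster = queue.pop()
        if len(cluster) <= min_size:
            done.append(cluster)      # small enough: stop decomposing
            continue
        parts = [p for p in cluster_fn(cluster) if p]
        if len(parts) <= 1:
            done.append(cluster)      # clustering produced nothing new
        else:
            queue.extend(parts)       # revisit each part
    return done

# Toy one-dimensional "expression" value per cell.
values = [0.0, 0.1, 0.2, 0.3, 5.0, 5.1, 9.0, 9.1, 9.2, 9.3]

def split_at_largest_gap(cells, min_gap=1.0):
    """Toy clustering: cut a cluster at its widest gap, if wide enough."""
    order = sorted(cells, key=lambda cell: values[cell])
    gaps = [values[b] - values[a] for a, b in zip(order, order[1:])]
    if not gaps or max(gaps) < min_gap:
        return [order]
    cut = gaps.index(max(gaps)) + 1
    return [order[:cut], order[cut:]]

d_clusters = decompose([list(range(10))], n_total=10,
                       cluster_fn=split_at_largest_gap, ratio=0.2)
```

The loop ends because every accepted split yields strictly smaller clusters, and a cluster is finalized as soon as it is small enough or cannot be split further.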
The set of clusters obtained from cluster decomposition is defined as D-clusters (decomposed clusters). Subsequently, clusters in D-clusters are merged based on a threshold to form the final cluster set (M-clusters). Specifically, scCAD first determines the center of each cluster. The center of cluster \(i\) is calculated as \(\mathbf{V}_{\mathbf{i}}=\left(\frac{1}{N_{i}}\sum_{j=1}^{N_{i}}x_{j,1}^{i},\ \frac{1}{N_{i}}\sum_{j=1}^{N_{i}}x_{j,2}^{i},\ \ldots,\ \frac{1}{N_{i}}\sum_{j=1}^{N_{i}}x_{j,g}^{i}\right)\), where \(\mathbf{V}_{\mathbf{i}}\) is a vector of length \(g\), \(g\) is the number of selected genes, \(N_{i}\) is the number of cells in cluster \(i\), and \(x_{j,k}^{i}\) is the expression value of gene \(k\) in cell \(j\) belonging to cluster \(i\).
Then, scCAD calculates the Euclidean distance between all pairs of cluster centers: \(D\left(\mathbf{V}_{\mathbf{i}},\mathbf{V}_{\mathbf{j}}\right)=\left\Vert \mathbf{V}_{\mathbf{i}}-\mathbf{V}_{\mathbf{j}}\right\Vert_{2}=\sqrt{\sum_{k=1}^{g}{\left(V_{i,k}-V_{j,k}\right)}^{2}}\), where \(\mathbf{V}_{\mathbf{i}}\) and \(\mathbf{V}_{\mathbf{j}}\) are the centers of clusters \(i\) and \(j\), respectively, and \(V_{i,k}\) is the \(k\)-th component of \(\mathbf{V}_{\mathbf{i}}\).
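Finally, scCAD determines the merging threshold, \({THM}=\mbox{median}({d}_{1},\, {d}_{2},\, \ldots,\, {d}_{m})\), where \({d}_{i}\) is the Euclidean distance between cluster \(i\) and its nearest neighboring cluster; clusters whose distance is smaller than \({THM}\) are merged.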
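The merging criterion can be sketched as follows, taking \({THM}\) to be the median of each cluster's distance to its nearest neighboring cluster, as defined in the parameter-selection paragraph. The sketch computes cluster centers, the threshold, and the pairs of clusters that fall below it. How qualifying pairs are combined into the final M-clusters is not specified in this section, so the code stops at listing candidate pairs. Function names are hypothetical.

```python
from math import dist
from statistics import median

def cluster_center(cells):
    """Mean expression vector of a cluster (cells: equal-length vectors)."""
    n = len(cells)
    return [sum(column) / n for column in zip(*cells)]

def merge_threshold(centers):
    """THM: median over clusters of the distance to the nearest other center."""
    nearest = []
    for i, ci in enumerate(centers):
        nearest.append(min(dist(ci, cj)
                           for j, cj in enumerate(centers) if j != i))
    return median(nearest)

def pairs_below_threshold(centers):
    """Pairs of clusters whose center distance is smaller than THM."""
    thm = merge_threshold(centers)
    return [(i, j)
            for i in range(len(centers))
            for j in range(i + 1, len(centers))
            if dist(centers[i], centers[j]) < thm]

# Four toy clusters in a two-gene space.
d_clusters = [
    [[0, 0], [0, 2]],    # center (0, 1)
    [[1, 1], [1, 1]],    # center (1, 1)
    [[10, 0], [10, 2]],  # center (10, 1)
    [[14, 1]],           # center (14, 1)
]
centers = [cluster_center(c) for c in d_clusters]
thm = merge_threshold(centers)
candidates = pairs_below_threshold(centers)
```

In this toy case the nearest-neighbor distances are 1, 1, 4 and 4, so the threshold is 2.5 and only the first two clusters qualify for merging.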
3. For each gene in cluster \(i\) in M-clusters, scCAD calculates the difference between the median expression of the gene across all cells within cluster \(i\) and its median expression across all cells outside cluster \(i\) [93]. Let \(X_{C_{i}}^{k}\) denote the vector of expression values of gene \(k\) for all cells in cluster \(i\), and let \(X_{C_{i}^{\prime}}^{k}\) denote the corresponding vector for all cells outside cluster \(i\). The median difference of gene \(k\) is calculated as \(d_{k}=\left|\mbox{median}\left(X_{C_{i}}^{k}\right)-\mbox{median}\left(X_{C_{i}^{\prime}}^{k}\right)\right|\).
Finally, scCAD selects the 20 genes with the largest differences to form the candidate gene set \(S_{i}\) for cluster \(i\).
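The median-difference ranking in step 3 can be illustrated with a small sketch. It assumes expression values are stored as one list per gene and that cluster membership is given as a boolean mask; the function name `candidate_genes` is hypothetical, and both the cluster and its complement must be non-empty.

```python
from statistics import median

def candidate_genes(expr, in_cluster, top=20):
    """Rank genes by |median(inside cluster) - median(outside cluster)|.

    expr       : dict mapping gene -> list of expression values, one per cell
    in_cluster : list of booleans, True for cells belonging to cluster i
    Returns (the `top` highest-ranked genes, all median differences).
    """
    diffs = {}
    for gene, values in expr.items():
        inside = [v for v, flag in zip(values, in_cluster) if flag]
        outside = [v for v, flag in zip(values, in_cluster) if not flag]
        diffs[gene] = abs(median(inside) - median(outside))
    ranked = sorted(diffs, key=diffs.get, reverse=True)
    return ranked[:top], diffs

# Toy data: six cells, the first three form cluster i.
expr = {
    "g1": [5, 6, 7, 0, 0, 1],  # high inside the cluster
    "g2": [1, 1, 1, 1, 2, 1],  # uninformative
    "g3": [0, 0, 1, 3, 4, 5],  # low inside the cluster
}
in_cluster = [True, True, True, False, False, False]
s_i, d = candidate_genes(expr, in_cluster, top=2)
```

Because the absolute difference is used, genes that are specifically depleted in the cluster (such as `g3` here) rank alongside genes that are specifically enriched.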
4. The gene expression matrix restricted to the genes in \(S_{i}\) is then fed into an isolation forest model [28] to calculate an anomaly score for each cell. The isolation forest model builds a collection of isolation trees; each tree is constructed by randomly selecting a subset of cells and recursively partitioning them into smaller subsets based on their expression of randomly selected candidate genes.
This process continues until each cell is isolated in its own leaf node. The anomaly score is computed by normalizing the average path length of each cell, that is, by comparing it to the average path length of a randomly generated cell from the same dataset. The resulting score represents the degree of abnormality of each cell with respect to the candidate genes.
As described in ref. [28], the ensemble anomaly score of cell \(j\) based on the candidate genes in \(S_{i}\) is calculated as:
$${AS}_{j}^{S_{i}}={2}^{-\frac{E(h(j))}{c(n)}} \tag{1}$$
where \(h(j)\) is the path length of cell \(j\) in an isolation tree, that is, the number of edges traversed from the root node to the node containing cell \(j\); \(E(h(j))\) is the average of \(h(j)\) across all isolation trees in the isolation forest model; and \(c(n)\) is the average path length when the total number of cells is \(n\), given by [28]:
$$c\left(n\right)=\left\{\begin{array}{ll}2H\left(n-1\right)-\frac{2\left(n-1\right)}{n} & n \, > \, 2\\ 1 & n=2\\ 0 & \mbox{otherwise}\end{array}\right. \tag{2}$$
where \(H(n-1)\) is the harmonic number, which can be estimated by \(\ln(n-1)+0.5772156649\) (the constant being Euler’s constant) [28].
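Equations (1) and (2) can be exercised with a small, self-contained isolation forest. The sketch below is a simplified illustration, not scCAD's implementation. Every tree is grown on all cells rather than a random subset, a node becomes a leaf when the random cut fails to separate its cells, and a leaf holding several cells adds \(c(\text{size})\) to the path length, as in the original isolation forest algorithm. The names `build_tree`, `path_length` and `forest_scores` are hypothetical.

```python
import math
import random

EULER_GAMMA = 0.5772156649  # Euler's constant, as quoted in the text

def c(n):
    """Average path length c(n), Eq. (2)."""
    if n > 2:
        harmonic = math.log(n - 1) + EULER_GAMMA  # estimate of H(n - 1)
        return 2 * harmonic - 2 * (n - 1) / n
    return 1.0 if n == 2 else 0.0

def anomaly_score(mean_path_length, n):
    """Anomaly score 2 ** (-E(h(j)) / c(n)), Eq. (1)."""
    return 2 ** (-mean_path_length / c(n))

def build_tree(rows, rng):
    """Isolation tree: split on a random gene at a random cut point."""
    if len(rows) > 1:
        gene = rng.randrange(len(rows[0]))
        lo = min(r[gene] for r in rows)
        hi = max(r[gene] for r in rows)
        cut = rng.uniform(lo, hi)
        left = [r for r in rows if r[gene] < cut]
        right = [r for r in rows if r[gene] >= cut]
        if left and right:
            return {"gene": gene, "cut": cut,
                    "left": build_tree(left, rng),
                    "right": build_tree(right, rng)}
    return {"size": len(rows)}  # leaf, possibly holding unsplit cells

def path_length(row, node, depth=0):
    """Edges from the root to the leaf reached by `row`."""
    if "size" in node:
        return depth + c(node["size"])  # adjustment for multi-cell leaves
    branch = "left" if row[node["gene"]] < node["cut"] else "right"
    return path_length(row, node[branch], depth + 1)

def forest_scores(rows, n_trees=100, seed=0):
    """Anomaly score of every row, averaged over n_trees isolation trees."""
    rng = random.Random(seed)
    trees = [build_tree(rows, rng) for _ in range(n_trees)]
    n = len(rows)
    return [anomaly_score(sum(path_length(r, t) for t in trees) / n_trees, n)
            for r in rows]

# Thirty similar cells plus one clear outlier in a two-gene space.
rows = [[0.1 * i, 0.1 * (i % 5)] for i in range(30)] + [[50.0, 50.0]]
scores = forest_scores(rows)
```

A cell whose average path length equals \(c(n)\) scores exactly 0.5. Shorter paths push the score toward 1, which is why the isolated outlier in the toy data receives the highest score.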
5. scCAD assigns an independence score to cluster \(i\) in M-clusters based on the list of anomaly scores of all cells: \(\{{AS}_{1}^{S_{i}},\, {AS}_{2}^{S_{i}},\, \ldots,\, {AS}_{n}^{S_{i}}\}\). The independence score (IS) of cluster \(i\) is defined as:
$$\mbox{IS}_{i}=\frac{\left|T_{N_{i}}\cap C_{i}\right|}{N_{i}} \tag{3}$$
where \(N_{i}\) is the number of cells in cluster \(i\), \(T_{N_{i}}\) is the set of the \(N_{i}\) cells with the highest anomaly scores, and \(C_{i}\) is the set of cells in cluster \(i\) in M-clusters. A higher independence score indicates that the differentially expressed genes of the cluster effectively distinguish and characterize the cells it contains.
6. scCAD executes steps 3–5 for each cluster in M-clusters to obtain the independence scores of all clusters: \(\{\mbox{IS}_{1},\, \mbox{IS}_{2},\, \ldots,\, \mbox{IS}_{m^{\prime}}\}\). Finally, clusters with an independence score exceeding the threshold \(I\) (default 0.7), that is, \(\{C_{i}\in \mbox{M-clusters} \mid \mbox{IS}_{i} \, > \, I\}\), are predicted as rare cell types and are output along with their candidate genes.
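Steps 5 and 6 reduce to a set intersection followed by a threshold test, which the following sketch illustrates. It assumes the per-cell anomaly scores for each cluster have already been computed with that cluster's candidate genes. Ties between equal scores are broken by cell index, which is an arbitrary choice made for this example. The function names are hypothetical.

```python
def independence_score(anomaly_scores, cluster_cells):
    """IS_i = |T_{N_i} ∩ C_i| / N_i.

    anomaly_scores : list of scores indexed by cell
    cluster_cells  : set of indices of the cells in cluster i
    """
    n_i = len(cluster_cells)
    ranked = sorted(range(len(anomaly_scores)),
                    key=lambda j: anomaly_scores[j], reverse=True)
    top = set(ranked[:n_i])  # the N_i most anomalous cells
    return len(top & set(cluster_cells)) / n_i

def rare_clusters(scores_by_cluster, clusters, threshold=0.7):
    """Clusters whose independence score exceeds the threshold I."""
    rare = {}
    for name, cells in clusters.items():
        score = independence_score(scores_by_cluster[name], cells)
        if score > threshold:
            rare[name] = score
    return rare

# Eight cells in two clusters; each score list was (hypothetically)
# computed with the candidate genes of the corresponding cluster.
clusters = {"A": {0, 1, 2, 3}, "B": {4, 5, 6, 7}}
scores_by_cluster = {
    "A": [0.90, 0.85, 0.40, 0.80, 0.30, 0.45, 0.35, 0.50],
    "B": [0.50, 0.52, 0.48, 0.51, 0.49, 0.53, 0.47, 0.46],
}
result = rare_clusters(scores_by_cluster, clusters)
```

Cluster A has three of its four cells among the four most anomalous cells (IS = 0.75) and is reported as rare, whereas cluster B has only one (IS = 0.25) and is not.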
Parameter value selection in scCAD
Clusters containing more than \(R \,*\, n\) cells are considered for decomposition through iterative clustering. Based on a comprehensive review of previous studies [18,94,95] defining the size of rare cell types, we set this parameter to 1% on all larger datasets. For smaller datasets, especially when the total number of cells is below 3000, we set the threshold to \(\frac{30}{n}\) to prevent the generation of excessively small clusters, thereby enhancing the interpretability and reliability of clustering outcomes.
After cluster decomposition, scCAD merges clusters if their distance is smaller than the threshold \({THM}\), defined as \({THM}=\mbox{median}({d}_{1},\, {d}_{2},\, \ldots,\, {d}_{m})\), where \({d}_{i}\) is the Euclidean distance between cluster \(i\) and its nearest neighboring cluster.
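The size rule for decomposition can be written as a short function. The sketch assumes that "smaller datasets" means strictly fewer than 3000 cells; the text presents that figure as a typical case rather than a hard cutoff, so the boundary used here is an assumption. The function and parameter names are hypothetical.

```python
def decomposition_ratio(n_cells, default_ratio=0.01,
                        small_dataset=3000, min_cells=30):
    """R: 1% for larger datasets, 30 / n for smaller ones."""
    if n_cells < small_dataset:
        return min_cells / n_cells
    return default_ratio

def decomposition_size(n_cells):
    """Cluster size (R * n) above which a cluster is decomposed further."""
    return decomposition_ratio(n_cells) * n_cells

size_large = decomposition_size(10000)  # 1% of 10,000 cells
size_small = decomposition_size(2000)   # floor of 30 cells
```

At exactly 3000 cells the two rules agree, since 1% of 3000 is 30 cells, so the size threshold never drops below 30 cells.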
We test the number of clusters generated after merging and the average proportions of all cell types and rare cell types in their dominant clusters by using different \({THM}\) values in the Arc-ME dataset (Supplementary Fig. 22). As shown in Supplementary Fig. 22, lower \({THM}\) values (such as zero and the lower quartile) may incur higher computational overhead due to a larger number of analyzed clusters, while higher \({THM}\) values (such as the upper quartile and the 90th percentile) may significantly increase the likelihood of merging clusters dominated by rare cell types, potentially diminishing the effectiveness of the decomposition step.
To enhance efficiency and reduce the number of analyzed clusters, we use the median as the default parameter across all datasets.
scCAD identifies a cluster as rare when its independence score exceeds a threshold value, \(I\). We display the distribution of independence scores calculated b.
Data availability
The details of the datasets used in this study are reported in Supplementary Table 1. All described datasets are obtained from public websites under the accession codes provided in Supplementary Table 1, including NCBI Gene Expression Omnibus (GEO) [https://www.ncbi.nlm.nih.gov/geo/], ArrayExpress [https://www.ebi.ac.uk/arrayexpress/], and Sequence Read Archive (SRA) [https://www.ncbi.nlm.nih.gov/sra].
10X PBMC is obtained from GitHub [https://github.com/ttgump/scDeepCluster/blob/master/scRNA-seq%20data/10X_PBMC.h5]. The 68k PBMC and Jurkat datasets are obtained from the 10x Genomics website ([https://www.10xgenomics.com/datasets/fresh-68-k-pbm-cs-donor-a-1-standard-1-1-0], [https://www.10xgenomics.com/datasets/50-percent-50-percent-jurkat-293-t-cell-mixture-1-standard-1-1-0]).
The worm neuron cells dataset Cao is sampled from a dataset obtained from the sci-RNA-seq platform (single-cell combinatorial indexing RNA sequencing) [http://atlas.gs.washington.edu/worm-rna/docs/]. The preprocessed human tonsil data, named Tonsil, and Crohn data are available from Broad Institute Single Cell Portal ([https://singlecell.broadinstitute.org/single_cell/study/SCP2169/slide-tags-snrna-seq-on-human-tonsil], [https://singlecell.broadinstitute.org/single_cell/study/SCP359/ica-ileum-lamina-propria-immunocytes-sinai]).
The mouse retina data and B_lymphoma data are available on GitHub [https://github.com/OSU-BMBL/marsgt/tree/main/Data]. Source data are provided with this paper.
Code availability
scCAD is publicly available on GitHub [https://github.com/xuyp-csu/scCAD] and Zenodo [97].
ReferencesPotter, S. S. Single-cell RNA sequencing for the study of development, physiology and disease. Nat. Rev. Nephrol. 14, 479–492 (2018).Article
ReferencesPotter,S.S。单细胞RNA测序,用于研究发育,生理和疾病。自然修订版Nephrol。。文章
CAS
中科院
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Choi, Y. H. & Kim, J. K. Dissecting cellular heterogeneity using single-cell RNA sequencing. Mol. Cells 42, 189–199 (2019).CAS
Choi,Y.H。&Kim,J.K。使用单细胞RNA测序解剖细胞异质性。分子细胞42189-199(2019)。中科院
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Jaitin, D. A. et al. Massively parallel single-cell RNA-seq for marker-free decomposition of tissues into cell types. Science 343, 776–779 (2014).Article
Jaitin,D.A。等人,大规模平行单细胞RNA-seq,用于将组织无标记分解为细胞类型。科学343776-779(2014)。文章
ADS
广告
CAS
中科院
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Hwang, B., Lee, J. H. & Bang, D. Single-cell RNA sequencing technologies and bioinformatics pipelines. Exp. Mol. Med. 50, 1–14 (2018).Article
Hwang,B.,Lee,J.H。&Bang,D。单细胞RNA测序技术和生物信息学管道。实验分子医学50,1-14(2018)。文章
CAS
中科院
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Travaglini, K. J. et al. A molecular cell atlas of the human lung from single-cell RNA sequencing. Nature 587, 619–625 (2020).Article
Travaglini,K.J.等人。来自单细胞RNA测序的人肺分子细胞图谱。《自然》587619-625(2020)。文章
ADS
广告
CAS
中科院
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Wu, H., Kirita, Y., Donnelly, E. L. & Humphreys, B. D. Advantages of single-nucleus over single-cell RNA sequencing of adult kidney: rare cell types and novel cell states revealed in fibrosis. J. Am. Soc. Nephrol. JASN 30, 23–32 (2019).Article
Wu,H.,Kirita,Y.,Donnelly,E.L。和Humphreys,B.D。单核优于成人肾脏单细胞RNA测序的优势:纤维化中揭示的罕见细胞类型和新细胞状态。J、 美国社会肾脏病学会。JASN 30,23-32(2019)。文章
CAS
中科院
PubMed
PubMed
Google Scholar
谷歌学者
Kiselev, V. Y., Andrews, T. S. & Hemberg, M. Challenges in unsupervised clustering of single-cell RNA-seq data. Nat. Rev. Genet. 20, 273–282 (2019).Article
Kiselev,V.Y.,Andrews,T.S。&Hemberg,M。单细胞RNA-seq数据无监督聚类的挑战。Genet自然Rev。20273-282(2019)。文章
CAS
中科院
PubMed
PubMed
Google Scholar
谷歌学者
Ross, A. et al. Detection and viability of tumor cells in peripheral blood stem cell collections from breast cancer patients using immunocytochemical and clonogenic assay techniques [see comments]. Blood 82, 2605–2610 (1993).Article
Ross,A.等人。使用免疫细胞化学和克隆形成测定技术检测乳腺癌患者外周血干细胞收集物中肿瘤细胞的检测和活力[见评论]。血液822605-2610(1993)。文章
CAS
中科院
PubMed
PubMed
Google Scholar
谷歌学者
Paterlini-Brechot, P. & Benali, N. L. Circulating tumor cells (CTC) detection: clinical impact and future directions. Cancer Lett. 253, 180–204 (2007).Article
Paterlini Brechot,P。&Benali,N.L。循环肿瘤细胞(CTC)检测:临床影响和未来方向。癌症Lett。253180-204(2007)。文章
CAS
中科院
PubMed
PubMed
Google Scholar
谷歌学者
Joosse, S. A., Gorges, T. M. & Pantel, K. Biology, detection, and clinical implications of circulating tumor cells. EMBO Mol. Med. 7, 1–11 (2015).Article
Joosse,S.A.,Gings,T.M。和Pantel,K。循环肿瘤细胞的生物学,检测和临床意义。EMBO Mol.Med.7,1-11(2015)。文章
CAS
中科院
PubMed
PubMed
Google Scholar
谷歌学者
Satija, R., Farrell, J. A., Gennert, D., Schier, A. F. & Regev, A. Spatial reconstruction of single-cell gene expression data. Nat. Biotechnol. 33, 495–502 (2015).Article
Satija,R.,Farrell,J.A.,Gennert,D.,Schier,A.F。&Regev,A。单细胞基因表达数据的空间重建。美国国家生物技术公司。33495-502(2015)。文章
CAS
中科院
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Jindal, A., Gupta, P., Jayadeva & Sengupta, D. Discovery of rare cells from voluminous single cell expression data. Nat. Commun. 9, 4719 (2018).Article
Jindal,A.,Gupta,P.,Jayadeva&Sengupta,D。从大量单细胞表达数据中发现稀有细胞。国家公社。94719(2018)。文章
ADS
广告
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Wegmann, R. et al. CellSIUS provides sensitive and specific detection of rare cell populations from complex single-cell RNA-seq data. Genome Biol. 20, 142 (2019).Article
Wegmann,R。等人CellSIUS从复杂的单细胞RNA-seq数据中提供了对稀有细胞群的灵敏和特异性检测。基因组生物学。20142(2019)。文章
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Sun, X., Liu, Y. & An, L. Ensemble dimensionality reduction and feature gene extraction for single-cell RNA-seq data. Nat. Commun. 11, 5853 (2020).Article
Sun,X.,Liu,Y。&An,L。单细胞RNA-seq数据的集合降维和特征基因提取。国家公社。115853(2020)。文章
ADS
广告
CAS
中科院
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Fa, B. et al. GapClust is a light-weight approach distinguishing rare cells from voluminous single cell expression profiles. Nat. Commun. 12, 4197 (2021).Article
Fa,B。等人GapClust是一种轻量级方法,可将稀有细胞与大量单细胞表达谱区分开。国家公社。124197(2021)。文章
ADS
广告
CAS
中科院
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Jiang, L., Chen, H., Pinello, L. & Yuan, G.-C. GiniClust: detecting rare cell types from single-cell gene expression data with Gini index. Genome Biol. 17, 144 (2016).Article
Jiang,L.,Chen,H.,Pinello,L。&Yuan,G.-C。GiniClust:用基尼指数从单细胞基因表达数据中检测稀有细胞类型。基因组生物学。17144(2016)。文章
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Tsoucas, D. & Yuan, G.-C. GiniClust2: a cluster-aware, weighted ensemble clustering method for cell-type detection. Genome Biol. 19, 58 (2018).Article
Tsoucas,D.&Yuan,G.-C。GiniClust2:一种用于细胞类型检测的聚类感知加权集成聚类方法。基因组生物学。19,58(2018)。文章
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Dong, R. & Yuan, G.-C. GiniClust3: a fast and memory-efficient tool for rare cell type identification. BMC Bioinform. 21, 158 (2020).Article
Dong,R。&Yuan,G.-C。GiniClust3:一种用于稀有细胞类型鉴定的快速且记忆有效的工具。BMC生物信息。21158(2020)。文章
CAS
中科院
Google Scholar
谷歌学者
Grün, D. et al. Single-cell messenger RNA sequencing reveals rare intestinal cell types. Nature 525, 251–255 (2015).Article
Grün,D。等人。单细胞信使RNA测序揭示了罕见的肠细胞类型。自然525251-255(2015)。文章
ADS
广告
PubMed
PubMed
Google Scholar
谷歌学者
Herman, J. S., Sagar & Grün, D. FateID infers cell fate bias in multipotent progenitors from single-cell RNA-seq data. Nat. Methods 15, 379–386 (2018).Article
Herman,J.S.,Sagar&Grün,D.FateID从单细胞RNA-seq数据推断多能祖细胞的细胞命运偏倚。自然方法15379-386(2018)。文章
CAS
中科院
PubMed
PubMed
Google Scholar
谷歌学者
Leary, J. R. et al. Sub-cluster identification through semi-supervised optimization of rare-cell silhouettes (SCISSORS) in single-cell RNA-sequencing. Bioinformatics 39, btad449 (2023).Article
Leary,J.R.等人。通过单细胞RNA测序中稀有细胞轮廓(剪刀)的半监督优化进行亚簇鉴定。生物信息学39,btad449(2023)。文章
CAS
中科院
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Lubatti, G. et al. CIARA: a cluster-independent algorithm for identifying markers of rare cell types from single-cell sequencing data. Development 150, dev201264 (2023).Article
Lubatti,G。等人。CIARA:一种独立于聚类的算法,用于从单细胞测序数据中识别稀有细胞类型的标记。。文章
CAS
中科院
PubMed
PubMed
Google Scholar
谷歌学者
DeMeo, B. & Berger, B. SCA: recovering single-cell heterogeneity through information-based dimensionality reduction. Genome Biol. 24, 195 (2023).Article
DeMeo,B。&Berger,B。SCA:通过基于信息的降维来恢复单细胞异质性。基因组生物学。24195(2023)。文章
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Wang, X. et al. MarsGT: multi-omics analysis for rare population inference using single-cell graph transformer. Nat. Commun. 15, 338 (2024).Article
Wang,X。et al。MarsGT:使用单细胞图转换器进行稀有种群推断的多组学分析。国家公社。15338(2024)。文章
ADS
广告
CAS
中科院
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Argelaguet, R., Cuomo, A. S. E., Stegle, O. & Marioni, J. C. Computational principles and challenges in single-cell data integration. Nat. Biotechnol. 39, 1202–1215 (2021).Article
Argelaguet,R.,Cuomo,A.S.E.,Stegle,O.&Marioni,J.C。单细胞数据集成的计算原理和挑战。美国国家生物技术公司。391202-1215(2021)。文章
CAS
中科院
PubMed
PubMed
Google Scholar
谷歌学者
Breiman, L. Random forests. Mach. Learn. 45, 5–32 (2001).Article
Breiman,L。随机森林。马赫。学习。45,5-32(2001)。文章
Google Scholar
谷歌学者
Xu, Y. et al. CellBRF: a feature selection method for single-cell clustering using cell balance and random forest. Bioinformatics 39, i368–i376 (2023).Article
Xu,Y。等人。CellBRF:一种使用细胞平衡和随机森林的单细胞聚类特征选择方法。生物信息学39,i368–i376(2023)。文章
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Liu, F. T., Ting, K. M. & Zhou, Z.-H. Isolation forest. In 2008 Eighth IEEE International Conference on Data Mining 413–422 (IEEE, Pisa, Italy, 2008).Gerniers, A., Bricard, O. & Dupont, P. MicroCellClust: mining rare and highly specific subpopulations from single-cell expression data.
Liu,F.T.,Ting,K.M。和Zhou,Z.-H。隔离林。2008年,第八届IEEE数据挖掘国际会议413-422(IEEE,意大利比萨,2008)。Gerniers,A.,Bricard,O。&Dupont,P。MicroCellClust:从单细胞表达数据中挖掘罕见且高度特异的亚群。
Bioinformatics 37, 3220–3227 (2021).Article .
生物信息学373220-3227(2021)。。
CAS
中科院
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Yang, F. et al. scBERT as a large-scale pretrained deep language model for cell type annotation of single-cell RNA-seq data. Nat. Mach. Intell. 4, 852–866 (2022).Article
Yang,F。等人。scBERT是一种用于单细胞RNA-seq数据细胞类型注释的大规模预训练深度语言模型。自然马赫数。因特尔。。文章
Google Scholar
谷歌学者
Liao, M. et al. Single-cell landscape of bronchoalveolar immune cells in patients with COVID-19. Nat. Med. 26, 842–844 (2020).Article
廖,M。等。新型冠状病毒肺炎患者支气管肺泡免疫细胞的单细胞景观。《自然医学》26842-844(2020)。文章
CAS
中科院
PubMed
PubMed
Google Scholar
谷歌学者
Peng, J. et al. Single-cell RNA-seq highlights intra-tumoral heterogeneity and malignant progression in pancreatic ductal adenocarcinoma. Cell Res. 29, 725–738 (2019).Article
Peng,J。等人。单细胞RNA-seq突出了胰腺导管腺癌的肿瘤内异质性和恶性进展。Cell Res.29725–738(2019)。文章
CAS
中科院
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Zheng, G. X. Y. et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049 (2017).Article
Zheng,G.X.Y.等人。单细胞的大规模并行数字转录谱分析。国家公社。814049(2017)。文章
ADS
广告
CAS
中科院
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Xie, K., Huang, Y., Zeng, F., Liu, Z. & Chen, T. scAIDE: clustering of large-scale single-cell RNA-seq data reveals putative and rare cell types. NAR Genom. Bioinform. 2, lqaa082 (2020).Article
Xie,K.,Huang,Y.,Zeng,F.,Liu,Z。&Chen,T。scAIDE:大规模单细胞RNA-seq数据的聚类揭示了推定的和罕见的细胞类型。NAR Genom。。2,lqaa082(2020)。文章
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Davis, J. D. & Wypych, T. P. Cellular and functional heterogeneity of the airway epithelium. Mucosal Immunol. 14, 978–990 (2021).Article
Davis,J.D。&Wypych,T.P。气道上皮的细胞和功能异质性。粘膜免疫。14978-990(2021)。文章
CAS
中科院
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Montoro, D. T. et al. A revised airway epithelial hierarchy includes CFTR-expressing ionocytes. Nature 560, 319–324 (2018).Article
。自然560319-324(2018)。文章
ADS
广告
CAS
中科院
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
McInnes, L., Healy, J., Saul, N. & Großberger, L. UMAP: Uniform Manifold Approximation and Projection. J. Open Source Softw. 3, 861 (2018).Plasschaert, L. W. et al. A single-cell atlas of the airway epithelium reveals the CFTR-rich pulmonary ionocyte. Nature 560, 377–381 (2018).Article .
McInnes,L.,Healy,J.,Saul,N。&Großberger,L。UMAP:统一流形近似和投影。J、 开源软件。3861(2018)。Plasschaert,L.W。等人。气道上皮的单细胞图谱揭示了富含CFTR的肺离子细胞。自然560377-381(2018)。。
ADS
广告
CAS
中科院
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Hewitt, R. J. & Lloyd, C. M. Regulation of immune responses by the airway epithelial cell landscape. Nat. Rev. Immunol. 21, 347–362 (2021).Article
Hewitt,R.J。&Lloyd,C.M。通过气道上皮细胞景观调节免疫反应。国家免疫修订版。21347-362(2021)。文章
CAS
中科院
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Deprez, M. et al. A single-cell atlas of the human healthy airways. Am. J. Respir. Crit. Care Med. 202, 1636–1645 (2020).Article
Deprez,M.等人,《人类健康气道的单细胞图谱》。Am.J.Respir。。护理医学2021636-1645(2020)。文章
CAS
中科院
PubMed
PubMed
Google Scholar
谷歌学者
Song, H., Seddighzadeh, B., Cooperberg, M. R. & Huang, F. W. Expression of ACE2, the SARS-CoV-2 receptor, and TMPRSS2 in prostate epithelial cells. Eur. Urol. 78, 296–298 (2020).Article
Song,H.,Seddighzadeh,B.,Cooperberg,M.R。&Huang,F.W。前列腺上皮细胞中ACE2,SARS-CoV-2受体和TMPRSS2的表达。欧元Urol。78296-298(2020)。文章
CAS
中科院
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Campbell, J. N. et al. A molecular census of arcuate hypothalamus and median eminence cell types. Nat. Neurosci. 20, 484–496 (2017).Article
。自然神经科学。20484-496(2017)。文章
CAS
中科院
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Chen, R., Wu, X., Jiang, L. & Zhang, Y. Single-cell RNA-seq reveals hypothalamic cell diversity. Cell Rep. 18, 3227–3241 (2017).Article
Chen,R.,Wu,X.,Jiang,L。&Zhang,Y。单细胞RNA-seq揭示下丘脑细胞多样性。Cell Rep.183227–3241(2017)。文章
CAS
中科院
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Chen, Y. et al. The oligodendrocyte-specific G protein–coupled receptor GPR17 is a cell-intrinsic timer of myelination. Nat. Neurosci. 12, 1398–1406 (2009).Article
Chen,Y.等人。少突胶质细胞特异性G蛋白偶联受体GPR17是髓鞘形成的细胞内在计时器。自然神经科学。121398-1406(2009)。文章
CAS
中科院
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Lendahl, U., Muhl, L. & Betsholtz, C. Identification, discrimination and heterogeneity of fibroblasts. Nat. Commun. 13, 3409 (2022).Article
Lendahl,U.,Muhl,L。和Betsholtz,C。成纤维细胞的鉴定,鉴别和异质性。国家公社。133409(2022)。文章
ADS
广告
CAS
中科院
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Joost, S. et al. The molecular anatomy of mouse skin during hair growth and rest. Cell Stem Cell 26, 441–457.e7 (2020).Article
Joost,S.等人,《毛发生长和休息期间小鼠皮肤的分子解剖学》。细胞干细胞26441–457.e7(2020)。文章
CAS
中科院
PubMed
PubMed
Google Scholar
谷歌学者
Ascensión, A. M., Fuertes-Álvarez, S., Ibañez-Solé, O., Izeta, A. & Araúzo-Bravo, M. J. Human dermal fibroblast subpopulations are conserved across single-cell RNA sequencing studies. J. Invest. Dermatol. 141, 1735–1744.e35 (2021).Article
Ascensión,A.M.,Fuertes-lvarez,S.,Ibañez-Solé,O.,Izeta,A。&Araúzo Bravo,M.J。在单细胞RNA测序研究中,人皮肤成纤维细胞亚群是保守的。J、 投资。皮肤病。1411735-1744.e35(2021)。文章
PubMed
PubMed
Google Scholar
谷歌学者
Morel, L. et al. Molecular and functional properties of regional astrocytes in the adult brain. J. Neurosci. 37, 8706–8717 (2017).Article
Morel,L。等人。成人大脑中局部星形胶质细胞的分子和功能特性。J、 神经科学。378706-8717(2017)。文章
CAS
中科院
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Jurga, A. M., Paleczna, M., Kadluczka, J. & Kuter, K. Z. Beyond the GFAP-astrocyte protein markers in the brain. Biomolecules 11, 1361 (2021).
He, L. et al. Analysis of the brain mural cell transcriptome. Sci. Rep. 6, 35108 (2016).
Gerbe, F., Legraverend, C. & Jay, P. The intestinal epithelium tuft cells: specification and function. Cell. Mol. Life Sci. 69, 2907–2917 (2012).
Ayyaz, A. et al. Single-cell transcriptomes of the regenerating intestine reveal a revival stem cell. Nature 569, 121–125 (2019).
Middelhoff, M. et al. Dclk1-expressing tuft cells: critical modulators of the intestinal niche? Am. J. Physiol. Gastrointest. Liver Physiol. 313, G285–G299 (2017).
Engelstoft, M. S. et al. Research resource: a chromogranin A reporter for serotonin and histamine secreting enteroendocrine cells. Mol. Endocrinol. 29, 1658–1671 (2015).
Franzén, O., Gan, L.-M. & Björkegren, J. L. M. PanglaoDB: a web server for exploration of mouse and human single-cell RNA sequencing data. Database 2019, baz046 (2019).
Hunyadi, J., Simon, M., Kenderessy, A., Sz & Dobozy, A. Expression of monocyte/macrophage markers (CD13, CD14, CD68) on human keratinocytes in healthy and diseased skin. J. Dermatol. 20, 341–345 (1993).
Xu, Q. et al. NADPH oxidases are essential for macrophage differentiation. J. Biol. Chem. 291, 20030–20041 (2016).
Chung, E. J. et al. Natural variation in macrophage polarization and function impact pneumocyte senescence and susceptibility to fibrosis. Aging 14, 7692–7717 (2022).
Dominguez Gutierrez, G. et al. Gene signature of the human pancreatic ε cell. Endocrinology 159, 4023–4032 (2018).
Baron, M. et al. A single-cell transcriptomic map of the human and mouse pancreas reveals inter- and intra-cell population structure. Cell Syst. 3, 346–360.e4 (2016).
Muraro, M. J. et al. A single-cell transcriptome atlas of the human pancreas. Cell Syst. 3, 385–394.e3 (2016).
Xue, M. et al. Schwann cells regulate tumor cells and cancer-associated fibroblasts in the pancreatic ductal adenocarcinoma microenvironment. Nat. Commun. 14, 4600 (2023).
Eissmann, M. F. et al. IL-33-mediated mast cell activation promotes gastric cancer through macrophage mobilization. Nat. Commun. 10, 2735 (2019).
Sharma, R. B. et al. Insulin demand regulates β cell number via the unfolded protein response. J. Clin. Invest. 125, 3831–3846 (2015).
Hao, Y. et al. Integrated analysis of multimodal single-cell data. Cell 184, 3573–3587.e29 (2021).
Martin, J. C. et al. Single-cell analysis of Crohn’s disease lesions identifies a pathogenic cellular module associated with resistance to anti-TNF therapy. Cell 178, 1493–1508.e20 (2019).
D’Acquisto, F. & Crompton, T. CD3+CD4−CD8− (double negative) T cells: saviours or villains of the immune response? Biochem. Pharmacol. 82, 333–340 (2011).
Zhang, Y. et al. Single-cell analyses of renal cell cancers reveal insights into tumor microenvironment, cell of origin, and therapy response. Proc. Natl Acad. Sci. USA 118, e2103240118 (2021).
Stewart, B. J. et al. Spatiotemporal immune zonation of the human kidney. Science 365, 1461–1466 (2019).
Zhang, J.-Y. et al. Single-cell landscape of immunological responses in patients with COVID-19. Nat. Immunol. 21, 1107–1118 (2020).
Maier, B. et al. A conserved dendritic-cell regulatory program limits antitumour immunity. Nature 580, 257–262 (2020).
An, X. et al. Global transcriptome analyses of human and murine terminal erythroid differentiation. Blood 123, 3466–3477 (2014).
Lee, J., Hyeon, D. Y. & Hwang, D. Single-cell multiomics: technologies and data analysis methods. Exp. Mol. Med. 52, 1428–1442 (2020).
Ma, A., McDermaid, A., Xu, J., Chang, Y. & Ma, Q. Integrative Methods and Practical Challenges for Single-Cell Multi-omics. Trends Biotechnol. 38, 1007–1022 (2020).
Dou, J. et al. Bi-order multimodal integration of single-cell data. Genome Biol. 23, 112 (2022).
Langer, K. B. et al. Retinal Ganglion Cell Diversity and Subtype Specification from Human Pluripotent Stem Cells. Stem Cell Rep. 10, 1282–1293 (2018).
Rheaume, B. A. et al. Single cell transcriptome profiling of retinal ganglion cells identifies cellular subtypes. Nat. Commun. 9, 2759 (2018).
Møller, H. J. et al. Soluble CD163: a marker molecule for monocyte/macrophage activity in disease. Scand. J. Clin. Lab. Invest. 62, 29–33 (2002).
Villani, A.-C. et al. Single-cell RNA-seq reveals new types of human blood dendritic cells, monocytes, and progenitors. Science 356, eaah4573 (2017).
Calon, A. et al. Stromal gene expression defines poor-prognosis subtypes in colorectal cancer. Nat. Genet. 47, 320–329 (2015).
MacParland, S. A. et al. Single cell RNA sequencing of human liver reveals distinct intrahepatic macrophage populations. Nat. Commun. 9, 4383 (2018).
Koay, H.-F. et al. A divergent transcriptional landscape underpins the development and functional branching of MAIT cells. Sci. Immunol. 4, eaay6039 (2019).
Kleiveland, C. R. Peripheral blood mononuclear cells. In The Impact of Food Bioactives on Health: in vitro and ex vivo models (Springer, Cham, 2015).
da Silva, F. A. R. et al. Whole transcriptional analysis identifies markers of B, T and plasma cell signaling pathways in the mesenteric adipose tissue associated with Crohn’s disease. J. Transl. Med. 18, 44 (2020).
Wang, Z. et al. Celda: a Bayesian model to perform co-clustering of genes into modules and cells into subpopulations using single-cell RNA-seq data. NAR Genom. Bioinform. 4, lqac066 (2022).
Stassen, S. V. et al. PARC: ultrafast and accurate clustering of phenotypic data of millions of single cells. Bioinformatics 36, 2778–2786 (2020).
Yang, P., Huang, H. & Liu, C. Feature selection revisited in the single-cell era. Genome Biol. 22, 1–17 (2021).
Townes, F. W., Hicks, S. C., Aryee, M. J. & Irizarry, R. A. Feature selection and dimension reduction for single-cell RNA-Seq based on a multinomial model. Genome Biol. 20, 295 (2019).
Ranjan, B. et al. DUBStepR is a scalable correlation-based feature selection method for accurately clustering single-cell data. Nat. Commun. 12, 5849 (2021).
Wang, J. et al. scGNN is a novel graph neural network framework for single-cell RNA-Seq analyses. Nat. Commun. 12, 1882 (2021).
Yu, Z. et al. ZINB-based graph embedding autoencoder for single-cell RNA-seq interpretations. Proc. AAAI Conf. Artif. Intell. 36, 4671–4679 (2022).
Blondel, V. D., Guillaume, J.-L., Lambiotte, R. & Lefebvre, E. Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008, P10008 (2008).
Scherf, U. et al. A gene expression database for the molecular pharmacology of cancer. Nat. Genet. 24, 236–244 (2000).
Märtens, K. et al. Rarity: discovering rare cell populations from single-cell imaging data. https://doi.org/10.1101/2022.07.15.500256 (2022).
Klein, A. M. et al. Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell 161, 1187–1201 (2015).
Zappia, L., Phipson, B. & Oshlack, A. Splatter: simulation of single-cell RNA sequencing data. Genome Biol. 18, 174 (2017).
Xu, Y. et al. scCAD: Cluster decomposition-based anomaly detection for rare cell identification in single-cell expression data. scCAD https://doi.org/10.5281/zenodo.13121480 (2024).
Acknowledgements
This work was supported in part by the National Key Research and Development Program of China (No. 2021YFF1201200), the National Natural Science Foundation of China under Grants (Nos. 62350004, 62332020), the Project of Xiangjiang Laboratory (No. 23XJ01011) to J.X.W., the National Natural Science Foundation of China under Grants (No. U22A2041) to H.D.L., and the Science Foundation for Distinguished Young Scholars of Hunan Province (No. 2023JJ10080) to J.Z.X. This work was carried out in part using computing resources at the High-Performance Computing Center of Central South University.
Author information
Author notes
These authors contributed equally: Yunpei Xu, Shaokai Wang.
Authors and Affiliations
School of Computer Science and Engineering, Central South University, Changsha, China
Yunpei Xu, Qilong Feng, Jiazhi Xia, Hong-Dong Li & Jianxin Wang
Xiangjiang Laboratory, Changsha, China
Yunpei Xu, Qilong Feng, Hong-Dong Li & Jianxin Wang
Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, China
Yunpei Xu, Qilong Feng, Jiazhi Xia, Hong-Dong Li & Jianxin Wang
David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, Canada
Shaokai Wang
Department of Computer Science, Old Dominion University, Norfolk, VA, USA
Yaohang Li
Contributions
J.X.W. and H.D.L. conceived and designed this project. Y.P.X., S.K.W., J.X.W., and H.D.L. conceived, designed, and implemented scCAD. Y.P.X. and S.K.W. collected datasets and conducted experiments. Y.P.X., S.K.W., J.Z.X., and H.D.L. performed the analysis. Y.P.X., Y.H.L., Q.L.F., and J.X.W. wrote the paper. All authors have read and approved the final version of this paper.
Corresponding authors
Correspondence to Hong-Dong Li or Jianxin Wang.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Anil Giri and Shibiao Wan for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary information
Peer Review File
Description of Additional Supplementary Files
Supplementary Dataset 1-11
Reporting Summary
Source data
Source Data
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material.
You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Xu, Y., Wang, S., Feng, Q. et al. scCAD: Cluster decomposition-based anomaly detection for rare cell identification in single-cell expression data. Nat Commun 15, 7561 (2024). https://doi.org/10.1038/s41467-024-51891-9
Received: 05 February 2024
Accepted: 15 August 2024
Published: 31 August 2024
DOI: https://doi.org/10.1038/s41467-024-51891-9