Abstract
Synthetic lethality (SL) is a gold mine of anticancer drug targets, exposing cancer-specific dependencies of cellular survival. To complement resource-intensive experimental screening, many machine learning methods for SL prediction have emerged recently. However, comprehensive benchmarking is lacking.
This study systematically benchmarks 12 recent machine learning methods for SL prediction, assessing their performance across diverse data splitting scenarios, negative sample ratios, and negative sampling techniques, on both classification and ranking tasks. We observe that all the methods perform significantly better when data quality is improved, e.g., by excluding computationally derived SLs from training and sampling negative labels based on gene expression.
Among the methods, SLMGAE performs the best. Furthermore, the methods have limitations in realistic scenarios such as cold-start independent tests and context-specific SLs. These results, together with source code and datasets made freely available, provide guidance for selecting suitable methods and developing more powerful techniques for SL virtual screening.
Introduction
The synthetic lethal (SL) interaction between genes was first discovered in Drosophila melanogaster about a century ago1,2. SL occurs if mutations in two genes together result in cell death, but a mutation in either gene alone does not. Based on this observation, Hartwell et al.3 and Kaelin4 suggested that SL could be used to identify new targets for cancer therapy.
In the context of cancer, where multiple genes are often mutated, identifying the SL partners of these genes and interfering with their function can kill cancer cells while sparing normal cells. PARP inhibitors (PARPi) are the first clinically approved drugs designed by exploiting SL5; they target the PARP proteins responsible for DNA damage repair and are used to treat tumors with BRCA1/2 mutations5,6,7,8,9.
Clinical trials have shown that PARPi have promising therapeutic effects on lung, ovarian, breast, and prostate cancers9,10,11,12. Despite the success of PARPi, few SL-based drugs have passed clinical trials so far, partly due to the lack of techniques to efficiently identify clinically relevant and robust SL gene pairs.
Many methods have been proposed for identifying potential SL gene pairs in the last decade.
Various wet-lab experimental methods such as drug screening13, RNAi screening14, and CRISPR/Cas9 screening15 have been used to screen gene pairs with SL relationships16. However, due to the large number of pairwise gene combinations (~200 million in human cells)17, and considering the combinations of different genetic contexts (e.g., cancer types and cell lines), it is impractical to screen all potential SL pairs by these wet-lab methods.
To reduce the search space of SL gene pairs, computational methods have been proposed. Statistical methods identify SL gene pairs b.
CV1: we split the data into training and testing sets by SL pairs, where both genes of a tested pair may have occurred in some other gene pairs in the training set.
CV2: we split the data by genes, where only one gene of a tested pair is present in the training set.
CV3: we split the data by genes, where neither gene of a tested pair is present in the training set.
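As an illustration, the three splitting scenarios could be sketched in Python as follows. This is a minimal sketch, not the benchmark's actual code: the function name `split_pairs`, the `test_frac` and `seed` parameters, and the choice to hold out a random subset of genes for CV2 and CV3 are all assumptions made for this example.

```python
import random


def split_pairs(pairs, scenario, test_frac=0.2, seed=0):
    """Split SL gene pairs into (train, test) under CV1, CV2 or CV3.

    Hypothetical helper: `test_frac` is the fraction of pairs (CV1) or of
    genes (CV2/CV3) that is held out; both are illustrative assumptions.
    """
    rng = random.Random(seed)
    pairs = [tuple(p) for p in pairs]

    if scenario == "CV1":
        # Split by pairs: genes of a test pair may still occur in training pairs.
        shuffled = pairs[:]
        rng.shuffle(shuffled)
        n_test = max(1, int(len(shuffled) * test_frac))
        return shuffled[n_test:], shuffled[:n_test]

    # CV2 / CV3: split by genes. Held-out genes never appear in training pairs.
    genes = sorted({g for p in pairs for g in p})
    rng.shuffle(genes)
    n_held = max(1, int(len(genes) * test_frac))
    held_out = set(genes[:n_held])

    train, test = [], []
    for a, b in pairs:
        n_unseen = (a in held_out) + (b in held_out)
        if n_unseen == 0:
            train.append((a, b))   # both genes are training genes
        elif scenario == "CV2" and n_unseen == 1:
            test.append((a, b))    # exactly one gene is unseen
        elif scenario == "CV3" and n_unseen == 2:
            test.append((a, b))    # both genes are unseen
        # any other pair fits neither side of this split and is discarded
    return train, test
```

In this sketch, a gene counts as "present in the training set" when it is not held out, and pairs that fit neither side of a gene-based split (for example, pairs with two held-out genes under CV2) are simply discarded.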
Fig. 5: Schematic diagram of the negative sampling strategy and data splitting methods. A depicts the negative sampling steps based on gene expression and gene dependency scores. B illustrates the different data splitting methods (DSM): the left matrix shows an example when DSM = CV1, with the blue and purple areas representing the randomly sampled training and test samples, respectively. For DSM = CV2 or CV3, the training samples are both drawn from the blue area, while the purple and orange areas represent the test sample regions for CV2 and CV3, respectively.
Negative sampling methods
To train the deep learning models for SL prediction, a sufficient number of gene pairs is required, including negative samples.
However, known non-SL gene pairs are rare, which makes it difficult to satisfy the data requirements of deep learning models. Therefore, negative sampling is often needed to obtain negative SL data for learning. A common strategy is to randomly select gene pairs from unknown samples as negative samples, which may include false negatives.
To address this issue, we designed two new negative sampling methods (NSMs) based on the DepMap database69.
The DepMap database includes gene expression data, gene mutation data, gene dependency scores, etc. for many cell lines. The gene dependency scores were assessed using CRISPR technology by examining cellular activity after the single knockout of a specific gene; a higher gene dependency score indicates lower cell activity.
From the DepMap database, we obtained gene expression data and gene dependency scores from CRISPR knockout experiments. Using these data, we designed the two new negative sampling methods (NSMs): \({\rm{NSM}}_{\rm{Exp}}\) and \({\rm{NSM}}_{\rm{Dep}}\).
1. We arranged all gene pairs in ascending order of their correlation scores;
2. To ensure that each gene appears in the negative samples, we first traversed the gene pairs sequentially from the beginning to find the smallest set of gene pairs that contains all genes;
3. From the remaining samples, we extracted pairs in order (ascending order in \({\rm{NSM}}_{\rm{Exp}}\) and descending order in \({\rm{NSM}}_{\rm{Dep}}\)) and stopped when the quantity reached one, five, twenty, or fifty times the number of positive samples.
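The three steps above can be sketched in Python roughly as follows. This is only an illustrative reading of the procedure, not the authors' implementation: the function name `sample_negatives`, the dictionary input format, the exclusion of known SL pairs from the candidates, and the greedy interpretation of step 2 (keep a pair whenever it introduces a gene not yet covered) are assumptions.

```python
def sample_negatives(scored_pairs, positives, ratio=1, ascending=True):
    """Pick negative gene pairs from correlation-ranked candidates.

    scored_pairs: dict mapping (gene_a, gene_b) -> correlation score.
    positives:    iterable of known SL pairs (assumed to be excluded).
    ratio:        negatives per positive (e.g. 1, 5, 20 or 50).
    ascending:    True for an ascending ranking, False for a descending one.
    """
    pos = {frozenset(p) for p in positives}
    candidates = [(p, s) for p, s in scored_pairs.items()
                  if frozenset(p) not in pos]

    # Step 1: rank all candidate pairs by correlation score.
    candidates.sort(key=lambda item: item[1], reverse=not ascending)
    ranked = [p for p, _ in candidates]

    # Step 2 (greedy assumption): walk the ranking from the top and keep
    # every pair that covers at least one gene not seen so far.
    covered, chosen, rest = set(), [], []
    for pair in ranked:
        if pair[0] not in covered or pair[1] not in covered:
            chosen.append(pair)
            covered.update(pair)
        else:
            rest.append(pair)

    # Step 3: fill up from the remaining pairs, in rank order, until the
    # target size is reached. If step 2 already exceeded the target,
    # nothing is added and nothing is trimmed.
    target = ratio * len(pos)
    for pair in rest:
        if len(chosen) >= target:
            break
        chosen.append(pair)
    return chosen
```

Under these assumptions, the expression-based variant would call the function with `ascending=True`, while the dependency-based variant would pass absolute correlation values and `ascending=False`.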
Similarly, \({\rm{NSM}}_{\rm{Dep}}\) is based on the correlation of the gene dependency scores. For each pair of genes, we calculated the correlation coefficient of the dependency scores (corrDep) between the two genes. We found that the corrDep values of known SL gene pairs were distributed mainly in the range [−0.2, 0.2].
Therefore, we first take the absolute values of all corrDep and then use sampling steps similar to those of \({\rm{NSM}}_{\rm{Exp}}\); unlike \({\rm{NSM}}_{\rm{Exp}}\), however, in the first step of the sampling we rank all gene pairs by their correlation scores from the highest to the lowest (Fig. 5A shows the specific negative sampling steps).
Evaluation metrics
To comprehensively evaluate the performance of the models, we utilized six metrics.
For the classification task, we used three metrics: area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve (AUPR), and F1 score. These metrics are commonly used for binary classification. For the gene ranking task, we employed three metrics: normalized discounted cumulative gain (NDCG@K), Recall@K, and Precision@K.
NDCG@K measures whether the known SL gene pairs rank high in a model's predicted list, while Recall@K and Precision@K evaluate, respectively, the model's coverage of known SL partners and the accuracy of its top-K results. The definitions of these metrics are as follows:
Area Under the Receiver Operating Characteristic Curve (AUROC): AUROC measures the model's ability to classify samples at different thresholds. It is calculated as the area under the receiver operating characteristic curve, which plots the false positive rate on the x-axis against the true positive rate on the y-axis. The value of AUROC ranges between 0 and 1.
Area Under the Precision-Recall Curve (AUPR): AUPR is a performance metric used to evaluate binary classifiers, which measures the average precision across different recall levels. Like AUROC, AUPR can be used to evaluate the performance of classifiers in the presence of imbalanced classes or uneven sample distributions. AUPR is a more sensitive metric than AUROC, particularly for classification of imbalanced data.
F1 score: The F1 score evaluates the overall effectiveness of a binary classification model by combining precision and recall into a single value, their harmonic mean, which provides a balanced measure of the model's effectiveness.
Normalized discounted cumulative gain (NDCG@K): NDCG@K can be used to evaluate the ability of a model in ranking candidate SL partners for a gene \(g_i\). NDCG@K is calculated as NDCG@K = DCG@K/IDCG@K, where IDCG@K is the maximum DCG@K value attainable among the top-K predictions (i.e., that of an ideal ranking), and DCG@K is calculated as:
$${\rm{DCG@K}}(i)=\sum_{j=1}^{K}\frac{2^{\mathcal{I}\left[G_{i}(j)\in G_{i}^{\rm{SL}}\right]}-1}{\log_{2}(j+1)},$$
(1)
where \(G_{i}^{\rm{SL}}\) denotes the set of all known genes that have SL relationships with gene \(g_i\), \(G_{i}(j)\) is the j-th gene on the list of predicted SL partners for gene \(g_i\), and \(\mathcal{I}[\cdot]\) is the indicator function.
Recall@K: Recall@K measures the proportion of the known SL partners of gene \(g_i\) that appear among its top K predicted SL partners.
$${\rm{Recall@K}}(i)=\frac{\sum_{j=1}^{K}\mathcal{I}\left[G_{i}(j)\in G_{i}^{\rm{SL}}\right]}{\left\vert G_{i}^{\rm{SL}}\right\vert}.$$
(2)
Precision@K: Precision@K represents the proportion of correctly identified SL partners among the top K predicted SL partners of gene \(g_i\).
$${\rm{Precision@K}}(i)=\frac{\sum_{j=1}^{K}\mathcal{I}\left[G_{i}(j)\in G_{i}^{\rm{SL}}\right]}{K}.$$
(3)
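Equations (1) to (3) can be illustrated with a short Python sketch for a single query gene. The function names and the list/set inputs are assumptions made for this example, and IDCG@K is computed here under the assumption that the ideal ranking places as many known SL partners as possible in the top K positions.

```python
import math


def dcg_at_k(ranked, known, k):
    """DCG@K: the gain 2^I - 1 is 1 for a known SL partner and 0 otherwise."""
    return sum((1.0 if gene in known else 0.0) / math.log2(j + 1)
               for j, gene in enumerate(ranked[:k], start=1))


def ndcg_at_k(ranked, known, k):
    """NDCG@K = DCG@K / IDCG@K, with IDCG@K taken from an ideal ranking."""
    ideal_hits = min(k, len(known))
    idcg = sum(1.0 / math.log2(j + 1) for j in range(1, ideal_hits + 1))
    return dcg_at_k(ranked, known, k) / idcg if idcg > 0 else 0.0


def recall_at_k(ranked, known, k):
    """Share of the known SL partners that appear in the top K predictions."""
    hits = sum(gene in known for gene in ranked[:k])
    return hits / len(known) if known else 0.0


def precision_at_k(ranked, known, k):
    """Share of the top K predictions that are known SL partners."""
    hits = sum(gene in known for gene in ranked[:k])
    return hits / k


# Toy example with made-up gene names (not data from the study).
ranked_partners = ["g1", "g2", "g3", "g4"]   # predicted partners, best first
known_partners = {"g1", "g3", "g9"}          # known SL partners of the gene
print(precision_at_k(ranked_partners, known_partners, 3))
print(recall_at_k(ranked_partners, known_partners, 3))
print(ndcg_at_k(ranked_partners, known_partners, 3))
```

In the toy example, two of the top three predictions are known partners, so Precision@3 and Recall@3 both equal 2/3, while NDCG@3 additionally penalizes the hit that sits in third rather than second position.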
To evaluate the overall performance of a model, we calculate its performance on the classification and ranking tasks separately and combine the two with equal weights into a single indicator that reflects the overall performance of the model, i.e., Overall = (Classification score + Ranking score)/2. The classification and ranking scores are calculated as follows:
$$C_{cv_{n}}=\frac{\sum_{CVn}({\rm{AUROC}}+{\rm{AUPR}}+{\rm{F1}})}{9},\quad n=1,2,3$$
(4)
$$R_{cv_{n}}=\frac{\sum_{CVn}({\rm{NDCG@10}}+{\rm{Recall@10}}+{\rm{Precision@10}})}{9},\quad n=1,2,3$$
(5)
$${\rm{Classification\ score}}=C_{cv_{1}}\times 40\%+C_{cv_{2}}\times 50\%+C_{cv_{3}}\times 10\%$$
(6)
$${\rm{Ranking\ score}}=R_{cv_{1}}\times 40\%+R_{cv_{2}}\times 50\%+R_{cv_{3}}\times 10\%$$
(7)
In the context of predicting new SL relationships, the CV2 scenario is more realistic and prevalent, and thus it holds greater significance. For most models, CV3 is overly challenging. Therefore, when calculating the integrative classification and ranking scores, we set the weights for the CV1, CV2, and CV3 scenarios to 40%, 50%, and 10%, respectively.
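The weighting in Eqs. (6) and (7) and the final averaging can be sketched as below. The function names are hypothetical, the per-scenario inputs stand for the already aggregated values of Eqs. (4) and (5), and the numbers in the example are invented for illustration rather than taken from the benchmark.

```python
# Scenario weights stated in the text: CV1 40%, CV2 50%, CV3 10%.
WEIGHTS = {"CV1": 0.4, "CV2": 0.5, "CV3": 0.1}


def weighted_score(per_scenario):
    """Eqs. (6)/(7): weighted sum of the per-scenario scores."""
    return sum(WEIGHTS[s] * per_scenario[s] for s in WEIGHTS)


def overall_score(classification, ranking):
    """Overall = (Classification score + Ranking score) / 2."""
    return (weighted_score(classification) + weighted_score(ranking)) / 2


# Illustrative numbers only (not results from the study).
c_scores = {"CV1": 0.9, "CV2": 0.8, "CV3": 0.5}
r_scores = {"CV1": 0.5, "CV2": 0.3, "CV3": 0.1}
print(round(weighted_score(c_scores), 3))
print(round(weighted_score(r_scores), 3))
print(round(overall_score(c_scores, r_scores), 3))
```

Because CV3 carries only 10% of the weight, a model that collapses in the hardest scenario can still reach a high overall score in this scheme, which matches the rationale given above for down-weighting CV3.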
Model selection and implementation
In this study, we benchmarked 12 in silico methods for synthetic lethality prediction, including three matrix factorization-based methods and nine deep learning-based methods (Table 1). Among these, PTGNN and NSF4SL use self-supervised learning, while the other methods use supervised or semi-supervised learning, depending on the specific scenario.
The details of these methods can be found in the Supplementary Methods. For all the methods, their implementation details are as follows:
GRSMF33: because the method's own code repository lacks executable code, we used the GRSMF implementation included in GCATSL, https://github.com/lichenbiostat/GCATSL/tree/master/baseline%20methods/GRSMF. We set num_nodes to 9845, which is the number of genes in our data.
SL2MF31: we used the code of SL2MF from https://github.com/stephenliu0423/SL2MF. The num_nodes was set to 9845.
CMFW32: we used the code of CMFW from https://github.com/lianyh/CMF-W.
SLMGAE36: we used the code of SLMGAE from https://github.com/DiNg1011/SLMGAE. We used the default settings.
NSF4SL42: we used the code of NSF4SL from https://github.com/JieZheng-ShanghaiTech/NSF4SL. The settings aug_ratio = 0.1 and train_ratio = 1 were used.
PTGNN38: we used the code of PTGNN from https://github.com/longyahui/PT-GNN. We limited the maximum length of protein sequences to 600 and redesigned the word dictionary based on the original paper.
PiLSL41: we used the code of PiLSL from https://github.com/JieZheng-ShanghaiTech/PiLSL. We set the following parameters: --hop 3, --batch_size 512. When calculating the metrics for the ranking task, we need to calculate the scores of all gene pairs, of which there are about 50 million. PiLSL is a pair by pair prediction approach that demands significant time t.
Data availability
Source data are provided with this paper. All the data used in our research come from publicly available sources, including SL labels from SynLethDB 2.0 https://synlethdb.sist.shanghaitech.edu.cn/#/download. The Entrez IDs of the genes come from the NCBI database https://www.ncbi.nlm.nih.gov/gene/. Ensembl IDs come from https://asia.ensembl.org/index.html. The PPI data come from the data released by BioGRID on June 25, 2022, and the download link is https://downloads.thebiogrid.org/File/BioGRID/Release-Archive/BIOGRID-4.4.211/BIOGRID-ALL-4.4.211.tab.zip. GO annotation and GO term data are from http://geneontology.org/gene-associations/goa_human.gaf.gz and http://geneontology.org/docs/download-ontology/#go_obo_and_owl, respectively. The gene expression data and gene dependency score data of the cell lines are from DepMap Public 22Q4 https://figshare.com/articles/dataset/DepMap_22Q4_Public/21637199/2. Pathway data are from the KEGG database https://www.kegg.jp/kegg-bin/download_htext?htext=hsa00001&format=json&filedir=kegg/brite/hsa. Pathway data are also from the Reactome database released on Sep 15, 2022 https://download.reactome.org/82/databases/gk_current.sql.gz. The protein complexes data are from the CORUM database, released on Sep 9, 2022 https://mips.helmholtz-muenchen.de/fastapi-corum/public/file/download_archived_file?version=4.0. The protein sequence data are from the UniProt70 database, released on July 22, 2022 https://www.uniprot.org/help/downloads. All processed training data in this study are publicly available in the Zenodo repository (https://doi.org/10.5281/zenodo.13691648)71 with unrestricted access.
Code availability
The custom code for integrating the models is available on GitHub at https://github.com/JieZheng-ShanghaiTech/SL_benchmark, while the data are provided in the Zenodo repository (https://doi.org/10.5281/zenodo.13691648)71 with unrestricted access.
References
Bridges, C. B. The origin of variations in sexual and sex-limited characters. Am. Nat. 56, 51–63 (1922).
Dobzhansky, T. Genetics of natural populations. XIII. Recombination and variability in populations of Drosophila pseudoobscura. Genetics 31, 269–290 (1946).
Hartwell, L. H., Szankasi, P., Roberts, C. J., Murray, A. W. & Friend, S. H. Integrating genetic approaches into the discovery of anticancer drugs. Science 278, 1064–1068 (1997).
Kaelin, W. G. Choosing anticancer drug targets in the postgenomic era. J. Clin. Investig. 104, 1503–1506 (1999).
Lord, C. J. & Ashworth, A. PARP inhibitors: synthetic lethality in the clinic. Science 355, 1152–1158 (2017).
Satoh, M. S. & Lindahl, T. Role of poly(ADP-ribose) formation in DNA repair. Nature 356, 356–358 (1992).
De Vos, M., Schreiber, V. & Dantzer, F. The diverse roles and clinical relevance of PARPs in DNA damage repair: Current state of the art. Biochem. Pharmacol. 84, 137–146 (2012).
Krishnakumar, R. & Kraus, W. L. The PARP side of the nucleus: molecular actions, physiological outcomes, and clinical targets. Mol. Cell 39, 8–24 (2010).
Bryant, H. E. et al. Specific killing of BRCA2-deficient tumours with inhibitors of poly(ADP-ribose) polymerase (vol 434, pg 913, 2005). Nature 447, 346–346 (2007).
Farago, A. F. et al. Combination olaparib and temozolomide in relapsed small cell lung cancer. Cancer Discov. 9, 1372–1387 (2019).
Moore, K. et al. Maintenance olaparib in patients with newly diagnosed advanced ovarian cancer. N. Engl. J. Med. 379, 2495–2505 (2018).
Fong, P. C. et al. Inhibition of poly(ADP-ribose) polymerase in tumors from BRCA mutation carriers. N. Engl. J. Med. 361, 123–134 (2009).
Liu, L. et al. Synthetic lethality-based identification of targets for anticancer drugs in the human signaling network. Sci. Rep. 8, 8440 (2018).
Setten, R. L., Rossi, J. J. & Han, S.-P. The current state and future directions of RNAi-based therapeutics. Nat. Rev. Drug Discov. 18, 421–446 (2019).
Behan, F. M. et al. Prioritization of cancer therapeutic targets using CRISPR-Cas9 screens. Nature 568, 511–516 (2019).
Topatana, W. et al. Advances in synthetic lethality for cancer therapy: cellular mechanism and clinical translation. J. Hematol. Oncol. 13, 1–22 (2020).
Horlbeck, M. A. et al. Mapping the genetic landscape of human cells. Cell 174, 953–967.e22 (2018).
Wang, J. et al. Computational methods, databases and tools for synthetic lethality prediction. Brief. Bioinform. 23, bbac106 (2022).
Jerby-Arnon, L. et al. Predicting cancer-specific vulnerability via data-driven detection of synthetic lethality. Cell 158, 1199–1209 (2014).
Lee, J. S. et al. Harnessing synthetic lethality to predict the response to cancer treatment. Nat. Commun. 9, 2546 (2018).
Sinha, S. et al. Systematic discovery of mutation-specific synthetic lethals by mining pan-cancer human primary tumor data. Nat. Commun. 8, 15580 (2017).
Yang, C. et al. Mapping the landscape of synthetic lethal interactions in liver cancer. Theranostics 11, 9038–9053 (2021).
De Kegel, B., Quinn, N., Thompson, N. A., Adams, D. J. & Ryan, C. J. Comprehensive prediction of robust synthetic lethality between paralog pairs in cancer cell lines. Cell Syst. 12, 1144–+ (2021).
Benfatto, S. et al. Uncovering cancer vulnerabilities by machine learning prediction of synthetic lethality. Mol. Cancer 20, 111 (2021).
Li, J. et al. Identification of synthetic lethality based on a functional network by using machine learning algorithms. J. Cell. Biochem. 120, 405–416 (2019).
Tang, S. et al. Synthetic lethal gene pairs: experimental approaches and predictive models. Front. Genet. 13, 961611 (2022).
Wang, J. et al. SynLethDB 2.0: A web-based knowledge graph database on synthetic lethality for novel anticancer drug discovery. Database 2022, baac030 (2022).
Ashburner, M. et al. Gene ontology: tool for the unification of biology. The gene ontology consortium. Nat. Genet. 25, 25–9 (2000).
Oughtred, R. et al. The BioGRID database: a comprehensive biomedical resource of curated protein, genetic, and chemical interactions. Protein Sci. 30, 187–200 (2021).
Kanehisa, M., Furumichi, M., Sato, Y., Kawashima, M. & Ishiguro-Watanabe, M. KEGG for taxonomy-based analysis of pathways and genomes. Nucleic Acids Res. 51, D587–D592 (2022).
Liu, Y., Wu, M., Liu, C., Li, X.-L. & Zheng, J. SL2MF: predicting synthetic lethality in human cancers via logistic matrix factorization. IEEE/ACM Trans. Comput. Biol. Bioinform. 17, 748–757 (2020).
Liany, H., Jeyasekharan, A. & Rajan, V. Predicting synthetic lethal interactions using heterogeneous data sources. Bioinformatics 36, 2209–2216 (2020).
Huang, J., Wu, M., Lu, F., Ou-Yang, L. & Zhu, Z. Predicting synthetic lethal interactions in human cancers using graph regularized self-representative matrix factorization. BMC Bioinform. 20, 1–8 (2019).
Cai, R., Chen, X., Fang, Y., Wu, M. & Hao, Y. Dual-dropout graph convolutional network for predicting synthetic lethality in human cancers. Bioinformatics 36, 4458–4465 (2020).
Long, Y. et al. Graph contextualized attention network for predicting synthetic lethality in human cancers. Bioinformatics 37, 2432–2440 (2021).
Hao, Z. et al. Prediction of synthetic lethal interactions in human cancers using multi-view graph auto-encoder. IEEE J. Biomed. Health Inform. 25, 4041–4051 (2021).
Lai, M. et al. Predicting synthetic lethality in human cancers via multi-graph ensemble neural network (IEEE, 2021).
Long, Y. et al. Pre-training graph neural networks for link prediction in biomedical networks. Bioinformatics 38, 2254–2262 (2022).
Wang, S. et al. KG4SL: knowledge graph neural network for synthetic lethality prediction in human cancers. Bioinformatics 37, i418–i425 (2021).
Zhu, Y., Zhou, Y., Liu, Y., Wang, X. & Li, J. SLGNN: synthetic lethality prediction in human cancers based on factor-aware knowledge graph neural network. Bioinformatics 39, btad015 (2023).
Liu, X. et al. PiLSL: pairwise interaction learning-based graph neural network for synthetic lethality prediction in human cancers. Bioinformatics 38, ii106–ii112 (2022).
Wang, S. et al. NSF4SL: negative-sample-free contrastive learning for ranking synthetic lethal partner genes in human cancers. Bioinformatics 38, ii13–ii19 (2022).
Gillespie, M. et al. The reactome pathway knowledgebase 2022. Nucleic Acids Res. 50, D687–D692 (2022).Article
Gillespie,M.等人,《反应组途径知识库2022》。核酸研究50,D687–D692(2022)。文章
PubMed
PubMed
Google Scholar
谷歌学者
Zhang, K., Wu, M., Liu, Y., Feng, Y. & Zheng, J. KR4SL: knowledge graph reasoning for explainable prediction of synthetic lethality. Bioinformatics 39, i158–i167 (2023).Article
Zhang,K.,Wu,M.,Liu,Y.,Feng,Y。&Zheng,J。KR4SL:用于可解释的合成致死率预测的知识图推理。生物信息学39,i158–i167(2023)。文章
PubMed
PubMed
PubMed Central
公共医学中心
Google Scholar
谷歌学者
Fan, K., Tang, S., Gökbağ, B., Cheng, L. & Li, L. Multi-view graph convolutional network for cancer cell-specific synthetic lethality prediction. Front. Genet. 13, 1103092 (2022).
Tepeli, Y. I., Seale, C. & Gonçalves, J. P. ELISL: early-late integrated synthetic lethality prediction in cancer. Bioinformatics 40, btad764 (2024).
Shen, J. P. et al. Combinatorial CRISPR-Cas9 screens for de novo mapping of genetic interactions. Nat. Methods 14, 573–576 (2017).
Han, K. et al. Synergistic drug combinations for cancer identified in a CRISPR screen for pairwise genetic interactions. Nat. Biotechnol. 35, 463–474 (2017).
Najm, F. J. et al. Orthologous CRISPR-Cas9 enzymes for combinatorial genetic screens. Nat. Biotechnol. 36, 179–189 (2018).
Zhao, D. et al. Combinatorial CRISPR-Cas9 metabolic screens reveal critical redox control points dependent on the KEAP1-NRF2 regulatory axis. Mol. Cell 69, 699–708.e7 (2018).
Ma, M., Na, S. & Wang, H. AEGCN: an autoencoder-constrained graph convolutional network. Neurocomputing 432, 21–31 (2021).
Li, Q., Han, Z. & Wu, X.-m. Deeper insights into graph convolutional networks for semi-supervised learning. in Proc. of the AAAI Conference on Artificial Intelligence 32 (2018).
Ito, T. et al. Paralog knockout profiling identifies DUSP4 and DUSP6 as a digenic dependence in MAPK pathway-driven cancers. Nat. Genet. 53, 1664–1672 (2021).
Parrish, P. C. R. et al. Discovery of synthetic lethal and tumor suppressor paralog pairs in the human genome. Cell Rep. 36, 109597 (2021).
Thompson, N. A. et al. Combinatorial CRISPR screen identifies fitness effects of gene paralogues. Nat. Commun. 12, 1302 (2021).
Vidigal, J. A. & Ventura, A. Rapid and efficient one-step generation of paired gRNA CRISPR-Cas9 libraries. Nat. Commun. 6, 8083 (2015).
Zhang, B. et al. The tumor therapy landscape of synthetic lethality. Nat. Commun. 12, 1275 (2021).
Srivatsa, S. et al. Discovery of synthetic lethal interactions from large-scale pan-cancer perturbation screens. Nat. Commun. 13, 7748 (2022).
Reid, R. J. D. et al. A synthetic dosage lethal genetic interaction between CKS1B and PLK1 is conserved in yeast and human cancer cells. Genetics 204, 807–819 (2016).
O’Neil, N. J., Bailey, M. L. & Hieter, P. Synthetic lethality and cancer. Nat. Rev. Genet. 18, 613–623 (2017).
Muller, F. L., Aquilanti, E. A. & Depinho, R. A. Collateral lethality: a new therapeutic strategy in oncology. Trends Cancer 1, 161–173 (2015).
Dey, P. et al. Genomic deletion of malic enzyme 2 confers collateral lethality in pancreatic cancer. Nature 542, 119–123 (2017).
Li, S. et al. Development of synthetic lethality in cancer: molecular and cellular classification. Signal Transduct. Target. Ther. 5, 241 (2020).
Seal, R. L. et al. Genenames.org: the HGNC resources in 2023. Nucleic Acids Res. 51, D1003–D1009 (2022).
Cunningham, F. et al. Ensembl 2022. Nucleic Acids Res. 50, D988–D995 (2022).
Chua, H. N., Sung, W.-K. & Wong, L. Exploiting indirect neighbours and topological weight to predict protein function from protein-protein interactions. Bioinformatics 22, 1623–1630 (2006).
Tsitsiridis, G. et al. CORUM: The comprehensive resource of mammalian protein complexes-2022. Nucleic Acids Res. 51, D539–D545 (2022).
Yu, G. Gene ontology semantic similarity analysis using GOSemSim. in Methods in Molecular Biology 2117, 207–215 (2020).
Tsherniak, A. et al. Defining a cancer dependency map. Cell 170, 564–576.e16 (2017).
The UniProt Consortium. UniProt: the universal protein knowledgebase in 2021. Nucleic Acids Res. 49, D480–D489 (2021).
Feng, Y. et al. Benchmarking machine learning methods for synthetic lethality prediction in cancer. Zenodo repository, https://zenodo.org/records/13691648 (2024).

Acknowledgements
We thank Weifan Mao, Zhen Yue, Siyu Tao, and Yang Yang for their assistance in the preliminary study and revision of this paper.

Author information
Authors and Affiliations
School of Information Science and Technology, ShanghaiTech University, Shanghai, China: Yimiao Feng, He Wang, Yang Ouyang, Quan Li & Jie Zheng
Lingang Laboratory, Shanghai, China: Yimiao Feng
Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore: Yahui Long
Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore, Singapore: Min Wu
Shanghai Engineering Research Center of Intelligent Vision and Imaging, Shanghai, China: Jie Zheng

Contributions
J.Z., M.W., and Y.F. conceived this idea; J.Z. and M.W. designed and guided the project; Y.F., Y.L., Q.L., M.W., and J.Z. participated in manuscript writing; Y.F. completed experiments and result analysis; Y.F. designed charts; H.W. and Y.O. assisted in literature review.

Corresponding authors
Correspondence to Min Wu or Jie Zheng.

Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information; Peer Review File; Description of Additional Supplementary Information; Supplementary Dataset 1; Supplementary Dataset 2; Reporting Summary
Source data
Source Data
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material.
You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Feng, Y., Long, Y., Wang, H. et al. Benchmarking machine learning methods for synthetic lethality prediction in cancer. Nat Commun 15, 9058 (2024). https://doi.org/10.1038/s41467-024-52900-7
Received: 27 November 2023
Accepted: 23 September 2024
Published: 20 October 2024
DOI: https://doi.org/10.1038/s41467-024-52900-7