NEW YORK – A team led by researchers at the University of Hong Kong and the Columbia University Mailman School of Public Health has devised a new method for Mendelian randomization experiments tailored to proteomic datasets.
In a paper published this month in Cell Genomics, the researchers demonstrated the use of the approach combined with protein structural predictions by Google DeepMind's AlphaFold3 to identify and investigate proteins linked to Alzheimer's disease.
Mendelian randomization (MR) allows researchers to establish causal relationships between exposures and outcomes of interest by leveraging the fact that alleles sort randomly when genes are transmitted from parents to their children. It is often used in situations where a traditional randomized controlled trial is impossible or impractical.
In recent years, the production of large-scale proteomic datasets like the UK Biobank's Pharma Proteomics Project (UKB-PPP) has allowed researchers to use MR to look for proteins linked to various diseases and health outcomes.
MR experiments use what are called instrumental variables (IVs) — features linked to a particular exposure — to study the causal relationship between that exposure and various outcomes. In MR experiments looking for links between particular proteins and disease, protein quantitative trait loci (pQTLs) — genetic markers linked to the expression of a particular protein — are often used as IVs.
The protein expression levels linked to those pQTLs are the exposures, and the disease states of interest are the outcomes. MR analyses let researchers determine whether there is a causal relationship between the proteins (the exposures) and the disease states (the outcomes) and estimate the magnitude of that causal effect.
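As a rough illustration of how such an estimate is formed, the sketch below computes the standard Wald ratio for a single instrument and pools several ratios by inverse-variance weighting. It is a generic textbook calculation, not the method from the Cell Genomics paper, and the function names and input values are hypothetical.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Causal effect estimate from one instrument (e.g. one pQTL).

    beta_exposure: effect of the variant on the protein level
    beta_outcome:  effect of the variant on the disease outcome
    se_outcome:    standard error of beta_outcome
    """
    estimate = beta_outcome / beta_exposure
    # First-order standard error; ignores uncertainty in beta_exposure.
    se = se_outcome / abs(beta_exposure)
    return estimate, se


def inverse_variance_weighted(estimates, ses):
    """Pool per-instrument ratio estimates, weighting each by 1 / se^2."""
    weights = [1.0 / se**2 for se in ses]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total)


# Hypothetical summary statistics for two instruments.
est_a, se_a = wald_ratio(beta_exposure=0.5, beta_outcome=0.10, se_outcome=0.02)
est_b, se_b = wald_ratio(beta_exposure=0.4, beta_outcome=0.08, se_outcome=0.02)
pooled, pooled_se = inverse_variance_weighted([est_a, est_b], [se_a, se_b])
```

Pooling in this way is only meaningful when every instrument included is valid, which is the difficulty the rest of the article turns to.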
One challenge, however, is the difficulty of identifying pQTLs that satisfy the conditions required of a suitable IV, said Zhonghua Liu, assistant professor of biostatistics at the Columbia University Data Science Institute and senior author on the study.
To be used as an IV in an MR experiment, a pQTL should meet three conditions: one, be associated with the protein being tested as an exposure; two, not be linked to any confounders that affect the exposure-outcome relationship; and three, affect the outcome only via the protein being tested as an exposure and not through any other pathway.
As Liu and his coauthors note, only the first of these requirements 'can be tested empirically by selecting pQTLs significantly associated with the protein.' Given this limitation, the authors add, researchers have come up with various MR approaches designed 'to handle invalid IVs.'
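The one empirically testable condition, association with the protein, is usually checked with a significance screen. The sketch below is a minimal version of such a screen, assuming per-variant effect sizes and standard errors are available. The genome-wide threshold of 5e-8 is a common convention rather than a value taken from the paper, and the variant names and numbers are made up.

```python
import math


def relevance_screen(candidates, p_threshold=5e-8):
    """Keep candidate pQTLs significantly associated with the protein.

    candidates: iterable of (name, beta, se) tuples, where beta is the
    variant's estimated effect on the protein level.
    Returns a list of (name, f_statistic, p_value) for variants that pass.
    """
    kept = []
    for name, beta, se in candidates:
        z = beta / se
        # Two-sided p-value under a standard normal approximation.
        p_value = math.erfc(abs(z) / math.sqrt(2))
        if p_value < p_threshold:
            # For a single variant, the F-statistic is approximately z squared.
            kept.append((name, z * z, p_value))
    return kept


# Hypothetical candidates: one strong and one weak association.
passed = relevance_screen([("variant_A", 0.30, 0.03), ("variant_B", 0.05, 0.03)])
```

A screen like this says nothing about the other two conditions, which is why methods for handling invalid IVs are needed.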
Liu said, however, that none of these approaches were developed with proteomic datasets specifically in mind. He said that this limits their utility for working with such datasets.
In part, this stems from the relatively small number of candidate pQTLs available, Liu said. 'You don't typically have many candidate pQTLs that you can use [as IVs], maybe five to 10, from which maybe you select four or three or two.'
Existing methods for assessing IV validity work best with larger numbers of candidates, he said, noting that for MR analyses linking genetic variation to complex phenotypes like body mass index or lipid levels, this is not an issue.
'If you look at something like body mass index, there are a large number of genetic variants that can be used as [IVs],' Liu said. The number of pQTLs, on the other hand, is much smaller, he said.
This likely reflects the fact that underlying genetic variation is consolidated into a smaller amount of protein variation at the proteome level, Liu said. He added that it might also reflect the relatively small size and limited depth of proteomic datasets compared to genomic datasets.
To address the challenge of limited pQTL candidates, Liu and his colleagues adopted what they called the Anna Karenina principle, after the novel's famous opening line: 'all happy families are alike; each unhappy family is unhappy in its own way.' Applied to the question of pQTLs and IVs, the principle holds that valid IVs will all provide similar estimates of the protein's causal effect on the outcome, while invalid IVs will each provide a different estimate.
Using this approach, which the authors named MR-SPI, researchers can identify valid IVs from small numbers of candidate pQTLs, Liu said.
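The intuition that valid instruments agree while invalid ones scatter can be sketched in a few lines. The code below is a simplified illustration of that idea only and is not the MR-SPI procedure itself: it treats two per-instrument ratio estimates as consistent when they differ by less than a chosen multiple of their combined standard error, then returns the largest group of instruments that agree with a common reference. The function name, the 1.96 cutoff, and the example numbers are assumptions.

```python
import math


def largest_agreeing_group(estimates, ses, z_crit=1.96):
    """Return indices of the largest set of instruments whose ratio
    estimates are statistically consistent with one reference instrument.

    estimates: per-instrument causal effect estimates
    ses:       their standard errors
    """
    n = len(estimates)
    best = []
    for j in range(n):
        # Instruments whose estimate is close to instrument j's estimate.
        group = [
            k
            for k in range(n)
            if abs(estimates[j] - estimates[k])
            <= z_crit * math.sqrt(ses[j] ** 2 + ses[k] ** 2)
        ]
        if len(group) > len(best):
            best = group
    return best


# Hypothetical estimates: three instruments agree, two disagree in different ways.
selected = largest_agreeing_group(
    estimates=[0.20, 0.22, 0.19, 0.60, -0.30],
    ses=[0.03, 0.03, 0.03, 0.03, 0.03],
)
```

With only a handful of candidates, as Liu describes, a rule like this depends on the valid instruments outnumbering any cluster of invalid ones that happen to agree.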
Liu said that the MR-SPI method also differs from existing approaches in that it selects IVs for specific protein-outcome pairings, with Alzheimer's disease as the outcome in the case of the Cell Genomics paper. Traditional methods typically use the same IVs to examine causal relationships between proteins and a range of outcomes.
Liu said he and his colleagues believe choosing IVs in an exposure-outcome pair-specific manner will provide more accurate results.
Commenting on the method, Maik Pietzner, a bioinformatician at the MRC Epidemiology Unit at the University of Cambridge School of Clinical Medicine, said that 'the idea of selecting valid IVs based on a data-driven framework' as presented in the Cell Genomics paper 'is appealing and desirable.'
However, he suggested that the exposure-outcome pair-specific selection of IVs could be problematic, as it could create situations where only trans-pQTLs are selected as valid IVs. pQTLs are typically characterized as either cis, meaning the pQTL is located close to the gene that encodes the protein, or trans, meaning it is located farther from that gene.
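The cis/trans distinction is usually drawn with a distance rule. The sketch below assumes a 1 Mb window around the gene, a widely used convention that the article does not itself specify, and the coordinates in the example are invented for illustration.

```python
def classify_pqtl(variant_chrom, variant_pos, gene_chrom, gene_start, gene_end,
                  window=1_000_000):
    """Label a pQTL as 'cis' or 'trans' relative to the protein-coding gene.

    A variant is called cis when it sits on the same chromosome as the gene
    and within `window` base pairs of the gene body; otherwise it is trans.
    The 1 Mb default is an assumed convention, not a value from the paper.
    """
    same_chrom = variant_chrom == gene_chrom
    near_gene = gene_start - window <= variant_pos <= gene_end + window
    return "cis" if same_chrom and near_gene else "trans"


# Hypothetical coordinates for a gene on chromosome 19.
label_near = classify_pqtl("19", 51_300_000, "19", 51_225_000, 51_243_000)
label_far = classify_pqtl("6", 41_000_000, "19", 51_225_000, 51_243_000)
```

An analyst who shares Pietzner's concern could apply a filter like this before instrument selection so that only cis-pQTLs are considered.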
While both are potentially meaningful, Pietzner, who was not involved in the Cell Genomics study, said that he and his colleagues generally avoid using trans-pQTLs as IVs because they are often non-specific.
Applying the MR-SPI method to data from the UKB-PPP, Liu and his colleagues identified seven proteins — CD33, CD55, EPHA1, PILRA, PILRB, RET, and TREM2 — linked to Alzheimer's disease. Six of the proteins have been associated with Alzheimer's risk in previous studies.
The researchers also incorporated AlphaFold3 into their pipeline to evaluate the potential effects of missense variations in the pQTLs selected as IVs, providing insights into protein structural changes that could be linked to the outcome being studied.
The researchers used AF3 to predict structural changes in the Alzheimer's-linked proteins they identified, but Liu said it remains unclear how those changes might impact the proteins' biological function.
Pietzner noted that AF3-based efforts have to date had limited success in determining when changes in amino acid sequence lead to the production of dysfunctional proteins.
'We’ve been hoping that [AF3] can distinguish benign from dysfunctional missense variants,' he said, adding that it would also be interesting if AF3 could identify variants leading to changes in a protein's stability or its detectability via the affinity agents commonly used in large-scale population proteomic studies.
'A generic challenge with cis-pQTLs that encode missense variants is still to distinguish whether the affinity reagent is no longer able to bind or whether indeed the missense variant reduces the half-life or secretion of the protein into plasma,' he said.
Liu said he and his colleagues are investigating the biological implications of some of the protein structural changes predicted by AF3.
'We are working on that, but we don't have any results to show yet,' he said. 'It's a very complicated question. We are working with [outside collaborators] to try to fill that gap between protein structural changes and Alzheimer's disease etiology.'