NEW YORK – A pair of studies from two independent research teams showcases how scientists are using data from the UK Biobank (UKB) Pharma Proteomics Project (PPP) to identify biomarkers linked to disease and other biological processes.
In one study, published in Nature Medicine in July, a team led by researchers at the University of Cambridge used the UKB-PPP data to identify protein profiles predictive of more than 50 diseases, while in the other, published in the same journal earlier this month, a team led by scientists at the University of Oxford used the data to develop a proteomic 'clock' for assessing individuals' biological age and general health.
Both studies made use of the most recent data release from the UKB-PPP, which became available to researchers outside the project, a consortium of 13 biopharma companies, in the fall of 2023. The dataset consists of measurements of roughly 3,000 proteins in blood samples from 54,000 UKB participants, generated using Olink's Explore platform.
The UKB-PPP 'is a pretty incredible dataset,' said Austin Argentieri, a research fellow at Harvard Medical School, affiliate member at the Broad Institute, and first author on the proteomic clock study. He noted that access to such a large proteomic database linked to comprehensive clinical and genomic data was key to his team's work.
In their study, Argentieri and colleagues used the proteomic data from 31,808 individuals in the UKB cohort to train models for predicting their chronological ages. They generated their models using six different machine learning methods, then tested them in a separate 13,633-individual cohort from the UKB as well as two external cohorts: 3,988 subjects from the China Kadoorie Biobank (CKB) and 1,990 subjects from the FinnGen biobank.
Ultimately, they arrived at a model using 204 plasma proteins and a gradient boosting-based machine learning approach whose predictions correlated strongly with chronological age in all three cohorts. They then pared this down to a 20-protein model that retained 95 percent of the predictive power of the 204-protein model.
Using their models, the researchers calculated what they termed the ProtAgeGap for the subjects in the three cohorts: the difference between a subject's age as predicted by the proteomic model, or ProtAge, and their actual age. In the UKB cohort, the ProtAgeGap spanned 12.3 years, with the top 5 percent of individuals having a ProtAge 6.3 years older than their actual age, and the bottom 5 percent a ProtAge 6 years younger.
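As a rough illustration of the ProtAgeGap arithmetic described above, the following Python sketch subtracts each subject's actual age from a model-predicted age and then reads off the 5th and 95th percentiles of the resulting gaps. The function names and the toy ages are hypothetical and are not taken from the study, and the percentile method used here is an assumption, since the article does not say how the researchers computed theirs.

```python
from statistics import quantiles


def prot_age_gap(predicted_ages, actual_ages):
    """Return predicted (proteomic) age minus actual age for each subject."""
    if len(predicted_ages) != len(actual_ages):
        raise ValueError("inputs must have the same length")
    return [p - a for p, a in zip(predicted_ages, actual_ages)]


def gap_percentiles(gaps, lower=5, upper=95):
    """Return the (lower, upper) percentile cut points of the gaps."""
    # quantiles(..., n=100) yields 99 cut points; index k-1 is the k-th percentile.
    cuts = quantiles(gaps, n=100, method="inclusive")
    return cuts[lower - 1], cuts[upper - 1]


# Hypothetical toy data for illustration only, not values from the study.
predicted = [52.0, 61.5, 47.0, 70.0, 58.0]
actual = [50.0, 65.0, 47.0, 64.0, 60.0]

gaps = prot_age_gap(predicted, actual)
low, high = gap_percentiles(gaps)
print(gaps)
print(low, high, high - low)
```

Under this reading, the 12.3-year span reported for the UKB cohort is the distance between the two percentile cut points: 6.3 years above actual age plus 6 years below it.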
Comparing ProtAgeGap with existing measures of physical health and biological aging, the researchers found that an individual's ProtAgeGap score was significantly associated with a broad range of such measures. They also found the score was predictive of a person's risk of all-cause mortality as well as common diseases including osteoarthritis, type 2 diabetes, chronic kidney disease, and a number of cancers.
The ProtAgeGap score was also associated with functional traits like walking speed, handgrip strength, and cognitive function.
A number of groups have previously developed models of biological aging based on DNA methylation, but many of these clocks are 'only weakly associated with mortality risk and aging-related function,' the authors wrote. Argentieri said he and his colleagues believe proteomics-based approaches could yield stronger functional associations as well as models that are more generalizable across diverse cohorts.
Several other research teams have developed proteomic aging clocks, including a 2023 effort led by researchers at the Biomedical Primate Research Centre in Rijswijk, the Netherlands, that used data on more than 37,000 individuals generated using SomaLogic's (now Standard BioTools') SomaScan platform.
Argentieri said it remains an open question how exactly proteomic aging clocks might be put to use but that a potential application is as a preventative medicine tool for gauging individuals' general health and future disease risk.
'We envision this as something you can do early and often,' he said. 'Test when you are young. See what trajectory you are on … and then if you and your physician don't like the picture you see, you can start to course-correct.'
Argentieri and his colleagues are currently evaluating their model within several clinical trials looking at interventions like adjustments in diet and physical activity to see if it can pick up changes in patient health produced by those interventions.
If improvements in patient health are reflected in the clock measurements, 'that will give us some confirmation that this is a biomarker that will tell you about how well what you are doing is working,' he said.
The researchers have filed for patents in the US and UK based on the results of their Nature Medicine study, Argentieri said, though he added that they are still 'in the early days' of developing the tool. He and his colleagues are currently in discussions with several proteomics companies to develop an assay panel targeting the 20 proteins used in their model.
The Cambridge effort likewise used the UKB-PPP data, in their case to predict individuals' disease risk across a wide range of conditions, though not within the context of a proteomic clock. Specifically, the researchers aimed to develop proteomic risk models for the 218 diseases for which there were more than 80 cases represented among the UKB cohort.
Using training sets consisting of 70 percent to 75 percent of a 41,931-subject subset of the UKB-PPP cohort and validation sets consisting of 25 percent to 30 percent of the same subset, the researchers identified panels of between five and 20 proteins that, in the case of 67 diseases, improved prediction of a patient's 10-year risk compared to models based on clinical information alone.
In the case of 52 of these 67 diseases, the protein panels improved risk prediction compared to clinical information combined with routine clinical blood tests.
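The training and validation split mentioned above can be sketched in a few lines of Python. This is a minimal illustration that assumes a simple random split of subject identifiers; the article does not describe how the Cambridge team actually partitioned the cohort (for example, whether splits were stratified by disease), and the function name, seed, and integer IDs below are hypothetical.

```python
import random


def split_subjects(subject_ids, train_fraction=0.7, seed=0):
    """Randomly partition subject IDs into training and validation lists."""
    ids = list(subject_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# Hypothetical integer IDs standing in for the 41,931-subject subset.
train_ids, validation_ids = split_subjects(range(41931), train_fraction=0.7)
print(len(train_ids), len(validation_ids))
```

Keeping the validation subjects disjoint from the training subjects is what allows a comparison between protein-based and clinical-only models to reflect performance on unseen individuals.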
Like the Oxford team, the Cambridge researchers see their protein panels being potentially useful as risk assessment tools, said Julia Carrasco-Zanini, first author of the study and a postdoctoral researcher at Queen Mary University of London. Carrasco-Zanini was a graduate student at Cambridge when the study was conducted.
'At the moment, we are thinking of the signatures as screening tools, not diagnostic tools,' she said. 'For instance, if we have something like idiopathic pulmonary fibrosis, where we see that the predictive signature is very good, we think, ok, if we screen people and have a group we can identify as being at very high risk, perhaps they could be followed more closely with some sort of imaging every few years or so.'
Carrasco-Zanini observed that the proteomic risk scores developed in the study outperformed polygenic risk scores (PRS) for 22 of the 23 diseases for which PRS were available in the UKB, the lone exception being breast cancer, which she suggested reflected that disease's large genetic component. She added that she and her colleagues also conducted a more systematic comparison of proteomic signatures and PRS using the EPIC-Norfolk cohort that they published in The Lancet Digital Health in July.
That study similarly found proteomic risk scores to outperform PRS, with protein models showing better risk prediction for 17 of 23 outcomes investigated.
Moving forward, Carrasco-Zanini said the researchers plan to validate their findings in additional, more diverse cohorts. They also aim to benchmark their panels against existing markers that were not available in the UKB cohort. For instance, she noted, while the team's model for multiple myeloma was 'highly predictive,' measures of patient M-protein levels, which are commonly used as a test for the condition, were not available in the UKB-PPP data.
Ultimately, Carrasco-Zanini said, they hope to develop clinical-grade assays they could use in clinical research and to test some of their panels in screening trials.
Argentieri and Carrasco-Zanini both said they expect the UKB-PPP will allow researchers to explore a variety of questions around proteomics, as it is the largest such resource made available to date.
Beyond the scale of its proteomic measurements, the biobank's detailed genomic and phenotypic data 'really expands the opportunities for addressing multiple questions of proteomic research, which is why we are starting to see this wave of different papers around this topic,' Carrasco-Zanini said.
Argentieri suggested that as the UKB-PPP demonstrates its value as a research tool, it will drive other biobanks to add to their proteomic datasets.
He cited the example of the CKB and FinnGen biobanks he and his colleagues used in their research. 'They've got proteomics on a few thousand [subjects], and they have aspirations now to have it on tens of thousands or more, because I think everyone is seeing that large-scale human population proteomics just gives you so much power for understanding disease biology, making predictive models, understanding disease progression.'