Healthcare AI faces an ethical dilemma between selective and equitable deployment, exacerbated by flawed performance metrics. These metrics inadequately capture real-world complexities and biases, leading to premature assertions of effectiveness. Improved evaluation practices, including continuous monitoring and silent evaluation periods, are crucial.
To address these fundamental shortcomings, a paradigm shift in AI assessment is needed, prioritizing actual patient outcomes over conventional benchmarking.
Artificial intelligence (AI) is poised to bridge the deployment gap with increasing capabilities for remote patient monitoring, handling of diverse time series datasets, and progression toward the promise of precision medicine. This proximity also underscores the urgency of confronting the translational risks accompanying this technological evolution and maximizing alignment with fundamental principles of ethical, equitable, and effective deployment.
The recent work by Goetz et al. surfaces a critical issue at the intersection of technology and healthcare ethics: the challenge of generalization and fairness in health AI applications1. This is a complex issue where equal performance across subgroups can be at odds with overall performance metrics2. Specifically, it highlights one potential avenue to navigate variation in model performance among subgroups based on the concept of “selective deployment”3.
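The tension between overall and subgroup performance is easy to see numerically. The following toy sketch (hypothetical labels and scores, scikit-learn assumed; none of these values come from the studies cited here) shows an aggregate AUROC that looks reassuring while the minority subgroup's AUROC is far worse.

```python
# Toy illustration only: hypothetical labels and scores for a majority and a
# minority subgroup, chosen to make the gap stark.
from sklearn.metrics import roc_auc_score

y_true_major = [0, 0, 0, 1, 1, 1]
score_major  = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]   # model ranks this group well

y_true_minor = [0, 1, 0, 1]
score_minor  = [0.6, 0.4, 0.7, 0.5]             # model ranks this group poorly

y_true_all = y_true_major + y_true_minor
score_all  = score_major + score_minor

print("overall AUROC :", roc_auc_score(y_true_all, score_all))      # looks acceptable
print("majority AUROC:", roc_auc_score(y_true_major, score_major))
print("minority AUROC:", roc_auc_score(y_true_minor, score_minor))  # far worse
```

It is this kind of hidden gap that motivates the selective-deployment proposal discussed next.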
This strategy asserts that limiting the deployment of the technology to the subgroup in which it works well facilitates benefits for those subpopulations. The alternative is not to deploy the technology in the optimal performance group but instead adopt a standard of equity in the performance overall to achieve parity among subgroups, what might be termed “equitable deployment”.
Some view this as a requirement to “level down” performance for the sake of equity, a view that is not unique to AI or healthcare and is the subject of a broader ethical debate4,5,6. Proponents of equitable deployment would counter: Can a commitment to fairness justify not deploying a technology that is likely to be effective but only for a specific subpopulation? Discussions around selective deployment do not take place in a vacuum and must be had with an awareness of the broader context of the attr. These considerations point to several practices for more responsible evaluation and deployment:
Access to high-resolution real-world data: Provide developers with diverse, comprehensive clinical datasets to train models on actual patient populations and scenarios.
Systematic evaluation pipelines: Implement robust data pipelines to continuously assess model performance and patient outcomes across various demographic and clinical subgroups (a minimal per-subgroup evaluation sketch follows this list).
Data shift monitoring: Develop dashboards to track changes in data distributions over time, alerting to potential model drift and ensuring ongoing relevance (a drift-statistic sketch follows this list).
Accountability frameworks: Establish clear responsibilities and oversight mechanisms for all stakeholders involved in the AI model lifecycle, from development to deployment.
Mandatory silent evaluation periods: Require a phase of background performance assessment in real clinical settings before active deployment, focusing on safety, efficacy, and equity (a shadow-mode logging sketch follows this list).
Multidisciplinary collaboration: Engage healthcare professionals, patients, and social scientists to define legitimate subgroup differences and ensure culturally competent AI systems.
Iterative refinement process: Implement a feedback loop for continuous model improvement based on real-world performance data and stakeholder input.
Transparency in reporting: Mandate clear documentation of model limitations, potential biases, and performance variations across different populations.
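On the evaluation-pipeline point, a minimal sketch of what per-subgroup reporting could look like is given below. The DataFrame columns (y_true, y_score, subgroup) and the choice of AUROC are illustrative assumptions, not a prescribed standard.

```python
# Minimal sketch: summarize discrimination performance separately for each subgroup.
# Column names and the metric are assumptions for illustration.
import pandas as pd
from sklearn.metrics import roc_auc_score

def subgroup_performance(df: pd.DataFrame, group_col: str = "subgroup") -> pd.DataFrame:
    """Return sample size, outcome prevalence, and AUROC for each subgroup."""
    rows = []
    for group, sub in df.groupby(group_col):
        if sub["y_true"].nunique() < 2:
            continue  # AUROC is undefined when a subgroup has only one outcome class
        rows.append({
            group_col: group,
            "n": len(sub),
            "prevalence": sub["y_true"].mean(),
            "auroc": roc_auc_score(sub["y_true"], sub["y_score"]),
        })
    return pd.DataFrame(rows).sort_values("auroc")
```

Reporting the worst-performing subgroup alongside the overall figure, rather than the overall figure alone, is the behavior this recommendation is after.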
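For data shift monitoring, one commonly used dashboard statistic is the population stability index (PSI), sketched below. The 0.2 alert threshold and the example feature are conventions assumed for illustration, not requirements from the text.

```python
# Minimal drift-statistic sketch: compare a feature's current distribution against
# a fixed reference (e.g., the training period). Larger PSI means stronger shift.
import numpy as np

def population_stability_index(reference: np.ndarray, current: np.ndarray, bins: int = 10) -> float:
    edges = np.linspace(np.min(reference), np.max(reference), bins + 1)
    ref_frac = np.histogram(reference, bins=edges)[0] / len(reference)
    cur_frac = np.histogram(np.clip(current, edges[0], edges[-1]), bins=edges)[0] / len(current)
    ref_frac = np.clip(ref_frac, 1e-6, None)  # avoid log(0)
    cur_frac = np.clip(cur_frac, 1e-6, None)
    return float(np.sum((cur_frac - ref_frac) * np.log(cur_frac / ref_frac)))

# Illustrative alert rule (0.2 is a common heuristic, not a universal standard):
# if population_stability_index(train_ages, this_month_ages) > 0.2: flag the model for review
```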
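For the silent evaluation period, the essential mechanic is that the model scores real encounters while its output is withheld from clinicians and only logged for later audit. A minimal sketch follows, assuming an sklearn-style classifier and a hypothetical JSONL log.

```python
# Minimal shadow-mode sketch: the prediction is computed and logged, never surfaced
# to the care team. Model interface, feature vector, and log format are assumptions.
import datetime
import json

def shadow_score(model, encounter_id: str, features: list, subgroup: str,
                 log_path: str = "silent_eval.jsonl") -> None:
    score = float(model.predict_proba([features])[0][1])  # sklearn-style probability of positive class
    record = {
        "encounter_id": encounter_id,
        "subgroup": subgroup,
        "score": score,
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")
    # Nothing is returned to the clinical workflow; outcomes are joined to this log at the
    # end of the silent period and assessed per subgroup before any go-live decision.
```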
References
1. Goetz, L., Seedat, N., Vandersluis, R. & van der Schaar, M. Generalization—a key challenge for responsible AI in patient-facing clinical applications. npj Digit. Med. 7, 1–4 (2024).
2. D’Amour, A. et al. Underspecification presents challenges for credibility in modern machine learning. J. Mach. Learn. Res. 23, 226:10237–226:10297 (2022).
3. Vandersluis, R. & Savulescu, J. The selective deployment of AI in healthcare. Bioethics 38, 391–400 (2024).
4. Seidman, L. M. The Ratchet wreck: equality’s leveling down problem. Ky. Law J. 110, 59–106 (2021).
5. Thomas, T. A. Leveling down gender equality. Harv. J. Law Gend. 42, 177–218 (2019).
6. Sessions v. Morales-Santana, 137 S. Ct. 1678. https://supreme.justia.com/cases/federal/us/582/15-1191 (2017).
7. Abràmoff, M. D. et al. Considerations for addressing bias in artificial intelligence for health equity. npj Digit. Med. 6, 1–7 (2023).
8. Basu, A. Use of Race in Clinical Algorithms. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10219586/.
9. Finlayson, S. G. et al. The clinician and dataset shift in artificial intelligence. N. Engl. J. Med. 385, 283–286 (2021).
10. Nestor, B. et al. Preparing a clinical support model for silent mode in general internal medicine. In Proc. of the 5th Machine Learning for Healthcare Conference (PMLR) (eds Doshi-Velez et al.) 950–972 (2020).
11. Kwong, J. C. C. et al. The silent trial-the bridge between bench-to-bedside clinical AI applications. Front. Digit. Health 4, 929508 (2022).
Acknowledgements
J.G. is funded by the National Institutes of Health through DS-I Africa U54 TW012043-01 and Bridge2AI OT2OD032701. D.B. declares funding from the NIH (NIH-USA U54CA274516-01A1 and R01CA294033-01). L.A.C. is funded by the National Institutes of Health through R01 EB017205, DS-I Africa U54TW012043-01, Bridge2AI OT2OD032701, and the National Science Foundation through ITEST #2148451. J.W.G. is supported by a 2022 Robert Wood Johnson Foundation Harold Amos Medical Faculty Development Program award and declares support from the RSNA Health Disparities grant (#EIHD2204), Lacuna Fund (#67), Gordon and Betty Moore Foundation, and NIH (NIBIB) MIDRC grant under contracts 75N92020C00008 and 75N92020C00021.
Author information
Authors and Affiliations
Laboratory for Computational Physiology, Massachusetts Institute of Technology, Cambridge, MA, USA: Jack Gallifant, Leo Anthony Celi & Joao Matos
Department of Critical Care, Guy’s and St Thomas’ NHS Foundation Trust, London, UK: Jack Gallifant
Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, USA: Danielle S. Bitterman
Department of Radiation Oncology, Brigham and Women’s Hospital/Dana-Farber Cancer Institute, Boston, MA, USA: Danielle S. Bitterman
Computational Health Informatics Program, Boston Children’s Hospital, Harvard Medical School, Boston, MA, USA: Danielle S. Bitterman
Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA: Leo Anthony Celi
Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA: Leo Anthony Celi
Department of Radiology, Emory University School of Medicine, Georgia, USA: Judy W. Gichoya
Faculty of Engineering, University of Porto, Porto, Portugal: Joao Matos
Institut.
Contributions
J.G., J.M., and L.A.C. drafted the initial manuscript and edited subsequent versions. D.S.B., J.W.G., L.G.M., and R.L.P. provided critical revisions, feedback, and edits to the manuscript drafts. All authors reviewed and approved the final version.
Corresponding author
Correspondence to Leo Anthony Celi.
Ethics declarations
Competing interests
L.A.C.: Editor in Chief of PLoS Digital Health; D.B.: Associate Editor of Radiation Oncology, HemOnc.org (no financial compensation, unrelated to this work). All other authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material.
You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
Cite this article
Gallifant, J., Bitterman, D. S., Celi, L. A. et al. Ethical debates amidst flawed healthcare artificial intelligence metrics. npj Digit. Med. 7, 243 (2024). https://doi.org/10.1038/s41746-024-01242-1
Received: 15 May 2024; Accepted: 29 August 2024; Published: 11 September 2024