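Liantao Ma (马连韬) is a Research Assistant Professor (助理研究员) and master's supervisor at the National Engineering Research Center for Software Engineering, Peking University. Ma received a Ph.D. in Computer Software and Theory from Peking University and was a Boya Postdoctoral Fellow in the Department of Computer Science at Peking University. Ma's long-term research covers the intersection of medicine and informatics and interpretable deep learning analysis of electronic health record (EHR) data, using large language models (LLMs) to support clinical work and medical research. The results serve smart-healthcare applications for patients with end-stage chronic kidney disease, patients with lymphoma, and obstetric diagnosis and treatment assistance, among others.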
Research Interests
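Intersection of medicine and informatics, smart healthcare, prognosis prediction, diagnosis and treatment assistance
Analysis of multivariate time-series electronic health record data
Interpretable deep learning
Large models for the medical vertical domain
Clinical applications: end-stage chronic kidney disease, lymphoma, obstetrics
Research Projects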
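2025.01-2025.12 R Consortium, Infrastructure Steering Committee (ISC) Grant Program, EHR modeling infrastructure for clinical data scientists, co-principal investigator (10 grants awarded worldwide per year; first award to a Chinese research institution since 2016)
2025.01-2027.12 National Natural Science Foundation of China, ongoing, principal investigator
2021.07-2023.07 China Postdoctoral Science Foundation, Special Funding (pre-station), completed, principal investigator (only 3 recipients nationwide in the software engineering discipline that year)
2023.01-2023.07 China Postdoctoral Science Foundation, General Funding, completed, principal investigator
2023.12-2025.04 *** Logistics Support, medicine-informatics intersection *** intelligent monitoring and recommendation system, principal investigator
2021.07-2023.06 Peking University, Boya Postdoctoral Fellowship, completed, principal investigator
2024.01-2026.12 National Natural Science Foundation of China, Regional Joint Key Program, ongoing, key member
2023.01-2025.12 National Natural Science Foundation of China, Special Program, ongoing, key member
2019.10-2021.10 Ministry of Science and Technology, National Key R&D Program, Frontier Science and Technology Innovation Special Project, completed, participant
2025.01-2027.12 Beijing Natural Science Foundation, Frontier Special Project, key member
2025.01-2027.12 Peking University, Medicine+X Pilot Program (医学+X领航计划), key member
Publications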
Wu, Y., Gao, J., Tang, W.*, Su, C., Zhu, Y., ... & Ma, L.* (2025). Exploring the Relationship Between Dietary Intake and Clinical Outcomes in Peritoneal Dialysis Patients. Health Data Science (HDS). Science Partner Journal. Corresponding author.
Liao, W., Zhu, Y., Wang, Z., Chu, X., Wang, Y., & Ma, L.* (2025). Learnable Prompt as Pseudo-Imputation: Reassessing the Necessity of Traditional EHR Data Imputation in Downstream Clinical Prediction. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. CCF-A (top tier of international conferences recommended by the China Computer Federation), corresponding author.
Wang, T., ... & Ma, L.* (2025). Adaptive Activation Steering: A Tuning-Free LLM Truthfulness Improvement Method for Diverse Hallucinations Categories. Web Conference (WWW). CCF-A, corresponding author.
Wang, Z., ... & Ma, L.* (2025). ColaCare: Enhancing Electronic Health Record Modeling through Large Language Model-Driven Multi-Agent Collaboration. Web Conference (WWW). CCF-A, corresponding author.
Ma, L., Zhang, C., Gao, J., Jiao, X., Yu, Z., Zhu, Y., ... & Wang, T. (2023). Mortality prediction with adaptive feature importance recalibration for peritoneal dialysis patients. Patterns, 4(12). Cell Press journal, cover article, first author.
Gao, J., Zhu, Y., Wang, W., Wang, Z., Dong, G., Tang, W., ... & Ma, L.* (2024). A comprehensive benchmark for COVID-19 predictive modeling using electronic health records in intensive care. Patterns, 5(4). Cell Press journal, corresponding author.
Gao, J., Wang, Z., Tang, W., Wang, Y., Wang, L., Ma, L.*, & Zhu, Y. (2025). An AI–Clinician Interaction System for Transparent and Actionable Clinical Decision Support. Symposium on Artificial Intelligence in Learning Health Systems (SAIL). NEJM AI Top Abstract Nomination, Travel Award.
Yu, Z., Zhang, C., Wang, Y., Tang, W., Wang, J., & Ma, L.* (2024, April). Predict and Interpret Health Risk Using EHR Through Typical Patients. In ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 1506-1510). IEEE. CCF-B, corresponding author.
Zhu, Y., Wang, Z., He, L., Xie, S., Zheng, X., Ma, L.*, & Pan, C.* (2024, October). PRISM: Mitigating EHR Data Sparsity via Learning from Missing Feature Calibrated Prototype Patient Representations. In Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (CIKM) (pp. 3560-3569). CCF-B, corresponding author.
Wang, T., Zhu, Y., Wang, Z., Tang, W.*, Zhao, X., Wang, T., ... Ma, L.*, & Wang, L.* (2024). Protocol to process follow-up electronic medical records of peritoneal dialysis patients to train AI models. STAR Protocols, 5(4), 103335. Cell Press journal, invited paper, corresponding author.
Wu, H., Zhu, Y., Wang, Z., Zheng, X., Wang, L., Tang, W., ... & Ma, L.* EHRFlow: A Large Language Model-Driven Iterative Multi-Agent Electronic Health Record Data Analysis Workflow. In Artificial Intelligence and Data Science for Healthcare: Bridging Data-Centric AI and People-Centric Healthcare. KDD 2024 Workshop, oral presentation, 20% acceptance rate, corresponding author.
Hong, S., Yin, D., Tang, G., Fu, T., Ma, L., Gao, J., ... & Zhang, L. (2024, August). Artificial Intelligence and Data Science for Healthcare: Bridging Data-Centric AI and People-Centric Healthcare. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (pp. 6720-6721). KDD 2024 Workshop co-chair.
Ma L, Ma X, Gao J, et al. Distilling knowledge from publicly available online EMR data to emerging epidemic for prognosis[C]//Proceedings of the Web Conference 2021. 2021: 3558-3568. CCF-A, first author.
Ma L, Zhang C, Wang Y, et al. ConCare: Personalized clinical feature embedding via capturing the healthcare context[C]//Proceedings of the AAAI Conference on Artificial Intelligence: volume 34. 2020: 833-840. CCF-A, first author.
Ma L, Gao J, Wang Y, et al. AdaCare: Explainable clinical health status representation learning via scale-adaptive feature extraction and recalibration[C]//Thirty-Fourth AAAI Conference on Artificial Intelligence. 2020. CCF-A, first author.
Ma Liantao (马连韬), Zhang Chaohe (张超贺), Jiao Xianfeng (焦贤锋), Wang Yasha (王亚沙), Tang Wen (唐雯), Zhao Junfeng (赵俊峰). Dr. Deep: 基于医疗特征上下文学习的患者健康状态可解释评估 [Dr. Deep: interpretable assessment of patient health status based on medical feature context learning]. 计算机研究与发展 (Journal of Computer Research and Development). 2021. CCF-A Chinese core journal, first author.
Ma Liantao (马连韬), Wang Yasha (王亚沙), Peng Guangju (彭广举), et al. 基于公交车轨迹数据的道路 GPS 环境友好性评估 [Assessing the GPS environment friendliness of roads based on bus trajectory data][J]. 计算机研究与发展 (Journal of Computer Research and Development), 2016, 53(12): 2694-2707. CCF-A Chinese core journal, first author.
Gao J, Zhu Y, Wang W, Wang Z, Dong G, Tang W, Wang H, Wang Y, Harrison E, Ma L*. A comprehensive benchmark for COVID-19 predictive modeling using electronic health records in intensive care. AMIA Summit. 2023. American Medical Informatics Association, international presentation, corresponding author.
Liao W, Liao Y, Fan Z, Zhang J, Li S, Yang J, Ma L*. Multi-modal Medical Vision-and-Language Learning for Retinal Vein Occlusion Classification. Health Data Science Summit. 2023. HDS Summit oral presentation, best abstract nomination, corresponding author.
Zhu Y, An J, Zhou E, An L, Gao J, Li H, Feng H, Hou B, Tang W, Pan C, Ma L*. Mitigating Bias in Healthcare Data through Multi-Level and Multi-Sensitive-Attribute Reweighting Method. Health Data Science Summit. 2023. HDS Summit poster, corresponding author.
Zhang C, Gao X, Ma L, et al. GRASP: Generic Framework for Health Status Representation Learning Based on Incorporating Knowledge from Similar Patients. 35th AAAI Conference on Artificial Intelligence (AAAI), 2021. CCF-A.
Zhang C, Chu X, Ma L, Zhu Y, Wang Y, Wang J, Zhao J. M3Care: Learning with missing modalities in multimodal healthcare data. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2022 Aug 14 (pp. 2418-2428). CCF-A.
Ma X, Wang Y, Chu X, Ma L, et al. Patient Health Representation Learning via Correlational Sparse Prior of Medical Features. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2022. CCF-A.
Ma X, Chu X, Wang Y, Lin Y, Zhao J, Ma L, et al. Fused Gromov-Wasserstein Graph Mixup for Graph-level Classifications. Advances in Neural Information Processing Systems (NeurIPS), 2023. CCF-A.
Wang J, Wang Y, Zhang D, Wang F, He Y, Ma L. PSAllocator: Multi-task allocation for participatory sensing with sensing capability constraints. In Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing, 2017 Feb 25 (pp. 1139-1151). CCF-A.
Wang Yasha (王亚沙), Ma Liantao (马连韬), et al. 基于时间窗口切割的健康风险关键事件检测方法及系统 [Method and system for detecting key health-risk events based on time-window segmentation]. Chinese national invention patent. 2022. CN112205965B. First student inventor, granted.
Wang Yasha (王亚沙), Ma Liantao (马连韬), et al. 一种患者潜在重要信息的确定方法和装置 [Method and apparatus for determining potentially important patient information]. Chinese national invention patent. 2023. CN112289444B. First student inventor, granted.
Lü Yunxiang (吕云翔), Ma Liantao (马连韬), et al. 机器学习基础 [Fundamentals of Machine Learning] (planned textbook series for big data technology and application programs). Tsinghua University Press. 2018. First student author.
Conference Organization
2024.10.31 Seminar on Advancing Healthcare Informatics, Insights from Cell Press Patterns/Matter/iScience and Peking University
2024.08.26 SIGKDD Workshop, Artificial Intelligence and Data Science for Healthcare, Bridging Data-Centric AI and People-Centric Healthcare, Barcelona, Spain
2024.01.07 AI in Medicine League (AIMEL)
Teaching
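2024.12 Xuzhou First People's Hospital, Intersection of Artificial Intelligence and Medicine
2024.12 Peking University, College of Engineering, Machine Learning and Big Data Analysis
2024.11 Peking University Tianjin Binhai Institute of New-Generation Information Technology (北京大学天津滨海新一代信息技术研究院), Tianjin University of Finance and Economics, Machine Learning and Data Mining
2024.10 Peking University, School of Software and Microelectronics, Frontiers of Software Engineering (required course for doctoral students)
2024.01 Peking University, School of Computer Science, ICS
Invited Talks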
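2025.02.24 Guangxi Medical University, AI-Empowered Medicine
2024.12.28 Xuzhou Health Management Society, Chronic Kidney Disease Prevention and Control, Dietary and Nutrition Recommendation for Peritoneal Dialysis Patients
2024.12.22 North China Symposium on Immunotherapy for Hematologic Malignancies, Benefit Assessment and Medication Recommendation for R Maintenance After First-Line Treatment of Follicular Lymphoma
2024.12.06 China Obstetrics Quality Control Conference, LLM-Supported Assistance for Obstetric Doctor-Patient Communication
2024.10.28 International Symposium on High Confidence Software, High Confidence Software on AI-Medicine Intersection
2024.09.02 Seminar at University of Edinburgh, Building Trustworthy and Accessible Clinical Prediction Framework
2024.06.01 Inner Mongolia Hospital Association Blood Purification Academic Conference, Development of Interpretable Prognosis Prediction for Peritoneal Dialysis Patients
Experience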
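2023.07-present Peking University, National Engineering Research Center for Software Engineering, Research Assistant Professor (助理研究员)
2021.07-2023.07 Peking University, Department of Computer Science, Boya Postdoctoral Fellow (co-advisor: Prof. Yasha Wang, 王亚沙)
2016.07-2021.07 Peking University, School of Electronics Engineering and Computer Science (信息科学技术学院), Doctor of Science (advisor: Prof. Bing Xie, 谢冰)
2012.07-2016.07 Beihang University, School of Software, Bachelor of Engineering (advisors: Prof. Hongyi Li, 李红裔, and Prof. Yunxiang Lü, 吕云翔)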