Trained on over 400,000 patient records, the AI predicts health trajectories for up to 20 years.
Remember the last time you visited the doctor? They likely asked you about your medical history.
For many conditions, this information isn't just relevant for diagnosis and treatment; it's also valuable for prevention. A range of AI algorithms can now predict the risk of single medical conditions, such as cardiovascular disease and cancer, based on medical records.
But diseases don't exist in a vacuum. Some conditions may increase the risk of others. A full picture of a person's health trajectory would predict risk across a range of diseases. This could not only inform early treatment, but also surface vulnerable groups of people for screening and other preventative measures. And it could identify people at risk for a condition -- say, high blood pressure or breast cancer -- who don't necessarily fit the usual criteria.
Recently, a team from the German Cancer Research Center and collaborators released an AI "oracle" that predicts a person's risk of getting over 1,000 common diseases decades in the future. Dubbed Delphi-2M, the AI is a type of large language model, like the algorithms powering popular chatbots.
Rather than training the AI on text, however, the team fed it over 400,000 medical records from the UK Biobank, a massive study tracking participants' health as they age. After adding lifestyle information, such as body mass, smoking, and drinking habits, Delphi could predict any participant's chance of multiple diseases for at least two decades.
Though it was trained only on the Biobank cohort, the AI mapped the health trajectories of nearly two million people in Denmark without any changes to its setup, suggesting it had captured the crux of disease risk and interaction. Delphi is also explainable, in that it lays out the rationale for its assessment.
The tool is "an achievement" that sets "a new standard for both predictive accuracy and interpretability" for healthcare, said Justin Stebbing at Anglia Ruskin University, who was not involved in the study.
Healthcare is shifting from treatment to prevention. But individual guidance can be confusing. Take mammograms. Recommendations on the age at which to start testing have shifted from 40 to 50 and back to 40. More broadly, as the world ages, modeling the burden of cancer, dementia, and other diseases could better prepare healthcare systems for the so-called "silver tsunami."
Here's where medical AI comes in. Early tools were crafted to diagnose conditions based on medical images. But large language models have opened a whole new avenue for prediction.
These algorithms and classic disease modeling share a common logic. A large language model treats language as a sequence of word fragments known as tokens. It then generates responses token by token, based on patterns it has learned from text scraped from online resources. With enough training data, the AI learns how tokens relate to one another statistically and can generate human-like responses.
Predicting the progression of diseases is somewhat similar. If every step in the progression of a disease is a token, then predicting what's next means statistically establishing how the tokens connect. Scientists have already used large language model-like algorithms trained on electronic health records to predict single diseases including cancer, stroke, and self-harm.
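To make the analogy concrete, here is a minimal Python sketch that treats each diagnosis as a token and estimates what tends to come next by simple counting. It is a toy illustration only, not how Delphi or any published model works: the trajectories are invented, the names fit_bigram and next_event_probs are hypothetical, and a real language model conditions on the whole history rather than just the previous event.

```python
from collections import Counter, defaultdict

# Invented toy trajectories: each list is one person's diagnoses in order.
trajectories = [
    ["hypertension", "type_2_diabetes", "heart_attack"],
    ["hypertension", "heart_attack"],
    ["hypertension", "type_2_diabetes", "kidney_disease"],
    ["asthma", "hypertension", "type_2_diabetes"],
]


def fit_bigram(sequences):
    """Count how often each event is directly followed by each other event."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    return counts


def next_event_probs(counts, current):
    """Convert the follow-up counts for one event into probabilities."""
    followers = counts.get(current, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}  # event never seen with a successor
    return {event: n / total for event, n in followers.items()}


model = fit_bigram(trajectories)
probs = next_event_probs(model, "hypertension")
print(probs)
```

In this invented data, three of the four people with hypertension are diagnosed with type 2 diabetes next, so the sketch assigns that step a probability of 0.75. A large language model learns the same kind of statistical link, but across far longer histories and far more data.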
But tackling multiple diseases at once is another beast altogether.
Earlier this year, an AI called Foresight took medical prediction a step further. Trained on 57 million anonymized health records from England's National Health Service, Foresight learned to predict hospitalizations, heart attacks, and hundreds of other conditions. But the algorithm was limited to Covid-19 research due to privacy concerns.
The German team designed Delphi to recognize the diagnostic code for each illness as a token. These codes are standardized globally. The team then modified the large language model so that it can incorporate new information -- for example, blood test results -- and re-evaluate its predictions.
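As a rough picture of what "diagnostic codes as tokens" could look like, the Python sketch below turns one person's record into a sequence of integer token IDs. Everything here is an assumption made for illustration: the article does not name the coding system (the example uses ICD-10-style codes), the record and ages are invented, and build_vocab and encode are hypothetical helpers rather than anything from the Delphi code.

```python
# Invented record: (age in years at diagnosis, diagnostic code).
# Codes follow the ICD-10 style: I10 hypertension, E11 type 2 diabetes,
# I21 acute heart attack. The choice of coding system is an assumption.
record = [(52.7, "E11"), (45.2, "I10"), (60.1, "I21")]

UNKNOWN = "<unk>"  # placeholder token for codes missing from the vocabulary


def build_vocab(records):
    """Assign each distinct code an integer ID, reserving 0 for unknown codes."""
    codes = sorted({code for rec in records for _, code in rec})
    vocab = {UNKNOWN: 0}
    for idx, code in enumerate(codes, start=1):
        vocab[code] = idx
    return vocab


def encode(rec, vocab):
    """Order events by age and replace each code with its token ID."""
    return [(age, vocab.get(code, vocab[UNKNOWN])) for age, code in sorted(rec)]


vocab = build_vocab([record])
tokens = encode(record, vocab)
print(vocab)
print(tokens)
```

Keeping the age next to each token preserves when an event happened, not just the order, which matters for any prediction that spans decades. Adding a new event to the record and encoding it again simply extends the sequence a model would read.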