
Every year, global health experts are faced with a high-stakes decision: Which influenza strains should go into the next seasonal vaccine? The choice must be made months in advance, long before flu season even begins, and it can often feel like a race against the clock. If the selected strains match those that circulate, the vaccine will likely be highly effective. But if the prediction is off, protection can drop significantly, leading to (potentially preventable) illness and strain on health care systems.
This challenge became even more familiar to scientists during the Covid-19 pandemic. Think back to the time (and time and time again) when new variants emerged just as vaccines were being rolled out. Influenza behaves like a similar, rowdy cousin, mutating constantly and unpredictably. That makes it hard to stay ahead, and harder still to design vaccines that remain protective.
To reduce this uncertainty, scientists at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and the MIT Abdul Latif Jameel Clinic for Machine Learning in Health set out to make vaccine selection more accurate and less reliant on guesswork. They created an AI system called VaxSeer, designed to predict dominant flu strains and identify the most protective vaccine candidates, months ahead of time. The tool uses deep learning models trained on decades of viral sequences and lab test results to simulate how the flu virus might evolve and how the vaccines will respond.
Traditional evolution models often analyze the effect of single amino acid mutations independently. “VaxSeer adopts a large protein language model to learn the relationship between dominance and the combinatorial effects of mutations,” explains Wenxian Shi, a PhD student in MIT’s Department of Electrical Engineering and Computer Science, researcher at CSAIL, and lead author of a new paper on the work. “Unlike existing protein language models that assume a static distribution of viral variants, we model dynamic dominance shifts, making it better suited for rapidly evolving viruses like influenza.”
An open-access report on the study was published today in Nature Medicine.
The future of flu
VaxSeer has two core prediction engines: one that estimates how likely each viral strain is to spread (dominance), and another that estimates how effectively a vaccine will neutralize that strain (antigenicity). Together, they produce a predicted coverage score: a forward-looking measure of how well a given vaccine is likely to perform against future viruses.
The score ranges from negative infinity to 0. The closer the score is to 0, the better the antigenic match between the vaccine strains and the circulating viruses. (You can think of it as the negative of a kind of “distance.”)
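To make the idea of a coverage score concrete, here is a minimal sketch in Python. It assumes the score is a dominance-weighted average of per-strain antigenic match values, each at most 0; that weighting, the function name `coverage_score`, and all strain names and numbers are illustrative assumptions, not VaxSeer's published formula or data.

```python
def coverage_score(dominance, antigenicity):
    """Dominance-weighted average antigenic match for one vaccine candidate.

    dominance:    dict of strain -> predicted share of circulating viruses
    antigenicity: dict of strain -> match value <= 0 (0 is a perfect match)
    Both inputs are hypothetical stand-ins for the two model outputs.
    """
    total_share = sum(dominance.values())
    weighted = sum(share * antigenicity[strain] for strain, share in dominance.items())
    return weighted / total_share


# Made-up numbers, for illustration only.
predicted_dominance = {"strain_A": 0.6, "strain_B": 0.3, "strain_C": 0.1}
candidates = {
    "vaccine_X": {"strain_A": -0.5, "strain_B": -2.0, "strain_C": -4.0},
    "vaccine_Y": {"strain_A": -2.0, "strain_B": -0.5, "strain_C": -0.5},
}

scores = {name: coverage_score(predicted_dominance, match) for name, match in candidates.items()}
best = max(scores, key=scores.get)  # the candidate whose score is closest to 0
print(scores, best)
```

In this toy example, the candidate that best matches the strain expected to dominate ends up with the score closest to 0, even though it matches the minor strains poorly.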
In a 10-year retrospective study, the researchers evaluated VaxSeer’s recommendations against those made by the World Health Organization (WHO) for two major flu subtypes: A/H3N2 and A/H1N1. For A/H3N2, VaxSeer’s choices outperformed the WHO’s in nine out of 10 seasons, based on retrospective empirical coverage scores (a surrogate metric of vaccine effectiveness, calculated from the observed dominance in past seasons and experimental hemagglutination inhibition (HI) test results). The team used this surrogate to evaluate vaccine selections because real-world effectiveness is available only for vaccines actually given to the population.
For A/H1N1, it outperformed or matched the WHO in six out of 10 seasons. In one notable case, for the 2016 flu season, VaxSeer identified a strain that wasn’t chosen by the WHO until the following year. The model’s predictions also showed strong correlation with real-world vaccine effectiveness estimates, as reported by the CDC, Canada’s Sentinel Practitioner Surveillance Network, and Europe’s I-MOVE program. VaxSeer’s predicted coverage scores aligned closely with public health data on flu-related illnesses and medical visits prevented by vaccination.
So how exactly does VaxSeer make sense of all these data? Intuitively, the model first estimates how rapidly a viral strain spreads over time using a protein language model, and then determines its dominance by accounting for competition among different strains.
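One simple way to picture competition among strains is a replicator-style ordinary differential equation, in which each strain's share grows when its spread rate is above the population average and shrinks otherwise. The sketch below is an illustrative assumption rather than VaxSeer's actual model: the growth rates are made-up constants (VaxSeer derives its rates from a protein language model), and `simulate_dominance` is a hypothetical helper that integrates the equation with small Euler steps.

```python
def simulate_dominance(initial_share, growth_rate, months, steps_per_month=100):
    """Integrate dx_i/dt = x_i * (r_i - sum_j r_j * x_j) with Euler steps.

    initial_share: dict of strain -> current share of infections (sums to 1)
    growth_rate:   dict of strain -> assumed per-month spread rate
    Returns the predicted share of each strain after `months`.
    """
    share = dict(initial_share)
    dt = 1.0 / steps_per_month
    for _ in range(int(months * steps_per_month)):
        mean_rate = sum(growth_rate[s] * share[s] for s in share)
        share = {s: x + dt * x * (growth_rate[s] - mean_rate) for s, x in share.items()}
        total = sum(share.values())  # renormalize to limit numerical drift
        share = {s: x / total for s, x in share.items()}
    return share


# Made-up example: a rare new strain that spreads faster than the current one.
forecast = simulate_dominance(
    initial_share={"old_strain": 0.9, "new_strain": 0.1},
    growth_rate={"old_strain": 0.0, "new_strain": 0.5},
    months=9,
)
print(forecast)
```

With these toy inputs, the faster-spreading strain starts at a 10 percent share and overtakes the older one within nine months, which is the kind of dominance shift a vaccine choice has to anticipate.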
Once the model has estimated these spread rates, they’re plugged into a mathematical framework based on ordinary differential equations to simulate viral spread over time. For antigenicity, the system estimates how well a given vaccine strain will perform in a common lab test called the hemagglutination inhibition (HI) assay. This measures how effectively antibodies can inhibit the virus from binding to human red blood cells, which is a widely used proxy for antigenic match.
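HI results are usually reported as titers from a twofold dilution series, and a common convention is to compare strains by the log2 fold drop between the titer against the vaccine strain itself (the homologous titer) and the titer against a circulating strain. The sketch below turns that drop into a match value that is 0 for a perfect match and more negative as the match worsens. The function name and titer values are hypothetical, and this is not presented as VaxSeer's exact antigenicity formula.

```python
import math


def antigenic_match(homologous_titer, heterologous_titer):
    """Return a match value <= 0 based on the log2 fold drop in HI titer.

    homologous_titer:   titer of vaccine-raised antibodies against the vaccine strain
    heterologous_titer: titer of the same antibodies against a circulating strain
    """
    fold_drop = math.log2(homologous_titer / heterologous_titer)
    # A titer that is equal or higher counts as a perfect match (0).
    return 0.0 - max(fold_drop, 0.0)


# Made-up titers: an eightfold drop corresponds to three twofold dilution steps.
print(antigenic_match(1280, 160))   # -3.0
print(antigenic_match(640, 640))    # 0.0
```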
Outpacing evolution
“By modeling how viruses evolve and how vaccines interact with them, AI tools like VaxSeer could help health officials make better, faster decisions — and stay one step ahead in the race between infection and immunity,” says Shi.
VaxSeer currently focuses only on the flu virus’s HA (hemagglutinin) protein, the major antigen of influenza. Future versions could incorporate other proteins like NA (neuraminidase), and factors like immune history, manufacturing constraints, or dosage levels. Applying the system to other viruses would also require large, high-quality datasets that track both viral evolution and immune responses, data that aren’t always publicly available. The team, however, is currently working on methods that can predict viral evolution in low-data regimes by building on relations between viral families.
“Given the speed of viral evolution, current therapeutic development often lags behind. VaxSeer is our attempt to catch up,” says Regina Barzilay, the School of Engineering Distinguished Professor for AI and Health at MIT, AI lead of Jameel Clinic, and CSAIL principal investigator.
“This paper is impressive, but what excites me perhaps even more is the team’s ongoing work on predicting viral evolution in low-data settings,” says Assistant Professor Jon Stokes of the Department of Biochemistry and Biomedical Sciences at McMaster University in Hamilton, Ontario. “The implications go far beyond influenza. Imagine being able to anticipate how antibiotic-resistant bacteria or drug-resistant cancers might evolve, both of which can adapt rapidly. This kind of predictive modeling opens up a powerful new way of thinking about how diseases change, giving us the opportunity to stay one step ahead and design clinical interventions before escape becomes a major problem.”
Shi and Barzilay wrote the paper with MIT CSAIL postdoc Jeremy Wohlwend ’16, MEng ’17, PhD ’25 and recent CSAIL affiliate Menghua Wu ’19, MEng ’20, PhD ’25. Their work was supported, in part, by the U.S. Defense Threat Reduction Agency and MIT Jameel Clinic.