• About Us
  • Disclaimer
  • Contact Us
  • Privacy Policy
Tuesday, June 9, 2026
mGrowTech
No Result
View All Result
  • Technology And Software
    • Account Based Marketing
    • Channel Marketing
    • Marketing Automation
      • Al, Analytics and Automation
      • Ad Management
  • Digital Marketing
    • Social Media Management
    • Google Marketing
  • Direct Marketing
    • Brand Management
    • Marketing Attribution and Consulting
  • Mobile Marketing
  • Event Management
  • PR Solutions
  • Technology And Software
    • Account Based Marketing
    • Channel Marketing
    • Marketing Automation
      • Al, Analytics and Automation
      • Ad Management
  • Digital Marketing
    • Social Media Management
    • Google Marketing
  • Direct Marketing
    • Brand Management
    • Marketing Attribution and Consulting
  • Mobile Marketing
  • Event Management
  • PR Solutions
No Result
View All Result
mGrowTech
No Result
View All Result
Home Al, Analytics and Automation

A new model predicts how molecules will dissolve in different solvents | MIT News

Josh by Josh
August 19, 2025
in Al, Analytics and Automation
0
A new model predicts how molecules will dissolve in different solvents | MIT News



Using machine learning, MIT chemical engineers have created a computational model that can predict how well any given molecule will dissolve in an organic solvent — a key step in the synthesis of nearly any pharmaceutical. This type of prediction could make it much easier to develop new ways to produce drugs and other useful molecules.

The new model, which predicts how much of a solute will dissolve in a particular solvent, should help chemists to choose the right solvent for any given reaction in their synthesis, the researchers say. Common organic solvents include ethanol and acetone, and there are hundreds of others that can also be used in chemical reactions.

“Predicting solubility really is a rate-limiting step in synthetic planning and manufacturing of chemicals, especially drugs, so there’s been a longstanding interest in being able to make better predictions of solubility,” says Lucas Attia, an MIT graduate student and one of the lead authors of the new study.

The researchers have made their model freely available, and many companies and labs have already started using it. The model could be particularly useful for identifying solvents that are less hazardous than some of the most commonly used industrial solvents, the researchers say.

“There are some solvents which are known to dissolve most things. They’re really useful, but they’re damaging to the environment, and they’re damaging to people, so many companies require that you have to minimize the amount of those solvents that you use,” says Jackson Burns, an MIT graduate student who is also a lead author of the paper. “Our model is extremely useful in being able to identify the next-best solvent, which is hopefully much less damaging to the environment.”

William Green, the Hoyt Hottel Professor of Chemical Engineering and director of the MIT Energy Initiative, is the senior author of the study, which appears today in Nature Communications. Patrick Doyle, the Robert T. Haslam Professor of Chemical Engineering, is also an author of the paper.

Solving solubility

The new model grew out of a project that Attia and Burns worked on together in an MIT course on applying machine learning to chemical engineering problems. Traditionally, chemists have predicted solubility with a tool known as the Abraham Solvation Model, which can be used to estimate a molecule’s overall solubility by adding up the contributions of chemical structures within the molecule. While these predictions are useful, their accuracy is limited.

In the past few years, researchers have begun using machine learning to try to make more accurate solubility predictions. Before Burns and Attia began working on their new model, the state-of-the-art model for predicting solubility was a model developed in Green’s lab in 2022.

That model, known as SolProp, works by predicting a set of related properties and combining them, using thermodynamics, to ultimately predict the solubility. However, the model has difficulty predicting solubility for solutes that it hasn’t seen before.

“For drug and chemical discovery pipelines where you’re developing a new molecule, you want to be able to predict ahead of time what its solubility looks like,” Attia says.

Part of the reason that existing solubility models haven’t worked well is because there wasn’t a comprehensive dataset to train them on. However, in 2023 a new dataset called BigSolDB was released, which compiled data from nearly 800 published papers, including information on solubility for about 800 molecules dissolved about more than 100 organic solvents that are commonly used in synthetic chemistry.

Attia and Burns decided to try training two different types of models on this data. Both of these models represent the chemical structures of molecules using numerical representations known as embeddings, which incorporate information such as the number of atoms in a molecule and which atoms are bound to which other atoms. Models can then use these representations to predict a variety of chemical properties.

One of the models used in this study, known as FastProp and developed by Burns and others in Green’s lab, incorporates “static embeddings.” This means that the model already knows the embedding for each molecule before it starts doing any kind of analysis.

The other model, ChemProp, learns an embedding for each molecule during the training, at the same time that it learns to associate the features of the embedding with a trait such as solubility. This model, developed across multiple MIT labs, has already been used for tasks such as antibiotic discovery, lipid nanoparticle design, and predicting chemical reaction rates.

The researchers trained both types of models on over 40,000 data points from BigSolDB, including information on the effects of temperature, which plays a significant role in solubility. Then, they tested the models on about 1,000 solutes that had been withheld from the training data. They found that the models’ predictions were two to three times more accurate than those of SolProp, the previous best model, and the new models were especially accurate at predicting variations in solubility due to temperature.

“Being able to accurately reproduce those small variations in solubility due to temperature, even when the overarching experimental noise is very large, was a really positive sign that the network had correctly learned an underlying solubility prediction function,” Burns says.

Accurate predictions

The researchers had expected that the model based on ChemProp, which is able to learn new representations as it goes along, would be able to make more accurate predictions. However, to their surprise, they found that the two models performed essentially the same. That suggests that the main limitation on their performance is the quality of the data, and that the models are performing as well as theoretically possible based on the data that they’re using, the researchers say.

“ChemProp should always outperform any static embedding when you have sufficient data,” Burns says. “We were blown away to see that the static and learned embeddings were statistically indistinguishable in performance across all the different subsets, which indicates to us that that the data limitations that are present in this space dominated the model performance.”

The models could become more accurate, the researchers say, if better training and testing data were available — ideally, data obtained by one person or a group of people all trained to perform the experiments the same way.

“One of the big limitations of using these kinds of compiled datasets is that different labs use different methods and experimental conditions when they perform solubility tests. That contributes to this variability between different datasets,” Attia says.

Because the model based on FastProp makes its predictions faster and has code that is easier for other users to adapt, the researchers decided to make that one, known as FastSolv, available to the public. Multiple pharmaceutical companies have already begun using it.

“There are applications throughout the drug discovery pipeline,” Burns says. “We’re also excited to see, outside of formulation and drug discovery, where people may use this model.”

The research was funded, in part, by the U.S. Department of Energy.



Source_link

READ ALSO

ClawHub Security Signals: A Coding Guide to End-to-End Security Signal Analysis and Verdict Classification on the AI Skills Dataset

Microsoft AI Introduces MAI-Transcribe-1.5: 2.4% WER on Artificial Analysis, Best-in-Class FLEURS Accuracy, and Up to 5x Faster Long-Audio Transcription

Related Posts

ClawHub Security Signals: A Coding Guide to End-to-End Security Signal Analysis and Verdict Classification on the AI Skills Dataset
Al, Analytics and Automation

ClawHub Security Signals: A Coding Guide to End-to-End Security Signal Analysis and Verdict Classification on the AI Skills Dataset

June 8, 2026
Microsoft AI Introduces MAI-Transcribe-1.5: 2.4% WER on Artificial Analysis, Best-in-Class FLEURS Accuracy, and Up to 5x Faster Long-Audio Transcription
Al, Analytics and Automation

Microsoft AI Introduces MAI-Transcribe-1.5: 2.4% WER on Artificial Analysis, Best-in-Class FLEURS Accuracy, and Up to 5x Faster Long-Audio Transcription

June 8, 2026
Building Reflective Prompt Optimization with GEPA: Multi-Component Prompts, Structured Feedback, and Held-Out Validation
Al, Analytics and Automation

Building Reflective Prompt Optimization with GEPA: Multi-Component Prompts, Structured Feedback, and Held-Out Validation

June 7, 2026
Best 21 Low-Code and No-Code AI Tools in 2026
Al, Analytics and Automation

Best 21 Low-Code and No-Code AI Tools in 2026

June 7, 2026
Tod Machover receives George Peabody Medal for contributions to music and technology | MIT News
Al, Analytics and Automation

Tod Machover receives George Peabody Medal for contributions to music and technology | MIT News

June 6, 2026
Moonshot AI Releases Kimi Code CLI: A Terminal AI Coding Agent Built in TypeScript for Next-Gen Agents
Al, Analytics and Automation

Moonshot AI Releases Kimi Code CLI: A Terminal AI Coding Agent Built in TypeScript for Next-Gen Agents

June 6, 2026
Next Post
Stop benchmarking in the lab: Inclusion Arena shows how LLMs perform in production

Stop benchmarking in the lab: Inclusion Arena shows how LLMs perform in production

POPULAR NEWS

Trump ends trade talks with Canada over a digital services tax

Trump ends trade talks with Canada over a digital services tax

June 28, 2025
15 Trending Songs on TikTok in 2025 (+ How to Use Them)

15 Trending Songs on TikTok in 2025 (+ How to Use Them)

June 18, 2025
Communication Effectiveness Skills For Business Leaders

Communication Effectiveness Skills For Business Leaders

June 10, 2025
App Development Cost in Singapore: Pricing Breakdown & Insights

App Development Cost in Singapore: Pricing Breakdown & Insights

June 22, 2025
Comparing the Top 7 Large Language Models LLMs/Systems for Coding in 2025

Comparing the Top 7 Large Language Models LLMs/Systems for Coding in 2025

November 4, 2025

EDITOR'S PICK

A childhood dream made real

A childhood dream made real

April 1, 2026
Appinventiv’s Scalability Solutions for FinTech Platforms

Appinventiv’s Scalability Solutions for FinTech Platforms

July 11, 2025
The Turnaround Plan Hain Celestial Needs Right Now

The Turnaround Plan Hain Celestial Needs Right Now

September 30, 2025
Blue Origin scrubs second New Glenn launch, will try again November 12

Blue Origin scrubs second New Glenn launch, will try again November 12

November 10, 2025

About

We bring you the best Premium WordPress Themes that perfect for news, magazine, personal blog, etc. Check our landing page for details.

Follow us

Categories

  • Account Based Marketing
  • Ad Management
  • Al, Analytics and Automation
  • Brand Management
  • Channel Marketing
  • Digital Marketing
  • Direct Marketing
  • Event Management
  • Google Marketing
  • Marketing Attribution and Consulting
  • Marketing Automation
  • Mobile Marketing
  • PR Solutions
  • Social Media Management
  • Technology And Software
  • Uncategorized

Recent Posts

  • The Scoop: Tim Cook makes a play for his legacy at final WWDC
  • 12 best online reputation management tools for 2026
  • Researchers trained an open source AI search agent, Harness-1, that outperforms GPT-5.4 on recalling relevant information
  • Stephen Curry and Curry Brand Enter Long-Term Deal with LI-NING
  • About Us
  • Disclaimer
  • Contact Us
  • Privacy Policy
No Result
View All Result
  • Technology And Software
    • Account Based Marketing
    • Channel Marketing
    • Marketing Automation
      • Al, Analytics and Automation
      • Ad Management
  • Digital Marketing
    • Social Media Management
    • Google Marketing
  • Direct Marketing
    • Brand Management
    • Marketing Attribution and Consulting
  • Mobile Marketing
  • Event Management
  • PR Solutions