How generative models are driving accelerated discovery

Science is built on questions. They are the very basis of the discipline, in which researchers ask anything from why a particular process occurs (cancer, for example) to how it can be addressed (for instance, with more targeted drugs).

But science takes time and money, said Matteo Manica, a research staff member with IBM Research. Any hypothesis can have numerous possible answers, and it is simply not possible to test them all.

This makes generative models important, especially in this age of “instant discovery” driven by the convergence of artificial intelligence (AI), cloud and quantum computing. As Manica explained, these generative models can be trained on known molecules to help propose new candidates and a new set of criteria for answering numerous scientific questions.

“The generative model is probably our most powerful tool right now to take advantage of the vast store of data in science,” Manica said, and to use it to come up with starting points for designing and discovering new materials, drugs and more.

IBM goes open source

To accelerate this process, IBM Research has released the open source library Generative Toolkit for Scientific Discovery (GT4SD). The toolkit includes various generative models developed by IBM researchers that can be used to accelerate hypothesis generation and the discovery process as a whole, said Manica, its chief architect. It also eases the adoption of generative AI.

GT4SD is compatible with popular deep learning frameworks and libraries, including PyTorch, PyTorch Lightning, Hugging Face Transformers, GuacaMol and MOSES.

“We want to promote an open community around scientific discovery,” Manica said. “Technology like AI should be a tool that scientists and researchers use to make their research faster and more efficient, without requiring very specific domain knowledge to use.”

The more science, the more users

The goal is to connect, collaborate and “promote more science through more users,” agreed John R. Smith, an IBM Fellow at IBM Research. Researchers and other professionals in education, government, industry and business can collaborate to develop, adapt, measure and compare open source technologies, contributions and projects.

For example, GT4SD includes models that can create new molecular designs based on desired properties, including target proteins, binding energy, or other targets related to materials and drug discovery. Users can also explore macromolecules, enzymes, tissues and polymers, for uses such as preventive treatments and antimicrobial applications. Manica looks forward to future uses not only in healthcare and life sciences, but also in agriculture and sustainability.

A group of IBM researchers created a generative model that could propose new antimicrobial peptides along with their properties. These are novel drug candidates that had not been previously identified. This was a critical discovery, Manica noted, because this class of peptides is considered a “last resort drug” against antimicrobial resistance, one of the world’s biggest threats to global health and food security.

Novel candidate molecules were generated, then passed to another AI system that screened them using predictive models for toxicity and broad-spectrum activity. Within a few weeks, researchers identified several dozen novel candidate molecules, a process that usually takes years.
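The generate-then-filter workflow described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the random sampler stands in for a trained generative model, the two scoring functions are crude placeholder heuristics rather than real toxicity or activity predictors, and none of the names come from GT4SD or from the IBM researchers' actual pipeline.

```python
import random
from dataclasses import dataclass

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter residue codes


@dataclass(frozen=True)
class Candidate:
    sequence: str
    toxicity: float  # placeholder score in [0, 1], lower is better
    activity: float  # placeholder score in [0, 1], higher is better


def generate_candidates(n, length, rng):
    """Hypothetical stand-in for a generative model: samples random peptides."""
    return ["".join(rng.choice(AMINO_ACIDS) for _ in range(length)) for _ in range(n)]


def predict_toxicity(sequence):
    """Placeholder heuristic (not a real predictor): share of hydrophobic residues."""
    return sum(residue in "AILMFWV" for residue in sequence) / len(sequence)


def predict_activity(sequence):
    """Placeholder heuristic (not a real predictor): share of K and R residues."""
    return sum(residue in "KR" for residue in sequence) / len(sequence)


def screen(sequences, known, max_toxicity=0.4, min_activity=0.15):
    """Keep novel sequences that pass both thresholds, most active first."""
    shortlist = []
    for seq in dict.fromkeys(sequences):  # drop duplicates, keep order
        if seq in known:  # novelty check against already identified peptides
            continue
        cand = Candidate(seq, predict_toxicity(seq), predict_activity(seq))
        if cand.toxicity <= max_toxicity and cand.activity >= min_activity:
            shortlist.append(cand)
    return sorted(shortlist, key=lambda c: c.activity, reverse=True)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    proposals = generate_candidates(n=1000, length=12, rng=rng)
    shortlist = screen(proposals, known=set())
    print(f"{len(proposals)} proposed, {len(shortlist)} shortlisted")
```

The point of the structure is that generation is cheap and broad, while the filtering stage narrows thousands of proposals to a shortlist small enough to test in a lab. In a real setting the thresholds and predictors would come from validated models, not the toy heuristics used here.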

In another example, a team used generative models in combination with AI and high-performance equipment to come up with a new photoacid generator (PAG), which is key to the production of semiconductors. What usually takes years was completed in a week.

“These models learn how to be novel,” Manica said. “They can learn how to create a variety of new inputs that can be valuable.”

In the well-established circular process of the scientific method, researchers form hypotheses, study, test, evaluate, and then return to their original question. In general, this can cost from $10 million to $100 million and take up to 10 years to complete.

“The generative model can greatly reduce the time it takes and reduce costs,” Smith said. “There are applications in many different areas. It can help speed up the search for issues related to climate, sustainability and healing.”

Manica noted that scientific breakthroughs have often come from a combination of curiosity and creativity, trial and error. While this can be methodical, it can also be slow, which is a poor fit for times when solving problems is urgent (such as during COVID-19).

How AI can speed up discovery

The future demands faster discovery. “This is an area where AI can help us a lot,” said Manica. Generative models can be creative assistants that help researchers break down barriers and think in ways they may not have in the past, he said, thus sparking more idea generation and so-called “Eureka!” moments.

He sees further impact in applying generative models to the scientific thought process itself: deciding what questions researchers should ask before they set out to find answers.

“Looking at everything we know about a field, what should we ask next?” said Manica. “Where we don’t know where to start, such as how to find a new antiviral for an unknown protein, or whether we can create a catalyst for CO2 in the atmosphere, we could potentially build generative models to answer those questions.”

A test model can then be set up to help determine the exact conditions needed to obtain accurate results and refine future tests, he said.

Smith agreed on the broader implications of open source generative modeling. “That whole universe is not on its own; we have a lot to do around equipment,” he stressed. “We want to do this in a way that brings this idea of open source science to life.”

VentureBeat’s mission is to be a digital town square for technical decision makers to gain knowledge about changing enterprise technology and practices.
