Neuro-symbolic AI could provide machines with common sense


Artificial intelligence research has made great strides in certain applications, but we are still far from the kind of general-purpose AI systems that scientists have been dreaming of for decades.

Among the solutions being explored to overcome the barriers of AI is the idea of neuro-symbolic systems, which bring together the best of different branches of computer science. In a talk at the IBM Neuro-Symbolic AI Workshop, Joshua Tenenbaum, professor of computational cognitive science at the Massachusetts Institute of Technology, explained how neuro-symbolic systems can help address some of the key problems facing current AI systems.

Of the many open problems in AI, Tenenbaum focuses on one in particular: "How do we go beyond the idea of intelligence as recognizing patterns in data and approximating functions, and more toward the idea of all the things the human mind does when you're modeling the world: explaining and understanding the things you see, imagining things you can't see but that could happen, and making them into goals you can achieve by planning actions and solving problems?"

Admittedly, that goal is a long way off, but working toward it begins with exploring one of the most basic aspects of intelligence that humans share with many animals: intuitive physics and psychology.

Intuitive Physics and Psychology

Our brains evolved not only to see patterns in pixels and sound waves but to understand the world through models. As humans, we begin to develop these models as early as three months of age, by observing and acting in the world.

We divide the world into objects and agents, and the interactions between those objects and agents. Agents have their own goals and their own models of the world (which may differ from ours).

For example, multiple studies by researchers Felix Warneken and Michael Tomasello have shown that children develop abstract ideas about the physical world and about other people, and apply them to novel situations. In one of these experiments, through observation alone, a child realizes that the person holding a stack of objects has a goal in mind and needs help opening a closet door.

These abilities are often referred to as “intuitive physics” and “intuitive psychology” or “theory of mind” and are at the center of common sense.

"These systems develop very early in the brain's architecture, which is to some extent shared with other species," says Tenenbaum. These cognitive systems are the bridge to all the other parts of intelligence, such as the targets of perception, the substrate for action planning, reasoning, and language.

AI agents should be able to reason and plan their actions based on intuitive physics and on the mental representations they develop of the world and of other agents through theory of mind.

Neuro-symbolic architecture

Tenenbaum lists three components needed to give AI this core of intuitive physics and psychology.

"We emphasize a three-way interaction between neural, symbolic, and probabilistic modeling and inference," says Tenenbaum. "We think it's that three-way combination that is needed to capture human-like intelligence and core common sense."

The symbolic component is used to represent and reason with abstract knowledge. The probabilistic inference model helps establish causal relations between different entities, reason about counterfactuals and unseen scenarios, and deal with uncertainty. And the neural component uses pattern recognition to map real-world sensory data to knowledge and to help navigate search spaces.
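To make the three-way combination concrete, here is a deliberately tiny sketch, with all names and numbers invented for illustration: the symbolic layer is a discrete set of scene hypotheses, a hand-tuned scoring function stands in for the learned neural likelihood, and Bayes' rule supplies the probabilistic inference that ties them together.

```python
import math

# Symbolic layer: discrete hypotheses about a scene.
HYPOTHESES = ["stack_stable", "stack_falls"]

# Stand-in for the neural layer: a fixed scoring function mapping a raw
# feature (the stack's center-of-mass offset) to a likelihood. In a real
# system this would be a learned network.
def likelihood(hypothesis: str, com_offset: float) -> float:
    # Larger offsets make collapse more likely (a hand-tuned sigmoid).
    p_falls = 1.0 / (1.0 + math.exp(-10.0 * (com_offset - 0.5)))
    return p_falls if hypothesis == "stack_falls" else 1.0 - p_falls

# Probabilistic layer: Bayesian update over the symbolic hypotheses.
def posterior(com_offset: float, prior=None) -> dict:
    prior = prior or {h: 0.5 for h in HYPOTHESES}
    unnorm = {h: prior[h] * likelihood(h, com_offset) for h in HYPOTHESES}
    z = sum(unnorm.values())
    return {h: v / z for h, v in unnorm.items()}

print(posterior(0.9))  # heavily offset stack: "stack_falls" dominates
```

The point is the division of labor: the symbols define what can be inferred, the scorer grounds them in raw features, and the probabilistic update handles uncertainty between them.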

"We use the power of symbolic languages for knowledge representation and reasoning, as well as neural networks and the things they are good at, but also probabilistic inference, especially Bayesian inference in a causal model, to reason backward from the things we observe to the things we want to infer, such as the underlying physics of the world or the mental states of agents," says Tenenbaum.

Game engine in the head

One of the key components of Tenenbaum's neuro-symbolic AI concept is a physics simulator that helps predict the outcome of actions. Physics simulators are fairly common in several branches of AI, such as game engines, reinforcement learning, and robotics.

But unlike other branches of AI that use simulators to train agents and then transfer that learning to the real world, Tenenbaum's idea is to integrate the simulator into the agent's own process of inference and reasoning.

"That's why we call it the 'game engine in the head,'" he says.

The PyBullet rigid-body physics simulator. Physics simulators enable AI agents to predict outcomes in the real world.

The physics engine helps the AI simulate the world in real time and predict what will happen next. The simulation only needs to be reasonably accurate and to help the agent choose a promising course of action. This is similar to how the human mind works. When we look at an image, such as a stack of blocks, we get a rough idea of whether it will stay upright or collapse. Or if we see a bunch of blocks on a table and are asked what would happen if the table were suddenly struck, we can roughly guess which blocks will fall.

We cannot predict the exact trajectory of each object, but we do develop a high-level idea of the outcome. Combined with a symbolic inference system, the simulator can be configured to test many possible simulations at a very fast rate.
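This kind of rough, fast judgment can be sketched with a toy "simulator in the head": run many noisy forward simulations and report the fraction in which a block tower topples. The physics below is deliberately crude and entirely invented for illustration; a tower is judged to topple when the accumulated horizontal offset of its center of mass leaves the base.

```python
import random

def tower_topples(block_offsets, base_half_width=0.5, noise=0.05):
    """One forward simulation with perceptual noise on each block offset."""
    com = 0.0
    for i, off in enumerate(block_offsets, start=1):
        # Crude running center-of-mass estimate, perturbed by noise.
        com += (off + random.gauss(0.0, noise)) / i
        if abs(com) > base_half_width:
            return True  # center of mass left the base: collapse
    return False

def p_topple(block_offsets, n_sims=2000, seed=0):
    """Monte Carlo estimate of collapse probability over noisy simulations."""
    random.seed(seed)
    falls = sum(tower_topples(block_offsets) for _ in range(n_sims))
    return falls / n_sims

print(p_topple([0.1, 0.1, 0.1]))  # well-centered tower: low probability
print(p_topple([0.4, 0.5, 0.6]))  # heavily offset tower: high probability
```

Like the mental simulation described above, the answer is a graded, approximate judgment ("probably falls") rather than an exact trajectory for each block.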

Approximate 3D scenes

While simulators are a great tool, one of their biggest challenges is that we do not perceive the world directly in terms of three-dimensional objects. A neuro-symbolic system must detect the position and orientation of the objects in a scene to create an approximate 3D representation of the world.

There have been many attempts to use pure deep learning for object position and pose detection, but their accuracy is low. In a joint project, MIT and IBM created "3D Scene Perception via Probabilistic Programming" (3DP3), a system that addresses many of the errors pure deep learning systems suffer from.

3DP3 takes an image and tries to explain it through 3D volumes that capture each object. It feeds the objects into a symbolic scene graph that specifies the contact and support relationships between them. It then tries to reconcile its reconstruction with the observed image and depth map.

3D Scene Perception via Probabilistic Programming (3DP3) uses neural networks, symbolic inference, and probabilistic models to create 3D representations of images (source: arXiv).
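The reconcile-with-observations loop can be sketched as a toy analysis-by-synthesis search (this is not the actual 3DP3 code; the 1-D "depth map" and all parameters are invented for illustration): propose an object pose, render what the sensor should see, and keep the pose whose rendering best matches the observation.

```python
def render_depth(pos, width=10, obj_size=3, obj_depth=2.0, bg_depth=5.0):
    """Render a 1-D 'depth map' for an object at integer position pos."""
    return [obj_depth if pos <= x < pos + obj_size else bg_depth
            for x in range(width)]

def mismatch(a, b):
    """Sum-of-squares error between two depth maps."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def infer_pos(observed, width=10, obj_size=3):
    """Search candidate poses; return the one minimizing render error."""
    candidates = range(width - obj_size + 1)
    return min(candidates, key=lambda p: mismatch(render_depth(p), observed))

observed = render_depth(4)  # ground-truth object at position 4
print(infer_pos(observed))  # inference recovers position 4
```

3DP3 replaces each toy piece with a serious counterpart: a real renderer, probabilistic likelihoods instead of squared error, and neural networks to propose promising candidate poses rather than brute-force enumeration.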

Thinking about solutions

Once a neuro-symbolic agent has a physics engine to model the world, it should be able to develop concepts that enable it to act in novel ways.

For example, people (and sometimes animals) can learn to use a new tool to solve a problem, or figure out how to repurpose a known object for a new goal (e.g., use a rock instead of a hammer to drive in a nail).

For this, Tenenbaum and his colleagues developed a simulated physics environment in which people must use objects to solve problems in novel ways. The same environment was used to train AI models to develop abstract concepts about using objects.

Human and animal tool use in a physics simulator. Humans and animals can instinctively invent new ways to use tools (source: PNAS).

"The key is to develop high-level strategies that can be adapted to new situations. This is where the symbolic approach becomes key," says Tenenbaum.

For example, people can form abstract concepts such as "hammer" and "catapult" and use them to solve a variety of problems.

"People can create these abstract concepts and transfer them to near and far situations. We can model this through a program that symbolically describes these concepts," says Tenenbaum.

In one of his projects, Tenenbaum's team built an AI system that can analyze a scene and use a probabilistic model to produce a step-by-step set of symbolic actions for solving physics problems. For example, to launch an object placed on a board, the system worked out that it had to find a larger object, place it high above the opposite end of the board, and release it to create a catapult effect.

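In the same spirit as the catapult example, symbolic plan search can be sketched as enumeration over abstract actions. Everything here, including the action names, the object masses, and the toy "launch" test, is a hypothetical illustration, not the system's actual representation.

```python
from itertools import product

OBJECTS = {"small_ball": 1.0, "big_block": 5.0}  # candidate tools and masses
TARGET_MASS = 1.0  # mass of the object sitting on the board

def launches(dropped, drop_height):
    """Crude check: a drop launches the target if it carries enough energy."""
    return OBJECTS[dropped] * drop_height > 3.0 * TARGET_MASS

def find_plan():
    """Enumerate symbolic plans: which object to drop, and from how high."""
    for obj, height in product(OBJECTS, [1.0, 2.0]):
        if launches(obj, height):
            return [f"pick_up({obj})", f"raise_to({height})", f"release({obj})"]
    return None  # no workable catapult plan found

print(find_plan())
```

Because the plan is a sequence of symbols rather than raw motor commands, the same "catapult" strategy can be re-instantiated with different objects and heights in new scenes, which is exactly the kind of transfer the quote above describes.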

Physically grounded language

So far, we have talked a lot about symbols and concepts without mentioning language. Tenenbaum explained in his talk that language is deeply rooted in the unspoken common-sense knowledge that we acquire before we learn to speak.

Intuitive physics and theory of mind are missing from current natural language processing systems. Large language models, the currently popular approach to natural language processing and understanding, try to capture statistical patterns between word sequences by examining very large corpora of text. This method has produced impressive results, but it has limitations when it comes to things that are not reflected in the statistical regularities of words and sentences.

"There have been tremendous advances in large language models, but because they lack grounding in physics and theory of mind, in some ways they are quite limited," says Tenenbaum. "And you can see this in their limits in understanding symbolic scenes. They also lack an understanding of physics. Verbs, for example, often refer to causal structure."

Common sense building blocks

To date, many successful approaches to neuro-symbolic AI provide the models with prior knowledge of intuitive physics, such as dimensional consistency and translation invariance. One of the major remaining challenges is how to design AI systems that learn these intuitive physics concepts the way children do. The learning space of physics engines is much more complex than the weight space of traditional neural networks, which means we still need to find new learning techniques.

Tenenbaum also discusses ways to develop the building blocks of human knowledge in a paper entitled "The Child as a Hacker." In the paper, Tenenbaum and his co-authors use programming as an example of how humans explore solutions across different dimensions such as accuracy, efficiency, usability, and modularity. They also discuss how humans gather pieces of information, develop them into new symbols and concepts, and then learn to combine them to form new concepts. These research directions may help crack the code of common sense in neuro-symbolic AI.
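That "hacker"-style exploration can be sketched as search over a tiny domain-specific language (the primitives and the target function here are invented for illustration, not taken from the paper): enumerate short programs built from a few known operations and keep the shortest one that explains the observed examples.

```python
from itertools import product

# A tiny DSL of known primitives, the agent's current "concepts."
PRIMITIVES = {
    "double": lambda x: x * 2,
    "inc":    lambda x: x + 1,
    "square": lambda x: x * x,
}

def run(program, x):
    """Execute a program, i.e. a sequence of primitive names."""
    for op in program:
        x = PRIMITIVES[op](x)
    return x

def search(examples, max_len=3):
    """Return the shortest program consistent with all (input, output) pairs."""
    for length in range(1, max_len + 1):
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, i) == o for i, o in examples):
                return list(program)
    return None

# Data generated by f(x) = 2x + 1; search recovers "double then inc".
print(search([(1, 3), (2, 5), (5, 11)]))
```

A program found this way can itself be named and added to the primitive set, which is one simple way to model how learners compress recurring solutions into new reusable concepts.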

"We want to provide a roadmap for how to achieve the vision of what makes human common sense distinctive and powerful from its beginnings," says Tenenbaum. "In a sense, it is one of AI's oldest dreams, going back to Alan Turing's original proposal of intelligence as computation and the idea that we could build a machine that achieves human-level intelligence by starting as a baby and teaching it as a child. This has been an inspiration for many of us, and we are trying to come up with the building blocks for it."

Ben Dickson is a software engineer and the founder of TechTalks. He writes about technology, business, and politics.

This story originally appeared on TechTalks. Copyright 2022.

VentureBeat's mission is to be a digital town square for technical decision-makers to gain knowledge about transformative enterprise technology and practices. Learn more
