Data2vec is part of a larger trend in AI toward models that can learn to understand the world in more than one way. “It’s a clever idea,” says Ani Kembhavi at the Allen Institute for AI in Seattle, who works on vision and language. “It’s a promising advance when it comes to generalized learning systems.”
An important caveat is that although the same learning algorithm can be used for different skills, it can learn only one skill at a time. Once it has learned to recognize images, it must start from scratch to learn to recognize speech. Giving an AI multiple skills at once is hard, but it is something the Meta AI team wants to tackle next.
The researchers were surprised to find that their approach actually performed better than existing techniques at recognizing images and speech, and performed as well as leading language models at understanding text.
Mark Zuckerberg is already dreaming up potential metaverse applications. “This will all eventually get built into AR glasses with an AI assistant,” he posted to Facebook today. “It could help you cook dinner, noticing if you miss an ingredient, prompting you to turn down the heat, or helping with more complex tasks.”
For Auli, the main takeaway is that researchers should come out of their silos. “Hey, you don’t need to focus on just one thing,” he says. “If you have a good idea, it might actually help across the board.”