The new version of GPT-3 is much better behaved (and should be less toxic)

“This is an important step in the right direction,” says Douwe Kiela, a researcher at Hugging Face, an AI company working on open-source language models. He suggests that the feedback-driven training process could be repeated over many rounds, refining the model even further. Leike says OpenAI could do this by building on feedback from its customers.
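
To make the idea of repeated rounds concrete, here is a minimal sketch of what such a loop could look like. Every name in it, including `collect_feedback` and `fine_tune`, is a hypothetical placeholder invented for this illustration, not an actual OpenAI interface.

```python
from typing import Callable, List, TypeVar

Model = TypeVar("Model")
Feedback = List[str]  # placeholder: e.g. human rankings of model outputs

def refine_in_rounds(
    model: Model,
    collect_feedback: Callable[[Model], Feedback],
    fine_tune: Callable[[Model, Feedback], Model],
    num_rounds: int = 3,
) -> Model:
    """Repeat the feedback-gathering and fine-tuning cycle several times.

    Each round gathers fresh human feedback on the current model and
    fine-tunes on it, so later rounds start from a better-aligned model.
    """
    for _ in range(num_rounds):
        feedback = collect_feedback(model)  # e.g. ratings from customers
        model = fine_tune(model, feedback)  # train the next iteration
    return model
```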

InstructGPT still makes simple mistakes, sometimes producing irrelevant or nonsensical responses. If given a prompt that contains a falsehood, for example, it will take that falsehood as true. And because it has been trained to do what people ask, InstructGPT will produce far more toxic language than GPT-3 if directed to do so.

Ehud Reiter, who works on text-generation AI at the University of Aberdeen in the UK, welcomes any technique that reduces the amount of misinformation language models produce. But he notes that for some applications, such as AI that offers medical advice, no amount of lying is acceptable. Reiter questions whether large language models, based on black-box neural networks, can ever guarantee user safety. For that reason, he favors a combination of neural networks plus symbolic AI, in which hard-coded rules constrain what a model can and cannot say.
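
As a rough illustration of the hybrid Reiter describes, here is a minimal sketch in which a neural model proposes text and hard-coded symbolic rules veto it. The generator and the rule patterns are assumptions made up for this example, not part of any system mentioned here.

```python
import re
from typing import Callable, Iterable

# Hard-coded rules: patterns the system must never emit.
# These example patterns are invented purely for illustration.
BLOCKED_PATTERNS = [
    re.compile(r"\btake \d+ (?:pills|tablets)\b", re.IGNORECASE),   # no dosing advice
    re.compile(r"\bstop taking your medication\b", re.IGNORECASE),  # no dangerous advice
]

def passes_rules(text: str) -> bool:
    """Return True only if no hard-coded rule forbids the text."""
    return not any(pattern.search(text) for pattern in BLOCKED_PATTERNS)

def safe_respond(prompt: str, generate: Callable[[str], Iterable[str]]) -> str:
    """Let the neural model propose candidates; let symbolic rules veto them."""
    for candidate in generate(prompt):
        if passes_rules(candidate):
            return candidate
    # If every candidate breaks a rule, fall back to a fixed safe reply.
    return "I can't answer that; please consult a medical professional."
```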

Whatever the approach, there is still a lot of work to be done. “We’re not even close to solving this problem yet,” says Kiela.
