But few people had enough mastery of the language to transcribe the audio manually. Inspired by voice assistants like Siri, Mahelona began to look into natural-language processing. “It became imperative to teach computers to speak Māori,” says Jones.
But Te Hiku faced a chicken-and-egg problem. To build a te reo speech recognition model, it needed abundant transcribed audio. To transcribe the audio, it needed advanced speakers, the very people whose small numbers it was trying to compensate for in the first place. There were, however, plenty of beginner and intermediate speakers who could read te reo words aloud better than they could recognize them in a recording.
So Jones and Mahelona, along with Te Hiku COO Suzanne Duncan, devised a clever solution: Instead of transcribing existing audio, they would ask people to record themselves reading a series of sentences designed to capture the full range of sounds in the language. To the algorithm, the resulting data set would serve the same function. From those thousands of pairs of spoken and written sentences, it would learn to recognize te reo syllables in audio.
The team announced a competition. Jones, Mahelona, and Duncan approached every Māori community group they could find, including traditional kapa haka dance troupes and waka ama canoe-racing teams, and announced that whoever submitted the most recordings would win a grand prize of $5,000.
The whole community came together. The competition heated up. One member of the Māori community, Te Mihinga Komene, a teacher and advocate for using digital technologies to revitalize te reo, recorded 4,000 phrases alone.
Money was not the only motivator. People bought into Te Hiku’s vision and trusted it to protect their data. “Te Hiku Media said, ‘Whatever you give us, we are here as kaitiaki [guardians]. We take care of it, but you still have your audio,’” says Te Mihinga. “That’s important. Those values define who we are as Māori.”
Within 10 days, Te Hiku collected 310 hours of speech-text pairs from nearly 200,000 recordings made by approximately 2,500 people, a level of engagement unheard of among researchers in the AI community. “No one could have done this except a Māori organization,” says Caleb Moses, a Māori data scientist who joined the project after learning about it on social media.
The amount of data was still small compared with the thousands of hours typically used to train English language models, but it was enough to get started. Using the data to bootstrap an existing open-source model from the Mozilla Foundation, Te Hiku created its first te reo speech recognition model, with 86% accuracy.