Norwegian speech to text at the National Library

The National Library of Norway wants to make it easier for companies to develop Norwegian speech recognition tools. Doing so requires large datasets of transcribed speech. Schjønhaug AS developed a transcription tool tailored to the National Library’s needs.

Text: Bendik Agersborg Eriksen. Translation: Maria Vole

For several years, Schjønhaug AS has worked on the transcription tool Benevis, which converts audio files into text. The tool has been used by NRK and has made life easier for many journalists who need to transcribe interviews and other recordings as part of their daily work.

“I gave a lecture on automatic transcription of Norwegian speech into text at the University of Oslo, and in the audience was someone from the National Library who took an interest in our work,” says Andreas Schjønhaug, general manager of Schjønhaug AS.

A language policy mission

The National Library of Norway has started a speech recognition project on its own initiative, with the goal of building large archives of transcribed speech. The idea is to support the development of good Norwegian speech recognition tools by providing resources that can be used by companies that choose to develop such technology further.

“Norwegian is a small language, and datasets of transcribed speech are expensive to develop. Therefore, there is a danger that companies will not be able to afford to develop good speech recognition tools for the Norwegian market. To help companies invest in the Norwegian language, we have chosen to create these datasets ourselves and make them freely available in Språkbanken (the Language Bank),” says Per Erik Solberg, language technologist at the National Library.

In previous projects, staff at the National Library had to transcribe all audio files by hand. With the tool developed by Schjønhaug AS, they instead receive an automatically generated transcript, and the language technologists only have to edit the text and correct errors.

“An automatic transcription is not perfect, but with this solution, our work mainly consists of improving the text. This means that we save a lot of time on manual work,” says Solberg.

A complicated task

The tool runs audio files through a speech recognition program from Google and then presents the resulting text in a word processing interface, where National Library staff can edit it and correct errors.

“We stated our needs to Schjønhaug and have been given a tool tailored to our requests. We needed a user-friendly tool that made it easy to edit the text after the audio file was automatically transcribed.”

— Per Erik Solberg

Schjønhaug AS’s solution has a complicated job because it works with natural speech recorded in everyday settings. The tool must therefore interpret many different voices, dialects and speaking volumes.

“This is a complicated task since we have audio files from discussions, regular speeches and lectures. Therefore, the tool needs to be able to adapt to different tempos and ways of speaking,” says Solberg.

Freely available

This project mainly aims to strengthen the Norwegian language and ensure that companies can develop good language tools in Norwegian. To that end, both Schjønhaug AS and the National Library have made it easy for others to use the resources from this project.

“Our datasets of transcribed speech and Schjønhaug AS’s tools will be freely available in Språkbanken so that these resources can be used by everyone,” says Solberg.

In a project as large and comprehensive as this, it is important that the communication between the stakeholders works well. Solberg says that Schjønhaug AS has been easy to work with and that the tool has been changed and improved in response to their feedback.

“Andreas Schjønhaug is very likeable and his team was quick to respond if we needed help or changes in the tool,” Solberg concludes.