A methodology to create large speech datasets for low-resource languages, and improving social equity by accent conversion for PT-BR

Advisor: Abreu, Marjory Cristiany da Costa
Author: Lima, Thales Aguiar de
Dates: 2023-06-19; 2023-06-19; 2022-12-16
Citation: LIMA, Thales Aguiar de. A methodology to create large speech datasets for low-resource languages, and improving social equity by accent conversion for PT-BR. Advisor: Márjory Cristiany da Costa Abreu. 2022. 86 leaves. Thesis (Doctorate in Computer Science) - Centro de Ciências Exatas e da Terra, Universidade Federal do Rio Grande do Norte, Natal, 2022.
URI: https://repositorio.ufrn.br/handle/123456789/52764

Abstract: Speech is a crucial part of how we communicate as a species, and with the rise of voice-based instant messaging and automated chatbots its importance has grown further. While most speech technologies have achieved high accuracy, they fail when tested on accents that deviate from the “standard” of a language. This is more concerning for languages that lack datasets and have scarce literature, such as Brazilian Portuguese. In a parallel development, artificial intelligence (AI)-based tools are increasingly present in people’s lives, even if not always noticeable. The use of a “standard accent”, combined with the advancement of AI in speech systems and the lack of resources for PT-BR, inspired the three objectives of this work. The first is to explore new approaches to Accent Conversion for this language by adapting existing models to convert from Paulistano to Nordestino. The second is to provide an acoustic analysis of Brazilian Portuguese accents, covering a wide area of the national territory, finding and formalising possible differences between them. The third is to collect and release a speech dataset for Brazilian Portuguese, using a method that exploits the data and information available on video platforms by automatically downloading videos from TEDx Talks. Those short presentations are a source of reliable and clean audio with human and automatically generated transcriptions.

Access: Open Access
Keywords: Computing; Voice biometrics; Accent inclusion; Brazilian Portuguese; Corpus; Database; Machine learning
Original title (Portuguese): Uma metodologia para criação de grandes bases de voz para linguagens com recursos escassos, e inclusão social por conversão de sotaques para PT-BR
Type: doctoralThesis
Subject area: CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::SISTEMAS DE COMPUTACAO