Lins, Hertz Wilton de Castro
Medeiros, Gabriel Santos de
2022-08-09
2022-08-09
2022-07-14
MEDEIROS, Gabriel Santos de. Um estudo de caso de coleta e pré-processamento de dados na aplicação de processamento de linguagem natural. 2022. 41f. Trabalho de Conclusão de Curso (Graduação em Engenharia de Telecomunicações) - Centro de Tecnologia, Universidade Federal do Rio Grande do Norte, Natal, 2022
https://repositorio.ufrn.br/handle/123456789/49109
This work belongs to the field of text mining, which has applications across many areas and considerable room to grow as new technologies improve automated natural language processing. In telecommunications, however, few such studies have been carried out. With this in mind, the work aims to show how data are collected and pre-processed for Natural Language Processing (NLP) applications, and to offer an introductory analysis of the information obtained from one selected journal. To this end, a case study details how the database was generated by web crawling the chosen scientific journal and which data treatments are needed to prepare this information for text mining, all implemented in Python.
As a result, this processing produced data that supported a preliminary analysis of the articles of the International Journal of Interactive Mobile Technologies (iJIM), which in turn pointed to further possibilities for text mining.
Attribution-NonCommercial-NoDerivs 3.0 Brazil
http://creativecommons.org/licenses/by-nc-nd/3.0/br/
Mineração de texto; Web crawling; Python; Processamento de linguagem natural; Inteligência artificial; Aprendizado de máquina
Text mining; Web crawling; Python; Data pre-processing; Natural language processing; Artificial intelligence; Machine learning
Um estudo de caso de coleta e pré-processamento de dados na aplicação de processamento de linguagem natural
A case study of data collection and pre-processing for natural language processing application
bachelorThesis
CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::METODOLOGIA E TECNICAS DA COMPUTACAO::LINGUAGENS DE PROGRAMACAO
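The abstract describes crawling journal pages and then treating the text so it can be mined. The record does not say which libraries or steps the thesis used, so the following is only a minimal sketch of that kind of pipeline, written with the Python standard library. The sample HTML, the stop-word list, and the names `TextExtractor`, `html_to_text`, and `preprocess` are all hypothetical and are not taken from the thesis.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "for", "to", "is", "on"}


class TextExtractor(HTMLParser):
    """Collects the text of an HTML page, skipping script and style content."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Strip markup from a crawled page and return its visible text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def preprocess(text):
    """Lowercase, keep alphabetic tokens, and drop stop words and single letters."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS and len(t) > 1]


# Made-up page standing in for one crawled article (not real iJIM data).
sample_html = (
    "<html><body><h1>Mobile Learning in the Classroom</h1>"
    "<p>A study of mobile learning and mobile applications.</p>"
    "<script>var x = 1;</script></body></html>"
)

tokens = preprocess(html_to_text(sample_html))
term_counts = Counter(tokens)
print(term_counts.most_common(3))
```

A real crawl would fetch each article page over HTTP and would likely use a fuller stop-word list and further steps such as stemming. Those are left out here because the record does not describe them.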