Title: Aprendizado fracamente supervisionado para rotulagem de imagens de documentos de identificação em dados da JFRN [Weakly supervised learning for labeling images of identification documents in JFRN data]
Author: Silva, Matheus de Andrade
Contributor: Neto Menezes, Elias Jacob de
Dates: 2023-11-24; 2023-11-24; 2023-10-27
Citation: SILVA, Matheus de Andrade. Aprendizado fracamente supervisionado para rotulagem de imagens de documentos de identificação em dados da JFRN. 2023. 50 f. Trabalho de Conclusão de Curso (Especialização em Residência em Tecnologia da Informação) - Instituto Metrópole Digital, Universidade Federal do Rio Grande do Norte, Natal, 2023.
URI: https://repositorio.ufrn.br/handle/123456789/55435

Abstract: This work addresses a common problem in the Federal Court of Rio Grande do Norte, Brazil: verifying that required documents are present in electronic cases. Missing documents create rework for clerks, who must request them. The goal is to propose an artificial intelligence solution to label images of identification documents in cases from the Creta system. We extracted 62,600 images of documents attached to cases. A subset was manually labeled (identity document or not). Pre-trained models (ResNet50 and Vision Transformer) extracted features from the images. Clustering algorithms (KMeans, AffinityPropagation, etc.) grouped the features. Snorkel labeling functions then used the clusters to label all images automatically. The labeling functions achieved an F1 score of 0.89 to 0.90 on the development and test sets. Only about 2% of the images remained unlabeled. The proposed method successfully labeled a large volume of images, enabling the construction of AI services for document identification. The work also presents an efficient approach to automatic image labeling using weakly supervised learning.

License: Attribution-NonCommercial-NoDerivs 3.0 Brazil
License URL: http://creativecommons.org/licenses/by-nc-nd/3.0/br/
Keywords: weak supervision (supervisão fraca); computer vision (visão computacional); feature extraction (extração de características)
Type: bachelorThesis
Subject: CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::METODOLOGIA E TECNICAS DA COMPUTACAO
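
The abstract describes labeling functions that turn cluster assignments into labels for every image. The following is a minimal sketch of that idea, not the thesis's implementation: it uses only the Python standard library, and it replaces Snorkel's label model with a plain majority vote. All names (`make_cluster_lf`, `majority_vote`, `min_purity`) and the toy cluster assignments are hypothetical and chosen for illustration only.

```python
from collections import Counter, defaultdict

# Label conventions (assumed for this sketch): -1 means "abstain".
ABSTAIN, NOT_ID, ID = -1, 0, 1


def make_cluster_lf(cluster_ids, labeled, min_purity=0.8):
    """Build one labeling function from one clustering.

    cluster_ids: cluster id for each image (index = image number).
    labeled: dict mapping image index -> manual label (ID or NOT_ID).
    A cluster receives a label only if enough of its manually labeled
    members agree (min_purity is a hypothetical threshold).
    """
    counts_per_cluster = defaultdict(Counter)
    for idx, label in labeled.items():
        counts_per_cluster[cluster_ids[idx]][label] += 1

    cluster_label = {}
    for cid, counts in counts_per_cluster.items():
        label, n = counts.most_common(1)[0]
        if n / sum(counts.values()) >= min_purity:
            cluster_label[cid] = label

    def lf(idx):
        # Abstain when the image's cluster has no trusted label.
        return cluster_label.get(cluster_ids[idx], ABSTAIN)

    return lf


def majority_vote(lfs, n_images):
    """Combine labeling functions; abstain on ties or when all abstain."""
    result = []
    for idx in range(n_images):
        votes = Counter(v for v in (lf(idx) for lf in lfs) if v != ABSTAIN)
        ranked = votes.most_common()
        if not ranked or (len(ranked) > 1 and ranked[0][1] == ranked[1][1]):
            result.append(ABSTAIN)
        else:
            result.append(ranked[0][0])
    return result


# Toy data: 8 images, two clusterings (e.g. from two different algorithms),
# and 4 manually labeled images.
clusters_a = [0, 0, 0, 1, 1, 1, 2, 2]
clusters_b = [0, 0, 1, 1, 1, 1, 2, 0]
labeled = {0: ID, 1: ID, 3: NOT_ID, 4: NOT_ID}

lfs = [make_cluster_lf(c, labeled) for c in (clusters_a, clusters_b)]
labels = majority_vote(lfs, n_images=8)
print(labels)  # [1, 1, -1, 0, 0, 0, -1, 1]
```

In this toy run, image 2 stays unlabeled because the two clusterings disagree, and image 6 stays unlabeled because neither of its clusters contains a manually labeled image. This mirrors, in a simplified way, how a small fraction of images can remain unlabeled after weak supervision.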