HARS1DE: hardware architecture for processing 1D CNNs at the edge

Advisor: Kreutz, Márcio Eduardo
Author: Guimarães, Mailson Rodrigues de Medeiros
Record dates: 2025-07-01, 2025-07-01, 2025-01-31
Citation: GUIMARÃES, Mailson Rodrigues de Medeiros. HARS1DE: arquitetura de hardware para processamento de CNNs1D na borda. Advisor: Dr. Márcio Eduardo Kreutz. 2025. 105 pp. Dissertation (Master's in Systems and Computing) - Centro de Ciências Exatas e da Terra, Universidade Federal do Rio Grande do Norte, Natal, 2025.
URI: https://repositorio.ufrn.br/handle/123456789/64088

Abstract: There is a trend toward the cloud computing paradigm, in which resources, storage, and information processing are handled in so-called "clouds" managed by providers. This paradigm is used, for instance, to apply machine learning algorithms to large volumes of data. The edge computing paradigm, by contrast, moves this processing load to elements closer to where the data is generated (at the network edge). Technology companies have been investing more in this type of computing and its techniques, as it can offer advantages such as lower processing latency, lower energy consumption, and reduced reliance on cloud resources that may not always be available. As in cloud computing, predictive machine learning models can be applied at the edge, where hardware architectures dedicated to accelerating them can be employed. The main objective of this work is therefore to implement, test, and validate a hardware architecture capable of accelerating 1D-CNN inference, including pooling, activation, and dense layers, and to analyze its performance, accuracy, and hardware resource utilization. Two representations of the architecture were developed to obtain the results: one in VHDL, synthesized for FPGA to obtain hardware resource allocation and timing figures, and another in Python, a high-level language, to obtain results more quickly on the architecture’s behavior during longer runs, such as the computation of an entire neural network.
Tests were conducted on three variations of the proposed architecture. The results were obtained by applying the architecture to a remote sensing task, specifically pixel classification in hyperspectral images. The neural network used was a simplified version of networks from previous works, which made it easier to port to hardware. In addition to being reconfigurable in the FPGA sense, the resulting architecture adapts its behavior to the type of neural network layer being processed. Theoretical results indicate a peak performance of 14.4 GOP/s for the best architecture variation, and maximum speedups of 4.52× over an AMD EPYC 7B12 processor, 8.36× over an NVIDIA T4, and 3.39× over an AMD Ryzen 7 7800X3D. These results were obtained when classifying one of the hyperspectral images, and the best architecture variation kept FPGA resource usage below 80%.

Language: pt-BR
Access: Open access
Keywords: Edge computing; Hardware architecture; Machine learning; Convolutional neural networks; Remote sensing
English title: HARS1DE: reconfigurable and scalable hardware accelerator for CNNs-1D in edge computing
Type: masterThesis
Subject area: CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::SISTEMAS DE COMPUTACAO
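
Illustrative sketch: the abstract describes 1D-CNN inference built from convolution, activation, pooling, and dense layers, applied to classify a pixel from its spectrum. The pure-Python example below shows what such a pipeline computes for a single pixel. It is not the dissertation's Python model: every function name, layer size, and weight here (`conv1d`, `relu`, `max_pool1d`, `dense`, `classify_pixel`, `EXAMPLE_PARAMS`, `EXAMPLE_SPECTRUM`) is a hypothetical choice made only for illustration.

```python
# Hypothetical minimal 1D-CNN inference for one pixel spectrum.
# All names, shapes, and weights are illustrative assumptions,
# not values taken from the HARS1DE dissertation.


def conv1d(channels, kernels, biases):
    """Valid, stride-1 1D convolution (cross-correlation form).

    channels: [C_in][L], kernels: [C_out][C_in][K], biases: [C_out].
    Returns feature maps of shape [C_out][L - K + 1].
    """
    length = len(channels[0])
    out = []
    for kernel, bias in zip(kernels, biases):
        k = len(kernel[0])
        row = []
        for start in range(length - k + 1):
            acc = bias
            # Multiply-accumulate over every input channel and kernel tap.
            for channel, taps in zip(channels, kernel):
                for j, w in enumerate(taps):
                    acc += w * channel[start + j]
            row.append(acc)
        out.append(row)
    return out


def relu(feature_maps):
    """Element-wise ReLU activation."""
    return [[max(0.0, v) for v in row] for row in feature_maps]


def max_pool1d(feature_maps, size):
    """Non-overlapping max pooling with window and stride equal to `size`."""
    return [
        [max(row[i:i + size]) for i in range(0, len(row) - size + 1, size)]
        for row in feature_maps
    ]


def dense(vector, weights, biases):
    """Fully connected layer: one dot product plus bias per output neuron."""
    return [
        sum(w * x for w, x in zip(row, vector)) + b
        for row, b in zip(weights, biases)
    ]


def classify_pixel(spectrum, params):
    """Run conv -> ReLU -> max pool -> dense and return the best class index."""
    x = conv1d([spectrum], params["conv_w"], params["conv_b"])
    x = relu(x)
    x = max_pool1d(x, params["pool"])
    flat = [v for row in x for v in row]
    scores = dense(flat, params["dense_w"], params["dense_b"])
    return max(range(len(scores)), key=scores.__getitem__)


# Toy parameters: one 3-tap kernel, pooling window 2, two output classes.
EXAMPLE_PARAMS = {
    "conv_w": [[[1.0, 0.0, -1.0]]],
    "conv_b": [0.0],
    "pool": 2,
    "dense_w": [[1.0, 0.0], [0.0, 1.0]],
    "dense_b": [0.5, 0.0],
}
EXAMPLE_SPECTRUM = [1.0, 2.0, 4.0, 3.0, 0.0, 5.0]

print(classify_pixel(EXAMPLE_SPECTRUM, EXAMPLE_PARAMS))  # prints 1
```

Classifying a full hyperspectral image under this sketch would amount to calling `classify_pixel` once per pixel. A hardware accelerator would typically parallelize the multiply-accumulate loops inside `conv1d` and `dense`, though how HARS1DE does so is not stated in the abstract.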