Oliveira, Luiz Affonso Henderson Guedes de
Nunes, Yuri Thomas Pinheiro
2024-08-22
2024-08-22
2024-05-21
NUNES, Yuri Thomas Pinheiro. Detecção heurística de Concept Drift baseado em TEDA. Orientador: Dr. Luiz Affonso Guedes. 2024. 96f. Tese (Doutorado em Engenharia Elétrica e de Computação) - Centro de Tecnologia, Universidade Federal do Rio Grande do Norte, Natal, 2024.
https://repositorio.ufrn.br/handle/123456789/59806
The non-stationary dynamics of data production manifest as seasonality and trends, characteristics that make applying machine learning difficult. This phenomenon can be represented as a data stream: an ordered and unbounded source of non-stationary data. Data streams are often used to represent evolving systems, and their non-stationarity is attributed to concept drifts. In this context, machine learning techniques must be adapted for processing data streams: it is necessary to consider real-time retraining, response to concept drift, partial data availability, and memory limitations, among others. To address such issues, it is essential to use concept drift detectors (CDD) to enable model adaptation. The literature is rich in works on detecting concept drift, which fall into three groups according to the availability of true labels: supervised, semi-supervised, and unsupervised. It is possible to argue that unsupervised methods allow for shorter detection delays in real applications by performing detections at prediction time, before feedback. This work presents a new concept drift detection method, TEDA-CDD. The detector comprises two TEDA-based models that represent concepts: the reference model and the dynamic model. The reference model aims to define the concept known by the machine learning model, while the dynamic model is free to adapt to any new concept that emerges from the data stream. The models are compared heuristically using the Jaccard index as a similarity measure.
When the index indicates low similarity, the detector signals a concept drift. To compare the proposed method with other methods in the literature, a realistic approach for data stream models is first proposed. This approach makes it possible to apply several classifiers and detectors to the data stream classification task and to estimate performance metrics specific to the data stream context. In the experiments, the proposed method is compared with other methods from the literature on synthetic and real benchmarks. The proposed method achieves accuracy comparable to that of methods consolidated in the literature while being the most efficient in terms of memory consumption.
Open Access
Unsupervised learning
Data Stream
TEDA
Data stream classification
Concept drift detector
Detecção heurística de Concept Drift baseado em TEDA
doctoralThesis
CNPQ::ENGENHARIAS::ENGENHARIA ELETRICA