An approach using hierarchical reinforcement learning and parallel computing for the K-Server problem

Doria Neto, Adrião Duarte
Costa, Mademerson Leandro da
2017-10-24
2017-10-24
2017-06-09
COSTA, Mademerson Leandro da. Uma abordagem utilizando aprendizagem por reforço hierárquica e computação paralela para o problema dos K-Servos. 2017. 95f. Tese (Doutorado em Ciência e Engenharia de Petróleo) - Centro de Ciências Exatas e da Terra, Universidade Federal do Rio Grande do Norte, Natal, 2017.
https://repositorio.ufrn.br/jspui/handle/123456789/24149

A metrical task system is an abstract model for a class of online optimization problems that includes paging, list accessing, oil industry problems such as the management of workover rigs and the logistics of offshore oil production, and the K-Server problem, among others. Reinforcement learning has proved effective for these problems, but its use is restricted to a simple class of problems because of the curse of dimensionality inherent in the method. This work presents a solution that combines reinforcement learning with hierarchical decomposition techniques and parallel computing to solve optimization problems in metric spaces. These techniques extended the applicability of the method to more complex problems, removing the restriction to smaller ones. The size of the storage structure that reinforcement learning uses to obtain the optimal policy grows with the number of states and actions, which in turn depends on the number n of nodes and the number k of servers, so the growth is exponential in k ($\binom{n}{k} \approx O(n^k)$). To circumvent this, the problem was modeled as a multi-step decision process in which the k-means algorithm was first used as a clustering method to decompose the problem into smaller subproblems. The Q-learning algorithm was then applied within the subgroups to obtain the best server displacement policy. In this step, the learning and storage processes in the subgroups were executed in parallel.
In this way, the problem dimension and the total execution time of the algorithm were reduced, making it possible to apply the proposed method to large instances. The proposed approach produced better results than classical reinforcement learning and the greedy method, and it achieved speedup and efficiency gains in the evaluation of parallel performance metrics.

Keywords: Metrical Task Systems, The K-Server Problem, Curse of Dimensionality, Hierarchical Reinforcement Learning, Q-Learning Algorithm, Parallel Computing.

Open Access
Hierarchical reinforcement learning
Optimization problems in metric spaces
Parallel computing
doctoralThesis
CNPQ::ENGENHARIAS::ENGENHARIA QUIMICA::TECNOLOGIA QUIMICA::PETROLEO E PETROQUIMICA
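
As a rough illustration of two points in the abstract, the growth of the number of server configurations and the Q-learning step applied within a single subgroup, the following Python sketch may help. It is not the thesis implementation: the function names, the uniform request model, the hyperparameters, and the toy line metric are all assumptions made for illustration, and the k-means decomposition and the parallel execution are left out.

```python
import math
import random


def num_configurations(n, k):
    """Ways to place k indistinguishable servers on n distinct nodes."""
    return math.comb(n, k)


def q_learning_subproblem(dist, k, episodes=500, steps=20,
                          alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning for one subgroup (illustrative sketch, requires k < n).

    State  : (sorted server positions, requested node)
    Action : index of the server that moves to the requested node
    Reward : negative travel distance of the moved server
    """
    rng = random.Random(seed)
    n = len(dist)
    q = {}  # state -> list of k action values

    def new_request(config):
        # Assumed request model: uniform over nodes that hold no server.
        return rng.choice([v for v in range(n) if v not in config])

    for _ in range(episodes):
        config = tuple(sorted(rng.sample(range(n), k)))
        request = new_request(config)
        for _ in range(steps):
            values = q.setdefault((config, request), [0.0] * k)
            # Epsilon-greedy choice of which server to move.
            if rng.random() < epsilon:
                action = rng.randrange(k)
            else:
                action = max(range(k), key=values.__getitem__)
            reward = -dist[config[action]][request]
            # Move the chosen server and draw the next request.
            moved = list(config)
            moved[action] = request
            config = tuple(sorted(moved))
            request = new_request(config)
            next_values = q.setdefault((config, request), [0.0] * k)
            # Standard one-step Q-learning update.
            values[action] += alpha * (
                reward + gamma * max(next_values) - values[action]
            )
    return q


# Growth of the configuration count with n and k.
small = num_configurations(10, 2)
large = num_configurations(50, 5)

# Toy subgroup: 6 nodes on a line, distance |i - j|, 2 servers.
nodes = range(6)
dist = [[abs(i - j) for j in nodes] for i in nodes]
q_table = q_learning_subproblem(dist, k=2)
```

Because each subgroup keeps its own table in this sketch, calls for different subgroups share no state, which is what would allow them to be dispatched to separate workers.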