Doria Neto, Adrião Duarte
Terrematte, Patrick Cesar Alves
2022-06-23
2022-06-23
2022-05-13
TERREMATTE, Patrick Cesar Alves. Uma nova assinatura de 13 genes via aprendizagem de máquina para predição de sobrevida de pacientes com carcinoma renal de células clara. 2022. 72 leaves. Thesis (Doctorate in Bioinformatics) - Instituto Metrópole Digital, Universidade Federal do Rio Grande do Norte, Natal, 2022.
https://repositorio.ufrn.br/handle/123456789/48273
Patients with metastatic renal cancer have a 5-year survival rate of 12%, according to American Cancer Society data from 2009 to 2015. It is of paramount importance to identify biomarkers in genomic data that could help predict the aggressiveness of clear cell renal cell carcinoma (ccRCC), the most frequent renal cancer subtype. We therefore conducted a study to evaluate existing gene signatures and to propose a novel one with higher predictive power and better generalization than the former signatures. Using the ccRCC cohorts of The Cancer Genome Atlas (TCGA-KIRC) and the International Cancer Genome Consortium (ICGC-RECA), we evaluated linear Cox regression survival models with 14 signatures and six feature selection methods, and performed functional analysis and differential gene expression analysis. In this study, we established a 13-gene signature (AR, AL353637.1, DPP6, FOXJ1, GNB3, HHLA2, IL4, LIMCH1, LINC01732, OTX1, SAA1, SEMA3G, ZIC2) whose expression levels are able to predict distinct outcomes of patients with ccRCC. Moreover, we compared our signature with others from the literature. The best-performing gene signature was obtained using the ensemble method Min-Redundancy and Max-Relevance (mRMR). This signature has unique features in comparison to the others, such as generalization across different cohorts and functional enrichment in significant pathways: Urothelial Carcinoma, Chronic Kidney Disease, Transitional Cell Carcinoma, and Nephrolithiasis.
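As a rough illustration of the min-redundancy, max-relevance idea named above, the sketch below greedily picks features that correlate strongly with an outcome while correlating weakly with features already chosen. It is not the thesis's implementation, and it rests on several assumptions: absolute Pearson correlation stands in for mutual information, a toy continuous risk score stands in for censored survival data, and the feature names (`gene_a`, `gene_a_copy`, `gene_b`, `noise`) and the function `mrmr_select` are hypothetical.

```python
import math
import random


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (norm_u * norm_v)


def mrmr_select(features, target, k):
    """Greedy mRMR-style selection (hypothetical helper).

    Assumption: |Pearson correlation| replaces mutual information for
    both relevance (feature vs. target) and redundancy (feature vs.
    already-selected features).
    """
    relevance = {name: abs(pearson(col, target)) for name, col in features.items()}
    # Start with the single most relevant feature.
    selected = [max(relevance, key=relevance.get)]
    while len(selected) < k:
        best_name, best_score = None, -math.inf
        for name, col in features.items():
            if name in selected:
                continue
            # Mean absolute correlation with the features chosen so far.
            redundancy = sum(abs(pearson(col, features[s])) for s in selected) / len(selected)
            score = relevance[name] - redundancy
            if score > best_score:
                best_name, best_score = name, score
        selected.append(best_name)
    return selected


# Toy data (assumption): two informative features, one near-duplicate, one noise.
rng = random.Random(0)
n = 200
gene_a = [rng.gauss(0, 1) for _ in range(n)]
gene_b = [rng.gauss(0, 1) for _ in range(n)]
toy_features = {
    "gene_a": gene_a,
    "gene_a_copy": [a + 0.01 * rng.gauss(0, 1) for a in gene_a],
    "gene_b": gene_b,
    "noise": [rng.gauss(0, 1) for _ in range(n)],
}
toy_risk = [a + 0.8 * b for a, b in zip(gene_a, gene_b)]

picked = mrmr_select(toy_features, toy_risk, k=2)
print(picked)
```

In this toy setting the near-duplicate feature is highly relevant but almost fully redundant with the first pick, so the redundancy penalty steers the second pick toward the independent informative feature instead.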
Of the 13 genes in our signature, eight are known to be correlated with ccRCC patient survival and four are immune-related. Our model achieved an area under the receiver operating characteristic curve (ROC AUC) of 0.82 and generalized well between the cohorts. Our findings revealed two clusters of genes, one with high expression (SAA1, OTX1, ZIC2, LINC01732, GNB3, and IL4) and one with low expression (AL353637.1, AR, HHLA2, LIMCH1, SEMA3G, and DPP6), both of which are correlated with poor prognosis. This signature can potentially be used in clinical practice to support patient treatment and follow-up.
Open Access
Machine learning
Bioinformatics
Renal cancer
Gene signature
Feature selection
Mutual information
Uma nova assinatura de 13 genes via aprendizagem de máquina para predição de sobrevida de pacientes com carcinoma renal de células clara
A novel machine learning 13-gene signature: improving risk analysis and survival prediction for clear cell renal cell carcinoma patients
doctoralThesis