Federal University of Rio Grande do Norte 
Brain Institute 
Laboratory of Sleep, Dreams and Memory 
 
 
 
 
 
 
 
Mapeamento mental através da análise computacional do discurso 
(Mind mapping through computational speech analysis) 
 
 
 
 
 
 
 
 
Natália Bezerra Mota 
Supervisor: Mauro Copelli 
Co-supervisor: Sidarta Ribeiro 
 
 
NATÁLIA BEZERRA MOTA 
 
 
 
 
 
 
 
 
 
 
Mapeamento mental através da análise computacional do discurso 
(Mind mapping through computational speech analysis) 
 
 
 
 
 
 
 
Doctoral thesis presented to the 
Graduate Program in Neuroscience of the 
Federal University of Rio Grande do Norte, 
as a requirement for the degree of Doctor. 
 
 
 
 
 
 
 
 
 
Supervisor: Prof. Dr. MAURO COPELLI 
Co-supervisor: Prof. Dr. SIDARTA TOLLENDAL GOMES RIBEIRO 
 
 
 
 
 
 
NATAL, JULY 11, 2017 
Cataloging-in-Publication Data 
Federal University of Rio Grande do Norte 
Biblioteca Setorial Árvore do Conhecimento – Brain Institute 
Mota, Natalia Bezerra. 
Mapeamento mental através da análise computacional do discurso / 
Natália Bezerra Mota. – 2017 
271 leaves : ill. 
Thesis (Doctorate) - Federal University of Rio Grande do Norte, 
Brain Institute, Graduate Program in Neuroscience. Natal, 
RN, 2017. 
Supervisor: Prof. Dr. Mauro Copelli Lopes da Silva. 
Co-supervisor: Prof. Dr. Sidarta Tollendal Gomes Ribeiro. 

1. Neuroscience - Thesis. 2. Psychosis - Thesis. 3. Schizophrenia - Thesis. 4. 
Graphs - Language - Thesis. 5. Education – Thesis. 6. Sleep – Thesis. 7. Dreams – 
Thesis. I. Silva, Mauro Copelli da. II. Ribeiro, Sidarta Tollendal Gomes. 
III. Title. 
RN/UF/BSICe                                                                                CDU 612.8 
 
Brain Institute (UFRN) 
Av. Nascimento Castro, 2155 – Natal – RN – Brazil 
e-mail: pg@neuro.ufrn.br 
phone: +55 (84) 3215-2709 
FEDERAL UNIVERSITY OF RIO GRANDE DO NORTE 

BRAIN INSTITUTE 
GRADUATE PROGRAM IN NEUROSCIENCE 
MINUTES OF THE DOCTORAL THESIS DEFENSE OF THE GRADUATE PROGRAM IN 
NEUROSCIENCE OF THE BRAIN INSTITUTE OF THE FEDERAL UNIVERSITY OF 
RIO GRANDE DO NORTE 
On the eleventh (11th) day of July, two thousand and seventeen (2017), at 2 p.m., in the 
Auditorium of the Brain Institute of the Federal University of Rio Grande do Norte, the 
examining committee responsible for evaluating the thesis entitled 
“MAPEAMENTO MENTAL ATRAVÉS DA ANÁLISE COMPUTACIONAL DO 
DISCURSO”, by the doctoral candidate NATÁLIA BEZERRA MOTA, met in public session. 
The committee was chaired by Prof. Mauro Copelli Lopes da Silva (UFRN - Chair) and 
composed of Profs. Cláudio Marcos Teixeira de Queiroz (Internal Examiner - UFRN), 
Ricardo Alexsandro de Medeiros Valentim (Internal Examiner - UFRN), Claudia Domingues 
Vargas (External Examiner – UFRJ) and Silvia Alice Bunge (External Examiner - UC 
BERKELEY). The examination lasted _______________ and the committee, after the formal 
presentation of the work and the questioning of the candidate, issued the following 
opinion: ________________________________________, considering the 
candidate ____________________________ (Approved/Not approved). There being nothing 
further to address, these minutes were drawn up and are signed by the members of the 
examining committee and by the candidate. 
 
 
Examining Committee 
Prof. Mauro Copelli Lopes da Silva (Chair - UFPE)_____________________________ 
Prof. Cláudio Marcos Teixeira de Queiroz (Internal Examiner - UFRN)__________________ 
Prof. Dráulio Barros de Araújo (Internal Examiner - UFRN)___________________________ 
Prof. Ricardo Alexsandro de Medeiros Valentim (Internal Examiner - UFRN)_____________ 
Prof. Claudia Domingues Vargas (External Examiner – UFRJ)_________________________ 
Prof. Silvia Alice Bunge (External Examiner - UC BERKELEY)_______________________ 


Grade_________________________________________________ 


DOCTORAL CANDIDATE 
Natália Bezerra Mota ________________________________________________ 

Natal-RN, July 11, 2017. 
 
 
 
Remarks: 
Recommendations that must be addressed: 
 
 
 
 
 
 
 
 
 
 
 
 
Optional recommendations: 
 
 
 
 
 
 
 
 
 
 
 
 
Natal, July 11th 2017 
Resumo 
Understanding complex human behaviors such as language and its variations across different 
situations has been an important research goal for many years. A naturalistic and quantitative 
approach to precisely measuring language variations from the structural and semantic points of view 
points to an advance in this area, making it possible to measure variations manifested in free speech 
that reflect cognitive decline in pathological situations, as in the psychoses, or cognitive 
development in children during literacy acquisition, and even the processing of memories in altered 
physiological states of consciousness, such as those that occur during dreams. In this work we begin 
by discussing 1) the development of tools for analyzing speech structure inspired by the 
psychopathological descriptions of mental illnesses, 2) their application to the differential diagnosis 
of psychosis and dementias, and 3) the application of semantic tools to the prediction of psychotic 
episodes. By analyzing speech structure using graphs to study the trajectory of the words used by 
subjects when reporting a dream, it was possible, for example, to verify that subjects diagnosed with 
Schizophrenia spoke in a less connected manner than subjects diagnosed with Bipolar Disorder or 
subjects free of psychotic symptoms. Likewise, we found a greater semantic distance between 
consecutive sentences in psychiatric interviews of subjects in the prodromal phase of psychosis who, 
over a follow-up of two and a half years, went on to have a full psychotic episode. We then widen this 
view beyond the pathological, observing 4) how these measures of language structure vary with 
healthy cognitive development and 5) their relationship with education. We observed correlations 
between the connectedness of the report and performance on tests of fluid intelligence and theory of 
mind, as well as reading performance. We also investigated, in a broad population with a wide range 
of ages, 6) how these measures develop over the course of educational development, 7) evaluating the 
impact of years of education in this population and 8) their correlates with the historical development 
of literature over approximately 5,000 years. Overall, we found that connectedness patterns in 
literature grew and stabilized at the end of the Bronze Age, just before the Axial Age, and that the 
more years of education a subject has, the larger the connected components they produce when 
reporting their memories, values that stabilize only at the end of high school (a development not 
observed in a population with symptoms of psychosis). We conclude by applying semantic similarity 
tools to 9) measure the reverberation of memories during dreams and their electrophysiological 
correlates in an experiment on the transition between waking and sleep. From the results we can 
conclude that structural and semantic tools have great potential to improve the precision with which 
complex human behaviors expressed in speech are measured, in a naturalistic manner, enabling 
revealing investigations into human cognition and consciousness. 
 
  
Abstract 
The understanding of complex human behaviors such as language, and its variations across different 
conditions and contexts, has been an important research aim for many decades. Naturalistic and 
quantitative approaches to precisely measure language variations from the structural and semantic 
points of view have recently emerged, allowing the measurement of variations manifested in free 
speech that reflect atypical cognitive decline in pathological situations such as psychoses, typical 
cognitive development in healthy children during literacy acquisition, and even the processing of 
memories in different states of consciousness, such as waking and dreaming. In this work we start by 
discussing 1) the construction of tools for the analysis of speech structure inspired by the 
psychopathological descriptions of mental illnesses, 2) their application to the differential diagnosis 
of psychosis and dementias, and 3) the application of semantic tools to predict psychotic episodes. At 
the structural level, subjects diagnosed with Schizophrenia reported their dreams with word 
trajectories that, when represented as graphs, were less connected than those of subjects without 
psychosis or with a Bipolar Disorder diagnosis. At the semantic level, we observed a greater semantic 
distance between consecutive sentences in psychiatric interviews of patients in the prodromal phase, 
two and a half years before converting to a psychotic episode. We proceed by widening this view away 
from pathology, so as to determine 4) how graph-theoretical measures of language structure vary across 
healthy cognitive development, and 5) how they relate to indices of academic achievement. We verified 
a correlation between graph connectedness and cognitive performance (fluid intelligence and theory of 
mind abilities), as well as academic performance (reading). Next we investigate 6) how speech 
structure varies within a large sample of healthy and psychotic subjects with wide age and educational 
variation, to 7) evaluate the impact of years of education and 8) compare it with the development of 
literature across approximately 5,000 years. In summary, connectedness in literature increases after 
the Bronze Age (just before the start of the Axial Age), and the more years of education a subject had, 
the larger the connected components of their memory reports, with values stabilizing during high 
school, a developmental trajectory not found in the psychotic population. We conclude by applying 
semantic similarity tools to 9) measure memory reverberation during dreams and its 
electrophysiological correlates in a sleep transition experiment. The results indicate that the 
structural and semantic tools used in this work can greatly improve the precision of naturalistic 
measurements of the complex behaviors expressed in speech. 
  
Summary 
Chapter 1 - Introduction: The use of natural language processing tools can help to 
understand cognition in pathological conditions 
• Mota NB, Copelli M, Ribeiro S (2017) Graph Theory applied to speech: Insights on cognitive 
deficit diagnosis and dream research. In: Language, Cognition, and Computational Models. 
Edited by Thierry Poibeau and Aline Villavicencio. Publisher: Cambridge University Press, in 
press. (Review paper) 
• Mota NB, Furtado R, Maia PPC, Copelli M, Ribeiro S (2014) Graph analysis of dream reports is 
especially informative about psychosis. Scientific Reports 4, 3691 
• Bertola L*, Mota NB*, Copelli M, Rivero T, Diniz BR, Romano-Silva MA, Ribeiro S, Malloy-Diniz 
LF (2014) Graph analysis of verbal fluency test discriminate between patients with Alzheimer's 
disease, mild cognitive impairment and normal elderly controls. Frontiers in Aging 
Neuroscience 6, 1-10 
• Mota NB, Copelli M, Ribeiro S (2016) Computational Tracking of Mental Health in Youth: Latin 
American Contributions to a Low-Cost and Effective Solution for Early Psychiatric Diagnosis. 
New Directions for Child and Adolescent Development 2016 (152), 59-69. (Review paper) 
• Bedi G, Carrillo F, Cecchi GA, Slezak DF, Sigman M, Mota NB, Ribeiro S, Javitt DC, Copelli M, 
Corcoran CM (2015) Automated analysis of free speech predicts psychosis onset in high-risk 
youths. NPJ Schizophrenia 1, 15030 

Chapter 2 - Hypotheses and objectives 
• Dynamics of speech graph attributes during cognitive development and decline 
• Investigation of dream reports using speech analysis tools 
Chapter 3 - Cognitive Development 
• Mota NB, Weissheimer J, Madruga B, Adamy N, Bunge SA, Copelli M, Ribeiro S (2016) A 
Naturalistic Assessment of the Organization of Children's Memories Predicts Cognitive 
Functioning and Reading Ability. Mind, Brain, and Education 10 (3), 184-195 
• Ribeiro S, Mota NB, Fernandes VR, Deslandes AC, Brockington G, Copelli M (2017) Physiology 
and assessment as low-hanging fruit for education overhaul. Prospects UNESCO IBE. (Review 
paper) 
• Ribeiro S, Mota NB, Copelli M (2016) Rumo ao cultivo ecológico da mente. Propuesta Educativa 
46 Año 25, (2) 42-49. (Review paper) 

Chapter 4 - Cognitive decline in patients undergoing psychosis 
• Mota NB*, Copelli M, Ribeiro S (2017) Thought disorder measured as random speech structure 
classifies negative symptoms and Schizophrenia diagnosis 6 months in advance. NPJ 
Schizophrenia 3, 18. DOI: 10.1038/s41537-017-0019-3. (*corresponding author) 
• Mota NB, Carrillo F, Slezak DF, Copelli M, Ribeiro S (2016) Characterization of the relationship 
between semantic and structural language features in psychiatric diagnosis. In: Fiftieth Asilomar 
Conference on Signals, Systems and Computers. (IEEE Conference Publishing) 

Chapter 5 - Speech structure in healthy and pathological verbal reports, in comparison 
with literature across ages 
• Mota NB*, Pinheiro S*, Sigman M, Slezak DF, Cecchi G, Copelli M, Ribeiro S (2017) Bronze Age 
texts are structurally similar to verbal reports from both children and psychotic subjects. 
Nature Human Behavior. (REVIEW) 

Chapter 6 - Lucid dreams and psychosis 
• Mota NB*, Resende A, Mota-Rolim SA, Copelli M, Ribeiro S* (2016) Psychosis and the Control 
of Lucid Dreaming. Frontiers in Psychology 7, 294 (*co-corresponding author) 

Chapter 7 - Sleep transition imagery, insights from natural language processing 
Chapter 8 – Perspectives 
Chapter 9 - Discussion 
Acknowledgments 
Publications and Press 
Appendix (Approval from Ethical Committee) 
References 
  
Chapter 1 - Introduction: 
How natural language processing can help to understand cognition in pathological 
conditions 
This introductory chapter deals with the development of speech analysis tools applied 
mainly to psychiatric assessment in diseases characterized by gradual cognitive 
decline, and with how this knowledge allows low-cost assessment in naturalistic 
situations. The chapter consists of one book chapter in press that focuses on structural 
speech analysis based on graph theory, followed by the first two “new data” publications 
that make up this thesis (the first on psychosis and the second on dementia). The 
published review paper that follows discusses other strategies using semantic similarity 
and how this kind of speech analysis helps to characterize speech incoherence in 
psychosis. The chapter ends with a collaborative paper that uses semantic similarity 
tools to measure incoherence and predict the psychotic break in a prodromal population. 
 
 
 
 
Title: Graph Theory applied to speech: Insights on cognitive deficit diagnosis and 
dream research 
Authors: Natália Bezerra Mota 1, Mauro Copelli 2, Sidarta Ribeiro 1 
 
Affiliations: 
1 – Brain Institute, Federal University of Rio Grande do Norte 
2 – Physics Department, Federal University of Pernambuco 
 
Abstract 
In the last decade, graph theory has been widely employed in the study of natural 
and technological phenomena. The representation of the relationships among the units of a 
network allows for a quantitative analysis of its overall structure, beyond what can be 
understood by considering only a few units. Here we discuss the application of graph 
theory to psychiatric diagnosis of psychoses and dementias. The aim is to quantify the flow 
of thoughts of psychiatric patients, as expressed by verbal reports of dream or waking 
events. This flow of thoughts is hard to measure but is at the roots of psychiatry as well as 
psychoanalysis. To this end, speech graphs were initially designed with nodes representing 
lexemes and edges representing the temporal sequence between consecutive 
words, leading to directed multigraphs. In a subsequent study, individual words were 
considered as nodes and their temporal sequence as edges; this simplification allowed for 
the automatization of the process, effected by the free software SpeechGraphs. Using this 
approach, one can calculate local and global attributes that characterize the network 
structure such as the total number of nodes and edges, the number of nodes present in the 
largest connected and the largest strongly connected components, measures of recurrence 
such as loops of 1, 2 and 3 nodes, parallel and repeated edges, and global measures such 
as the average degree, density, diameter, average shortest path and clustering coefficient. 
Using these network attributes we were able to automatically sort Schizophrenia and 
Bipolar patients undergoing psychosis, and also to separate these psychotic patients from 
subjects without psychosis, with over 90% sensitivity and specificity. In addition to the use 
of the method for strictly clinical purposes, we found that differences in the content of the 
verbal reports correspond to structural differences at the graph level. When reporting a 
dream, healthy subjects without psychosis and psychotic subjects with Bipolar Disorder 
produced more complex graphs than when reporting waking activities of the previous day; 
this difference was not observed in psychotic subjects with Schizophrenia, who produced 
equally poor reports irrespective of the content. As a consequence, graphs of dream 
reports were more efficient for the differential diagnosis of psychosis than graphs of daily 
reports. Based on these results we can conclude that graphs from dream reports are more 
informative about mental states, echoing the psychoanalytic notion that dreams are a 
privileged window into thought. Overall these results highlight the potential use of this 
graph-theoretical method as an auxiliary tool in the psychiatric clinic. We also describe an 
application of the method to characterize cognitive deficits in dementia. In this regard, the 
SpeechGraph tools were able to increase the sensitivity of a neuropsychological test widely 
used to characterize semantic memory, the verbal fluency test. Subjects diagnosed with 
Alzheimer's dementia were compared to subjects diagnosed with Mild Cognitive 
Impairment, either with amnestic symptoms only or with damage in multiple domains. Also 
studied were elderly individuals with no signs of dementia. The subjects were asked to 
report as many names of different animals as they could remember within one minute. The 
sequence of animal names was represented as a word graph. We found that subjects with 
Alzheimer's dementia produced graphs with fewer words and elements (nodes and edges), 
higher density, more loops of 3 nodes and smaller distances (diameter and average 
shortest path) than subjects in the other groups; a similar trend was observed for subjects 
with Mild Cognitive Impairment, in comparison to elderly adults without dementia. 
Furthermore, subjects with Mild Cognitive Impairment with amnestic deficits only 
produced graphs more similar to the elderly without dementia, while those with 
impairments in multiple domains produced graphs more similar to the graphs from 
individuals with Alzheimer's dementia. Importantly, also in this case it was possible to 
automatically classify the different diagnoses only using graph attributes. We conclude by 
discussing the implications of the results, as well as some questions that remain open and 
the ongoing research to answer them.  
  
1. Introduction 
Every day when we wake up, before talking with other people, we talk with 
ourselves using inner speech to remember what day it is, where we are, to make plans 
about what to do in the next minutes, hours, who we are going to meet or what we 
are supposed to do. When we recognize this “inner speech” as coming from ourselves, 
we may simply call it “thinking”. However, sometimes this inner speech is not 
recognized as self but rather as stimuli generated elsewhere; this is the basis of what 
we call psychosis. Sometimes past memories dominate this mental space and we focus 
on past feelings of sadness, joy, fear, or anxiety. Past and future memories are mixed 
in these first moments even before any interaction with another person. This flow of 
memories and thoughts helps to organize our actions and to soothe our anxiety and 
sadness as we can plan future solutions to solve past problems. Organized, healthy 
mental activity allows old and new information to interact in order to support different 
actions that take experience into account in an integrated manner. But what happens 
with this flow of thoughts when we are unable to organize our inner space? 
For centuries, psychiatry has described symptoms known as thought disorder 
that reflect disorganization of this flow of ideas, memories and thoughts (Andreasen 
& Grove, 1986; Kaplan & Sadock, 2009). Those symptoms are related to psychosis, a 
syndrome characterized by hallucinations (when one perceives an object that does not 
exist; a sensorial perception without a real external object) and delusions (when one 
believes in realities that do not exist for other people; ideas or beliefs not real for their 
peers) (Kaplan & Sadock, 2009). There are many different causes for psychosis, such as 
the use of psychoactive substances or neurological conditions like cerebral tumors or 
epilepsies. However, psychotic symptoms may occur without a clear cause, starting 
with a strange feeling or perception, getting worse, creating a confused reality hard to 
share even with the closest person, causing major mental suffering. 
In association with this strange reality, the patient can experience a feeling of 
fragmentation of thoughts, having difficulty organizing ideas or following a flow of 
memories. This affects the way they express what they are thinking or feeling and 
creates meaningless speech (symptoms known as “alogia” and “poor speech”). This 
frequently reflects a mental disorder known as Schizophrenia. In other cases, the person 
may experience another aberrant organization of thought, with a higher speed of mental 
activity that associates different memories and ideas (known as “flight of thoughts”), 
creating speech with a large number of words (a symptom known as “logorrhea”) that 
never reaches the main point. This pattern of thought disorder is common during the 
manic phase of Bipolar Disorder, a psychiatric condition mainly described by opposite 
mood cycles comprising depressive and manic phases. During depressive phases the 
speech pattern changes in the opposite direction (lower speed of thought, fewer 
associations, fewer words during speech). In all these conditions the speech content can 
reflect that strange psychotic reality through unlikely word associations, but the 
organization of ideas reflected in the word trajectories reveals different directions of 
thought disorder, helping psychiatrists to make a differential diagnosis between Bipolar 
Disorder and Schizophrenia and to predict different life courses and cognitive impacts. 
The description of these different patterns of thought organization perceived 
through language helped psychiatrists to distinguish between two different 
pathological states and predict different life courses (with higher cognitive deficits for 
Schizophrenia, first known as Dementia Praecox (Bleuler, 1911)). However, recognizing 
these features subjectively requires long-term professional training and adequate 
time with each patient to know each individual and avoid misjudgments. And even 
with the best evaluation conditions it is only possible to quantify those features 
subjectively, judging disease severity by grades on psychometric scales such as 
BPRS and PANSS (Bech, Kastrup, & Rafaelsen, 1986; Kay, Fiszbein, & Opler, 1987). The 
differential diagnosis requires at least six months of observation during the first 
episode (First, Spitzer, Gibbon, & Williams, 1990), which means that the initial 
treatment may occur under considerable doubt regarding the diagnostic hypothesis. 
This lack of objective quantitative evaluation also impacts negatively on the research 
strategies that aim to find biomarkers for complex psychiatric conditions (Insel, 2010).  
Another condition that benefits from early diagnosis and correct interventions to 
prevent major cognitive damage is Alzheimer’s Disease (AD) (Daviglus et al., 2010; 
Kaplan & Sadock, 2009; Riedel, 2014). Specific characterization of risk during preclinical 
AD requires specialized investigations and still challenges professionals in the field, due 
to a lack of a consensual description of each stage (Daviglus et al., 2010; Riedel, 2014). 
Failure to recognize AD early on can lead to a loss of opportunity to prevent cognitive 
decline (Daviglus et al., 2010; Riedel, 2014). In summary, the currently poor 
quantitative characterization of cognitive impairments related to pathological 
conditions such as Psychosis or Dementia hinders the early detection of these 
conditions. In this scenario, the new field called Computational Psychiatry has been 
proposing mathematical tools to better quantify behavior (Adams, Huys, & Roiser, 2015; 
Montague, Dolan, Friston, & Dayan, 2012; Wang & Krystal, 2014).  
To this end, natural language processing tools are particularly interesting. It is 
now possible to simulate the expert’s subjective evaluation with better precision and 
reliability, either by quantifying specific content features such as semantic incoherence 
(Bedi et al., 2015a; Cabana, Valle-Lisboa, Elvevag, & Mizraji, 2011; Elvevåg, Foltz, 
Weinberger, & Goldberg, 2007), or by analyzing the structural organization of word 
trajectories recorded from patients (Bertola et al., 2014a; Mota et al., 2012; Mota et 
al., 2014). 
 
2. Semantic analysis for the diagnosis of Psychosis 
One useful tool used to characterize the incoherent speech characteristic of 
psychotic crises is called Latent Semantic Analysis (LSA) (Landauer & Dumais, 1997). 
The strange reality created during psychotic states impacts the coherence of the flow 
of words when patients express their thoughts freely, leading to improbable 
connections between semantically distant words within the same sentences. 
LSA is based on a model that assumes that the meaning of each word is a 
function of its relationship with the other words in the lexicon (Landauer & Dumais, 
1997). By this rationale, if two words are semantically similar, i.e. if their meanings are 
related, they must co-occur frequently in texts. It follows that if one has a large enough 
database of word co-occurrences in a large enough corpus of texts, it is possible to 
represent each word of that corpus as a vector in a semantic space, and their proximity 
in that space will be interpreted as semantic similarity (Landauer & Dumais, 1997). 
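
The co-occurrence idea behind this model can be illustrated with a minimal sketch. It is not LSA proper: LSA as described by Landauer & Dumais applies a dimensionality reduction to a large word-by-context matrix, whereas the sketch below skips that step and compares raw co-occurrence counts. The toy corpus and the function names `cooccurrence_vectors` and `cosine` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(documents):
    """Map each word to counts of the other words sharing a document with it."""
    vectors = defaultdict(Counter)
    for doc in documents:
        words = doc.lower().split()  # assumption: whitespace tokenization
        for i, word in enumerate(words):
            for j, other in enumerate(words):
                if i != j:
                    vectors[word][other] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus (hypothetical): two "animal" texts and two "finance" texts.
corpus = [
    "dog chased cat",
    "cat chased mouse",
    "bank raised interest rate",
    "bank lowered interest rate",
]
vecs = cooccurrence_vectors(corpus)

# Words used in similar contexts end up close together in the vector space.
sim_related = cosine(vecs["dog"], vecs["mouse"])
sim_unrelated = cosine(vecs["dog"], vecs["rate"])
```

In this toy setting "dog" and "mouse" never appear together, yet they share the same neighbours ("cat", "chased"), so their vectors are close; "dog" and "rate" share no neighbours and are therefore far apart.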
When healthy subjects describe their normal reality, it is expected that they will 
use words that are semantically similar within the same text. However, when reality 
becomes bizarre, as typical of psychotic states, subjects are expected to use 
semantically distant words in sequence, thus building incoherent speech. That 
incoherence can be quantified as a measure of semantic distance between consecutive 
words or sets of words (for example, a set of words used in the same sentence). The 
more incoherent the speech, the larger the semantic distance between consecutive 
words or sets of words. This was first shown for chronic patients with a Schizophrenia 
diagnosis (Elvevåg et al., 2007) and helped to predict diagnosis in the prodromal phase, 
2.5 years before the first psychotic crisis (Bedi et al., 2015b). 
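
As a rough sketch of how such a distance between consecutive sentences could be computed, assume that each word already has a vector in some semantic space. The two-dimensional vectors in `TOY_VECTORS` are invented by hand for this example and do not come from any trained model; averaging word vectors to represent a sentence and using one minus the cosine as the distance are likewise simplifying assumptions, not the procedure of the cited studies.

```python
import math

# Hypothetical 2-D word vectors, hand-made for illustration only.
TOY_VECTORS = {
    "dog": (1.0, 0.1), "cat": (0.9, 0.2), "barked": (0.8, 0.0),
    "ran": (0.7, 0.3), "bank": (0.1, 1.0), "loan": (0.0, 0.9),
}


def sentence_vector(sentence):
    """Average the vectors of the known words in a sentence."""
    vecs = [TOY_VECTORS[w] for w in sentence.lower().split() if w in TOY_VECTORS]
    if not vecs:
        return None
    return tuple(sum(component) / len(vecs) for component in zip(*vecs))


def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def consecutive_distances(sentences):
    """Semantic distance (1 - cosine) between each pair of consecutive sentences."""
    vectors = [sentence_vector(s) for s in sentences]
    vectors = [v for v in vectors if v is not None]
    return [1.0 - cosine(a, b) for a, b in zip(vectors, vectors[1:])]


coherent = consecutive_distances(["dog barked", "cat ran"])
incoherent = consecutive_distances(["dog barked", "bank loan"])
```

Under these assumptions the jump from "dog barked" to "bank loan" yields a much larger distance than the jump to "cat ran", which is the kind of contrast the incoherence measure is meant to capture.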
 
3. What is a Speech Graph? 
One way to quantify thought disorder is to represent the flow of ideas and 
memories, reflected in the flow of words during free speech, as a trajectory and 
create a speech graph. A graph is a set of nodes linked by edges, formally defined as 
G = (N, E), with N = {w1, w2, …, wn} and E = {(wi, wj)} (Bollobas, 1998; Börner, Sanyal, & 
Vespignani, 2007). The criteria determining how a link is established between two 
nodes define topological properties of these graphs that can be measured locally or 
globally. In the present case, each word is defined as a node and the temporal 
sequence of words during free speech is represented by directed edges (Mota et al., 
2014) (Figure 1). From a speech graph we can objectively measure local and global 
features of the word trajectory that reflect the flow of thoughts during a free speech 
task (such as when the subject reports a daily event, a past memory, or even a dream 
memory). 
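
The word-to-graph mapping described above can be sketched in a few lines. This is only an illustration under simple assumptions (lower-casing and whitespace tokenization, no lemmatization), not the SpeechGraphs software itself; the function name `build_speech_graph` is hypothetical.

```python
from collections import Counter


def build_speech_graph(text):
    """Return (nodes, edges) for a transcript.

    nodes: set of distinct words.
    edges: Counter mapping (source, target) to the number of times the
           target word immediately followed the source word, i.e. a
           directed multigraph stored as edge multiplicities.
    """
    words = text.lower().split()  # assumption: whitespace tokenization
    nodes = set(words)
    edges = Counter(zip(words, words[1:]))
    return nodes, edges


nodes, edges = build_speech_graph("the dog saw the cat and the dog ran")
```

In this example the nine spoken words collapse into six nodes, and the transition from "the" to "dog" occurs twice, so it is stored as an edge of multiplicity two.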
 
Figure 1 here: Examples of speech graphs from dream reports of schizophrenic, 
bipolar and control subjects. Starting from transcribed verbal reports, graphs were 
generated using custom-made Java software (see below). Figure from (Mota et al., 
2014). 
 In the last decade, graph theory has been widely employed in the study of 
natural or technological phenomena (Boccaletti et al., 2006). By allowing the 
representation of the relationships among their units, the overall structure of a 
network can elucidate characteristics that could not be understood by considering only 
a few units. The meaning of the represented structure basically depends on what is 
being considered as a node and on the definition of the presence and direction of 
edges (links between nodes). Graph theory as a tool may not only help to tackle 
problems in the basic sciences, but can also be applied to solve complex problems in 
everyday life, otherwise difficult to characterize and measure. An interesting strategy 
in scientific research is to keep both goals in focus: Seek to understand a phenomenon 
at the fundamental level, while at the same time use the knowledge as a tool to solve 
practical problems (Stokes, 1997). With a simultaneous focus on basic and applied 
research, the application of graph theory to represent the relationship between 
spoken words helps to understand how different psychiatric conditions differentially 
impact the flow of words during free speech, and how we can apply this knowledge to 
perform differential diagnosis. 
Once reports are represented as graphs, one can calculate several attributes that 
quantify local and global characteristics. We calculated 14 attributes comprising 2 
general graph attributes (Nodes and Edges), 5 recurrence attributes (Parallels – PE and 
Repeated Edges – RE; Loops of one – L1, two – L2 and three nodes – L3), 2 attributes of 
connectivity (Largest Connected Component – LCC and Largest Strongly Connected 
Component – LSC) and 5 global attributes (Average Total Degree – ATD, Density, 
Diameter, Average Shortest Path – ASP, Clustering Coefficient – CC) (Figure 2). 
 
FIGURE 2 here: Examples of Speech Graph Attributes described above (figure from 
(Mota et al., 2014)).  
 
Speech Graph Attributes: 
1. N: Number of nodes. 
2. E: Number of edges. 
3. RE (Repeated Edges): sum of all edges linking the same pair of nodes. 
4. PE (Parallel Edges): sum of all parallel edges linking the same pair of nodes 
given that the source node of an edge is the target node of the parallel edge. 
5. L1 (Loop of one node): sum of all edges linking a node with itself, calculated as 
the trace of the adjacency matrix. 
6. L2 (Loop of two nodes): sum of all loops containing two nodes, calculated by 
the trace of the squared adjacency matrix divided by two. 
7. L3 (Loop of three nodes): sum of all loops containing three nodes (triangles), 
calculated by the trace of the cubed adjacency matrix divided by three. 
8. LCC (Largest Connected Component): number of nodes in the maximal 
subgraph in which all pairs of nodes are reachable from one another in the 
underlying undirected subgraph. When you have all the words on one large 
connected component, LCC will be the same as N. 
9. LSC (Largest Strongly Connected Component): number of nodes in the maximal 
subgraph in which all pairs of nodes are reachable from one another in the 
directed subgraph (node a reaches node b, and b reaches a). 
10. ATD (Average Total Degree): given a node n, its Total Degree is the sum of “in“ 
and “out” edges. Average Total Degree is the sum of Total Degree of all nodes 
divided by the number of nodes. 
11. Density: number of edges divided by the number of possible edges, D = 2*E/(N*(N-1)), where 
E is the number of edges and N is the number of nodes. 
12. Diameter: length of the longest shortest path between the node pairs of a 
network. 
13. Average Shortest Path (ASP): average length (number of steps along edges) of 
the shortest path between pairs of nodes of a network. 
14. CC (Average Clustering Coefficient): given a node n, its Clustering Coefficient 
is the fraction of pairs of n's neighbors that are also neighbors of each other; the 
Clustering Coefficient Map (CCMap) is the set of these fractions for all nodes. Average 
CC is the sum of the Clustering Coefficients of all nodes in 
the CCMap divided by the number of elements in the CCMap. 
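As an illustration of the trace-based definitions above, the following Python sketch computes N, E, L1, L2, L3 and ATD for a short word sequence. It is not the SpeechGraphs implementation: the function names are hypothetical, the adjacency matrix is taken to be binary (repeated edges counted once), and self-loops are removed before squaring and cubing so that the traces count only loops over distinct nodes. Both choices are assumptions made for this example. RE, PE and the remaining attributes are left out.

```python
def matmul(a, b):
    """Multiply two square matrices stored as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(m):
    """Sum of the diagonal entries of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))


def basic_attributes(words):
    """Return N, E, L1, L2, L3 and ATD for a sequence of words."""
    nodes = sorted(set(words))
    index = {w: i for i, w in enumerate(nodes)}
    edges = list(zip(words, words[1:]))  # one directed edge per transition
    n = len(nodes)

    # Assumption: binary adjacency matrix (repeated edges counted once).
    a = [[0] * n for _ in range(n)]
    for u, v in edges:
        a[index[u]][index[v]] = 1

    # Assumption: self-loops are removed before taking matrix powers.
    b = [[0 if i == j else a[i][j] for j in range(n)] for i in range(n)]
    b2 = matmul(b, b)
    b3 = matmul(b2, b)

    return {
        "N": n,
        "E": len(edges),
        "L1": trace(a),
        "L2": trace(b2) // 2,
        "L3": trace(b3) // 3,
        # Every edge adds one "out" and one "in" endpoint, so ATD = 2E / N.
        "ATD": 2 * len(edges) / n if n else 0.0,
    }


example = "dream of dream of sea dream dream".split()
attributes = basic_attributes(example)
print(attributes)
```

For this toy sequence the graph has 3 nodes and 6 edges, one self-loop (dream, dream), one loop of two nodes (dream, of) and one loop of three nodes (dream, of, sea).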
In order to compare graphs with different numbers of elements (controlling for 
differences in verbosity, measured as the number of words), two main strategies 
were used. First, we divided each graph attribute by the number of words in the report, 
assuming a linear relationship between graph attribute and verbosity. A pertinent 
critique is that the relationship between graph attributes and verbosity is not always 
linear, and for some attributes it is not clear whether there is a direct relationship (Figure 3). A 
second strategy was to build one graph for each set of a fixed number of words, 
skipping a fixed number of words to start the next graph, which allows a certain 
level of overlap between consecutive graphs. This “sliding window” approach allows 
calculating the average attributes of graphs with a fixed number of words, which 
enables the study of the topological characteristics of graphs of different sizes 
(say, small, medium and large graphs). A critique of this strategy is that the arbitrary cuts in 
word sequences can change topological properties, mainly global attributes. This 
is an important discussion of ongoing research that needs to be addressed carefully, so 
as to enable a better interpretation of the results. 
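The two verbosity controls described above can be sketched in a few lines of Python. The names `sliding_windows`, `per_word` and `windowed_mean` are hypothetical, the attribute used here (number of nodes) is only an example, and the window size and step are arbitrary values chosen for illustration.

```python
def sliding_windows(words, size, step):
    """Yield consecutive windows of `size` words, advancing `step` words."""
    for start in range(0, len(words) - size + 1, step):
        yield words[start:start + size]


def node_count(words):
    """Number of distinct words (nodes) in a word sequence."""
    return len(set(words))


def per_word(attribute, words):
    """Strategy 1: divide an attribute by the number of words."""
    return attribute(words) / len(words)


def windowed_mean(attribute, words, size, step):
    """Strategy 2: average an attribute over fixed-size word windows."""
    values = [attribute(w) for w in sliding_windows(words, size, step)]
    return sum(values) / len(values) if values else None


report = "sea sand sea sand boat sky".split()
normalized_nodes = per_word(node_count, report)            # 4 nodes / 6 words
mean_nodes = windowed_mean(node_count, report, 4, 1)       # mean over 3 windows
```

With a window of 4 words and a step of 1, the toy report yields three overlapping windows with 2, 3 and 4 nodes, so the windowed mean is 3.0. A report shorter than one window yields no graph at all, which is one practical cost of the second strategy.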
 
Figure 3 here: Linear correlation between SGA and word count (WC). (figure from 
(Mota et al., 2014)).  
 
4. Speech Graphs as a strategy to quantify symptoms in psychosis 
In an attempt to represent the flow of thoughts expressed in free speech, 
speech graphs were initially designed with nodes representing lexemes (a subject, 
object or verb in the sentence), and their temporal sequence represented as directed 
edges, yielding directed multigraphs with self-loops and parallel edges (Mota et al., 
2012). Analyzing dream reports represented as graphs from 24 subjects (8 subjects 
presenting psychotic symptoms with a diagnosis of Schizophrenia, 8 subjects also with 
psychotic symptoms diagnosed with Bipolar Disorder in the manic phase and 8 control 
subjects without any psychotic symptoms), it was possible to quantify psychiatric 
symptoms such as: 
1. Logorrhea, described as the increase in verbosity characteristic of 
Bipolar Disorder in the manic phase. This was quantified not only as more words in 
the Bipolar group, but also as more recurrence (more parallel edges), even when 
controlling for differences in verbosity by dividing graph attributes by the number of 
words in the speech. This means that the reports tend to return more often to the 
same topics. 
2. Flight of thoughts, described as talking about topics other than the main 
topic requested, which is also characteristic of Bipolar Disorder. In the Bipolar group, more 
nodes were used to talk about waking events upon request to report on a recent 
dream. 
3. Poor speech, described as loss of meaning in speech, perceived 
as a set of poorly connected words, characteristic of Schizophrenia. This was 
quantified as more nodes per word, denoting reports that address each topic only 
once, neither branching nor recurring, so that almost every word used is counted as a 
different node. 
It was possible to automatically sort the Schizophrenia from the Bipolar group using a 
machine learning approach. A Naïve Bayes classifier was used to distinguish between 
both groups, and to distinguish between pathological groups and non-psychotic 
subjects (Kotsiantis, 2007). The classifier received as input either speech graph 
attributes or scores given by psychiatrists for psychiatric symptoms (using 
standard psychometric scales: PANSS (Kay et al., 1987) and BPRS (Bech et al., 1986)). 
Classification quality was assessed through sensitivity, specificity, 
kappa statistics and the area under the receiver operating characteristic curve (AUC); 
this curve plots sensitivity (or true positive rate) on the y-axis versus false 
positive rate (or 1-specificity) on the x-axis. An AUC around 0.5 means a random 
classification, whereas AUC = 1 means a perfect classification (none of the possible 
errors were made). It was possible to classify the pathological groups against the 
non-psychotic group using either graph attributes or psychometric scales with high accuracy 
(AUC higher than 0.8) (Table 01). But to distinguish between the Schizophrenia and Bipolar 
groups, graph attributes performed better than psychometric scales (AUC = 0.88 using 
graph attributes as input, while AUC = 0.57 when using psychometric scales as input) 
(Table 01). 
Table 01: Classification metrics between diagnostic groups using Speech Graph 
Attributes (Mota et al., 2012). S: Schizophrenia; B: Bipolar; C: Control. 
Groups   Sensitivity   Specificity   Kappa   AUC 
S x B    93.8%         93.7%         0.88    0.88 
S x C    87.5%         87.5%         0.75    0.90 
B x C    68.8%         68.7%         0.37    0.80 
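The metrics reported in Table 01 can be illustrated with a minimal Python sketch. The labels and scores below are toy values, not data from the study, and the function names are hypothetical. The kappa is computed as Cohen's kappa, which is an assumption since the text does not name the variant. AUC is computed through its rank interpretation: the fraction of (positive, negative) pairs in which the positive case receives the higher score.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 1 and 0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def cohen_kappa(tp, tn, fp, fn):
    """Agreement between predictions and labels, corrected for chance."""
    total = tp + tn + fp + fn
    observed = (tp + tn) / total
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    return (observed - expected) / (1 - expected)


def auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy labels (not study data): 8 positives, 8 negatives, one error in each.
y_true = [1] * 8 + [0] * 8
y_pred = [1] * 7 + [0] + [0] * 7 + [1]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
kappa = cohen_kappa(tp, tn, fp, fn)

# Toy scores: three of the four (positive, negative) pairs are ordered correctly.
ranking_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
```

With two balanced groups of 8 and one error in each, sensitivity and specificity are both 0.875 and Cohen's kappa is 0.75. This is compatible with the S x C row of Table 01 having been obtained with Cohen's kappa on balanced groups, although the source does not state this.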
 
This first study had some limitations concerning the small sample (only 8 subjects 
per group) and the methodology. First, the transformation from text to graph was 
done by hand, a process that is time-consuming and error-prone. Second, the 
graph was not completely free of subjective evaluation (a node was considered as a 
subject, object or verb in the sentence and, at the grammatical level, this required a syntactic 
evaluation). So, in order to avoid these problems and to allow the study of a larger 
sample with larger texts, in a subsequent study we employed words as nodes and their 
temporal sequence as edges, a simplification which allowed the process to be 
automated by the SpeechGraphs software (Mota et al., 2014). This custom-made Java 
software, available at http://neuro.ufrn.br/softwares/speechgraphs, receives a text file 
as input and returns the graph based on the text together with the 14 graph attributes 
described above. It is also possible to cut the text into consecutive graphs with a fixed 
number of words, controlling for verbosity and exploring different word-window 
sizes to study cognitive phenomena. 
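The word-as-node simplification can be sketched as follows. This is not the SpeechGraphs code, which is written in Java; it is a minimal Python illustration with hypothetical function names. Lower-casing the text, dropping punctuation and linking words across sentence boundaries are all assumptions of this example, since the preprocessing used by the software is not described here.

```python
import re


def tokenize(text):
    """Lower-case a transcript and split it into word tokens."""
    # [^\W_]+ matches runs of Unicode letters and digits, so accented
    # words stay intact; apostrophes inside a word are kept.
    return re.findall(r"[^\W_]+(?:'[^\W_]+)*", text.lower())


def word_graph(text):
    """Return (nodes, edges): distinct words and consecutive-word transitions."""
    words = tokenize(text)
    nodes = sorted(set(words))
    edges = list(zip(words, words[1:]))  # directed, in order of utterance
    return nodes, edges


nodes, edges = word_graph("Eu sonhei que voava. Voava alto!")
```

In this toy transcript the repeated word "voava" becomes a single node with a self-loop, so six words produce five nodes and five edges.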
To characterize distinct pathological phenomena in the speech of different types 
of psychosis, the SpeechGraphs tool was applied. Symptoms of Bipolar Disorder such 
as logorrhea could still be associated with an increase in network size (Mota et al., 
2014; Mota et al., 2012). Also, symptoms of Schizophrenia such as alogia and poor 
speech were measured as fewer edges (E) and smaller connected components (LCC) 
and strongly connected components (LSC) when compared to the Bipolar and Control 
groups, producing less complex graphs in the Schizophrenia group even after 
controlling for word count (comparing consecutive graphs of 10, 20 and 30 words with 
a step of one word). In graphs from this group there are fewer edges between nodes 
and fewer nodes connected by some path or mutually reachable. This means that the 
Schizophrenia group tends to talk only a few times about the same topic, not returning 
to or associating past topics with consecutive ones, probably denoting cognitive deficits 
such as working memory deficits. 
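The two connectivity attributes discussed above can be computed with plain graph searches. The sketch below is a simple illustration rather than the method used by the SpeechGraphs software, and its function names are hypothetical. LCC is found by searching the graph while ignoring edge direction; LSC is found by intersecting, for each node, the set of nodes it can reach with the set of nodes that can reach it.

```python
def build_adjacency(words):
    """Successor and predecessor sets for a directed word graph."""
    succ, pred = {}, {}
    for u, v in zip(words, words[1:]):
        succ.setdefault(u, set()).add(v)
        pred.setdefault(v, set()).add(u)
    return succ, pred


def reachable(start, neighbours):
    """All nodes reachable from `start` using the `neighbours` function."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in neighbours(node):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def lcc_and_lsc(words):
    """Sizes of the largest connected and largest strongly connected components."""
    succ, pred = build_adjacency(words)

    def forward(n):
        return succ.get(n, set())

    def backward(n):
        return pred.get(n, set())

    def undirected(n):
        return forward(n) | backward(n)

    lcc = lsc = 0
    for node in set(words):
        lcc = max(lcc, len(reachable(node, undirected)))
        strong = reachable(node, forward) & reachable(node, backward)
        lsc = max(lsc, len(strong))
    return lcc, lsc


example = "i saw a dog and i saw a cat then left".split()
lcc, lsc = lcc_and_lsc(example)
```

In the toy sentence the speaker returns once to "i saw a", which closes a cycle over five words (i, saw, a, dog, and), so LSC is 5 while LCC is 8. When a graph is built from one uninterrupted word sequence, every word lies on the same walk, so LCC equals N; LSC is smaller whenever the speaker does not return to earlier words.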
Using these network characteristics it was also possible to automatically sort the 
Schizophrenia, Bipolar and Control groups, with 
AUC = 0.94 to classify Schizophrenia and Control groups, AUC = 0.72 to classify Bipolar 
and Control groups and AUC = 0.77 to classify Schizophrenia and Bipolar groups (Table 
02). These results highlight the potential use of this method as an auxiliary tool in the 
psychiatric clinic. 
Table 02: Classification metrics between diagnostic groups using Speech Graph 
Attributes (Mota et al., 2014). S: Schizophrenia; B: Bipolar; C: Control. 
Groups      AUC    Sensitivity   Specificity 
S x B x C   0.77   0.62          0.81 
S x B       0.77   0.69          0.68 
S x C       0.94   0.85          0.85 
B x C       0.72   0.74          0.75 
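To make the classification step more concrete, the sketch below implements a small Naïve Bayes classifier of the kind mentioned earlier in this section. The source does not specify the variant or the software settings, so Gaussian likelihoods, the variance floor and all function names are assumptions of this example. The feature values (LCC, LSC) are invented for illustration and are not data from the studies.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(samples, labels, var_floor=1e-6):
    """Estimate class priors and per-feature mean and variance per class."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    model = {}
    for y, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        variances = [max(sum((v - m) ** 2 for v in col) / n, var_floor)
                     for col, m in zip(columns, means)]
        model[y] = (n / len(samples), means, variances)
    return model


def log_posteriors(model, x):
    """Unnormalized log posterior of each class for one feature vector."""
    scores = {}
    for y, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(x, means, variances):
            # Features are treated as independent given the class.
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        scores[y] = score
    return scores


def predict(model, x):
    """Class with the highest log posterior."""
    scores = log_posteriors(model, x)
    return max(scores, key=scores.get)


# Invented (LCC, LSC) values for two hypothetical groups, for illustration only.
train_x = [[18, 6], [19, 8], [17, 7], [22, 15], [23, 17], [21, 16]]
train_y = ["S", "S", "S", "C", "C", "C"]
model = fit_gaussian_nb(train_x, train_y)
first = predict(model, [18, 7])
second = predict(model, [22, 16])
```

In practice the classifier would be evaluated on subjects not used for training, for example by cross-validation; the toy predictions above only show the mechanics.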
 
 
Figure 4 here: Representative speech graphs extracted from dream reports from a 
schizophrenic, a bipolar and a control subject (figure from (Mota et al., 2014)). 
To better understand the relationship between these graph features and the 
symptomatology measured by psychometric scales, the correlation between those 
metrics was analyzed. Edges, LCC and LSC were strongly negatively correlated with 
cognitive and negative symptoms (as measured by psychometric scales). In other 
words, when the subjects presented greater severity of symptoms such as emotional 
retraction and flattened affect (loss of emotional reaction), poor eye contact (with the 
interviewer during psychiatric evaluation), loss of spontaneity or fluency of speech 
and difficulty in abstract thinking (measured by the ability to interpret proverbs), their 
reported dreams generated graphs with fewer edges and fewer nodes in the largest 
connected and strongly connected components. Those psychiatric symptoms are more 
common in subjects with Schizophrenia (Kaplan & Sadock, 2009), indicating how we 
can measure the impact on cognition and the deficits in social interaction of these 
individuals through graphs of speech (Mota et al., 2014). Cognitive and psychological 
aspects that drive this pattern of speech, such as working memory, planning and theory 
of mind abilities, may explain those deficits and help to elucidate the pathophysiology 
of the different psychotic disorders. When the interviewer asks the subject to report a 
memory, the way the subject interacts socially with the interviewer, recalls what to 
report, and plans the answer and the sequence of events to report all impact the 
sequence of words spoken, reflecting their mental organization. 
 
5. Differences in Speech Graphs due to content (waking x dream reports) 
We already understand that during pathological cognitive states there is an 
impact on the flow of thoughts or memories that we can track by the word trajectory. 
But what happens with physiologically altered consciousness states like dream 
mentation? Is it possible to characterize differences between dream and daily 
memories regarding word trajectories? Does this reveal any additional features of 
general cognition? 
A few minutes before waking up every day we can experience an exclusively 
internal reality not shared with our friends or family: dreaming. This reality is internally 
built from a set of memories with different affective valences, with different types 
of meaning accessible only to the dreamer. This confused mental state is 
phenomenologically similar to a psychotic state, as there is a lack of insight regarding 
the bizarreness of this strange reality (Dresler et al., 2015; Mota et al., 2014; Scarone 
et al., 2007). Thus it would not be surprising if the flow of information 
regarding dream memories better revealed the thought disorganization characteristic 
of psychotic states.  
During the studies with psychotic populations there were differences in speech 
graphs depending on the speech content. When reporting a dream, subjects without 
psychosis and subjects with Bipolar Disorder produced more complex graphs (higher 
connectivity) than when reporting daily activities of the previous day, a difference 
which was not observed in subjects with Schizophrenia (those subjects reported 
dreams and daily memories with similarly poorly connected graphs) (Mota et al., 2014). 
Therefore, graphs of dream reports were more efficient in group sorting than graphs of 
daily reports (Mota et al., 2014). 
 
Figure 5 here: Representative speech graph examples extracted from dream and 
waking reports from the same schizophrenic, bipolar and control subject (figure from 
(Mota et al., 2014)). 
Another intriguing result was found in the correlations between speech graph 
attributes and clinical symptoms measured by the psychometric scales PANSS (Kay et al., 
1987) and BPRS (Bech et al., 1986). Only the connectivity attributes of dream graphs were 
strongly and negatively correlated with the negative and cognitive symptoms (as measured 
by both scales) that are more common in Schizophrenia. Waking report graphs showed 
negative correlations of general psychotic symptoms such as loss of insight 
(measured by PANSS) and incoherent speech (measured by BPRS) with LCC (also a 
connectivity attribute) (Mota et al., 2014). This emphasizes that reports of dream 
memories require different cognitive functions and empathy abilities than reports of 
daily memories. 
Based on these results we can conclude that graphs from dream reports are 
more informative about mental states than graphs representing waking reports. This 
result echoes the psychoanalytic proposal that dreams are a privileged window into 
thought (Freud, 1900; Mota et al., 2014). This observation has started a new basic 
research approach to quantitatively understand what is going on when we remember 
a dream. The use of electrophysiological approaches (most notably, multi-channel 
electroencephalography) to characterize different sleep stages in the laboratory gives 
access to dream mentation through reports while also giving access to 
electrophysiological activity during sleep. 
 
6. Speech Graphs applied to dementia  
Considering the characterization of cognitive deficits in conditions such as 
dementia, tests designed to characterize specific cognitive impacts on 
the memory domain are useful in early evaluation. One example is the Verbal 
Fluency Test, which consists of the verbal recall of different names from a specific category 
(usually animals) during a fixed time. This was first used to investigate the executive 
aspects of verbal recall, measuring the capacity to produce an adequate quantity of 
words under limited recall conditions, without repeating words or recalling other categories 
(Lezak, Howieson, Bigler, & Tranel, 2012). The individual needs to access semantic 
memory correctly and to be flexible in order to quickly change the words (using 
temporal cortex structures), and to store the already mentioned words to avoid 
repetitions, which requires executive functions such as inhibitory control (using frontal 
cortex structures) (Henry & Crawford, 2004). 
Different pathologies, such as dementia, can impair performance on this 
task. As different structures are involved in correctly performing the task, different kinds 
of errors can help distinguish between different causes (damage in different locations). 
Different causes of dementia lead to different symptomatic courses, which 
reflect damage in different locations. The characterization of word trajectory with the 
application of the SpeechGraphs tool complements this neuropsychological test (Bertola 
et al., 2014b). A total of 100 individuals (25 subjects diagnosed with Alzheimer's 
dementia, 50 diagnosed with Mild Cognitive Impairment (25 of them with only 
amnestic symptoms and the other 25 with impairment in multiple domains) and 25 
elderly subjects with no signs of dementia) were asked to report as many names of 
different animals as they could remember in one minute (Nickles, 2001). The sequence 
of animal names was represented as a word graph. 
It was observed that subjects with Alzheimer's dementia produced graphs with 
fewer words and elements (nodes and edges), higher density, more loops of 3 nodes 
and smaller distances (diameter and average shortest path) than the other groups, with 
the same trend for subjects with Mild Cognitive Impairment compared to elderly 
adults without dementia (Bertola et al., 2014b). Furthermore, subjects with Mild 
Cognitive Impairment with only amnestic deficits produced graphs more similar to those of the 
elderly without dementia, while those with impairments in multiple domains produced 
graphs more similar to the graphs from individuals with Alzheimer's disease. Also in 
this case, it was possible to automatically classify the different diagnoses from 
graph attributes alone (Bertola et al., 2014b). There was also a correlation between speech 
graph attributes and two important standard cognitive assessments widely used in 
geriatric populations, denoting an important correlation between word trajectory in 
verbal fluency recall and general cognitive status (measured with the MMSE – Mini Mental 
State Exam) and functional performance (measured with the Lawton Instrumental 
Activities of Daily Living Scale) (Bertola et al., 2014b). 
On the one hand, the more cognitively preserved the elderly subjects were, the more unique 
nodes were produced in less dense graphs. On the other hand, the more functionally 
dependent the individuals were, the fewer words, nodes and edges were produced in 
denser graphs with smaller diameter and average shortest paths (Bertola et al., 
2014b). Another differential impact was evident for three-node loops, a repetition of 
the same word with only two words in between (example: “lion”, “cat”, “dog”, “lion”), 
found at higher frequency in the Alzheimer group compared with the MCI and control 
groups (Bertola et al., 2014b). This indicates an impairment in working memory from the 
early stages of Alzheimer’s disease (already recognized by other working memory 
assessments (Huntley & Howard, 2010)). 
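The effect of such repetitions on the global attributes can be illustrated with a small Python sketch using the example above. It is a simplified illustration, not the analysis of Bertola et al.: the function names are hypothetical, distances are measured ignoring edge direction, and density uses distinct edges in D = 2*E/(N*(N-1)). All three choices are assumptions of this example.

```python
from collections import deque


def fluency_graph(words):
    """Undirected neighbour sets linking consecutive words (assumption)."""
    neighbours = {w: set() for w in words}
    for u, v in zip(words, words[1:]):
        if u != v:
            neighbours[u].add(v)
            neighbours[v].add(u)
    return neighbours


def path_lengths_from(neighbours, source):
    """Breadth-first search distances from `source` to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in neighbours[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def global_attributes(words):
    """Density, diameter and average shortest path of a word sequence."""
    neighbours = fluency_graph(words)
    n = len(neighbours)
    e = sum(len(s) for s in neighbours.values()) // 2  # distinct edges
    lengths = [d for src in neighbours
               for node, d in path_lengths_from(neighbours, src).items()
               if node != src]
    return {
        "density": 2 * e / (n * (n - 1)) if n > 1 else 0.0,
        "diameter": max(lengths, default=0),
        "asp": sum(lengths) / len(lengths) if lengths else 0.0,
    }


def three_node_loops(words):
    """Count returns to a word after exactly two other, different words."""
    return sum(1 for i in range(len(words) - 3)
               if words[i] == words[i + 3] and len(set(words[i:i + 3])) == 3)


without_repeat = global_attributes(["lion", "cat", "dog", "horse"])
with_repeat = global_attributes(["lion", "cat", "dog", "lion"])
loops = three_node_loops(["lion", "cat", "dog", "lion"])
```

Replacing the fourth animal with a repetition of the first turns a chain of four nodes (density 0.5, diameter 3) into a triangle (density 1.0, diameter 1) with one three-node loop, which is the direction of change described above for the Alzheimer group.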
These results point to the additional information that the characterization of 
word trajectory brings to a well-established neuropsychological test. In this 
application example, as the test has restrictive rules, we expect the subject 
to produce a certain type of graph, and different types of deviations from this expected 
pattern inform about cognitive impairments. 
 
7. Future perspectives  
Word graphs are not the only tool to quantify psychiatric symptoms through speech 
analysis. As pointed out in the introduction, other approaches aim to quantify 
semantic similarities between words (Bedi et al., 2015a; Elvevåg et al., 2007). The 
relationship between speech incoherence measured by LSA and speech structure 
measured by Speech Graphs is not clear yet. Both measures take into account word 
sequences and word co-occurrences, but with very different approaches (one 
compares with a semantic model based on a large corpus, and the other uses graph 
theory to characterize topological features of the speech sample). A better understanding 
of both approaches can improve automated speech analysis for clinical purposes 
such as diagnosis and prognosis prediction, creating useful follow-up tools in a clinical 
setting. 
Another interesting perspective is to combine language analysis with prosody 
analysis. Semi-automated tools have characterized prosodic deficits related to 
Schizophrenia diagnosis. The patients made more pauses, were slower, and showed less 
pitch variability and less variation in syllable timing, expressing a flat prosody when 
compared to matched controls (Martínez-Sánchez et al., 2015). The relationship 
between expressive prosody and language features during free speech can elucidate 
several cognitive characteristics subjectively perceived by well-trained psychiatrists 
(Berisha, Wang, LaCross, & Liss, 2015). 
A better understanding of word trajectories in free speech can also be applied in 
settings other than the psychiatric clinic. As these tools show important correlations 
with cognitive deficits in psychosis and dementia, could they be useful to characterize 
cognitive development in a school setting? This kind of approach could help predict 
cognitive impairment early enough to allow quick intervention, preventing learning 
disabilities that later on would be harder to manage. This could also help quantitatively 
characterize cognitive development in a naturalistic manner. 
  
Acknowledgements: The authors dedicate this chapter to the memory of Raimundo 
Furtado Neto, who made important contributions to the development of the 
SpeechGraphs software. This work was supported by Conselho Nacional de 
Desenvolvimento Científico e Tecnológico (CNPq), grants Universal 480053/2013-8 and 
Research Productivity 306604/2012-4 and 310712/2014-9; Coordenação de 
Aperfeiçoamento de Pessoal de Nível Superior (CAPES) Projeto ACERTA; Fundação de 
Amparo à Ciência e Tecnologia do Estado de Pernambuco (FACEPE); FAPESP Center for 
Neuromathematics (grant # 2013/07699-0, S. Paulo Research Foundation FAPESP). 
  
REFERENCES:  
Adams, R. A., Huys, Q. J., & Roiser, J. P. (2015). Computational Psychiatry: towards a 
mathematically informed understanding of mental illness. J Neurol Neurosurg 
Psychiatry. doi:jnnp-2015-310737. 
Andreasen, N. C., & Grove, W. M. (1986). Thought, language, and communication in 
schizophrenia: diagnosis and prognosis. Schizophr Bull, 12(3), 348-359.  
Bech, P., Kastrup, M., & Rafaelsen, O. J. (1986). Mini-compendium of rating scales for states of 
anxiety depression mania schizophrenia with corresponding DSM-III syndromes. Acta 
Psychiatr Scand Suppl, 326, 1-37.  
Bedi, G., Carrillo, F., Cecchi, G. A., Slezak, D. F., Sigman, M., Mota, N. B., Ribeiro, S., Javitt, D., 
Copelli, M., & Corcoran, C. M. (2015b). Automated analysis of free speech predicts 
psychosis onset in high-risk youths. npj Schizophrenia, 1, 15030. 
doi:10.1038/npjschz.2015.30. 
Berisha, V., Wang, S., LaCross, A., Liss, J. (2015). Tracking Discourse Complexity Preceding 
Alzheimer's Disease Diagnosis: A Case Study Comparing the Press Conferences of 
Presidents Ronald Reagan and George Herbert Walker Bush. Journal of Alzheimer's 
Disease, 45, 3. 
Bertola, L., Mota, N. B., Copelli, M., Rivero, T., Diniz, B. S., Romano-Silva, M. A., Ribeiro, S., & 
Malloy-Diniz, L. F. (2014a). Graph analysis of verbal fluency test discriminate between 
patients with Alzheimer's disease, mild cognitive impairment and normal elderly 
controls. Front Aging Neurosci, 6, 185. doi:10.3389/fnagi.2014.00185. 
Bleuler, E. (1911). Dementia praecox or the group of schizophrenias. (J. Zinkin, Trans.). New 
York: International Universities Press. 
Bollobas, B. (1998). Modern Graph Theory. Berlin. 
Börner, K., Sanyal, S., & Vespignani, A. (2007). Network Science. In B. Cronin (Ed.), Information 
Today (pp. 537–607). Medford: ARIST. 
Cabana, A., Valle-Lisboa, J. C., Elvevag, B., & Mizraji, E. (2011). Detecting order-disorder 
transitions in discourse: Implications for schizophrenia. Schizophr Res. doi:S0920-
9964(11)00233-7. 
Daviglus, M. L., Bell, C. C., Berrettini, W., Bowen, P. E., Connolly, E. S., Jr., Cox, N. J., Dunbar-
Jacob, J. M., Granieri, E. C., Hunt, G., McGarry, K., Patel, D., Potosky, A. L., Sanders-
Bush, E., Silberberg, D., & Trevisan, M. (2010). NIH state-of-the-science conference 
statement: Preventing Alzheimer's disease and cognitive decline. NIH Consens State Sci 
Statements, 27(4), 1-30.  
Dresler, M., Wehrle, R., Spoormaker, V. I., Steiger, A., Holsboer, F., Czisch, M., & Hobson, J. A. 
(2015). Neural correlates of insight in dreaming and psychosis. Sleep Med Rev, 20, 92-
99. doi:10.1016/j.smrv.2014.06.004. 
Elvevåg, B., Foltz, P. W., Weinberger, D. R., & Goldberg, T. E. (2007). Quantifying incoherence 
in speech: An automated methodology and novel application to schizophrenia. 
Schizophrenia Research, 93(1-3), 304-316. doi:10.1016/j.schres.2007.03.001. 
First, M. H., Spitzer, R. L., Gibbon, M., & Williams, J. (1990). Structured Clinical Interview for 
DSM-IV Axis I Disorders -- Research Version, Patient Edition (SCID-I/P). New York: 
Biometrics Research, New York State Psychiatric Institute. 
Freud, S. (1900). The interpretation of dreams. Strachey, J. transl (Ed). London: Basic Books. 
Henry, J. D., & Crawford, J. R. (2004). A meta-analytic review of verbal fluency performance 
following focal cortical lesions. Neuropsychology, 18(2), 284-295. doi:10.1037/0894-
4105.18.2.284. 
Huntley, J. D., & Howard, R. J. (2010). Working memory in early Alzheimer's disease: a 
neuropsychological review. Int J Geriatr Psychiatry, 25(2), 121-132. 
doi:10.1002/gps.2314. 
Insel, T. R. (2010). Rethinking schizophrenia. Nature, 468, 187-193.  
Kaplan, H. I., & Sadock, B. J. (2009). Kaplan & Sadock's Comprehensive Textbook of Psychiatry: 
Wolters Kluwer, Lippincott Williams & Wilkins. 
Kay, S. R., Fiszbein, A., & Opler, L. A. (1987). The positive and negative syndrome scale (PANSS) 
for schizophrenia. Schizophr Bull, 13(2), 261-276.  
Kotsiantis, S. B. (2007). Supervised machine learning: a review of classification techniques. In I. 
Maglogiannis, K. Karpouzis, M. Wallace, J. Soldatos (Ed), Emerging artificial intelligence 
applications in computer engineering: real word AI systems with applications (pp. 3–24). 
Amsterdam: IOS Press.  
Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato's problem: the latent semantic 
analysis theory of acquisition, induction, and representation of knowledge. Psychol 
Rev, 104, 211–240.  
Lezak, M. D., Howieson, D. B., Bigler, E. D., & Tranel, C. (2012). Neuropsychological Assessment 
(5th ed.). New York: Oxford University Press. 
Martínez-Sánchez, F., Muela-Martínez, J. A., Cortés-Soto, P., Meilán, J. J. G., Ferrándiz, J. A. V., 
Caparrós, A. E., & Valverde, I. M. P. (2015). Can the Acoustic Analysis of Expressive 
Prosody Discriminate Schizophrenia? The Spanish Journal of Psychology, 18, E86. 
doi:10.1017/sjp.2015.85.  
Montague, P. R., Dolan, R. J., Friston, K. J., & Dayan, P. (2012). Computational psychiatry. 
Trends Cogn Sci, 16(1), 72-80. doi:10.1016/j.tics.2011.11.018. 
Mota, N. B., Furtado, R., Maia, P. P., Copelli, M., & Ribeiro, S. (2014). Graph analysis of dream 
reports is especially informative about psychosis. Scientific Reports, 4, 3691. 
doi:10.1038/srep03691. 
Mota, N. B., Vasconcelos, N. A., Lemos, N., Pieretti, A. C., Kinouchi, O., Cecchi, G. A., Copelli, 
M., & Ribeiro, S. (2012). Speech graphs provide a quantitative measure of thought 
disorder in psychosis. PLoS One, 7(4), e34928. doi:10.1371/journal.pone.0034928. 
Nickles, L. (2001). Spoken word production. In B. Rapp (Ed.), What Deficits Reveal about the 
Human Mind/Brain: A Handbook of Cognitive Neuropsychology: Philadelphia: 
Psychology Press. 
Riedel, W. J. (2014). Preventing cognitive decline in preclinical Alzheimer's disease. Curr Opin 
Pharmacol, 14, 18-22. doi:10.1016/j.coph.2013.10.002. 
Scarone, S., Manzone, M. L., Gambini, O., Kantzas, I., Limosani, I., D'Agostino, A., & Hobson, J. 
A. (2007). The Dream as a Model for Psychosis: An Experimental Approach Using 
Bizarreness as a Cognitive Marker. Schizophrenia Bulletin, 34(3), 515-522. 
doi:10.1093/schbul/sbm116. 
Stokes, D. E. (Ed.) (1997). Pasteur's Quadrant – Basic Science and Technological Innovation. 
Washington, D. C.: Brookings Institution Press. 
Wang, X. J., & Krystal, J. H. (2014). Computational psychiatry. Neuron, 84(3), 638-654. 
doi:10.1016/j.neuron.2014.10.018. 
 
Graph analysis of dream reports is
especially informative about psychosis
Natália B. Mota1, Raimundo Furtado1, Pedro P. C. Maia1, Mauro Copelli2* & Sidarta Ribeiro1*
1Brain Institute, Federal University of Rio Grande do Norte (UFRN), Natal, Brazil, Postal Code: 59056-450, 2Physics Department,
Federal University of Pernambuco (UFPE), Recife, Brazil, Postal Code: 50670-901.
Early psychiatry investigated dreams to understand psychopathologies. Contemporary psychiatry, which
neglects dreams, has been criticized for lack of objectivity. In search of quantitative insight into the structure
of psychotic speech, we investigated speech graph attributes (SGA) in patients with schizophrenia, bipolar
disorder type I, and non-psychotic controls as they reported waking and dream contents. Schizophrenic
subjects spoke with reduced connectivity, in tight correlation with negative and cognitive symptoms
measured by standard psychometric scales. Bipolar and control subjects were undistinguishable by waking
reports, but in dream reports bipolar subjects showed significantly less connectivity. Dream-related SGA
outperformed psychometric scores or waking-related data for group sorting. Altogether, the results indicate
that online and offline processing, the two most fundamental modes of brain operation, produce nearly
opposite effects on recollections: While dreaming exposes differences in the mnemonic records across
individuals, waking dampens distinctions. The results also demonstrate the feasibility of the differential
diagnosis of psychosis based on the analysis of dream graphs, pointing to a fast, low-cost and
language-invariant tool for psychiatric diagnosis and the objective search for biomarkers. The Freudian
notion that ‘‘dreams are the royal road to the unconscious’’ is clinically useful, after all.
Differential diagnosis in psychiatry is more often than not a difficult task, unsupported by objective tests and
necessarily performed by experts1. Standard psychiatric diagnosis has been harshly criticized, despite
century-old efforts towards an accurate classification of mental illnesses1–4. Multi-site and cross-cultural
expert agreement is low, most diseases do not have unequivocal biomarkers, and clear-cut distinctions between
certain maladies may be unwarranted5,6. For instance, subjects with schizophrenia or bipolar disorder type I may
share several positive psychotic symptoms such as hallucinations, delusions, hyperactivity and aggressive behavior7.
The development of quantitative methods for the evaluation of psychiatric symptoms offers hope to overcome
this foggy scenario8,9. In particular, we have recently shown that the graph-theoretical analysis of dream reports
produced by psychotic patients can separate schizophrenic from manic subjects10. This was possible because their
speech features are usually quite different. Schizophrenic subjects frequently display negative symptoms including
alogia, i.e. they speak laconically and with little digression7,10. Subjects with bipolar disorder, especially during
the manic stage, tend to present the opposite symptom called logorrhea, with much recursiveness in association
with positive symptoms7,10. These differences in symptomatology led us to hypothesize that schizophrenic and
bipolar subjects would produce less connected word graphs than control subjects, in correlation with negative
symptoms. It also remains unsettled whether dream reports are crucial for the differential diagnosis of psychosis,
as early psychiatrists would have sustained11,12, or whether waking contents are equally informative.
To elucidate these issues, we quantified the speech graph attributes (SGA; Figure 1a, Figure 2) of dream and
waking reports obtained from clinical oral interviews of schizophrenic, bipolar type I, and control subjects
(Supplementary Table S1). Using a Bayesian classifier, we compared the differential diagnosis of psychosis
provided by dream-related SGA, waking-related SGA or standard psychometric scores. Translation of the reports
into five major Western languages was performed to assess language-related variations.
Results
Speech samples were recorded during psychiatric interviews as answers to two different requests: "Please report a
recent dream" and "Please report your waking activities immediately before that dream". Each report was
transcribed and represented as a speech graph, in which every word represented a node, and every temporal
connection between consecutive words represented an edge. The visual inspection of speech graphs suggests that
dream reports (Figure 1b) vary more across groups than waking reports from the same subjects (Figure 1c).
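The word-to-node, transition-to-edge mapping described above can be sketched in a few lines of Python. This is only an illustration, not the authors' Java software: the function name `build_speech_graph`, the lowercasing and the whitespace tokenization are assumptions introduced here, and the example sentence is invented.

```python
from collections import Counter


def build_speech_graph(text):
    """Return (nodes, edges) for a word graph built from a transcribed report.

    nodes: set of distinct words.
    edges: Counter mapping (source, target) pairs of consecutive words to the
           number of times that transition occurs in the report.
    """
    # Assumption: lowercase the text and split on whitespace. The original
    # software may tokenize differently.
    words = text.lower().split()
    nodes = set(words)
    # Every pair of consecutive words contributes one directed edge.
    edges = Counter(zip(words, words[1:]))
    return nodes, edges


# Invented example sentence, used only to show the data structure.
nodes, edges = build_speech_graph("I saw a dog and a cat and I ran")
print(len(nodes), "nodes;", sum(edges.values()), "word transitions")
```

Keeping the transition counts, rather than a plain set of edges, preserves the information needed later for recurrence attributes such as repeated edges.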
OPEN
Subject areas: Applied physics; Human behaviour; Diagnostic markers
Received 31 October 2013; accepted 25 November 2013; published 15 January 2014
Correspondence and requests for materials should be addressed to S.R. (sidartaribeiro@neuro.ufrn.br)
*Shared corresponding authorship.
SCIENTIFIC REPORTS | 4 : 3691 | DOI: 10.1038/srep03691 1
A semantic and grammatical inspection of the most-frequent
words, loops and their corresponding exit nodes showed few differences
across dream and waking reports produced by psychotic and
control subjects, with major overlap in word repertoire across groups
(Supplementary Fig. S1). At the structural level, however, irrespective
of meaning, clear contrasts emerged. While waking reports in all
groups were typically sequential, with little recursiveness that
reflected the linearity of chronological narrative, dream reports were
quite convoluted when produced by bipolar and control subjects.
The SGA obtained for all the words in each report (Supplementary
Tables S2 and S3) mostly agreed with the SGA obtained with smaller
samples (n = 8 per group) and with the use of lexemes10, which
require syntactical analysis. While dream-related graphs showed
overall good classification quality and significant SGA differences
between schizophrenic subjects and the two other groups (bipolar
and control subjects), waking-related graphs failed to differentiate
between any of the groups for any SGA (Figure 3a, Supplementary
Table S4). We also found that nearly all SGA differed between dream
and wake reports from bipolar and control subjects (Figure 3a).
Since schizophrenic subjects produce dream reports with a significantly
smaller word count (WC) than dream reports produced by
bipolar and control subjects, and given the fact that most SGA are
strongly correlated with WC (Figure 4), it is possible that the differences
between schizophrenic subjects and the two other groups
derive solely from verbosity differences that could hinder the clinical
applicability of the method. Indeed, bipolar and control subjects used
more words than schizophrenic subjects when reporting a dream,
making more complex graphs than when reporting on waking
(Figure 3a). In contrast, schizophrenic subjects showed impoverished
graphs for both dream and waking without any SGA difference
between those, with overall low values of most SGA (Figure 3a).
To rule out the influence of verbosity, we analyzed the reports
using a moving window of fixed word length (10, 20 and 30 words)
with a step of 1 word. Each report yielded a population of graphs
from which we calculated mean SGA. This procedure revealed that
schizophrenic subjects yielded significantly less connected graphs
(smaller LCC and LSC) and fewer edges (E) than bipolar and control
subjects, for every word length tested and for both dream and waking
(Figure 5a for word length = 30). Small graphs (word length = 10
and 20) showed smaller internal distances (Diameter and ASP) in
schizophrenic subjects than in control subjects, for both dream
(word length 10: Diameter P = 0.0001, ASP P = 0.0001; word length
20: Diameter P = 0.0007, ASP P = 0.0004) and waking (word length
10: Diameter P = 0.0021, ASP P = 0.0019; word length 20: Diameter
P = 0.0013, ASP P = 0.0006). Additionally, dream-related small
graphs had smaller ATD (word length 10 P = 0.0028; word length
20 P = 0.0106), and waking-related small graphs had smaller distances
(word length 10 ASP P = 0.0140; word length 20 Diameter P
= 0.0054, ASP P = 0.0043) in schizophrenic subjects, in comparison
with bipolar subjects. Altogether the data show that reports from
schizophrenic subjects, irrespective of originating from dream or
waking, were characterized by small and poorly connected graphs,
in comparison with bipolar and control subjects (Supplementary
Table S2).
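The verbosity control described above (a fixed-length window advanced one word at a time, with attributes averaged over the resulting population of graphs) can be sketched as follows. This is a minimal illustration under stated assumptions, not the published implementation: the function names are hypothetical, only N and E are computed, and E is taken to be the number of distinct directed edges, which is what lets E vary between windows of equal length.

```python
def window_graph_counts(words):
    """Count distinct nodes and distinct directed edges in one window."""
    nodes = set(words)
    edges = set(zip(words, words[1:]))  # assumption: E counts distinct edges
    return len(nodes), len(edges)


def mean_windowed_counts(text, window=30, step=1):
    """Average node and edge counts over all windows of `window` words.

    Returns (mean_nodes, mean_edges, n_windows), or None when the report has
    fewer words than the window length and therefore yields no graph.
    """
    words = text.lower().split()  # assumption: whitespace tokenization
    if len(words) < window:
        return None
    counts = [
        window_graph_counts(words[start:start + window])
        for start in range(0, len(words) - window + 1, step)
    ]
    n_windows = len(counts)
    mean_nodes = sum(n for n, _ in counts) / n_windows
    mean_edges = sum(e for _, e in counts) / n_windows
    return mean_nodes, mean_edges, n_windows


# Invented toy input with a 3-word window, purely to show the mechanics.
print(mean_windowed_counts("a b c a b c", window=3))
```

Returning `None` for short reports mirrors the situation noted in the Figure 5 caption, where some waking reports contained fewer than 30 words and could not contribute 30-word graphs.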
The reports produced by bipolar subjects, on the other hand, were
very different depending on their source: dream events were reported
with more recurrence (L3) and connectivity (ATD), higher density,
smaller distances (diameter and ASP) and higher clustering coefficient
(CC) than waking events (Figure 5a). Control subjects also
reported dreams differently (with more E and larger LSC), and only
schizophrenic subjects did not show any difference on dream or
waking SGA (Figure 5a). When related to dreams, bipolar reports
yielded less connected graphs (smaller LCC and LSC) with fewer
nodes (N) than control subjects (Figure 5a). We also found graphs
with smaller distances when using word length = 10 (Diameter P =
0.006, and ASP P = 0.0071), denoting smaller and less complex
graphs in bipolar than in control subjects. None of these differences
between bipolar and control subjects occurred in waking-related
reports (Figure 5a).
To further explore dream versus waking differences in the reports
of psychotic patients, we trained a Naïve Bayes classifier to differentiate
among the groups using all SGA as inputs, with SCID results
as gold standard. Schizophrenic subjects could be sorted from
Figure 1 | The speech graphs of schizophrenic, bipolar and control
subjects are more varied for dream than for waking reports. (a) Graphs
were generated from transcribed verbal reports using custom-made Java
software (http://neuro.ufrn.br/softwares/speechgraphs). Drawing by NM.
(b) Representative speech graphs extracted from dream reports from a
schizophrenic, a bipolar and a control subject. (c) Same as in (b), but for
waking reports of the same subjects.
Figure 2 | Speech Graph Attributes (SGA). Examples of speech graph
attributes described in Methods.
www.nature.com/scientificreports
bipolar and control subjects with AUC between 0.6 and 0.86 for both
dream and waking graphs (Figure 3b, Figure 5b, Supplementary
Table S5), but only dream-related graphs could sort bipolar from
control subjects (Figure 5b). Using raw data, it was possible to sort
dream from waking reports among bipolar (AUC = 0.753) and
control subjects (AUC = 0.807) (Figure 3c). Using an analysis window
with length of 30 words, which provided the best accuracy for
group classification, it was possible to automatically sort dream and
waking reports among bipolar (AUC = 0.794) and control subjects
(AUC = 0.65) (Figure 5c). This contrasts with reports from schizophrenic
subjects, which showed no structural differences between
dream and waking (Figure 3c, Figure 5c). Overall, the triple sorting
of schizophrenic, bipolar and control subjects based on automatically
selected attributes (E, LSC and ASP for dream reports; E and LCC for
waking reports; word length = 30) was substantially better for
dream-related SGA than for waking-related SGA or psychometric
scores (Figure 5d).
The investigation of correlations between dream-related SGA and
psychopathological symptoms captured by PANSS and BPRS, considering
all 60 subjects, produced interesting results: Using the attributes
that best differentiated schizophrenic subjects from other groups (E,
LCC and LSC), we found significant anti-correlations with negative
and cognitive symptoms (Figure 6, Supplementary Fig. S2), known to
be more frequent among schizophrenic subjects than among individuals
with other psychotic syndromes7. Subjects that reported dream
graphs with fewer edges or smaller connected components (LCC,
LSC) scored higher on PANSS, on the negative PANSS subscale,
and on PANSS questions regarding flattened affect, poor contact,
difficulties in abstract thought, and less spontaneous or fluent speech;
these subjects also scored higher on BPRS questions about emotional
retraction and flattened affect (Figure 6a). Significant anti-correlations
in waking reports only occurred between LCC and general
psychotic symptoms: Subjects that reported on waking with lower
LCC presented higher scores on the PANSS question about judgment
Figure 3 | SGA using raw data (full reports) differentiate psychopathological groups. (a) SGA boxplots with significant differences among
schizophrenic, bipolar and control groups indicated in red, and significant differences between dream and waking reports indicated in blue. (N = 20 per
group; Kruskal-Wallis test followed by two-sided Wilcoxon Rank-sum test with Bonferroni correction with α = 0.0167). (b) Rating quality measured by
AUC, sensitivity and specificity, using all attributes. Notice that dream reports categorize the groups much better than waking reports. (c) Rating quality
for the distinction between dream and waking reports. While reports from bipolar and control subjects can be sorted, schizophrenic subjects yield reports
that fail to differentiate dream from waking.
Figure 4 | Linear correlation between SGA and word count (WC). Only L1, Density, Diameter, ASP and CC did not present a significant linear
correlation with WC. (a) Dream reports. (b) Waking reports.
Figure 5 | SGA controlled for verbosity differentiate psychopathological groups due to dream reports. (a) SGA boxplots for 30-word speech graphs
show significant differences among schizophrenic, bipolar and control groups indicated in red, and significant differences between dream and waking
reports indicated in blue (N = 20 per group for dream reports; Kruskal-Wallis test followed by two-sided Wilcoxon Rank-sum test with Bonferroni
correction with α = 0.0167). Eight subjects reported on waking events using fewer than 30 words (for waking reports, N = 17 for the schizophrenic and
control groups, and N = 18 for the bipolar group). (b) Rating quality measured by AUC, sensitivity and specificity, using all attributes. Raw data was
compared with mean data obtained using analysis windows of fixed word length (10, 20 and 30 words per window). (c) The rating quality for the
SGA-based distinction between dream and waking reports varies considerably across groups, reaching a maximum among bipolar subjects and a
minimum among schizophrenic subjects. (d) Group sorting using dream-related SGA is better than classifications based on psychometric scores or
waking-related data.
and critical capacity, and on the BPRS question regarding incoherent
speech (Figure 6b).
Finally, to simulate the comparison of an actual psychiatric clinical
assessment with a scenario in which graph analysis was employed, we
compared the performances of binary classifiers trained with 1)
selected SGA from both dreaming and waking, 2) PANSS and
BPRS total scores, and 3) a combination of both. The attributes
selected were those with significant correlation with psychometric
scores: E, LCC and LSC for dream reports, and LCC for waking
reports (Figure 6). We found that SGA sufficed to successfully sort
the three groups, differentiating schizophrenic from control subjects
with AUC = 0.941, bipolar from control subjects with AUC = 0.722,
and schizophrenic subjects from bipolar subjects with AUC = 0.768
(Figure 7a). The psychometric scales were able to properly sort
schizophrenic from control subjects (AUC = 0.955), and bipolar
from control subjects (AUC = 0.935), but failed to differentiate
schizophrenic subjects from bipolar subjects (AUC = 0.376). For a
combination of SGA and standard scale scores, schizophrenic subjects
were sorted from bipolar subjects with AUC = 0.748, bipolar
subjects were sorted from control subjects with AUC = 0.928, and
schizophrenic subjects were nearly perfectly sorted from control
subjects with AUC = 0.993. Triple group sorting was better for
SGA (AUC = 0.767) than for scales (AUC = 0.731), and was optimized
by their combination (AUC = 0.849; Figure 7a). To assess the
general applicability of the method, reports in Portuguese were translated
to English, German, French, and Spanish. Figure 7b shows that
group classification is remarkably similar across the five most prevalent
Western languages.
Discussion
The results provide a quantitative behavioral assessment of negative
and cognitive symptoms, and thus demonstrate the feasibility
of the automatic differential diagnosis of psychosis based on the
word-by-word graph analysis of dream and waking reports. Rather
than detracting from the classical distinction between schizophrenic
and bipolar subjects, SGA quantitatively characterize their differences,
providing a parameter space for the sorting of psychotic symptoms
like alogia, logorrhea, lack of fluency in speech, and formal
thought disorders (Figure 6). Thus, SGA analysis has potential to
become a fast, non-invasive, low-cost and language-invariant tool for
psychiatric diagnosis, by which a set of behavioral biomarkers could
drive a more objective, bottom-up search for anatomical and physiological
biomarkers13–15. Future research must follow up the investigation
of non-medicated patients after first psychotic episodes,
using longitudinal measures on same samples for prodrome and
treatment evaluation2,16,17.
The results also show that dream reports are substantially more
informative about the mental state of psychotic subjects than waking
reports. The explanation for this fact, which echoes the centenary
claim that dreams constitute a privileged window into thought11, may
be rooted in the very introspective nature of dreams. While the
episodic replay of recent waking activities occupies only 1–2% of
dream reports18, declarative memories become more accessible for
retrieval after REM sleep19, when most dreaming occurs20. Perhaps
dream reports are more likely to reveal psychopathologies than waking
reports because dreams are not proximally anchored on events
shared with non-psychotic individuals, but rather on memories
Figure 6 | Dream-related SGA are anti-correlated with specific psychopathological symptoms. (a) Spearman’s rho for correlations between
individual questions of the PANSS and BPRS scales, and SGA obtained from dream reports (N = 60). Note the significant anti-correlations between SGA
(E, LCC and LSC) and psychometric variables including total PANSS, PANSS negative subscale, and some negative and cognitive symptoms such as
flattened affect, poor contact, difficulty in abstract thinking, loss of spontaneity or fluency in speech in PANSS; as well as emotional retraction and
flattened affect in BPRS. A 30-word moving window was used for data analysis. Circles indicate P values smaller than the Bonferroni corrected
α = 0.00006. (b) Same as before but for waking reports (N = 52). Note the significant anti-correlations for LCC and general psychotic symptoms
measured on both scales (loss of criticism in PANSS and incoherent speech in BPRS).
matured and restructured over time by the patient’s own thought
process.
Another important consideration is that dream events are more
forgettable than waking events, probably because noradrenergic
transmission is decreased during sleep21. On the other hand, REM
sleep and dreaming are involved with emotional processing22,23. The
combination of memory deficits with heightened emotional salience
makes a request for a dream report yield more internally generated
content than a request for a waking report. Importantly, patients with
schizophrenia and bipolar disorder respond in opposite ways to the
dream-report task: the former maintain their flattened speech, the
latter confabulate even more.
Finally, it is possible that psychotic subjects are more likely to
reveal the structure of their thinking when reporting on dreams
simply due to the similarity between dreaming and psychosis11,12,24–28.
The dream content in patients with schizophrenia is particularly
affected by negative symptoms29, and their waking cognition matches
the bizarreness of dream reports27, supporting dreaming as an
experimental model of psychosis. SGA analysis combined with
neural signal decoding during sleep30 and waking31 may soon allow
for direct testing of these hypotheses.
Methods
Subjects. Sixty individuals (39 males and 21 females) were independently diagnosed by the
standard DSM-IV SCID ratings32 as schizophrenic, bipolar type I, or control
subjects (Supplementary Table S1). The study was approved by the UFRN Research Ethics
Committee (permit #102/06-98244); informed consent was obtained from all
subjects.
Clinical significance of the sample. Sample size was established according to the
global and national prevalence of schizophrenia and bipolar disorder type I.
Estimation of adequate sample size (N) considered the prevalence of Schizophrenia
and Bipolar Disorder Type I according to the equation:
$$N = \frac{Z^2 P (1 - P)}{d^2}$$
where Z = Z statistic for a level of confidence, P = expected prevalence or proportion
and d = precision33. We adopted a conventional level of confidence of 95%, with Z =
1.96 (considering 95% of confidence interval) and a precision of d = 0.05 (ref. 33). A review
of data from 46 countries with 154,140 cases considered the lifetime prevalence of
schizophrenia to be 0.55% (±0.45 SD)34. The lifetime prevalence of bipolar disorder
type I was considered to be 0.6% on a review of 61,392 cases from 11 countries35, or
0.9% (±0.2 SEM) based on an exclusive Brazilian sample on the same study35. The
estimated sample sizes for the prevalences considered ranged from N = 1.53 to 15.21
for schizophrenia, and from N = 9.16 to 16.72 for bipolar disorder type I. Note that no
estimated sample size was greater than N = 20, with N < 10 for mean lifetime
prevalences in the world sample (schizophrenia 0.55% and bipolar type I 0.6%).
Studies focused on the Brazilian population report a local prevalence of 0.57% for
schizophrenia36, and a range of 0.3%–1.1% for bipolar disorder37. To ensure the
clinical relevance of the results with equal size samples for each group (schizophrenia,
bipolar and control), we selected N 5 20 per group.
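The sample-size estimate can be checked numerically with a short Python sketch of N = Z²P(1 − P)/d², using Z = 1.96 and d = 0.05 as stated above. The function name `sample_size` is introduced here for illustration, and the prevalence endpoints in the dictionary are assumptions inferred from the figures quoted in the text (for instance, 0.55% ± 0.45% is read as a range of 0.1% to 1.0%); the source does not list them explicitly.

```python
def sample_size(prevalence, z=1.96, precision=0.05):
    """Sample size for estimating a proportion: N = Z^2 * P * (1 - P) / d^2."""
    return z ** 2 * prevalence * (1.0 - prevalence) / precision ** 2


# Prevalences as proportions. Labels and endpoints are illustrative
# assumptions inferred from the percentages quoted in the text.
prevalences = {
    "schizophrenia, lower bound (0.1%)": 0.001,
    "schizophrenia, mean (0.55%)": 0.0055,
    "schizophrenia, upper bound (1.0%)": 0.010,
    "bipolar type I, world (0.6%)": 0.006,
    "bipolar type I, upper bound (1.1%)": 0.011,
}

for label, p in prevalences.items():
    print(f"{label}: N = {sample_size(p):.2f}")
```

Under these assumed endpoints the computed values fall close to the ranges reported in the text, and every estimate stays below the N = 20 per group that was adopted.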
Graph analysis of dream and waking reports. We focused our analysis on answers to
two open questions: "please report a recent dream" and "please report your waking
activities immediately before that dream". Each transcribed report was represented as
a word-graph38–40 in which every word was represented as a node, and the temporal
link between consecutive words was represented as an edge (Figure 1a and Figure 2).
To quantify graph variations, we used custom-made Java software
(http://neuro.ufrn.br/softwares/speechgraphs; Supplementary Method) to calculate 14 speech graph
attributes (SGA; Figure 2) comprising general attributes: total of nodes (N) and edges
(E); connected components: total of nodes on the largest connected component (LCC,
the maximal subgraph in which all pairs of nodes are reachable from one another in
the underlying undirected subgraph), and on the largest strongly connected
component (LSC, the maximal subgraph in which all pairs of nodes are reachable
from one another in the directed subgraph); recurrence attributes: repeated edges (RE,
sum of all edges linking the same pair of nodes) and parallel edges (PE, sum of all
parallel edges linking the same pair of nodes given that the source node of an edge
could be the target node of the parallel edge), cycles of one (L1, calculated as the trace
of the adjacency matrix), two (L2, calculated by the trace of the squared adjacency
matrix divided by two) or three (L3, calculated by the trace of the cubed adjacency
matrix divided by three) nodes; global attributes: average total degree (ATD; given a
node n, the Total Degree is the sum of ‘‘in and out’’ edges, and the Average Total
Degree is the sum of Total Degrees of all nodes divided by the number of nodes),
density D = 2E/N(N − 1), where E is the number of edges and N is the number of
nodes, diameter (length of the longest shortest path between the node pairs of a
network), average shortest path (ASP, average length of the shortest path between
pairs of nodes of a network) and clustering coefficient (CC, given a node n, the
Clustering Coefficient Map (CCMap) is the set of fractions of all n neighbors that are
also neighbors of each other. Average CC is the sum of the Clustering Coefficients of
all nodes in the CCMap divided by number of elements in the CCMap). The data were
then analyzed in Matlab and Excel software.
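Several of the attributes defined above follow directly from the adjacency matrix, and a small Python sketch makes the trace-based cycle counts concrete. It is not the authors' Java code: the helper names are hypothetical, the adjacency matrix is assumed to be binary and directed, the Total Degree of a node is taken as in-degree plus out-degree on that binary matrix, and the density denominator is read as the product N(N − 1).

```python
def adjacency(words):
    """Binary directed adjacency matrix over the distinct words (sorted)."""
    vocab = sorted(set(words))
    index = {word: i for i, word in enumerate(vocab)}
    size = len(vocab)
    matrix = [[0] * size for _ in range(size)]
    for src, dst in zip(words, words[1:]):
        matrix[index[src]][index[dst]] = 1
    return matrix


def matmul(x, y):
    """Plain-Python product of two square matrices of equal size."""
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def trace(matrix):
    return sum(matrix[i][i] for i in range(len(matrix)))


def graph_attributes(words):
    """Compute N, E, L1, L2, L3, ATD and density as defined in the text."""
    a = adjacency(words)
    n = len(a)
    e = sum(map(sum, a))          # distinct directed edges (assumption)
    a2 = matmul(a, a)
    a3 = matmul(a2, a)
    return {
        "N": n,
        "E": e,
        "L1": trace(a),           # self-loops
        "L2": trace(a2) / 2,      # trace of the squared matrix, halved
        "L3": trace(a3) / 3,      # trace of the cubed matrix, divided by 3
        "ATD": 2 * e / n,         # each edge adds one in- and one out-degree
        "Density": 2 * e / (n * (n - 1)) if n > 1 else 0.0,
    }


# Invented toy word sequence: contains the cycles a->b->a and a->b->c->a.
print(graph_attributes("a b c a b a".split()))
```

Note that the trace formulas are applied exactly as stated in the text, so a self-loop also contributes to the traces of the squared and cubed matrices; whether the original software corrects for this is not specified in the source.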
Group classification. SGAs and/or psychometric scores were used as inputs to a
Naïve Bayes classifier41 implemented with Weka software42. A 10-fold cross-validation
procedure was implemented to take full advantage of the sample size.
Sensitivity, specificity and the area under the receiver operating characteristic curve
(AUC) were used as metrics of classification quality.
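The three quality metrics and the fold structure can be illustrated without reproducing Weka. The sketch below is a simplified stand-in under stated assumptions: it does not implement the Naïve Bayes classifier, the AUC is computed as the fraction of positive-negative pairs ranked correctly (ties count one half), the folds are contiguous rather than stratified or shuffled, and all scores and labels are invented toy values.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise ranking of positives vs negatives."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in positives for q in negatives)
    return wins / (len(positives) * len(negatives))


def sensitivity_specificity(predictions, labels):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(predictions, labels))
    tp = sum(1 for p, y in pairs if p == 1 and y == 1)
    fn = sum(1 for p, y in pairs if p == 0 and y == 1)
    tn = sum(1 for p, y in pairs if p == 0 and y == 0)
    fp = sum(1 for p, y in pairs if p == 1 and y == 0)
    return tp / (tp + fn), tn / (tn + fp)


def k_fold_indices(n_samples, k=10):
    """Split range(n_samples) into k contiguous test folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n_samples // k + (1 if i < n_samples % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


# Invented classifier scores for six hypothetical subjects (1 = patient).
scores = [0.9, 0.8, 0.4, 0.35, 0.7, 0.2]
labels = [1, 1, 1, 0, 0, 0]
predictions = [1 if s >= 0.5 else 0 for s in scores]

print("AUC:", auc(scores, labels))
print("sensitivity, specificity:", sensitivity_specificity(predictions, labels))
print("fold sizes for 40 samples:", [len(f) for f in k_fold_indices(40)])
```

In a cross-validated setting each subject is scored by a model trained on the other folds, and the metrics are then computed on the pooled held-out scores; that pooling step is omitted here for brevity.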
Psychometric scales. The "Positive and Negative Syndrome Scale" (PANSS)43 and
"Brief Psychiatric Rating Scale" (BPRS)44 were applied during the same clinical
interview from which dream and waking reports were obtained.
Report translation. Dream and waking reports in Portuguese were translated to
English, German, French, and Spanish using Google Translate.
1. Grinker, R. R. In retrospect: The five lives of the psychiatry manual. Nature 468,
168–170 (2010).
2. Kapur, S., Phillips, A. G. & Insel, T. R. Why has it taken so long for biological
psychiatry to develop clinical tests and what to do about it? Mol Psychiatry 17,
1174–1179 (2012).
3. Adam, D. Mental health: On the spectrum. Nature 496, 416–418 (2013).
4. Craddock, N. & Owen, M. J. The Kraepelinian dichotomy - going, going... but still
not gone. Br J Psychiatry 196, 92–95 (2010).
5. Williams, J. B. W. et al. The Structured Clinical Interview for DSM-III-R (SCID)
II. Multi-site test-retest reliability. Arch Gen Psychiatry 49, 630–636 (1992).
6. Insel, T. R. Rethinking schizophrenia. Nature 468, 187–193 (2010).
7. Kaplan, H. I. & Sadock, B. J. Kaplan & Sadock’s Comprehensive Textbook of
Psychiatry [Sadock, B. J., Sadock, V. A. & Ruiz, P. (ed.)] (Wolters Kluwer Health,
Lippincott Williams & Wilkins, Philadelphia, 2009).
8. Cabana, A., Valle-Lisboa, J. C., Elvevag, B. & Mizraji, E. Detecting order-disorder
transitions in discourse: Implications for schizophrenia. Schizophr Res 131,
157–164 (2011).
Figure 7 | SGA improve the psychopathological sorting provided by
psychometric scales. (a) Good to excellent classification of the groups was
obtained using the SGA that correlated significantly with specific
psychometric scores (for dream reports: E, LSC and LCC; for waking
reports: LCC). Excellent classification using psychometric scales (BPRS
and PANSS total scores) occurred only when sorting controls from other
groups, but failed to differentiate schizophrenic from bipolar subjects.
Optimal triple group classification was obtained by combining SGA and
psychometric scales. Data correspond to 30-word speech graphs. (b) The
SGA-based diagnosis of psychosis is invariant across the five most
prevalent Western languages.
9. Valle-Lisboa, J. C., Pomi, A., Cabana, A., Elvevag, B. & Mizraji, E. A modular
approach to language production: Models and facts. Cortex pii: S0010-9452(13)00042-7;
DOI: 10.1016/j.cortex.2013.02.005 (2013).
10. Mota, N. B. et al. Speech graphs provide a quantitative measure of thought
disorder in psychosis. PLoS One 7, e34928 (2012).
11. Freud, S. The Interpretation of Dreams. [Strachey, J. transl.] (Basic Books, 2010 ed.,
New York, 1900).
12. Bleuler, E. Dementia Praecox or the Group of Schizophrenias. [Zinkin, J. transl.]
(International Universities Press Inc., 1968 ed., Boston, 1911).
13. Kircher, T. T. et al. Neural correlates of formal thought disorder in schizophrenia:
preliminary findings from a functional magnetic resonance imaging study. Arch
Gen Psychiatry 58, 769–774 (2001).
14. Keshavan, M. S., Clementz, B. A., Pearlson, G. D., Sweeney, J. A. & Tamminga,
C. A. Reimagining psychoses: an agnostic approach to diagnosis. Schizophr Res
146, 10–16 (2013).
15. Palaniyappan, L. & Liddle, P. F. Diagnostic Discontinuity in Psychosis: A
Combined Study of Cortical Gyrification and Functional Connectivity. Schizophr
Bull; DOI: 10.1093/schbul/sbt050 (2013).
16. McGlashan, T. H. Early detection and intervention in schizophrenia: research.
Schizophr Bull 22, 327–345 (1996).
17. Brietzke, E. et al. Towards a multifactorial approach for prediction of bipolar
disorder in at risk populations. J Affect Disord 140, 82–91 (2012).
18. Fosse, M. J., Fosse, R., Hobson, J. A. & Stickgold, R. J. Dreaming and episodic
memory: a functional dissociation? J Cogn Neurosci 15, 1–9 (2003).
19. Fischer, S., Diekelmann, S. & Born, J. Sleep’s role in the processing of unwanted
memories. J Sleep Res 20, 267–274 (2010).
20. Dement, W. & Kleitman, N. Cyclic variations in EEG during sleep and their
relation to eye movements, body motility, and dreaming. Electroencephalogr Clin
Neurophysiol 9, 673–690 (1957).
21. Hobson, J. A. Arrest of firing of aminergic neurones during REM sleep:
implications for dream theory. Brain Res Bull 50, 333–334 (1999).
22. Gujar, N., McDonald, S. A., Nishida, M. & Walker, M. P. A role for REM sleep in
recalibrating the sensitivity of the human brain to specific emotions. Cereb Cortex
21, 115–123 (2011).
23. van der Helm, E. et al. REM sleep depotentiates amygdala activity to previous
emotional experiences. Curr Biol 21, 2029–2032 (2011).
24. Weiler, M. A., Buchsbaum, M. S., Gillin, J. C., Tafalla, R. & Bunney Jr, W. E.
Explorations in the relationship of dream sleep to schizophrenia using positron
emission tomography. Neuropsychobiology 23, 109–118 (1990).
25. Hobson, A. A model for madness? Nature 430, 21 (2004).
26. Gottesmann, C. The dreaming sleep stage: a new neurobiological model of
schizophrenia? Neuroscience 140, 1105–1115 (2006).
27. Scarone, S. et al. The dream as a model for psychosis: an experimental approach
using bizarreness as a cognitive marker. Schizophr Bull 34, 515–522 (2008).
28. Mason, O. & Wakerley, D. The psychotomimetic nature of dreams: an
experimental study. Schizophr Res Treatment 2012, 872307 (2012).
29. Hadjez, J. et al. Dream content of schizophrenic, nonschizophrenic mentally ill,
and community control adolescents. Adolescence 38, 331–342 (2003).
30. Horikawa, T., Tamaki, M., Miyawaki, Y. & Kamitani, Y. Neural decoding of visual
imagery during sleep. Science 340, 639–642 (2013).
31. Naselaris, T., Prenger, R. J., Kay, K. N., Oliver, M. & Gallant, J. L. Bayesian
reconstruction of natural images from human brain activity. Neuron 63, 902–915
(2009).
32. First, M. H., Spitzer, R. L., Gibbon, M. & Williams, J. Structured Clinical Interview
for DSM-IV Axis I Disorders -- Research Version, Patient Edition (SCID-I/P).
(Biometrics Research, New York State Psychiatric Institute, New York, 2002).
33. Daniel, W.W. Biostatistics: A Foundation for Analysis in the Health Sciences. [9th
ed.] (John Wiley & Sons, Hoboken, 2008).
34. McGrath, J., Saha, S., Chant, D. & Welham, J. Schizophrenia: a concise overview of
incidence, prevalence, and mortality. Epidemiol Rev 30, 67–76 (2008).
35. Merikangas, K. R. et al. Prevalence and correlates of bipolar spectrum disorder in
the world mental health survey initiative. Arch Gen Psychiatry 68, 241–251 (2011).
36. Mari, J. J. & Leitão, R. J. A epidemiologia da esquizofrenia. Rev Bras Psiquiatr 22,
15–17 (2000).
37. Almeida-Filho, N. et al. Brazilian multicentric study of psychiatric morbidity.
Methodological features and prevalence estimates. Br J Psychiatry 171, 524–529
(1997).
38. Bollobás, B. Modern Graph Theory. [103–144] (Springer-Verlag, New York,
1998).
39. Ferrer, I. C. R. & Sole, R. V. The small world of human language. Proc Biol Sci 268,
2261–2265 (2001).
40. Sigman, M. & Cecchi, G. A. Global organization of the Wordnet lexicon. Proc Natl
Acad Sci U S A 99, 1742–1747 (2002).
41. Kotsiantis, S. B. in Emerging artificial intelligence applications in computer
engineering: real word AI systems with applications. [Maglogiannis, I., Karpouzis,
K., Wallace, M. & Soldatos, J.] [3–24] (IOS Press, Amsterdam, 2007).
42. Hall, M. et al. The WEKA Data Mining Software: An Update. ACM SIGKDD
Explorations Newsletter 11, 10–18 (2009).
43. Kay, S. R., Fiszbein, A. & Opler, L. A. The positive and negative syndrome scale
(PANSS) for schizophrenia. Schizophr Bull 13, 261–276 (1987).
44. Bech, P., Kastrup, M. & Rafaelsen, O. J. Mini-compendium of rating scales for
states of anxiety depression mania schizophrenia with corresponding DSM-III
syndromes. Acta Psychiatr Scand Suppl 326, 1–37 (1986).
Acknowledgments
Funding was obtained from a Capes Fellowship to NM, grants CNPq Universal 481351/2011-6,
CNPq PQ 306604/2012-4, FAPERN/CNPq Pronem 003/2011, Capes SticAmSud,
and FAPESP/CEPID/Neuromat to S.R.; CNPq Universal 473554/2011-9 and 480053/2013-8,
CNPq PQ 308558/2011-1, FACEPE/CNPq-PRONEX APQ-0203-1.05/08,
FACEPE/CNPq-PRONEM APQ-1415-1.05/10, and CNAIPS to M.C. We thank the
Psychiatry Residency Program at Hospital Onofre Lopes (UFRN) and Hospital João
Machado for allowing access to independently diagnosed patients; G. Busatto, L.
Palaniyappan, D.F. Slezak, G. Cecchi, M. Sigman, S.J. de Souza and C. Queiroz for
discussions; N. Lemos, A.C. Pieretti, N. da C. Souza, and A.C. Resende for interview
transcriptions; N. Vasconcelos and A. de Macedo for help with data analysis; D. Koshiyama
for bibliographic support; G.M. da Silva and J. Cirne for IT support, and PPG/UFRN for
covering publication costs.
Author contributions
S.R., M.C. and N.M. designed the study; N.M. collected the data; N.M., S.R., M.C., R.F. and
P.P.C.M. analyzed data; R.F. and P.P.C.M. coded analysis software; N.M., S.R. and M.C.
prepared figures; S.R., M.C. and N.M. wrote the manuscript.
Additional information
Supplementary information accompanies this paper at http://www.nature.com/scientificreports
Competing financial interests: The authors declare no competing financial interests.
How to cite this article: Mota, N.B., Furtado, R., Maia, P.P.C., Copelli, M. & Ribeiro, S.
Graph analysis of dream reports is especially informative about psychosis. Sci. Rep. 4, 3691;
DOI:10.1038/srep03691 (2014).
This work is licensed under a Creative Commons Attribution 3.0 Unported license.
To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0
Supplementary Information 
 
 
 
 
Supplementary Figures

Fig. S1
Fig. S2

Supplementary Tables

Table S1
Table S2
Table S3
Table S4
Table S5

Supplementary Method
 
 
 
 
  
Supplementary Figures 
 
Figure S1: Semantic and grammatical properties of dream and waking reports 
produced by schizophrenic, bipolar and control subjects. (a) Word frequency (ratio 
of specific word occurrence over total word count) for the 40 most frequent words in 
dream and waking reports, which account for approximately 50% of the 19,625 words 
recorded in total. Red indicates words that are exclusive of either waking or dream 
reports (within the 40 most frequent words). Note that word repertoires between dream 
and waking reports overlap by 87.5%. (b) Relative word frequency of bipolar and 
schizophrenic subjects for the 40 most frequent words, excluding articles, conjunctions, 
prepositions, numbers and interjections. Control values were subtracted from 
schizophrenic and bipolar values. (c) Grammatical classification of self-loops; verbs are 
more prevalent in psychotic than in control subjects. (d) Grammatical classification of 
words that follow self-loops (exit words); verbs are more prevalent in bipolar than in 
control subjects. 
 
 
Figure S2: Linear correlation of SGA and individual questions of the psychometric 
scales. Plots correspond to the significant Spearman’s correlations in Figure 3. (a) 
Dream reports. (b) Waking reports. 
  
Supplementary Tables 
 
 
 
Table S1: Socio-demographic and psychiatric information about the groups 
investigated. Age (years), years of education, total score of PANSS and BPRS and 
frequency of sex, marital status and medication for the groups studied. Mean and 
standard deviation are indicated. All subjects were Brazilian. Control subjects were 
non-psychotic individuals with depression (N=5), generalized anxiety disorder (N=2), 
one past episode of post-traumatic stress disorder (N=1), various symptoms of 
mood/anxiety disorder without reaching diagnostic criteria (N=11), plus one healthy 
individual. 
 
 
  
 
Table S2: Individual SGA and psychometric data for dream reports (N=60). 
 
 
 
Table S3: Individual SGA and psychometric data for waking reports (N=60). 
 
 
Table S4: P values of non-parametric statistical analysis comparing SGA for raw 
data (full reports) and fixed WC data (graphs of 10, 20 and 30 words). P values 
using Kruskal-Wallis test on SxBxC (differences among groups tested together), 
considering P < 0.05 and Wilcoxon Ranksum test with Bonferroni correction (for 3 
pairwise comparisons, α=0.0167). Red indicates statistically significant differences. 
 
 
 
 
 
Table S5: Classification quality measured by AUC, Sensitivity and Specificity. A 
Naïve Bayes classifier was used to split the 3 groups (SxBxC), or separately sort SxB, 
SxC, and BxC, using all SGA as inputs. 
 
Supplementary Method 
 
Customized software for the graph analysis of text (SpeechGraphs)  
 
 
 
SpeechGraphs is a graph-theoretical analysis tool that uses text as input and graph 
features as output. This customized software plots graphs and calculates graph attributes 
with moving windows of fixed word length. The SpeechGraphs software was developed 
at the Brain Institute of the Federal University of Rio Grande do Norte (Natal, Brazil), 
by R. Furtado, P.P.C. Maia, N.B. Mota, S. Ribeiro, M. Copelli, and D.F. Slezak. The 
software and a complete user's guide can be directly downloaded from the website: 
http://neuro.ufrn.br/research/softwares/speechgraphs. 
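
As a rough illustration of the moving-window idea described above, the following Python sketch splits a transcript into windows of a fixed number of words and builds one word graph per window. It is not the SpeechGraphs implementation: the function names, the whitespace tokenizer and the `step` parameter are assumptions made only for this example.

```python
def word_windows(text, window, step):
    """Yield consecutive windows of `window` words, advancing by `step` words.

    Lowercasing and whitespace tokenization are assumptions of this sketch.
    """
    words = text.lower().split()
    last_start = max(len(words) - window, 0)
    for start in range(0, last_start + 1, step):
        yield words[start:start + window]


def window_graph(words):
    """Return (nodes, edges): unique words and directed links between neighbours."""
    nodes = set(words)
    edges = list(zip(words, words[1:]))  # one edge per consecutive word pair
    return nodes, edges


# Hypothetical report text, not taken from the study.
report = "i was in the house and the house was on fire"
graphs = [window_graph(w) for w in word_windows(report, window=5, step=3)]
```

Each element of `graphs` holds the node set and the ordered edge list of one window, so graph attributes can then be computed per window and averaged across windows if desired.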
 
 
ORIGINAL RESEARCH ARTICLE
published: 29 July 2014
doi: 10.3389/fnagi.2014.00185
Graph analysis of verbal fluency test discriminate between
patients with Alzheimer’s disease, mild cognitive
impairment and normal elderly controls
Laiss Bertola1*†, Natália B. Mota2†, Mauro Copelli 3, Thiago Rivero1, Breno Satler Diniz1,4,
Marco A. Romano-Silva4,5, Sidarta Ribeiro2 and Leandro F. Malloy-Diniz1,4
1 Laboratory of Clinical Neuroscience Investigations, Federal University of Minas Gerais, Belo Horizonte, Brazil
2 Brain Institute, Federal University of Rio Grande do Norte, Natal, Brazil
3 Physics Department, Federal University of Pernambuco, Recife, Brazil
4 Mental Health Department, Faculty of Medicine, Federal University of Minas Gerais, Belo Horizonte, Brazil
5 Faculty of Medicine, National Institute of Science and Technology – Molecular Medicine, Federal University of Minas Gerais, Belo Horizonte, Brazil
Edited by:
Manuel Menéndez-González,
Hospital Álvarez-Buylla, Spain
Reviewed by:
Roberta Brinton, University of
Southern California, USA
Douglas Watt, Quincy Medical
Center, USA
Mikhail Lebedev, Duke University,
USA
*Correspondence:
Laiss Bertola, Laboratory of Clinical
Neuroscience Investigations,
Faculty of Medicine, Federal
University of Minas Gerais, Av.
Alfredo Balena, 190 office 235, Belo
Horizonte, Minas Gerais,
CEP 30.130-100, Brazil
e-mail: laissbertola@gmail.com
†These authors contributed equally
to this study.
Verbal fluency is the ability to produce a satisfying sequence of spoken words during
a given time interval. The core of verbal fluency lies in the capacity to manage the
executive aspects of language. The standard scores of the semantic verbal fluency test are
broadly used in the neuropsychological assessment of the elderly, and different analytical
methods are likely to extract even more information from the data generated in this test.
Graph theory, a mathematical approach to analyze relations between items, represents a
promising tool to understand a variety of neuropsychological states. This study reports
a graph analysis of data generated by the semantic verbal fluency test by cognitively
healthy elderly (NC), patients with Mild Cognitive Impairment—subtypes amnestic (aMCI)
and amnestic multiple domain (a+mdMCI)—and patients with Alzheimer’s disease
(AD). Sequences of words were represented as a speech graph in which every word
corresponded to a node and temporal links between words were represented by directed
edges. To characterize the structure of the data we calculated 13 speech graph attributes
(SGA). The individuals were compared when divided in three (NC—MCI—AD) and four
(NC—aMCI—a+mdMCI—AD) groups. When the three groups were compared, significant
differences were found in the standard measure of correct words produced, and three
SGA: diameter, average shortest path, and network density. SGA sorted the elderly groups
with good specificity and sensitivity. When the four groups were compared, the groups
differed significantly in network density, except between the two MCI subtypes and NC
and aMCI. The diameter of the network and the average shortest path were significantly
different between the NC and AD, and between aMCI and AD. SGA sorted the elderly in
their groups with good specificity and sensitivity, performing better than the standard
score of the task. These findings provide support for a new methodological frame to
assess the strength of semantic memory through the verbal fluency task, with potential
to amplify the predictive power of this test. Graph analysis is likely to become clinically
relevant in neurology and psychiatry, and may be particularly useful for the differential
diagnosis of the elderly.
Keywords: semantic verbal fluency, graph analysis, elderly, Alzheimer’s disease, mild cognitive impairment
INTRODUCTION
Language and semantic memory tend to remain stable across
the human lifespan in contrast to other cognitive domains,
like episodic memory and attention, which usually decline
after the 5th decade (Craik and Bialystok, 2006). They are
also usually spared in the initial stages of neurodegenerative
disorders, such as Alzheimer’s disease (AD), though we can
still observe milder deficits, e.g., anomia or reduced seman-
tic verbal fluency, which can be identified in a comprehen-
sive neuropsychological evaluation (Henry et al., 2004; Garrard
et al., 2005; Nutter-Upham et al., 2008; Taler and Phillips,
2008).
Verbal fluency is the ability to produce a satisfying sequence
of spoken words during a given time interval (Nickles, 2001).
Verbal fluency tests are experimentally designed to assess this
ability through the production of words starting with a specific
letter (Phonemic Verbal Fluency) or belonging to a category of
knowledge (Semantic Verbal Fluency). Semantic verbal fluency is
one of the most commonly used tasks to evaluate language and
semantic memory skills in older adults. This task depends on
Frontiers in Aging Neuroscience www.frontiersin.org July 2014 | Volume 6 | Article 185 | 1
Bertola et al. Semantic verbal fluency graph analysis
the preservation of language (e.g., words can be spoken correctly
during the task), though it is significantly influenced by the
semantic memory (e.g., knowledge of the requested category must be
intact) and executive function (e.g., the ability to search for the
requested knowledge) domains (Adlam et al., 2006; Unsworth et al., 2011).
This task often activates the temporal lobe, a region broadly
related to conceptualization, general information and knowledge
about names (Patterson et al., 2007). Semantic verbal fluency helps
to predict future cognitive and functional impairments in
the elderly (Salmon et al., 2002; Amieva et al., 2005; Hodges et al.,
2006; Aretouli et al., 2011), and predicts the progression from MCI
to AD (Saxton et al., 2004).
Despite being widely used for neuropsychological assess-
ment in the elderly, the standard measure of the verbal flu-
ency test is restricted to the total of correct words produced
in the task (Lezak et al., 2004; Strauss et al., 2006), and does
not take into account other clinically-relevant information that
may be contained in the patient’s specific performance. This
task requires the production of words belonging to a specific
category, and each subject produces the words in a particular
order of exemplars during the 1-min task. This order of words
allows the construction of a network based on the
temporal links between the words. These temporal links may
indicate that words produced in a specific temporal sequence
are probably conceptually related, as suggested by semantic
association models (McClelland and Rogers, 2003; Griffiths et al.,
2007).
Goni et al. (2010) constructed a semantic network using
the verbal fluency task applied to an adult sample, and rep-
resented the semantic memory as a graph ruled by concep-
tual constraints. A normal semantic verbal fluency network
is represented by a directed graph with only one occurrence
for each word. Lerner et al. (2009) investigated the network
properties of subjects with MCI and AD, and found that
the path lengths of the network decline while the cluster-
ing coefficient increases in the MCI and AD subjects com-
pared to healthy elderly controls. These results showed that
the normal characteristics of the semantic verbal network are
significantly changed in the continuum from normal aging
to AD.
The analysis of network properties helps to understand the
dynamics and organization of cognitive and behavioral processes.
A graph represents a network with nodes linked by edges
(Mota et al., 2012). Formally, a graph is a mathematical representation
of a network $G = (N, E)$, with $N = \{w_1, w_2, \ldots, w_n\}$ a
set of nodes and $E = \{(w_i, w_j)\}$ a set of edges or links between
words $w_i$ in $N$ and $w_j$ in $N$. The interpretation of the meaning
of a graph depends on what is being represented (Butts, 2009;
Mota et al., 2012). We carried out an analysis of the network
properties of the semantic verbal fluency of subjects with MCI or
AD. We hypothesize that the analysis of the semantic verbal flu-
ency network properties can help to better discriminate between
older adults with normal cognitive performance, mild cognitive
impairment or Alzheimer’s disease. This approach has been used
FIGURE 1 | (A) Representation of the word sequence produced on the Semantic Verbal Fluency task. (B) Representations of networks generated by NC, MCI,
and AD subjects during the Semantic Verbal Fluency task.
with success to identify patients with schizophrenia and bipolar
disorder (Mota et al., 2012, 2014).
MATERIALS AND METHODS
SUBJECTS
One hundred older adults were included in this study. All sub-
jects were assessed in the Centro de Referência à Saúde do Idoso
Jenny de Andrade Faria, Clinical Hospital, Federal University of
Minas Gerais. All the participants underwent a comprehensive
clinical and neuropsychological assessment. The neuropsycho-
logical protocol included the following tests: Mini Mental State
Exam, Frontal Assessment Battery, Category Verbal Fluency of
Animals and Fruits, Letter Fluency of S, Digit Span, Stick Design
Test, Clock Drawing Test, Rey Auditory Verbal Learning Test,
Naming Test (TN-LIN), and Token Test. This protocol has been
validated for the neuropsychological assessment of older adults
with low educational status (de Paula et al., 2013). After the clinical
and neuropsychological assessment, an adjudication meeting
was held and the final diagnosis was reached by consensus. The
AD diagnosis was based on the criteria proposed by McKhann
et al. (1984): the patient had to present general and worsening
cognitive impairment in two or more cognitive domains,
and functional impairment in daily living activities. The MCI
diagnosis followed the criteria proposed by Winblad et al. (2004),
in which the older adult presents cognitive decline in one or more
cognitive domains but has preserved basic and instrumental
daily living activities, or presents only minimal impairment. The MCI
subgroup division used the amnestic MCI (aMCI) classification
for participants who presented only memory impairment,
and amnestic multiple-domain MCI (a+mdMCI) for participants
who presented impairment in memory and other cognitive
domains, while fulfilling all the MCI criteria established by
Winblad et al. (2004).
The project was approved by the Research Ethics Committee
of the Federal University of Minas Gerais (COEP-334/06). The
subjects were divided into four groups: (1) normal cogni-
tive performance (NC), n = 25; (2) amnestic single-domain
MCI (aMCI), n = 25; (3) amnestic multiple-domain MCI
(a+mdMCI), n = 25; (4) AD, n = 25.
VERBAL FLUENCY TEST
The participants performed the Semantic Verbal Fluency test,
category of animals, for which they were asked to produce as
many names of animals as possible within 60 s; explicit/implicit instructions
were given to avoid repetitions. All the words were recorded,
including repetitions and errors. The scoring procedure included:
total of words produced, total of correct words, total of errors,
total of repetitions, and the fraction of repetitions according to
the total of words produced by each participant. The scores in this
task were not taken into account in the diagnosis adjudication of
each participant.
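
The scoring procedure above can be sketched in a few lines of Python. This is only an illustration: the set of accepted animal names (`valid_items`), the rule that any word already spoken counts as a repetition, and all function and key names are assumptions, since the text does not specify how correctness was judged.

```python
def score_fluency(words, valid_items):
    """Score one verbal fluency response list.

    `valid_items` is a hypothetical set of accepted category members
    (e.g. animal names); the study's own correctness criteria may differ.
    """
    seen = set()
    correct = errors = repetitions = 0
    for word in words:
        if word in seen:
            repetitions += 1      # word already produced earlier in the list
        elif word in valid_items:
            correct += 1          # first occurrence of a valid category member
        else:
            errors += 1           # first occurrence of a word outside the category
        seen.add(word)
    total = len(words)
    return {
        "total": total,
        "correct": correct,
        "errors": errors,
        "repetitions": repetitions,
        "fraction_repetitions": repetitions / total if total else 0.0,
    }


# Hypothetical response, not data from the study.
animals = {"dog", "cat", "horse", "cow"}
scores = score_fluency(["dog", "cat", "dog", "table", "cow"], animals)
```

Under these assumptions the total number of words equals the sum of correct words, errors and repetitions.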
STATISTICAL ANALYSIS
The study design involved two stages of analysis, considering
three (NC, MCI, AD) or four groups (NC, aMCI, a+mdMCI,
AD), and the same statistical analysis and graph measures
were performed for comparing the three or four groups. The
MCI group comprised both the aMCI and the a+mdMCI
groups.
We performed the Shapiro-Wilk test of normality of the sam-
ple, and since the majority of the variables did not fit the assump-
tion of normality, we used the Kruskal-Wallis test of differences
between several independent groups and the Wilcoxon Rank sum
test for two independent samples. Bonferroni correction was
applied to all analyses.
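
A minimal sketch of the Bonferroni arithmetic follows, assuming a family-wise level of 0.05 (the base level is not stated explicitly in this section) and one comparison per pair of groups. The helper name `bonferroni_alpha` is hypothetical. Under these assumptions the corrected thresholds agree with the values 0.0167 (three groups) and 0.0083 (four groups) reported in this article.

```python
from itertools import combinations


def bonferroni_alpha(n_comparisons, alpha=0.05):
    """Per-comparison threshold: family-wise alpha divided by the number of tests."""
    return alpha / n_comparisons


three_groups = ["NC", "MCI", "AD"]
four_groups = ["NC", "aMCI", "a+mdMCI", "AD"]

# Every unordered pair of groups is one pairwise comparison.
pairs3 = list(combinations(three_groups, 2))
pairs4 = list(combinations(four_groups, 2))

alpha3 = bonferroni_alpha(len(pairs3))  # 0.05 / 3
alpha4 = bonferroni_alpha(len(pairs4))  # 0.05 / 6
```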
Group sorting was implemented with a Naïve Bayes classifier,
which shows superior performance with small samples (Singh
and Provan, 1995; Kotsiantis, 2007). The choice of attributes for
the classifier was based on significant correlations of the attributes
with established clinical measures of differential diagnosis (global
cognitive status and daily living functionality). Sensitivity, speci-
ficity and the area under the receiver operating characteristic
curve (AUC) were used to estimate classification quality, which
was considered excellent when AUC was higher than 0.8, good
when AUC ranged from 0.6 to 0.8, and poor (not above
chance) when AUC was smaller than 0.6.
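
The quality measures named above can be illustrated with a short Python sketch. It does not reproduce the Naïve Bayes classifier itself; it only shows one standard way to obtain AUC from classifier scores (the fraction of impaired/control pairs ranked correctly), how sensitivity and specificity follow from confusion counts, and how the AUC bands of this study map to labels. The function names and example scores are hypothetical.

```python
def auc(scores_pos, scores_neg):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


def quality(auc_value):
    """Label an AUC with the bands used in this study."""
    if auc_value > 0.8:
        return "excellent"
    if auc_value >= 0.6:
        return "good"
    return "poor"


# Hypothetical classifier scores (higher = more likely impaired); not study data.
impaired = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.2]
example_auc = auc(impaired, controls)
```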
GRAPH MEASURES
The word sequence produced on the Semantic Verbal Fluency
test was represented as a speech graph, using the software
SpeechGraphs (Mota et al., 2014). The program represents a text
(in this case, the sequence of words produced by the verbal flu-
ency test) as a graph, representing every word as a node, and the
temporal link between words as an edge (Figure 1).
We then calculated word count (WC) and 13 additional
Speech Graph Attributes (SGA) comprising general attributes:
total of nodes (N) and edges (E); connected components:
the largest strongly connected component (LSC); recurrence
attributes: repeated (RE) and parallel edges (PE), cycles of one
(L1), two (L2), or 3 nodes (L3); global attributes: average total
degree (ATD), density, diameter, average shortest path (ASP) and
clustering coefficient (CC) (for more detailed information see
Supplementary Table and Figure on Supplementary Material).
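
To make some of these attributes concrete, the Python sketch below computes WC, N, E, L1, ATD and density from an ordered word list. The exact definitions used by SpeechGraphs are given in the Supplementary Material and are not reproduced here; the formulas for ATD (taken as 2E/N) and density (taken as E/(N(N-1)) for a directed graph) are assumptions of this example, as are the function and variable names.

```python
def basic_attributes(words):
    """Compute a few graph attributes from an ordered word list.

    Every transition between consecutive words is counted as one directed edge.
    The ATD and Density formulas are assumptions of this sketch.
    """
    transitions = list(zip(words, words[1:]))
    n = len(set(words))   # nodes: unique words
    e = len(transitions)  # edges: one per consecutive word pair
    return {
        "WC": len(words),
        "N": n,
        "E": e,
        "L1": sum(1 for a, b in transitions if a == b),  # cycles of one node
        "ATD": 2 * e / n if n else 0.0,                   # assumed: 2E / N
        "Density": e / (n * (n - 1)) if n > 1 else 0.0,   # assumed: E / (N(N-1))
    }


# Hypothetical repetition-free response of 14 animal names.
response = ["dog", "cat", "horse", "cow", "pig", "hen", "duck",
            "goat", "sheep", "lion", "tiger", "bear", "wolf", "fox"]
attrs = basic_attributes(response)
```

For this repetition-free list of 14 words, the assumed formulas give an ATD of about 1.86 and a density of about 0.07.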
Given the task instructions, we expected the subjects to pro-
duce a linear network, i.e., a sequence in which each correct word
Table 1 | Socio-demographic data, verbal fluency and Speech Graph
Attributes of NC, MCI, and AD groups, with Bonferroni-corrected
significant differences across groups established by the
Kruskal-Wallis comparison.

            NC                    MCI                   AD
            Median  Q1    Q3      Median  Q1    Q3      Median  Q1    Q3      p
Age         76      72    80      76      71    81      78      67    81      0.9785
Education   4       3     4       4       2     4       4       3     4       0.8400
Katz        0       0     0       −       −     −       0       0     0       0.0105
Lawton      0       0     0       0       0     1       6       4     8       0.0000
MMSE        27      24    29      25      23    27      20      17    23      0.0000

Katz, Katz Index; Lawton, Lawton Index; MMSE, Mini Mental State Exam;
IQR, Interquartile Range (Q1 to Q3); Q1, 1st Quartile; Q3, 3rd Quartile. Red values have
significance p < 0.0167.
was followed by a different correct word, without repetitions. A
correct performance in this test should yield graphs with identical
number of nodes (N) and words (WC), N-1 edges, no recurrence
(i.e., without parallel edges, repeated edges or loops), and zero
strongly connected components (LSC). In addition, the average
total degree (ATD) should be close to 2, with a very small density,
very low clustering coefficient (CC), and large distances (diameter
should be equal to E).
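
The expectations for a repetition-free response can be checked with a small Python sketch. It treats the word list as a directed graph and measures shortest paths by breadth-first search over reachable ordered pairs of nodes; both choices, and the function names, are assumptions of this example rather than a description of SpeechGraphs.

```python
from collections import deque


def shortest_path_lengths(words):
    """BFS distances between all reachable ordered node pairs of the word graph."""
    successors = {}
    for a, b in zip(words, words[1:]):
        successors.setdefault(a, set()).add(b)
    lengths = []
    for source in set(words):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in successors.get(node, ()):
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        lengths.extend(d for target, d in dist.items() if target != source)
    return lengths


def is_linear(words):
    """True when no word is repeated, so the graph is a simple chain."""
    return len(set(words)) == len(words)


# Hypothetical repetition-free response of five words.
fluent = ["dog", "cat", "horse", "cow", "pig"]
lengths = shortest_path_lengths(fluent)
n_edges = len(fluent) - 1
diameter = max(lengths)            # longest shortest path
asp = sum(lengths) / len(lengths)  # average shortest path over reachable pairs
```

For this five-word chain the diameter equals the number of edges (4), in line with the expectation stated above, and a list with a repeated word fails the `is_linear` check.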
RESULTS
Table 1 shows socio-demographic data, Mini Mental
State Exam (MMSE), total number of produced words in the
Table 2 | Verbal fluency and Speech Graph Attributes of NC, MCI, and AD groups, with Bonferroni-corrected significant differences across
groups established by the Kruskal-Wallis comparison.

            NC                    MCI                   AD
            Median  Q1    Q3      Median  Q1    Q3      Median  Q1    Q3      p
VF.E        0       0     0       0       0     0       0       0     0       1.0000
VF.PR       0       0     0.07    0       0     0.13    0       0     0.1     0.2330
VF.R        0       0     1       0       0     1       0       0     1       0.4462
VF.C        14      12    15      11      10    14      9       7     10      0.0000
VF.TT       15      13    15      12      10    15      9       8     10      0.0000
WC          15      13    15      12      10    15      9       8     10      0.0000
N           14      13    15      11      10    14      9       7     10      0.0000
E           14      12    14      11      9     14      8       7     9       0.0000
RE          −       −     −       0       0     0       0       0     0       0.6034
PE          0       0     0       0       0     0       0       0     0       0.6591
L1          −       −     −       0       0     0       −       −     −       0.6065
L2          0       0     0       0       0     0       0       0     0       0.6942
L3          0       0     0       0       0     0       0       0     0       0.0265
LSC         1       1     7       1       1     6       1       1     4       0.7568
ATD         1.86    1.85  2.00    1.87    1.82  2.00    1.80    1.75  2.00    0.2584
Diameter    12      9.00  13.00   9       6.00  12.00   7       5.00  8.00    0.0001
ASP         4.66    3.67  5.20    3.66    2.91  4.67    3       2.29  3.33    0.0001
CC          0       0.00  0.00    0       0.00  0.00    0       0.00  0.00    0.2479
Density     0.07    0.06  0.08    0.08    0.07  0.10    0.10    0.10  0.14    0.0000

VF.E, errors; VF.PR, percentage of repetitions; VF.R, repetitions; VF.C, correct words; VF.TT, total of words; WC, word count; N, nodes; E, edges; RE, repeated edges; PE,
parallel edges; L1, L2, L3, cycles of one, two or 3 nodes; LSC, largest strongly connected component; ATD, average total degree; ASP, average shortest path; CC,
clustering coefficient. Red values have significance p < 0.0167.
Table 3 | Pairwise group comparison with Bonferroni-corrected significant differences between groups established by Wilcoxon Ranksum test.

            NC × MCI                    NC × AD                     MCI × AD
            W        z        p         W        z        p         W        z        p
Katz        1875.0   1.385    0.1658    598.5    −1.388   0.1651    1800.0   2.815    0.0049
Lawton      707.0    −3.396   0.0007    325.0    −6.254   0.0000    1327.5   6.358    0.0000
MMSE        1712.5   2.114    0.0345    394.0    4.732    0.0000    558.5    −4.412   0.0000
VF.C        1626.0   3.088    0.0020    396.5    4.776    0.0000    658.5    −3.427   0.0006
VF.TT       1689.5   2.375    0.0175    410.5    4.513    0.0000    656.5    −3.432   0.0006
WC          1689.5   2.375    0.0175    410.5    4.513    0.0000    656.5    −3.432   0.0006
N           1626.0   3.091    0.0020    394.0    4.824    0.0000    640.5    −3.629   0.0003
E           1689.5   2.375    0.0175    410.5    4.513    0.0000    656.5    −3.432   0.0006
L3          1862.5   1.225    0.2205    600.0    −1.137   0.2552    1787.5   2.613    0.0090
Diameter    1686.5   2.405    0.0161    425.5    4.238    0.0000    720.0    −2.783   0.0054
ASP         1667.0   2.618    0.0088    414.5    4.433    0.0000    720.0    −2.773   0.0055
Density     653.0    −3.338   0.0008    388.5    −4.924   0.0000    1596.5   3.558    0.0004

W, Wilcoxon Ranksum statistic; z, z score. Red values have significance p < 0.0167.
verbal fluency test, total number of correct words produced, total
number of repetitions performed during the task, the percent-
age of repetitions performed according to the total of produced
words, and the errors produced.
The groups did not differ in age and education, and only the
control group had a significant difference in gender distribution
(χ² = 6.76, df = 2, p = 0.009) (Table 1). The results of the
group comparisons on daily living activities and global cognitive
status are also reported in Table 1. Verbal fluency measures
and the Speech Graph Attributes are reported in Table 2.
Although the NC group produced a relatively low number of correct
words, this number is similar to that observed in Brazilian normative
data (Brucki et al., 1997). Moreover, the scores on the verbal fluency
test were not taken into account for participant classification
into the diagnostic groups.
The groups differed significantly in performance on ADLs,
in general cognitive status, number of correct words and total
words produced, and in the Speech Graph measures of word
count, nodes, edges, loops of 3 nodes, diameter, average shortest
path and density. As expected, the NC group performed better at
ADLs, had higher scores on the MMSE, and produced more nodes and a
less dense network with larger diameter when compared with
the MCI and AD groups. The MCI group showed an intermediate
performance between NC and AD groups in all measures.
Table 3 and Figure 2A show pairwise comparisons of the 3
diagnosis groups. Statistical significance was set at p < 0.0167,
after Bonferroni correction for multiple comparisons.
The comparison of the variables between NC and MCI groups
demonstrates that the groups differ in the index of instrumental
daily living activities, in the number of correct words produced,
FIGURE 2 | Speech Graph Attributes (SGA) differentiate
psychopathological groups. (A) SGA boxplots with significant differences
among Alzheimer’s disease (AD), Mild Cognitive Impairment (MCI) and
control groups (N = 25 in AD and control groups, N = 50 in MCI group;
Kruskal-Wallis test followed by two-sided Wilcoxon Rank-sum test with
Bonferroni correction with alpha = 0.0167). (B) Percentage of subjects in
each group that made one L3 on the verbal fluency test. AD subjects showed
more L3 than MCI subjects (Wilcoxon Rank-sum test with Bonferroni
correction with alpha = 0.0167, p = 0.0090). (C) Rating quality measured by
AUC, sensitivity and specificity, using MMSE or SGA correlated with clinical
symptoms measured with MMSE and Lawton scales (Table 3) (attributes:
WC, N, E, Density, Diameter, and ASP). Notice that SGA was more specific
than MMSE on triple group sorting, and on MCI diagnosis against the control
group. ∗p < 0.0167.
number of nodes, diameter, average shortest path and density of
the network. The NC group produced less dense graphs with more
nodes, and larger Diameter and ASP than the MCI and AD groups.
Furthermore, the NC group produced more edges and total words, and
had a better general cognitive status than the AD group. The
MCI and AD groups differ in all measures, demonstrating that
general cognitive status, functionality, verbal
fluency measures and the speech graph attributes (WC, N, E,
L3, Diameter, ASP, and Density) (Figure 2B) change almost
continuously as the diagnosis worsens.
Table 4 shows the Spearman correlations between the SGA
and the clinical measures of differential diagnosis (global
cognitive status, measured by the MMSE, and daily living
functionality, measured by the Katz and Lawton Indexes).
The significance level was set at
p < 0.0012 after a Bonferroni correction for 42 comparisons.
We found significant correlations between the MMSE and the
SGA of Nodes and Density, indicating that the more cognitively
preserved elderly produced a larger number of unique nodes,
and networks with a smaller density, than cognitively impaired
subjects. The correlation between the attributes and the Lawton
Index of instrumental daily living activities revealed that the more
functionally dependent the elderly were, the fewer words, nodes and
edges they produced, showing networks with a smaller diameter
and average shortest path, but a higher density. These results indicate
that functional autonomy correlates more strongly with SGA than
general cognitive status does.
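
For readers unfamiliar with the statistic, the sketch below shows one way to compute a Spearman correlation in plain Python: both variables are converted to ranks (tied values receive the mean of their positions) and the Pearson correlation of the ranks is taken. The function names and the five data points are hypothetical and are not taken from the study. The last line assumes a family-wise level of 0.05 divided by the 42 comparisons mentioned above (14 attributes by 3 scales in Table 4).

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical values for five participants; not data from the study.
nodes = [14, 12, 11, 9, 8]
lawton = [0, 0, 1, 4, 6]
rho = spearman_rho(nodes, lawton)

alpha = 0.05 / 42  # assumed family-wise level of 0.05 over 42 comparisons
```

In this toy example fewer nodes go with higher Lawton scores, so `rho` is strongly negative, the same direction as the N and Lawton correlation reported in Table 4.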
The Naïve Bayes classifier results (Figure 2C) show that a selection
of SGA correlated with functional and cognitive impairment
measured by other instruments provided good to excellent classification
power, similar to the MMSE classification power, or
even better for the distinction between the NC and MCI groups.
When the SGA were combined with the Lawton Index or the MMSE,
the power of classification increased; a combination of the 3
measurements provided maximal classification quality (Table 5).
Overall, the combination of graph measures and functional
Table 4 | Spearman correlation (RHO and p-values) between SGA
scores and the Katz, Lawton or MMSE scores.

            Katz                Lawton              MMSE
            RHO       p         RHO       p         RHO       p
WC          −0.1762   0.0796    −0.4519   0.0000    0.3161    0.0014
N           −0.1811   0.0714    −0.4963   0.0000    0.3335    0.0007
E           −0.1762   0.0796    −0.4519   0.0000    0.3161    0.0014
RE          0.3014    0.0023    0.0698    0.4900    −0.0050   0.9609
PE          0.1031    0.3075    −0.0239   0.8137    0.0958    0.3432
L1          −0.0230   0.8199    −0.0888   0.3797    −0.0943   0.3505
L2          −0.0579   0.5670    −0.0673   0.5062    0.1116    0.2690
L3          0.1048    0.2993    0.2737    0.0059    −0.1557   0.1219
LSC         −0.0349   0.7305    −0.0545   0.5905    0.1171    0.2458
ATD         −0.0352   0.7279    −0.1220   0.2267    0.1611    0.1094
Diameter    −0.1339   0.1842    −0.3897   0.0001    0.2433    0.0147
ASP         −0.1480   0.1418    −0.4017   0.0000    0.2549    0.0105
CC          0.0692    0.4941    0.1786    0.0755    −0.1379   0.1713
Density     0.1766    0.0788    0.4933    0.0000    −0.3239   0.0010

Red values have significance p < 0.0012.
dependence yielded very accurate differential classification of the
AD (1.00) and MCI (0.78) groups against the NC group, and between the
MCI and AD groups (0.84).
The additional description of the socio-demographic data,
Mini Mental State Exam (MMSE) and verbal fluency measures of
the two subgroups of MCI is reported in Table 6, together with the
results of the four-group comparison on the sociodemographic
variables. Table 7 shows the four-group comparison on the verbal
fluency and Speech Graph Attributes.
A comparison of the four groups showed significant differ-
ences in daily functionality, general cognitive status, total and
correct words produced, and in the SGA word count, nodes,
edges, diameter, ASP and density (same attributes found in the
three-group comparison).
Table 8 and Figure 3A compare the four groups of
elderly, with Bonferroni correction for multiple comparisons
(alpha = 0.0083).
The pairwise comparison detected no significant differences
between MCI subtypes in the measures selected in this study. The
difference between the NC and aMCI groups occurred only in
instrumental daily living functionality, i.e., NC are more inde-
pendent than aMCI. The significant differences between the NC
and AD and between aMCI and AD are similar; the NC and
aMCI groups are less functionally dependent, have better cogni-
tive status, produce more total and correct words, a higher word
count, more nodes and edges, higher Diameter and ASP, and less
dense networks when compared to the AD group. The NC are
more functionally independent, produce more total and correct
words, a higher word count, more nodes and edges, and a network
less dense than the a+mdMCI group. AD subjects, comparable to
the a+mdMCI group, were more functionally dependent, showed
general cognitive impairment, and produced fewer nodes and a
denser network.
The Naïve Bayes classifier results (Figure 3B) indicate that the
selected SGA have good classification power for the diagnosis of
MCI subtypes against cognitively healthy aging, and also good
classification power against the dementia group. On the other hand, SGA
yielded a poor classification when used to distinguish between
the two subtypes of MCI. When SGA were combined with the
Lawton Index, we observed an increase in the power of clas-
sification across the four groups, except between the two MCI
subtypes.
The combination of the SGA with the MMSE showed less
power when compared to the combination with the Lawton
index; the combination of these three variables barely improved
the classification beyond the SGA and Lawton combination.
These results indicate that the combination of graph measures
and functional dependence again provides for good classification
across the three groups (AUC = 0.71–0.85), except between the
MCI subtypes (AUC = 0.47).
DISCUSSION
The aim of the present study was to assess graph-theoretical dif-
ferences in the execution of a verbal fluency task among elderly
with normal and pathological aging. Our results demonstrate that
SGA differed significantly among the AD, MCI, and NC groups
and could be used to classify the groups. The present results
Table 5 | Rating quality measured by AUC, using SGA (attributes: WC, N, E, Density, Diameter, and ASP) correlated with clinical symptoms
measured with MMSE and Lawton scales, alone or in combination with Lawton, MMSE or both, classifying AD and MCI from NC, AD from MCI, and also
classifying subtypes of MCI (aMCI or a+mdMCI) from NC or AD, or from each other.

                   SGA     SGA+MMSE   SGA+Lawton   SGA+MMSE+Lawton   MMSE    Lawton
NC x MCI           0.681   0.716      0.780        0.793             0.638   0.649
NC x aMCI          0.619   0.618      0.710        0.714             0.586   0.612
NC x a+mdMCI       0.710   0.738      0.803        0.822             0.694   0.746
NC x AD            0.875   0.886      1.000        1.000             0.888   1.000
aMCI x a+mdMCI     0.486   0.470      0.472        0.483             0.631   0.494
aMCI x AD          0.767   0.824      0.856        0.856             0.854   0.957
a+mdMCI x AD       0.652   0.717      0.814        0.811             0.772   0.959
MCI x AD           0.727   0.793      0.849        0.858             0.813   0.958
Table 6 | Additional description of socio-demographic data for the
MCI subtypes, and the four groups comparison.

            aMCI                  a+mdMCI
            Median  Q1    Q3      Median  Q1    Q3      p*
Age         75      71    79      79      73    81      0.7561
Education   4       2     5       3       2     4       0.4662
Katz        0       −     −       0       −     −       0.0279
Lawton      0       0     1       1       0     2       0.0000
MMSE        26      23    28      24      23    26      0.0000

p*, group comparison (NC; aMCI; a+mdMCI; AD). Red values have significance
p < 0.0083.
show the potential of graph analysis of verbal fluency task to
discriminate between these groups in clinical practice.
The correlations between the SGA and the MMSE or the
Lawton Index indicate that the SGA are associated with
general cognitive status and functional performance, two important
clinical measures used in geriatric assessment. Patients
with worse scores in the MMSE produced fewer nodes
and a less dense network. As functional performance
decreased, indicating more severe stages of cognitive impairment,
the networks became denser, with a smaller diameter and average
shortest path and with fewer nodes and edges. These
networks contained fewer words, with a shorter
path from the first word to the last, and their animal words
had more connections to different neighbors than
necessary. More cognitively impaired subjects tended to be
more dependent in their daily activities. Importantly, some
attributes of SGA could indicate the progression of cognitive
impairment and functional decline, as shown by denser and
smaller networks, with fewer nodes, in subjects with
more severe cognitive impairment.
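To make the attributes discussed above concrete, the following minimal sketch builds a directed graph from a word list (each word is a node, each consecutive pair of words is an edge) and computes WC, N, E, Density, Diameter, and ASP. It is an illustration under stated assumptions, not the software used in the study: the function name `speech_graph_attributes` is hypothetical, density is assumed to be E / (N * (N - 1)), and Diameter and ASP are assumed to be taken over reachable ordered pairs of nodes. The two word lists are invented examples.

```python
from collections import deque


def speech_graph_attributes(words):
    """Global attributes of a word-trajectory graph (illustrative sketch)."""
    nodes = list(dict.fromkeys(words))       # unique words, order preserved
    edge_list = list(zip(words, words[1:]))  # one edge per transition
    successors = {w: set() for w in nodes}
    for a, b in edge_list:
        successors[a].add(b)

    # Breadth-first search from every node yields directed shortest paths.
    lengths = []
    for start in nodes:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in successors[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        lengths.extend(d for node, d in dist.items() if node != start)

    n, e = len(nodes), len(edge_list)
    return {
        "WC": len(words),
        "N": n,
        "E": e,
        # Assumed density definition: edges over possible directed edges.
        "Density": e / (n * (n - 1)) if n > 1 else 0.0,
        "Diameter": max(lengths, default=0),
        "ASP": sum(lengths) / len(lengths) if lengths else 0.0,
    }


# Invented examples: a list without repetition and one with a repetition.
fluent = speech_graph_attributes(["dog", "cat", "horse", "cow", "pig"])
repeated = speech_graph_attributes(["dog", "cat", "horse", "dog", "cow"])
print(fluent)
print(repeated)
```

In this toy comparison the list without repetition yields a diameter of 4 and an ASP of 2.0, whereas the list with a repeated word yields a diameter of 3, an ASP of about 1.67, and a higher density, which is the direction of change described in the paragraph above.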
Application of speech graph analysis for sorting the groups
showed moderate to good classification quality. When selected
SGA were combined with the Lawton Index, better classification
was obtained, suggesting that the combination of these two
simple tools of network measure and functionality can provide
the clinician with a good indication of differential diagnosis, except
for the contrast between the two MCI subtypes, which spanned a
continuum and did not allow the differentiation and classification
of the two groups.
Table 7 | Additional description of verbal fluency and Speech Graph
Attributes for the MCI subtypes, and the four groups comparison.
            aMCI                    a+mdMCI
            Median  IQR Q1  IQR Q3  Median  IQR Q1  IQR Q3  p
VF.E        0       0       0       0       0       0       1.0000
VF.PR       0.083   0       0.125   0       0       0.1     0.1300
VF.R        1       0       2       0       0       1       0.2084
VF.C        11      10      14      11      9       13      0.0000
VF.TT       13      11      16      11      10      14      0.0000
WC          13      11      16      11      10      14      0.0000
N           11      11      14      11      9       13      0.0000
E           12      10      15      10      9       13      0.0000
RE          0       0       0       0       −       −       0.5682
PE          0       0       0       0       0       0       0.7670
L1          0       −       −       0       0       0       0.3916
L2          0       0       0       0       0       0       0.8658
L3          0       0       0       0       −       −       0.0567
LSC         4       1       7       1       1       5       0.5115
ATD         2       1.810   2.095   1.857   1.8     2       0.1998
Diameter    9       8       12      9       6       11      0.0003
ASP         3.666   3.309   4.666   3.666   2.666   4.333   0.0002
CC          0       0       0       0       0       0       0.3936
Density     0.082   0.071   0.090   0.9     0.075   0.109   0.0000
Red values have significance p = 0.0083.
The differences prevalent across all groups were in the global
attributes of diameter, density and average shortest path (ASP).
The results indicate that the networks built by the normal control
elderly were more direct, without recurrence of words, resulting
in a less dense network. Conversely, cognitive impairment
corresponded to denser and less direct networks. The density
differences across the groups were, among all comparisons, the
most uniform result, except for the comparison between the
two MCI subgroups, which yielded a pattern of continuous
performance. The progressive worsening of cognitive performance
within the MCI subtypes is consistent with the literature, indicating
that a group of subtle deficits underlies the differential diagnosis
(Diniz et al., 2007; Radanovic et al., 2009).
Table 8 | Pairwise group comparison in the variables with significant difference across the four groups.
          NC × aMCI              NC × a+mdMCI           NC × AD
          W      z       p       W      z       p       W      z       p
Katz      625    −0.960  0.3371  625    −0.960  0.3371  598.5  −1.388  0.1651
Lawton    548.0  2.581   0.0099  484    3.758   0.0002  325    −6.254  0.0000
MMSE      583.5  −1.046  0.2956  504    −2.597  0.0094  394    4.733   0.0000
VF.C      525    −2.186  0.0288  476    −3.140  0.0017  396.5  4.777   0.0000
VF.TT     563.5  −1.440  0.1498  501    −2.661  0.0078  410.5  4.514   0.0000
WC        563.5  −1.440  0.1498  501    −2.661  0.0078  410.5  4.514   0.0000
N         526    −2.170  0.0300  475    −3.159  0.0016  394    4.824   0.0000
E         563.5  −1.440  0.1498  501    −2.661  0.0078  410.5  4.514   0.0000
L3        625.0  −0.566  0.5714  612.5  −1.400  0.1614  600    −1.138  0.2552
Diameter  546    −1.778  0.0753  515.5  −2.367  0.0179  425.5  4.239   0.0000
ASP       534.5  −1.994  0.0462  507.5  −2.518  0.0118  414.5  4.433   0.0000
Density   510.5  2.460   0.0139  467.5  3.296   0.0010  388.5  −4.924  0.0000
          aMCI × a+mdMCI         aMCI × AD              a+mdMCI × AD
          W      z       p       W      z       p       W      z       p
Katz      637.5  NaN     NaN     587.5  −2.001  0.0454  587.5  −2.001  0.0454
Lawton    577.5  −1.294  0.1958  352    −5.423  0.0000  350.5  −5.335  0.0000
MMSE      545    1.797   0.0723  416    4.301   0.0000  467.5  3.284   0.0010
VF.C      582    1.074   0.2828  471.5  3.334   0.0009  512    2.561   0.0105
VF.TT     573.5  1.238   0.2157  466.5  3.418   0.0006  515    2.486   0.0129
WC        573.5  1.238   0.2157  466.5  3.418   0.0006  515    2.486   0.0129
N         574.5  1.223   0.2213  460.5  3.551   0.0004  505    2.692   0.0071
E         573.5  1.238   0.2157  466.5  3.418   0.0006  515    2.486   0.0129
L3        625    0.960   0.3371  587.5  −1.654  0.0981  575    −2.268  0.0233
Diameter  600.5  0.712   0.4764  495.5  2.901   0.0037  549.5  1.884   0.0595
ASP       602.5  0.671   0.5023  502.5  2.755   0.0059  542.5  2.010   0.0444
Density   597    −0.778  0.4367  467    −3.415  0.0006  504.5  −2.700  0.0069
Red values have significance p = 0.0083.
Even the groups that did not differ in the total number of word
repetitions differed in the occurrence of loops of three nodes (L3).
Nearly all subjects, as expected, managed to avoid recurrences,
but 20% of the AD subjects repeated the same word after an interval
of only two words (e.g., dog-cat-horse-dog). According to
Huntley and Howard (2010), subjects with AD already have working
memory deficits at the earliest stages of the disease. The
impairment in the central executive and episodic buffer functions of
working memory probably stems from the difficulty of keeping
information in mind while continuing to search for new information.
These deficits probably explain the repetition of words at
very short intervals in verbal fluency tasks.
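The dog-cat-horse-dog pattern described above can be detected directly in the word sequence. The sketch below is a sequence-level proxy for L3, not the definition used by the analysis software: it counts positions where a word recurs exactly three positions later with two other distinct words in between. The function name `count_l3_repetitions` is hypothetical, and a graph-based count of three-node cycles could differ when cycles are closed by non-consecutive transitions.

```python
def count_l3_repetitions(words):
    """Count returns to the same word after exactly two other distinct words."""
    count = 0
    for i in range(len(words) - 3):
        window = words[i:i + 3]
        # The word at position i recurs at i + 3, and the three words
        # in the window are all different (so it is not a shorter loop).
        if words[i + 3] == words[i] and len(set(window)) == 3:
            count += 1
    return count


print(count_l3_repetitions(["dog", "cat", "horse", "dog"]))
print(count_l3_repetitions(["dog", "cat", "horse", "cow"]))
```

The first call reports one loop of three nodes and the second reports none.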
The results outline a field that needs to be further explored
in future studies, involving the density of the networks and the
strength of the links between the words in the semantic memory of elderly
individuals with pathological aging. The Parallel Distributed Processing
Approach to Semantic Cognition predicts that the decrease in
strength of the links between words in a semantic network
may allow connections between pairs of words that would not
be preferential under normal circumstances (McClelland and
Rogers, 2003). Another aspect that deserves further investigation
is the absence of difference across the groups in the connectiv-
ity attributes (LSC, ATD, and CC). This raises the hypothesis that
even very different networks can share a similar structure of local
connections, in which a small portion of the words are highly con-
nected with other less connected words, maintaining the integrity
of the network’s general connection (Bales and Johnson, 2006; De
Deyne and Storms, 2008).
Because the graph analysis performed in this study was built
on the co-occurrence of words and on the temporal link
between them, future studies should consider multidimensional
scaling and hierarchical clustering analysis. These types of
analyses would represent the relations between the variables and combine
them into groups, enhancing the results. Future studies should also
address the differences between MCI patients and other
neurological conditions in which cognitive impairments are quite
similar, for example, Temporal Lobe Epilepsy (Holler and Trinka,
2014), as well as the potential association between graph
analysis, neuroimaging and other diagnostic instruments. Furthermore,
longitudinal studies are also necessary to evaluate whether SGA
can help to identify MCI subjects with higher risk of progressing
to Alzheimer’s disease. In conclusion, the results suggest that SGA
may be a useful tool to help in the differential diagnosis between
MCI and AD.
FIGURE 3 | Speech Graph Attributes (SGA) differentiate
psychopathological MCI subgroups. (A) SGA boxplots with significant
differences among Alzheimer’s disease (AD), Amnestic Mild Cognitive
Impairment (aMCI), Multiple Domain Mild Cognitive Impairment
(a+mdMCI), and control groups indicated (N = 25 per group; Kruskal-Wallis
test followed by two-sided Wilcoxon Rank-sum test with Bonferroni
correction with alpha = 0.0083). (B) Rating quality measured by AUC,
sensitivity and specificity, using SGA correlated with clinical symptoms
measured with MMSE and Lawton scales (Table 4) (attributes: WC, N, E,
Density, Diameter, and ASP). Notice that it is possible to sort the MCI
subgroups from the NC or AD groups, but not one from another.
Classification quality was considered excellent when AUC was higher than
0.8, good when AUC ranged from 0.6 to 0.8, and poor (not above
chance) when AUC was smaller than 0.6. ∗p = 0.0083.
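The significance threshold of 0.0083 quoted in the table footnotes and in the Figure 3 caption is consistent with a Bonferroni correction over all pairwise comparisons among the four groups. The following sketch reproduces that arithmetic; the family-wise alpha of 0.05 is an assumption (it is not stated in this section), and the helper name `is_significant` is hypothetical. The two p-values checked are the Density entries for NC × a+mdMCI and NC × aMCI in Table 8.

```python
from itertools import combinations

groups = ["NC", "aMCI", "a+mdMCI", "AD"]
pairs = list(combinations(groups, 2))  # all unordered group pairs

alpha_family = 0.05  # assumed family-wise error rate
alpha_corrected = alpha_family / len(pairs)


def is_significant(p_value, alpha=alpha_corrected):
    """Compare one pairwise p-value against the corrected threshold."""
    return p_value < alpha


print(len(pairs), round(alpha_corrected, 4))
print(is_significant(0.0010))  # Density, NC x a+mdMCI (Table 8)
print(is_significant(0.0139))  # Density, NC x aMCI (Table 8)
```

With four groups there are six pairwise comparisons, so 0.05 / 6 rounds to 0.0083; under this threshold a p-value of 0.0139 is not counted as significant even though it is below 0.05.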
ACKNOWLEDGMENTS
Support obtained from: CNPq Universal 481351/2011-6
and 480053/2013-8, PQ 306604/2012-4 and 308558/2011-1,
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
(CAPES), FAPERN/CNPq Pronem 003/2011, FACEPE/CNPq
PRONEM APQ-1415-1.05/10, Capes SticAmSud, FAPESP
Center for Neuromathematics (grant #2013/ 076990, FAPESP),
CBB-APQ-337 00075-09, APQ-01972/12-10 and APQ-02755-10
from FAPEMIG; and 573646/2008-2 from CNPq. Dr. Diniz
is supported by grant from the Intramural Research Program
(UFMG) and CNPq (472138/2013-8). The funders had no role
in study design, data collection, analysis, decision to publish,
or preparation of the manuscript. We thank R. Furtado and P.
Petrovitch for IT support, A. Karla for administrative help, and
D. Koshiyama for library support.
SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at:
http://www.frontiersin.org/Journal/10.3389/fnagi.2014.00185/abstract
REFERENCES
Adlam, A.-L. R., Bozeat, S., Arnold, R., Watson, P., and Hodges, J. R. (2006).
Semantic knowledge in mild cognitive impairment and mild Alzheimer’s dis-
ease. Cortex 42, 675–684. doi: 10.1016/s0010-9452(08)70404-0
Amieva, H., Jacqmin-Gadda, H., Orgogozo, J.-M., Le Carret, N., Helmer, C.,
Letenneur, L., et al. (2005). The 9 year cognitive decline before dementia of the
Alzheimer type: a prospective population-based study. Brain 128, 1093–1101.
doi: 10.1093/brain/awh451
Aretouli, E., Okonkwo, C. O., Samek, J., and Brandt, J. (2011). The fate of the 0.5s:
predictors of 2-year outcome in mild cognitive impairment. J. Int. Neuropsychol.
Soc. 17, 277–288. doi: 10.1017/s1355617710001621
Bales, M. E., and Johnson, S. B. (2006). Graph theoretic modeling of
large-scale semantic networks. J. Biomed. Inform. 39, 451–464. doi:
10.1016/j.jbi.2005.10.007
Brucki, S. M. D., Malheiros, S. M. F., Okamoto, I. H., and Bertolucci, P. H. F. (1997).
Dados normativos para o teste de fluência verbal categoria animais em nosso
meio. Arq. Neuropsiquiatr. 55, 56–61.
Butts, C. T. (2009). Revisiting the foundations of network analysis. Science 325,
414–416. doi: 10.1126/science.1171022
Craik, F. I. M., and Bialystok, E. (2006). Cognition through the lifespan: mecha-
nisms of change. Trends Cog. Sci. 10, 131–138. doi: 10.1016/j.tics.2006.01.007
De Deyne, S., and Storms, G. (2008). Word associations: network and semantic
properties. Behav. Res. Methods 40, 213–231.
de Paula, J. J., Bertola, L., Ávila, R. T., Moreira, L., Coutinho, G., Moraes, E. N.,
et al. (2013). Clinical applicability and cutoff values for an unstructured neu-
ropsychological assessment protocol for older adults with low formal education.
PLoS ONE 8:E73167. doi: 10.1371/journal.pone.007316
Diniz, B. S. O., Yassuda, M. S., Nunes, P. V., Radanovic, M., and Forlenza,
O. V. (2007). Mini-mental state examination performance in mild cognitive
impairment subtypes. Int. Psychogeriatr. 19, 647–656. doi: 10.1590/S1516-
44462008000400003
Garrard, P., Lambon Ralph, M. A., Patterson, K., Pratt, K. H., and Hodges,
J. R. (2005). Semantic feature knowledge and picture naming in demen-
tia of Alzheimer’s type: a new approach. Brain Lang. 93, 79–94. doi:
10.1016/j.bandl.2004.08.003
Goni, J., Arrondo, G., Sepulcre, J., Martincorena, I., Mendizabal, N. V., Corominas-
Murtra, B., et al. (2010). The semantic organization of the animal category:
evidence from semantic verbal fluency and network theory. Cogn. Process. 12,
183–196. doi: 10.1007/s10339-010-0372-x
Griffiths, T. L., Steyvers, M., and Tenenbaum, J. B. (2007). Topics in seman-
tic representation. Psychol. Rev. 114, 211–244. doi: 10.1037/0033-295X.
114.2.211
Henry, J. D., Crawford, J. R., and Phillips, L. H. (2004). Verbal fluency perfor-
mance in dementia of the Alzheimer’s type: a meta-analysis. Neuropsychologia
42, 1212–1222. doi: 10.1016/j.neuropsychologia.2004.02.001
Hodges, J. R., Erzinçlioglu, S., and Patterson, K. (2006). Evolution of cognitive
deficits and conversion to dementia in patients with mild cognitive impairment:
a very-long-term follow-up study. Dement. Geriatr. Cogn. Disord. 21, 380–391.
doi: 10.1159/000092534
Holler, Y., and Trinka, E. (2014). What do temporal lobe epilepsy and progressive
mild cognitive impairment have in common? Front. Syst. Neurosci. 8:58. doi:
10.3389/fnsys.2014.00058
Huntley, J. D., and Howard, R. J. (2010). Working memory in early Alzheimer’s
disease: a neuropsychological review. Int. J. Geriatr. Psychiatry 25, 121–132. doi:
10.1002/gps.2314
Kotsiantis, S. B. (2007). “Supervised machine learning: a review of classifica-
tion techniques.” in Emerging Artificial Intelligence Applications in Computer
Engineering: Real Word Ai Systems with Applications (Amsterdam: IOS Press),
3–24.
Lerner, A. J., Ogrocki, P. K., and Thomas, P. T. (2009). Network graph
analysis of category fluency testing. Cog. Behav. Neurol. 22, 45–52. doi:
10.1097/wnn.0b013e318192ccaf
Lezak, M., Howieson, D. B., and Loring, D. W. (2004). Neuropsychological
Assessment, 4th Edn. New York, NY: Oxford University Press.
McClelland, J. L., and Rogers, T. T. (2003). The parallel distributed process-
ing approach to semantic cognition. Nat. Rev. Neurosci. 4, 310–323. doi:
10.1038/nrn1076
McKhann, G., Drachman, D., Folstein, M., Katzman, R., Price, D., and Stadlan,
E. M. (1984). Clinical diagnosis of Alzheimer’s disease: report of the
NINCDS-ADRDA work group under the auspices of department of health
and human services task force on Alzheimer’s. Neurology 34, 939–944. doi:
10.1212/WNL.34.7.939
Mota, N. B., Furtado, R., Maia, P. P. C., Copelli, M., and Ribeiro, S. (2014). Graph
analysis of dream reports is especially informative about psychosis. Sci. Rep.
4:3691. doi: 10.1038/srep03691
Mota, N. B., Vasconcelos, N. A. P., Lemos, N., Pieretti, A. C., Kinouchi, O., Cecchi,
G. A., et al. (2012). Speech graphs provide a quantitative measure of thought
disorder in psychosis. PLoS ONE 7:e34928. doi: 10.1371/journal.pone.0034928
Nickels, L. (2001). “Spoken word production,” in What Deficits Reveal about the
Human Mind/Brain: A Handbook of Cognitive Neuropsychology, ed B. Rapp
(Philadelphia, PA: Psychology Press), 291–320.
Nutter-Upham, K. E., Saykin, A. J., Rabin, L. A., Roth, R. M., Wishart, H. A.,
Pare, N., et al. (2008). Verbal fluency performance in amnestic MCI and older
adults with cognitive complaints. Arch. Clin. Neuropsychol. 23, 229–241. doi:
10.1016/j.acn.2008.01.005
Patterson, K., Nestor, P. J., and Rogers, T. T. (2007). Where do you know what you
know? The representation of semantic knowledge in the human brain. Nat. Rev.
Neurosci. 8, 976–988. doi: 10.1038/nrn2277
Radanovic, M., Diniz, B. S., Mirandez, R. M., Novaretti, T. M. S., Flacks,
M. K., Yassuda, M. S., et al. (2009). Verbal fluency in the detection of
mild cognitive impairment and Alzheimer’s disease among Brazilian Portuguese
speakers: the influence of education. Int. Psychogeriatr. 21, 1–7. doi:
10.1017/S1041610209990639
Salmon, D. P., Thomas, R. G., Pay, M. M., Booth, A., Hofstetter, C. R., Thal, L.
J., et al. (2002). Alzheimer’s disease can be accurately diagnosed in very mildly
impaired individuals. Neurology 59, 1022–1028. doi: 10.1212/wnl.59.7.1022
Saxton, J., Lopez, O. L., Ratcliff, G., Dulberg, C., Fried, L. P., Carlson, M.
C., et al. (2004). Preclinical Alzheimer disease: neuropsychological test per-
formance 1.5 to 8 years prior to onset. Neurology 63, 2341–2347. doi:
10.1017/S1041610208007631
Singh, M., and Provan, G. M. (1995). A comparison of induction algorithms
for selective and non-selective Bayesian classifiers. Mach. Learn. Proc. 1995,
497–505.
Strauss, E., Sherman, E. M. S., and Spreen, O. (2006). A Compendium of
Neuropsychological Tests: Administration, Norms, and Commentary. Oxford, UK:
Oxford University Press.
Taler, V., and Phillips, N. A. (2008). Language performance in Alzheimer’s dis-
ease and mild cognitive impairment: a comparative review. J. Clin. Exp.
Neuropsychol. 30, 501–556. doi: 10.1080/13803390701550128
Unsworth, N., Spillers, G. J., and Brewer, G. A. (2011). Variation in verbal fluency:
A latent variable analysis of clustering, switching, and overall performance. Q. J.
Exp. Psychol. 64, 447–466. doi: 10.1080/17470218.2010.505292
Winblad, B., Palmer, K., Kivipelto, M., Jelic, V., Fratiglioni, L., Wahlund, L. O.,
et al. (2004). Mild Cognitive impairment—beyond controversies, towards
a consensus: reports of the international working group on mild cognitive
impairment. J. Int. Med. 256, 240–246. doi: 10.1111/j.1365-2796.2004.01380.x
Conflict of Interest Statement: The authors declare that the research was con-
ducted in the absence of any commercial or financial relationships that could be
construed as a potential conflict of interest.
Received: 24 March 2014; accepted: 09 July 2014; published online: 29 July 2014.
Citation: Bertola L, Mota NB, Copelli M, Rivero T, Diniz BS, Romano-Silva MA,
Ribeiro S and Malloy-Diniz LF (2014) Graph analysis of verbal fluency test dis-
criminate between patients with Alzheimer’s disease, mild cognitive impairment and
normal elderly controls. Front. Aging Neurosci. 6:185. doi: 10.3389/fnagi.2014.00185
This article was submitted to the journal Frontiers in Aging Neuroscience.
Copyright © 2014 Bertola, Mota, Copelli, Rivero, Diniz, Romano-Silva, Ribeiro and
Malloy-Diniz. This is an open-access article distributed under the terms of the Creative
Commons Attribution License (CC BY). The use, distribution or reproduction in other
forums is permitted, provided the original author(s) or licensor are credited and that
the original publication in this journal is cited, in accordance with accepted academic
practice. No use, distribution or reproduction is permitted which does not comply with
these terms.
Mota, N. B., Copelli, M., & Ribeiro, S. (2016). Computational tracking of mental health in
youth: Latin American contributions to a low-cost and effective solution for early psychi-
atric diagnosis. In D. D. Preiss (Ed.), Child and adolescent development in Latin America.
New Directions for Child and Adolescent Development, 152, 59–69.
4
Computational Tracking of Mental Health
in Youth: Latin American Contributions to
a Low-Cost and Effective Solution for Early
Psychiatric Diagnosis
Natália Bezerra Mota, Mauro Copelli, Sidarta Ribeiro
Abstract
The early onset of mental disorders can lead to serious cognitive damage, and
timely interventions are needed in order to prevent them. In patients of low so-
cioeconomic status, as is common in Latin America, it can be hard to iden-
tify children at risk. Here, we briefly introduce the problem by reviewing the
scarce epidemiological data from Latin America regarding the onset of mental
disorders, and discussing the difficulties associated with early diagnosis. Then
we present computational psychiatry, a new field to which we and other Latin
American researchers have contributed methods particularly relevant for the
quantitative investigation of psychopathologies manifested during childhood.
We focus on new technologies that help to identify mental disease and provide
prodromal evaluation, so as to promote early differential diagnosis and inter-
vention. To conclude, we discuss the application of these methods to clinical and
educational practice. A comprehensive and quantitative characterization of ver-
bal behavior in children, from hospitals and laboratories to homes and schools,
may lead to more effective pedagogical and medical intervention. © 2016 Wiley
Periodicals, Inc.
NEW DIRECTIONS FOR CHILD AND ADOLESCENT DEVELOPMENT, no. 152, Summer 2016 © 2016 Wiley Periodicals, Inc.
Published online in Wiley Online Library (wileyonlinelibrary.com). • DOI: 10.1002/cad.20159
Mental suffering during childhood is a serious concern, hard to diagnose and manage, and prone to have neurodevelopmental impacts. A mentally or emotionally impaired child often fails to
learn school content or to develop proper social skills, and the persistence of
these symptoms greatly hinders one’s life course. Since their original psychiatric
description, the early signs of mental disorders such as schizophrenia
have been associated with clinical severity (Kessler et al., 2007; Kessler, Keller, &
Wittchen, 2001), symptom persistence (Clark, Jones, Wood, & Cornelius,
2006; Kessler et al., 2007), and lack of response to treatment (Kessler et al.,
2007; Nierenberg, Quitkin, Kremer, Keller, & Thase, 2004). Symptoms that
go unrecognized can contribute to the appearance of depression, low self-
esteem, chronicity, school absenteeism, social isolation, and risky behavior
(Kessler et al., 2007; Oschilewsky, Gomez, & Belfort, 2010). In order to
prevent major impacts, it is thus necessary to identify the psychiatric risk
with precision and as early as possible.
A review of the prevalence of mental disorders in youth reveals a
very wide variation across different studies. For instance, the prevalence of
mental suffering in childhood and adolescence over the past four decades
ranges from 1% to 51%, depending on the publication chosen (Fleitlich &
Goodman, 2000; Roberts, Attkisson, & Rosenblatt, 1998). This major vari-
ability is likely due to inconsistencies in the instruments used to screen the
pathologies, in the severity of symptoms, and in the source of information.
The psychiatric evaluation of children poses a major challenge because it is
difficult to obtain reliable reports of internally generated symptoms. Indeed,
to characterize mental symptoms in children, it is necessary to also inter-
view other sources, such as parents, other relatives, and teachers. It is also
critical to ensure that the child patient clearly understands the questions
posed. Some concepts are not easy to explain, and cultural differences re-
garding what is considered a pathological behavior often impair the child’s
ability to communicate. For example, interviews with teachers tend to re-
veal a higher prevalence of hyperactivity in children than do interviews with
parents (Fleitlich & Goodman, 2000).
When criteria used in developed countries were applied in developing
countries, higher prevalence was often found (Fleitlich &
Goodman, 2000; Roberts et al., 1998). In Great Britain, the overall preva-
lence of mental disorders during childhood is substantial, reaching 9.5%
(de la Barra, 2009; Ford, Goodman, & Meltzer, 2003; Oschilewsky et al.,
2010). In Latin America, studies report a large variability of prevalence. A
study conducted in Chile found an overall prevalence of 22.5% for ages 4 to
18 (de la Barra, Vicente, Saldivia, & Melipillán, 2012). A review including
Latin American countries reported prevalence of psychiatric disorders dur-
ing childhood ranging from 5% to 22%, a large variance that is explained
by methodological differences across studies (de la Barra, 2009). One ex-
ample is a study performed in Puerto Rico, age range 4 to 17 years old,
which found a prevalence of mental disorders of 19.8% when considering
Diagnostic and Statistical Manual of Mental Disorders-IV criteria with or
without impairment, but the prevalence decreased to 16.4% when only cases with
impairment were considered, and to 6.9% when a measure of global impairment
was added (Canino et al., 2004; de la Barra, 2009). A multicentric
study in developing countries of Africa, Asia, and South America revealed
a prevalence of mental disorders ranging from 12% to 29% for ages 5 to 15,
with higher prevalence in South American countries (Fleitlich & Goodman,
2000; Giel et al., 1981).
In developing countries, poverty and social development are key fac-
tors affecting mental health. In Latin America, mental disorders are signif-
icantly related to social vulnerability during childhood, such as homeless-
ness or dropout from school (Belfer & Rohde, 2005; Oschilewsky et al.,
2010; Rohde, Celia, & Berganza, 2004). The causal link between social vul-
nerability and mental disorder changes direction across different diagnostic
entities. For schizophrenia there is evidence pointing to social selection, i.e.,
the development of symptoms leads to social impairment. In contrast, for
depression, antisocial personality and substance abuse, the evidence points
to social causation (Dohrenwend et al., 1992; Robins & Price, 1991). In
all cases, prevention and early differential diagnosis are likely to help the
patient manage a difficult situation, but large-scale interventions over en-
tire populations need to be properly designed in order to have real social
impact.
We need to understand the mental health epidemiology of Latin Amer-
ican children, so as to overcome the lack of information about this issue
globally (Baxter, Patton, Scott, Degenhardt, & Whiteford, 2013), and even
the lack of information about general mental health epidemiology in Latin
America (Baxter et al., 2013; Duarte et al., 2003; Mercadante, Evans-Lacko,
& Paula, 2009; Oschilewsky et al., 2010). Studies of this topic used a va-
riety of different methods to search for symptoms and diagnosis (different
instruments and settings) (Duarte et al., 2003), and yet found results similar
to those found in developed countries, but with a higher prevalence of risk
factors like poverty, parental mental disorders, and family violence (Duarte
et al., 2003). Given the high poverty rate and low educational level in the
region, it is likely that there is in Latin America an undiagnosed population
undergoing mental suffering without access to proper diagnosis and treat-
ment, because of the expensive and ineffective diagnostic models used in
most of these countries.
In order to stop this vicious cycle of mental suffering and social impacts,
which is especially important in developing countries such as those of Latin America,
we will need to be creative. There is great hope in the interdisciplinary field
of computational psychiatry. Here, we review the advances in this new field,
focusing on automated diagnosis tools for psychiatric diseases. We also
present quantitative speech measurements adequate for large-scale analysis
and able to improve the recognition of pathological and nonpathological
neurodevelopmental paths within clinical and educational settings.
Computational Psychiatry: New Methods for Understanding
Human Behavior
For over a century, psychiatry has described the psychopathology of di-
agnostic entities as patterns of deviant behaviors. The diagnostic manuals
(First, Spitzer, Gibbon, & Williams, 1990) emerged as a consensus among
experts, stating which associated symptoms should be considered as a diag-
nostic entity, for how long and under which circumstances (Krystal & State,
2014). However, after decades of hard effort and bulky scientific investment,
the known biomarkers are not specific for any psychiatric symptom-based
diagnosis, because behavioral symptoms are multidetermined (Insel, 2014).
A distinct approach based on transdiagnostic dimensions has recently
emerged in psychiatry. The Research Domain Criteria (RDoC; Insel, 2014;
Insel et al., 2010; Kaufman, Gelernter, Hudziak, Tyrka, & Coplan, 2015)
classify population samples by grouping similar disorders within certain
domains of behavior. This strategy has been particularly interesting for
child psychiatry, because it allows a better assessment of the risks associ-
ated with abuse experienced by vulnerable infants (Kaufman et al., 2015).
The search for better diagnostic strategies is an essential part of the effort
to break the cyclic link between mental disorders and social damage. In
that regard, childhood represents an early window of opportunity for the
identification of cognitive deficits and mental disorder. The hope is that adequate
timely intervention may reverse poor prognoses and establish interventions
able to effectively minimize damage to the individual and his or her
surroundings.
The correlation of behavior with biomarkers can be meaningful only if
the quantitative measurements are both comprehensive and precise. The
nascent field of computational psychiatry employs increasingly sophisti-
cated mathematical tools to precisely quantify behavior, so as to better grasp
the relationship between biological variables (genetic, biochemical, neu-
ral) and purely behavioral variables such as performance on cognitive tasks
or psychometric scales (Adams, Huys, & Roiser, 2015; Montague, Dolan,
Friston, & Dayan, 2012; Wang & Krystal, 2014).
Even when such a relationship cannot be clearly established, it is pos-
sible to search for clusters within the population based on the variability
of the biological (Brodersen et al., 2014; Wang & Krystal, 2014) or behav-
ioral (Bedi et al., 2015; Bertola et al., 2014; Cabana, Valle-Lisboa, Elvevag,
& Mizraji, 2011; Díaz, 2013; Elvevag, Foltz, Weinberger, & Goldberg, 2007;
Montague et al., 2012; Mota, Furtado, Maia, Copelli, & Ribeiro, 2014; Mota
et al., 2012; Yoshida et al., 2010) data measured. The hope is that the symp-
tomatic characterization of each cluster will greatly advance the under-
standing of the psychopathological mechanisms underlying a wide variety
of mental disorders. This knowledge may not only help the early identifica-
tion of individuals suffering from mental disorders but may also contribute
to the design of low-cost yet effective interventional methods able to prevent
major cognitive deficits and their consequences, with potential to improve
the psychiatric scenario in Latin American countries.
Automated Diagnostic Tools: Hope for Early Intervention
Computational psychiatry is still a young discipline, but there are already
some identifiable advances in the development of tools capable of quanti-
fying core behaviors affected in mental disorders. In the past 5 years, re-
searchers from Uruguay, Brazil, and Argentina pioneered the development
of computational tools for the automatic analysis of psychopathological
speech; these innovative tools yield very good diagnostic performance (Bedi
et al., 2015; Bertola et al., 2014; Cabana et al., 2011; Mota et al., 2012, 2014),
and even predict psychiatric outcomes in the prodromal phase (Bedi et al.,
2015). Application of these techniques to Latin American samples demon-
strated the feasibility and advantages of these methods in developing coun-
tries (Bertola et al., 2014; Mota et al., 2012, 2014).
Language can be understood as a window into the organization of
thoughts and therefore able to reflect fundamental aspects of mental func-
tioning. Through speech we present to others what and how we think and
feel, allowing the establishment of social bonds. Language features such
as the structure of the trajectory of words (Bertola et al., 2014; Mota et al.,
2012, 2014; Wang & Krystal, 2014), semantic consistency (Bedi et al., 2015;
Elvevag et al., 2007), and prosody (Grunerbl et al., 2015) can automatically
be measured to characterize psychopathological aspects of different mental
disorders.
With regard to child psychiatry, the early identification of chronic de-
velopmental disorders such as autism is in order. It is known, for example,
that patients within the autism spectrum have a peculiar way of interacting
with toys, a behavior highly amenable to accurate automatic measurements
(Westeyn et al., 2012). Also common in the autism spectrum are deficits in
the ability known as theory of mind, which is involved, for instance, in the
capacity to understand that the beliefs of others may differ from one’s own
beliefs (Baron-Cohen, Leslie, & Frith, 1985; Frith, 1997; Misra, 2014). In
a game designed to investigate theory of mind in autistic patients, partic-
ipants are rewarded for choosing a cooperation strategy that requires one
to understand that other participants have ideas different from his/her own.
When playing this game, people diagnosed within the autistic spectrum rely
significantly less on the cooperation strategy that requires theory of mind,
in comparison with control participants. Importantly, this measure of coop-
eration correlates with the severity of the symptoms (Yoshida et al., 2010).
This is a compelling example of how an elusive, hard-to-measure behavioral
skill can now be accurately quantified in a substantially less biased manner,
generating a stream of objective data as the experimental subject behaves
freely.
64 CHILD AND ADOLESCENT DEVELOPMENT IN LATIN AMERICA
Also important for child psychiatry is the early onset of psychotic disorders.
One of the main symptoms of psychotic illnesses is disordered
thought (Andreasen & Grove, 1986), characterized by a fuzzy sequence of
words produced in spontaneous speech, with a higher rate of unusual as-
sociations than in the general population (Bleuler, 1911; Kraepelin, 1906;
Moskowitz & Heim, 2011). Thus, it is to be expected that the consecu-
tive combination of words during free speech leads to more uncommon
associations when psychotic symptoms are present, resulting in incoherent
speech (Elvevag et al., 2007). This feature can be objectively measured using
a mathematical strategy known as latent semantic analysis (LSA; Landauer
& Dumais, 1997), which estimates semantic proximity based on the co-occurrence
of words within large, representative language corpora. By representing words
as vectors in a high-dimensional semantic space, it is possible to measure the
semantic distance between words or groups of words.
This approach was first used by a joint European–North American research
team, and then by a Uruguayan team, to demonstrate that patients with
schizophrenia speak with greater semantic inconsistency than control sub-
jects (Cabana et al., 2011; Elvevag et al., 2007). More recently, within a
youth population at risk for psychosis, a study with a major contribution
from Argentinian and Brazilian researchers (our group) showed that it was
possible to predict with 100% accuracy which subjects would eventually
display actual psychotic episodes, based on quantitative features of quar-
terly clinical interviews recorded for up to 2.5 years (Bedi et al., 2015). The
features employed for this prodromal investigation were the semantic in-
consistency between consecutive sentences, maximum phrase length, and
the number of determiners (e.g., which). Altogether, these results point to
a feasible way to track and prevent the onset of psychotic crises, even before
the occurrence of a first episode during adolescence or early adulthood. This
could give families a better chance to prevent major cognitive damage.
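The coherence measure described in this paragraph can be illustrated with a minimal sketch. It assumes toy three-dimensional word vectors in place of a trained LSA space; the `TOY_VECTORS` table, the function names, and the example sentences are hypothetical and are not taken from the cited studies. Each sentence is represented by the sum of its word vectors, and consecutive sentences are compared by cosine similarity.

```python
import math

# Hypothetical 3-dimensional word vectors; a real analysis would use
# vectors learned from a large corpus (e.g., by latent semantic analysis).
TOY_VECTORS = {
    "dog": (1.0, 0.1, 0.0),
    "cat": (0.9, 0.2, 0.0),
    "barks": (0.8, 0.0, 0.1),
    "sleeps": (0.7, 0.3, 0.0),
    "taxes": (0.0, 0.1, 1.0),
    "rise": (0.1, 0.0, 0.9),
}


def sentence_vector(words):
    """Sum the vectors of the known words in a sentence."""
    total = [0.0, 0.0, 0.0]
    for word in words:
        vec = TOY_VECTORS.get(word)
        if vec is not None:
            total = [t + v for t, v in zip(total, vec)]
    return total


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def consecutive_coherence(sentences):
    """Cosine similarity between each sentence and the next one."""
    vectors = [sentence_vector(s) for s in sentences]
    return [cosine(a, b) for a, b in zip(vectors, vectors[1:])]


speech = [["dog", "barks"], ["cat", "sleeps"], ["taxes", "rise"]]
coherence = consecutive_coherence(speech)
print(coherence)
```

In this toy example the first pair of sentences is semantically close and the second pair is not, so the second similarity value is much lower; a sharp drop of this kind is what a measure of semantic inconsistency between consecutive sentences is meant to register.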
Early differential diagnosis with correct prognosis is also crucial to
mitigate cognitive damage in psychotic patients. Especially for early on-
set, schizophrenia tends to produce more cognitive damage than bipolar
mood disorder (Kaplan & Sadock, 2009). Differential diagnosis is possi-
ble because thought disorders typical of patients with schizophrenia may
differ substantially from those observed in patients with bipolar disorder
(Andreasen & Grove, 1986). In order to better characterize mental organi-
zation among psychotic patients, we developed a method based on graph
theory to measure the complexity of the stream of thoughts as expressed
by speech (Mota et al., 2012, 2014). When applied to Brazilian patients at
mental institutions typical of Latin American public health settings, this
method allowed the quantitative identification of bipolar disorder symp-
toms such as logorrhea and flight of ideas (Mota et al., 2012, 2014), as
well as schizophrenia symptoms such as laconic talking, with a less con-
nected and more linear structure, which altogether stand for the symptom
known as poor speech (Mota et al., 2012, 2014). Measures of graph con-
nectivity are significantly anticorrelated with negative symptoms (e.g., dif-
ficulty socializing and establishing ties with the interviewer), as well as cog-
nitive symptoms (e.g., failure to understand abstract concepts) (Mota et al.,
2014). This method can automatically distinguish schizophrenia, bipolar
and control subjects with high accuracy (Mota et al., 2012, 2014). The
distinction between these diagnostic entities leads to the identification of
potential prognostic predictors, as indeed indicated by the fact that the ex-
pected course of schizophrenia, as compared to bipolar disorder, produces
more severe cognitive impairment and hence a more difficult socialization.
Having established this correlation, computational psychiatrists now need
to carry out longitudinal studies in order to establish the predictive value
of the graph-theoretical method for diagnosis, prognosis, and response to
treatment in clinical situations such as prodrome or first episode. This will
allow the early identification and treatment of the diseases that can lead to
psychosis. It will also promote a deeper understanding of the distinct bi-
ological bases of schizophrenia and bipolar disorder, which have partially
overlapping symptomatology but a quite different clinical course.
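A minimal sketch of the kind of graph representation involved is given here, under the assumption that each distinct word is a node and each pair of consecutive words is a directed edge. The function names and the two example utterances are illustrative only; the published method computes a larger set of attributes on real transcripts.

```python
from collections import defaultdict


def build_speech_graph(words):
    """Return (nodes, edges): one node per distinct word and one directed
    edge per distinct pair of consecutive words."""
    nodes = set(words)
    edges = set(zip(words, words[1:]))
    return nodes, edges


def _reachable(start, adjacency):
    """Set of nodes reachable from `start` by following `adjacency`."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def largest_strongly_connected_component(nodes, edges):
    """Size of the largest set of nodes that are mutually reachable."""
    forward = defaultdict(set)
    backward = defaultdict(set)
    for src, dst in edges:
        forward[src].add(dst)
        backward[dst].add(src)
    best = 0
    assigned = set()
    for node in nodes:
        if node in assigned:
            continue
        # Nodes reachable from `node` and from which `node` is reachable.
        component = _reachable(node, forward) & _reachable(node, backward)
        assigned |= component
        best = max(best, len(component))
    return best


linear = "i went home and slept".split()
recurrent = "i went home and i slept and i went out".split()

for label, words in (("linear", linear), ("recurrent", recurrent)):
    nodes, edges = build_speech_graph(words)
    lsc = largest_strongly_connected_component(nodes, edges)
    print(label, len(nodes), len(edges), lsc)
```

The linear utterance yields 5 nodes, 4 edges, and a largest strongly connected component of size 1, whereas the recurrent utterance yields 6 nodes, 7 edges, and a component of size 5. Connectivity attributes of this kind are what separate a linear, poorly connected narrative from one that repeatedly returns to earlier words.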
Another promising research line in computational psychiatry is related
to the fact that speech features such as pitch and speed are very strongly
affected by mood. In situations of euphoria, it is common to observe higher
speech rate and higher voice amplitude, in comparison with times of sor-
row. Voice samples can be collected on a daily basis with the help of a cell
phone device, currently so ubiquitous, to generate a naturalistic, dense, and
nonbiased speech sample of individuals diagnosed with bipolar disorder
(Grunerbl et al., 2015). Prosodic measures of speech recorded by mobile
applications have been shown to be useful in the identification of extreme
mood episodes such as mania and depression (Grunerbl et al., 2015).
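To make the notion of a prosodic measure concrete, the following sketch computes two simple quantities from a digitized signal: the root-mean-square amplitude and a rough frequency estimate based on zero crossings. It is an illustration under strong assumptions: the signal is a synthetic sine wave rather than recorded speech, the function names are hypothetical, and zero-crossing counting is a deliberately crude stand-in for the pitch estimators used in practice, not the method of the cited study.

```python
import math


def rms_amplitude(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def zero_crossing_frequency(samples, sample_rate):
    """Rough frequency estimate: half the number of sign changes per second."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    duration = len(samples) / sample_rate
    return crossings / (2.0 * duration)


def synthetic_tone(frequency, amplitude, sample_rate=8000, seconds=1.0):
    """Pure sine wave used as a stand-in for a recorded voice sample."""
    count = int(sample_rate * seconds)
    return [
        amplitude * math.sin(2.0 * math.pi * frequency * (n + 0.5) / sample_rate)
        for n in range(count)
    ]


quiet_low = synthetic_tone(frequency=120, amplitude=0.3)
loud_high = synthetic_tone(frequency=200, amplitude=0.8)

for label, tone in (("quiet_low", quiet_low), ("loud_high", loud_high)):
    print(label, rms_amplitude(tone), zero_crossing_frequency(tone, 8000))
```

Applied to daily recordings, measures of this type could be tracked over time for each individual, so that departures from that person's own baseline become visible.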
Conclusions
The early differential diagnosis of mental disorders affects the individual’s
life and epidemiological perspective and scaffolds the design of public poli-
cies for the prevention of mental distress. Interdisciplinary prevention leads
to a mitigation of social impact, reduced risk factors, and improved wel-
fare of the population. In Latin America, risk factors for mental illness
are particularly prevalent, and there are few professionals effectively qual-
ified to identify psychiatric vulnerabilities (Duarte et al., 2003). In this
context, the use of automated methods for the objective quantification of
prognostic predictors of mental health and cognition may greatly empower
patients and psychiatrists as well, and it may help to break the mental
disorder–poverty cycle that plagues the region. The fact that these com-
putational methods for psychiatry have in large part been developed by
Latin American researchers is an auspicious indication that the scientific gap
between developed and developing countries may be decreasing in some
fields.
Acknowledgements
This work was supported by Conselho Nacional de Desenvolvimento
Científico e Tecnológico (CNPq), grants Universal 480053/2013-8 and
Research Productivity 306604/2012-4 and 310712/2014-9; Coordenação de
Aperfeiçoamento de Pessoal de Nível Superior (CAPES) Projeto ACERTA;
Fundação de Amparo à Ciência e Tecnologia do Estado de Pernambuco
(FACEPE); FAPESP Center for Neuromathematics (grant # 2013/07699-0,
S. Paulo Research Foundation FAPESP).
References
Adams, R. A., Huys, Q. J., & Roiser, J. P. (2015). Computational psychiatry: Towards a
mathematically informed understanding of mental illness. Journal of Neurology, Neu-
rosurgery, and Psychiatry. doi: jnnp-2015-310737
Andreasen, N. C., & Grove, W. M. (1986). Thought, language, and communication in
schizophrenia: Diagnosis and prognosis. Schizophrenia Bulletin, 12(3), 348–359.
Baron-Cohen, S., Leslie, A. M., & Frith, U. (1985). Does the autistic child have a “theory
of mind”? Cognition, 21(1), 37–46. doi: 0010-0277(85)90022-8
Baxter, A. J., Patton, G., Scott, K. M., Degenhardt, L., & Whiteford, H. A. (2013). Global
epidemiology of mental disorders: What are we missing? PLoS One, 8(6), e65514. doi:
10.1371/journal.pone.0065514
Bedi, G., Carrillo, F., Cecchi, G. A., Slezak, D. F., Sigman, M., Mota, N. B., . . . Corcoran,
C. M. (2015). Automated analysis of free speech predicts psychosis onset in high-risk
youths. npj Schizophrenia, 1, 15030. doi: 10.1038/npjschz.2015.30
Belfer, M. L., & Rohde, L. A. (2005). Child and adolescent mental health in Latin Amer-
ica and the Caribbean: Problems, progress, and policy research. Revista Panamericana
de Salud Pública, 18(4–5), 359–365. doi: S1020-49892005000900016
Bertola, L., Mota, N. B., Copelli, M., Rivero, T., Diniz, B. S., Romano-Silva, M. A., . . .
Malloy-Diniz, L. F. (2014). Graph analysis of verbal fluency test discriminate between
patients with Alzheimer’s disease, mild cognitive impairment and normal elderly con-
trols. Frontiers in Aging Neuroscience, 6(185). doi: 10.3389/fnagi.2014.00185
Bleuler, E. (1911). Dementia praecox or the group of schizophrenias. New York: Interna-
tional Universities Press.
Brodersen, K. H., Deserno, L., Schlagenhauf, F., Lin, Z., Penny, W. D., Buhmann, J. M.,
& Stephan, K. E. (2014). Dissecting psychiatric spectrum disorders by generative em-
bedding. NeuroImage: Clinical, 4, 98–111. doi: 10.1016/j.nicl.2013.11.002
Cabana, A., Valle-Lisboa, J. C., Elvevag, B., & Mizraji, E. (2011). Detecting order-
disorder transitions in discourse: Implications for schizophrenia. Schizophrenia Re-
search, 131, 157–164.
Canino, G., Shrout, P. E., Rubio-Stipec, M., Bird, H. R., Bravo, M., Ramirez, R., . . .
Martinez-Taboas, A. (2004). The DSM-IV rates of child and adolescent disorders in
Puerto Rico: Prevalence, correlates, service use, and the effects of impairment. Archives
of General Psychiatry, 61(1), 85–93. doi: 10.1001/archpsyc.61.1.85
Clark, D. B., Jones, B. L., Wood, D. S., & Cornelius, J. R. (2006). Substance use disorder
trajectory classes: Diachronic integration of onset age, severity, and course. Addictive
Behaviors, 31(6), 995–1009. doi: S0306-4603(06)00086-4
de la Barra, F. M. (2009). Epidemiología de trastornos psiquiátricos en niños y adolescentes:
Estudios de prevalencia [Epidemiology of psychiatric disorders in children and
adolescents: Prevalence studies]. Revista Chilena de Neuro-Psiquiatría, 47(4), 303–314.
de la Barra, M. F., Vicente, P. B., Saldivia, B. S., & Melipillán, A. R. (2012). Estudio de
epidemiología psiquiátrica en niños y adolescentes en Chile. Estado actual [Investigation
of psychiatric epidemiology among children and adolescents in Chile]. Revista
Médica Clínica Las Condes, 23(5), 521–529.
Díaz, J.-L. (2013). A narrative method for consciousness research. Frontiers in Human
Neuroscience, 7(739), 1–12.
Dohrenwend, B. P., Levav, I., Shrout, P. E., Schwartz, S., Naveh, G., Link, B. G., . . .
Stueve, A. (1992). Socioeconomic status and psychiatric disorders: The causation-
selection issue. Science, 255(5047), 946–952.
Duarte, C., Hoven, C., Berganza, C., Bordin, I., Bird, H., & Miranda, C. T. (2003). Child
mental health in Latin America: Present and future epidemiologic research. Interna-
tional Journal of Psychiatry in Medicine, 33(3), 203–222.
Elvevag, B., Foltz, P. W., Weinberger, D. R., & Goldberg, T. E. (2007). Quantifying inco-
herence in speech: An automated methodology and novel application to schizophre-
nia. Schizophrenia Research, 93(1–3), 304–316. doi: S0920-9964(07)00117-X
First, M. H., Spitzer, R. L., Gibbon, M., & Williams, J. (1990). Structured clinical interview
for DSM-IV Axis I disorders—Research version, patient edition (SCID-I/P). New York:
New York State Psychiatric Institute.
Fleitlich, B. W., & Goodman, R. (2000). Epidemiologia. Revista Brasileira de Psiquiatria,
22(Suppl. 2), 2–6.
Ford, T., Goodman, R., & Meltzer, H. (2003). The British child and adolescent men-
tal health survey 1999: The prevalence of DSM-IV disorders. Journal of the American
Academy of Child and Adolescent Psychiatry, 42(10), 1203–1211.
Frith, U. (1997). The neurocognitive basis of autism. Trends in Cognitive Sciences, 1(2),
73–77. doi: 10.1016/S1364-6613(97)01010-3
Giel, R., de Arango, M. V., Climent, C. E., Harding, T. W., Ibrahim, H. H., Ladrido-
Ignacio, L., . . . Younis, Y. O. (1981). Childhood mental disorders in primary health
care: Results of observations in four developing countries. A report from the WHO
collaborative Study on Strategies for Extending Mental Health Care. Pediatrics, 68(5),
677–683.
Grunerbl, A., Muaremi, A., Osmani, V., Bahle, G., Ohler, S., Troster, G., . . . Lukowicz,
P. (2015). Smartphone-based recognition of states and state changes in bipolar disor-
der patients. IEEE Journal of Biomedical and Health Informatics, 19(1), 140–148. doi:
10.1109/JBHI.2014.2343154
Insel, T. (2014). The NIMH Research Domain Criteria (RDoC) Project: Precision
medicine for psychiatry. American Journal of Psychiatry, 171(4), 395–397. doi:
10.1176/appi.ajp.2014.14020138
Insel, T., Cuthbert, B., Garvey, M., Heinssen, R., Pine, D. S., Quinn, K., . . . Wang, P.
(2010). Research domain criteria (RDoC): Toward a new classification framework for
research on mental disorders. American Journal of Psychiatry, 167(7), 748–751. doi:
10.1176/appi.ajp.2010.09091379
Kaplan, H. I., & Sadock, B. J. (2009). Kaplan & Sadock’s comprehensive textbook of psy-
chiatry. Baltimore, MD: Wolters Kluwer, Lippincott Williams & Wilkins.
Kaufman, J., Gelernter, J., Hudziak, J. J., Tyrka, A. R., & Coplan, J. D. (2015). The Research
Domain Criteria (RDoC) Project and studies of risk and resilience in maltreated
children. Journal of the American Academy of Child and Adolescent Psychiatry, 54(8),
617–625. doi: 10.1016/j.jaac.2015.06.001
Kessler, R. C., Amminger, G. P., Aguilar-Gaxiola, S., Alonso, J., Lee, S., & Ustun, T. B.
(2007). Age of onset of mental disorders: A review of recent literature. Current Opinion
in Psychiatry, 20(4), 359–364. doi: 10.1097/YCO.0b013e32816ebc8c
Kessler, R. C., Keller, M. B., & Wittchen, H. U. (2001). The epidemiology of generalized
anxiety disorder. Psychiatric Clinics of North America, 24(1), 19–39.
Kraepelin, E. (1906). Über Sprachstörungen im Traume [About language disorders in
dreams]. Leipzig, Germany: Engelmann.
Krystal, J. H., & State, M. W. (2014). Psychiatric disorders: Diagnosis to therapy. Cell,
157(1), 201–214. doi: 10.1016/j.cell.2014.02.042
Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato’s problem: The Latent
Semantic Analysis theory of the acquisition, induction, and representation of knowl-
edge. Psychological Review, 104, 211–240.
Mercadante, M. T., Evans-Lacko, S., & Paula, C. S. (2009). Perspectives of in-
tellectual disability in Latin American countries: Epidemiology, policy, and ser-
vices for children and adults. Current Opinion in Psychiatry, 22(5), 469–474. doi:
10.1097/YCO.0b013e32832eb8c6
Misra, V. (2014). The social brain network and autism. Annals of Neuroscience, 21(2),
69–73. doi: 10.5214/ans.0972.7531.210208
Montague, P. R., Dolan, R. J., Friston, K. J., & Dayan, P. (2012). Computational psychi-
atry. Trends in Cognitive Sciences, 16(1), 72–80. doi: 10.1016/j.tics.2011.11.018
Moskowitz, A., & Heim, G. (2011). Eugen Bleuler’s Dementia praecox or the group of
schizophrenias (1911): A centenary appreciation and reconsideration. Schizophrenia
Bulletin, 37(3), 471–479. doi: 10.1093/schbul/sbr016
Mota, N. B., Furtado, R., Maia, P. P., Copelli, M., & Ribeiro, S. (2014). Graph analysis
of dream reports is especially informative about psychosis. Scientific Reports, 4, 3691.
doi: 10.1038/srep03691
Mota, N. B., Vasconcelos, N. A., Lemos, N., Pieretti, A. C., Kinouchi, O., Cecchi, G.
A., . . . Ribeiro, S. (2012). Speech graphs provide a quantitative measure of thought
disorder in psychosis. PLoS One, 7(4), e34928. doi: 10.1371/journal.pone.0034928
Nierenberg, A. A., Quitkin, F. M., Kremer, C., Keller, M. B., & Thase, M. E.
(2004). Placebo-controlled continuation treatment with mirtazapine: Acute pat-
tern of response predicts relapse. Neuropsychopharmacology, 29(5), 1012–1018. doi:
10.1038/sj.npp.1300405
Oschilewsky, R. C., Gomez, C. M., & Belfort, E. (2010). Child psychiatry and men-
tal health in Latin America. International Review of Psychiatry, 22(4), 355–362. doi:
10.3109/09540261.2010.503692
Roberts, R. E., Attkisson, C. C., & Rosenblatt, A. (1998). Prevalence of psychopathol-
ogy among children and adolescents. American Journal of Psychiatry, 155, 715–
725.
Robins, L. N., & Price, R. K. (1991). Adult disorders predicted by childhood conduct
problems: Results from the NIMH Epidemiologic Catchment Area project. Psychiatry,
54(2), 116–132.
Rohde, L. A., Celia, S., & Berganza, C. (2004). Systems of care in South America. In
H. Remschmidt, M. Belfer, & I. Gooyer (Eds.), Facilitating pathways, care, treatment
and prevention in child and adolescent mental health (pp. 42–51). Heidelberg, Germany:
Springer Verlag.
Wang, X. J., & Krystal, J. H. (2014). Computational psychiatry. Neuron, 84(3), 638–654.
doi: 10.1016/j.neuron.2014.10.018
Westeyn, T. L., Abowd, G. D., Starner, T. E., Johnson, J. M., Presti, P. W., & Weaver,
K. A. (2012). Monitoring children’s developmental progress using augmented toys
and activity recognition. Personal and Ubiquitous Computing, 16, 169–191. doi:
10.1007/s00779-011-0386-0
Yoshida, W., Dziobek, I., Kliemann, D., Heekeren, H. R., Friston, K. J., & Dolan, R. J.
(2010). Cooperation and heterogeneity of the autistic mind. Journal of Neuroscience,
30(26), 8815–8818. doi: 10.1523/JNEUROSCI.0400-10.2010
NATÁLIA BEZERRA MOTA is a PhD student at the Brain Institute, Federal University
of Rio Grande do Norte. She completed a psychiatry residency and received
an MSc in neuroscience from the Federal University of Rio Grande do Norte.
MAURO COPELLI is associate professor of physics at the Physics Department,
Federal University of Pernambuco. He received a PhD in physics from Limburgs
Universitair Centrum.
SIDARTA RIBEIRO is full professor of neuroscience at the Brain Institute, Federal
University of Rio Grande do Norte. He received a PhD in animal behavior from
the Rockefeller University.
ARTICLE OPEN
Automated analysis of free speech predicts psychosis onset in
high-risk youths
Gillinder Bedi1,2,9, Facundo Carrillo3,9, Guillermo A Cecchi4, Diego Fernández Slezak3, Mariano Sigman5, Natália B Mota6,
Sidarta Ribeiro6, Daniel C Javitt1,7, Mauro Copelli8 and Cheryl M Corcoran1,7
BACKGROUND/OBJECTIVES: Psychiatry lacks the objective clinical tests routinely used in other specializations. Novel
computerized methods to characterize complex behaviors such as speech could be used to identify and predict psychiatric illness
in individuals.
AIMS: In this proof-of-principle study, our aim was to test automated speech analyses combined with Machine Learning to predict
later psychosis onset in youths at clinical high-risk (CHR) for psychosis.
METHODS: Thirty-four CHR youths (11 females) had baseline interviews and were assessed quarterly for up to 2.5 years; five
transitioned to psychosis. Using automated analysis, transcripts of interviews were evaluated for semantic and syntactic features
predicting later psychosis onset. Speech features were fed into a convex hull classification algorithm with leave-one-subject-out
cross-validation to assess their predictive value for psychosis outcome. The canonical correlation between the speech features and
prodromal symptom ratings was computed.
RESULTS: Derived speech features included a Latent Semantic Analysis measure of semantic coherence and two syntactic markers
of speech complexity: maximum phrase length and use of determiners (e.g., which). These speech features predicted later psychosis
development with 100% accuracy, outperforming classification from clinical interviews. Speech features were significantly
correlated with prodromal symptoms.
CONCLUSIONS: Findings support the utility of automated speech analysis to measure subtle, clinically relevant mental state
changes in emergent psychosis. Recent developments in computer science, including natural language processing, could provide
the foundation for future development of objective clinical tests for psychiatry.
npj Schizophrenia (2015) 1, Article number: 15030; doi:10.1038/npjschz.2015.30; published online 26 August 2015
INTRODUCTION
The capacity of psychiatry to diagnose and treat serious mental
illness has been hampered by the absence of objective clinical
tests of the type routinely used in other fields of medicine.
Although recent years have seen substantial advances in under-
standing of the neurobiology of mental illness,1 these develop-
ments have yet to yield markers that reliably differentiate
psychiatric health from illness at the level of the individual
patient. Whereas clinical neuroscience has focused on the brain in
mental illness, computer science has, in parallel, developed
increasingly sophisticated automated approaches to characterize
and predict human behavior. Such advances are now commonly
utilized in industry (the private business sector): models combin-
ing demographic data and purchasing behavior are used to
personalize advertising content2 and automated language assess-
ment is employed to screen job candidates and score essays.3 The
degree to which such technologies might also aid diagnosis and
prognosis in psychiatry is only now beginning to be explored (e.g.,
see ref. 4).
Developments in automated natural language processing5
present one promising avenue for psychiatry. Although speech
may present a unique ‘window into the mind’ in a variety of
altered states,6 it is particularly relevant to psychosis. Thought
disorder, a cardinal symptom of schizophrenia in which thought
processes lose coherence, is typically diagnosed on the basis of
clinical observation of disorganized speech.7 As a complement to
clinical observation, automated analysis methods have previously
been used to assess speech correlates of thought disorder in
schizophrenia.8 For example, Latent Semantic Analysis (LSA), an
automated high-dimensional associative analysis of semantic
structure in speech, has been used to identify a reduction in
semantic coherence in schizophrenia that correlates with clinical
ratings and has comparable diagnostic accuracy.3 LSA combined
with structural speech analysis was also able to accurately
differentiate between first-degree relatives of schizophrenia
patients and unrelated healthy individuals, suggesting that subtle
differences indicative of underlying genetic vulnerabilities to
schizophrenia can be distinguished with computerized speech
analysis.9
As yet, however, these methods have not been applied to the
prediction of psychosis onset, even though clinically diagnosed
subtle disorganization in speech has consistently been identified
as predictive of psychosis (i.e., with classification accuracy of
~ 60%) among young people identified as at clinical high risk
1Department of Psychiatry, College of Physicians and Surgeons of Columbia University, New York, NY, USA; 2Division on Substance Abuse, New York State Psychiatric Institute,
New York, NY, USA; 3Department of Computer Science, School of Sciences, Universidad de Buenos Aires, Buenos Aires, Argentina; 4Computational Biology Center—Neuroscience,
IBM T.J. Watson Research Center, Yorktown Heights, NY, USA; 5Department of Physics, School of Sciences, Universidad de Buenos Aires, Buenos Aires, Argentina; 6Brain Institute,
Federal University of Rio Grande do Norte, Natal, Brazil; 7Division of Experimental Therapeutics, New York State Psychiatric Institute, New York, NY, USA and 8Department of
Physics, Federal University of Pernambuco, Recife, Brazil.
Correspondence: GA Cecchi or CM Corcoran (gcecchi@us.ibm.com or cc788@columbia.edu)
9These authors contributed equally to this work.
Received 13 May 2015; revised 19 June 2015; accepted 6 July 2015
www.nature.com/npjschz
All rights reserved 2334-265X/15
© 2015 Schizophrenia International Research Group/Nature Publishing Group
(CHR) for psychosis (reviewed in ref. 10), as well as those at genetic
high risk for psychosis.11 There are several reasons to test
automated prediction approaches in this population. Schizophre-
nia, although relatively rare (lifetime prevalence ~ 1%), is among
the most catastrophic mental illnesses both personally and
societally. Schizophrenia and related psychotic illnesses typically
emerge in young adults at the point of maximal societal and
parental investment when individuals are poised to begin to
contribute socially and economically.12 Although those at CHR for
developing schizophrenia by virtue of subthreshold or attenuated
psychotic symptoms can be identified,13 to date reliable predic-
tion of psychosis onset among high-risk youths has proven
elusive. Improving the capacity to predict psychosis among high-
risk populations would have important ramifications for early
identification and preventive intervention, potentially critically
altering the long-term life trajectory of people with emergent
psychotic disorders.
Here, we present a proof-of-principle test of automated speech
analysis to predict, at the level of the individual, the later onset of
psychosis. Specifically, we employed analysis of free speech at
baseline to predict psychosis onset over a subsequent period of
up to 2.5 years in teens and young adults identified as at CHR for
psychosis.13 On the basis of earlier findings in schizophrenia,3,9,14
in which automated text analyses yielded parameters that
accurately discriminated between patients and controls, we
hypothesized that automated semantic and syntactic analysis of
baseline interview transcripts would yield speech features capable
of predicting subsequent psychosis outcome among CHR
individuals.
MATERIALS AND METHODS
Participants
Participants were 34 help-seeking youths aged 14 to 27 years who were
fluent in English (three were immigrants who learned English as children).
They were referred from schools and clinicians, or self-referred through the
Center of Prevention and Evaluation website. Exclusion criteria included
history of threshold psychosis or Axis I psychotic disorder, risk of harm to
self or others incommensurate with outpatient care, any major medical or
neurological disorder, and Intelligence Quotient <70 (assessed with the
Wechsler Abbreviated Scale of Intelligence). The attenuated psychotic
symptoms characteristic of the CHR participants could not have occurred
solely in the context of substance use or withdrawal. Adults provided
written informed consent; participants under 18 provided written assent,
with consent provided by a parent. All experiments were performed in
accordance with the relevant guidelines and regulations, and all
procedures were approved by the Institutional Review Board at the New
York State Psychiatric Institute at Columbia University. Five participants
transitioned to psychosis within 2.5 years of follow-up (CHR+), whereas 29
did not (CHR−). Demographics for CHR individuals, stratified by psychosis
outcome, are presented in Table 1.
Procedures
Ascertainment and prospective characterization. The Structured Interview
for Prodromal Syndromes/Scale of Prodromal Symptoms (SIPS/SOPS)13 was
used for ascertainment of CHR status, for baseline and quarterly symptom
ratings,10 and to determine psychosis outcome. The SIPS/SOPS evaluates
positive (subthreshold psychotic), negative, disorganized, and general
symptoms.
Participants had to meet baseline criteria for one of three prodromal
syndromes, assessed with the SIPS/SOPS: (i) attenuated positive symptom
syndrome (⩾1 SOPS-positive item in the prodromal range with symptoms
beginning or worsening in the past year, and symptoms occurring
⩾ once/week in the prior month); (ii) genetic risk and deterioration
syndrome (psychosis in a first-degree relative or schizotypal disorder
accompanied by a 30% drop in global assessment of function over the
past year); or (iii) brief intermittent psychotic symptom syndrome (⩾1
SOPS-positive items in the psychotic range with symptoms beginning in
the past 3 months, and symptoms occurring ⩾ several minutes/day). All
CHR participants in this study met criteria for the attenuated positive
symptom syndrome. Trained master-level research assistants adminis-
tered the SIPS/SOPS, with clinical ratings achieved by expert consensus
(with CC).
Participants were prospectively characterized for symptoms every
3 months for up to 2.5 years, with transition to psychosis determined
using the SIPS/SOPS ‘presence of psychosis’ criteria.
Baseline interviews. Open-ended, narrative interviews of ~ 1 h were
obtained from participants by interviewers trained by an expert in
qualitative interviewing and phenomenological research.15 Participants
were encouraged to describe changes they had experienced and the
impact of these changes, what had been helpful or unhelpful for them, and
their expectations for the future. Interviews took place between 2007 and
early 2012, and were transcribed by an independent company. The first 27
transcripts were previously subject to thematic analysis using phenomen-
ological procedures, finding gender differences in themes; this earlier
qualitative analysis did not assess the predictive value of the interviews for
psychosis outcome.16
Speech preprocessing. Interview transcripts were preprocessed as pre-
viously described6 using the Natural Language Toolkit (NLTK; http://www.
nltk.org/).5 After discarding punctuation, each interview was automatically
parsed into phrases. Words were then converted to the roots from which
they are inflected, or lemmatized, using the NLTK WordNet lemmatizer.
The resultant preprocessed data consisted of a list of lemmatized words,
parsed into phrases, maintaining the original order, without punctuation
and in lower case.
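The preprocessing steps just described can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline used in the study: phrase boundaries are approximated by sentence-final punctuation, and a small hand-written table (`TOY_LEMMAS`, hypothetical) stands in for the NLTK WordNet lemmatizer so that the example runs without external data.

```python
import re

# Hypothetical, hand-written lemma table standing in for a real
# lemmatizer such as the NLTK WordNet lemmatizer named in the text.
TOY_LEMMAS = {
    "was": "be",
    "felt": "feel",
    "things": "thing",
    "changed": "change",
}

# Assumption: phrases end at sentence-final punctuation or semicolons.
PHRASE_BOUNDARY = re.compile(r"[.!?;]+")
# Anything other than lower-case letters, apostrophes, and whitespace.
NON_WORD = re.compile(r"[^a-z'\s]")


def preprocess(transcript):
    """Split a transcript into phrases of lower-case, lemmatized words,
    preserving the original order and dropping punctuation."""
    phrases = []
    for raw in PHRASE_BOUNDARY.split(transcript):
        cleaned = NON_WORD.sub(" ", raw.lower())
        words = [TOY_LEMMAS.get(w, w) for w in cleaned.split()]
        if words:
            phrases.append(words)
    return phrases


print(preprocess("Things changed last year. I felt different, you know?"))
```

The result is a list of phrases, each a list of lemmatized lower-case words in their original order, which is the form of input assumed by the semantic and syntactic analyses.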
Speech analyses. We employed a novel combination of semantic
coherence and syntactic assays as predictors of psychosis transition.
For the semantic analyses, we used a well-validated approach to
automated text analysis previously used to analyze speech in
schizophrenia,3 LSA17. LSA is a high-dimensional associative model that
rests on the premise that word meaning is a function of the relationship
of each word to every other word in the lexicon. If semantically similar
words co-occur in texts with consistent topics more frequently than do
unrelated words, then the semantic similarity of two words can be
quantitatively indexed by the frequency of their co-occurrence in a
sufficiently large corpus of texts.17 LSA thus captures the meaning of
words through linear representations in high-dimensional (300–400
dimensional) semantic space based on word co-occurrence frequencies.
Each word in the lexicon is assigned a vector representing its semantic
content; the orientation of these vectors can then be used to compare
semantic similarity between words.17
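For concreteness, comparing the orientation of two word vectors is commonly done with the cosine of the angle between them, which is also the similarity measure applied to phrase vectors later in this section. The symbols $\mathbf{v}_1$ and $\mathbf{v}_2$ (the LSA vectors of words $w_1$ and $w_2$) are notation introduced here for illustration:

```latex
\mathrm{sim}(w_1, w_2) = \cos(\mathbf{v}_1, \mathbf{v}_2)
  = \frac{\mathbf{v}_1 \cdot \mathbf{v}_2}{\lVert \mathbf{v}_1 \rVert \, \lVert \mathbf{v}_2 \rVert}
```

The value is 1 for vectors pointing in the same direction and 0 for orthogonal vectors, regardless of vector length.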
Here, LSA was trained on the Touchstone Applied Science Associates
(TASA) Corpus, a collection of educational materials compiled by TASA. The
semantic coherence measure we developed is similar to that used by
Elvevåg et al.,3 which discriminated between established schizophrenia
patients and controls. The present measure differs from the earlier
approach in that it explicitly incorporates syntactic information: semantic
trajectories are represented by similarity among pairs of consecutive
phrases, or pairs of phrases separated by an intervening phrase (see
Figure 1). Given the speech transcription D, the document is split into n
phrases Si and converted into a vectorial representation by replacing each
word in the phrase by its corresponding LSA vector, $S_i \to \{l_{i1}, \ldots, l_{iN}\}$. The
Table 1. Demographics
                                          CHR+ (N = 5)    CHR− (N = 29)
Age (in years)                            22.2 (3.4)      21.2 (3.6)
Gender (% male)                           80%             66%
Race (% Caucasian)                        40%             38%
Medications prescribed (antipsychotics
  and/or antidepressants)                 20%             21%
Abbreviations: CHR+, clinical high-risk participants who transitioned to
psychosis during follow-up; CHR−, clinical high-risk participants who did
not transition to psychosis during follow-up.
Automated analysis of free speech
G Bedi et al
2
npj Schizophrenia (2015) 15030 © 2015 Schizophrenia International Research Group/Nature Publishing Group
phrase vectors are then summarized by taking the mean of their
components:
$$L_i = \frac{1}{N} \sum_{k=1}^{N} l_{ik}$$
i.e., the mean of all LSA vectors of every word in the phrase.
We defined first-order coherence by taking the similarity of consecutive
phrase vectors, averaged over all the phrases in the text (represented by
$\langle \cdot \rangle$ below):
$$\mathrm{FOC} = \langle \cos(L_i, L_{i+1}) \rangle$$
and second-order coherence by taking the similarity between phrases
separated by another intervening phrase, averaged over all the phrases in
the text:
$$\mathrm{SOC} = \langle \cos(L_i, L_{i+2}) \rangle$$
With these two features, we were able to characterize semantic coherence
by measuring components of the distributions of first- and second-order
coherence over the speech samples, including features such as the
minimum, mean, median, and s.d.
Thus, we indexed speech coherence by: (i) automated separation of
interviews into phrases; (ii) assigning phrases semantic vectors as the mean
of the LSA semantic vectors for each word within the phrase; and
(iii) assessing semantic similarity (i.e., the cosine) between the phrase
vectors of consecutive phrases, or phrases separated by another
intervening phrase.
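Steps (ii) and (iii) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the study's implementation: `toy_embedding` is a hypothetical two-dimensional lookup table standing in for the LSA space, every word is assumed to be present in it, and no phrase vector is assumed to have zero length.

```python
import math
import statistics

def mean_vector(vectors):
    """Component-wise mean of equal-length vectors (the phrase vector L_i)."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]

def cosine(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def coherence_series(phrases, embedding, lag):
    """Cosine similarity between phrase i and phrase i + lag, for all valid i."""
    phrase_vectors = [mean_vector([embedding[w] for w in p]) for p in phrases]
    return [cosine(phrase_vectors[i], phrase_vectors[i + lag])
            for i in range(len(phrase_vectors) - lag)]

# Hypothetical 2-D embedding; real LSA vectors have hundreds of dimensions.
toy_embedding = {"cat": [1.0, 0.0], "dog": [0.8, 0.6], "tax": [0.0, 1.0]}
toy_phrases = [["cat"], ["dog"], ["tax"]]

first_order = coherence_series(toy_phrases, toy_embedding, lag=1)   # FOC terms
second_order = coherence_series(toy_phrases, toy_embedding, lag=2)  # SOC terms
foc_mean = statistics.mean(first_order)  # the averaged FOC defined above
foc_min = min(first_order)               # coherence at the largest discontinuity
```

Summary statistics such as the minimum, mean, median, and s.d. are then taken over `first_order` and `second_order` for each transcript.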
To complement the semantic analysis, we defined another measure
for processing the documents, on the basis of Part Of Speech tagging
(POS-Tag). This consists of labeling every word by its grammatical function.
For example, the sentence ‘The cat is under the table’ is tagged by the
POS-Tag procedure as (('The', 'DT'), ('cat', 'NN'), ('is', 'VBZ'), ('under', 'IN'),
('the', 'DT'), ('table', 'NN')) where DT is the tag for determiners, NN for
nouns, VBZ for third-person singular present-tense verbs, and IN for
prepositions. For every transcript, we
calculated the POS-Tag information (with NLTK5) and used the frequencies
of each tag as an additional attribute of the text. Tagging automation uses
a hand-tagged corpus to train a parsing process using a variety of
heuristics. The default NLTK tagger uses the Penn Treebank tag set.
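The tag-frequency attribute can be illustrated with the tagged example sentence given above. In this minimal sketch the tagged input is typed in by hand; in practice it might be produced by `nltk.pos_tag`, which requires NLTK and its tagger data. Whether the study used raw counts or relative frequencies is not stated, so the use of relative frequencies here is an assumption.

```python
from collections import Counter

# Tagged sentence copied from the example in the text.
tagged = [("The", "DT"), ("cat", "NN"), ("is", "VBZ"),
          ("under", "IN"), ("the", "DT"), ("table", "NN")]

def tag_frequencies(tagged_words):
    """Relative frequency of each POS tag in a tagged text."""
    counts = Counter(tag for _, tag in tagged_words)
    total = sum(counts.values())
    return {tag: count / total for tag, count in counts.items()}

freqs = tag_frequencies(tagged)
```

For this sentence, determiners (DT) and nouns (NN) each account for one third of the tags.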
Code availability. Code for speech preprocessing (WordNet lemmatizer)
and POS-Tag (Penn Treebank) is available open access through the NLTK
(http://www.nltk.org/).5
Classification. A cross-validated classifier is a Machine Learning algorithm
with two stages: in the first stage, it learns the underlying patterns of the
data using a subset of samples. The learned model is used in the second
stage to predict the labels of samples not used during the learning stage
(Figure 2).
We used features derived from the semantic coherence analyses and the
POS-Tag extraction, providing a vector of features for each participant's
text. With this information, we trained the classifier to learn the features
that discriminated participants who did not subsequently develop
psychosis (CHR−) from those who did (CHR+).
The convex hull of a set of points is the minimal convex polyhedron that
contains them. A convex hull classifier was implemented as follows: during
training, we sequentially excluded one CHR+ or CHR− participant to be
used for testing (leave-one-subject-out cross-validation). Using the training
labels, we computed the convex hull of the CHR− set, and then tested
whether the left-out sample was inside the hull (predicting CHR−) or
outside (predicting CHR+). Each individual was sequentially excluded from
the training set used to compute the convex hull to serve as the test
subject, providing accuracy of prediction data for all participants.
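The leave-one-subject-out procedure can be sketched as follows. This is a simplified illustration, not the study's implementation: it works in two dimensions rather than three, builds the hull with Andrew's monotone chain algorithm (a technique chosen here for brevity), and uses invented feature values in `toy`.

```python
def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def inside_hull(point, hull):
    """True if point lies inside or on a counter-clockwise convex polygon."""
    if len(hull) < 3:
        return False  # degenerate hull: treated as containing nothing here
    return all(cross(hull[i], hull[(i + 1) % len(hull)], point) >= 0
               for i in range(len(hull)))

def leave_one_out(samples):
    """samples: list of ((x, y), label) with label 'CHR-' or 'CHR+'."""
    predictions = []
    for i, (point, _) in enumerate(samples):
        training = samples[:i] + samples[i + 1:]
        # The hull is always built from the CHR- training samples only.
        hull = convex_hull([p for p, label in training if label == "CHR-"])
        predictions.append("CHR-" if inside_hull(point, hull) else "CHR+")
    return predictions

# Hypothetical 2-D feature values, invented purely for illustration.
toy = [((0, 0), "CHR-"), ((4, 0), "CHR-"), ((4, 4), "CHR-"), ((0, 4), "CHR-"),
       ((2, 2), "CHR-"), ((9, 9), "CHR+")]
predicted = leave_one_out(toy)
```

In this toy set the interior point (2, 2) is predicted CHR− and the distant point (9, 9) is predicted CHR+. The four corner points, being hull vertices, fall outside the hull of the remaining points when they are left out; this is a property of the sketch as written and may not reflect details of the original implementation.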
The semantic coherence feature that best contributed to classification of
subsequent psychosis onset was the minimum coherence between two
consecutive phrases (i.e., the maximum discontinuity) that occurred in the
interview. The syntactic measure included in classification was the
frequency of use of determiners (‘that’, ‘what’, ‘whatever’, ‘which’, and
‘whichever’), normalized by the phrase length. Because speech in
emergent psychosis often shows marked reductions in verbosity (referred
to clinically as poverty of speech), we also included the maximum number
of words per phrase in the classification.
Validation. To further probe findings from the CHR analyses, we also
conducted the following validation analyses:
Does the coherence measure index ‘disorder’ in a text?: Because the
concept of semantic coherence we employed does not have a
mathematical definition, in this validation we tested the coherence
measure against a corpus of classic literature and assessed how the
measure changed when we modified the original texts in a way that is
relevant to the concept of semantic coherence.
Figure 1. Pipeline for automated extraction of the semantic coherence features. Texts were initially split into sentences/phrases. Each word
was represented as a vector in high-dimensional semantic space using Latent Semantic Analysis (LSA). Summary vectors were calculated as
the mean of each vector in each phrase. Coherence was determined based on the semantic similarity between adjacent phrases, calculated as
the cosine of their respective vectors. The semantic coherence feature that best discriminated those who transitioned to psychosis from those
who did not was the minimum semantic coherence value (i.e., the coherence at the point of maximal discontinuity) within each
transcribed text.
On the basis of the hypothesis that a text that makes sense will produce
a high coherence score, we applied different levels of ‘disorder’ to a range
of texts to determine whether the method could detect these modifications.
We defined each level of ‘disorder’ as the percent of the text that was
moved from its original location. For example, a disorder level of 40%
indicates that 4 of 10 sentences were moved and thus were no longer in
their original position in the text. For each of 10 disorder levels (10–100%),
we created 1,000 samples, randomly shuffling the order of the appropriate
proportion of sentences. We performed coherence analysis on randomly
selected chapters of the following six classic books: On the Origin of Species
by Charles Darwin, A Study in Scarlet by Arthur Conan Doyle, Moby Dick; Or,
The Whale by Herman Melville, Pride and Prejudice by Jane Austen, The
Adventures of Tom Sawyer by Mark Twain, and The Count of Monte Cristo by
Alexandre Dumas.
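One way to generate a text at a given disorder level is sketched below. It is a minimal sketch under assumptions: the function name `apply_disorder` is hypothetical, the chosen proportion of sentence positions is shuffled among themselves, and a plain shuffle may occasionally leave a selected sentence in its original position, which the original procedure may have handled differently.

```python
import random

def apply_disorder(sentences, level, rng):
    """Shuffle a fraction `level` (0..1) of sentence positions among themselves."""
    n = len(sentences)
    k = round(level * n)
    positions = rng.sample(range(n), k)   # positions chosen to be disturbed
    moved = [sentences[i] for i in positions]
    rng.shuffle(moved)
    result = list(sentences)
    for pos, sentence in zip(positions, moved):
        result[pos] = sentence
    return result, set(positions)

rng = random.Random(0)                    # fixed seed so the sketch is repeatable
original = [f"s{i}" for i in range(10)]
shuffled, touched = apply_disorder(original, 0.4, rng)
```

Repeating the call 1,000 times per disorder level and computing the coherence measure on each shuffled text would give the distributions compared in this validation.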
Are the speech features associated with symptoms assessed with standard
diagnostic instruments?: To assess the extent to which the text features
that best predicted clinical status at follow-up in CHR patients (minimum
first-order coherence, density of determiners, and maximum phrase
length) carry information with respect to standard clinical prodromal
ratings, we computed the canonical correlation between these three text
features (semantic coherence, phrase length and use of determiner
pronouns) and two symptom measures on the SIPS/SOPS (total positive
symptoms and total negative symptoms). The canonical correlation
between two sets of features from the same samples, X and Y, estimates
the linear combination of X features such that this combined feature
has the highest correlation with an also estimated linear combination
of Y features.
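In symbols, and with notation introduced here for illustration ($a$ and $b$ are the weight vectors, and $\Sigma_{XX}$, $\Sigma_{YY}$, $\Sigma_{XY}$ are the sample covariance and cross-covariance matrices of the centered features), the first canonical correlation is the solution of:

```latex
(a^{*}, b^{*}) = \arg\max_{a,\,b} \; \mathrm{corr}(Xa,\, Yb)
  = \arg\max_{a,\,b} \;
    \frac{a^{\top} \Sigma_{XY}\, b}
         {\sqrt{a^{\top} \Sigma_{XX}\, a}\; \sqrt{b^{\top} \Sigma_{YY}\, b}}
```

Here $X$ would hold the three text features and $Y$ the two symptom totals, one row per participant.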
Ethics statement: The Institutional Review Board at the New York State
Psychiatric Institute at Columbia approved these experiments, and
informed consent was obtained for all subjects (parental consent with
assent for minors).
RESULTS
CHR analysis
Of the 34 participants, 5 were known to develop schizophrenia (or
schizoaffective disorder) within 2.5 years. Respectively, their times
to psychosis onset from time of speech sampling were 3, 4, 8, 12,
and 16 months. Twenty-nine participants were known to not
develop psychosis over follow-up, with 22 of these participants
followed for 2.5 years, 4 participants followed for 2 years, and
3 followed for 1.5 years (these participants’ CHR status was
ascertained closer to the end of the overall study). An additional
participant’s transcript was not included in speech analyses
because her clinical outcome was indeterminate; she remained
psychosis-free over 1.5 years of follow-up, but may have
subsequently developed psychosis after the study.
A graphical representation of the differentiation obtained
between CHR+ and CHR− individuals using the three parameters
of minimum semantic coherence, normalized use of determiners,
and maximum phrase length is presented as the convex hull of
the set of CHR− individuals (the minimal convex polyhedron that
contains all data points) in Figure 3. The convex hull of CHR−
individuals does not include any CHR+ individuals.
The convex hull classifier yielded 100% accuracy for prediction
of psychosis onset. Null hypothesis tests were used to estimate the
probability of obtaining this result by chance. We first partitioned
the data set (N= 34) randomly, assigning five subjects to the CHR+
label and the remainder to the CHR− group. Because some
assignments for this initial test included the actual CHR+
individuals, we implemented a second test by repeating the
previous scheme, including only CHR− individuals. That is, using
the CHR− set, we randomly assigned CHR+ labels to 5 CHR−
individuals, and estimated the probability that they would all fall
outside the convex hull of the remaining 24 individuals randomly labeled as CHR−.
Finally, we repeated the same scheme by assigning random labels
to the 29 CHR− individuals (matching the original number of
labels), and also randomly assigning the semantic and syntactic
speech features, drawing values from a Gaussian distribution with
the same mean and s.d. as the actual values. In each scheme, the
probability that all five individuals labeled as CHR+ would fall
outside the convex hull of CHR− individuals was less than chance,
i.e., P < 0.05.
Figure 2. Pipeline for cross-validation of the Machine Learning
classifier. A vector of features for each participant is extracted and
fed into the classifier that was trained on the other participants’
data. The classifier is used to predict outcome for the left-out, or
test, participant. Each participant is sequentially left out of the
training data set to serve as the test subject once, resulting in
accuracy of prediction data for all participants.
Table 2. Classification performance metrics
Classification           PPV    NPV    Sens.   Spec.   ROC
Convex Hull 3-feature    100    100    100     100     1.00
SIPS/SOPS                33     89     40      86      0.47
Abbreviations: NPV, negative predictive value; PPV, positive predictive
value; ROC, receiver operating characteristic area under the curve;
Sens, sensitivity; SIPS/SOPS, classification based on baseline scores on
the Structured Interview for Prodromal Syndromes/Scale for Prodromal
Symptoms; Spec, specificity.
To investigate whether standard clinical ratings could differentiate
CHR+ and CHR− individuals, we entered variables from
clinical ratings—the SIPS/SOPS13—into several classifiers. The best
prediction obtained was less accurate than the automated
analysis, misclassifying 3 of 5 CHR+ patients and 4 of 29 CHR−
patients to yield an accuracy of 79%, consistent with prior studies
(see Table 2 for classification performance metrics).
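The SIPS/SOPS figures can be checked from the misclassification counts given above. In this short sketch, CHR+ is treated as the positive class, so 3 of 5 misclassified CHR+ patients gives 2 true positives and 3 false negatives, and 4 of 29 misclassified CHR− patients gives 4 false positives and 25 true negatives. The function name is hypothetical.

```python
def classification_metrics(tp, fn, fp, tn):
    """Standard binary-classification metrics, returned as percentages."""
    return {
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
        "ppv": 100 * tp / (tp + fp),
        "npv": 100 * tn / (tn + fn),
        "accuracy": 100 * (tp + tn) / (tp + fn + fp + tn),
    }

# Counts implied by the text: 3 of 5 CHR+ and 4 of 29 CHR- were misclassified.
sips = classification_metrics(tp=2, fn=3, fp=4, tn=25)
rounded = {name: round(value) for name, value in sips.items()}
```

Rounded to whole percentages, these counts reproduce the SIPS/SOPS row of Table 2 (PPV 33, NPV 89, sensitivity 40, specificity 86) and the reported accuracy of 79%.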
Validations
The coherence measure as an index of ‘disorder’ in texts. We found
that two features of the semantic coherence distributions, the
minimum semantic distance for first-order coherence (i.e., the
minimum coherence or maximum discontinuity between two
adjacent sentences within the text sampled), and the mean
semantic distance for first-order coherence (i.e., the average
coherence between adjacent sentences within the text) were
negatively correlated with the disorder level we produced in texts,
indicating that higher levels of disorder within the text produced
lower coherence scores (see Figure 4).
Associations between speech features and symptoms assessed with
standard diagnostic instruments. The canonical correlation analysis
of text features versus the entire set of clinical prodromal
features did not yield any significant correlation; however,
restricting the analyses to the sums of subthreshold psychotic
and negative symptom severity ratings (i.e., Atotal, Btotal) yields a
correlation of r= 0.57 and P= 0.046, for the variables s (symptoms)
and t (speech variables; Figure 5):
$$s = 0.066 \times A_{\mathrm{total}} + B_{\mathrm{total}},$$
$$t = -0.68 \times \max(\text{words per phrase}) - 0.02 \times \text{coherence} - 0.54 \times \text{determiners}.$$
In this equation, there are two symptom variables (sums of
subthreshold psychotic and negative symptoms, respectively,
Atotal, Btotal) and three speech variables (minimum semantic
coherence, normalized use of determiners, and maximum phrase
length).
Figure 3. Discrimination between individuals who transitioned to
psychosis (clinical high risk+ (CHR+); in red) and those who did not
(CHR−; in blue) presented as the convex hull of CHR− individuals.
Color shading within the convex hull is used only to illustrate
volume. Discrimination was based on three features extracted from
free speech using automated methods: the frequency of use of
determiners (‘that’, ‘what’, ‘whatever’, ‘which’, and ‘whichever’)
normalized by phrase length; the minimum semantic coherence
between two consecutive phrases within the interview; and the
maximum phrase length.
Figure 4. Effect of randomly shuffling a proportion of classic literary texts (degree of ‘disorder’) on the measure of semantic coherence
developed. Data points represent the minimum semantic distance between two adjacent sentences within a text. Increasing levels of
‘disorder’ were associated with a decrease in the coherence measure employed.
That is, this analysis reveals that there is a significant correlation
between Btotal (i.e., sum of negative symptoms) and a combination
of the maximum number of words per phrase and density of
determiners. This is consistent with the concept of paucity of
speech constituting a negative symptom in schizophrenia.
Finally, we observed that a scatter-plot of Atotal and Btotal shows
a distribution reminiscent of what we find with text features: CHR+
samples tend to occupy a region outside the distribution of the
CHR− set, similar to what we observe with the speech features
(although less precise in terms of class separation).
Thus, although the classification based on the speech
coherence analyses clearly outperformed that based on the
SIPS/SOPS clinical ratings, these additional analyses indicate that
the coherence features extracted are tapping dimensions that are
relevant for clinical symptomatology, as measured with standardized
rating scales.
DISCUSSION
In this initial, proof-of-principle study using a novel combination of
automated semantic and syntactic speech analyses, we found that
speech recorded and transcribed at baseline could accurately
predict subsequent transition to psychosis in a clinical high-risk
cohort. Moreover, classification based on automated analysis
outperformed that based on clinical ratings, indicating that
automated speech analysis can increase predictive power beyond
expert clinical opinion.
Of note, the sample size employed in this initial study was small,
with five participants developing psychosis during the follow-up
period. This limitation meant that we were unable to divide
participants into separate training and test samples, instead using
cross-validation procedures to assess the predictive algorithm.
This approach, although providing important information about
the potential predictive capacity of these novel speech measures,
may have resulted in higher estimates of the predictive accuracy
of the model than would be obtained in a larger, separate sample.
Thus, replication in a larger sample will be an important future
research direction.
Our findings from this proof-of-concept study, although
needing to be replicated in larger samples, have several
implications. First, reliable identification of individuals likely to
progress to schizophrenia would greatly facilitate targeted early
intervention. Second, automated speech assessment, if further
validated, could provide previously unavailable information for
clinicians on which to base treatment and prognostic decisions,
effectively functioning as a ‘laboratory test’ for psychiatry. The
ease of speech recording makes this approach particularly suitable
for clinical applications. Self-report of symptoms, on which much
of psychiatric assessment relies, depends on the patient’s
motivation and capacity to accurately report their introspective
experiences, which may be influenced by psychiatric illness.
Although clinicians routinely detect disorganized speech on the
basis of clinical observations, our data suggest that automated
analytic methods allow for superior assessment. As a direct,
objective measure, automated speech analysis could thus provide
important information to complement existing methods for
patient assessment. Finally, these findings support the use of
advanced computational methods to characterize complex
human behaviors such as speech in both normal and pathological
states. Such a fine-grained behavioral analysis could allow tighter
mapping between psychiatrically relevant phenotypes and their
underlying biology, in essence carving nature more closely at its
joints. Better mapping between the behavioral and the biological
is likely to lead to greater understanding of the pathophysiology
of schizophrenia and other psychiatric disorders, potentially also
informing psychiatric nosology.
These findings represent the initial stages in the use of
emerging computer science behavioral analysis techniques,
already prominently used in industry, to characterize and predict
human behavior in the context of psychiatric health and illness.
Using automated approaches, we were able to extract indices of
speech-semantic coherence and syntax and use these to
accurately predict the subsequent development of psychosis in
high-risk youths. Prognostic prediction using this approach
outperformed prediction on the basis of standard psychiatric
ratings. Computerized analysis of complex human behaviors such
as speech may present an opportunity to move psychiatry beyond
reliance on self-report and clinical observation toward more
objective measures of health and illness in the individual patient.
ACKNOWLEDGMENTS
We thank the participants and also Shelly Ben David, Kelly Gill, Mara Eilenberg, and
Michael Birnbaum for assistance in obtaining speech data and symptom measures,
and in coordinating the longitudinal cohort study.
CONTRIBUTIONS
GB contributed to the conception of the study, the interpretation of data, and
drafting the manuscript. CMC led the prospective clinical high-risk cohort study and
oversaw all data collection, and worked on and edited iterative drafts of the
manuscript. DCJ also contributed to the design and conduct of the cohort study,
and contributed suggested edits to the manuscript. FC and GAC designed and
performed the automated text analysis; DFS and MS contributed to the analysis of
the data; NM, SR, and MC collected and preprocessed data on patients with
schizophrenia and their controls. All the authors reviewed the results, edited the
manuscript, and gave final approval for submission of the manuscript. Drs Corcoran
and Cecchi had full access to all the data in the study and take responsibility for the
integrity of the data and the accuracy of the data analysis.
Figure 5. Correlations of text features and clinical ratings (top panel)
and between positive (sips-A) and negative (sips-B) symptoms
(bottom panel). Upper panel: canonical correlation between the text
features, and the Structured Interview for Prodromal Syndromes
(SIPS) features. Lower panel: scatter-plot of Atotal and Btotal shows no
association between subthreshold psychotic symptoms and negative
symptoms. In both panels, clinical high risk− (CHR−) and CHR+
are labeled with blue and red dots, respectively. For the analysis, all
features were centered and normalized.
COMPETING INTERESTS
The authors declare no conflict of interest.
FUNDING
This research was supported by NIMH (K23MH066279, R21MH086125, and R01
MH04933423), The National Center for Advancing Translational Sciences (NIH UL1
TR000040), the New York State Office of Mental Hygiene, NIDA (K23DA034877), and
FAPESP Research, Innovation and Dissemination Center for Neuromathematics (grant
# 2013/07699-0, São Paulo Research Foundation).
REFERENCES
1 Insel TR, Landis SC. Twenty-five years of progress: the view from NIMH and NINDS.
Neuron 2013; 80: 561–567.
2 Adomavicius G, Tuzhilin A. Using data mining methods to build customer profiles.
IEEE Comput 2001; 34: 74–82.
3 Elvevag B, Foltz PW, Weinberger DR, Goldberg TE. Quantifying incoherence in
speech: an automated methodology and novel application to schizophrenia.
Schizophr Res 2007; 93: 304–316.
4 Poulin C, Shiner B, Thompson P, Vepstas L, Young-Xu Y, Goertzel B et al. Predicting
the risk of suicide by analyzing the text of clinical notes. PLoS One 2014; 9:
e85733.
5 Bird S, Klein E, Loper E. Natural Language Processing with Python. O'Reilly Media:
Sebastopol, CA, USA, 2009.
6 Bedi G, Cecchi GA, Slezak DF, Carrillo F, Sigman M, de Wit H. A window into the
intoxicated mind? Speech as an index of psychoactive drug effects.
Neuropsychopharmacology 2014; 39: 2340–2348.
7 Adler CM, Malhotra AK, Elman I, Goldberg T, Egan M, Pickar D et al. Comparison of
ketamine-induced thought disorder in healthy volunteers and thought disorder in
schizophrenia. Am J Psychiatry 1999; 156: 1646–1649.
8 Mota NB, Vasconcelos NA, Lemos N, Pieretti AC, Kinouchi O, Cecchi GA et al.
Speech graphs provide a quantitative measure of thought disorder in psychosis.
PLoS One 2012; 7: e34928.
9 Elvevag B, Foltz PW, Rosenstein M, Delisi LE. An automated method to analyze
language use in patients with schizophrenia and their first-degree relatives.
J Neurolinguistics 2010; 23: 270–284.
10 DeVylder JE, Muchomba FM, Gill KE, Ben-David S, Walder DJ, Malaspina D et al.
Symptom trajectories and psychosis onset in a clinical high-risk cohort: the
relevance of subthreshold thought disorder. Schizophr Res 2014; 159: 278–283.
11 Gooding DC, Ott SL, Roberts SA, Erlenmeyer-Kimling L. Thought disorder in
mid-childhood as a predictor of adulthood diagnostic outcome: findings from
the New York High-Risk Project. Psychol Med 2013; 43: 1003–1012.
12 McGorry P, Purcell R. Youth mental health reform and early intervention:
encouraging early signs. Early Interv Psychiatry 2009; 3: 161–162.
13 Miller TJ, McGlashan TH, Rosen JL, Cadenhead K, Cannon T, Ventura J et al.
Prodromal assessment with the structured interview for prodromal syndromes and
the scale of prodromal symptoms: predictive validity, interrater reliability, and
training to reliability. Schizophr Bull 2003; 29: 703–715.
14 Holshausen K, Harvey PD, Elvevag B, Foltz PW, Bowie CR. Latent semantic
variables are associated with formal thought disorder and adaptive behavior in
older inpatients with schizophrenia. Cortex 2013; 55: 88–96.
15 Davidson L. Phenomenological research in schizophrenia: From philosophical
anthropology to empirical science. J Phenomenol Psychol 2004; 25: 104–130.
16 Ben-David S, Birnbaum M, Eilenberg M, DeVylder J, Gill K, Schienle J et al.
The subjective experience of youths at clinical high risk for psychosis: a
qualitative study. Psychiatr Serv 2014; 65: 1499–1501.
17 Landauer TK, Dumais ST. A solution to Plato's problem: the latent semantic
analysis theory of acquisition, induction, and representation of knowledge.
Psychol Rev 1997; 104: 211–240.
This work is licensed under a Creative Commons Attribution 4.0
International License. The images or other third party material in this
article are included in the article’s Creative Commons license, unless indicated
otherwise in the credit line; if the material is not included under the Creative Commons
license, users will need to obtain permission from the license holder to reproduce the
material. To view a copy of this license, visit http://creativecommons.org/licenses/
by/4.0/
Supplementary Information accompanies the paper on the npj Schizophrenia website (http://www.nature.com/npjschz)
Chapter 2 - Hypotheses and Objectives: 
The results presented in the introductory chapter motivated the following hypotheses: 
Main hypothesis: Natural language processing tools at the structural and semantic 
levels can precisely quantify naturalistic human behavior expressed by language and 
can be applied to understand cognitive pathology, development and dreams. 
1. Given that, during cognitive decline, speech structure seems to be less
connected (psychosis) and short-term recurrence seems to increase (Alzheimer’s
disease):
a. Children that show more advanced cognitive development (regarding 
general intelligence, theory of mind abilities and academic 
performance) should present more connected and less recursive 
memory report graphs; 
b. During recent-onset psychosis, subjects with a schizophrenia diagnosis
should produce more fragmented graphs, and graph connectivity should
be predictive of diagnosis and correlated with negative symptoms;
c. Healthy subjects should present an increase of connectivity and lexical 
diversity, as well as a decrease of short-term recurrence related to age 
and education, and the same pattern of development would be 
expected in the analysis of literary texts across historical time; 
2. We verified that dream memories are better than daily memories for observing
cognitive impairment in psychosis, but there is a lack of quantitative studies of
dream memory reports using non-subjective methods. The similarities between
psychosis and dreams (such as the lack of critical judgment about reality) have
also been debated in the research literature for many decades.
Given that it is possible to study dream memories using speech analysis tools,
and that dream reports are especially informative about psychosis, dream
memory reports were explored regarding:
a. Dream lucidity (the ability to be aware of dreaming while dreaming) in 
patients undergoing psychosis;  
b. Semantic memory reverberation during sleep onset. Do visual memories 
fade or reverberate during waking and hypnagogic sleep?  
  
In this sense, this thesis has the following objectives: 
Main objective: verify whether structural and semantic natural language processing
tools can quantify naturalistic verbal reports, and whether these tools can be
applied to understand cognitive decline in psychosis, cognitive development in healthy
children and memory processing during sleep and dreams.
1. Describe the relationship of speech structure development to global
intelligence, theory of mind abilities and academic performance in reading in
children in school settings;
2. Combine different speech analysis methods to develop tools that extract
information about negative symptoms during recent-onset psychosis and
that could be predictive of a schizophrenia diagnosis;
3. Characterize the development of speech structure features in a large population
with a broad age span, and compare it by analogy with the development of
literature, in order to gain a sense of how these features evolved historically;
4. Identify differences in dream lucidity in a psychotic population; 
5. Describe neural correlates of the semantic memory reverberation of a picture
seen before closing the eyes, during wakefulness and during the first stages of
sleep.
 
  
Chapter 3 - Cognitive Development and Education: 
This chapter discusses published results from typically developing children in a school
setting during literacy acquisition. The same graph-theoretical tools used to measure
speech structure were applied to memory reports, and correlated with intelligence
quotient, theory of mind tests and academic achievement in reading. Here we also
include two review papers with broad ideas about physiological constraints on school
education and naturalistic assessment in the school setting.
 
 
 
MIND, BRAIN, AND EDUCATION
A Naturalistic Assessment
of the Organization
of Children’s Memories Predicts
Cognitive Functioning and
Reading Ability
Natália Bezerra Mota1, Janaína Weissheimer1,2, Beatriz Madruga3, Nery Adamy1,4, Silvia A. Bunge5,
Mauro Copelli6, and Sidarta Ribeiro1
ABSTRACT— To explore the relationship between memory
and early school performance, we used graph theory to
investigate memory reports from 76 children aged 6–8 years.
The reports comprised autobiographical memories of events
days to years past, and memories of novel images reported
immediately after encoding. We also measured intelligence
quotient (IQ) and theory of mind (ToM). Reading and
Mathematics were assessed before classes began (December
2013), around the time of report collection (June 2014),
and at the end of the academic year (December 2014).
IQ and ToM correlated positively with word diversity and
word-to-word connectivity, and negatively with word recurrence.
Connectivity correlated positively with Reading in
June 2014 as well as December 2014, even after adjusting for
IQ and ToM. To our knowledge, this is the first study demonstrating
a link between the structure of children’s memories
and their cognitive or academic performance.
1Brain Institute, Federal University of Rio Grande do Norte (UFRN)
2Department of Modern Foreign Languages and Literature, Federal
University of Rio Grande do Norte (UFRN)
3Department of Psychology, Federal University of Rio Grande do Norte
(UFRN)
4Department of Pedagogy and Physiotherapy, Faculdade Maurício de
Nassau
5Department of Psychology & Helen Wills Neuroscience Institute,
University of California
6Department of Physics, Federal University of Pernambuco (UFPE)
Address correspondence to Sidarta Ribeiro, Brain Institute, Federal
University of Rio Grande do Norte (UFRN), Natal, RN, Brazil; e-mail
sidartaribeiro@neuro.ufrn.br
When children begin formal schooling, they are faced with
the challenges of learning to read, write, and perform basic
mathematical calculations, among others. To achieve these
milestones, children must be able to attend to relevant information,
keep it in mind, organize and flexibly update it,
and recall it at a later time. These cognitive skills improve
dramatically over the elementary school years, as measured
by carefully controlled laboratory tests. At the same time,
there are important individual differences in performance
on these cognitive tests, and there is ample evidence that
interindividual variability in working memory and cognitive
control helps to explain differences in academic achievement
among children (Alloway & Passolunghi, 2011; Titz & Kar-
bach, 2014).
In comparison with research on working memory and
cognitive control, the degree to which episodic memory
contributes to academic achievement is less clear (Sander,
Werkle-Bergner, Gerjets, Shing, & Lindenberger, 2012). It is
generally assumed that the ability to recall detailed accounts
of past events is important for learning new material at
school (Harel et al., 2014). However, the most common
way to measure episodic memory is through the use of
simple laboratory tests in which participants must learn a
series of pairs of stimuli presented on the computer, and
retrieve them after a brief delay (Alloway & Alloway, 2010;
Alloway, Gathercole, Kirkwood & Elliott, 2009; Alloway &
Passolunghi, 2011; Blakemore & Bunge, 2012; Bunge &
Wright, 2007; Johnson, Miller Singley, Peckham, Johnson,
& Bunge, 2014; Sander et al., 2012). While these types of
© 2016 International Mind, Brain, and Education Society and Wiley Periodicals, Inc. 1
Memories Correlate With Academic Performance
paradigms are tightly controlled, they are rather artificial
and do not approach the complexity of memory processes
in the real world. Here, we sought to probe the relationship
between episodic memory and academic achievement in a
more naturalistic context, asking children to report on their
own memories. To this end, we sought to use new quantification
methods applied to unstructured, spontaneous, freely
produced speech.
The way people report their memories reflects sponta-
neous associations, and indirectly reveals the underlying
thought process. Recently, computational approaches based
on graph theory have succeeded in using structural features
of memory reports to quantify pathological cognitive deficits
(Bertola et al., 2014; Mota, Furtado, Maia, Copelli, & Ribeiro,
2014; Mota et al., 2012). A memory report can be accurately
represented by a graph in which the words are represented
by nodes, and the temporal links between consecutive words
are represented by edges. As described in Table 1, it is pos-
sible to calculate general attributes of graphs (such as the
number of nodes and edges or links), to examine the relation-
ship between those elements by studying recurrence mea-
sures (how repetitions of links between nodes and cycles of
nodes appear on the graphs), and to study the overall con-
nectivity between nodes (counting the number of nodes that
are connected), as well as to describe the global features that
characterize the structure of graphs as a whole (such as the
degree of clustering and the average shortest path between
nodes; Bollobas, 1998).
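As a minimal illustration of the attributes just listed, the following Python sketch builds a binary adjacency matrix from a toy report and derives L1, L2, L3, average total degree, and density from the formulas given in Table 1. It is not the SpeechGraphs implementation: the whitespace tokenizer, the function names, the use of a binary adjacency matrix, and the convention that every pair of consecutive words counts as one edge are all assumptions made for this example.

```python
def tokenize(report):
    """Lowercase and split on whitespace (a simplifying assumption)."""
    return report.lower().split()


def adjacency_matrix(words):
    """Binary matrix: a[i][j] = 1 if word i is followed by word j at least once."""
    vocab = sorted(set(words))
    index = {word: i for i, word in enumerate(vocab)}
    a = [[0] * len(vocab) for _ in vocab]
    for source, target in zip(words, words[1:]):
        a[index[source]][index[target]] = 1
    return vocab, a


def matmul(x, y):
    """Product of two square matrices stored as lists of lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(x):
    """Sum of the diagonal entries."""
    return sum(x[i][i] for i in range(len(x)))


def graph_attributes(report):
    """Hypothetical helper applying the Table 1 formulas to one report."""
    words = tokenize(report)
    vocab, a = adjacency_matrix(words)
    n_nodes = len(vocab)
    if n_nodes < 2:
        raise ValueError("need at least two different words")
    n_edges = len(words) - 1  # assumption: each consecutive pair is one edge
    a2 = matmul(a, a)
    a3 = matmul(a2, a)
    return {
        "N": n_nodes,
        "E": n_edges,
        "L1": trace(a),        # trace of the adjacency matrix
        "L2": trace(a2) / 2,   # trace of the squared matrix divided by two
        "L3": trace(a3) / 3,   # trace of the cubed matrix divided by three
        "ATD": 2 * n_edges / n_nodes,
        "Density": 2 * n_edges / (n_nodes * (n_nodes - 1)),
    }


attrs = graph_attributes("the dog saw the cat and the cat saw the dog")
print(attrs)
```

In this toy report the three loops of three nodes are the→dog→saw→the, the→cat→and→the, and the→cat→saw→the. Because the trace formula for L2 is applied literally, a self-loop in the binary matrix would also contribute to it; whether SpeechGraphs handles that case the same way is not stated here.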
Such speech graphs have recently been used to reveal
cognitive deficits in pathological populations comprising
patients suffering from psychosis (Mota et al., 2012, 2014)
or dementia (Bertola et al., 2014). In particular, we have
found that dream reports from psychotic patients were less
connected than similar reports from controls. Furthermore,
connectivity measurements were negatively correlated with
cognitive and negative symptoms, denoting that the more
isolated and cognitively impaired the subject is, the less
connected the corresponding dream reports (Mota et al.,
2014). In the case of dementia, the graph-theoretical anal-
ysis of the verbal fluency test led to good sorting between
patients with Alzheimer’s disease and mild cognitive deficits
(Bertola et al., 2014). Cognitive impairment was accompa-
nied by increased graph density, decreased diameter, and
smaller average shortest path.
Graph measurements have yet to be employed to investi-
gate the normal development of memory reports produced
by a healthy population. As a first step in this direction, we
set out to quantify the relationship between the structure of
spontaneous memory reports, and measurements of general
intelligence, theory of mind (ToM), and school achievement.
The longitudinal design of this study allowed us to investigate
whether these structural properties can predict academic
performance over time. The first academic assessment
was performed in December 2013, before the students were
exposed to Reading and Math classes, which began in March
2014. Two subsequent measurements were performed in
June 2014 and December 2014, allowing for the investigation
of both cross-sectional and longitudinal relationships
between graph measurements and academic achievement
during the first year of alphabetization.
In order to characterize the mechanism behind the
possible relationship between declarative memory reports
and cognitive performance, we separated the reports into
those taxing short-term memory (STM), with a few sec-
onds between the encoding and recall of novel images; and
those taxing long-term memory (LTM), with days to years
between the encoding and recall of autobiographical events.
Based on previous studies of memory organization in adult
psychotic patients (Mota et al., 2012, 2014), we hypothe-
sized that three connectivity-related graph attributes that
decrease in association with cognitive decline in these sub-
jects (edges, LCC, and LSC; see Methods section) would
increase in the case of typical developmental improvement
in alphabetization. As this is a first exploratory graph-based
study of memory reports in healthy children, we also tested
whether the other 11 graph attributes are relevant.
METHODS
Participants
A total of 76 children (40 males and 36 females, aged 6–8
years, 7.29± 0.58, mean± SD) participated in this study.
These children were recruited from six public schools in
Natal, Brazil. The children came from families with low lev-
els of educational attainment (parents’ years of education
8.76± 3.90) and low socioeconomic status (family income
R$ 1,133.58± 431.21; mean± SD; average national wage R$
1,855.00; Wages, Ministry of Planning, Budget and Management,
Brazil, 2014). The study was approved by the Ethics
on Research Committee of the Federal University of Rio
Grande do Norte (UFRN) (permit no. 742.116), and the
data were collected during regular class hours within the
school setting, with each child individually in a classroom
assigned exclusively for this purpose. Written informed consent
was obtained on behalf of all the children from their
legal guardians at a meeting between experimenters, legal
guardians, and teachers.
Protocol
Several assessments of cognitive functioning and academic
achievement were administered, as described below. We
also collected spontaneous memory reports with different
autobiographical time spans. The experimenter first inter-
viewed each child individually, explaining the experiment
Natália Bezerra Mota et al.
Table 1
Mathematical Definition and Psychological Interpretation of Speech Graph Attributes (SGA)

N (nodes)
    Mathematical definition: Number of nodes.
    Psychological interpretation: Number of different words; measures lexical diversity.

E (edges)
    Mathematical definition: Number of edges.
    Psychological interpretation: Number of links between words.

RE (repeated edges)
    Mathematical definition: Sum of all edges linking the same pair of nodes.
    Psychological interpretation: Number of links between two words; measures recurrence.

PE (parallel edges)
    Mathematical definition: Sum of all parallel edges linking the same pair of nodes given that the source node of an edge is the target node of the parallel edge.
    Psychological interpretation: Number of links between two words with opposite directions; measures recurrence.

L1 (loop of one node)
    Mathematical definition: Sum of all edges linking a node with itself, calculated as the trace of the adjacency matrix.
    Psychological interpretation: Number of repetitions of the same word in sequence; measures recurrence.

L2 (loop of two nodes)
    Mathematical definition: Sum of all loops containing two nodes, calculated by the trace of the squared adjacency matrix divided by two.
    Psychological interpretation: Number of sequences of two different words; measures recurrence.

L3 (loop of three nodes)
    Mathematical definition: Sum of all loops containing three nodes (triangles), calculated by the trace of the cubed adjacency matrix divided by three.
    Psychological interpretation: Number of sequences of three different words; measures recurrence.

LCC (largest connected component)
    Mathematical definition: Number of nodes in the maximal subgraph in which all pairs of nodes are reachable from one another in the underlying undirected subgraph.
    Psychological interpretation: Number of different words in the largest component in which all the words are connected by a path of edges; measures how well connected the words of the report are.

LSC (largest strongly connected component)
    Mathematical definition: Number of nodes in the maximal subgraph in which all pairs of nodes are reachable from one another in the directed subgraph (node a reaches node b, and b reaches a).
    Psychological interpretation: Number of different words in the largest component in which all the words are mutually connected by a path of edges; measures how well connected the words of the report are.

ATD (average total degree)
    Mathematical definition: Given a node n, the total degree is the sum of “in and out” edges. Average total degree is the sum of total degree of all nodes divided by the number of nodes.
    Psychological interpretation: Given the word X, total degree is how many links this word has with any other words in the report. ATD is the average total degree of all words in the report.

Density
    Mathematical definition: Number of edges divided by possible edges, D = 2E / (N(N − 1)), where E is the number of edges and N is the number of nodes.
    Psychological interpretation: Number of direct word links divided by all the possible word links (using all the different words in the report).

Diameter
    Mathematical definition: Length of the longest shortest path between the node pairs of a network.
    Psychological interpretation: Length (in words) of the path linking the most distant pair of words in the report.

ASP (average shortest path)
    Mathematical definition: Average length of the shortest path between pairs of nodes of a network.
    Psychological interpretation: Average of all the shortest paths between every pair of words in the report.

CC (average clustering coefficient)
    Mathematical definition: Given a node n, the clustering coefficient map (CCMap) is the set of fractions of all n neighbors that are also neighbors of each other. Average CC is the sum of the clustering coefficients of all nodes in the CCMap divided by number of elements in the CCMap.
    Psychological interpretation: Given the word X, CC of X is a measure of how many words directly linked to word X are also directly linked to each other. The average CC is the average CC of all different words in the report.
and then collecting declarative memory reports compris-
ing long-term autobiographical memories—LTM (based on
events occurring in the preceding days to years) and STM
reports (based on events occurring immediately beforehand).
The interview began with questions regarding the former
(LTM): “Please, tell me your oldest memory. When did
it happen? How old were you?” and then: “Please, tell me
how was your day yesterday,” then: “Please, tell me a dream
you had. When did it happen?” and finally: “Please, tell me
the events on the day before that dream.” Next we asked
questions to assess STM reports. We showed three affective
images (one positive, one negative, and one neutral) from
the International Affective Picture System (IAPS) database
validated in children (Lang, Greenwald, Bradley, & Hamm,
1993). After seeing each image for 15 s, the computer screen
used for the presentation was turned off, and the children
were asked to report a narrative regarding what was happen-
ing in that image. All reports were limited to a maximum of
30 s.
After collecting these memory reports, we applied stan-
dard ToM tests called the Sally–Anne task (Baron-Cohen,
Leslie, & Frith, 1985) and three cartoons of the picture
sequence test (PST; Baron-Cohen, Leslie, & Frith, 1986),
comprising a total of four tests of ToM abilities. On the
Sally–Anne test, the experimenter (NBM) used a computer
screen to show a story to the subject, and in the end she asked
a question to probe whether the subject differentiates his/her
own beliefs from the character’s beliefs (Baron-Cohen et al.,
1985). On PST, the experimenter asked the child to organize
a cartoon story in the correct sequence, and then report the
resulting story within 30 s. As in the Sally–Anne test, under-
standing of the correct picture sequence requires that the
subject understands that his own beliefs are different from
the character’s beliefs (Baron-Cohen et al., 1986). In addi-
tion, the PST provided another three STM reports.
We scored each of the four answers on the Sally–Anne
test as correct or incorrect, and entered an accuracy score of
0%, 25%, 50%, 75%, or 100% for each participant. On a sub-
sequent visit, 2–8 weeks later, we administered the RAVEN
Progressive Matrices test (Angelini, Alves, Custódio, Duarte,
& Duarte, 1999; Raven, 1936) to collect intelligence quotient
(IQ) data, scored for each child as the percentile corrected
by age. Memory reports, ToM, and RAVEN measurements
were sampled during August and September 2014 (right
after school vacations). Finally, we assessed the students’
scores on the standard national Brazilian test on Math
and Reading, called Provinha Brasil, which is the official
academic evaluation applied by the Ministry of Education
throughout the entire country. Each test is composed of
20 multiple-choice questions. The Reading test assessed
knowledge of grapheme–phoneme correspondence and text
comprehension, whereas the Math test presented questions
about absolute quantities, basic arithmetic operations,
and recognition of geometrical shapes. The academic
tests were administered during three different periods:
December 2013, right before the beginning of the school
year; June 2014, right before school vacations; and December
2014.
Graph Analysis
The memory reports were fully transcribed to a text file that
included all the words spoken by the subject within the 30-s
limit. Whenever the child stopped the report short of the
limit, the interviewer prompted the subject to talk more. In
these cases, the ensuing words spoken by the subject were
transcribed on another line of the text. Declarative mem-
ory reports comprised a concatenation of all the memory
reports (“oldest memory,” “memory from yesterday,” “mem-
ory of a dream,” “memory from the day before the dream,”
IAPS pictures and PST). For STM reports, we concatenated
the IAPS pictures and PST reports. For LTM reports, we
concatenated the answers for the questions regarding “oldest
memory,” “memory from yesterday,” “memory of a dream,”
and “memory from the day before the dream.” The con-
catenated text files were represented as graphs using the
free software SpeechGraphs (Mota et al., 2014; available at
http://neuro.ufrn.br/softwares/speechgraphs).
In summary, for LTM we used the answers to the following
four questions, concatenated into one text file:
1. “Please, tell me your oldest memory”;
2. “Please, tell me how was your day yesterday”;
3. “Please, tell me a dream you had”;
4. “Please, tell me the events on the day before that dream.”
For STM we used the following six reports concatenated
as one text file:
1. Description of three affective images from IAPS database
(one negative, one positive, and one neutral image);
2. Description of three cartoon stories, each made of four
pictures.
For declarative memory, we combined all the reports (4
LTM + 6 STM) as one text file.
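The concatenation scheme summarized above can be sketched in a few lines of Python. This is only an illustration: the dictionary keys, the helper names, and the placeholder transcripts are hypothetical, and the sketch assumes each report is written on its own line of the resulting text.

```python
# Hypothetical labels for the four LTM questions and the six STM reports.
LTM_KEYS = ("oldest_memory", "yesterday", "dream", "day_before_dream")
STM_KEYS = ("iaps_positive", "iaps_negative", "iaps_neutral",
            "pst_1", "pst_2", "pst_3")


def concatenate(reports, keys):
    """Join the selected transcripts, one per line, skipping empty ones."""
    selected = (reports.get(key, "").strip() for key in keys)
    return "\n".join(text for text in selected if text)


def build_text_files(reports):
    """Return the three concatenated texts described in the Methods."""
    return {
        "LTM": concatenate(reports, LTM_KEYS),
        "STM": concatenate(reports, STM_KEYS),
        "declarative": concatenate(reports, LTM_KEYS + STM_KEYS),
    }


# Placeholder transcripts standing in for one child's ten reports.
reports = {key: f"transcript of {key}" for key in LTM_KEYS + STM_KEYS}
texts = build_text_files(reports)
print(len(texts["declarative"].splitlines()))
```

Keeping each report on a separate line matters for the graph construction, because each line of the text file is treated as its own graph component.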
A graph is a mathematical representation of a network
with nodes linked by edges, formally defined as $G = (N, E)$,
with the set of nodes $N = \{w_1, w_2, \ldots, w_n\}$ and the set of
edges $E = \{(w_i, w_j)\}$ (Bollobas, 1998; Börner, Sanyal, & Vespignani,
2007). A speech graph represents the sequential relationship
of spoken words in a verbal report, with each word
represented as a node, and the sequence between succes-
sive words represented as a directed edge (Figure 1; Mota
et al., 2012, 2014). Each line or paragraph in the text file
represents a graph component. If the components share the
same words, those components become linked as a larger
Fig. 1. Memory reports represented as graphs. (a) Experiment design timeline. Note that the academic performance tests were repeated
at three different time points. (b) Example of a graph from a single memory report and illustrative examples of graph attributes (general
attributes: N = nodes, E = edges; recurrence attributes: RE = repeated edges, PE = parallel edges, L1 = loops of one node, L2 = loops of
two nodes, and L3 = loops of three nodes; connectivity attributes: LCC = largest connected component, LSC = largest strongly connected
component; for a detailed explanation see Table 1). (c) Graph examples from single memory reports of two representative subjects with
high and low cognitive performance.
Fig. 2. Similar correlations between intelligence quotient (IQ) and theory of mind (ToM) performances and graph attributes
(medium-sized graphs, 50 words) from declarative memory. (a) Correlations between IQ performance and nodes, repeated edges (RE),
parallel edges (PE), and largest connected component (LCC; R and p values indicated). Tertile comparison between low, medium, and
high IQ performance groups (RE: high < low IQ, p = .0002). (b) Correlations between ToM performance and nodes, RE, PE, and LCC (R
and p values indicated). Tertile comparison between low, medium, and high ToM performance groups (RE: high < low ToM, p = .0009; PE:
high < low ToM, p = .0002, medium < low ToM, p = .0016).
graph component (Figure 1). A total of 14 speech graph
attributes were calculated for each text file: general graph
attributes that count elements (N = nodes and E = edges);
recurrence measures that count repetitions of links between
nodes and cycles of nodes in the graph (PE = parallel edges,
RE = repeated edges, and L1, L2, and L3 = loops of one, two,
and three nodes); connectivity measures that count the
nodes connected by some path of edges, either regardless
of directionality (LCC = largest connected component) or
respecting it (LSC = largest strongly connected component);
and global attributes that quantify topological features
characterizing complex graphs (ATD = average total degree,
density, diameter, ASP = average shortest path, and
CC = average clustering coefficient; for detailed information
about graph attributes, see Figure 1 and Table 1).
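The graph construction and the two connectivity attributes can be illustrated with a short Python sketch using only the standard library. It is a simplified reading of the description above, not the SpeechGraphs code: the whitespace tokenizer and the function names are assumptions, each line of the text yields its own chain of directed edges, and lines become linked only when they share words.

```python
from collections import defaultdict, deque


def build_speech_graph(text):
    """Return (nodes, edges); edges link consecutive words within each line."""
    nodes, edges = set(), []
    for line in text.splitlines():
        words = line.lower().split()
        nodes.update(words)
        edges.extend(zip(words, words[1:]))
    return nodes, edges


def reachable(start, neighbours):
    """Breadth-first search: every node reachable from start, including itself."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def connectivity(text):
    """N, E, LCC (direction ignored) and LSC (direction respected)."""
    nodes, edges = build_speech_graph(text)
    forward, backward, undirected = (defaultdict(set) for _ in range(3))
    for source, target in edges:
        forward[source].add(target)
        backward[target].add(source)
        undirected[source].add(target)
        undirected[target].add(source)
    lcc = max((len(reachable(n, undirected)) for n in nodes), default=0)
    # A node's strongly connected component is the set of nodes it can
    # reach that can also reach it back.
    lsc = max((len(reachable(n, forward) & reachable(n, backward))
               for n in nodes), default=0)
    return {"N": len(nodes), "E": len(edges), "LCC": lcc, "LSC": lsc}


result = connectivity("the dog saw the cat\nthe cat ran away\na bird sang")
print(result)
```

In this toy text the first two lines share the words "the" and "cat" and therefore merge into one component of six words, whereas the third line stays isolated. Only "the", "dog", and "saw" lie on a directed cycle, so the strongly connected component is smaller than the connected one.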
A moving window analysis of graph attributes was per-
formed using windows with length of 50 words and 90%
overlap from one window to the next, which means that for
the first 50 words we generated a graph, then jumped five
words to again count 50 words and thus generate the next
graph, and so on. The average graph attributes of all graphs
of 50 words for each text file were calculated and used for
statistical analysis. Illustrative examples of isolated memory
reports from two subjects (one with high cognitive perfor-
mance and one with low cognitive performance) are shown
in Figure 1c.
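The windowing procedure can be sketched as follows, again in Python. The sketch assumes that only full windows are kept and that reports shorter than one window yield no value; the text does not state how such cases were handled, and the function names and the example attribute (the number of different words, N) are chosen for illustration only.

```python
def moving_windows(words, length=50, overlap=0.9):
    """Full windows of `length` words; 90% overlap gives a step of 5 words."""
    step = max(1, round(length * (1 - overlap)))
    last_start = len(words) - length
    return [words[i:i + length] for i in range(0, last_start + 1, step)]


def mean_attribute(words, attribute, length=50, overlap=0.9):
    """Average an attribute over all windows; None if no full window fits."""
    windows = moving_windows(words, length, overlap)
    if not windows:
        return None
    return sum(attribute(window) for window in windows) / len(windows)


def lexical_diversity(window):
    """Number of different words in the window (attribute N)."""
    return len(set(window))


# Sixty distinct placeholder tokens give windows starting at words 0, 5, 10.
words = [f"w{i}" for i in range(60)]
windows = moving_windows(words)
mean_n = mean_attribute(words, lexical_diversity)
print(len(windows), mean_n)
```

Averaging over fixed-length windows keeps the attributes comparable across children who produced reports of different lengths, since every graph entering the average has the same number of words.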
Statistical Analysis
Statistical analyses were performed using Matlab software
(MathWorks, Natick, MA, United States). Pearson correlations
were used to investigate the relationship between graph
attributes and the different measures of cognitive and aca-
demic performance (IQ, ToM, Reading, and Math tests).
We also defined tertiles for each level of performance (low,
medium, and high) for each of the four assessments, and
Table 2
Statistical Analysis: Cognitive Performances and Graph Attributes From Declarative Memories

                                        Pearson correlation     t-Test
Cognitive test            SGA           R         p-Value       Comparison      p-Value
IQ                        N             0.36      .0014         —               >.0042
                          RE            −0.40     .0004         Low × high      .0002
                          PE            −0.43     .0001         —               >.0042
                          LCC           0.40      .0005         —               >.0042
ToM                       N             0.35      .0022         —               >.0042
                          RE            −0.40     .0003         Low × high      .0009
                          PE            −0.45     <.0001        Low × medium    .0016
                                                                Low × high      .0002
                          LCC           0.34      .0023         —               >.0042
Reading June 2014         LCC           0.33      .0041         —               >.0042
                          LSC           0.37      .0012         —               >.0042
Reading December 2014     LSC           0.35      .0023         —               >.0042

IQ = intelligence quotient; LCC = largest connected component; LSC = largest strongly connected component; N = nodes; PE = parallel edges; RE = repeated edges;
SGA = speech graph attributes; ToM = theory of mind.
compared graph attributes across tertiles using Student’s
t-test. Correction for multiple comparisons using the Bonferroni
method included three different memory report
types (declarative, STM, and LTM) and four cognitive assessments
(IQ, ToM, Reading, and Math), totaling 12 comparisons
(corrected α = 0.05/12 ≈ 0.0042). Multiple linear regressions of
Reading with IQ, ToM, LCC, and LSC were calculated using
the MATLAB function regress.
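The arithmetic behind the corrected threshold, together with the correlation and tertile steps, can be sketched in Python. The analyses reported here were run in Matlab; the code below is only an illustrative equivalent with hypothetical function names, and the way ties are assigned to tertiles is an assumption.

```python
import math


def bonferroni_alpha(alpha, n_report_types, n_assessments):
    """Divide alpha by the number of comparisons (3 x 4 = 12 in this study)."""
    return alpha / (n_report_types * n_assessments)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


def tertile_groups(scores):
    """Indices of subjects in the low, medium, and high thirds by score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = len(order)
    return {
        "low": order[: n // 3],
        "medium": order[n // 3: 2 * n // 3],
        "high": order[2 * n // 3:],
    }


alpha = bonferroni_alpha(0.05, n_report_types=3, n_assessments=4)
groups = tertile_groups([10, 40, 20, 60, 30, 50])  # made-up scores
print(round(alpha, 4), groups)
```

The graph attributes of the subjects in two tertiles would then be compared with a t-test and the resulting p value checked against the corrected threshold.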
RESULTS
When we analyzed the entire set of declarative memory
reports, we found significant positive correlations between
cognitive performance (IQ and ToM performance) and
nodes (IQ: R= 0.36, p= .0014; ToM: R= 0.35, p= .0022) and
LCC (IQ: R= 0.40, p= .0005; ToM: R= 0.34, p= .0023). We
also found negative correlations with RE (IQ: R=−0.40,
p= .0004, high< low IQ, p= .0002; ToM, R=−0.40,
p= .0003, high< low ToM, p= .0009) and PE (IQ: R=−0.43,
p= .0001; ToM: R=−0.45, p< .0001, medium< low ToM,
p= .0016, high< low ToM, p= .0002; Figure 2, Tables 2 and
S1). In summary, children who reported their declarative
memories with a larger number of different words, and with
more connections among them and fewer repetitions of
word–word associations performed better on IQ and ToM
tests. As expected, IQ and ToM were positively correlated
(R= 0.48, p< .0001).
To determine whether the correlation between verbal
reports and ToM performance was because of the pres-
ence of PST reports (the ToM task) in the text file, we
also performed the correlations using either graphs from
all the memory reports except PST, or graphs made exclu-
sively from PST reports. Notably, there were significant
correlations between verbal reports and ToM performance
even when excluding those derived from the PST (Table S2).
Thus, the relationship holds for several kinds of memory
reports.
Regarding school achievement, we found significant pos-
itive correlations between Reading performance and LCC
(R= 0.33, p= .0041) and LSC (R= 0.37, p= .0012) on the second
test (June 2014). Notably, LSC predicted Reading performance
3–4 months later (R= 0.35, p= .0023, third time point
in December 2014; Figure 3, Tables 2 and S1). We calculated
score differences between December 2013/June 2014,
December 2013/December 2014, and June 2014/December
2014 to estimate gains, but found no significant correlations
between speech graph attributes and gains on either Reading
or Math performance (all p≥ .021, corrected α= 0.0042).
In general, the correlations with different cognitive perfor-
mances were preserved when performing graph analyses
using windows of different word lengths (small graphs of 10
words and large graphs of 100 words; Table S1). Thus, LCC
was concurrently related to Reading performance, and LSC
was both concurrently and longitudinally related to Reading
performance.
To assess how much of the Reading performance could
be jointly predicted by cognitive and graph measures, we
assessed multiple linear regressions of Reading with a
linear combination of IQ, ToM, and connectivity-related
graph attributes (Figure 3b). Explained variance ranged
from R² = 0.09 (p = .0151) in December 2013 to R² = 0.26
(p < .0001) in June 2014 and R² = 0.21 (p < .0001) in December
2014.
To test whether IQ or ToM mediate the relationships
between graph connectivity (LCC and LSC) and school per-
formance, we assessed the corresponding correlations with
or without adjustments for IQ or ToM. As shown in Figure 4
Fig. 3. School achievement and declarative memory graphs (medium-sized graphs, 50 words). (A) Correlations between connectivity
graph attributes (largest connected component [LCC] and largest strongly connected component [LSC]) and Reading performance on
June (second test) and December 2014 (third test; R2 and p values indicated). (B) Multiple linear regressions of Reading with IQ, ToM,
LCC, and LSC. Combination of these attributes on the x axis. Reading scores for each time point on the y axis.
and Table 3, we found that graph connectivity (LCC, LSC)
was correlated with Reading even after adjusting for IQ and
ToM. Conversely, IQ and ToM were correlated with Read-
ing even after adjusting for graph connectivity. However, the
correlation between ToM and Reading did not reach signif-
icance when adjusted for IQ, and the correlation between
Reading and IQ did not reach significance when adjusted for
ToM (Figure 4, Table 3).
When we distinguished between LTM and STM reports,
we found that STM correlations were stronger than LTM
correlations, but that in most cases their combination
yielded even stronger correlations than STM alone. A
comparison of Tables S1 (declarative= STM+LTM) and
S3 (STM vs. LTM) shows that the former displays overall
higher R values and lower p values; that is, the combination of STM
and LTM is more informative than STM alone. IQ and ToM
were positively correlated with Nodes, and the same two
measurements were negatively correlated with RE as well
as PE (Tables 4 and S3). In addition, there were significant
negative correlations between IQ performance and L3 and
between IQ and CC (Tables 4 and S3). Thus, children with
higher IQ scores reported memories with less recurrence
(loops of three nodes), and with less graph clustering.
Although graph attributes from memory reports correlate
significantly with IQ and ToM, altogether the results show
that the correlation of Reading with graph attributes cannot
be reduced to the correlations of Reading with either IQ
or ToM.
DISCUSSION
The results indicate that the children with better IQ, ToM,
and Reading performance report memory events with a
richer word repertoire (more nodes), more connections
among them (larger LCC and LSC), fewer repetitions of the
same associations (less RE and PE), overall reflecting richer
and more complex contents, in comparison with the chil-
dren with medium or lower performance. Graph connectiv-
ity correlated positively with Reading in June 2014 as well as
in December 2014. IQ and ToM were also correlated with
Reading but did not mediate the correlations between graph
connectivity and Reading, because these persisted even after
adjustment for IQ or ToM. Therefore, graph connectivity
provides additional explanatory and predictive power over
IQ and ToM.
As words are symbols that signify objects of the natural
and social world, the usage of a greater variety of words likely
Fig. 4. Diagram illustrating the significant and nonsignificant adjusted correlations of Reading with graph connectivity, intelligence
quotient (IQ) or theory of mind (ToM). (a) Largest connected component (LCC) and (b) largest strongly connected component (LSC).
Note that connectivity attributes correlate significantly with Reading even after adjusting for IQ or ToM (Table 3).
reflects a greater variety of things and concepts remembered,
stemming from more elaborate semantic memory. A richer
word repertoire can thus be understood as a greater capac-
ity to store and retrieve mnemonic associations, especially
considering that these nodes are also more connected to the
other nodes of the graph. This could imply a better strategy
for memory retrieval, and/or a more adaptive response to
new rules of the environment, leading to better cognitive
performance (Blakemore & Bunge, 2012; Sander et al., 2012).
More nodes may also reflect a larger vocabulary; indeed, this
result replicates a known relationship between vocabulary
and nonverbal IQ (Rice & Hoffman, 2015) as well as ToM
(Milligan, Astington, & Dack, 2007). Of note, an important
limitation of this study was the lack of assessment of general
linguistic abilities.
The children with better IQ, ToM, and school performance
not only had a richer word repertoire but also
showed more connections between words, keeping distant
parts of the verbal report connected by the reoccurrence of
certain words. In order to be meaningful, a report rich in new
information needs to have solid links among all the events
reported. In a previous study with a psychotic population,
the same connectivity attributes (LCC and LSC) were found
to be smaller in the memory reports of psychotic patients
than in the reports of controls. Importantly, these measure-
ments were negatively correlated with negative symptoms
(such as poor eye contact, emotional retraction, and social
isolation) and cognitive deficits (difficulty of understand-
ing abstract meanings; Mota et al., 2014). These results sug-
gest that our spontaneous capacity for reporting memories
with strong connections among events/elements is related
to our general cognitive capacity to interact with the external
world, so that this memory ability runs together with general
cognitive improvement, increasing with healthy cognitive
development and declining with psychopathological cogni-
tive deficits.
In addition, the children with higher IQ, ToM, and school
performance also showed less word recurrence. When a
report includes a large enough vocabulary, and the speaker
is able to make a linear trajectory comprising different memory
events/elements, repetition of the same word association
is not necessary. Patients suffering from Alzheimer’s disease
tend to make more loops of three nodes (L3) than control
subjects when performing the verbal fluency test (which asks
subjects to name as many different animals as possible within
1 min; Bertola et al., 2014). In other words, the patients
repeat the same animal name after two different names,
which reflects a working memory deficit. It is thus conceivable
that less recurrence reflects a more developed working
memory. Impaired working memory leads to impaired
cognitive development (Alloway, Gathercole, et al., 2009;
Alloway, Rajendran, & Archibald, 2009), and reduced school
achievement even in children without cognitive impair-
ments (Alloway & Alloway, 2010; Alloway & Passolunghi,
2011). Consistent with this hypothesis, analysis of STM
reports revealed significant negative correlations between
cognitive measures (IQ or ToM) and recurrence-related
graph attributes (RE, PE), thus strengthening the notion that
the development of working memory contributes to the cognitive
and academic results.
In order to gain insight into the mechanisms related to
these correlations, and determine whether they are specif-
ically related to memory capacity or to language abilities
in general, we compared the results obtained with STM
and LTM reports. Both reflect episodic memories, but of
different kinds. STM reports are related to events that
have just occurred to the subject, while LTM reports are
related to past events with a time lag of days, months
Table 3
Correlation With Reading Performance. Results Adjusted for Cognitive Performance or SGA. Statistically Significant Differences Are
Shown in Boldface

                              Reading June 2014       Reading December 2014
Adjusted by      Variable     R         p-Value       R         p-Value
No adjustment    LCC          0.33      .0041         0.32      .0056
                 LSC          0.37      .0012         0.35      .0023
                 ToM          0.42      .0002         0.34      .0029
                 IQ           0.41      .0003         0.37      .0016
IQ               LCC          0.36      .0023         0.35      .0034
                 LSC          0.38      .0011         0.37      .0018
                 ToM          0.28      .0171         0.20      .0981
ToM              LCC          0.35      .0026         0.34      .0037
                 LSC          0.39      .0007         0.36      .0020
                 IQ           0.25      .0381         −0.02     .8977
LCC              ToM          0.42      .0002         0.34      .0041
                 IQ           0.40      .0005         0.36      .0021
LSC              ToM          0.43      .0002         0.34      .0038
                 IQ           0.40      .0005         0.36      .0019

IQ = intelligence quotient; LCC = largest connected component; LSC = largest strongly connected component; SGA = speech graph attributes; ToM = theory of
mind.
Table 4
Statistical Results Comparing STM Graph Attributes With Different Cognitive Performances

                                        Pearson correlation     t-Test
Cognitive test       Attribute (SGA)    R        p-Value        Comparison    p-Value
IQ                   N                  0.40     .0004          Low×High      .0032
                     RE                 −0.40    .0004          Low×High      .0029
                     PE                 −0.43    .0001          Low×High      .0019
                     L3                 −0.34    .0026          Low×High      .0025
                     LCC                0.36     .0018          -             >.0042
                     CC                 −0.37    .0012          Low×High      .0007
ToM                  N                  0.37     .0010          Low×High      .0029
                     RE                 −0.34    .0024          -             >.0042
                     PE                 −0.40    .0004          Low×High      .0018
Reading June 2014    LSC                0.34     .0031          -             >.0042

CC = average clustering coefficient; IQ = intelligence quotient; L3 = loop of three nodes; LCC = largest connected component; LSC = largest strongly connected
component; N = nodes; PE = parallel edges; RE = repeated edges; SGA = speech graph attributes; ToM = theory of mind
or years. Therefore, STM and LTM reports engage differ-
ent memory mechanisms: STM reflects short-term memo-
ries of novel images retrieved immediately after encoding,
while LTM depends on the recall of consolidated memory
traces related to first-person events. The results show that
the graph attributes correlated with cognition are mostly
those extracted from STM reports, not LTM reports. This
indicates that the correlations between declarative mem-
ory attributes and cognitive performance are likely driven
by STM.
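One common way to adjust a correlation for a third variable, as in the adjusted rows of Table 3, is the first-order partial correlation. The text does not state which adjustment procedure was used, so the sketch below is only an assumed illustration; the function names and the toy data are hypothetical and are not values from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data (hypothetical): a graph attribute, a reading score, a covariate.
attribute = [1, 2, 3, 4, 5]
reading = [1, 3, 2, 5, 4]
covariate = [1, 0, -2, 0, 1]

raw_r = pearson(attribute, reading)
adjusted_r = partial_correlation(attribute, reading, covariate)
```

If the adjusted coefficient stays close to the raw one, the covariate explains little of the shared variance; a large drop suggests that the covariate accounts for much of the original association.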
Individual differences in the ability to retrieve episodic
memories are large, reflecting differences in the devel-
opment and maturation of brain networks required for
episodic memory processing and executive functions
(Bunge & Wright, 2007; Ghetti & Bunge, 2012; Ghetti,
DeMaster, Yonelinas, & Bunge, 2010; Paz-Alonso et al.,
2013; Satterthwaite et al., 2013). While these neural changes
explain how we learn and recall, we are still far from translating
this knowledge to classroom education. The notoriously
challenging translation of neuroscience findings to the
school setting (Bruer, 1997) motivates the increasing inter-
est in diminishing this gap by improving education based
on scientific evidence across a range of disciplines, with a
focus on the interdisciplinary interaction between cognitive
psychology, animal behavior and brain research (Blakemore
& Bunge, 2012; Sigman, Pena, Goldin, & Ribeiro, 2014). Part
of the problem is related to the environmental complexity of
the naturalistic school setting, which determines differences
in learning behavior that are difficult to assess. For instance,
most of the approaches used to measure memory abilities
are based on artificially designed laboratory tests aimed at
isolating independent cognitive components.
Surprisingly, we did not find significant correlations
between speech structure and future gains in Reading or
Math. This is likely related to the different temporal nature
of the variables assessed, because speech structure was
measured only once, while academic gains were calculated
as the difference between Reading or Math scores sampled at different
time points. Alternatively, it is possible that the sample
size was too small to reach significance when multiple
factors were considered. A replication of our study using
different instruments for cognitive assessment is in order to
better understand the possible mediators of the relationship
between speech structure and academic performance.
Our exploratory study indicates that the graph analysis of
naturalistic memory reports generated by healthy children
in the school setting allows for an objective quantification
of memory development. To our knowledge, this is the first
study demonstrating that the structure of children’s memory
reports is linked to key psychological measures of cognition
and to formal academic achievement. These results provide
a proof of concept that objective speech analysis could be
a useful tool in schooling and clinical settings. Follow-up
studies should attempt to replicate the present results in a
larger sample, with a small number of preregistered analyses.
In the future, a better understanding of the relationship
between graphs from spontaneous memory reports and dif-
ferent memory components could lead to the development
of a quick, objective screening tool for assessing cognitive
functioning in children.
Acknowledgments—Work supported by Conselho Nacional
de Desenvolvimento Científico e Tecnológico (CNPq),
grants Universal 480053/2013-8, and Research Productivity
306604/2012-4 and 310712/2014-9; Coordenação de Aper-
feiçoamento de Pessoal de Nível Superior (CAPES) Projeto
ACERTA; Fundação de Amparo à Ciência e Tecnologia
do Estado de Pernambuco (FACEPE); FAPESP Center for
Neuromathematics (Grant number 2013/07699-0, S. Paulo
Research Foundation FAPESP); and UFRN. We thank the
Public Schools that allowed us access to the children and
to the school environment, and also helped the relationship
with the families; the Latin American School for Education,
Cognitive and Neural Sciences for fostering a rich intellec-
tual environment where our ideas could develop; Elizabeth
Spelke, Mitchell Nathan, and Nora Newcombe for helping
to design and interpret the experiment; two anonymous
reviewers for insightful comments on the manuscript;
the participants of project ACERTA for help during data
collection; Thiago Rivero for help with the choice of ToM
assessment; Debora Koshiyama for bibliographic support;
Pedro PC. Maia, Gabriel M. da Silva, and Jaime Cirne for
IT support. This article is dedicated to the memory of
Raimundo Furtado Neto (1973–2016).
SUPPORTING INFORMATION
Additional supporting information may be found in the
online version of this article:
Table S1. Similar results were obtained using different
graph sizes. The table shows R and P values of the Pear-
son correlation between Speech Graph Attributes using
small (10 words), medium (50 words) and large (100 words)
graphs. Significant correlations indicated in red.
Table S2. Pearson correlations between Speech Graph
Attributes and ToM performance using all declarative mem-
ory reports except PST versus only PST reports. Attributes
that show correlations with ToM using all declarative mem-
ory reports are shown in boldface; statistically significant dif-
ferences are shown in red.
Table S3. Pearson correlation (R and P values) between
Speech Graph Attributes and cognitive performance using
short-term memory (STM) or long-term memory (LTM)
reports. Red indicates statistically significant correlations.
REFERENCES
Alloway, T. P., & Alloway, R. G. (2010). Investigating the predic-
tive roles of working memory and IQ in academic attain-
ment. Journal of Experimental Child Psychology, 106, 20–29.
doi:10.1016/j.jecp.2009.11.003
Alloway, T. P., Gathercole, S. E., Kirkwood, H., & Elliott, J. (2009).
The cognitive and behavioral characteristics of children with
low working memory. Child Development, 80, 606–621.
doi:10.1111/j.1467-8624.2009.01282.x
Alloway, T. P., & Passolunghi, D. (2011). The relationship between
working memory, IQ, and mathematical skills in chil-
dren. Learning and Individual Differences, 21, 133–137.
doi:10.1016/j.lindif.2010.09.013
Alloway, T. P., Rajendran, G., & Archibald, L. M. (2009).
Working memory in children with developmental dis-
orders. Journal of Learning Disabilities, 42, 372–382.
doi:10.1177/0022219409335214
Angelini, A. L., Alves, I. C. B., Custódio, E. M., Duarte, W. F., &
Duarte, J. L. M. (1999). Matrizes Progressivas Coloridas de
Raven: Escala Especial. Manual. São Paulo, Brazil: CETEPP.
Baron-Cohen, S., Leslie, A. M., & Frith, U. (1985). Does the autis-
tic child have a “theory of mind”? Cognition, 21, 37–46.
doi:10.1016/0010-0277(85)90022-8
Baron-Cohen, S., Leslie, A. M., & Frith, U. (1986). Mechanical,
behavioural and intentional understanding of picture stories in
autistic children. British Journal of Developmental Psychology,
4, 113–125.
Bertola, L., Mota, N. B., Copelli, M., Rivero, T., Diniz, B. S.,
Romano-Silva, M. A., … Malloy-Diniz, L. F. (2014). Graph
analysis of verbal fluency test discriminate between patients
with Alzheimer’s disease, mild cognitive impairment and
normal elderly controls. Frontiers in Aging Neuroscience,
6(185). doi:10.3389/fnagi.2014.00185
Blakemore, S. J., & Bunge, S. A. (2012). At the nexus of neu-
roscience and education. Developmental Cognitive Neuro-
science, 2(Suppl. 1), S1–S5. doi:10.1016/j.dcn.2012.01.001
Bollobas, B. (1998). Modern graph theory. Berlin, Germany:
Springer-Verlag.
Börner, K., Sanyal, S., & Vespignani, A. (2007). Network science.
In B. Cronin (Ed.), Information today (pp. 537–607). Medford,
OR: ARIST.
Bruer, J. T. (1997). Education and the brain: A bridge too far.
Educational Researcher, 26(8), 4–16.
Bunge, S. A., & Wright, S. B. (2007). Neurodevelopmental changes
in working memory and cognitive control. Current Opinion in
Neurobiology, 17, 243–250. doi:10.1016/j.conb.2007.02.005
Ghetti, S., & Bunge, S. A. (2012). Neural changes underlying
the development of episodic memory during middle child-
hood. Developmental Cognitive Neuroscience, 2, 381–395.
doi:10.1016/j.dcn.2012.05.002
Ghetti, S., DeMaster, D. M., Yonelinas, A. P., & Bunge, S. A.
(2010). Developmental differences in medial temporal lobe
function during memory encoding. Journal of Neuroscience,
30, 9548–9556. doi:10.1523/JNEUROSCI.3500-09.2010
Harel, B. T., Pietrzak, R. H., Snyder, P. J., Thomas, E., Mayes,
L. C., & Maruff, P. (2014). The development of associate
learning in school age children. PLoS One, 9(7), e101750.
doi:10.1371/journal.pone.0101750
Johnson, E. L., Miller Singley, A. T., Peckham, A. D., Johnson, S. L.,
& Bunge, S. A. (2014). Task-evoked pupillometry provides a
window into the development of short-term memory capacity.
Frontiers in Psychology, 5, 218. doi:10.3389/fpsyg.2014.00218
Lang, P. J., Greenwald, M. K., Bradley, M. M., & Hamm, A. O. (1993).
Looking at pictures: Affective, facial, visceral, and behavioral
reactions. Psychophysiology, 30, 261–273.
Milligan, K., Astington, J. W., & Dack, L. A. (2007). Lan-
guage and theory of mind: Meta-analysis of the relation
between language ability and false-belief understanding.
Child Development, 78, 622–646.
doi:10.1111/j.1467-8624.2007.01018.x
Mota, N. B., Furtado, R., Maia, P. P., Copelli, M., & Ribeiro, S. (2014).
Graph analysis of dream reports is especially informative
about psychosis. Scientific Reports, 4, 3691.
doi:10.1038/srep03691
Mota, N. B., Vasconcelos, N. A., Lemos, N., Pieretti, A. C., Kinouchi,
O., Cecchi, G. A., … Ribeiro, S. (2012). Speech graphs provide
a quantitative measure of thought disorder in psychosis. PLoS
One, 7(4), e34928. doi:10.1371/journal.pone.0034928
Paz-Alonso, P. M., Bunge, S. A., Anderson, M. C., & Ghetti,
S. (2013). Strength of coupling within a mnemonic control
network differentiates those who can and cannot suppress
memory retrieval. Journal of Neuroscience, 33, 5017–5026.
doi:10.1523/JNEUROSCI.3459-12.2013
Raven, J. C. (1936). Mental tests used in genetic studies: The
performance of related individuals on tests mainly educa-
tive and mainly reproductive (MSc thesis). University of
London, London, UK.
Rice, M. L., & Hoffman, L. (2015). Predicting vocabulary growth
in children with and without specific language impairment:
A longitudinal study from 1∕2 to 21 years of age. Journal
of Speech, Language, and Hearing Research, 58, 345–359.
doi:10.1044/2015_JSLHR-L-14-0150
Sander, M. C., Werkle-Bergner, M., Gerjets, P., Shing, Y. L., & Lin-
denberger, U. (2012). The two-component model of memory
development, and its potential implications for educational
settings. Developmental Cognitive Neuroscience, 2(Suppl. 1),
S67–S77. doi:10.1016/j.dcn.2011.11.005
Satterthwaite, T. D., Wolf, D. H., Erus, G., Ruparel, K., Elliott,
M. A., Gennatas, E. D., … Gur, R. E. (2013). Functional
maturation of the executive system during adolescence.
Journal of Neuroscience, 33, 16249–16261.
doi:10.1523/JNEUROSCI.2345-13.2013
Sigman, M., Pena, M., Goldin, A. P., & Ribeiro, S. (2014). Neuro-
science and education: Prime time to build the bridge. Nature
Neuroscience, 17, 497–502. doi:10.1038/nn.3672
Titz, C., & Karbach, J. (2014). Working memory and executive functions:
Effects of training on academic achievement. Psychological
Research, 78, 852–868. doi:10.1007/s00426-013-0537-1
Wages, Ministry of Planning, Budget and Management, Brazil.
(2014). Instituto Brasileiro de Geografia e Estatística (IBGE).
Retrieved from
http://www.ibge.gov.br/home/estatistica/indicadores/trabalhoerendimento/pnad_continua_mensal/default.shtm
Supplementary Table 1: Similar results were obtained using different graph 
sizes. The table shows R and P values of the Pearson correlation between 
Speech Graph Attributes using small (10 words), medium (50 words) and large 
(100 words) graphs. Significant correlations indicated in red. 
 
 
 
Supplementary Table 2. Pearson correlations between Speech Graph Attributes and 
ToM performance using all declarative memory reports except PST versus only 
PST reports. Attributes that show correlations with ToM using all declarative 
memory reports are shown in boldface; statistically significant differences are 
shown in red. 
 
Pearson correlation, SGA × ToM

              Without PST           PST only
SGA           p value    R          p value    R
Nodes         0.0617     0.22       0.0085     0.31
Edges         0.2628     0.13       0.0927     0.20
RE            0.0099     -0.30      0.0091     -0.30
PE            0.0030     -0.34      0.0033     -0.34
L1            0.2272     0.14       0.3146     0.12
L2            0.0860     -0.20      0.1493     -0.17
L3            0.7728     0.03       0.0018     -0.36
LCC           0.0361     0.24       0.0012     0.37
LSC           0.1820     0.16       0.0023     0.35
ATD           0.6687     -0.05      0.0812     -0.21
Density       0.5523     -0.07      0.0779     -0.21
Diameter      0.7798     0.03       0.1433     0.17
ASP           0.5233     0.07       0.0679     0.21
CC            0.8419     -0.02      0.0035     -0.34
 
  
Supplementary Table 3: Pearson correlation (R and P values) between Speech Graph 
Attributes and cognitive performance using short-term memory (STM) or long-term 
memory (LTM) reports. Red indicates statistically significant correlations. 
 
 
 
 
PROSPECTS
Comparative Journal of Curriculum,
Learning, and Assessment
 
ISSN 0033-1538
 
Prospects
DOI 10.1007/s11125-017-9393-x
Physiology and assessment as low-hanging
fruit for education overhaul
Sidarta Ribeiro, Natália Bezerra
Mota, Valter da Rocha Fernandes,
Andrea Camaz Deslandes, Guilherme
Brockington & Mauro Copelli
OPEN FILE
Physiology and assessment as low-hanging fruit
for education overhaul
Sidarta Ribeiro1 • Natália Bezerra Mota1 •
Valter da Rocha Fernandes2 • Andrea Camaz Deslandes3 •
Guilherme Brockington4 • Mauro Copelli5
© UNESCO IBE 2017
Abstract Physiology and assessment constitute major bottlenecks of school learning
among students with low socioeconomic status. The limited resources and household
overcrowding typical of poverty produce deficits in nutrition, sleep, and exercise that
strongly hinder physiology and hence learning. Likewise, overcrowded classrooms hamper
the assessment of individual learning with enough temporal resolution to make individual
interventions effective. Computational measurements of learning offer hope for low-cost,
fast, scalable, and yet personalized academic evaluation. Improvement of school schedules
by reducing lecture time in favor of naps, exercise, meals, and frequent automated
assessments of individual performance is an easily achievable goal for education.
Keywords Sleep · Nutrition · Exercise · Assessment · Learning
This work was supported by Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Grants Universal 480053/2013-8, Human Sciences 409494/2013-5, and Research Productivity
306604/2012-4 and 310712/2014-9; ACERTA Project from Coordenação de Aperfeiçoamento de Pessoal de
Nível Superior (CAPES); Fundação de Amparo à Ciência e Tecnologia do Estado de Pernambuco
(FACEPE); JCNE fellowship and research support grant 32/2014 from Fundação de Amparo à Ciência e
Tecnologia do Estado do Rio de Janeiro (FAPERJ); and FAPESP Center for Neuromathematics Grant
2013/07699-0, São Paulo Research Foundation (FAPESP). We thank Debora Koshiyama for library support.
Corresponding author: Sidarta Ribeiro
sidartaribeiro@neuro.ufrn.br
1 Instituto do Cérebro, Universidade Federal do Rio Grande do Norte, Natal, Brazil
2 Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
3 Programa de Pós Graduação em Ciências do Exercício e do Esporte, Universidade do Estado do
Rio de Janeiro, Rio de Janeiro, Brazil
4 Departamento de Ciências Exatas e da Terra, Universidade Federal de São Paulo, São Paulo, Brazil
5 Departamento de Física, Universidade Federal de Pernambuco, Recife, Brazil
Governmental, economic, political, academic, and religious agents agree that the solution
for the major social problems of the world lies in the improvement and dissemination of
education. Unfortunately, however, schooling is still of very low quality in most devel-
oping countries and faulty even in some wealthy countries (UNESCO 2011; OECD
2014, 2016).
Schools in communities with low socioeconomic status (SES) suffer academic deficits
both when teaching occurs and when learning is assessed. Low-income families most often
cannot provide adequate sleep, nutrition, or exercise to their members. According to the
United Nations Human Settlements Programme (UN-HABITAT), over one billion people
around the world inhabit slums (UN-HABITAT 2007), and by 2030 this number is likely to
double (UN-HABITAT 2003).
Material and cultural poverty make evident that biology precedes psychology in school
learning. Furthermore, schools in low-income communities typically cannot compensate
for these problems, due to budget underfunding, classroom overcrowding, and underpaid
staff. For the same reasons, schools most often fail to provide personalized attention to the
students.
We propose that major improvement of schooling in the developing world, as well as in
underdeveloped areas within the wealthy nations, can result from a school-centered
reorganization of activities so as to overcome the physiological bottlenecks that hamper the
health of children, derived from biological deficits due to inadequate sleep, nutrition, and
exercise (Sigman, Peña, Goldin, and Ribeiro 2014). We also argue that computational
tracking of students’ learning-related verbal and written expressions may provide scalable,
fast, low-cost solutions to improve individualized assessment of education outcomes in
low-SES communities.
Sleep
In the U.S., nearly 30% of the adult population suffers from insufficient sleep (CDC 2013).
Sleep problems are associated with obesity (Gupta, Mueller, Chan, and Meininger 2002;
Knutson 2011; Jarrin, McGrath, and Drake 2013), poor nutrition (Beebe, Simon, Summer,
Hemmer, Strotman, and Dolan 2013; Grandner, Jackson, Gerstner, and Knutson 2013;
Hogenkamp et al. 2013), and increased cardiovascular risk (Buxton and Marcelli 2010). A
cross-sectional study of 1101 Brazilian adult subjects (20–80 years old) found a depression
prevalence of 10.9%, which was significantly higher among housewives, unemployed
individuals, and those with low income and education (Castro et al. 2013).
Decreased duration and quality of sleep may mediate the negative impact on health due
to socioeconomic disadvantage (Van Cauter and Spiegel 1999). The invention of electric
light and then of a myriad of electronic devices has led to a substantial decrease in
sleep time around the world. Average sleep duration is estimated to have dropped from 9 h
in 1910 to 7.5 h just 65 years later (Webb and Agnew 1975). Artificial light has effects that
superimpose on those produced by the natural light–dark cycle, possibly causing a
misalignment of the circadian rhythms. Researchers investigated the effects on sleep—
related to having or not having electricity—in 37 Brazilian adolescents (11–16 years old)
using actigraphy for 5 consecutive days. Students without electricity at home showed
significantly earlier sleep onset on school days (Peixoto et al. 2009). A study of 340 adult
rubber tappers living in a remote region of the Amazon rainforest, most of whom had no
electricity at home, found that the availability of electric light was associated with delayed
melatonin increase, delayed sleep onset, and reduced sleep duration during workdays
(Moreno et al. 2015). Not surprisingly, the daily schedule of activities has a major impact
on sleep quality. A study of Brazilian medical students (n = 27) found that later
class-start times were associated with better sleep quality and longer sleep duration
(Lima, Medeiros, and Araujo 2002).
How are sleep problems related to living under social and physical stress in low-SES
communities? A longitudinal survey of 11,838 adolescents (10–18 years old) found that
hopelessness and exposure to violence produce negative independent and multiplicative
impacts on adolescent sleep, particularly for females (Umlauf, Bolland, Bolland, Tomek,
and Bolland 2015). To investigate how SES affects sleep habits in U.S. preschoolers,
researchers assessed 3217 children (~3 years old) for the presence, time, and consistency
of bedtime routines. Their study associated low maternal education, overcrowded house-
hold, and poverty with worse bedtime routines (Hale, Berger, LeBourgeois, and Brooks-
Gunn 2009).
Sleep reduction is much more pronounced for low-SES individuals, reaching as low as
3.8 h in some occupations (Bliwise 1996; Bonnet and Arand 1995; Broman, Lundh, and
Hetta 1996; Mitler, Miller, Lipsitz, Walsh, and Wylie 1997). The adverse conditions that
lead to sleep problems comprise an unsafe environment, overcrowded sleep rooms,
uncomfortable housing conditions (temperature, sound, etc.), as well as stress and anxiety.
A longitudinal cohort study of 1405 Finnish adults in the 1980s and 1990s showed that sleep
quality was somewhat preserved during the severe economic recession of the 1990s, except
in the case of low-SES unemployed individuals, who showed more insomnia, use of
hypnotics, and other signs of decreased sleep quality (Hyyppa, Kronholm, and Alanen
1997).
Investigators looked at the relationship between sleep problems and academic
achievement in 280 students (8–10 years old) from U.S. public schools. They assessed
sleep with actigraphy during 7 consecutive nights, measuring sleep efficiency as the
percentage of epochs scored as “sleep” between sleep onset and offset. Intelligence and
academic achievement were highly correlated across a wide span of sleep quality, but
in highly intelligent children this correlation decreased with low sleep efficiency or with
fewer long-duration sleep episodes (Erath, Tu, Buckhalt, and El-Sheikh 2015). This result
suggests that sleep problems may hinder the academic potential of even the most intelligent
children.
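The sleep efficiency measure described above can be illustrated with a short Python sketch. It assumes a hypothetical input format (one label per actigraphy epoch, either "sleep" or "wake") and takes sleep onset and offset to be the first and last epochs scored as sleep; actual scoring algorithms may define these boundaries differently.

```python
def sleep_efficiency(epochs):
    """Percentage of epochs scored as 'sleep' between sleep onset and offset.

    `epochs` is a list of labels, one per actigraphy epoch. Onset and
    offset are assumed to be the first and last 'sleep' epochs.
    """
    sleep_positions = [i for i, label in enumerate(epochs) if label == "sleep"]
    if not sleep_positions:
        return 0.0  # no sleep detected during the recording
    onset, offset = sleep_positions[0], sleep_positions[-1]
    window = epochs[onset:offset + 1]
    return 100.0 * window.count("sleep") / len(window)


# Hypothetical night: one awakening between sleep onset and offset.
night = ["wake", "sleep", "sleep", "wake", "sleep", "wake"]
efficiency = sleep_efficiency(night)
```

Here the window from onset to offset spans four epochs, three of which are scored as sleep, so nighttime awakenings directly lower the resulting percentage.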
For all we know, the mechanisms linking low SES to bad academic performance may be
the same that connect low SES to poor health. Typically, low-SES families inhabit small
and overcrowded residences in which beds are shared, and sleep quality is repetitively
disturbed due to differences in work and school schedules among family members. An
investigation of 1504 adults in the United States assessed the relationship between
perceived neighborhood disorder and psychological distress. As expected, neighborhoods
perceived as noisy, crime-ridden, and unclean were associated with lower sleep
quality and greater psychological distress, possibly as a causal chain of events (Hill,
Burdette, and Hale 2009). A study of 170 pregnant women associated a household income
of less than $50,000/year with reduced sleep quality and more sleep fragmentation, even
after statistical adjustments for covariates (Okun, Tolge, and Hall 2014). Sharing the
household with many individuals, particularly bed sharing, exposes children to sleep
disturbances and anxiety due to noise, movement, uncleanliness, and other factors, which
jointly have a negative impact on cognition (Liu, Liu, and Wang 2003; Solari and Mare
2012).
Many studies show that these conditions increase the number of nighttime awakenings,
decrease total sleep time, and produce chronic sleep debt. An investigation of 371 adult,
low-SES Latino residents of New York City revealed an association between home
crowding and reduced sleep duration. Poor sleep quality, with more arousals and longer
sleep latency, was associated with neighborhood disorder and perceived building prob-
lems—with compounded effects of negative housing and neighborhood conditions on sleep
outcomes (Chambers, Pichardo, and Rosenbaum 2014). A representative cross-sectional
survey of 8578 British subjects, ages 16–74, found strong independent connections
between sleep problems and four SES indices: household income, educational qualifica-
tions, living in rented housing, and being unemployed (Arber, Bote, and Meadows 2009).
An observational study of 150 adult slum dwellers from Buenos Aires, Argentina, before
and after relocation to better housing, showed very positive effects of housing upgrading
on sleep quality and quality of life (Simonelli et al. 2013).
Sleep problems during adolescence impact negatively on emotional balance and
self-regulation, increasing the chance of risky behaviors. Using actigraphy, daily diaries, and
questionnaires, a study evaluated 250 U.S. public high school students (mean age:
15.7 years) for sleep problems; these students were of low or middle SES (Matthews, Hall,
and Dahl 2014). Most students showed less sleep than the 8–9 h recommended by the
Centers for Disease Control and Prevention. Black students and male students showed less
sleep, with more fragmentation. Female students reported worse quality of sleep and more
daytime sleepiness. Results were significant even after adjustments for age, body mass
index, physical activity, and smoking status. Black male students showed the least amount
of sleep, which the authors hypothesized could be related to the increased risks suffered by
this cohort (Matthews, Hall, and Dahl 2014). A recent large-scale cross-sectional study of
20,222 undergraduate students from 27 universities in 26 low- or middle-income countries
across the Americas, Africa, and Asia showed that 10.4% of the subjects reported major
sleeping problems, with a wide variation (3.0–32.9%) among countries (Peltzer and
Pengpid 2015).
A very large-scale cross-sectional study of sleep problems using questionnaires was
carried out with 43,935 subjects (above 50 years old) from 8 low-income countries from
Africa and Asia: Ghana, Tanzania, South Africa, India, Bangladesh, Vietnam, Indonesia,
and Kenya (Stranges et al. 2012). Severe or extreme sleep problems afflicted 16.6% of the
subjects, with large variation across countries (from 3.9% in Indonesia and Kenya to 40.0%
in Bangladesh). The study found a consistent association of higher prevalence of sleep
problems with lower education, not living in partnership, and low quality of life. It
revealed independent correlations of sleep problems with limited physical functional-
ity/greater disability, and feelings of depression or anxiety (Stranges et al. 2012).
The social component can directly affect sleep deficits, because low-SES children often
must work to supplement the household income. An investigation of how work affects
sleep among adolescents (14–18 years old) found that working students (n = 16) woke up
earlier than nonworking students (n = 11) on regular working days, causing a significant
decrease in total nocturnal sleep duration (Teixeira et al. 2004). It found that SES nega-
tively correlates with health outcomes, leading to a health gradient across SES strata
(Teixeira et al. 2004). To examine whether a socioeconomic gradient also exists for sleep
features, another study assessed 239 Canadian children and adolescents (8–17 years old)
through self and parent reports. Several sleep measures showed socioeconomic gradients.
Evidence associated objective parental SES with sleep disturbances and subjective SES
with sleep quality and daytime sleepiness (Jarrin, McGrath, and Quon 2014).
In terms of mechanisms, it is no exaggeration to say that sleep deprivation impedes
learning. Laboratory studies clearly indicate that sleep plays a crucial role both before and
after the formation of new memories (Diekelmann and Born 2010; Mander, Santhanam,
Saletin, and Walker 2011; Stickgold 2005). The large body of evidence pointing to the
cognitive role of sleep has begun to motivate research in classrooms on the value of naps in
school learning. An investigation of the effect of classroom naps on spatial learning by
preschool children (n = 40, 36–67 months of age) showed nap-related gains 24 h after
learning (Kurdziel, Duclos, and Spencer 2013). We have recently demonstrated the ben-
eficial effect of naps for the retention of declarative memories acquired in school (Lemos,
Weissheimer, and Ribeiro 2014). A total of 584 children in the sixth grade (10–15 years
old) received a trial lesson and then were randomly assigned to continue awake or go to
sleep for up to 2 h. To assess learning, researchers gave surprise tests days or months after
class. The results showed very similar memory retention across sleep and wake groups
when the evaluation took place 24 h after class. However, 5 days after the class, only the
sleep group retained the cognitive gains. These results suggest that post-class naps can
increase the duration of the memories acquired in the school setting (Lemos, Weissheimer,
and Ribeiro 2014).
Further research must elucidate how to best use naps to aid learning. In particular, it is
key to parametrize the cognitive effects of nap duration, sleep-state composition of the nap,
and interactions with exercise and nutrition.
Nutrition
Food insecurity is associated with diabetes (Ding, Wilson, Garza, and Zizza 2014;
Seligman, Bindman, Vittinghoff, Kanaya, and Kushel 2007), obesity (Tayie and Zizza
2009), hypertension (Seligman, Laraia, and Kushel 2010), heart disease (Seligman, Laraia,
and Kushel 2010), hyperlipidemia (Seligman, Laraia, and Kushel 2010; Tayie and Zizza
2009), mental illness (Casey et al. 2004; Laraia, Siega-Riz, Gundersen, and Dole 2006),
and depression (Seligman, Laraia, and Kushel 2010; Whitaker, Phillips, and Orzol 2006).
A study of U.S. children and teenagers (6–16 years old) found negative psychosocial and
academic outcomes associated with food scarcity (Alaimo, Olson, and Frongillo 2001).
It should not be any surprise that nutritional state plays a preponderant role in learning.
The brain consumes about 60% of the glucose used up by the body. After a career testing
substances that can enhance learning in humans and animal models, neurobiologist Paul
Gold and colleagues found that one of the most effective is precisely glucose (Gold and
Korol 2012; McNay and Gold 2002). In an experiment conducted with college students,
ingestion of glucose led to increases of over 30% in participants’ capacity to memorize text
passages, in comparison with performance after ingestion of a control substance, the
sweetener saccharin (Korol and Gold 1998). The result suggests that the positive cognitive
effect of glucose intake is not simply due to its sweet flavor, of potential rewarding value,
but in fact to the extra calories ingested. This aligns with recent evidence of separate
circuits in mice for encoding the nutritional and hedonic values of sugar, with prioritization
of energy-seeking over taste quality (Tellez et al. 2016).
Yet, the mere increase in caloric intake may be insufficient to produce cognitive gains.
A study of the effects of fat-rich food on spatial learning in rats showed that animals fed
with a low-fat diet took four sessions of daily training to achieve optimal task performance,
while rats fed with a high-fat diet showed very slow learning: Even after 8 daily training
sessions, their performance was nearly three times worse than that of the low-fat diet group
(Valladolid-Acebes et al. 2011). Thus, as insistently pointed out by political figures such as
Michelle Obama, the poor quality of school meals, with an excess of fat, may be partly
responsible for the comparatively poor results for U.S. students in school performance
tests, vis-à-vis students of other developed countries.
There are important interactions between food security and sleep. To investigate this
relationship, 5637 men and 5264 women (all over 22 years old) were surveyed to obtain
self-reported information about sleep duration, sleep latency, and sleep complaints.
Women suffering from very low food security showed significantly shorter sleep duration
than women with full food security. Men undergoing food insecurity reported significantly
longer sleep latency than food-secure men (Ding, Keiley, Garza, Duffy, and Zizza 2015).
Researchers have yet to develop a school-based investigation of the acute effects of
nutrition on academic performance. It is necessary to conduct empirical research in the
classroom setting in order to quantify the cognitive impact of caloric intake, meal com-
position, and the role of micronutrients and hydration, as well as the effects of portion size,
food frequency and the reward value of food. Furthermore, interactions with sleep and
exercise must be assessed in detail.
Physical exercise
One of the most unhealthful consequences of living in excessively small houses is the lack
of space at home for stretching or exercising, adding to the lack of infrastructure for sports
in most low-income communities. Yet, exercise deprivation affects all SES strata:
Decreased physical activity levels and increased body mass indices for the whole popu-
lation have accompanied economic development, with dire human and economic costs (Ng
and Popkin 2012). The human body is genetically programmed to move, requiring physical
activity to maintain the best functionality of neurons and metabolism (Vaynman and
Gomez-Pinilla 2006; Deslandes et al. 2009). There is ample evidence that physical exercise
contributes to the prevention of cardiovascular and metabolic diseases (Fiuza-Luces,
Garatachea, Berger, and Lucia 2013), but its impact on cognition has been greatly
underestimated. Yet, in the past decade investigators have given increasing attention to the
topic (Chaddock, Pontifex, Hillman, and Kramer 2011; Diamond and Lee 2011; Haapala
et al. 2014; Masley et al. 2009).
Exercise can help improve specific cognitive functions not only in the elderly, but also in
children (Diamond 2013). Among the cognitive functions that benefit from an active
lifestyle, the most important are the executive functions, comprising inhibitory control,
planning, working memory, decision making and cognitive flexibility (Diamond 2013).
Among the brain regions involved in executive functioning, the prefrontal cortex (PFC)
is one of the most important and continues to develop until the third decade of life. This
extended development makes the PFC especially susceptible to the influence of the environment,
cognitive enhancement, and an active lifestyle (Halperin and Healey 2011).
Indeed, exercise contributes to increased activation of the hippocampus and frontal cortex,
respectively involved in the formation of new memories and in motor control (Diamond
2001). The search for mechanisms of the cognitive benefits of exercise occurs mostly in
animal models. In mice, research links voluntary exercise with an increase in the number of
new neurons in the hippocampus (Van Praag, Kempermann, and Gage 1999). Since then,
several studies have shown that, in addition to neurogenesis, exercise contributes to the
angiogenesis, synaptic plasticity, and the increased synthesis of trophic factors and neu-
rotransmitters (Duman 2005; Pereira et al. 2007; Van Praag 2009).
Mounting evidence points to a link between motor skills and overall academic
achievement. In preschoolers, an evaluation of datasets from three longitudinal studies
shows that fine motor skills are a strong predictor of later reading and math achievement
(Grissmer, Grimm, Aiyer, Murrah, and Steele 2010). In a recent systematic review, Van
der Fels et al. (2015) show a relationship between cognitive skills and complex motor skills
(fine motor skills, bilateral body coordination, and timed performance). To assess how
motor skills relate to academic achievement and cognition, we recently investigated 45
Brazilian children and adolescents (8–14 years old) (Fernandes et al. 2016), finding that
motor coordination is a good predictor of school performance. We found significant cor-
relations between motor coordination and several indices of cognitive function, which
indicate that visual motor coordination and visual selective attention may affect academic
achievement and cognitive function.
The relation between cardiorespiratory fitness and cognitive performance is also well
established (Berchicci, Pontifex, Drollette, Pesce, Hillman, and Di Russo 2015; Pontifex
et al. 2011; Voss et al. 2011). Exercises that develop aerobic capacity correlate with
enhanced executive functions, greater activation of PFC, and improved school perfor-
mance. Chaddock et al. (2010) showed that higher levels of aerobic fitness are associated
with a greater capacity to inhibit a maladaptive response in a selective attention task.
In the learning process, attention seems to be a crucial challenge to educators. Class-
rooms are busy environments where students must sort relevant from irrelevant informa-
tion. The slow maturation of the PFC imposes neuropsychological limits throughout
childhood (Quartz and Sejnowski 1997), especially regarding short-term memory and
attention. Physical activity might be key to improving attention in classrooms: the inte-
gration of exercise with the presentation of academic concepts in elementary school
classrooms showed positive results in academic achievement (Donnelly et al. 2016) in
accordance with the acute positive effect of exercise on attention (Hillman et al. 2008).
Even a single session of aerobic exercise can facilitate cognitive performance in children.
In tests of mathematics and reading, the best results were obtained after 30 min of moderate
running (Hillman et al. 2008). Physical education classes conducted immediately
before lectures are likely to enhance academic performance due to acute responses to
exercise, such as increased alertness, improved reaction time, and increased information-
processing speed.
A meta-analysis by Fedewa and Ahn (2011) showed a positive effect of aerobic exercise
on children’s achievement and cognitive outcomes. Type, intensity, and volume of exercise
were correlated with the responses, indicating dose–response effects. Lees and Hopkins
(2013) showed positive impacts of interventions described as “aerobic exercise” on
children’s psychosocial function and cognition. However, these effects were minimal or
not significant in several studies (Lees and Hopkins 2013). One possible explanation is the
intensity and volume of exercise training. Investigators conducted a randomized controlled
study with 67 Spanish adolescents to measure the cognitive and academic effects of
increased time and intensity of physical exercise (Ardoy, Fernández-Rodríguez, Jiménez-Pavón,
Castillo, Ruiz, and Ortega 2014). Three classes were randomly sorted into a control
group with 2 weekly PE sessions, and two experimental groups with 4 weekly PE sessions,
differing only in the intensity of the exercises. The high-intensity exercise group showed
significantly higher performance in nonverbal and verbal ability, abstract reasoning,
numerical and spatial abilities, as well as school grades, in comparison with the other two
groups (Ardoy, Fernández-Rodríguez, Jiménez-Pavón, Castillo, Ruiz, and Ortega 2014).
Although most school-based research focuses on aerobic aspects, researchers have
begun to consider other variables. In children, acute bilateral coordination exercises
(10 min) showed better effects on concentration and attention than normal PE lessons with
the same duration (Budde, Voelcker-Rehage, Pietraßyk-Kendziorra, Ribeiro, and Tidow
2008). There were no significant differences in the exercise intensity between the groups,
which suggests that the coordinative characteristic of the exercises was responsible for the
results (Budde, Voelcker-Rehage, Pietraßyk-Kendziorra, Ribeiro, and Tidow 2008).
Studies with racket-sports verified positive chronic effects of coordination exercises on
visual perception and executive functions in children with developmental coordination
disorder (Tsai 2009) and in children with mild intellectual disabilities and borderline
intellectual functioning (Chen, Tsai, Wang, and Wuang 2015).
In this manner, using coordination exercises in schools might be an efficient tool to
reduce learning delays in children with special needs. PE classes based on open-skill tasks,
characterized by an unstable environment demanding continuous adaptation, showed better
results in the executive functioning of overweight children, compared with standard PE
classes (Crova et al. 2014). A school-based motor program, designed to stimulate executive
function and attention performance in children, showed positive results in children aged
6–10 years (Cardeal, Pereira, Da Silva, and De França 2013). Open-skills activities tend to
demand not only physical effort but also cognitive engagement. In this context, exercise
programs capable of simultaneously enhancing aerobic capacity, motor coordination,
cognitive challenges, and social integration, such as team sports and the Brazilian practice
of Capoeira, are of special interest for school interventions.
Regardless of children’s ages, economic status, and cultural differences, the school
must offer them physical exercise to facilitate learning and improve physical and mental
health. All school components should provide, or encourage students to engage in, physical
activity at least 60 min per day, 7 days per week (Kohl and Cook 2013). Schools face
several barriers to implementation of quality PE programs, such as lack of facilities and
time, crowded curricula, insufficient infrastructure, scarcity of PE teachers, and low levels
of professional development (Kohl and Cook 2013). Educators must design lessons to
integrate physical activity with other subjects, in order to facilitate learning and improve
academic performance.
Assessment of individual learning
One important bottleneck for education in crowded environments is how to assess learning
individually in order to properly adapt teaching strategies. Given how crowded typical
classrooms are across the world, it is extremely difficult to orient activities and learning
strategies that better fit students individually; the identification of each student’s deficits
and potentials exceeds the capacity of even the best-trained teacher. This need goes beyond
measuring academic achievement—it points to behavioral and cognitive assessments that
can predict learning deficits early enough for teachers and families to intervene.
Most cognitive and behavioral tests use norms based on populations with specific
cultural features—namely, those who live in Western, educated, industrialized, rich, and
democratic countries—which are not representative of cognitive development in low-SES
societies (Henrich, Heine, and Norenzayan 2010). To build specific norms for each pop-
ulation seems a challenge for countries with low investments in research and education.
Fortunately, new technologies and analytical strategies related to the advent of “big data”
bring hope to the field (Goldin et al. 2014; Lomas et al. 2013; Lopez-Rosenfeld et al. 2013;
Méndez et al. 2015; Mota, Copelli, and Ribeiro 2016; Odic et al. 2016). For instance,
Adaptive Collaborative Learning Support (ACLS) (Magnisalis, Demetriadis, and Karakostas
2011; Walker, Rummel, and Koedinger 2014) is one way to deal with this complexity.
Developers have come up with educational software, modelling collaborative
learning, to create rich learning environments that adapt to each student’s characteristics,
helping to improve achievement beyond the mere assessment of performance. The system
provides intelligent feedback that guides the student in finding his or her best individual
learning pathway.
Regarding such computational approaches, we have developed speech analyses that
successfully characterize cognitive deficits associated with pathological conditions such as dementia
(Bertola et al. 2014) and psychosis (Mota et al. 2012, 2014), and can even predict psychotic
breaks more than 2 years in advance during the prodromal phase, i.e. during the initial
stages of the disease when symptoms are not very apparent (Bedi, Carrillo, Cecchi, Slezak,
Sigman, Mota, Ribeiro, Javitt, Copelli, and Corcoran 2015). These approaches use struc-
tural and semantic features measured on free speech recorded naturalistically, and were
successful in low-SES environments in Latin American countries (Mota et al. 2012, 2014;
Mota, Copelli, and Ribeiro 2016). Cognitive deficits related to temporal abilities impaired
by attention-deficit/hyperactivity disorder (ADHD) could be correctly measured by gamelike
software, and the discrimination function classified 82.4% of the cases (Méndez et al.
2015). Given the success of computational behavioral analysis in characterizing cognitive
deficits, we have great hope that they can also be used to characterize cognitive gains in the
school environment. We have recently set out to measure speech structure from memory
reports of 76 children (6–8 years old), recorded in the school environment in low-SES
communities. We found that several structural features of speech are correlated with
intelligence quotient (IQ) and theory of mind (i.e. knowledge that other people also have a
mind), as well as school performance on math and reading (Mota, Copelli, and Ribeiro
2016; Mota et al. 2016).
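To make the notion of "structural features of speech" more concrete, the sketch below shows one simple way a transcript could be turned into a directed word graph and summarized with a few counts. It is an illustration under stated assumptions, not the procedure used in the cited studies: the function name `speech_graph_features`, the whitespace tokenization, and the particular features returned are all hypothetical choices made for this example.

```python
from collections import defaultdict


def speech_graph_features(transcript):
    """Summarize a transcript as a directed word graph (illustrative only).

    Each distinct word is a node; each pair of consecutive words is a
    directed edge from the first word to the second.
    """
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    words = transcript.lower().split()

    # Count how many times each directed edge (word -> next word) occurs.
    edge_counts = defaultdict(int)
    for source, target in zip(words, words[1:]):
        edge_counts[(source, target)] += 1

    return {
        "nodes": len(set(words)),
        "edges": len(edge_counts),
        # Edges traversed more than once (the same word pair repeated).
        "repeated_edges": sum(1 for c in edge_counts.values() if c > 1),
        # Edges that link a word to itself (immediate word repetition).
        "self_loops": sum(1 for (a, b) in edge_counts if a == b),
    }


if __name__ == "__main__":
    print(speech_graph_features("the dog saw the dog run"))
```

In such a sketch, features of this kind could then be correlated with external measures such as reading or math scores; which features are informative is an empirical question that the code does not address.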
Software and educational games designed on the basis of developmental sciences are useful,
low-cost tools to assess learning; they enable specific interventions based on recognized
deficits assessed by individual learning curves compared to the learning curve of peers.
This strategy is poised to enable physiological inputs like sleep, nutrition, or exercise to
positively reinforce significant cognitive shifts within minutes to hours of their detection.
Big-data analysis is a powerful new reality that has been revealing surprising results
regarding motivation and learning, for instance (Lomas et al. 2013). With the students’
frequent use of automated tools, it is possible to build a big dataset specific to their
environment, which analysts could then use as a model to search for learning patterns. In
principle, this approach would help avoid the mistake of interpreting cultural differences as
deviances from the norm.
Investigators have shown that using technology is effective in assessing, and intervening
in, the learning processes in schools in low-SES countries (Goldin et al. 2014; Lopez-
Rosenfeld et al. 2013; Odic et al. 2016), as verified by the experiences of the Argentinian
Joaquín V. González program (http://www.programajoaquin.org/) and the Uruguayan
CEIBAL program (http://www.ceibal.edu.uy/), both part of the worldwide initiative One
Laptop Per Child (OLPC) (http://www.laptop.org/), which delivers and manages one low-
cost laptop per student.
Samples of more than 500 children in Uruguay (Odic et al. 2016) revealed
notable discoveries regarding math learning and abilities to estimate time and quantity. In
Argentina, an intervention applied through games in schools of low-SES communities
showed cognitive benefits for 6-to-7-year-old children, with transfer to some executive
functions and some equalization of academic outcomes between children who regularly
attend school and children who could not attend for different reasons (Goldin et al. 2014).
It is now possible to envision a future in which fun and motivating computational tools
will allow teachers and researchers to assess each student very frequently (for instance,
practicing 10 min/day on a computer game involving math or reading skills) so as to
quickly build an individual dataset. Using machine learning approaches, it is possible to
build a student’s learning curve and compare a variety of features (accuracy of answers,
reaction times, language elements) with those found in peers in the same classroom,
school, city, or country—comparing within and across SES cohorts. In a few weeks one
could have enough data from each individual to identify learning patterns to reward as well
as deficits to remedy. This would allow teachers to quickly adapt their teaching strategies,
and even to suggest new motivating approaches based on the student’s potential as assessed
in other disciplines.
Toward healthy, cyclical inputs to strengthen learning
Health and education gradients are related to the fact that low-SES subjects are exposed to
a systematically higher risk for worse health outcomes, morbidity, and mortality (Mack-
enbach and Howden-Chapman 2003; Mackenbach et al. 1997). Inadequate sleep, nutrition,
and exercise have a compound negative impact on youth cognition, academic achievement,
and quality of life. To have schools compensate for the physiological deficits suffered by
low-SES youth is key for education improvement. Low-SES children and adolescents are
at the most severe risk for poor outcomes; amelioration of the physiological conditions that
prepare and consolidate learning is likely to maximize gains for these students.
Cognitive improvement from mitigation of physiological deficits depends on the time
between physiological intervention and acquisition of new knowledge, on the scale of
minutes to hours. To achieve that, automated assessment of individual student performance
is of the essence. Systematic, dense mapping of cognitive trajectories will give educators a
much better grasp of the appropriate psychological and physiological interventions,
allowing for personalized and yet scalable education. In developing countries with blatant
educational inequality, overcoming physiological and assessment bottlenecks is likely to
generate major cognitive benefits in the poorest strata of society. From the point of view of
public policies, these bottlenecks are “low-hanging fruit”: goals relatively easy to
achieve. Schools can become places where attending classes, eating, sleeping, exercising,
and undergoing examinations alternate in a cyclical manner so as to optimize learning.
Regular classes—often long and boring—could be replaced by shorter, more effective
classes so as to free time for physiology and assessment activities. Educators will focus
much future research on how to best schedule and design these activities.
References
Alaimo, K., Olson, C. M., & Frongillo, E. A. Jr. (2001). Food insufficiency and American school-aged
children’s cognitive, academic, and psychosocial development. Pediatrics, 108(1), 44–53.
Arber, S., Bote, M., & Meadows, R. (2009). Gender and socio-economic patterning of self-reported sleep
problems in Britain. Social Science and Medicine, 68(2), 281–289.
doi:10.1016/j.socscimed.2008.10.016.
Ardoy, D. N., Fernández-Rodríguez, J. M., Jiménez-Pavón, D., Castillo, R., Ruiz, J. R., & Ortega, F. B.
(2014). A physical education trial improves adolescents’ cognitive performance and academic
achievement: The EDUFIT study. Scandinavian Journal of Medicine and Science in Sports, 24(1),
e52–e61. doi:10.1111/sms.12093.
Bedi, G., Carrillo, F., Cecchi, G., Slezak, D. F., Sigman, M., Mota, N., Ribeiro, S., Javitt, D., Copelli, M., &
Corcoran, C. (2015). Automated analysis of free speech predicts psychosis onset in high-risk youths.
npj Schizophrenia, 1, 15030. doi:10.1038/npjschz.2015.30.
Beebe, D. W., Simon, S., Summer, S., Hemmer, S., Strotman, D., & Dolan, L. M. (2013). Dietary intake
following experimentally restricted sleep in adolescents. Sleep, 36(6), 827–834.
doi:10.5665/sleep.2704.
Berchicci, M., Pontifex, M. B., Drollette, E. S., Pesce, C., Hillman, C. H., & Di Russo, F. (2015).
From cognitive motor preparation to visual processing: The benefits of childhood fitness to brain
health. Neuroscience, 298, 211–219. doi:10.1016/j.neuroscience.2015.04.028.
Bertola, L., Mota, N. B., Copelli, M., Rivero, T., Diniz, B. R., Romano-Silva, M. A., et al. (2014). Graph
analysis of verbal fluency test discriminate between patients with Alzheimer’s disease, mild cognitive
impairment and normal elderly controls. Frontiers in Aging Neuroscience, 6, 1–10.
doi:10.3389/fnagi.2014.00185.
Bliwise, D. L. (1996). Historical change in the report of daytime fatigue. Sleep, 19, 462–464.
Bonnet, M., & Arand, D. (1995). We are chronically sleep deprived. Sleep, 18, 908–911.
Broman, J. E., Lundh, L. G., & Hetta, J. (1996). Insufficient sleep in the general population. Neurophysi-
ologie Clinique, 26, 30–39.
Budde, H., Voelcker-Rehage, C., Pietraßyk-Kendziorra, S., Ribeiro, P., & Tidow, G. (2008). Acute
coordinative exercise improves attentional performance in adolescents. Neuroscience Letters, 441(2),
219–223. doi:10.1016/j.neulet.2008.06.024.
Buxton, O. M., & Marcelli, E. (2010). Short and long sleep are positively associated with obesity, diabetes,
hypertension, and cardiovascular disease among adults in the United States. Social Science and
Medicine, 71(5), 1027–1036. doi:10.1016/j.socscimed.2010.05.041.
Cardeal, C. M., Pereira, L. A., Da Silva, P. F., & De França, N. M. (2013). Efeito de um programa escolar de
estimulação motora sobre desempenho da função executiva e atenção em crianças. Motricidade, 9(3),
44–56. doi:10.6063/motricidade.9(3).762.
Casey, P., Goolsby, S., Berkowitz, C., Frank, D., Cook, J., Cutts, D., et al. (2004). Maternal depression,
changing public assistance, food security, and child health status. Pediatrics, 113, 298–304.
Castro, L. S., Castro, J., Hoexter, M. Q., Quarantini, L. C., Kauati, A., Mello, L. E., et al. (2013). Depressive
symptoms and sleep: A population-based polysomnographic study. Psychiatry Research, 210(3),
906–912. doi:10.1016/j.psychres.2013.08.036.
CDC [Centers for Disease Control and Prevention] (2013). Vital and health statistics. Health Behaviors of
Adults: United States, 2008–2010, 257(10).
Chaddock, L., Erickson, K. I., Prakash, R. S., VanPatter, M., Voss, M. W., Pontifex, M. B., et al. (2010).
Basal ganglia volume is associated with aerobic fitness in preadolescent children. Developmental
Neuroscience, 32, 249–256. doi:10.1159/000316648.
Chaddock, L., Pontifex, M. B., Hillman, C. H., & Kramer, A. F. (2011). A review of the relation of aerobic
fitness and physical activity to brain structure and function in children. Journal of the International
Neuropsychological Society, 17, 975–985. doi:10.1017/S1355617711000567.
Chambers, E. C., Pichardo, M. S., & Rosenbaum, E. (2014). Sleep and the housing and neighborhood
environment of urban Latino adults living in low-income housing: The AHOME study. Behavioral
Sleep Medicine, 11, 1–16.
Chen, M. D., Tsai, H. Y., Wang, C. C., & Wuang, Y. P. (2015). The effectiveness of racket-sport inter-
vention on visual perception and executive functions in children with mild intellectual disabilities and
borderline intellectual functioning. Neuropsychiatric Disease and Treatment, 11, 2287–2297.
doi:10.2147/NDT.S89083.
Crova, C., Struzzolino, I., Marchetti, R., Masci, I., Vannozzi, G., Forte, R., et al. (2014). Cognitively
challenging physical activity benefits executive function in overweight children. Journal of Sports
Sciences, 32(3), 201–211. doi:10.1080/02640414.2013.828849.
Deslandes, A., Moraes, H., Ferreira, C., Veiga, H., Silveira, H., Mouta, R., et al. (2009). Exercise and mental
health: Many reasons to move. Neuropsychobiology, 59(4), 191–198.
Diamond, M. C. (2001). Response of the brain to enrichment. Anais da Academia Brasileira de Ciências, 73,
210–220. doi:10.1590/S0001-37652001000200006.
Diamond, A. (2013). Executive functions. Annual Review of Psychology, 64, 135–168.
doi:10.1146/annurev-psych-113011-143750.
Diamond, A., & Lee, K. (2011). Interventions shown to aid executive function development in children 4 to
12 years old. Science, 333, 959–964. doi:10.1126/science.1204529.
Diekelmann, S., & Born, J. (2010). The memory function of sleep. Nature Reviews Neuroscience, 11,
114–126. doi:10.1038/nrn2762.
Ding, M., Keiley, M. K., Garza, K. B., Duffy, P. A., & Zizza, C. A. (2015). Food insecurity is associated
with poor sleep outcomes among US adults. Journal of Nutrition, 145(3), 615–621.
doi:10.3945/jn.114.199919.
Ding, M., Wilson, N. L., Garza, K. B., & Zizza, C. A. (2014). Undiagnosed prediabetes among food insecure
adults. American Journal of Health Behavior, 38, 225–233.
Donnelly, J. E., Hillman, C. H., Castelli, D., Etnier, J. L., Lee, S., Tomporowski, P., et al. (2016). Physical
activity, fitness, cognitive function, and academic achievement in children. Medicine and Science in
Sports and Exercise, 48(6), 1197–1222.
Duman, R. S. (2005). Neurotrophic factors and regulation of mood: Role of exercise, diet and metabolism.
Neurobiology of Aging, 26(1), 88–93. doi:10.1016/j.neurobiolaging.2005.08.018.
Erath, S. A., Tu, K. M., Buckhalt, J. A., & El-Sheikh, M. (2015). Associations between children’s intelli-
gence and academic achievement: The role of sleep. Journal of Sleep Research, 24(5), 510–513.
doi:10.1111/jsr.12281.
Fedewa, A., & Ahn, S. (2011). The effects of physical activity and physical fitness on children’s
achievement and cognitive outcomes: A meta-analysis. Research Quarterly for Exercise and Sport, 82,
521–535. doi:10.5641/027013611X13275191444107.
Fernandes, V. R., Ribeiro, M. L., Melo, T., Maciel-Pinheiro, P. D., Guimarães, T. T., Araújo, N. B., et al.
(2016). Motor coordination correlates with academic achievement and cognitive function in children.
Frontiers in Psychology, Specialty Section: Educational Psychology. doi:10.3389/fpsyg.2016.00318.
Fiuza-Luces, C., Garatachea, N., Berger, N. A., & Lucia, A. (2013). Exercise is the real polypill. Physiology,
28, 330–358. doi:10.1152/physiol.00019.2013.
Gold, P. E., & Korol, D. L. (2012). Making memories matter. Frontiers in Integrative Neuroscience, 6, 116.
doi:10.3389/fnint.2012.00116.
Goldin, A. P., Hermida, M. J., Shalom, D. E., Costa, M. E., Lopez-Rosenfeld, M., Segretin, M. S., et al.
(2014). Far transfer to language and math of a short software-based gaming intervention. Proceedings
of the National Academy of Sciences, 111(17), 6443–6448. doi:10.1073/pnas.1320217111.
Grandner, M. A., Jackson, N., Gerstner, J. R., & Knutson, K. L. (2013). Dietary nutrients associated with
short and long sleep duration: Data from a nationally representative sample. Appetite, 64(C), 71–80.
doi:10.1016/j.appet.2013.01.004.
Grissmer, D., Grimm, K. J., Aiyer, S. M., Murrah, W. M., & Steele, J. S. (2010). Fine motor skills and early
comprehension of the world: Two new school readiness indicators. Developmental Psychology, 46,
1008–1017. doi:10.1037/a0020104.
Gupta, N. K., Mueller, W. H., Chan, W., & Meininger, J. C. (2002). Is obesity associated with poor sleep
quality in adolescents? American Journal of Human Biology, 14(6), 762–768. doi:10.1002/ajhb.10093.
Haapala, E. A., Poikkeus, A. M., Tompuri, T., Kukkonen-Harjula, K., Leppänen, P. H. T., Lindi, V., et al.
(2014). Associations of motor and cardiovascular performance with academic skills in children.
Medicine and Science in Sports and Exercise, 46(5), 1016–1024.
doi:10.1249/MSS.0000000000000186.
Hale, L., Berger, L. M., LeBourgeois, M. K., & Brooks-Gunn, J. (2009). Social and demographic predictors
of preschoolers’ bedtime routines. Journal of Developmental and Behavioral Pediatrics, 30(5),
394–402. doi:10.1097/DBP.0b013e3181ba0e64.
Halperin, J. M., & Healey, D. M. (2011). The influences of environmental enrichment, cognitive
enhancement, and physical exercise on brain development: Can we alter the developmental trajectory
of ADHD? Neuroscience and Biobehavioral Reviews, 35, 621–634.
doi:10.1016/j.neubiorev.2010.07.006.
Henrich, J., Heine, S. J., & Norenzayan, A. (2010). The weirdest people in the world? Behavioral and Brain
Sciences, 33(2–3), 61–83. doi:10.1017/S0140525X0999152X.
Hill, T. D., Burdette, A. M., & Hale, L. (2009). Neighborhood disorder, sleep quality, and psychological
distress: Testing a model of structural amplification. Health & Place, 4, 1006–1013.
doi:10.1016/j.healthplace.2009.04.001.
Hillman, C. H., Erickson, K. I., & Kramer, A. F. (2008). Be smart, exercise your heart: Exercise effects on
brain and cognition. Nature Reviews Neuroscience, 9(1), 58–65.
Hogenkamp, P. S., Nilsson, E., Nilsson, V. C., Chapman, C. D., Vogel, H., Lundberg, L. S., et al. (2013).
Acute sleep deprivation increases portion size and affects food choice in young men. Psychoneu-
roendocrinology, 38(9), 1668–1674. doi:10.1016/j.psyneuen.2013.01.012.
Hyyppa, M. T., Kronholm, E., & Alanen, E. (1997). Quality of sleep during economic recession in Finland:
A longitudinal cohort study. Social Science and Medicine, 45, 731–738.
Jarrin, D. C., McGrath, J. J., & Drake, C. L. (2013). Beyond sleep duration: Distinct sleep dimensions are
associated with obesity in children and adolescents. International Journal of Obesity, 37, 552–558.
doi:10.1038/ijo.2013.4.
Jarrin, D. C., McGrath, J. J., & Quon, E. C. (2014). Objective and subjective socioeconomic gradients exist
for sleep in children and adolescents. Health Psychology, 33(3), 301–305. doi:10.1037/a0032924.
Knutson, K. L. (2011). Association between sleep duration and body size differs among three Hispanic
groups. American Journal of Human Biology, 23(1), 138–141. doi:10.1002/ajhb.21108.
Kohl, H. W., & Cook, H. D. (2013). Educating the student body: Taking physical activity and physical
education to school. Washington, DC: National Academies Press. doi:10.17226/18314.
Korol, D. L., & Gold, P. E. (1998). Glucose, memory, and aging. American Journal of Clinical Nutrition, 67(4),
764S–771S.
Kurdziel, L., Duclos, K., & Spencer, R. M. (2013). Sleep spindles in midday naps enhance learning in
preschool children. Proceedings of the National Academy of Sciences of the United States of America,
110(43), 17267–17272. doi:10.1073/pnas.1306418110.
Laraia, B. A., Siega-Riz, A. M., Gundersen, C., & Dole, N. (2006). Psychosocial factors and socioeconomic
indicators are associated with household food insecurity among pregnant women. Journal of Nutrition,
136, 177–182.
Lees, C., & Hopkins, J. (2013). Effect of aerobic exercise on cognition, academic achievement, and
psychosocial function in children: A systematic review of randomized control trials. Preventing Chronic
Disease, 10, E174. doi:10.5888/pcd10.130010.
Lemos, N., Weissheimer, J., & Ribeiro, S. (2014). Naps in school can enhance the duration of declarative
memories learned by adolescents. Frontiers in Systems Neuroscience. doi:10.3389/fnsys.2014.00103.
Lima, P. F., Medeiros, A. L., & Araujo, J. F. (2002). Sleep-wake pattern of medical students: Early versus
late class starting time. Brazilian Journal of Medical and Biological Research, 35(11), 1373–1377.
Liu, X., Liu, L., & Wang, R. (2003). Bed sharing, sleep habits, and sleep problems among Chinese school-
aged children. Sleep, 26(7), 839–844.
Lomas, D., Patel, K., Forlizzi, J. L., & Koedinger, K. R. (2013). Optimizing challenge in an educational
game using large-scale design experiments. In Proceedings of the SIGCHI conference on human
factors in computing systems. CHI’13—CHI conference on human factors in computing systems, Paris,
April 27–May 02, 2013. doi:10.1145/2470654.2470668.
Lopez-Rosenfeld, M., Goldin, A. P., Lipina, S., Sigman, M., & Slezak, D. F. (2013). Mate Marote: A
flexible automated framework for large-scale educational interventions. Computers & Education, 68,
307–313.
Mackenbach, J. P., & Howden-Chapman, P. (2003). New perspectives on socioeconomic inequalities in
health. Perspectives in Biology and Medicine, 46, 428–444.
Mackenbach, J. P., Kunst, A. E., Cavelaars, A. E., Groenhof, F., & Geurts, J. J. (1997). Socioeconomic
inequalities in morbidity and mortality in Western Europe. The EU Working Group on Socioeconomic
Inequalities in Health. Lancet, 349, 1655–1659.
Magnisalis, I., Demetriadis, S., & Karakostas, A. (2011). Adaptive and intelligent systems for collaborative
learning support: A review of the field. IEEE Transactions on Learning Technologies, 4(1), 5–20.
Mander, B. A., Santhanam, S., Saletin, J. M., & Walker, M. P. (2011). Wake deterioration and sleep
restoration of human learning. Current Biology, 21(5), R183–R184. doi:10.1016/j.cub.2011.01.019.
Masley, S., Roetzheim, R., & Gualtieri, T. (2009). Aerobic exercise enhances cognitive flexibility. Journal
of Clinical Psychology in Medical Settings, 16, 186–193. doi:10.1007/s10880-009-9159-6.
Matthews, K. A., Hall, M., & Dahl, R. E. (2014). Sleep in healthy black and white adolescents. Pediatrics,
133(5), e1189–e1196. doi:10.1542/peds.2013-2399.
McNay, E. C., & Gold, P. E. (2002). Food for thought: Fluctuations in brain extracellular glucose provide
insight into the mechanisms of memory modulation. Behavioral and Cognitive Neuroscience Reviews,
1(4), 264–280.
Méndez, A., Martín, A., Pires, A. C., Vásquez, A., Maiche, A., González, F., et al. (2015). Temporal
perception and delay aversion: A videogame screening tool for the early detection of ADHD. Revista
Argentina de Ciencias del Comportamiento, 7(3), 90–101.
Mitler, M. M., Miller, J. C., Lipsitz, J. J., Walsh, J. K., & Wylie, C. D. (1997). The sleep of long-haul truck
drivers. New England Journal of Medicine, 337, 755–762. doi:10.1056/NEJM199709113371106.
Moreno, C. R. C., Vasconcelos, S., Marqueze, E. C., Lowden, A., Middleton, B., Fischer, F. M., et al.
(2015). Sleep patterns in Amazon rubber tappers with and without electric light at home. Scientific
Reports, 5, 14074. doi:10.1038/srep14074.
Mota, N. B., Copelli, M., & Ribeiro, S. (2016). Computational tracking of mental health in youth: Latin
American contributions to a low-cost and effective solution for early psychiatric diagnosis. New
Directions for Child and Adolescent Development (special issue on Child and adolescent development
in Latin America), 152, 59–69.
Mota, N. B., Furtado, R., Maia, P. P. C., Copelli, M., & Ribeiro, S. (2014). Graph analysis of dream reports
is especially informative about psychosis. Scientific Reports, 4, 3691.
Mota, N. B., Vasconcelos, N. A. P., Lemos, N., Pieretti, A. C., Kinouchi, O., Cecchi, G. A., et al. (2012).
Speech graphs provide a quantitative measure of thought disorder in psychosis. PLoS ONE, 7(4),
e34928. doi:10.1371/journal.pone.0034928.
Mota, N. B., Weissheimer, J., Madruga, B., Adamy, N., Bunge, S. A., Copelli, M., et al. (2016). A
naturalistic assessment of the organization of children’s memories predicts cognitive functioning and
reading ability. Mind, Brain and Education, 10, 184–195.
Ng, S. W., & Popkin, B. M. (2012). Time use and physical activity: A shift away from movement across the
globe. Obesity Reviews, 13(8), 659–680. doi:10.1111/j.1467-789X.2011.00982.x.
Odic, D., Lisboa, J. V., Eisinger, R., Olivera, M. G., Maiche, A., & Halberda, J. (2016). Approximate
number and approximate time discrimination each correlate with school math abilities in young
children. Acta Psychologica, 163, 17–26.
OECD (2014). Education at a glance: OECD indicators. Paris: OECD. doi:10.1787/eag-2014-en.
OECD (2016). PISA 2015 results in focus. PISA in Focus, 67, Paris: OECD. doi: 10.1787/aa9237e6-en.
Okun, M. L., Tolge, M., & Hall, M. (2014). Low socioeconomic status negatively affects sleep in pregnant
women. Journal of Obstetric, Gynecologic, & Neonatal Nursing, 43(2), 160–167. doi:10.1111/1552-
6909.12295.
Peixoto, C. A., da Silva, A. G., Carskadon, M. A., Louzada, F. M., Pereira, E. F., Moreno, C., et al. (2009).
Adolescents living in homes without electric lighting have earlier sleep times. Behavioral Sleep
Medicine, 7(2), 73–80. doi:10.1080/15402000902762311.
Peltzer, K., & Pengpid, S. (2015). Nocturnal sleep problems among university students from 26 countries.
Sleep and Breathing, 19(2), 499–508. doi:10.1007/s11325-014-1036-3.
Pereira, A. C., Huddleston, D. E., Brickman, A. M., Sosunov, A., Hen, R., McKhann, G. M., et al. (2007).
An in vivo correlate of exercise-induced neurogenesis in the adult dentate gyrus. Proceedings of the
National Academy of Sciences of the United States of America, 104, 5638–5643. doi:10.1073/pnas.
0611721104.
Pontifex, M. B., Raine, L. B., Johnson, C. R., Chaddock, L., Voss, M. W., Cohen, N. J., et al. (2011).
Cardiorespiratory fitness and the flexible modulation of cognitive control in preadolescent children.
Journal of Cognitive Neuroscience, 23, 1332–1345. doi:10.1162/jocn.2010.21528.
Quartz, S. R., & Sejnowski, T. J. (1997). The neural basis of cognitive development: A constructivist
manifesto. Behavioral and Brain Sciences, 20(4), 537–556.
Seligman, H. K., Bindman, A. B., Vittinghoff, E., Kanaya, A. M., & Kushel, M. B. (2007). Food insecurity
is associated with diabetes mellitus: Results from the National Health Examination and Nutrition
Examination Survey (NHANES) 1999–2002. Journal of General Internal Medicine, 22, 1018–1023.
Seligman, H. K., Laraia, B. A., & Kushel, M. B. (2010). Food insecurity is associated with chronic disease
among low-income NHANES participants. Journal of Nutrition, 140, 304–310.
Sigman, M., Peña, M., Goldin, A. P., & Ribeiro, S. (2014). Neuroscience and education: Prime time to build
the bridge. Nature Neuroscience, 17, 497–502.
Simonelli, G., Leanza, Y., Boilard, A., Hyland, M., Augustinavicius, J. L., Cardinali, D. P., et al. (2013).
Sleep and quality of life in urban poverty: The effect of a slum housing upgrading program. Sleep,
36(11), 1669–1676. doi:10.5665/sleep.3124.
Solari, C. D., & Mare, R. D. (2012). Housing crowding effects on children’s wellbeing. Social Science
Research, 41(2), 464–476. doi:10.1016/j.ssresearch.2011.09.012.
Stickgold, R. (2005). Sleep-dependent memory consolidation. Nature, 437, 1272–1278. doi:10.1038/
nature04286.
Stranges, S., Tigbe, W., Gómez-Olivé, F. X., Thorogood, M., & Kandala, N. B. (2012). Sleep problems: An
emerging global epidemic? Findings from the INDEPTH WHO-SAGE study among more than 40,000
older adults from 8 countries across Africa and Asia. Sleep, 35(8), 1173–1181.
Tayie, F. A., & Zizza, C. A. (2009). Food insecurity and dyslipidemia among adults in the United States.
Preventive Medicine, 48, 480–485.
Teixeira, L. R., Fischer, F. M., de Andrade, M. M., Louzada, F. M., & Nagai, R. (2004). Sleep patterns of
day-working, evening high-schooled adolescents of São Paulo, Brazil. Chronobiology International,
21(2), 239–252.
Tellez, L. A., Han, W., Zhang, X., Ferreira, T. L., Perez, I. O., Shammah-Lagnado, S. J., et al. (2016).
Separate circuitries encode the hedonic and nutritional values of sugar. Nature Neuroscience, 19(3),
465–470. doi:10.1038/nn.4224.
Tsai, C. L. (2009). The effectiveness of exercise intervention on inhibitory control in children with
developmental coordination disorder: Using a visuospatial attention paradigm as a model. Research in
Developmental Disabilities, 30(6), 1268–1280. doi:10.1016/j.ridd.2009.05.001.
Umlauf, M. G., Bolland, A. C., Bolland, K. A., Tomek, S., & Bolland, J. M. (2015). The effects of age,
gender, hopelessness, and exposure to violence on sleep disorder symptoms and daytime sleepiness
among adolescents in impoverished neighborhoods. Journal of Youth and Adolescence, 44(2), 518–542.
doi:10.1007/s10964-014-0160-5.
UNESCO (2011). World data on education. Paris: UNESCO.
UN-HABITAT (2003). The challenge of slums: Global report on human settlements. London: Earthscan.
UN-HABITAT (2007). State of the world’s cities 2006/7. London: Earthscan.
Valladolid-Acebes, I., Stucchi, P., Cano, V., Fernández-Alfonso, M. S., Merino, B., Gil-Ortega, M., et al.
(2011). High-fat diets impair spatial learning in the radial-arm maze in mice. Neurobiology of Learning
and Memory, 95(1), 80–85. doi:10.1016/j.nlm.2010.11.007.
Van Cauter, E., & Spiegel, K. (1999). Sleep as a mediator of the relationship between socioeconomic status
and health: A hypothesis. Annals of the New York Academy of Sciences, 896, 254–261.
Van der Fels, I. M. J., Te Wierike, S. C. M., Hartman, E., Elferink-Gemser, M. T., Smith, J., & Visscher, C.
(2015). The relationship between motor skills and cognitive skills in 4–16 year old typically devel-
oping children: A systematic review. Journal of Science and Medicine in Sport, 18, 697–703. doi:10.
1016/j.jsams.2014.09.007.
Van Praag, H. (2009). Exercise and the brain: Something to chew on. Trends in Neurosciences, 32, 283–290.
doi:10.1016/j.tins.2008.12.007.
Van Praag, H., Kempermann, G., & Gage, F. H. (1999). Running increases cell proliferation and neuro-
genesis in the adult mouse dentate gyrus. Nature Neuroscience, 2, 266–270. doi:10.1038/6368.
Vaynman, S., & Gomez-Pinilla, F. (2006). Revenge of the “sit”: How lifestyle impacts neuronal and
cognitive health through molecular systems that interface energy metabolism with neuronal plasticity.
Journal of Neuroscience Research, 715, 699–715. doi:10.1002/jnr.
Voss, M. W., Chaddock, L., Kim, J. S., Vanpatter, M., Pontifex, M. B., Raine, L. B., et al. (2011). Aerobic
fitness is associated with greater efficiency of the network underlying cognitive control in preado-
lescent children. Neuroscience, 199, 166–176. doi:10.1016/j.neuroscience.2011.10.009.
Walker, E., Rummel, N., & Koedinger, K. R. (2014). Adaptive intelligent support to improve peer tutoring
in algebra. International Journal of Artificial Intelligence in Education, 24(1), 33–61. doi:10.1007/
s40593-013-0001-9.
Webb, W. B., & Agnew, H. W. (1975). Are we chronically sleep deprived? Bulletin of the Psychonomic
Society, 6, 47–48.
Whitaker, R. C., Phillips, S. M., & Orzol, S. M. (2006). Food insecurity and the risks of depression and
anxiety in mothers and behavior problems in their preschool-aged children. Pediatrics, 118, e859–
e868.
Sidarta Ribeiro (Brazil) is full professor of neuroscience and director of the Brain Institute at the Federal
University of Rio Grande do Norte. He holds a bachelor’s degree in biology from the University of Brasília
(1993), a master’s in biophysics from the Federal University of Rio de Janeiro (1994), and a PhD in animal
behavior from the Rockefeller University (2000), with postdoctoral studies in neurophysiology at Duke
University (2005). His main research topics are: memory, sleep, and dreams; neuronal plasticity; vocal
communication; symbolic competence in nonhuman animals; computational psychiatry, and neuroeduca-
tion. He was secretary of the Brazilian Society for Neuroscience and Behavior (2009–2011), and chair of the
Brazilian Regional Committee of the Pew Latin American Fellows Program in the Biomedical Sciences
(2011–2015). Since 2011, he has been a member of the Steering Committee of the Latin American School
for Education, Cognitive and Neural Sciences (LA School). In 2016, he was elected to the Latin American
Academy of Sciences (ACAL).
Natália Bezerra Mota (Brazil) is a PhD student of neuroscience at the Brain Institute of the Federal
University of Rio Grande do Norte. She graduated in medicine, did her residency in psychiatry, and received
a master’s degree in neuroscience. She is an alumna of the Latin American School of Education, Cognitive
and Neural Sciences. She developed a quantitative method of speech analysis based on graph theory, which
helps to differentiate the structure of speech in psychiatric patients and to classify different causes of
psychosis with tremendous accuracy. For her doctorate, she aims to perform graph-theoretical analyses of
speech in three experimental contexts: psychosis, wake-sleep cycle, and school declarative learning.
Valter da Rocha Fernandes (Brazil) is a graduate in physical education and a master’s student in the
School of Sports and Physical Education of the Federal University of Rio de Janeiro. A member of the
Neuroscience of Exercise Laboratory, he researches the influence of exercise, especially Capoeira, in
cognition. He is an alumnus of the Latin American School of Education, Cognitive and Neural Sciences.
Founder and director of the nonprofit Capoeira Cidadã, he has long experience working with education, in
kindergarten, schools, and social programs in Brazil.
Andrea Camaz Deslandes (Brazil) is the coordinator of the Neuroscience of Exercise Laboratory, and
adjunct professor of the Institute of Physical Education and Sports of Rio de Janeiro State University. She
has a PhD in mental health. She teaches exercise science (e.g., motor learning and neuroscience of exercise)
and advises graduate/postgraduate students. She has experience in physical exercise and neuroscience,
focusing on mental health and cognition, the impact of physical exercise on several diseases (e.g.,
depression, anxiety, Alzheimer’s disease, and Parkinson’s), and the acute and chronic effects of physical
exercise on affect, cognitive function, hormonal, and EEG changes in different populations (children,
adolescents, and elderly).
Guilherme Brockington (Brazil) is an adjunct professor of science at UNIFESP-DIADEMA, with a
bachelor’s degree in physics from the Federal University of Juiz de Fora, a master’s in science education
from the University of São Paulo, and a PhD in education from the University of São Paulo. He has introduced
modern and contemporary physics in high school curricula, taught numerous education courses for public
school teachers, and is the author of several school textbooks. He has experience in the area of education and
science education, with emphasis on physics teaching. He focuses on research connecting neuroscience and
education, mainly investigating the role of emotion in the process of learning scientific information.
Mauro Copelli (Brazil) is an associate professor in physics at the Federal University of Pernambuco
(UFPE). He has worked on the applications to neuroscience of techniques from statistical mechanics and
nonlinear dynamics. He and his collaborators have studied how collective neural phenomena can account for
information processing in sensory systems, emphasizing that coding of incoming physical stimuli can be
optimized if the system is in a critical state. This interdisciplinary research theme has fostered his
collaborations with theoretical physicists and experimental neuroscientists, including the joint supervision of
students under the umbrella of the graduate program in physics. He has also worked on the application of
complex graphs to speech, a technique that has shown potential for automated diagnoses of psychiatric
subjects.
2016
Dossier
“Rumo ao cultivo ecológico da mente” (Toward the ecological cultivation of the mind),
by Sidarta Ribeiro, Natalia Mota and Mauro Copelli
Propuesta Educativa, No. 46, Year 25, Nov. 2016, Vol. 2, pp. 42–49
Education
FLACSO ARGENTINA
Facultad Latinoamericana de Ciencias Sociales
propuesta@flacso.org.ar
ISSN 1995-7785
ARGENTINA
Introduction
Low-income families are most often unable to provide their members with the quantity and
quality of sleep, nutrition and physical exercise needed for a healthy life.
According to the United Nations Human Settlements Programme, more than one
billion people worldwide live in slums (UN-HABITAT 2007), and by 2030 this number
is expected to double (UN-HABITAT 2003). Schools in communities of low socioeconomic
status suffer academic deficits both while learning takes place
and during its subsequent assessment.
Stressing the urgency of the education problem is perhaps unnecessary given the persistent
material poverty of Latin America, still hand in hand with our deep cultural poverty.
Nevertheless, it must be understood that the problem of education does not stem only
from economic inequality, since even in the elite schools of the first world one observes
the fleeting learning of the well-taken exam followed by lasting forgetting. As a rule,
long-term learning is fragile, except when the learner is intrinsically
motivated to learn.
Education is a serious problem because, despite its immense importance for mitigating
social inequality, it does not receive enough investment to motivate teachers and other
*Full Professor of Neuroscience and Director of the Brain Institute of the Federal University of Rio Grande do Norte
(UFRN). He holds a bachelor's degree in Biological Sciences from the University of Brasília (1993), a master's in
Biophysics from the Federal University of Rio de Janeiro (1994), and a doctorate in Animal Behavior from the
Rockefeller University (2000), with postdoctoral work in Neurophysiology at Duke University (2005). He has
experience in neuroethology, molecular neurobiology and systems neurophysiology, working mainly on the following
topics: sleep, dreams and memory; neuronal plasticity; vocal communication; symbolic competence in non-human
animals; computational psychiatry and neuroeducation. Permanent member of the UFRN graduate programs in
Psychobiology (Capes grade 6), Bioinformatics (Capes grade 5) and Neuroscience (Capes grade 4). From 2009 to 2011
he served as secretary of the Sociedade Brasileira de Neurociências e Comportamento (SBNeC).
From 2011 to 2015 he coordinated the Brazilian committee of the Pew Latin American Fellows Program in the
Biomedical Sciences. Since 2011 he has been a member of the scientific committee of the Latin American School of
Education, Cognitive and Neural Sciences (LA School), which in 2014 received the inaugural Exemplifying the
Mission of the International Mind, Brain and Education Society award. Unit coordinator of the project for the
assessment of children at risk for learning disorders (ACERTA - CAPES/Observatório da Educação). Senior associate
investigator of the FAPESP Research, Innovation and Dissemination Center for Neuromathematics (Neuromat). Member
of the Advisory Board of the Plataforma Brasileira de Política de Drogas, created in 2015. Associate editor of the
journals Frontiers in Integrative Neuroscience, Frontiers In Psychology - Language Sciences,
Neurobiologia and Basic and Clinic Neuroscience. Member of the Núcleo de Estudos Interdisciplinares sobre
Psicoativos (NEIP) and of the OSCIP Plantando Consciencia. Elected in 2016 a member of the Latin American Academy
of Sciences (ACAL). Since 2016, member of the Advisory Board of the Rede Nacional de Ciência para a Educação (CpE).
He supports the maintenance and strengthening of the Ministry of Science, Technology and Innovation. Corresponding
author of this article.
**Brain Institute, Federal University of Rio Grande do Norte, Natal, Brazil.
***Department of Physics, Federal University of Pernambuco, Recife, Brazil.
SIDARTA RIBEIRO*
NATALIA MOTA ** 
MAURO COPELLI***
 
Rumo ao cultivo ecológico da mente
Propuesta Educativa, Año 25, Nro. 46, págs. 42 a 49, Noviembre de 2016
education professionals. On the other hand, we still know surprisingly little about the
biological and psychological mechanisms underlying education (Sigman, Peña, Goldin and
Ribeiro 2014). For these reasons pedagogical debates tend to drag on indefinitely, fueled
by the clash of divergent opinions, mutually exclusive theoretical frameworks and
scarce empirical grounds to guide decisions. In particular, there is no consensus on how
to keep the intrinsic motivation of students and teachers high. The schools that manage to do so
are precious local arrangements, specific communities that inspire and inform other
experiences but are not directly scalable or reproducible.
In most of the world, schools are not very different from humankind's first schools,
the Sumerian “edubas”, where pupils repeated monotonous calligraphy exercises and were
admonished by their teachers for lack of attention and motivation. We need a new
education with better salaries, fewer students per classroom and, above all, more imagination
and freedom. Good schools allow striking, transformative experiences that give individuals
structure and empower them, but bad schools are far more common, and their effects on
learning, and on the pleasure of learning, can be devastating.
The new education must resolve the contradiction between personalized teaching and the need to
operate at scale in order to serve the entire global population of learners, i.e. billions of
children, adolescents and adults who need formal education. Computers clearly have
important implications both for personalizing education and for disseminating it
on a large scale, but disconnected from a vibrant community of real people
they will hardly suffice to produce the creative effervescence we need for the
coming generations.
Sleep, food and physical exercise at school
In material and cultural poverty, it becomes evident that biology precedes psychology. Moreover,
schools in low-income communities usually cannot compensate for these problems,
owing to underfunding, overcrowded classrooms and poorly paid education
professionals. For the same reasons, schools generally fail to give students
personalized attention. In this article we argue that a reorganization of
school activities is urgently needed to overcome the physiological bottlenecks that hinder
cognition in third-world education, as well as in underdeveloped pockets within rich nations.
We further argue that computational tracking of the spoken and written expression
related to school learning can provide scalable, fast and low-cost solutions to
improve the individualized assessment of educational outcomes in communities of low
socioeconomic status.
One of the main axes of this reorganization is sleep, almost always regarded as the “teacher's
enemy” and openly repressed after elementary school. For various reasons, children
and young people of all ages arrive at school sleepy. It has been amply demonstrated
that sleep deprivation hinders learning, and that sleep plays a crucial role both
before and after the formation of new memories (Diekelmann and Born, 2010; Mander,
Santhanam, Saletin and Walker, 2011). Classroom research suggests that a post-class nap can
extend the duration of memories acquired in the school setting (Kurdziel, Duclos and Spencer,
2013; Lemos, Weissheimer and Ribeiro 2014). Allowing students to sleep before class whenever
they wish is as natural as allowing them to go to the bathroom, since sleep literally
detoxifies the brain (Yang et al., 2014). Allowing students to sleep after an intense class,
in turn, supports the long-term consolidation of the new content (Ribeiro and Stickgold,
2014).
Another very important axis is food, since the typical school meal is not designed
with cognitive goals in mind. Nutritional status plays a major role in
learning, and the brain consumes about 60% of the glucose used by the body. In tests
with undergraduate students, glucose intake can lead to increases
of more than 30% in the ability to memorize texts (Korol and Gold 1998). On the other hand,
experiments with animal models show that high-fat foods are detrimental
to learning (Valladolid-Acebes et al., 2011). There is much ground to be gained at school through
the cognitive optimization of school meals.
A third highly relevant axis is physical exercise, as a rule decoupled from the other
school subjects and activities. One of the most harmful consequences of living in overcrowded
homes is the lack of space for stretching or exercise, aggravated by the lack of
sports infrastructure in most low-income communities. There is ample evidence
that physical exercise helps prevent cardiovascular and metabolic diseases
(Fiuza-Luces, Garatachea, Berger and Lucia, 2013), but its impact on cognition was
underestimated until recently (Hillman, Erickson and Kramer, 2008; Chaddock, Pontifex, Hillman
and Kramer, 2011; Diamond and Lee, 2011; Masley, Roetzheim and Gualtieri, 2009). Current evidence
points to a close association between motor skills and academic performance
in general (Grissmer, Grimm, Aiyer, Murrah and Steele, 2010; Fernandes et al., 2016). The relationship between
cardiorespiratory fitness and cognitive performance is also well established (Berchicci
et al., 2015; Pontifex et al., 2011; Voss et al., 2011). Among the cognitive functions that benefit
from physical exercise, the executive functions stand out; they comprise inhibitory control,
planning, working memory, decision making and cognitive flexibility (Diamond
2013). In animal models, voluntary exercise correlates with an increase in the number
of new neurons in a brain region directly related to the acquisition of new
memories (Van Praag, Kempermann and Gage, 1999).
Classroom research needs to elucidate how best to use pre-class and post-class sleep to
strengthen learning. In particular, it is essential to parameterize the cognitive effects of
nap duration and of its composition in terms of physiological states. More empirical
classroom research will also be needed to quantify the cognitive impact of caloric
intake, meal composition, micronutrients and hydration, as well as
the effects of portion size, eating frequency and the reinforcing role of food.
Finally, the interactions among sleep, exercise and nutrition should be investigated in search of
synergistic effects.
Frequent and automated assessment of school performance
Another important limiting factor for education is classroom overcrowding, which makes it
difficult to assess individual learning. This need goes beyond the mere measurement
of academic performance, since it points to behavioral and cognitive assessments that
can predict learning deficits early enough for teachers and families
to intervene successfully. Fortunately, new technologies and analysis methods open
promising prospects for the accurate quantification of gains and losses in education (Goldin
et al., 2014; Lomas, Patel, Forlizzi and Koedinger, 2013; López Rosenfeld, Goldin, Lipina, Sigman and
Slezak 2013; Méndez et al., 2015; Mota et al. 2016; Mota, Copelli and Ribeiro 2016; Odic et al., 2016).
A good example of how to deal with this complexity is Adaptive Collaborative
Learning Support (ACLS) (Magnisalis, Demetriadis and Karakostas, 2011; Walker, Rummel and
Koedinger, 2014). Developers have created educational software that models collaborative
learning, producing rich learning environments that adapt to the characteristics
of each student and help improve performance rather than merely assess it. The system
provides intelligent feedback that guides the student toward the best individual
learning path.
In connection with such computational approaches, we have developed tools for the mathematical
analysis of speech that can identify cognitive deficits during literacy acquisition in
healthy children (Mota et al., 2016) or deficits associated with pathological conditions such as
dementia (Bertola et al., 2014) and psychosis (Mota et al., 2012; Mota et al., 2014; Bedi et al., 2015).
It is important to stress that these approaches, based on structural or semantic features of
natural verbal expression, have been successful in settings of low socioeconomic status (Mota et
al., 2012; Mota et al., 2014; Mota et al., 2016). Recently, we measured the structure of memory
reports from 76 children aged 6 to 8, recorded at school in communities of
low socioeconomic status. We found that several structural features of speech
correlate with school performance in reading (Mota et al., 2016). Taken together, these
strategies make it possible to assess learning computationally in a new, effective and
low-cost way, in order to motivate specific interventions based on deficits assessed not by
average learning curves across many students, computed occasionally, but by
individual learning curves updated daily. Ideally, this strategy can
be combined with physiological interventions (sleep, nutrition and exercise) to positively
reinforce, minutes after their detection, the cognitive changes observed individually in
specific students.
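As a rough illustration of what a structural analysis of speech can involve, the sketch below turns a transcript into a directed word graph (one node per distinct word, one edge per pair of consecutive words) and reports a few simple attributes. It is a minimal sketch under stated assumptions, not the tool used in the cited studies: the function names, the whitespace tokenization and the choice of attributes are all hypothetical.

```python
from collections import defaultdict


def word_graph(transcript):
    """Build a directed word graph: nodes are distinct words,
    edges link each word to the word spoken right after it."""
    words = transcript.lower().split()  # naive tokenization (assumption)
    transitions = list(zip(words, words[1:]))
    successors = defaultdict(set)
    for src, dst in transitions:
        successors[src].add(dst)
    return set(words), successors, transitions


def reachable(start, successors):
    """Return every node reachable from `start` by following edges."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in successors.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def graph_attributes(transcript):
    """Illustrative structural attributes (hypothetical selection)."""
    nodes, successors, transitions = word_graph(transcript)
    reach = {n: reachable(n, successors) for n in nodes}
    # The strongly connected component of n holds the nodes that n
    # reaches and that reach n back; keep the size of the largest one.
    largest_scc = max(
        (sum(1 for m in reach[n] if n in reach[m]) for n in nodes),
        default=0,
    )
    return {
        "nodes": len(nodes),
        "edges": len(transitions),
        "repeated_edges": len(transitions) - len(set(transitions)),
        "largest_scc": largest_scc,
    }


report = "the dog saw the cat and the cat saw the dog"
print(graph_attributes(report))
```

A report that keeps returning to the same words yields a large strongly connected component, whereas a strictly linear sequence of words yields components of size one. Attributes of this kind could, in principle, be recomputed for each student every day to trace an individual curve rather than a class average.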
Cycle of activities and inputs to strengthen learning
Health and education gradients are related to the fact that individuals of low socioeconomic
status are exposed to a systematically higher risk of worse health outcomes, morbidity and
mortality (Mackenbach and Howden-Chapman, 2003; Mackenbach, Kunst, Cavelaars, Groenhof
and Geurts, 1997). Inadequate sleep, nutrition and exercise have a compound negative impact on
the cognition, academic achievement and quality of life of young people. To improve education
it is essential that schools compensate for the physiological deficits suffered by young people
of low socioeconomic status. Because children and adolescents of low socioeconomic status
face the highest risks of poor outcomes, improving the physiological conditions that prepare
and consolidate learning has great potential to maximize cognitive gains among these students.
The cognitive improvement obtained by mitigating
physiological deficits depends on the time between the physiological intervention and the acquisition
of new knowledge, on the scale of minutes to hours. To achieve this, automated assessment
of individual student performance is essential. The systematic and dense mapping of
cognitive trajectories will give educators a much better understanding of the appropriate
psychological and physiological interventions, allowing education that is personalized yet
scalable. In developing countries with flagrant educational inequality, overcoming
physiological and assessment bottlenecks is likely to yield large cognitive benefits in the
poorest strata of society.
Just as ecological agriculture alternates crops and livestock across different plots of land,
preparing, fertilizing and clearing the soil through the monitored action of animals, plants
and fungi, in the new education students will be able to cycle through different stages of
memory acquisition and consolidation, reducing classroom overcrowding at no additional cost
and potentially contributing to the leveling of educational gradients around the world.
Schools would become places where learning is optimized by the cyclic alternation of meals,
sleep, physical exercise, classes and computerized tests. Regular classes, which are often
long and monotonous, could be replaced by shorter and more effective ones, freeing time for
physiological activities and assessment, with a focus on synergy and on the self-regulation
of students.
Rumo ao cultivo ecológico da mente
Propuesta Educativa, Año 25, Nro. 46, págs. 42 a 49, Noviembre de 2016
References
 Bedi, G.; Carrillo, F.; Cecchi, G.A.; Slezak, D.F.; Sigman, M.; Mota, N.B., et al. (2015), “Automated analysis of 
free speech predicts psychosis onset in high-risk youths”, in Schizophrenia, 1, 15030. 
 Berchicci, M.; Pontifex, M.B.B.; Drollette, E.S.S.; Pesce, C.; Hillman, C.H.H. & Di Russo, F. (2015), “From cog-
nitive motor preparation to visual processing: The benefits of childhood fitness to brain health”, in Neu-
roscience, 298, pp. 211-9. 
 Bertola, L.; Mota, N.B.; Copelli, M.; Rivero, T.; Diniz, B.R.; Romano-Silva, M.A., et al. (2014), “Graph analysis 
of verbal fluency test discriminate between patients with Alzheimer’s disease, mild cognitive impair-
ment and normal elderly controls”, in Frontiers in Aging Neuroscience, 6, pp. 1-10. 
 Chaddock, L.; Pontifex, M.B.; Hillman, C.H. & Kramer, A.F. (2011), “A review of the relation of aerobic
fitness and physical activity to brain structure and function in children”, in Journal of the International
Neuropsychological Society, 17, pp. 975-985. 
 Diamond, A. (2013), “Executive functions”, in Annual Review of Psychology, 64, pp. 135-68.
 Diamond, A. & Lee, K. (2011), “Interventions shown to aid executive function development in children 4 
to 12 years old”, in Science, 333, pp. 959-964. 
 Diekelmann, S. & Born, J. (2010), “The memory function of sleep”, in Nature Reviews Neuroscience, 11, pp. 
114–126. 
 Fernandes, V.R.; Ribeiro, M.L.; Melo, T.; Maciel-Pinheiro, P.D.; Guimarães, T.T.; Araújo N.B., et al. (2016), 
“Motor coordination correlates with academic achievement and cognitive function in children”, in Fron-
tiers in Psychology, 7, 318.
 Fiuza-Luces, C.; Garatachea, N.; Berger, N.A. & Lucia, A. (2013), “Exercise is the real polypill”, in Physiology, 
28, pp. 330-58. 
 Goldin, A.P.; Hermida, M.J.; Shalom, D.E.; Costa, M.E.; Lopez-Rosenfeld, M.; Segretin, M.S., et al. (2014), 
“Far transfer to language and math of a short software-based gaming intervention”, in Proceedings of 
the National Academy of Sciences USA, 111, pp. 6443-6448.
 Grissmer, D.; Grimm, K.J.; Aiyer, S.M.; Murrah, W.M. & Steele, J.S. (2010), “Fine motor skills and early com-
prehension of the world: Two new school readiness indicators”, in Developmental Psychology, 46, pp. 
1008–1017.
 Hillman, C.H.; Erickson, K.I. & Kramer, A.F. (2008), “Be smart, exercise your heart: exercise effects on brain 
and cognition”, in Nature Reviews Neuroscience, 9, pp. 58-65.
 Korol, D.L. & Gold, P.E. (1998), “Glucose, memory, and aging”, in American Journal of Clinical Nutrition, 67, pp. 
764S-771S. 
 Kurdziel, L.; Duclos, K. & Spencer, R.M. (2013), “Sleep spindles in midday naps enhance learning in
preschool children”, in Proceedings of the National Academy of Sciences USA, 110, pp. 17267-17272. 
 Lemos, N.; Weissheimer, J. & Ribeiro, S. (2014), “Naps in school can enhance the duration of declarative 
memories learned by adolescents”, in Frontiers in Systems Neuroscience, 8, 103.
 Lomas, D.; Patel, K.; Forlizzi, J.L. & Koedinger, K.R. (2013), “Optimizing challenge in an educational game 
using large-scale design experiments”, in Proceedings of the SIGCHI Conference on Human Factors in Com-
puting Systems.
 Lopez-Rosenfeld, M.; Goldin, A.P.; Lipina, S.J.; Sigman, M. & Slezak, D. F. (2013), “Mate Marote: A flexible au-
tomated framework for large-scale educational interventions”, in Computers & Education, 68, pp. 307-313.
Sidarta Ribeiro, Natalia Mota y Mauro Copelli
 Mackenbach, J.P. & Howden-Chapman, P. (2003), “New perspectives on socioeconomic inequalities in 
health”, in Perspectives in Biology and Medicine, 46, pp. 428-444.
 Mackenbach, J.P.; Kunst, A.E.; Cavelaars, A.E.; Groenhof, F.; Geurts, J.J. & The EU Working Group on So-
cioeconomic Inequalities in Health (1997), “Socioeconomic inequalities in morbidity and mortality in 
western Europe”,  in Lancet, 349, pp. 1655-1659. 
 Magnisalis, I.; Demetriadis, S. & Karakostas, A. (2011), “Adaptive and intelligent systems for collaborative 
learning support: A review of the field”, in IEEE Transactions on Learning Technologies, 4, pp. 5-20.
 Mander, B.A.; Santhanam, S.; Saletin, J.M. & Walker, M.P. (2011), “Wake deterioration and sleep restoration 
of human learning”, in Current Biology, 21, pp. 183-184. 
 Masley, S.; Roetzheim, R. & Gualtieri, T. (2009), “Aerobic exercise enhances cognitive flexibility”, in
Journal of Clinical Psychology in Medical Settings, 16, pp. 186-193. 
 Mota, N.B.; Vasconcelos, N.A.P.; Lemos, N.; Pieretti, A.C.; Kinouchi, O.; Cecchi, G.A., et al. (2012), “Speech 
graphs provide a quantitative measure of thought disorder in psychosis”, in PLoS ONE, 7, e34928. 
 Mota, N.B.; Furtado, R.; Maia, P.P.C.; Copelli, M. & Ribeiro, S. (2014), “Graph analysis of dream reports is 
especially informative about psychosis”, in Scientific Reports, 4, 3691. 
 Mota, N.B.; Weissheimer, J.; Madruga, B.; Adamy, N.; Bunge, S.A.; Copelli, M. & Ribeiro, S. (2016),
“Naturalistic assessment of the organization of children’s memories predicts cognitive functioning and reading 
ability”, in Mind, Brain and Education, 10, pp. 184-195.
 Mota, N. B.; Copelli, M. & Ribeiro, S. (2016), “Computational tracking of mental health in youth: Latin 
American contributions to a low-cost and effective solution for early psychiatric diagnosis”, in New Di-
rections for Child and Adolescent Development, 152, pp. 59-69.
 Odic, D.; Lisboa, J. V.; Eisinger, R.; Olivera, M. G.; Maiche, A. & Halberda, J. (2016), “Approximate number 
and approximate time discrimination each correlate with school math abilities in young children”, in 
Acta Psychologica, 163, pp. 17-26.
 Pontifex, M.B.; Raine, L.B.; Johnson, C.R.; Chaddock, L.; Voss, M.W.; Cohen, N.J., et al. (2011), “Cardiorespi-
ratory fitness and the flexible modulation of cognitive control in preadolescent children”, in Journal of 
Cognitive Neuroscience, 23, pp. 1332–1345. 
 Ribeiro, S. & Stickgold, R. (2014), “Sleep and school education”, in Trends in Neuroscience and Education, 3, 
pp. 18-23. 
 Sigman, M.; Peña, M.; Goldin, A.P. & Ribeiro, S. (2014), “Neuroscience and education: prime time to build 
the bridge”, in Nature Neuroscience, 17, pp. 497-502.
 UN-HABITAT (2007), State of the world’s cities 2006/7, London, Earthscan Publications Ltd.
 UN-HABITAT (2003), The challenge of slums: global report on human settlements, Earthscan Publications Ltd.
 Valladolid-Acebes, I.; Stucchi, P.; Cano, V.; Fernández-Alfonso, M.S.; Merino, B.; Gil-Ortega, M., et al. (2011), 
“High-fat diets impair spatial learning in the radial-arm maze in mice”, in Neurobiology of Learning and 
Memory, 95, pp. 80-5.
 Van Praag, H.; Kempermann, G. & Gage, F.H. (1999), “Running increases cell proliferation and neurogene-
sis in the adult mouse dentate gyrus”, in Nature Neuroscience, 2, pp. 266-270. 
 Voss, M.W.; Chaddock, L.; Kim, J.S.; Vanpatter, M.; Pontifex, M.B.; Raine, L.B., et al. (2011), “Aerobic fitness 
is associated with greater efficiency of the network underlying cognitive control in preadolescent chil-
dren”, in Neuroscience, 199, pp. 166-76. 
 Walker, E.; Rummel, N. & Koedinger, K.R. (2014), “Adaptive intelligent support to improve peer tutoring 
in algebra”, in International Journal of Artificial Intelligence in Education, 24, pp. 33-61. 
 Yang, G.; Lai, C.S.; Cichon, J.; Ma, L.; Li, W. & Gan, W.B. (2014), “Sleep promotes branch-specific formation 
of dendritic spines after learning”, in Science, 344, pp. 1173-8. 
Abstract
Among students of low socioeconomic status, school learning is directly impaired by the
major physiological limitations that typically accompany poverty, as well as by the poor
quality of the assessment of individual learning. Scarce resources and overcrowded homes
produce deficits in nutrition, sleep and physical exercise that impair learning through
physiological mechanisms that are well known but seldom considered in the school
environment. Overcrowded classrooms, in turn, hinder the reliable and sufficiently frequent
assessment of individual learning that could inform interventions focused on the specific
difficulties of each student. Automated tests of learning, based on the mathematical analysis
of discourse and on computer games, are low-cost, fast and scalable alternatives to
personalize and improve academic assessment. The essential goals of a new education capable
of effectively reducing the gap between rich and poor must include optimizing school
schedules by reducing class time in favor of optimized regimes of naps, physical exercise and
meals, as well as frequent automated assessments of individual performance that motivate
specific interventions based on deficits measured not by average learning curves across many
students, calculated occasionally, but by individual learning curves updated daily. These
strategies can be combined to positively reinforce, minutes after their detection, the
cognitive changes observed in specific students. Just as ecological agriculture promotes the
intelligent rotation of crops and inputs, it is necessary to build a new model of ‘ecological
education’ in which students cycle through different stages of memory acquisition and
consolidation, reducing classroom overcrowding without additional costs and potentially
contributing to the leveling of educational gradients all over the planet.

Key words:
Learning at school - Poverty - Physiological limitations for learning - School evaluation -
Ecological education
Chapter 4 - Cognitive decline during psychosis
This chapter presents two studies that use speech analysis tools to investigate 
and diagnose psychosis-related diseases. The first manuscript applies speech graph 
analysis to a recent-onset psychotic sample. Speech connectedness was tightly 
correlated with negative symptoms. We used these correlation coefficients to create a 
single index able to predict diagnosis six months in advance, using data exclusively 
from the first psychiatric interview of each patient. The second manuscript is a paper 
presented at an international conference, which aims to combine structural and 
semantic speech analysis to improve the assessment of negative symptoms. 
 
 
ARTICLE OPEN
Thought disorder measured as random speech structure
classifies negative symptoms and schizophrenia diagnosis
6 months in advance
Natália B. Mota1, Mauro Copelli2 and Sidarta Ribeiro1
In chronic psychotic patients, word graph analysis shows potential as complementary psychiatric assessment. This analysis relies
mostly on connectedness, a structural feature of speech that is anti-correlated with negative symptoms. Here we aimed to verify
whether speech disorganization during the first clinical contact, as measured by graph connectedness, can correctly classify
negative symptoms and the schizophrenia diagnosis 6 months in advance. Positive and negative syndrome scale scores and
memory reports were collected from 21 patients undergoing first clinical contact for recent-onset psychosis, followed for 6 months
to establish diagnosis, and compared to 21 well-matched healthy subjects. Each report was represented as a word-trajectory graph.
Connectedness was measured by number of edges, number of nodes in the largest connected component and number of nodes in
the largest strongly connected component. Similarities to random graphs were estimated. All connectedness attributes were
combined into a single Disorganization Index weighted by the correlation with the positive and negative syndrome scale negative
subscale, and used for classifications. Random-like connectedness was more prevalent among schizophrenia patients (64% vs. 5% in
the Control group, p = 0.0002). Connectedness from two kinds of memory reports (dream and negative image) explained 88% of
negative symptoms variance (p < 0.0001). The Disorganization Index classified low vs. high severity of negative symptoms with
100% accuracy (area under the receiver operating characteristic curve = 1), and schizophrenia diagnosis with 91.67% accuracy (area
under the receiver operating characteristic curve = 0.85). The index was validated in an independent cohort of chronic psychotic
patients and controls (N = 60) (85% accuracy). Thus, speech disorganization during the first clinical contact correlates tightly with
negative symptoms, and is quite discriminative of the schizophrenia diagnosis.
npj Schizophrenia  (2017) 3:18 ; doi:10.1038/s41537-017-0019-3
INTRODUCTION
Schizophrenia is associated with negative symptoms, major
impacts on social behavior and poor prognosis.1 In particular,
elevated negative symptoms are associated with low rates of
recovery.1, 2 Formal thought disorder—which comprises poverty
of speech, derailment, and incoherence—constitutes an important
set of psychotic symptoms, and negative formal thought disorder
is associated with the schizophrenia diagnosis even during first
episode psychosis.2, 3 The early stages of the disease constitute a
critical opportunity for prevention of major cognitive damage.4
Improved behavioral measures subjected to novel mathema-
tical analyses are emerging as part of a new field that uses
computational tools to better characterize psychiatric phenom-
ena.5–13 A particularly useful example of such computational
phenotyping is the assessment of verbal reports by graph analysis,
which provides a precise and automated quantification of speech
features that are related with negative symptoms9 and show
potential to help the differential diagnosis of psychosis.9, 10 By
representing each word as a node and the temporal sequence of
consecutive words as directed edges, it is possible to calculate
attributes that characterize graph structure.9, 10 The assessment of
dream reports from chronic psychotic patients has shown that
patients diagnosed with schizophrenia typically talk with fewer
words than those diagnosed with bipolar disorder or matched
controls.9, 10 Even when verbosity differences are controlled,
negative symptoms are anti-correlated with various measures of
word connectedness (such as number of edges, and the amount
of nodes in the largest connected component—LCC and in the
largest strongly connected component—LSC). Overall, the higher
the graph connectedness, the lesser the negative symptoms.9
An interesting point is that dream reports were especially
informative regarding the schizophrenia diagnosis and correlations
with negative symptoms compared to reports from waking activities.
The same graph attributes, when calculated from short-term memory
reports produced by healthy children, were positively correlated with
Intelligence Quotient and Theory of Mind scores, and could predict
academic performance independently of other cognitive measures.14
Interestingly, reports related to long-term memories were not
correlated with cognitive measurements.14 Altogether, these data
add to the notion that word connectedness rises during healthy
development, but not during the course of schizophrenia.9, 10, 14
Although this hypothesis can only be directly addressed with a
longitudinal design, we found a positive exponentially saturating
relationship between educational level and connectedness in healthy
controls, in a cross-sectional study of a larger sample with a wide
span of educational levels.15 Importantly, this education-dependent
dynamics was blurred in the psychosis group.
Received: 23 November 2016 Revised: 3 March 2017 Accepted: 22 March 2017
1Brain Institute, Federal University of Rio Grande do Norte, UFRN, Natal, RN, Brazil and 2Physics Department, Federal University of Pernambuco, UFPE, Recife, PE, Brazil
Correspondence: Natália B. Mota (nataliamota@neuro.ufrn.br)
www.nature.com/npjschz
Published in partnership with the Schizophrenia International Research Society
The results led us to hypothesize that early markers of speech
disorganization during recent-onset psychosis, such as decreased
connectedness, may be able to correctly classify the severity of
negative symptoms as well as the schizophrenia diagnosis. Here we
tested four specific hypotheses: (1) Speech connectedness from
dream reports9 and short-term memory reports14 can discriminate
the schizophrenia diagnosis; (2) Patients in the schizophrenia group
produce verbal reports less connected and more similar to random
connectedness than those from Bipolar or Control groups; (3)
Connectedness attributes correlate negatively with negative symp-
toms9; (4) A single index combining connectedness attributes highly
correlated with negative symptoms will improve the schizophrenia
diagnosis and the classification of negative symptom severity.
RESULTS
Patients seeking treatment for the first time for psychotic symptoms,
without neurological or drug-related disorders, were interviewed in
2014 and 2015 (N = 21). After a 6-month follow-up, 11 patients were
diagnosed with schizophrenia and 10 with Bipolar disorder
(Table 1, Fig. 1a). The schizophrenia group used more atypical
antipsychotic medications and fewer mood stabilizers than the Bipolar
group (Table 1). As controls, healthy subjects matched for sex, age,
and education were recruited and interviewed in public schools (N =
21). Despite the absence of significant differences in
demographic characteristics (age, sex, educational level, and family
income) or disease duration (Table 1), the schizophrenia group had
substantially more males than the other groups, as well as a lower
educational level. For this reason our analyses included gender, years
of education, age and chlorpromazine equivalent dose as potential
confounding factors.
Interviews included regular psychiatric anamnesis plus requests
to report a dream, a memory of the day that preceded the dream,
and the oldest memory recalled. Subjects were also requested to
imagine and report a short story based on three affective images
(one negative, one neutral, and one positive regarding affective
valence).14, 16 All the reports were limited to 30 s by the
interviewer. Whenever a subject interrupted a report before 30 s
had elapsed, he/she was prompted by the interviewer to continue
talking up to the time limit. The reports were audio recorded,
transcribed and represented as graphs with each word repre-
sented as a node and the temporal sequence between words
represented as directed edges (Fig. 2a).
Three connectedness attributes were calculated: the number of
edges (E); the number of nodes in the LCC, defined as the largest set
of nodes directly or indirectly linked by some path; and the
number of nodes in the LSC, defined as the largest set of nodes
directly or indirectly linked by reciprocal paths, so that all the
nodes in the component are mutually reachable, i.e., node ‘a’
reaches node ‘b’ and node ‘b’ reaches node ‘a’ (Fig. 2a). The use of
time-limited reports allowed us to take full advantage of group
differences in verbosity, which is directly measured by E.
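The graph construction and the three attributes described above can be sketched in a few lines. This is a minimal illustration rather than the software used in the study: the function names are hypothetical, words are split on whitespace only, and E is assumed to count every word-to-word transition, including repeated ones.

```python
from collections import defaultdict, deque


def build_word_graph(text):
    """Return the set of distinct words (nodes) and the list of directed transitions (edges)."""
    words = text.lower().split()
    return set(words), list(zip(words, words[1:]))


def _reachable(start, adjacency):
    """Breadth-first search: every node that can be reached from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def connectedness(text):
    """Compute E, LCC and LSC for a transcript (assumption: E counts all transitions)."""
    nodes, edges = build_word_graph(text)
    forward, backward, undirected = defaultdict(set), defaultdict(set), defaultdict(set)
    for a, b in edges:
        forward[a].add(b)
        backward[b].add(a)
        undirected[a].add(b)
        undirected[b].add(a)
    lcc = lsc = 0
    for node in nodes:
        # LCC ignores edge direction; LSC requires paths in both directions.
        lcc = max(lcc, len(_reachable(node, undirected)))
        lsc = max(lsc, len(_reachable(node, forward) & _reachable(node, backward)))
    return {"E": len(edges), "LCC": lcc, "LSC": lsc}


print(connectedness("i saw a dog and a cat and i ran"))
# → {'E': 9, 'LCC': 7, 'LSC': 6}
```

In this toy sentence the word "ran" is never followed by another word, so it stays outside the strongly connected component. Because the sketch links every consecutive pair in one unbroken sequence, the LCC trivially spans all distinct words; any segmentation applied by the original tool is not modeled here.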
Next, 1000 random graphs were created by preserving the same
nodes and amount of edges, but shuffling word sequences (Fig. 2b).
The z-scores of the original graph connectedness relative to the
random graph distributions (LCCz and LSCz) were then calculated to
estimate the degree of randomness of each graph (Fig. 2c). The
purpose of this analysis was to formally verify whether structural
aspects of thought disorder could be quantified by measuring the
similarity of verbal reports to randomized speech. In this way,
structural speech disorganization was mathematically defined as
similarity of the verbal reports to random graphs: if there is a
mathematical structure that determines a specific word sequence in
the speech graph, shuffling word order will disrupt this pattern and
LCC/LSC will change. Because each report is compared with random graphs that keep
strictly the same number of words, verbosity differences are
controlled.
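The comparison with shuffled speech can be sketched as follows for the LSC. The sketch assumes that a random graph is obtained by shuffling the word order of the transcript, which keeps the same words and the same number of transitions; the function names, the fixed seed and the handling of a zero-variance surrogate distribution are illustrative choices, not details taken from the study.

```python
import random
import statistics
from collections import defaultdict, deque


def lsc_size(words):
    """Number of nodes in the largest strongly connected component of the word graph."""
    forward, backward = defaultdict(set), defaultdict(set)
    for a, b in zip(words, words[1:]):
        forward[a].add(b)
        backward[b].add(a)

    def reach(start, adjacency):
        seen, queue = {start}, deque([start])
        while queue:
            for nxt in adjacency[queue.popleft()]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    return max((len(reach(w, forward) & reach(w, backward)) for w in set(words)), default=0)


def lsc_z(words, n_random=1000, seed=0):
    """z-score of the observed LSC relative to LSC values from shuffled word sequences."""
    rng = random.Random(seed)
    shuffled = list(words)
    surrogate = []
    for _ in range(n_random):
        rng.shuffle(shuffled)  # same words, same number of transitions, new order
        surrogate.append(lsc_size(shuffled))
    spread = statistics.pstdev(surrogate)
    if spread == 0:
        return 0.0  # assumption: no spread means the report cannot be told apart from random
    return (lsc_size(words) - statistics.mean(surrogate)) / spread


def is_random_like(z, threshold=2.0):
    """Random-like: the report lies within `threshold` standard deviations of the surrogates."""
    return abs(z) < threshold
```

A report made only of distinct words has no cycles in any ordering, so its LSC never differs from that of its shuffled versions and the sketch labels it random-like.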
Negative image reports from schizophrenia subjects showed
random-like connectedness (i.e., difference from random graph
distribution smaller than two standard deviations) more fre-
quently than reports from the Control group (64% of schizo-
phrenia group vs. 5% of Control group, Chi-square test p = 0.0002;
Fig. 2d and e). Reports from Bipolar subjects showed intermediate
random-like connectedness (30%; Fig. 2d and e).
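As a worked check of the group comparison, the Chi-square statistic for a 2 × 2 table can be computed directly. The counts below (7 of 11 schizophrenia reports and 1 of 21 Control reports classified as random-like) are an assumption inferred from the reported percentages and group sizes, and the sketch uses the plain Pearson statistic without continuity correction, which may differ from the exact variant used in the study.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson Chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Assumes every row and column total is non-zero.
    """
    observed = [[a, b], [c, d]]
    rows = [a + b, c + d]
    cols = [a + c, b + d]
    n = a + b + c + d
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Rows: Schizophrenia, Control; columns: random-like, not random-like (inferred counts).
stat, p_value = chi_square_2x2(7, 4, 1, 20)
print(round(stat, 2), p_value < 0.001)
# → 13.34 True
```

Under these assumed counts the p value is of the same order as the one reported above.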
The illustrative examples show that subjects from the
schizophrenia group report a story based on a recently seen
negative image with a less connected structure (fewer edges,
smaller LCC and LSC), more similar to what would be expected
from random graphs with the same words (LCCz and LSCz within
2 standard deviations), than subjects from other groups (Fig. 2f).
Table 1. Socio-economic and clinical information of Schizophrenia, Bipolar, and Control groups

Demographic characteristics      Schizophrenia     Bipolar disorder   Control           p value, S vs. (B + C)
Age (years)                      14.64 ± 2.57      15.30 ± 3.77       15.43 ± 3.55      0.5837
Family income (US$ per month)    326.14 ± 190.58   297.50 ± 166.94    368.42 ± 151.76   0.3746
Sex: male                        82%               27%                45%               0.0542
Sex: female                      18%               73%                55%
Years of education               5.73 ± 2.34       6.40 ± 3.77        8.05 ± 2.77       0.0810

Psychiatric assessment           Schizophrenia     Bipolar disorder   p value, S vs. B
Medication
  Typical antipsychotic          55%               60%                0.8008
  Atypical antipsychotic         82%               40%                0.0487
  Mood stabilizer                9%                70%                0.0041
  Benzodiazepine                 9%                10%                0.9435
  Antidepressants                9%                20%                0.4755
Disease duration (days)          339.36 ± 244.80   370.60 ± 306.08    1

Mean ± standard deviation of age in years, family income in USD per month, educational level in years, disease duration in days. Shown are the percentage of
male and female subjects per group, and the percentage of subjects under specific types of medication. P values of Wilcoxon rank-sum test or Chi-square test
between Schizophrenia vs. Bipolar and Control groups (general information), or Schizophrenia vs. Bipolar group (clinical information). Group labels follow
the diagnosis established after 6 months of follow-up
Using 5 connectedness attributes from each memory report as
inputs to a binary classifier, only dream reports and negative image
reports discriminated the schizophrenia diagnosis from
other conditions (Bipolar disorder or Control), with area under the
receiver operating characteristic curve (AUC) >0.75 and accuracy
(Acc) >75% correct (Fig. 3a, Supplementary Table 1). Dream reports
yielded better classification than negative image reports (Fig. 3a,
Supplementary Table 1). However, some subjects were unable to
recall a dream during their first interview (Fig. 1a): 36% of
the schizophrenia group (N = 4) and 20% of the Bipolar group (N = 2),
but none of the Control subjects. Further analyses therefore
used only these 2 report types (dream and negative image).
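The two figures of merit quoted above can be computed from the scores of any binary classifier. The sketch below is illustrative only: the scores are hypothetical, the function names are not from the study, and no assumption is made about which classifier was actually used. AUC is computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
def roc_auc(positive_scores, negative_scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


def accuracy(positive_scores, negative_scores, threshold):
    """Fraction of cases classified correctly when scores >= threshold are called positive."""
    correct = sum(s >= threshold for s in positive_scores)
    correct += sum(s < threshold for s in negative_scores)
    return correct / (len(positive_scores) + len(negative_scores))


# Hypothetical classifier scores for schizophrenia (positive) and other subjects (negative).
schizophrenia = [3, 4, 5]
others = [1, 2, 3]
print(round(roc_auc(schizophrenia, others), 3), round(accuracy(schizophrenia, others, 3), 3))
# → 0.944 0.833
```

AUC summarizes performance over all possible thresholds, whereas accuracy depends on the single threshold chosen, which is why both values are usually reported together.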
Non-parametric statistical tests were chosen to assess the
dataset, which was not normally distributed but had homoge-
neous variances (Supplementary Table 2). As predicted, schizo-
phrenia subjects produced less connected reports than subjects
from other groups, with fewer edges and smaller connected
components (Figs 2f and 3b, Supplementary Table 2). In the
control group there were no gender-related differences for any
graph attribute from any kind of report (Supplementary Table 2).
When verbosity was controlled by dividing E, LCC and LSC by
word rate (number of words produced in the 30 s reports),
negative image reports still showed significant LSC differences
(Fig. 3c; Kruskal–Wallis test p = 0.0145; LSC/word rate Schizophrenia
< Control with p = 0.0033 and Schizophrenia < [Control + Bipolar]
with p = 0.0055, Wilcoxon rank-sum test). Negative image
reports also showed higher similarity with random connectedness
(LSCz was smaller for the Schizophrenia group than for the Control
group, Wilcoxon rank-sum test p = 0.0033, and smaller than for the
Control + Bipolar groups, Wilcoxon rank-sum test p = 0.0060;
Fig. 3d, Supplementary Table 2).
In further agreement with our prediction, connectedness attributes
were anti-correlated with the PANSS negative subscale for dream and
negative image reports (Supplementary Table 3), and there were no
significant correlations between any connectedness attribute and the
potential confounding factors age, years of education or chlorpro-
mazine equivalent dose (Supplementary Table 4). Interestingly,
connectedness attributes from negative image reports were more
frequently correlated with negative symptoms than connectedness
attributes from dream reports (Supplementary Table 3).
Fig. 1 Illustrative diagrams of the flow of participants. a Using Dream + Negative image reports, only Negative image reports, or only Dream
reports. Control subjects were excluded from positive and negative syndrome scale (PANSS) analyses because they were “not clinical”, i.e., they
were not in clinical settings. b Through the validation in an independent cohort of chronic psychotic patients. Schizophrenia (S), Bipolar (B) and
Control (C) groups
Thought disorder measured as random speech structure
NB Mota et al.
3
Published in partnership with the Schizophrenia International Research Society npj Schizophrenia (2017)  18 
126
Next we combined all the connectedness attributes that
showed significant differences among the groups. Multiple linear
correlations were calculated between total PANSS negative
subscale scores and seven attributes from both kinds of memory
report (E, LCC, LSC, and LSCz from negative image reports; E, LCC,
and LSC from dream reports), or four attributes exclusively from
negative image reports, or three attributes exclusively from dream
reports. Since all these parameters are to some extent correlated
with verbosity,9 collinearity among attributes is a serious concern.
To address this issue we performed a collinearity diagnosis and
sequentially excluded the most collinear variables until a
combination without collinearity was reached.
The combination of non-collinear connectedness attributes
from both kinds of reports explained nearly all the variance in total
negative symptoms (Fig. 4a; R2 = 0.88, p < 0.0001, observed power
= 1), while using only negative image reports explained substan-
tially less (Fig. 4a; R2 = 0.74, p < 0.0001, observed power = 0.9998),
and using only dream reports even less (Fig. 4a; R2 = 0.49,
p = 0.0182, observed power = 0.8764). The following equations
defined “Disorganization Indices” for either a combination of
dream and negative image reports, or separately for negative
image or dream reports:
Disorganization Index (Negative + Dream) = 30.78 + LSC negative × (0.015) + LSCz negative × (2.33) + LCC dream × (0.20)
Disorganization Index (Negative) = 31.43 + LCC × (0.30) + LSC × (0.08) + LSCz × (2.12)
Disorganization Index (Dream) = 27.82 + LCC × (0.32) + LSC × (0.012)
The schizophrenia group showed a higher Disorganization
Index than the other groups using both kinds of reports
(Kruskal–Wallis p = 0.0035, Fig. 4b, Supplementary Table 5), using
only negative image reports (Kruskal–Wallis p = 0.0044, Fig. 4b,
Supplementary Table 5), or using only dream reports
Fig. 2 Speech graph connectedness attributes and random-like connectedness in schizophrenia. a Illustrative example of a text represented
as a graph, showing connectedness attributes Edges, LCC, and LSC. b Illustrative example of random graphs created from an original report.
By shuffling word order 1000 times, surrogate graphs maintained the same words but displayed a random word structure (displaced words
in red). c Examples of one negative image report compared to 1000 random graphs for each group. Estimation of original LSC (red dot)
distance from the distribution of 1000 random graphs (blue histogram) by z-score (LSCz). d LSCz histogram from each diagnostic group, considering
as random-like speech those reports with LSCz between −2 and 2 (within 2 standard deviations of the random graph distribution mean). e Percentage of random-like
reports in each diagnostic group (asterisk means p < 0.05, χ2 test). f Representative graphs for each group, obtained from negative image
reports
(Kruskal–Wallis p = 0.0070, Fig. 4b, Supplementary Table 5). The
Disorganization Index from both kinds of reports correctly
classified the schizophrenia diagnosis with accuracy higher than
90%, and also classified the negative symptoms severity perfectly
(Fig. 4c, Table 2). The Disorganization Indices calculated exclu-
sively from negative image reports or from dream reports were
also discriminative, but less so (Fig. 4c, Table 2).
To assess how much of the information in the Disorganization
Index is due to verbosity differences, we tested its correlation with
word rate. All three Disorganization Indices were correlated
with word rate (Spearman correlation between word rate and
Disorganization Index from dream and negative image reports:
Rho = −0.67, p = 0.0059; exclusively from negative image reports:
Rho = −0.84, p < 0.0001; exclusively from dream reports:
Rho = −0.96, p < 0.0001). However, the correlation between the
Disorganization Indices and negative symptoms remained significant
when adjusted for word rate (Spearman correlation adjusted
for word rate between PANSS negative subscale and
index from dream and negative image reports: Rho = 0.84,
p = 0.0001; index from negative image reports only: Rho = 0.57,
p = 0.0087), except for the Disorganization Index calculated
exclusively from dream reports (Rho = 0.18, p = 0.5346; Bonferroni
correction for 3 comparisons, α = 0.0167).
Importantly, there was an 82% overlap between the schizo-
phrenia group and the psychotic patients that presented high
scores in the PANSS negative subscale. Also, there was no
significant Spearman correlation between any Disorganization
Index and the potential confounding factors age, years of
education and chlorpromazine equivalent dose (Supplementary
Table 6), nor did these factors disrupt the Spearman
correlation between Disorganization Index and PANSS negative
subscale when considered as adjustment (Supplementary Table 6),
Fig. 3 Comparison of different methods for eliciting informative reports in terms of their discrimination performance for schizophrenia.
Dream and Negative image reports are more discriminating than long-term memories. a Schizophrenia diagnostic classification using 5
connectedness attributes (E, LCC, LSC, LCCz, and LSCz) from 6 time-limited memory reports. Only dream and negative image reports
classified the Schizophrenia group vs. Bipolar and Control groups with AUC > 0.75 and accuracy > 75%. b Connectedness attributes from dream
and negative image reports compared between groups. c LSC normalized by word rate from dream and negative image reports compared
between groups. d The z-scores of the original graph connectedness relative to the random graph distributions (LCCz and LSCz) from dream
and negative image reports compared between groups. Bar plots indicate median values and error bars indicate standard error of the mean
(s.e.m.); Kruskal–Wallis tests: p value for dream/negative image reports indicated in each title; Wilcoxon–Ranksum tests (Bonferroni corrected
for 8 comparisons: 4 comparisons for each of the 2 memory reports, S×B, S×C, S×(B + C), and B×C): # means p < 0.0063, Schizophrenia vs. Bipolar
and Control groups; asterisk means p < 0.0063, Schizophrenia vs. Bipolar or Control groups
Fig. 4 Disorganization Index classifies negative symptoms severity and schizophrenia diagnosis 6 months in advance. a Multiple linear
correlation between PANSS negative subscale vs. Disorganization Index from dream + negative image reports, from negative image reports, or
from dream reports (R2 and p value indicated in the title; linear coefficients used to calculate Disorganization Index given in Results). b Bar plot of the
mean and standard error of Disorganization Index from dream + negative image reports, from negative image reports, or from dream reports
for diagnostic groups (schizophrenia in red, bipolar in blue and control in black; bar plots indicate median values and error bars indicate
s.e.m.; Kruskal–Wallis tests (Bonferroni corrected for 6 comparisons (2 memory reports × 3 groups)): p value indicated in each title; #
indicates p < 0.0063, Schizophrenia > Bipolar and Control groups; asterisk indicates p < 0.0063, Schizophrenia > Bipolar or Control groups). c
Classification quality using only Disorganization Index from dream + negative image reports, from negative image reports, or from dream
reports (measured by AUC and Accuracy: classification of schizophrenia diagnosis 6 months in advance (black); Negative Symptom Severity
measured by PANSS negative subscale (gray)). d Validation of the Disorganization Index using dream reports from an independent cohort of
chronic psychotic patients.9 Multiple linear correlation between PANSS negative subscale vs. Disorganization Index (R2 and p value indicated
in the title; linear coefficients used to calculate Disorganization Index given in Results), statistical comparison (schizophrenia in red, bipolar in blue and
control in black; Kruskal–Wallis tests: p value indicated in each title; # indicates p < 0.0063, Schizophrenia > Bipolar and Control groups;
asterisk indicates p < 0.0063, Schizophrenia > Bipolar or Control groups) or classification quality (measured by AUC and Acc: classification of
schizophrenia diagnosis 6 months in advance (black); Negative Symptom Severity measured by PANSS negative subscale (gray))
except for the effect of medication dose in the correlation
between negative symptoms and the Disorganization Index
calculated exclusively from dream reports (Supplementary Table 6).
This could be due to a weaker relationship with negative
symptoms in these reports, or to a smaller sample of dream
reports in comparison to negative image reports, since not all
subjects were able to recall dreams.
To validate the method in an independent cohort, the same
strategy was applied to dream reports of a previously collected
sample of chronic psychotic patients and controls,9 which was not
normally distributed and had homogeneous variances (Fig. 1b,
Supplementary Table 5). There was a similar multiple correlation of
connectedness attributes with negative symptoms (R2 = 0.54,
p < 0.0001, observed power = 1), which after the exclusion of
collinear variables led to a Disorganization Index = 93.91 +
E × (−3.08) + LSC × (0.21). The statistical differences among the
groups resembled those found in the recent-onset psychosis
sample (Kruskal–Wallis p < 0.0001, Fig. 4d, Supplementary Table 5),
and the Disorganization Index was also quite informative of the
schizophrenia diagnosis and the severity of negative symptoms
(Fig. 4d, Table 2). Diagnosis and symptom severity classification
could also be validated by applying the index calculated from one
sample to the other sample (Supplementary Table 7).
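As an illustration, the chronic-sample equation printed above can be evaluated directly. The following is a minimal Python sketch, not the authors' code (the analyses used Matlab); the function name and the example attribute values are hypothetical, and only the intercept and the two coefficients come from the text.

```python
def disorganization_index_chronic(edges: float, lsc: float) -> float:
    """Chronic-sample index as printed in the text:
    93.91 + E x (-3.08) + LSC x (0.21)."""
    return 93.91 + edges * (-3.08) + lsc * 0.21


# Hypothetical attribute values, chosen only to show the direction of the effect:
# with LSC held fixed, a report with fewer edges receives a higher index.
sparse_report = disorganization_index_chronic(edges=24.0, lsc=5.0)
dense_report = disorganization_index_chronic(edges=27.0, lsc=5.0)
print(round(sparse_report, 2), round(dense_report, 2))
```

Because the coefficient of E is negative, lower edge counts translate into higher index values in this sketch, in line with the reported direction of the group differences.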
Finally, in both the recent-onset and chronic psychosis samples,
there were no statistically significant differences between the
Bipolar and the Control groups for any connectedness attribute,
either in isolation or combined into the Disorganization Index
(Supplementary Tables 2 and 5).
DISCUSSION
One of the promises of computational psychiatry is to provide
quantitative phenotyping of relevant psychiatric symptoms.5–7, 17
Here we showed that speech graph analysis allows for the
structural quantification of formal thought disorder, mathemati-
cally defined by the linear combination of connectedness graph
attributes and their degree of similarity to randomly generated
graph attributes. This procedure offers unbiased and precise
numbers to what was previously only described by words. While
the results can be partially explained by verbosity differences,
especially with regard to dream reports, subjects from the
schizophrenia group showed smaller LSC even after controlling
for verbosity (either normalizing attributes by word rate, or
comparing to random graphs with the same amount of words).
Furthermore, verbosity could not explain the relationship between
negative symptoms and Disorganization Indices, except for the
Index calculated exclusively from dream reports.
The four hypotheses raised were verified. Dream and negative
image short-term memory reports could be used—and their
combination was optimal—to discriminate the schizophrenia
diagnosis 6 months in advance. Connectedness attributes from
dream reports were most discriminative of schizophrenia, with
better performance than connectedness attributes from waking
reports.9 However, the difficulty shown by some subjects to recall
dreams was a practical clinical concern that could be circum-
vented using short-term memory reports based on affective
images.14 As predicted, short-term memory reports were more
informative than long-term memory reports (“yesterday” or
“oldest” memories).
The results show that connectedness is often impaired in
schizophrenia patients, to the point of being indistinguishable
from random values in 64% of the subjects in this group. The
estimation of the randomness degree of connectedness provides
a quantitative measurement of thought disorder at the structural
level. Such structural disorganization is likely exacerbated in
subjects with advanced cognitive impairment, as in patients with
the psychopathological symptom “word salad”.18 Note in this
regard that connectedness as measured by graph analysis does
not directly estimate semantic relationships, although we have
recently reported a significant correlation (R = −0.4) between LSC
and semantic incoherence.19 Furthermore, the psychotic subjects
studied here were not expressing full-fledged “word salad”,
understood as extreme speech disorganization at both the
structural and semantic levels, which rarely occurs in early-
course psychosis. While the analogy with “word salad” must be
taken with caution, the quantitative method to assess thought
disorganization presented here has major potential for revealing
early signs of thought disorder, measurable even before semantic
incoherence becomes clinically evident.
The results also confirmed that connectedness is negatively
correlated with negative symptom severity. A linear combination
of connectedness attributes explained nearly all the variance of
the negative symptoms severity, and reached high classification
accuracy for negative symptom severity (100% when combining
both reports) and for schizophrenia diagnosis 6 months in
advance. There was a very high overlap (82%) between the
schizophrenia diagnosis and high scoring in the PANSS negative
subscale, but overall the accuracy was better for negative
symptoms severity than for DSM diagnosis. This raises the point
that precise behavioral measurements are more likely to describe
symptomatology than standard diagnosis.20 Importantly, it was
possible to correctly classify schizophrenia diagnosis and negative
symptom severity using the Disorganization Index from dream
reports of an independent cohort of chronic psychotic patients
and control subjects interviewed years before the present study.9
Table 2. Classification quality of sorting Schizophrenia patients from other subjects, or sorting between low and high negative symptom severity,
using the Disorganization Index obtained from dream + negative image reports, negative image reports, or dream reports only

Disorganization Index                    Classification   Sensitivity   Specificity   Precision   Recall   F-measure   AUC    Accuracy (%)
Dream + Negative (Recent-onset Sample)   S × (B + C)      0.92          0.76          0.91        0.92     0.91        0.85   91.67
                                         High × Low       1.00          1.00          1.00        1.00     1.00        1.00   100.00
Only Negative (Recent-onset Sample)      S × (B + C)      0.81          0.64          0.80        0.81     0.80        0.79   80.95
                                         High × Low       0.95          0.95          0.96        0.95     0.95        0.97   95.23
Only Dream (Recent-onset Sample)         S × (B + C)      0.78          0.62          0.80        0.78     0.79        0.77   77.78
                                         High × Low       0.73          0.67          0.73        0.73     0.73        0.80   73.33
Dream (Chronic Sample)                   S × (B + C)      0.85          0.78          0.85        0.85     0.85        0.81   85.00
                                         High × Low       0.73          0.62          0.72        0.73     0.72        0.81   73.33

The last row shows an independent validation of the Disorganization Index calculated for dream reports of a chronic psychotic sample.9 S × (B + C) indicates
that the classification was performed between the Schizophrenia group (S) vs. the sum of Bipolar and Control groups (B + C)
Of note, the Bipolar and Control groups could not be
differentiated using either connectedness attributes or the
Disorganization Index. Semantic computational strategies, rather
than the structural approach chosen here, may be better to predict
psychotic breaks during prodromal stages,11 or to differentiate
patients with Bipolar Disorder from healthy controls.21
Our study has some limitations worth mentioning. First, to
obtain sound psychopathological boundaries for the Disorganiza-
tion Index, i.e., more reliable estimations of the linear combination
coefficients, it will be necessary to investigate a larger sample
better matched for gender and educational level, with multiple
researchers scoring negative symptoms at high inter-rater
reliability. Second, the sample sizes of the present study were
based on the prevalence of schizophrenia. While the main results
reached very high observed power, future studies should also
consider statistical power a priori when planning sample sizes.
Third, the findings must be replicated with native speakers of other
languages to assert their general applicability. Fourth, the
medications taken by the schizophrenia and Bipolar groups could
not be rigorously matched due to treatment differences between
the pathologies, and to the non-interventional experimental
design. Indeed, we found an important impact of adjusting for
medication dose in the correlation of negative symptoms with the
Disorganization Index calculated exclusively from dream reports,
and therefore medication should be better controlled in future
studies. Fifth, the duration of psychotic symptoms before the first
clinical interview was estimated by interviews with families and
patients, and therefore was not precisely measured.22 Sixth, a
longitudinal prodromal evaluation is in order to describe how
graph attributes progress over time in relation to clinical evolution,
and how sensitive these attributes are to medication changes.
Beyond these limitations, our study exemplifies how computa-
tional strategies can precisely measure important psychiatric
symptoms using a naturalistic approach that mathematically
characterizes what psychiatrists have for decades subjectively
described in clinical practice. Graph analysis is a fast and low-cost
tool for complementary psychiatric evaluation. The recording of
two time-limited memory reports takes ~3 min, audio transcription
takes ~10 min, and data processing from text transcript to
graph analysis is nearly instantaneous.9 Whenever a patient fails to
recall a dream, it is still possible to calculate an accurate
Disorganization Index using only a negative image report. The
method presented is directly based on the psychopathological
description of formal thought disorder in schizophrenia, shows
substantial discriminative power, and represents a successful
translation of basic science into applied technology able to
improve clinical evaluation.
METHODS
Study design
This prospective study recruited patients interviewed during first clinical
contact for recent-onset psychosis in a public child psychiatric clinic
(CAPSi) in Natal, RN, Brazil, from August 2014 to July 2015. All patients had
the initial diagnosis of psychotic episode under evaluation, and were
followed up for 6 months by an interdisciplinary clinical team, who
evaluated information from different sources including family, school
environment, clinical assessment, and exams. After 6 months the cases
were discussed by the team and disease diagnosis was established
according to DSM IV criteria (applying SCID).23 This reference standard was
chosen for compatibility with previous studies using graph analysis to
investigate psychosis.9, 10 After the psychosis sample was collected, well-
matched controls were recruited at nearby public schools. The parameters
matched were age, sex, socio-economic status, and educational level.
Matching was facilitated by the fact that Brazilian public schools have high
levels of age-grade delay.24 Psychotic and control groups were collected as
convenience samples. Data analysis began after the entire sample was
collected and all patients had finished follow-up (the index was not
available during the clinical follow-up and diagnosis was not available
during the speech recording). The method was validated on dream reports
from an independent cohort of chronic psychotic subjects and matched
controls recruited as convenience samples at Hospitals Onofre Lopes and
João Machado (in Natal, RN, Brazil) between February 2008 and October
2012.
Sample sizes were based on Brazil’s prevalence25 of schizophrenia using
the following equation:
N = Z² × P × (1 − P) / d²
(Z statistic for a level of confidence = 1.96, considering 95% of
confidence interval; P was the prevalence, considered 0.57%,26 and d
was the precision = 0.05). The estimated sample size was N = 9. We
doubled the value of N, considering that some individuals would be
expected to receive a Bipolar disorder diagnosis at the end of the follow-up.
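The arithmetic behind the estimate can be checked with a few lines of Python. This is only a sketch of the prevalence-based formula above; the function and variable names are illustrative, and rounding up to the next whole participant is an assumption about how N = 9 was obtained.

```python
import math


def sample_size(z: float, prevalence: float, precision: float) -> float:
    """Prevalence-based sample size: N = Z^2 * P * (1 - P) / d^2."""
    return z ** 2 * prevalence * (1.0 - prevalence) / precision ** 2


# Values stated in the text: Z = 1.96, P = 0.57% (as a proportion), d = 0.05
n_raw = sample_size(z=1.96, prevalence=0.0057, precision=0.05)
n = math.ceil(n_raw)   # assumed rounding: next whole participant
n_doubled = 2 * n      # doubled to allow for Bipolar diagnoses at follow-up
print(round(n_raw, 2), n, n_doubled)
```

With these inputs the unrounded value is about 8.7, which rounds up to the reported N = 9 and doubles to 18.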
Participants
The study was approved by the UFRN Research Ethics Committee (permit #
742–116 for recent-onset psychosis sample, permit #102/06-98244 for
chronic psychosis sample). Pre-established exclusion criteria comprised
having any neurological symptom, or having drug-related disorders.
Twenty-two patients undergoing recent-onset psychosis (Table 1) were
recruited during first psychiatric interview and followed up for 6 months to
establish diagnoses. Inclusion criterion was to be seeking treatment for
psychotic symptoms for the first time (maximum duration of two years as
reported by patient and family members). One patient was excluded after
epilepsy diagnosis. Twenty-one healthy control subjects matched by age,
sex, and education were interviewed during regular class time in public
schools of Natal, RN, Brazil (Table 1). An additional exclusion criterion for
the Control group was having any psychiatric symptom or diagnosis, as
assessed during family member interviews.
The independent cohort comprised subjects diagnosed according to
DSM-IV9 with schizophrenia (n = 20), or Bipolar Disorder (n = 20), as well as
subjects without psychosis. Participants and legal guardians provided
written informed consent.
Protocol
Subjects were submitted to an audio-recorded interview that consisted of
requests for six time-limited memory reports. In order to minimize inter-
subject differences in word count, each report was limited to 30 s.
Whenever the subject spontaneously stopped the report, he/she was
encouraged to keep talking through general prompts such as “please, tell
me more about it”. When the report reached the 30-s limit, the interviewer
interrupted the report saying “ok”. The interview began with a request to
produce a “dream report” (either recent or remote). Next, the “oldest
memory report” was obtained by requesting the subjects to report the
most remote memory they could access at that moment. Then the subjects
were requested to report on their previous day (“yesterday report”), and
finally they were exposed to three images presented on a computer
screen, comprising a “highly negative image”, a “highly positive image”
and a “neutral image” from the IAPS database16 of affective images,
previously tested in children16 and psychotic subjects.27 Subjects were
instructed to pay attention to each image for 15 s and then report an
imaginary story based on it. The entire memory report protocol took up to
10 min to be completed. Subjects undergoing recent-onset psychosis were
then evaluated psychiatrically using the psychometric scale PANSS28
composed of three subscales (positive, negative, and general). The
negative subscale measured seven symptoms: Blunted affect (N1),
Emotional withdrawal (N2), Poor rapport (N3), Passive/apathetic social
withdrawal (N4), Difficulty in abstract thinking (N5), Lack of spontaneity
and flow of conversation (N6), Stereotyped thinking (N7).28 Only one
researcher performed PANSS scoring (NBM), and all the psychometric
evaluations were completed during the data collection, and therefore prior
to speech graph analysis.
Graph measures
The search for a discriminative index of connectedness was exploratory,
and for that we tested six different kinds of memory reports. Memory
reports were transcribed and represented as graphs in which each
word was represented as a node, and the temporal sequence bet-
ween consecutive words was represented by directed edges (Fig. 2a)
using the software SpeechGraphs (http://www.neuro.ufrn.br/softwares/
speechgraphs) (code freely available).9 Three connectedness attributes
were calculated: Edges (E), which measures the number of links between
words; LCC, which measures the number of nodes in the largest
component in which each pair of nodes has a path between them; and
LSC, which counts the number of nodes in the largest component in which
each pair of nodes is mutually reachable, i.e., node “a” reaches
node “b” and node “b” reaches node “a” (Fig. 2a).
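The three attributes can be illustrated with a short, dependency-free Python sketch. It is not the SpeechGraphs implementation: the function names are hypothetical, E is assumed here to count distinct directed edges, and every pair of consecutive words is linked, so the graph of a single uninterrupted transcript is always weakly connected and LCC equals the number of distinct words. How SpeechGraphs treats repeated edges or sentence boundaries is not specified in this section.

```python
from collections import defaultdict


def reachable(start, adjacency):
    """Return the set of nodes reachable from `start`, including `start` itself."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in adjacency.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def connectedness(words):
    """Return (E, LCC, LSC) for a sequence of words."""
    nodes = set(words)
    if not nodes:
        return 0, 0, 0
    # Assumption: one directed edge per distinct pair of consecutive words
    edges = set(zip(words, words[1:]))
    directed, undirected = defaultdict(set), defaultdict(set)
    for a, b in edges:
        directed[a].add(b)
        undirected[a].add(b)
        undirected[b].add(a)
    # LCC: largest component when edge direction is ignored
    lcc = max(len(reachable(n, undirected)) for n in nodes)
    # LSC: largest set of nodes that can all reach each other along directed edges
    reach = {n: reachable(n, directed) for n in nodes}
    lsc = max(sum(1 for m in reach[n] if n in reach[m]) for n in nodes)
    return len(edges), lcc, lsc


report = "i saw a dog and the dog saw a cat".split()
print(connectedness(report))  # (8, 7, 5)
```

In the example, the recurrence of "dog", "saw" and "a" closes two loops, so five of the seven words form a strongly connected component, whereas a sentence without repeated words has LSC = 1.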
We compared each memory report graph to 1000 random graphs built
with the same nodes and number of edges, but with a random shuffling of
the edges that amounts to shuffling words (Fig. 2b). Next we estimated the
LCC and LSC z-scores between each original graph and the corresponding
random graph distribution (Fig. 2c). These normalized attributes were
termed LCCz and LSCz. Formally, LCCz = (LCC − LCCmr) / LCCsdr and LSCz
= (LSC − LSCmr) / LSCsdr, with LCCmr and LSCmr corresponding respectively
to mean LCC and LSC values in the random graph distributions;
likewise, LCCsdr and LSCsdr denote the standard deviation of LCC and LSC
from the random graph distribution. A graph was considered random-like
when its connectedness attributes fell within two standard deviations from
the mean of the random distribution (Fig. 2d).
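The comparison with random graphs can be sketched in Python as follows. This is an illustrative reading of the procedure, not the authors' Matlab or SpeechGraphs code: surrogates are produced by shuffling word order, the sample standard deviation is used, a fixed seed is added for reproducibility, and an undefined z-score (zero spread among surrogates) is returned as NaN. All function names are hypothetical.

```python
import math
import random
import statistics
from collections import defaultdict


def lsc(words):
    """Size of the largest strongly connected component of the word graph."""
    adjacency = defaultdict(set)
    for a, b in zip(words, words[1:]):
        adjacency[a].add(b)

    def reachable(start):
        seen, stack = {start}, [start]
        while stack:
            for nxt in adjacency.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    reach = {w: reachable(w) for w in set(words)}
    return max((sum(1 for m in reach[n] if n in reach[m]) for n in reach), default=0)


def lsc_z(words, n_random=1000, seed=0):
    """LSCz = (LSC - mean of random LSC) / sd of random LSC."""
    rng = random.Random(seed)
    shuffled = list(words)
    samples = []
    for _ in range(n_random):
        rng.shuffle(shuffled)          # same words, random order
        samples.append(lsc(shuffled))
    sd = statistics.stdev(samples)     # assumption: sample standard deviation
    if sd == 0:
        return math.nan                # z-score undefined without spread
    return (lsc(words) - statistics.fmean(samples)) / sd


def is_random_like(z, threshold=2.0):
    """True when the report lies within `threshold` sd of the random mean."""
    return abs(z) <= threshold


report = "i saw a dog and the dog saw a cat and a bird".split()
z = lsc_z(report, n_random=200, seed=1)
print(z, is_random_like(z))
```

A report made only of distinct words always yields LSC = 1 under shuffling, so its z-score is undefined in this sketch; very short reports may therefore need separate handling.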
Analyses
All the statistical analyses used Matlab software. To avoid over-fitting and
better combine the most informative connectedness attributes, we first
applied five connectedness attributes (E, LCC, LSC, LCCz, and LSCz) from
each memory report as inputs to a Naïve Bayes classifier with cross-
validation (10-fold) implemented with Weka software,29 and trained for the
binary choice between the schizophrenia group vs. the sum of Bipolar and
Control groups, using as gold standard the diagnosis reached after
6 months of follow-up. Classification quality was assessed using Accuracy
(Acc, percentage of correctly classified subjects) and AUC. A threshold of
Acc = 75% correct or AUC = 0.75 was established in order to consider a
memory report informative (Fig. 3a, Supplementary Table 1). Using
Spearman correlations, we related each connectedness attribute from
each informative memory report to the PANSS negative subscale
(Supplementary Table 3), and compared the groups applying
Kruskal–Wallis and two-sided Wilcoxon–Ranksum test (Fig. 3b, Supple-
mentary Table 2). All statistical analyses were corrected for multiple
comparisons (Bonferroni). Normality and variance homogeneity were
assessed by the Kolmogorov–Smirnov and Levene tests, respectively. As
the sample distribution was not normal, we used only non-parametric
statistical tests.
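The Bonferroni thresholds quoted in this paper follow from dividing the nominal α by the number of comparisons. A minimal Python sketch, with hypothetical function names and an assumed nominal α of 0.05:

```python
def bonferroni_alpha(n_comparisons: int, alpha: float = 0.05) -> float:
    """Per-comparison significance threshold under Bonferroni correction."""
    return alpha / n_comparisons


def is_significant(p_value: float, n_comparisons: int, alpha: float = 0.05) -> bool:
    """True when p_value is below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_alpha(n_comparisons, alpha)


print(bonferroni_alpha(4))         # threshold for four group comparisons
print(bonferroni_alpha(3))         # threshold for three Disorganization Indices
print(is_significant(0.0087, 3))   # p value reported for the negative image index
print(is_significant(0.5346, 3))   # p value reported for the dream-only index
```

Under this sketch, four comparisons give the threshold 0.0125 and three comparisons give approximately 0.0167, matching the values stated in the text.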
To calculate the Disorganization Index, we began by selecting only the
connectedness attributes that presented any significant statistical differ-
ence between groups after Bonferroni correction. Following the selection
of these most informative connectedness attributes, they were combined
and correlated with the total score of the PANSS negative subscale using
multilinear regression (Fig. 4a). Multicollinearity diagnosis was performed
to guarantee a non-collinear combination. Variables with the largest
variance decomposition proportion whenever the conditioning index was
higher than ten were sequentially excluded until a non-collinear
combination was reached. Attribute coefficients were then extracted and
this linear combination was used to create the Disorganization Index
(equations given in the Results section). Since the sample size was
planned based on the prevalence of schizophrenia, we estimated the
statistical power a posteriori (observed power) to guarantee regression
results with power higher than 0.80.30 We also verified whether the
Disorganization Index differed between the groups using Kruskal–Wallis
and two-sided Wilcoxon Ranksum tests with Bonferroni correction for four
comparisons: Schizophrenia vs. [Bipolar + Control], Schizophrenia vs.
Bipolar, Schizophrenia vs. Control, Bipolar vs. Control (α = 0.0125; Fig. 4b,
Supplementary Table 4). Normality and variance homogeneity were
assessed by the Kolmogorov–Smirnov and Levene tests, respectively.
Partial Spearman correlations to control for confounding factors were
implemented using the Matlab function partialcorr.
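For a single covariate, a partial Spearman correlation can be sketched in plain Python by ranking the data and applying the first-order partial correlation formula. This is an illustration only; the function names are hypothetical, and whether it reproduces the internals of Matlab's partialcorr exactly is an assumption.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def partial_spearman(x, y, z):
    """Spearman correlation between x and y adjusted for one covariate z."""
    rxy, rxz, ryz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Hypothetical toy data: x = index, y = symptom score, z = word rate
x = [1, 2, 3, 4, 5]
y = [1, 3, 2, 5, 4]
z = [2, 1, 4, 3, 5]
print(spearman(x, y), partial_spearman(x, y, z))
```

Adjusting for several covariates at once, as for age, education and medication dose together, would require a regression on ranks rather than this first-order formula.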
To verify whether the Disorganization Index could classify the schizo-
phrenia diagnosis using only connectedness attributes from memory reports
recorded during the first psychiatric interview, a binary classifier Naïve
Bayes29 with 10-fold cross-validation was used to sort the patients that
6 months later received the schizophrenia diagnosis from other groups. To
verify whether the Disorganization Index could correctly sort patients with
severe negative symptoms from those with milder negative symptomatology,
the samples were divided into two subsamples with high (above the
median) and low (at or below the median) scores on the total PANSS negative
subscale. The cutoff was the PANSS median of the entire group of psychotic
patients (Schizophrenia + Bipolar). The median PANSS value was 16. Next we
verified whether the Naïve Bayes classifier was able to classify both samples
using only the Disorganization Index. Classification quality was verified by
measuring true positive rate (sensitivity), true negative rate (specificity),
precision, recall, f-measure, AUC and Acc (Table 2).
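The median split and the quality measures listed above can be sketched in Python. The scores and predictions below are hypothetical, the function names are illustrative, and the metrics are computed for a single positive class; summary values reported by Weka may instead be averaged across classes, which this sketch does not attempt to reproduce.

```python
import statistics


def median_split(scores):
    """Label scores above the median 'high' and the rest (<= median) 'low'."""
    cutoff = statistics.median(scores)
    labels = ["high" if s > cutoff else "low" for s in scores]
    return labels, cutoff


def classification_metrics(y_true, y_pred, positive="high"):
    """Sensitivity, specificity, precision, F-measure and accuracy (%)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        return num / den if den else float("nan")

    sensitivity = ratio(tp, tp + fn)   # true positive rate (recall)
    specificity = ratio(tn, tn + fp)   # true negative rate
    precision = ratio(tp, tp + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": ratio(2 * precision * sensitivity, precision + sensitivity),
        "accuracy": 100.0 * ratio(tp + tn, len(pairs)),
    }


# Hypothetical PANSS negative scores and hypothetical classifier output
panss_negative = [10, 12, 16, 16, 20, 25]
true_labels, cutoff = median_split(panss_negative)
predicted = ["low", "low", "high", "low", "high", "high"]
metrics = classification_metrics(true_labels, predicted)
print(cutoff, metrics)
```

Scores equal to the median fall in the low group here, following the "less than or equal to the median" rule stated in the text.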
The same strategy to obtain a Disorganization Index was validated in a
previously collected sample of dream reports from chronic psychotic
subjects and matched controls.9 As this previous protocol was not time-
limited, verbosity differences were controlled using average graph
attributes from 30-word graphs (see ref. 9 for details). The index was
also validated across samples (applying the index calculated
for dream reports from the recent-onset sample to the chronic psychotic data, and
the index calculated for the chronic psychotic sample to the recent-onset psychosis
data). Classification accuracy for schizophrenia diagnosis and negative
symptom severity was verified using Naïve Bayes classifiers (Supplemen-
tary Table 7). All the graph attribute measurements used in the current
study are available as Supplementary Information (Supplementary
Tables 8, 9, and 10). For research purposes only, all the raw transcribed
data are available on our webpage (http://neuro.ufrn.br/multiusuario/
cadastramento/?page_id=19).
ACKNOWLEDGEMENTS
We thank “CAPS Infantil Natal/RN”, “Hospital Universitário Onofre Lopes” and
“Hospital João Machado” for access to the patients; Diego Fernández-Slezak, Cláudio
Queiroz, Sandro de Souza and Mariano Sigman for insightful discussions; Débora
Koshiyama for bibliographic support; Pedro PC. Maia, Gabriel M. da Silva and Jaime
Cirne for IT support. In memory of Raimundo Furtado. Work supported by UFRN,
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), grants
Universal 408145/2016-1 and Research Productivity 308775/2015-5 and 310712/
2014-9; Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Projects OBEDUC- ACERTA 0898/2013 and STIC AmSud 062/2015; FAPESP Center for
Neuromathematics (grant # 2013/07699-0, São Paulo Research Foundation).
AUTHOR CONTRIBUTIONS
N.B.M. performed data collection; N.B.M. and S.R. prepared figures. N.B.M., S.R. and
M.C. contributed to study design, literature search, data analysis, data interpretation,
and writing.
COMPETING INTEREST
The authors declare that they have no competing interests.
REFERENCES
1. Austin, S. F. et al. Long-term trajectories of positive and negative symptoms in
first episode psychosis: a 10 year follow-up study in the OPUS cohort. Schizophr.
Res. 168, 84–91, doi:10.1016/j.schres.2015.07.021 (2015).
2. Andreasen, N. C. & Grove, W. M. Thought, language, and communication in
schizophrenia: diagnosis and prognosis. Schizophr. Bull. 12, 348–359 (1986).
3. Ayer, A. et al. Formal thought disorder in first-episode psychosis. Compr. Psy-
chiatry 70, 209–215, doi:10.1016/j.comppsych.2016.08.005 (2016).
4. Michel, C., Ruhrmann, S., Schimmelmann, B. G., Klosterkotter, J. & Schultze-Lutter,
F. A stratified model for psychosis prediction in clinical practice. Schizophr. Bull.
40, 1533–1542, doi:10.1093/schbul/sbu025 (2014).
5. Huys, Q. J. M., Maia, T. V. & Frank, M. J. Computational psychiatry as a bridge from
neuroscience to clinical applications. Nat. Neurosci. 19, 404–413 (2016).
6. Mota, N. B., Copelli, M. & Ribeiro, S. Computational tracking of mental health in
youth: Latin American contributions to a low-cost and effective solution for early
psychiatric diagnosis. New. Dir. Child. Adolesc. Dev. 2016, 59–69, doi:10.1002/
cad.20159 (2016).
7. Wang, X. J. & Krystal, J. H. Computational psychiatry. Neuron 84, 638–654,
doi:10.1016/j.neuron.2014.10.018 (2014).
8. Koutsouleris, N. et al. Multisite prediction of 4-week and 52-week treatment
outcomes in patients with first-episode psychosis: a machine learning approach.
Lancet Psychiatry 3, 935–946, doi:10.1016/S2215-0366(16)30171-7 (2016).
9. Mota, N. B., Furtado, R., Maia, P. P., Copelli, M. & Ribeiro, S. Graph analysis of
dream reports is especially informative about psychosis. Sci. Rep. 4, 3691,
doi:10.1038/srep03691 (2014).
10. Mota, N. B. et al. Speech graphs provide a quantitative measure of thought
disorder in psychosis. PLoS ONE 7, e34928, doi:10.1371/journal.pone.0034928
(2012).
11. Bedi, G. et al. Automated analysis of free speech predicts psychosis onset in high-
risk youths. npj Schizophr. 1, 15030, doi:10.1038/npjschz.2015.30 (2015).
Thought disorder measured as random speech structure
NB Mota et al.
Published in partnership with the Schizophrenia International Research Society npj Schizophrenia (2017)  18
12. Elvevåg, B., Foltz, P. W., Weinberger, D. R. & Goldberg, T. E. Quantifying inco-
herence in speech: an automated methodology and novel application to schi-
zophrenia. Schizophr. Res. 93, 304–316, doi:10.1016/j.schres.2007.03.001 (2007).
13. Cabana, A., Valle-Lisboa, J. C., Elvevag, B. & Mizraji, E. Detecting order-disorder
transitions in discourse: implications for schizophrenia. Schizophr. Res. 131,
157–164, doi:10.1016/j.schres.2011.04.026 (2011).
14. Mota, N. B. et al. A naturalistic assessment of the organization of children’s
memories predicts cognitive functioning and reading ability. Mind Brain Educ. 10,
184–195 (2016).
15. Mota, N. B. et al. The ontogeny of discourse structure mimics the development of
literature. Preprint at arXiv:1612.09268 (2016).
16. Lang, P. J., Greenwald, M. K., Bradley, M. M. & Hamm, A. O. Looking at pictures:
affective, facial, visceral, and behavioral reactions. Psychophysiology 30, 261–273
(1993).
17. Whitaker, K. J. et al. Adolescence is associated with genomically patterned con-
solidation of the hubs of the human brain connectome. Proc. Natl. Acad. Sci. USA.
113, 9105–9110, doi:10.1073/pnas.1601745113 (2016).
18. Kaplan, H. I. & Sadock, B. J. Kaplan & Sadock’s comprehensive textbook of psychiatry.
(Wolters Kluwer, Lippincott Williams & Wilkins, 2009).
19. Mota, N. B., Carrillo, F., Slezak, D. F., Copelli, M. & Ribeiro, S. in Fiftieth Asilomar
Conference on Signals, Systems and Computers. (IEEE Conference Publishing).
20. Insel, T. R. The NIMH research domain criteria (RDoC) project: precision medicine
for psychiatry. Am. J. Psychiatry 171, 395–397, doi:10.1176/appi.
ajp.2014.14020138 (2014).
21. Carrillo, F. et al. Emotional intensity analysis in Bipolar subjects. Preprint at
arXiv:1606.02231 (2015).
22. Breitborde, N. J., Srihari, V. H. & Woods, S. W. Review of the operational definition
for first-episode psychosis. Early Interv. Psychiatry 3, 259–265, doi:10.1111/j.1751-
7893.2009.00148.x (2009).
23. First, M. H., Spitzer, R. L., Gibbon, M. & Williams, J. Structured clinical interview for
DSM-IV Axis I disorders -- Research version, patient edition (SCID-I/P). (Biometrics
Research, New York State Psychiatric Institute, 1990).
24. INEP. Taxas de distorção idade-série, Brasil, regiões e Ufs, http://portal.inep.gov.br/
indicadores-educacionais (2015).
25. Daniel, W. W. Biostatistics: a foundation for analysis in the health sciences. 9th
edn, (Wiley, 2008).
26. Mari, J. J. & Leitao, R. J. A epidemiologia da esquizofrenia. Rev. Bras. Psiquiatr. 22,
15–17 (2000).
27. Yee, C. M. et al. Integrity of emotional and motivational states during the pro-
dromal, first-episode, and chronic phases of schizophrenia. J. Abnorm. Psychol.
119, 71–82, doi:10.1037/a0018475 (2010).
28. Kay, S. R., Fiszbein, A. & Opler, L. A. The positive and negative syndrome scale
(PANSS) for schizophrenia. Schizophr. Bull. 13, 261–276 (1987).
29. Hall, M. et al. The WEKA data mining software: an update. SIGKDD Explor. 11,
10–18 (2009).
30. Gatsonis, C. & Sampson, A. Multiple correlation: exact power and sample size
calculations. Psychol. Bull. 106, 516–524 (1989).
Open Access This article is licensed under a Creative Commons
Attribution 4.0 International License, which permits use, sharing,
adaptation, distribution and reproduction in any medium or format, as long as you give
appropriate credit to the original author(s) and the source, provide a link to the Creative
Commons license, and indicate if changes were made. The images or other third party
material in this article are included in the article’s Creative Commons license, unless
indicated otherwise in a credit line to the material. If material is not included in the
article’s Creative Commons license and your intended use is not permitted by statutory
regulation or exceeds the permitted use, you will need to obtain permission directly
from the copyright holder. To view a copy of this license, visit
http://creativecommons.org/licenses/by/4.0/.
© The Author(s) 2017
Supplementary Information accompanies the paper on the npj Schizophrenia website (doi:10.1038/s41537-017-0019-3).
Supplementary Information (SI):
Comprises 10 tables: 7 tables with statistical information (classification using different reports, comparison of
connectedness, correlation of connectedness versus PANSS, correlation of connectedness versus confounding
factors, comparison of Disorganization Indexes, correlations adjusted for confounding factors, and validation
of the Disorganization Index across samples) and 3 tables with raw data (from the recent-onset psychosis and
chronic psychosis samples).
 
 
  
Supplementary Table 1: Classification quality for discriminating the Schizophrenia group from other subjects
using a Naïve Bayes classifier with all 5 connectedness attributes (E, LCC, LSC, LCCz, LSCz) from
different time-limited memory reports.

Report Sensitivity Specificity Precision Recall F-Measure AUC Accuracy (%)
Dream 0.81 0.85 0.87 0.81 0.82 0.84 80.56 
Negative 0.76 0.68 0.78 0.76 0.77 0.78 76.19 
Positive 0.69 0.77 0.79 0.69 0.71 0.74 69.05 
Neutral 0.69 0.71 0.76 0.69 0.71 0.63 69.05 
Yesterday 0.69 0.54 0.70 0.69 0.70 0.64 69.05 
Oldest 0.57 0.56 0.66 0.57 0.60 0.62 57.14 
Supplementary Table 2: Statistical comparison of connectedness attributes (E, LCC, LSC, LCCz, LSCz)
between diagnostic groups (Schizophrenia = S, Bipolar = B, Control = C). The KS test rejects a normal
distribution for all samples (Bonferroni corrected for 2 comparisons, p < 0.0250 in bold); Levene’s test
verifies variance homogeneity (Bonferroni corrected for 2 comparisons, p < 0.0250 in bold). Kruskal-Wallis
test (SxBxC; Bonferroni corrected for 2 comparisons (2 memory reports), p < 0.0250 in bold);
Wilcoxon rank-sum test (SxB, SxC, Sx(B+C), BxC; Bonferroni corrected for 8 comparisons (4 comparisons
for each of the 2 memory reports), p < 0.0063 in bold). A Wilcoxon rank-sum comparison
between male and female subjects in the Control group showed no difference (Bonferroni corrected for 2
comparisons (2 memory reports), p < 0.0250 in bold).
 
KS test E LCC LSC LCCz LSCz 
Dream 
p value 3.60E-33 3.60E-33 5.45E-25 4.59E-16 1.01E-22 
h 1 1 1 1 1 
Negative 
p value 2.39E-38 9.68E-37 2.42E-27 2.66E-16 4.70E-21 
h 1 1 1 1 1 
Levene’s Test E LCC LSC LCCz LSCz 
Dream 0.0804 0.1266 0.0349 0.4017 0.0603 
Negative 0.1264 0.3544 0.095 0.8135 0.7008 
Kruskal-Wallis E LCC LSC LCCz LSCz 
Dream S x B x C 0.0070 0.0074 0.0112 0.3976 0.2240 
Negative S x B x C 0.0021 0.0056 0.0034 0.3197 0.0158 
Wilcoxon Ranksum E LCC LSC LCCz LSCz 
Dream 
SxB 0.0056 0.0031 0.0040 0.1893 0.1893 
SxC 0.0042 0.0045 0.0079 0.2652 0.1239 
Sx(B+C) 0.0021 0.0019 0.0031 0.1872 0.1013 
BxC 0.5418 0.8640 0.8642 0.9029 0.5419 
Negative 
SxB 0.0081 0.0181 0.0205 0.3418 0.1300 
SxC 0.0009 0.0022 0.0009 0.1421 0.0033 
Sx(B+C) 0.0005 0.0015 0.0008 0.1446 0.0060 
BxC 0.7997 0.6719 0.8823 0.7513 0.4856 
Wilcoxon Ranksum E LCC LSC LCCz LSCz 
Dream Male x Fem 0.6959 1.0000 0.6440 0.9717 0.9151 
Negative Male x Fem 0.6439 0.3922 0.5936 0.2707 0.2707 
 
 
  
Supplementary Table 3: Spearman correlation between connectedness attributes (E, LCC, LSC, LCCz,
LSCz) and negative symptoms measured by PANSS (total negative subscale, N1, N2, N3, N4, N5, N6,
N7), using dream or negative image reports. Rho and p values are shown (significant results in bold after
Bonferroni correction for 80 comparisons: 5 attributes * 2 reports * 8 symptoms, p < 0.0006).
 
 
  
Dream Reports E LCC LSC LCCz LSCz 
PANSS Negative Subscale Rho p Rho p Rho p Rho p Rho p 
Total -0.69 0.0046 -0.69 0.0042 -0.65 0.0089 -0.41 0.132 -0.16 0.5654 
N1 -0.71 0.0028 -0.71 0.0031 -0.72 0.0026 -0.34 0.2121 -0.23 0.4098 
N2 -0.85 0.0001 -0.8 0.0003 -0.76 0.0009 -0.39 0.1463 -0.2 0.4775 
N3 -0.57 0.0279 -0.57 0.0279 -0.56 0.0286 -0.25 0.3755 -0.11 0.6962 
N4 -0.56 0.0317 -0.48 0.0724 -0.4 0.1392 -0.11 0.6852 0.33 0.2355 
N5 -0.44 0.0978 -0.49 0.0634 -0.47 0.0757 -0.39 0.1459 -0.46 0.0836 
N6 -0.6 0.0192 -0.6 0.0183 -0.57 0.0281 -0.44 0.0988 -0.2 0.4774 
N7 0.63 0.0126 0.64 0.0101 0.6 0.0184 0.28 0.3200 0.33 0.2342 
Negative Image Reports E LCC LSC LCCz LSCz 
PANSS Negative Subscale Rho p Rho p Rho p Rho p Rho p 
Total -0.81 0.0000 -0.85 0.0000 -0.81 0.0000 -0.7 0.0005 -0.77 0.0001 
N1 -0.78 0.0000 -0.8 0.0000 -0.77 0.0000 -0.63 0.0021 -0.69 0.0006 
N2 -0.77 0.0000 -0.77 0.0001 -0.75 0.0001 -0.62 0.0027 -0.67 0.0008 
N3 -0.8 0.0000 -0.77 0.0000 -0.82 0.0000 -0.59 0.0051 -0.75 0.0001 
N4 -0.69 0.0006 -0.73 0.0002 -0.62 0.0026 -0.69 0.0005 -0.57 0.0065 
N5 -0.63 0.0024 -0.66 0.0011 -0.66 0.0012 -0.46 0.0364 -0.67 0.0008 
N6 -0.8 0.0000 -0.81 0.0000 -0.81 0.0000 -0.57 0.0065 -0.73 0.0002 
N7 0.32 0.1562 0.26 0.2543 0.24 0.2998 -0.02 0.9409 0.05 0.8288 
Supplementary Table 4: Spearman correlations between each graph attribute and confounding factors
(Bonferroni corrected for 30 comparisons: 2 memory reports, 3 confounding factors, and 5 graph attributes;
p < 0.0017).
 
  AGE EDUCATION AP DOSE (CLPeq) 
  Dream Negative Dream Negative Dream Negative 
  rho p rho p rho p rho p rho p rho p 
E -0.14 0.6291 0.17 0.4626 0.13 0.6455 0.47 0.0324 -0.50 0.0572 -0.42 0.0573 
LCC 0.01 0.9746 0.06 0.7865 0.27 0.3342 0.40 0.0746 -0.41 0.1270 -0.46 0.0357 
LSC -0.03 0.9034 0.35 0.1246 0.21 0.4563 0.60 0.0042 -0.51 0.0549 -0.30 0.1908 
LCCz 0.43 0.1065 -0.20 0.3821 0.32 0.2480 0.01 0.9529 -0.04 0.8890 -0.35 0.1207 
LSCz 0.36 0.1879 0.29 0.1952 0.32 0.2396 0.50 0.0203 -0.08 0.7798 -0.36 0.1085 
 
  
Supplementary Table 5: Statistical comparison of the Disorganization Index between diagnostic groups
(Schizophrenia = S, Bipolar = B, Control = C), considering dream + negative image reports, negative
image reports or dream reports, and applying the Disorganization Index from dream reports to an
independent chronic psychotic sample (ref. 9). The KS test rejects a normal distribution for all samples
(Bonferroni corrected for 4 comparisons, p < 0.0125 in bold); Levene’s test verifies variance
homogeneity (Bonferroni corrected for 4 comparisons, p < 0.0125 in bold). Kruskal-Wallis test (SxBxC;
Bonferroni corrected for 3 comparisons (3 Disorganization Indexes), p < 0.0167 in bold); Wilcoxon
rank-sum test (SxB, SxC, Sx(B+C), BxC; Bonferroni corrected for 8 comparisons (4 comparisons for each
memory report), p < 0.0063 in bold).
Disorganization Index Kruskal-Wallis (p) KS test (p) KS test (h) Levene's test (p) 
Dream + Negative S x B x C 0.0035 2.35E-31 1 0.0472 
Negative S x B x C 0.0044 1.89E-38 1 0.6966 
Dream S x B x C 0.0070 3.60E-33 1 0.1157 
Dream - Chronic Sample S x B x C 8.60E-06 2.87E-54 1 0.0268 
Disorganization Index - Wilcoxon Ranksum test (p values)
Comparison   Dream + Negative   Negative   Dream    Dream - Chronic Sample
SxB          0.0006             0.0221     0.0037   0.0011
SxC          0.0030             0.0013     0.0042   0.0000
Sx(B+C)      0.0009             0.0011     0.0018   0.0000
BxC          0.7511             0.7513     0.8452   0.0385
 
 
  
Supplementary Table 6: Controls for confound factor for episode psychosis group (age, educational 
level, and medication status). Spearman correlations between the disorganization indexes and each confounding factor,
and Spearman correlations between the disorganization indexes and negative symptoms (PANSS
negative subscale) adjusted for each confounding factor (Bonferroni corrected for 6 comparisons: 2 memory
reports and 3 confounding factors; p < 0.0083).
 
Confound Factors Dream+Negative Negative Dream 
Disorganization Index rho p rho p rho p 
Index x Age (years) -0.12 0.6688 -0.14 0.5375 -0.01 0.9848 
Index x Education (years) -0.20 0.4639 -0.42 0.0555 -0.27 0.3380 
Index x AP dose (CLPeq) 0.54 0.0385 0.32 0.1529 0.43 0.1108 
Index x PANSS negative rho p rho p rho p 
No Adjustment 0.92 0.0000 0.84 0.0000 0.70 0.0038 
By Age (years) 0.92 0.0000 0.84 0.0000 0.70 0.0054 
By Education (years) 0.91 0.0000 0.80 0.0000 0.68 0.0070 
By AP dose (CLPeq) 0.89 0.0000 0.84 0.0000 0.61 0.0202 
 
  
Supplementary Table 7: Validation of coefficients across different samples. Classification quality (using a
Naïve Bayes classifier) for sorting Schizophrenia patients from other subjects (Diagnosis), or sorting
between low and high negative symptom severity (Negative Symptoms), using the Disorganization Index
obtained from dream reports of the recent-onset psychotic sample (DI1) and applied to dream reports of a
chronic psychotic sample (ref. 9; Sample 2), or the Disorganization Index obtained from dream reports of the
chronic psychotic sample (DI2) and applied to dream reports of the recent-onset psychotic sample (Sample 1).
 
    AUC Accuracy (%) 
Sample 2 in DI1 
Diagnosis 0.74 76.67 
Negative Symptoms 0.82 70.00 
Sample 1 in DI2 
Diagnosis 0.81 80.56 
Negative Symptoms 0.78 73.33 
 
Supplementary Table 8: Raw data and PANSS from recent-onset psychosis sample. NA marks an empty cell; the PANSS columns are empty for all Control subjects and are omitted from those rows.
Column groups, left to right: Negative Image (WC to LSCz), Dream (WC to LSCz), PANSS (Total to N7)
NoID Subjects Group WC Edges LCC LSC LCCz LSCz WC Edges LCC LSC LCCz LSCz Total N1 N2 N3 N4 N5 N6 N7
Subject 01 Schizophrenia 18 17 15 4 1.50 0.96 29 26 22 2 1.73 -0.57 24 5 4 3 2 5 4 1
Subject 02 Schizophrenia 32 30 24 18 1.68 4.93 11 8 6 1 -0.09 -0.54 16 3 3 1 1 5 2 1
Subject 04 Schizophrenia 24 23 20 12 1.64 4.06 41 40 35 16 2.26 3.73 15 2 2 2 3 4 1 1
Subject 05 Schizophrenia 31 29 26 7 1.93 1.63 32 31 18 17 0.81 2.80 21 3 3 3 2 6 3 1
Subject 07 Schizophrenia 5 3 3 1 0.04 -0.30 NA NA NA NA NA NA 26 5 3 3 3 6 5 1
Subject 08 Schizophrenia 8 6 7 1 1.70 -0.48 24 20 18 16 1.07 8.40 31 4 4 4 6 6 5 2
Subject 09 Schizophrenia 13 7 6 1 0.40 -0.42 28 23 14 11 -1.69 3.56 25 4 5 3 5 4 3 1
Subject 10 Schizophrenia 30 27 26 8 1.93 2.58 NA NA NA NA NA NA 20 5 4 4 1 1 4 1
Subject 11 Schizophrenia 32 27 19 6 0.58 0.52 NA NA NA NA NA NA 33 5 5 4 5 6 5 3
Subject 03 Schizophrenia 20 18 15 8 0.77 3.33 NA NA NA NA NA NA 32 6 4 5 3 6 5 3
Subject 06 Schizophrenia 8 3 2 1 -1.29 -0.25 14 9 9 1 1.33 -0.64 34 6 5 5 4 7 6 1
Subject 12 Bipolar Disorder 34 31 28 14 2.10 4.53 67 63 39 36 1.30 3.45 8 1 1 1 1 2 1 1
Subject 15 Bipolar Disorder 33 28 24 19 1.40 6.05 22 19 17 10 1.39 4.16 16 3 3 2 3 2 2 1
Subject 17 Bipolar Disorder 65 62 43 40 1.78 4.63 67 63 42 39 1.61 4.33 14 2 1 1 1 4 1 4
Subject 13 Bipolar Disorder 15 9 6 1 -0.54 -0.57 NA NA NA NA NA NA 33 5 5 5 5 7 4 2
Subject 14 Bipolar Disorder 93 92 55 48 1.55 3.14 91 90 53 51 1.53 3.51 13 1 1 1 1 5 1 3
Subject 16 Bipolar Disorder 18 12 12 1 1.44 -0.64 NA NA NA NA NA NA 29 6 4 4 2 7 5 1
Subject 18 Bipolar Disorder 33 30 26 16 1.98 4.85 63 61 37 36 1.32 3.61 12 1 2 1 2 3 2 1
Subject 19 Bipolar Disorder 32 30 29 15 2.29 5.75 71 69 48 24 1.79 1.83 11 1 1 1 1 3 1 3
Subject 20 Bipolar Disorder 45 43 30 27 1.42 4.28 61 60 43 39 1.81 5.00 11 1 3 1 3 1 1 1
Subject 21 Bipolar Disorder 39 36 25 14 1.33 1.95 76 75 50 49 1.72 4.67 16 3 1 2 2 2 3 3
Subject 23 Control 68 67 46 44 1.77 4.99 86 85 55 52 1.72 4.10
Subject 24 Control 36 35 30 26 2.06 7.55 95 93 64 63 2.11 5.29
Subject 25 Control 42 41 31 28 1.68 5.38 56 53 35 32 1.49 4.13
Subject 26 Control 19 15 12 6 0.70 2.17 42 39 27 18 -0.61 3.78
Subject 28 Control 33 28 23 12 1.66 2.92 27 22 19 13 1.27 5.29
Subject 29 Control 42 40 29 20 0.62 3.97 24 21 15 8 1.07 1.80
Subject 31 Control 28 27 25 8 1.97 2.41 62 60 39 28 1.46 2.58
Subject 33 Control 34 33 25 23 1.52 5.25 33 30 25 9 1.24 2.28
Subject 35 Control 23 19 13 6 -0.74 2.46 47 43 33 23 1.37 4.49
Subject 38 Control 16 14 9 1 -0.79 -0.61 77 74 55 44 1.92 5.07
Subject 39 Control 58 56 39 35 1.64 4.45 97 96 58 57 1.60 3.96
Subject 40 Control 42 39 34 19 2.17 4.66 58 56 43 37 2.17 5.88
Subject 22 Control 70 68 49 47 1.62 6.01 88 86 60 59 2.12 5.55
Subject 27 Control 39 37 31 21 2.00 5.57 100 99 54 54 1.26 3.44
Subject 30 Control 45 43 36 28 2.17 6.46 76 75 51 50 1.84 4.76
Subject 32 Control 41 39 31 23 1.78 5.11 52 51 39 35 1.99 6.01
Subject 34 Control 34 30 26 11 1.90 2.91 28 23 11 1 -2.17 -0.69
Subject 36 Control 24 21 15 9 -0.41 3.67 26 20 10 1 -1.64 -0.61
Subject 37 Control 36 35 30 27 2.06 7.72 81 80 48 47 1.41 3.65
Subject 41 Control 31 29 26 9 1.91 2.30 57 55 39 36 1.81 4.98
Subject 42 Control 33 31 25 14 1.60 3.42 61 59 39 39 1.64 4.46
Supplementary Table 9: Demographic data and indices from recent-onset psychosis sample. NA marks an empty cell.
Column groups, left to right: Demographic/Medication (APDose to Sex), Disorganization Index (Dream+Negative, Dream, Negative)
NoID Subjects Group APDose Age Education Sex Dream+Negative Dream Negative
Subject 01 Schizophrenia 414 16 8 m 24.13 20.68 25.17
Subject 02 Schizophrenia 157 18 9 m 18.32 25.87 15.12
Subject 04 Schizophrenia 132 18 6 m 14.36 16.29 17.71
Subject 05 Schizophrenia 7 9 3 m 23.41 21.78 20.64
Subject 07 Schizophrenia 91 15 6 m NA NA 31.23
Subject 08 Schizophrenia 289 13 4 m 28.26 21.80 30.41
Subject 09 Schizophrenia 50 15 8 m 28.93 23.15 30.58
Subject 10 Schizophrenia 100 16 7 m NA NA 18.71
Subject 11 Schizophrenia 264 12 4 m NA NA 25.05
Subject 03 Schizophrenia 100 16 7 f NA NA 20.46
Subject 06 Schizophrenia 264 13 1 f 29.54 24.90 31.42
Subject 12 Bipolar Disorder 0 7 2 m 12.49 14.75 14.46
Subject 15 Bipolar Disorder 289 17 10 m 13.50 22.20 12.84
Subject 17 Bipolar Disorder 100 16 6 m 12.02 13.74 11.76
Subject 13 Bipolar Disorder 132 16 1 f NA NA 30.89
Subject 14 Bipolar Disorder 248 14 9 f 13.38 10.03 11.92
Subject 16 Bipolar Disorder 330 15 1 f NA NA 29.23
Subject 18 Bipolar Disorder 25 15 6 f 12.18 15.40 14.54
Subject 19 Bipolar Disorder 0 13 7 f 7.80 11.98 11.62
Subject 20 Bipolar Disorder 0 23 12 f 12.45 13.42 15.41
Subject 21 Bipolar Disorder 66 17 10 f 16.25 11.02 20.82
Subject 23 Control NA 14 9 m 8.61 9.37 10.42
Subject 24 Control NA 19 11 m 0.51 6.32 8.39
Subject 25 Control NA 16 9 m 11.52 16.09 12.85
Subject 26 Control NA 8 2 m 20.30 18.86 23.65
Subject 28 Control NA 13 6 m 20.28 21.51 19.22
Subject 29 Control NA 8 2 m 18.77 22.87 15.82
Subject 31 Control NA 14 9 m 17.33 14.85 19.37
Subject 33 Control NA 15 6 m 13.80 19.62 14.56
Subject 35 Control NA 17 7 m 18.42 16.85 22.75
Subject 38 Control NA 16 7 m 21.02 9.47 30.08
Subject 39 Control NA 21 12 m 9.10 8.33 12.95
Subject 40 Control NA 15 9 m 11.43 13.44 12.75
Subject 22 Control NA 18 11 f 5.25 7.66 7.58
Subject 27 Control NA 13 7 f 7.09 9.67 11.88
Subject 30 Control NA 23 12 f 5.74 10.69 9.05
Subject 32 Control NA 15 8 f 11.25 14.76 13.02
Subject 34 Control NA 13 7 f 21.92 24.25 18.25
Subject 36 Control NA 14 6 f 20.32 24.57 19.82
Subject 37 Control NA 19 11 f 3.40 11.70 8.12
Subject 41 Control NA 15 7 f 17.61 14.75 19.39
Subject 42 Control NA 18 11 f 15.07 14.71 17.72
Supplementary Table 10: Raw data from an independent chronic psychotic sample (20
patients with schizophrenia diagnosis, 20 patients with bipolar disorder diagnosis and 20 matched controls)
(initials, diagnostic group, connectedness graph attributes from dream reports - average of 30-word
graphs, comprising edges (E), largest connected component (LCC) and largest strongly connected component (LSC)).
  Dream Disorganization Index PANSS negative
NoID Subjects Diagnostic Group Edges LCC LSC Dream Total 
Subject 01 Schizophrenia 21.00 16.00 1.00 29.49 27 
Subject 02 Schizophrenia 25.96 17.38 8.98 15.88 20 
Subject 04 Schizophrenia 27.39 23.71 14.47 12.61 13 
Subject 05 Schizophrenia 25.81 15.90 8.79 16.30 17 
Subject 07 Schizophrenia 27.82 21.06 11.92 10.76 16 
Subject 08 Schizophrenia 24.45 19.86 8.30 20.37 29 
Subject 09 Schizophrenia 25.92 20.08 10.55 16.31 16 
Subject 10 Schizophrenia 28.51 24.52 17.82 9.85 16 
Subject 01 Schizophrenia 24.58 17.01 6.62 19.64 9 
Subject 02 Schizophrenia 18.97 12.70 1.00 35.73 33 
Subject 04 Schizophrenia 27.25 21.07 9.49 12.00 9 
Subject 05 Schizophrenia 28.58 23.49 17.13 9.48 8 
Subject 07 Schizophrenia 25.05 15.76 5.49 17.94 11 
Subject 08 Schizophrenia 24.44 17.65 4.47 19.63 26 
Subject 09 Schizophrenia 25.83 21.71 8.23 16.12 20 
Subject 10 Schizophrenia 25.40 20.40 6.40 17.06 37 
Subject 01 Schizophrenia 25.76 19.69 16.24 17.99 27 
Subject 02 Schizophrenia 25.84 18.90 9.10 16.26 16 
Subject 04 Schizophrenia 27.88 21.76 14.03 11.01 11 
Subject 05 Schizophrenia 23.30 17.25 6.38 23.51 25 
Subject 07 Bipolar Disorder 28.47 23.24 17.29 9.87 10 
Subject 08 Bipolar Disorder 27.30 21.24 12.75 12.53 17 
Subject 09 Bipolar Disorder 28.38 19.87 13.41 9.33 16 
Subject 10 Bipolar Disorder 28.84 22.60 15.92 8.44 7 
Subject 01 Bipolar Disorder 26.47 19.42 9.82 14.47 16 
Subject 02 Bipolar Disorder 26.35 19.17 10.86 15.07 17 
Subject 04 Bipolar Disorder 28.51 22.83 16.98 9.66 10 
Subject 05 Bipolar Disorder 25.14 17.20 7.29 18.05 7 
Subject 07 Bipolar Disorder 26.06 20.45 11.41 16.08 11 
Subject 08 Bipolar Disorder 27.24 20.53 11.31 12.40 10 
Subject 09 Bipolar Disorder 27.27 19.81 10.39 12.12 8 
Subject 10 Bipolar Disorder 28.11 23.07 14.99 10.50 11 
Subject 01 Bipolar Disorder 27.05 18.39 13.65 13.49 16 
Subject 02 Bipolar Disorder 28.82 23.27 16.19 8.57 7 
Subject 04 Bipolar Disorder 28.33 23.61 17.14 10.28 10 
Subject 05 Bipolar Disorder 28.40 22.66 14.49 9.51 10 
Subject 07 Bipolar Disorder 28.69 21.77 16.55 9.04 13 
Subject 08 Bipolar Disorder 27.97 20.52 13.43 10.61 9 
Subject 09 Bipolar Disorder 27.50 20.37 12.50 11.86 16 
Subject 10 Bipolar Disorder 25.79 18.67 9.39 16.47 15 
Subject 01 Control 28.15 23.02 15.46 10.46 7 
Subject 02 Control 28.39 21.92 15.55 9.74 7 
Subject 04 Control 28.06 21.48 15.70 10.78 7 
Subject 05 Control 28.01 19.08 14.91 10.80 8 
Subject 07 Control 28.73 23.87 19.03 9.43 10 
Subject 08 Control 28.62 23.65 13.78 8.68 8 
Subject 09 Control 28.65 22.87 16.44 9.15 7 
Subject 10 Control 28.20 22.99 16.31 10.50 7 
Subject 01 Control 28.91 22.25 18.25 8.72 13 
Subject 02 Control 26.81 22.69 13.07 14.10 7 
Subject 04 Control 28.77 22.54 17.50 8.99 7 
Subject 05 Control 28.88 23.66 16.32 8.39 11 
Subject 07 Control 28.93 25.85 16.78 8.33 8 
Subject 08 Control 28.88 24.50 15.34 8.20 8 
Subject 09 Control 27.85 21.70 14.75 11.26 7 
Subject 10 Control 27.73 24.11 14.29 11.52 7 
Subject 01 Control 29.00 25.28 18.29 8.44 7 
Subject 02 Control 28.22 24.32 13.95 9.95 7 
Subject 04 Control 26.27 20.05 13.17 15.80 16 
Subject 05 Control 28.30 22.62 17.92 10.52 9 
 
 
Characterization of the relationship between 
semantic and structural language features in 
psychiatric diagnosis 
 
 
N.B. Mota, Brain Institute, UFRN, Natal, Brazil
F. Carrillo, Department of Computation, UBA, Buenos Aires, Argentina
D.F. Slezak, Department of Computation, UBA, Buenos Aires, Argentina
M. Copelli, Physics Department, UFPE, Recife, Brazil
S. Ribeiro, Brain Institute, UFRN, Natal, Brazil
 
 
Abstract 
Psychiatry describes speech symptoms that 
are indicative of disorganized thought, but 
measuring them is not easy. With natural 
language processing tools, it is possible to 
quantify psychiatric symptoms. Graph 
representations of word trajectories and 
semantic incoherence have independently 
been shown to predict the Schizophrenia 
diagnosis. Both analyses assess thought 
organization through speech, but the 
relationship between them is unknown. To 
fill this gap, here we characterize the 
relationship between structural and semantic 
features of free verbal reports from 60
subjects (40 psychotic patients and 20
matched controls). Graph connectedness is
inversely correlated with semantic
incoherence, and together the two explain
54% of the variance in negative symptoms.
 
INTRODUCTION 
 
For over a century, psychiatry has 
described speech symptoms perceived by the 
specialist as indicative of disorganized thought 
[1]. The descriptions used by psychiatrists to 
identify thought disorders focus on aberrant 
trajectories in word sequences used by patients 
to report their memories. While mild severity is 
described as, for instance, ‘loss of associations’, 
higher severity may be described as 
‘derailment’, reaching in extreme cases an 
apparent randomness described clinically as 
‘word salad’. However, even for very well 
trained psychiatrists, the aberrant thought 
organization identified through language is hard 
to measure with precision and without 
subjective biases. The development of natural
language processing tools now enables us to
quantify aberrant word trajectories by analyzing
structural [2-4] as well as semantic features of
patient reports [5, 6].
Semantic incoherence between 
consecutive sentences is increased in verbal 
reports of schizophrenic patients [6], a feature 
that has been shown to predict Schizophrenia 
even during the prodromal phase, nearly 3 years 
before the first psychotic break [5]. On the other 
hand, the representation of word trajectories as 
directed graphs has revealed that subjects with 
chronic psychosis speak with significantly less 
connectedness between words than healthy 
subjects, and this allows for the automated 
diagnosis of Schizophrenia [4, 7]. Importantly, 
connectedness attributes were negatively 
correlated with the severity of negative 
symptoms measured during standard psychiatric 
evaluations [3]. The set of symptoms known as
negative symptoms is associated with the
Schizophrenia diagnosis, poor prognosis and
major impacts on social behavior [8]. Both
structural and semantic measures assess thought 
organization through word trajectories, but the 
relationship between structure (word graph 
connectedness) and semantics (language 
incoherence) is yet to be mapped. Are these 
measures redundant or complementary? Could 
the combination of structural and semantic 
analyses improve the quantification of negative 
symptoms?  
    To address these questions, we aimed in the 
present study to characterize the relationship 
between structural and semantic features of 
verbal reports from patients with and without 
psychotic symptoms (same dataset as [3]). The 
study also assessed whether the combination of 
structural and semantic features explains the 
severity of negative symptoms better than the 
same features separately.    
 
METHODS 
 
A total of 40 psychotic patients (20 
with Schizophrenia diagnosis and 20 with 
Bipolar Disorder diagnosis), and 20 control 
subjects without psychotic symptoms were 
interviewed during psychiatric assessment at 
public clinical services in Natal, Brazil. 
Participants and legal guardians provided 
written informed consent. The study was 
approved by the UFRN Research Ethics 
Committee (permit #102/06-98244). To
establish the diagnosis according to DSM-IV,
the SCID was applied [9]. The PANSS
psychometric scale [10] was also applied to
measure psychiatric symptoms. For the analysis
we used the total score of the PANSS negative
subscale. Next, the participants were asked to
report a dream, and this report was audio
recorded and transcribed.
 To assess structural features, each
report was represented as a graph in which each
word corresponded to a node, and the temporal
sequence of two consecutive words
corresponded to an edge. To control for
verbosity differences, a graph was built
for each window of 30 consecutive words, with
each window shifted by one word relative to
the previous one. Three connectedness
attributes were assessed for each graph: the
number of edges (E), the number of nodes in
the largest connected component (LCC), and the
number of nodes in the largest strongly
connected component (LSC). After calculating
the graph attributes for all 30-word graphs,
the average of each attribute was calculated
and used in the analysis. Graph analysis was
performed using the SpeechGraphs software [3].
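
The windowed graph procedure described above can be sketched in a few lines of Python. This is only an illustration and not the SpeechGraphs implementation: the function names are hypothetical, and the handling of tokenization, repeated transitions (counted once here), sentence boundaries and reports shorter than one window are all assumptions.

```python
from collections import defaultdict


def build_word_graph(words):
    """Nodes are distinct words; edges are directed transitions between consecutive words."""
    nodes = set(words)
    # Assumption: a transition that occurs more than once is counted as a single edge.
    edges = set(zip(words, words[1:]))
    return nodes, edges


def _largest_flood_fill(neighbours, order):
    """Size of the largest group reached by flood-filling `neighbours`, visiting starts in `order`."""
    seen, best = set(), 0
    for start in order:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        best = max(best, size)
    return best


def largest_connected_component(nodes, edges):
    """LCC: largest set of nodes linked by paths when edge direction is ignored."""
    undirected = defaultdict(set)
    for u, v in edges:
        undirected[u].add(v)
        undirected[v].add(u)
    return _largest_flood_fill(undirected, nodes)


def largest_strongly_connected_component(nodes, edges):
    """LSC: largest set of mutually reachable nodes (Kosaraju's two-pass algorithm)."""
    forward, backward = defaultdict(set), defaultdict(set)
    for u, v in edges:
        forward[u].add(v)
        backward[v].add(u)
    finished, seen = [], set()
    for start in nodes:  # first pass: record depth-first finishing order
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(forward[start]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(forward[nxt])))
                    break
            else:
                finished.append(node)
                stack.pop()
    # Second pass: flood-fill the reversed graph in reverse finishing order.
    return _largest_flood_fill(backward, reversed(finished))


def mean_window_attributes(words, window=30):
    """Average E, LCC and LSC over all windows of `window` consecutive words (step of one word)."""
    if len(words) <= window:
        windows = [words]  # assumption: a short report yields a single graph
    else:
        windows = [words[i:i + window] for i in range(len(words) - window + 1)]
    totals = {"E": 0, "LCC": 0, "LSC": 0}
    for chunk in windows:
        nodes, edges = build_word_graph(chunk)
        totals["E"] += len(edges)
        totals["LCC"] += largest_connected_component(nodes, edges)
        totals["LSC"] += largest_strongly_connected_component(nodes, edges)
    return {name: total / len(windows) for name, total in totals.items()}
```

For a transcribed report, `mean_window_attributes(report.lower().split())` would return the three averaged attributes, where `report` is a hypothetical string holding the transcript.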
 To assess semantic features, we
calculated the median semantic distance
between consecutive sentences using latent
semantic analysis (LSA), a measure known as
first-order incoherence that is also predictive
of the Schizophrenia diagnosis [5]. To control
for verbosity differences, semantic distances
were normalized by the largest sentence [5].
All statistical analyses were performed with
Matlab software.
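
As a rough illustration of first-order incoherence, the sketch below takes one vector per sentence and returns the median distance between consecutive sentences. It assumes the sentence vectors have already been obtained (for example from an LSA model), defines distance as one minus cosine similarity, and leaves out the sentence-length normalization mentioned above; all of these choices, and the function names, are assumptions rather than the authors' procedure.

```python
import math
from statistics import median


def cosine_distance(u, v):
    """One minus the cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def first_order_incoherence(sentence_vectors):
    """Median distance between each sentence vector and the next one."""
    if len(sentence_vectors) < 2:
        raise ValueError("at least two sentences are needed")
    pairs = zip(sentence_vectors, sentence_vectors[1:])
    return median(cosine_distance(u, v) for u, v in pairs)
```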
 
RESULTS 
 
 We found significant differences 
between the groups compared (Schizophrenia, 
Bipolar and Control groups), both for structural 
connectedness and for semantic incoherence 
(Figure 1 and Table 1). The Schizophrenia 
group produced less connected graphs (fewer 
Edges, smaller LCC and LSC) compared to the 
Control group, and also compared to the Bipolar 
group (fewer Edges and smaller LSC) (Figure 1 
and Table 1). The Bipolar group also produced
less connected graphs (smaller LCC and LSC) in
comparison with the Control group (Figure 1
and Table 1).
In addition, the Schizophrenia group produced 
reports that were less semantically coherent than 
those of the Control group (Figure 1 and Table 
1).  
 When we analyzed all subjects 
together, Edges, LCC and LSC were negatively 
correlated with median incoherence (Figure 
2A). However, the relationship of semantic 
incoherence with Edges, LCC or LSC explained 
only a small portion of the data variance (14% 
of the semantic incoherence variance explained 
by Edges, 8% explained by LCC and 15% 
explained by LSC, as estimated by Pearson’s 
R²). Moreover, after sorting the participants
by group, these correlations tended to persist
only in the Schizophrenia group, without
reaching significance (Semantic Incoherence
versus Edges p=0.0855, versus LCC p=0.1056,
and versus LSC p=0.0813) (Figure 2B).
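
The explained-variance figures quoted above are squared Pearson correlation coefficients. A minimal sketch of that calculation follows; the function names are hypothetical and any numbers passed to it in examples are toy values, not study data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length with at least two values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def variance_explained(x, y):
    """Share of the variance in y accounted for by a straight-line fit on x (R squared)."""
    return pearson_r(x, y) ** 2
```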
  Since the semantic and structural 
features seem to share some information but 
without much redundancy, we combined the 
three connectedness attributes with the median 
semantic incoherence to assess the multilinear 
correlation of these features with the severity of 
negative symptoms measured by the PANSS 
negative subscale. The combination of both
strategies explained 54% of the
variance in the severity of negative symptoms
(R² = 0.54, p < 0.0001) (Figure 3).
 
 
 
Fig 1. Dispersion plot of graph connectedness attributes and
median semantic incoherence. * indicates that a group differs from
one other group and ** that it differs from both other
groups.
 
Table 1: P values of the Wilcoxon rank-sum test between groups.
Significant results are those below the Bonferroni-corrected
threshold (3 comparisons: SxB, SxC and BxC; α = 0.0167).

Rank-sum   E        LCC      LSC      Incoherence
S x B      0.0013   0.0909   0.0051   0.1288
S x C      0.0000   0.0002   0.0001   0.0079
B x C      0.0275   0.0031   0.0066   0.1069
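
The Bonferroni threshold used in the table can be checked with a short worked example. The Python sketch below takes the P values exactly as printed above, assumes a conventional family-wise level of 0.05, and lists which attributes fall below 0.05/3 for each comparison. The variable names are illustrative only.

```python
# P values copied from the rank-sum table (rows: comparisons; columns: attributes).
# The entry printed as 0.0000 is a rounded value, i.e. below 0.00005.
p_values = {
    "S x B": {"E": 0.0013, "LCC": 0.0909, "LSC": 0.0051, "Incoherence": 0.1288},
    "S x C": {"E": 0.0000, "LCC": 0.0002, "LSC": 0.0001, "Incoherence": 0.0079},
    "B x C": {"E": 0.0275, "LCC": 0.0031, "LSC": 0.0066, "Incoherence": 0.1069},
}

family_alpha = 0.05                    # assumed family-wise significance level
n_comparisons = len(p_values)          # SxB, SxC and BxC
alpha = family_alpha / n_comparisons   # Bonferroni threshold, about 0.0167

# Attributes that remain significant after correction, per comparison.
significant = {
    pair: [name for name, p in row.items() if p < alpha]
    for pair, row in p_values.items()
}
```

Under these assumptions, Incoherence separates only the Schizophrenia and Control groups, whereas LSC separates all three pairs, in line with the description in the Results.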
 
 
 
Fig 2. A) Pearson correlations between graph connectedness 
attributes and semantic incoherence. B) Pearson correlations 
between graph connectedness attributes and semantic 
incoherence for the Schizophrenia group (S in red), Bipolar 
group (B in blue) and Control group (C in black). 
 
 
 
Fig 3. Multilinear correlation between structural and
semantic measures and PANSS negative subscale. The y axis
shows the coefficients found for each attribute.
 
DISCUSSION 
 
 The results point to an inverse 
relationship between graph connectedness (E, 
LCC and LSC) and semantic incoherence 
(median distance between consecutive 
sentences). This means that the less connected 
the verbal report is, the more semantically 
incoherent it is. Both the structural and the 
semantic approaches study the same object 
(memory reports) in order to quantify similar 
phenomenology (thought disorganization), but 
graph connectedness was able to explain only a 
small percentage of the variance in semantic 
incoherence when all subjects were considered, 
which indicates that these measurements are 
largely complementary. When we examined the
correlations within each group, none was
significant; only in the
Schizophrenia group (the main psychiatric
pathology associated with thought
disorganization) did the effect tend to keep the
same direction.
One limitation of the study is that the 
results are impacted by the small number of 
subjects in each group, and thus future work is 
necessary to better characterize the relationship 
between structure and semantics in a larger 
sample.  
 Notwithstanding, the combination of
structural and semantic features explained more
than half of the variance in negative symptom
severity. The results suggest that combining
both strategies to quantitatively assess
negative symptoms is an important direction
that should be pursued in a larger sample.
 
REFERENCES 
 
[1] H. I. Kaplan and B. J. Sadock, Kaplan & Sadock's 
Comprehensive Textbook of Psychiatry: Wolters Kluwer, 
Lippincott Williams & Wilkins, 2009. 
[2] N. B. Mota, M. Copelli, and S. Ribeiro, "Computational 
Tracking of Mental Health in Youth: Latin American 
Contributions to a Low-Cost and Effective Solution for 
Early Psychiatric Diagnosis," New Dir Child Adolesc Dev, 
vol. 2016, pp. 59-69, Jun 2016. 
[3] N. B. Mota, R. Furtado, P. P. Maia, M. Copelli, and S. 
Ribeiro, "Graph analysis of dream reports is especially 
informative about psychosis," Scientific Reports, vol. 4, p. 
3691, 2014. 
[4] N. B. Mota, N. A. Vasconcelos, N. Lemos, A. C. Pieretti, O. 
Kinouchi, G. A. Cecchi, et al., "Speech graphs provide a 
quantitative measure of thought disorder in psychosis," 
PLoS One, vol. 7, p. e34928, 2012. 
[5] G. Bedi, F. Carrillo, G. A. Cecchi, D. F. Slezak, M. Sigman, 
N. B. Mota, et al., "Automated analysis of free speech 
predicts psychosis onset in high-risk youths," npj 
Schizophrenia, 2015. 
[6] B. Elvevåg, P. W. Foltz, D. R. Weinberger, and T. E. 
Goldberg, "Quantifying incoherence in speech: An 
automated methodology and novel application to 
schizophrenia," Schizophrenia Research, vol. 93, pp. 304-
316, 2007. 
[7] N. B. Mota, R. Furtado, P. P. Maia, M. Copelli, and S. 
Ribeiro, "Graph analysis of dream reports is especially 
informative about psychosis," Sci Rep, vol. 4, p. 3691, 2014. 
[8] S. F. Austin, O. Mors, E. Budtz-Jorgensen, R. G. Secher, C. 
R. Hjorthoj, M. Bertelsen, et al., "Long-term trajectories of 
positive and negative symptoms in first episode psychosis: A 
10 year follow-up study in the OPUS cohort," Schizophr Res,
vol. 168, pp. 84-91, Oct 2015. 
[9] M. H. First, R. L. Spitzer, M. Gibbon, and J. Williams, 
Structured Clinical Interview for DSM-IV Axis I Disorders -- 
Research Version, Patient Edition (SCID-I/P). New York:
Biometrics Research, New York State Psychiatric Institute, 
1990. 
[10] S. R. Kay, A. Fiszbein, and L. A. Opler, "The positive and 
negative syndrome scale (PANSS) for schizophrenia," 
Schizophr Bull, vol. 13, pp. 261-76, 1987. 
  
 
Chapter 5 - Speech structure in healthy, pathological and literature development:
Cognitive development and cognitive decline, indirectly measured by graph-theoretical
tools for analyzing speech structure, are discussed in more depth in the following paper
(also available as a preprint on arXiv). In this work, speech graph
analysis was applied to a larger sample of subjects with and without psychotic symptoms, and the
role of education was investigated. We also compared the ontogenetic developmental
pattern with structural changes during the development of literature across 5,000
years.
 
 
TITLE: The effects of education on speech recapitulate the history of writing

AUTHORS:
Natália Bezerra Mota 1†, Sylvia Pinheiro 1†, Mariano Sigman 2, Diego Fernández-Slezak 3,4,
Antonio Guerreiro 5, Luís Fernando Tófoli 6, Guillermo Cecchi 7, Mauro
Copelli 8*, Sidarta Ribeiro 1*

† Equal contribution, * Corresponding authors

AFFILIATIONS:
1 Instituto do Cérebro, Universidade Federal do Rio Grande do Norte, Natal,
Brazil.
2 Universidad Torcuato Di Tella, CONICET, Buenos Aires, Argentina.
3 Departamento de Computación, Facultad de Ciencias Exactas y Naturales,
Universidad de Buenos Aires, Buenos Aires, Argentina.
4 Instituto de Investigación en Ciencias de la Computación, CONICET,
Universidad de Buenos Aires, Buenos Aires, Argentina.
5 Departamento de Antropologia, Universidade Estadual de Campinas, Campinas,
Brazil.
6 Departamento de Psiquiatria, Universidade Estadual de Campinas, Campinas,
Brazil.
7 Computational Biology Center – Neuroscience, IBM T.J. Watson Research
Center, Yorktown Heights, USA.
8 Departamento de Física, Universidade Federal de Pernambuco, Recife, Brazil.

CORRESPONDING AUTHORS:
Sidarta Ribeiro, Instituto do Cérebro, Avenida Nascimento de Castro 2155, Natal
RN 59056-450, Brazil. Telephone +55(84)991277141, Email:
sidartaribeiro@neuro.ufrn.br

Mauro Copelli, Departamento de Física, Universidade Federal de Pernambuco,
Avenida Prof. Moraes Rego 1235, Recife PE 50670-901, Brazil. Telephone
+55(81)99483502, Email: mcopelli@df.ufpe.br

KEYWORDS:
Graph, Schizophrenia, Bipolar Disorder, Development, Childhood, Literature,
Bronze Age, Axial Age, Pre-Literate, Amerindian, Indigenous, Poetry,
Consciousness.

ABSTRACT:

Discourse varies with age, education, mental state and culture, but the
ontogenetic and cultural dynamics of discourse structure remain to be
quantitatively compared. Here we report that word graphs obtained from verbal
reports of subjects ages 2-90, and literary texts spanning ~4,500 years, show
remarkably similar asymptotic maturation over time: while lexical diversity,
long-range recurrence and graph size depart from near-randomness as they
increase, short-range recurrence declines towards random levels. In typical
subjects, short-range recurrence and lexical diversity stabilize after elementary
school, whereas graph size and long-range recurrence only steady after high
school. Subjects with psychosis do not show similar dynamics, presenting a
child-like discourse akin to Bronze Age texts. These were distinct from
poetry, and closer to narratives from illiterate adults than to narratives from
preschoolers or Amerindian adults. Written structure converged to educated
adult levels at the onset of the Axial Age (~800 BC), a putative boundary for
contemporary human mentality.

INTRODUCTION
Culture shapes the organization of discourse in ontogeny as in history. At
the individual level, language begins to be learned within weeks of birth if not
earlier 1,2 but its full development takes many years of formal and informal
education 3,4. At the historical level, the schooling of readers who become writers
led to the gradual development of literature. Since the edubas of Sumer, schools
have been organizations specialized in using the scaffolding of biological maturation to
train declarative and procedural skills such as reading and writing, firmly
grounded on the progressive expansion of memory capacity and retrieval,
coordination, brain area recycling, and symbolic repertoire 5-8. While
phonological perception and production are typically mastered within the initial
years of life, vocabulary, syntax and grammar continue to mature into high
school through a combination of cognitive development and education that is
accelerated by literacy acquisition, but undergoes an extended period of subsequent
refinement 4,9,10.
In 1-2% of the population, however, discourse may deteriorate during
adolescence instead of improving, despite schooling and in parallel with the first
surfacing of psychotic symptoms 11,12. The mental perturbations that
characterize schizophrenia typically appear between adolescence and early
adulthood, and progressively impact social behavior and language use 13-15. The
contrast between healthy and psychotic development before adolescence is
blurred, because children are normally more prone to confabulation than adults
16, and often engage in private speech that includes dialogues with imaginary
friends 17. Indeed, a reliable diagnosis of psychosis before mid-childhood is
effectively precluded by the fact that typically-developing children under ~7
years old normally display illogical thinking and loosening of associations 18.
Child psychotherapy has also pointed, albeit subjectively, to a resemblance
between children's thought and psychosis 19.
Two general hypotheses arise from the notion that psychosis represents
the lingering of immature mental functioning. First, the disorganization of
language that results from psychosis may follow the reverse path of typical
language development. With proper metrics to establish the distance between
typical adults and adults with psychotic symptoms, taken as proxies of organized
and disorganized discourse, it should be possible to verify whether verbal
reports from typically-developing children move along this dimension as they
mature.
Second, psychosis may represent a trace of immature human language not
only at the ontogenetic level, but also at the historical one. Psychosis has been
proposed to resemble a primitive mental mode, an early trait of civilization that
persisted historically as recently as the Bronze Age 20. According to this
hypothesis, human mentality only matured into its current mode during the
Axial Age (800-200 BC), a period in ancient history marked by a philosophical,
artistic, political, legal, economic and educational boom in Afro-Eurasia 21,22.
Influential and controversial 23, the concept of Axial Age only recently began to
receive empirical attention 24,25. Here we analyze Pre- and Post-Axial literature
using the same metrics employed to investigate psychosis and childhood in
order to elucidate the question.
To explore these hypotheses, we began by mathematically comparing 200
interview transcripts (recorded from 135 healthy subjects and 65 patients with
psychotic symptoms, ages 2 to 58 years old; Suppl. Table 1) to 447
representative literary texts spanning ~4,500 years (Suppl. Table 2),
comprising the following Afro-Eurasian traditions: Syro-Mesopotamian (N=62),
Egyptian (N=49), Hinduist (N=37), Persian (N=19), Judeo-Christian (N=76),
Greek-Roman (N=133), Medieval (N=20), Modern (N=20) and Contemporary
(N=31). Understanding how discourse develops in time poses a significant
mathematical challenge, because the lexicon is a high-dimensional object 26.
Semantic analysis based on word co-occurrence in a representative corpus of
texts has been successfully applied to many topics including psychosis 27-29 and
literature 24,30, but it is very sensitive to arbitrary choices of probed words,
textual corpora, and specific languages assessed. More traditional approaches
such as syntactical analysis suffer from similar caveats 30.
Since our hypotheses set predictions on the organization of words, the
most natural way to examine them in a quantitative manner is to measure graph
attributes, which allow for structural network characterization free of the above-listed
confounds 31, and account for the global organization of the lexicon 32,33.
Here we focused on the following graph attributes: number of nodes (N), which
accounts for lexical diversity, repeated edges (RE) and the largest strongly
connected component (LSC), which respectively measure short- and long-range
recurrence, as well as average shortest path (ASP), a measure of the graph size
(Fig. 1a; see Methods).
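
To make the four attributes concrete, the following Python sketch builds a directed word graph from a short word sequence and computes N, RE, LSC and ASP. It is a minimal illustration under stated assumptions, not the software used in this study: RE is counted here as the number of extra occurrences of an already-used directed edge, and ASP is averaged over ordered pairs of distinct nodes joined by a directed path. The Methods define the exact conventions; all function names are ours.

```python
from collections import Counter, deque

def word_graph(words):
    """Directed multigraph: one node per distinct word, one edge per succession."""
    edges = Counter(zip(words, words[1:]))
    nodes = set(words)
    succ = {w: set() for w in nodes}
    for a, b in edges:
        succ[a].add(b)
    return nodes, edges, succ

def repeated_edges(edges):
    # Extra occurrences of an already-used directed edge (assumed convention).
    return sum(count - 1 for count in edges.values())

def reachable(start, succ):
    """Shortest-path lengths (in edges) from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in succ[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def largest_strong_component(nodes, succ):
    # Two nodes are strongly connected when each can reach the other.
    dist = {u: reachable(u, succ) for u in nodes}
    best = 0
    for u in nodes:
        size = sum(1 for v in dist[u] if u in dist[v])
        best = max(best, size)
    return best

def average_shortest_path(nodes, succ):
    # Mean over ordered pairs of distinct nodes joined by a directed path.
    lengths = [d for u in nodes for v, d in reachable(u, succ).items() if v != u]
    return sum(lengths) / len(lengths) if lengths else 0.0

words = "the dog saw the cat and the cat ran".split()
nodes, edges, succ = word_graph(words)
N = len(nodes)                                # lexical diversity
RE = repeated_edges(edges)                    # short-range recurrence
LSC = largest_strong_component(nodes, succ)   # long-range recurrence
ASP = average_shortest_path(nodes, succ)      # graph size
```

In this toy sequence the word "the" closes two cycles, so five of the six nodes fall into one strongly connected component, while the single repeated succession "the cat" contributes to RE.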
Psychotic discourse is characterized by comparatively reduced vocabulary,
short-range repetitions of word sequences, a reduction in long-range themes,
and a decrease in the global extent of the word network employed 11,13,14. Each of
these aspects corresponds to a specific property in a graph made of words,
respectively 1) lexical diversity, 2) short-range recurrence, 3) long-range
recurrence, and 4) graph size. These properties successfully capture disorganized
language in psychotic adults 34-36 and language organization during literacy
acquisition in typically-developing children 37. Recent-onset psychotic
patients show strong anti-correlation between long-range recurrence and
negative symptoms that impact social behavior 35,36. Conversely, during typical
(non-psychotic) development, long-range recurrence increases, in correlation
with reading performance, IQ and theory of mind 37, three important measures of
cognitive and social skills required for collective integration.
Our previous results lead us to predict that as healthy subjects age and
undergo schooling, their memory reports should progressively increase in lexical
(node) diversity (N), long-range recurrence (LSC) and graph size (ASP). On the
other hand, short-range recurrence (repeated edges - RE) should gradually
decrease (Fig. 1a). Reports from psychotic subjects should not show the same
dynamics, i.e. we hypothesize that the same 4 graph attributes will be less
correlated with age or years of education, remaining similar to those of healthy
children’s reports. Finally, in light of the conjectures of a saturating change of
mentality at the dawn of the Axial Age 21,22, we could expect the dynamics of
graph attributes across the historical record to resemble ontogenetic changes in
healthy subjects.
For each dataset, we measured the 4 graph attributes of interest N, LSC,
ASP and RE, controlling for differences in total number of words per report by
averaging across moving windows of 30 words with 50% of overlap (Fig. 1b), as
detailed in 35 and Methods. The evolution of each attribute was modeled as an
exponential fit to represent their accelerated initial development followed by a
saturation process of slow progress, with f(t) = f0 + (f∞ − f0)(1 − exp(−t/τ)), where f∞
is the asymptotic graph attribute value, f0 is the initial value, and τ is the
characteristic time to reach saturation (see Methods, Suppl. Table 3). This fit to
exponentials allows us to identify dynamic properties of each attribute and
hence examine in a quantitative manner whether the ontogenetic dynamics of
verbal discourse mimics the historical development of literary structure. It also
sets the stage for specific predictions.
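
The two steps just described (window averaging and the saturating exponential) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the code used for the analysis: the window step is taken as half the window size, the example attribute is simply the number of distinct words, and the report and parameter values are hypothetical.

```python
import math

def sliding_windows(words, size=30, overlap=0.5):
    """Yield consecutive windows of `size` words, each sharing `overlap` with the next."""
    step = max(1, int(size * (1 - overlap)))
    for start in range(0, max(len(words) - size, 0) + 1, step):
        yield words[start:start + size]

def windowed_mean(words, attribute, size=30, overlap=0.5):
    """Average a graph attribute (any function of a word list) across windows."""
    values = [attribute(w) for w in sliding_windows(words, size, overlap)]
    return sum(values) / len(values)

def saturating_exponential(t, f0, f_inf, tau):
    """f(t) = f0 + (f_inf - f0) * (1 - exp(-t / tau))."""
    return f0 + (f_inf - f0) * (1.0 - math.exp(-t / tau))

# Lexical diversity (number of distinct words) as a simple example attribute.
report = [f"w{i % 40}" for i in range(90)]   # hypothetical 90-word report
mean_nodes = windowed_mean(report, lambda w: len(set(w)))

# At t = tau the curve has covered 1 - 1/e (about 63%) of the distance to f_inf.
value_at_tau = saturating_exponential(t=5.0, f0=10.0, f_inf=20.0, tau=5.0)
```

Reading τ as the time at which roughly 63% of the total change f∞ − f0 has occurred gives a concrete meaning to the characteristic times reported in the Results.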
At the ontogenetic level, the saturation onset should either precede or
coincide with adolescence, when it becomes possible for the first time to
clinically identify the losses produced by psychosis 18. Furthermore, if discourse
in healthy children shifts through development from disorganized to organized,
but remains largely disorganized in psychotic subjects, we expect initial and
asymptotic graph attribute values to be quite different in the former, but not in
the latter, i.e. |f∞ − f0| should be greater in healthy subjects than in psychotic
patients. In addition, healthy subjects should show f∞ > f0 for N, ASP and LSC,
but f0 > f∞ for RE.
Precise predictions for cultural development are harder to make. The
mathematical analysis of ancient texts is inherently impacted by a plethora of
confounds, such as imprecise dating, variable physical support, multiple
authorship and versions, editing, censorship, standardization, translation, access
to few, production by fewer, distinct degrees of versification and fictionalization,
and stylistic, aesthetic and philosophical differences of both authors and translators
24. A distinctive limitation is the fact that the transition from orality to literacy
can only be timed by approximation, with reference to the earliest texts available
(~2,500 BC 38; Suppl. Note 1a). Furthermore, the historical evolution of
narrative complexity was surely shaped by different literary schools, since
writing at any given time is informed by knowledge and criticism of previous
writing forms 39. The investigation of discourse structure across such different
scales of analysis, involving both biological and cultural phenomena, must have
categorical limitations that at some point turn potential homology into mere
metaphor. Due to their inherently different nature, spontaneous speech and
literature, albeit possibly sharing mechanisms for the accumulation of
complexity over time, are also expected to differ in many ways. Notwithstanding
all these caveats, we expect the historical development of writing to overall
resemble healthy ontogenetic dynamics, and thus f∞ − f0 should be positive for N,
ASP and LSC but negative for RE. We also expect the characteristic times of the
structural development of literature to either precede or coincide with the Axial
Age (Suppl. Note 1b).

RESULTS

Ontogenetic dynamics of discourse structure

The 4 graph attributes differed as predicted between healthy subjects
below and above 12 years of age, indicating a change towards more organized
discourse (Fig. 1c, light and dark blue columns). Also as expected, psychotic
subjects produced reports that structurally resembled the disorganized pattern
seen in healthy subjects under 12 years of age (Fig. 1c, light blue and red
columns). Importantly, both groups yielded measurements equivalent to those of
Bronze Age literature, while Post-Axial literature structurally resembled reports
from healthy adults (Fig. 1c, white and black columns; Table 1).
Representative graphs illustrate the marked structural differences between
typically-developing children and adults, not present in subjects with psychotic
symptoms (Fig. 2a). In support of our hypotheses, 3 attributes of interest (N,
LSC, ASP) showed significant positive correlations with both age and education
in healthy subjects (Suppl. Table 4). The short-range recurrence attribute RE,
which in typically-developing children is negatively correlated with Intelligence
Quotient and Theory of Mind scores 37, showed a significant negative correlation
with education but not with age in healthy subjects (Suppl. Table 4). In striking
agreement with our prediction that psychotic language remains in a
disorganized stage, none of the graph attributes changed significantly either with
age or with education among subjects with psychosis (Suppl. Table 4). A
multiple linear regression confirmed the predominance of education over age in
healthy subjects (Suppl. Table 4).
To further characterize these changes, graph attribute values were binned
in years of education, and fit with an exponential model weighted for the
standard error of the mean. Graph attributes obtained from healthy subjects
were fit very well by the model (Fig. 2b-e, blue panels), with an education-related
exponentially saturating increase in lexical diversity (Fig. 2b), and a
corresponding decrease in short-range recurrence (Fig. 2c). Long-range
recurrence (Fig. 2d) and graph size (Fig. 2e) showed a much slower saturating
increase. In agreement with our hypothesis that the organization of psychotic
discourse changes less through years of education, the graph parameters
obtained from the recordings of psychotic subjects were poorly fit by the model
(Fig. 2b-e, red panels). The prediction that |f∞ − f0| would be larger in typical
subjects than in subjects with psychotic symptoms was confirmed for lexical
diversity (N), short-range recurrence (RE) and graph size (ASP), but not for long-range
recurrence (LSC) (Suppl. Table 5). This occurred because LSC had lower f0
values in the psychotic sample than in the typical sample, while f∞ values were
more similar across groups. Thus, the long-range recurrence deficit in subjects
with psychotic symptoms may reflect not a return to an immature pattern, but
rather a developmental course that strays from the healthy profile from the start.
In typical subjects, word repetitions (RE) decreased exponentially within
the first year of formal education, in parallel with a saturating increase in lexical
diversity (N). Graph size (ASP) also increased, but with much slower dynamics
that begin to saturate around the beginning of high school. Long-range
recurrence (LSC) behaved similarly, with a characteristic time near the end of
high school. To further test the null hypothesis of lack of temporal structure in
the data, the temporal order of the samples was randomized 1,000 times and the
graph attributes of this surrogate dataset were compared to real data. Such
disruption of temporal order abolished significant Spearman correlations
(Suppl. Fig. 1a) and greatly reduced the R² of the exponential models (Suppl.
Fig. 1b).
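
The logic of this surrogate test can be illustrated with a small Python sketch. It assumes a simplified version of the procedure: the temporal variable is shuffled 1,000 times and the Spearman correlation of each surrogate is compared with the observed one, giving an empirical p-value. The data are hypothetical, the function names are ours, and the comparison of exponential-model R² values is not reproduced here.

```python
import random

def ranks(values):
    """Average ranks (1-based), with ties sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

def surrogate_p_value(time, attribute, n_surrogates=1000, seed=0):
    """Fraction of time-shuffled surrogates at least as correlated as the data."""
    rng = random.Random(seed)
    observed = abs(spearman(time, attribute))
    shuffled = list(time)
    hits = 0
    for _ in range(n_surrogates):
        rng.shuffle(shuffled)
        if abs(spearman(shuffled, attribute)) >= observed:
            hits += 1
    return (hits + 1) / (n_surrogates + 1)

# Hypothetical data: an attribute that rises with years of education.
years = [0, 1, 2, 4, 6, 8, 10, 12, 14, 16]
lsc = [3.1, 3.4, 4.0, 4.8, 5.5, 6.1, 6.4, 6.9, 7.0, 7.2]
rho = spearman(years, lsc)
p = surrogate_p_value(years, lsc)
```

When the attribute truly depends on time, almost no shuffled surrogate reaches the observed correlation, which is the sense in which disrupting temporal order abolishes the effect.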
If memory reports from subjects with psychotic symptoms are more
disorganized than the reports of educated healthy adults, it is conceivable that
their structure is also closer to that of random graphs 40. To gain insight into the
structural randomness of our samples, each graph was randomized 100 times by
keeping the nodes and shuffling the edges (Fig. 3a). After normalizing each graph
attribute by the corresponding mean random graph attribute, LSC and ASP from
typical controls with more than 12 years of education (yE) were significantly
larger than in controls with less than 12 yE (Fig. 3b). RE showed the opposite
profile: above random in typical controls with less than 12 yE, and near-random
in typical controls with more than 12 yE. None of these education-related
differences in discourse structure were significant in subjects with psychotic
symptoms (Fig. 3b).
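
A possible reading of this normalization is sketched below in Python for the RE attribute. The randomization scheme is an assumption on our part: every edge keeps its source node while the targets are permuted, which preserves the node set and the number of edges. The exact procedure is the one given in the Methods, and the word sequence and function names are illustrative.

```python
import random
from collections import Counter

def repeated_edges(edge_list):
    """Extra occurrences of an already-used directed edge (assumed convention)."""
    return sum(c - 1 for c in Counter(edge_list).values())

def shuffled_edges(edge_list, rng):
    """Keep every source node but reassign targets at random (assumed scheme)."""
    targets = [b for _, b in edge_list]
    rng.shuffle(targets)
    return [(a, b) for (a, _), b in zip(edge_list, targets)]

def normalized_attribute(edge_list, attribute, n_random=100, seed=0):
    """Attribute of the real graph divided by its mean over random graphs."""
    rng = random.Random(seed)
    random_values = [attribute(shuffled_edges(edge_list, rng)) for _ in range(n_random)]
    mean_random = sum(random_values) / n_random
    return attribute(edge_list) / mean_random if mean_random else float("inf")

# Hypothetical report with heavy short-range repetition.
words = ("in those days in those nights in those years " * 3).split()
edge_list = list(zip(words, words[1:]))
re_ratio = normalized_attribute(edge_list, repeated_edges)
```

A ratio above 1 indicates more short-range recurrence than expected by chance, whereas values close to 1 correspond to the near-random RE described for the more educated controls.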
The results reveal different scales for the typical maturation of distinct
aspects of discourse structure, confirming the expectation of protracted
dynamics with characteristic times that either precede or coincide with
adolescence. That these changes span the entire period of regular schooling
points to the importance of high school completion 41. It also seems that
education, more than age, shapes the structural modification of discourse from
early childhood to adolescence. This process requires time, but developmental
time per se does not suffice without education. Overall, the results support the
notion that the forces driving the organization of discourse are cultural,
reinforcing the expectation that a similar pattern should be observed in the
historical record.

Historical dynamics of discourse structure

Next we assessed whether the ontogenetic dynamics of graph attributes
structurally resembles the historical development of the same attributes in texts
from ~2,500 BC to 2014 AD (Fig. 4a). For standardization, the analyses were
performed in English. Mimicking the ontogenetic pattern, lexical diversity, graph
size and long-range recurrence increased steadily over time across different
traditions, while short-range recurrence decreased (Fig. 4b-e; Suppl. Table 6).
Using 2,500 BC as the most parsimonious estimation of t=0 for the birth of
written culture (Suppl. Table 3, Suppl. Note 1a), the literary data were
remarkably well fit by the same model that described the ontogenetic data in
healthy subjects (Fig. 4b-e). The null hypothesis of lack of temporal structure in
the data was refuted by the same surrogate procedure described above
(Suppl. Fig. 1c, d). As expected, f∞ − f0 was positive for all graph attributes
except RE, for which it was negative (Suppl. Table 6).
Research on literary data implies assessing data points that are not
independent, since books are linked by multiple cultural influences. To avoid
overestimating statistical power, we nested the data by literary tradition, and
exponentially fitted the mean weighted by the standard error of the graph
attributes in each tradition. The nested data showed the same overall dynamics
observed for all texts (Fig. 4f-i), with nearly no differences in characteristic time
for lexical diversity, an approximation to the Axial Age onset for RE and LSC, and
an anticipation of saturation for ASP (Suppl. Table 7).
While the earliest texts show near-random long-range recurrence, later
texts depart progressively from randomness. In contrast, short-range recurrence
is much above random in the earliest texts, and becomes sub-random in the later
ones. This is clear in a 2D plot of LSC and RE normalized by mean random values,
which reconstitutes the temporal dynamics of the data based solely on structural
properties (Fig. 5a). Indeed, almost 40% of the time variance among texts is
explained by a single scalar combining normalized LSC and RE (Fig. 5b). A
particularly interesting case is that of Hinduist literature, which evolved across
2,750 years from a primitive pattern of near-random long-range recurrence to
its opposite (Fig. 5c; Suppl. Note 1c, d).
The exponentially saturating fits yielded characteristic times for the
dynamics of graph attributes in literature (Suppl. Table 6). The results indicate
that the structure of written discourse began to mature long after the earliest
records. For ‘all data’ and ‘nested data’, LSC showed characteristic times of 1,427
BC and 731 BC, respectively. For RE these times were 1,127 BC and 603 BC,
respectively. This means that LSC and RE began to mature between the middle
Bronze Age and the onset of the Axial Age. Interestingly, the saturation of lexical
diversity and graph size is estimated to be in the distant future: 5,321 AD and
5,120 AD for N; 96,946 AD and 44,486 AD for ASP.
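
To relate these calendar dates to the model parameter τ, the following Python sketch converts each reported characteristic time into years elapsed since t = 0 (2,500 BC). It assumes that the reported dates correspond to t = 0 plus the fitted τ, writes BC years as negative numbers, and ignores the absence of a year zero, so the results are approximate to within one year. The names are illustrative.

```python
T0_YEAR = -2500   # 2,500 BC taken as t = 0 (BC years written as negative numbers)

def tau_in_years(calendar_year, t0_year=T0_YEAR):
    """Years elapsed between t = 0 and a date on the same signed-year axis."""
    return calendar_year - t0_year

# Characteristic times reported in the text for 'all data' and 'nested data'.
reported = {
    "LSC": (-1427, -731),
    "RE": (-1127, -603),
    "N": (5321, 5120),
    "ASP": (96946, 44486),
}

# Elapsed time since t = 0 for each attribute and each fit.
elapsed = {
    name: tuple(tau_in_years(year) for year in years)
    for name, years in reported.items()
}
```

Under these assumptions the recurrence attributes have characteristic times of roughly one to two millennia, whereas those of N and ASP exceed the ~4,500 years covered by the sample, which is why their saturation is placed in the future.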
Before the invention of writing, the ability to narrate real or fictional events
was nearly exclusively mediated by oral storytelling. Short-range recurrence was
likely favored because it facilitates rhyme and rhythm, as well as the
memorization of short strings of words 42. The need for attentive recall and the
taste for reiteration is emphatically expressed in the words of the last king of the
Sumerian city-state of Shuruppag in one of the earliest extant texts, possibly
dating from before 2,500 BC: “In those days, in those far remote days, in those
nights, in those faraway nights, in those years, in those far remote years, at that
time the wise one who knew how to speak in elaborate words lived in the Land;
Shuruppag, the wise one, who knew how to speak with elaborate words lived in the
Land. Shuruppag gave instructions to his son; Shuruppag, the son of UbaraTutu
gave instructions to his son Ziudsura: My son, let me give you instructions: you
should pay attention! Ziudsura, let me speak a word to you: you should pay
attention!” 43 (Fig. 5a).
However, a highly recursive structure hinders the communication of
complex meaning, which requires long-range semantic context and imagetic
schema 44, but is disrupted by short cycles 45. Load restrictions on attention and
working memory 46 must have limited the structural complexification of
narratives for millennia. The invention of written text as an external support for
memory allowed for a substantial increase in the size and complexity of the
narratives, no longer constrained by the needs and strategies of memorization.
This transformation seems to be well captured by our analysis. Ancient literature
became structurally more complex as it developed, with an increase over time in
the diversity of words employed, fewer repetitions of short-range word
sequences and increasingly larger connected components. In particular, the
dynamics of recurrence is characterized by a monotonic increase in range, likely
reflecting the departure from oral to written discourse, the former strictly
dependent on working memory, the latter much less so.

Controls for translation, sampling, data correlation, and dating

Computer science and mathematical modeling have been increasingly
applied to archeological and historical research 23,47,48. For text analysis across
multiple living and dead languages and alphabets, this approach has the caveat of
the need to use translations, mitigated here by the use of a single target language
(English), and by the translation robustness of the differential diagnosis of
psychosis based on graph analysis, which is nearly invariant across five major
European languages including English 35. To further investigate translation as a
potential source of noise, transliterated original texts (N=29) were subjected to
graph analysis for comparison with their English translations. Significant
positive correlations were observed for N, RE and ASP (Suppl. Fig. 2a), but LSC
showed no correlation due to a subset of Bronze Age texts with substantially
larger LSC in the English translations than in the originals (Suppl. Fig. 2a). As a
consequence, the abrupt LSC increase at the Axial Age onset is even more
marked in originals than in translations (Suppl. Fig. 2b). Overall, the dynamics
of graph attributes in the original texts agrees with the results obtained for the
larger sample of translated texts.
Unintended bias in the reference sample is another potential caveat: while 365 
our selection of classical texts is quite comprehensive, the sampling becomes 366 
increasingly arbitrary due to book popularization following Gutenberg’s printing
press ~1440 AD. To address this criticism, 10 sets of 20 post-medieval texts
were randomly sampled (Suppl. Table 8) and their graph attributes do not 369 
differ significantly from those of the reference sample (Suppl. Fig. 2c). Another 370 
potential criticism is the particular choice of mathematical model. We chose to 371 
adjust the data to the simplest possible model, one that only presupposes linear 372 
dynamics that converges to a stable fixed point. This provides useful parameters 373 
to interpret the data, as indicated by the agreement with the dating of 374 
civilizational collapse between the Bronze Age and Axial periods (Suppl. Note 375 
1b,d). 376 
A further concern is the possibility of high inter-correlation among the 377 
graph attributes assessed, which could spuriously inflate the results’ importance. 378 
Suppl. Table 8 shows that the empirical levels of independence between graph 379 
attributes vary substantially across samples. Although strongly correlated in 380 
some samples (most notably Post-Axial literature), in most cases the graph 381 
attributes seem to measure distinct aspects of the network. Most correlations are 382 
weak (R2<0.3) or non-significant. Only 3 in 30 correlations explain more than 383 
70% of the variance. Importantly, the correlations between LSC and RE, crucial 384 
for the points made in Fig. 5, range from 10% in Post-Axial texts to 0% in Pre-385 
Axial texts, and from 12% in psychotic subjects to 2% in healthy adults, and 1% 386 
in healthy children. 387 
Lastly, a caveat that requires attention is the intrinsic noise due to dating 388 
errors, which increase as we move towards the past. The criteria of “middle of 389 
author’s life” and “middle of historical period” were employed to parsimoniously 390 
and systematically address dating uncertainties regarding exact date of 391 
publication or authorship. To assess the effects of possible dating errors derived 392 
from these criteria, each data point was randomly subjected to a jitter of 100 393 
years (on the high end of human longevity), or to a jitter equal to the difference 394 
between the oldest and newest estimated dates, whenever that difference was 395 
larger than 100 years. Exponential fit parameters for 1,000 such data 396 
surrogations did not differ significantly from the values estimated above, 397 
indicating that dating errors are unlikely to mislead the interpretation of the 398 
data (Suppl. Fig. 3). 399 
 400 
Written structure converged abruptly to contemporary educated adult 401 
levels at the onset of the Axial Age 402 
 403 
Inferring the ancient mind based on a mathematical analysis of arcane 404 
records has an inevitable degree of speculation, but cognitive archeology gains 405 
depth when ancient literary data are compared to extant psychological data. The 406 
structural dynamics of historical texts shows similarity to the dynamics observed 407 
in healthy literate subjects, and most Bronze Age texts have graph attributes 408 
comparable to those measured in present-day reports from adults with 409 
psychotic symptoms or typically-developing children. One way to interpret the 410 
data is to consider that ancient literature resembles psychotic speech. Another is 411 
to conclude that ancient written discourse is structurally comparable to verbal 412 
reports of present-day children. Both interpretations resonate with the notion 413 
that adult psychosis reflects childish residues 19. This is likely related to 414 
developmental limitations in working memory and attention 49, which subside 415 
with education 50. Not surprisingly, such limitations are also observed in patients with
psychotic symptoms 51.
But the structural resemblance of childish, psychotic and ancient 418 
discourses does not necessarily imply similar mental functioning. Ancient texts 419 
were often a repository for the oral recitation of poetry—hence their repetitive 420 
structure. Rather than being psychotic or puerile, perhaps the ancient peoples 421 
simply wrote like poets. Alternatively, it is conceivable that the structure of 422 
ancient texts is simply too quaint to be meaningfully compared to the cultural 423 
record of extant literate societies, i.e. perhaps Pre-Axial discourses are similar to 424 
narratives from pre-literate societies or individuals. 425 
To address the first possibility, we compared the data to post-medieval 426 
Western poetry (N=60). To address the second possibility, we assessed verbal 427 
reports from three illiterate groups characterized by a decreasing gradient of 428 
indirect exposure to written discourse: illiterate adults (N=18, Suppl. Table 11), 429 
pre-school children (N=18, Suppl. Table 11), and non-literate Amerindians 430 
(N=41 narratives from at least 12 different subjects; Suppl. Table 10). As 431 
expected, there was an orderly gradient of structural differences across groups 432 
(Fig. 6). Importantly, Bronze Age texts differ significantly in structure from 433 
poetry as well as pre-literate narratives from either Amerindian adults or pre-434 
school children, but not from adult illiterates (Suppl. Table 13). Interestingly, 435 
poetry mixed features from pre-literate narratives (small LSC leading to reduced 436 
graph size) and contemporary literature (larger lexical diversity and fewer 437 
short-range recurrences, in comparison with both Pre and Post-Axial texts). 438 
From a strictly structural point of view, cultural accumulation allowed for 439 
changes across 2.5 millennia that in healthy children take ~12 years of schooling. 440 
Surely Plato’s writings were no adolescent material, being manifestly interested 441 
in adult topics. Yet Plato’s writings and other Axial classics are on par in
structural complexity with verbal reports from modern-day healthy adolescents:
far from typical children and individuals with psychotic symptoms, much closer
to Voltaire than to Shuruppag (Fig. 5a). Childish or psychotic as it may be, the
Pre-Axial record reached a structural plateau around 800 BC, as shown by a moving
window averaging of the data across all traditions (Fig. 7). The 4 graph 447 
attributes show highly significant changes between the middle Bronze Age and 448 
the Axial Age (Suppl. Table 14).  449 
This sharp empirical transition, as well as the characteristic times for RE 450 
(1,127 BC for ‘all data’, 603 BC for ‘nested data’) and LSC (1,427 BC and 731 BC, 451 
respectively), agrees well with the cultural collapse between the end of the 452 
Bronze Age (~1,200-1,000 BC) and the onset of the Axial Age (~800 BC) (Suppl. 453 
Note 1b-d), when droughts, famine, plagues, war, invasions and natural 454 
cataclysms led to social disorganization, educational disruption, and literacy 455 
reduction 52. Interestingly, this transition represented a departure from near-456 
random long-range structures (N, LSC and ASP), with the opposite happening in 457 
the short-range (RE) (Fig. 7b). 458 
 459 
DISCUSSION 460 
 461 
Here we present for the first time a graph-based description of how 462 
schooling gradually changes the way people speak, how psychosis affects this 463 
process, and how it compares with the historical evolution of writing. 464 
Throughout the school years, verbal discourse becomes less repetitive, richer in 465 
vocabulary, and more structured in the long range, so that words recur in a 466 
greater number of “word-vicinity” contexts. The benefits of education are lost in 467 
subjects with psychotic symptoms, whose verbal production structurally 468 
resembles that of children. Strikingly, the effects of education on the speech 469 
structure of healthy adults seem to recapitulate the history of writing. Starting 470 
from the earliest stage when literature was closely linked to recitation and used 471 
schemes typical of orality, such as repetition, texts asymptotically matured into 472 
having richer vocabularies, less repetition, and more long-range structure. 473 
In literate societies, cultural exposure to written discourse begins early in 474 
childhood and extends over life by way of social interactions with literate 475 
individuals. Despite this influence, speech structure only begins to mature after 476 
alphabetization, as subjects adapt to the standards found among literate adults. 477 
Subjects with psychosis have difficulties in social interaction, maintaining a 478 
speech structure similar to that of Pre-Axial texts. Illiterate adult subjects also 479 
display a Pre-Axial pattern: Although they have been immersed for a long time in 480 
the literate culture, full literacy never developed. Reports from pre-school 481 
children, while similar to Pre-Axial literature in LSC and RE, have significantly
smaller graphs and less lexical diversity, denoting less exposure to the literate 483 
culture. The Amerindian reports, although mostly comprising elaborate oral 484 
narratives that take long years of training to be properly memorized in shape 485 
and content 53, were the farthest in structure from Pre-Axial texts.  486 
The sharp transient in graph attributes ~800 BC supports the concept of 487 
Axial Age 21, which has been challenged as a vague concept without empirical 488 
evidence 22,23,25. However, a quantitative semantic analysis of Judeo-Christian 489 
and Greco-Roman texts detected increased text similarity to the concept of 490 
“introspection” throughout the Axial Age 24. Statistical modeling attributed the 491 
timing of the Axial Age to economic development, not political complexity nor 492 
population size 25. This has been interpreted as evidence that the intellectual 493 
blossoming of the Axial Age derived from changes in reward systems, rather than 494 
from changes in cognitive styles 23,25. Our results argue for a complementary 495 
view: The economic prosperity of the Axial Age co-existed with a major change
in discourse structure, with a contemporary parallel in the maturation of verbal 497 
reports that depends more on years of education than on biological age. 498 
Bronze Age texts are structurally similar to verbal reports from both 499 
children and psychotic subjects. The notion that psychosis resembles childish or 500 
primitive behavior is culturally pervasive, but so far has lacked empirical 501 
support. While the graph-theoretical similarity of Pre-Axial literature and 502 
psychotic discourse is compatible with the notion that Bronze Age mentality was 503 
psychotic-like 20, it surely does not imply that the graph-theoretical features of 504 
verbal and written production of psychotic subjects, children and ancient 505 
authors had similar underlying causes. Despite the formal similarities reported 506 
here, the mechanisms responsible for the changes from childhood to adulthood 507 
and in psychosis are likely to differ. Still, our results contribute to address two 508 
major criticisms of Jaynes’ theory, namely the lack of psychiatric basis 54, and 509 
missing evidence of recent change in “mental software” 55. The results concur 510 
with the proposition that it is not “ridiculous to suppose that consciousness is a 511 
cultural construction based on language and learned in childhood” 56. Lastly, 512 
the results encourage the investigation of Pre-Axial mummies for putative 513 
genetic or epigenetic markers of schizophrenia 57-60. 514 
Our results also suggest that Amerindian discourse is even more ancient in 515 
structure than Pre-Axial literature. Ethno-psychiatry recognizes the occurrence 516 
of psychosis in pre-literate Amerindian societies 61, but its prevalence is 517 
controversial because of ethnocentrism 62 and the difficult sorting of 518 
psychopathology from exotic cultural behavior 63. Amerindian narratives often 519 
take many years of training to be learned. Recitation is accompanied by complex 520 
sequences of gestures and postures, and in some traditions tends to maintain a 521 
very similar structure across different narrators 53. Short-range recurrence is 522 
pervasive, and the several forms of parallelism used in such verbal performances 523 
indicate that the repetition of words or sentences is an important feature of a 524 
highly regarded style of both thinking and narrating. The production of 525 
symbolism for its own sake is at the core of what Lévi-Strauss called the “savage 526 
mind”, in opposition to what could be taken as “tamed thought” - the constraint 527 
of symbolic activity by external needs, ends and means 64. Perhaps psychotic 528 
subjects and healthy children in literate societies exhibit some degree of the 529 
“savage mind” (Suppl. Note 2). If, on one hand, writing presents new 530 
possibilities for narrative complexity, it also limits certain characteristics of 531 
thought which, in societies without writing or that were developing writing 532 
millennia ago, were valued and considered functional. 533 
The characteristic times for the ontogenetic and historical development of 534 
graph attributes are summarized in Fig. 8. Education-related cultural 535 
accumulation makes discourse less recursive and more connected at both the 536 
ontogenetic and historical levels, but the corresponding transformation paths 537 
are only partially overlapping. While the monotonic dynamics in both datasets 538 
are overall quite similar (compare Figs. 1c, 2 and 4), the temporal order of 539 
saturation for specific graph attributes differs across datasets. 540 
Ontogenetically, short-range recurrence and lexical diversity begin to 541 
stabilize in the first school year, as expressed in a wider use of an expanding 542 
vocabulary and less use of mnemonic resources to organize speech. This is 543 
consistent with evidence that lexical connectivity facilitates language acquisition 544 
even in preschool children 8. Then, mostly during high school but with large 545 
inter-individual variation, graph size and long-range recurrence saturate, and 546 
graph attributes evolve towards the typical adult profile. The data point to a 547 
hierarchical development of discourse structure, by which we depart from an 548 
initial pattern of fragmented word segments dominated by short-range 549 
connections to a learned pattern of globally connected word strings. 550 
Historically, the earliest maturation of discourse structure occurred for the 551 
increase in long-range recurrence and decrease in short-range recurrence 552 
between the middle Bronze Age and the Axial Age. Similarly to the ontogenetic 553 
data, a decrease in short-range recurrence is an early marker of maturation in 554 
literature. However, lexical diversity and graph size follow a distinct path, not 555 
stabilizing until much beyond the present. These differences are likely related to 556 
the fact that the historical data was not produced by children, but by educated 557 
adults of the cultural elites of yore. Still, the different paths reach similar 558 
outcomes. The results imply that, at any given time, it is the educated subject 559 
able to create literature – the writer – who will push the envelope of discourse 560 
structure. The fine-grained dynamics of graph attributes are different for 561 
ontogenesis and history because of the many intrinsic differences between these 562 
processes, including the fact that they correspond in the latter to the maximum 563 
found in the population, while in the former they simply measure the degree of 564 
adherence to the current educational canon. 565 
First established in ancient Sumer 65, schools foster the education of those 566 
who will instruct younger generations through written language. Literacy 567 
acquisition is associated with important anatomical and physiological changes in 568 
neocortical organization, including robust lateralization 6,66,67. Given the 569 
association between psychosis and reduced lateralization 68, the results suggest 570 
that the lateralization associated with literacy may have shaped the mental 571 
processes underlying the development of literature. While the complex discourse 572 
structure of healthy adults owes more to nurture than to nature, education does 573 
not do its work in subjects with psychosis. When cognitive development is 574 
impaired by disease, nature trumps nurture. Despite exposure to education, 575 
subjects with psychosis retain a linguistic structure akin to that of children’s 576 
speech, failing to mature in complexity and remaining closer to a near-random 577 
structure. The historical parallel of a psychotic breakdown with cognitive decline 578 
is given by the cultural collapse at the end of the Bronze Age, which coincides 579 
with the resurgence of literature with increased short-range recurrence and 580 
decreased long-range recurrence. In the context of societies where reading and 581 
writing are the norm, the structural randomness of long-range connections 582 
seems therefore to represent an immature trace of the human mind, at the level 583 
of the individual as well as historically. 584 
  585 
METHODS 586 
 587 
Ontogenetic Data: 588 
The convenience sample (data pooled from 34,35,37,69 plus new samples) 589 
comprised clinical oral interviews from 200 individuals: 135 without any
diagnosis of psychiatric disorder, and 65 with psychotic symptoms, independently
diagnosed by the standard DSM IV ratings SCID 70 as schizophrenic (S, N=36) or
bipolar type I (B, N=29) (Suppl. Table 1). Also applied were two
standard psychometric scales, the ‘‘Positive and Negative Syndrome Scale’’ 594 
(PANSS) 71 and the ‘‘Brief Psychiatric Rating Scale’’ (BPRS) 72, and a 595 
socioeconomic-clinical questionnaire (with information regarding age, sex, 596 
family income, educational level, marital status, disease duration and onset). This 597 
study used data from two protocols approved by the Research Ethics Committee 598 
of the Federal University of Rio Grande do Norte (permits #102/06-98244 and 599 
#742.116). Signed informed consent was obtained from all participants and also 600 
from a legal guardian when necessary, and the study adhered to all relevant 601 
ethical regulations. The exclusion criteria were any neurological condition or 602 
alcohol/drug abuse. The analysis of memory reports focused on answers to three 603 
open questions, namely requests for reports on one recent dream, on waking 604 
activities in the previous day, and about a negative affective image shown for 15 605 
seconds immediately before the request. The negative image was selected from a 606 
widely validated affective images database 73 4. For each subject, the three 607 
reports were concatenated and the final text was represented as a word graph 608 
(Fig. 1a).  The same report protocol was applied to an independent control 609 
group of 18 pre-school children, and 18 illiterate adults from a rural region 610 
near Natal, RN, Brazil. Demographic information is given in Suppl. Table 11. Also as a
control, we analyzed 41 Amerindian oral narratives comprising myths, historical 612 
events, and personal stories. The data were obtained from one of the authors 613 
(AG) under permit 1712/09 from the National Indian Foundation (FUNAI), from 614 
publications, and from a public corpus at the State University of Campinas 615 
(http://www.tycho.iel.unicamp.br). Demographic information and sources of the
Amerindian reports are presented in Suppl. Table 10.
 618 
Literary Data: 619 
Bibliography Selection and Editing: Representative prose texts translated to
English or written in English were retrieved from public-domain internet sources
or kindly provided by their authors. The texts were converted to .txt format and edited
to remove prefaces, notes, comments, line breaks, page/tablet numbering and
publisher information. Paragraphs were preserved. All text editing procedures were
performed with Matlab and Notepad++ software. Text identification, time
intervals, and dating are detailed in Suppl. Table 2. 626 
Control for arbitrary selection of post-medieval texts: To compare with our 627 
literary sample, additional texts comprising 10 random sets of 20 modern and 628 
contemporary texts were selected using the search engine "Random Page" on the 629 
digital library Project Gutenberg, with plays, poetry and non-English versions 630 
excluded (https://www.gutenberg.org/ebooks/search/?sort_order=random). 631 
For this control, only the initial 1,000 words of each text were analyzed. The 632 
composition of the 10 sets is detailed in Suppl. Table 8. Two texts were 633 
randomly selected twice, for a total of 198 different texts analyzed in this control. 634 
Transliterated originals: As a control for translation effects, 50 transliterated 635 
original texts were also analyzed (29 non-English texts and 22 English originals 636 
already included in the initial sample). When necessary, originals were 637 
translated phonetically. Transliterations that contained non-Latin characters 638 
required for the accuracy of the phonetic reproduction were subjected to a 639 
replacement by corresponding standard characters (Example: "ṥ" replaced with 640 
"s"). 641 
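The character replacement described above can be illustrated with a minimal Python sketch. It is not the procedure actually used for the corpus, which was edited with Matlab and Notepad++ and whose replacement table is not given here. The function name to_basic_latin is hypothetical, and the sketch assumes that Unicode canonical decomposition followed by removal of combining marks is an acceptable stand-in for a manual replacement table.

```python
import unicodedata


def to_basic_latin(text):
    """Replace accented Latin letters by their base letters.

    Assumption: canonical decomposition (NFD) followed by removal of
    combining marks approximates the manual character replacement.
    Letters without a canonical decomposition are left unchanged.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# "\u1e65" is the character given as an example in the text (s with acute
# and dot above); it is reduced to a plain "s".
print(to_basic_latin("\u1e65"))
```

Characters that have no decomposition would still need an explicit replacement table.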
Poetry: 60 poetry samples from medieval, modern and contemporary periods 642 
were also collected as a control to assess if detected graph patterns are related to 643 
poetical structure. Text identification, time intervals, and dating are detailed in 644 
Suppl. Table 12. 645 
Text Dates: Text dating information was obtained preferentially by exact 646 
(known) dating or time of work conclusion (1). When this
information was lacking, dates corresponded to the middle of the historical
period when the text was written (2), or to the middle of the author's lifespan 649 
(3). Details about the dating employed can be found in Suppl. Note 3. 650 
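The three dating criteria can be sketched as a small Python function. This is an illustration under stated assumptions rather than the procedure documented in Suppl. Note 3: the name assign_date is hypothetical, years BC are encoded as negative numbers, and the criteria are tried in the order in which they are numbered above.

```python
def assign_date(exact=None, period=None, lifespan=None):
    """Return a single date (in years, BC negative) for a text.

    Assumed priority: (1) exact date or date of conclusion of the work,
    (2) middle of the historical period, (3) middle of the author's lifespan.
    `period` and `lifespan` are (start, end) tuples.
    """
    if exact is not None:
        return float(exact)
    if period is not None:
        return (period[0] + period[1]) / 2
    if lifespan is not None:
        return (lifespan[0] + lifespan[1]) / 2
    raise ValueError("no dating information available")


# Hypothetical inputs, for illustration only.
print(assign_date(exact=1759))
print(assign_date(period=(-2000, -1600)))
print(assign_date(lifespan=(-428, -348)))
```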
 651 
A grand total of 733 different texts were analyzed. Text sources included the 652 
Digital Egypt of the University College London (http://www.ucl.ac.uk/museums-653 
static/digitalegypt/), the Electronic Text Corpus of Sumerian Literature of the 654 
University of Oxford (http://etcsl.orinst.ox.ac.uk/), Project Gutenberg 655 
(www.gutenberg.org), and The Internet Classics Archive of the Massachusetts 656 
Institute of Technology (http://classics.mit.edu/). The sources of all texts are 657 
indicated in Suppl. Table 2. 658 
 659 
Graph Analysis of Ontogenetic and Literary Data: 660 
All the data are fully available upon request. Graph analysis was performed using 661 
the software SpeechGraphs, which is freely available at 662 
http://www.neuro.ufrn.br/softwares/speechgraphs. For memory reports as 663 
well as literary texts, average graph attributes were calculated across each graph 664 
using moving windows of 30 words with 50% of overlap 35, i.e. steps of 15 words 665 
(Fig. 1b). A total of 4 average graph attributes were calculated for each text file, 666 
comprising lexical diversity (N = nodes), short-range recurrence (RE = repeated
edges), long-range recurrence (LSC = largest strongly connected
component) and graph size (ASP = average shortest path). To estimate randomness
levels, each 30-word window was shuffled 100 times so as to keep the same 670 
words but change their order (Fig. 3a). This procedure is equivalent to a random 671 
permutation of edges 74. Graph attributes of randomized word windows were 672 
then averaged and used to normalize the original average data (Figs. 3b and 5).
To cope with computational cost, texts above 50,000 words were trimmed to this
maximum. Data were analyzed with Excel and Matlab software.
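The windowing, attribute calculation and shuffling steps described above can be sketched in Python. This is not the SpeechGraphs implementation and may differ from it in detail. All function names are hypothetical. RE is assumed here to count every occurrence of a directed edge beyond the first, and ASP is omitted for brevity.

```python
import random
from collections import Counter, defaultdict


def windows(words, size=30, step=15):
    """Yield windows of `size` words advancing by `step` (50% overlap).

    Assumptions: trailing words that do not fill a window are dropped, and
    a text shorter than `size` is treated as a single window.
    """
    if len(words) <= size:
        yield list(words)
        return
    for start in range(0, len(words) - size + 1, step):
        yield words[start:start + size]


def largest_scc(nodes, succ, pred):
    """Size of the largest strongly connected component (Kosaraju)."""
    order, seen, done = [], set(), set()

    def visit(u):  # first pass: finishing order on the original graph
        seen.add(u)
        for v in succ[u]:
            if v not in seen:
                visit(v)
        order.append(u)

    def collect(u):  # second pass: component size on the reversed graph
        done.add(u)
        size = 1
        for v in pred[u]:
            if v not in done:
                size += collect(v)
        return size

    for u in nodes:
        if u not in seen:
            visit(u)
    return max((collect(u) for u in reversed(order) if u not in done), default=0)


def attributes(words):
    """N, RE and LSC of the directed word graph of one (short) window."""
    edges = Counter(zip(words, words[1:]))  # consecutive words form an edge
    succ, pred = defaultdict(set), defaultdict(set)
    for a, b in edges:
        succ[a].add(b)
        pred[b].add(a)
    nodes = list(dict.fromkeys(words))
    return {"N": len(nodes),
            "RE": sum(c - 1 for c in edges.values()),  # assumed definition
            "LSC": largest_scc(nodes, succ, pred)}


def mean_attributes(words, size=30, step=15):
    """Average each attribute over all windows of the text."""
    rows = [attributes(w) for w in windows(words, size, step)]
    return {k: sum(r[k] for r in rows) / len(rows) for k in rows[0]}


def normalized_attributes(words, size=30, step=15, shuffles=100, seed=0):
    """Divide observed means by the mean over shuffled versions of each window."""
    rng = random.Random(seed)
    observed = mean_attributes(words, size, step)
    totals, count = Counter(), 0
    for window in windows(words, size, step):
        for _ in range(shuffles):
            shuffled = list(window)
            rng.shuffle(shuffled)  # same words, random order
            totals.update(attributes(shuffled))
            count += 1
    return {k: observed[k] / (totals[k] / count) if totals[k] else float("nan")
            for k in observed}
```

Because shuffling keeps the same words in each window, the normalized N is 1 by construction. Only attributes that depend on word order (RE, LSC and ASP) carry information after normalization.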
 676 
Exponential model: 677 
In order to study the dynamics of graph attributes across different educational 678 
levels or across time in literature, the following model was used: 679 
 680 
f(t) = f0 + (f∞ - f0)(1 - exp(-t/Ƭ))

where

f∞ is the maximum asymptotic graph attribute value
f0 is the initial graph attribute value
t is time
Ƭ is the characteristic time to reach saturation.

The function is the solution to a linear differential equation of first order:

df/dt = (1/Ƭ)(f∞ - f), with initial condition f(t=0) = f0.
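As a minimal numerical illustration of the model, the Python sketch below evaluates the saturating exponential and checks that it satisfies the differential equation. The parameter values are hypothetical and chosen only for illustration. They are not fitted values from this study, and the function names are not taken from the analysis code.

```python
import math


def saturating_exponential(t, f0, f_inf, tau):
    """f(t) = f0 + (f_inf - f0) * (1 - exp(-t / tau))."""
    return f0 + (f_inf - f0) * (1.0 - math.exp(-t / tau))


def ode_rhs(f, f_inf, tau):
    """Right-hand side of df/dt = (1 / tau) * (f_inf - f)."""
    return (f_inf - f) / tau


# Hypothetical parameters, for illustration only.
f0, f_inf, tau = 10.0, 25.0, 4.0

# At t = tau the attribute has covered a fraction 1 - 1/e (about 63%)
# of the distance between its initial and asymptotic values.
fraction = (saturating_exponential(tau, f0, f_inf, tau) - f0) / (f_inf - f0)

# A central finite difference of f(t) should match the right-hand side.
t, h = 3.0, 1e-5
numeric_slope = (saturating_exponential(t + h, f0, f_inf, tau)
                 - saturating_exponential(t - h, f0, f_inf, tau)) / (2 * h)
model_slope = ode_rhs(saturating_exponential(t, f0, f_inf, tau), f_inf, tau)
print(fraction, numeric_slope, model_slope)
```

The characteristic time can thus be read as the time at which roughly 63% of the total change has occurred, regardless of whether the attribute increases or decreases.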
 693 
For memory reports we used as input data the average graph attribute from all 694 
individuals with the same age, and weighted the model for the standard error of 695 
the mean. For literary data we first used a non-weighted model considering all 696 
data points, and then we repeated the analysis using as input data the average 697 
graph attribute from all texts from the same tradition, weighting the model
by the standard deviation of the mean, to control for the different number of
texts available from different traditions. To better constrain the fit, we set
lower and upper bounds for each coefficient, according to the maximum and
minimum value expected for each graph attribute and for time (years of 702 
education or historical time), as detailed in Suppl. Table 3. In order to further 703 
evaluate the model’s goodness of fit, we shuffled the temporal variable 1,000 704 
times, using years of education for the ontogenetic data (Suppl. Fig. 1) and years 705 
for the historical data (Suppl. Fig. 2). To assess the impact of dating imprecision 706 
on the results, the data were submitted to 1,000 surrogations with random 707 
temporal jitter of 100 years, or the difference between the oldest and newest 708 
estimated dates, whenever that difference was larger than 100 years. 709 
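The construction of date-jittered surrogates can be sketched as follows. The distribution of the jitter is not specified above, so the sketch assumes a uniform jitter centered on the assigned date, with a total width of 100 years or of the range of estimated dates when that range is larger. Function names and the example records are hypothetical, and the refitting of the exponential model to each surrogate is left out.

```python
import random


def jitter_width(oldest, newest, minimum=100.0):
    """Width of the jitter: 100 years, or the dating range if it is larger."""
    return max(minimum, newest - oldest)


def surrogate_dates(records, rng):
    """Return one surrogate date per record.

    `records` holds (assigned_date, oldest_estimate, newest_estimate) tuples,
    with years BC as negative numbers. Assumption: the jitter is uniform and
    centered on the assigned date.
    """
    dates = []
    for assigned, oldest, newest in records:
        width = jitter_width(oldest, newest)
        dates.append(assigned + rng.uniform(-width / 2, width / 2))
    return dates


def make_surrogates(records, n=1000, seed=0):
    """Generate `n` surrogate date lists to which the model can be refitted."""
    rng = random.Random(seed)
    return [surrogate_dates(records, rng) for _ in range(n)]


# Hypothetical records, for illustration only.
example = [(-490.0, -500.0, -480.0), (-1050.0, -1200.0, -900.0)]
surrogates = make_surrogates(example, n=1000, seed=1)
print(len(surrogates), surrogates[0])
```

Each surrogate list would then be paired with the unchanged graph attributes and fitted with the same model, so that the spread of the fitted parameters reflects the dating uncertainty.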
 710 
ACKNOWLEDGEMENTS: 711 
Work supported by UFRN, Conselho Nacional de Desenvolvimento Científico e 712 
Tecnológico (CNPq), grants Universal 480053/2013-8 and 408145/2016-1 and 713 
Research Productivity 308775/2015-5 and 310712/2014-9; Coordenação de 714 
Aperfeiçoamento de Pessoal de Nível Superior (CAPES) Projects OBEDUC-715 
ACERTA 0898/2013 and STIC AmSud 062/2015; Fundação de Amparo à Ciência 716 
e Tecnologia do Estado de Pernambuco (FACEPE); Center for Neuromathematics 717 
of the São Paulo Research Foundation FAPESP (grant 2013/07699-0), 718 
Boehringer-Ingelheim International GmbH (grant 270561). We thank the 719 
Hospitals Onofre Lopes and João Machado for the sampling of psychiatric 720 
patients; the Schools “Arte de Nascer”, “Ulisses Góis”, “Antonio Severiano”, 721 
“Carlos Belo Moreno”, “Luis Antonio”, “Arnaldo Monteiro Bezerra”, and “Berilo
Wanderley” for the sampling of school students; M Posner, S Dehaene, S Bunge, 723 
CJ Cela Conde, S Lipina, D Araujo, C Queiroz, J Sitt, JV Lisboa, A Cabana, J Queiroz, 724 
A Battro, J Luban, MP de Souza, and P Dalgalarrondo for insightful discussions 725 
and comments on the manuscript; M Laub and JE Agualusa for source material; 726 
PPC Maia for IT support; D Koshiyama and V Ribeiro for documentation support; 727 
AEA Oliveira for help with the sampling of illiterate adults; and Instituto
Metrópole Digital UFRN for cloud usage. 729 
 730 
COMPETING INTERESTS 731 
The authors declare no competing interests. 732 
 733 
  734 
Figures 735 
 736 
 737 
 738 
Fig. 1: Verbal reports from typical children and psychotic adults are 739 
structurally similar to Bronze Age literature, while reports from typical 740 
adults resemble Post-Axial literature. a) The graph attributes investigated 741 
comprised lexical diversity (N), long-range recurrence (LSC), short-range 742 
recurrence (RE) and graph size (ASP) 34,35. Red circles indicate nodes, black 743 
arrows indicate edges. b) Moving windows (length = 30 words, 50% overlap) 744 
were used to calculate mean values per graph for the different attributes. c) 745 
Graph attributes from psychotic subjects are not significantly different from 746 
those of typical children and Bronze Age literature (Table 1). KW(p) for Kruskal-747 
Wallis p value. Mean ± SEM are shown, and post-hoc statistical significance was 748 
assessed by the Wilcoxon rank sum test (two-tailed); * indicates significant 749 
differences from Bronze Age texts, typical children < 12 years and psychotic 750 
subjects, # indicates significant differences from the same groups plus typical 751 
subjects > 12 years (Bonferroni correction for 40 comparisons, alpha = 0.00125, 752 
p values in Table 1). Sample sizes: Typical < 12 yo (N=80), typical > 12 yo 753 
(N=55), subjects with psychosis >12 yo (N=63), Pre-Axial texts (N=115), Axial 754 
and Post-Axial texts (N=332). 755 
  756 
 757 
 758 
Fig. 2: The structure of memory reports matures with years of education in 759 
typical subjects, but not in psychotic patients. a) Representative examples of 760 
graphs from typical and psychotic subjects, as children or adults. Light blue 761 
perimeters indicate LSC. b) Lexical diversity as a function of years of education 762 
(yE) for typical (N=135) and psychotic (N=65) subjects. Similar plots for c) 763 
Short-range recurrence, d) Long-range recurrence, and e) Graph size. For 764 
significant Spearman correlations, characteristic years of education (Ƭ) and 765 
asymptotic values (f∞) indicated by vertical and horizontal dashed lines, 766 
respectively. R² and Root-mean-square error (RMSE) indicated on top. For 767 
information about the model and parameters used, see Methods and Suppl. 768 
Table 3. For data on Spearman correlations and multiple linear combinations 769 
between education and age, see Suppl. Table 4. Goodness of fit in Suppl. Table 770 
5, randomization analysis in Suppl. Figure 1.  771 
  772 
 773 
 774 
Fig. 3: Memory reports from psychotic subjects have a near-random 775 
structure. a) Graph attributes were calculated for each random graph and 776 
averaged to compose the denominator of the ratio shown as normalized graph 777 
attribute in the next panel. b) The graph attributes of each individual report 778 
were normalized by the corresponding mean random value, and the data were 779 
sorted according to more or less than 12 yE. Typical subjects showed significant 780 
differences between subjects below (<) or above (>) 12yE (p for RE=0.00004, 781 
LSC=1.19e-10, ASP=8.04e-8), but psychotic subjects did not. Typical subjects > 782 
12 yE showed significant differences from psychotic subjects < 12 yE for all 783 
graphs attributes (p for RE=0.0001, LSC=3.25e-8, ASP=0.0005, not represented 784 
in the figure), and from psychotic subjects > 12 yE for LSC (p=0.0001, not 785 
represented in the figure). Sample sizes: Typical < 12 yE (N=99), Typical > 12 yE 786 
(N=36), subjects with psychosis < 12 yE (N=43), > 12 yE (N=22). * for p<0.05 787 
corrected for multiple comparisons, n.s. for non-significant differences 788 
(Wilcoxon rank sum test, two-tailed, Bonferroni correction for 18 comparisons, 789 
=0.0028). 790 
  791 
 792 
 793 
Fig. 4: The historical development of literary structure mimics the
ontogenetic dynamics. a) A corpus of 447 representative texts across 9
Afro-Eurasian literary traditions spanning ~4,500 years was investigated by graph
analysis as in Fig. 1. b) Lexical diversity increased monotonically over time,
while c) Short-range recurrence showed the opposite dynamics. d) Long-range
recurrence and e) Graph size increased over time. The data are well explained by
the exponentially saturating model. The historical data can be further explored
at http://www.neuro.ufrn.br/historicaldata. f-i) The data nested by literary
tradition show the same dynamics observed for fits of all individual texts. Each
data point represents the mean and standard deviation of the graph attribute for
all texts sampled in the tradition. R² and root-mean-square error (RMSE) are
indicated on top. For information about the model and parameters used, see
Methods and Suppl. Table 3. For data on Spearman correlations and goodness
of fit using all data points, see Suppl. Table 6. Data on the goodness of fit of the
nested analysis are in Suppl. Table 7. For the date randomization analysis, see
Suppl. Figure 1; for the date jittering analysis, see Suppl. Figure 3.

Fig. 5: The maturation of literary structure reflects historical time. a) LSC
and RE normalized by mean random values reconstitute the “arrow of time”.
Grey rectangle indicates supra-random LSC and infra-random RE (R² and p
values of Pearson correlation between the two normalized attributes indicated
on the top). b) A linear combination of normalized LSC and RE strongly
correlates with historical time (R² and p values of multiple linear regression
using least squares indicated on the top, coefficients for each attribute indicated
on the y axis). c) LSC saturates over time in Hindu literature, with
characteristic times within the Indo-Aryan migration (Suppl. Note 1b-d). R² and
root-mean-square error (RMSE) indicated on top. For information about the
model and parameters used, see Methods and Suppl. Table 3.

Fig. 6: Graph attributes from Pre-Axial texts differ from the graph attributes
of poetry and pre-literate narratives from Amerindian subjects or urban
preschoolers. a) Mean ± SEM for each graph attribute of interest. b) Mean ±
SEM for LSC versus RE. Note that Poetry and Amerindian narratives have very
distinct structures. * indicates differences from Post-Axial texts and # indicates
differences from both Pre-Axial and Post-Axial texts, with p<0.05 corrected for
multiple comparisons (Wilcoxon rank sum test, two-tailed, Bonferroni correction
for 32 comparisons, α=0.0016; p values in Suppl. Table 13). Dashed and solid
red lines indicate the boundaries given by mean ± SEM of Pre-Axial and
Post-Axial texts, respectively. Pre-Axial texts did not differ significantly from adult
illiterates in any structural measure. In contrast, the only attributes for which
Pre-Axial texts did not differ were ASP for poetry, RE for Amerindian adults, and
RE and LSC for preschool children. Overall, Pre-Axial texts showed more structural
differences than similarities with poetry and Amerindian narratives.
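The corrected significance thresholds quoted in the figure and table captions follow from dividing the family-wise level of 0.05 by the number of comparisons. The lines below are a minimal sketch of that arithmetic; the function names are illustrative and are not taken from the analysis code of the study.

```python
def bonferroni_alpha(n_comparisons: int, family_alpha: float = 0.05) -> float:
    """Per-comparison threshold under Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be a positive integer")
    return family_alpha / n_comparisons


def is_significant(p_value: float, n_comparisons: int,
                   family_alpha: float = 0.05) -> bool:
    """True when a raw p value survives the Bonferroni-corrected threshold."""
    return p_value < bonferroni_alpha(n_comparisons, family_alpha)


# Numbers of comparisons quoted in the captions (18, 24, 32 and 40).
for n in (18, 24, 32, 40):
    print(n, bonferroni_alpha(n))
```

With 18 comparisons the threshold rounds to 0.0028, the value quoted in the caption of Fig. 3.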
Fig. 7: Empirical transition in text structure near the onset of the Axial Age.
Marked transient in graph attributes across all traditions for a) Nodes, b) RE, c)
LSC, and d) ASP. Plotted are non-overlapping moving averages (windows of 200
years, mean ± SEM). For historical context, see Suppl. Note 1b,d. * for p<0.05
corrected for multiple comparisons, p values in Suppl. Table 14 (Wilcoxon rank
sum test, two-tailed, Bonferroni correction for 24 comparisons, α=0.0021).
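The non-overlapping 200-year windows described in this caption can be sketched as follows. This is an illustration under stated assumptions, not the code used for the figure: dates are signed calendar years (negative for BC), windows are aligned to multiples of the window length, and the function name and the toy values are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def windowed_mean_sem(dates, values, window=200):
    """Group values into non-overlapping windows and return mean and SEM.

    Returns {window_start_year: (mean, sem)}; SEM is NaN for windows that
    contain a single text.
    """
    bins = defaultdict(list)
    for date, value in zip(dates, values):
        # Floor division keeps BC (negative) years in the correct window.
        bins[(date // window) * window].append(value)
    summary = {}
    for start in sorted(bins):
        vals = bins[start]
        sem = stdev(vals) / len(vals) ** 0.5 if len(vals) > 1 else float("nan")
        summary[start] = (mean(vals), sem)
    return summary


# Hypothetical LSC values for three texts dated 850 BC, 810 BC and 790 BC.
print(windowed_mean_sem([-850, -810, -790], [2.0, 4.0, 6.0]))
```

The alignment of the window edges is a design choice; shifting the edges would change which texts fall on either side of the onset of the Axial Age.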
Fig. 8: Ontogenetic and literary characteristic times (Ƭ). The temporal order
of maturation for specific graph attributes differs between ontogenetic and
literary data. a) Characteristic times for ontogenetic development, indicated by
colored circles for each graph attribute. b) Characteristic times for historical
development, indicated by black dots for ‘all data’, boxes for ‘jittered data’, and
arrows for ‘nested data’. The boxes indicate the range of characteristic times for
the 1,000 jitter surrogations (details in Methods).

Table 1: Statistically significant differences among ontogenetic and literary
datasets. Significant p values indicated in bold (Bonferroni correction for 40
comparisons, alpha = 0.00125).

p values for KW and post-hoc tests    Nodes      RE         LSC        ASP
Kruskal-Wallis                        1.28E-36   7.78E-35   1.15E-63   7.72E-37
Typical <12 yo x Typical >12 yo       0.0000     0.0079     0.0000     0.0000
Typical <12 yo x Psychosis            0.6992     0.9077     0.3311     0.0156
Typical >12 yo x Psychosis            0.0000     0.0074     0.0002     0.0007
Typical <12 yo x Bronze Age           0.1155     0.8516     0.8315     0.0242
Typical <12 yo x Post-Axial           0.0000     0.0000     0.0000     0.0000
Typical >12 yo x Bronze Age           0.0000     0.0022     0.0000     0.0000
Typical >12 yo x Post-Axial           0.0024     0.0000     0.0000     0.2449
Psychosis x Bronze Age                0.0784     0.9031     0.0995     0.4804
Psychosis x Post-Axial                0.0000     0.0000     0.0000     0.0000
Bronze Age x Post-Axial               0.0000     0.0000     0.0000     0.0000

 
Supplementary Information 
 
Suppl. Table 1: Demographic and psychiatric characteristics of cohort of typical and
non-typical (psychotic) subjects

Suppl. Table 2: Identification and dating of literary texts included in the reference set
(independent file)

Suppl. Table 3: Parameters and rationales for the exponential model

Suppl. Note 1: Historical events of interest

Suppl. Table 4: Spearman correlations between graph attributes and years of age,
education, and a multiple linear combination of education and age that confirms the
predominance of the former

Suppl. Table 5: Goodness of fit and parameters of exponential model for ontogenetic
dataset (healthy and psychotic subjects)

Suppl. Figure 1: Ontogenetic and literary data respectively randomized for years of
education or historical time do not correlate with graph attributes

Suppl. Table 6: For literary data, parameters for Spearman and exponential correlations of
graph attributes with historical time

Suppl. Table 7: For literary data, parameters for exponential fit of the data nested by
literary tradition (fit of mean graph attributes weighted by standard error)

Suppl. Figure 2: Controls for potential discrepancy of graph attributes between original
and translated texts, and text selection bias

Suppl. Table 8: Identification and dating of literary texts included in the 10 randomly
chosen sets (N=20 texts per set) (independent file)

Suppl. Table 9: Pearson correlation between graph attributes

Suppl. Figure 3: Literary data assuming dating jitter of at least 100 years from the
estimated date of each data point

Suppl. Table 10: Demographic information of Amerindian reports (independent file)

Suppl. Note 2: Ethnopsychiatry

Suppl. Table 11: Demographic information of illiterate samples

Suppl. Table 12: Identification and dating of Poetry (independent file)

Suppl. Table 13: Statistically significant differences to Pre-Axial and Post-Axial texts of
Poetry, Illiterate Adults, Preschool children and Amerindian adults

Suppl. Table 14: Statistically significant differences between historical periods (Bronze
Age, Axial Age and Post-Axial Age)

Suppl. Note 3: Detailed Dating Procedure

Suppl. References
 
  
 
Supplementary Table 1: Demographic and psychiatric characteristics of
cohort of typical and non-typical (psychotic) subjects. Number of adult and
non-adult individuals in each sample (adults defined as 18 years old or
older). Mean and standard deviation for age and years of education, and sex
distribution. The psychiatric assessment shows the number of individuals for each
diagnosis (Schizophrenia or Bipolar Disorder), the number of females in each
diagnostic group, mean and standard deviation for psychometric scales (severity
of general and psychotic symptomatology) and for disease duration in years, and
medication used (percentage of patients using a medication class in each
diagnostic group). Note that the sex distribution is imbalanced in the
psychotic sample (specifically among subjects with a schizophrenia diagnosis).
Note also that there are many more children in the Control sample, due
to the difficulties of diagnosing psychosis during childhood. This difference
impacts the distributions of age and years of education. Abbreviations: Brief
Psychiatric Rating Scale (BPRS), Positive and Negative Syndrome Scale (PANSS),
anti-psychotic (AP).
 
Demographic Characteristics     Psychosis          Control
Number of individuals
  Non-adults                    17                 93
  Adults                        48                 42
Age (years)                     29.51 ± 13.36      14.92 ± 11.61
Sex
  Male                          72% (N=47)         49% (N=66)
  Female                        28% (N=18)         51% (N=69)
Years of Education              7.42 ± 4.61        6.22 ± 6.37

Psychiatric Assessment          Schizophrenia      Bipolar
Number of individuals           36 (6 females)     29 (12 females)
Psychometric Scales
  BPRS                          16.81 ± 6.33       15.28 ± 7.06
  PANSS                         69.69 ± 14.58      62.45 ± 15.46
Disease Duration (years)        12.31 ± 12.44      8.28 ± 9.64
Medication
  AP typical                    67%                59%
  AP atypical                   47%                28%
  Mood stabilizer               11%                62%
  Antidepressant                3%                 21%
  Benzodiazepine                22%                21%
 
Supplementary Table 3: Parameters and rationales for the exponential
model.

Coefficient f∞
  Lower point and rationale: 0 / no graph attribute can be smaller than 0.
  Upper point and rationale: 30 for N and LSC (graph attributes counted by
    number of nodes) / maximum number of nodes for 30-word graphs; 29 for RE
    and ASP (graph attributes counted by number of edges) / maximum number of
    edges for 30-word graphs.
  Start-point: maximum observed value.

Coefficient Ƭ
  For education:
    Lower point and rationale: 0 / illiterates.
    Upper point and rationale: 30 / post-doctoral level.
    Start-point: 12 years of education (high school level).
  For historical time:
    Lower point and rationale: 2,500 BC / earliest written record.
    Upper point and rationale: infinite / future.
    Start-point: 800 BC (Axial Age).

Coefficient f0
  Lower point and rationale: 0 / no graph attribute can be smaller than 0.
  Upper point and rationale: 30 for N and LSC (graph attributes counted by
    number of nodes) / maximum number of nodes for 30-word graphs; 29 for RE
    and ASP (graph attributes counted by number of edges) / maximum number of
    edges for 30-word graphs.
  Start-point: minimum observed value.
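As an illustration of how these bounds constrain the fit, the sketch below evaluates a saturating exponential on the education axis. The functional form f(t) = f∞ + (f0 − f∞)·exp(−t/Ƭ) is an assumption made here for illustration (the model itself is defined in Methods, not in this table), and the dictionary and function names are hypothetical.

```python
import math

# Bounds transcribed from Suppl. Table 3 for the education axis.
# Assumption: 30 is the upper bound for N and LSC; use 29 for RE and ASP.
BOUNDS_EDUCATION = {
    "f_inf": (0.0, 30.0),
    "tau": (0.0, 30.0),   # years of education
    "f_0": (0.0, 30.0),
}


def saturating_exponential(t, f_inf, tau, f_0):
    """Assumed model form: starts at f_0 and saturates towards f_inf."""
    if tau <= 0:
        raise ValueError("tau must be positive")
    return f_inf + (f_0 - f_inf) * math.exp(-t / tau)


def within_bounds(params, bounds):
    """Check that every named parameter lies inside its (lower, upper) range."""
    return all(bounds[k][0] <= v <= bounds[k][1] for k, v in params.items())


# Control LSC parameters reported in Suppl. Table 5.
lsc = {"f_inf": 18.68, "tau": 13.34, "f_0": 8.32}
print(within_bounds(lsc, BOUNDS_EDUCATION))
print(saturating_exponential(12, **lsc))  # predicted LSC at 12 years of education
```

Under this assumed form, Ƭ is the number of years of education after which the attribute has covered a fraction 1 − 1/e (about 63%) of the distance between f0 and f∞.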
 
Supplementary Note 1: Historical events of interest. 
 
a) The birth of literature occurred in Afro-Eurasia during the early Bronze
Age, in the context of the first major merger of civilizations, involving
Indo-European and Semitic populations. Proto-Indo-European originated in
west-central Asia 9,500 to 6,000 years ago and has since spread to Europe and
most of Afro-Asia, as the multiple Indo-European languages 1-3 co-evolved with
branches of the Afro-Asiatic linguistic family 4. Cultural and linguistic diversity
are estimated to have peaked during the Neolithic and declined afterwards 5,6.
Around 2,500 BC writing created the capacity for reliable communication across
space and time, as the historical record began 7. Population growth, migrations
and military conquests began to periodically unify larger and larger groups of
people around similar cultural kernels 8-11.
 
b) The Axial Age (800-200 BC) was marked by the blossoming of civilization in
multiple Eurasian sites, including Athens, Rome, Babylon, and the Persian,
Macedonian and Mauryan Empires 12-20. Many fundamental texts of ancient
literature date from this period (e.g. The Iliad, The Odyssey, The Republic, Book
of Genesis, Avesta, Mahabharata). Multicultural development and integration
were accelerated by the consolidation of alphabetic writing, new literary
traditions and the foundation of the first high-level educational institutions, such
as Plato’s Academy and the Library of Alexandria in the 4th century BC. By 326
BC, when Alexander invaded northern India, Indo-European and Afro-Asiatic
languages were developing sympatrically, with shared aspects of literature,
religion, government, trade and money 21,22.
 
c) Civilizations fell and rose in rapid succession at the end of the early
Bronze Age, which was marked by severe aridification 23. For instance, the
collapses of the Old Egyptian Kingdom (~2,181 BC) and of the Akkadian Empire in
Mesopotamia (~2,154 BC) were soon followed by empire reunification in Egypt
(~2,055 BC) and Mesopotamia (~2,025 BC for Assyria and ~1,760 BC for Babylon)
24-26. In the East, major urban centers dating from before 3,000 BC, such as
Mohenjodaro and Harappa, began to collapse by ~1,900 BC. The decay of the Indus
valley civilization was followed by an early migration of Indo-Aryan populations
into northwestern India between 1,800 BC and 1,500 BC 27,28. Together with
several other examples, these events mark the end of the early Bronze Age and
the onset of the middle Bronze Age in Afro-Eurasia 29,30.
 
d) The end of the Bronze Age is marked by a long list of city-states that
collapsed or began to fade in the West at the dawn of the first millennium BC
31,32, including Knossos (~1,100 BC), Homeric Troy (Herodotus ~1,250 BC,
archaeological Troy VII: ~950 BC), Mycenae (~1,200 BC), Ugarit (~1,190 BC),
Megiddo (~1,150 BC), and Babylon (~1,026 BC). Collapses also occurred in the
empires of Egypt (~1,100 BC) and Assyria (~1,055 BC). By 1,200 BC Indo-Aryan
groups were penetrating eastward into the Ganges plains, and by ~1,000 BC the
transition from semi-nomadic pastoral to settled agricultural Vedic societies was
consolidated 27,29,33-36.
 
  
 
Supplementary Table 4: Spearman correlations between graph attributes 
and years of age, education, and a multiple linear combination of education 
and age that confirms the predominance of the former. Significant p values 
indicated in bold (Bonferroni correction for 8 comparisons (2 groups * 4 
attributes), alpha = 0.0063). Coef stands for coefficient. 
 
AGE (Spearman correlation)          Nodes     RE        LSC       ASP
Typical      Rho                    0.36      -0.22     0.40      0.41
             p value                0.0000    0.0118    0.0000    0.0000
Psychosis    Rho                    -0.02     -0.04     0.17      0.06
             p value                0.8919    0.7744    0.1806    0.6178

EDUCATION (Spearman correlation)    Nodes     RE        LSC       ASP
Typical      Rho                    0.49      -0.33     0.45      0.51
             p value                0.0000    0.0001    0.0000    0.0000
Psychosis    Rho                    0.06      -0.01     0.19      0.17
             p value                0.6578    0.9253    0.1294    0.1750

Multiple Linear Combination         Nodes     RE        LSC       ASP
R²                                  0.16      0.09      0.23      0.26
p                                   0.0000    0.0025    0.0000    0.0000
Coef AGE                            -0.0067   0.0023    0.0578    0.0050
Coef EDU                            0.1195    -0.0500   0.2353    0.0394
|Coef EDU| - |Coef AGE|             0.1128    0.0478    0.1776    0.0344
 
Supplementary Table 5: Goodness of fit and parameters of exponential 
model for ontogenetic dataset (healthy and psychotic subjects). Significant 
Spearman correlations indicated in bold. 
 
For years of education: goodness of fit and parameters

                 Nodes     RE        LSC       ASP
Control
  R Square       0.85      0.95      0.83      0.52
  SSE            7.81      0.63      45.58     0.36
  RMSE           0.53      0.15      1.28      0.11
  f∞             24.56     1.07      18.68     4.94
  Ƭ              0.63      0.28      13.34     11.06
  f0             19.43     4.33      8.32      3.85
  |f∞ - f0|      5.13      3.26      10.36     1.08
Psychosis
  R Square       0.01      0.01      0.42      0.05
  SSE            9.16      1.96      137.30    1.33
  RMSE           0.53      0.24      2.04      0.20
  f∞             22.53     1.55      18.84     4.43
  Ƭ              29.99     1.12      14.94     3.71
  f0             23.48     0.00      6.69      3.59
  |f∞ - f0|      0.95      1.55      12.15     0.85
 
Supplementary Figure 1: Ontogenetic and literary data respectively
randomized for years of education or historical time do not correlate with
graph attributes. In every case, 1,000 surrogate calculations were performed.
a) Spearman correlations of graph attributes with shuffled or real years of
education (lines or dots, respectively). b) Exponential fits of graph attributes
with shuffled or real years of education (lines or dots, respectively). c) Spearman
correlations of graph attributes with shuffled or real historical time (lines or
dots, respectively). d) Exponential fits of graph attributes with shuffled or real
historical time (lines or dots, respectively).
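The shuffling control described in this caption can be sketched as a permutation procedure: the predictor (years of education or historical time) is shuffled across samples, and the Spearman correlation is recomputed for each surrogate. The code below is a minimal standard-library sketch; the function names and the toy data are hypothetical and do not reproduce the analysis pipeline of the study.

```python
import random
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman correlation computed as the Pearson correlation of ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def shuffled_rhos(predictor, attribute, n_surrogates=1000, seed=0):
    """Spearman rho for each of n_surrogates shuffles of the predictor."""
    rng = random.Random(seed)
    shuffled = list(predictor)
    rhos = []
    for _ in range(n_surrogates):
        rng.shuffle(shuffled)
        rhos.append(spearman_rho(shuffled, attribute))
    return rhos


# Hypothetical toy data: years of education and one graph attribute.
education = [0, 2, 4, 6, 8, 10, 12, 14]
attribute = [3, 5, 4, 8, 9, 11, 12, 15]
real_rho = spearman_rho(education, attribute)
surrogates = shuffled_rhos(education, attribute)
print(real_rho, sum(abs(r) >= abs(real_rho) for r in surrogates))
```

The final line counts how many surrogates reach a correlation at least as strong as the real one, which is the comparison the lines and dots in the figure convey.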
 
  
 
Supplementary Table 6: For literary data, parameters for Spearman and 
exponential correlations of graph attributes with historical time. Significant 
correlations indicated in bold (Bonferroni correction for 4 comparisons, alpha = 
0.0125). 
 
Spearman         Nodes      RE         LSC        ASP
  Rho            0.50       -0.46      0.49       0.54
  p              4.18E-30   1.23E-24   5.97E-28   6.23E-35
Goodness         Nodes      RE         LSC        ASP
  R Square       0.24       0.23       0.42       0.30
  SSE            564.74     125.51     3243.73    70.66
  RMSE           1.13       0.53       2.70       0.40
  f∞             30.00      0.09       19.34      29.00
  Ƭ              5,321      -1,127     -1,427     96,946
  f0             22.34      2.55       1.00       3.66
 
Supplementary Table 7: For literary data, parameters for exponential fit of 
the data nested by literary tradition (fit of mean graph attributes weighted 
by standard error).  
 
Goodness               Nodes      RE         LSC        ASP
R square               0.46       0.56       0.71       0.49
R adjusted             0.28       0.42       0.62       0.32
SSE                    1160.61    163.78     6231.67    153.10
RMSE                   13.91      5.22       32.23      5.05
Asymptotic f∞          30.00      0.00       21.44      16.20
Characteristic time    5,120      -603       -731       44,482
Coefficient f0         21.99      2.52       1.00       3.57
 
Supplementary Figure 2: Controls for potential discrepancy of graph 
attributes between original and translated texts, and for text selection bias. 
a) Nodes, RE and ASP were significantly correlated between originals and 
translations. LSC was not, due to a subset of Bronze Age texts on the top left 
corner of the plot, with much larger LSC in the translations than in the originals. 
b) The dynamics of graph attributes in original texts shows monotonic changes 
quite similar to those observed in translated texts (compare with Fig. 4). Note 
the structural clustering of recent English originals. c) Graph attributes of the 
reference sample of post-medieval texts do not differ from those of random 
samples. Compare results from the reference sample (Ref; black boxplots) and 
10 samples of 20 post-medieval texts randomly chosen from the Gutenberg 
Project digital library (R1-R10, gray boxplots). P values for Kruskal-Wallis tests 
corrected for 4 comparisons (alpha=0.0125). 
 
 
 
  
 
Supplementary Table 9: Pearson correlations between graph attributes. R²
values from correlations with significant p values are indicated in bold
(Bonferroni correction for 6 comparisons, alpha = 0.0083).
 
 
R²            N x RE   N x LSC   N x ASP   RE x LSC   RE x ASP   LSC x ASP
< 12yo        0.61     0.00      0.32      0.01       0.22       0.00
> 12yo        0.60     0.01      0.61      0.02       0.21       0.04
Psychosis     0.66     0.15      0.34      0.12       0.16       0.08
Pre-Axial     0.62     0.01      0.54      0.00       0.20       0.00
Post-Axial    0.86     0.06      0.90      0.10       0.64       0.01
 
Supplementary Figure 3: Literary data assuming dating jitter of at least 100
years from the estimated date of each data point. A total of 1,000 surrogate
calculations were performed considering an error of at least 100 years (when
the estimated error was larger than that, the larger interval was used as jitter).
a) Exponential R² of graph attributes with jittered or estimated dates (lines or
dots, respectively). b) The characteristic times of graph attributes with jittered
or estimated dates did not differ (lines or dots, respectively).
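A possible reading of the jittering rule in this caption is sketched below. Several choices are assumptions made only for illustration: the jitter is drawn uniformly and symmetrically around the estimated date, the dating error is treated as a half-width in years, and the function names and toy dates are hypothetical. The actual procedure is described in Methods.

```python
import random

MIN_JITTER_YEARS = 100  # minimum dating error stated in the caption


def jitter_dates(dates, dating_errors, rng):
    """Return one surrogate set of dates, each shifted within its error."""
    jittered = []
    for date, error in zip(dates, dating_errors):
        # Use the estimated error when it exceeds the 100-year minimum.
        half_width = max(MIN_JITTER_YEARS, error)
        jittered.append(date + rng.uniform(-half_width, half_width))
    return jittered


def jitter_surrogates(dates, dating_errors, n_surrogates=1000, seed=0):
    """Generate n_surrogates jittered copies of the dating vector."""
    rng = random.Random(seed)
    return [jitter_dates(dates, dating_errors, rng) for _ in range(n_surrogates)]


# Hypothetical texts dated 2,500 BC, 800 BC and AD 1500 (negative = BC),
# with estimated dating errors of 50, 300 and 0 years.
surrogates = jitter_surrogates([-2500, -800, 1500], [50, 300, 0])
print(len(surrogates), surrogates[0])
```

Each surrogate dating vector would then be passed to the same exponential fit as the estimated dates, so that the resulting R² values and characteristic times can be compared.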
 
 
 
  
 
Supplementary Note 2: Ethnopsychiatry 
 
The notion that schizophrenia is heavily influenced by civilization lingers
37. The arguments in that regard include the paucity of descriptions of
core schizophrenia symptoms in older sources, the supposed absence of the
disorder in indigenous peoples, and an alleged uneven
distribution of the disorder across cultures. Early descriptions of the mental
health of indigenous peoples reported that the prototypical symptoms of the
chronic course of schizophrenia were rarely observed 38-40. However, these
descriptions lacked systematic sampling and could have been influenced by other
factors, such as the concealment of affected individuals and the cross-cultural
barriers that would have limited the access of early psychosis researchers to
patient symptoms. By 1942, the idea that schizophrenia was a disease of
civilization was already being challenged 41. Also popular was the notion that
schizophrenia and shamanism share common traits, and that in so-called
primitive societies a person with those traits would become a shaman rather
than a psychotic. Contemporary studies on shamanism and cross-cultural psychiatry
tend to reject this notion 42,43.
After the World Health Organization’s cross-cultural studies on 
schizophrenia, this debate evolved. Since then, the general consensus has been 
that the prevalence of schizophrenia is broadly similar across the major 
contemporary cultures. Present-day researchers are unconvinced that 
schizophrenia is less common in indigenous groups, and there are studies 
indicating that indigenous populations are not immune to schizophrenia 44-47, 
including South Amerindians 48-51. However, a definite answer for this particular 
type of population is difficult to ascertain, since the question is highly 
influenced by ethnocentrism 52,53 and by the problematic attempt to separate 
psychopathology from exotic cultural behavior 43. Also, the perspective of 
Medical Anthropology stresses the importance of understanding that the concept 
of self – whose disturbances are key elements for the diagnosis of 
schizophrenia – may vary widely among cultures, especially in indigenous 
populations. This fact could potentially influence the outcomes and symptoms 
of the biological traits that underlie the disorder 54. 
On the other hand, at the same time that schizophrenia was found to have 
similar prevalence worldwide, a surprising finding emerged: the disorder 
appears to have better outcomes in countries with lower average income 
(developing countries). Though controversial, and still lacking an 
explanation, the evidence in this direction is strong and has not been sufficiently 
refuted 55-57. Another consistent finding is that the risk of schizophrenia 
is greater for those born in urban settings 58-60. In general, the attempts to 
explain these phenomena include a supposedly lower demand for individual 
performance in ‘less Westernized’ societies and regions and the role of stronger 
family ties in these regions 61. Arguments against the association between better 
outcomes and living in a less developed country include methodological 
problems and a higher mortality rate of severe cases in less developed 
environments 62.  
While the relationship between culture and schizophrenia is more 
tenuous than originally described, some associations – such as country 
income and urbanicity – remain. Although a tight connection between 
civilization and schizophrenia has been postulated, the explanations proposed 
for the phenomenon are mainly biological 37. So far, no biological factor has 
gathered enough evidence to explain these apparent variations. Therefore, the 
idea that cultural variation could be included among the variables that influence 
the occurrence and outcome of schizophrenia is acceptable in principle, as it 
underlies hypotheses that the disorder may be linked to religion and to the 
internal dialogue with gods and spirits 63,64, postulated to be common before the 
Axial Age 65. 
The Kalapalo, who comprise a major part of our Amerindian sample, 
consider that some people may be hidigü, i.e. “crazy”. The word comes from the 
root hidi, which can also be nominalized by the suffix –du (hidindu), meaning 
“craziness”. To be “crazy” can have many meanings. Young people say that those 
who have several lovers are “crazy”, since they do not have respect for their 
partners. The people of the past, present in narratives from “the dawn of 
time”, are said to be crazy because they did things that would not be 
considered proper behavior, like eating inedible food, going to dangerous 
places, making contact with spiritual beings and enemies, and lying to each 
other. Others might be said to be crazy if they do not show respect to their 
kin, and if they eventually come to kill them through sorcery. Hidindu might also 
be an illness that makes people unconsciously run screaming into the forest, or 
climb onto the roofs of their houses. In general, we might say that one is 
hidigü because he or she has uncontrolled relations to alterity: too many 
sexual partners, or unpredictable relations with enemies or spirits. 
This last condition, in particular, is usually provoked by spirits. These 
beings enjoy the company of humans, and may approach men and women to talk, 
to offer food, to take a walk to beautiful places, or even to have sex. If this 
happens, the company of the spirits will probably lead the person to see them 
as if they were human; on the other hand, the person would also stop 
recognizing their own kin as such. From the spirits’ point of view, the 
person’s body becomes like theirs, turning one into their kin; from the humans’ 
point of view, the person’s body is passing through a metamorphosis that might 
lead to death – that is, the unmaking of the kinship relations built during a 
person’s life. To become kin to the spirits means to forget your former kin, 
and this means to undergo a metamorphosis. A former human being could, thus, 
become a deer or a jaguar. Not all contacts with spirits lead to death, but 
since they induce new relations with different kinds of beings, the experience 
of forgetting about your kin, your home and your duties may manifest itself as 
“craziness”, hidindu. It is important to emphasize that this is not a mental 
state, but a bodily one. In indigenous Amazonia, the body and its affections 
are usually considered the locus of both perception and thought 66. In order 
to produce persons who think and act according to collective ideals, Amazonian 
peoples invest their energies in producing specific kinds of human bodies, 
through dietary prescriptions, adornment and innumerable techniques for 
modeling the body 67,68. Thus, when someone deeply alters his or her way of 
thinking, feeling and acting, this is seen as the result of a bodily 
transformation. 
If hidindu can be seen as a strong disorder in the way one relates to 
alterity, controlled relations with spirits by shamans are very important. 
Shamans may see the spirits and talk to them, and they frequently have families 
among them (male shamans usually marry their assistant spirit’s daughter, with 
whom they have children). In the past, shamanic trances were frequently 
described in the literature as symptoms of neurosis, and psychological traits 
were used by some to describe what was called a culture’s “personality”. 
However, as Lévi-Strauss argued long ago, psychic conditions might be seen as a 
translation, at the individual level, of sociological structures, since normal 
and abnormal behavior depend on what is considered as such in different 
cultural contexts 69. According to him, individual conducts are never symbolic 
in themselves, but are the elements from which a symbolic system might be 
constructed. While normal behavior represents some kind of “alienation” (since 
it means being subjected to arbitrary standards of normality), “abnormal” 
conducts are able to create the illusion of an autonomous symbolism at the 
individual scale, and psychopathological conditions would offer society an 
equivalent of symbolism different from its own. Since no society is fully 
symbolic, individuals with abnormal behavior could be called upon by society 
to occupy positions in which their own symbolism could create mediations 
between incompatible dimensions of social and symbolic life. Thus, psychotic 
individuals could, under certain historical and sociological conditions, exert 
at an individual scale a symbolic activity crucial to collective life, 
analogous to what might be achieved by collective symbolic thought. 
  
 
 
Supplementary Table 11: Demographic information of illiterate samples 
 
Demographic Characteristics   Preschool children   Adults 
Number of individuals         18                   18 
Age                           3.61 ± 0.14          46.17 ± 5.94 
Sex: Male                     50%                  33% 
Sex: Female                   50%                  67% 
 
 
 
  
Supplementary Table 13: Statistically significant differences between 
Pre-Axial or Post-Axial texts and Poetry, illiterate adults, preschool 
children and Amerindian adults. Significant p values indicated in bold 
(Bonferroni correction for 32 comparisons, alpha = 0.0016). 
 
 
Wilcoxon Ranksum test (p values)  Nodes RE LSC ASP 
Pre-Axial x Amerindian adults 0.0002 0.0845 0.0000 0.0000 
Post-Axial x Amerindian adults 0.0000 0.0000 0.0000 0.0000 
Pre-Axial x Preschool children 0.0000 0.0819 0.0380 0.0000 
Post-Axial x Preschool children 0.0000 0.0000 0.0000 0.0000 
Pre-Axial x Illiterate adults 0.9397 0.9240 0.1107 0.0527 
Post-Axial x Illiterate adults 0.0002 0.0000 0.0000 0.0088 
Pre-Axial x Poetry 0.0000 0.0000 0.0000 0.1058 
Post-Axial x Poetry 0.0000 0.0000 0.0000 0.0000 
  
 
Supplementary Table 14: Statistically significant differences between 
historical periods (Bronze Age, Axial Age and Post-Axial Age). Significant p 
values indicated in boldface (Bonferroni correction for 24 comparisons, alpha = 
0.0021). 
 
Wilcoxon Ranksum test (p values) Nodes RE LSC ASP 
Middle Bronze x Axial 0.0000 0.0000 0.0000 0.0000 
Early Bronze x Middle Bronze 0.0000 0.0158 0.0144 0.0010 
Early Bronze x Axial 0.5839 0.2141 0.0000 0.0976 
Middle Bronze x Post-Axial 0.0000 0.0000 0.0000 0.0000 
Axial x Post-Axial 0.0040 0.0255 0.6257 0.0011 
Early Bronze x Post-Axial 0.0753 0.0299 0.0000 0.0010 
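
The corrected thresholds quoted in the captions of these tables are consistent 
with dividing a family-wise alpha of 0.05 by the number of comparisons. The 
value 0.05 is not stated in the captions and is inferred here from the quoted 
thresholds, so it should be read as an assumption. The sketch below, with 
hypothetical function names, applies that threshold to the "Axial x 
Post-Axial" row of Supplementary Table 14. 

```python
def bonferroni_alpha(n_comparisons, family_alpha=0.05):
    """Per-comparison threshold under a Bonferroni correction."""
    # ASSUMPTION: family_alpha = 0.05, inferred from the quoted thresholds.
    return family_alpha / n_comparisons


def flag_significant(p_values, n_comparisons, family_alpha=0.05):
    """Map each attribute to True when its p value is below the threshold."""
    alpha = bonferroni_alpha(n_comparisons, family_alpha)
    return {name: p < alpha for name, p in p_values.items()}


# p values copied from the "Axial x Post-Axial" row of Supplementary Table 14.
axial_vs_post_axial = {"Nodes": 0.0040, "RE": 0.0255, "LSC": 0.6257, "ASP": 0.0011}

alpha_24 = bonferroni_alpha(24)  # threshold for 24 comparisons
flags = flag_significant(axial_vs_post_axial, 24)
```

Under this threshold only ASP differs significantly between the Axial and 
Post-Axial periods, even though the p values for Nodes and RE would pass an 
uncorrected 0.05 criterion. 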
  
 
Supplementary Note 3: Detailed Dating Procedure 
 
Syro-Mesopotamian 
Although there is a lack of consensus about the composition date of the majority 
of the Mesopotamian scriptures, Instructions of Shuruppag is considered one of the 
oldest writings of humanity, dating from circa 2,500 BC. Several Sumerian texts 
date from approximately 2,000 BC 70,71. 
 
Egypt 
Dating Egyptian texts required focusing on the age of papyrus/stela production, 
due to high uncertainty about the composition date of many scriptures. The ‘Book 
of the Dead’, for example, is a compilation of various rituals, holding textual 
productions from many different periods. One of the main sources for this work 
was the Digital Egypt website from University College London. It provides 
information about the presumable origins of composition, together with the 
estimated age of the papyrus or stela on which the text was found. When a 
certain period or dynasty was offered for dating the material, we used the 
chronology of the same database, available at: 
http://www.ucl.ac.uk/museums-static/digitalegypt/chronology/index.html. 
 
Hindu 
Most of the works of Hinduism present substantial uncertainty in dating, even 
for AD texts, and especially for the older ones. The collection of ‘Puranas’, for 
instance, comprises texts from many different centuries across the 1st millennia 
BC and AD, with varying attributions of composition date 72. More ancient 
scriptures such as the Vedic scriptures (e.g. the ‘Rigveda’) reach back to a late 
Bronze Age composition time, most likely in the middle of the 2nd millennium BC. 
 
Judeo-Christian 
Dating of Biblical texts is more accurate for the New Testament than for the Old 
Testament, for which there is a lot of discussion concerning composition time. In 
general, dates were taken from The New Oxford Annotated Bible, which 
links historical events, idiom and writing style to certain periods 73. An example 
is the book ‘Lamentations of Jeremiah’, which supposedly has the Destruction of 
Jerusalem (circa 586 BC) as the story background. For some other books, such as 
compilations, assigning a certain date was a more difficult task, as in the 
case of the Psalms, Proverbs and Song of Solomon, with dating uncertainty of up 
to 900 years. 
 
Greek-Roman 
Greek and Latin literary productions are usually well documented. However, 
some textual pieces still have unclear information concerning dating and even 
authorship. Some specific uncertainties are presented below: 
 
Aesop – His tales were probably written during his lifetime. Since the majority 
of the sources offer this period to date the ‘Fables’, we dated the book using 
the middle of the author’s lifespan 74. 
 
Apollodorus – The work ‘Library and Epitome’ is assigned to Apollodorus. 
However, recent research has speculated that it was probably written later by an 
author called “pseudo-Apollodorus”, from 1 AD. Source: 
http://www.perseus.tufts.edu/hopper/text?doc=Perseus%3Atext%3A1999.04.0004%3Aalphabetic+letter%3DA%3Aentry+group%3D13%3Aentry%3Dapollodorus 
 
Aristotle – The collection ‘Ethics’ contains various treatises composed most 
likely between 360 BC and 330 BC 75. 
 
Epicurus – Due to a lack of information concerning the dating of ‘Doctrines’ and 
‘Letter to Menoeceus’, these works were dated using the middle of the author’s 
lifespan. Source: https://plato.stanford.edu/entries/epicurus/. 
 
Lysias – The various discourses/orations were delivered during the author’s 
lifetime, presumably between 403 and 380 BC 76. 
 
 
Porphyry – The books ‘Life of Plotinus’, ‘Against the Christians’, and ‘On 
abstinence from animal food’ were dated exactly, while the other books were 
dated using the criterion of the middle of the author’s lifespan. Source: 
http://classics.oxfordre.com/view/10.1093/acrefore/9780199381135.001.0001/acrefore-9780199381135-e-5259 
 
Thucydides – Since the author is estimated to have died circa 400 BC, and since 
there is evidence that the “History of the Peloponnesian War” continued to be 
modified after the end of the war in 404 BC, the date assigned to this book in the 
revised manuscript was 400 BC. “Stories” seems to be a compilation of various 
narratives written at different moments, so we assigned the middle of the 
author’s lifespan as the date of composition: 430 BC 77. 
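
The "middle of the author's lifespan" criterion used for several of the 
entries above amounts to a simple midpoint calculation. The sketch below is 
only an illustration with hypothetical function names. The death year of circa 
400 BC for Thucydides comes from the text, whereas the birth year of circa 460 
BC is an assumption chosen to be consistent with the reported date of 430 BC. 

```python
def mid_lifespan_year(birth_year, death_year):
    """Midpoint of a lifespan, with BC years given as negative numbers.

    No correction is made for the missing year zero, so the result is only
    approximate for lifespans that cross from BC into the following era.
    """
    if death_year < birth_year:
        raise ValueError("death year must not precede birth year")
    return (birth_year + death_year) / 2


def label_year(year):
    """Format a signed year, labelling negative values as BC."""
    rounded = round(year)
    return f"{abs(rounded)} BC" if rounded < 0 else str(rounded)


# Death circa 400 BC is from the text; birth circa 460 BC is an ASSUMPTION.
thucydides_date = mid_lifespan_year(-460, -400)
thucydides_label = label_year(thucydides_date)
```

With those inputs the midpoint falls on 430 BC, matching the date assigned to 
“Stories” above. 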
 
Persian 
Persian traditional texts were collected from scriptures like the ‘Zend Avesta’, 
attributed mainly to the prophet Zoroaster, but written during the time of the 
Sassanid Empire, around 530 AD 78. Other Persian works analyzed in this study 
include the Denkard and Pahlavi Scriptures, also dating from the 1st 
millennium AD. 
 
Medieval, Modern and Contemporary 
Since many works from those periods were written and popularized owing to the 
advent of the printing press, dating became more accurate. Most dates were 
directly extracted from the editorial information of books. However, some works 
like ‘One Thousand and One Nights’ (unknown author) and ‘Physics of Healing’ (by 
Avicenna) had their dates calculated based on periods of probable composition. 
  
 
Supplementary References 
 
1 Bouckaert, R. et al. Mapping the origins and expansion of the Indo-
European language family. Science 337, 957-960, 
doi:10.1126/science.1219669 (2012). 
2 Haak, W. et al. Massive migration from the steppe was a source for Indo-
European languages in Europe. Nature 522, 207-211, 
doi:10.1038/nature14317 (2015). 
3 Gray, R. D. & Atkinson, Q. D. Language-tree divergence times support the 
Anatolian theory of Indo-European origin. Nature 426, 435-439, 
doi:10.1038/nature02029 (2003). 
4 Greenberg, J. H. Studies in African linguistic classification.  (Compass Pub. 
Co., 1955). 
5 Broome, R. Aboriginal Australians : a history since 1788. Fully rev. 4th edn,  
(Allen & Unwin, 2010). 
6 Binford, L. R. Constructing frames of reference : an analytical method for 
archaeological theory building using hunter-gatherer and environmental 
data sets.  (University of California Press, 2001). 
7 Daniels, P. T. & Bright, W. The world's writing systems.  (Oxford University 
Press, 1996). 
8 Trigger, B. in Ancient Egypt: A Social History   (eds Bruce Trigger, Barry 
Kemp, David O'Connor, & Alan Lloyd)  1-69 (Cambridge University Press, 
2001). 
9 O'Connor, D. in Ancient Egypt: A Social History   (eds Bruce Trigger, Barry 
Kemp, David O'Connor, & Alan Lloyd)  183-278 (Cambridge University 
Press, 2001). 
10 Breasted, J. H. Ancient Time or a History of the Early World. Vol. 1 
(Kessinger Publishing, 2003). 
11 Schwartzberg, J. E. A Historical Atlas of South Asia.  (University of Oxford 
Press, 1992). 
12 Jaspers, K. The origin and goal of history.  (Yale University Press, 1953). 
13 Eisenstadt, S. N. The Origins and diversity of axial age civilizations.  (State 
University of New York Press, 1986). 
14 Voegelin, E. Order and history: In search of order. Vol. 5 (Louisiana State 
University Press, 1956). 
15 Armstrong, K. The great transformation : the beginning of our religious 
traditions. 1st edn,  (Knopf, 2006). 
16 Árnason, J. P., Eisenstadt, S. N. & Wittrock, B. Axial civilizations and 
world history.  (Brill, 2005). 
17 Baumard, N., Hyafil, A., Morris, I. & Boyer, P. Increased affluence explains 
the emergence of ascetic wisdoms and moralizing religions. Curr Biol 25, 
10-15, doi:10.1016/j.cub.2014.10.063 (2015). 
18 Baumard, N., Hyafil, A. & Boyer, P. What changed during the axial age: 
Cognitive styles or reward systems? Commun Integr Biol 8, e1046657, 
doi:10.1080/19420889.2015.1046657 (2015). 
19 Ong, W. J. & Hartley, J. Orality and literacy : the technologizing of the word. 
30th anniversary edn,  (Routledge, 2012). 
20 Hall, J. M. A History of the Archaic Greek World.  (Wiley-Blackwell, 2007). 
 
21 Graeber, D. Debt : the first 5,000 years. Updated and expanded edition. 
edn,  (Melville House, 2014). 
22 Harari, Y. N. Sapiens : a brief history of humankind. First U.S. edition. edn,  
(Harper, 2015). 
23 deMenocal, P. B. Cultural responses to climate change during the late 
Holocene. Science 292, 667-673, doi:10.1126/science.1059188 (2001). 
24 Kemp, B. in Ancient Egypt: A Social History   (eds Bruce Trigger, Barry 
Kemp, David O'Connor, & Alan Lloyd)  71-182 (Cambridge University 
Press, 2001). 
25 Pruß, A. in Atlas of Preclassical Upper Mesopotamia Vol. Subartu 13  (eds 
Stefano Anastasio, Marc Lebeau, & Martin Sauvage)  7-21 (Brepols, 2004). 
26 Arnold, B. T. Who Were the Babylonians?  (Brill Publishers, 2005). 
27 Parpola, A. in The Bronze Age and Early Iron Age Peoples of Eastern and 
Central Asia  (Institute for the Study of Man, 1998). 
28 Moorjani, P. et al. Genetic evidence for recent population mixture in India. 
Am J Hum Genet 93, 422-438, doi:10.1016/j.ajhg.2013.07.006 (2013). 
29 Singh, U. A History of Ancient and Early Mediaeval India: From the Stone 
Age to the 12th Century.  (Pearson Education India, 2008). 
30 Kulke, H. & Rothermund, D. A History of India.  (Routledge, 1998). 
31 Diamond, J. M. Collapse : how societies choose to fail or succeed.  (Penguin 
Books, 2011). 
32 Drews, R. The End of The Bronze Age: Changes in Warfare and the 
Catastrophe ca. 1200 B.C.,  (Princeton University Press, 1993). 
33 Kochhar, R. The Vedic People: Their History and Geography.  (Sangam 
Books, 2000). 
34 Bryant, E. & Patton, L. L. The Indo-Aryan Controversy: Evidence and 
Inference in Indian History.  (Routledge, 2005). 
35 Staal, F. Discovering the Vedas: Origins, Mantras, Rituals, Insights.  
(Penguin, 2008). 
36 Wright, R. P. The ancient Indus: Urbanism, economy, and society 
(Cambridge Univ Press, 2010). 
37 Torrey, E. F. Schizophrenia and civilization.  (J. Aronson, 1980). 
38 Seligman, C. G. Temperament, conflict and psychosis in a stone-age 
population. Br J Med Psychol 9, 187–202 (1929). 
39 Lopes, C. Ethnographische Betrachtungen über die Schizophrenie. 
Zeitschrift für die gesamte Neurologie und Psychiatrie 142, 706–711 
(1932). 
40 Faris, R. E. L. Some observations on the incidence of schizophrenia in 
primitive societies. J Abnorm Soc Psychol 29, 30 (1934). 
41 Demerath, N. J. Schizophrenia among primitives. Am J Psychiatry 98, 703–
707 (1942). 
42 Noll, R. Shamanism and schizophrenia: a state-specific approach to the 
“schizophrenia metaphor” of shamanic states. Am Ethnol 10, 443–459 
(1983). 
43 Lucas, R. H. & Barrett, R. J. Interpreting culture and psychopathology: 
primitivist themes in cross-cultural debate. Cult Med Psychiatry 3, 287–
326 (1995). 
44 Sullivan, R. J., Allen, J. S. & Nero, K. L. Schizophrenia in Palau. Curr 
Anthropol 48, 189–213 (2007). 
 
45 Hunter, E. et al. Psychosis and its correlates in a remote indigenous 
population. Australas Psychiatry 19, 434–438 (2011). 
46 Black, E. B. et al. A systematic review: Identifying the prevalence rates of 
psychiatric disorder in Australia’s Indigenous populations. Aust New Zeal J 
Psychiatry 49, 412–429 (2015). 
47 Robin, R. W., Gottesman, I. I., Albaugh, B. & Goldman, D. Schizophrenia and 
psychotic symptoms in families of two American Indian tribes. BMC 
Psychiatry 7, 30 (2007). 
48 Caqueo-Urízar, A., Urzúa-M, A., Miranda-Castillo, C. & Irarrázaval, M. 
Adherencia a la medicación antipsicótica en pacientes indígenas con 
esquizofrenia. Salud Ment 39, 303–310 (2016). 
49 Caqueo-Urízar, A., Gutiérrez-Maldonado, J., Ferrer-García, M. & 
Darrigrande-Molina, P. Sobrecarga en cuidadores aymaras de pacientes 
con esquizofrenia. Rev Psiquiatr Salud Ment 5, 191–196 (2012). 
50 Kohn, R. & Rodríguez, J. J. in Epidemiología de los trastornos mentales en 
América Latina y el Caribe   (ed Kohn R Rodríguez JJ, Aguilar-Gaxiola S)  
223–233 (Organización Panamericana de la Salud, 2009). 
51 Calvo de Padilla, M. et al. Temperament traits associated with risk of 
schizophrenia in an indigenous population of Argentina. Schizophr Res 83, 
299-302, doi:10.1016/j.schres.2005.12.848 (2006). 
52 Kirmayer, L. J. & Ban, L. Cultural Psychiatry: Research Strategies and 
Future Directions. Advances in Psychosomatic Medicine 33, 97–114 
(2013). 
53 O’Nell, T. D. Psychiatric investigations among American Indians and 
Alaska natives: a critical review. Cult Med Psychiatry 13, 51–87 (1989). 
54 Fabrega, H. On the significance of an anthropological approach to 
schizophrenia. Psychiatry 52, 45–65 (1989). 
55 Jaaskelainen, E. et al. A systematic review and meta-analysis of recovery 
in schizophrenia. Schizophr Bull 39, 1296-1306, 
doi:10.1093/schbul/sbs130 (2013). 
56 Hopper, K. & Wanderling, J. Revisiting the developed versus developing 
country distinction in course and outcome in schizophrenia: results from 
ISoS, the WHO collaborative followup project. International Study of 
Schizophrenia. Schizophr Bull 26, 835-846 (2000). 
57 Bresnahan, M., Menezes, P., Varma, V. & Susser, E. in The Epidemiology of 
Schizophrenia   (ed Peter B. Jones Robin M. Murray, Ezra Susser, Jim van 
Os, Mary Cannon)  18–33 (Cambridge University Press, 2003). 
58 Messias, E. L., Chen, C.-Y. & Eaton, W. W. Epidemiology of Schizophrenia: 
Review of Findings and Myths. Psychiatr Clin North Am 30, 323–338 
(2007). 
59 Padhy, S. K., Sarkar, S., Davuluri, T. & Patra, B. N. Urban living and 
psychosis – An overview. Asian J Psychiatr, 17–22 (2014). 
60 Vassos, E., Pedersen, C. B., Murray, R. M., Collier, D. A. & Lewis, C. M. Meta-
analysis of the association of urbanicity with schizophrenia. Schizophr 
Bull 38, 1118-1123, doi:10.1093/schbul/sbs096 (2012). 
61 Pedersen, C. B. & Mortensen, P. B. Why factors rooted in the family may 
solely explain the urban-rural differences in schizophrenia risk estimates. 
Epidemiol Psychiatr Sci 15, 247–251 (2006). 
 
62 Patel, V., Cohen, A., Thara, R. & Gureje, O. Is the outcome of schizophrenia 
really better in developing countries? Rev Bras Psiquiatr 28, 149–152 
(2006). 
63 Dein, S. & Littlewood, R. Religion and psychosis: A common evolutionary 
trajectory? Transcult Psychiatry 48, 318–335 (2011). 
64 Littlewood, R. & Dein, S. Did Christianity lead to schizophrenia? Psychosis, 
psychology and self reference. Transcult Psychiatry 50, 397–420 (2013). 
65 Jaynes, J. The origin of consciousness in the breakdown of the bicameral 
mind.  (Houghton Mifflin, 1976). 
66 Surrallés, A. Au Coeur Du Sens: Perception, Affectivité, Action Chez Les 
Candoshi.  (Éditions de la Maison des sciences de l’homme, 2003). 
67 Seeger, A., DaMatta, R. & de Castro, E. V. A Construção da pessoa nas 
sociedades indígenas brasileiras. Boletim do Museu Nacional 32 (1979). 
68 Vilaça, A. Making Kin out of Others in Amazonia. Journal of the Royal  
Anthropological Institute 8, 347–365 (2002). 
69 Lévi-Strauss, C. The savage mind (La pensée sauvage).  (Weidenfeld & 
Nicolson, 1966). 
70 Kramer, S. N. The Oldest Literary Catalogue: A Sumerian List of Literary 
Compositions Compiled about 2000 B.C. Bulletin of the American Schools 
of Oriental Research 88, 10-19 (1942). 
71 Biggs, R. D. Inscriptions from Tell Abu Salabikh. Vol. 99 (University of 
Chicago Press, 1974). 
72 Rocher, L. The Purāṇas. Vol. 2 (Otto Harrassowitz Verlag, 1986). 
73 authors, S. The New Oxford Annotated Bible: New Revised Standard Version 
with the Apocrypha. 4th edn,  (Oxford University Press, 2010). 
74 Fernandes, M. The animal fable in modern literature.  (B.R. Pub. Corp., 
1996). 
75 Kenny, A. The Aristotelian Ethics: A Study of the Relationship Between the 
Eudemian and the Nicomachean Ethics of Aristotle. 2nd edn,  (Clarendon 
Press). 
76 Freeman, K. The Murder of Herodes: And Other Trials from the Athenian 
Law Courts.  (Hackett Publishing Company, 1963). 
77 Zagorin, P. Thucydides: An Introduction for the Common Reader.  
(Princeton University Press, 2005). 
78 Snodgrass, M. E. Encyclopedia of the literature of empire.  (Facts On File, 
2010). 
 
Chapter 6 - Lucid dreams and psychosis: 
In this chapter, the focus shifts from the more applied perspective of speech 
analysis to questions related to the basic science of dreams, and to an assessment 
of the overlap between dreaming and psychosis. Since criticism of reality is 
reduced during psychosis and enhanced during lucid dreams, in this published paper 
we studied lucid dream features in a psychotic sample compared to well-matched 
controls, as well as speech features related to dream memories in psychotic 
subjects who were able to be lucid while dreaming.  
 
fpsyg-07-00294 March 7, 2016 Time: 16:10 # 1
ORIGINAL RESEARCH
published: 09 March 2016
doi: 10.3389/fpsyg.2016.00294

Psychosis and the Control of Lucid Dreaming
Natália B. Mota1*, Adara Resende1, Sérgio A. Mota-Rolim1,2, Mauro Copelli3 and
Sidarta Ribeiro1*
1 Brain Institute, Federal University of Rio Grande do Norte, Natal, Brazil, 2 Onofre Lopes University Hospital, Federal
University of Rio Grande do Norte, Natal, Brazil, 3 Physics Department, Federal University of Pernambuco, Recife, Brazil

Edited by: Sue Llewellyn, University of Manchester, UK
Reviewed by: Manuel Schabus, University of Salzburg, Austria; Martin Dresler, Radboud University Medical Centre,
Netherlands
*Correspondence: Natália B. Mota, nataliamota@neuro.ufrn.br; Sidarta Ribeiro, sidartaribeiro@neuro.ufrn.br
Specialty section: This article was submitted to Psychopathology, a section of the journal Frontiers in Psychology
Received: 01 December 2015
Accepted: 16 February 2016
Published: 09 March 2016
Citation: Mota NB, Resende A, Mota-Rolim SA, Copelli M and Ribeiro S (2016) Psychosis and the Control of Lucid
Dreaming. Front. Psychol. 7:294. doi: 10.3389/fpsyg.2016.00294

Dreaming and psychosis share important features, such as intrinsic sense perceptions
independent of external stimulation, and a general lack of criticism that is associated
with reduced frontal cerebral activity. Awareness of dreaming while a dream is happening
defines lucid dreaming (LD), a state in which the prefrontal cortex is more active than
during regular dreaming. For this reason, LD has been proposed to be potentially
therapeutic for psychotic patients. According to this view, psychotic patients would
be expected to report LD less frequently, and with lower control ability, than healthy
subjects. Furthermore, psychotic patients able to experience LD should present milder
psychiatric symptoms, in comparison with psychotic patients unable to experience LD.
To test these hypotheses, we investigated LD features (occurrence, control abilities,
frequency, and affective valence) and psychiatric symptoms (measured by PANSS, BPRS,
and automated speech analysis) in 45 subjects with psychotic symptoms [25 with
Schizophrenia (S) and 20 with Bipolar Disorder (B) diagnosis] versus 28 non-psychotic
control (C) subjects. Psychotic lucid dreamers reported control of their dreams more
frequently (67% of S and 73% of B) than non-psychotic lucid dreamers (only 23%
of C; S > C with p = 0.0283, B > C with p = 0.0150). Importantly, there was no
clinical advantage for lucid dreamers among psychotic patients, even for the diagnostic
question specifically related to lack of judgment and insight. Despite some limitations
(e.g., transversal design, large variation of medications), these preliminary results
support the notion that LD is associated with psychosis, but falsify the hypotheses that
we set out to test. A possible explanation is that psychosis enhances the experience of
internal reality in detriment of external reality, and therefore lucid dreamers with psychotic
symptoms would be more able to control their internal reality than non-psychotic lucid
dreamers. Training dream lucidity is likely to produce safe psychological strengthening
in a non-psychotic population, but in a psychotic population LD practice may further
empower deliria and hallucinations, giving internal reality the appearance of external
reality.
Keywords: psychosis, schizophrenia, bipolar disorder, lucid dreams, dreaming
Frontiers in Psychology | www.frontiersin.org 1 March 2016 | Volume 7 | Article 294
Mota et al. Psychosis and Lucid Dreaming
INTRODUCTION
Dreaming and psychosis share important phenomenological and
neurophysiological features (Gottesmann, 2005; Manoach and
Stickgold, 2009; Mota-Rolim and Araujo, 2013; Dresler et al.,
2014). In terms of subjective experience, both phenomena present
intrinsic sense perceptions independent of external stimulation,
associated with a lack of criticism (or rational judgment)
regarding the bizarreness of these experiences (Cicogna and
Bosinelli, 2001). The latter feature has been hypothesized to stem
from the decrease in frontal cerebral activity that characterizes
both psychosis and rapid-eye-movement (REM) sleep (Dresler
et al., 2014; Voss et al., 2014). Yet, executive functions are not
necessarily impaired during dreaming. It is possible to be aware
of dreaming while a dream is happening, with partial or total
control of the dream contents by the dreamer, a phenomenon
called lucid dreaming (LD; Laberge et al., 1986; Mota-Rolim
and Araujo, 2013; Stumbrys et al., 2013; Voss et al., 2014).
Recent studies using functional magnetic resonance imaging
(Dresler et al., 2012) and electroencephalography (Voss et al.,
2009) indicate that LD is related to increased activity in the
prefrontal cortex (Voss et al., 2009, 2014; Mota-Rolim et al., 2010;
Neider et al., 2011; Dresler et al., 2012; Stumbrys et al., 2013). In
agreement with this notion, transcranial electrical stimulation of
the prefrontal cortex can induce dream awareness during REM
sleep (Stumbrys et al., 2013; Voss et al., 2014). Frontal cortex
activity correlates with self-consciousness, working memory,
and attention (Postle, 2006). Therefore, an increase in frontal
activity should contribute to lucidity during dreaming (Hobson,
2009; Voss et al., 2009), while a decrease in prefrontal activity
should explain the lack of rational judgment in both psychosis
and non-lucid dreams (Anticevic et al., 2012; Dresler et al.,
2014).
Theories about human consciousness propose that the LD
phenomenon is possible due to the linguistic ability of our
species, which permits the semantic access of episodic memories
of sensory origin (Edelman, 2003; Voss et al., 2013). By
accessing episodic memories, the flow of thoughts can be
reported, and the subjective ability of “mind wandering” can
be shared with others. Similarly, dream mentation can be
understood as spontaneous thinking, not associated with any
external task (Fox et al., 2013). An important set of systems
involved in this process is the default mode network (DMN),
a functional circuit comprising brain areas activated during
resting states, and suppressed during cognitive tasks (Anticevic
et al., 2012; Fox et al., 2013). Some core DMN areas are also
engaged during REM sleep, such as the medial prefrontal
cortex and multiple temporal structures (parahippocampal,
hippocampal, and entorhinal cortices; Fox et al., 2013). In
patients with schizophrenia, there is an impairment in DMN
suppression during attention tasks that may contribute to the
cognitive deficits found in these subjects (Anticevic et al.,
2012).
The dream experience is also peculiar for psychotic patients.
Dream report analysis reveals a higher frequency of nightmares
among schizophrenic patients than in healthy subjects (Okorome
Mume, 2009; Michels et al., 2014), with more hostile contents,
higher proportion of strangers among the dream characters, and
a lower frequency of dreams in which the dreamer is the main
character (Skancke et al., 2014). We have recently uncovered
evidence of language impairments in the dream reports of
schizophrenic subjects, who produce substantially less complex
narratives than non-schizophrenic subjects (Mota et al., 2014).
Using a graph-theoretical approach to represent and quantify
word trajectories, we found that the recurrence, connectivity
and global complexity of dream reports characterize the distinct
patterns of thought disorder that correspond to schizophrenia
and bipolar disorder type I, two different diseases associated
with psychosis (Mota et al., 2012, 2014). Interestingly, graph
connectivity attributes were strongly correlated with negative
and cognitive symptoms among psychotic patients (Mota et al.,
2014). In other words, psychosis-related cognitive deficits are
accompanied by impairment in the ability to share a flow of
thoughts when remembering a dream, leading to less connected
reports than those produced by healthy subjects. Notably, these
differences were more prominent for dream reports than for
waking reports (Mota et al., 2014). A likely explanation is
the hypo-function of the prefrontal cortex in psychosis, which
resembles the reduction of prefrontal cortex activity during
REM sleep in healthy subjects, in comparison to the levels
found in waking. Both in psychosis and regular dreaming,
prefrontal cortex hypo-function seems to be causally related to
the decreased criticism typical of these states (Dresler et al., 2014;
Laruelle, 2014). Since LD displays increased frontal activity in
comparison with non-LD (Mota-Rolim et al., 2010; Stumbrys
et al., 2013; Voss et al., 2014), LD has been proposed as potential
therapy for psychotic patients (Dresler et al., 2014; Voss et al.,
2014).
Despite the large amount of evidence linking sleep and
dreaming to psychosis (Gottesmann, 2005; Manoach and
Stickgold, 2009; Mota-Rolim and Araujo, 2013; Dresler et al.,
2014), there is a lack of quantitative information regarding
dreaming in psychotic patients. In particular, there are simply no
studies of LD in psychotic patients. To address these gaps, we set
out to quantitatively characterize LD in a psychotic sample, using
graph-theoretical tools and standard psychiatric instruments to
test three hypotheses: (1) Psychotic patients report LD less
frequently than non-psychotic subjects; (2) Psychotic patients
report LD control less frequently than non-psychotic subjects;
and (3) Psychotic patients who experience LD present attenuated
psychiatric symptoms and present less thought disorder, in
comparison with psychotic patients who do not experience LD.
MATERIALS AND METHODS
Participants
Seventy-three Brazilian individuals (43 males and 22 females,
mean age 35.59 ± 10.92 years) took part in the study: 28 subjects without
psychotic symptoms (control group, C), 25 patients diagnosed
with schizophrenia (S), and 20 patients diagnosed with bipolar
disorder type I (B), for a total of 45 medicated patients with
psychotic symptoms (Table 1). The study was approved by
the UFRN Research Ethics Committee (permit#102/06-98244),
Frontiers in Psychology | www.frontiersin.org 2 March 2016 | Volume 7 | Article 294
214
fpsyg-07-00294 March 7, 2016 Time: 16:10 # 3
Mota et al. Psychosis and Lucid Dreaming
TABLE 1 | Socio-demographic and psychiatric information about the groups investigated.
Variable | Category | Psychotic: Schizophrenia (S) | Psychotic: Bipolar (B) | Control (C) | P-value S × B | P-value S × C | P-value B × C
Demographic characteristics
Age | Years | 34 ± 9.55 | 39.05 ± 11.79 | 34.79 ± 11.25 | 0.1342 | 0.8369 | 0.2910
Sex | Male | 84% | 65% | 61% | 0.1406 | 0.0603 | 0.7624
Sex | Female | 16% | 35% | 39% | | |
Education | Years | 6.92 ± 4.02 | 9.35 ± 4.20 | 8.79 ± 3.94 | 0.0592 | 0.0867 | 0.7232
Marital status | Married | 24% | 50% | 60% | 0.0702 | 0.0071∗∗ | 0.4607
Marital status | Previously Married | 20% | 30% | 8% | 0.4380 | 0.1676 | 0.0362∗
Marital status | Never Married | 56% | 20% | 32% | 0.0143∗ | 0.0802 | 0.3507
Psychiatric assessment
Medication | Typical Antipsychotic | 72% | 65% | 0 | 0.6143 | 0.0000∗∗ | 0.0000∗∗
Medication | Atypical Antipsychotic | 36% | 20% | 0 | 0.2393 | 0.0027∗∗ | 0.0350∗
Medication | Mood Stabilizer | 12% | 55% | 5% | 0.0020∗∗ | 0.4123 | 0.0006∗∗
Medication | Benzodiazepine | 28% | 30% | 15% | 0.8831 | 0.2973 | 0.2560
Medication | Antidepressants | 0% | 20% | 20% | 0.0191∗ | 0.0191∗ | 1
Age of onset | Years | 22.84 ± 8.27 | 27.1 ± 9.73 | 36.8 ± 8.9 | 0.1013 | 0.0101∗ | 0.0569
Disease duration | Months | 17.32 ± 12.10 | 12.45 ± 9.98 | 1.24 ± 1.57 | 0.2162 | 0.0011∗∗ | 0.0042∗∗
Age (years), years of education, frequency of sex, marital status, and medication for the groups studied. Mean and standard deviation are indicated. All subjects were
Brazilian. Control subjects were non-psychotic individuals with depression (N = 5), generalized anxiety disorder (N = 2), one past episode of post-traumatic stress disorder
(N = 1), various symptoms of mood/anxiety disorder without reaching diagnostic criteria (N = 11), plus nine healthy individuals. The groups were compared in pairs using
the chi-square test for sex, marital status, and medication, and the Wilcoxon Ranksum test for age, years of education, age of onset, and disease duration. P-values are
described for each pair comparison (∗p < 0.05 and ∗∗p < 0.01).
and the data were collected by convenience sampling at the
“Onofre Lopes” and “João Machado” Hospitals. The control
group was recruited at the same clinical institutions among
subjects presenting anxiety or depression symptoms but without
a psychiatric diagnosis (N = 11), among psychiatric patients
without psychotic symptoms [individuals with depression
(N = 5), generalized anxiety disorder (N = 2), one past
episode of post-traumatic stress disorder (N = 1)] and healthy
individuals accompanying patients (N = 9). All individuals
gave written informed consent. During the psychiatric interview,
patients were examined for major changes in state and
level of consciousness (e.g., drowsiness, torpor), for signs of
autopsychic and allopsychic disorientation (e.g., inability to
remember name, age, spatial localization), and for signs of
reduced mnemonic and cognitive capacity. All psychotic subjects
were medicated and out of the acute psychotic phase at the
onset of the study, so typically they were in good capacity
to provide informed consent. When signs of disorientation or
reduced mnemonic capacity were detected, the experimenter
also obtained written informed consent on their behalf from
their legal guardians (next of kin). The groups differed in
marital status (more never-married subjects in S than in
B, more previously married subjects in B than in C, and more married
subjects in C than in S, which could be explained by
social behavior impairments in the psychotic groups), in medication
(more antipsychotics in the psychotic groups, more mood stabilizers
in B, and fewer antidepressants in S, reflecting the
clinical symptoms treated), and in age of onset and disease duration
(earlier onset in S than in C, and longer
duration in the psychotic groups, as expected for the different
diseases). These differences mostly reflect the epidemiological
features of a psychotic population within a regular clinical
setting.
Instruments
Diagnosis was obtained with SCID DSM IV (First et al., 1990),
followed by application of the psychometric scales PANSS (Kay
et al., 1987) and BPRS (Bech et al., 1986). We used all 48
symptoms measured by the two scales (30 symptoms measured by
PANSS, each graded in severity from 1 to 7; and 18 symptoms
measured by BPRS, each graded in severity from 0 to 3). Next, a
dream report was requested. Specifically, we asked the subject to
report the most recent dream they could remember, followed by
questions about regular dreaming (translated from Portuguese:
“Do your dreams usually resemble your daily life?,” “Do your
dreams usually resemble your psychotic symptoms?,” and “Do your
dreams change following changes in medication?”), and also about
LD (“Can you be aware of dreaming during sleep?,” “Can you
control your dream when this happens?,” “How frequently does
this happen: Once in lifetime, more than once but less than 10
times, more than 10 times but less than 100 times, or more than
100 times?,” “How do you feel when you wake up from these
dreams: very good, good, bad or very bad?”). We classified as
lucid dreamers those individuals who claimed to have been aware of dreaming
during a dream at least once in their lifetime. All the verbal reports
were digitally recorded and transcribed. Analysis: The chi-square
test was used to establish statistically significant differences
between groups (S, B, and C) on questions about LD, and between
TABLE 2 | Speech graph attributes (SGA): detailed description of each speech graph attribute measured from dream reports.
N: Number of nodes.
E: Number of edges.
RE (Repeated Edges): sum of all edges linking the same pair of nodes.
PE (Parallel Edges): sum of all parallel edges linking the same pair of nodes given that the source node of an edge is the target node of the parallel edge.
L1 (Loop of one node): sum of all edges linking a node with itself, calculated as the trace of the adjacency matrix.
L2 (Loop of two nodes): sum of all loops containing two nodes, calculated by the trace of the squared adjacency matrix divided by two.
L3 (Loop of three nodes): sum of all loops containing three nodes (triangles), calculated by the trace of the cubed adjacency matrix divided by three.
LCC (Largest Connected Component): number of nodes in the maximal subgraph in which all pairs of nodes are reachable from one another in the
underlying undirected subgraph.
LSC (Largest Strongly Connected Component): number of nodes in the maximal subgraph in which all pairs of nodes are reachable from one another in the
directed subgraph (node a reaches node b, and b reaches a).
ATD (Average Total Degree): given a node n, the Total Degree is the sum of “in and out” edges. Average Total Degree is the sum of Total Degree of all nodes
divided by the number of nodes.
Density: number of edges divided by the number of possible edges, D = 2E/[N(N – 1)], where E is the number of edges and N is the number of nodes.
Diameter: length of the longest shortest path between the node pairs of a network.
Average Shortest Path (ASP): average length of the shortest path between pairs of nodes of a network.
CC (Average Clustering Coefficient): given a node n, the Clustering Coefficient Map (CCMap) is the set of fractions of all n neighbors that are also neighbors
of each other. Average CC is the sum of the Clustering Coefficients of all nodes in the CCMap divided by number of elements in the CCMap.
lucid dreamers and non-lucid dreamers (within groups S and B)
on questions about regular dreams.
Graph Analysis
Thought disorder was investigated by representing the verbal
reports of experimental and control subjects as directed graphs.
These were computed by the custom-made free software Speech
Graphs (http://www.neuro.ufrn.br/softwares/speechgraphs),
which allows the calculation of several attributes related to the
recurrence, connectivity, and global complexity of graphs (Mota
et al., 2014). This methodology is free of subjective bias, since
it does not take into account any personal evaluation of the
semantic content of the verbal reports. Rather, it mathematically
analyzes various structural aspects of the reports. We have
previously validated this methodology for the diagnosis of
psychosis (Mota et al., 2012, 2014) and dementia (Bertola et al.,
2014). The rationale for combining the use of psychometric
scales and speech graph analysis was to quantitatively analyze
the psychiatric symptoms, so as to compare groups of lucid
and non-lucid psychotic dreamers and better characterize their
mental functioning. A graph is a mathematical representation
of a network with nodes linked by edges, formally defined as
G = (N, E), with the set of nodes N = {w1, w2, . . ., wn} and the
set of edges E = {(wi,wj)} (Mota et al., 2012, 2014; Bertola et al.,
2014). A speech graph represents the sequential relationship of
spoken words in a verbal report, with each word represented as
a node, and the sequence between successive words represented
as a directed edge (Mota et al., 2012, 2014; Bertola et al., 2014).
A total of 14 speech graph attributes (SGA) were calculated
for each dream report, comprising general graph attributes
(N, total of nodes; E, total of edges), recurrence (PE, parallel
edges; RE, repeated edges; L1, L2, and L3, loops of one, two, and
three nodes), connectivity (LCC, largest connected component
and LSC, largest strongly connected component) and global
attributes (ATD, average total degree; Density, Diameter; ASP,
average shortest path; CC, clustering coefficient; Table 2).
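To make the definitions above and in Table 2 concrete, the following Python sketch builds a speech graph from a short text and computes a few of the attributes. It is only an illustration and not the Speech Graphs software: the whitespace tokenization, the counting of every word transition as one edge, the binary adjacency matrix used for the loop counts, and the function name `speech_graph_attributes` are all assumptions made for this example.

```python
def speech_graph_attributes(text):
    """Compute a few speech graph attributes for a transcribed report."""
    words = text.lower().split()          # assumption: whitespace tokenization
    nodes = sorted(set(words))            # each distinct word is a node
    edges = list(zip(words, words[1:]))   # each word transition is a directed edge
    n, e = len(nodes), len(edges)
    index = {word: i for i, word in enumerate(nodes)}

    # Assumption: binary adjacency matrix (a repeated transition is marked once).
    adj = [[0] * n for _ in range(n)]
    for source, target in edges:
        adj[index[source]][index[target]] = 1

    def matmul(a, b):
        return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]

    def trace(m):
        return sum(m[i][i] for i in range(n))

    adj2 = matmul(adj, adj)
    adj3 = matmul(adj2, adj)
    return {
        "N": n,                        # number of nodes
        "E": e,                        # number of edges (word transitions)
        "L1": trace(adj),              # loops of one node
        "L2": trace(adj2) / 2,         # loops of two nodes
        "L3": trace(adj3) / 3,         # loops of three nodes
        "ATD": 2 * e / n if n else 0.0,                      # average total degree
        "Density": 2 * e / (n * (n - 1)) if n > 1 else 0.0,  # as defined in Table 2
    }


attrs = speech_graph_attributes("I saw a dog and a cat and I ran")
print(attrs)
```

For this toy sentence the sketch yields 7 nodes, 9 edges, and two loops of three nodes (a-dog-and and a-cat-and), showing how word recurrence creates cycles in the graph.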
The non-parametric statistical test Wilcoxon Ranksum was
used to establish SGA differences between lucid dreamers and
non-lucid dreamers, as well as differences in the symptomatology
measured by psychometric scales and speech measures (corrected
for the number of symptoms and speech attributes by the
Bonferroni method, α = 0.0008). Effect size was measured by
Cohen’s d.
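The two quantities named above can be illustrated with a short Python sketch. It assumes that the Bonferroni correction divides α = 0.05 by the 48 symptoms plus the 14 speech graph attributes (62 comparisons, which is consistent with the reported threshold of 0.0008), and that Cohen's d uses the pooled standard deviation; the text does not state the exact variant used, so both points are assumptions. The function names are hypothetical, and the example values are the PANSS N2 summary statistics reported in the Results.

```python
import math


def bonferroni_alpha(alpha, n_tests):
    """Per-comparison significance threshold after Bonferroni correction."""
    return alpha / n_tests


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Cohen's d from summary statistics, using the pooled standard deviation (assumption)."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


# Assumption: 48 psychometric symptoms + 14 speech graph attributes = 62 comparisons.
alpha_corrected = bonferroni_alpha(0.05, 48 + 14)

# PANSS item N2 in the S group: non-lucid (n = 13) versus lucid (n = 12) dreamers.
d_n2 = cohens_d(2.54, 1.28, 13, 3.75, 1.36, 12)

print(round(alpha_corrected, 4), round(d_n2, 2))
```

Under these assumptions the corrected threshold rounds to 0.0008 and the effect size for item N2 rounds to –0.92, in line with the values reported in the text.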
RESULTS
About half of the psychotic subjects (48% of S and 55% of B)
and 46% of C reported having at least one LD in life, but we
found no statistically significant difference among the groups S
versus B (p = 0.6407), S versus C (p = 0.3138), or B versus
C (p = 0.5582; Figure 1A). Psychotic lucid dreamers reported
control of their dreams more frequently (67% of S and 73% of
B) than non-psychotic lucid dreamers (only 23% of C; S versus
C p = 0.0283, B versus C p = 0.0150; Figure 1B). There was
no statistical difference among groups concerning the number
of lifetime LD episodes (33% of S, 55% of B, and 31% of
C reported having had more than 10 LD in life; S versus B
p = 0.3053, S versus C p = 0.8908, B versus C p = 0.2391;
Figure 1C), nor for the proportion of subjects that reported
feeling good after waking up from a LD (58% of S, 91% of B,
and 77% of C; S versus B p = 0.0755, S versus C p = 0.3195,
B versus C p = 0.3596; Figure 1D). Specifically regarding lucid
dreamers in the psychotic groups, 57% of those that were unable
to control LD, and 81% of those that claimed to control LD,
reported pleasant feelings after waking from a LD (no statistical
difference between lucid dreamers who controlled their dreams and
those who did not in the S and B groups,
p = 0.2257).
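As a rough check on the group comparisons of dream control, the chi-square test can be sketched in Python from counts reconstructed out of the reported percentages. The counts below (8 of 12 lucid dreamers in S, 8 of 11 in B, and 3 of 13 in C reporting control) are our reconstruction from the percentages and group sizes, not data taken from the study files, and the use of the Pearson statistic without continuity correction is likewise an assumption. The function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For one degree of freedom, the chi-square tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Reconstructed (assumed) counts: [control, no control] among lucid dreamers.
stat_sc, p_sc = chi_square_2x2(8, 4, 3, 10)   # S versus C
stat_bc, p_bc = chi_square_2x2(8, 3, 3, 10)   # B versus C
print(round(p_sc, 4), round(p_bc, 4))
```

With these reconstructed counts the sketch gives p-values close to the reported 0.0283 (S versus C) and 0.0150 (B versus C), which suggests, but does not prove, that an uncorrected chi-square test was applied to counts of this size.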
A possible confounding factor in interpreting the higher frequency
of dream control in the psychotic groups is the use
of antipsychotic medications. Neurons in the prefrontal cortex
are among the main targets of antipsychotics, via modulation
FIGURE 1 | Characteristics of lucid dream reports in schizophrenia (S), bipolar (B), and control (C) groups. (A) Percentage of each group reporting
occurrence of lucid dreaming at least once in a lifetime. (B) Percentage of lucid dreamers reporting the ability to control their dreams: the psychotic groups reported this ability more frequently
than the control group (S vs. C: p = 0.0283, B vs. C: p = 0.0150). (C) Percentage reporting a high frequency of lucid dreams (more than 10 lucid dreams in a lifetime).
(D) Percentage reporting positive affective valence (feeling good after waking up from a lucid dream) (∗p < 0.05).
of the prefrontal cortex output to basal ganglia circuits
(Monti and Monti, 2004; Merikangas et al., 2011). First
generation antipsychotics enhance total sleep time and sleep
efficiency by controlling psychotic symptoms, but there are no
consistent results in non-psychotic subjects. Second generation
antipsychotics increase total sleep time and sleep efficiency in
both psychotic and non-psychotic subjects, with some drugs
having specific effects on sleep patterns (e.g., olanzapine increases
the amount of the N2 stage of sleep; Monti and Monti, 2004;
Cohrs, 2008). To investigate this effect in our psychotic sample,
we compared the doses of antipsychotics (chlorpromazine-
equivalent) between lucid and non-lucid dreamers. Within lucid
dreamers, we compared the antipsychotic doses administered
to those that reported to control LD to the doses administered
to those who reported not to control their dreams. Neither
comparison showed statistically significant differences (lucid
versus non-lucid dreamers p = 0.5460, and control versus non-
control p = 0.8556), thus strengthening the conclusion that the
differences between psychotic and control groups concerning the
ability to control LD are related to the psychotic state, not to the
different medications used.
Among psychotic patients, lucid dreamers reported
similarities between dreams and daily life more frequently
than non-lucid dreamers (for B: 73% of lucid dreamers and 22%
of non-lucid dreamers, p = 0.0246; for S: 94% of lucid dreamers
and 69% of non-lucid dreamers, p = 0.0596; Figure 2). Following
changes in medication, lucid dreamers were much more likely to
report changes in dream content (100% of B and 92% of S) than
non-lucid dreamers (0% of B and 8% of S; p < 0.00005 for both S and
B; Figure 2). Figure 2 also shows that there was no difference
concerning the similarity of dreams and symptoms between
lucid (55% of B and 58% of S) and non-lucid (44% of B and 38%
of S; p = 0.3204 for S and p = 0.6531 for B) dreamers.
With regard to the standard psychometric
scales and the quantitative speech analysis, we found no
difference between lucid and non-lucid dreamers in either
the S or the B group after correction for multiple comparisons
(α = 0.0008). We also failed to detect any clinical advantage for lucid
dreamers when the correction for multiple comparisons was disregarded
(α = 0.05), including for PANSS item G12, related to the
symptom “Lack of judgment and insight.” This means that the
psychotic patients that were more able to have insight during
FIGURE 2 | Characteristics of regular dream reports among psychotic
patients. (A) Within the S group, there were no significant differences
between lucid dreamers and non-lucid dreamers concerning similarities
between dream and daily experiences, but lucid dreamers reported changes
in dream contents after changes in medication more frequently than
non-lucid dreamers (p < 0.00005). (B) In the B group, lucid dreamers
reported similarities between dream and daily experiences, as well as changes
in dreams after medication changes, more frequently than non-lucid
dreamers (p = 0.0246 and p < 0.00005, respectively). Neither S nor B
showed differences between lucid and non-lucid dreamers in reports about
similarities between dreams and psychotic symptoms (∗p < 0.05).
dreaming were not more able to have insight about their own
psychotic reality than patients that were less aware during
dreaming. On the contrary, the emotional retraction symptom
measured by item N2 of the PANSS Negative Subscale (Kay
et al., 1987) was more pronounced in lucid dreamers than in non-
lucid dreamers in the S group [Figure 3 and Supplementary Table 1;
LD versus non-LD in S: p = 0.0329; mean ± SD, non-lucid
(n = 13): 2.54 ± 1.28, lucid (n = 12): 3.75 ± 1.36; Cohen’s
d: –0.92, a large effect size]. This symptom is characterized by
the lack of interest in external events, with little involvement
or affective commitment. Likewise, with regard to the structural
features of speech, only in the S group did lucid dreamers
display a significantly different SGA, namely a smaller clustering
coefficient [CC; p = 0.0171; mean ± SD, non-lucid (n = 13):
0.065 ± 0.047, lucid (n = 12): 0.030 ± 0.037; Cohen’s d: 0.83,
a large effect size] in comparison with non-lucid dreamers
(Figure 4 and Supplementary Table 2). This means that lucid
dreamers in the S group produced less complex speech graphs
when reporting a regular dream, in comparison with S subjects
that were not lucid dreamers, reflecting a less complex flow of
thought.
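To illustrate what a smaller clustering coefficient means for a report, the following Python sketch computes an average clustering coefficient along the lines of the definition in Table 2. It is a simplified reading of that definition and not the Speech Graphs implementation: it uses the underlying undirected graph, ignores self-loops, and assigns a coefficient of zero to nodes with fewer than two neighbors. All of these choices, and the function name, are assumptions.

```python
from itertools import combinations


def average_clustering(text):
    """Average clustering coefficient of the undirected word graph of a text."""
    words = text.lower().split()
    neighbors = {word: set() for word in words}
    for a, b in zip(words, words[1:]):
        if a != b:                       # assumption: self-loops are ignored
            neighbors[a].add(b)
            neighbors[b].add(a)

    coefficients = []
    for node, nbrs in neighbors.items():
        k = len(nbrs)
        if k < 2:
            coefficients.append(0.0)     # assumption: no pairs of neighbors means CC = 0
            continue
        # Count the pairs of neighbors that are themselves linked.
        linked = sum(1 for u, v in combinations(nbrs, 2) if v in neighbors[u])
        coefficients.append(linked / (k * (k - 1) / 2))
    return sum(coefficients) / len(coefficients) if coefficients else 0.0


# Toy word sequence: "a" returns after "b" and "c", closing a triangle.
print(average_clustering("a b c a d"))
```

In the toy sequence, returning to an earlier word closes a triangle and raises the coefficient; a report that rarely revisits earlier words in this way would score closer to zero.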
DISCUSSION
Altogether, the results falsified the three hypotheses that we
set out to test. First, psychotic patients did not report LD
less frequently than non-psychotic subjects. Second, among
the subjects that reported being lucid dreamers, psychotic
patients reported LD control more frequently than non-psychotic
subjects. Finally, patients who reported LD did not present
attenuated psychiatric symptoms, in comparison with patients
who did not report LD. Indeed, schizophrenia patients that
qualified as lucid dreamers showed a tendency to be more, not
less, symptomatic than non-lucid dreamers in the same group.
Therefore, although the results on the lifetime occurrence of
LD replicate prior data (Snyder and Gackenbach, 1988; Mota-
Rolim et al., 2013), we could not find support for the notion that
a psychotic sample would report LD less often than a non-psychotic
sample. There was no difference between psychotic and non-
psychotic subjects regarding the number of LD events in life. As
previously detected in a non-psychotic sample (Voss et al., 2013),
we found positive emotions to be more frequently associated with
LD in all groups, without significant differences.
In a sample of 3,427 Brazilian subjects interviewed online, 29%
of the subjects reported the ability to control LD (Mota-Rolim
et al., 2013). In the present study, only 23% of the non-psychotic
sample reported LD control, in contrast with significantly larger
numbers among psychotic subjects (67% of S and 73% of B).
This result was unexpected, considering that non-psychotic lucid
dreamers show increased control of internal reality (Blagrove
and Tucker, 1994; Blagrove and Hartnell, 2000), being more
frequently able to regulate cognition and emotion than non-lucid
dreamers (Blagrove and Hartnell, 2000). A possible explanation
is that psychosis enhances the experience of internal reality
to the detriment of external reality, and therefore lucid dreamers
with psychotic symptoms would be more able to control
their internal reality than non-psychotic lucid dreamers. If
we hypothesize that the positive symptoms of psychosis may
represent the intrusion of REM sleep mentation into waking
(Freud, 1900; Dzirasa et al., 2006; Dresler et al., 2014), and that
LD may reflect the intrusion of waking mentation into REM
sleep (Mota-Rolim and Araujo, 2013), subjects who frequently
experience both conditions may be more cognitively trained to
control their internal reality than those who rarely experience LD.
This line of reasoning is supported by the fact that lucid dreamers
with psychotic symptoms reported more similarity between
dreams and daily life than non-lucid dreamers with psychotic
symptoms. Lucid dreamers were also much more likely than non-
lucid dreamers to report changes in dream content following
changes in medication, possibly reflecting a higher awareness
of dream reality in the former. Indeed, the frequent intrusion
of REM sleep-like mentation into waking life might train
control of internal reality, and thus explain the greater control of lucid
dreams in psychotic patients. This might be particularly true for
transition phases between acutely psychotic and non-psychotic
phases. Within the dreaming/psychosis model, such transition
phases might thus be considered as “pre-lucid.” Future studies
should consider a longitudinal design, and aim to characterize the
transition between acute and non-acute psychotic phases.
FIGURE 3 | Psychometric differences between lucid dreamers and non-lucid dreamers among schizophrenia patients. (A) Boxplots showing total BPRS
of lucid dreamers and non-lucid dreamers in the S group (p = 0.5930). (B) Boxplots showing total PANSS of lucid dreamers and non-lucid dreamers in the S group
(p = 0.6434). (C) Among S subjects, lucid dreamers showed higher scores on PANSS item N2 about emotional retraction (p = 0.0329), without significant
differences for the other symptoms; no significant differences were found among B subjects (see Supplementary Table 1) (∗p < 0.05).
FIGURE 4 | Differences in speech structure when reporting a regular dream between lucid dreamers and non-lucid dreamers among schizophrenia
patients. (A) Example of a text (regular dream report) represented as a speech graph. For this plot the original text was in Portuguese and each word was translated
to English, preserving the original grammatical structure. Speech graph attributes (SGA, see Table 2) were used to characterize speech structure from dream
reports. (B) In the S group, speech graphs from dream reports of lucid dreamers showed smaller clustering coefficient (CC) than non-lucid dreamers (p = 0.0171)
(∗p < 0.05).
We found no clinical advantages of having LD with regard
to psychiatric symptomatology, to speech structure, and in
particular to criticism of reality [question G12 of PANSS (Kay
et al., 1987), Supplementary Table 1]. On the contrary, we
found that lucid dreamers in the S group tended to be more
emotionally retracted than non-lucid dreamers, which means
that they were more isolated from others. These subjects also
tended to report their regular dreams in a less clustered manner,
reflecting a decrease in the complexity of the flow of thought
when reporting a dream, a feature related to the severity of cognitive and
negative symptoms in schizophrenia (Mota et al., 2014), and
to cognitive impairment in dementia (Bertola et al., 2014).
Although these results do not reach significance after Bonferroni
correction, they have a large effect size that should not be
neglected. With a larger number of subjects per group,
these differences in symptomatology might become clearer.
Taken together, both psychometric features reveal impairment
of social behavior and thought disorganization among lucid
dreamers in the S group, which could be considered a potential
disadvantage related to clinical severity. But considering that
those lucid dreamers tend to control dream contents more
frequently, we can also interpret this result as a compensatory
attempt to enhance dream control, rather than trying the more
difficult control of reality. Do changes in dream control precede
changes in reality control, or vice versa? While the cross-sectional
design employed here cannot answer this question, future
longitudinal studies should help to disentangle these alternatives,
by synchronously collecting data on insights about dreaming and
psychotic reality, to determine the order of occurrence of changes
in these states.
Our study has other limitations that need to be considered.
First, sample sizes were relatively small, reflecting the scarcity of
individuals that experience both psychotic symptoms and LD.
The prevalence of LD (considering the definition adopted in
this study) is high in the Brazilian population (77.2%; Mota-
Rolim et al., 2013) and was not found to be low in our sample
(48% in S, 55% in B, and 46% in C), but the prevalence of
psychosis is much lower (B prevalence data from 11 countries:
0.6%; Merikangas et al., 2011, S prevalence data from 46
countries: 0.55%; McGrath et al., 2008). We also had differences
between the groups that mostly reflect general epidemiological
differences regarding marital status within psychotic populations,
but should be considered as a potential confounding factor.
In addition, the control sample (subjects without psychotic
symptoms in their lifetime) mixed individuals with and
without psychiatric symptoms, some with a psychiatric diagnosis
such as depression and others without any lifetime psychiatric
symptom, which makes this control group very heterogeneous;
future studies should investigate a control sample without any
psychiatric symptoms.
Another caveat is the fact that the research was only based
on self-reports of LD, with possible confounds of secondary
elaboration, motivation, conscious and unconscious intentions
(Freud, 1900). Ideally lucidity should be assessed by external
judges to avoid fallacious interpretations (Stumbrys et al.,
2012). Moreover, we assessed LD throughout the lifetime, but
did not investigate whether the patients experienced lucid
dreams specifically during the psychotic episode(s). This is an
important issue to be clarified in future studies, specifically when
considering symptomatology differences, such as the increase
of insight. It is possible that the patients classified as lucid
dreamers in the present study were not experiencing lucid
dreams during that period, and therefore would not show potential clinical
advantages such as increased insight.
Medication was another limitation to consider (Table 1),
since all the psychotic subjects were medicated with a variety of
different drugs, and the use of psychotropic drugs can modify
dream perception and recall (Solms, 2000; Gottesmann, 2005).
Future studies should also interview psychotic patients during
acute crises, to compare with the data collected during non-acute
states as in the present study. In principle, data sampled during
acute phases should be more informative. The symptomatology
during this transition phase (acute to non-acute phase) should
give important information regarding changes in insight of the
differences between fantasy and reality.
Furthermore, we did not control for differences in dream
recall frequency among the patients, an important methodological
issue for dream research (Schredl, 2011; Michels et al.,
2014; Skancke et al., 2014), which could perhaps explain the
differences in continuity between daily life and dreams, or
changes of dream content after change of medication. In addition,
we did not control for differences in the frequency of nightmares,
which is heightened in S patients (Okorome Mume, 2009;
Michels et al., 2014; Skancke et al., 2014), and may be related to
lucidity in pathological conditions (Rak et al., 2015). However,
nightmares are by definition associated with unpleasant feelings
after waking up, and we found a high frequency of pleasant
feelings after waking up from a lucid dream in this sample (58%
for S and 91% for B). Finally, we did not employ training or
induction techniques for LD generation (Stumbrys et al., 2012),
but rather dealt with natural recollections of spontaneous LD.
The results in trained subjects may be quite different from those
reported here. Beyond these limitations, our results suggest that
psychotic lucid dreamers, who fail the “external reality test,”
are nevertheless more able to control their internal reality during
dreaming.
To the best of our knowledge, the present study is the
first to assess LD in a clinically characterized psychotic
sample. Overall the results confirm the notion that LD
is associated with psychosis. This relationship deserves a
closer investigation, since the present data does not conform
to the hypothesis that LD control is helpful to psychotic
patients. The distinctive features of the LD experience in
our sample pose a challenge to the perspective of clinically
using LD for the treatment of psychosis (Dresler et al.,
2014; Voss et al., 2014). Also, the results point to an
intriguing relationship between dream lucidity and judgment
of reality among psychotic patients, which deserves deeper
investigation with larger samples. Training dream lucidity is
likely to produce safe psychological strengthening in a non-
psychotic population (Stumbrys et al., 2012), but in a psychotic
population LD practice may further empower deliria and
hallucinations, giving internal reality the appearance of external
reality.
AUTHOR CONTRIBUTIONS
NM and SR designed the study, collected the data, NM, AR,
SM-R, MC, and SR analyzed the data, and NB, SR, SM-R, and
MC wrote the paper.
FUNDING
This work was supported by Conselho Nacional de Desen-
volvimento Científico e Tecnológico (CNPq), grants Universal
480053/2013-8 and Research Productivity 310712/2014-9 and
306604/2012-4; Coordenação de Aperfeiçoamento de Pessoal
de Nível Superior (CAPES) – Projeto ACERTA; Fundação de
Amparo à Ciência e Tecnologia do Estado de Pernambuco
(FACEPE); FAPESP Center for Neuromathematics (grant #
2013/07699-0, S. Paulo Research Foundation FAPESP).
ACKNOWLEDGMENTS
We thank the Psychiatry Residency Program at Hospital Onofre
Lopes (UFRN) and Hospital João Machado for allowing access
to independently diagnosed patients; M. Schredl and the two
reviewers for insightful comments on the manuscript, N. da C.
Souza, N. Lemos, and A. C. Pieretti for interview transcriptions;
D. Koshiyama for bibliographic support; G. M. da Silva and J.
Cirne for IT support, and PPG/UFRN for covering publication
costs.
SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online
at: http://journal.frontiersin.org/article/10.3389/fpsyg.2016.
00294
Frontiers in Psychology | www.frontiersin.org 8 March 2016 | Volume 7 | Article 294
Mota et al. Psychosis and Lucid Dreaming
REFERENCES
Anticevic, A., Cole, M. W., Murray, J. D., Corlett, P. R., Wang, X. J.,
and Krystal, J. H. (2012). The role of default network deactivation in
cognition and disease. Trends Cogn. Sci. 16, 584–592. doi: 10.1016/j.tics.2012.
10.008
Bech, P., Kastrup, M., and Rafaelsen, O. J. (1986). Mini-compendium of rating
scales for states of anxiety depression mania schizophrenia with corresponding
DSM-III syndromes. Acta Psychiatr. Scand. Suppl. 326, 1–37.
Bertola, L., Mota, N. B., Copelli, M., Rivero, T., Diniz, B. S., Romano-Silva,
M. A., et al. (2014). Graph analysis of verbal fluency test discriminate
between patients with Alzheimer’s disease, Mild Cognitive Impairment and
normal elderly controls. Front. Aging Neurosci. 6:185. doi: 10.3389/fnagi.2014.
00185
Blagrove, M., and Hartnell, S. J. (2000). Lucid dreaming: associations with internal
locus of control, need for cognition and creativity. Pers. Individ. Differ. 28,
41–47. doi: 10.1016/S0191-8869(99)00078-1
Blagrove, M., and Tucker, M. (1994). Individual differences in locus of control
and the reporting of lucid dreaming. Pers. Individ. Differ. 16, 981–984. doi:
10.1016/0191-8869(94)90242-9
Cicogna, P. C., and Bosinelli, M. (2001). Consciousness during dreams. Conscious.
Cogn. 10, 26–41. doi: 10.1006/ccog.2000.0471
Cohrs, S. (2008). Sleep disturbances in patients with schizophrenia: impact
and effect of antipsychotics. CNS Drugs 22, 939–962. doi: 10.2165/00023210-
200822110-00004
Dresler, M., Wehrle, R., Spoormaker, V. I., Koch, S. P., Holsboer, F., Steiger, A.,
et al. (2012). Neural correlates of dream lucidity obtained from contrasting
lucid versus non-lucid REM sleep: a combined EEG/fMRI case study. Sleep 35,
1017–1020. doi: 10.5665/sleep.1974
Dresler, M., Wehrle, R., Spoormaker, V. I., Steiger, A., Holsboer, F.,
Czisch, M., et al. (2014). Neural correlates of insight in dreaming
and psychosis. Sleep Med. Rev. 20, 92–99. doi: 10.1016/j.smrv.2014.
06.004
Dzirasa, K., Ribeiro, S., Costa, R., Santos, L. M., Lin, S. C.,
Grosmark, A., et al. (2006). Dopaminergic control of sleep-wake
states. J. Neurosci. 26, 10577–10589. doi: 10.1523/JNEUROSCI.1767-
06.2006
Edelman, G. M. (2003). Naturalizing consciousness: a theoretical framework.
Proc. Natl. Acad. Sci. U.S.A. 100, 5520–5524. doi: 10.1073/pnas.0931
349100
First, M. H., Spitzer, R. L., Gibbon, M., and Williams, J. (1990). Structured Clinical
Interview for DSM-IV Axis I Disorders – Research Version, Patient Edition
(SCID-I/P). New York, NY: New York State Psychiatric Institute.
Fox, K. C., Nijeboer, S., Solomonova, E., Domhoff, G. W., and Christoff, K. (2013).
Dreaming as mind wandering: evidence from functional neuroimaging
and first-person content reports. Front. Hum. Neurosci. 7:412. doi:
10.3389/fnhum.2013.00412
Freud, S. (ed.). (1900). The Interpretation of Dreams. New York, NY: Basic Books.
Gottesmann, C. (2005). Dreaming and schizophrenia: a common neurobiological
background. Sleep Biol. Rhythms 3, 64–74. doi: 10.1111/j.1479-8425.2005.
00164.x
Hobson, J. A. (2009). REM sleep and dreaming: towards a theory of
protoconsciousness. Nat. Rev. Neurosci. 10, 803–813. doi: 10.1038/
nrn2716
Kay, S. R., Fiszbein, A., and Opler, L. A. (1987). The positive and negative
syndrome scale (PANSS) for schizophrenia. Schizophr. Bull. 13, 261–276. doi:
10.1093/schbul/13.2.261
Laberge, S., Levitan, L., and Dement, W. C. (1986). Lucid dreaming: physiological
correlates of consciousness during REM sleep. J. Mind Behav. 7, 251–258.
Laruelle, M. (2014). Schizophrenia: from dopaminergic to
glutamatergic interventions. Curr. Opin. Pharmacol. 14, 97–102. doi:
10.1016/j.coph.2014.01.001
Manoach, D. S., and Stickgold, R. (2009). Does abnormal sleep impair
memory consolidation in schizophrenia? Front. Hum. Neurosci. 3:21. doi:
10.3389/neuro.09.021.2009
McGrath, J., Saha, S., Chant, D., and Welham, J. (2008). Schizophrenia: a concise
overview of incidence, prevalence, and mortality. Epidemiol. Rev. 30, 67–76. doi:
10.1093/epirev/mxn001
Merikangas, K. R., Jin, R., He, J. P., Kessler, R. C., Lee, S., Sampson, N. A.,
et al. (2011). Prevalence and correlates of bipolar spectrum disorder in the
world mental health survey initiative. Arch. Gen. Psychiatry 68, 241–251. doi:
10.1001/archgenpsychiatry.2011.12
Michels, F., Schilling, C., Rausch, F., Eifler, S., Zink, M., Meyer-Lindenberg, A.,
et al. (2014). Nightmare frequency in schizophrenic patients, healthy relatives
of schizophrenic patients, patients at high risk states for psychosis, and healthy
controls. Int. J. Dream Res. 7, 9–13.
Monti, J. M., and Monti, D. (2004). Sleep in schizophrenia patients and the
effects of antipsychotic drugs. Sleep Med. Rev. 8, 133–148. doi: 10.1016/S1087-
0792(02)00158-2
Mota, N. B., Furtado, R., Maia, P. P., Copelli, M., and Ribeiro, S. (2014). Graph
analysis of dream reports is especially informative about psychosis. Sci. Rep.
4:3691. doi: 10.1038/srep03691
Mota, N. B., Vasconcelos, N. A., Lemos, N., Pieretti, A. C., Kinouchi, O.,
Cecchi, G. A., et al. (2012). Speech graphs provide a quantitative measure of
thought disorder in psychosis. PLoS ONE 7:e34928. doi: 10.1371/journal.pone.
0034928
Mota-Rolim, S. A., and Araujo, J. F. (2013). Neurobiology and clinical implications
of lucid dreaming. Med. Hypoth. 81, 751–756. doi: 10.1016/j.mehy.2013.
04.049
Mota-Rolim, S. A., Erlacher, D., Tort, A. B. L., Araujo, J. F., and
Ribeiro, S. (2010). Different kinds of subjective experience during lucid
dreaming may have different neural substrates. Int. J. Dream Res. 3,
33–35.
Mota-Rolim, S. A., Targino, Z. H., Souza, B. C., Blanco, W., Araujo, J. F.,
and Ribeiro, S. (2013). Dream characteristics in a Brazilian sample: an
online survey focusing on lucid dreaming. Front. Hum. Neurosci. 7:836. doi:
10.3389/fnhum.2013.00836
Neider, M., Pace-Schott, E. F., Forselius, E., Pittman, B., and Morgan, P. T.
(2011). Lucid dreaming and ventromedial versus dorsolateral prefrontal
task performance. Conscious. Cogn. 20, 234–244. doi: 10.1016/j.concog.2010.
08.001
Okorome Mume, C. (2009). Nightmare in schizophrenic and depressed
patients. Eur. J. Psychiatry 23, 177–183. doi: 10.4321/S0213-61632009000
300006
Postle, B. R. (2006). Working memory as an emergent property of the
mind and brain. Neuroscience 139, 23–38. doi: 10.1016/j.neuroscience.2005.
06.005
Rak, M., Beitinger, P., Steiger, A., Schredl, M., and Dresler, M. (2015).
Increased lucid dreaming frequency in narcolepsy. Sleep 38, 787–792. doi:
10.5665/sleep.4676
Schredl, M. (2011). Dream research in schizophrenia: methodological issues
and a dimensional approach. Conscious. Cogn. 20, 1036–1041. doi:
10.1016/j.concog.2010.05.004
Skancke, J. C., Holsen, I., and Schredl, M. (2014). Continuity between waking life
and dreams of psychiatric patients: a review and discussion of the implications
for dream research. Int. J. Dream Res. 7, 39–53.
Snyder, T. J., and Gackenbach, J. (1988). “Individual differences associated with
lucid dreaming,” in Conscious Mind, Sleeping Brain, eds J. Gackenbach and S.
LaBerge (New York, NY: Plenum Press), 221–259.
Solms, M. (2000). Dreaming and REM sleep are controlled by different
brain mechanisms. Behav. Brain Sci. 23, 843–850. doi: 10.1017/S0140525X00
003988
Stumbrys, T., Erlacher, D., Schadlich, M., and Schredl, M. (2012). Induction of lucid
dreams: a systematic review of evidence. Conscious. Cogn. 21, 1456–1475. doi:
10.1016/j.concog.2012.07.003
Stumbrys, T., Erlacher, D., and Schredl, M. (2013). Testing the
involvement of the prefrontal cortex in lucid dreaming: a tDCS
study. Conscious. Cogn. 22, 1214–1222. doi: 10.1016/j.concog.2013.
08.005
Voss, U., Holzmann, R., Hobson, A., Paulus, W., Koppehele-Gossel, J., Klimke, A.,
et al. (2014). Induction of self awareness in dreams through frontal low
current stimulation of gamma activity. Nat. Neurosci. 17, 810–812. doi: 10.1038/
nn.3719
Voss, U., Holzmann, R., Tuin, I., and Hobson, J. A. (2009). Lucid dreaming: a state
of consciousness with features of both waking and non-lucid dreaming. Sleep
32, 1191–1200.
Voss, U., Schermelleh-Engel, K., Windt, J., Frenzel, C., and Hobson, A. (2013).
Measuring consciousness in dreams: the lucidity and consciousness in dreams
scale. Conscious. Cogn. 22, 8–21. doi: 10.1016/j.concog.2012.11.001
Conflict of Interest Statement: The authors declare that the research was
conducted in the absence of any commercial or financial relationships that could
be construed as a potential conflict of interest.
Copyright © 2016 Mota, Resende, Mota-Rolim, Copelli and Ribeiro. This is an open-
access article distributed under the terms of the Creative Commons Attribution
License (CC BY). The use, distribution or reproduction in other forums is permitted,
provided the original author(s) or licensor are credited and that the original
publication in this journal is cited, in accordance with accepted academic practice.
No use, distribution or reproduction is permitted which does not comply with these
terms.
Chapter 7 - Sleep transition imagery, insights from natural language processing 
Dream content has been extensively investigated and is known to reflect waking
activities. However, the persistence in dreams of the last image seen before sleep has
never been quantified without subjective bias. Do visual memories fade or reverberate
during hypnagogic sleep? In this chapter we discuss the application of a semantic
similarity tool called word2vec to study the memory reverberation of an affective visual
image, presented immediately before sleep, in dream reports collected during the sleep
transition. This is an ongoing project with preliminary results from 21
electroencephalographic (EEG) recording sessions, which aims to identify the neural
correlates of image penetration in dreams.
Semantic memory reverberation during sleep onset correlates with different frequency band 
power during waking and sleep 
Natália Bezerra Mota, Ernesto Soares, Edgar Altszyler, Vincenzo Muto, Dominik Heib, 
Manuel Schabus, Mauro Copelli, Sidarta Ribeiro 
 
Abstract 
There is evidence for a role of sleep in semantic memory, but how semantic memory is spontaneously
processed during dreams, and its neural correlates, remain mysterious. The study of visual mentation
during sleep onset provides the time resolution needed to capture the moment when this mentation is
being processed, and allows repeated trials with successful visual recall. By measuring the semantic
similarity between the report of an affective image seen before closing the eyes and the report of the
visual mentation with eyes closed, it is possible to estimate how semantically close those reports are
(as an automated measure of semantic memory reverberation). In order to characterize semantic memory
reverberation during waking and sleep at sleep onset, and its neural correlates, we investigated 21 EEG
recording sessions (64 cortical channels, plus EMG, EOG, ECG and skin conductance) of 36 trials each,
from 19 subjects. The subjects were sleep deprived. Each trial consisted of the presentation of an
affective image for 15 seconds, which the subject reported, followed by an instruction to sleep. The
experimenter monitored the sleep stages (waking with eyes closed, the first stage (N1) or the second
stage of sleep (N2)), and when the target stage was reached a beep signaled the subject to open the
eyes. The subject was then asked to report visual mentations from the period with eyes closed. All
reports were limited to 30 seconds and transcribed, and their semantic similarity to the preceding
image report was calculated using the Word2Vec algorithm. All recording sessions were sleep staged
blindly, and sleep biomarkers (such as vertex waves, K-complexes and spindles) were visually
identified. After pre-processing, the 20 seconds before the beep were analyzed using spectrograms and
power spectral density in 7 frequency bands (delta, theta, alpha, sigma, beta1, beta2 and low gamma).
There was a tendency towards a higher visual recall rate during the first sleep stage (p = 0.0545), and
visual recall trials presented longer time with eyes closed (during Waking) and fewer K-complexes
(during Sleep stages) when compared to trials without visual recall, confirming the hypothesis that
this first sleep stage is probably a sweet spot to study dream mentation. Regarding semantic memory
reverberation measured by image penetrance (semantic similarity between image and visual mentation
reports), image penetrance was higher in Waking trials than in Sleep trials (N1 or N2, which did not
differ). Global theta power (PSD) was anti-correlated with image penetrance in both stages studied
(Waking and Sleep). However, while during waking alpha power correlated positively and sigma (or higher
frequency) power correlated negatively with image penetrance (higher image penetrance with a more
relaxed waking state), during sleep higher awareness combined with a deeper sleep stage (higher delta
power combined with higher power in Beta 1, Beta 2 and Low gamma, mostly frontal) correlated with
higher image penetrance. Both stages also showed similarities (theta and sigma power were both
anti-correlated with memory persistence). Semantic memory reverberation seems to be related to sleep
processes during sleep onset, and the mixture of brain oscillations during this phase correlates with
spontaneous memory traces measured naturalistically, confirming the hypothesis that memory-related
processes take place during dreams at sleep onset.
 
 
  
Introduction 
In recent years sleep science has accumulated evidence on the role of sleep in mnemonic
processes, improving performance in procedural memory tasks 1 as well as in declarative memory
tasks 2. Sleep oscillations such as spindles, a biomarker of slow wave sleep, have been associated
with performance improvement in declarative memory tasks 3,4, with important implications for
understanding and improving learning mechanisms. A deeper understanding of sleep-memory
mechanisms is not only interesting, but also useful for designing interventions able to
boost learning 5.
But the role of dreaming in memory processes remains mysterious, as do the neural correlates
of dreams related to memory processes. Theories range from the randomness of dream
content 6 to an evolutionary gain from dreaming as a threat simulator that helped our ancestors to
gain insights during dreams and improve survival performance in several ecological tasks 7,8.
Part of this discrepancy can be explained by the difficulty of studying data as internal and
subjective as dream content without subjective bias. The Freudian notion that the
latent meaning of a dream is sometimes interpretable only by the dreamer makes the study of
dream content even harder 9. In this field, however, a breakthrough was achieved when it was
demonstrated that dreaming about specific trained skills improved performance in a game
after sleep 10. This result raises the hypothesis that memory reverberation in dreaming is an
important mnemonic mechanism to improve learning during sleep, but how can we measure
memory reverberation in dream content without subjective bias? A similar challenge faced by
psychiatric evaluations (also very subjective) has been addressed in recent years by a new
field called computational psychiatry 11, especially through speech analysis approaches 12-16.
The estimation of semantic similarity between terms (words) or sets of terms (reports)
represented in a semantic space (based on the co-occurrence of words in a large set of
documents) 17,18 is an interesting tool for dream content analysis.
Another challenge is the study of brain function associated with dreaming. How can experimenters
precisely identify, during a recording session that lasts hours, an event that probably lasts
seconds to minutes? Progress has been achieved by studying EEG recordings during naps or nights of
sleep in the laboratory, comparing sessions that were followed by dream recall with those that
were not 19,20. Different sleep oscillations seem to be associated with dream recall in different
sleep stages 19,20. But in order to study specific oscillations associated with memory processes
during dreaming, it is important to guarantee a more precise temporal match with the dream
phenomenon. An interesting strategy is to study dreams during sleep onset 21-23. This first sleep
stage lasts a few minutes and can be repeated within the same recording session with a successful
recall rate; it has even been shown possible to decode dream content from functional visual
processing data, using machine learning techniques based exclusively on fMRI data
recorded during sleep onset 21.
A third bottleneck in the study of dream content is the diversity of possible contents, which seem
not to have a clear relationship with memory processes 6 unless they have an important
affective impact 8,24. Given the large number of possible daily narratives with mild affective
impact in the dreamer’s life, the range of possible contents is also wide. But stimuli with
important affective impact presented during sleep can influence the content of dream imagery,
in association with physiological responses to stress 25. So, if there is semantic memory
reverberation during dreams, affectively relevant semantic memories are expected to be
recalled in dream content.
Given the evidence and caveats explained above, we designed a multiple-nap recording
session following exposure to affective images in order to assess dream content during sleep
onset. By measuring the semantic similarity between the report of an affective image seen before
closing the eyes and the report of the visual mentation with eyes closed, it is possible to estimate
how semantically close those reports are (as an automated measure of semantic memory
reverberation). The first hypothesis is that waking and sleep differ regarding
semantic memory reverberation (waking trials should present higher
reverberation than sleep trials, which should present more aberrant content), and that brain
oscillations should show different associations with semantic memory reverberation in
waking and sleep stages. Additionally, the first stage of sleep (N1) and the brain oscillations
related to it should present a higher dream recall rate than other stages during sleep onset,
which would confirm the strategic benefits of studying dreams during sleep onset.
  
Methods 
Here we analyzed 21 EEG recording sessions from 19 subjects (10 males and 9 females, aged
over 18 and under 44 years). They were first interviewed to exclude symptoms of mental,
neurological or sleep disorders, and instructed to fill in a sleep/dream diary for two weeks
before the recording session. On the day before the experiment they were asked not to consume
alcohol or caffeine. They were instructed to wake up earlier than usual, by half of their habitual
sleep time, and to arrive at the sleep laboratory one hour before their habitual awakening time
(so that the recording session would start at this time).
In order to obtain sleep data better time-matched with the dream experience, and multiple
awakenings from the same individuals, we collected sleep transition recordings with 36 trials
interrupted by a beep during the initial phases of sleep (wake with eyes closed, or first or
second stage of sleep, N1 and N2). Before the subjects closed their eyes, an affective image was
shown for 15 seconds. The subjects were asked to report what they saw. Then they were
instructed to pay attention to visual imagery during the period with eyes closed. Sleep staging
was performed during the experiment, and when the target sleep stage was reached a beep
lasting 2 seconds signaled the subject to open the eyes. They were then asked to report
what they saw during the period with eyes closed. The experimenter balanced the trials
so as to collect visual mentation reports from the first stages of sleep (N1 or N2) and from wake
with eyes closed, in an order randomized for each experiment.
In order to quantitatively measure the semantic memory reverberation, in the dream or visual
mentation, of the image shown before closing the eyes, both reports (each limited to 30
seconds) were transcribed and translated into English using Google Translate. The
texts were compared using a pre-trained Word2Vec semantic representation 17,18 in order to
measure the semantic similarity between the two reports (which we call image penetrance, IP).
This representation maps each word to a vector, such that words with similar meanings tend to
be located closer to each other. Given a semantic representation, the semantic similarity of
two words is calculated as the cosine similarity between their respective
vector representations. Thus, the similarity of two texts can be computed as the cosine
similarity between the average vectors of each text. The Word2vec technique consists of
a state-of-the-art neural network trained to predict the context of words in a
large corpus (the Google News dataset of 100 billion words in this case). We were thus able to
locate the set of words used to describe the image seen before closing the eyes and
to calculate its similarity to the set of words used to describe the visual mentation during
the period with eyes closed. This measurement is here called image penetrance (the similarity of
the visual mentation to the previously seen image).
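As an illustration of the similarity measure described above, the following minimal sketch averages the word vectors of each report and takes the cosine similarity between the two averages. It is not the code used in the study: the three-dimensional vectors in TOY_VECTORS are hypothetical values chosen for the example (the study used pre-trained Word2Vec vectors), and the function names are our own.

```python
import math

# Toy 3-dimensional "embeddings" (hypothetical values, for illustration only).
# The study used pre-trained Word2Vec vectors instead.
TOY_VECTORS = {
    "dog":   [0.9, 0.1, 0.0],
    "puppy": [0.8, 0.2, 0.1],
    "grass": [0.1, 0.9, 0.2],
    "car":   [0.0, 0.2, 0.9],
    "road":  [0.1, 0.3, 0.8],
}


def average_vector(text, vectors):
    """Average the vectors of the known words in a text (unknown words are skipped)."""
    known = [vectors[w] for w in text.lower().split() if w in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def image_penetrance(image_report, mentation_report, vectors=TOY_VECTORS):
    """Cosine similarity between the average vectors of the two reports."""
    u = average_vector(image_report, vectors)
    v = average_vector(mentation_report, vectors)
    if u is None or v is None:
        return None  # no known word in at least one of the reports
    return cosine_similarity(u, v)


image_report = "a dog on the grass"
close = image_penetrance(image_report, "a puppy on grass")
far = image_penetrance(image_report, "a car on the road")
print(close, far)  # the first value is larger than the second
```

With real embeddings, only the dictionary of vectors would change; the averaging and the cosine step stay the same.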
Electroencephalography was recorded using 64 cortical channels (plus electrooculogram and
electromyogram). Sleep staging was performed offline and blindly, and sleep
biomarkers such as vertex waves, spindles and K-complexes were counted. Data were downsampled
to 126Hz and filtered from 0.5 to 30Hz; bad trials were excluded, bad channels were interpolated
after visual inspection, and each trial was cut to the 20 seconds preceding the onset of the beep.
Of the 756 trials collected, 694 were analyzed after pre-processing, 589 of them with a visual
report (275 during waking and 314 during sleep, of which 237 in N1 and only 75 in N2). Cortical
electrodes were re-referenced to the average, and power spectral density (PSD) was computed
using the pwelch method for each cortical channel and averaged across channels (global PSD);
spectrograms were also computed for target channels. Mean PSD was calculated across seven
frequency bands, named as follows: Delta (0.5-4.5Hz), Theta
(4.5-8.5Hz), Alpha (8.5-12.5Hz), Sigma (12.5-16.5Hz), Beta1 (16.5-20.5Hz), Beta2 (20.5-24.5Hz),
and Low Gamma (24.5-28.5Hz).
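The band-averaged PSD described above can be sketched as follows for a single synthetic channel. This is a simplified Welch-style estimate written in plain Python, not a reimplementation of Matlab's pwelch and not the study's code: the Hann window, the 2-second segments with 50% overlap, the per-segment mean removal and the synthetic signal are all assumptions made for the example. Only the sampling rate and the band limits come from the text.

```python
import cmath
import math

FS = 126.0  # sampling rate (Hz) after downsampling, as stated in the text

# Band limits in Hz, as defined in the text.
BANDS = {
    "delta": (0.5, 4.5), "theta": (4.5, 8.5), "alpha": (8.5, 12.5),
    "sigma": (12.5, 16.5), "beta1": (16.5, 20.5), "beta2": (20.5, 24.5),
    "low_gamma": (24.5, 28.5),
}


def welch_psd(signal, fs, seg_len):
    """Welch-style PSD estimate: Hann-windowed segments, 50% overlap, averaged.

    seg_len is assumed to be even.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / seg_len) for n in range(seg_len)]
    win_power = sum(w * w for w in window)
    # Pre-computed complex exponentials for the discrete Fourier transform.
    twiddle = [cmath.exp(-2j * math.pi * m / seg_len) for m in range(seg_len)]
    n_freqs = seg_len // 2 + 1
    starts = list(range(0, len(signal) - seg_len + 1, seg_len // 2))
    psd = [0.0] * n_freqs
    for start in starts:
        segment = signal[start:start + seg_len]
        mean = sum(segment) / seg_len
        x = [(v - mean) * w for v, w in zip(segment, window)]
        for k in range(n_freqs):
            coef = sum(x[n] * twiddle[(k * n) % seg_len] for n in range(seg_len))
            # Double every bin except DC and Nyquist to obtain a one-sided spectrum.
            scale = 1.0 if k in (0, seg_len // 2) else 2.0
            psd[k] += scale * abs(coef) ** 2 / (fs * win_power)
    freqs = [k * fs / seg_len for k in range(n_freqs)]
    return freqs, [p / len(starts) for p in psd]


def band_means(freqs, psd, bands=BANDS):
    """Mean PSD inside each band (lower limit included, upper limit excluded)."""
    means = {}
    for name, (lo, hi) in bands.items():
        values = [p for f, p in zip(freqs, psd) if lo <= f < hi]
        means[name] = sum(values) / len(values)
    return means


# Synthetic 20-second "channel": a 10 Hz rhythm plus a weaker 6 Hz rhythm.
n_samples = int(20 * FS)
signal = [math.sin(2 * math.pi * 10 * t / FS) + 0.3 * math.sin(2 * math.pi * 6 * t / FS)
          for t in range(n_samples)]

freqs, psd = welch_psd(signal, FS, seg_len=252)  # 2-second segments, 0.5 Hz resolution
powers = band_means(freqs, psd)
strongest = max(powers, key=powers.get)
print(strongest)
```

A global PSD in the sense used here would repeat the estimate for each of the 64 cortical channels and average the resulting band values across channels.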
We then performed statistical analyses to verify whether wake and sleep trials showed different
results. Non-parametric statistics were used, with Bonferroni
correction for 14 comparisons (7 frequency bands x 2 stages, wake and sleep). Matlab
software was used for EEG and statistical analyses.
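The Bonferroni correction mentioned above amounts to dividing the significance level by the number of comparisons. A minimal sketch follows, assuming a significance level of 0.05 (the text does not state the level used) and hypothetical p-values that are not results from the study.

```python
BANDS = ["delta", "theta", "alpha", "sigma", "beta1", "beta2", "low_gamma"]
STAGES = ["wake", "sleep"]


def bonferroni_threshold(alpha, n_comparisons):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_comparisons


n_comparisons = len(BANDS) * len(STAGES)  # 7 bands x 2 stages = 14
threshold = bonferroni_threshold(0.05, n_comparisons)  # 0.05 is an assumed level

# Hypothetical p-values, for illustration only (not results from the study).
p_values = {
    ("wake", "theta"): 0.001,
    ("wake", "alpha"): 0.02,
    ("sleep", "beta2"): 0.003,
}
significant = {key: p < threshold for key, p in p_values.items()}
print(n_comparisons, threshold, significant)
```

Under these assumptions a p-value of 0.02 would not survive the correction, although it is below the uncorrected level of 0.05.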
 
 
Figure 1: Methods and concepts. A) Experimental protocol: an affective image was shown for
15 seconds, and after the screen went off the subject reported for 30 seconds what
he/she saw. The subjects were then instructed to try to sleep and to pay attention to visual
mentation during the period with eyes closed. After a beep, the subject was instructed to open
the eyes and report what he/she saw during the eyes-closed period. If the subject remembered a
visual mentation, the trial was considered a visual recall trial; otherwise it was considered a
no recall trial. The experimenter performed sleep staging during the eyes-closed period and
started the beep during a sleep stage (N1 or N2) or before sleep (Wake), in an order randomized
for each experiment. The entire experiment had 36 trials. All trials were later sleep staged
again, blindly, and this offline staging was used for analysis. B) Image penetrance concept:
the semantic similarity between two sets of words (report of the visual stimulus before closing
the eyes x report of the visual mentation with eyes closed) was estimated using word2vec.
The example shows two trials, 1 and 2, with the reports of the visual mentation (a) and the
descriptions of the stimulus (b), plotted as a semantic
similarity matrix between reports within and across trials. Colors indicate semantic similarity
(identical texts have the maximum similarity of 1).
Results 
There was a tendency towards a higher visual recall rate during the first sleep stage (N1)
compared to waking (p = 0.0545). Also, visual recall trials presented longer time with eyes closed
than trials without visual recall (considering all trials or only waking trials), and fewer
K-complexes than trials without visual recall (considering all trials or only sleep trials),
partially confirming the hypothesis that this first sleep stage (N1) is probably a sweet spot to
study dreams (Figure 2).
 
 
Figure 2: The first sleep stage (N1) and its sleep biomarkers are associated with a better recall
rate. A) Pooling all experiments, there is a tendency towards more visual or dream recall
in first sleep stage (N1) trials than in Waking trials. B) Considering only sleep trials, there
are more K-complexes in no recall trials than in trials with dream recall. Note that most trials
contain no K-complex (which is why the mean K-complex count is smaller than 1). Also, wake
trials with visual recall had a longer time with eyes closed than trials without visual
recall. Median values are represented by red bars for recall trials and by blue bars for no visual
recall trials, and standard errors by black lines. C) Summary of results confirming the
hypothesis that memory processes related to sleep can be studied during the sleep transition in
the N1 stage (visual recall trials are associated with longer wake periods with eyes closed,
closer to N1, and sleep trials with dream recall presented fewer K-complexes).
 
As expected, semantic memory reverberation measured by image penetrance was higher in waking
trials than in sleep trials (N1 or N2, which did not differ) (Figure 3A). Importantly, there
was no correlation between time with eyes closed and image penetrance (Rho = -0.0216, p =
0.6015). Also as expected, sleep trials presented a different mean spectrogram from
waking trials: during waking there was higher power in the alpha, sigma, beta and low
gamma bands and lower power in the delta and theta bands, especially closer to
the beep (Figure 3B).
In order to understand the association between brain oscillatory patterns and image
penetrance in both waking and sleep trials, we studied the correlation of image penetrance with
power spectral density (PSD) in 7 different frequency bands (0.5 to 28.5Hz). Global theta power
(PSD on 4.5 to 8.5Hz) was anti-correlated with image penetrance in both stages studied (waking
and sleep). However, while during waking global alpha power (PSD on 8.5 to 12.5Hz) correlated
positively and global sigma power (PSD on 12.5 to 16.5Hz) correlated negatively with image
penetrance (Figure 3C), during sleep the higher frequency bands (beta 2 and low gamma, PSD 20.5
to 28.5Hz) correlated positively with image penetrance (Figure 3D). This result points to
similarities between the two stages in the theta range, but differences in other frequency bands.
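The correlations reported here are Spearman rank correlations between band power and image penetrance across trials. As a minimal sketch of how such a coefficient is obtained, the code below ranks both variables (ties receive the average rank) and computes the Pearson correlation of the ranks. The five data points are hypothetical values for illustration, not data from the study, and p-values are not computed.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-trial values, for illustration only.
theta_power = [1.0, 2.0, 3.0, 4.0, 5.0]
penetrance = [0.9, 0.7, 0.8, 0.4, 0.2]

rho = spearman_rho(theta_power, penetrance)
print(round(rho, 3))  # -0.9
```

A negative coefficient, as in this toy example, corresponds to what the text calls an anti-correlation: trials with more power in the band tend to have lower image penetrance.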
 
Figure 3: Semantic memory reverberation in visual mentation with eyes closed (image
penetrance) differs between waking and sleep trials. A) Semantic memory
reverberation of the last image seen before closing the eyes (image penetrance) is higher
during waking than during sleep trials, as predicted. The difference between waking x sleep is
represented by (**) and the differences between waking x N1 and waking x N2 by (*). B)
Examples of PSD peaks (blue line for each frequency window and gray dots for the mean of each
frequency band) and spectrogram of a wake and a sleep trial in the twenty seconds before the
beep. C) Spearman correlation of global PSD versus image penetrance considering all waking
trials. Frequency band, Rho and p value are given in the titles (significant correlations after
Bonferroni correction for 7 (frequency bands) x 2 (waking or sleep) comparisons in red). D) Spearman
correlation of global PSD versus image penetrance considering all sleep trials. Frequency band,
Rho and p value are given in the titles (significant correlations after Bonferroni correction
for 7 (frequency bands) x 2 (waking or sleep) comparisons in red).
 
Analyzing the correlations between PSD and image penetrance in each channel separately 
(considered significant after Bonferroni correction for 7 x 2 x 64 comparisons), there were 
generally more correlated channels in waking than in sleep (in which they were topographically 
restricted to frontal or left temporal sites). The two stages did not differ much in the 
topography of the negative correlation with theta power (PSD from 4.5 to 8.5Hz), which showed a 
dense distribution of correlated channels in frontal-central-temporal regions. During waking, 
central-occipital alpha power (PSD from 8.5 to 12.5Hz) correlated positively with image 
penetrance and there was no correlation in the delta range (PSD from 0.5 to 4.5Hz), whereas 
during sleep there was a positive correlation in the frontal delta range and a negative 
correlation in the frontal alpha range. The sigma range (PSD from 12.5 to 16.5Hz) showed a 
negative frontal correlation in both waking and sleep (also extending to centro-parietal regions 
in waking). At higher frequencies (Beta 1, Beta 2 and Low gamma, PSD from 16.5 to 28.5Hz) the 
correlation remained negative in waking trials but turned positive in temporal regions in sleep 
trials. Interestingly, in waking there were also positive correlations at temporal sites in the 
Beta 2 and Low gamma ranges (PSD from 20.5 to 28.5Hz). In addition, the correlation peaks were 
restricted to left sites in sleep trials, which was not the case in waking trials (which even 
presented two Rho peaks, one positive and one negative, in Beta 2 and Low gamma) (Figure 4A). 
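
As an illustration of the statistics described above, the following minimal sketch computes a Spearman correlation (a Pearson correlation on ranks) and the Bonferroni-corrected significance thresholds for the 7 x 2 global comparisons and the 7 x 2 x 64 per-channel comparisons. The uncorrected level of 0.05 and all function names are assumptions made for this example; it is not the analysis code used in this work.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks.

    Raises ZeroDivisionError if either variable is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def bonferroni_alpha(alpha, *factors):
    """Divide the significance level by the total number of comparisons."""
    n_comparisons = 1
    for factor in factors:
        n_comparisons *= factor
    return alpha / n_comparisons


ALPHA = 0.05  # assumed uncorrected significance level
GLOBAL_ALPHA = bonferroni_alpha(ALPHA, 7, 2)       # 7 bands x 2 states
CHANNEL_ALPHA = bonferroni_alpha(ALPHA, 7, 2, 64)  # ... x 64 channels
```

A correlation would then be considered significant only if its p value falls below `GLOBAL_ALPHA` (global PSD) or `CHANNEL_ALPHA` (single channels).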
  
  
Figure 4: Different associations between image penetrance and power in seven frequency 
bands (0.5Hz – 28.5Hz) during waking and sleep. Topographic representation of the Spearman 
correlation between image penetrance and power spectral density in the delta (0.5 – 4.5Hz), 
theta (4.5 – 8.5Hz), alpha (8.5 – 12.5Hz), sigma (12.5 – 16.5Hz), beta1 (16.5 – 20.5Hz), beta2 
(20.5 – 24.5Hz) and low gamma (24.5 – 28.5Hz) ranges. White dots represent cortical channels 
with significant correlation after Bonferroni correction, and black circles represent the peak 
channel (the highest and/or the lowest Rho) in each frequency range. 
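
The seven frequency bands listed in the caption can be written down as a small lookup table. The sketch below averages a power spectral density over one band; it assumes the PSD has already been estimated on a grid of frequencies, and the convention that the lower edge is inclusive and the upper edge exclusive is an assumption of this example (the caption does not state how shared edges are assigned).

```python
# Band edges in Hz, taken from the caption of Figure 4.
BANDS = {
    "delta": (0.5, 4.5),
    "theta": (4.5, 8.5),
    "alpha": (8.5, 12.5),
    "sigma": (12.5, 16.5),
    "beta1": (16.5, 20.5),
    "beta2": (20.5, 24.5),
    "low_gamma": (24.5, 28.5),
}


def band_power(freqs, psd, band):
    """Mean PSD over the frequency bins that fall inside one band.

    freqs and psd are equal-length sequences (frequency in Hz, power).
    Assumed convention: low edge inclusive, high edge exclusive.
    """
    low, high = BANDS[band]
    inside = [p for f, p in zip(freqs, psd) if low <= f < high]
    if not inside:
        raise ValueError(f"no frequency bins inside the {band} band")
    return sum(inside) / len(inside)
```

With this table, the 7 (bands) x 2 (waking or sleep) comparisons mentioned in the text correspond to one call per band and state.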
  
Discussion 
As predicted, dream recall tended to be more frequent during the first stage of sleep than 
during waking, but no difference or tendency was observed between N1 and N2. As the 
experiment was designed to capture only the initial moments of N2, only a few trials were 
staged as N2, and it was probably too early for mechanisms more specific to this second stage 
of sleep to have been activated. Nevertheless, analyzing all sleep trials, it was possible to 
observe that those with dream recall presented fewer K-complexes than trials without recall. 
The association between K-complexes and lack of dream recall may be speculated to be a 
consequence of very slow oscillations impairing memory processes, although no causal 
relationship can be inferred from these data. As oscillations become slower, memory 
impairment becomes more pronounced, as can be observed in the difficulty of recalling a dream 
after awakening from N3 26,27. Together with the other result, that exclusively in waking, 
trials with vision recall showed more time with eyes closed, this gives us indirect evidence 
that N1 is a privileged stage for collecting vision recall (vision/dream recall is associated 
with wakefulness close to this first stage and with an initial sleep stage far from deeper 
sleep). As described in the literature, this is an accessible sleep stage, reached within 
seconds to a few minutes with eyes closed, and full of mental imagery 22,23, which gives us an 
opportunity to study the electrophysiology of vision/dream recall with better time resolution 
21. That said, we continued our investigation of semantic memory reverberation in this mental 
imagery. 
By measuring the semantic similarity between the image seen before closing the eyes and the 
mental images during the period with eyes closed, it was possible to observe differences 
in semantic memory reverberation while the subjects were in the sleep transition. As 
predicted, semantic memory reverberation was higher when the subjects were awake. 
During sleep, mentations are described as more bizarre, and this was expected to compete 
with memory reverberation of the previous image 27-30. The computational approach made it 
possible to find latent similarities between reports, which were hard to measure without 
subjective bias in previous studies that failed to detect semantic memory reverberation 6. 
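
To make the idea of measuring semantic similarity between two reports concrete, the sketch below averages the word vectors of each text and takes the cosine between the two averages. This is only one common way to compare texts with word embeddings and is an assumption of this example; the exact definition of image penetrance is given in the methods, not here. The three-dimensional vectors are hypothetical toy values, whereas a real analysis would load embeddings trained on a corpus (for example with word2vec 17,18).

```python
import math

# Hypothetical toy embeddings, for illustration only.
TOY_VECTORS = {
    "dog": [0.9, 0.1, 0.0],
    "puppy": [0.8, 0.2, 0.1],
    "car": [0.0, 0.9, 0.3],
    "road": [0.1, 0.8, 0.4],
}


def mean_vector(words, vectors):
    """Average the vectors of the words that exist in the vocabulary."""
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        raise ValueError("none of the words is in the vocabulary")
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def report_similarity(text_a, text_b, vectors=TOY_VECTORS):
    """Semantic similarity between two texts via their mean word vectors."""
    vec_a = mean_vector(text_a.lower().split(), vectors)
    vec_b = mean_vector(text_b.lower().split(), vectors)
    return cosine_similarity(vec_a, vec_b)
```

In this toy vocabulary, a report containing "puppy" is closer to an image described as "dog" than a report containing "car road", which is the kind of latent similarity that is hard to score by hand without subjective bias.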
However, the main hypothesis was only partially confirmed, as we observed both differences 
and similarities in the electrophysiological correlates of semantic memory reverberation 
during waking and sleep. While a more relaxed waking state (higher alpha power, and lower 
power in the high frequency bands above sigma) correlated with higher image penetrance, during 
sleep more awareness combined with deeper sleep restricted to frontal sites (higher delta power 
restricted to frontal channels combined with higher power in Beta 1, Beta 2 and Low gamma in 
left temporal channels) also correlated with higher semantic memory reverberation. These 
results point in a similar direction to the previous result: memory reverberates more in 
waking trials the closer they are to sleep, and in sleep trials the closer they are to waking. 
This highlights the importance of the mixture of both states for semantic memory 
processing. It is possible that sleep after a stressful or threatening situation enhances 
awareness during sleep, which is associated with higher reverberation of semantic memory 
traces, training the subject to deal with this situation after awakening 7. 
In both waking and sleep trials, theta (PSD 4.5 to 8.5Hz) and sigma power (PSD 12.5 to 16.5Hz) 
were anti-correlated with memory persistence. Starting with the theta results, global theta 
power and almost all frontal, temporal and central channels were negatively correlated with 
image penetrance (with the exception of parietal-occipital channels only), and there does not 
seem to be any difference between waking and sleep. This may reflect how slow oscillations 
globally impair mnemonic reverberation, a phenomenon more pronounced in sleep inertia after 
awakening from N3 26,27. For the sigma range, on the other hand, the similarities between 
waking and sleep trials are restricted to frontal channels. In waking trials a negative 
correlation was observed with global sigma PSD, also distributed across frontal, central and 
parietal channels, whereas during sleep this negative correlation with image penetrance was 
restricted to frontal channels, showing a gradual change towards positive correlations in the 
more posterior channels that became significant positive correlations in the higher frequency 
bands. This switch from negative to positive correlations can be interpreted as higher 
awareness during sleep associated with higher memory reverberation during this state, also 
pointing to a role of alertness and stress in this mnemonic mechanism. 
The results point to a sleep neural mechanism related to semantic memory reverberation in 
dream imagery that is present from sleep onset. Semantic memory reverberation seems to be 
related to a mixture of brain oscillations during this phase, as there are correlates with 
spontaneous memory traces measured naturalistically, confirming the hypothesis that there are 
memory-related processes during dreams at sleep onset. These results help to better 
understand the natural conditions that probably favor semantic memory reverberation, and new 
experiments with an intervention design should help to understand causal relationships, as 
well as how we can boost learning with a sleep onset paradigm. 
 
  
References 
1 Di Rienzo, F. et al. Online and Offline Performance Gains Following Motor Imagery 
Practice: A Comprehensive Review of Behavioral and Neuroimaging Studies. Front Hum 
Neurosci 10, 315, doi:10.3389/fnhum.2016.00315 (2016). 
2 Diekelmann, S. & Born, J. The memory function of sleep. Nat Rev Neurosci 11, 114-126, 
doi:10.1038/nrn2762 (2010). 
3 Gais, S., Molle, M., Helms, K. & Born, J. Learning-dependent increases in sleep spindle 
density. J. Neurosci. 22, 6830-6834 (2002). 
4 Schabus, M. et al. Sleep spindles and their significance for declarative memory 
consolidation. Sleep 27, 1479-1485 (2004). 
5 Sigman, M., Pena, M., Goldin, A. P. & Ribeiro, S. Neuroscience and education: prime 
time to build the bridge. Nat Neurosci 17, 497-502, doi:10.1038/nn.3672 (2014). 
6 Fosse, M. J., Fosse, R., Hobson, J. A. & Stickgold, R. J. Dreaming and episodic memory: a 
functional dissociation? J Cogn Neurosci 15, 1-9 (2003). 
7 Revonsuo, A. The reinterpretation of dreams: an evolutionary hypothesis of the 
function of dreaming. Behav. Brain Sci. 23, 877-901 (2000). 
8 Revonsuo, A. & Valli, K. Dreaming and Consciousness: Testing the Threat Simulation 
Theory of the Function of Dreaming. Psyche 6 (2000). 
9 Freud, S. The interpretation of dreams. 1952 edn (Encyclopaedia Britannica, 1900). 
10 Wamsley, E. J., Tucker, M., Payne, J. D., Benavides, J. A. & Stickgold, R. Dreaming of a 
learning task is associated with enhanced sleep-dependent memory consolidation. 
Curr Biol 20, 850-855, doi:10.1016/j.cub.2010.03.027 (2010). 
11 Wang, X. J. & Krystal, J. H. Computational psychiatry. Neuron 84, 638-654, 
doi:10.1016/j.neuron.2014.10.018 (2014). 
12 Cabana, A., Valle-Lisboa, J. C., Elvevag, B. & Mizraji, E. Detecting order-disorder 
transitions in discourse: implications for schizophrenia. Schizophr Res 131, 157-164, 
doi:10.1016/j.schres.2011.04.026 (2011). 
13 Elvevåg, B., Foltz, P. W., Weinberger, D. R. & Goldberg, T. E. Quantifying incoherence in 
speech: An automated methodology and novel application to schizophrenia. 
Schizophrenia Research 93, 304-316, doi:10.1016/j.schres.2007.03.001 (2007). 
14 Bedi, G. et al. Automated analysis of free speech predicts psychosis onset in high-risk 
youths. npj Schizophrenia 1, 15030, doi:10.1038/npjschz.2015.30 (2015). 
15 Mota, N. B., Furtado, R., Maia, P. P., Copelli, M. & Ribeiro, S. Graph analysis of dream 
reports is especially informative about psychosis. Scientific Reports 4, 3691, 
doi:10.1038/srep03691 (2014). 
16 Mota, N. B. et al. Speech graphs provide a quantitative measure of thought disorder in 
psychosis. PLoS One 7, e34928, doi:10.1371/journal.pone.0034928 (2012). 
17 Mikolov, T., Chen, K., Corrado, G. & Dean, J. Efficient Estimation of Word 
Representations in Vector Space. arXiv:1301.3781v3 [cs.CL] (2013). 
18 Altszyler, E., Sigman, M., Ribeiro, S. & Slezak, D. F. Comparative study of LSA vs 
Word2vec embeddings in small corpora: a case study in dreams database. 
arXiv:1610.01520 [cs.CL] (2016). 
19 Marzano, C. et al. Recalling and forgetting dreams: theta and alpha oscillations during 
sleep predict subsequent dream recall. J Neurosci 31, 6674-6683, 
doi:10.1523/JNEUROSCI.0412-11.2011 (2011). 
20 Chellappa, S. L., Frey, S., Knoblauch, V. & Cajochen, C. Cortical activation patterns 
herald successful dream recall after NREM and REM sleep. Biol Psychol 87, 251-256, 
doi:10.1016/j.biopsycho.2011.03.004 (2011). 
21 Horikawa, T., Tamaki, M., Miyawaki, Y. & Kamitani, Y. Neural decoding of visual 
imagery during sleep. Science 340, 639-642, doi:10.1126/science.1234330 (2013). 
22 Foulkes, W. D. Dream reports from different stages of sleep. J Abnorm Soc Psychol 65, 
14-25 (1962). 
23 Nielsen, T. A. A review of mentation in REM and NREM sleep: "covert" REM sleep as a 
possible reconciliation of two opposing models. Behav Brain Sci 23, 851-866; 
discussion 904-1121 (2000). 
24 Valli, K. et al. The threat simulation theory of the evolutionary function of dreaming: 
Evidence from dreams of traumatized children. Consciousness and Cognition in press 
(2004). 
25 Flo, E. et al. Transient changes in frontal alpha asymmetry as a measure of emotional 
and physical distress during sleep. Brain Res 1367, 234-249, 
doi:10.1016/j.brainres.2010.09.090 (2011). 
26 Chugh, D. K., Weaver, T. E. & Dinges, D. F. Neurobehavioral consequences of arousals. 
Sleep 19, S198-201 (1996). 
27 Hobson, J. A. REM sleep and dreaming: towards a theory of protoconsciousness. Nat 
Rev Neurosci 10, 803-813, doi:10.1038/nrn2716 (2009). 
28 Hobson, J. A. Dreaming as delirium: A mental status exam of our nightly madness. 
Seminars in Neurology 17, 121-128 (1997). 
29 Hobson, J. A., Hoffman, S. A., Helfand, R. & Kostner, D. Dream bizarreness and the 
activation-synthesis hypothesis. Hum Neurobiol 6, 157-164 (1987). 
30 Hobson, J. A. & McCarley, R. W. The brain as a dream state generator: an activation-
synthesis hypothesis of the dream process. Am J Psychiatry 134, 1335-1348 (1977). 
 
 
Chapter 8 - Perspectives: 
• Neural basis of speech graph biomarkers in Schizophrenia. (collaboration with 
Prof. Lena Palaniyappan) 
Lena Palaniyappan*1,2, Natália Bezerra Mota*3, Shamuz Oowise4, Vijender Balain5, Mauro Copelli6, Peter 
Liddle1,2, Sidarta Ribeiro3 
Schizophrenia is a potentially devastating disease with complex genetic and 
environmental etiology, and still uncertain biomarkers. A longstanding notion in the 
concept of Schizophrenia is the prominence of loosened associative links in thought 
processes.  Assessment of such subtle aspects of thought disorders has proved to be 
a challenging task in clinical practice. Recently, speech graph analysis surfaced as a 
quantitative source of schizophrenia biomarkers related 
to structural speech disorganization, but the neural correlates remain unknown. To 
address this question, we investigated the structural connectedness of speech samples 
obtained from 56 patients with psychosis (22 with bipolar disorder, 34 with 
schizophrenia). We found a canonical correlation linking speech connectedness and i) 
functional plus anatomical brain measurements (degree centrality from resting state 
functional imaging and gyrification-based assessment of brain structure), ii) 
psychometric evaluation of thought disorder, iii) cognitive performance (speed 
deficits) and iv) global dysfunction in patients. Only speech connectedness was 
correlated with biological markers, and was a better predictor of 
cerebral disconnectivity than conventional diagnostic categories. Speech 
connectedness filled the dynamic range of responses much more efficiently than 
psychometric measurements of thought disorder. The results provide direct evidence 
that brain disconnectivity is linked to disconnected thought process in psychosis, 
better measured by graph analysis. 
  
Figure: Speech connectedness (lower in Schizophrenia) is the only behavioral 
measure to correlate with brain disconnectivity. A) The Schizophrenia group presents 
lower connectedness than the Bipolar group. B) Speech connectedness is correlated with 
brain disconnectivity (measured by VCC – variance of the degree centrality of the core 
hubs and LGI – gyrification index), with psychometric scales (measured by SSPI and 
TLI), with global functioning (measured by GAF and SOFAS) and with cognitive performance 
on DSST (Digit Symbol Substitution Score). C) None of the other behavioral measures (such 
as psychometric scales, global functioning or cognitive performance) was 
correlated with brain disconnectivity. D) Schematic summary of the main results, 
illustrating the main correlations searched for in this work. 
 
 
  
• Analogical Reasoning and graphs from gaze path and from verbal explanations. 
This is an ongoing collaboration with Silvia Bunge’s laboratory at the University of 
California, Berkeley, which started after the first publication of the speech graph 
methodology applied to cognitive development. Analogical reasoning is the skill to find 
correspondence between entities based on shared relationships 1, and its 
development is linked to learning abilities 1. In previous work we found a correlation 
between educational level and speech structure measured using graphs. In this project 
we aim to verify whether there is also a correlation between analogical reasoning skills and 
speech structure. We also aim to characterize efficient gaze trajectory (or gaze path) 
during the performance of an analogical reasoning task, also using graph theory. The 
hypothesis is that correct trials should present a more linear path with less 
recurrence than wrong trials (which should present a gaze path engaging more 
distractor nodes, with more recurrence and loops). 
• The role of working memory in structural language development. 
Part of the study participants presented in chapter 3 (45 of 74 children) were 
interviewed again almost one year later with the same memory report protocol. Also, 
in collaboration with Janaína Weissheimer and Renata Callipo from UFRN, the same 
children were tested on working memory abilities using the AWMA task 2. Based on 
the results of the first paper 3, we developed the hypothesis that working memory 
should be correlated with the speech structure pattern presented during 
development. As the working memory buffer starts out shorter in typically developing 
children, younger children should not be able to store much information while planning 
their speech, repeating the same terms at shorter distances during natural speech 
and thus producing memory graphs with more short-term recurrence. When they 
expand the working memory buffer, they can store more information related to a 
topic, increasing lexical diversity and the number of nodes in the largest connected 
components (producing more connected speech graphs). Both aspects of cognitive 
development should also be related to reading (better readers should present better 
performance on reading). We have reading performance data for these children 
from a 4-year observation project that ended in December 2016. 
• The study of dream reports in typically developing children. 
We intend to analyze the development of dream reports in these 45 children, who 
were assessed at two time points one year apart. We aim to verify whether 
there is a relationship between the ability to recall a dream and the repression of old 
memory contents. Freud observed, and discussed in his seminal book about 
dreams 4, that children start to repress their memory content at the end of early 
childhood (a period similar to the start of elementary school). On this hypothesis, by 
repressing memory content from earlier ages children also start to repress their dream 
recall. So, in this longitudinal study, the children should show diminished dream recall 
ability, as well as an increase in their age at their oldest memory (their oldest memory 
should be from later in their life). Also, the children that repressed their memories more 
(larger gap in oldest memory age between the first and the second interview) 
should present a lower index of dream recall, as well as a larger semantic distance between 
the oldest memory reports from the two interviews. The ones that keep recalling recent 
dreams should present more similarity between the oldest memory reports (they 
should still recall the same oldest memory). 
• Lucid dream induction after repetitive awakening during sleep transition. 
Another question related to chapter 7, raised after a pilot study, concerns dream 
lucidity. As the instruction to remember the visual mentation while dreaming can 
enhance awareness during sleep, could the repetitive awakening protocol induce 
dream lucidity? At the end of the protocol we added a nap in which the subject was 
instructed to signal with eye movements if he/she became aware of dreaming while 
dreaming. Sleep data were collected from 19 participants (11 males, mean age of 26.3 
years) and at the end of the nap they reported whether they had had a dream and whether 
they had been lucid while dreaming, answering two questionnaires to characterize dream 
lucidity 5,6. As a preliminary result, we found that 47% of the subjects were able to 
report a lucid dream after awakening and 37% were able to make the eye movement 
signal. We intend to analyze the sleep electrophysiology associated with the episodes 
marked with eye movements, and to compare the sleep transition data of subjects that 
were able to experience a lucid dream with those that were not. 
Figure: An example of a combined eye movement signal 
• Semantic similarity between vision and thought memory reports during wake or 
sleep transition dreams and brain connectivity. 
Also in the last experiment discussed in this thesis, data were collected so as to 
differentiate visual mentation and semantic thinking during sleep transition (as 
described in the methods section of chapter 7). The main hypothesis related to 
this experiment was that during waking trials, visual mentation and semantic thinking 
should be more similar than visual mentation and semantic thinking during sleep. We 
also hypothesized that this vision-thought dissociation during sleep (measured as 
decreased similarity between visual mentation and semantic thinking) should be 
accompanied by weaker coherence between frontal and occipital areas, mainly in high 
frequency bands. We intend to perform electrophysiological analysis on this dataset to 
test these hypotheses.  
• Genetics and cognitive deficits in Schizophrenia – Twins case reports. 
During the data collection 7, a family with twin sisters both diagnosed with 
Schizophrenia at the same time was identified. They shared positive symptoms content 
(of delusions and hallucinations), but only one of them had serious cognitive 
impairment and negative symptoms that justified two hospitalizations. Recent genetic 
evidence shows advances in the identification of biomarkers that are associated with 
cognitive impairment of psychosis and shared with other psychiatric diseases with 
disruption of neurocognitive development (like autism) 8. Given the different cognitive 
symptomatology in two genetically identical twins, we started a project to search for 
genetic biomarkers that could help understand the cognitive impairment associated 
with Schizophrenia.   
Discussion: 
After the presentation of results in each chapter of this thesis, we can move on to 
discuss the hypotheses raised in the beginning, starting from the main hypothesis: 
‘Natural language processing tools at the structural and semantic levels can precisely 
quantify naturalistic human behavior expressed by language and can be applied to 
understand cognitive pathology, development and dreams’. We demonstrated 
extensively that it is possible to advance in this direction and the application of this 
knowledge can reach different areas of expertise related to human behavior. Inspired 
by the discussion that basic and applied science can grow together and advance 
knowledge in a useful way 9, the path pursued here aimed to contribute in both 
directions.  
Understanding the behavioral phenomenon is necessary to produce mathematical 
abstractions and design computational tools able to make precise quantification of 
that behavior. This was the main strategy adopted in the development of the Speech 
Graph methodology 7,10, inspired by the psychopathological descriptions such as ‘word 
salad’ and ‘derailment’, which carry the idea of loss of an expected trajectory 
perceived in the flow of thoughts shared during spontaneous verbalization 11. The 
results analyzed in this thesis from different samples revealed that it is possible to 
characterize and precisely measure this type of symptoms and that these measures are 
predictive of diagnosis and clinical outcome 7,10.  Specifically the hypothesis ‘During 
recent-onset psychosis, subjects with Schizophrenia diagnosis should produce more 
fragmented graphs, and graph connectivity would be predictive of diagnosis and 
correlated with negative symptoms’ was confirmed 10. Not only speech structure, but 
also semantic incoherence was predictive of a psychotic break 12 (also a computational 
assessment inspired by the description of thought disorders) 13, and the combination of 
both strategies can improve these predictive measures 14. This result resembles the old 
psychopathological idea that psychotic diseases are behaviorally too complex and 
need a set of symptoms to be characterized 11. In summary, the publications 
presented here confirm that it is feasible to computationally measure psychometric 
symptoms that previously could only be described by trained psychiatrists, and that 
this knowledge can be applied to the clinical practice as a complementary tool, helping 
professionals to be more precise in their daily predictions.   
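
As a concrete illustration of how speech structure can be quantified, the sketch below builds a directed word graph (each distinct word is a node, each pair of consecutive words is a directed edge) and computes a few simple attributes: lexical diversity, repeated edges as a measure of recurrence, and the size of the largest strongly connected component as a measure of connectedness. It is a simplified sketch under stated assumptions and not the SpeechGraphs software: tokenization is plain whitespace splitting, the whole text is treated as a single word sequence, and the attribute names are chosen for this example.

```python
def speech_graph_attributes(text):
    """Build a directed word graph from a text and summarize its structure."""
    words = text.lower().split()
    nodes = set(words)
    edges = list(zip(words, words[1:]))  # consecutive words, in order

    successors = {word: set() for word in nodes}
    for source, target in edges:
        successors[source].add(target)

    def reachable(start):
        """All nodes that can be reached from start by following edges."""
        seen = {start}
        stack = [start]
        while stack:
            for nxt in successors[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    reach = {word: reachable(word) for word in nodes}
    # A strongly connected component groups nodes that reach each other.
    largest_strong = max(
        (sum(1 for m in reach[n] if n in reach[m]) for n in nodes),
        default=0,
    )
    return {
        "word_count": len(words),
        "nodes": len(nodes),
        "edges": len(edges),
        "repeated_edges": len(edges) - len(set(edges)),
        "lexical_diversity": len(nodes) / len(words) if words else 0.0,
        "largest_strongly_connected": largest_strong,
    }
```

For example, `speech_graph_attributes("the dog the dog the dog")` yields a graph with only two nodes and several repeated edges, whereas a report that keeps introducing new words and returning to earlier ones yields more nodes and a larger strongly connected component.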
One intriguing result is that not all content reports were able to represent 
characteristic structural markers of the schizophrenic group. In chapter 1, dream 
reports were more informative than waking reports 7. This result was replicated in a 
recent-onset psychosis sample, and the experiment revealed that short-term 
memories from affective images (mainly negative images) were also more informative 
compared to long-term memory reports or neutral short-term memory reports 10. This 
can be interpreted as evidence that these structural differences measured by graphs 
cannot be a general language feature; otherwise the results would not differ changing 
the report content. Rather, speech structure measured by graphs seems to be closely 
related to memory, specifically short-term memory, and affective valence seems to 
play an additional role in this process. 
To better understand this behavioral phenomenon, and to characterize speech 
structure from memory reports during typical development, we formulated the 
following hypothesis: ‘Children that show more advanced cognitive development 
(regarding general intelligence, theory of mind abilities and academic performance) 
should present more connected and less recursive memory report graphs’. As expected, 
the hypothesis was confirmed but only when short-term memory reports were 
analyzed 3. This also confirms the important role of short-term memory process in 
determining speech structural differences related to cognition far from the 
pathological point of view. With advances in this developmental perspective it is 
possible to note that a computational tool designed to measure psychopathological 
characteristics can actually measure cognitive features that are not exclusive to 
pathological populations, but are directly related to cognition and then severely 
impaired in the course of psychosis. Also, from an applied perspective, we 
characterized a relationship between speech structure and reading performance 
independent of general intelligence or theory of mind ability, pointing to a useful 
and low-cost way to screen for risk of learning difficulties. 
The previous results guide us to deepen this basic cognitive question in an even wider 
view and formulate the hypothesis: ‘Healthy subjects should present an increase of 
connectivity and lexical diversity, as well as a decrease of short-term recurrence related 
to age and education, and the same pattern of development would be expected in the 
analysis of literary texts across historical time’. We analyzed all datasets collected since 
the creation of the speech graph methodology from a developmental perspective, covering a 
large population with a wide variation in age and educational level. We added to the 
analysis a large sample of historical texts, from the first written texts to the present day (in 
collaboration with Sylvia Pinheiro, a master’s student from our laboratory). The 
analysis of both samples together allowed us to gain important insights related to 
speech structure development. First, it was possible to discuss the similarities between 
speech structure development from an individual perspective, across years of 
education, and speech structure development from a historical perspective, across 
literature development. Second, educational level explained speech structure 
development in a healthy population better than age did, but this development was not 
observed in a psychotic population, which retained a speech structure similar to that found 
in ancient texts. This is evidence of how speech structure can be influenced by cultural 
knowledge disseminated through education when cognitive development is not impaired. Third, 
we were able to observe how speech structure probably evolved during important 
historical periods already discussed in the literature (most of the changes in speech 
structure were observed between the end of the Bronze Age and the beginning of the 
Axial Age), and how this parallels cognitive development (or cognitive 
pathologies). 
From a different perspective, but also trying to deepen basic knowledge derived from 
the first paper published in this thesis 7, we pursued scientific explorations that could 
help us understand why dream reports are more informative about psychosis. The first 
strategy adopted was to describe lucid dream features in a psychotic sample (‘Dream 
lucidity (the ability to be aware of dreaming while dreaming) in patients undergoing 
psychosis’). Surprisingly, we found that patients from the schizophrenia group were 
more frequently able to control their dreams than the subjects from other groups. This 
result opens more questions related to the shared phenomenology between psychosis 
and dreams. The internal reality created from memory fragments during psychosis 
seems to help subjects to have higher cognitive control over the internal reality that is 
likewise created from memory fragments during dreaming. But this also seems to isolate the 
subject in his internal experiences, impairing his social behavior 15. It is also important 
to remember that dreams are compared to psychosis as a model to understand this 
pathology 16. 
At this point, our curiosity about memory processes during altered states of 
consciousness once again extended beyond the psychotic phenomena and guided us 
to a naturalistic characterization of memory reverberation during sleep , in pursuit of 
the last hypothesis ‘Do visual memories fade or reverberate during waking and 
hypnagogic sleep?’. This work was initially inspired by the feasibility to get a lot of 
dream reports from the same subject in the same experimental session using a 
repetitive awakening protocol during wake-to-sleep transition 17. The use of this 
protocol should be enough to naturalistically describe (in repeated trials mixing 
wakefulness and the first sleep stages) how semantic memory reverberates during these 
physiological stages. So far it has been possible to verify the hypothesis and to describe 
behavioral and electrophysiological differences in semantic memory 
reverberation between wakefulness and initial sleep, namely: the more relaxed 
the wakefulness and the more alert the initial sleep, the higher the image 
penetrance, thus linking this mnemonic process to the sleep transition. 
Altogether, we can conclude that, based on the data explored in this thesis, 
computational speech tools such as SpeechGraphs (related to speech structure) and 
latent semantic analysis or word2vec (related to semantic similarity) represent 
interesting methodologies to precisely measure complex human behavior 
naturalistically expressed through speech, spanning the possible basic questions 
related to human cognition and consciousness. 
  
Acknowledgments 
There are no words. A cliché, and the plain truth. I could not even think of writing acknowledgments in a language other than my own. And 
luckily for me, since my training took place entirely in Brazil, thanks to the dream of an enlightened person whom I had the 
immense good fortune to meet along my way, I can today write the acknowledgments of this work in Portuguese. But even 
so, there are no words... where these feelings dwell, there is only emotion. 
Sidarta, thank you so much for all your dreams! Thank you for believing in a better country, in a fairer planet! Thank you 
for seeing hope in humanity, for inspiring and reverberating love and justice with such readiness for this fight. More 
than 11 years ago I was irreversibly touched by your dreams, which today are mine, and from them children and fruits have been born and will be born 
that will keep sowing these ideas, and believing that we can be a fairer, more humane, more loving, more 
combative, more fraternal world. Thank you for our warrior son, who inspires me so much and feeds me with strength and energy, and for our new 
dear one; this legacy is for them, and always will be! 
Thank you so much, Mauro, for embarking on this adventure with us! You were always a safe harbor, the perfect combination of 
supervision with Sidarta! Whenever he took me to the stratosphere, you measured the angles and set the way back to 
having my feet on the ground. My main source of training in mathematical formalism, you armed me with the weapons of the elves and 
contributed so much to my deconstructing the fear of mathematics! Thank you for so much patience, dedication and trust. Thank you for 
believing! 
Thank you to all the dear friends who help us on this journey! To the ICe family, all of you, who make this a magical place
where dreams come true, especially the dear companions Pedro Petrovich and Raimundo Furtado (in memoriam), who embarked
on the adventure of creating the SpeechGraphs software, which yielded so many of the discoveries described in this thesis!
Thank you to the dear collaborators Janaína Weissheimer and the ACERTA family, you are wonderful! Ernesto Soares, great friend
and companion in adventures! To the dear Argentine hermanos (Mariano Sigman, Guillermo Cecchi, Diego Slezak, Facundo Carrillo,
Jacobo Sitt) and Uruguayan ones (Juan Valle-Lisboa, Álvaro Cabana), for the crazy journey toward computational psychiatry! To the dear
friends of the LASchool family, who make concrete the words of Sidney Strauss (science is friendship). Especially Silvia Bunge, who
opened the doors of her laboratory and, since our first meeting, has so supported and encouraged this path, thank you so much! Thank you
also for the opportunity to visit Professor Manuel Schabus.
Thank you to all the professors and colleagues, from here and from other places in the world, to those who believed and to those who did not;
all of you taught me important lessons about assertiveness, criticism, and skepticism, which are fundamental in science. Thank you
especially to the women mentors (I cite here only a few who marked me so much, such as Cecília Hedin Pereira, Silvia Bunge,
Maria Bernardete Cordeiro de Sousa, Katarina Svahn Leão, Kerstin Schimidt, Cláudia Vargas, Susan Fitzpatrick, Marcela Peña,
Elizabeth Spelke, Kathy Hirsh-Pasek, Roberta Golinkoff, Cheryl Corcoran, Elisa Dias, Susan Sara, and so many more) who were there to
inspire, encourage, and teach with their beautiful examples how the feminine makes a difference in science. I am very proud of all of us!
Thank you to the professors who accompanied me on the internal committee, Cláudio Queiroz and Sandro de Souza, for all their dedication
and patience! To the professors who made up this examination committee, for their willingness to be with us at this final moment and to
help us see other paths.
Thank you to the dear colleagues who accepted my help along their paths. I learned so much from you! Especially Adara
Resende, the first student who trusted my supervision (from which such a fun paper on lucid dreams was born)! Thank you,
Ana Raquel, for always embarking with so much energy and confidence on our adventures (from which knowledge about
our little ones in learning will come)! Obrigada/Thanks DeeAnn, who came from so far away (New York) inspired by lucid dreams, and
so early trusted my supervision, spending two months with us here in Natal, running experiments so naturally
that it led to recognition with an award in her country!
To all the volunteers, for their trust and participation! Without the participation of each one of you, none of this would have been possible.
I would never have been able to believe in and understand the value of science if it were not for my dear family. Thank you, mother (Digessila),
for teaching me the value of work in a woman's life, for showing me that it is possible to combine motherhood and productivity,
and for always believing in me, since childhood! Thank you, my father (Sílvio), for teaching me to fight for a better world, to value
intellectual life, for showing early on that life is hard, that we need to accept and understand criticism, and that we must seek the
best in ourselves to change the world! Thank you, my dear sister (Guta), for all your love, your enthusiasm, your lightness,
your laughter! If you had not taught me to meditate, what would have become of this doctorate... I cannot even imagine! Thank you, my
nephews (especially Caio), my brother (Leonardo), my aunts and uncles, my cousins, dear friends from so many places in
Brazil and the world, thank you so much!! Special thanks to my dear mother-in-law (Vera) and my dear father-in-law (Edson) for so much
support with our family, which allowed me to grow in my career while caring for our little Ernesto! Much
gratitude!
To the Arte de Nascer family, which so lovingly welcomes my family into its affectionate daily life. Especially two great
friends, Carolina Damásio and Angelita Araújo, who are my chosen family here, a divine gift of the practice of love in our day-to-day
life, much gratitude!
And finally, thank you so much, my dear Ernesto, my firstborn, source of so much light, with the most beautiful and radiant smile! To my dear
baby, may you come to our little home, which awaits you with so much love! And last, my love, my companion, my dear
Sidarta! My boys, my loves!
 
  
Financial Support: 
Work supported by UFRN; Conselho Nacional de Desenvolvimento Científico e
Tecnológico (CNPq), grants Universal 480053/2013-8 and 408145/2016-1 and Research
Productivity 308775/2015-5 and 310712/2014-9; Coordenação de Aperfeiçoamento de
Pessoal de Nível Superior (CAPES), projects OBEDUC-ACERTA 0898/2013 and STIC
AmSud 062/2015; Fundação de Amparo à Ciência e Tecnologia do Estado de
Pernambuco (FACEPE); Center for Neuromathematics of the São Paulo Research
Foundation FAPESP (grant 2013/07699-0); and Boehringer-Ingelheim International GmbH
(grant 270561).
  
Scientific Publications and Press: 
Number of publications: 18 papers with 165 citations on Google Scholar Citations, h-index = 6, i10-index = 4
PUBLICATIONS during PhD (total 15 papers, 10 as first author) 
1) Mota NB, Copelli M, Ribeiro S (2017) Thought disorder measured as random speech structure 
classifies negative symptoms and Schizophrenia diagnosis 6 months in advance. NPJ Schizophrenia. 
DOI: 10.1038/s41537-017-0019-3 
2) Ribeiro S, Mota NB, Fernandes VR, Deslandes AC, Brockington G, Copelli M (2017) Physiology and 
assessment as low-hanging fruit for education overhaul. Prospects DOI 10.1007/s11125-017-9393-x 
UNESCO IBE. (Review paper) 
3) Ribeiro S, Mota NB, Copelli M (2016) Rumo ao cultivo ecológico da mente. Propuesta Educativa 46 
Año 25, (2) 42-49.  (Review paper) 
4) Mota NB, Carrillo F, Slezak DF, Copelli M, Ribeiro S (2016). Characterization of the relationship 
between semantic and structural language features in psychiatric diagnosis in Fiftieth Asilomar 
Conference on Signals, Systems and Computers.   (IEEE Conference Publishing). DOI: 
10.1109/ACSSC.2016.7869165 
5) Mota NB, Weissheimer J, Madruga B, Adamy N, Bunge SA, Copelli M, Ribeiro S (2016) A Naturalistic 
Assessment of the Organization of Children's Memories Predicts Cognitive Functioning and Reading 
Ability. Mind, Brain, and Education 10 (3), 184-195.  DOI 10.1111/mbe.12122
Citations: 8 
6) Mota NB*, Resende A, Mota-Rolim SA, Copelli M, Ribeiro S* (2016) Psychosis and the Control of 
Lucid Dreaming Frontiers in psychology, (7) 294, doi: 10.3389/fpsyg.2016.00294 (*shared 
corresponding author) 
Citations: 5 
7) Mota NB, Copelli M, Ribeiro S (2016) Computational Tracking of Mental Health in Youth: Latin 
American Contributions to a Low‐Cost and Effective Solution for Early Psychiatric Diagnosis. New 
directions for child and adolescent development 2016 (152), 59-69. (Review paper) 
Citations: 7 
8) Bedi G, Carrillo F, Cecchi GA, Slezak DF, Sigman M, Mota NB, Ribeiro S, Javitt DC, Copelli M,
Corcoran CM (2015) Automated analysis of free speech predicts psychosis onset in high-risk youths. 
npj Schizophrenia 1, Article number: 15030 doi:10.1038/npjschz.2015.30. 
http://www.nature.com/articles/npjschz201530 
Citations: 42 
9) Carrillo F, Mota N, Copelli M, Ribeiro S, Sigman M, Cecchi G, Slezak DF (2014) Automated Speech 
Analysis for Psychosis Evaluation. International Workshop on Machine Learning and Interpretation 
in Neuroimaging. Springer International Publishing 
10) Bertola L*, Mota NB*, Copelli M, Rivero T, Diniz BR; Romano-Silva MA, Ribeiro S, Malloy-Diniz LF 
(2014) Graph analysis of verbal fluency test discriminate between patients with Alzheimer's disease, 
mild cognitive impairment and normal elderly controls. Frontiers in Aging Neuroscience, v. 6, p. 1-10.
http://journal.frontiersin.org/article/10.3389/fnagi.2014.00185/abstract
Citations: 19 (*Shared first authorship)
11) Mota NB, Furtado R, Maia PPC, Copelli M, Ribeiro S (2014) Graph analysis of dream reports is 
especially informative about psychosis. Scientific Reports 4: e3691. doi:10.1038/srep03691. 
http://www.nature.com/srep/2014/140115/srep03691/full/srep03691.html 
Citations: 31  
 
Pre-print Papers 
12) Mota NB*, Pinheiro S*, Sigman M, Slezak DF, Cecchi G, Copelli M, Ribeiro S (2016) The ontogeny of 
discourse structure mimics the development of literature. arXiv preprint arXiv:1612.09268 (*Shared
first authorship)
Citations: 2 
13) Carrillo F, Mota N, Copelli M, Ribeiro S, Sigman M, Cecchi G, Slezak DF (2014) Emotional Intensity 
analysis in Bipolar subjects. arXiv preprint arXiv:1606.02231  
Citations: 1 
 
 
In Press  
14) Mota NB, Copelli M, Ribeiro S (2017) Graph Theory applied to speech: Insights on cognitive deficit 
diagnosis and dream research. In: Language, Cognition, and Computational Models. Edited by 
Thierry Poibeau and Aline Villavicencio. Publisher: Cambridge University Press, in press. (Review
paper) 
 
Under Review 
15) Mota NB*, Pinheiro S*, Sigman M, Slezak DF, Cecchi G, Copelli M, Ribeiro S (2017) Bronze Age texts 
are structurally similar to verbal reports from both children and psychotic subjects. Nature Human 
Behaviour.
 
In preparation 
16) Mota NB, Soares E, Altszyler E, Muto V, Heib D, Schabus M, Copelli M, Ribeiro S. Semantic 
memory reverberation during sleep onset correlates with different frequency band power 
during waking and sleep 
 
 
INVITED TALKS 
 
International: 
1) Investigator Meeting for Boehringer Ingelheim Pharmaceuticals, Inc. at Orlando, United States of 
America, Apr 2017; 
2) 50th Asilomar Conference on Signals, Systems and Computers at Asilomar Conference Grounds,
California, United States of America, Nov 2016; 
3) Equality of opportunity: What does science tell us? Contributions from research in economics, 
education and neuroscience 2016 at Pontificia Universidad Catolica de Chile, Santiago, Chile; 
4) Laboratory for "Sleep, Cognition and Consciousness Research" Seminar 2016 at University of 
Salzburg, Salzburg, Austria; 
5) 2015 Joint Retreat Brain Institute UFRN – Uppsala University at Roccaraso, Italy;
6) 2014 Joint Retreat Brain Institute UFRN – Uppsala University at Stöten, Sweden; 
 
National: 
1) III Jornada de Fonoaudiologia 2017 at Depto de Farmácia, UFRN, Natal, Brazil;
2) House Symposium Brain Institute 2015 at Imirá Hotel, Natal, Brazil;
3) Pipa Brain Institute UFRN – Uppsala University retreat 2016 at Natal, Brazil; 
4) VII Simpósio de Psicobiologia 2015 at Federal University of Rio Grande do Norte auditorium, Natal, 
Brazil; 
5) DEB’s Seminar 2015 at Federal University of Rio Grande do Norte campus, Natal, Brazil; 
6) I Jornada de Neuropsiquiatria e Psicologia Infantil 2015 at Onofre Lopes’ University Hospital, Natal, 
Brazil; 
7) 2ª Conferência em Linguística e Neurociências 2014 at Federal University of Santa Catarina, 
Florianópolis, Brazil; 
8) Second Brazilian Meeting on Brain and Cognition 2013 at Federal University of ABC, São Paulo, 
Brazil;  
 
HONORS & AWARDS 
 
2016 6th Latin American School for Education, James S. McDonnell Foundation 
2015 5th Latin American School for Education, James S. McDonnell Foundation 
2014 4th Latin American School for Education, James S. McDonnell Foundation 
2013 Honra ao Mérito, Sociedade Brasileira de Neurociências - SBNeC. 
 
RESEARCH INTERNATIONAL EXPERIENCE: 
 
Jan 2016 to Feb 2016 Research training in the Laboratory for "Sleep, Cognition and Consciousness 
Research" at University of Salzburg, Salzburg, Austria. 
Nov 2016 to Nov 2016 Research training in the “Building Blocks of Cognition Laboratory” at Helen 
Wills Neuroscience Institute, Department of Psychology, University of 
California at Berkeley, Berkeley, USA. 
GRANT FUNDING 
Boehringer-Ingelheim International GmbH (grants # 270906 and 270561). 
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES): Projects ACERTA  
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES): STIC AmSud 
 
ROLE AS REVIEWER  
IBM Journal: 
• G. A. Cecchi, V. Gurev, S. J. Heisig, R. Norel, I. Rish, S. R. Schrecke. Computing the structure of 
language for psychiatric evaluation. IBM Journal of Research and Development. 61, (2/3), 1-10. 
2017. Doi: 10.1147/JRD.2017.2648478 
Frontiers in Psychology: 
• Social Cognition in Schizophrenia: A network-based approach to a Taiwanese Version of the Reading 
the Mind in the Eyes Test (not accepted) 
• N. Dagnall, A. Denovan,  K. Drinkwater, A. Parker,  P. Clough. Toward a Better Understanding of the 
Relationship between Belief in the Paranormal and Statistical Bias: The Potential Role of 
Schizotypy. Frontiers in Psychology. 14. 2016. Doi: 10.3389/fpsyg.2016.01045 
 
 
  
MEDIA COVERAGE

17 national print articles, 12 international print articles, 1 national television interview, 2
local television interviews, and 1 popular-science article (attached)
 
➢ Mota NB, Furtado R, Maia PPC, Copelli M, Ribeiro S (2014) Graph analysis of dream reports is
especially informative about psychosis. Scientific Reports 4: e3691. doi:10.1038/srep03691. 
• How You Describe a Dream Could Help Determine What Kind of Psychosis You Have
  Smithsonian Magazine: http://www.smithsonianmag.com/smart-news/how-you-describe-dream-could-help-determine-what-kind-psychosis-you-have-180949652/
• Brasileiros criam software que diagnostica doenças mentais traduzindo sonhos
  iG (Brazil): http://saude.ig.com.br/2014-05-19/brasileiros-criam-software-que-diagnostica-doencas-mentais-traduzindo-sonhos.html
• Discurso sobre sonho pode ajudar no diagnóstico de doenças mentais
  Jornal do Brasil: http://www.jb.com.br/ciencia-e-tecnologia/noticias/2014/03/18/discurso-sobre-sonho-pode-ajudar-no-diagnostico-de-doencas-mentais/
• Diagrama de sonhos ajuda no diagnóstico de psicose
  Folha de São Paulo: http://www1.folha.uol.com.br/ciencia/2014/01/1399472-diagrama-de-sonhos-ajuda-no-diagnostico-de-psicose.shtml
• Cientistas brasileiros mostram que sonhos podem ajudar no diagnóstico de doenças mentais
  Veja: http://veja.abril.com.br/ciencia/cientistas-brasileiros-mostram-que-sonhos-podem-ajudar-no-diagnostico-de-doencas-mentais/
• Dream Meanings Could Reveal Possible Case Of Psychosis, Based On Your Speech Patterns
  Medical Daily: http://www.medicaldaily.com/dream-meanings-could-reveal-possible-case-psychosis-based-your-speech-patterns-282758
• Dream analysis reveals if you are psychotic
  Real Clear Science: http://www.realclearscience.com/journal_club/2014/02/02/dream_analysis_reveals_if_you_are_psychotic_108486.html
• What Dreams Mean And What They Say About You, Based On Science
  Medical Daily: http://www.medicaldaily.com/what-dreams-mean-and-what-they-say-about-you-based-science-314558
• The way you talk could reveal if you are psychotic
  Business Insider: http://www.businessinsider.com/dream-descriptions-could-reveal-psychosis-2014-5
• Diferenças dos relatos de sonhadores
  JCNET: http://www.jcnet.com.br/Saude/2014/03/diferencas-dos-relatos-sonhadores.html
• O que os sonhos tem a dizer sobre a saúde
  Revista Saúde: http://saude.abril.com.br/bem-estar/o-que-os-sonhos-tem-a-dizer-sobre-a-sua-saude/
• Discurso sobre o sonho pode ajudar no diagnóstico de doenças mentais
  Agência FAPESP: http://agencia.fapesp.br/discurso_sobre_o_sonho_pode_ajudar_no_diagnostico_de_doencas_mentais/18760/
• Análisis matemático de los sueños
  El Mundo: http://www.elmundo.es/baleares/2016/07/26/57977069e5fdea69288b4624.html
• Sonhos podem ser interpretados por ferramentas matemáticas?
  Diário da Saúde: http://diariosaude.com.br/print.php?article=sonhos-interpretados-ferramenta-matematica
 
➢ Bedi G, Carrillo F, Cecchi GA, Slezak DF, Sigman M, Mota NB, Ribeiro S, Javitt DC, Copelli M,
Corcoran CM (2015) Automated analysis of free speech predicts psychosis onset in high-risk youths. 
npj Schizophrenia 1, Article number: 15030 doi:10.1038/npjschz.2015.30.  
• Computers Can Predict Schizophrenia Based on How a Person Talks
  The Atlantic: https://www.theatlantic.com/technology/archive/2015/08/speech-analysis-schizophrenia-algorithm/402265/
• IBM Watson, Using Speech Analysis Techniques, Correctly Identifies Patients At-Risk For Psychosis
  Medical Daily: http://www.medicaldaily.com/ibm-watson-using-speech-analysis-techniques-correctly-identifies-patients-risk-349794
• Predecir qué personas se volverán psicóticas mediante el análisis por ordenador de su habla
  NCYT - Noticias de la Ciencia y la Tecnología: http://noticiasdelaciencia.com/not/15773/predecir-que-personas-se-volveran-psicoticas-mediante-el-analisis-por-ordenador-de-su-habla/
• Computer can predict if you'll develop psychosis with 100% accuracy – study
  RT Network: https://www.rt.com/news/313742-computer-schizophrenia-psychosis-diagnosis/
• Psychiatrie : l’algorithme qui prédit les psychoses
  Sciences et Avenir: https://www.sciencesetavenir.fr/sante/e-sante/psychiatrie-l-algorithme-qui-predit-les-psychoses_19537
• Prédire la schizophrénie par ordinateur
  Le Monde: http://www.lemonde.fr/sciences/article/2015/08/31/predire-la-schizophrenie-par-ordinateur_4741621_1650684.html
• Raio X da Mente
  Mente & Cérebro: http://www2.uol.com.br/vivermente/artigos/raio_x_da_mente.html
 
➢ Mota NB*, Resende A, Mota-Rolim SA, Copelli M, Ribeiro S* (2016) Psychosis and the Control of
Lucid Dreaming Frontiers in psychology, (7) 294, doi: 10.3389/fpsyg.2016.00294 (*shared 
corresponding author) 
• Educação em Pauta IFRN TV Câmara Natal (YouTube): 
https://www.youtube.com/watch?v=EVhd_xm752o 
 
➢ Mota NB, Copelli M, Ribeiro S (2017) Thought disorder measured as random speech structure
classifies negative symptoms and Schizophrenia diagnosis 6 months in advance. NPJ Schizophrenia. 
DOI: 10.1038/s41537-017-0019-3 
• Novo método pode ajudar no diagnóstico da esquizofrenia
  Band TV / Jornal da Band (YouTube): http://noticias.band.uol.com.br/jornaldaband/videos/ultimos-videos/16240201/novo-metodo-pode-ajudar-no-diagnostico-da-esquizofrenia.html
• Pesquisa do Instituto do Cérebro da UFRN auxilia diagnóstico e o tratamento da esquizofrenia
  TVU RN (YouTube): https://www.youtube.com/watch?v=HZzQ-_YlmF8
• Abnormal speech in someone showing early signs of psychosis can help doctors diagnose schizophrenia
  Marie Barabas: https://plus.google.com/106010668250647143671/posts/bAmMnxZE8pQ
• Brasileiros criam método para diagnosticar esquizofrenia na primeira consulta
  O Estado de São Paulo - Estadão: http://ciencia.estadao.com.br/noticias/geral,brasileiros-criam-metodo-para-diagnosticar-esquizofrenia-na-primeira-consulta,70001826266
• Método de diagnóstico que analisa a fala dos pacientes prevê casos de esquizofrenia
  R7 Notícias: http://noticias.r7.com/saude/metodo-de-diagnostico-que-analisa-fala-dos-pacientes-preve-casos-de-esquizofrenia-com-80-de-precisao-05062017
• Brasileiros criam teste para detectar esquizofrenia mais cedo
  Veja: http://veja.abril.com.br/saude/brasileiros-criam-teste-para-detectar-esquizofrenia-mais-cedo/
• Esquizofrenia, diagnóstico, método, precisão - Em teste, avaliação previu problema com 80% de precisão
  Isto É: http://istoe.com.br/tag/esquizofreniadiagnosticometodoprecisao/
• O novo teste pode detectar a esquizofrenia mais cedo
  24horasPB: http://24horaspb.com/Portal/home/2016-02-23-20-58-18/tecnologia/item/28587-brasileiros-criam-teste-que-detecta-esquizofrenia-mais-cedo
• Novo método faz diagnóstico de esquizofrenia
  Amazonas Atual: http://amazonasatual.com.br/novo-metodo-faz-diagnostico-precoce-de-esquizofrenia/
• Nova técnica consegue diagnosticar esquizofrenia precoce
  Revista Exame: http://exame.abril.com.br/ciencia/nova-tecnica-consegue-diagnosticar-esquizofrenia-precoce/
• Método faz diagnóstico precoce de esquizofrenia
  O Livre: http://www.olivre.com.br/geral/metodo-faz-diagnostico-precoce-de-esquizofrenia/4145
• Novo método diagnostica esquizofrenia em 30 minutos, técnica tradicional leva 6 meses
  Site VIX: http://www.vix.com/pt/saude/546409/novo-metodo-diagnostica-esquizofrenia-em-30-minutos-tecnica-tradicional-leva-6-meses
psychiatry and computing

Measuring behavior to understand psychosis

The application of mathematical theory to characterize the relationship
between words (and, indirectly, thoughts and memories)
has enabled the quantification of symptoms of mental disorders
that were previously described only subjectively

by Natália B. Mota, Mauro Copelli and Sidarta Ribeiro

THE AUTHORS
NATÁLIA BEZERRA MOTA is a psychiatrist and doctoral candidate at the Brain Institute
of the Federal University of Rio Grande do Norte (UFRN). MAURO COPELLI holds a
doctorate in physics and is an associate professor at the Federal University of
Pernambuco (UFPE). SIDARTA RIBEIRO is a neurobiologist, holds a doctorate in
neuroscience, and is a full professor and director of the Brain Institute at UFRN.

Every day, upon waking, we begin the complex work of judging our surroundings in
search of signs of stability and predictability. Coming from our dreams, as we open
our eyes, we are reassured to perceive that everything is normal, everything in its
place. We even get bored with our repetitive routines and with the planning needed
to manage everything we wish to do. At the end of the day, we go to sleep at ease
with the sense of mission accomplished, or with the sense that we missed important
goals, which are left for tomorrow. But always with the certainty that, upon waking,
we will be in the same place, in the company of the same people, with everything
scheduled and predictable.
However, this reality, with the settings and characters that surround us and confirm
that everything is in its place, can for various reasons suddenly lose its original
meanings. Think of waking up in a place that used to be your home but is now a cold,
distant, unknown space. Worse: imagine that the people with whom you share your
bedroom, your living room, your office suddenly seem like strangers. How can you
trust, when the feeling of fear, strangeness, and insecurity feels more and more real,
even though the old reality tries to convince you, by repetition, that everything is
fine and in its place? Then you begin to understand reality in other ways. And you
realize that only you can grasp the whole malign conspiracy to destroy your family,
or the planned invasion by beings from another planet disguised as dedicated and
kind people.
Soon you begin to hear a very real voice, at first inside your head, then outside it
and clearer than any other voice. At first it is not possible to understand what it
says, but as the fear grows, so does the certainty that a parallel reality exists, and
so does the clarity of what becomes the only trustworthy voice, which understands
you, advises you, guides you, and even orders you how to act.
At this point you are already completely distant from those people you once
recognized as family, friends, colleagues. It is not even possible to know who
these people are who force you to accept the old, threatening reality.
This break in contact with the reality shared by one's peers is what we call
psychosis. We can view it as a syndrome, a set of signs and symptoms. Just as the
flu syndrome presents fever, cough, and sore throat, the psychotic syndrome is
characterized by symptoms such as delusions (strong belief in ideas that do not
match the reality shared by one's group) and hallucinations (perception of
nonexistent environmental stimuli, such as hearing voices when there is no sound
in the environment or seeing something in an empty place). The causes of these
symptoms can be secondary to another disorder, such as substance intoxication or a
neurological disease, such as brain tumors, epilepsies, or degeneration of nervous
tissue. They can also be of primary origin, that is, when, after a detailed
assessment with careful listening to the patient and companion, along with clinical
and imaging exams, no neurological causes are identified. In most cases, other signs
and symptoms are found that make up the conditions described in diagnostic manuals
as schizophrenia or bipolar mood disorder. Currently these diagnoses are guided by
diagnostic manuals (a consensus among specialists in several countries) that dictate
which signs and symptoms make up each diagnostic entity, for how long they must be
observed, and in what combination. Unfortunately, after a century of research since
the beginning of psychiatry, we still have no biological markers for these disorders.
Quantifying these phenomena becomes as challenging as quantifying the perception of
reality itself.
The way patients express themselves reveals two quite distinct ways of thinking:
one highly fragmented and disorganized, and another accelerated, unfocused, and
full of associations with diverse topics. Well-trained psychiatrists can perceive
these characteristics of the lines of reasoning expressed in word trajectories,
especially when they know their patient well. Quantifying these differences,
however, is still a challenge.
The idea of looking at the flow of thought to characterize psychoses is not new, nor
are the mathematical models that aim to specify trajectories and the relationship
among their elements. In 1736 graph theory emerged as a mathematical tool for
understanding the structure of relationships among the elements of a phenomenon. A
graph is a set of nodes (elements) linked to one another by edges which, when
directed, are represented by arrows. With this model, we can understand the
complexity of the relationships among elements of networks of the most diverse
natures (biological, technological, and social) and structurally characterize, for
example, the relationships among airports and airline networks and/or websites on
the internet. Recently, the application of graph theory to characterize the
relationship between words (indirectly, thoughts or memories) has also enabled the
quantification of thought disorders that previously were only described
subjectively.

THE SPEECH GRAPHS of schizophrenic, bipolar and control subjects are more varied for
dream than for waking reports. (A) Graphs were generated from transcribed verbal
reports using custom-made Java software (http://neuro.ufrn.br/softwares/speechgraphs).
Drawing by NM. (B) Representative speech graphs extracted from dream reports from a
schizophrenic, a bipolar and a control subject.

[Figure, panel A: the sentence "Eu estava sonhando com um show" ("I was dreaming
about a concert") drawn as a word graph; panel labels: ESQUIZOFRENIA (schizophrenia),
BIPOLAR SEM PSICOSE (bipolar without psychosis).]

November 2015 • Mente & Cérebro
SUBJECT, OBJECT AND VERB
In 2012, dream reports were analyzed from patients diagnosed with schizophrenia and
with bipolar mood disorder in the manic phase (eight in each group), in comparison
with eight other people who had no psychotic symptoms. After a syntactic analysis
identifying the subject, object, and verb of each sentence, these elements were
represented as nodes and their sequences as arrows, indicating the trajectory of
words. In addition, the elements used to speak about the subject and those that
strayed from the topic (the dream) were counted. The process made it possible to
characterize symptoms such as logorrhea (increased speech content, reflected in a
higher number of words) and flight of ideas (more elements used to talk about
subjects other than the original question).
In 2014, the method of representing text as word-trajectory graphs was automated:
each word came to be represented by a node, and the sequence of words by arrows
(directed edges). Dream reports from a larger number of participants (20 in each
group) were analyzed, controlling for differences in total word count. It was thus
possible to show greater connectivity among words in the speech of volunteers
without symptoms of psychosis, followed by reports from people diagnosed with
bipolar mood disorder. Last came the least connected speech (fewer edges and fewer
nodes in the subgraphs in which all nodes are connected to one another more or less
closely), from people with schizophrenia. These objectively measurable
characteristics are related to symptoms such as difficulties in reasoning and in
relating to other people, as measured by psychiatrists through conventional
subjective methods (which require a trained specialist to score each symptom listed
in the scales).
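
The automated representation described above (each word a node, each transition between consecutive words a directed edge) can be sketched in a few lines of Python. This is a minimal illustration and not the SpeechGraphs software itself, which the figure caption describes as custom-made Java software. The function names, the whitespace tokenization, and the reading of the two kinds of "subgraphs" as the largest connected and largest strongly connected components are assumptions made here.

```python
from collections import defaultdict


def build_word_graph(text):
    """Return (nodes, edges): unique lowercase words and unique directed word pairs."""
    # Assumption: plain whitespace tokenization, no punctuation handling.
    words = text.lower().split()
    nodes = set(words)
    # One arrow from each word to the word that follows it in the report.
    edges = set(zip(words, words[1:]))
    return nodes, edges


def _reachable(start, adjacency):
    """Nodes reachable from `start` by following `adjacency` (iterative search)."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def connectivity_measures(text):
    """Hypothetical helper: node/edge counts and largest component sizes."""
    nodes, edges = build_word_graph(text)
    forward = defaultdict(set)
    backward = defaultdict(set)
    undirected = defaultdict(set)
    for src, dst in edges:
        forward[src].add(dst)
        backward[dst].add(src)
        undirected[src].add(dst)
        undirected[dst].add(src)

    largest_connected = 0
    largest_strongly = 0
    for node in nodes:
        # Connected component: ignore arrow direction.
        largest_connected = max(largest_connected, len(_reachable(node, undirected)))
        # Strongly connected component: nodes that `node` reaches AND that reach `node`.
        strong = _reachable(node, forward) & _reachable(node, backward)
        largest_strongly = max(largest_strongly, len(strong))

    return {
        "nodes": len(nodes),
        "edges": len(edges),
        "largest_connected_component": largest_connected,
        "largest_strongly_connected_component": largest_strongly,
    }
```

For the figure's example sentence, `connectivity_measures("Eu estava sonhando com um show")` gives a chain of 6 nodes and 5 edges with no word revisited, so the largest strongly connected component has a single node. A report that returns to earlier words closes loops and raises that measure. The quadratic search is chosen for readability and is adequate only for short reports.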
Something that for years has been described as disorder or disorganization of
thought now seems possible to characterize and measure, which opens possibilities
for less subjective quantification, and therefore one less subject to differences
in opinion and training. This raises the possibility of a method that can guide the
psychiatrist in evaluating patients, both to follow the evolution of symptoms and
check whether treatment is sufficient and, in urgent situations, to support
management decisions based on more informative data: for example, in a first
psychotic episode, when it is not known how the condition will evolve and the
patient must be observed closely for at least six months to reach a differential
diagnosis between schizophrenia and bipolar mood disorder. In this scenario,
quantifying the connectivity between words in the initial reports allows automatic
classification of the groups with more than 90% accuracy.

[Figure, panel B: speech graphs labeled ESQUIZOFRENIA (schizophrenia) and BIPOLAR
SEM PSICOSE (bipolar without psychosis), built from the sentence "Eu estava sonhando
com um show".]

Names, symptoms and suffering
Since the end of the 19th century, there has been a need to characterize the
psychotic syndrome and its main causes, in order to better understand what generates
this suffering, so harmful both to patients and to those close to them. The main way
of doing so was to carefully observe the behavior of patients in different parts of
the world, with varied histories, detailing what they had in common. This was no
simple task at a time when scientific communication was restricted to those who had
access to the few journals printed in the major centers.

At the same time that Freud was publicizing his descriptions of the psyche, focusing
more on disorders we know today as “neurotic”, Emil Kraepelin proposed categorizing
psychotic disorders not only by the detailed description of symptoms proper or
exclusive to one pathology or another, but by observing patterns of symptoms and
their course over time. He suggested two diagnoses, which at the time he called
manic-depressive psychosis (which encompassed everything from what we today call
major depressive disorder to bipolar mood disorder) and dementia praecox (currently
known as schizophrenia).

Kraepelin noticed that both could present the same symptoms over time, but in mood
disorders the patient presented a more prominent set of symptoms involving mood
swings between euphoria and depression, often with symptom-free periods, whereas in
schizophrenia there was a more deteriorating course of cognitive functions, with
deeper and irreversible changes in the personality described before the onset of
symptoms.

Eugen Bleuler, who coined the term “schizophrenia”, already proposed investigating
the core symptoms of mental disorders in order to understand them better. He
suggested the in-depth study of core symptoms such as impaired affectivity,
ambivalence, and what he called thought disorders, to understand the pathological
core of the disorder. Contemporaries such as Kraepelin also described these thought
disorders, as did several authors over the years, preserving the central idea that
in schizophrenia there is a loosening of associations between the ideas perceived in
patients' reports, which begin as small incoherences and can reach the point of
generating reports so disorganized that they are described as “word salads”. In mood
disorders, mainly in manic states, these thought disorders are characterized by a
high speed of reasoning, which generates an increase in the number of words spoken,
greater pressure of speech, the chaining of several stories in sequence (flight of
ideas), and difficulty maintaining focus and objectivity in the report.
POSSIBILITY OF SELF-EXAMINATION
In practice, this can mean fewer errors in diagnosis and in the initial management of the condition, as well as less stigma (since a better understanding of the nature of the phenomenon demystifies labels over time). Returning to the analogy with the flu-like syndrome: just as the physician combines the clinical assessment of the patient with a blood cell count (hemogram) to conclude the diagnosis of a bacterial or viral infection, combining clinical assessment with automated speech analysis will make it possible to conclude the diagnosis of schizophrenia or bipolar mood disorder as the explanation for the subject's psychotic syndrome, and to monitor the response to treatment.
One striking result of this study is the better distinction between the groups when participants are asked for a dream report. When reports of daily life (the day before the dream) are analyzed, the differences in connectivity between the groups are much more subtle, and the relationships with symptoms measured by standardized scales are not found. Bleuler, Kraepelin and especially Freud all wrote about the similarities between the dream phenomenon and the psychotic phenomenon. In both there is belief in absurd realities that break the patterns we conventionally call normal, yet which we accept without criticism or questioning. Does reporting a reality that is by nature closer to the psychotic experience exacerbate the disorganization of thought? The fact is that subjects without psychotic symptoms and subjects with bipolar mood disorder report their dreams in a more connected and complex way than when they report the day before the dream, showing an attempt to organize a report of less predictable content.
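The notion of connectivity between words can be illustrated with a minimal sketch. It assumes, purely for illustration, that each distinct word is a node, that each pair of consecutive words adds a directed edge, and that connectivity is summarized by the size of the largest strongly connected component (the largest set of words that can all reach one another). The function names and the second example sentence are hypothetical and are not taken from the studies discussed here.

```python
from collections import defaultdict


def build_word_graph(text):
    """Directed graph: one node per distinct word, one edge per pair of consecutive words."""
    words = text.lower().split()
    graph = defaultdict(set)
    for word in words:
        graph[word]  # touching the key registers the word as a node
    for src, dst in zip(words, words[1:]):
        graph[src].add(dst)
    return dict(graph)


def reachable(graph, start):
    """Set of nodes that can be reached from `start` by following edges."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def largest_strongly_connected_component(graph):
    """Size of the largest group of words that are mutually reachable."""
    reach = {node: reachable(graph, node) for node in graph}
    best = 0
    for node in graph:
        size = sum(1 for other in reach[node] if node in reach[other])
        best = max(best, size)
    return best


def graph_summary(text):
    """Node count, distinct edge count and largest strongly connected component."""
    graph = build_word_graph(text)
    return {
        "nodes": len(graph),
        "edges": sum(len(targets) for targets in graph.values()),
        "lsc": largest_strongly_connected_component(graph),
    }


# The report shown in the figure: a straight path with no recurrence.
linear = graph_summary("Eu estava sonhando com um show")
# Hypothetical report that returns to earlier words, creating a loop.
looping = graph_summary("eu vi um show e eu vi um amigo")
print(linear, looping)
```

A report that never revisits a word yields a component of size one, while a report that returns to earlier words builds loops and a larger component. This brute-force reachability check is only suitable for short texts.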
FURTHER READING
Automated analysis of free speech predicts psychosis onset in high-risk youths. Gillinder Bedi, Facundo Carrillo, Guillermo A. Cecchi, Diego Fernández Slezak, Mariano Sigman, Natália B. Mota, Sidarta Ribeiro, Daniel C. Javitt, Mauro Copelli and Cheryl M. Corcoran, in NPJ Schizophrenia, no. 15030. Published online 26 August 2015.
Graph analysis of dream reports is especially informative about psychosis. Natália B. Mota, Raimundo Furtado, Pedro P. C. Maia, Mauro Copelli and Sidarta Ribeiro, in Scientific Reports, no. 3691. Published online 15 January 2014.
Speech Graphs Provide a Quantitative Measure of Thought Disorder in Psychosis. Natalia B. Mota, Nivaldo A. P. Vasconcelos, Nathalia Lemos, Ana C. Pieretti, Osame Kinouchi, Guillermo A. Cecchi, Mauro Copelli and Sidarta Ribeiro, in PLoS One. Published online 9 April 2012.
Quantifying incoherence in speech: an automated methodology and novel application to schizophrenia. Elvevåg B. and others, in Schizophrenia Research, vol. 93, pp. 304-316; July 2007.

However, word trajectories are not the only way to characterize thought disorders. Incoherence in associations can also be measured. If we treat as similar, or semantically close, words that frequently occur in the same texts, we can measure how often pairs of words co-occur across many texts, building a database large enough to be representative. The distance between consecutive words in a report can then be calculated, and the report with the greater semantic distance between consecutive words is the more incoherent. This technique, known as LSA (latent semantic analysis), proved useful for characterizing reports from participants with schizophrenia and recently, combined with the total count of words and of linking words, allowed the efficient identification of participants who would go on to develop psychosis two and a half years later. Psychiatrists followed for that period 34 patients who did not yet have psychosis (but were at risk of developing it) and found that automated speech analysis predicted, without a single error, the five participants who went on to develop psychosis.
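The co-occurrence idea behind this measure can be sketched in a few lines. This is a simplified illustration and not the method of the cited studies: full LSA also applies a dimensionality reduction (singular value decomposition) to the word-by-text matrix, which is skipped here, so each word is represented simply by its counts in each text. The toy corpus, the example reports and all function names are hypothetical.

```python
import math


def word_vectors(corpus):
    """Map each word to a vector holding its count in each text of the corpus."""
    vectors = {}
    for index, text in enumerate(corpus):
        for word in text.lower().split():
            vectors.setdefault(word, [0] * len(corpus))[index] += 1
    return vectors


def cosine_distance(u, v):
    """1 minus the cosine similarity: 0 for identical usage, 1 for no shared texts."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def mean_consecutive_distance(report, vectors):
    """Average semantic distance between consecutive known words of a report."""
    words = [w for w in report.lower().split() if w in vectors]
    pairs = list(zip(words, words[1:]))
    if not pairs:
        raise ValueError("the report needs at least two words present in the corpus")
    total = sum(cosine_distance(vectors[a], vectors[b]) for a, b in pairs)
    return total / len(pairs)


# Hypothetical toy corpus; a real application needs a large, representative one.
corpus = [
    "dog cat pet",
    "cat dog animal",
    "stock market bank",
    "bank market money",
]
vectors = word_vectors(corpus)

coherent = mean_consecutive_distance("dog cat pet", vectors)
incoherent = mean_consecutive_distance("dog bank pet", vectors)
print(coherent, incoherent)
```

Under these assumptions the report that jumps between unrelated topics receives the larger mean distance. Words absent from the corpus are ignored, which is one of several design choices a real analysis would need to justify.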
We began the 21st century still awaiting the promised biomarkers that would characterize the biological origin of psychiatric symptoms. Yet even the quantification of the phenomenology that we find in clinics today and define as psychiatric disorders has not kept pace with the technological progress needed to characterize so complex a phenomenon. Diagnostic categorization certainly has many flaws and identification errors, grouping a multiplicity of phenomena under a single diagnosis while separating similar symptoms under distinct labels. The fact is that human behavior is extremely complex, and we have only begun to glimpse more adequate ways of approaching its variations. The field of computational psychiatry emerges at the start of this century, placing technology and mathematics at the service of the subject, beyond stereotypes. This new way of looking at psychiatric phenomena makes it possible to advance toward more complex models and to deepen knowledge about the causes of these phenomena, considering the subject as a biological and social being embedded in the environment. Above all, the new discoveries make it possible to develop tools that empower the psychiatric patient, by enabling self-examination and a characterization of their condition that is quantitative, objective and complementary to the specialist's opinion.
UNIVERSIDADE FEDERAL DO RIO GRANDE DO NORTE / UFRN CAMPUS CENTRAL
Address: Av. Senador Salgado Filho, 3000, Lagoa Nova, NATAL, RN. Postal code (CEP): 59.078-970
Telephone: (84)9193-6266 E-mail: cepufrn@reitoria.ufrn.br

SUBSTANTIATED OPINION OF THE RESEARCH ETHICS COMMITTEE (PARECER CONSUBSTANCIADO DO CEP)

RESEARCH PROJECT DATA
Research title: Language structure and semantic coherence of reports of dreams, memories and affective pictures in subjects during physiological and pathological cognitive development
Researcher: SIDARTA RIBEIRO
Proposing institution: Instituto do Cérebro
Version: 4
CAAE: 27499314.9.0000.5537
Thematic area:
Main sponsor: Own funding

OPINION DATA
Opinion number: 742.116
Report date: 01/08/2014 (day/month/year)

Project presentation:
According to the theoretical framework underlying the topic of this project: "Psychosis is a syndrome defined by the presence of symptoms such as hallucinations and delusions, which can have different causes".
Among the best-known psychoses are Schizophrenia and Bipolar Mood Disorder, whose differential diagnosis is still based on a subjective method, like all current classificatory diagnoses in psychiatry.
Also according to the authors of the proposal under review: "The mental state examination often looks for qualitative differences in the subject's language, which may indicate typical symptoms of schizophrenia or bipolarity".
However, perceiving these differences demands intensive training and limits their quantification in speech.
The study proposal under ethical review was built on the results of earlier work by the proposing research group and on the understanding that the relationship between words in speech is a complex system, and that its representation as word-sequence graphs can reveal characteristic patterns in graphs produced by psychotic subjects with schizophrenia or bipolarity.
The researchers propose to study what the psychotic symptoms (hallucinations and delusions) presented by the psychotic participants have in common with their dreams.
The study will also be carried out with healthy participants undergoing cognitive gain while learning to read.
To this end, an observational, longitudinal study will be conducted with a sample of 60 (sixty) subjects.
Forty participants will be recruited at the Centro de Atenção Psicossocial Infantil Oeste II (CAPS infantil, a child psychosocial care center).
Twenty participants must be in their first psychotic episode and 20 (twenty) must be chronically psychotic.
Twenty individuals with no history of psychotic symptoms who are learning to read will form the control group.
Data will be collected through a recorded interview and the completion of two forms, one clinical and one with the participant's socioeconomic and cultural data.
Data collection also includes the structured interview of the Diagnostic and Statistical Manual of Mental Disorders IV (DSM-IV) and psychometric scales to quantify symptoms.
After the interviews and the psychometric scales, participants will be asked about their dreams and about their oldest and most recent memories. They will then be shown pictures from the International Affective Picture System (IAPS) and asked to create a story from each picture.
The whole data collection process will take six sessions, with follow-up for one year if the participant allows.
The collected data will be assessed with two methodologies: psychometric analysis, which will examine the semantics and structure of the reports and subjective valences, and statistical analysis. Data collection is scheduled to start in April of the current year with a pre-test, and the collection phase proper is planned for June 2014 to June 2015.
The budget for carrying out the research was set at R$ 14,500 (fourteen thousand five hundred reais), under the responsibility of the researchers responsible for the study.
The research presented will support a doctoral thesis in the Programa de Pós-Graduação em Neurociências of the Instituto do Cérebro at UFRN, and the researchers believe that the results obtained may "constitute a major advance in understanding the relationship between language structure and physiological and psychopathological cognitive development".
Research objectives:
Primary objective: longitudinal characterization of the structure and semantic coherence of reports of dreams, memories and affective pictures in subjects during physiological cognitive development (the period of acquiring the ability to read) or pathological cognitive development, in cognitive decline following a first psychotic episode.
Secondary objectives:
1. Longitudinally assess changes in the structure and semantic coherence of reports of different events in subjects acquiring the ability to read.
2. Compare structure and semantic coherence in reports from subjects who develop better or worse reading ability.
3. Longitudinally assess changes in report structure in the first-psychotic-episode groups (those who progress to schizophrenia or to bipolar affective disorder, TAB).
4. Longitudinally assess changes in the semantic coherence of reports in the first-psychotic-episode groups.
5. Assess differences in the structure of reports of remote memories, and of emotionally striking affective pictures (positive and negative), between subjects in a first psychotic episode and control subjects, and between first-episode subjects who progress to a diagnosis of schizophrenia and those who progress to a diagnosis of TAB.
6. Assess differences in semantic coherence in reports produced by subjects in a first episode of psychosis compared with controls.
7. In control subjects, compare the semantic coherence of reports of dreams, remote memories and emotionally striking affective pictures with that of reports of recent memories and affectively neutral pictures.
8. Compare the structure of reports of remote memories and striking affective pictures with the structure of dream reports, and compare the structure of reports of recent memories and neutral affective pictures with the structure of reports of the day before the dream.
9. Assess differences in the structure of reports of memories, affective pictures, the dream or the day before the dream between control subjects and first-episode groups that progress to a diagnosis of schizophrenia or TAB, determining which reports classify best.
10. Assess differences in the graph structure of reports in general between the chronic psychotic group diagnosed with schizophrenia and the first-episode psychosis group.
Assessment of risks and benefits:
We understand that the proposed study cannot be considered minimal risk. The discomfort associated with the research instruments (questionnaires and interviews) and the markedly vulnerable condition of the research participants lead this Committee to consider the proposed study as greater than minimal risk.
We also understand that there is no direct benefit to the participant; however, knowledge may be generated that benefits other individuals in the population to be studied.
The researchers must include in their research plan a forecast of non-physical risks and state the measures that will be taken to minimize or eliminate them.

Comments and considerations on the research:
The research topic is of considerable relevance, and the established methodology can favor the achievement of the stated objectives.
However, the choice of a non-probabilistic sampling technique (convenience sampling) and the failure to describe all the characteristics of the population to be studied (age, for example) leave unclear how generalizable the results will be.

Considerations on the mandatory submission documents:
The researcher attached the following documents to the PB research project:
> cover letter;
> Cover Sheet (Folha de Rosto, FR);
> full project;
> CEP/UFRN form;
> Informed Consent Form (Termo de Consentimento Livre e Esclarecido, TCLE);
> consent form for voice recording;
> letter of agreement from the director of the CAPSi;
> research instruments; and
> declaration that the research has not started.

Recommendations:
The researcher, in responding to the pending issues, found it necessary to add that a new member was being added to the research team.
We recommend that the researcher proceed formally, that is, submit on Plataforma Brasil an AMENDMENT (EMENDA) to the original project, requesting and justifying the addition of the new researcher.
Failure to follow this procedure renders the requested amendment without ethical and legal value.

Conclusions or pending issues and list of inadequacies:
After the ethical review of the responses to the pending issues raised in the previous opinion, we conclude that they were adequately resolved.
This places the protocol in question within the basic precepts of ethics in research involving human beings.

Opinion status: Approved
Requires review by CONEP: No
Final considerations at the discretion of the CEP:
In accordance with Resolution 466/12 of the Conselho Nacional de Saúde (CNS) and the Operational Manual for Ethics Committees (CONEP), the responsible researcher is responsible for:
1. preparing the Informed Consent Form (TCLE) in two copies, initialed on every page and signed at the end by the person invited to take part in the research, or by their legal representative, as well as by the responsible researcher or by the person(s) delegated by him or her, with the signature pages on the same sheet (Res. 466/12 - CNS, item IV.5d);
2. carrying out the project as designed (Res. 466/12 - CNS, item XI.2c);
3. submitting to the CEP any amendments or extensions with justification (Manual Operacional para Comitês de Ética - CONEP, Brasília - 2007, p. 41);
4. discontinuing the study only after the CEP/CONEP/CNS/MS System that approved it has analyzed and ruled on the reasons for the discontinuation, except in cases of justified urgency for the benefit of its participants (Res. 466/12 - CNS, item III.2u);
5. preparing and submitting partial and final reports (Res. 466/12 - CNS, item XI.2d);
6. keeping the research data on file, physical or digital, under his or her custody and responsibility, for a period of 5 years after the end of the research (Res. 466/12 - CNS, item XI.2f);
7. submitting the research results for publication, with due credit to the associated researchers and the technical staff of the project (Res. 466/12 - CNS, item XI.2g); and
8. providing a reasoned justification to the CEP or CONEP for interrupting the project or not publishing the results (Res. 466/12 - CNS, item XI.2h).
NATAL, 7 August 2014
Signed by: Dulce Almeida (Coordinator)
UNIVERSIDADE FEDERAL DO RIO GRANDE DO NORTE / UFRN CAMPUS CENTRAL
Address: Av. Senador Salgado Filho, 3000, Lagoa Nova, NATAL, RN. Postal code (CEP): 59.078-970
Telephone: (84)3215-3135 Fax: (84)3215-3135 E-mail: cepufrn@reitoria.ufrn.br

SUBSTANTIATED OPINION OF THE RESEARCH ETHICS COMMITTEE (PARECER CONSUBSTANCIADO DO CEP)

RESEARCH PROJECT DATA
Research title: Cortical processing of affective images during the sleep transition
Researcher: SIDARTA RIBEIRO
Proposing institution: Instituto do Cérebro
Version: 3
CAAE: 25946913.2.0000.5537
Thematic area:
Main sponsor: Own funding

OPINION DATA
Opinion number: 650.714
Report date: 25/04/2014 (day/month/year)

Project presentation:
This project is at doctoral level and its proposing institution is the Instituto do Cérebro. Subjects will be informed that the research aims to study the transition phases of sleep and how they are influenced by viewing affective images; to that end, audio, iconic and electrophysiological recordings will be used. The study proposes a detailed electroencephalographic and psychological characterization of the wake-sleep transition in 65 volunteer experimental subjects, male and female, aged between 20 and 40. An electroencephalograph with 64 active electrodes will be used to obtain neocortical recordings with high temporal and spatial resolution. To quantify changes in sensory and cognitive processing, the experimental subjects will undergo visual stimulation before sleeping and will be awakened after a few minutes to report dream images and thoughts, which will be recorded electronically. The IAPS [1] will be used as the bank of visual stimuli to assess mnemonic differences associated with different affective valences. Quantitative techniques for analyzing graphs, semantic distances and iconic distances will be used to compare psychological contents of waking and sleep. This project is motivated by the need for a better psychological and electrophysiological characterization of the hypnagogic state. The researchers thus intend to characterize the psychophysiological phenomenon of the visualizations and thoughts that occur during the transition from waking to sleep after the presentation of pictures
with affective content. The research will be carried out over three days. On the first day the following procedures will take place: explanation of the objectives and procedures of the research; signing of the Informed Consent Form; anamnesis and neuropsychological assessment; visuocortical mapping. On the second day, procedures will be carried out in sleep-deprived subjects: application of the dream and memory questionnaire; electrophysiological recordings with presentation of IAPS pictures; drawings of the images visualized. On the third and last day, procedures will be carried out in subjects without sleep deprivation: application of the dream and memory questionnaire; electrophysiological recordings with presentation of other IAPS pictures; and drawings of the images visualized. As its primary outcome, the research will provide a characterization of the psychological and physiological role of the visualization phenomenon during the sleep transition in the processing of images with affective content, and will contribute to understanding the role of sleep and dreams in memory, especially affective memories. As a secondary outcome, the results will have important implications for understanding the relationship of sleep to the etiogenesis of mental symptoms such as depressive and anxious symptoms linked to stressful events, as well as psychotic symptoms experienced in schizophrenia, for example.
Research objectives:
Primary objective:
Characterize the effects of affective penetrance, dissociation and attenuation during imagery in sleep transition phases, and their electrophysiological correlates.
Secondary objectives:
1. Assess whether the content of previously viewed affective images penetrates, semantically and iconically, the images visualized during the transition to sleep;
2. Assess whether affective content is attenuated in sleep-transition imagery;
3. Assess whether there is dissociation between the affective valence of image and of thought during sleep;
4. Assess report structure, characterized by word graphs, in reports of images and thoughts during sleep and waking, as well as of remote memories, recent memories and dreams;
5. Assess the impact of sleep deprivation on the frequency of images, on the effects of dissociation, penetrance and affective attenuation, and on word-graph connectivity;
6. Correlate alpha-band power with penetrance, affective attenuation and image-thought dissociation;
7. Correlate coherence between frontal and occipital channels during sleep with the image-thought dissociation effect;
8. Correlate visual and auditory neurofeedback capacity with the frequency of hypnagogic experiences during sleep.
Assessment of risks and benefits:
In the current version the risks were better addressed with regard to the occurrence of adverse events or psychological symptoms, since immediate medical assistance was guaranteed to the research participants. The benefits were also better explored in the revised version of the project, through a clear statement of the direct and indirect benefits to participants.

Comments and considerations on the research:
The research under review is scientifically important and is grounded in the important role of sleep and dreams in cognitive and affective processing. The state of transition from waking to sleep, known as the hypnagogic state, has psychological and neurophysiological similarities with REM sleep but retains important particularities that are still little explored. The research will provide a detailed electroencephalographic and psychological characterization of the wake-sleep transition in 65 volunteer experimental subjects. The work has a good theoretical and methodological framework, is clinically and socially important, and is feasible.

Considerations on the mandatory submission documents:
The Informed Consent Form (TCLE) was changed as requested in the previous opinion and is now adequate.

Recommendations:
Conclusions or pending issues and list of inadequacies:
The first version had some ethical inadequacies, such as the minimization of the research risks, which was better addressed and clarified in the current version of the project. The TCLE was modified to include the duration of the procedures and was rewritten in more accessible language; the process of enrolling the research subjects and the place where the procedures will be carried out were detailed; and the funding source of the research was explained. Since all the ethical inadequacies pointed out have been clarified, the research project is ethically acceptable.

Opinion status: Approved
Requires review by CONEP: No
Final considerations at the discretion of the CEP:
In accordance with Resolution 466/12 of the Conselho Nacional de Saúde (CNS) and the Operational Manual for Ethics Committees (CONEP), the responsible researcher is responsible for:
1. preparing the Informed Consent Form (TCLE) in two copies, initialed on every page and signed at the end by the person invited to take part in the research, or by their legal representative, as well as by the responsible researcher or by the person(s) delegated by him or her, with the signature pages on the same sheet (Res. 466/12 - CNS, item IV.5d);
2. carrying out the project as designed (Res. 466/12 - CNS, item XI.2c);
3. submitting to the CEP any amendments or extensions with justification (Manual Operacional para Comitês de Ética - CONEP, Brasília - 2007, p. 41);
4. discontinuing the study only after the CEP/CONEP/CNS/MS System that approved it has analyzed and ruled on the reasons for the discontinuation, except in cases of justified urgency for the benefit of its participants (Res. 466/12 - CNS, item III.2u);
5. preparing and submitting partial and final reports (Res. 466/12 - CNS, item XI.2d);
6. keeping the research data on file, physical or digital, under his or her custody and responsibility, for a period of 5 years after the end of the research (Res. 466/12 - CNS, item XI.2f);
7. submitting the research results for publication, with due credit to the associated researchers and the technical staff of the project (Res. 466/12 - CNS, item XI.2g); and
8. providing a reasoned justification to the CEP or CONEP for interrupting the project or not publishing the results (Res. 466/12 - CNS, item XI.2h).
NATAL, 16 May 2014
Signed by: Dulce Almeida (Coordinator)