By Claude Bajada
OHBM is a community of neuroscientists interested in neural cartography. It draws upon the traditions of 19th-century neural mappers such as the Vogts, Brodmann and von Economo. While the spirit of the society is still based in the biological brain, the conference itself is multidisciplinary. Although still a place for biologists, anatomists, physicians and surgeons, the field has become increasingly computational thanks to the development of Magnetic Resonance Imaging.
Thomas Yeo is an assistant professor at the National University of Singapore where he leads the Computational Brain Imaging Group. His lab develops machine learning algorithms for MRI data. His work is well known to brain imagers who are familiar with the “Yeo” brain networks. Ahead of his keynote lecture, I met Thomas and learned how he made the switch from engineering to neuroscience, what led him to working on the topics he is now well known for, and what the exciting new topics in his field are.
Claude Bajada (CB): From studying electrical engineering and computer science to getting into neuroscience, what was the path? Or perhaps draw the graph.
Thomas Yeo (TY): To tell you the truth, my path was actually quite random. There was no sudden epiphany, no single life-changing event that led me down this path. As a kid, I was generally interested in the brain, but I was also interested in mathematics and physics. The best way to describe my path is that things just sort of happened. When I was deciding on PhD supervisors, I was debating between computer vision and medical vision. I ended up pursuing medical vision mainly because my PhD supervisor (Polina Golland) expressed the most interest in me joining her lab. At the time, I did not want to work with fMRI because it seemed too difficult. Consequently, I ended up working with both Polina and Bruce Fischl during my PhD, developing machine learning algorithms for registering and segmenting brain data on cortical surfaces. When my PhD was nearing its end, I was looking for a postdoc position, but I also wanted to try something new. I could either move closer towards image acquisition (MR physics) or towards the “end users” (neuroscientists/clinicians). Bruce mentioned that Randy Buckner was putting together a big dataset. At that time, there was not as much data sharing and there were not as many large datasets as there are today, so I was to join Randy’s lab, learn some neuroscience and analyze some large datasets. From then on, I was hooked on neuroscience research, but with a computational bent because of my PhD training.
CB: As someone with a solid STEM background, what are your experiences interacting with clinicians, psychologists and other health professionals?
TY: Interactions with clinicians, neuroscientists, and psychologists are extremely important to what I do. I have found that the problems neuroscientists think are important are often quite different from the ones I was interested in as a PhD student. Back then, I thought I was developing algorithms that were very helpful to neuroscientists/clinicians/psychologists. But after joining Randy’s lab, I realized that my algorithms were often not immediately relevant to what neuroscientists need. In engineering/computer science, there is pressure to develop novel, beautiful, fast algorithms. However, at the Martinos Center, where there is a very nice big computing cluster, speed is often not a pressing issue. Most neuroscientists also do not care about novelty or how elegant an algorithm is. They care about whether an algorithm can help to answer their question or help their patients. They don’t really care whether the algorithm involves lots of equations or just simple correlations. In fact, they prefer a simple algorithm to a complex algorithm unless I can demonstrate that the complexity is really necessary. So working with neuroscientists has really changed how I think about problems.
On a day-to-day basis, I like to think about what interesting neuroscience problems can be formulated as machine learning problems. For example, around 2012, I became intrigued by Russ Poldrack’s 2006 paper on reverse inference. He had this beautiful figure showing that tasks recruit unobserved cognitive processes, which can then be observed with brain imaging, behavioral and other kinds of data. I realized that the figure can be mathematically expressed using a hierarchical generative model. I then applied this model to real data to estimate the unobserved cognitive processes and discover new insights into brain organization. Throughout this project, I received a lot of input from quite a number of neuroscientists, who brought their own unique expertise and insights to the project. In fact, I met Simon Eickhoff and Maxwell Bertolero because of this project and we have since collaborated on many more projects. Later on, I realized that the same class of hierarchical generative model can be applied to understanding disease nosology: in this case, the model would encode the idea that different disorders or disorder subtypes share multiple disease processes, which can then be observed with brain imaging and behavioral data. This has in turn led to projects on disorder nosology with quite a number of folks. Thus, one project led to new collaborations, which led to even more collaborations.
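To make the idea of a hierarchical generative model concrete, here is a toy sketch of the two-level structure described above: each task recruits latent components with some probability, and each component activates brain locations with some probability. This is only an illustration under stated assumptions, not the model used in the project; the names theta, beta and sample_activations, the sizes, and all parameter values are hypothetical and randomly generated.

```python
import numpy as np

rng = np.random.default_rng(0)

n_tasks, n_components, n_voxels = 3, 2, 6

# Hypothetical parameters (random illustrative values, not estimates from any study).
# theta[t, c]: probability that task t recruits latent cognitive component c.
theta = rng.dirichlet(np.ones(n_components), size=n_tasks)
# beta[c, v]: probability that component c activates voxel v.
beta = rng.dirichlet(np.ones(n_voxels), size=n_components)


def sample_activations(task, n_foci):
    """Draw activation foci for one task: task -> latent component -> voxel."""
    components = rng.choice(n_components, size=n_foci, p=theta[task])
    voxels = np.array([rng.choice(n_voxels, p=beta[c]) for c in components])
    return components, voxels


# Marginal probability of each voxel given a task, obtained by
# summing over the unobserved components.
p_voxel_given_task = theta @ beta

components, voxels = sample_activations(task=0, n_foci=5)
```

In a real analysis the direction is reversed: only the tasks and the activations are observed, and theta and beta have to be estimated from data. Swapping "task" for "disorder" and "component" for "disease process" gives the nosology variant mentioned above.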
CB: Your name is now almost synonymous with the 7 and 17 resting state “Yeo” networks. How did that work come about and how did it influence your subsequent career?
TY: As I was saying before, I ended up joining Randy’s lab as a postdoc because he was amassing a large dataset in collaboration with a large number of PIs in the Boston area. At that time, there was already a lot of work showing that resting-state fMRI can be used to extract different networks. Given that my PhD advisor (Bruce Fischl) is one of the creators of FreeSurfer, I ended up re-processing Randy’s data and projecting them onto the surface coordinate system to visualize the data. I then used a clustering technique developed by a fellow PhD student (Danial Lashkari) of my other PhD advisor (Polina Golland) to parcellate the cortex. Frankly speaking, most of the networks we found were already known in the literature, so to this day, I am not 100% sure why this paper became such a hit. Perhaps it was the large number of subjects. Or perhaps the surface coordinate system allowed us to see some very exquisite topography that was less obvious in the volume. For example, we showed the existence of multiple parallel-distributed and interdigitated association networks. Or perhaps it was the comprehensiveness of the paper (40 pages long). I like to joke that it’s my second thesis.
Without a doubt, the paper has been incredibly helpful for my career. I have a few students who continue to push the frontier on this topic. Our work probably gets a disproportionate amount of attention, so my lab continues to benefit from the original paper. In some sense, I was very lucky. The technical aspects of the 2011/2012 papers (e.g., surface processing, clustering) were possible because of my PhD training. And I arrived in Randy’s lab at exactly the right time. If I had come a year earlier, the data would not have been ready. If I had come a year later, the impact of the work might have been overshadowed by similar papers (e.g., Jonathan Power’s work), which would have been published a lot earlier. I was lucky to have worked with super talented people in Randy’s lab, including Fenna Krienen (who was co-first author on the paper), Hesheng Liu, Jorge Sepulcre and of course Randy!
CB: What would you say is the most exciting topic in computational brain imaging at the moment?
TY: Given the large quantity of public data out there, I think this is an exciting time for human neuroscience. This is especially the case for computational scientists like me. I have found the big data to be very helpful for developing algorithms and applying them to discover new insights into the brain.
Given the large public investments in these datasets, I am also thinking a lot about how we can use these big data for useful applications, e.g., helping patients. Consequently, I have become less interested in problems such as classifying controls versus schizophrenia, which are useful for benchmarking algorithms but not really useful for clinicians per se. There are definitely machine learning problems with real clinical value, e.g., predicting the best treatment in depression, but there’s not that much big public data on that (although I can’t really complain since I am just a data leech).
Furthermore, the vast majority of machine learning algorithms only allow us to find associations. So no matter how “deep” the algorithms are, we are just finding glorified correlations, even if it’s out-of-sample prediction! Do these big data only allow us to find associations or can we gain mechanistic insights into the brain? On this front, I think biophysical modeling and causal modeling are potentially promising and exciting.
CB: You played an integral role in COBIDAS. What was the motivation for that and what influence do you think it has had?
TY: Well, I wouldn’t say I played an integral role. I was one of many folks who contributed to the report. It was really Tom Nichols who had the unenviable task of “herding cats”! The OHBM Council initiated COBIDAS to develop recommendations and consensus on best neuroimaging practice. But soon it became clear that “neuroimaging” would cover too many things, so we ended up focusing on MRI. EEG/MEG COBIDAS is now spearheaded by Aina Puce and Cyril Pernet.
Unfortunately, in my opinion, the COBIDAS report has not been as influential as I had hoped. We recommended a checklist of items that researchers should consider and report, but I think it’s safe to say that the vast majority of papers (including from my lab) do not really do so. I am speculating here, but one reason might be that many researchers do not know sufficient details of their preprocessing pipelines or analysis algorithms to actually complete the checklist. The checklists are also very long, so researchers might balk at the work of filling them in. I think the best way to make this happen is to try to automate the process. I can imagine some software that keeps track of the preprocessing/analysis one performs on the data. These metadata can then be shared. I believe Tom Nichols and others might be working on this. This could be promising.
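As a rough illustration of what such automated bookkeeping could look like, the sketch below wraps each processing step so that its name and parameters are logged as shareable metadata. It is a minimal toy under stated assumptions, not an existing tool and not the software being developed for COBIDAS; PROVENANCE, tracked, demean and scale are all hypothetical names, and the "data" is just a short list of numbers.

```python
import functools
import json

PROVENANCE = []  # hypothetical in-memory log of the processing steps applied


def tracked(step_name):
    """Decorator that records each call's step name and keyword parameters."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(data, **params):
            result = func(data, **params)
            PROVENANCE.append({"step": step_name, "params": params})
            return result
        return wrapper
    return decorator


@tracked("demean")
def demean(data):
    # Subtract the mean from every sample.
    mean = sum(data) / len(data)
    return [x - mean for x in data]


@tracked("scale")
def scale(data, factor=1.0):
    # Multiply every sample by a constant factor.
    return [x * factor for x in data]


# Run a two-step toy pipeline; the log fills in as a side effect.
signal = scale(demean([1.0, 2.0, 3.0]), factor=2.0)

# Serialize the log so it can be shared alongside the results.
report = json.dumps(PROVENANCE, indent=2)
```

The point of the design is that the researcher never fills in a form: the record is a by-product of running the analysis, and the serialized report could be mapped onto checklist items afterwards.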
In the case of my lab, we mostly perform analysis of open datasets and we often develop our own algorithms. Unfortunately, I do not believe that there is a checklist long enough to completely specify an algorithm without access to the original code. Thus, my lab is more focused on sharing our code. Even then, replication is not so easy of course. While we work on open datasets, many datasets (e.g., UK Biobank) might not allow us to re-distribute the data. Thus, replicating our results is not so easy. If you explore our GitHub (https://github.com/ThomasYeoLab/CBIG), you will see that our wrapper scripts often reference data on our server. But we have tried to make the code user-friendly, so hopefully users can easily apply our code to their own data.
CB: What can we expect in the future from the Yeo lab?
TY: We have some exciting new work on individual-specific brain parcellations coming out! We are also working on using machine learning and GPUs to invert neural mass models; right now, these biophysical models mostly require hand-tuning of critical parameters. Finally, we are also working on using machine learning to understand disease nosology.