by Claude Bajada
The GDPR is a relatively new piece of European Union legislation that regulates the processing of personal data when the person processing or controlling the data is in the EU, even if the actual processing occurs outside the EU. Conversely, the GDPR also sometimes regulates the processing of personal data of people who are in the EU, even if the persons doing the processing are outside the EU.
How does this affect neuroimaging? We sit down with neuroimaging expert and Open Brain Consent co-author Dr Cyril Pernet (CP) and Technology law expert Dr Mireille Caruana (MC) to discuss the implications of this law on our work.
The article alternates between the terms “participants” and “data subjects”, since “data subject” is the term used in the GDPR; for the purposes of this article, you can treat them as equivalent.
What follows is a summary of our conversation, edited for conciseness and clarity.
Who are our experts?
CP: I do a lot of method development in neuroimaging and in a clinical context. Data sharing is something that I have always been happy to work towards. Data sharing is like code sharing: we need it for good science. With the advent of the GDPR, we've got some extra constraints on what to share and how to share it.
In the clinical context, the typical thing to say is: “Oh, you know, we have patients’ data, therefore, privacy issues,” and people don't even try to share. This really annoys me because there are ways we can do it. It doesn't have to be completely open on the web so that everybody can download it. I've been working on all sorts of open science related projects and the Open Brain Consent is part of that line of work.
MC: I am the head of the Media, Communications and Technology Law Department within the Faculty of Laws at the University of Malta, and my research has, since before the GDPR, focused on privacy and data protection issues. I would not contradict you that the GDPR is a relatively new law that has, from the start, been the subject of a lot of uncertainty and difficulty in implementation and application. It's well worth working our way through the legislation to seek correct interpretations of it.
Why is it important to discuss GDPR across disciplines?
CP: We are scientists: when we read the GDPR text, we don’t understand the implications. We do not know how judges will interpret the law. This means that we need lawyers to guide us on how to interpret what is written there.
MC: The problem is that in many instances there aren’t clear answers. In fact, while a lawyer may give legal advice, it may eventually be contradicted by a court. Nevertheless, scientists should behave as diligently and carefully as possible. If the perception of the GDPR ends up restricting research or not allowing researchers to do their work, that's a problem. It shouldn't be that way. But achieving this balance is very difficult.
Anonymous data are not governed by the GDPR. Do you think there's anything within neuroimaging that can be considered anonymous?
CP: In my opinion, one of the key points in the GDPR that is relevant to neuroimaging is that neuroimagers are able to single out individuals from datasets, which makes the data identifiable. And I'm not just talking about brain structure data, I am also talking about EEG data, MEG data, etc. With connectome matrices and a few tasks you can single out individuals, and we can thus consider that any imaging data should be considered identifiable. Others disagree with me and argue that singling out is not strictly identifiability; I contend the opposite, because the GDPR indicates that singling out is a prerequisite to identification.
This is a key difference between North American legislation and the GDPR. While North American law differentiates between anonymised data, pseudonymised data and identifiable data, the GDPR only distinguishes between anonymised data and identifiable data. Pseudonymisation is just a process; data can go through that process without changing their status as identifiable. That is, we can remove the face, ID, etc., but brain imaging data remain identifiable, in that we can potentially distinguish between individuals and, even without the metadata, link those data to someone by name.
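To make that distinction concrete, pseudonymisation can be as simple as replacing direct identifiers with keyed hashes. Here is a minimal, hypothetical sketch (the function name and the salt are our own illustration, not part of any standard tool). It also illustrates why the result is still identifiable data under the GDPR: whoever holds the salt can regenerate the mapping, and the imaging data themselves remain linkable.

```python
import hashlib

def pseudonymise(participant_id: str, secret_salt: str) -> str:
    """Replace a direct identifier with a keyed hash.

    The mapping is reproducible for anyone holding the salt, so in
    GDPR terms the data remain identifiable: pseudonymisation is a
    process, not anonymisation.
    """
    digest = hashlib.sha256((secret_salt + participant_id).encode())
    return digest.hexdigest()[:12]

# Hypothetical example: the same ID always maps to the same code,
# so records can still be linked across datasets.
code_a = pseudonymise("sub-001", "lab-secret")
code_b = pseudonymise("sub-001", "lab-secret")
print(code_a == code_b)  # True
```

The point of the sketch is that removing the obvious identifier changes nothing about the legal status of the brain data attached to it.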
Can we have an example?
CP: Imagine, for instance, that we have two independent datasets consisting of connectome matrices and tasks. There may be individuals who have been participating in each of those datasets. So we can now think about linking them and studies have indeed shown that it is possible to say that the same individual belongs to both datasets, because of the way connectomes look. Not only can we single out people within datasets, but we can also link datasets, and possibly by adding associated metadata we are getting even closer to identifying that person in the real world.
Are there any proposed solutions for this problem?
CP: The solutions that we have come up with are detailed in Open Brain Consent and involve two consent forms as well as a data user agreement for data collected in the EU. Of the two consent forms, one is the consent for the study and the other is consent for people to share their data. The way you can legally share this is through a data user agreement, not through a licence, which means we ‘control’ who has access and, to a lesser extent, what can be done with the data. Now the control can be done in a way where people register to use specific datasets. For example, the Netherlands has a good system, because every researcher is registered in a database. So for instance, if you log into the system of a particular institute, they know who you are and which institution you are affiliated with, and you can just download data, even if you're not part of the data-holding institute. This is possible because they can identify you. You can sign the data user agreement with a simple click.
A user agreement also helps researchers share data outside of the European Union. The GDPR refers to this as “standard contractual clauses.” This allows you to get to a point where non-EU researchers can download the data and become the data controller. With the data user agreement, the downloader agrees to the terms of the GDPR. This way you can share data anywhere in the world, even outside the EU. But you cannot just put your data up on OpenNeuro. This is important since OpenNeuro's servers reside in the US, and the US is special because it is not considered a “safe country” by the EU. Institutions can sign an agreement with the EU to become a safe repository. But that also means OpenNeuro would have to change its infrastructure to support data user agreements.
Where does consent come into all of this? Could I just get consent from my participant to share all of my data in the US, and the rest of the world?
MC: In the GDPR, sharing or transferring data is considered to be a type of processing. Let's forget about how the original data were collected and focus on the sharing of these data. In this case, you should still have a legal basis for processing in the GDPR. I am also assuming that they're sensitive personal data, since I am assuming that they say something about an individual’s health status.
Article 9 of the GDPR has a legal basis specifically for research data processing. So perhaps you don't need to rely on consent to share data because there is another legal basis which speaks about the necessity for scientific research. However, this legal basis is somewhat unclear in its application because it speaks about individual member states laying down a law that provides appropriate safeguards.
With regard to data transfers to a third country such as the US, chapter 5 of the GDPR concerns transfers of personal data to third countries or international organisations. According to Article 45, transfer of personal data to a third country may take place where the EU Commission has decided that the third country, or one or more specified sectors within that third country, ensures an adequate level of protection. Such a transfer does not require any specific authorisation. In the absence of an adequacy decision, a controller or processor may transfer personal data to a third country only if the controller or processor has provided appropriate safeguards, and on condition that enforceable data subject rights and effective legal remedies for data subjects are available.
Under Article 49, in the absence of an adequacy decision, or of appropriate safeguards, a transfer or a set of transfers of personal data to a third country may take place only on one of a set of stated conditions, which include that “the data subject has explicitly consented to the proposed transfer, after having been informed of the possible risks of such transfers for the data subject due to the absence of an adequacy decision and appropriate safeguards”.
How do we deal with requests for deletion of data?
MC: Article 17(2) GDPR states that “Where the controller has made the personal data public and is obliged pursuant to paragraph 1 to erase the personal data, the controller, taking account of available technology and the cost of implementation, shall take reasonable steps, including technical measures, to inform controllers which are processing the personal data that the data subject has requested the erasure by such controllers of any links to, or copy or replication of, those personal data.” It talks about reasonable steps, which would, by way of good practice, mean keeping a record of the people who accessed the data and contacting them to inform them about the request.
How long can we store data for?
CP: You are required to set a time frame within which you must review the need for continued storage of the data. However, if the data remain necessary, they can be kept indefinitely.
Is it true that under the GDPR, legally, you're not allowed to reuse your own data in your own lab to answer different questions than what it was originally collected for?
MC: The GDPR speaks about purpose limitation (“personal data shall be collected for specified, explicit and legitimate purposes and not further processed in a manner that is incompatible with those purposes”) and ‘specific’ consent (“‘consent’ of the data subject means any freely given, specific, informed and unambiguous indication of the data subject’s wishes…”). So ideally, I think even ethically, your research participants should understand how you're going to use their personal data; but no, research is treated in a particular manner under the GDPR. Research is not considered to be incompatible with the original purpose for data collection (“further processing for ... scientific ... research purposes ... shall, in accordance with Article 89(1), not be considered to be incompatible with the initial purposes”).
Furthermore, recital 33 of the GDPR clarifies “It is often not possible to fully identify the purpose of personal data processing for scientific research purposes at the time of data collection. Therefore, data subjects should be allowed to give their consent to certain areas of scientific research when in keeping with recognised ethical standards for scientific research. Data subjects should have the opportunity to give their consent only to certain areas of research or parts of research projects to the extent allowed by the intended purpose.” So, legally, you may be covered, even though the debate surrounding so-called ‘broad consent’ is not conclusive (cf. for example the Article 29 Working Party’s Guidelines on consent under Regulation 2016/679).
CP: In my opinion, stating the purpose as “research” is not specific enough. But if you say the purpose is “memory”, that's too specific, because then you could not even use a T1w image to create a template. So we came up with a compromise. If you look at the Open Brain Consent GDPR edition, our solution is to say that, for instance, the purpose of conducting the study is one thing, but also that the data may be used for future research projects in the field of medicine and cognitive neuroscience, which strikes the balance.
MC: Article 5(1)(b) of the GDPR states that “personal data shall be collected for specified, explicit and legitimate purposes and not further processed in a manner that is incompatible with those purposes; further processing for ... scientific or historical research purposes or statistical purposes shall, in accordance with Article 89(1), not be considered to be incompatible with the initial purposes”.
This gives researchers quite a bit of flexibility. This is maybe one area where law and ethics overlap. In the debate within research on genetic data, which I have come across when dealing with biobanks, people speak of dynamic consent and want to use it to obtain more granular consent for specific projects. The thinking behind this is that certain people might object morally to particular research. So of course, you're being more respectful to the data subject if you don't use the data in ways that they would not approve of. Specific, granular consent is in line with the spirit of the GDPR, but I don't think that the GDPR excludes such broader consent for scientific purposes.
The GDPR refers to data minimisation. How do you guarantee that we don't collect data that are unnecessary?
CP: This is something that we also struggled with. On the one hand, you want to be able to collect participants' data; typically, in my lab, we go through a bunch of health questionnaires, handedness, medical history, language, etc., because, of course, you can then reuse these data in a larger dataset. You've got 100 different studies, but for each participant you have the same common six questions, and you can do a nice big analysis. You could possibly connect these studies and perform richer analyses. What is the right balance? We know that this may be the only way to aggregate enough data from multiple studies to then have a study that is powerful enough to look at the effect of some type of medication.
MC: Unfortunately, I think that this is an outstanding difficulty or problem because as a researcher you may not know exactly what you're looking for; for example, what analysing the patterns may reveal. It is a known tension in the GDPR that may also go against the purpose specification principle. So I think it's a tension that is real. I would however always emphasise in such contexts that the sole purpose for processing these data is in fact scientific research, that there may be uncertainty associated with research, but that there is also an important public good to be gained from such research that affects the balance to be achieved between the different competing interests, including the privacy and data protection rights of the data subjects.
What are the next steps?
CP: I think the next steps are twofold. One is for neuroimagers to engage with their own institutional repositories. We need to work with them and with data protection officers to come up with solutions for sharing data. You need to explain what systems need to be in place and how to implement them. We do have power because we do receive money from funders who often actively ask us to share the data. And it is the university’s job to provide us with the tools to be successful in funding applications and to comply with funders.
The other aspect is more ambitious. There are systems that work under the architecture of any repository to index them, so that for instance, every university in Europe could very well have their information connected. But this would necessitate that all universities cooperate with each other. It's more like a dream.
I am also very keen on making sure that everyone reading this interview knows about all the efforts of the Open Brain Consent project. I would like to highlight all of the hard work put in by many: in particular Chris Gorgolewski and Yaroslav Halchenko, who started the project; Stephan Heunis and Peer Herholz, who organized work on this during the Organisation for Human Brain Mapping (OHBM) hackathon; and all the people who helped by sharing their consent forms, experience, and proposed translations (now available in 12 languages), thanks to the support of the COST association (GliMR2). Note that we are keen on having more people involved, in particular having and sharing more information about how these issues are dealt with in countries of the Global South that are currently under-represented.
You can find more details on the Open Brain Consent website.
Now is the time to submit your nominations for 2021 OHBM Awards. To inspire you, we are highlighting some of the outstanding winners from this year’s meeting.
This year’s annual meeting was unique in many ways. Uncertainty about whether the meeting would happen was followed by a remarkably fast reorganization in order to hold the meeting online with a complex time schedule. One event that was not missing in the program was the traditional award ceremony that recognized the work of individuals who have changed the scientific landscape of human brain mapping.
Inspired by their nomination letters, we honor OHBM 2020 award winners and their achievements:
2020 Early Career Investigator Award Winner: Danielle Bassett
Danielle Bassett received her PhD in physics in 2009 and, after only 10 years, she is now a full professor at the University of Pennsylvania and has published over 240 peer-reviewed articles. Her top-cited paper on small-world brain networks has received over 1,800 citations. In addition to the OHBM Early Career Investigator Award, she has also received the Erdős–Rényi Prize in Network Science, a National Science Foundation CAREER award and a MacArthur “genius” fellowship, amongst others.
Danielle Bassett’s laboratory, the Complex Systems Group at the University of Pennsylvania, combines theory and tools from bioengineering, physics, electrical engineering, neurology, psychiatry and philosophy. Her team’s translational, interdisciplinary research has enabled them to explore the human thought process through investigations of how we learn and how this is underpinned by the flexibility of brain network dynamics. Her interdisciplinary approach applies new physically-informed metrics and null models for spatially embedded systems to look at networks at different scales (from cellular to systems) in order to inform clinical medicine and societal interventions. Danielle and her lab also contribute to software packaging and science outreach events.
Danielle gave the opening lecture at OHBM2019 and participated in the mentoring symposium organized by the OHBM student/postdoc special interest group.
From musical notes to neural nodes, you can learn more about Danielle Bassett’s career and aspirations at ScienceMag.
2020 Education in Neuroimaging Award Winner: Robert Savoy
Educational programs are a key part of the success of OHBM. Before the OHBM meeting in 1996 in Boston, Dr. Robert L. Savoy organized an educational workshop on fMRI attended by 600 of the 900 meeting attendees. The success of this inaugural course showed the high demand for educational programs. These have continued annually with the still highly-attended workshops alongside each OHBM meeting. Robert’s very first course was offered at the MGH NMR Center in October of 1994, and it was envisioned that the market for an introductory fMRI course would soon be exhausted. In contrast, the continual advances in fMRI and the general excitement associated with the technology meant that it reached an ever-expanding range of disciplines, increasing the pool of interested students. As the field has grown, so too have Robert’s educational offerings. Since 2007, Robert has organized an annual two-week Multi-Modality course; this has in turn generated another short course on connectivity. Robert is a rare scientist who devotes almost all of his efforts to education. His courses have had a profound impact on the career trajectory of many of our colleagues, including many active and leading members of the functional imaging community around the globe.
A large fraction of the leaders in the field have attended his course – receiving their first instruction on fMRI and neuroimaging there. Peter Bandettini, Ph.D., Director of Functional Magnetic Resonance Imaging Core Facility (FMRIF) collected the following quotes:
2020 Mentor Award Winner: F. Xavier Castellanos
In her nomination letter, Lucina Q. Uddin describes Francisco Xavier Castellanos as “a winner with great mentoring values, guiding his lab members to become independent thinkers and scientists. He is a tireless mentor and teacher. He proposes clear goals, with defined timelines and expectations along the way, and he predicts correctly. He shows a clear vision of a career path and of the best opportunities that should not be missed when defining a path for new lab members. He is able to teach the art of “grantsmanship”, one that every scientist must master. Xavier is always there for his trainees, current and past. Trainees can always count on Xavier to submit a letter of recommendation at a moment’s notice, which is a great aid when applying for fellowships, grants, and positions as the opportunities arise. He is always happy to comply with letter requests, no matter how frequent. He also remains, at every career transition, a sounding board, providing clear-headed, rational and thoughtful advice.”
Lucina mentioned one particular anecdote that represents her experience of being supervised by F. Xavier Castellanos: “One particularly salient example of Xavier’s unconditional and enthusiastic support for his trainees comes to mind. One day, in a conversation with Bharat Biswal, we were tossing around the idea of trying to collect neuroimaging data from a split-brain patient in order to test a theory we had about functional connectivity. Without a fuss, Xavier funded the trip for me and a colleague to fly across the country, collect data from this unique patient, and spend months analyzing the data (though this project was unrelated to any of his grants at that time). This spontaneous trip led to a number of interesting case studies (Nomi et al. 2019, Uddin et al. 2008), and remains one of my favorite Xavier memories. The fact that he has always been enthusiastically supportive of whimsical projects has made science fun over the years.”
What particularly distinguishes Xavier from other senior successful scientists is his generosity, intellectuality and personality. He clearly has had a positive impact on a number of young scientists. Indeed, it is worth noting that three of his previous mentees (Lucina Uddin, Mike Milham, and Daniel Margulies) have received the OHBM Early Career Award. Another example of the way in which Xavier exemplifies the values of collegiality and building community through acknowledgment and recognition is in his authorship practices. He never hesitated to add junior scientists as co-authors on manuscripts, and readily gave up senior authorship positions to his trainees, as he always practices the maxim of giving credit where credit is due.
Xavier has been a proponent of open science since before open science was a thing. His lab was one of the earliest to get involved in grass-roots data sharing initiatives such as the Autism Brain Imaging Data Exchange, the ADHD-200 International Neuroimaging Data-sharing Initiative, and the Enhanced Nathan Kline Institute – Rockland Sample. In fact, he acknowledges that so much data are collected, and so many people are needed to analyze them, that he favors giving others the opportunity to use their expertise without worrying about authorship, credit or restrictions. This kind of radical data sharing has inspired countless researchers worldwide, who are beginning to follow a similar philosophy. Xavier’s lab and pioneering radical data sharing initiatives set the stage for the current climate of open science and collaboration that permeates the field today.
2020 Replication Award Winner: Andre Altmann
Andre received the Replication Award for his paper titled ‘A comprehensive analysis of methods for assessing polygenic burden on Alzheimer’s disease pathology and risk beyond APOE’, Altmann et al., Brain Communications (16th December).
In this paper, Andre Altmann and colleagues attempted to replicate results from “Polygenic hazard score, amyloid deposition and Alzheimer’s neurodegeneration”, published in early 2019 by Tan et al. The original paper proposed a link between a polygenic hazard score (PHS) and amyloid deposition (from amyloid PET) beyond APOE.
Andre Altmann and colleagues proposed to account for APOE4 status (carrier or not) instead of APOE4 burden (number of copies). Beyond this difference in analysis, Altmann et al. went further to show that their analysis better accounted for APOE4 than the initial study. Despite using subjects from the same database (ADNI), Altmann et al. were not able to replicate the results of Tan et al. (2019).
APOE4 is the strongest common genetic risk factor for sporadic late-onset Alzheimer's disease and is known to be associated with amyloid deposition in the brain. It is therefore important to disentangle the effect of APOE4 from the polygenic hazard score, to avoid correlations of no interest in the results; the APOE4 effect could explain part of the previously observed strong link between the proposed polygenic hazard score (PHS) and amyloid deposition. This reanalysis questions the conclusion of Tan et al. (2019) that PHS influences longitudinal cognitive decline, given how the model was specified.
By adjusting the linear mixed-effects model, the replication study shows that small differences in modelling decisions can have a dramatic impact on the results.
This study also corrects a result that could have had a large impact on the field, as the PHS might otherwise have been used in follow-up studies in the Alzheimer's disease community without proper initial support.
2020 Open Science Award Winner: Michael P. Milham
Mike P. Milham’s efforts in open data, open resources and collaborations are numerous. They impact both the clinical and basic science neuroimaging communities. In just a decade, starting with the aggregation and publication of the 1000 Functional Connectomes Project (FCP-1000), Mike has driven a long series of open data-sharing initiatives.
The above initiatives have had a substantial impact on the neuroscientific community both in terms of immediate/direct (e.g., publications) and sustained/indirect impact (e.g., cultural change).
Mike has been the driving and inspirational force of a host of important open science initiatives that have helped change the landscape of human and non-human primate neuroimaging.
Once again, we congratulate all the OHBM 2020 winners and nominees. We wish them a great year of science.
We hope we have inspired you to look around you and consider which of your own mentors, colleagues, trainees, friends and neuroimaging heroes might be an appropriate candidate for one of the 2021 OHBM Awards. The OHBM website has all the details regarding eligibility and the required information for each of the award categories; just select the award by name and there you will find the link to the submission webform. The nomination process is reasonably easy, all online and waiting for your submission. Remember, our ability to inclusively honor members of our diverse community depends directly on you submitting deserving candidates!
Written by: Claude Bajada, Fakhereh Movahedian Attar, Ilona Lipp
Expert reviewers: Adina Wagner, Cyril Pernet
Newbie editors: Yana Dimech, Renzo Torrecuso
This post is about good neuroimaging practices. ‘Practices’ relates to all aspects of conducting research. By ‘good’, we mean beneficial to the field and neuroimaging community - but you’ll see that most of these practices also benefit the individual researcher. Here, we collected a number of tools, tips and tricks to do neuroimaging in the ‘best’ way possible. We aim to provide an overview and answer some questions you may have asked yourself about reproducibility and good neuroimaging practices. As usual, we refer to OHBM On-Demand videos from the educational sessions of previous annual meetings. OHBM has both a special interest group (SIG) for Open Science as well as a Best Practices Committee, where leading brain mappers promote and help implement Open Science and good practices in data analysis and sharing. Both the Open Science SIG and the Best Practices Committee regularly create invaluable resources, such as the annual Hackathon workshops, and the COBIDAS Best Practices in MRI and M/EEG data analysis papers.
Isn’t the main issue in our field reproducibility? Or the lack of it? Should I care about my science being reproducible?
Those are loaded questions. We think we just might not answer them, because you are luring us into a trap that begins with seemingly innocent questions and then rabbit-holes into an unending burrow. There are so many terms to wade through that the novice neuroscientist can easily get lost in this bog!
In his video, Cyril Pernet clarifies the often-used terms ‘repeatability’ and ‘reproducibility’ (from min. 1:07). First, ‘repeatability’ “simply” means that redoing the same analysis with the same data should produce the same result as the original analysis, which is not as trivial as it seems: the software version and the operating system are variables that can affect the output of your imaging analysis. That, however, is only step one. In his video, David Kennedy (from min. 3:54) highlights that ‘reproducibility’ is really a spectrum. We could use the exact same data with a nominally similar analysis; nominally similar data with the exact same analysis; or nominally similar data with a nominally similar analysis. This way we can test the sensitivity and stability of our experiment.
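A practical first step towards repeatability is recording the computational environment alongside every analysis. As a minimal sketch (the function name and the fields recorded here are our own choices, not a standard), this could look like:

```python
import platform
import sys

def environment_record() -> dict:
    """Capture basic software-environment facts next to an analysis,
    since the OS and software versions can change results."""
    return {
        "os": platform.system(),          # e.g. "Linux" or "Darwin"
        "os_release": platform.release(), # kernel / OS version string
        "python": sys.version.split()[0], # interpreter version
    }

record = environment_record()
print(record)
```

Saving such a record (for example as a JSON file next to your results, ideally extended with the versions of your analysis packages) makes it much easier to diagnose why a re-run on another machine produces different numbers.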
Cyril explaining the different levels of reproducibility.
But this leads back to your question. Scientific findings should generalise. They should first be valid (repeatable) but should also be robust to various permutations of the data and analyses used. There is a great video by Kirstie Whitaker on YouTube that tackles these issues.
The reproducibility crisis is often associated with the field of psychology, is there anything different in the field of human brain mapping?
Ok, so here we are generally talking about the more general “reproducibility”, not just about being robust to permutations. We will assume that researchers have already ensured that their analysis is re-executable.
In 2005, John Ioannidis published a landmark article with the eye-watering title “Why Most Published Research Findings Are False.” If you are interested in understanding why many scientific articles are not reproducible, we strongly recommend reading this article; it is an easy and insightful read. Notice that this article does not even specifically refer to psychology or to neuroimaging. The problem is general to, at least, the wider medically-related fields.
The article points out that effect sizes in these fields tend to be low and that sample sizes are frequently smaller than what would be needed to detect such small effects. In neuroimaging, many steps, much expertise (and often money) are involved in acquiring good data. As a result, our sample sizes tend to be small. Indeed, it was not too long ago that most neuroimaging articles were published on samples of approximately 20 participants. In 2020, studies with several hundred, up to a couple of thousand, participants are becoming more common, but these require a massive investment in resources and tight collaboration between sites.
In his video, Cyril provides an overview of the cognitive biases that can contribute to limited reproducibility of neuroscientific research (from min. 7:18). He also explains how the analytical flexibility in neuroimaging research (such as fMRI analyses) adds an additional level of complexity (from min. 15:59). While papers with hot stories and “positive results” find it much easier to get a home in very high impact journals, the drawbacks of this trend are slowly starting to be recognized. Neuroimaging scientific societies are becoming aware of the importance of reproducible research and are incentivising it. OHBM has a yearly Replication Award, which was won by Andre Altmann this year. Initiatives such as DORA, the Declaration on Research Assessment, also aim to find ways of evaluating research and researchers that go beyond journal impact factors.
Pia Rotshtein discussing the conflict of interest between good science and researchers’ careers.
So what can we do to make neuroimaging research more reproducible?
Well, some things are harder to deal with than others. Running neuroimaging studies is time-consuming and expensive, and there is very little that can be done about that, at least in the short to medium term. One thing we can do is work towards using robust and valid measures derived from neuroimaging data. In his video, Xi-Nian explains how the validity of our measures depends on their reliability (from min. 5:40). He introduces reliability indices (such as the intraclass correlation coefficient) and gives an example of how they can quantify the extent to which inter-subject variability (which is often what we are interested in, e.g. when investigating different groups of people or brain-behaviour correlations) exceeds intra-subject variability (which in these cases is unwanted variability in repeated measurements, often caused by measurement noise). He reminds us of this paper pointing out that brain-behaviour correlations are “puzzlingly high”, given the reliability of our cognitive and imaging measures. From min. 16:20 he goes through a variety of imaging measures and their reliability, and introduces CoRR (min. 21:30), the Consortium for Reliability and Reproducibility. A prerequisite for reliable imaging measures is, of course, sufficient data quality.
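To make the intraclass correlation coefficient a little more concrete, here is a minimal sketch in Python (using simulated data and illustrative numbers, not a reference implementation) of the one-way ICC, which expresses inter-subject variance as a fraction of total variance:

```python
import numpy as np

def icc_oneway(data):
    """One-way random-effects ICC(1,1): the fraction of total variance
    attributable to differences between subjects.
    `data` has shape (n_subjects, n_sessions)."""
    n, k = data.shape
    subject_means = data.mean(axis=1)
    grand_mean = data.mean()
    # Between-subject and within-subject mean squares (one-way ANOVA)
    ms_between = k * np.sum((subject_means - grand_mean) ** 2) / (n - 1)
    ms_within = np.sum((data - subject_means[:, None]) ** 2) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

rng = np.random.default_rng(0)
true_scores = rng.normal(0, 2, size=(20, 1))  # inter-subject variability (SD 2)
noise = rng.normal(0, 1, size=(20, 3))        # intra-subject measurement noise (SD 1)
measurements = true_scores + noise            # 20 subjects, 3 repeated sessions
print(f"ICC = {icc_oneway(measurements):.2f}")
```

With the simulated variances above (inter-subject variability well above the measurement noise), the ICC comes out high, reflecting a measure in which between-subject differences dominate.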
How do I ensure that my data are of sufficient quality?
Quality assurance (QA) and quality control (QC) procedures exist to ensure and verify, respectively, the quality of neuroimaging data. Although somewhat intertwined, QA and QC are slightly different: QA is process-oriented and aims to boost our confidence in the data via routine system checks, whereas QC is product-oriented and verifies the quality of the final product of the pipeline. In his video, Pradeep Raamana briefly introduces QA and QC and outlines the different QC steps involved in the acquisition of neuroimaging data (from min. 3:47). Visualising and checking your neuroimaging data at all processing stages is absolutely essential. The most important yet basic tools you need are therefore an image viewer that allows simultaneous visualization of the three image planes, and of course, you as the observer! For more specialized QC, Pradeep presents a list of available neuroimaging QC tools per modality here, where he also presents use-cases for some of them.
To conduct QC successfully, one needs to be familiar with the common types and sources of artifacts encountered in neuroimaging data. Importantly, we need to keep in mind that QA and QC must be tailored separately to the specific nature of each neuroimaging modality.
In the videos of the ‘Taking Control of Your Neuroimaging Data’ session, some of these procedures are presented. Pradeep introduces common sources of artifacts in anatomical MRI (min. 8:14) and presents some tips and tricks for detecting artifacts in T1-weighted images (min. 19:08). Then, Martina Callaghan presents key metrics to perform scanner QA for functional MRI, emphasising the need to look for subtleties (min. 3:53). Here, the key is to establish whether the system fluctuations inherent in the acquisition procedure and hardware are sufficiently low to allow detection of BOLD-related signal changes in task-based and resting-state functional MRI. Martina Callaghan then presents some of the online (i.e. real-time) QC procedures for functional MRI (min. 17:17).
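As an illustration of this kind of stability check, one widely used functional MRI metric is the temporal signal-to-noise ratio (tSNR): the mean of each voxel’s time series divided by its standard deviation over time. The sketch below computes it on toy data; the array shape and noise level are made up for the example:

```python
import numpy as np

# Toy 4-D "fMRI" volume: 8 x 8 x 4 voxels, 100 time points.
# Real data would be loaded from disk (e.g. with nibabel).
rng = np.random.default_rng(42)
data = 1000 + rng.normal(0, 20, size=(8, 8, 4, 100))

# tSNR per voxel: temporal mean / temporal standard deviation.
# Low tSNR flags unstable voxels or regions worth inspecting.
tsnr = data.mean(axis=-1) / data.std(axis=-1)
print(f"median tSNR: {np.median(tsnr):.1f}")  # ~50 for this toy noise level
```

Maps of tSNR are a quick way to spot problem regions (e.g. signal dropout or scanner instabilities) before any analysis is run.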
Esther Kuehn then takes over and introduces artifacts in high-resolution functional MRI acquired at high field strength, with particular emphasis on cortical layer imaging applications, and presents some available means of artifact correction (from the beginning). In her video, Joset Etzel introduces a different aspect of QC for neuroimaging data, dataset QC, and talks about the importance of checklists and standard operating procedures (SOPs).
Dataset QC aims to verify whether a valid dataset (i.e. one that has already passed the various data QC steps) is also usable by different people at different times in different places; intuitive data organisation alone is not sufficient. Finally, in his video, Alexander Leemans introduces common artifacts in diffusion MRI, presents strategies for checking data quality and common errors made in this checking, and discusses how to correct the artifacts.
I’ve got so much data, how do I organise it?
Lots of neuroimaging data are acquired all over the world, and the resulting datasets are organized in different ways according to the personal preferences of the users or the labs. With Open Data, that is, data that is publicly accessible, picking up momentum, there is a growing need for standardization of neuroimaging datasets so that they are easy to use across a wide community of neuroscientists. The Brain Imaging Data Structure (BIDS) initiative aims to standardize neuroimaging data structures in order to make them interoperable, in line with the FAIR data principles. In this tutorial, the BIDS data structure is introduced as a practical means of achieving FAIR data. A number of BIDS resources, repositories and simple BIDS specifications are also given to get you started (min. 27:27), followed by a hands-on session on how to create and validate a basic BIDS dataset (min. 34:57). Also check out the TrainTrack session on BIDS at this year’s virtual meeting by Sam Nastase!
Jeffrey going through the benefits of the brain imaging data structure (BIDS).
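To give a flavour of what BIDS looks like on disk, here is a sketch that lays out a minimal two-subject skeleton using only Python’s standard library. The dataset name and file contents are illustrative; a real dataset also needs JSON sidecar files and should be checked with the BIDS validator:

```python
from pathlib import Path
import tempfile

# Minimal BIDS skeleton (illustrative). File names encode "entities":
# subject (sub-01), task (task-rest), and a modality suffix (T1w, bold).
root = Path(tempfile.mkdtemp()) / "my_bids_dataset"
for sub in ["sub-01", "sub-02"]:
    (root / sub / "anat").mkdir(parents=True)
    (root / sub / "func").mkdir(parents=True)
    (root / sub / "anat" / f"{sub}_T1w.nii.gz").touch()
    (root / sub / "func" / f"{sub}_task-rest_bold.nii.gz").touch()

# Every BIDS dataset must carry a dataset_description.json at its root.
(root / "dataset_description.json").write_text(
    '{"Name": "My dataset", "BIDSVersion": "1.8.0"}'
)
print(sorted(p.relative_to(root).as_posix() for p in root.rglob("*")))
```

Because the layout and naming are standardized, any BIDS-aware tool can discover the subjects, modalities and tasks in this dataset without custom configuration.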
Once you have nicely organised your data, they are also easier for other people to use. To make neuroimaging more reproducible overall, something else that can be done is to ensure that data do not get lost and forgotten; in short, that our data are Findable, Accessible, Interoperable and Reusable (or FAIR; see the educational course on FAIR data from min. 1:52 by Maryann Martone and Jeffrey Grethe).
This way, your science will be more robust, transparent and verifiable.
The problem is that making research FAIR as an afterthought is really tough. Indeed, generating or curating good quality data that abide by the FAIR principles requires some forethought (FAIR workshop min. 12:36). Not only do a lot of steps and expertise go into acquiring good quality data, but your data need to be in a format and in a place that make those data easy to use for your present self, your future self and for someone who is not yourself!
Useful tools for sharing the statistical maps from your study are the platforms NeuroVault and Neurosynth. In his video, Chris Gorgolewski goes through the advantages that uploading your maps has for you, such as fancy visualisations of your maps (min. 4:37), cognitive decoding of your maps (min. 5:25), searching for similar maps in papers (min. 6:25) and gene decoding (min. 7:04).
How can I make sure that my analysis workflow can be reproduced by others?
If you want all aspects of your study to be documented and reproducible, this of course also includes your analysis. The BIDS structure can help with setting up a reproducible workflow, but it is not sufficient. It also needs to be clear which processing steps were carried out, which analyses were done, with which software and which parameters, and so on. There are a lot of tools out there to help you, and the Center for Reproducible Neuroimaging Computation (ReproNim) held an extensive course on this at the 2018 annual meeting (and runs a whole webinar series on best practices for neuroimaging, if you are interested).
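A very lightweight habit in this spirit, far short of full provenance tools like DataLad or containers, is to save the exact software environment next to your results so a later re-run can be compared against the original. A minimal Python sketch (the dictionary keys are just illustrative):

```python
import json
import platform
import sys

# Record the computational environment alongside the results it produced.
# A fuller version would also capture package versions (e.g. via
# importlib.metadata) and the Git commit hash of the analysis code.
provenance = {
    "python": sys.version.split()[0],
    "platform": platform.platform(),
}
print(json.dumps(provenance, indent=2))
```

Dropping a small file like this into every results directory costs seconds and can save hours when you later need to work out which software produced which output.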
Starting with the “computational basis”, Yaroslav Halchenko gives an introduction to the Linux shell, including the importance of environment variables (from min. 12:50) to ensure you are running the right version of the software, how to use the shell history (from min. 23:40) to check whether you indeed ran the right commands, and how to write shell scripts (min. 29:30). He also shows how NeuroDebian can be used to search for and download software (min. 41:21).
Most people have probably heard the name Git before. (Did you know the official definition is “stupid content tracker”?) Yaroslav explains the Git philosophy in two minutes (min. 58:01) and shows the most important commands (min. 52:50). While Git is useful for keeping track of your scripts and for getting and providing code, a tool called DataLad (min. 1:03:17) can be used to do similar things with datasets. A hands-on session on this is provided in the Workflows for neuroimaging session from min. 47:20, and how this can be combined with specific statistical analyses is explained from min. 1:52:08.
Other tools that help you use consistent software within a study are containers and virtual machines. Dorota Jarecka gives a good overview of why these are very useful in research (from min. 7:39) and even guides you through some exercises (from min. 15:45). Jean-Baptiste Poline gives a short intro to Jupyter notebooks as a way of demonstrating your code to others (from min. 2:43:51).
This year’s OHBM Hackathon also has a session on Git by Steffen and Saskia Bollmann, one on good coding practices in Matlab by Agah Karakuzu, one on DataLad by Adina Wagner and one on containers by Tom Shaw and Steffen Bollmann.
You said that replicability also refers to other people being able to get the same outcome as my study, but if they test different participants, this is out of my control, right?
This is a good point; it is somewhat out of your control, but there are some ways in which you can help. First, being very transparent about what you did to your data will allow others to apply methods as similar as possible to yours. As Celia Greenwood explains (from min. 2:24:01), the final statistical measure that one tries to replicate involves a lot more than just the statistical test: it includes all the steps before it, such as the processing and the exclusion of outliers, which sometimes makes it hard even to work out what the null hypothesis is. She states that reproducibility in the statistical sense is about the final inference you make, so it is tied to the p-value, and this of course depends on your sample size and, to some extent, on chance. In a demonstration (from min. 2:34:24) she shows that if you draw different samples from the same population, there is huge variability in the p-values and effect sizes that you get across samples (even with sample sizes of N > 100), purely as a result of random sampling.
Celia illustrates the effect of random sampling on estimated effect sizes.
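Celia’s demonstration is easy to recreate yourself. The sketch below (assuming NumPy and SciPy are available; the true effect size, sample size and number of repetitions are arbitrary choices for illustration) repeatedly samples from one and the same population and records the p-value and observed effect size of a one-sample t-test:

```python
import numpy as np
from scipy import stats

# Draw many samples from the same population (true effect d = 0.3)
# and record each sample's p-value and observed Cohen's d.
rng = np.random.default_rng(1)
pvals, effects = [], []
for _ in range(1000):
    sample = rng.normal(loc=0.3, scale=1.0, size=120)   # N > 100
    _, p = stats.ttest_1samp(sample, 0.0)
    pvals.append(p)
    effects.append(sample.mean() / sample.std(ddof=1))  # observed Cohen's d

print(f"p-value range: {min(pvals):.2g} .. {max(pvals):.2g}")
print(f"effect size range: {min(effects):.2f} .. {max(effects):.2f}")
```

Even though every sample comes from a population with the same true effect, the p-values span several orders of magnitude and the observed effect sizes vary substantially, purely through random sampling.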
Is this why “most published research findings are false”?
Are you insisting on going back to things we have already discussed?! I suppose it is fair to say that there is more to it. A measure called the “predictive value” (or positive predictive value, PPV) is the probability of the alternative hypothesis being true given a significant test result. In his video, Jean-Baptiste (from min. 2:47:14) uses a Jupyter notebook to explain the Bayesian math behind this value and shows that it depends on the power of your study as well as on the prior odds of the alternative hypothesis being true relative to the null hypothesis. So the lower the power of your study, the less likely it is that the alternative hypothesis (usually what you are interested in) is true, even if you have a significant result. And most neuroscience studies do not have much power, as shown by Katherine Button.
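The arithmetic behind this is compact enough to sketch. In the formulation popularised by Ioannidis, the positive predictive value follows from the power (1 − β), the significance level α and the pre-study odds R that the alternative hypothesis is true; the numbers below are illustrative only:

```python
def positive_predictive_value(power, alpha, prior_odds):
    """P(H1 true | significant result) = (power * R) / (power * R + alpha),
    where R is the pre-study odds that H1 is true (Ioannidis, 2005)."""
    return (power * prior_odds) / (power * prior_odds + alpha)

# A well-powered study testing a plausible hypothesis:
print(round(positive_predictive_value(power=0.8, alpha=0.05, prior_odds=1.0), 2))  # 0.94
# An underpowered study testing a long-shot hypothesis:
print(round(positive_predictive_value(power=0.2, alpha=0.05, prior_odds=0.1), 2))  # 0.29
```

In the second scenario, a “significant” result is more likely to be false than true, which is exactly the mechanism behind the article’s provocative title.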
Well, you may now say: how do I know what my power will be? And is there even a point in doing my experiment, or will it just produce another false research finding!?
Good question. Power analysis for neuroimaging studies is not straightforward, but luckily some packages, such as fmripower and neuropower, have been developed to at least give you an educated guess of what your power might be. As Jeanette Mumford explains in her video (from min. 4:53), doing a power analysis has many benefits. She also gives some tips on how to assess other people’s power analyses (from min. 7:08) and on what to consider when estimating effect sizes from the literature (from min. 9:18). Jeanette also explains why power analysis becomes harder as the statistical model becomes more complex (from min. 11:59).
Jeanette talking about the power of different statistical models.
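When closed-form power calculations get awkward, simulation is a serviceable fallback. Here is a sketch of a simulation-based power estimate for a simple one-sample design; the effect size, sample sizes and the normal approximation to the t threshold are all illustrative simplifications, not recommendations:

```python
import numpy as np

def estimated_power(effect_size, n, n_sims=5000, seed=0):
    """Estimate power by simulation: the fraction of simulated studies of
    size n that detect an assumed true effect at (two-sided) alpha = 0.05."""
    rng = np.random.default_rng(seed)
    samples = rng.normal(effect_size, 1.0, size=(n_sims, n))
    t = samples.mean(axis=1) / (samples.std(axis=1, ddof=1) / np.sqrt(n))
    # Normal approximation to the t threshold; fine for illustration.
    return float(np.mean(np.abs(t) > 1.96))

print(f"power at N=20: {estimated_power(0.5, 20):.2f}")
print(f"power at N=80: {estimated_power(0.5, 80):.2f}")
```

The same recipe extends to more complex designs: simulate data under your assumed effect, run your actual analysis, and count how often it comes out significant.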
What else can I do to ensure best practices in neuroimaging?
Thorough reporting of what you have been doing in your data acquisition and analysis is always a good idea. Guidelines have been created by the Committee on Best Practices in Data Analysis and Sharing (COBIDAS; also see Tonya White’s video for the idea behind COBIDAS) for MRI and MEEG.
Various tools are available for testing your code. Also, if you publish your code on sites such as GitHub, other researchers can try it out and help develop it further.
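As a small taste of what testing analysis code can look like, here is a sketch of a unit test for a hypothetical helper function (the function and test names are made up; a test runner such as pytest would discover and run the test automatically, but here we simply call it):

```python
import statistics

def zscore(x):
    """Standardise a list of values to zero mean and unit (sample) SD."""
    mean = statistics.fmean(x)
    sd = statistics.stdev(x)
    return [(v - mean) / sd for v in x]

# A unit test asserting the properties the function promises.
def test_zscore_has_zero_mean_unit_sd():
    z = zscore([1.0, 2.0, 3.0, 4.0])
    assert abs(statistics.fmean(z)) < 1e-9
    assert abs(statistics.stdev(z) - 1.0) < 1e-9

test_zscore_has_zero_mean_unit_sd()
print("all tests passed")
```

Even a handful of such tests catches the silent bugs (a swapped sign, an off-by-one, a changed default) that can otherwise quietly corrupt an entire analysis.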
Preregistration and registered reports are becoming more and more popular for neuroimaging, meaning that more and more journals accept and encourage them. In her video, Pia Rotshtein explains the philosophy behind and principles of registered reports (from min. 11:06) and shows some examples (from min. 22:55).
Tonya telling us about the Committee on Best Practices in Data Analysis and Sharing.
If I get into all these things, will I still have time to do research?
That is why there are 36 hours in every day! Seriously though, this is all part of doing research! Often, however, efforts towards good practices in neuroimaging are not publishable by themselves and have not been well recognised. There are good reasons and incentives to follow Open Science practices as an individual researcher (for examples see this summary), and with the new OHBM initiative Aperture (see video and website), a new venue for unconventional research objects (such as software and documentation) is being created.
If this all still seems overwhelming and time-consuming, don’t worry. Most of the tools presented here have been developed to save you time and resources in the long run while making your research more sustainable. Think about the time one would spend re-acquiring a dataset because of a previously unnoticed problem with the scanner, trying to make sense of unintuitively organised data, or trying to find a mistake in long, badly structured code. Putting some of these preventative measures in place does not seem like such a big investment anymore.
If you’re hooked, stay tuned. The numerous emerging Open Science initiatives keep coming up with new ideas and tools for making research as a whole more reproducible and trustworthy, and for helping us brain mappers conduct neuroimaging research in more robust and applicable ways.