Programs of Research

I have been fortunate to work on a range of exciting and impactful programs of research over the years. Below is an in-depth overview of the research programs I have led (or contributed to) since 2017, along with the topic areas, statistical analyses, and software used. 

For information that precedes 2017, check out my employment & volunteer history on LinkedIn. For an abbreviated version, see my Resume/CV page.

Current Program of Research

My current program of research is focused on measurement issues in brain-behavior research during adolescence. This work is funded by a three-year F32 grant from the National Institute on Drug Abuse. In this work I am attempting to answer key measurement questions about validity and reliability in task-based fMRI using a variety of analytic techniques in R and Python. Because these projects use a large number of subjects with brain imaging (e.g., 400,000+ files/40+ TB of data), I use high-performance computing and storage across Stanford's Sherlock, the University of Minnesota's MSI and AWS.

Specific details of the grant can be found here and specific projects are described below.

Project 1

My first project is focused on modern psychometrics in neuroimaging data. Specifically, in prior work I had considered how group activity and brain-behavior activity fluctuate within a single sample. A single sample is often insufficient to make broad measurement conclusions. It is a great starting point, but the generalizability can be murky at times (at least it is in my case...). In the first project of my F32, I expanded this work by considering measurement invariance. Measurement invariance simply means that the structure of a measure (how it performs) should be relatively constant across samples. This way, when researchers make comparisons, they know a difference is due to some true group difference, say age, and not to the measure working differently across ages. Specifically, in this project I am using Confirmatory Factor Analyses, Exploratory Structural Equation Modeling, Exploratory Factor Analyses and Local Structural Equation Modeling to evaluate the measurement structure of a popular fMRI computerized task. The project will use three adolescent samples with nearly identical task designs.

Since I had the question defined and did not have access to the data, as described earlier, for this project I am using a registered report format. The Stage 1 registered report received an in-principle acceptance at Developmental Cognitive Neuroscience. Currently, I am doing the preprocessing and analyses on the three datasets. The Stage 1 code is available on GitHub.

Project 2

My second project is focused on issues of reliability in fMRI data. Specifically, I developed a Python package that can calculate different metrics that will be helpful for researchers working on measurement issues in neuroimaging. I have applied this tool in a project that evaluated the analytic impacts on individual and group fMRI estimates. The package, PyReliMRI, is currently available on GitHub, with associated documentation on its use at Read the Docs. The Stage 1 registered report has been accepted at the Peer Community in Registered Reports and a preprint is available on bioRxiv.
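To give a sense of what a reliability metric looks like in practice, here is a minimal pure-Python sketch of one widely used option, the intraclass correlation coefficient ICC(3,1). It is an illustration only and is not PyReliMRI's API: the function name `icc_3_1` and the toy test-retest scores are hypothetical.

```python
def icc_3_1(scores):
    """ICC(3,1): two-way mixed effects, consistency, single measurement.

    `scores` is a list of rows: one row per subject, one column per session.
    """
    n = len(scores)       # number of subjects
    k = len(scores[0])    # number of sessions
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Partition the total sum of squares into subject, session and error parts.
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_sessions = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_error = ss_total - ss_subjects - ss_sessions

    ms_subjects = ss_subjects / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_subjects - ms_error) / (ms_subjects + (k - 1) * ms_error)


# Hypothetical toy data: 3 subjects measured in 2 sessions.
shifted = [[1.0, 3.0], [2.0, 4.0], [3.0, 5.0]]  # session 2 = session 1 + 2
noisy = [[1.0, 2.0], [2.0, 1.0], [3.0, 3.0]]    # rank order partly shuffled

icc_shifted = icc_3_1(shifted)
icc_noisy = icc_3_1(noisy)
print(icc_shifted, icc_noisy)
```

In the first toy dataset every subject shifts by the same amount between sessions, so the consistency ICC stays at its maximum; in the second, the subjects' relative ordering changes and the estimate drops.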

In-kind Project 3

For the F32, one of the samples comes from the ABCD Study®. The ABCD Study is a large consortium study across the United States (21 sites) that is attempting to answer key behavioral and developmental questions. One major measurement component is neuroimaging, including task-based fMRI, resting-state fMRI, structural and diffusion acquisitions. In addition to the coordinating center, several other groups are performing preprocessing steps on these data. One group, the ABCD-BIDS Community Collection (ABCC), is releasing a BIDS-formatted version of the data as well as several derivatives of the input data. Due to my use of the ABCC data for Projects 1 & 2, I am developing preprocessing pipelines for the ABCD Study data that use MRIQC and fMRIPrep. My role is to lead the MRIQC and fMRIPrep steps so the ABCC group can release these data and derivatives to the NIH data archive, where researchers across the scientific community can access and use them without needing thousands of hours of compute time and computing nodes.

As a complement to this project, I have also been working to develop individual and group level reports to summarize the ABCD behavioral data. Given the size of these data, this would permit quick and efficient visual reports that researchers can use when making inclusion/exclusion decisions and that give a holistic view when interpreting their results. The project has been leveraging open source tools from nipreps/MRIQC to generate reactive summaries that can aggregate 10,000+ datapoints in a few figures and leverage interactive features to examine and learn more about individual datapoints.

Measurement & Reproducibility

An evolving interest of mine involves issues of measurement and the reproducibility of model estimates and interpretations. With respect to measurement, a big proportion of what researchers do with data comes down to measurement. The consistency (i.e., reliability) and accuracy (i.e., validity) of measurement in research, like other things, are extremely important. For example, if you have a thermometer that is telling you it's 110F outside, when it's actually 77F, how useful is that thermometer? Sure, if the thermometer is consistently off by 33F, you can adjust the values and still use the thermometer (I suppose this would be quite useful before thermometers were easily accessible, so you'd consider this new measure a blessing). What if you took five measurements, and you got 110F, 101F, 94F, 115F and 77F? What use is this thermometer and how can you even fix it after the fact? You usually can't -- so you might as well throw away the thermometer. But what if you were wrong all along and overlooked an important measurement issue that was contributing to this variability in temperatures? Say the different readings came when the thermometer was in direct sun, in a hot car, under a blanket or in a regulated temperature space? You may inadvertently throw away a tool that is useful, simply because you didn't follow the instructions on how to use the tool.
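The thermometer example can be put into numbers with a short sketch. It assumes the true temperature is 77F, uses the five readings from the example, and adds a hypothetical "consistent" thermometer that reads 110F every time; the helper name `bias_and_spread` is also just for illustration.

```python
import statistics

true_temp_f = 77.0  # the actual temperature in the example

# Hypothetical thermometer that is always off by 33F: biased but consistent.
consistent = [110.0, 110.0, 110.0, 110.0, 110.0]
# The five readings from the example: inconsistent from one reading to the next.
inconsistent = [110.0, 101.0, 94.0, 115.0, 77.0]


def bias_and_spread(readings, truth):
    """Return (mean error relative to truth, standard deviation of readings)."""
    errors = [r - truth for r in readings]
    return statistics.mean(errors), statistics.stdev(readings)


consistent_bias, consistent_sd = bias_and_spread(consistent, true_temp_f)
inconsistent_bias, inconsistent_sd = bias_and_spread(inconsistent, true_temp_f)

print(f"consistent:   bias={consistent_bias:.1f}F, sd={consistent_sd:.1f}F")
print(f"inconsistent: bias={inconsistent_bias:.1f}F, sd={inconsistent_sd:.1f}F")

# A constant bias can be subtracted out; a large spread cannot.
corrected = [r - consistent_bias for r in consistent]
```

The consistent thermometer has a large bias but no spread, so subtracting 33F recovers 77F every time. The inconsistent one has a spread of roughly 15F, and no single correction fixes all five readings.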

The same problem can arise in psychological, behavioral and/or neurobiological research. Whether you're conducting hypothesis-driven work or data-driven prediction models, the dataset of numbers (continuous, ordinal and/or nominal) that are used as inputs into the statistical models will definitely impact the outputs and your takeaways. So if a measure is inconsistent and/or not accurate, you fall victim to the saying, "garbage in, garbage out"... Sadly, more often than not, commonly used measures do not come with very detailed instructions, and so the onus is on the researcher to ensure the interpretation is valid and reliable to the extent that seems appropriate (i.e., reliability and validity may be EXTREMELY important before brain surgery compared to the prediction of who will score high or low on a math test).

Reproducibility is also key to outputs and takeaways for any analysis. If one researcher runs an analysis on a data set and produces some results/conclusions, but another researcher cannot reproduce the results using the same data and methods, is this still useful? How do you reconcile these differences? Which result is correct -- the one that supports the narrative?

Similarities and Differences in Alternative Definitions of Brain Activity

Measurement issues are especially important in studying neural activity. In some ways, because soooo much goes into getting the data of brain activity, they may be more critical than in self-report and behavioral data. The final neuroimaging signal is several-fold more steps removed from what is initially collected than self-report data are.

In the section on adolescent risk taking, I described how we found no evidence for a key element of a modern theoretical framework between different risk taking groups. In this 2020 study (Demidenko et al., 2020), we made some decisions in testing the hypotheses that some may not agree with or would do entirely differently. Which is a cool thing about science -- we can ask things in different ways and hopefully converge on the same results. This convergence of results is a question that I was interested in for the second study in my dissertation, because perhaps my choices were... dare I say it... wrong.

The question: How do different definitions (i.e., operationalizations) of the same neural measure a) relate in their neural activity and b) converge in the conclusions about associations with some behavioral measure? In this publication (Demidenko et al., 2021), a data-driven study, we evaluated how different definitions used to arrive at the neural activity in the brain impacted the magnitude and direction of brain-behavior associations. In a way, in this study we found converging evidence for what we reported in the trait and state study (Demidenko et al., 2019). Specifically, while there may be verbal agreements and definitions about what set of task contrasts (what is often used in fMRI research to get at a specific mental process in a task) may be related, when they are empirically evaluated the brain-brain associations (i.e., covariation of mean signal intensity) are not always consistent. These nuanced differences are pretty important, because researchers often use different contrasts in their studies for different reasons (some well justified). So the difficulty of identifying when findings do and do not converge can increase exponentially as the number of parameters that differ between studies increases. Moreover, in the case of this study (Demidenko et al., 2021), the subtle differences in defining a contrast to get at neural activity in the brain produced a wide range of associations with self-reported behavior that were really difficult to interpret. This poses concerns about the measures being used in research programs that rely on neural activity data and self-report/behavioral data.

Reproducing Results and Impacts of Defining Measures

As mentioned earlier, having a science that is reproducible and/or replicable is important. Furthermore, if there are different ways to get at a numerical representation of a variable (i.e., operationalizations), it is important to have a reasonable explanation of when/why the data may not converge.

In developmental cognitive neuroscience, one important topic is the effect of the family environment on the developing brain. The brain is malleable and there are several sensitive periods. Stressful environments, such as harsh parenting, disadvantaged neighborhoods and high crime neighborhoods, may alter developmental trajectories and influence future behaviors and health-related outcomes. Large consortium studies have been used more and more to ask these specific questions. One study in 2019 published findings that pubertal development explained a significant amount of the associations between the family environment and brain development. Because these data are accessible with specific authorizations, a group of researchers and I wanted to test the reproducibility and extension of the findings in the 2019 publication.

The question: In the open dataset, can we replicate the results of the initial study, and is there converging evidence across alternative definitions of the key self-report variables 'Family Environment' and 'Pubertal Development'? In this publication (Demidenko et al., 2022), we found that we could replicate the direction (i.e., positive/negative association in the initial study and the replication study) of nearly all of the reported effects (90%) and the majority of the non-significant/significant (i.e., p > .05 or p < .05) categorizations (60%). With respect to alternative definitions of the family environment variable, we found quite a bit of variability in the key findings, in that some alternative definitions would impact the conclusions. Furthermore, in the context of the definition of pubertal development (i.e., parent-reported or child self-reported), we found that there was nearly no similarity in the interpretation when using the parent- versus child-reported pubertal development. This study demonstrates that effects are replicable in large samples; however, the conclusions may differ depending on what is being interpreted as meaningful: the p-value or the magnitude/direction of the associations. Importantly, we demonstrated that when a study can define a specific variable, such as the family environment, using a large set of variables from the dataset, the conclusions can change depending on how the variable is defined and which variables are used. In the context of the family environment and brain development, this raises a point of discussion, as we have shown in prior work that we were not able to confirm a hypothesis in a registered report (Demidenko et al., 2021).
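The two replication criteria described above (same direction, same significant/non-significant categorization) can be computed with a few lines of code. The sketch below uses made-up effects, not the study's data, and the function name `replication_agreement` is hypothetical.

```python
def replication_agreement(original, replication, alpha=0.05):
    """Return the proportion of effects that agree in (direction, significance).

    Each input is a list of (estimate, p_value) tuples, one per effect,
    in the same order for the original and the replication study.
    """
    pairs = list(zip(original, replication))
    # Direction agrees when both estimates have the same sign.
    same_direction = sum((o[0] > 0) == (r[0] > 0) for o, r in pairs)
    # Categorization agrees when both p-values fall on the same side of alpha.
    same_category = sum((o[1] < alpha) == (r[1] < alpha) for o, r in pairs)
    return same_direction / len(pairs), same_category / len(pairs)


# Made-up (estimate, p-value) pairs for five effects; illustration only.
original = [(0.20, 0.01), (-0.15, 0.03), (0.05, 0.40), (0.30, 0.001), (-0.10, 0.04)]
replication = [(0.12, 0.02), (-0.08, 0.20), (0.02, 0.60), (0.25, 0.004), (0.03, 0.70)]

direction_rate, category_rate = replication_agreement(original, replication)
print(f"same direction: {direction_rate:.0%}, same category: {category_rate:.0%}")
```

With these toy numbers the direction replicates more often than the p-value categorization, which is the same pattern of disagreement between criteria discussed in the text.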

Data Analysis & Coding Software

Over the course of the last five years, I have been able to learn and apply different statistical models and use multiple statistical/coding programs.

Data Management & Statistical Analyses

Across several projects during my graduate and postdoctoral training, I have been able to apply several statistical analyses. 

Demidenko et al. (2019): I handled and prepared the data, ran/reported descriptive statistics (means, standard deviations, counts, min/max), product-moment correlations, hierarchical multiple regression (on a continuous outcome) and ordinal regression (i.e., ordinal outcome).

Demidenko et al. (2020): I handled and prepared the self-report and neuroimaging data. I ran/reported descriptive statistics for key demographic, self-report and behavioral variables. For the neuroimaging data, I preprocessed and modeled the timeseries data, and ran the group-level, whole brain non-parametric analyses (5000 permutations). In addition to the whole brain analyses, I performed region of interest analyses using multiple regression. I created all visualizations for key models.

Demidenko et al. (2021): I handled and prepared the self-report and neuroimaging data. I ran and reported the descriptive statistics, the whole brain activation GLM contrast analyses and the region of interest product-moment correlation matrix. In addition, I worked with code from colleagues to extract and plot timeseries for select regions. I created the majority of the visualizations for key models.

Demidenko et al. (2021): I handled and prepared the self-report and neuroimaging data. I ran and reported the descriptive statistics and product-moment correlations. I created the majority of the visualizations.

Demidenko et al. (2022): I handled and prepared the self-report and neuroimaging data. I ran and reported the descriptive statistics, product-moment correlations of key variables and the mediation analyses using structural equation modeling. I wrote the code to run the multiverse analyses and tailored the output structure to be compatible with the specr package. I created all of the visualizations.

Demidenko et al. (2022):  I handled and prepared the self-report and neuroimaging data. I extracted the timeseries data and ran Group Iterative Multiple Model Estimation (GIMME). I extracted key a priori parameters and used these in the brain-behavior models using logistic regression and multiple regression. I created all of the visualizations.

Beltz et al. (2021), Beltz et al. (2022) & Constante et al. (2022): I handled and prepared the neuroimaging data. I extracted the timeseries data and provided it to the lead author for subsequent analyses. I created some visualizations.

Demidenko et al. (2023): I handled and prepared the self-report data. I ran descriptive statistics, product-moment correlations and multilevel models. This project includes multiple collaborators across multiple institutes.

Demidenko et al. (2022, Stage 1 Registered Report [received in-principle acceptance]): Wrote and piloted simulated models in R and piloted fMRI analyses in Python. Packaged and shared associated code via GitHub.

Demidenko et al. (2023, Stage 1 Registered Report [under review]): Wrote and piloted simulated models in R and piloted fMRI analyses in Python. Led the curation of a Python-based library calculating reliability estimates on 3D neuroimaging data. Prepared GitHub and Read the Docs documentation. Packaged and shared associated code via GitHub.

Coding & Statistical Software

For data analyses, I have worked with R, Python, JASP and Mplus statistical software. As mentioned in my current research programs, I have written a Python library to estimate different types of reliabilities on neuroimaging data.

For neuroimaging analyses, I have worked with Linux, FSL, Nilearn, Python and R. 


The bulk of science is funded by taxpayers' money through the National Institutes of Health ($45 billion 2023 budget) or the National Science Foundation ($10.5 billion 2023 budget). This means the research, products and findings are taxpayer owned (I'm willing to be debated on this... 2663 Mission St in SF at 11:37pm. Be there!). In my opinion, this means taxpayer-funded research should abide by open science practices. This includes but is not limited to: Sharing Code, Making Data Public, Making Publications Open Access. In addition to making science open, science should also not be overly skewed by a negative incentive structure, such as rewarding only shiny findings/stories.

Registered Reports 

When it is possible, it is beneficial for research studies to be submitted as registered reports. In simple terms, for a registered report the researcher describes the justification (introduction) and the population and analyses (methods) before the data is accessed/acquired. This serves multiple purposes, two of which I highlight here. First, it encourages the researcher to follow the scientific method in hypothesis-driven research (but is not limited to this, as it's flexible for exploratory work, too) by justifying the work and the methods before any research begins. Before this is finalized, reviewers at journals can review and provide feedback on the work before the research is performed. Hence, before the analyses are performed the author and reviewers are already in agreement, so there are no post-hoc what-ifs, redoing analyses or having an unpublishable study before it even begins (given the incentive structure in research, lots of studies go unpublished...). Second, the researcher reports all results in the publication, not just the flashy significant ones. In this case, researchers get the benefit of publications (which helps with applying for academic jobs) without the pressure of having to p-hack (unfortunate reality of the business) to get significant findings and without biasing the published literature by discarding non-significant findings.

When it has been possible, I have tried to incorporate registered reports into my research. Specifically, when I am planning to do a study on secondary data that I have not accessed, or I am beginning a grant I have not started, registered reports are a great avenue when there are clear plans and hypotheses. Unfortunately, given that I already had access to the data and/or had not yet known about registered reports, I was unable to apply this in my dissertation work. However, I have used registered reports with secondary open data. We have asked key neurodevelopmental questions about environmental factors, brain functioning and internalizing symptoms in the large ABCD study (Demidenko et al., 2021). In addition, I have contributed to another registered report that has received stage 1 approval (Ip et al., 2022) and have received stage 1 approval for a project that addresses one of the research aims proposed in my NIDA F32 grant proposal (Demidenko et al., 2023).


Sometimes a research program cannot use a registered report. Either the data is already accessed or it has been published by a team member. In this case, an alternative method can be used: preregistration. While preregistrations significantly differ from registered reports, preregistrations allow researchers to document their analyses and provide a time-stamp of the document before the work begins. This permits the researcher to be explicit in what they plan to do and how they plan to do it for that specific research study. Then, when the work is completed and submitted for publication, the researcher can be explicit about how/when they deviated from this time-stamped plan. The major difference between preregistrations and registered reports is that a preregistration requires more self-monitoring, and so it can be manipulated (e.g., one can preregister after they had run analyses) a lot more easily than registered reports.

There have been several scenarios where I had either worked with the research data or had access to the research data, which prevented me from using registered reports. In these scenarios, I tried to use preregistrations. In my first preregistration (Demidenko et al., 2021), I was performing analyses on fMRI data that I already worked with and had preprocessed (sequence of data cleaning steps for MRI/fMRI). For this project, the co-authors and I met on multiple occasions and outlined the goals and analytic plans. I wrote this plan out in a template and then submitted it on the OSF platform. After this first preregistration and having worked with registered reports, I learned about the benefits of both techniques. In my subsequent preregistration (Demidenko et al., 2022), I attempted to use the registered report approach in the preregistration framework. Since we couldn't submit the work as a registered report, after we received interest from an editor for our project at a journal, we outlined our introduction and methods and drafted the analytic code. Once we agreed on these materials, we submitted the preregistration on the OSF platform and what would be the stage 1 draft/code on GitHub. Unlike a traditional registered report, we could not have the stage 1 reviewed by reviewers at a journal, but we could abide by a similar sequence of steps internally. Furthermore, it increased our precision in 1) the justification and 2) the methods/analyses.

Adolescent Risk Taking/Decision Making

One of my initial academic interests was risk taking (specifically, substance use-related behaviors) during adolescence. Adolescents make up over 1 billion of the world's population and this age range (10-25 [definition of adolescence has varied over the 20th and 21st century, see Sawyer et al., 2018]) is marked by distinct increases in deaths from homicide, suicide and unintentional injuries. One category that contributes to these mortality rates is substance use. Identifying ways to reduce mortality rates as a result of substance use's role in unintentional injuries would benefit society greatly.

Adolescent Trait/State Measures in Context of Substance Use

Modern theoretical frameworks hypothesize that there are distinct traits and states in adolescents that give rise to risk taking behaviors, such as substance use. These traits/states include being more sensitive to rewards (i.e., positive experiences) and being less likely to self-regulate (i.e., being more impetuous/impulsive). In studies, researchers often have adolescents self-report about the trait(s) (using 1-5 scales) and/or perform well-designed experimental tasks that evoke different state(s) (via computerized experimental tasks). The numerical values extracted from the self-report scales and/or experimental tasks are sometimes believed to belong to related processes, such as being more/less sensitive to positive experiences. However, for the interpretations to converge across research teams (in magnitude and/or direction) with a specific theoretical framework, a key assumption needs to be tested. That is, verbally defined trait and state measures may be argued to be related but they should also be empirically related. Otherwise, interpreting findings can be quite challenging. Moreover, if the hypothesis is that they then both [more or less equally] associate with substance use behaviors, this should be apparent in the data.

To answer the question: Are state and trait measures related (i.e., convergent/discriminant validity) and are they similarly related in direction/magnitude with substance use behaviors (i.e., predictive validity)?, I used a sample of 2000+ adolescents to address this in my master's thesis. In this publication (Demidenko et al., 2019), the empirical study demonstrated that [in our sample] there was inadequate empirical evidence to suggest that trait and state measures were representing the same psychological process (also referred to as a 'construct'). Furthermore, the trait (self-report) and state (computerized tasks) measures did not associate with substance use at a similar magnitude. Consistently, self-report measures were related to substance use at a greater magnitude than the derived parameters (i.e., numerical representations of a process) from the computerized tasks. Around the time of this publication, others demonstrated a similar problem. In a 2023 APA handbook chapter (Keating et al., 2023), we discussed some of these issues further as they relate to cognitive development during adolescence.

Differences in Neural Activity Between High and Avg/Low Adolescent Risk Takers

Similar to the above problem, another element of modern theoretical frameworks of risk taking in adolescents is focused on differences in neural activity to rewarding/positive experiences. Specifically, it is hypothesized that brain regions that are sensitive to rewards may, in part, contribute to the influx of risk taking observed during adolescence. This is often investigated across distinct developmental stages, such as adults, adolescents and children (a definition that has changed a TON over the years!). In these studies, researchers compare these distinct developmental groups (or stages) to see how their brain activity differs in response to specific types of stimuli and behaviors in the MRI scanner. Together with other information, this type of evidence is often used to make conclusions about what does/doesn't increase risk taking during adolescence. While informative, this type of evidence lacks specificity. For example, if the assertion is that neural activity is different between those who do and don't engage in substance use, why not limit the scope to adolescents that do and do not engage in these behaviors and see how their brain activity differs?

To answer the question: How does neural activity in the brain differ between high and avg/low risk taking adolescents?, I used 108 adolescent fMRI scans as part of my dissertation. In this publication (Demidenko et al., 2020), the empirical study demonstrated that there were no significant differences between high and avg/low risk taking adolescents (17-21) in key brain regions when engaged in a monetary reward computerized task during fMRI. In fact, when extracting the average neural activity from specific brain regions believed to be important to a popular neurodevelopmental framework, we were unable to relate the neural activity to risk taking using a single or multi-wave definition of risk taking (i.e., substance use). This highlights the lack of specificity of the theoretical framework in differentiating risk taking profiles during adolescence. More importantly, it puts into question some of the hyperbolic statements that are made about teenagers, such as those shown on Vox's 'The Teenage Brain' on Netflix.