Introduction

There has been a significant increase in the number of children exposed to illicit drugs and alcohol in the prenatal period over the past two decades. During the period 1999–2014, the prevalence of opioid use during pregnancy increased from 1.5 to 6.5 per 1000 deliveries.1 Prenatal alcohol exposure (PAE) has been an ongoing problem, with an estimated 22.5% of women drinking during the first month of pregnancy2 and 11.3% having consumed alcohol within the last 30 days throughout pregnancy.3,4 Prospective studies evaluating the impact of prenatal drug exposure on neurodevelopment during the first 2 years of life are important to help identify early deficits in cognitive, language, or motor skills, and provide opportunities for early interventions. Due to the limited availability of validated infant scales, identifying those infants at the highest risk has been a challenge.

A compounding factor in the developmental outcome of children with prenatal drug exposure is the pre- and postnatal environment, which can impact development over time.5 Prior studies have shown that low family socio-economic status (SES) is often associated with reduced scores in language and motor domains of the Bayley Scales of Infant Development (BSID) in infants 6–9 months of age.6 Furthermore, infants with failure to thrive and exposure to non-accidental trauma have demonstrated reduced mental development scores, and children with failure to thrive also had reduced motor scores.7 In addition, maternal mental health during pregnancy and the postpartum period can influence infant cognitive and behavioral development.8 Therefore, it is important to examine prenatal exposures in the context of other pre- and postnatal variables to best understand infant development.

The BSID9 is the most widely used scale of developmental outcome in infants and young children.10 Studies related to the early developmental sequelae of prenatal drug and alcohol exposure using the BSID scales have had heterogeneous results. PAE was associated with decreased BSID-II scores in areas of cognitive and motor development in children at 6 months of age, and this association was mediated by the child’s gestational age at birth and SES.5 Skumlien et al. found that boys exposed to opioids or alcohol had lower BSID-III cognitive and language scores at a median age of 10.4 months (cognitive scale) and 9.4 months (language scale) when compared to unexposed children.11 Another study found differences in BSID-III fine and gross motor scores in children with PAE at 6 and 24 months, but not in language development.12 In contrast, some studies found that prenatal opioid exposure was not associated with adverse cognitive or executive functioning outcomes after adjustment for socio-economic factors.13,14,15 Previous studies by our group found no differences in BSID-III cognitive, language, or motor scores at 5–8 months of age in infants with mild/moderate PAE or prenatal exposure to medications for opioid use disorder (MOUD) compared to an unexposed control (UC) group.16,17

In this study, we investigated whether BSID-III developmental scores changed over time, from 6 to 20 months of age, in a cohort of children with prenatal alcohol and opioid exposure. We hypothesized that children classified into the prenatal exposure groups would have a significant decline in BSID-III scores between 6 and 20 months of age.

Methods

Study design and population

Data for this analysis were derived from the prospective cohort study, Ethanol, Neurodevelopment, Infant, and Child Health (ENRICH-1).18 The ENRICH-1 study was conducted at the University of New Mexico (UNM) and the Mind Research Network to assess the effects of prenatal alcohol and opioid exposures, as two primary exposures of interest, on infant development. Prospective data collection occurred over four study visits: visit 1 (V1) baseline interview and biological sample collection during pregnancy (on average, 25.4 ± 7.2 gestational weeks); visit 2 (V2) interview and biological sample collection at delivery; visit 3 (V3) caregiver interview and neurodevelopmental assessments when the child was approximately 6 months of age; and visit 4 (V4) caregiver interview and neurodevelopmental assessments when the child was approximately 20 months of age. The developmental assessment windows were chosen given the project’s a priori focus on early indices of child development during the first 2 years of life. By 20 months of age, critical cognitive functions known to be affected in individuals with a fetal alcohol spectrum disorder (FASD) typically emerge, providing a first look at the impact of PAE on higher cognitive functioning in at-risk children. Participant recruitment commenced in 2013, and the prospective follow-up of participants was completed in 2019.

The sample size was 105 maternal–infant pairs at V3 and 72 at V4; the final sample in this repeated measures analysis comprised 69 maternal–infant pairs (3 subjects who completed BSID-III at V4 had missing data at V3). All study procedures were approved by the UNM Health Sciences Center Human Research Review Committee, and all pregnant women provided written informed consent prior to participating.

Participants were recruited from prenatal care clinics affiliated with UNM, including the specialty clinic dedicated to providing prenatal and postpartum care to women with substance use disorders and their infants. Participants were recruited into one of four study groups: (1) UC; (2) patients receiving MOUD; (3) patients who consumed alcohol during pregnancy (PAE); and (4) patients with combined opioid and alcohol (MOUD+PAE) use.

Overall inclusion criteria in the ENRICH-1 cohort were: (1) gestational age at enrollment between 12 and 35 weeks, (2) ultrasound confirmation of a singleton pregnancy, (3) delivery at UNM hospital, (4) intention to remain in New Mexico for 2 years following delivery in order to complete follow-up study visits, and (5) ability to give informed consent in English. Exclusion criteria for all groups included (1) diagnosis of a major fetal structural anomaly, (2) more than minimal use of stimulants (cocaine, methamphetamines, or MDMA) in the first trimester, defined as more than monthly self-reported use or >1 positive urine drug test, and (3) any use of stimulants during the second and third trimesters assessed by self-report or urine drug test. In addition, participants with tobacco, marijuana, or alcohol use (assessed via self-report and alcohol biomarkers, described below) after the last menstrual period (LMP) were not eligible for participation as UCs.

Assessment of PAE

PAE was assessed via a comprehensive battery of measures, including self-report and ethanol biomarkers. Participants were asked to complete three timeline follow-back (TLFB) calendars: TLFB-1 completed at baseline interview assessed alcohol use during the periconceptional period (the 2 weeks before and 2 weeks after the LMP); TLFB-2 captured 30 days prior to enrollment/V1; and TLFB-3 captured 30 days prior to delivery/V2. Average ounces of absolute alcohol consumed per day (AA/day) were calculated from the number of standard drink units reported for each of the three TLFB calendars. AA/day across pregnancy was calculated as an average across the three TLFB calendars. In addition to self-report, alcohol exposure was also assessed via a battery of ethanol biomarkers measured in blood and urine specimens at V1 and V2. Carbohydrate-deficient transferrin (CDT), phosphatidylethanol (PEth), and gamma glutamyl transferase (GGT) were analyzed in maternal blood, and ethyl sulfate (EtS) and ethyl glucuronide (EtG) in maternal urine. In addition, PEth was measured in infant dried blood spots at delivery (PEth-DBS).
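
The AA/day calculation described above can be sketched as follows. This is a minimal illustration, not the study's code: the conversion factor of 0.5 oz of absolute alcohol per standard drink is an assumption inferred from the reported equivalence of 0.7 AA/day to approximately 10 drinks/week, and the function names and the example participant are hypothetical.

```python
# Assumption: one standard drink contains about 0.5 oz of absolute alcohol.
# This factor is inferred from the reported equivalence (0.7 AA/day is
# roughly 10 drinks/week); the study's exact conversion is not stated.
OZ_AA_PER_STANDARD_DRINK = 0.5


def aa_per_day(standard_drinks: float, days: int) -> float:
    """Average ounces of absolute alcohol per day over one TLFB window."""
    if days <= 0:
        raise ValueError("days must be positive")
    return standard_drinks * OZ_AA_PER_STANDARD_DRINK / days


def aa_per_day_across_pregnancy(windows) -> float:
    """Unweighted mean of AA/day over (standard_drinks, days) TLFB windows."""
    values = [aa_per_day(drinks, days) for drinks, days in windows]
    return sum(values) / len(values)


# Hypothetical participant: TLFB-1 covers 28 days around the LMP,
# TLFB-2 and TLFB-3 each cover 30 days.
example_windows = [(40, 28), (6, 30), (0, 30)]
print(round(aa_per_day_across_pregnancy(example_windows), 3))
```

Averaging the three windows without weighting mirrors the description of AA/day across pregnancy as an average across the three calendars; a day-weighted average would give a slightly different value.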

Participants with or without prenatal opioid exposure that reported periconceptional use of alcohol consisting of greater than 13 drinks or at least 2 binge episodes in the month surrounding LMP were initially classified as alcohol exposed (PAE or MOUD+PAE). Periconceptional alcohol use was the basis for initial classification because there is less stigma surrounding pre-pregnancy drinking, making it more likely to be reported accurately. In addition, risky periconceptional drinking has been associated with alcohol consumption later in pregnancy19 and child developmental outcomes.20 To remain eligible in the PAE and PAE+MOUD groups, the second tier eligibility criteria included consumption of alcohol during pregnancy per TLFB-2 and TLFB-3 reports or at least one positive ethanol biomarker at V1 or V2. Non-PAE participants had no/minimal periconceptional alcohol use (no more than 2 drinks per week) and no alcohol use during pregnancy (per self-report on TLFB-2 and TLFB-3 calendars and negative tests on all ethanol biomarkers).
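
The two-tier classification above can be expressed as a small decision rule. The sketch below is illustrative only: the record fields, the function name, and the "unclassified" label are hypothetical, and the weekly rate assumes the periconceptional window spans 4 weeks.

```python
from dataclasses import dataclass


@dataclass
class AlcoholRecord:
    periconceptional_drinks: float   # total drinks in the month around LMP (TLFB-1)
    periconceptional_binges: int     # binge episodes in the same window
    drinks_tlfb2: float              # drinks in the 30 days before V1
    drinks_tlfb3: float              # drinks in the 30 days before V2
    positive_biomarkers: int         # positive ethanol biomarkers at V1 or V2


def classify_alcohol_exposure(record: AlcoholRecord) -> str:
    """Apply the two-tier alcohol criteria as described in the text."""
    # Tier 1: more than 13 drinks or at least 2 binge episodes around LMP.
    tier1 = (record.periconceptional_drinks > 13
             or record.periconceptional_binges >= 2)
    # Tier 2: self-reported drinking in pregnancy or any positive biomarker.
    drank_in_pregnancy = record.drinks_tlfb2 > 0 or record.drinks_tlfb3 > 0
    tier2 = drank_in_pregnancy or record.positive_biomarkers >= 1
    if tier1 and tier2:
        return "alcohol-exposed"
    # Non-exposed: at most 2 drinks/week around LMP (assuming a 4-week
    # window), no drinking in pregnancy, and all biomarkers negative.
    weekly = record.periconceptional_drinks / 4
    if weekly <= 2 and not drank_in_pregnancy and record.positive_biomarkers == 0:
        return "non-exposed"
    return "unclassified"


print(classify_alcohol_exposure(AlcoholRecord(20, 0, 3, 0, 0)))
```

Records that satisfy neither set of criteria fall through to "unclassified", which reflects that the text defines only the two groups and does not say how intermediate cases were handled.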

Assessment of prenatal opioid exposure and co-exposures

Participants receiving MOUD, such as methadone or buprenorphine, with or without other opioids (illicit or misuse of prescription opioids), were identified from the specialty clinic based on pre-screening of medical records, structured interviews, and a study-specific 7-drug urine panel (amphetamines, barbiturates, benzodiazepines, cocaine, opiates, PCP, cannabinoids/THC) administered at V1 and V2. Co-exposures to illicit and prescription drugs were captured at V1 and V2 utilizing the 7-drug urine panel (above) and self-report via structured interview.

Assessment of neurobehavioral outcomes and postnatal environment

A battery of neurodevelopmental assessments and validated questionnaires was completed to assess child development. Assessments were conducted at V3 when children were between 5 and 8 months of age and at V4 when children were between 18 and 22 months of age (both adjusted for prematurity). Developmental assessments included administration of the BSID-III. The three subscales, Gross and Fine Motor, Cognitive, and Language, which involve direct assessment by a pediatric developmental diagnostician (J.L.) who was blinded to participant exposure status, constituted the basis of this report. The BSID-III Cognitive, Language, and Motor composite scores have a mean of 100 and a standard deviation of 15; scores range from 55 to 145 for the cognitive composite score and from 45 to 155 for the language and motor composite scores.

Postnatal environment was assessed via structured interviews, including the number and ages of children living in the home, number of homes the child had lived in since birth, family SES assessed with the Barratt Simplified Measure of Social Status,21 household income, working status (outside the home) of the primary caregiver, years of education completed by the primary caregiver, and child/family participation in early intervention programs.

Statistical analyses

Differences in demographic, medical, and substance use characteristics among the four study groups (PAE, MOUD+PAE, MOUD, UC) were compared using χ2 or Fisher’s exact test for categorical variables and the Kruskal–Wallis test for continuous variables. Statistical significance for the unadjusted analyses examining the association between the study group and BSID-III scores at 6 and 20 months, as well as the change in scores between 6 and 20 months, was determined using Kruskal–Wallis tests.
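
For readers unfamiliar with the Kruskal–Wallis test, the following is a minimal pure-Python sketch of the H statistic with tie correction. It is not the SAS procedure used in the study; the function names are hypothetical, the scores are made-up illustration values, and the p value helper uses the closed-form chi-square tail for 3 degrees of freedom, which applies only to a four-group comparison.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with the standard tie correction."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    counts = {}
    for x in pooled:
        counts[x] = counts.get(x, 0) + 1
    correction = 1 - sum(c ** 3 - c for c in counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Made-up scores for four groups (not study data).
scores = [[105, 110, 100], [90, 95, 85], [98, 92, 101], [88, 93, 80]]
h = kruskal_wallis_h(scores)
print(round(h, 3), round(chi2_sf_df3(h), 4))
```

The chi-square approximation is rough for groups this small; with samples of that size an exact or permutation-based p value would usually be preferred.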

Mixed effects modeling using generalized least squares via restricted maximum likelihood was used to model changes in the BSID-III Cognitive, Language, and Motor scores between 6 and 20 months. The covariance structure for each outcome was selected based on the lowest Akaike information criterion and a significant likelihood-ratio test against the null model (i.e., the model with an ordinary least squares covariance structure). For the BSID-III Cognitive and Language subscales, an unstructured covariance structure was used, and a compound symmetry structure was used for the BSID-III Motor subscale. Residual analysis did not detect any violations in model assumptions.
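
The covariance-structure selection described above can be illustrated with a short sketch. The log-likelihood values are hypothetical placeholders rather than study results, the function names are invented for illustration, and the parameter counts assume that only covariance parameters are counted (one common convention under restricted maximum likelihood) with two repeated measurements per child.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


def lrt_statistic(loglik_candidate: float, loglik_null: float) -> float:
    """Likelihood-ratio statistic of a candidate structure against the null."""
    return 2 * (loglik_candidate - loglik_null)


# Hypothetical log-likelihoods and covariance parameter counts for two
# time points: independent errors (1 variance), compound symmetry
# (variance + common covariance), unstructured (2 variances + covariance).
candidates = {
    "independent (null)": (-560.0, 1),
    "compound symmetry": (-548.0, 2),
    "unstructured": (-546.5, 3),
}

aics = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(aics, key=aics.get)
null_ll = candidates["independent (null)"][0]

print(best, aics[best])
print(lrt_statistic(candidates[best][0], null_ll))
```

The likelihood-ratio statistic would then be compared against a chi-square distribution whose degrees of freedom equal the number of extra covariance parameters in the chosen structure.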

In the mixed effects analyses, the initial model included the main effects (MOUD and PAE), time (V3 vs. V4), and all possible two-way and three-way interactions (e.g., MOUD-by-PAE, MOUD-by-PAE-by-Time, etc.). MOUD-by-PAE was modeled as an interaction to assess the combined estimate of MOUD and PAE, and time was included as an interaction term to evaluate developmental changes from baseline to follow-up visits by group. Next, infant sex and family SES were added as covariates to each model. Pairwise comparisons between each of the exposure groups and the UC group, and within each group between 6 months and 20 months, were calculated using least squares estimates from the mixed effects analyses.22,23,24
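
To make the three-way interaction concrete, the sketch below shows how, under 0/1 indicator coding of MOUD, PAE, and Time and with no covariates, the MOUD-by-PAE-by-Time coefficient equals a difference of differences in the change scores of the four groups. The coding scheme, function names, and cell means are assumptions for illustration; they are not the study's estimates or its SAS parameterization.

```python
def design_row(moud: int, pae: int, time: int):
    """Fixed-effects row: intercept, main effects, two-way and three-way terms."""
    return [1, moud, pae, time,
            moud * pae, moud * time, pae * time,
            moud * pae * time]


def three_way_contrast(cell_means):
    """MOUD-by-PAE-by-Time coefficient implied by the eight cell means.

    cell_means maps (moud, pae) to a (mean_at_V3, mean_at_V4) pair.
    """
    change = {group: v4 - v3 for group, (v3, v4) in cell_means.items()}
    extra_change_with_moud_given_pae = change[(1, 1)] - change[(0, 1)]
    extra_change_with_moud_no_pae = change[(1, 0)] - change[(0, 0)]
    return extra_change_with_moud_given_pae - extra_change_with_moud_no_pae


# Hypothetical group means at V3 and V4 (not study results).
means = {
    (0, 0): (100.0, 100.0),  # UC
    (1, 0): (100.0, 86.0),   # MOUD
    (0, 1): (100.0, 90.0),   # PAE
    (1, 1): (95.0, 88.0),    # MOUD+PAE
}

print(design_row(1, 1, 1))
print(three_way_contrast(means))
```

A non-zero contrast means the change over time associated with MOUD differs depending on whether PAE is also present, which is what a significant three-way term indicates.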

All analyses were two-tailed and conducted using SAS statistical software, version 9.4 (SAS Institute, Cary, NC). An alpha level of 0.05 was used to determine statistical significance; however, significance using an alpha level of 0.10 is also reported for adjusted analyses.
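
The two reporting thresholds can be summarized with a small helper. This is only an illustrative sketch: the function name and label wording are hypothetical, and the example p values are ones quoted in the Results.

```python
def significance_label(p: float, alpha: float = 0.05,
                       secondary_alpha: float = 0.10) -> str:
    """Label a two-tailed p value against the primary and secondary alpha."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if p < alpha:
        return f"significant at alpha = {alpha:.2f}"
    if p < secondary_alpha:
        return f"significant at alpha = {secondary_alpha:.2f} only"
    return "not significant"


# p values quoted in the Results for adjusted comparisons.
for p in (0.003, 0.094, 0.153):
    print(p, significance_label(p))
```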

Results

The sample was racially and ethnically diverse (63.8% Hispanic/Latina, 7.2% Native American, 2.9% multi-racial). There were no differences in maternal ethnicity, maternal age or gestational age at recruitment, family SES, number of children in the household, or number of households the child had lived in among study groups (all p's > 0.05; Table 1). There were some differences in maternal race, marital status, and education level among the groups (all p's < 0.05; Table 1). With respect to child characteristics, lower birth weight was observed in the three substance exposure groups compared to controls, but no difference among groups was observed in mean gestational age at delivery or prevalence of preterm delivery. Compared to the UC and PAE groups, a higher proportion of children in the MOUD groups (MOUD = 69.2% and MOUD+PAE = 80.0%) participated in early intervention programs (p < 0.05).

Table 1 Demographic and medical characteristics of participants (N = 69).

Table 2 shows patterns of substance use by study group. Mean (±SD) alcohol consumption across the periconceptional period and pregnancy was 0.7 ± 0.8 AA/day (approximately 10 drinks/week) for the PAE group and 0.5 ± 1.0 AA/day (approximately 7 drinks/week) for the MOUD+PAE group. In addition, 23.1% of subjects in the PAE group and 60.0% of subjects in the MOUD+PAE group were positive for at least one ethanol biomarker. Among subjects in the MOUD groups, 56.5% were on methadone, 39.1% were on buprenorphine, and 4.3% were on both medications during the course of pregnancy. Marijuana use was prevalent in all three exposure groups (ranging from 38.5% in the MOUD group to 53.8% in the PAE group), and tobacco use was highly prevalent in both MOUD groups (84.6% in the MOUD group, 70.0% in the MOUD+PAE group).

Table 2 Alcohol and substance use patterns by study group (N = 69).

Mean (±SD) age of assessment at 6 and 20 months, adjusted for prematurity, was 6.8 ± 1.1 and 20.2 ± 1.5, respectively. Age at assessment was similar among the groups at both 6 months (p = 0.240) and 20 months (p = 0.120; Table 3). There were no differences in any of the BSID-III scales among study groups at 6 months (all p's > 0.05). At 20 months, group scores were more divergent, with lower scores for BSID-III Cognitive and Language subscales observed in the exposed groups (Table 3). While all three exposed groups had a decrease in scores from 6 to 20 months, the MOUD group had the most pronounced decrease in the Language (–19.1 points) and Cognitive (–13.8 points) mean scores.

Table 3 Infant developmental outcomes at V3 and V4 by study group (N = 69): unadjusted analysis.

The resulting regression coefficients from the mixed models are summarized in Table 4. The group-by-time interaction plots for the change in BSID-III Cognitive, Language, and Motor scores between 6 and 20 months are shown in Fig. 1. These plots show the stability over time in BSID-III scale scores for each of the study groups, with the p value indicating an overall difference. There was a significant three-way interaction (MOUD-by-PAE-by-Time) with respect to the BSID-III Cognitive (p = 0.045) and Motor (p = 0.033) scales. These three-way interactions for Cognitive (p = 0.022) and Motor (p = 0.017) scales remained significant after adjusting for SES and infant sex (Table 4 and Fig. 1a, e). While the UC group remained approximately the same or improved for these measures, the MOUD and PAE BSID-III Cognitive and Motor scale scores declined across time, as did the MOUD+PAE BSID-III Cognitive scale score, and the MOUD+PAE BSID-III Motor scale score was low at both time points. The three-way interaction did not reach statistical significance for the BSID-III Language scale (p = 0.107), and remained non-significant after adjusting for SES and infant sex (p = 0.085) (Table 4 and Fig. 1c). For all BSID-III subscale measures, the effect of SES was found to be significant (Table 4, all p's < 0.05), while infant sex was not found to be significant (all p's > 0.05).

Table 4 Predictors of infant developmental outcomes in mixed effects analysis.
Fig. 1: Changes in BSID-III Cognitive, Language, and Motor Scores from 6 to 20 months by study group adjusted for child sex and family SES.

The change in BSID-III mean scores from 6 to 20 months is shown in panels a (Cognitive), c (Language), and e (Motor). The p value is for the three-way interaction (MOUD-by-PAE-by-Time) with respect to the BSID-III score. Bar plots summarize the statistical significance of pairwise comparisons for the differences between 6 and 20 months within each study group (row directly above bars), as well as comparisons of each study group to the UC group at 20 months (top 3 rows in the plot): panel b for the BSID-III Cognitive score, panel d for the BSID-III Language score, and panel f for the BSID-III Motor score. NS non-significant (p ≥ 0.05), ***p < 0.001, **p < 0.01, *p < 0.05.

Interaction plots after stratification by BSID-III Expressive vs. Receptive Language subscales and Gross and Fine Motor subscales are shown in Supplementary Fig. a–d. A significant group-by-time interaction was observed for Expressive Language (p = 0.030), Receptive Language (p = 0.023), and for Fine Motor (p = 0.029) subscales indicating that changes in scores between two assessments for those scales varied among study groups.

The bar plots in Fig. 1 summarize the specific pairwise comparisons in mean estimates for between-group and within-group (6 vs. 20 months) differences. With respect to within-group change, in models adjusted for child sex and family SES, significant changes between 6 and 20 months in the BSID-III Cognitive scores were observed for the MOUD (–14.6 points, p < 0.001) and PAE (–9.8 points, p = 0.013) groups, but not for the MOUD+PAE (p = 0.153) or the Control (p = 0.949) groups (Fig. 1b). Similarly, significant changes between 6 and 20 months in the BSID-III Language scores were observed for the MOUD (–18.4 points, p < 0.001), PAE (–12.6 points, p = 0.007), and MOUD+PAE (–13.2 points, p = 0.018) groups, but not for the Controls (p = 0.470) (Fig. 1d). For the Motor subscale, only the PAE group demonstrated a significant within-group change in scores between 6 and 20 months (–8.6 points, p = 0.018) (Fig. 1f). With respect to between-group variation, in models adjusted for child sex and family SES, significant differences were observed between the MOUD group and Controls at 20 months for the Cognitive (–11.8 points, p = 0.003) and Language (–11.3 points, p = 0.030) subscales (Fig. 1b, d, respectively), as well as a difference at alpha = 0.10 for the PAE group compared to Controls for the Cognitive subscale (–6.2 points, p = 0.094).

Discussion

This study examined the developmental trajectory during the first 2 years of life in a group of children with prenatal exposures to alcohol and/or opioids and a control group of unexposed children, in the areas of cognition, language, and motor skills. We hypothesized that differences would be more apparent during the second year of life, when developmental testing is able to measure a larger variety of skills in a child. Our findings supported this hypothesis in the area of cognition as all three prenatally exposed groups had significantly lower BSID-III Cognitive scores compared to controls at 20 months. In the area of language, both MOUD groups had significantly lower scores (9–12 points) than the control group on the BSID-III Language scale. In the area of Motor scores, no significant differences were detected between the exposed and control groups at 20 months. These findings partially replicate Flannery et al., who found significant differences on the BSID-III Cognitive, Language, and Motor scales for 18-month-old children exposed to opioids.25

Numerous studies have found that SES is an important mediator for neurodevelopmental outcomes for a variety of conditions, including PAE,5 prematurity,26 and prenatal drug exposure.14 SES was included as a covariate along with child sex in mixed effects models assessing change at 20 months within groups, and between prenatally exposed groups and the Control group. In these models, sex of the child was not a significant predictor; however, SES was a significant predictor for BSID-III scores in all four study groups. In these adjusted models, statistically significant differences were observed between the MOUD group and the Control group for Cognitive and Language BSID-III scores at an alpha level of 0.05, and change for the PAE group for Cognitive BSID-III scores at an alpha level of 0.10.

Our hypothesis that there would be a significant decrease in BSID-III scores over time, from 6 to 20 months, for the subjects with prenatal exposures but not the UCs was supported for the Cognitive and Language scales. The BSID-III is a normed and validated test, with test-retest reliability found to be robust.27 Therefore, the stability in scores for the control group supports the reliability of measures for the BSID-III scales and supports a meaningful change in abilities for the exposed groups during the second year of life. After adjustment for SES and infant sex, the greatest difference from 6 to 20 months was observed for the MOUD group, which demonstrated an 18.4-point decrease in Language scores and a 14.6-point decrease in Cognitive scores, close to or more than the 15-point standard deviation for the BSID-III scale. A decline close to one standard deviation was also observed for the Language score in the MOUD+PAE (13.2 points) and PAE (12.6 points) groups. These results are similar to findings in the literature for preterm children where cognitive and language deficits usually become more pronounced by 12–18 months of age.25,28,29
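
The comparison with the 15-point standard deviation can be made explicit by expressing each adjusted within-group change in standard deviation units. The sketch below uses the changes reported in the Results; the function and variable names are hypothetical.

```python
BSID_III_COMPOSITE_SD = 15.0  # composite scores have mean 100 and SD 15


def change_in_sd_units(points: float) -> float:
    """Express a change in composite score points as a multiple of the SD."""
    return points / BSID_III_COMPOSITE_SD


# Adjusted within-group changes from 6 to 20 months reported in the Results.
reported_changes = {
    ("MOUD", "Language"): -18.4,
    ("MOUD", "Cognitive"): -14.6,
    ("MOUD+PAE", "Language"): -13.2,
    ("PAE", "Language"): -12.6,
}

for (group, scale), points in reported_changes.items():
    print(f"{group} {scale}: {change_in_sd_units(points):.2f} SD")
```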

We have previously reported no differences in BSID scores in infants prenatally exposed to opioids compared to controls at 6 months of age; however, we observed subtle differences in infants’ self-regulation and sensation-seeking behaviors.17 Several recent systematic reviews summarized the effects of PAE on neurodevelopmental outcomes, adaptive behaviors, and self-regulation in toddlers.30,31 A systematic review by Garrison et al. found that among 24 publications that included a specific assessment of neurocognitive behavior (typically by the BSID) during the first 2 years of life, only 54% demonstrated significant deficits with PAE.30 Deficits in self-regulation were observed more consistently (in 75% of studies). In general, the effects of PAE on infant/toddler neurodevelopment are highly dependent on the study population, level of exposure, developmental tests used, and family environment.

Our study did not find a significant effect of infant sex on neurodevelopmental outcomes, which is in contrast with the recent Danish Family Outpatient Clinics historical cohort, which reported poorer cognitive and language development in boys after prenatal opioid exposure.11 Other studies suggested that developmental delays in girls with prenatal opioid exposure might become more apparent as they reach school age.32 In a prospective longitudinal study of children with prenatal exposure to methadone with comprehensive neurodevelopmental evaluation at 4.5 years of age, school readiness was found to be significantly affected by the male sex, higher social risk, and quality of postnatal environment.33

Our finding of a significant decline in BSID-III scores over time is important because children who are tested at a younger age may be identified as having skills within the normal range in the first year of life, but this may not be the case once they get older. There are many possible reasons for the decline in scores, which can include the test itself, as the range of cognitive and language skills a 6-month-old child can perform limits the types of tasks that can be measured via standardized assessments, such as the BSID-III. By 20 months of age, early working memory and communication skills can be assessed, allowing for more complex and varied test items.34,35 It is also possible that family socio-environmental factors and parenting style may impact changes in the developmental scores over time,36,37 highlighting an important area for further research. Though we did not look directly at the impact of early intervention services (due to their heterogeneity and variability in quality and intensity), it was interesting to note that significantly more of the children in both MOUD groups were enrolled in early intervention programs, though these children also had the lowest development scores at 20 months. It is important to note that infants with perinatal drug exposure automatically qualify for early intervention services in New Mexico under the “medically at-risk” category. There are only five other states (Florida, Massachusetts, California, New Hampshire, and West Virginia) that have an “at-risk” criterion allowing services to begin after birth. Other states require children to have an identified delay, ranging from 20 to 50% in one area of development or a delay of 1.5–2 standard deviations on a standardized measure in one or more areas of development, to be eligible for early intervention services.38 According to this study, most children would not qualify for early intervention services until closer to 2 years of age, when more substantial delays can be identified using standardized testing. It would be beneficial to have future research explore the impact of age at enrollment in early intervention programs on overall development.
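
As a worked illustration of the standard-deviation criterion, the sketch below converts a delay of 1.5–2 standard deviations into cutoff values on a scale with a mean of 100 and a standard deviation of 15, as used by the BSID-III composite scores. Applying the criterion to BSID-III composites is an assumption made for illustration, since states may rely on other instruments, and the function names are hypothetical.

```python
SCALE_MEAN = 100.0
SCALE_SD = 15.0


def cutoff_score(sd_below_mean: float) -> float:
    """Score lying the given number of standard deviations below the mean."""
    return SCALE_MEAN - sd_below_mean * SCALE_SD


def meets_delay_criterion(score: float, sd_below_mean: float) -> bool:
    """True when the score is at or below the cutoff for the criterion."""
    return score <= cutoff_score(sd_below_mean)


print(cutoff_score(1.5), cutoff_score(2.0))
# A child scoring 85 (one SD below the mean) meets neither criterion.
print(meets_delay_criterion(85.0, 1.5), meets_delay_criterion(85.0, 2.0))
```

Under these assumptions, a decline of roughly one standard deviation from an average starting score would still leave a child above both cutoffs.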

The current results indicate that direct interventions for children, regardless of early indications of normal developmental progression, may be warranted. Findings also emphasize the importance of longitudinal research. While the logistical and budgetary constraints of this cohort did not allow for more frequent assessments during the first 2 years of life, more frequent testing in future studies would allow for the identification of neurodevelopmental delays when they first appear. One possibility for more intensive follow-up might be a partnership with State-funded early intervention programs. In addition, further investigation into ways to improve development in children exposed to MOUD would be pertinent.

There are strengths and limitations to this study. First, we acknowledge that the loss to follow-up rate between 6 and 20 months was higher than anticipated (33%), especially in the MOUD+PAE group. Dropout rates for the UCs were 10%, while for the MOUD, PAE, and MOUD+PAE groups they were 39, 41, and 58%, respectively. Second, we also acknowledge a relatively small sample size per group, which precluded detection of smaller differences at 6 months. Nevertheless, changes over time were associated with a large effect size detectable with the present sample size. Third, while efforts have been made to minimize the effect of prenatal (exclusion of subjects with co-exposure to methamphetamines, MDMA, or cocaine) and postnatal (infant sex, family SES) factors, we acknowledge a potential role of residual confounding. Fourth, we recognize that neurodevelopmental outcomes in the MOUD groups might be affected by the severity of the neonatal opioid withdrawal syndrome (NOWS), as previously reported;17 however, adjustment for NOWS severity might not be appropriate in the current study since it is likely to be a factor on a causal pathway rather than a confounder. Other ongoing studies in the field focus on approaches to minimize NOWS severity and examine the effect of different NOWS treatment approaches on long-term neurodevelopmental outcomes. Finally, we acknowledge that this study focused on developmental outcomes related to the BSID-III Cognitive, Language, and Motor scales, while other measures and assessment tools of neurodevelopment, such as the Child Behavior Checklist,25,39 sensory processing and temperament scales,28 and MRI,40 might be important to incorporate in future studies to more comprehensively characterize developmental outcomes.

These limitations should be viewed in light of the strengths, including the prospective cohort design in which maternal–infant pairs were followed from mid-pregnancy to 2 years after birth with repeated evaluation of neurodevelopmental outcomes. Another strength of the study was the rigorous manner used to obtain exposure information (prospective repeated self-report during pregnancy accompanied by the study-specific biomarker batteries), as well as detailed information on the family’s socio-economic background and pre-/postpartum environment. Furthermore, the BSID-III was administered by examiners who were certified, highly trained in the scale, and blinded to exposure status, which helped ensure the testing was completed in a standardized manner.

In conclusion, this study addresses the importance of using a longitudinal approach in the evaluation of children with prenatal polysubstance exposure. The significant decrease in cognitive and language developmental scores over the first 2 years of life in children with prenatal opioid and alcohol exposures highlights the importance of programs that provide both early identification and effective intervention for high-risk children.