Measuring Parenting Self-Efficacy from Pregnancy into Early Childhood: Longitudinal Factor Analysis and Measurement Invariance

SYNOPSIS Objective. Parenting self-efficacy is an important and widely examined construct in parenting research. Yet, studies that thoroughly assess the psychometric properties of scales measuring parenting self-efficacy are scarce. We examined the longitudinal factor structure and measurement invariance of a self-report measure of parenting self-efficacy. Design. A sample of 1,851 first-time mothers completed the 16-item Self-Efficacy in the Nurturing Role (SENR) questionnaire at 12, 22, and 32 weeks gestation and 3, 12, and 24 months postnatal. Results. Factor analyses indicated that the SENR consisted of two dimensions at all timepoints: Confidence in parenting skills and Lack of insecurity/distress in the parenting role. Strict measurement invariance was found for the SENR across prenatal timepoints, but only metric measurement invariance across postnatal timepoints. Conclusions. Parenting self-efficacy is a multidimensional construct, consisting of cognitive and emotionally laden appraisals of the ability to parent. Across the transition into motherhood, as mothers gain more experience in their parenting role, both parenting self-efficacy levels and the way mothers answer the questions that assess parenting self-efficacy change.


INTRODUCTION
Since Bandura's (1977) formulation of the self-efficacy theory of behavioral change, many studies have examined the role of efficacy expectations in different areas of functioning, including mental health, lifestyle behaviors, and academic and work performance. According to Bandura, self-efficacy is crucial for motivation and coping. When people are convinced of their own abilities to carry out certain tasks, they are more likely to persevere during challenges and obstacles. Review studies have noted the importance of self-efficacy theory for studies on parenthood and child development (Albanese et al., 2019; Coleman & Karraker, 1998; Jones & Prinz, 2005; Schuengel & Oosterman, 2019). Parenting self-efficacy is defined as "the expectation caregivers hold about their ability to parent successfully" (Jones & Prinz, 2005, p. 342). Parenting self-efficacy is directly associated with a broad spectrum of positive parenting and child outcomes (Albanese et al., 2019; Jones & Prinz, 2005) and acts indirectly as a proximal mediator of risk and protective factors (Fang et al., 2021; Schuengel & Oosterman, 2019).
Supplemental data for this article can be accessed online at https://doi.org/10.1080/15295192.2023.2268130.
In line with Bandura's theoretical notions on sources of self-efficacy, parenting self-efficacy is amenable to change, as evidenced by both longitudinal correlational studies (Porter & Hsu, 2003; Verhage et al., 2013; Zayas et al., 2005) and intervention studies (Barlow et al., 2012; Wittkowski et al., 2016). Longitudinal studies during the transition to parenthood show that parenting self-efficacy generally increases over the course of pregnancy and during the first year after birth (Porter & Hsu, 2003; Wernand et al., 2014; Zayas et al., 2005), although the magnitude of change might be decreased by lower maternal psychological well-being (Kunseler et al., 2014). Parenting self-efficacy can profit from interventions in the short term, but results about effects in the long term remain inconclusive (Barlow et al., 2012; Wittkowski et al., 2016). Wittkowski and colleagues attributed this lack of certainty about the sustainability of parenting self-efficacy effects partly to non-uniformity in the measurement of parenting self-efficacy.
Instruments to measure parenting self-efficacy vary in scope from domain-general to domain-specific or task-specific self-efficacy (Coleman & Karraker, 1998; Jones & Prinz, 2005; Schuengel & Oosterman, 2019). Domain-general measures focus on feelings of parenting competency "in general" without specifying a certain parenting domain (e.g., support, stimulation, discipline) or parenting task. Domain-specific measures focus on separate domains, such as support or cognitive stimulation. Task-specific measures refer to concrete activities related to child-rearing, such as feeding or soothing a child. Instruments that measure parenting self-efficacy also vary in the extent to which psychometric properties of the measure (e.g., reliability and validity) have been examined (Schuengel & Oosterman, 2019; Wittkowski et al., 2017). For example, a factor analysis with adequate sample size was conducted for only 14 out of 34 scales examined by Wittkowski et al. (2017), finding evidence for one to four dimensions of parenting self-efficacy. Moreover, just two studies have tested the assumption of measurement invariance across groups (Dumka et al., 1996) or time (Moran et al., 2016). Measurement invariance is a prerequisite for developmental research focused on comparisons of means across time based on self-report measures (Putnick & Bornstein, 2016). This study aims to add to our knowledge about the validity of parenting self-efficacy across development by examining the longitudinal factor structure and measurement invariance of a self-report measure, the Self-Efficacy in the Nurturing Role (SENR; Verhage et al., 2013), which has been used to examine change in parenting self-efficacy across the transition to parenthood (Pedersen et al., 1989; Porter & Hsu, 2003; Verhage et al., 2013).
Longitudinal measurement invariance refers to the idea that a particular scale used to measure an underlying construct measures that construct in an equivalent way across time (Widaman et al., 2010). Longitudinal measurement noninvariance can occur when the process of substantive interest (e.g., transitioning to parenthood, developing as a parent as your child grows up) alters the way one interprets and answers the questions on the scale (Vandenberg & Lance, 2000), without necessarily a change occurring in the underlying construct. Changes in observed scores on the scale then reflect a change in measurement properties of the scale rather than true changes at the level of the underlying construct. Thus, before being able to draw meaningful conclusions about mean changes in a construct across time (whether naturally occurring during development or as a result of an intervention), one needs to establish that the scale used to measure the construct is longitudinally measurement invariant.
Different levels of measurement invariance are typically distinguished (Van de Schoot et al., 2012). Broadly, these levels concern the factor structure of the construct (i.e., whether parenting self-efficacy is a unidimensional construct, and remains so across time, or consists of multiple dimensions) and more specific measurement properties of the scale (e.g., the strength of the associations between the underlying construct and the questions of the scale, the probability to provide a specific answer to a certain question given a true level of parenting self-efficacy, and the amount of measurement error in the questions of the scale). Measurement invariance is a property of the scale used to assess the construct rather than of the construct itself.
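The measurement properties listed above can be written compactly with a linear factor model. The notation below is an illustrative assumption (the symbols are not taken from the original text): x_{it} is the response to item i at time point t, \eta_t the factor the item loads on, \lambda_{it} the factor loading, \tau_{it} the item intercept, and \theta_{it} the residual variance. Each invariance level adds equality constraints across time points to the previous one.

```latex
\begin{align}
x_{it} &= \tau_{it} + \lambda_{it}\,\eta_{t} + \varepsilon_{it},
  \qquad \operatorname{Var}(\varepsilon_{it}) = \theta_{it} \\
\text{configural:}\quad & \text{the same items load on the same factors at every } t \\
\text{metric:}\quad & \lambda_{it} = \lambda_{i} \\
\text{scalar:}\quad & \lambda_{it} = \lambda_{i},\quad \tau_{it} = \tau_{i} \\
\text{strict:}\quad & \lambda_{it} = \lambda_{i},\quad \tau_{it} = \tau_{i},\quad \theta_{it} = \theta_{i}
\end{align}
```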
To our knowledge, measurement invariance has been investigated for two scales of parenting self-efficacy (Wittkowski et al., 2017). One study tested measurement invariance of five items of the Parental Self-Agency Measure across two cultural groups, an Anglo and a Mexican immigrant group (Dumka et al., 1996). The five items measure domain-general parenting self-efficacy and reflect parenting confidence, knowledge, and willingness to expend effort in problem solving. This scale is suitable for parents of children aged 3 to 12 years old and reflects one dimension that was measurement invariant across the two cultural groups (Dumka et al., 1996). In a second study, the factor structure of the Assessment of Parenting Tool (APT; Moran et al., 2016), including 12 items measuring domain-general parenting self-efficacy, was examined. Three factors were identified that were labeled Coping with being a parent, Attuned parenting, and Self-perceived model parenting. Evidence for metric, but not strict, measurement invariance was found for this scale across six age groups spanning 0 to 24 months old. The APT also contains 30-37 items on domain-specific, task-related parenting self-efficacy, but the content of these items depends on the specific age of the children, and the authors did not examine the factor structure of this item set.

The Present Study
We focus on the Self-Efficacy in the Nurturing Role (SENR) questionnaire, which consists of 16 domain-general and domain- or task-specific items (Pedersen et al., 1989; Porter & Hsu, 2003; Verhage et al., 2013). We examined the longitudinal factor structure and measurement invariance of the SENR in a sample of 1,851 first-time mothers followed longitudinally during 3 prenatal time points and during 3 postnatal time points in infancy and toddlerhood of the child. The SENR was chosen to measure parenting self-efficacy because it contains both domain-general and domain- or task-specific items (relating to nurturing tasks that are specific to the first years of parenthood) and is therefore appropriate to include in a longitudinal study of the transition into parenthood. First, we examined whether the SENR is unidimensional or consists of multiple dimensions. Second, we applied the more stringent levels of measurement invariance to investigate the extent to which the SENR is measurement invariant across the transition to parenthood. Specifically, we tested different levels of longitudinal measurement invariance focusing first on prenatal and postnatal periods separately, and then on prenatal versus postnatal comparisons.

Participants
Participants were 1,851 Dutch women who were pregnant with their first child at the start of the study (M age = 29.7, SD = 4.08, range 18-43 years, 89 mothers with missing birth date). The majority of participants (64.1%) finished a bachelor's or master's degree, 28.2% finished tertiary vocational education or secondary education preparing for higher education, 4.4% finished middle level secondary education or primary education, 1.4% finished another form of education, and 1.9% did not fill in this information. Participants were predominantly married (40.0%) or cohabiting (53.3%), although some were single (2.1%), not living with their partner (2.8%), or did not answer this question (1.8%). Based on the mothers' parents' country of birth, most participants had a Dutch background (85.7%; both parents born in the Netherlands); 6.5% had a Western migration background (at least one parent born in a Western country other than the Netherlands), 6.0% had a non-Western migration background (at least one parent born in a non-Western country), and 1.8% did not provide this information. About half of the children were boys (864 boys, 846 girls, 141 missing data on gender).

Procedures
The current study was part of the ongoing longitudinal Generations 2 study on parenting and mental health (Verhage et al., 2013). Recruitment of the participants took place between 2009 and 2014 via midwifery practices, the study website, and at a pregnancy fair. Women were asked to fill out questionnaires at three time points before giving birth (at 12, 22, and 32 weeks of pregnancy) and at several time points after giving birth (when the child was 3, 10, 12, 24, 48, and 78 months of age). At all time points, data on parenting self-efficacy were collected except at 10 months. For this study, we used parenting self-efficacy data only for time points that at the time of analysis had been completed (12, 22, and 32 weeks of pregnancy, and 3, 12, and 24 months after birth, which we label T1, T2, T3, T4, T5, and T6). Valid parenting self-efficacy data at these time points were obtained, respectively, for 1,662, 1,668, 1,641, 1,569, 1,484, and 1,051 mothers. Following their informed consent, questionnaires were sent to participants by post (T1-T5) or electronic mail (T6) several days before the target date of the questionnaire. If questionnaires were not received within 2 weeks after the target date, participants received reminders via e-mail and telephone.
Participants did not receive a fixed reward for taking part in the study, but a gift certificate of €500 was given to a randomly selected participant every 6 months. Furthermore, participants received a newsletter three times a year containing study updates and up-to-date information on topics of interest for new parents (e.g., sleeping, feeding).

Measures
The Dutch version (Verhage et al., 2013) of the SENR questionnaire (Pedersen et al., 1989) was used to measure parenting self-efficacy. The SENR consists of 16 items measuring parents' feelings of competence in caring for their child. Answers are given on a Likert scale ranging from 1 (not at all representative of me) to 7 (strongly representative of me). The SENR questionnaire has a prenatal and postnatal version to accommodate the different phases of parenthood. In the prenatal version of the questionnaire, the questions are framed to reflect expectations; in the postnatal version, the questions refer to actual experiences. Furthermore, in the postnatal version at T6 slight adaptations to 4 of the 16 items made them more suitable for toddlerhood. Additionally, in seven items the word "baby" was replaced with "child." The full list of items in prenatal and postnatal versions is provided in Table 1.

Table 1. Item descriptions of the prenatal and postnatal versions of the Self-Efficacy in the Nurturing Role (SENR) questionnaire.

Item 1
Prenatal version: I look forward to becoming a parent with confidence in my role as a parent.
Postnatal version - Infancy (3-12 months): I feel confident in my role as a parent.
Postnatal version - Toddlerhood (24 months): I feel confident in my role as a parent.

Item 2
Prenatal version: I feel I can catch on quickly to the basic skills of caring for my child.
Postnatal version - Infancy (3-12 months): I feel I have quickly caught on to the basic skills of caring for my child.
Postnatal version - Toddlerhood (24 months): I feel I have quickly caught on to the skills of caring for my child.

Item 3
Prenatal version: I think I will have difficulty interpreting my baby's cries, knowing whether he or she wants to be fed rather than played with or held.
Postnatal version - Infancy (3-12 months): I have difficulties interpreting my baby's cries, knowing whether he or she wants to be fed rather than played with or held.
Postnatal version - Toddlerhood (24 months): I have difficulties interpreting my child's cries, knowing whether he or she wants attention or whether something else going on.

Item 4
Prenatal version: I imagine myself getting uptight if my baby becomes fussy or irritable for longer than a few minutes.
Postnatal version - Infancy (3-12 months): I get uptight if my baby becomes fussy or irritable for longer than a few minutes.
Postnatal version - Toddlerhood (24 months): I get uptight if my child becomes irritable or stubborn for longer than a few minutes.

Item 5
Prenatal version: I expect to be comfortable playing actively with my baby and getting him or her to smile at me.
Postnatal version - Infancy (3-12 months): I am comfortable playing actively with my baby and getting him or her to smile at me.
Postnatal version - Toddlerhood (24 months): I am comfortable playing actively with my child and getting him or her to smile at me.

Item 6
Prenatal version: I feel unprepared in becoming a parent.
Postnatal version - Infancy (3-12 months): I feel like I was unprepared in becoming a parent.
Postnatal version - Toddlerhood (24 months): I feel like I am unprepared in being a parent that fits the toddler phase of my child.

Item [7 or 8]
Prenatal version: Touching, holding, and being affectionate with my baby will be comfortable and pleasurable for me.
Postnatal version - Infancy (3-12 months): Touching, holding, and being affectionate with my child is comfortable and pleasurable for me.
Postnatal version - Toddlerhood (24 months): Touching, holding, and being affectionate with my child is comfortable and pleasurable for me.

Item 9
Prenatal version: I think I will be able to trust my feelings and intuitions about taking care of my baby.
Postnatal version - Infancy (3-12 months): [...]
Postnatal version - Toddlerhood (24 months): [...]

[...]

Item 12
Prenatal version: I expect to be able to soothe my baby easily when he or she is crying or fussing.
Postnatal version - Infancy (3-12 months): I am able to soothe my baby easily when he or she is crying or fussing.
Postnatal version - Toddlerhood (24 months): I am able to soothe my child easily when he or she is crying or fussing.

Item 13
Prenatal version: I am concerned that my patience with my baby may be limited.
Postnatal version - Infancy (3-12 months): I am concerned that my patience with my baby is limited.
Postnatal version - Toddlerhood (24 months): I am concerned that my patience with my child is limited.

Item 14
Prenatal version: I expect to feel comfortable and natural using baby-talk.
Postnatal version - Infancy (3-12 months): I feel comfortable and natural using baby-talk.
Postnatal version - Toddlerhood (24 months): I feel comfortable and natural talking with my child.

Item 15
Prenatal version: I find nothing unusually complicated or difficult about the prospect of feeding, playing with, or providing day-to-day care for a child.
Postnatal version - Infancy (3-12 months): I find nothing unusually complicated or difficult about feeding, playing with, or providing day-to-day care for a child.
Postnatal version - Toddlerhood (24 months): I find nothing unusually complicated or difficult about feeding, playing with, or providing day-to-day care for a child.

Item 16
Prenatal version: The thought of being solely responsible for my child is frightening.
Postnatal version - Infancy (3-12 months): The thought of being solely responsible for my child is frightening.
Postnatal version - Toddlerhood (24 months): The thought of being solely responsible for my child is frightening.

Negative items were reverse-scored such that higher scores on the items always reflect greater parenting self-efficacy. Previous studies have shown moderate to high test-retest reliability and internal consistency of both the prenatal and postnatal versions of the SENR (Hsu & Sung, 2008; Pedersen et al., 1989; Porter & Hsu, 2003). Cronbach's alphas of the SENR questionnaire for the current sample were .84, .87, and .88 for the prenatal version assessed at T1, T2, and T3, and .84, .85, and .84 for the postnatal versions administered at T4, T5, and T6.
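For readers who want to reproduce the reliability step, the following is a minimal sketch of reverse-scoring on the 1-7 response scale and of Cronbach's alpha. It is not the authors' procedure (the descriptive analyses were run in SPSS); the function names and the toy responses are hypothetical and serve only as illustration.

```python
from statistics import variance


def reverse_score(score, low=1, high=7):
    """Reverse a Likert response so that higher always means more self-efficacy."""
    return low + high - score


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(rows[0])
    if n_items < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(n_items)]
    # Sample variance of the summed scale score.
    total_variance = variance([sum(row) for row in rows])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of four respondents to three items (not study data).
toy_rows = [[5, 6, 5], [3, 4, 2], [6, 7, 6], [2, 3, 3]]
toy_alpha = cronbach_alpha(toy_rows)
print(round(toy_alpha, 3))
```

Negatively worded items would be passed through `reverse_score` before calling `cronbach_alpha`, so that all items point in the same direction.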

Plan of Analyses
All descriptive analyses were conducted in SPSS version 24 (for item endorsement frequencies and correlations across items, see Supplementary Tables S1 and S2). To examine the dimensionality of the parenting self-efficacy construct, confirmatory and exploratory factor analyses (CFA and EFA, respectively) were conducted in Mplus version 7. Because the distribution of the items deviated from normality (Shapiro-Wilk tests, p < .001), robust maximum likelihood was used to estimate all models. First, whether the SENR questionnaire assesses a unidimensional parenting self-efficacy construct was tested by fitting a 1-factor model to the item data at each time point. The 1-factor model was accepted if three fit indices, namely the Root Mean Square Error of Approximation (RMSEA), the Comparative Fit Index (CFI), and the Standardized Root Mean Square Residual (SRMR), indicated adequate fit (Kline, 2011). For RMSEA, the fit is considered acceptable if the value lies between 0.05 and 0.08, and good if below 0.05. A CFI value of 0.95 or larger indicates good fit, and between 0.90 and 0.95 acceptable fit. For SRMR, a value below 0.05 indicates good fit and between 0.05 and 0.08 acceptable fit.
If the 1-factor model did not show acceptable fit, EFA were performed at each time point separately to further explore the factor structure of the SENR questionnaire.EFA models ranging from 1 up to 4 factors were examined.
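The acceptance rule for the 1-factor model can be sketched in a few lines. This is an illustrative encoding of the cut-offs stated above, not code from the study; the function names are hypothetical, and treating the boundary values 0.05 and 0.08 as "acceptable" is an assumption, because the text does not say how exact boundary values were handled.

```python
def rate_rmsea(value):
    """Rate RMSEA: below 0.05 good, 0.05 to 0.08 acceptable, otherwise poor."""
    if value < 0.05:
        return "good"
    if value <= 0.08:
        return "acceptable"
    return "poor"


def rate_cfi(value):
    """Rate CFI: 0.95 or larger good, 0.90 to 0.95 acceptable, otherwise poor."""
    if value >= 0.95:
        return "good"
    if value >= 0.90:
        return "acceptable"
    return "poor"


def rate_srmr(value):
    """Rate SRMR: below 0.05 good, 0.05 to 0.08 acceptable, otherwise poor."""
    if value < 0.05:
        return "good"
    if value <= 0.08:
        return "acceptable"
    return "poor"


def model_accepted(rmsea, cfi, srmr):
    """Accept a model only if none of the three indices is rated poor."""
    ratings = (rate_rmsea(rmsea), rate_cfi(cfi), rate_srmr(srmr))
    return "poor" not in ratings


# Hypothetical fit values, not results from the study.
print(model_accepted(rmsea=0.06, cfi=0.93, srmr=0.04))
print(model_accepted(rmsea=0.09, cfi=0.85, srmr=0.06))
```

Requiring all three indices to be at least acceptable mirrors the statement that the 1-factor model was accepted only if RMSEA, CFI, and SRMR each indicated adequate fit.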
Once a good fitting factor model was found that replicated across time points, CFA was used to test for different levels of measurement invariance across time points, following standard recommended procedures, and fitting the different measurement invariance models to the data of all time points simultaneously in one model (Putnick & Bornstein, 2016; Van de Schoot et al., 2012). Measurement invariance tests were conducted first for the 3 prenatal time points, then for the 3 postnatal time points, and last for all 6 time points simultaneously. The measurement invariance models are graphically displayed in Figure 1, depicting only two time points for simplicity. The first step in testing measurement invariance involved testing for configural invariance. Configural invariance holds if the same factor model fits the data well across time points; that is, the same items load on the same factors at each time point, but all parameter estimates are estimated freely across time points. The second step was to test for metric invariance by constraining the factor loadings of all items to be equal across time points, while freely estimating all other parameters in the model across time points. In the third step, item intercepts were additionally constrained to be equal across time points, thereby testing for scalar invariance. In the fourth and last step, strict invariance was tested by additionally constraining the item residual variances to be equal across time points.
Note to Figure 1. For simplicity, only two items per factor are depicted (instead of 9 items for factor 1 and 7 items for factor 2), and the factor model is drawn for just two time points (instead of 6 time points). To identify the factor model, and to be able to freely estimate the factor means and variances across time points, the factor loadings and intercepts of one item per factor were constrained to be respectively 1 and 0 in the scalar and strict invariance models. This was done for items 2 and 11 that showed the least variation in loadings and intercepts across time (see Table 5 and Supplementary Table S6).
More stringent levels of measurement invariance were said to hold if the additional constraints on the parameters did not lead to a substantial deterioration of model fit. In case a more stringent level of measurement invariance did not hold, several partial measurement invariance models were explored, constraining part of the parameters of interest to be equal while allowing the other part to be free. For example, if metric invariance holds, but scalar invariance does not, a partial scalar invariance model was explored in which some intercepts are constrained to be equal, but others are left free (Byrne et al., 1989). Models were compared by computing the difference (Δ) in RMSEA, CFI, and SRMR values between two subsequent measurement invariance models. If ΔRMSEA was below 0.015, ΔCFI was below 0.01, and ΔSRMR was below 0.03 for metric invariance and 0.015 for scalar and strict invariance, the more stringent measurement invariance level was accepted (Chen, 2007; Putnick & Bornstein, 2016). Factor means and variances were identified in the metric, scalar, and strict invariance models by constraining per factor the factor loading of one item to 1 and its intercept to 0 (rather than scaling the factors with a mean 0 and variance 1). The item that showed the smallest differences in factor loadings, intercepts, and residual variances across time points (in the configural invariance model) was chosen for these constraints (which happened to be items 2 and 11, see results).
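The sequential comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the names `Fit`, `step_holds`, and `highest_level` are hypothetical, each difference is assumed to be signed so that a positive value means worse fit (an increase in RMSEA or SRMR, a decrease in CFI), and partial invariance models are not covered.

```python
from typing import NamedTuple


class Fit(NamedTuple):
    rmsea: float
    cfi: float
    srmr: float


# Maximum tolerated deterioration per step, using the cut-offs stated in the text.
MAX_DELTA = {
    "metric": Fit(rmsea=0.015, cfi=0.01, srmr=0.03),
    "scalar": Fit(rmsea=0.015, cfi=0.01, srmr=0.015),
    "strict": Fit(rmsea=0.015, cfi=0.01, srmr=0.015),
}
LEVELS = ["configural", "metric", "scalar", "strict"]


def step_holds(level, previous, current):
    """Check whether moving to `level` keeps every fit change below its cut-off."""
    limit = MAX_DELTA[level]
    d_rmsea = current.rmsea - previous.rmsea  # increase means worse fit
    d_cfi = previous.cfi - current.cfi        # decrease means worse fit
    d_srmr = current.srmr - previous.srmr     # increase means worse fit
    return d_rmsea < limit.rmsea and d_cfi < limit.cfi and d_srmr < limit.srmr


def highest_level(fits):
    """Return the most stringent level reached, assuming the configural model fits."""
    reached = "configural"
    for previous_name, level in zip(LEVELS, LEVELS[1:]):
        if not step_holds(level, fits[previous_name], fits[level]):
            break
        reached = level
    return reached


# Hypothetical fit values for illustration only, not results from the study.
example_fits = {
    "configural": Fit(0.040, 0.950, 0.040),
    "metric": Fit(0.041, 0.948, 0.045),
    "scalar": Fit(0.060, 0.900, 0.050),
    "strict": Fit(0.065, 0.880, 0.055),
}
print(highest_level(example_fits))
```

Each model is compared with the immediately preceding, less constrained model, which matches the description of differences between two subsequent measurement invariance models.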

Exploring the Factor Structure
The EFA 1-factor model showed bad-to-adequate fit across the six time points (Table 2 and Supplementary Table S3). For the first prenatal time point, none of the three fit indices showed acceptable or good fit. For the next two prenatal time points, RMSEA and CFI showed inadequate fit, while SRMR showed acceptable fit. For postnatal time points, RMSEA and SRMR showed acceptable fit, but CFI showed bad fit. Therefore, the factor structure was further explored using EFA considering all time points and up to 4 factors. The EFA 2-factor model generally fit the data well (Table 2). Model fit was acceptable to good based on all three indices for all prenatal and the first two postnatal time points. Only for the last time point was model fit not acceptable based on CFI, although it was good-to-acceptable considering SRMR/RMSEA. Further, for T1-T5, the exact same items loaded on the same factors (Table 3). For T6, the 2-factor solution was almost identical, except for item 7 loading on factor 2 instead of factor 1. The fit of the 3- and 4-factor models was also acceptable to good (RMSEA range = 0.01-0.07; CFI range = 0.92-0.99; and SRMR range = 0.02-0.03), but these factor structures were inconsistent across time points and frequently included factors with low loadings (< 0.40) (Supplementary Tables S4 and S5). Therefore, the 2-factor solution was taken as the best model: It was the most parsimonious model with mostly acceptable fit, and it showed the most robust solution across time points. In this solution, nine items (1, 2, 5, 7, 8, 9, 12, 14, and 15) loaded on factor 1, and seven items (3, 4, 6, 10, 11, 13, and 16) loaded on factor 2.
When inspecting the content of the items, we labeled factor 1 Confidence in parenting skills, as it includes items about feeling confident in one's role as a parent and feeling comfortable in interacting with and taking care of one's child. Factor 2 includes items about having doubts, being concerned, and feeling insecure about one's role and skills as a parent. Compared to factor 1, factor 2 taps into the presence or absence of negative emotions concerning the parenting role. Therefore, we labeled factor 2 Lack of insecurity/distress in the parenting role.

Longitudinal Measurement Invariance Tests
Table 4 describes the model fit of the 2-factor models testing the different levels of measurement invariance, including tests for partial measurement invariance where applicable. The measurement invariance tests are conducted for the three prenatal time points, the three postnatal time points, and all six time points. The intercepts and residual variances of this model are found in Supplementary Tables S6 and S7.
The measurement invariance tests for the three prenatal time points showed that strict invariance holds. Across the three prenatal time points, there are no large differences in factor loadings, intercepts, and residual variances, and constraining their values to be equal across time points did not lead to substantial deterioration of model fit. For example, as can be seen in Table 5, the difference in factor loadings across prenatal time points ranges between .02 and .10 across items, which is quite small. These results indicate that the prenatal version of the SENR is fully measurement invariant across time. Across the three postnatal time points, metric invariance holds, but scalar and strict invariance lead to substantial deterioration of model fit.
A partial scalar invariance model, with item intercepts for time points T4 and T5 constrained to be equal and intercepts for time point T6 free, also did not lead to adequate fit. In addition, a model with metric invariance for the first factor and scalar invariance for the second factor, or vice versa, did not fit the data well either. Thus, the metric invariance model was kept as the best model. The differences in factor loadings across postnatal time points are generally small (Table 5), but larger, nonignorable differences are found for the intercepts and residual variances across postnatal time points (Supplementary Tables S6 and S7). These results indicate that the postnatal version of the SENR questionnaire is not fully measurement invariant across time.
When testing measurement invariance across all time points, again metric invariance holds, but scalar or strict invariance is not reached (also not for just one of the factors, or partially). Thus, the same 2-factor structure holds across all time points and the strengths of the associations between the items and the factors (that is, the factor loadings) are equal across time. However, substantial differences in item intercepts across time (from prenatal to postnatal phase, and across the first two years of parenthood) are found, indicating that the same parenting self-efficacy factor scores give rise to different item answers on the SENR questionnaire depending on time. The most striking differences in intercepts across time are found for items 12 ("able to soothe baby/child," largest difference of 1.08) and 14 ("feel comfortable using baby-talk/talking with child," largest difference of 1.28) from the factor Confidence in parenting skills and for item 3 ("difficulties interpreting cries," reverse-scored, largest difference of 1.08) from the factor Lack of insecurity/distress in the parenting role. For these items, intercepts are higher for postnatal compared to prenatal time points, indicating that parents with the same factor score are more likely to endorse these items postnatally than prenatally. Substantial differences across time in item residual variances also emerged, meaning that the amount of variance in item answers that is unexplained by the parenting self-efficacy factors (which includes measurement error) changes across time as well. The largest difference in residual variances across time is found for item 14 ("feel comfortable using baby-talk/talking with child") from the factor Confidence in parenting skills, with large residual variances prenatally and a decline postnatally. Further, in the metric invariance model, the two factors Confidence in parenting skills and Lack of insecurity/distress in the parenting role were found to correlate between 0.56 and 0.70 across time points. The means of both factors across time points are provided in Table 6. As can be seen, the means of both factors are somewhat larger postnatally than prenatally, but given that strict measurement invariance could not be established, these means cannot be directly compared and reflect a combination of both true differences in parenting self-efficacy scores and measurement bias.
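As a worked illustration of why intercept differences matter for mean comparisons, consider a linear factor model with time-invariant loadings (metric invariance). The symbols are illustrative assumptions rather than notation from the original text: \tau_{it} is the intercept of item i at time point t, \lambda_i its loading, and \eta the factor score.

```latex
\begin{align}
\mathrm{E}[x_{it} \mid \eta] &= \tau_{it} + \lambda_{i}\,\eta \\
\mathrm{E}[x_{it'} \mid \eta] - \mathrm{E}[x_{it} \mid \eta] &= \tau_{it'} - \tau_{it}
\end{align}
```

Under this model, for item 12, where the largest intercept difference across time points is 1.08, a mother with the same factor score would be expected to answer up to 1.08 points higher on the 7-point response scale at one time point than at another. A shift of this kind changes observed scores without any change in the underlying factor.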

DISCUSSION
We examined the measurement invariance and longitudinal factor structure of the SENR questionnaire assessing parenting self-efficacy in a large sample of first-time mothers, followed longitudinally during pregnancy and postnatally during infancy and toddlerhood. A two-dimensional structure is consistent for the SENR questionnaire across time. The first dimension reflects feelings of confidence in the parenting role, containing items that are domain-general but also domain- and task-specific. The second dimension taps more into the presence or absence of negative emotions (e.g., feelings of doubt or distress) concerning the parenting role. Strict measurement invariance was found for the SENR questionnaire across prenatal timepoints, but only metric measurement invariance was reached across postnatal timepoints.
As far as we know, no other studies have examined the dimensionality of the SENR questionnaire specifically. However, studies using other self-report questionnaires to assess parenting self-efficacy show mixed results regarding the dimensionality (Wittkowski et al., 2017), with the number of dimensions found ranging from one up to four in studies that include infancy and toddlerhood (Črnčec et al., 2008; Hamilton et al., 2014; Matthey, 2011; Moran et al., 2016). It is difficult to directly compare our results to the findings of those studies regarding dimensionality because the scope of the assessed constructs varies greatly across studies. While the SENR assesses a combination of domain-general, domain-specific, and task-specific parenting self-efficacy, other scales focus on domain-general parenting self-efficacy only (e.g., Hamilton et al., 2014), domain- or task-specific parenting self-efficacy (e.g., Matthey, 2011), a mix of domain-general and specific parenting self-efficacy (e.g., Moran et al., 2016), or even a combination with other aspects of experiences of parenthood, such as satisfaction or self-esteem (e.g., Ohan et al., 2000). In addition, stages of parenthood (that is, the age ranges of the children) that the scales refer to show variability. Some scales are developed specifically for childbirth-related experiences of parenthood (e.g., Perceived Maternal Parenting Self-efficacy; Barnes & Adamson-Macedo, 2007), whereas other scales are suitable for a much wider age range spanning infancy to adolescence (e.g., Me as a Parent; Hamilton et al., 2014). What most of these studies do show, however, is that parenting self-efficacy is a complex multidimensional construct that can be assessed at different levels (domain- or task-specific) and during different stages of parenthood.
The two-dimensional structure as found in our analyses, with a dimension of general feelings of confidence and a more affective dimension, has not been found in previous studies, but this difference may be explained by the different content of the items across measures. This interpretation is consistent with Bandura's (1977) theory of self-efficacy that stresses that efficacy expectations might differ in magnitude, generality, and strength. As such, parenting self-efficacy is a dynamic concept that is shaped by previous experiences, the context, and the specific tasks that one needs to perform as a parent (Schuengel & Oosterman, 2019). Bandura (1977) also noted that there are four different sources of self-efficacy: accomplishments of behavior performance (e.g., experiencing successes in parenting children), vicarious experiences (e.g., observing successful accomplishments of other parents), verbal persuasion (e.g., receiving verbal feedback by others on parenting), and emotional arousal (e.g., experiencing positive or negative physical and emotional states as a parent). Given the different sources that dynamically shape self-efficacy, a multifaceted rather than a unidimensional construct of parenting self-efficacy is theoretically to be expected. The two-dimensional structure of the SENR questionnaire found here may be linked to these different sources of (parenting) self-efficacy. Our first dimension, Confidence in parenting skills, contains items that reflect more cognitive appraisals of parenting abilities (e.g., "I feel I have quickly caught on to the basic skills of caring for my child"), whereas the second dimension, Lack of insecurity/distress in the parenting role, contains items that tap into the negative emotional states associated with one's role as a parent (e.g., "I get uptight if my baby becomes fussy or irritable for longer than a few minutes"). Future research might examine whether the two dimensions show differential associations with predictors and outcomes, such as the association of the first dimension with more cognitive variables (e.g., perceived parenting competence as a source of parenting stress in Abidin's, 1997, parenting stress model), and the second dimension with more affective variables (e.g., maternal anxiety and depression).
Aside from a two-dimensional structure, strict measurement invariance was found for the SENR questionnaire across prenatal timepoints, but only metric measurement invariance across postnatal timepoints. The postnatal findings are consistent with the single previous study that examined longitudinal measurement invariance for a parenting self-efficacy scale, where metric longitudinal measurement invariance was found across six age groups spanning 0 to 24 months (Moran et al., 2016). As far as we know, this study is the first to establish longitudinal measurement invariance for a parenting self-efficacy scale assessed during pregnancy. This finding is striking because the SENR was originally developed for postnatal use in the first year of parenthood (Pedersen et al., 1989), and later adapted for prenatal use by Porter and Hsu (2003), framing the items in terms of expectations of how competent parents would feel rather than feelings of competence that are based on actual experiences as a parent. A possible explanation for the differential findings regarding invariance of the prenatal version versus lack of full invariance of the postnatal version of the instrument could be the changes in the wording of the items to reflect infancy versus toddlerhood, together with accumulating and changing parenting experiences (successes and failures, as predicted by self-efficacy theory; Bandura, 1977). When the SENR questionnaire is used during pregnancy, valid inferences can be made about comparisons of scores across timepoints. However, some caution is still warranted in interpreting mean scores from a strict invariance model, as Little (2013) has argued that this model may be too restrictive because it assumes residual variances to be equal across time points (which include measurement error, or the unreliability of the items). Valid comparisons of mean scores of the SENR across infancy and toddlerhood cannot be readily made, as they in part reflect response shifts of mothers and not just true changes in parenting self-efficacy that the SENR intends to measure. However, valid conclusions about correlations across postnatal time points can be made because metric invariance is sufficient for that purpose.
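The distinction between the invariance levels discussed above can be made explicit in standard longitudinal factor-model notation. What follows is only a sketch: the symbols are introduced here for illustration and are not taken from the article's own analyses. It assumes a linear common factor model in which item responses at timepoint t have intercepts τ_t, loadings Λ_t, and residual variances Θ_t, and in which the factors have means κ_t.

```latex
% Illustrative notation only; these symbols are assumptions, not the article's.
% y_{it}: item responses of mother i at timepoint t; \eta_{it}: the two latent factors.
\begin{align*}
\mathbf{y}_{it} &= \boldsymbol{\tau}_t + \boldsymbol{\Lambda}_t \boldsymbol{\eta}_{it} + \boldsymbol{\varepsilon}_{it},
\qquad E(\boldsymbol{\eta}_{it}) = \boldsymbol{\kappa}_t,\quad
\operatorname{Var}(\boldsymbol{\varepsilon}_{it}) = \boldsymbol{\Theta}_t \\[4pt]
\text{Configural:} &\quad \text{same pattern of zero and nonzero loadings in every } \boldsymbol{\Lambda}_t \\
\text{Metric:} &\quad \boldsymbol{\Lambda}_t = \boldsymbol{\Lambda} \\
\text{Scalar:} &\quad \boldsymbol{\Lambda}_t = \boldsymbol{\Lambda},\ \boldsymbol{\tau}_t = \boldsymbol{\tau} \\
\text{Strict:} &\quad \boldsymbol{\Lambda}_t = \boldsymbol{\Lambda},\ \boldsymbol{\tau}_t = \boldsymbol{\tau},\ \boldsymbol{\Theta}_t = \boldsymbol{\Theta} \\[4pt]
% Under metric invariance only, observed mean change mixes two sources:
E(\mathbf{y}_{it}) - E(\mathbf{y}_{is}) &= (\boldsymbol{\tau}_t - \boldsymbol{\tau}_s)
 + \boldsymbol{\Lambda}\,(\boldsymbol{\kappa}_t - \boldsymbol{\kappa}_s)
\end{align*}
```

Read this way, the last line shows why mean comparisons are problematic when only metric invariance holds: a difference in observed means can stem from shifted intercepts (a response shift) as well as from change in the latent means. Covariances among the factors do not involve the intercepts, which is why associations across time points remain interpretable under metric invariance.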
Strengths of this study include the large sample of first-time mothers and the inclusion of six time points. Hence, the longitudinal dataset enabled a thorough psychometric analysis of the 16-item SENR questionnaire to assess how parenting self-efficacy changes across time quantitatively and qualitatively. Qualitative changes are indicated by shifts in the way mothers respond to items in the questionnaire as a result of experiences in actual parenting as opposed to expected parenting during pregnancy and changes in these experiences as infants grow into toddlers.
In spite of these strengths, this study also comes with a number of limitations. First, our findings regarding the dimensionality and measurement invariance are specific to the SENR instrument and cannot be generalized to other instruments of parenting self-efficacy. Second, the mothers in our sample were relatively highly educated, and they were ethnically and sociodemographically relatively homogeneous. The relative homogeneity of the sample has the advantage that results are better generalizable to specific populations (e.g., Bornstein et al., 2013), but this relative homogeneity comes with the disadvantage that it is unknown whether the same factor structure and measurement invariance would hold in different subgroups of mothers, such as mothers with different educational, ethnic, or migration backgrounds (as suggested by Dumka et al., 1996, for another parenting self-efficacy scale). Furthermore, our findings cannot readily be generalized to other types of parents or caregivers, such as fathers, non-biological mothers, or adoptive or foster parents, who might experience parenting, and particularly their expectations of it during pregnancy, differently. Nor can findings be generalized to child ages beyond infancy and toddlerhood, such as primary school age or adolescence. Because the SENR questionnaire focuses on parenting self-efficacy in the nurturing role, at least for the part of the items that are task-specific rather than domain-general, the SENR questionnaire cannot be readily used with parents of older children, as this would require substantial modifications of the items in the questionnaire. However, it would be worthwhile to investigate whether an abbreviated questionnaire can be developed that contains items that are suitable for a broader child age range, for example because they are domain-general (e.g., the item "I feel confident in my role as a parent"), they are longitudinally measurement invariant, and they tap into the different identified dimensions of parenting self-efficacy. Development of an abbreviated SENR questionnaire would need to consider measurement invariance across time and across groups and require careful evaluation of content, convergent, and divergent validity.
Altogether, our results highlight that parenting self-efficacy is not a unidimensional construct, but rather consists of cognitive and more emotionally laden appraisals of one's ability to parent. Furthermore, across the transition into parenthood, as mothers gain more experience as a parent, parenting self-efficacy levels and the way mothers answer the questions to assess parenting self-efficacy may change. Future research may focus on adaptations of the SENR questionnaire to assess parenting self-efficacy for which full measurement invariance does hold while keeping its two-dimensional structure, leading to refined insights into how parenting self-efficacy components change across time and may be differentially related to parental and child outcomes.

IMPLICATIONS FOR THEORY AND PRACTICE
The present study is the first to investigate the dimensional structure and factorial invariance across time of the SENR instrument to assess parenting self-efficacy across the transition to parenthood. The instrument consists of two dimensions from pregnancy up to early childhood. This finding contributes to parenting self-efficacy theory by showing that parenting self-efficacy is not unidimensional but consists of cognitive and emotionally laden appraisals of one's ability to parent. The current findings may be used in the future to develop an abbreviated instrument that is fully measurement invariant across time and relevant groups while keeping the two-dimensional structure. Such an abbreviated parenting self-efficacy instrument will allow clinicians to assess parenting self-efficacy in an efficient way, providing insights into which components of parenting self-efficacy should be targeted in interventions.

AFFILIATIONS AND ADDRESSES
for help with data collection. The authors thank Carlo Schuengel for his invaluable input on an earlier draft of the manuscript and for his continuing support throughout this research.
I trust my feelings and intuitions about taking care of my baby. I trust my feelings and intuitions about taking care of my child.
10 I wonder if I really can understand my baby's needs. I wonder if I really understand my baby's needs. I wonder if I really understand my child's needs.
11 I am unsure just how much attention I should give my baby. I am unsure just how much attention I should give my baby. I am unsure just how much attention I should give my child.

Differences between the prenatal version and the postnatal version - Infancy are given in italics. Differences between the postnatal version - Infancy and the postnatal version - Toddlerhood are underlined.

Figure 1. Graphical depiction of the factor models that test for the different levels of measurement invariance. Squares depict items; circles depict factors; triangles represent a constant for modeling the means. Single-headed arrows pointing from factors to items are factor loadings; single-headed arrows pointing from triangles to items are intercepts; single-headed arrows pointing from triangles to factors are factor means; double-headed arrows represent (residual) variances or covariances. For simplicity, only two items per factor are depicted (instead of 9 items for factor 1 and 7 items for factor 2), and the factor model is drawn for just two timepoints (instead of 6 time points). To identify the factor model, and to be able to freely estimate the factor means and variances across time points, the factor loadings and intercepts of one item per factor were constrained to be respectively 1 and 0 in the scalar and strict invariance models. This was done for items 2 and

Table 2. Fit of the exploratory factor models.

Table 3. Estimates of factor loadings in the exploratory 2-factor model.
Note. For each item, the highest factor loading is given in bold.

Table 5. Estimates of factor loadings in the 2-factor confirmatory model (configural invariance model).

Table 6. Estimates of factor means in the 2-factor metric invariance model.