Dynamics of overqualification: evidence from the early career of graduates

ABSTRACT This study analyses the persistence and true state dependence of overqualification, i.e. a mismatch between workers' qualifications and their jobs' educational requirements. Employing individual-level panel data for Germany, we find that overqualification is highly persistent among university graduates over the first ten years of their career cycle. Accounting for unobserved heterogeneity, results from dynamic random-effects probit models suggest that a moderate share of the persistence can be attributed to true state dependence. Unobserved factors are found to be the main driver of overqualification persistence. However, observed heterogeneity in terms of ability and study characteristics significantly contributes to overqualification persistence.


Introduction
Labour markets of industrialised countries share the common feature that substantial shares of workers acquired a level of qualification exceeding the educational requirement of their current job (Leuven and Oosterbeek 2011). These workers are formally overqualified for their job, which may signal an inefficient allocation of skills in the labour market. At the economy level, misallocations of workers to jobs may imply significant productivity losses and negatively affect long-run growth (Jovanovic 2014; McGowan and Andrews 2015; Hsieh et al. 2019). At the individual level, cross-sectional analyses consistently find that overqualified workers earn less than equally educated individuals in adequate jobs (Hartog 2000). Employing panel data, some studies conclude that this wage penalty only represents spurious correlations due to an ability bias (Tsai 2010; Leuven and Oosterbeek 2011), while other studies find that wage penalties remain significant after controlling for skill-heterogeneity (Korpi and Tåhlin 2009; Kleibrink 2015). According to assignment theory, workers holding jobs below their qualification level cannot fully utilise their human capital and do not reach their individual production capacity (Sattinger 1993). However, overqualification does not necessarily lead to an underutilisation of skills since the measure of overqualification may neglect important components of workers' human capital and some overqualified individuals may have an inferior level of ability (Chevalier 2003; Green and McIntosh 2007).
To what extent overqualification is detrimental at the economy or the individual level will strongly depend on its longevity. If overqualification only occurs at the start of the career, losses may only arise in the short-run until available skills are optimally allocated. If workers are permanently overqualified, long-term losses may arise due to a continuous underutilisation of their human capital. In this case, private and public educational investments may be partly wasted because of reduced returns in terms of wages or tax revenues. This is of particular relevance in the context of the publicly subsidised expansion of tertiary education induced by a strong increase in the demand for skills and the notion that human capital promotes economic growth. For instance, it is one main pillar of the European strategy for economic growth to ensure a steady increase in the supply of university graduates (European Commission 2010).
As previous studies imply, overqualification among graduates entering the labour market is highly persistent (e.g. Verhaest and van der Velden 2012). From a policy perspective, it is crucial to detect whether persistent overqualification arises due to differences in individual-specific characteristics, such as (innate) ability, or because of true state dependence. If overqualification in itself increases the probability of future overqualification, policy measures that prevent entry into or promote exits out of overqualification can induce a lasting reduction in the rate of overqualification. In contrast, if persistence arises solely due to individual heterogeneity, i.e. spurious state dependence, policies facilitating exits out of overqualification have little impact on future overqualification unless the factors causing overqualification are addressed directly.
The aim of this study is to estimate the true state dependence effect of overqualification among the policy-relevant group of university graduates. The empirical analysis is based on panel data covering the first ten years of individual careers for university graduates in Germany (DZHW Graduate Panel). Interviews were conducted one, five, and ten years after graduation and provide a rich set of individual characteristics such as university grades and school grades, field of study, or previous unemployment. Graduate overqualification is found to be considerably high over the observed time span. While 53% of the graduates who were overqualified in the previous interview remain overqualified in the next period, only 8% of previously well-matched graduates enter overqualification. Therefore, the persistence amounts to 45 percentage points.
Accounting for unobserved heterogeneity in dynamic random-effects probit models (Wooldridge 2005), a moderate share of the observed persistence can be attributed to a true state dependence effect. Previous overqualification experience is found to have a significant true state dependence effect on future overqualification amounting to 3 percentage points. Furthermore, the results suggest that unobserved factors are the main driver of the high persistence of overqualification over the early career of graduates. In particular, unobserved characteristics driving the selection into the initial state of overqualification observed one year after graduation are strongly related to the probability to remain overqualified later on. Sensitivity analyses show that our findings are robust to the choice of the econometric model, the choice of the measure of overqualification, potential non-random sample attrition, and the inclusion of an alternative measure of unobserved ability. We additionally show that persistent overqualification is associated with significant wage penalties over the course of the early career.
The study contributes to the scarce literature on true state dependence of overqualification. A common feature of previous studies is that they rely on samples drawn from the entire working population observed at different stages of individual career cycles. In contrast, this study provides evidence for graduates who enter the labour market and are observed over the early career cycle. Focusing on individuals with similar educational background, age, and working experience may reduce the extent of unobservable heterogeneity and improve the estimation of true state dependence. Furthermore, previous studies focus on rather short-term transitions between consecutive years, while the present study analyses dynamics of overqualification between observations roughly five years apart. This eases a problem studies based on annual data often face, i.e. one single spell of overqualification may span two consecutive years inducing spurious state dependence (Arulampalam, Booth, and Taylor 2000). In line with previous studies, our analysis provides evidence for a true state dependence effect of overqualification. However, the effect is somewhat lower if medium-term transitions among graduates are analysed.
The remainder of this paper is organised as follows. Section 2 provides a review of the related literature. Section 3 introduces the data and provides descriptive statistics. Section 4 describes the econometric model. Section 5 provides the results and Section 6 provides the sensitivity analyses. Section 7 discusses the results and concludes.

Background discussion
Several labour market theories suggest that overqualification is only a temporary phenomenon. According to matching theory, for instance, the job match quality cannot be foreseen by workers and firms because of imperfect information (Jovanovic 1979). Because job match quality is an 'experience good', mismatched workers will start to search for new positions and improve their job matches over time. As suggested by the career mobility theory, overqualification might also be a part of a planned career path (Sicherman and Galor 1990; Sicherman 1991).
In contrast, other theories focus on labour market frictions and suggest that overqualification may be persistent at the individual level. In the job-competition model (Thurow 1972), overqualified candidates have a head start in the queue for a promising job because employers use education as a proxy for future job performance. Moreover, hiring overqualified applicants may be a form of insurance strategy to ensure a steady supply of high-skilled labour (Cedefop 2012). In line with studies showing positive effects on firm productivity (Mahy, Rycx, and Vermeylen 2015), employers might have an incentive to hire and retain overqualified employees (Büchel 2002; Verhaest et al. 2018). Allowing for skill-heterogeneity among workers and jobs, the assignment theory assumes that the phenomenon of overqualification will be an inevitable outcome of a complex allocation process (Sattinger 1993). Models combining assignment with costly search have shown that job mismatch might be a long-lasting phenomenon at the individual level (Teulings and Gautier 2004; Dolado, Jansen, and Jimeno 2009).
Overqualification may also arise due to individual heterogeneity among workers who attained the same level of education leading to spurious state dependence (Leuven and Oosterbeek 2011). The measure of overqualification might neglect important components of workers' human capital, since several studies have shown that individuals with an inferior level of ability are more likely to be overqualified (Chevalier and Lindley 2009). If individuals are 'apparently mismatched', overqualification may not imply an underutilisation of skills (Chevalier 2003; Green and McIntosh 2007). Moreover, workers may voluntarily choose positions below their own level of education, e.g. because of preferences for non-pecuniary job characteristics such as an easier workload (McGuinness and Sloane 2011; Sattinger 2012).
Alternatively, overqualification persistence could arise because of a behavioural effect of previous overqualification experience implying true state dependence. Overqualification may alter the process of human capital acquisition and job-specific human capital investments may lock overqualified workers into low-requirement positions (Pissarides 1994). Furthermore, knowledge and skills untapped during periods of overqualification may depreciate or become devalued because of changing skill requirements (Heckman and Borjas 1980; De Grip et al. 2008). Overqualification may send negative signals to employers who use previous overqualification experience as a screening device for future productivity. Recent audit studies find significant stigma effects of overqualification in terms of lower callback rates to fictitious job applications (Farber, Silverman, and von Wachter 2016; Baert and Verhaest 2019). Overqualification may lead to a lower perception of one's own market value, discouraging workers from applying for adequate jobs (Stewart and Swaffield 1999). Finally, overqualified workers may be locked in because of reduced on-the-job search efforts (Holzer 1987) or segmented labour markets (Doeringer and Piore 1971).
The longevity of a mismatch is the focus of a growing strand of the empirical overqualification literature. Several studies have tried to assess whether overqualification serves as a steppingstone to better future jobs (Korpi and Tåhlin 2009). Based on cohort studies of college graduates, overqualification is found to be a persistent phenomenon for a substantial share of workers at the start of the career. Verhaest and van der Velden (2012) analyse the persistence of overqualification over the first five years of graduates' careers in 13 European countries. Among those graduates who were overqualified in their initial employment, between 30% (the Netherlands) and 58% (Switzerland) have remained overqualified five years later. Employing annual panel data for Germany, Blázquez and Budría (2012) and Boll, Leppin, and Schömann (2016) show that roughly 85% of overqualified workers remain overqualified in the next year. Supplementary Table A.1 provides an overview of further results on conditional overqualification rates. Among the presented studies, the time lag between the previous and current overqualification status ranges from one year to seven years. Apart from differences across countries and time periods, the results indicate that conditional overqualification rates tend to decrease for higher time lags since individuals have more time to successfully exit overqualification. Moreover, the studies differ according to the measurement of overqualification and the sample of individuals, i.e. whether the entire working population or only tertiary graduates are covered.
A few recent studies employ panel estimation models to evaluate the size of true state dependence of overqualification by controlling for unobserved individual heterogeneity (see Supplementary Table A.1). Estimating a trivariate probit model with annual German data, Blázquez and Budría (2012) find a significant true state dependence effect of 15% for a sample of workers drawn from the entire working population. Boll, Leppin, and Schömann (2016) employ the dynamic random-effects probit model proposed by Wooldridge (2005) and find a significant state dependence effect for female (8%) and male (13%) graduates living in Western Germany. 1 Using the same estimation approach, Joona, Gupta, and Wadensjö (2014) find evidence for significant state dependence for workers in Sweden but do not provide the size of the marginal effect. 2 For Australia, Mavromaras and McGuinness (2012) find a true state dependence effect of 15% for overskilling, i.e. a situation where workers do not fully utilise their skills on the job. Using dynamic random-effects logit models, Kiersztyn (2013) shows that overqualified workers in Poland had a four times higher probability to stay overqualified during the period 1988 to 2008. In contrast, Clark, Joubert, and Maurel (2017) estimate mixed proportional hazard models for the US and find that the duration of overqualification experience does not seem to have a significant impact on the probability to exit overqualification.
A common feature of these studies on state dependence of overqualification is that their results rely on samples of workers drawn from the entire working population observed at different stages of individual career cycles. Consequently, they do not measure state dependence for cohorts of equally educated workers who enter the labour market and are observed over the early years of the career cycle. However, focusing on individuals with similar educational background, age, and working experience may reduce the extent of unobservable heterogeneity. In turn, employing data for a specific group of individuals sharing more unobservable features than in the previous studies should improve the estimation of the state dependence effect. Therefore, it is the main aim of the present study to contribute to the literature by providing evidence on the extent of true state dependence of overqualification over the early career cycle of university graduates in Germany.

Data
The empirical analysis is based on two cohorts of the DZHW Graduate Panel covering university graduates who completed their study programme in 1997 and 2001, respectively. 3 It is a longitudinal nationwide study of university graduates in Germany which surveys individuals of each cohort one year, five years, and ten years after graduation. This data set has several advantageous features with respect to our analysis. In comparison to survey data covering the entire working population, focusing the analysis on the policy-relevant group of graduates does not produce small sample sizes. Therefore, the data set allows us to adequately analyse the subsequent labour market progression for the group of initially overqualified individuals. In addition, the individuals are jointly observed over the first ten years of their career cycles and face the same overall economic situation. Taken together, the data allow us to analyse individuals with similar educational background, age, and work experience. Unobservable heterogeneity among labour market participants should be lower within such a specific group of individuals. Comparability of graduates is increased by excluding individuals who were older than 35 years at the time of graduation or who obtained the university entrance certificate abroad. After deleting observations with missing data, the size of the remaining sample amounts to 6,467 graduates. 4

Overqualification
The focal variable in our analysis is overqualification. Overqualification indicates the occurrence of a vertical educational mismatch, such that workers hold a higher qualification than is required by their position. Three different measures for overqualification have been employed in the literature. They differ in how they evaluate the level of qualification that is needed for a job and are based on either the respondents' subjective assessments, realised matches within occupational classifications, or job analysts' ratings (see e.g. Hartog 2000). This study employs a subjective measure. The graduates were asked whether they hold a position for which a tertiary degree is a conventional requirement. An indicator variable takes value 1 if graduates indicate that their job usually does not require a tertiary degree and 0 otherwise. The overqualification status is observed in each of the three waves. 5 The subjective measure has one main advantage that is of particular relevance in the dynamic context of the study. As Hartog (2000) and others pointed out, the subjective measure captures specific job characteristics that only the job holder can assess and, thus, is not based on information aggregated at any occupational level. Therefore, in contrast to the objective measures defined for occupational classifications, a change in the overqualification status is possible within occupations if the tasks that have to be performed by the respondent have changed. However, since this measure relies on the workers' self-assessment, it is sensitive to potential differences in the individuals' perception of job requirements. Therefore, we have to assume that the assessment of job requirements does not systematically change over time at the individual level. Further potential drawbacks have been discussed in the literature. On the one hand, individuals with low wages may be more likely to incorrectly indicate that they are overqualified (Dolton and Vignoles 2000). On the other hand, workers may tend to overrate the requirements of their job in order to improve the perceived status of their position (Borghans and de Grip 2000).
To check the sensitivity of our main results, we replicate our analysis using a statistical measure which focuses on realised matches. Instead of self-assessments, the definition of job requirements is based on the actual educational attainment of workers within occupations and can either rely on the mean or the mode of this distribution (Verdugo and Verdugo 1989; Kiker, Santos, and Oliveira 1997). We employ the mode approach since it reduces the sensitivity to outliers (Borghans and de Grip 2000). Calculating the mode value of educational attainment within occupations requires a data set covering workers from all educational backgrounds. Since our main data set only includes graduates, we employ the German Socio-Economic Panel (GSOEP), a yearly representative panel survey conducted since 1984 among roughly 20,000 persons living in Germany (Goebel et al. 2019). The data set includes individuals' educational background according to the ISCED-classification and individuals' jobs according to the German Classification of Occupations (KldB 1992). Using information for the years 1996-2013, we calculate the mode value of education from all workers disaggregated at the 3-digit level of the occupational classification (Kiker, Santos, and Oliveira 1997). To allow for changes in requirements over time, we calculate the mode value for each survey year in our main data set. To avoid small sample sizes for occupational cells, the calculations are based on pooled observations from five years. For instance, observations from the years 1996-2000 are pooled for the calculation of required education in 1998, i.e. the time of the first interview in our study. 6 According to the statistical measure, graduates in our main data set are overqualified if they work in a 3-digit occupation for which the mode value of education of workers is lower than a tertiary degree.
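The mode-based measure described above can be illustrated with a minimal sketch. It assumes that education is coded as an ordered integer (with 3 standing for a tertiary degree), that the pooling window is centred on the survey year, and that ties are broken in favour of the higher level; none of these choices is taken from the study, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def required_education(records, survey_year, half_window=2):
    """Modal education level per occupation, pooling years around survey_year.

    records: iterable of (year, occupation_code, education_level) tuples.
    half_window=2 gives a five-year window (assumed to be centred).
    """
    first, last = survey_year - half_window, survey_year + half_window
    counts = defaultdict(Counter)
    for year, occupation, education in records:
        if first <= year <= last:
            counts[occupation][education] += 1
    # Assumption: ties between levels are resolved towards the higher level.
    return {
        occupation: max(c.items(), key=lambda item: (item[1], item[0]))[0]
        for occupation, c in counts.items()
    }


def is_overqualified(occupation, required, tertiary_level=3):
    """A graduate is overqualified if the occupation's modal level is below tertiary."""
    return required[occupation] < tertiary_level


# Toy records with made-up occupation codes; not data from the GSOEP.
toy_records = [
    (1996, "774", 3), (1997, "774", 3), (1998, "774", 2),
    (1998, "781", 3), (1999, "781", 2), (2000, "781", 2),
    (2005, "781", 3), (2005, "781", 3), (2005, "781", 3),
]
toy_required = required_education(toy_records, survey_year=1998)
```

In the toy data, the 2005 records fall outside the window for 1998 and therefore do not affect the modal level of occupation "781".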

Control variables
Analysing the underlying mechanisms of overqualification persistence, it is important to include relevant determinants of mismatch which might contribute to spurious state dependence. Individuals holding the same level of qualification may differ in (innate) ability and skills. In order to account for differences in ability and skills among the graduates in the sample, this study incorporates school leaving examination grades and university grades as proxy variables. Although grades are surely an imperfect proxy for ability, previous research shows that cognitive as well as non-cognitive skills are relevant predictors of grades (Poropat 2009; Almlund et al. 2011). School leaving examination grades may proxy primarily for differences in general skills and ability. Since the procedures of the school leaving examination differ across the 16 federal states in Germany, school grades are standardised within federal states. Concerning the differences in occupation-specific skills that are relevant for holding graduate jobs in a given field, university grades may be a sound proxy variable. University grades are standardised within fields of study and university types in order to account for substantial differences in the distribution of grades along these dimensions. As a robustness test, Section 6.4 replicates our baseline model using an alternative measure for unobserved ability, i.e. residual wages.
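As an illustration of standardising grades within groups, the following sketch computes z-scores separately for each group, e.g. each federal state or each combination of field and university type. The use of the population standard deviation and the handling of groups without variation are assumptions of this sketch, not details reported in the study.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardise_within(values, groups):
    """Return z-scores of values, computed separately within each group."""
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)
    stats = {g: (mean(vs), pstdev(vs)) for g, vs in by_group.items()}
    z_scores = []
    for value, group in zip(values, groups):
        m, s = stats[group]
        # Assumption: a group without any variation is mapped to zero.
        z_scores.append((value - m) / s if s > 0 else 0.0)
    return z_scores


# Toy grades for two hypothetical groups "A" and "B".
toy_z = standardise_within([1.0, 2.0, 3.0, 2.0, 4.0], ["A", "A", "A", "B", "B"])
```

After this transformation, a grade is expressed relative to the distribution of its own group, so that differences in grading standards across groups no longer enter the comparison.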
Study characteristics such as field of study or university type are relevant determinants of overqualification among graduates in several countries (Green and McIntosh 2007; Berlingieri and Erdsiek 2012). In Germany, individuals can choose between two tracks of tertiary education. They can either enrol at traditional universities or at universities of applied sciences. In general, traditional universities are academically more demanding than the practically oriented study programmes at universities of applied sciences. Concerning the respondents' fields of study, four subject groups are derived for the analysis. The first subject group consists of the three subjects Medicine, Law, and Teaching which require a state examination as finals and can solely be studied at traditional universities. The remaining subjects can be studied at both university types and are divided into three groups, namely Science, Technology, Engineering, and Mathematics (STEM subjects), Business Administration & Economics, and Social & Cultural Sciences. The individuals' study duration is another study characteristic that future employers may use as a productivity-related signal. Since average durations of programmes vary across subjects and university types, study duration is standardised within subject groups and university types.
The family background of graduates may also affect the probability to find an adequate job at the outset of the career cycle (Erdsiek 2016). Social background may be related to the individuals' motivation or aspiration, may provide advantageous social networks for finding promising jobs, or may ease the pressure to take a job offer due to financial constraints. As a proxy for various forms of capital transmitted within families, we include the information whether at least one of the graduate's parents obtained a tertiary degree.
Concerning the role of gender for the occurrence of overqualification, previous results are mixed (Leuven and Oosterbeek 2011). Most studies for Germany find a higher prevalence of overqualification among women than men and that differences tend to be more pronounced for the subjective measure than the statistical measure (Büchel 2001). In order to analyse whether overqualification dynamics differ according to gender, the estimations either include a gender dummy or are performed for split samples.
So far, all presented explanatory variables are time-constant. In the main part of the analysis, the following time-variant explanatory variables are accounted for. Changes in job preferences and family responsibilities over the observed time span may occur because of parenthood and marital status (Frank 1978). On the one hand, parenthood as well as getting married may increase the individuals' reservation wage. On the other hand, increased family responsibilities may change the preferences for specific job characteristics in a way that less demanding jobs become a more favourable option, e.g. providing an easier workload. 7 In addition, unemployment experience is included as a time-variant explanatory variable. Based on calendar information, the individuals' duration of previous unemployment (in months) is calculated for each of the three waves. The literature on unemployment scarring commonly finds that past unemployment causally increases the probability of future unemployment and also the probability to be lowly paid (Stewart 2007; Cappellari and Jenkins 2008a).
To account for job-related characteristics, some models include occupation fixed-effects. The graduates' occupation in each wave is measured in terms of occupational fields defined by Tiemann et al. (2008). They employ German data on tasks in 369 different occupations at the 3-digit level of the German Classification of Occupations (KldB 1992) and identify 54 occupational fields that are highly homogeneous. Finally, year dummies are included in all specifications to capture calendar effects such as the current business cycle.

Descriptive statistics
Table 1 provides summary statistics. One year after graduation (t = 1), a substantial share of 16% of the graduates is working in jobs that do not require a tertiary degree. After four more years (t = 5), the average share of overqualified graduates still remains 16%. Finally, ten years after graduation (t = 10), the rate of graduate overqualification has fallen to a share of 14%. At the aggregate level, job mismatch thus seems to occur throughout the early career of graduates in Germany with a tendency to decrease with potential labour market experience.
Turning to persistence at the individual level, Supplementary Table A.2 depicts the frequency of observed patterns of overqualification dynamics over the three time periods. The majority of graduates (72%) was holding appropriate jobs at each t. In contrast, 5% of the graduates have been permanently overqualified in every t. Columns 5 and 6 provide the distribution of dynamic patterns conditional on the initial state of overqualification. 86% of initially well-matched graduates remained in adequate employment 5 and 10 years after graduation. However, among initially overqualified graduates only 42% were matched in both later periods, 17% were matched only in t = 5, and 9% were matched only in t = 10. In total, one-third of the initially overqualified graduates remained overqualified over the observed time span.
The transition matrices in Supplementary Table A.3 provide the probabilities for entering and exiting overqualification. Among graduates who were overqualified in t = 1, a share of 50% has managed to find a matching job five years after graduation, while the other half of this group remained stuck in a mismatch (top panel). The probability to be overqualified in t = 5 is considerably lower for previously well-matched graduates (10%). Taken together, the probability to be overqualified in t = 5 is 40 percentage points higher for graduates already overqualified in the previous period. The persistence between t = 5 and t = 10 is even higher (bottom panel). If the transitions are pooled over the first ten years of the career cycle, persistence amounts to 45 percentage points. Supplementary Table A.4 provides the rate of overqualification for the statistical measure. Based on the realised matches approach, roughly 12% of graduates are overqualified one year after graduation and approximately 9% are overqualified after 5 or 10 years. In line with the literature, we therefore find lower overqualification rates for the statistical measure than for the subjective measure. Concerning conditional overqualification, roughly 6% of well-matched graduates become overqualified and 40% of overqualified graduates remain overqualified in the next period. Therefore, the persistence between periods amounts to roughly 35 percentage points for the statistical measure and 45 percentage points for the subjective measure.
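The raw persistence reported above is the difference between two conditional probabilities: the probability of being overqualified given previous overqualification, and the probability of being overqualified given a previous match. A minimal sketch of this calculation follows; the function name and the toy transitions are illustrative and are not the study's data.

```python
def raw_persistence(transitions):
    """Conditional overqualification rates and their difference.

    transitions: iterable of (previous_state, current_state) pairs,
    where 1 = overqualified and 0 = well-matched.
    Returns (P(OQ now | OQ before), P(OQ now | matched before), difference).
    Both previous states must occur at least once.
    """
    stay = enter = n_prev_oq = n_prev_matched = 0
    for previous, current in transitions:
        if previous == 1:
            n_prev_oq += 1
            stay += current
        else:
            n_prev_matched += 1
            enter += current
    p_stay = stay / n_prev_oq
    p_enter = enter / n_prev_matched
    return p_stay, p_enter, p_stay - p_enter


# Toy transitions chosen to mimic the 50% and 10% rates between t = 1 and t = 5.
toy_transitions = [(1, 1)] * 5 + [(1, 0)] * 5 + [(0, 1)] * 1 + [(0, 0)] * 9
p_stay, p_enter, persistence = raw_persistence(toy_transitions)
```

This difference is purely descriptive: it combines true state dependence with differences in observed and unobserved characteristics between the two groups.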
In Supplementary Table A.5, the incidence of overqualification over the early career of graduates is provided for relevant subgroups. The overqualification rate is higher among female graduates than among male graduates in each time period. Considerable differences according to the field of study also show a clear pattern. The lowest rate of overqualification is observed for the fields Medicine, Law, and Teaching (4%). Graduates in these subjects have to take a state examination which is a prerequisite for holding a civil service job or a job regulated by the state. These graduates will act on a highly specialised labour market narrowly focused on their own profession. The demand for graduates seems to be sufficiently high in these fields so that constrained options do not result in a high risk of overqualification. Allocation processes might be more difficult in the other fields leading to a higher share of mismatched graduates. A comparably low rate of overqualification among STEM graduates (13%) could be explained by the high demand for their field-specific skills. These skills are deemed to be highly important for innovation and technical developments. Overqualification is much more likely in the fields Business Administration & Economics (27%) and Social & Cultural Science (26%).
A clear pattern also emerges regarding overqualification and university grades. The incidence of overqualification gradually decreases from the lowest to the top quartile. Graduates in the lowest quartile are more than twice as likely to become overqualified as graduates in the top quartile. While the rate of overqualification decreases over time, the differences according to gender, field of study, and quartiles of the university grade remain robust. Supplementary Table A.6 shows the rate of overqualification conditional on the previous overqualification status for the same subgroups. Overqualification persistence tends to be weaker among male graduates and graduates with better university grades. Differences also exist between the subject groups, with the overqualification persistence ranging from 36 percentage points (Med/Law/Teach) to 48 percentage points (BusAdmin/Econ).
The gender composition within fields of study and gender-specific overqualification rates are presented in Supplementary Table A.7. Female graduates are overrepresented in Social & Cultural Science and Medicine, Law & Teaching but underrepresented in STEM subjects and Business Administration & Economics. The gender-specific overqualification rate within subjects is higher among female graduates, regardless of how overqualification is measured. 8 Except for STEM subjects, gender differences are more pronounced if the subjective measure of overqualification is used.

Sample attrition
Non-random attrition of individuals can induce sample selection biases and lead to inconsistent panel estimation results (Wooldridge 2010). One approach to consistent estimation in the presence of non-random selection is based on inverse probability weighting (Wooldridge 2002). This approach weights the criterion function to be minimised, e.g. the negative of a log likelihood, by the inverse probabilities of selection. To test for systematic differences in sample attrition, we examine whether characteristics significantly differ between individuals who participated in all three interviews and those who dropped out of the survey after the second interview. 9 Supplementary Table A.8 provides the variable means for the characteristics of stayers and drop-outs. Out of the 10,674 individuals in our data set, 1,340 did not participate in the third period. The overqualification rate among drop-outs is higher than among stayers; however, the difference is not statistically significant. Stayers obtained significantly better school grades but not significantly better university grades than drop-outs. Moreover, stayers are less likely to have graduated from universities of applied sciences. Considering a general indicator for favourable characteristics, we reassuringly find no significant difference between the log hourly wages of stayers and drop-outs.
Using the inverse probability weighting approach, we can account for sample selection on observables. The respective weights are based on a linear regression with participation in wave 3 as dependent variable and individual characteristics as explanatory variables 10 . The probability to stay in the sample is significantly higher for individuals who obtained better school grades or who graduated in STEM subjects or Medicine, Law & Teaching. In contrast, graduates from universities of applied sciences have a lower probability to stay. Supplementary Table A.4 provides the weighted and unweighted overqualification rates. Using inverse probability weights increases the overqualification rates by only 0.1 to 0.2 percentage points. This holds for both measures and all three periods. In addition, the weighting scheme does not alter overqualification rates conditional on the previous overqualification status.
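The weighting step can be illustrated with a small, self-contained sketch. It fits a linear probability model of staying in the sample on a single covariate and reweights the stayers by the inverse of their predicted probability. The single covariate, the `floor` safeguard against non-positive fitted probabilities, and the toy data are assumptions made for illustration; the actual weights are based on a richer set of individual characteristics.

```python
def fit_line(x, y):
    """Ordinary least squares with one regressor; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def ipw_rate(covariate, stayed, outcome, floor=0.05):
    """Inverse-probability-weighted mean of the outcome among stayers."""
    intercept, slope = fit_line(covariate, stayed)
    weighted_sum = 0.0
    weight_total = 0.0
    for x, s, y in zip(covariate, stayed, outcome):
        if s != 1:
            continue  # the outcome is unobserved for drop-outs
        # Linear predictions can leave (0, 1]; the floor is an ad hoc safeguard.
        p_stay = max(intercept + slope * x, floor)
        weight = 1.0 / p_stay
        weighted_sum += weight * y
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical toy data: one covariate, participation in the last wave, outcome.
school_grade = [1, 1, 1, 1, 2, 2, 2, 2]
stayed = [1, 1, 1, 1, 1, 1, 0, 0]
overqualified = [0, 0, 0, 1, 1, 1, None, None]

observed = [y for s, y in zip(stayed, overqualified) if s == 1]
unweighted = sum(observed) / len(observed)
weighted = ipw_rate(school_grade, stayed, overqualified)
print(f"unweighted rate: {unweighted:.3f}, weighted rate: {weighted:.3f}")
```

In the toy data, the group with the higher drop-out risk also has the higher overqualification rate, so weighting raises the estimated rate; this mirrors the direction, not the size, of the adjustment reported above.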
A threat to the validity of our analysis could arise if the persistence of overqualification considerably differs between stayers and drop-outs. For both groups, Supplementary Table A.9 provides the conditional overqualification rate in wave 2. The transition patterns are similar across both groups with drop-outs having an insignificantly higher probability of becoming or staying overqualified. Taken together, the above tests indicate that differences in the individual characteristics, overqualification rate, and persistence are small between stayers and drop-outs. If any sample biases may arise, our tests suggest that they may result in downward biases. As a robustness test, our main results are estimated using the inverse probability weighting approach.

Econometric model
The aim of this study is to estimate the true state dependence effect of past overqualification experience on future overqualification. Dynamic models of labour market choices which account for unobserved individual-specific effects face the problem of endogenous initial conditions. The 'initial conditions problem' can be considered as an endogenous selection problem because individual-specific unobserved factors may affect both the persistence of a labour market state and the initial state in the first period available in the data (Heckman 1981b). True state dependence is likely to be overestimated if potential endogeneity of the outcome in the first period is ignored (Chay and Hyslop 2014). Therefore, we employ the dynamic random-effects probit model proposed by Wooldridge (2005).
The econometric model can be summarised as follows. Let $y^*_{it}$ be the latent propensity for individual $i$ to be overqualified at time $t$. The latent propensity depends on the previous (realised) overqualification experience $y_{i,t-1}$, on observable explanatory variables summarised in the row vector $x_{it}$ and on individual-specific attributes $\mu_i$ that are unobservable and time-invariant. An individual is observed to be overqualified, i.e. $y_{it} = 1$, if $y^*_{it}$ exceeds a constant threshold which is assumed to be zero. The model is given by: 11
$$y^*_{it} = \gamma\, y_{i,t-1} + x_{it}\beta + \mu_i + \varepsilon_{it}, \qquad y_{it} = 1[y^*_{it} > 0] \tag{1}$$
where $\varepsilon_{it}$ represents an idiosyncratic error term. It is assumed that
$$\varepsilon_{it} \sim N(0, 1). \tag{2}$$
In such a model, the coefficient of the lagged dependent variable, $\gamma$, is interpreted as measuring the 'structural' or true state dependence (Heckman 1981a). 'Spurious' state dependence due to permanent unobserved heterogeneity is accounted for by the term for constant individual-specific attributes $\mu_i$. In the present study, this term may be interpreted to capture differences in the individuals' unobserved ability or preferences for specific job characteristics.
Estimating this model requires accounting for the initial conditions problem. Treating the initial conditions as exogenous would lead to an overstatement of the true state dependence effect if the initial conditions are correlated with $\mu_i$. In order to integrate out the individual-specific effect, its relationship with the outcome in the initial period $y_{i1}$ has to be specified. As suggested by Wooldridge (2005), one possibility is to assume that $y_{i1}$ is random and to specify the distribution of $\mu_i$ conditional on $y_{i1}$ and $x_i$, which leads to the joint density of $(y_{i2}, \dots, y_{iT}) \mid y_{i1}, x_i$. Following this estimation strategy, it is assumed that the individual-specific effect depends on the initial condition and the strictly exogenous variables as follows:
$$\mu_i = \alpha_0 + \alpha_1 y_{i1} + \bar{x}_i \alpha_2 + a_i \tag{3}$$
The inclusion of the time-averages of the observed explanatory variables, $\bar{x}_i = \frac{1}{T-1}\sum_{t=2}^{T} x_{it}$, accounts for potential correlation between the unobserved heterogeneity and the time-variant explanatory variables as suggested by correlated random-effects models (Mundlak 1978; Chamberlain 1984). 12 In this type of random-effects model, it is possible to estimate the effect of a change in $x_{it}$ while holding the time-averages fixed. It is assumed that the error term $a_i$ is i.i.d. as $N(0, \sigma_a^2)$ and that $a_i \perp (y_{i1}, x_i)$. Thus, the distribution of the individual heterogeneity is specified as follows:
$$\mu_i \mid y_{i1}, x_i \sim N(\alpha_0 + \alpha_1 y_{i1} + \bar{x}_i \alpha_2,\ \sigma_a^2) \tag{4}$$
Under these conditions, the probability to be overqualified is given by:
$$P(y_{it} = 1 \mid y_{i,t-1}, \dots, y_{i1}, x_i, a_i) = \Phi(\gamma\, y_{i,t-1} + x_{it}\beta + \alpha_0 + \alpha_1 y_{i1} + \bar{x}_i \alpha_2 + a_i) \tag{5}$$
As shown by Wooldridge (2005), integrating out $a_i$ yields a likelihood function with the same structure as in the standard random-effects model including the initial condition $y_{i1}$ and the time-averaged $\bar{x}_i$ as additional explanatory variables in each time period $t$. Incorporating the augmented set of explanatory variables, standard random-effects probit estimation methods can be employed to estimate $\gamma$, $\beta$, $\alpha_0$, $\alpha_1$, $\alpha_2$ and $\sigma_a^2$.
If γ is estimated to be significantly greater than zero, true state dependence is present such that a previous experience of overqualification increases the probability to be overqualified in the next period.
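To make the "augmented set of explanatory variables" concrete, the following sketch builds, for each individual and each period after the first, the lagged outcome, the initial outcome, and the time-average of a single time-variant regressor taken over the periods after the first. The data layout, the field names, and the use of one scalar regressor are illustrative assumptions; the resulting rows are what a standard random-effects probit routine would then be given.

```python
def augment(panel):
    """Build the regressors of the Wooldridge-type specification.

    panel: dict mapping an individual id to a list of (y_t, x_t) tuples
    ordered by period t = 1, ..., T, with x_t a single time-variant regressor.
    Returns one row per individual and period t = 2, ..., T.
    """
    rows = []
    for person, history in panel.items():
        y_initial = history[0][0]                  # initial condition y_i1
        later_x = [x for _, x in history[1:]]      # x_it for t = 2, ..., T
        x_bar = sum(later_x) / len(later_x)        # time-average over t >= 2
        for t in range(1, len(history)):
            y_t, x_t = history[t]
            rows.append({
                "id": person,
                "period": t + 1,
                "y": y_t,
                "y_lag": history[t - 1][0],        # y_{i,t-1}
                "x": x_t,
                "y_initial": y_initial,
                "x_bar": x_bar,
            })
    return rows


# Hypothetical toy panel with T = 3 periods.
panel = {
    "a": [(1, 0.0), (1, 2.0), (0, 4.0)],
    "b": [(0, 1.0), (0, 1.0), (1, 3.0)],
}
rows = augment(panel)
for row in rows:
    print(row)
```

Note that the first period only enters through `y_initial` and `y_lag`; it does not contribute an observation of its own.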

Implementation and data structure
The structure of our data differs in several aspects from that of most studies employing dynamic random-effects probit models. Most studies on state dependence use (1) more than three waves of annual data for (2) samples of the entire working population at (3) various stages of individual career cycles. First, our data set comprises three waves conducted approximately five years apart. Therefore, we observe changes in the individuals' labour market status over the medium run rather than the annual short run. The increased time lag between interviews eases a problem that studies employing annual data often face: spurious state dependence may occur if a single spell of the labour market outcome under investigation spans two consecutive years for a substantial share of individuals. Arulampalam, Booth, and Taylor (2000) take this problem into account and report that over one third of the unemployment spells in their annual data set lasted longer than one year. Second, our data set solely comprises university graduates as compared to samples drawn from the entire working population. Therefore, we concentrate the analysis on the policy-relevant group of university graduates and only compare individuals that are equally educated. Third, the individuals are observed over the first ten years of their career cycles after entering the labour market. Therefore, the initial condition is observed much closer to the real start of the labour market experience than in most previous studies. Nevertheless, accounting for the initial conditions problem remains important. Even if the entire history of the process of overqualification experience is observed, it would be a strong assumption that unobserved ability or preferences are independent of the initial state of overqualification (Wooldridge 2005).
Some of the explanatory variables in our analysis do not vary over time, such as the proxies for ability, the characteristics of the study programme, or the family background. As suggested by Wooldridge (2005), time-constant explanatory variables can be included in the dynamic random-effects model in order to increase explanatory power. However, the model is not able to separately identify the partial effects of the time-constant variables from their partial correlation with the unobserved individual heterogeneity.
The time-variant regressors included in the model are assumed to be strictly exogenous, conditional on the individual-specific unobserved effect $\mu_i$. This generally rules out potential feedback effects from changes in the outcome variable on future values of explanatory variables (Wooldridge 2000). Studies allowing for feedback effects by relaxing the strict exogeneity assumption are scarce. 13 Most studies focusing on state dependence of unemployment do not account for the possibility that the time-variant regressors marital status and number of children may depend on past unemployment experience (Arulampalam, Booth, and Taylor 2000). Arguably, feedback effects from unemployment may be more likely and severe than feedback effects from overqualification where individuals are still active in the labour market. Therefore, in line with the literature, we assume that marital status and parenthood are not affected by earlier overqualification experience. Furthermore, it is assumed that overqualification does not affect future unemployment experience. This assumption may be violated if, conditional on the individual-specific unobserved effect, overqualification experience reduces the individual's chances to find any job, for instance due to negative signals for employers.

Results
This section provides the results of the empirical analysis. As a starting point, the first subsection provides the results from a simple dynamic pooled probit model. The second subsection provides the main results from the dynamic random-effects probit model taking the selection into the initial condition and unobserved heterogeneity into account.

Observed heterogeneity
This subsection presents the results from a simple dynamic pooled probit model including the lagged overqualification status as a determinant for future overqualification experience. Here, it is assumed that the lagged overqualification status is exogenously given. A stepwise inclusion of individual characteristics into the model allows us to gauge the relevance of observed heterogeneity for the persistence from changes in the coefficient of the lagged dependent variable.
As shown in the first specification in Table 2, previous overqualification is strongly related to future overqualification. Controlling only for cohort membership and gender, previously overqualified graduates have a 44 percentage point higher probability of being overqualified in the next period. This resembles the persistence shown and discussed in the data section. The average partial effect of the lagged dependent variable is moderately reduced by 2 percentage points after including sociodemographic characteristics as well as previous unemployment spells in the model. Parental background, age at graduation, and unemployment experience are significantly related to overqualification. In contrast to most studies concerned with state dependence of labour market outcomes, we are able to include further explanatory variables concerning the study programme and individual ability. Accounting for differences across university types and fields of study in specification 3, the effect of previous overqualification is substantially reduced to 36 percentage points. Specification 4 additionally includes the proxy variables for ability, resulting in a further reduction of the dynamic effect. Taken together, approximately one fifth of the average partial effect of previous overqualification can be explained by observed heterogeneity because the partial effect is reduced by 10 percentage points in specification 4 as compared to specification 1. Heterogeneity concerning study characteristics and individual ability seems to be particularly relevant. The question to what extent the remaining dynamic effect of 34 percentage points can be attributed to unobserved heterogeneity or to true state dependence is analysed in the next subsection.
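As a quick check of the "approximately one fifth" figure, the share of the initial average partial effect that disappears between specification 1 (44 percentage points) and specification 4 (34 percentage points) can be written out as:

```latex
\[
\frac{44 - 34}{44} = \frac{10}{44} \approx 0.23
\]
```

That is, observed heterogeneity accounts for slightly more than one fifth of the raw dynamic effect.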

Main results
The results on the true state dependence effect of overqualification are presented in Table 3. The average partial effects are obtained by subtracting the predicted probabilities of being overqualified conditional on the previous overqualification status under the assumption that the individual-specific heterogeneity takes its average value. For the complete sample (specification 1), the average partial effect of the lagged overqualification experience is significant and amounts to 3 percentage points. This implies that even after controlling for observed and unobserved characteristics, graduates are on average 3 percentage points more likely to be overqualified at time t if they have already experienced overqualification in t−5. Therefore, we find evidence for a true state dependence effect of graduate overqualification. However, the size of the true state dependence effect of previous overqualification experience is substantially smaller than the observed persistence of 45 percentage points. It is also much smaller than the average partial effect of lagged overqualification if only the observed heterogeneity is accounted for (34 percentage points). 14 The results thus suggest that most of the persistence is due to unobserved heterogeneity across graduates rather than due to true state dependence. Therefore, unobserved factors seem to be the main driver of overqualification persistence over the early career for graduates in Germany. The size of the true state dependence effect is similar for men and women; however, the effect loses its significance in the separate model for male graduates.
Note: Pooled Probit estimation; Average partial effects; Standard errors clustered at the individual level; Previous period indicated by (t − 5); Significant at 1% ***, significant at 5% **, significant at 10% *.
The effect of the initial condition, i.e. overqualification experience one year after graduation, is highly significant and much larger than the effect for the lagged dependent variable in all three specifications. This implies a substantial correlation between the graduates' initial overqualification status and the unobserved heterogeneity. The results suggest that most of the observed persistence of overqualification is attributable to the initial selection into inadequate job matches based on unobserved factors.
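The average partial effect of the lagged status, as described above, is a difference between two predicted probabilities averaged over the sample. A minimal sketch follows, assuming that the remaining part of the probit index (covariates, initial condition and time-averages) has already been computed for each observation and that the individual-specific error is set to zero, which is one reading of "takes its average value". The function names and numbers are illustrative only.

```python
import math


def norm_cdf(v):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))


def ape_lagged_status(gamma, other_index):
    """Average partial effect of switching the lagged status from 0 to 1.

    gamma: coefficient on the lagged dependent variable.
    other_index: per-observation value of the rest of the probit index,
    evaluated with the individual-specific error set to zero (assumption).
    """
    differences = [norm_cdf(gamma + idx) - norm_cdf(idx) for idx in other_index]
    return sum(differences) / len(differences)


# Hypothetical index values; not estimates from the paper.
toy_index = [-1.5, -1.0, -0.5, 0.0]
ape = ape_lagged_status(0.2, toy_index)
print(f"average partial effect: {100 * ape:.1f} percentage points")
```

Because the normal density is flatter far from zero, the same coefficient translates into a smaller partial effect for observations with a very low baseline risk of overqualification.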
The average partial effects for the observed explanatory variables show that female graduates have a significantly higher probability of being overqualified than their male peers (1.5 percentage points). Graduates with higher ability are less likely to be overqualified. This holds for school grades, which may proxy primarily for differences in general skills and ability, as well as for university grades, which may proxy for occupation-specific skills that are relevant for holding graduate jobs in a given field. Graduates who needed a relatively long time to complete their study programme, as measured by the standardised study duration in months, are significantly more likely to be overqualified. Workers who graduated from a university of applied sciences are more likely to be mismatched than graduates from traditional universities, even when differences in the fields of study are accounted for. In comparison to graduates in Business Administration & Economics, graduates in the other subjects are more likely to be adequately matched. In addition to their time-specific values, the model includes the averages for the time-variant explanatory variables. Only the time-averaged duration of unemployment experience is significant, indicating a correlation with the unobserved heterogeneity for female graduates.
Including 54 occupation dummies in the baseline specifications, Table 4 provides the results on overqualification state dependence if job-related features are accounted for. Controlling for the occupation fixed effects leads to a moderate increase of the estimated state dependence effect for both female and male graduates. In contrast, the effect of the initial condition decreases once occupation fixed effects are included. However, the main baseline results remain robust because the coefficient of the lagged overqualification status is still significantly smaller than the coefficient of initial overqualification. If occupation fixed effects are accounted for, the coefficients for school grades, study duration, and subject decrease and lose their significance.
As a robustness check, Table 5 provides further specifications of the baseline model for the whole sample. First, specification 1 includes the lags and leads of the time-variant explanatory variables as proposed in the original form of the Wooldridge estimator. Second, unemployment experience is excluded from the model since previous overqualification may exhibit potential feedback effects on future unemployment spells leading to biased estimates (Biewen 2009). Third, study characteristics and grades are excluded from the model. In all three specifications, the average partial effect of the lagged dependent variable remains qualitatively unchanged. Only the negative effect of initial overqualification increases if study characteristics and grades are excluded. This illustrates the relevance of these characteristics for the selection into early overqualification. Furthermore, the effect of parental education increases and becomes significant, signalling that study characteristics and ability are relevant pathways for family background effects on overqualification.
Note: Random-effects probit estimation; Average partial effects; Standard errors in parentheses; Previous period indicated by (t − 5); Initial condition indicated by (t = 1); ρ: estimate of the cross-period correlation of the composite error term $a_i + \varepsilon_{it}$; Occupation FE: 54 occupational dummies; Year dummies included; Significant at 1% ***, significant at 5% **, significant at 10% *.
In comparison to existing results, our estimate of the true state dependence effect is rather small. This might be due to the fact that we focus on changes in overqualification in the medium run rather than the short run. Moreover, focusing our analysis on individuals with similar educational background, age, and working experience may reduce the extent of unobservable heterogeneity and, thus, improve the estimation. As shown in Supplementary Table A.1, Blázquez and Budría (2012) find a state dependence effect of 15% for a sample of workers drawn from the entire working population in Germany and Boll, Leppin, and Schömann (2016) find a state dependence effect for female (8%) and male (13%) graduates living in Western Germany. Notably, both studies use a time lag of one year for estimating state dependence. 15 For the same reason, overqualification persistence is also more pronounced in their studies, i.e. roughly 85%, than in our study, i.e. 45%. Estimating dynamic random-effects logit models for Poland, Kiersztyn (2013) shows that overqualified workers have a four times higher probability to stay overqualified after five years. In contrast, Clark, Joubert, and Maurel (2017) find that the duration of overqualification experience does not seem to significantly affect the probability to exit overqualification in the US. Taken together, our results in favour of true state dependence are in line with most of the existing studies.
Note: Random-effects probit estimation; Average partial effects; Standard errors in parentheses; Previous period indicated by (t − 5); Initial condition indicated by (t = 1); ρ: estimate of the cross-period correlation of the composite error term $a_i + \varepsilon_{it}$; Occupation FE: 54 occupational dummies; Year dummies included; Significant at 1% ***, significant at 5% **, significant at 10% *.

Sensitivity analyses
This section provides further robustness checks. In Section 6.1, the analysis is replicated using an alternative econometric model. In Section 6.2, the analysis is replicated employing another measure for overqualification. In Section 6.3, inverse probability weighting is applied to account for potential non-random sample attrition. In Section 6.4, residuals from wage regressions are used as an alternative measure for unobserved ability. In Section 6.5, we show that persistent overqualification is associated with lower labour market success in terms of wages.

Heckman model
As a robustness check for the main results, an alternative econometric model is employed. For dynamic non-linear models, Heckman (1981b) proposed an approach that differs in the way of modelling unobserved heterogeneity and solving the initial conditions problem. 16 He suggests modelling the initial outcome of the dependent variable, $y_{i1}$, jointly with the subsequent outcomes of the dependent variable, $y_{i2}, \dots, y_{iT}$. In order to integrate out the unobserved effect $\mu_i$, he suggests approximating the unknown distribution of $y_{i1} \mid \mu_i, x_i$. In the latent variable form, the Heckman model can briefly be summarised as follows:
$$y^*_{i1} = z_i \pi + \theta \mu_i + \varepsilon_{i1} \tag{6}$$
$$y^*_{it} = \gamma\, y_{i,t-1} + x_{it}\beta + \mu_i + \varepsilon_{it}, \qquad t = 2, \dots, T \tag{7}$$
where $z_i$ is a row vector of covariates (with coefficient vector $\pi$) including $x_{i1}$ and additional exogenous instruments for the initial condition. By construction, $\mu_i$ and $\varepsilon_{i1}$ are orthogonal to one another. It is assumed that the initial observation $y_{i1}$ is uncorrelated with $\varepsilon_{it}$ and also that $\varepsilon_{i1}$ is uncorrelated with the $x_{it}$ for all $i$ and $t$. Moreover, it is assumed that both $\varepsilon_{i1}$ and $\mu_i$ are normally distributed, the former with variance 1 and the latter with variance $\sigma_\mu^2$. A test of exogeneity of the initial condition in this model is provided by the test of $\theta = 0$.
Note: Random-effects probit estimation; Average partial effects; Standard errors in parentheses; Previous period indicated by (t − 5); Initial condition indicated by (t = 1); ρ: estimate of the cross-period correlation of the composite error term $a_i + \varepsilon_{it}$; Study characteristics: university type and subjects; Year dummies included; Significant at 1% ***, significant at 5% **, significant at 10% *.
Equations (6) and (7) together specify a complete model for $(y_1, y_2, \dots, y_T)$. The contribution to the likelihood function for individual $i$ is given by
$$L_i = \int \Phi\big[(z_i \pi + \theta \mu)(2 y_{i1} - 1)\big] \prod_{t=2}^{T} \Phi\big[(\gamma\, y_{i,t-1} + x_{it}\beta + \mu)(2 y_{it} - 1)\big]\, g(\mu)\, d\mu \tag{8}$$
where $g(\mu)$ is the probability density function of the unobserved individual-specific heterogeneity and $\Phi$ denotes the standard normal cumulative distribution function. With $\mu$ taken to be normally distributed, the integral in Equation (8) can be evaluated using Gauss-Hermite quadrature (Butler and Moffitt 1982).
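To illustrate how such a likelihood contribution can be evaluated numerically, the sketch below applies a five-point Gauss-Hermite rule to a probit model in which the initial period and the later periods share a normally distributed individual effect. The argument names (`z_pi` for the initial-period index, `xb` for the later-period indices, `sigma_mu` for the standard deviation of the individual effect) and the parameter values are illustrative assumptions; applied work would typically use more quadrature points.

```python
import math
from itertools import product

# Five-point Gauss-Hermite nodes and weights for the weight function exp(-x^2).
GH_NODES = [-2.0201828704560856, -0.9585724646138185, 0.0,
            0.9585724646138185, 2.0201828704560856]
GH_WEIGHTS = [0.01995324205904591, 0.3936193231522412, 0.9453087204829419,
              0.3936193231522412, 0.01995324205904591]


def norm_cdf(v):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))


def likelihood_contribution(y, z_pi, xb, gamma, theta, sigma_mu):
    """Likelihood of one individual's outcome sequence y = (y_1, ..., y_T).

    z_pi: index of the initial-period equation (without the individual effect).
    xb:   list of covariate indices for periods t = 2, ..., T.
    """
    total = 0.0
    for node, weight in zip(GH_NODES, GH_WEIGHTS):
        mu = math.sqrt(2.0) * sigma_mu * node  # change of variables for N(0, sigma_mu^2)
        prob = norm_cdf((z_pi + theta * mu) * (2 * y[0] - 1))
        for t in range(1, len(y)):
            index = gamma * y[t - 1] + xb[t - 1] + mu
            prob *= norm_cdf(index * (2 * y[t] - 1))
        total += weight * prob
    return total / math.sqrt(math.pi)


# Hypothetical parameter values for a three-period panel.
params = dict(z_pi=-1.0, xb=[-0.8, -0.9], gamma=0.3, theta=0.7, sigma_mu=1.2)
example = likelihood_contribution((1, 1, 0), **params)
total_probability = sum(likelihood_contribution(seq, **params)
                        for seq in product([0, 1], repeat=3))
print(f"L_i for (1, 1, 0): {example:.4f}; sum over all sequences: {total_probability:.6f}")
```

Summing the contributions over all possible outcome sequences should return one, which is a convenient sanity check on both the quadrature weights and the sign convention $(2y - 1)$.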
In contrast to the Wooldridge model, the Heckman estimator requires an additional instrument. The instrument has to affect the initial state but, conditional on this state, must not affect transition probabilities. Heckman (1981b) suggests that when modelling labour market outcomes, initial conditions may be instrumented by using information prior to labour market entry. Most commonly, family background has been used as instrument in applications of the Heckman model for labour market outcomes such as unemployment (Arulampalam, Booth, and Taylor 2000), low wage (Stewart 2007), or social assistance receipt (Cappellari and Jenkins 2008b). Focusing on job mismatch, Blázquez and Budría (2012) use the quality of the relationship with the parents while Mavromaras and McGuinness (2012) use parental employment as instruments. 17 However, family background might not be an ideal instrument. In the context of overqualification, the offspring from wealthy and highly educated parents might obtain a higher aspiration and motivation to escape from an initial mismatch than workers with an adverse family background.
The present study uses a measure for the labour market condition at the time of the first interview as an instrument for the initial overqualification status. In particular, occupation-specific unemployment rates capturing exogenous labour demand shocks are employed as a measure for the initial labour market condition. This strategy builds on the recent literature focusing on the long-lasting effects of university graduation during a recession (Kahn 2010;Oreopoulos, Von Wachter, and Heisz 2012). In particular, Liu, Salvanes, and Sørensen (2016) show that graduates are more likely to be affected by job mismatch if they entered the labour market in a bad economy as measured by unemployment rates.
Occupation-specific unemployment rates are employed to measure initial economic conditions affecting the probability to be overqualified, e.g. because a higher occupation-specific unemployment rate might force more graduates to accept lower level jobs in order to avoid unemployment. For this purpose, the unemployment rates for university graduates have been computed in 120 distinct occupations based on administrative data for Germany. 18 Merging this information to the original data set, the occupation-specific unemployment rate at the time of the first interview is given for all respondents (Mean: 3.74%, Min: 1.05%, Max: 16.48%).
Table 6 presents the results of the Heckman model including the occupation-specific unemployment rate as an instrument for the initial condition. Since estimation coefficients instead of partial effects are presented, the magnitudes of the parameters cannot be interpreted directly. 19 The top panel of the table depicts the main estimation results for time periods t = 5, 10, while the bottom panel depicts the results for the initial condition equation (t = 1). The occupation-specific unemployment rate at the time of the first interview significantly increases the probability to be initially overqualified for both female and male graduates. Therefore, on average the risk of initial overqualification is higher in occupations with a lower demand for graduates or an excess supply of graduates one year after graduation. As expected, the hypothesis of exogenous initial conditions is strongly rejected because the respective test statistic θ is significantly greater than zero. Similarly to the Wooldridge estimator, the coefficient on the lagged overqualification experience signals a significant true state dependence effect if the initial condition is jointly modelled with the subsequent periods.

Table 6, Main Panel Estimation (dependent variable: Overqualification). Note: Random-effects probit estimation; Estimation coefficients are displayed; Standard errors in parentheses; Stata command: redprob; ρ: estimate of the cross-period correlation of the composite error term $\mu_i + \varepsilon_{it}$; θ: statistic used to test whether the initial conditions are exogenous. The estimate of θ is significantly greater than zero, i.e. the hypothesis that the initial conditions are exogenous is rejected; Unemployment rate: occupation-specific unemployment rate at the time of labour market entry; Year dummies included; Significant at 1% ***, significant at 5% **, significant at 10% *.
Furthermore, in the Heckman model, the state dependence effect remains significant for both female and male graduates.
To assess the robustness of the estimated size of true state dependence across econometric models, Table 7 presents the average partial effect of the lagged dependent variable from the Heckman model and the Wooldridge model. The Heckman model estimate of the true state dependence effect amounts to 4.0 percentage points which is slightly higher than the Wooldridge model estimate of 2.9 percentage points. Moreover, gender differences are slightly more nuanced in the Heckman model. As this sensitivity test indicates, our main result concerning state dependence of overqualification is robust to the choice of econometric model.
To check the validity of the Heckman model results, we perform three robustness tests. First, we additionally include the occupation-specific unemployment rates at the time of wave 2 and wave 3 in our estimation model. Second, we account for the fact that using occupation-specific unemployment rates as instruments might be prone to selection into occupations and use subject-specific unemployment rates as instruments instead. These subject-specific unemployment rates have been calculated by computing the unemployment rate among the pool of graduates in a given subject. Third, in line with the previous literature, we use parental education as an instrument for the initial condition. Taken together, our main results concerning state dependence do not change if these robustness tests are performed. 20

Measure of overqualification
To check that our results are not driven by the operationalisation of overqualification, this section replicates the baseline estimations using the realised matches approach. Table 8 provides the results of the baseline model on state dependence if overqualification is identified based on a statistical measure. In line with the previous results, the estimated true state dependence effect is significant and amounts to 3 percentage points. This result also holds for the separate regressions for female and male graduates. Furthermore, the coefficients for the initial overqualification status remain significant in all three specifications. In comparison to the previous results, the coefficients are attenuated but still nearly three times larger than the coefficients for the lagged overqualification status. Taken together, this robustness test supports the main result that unobserved factors are the main driver of the persistence in overqualification. Furthermore, using the statistical measure for overqualification does not change which explanatory variables exhibit a significant association with overqualification, e.g. grades, subjects and unemployment. 21

Inverse probability weighting
To account for potential non-random sample attrition, we replicate our baseline specifications applying an inverse probability weighting scheme. Assuming that attrition is random conditional on grades, the respective weights are based on a probit regression with participation in wave 3 as dependent variable. As shown in Column 1 of Table 9, the probability to stay in the sample is significantly higher for individuals who obtained better school grades. Based on the regression model in Column 1, the probability to stay in the sample is predicted for every individual and inverse probability weights are calculated. Columns 2 and 3 of Table 9 provide the results from the weighted estimation of the state dependence effect of overqualification. Since inverse probability weighting requires at least one variable in the sample selection model that is not included in the final model, grades are excluded in the dynamic random-effects models (Wooldridge 2010). For a better assessment of the potential effect of weighting on estimated coefficients, Columns 4 and 5 provide the unweighted results for the same model. The weighted as well as the unweighted estimation suggest that the state dependence effect of lagged overqualification amounts to 3.7 (3.0) percentage points if occupation fixed effects are (not) accounted for. Therefore, weighting under the assumption that attrition is random based on grades does not indicate that our baseline results are biased.
Table 7. Results, predicted probabilities of overqualification.

Unobserved ability
Our analysis uses school grades and university grades as measures for individual ability. However, well-matched and overqualified graduates could still differ with respect to unobserved abilities that are not captured by grades. If unobserved characteristics such as motivation determine the persistence of overqualification and are correlated with the controls included in the model, our estimates would be biased. Using residuals from wage regressions as a proxy for unobserved skills is one common approach to address this issue (e.g. Juhn, Murphy, and Pierce 1993; Dickson 2013; Chevalier and Lindley 2009). Residual wages capture all the individual characteristics, including job characteristics, that affect wages beyond the control variables included in the wage equation.
Note: Random-effects probit estimation; Average partial effects; Standard errors in parentheses; Previous period indicated by (t − 5); Initial condition indicated by (t = 1); ρ: estimate of the cross-period correlation of the composite error term $a_i + \varepsilon_{it}$; Year dummies included; 3-digit occupations with less than 20 observations for calculating realised matches are excluded from the estimation; Significant at 1% ***, significant at 5% **, significant at 10% *.
As a robustness test, we include residual wages as a measure of unobserved skills in our baseline regressions. In a first step, we estimate the following wage equation:

$\ln(y_{i1}) = x_{i1}'\beta + \varepsilon_{i1}$

where $\ln(y_{i1})$ is the logarithm of hourly wages for individual $i$ in the first period, i.e. one year after graduation, and $x_{i1}$ is a vector of individual characteristics that might explain the graduates' wages, with coefficient vector $\beta$. The estimate of unobserved characteristics, i.e. the residual wage, is given by the disturbance term $\varepsilon_{i1}$. To simplify the interpretation, residual wages are normalised to have a mean of 0 and a standard deviation of 1. The results of the wage regression using ordinary least squares estimation are provided in Column 1 of Table 10. One year after graduation, female graduates and graduates with lower university grades earn significantly lower wages. Moreover, wages strongly differ between fields of study. This finding is in line with a growing literature documenting substantial and increasing wage differences across college majors (Altonji, Kahn, and Speer 2014; Altonji, Arcidiacono, and Maurel 2016). In our sample, wages are highest for graduates in Business Administration & Economics and significantly lower in STEM subjects, Social & Cultural Sciences, and Medicine, Law, & Teaching (in descending order). The substantial negative coefficient for Medicine, Law, & Teaching is particularly striking. As shown in Figure A.1, wages are especially low for this subject group one year after graduation. Over the next five to ten years, however, wages grow substantially. An explanation for the particularly low starting wage is that these graduates predominantly hold civil service jobs or jobs regulated by the state.
In these jobs, an additional preparatory service or introductory training is often required for entry into a profession after tertiary education has been completed (e.g. a two-year teaching traineeship or legal traineeship). During this period, wages are comparatively low. In Columns 2 and 3 of Table 10, the obtained residual wages are included in our baseline model on state dependence of overqualification. As expected, the coefficient is negative, but it is only significant if the model also includes occupation fixed-effects (Column 3). Our baseline results on the true state dependence effect, however, do not qualitatively change due to the inclusion of unobserved characteristics. Although the coefficients of lagged overqualification are reduced, the estimates do not significantly differ from the baseline models.
Note: OLS (1) and random-effects probit (2,3) estimation; Average partial effects; Standard errors in parentheses; Residual wage: normalised residual predicted from specification (1); Previous period indicated by (t − 5); Initial condition indicated by (t = 1); ρ: estimate of the cross-period correlation of the composite error term $a_i + \varepsilon_{it}$; Mean controls: mean married, mean children, mean months unemployed; Occupation FE: 54 occupational dummies; Year dummies included; Significant at 1% ***, significant at 5% **, significant at 10% *.
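The construction of the residual wage can be illustrated with a minimal sketch. For brevity it assumes a wage equation with a constant and a single regressor, whereas the study uses a full vector of individual characteristics; the function name and the example numbers are hypothetical.

```python
import math

def standardised_residuals(log_wages, x):
    """Residuals from an OLS regression of log wages on a constant and one
    regressor, rescaled to have mean 0 and standard deviation 1."""
    n = len(log_wages)
    mean_x = sum(x) / n
    mean_y = sum(log_wages) / n

    # Closed-form OLS estimates for the single-regressor case.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, log_wages))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # With a constant in the model, the residuals already average to zero,
    # so only the scale needs to be adjusted.
    residuals = [yi - intercept - slope * xi for xi, yi in zip(x, log_wages)]
    sd = math.sqrt(sum(r ** 2 for r in residuals) / (n - 1))
    return [r / sd for r in residuals]

# Invented example values: log hourly wages and one explanatory variable.
residual_wage = standardised_residuals(
    [2.3, 2.5, 2.4, 2.9, 2.8], [1.0, 2.0, 3.0, 4.0, 5.0]
)
```

The standardised residual then enters the second-stage model as an additional regressor, so that its coefficient can be read as the association with a one standard deviation change in the residual wage.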

Wage differences
So far, the empirical analysis has focused on the dynamics of overqualification over the early career. The following section briefly shows that overqualification in this period is negatively correlated with the individuals' labour market success as measured in terms of wages. Although we are not able to identify causal effects, e.g. by using exogenous variation in overqualification, we can use important individual characteristics to control for selection on observables. Most importantly, we can account for some differences in ability by including grades, for study characteristics by including subjects and type of university, and for job-related characteristics by including occupation fixed-effects. Table 11 provides separate linear wage regressions for each wave. As shown in Column 1, current overqualification is significantly and negatively related to the wage level one year after graduation. Controlling for job-related characteristics, the negative coefficient of overqualification increases in size and significance. Therefore, overqualified graduates on average earn less than their peers working in the same occupation as measured by 54 occupational dummies. Wage penalties significantly increase during the first ten years of graduates' careers, regardless of whether occupation-specific characteristics are controlled for. In line with a growing literature, we find that wages strongly differ between fields of study. These wage differences are reduced but remain strongly significant if occupation fixed-effects are accounted for. Moreover, the observed differences across subjects do not qualitatively change if overqualification is excluded from the wage equations. 22 Therefore, overqualification does not seem to be a main driver of college major wage differentials in our sample. Finally, wage differences across subjects are persistent up until ten years after graduation.
Table 12 shows how the pattern of individual overqualification experience is associated with current wages. In Columns 1 and 2, the dependent variable is the individual's wage five years after graduation. In addition to the baseline explanatory variables, wages are hypothesised to be determined by the combination of current overqualification (wave 2) and previous overqualification (wave 1). In comparison to individuals who have been well-matched in both periods, individuals who are overqualified in period 2 earn significantly less. More specifically, wage penalties tend to be higher for individuals who have also been overqualified in the first period than for individuals who only became overqualified in period 2, though the difference is not statistically significant. This result also holds when occupation fixed-effects are included.
Columns 3 and 4 show the same associations over the first ten years of the career cycle. If occupation fixed-effects are accounted for, significant wage penalties are found even for individuals who are well-matched in wave 3 but who have experienced overqualification in the past. 23 However, wage penalties are significantly higher for individuals who are currently overqualified in wave 3. Moreover, wage losses tend to increase with the time spent in overqualification such that individuals who have experienced overqualification in two or three periods earn increasingly less. Taken together, wage penalties associated with overqualification seem to be most severe for individuals who have been persistently overqualified during the first ten years of their professional career. Of course, these results do not take account of graduates' preferences for certain types of jobs or non-pecuniary job characteristics. However, with respect to the main focus of the present study, these findings indicate that persistent overqualification is significantly related to a lower labour market success of graduates in terms of wages.
Year dummies included; Significant at 1% ***, significant at 5% **, significant at 10% *.

Conclusion
In this study, we analyse the persistence and true state dependence of overqualification over the first ten years of the career cycles of university graduates in Germany. In order to analyse to what extent the high overqualification persistence arises due to true state dependence or due to spurious correlation induced by unobserved factors, we employ the dynamic random-effects estimator proposed by Wooldridge (2005). Accounting for unobserved heterogeneity and the initial conditions problem, the results suggest that a moderate share of the overqualification persistence can be attributed to true state dependence. Previous overqualification experience is found to have a significant true state dependence effect on future overqualification amounting to 3 percentage points. In conjunction with the finding that observed heterogeneity explains only a fraction of the persistence, these results suggest that unobserved factors are the main driver of the high persistence of overqualification. In particular, unobserved characteristics driving the selection into the initial state of overqualification are strongly related to the probability of remaining overqualified later on. Such forms of unobserved heterogeneity might include preferences for particular job characteristics that can be found in low-requirement jobs. Moreover, differences in ability that are not captured by grades or residual wages may induce persistence because graduates with low ability lack the skills required to switch to an adequate job. Overall, we find little evidence for gender-specific differences in the dynamics of overqualification. Finally, our results are robust to various sensitivity checks. These robustness tests cover the choice of the econometric model, the measure of overqualification, potential non-random sample attrition, and an alternative measure of unobserved ability. We additionally show that persistent overqualification is associated with significant wage penalties.
Limitations of the current study may arise from the fact that only three observations per respondent are available in the data. Concerning the external validity, our results are specific to university graduates over the early career in Germany due to our sample restrictions. Therefore, the results may not apply to later stages of the career cycle, individuals with other educational backgrounds, or countries with labour market frictions that differ from those in Germany. Finally, the study does not account for the potential endogeneity in graduates' choice of the moment of their graduation, which might be affected by the current business cycle.
Knowledge about the causes of overqualification persistence is important from a policy perspective. Efficient policy responses differ between situations where persistence can be fully explained by individual characteristics, i.e. spurious state dependence, and situations with true state dependence. Therefore, the following policy implications can be drawn from our analysis. Since true state dependence explains a small but statistically significant share of the overqualification persistence, policies that help graduates to avoid or exit overqualification are likely to exhibit a moderate long-lasting effect on the rate of overqualification. Moreover, our results indicate that a substantial share of the persistence can be attributed to heterogeneous individual characteristics, which complicates the development of policy programs with durable effects. Since overqualification persistence is only partly explained by the rich set of observable characteristics included in the empirical analysis, further research is needed in order to disentangle which further barriers could contribute to the overqualification persistence.
The study shows that the field of study is an important determinant of overqualification over the first ten years of the career cycle. Therefore, a further implication is that the need for policy measures to reduce allocation constraints might differ across fields of study. For instance, skills of STEM graduates are deemed to be highly important for innovation and technical developments as drivers of economic growth. In the light of the public debate about a potential shortage of those graduates, targeted measures could help to reconcile the actual available supply and the high demand for STEM graduates. Whether policies should focus on supporting a better match after course completion or should try to attract more applicants to other fields of study with lower rates of overqualification requires further research. In particular, policies aiming to attract more applicants for, e.g. STEM subjects have to be based on studies which account for potential self-selection into fields of study.
Notes
11. Note that the model is presented in its standard notation indicating the lagged dependent variable as $y_{i,t-1}$. Due to the time structure of the data and following the notation used so far, the lagged dependent variable will be indicated by $y_{i,t-5}$ in the results section.
12. The original Wooldridge model includes all values of the time-varying explanatory variables at each period (except the initial period). Most studies rely on the within-means of the time-varying explanatory variables instead. As shown by Rabe-Hesketh and Skrondal (2013), this specification does not lead to biases. As a robustness check, the original Wooldridge model including leads and lags is also estimated and presented.
13. Focusing on state dependence in poverty, Biewen (2009) concludes that ignoring feedback effects in terms of future unemployment and household composition may lead to biased estimates.
14. See Table 3, Column 4.
15. Moreover, the estimation models by Boll, Leppin, and Schömann (2016) simultaneously include a subjective overqualification measure, a statistical overqualification measure, as well as the interaction of both measures.
16. Several studies have compared the performance of the Wooldridge estimator and the Heckman estimator. Based on Monte Carlo simulations and an application for unemployment persistence, Arulampalam and Stewart (2009) conclude that the results of both estimators are very similar and that neither of them outperforms the other. In contrast, Akay (2012) concludes that in short panels (below 5 waves) the Heckman estimator should be preferred because the Wooldridge estimator overestimates the true state dependence and underestimates the persistence due to unobserved time-invariant individual characteristics. Rabe-Hesketh and Skrondal (2013) attribute this conclusion to a misspecification of the Wooldridge estimator as it is used in the study by Akay. They show that biases vanish if time-averages of the explanatory variables do not include the initial period.
17. Blázquez and Budría (2012) employ a trivariate probit model; nevertheless, the requirements concerning the instruments are the same as in the Heckman model.
18. Data source: Sample of Integrated Labour Market Biographies (SIAB), SUF (Regional File 1975. Employer information coded at the 3-digit level of the German occupational classification KldB-1988 has been aggregated to 120 occupational groups. See vom Berge, Burghardt, and Trenkle (2013) for further information about the data set.
19. The redprob command for Stata is used. See http://www2.warwick.ac.uk/fac/soc/economics/staff/mstewart/stata/ and Stewart (2007).
20. The results are available upon request.
21. No specifications with occupation fixed-effects are estimated for the realised matches approach because of collinearity between the definition of overqualification and the occupation dummies.
22. The respective results excluding overqualification from the wage regression models are available upon request.
23. Only for individuals who were overqualified in the first two periods and well-matched in the third period are wage differences not significant in comparison to always-matched individuals.