The retention effect of training: Portability, visibility, and credibility

Abstract
This paper analyses the effect of training participation on employees’ retention in the training establishment. On the basis of human capital and monopsony theories, the effects of portability, visibility, and credibility of training on employee retention are estimated jointly. We use an extensive German linked employer–employee panel data set (WeLL-ADIAB) with detailed survey information on the training history and administrative labour market information of 4,318 employees working in 149 establishments. In multivariate panel regressions including internal instruments, we compare the probability of staying with the same employer between training participants and employees who were by chance unable to participate in a planned training event. High portability of training contents and the visibility provided by training certificates each reduce the retention effect of training independently. The retention effect is reduced further when training content is reported credibly, that is, when training is provided and certified by external institutions. However, the total effect of portable, visible, and credible training on retention is still positive. This paper therefore implies that employers can reap a double dividend of higher productivity and increased retention even from general, visible, and credible training.


Introduction
To maintain competitiveness, increase productivity and to avoid skilled labour shortages, establishments must invest in the human capital of their employees (Stevens, 1994;Green et al., 2000;Zwick, 2006;Aguinis & Kraiger, 2009). However, to benefit from the new skills and to recoup the investment in training participants, it is important to keep trained employees at the establishment in the long term (Acemoglu & Pischke, 1999). Consequently, an empirical assessment of the retention effect of training is important for explaining employer investment in training contents that increase productivity outside the training establishment.
Differences in the effects of training on retention could arise from differences in training measures. For example, these differences may affect whether a company pays for courses attended at a community college or provides in-house training, awarding a certificate from its own staff. Human capital theory and monopsony theory show how training characteristics influence the retention effect of training (Manning, 2003;Leuven, 2005). Human capital theory derives the impact of training on employee retention from the distinction between general and specific human capital in a perfect labour market (Becker, 1962). General human capital training should induce a lower retention effect than specific human capital training because general training is portable and increases productivity in other establishments. These outside establishments can outbid training establishments because they do not have to recoup training investments. Therefore, the labour market value and the outside options of employees with portable human capital training increase (Stevens, 1994;Loewenstein & Spletzer, 1999).
An important condition for the prediction that portable human capital training decreases retention is that there are no labour market frictions. Monopsony theory proposes a number of frictions that explain why training establishments can avoid losing employees with general human capital training (Stevens, 1994;Acemoglu & Pischke, 1999). Important examples of these market frictions are training establishments having better information about the ability of their training participants (Acemoglu & Pischke, 1998;Autor, 2001) or training contents (Katz & Ziderman, 1990;Chang & Wang, 1996). A key hypothesis is that training visibility may be a precondition for the portability of general human capital training (Acemoglu & Pischke, 1999a;Katz & Ziderman, 1990). In other words, information asymmetries about training contents may transform portable human capital into non-portable human capital (Becker, 1962, p. 50-51;Barron et al., 1997b;Loewenstein & Spletzer, 1999, p. 730;Booth & Bryan, 2005). Thus, focusing on portability may produce wrong conclusions about the retention effect of training in the presence of market frictions.
There are few empirical papers on the impact of training on retention and they reach contradictory conclusions (Brunello & De Paola, 2009). Even studies that use the same data set have produced different results. Therefore, we argue that differences in the measured effects of training on retention may be driven by uncontrolled differences in the portability of training contents and labour market frictions, such as the visibility of training (Loewenstein & Spletzer, 1999). More specifically, we show that differences in training visibility and portability, which usually are not controlled for, affect the retention effect of training. We also introduce into our empirical analysis the new dimension of credibility of training certificates. We show that certificates for portable and visible training contents issued by an external institution have a stronger negative effect on retention because they are more credible than internal certificates (Katz & Ziderman, 1990; Manchester, 2012).
We find that training increases employee retention by up to 14% for all training measures and find a stronger retention effect of 18% for credible training, depending on the estimation method. Training visibility and portability reduce retention by up to 2.5% in general and by 4% for credible training, respectively. As a consequence, the overall training retention effect is positive even for portable, visible, and credible training. Training firms therefore can enjoy the double dividend of increased productivity and retention after training irrespective of the training characteristics.
As our main contribution to the literature is empirical, we are careful to control for the usual sources of estimation bias when assessing the effects of self-reported formal and informal training on retention (which is defined as staying at the employer at least until the next year). A central empirical problem is selection into training on unobservable drivers that are also related to retention (Card, 1999; Heckman, 1999). To overcome the endogeneity problem, we use a comparison group approach (Leuven & Oosterbeek, 2008). Instead of comparing training participants with all training non-participants, we only include a control group consisting of employees who had been selected by the employer to participate in training but had to cancel their participation for exogenous reasons, referred to as "accidental training non-participants". Although training participants and accidental training non-participants have both been chosen by their employer for training, the groups may differ with respect to time-invariant unobservable characteristics related to training participation and retention, such as ability or motivation. This source of estimation bias is avoided by applying first difference (FD) estimations. In other words, we measure changes in training incidence and retention over time instead of comparing both variables between employees. There may also be time-varying unobserved characteristics, such as career prospects. This source of estimation bias is avoided by applying difference generalized method of moments (Diff GMM) estimations. In other words, we explain who gets training by using additional explanatory variables that are not related to retention. We compare the results from all three estimation specifications, rigorously test their applicability, and derive management implications from our findings.

Conceptual considerations
According to Becker's human capital theory, skills can be considered as enablers of individual and firm-level productivity. When assessing the effect of training on retention, we should distinguish general and firm-specific skills. Training with general contents increases the productivity of trainees at many employers, whereas training measures imparting specific human capital increase employees' productivity exclusively in the training establishment (Becker, 1962). After general training, training establishments may risk having their trained employees poached (Bishop, 1997; Black & Lynch, 1998; Mohrenweiser et al., 2019). The poaching threat is lower if the new skills are not completely portable to other establishments. Moreover, employees' benefits from specific training are lost when they leave the establishment. Therefore, employees and employers are interested in continuing the employment relationship, and specific training is associated with an increase in retention (Barron et al., 1997b; Loewenstein & Spletzer, 1999).
There are many reasons why even general human capital training may not lead to a decrease in retention. According to monopsony theories based, for example, on labour market frictions, retention probability also depends on the visibility of training contents (Acemoglu & Pischke, 1999a;Chang & Wang, 1996). The current employer usually has an information advantage concerning the exact content (e.g. focus and type of training) and the amount of training. Training measures are often informal, heterogeneous, and tailored to the needs of the training participants (Katz & Ziderman, 1990), and thus they are hard for outside establishments to assess. Based on this information asymmetry, it is difficult for outside establishments to observe and assess the quantity and quality of training fully. Consequently, outside firms will not be willing to compensate the trained employees fully for their newly acquired skills, and they pay a wage below the real productivity of the trained employees (Acemoglu & Pischke, 1998;Benson et al., 2004;Katz & Ziderman, 1990). The training establishment with an information advantage should be able to match the outside wage offer for trained employees it would like to retain. Hence, portable training that is hard to assess or invisible may not reduce retention. Loewenstein and Spletzer (1999) and Booth and Bryan (2005) argue but do not show that invisibility may transform portable training into non-portable training. Thus, only visible and portable training may reduce retention.
Training visibility may also have a direct effect on retention. Acemoglu and Pischke (1998) and Spence (1973) argue that training visibility itself could reduce retention irrespective of training portability or other training characteristics. Their argument is that visible training reveals the motivation of the employee to exert effort. However, visibility may only facilitate labour market access for those with portable training because non-portable training is considered irrelevant by potential new employers. In this case, a visible but non-portable training measure would not decrease retention. Therefore, it is unclear whether the negative retention effect of visible training vanishes if we also control for training portability.
Finally, the credibility of training certificates may be important in employee retention. Training employers may be tempted to conceal portable contents in their certificates to influence the retention effect of the training (Barron et al., 1997c). However, training employers may also exaggerate portability of training contents to improve the labour market chances of employees they would like to get rid of. An independent certification system and certificates issued by accredited external institutions, such as chambers of commerce or chambers of crafts, may give higher credibility to the training contents than internal certificates (Katz & Ziderman, 1990; Acemoglu & Pischke, 2000; Manchester, 2010). Thus, credible certificates should strengthen the negative retention effects of portable and visible training.

Previous empirical evidence
There are few empirical papers on the retention effects of training. An important reason for the scarcity of papers is that detailed information on training characteristics over a certain period, especially the timing of training and employment spells, is needed (Krueger & Rouse, 1998; Lynch, 1991). Empirical papers draw different conclusions on the retention effect of training and most papers have drawbacks. Many papers have no information on training contents or cover only part of the training activities that may be correlated with other training activities that are not controlled for (Lynch, 1991; Parent, 1999; Veum, 1997; Brunello & De Paola, 2009). Many papers only cover a selected group of training participants; for example, employees in entry-level jobs after education or young employees (Lynch, 1991). Papers with detailed information on training contents tend to be case studies (Krueger & Rouse, 1998; Benson et al., 2004; Kampkötter & Marggraf, 2015). Finally, several papers cannot account for selectivity for training because they have cross-section data and no proper instrument for training participation (Dearden et al., 1997; Loewenstein & Spletzer, 1999; Green et al., 2000). Most importantly for our contribution, no empirical paper has assessed the joint impact of portability and training visibility on retention so far. Differences in the measured retention effects of training therefore might be caused by uncontrolled differences in visibility or portability of training.

Hypotheses
Before we describe our data set, we derive our empirical hypotheses. Most training measures involve specific human capital. Trained employees would lose some of this acquired human capital if they left, and thus training increases retention (Loewenstein & Spletzer, 1999; Mincer, 1988).
H1: There is a positive relationship between training and retention.
Our second hypothesis incorporates the basic insight from human capital theory that portability of training contents reduces the overall retention effect of training.
H2: When the training addresses general human capital, and thus is portable to outside employers, the retention effect is lower than for training on average.
Certification improves visibility of the training contents and the labour market chances of trained employees. Consequently, based on monopsony theory, we assume the following.
H3: When training contents can be signalled by a certificate, and thus training contents are visible, the retention effect is lower than for training on average.
Visibility may be a precondition for the negative retention effect of portable training. Therefore, the explanatory power of portable training contents on retention may vanish if we additionally control for training visibility. It may also be the case that training visibility has a negative effect on retention, irrespective of training contents. In this case, visibility and portability have orthogonal effects on retention and do not influence each other.
H4: When we jointly control for training portability and visibility, both training characteristics have a separate negative retention effect.
Finally, the credibility of a certificate may improve the value of portable and visible training in the labour market because independent institutions may not have an incentive to misreport training contents.
H5: When training portability and visibility can be signalled by an external certificate, their negative retention effect is higher than for certified training measures on average.

The WeLL-ADIAB data
We use the German linked employer-employee data set WeLL-ADIAB. The data set was developed in the project 'Further Training as a Part of Lifelong Learning (WeLL)' to gain a better understanding of '( … ) the determinants and consequences of further training in Germany' (Bender et al., 2009, p. 638). In the project, 149 establishments were selected from the 2005 wave of the Institute for Employment Research (IAB) Establishment Panel. From these establishments, between the years 2007 and 2010, in four annual waves, 7,352 randomly selected employees were asked about their individual training behaviour and specific training measures undertaken during the last year(s). Only employees with jobs covered by social security contributions were included in our sample selection. In addition, apprentices, people in internships, and employees in partial retirement were excluded. The survey features include the exact start and end date, and the duration and content of the training measures for the years 2006-2010.
An important advantage of the WeLL data set is the link between the individual training survey information and administrative labour market history data provided by the IAB in Nuremberg. Based on the Integrated Employment Biographies (IEB), highly reliable information is available on individual employment histories during the observation period (Bender et al., 2009). For example, we know the start and end dates of employment periods, the exact daily wage, further characteristics of employment (e.g. occupation, job status, working time), and unemployment periods. Thus, we know at which establishment training took place. The data set also comprises administrative socio-demographic information, such as age, sex, and educational and vocational qualifications (Schmucker et al., 2014). The individual information can be linked to establishment-level information (e.g. establishment size, sector, location) from the IAB Establishment Panel (Bender et al., 2009; Spengler, 2007).
Because the selection of establishments was not random, the WeLL-ADIAB data set cannot claim to be representative of the population of German establishments (Knerr et al., 2012). Despite this limitation, the employer-employee panel structure of the data set as well as the wide range of topics relating to training is unique for Germany. Furthermore, the basic employee sample was defined as the workforce of about 56,000 employees in the selected establishments. Survey participants were not selected by their employer but they were directly approached by the social research institute conducting the survey. Therefore, the employee sample is representative of the establishments with respect to important observables (Bender et al., 2009).
For our analysis, we use the longitudinal version of the WeLL-ADIAB data set. To increase the homogeneity of our sample (7,352 employees), we eliminate 1,411 employees in part-time employment. Furthermore, in our main specifications, 1,623 individuals without training participation are excluded to obtain a homogeneous comparison group according to the comparison group approach. Thus, our sample consists of 4,318 training participants and accidental training non-participants from 149 establishments. Further descriptive statistics are reported in the next section.
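The sample restrictions described above can be checked with a few lines of arithmetic. The following is a minimal sketch that only restates the counts reported in the text; the variable names are illustrative and not taken from the data set documentation.

```python
# Sample sizes as reported in the text (longitudinal WeLL-ADIAB version).
surveyed_employees = 7352               # all surveyed employees
part_time_excluded = 1411               # employees in part-time employment
other_non_participants_excluded = 1623  # non-participants outside the comparison group

analysis_sample = (
    surveyed_employees - part_time_excluded - other_non_participants_excluded
)
print(analysis_sample)  # 4318
```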

Measures: dependent variable
Brunello and De Paola (2009), Card and Sullivan (1988), Loewenstein and Spletzer (1999), and Picchio and Van Ours (2013) measure the effect of training on employee retention as the probability of staying in employment in the next period of time. In this paper, we adopt their empirical approach but focus on future employment in the same establishment instead of employment in general. In the WeLL-ADIAB data set, employment spells are measured with daily accuracy and there can be several spells in one or different establishments in one year. A new spell always starts on 1 January, irrespective of whether the employee changes employer. To calculate the employment duration and whether an employee was retained, we use the spell that starts on 1 January as the reference point. Specifically, when having worked for an employer in year t, we regard an employee as retained if he or she still worked for the same employer on 1 January in year t + 1. Our dependent binary variable takes the value of 1 in this case. If the individual changes his or her employer or is unemployed on 1 January, the variable is 0.
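The retention rule above can be illustrated with a short sketch. It assumes a simplified spell record with hypothetical field names (`establishment`, `start`, `end`); the actual WeLL-ADIAB spell layout is not shown in the text and may differ.

```python
from datetime import date

def is_retained(spells, establishment_id, year):
    """Return 1 if a spell at `establishment_id` covers 1 January of year + 1.

    `spells` is a list of dicts with hypothetical keys 'establishment',
    'start', and 'end' (dates, inclusive). The caller passes the
    establishment the person worked for in `year`.
    """
    reference_day = date(year + 1, 1, 1)
    for spell in spells:
        same_employer = spell["establishment"] == establishment_id
        if same_employer and spell["start"] <= reference_day <= spell["end"]:
            return 1
    return 0  # changed employer or not employed on the reference day

# Illustrative employment history: stays at A into 2008, moves to B mid-2008.
history = [
    {"establishment": "A", "start": date(2007, 1, 1), "end": date(2007, 12, 31)},
    {"establishment": "A", "start": date(2008, 1, 1), "end": date(2008, 6, 30)},
    {"establishment": "B", "start": date(2008, 7, 1), "end": date(2008, 12, 31)},
    {"establishment": "B", "start": date(2009, 1, 1), "end": date(2009, 12, 31)},
]
retained_2007 = is_retained(history, "A", 2007)  # still at A on 1 January 2008
retained_2008 = is_retained(history, "A", 2008)  # at B on 1 January 2009
```

In this example the person counts as retained by establishment A for 2007 but not for 2008.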

Measures: training information
In each annual survey wave, the respondents were asked about the timing and further characteristics of at most three training measures during the last year in chronological order. If the respondents have stated more training measures than requested, we delete this additional information to ensure consistency. We also delete all training measures that have no detailed information about their start and end dates. For our research question, it is particularly important that we know which employer offered the training measure. Therefore, we eliminate all training measures that cannot be assigned clearly to an employer (300 eliminations). Because the training period is given with monthly precision, we also eliminate training courses that were not finished one month before a job change (42 eliminations). We give the explanatory binary training variable, d_ijt, a value of 1 if an individual participated in training offered by establishment j in current calendar year t; otherwise, the variable takes the value 0. Barron et al. (1997), Green et al. (2000), and Loewenstein and Spletzer (1999) note that training definitions differ between establishments and that information on training portability provided by the employer is unreliable. They propose using the assessments by training participants because their assessments are more comparable and reliable. Therefore, we rely on the subjective assessment of the training participants as to whether their training contents could be used in other establishments. The portability dummy takes a value of 1 if the training participants answered that the obtained training knowledge can easily be used at another employer and 0 otherwise.
We also control for the training visibility. A certificate at the end of the training course conveys the contents and value of the training to the outside labour market (Booth & Bryan, 2005). The visibility dummy takes a value of 1 if there was a certificate at the end of training and 0 otherwise.
The credibility of the training certificate is identified by the fact that the training was provided and the certificate was issued by a third party and not the training establishment itself. We assume that external institutions do not strategically manipulate the certification of training contents. The credibility dummy takes a value of 1 if the certificate was issued by an external training provider and 0 otherwise.
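The training indicators defined above can be sketched as a small helper. This is an illustration under stated assumptions: the function and key names are hypothetical, and the sketch assumes that credibility can only equal 1 when a certificate exists, which follows from the definition in the text but is not spelled out as a coding rule.

```python
def training_indicators(participated, portable, certified, external_certificate):
    """Build the training dummy and its interactions for one person-year.

    All arguments are booleans. Names are illustrative, not from the survey.
    """
    d = int(participated)
    visible = int(certified)
    # Assumption: a certificate can only be external if a certificate exists.
    credible = visible * int(external_certificate)
    return {
        "training": d,
        "training_x_portable": d * int(portable),
        "training_x_visible": d * visible,
        "training_x_credible": d * credible,
    }

# Portable training with an in-house certificate: visible but not credible.
in_house = training_indicators(True, True, True, False)
# An accidental non-participant has all indicators equal to zero.
non_participant = training_indicators(False, False, False, False)
```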

Measures: control variables
Besides information on training participation, several further individual and establishment-level characteristics may affect the probability of retaining employees in the training establishment and training participation. Individual characteristics that may influence the retention probability of employees are gender, age, tenure, and education level (Göggel & Zwick, 2012). Qualifications may be positively related to training and retention (Gritz, 1993). In addition, training participation decreases with age, tenure, and experience (Picchio & Van Ours, 2013; Zwick, 2015). As an indicator of previous employment history, we consider the years of employment in the same establishment (tenure) and professional experience (Benson et al., 2004; Parent, 1999). Furthermore, we capture age as birth cohort effects, namely, as groups of birth years, because birth year is often closely related to experience. The propensity of the employer to train may influence the employment prospects and retention probability of employees; thus, we also consider establishment size and sector (Loewenstein & Spletzer, 1999).
Training is frequently accompanied or followed by wage increases, and these wage increases may have a decisive impact on the decision to stay at the training employer (Brunello & De Paola, 2009; Chéron et al., 2010; Mincer, 1988). Training establishments want to increase employee retention by increasing wages and sharing rents (Becker, 1962; Hashimoto, 1981). Therefore, wage increases after training may be a key factor in the retention effect of training (Benson et al., 2004; Grund & Sliwka, 2001). In contrast to previous studies, in which individual wages were observed at only one point in time (Gritz, 1993; Lynch, 1991; Parent, 2003), or at the beginning and end of the observation period (Benson et al., 2004), we use individual wage changes on an annual basis. To control for general wage increases in the establishment, we define a wage increase as an individual wage change that exceeds the average establishment-wide wage increase in the occupational peer group. According to this definition, our binary wage increase variable takes a value of 1 if the wage increase of an individual is higher than the average wage increase of individuals in the same occupation in the establishment in the current calendar year. Wages may differ between several employment spells in one establishment and year. To calculate the individual wage increase, we use the weighted daily wage of the employees by establishment and year. In the case of unemployment spells, the wage is set to 0.
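The wage increase variable can be sketched as follows. Two points are assumptions rather than statements from the text: the sketch weights daily wages by spell length in days (the text says only that the daily wage is weighted), and the peer-group average includes the individual's own wage change. All names are hypothetical.

```python
def weighted_daily_wage(spells):
    """Mean daily wage over one person's spells in an establishment-year.

    Assumption: spells are weighted by their length in days.
    """
    total_days = sum(s["days"] for s in spells)
    return sum(s["daily_wage"] * s["days"] for s in spells) / total_days

def wage_increase_dummies(wage_changes):
    """Map person -> 1 if own wage change exceeds the peer-group mean, else 0.

    `wage_changes` maps person -> (peer_group, wage_change), where a peer
    group stands for one occupation within one establishment and year.
    """
    totals = {}
    for group, change in wage_changes.values():
        total, count = totals.get(group, (0.0, 0))
        totals[group] = (total + change, count + 1)
    means = {group: total / count for group, (total, count) in totals.items()}
    return {
        person: int(change > means[group])
        for person, (group, change) in wage_changes.items()
    }

wage = weighted_daily_wage([
    {"daily_wage": 100.0, "days": 100},
    {"daily_wage": 120.0, "days": 300},
])
dummies = wage_increase_dummies({
    "anna": ("clerk", 3.0),
    "ben": ("clerk", 1.0),
    "cem": ("clerk", 2.0),
    "dora": ("engineer", 5.0),
})
```

Only a change strictly above the peer mean counts as a wage increase, so an employee exactly at the mean receives a 0.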
Job satisfaction is another potential mechanism between training and retention (Georgellis & Lange, 2007;Jones et al., 2009). Job satisfaction may also capture additional dimensions of otherwise unobservable individual characteristics (Brunello & De Paola, 2009). Therefore, we control for yearly changes in general job satisfaction that are individually assessed in our data set (Zwick, 2015). However, as job satisfaction and wage increases are measured during the same time period as retention, both variables may be outcomes instead of controls for employee retention and they would be bad controls in our regressions (Angrist & Pischke, 2009, pp. 64-66). Therefore, we only use these variables in a robustness check to show whether the impact of training on retention is robust when we add them.

Estimation strategy
Our main contribution to the literature is the analysis of the retention effect of training considering portability, visibility, and credibility. We expand the training participation dummy with indicators of whether training content was general, whether training was completed with a certificate, and whether the certificate was issued by an external provider.
In estimating the impact of training participation on employees' retention in the training establishment, several estimation problems may occur that could produce biased estimators and results. To avoid these problems, we adopt a before-and-after approach, in which training participation in period t is related to employment in period t + 1, because this approach prevents reverse causality (Dearden et al., 2000).
In addition, we consider the non-random selection of employees for training owing to unobserved third factors, such as motivation or ability, which affects training participation in t and retention in t + 1 (Card, 1999; Heckman, 1999). There are several solutions to this endogeneity problem, and we show how our results differ if we apply these solutions in turn. The comparison group approach proposed by Leuven and Oosterbeek (2008) is one solution to reduce the possibility of unobserved third factors affecting the coefficients. They compare training participants only with employees who were selected by the employer to participate in training but could not participate for exogenous reasons. Reducing the sample to training participants and accidental training non-participants as the comparison group reduces the potential impact of endogeneity because employees who were not selected for training based on their unobserved characteristics are not compared with the training participants. In the WeLL data, the question used to identify accidental training non-participants is: 'Did you have the opportunity to participate in training courses, seminars, or lectures in the last two years without realizing this plan?' It is crucial for the comparison group approach that the reasons for training non-participation are random because otherwise selection bias could contaminate the results (Görlitz, 2011). Therefore, we must examine the reasons for training cancellation more closely. We regard course cancellation by the training organizer or an unexpected job taking priority as random reasons to cancel training. We use the reduced sample of training participants and accidental training non-participants in all the main tables in our paper and compare the results obtained with the full sample including the other training non-participants in a robustness check.
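The sample restriction of the comparison group approach can be sketched as a simple classification rule. The reason codes and function name below are hypothetical placeholders; the survey's actual answer categories are not given in the text.

```python
# Hypothetical codes for the two cancellation reasons treated as random.
RANDOM_CANCELLATION_REASONS = {
    "course_cancelled_by_organizer",
    "unexpected_job_priority",
}

def sample_group(participated, planned_but_not_realized, cancellation_reason=None):
    """Classify one employee for the reduced estimation sample.

    Returns 'participant', 'accidental_non_participant', or None when the
    employee is excluded from the main specifications.
    """
    if participated:
        return "participant"
    if planned_but_not_realized and cancellation_reason in RANDOM_CANCELLATION_REASONS:
        return "accidental_non_participant"
    return None

groups = [
    sample_group(True, False),
    sample_group(False, True, "course_cancelled_by_organizer"),
    sample_group(False, True, "lost_interest"),  # non-random reason: excluded
    sample_group(False, False),                  # never planned: excluded
]
```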
Based on the comparison group approach, in our first estimation of the retention effect of training, r_ij,t+1, we include the training information, d_ijt, in an ordinary least squares (OLS) estimation. In addition to information on whether training occurred in year t for employee i at employer j, we also control for whether training was portable, visible, and credible using interaction terms with the training dummy. In addition, we include birth year, tenure, experience, gender, and qualifications in an individual characteristics vector, X_it. Finally, we also include an establishment characteristics vector, Z_jt, with employer size and sector, and year dummies, yd_t. In the regression equation, e is an idiosyncratic error term.
However, training participants and accidental training non-participants may differ in unobserved time-variant or time-invariant characteristics related to both training participation and retention. Thus, the error term can be split into a time-invariant individual component, μ_i, and a time-varying remainder. We estimate all variables in time differences (FD estimations) to eliminate all time-fixed individual unobserved heterogeneity, μ_i. In the FD equation, Δ indicates differences from year to year and X_it2 is a vector of the time-varying individual characteristics, such as tenure and experience.
Even if we use the comparison group approach and control for time-invariant unobserved heterogeneity, training participation and retention can be affected by unobserved time-variant heterogeneity, such as future employment expectations at the training establishment or the chance of the employee to obtain promotion within the employer. Therefore, in our third and preferred estimation approach, we use the Arellano-Bond Diff GMM estimator (Arellano & Bond, 1991; Roodman, 2006). In the Diff GMM estimation, lagged levels of the dependent and explanatory variables are added in the estimation equation as internal instruments.
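For orientation, the OLS and FD specifications described above could take the following form. This is a hedged reconstruction from the variable definitions in the text, not the authors' printed equations: the coefficient symbols, the indicator names port, vis, and cred, the remainder term ε, and the exact set of differenced controls are all assumptions.

```latex
\begin{align}
% OLS in levels: training dummy, its three interactions, and controls
r_{ij,t+1} &= \alpha + \beta_1 d_{ijt}
  + \beta_2 \,(d_{ijt} \times \mathit{port}_{ijt})
  + \beta_3 \,(d_{ijt} \times \mathit{vis}_{ijt})
  + \beta_4 \,(d_{ijt} \times \mathit{cred}_{ijt}) \notag \\
 &\quad + X_{it}'\gamma + Z_{jt}'\delta + yd_t + e_{ijt} \\
% Error decomposition: time-invariant individual effect plus remainder
e_{ijt} &= \mu_i + \varepsilon_{ijt} \\
% First differences: \mu_i and all time-invariant controls drop out
\Delta r_{ij,t+1} &= \beta_1 \Delta d_{ijt}
  + \beta_2 \,\Delta(d_{ijt} \times \mathit{port}_{ijt})
  + \beta_3 \,\Delta(d_{ijt} \times \mathit{vis}_{ijt})
  + \beta_4 \,\Delta(d_{ijt} \times \mathit{cred}_{ijt}) \notag \\
 &\quad + \Delta X_{it2}'\gamma_2 + \Delta Z_{jt}'\delta + \Delta yd_t + \Delta \varepsilon_{ijt}
\end{align}
```

Because μ_i does not vary over time, it cancels in the differenced equation, which is why only the time-varying characteristics X_it2 remain among the individual controls.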
By using lagged levels as internal instruments, the endogenous variables are treated as predetermined and are no longer correlated with the error term in the preceding estimation equations (Roodman, 2006). In the Diff GMM equation, V_i,t−1 is a vector of the lagged levels. To reduce endogeneity, some authors propose using external instruments, which are additional variables related to training but not retention. However, the inclusion of external instruments reduces our sample size substantially. In addition, many papers on the effects of training argue that it is hard to create a convincing external instrument (Dearden et al., 1997; Leuven, 2005). Consequently, we do not show the results of the Diff GMM estimation with our external instrument. Although our dependent retention variable is a dummy that equals one if the employee stays at the employer at least until the following year, we prefer to use the more general linear regression model. However, we provide analogous evidence using a Probit model in a robustness check.

Descriptive statistics
Table 1 shows descriptive sample characteristics separately for training participants and accidental training non-participants. The majority of the respondents are male and were born between 1952 and 1971. Most of the survey participants (79.12%) have professional experience of at least 10 years, although only 54.63% have worked at the same establishment for more than 10 years. Regarding educational background, 3.72% have no vocational education, 68.67% completed vocational education, and 27.63% hold an academic degree. Furthermore, 35.57% of the respondents received a higher wage increase than their occupational peer group in the establishment. In 75.22% of the training measures, training was visible, that is, the employees received training certificates. In most of the certified training measures, training was credible because it was provided by external institutions (84.31%). Furthermore, training participants often assessed their training measures as easily useable at other employers, and thus as portable (82.68%).
The observable characteristics of training participants and accidental training non-participants are more similar in our comparison group approach than the characteristics of training participants and all training non-participants in the original sample (compare Table A1 in the Appendix with Table 1). 20 However, we still find significant differences between training participants and accidental training non-participants for some observable characteristics. For example, training participants have significantly higher daily wages than the comparison group. This may be because training participants are older and have higher tenure and experience. However, education level and gender do not differ significantly between accidental non-participants and training participants. According to Pischke (2001), differences in unobservable characteristics between training participants and training non-participants are often reflected by past wage differentials. 21 Therefore, an important indicator that accidental training non-participants are similar to training participants is that there are no significant differences in earnings in 2005. Given the socio-demographic differences between participants and accidental training non-participants, we should control for these observable characteristics in our retention estimations. It is also important to perform within estimations, such as FD estimations, alongside between estimations to control for differences in unobserved time-invariant characteristics.
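The reason for adding within (FD) estimations can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not the WeLL-ADIAB sample: the effect size, the continuous outcome (the paper's retention variable is binary), and all variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_workers, n_years = 2000, 4
true_effect = 0.10  # assumed effect of training on the outcome

# Time-invariant unobserved trait (mu_i) that raises both training and the outcome.
mu = rng.normal(size=(n_workers, 1))
# Training dummy d_it: more likely for workers with a high mu_i.
d = (rng.normal(size=(n_workers, n_years)) + mu > 0).astype(float)
# Outcome: r_it = 0.5 + true_effect * d_it + 0.2 * mu_i + noise
noise = rng.normal(scale=0.1, size=(n_workers, n_years))
r = 0.5 + true_effect * d + 0.2 * mu + noise


def slope(x, y):
    """OLS slope of y on x with an intercept, after flattening both arrays."""
    x = x.ravel()
    y = y.ravel()
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


# Pooled OLS in levels picks up the correlation between d_it and mu_i.
ols_estimate = slope(d, r)
# First differences remove mu_i because it does not change over time.
fd_estimate = slope(np.diff(d, axis=1), np.diff(r, axis=1))

print(f"OLS: {ols_estimate:.3f}  FD: {fd_estimate:.3f}  true: {true_effect:.3f}")
```

In this setup the levels estimate overstates the assumed effect, whereas the differenced estimate stays close to it. First differencing does not address time-varying unobserved heterogeneity, which is why an instrumented estimator is considered in addition.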

Retention effect of training
In our multivariate analyses, we test whether training increases employee retention when we control for individual and establishment characteristics. Table 2 shows the regression output of OLS, FD, and Diff GMM estimations.
The OLS estimation suggests that training participation increases retention probability in the training establishment in the next calendar year on average by 8.7 percentage points (PP). By also controlling for time-fixed individual unobserved heterogeneity and endogeneity, we obtain a significantly positive retention effect of 9.1 PP in the FD estimation and of 11.8 PP in the Diff GMM estimation. The OLS estimation indicates that there are few gender differences in the retention rate and older employees have a higher probability of staying in the current establishment, consistent with results reported by Brunello and De Paola (2009). When we focus on our preferred model, the Diff GMM estimation (model 3), we see that individuals with shorter job tenure have a higher probability of being retained (also compare Benson et al., 2004). The AR test in the Diff GMM estimation indicates that there is no autocorrelation in levels. Because the Hansen test is insignificant (p = 0.334), we conclude that the internal instruments are valid.
After establishing that training on average increases employee retention in the current establishment, we investigate whether the retention effect is influenced by the portability and training visibility measures (Table 3). If the training content is portable to outside establishments, the retention effect is significantly reduced from about 10 PP to about 8 PP in the OLS and FD models. The retention effect for visible training measures is lower than for training in general (models 4-6). Visibility reduces retention by about the same magnitude as portability. The effects of visibility and portability on retention are not significant in the Diff GMM estimation. The retention effects of the other control variables in the different model specifications are robust to the addition of the visibility and portability interaction terms. Furthermore, in models 3 and 6, the AR tests and Hansen tests indicate that there is no autocorrelation in levels and that the instruments are valid.
To check whether portability and training visibility play an independent role in explaining retention, we simultaneously consider the interactions of training with portability and visibility in one model (Table 4, models 1-3). The coefficients for portable training are similar but they lose significance if visibility is also controlled for. Based on these results, training visibility and portability have separate negative effects on retention. Thus, visibility is not a precondition for portable training to have a negative retention effect and visible training does not have to be portable to reduce retention. Again, the AR test and Hansen tests indicate that there is no autocorrelation in levels and that the instruments are valid. Furthermore, the retention effect of the other control variables is practically unchanged compared with the previous estimations.
We finally find that the negative retention effects of general and certified training measures are stronger when we focus exclusively on training measures provided and certified by external independent institutions (models 4-6). The participation of individuals in externally provided training measures with general content reduces the retention probability by 3.9 PP. Furthermore, we find a large negative retention effect (3.3 PP) for training measures that are certified by external institutions. Visibility and portability again have a separate impact on retention, and the size and significance of the other covariates barely change compared with estimation models 1-3 in Table 4. The AR test in the Diff GMM model indicates that there is no autocorrelation in levels. The instruments are valid (Hansen test). We conclude that training certificates from external providers can be considered as powerful signals of training participants' ability and the portability of training contents. Consequently, employees participating in externally certified portable and visible measures can credibly prove their acquired skills to potential new employers. Summing up the evidence, our first and fifth hypotheses are supported by our analyses, whereas the second, third, and fourth hypotheses receive weaker support.

Robustness checks
To ensure that our results are not distorted by estimation problems or the sample selection, we run a series of robustness checks. First, our dependent retention variable is a binary variable. For the model specifications in Table 4, we also calculate marginal effects in a linear probit model (Table A3). The results in model 1 suggest that training increases the probability of retaining the employee in the training establishment by 12.8 PP. However, this positive effect is reduced by 1.2 PP when training is portable and by 1.4 PP when trained employees can show the contents to outside establishments via a certificate. Again, these negative retention effects are stronger when we focus exclusively on externally provided and certified training measures. When individuals participate in externally provided training with general content, this reduces the retention rate by 3.3 PP. Furthermore, when training participants receive certificates from an external independent institution, this also reduces the retention probability by 2.0 PP. Thus, the marginal effects in the linear probit model are comparable to the results obtained in the OLS, the FD and the Diff GMM models in Tables 2-4. In addition, the retention effects of the other covariates are robust.
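The marginal effects reported for model 1 in Table A3 can be combined into a rough total effect for training that is both portable and certified. The sketch below assumes that the main effect and the two interaction effects simply add up, which is only an approximation for probit marginal effects; the helper name is hypothetical.

```python
def total_effect_pp(base_pp, interaction_pps):
    """Add interaction-term adjustments to a main effect (all in percentage points)."""
    return base_pp + sum(interaction_pps)


# Marginal effects quoted in the text for model 1 (Table A3).
training = 12.8   # training participation
portable = -1.2   # reduction when training content is portable
visible = -1.4    # reduction when a certificate makes training visible

total = total_effect_pp(training, [portable, visible])
print(f"{total:.1f} PP")  # prints "10.2 PP"
```

Under this additive reading, the retention effect of portable and visible training remains clearly positive, in line with the conclusion that the total effect does not change sign.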
We use the control group approach and thereby restrict our sample exclusively to training participants and accidental training non-participants. To test whether unobserved heterogeneity among training non-participants affects the estimation results, we replicate our basic estimation models based on the full sample (Table A4). The positive retention effects of training are robust in all model specifications, whereas the additional effects of general and certified training measures differ. In particular, we find an additional positive retention effect of general training contents in the FD and Diff GMM estimations. Moreover, in contrast to all previous results, we only find a significant negative retention effect for externally provided certificates. The distortion of the effects may result from higher unobserved individual heterogeneity in the full sample. Therefore, the restriction of the comparison group to accidental training non-participants seems to be a good strategy.
In our last robustness check, we also add changes in job satisfaction and wage increases after training as explanatory variables (Table A5). Job satisfaction and wage increases after training have the expected positive impact on retention. The training coefficient and the interaction terms with portability and visibility are smaller and less significant, but the main picture is preserved (compare the results with Table 4). Therefore, we conclude that training as well as portability and visibility have direct effects on retention. These effects are not fully mediated by changes in job satisfaction and wage increases.

Management implications
Training in general human capital reduces retention of trained employees compared with that in specific human capital because their productivity is also increased for other employers, making trained employees more desirable in the labour market. Training visibility also improves the employability of employees at other employers, and thus training certificates also reduce employee retention. Both mechanisms are independent; therefore, managers cannot avoid the reduction in retention after training by not issuing training certificates. These results are in contrast to the suggestion that invisibility may decrease the portability of training (Barron et al., 1997b; Booth & Bryan, 2005; Loewenstein & Spletzer, 1999). Managers must consider that the selection of employees for training itself increases their employability elsewhere, irrespective of the portability. Rival employers value the motivation indicated by training participation (Acemoglu & Pischke, 1998; Spence, 1973). However, managers should also take into account the visibility effect of a training certificate, which might reduce the retention of trained employees.
The negative retention effect of certificates is stronger if the certificates are issued by external training providers. Managers can expect that not all training certificates are perceived as credible on the labour market (Acemoglu & Pischke, 2000); thus, training offered internally, for example, by colleagues or staff from the personnel department, has a less damaging effect on trained employee turnover. External certificates especially increase employability of employees trained in portable human capital contents.
However, our main finding is that the total retention effect of training after controlling for portability, visibility, and credibility is still positive. Training providers can increase their chances of keeping employees even by offering training measures that increase productivity in other establishments and lead to credible certificates that outside employers can assess easily. Consequently, firms can increase the productivity of their employees (Zwick, 2006) and reduce turnover costs by offering training programmes accredited by chartered institutes, employer-sponsored bachelor, master's, and master of business administration courses, recognised language certificates and IT courses, mandatory health and safety training, and apprenticeship training (Benson et al., 2004; Manchester, 2010, 2012; Mohrenweiser et al., 2019). Shaw, Gupta, et al. (2005) and Shaw, Duffy, et al. (2005) stress that a reduction in turnover leads to higher employer performance because the loss of firm-specific human capital, social capital, and relationships can be avoided. Hence, a reduced turnover rate increases the positive performance effect of personnel development because trained employees are especially valuable for the exchange of knowledge within the workforce.
Our results contradict the prediction of standard human capital and monopsony theory that portable, visible, and credible training should increase employee turnover. There may be several reasons for our findings, the analysis of which is beyond the scope of our data set. There may be additional mobility barriers: 22 Training establishments may be able to retain their trained employees because important dimensions of ability, such as social or non-cognitive skills, cannot be signalled by training participation certificates (Acemoglu & Pischke, 1998; Autor, 2001; Manchester, 2010; Mohrenweiser et al., 2015). Reciprocity may also be a mechanism that increases the retention of employees who obtain portable and visible human capital (Kampkötter & Marggraf, 2015). Employees perceive firm-sponsored training courses as a kind action and an indicator that the employer values them (Lee & Bruvold, 2003). They reciprocate the training investments by staying with the establishment although their training opened up attractive outside employment options for them (Batt, 2002; Sieben, 2007). Employees who are satisfied with their work and the training they received, and who have affective and continuance commitment, take training offers as additional reasons not to quit or to stay longer (Newman et al., 2011; Koster et al., 2011; Memon et al., 2016; Fletcher et al., 2018). If managers can improve psychological outcomes induced by personnel development measures, they may reduce employee turnover in addition to increasing employee productivity (Fletcher, 2019).

Conclusions
The objective of this paper is to determine the retention effect of training. Based on large linked employer-employee panel data with detailed information on the employees' employment and training history, we find that training has a significantly positive retention effect. Training provides the double dividend of an increase in human capital and a reduction in turnover. According to human capital theory, training portability reduces the retention effect of training, and according to monopsony theory, training visibility also reduces the retention effect. These negative retention effects are much stronger if training is credible, namely, if it is provided and certified by external institutions. The reduction in the retention of portable human capital is independent of its visibility and visible training does not need to be portable to reduce retention.
We are careful to avoid the estimation problems usually encountered in the literature when measuring the effects of training. We compare training participants with accidental training non-participants, defined as employees chosen to participate in training but who had to cancel for exogenous reasons, instead of all training non-participants. We have high-quality administrative spell data with daily accuracy on the labour market history of all employees included. Therefore, our measure of retention indicates whether the employee stayed with the employer they worked for during the previous year and whether the training received during this time period was with this employer. In addition, we account for unobserved time-invariant heterogeneity and for endogeneity of training participation. Finally, we include a broad range of individual and establishment characteristics that determine retention. We show that our results are robust with respect to changes in the estimation technique and the inclusion of additional explanatory variables and instruments.
Given the large importance of external certification in the German vocational education and training system in an international comparison (Acemoglu & Pischke, 2000; Steedman, 2001), we might assume that German employers also appreciate external training certification more than employers in countries with less strongly certified training such as Spain, the USA or the UK. Our results on the retention effects of (external) certifications therefore may not be as transferable to other countries as the results on portability. We also have to bear in mind that the establishments in our sample might have a higher training affinity and incidence than average German establishments because they have been selected from the IAB establishment panel on the basis that they offered training in the first half of the year 2005. The retention effects of training might differ in average German establishments that put less effort into training their employees.
In summary, this is the first empirical paper that shows that even general, visible, and credible training has a positive retention effect. Therefore, training appears to be an effective measure to keep qualified employees at the employer and counteract the impending shortage of skilled workers and loss of human and social capital. This finding is puzzling, given that human capital and monopsony theories predict that this kind of training increases the labour market value of participants, and thus reduces retention. Personnel development measures have a positive effect on employee attitudes and emotions towards their employers; hence, these factors may play an important part in the explanation of the retention effect of training, especially for portable, visible, and credible training (Lee & Bruvold, 2003; Fletcher et al., 2018).

Notes
... (Mohrenweiser et al., 2019).
4. WeLL-ADIAB is the abbreviation for 'WeLL survey data linked to administrative data on the IAB' (Schmucker et al., 2014).
5. Only establishments with between 50 and 1,999 employees subject to social security contributions, establishments from manufacturing or the service industry and locations in the German federal states of Bavaria, Schleswig-Holstein, North Rhine-Westphalia, Mecklenburg-Western Pomerania, and Saxony were selected. By stratification of the selection criteria, 12 employer groups were formed, from which the five establishments with the highest and the five establishments with the lowest overall investment expenditures were asked to participate in the WeLL project. The selection criteria have been chosen to guarantee that the results are not driven by specific training patterns correlated with the numbers of employees, branches or regions (Bender et al., 2009).
6. The first wave contains the complete training information for the years 2006 and 2007, the second wave includes the training information for the year 2008, the third wave for 2009, and the fourth wave for 2010.
7. Mincer (1988) and Parent (1999) stress the importance of information on the incidence and timing of training and employee turnover in the causal analysis of the retention effect of training.
8. One employment spell always starts on 1 January in each year in our data set. During the year, there are only new spells in the case of well-defined changes, for example, a change of employer or a promotion to another role at the same employer.
9. However, Raffiee and Coff (2016) argue that the perception of specificity of human capital by employees may be problematic. They find, based on the NLSY and the Korean Labor and Income Panel Study, that employees with longer tenure and higher commitment do not think they have human capital that is more specific. They do not find a correlation between the general specificity perception and on-the-job training obtained during a limited time spell in the past. However, they cannot compare perceptions with other training forms. We think that the insignificant correlations may call into question the assumption that on-the-job training is an indicator of human capital specificity, rather than errors in the perception of specificity.
10. The exact wording of the question is: 'How easily can the obtained knowledge be used at another employer, in your opinion?' ('Inwieweit ließen sich nach Ihrer Einschätzung die erworbenen Kenntnisse auch in einem anderen Betrieb verwenden?').
11. To obtain the weighted daily wage, daily wage is multiplied by the number of days in the corresponding employment spell and divided by the overall duration of all employment spells by employer and year. Although the annual wage increase cannot be assigned exactly to the start and end date of training, we can consider changes in daily wages as a consequence of an employer change during the year.
12. We include accidental training non-participants only for the years in which they were chosen for training but did not participate.
13. There is some debate about whether family or health reasons can also be regarded as random cancellation reasons (Görlitz, 2011). The main argument against this assumption is that employees with long-term health problems or, for example, employees with young children or care duties for elderly family members, may routinely have to cancel training participation. Few employees indicated that family or health was the reason for training non-participation, and we remove these cases from our sample. If we include them in the group of training non-participants, our results are unchanged.
14. The error term is split into ε_{it} = μ_i + v_{it}.
15. Our employer characteristics are time invariant, and thus vector Z_{jt} drops out in the FD regression.
16. We use the one-step Arellano-Bond Diff GMM estimator, which is not robust to panel-specific autocorrelation and heteroscedasticity (Arellano & Bond, 1991).
17. In the Diff GMM estimation, we use two lags of the dependent variable (retention in the last two years) and one lag of all other endogenous explanatory time-varying variables (training information, year dummies, tenure, and experience) as internal instruments. We test for autocorrelation and use heteroscedasticity-corrected standard errors. Furthermore, we apply the small sample adjustment (Arellano & Bond, 1991).
18. We use establishments' expectations of skill shortages as an external instrument. If establishments expect skill shortages, this should lead to more training in the establishment, and therefore to a higher individual training probability. However, expected skill shortages should not affect the short-run individual retention probability. Therefore, this instrument is assumed to be valid.
19. In addition to the difference equation in the Diff GMM estimation, the system GMM estimator uses the level equation to obtain a system of two equations. Because the variables in levels in the second equation are instrumented with their own FDs, additional instruments can be obtained (Blundell & Bond, 1998). However, this approach reduces the sample size by one observation per individual. Furthermore, it is not appropriate to use system GMM estimation with a relatively small data set, as is the case in the current paper.
20. Leuven and Oosterbeek (2008) and Görlitz (2008) show that the wage effect of training decreases in their comparison group approaches. Based on the underlying sample, we can confirm their results. Appendix Table A2 shows that the impact of training on wages is much smaller (3.4%) when training participants are compared with accidental training non-participants than in the full sample (9.8%).
21. Because administrative labour market history data are linked to the survey data, the WeLL-ADIAB also contains wage information for the time before the period covered by the training questions.
22. Manchester (2012) stresses that the retention effect has to be separated into the stronger sorting effect and the training participation effect. Because we calculate the retention effect based on the changes in retention of employees given training participation, we only measure the pure participation effect and control for the sorting effect.
23. The log daily wages for 2005-2008 are given in absolute numbers.
24. The log daily wages for 2005-2008 are given in absolute numbers.
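The weighted daily wage described in note 11 can be sketched as a days-weighted average over the spells of one employee at one employer in one year. Summing the wage-times-days products across spells is our reading of the note; the function name and the example figures are hypothetical.

```python
def weighted_daily_wage(spells):
    """Days-weighted mean daily wage.

    spells: list of (daily_wage, days) tuples for one employee,
    one employer, and one calendar year.
    """
    total_days = sum(days for _, days in spells)
    if total_days == 0:
        raise ValueError("spells must cover at least one day")
    return sum(wage * days for wage, days in spells) / total_days


# Hypothetical year: a promotion after 200 days starts a new spell.
example = [(100.0, 200), (120.0, 165)]
print(round(weighted_daily_wage(example), 2))
```

A single spell covering the whole year returns its own daily wage unchanged, so the weighting only matters when a well-defined change creates additional spells within the year.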

Disclosure statement
No potential conflict of interest was reported by the authors.