Measurement and decomposition of Lithuania’s income inequality

Despite Lithuania’s household income inequality being among the highest in the European Union (EU), little empirical work has been carried out to explain such disparities. In this article, we use the EU Statistics on Income and Living Conditions sample micro data. We confirm that income inequality in Lithuania is high compared to the EU average and find that it is robust to inequality measure or equivalence scale used. We have also decomposed household disposable income inequality by subgroups and factors. We find that the number of employed household members in Lithuania’s households affects income inequality more as compared to the EU. It is related to a larger labour income, and self-employment income in particular, contribution to inequality in Lithuania as opposed to the EU. Moreover, transfers and taxes have a smaller impact on reducing inequality in Lithuania than in the EU.

We express our gratitude to Ignas Goštautas, with whom this project was originally started, and to the Macroeconomics and Forecasting Division, the Applied Macroeconomics Research Division and the Center for Excellence in Finance and Economic Research at the Bank of Lithuania for their patience and support. This paper is based on data from Eurostat, EU Statistics on Income and Living Conditions [2004:2016]. The responsibility for all conclusions drawn from the data lies entirely with the authors. The views expressed in this paper are those of the authors and do not necessarily represent those of the Bank of Lithuania or the Eurosystem.

© Lietuvos bankas, 2019
Reproduction for educational and non-commercial purposes is permitted provided that the source is acknowledged.
Gedimino pr. 6, LT-01103 Vilnius, www.lb.lt
Discussion Papers describe research in progress by the author(s) and are published to stimulate discussion and critical comments. The views expressed are those of the author(s) and do not necessarily represent those of the Bank of Lithuania.

Introduction
Income inequality in Lithuania has been one of the largest in the EU and is still growing. Specifically, the Gini coefficient of household equivalised disposable income, a common measure of inequality, stood at 37% in 2016 for Lithuania (Eurostat 2018d). This was the second largest Gini coefficient among the surveyed EU countries, behind only Bulgaria, and exceeded the EU average by over 6 Gini points. Additionally, income inequality in Lithuania has increased by 5 Gini points since 2012. All this happened in the context of a more general concern over rising income inequality within major countries (Atkinson and Piketty 2010; OECD 2011, 2015a, 2015b) and increasing empirical evidence that income inequality may hinder economic growth (Aghion et al. 1999; Berg and Ostry 2011; Ostry et al. 2014; Cingano 2014; Grigoli and Robles 2017). The size and dynamics of income inequality in Lithuania, along with warnings about its possible negative consequences, encouraged political and economic debate in Lithuania. There was an interest in re-examining whether income inequality in Lithuania is indeed one of the largest within the EU (or whether this result is only caused by, for example, the choice of inequality measure), what contributes to income inequality, what its consequences are and what policy could be effective at reducing it. This study focuses on the first two questions: how confident we can be in claiming that Lithuania's income inequality is high and what factors lie behind such inequality.
We employ several statistical tests to examine whether we can claim that household income inequality in Lithuania is one of the highest across the EU. First, we have evaluated the sampling errors to verify that conclusions from the sample data do not contradict the actual situation. Standard errors bootstrapped following Rao et al. (1992), based on survey design information reconstructed according to Goedemé (2013) and Zardo Trindade and Goedemé (2016), allow us to estimate the likely biases. Second, we have conducted a test for the choice of the inequality measure, i.e. we have estimated inequality using alternative measures to the Gini index: the Atkinson index and the Generalized entropy index as in Jenkins (2017). Additionally, each index is calculated with various inequality preference parameters. Third, we have adjusted household income by alternative equivalence scales because these have been shown to influence the results (Buhmann et al. 1988). We use an OECD and the square root equivalence scale.
Next, we have investigated why household income inequality is higher compared to other countries using univariate factor and subgroup decompositions that decompose inequality into parts. These decompositions are purely statistical: they do not incorporate agent responses to any covariate. Nevertheless, these decompositions help identify the households amongst which inequality is acute and suggest which aspects should be looked into deeper.
Factor component decomposition disaggregates an inequality measure into mutually exclusive and exhaustive income components, for example, labour and capital income. Two versions of this method are well known: the natural decomposition as in Shorrocks (1982), which focuses on the decomposition of the variance, and the Lerman and Yitzhaki (1985) method, which is used to decompose the Gini coefficient. We use the latter method, as the Gini is a more conventional index of inequality.
Subgroup decomposition splits an inequality measure into inequality within and between mutually exclusive and exhaustive subgroups, for example, inequality between males and females and inequality amongst males and amongst females. There are many ways to decompose by subgroups, as illustrated in Cowell (2011) and Yitzhaki and Lerman (1991). We apply the Yitzhaki and Lerman (1991) method to decompose the Gini in a way that is closer to the chosen factor decomposition technique.
Our results suggest that household income inequality in Lithuania is one of the highest in the EU and this finding is robust to various statistical tests. The decompositions reveal large inequalities between and within many groups of households in Lithuania. The largest inequalities lie between the employed and the rest of the population, and this kind of inequality has been rising over time. Inequalities within the unemployed and those working in the agricultural sector are particularly distinct. The factor decomposition shows that labour income, especially self-employment income, is more unequally distributed in Lithuania than elsewhere, while social transfers and taxes seem to have a smaller impact on reducing inequality than in other countries.
The paper is structured as follows: in Section 2, we give definitions of income and describe the data set used throughout the empirical investigation. The next two sections address the two research questions in turn, each using its own methodology, and comment on the results. The final section concludes.

Definitions and data on income
We focus on equivalised household disposable income inequality. Let us explain each term in more detail. Income is defined as yearly disposable income. To get disposable income we subtract taxes and social contributions that a household has to pay from gross income. Gross income is the sum of market income (labour and capital income) and transfers (both private and public). The unit of observation is an equivalised household. This assumes that household members share their income and make joint decisions. In order to correct for household size an equivalence scale is used.
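The income definitions above can be illustrated with a minimal sketch. It assumes the OECD-modified weights (1.0 for the first adult, 0.5 for each further person aged 14 or over, 0.3 for each child under 14), which the text does not spell out; all function and argument names are hypothetical.

```python
import numpy as np

def oecd_modified_scale(n_aged_14_plus, n_children_under_14):
    # ASSUMPTION: OECD-modified weights; the paper only says "an OECD scale".
    return 1.0 + 0.5 * (n_aged_14_plus - 1) + 0.3 * n_children_under_14

def square_root_scale(n_members):
    # Square root scale: household size raised to the power 0.5.
    return float(np.sqrt(n_members))

def equivalised_disposable_income(market, transfers, taxes, scale):
    gross = market + transfers      # gross income = market income + transfers
    disposable = gross - taxes      # minus taxes and social contributions
    return disposable / scale       # adjust for household size

# Hypothetical household: two adults and two young children.
oecd = oecd_modified_scale(n_aged_14_plus=2, n_children_under_14=2)
sqrt_scale = square_root_scale(4)
print(equivalised_disposable_income(30000.0, 2000.0, 8000.0, oecd))
print(equivalised_disposable_income(30000.0, 2000.0, 8000.0, sqrt_scale))
```

Every member of the household is then assigned the same equivalised income, which is why the choice of scale can shift measured inequality.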
Focusing on equivalised household income rather than individual income affects the results, and this choice should be briefly justified. Research literature suggests that individuals make economic decisions taking themselves as well as their household members into consideration (see, among others, Vogler and Pahl 1994). For example, the income of all household members comprises a common budget constraint (Chiappori and Meghir 2015), thereby influencing each household member's behaviour. Additionally, some benefits are only granted at a household level (e.g. social assistance benefit), making the allocation of this benefit to any specific member artificial. Nevertheless, each household member has their own preferences and typically unequal control of the household's budget, with evidence suggesting that decisions taken within a household are rarely joint and more often dominated by a specific household member (Pahl 1995). Therefore, while it is useful to look at equivalised household inequality to get a first idea of how unequally income is distributed within society, specific questions require looking deeper into within-household inequality (for example, when determining how child benefits should be allocated if mothers are more likely than fathers to spend on children).
The data on income and covariates come from the yearly European Union Statistics on Income and Living Conditions (EU-SILC) instrument running since 2004. The data are compiled from a mixture of survey and administrative sources. Each year around 5 thousand Lithuanian households with around 10 thousand persons over 16 years old who agree to share information on their income are included. The exact number of households and persons recorded in Lithuania and other countries in 2015 is shown in Table 1. Most of these persons provided all information on income, as can be seen from column 5 titled "Observations". As all EU member states collect data using the same methodology, we can compare the inequality in Lithuania with that of other EU countries.
[ Table 1 about here.]
While the data are explained by Eurostat (2018c), several features are mentioned here. The survey captures household income and, therefore, certain income components are available at the household rather than the individual level. Therefore, the income of all household members is summed up and allocated to each household member. While most covariates are recorded at the time of the interview, income is recorded for the previous year (the reference year). In this paper, all years represent reference years. While the EU-SILC has a large survey component, some countries make use of register (administrative) data and are referred to as register countries. In 2015, the register countries included Cyprus, Denmark, Finland, Latvia, Lithuania, the Netherlands, Northern Ireland, Norway, Slovenia, Sweden, and Switzerland. Finally, survey weights are used to draw conclusions about the population from the sample data. The weights are further adjusted according to Eurostat (2018b): weights of household members over 16 years old are scaled up by distributing the weights of those under 16.

Is income inequality in Lithuania high?
First, we have examined inequality from the full data sample and then analysed subgroup inequality (inequality between and within subgroups) in Lithuania.

Inequality
The most popular measure of the level of inequality is the Gini coefficient. The higher the Gini, the greater the level of inequality; it stood at G = 0.37 for Lithuania in 2015 (Eurostat 2018d). An intuitive way to understand the Gini is that, for any two households drawn at random in a country, the expected gap between their incomes equals 2G times the average income. The Gini is given by two times the covariance between income y and the rank of income F(y), divided by average income µ:
$$G = \frac{2\,\mathrm{cov}(y, F(y))}{\mu} \qquad (1)$$
This describes inequality within the entire population. Since we have sample data only, we modify (1) to include sample weights, as shown in (2) in the Appendix. Lithuania's Gini coefficient is compared with the Gini coefficients of all countries that are included in the EU-SILC data set for 2015 in Figure 1 and with the Gini coefficients for a subset of countries in Table 2. The subset includes the Baltic States, Finland as one of the Scandinavian countries, Germany, which represents the average inequality in the EU, and Slovakia, where inequality is the lowest. As in previous studies (IMF 2016; Lazutka 2017), income inequality in Lithuania is one of the highest according to the EU-SILC. The estimated confidence intervals (Figure 1) and standard errors (Table 2) indicate that this is statistically significant. For example, the Gini in Lithuania is about 7 Gini points higher than in Germany. The latter also happens to be the median in terms of inequality within the whole EU-SILC sample of countries. Although Table 2 focuses on fewer countries, it provides more statistics on inequality than Figure 1. In Figure 1, household disposable income is equivalised by the OECD equivalence scale. In Table 2, two different scales are used: the OECD scale and the square root equivalence scale.
The square root scale increases the Gini for Lithuania by 0.3 points, yet Lithuania retains the highest level of income inequality among all countries and remains 7 Gini points above the median country.
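The covariance form of the Gini with sample weights can be sketched as follows. This is an illustration only, not the Appendix formula (2): it assumes mid-ranks for tied incomes and normalised weights, and the helper names are hypothetical. The pairwise comparison is quadratic in the sample size, which is acceptable for a sketch but not for large files.

```python
import numpy as np

def weighted_midrank(y, w):
    """Weighted rank F(y): weight share strictly below y plus half the weight tied at y."""
    below = ((y[None, :] < y[:, None]) * w[None, :]).sum(axis=1)
    equal = ((y[None, :] == y[:, None]) * w[None, :]).sum(axis=1)
    return (below + 0.5 * equal) / w.sum()

def weighted_cov(a, b, w):
    """Weighted (population) covariance with weights normalised to sum to one."""
    p = w / w.sum()
    return float(np.sum(p * (a - np.sum(p * a)) * (b - np.sum(p * b))))

def gini(y, w=None):
    """Gini as 2 * cov(y, F(y)) / mean(y), using sample weights if given."""
    y = np.asarray(y, dtype=float)
    w = np.ones_like(y) if w is None else np.asarray(w, dtype=float)
    mu = np.average(y, weights=w)
    return 2.0 * weighted_cov(y, weighted_midrank(y, w), w) / mu

# A weight of 3 on one observation behaves like three identical observations.
print(gini([1, 2, 3, 4]))
print(gini([1, 2], w=[1, 3]), gini([1, 2, 2, 2]))
```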
[ Table 2 about here.]
Furthermore, Table 2 reports the generalized Gini coefficient, G(v) (Yitzhaki 1983), where the parameter v represents inequality aversion, i.e. the dissatisfaction expressed towards inequality. The value v = 2 gives the standard Gini, values of v between 1 and 2 represent lower inequality aversion and v > 2 indicates higher aversion. The measure G(1.5) results in lower Gini values in all countries for both equivalence scales (i.e. inequality is not as "bad"). Additionally, the difference between the Gini in Lithuania and the median country shrinks to 5 Gini points for both scales. Nevertheless, inequality in Lithuania remains significantly the highest out of the sample of 6 countries. Setting v = 4 increases the Gini index, but for Lithuania it remains the highest among the selected countries.
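As a hedged illustration of the role of v, the sketch below assumes the covariance form in which the generalized Gini is commonly written, G(v) = -v cov(y, (1 - F(y))^(v-1)) / µ; the paper's own formula may differ in detail. Function names are hypothetical and mid-ranks are assumed for ties.

```python
import numpy as np

def weighted_midrank(y, w):
    # Weight share strictly below y plus half the weight tied at y.
    below = ((y[None, :] < y[:, None]) * w[None, :]).sum(axis=1)
    equal = ((y[None, :] == y[:, None]) * w[None, :]).sum(axis=1)
    return (below + 0.5 * equal) / w.sum()

def weighted_cov(a, b, w):
    p = w / w.sum()
    return float(np.sum(p * (a - np.sum(p * a)) * (b - np.sum(p * b))))

def generalized_gini(y, v, w=None):
    """ASSUMED form: G(v) = -v * cov(y, (1 - F(y))**(v - 1)) / mean(y)."""
    y = np.asarray(y, dtype=float)
    w = np.ones_like(y) if w is None else np.asarray(w, dtype=float)
    mu = np.average(y, weights=w)
    f = weighted_midrank(y, w)
    return -v * weighted_cov(y, (1.0 - f) ** (v - 1.0), w) / mu

incomes = [1.0, 2.0, 3.0, 4.0]
for v in (1.5, 2.0, 4.0):
    print(v, generalized_gini(incomes, v))
```

With v = 2 the expression collapses to the standard covariance Gini, and on this toy sample the index rises with v, mirroring the pattern described for Table 2.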
Finally, the Gini is compared with other measures of inequality. Other prominent measures include the Atkinson index (Atk) and the Generalized entropy index (GEI), see Das and Parikh (1982), Cowell (2000), and Plat (2012). For both measures, the higher the value, the greater the inequality. Both indexes also feature inequality aversion parameters. In the Atkinson index, a parameter value close to zero means indifference about inequality, while higher values show that people dislike it. In contrast, high GEI parameter values mean that people are indifferent about inequality. In all cases, inequality in Lithuania remained significantly the highest.
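For reference, the two alternative indexes can be sketched using their standard textbook definitions. This is not necessarily the exact implementation behind Table 2; it assumes strictly positive incomes, and the function names are hypothetical.

```python
import numpy as np

def _relative_income(y, w):
    y = np.asarray(y, dtype=float)
    w = np.ones_like(y) if w is None else np.asarray(w, dtype=float)
    return y / np.average(y, weights=w), w

def generalized_entropy(y, alpha, w=None):
    """GE(alpha); alpha = 0 is the mean log deviation, alpha = 1 the Theil index."""
    r, w = _relative_income(y, w)
    if alpha == 0:
        return float(np.average(-np.log(r), weights=w))
    if alpha == 1:
        return float(np.average(r * np.log(r), weights=w))
    return float((np.average(r ** alpha, weights=w) - 1.0) / (alpha * (alpha - 1.0)))

def atkinson(y, epsilon, w=None):
    """Atkinson index with inequality aversion epsilon >= 0."""
    r, w = _relative_income(y, w)
    if epsilon == 1:
        return float(1.0 - np.exp(np.average(np.log(r), weights=w)))
    mean_power = np.average(r ** (1.0 - epsilon), weights=w)
    return float(1.0 - mean_power ** (1.0 / (1.0 - epsilon)))

incomes = [1.0, 3.0]
print(generalized_entropy(incomes, 2))   # half the squared coefficient of variation
print(atkinson(incomes, 0.5), atkinson(incomes, 1))
```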

Subgroup inequality
The previous subsection has shown that inequality in Lithuania is large when compared to EU countries. Next, we will consider inequality between and within population subgroups, for example, between males and females and amongst males and females. Then we will estimate stratification, i.e. the extent to which income of one group overlaps with other groups' income.
Continuing the discussion started in Section 2, the interpretation of a subgroup may not be straightforward, as we are dealing with (equivalised) household income instead of individual income, but can be explained with the help of an example. Imagine a household composed of 1 male and 1 female. Then, comparing household income (i.e. adding up household members' income and allocating the summed household income to each member) implies no income inequality between the male and the female in that household. However, this is only true if all households have the same number of males and females, which is not true in general. There are some households consisting of more males, while others have a higher number of female members. If males tend to earn more than females, households with more males will earn higher equivalised household income than households with more females. In aggregate, this will lead to inequalities between the subgroups. Inequality between these groups should be interpreted as "inequality between male and female dominated households". This way, we can combine information on household income and composition together with individual characteristics.
The methodology used to estimate inequality between subgroups is similar to that in the IMF (2016) and is based on Eurostat (2018a). The methodology for estimating inequality within subgroups and stratification is adapted from Yitzhaki and Lerman (1991). Additionally, the technique proposed by Yitzhaki and Lerman (1991) is used to decompose total inequality into between, within and stratification terms to see which of them contributes most to inequality.
Inequality between subgroups
Inequality between subgroups refers to measured inequality between households grouped under certain criteria. For example, households can be grouped by "Sex" into two subgroups l = 1 and l = 2: "Males" and "Females". To estimate between-subgroup inequality, we first estimate the weighted average income of a subgroup, m(l), and then divide it by the weighted average income of all subgroups, m, see (4) in Appendix A, to get an income ratio m(l) / m. We then compare the ratio with that of the EU, namely of its member states that joined the EU before 2004 (old EU states), and with those member states that joined it after 2004 (new EU states). Our method is similar to that used in the IMF (2016), but has several differences: the IMF (2016) analyses weighted income decile ratios while we compare weighted average income ratios. The IMF (2016) compares Lithuania to the EU, while we additionally compare it to new and old EU states to control for the development of countries. Finally, we have more grouping criteria (a total of 9) and estimate standard errors.
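The income ratio m(l) / m described above amounts to a ratio of weighted means. A minimal sketch, with hypothetical variable names and toy data, could look as follows; it is not the Appendix A estimator and omits the standard errors.

```python
import numpy as np

def income_ratios(y, w, group):
    """Weighted mean income of each subgroup divided by the overall weighted mean."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    group = np.asarray(group)
    overall = np.average(y, weights=w)
    ratios = {}
    for label in np.unique(group):
        mask = group == label
        ratios[str(label)] = float(np.average(y[mask], weights=w[mask]) / overall)
    return ratios

# Toy data: equivalised income, survey weight and a grouping variable.
y = [10.0, 20.0, 30.0, 40.0]
w = [1.0, 1.0, 1.0, 1.0]
status = ["unemployed", "unemployed", "employed", "employed"]
print(income_ratios(y, w, status))
```

A ratio above one means the subgroup is, on average, better off than the population as a whole; comparing the same ratio across country groups gives the contrasts reported in Table 3.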
Our findings are in line with the IMF (2016), which also reviews between-subgroup inequality in Lithuania. The IMF (2016) reveals large inequalities between the top and the bottom income deciles, between the employed and the unemployed and non-labour market participants, between the elderly and other age subgroups, as well as between educated and less educated households subgroups, i.e. these ratios are much higher in Lithuania than in the entire EU.
In addition to these findings, the results presented in Table 3 allow adding the following points:
[ Table 3 about here.]
• Differences of ratios are significant between many subgroups in Lithuania. The subgroups include those grouped according to the IMF (2016) criteria (activity status, age bracket, number of dependants, education) as well as other subgroups. For example, we split households based on the main income source. Those who receive largely self-employment income tend, on average, to have more disposable income than employees or others, a trend not observed in the EU as a whole. Significant inequality also exists between subgroups grouped by the number of people working in the household (nr working) and the sector where one works (sector).
• Ratios between the majority of the 9 subgroups are also significantly different from the ratios between their EU counterparts. Besides the subgroups in the IMF (2016) (those grouped by activity status, age bracket, education), the self-employed in Lithuania on average earn proportionally more than their EU counterparts. Additionally, those who work in the information technologies, finance, real estate and administration sector (IT, finance, RE, admin) earn, on average, relatively more income in Lithuania than one would in the EU.
• There are some groups between which inequality in Lithuania is smaller as compared to the EU. For example, those working within the agricultural sector are relatively better off in Lithuania compared to the EU. Additionally, income ratios in Lithuania are more similar to those in the new EU states. In particular, those who are under 19 years old have very similar relative income both in Lithuania and new EU states.
In general, ratios between subgroups have been largely persistent and slightly widening since 2004 and, especially, since 2010. For example, there was a slowly widening gap between the employed and the retired. The gap between those who received tertiary education and the rest first decreased, but has again started rising since the crisis. The income of managers has been rising at the expense of professionals, technicians and associates. Recently, relative average income in the private service sector ("IT, finance, RE, admin") has also been growing, largely at the expense of the public sector ("Public admin, education, health"). Families with at least 4 children have seen their relative income fluctuate greatly over 2004-2015. In particular, families with at least 5 children saw their income rise from 40% of average population income in 2004 to over 90% in 2010, and then fall back to around 50% by 2015. This is likely to be related to family benefits, which depend on (high) past income and reacted very countercyclically during the economic crisis of 2009-2010. Additionally, there was a simultaneous increase in the number of births during this period, leading to more benefit payouts.
Inequality within subgroups
Inequality exists within subgroups in Lithuania. A common way to measure it is to calculate inequality measures for subgroup income as is done for total income, see G_l in Formula (5) in Appendix A. We have calculated the Gini coefficients for Lithuania's subgroups and compared them with the Gini coefficients of the EU, new and old EU states in Table 4.
• Most of the within-subgroup Gini coefficients examined in Table 4 are higher in Lithuania than in the EU. Especially large subgroup inequality exists among those working in the agricultural sector and the unemployed.
• The above-mentioned within-group inequalities are much higher in Lithuania than in the EU. Additionally, households where the main source of income is self-employment income are also unequal among themselves, even though similar inequality within subgroups exists in new EU states. The Gini of households with many children is relatively small and we know from the between analysis that these households earn much lower income.
Over time, inequality within subgroups increased in many subgroups. This rise has been especially strong since 2009. In particular, the Gini coefficient of the unemployed rose from 42 in 2004 to 48 in 2015. This may well reflect the unemployment benefit structure, which is dependent on past earnings and employment history. Unemployment has risen substantially since the crisis and many unemployment benefits have been paid out. However, these benefits were discontinued for those who were unemployed for a longer time. Additionally, as the economy recovered, it became easier for the unemployed to be in employment for at least several months during the year. Similarly, there was a rise in inequality among those who are neither employed, unemployed, retired nor students (largely disabled). Additionally, there has been a rise in inequality among those who are over 65 and, to a lesser extent, those aged 30-64. Inequality increased within all the different education levels and within all occupations (managers in particular). Inequality increased in the agricultural sector as well as in the IT, finance, real estate and administration sectors ("IT, finance, RE, admin").
Stratification between subgroups
Inequality is linked to stratification. Stratification measures whether the income of each member of a subgroup differs from the income of every member of all other subgroups. We use the methodology proposed by Yitzhaki and Lerman (1991), which measures stratification on a scale from -100 to 100. A value of 100 indicates high stratification: all members of a subgroup have incomes that are different from those of members of other subgroups. A value of 0 indicates no stratification, i.e. there is a perfect income overlap between the subgroups. Negative numbers indicate that the subgroup should actually be multiple subgroups, i.e. the income of some subgroup members is much higher than that of members of other subgroups, while some members also have much lower income than members of other subgroups. The estimates of measures of stratification in Table 5 allow us to make two more insights:
• Several subgroups in Lithuania are stratified. Families with more dependants are detached in terms of income from other subgroups and the difference is stark when compared to the EU. Households which are employed or have more employed members are stratified from the unemployed or those which do not participate in the labour market. Income stratification of these subgroups is greater in Lithuania than in the EU. Additionally, there are several subgroups which are stratified in Lithuania to a similar extent as they are stratified in new EU states: subgroups characterised by occupation, education and age bracket. This could signal that Lithuania, like the new EU states, is facing more labour market imbalances, where the demand for highly educated professionals is especially high, while redistribution channels are too weak to compensate for the income of those out of the labour force (e.g. the elderly).
• There are several subgroups which should form several smaller subgroups in Lithuania. The unemployed, for example, have a stratification value of -9.9, meaning that some unemployed are relatively well off, while others are not. This could reflect that some of the unemployed are still getting unemployment benefits, are able to take on part-time work or are simply living in high-income households, while others are not. Similar tendencies also exist in the agricultural sector, with some being much better off than others.
Stratification between groups has been increasing, especially since 2009. This is particularly apparent when considering activity status: the stratification coefficient of those employed rose from 16% in 2009 to 33% in 2015. However, this could be largely attributed to market correction, as the stratification coefficient was around 26% before the crisis.
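The stratification measure can be illustrated with a short sketch. It assumes the overlapping-type index in which a subgroup's own income ranks are compared with the ranks its members would hold among everyone else, Q = cov(y, F_own - F_others) / cov(y, F_own), computed within the subgroup; this is the form usually associated with Yitzhaki and Lerman (1991), but the Appendix A definition should be treated as authoritative. Values are on a -1 to 1 scale (multiply by 100 for the scale used in Table 5), and all names are hypothetical.

```python
import numpy as np

def midrank(values, ref, ref_w):
    """Weighted mid-rank of each element of `values` within the reference sample."""
    below = ((ref[None, :] < values[:, None]) * ref_w[None, :]).sum(axis=1)
    equal = ((ref[None, :] == values[:, None]) * ref_w[None, :]).sum(axis=1)
    return (below + 0.5 * equal) / ref_w.sum()

def weighted_cov(a, b, w):
    p = w / w.sum()
    return float(np.sum(p * (a - np.sum(p * a)) * (b - np.sum(p * b))))

def stratification(y, w, in_group):
    """ASSUMED index: cov_i(y, F_own - F_others) / cov_i(y, F_own)."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    in_group = np.asarray(in_group, dtype=bool)
    yi, wi = y[in_group], w[in_group]
    f_own = midrank(yi, yi, wi)                            # rank within own subgroup
    f_others = midrank(yi, y[~in_group], w[~in_group])     # rank among everyone else
    return weighted_cov(yi, f_own - f_others, wi) / weighted_cov(yi, f_own, wi)

ones = np.ones(6)
first_three = [True, True, True, False, False, False]
print(stratification([1, 2, 3, 10, 11, 12], ones, first_three))  # no overlap
print(stratification([1, 2, 3, 1, 2, 3], ones, first_three))     # full overlap
```

A subgroup whose members sit at both extremes of the rest of the distribution produces a negative value, which corresponds to the "should be several subgroups" reading given above.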
Subgroup decomposition
We have analysed between- and within-subgroup inequality and stratification separately. Now, we will identify how much each of the terms contributes to the Gini of disposable income in Lithuania and compare this to the EU, new and old EU states. To do this, we will use the methodology provided by Yitzhaki and Lerman (1991), outlined in Appendix A.
The subgroup decomposition results are presented in Table 6. The Gini coefficient is decomposed into within, between and stratification components for each of the 9 groupings considered before. The following conclusions can be drawn:
[ Table 6 about here.]
• The majority of inequality decomposes into within-groups rather than between-groups in Lithuania. The largest between-contribution is observed between households which have a different number of people working ("nr working", 10 Gini points), but even here the within-contribution is 3 times higher. This finding is not surprising, as inequality within subgroups is often found to matter more (see Elbers et al. 2008), suggesting that the majority of variation in income is between households of similar characteristics. Income inequality within groups is also more important for the EU. Additionally, several household characteristics, for example sex, do not seem to contribute significantly to inequality in Lithuania.
• Except for education, labour market characteristics of the household are more important in explaining inequality than demographics. For example, the number of people working, the main source of income of the household, and the occupation individually explain 5-10 Gini points. The between-contribution when grouping people according to activity status is 7 Gini points. This means that if all household members were employed and earned employment income, the Gini coefficient would fall by 7 points and become similar to the EU Gini coefficient. This between-contribution in Lithuania is about 2 times higher than the EU between-contribution, indicating that employment is much more important in terms of income in Lithuania than in the EU. Low redistribution (low taxes and transfers) in Lithuania could explain why it is very costly not to participate in the labour market (IMF 2016; Lazutka 2017). Furthermore, the number of those employed within a household matters in Lithuania. If we consider the number of people working in a household ("nr working"), the between-contribution is 10 Gini points. Demographic characteristics (age, number of dependants, sex) determine a relatively lower share (0.2-1.4 Gini points).
The within, between and stratification decomposition is broken down further to reveal the importance of the employed to income inequality each year from 2005 to 2015. Specifically, the within-contribution of activity status is split into the within-contributions of the employed, unemployed and non-participants. This decomposition, along with the between and stratification contributions, is shown in Table 7 for Lithuania. The rise in household disposable income inequality in Lithuania since 2011 can be primarily explained by a rise in income inequality among those who are employed. This is partly determined by the fact that a larger share of the population has become employed since the crisis (51% in 2011 and 55% in 2015), the employed are taking a larger share of income (from 62% to 68%) and are themselves more unequally distributed (the within-Gini rose from 29 to 33). To a lesser extent, inequality is also rising due to greater between-subgroup inequality and stratification, especially stratification of the employed vis-a-vis other groups. This is because average wages rose faster than non-labour income during this period.
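To make the three terms concrete, the sketch below assumes the decomposition in the form G = Σ S_i G_i + Σ S_i G_i Q_i (P_i - 1) + G_b, where P_i is the population share, S_i the income share, G_i the within-Gini and Q_i the stratification index of subgroup i, and G_b is twice the covariance between subgroup mean incomes and subgroup mean overall ranks divided by mean income. This is the form usually attributed to Yitzhaki and Lerman (1991); Appendix A may present or sign the terms differently. All names are hypothetical.

```python
import numpy as np

def midrank(values, ref, ref_w):
    below = ((ref[None, :] < values[:, None]) * ref_w[None, :]).sum(axis=1)
    equal = ((ref[None, :] == values[:, None]) * ref_w[None, :]).sum(axis=1)
    return (below + 0.5 * equal) / ref_w.sum()

def weighted_cov(a, b, w):
    p = w / w.sum()
    return float(np.sum(p * (a - np.sum(p * a)) * (b - np.sum(p * b))))

def gini_subgroup_decomposition(y, w, group):
    """ASSUMED form: total = within + between + stratification."""
    y, w, group = np.asarray(y, float), np.asarray(w, float), np.asarray(group)
    mu = np.average(y, weights=w)
    f_all = midrank(y, y, w)
    total = 2.0 * weighted_cov(y, f_all, w) / mu
    within = strat = 0.0
    means, mean_ranks, shares = [], [], []
    for label in np.unique(group):
        m = group == label
        yi, wi = y[m], w[m]
        p = wi.sum() / w.sum()                  # population share P_i
        mu_i = np.average(yi, weights=wi)
        s = p * mu_i / mu                       # income share S_i
        f_own = midrank(yi, yi, wi)
        c_own = weighted_cov(yi, f_own, wi)
        gini_i = 2.0 * c_own / mu_i             # within-subgroup Gini G_i
        f_others = midrank(yi, y[~m], w[~m])
        # Q_i is set to zero when all subgroup incomes are identical.
        q = weighted_cov(yi, f_own - f_others, wi) / c_own if c_own > 0 else 0.0
        within += s * gini_i
        strat += s * gini_i * q * (p - 1.0)
        means.append(mu_i)
        mean_ranks.append(np.average(f_all[m], weights=wi))
        shares.append(p)
    between = 2.0 * weighted_cov(np.array(means), np.array(mean_ranks), np.array(shares)) / mu
    return {"total": total, "within": within, "between": between, "stratification": strat}

y = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
parts = gini_subgroup_decomposition(y, np.ones(6), np.array(["a"] * 3 + ["b"] * 3))
print(parts)
```

Under these assumptions the three terms add up to the overall Gini exactly, because each overall rank is a population-share-weighted mix of the own-group rank and the rank among all others.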

What factors can explain income inequality in Lithuania?
In order to explain why household disposable income inequality in Lithuania is high, we have decomposed income inequality by income factors. The four components of disposable income are labour income, capital income, transfers and taxes (including social transfers). These are further broken down by more granular income factors.
Each decomposition method yields different results, thus one must be careful in drawing firm conclusions without trying different methods. We use Lerman and Yitzhaki (1985) method to decompose the Gini coefficient. The method considers how equally each income source is distributed within the population, what share of income each factor constitutes and how does each factor source correlate with total income. More details on this method are provided in Appendix B. This differs from a "dropping" method, where a factor is excluded from the definition of disposable income and a new Gini is computed. In both cases, these are statistical decompositions which cannot provide any causal analysis. However, they allow getting an idea of where the inequalities lie. We provide the estimates for Lithuania and the EU. Unfortunately, 4 countries, including Germany, did not provide all the necessary income factors, meaning that the data sample for the EU differs from the previous analysis. Table 8 reveals the results for decomposition of disposable income in Lithuania and the EU. The contributions allow us to make the following conclusions: [ • Labour income contributes most to income inequality in Lithuania-53.63 Gini points, while capital contributes only 1.32 and transfers and taxes actually reduce income inequality by 0.25 and 17.74 points respectively. Labour income contributes most to income inequality on the EU level as well, yet about 9.72 Gini points less than in Lithuania.
• All labour subfactor contributions are larger in Lithuania than in new and old EU states. The largest subfactor contribution is employee income in Lithuania (34.48 Gini points). The contribution is about 0.58 Gini points higher than in the new EU states and 4.42 higher than in the old EU states. Selfemployed contribute less to inequality in Lithuania (9.29 Gini points). However, this is by 6.23 Gini points more than in new EU states and by 3.32 Gini points more than in the old EU states.
• Transfers reduce income inequality in Lithuania. Specifically, transfers contribute -0.25 Gini points if we use the Lerman and Yitzhaki (1985) decomposition method. This is because there is little correlation between the amount of transfers a household receives and its total income: transfers are quite equally distributed across the population. However, if transfers are increased, they will also occupy a larger share of total income. This will reduce the inequality generated by other income components, most notably labour income. This follows directly from (6) in Appendix B. Old-age benefits in particular reduce income inequality, due to their large income share and the fact that pension benefits are capped in Lithuania. Pensions are not capped in many EU countries, the old EU states in particular, which is why old-age benefits increase inequality there. Nevertheless, some transfers in Lithuania (family/children-related allowances and sickness benefits) correlate at least partly with total income because they depend on the labour income of the household. It should be noted that, under a "dropping" decomposition method, inequality would fall much more (by around 16 Gini points).
• Taxes (and social contributions) reduce income inequality in Lithuania. Specifically, taxes reduce income inequality by 17.74 Gini points if we use the Lerman and Yitzhaki (1985) decomposition method. This is because income tax negatively correlates with labour income, the main contributor to income inequality. Additionally, there is a degree of progressivity in the tax system due to the minimum non-taxable income, for example. While taxes reduce inequality less than in the old EU states, they actually reduce it more than in the new EU states. Again, if we used the "dropping" method, the picture would be quite different: taxes and social contributions would reduce the Gini of disposable income in Lithuania by 3 percentage points.
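As a quick consistency check on the figures quoted above, the four Lerman and Yitzhaki contributions should add up to the Gini coefficient of disposable income. The sketch below only re-adds numbers reported in the text; the variable names are illustrative, and the totals are implied values rather than figures taken from Table 8.

```python
# Factor contributions to Lithuania's Gini (in Gini points), as quoted in the text.
contributions_lt = {
    "labour": 53.63,
    "capital": 1.32,
    "transfers": -0.25,
    "taxes": -17.74,
}

# Under the Lerman-Yitzhaki decomposition the contributions sum to the Gini.
implied_gini_lt = round(sum(contributions_lt.values()), 2)

# The text states that the EU labour contribution is about 9.72 points lower.
implied_labour_eu = round(contributions_lt["labour"] - 9.72, 2)

print(implied_gini_lt)    # 36.96
print(implied_labour_eu)  # 43.91
```

The implied total of roughly 37 Gini points is what the contributions in the bullets above add up to; it is not a separately reported estimate.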
As labour is the key contributor to inequality, we provide more details on its factor decomposition. Specifically, the estimated total labour contribution, i.e. $T_1$ in (6) in Appendix B, is decomposed into the income share $S_1$, the within-Gini $G_1$ and the Gini correlation $R_1$, which measures how much this income factor correlates with total income. Table 9 shows the decomposition results for Lithuania and the EU. The labour decomposition further reveals several aspects of income inequality:
[Table 9 about here.]
• Lithuanian households are especially dependent on labour income. This is reflected by the high Gini correlation $R_1$, which is the main reason why labour income contributes a lot to income inequality. The quantity $R_1$ is equal to 90.61 in Lithuania, while it is under 80 in the new EU states and 74 in the old EU states. In contrast, $S_1$ and $G_1$ are quite similar in Lithuania and the EU. A high $R_1$ means that those who receive high labour income are also likely to receive high income in general, and it creates a high dependence on labour income. Other types of income play a smaller role in Lithuania than in the EU.
• While the Gini correlation in Lithuania is high relative to the EU level for both employment and self-employment income, the difference is greater for self-employment income. The estimated Gini correlation is equal to 81.44 for employment income (denoted by $R_{11}$), which is 11 points higher than in the EU. The Gini correlation $R_{12}$ for self-employment income is equal to 70.11, which is 25.46 points more than in the EU. A high $R_{12}$ means that self-employment income is especially important for self-employed households. This may give rise to concern, as such income is generally less stable than employment income. Additionally, self-employment income is more unequally distributed, with $G_{12}$ in Lithuania standing at 91.13, as opposed to 55.16 for employment income.
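To see how the three terms combine, the sketch below backs out the income shares implied by the figures quoted for Lithuania. It assumes that each contribution is the product $T_k = S_k G_k R_k$, with $G_k$ and $R_k$ on a 0 to 100 scale and $T_k$ in Gini points. The resulting shares are back-of-the-envelope values subject to rounding, not figures reported in Table 9, and the function name is hypothetical.

```python
# Figures for Lithuania quoted in the text: contribution T (Gini points),
# within-Gini G and Gini correlation R (both on a 0-100 scale).
reported = {
    "employee": {"T": 34.48, "G": 55.16, "R": 81.44},
    "self_employment": {"T": 9.29, "G": 91.13, "R": 70.11},
}


def implied_share(T, G, R):
    """Solve T = S * G * R for the income share S (assumed product form)."""
    return T / (G * R / 100.0)


shares = {name: implied_share(**row) for name, row in reported.items()}
for name, share in shares.items():
    print(f"{name}: implied income share of about {share:.2f}")
```

Under these assumptions the implied shares are roughly 0.77 for employee income and 0.15 for self-employment income: self-employment contributes noticeably to the Gini despite a small share because both its within-Gini and its Gini correlation are high.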
The decomposition also sheds light on the marginal contribution of each income factor to the Gini coefficient. We have calculated by how much the Gini changes if we raise an income factor by a small proportion $e_k$ and hold the other income factors constant. This is approximately equal to evaluating by how many Gini points the Gini coefficient changes if we increase an income factor by 1%. Formula (7) in Appendix B quantifies the effects. If all income factors are raised by the same $e_k = e$, the Gini does not change, as summarised in the first row of Table 10.
[Table 10 about here.]
Table 10 shows the marginal contributions to the Gini for Lithuania and the EU. Several conclusions can be drawn on labour and capital income as well as taxes and transfers.
• In Lithuania and the EU, raising labour income results in higher income inequality. Specifically, a 1% increase in labour income raises the Gini by approximately 0.1174 Gini points in Lithuania. However, raising labour income would increase inequality more in Lithuania than elsewhere. The reason is self-employment income. A 1% rise in self-employment income raises income inequality by 0.0391 Gini points in Lithuania, compared to 0.0131 Gini points in the EU. Raising employment income would raise income inequality by similar amounts in both economies: by 0.0611 in Lithuania and 0.0694 in the EU.
• Higher capital income increases income inequality, yet the estimated effect in Lithuania is small. A 1% rise in capital income would increase the Gini by 0.0067 Gini points. In the EU, the effect of capital is more than two times as strong, although still low. This finding should be taken with caution. The effect is not prominent because capital constitutes a small share of total income in the survey. However, surveys have trouble measuring capital income levels and, therefore, the real marginal effects could be much larger.
• Higher transfer income and taxes reduce inequality. Raising transfers by 1% reduces inequality by 0.0892 Gini points, while raising taxes (including social contributions) reduces income inequality by an additional 0.0348. Raising transfers actually has a larger effect in Lithuania than in the EU. Increasing old-age benefits alone would reduce inequality by 0.0544 Gini points, three times more than in the EU. Other transfers have a much smaller impact individually. Taxes, however, have less effect in Lithuania than in the EU, especially the old EU states. Specifically, a 1% rise in income taxes and social contributions paid by the household reduces inequality by 0.0348 Gini points, about half of the impact in the old EU states, which is 0.0643. However, the tax situation in Lithuania is very similar to that of the new EU states.
This information can help to forecast how income inequality will evolve in Lithuania: if market income (labour and capital) grows faster than transfers and taxes, inequality will rise; if market income grows proportionally to transfers and taxes, inequality will remain the same; if transfers and taxes grow faster, inequality will fall.
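The three cases can be illustrated with a first-order projection based on the marginal effects quoted above for Lithuania. This is a minimal sketch: the growth rates in the scenarios are hypothetical, the function name is illustrative, and the linear approximation is only reasonable for small changes.

```python
# Marginal effects on Lithuania's Gini (Gini points per 1% rise in a factor),
# as quoted in the text.
marginal_lt = {
    "labour": 0.1174,
    "capital": 0.0067,
    "transfers": -0.0892,
    "taxes": -0.0348,
}


def projected_gini_change(market_growth, redistribution_growth):
    """First-order change in the Gini for given growth rates (in percent)."""
    growth = {
        "labour": market_growth,
        "capital": market_growth,
        "transfers": redistribution_growth,
        "taxes": redistribution_growth,
    }
    return sum(marginal_lt[k] * growth[k] for k in marginal_lt)


# Hypothetical growth scenarios (percent per year).
market_faster = projected_gini_change(10.0, 5.0)
proportional = projected_gini_change(10.0, 10.0)
redistribution_faster = projected_gini_change(5.0, 10.0)

print(f"market income grows faster:       {market_faster:+.3f}")
print(f"proportional growth:              {proportional:+.3f}")
print(f"transfers and taxes grow faster:  {redistribution_faster:+.3f}")
```

The proportional scenario returns a change very close to zero, which reflects the property that the marginal effects sum to (approximately) zero when all factors grow at the same rate.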

Conclusions
We have tackled two questions in this study and have also suggested possible improvements for future studies.
First, we have run three statistical tests and found that income inequality in Lithuania is in all cases one of the highest in the EU. Specifically, we have tested the accuracy of the estimates by estimating their standard errors, and we have checked the sensitivity of the results to the inequality measure and to the equivalence scale used.
Second, we have investigated why income inequality in Lithuania is higher compared to the EU by using univariate decomposition techniques. We have found large inequalities between and within many groups of households in the country. In all cases, within-group inequality contributes more to income inequality than between-group inequality, both in Lithuania and in the EU. This means that inequality is higher among households with similar observable characteristics than between households with different characteristics. Inequalities within the unemployed and those working in the agricultural sector are especially prominent. Nevertheless, between-group contributions are also significant for Lithuania, suggesting where policy could look more closely. The largest between-group inequalities lie between the employed and the rest of the population, and this type of inequality has been rising over time. While the number of employed household members matters for inequality between households in all countries, its contribution in Lithuania is particularly large. This results in stratification of these groups. As the factor decomposition shows, the large between-group inequality contribution can be explained by the unequal distribution of labour income, especially self-employment income. Self-employment income is particularly unequally distributed among households, while other income factors (taxes and transfers) constitute a much smaller share of income in Lithuania than in the EU. Having said that, it is clear that a portion of the self-employed are faring much worse than others, so a closer look into the income determinants of the self-employed would be helpful for policy makers.
Additionally, the marginal decomposition of the Gini coefficient by factors has shown how difficult it is to lower the Gini in Lithuania. To reduce income inequality by 1 Gini point, transfers and taxes must be raised by 8% more than market income. Currently, labour income is rising by about 10% per year in Lithuania. This means that taxes and transfers would have to rise universally by about 18%. The alternative is to change the transfer and tax mechanisms to ensure that those with higher labour income receive fewer transfers and pay more taxes and social contributions, relative to their market income.
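The 8% and 18% figures can be reproduced, to a first-order approximation, from the marginal effects quoted for Lithuania. The sketch below assumes that labour and capital income grow at the same rate and that transfers and taxes grow at a common rate; the variable names are illustrative.

```python
# Marginal effects for Lithuania (Gini points per 1% rise), as quoted in the text.
marginal_market = 0.1174 + 0.0067             # labour + capital
marginal_redistribution = -(0.0892 + 0.0348)  # transfers + taxes

target_change = -1.0   # desired change in the Gini, in Gini points
market_growth = 10.0   # approximate annual labour income growth from the text, in %

# Solve marginal_market * g_m + marginal_redistribution * g_r = target_change for g_r.
required_growth = (target_change - marginal_market * market_growth) / marginal_redistribution
growth_gap = required_growth - market_growth

print(f"required growth of transfers and taxes: about {required_growth:.0f}%")
print(f"excess over market income growth: about {growth_gap:.0f}%")
```

Under these assumptions the required growth is about 18%, that is, about 8 percentage points above the growth of market income.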
The estimates of inequality may have several drawbacks. First, there is a large shadow economy in Lithuania, with some estimates exceeding 25% of GDP in 2013 and 2015 (see Schneider 2013; Žukauskas 2016). Even though survey respondents are informed that their data will not be used for tax purposes, some of them may still be unwilling to disclose information on their true income received or taxes paid. It remains unclear how this affects inequality, because it depends on the income distribution within the shadow economy together with the income distribution of the observed economy. Additionally, this may cause problems when comparing households across countries, since the shadow economy is particularly large in Lithuania. Second, as has already been pointed out many times, EU-SILC undersamples the income of rich individuals in all countries, especially capital income (Navickė and Lazutka 2017), something that the survey weights do not correct for. Including the rich will result in higher measures of inequality in Lithuania.
However, inequality will rise in other EU countries as well. Therefore, the relative position of Lithuania vis-a-vis other countries may not change much. Nevertheless, the alternative Household Finance and Consumption Survey (HFCN 2019) could partly correct for both of these shortcomings, as it has data on consumption, which can be used to estimate the shadow economy, and it oversamples wealthy households in Lithuania along with many other EU countries. Furthermore, greater access to administrative data would be yet another path to take.
Future studies can also consider using an alternative methodology, for example, multivariate techniques to decompose income inequality. This was not the focus of the current study because the results of a multivariate decomposition depend on all the variables by which the Gini is decomposed, and there is no consensus on which should be included. Furthermore, variables available for some countries are less available for others in the EU-SILC. Nevertheless, our additional check using a multivariate decomposition technique as in Social Situation Monitor (2017) [...]

Appendices
$U = \{1, \dots, N\}$ is the set representing the elements of the finite survey population, and $y_1, \dots, y_N$ are the values of the variable of interest (income) in $U$. The subset $s = \{i_1, \dots, i_n\}$ of $U$ is the sample, while $w_i$, $i \in s$, are the corresponding survey weights. We use the estimator of the Gini coefficient (1), constructed in line with Berger (2008), which is based on the values $\hat F(y_i)$ of the estimated distribution function

$$\hat F(y) = \frac{1}{\hat N} \sum_{i \in s} w_i I\{y_i \le y\}, \qquad \hat N = \sum_{i \in s} w_i. \tag{3}$$

Here $I\{\cdot\}$ stands for the indicator function. Estimators of the subgroup and factor decompositions are constructed using similar plug-in principles.

A Subgroup decompositions
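We give the decomposition of the Gini estimator (2) by groups as in Yitzhaki and Lerman (1991). Let $s = s_1 \cup \dots \cup s_L$ be a division of the sample into non-overlapping groups. Denote

$$\hat N_l = \sum_{i \in s_l} w_i, \quad \hat P_l = \frac{\hat N_l}{\hat N}, \quad \hat m = \frac{1}{\hat N} \sum_{i \in s} w_i y_i, \quad \hat m^{(l)} = \frac{1}{\hat N_l} \sum_{i \in s_l} w_i y_i, \quad \bar F^{(l)} = \frac{1}{\hat N_l} \sum_{i \in s_l} w_i \hat F(y_i),$$

where $\hat N_l$ is the estimated population size in the subgroup $l$, the quantity $\hat P_l$ is the estimated population share, $\hat m$ is the estimated mean of the survey variable in $U$, $\hat m^{(l)}$ is the estimated mean in the subgroup, and $\bar F^{(l)}$ is the estimate of the average of global ranks in the subgroup $l$. Consider the values $\hat F_l(y_i)$ and $\hat F_{L \setminus l}(y_i)$, $i \in s$, of the estimated distribution functions

$$\hat F_l(y) = \frac{1}{\hat N_l} \sum_{j \in s_l} w_j I\{y_j \le y\}, \qquad \hat F_{L \setminus l}(y) = \frac{1}{\hat N - \hat N_l} \sum_{j \in s \setminus s_l} w_j I\{y_j \le y\},$$

and denote

$$\mathrm{cov}_l(y, \hat F_l(y)) = \frac{1}{\hat N_l} \sum_{i \in s_l} w_i \left(y_i - \hat m^{(l)}\right) \hat F_l(y_i)$$

and

$$\mathrm{cov}_l(y, \hat F_l(y) - \hat F_{L \setminus l}(y)) = \mathrm{cov}_l(y, \hat F_l(y)) - \frac{1}{\hat N_l} \sum_{i \in s_l} w_i \left(y_i - \hat m^{(l)}\right) \hat F_{L \setminus l}(y_i).$$

Then the estimated decomposition by groups is written as

$$\hat G = \sum_{l=1}^{L} S_l G_l + \sum_{l=1}^{L} S_l G_l Q_l (\hat P_l - 1) + G_b,$$

where

$$S_l = \frac{\hat P_l \hat m^{(l)}}{\hat m}, \quad G_l = \frac{2\, \mathrm{cov}_l(y, \hat F_l(y))}{\hat m^{(l)}}, \quad Q_l = \frac{\mathrm{cov}_l(y, \hat F_l(y) - \hat F_{L \setminus l}(y))}{\mathrm{cov}_l(y, \hat F_l(y))}, \quad G_b = \frac{2}{\hat m} \sum_{l=1}^{L} \hat P_l \left(\hat m^{(l)} - \hat m\right) \bar F^{(l)}.$$

Here the component $S_l$ represents the share of the survey variable, $G_l$ is the estimated within-group Gini coefficient, the part $Q_l$ is the estimated stratification term, and $G_b$ is the between-group term.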
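The subgroup decomposition of Yitzhaki and Lerman (1991) can be sketched in code as follows. This is a minimal illustration on synthetic data, not the authors' implementation: it assumes the covariance form of the Gini, $2\,\mathrm{cov}(y, \hat F(y)) / \hat m$, which may differ slightly from the estimator (2), and all function and variable names are hypothetical.

```python
import bisect
import random


def weighted_mean(x, w):
    return sum(wi * xi for xi, wi in zip(x, w)) / sum(w)


def ecdf(values, weights, points):
    """Weighted empirical distribution function of `values`, evaluated at `points`."""
    pairs = sorted(zip(values, weights))
    xs = [v for v, _ in pairs]
    cum, running = [0.0], 0.0
    for _, wt in pairs:
        running += wt
        cum.append(running)
    return [cum[bisect.bisect_right(xs, p)] / running for p in points]


def weighted_cov(x, z, w):
    """Weighted covariance; centring x alone is enough."""
    mx = weighted_mean(x, w)
    return weighted_mean([(xi - mx) * zi for xi, zi in zip(x, z)], w)


def subgroup_decomposition(y, w, g):
    """Return (gini, within, stratification, between) for group labels g."""
    m, n_hat = weighted_mean(y, w), sum(w)
    ranks = ecdf(y, w, y)
    gini = 2 * weighted_cov(y, ranks, w) / m
    within = stratification = 0.0
    means, mean_ranks, pop_shares = [], [], []
    for label in sorted(set(g)):
        idx = [i for i, gi in enumerate(g) if gi == label]
        rest = [i for i, gi in enumerate(g) if gi != label]
        yl, wl = [y[i] for i in idx], [w[i] for i in idx]
        p = sum(wl) / n_hat
        ml = weighted_mean(yl, wl)
        cov_own = weighted_cov(yl, ecdf(yl, wl, yl), wl)
        cov_rest = weighted_cov(
            yl, ecdf([y[i] for i in rest], [w[i] for i in rest], yl), wl
        )
        s_l, g_l = p * ml / m, 2 * cov_own / ml
        q_l = (cov_own - cov_rest) / cov_own  # stratification term
        within += s_l * g_l
        stratification += s_l * g_l * q_l * (p - 1)
        means.append(ml)
        mean_ranks.append(weighted_mean([ranks[i] for i in idx], wl))
        pop_shares.append(p)
    between = 2 * weighted_cov(means, mean_ranks, pop_shares) / m
    return gini, within, stratification, between


# Synthetic example: three groups with different income levels (not survey data).
random.seed(0)
n = 300
g = [random.randrange(3) for _ in range(n)]
y = [random.lognormvariate(0.2 * gi, 0.7) for gi in g]
w = [random.uniform(0.5, 2.0) for _ in range(n)]
gini, within, stratification, between = subgroup_decomposition(y, w, g)
print(f"Gini {gini:.4f} = within {within:.4f} + stratification "
      f"{stratification:.4f} + between {between:.4f}")
```

With the covariance form of the Gini, the three components add up to the overall coefficient exactly, up to floating-point error.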

B Factor decompositions
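We write down an estimate of the factor decomposition by Lerman and Yitzhaki (1985). Write $y_i = \sum_k y_i^{(k)}$, where $k$ is a factor of the survey variable. Introduce the weighted means

$$\hat\mu = \frac{1}{\hat N} \sum_{i \in s} w_i y_i, \qquad \hat\mu_k = \frac{1}{\hat N} \sum_{i \in s} w_i y_i^{(k)}.$$

Consider the values $\hat F(y_i)$ and $\hat F(y_i^{(k)})$, $i \in s$, of the estimated distribution functions (3), and denote the expressions

$$\mathrm{cov}(y^{(k)}, \hat F(y)) = \frac{1}{\hat N} \sum_{i \in s} w_i \left(y_i^{(k)} - \hat\mu_k\right) \hat F(y_i)$$

and

$$\mathrm{cov}(y^{(k)}, \hat F(y^{(k)})) = \frac{1}{\hat N} \sum_{i \in s} w_i \left(y_i^{(k)} - \hat\mu_k\right) \hat F(y_i^{(k)}).$$

Then the estimated decomposition by factors is

$$\hat G = \sum_k T_k, \qquad T_k = S_k G_k R_k, \tag{6}$$

where

$$R_k = \frac{\mathrm{cov}(y^{(k)}, \hat F(y))}{\mathrm{cov}(y^{(k)}, \hat F(y^{(k)}))}, \qquad G_k = \frac{2\, \mathrm{cov}(y^{(k)}, \hat F(y^{(k)}))}{\hat\mu_k}, \qquad S_k = \frac{\hat\mu_k}{\hat\mu}.$$

Here $R_k$ is the estimate of the so-called Gini correlation between the survey variable and its $k$th component, $G_k$ represents the Gini index of factor $k$, and $S_k$ is the share of factor $k$. For a small change $e_k$ in the $k$th factor, the expression of marginal effects is

$$\frac{\partial \hat G}{\partial e_k} = S_k \left(G_k R_k - \hat G\right), \tag{7}$$

see Lerman and Yitzhaki (1985).

The variables "Households" and "Household members" are the unique number of households and household members in the data set. The variable "Observations" refers to those household members for whom all income data is available. Columns 6 to 8 refer to the average, median and the Gini coefficient of the population estimate of equivalized household disposable income.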
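The factor decomposition (6) and the marginal effects (7) of Appendix B can be sketched in code as follows. This is a hedged illustration on synthetic data rather than the authors' implementation: it assumes the covariance form of the Gini, treats taxes as negative income amounts, and uses hypothetical function and variable names.

```python
import bisect
import random


def weighted_mean(x, w):
    return sum(wi * xi for xi, wi in zip(x, w)) / sum(w)


def ecdf(values, weights, points):
    """Weighted empirical distribution function of `values`, evaluated at `points`."""
    pairs = sorted(zip(values, weights))
    xs = [v for v, _ in pairs]
    cum, running = [0.0], 0.0
    for _, wt in pairs:
        running += wt
        cum.append(running)
    return [cum[bisect.bisect_right(xs, p)] / running for p in points]


def weighted_cov(x, z, w):
    """Weighted covariance; centring x alone is enough."""
    mx = weighted_mean(x, w)
    return weighted_mean([(xi - mx) * zi for xi, zi in zip(x, z)], w)


def factor_decomposition(factors, w):
    """Return the Gini and, per factor, S, G, R, T = S*G*R and the marginal effect."""
    names = list(factors)
    total = [sum(factors[k][i] for k in names) for i in range(len(w))]
    mu = weighted_mean(total, w)
    ranks_total = ecdf(total, w, total)
    gini = 2 * weighted_cov(total, ranks_total, w) / mu
    table = {}
    for k in names:
        yk = factors[k]
        mu_k = weighted_mean(yk, w)
        cov_total = weighted_cov(yk, ranks_total, w)    # cov(y_k, F(y))
        cov_own = weighted_cov(yk, ecdf(yk, w, yk), w)  # cov(y_k, F(y_k))
        s_k, g_k, r_k = mu_k / mu, 2 * cov_own / mu_k, cov_total / cov_own
        table[k] = {
            "S": s_k,
            "G": g_k,
            "R": r_k,
            "T": s_k * g_k * r_k,
            "marginal": s_k * (g_k * r_k - gini),
        }
    return gini, table


# Synthetic households (not survey data): taxes enter as negative amounts.
random.seed(1)
n = 400
w = [random.uniform(0.5, 2.0) for _ in range(n)]
labour = [random.lognormvariate(0.0, 0.9) for _ in range(n)]
transfers = [random.uniform(0.1, 0.5) for _ in range(n)]
taxes = [-0.25 * x for x in labour]
gini, table = factor_decomposition(
    {"labour": labour, "transfers": transfers, "taxes": taxes}, w
)
for name, row in table.items():
    print(f"{name:9s} T = {row['T']:+.4f}  marginal = {row['marginal']:+.4f}")
print(f"Gini = {gini:.4f}")
```

In this sketch the contributions $T_k$ sum to the Gini and the marginal effects sum to zero, in line with the remark in the main text that raising all factors by the same proportion leaves the Gini unchanged.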