Cost allocation for the problem of pollution reduction: a dynamic cooperative game approach

Abstract This paper studies CO2 emissions at a global level. The authors use Dynamic Optimisation to derive the minimum penalty cost on countries at every point in time. They then use an Imputation Distribution Procedure to allocate the minimum penalty cost among countries. Their work extends the Shapley value cost allocation to a penalty for reducing CO2 emissions. The paper has implications for how to provide initiatives to improve cooperation on reducing CO2 emissions at an international level. Results show that a reduction in the cost of only one country can be harmful for other countries. In this way, some countries can end up worse off in a case where all countries experience a uniform decrease in their penalty cost. Therefore, the findings of this work suggest a low penalty-cost scenario that helps countries strive for pollution reduction and provides fruitful links for policy-makers. They show that the Clean Development Mechanism (CDM) of the Kyoto Protocol could be implemented by the Shapley value cost allocation.


Introduction
In the control of CO2 emissions at a global level, free-riding is a key problem for countries designing and signing an agreement, because emissions reduction has the nature of a public good. Two problems arise when a number of countries join an agreement. One is to determine the targeted level of CO2 emissions; the other is how the targeted level can be allocated among the countries in a consistent and stable fashion. In this paper, we try to tackle these two issues and give suggestions on the Clean Development Mechanism (CDM) of the Kyoto Protocol as a cost-effective tool to reduce CO2 emissions to the committed targets, which is also discussed by Buchner and Carraro (2004), Ringius, Torvanger, and Underdal (2002) and Viguier (2004).
We study the aforementioned problem in a cooperative game context. First, we calculate the Pareto-Optimal level of CO2 emissions and then use the Shapley value to allocate the payoff (in our case, the penalty cost) among players (countries). We use Dynamic Optimisation instead of Dynamic Equilibrium to derive the optimal path, i.e., a minimum penalty cost at every point in time. Then, we use the concept of an Imputation Distribution Procedure to allocate the minimum penalty cost among countries. Our work provides a new extension of the Shapley value cost allocation as a penalty to reduce CO2 emissions among the players. Our research offers implications for the Kyoto Protocol: it can provide initiatives for improving cooperation on CO2 emissions control, as well as policy tools or incentives that have a positive effect on countries' welfare. Theoretically, an incentive for reducing CO2 emissions must be based on a cost-benefit analysis. The highest incentives must be set at the level where the net benefit of reducing CO2 emissions to a committed target is high. However, it is difficult to calculate the net benefit empirically. Therefore, a cost-effective approach becomes a substitute basis for making policy on reducing CO2 emissions. Our cost analysis may be implemented by the Clean Development Mechanism (CDM) of the Kyoto Protocol. The CDM, which is defined in the Kyoto Protocol (IPCC, 2007), can provide an emissions-reduction project that generates Certified Emission Reduction units in order to limit countries' emissions to a targeted level (see Note 1). This paper suggests a method of working out the minimum penalty cost for any given aggregate targeted level as a Pareto-Optimal outcome, and of allocating that cost among individual countries.
Our research relates to the last strand of literature. These studies try to find the cooperative gains in abatement costs and then assign the cooperative penalty cost among players (countries). Some of them examined allocation rules for emissions reductions. Germain et al. (2003) extended the model of Chander and Tulkens (1995) to a dynamic (closed-loop) setting in which cooperation is negotiated at each point in time. They also defined a mechanism, based on the core-theoretic idea, that ensures a stable grand coalition. Botteon (2001) studied stable environmental agreements at a global level using a model of asymmetric players and found that the Shapley value stimulates high cooperation. L. Petrosjan and Zaccour (2003) extended this work to trans-boundary pollution games by computing the characteristic function and the Shapley value for each player and allocating the Shapley value over time in a time-consistent manner. Further, Benchekroun and Taherkhani (2014) examined pollution reduction costs using a time-consistent Shapley value allocation and examined how a country's adoption of pollution reduction depends on a change in its damage (abatement) cost function.
Our work relates to the literature covered by the recent surveys on CO2 emissions reduction by Kozlovskaya, Petrosyan, and Zenkevich (2010) and L. A. Petrosjan and Zenkevich (2015), but our findings are based on optimality and cost allocation. This allows us to re-examine a strategic static framework from the literature. The difficulty therefore lies in adaptation and CO2 emissions. In pollution problems (CO2 emissions), stability is most important for achieving cooperation among the countries and overcoming the free-rider problem. L. A. Petrosjan and Zenkevich (2015) described three conditions for ensuring the stability of cooperation: time consistency; stability with dynamics; and robustness to irrational behaviour. Time consistency for cooperative differential games was introduced by L. A. Petrosjan (1993). It means that once cooperation is established, the cooperating partners continue to be guided by the same optimality principle at every later point in time. A strategically stable cooperation agreement is one that is supported by a Nash Equilibrium.
Pioneering work on applying the Shapley value was done by Littlechild and Owen (1973), who demonstrated that airport cost allocation is closely related to the Shapley value in a static model. In our paper, we use both Pareto-Optimality and the Imputation Distribution Procedure to allocate the penalty cost along a dynamic optimal path. The airport game-theoretic analysis introduced by Littlechild and Owen (1973) could be used to allocate the cost of CO2 emissions reduction among countries. In the literature, this approach is widely used in different kinds of cost-allocation problems (for more details see Castro, Gómez, & Tejada, 2009; Littlechild & Thompson, 1977; Vázquez-Brage, Van den Nouweland, & García-Jurado, 1997).
The rest of the paper is organised as follows. Section 2 introduces the emission-game model, the Nash Equilibrium and the values of the characteristic function. Section 3 discusses the cooperative dynamic game and the possible coalitions for all countries. The calculation and decomposition of the Shapley value and the cost allocation among countries are presented in Section 4. Section 5 contains the discussion; in particular, we compare our model with Littlechild and Owen (1973). The concluding remarks and policy suggestions are given in Section 6.

Dynamic game emission model
For the study of abatement cost analysis, we use a simple dynamic game model with a pollution stock, which was introduced by L. Petrosjan and Zaccour (2003). Let there be $n$ countries in the game of pollution control, indexed by $i \in \{1, 2, \dots, n\}$. The CO2 emission rate of country $i$ at every moment $t \ge 0$ is denoted by $z_i(t)$. The emissions of the countries form a pollution stock. The accumulated pollution stock at time $t$ is denoted by $y(t)$ and evolves according to the differential equation
$$\dot{y}(t) = \sum_{i=1}^{n} z_i(t) - \delta\, y(t),$$
where $y(0) = y_0 > 0$ and $\delta > 0$ is the natural absorption rate of the pollutant. Suppose $\bar{z}_i$ is the usual emission rate of country $i$ and $C_i(z_i)$ is the emission-reduction cost incurred when country $i$ limits its emission to $z_i$.
The damage cost that the pollution stock imposes on country $i$ is denoted by $D_i(y(t))$. The main objective of each country in the CO2 emission game is to minimise its damage cost plus its emission-reduction cost. Each country therefore solves a non-cooperative minimisation problem in which costs are discounted at a common rate $\rho$. To define the best emission path, we solve the corresponding optimisation problem. We add some remarks on this formulation. First, we chose a simple ecological-economic problem in order to put the stress on cost allocation and on a model for allocating pollution costs over time in a desired manner. Second, the main feature of this model is that each country's cost depends on the stock of pollution and on total CO2 emissions. Third, the two assumptions on the cost functions, convexity and the sign of the first derivative, are natural. The convexity of $C_i(z_i)$ means that the marginal abatement cost (MAC) of reducing CO2 emissions is high at low emission levels (Germain et al., 2003). Fourth, we find mathematically that countries reduce their costs at a similar rate. From the dynamic computation, we finally conclude that pollution is reduced by reducing CO2 emissions.
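The model can be illustrated numerically. The following is a minimal sketch, not the authors' specification: it assumes the stock obeys dy/dt = (total emissions) - delta*y with constant emission rates, a quadratic abatement cost C_i(z_i) = (cost_coef/2)(z_bar_i - z_i)^2 and a linear damage D_i(y) = damage_coef * y. These functional forms, the parameter values and every function name are hypothetical choices made only for illustration.

```python
import math

def stock_path(y0, total_emissions, delta, t):
    """Closed-form solution of dy/dt = total_emissions - delta*y with y(0) = y0,
    valid for a constant total emission rate (an assumption of this sketch)."""
    steady_state = total_emissions / delta
    return steady_state + (y0 - steady_state) * math.exp(-delta * t)

def discounted_cost(z_i, z_bar_i, cost_coef, damage_coef, y0, total_emissions,
                    delta, rho, horizon=300.0, steps=30000):
    """Trapezoidal approximation of the discounted cost of one country,
    int_0^horizon e^{-rho t} [C_i(z_i) + D_i(y(t))] dt,
    with the assumed quadratic abatement cost and linear damage."""
    dt = horizon / steps
    abatement = 0.5 * cost_coef * (z_bar_i - z_i) ** 2
    total = 0.0
    for k in range(steps + 1):
        t = k * dt
        flow = abatement + damage_coef * stock_path(y0, total_emissions, delta, t)
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * math.exp(-rho * t) * flow * dt
    return total

# Illustrative numbers only: one country emitting 1 instead of its usual 2,
# while all countries together emit 3 units per period.
example_cost = discounted_cost(z_i=1.0, z_bar_i=2.0, cost_coef=1.0, damage_coef=1.0,
                               y0=10.0, total_emissions=3.0, delta=0.1, rho=0.1)
```

Under these assumptions the integral also has a closed form, (C_i + damage_coef * S)/rho + damage_coef * (y0 - S)/(rho + delta) with S = total_emissions/delta, which gives a quick way to check the numerical result.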

The Nash Equilibrium
Suppose $I = \{1, 2, \dots, n\}$ is the group of countries. For a coalition $K \subseteq I$ with $k$ members, the coalition problem is to minimise the members' total cost subject to the pollution dynamics. The value of coalition $K$ is defined in Equation (6), where $k$, with $k < n$, stands for the number of countries that form the coalition, and it is calculated under the $\gamma$-core assumption (see Note 2). In coalition $K$, the objective is thus to minimise the members' costs, while the remaining countries respond individually, each minimising its own total cost. We solve the game for the resulting $n - k + 1$ players to obtain the Nash Equilibrium. In L. Petrosjan and Zaccour (2003), by contrast, when a coalition is formed the non-signatories play the strategy obtained in the non-cooperative scenario; this strategy was the solution to Equation (6). We consider L. Petrosjan and Zaccour's (2003) model for the equilibrium of a linear differential game in which each country plays a dominant strategy.

Value function
Let us suppose that the coalition $K$ of $k$ countries, with $n - k$ denoting the number of countries that are not part of the coalition, chooses a cost-minimising strategy. Let $U(K, y)$ be the value function of coalition $K$. It satisfies the Hamilton-Jacobi-Bellman equation (HJBE), Equation (8). Differentiating the right-hand side of Equation (8) with respect to $y_i$ gives Equation (9), and substituting Equation (9) into Equation (8) gives Equation (10). We then conjecture a linear value function, Equation (11), which satisfies the HJBE. Writing Equation (10) in this linear form and comparing the coefficients of equal powers of the variables determines the coefficients $E_i$ and $F_i$, which are then substituted into Equation (11).
This is the construction of the Nash Equilibrium. Therefore, the solution will be in the Nash Equilibrium in the sub-game and it will satisfy the initial condition.
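The steps above can be made concrete under explicit functional forms. What follows is a sketch under stated assumptions, not a reproduction of the authors' Equations (8) to (14): it assumes a quadratic abatement cost $C_i(z_i) = \tfrac{a}{2}(\bar z_i - z_i)^2$ and a linear damage $D_i(y) = d_i y$, as is common in linear-state pollution games. The symbols $a$, $d_i$, $\bar z_i$, $E_K$ and $F_K$ are introduced here for illustration only.

```latex
\begin{align*}
\rho\,U(K,y) &= \min_{z_i,\; i\in K}\Big\{\sum_{i\in K}\Big[\tfrac{a}{2}(\bar z_i - z_i)^2 + d_i\,y\Big]
   + U_y(K,y)\Big(\sum_{j\in I} z_j - \delta y\Big)\Big\},\\
0 &= -a(\bar z_i - z_i) + U_y(K,y)
   \;\Longrightarrow\; z_i = \bar z_i - \frac{U_y(K,y)}{a}, \qquad i\in K,\\
U(K,y) &= E_K\,y + F_K \;\Longrightarrow\; U_y(K,y) = E_K,\\
\rho E_K &= \sum_{i\in K} d_i - \delta E_K
   \;\Longrightarrow\; E_K = \frac{\sum_{i\in K} d_i}{\rho+\delta},\\
\rho F_K &= \frac{k\,E_K^{2}}{2a} + E_K \sum_{j\in I} z_j .
\end{align*}
```

In the last line, coalition members emit $z_j = \bar z_j - E_K/a$ and outsiders play their own equilibrium rates. Because $E_K$ does not depend on $y$, every emission rate is constant in this sketch, which is what makes the linear conjecture self-consistent.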

Cooperative solution of the emissions-reduction game model
The calculation of the characteristic function is not standard (L. Petrosjan and Zaccour, 2003): when the value of the characteristic function is calculated for a coalition $K \subset I$, the left-over players are bound to their Nash Equilibrium strategies. Now consider the case where players agree to cooperate. Let us suppose that all players minimise their total payoff under cooperation. Let the cooperative game be denoted by $\Gamma_c(y_0, t)$, where all players have agreed to the optimality principle. To build an optimal cooperative solution, there must be an agreement on how the players act cooperatively and on how they share their cooperative payoff. Therefore, the optimal solution of the cooperative game $\Gamma_c(y_0, t)$ must fulfil these two conditions (see Note 3). In this way, we solve the dynamic optimisation problem.
For further interpretation, the Bellman function satisfies the Hamilton-Jacobi-Bellman equation (HJBE), Equation (15). Taking the derivative of the right-hand side (RHS) of Equation (15) with respect to $z_i$ and setting it equal to zero gives the first-order condition for the best emission-reduction strategy, Equation (16). We then write the Bellman function in linear form, substitute it into Equation (15) and solve, which gives Equation (17). Inserting Equation (17) into Equation (16), we get the optimal strategy for players under cooperation.
The optimal path of CO2 emissions relies on cooperation among the countries remaining viable. The optimality principle must remain effective along the trajectory $\{y(t)\}$ from $t_0$ onwards. We use Pontryagin's maximum principle to obtain the optimal strategy $z = (z_1, \dots, z_n)$, Equation (19). Substituting this set of optimal strategies into Equation (7) yields the equation for the optimal trajectory, Equation (20). Solving Equation (20) and substituting Equation (19), we obtain the optimal trajectory, Equation (21), and with it the explicit form of the Bellman function.

All possible coalition outcomes
Computing the total cost at the optimal level also requires the values of the coalitions other than the grand coalition. The number of such intermediate coalitions (excluding the singletons, the grand coalition and the empty set) is $2^n - n - 2$. The objective function that we consider for a coalition is the sum of its member countries' objective functions. For the optimisation, we insert the players' Nash Equilibrium strategies computed earlier. The value function of coalition $S$ is $U(S, y, t)$; it is obtained by minimising this objective subject to the pollution dynamics. It satisfies the Hamilton-Jacobi-Bellman equation (HJBE), Equation (24), which has the form $\rho\, U(S, y, t) = \min\{\dots\}$. Differentiating the right-hand side (RHS) of Equation (24) with respect to $y_i$ gives Equation (25). Substituting Equation (2), Equation (3) and Equation (25) into Equation (24) gives Equation (26), where $E_i = \partial U_i / \partial y$. We again conjecture a linear value function, which satisfies the HJBE. Writing Equation (26) in this linear form and comparing the coefficients of equal powers of the variables determines its coefficients.

Shapley value computation
Shapley (1953) showed that the Shapley value is unique. The Shapley value therefore provides a principle for the Pareto-Optimal distribution of the total gain $U_s(K, y, t)$, and it can be defined without using the concept of dominance.
Our main aim is to calculate the Shapley value for every country $i$ and thereby assign to each country its marginal contribution, which is the difference between values of the characteristic function, $U(K, y, t) - U(K \setminus \{i\}, y, t)$. Where an equilibrium exists, these values are those of the Nash Equilibrium between coalition $K$ and the remaining countries (players) in $I \setminus K$. We then allocate the total abatement cost among the countries by the Shapley value under cooperation, provided the grand coalition forms.
We consider a cooperative game with $n$ countries. The Shapley value of country $i$ is
$$Sh_i(U, y, t) = \sum_{K \ni i} \frac{(n-k)!\,(k-1)!}{n!}\,\big[U(K, y, t) - U(K \setminus \{i\}, y, t)\big],$$
where $k = |K|$. In this case the game is symmetric; therefore, all players have the same Shapley value.
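The computation described above can be sketched in code. The example below averages each player's marginal cost over all orderings of the players, which is equivalent to the weighted-sum form of the Shapley value. The cost function `symmetric_cost` and its numbers are hypothetical and only illustrate the symmetric case; they are not taken from the paper.

```python
from itertools import permutations

def shapley_values(players, cost):
    """Shapley value of a cost game: the average marginal cost of each player
    over all orderings in which the grand coalition can be assembled.

    cost maps a frozenset of players to that coalition's cost and must
    return 0 for the empty coalition.
    """
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            joined = coalition | {p}
            totals[p] += cost(joined) - cost(coalition)  # marginal contribution
            coalition = joined
    return {p: totals[p] / len(orderings) for p in players}

def symmetric_cost(coalition):
    """Hypothetical symmetric game: cost depends only on coalition size."""
    k = len(coalition)
    return 10.0 * k - k * k

values = shapley_values([1, 2, 3], symmetric_cost)
```

Because the cost of a coalition here depends only on its size, every player receives the same share, namely the grand-coalition cost divided by the number of players. Enumerating all orderings grows factorially with the number of players, so this direct approach is only practical for small games.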

Allocation of the Shapley value
The cost of reducing CO2 emissions is allocated among the countries with the methodology of a cooperative game. Let the state of the cooperative game be the pair $(y, t)$ and let the game be denoted by $\Gamma(y, t)$. A sub-game starts at time $t$ with pollution stock $y^I$; let $\Gamma(y^I, t)$ be such a sub-game starting along the cooperative trajectory. In the sub-game $\Gamma(y^I, t)$, the characteristic function of a coalition $K \subseteq I$ is defined as its minimum cost and written $U(K, y, t)$. The total cooperative cost to be distributed among the countries is $U(I, y, 0)$, which is the minimum cost of the grand coalition $I$ in $\Gamma(y, 0)$. Let $Sh(U, y, t) = \{Sh_1(U, y, t), \dots, Sh_n(U, y, t)\}$ denote the Shapley value in the sub-game $\Gamma(y, t)$. Finally, let $k(t) = \{k_1(t), \dots, k_n(t)\}$ denote the cost allocated to the countries $i = 1, \dots, n$ at time $t$.
The vector $k(t) = \{k_1(t), \dots, k_n(t)\}$ is called an Imputation Distribution Procedure if
$$Sh_i(U, y, 0) = \int_0^{\infty} e^{-\rho t}\, k_i(t)\, dt, \qquad i = 1, \dots, n.$$
In other words, the time function $k_i(t)$ is an Imputation Distribution Procedure when it decomposes over time the overall cost of country $i$, which is given by the $i$-th component of the Shapley value of the game $\Gamma(y, 0)$: the total discounted cost is equal to $Sh_i(U, y, 0)$.
Multiplying both sides of the above equation by $e^{-\rho t}$ and integrating gives the discounted form of the allocation. We need to show that $k_i(t)$ is time consistent, that is, we have to prove that
$$Sh_i(U, y, 0) = \int_0^t e^{-\rho s}\, k_i(s)\, ds + e^{-\rho t}\, Sh_i(U, y^I, t).$$
Putting the values of Equation (30) into Equation (33) and calculating, we get an expression in which $y^I(t)$ is given by Equation (21). Now we check that $k_i(t)$ is time consistent. It must decompose the Shapley value of each country $i$ at $t = 0$, so Equation (31) takes the form of Equation (35). Multiplying both sides of Equation (35) by $e^{-\rho t}$ and integrating, we derive an expression involving $y^I(t)$, which is given in Equation (21). Inserting $y^I(t)$ into Equation (37) and simplifying the integrals in Equation (38) leads to the required result. Finally, the Shapley value must be in the core. The next theorem therefore shows that the cooperative sub-game $\Gamma(y, t)$ is convex, so that the Shapley value lies in the core and is its centre of gravity (Shapley, 1971).
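The time-consistency identity above can be checked numerically. The sketch below assumes the form of Imputation Distribution Procedure that is standard in this literature, k_i(t) = rho * Sh_i(t) - d/dt Sh_i(t), and a hypothetical Shapley value that is linear in a stock converging exponentially to a long-run level. The constants `A`, `B`, `Y0`, `Y_INF`, `RHO`, `DELTA` and all function names are illustrative assumptions, not values from the paper.

```python
import math

RHO, DELTA = 0.1, 0.2      # assumed discount rate and stock adjustment rate
Y0, Y_INF = 10.0, 4.0      # assumed initial and long-run pollution stock
A, B = 1.5, 20.0           # assumed coefficients of a linear Shapley value

def stock(t):
    """Assumed cooperative stock trajectory."""
    return Y_INF + (Y0 - Y_INF) * math.exp(-DELTA * t)

def shapley(t):
    """Hypothetical Shapley value of one country in the sub-game starting at t."""
    return A * stock(t) + B

def idp(t):
    """Assumed instantaneous allocation k_i(t) = rho*Sh_i(t) - d/dt Sh_i(t)."""
    d_shapley = A * (-DELTA) * (Y0 - Y_INF) * math.exp(-DELTA * t)
    return RHO * shapley(t) - d_shapley

def paid_plus_remaining(t, steps=20000):
    """int_0^t e^{-rho s} k_i(s) ds + e^{-rho t} Sh_i(t), by the trapezoidal rule."""
    h = t / steps
    integral = 0.0
    for j in range(steps + 1):
        s = j * h
        weight = 0.5 if j in (0, steps) else 1.0
        integral += weight * math.exp(-RHO * s) * idp(s) * h
    return integral + math.exp(-RHO * t) * shapley(t)

# For any t, the discounted payments so far plus the discounted remaining
# Shapley value should reproduce the initial Shapley value.
gaps = [abs(paid_plus_remaining(t) - shapley(0.0)) for t in (1.0, 5.0, 25.0)]
```

Each entry of `gaps` should be close to zero up to integration error, because e^{-rho s} k_i(s) is the exact derivative of -e^{-rho s} Sh_i(s) under the assumed form of k_i.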
Definition of core: The emission-reduction cost $C_i(z_i)$ is in the core of the coalitional game $U(I, y, t)$ if, for all $K \subseteq I$, $\sum_{i \in I} C_i(z_i) \ge U(K, y, t)$.
Proof: Suppose $K \subseteq I$ and $S = K \cup \{i\}$ where $i \in K$; then $Sh_i(U, K, y, t) \le Sh_i(U, S, y, t)$. To prove this, we consider a permutation $\pi$ of $K$. Let $\xi(\pi, i)$ be the marginal contribution of $i$ in $K$ under $\pi$; the Shapley value is the average of $\xi(\pi, i)$ over $\pi(K)$, the set of all permutations of the members of $K$. We know that $S = K \cup \{i\}$; again, let $\xi'(\pi, i)$ be the average marginal contribution of $i$ in $S$. This average is taken over the $|S|!$ permutations of $S$, which differ from the permutations $\pi$ taken over $K$. It can be shown that, for all $\pi \in \Pi(K)$, we have $\xi'(\pi, i) \ge \xi(\pi, i)$: the marginal contribution of $i$ in $S$ is still $\xi(\pi, i)$ if we place $j$ after $i$, whereas if we put $j$ before $i$, convexity gives $\xi'(\pi, i) \ge \xi(\pi, i)$. The same property holds in general for any set $S \supseteq K$. Now consider the allocation $(Sh_1(U, I), \dots, Sh_n(U, I))$. It is clear that no part of the allocation of $S$ can be blocked in this allocation, that no coalition can block it and, furthermore, that the shares $Sh_1(U, I) + \dots + Sh_n(U, I)$ add up to the value of the grand coalition. This shows that the Shapley value allocation is an element of the core and is its centre of gravity.
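The blocking argument can be illustrated with a brute-force check. The sketch below uses the usual cost-game convention, which is an assumption here and may differ in sign from the definition above: an allocation is in the core if it exactly covers the grand coalition's cost and no coalition is charged more than its own stand-alone cost. The cost function `size_cost` and the allocations are hypothetical.

```python
from itertools import combinations

def in_cost_core(allocation, cost, tol=1e-9):
    """Check core membership for a cost game (assumed convention):
    shares sum to the grand-coalition cost and, for every proper coalition,
    the members' shares do not exceed the coalition's stand-alone cost."""
    players = list(allocation)
    grand = frozenset(players)
    if abs(sum(allocation.values()) - cost(grand)) > tol:
        return False  # not efficient
    for size in range(1, len(players)):
        for members in combinations(players, size):
            charged = sum(allocation[p] for p in members)
            if charged > cost(frozenset(members)) + tol:
                return False  # this coalition would block the allocation
    return True

def size_cost(coalition):
    """Hypothetical symmetric cost game with decreasing marginal costs."""
    k = len(coalition)
    return 10.0 * k - k * k

equal_split = {1: 7.0, 2: 7.0, 3: 7.0}     # Shapley value of this symmetric game
skewed_split = {1: 10.0, 2: 7.0, 3: 4.0}   # charges player 1 more than acting alone

equal_ok = in_cost_core(equal_split, size_cost)
skewed_ok = in_cost_core(skewed_split, size_cost)
```

In this toy game the marginal cost of adding a member falls as the coalition grows, so the equal split passes the check while the skewed allocation is blocked by player 1 acting alone.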

Discussion
In this section, we discuss how our optimal cost allocation, expressed through the characteristic function, could be adopted for the pollution problem. A characteristic function is a mathematical tool for measuring a strategic coalition. We apply the Imputation Distribution Procedure to the characteristic function to obtain a cost allocation that satisfies the Pareto-Optimality condition. The Shapley value selects the imputation that satisfies the definitions and theorems discussed in Section 4.
The imputation is usually linked to a pay-off distribution procedure (a minimum cost allocation in our case) in dynamic games. In our case, the interesting question is how to distribute these amounts among the countries. The distribution procedure is feasible over time: the amounts distributed to each country add up to its total share (see the definition of the Imputation Distribution Procedure). Our analysis decomposes the total minimum penalty cost of CO2 emissions over time. This means that if the countries renegotiate the agreement at any time along the cooperative trajectory, they obtain the same penalty allocation.
Coalition formation in a cooperative game benefits all participating countries, but in the real world forming a coalition to overcome the pollution problem takes a long time and faces difficulties. One difficulty is that, although all countries know the benefit of coalition formation, they do not immediately agree on how to distribute the penalty cost among themselves through cooperation. Countries usually spend a lot of time bargaining with each other. This process becomes more complicated as the number of players increases, although there are solutions, like the Shapley value, to the problem of how to allocate the cost of reducing pollution. Such a solution needs to be settled by a governing authority, which does not exist in a free cooperative game without a government. To tackle this issue, our paper suggests that the Shapley value cost allocation through the Imputation Distribution Procedure is an efficient way to proceed and that the Clean Development Mechanism of the Kyoto Protocol could play the governing role.
One option is highlighted by Littlechild and Owen (1973), who suggest adopting the Shapley value cost allocation; this approach is only applicable when the grand coalition forms. The Shapley value of the airport problem is applicable to the pollution problem, but this approach does not follow the principle of Pareto-Optimality. Therefore, in our case, it does not seem to be the best one. The details of the airport problem in our case are presented in the Appendix. In this paper, we consider $n$ countries that would like to share the cost of pollution (the penalty cost of CO2 emissions). We assume that the countries or groups of countries have different characteristics. To reach an agreement, they must follow the optimal path. In the airport problem, by contrast, the countries bargain over how the value of the coalition is distributed, which is time consuming and leads to high penalty costs when countries form a grand coalition. Regarding the policy implications of our analysis, there are three types of groups of countries: those like the EU, with a high preference for the protection of the environment; those like the USA, Australia and Canada, with a weak preference for the environment but high per capita incomes; and those like China, Brazil and India, with a weak preference for the climate and a low per capita income. The environmental protection problem by definition includes all countries of the globe. The environmental policies of some countries can make a difference to the level of CO2 emissions, and many countries act as non-atomistic agents. This issue could be handled through the Kyoto Protocol, and especially through the Clean Development Mechanism.

Concluding remarks
In the game of pollution reduction among countries, we consider the Shapley value as a penalty-cost sharing rule for allocating pollution costs to each country. We used the characteristic function to assign the Shapley outcome to every country and then decomposed it over time to distribute the penalty cost among the countries. The most important implication of our analysis is that allocating emission penalty costs by the Shapley value motivates each country to take positive action towards reducing pollution. This paper found that the minimum penalty cost can be allocated through Dynamic Optimisation, and it derived an optimal path for finding the minimum cost of CO2 emissions. Our key finding is that the Shapley value penalty-cost allocation for the pollution problem is optimal and that cost allocation by the Imputation Distribution Procedure provides new tools for researchers. Our findings should promote the penalty-cost allocation rule of the Shapley value at a global level.
This analysis also provides a time-dependent policy tool for all countries through the Shapley value cost allocation for the pollution problem, and it therefore has policy implications for global environmental protection over time. We provide a Dynamic Optimisation solution in the form of a characteristic function that has the property of Pareto-Optimality, and we find a minimum penalty-cost allocation strategy. Governing authorities can take the initiative to handle the global free-riding problem. This is a promising way to tackle the global issue. Our analysis offers guidelines for institutions like the International Environmental Agreement (IEA) and the Kyoto Protocol to implement the Clean Development Mechanism (CDM).

Notes
1. According to Burniaux, Chateau, Dellink, Duval, and Jamet (2009), the CDM is a source of income for the United Nations Framework Convention on Climate Change, in terms of funds to finance programmes and projects on a voluntary basis to tackle the adverse effects of climate change.
2. This assumption was discussed by Chander and Tulkens (1995) and Germain et al. (2003).
3. (1) There must be an agreement on the set of cooperative strategies. (2) There must be a mechanism to distribute the payoff among the cooperative players.