Data-related challenges and solutions in building China’s national carbon emissions trading scheme

ABSTRACT China is now in the process of building its national carbon emissions trading scheme (ETS). Data are the foundation for the design and operation of an ETS. This paper presents a comparative analysis of the data requirements for China’s national ETS construction and the existing ETS-related data for enterprises in China. In doing so, it identifies the underlying data gap in building China’s national ETS in terms of data availability and data quality. Based on the experiences of international ETSs, and experiences at national and pilot levels in China, we propose two short-term strategies and four long-term solutions to meet the challenges from technical and management perspectives.

Key policy insights
The major data requirements for China’s national ETS can be categorized into six groups: production, emissions, technology, management, economy and policy data.
The ETS-related data are generally available in China except for parts of data on emissions, such as energy carbon content and the oxidation factor.
The data challenges that are faced by China’s national ETS include differences in corporate data availability and imperfect data quality.
Short-term strategies to address the challenges include establishing data collection guidelines based on existing data and prioritizing major emissions or sectors with better data for inclusion under the ETS.
Long-term solutions to address the challenges include introducing the concept of tiers, clarifying data sources and introducing a monitoring plan, conducting MRV capacity building and establishing a rigorous third-party verification system.


Introduction
As a market-based instrument for controlling carbon emissions, emissions trading schemes (ETS) are attracting attention in an increasing number of jurisdictions (World Bank, 2016). Over the past few years, China has gained substantial experience in building and operating an ETS, as exemplified by the ETS pilot programmes implemented in seven provinces and cities (National Development and Reform Commission [NDRC], 2011), including Beijing, Tianjin, Shanghai, Chongqing, Guangdong, Hubei and Shenzhen. In December 2017, China's national ETS was launched, marked by the release of the National Carbon ETS Construction Plan (Power Generation Sector) (NDRC, 2017a). According to the plan, China's national ETS will undergo a basic construction and simulation period over the next two years. Establishing a monitoring, reporting and verification (MRV) system is one of the major tasks of the national ETS (NDRC, 2017b). Obtaining the accurate data required by the ETS' design and operation is the main purpose of the MRV system. Data requirements are ETS-specific, and it is vital to understand whether, or to what extent, the existing data in China can support China's national ETS and the corresponding challenges. Therefore, it is necessary to research the data-related issues confronting the construction of China's national ETS.
Currently, in-depth studies of carbon emissions data from enterprises in an ETS context are still lacking. Few studies have examined data problems related to the design and operation of ETSs, although many studies that focus on allowance allocation methods or outline the operational experiences of ETSs (mainly the European Union ETS) have touched on ETS data problems (Ellerman, Buchner, & Carraro, 2007). The limited availability and quality of data are two widely recognized problems that could lead to serious consequences.
First, an overly tight or loose cap could be established as a result of inappropriate data. The ETS cap is based on an economic and emissions forecast that relies on the accuracy and availability of historical data (Ellerman & Joskow, 2008). If the availability and quality of data are poor, large deviations between the forecast and actual conditions might occur. For example, in Phase 1 (2005-2007) of the EU ETS, a lack of accurate data resulted in an overestimation of future emissions, thereby resulting in an overallocation of allowances and a significant decrease in allowance prices (Trotignon & Delbosc, 2008). In addition, the uncertainty of poor-quality data was significant compared with the size of the (generally small) emissions abatement target, and this made the target difficult to achieve (Betz & Sato, 2006). In other words, enterprises' emissions could fall below the cap simply because the emissions data they had provided were inaccurate.
Second, limited data can restrict the choice of the allowance allocation method. Compared to grandfathering, benchmarking is better for enterprises with historically low emissions intensity than for those with high intensity, but it requires access to a large amount of installation-level data (Groenenberg & Blok, 2002; Zhang, Wang, & Da, 2014; Zhou & Wang, 2016). Benchmarks established on inadequate installation-level data would have heterogeneity problems (Buchner, Carraro, & Ellerman, 2006). The resulting deviations of allocations under heterogeneous benchmarks from actual conditions would be too great for the benchmark to gain widespread acceptance in a sector.
Some studies mention data uncertainty in the national CO2 emission inventories of China (Shan et al., 2018) and recommend improving the authenticity and accuracy of government statistical data (Zeng et al., 2018). Studies of data issues regarding China's ETSs have mainly focused on optimizing MRV mechanisms for carbon emissions data and improving data completeness and quality (Chen, Chen, Bao, & Zhang, 2014; Falconer et al., 2013; Zheng, Liu, & Wang, 2015). However, these studies lack systematic analyses of the data requirements for the design and operation of China's national ETS or the challenges that are faced, for example, the data required during the design and operation of the scheme, the data's availability and quality, and gaps in this regard. In practical terms, China released data reporting requirements for the national ETS in 2014, 2016 (NDRC Office, 2016) and 2017 (NDRC Office, 2017) along with national guidelines for 24 industries (NDRC Office, 2013). In addition, each ETS pilot has released its own MRV guidelines, which it has implemented for the local ETS for 4-5 years. However, the MRV system for the national ETS is still under development through learning by doing. Nevertheless, these early experiences provide useful practical points of reference for this article.
This article analyses the major data requirements for building and operating China's national ETS and the existing data foundation in China, identifying the challenges that China faces in terms of data availability and quality. By drawing on the experiences of international ETSs and experiences at the national and pilot levels in China, the article also describes solutions to cope with the challenges and ensure the availability and quality of relevant data for China's national ETS.

Qualitative analysis
Qualitative analysis was conducted for research on data requirements, existing data, data challenges and corresponding solutions. This analysis was based on the authors' professional experience in the field, a literature review, and consultation with relevant experts in government, representative enterprises, verification agencies and consulting organizations. The literature review was based on peer-reviewed papers and 'grey literature', such as research reports and policy documents from relevant Chinese government departments and from implemented ETSs, such as the EU ETS, California ETS, China's national ETS and ETS pilots.

Quantitative analysis
Quantitative analysis for this paper was based on field investigation of data relating to enterprises in China. To understand data availability and quality more precisely for the enterprises to be covered by the national ETS, 75 enterprises in seven proposed covered sectors of China's national ETS (NDRC Office, 2016) were investigated. To ensure that they were representative, we used a random and stratified sampling method, referring to the categories (A, B and C) in the EU ETS that determine the minimum tier requirements (European Commission, 2012a). We investigated 3-4 enterprises in each category (Table 1). The enterprise data were for the years 2015-2016.
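The sampling design described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the category thresholds (A up to 50,000 t CO2e per year, B up to 500,000 t, C above that) follow the EU Monitoring and Reporting Regulation rather than anything stated in this article, and the enterprise records, field names and the function `stratified_sample` are hypothetical.

```python
import random
from collections import defaultdict


def eu_category(annual_emissions_t: float) -> str:
    """Classify by annual emissions (t CO2e) using the EU ETS A/B/C thresholds."""
    if annual_emissions_t <= 50_000:
        return "A"
    if annual_emissions_t <= 500_000:
        return "B"
    return "C"


def stratified_sample(enterprises, per_stratum=3, seed=0):
    """Draw up to `per_stratum` enterprises at random from each (sector, category) stratum."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    strata = defaultdict(list)
    for enterprise in enterprises:
        key = (enterprise["sector"], eu_category(enterprise["emissions_t"]))
        strata[key].append(enterprise)
    sample = []
    for key in sorted(strata):
        members = strata[key]
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample


# Hypothetical population: names and emissions are illustrative only.
population = [
    {"name": f"power-{i}", "sector": "power", "emissions_t": e}
    for i, e in enumerate(
        [20_000, 30_000, 45_000, 48_000, 120_000, 300_000, 900_000, 2_000_000]
    )
]
sample = stratified_sample(population, per_stratum=3)
```

Sampling within each sector and size category, rather than from the whole population at once, keeps small and large emitters both represented even when one group dominates numerically.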
To assess the data availability of enterprises, a scorecard was developed based on the data types and level requirements reviewed in section 3. Scorecards are a common tool for performance rating in a business context (Kaplan & Norton, 1996) and have also been tested as a means of measuring the quality of national GHG inventory systems (Neeff et al., 2017). The detailed scoring rules in this study are presented in the Appendix. A higher score for a specific type and level of data means higher availability.
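As a rough illustration of how such a scorecard could be aggregated, the sketch below averages per-enterprise scores for one data type and level and expresses the result as a percentage. The actual scoring rules are those in the Appendix and are not reproduced here; the 0-1 scale, the function name `availability_score` and the example values are all assumptions.

```python
def availability_score(item_scores):
    """Return the availability score (%) for one data type and level.

    `item_scores` holds one value per enterprise between 0 (not available)
    and 1 (fully available). The 0-1 scale is an assumption of this sketch.
    """
    if not item_scores:
        raise ValueError("no enterprises scored")
    return 100.0 * sum(item_scores) / len(item_scores)


# Hypothetical scores for four enterprises on one data item.
calorific_value_installation = [1, 1, 0.5, 0]
score = availability_score(calorific_value_installation)
print(f"{score:.1f}%")
```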
Data quality was reflected in monitoring frequency and data quality control measures. We investigated the proportion of enterprises with different monitoring frequencies and the proportion of enterprises that had external certification involving data quality control measures, including relevant standards, such as ISO10012 (International Standardization Organization [ISO], 2003), ISO14001 (ISO, 2015) and ISO50001 (ISO, 2011), and laboratory qualifications that are certified by national or provincial industry associations in China. Implementing quality control measures for data monitoring is a prescribed recommendation in the above-mentioned standards and qualifications.

Major data requirements for China's national ETS
The main purpose of collecting data for China's national ETS is to support coverage determination, cap setting, allowance allocation and compliance. Detailed mechanisms for China's national ETS are still being designed, whereas general principles for key issues such as the approach to coverage, cap and allowance allocation have been established.
Data requirements will be investigated from two perspectives: data type and data level. Data type refers to categories of data. In this context, the categories of data are divided into six groups: (1) production data: product type and output; (2) emissions data: emissions volume and its calculation parameters, for example, type and activity data (e.g. consumption) for energy and materials, calorific value, carbon content and oxidation rate; (3) technology data: data on manufacturing and emissions-reduction technologies, for example, type of technology, process flow data, capacity, mass balance diagram; (4) management data: data on enterprise internal management, for example, ownership structure, plant layout, control rights, organization structure; (5) economic data: for example, GDP, energy price, enterprise profit, output value and cost; (6) policy data: information on major policies that will influence the emissions of a region/sector/enterprise, for example, regional/sectoral economic development and mitigation targets, key construction projects, government studies on climate change, population. The data level in this context refers to the 'fineness' of the data: data at the region/sector level refer to data for the whole region/sector, the enterprise level refers to data for an individual enterprise as a whole, and the installation level refers to data for an individual installation.

Coverage determination
Coverage determination in this context involves determining the coverage of the ETS in terms of gas type, emissions activity and sector, and identification of individual enterprise boundaries.

Determination of ETS coverage: gas type, emissions activity and sector
For China's national ETS, coverage has largely been set already. The covered gas is CO2, and covered emissions activities are direct emissions from combustion activity, process emissions from some sectors (e.g. cement) and indirect emissions from using electricity and heat. The proposed sectors to be covered are power generation, petrochemicals, chemicals, non-metallic minerals, non-ferrous metals, steel, papermaking and aviation (NDRC Office, 2016). The priority of the sector to be capped and allocated allowances can be different, for example, the power generation sector is covered first in the initial stage of China's national ETS, and other sectors will be covered at subsequent stages. The priority can be determined by considering several factors for a sector, such as the scale of emissions, carbon reduction potential, enterprise affordability and relevant policy. The emissions scale indicates the emission control responsibility for a sector, which can be estimated by emissions data at the regional and sector level. Emissions-reduction potential refers to technically feasible emissions reductions, which can be assessed based on a comparison between the best available technology and the covered enterprises' status quo of emissions and manufacturing and carbon reduction techniques. Therefore, corresponding data requirements involve data on the best available technology and data on technology and emissions at the installation level of covered enterprises (California Air Resources Board [California ARB], 2010a). Affordability refers to the financial feasibility of emissions control for an enterprise, which can be estimated based on economic data at the enterprise level, such as the profit, output value and cost of covered enterprises. Relevant policy refers to policy related to sectoral economic development trends or mitigation responsibility, which can be reflected in policy data.

Identification of an enterprise boundary
The boundaries of enterprises include organizational and operational boundaries (ISO, 2006). 'Organizational boundary' refers to the scope of installations that are controlled or owned by an enterprise. 'Operational boundary' refers to the emissions gas type and activities covered by the ETS and associated with the enterprises' operations. In terms of the data types required, management and technology data of an enterprise could facilitate the identification of boundaries. In terms of data level, installation-level data are needed because the condition of ownership, control rights, emissions gas type and activity will possibly vary across installations.

Cap setting
Two approaches are used in cap setting: top-down and bottom-up. For China's national ETS, the cap will be determined by combining the two approaches, but mainly by the bottom-up approach.

Top-down approach
A cap determined with the top-down approach is based on emissions and mitigation targets, and could typically be determined according to Equation (1). According to Equation (1), historical emissions of certain base years (e.g. the past three or five years) or projected emissions data, and the reducing ratio for the ETS, are needed. ETS emissions can be determined by aggregating emissions collected at the enterprise level. The determination of base years, projected emissions and the reducing ratio involves determining an appropriate cap stringency, which requires extra supporting data as follows. Policy choices should be made between the 'ambitious' mitigation target and feasible emissions-reduction measures to determine the cap stringency. The 'ambitious' mitigation target could be set based on projected emissions and emissions-control responsibility. The data requirements for projection differ according to the projection method, for example, expert judgement, trend extrapolation, the ENERGY 2020 and E-DRAM models used in the California ETS (California ARB, 2010b) and the PRIMES and GAINS models used in the EU ETS (European Commission, 2018). Corresponding data requirements may include but are not limited to production, emissions, technology, economic and policy data ranging from the region/sector level to the enterprise or installation level. Emissions-control responsibility can be assessed based on emissions and policy data. Feasible emissions-reduction measures can be assessed technically and financially based on emissions-reduction potential (California ARB, 2010c) and the affordability of the ETS enterprises, as stated in Section 3.1.1. After the policy choices are made and a feasible mitigation target is set, the base years and reducing ratio can be determined.
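Equation (1) itself is not reproduced in this text. Purely as an illustration, the sketch below assumes one simple form that is consistent with the inputs named above: the cap equals a baseline (here the mean of base-year emissions; a projection could be substituted) multiplied by one minus the reducing ratio. The function name `top_down_cap`, the averaging of base years and all figures are assumptions, not the scheme's actual rule.

```python
def top_down_cap(base_year_emissions, reducing_ratio):
    """Assumed form: cap = mean(base-year emissions) * (1 - reducing ratio)."""
    if not base_year_emissions:
        raise ValueError("at least one base year is required")
    if not 0.0 <= reducing_ratio < 1.0:
        raise ValueError("reducing ratio must be in [0, 1)")
    baseline = sum(base_year_emissions) / len(base_year_emissions)
    return baseline * (1.0 - reducing_ratio)


# Hypothetical aggregated ETS emissions for three base years (Mt CO2),
# obtained by summing enterprise-level emissions.
cap = top_down_cap([100.0, 104.0, 108.0], reducing_ratio=0.05)
print(f"cap = {cap:.1f} Mt CO2")
```

Under this assumed form, any error in the base-year emissions carries over proportionally into the cap, which is why the accuracy of enterprise-level data matters for cap stringency.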

Bottom-up approach
A cap determined with the bottom-up approach is mainly based on the sum of allowances that are allocated to market participants according to allocation rules. Therefore, the data requirement in the case of a bottom-up approach is mainly determined by the requirements for the allowance allocation.

Allowance allocation
Allowance allocation methods mainly include auctioning and free allocation, with free allocation approaches mainly including benchmarking, intensity-based grandfathering and emissions-based grandfathering (Pang & Duan, 2016). In China's national ETS, benchmarking will be used as the first priority approach for allowance allocation. Data requirements vary according to different allowance allocation approaches. No data collection is needed for auctioning (Harrison & Radov, 2010). For free allocation, the following are typical equations for the three approaches:
Emissions-based grandfathering: Allowance = historical emissions × allocation factor
Benchmarking: Allowance = historical or actual production × benchmark
Intensity-based grandfathering: Allowance = historical or actual production × historical intensity × allocation factor
A benchmark refers to an emissions level per production, which is typically determined in accordance with average performance or best practice (European Commission, 2011; Groenenberg & Blok, 2002). Historical intensity can be considered as a benchmark determined by the enterprise's own historical emissions level. The allocation factor in this context means a factor reflecting the allowance shortage compared to historical emissions or intensities, for example, a number between 0 and 1 (Pang & Duan, 2016).
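The three free allocation equations can be written out directly, which makes it easier to see which data each approach needs. The sketch below is illustrative only: the function names and all numbers (production, emissions, benchmark and allocation factor) are hypothetical.

```python
def emissions_based_grandfathering(historical_emissions, allocation_factor):
    """Allowance = historical emissions x allocation factor."""
    return historical_emissions * allocation_factor


def benchmarking(production, benchmark):
    """Allowance = historical or actual production x benchmark."""
    return production * benchmark


def intensity_based_grandfathering(production, historical_intensity, allocation_factor):
    """Allowance = historical or actual production x historical intensity x allocation factor."""
    return production * historical_intensity * allocation_factor


# Hypothetical enterprise: 1,000 units of output and 900 t CO2 in the base year.
historical_production = 1_000.0
historical_emissions = 900.0
historical_intensity = historical_emissions / historical_production  # 0.9 t CO2 per unit
actual_production = 1_100.0
benchmark = 0.8           # t CO2 per unit, assumed sector benchmark
allocation_factor = 0.95  # a number between 0 and 1

eg = emissions_based_grandfathering(historical_emissions, allocation_factor)
bm = benchmarking(actual_production, benchmark)
ig = intensity_based_grandfathering(actual_production, historical_intensity, allocation_factor)
print(f"emissions-based: {eg:.1f}, benchmarking: {bm:.1f}, intensity-based: {ig:.1f}")
```

In this example the enterprise's own intensity (0.9) is above the assumed benchmark (0.8), so benchmarking gives it fewer allowances than intensity-based grandfathering at the same output; an enterprise with intensity below the benchmark would see the opposite.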
According to the equations, benchmarking and intensity-based grandfathering require historical or actual production data categorized according to a benchmark or intensity. In addition, corresponding emissions of certain base years are also needed to calculate a benchmark or intensity. In China's national ETS, actual production-based benchmarking will be used as the first priority. However, to enable allowance trading, historical production will also be used for allocating free allowances to enterprises before actual production information is available. Therefore, both historical and actual production data are needed. For emissions-based grandfathering, historical emissions of certain base years are needed. In terms of the data level, benchmarking and intensity-based grandfathering require installation-level data (Buchner et al., 2006) and emissions-based grandfathering requires enterprise-level data. To determine the acceptable setting of the benchmark (e.g. average or best practice), base years, allocation factors and allocation stringency should be considered, which requires supporting data similar to those used for the cap stringency determination mentioned in section 3.1.

Compliance
The compliance process involves determining the number of allowances that enterprises must surrender. The data requirements are mainly emissions data at the enterprise level.

Existing ETS-related data in China
According to the data requirements summarized in section 3, we studied the availability of data from three perspectives: government data, conventional data collection regulations and internal enterprise data in China. Government data refer to public or non-public data in government. The conventional data collection regulations refer to institutionalized requirements for the enterprise to collect and report data to the government. Internal enterprise data refer to data from within enterprises, including data both reported and not reported to the government. Furthermore, we explored the quality of existing data in China from two perspectives: data quality control regulations and internal enterprise data quality control measures.

Data availability
In China, the major existing data collection regulations touching upon ETS-related data include the statistics system managed by the National Bureau of Statistics (NBS) and the energy data system managed by NDRC. Table 2 shows the data required for the ETS that are already available from government data and existing data collection regulations, with detailed sources. We find that production data at the region/sector/enterprise level, part of the emissions data (energy type, activity data and calorific value) at the region/sector/enterprise/installation level, and part of the technology, economic and policy data are already collected by the government or requested by regulations in China.
As for internal enterprise data management, there are two major data systems within Chinese enterprises: financial and manufacturing data systems. The financial data system consists of data for financial purposes, such as cost accounting and trade settlement. In contrast, the manufacturing data system consists of data that are monitored during the manufacturing process for production control. In terms of data type, the ETS-related data collected in the financial data system include the production and part of the emissions (energy type and activity data), management and economic (enterprise profit, output value, cost) data, whereas the manufacturing data system includes the production and part of the emissions (energy and material type, activity data, calorific value, carbon content) and technology data. Few enterprises in China have continuous emissions monitoring systems that directly monitor carbon emissions. In terms of the data level, the financial data system contains data at both the enterprise and installation levels, whereas the manufacturing data system is usually at the installation level. The data availability investigation results (Table 2) show that the investigated enterprises scored 100% for the production and part of the emissions (energy and material type and activity data), technology and management data at both the enterprise and installation levels, and economic data at the enterprise level. For the calorific value of fuel, the investigated enterprises scored 78.1% at the enterprise level and 53.9% at the installation level. In terms of the carbon content of materials (including parameters used to calculate the carbon content) related to process emissions, enterprises scored as high as 98.3% at the enterprise level and 88.4% at the installation level. By contrast, enterprises only scored 5.1% in fuel carbon content at the enterprise level.

Data quality
In terms of data quality, we first reviewed regulations for the data measurement of enterprises. A comprehensive regulation system for data quality control, especially for energy data, has developed in China, from legal to technical documents. Table 3 shows laws, regulations and technical standards related to data quality control in China. First, China has enacted the Law on Measurement and regulations regarding measuring instrument calibration, inspection and responsibility for irregularities. Second, in terms of energy data measurement, China has enacted the Law on Energy Conservation, including implementing regulations and supporting technical standards. Key energy-using enterprises are required to be equipped with measurement instruments of specified accuracy according to the technical standards (NDRC et al., 2011) and to have conducted an external evaluation of their energy management system (NDRC, Certification and Accreditation Administration, 2012). Further, China has also established an energy conservation supervision system at the provincial, city and county levels to supervise enterprises' adherence to the laws, regulations and mandatory technical standards of energy conservation.
Second, in the field investigation, we also found that enterprises have made substantial efforts in data quality control. In terms of monitoring frequency (Table 4), the proportions of enterprises that measure the calorific value of a fuel (mainly coal) and the material's carbon content once a day or more frequently, in accordance with the mandatory requirements, are 83.0% and 77.3%, respectively. This frequency is higher than the EU's corresponding requirement (4-6 times a year for most categories of fuel and material) (European Commission, 2012a). In terms of the certification of data quality control measures, the proportion of enterprises that utilized external certification reached 59.0%.

Notes to Table 2: '√' means the information is collected or requested by the policy, 'partly √' means the requirements are optional (i.e. the enterprise can provide information according to its actual circumstances), '×' means not requested and '-' means not applicable. b Industrial Division of NBS (2016). c NBS (2017a). d Energy audit reports involve data on production, energy consumption, calorific value and technology (technology type, process flow, capacity, energy balance diagram) at enterprise and installation levels (General Administration of Quality Supervision, Inspection and Quarantine, 1997). Conducting an energy audit is optional for key energy-using enterprises but compulsory for those that fail to achieve the annual energy-saving targets assigned by the government (State Council, 2011).

Data challenges related to establishing China's National ETS
In comparing the existing databases in China to the aforementioned data requirements, we found that most data are available. There is also a solid foundation for data quality control in China, both in regulations and in internal enterprise management. However, challenges exist because data availability differs among enterprises and data quality remains uncertain.

Differences in the availability of ETS-related data in Chinese enterprises
There are differences between sectors in terms of data availability. Using calorific value as an example (Table 5), the scores for the power, cement and papermaking industries are higher, whereas the scores are lower for the non-ferrous metals, petrochemical and chemical industries. Multiple reasons lead to these differences. First, sectors have different technological characteristics; several examples follow. Sectors with more combustion activity and purchased fuel pay more attention to monitoring calorific value because it is an important factor for setting the fuel price when trading, and it reflects the nature of the fuel for better internal management of combustion efficiency indicators. For the power generation sector, combustion emissions usually comprise over 99% of total emissions, and the calorific value is a parameter for calculating the net coal/gas consumption rate for electricity and heat supply, which is one of the most important production efficiency indicators for power plants and is usually strictly controlled and managed. The papermaking sector is similar because most medium and large-scale enterprises in China are equipped with self-owned power plants, which are the main emission sources for the sector. In contrast, for the petrochemical sector, especially in large-scale enterprises, the proportion of emissions related to combustion activity is relatively low. By-products, such as refinery dry gas, are usually used as fuel input but are less likely to be measured for their calorific value, because by-products are produced internally rather than purchased, and the calorific value is not a necessary parameter for internal management. In addition, sectors with complicated fuel connections among installations, such as the steel and petrochemical sectors, are less likely to measure calorific value at the installation level because this brings a higher monitoring cost.
The second reason is enterprise scale. Taking emissions as an indicator of enterprise scale, the average emissions of enterprises in the power, cement and petrochemical sectors are larger than those in other sectors. Large-scale enterprises tend to monitor calorific value at both the enterprise and installation levels, because their energy management systems are more complete and they can afford higher monitoring costs.
Therefore, considering the broad range of economic sectors and the large number of related enterprises of different scales that exist in China, we can infer that there are clear differences in data availability among the various enterprises.

Uncertainty in the quality of existing data in Chinese enterprises
There are still uncertainties in data quality for certain types of currently available data, which can be reflected in the following three ways.
First, part of the existing data has not been verified by third-party verifiers. Energy auditing involves third-party verification of energy use, involving data on production, energy type, activity data, calorific value and technology, but it is only compulsory for enterprises that fail to achieve annual energy conservation targets assigned by the government. The production, energy type, activity data, calorific value and economic data reported in the data systems managed by NBS and NDRC are checked by the authorities themselves. If the data are abnormal, the enterprises will be required to give explanations or amend the data, but verification by a third-party verifier is rarely required. In addition, reporting of part of the emissions (material type and activity data, carbon content), technology and management data to the government is not required, so there is probably no external check for those data.
Second, uncertainty can result from the existence of multiple sources of the same data within an enterprise. One type of data can come from different sources, including different measurement points (e.g. entering the enterprise or the furnace), samples on a different basis (e.g. as-received or air-dried basis), different internal data systems (e.g. financial or production management) and different sources of evidence (e.g. internal reports or third-party examination reports). The numerical values of data from different sources may differ. Data deviations can result when data sources are mismatched or when incorrect data sources are used. An example is presented in Table 6: the carbon emissions calculated from a calorific value on an as-received basis are 8.3% lower than those calculated from a calorific value on an air-dried basis, mainly because the moisture content of coal differs according to its state.
Third, uncertainty regarding data quality persists because some regions in China have insufficient experience with monitoring and reporting (MR) of ETS-related data. Although enterprises in the seven ETS pilots have gained MR experience by collecting ETS-related data for five years, limited MR capacities have been found for enterprises in the other (more than twenty) provinces because the MR of ETS-related data is a new process in these areas, and it takes time for the vast number of enterprises involved to train staff to a sufficient standard. Limited MR capacities could lead to mistakes and non-standard MR practices; thus, data quality may not be well ensured.
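The effect of mismatched data sources can be illustrated with a small calculation. The sketch below assumes a common form of the calculation-based approach (emissions = activity data × net calorific value × carbon content per unit of energy × oxidation factor × 44/12) and compares pairing as-received coal consumption with an as-received versus an air-dried calorific value. All parameter values are hypothetical and are not the values behind Table 6.

```python
CO2_PER_C = 44.0 / 12.0  # mass ratio of CO2 to carbon


def combustion_emissions(activity_t, ncv_gj_per_t, carbon_content_tc_per_gj, oxidation_factor):
    """Return combustion emissions in t CO2 from fuel quantity and its parameters."""
    return (
        activity_t
        * ncv_gj_per_t
        * carbon_content_tc_per_gj
        * oxidation_factor
        * CO2_PER_C
    )


# Hypothetical inputs: coal weighed as received, with calorific values
# reported on two different bases.
coal_as_received_t = 100_000.0
ncv_as_received = 20.0   # GJ/t, moisture included, so lower
ncv_air_dried = 22.0     # GJ/t, part of the moisture removed, so higher
carbon_content = 0.026   # t C per GJ, assumed
oxidation_factor = 0.98  # assumed

matched = combustion_emissions(
    coal_as_received_t, ncv_as_received, carbon_content, oxidation_factor
)
mismatched = combustion_emissions(
    coal_as_received_t, ncv_air_dried, carbon_content, oxidation_factor
)
relative_gap = (matched - mismatched) / mismatched
print(f"as-received result relative to air-dried result: {relative_gap:.1%}")
```

Because the other parameters are identical, the relative gap depends only on the ratio of the two calorific values, so a guideline that fixes the basis of each data source removes this deviation entirely.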

Suggested solutions to the data challenges related to establishing China's national ETS
To meet the data-related challenges mentioned in section 5, we propose short-term strategies and long-term solutions based on our analysis of existing data along with the experiences of international ETSs and at the national and pilot levels in China. The short-term strategies aim to lower the technical and administrative difficulties associated with collecting data at the initial stage, which can be used under conditions where time, capacity, funding and other resources are limited. Long-term solutions are fundamental ways to cope with the challenges and obtain accurate ETS-related data, which involve two solutions in technical terms and two in administrative terms. Meanwhile, we will point out the differences between our suggestions and the existing practice at the national level to indicate some directions for the future establishment of an MRV system for China's national ETS.
6.1. Short-term strategies
6.1.1. Establish data collection guidelines based on existing data systems in China and provide default values
Data collection guidelines can be established by referring to existing statistical and energy data systems, as long as the data requirements in these systems are consistent with ETS-related data. For example, the definition and counting method of production in a statistical system can be referenced, which makes reporting easier for enterprises because the rules are similar. In addition, for the emissions data that some enterprises still lack, such as calorific value, carbon content and oxidation factors, default values can be provided in the guidelines.

6.1.2. Covering major emissions or sectors with a better data foundation prior to other emissions or sectors in ETS
Among the existing data in China, we found that energy data are better established than other data, as indicated by the high availability of energy consumption data at the enterprise and installation levels. A specialized and comprehensive energy data quality control policy system has been established and implemented for a relatively long time in China. In addition, energy emissions are the major emissions of many sectors, such as the power, papermaking, non-ferrous metal (aluminium calendaring), and iron and steel (short steel-making process and steel rolling) sectors. Therefore, energy emissions, and sectors whose major emissions are from energy, can be covered at the initial stage of the national ETS; other emissions and sectors can be covered at subsequent stages.
6.2. Long-term measures
6.2.1. Introducing the tier concept into data collection guidelines
Once enterprises that lack emissions data are provided with default values, they should be guided to conduct actual measurements to improve data availability and quality over the long run. Moreover, allocation methods may be adjusted by competent authorities, thus changing the corresponding data requirements. Introducing the 'tier' concept into data collection guidelines can help guide enterprises in improving data availability and quality and in adapting to changing requirements. The tier concept refers to setting tiers that represent different requirements for data measurement or for data types and levels.
For example, the actual measurement requirement can be divided into tiers that represent increasing monitoring effectiveness and cost. Enterprises with a lower monitoring capacity can start with lower tiers of actual measurement requirements, i.e. methods with lower effectiveness and cost, and gradually progress to the higher tiers. Enterprises with larger amounts of emissions can be required to adopt high tiers because the uncertainty of their emissions data can have a more substantial influence. In practice, a tier system with different methods and accuracy levels for different source streams and installations is applied in the EU ETS (European Commission, 2012a).
Another example is that a tiered reporting framework with different levels (e.g. enterprise, installation and equipment levels) can be set to accommodate differences in data availability and in data requirements at each level (e.g. different allocation methods). In this framework (Figure 1), an enterprise is considered to be composed of a certain number of installations, and installations are composed of a certain number of pieces of emissions equipment (e.g. boiler, calciner). Enterprises that cannot report data at the installation level can initially report at the enterprise level and gradually progress to the installation level. However, for enterprises whose allocations are based on benchmarking or intensity-based grandfathering, reporting at the installation level is compulsory. Furthermore, data at the equipment level are still necessary to, for example, explore benchmarks for different equipment. In practice, a three-tiered reporting framework is implemented in the Guangdong ETS pilot (Guangdong DRC, 2017b).
Figure 1. Schematic diagram of a tiered reporting framework with three levels.
For national guidelines, the tier concept can be applied more widely. For example, in the national guidelines for the power generation sector, carbon content must be measured every month. However, in China, only a limited number of enterprises are able to meet this requirement. Therefore, tiers can be set, for example, tier 1 (every 4 months) for power generation units emitting less than 1 million tons CO2 (tCO2), tier 2 (every 2 months) for power generation units emitting 1-5 million tCO2 and tier 3 (every month) for power generation units emitting over 5 million tCO2.
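The tiered measurement frequency suggested above for power generation units can be illustrated with a minimal Python sketch. The thresholds and frequencies are the example values from the text; the names (Tier, carbon_content_tier, measurements_per_year) are hypothetical, and the treatment of a unit emitting exactly 1 or 5 million tCO2 as tier 2 is an assumption, since the text does not specify the boundary cases.

```python
from dataclasses import dataclass

# Example thresholds from the text, in tCO2 per year.
TIER_1_UPPER = 1_000_000  # below this value: tier 1
TIER_2_UPPER = 5_000_000  # up to and including this value: tier 2 (assumed boundary rule)


@dataclass(frozen=True)
class Tier:
    level: int
    months_between_measurements: int


# Tier 1: every 4 months, tier 2: every 2 months, tier 3: every month.
TIERS = {1: Tier(1, 4), 2: Tier(2, 2), 3: Tier(3, 1)}


def carbon_content_tier(annual_emissions_tco2: float) -> Tier:
    """Return the illustrative tier for a power generation unit."""
    if annual_emissions_tco2 < 0:
        raise ValueError("annual emissions cannot be negative")
    if annual_emissions_tco2 < TIER_1_UPPER:
        return TIERS[1]
    if annual_emissions_tco2 <= TIER_2_UPPER:
        return TIERS[2]
    return TIERS[3]


def measurements_per_year(tier: Tier) -> int:
    """Number of carbon content measurements required in a year."""
    return 12 // tier.months_between_measurements
```

Under these assumptions, a unit emitting 800,000 tCO2 per year would fall in tier 1 and measure carbon content three times a year, whereas a unit emitting 6 million tCO2 would fall in tier 3 and measure monthly.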

6.2.2. Clarify detailed data sources in guidelines and introduce monitoring plans
Data sources with detailed rules should be established in guidelines to avoid deviations in emission-calculation results caused by the misuse of, or changes in, data sources. First, the correct data source should be clarified in detail. At least five elements related to the data source should be clearly defined. The first element is the monitoring standard, such as the sampling, testing and calculation methods. The second element is the monitoring frequency, including minimum requirements for time and mass frequency. The third element is the accuracy and calibration of the measurement equipment that monitors the data. The fourth element is the source of evidence. For example, activity data can be determined based on a continual metering of the process (usually recorded in manufacturing data systems) or based on aggregation of metering of quantities separately delivered, taking into account relevant stock changes (usually recorded by financial data systems). Emissions factors (calorific value, carbon content, etc.) can be determined from the evidence of internal or third-party laboratories, and internal laboratories can be classified as uncertified or certified by different accreditation bodies. The fifth element is the state wherein the data are measured. The relevant activity data, production and emissions factors should be used in consistent states when calculating emissions (European Commission, 2012b). For the example in section 5.2, the correct calculation uses coal consumption and calorific value in the same state of moisture (numerical differences for the same data from different monitoring points and sample bases mainly result from moisture differences).
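The fifth element, consistency of measurement states, can be sketched as a simple check performed before emissions are calculated. The sketch below is illustrative only: the calculation form (activity data × calorific value × carbon content per unit of energy × oxidation factor × 44/12) is a commonly used form for fuel combustion and is assumed here rather than taken from this section, and all names and numerical values are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Basis(Enum):
    AS_RECEIVED = "as-received"
    AIR_DRIED = "air-dried"


@dataclass(frozen=True)
class Measurement:
    value: float
    unit: str
    basis: Basis  # moisture state in which the value was measured


def coal_emissions_tco2(consumption: Measurement,
                        calorific_value: Measurement,
                        carbon_t_per_gj: float,
                        oxidation_factor: float) -> float:
    """Combustion emissions in tCO2, refusing mixed moisture states.

    consumption is in t of coal and calorific_value in GJ/t; both must be
    on the same basis, otherwise moisture differences distort the result.
    """
    if consumption.basis is not calorific_value.basis:
        raise ValueError(
            f"inconsistent states: consumption is {consumption.basis.value}, "
            f"calorific value is {calorific_value.basis.value}"
        )
    energy_gj = consumption.value * calorific_value.value
    carbon_t = energy_gj * carbon_t_per_gj * oxidation_factor
    return carbon_t * 44.0 / 12.0  # convert t of carbon to t of CO2
```

With hypothetical inputs of 1000 t of coal and 20 GJ/t, both on an as-received basis, 0.026 tC/GJ and an oxidation factor of 0.98, the function returns about 1868.5 tCO2; pairing an as-received consumption figure with an air-dried calorific value raises an error instead of silently producing a deviating result.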
Second, enterprises should be required to draw up and implement monitoring plans to prevent them from distorting emissions via changes in the aforementioned data sources. A monitoring plan is mainly an ex-ante record of the data sources of an enterprise. Changing data sources requires modification of the monitoring plan, which usually requires re-examination by competent authorities and third-party verifiers. Monitoring plans are commonly required in the EU ETS, the California ETS, China's ETS pilots and data collection for China's national ETS.
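As a rough illustration of how a monitoring plan works as an ex-ante record, the following Python sketch stores the five data-source elements described above and lists the elements that differ between the plan and what an enterprise actually reports. The class and function names, and all field values in the usage note, are hypothetical; real monitoring plans contain far more detail.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class DataSource:
    """The five data-source elements recorded ex-ante in a monitoring plan."""
    monitoring_standard: str   # sampling, testing and calculation methods
    monitoring_frequency: str  # minimum time or mass frequency
    equipment_accuracy: str    # accuracy and calibration of equipment
    source_of_evidence: str    # e.g. internal or third-party laboratory
    measurement_state: str     # e.g. as-received or air-dried basis


def changed_elements(planned: DataSource, reported: DataSource) -> List[str]:
    """Names of elements where the reported source deviates from the plan."""
    return [
        f.name
        for f in fields(DataSource)
        if getattr(planned, f.name) != getattr(reported, f.name)
    ]
```

If, for instance, the plan records a third-party laboratory and an as-received basis while the report relies on an internal laboratory and an air-dried basis, changed_elements returns ["source_of_evidence", "measurement_state"], which would signal that the plan must be modified and re-examined before the data are accepted.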
For national guidelines, some of the requirements for data sources are yet to be clarified. For example, monitoring standards for production data, monitoring frequency requirements for the cement, steel, non-ferrous metal and papermaking sectors, certification requirements for internal laboratories, as well as requirements on the state wherein the data are measured, are still lacking.

6.2.3. Collect data and conduct MR capacity building in advance
Before enterprises are covered by an ETS, data collection as well as MR capacity building can be carried out for a number of years to improve the availability and quality of databases and to minimize data differences among enterprises. For example, the NDRC released data reporting requirements for the national ETS in 2014, 2016 and 2017. The sector scope of these requirements extends beyond the power generation sector. Meanwhile, seven national ETS capacity-building centres have been established. It is urgent to improve the MRV capacity of enterprises, for example by establishing complete internal management systems for ETS-related data.

6.2.4. Establish a rigorous third-party verification system
Third-party verification is an effective and common method for quality assurance of the ETS-related data provided by enterprises and has been adopted by major operating ETSs (EU ETS, California ETS, China's ETS pilots, etc.) and in data collection for China's national ETS. A rigorous third-party verification system involves quality control measures at the ex-ante, in-process and ex-post stages.
The ex-ante quality control process is mainly focused on formulating strict regulations and tools. A rigorous third-party verification agency access system should be implemented that defines the requirements for verification agencies in terms of, for example, qualification, capabilities and personnel quality (Beijing DRC, 2016; Shanghai DRC, 2012; Shenzhen Market and Quality Supervision Commission [Shenzhen MQS], 2014). In terms of tools, verification guidelines and a complete set of reporting and verification templates and checklists should be developed to help minimize data uncertainty and ensure the validity of verifications (Beijing DRC, 2016; Guangdong DRC, 2017a; Hubei DRC, 2014; Shanghai DRC, 2012; Shenzhen MQS, 2012).
In-process quality control refers to the spot examination of verification agencies by an accreditation body and the establishment of a system for filing complaints against third-party verification agencies that violate the regulations.
Ex-post quality control should involve establishing a checking system, including expert review, random inspection and spot counterchecks, to re-examine the data that are verified by third-party verifiers. Further, a data quality assessment and penalty system for enterprises and verifiers is also necessary. With such a system, data problems can be identified and resolved, and enterprises and verifiers that are in violation of the relevant regulations can be penalized.
In data collection for China's national ETS, referential qualification requirements for third-party verifiers and verification guidelines have been released, but these are not compulsory for third-party verifiers. The access system, in-process spot examination for verifiers, and data quality assessments and penalty systems for enterprises and verifiers are yet to be established.

Conclusion
This article systematically reviewed the data requirements for ETS, based on the experience of existing ETSs and recent progress in China's national ETS, dividing the data into six groups, i.e. data on production, emissions, technology, management, economy and policy. There are several differences between China's national ETS and existing ETSs in other countries, including requirements for data on indirect emissions from purchased electricity and heat, actual production data used for compliance, and data for intensity-based grandfathering. Comparing the data requirements with the existing data in China from the perspective of government data, conventional data regulations and enterprise internal data, we find that data are generally available, except for energy carbon content and oxidation factors. China also has a solid foundation for data quality control. However, challenges exist for collecting data for China's national ETS because there are differences in data availability and uncertainty in data quality. Therefore, we propose short-term strategies and long-term solutions, both technical and administrative, to cope with these challenges. China's national ETS is under construction, and one of the key tasks is to build a reliable MRV system to obtain accurate data. The systematic review of data requirements, investigation of existing data, and identification of challenges in this article are highly relevant to policy-makers' concerns. The proposed solutions, with detailed explanations and examples, could facilitate the establishment of an MRV system for China's national ETS, as well as similar newly developing ETSs around the globe.
Notes
1. The authors of this paper were deeply involved in the design and data collection of China's national ETS and the Guangdong ETS pilot.
2. The EU ETS classifies installations in three different monitoring categories: Category A: average annual emissions are equal to or less than 50,000 tCO2(e); Category B: average annual emissions are equal to or less than 500,000 tCO2(e); Category C: average annual emissions are more than 500,000 tCO2(e) (European Commission, 2015).
3. We refer to the threshold of tiers and de-minimis source streams in the EU ETS because there were no similar thresholds that could be referenced in China's ETSs.
4. The 'reducing ratio' refers to the rate at which the cap is tightened compared with the historical/projected emissions.
5. Key energy-using enterprises are enterprises with annual energy consumption levels that exceed 10,000 tons (5000 tons in some sectors and regions) of coal equivalent, which is close to the threshold for enterprises to be regulated by China's national ETS.
6. Because MRV guidelines at the national level are not yet implemented in ETS operation, we used the Guangdong Enterprise CO2 Monitoring and Reporting Guideline, which is the official MRV document that has supported the Guangdong ETS pilot for the last five years. It was first released in 2013 and has been revised three times.