A decadal inversion of CO2 using the Global Eulerian–Lagrangian Coupled Atmospheric model (GELCA): sensitivity to the ground-based observation network

Abstract We present an assimilation system for atmospheric carbon dioxide (CO2) using a Global Eulerian–Lagrangian Coupled Atmospheric model (GELCA), and demonstrate its capability to capture the observed atmospheric CO2 mixing ratios and to estimate CO2 fluxes. With the efficient data handling scheme in GELCA, our system assimilates non-smoothed CO2 data from observational data products such as the Observation Package (ObsPack) data products as constraints on surface fluxes. We conducted sensitivity tests to examine the impact of the site selections and the prior uncertainty settings of observation on the inversion results. For these sensitivity tests, we made five different site/data selections from the ObsPack product. In all cases, the time series of the global net CO2 flux to the atmosphere stayed close to values calculated from the growth rate of the observed global mean atmospheric CO2 mixing ratio. At regional scales, estimated seasonal CO2 fluxes were altered, depending on the CO2 data selected for assimilation. Uncertainty reductions were determined at the regional scale and compared among cases. As measures of the model–data mismatch, we used the model–data bias, root-mean-square error, and the linear correlation. For most observation sites, the model–data mismatch was reasonably small. Regarding regional flux estimates, tropical Asia was one of the regions that showed a significant impact from the observation network settings. We found that the surface fluxes in tropical Asia were the most sensitive to the use of aircraft measurements over the Pacific, and the seasonal cycle agreed better with the results of bottom-up studies when the aircraft measurements were assimilated. These results confirm the importance of these aircraft observations, especially for constraining surface fluxes in the tropics.


Introduction
Carbon dioxide (CO 2 ) is a major greenhouse gas and the most important contributor to anthropogenic climate change. Before the industrial revolution, the atmospheric CO 2 exchange with natural carbon reservoirs (land and ocean) was largely in balance, in the absence of human influences. However, the combustion of fossil fuels (coal, natural gas and oil), as well as certain industrial processes and land-use changes, has considerably increased since the pre-industrial era. The current level of CO 2 in the atmosphere has increased by nearly 40% compared to the level in the pre-industrial era (Conway and Tans, 2014). Currently, about half of the extra CO 2 that modern human 2 T. SHIRAI ET AL. spatial and temporal CO 2 variability in the vicinity of variable sources and sinks is quite challenging. Global Eulerian models with high spatial resolution have a high computational cost. One way of obtaining higher resolution flux estimates within a region of interest is to use a 'zoomed' or 'nested' atmospheric transport model Peylin et al., 2005). The idea of coupling two different types of models for global and regional modelling for inversion was introduced by Rodenbeck et al. (2009), and Trusilova et al. (2010) implemented this idea as a coupled system consisting of TM3, a global Eulerian atmospheric transport model and the Stochastic Time-Inverted Lagrangian Transport (STILT) regional Lagrangian model. Rigby et al. (2011) implemented a global inverse model with zoom over several regions resolved with a regional Lagrangian transport model NAME. Lagrangian particle dispersion models (LPDMs) are an effective tool for simulating observations at high spatial and temporal resolutions (Lin, 2012). Lagrangian models have minimal numerical diffusion, which is inherent in Eulerian models. LPDMs have been coupled with numerical weather prediction (NWP) models and used extensively in air-pollution dispersion modelling (Uliasz, 1993). Recently, coupled LPDM/NWP models, such as the coupled Weather Research and Forecasting-Stochastic Time-Inverted Lagrangian Transport (WRF-STILT) model, have been used for a wide range of applications, including surface flux estimates by carbon cycle studies (Gerbig et al., 2003;Gourdji et al., 2010;Nehrkorn et al., 2010;Pillai et al., 2011). Ganshin et al. (2012) developed the Global Eulerian-Lagrangian Coupled Atmospheric model (GELCA) based on a framework introduced by Koyama et al. (2011). GELCA combines two transport models: The National Institute for Environmental Studies-Transport Model (NIES-TM) version 8.1i Belikov et al., 2013), a Eulerian global transport model, is coupled with FLEXPART version 8.0 (Stohl et al., 2005), a LPDM. The global background mixing ratio field generated by NIES-TM is used as time-variant boundary conditions for FLEXPART, which performs backward simulations from each receptor point (observation location). GELCA has demonstrated better performance in resolving short-timescale variations compared with NIES-TM only (Koyama et al., 2011;Ganshin et al., 2012).
In this paper, we introduce a global CO 2 inverse system using GELCA and we evaluate the performance of the GELCA inverse modelling system in estimating decadal global monthly CO 2 flux distributions. As constraining observation data, we used an ObsPack data product, which includes actual data (whereas GLOBALVIEW contains only processed data), to take full advantage of the coupled modelling approach, which can effectively make use of measurements reflecting CO 2 exchange along a local path or footprint as well as measurements representing hemispheric-scale background air. We examine the activities have released into the atmosphere has been absorbed by the land biosphere and oceans (Ciais et al., 2010a). Although global land and ocean carbon sinks increase with rising atmospheric CO 2 , the Intergovernmental Panel on Climate Change Fifth Assessment Report stated with high confidence that global warming will reduce the sinks and partially counterbalance the equilibrium. It is thus urgent to understand the current status and trends of CO 2 exchange between land, ocean, and atmosphere so that the potential impacts of ongoing global climate change on the carbon cycle can be assessed.
Inverse modelling is one approach to quantifying the spatiotemporal distribution of sources and sinks at the Earth's surface; this approach starts from a set of atmospheric mixing ratio observations by using an atmospheric transport model and sophisticated statistical inversion schemes (Ciais et al., 2010b). Global Eulerian models have been used extensively for global CO 2 inversion (e.g. Gurney et al. (2004) and references therein). Initially, Eulerian models with low spatial resolution (starting from 10° × 10° in the 1980s) were able to reproduce the seasonal cycle of global atmospheric CO 2 mixing ratios reasonably well. At that time, observational network was much less abundant and most observations were made at weekly to monthly intervals. In 1996, the GLOBALVIEW-CO 2 data product was provided by the US National Oceanic and Atmospheric Administration (NOAA) Earth System Research Laboratory (ESRL) (http://www.esrl.noaa.gov/gmd/ccgg/globalview/). This data product contains extended records of CO 2 with a regular temporal distribution, derived from high-precision atmospheric measurements such as those from the World Data Centre for Greenhouse Gases (http://ds.data.jma.go.jp/gmd/wdcgg/introduction.html) of the World Meteorological Organization Global Atmospheric Watch program and the Carbon Dioxide Information and Analysis Center (CDIAC; http://cdiac.esd.ornl.gov). The observational records in GLOBALVIEW products are free of temporal gaps and have been extensively used by many carbon cycle models. Recently, spatial observational coverage has been expanding as more vertical profiles and better horizontal coverage become available from aircraft and satellite measurements, and measurement frequency has been getting higher as more continuous measurements are being made at surface stations, including tower sites (Bruhwiler et al., 2011;Saeki et al., 2013;Houweling et al., 2015). Models have been developed that are able to handle the higher frequency but irregular data-sets, and such models have started to use actual data for inversion (e.g. Rodenbeck et al., 2003;Chevallier et al., 2010;Chevallier et al., 2011). NOAA ESRL released a new set of observation data products in 2012 as a successor to GLOBALVIEW, called Observation Package (ObsPack) data products (Masarie et al., 2014).
To derive regional surface flux information, highfrequency observations that represent hourly to synoptic variations are particularly useful. Nevertheless, simulating fine sensitivity of the inverse system to the data selection by comparing inversion results among five different subsets of the ObsPack data product.

GELCA coupled atmospheric model
A schematic diagram of the GELCA inverse modelling framework is shown in Fig. 1. We implemented the coupling at temporal boundaries instead of spatial boundaries. Two-day backward-transported particles modelled by FLEXPART were combined at the end points with the background CO 2 levels 2 days prior to the observations simulated by NIES-TM. The mixing ratio C x r , t r at the receptor location x r at time t r can be expressed as the sum of near-site contributions calculated by FLEXPART and the background contributions calculated by NIES-TM.
(1) C x r , t r = C near_site x r , 0 ≤ t r − t ≤ 2 days FLEXPART simulates the backward transport of 10,000 particles released from each receptor point (observation location). C near_site x r , 0 ≤ t r − t ≤ 2 days is calculated by integrating the sensitivity of CO 2 mixing ratio to the surface fluxes (footprint) along 2-day trajectory paths of all particles. C background x r , t r − 2 days is the average of the CO 2 mixing ratios at the time of coupling simulated by NIES-TM, weighted by the number of the end points of the back-trajectories contained in each model grid cell. Detailed description about the Eulerian-Lagrangian coupling is given in Ganshin et al. (2012).
The duration of the backward calculations was set to two days to be consistent with the timescale of particles leaving the mixed layer (Gloor et al., 2001). Note that coupling with a Lagrangian model might not result in a significant improvement, compared with use of a pure Eulerian model, for remote sites, because numerical diffusion has a significant impact on the simulated mixing ratios at the receptor only if there are inhomogeneous sources or sinks near (less than about two days upwind of) the receptor. Supplementary Figure 1 shows the examples of footprints in winter and summer for one of the observational data-sets used in this study, from which we point out the following features. Firstly, the distribution of observation sites mostly determines the footprint coverage, making North America and Europe fairly

Inversion scheme
For long-lived trace gases such as CO 2 , the assumption that atmospheric mixing ratios respond linearly to changes in emissions holds well. Under the assumption of linearity, the relationship between a vector of observed values (z) and that of sources and sinks (s) can be expressed in matrix form as (2) z = Hs + v, well covered compared to other regions. Secondly, the footprint coverage varies significantly with the wind as well. In general, the coverage widens in winter compared to summer due to the stronger winds during winter in middle and high latitudes. The wind direction is important as well. For example, in East Asia, in winter, the wind blows dominantly from the Siberian High towards the Pacific Ocean, whereas it blows dominantly from the Pacific High towards the continent in summer. Since most observation sites are located around the east side of the continent, more surface flux signal can be captured from the continental East Asia in winter than in summer (Supplementary Figure 1).
The meteorological fields driving both models were taken from the Japan Meteorological Agency Climate Data Assimilation System (Onogi et al., 2007), which has a regular horizontal Fig. 2. Illustration of the inversion process employed in this study. The t indicates the time step on monthly basis. The modelled CO 2 concentrations z mod are sum of the background concentrations z b and the presubtracted concentrations z p calculated by GELCA. In each inversion cycle, the modelled concentrations are compared to observations z ob and the state vector s is optimized within a 3-month window. Optimized fluxes are incorporated into the background concentration (z ′ b ) before calculating for the next time step. The number of asterisks in the upper right of s shows how many times a set of monthly fluxes has been optimized previously from past cycles. The prime in the upper right of z b means that the z b has been updated. The dashed arrows mean monthly calculations by GELCA.
where the Kalman gain matrix is In a batch mode inversion, all non-observed parameters are estimated using all available observations simultaneously at each solution step. When the number of observations and source regions increases, the matrix of basis functions H becomes very large, and the computational cost becomes very large. To avoid this large computational cost, we employed the fixed-lag Kalman Smoother optimization technique (Bruhwiler et al., 2005) to minimize J(s) in Equation (4) rather than a full-matrix batch mode inversion. In this technique, only a subset of the transport information is kept at each time step, because most of the signal from source regions decays within a few months to half a year. The time window of the transport information kept is called the lag length. We used a lag length of three months based on the results of the numerical experiments performed by Maksyutov et al. (2009) on the influence of various time windows. The detailed description about the fixed-lag Kalman smoother applied for atmospheric inversion is given in Bruhwiler et al. (2005).
The inversion process employed in this study is illustrated in Fig. 2. The modelled CO 2 concentrations z mod are sum of the background concentrations z b and the presubtracted concentrations z p calculated by GELCA. The calculation of z b is started from the initial CO 2 mixing ratio 3D field based on an ensemble of forward simulation results by six different transport models: Gap-filled and Ensemble Climatology Mean (Saito et al., 2011). Details about the prior fluxes used to calculate z p are given in the next section. In each inversion cycle, the modelled concentrations are compared to observations z ob and the state vector s is optimized with a 3-month window. With the response functions prepared by GELCA, posterior fluxes from step t are where H is a matrix of the sensitivities of observations to changes in emissions or initial conditions and v represents the model-data mismatch error, which includes both observational and model errors. The sensitivity of the observations to emission fields can be decomposed into two parts for the coupled model: The term H near_site represents the sensitivity of the observations at a particular site to emissions surrounding the site as calculated by FLEXPART. The term H background represents the sensitivities to background emissions (i.e. the impact of emissions beyond the immediate vicinity of the site), which are estimated by NIES-TM.
Using the Bayesian approach, the measure of the fit between modelled source strengths s and observed values z is expressed as a cost function J(s), assuming that s, z, and their uncertainties can be described as Gaussian probability density functions: where s p is the vector of the prior source strength, R is the observation error covariance matrix and Q is the prior source strength error covariance matrix. The prior covariance structure describes the uncertainties of each regional flux, and the correlation in space of the regional fluxes. In the current study, we assumed a diagonal prior covariance matrix, which means that estimated fluxes were assumed to show no correlation. At the minimum of J(s), the posterior source strength vector s and the posterior covariance matrix Q′ are expressed as The prior flux uncertainty for land regions and oceanic regions were prescribed as the mean standard deviation of the monthly NEE calculated by VISIT for the past 30 years  and the mean standard deviation of the oceanic flux assimilated by OTTM for the period 2001-09.

Atmospheric CO 2 observational data
In this study, the global atmospheric CO 2 data are from the package version obspack_co2_1_PROTOTYPE_v1.0.3_2013-01-29, hereafter called the ObsPack product, which includes actual CO 2 measurement data from multiple observation platforms, including towers, aircraft, and ships, contributed by 22 laboratories from around the world. Quality control of data in the ObsPack products is left to the data providers, which means that the criteria for data selection are not uniform across each product. Most of the data are provided by the data providers as 'representative of site,' indicating that the data have been selected to represent large, well-mixed air masses. When there was more than one laboratory conducting the same type of measurements during the same time period at a given site, we chose only one (priority was given to NOAA). For tower sites, which provide data from multiple sampling altitudes, we used only data from the highest level as representative of the boundary layer mixing ratio. The programmable flask package, an automated grab sampler (Turnbull et al., 2012), was categorized as a flask sampler in this study. The details of each measurement technique are available elsewhere (e.g. Gomez-Pelaez and Ramos, 2011;Stephens et al., 2011). All sites used in this study are listed in Table 1.
The mean annual values of RSD were used as elements of the data mismatch error covariance matrix. The RSD for corresponding sites are provided in the obspack_co2_1_ GLOBALVIEW-CO2_2013_v1.0.3_2013-05-24 product (GLOBALVIEW-CO2, 2013). For the sites that are not included in the GLOBALVIEW product, we used the average RSD values of all other sites over a latitudinal zone of 20° and an altitudinal level of 1 km. These RSD values were also used in data filtering described in Section 2.2. The minimum uncertainty value was set to 0.25 ppm.
In this study, we conducted sensitivity tests for different cases, each consisting of different site/data selections and observation uncertainties, to determine the impact of the observation settings on the inversion results. We prepared five cases, a control case data-set and four different subsets of the control case. The control case used all of the sites listed in Table 1, whereas the other four cases included only selected sites (indicated by checkmarks in the four right-hand columns of the table). The number of sites and types of data used in each case are shown in Table 2. A total of 154 sites were used in the control case, including 35 continuous measurement sites and 27 aircraft sites. Among 35 continuous sites, data from 29 sites were pretreated to give the 'afternoon mean' and 'night-time mean' that is the average value of 12-16LT and 2-5LT, respectively. We calculated from the optimized state vector, and incorporated into the background concentration for step t + 1.
In the inversion process, we applied a criteria to filter outliers from data-sets. We deselected data points for which the modeldata mismatch exceeded three times of the annual value of the residual standard deviation (RSD) around the smooth-fit curve of the measurements at each site. These data-filtering criteria worked much more effectively in keeping as many data while filtering obvious outliers than eliminating data points with a larger model-data mismatch than a certain fixed value because the filtering condition is nicely adjusted according to the normal variability of CO 2 records at each site.
The calculation period was from January 2001 to December 2011. The first year was considered to be a spin-up period. Fluxes were solved monthly for 64 regions: 42 land regions and 22 ocean regions (Fig. 3).

Prior CO 2 flux estimates and their uncertainties
As prior CO 2 fluxes, we used daily terrestrial biosphere fluxes, monthly oceanic fluxes, monthly fossil fuel CO 2 emissions, and monthly biomass-burning emissions. The spatial resolution of all prior fluxes used in this study was 1° latitude × 1° longitude. The fluxes from the biosphere, the oceans, and fossil fuel burning were developed for the NIES Level 4 data product of the Greenhouse gas Observing SATellite (GOSAT) project; detailed descriptions are available in Maksyutov et al. (2013).
For daily CO 2 exchange between the terrestrial biosphere and the atmosphere, Net Ecosystem Production (NEP) of the Vegetation Integrative SImulator for Trace gases (VISIT) process-based biosphere model was used (Ito, 2010). The physiological parameters of the VISIT model were optimized by the method described by Saito et al. (2014).
The monthly ocean-atmosphere CO 2 exchange was calculated by an ocean pCO 2 data assimilation system (Valsala and Maksyutov, 2010) based on an ocean offline tracer transport model (OTTM) (Valsala et al., 2008). The OTTM was coupled to a simple biogeochemical model that synthesizes the surface ocean pCO 2 and air-sea CO 2 flux by a variational assimilation method.
Fossil fuel emissions, which were imposed in forward and inverse calculations, were obtained from the Open-source Data Inventory of Anthropogenic CO 2 (ODIAC) emission data-set (Oda and Maksyutov, 2011); emission estimates were based on CDIAC the country-level estimates (~2008) and to the year 2008 emissions were projected up to 2011 by using data from the British Petroleum Statistical Review of World Energy ( British Petroleum, 2012). The emissions data-set used in this study are available from the NIES web site (http://db.cger.nies. go.jp/data-set/ODIAC/).
Prior estimates of CO 2 emissions from biomass burning were taken from the Global Fire Emissions Database version 3.1 van der Werf et al., 2010). used both afternoon and night-time means in the control case. We used only data collected at 00:00 UTC and 12:00 UTC values when continuous data were provided at an hourly time step, which was the case for 3 JMA sites (MNM, RYO, and YON in Table 1). Since these sites are 'marine boundary' sites, we considered diurnal cycles were not significant. Among 27 aircraft sites, 26 are vertical profiles at certain locations except CONTRAIL (CON; Comprehensive Observation Network for Trace gases by Airliner) (Machida et al., 2008;Matsueda et al., 2015) of which we used data from a specific sampling mode ASE (Automatic Air Sampling Equipment) that sampled at certain latitudes during the level flight along a nearly fixed route between Narita and Sydney/Brisbane. For CON, we aggregated the data by 5 latitude bin between 30N and 25S, whereas for other aircraft sites, we aggregated the data by vertical bins. The interval of the vertical bins varied from 0.5 to 2 km, mostly following the interval used for the corresponding site in the GLOBALVIEW product, 2013. Case CT used 90 surface sites, including 22 continuous measurement sites and a shipboard site but no aircraft sites. This case was named Case CT because the selected sites are those used by CarbonTracker North America (CT2011_oi), a CO 2 measurement and modelling system developed by NOAA (Peters et al., 2007). For Case CT, a prior observation uncertainty was assigned to each observation site according to the categories defined by Peters et al. (2005); these uncertainties ranged between 0.75 ppm (marine boundary layer) and 7.5 ppm (difficult sites). Case NF used 61 surface flask sites in the NOAA ESRL Cooperative Global Air Sampling Network (Dlugokencky et al., 2013); it included no continuous-measurement or aircraft sites. The case NF was named meaning 'case NOAA Flasks'. In this study, a sensitivity test was first conducted using the control case, Case CT, and Case NF. The observation locations of these three cases are shown in Fig. 4. Case SEL and Case NA were then defined on the basis of the inversion results obtained in the first sensitivity test. For Case SEL, three sites that showed large model-data mismatch values were removed from the control case. The name SEL means that the 'data selection' is applied. For Case NA, all aircraft data were removed from Case SEL. NA stands for 'no aircraft data'. Details of these two cases are explained in Sections 3.5 and 3.6.

Global budget/trend
Decadal time series of the annual CO 2 fluxes estimated by inversion using the five different observation data-sets (five cases) described in Section 2.4 are shown in Fig. 5. The global net fluxes into the atmosphere are also plotted against the global atmospheric CO 2 growth rate derived directly from the observed CO 2 (Dlugokencky and Tans, 2014)  The time series of the global net fluxes agreed well among the five cases and were generally consistent with the time series of the observed growth rate with respect to both year-to-year variations and annual mean values ( Fig. 5(a)). The interannual variability of the net fluxes appeared to be strongly correlated with the variability in the land CO 2 flux, shown in Fig. 5(b). The large interannual fluctuations of the land flux correspond to El Niño-Southern Oscillation (ENSO) phases ( Fig. 5(d)). High growth rates of the CO 2 mixing ratio in 2003,2005,2007 and 2010 were likely due to reduced CO 2 uptake by land during El Niño phases (Jones et al., 2001;Knorr et al., 2007;Mabuchi, 2013). The low land CO 2 uptake in 2002 is considered to be due to global dry condition during the period (Knorr et al., 2007). The interannual variations of the estimated land flux is in phase with the ensemble results of nine dynamic global vegetation models (Le Quéré et al., 2013) and with the atmospheric inversion results of an ensemble of 11 transport models (Peylin et al., 2013) and 7 transport models in which GELCA is included as well (Thompson et al., 2016). The increasing tendency of the land CO 2 sink in the early 2000s ( Fig. 5(b)) was also reported by Peylin et al. (2013). The effect of ENSO events on the ocean CO 2 flux (Fig. 5(c)

Regional flux distributions
The spatial distributions of the decadal mean CO 2 fluxes during 2002-2011 of the control case, Case CT, and Case NF are shown in Fig. 6. Although the global net fluxes agreed well among these three cases, at regional scales, we can see differences in the estimated CO 2 fluxes among them due to the different observational data used in each case. The three inversion results share some features in common, such as increased uptake in temperate South America and boreal Eurasia and increased emissions the air-sea CO 2 flux in the Pacific Ocean (Ishii et al., 2014), an association of interannual variation in the tropics with ENSO events was suggested by diagnostic models and ocean general circulation models, but it was not clear in the results of atmospheric inversions. Since global interannual variability of land fluxes is generally larger than that of oceanic fluxes, it is more challenging for atmospheric inversions to resolve the global interannual variations of oceanic fluxes without interference from larger atmospheric CO 2 fluctuations mainly caused by land fluxes. of Case CT and Case NF changed the flux to negative. In the control case, which had more observational constraints than the other two cases, the flux was estimated to be positive. The reason for this difference is discussed in Section 3.6. Table 3 shows the number of data used in the inversion in these three cases. The control case used twice as many observational data as Case CT and six times as many as Case NF.
in tropical South America and south-western Europe, compared to the prior fluxes. These regions, except for south-western Europe, are poorly constrained. The results for tropical Asia (Region 33) are interesting. The decadal mean CO 2 flux from this region was positive (emission) in the control case, but negative (uptake) in both Case CT and Case NF. Considering that the prior flux in this region is positive, the observational constraints Europe (regions 40 and 42) is barely constrained owing to fewer observation sites. Therefore, the high UR in Transcom3 'Europe' is due mainly to the denser observation network in Western Europe.
Just on the basis of the number of observations, the control case would be expected to constrain the regional flux estimation much better. However, not only does the effectiveness of the inversion depend on the amount of observational data, but it also depends strongly on the spatial (and temporal) coverage of the observation sites. We evaluate the effectiveness of the inversion using two indicators, the uncertainty reduction (UR) and the model-data mismatch, in the following sections.

Uncertainty reduction
UR is a measure commonly used to evaluate the effectiveness of observational constraints in different regions. UR is defined as the relative difference between the prior and posterior flux uncertainty: where post and pri are the quadratic means of the posterior standard deviation and the prior standard deviation, respectively. By definition, the more the posterior error is reduced relative to the prior error, the closer to 1 UR becomes, which means that more information from observations is provided to the inversion. Figure 7 shows the UR calculated for each region in each of the three cases. UR is higher in land regions in the northern mid-high latitudes, where observations are the most abundant in the framework of the current surface observation network, whereas UR is lower in the poorly covered tropical Northern Hemisphere and the whole Southern Hemisphere. The global pattern of the UR distribution is consistent with the UR distributions reported by Chevallier et al. (2010), who conducted an inversion at both grid scale (3.75° × 2.5° longitude × latitude) and regional scale (22 Transcom3 regions distributed worldwide).
The control case showed higher UR than Case NF and Case CT in all regions, and the difference was significant in East Asia and southern Europe, where the control case had better data coverage. Case CT had strong constraints in North America, which is the target of the CarbonTracker North America project ( Fig. 7(b)). Outside of North America, however, Case CT had slightly lower URs than Case NF. Considering that most of the stations included in Case NF were also in Case CT, this UR difference may be due to the relatively larger prior observation uncertainty values assigned to sites outside North America, which resulted in the constraints in Case CT being weaker than those in Case NF.
The UR became more sensitive to the exact location of each observational station as the spatial scale became finer. For example, the Transcom3 'Europe' region is, as a whole, relatively well covered by observations, but in our land mask in which the Transcom3 'Europe' is divided into four sub-regions, Western Europe (regions 39 and 41) is well constrained with denser observation coverage, whereas Eastern

Model-data mismatch
The model-data mismatch is another measure used to evaluate the effectiveness of inversion results. We compared the forward simulation results using the optimized fluxes with observed CO 2 mixing ratios at the observation sites and calculated three measures of the model-data mismatch: the model-data bias, the root-mean-square error (RMSE), and the linear correlation. The model-data bias is a systematic mismatch between observations and model (model minus observations) throughout the observation period. The RMSE is an aggregated form of the residuals (the difference between simulated values and observed values). The correlation indicates the strength and direction of the linear relationship between model output and observed values. These three measures of the model-data mismatch calculated for each observation site in the control case are shown in Table 1, and the averaged values for all sites used in each case are shown in Table 3. Table 3(a) compares these measures among the control case, Case CT, and Case NF. In the control case, the global mean bias of 0.21 ppm was the smallest of the three cases. The mean RMSE was ±1.34 ppm for the control case, ±1.66 ppm for Case CT, and ±1.07 ppm for Case NF. The differences in RMSE may reflect the fraction of continuous data in each data-set, because the RMSE is affected by the higher variability of continuous data compared with flask data. The mean correlation coefficient R was 0.962, 0.958, and 0.974 for the control case, Case CT, and Case NF, respectively. The model-data correlations were high for all cases, indicating overall good performance of the GELCA inversion system.
The bias and RMSE for each site in the three cases are shown in Fig. 8. The observations were not well reproduced by the model at sites that showed high values of both bias and RMSE. Nine sites in the control case showed a bias larger than ±1 ppm: Heidelberg (HEI), Toronto (TOT), Bukit Kototabang (BKT), Black Sea (BSC), Lutjewad (LUT), Sutro Tower (STR), Hohenpeissenberg (HPB), Baltic Sea (BAL), and Point Arena (PTA). We investigated the reasons for the discrepancies between observations and simulations at these sites. Among these nine sites, three were probably strongly influenced by a local CO 2 flux such as urban emissions (HEI and TOT) or forest uptake (BKT). For sites located in cities or downwind of urban areas, the model often failed to reproduce sporadic sharp peaks in the observations. Continuous measurements inside urban areas (HEI and TOT) resulted in a significantly negative bias compared to background sites. LUT and STR often captured a high CO 2 plume transported from urban areas of The Netherlands and San Francisco, respectively. In the case of BSC, the observational behaviour has apparently been changing. The prominent seasonal cycle seen in the early 2000s gradually disappeared, and the frequency of significantly high mixing ratios increased in the late 2000s. These changes might reflect a change of either the surrounding environment in the vicinity and topographic features), and should be further investigated in future studies.

Data selection to reduce observational noise
Based on the results reported in Section 3.4, we designed a new subset called Case SEL to minimize noise from observations. To avoid strong local influences, data from BSC, HEI, and TOT were excluded from Case SEL. We also applied temporal data selection to seven continuous sites located near source or sink areas. Only afternoon averages were used from the tower sites Boulder Atmospheric Observatory (BAO), Moody (WKT), Beech Island (SCT), Park Falls (LEF), West Branch (WBI), and Walnut Grove (WGC), and the Pallas-Sammaltunturi (PAL) surface site (PAL), to exclude local extreme values in the stable boundary layer at night. In contrast, only night-time averages were used from a mountain site, Shenandoah National Park (SNP; 1008 m above sea level) to minimize the bias from (possibly increasing CO 2 sources) or the measurement system. When both the topography near a site and nearby source or sink distributions are complicated, the model tends to express a higher mismatch, as in the cases of HPB, BAL, and PTA. GELCA showed significantly better performance compared to NIES-TM for the site that require finer resolution than 2.5° grid of NIES-TM. For example, two European tower sites Ochsenkopf (OXK; 50.0°N, 11.8°E) and Pic du Midi (PDM; 42.9°N, 0.1°E) are located close to the border of the model grids in which the topography is rather complicated (on the top of mountain). Since NIES-TM cannot resolve the topographical change within a grid, the forward simulation doesn't fit observation well. On the other hand, GELCA handles the simulation in the vicinity of the observation sites with FLEXPART, resulting in much better fit at these difficult sites. The observed and simulated CO 2 time series for OXK and PDM are shown in Supplementary Figure 2. The performance of GELCA depends on site-specific conditions (e.g. source and sink distributions Fig. 11. Differences between estimated annual mean regional CO 2 fluxes from the (a) land biosphere and (b) ocean derived with and without aircraft observations (control case -Case NA) during 2002-11. The numbered regions are shown in Fig. 3. Positive fluxes indicate emission and negative fluxes indicate uptake. which reflects uptake by local vegetation, than of the daytime large-scale boundary condition. Thus, the net CO 2 uptake in tropical Asia in Case CT and Case NF may be largely due to the BKT observations.
In contrast to Case CT and Case NF, the control case yielded net CO 2 emissions in tropical Asia even though it used BKT data. The UR of the control case was higher not only in tropical Asia but also in the overall tropical southern Pacific Ocean, compared to Case NF and Case CT. This spatial distribution difference of UR suggests that net CO 2 emissions in tropical Asia in the control case might result from the observational local sources or sinks. Temporal data selection has been used in previous studies carried out since the TransCom Continuous experiment (Peters et al., 2007;Law et al., 2008;Patra et al., 2008;Chevallier et al., 2010). Figure 9(b) shows the inversion results for Case SEL. The decrease in biospheric emissions from southwestern Europe (region 39) compared to the control case is the most prominent feature, whereas the impact of Case SEL was not significant in south-eastern Europe (region 40). The decadal mean decrease of biospheric emissions was 0.076 ± 0.024 PgC/region/year in north-western Europe, and 0.040 ± 0.026 PgC/region/year in south-western Europe; both values correspond to a 41% change from the estimated regional fluxes in the control case. This result indicates that HEI, BSC and PAL significantly affected the inversion results for Western Europe. Estimation of finely distributed anthropogenic and natural sources and sinks in Western Europe may need higher spatial and temporal resolution of both prior fluxes and transport simulation. In contrast, in North America, there was no significant difference between the control case and Case SEL. This result shows that the temporal data selection of continuous tower observations and the removal of TOT did not significantly affect the flux estimation in North America.

Effect of aircraft observations on flux estimates in tropical Asia
Here we discuss the large difference in terrestrial biosphere fluxes from tropical Asia (region 33) among the prior and three posterior fluxes described in Section 3.2. The decadal mean flux and UR for this region in the control case, Case CT, and Case NF are shown in Fig. 10. In tropical Asia, only one observation site, BKT in Indonesia, was used in this study (Fig. 4). We set the observation uncertainty for BKT to 2.8 ppm for the control case and Case NF; this value is derived from the RSD values of the data record at site BKT in the ObsPack GLOBALVIEW product. For Case CT, the observation uncertainty for BKT was set to 7.5 ppm, which is the maximum uncertainty in the Car-bonTracker model, because of its relatively large model-data mismatch . The higher UR for Case NF than Case CT (Fig. 10) can be explained by the smaller prior uncertainty assigned to BKT as well as by additional constraints from the Western Pacific Cruise (WPC; shipboard observations in the western Pacific Ocean; Fig. 4(c)) during 2004, which may have detected flux signals from tropical Asia.
As shown in Fig. 8, BKT showed the largest positive bias among all sites used in this study. A similar large positive bias for BKT has been found by many other atmospheric inversion studies as well (e.g. CarbonTracker Team, 2014). The flask sampling at BKT is conducted on a weekly basis, usually around 14:00 LT, when the CO 2 hourly average mixing ratio reaches its minimum value (Nahas, 2012). Because the observation site is surrounded by a tropical rainforest, the samples may be more representative of the daily minimum mixing ratio, mixing ratios over the course of a month. The results shown in Fig. 12 indicate that the signal from surface fluxes in tropical Asia could be detected by aircraft observations in the mid/upper troposphere through vertical convection and the consequent rapid horizontal transport in the free troposphere. This active convection in tropical Asia as part of the Walker circulation must be a key process connecting surface fluxes and aircraft observations. We next examined the impact of the aircraft data on the seasonality of terrestrial biospheric fluxes from tropical Asia. The decadal mean seasonal cycle derived from the inversion using the aircraft data (control case) and the inversion without using the aircraft data (Case NA) are plotted with the prior flux in Fig. 13. The flux estimates in Case NA became significantly negative (sink) compared with the prior fluxes, whereas the seasonal estimates were mostly positive (source) in the control case. This might be due to the increased effect of the negative bias from BKT observation when we don't use aircraft measurements. A major difference in the estimated fluxes between the control case and Case NA was found during two periods: May-June and November-January. During May-June, the estimated flux was almost zero in the control case, whereas Case NA estimated a sink. During November-January, the control case estimated large emissions, but Case NA estimated much lower emissions in November-December and even uptake in January. Niwa et al. (2012) have also pointed out CO 2 emissions in tropical Asia during October-January in their atmospheric inversion for the period 2006-2008 using CONTRAIL data that were enhanced compared to an estimate made by using only ground-based data (GLOBALVIEW-CO 2 ). Niwa et al. (2012) used CONTRAIL CME (Continuous CO 2 Measuring Equipment) data, which were binned and monthly averaged after smoothing and gap-filling, and the inversion was conducted with the NICAM-TM (Nonhydrostatic Icosahedral Atmospheric Model-based Transport Model). We used only CON-TRAIL ASE data without preprocessing, and we conducted the decadal inversion by GELCA. The decadal inversion results in this study confirmed the strong impact of aircraft data on surface flux estimates in tropical Asia.
To further evaluate the seasonality of the estimated fluxes for tropical Asia, we compared our results with bottom-up studies. Among the limited number of bottom-up studies in this region, the seasonal cycle of NEP was estimated by continuous CO 2 flux measurements using the eddy covariance technique in tropical peat swamp forests in Central Kalimantan (Hirano et al., 2007(Hirano et al., , 2012. These estimates suggest that the CO 2 flux is positive during the rainy season (November-April) and the late dry season (August-October), whereas it is nearly neutral or slightly negative during the early dry season (May-July). The neutral flux in the early dry season and higher emissions during the early rainy season were also seen in the seasonal cycle of the control case (Fig. 13). The seasonal cycle of the control case agrees better with the results from the bottom-up study than constraints in the tropical southern Pacific Ocean, which were used only in the control case. These observational constraints are aircraft data such as Rarotonga (RTA; 21.25°S, 159.83°W) and CON, which were included only in the control case. Therefore, we hypothesize that the aircraft data affected the inverted flux for the tropical Asia region in the control case. The measurement periods of RTA and CON are 2001-2011. The frequencies of both observations are biweekly on average.
To test this hypothesis, we conducted another sensitivity test by removing all aircraft observations from the control case. Without aircraft data, the decadal mean regional flux in tropical Asia became negative ( Fig. 9(c)). This result supports our hypothesis that the aircraft data strongly constrained the CO 2 flux estimate in this region. However, the differences did not appear to be significant in the oceans and other land regions. To check the sensitivity to aircraft data in detail, differences between decadal mean regional fluxes estimated with ( Fig. 9(a)) and without (Fig. 9(c)) aircraft data are shown in Fig. 11. The flux difference in tropical Asia (region 33 in Fig. 11(a)) stands out among the regions. Among oceanic regions, the largest flux difference was found in South Pacific north (region 50 in Fig.  11(b)). This sensitivity analysis indicates that tropical Asia and its neighbouring ocean regions are the areas most sensitive to the aircraft data used in the inversion.
To investigate how surface fluxes from tropical Asia are transported, we calculated the distribution of atmospheric CO 2 at three vertical levels, approximately 990, 500, and 250 hPa, from monthly, pulse emission from the region (annual mean is shown in Fig. 12). We kept a constant CO 2 source (spatially distributed according to the multiple year mean of NEP from VISIT) in tropical Asia, and the transport model tracked its studies when aircraft measurements were used. These results confirm the importance of aircraft observations, especially in constraining surface fluxes in the tropics.
Overall, we found GELCA to be capable of handling various types of observations provided in ObsPack, and its performance in reproducing observed concentrations was good, with reasonably small model-data mismatches. The sensitivity studies indicated that the reduction of uncertainty in CO 2 flux estimation could be improved by expanding the observation network. In particular, the study results highlighted the impact of aircraft measurements over the Pacific on surface flux estimation in tropical Asia. This study evaluated the basic performance of GELCA as an assimilation tool for top-down CO 2 flux estimation. Studies are now underway, for example, to integrate more observations (e.g. satellite data) into GELCA and to analyse certain regional carbon flux estimations. Our future plans include optimization of GELCA's settings (e.g. the duration of backward simulation by FLEXPART, temporal/spatial resolutions, and preprocessing of certain types of data) according to the specific aims of an investigation.
Case NA. This result indicates that the inversion with aircraft data captures well the seasonal signals of the regional land biosphere. Both our inversion results and those of the top-down study by Niwa et al. (2012) agree better with independent bottom-up studies when aircraft data are included; thus, aircraft observations play a key role in constraining CO 2 flux estimates in tropical Asia.

Summary and conclusions
We presented an assimilation system for atmospheric CO 2 using GELCA, and we demonstrated its ability to capture observed atmospheric CO 2 mixing ratios and to estimate CO 2 fluxes. In this study, to take full advantage of the data handling efficiency of GELCA, we used non-smoothed observational data from ObsPack as constraints. ObsPack includes various types of direct atmospheric CO 2 measurements, continuous tower measurements, and aircraft measurements, provided by a large number of laboratories around the world.
We conducted sensitivity studies to examine the impact of the observation network setting on the inversion results and to optimize the site/data selection to minimize noise while optimizing the signal from the extensive observation data-set. We selected five different sets of sites/data from ObsPack: (1) comprehensive data-set (control case); (2) data selection conformed to the CarbonTracker North America project (Case CT); (3) data selection conformed to the NOAA ESRL Cooperative Global Air Sampling Network (Case NF); (4) data selection according to the model-data mismatch of the inversion results of the control case (Case SEL); and (5) Case SEL without aircraft sites (Case NA).
For all cases, the time series of the global net flux to the atmosphere were similar to that of the fluxes calculated from the growth rate of the observed global mean atmospheric CO 2 mixing ratio. At regional scales, estimated seasonal CO 2 fluxes were altered by the selection of assimilated CO 2 data. UR was derived at regional scale and compared among cases. In all regions, UR was higher in the control case than it was in Case CT and Case NF. Case CT showed considerably higher UR in North America, whereas outside of North America, Case NF showed slightly higher UR than Case CT. We employed three measures of model-data mismatch between the forward simulation results using the posterior fluxes and the observed CO 2 mixing ratios: the model-data bias, RMSE, and the linear correlation. For most observation sites, the model-data mismatch was reasonably small (global mean bias, 0.21 ± 0.03 ppm; mean RMSE, 1.38 ± 0.23 ppm; correlation coefficient R > 0.9 for 91% of all used sites). There were some sites with a larger model-data mismatch, caused mostly by local conditions. Surface fluxes in tropical Asia were found to be the most sensitive to the use of aircraft measurements in the inversion. The seasonal cycle agreed better with the results from bottom-up