The evolution of large-scale variations in globally averaged atmospheric CO2 concentrations since 1830

ABSTRACT The analyses of the cumulative sums of the observed yearly averaged atmospheric CO2 concentrations revealed unambiguously two change points during 0–2021. The first abrupt change that occurred during 1830 delineated the starting epoch of the pre-industrial era, which is marked by a linear increase in concentrations (0.24 ± 0.01 ppm/yr.) as described by the IPCC being driven by economic and population growth. Another notable change occurred during 1943 with the start of a uniform acceleration (0.028 ± 0.000 ppm/yr2) since then. These findings bring not only clarity and precision into the IPCC’s vague statement on the topic but also alleviates the bias introduced in estimating the trend (constant velocity) of the atmospheric concentrations, which is three times larger in magnitude (0.78 ± 0.01 ppm/yr.) for the period 1830–2021 if the uniform acceleration since 1943 is not accounted for. If the increased concentrations of CO2 before 1943 are predominantly caused by the climate system, i.e. of non-anthropogenic origin, then the concentrations will continue to increase with the constant velocity estimated in this study despite the efforts to limit the anthropogenic contributions that are the source of the uniform acceleration since 1943.


Introduction
The Intergovernmental Panel on Climate Change, IPCC, in its summary for policy makers stated that Anthropogenic greenhouse gas emissions have increased since the preindustrial era, driven largely by economic and population growth, and are now higher than ever [see, Figure 1]. This has led to atmospheric concentrations of carbon dioxide, methane and nitrous oxide that are unprecedented in at least the last 800,000 years. Their effects, together with those of other anthropogenic drivers, have been detected throughout the climate system and are extremely likely to have been the dominant cause of the observed warming since the mid-20th century (IPCC, 2014). Investigating the evolution of atmospheric CO 2 is therefore essential for establishing accurate models for their predictions for effective climate change risk assessments.
The input of CO 2 to the atmosphere by emissions from human activities, the growth rate of atmospheric CO 2 concentration, changes in the storage of carbon in the land and ocean reservoirs climate change and variability, and other anthropogenic and natural changes have been addressed by abundant number of studies. The studies by Friedlingstein et al. (2019) and Friedlingstein et al. (2020) documents exhaustively the global carbon budget and its temporal variations, and trends in the perturbation of CO 2 in the environment, referenced to the beginning of the Industrial Era (defined as 1750). Their works detail the components of the global carbon cycle over the historical period with focus since 1958, onset of atmospheric CO 2 measurements.
What is missing in past and current literature, however, is the holistic overview of their lumpsum progression over time, which is the focus of this study. Note that, globally averaged atmospheric CO 2 will be abbreviated onward as concentrations for brevity.
Although there will be no attempt to carry our any predictions of concentrations in this work, differentiating the term prediction global atmospheric CO 2 from scenario-based dynamic projections 1 is necessary because the models entertained here are intended to enable precise predictions of global atmospheric CO 2 . These models are purely kinematic and rely solely on past data to represent the retrospective and prospective progression of concentrations. In other words, they are purely empirical. Their effectiveness is therefore contingent on the assumption that the observed past data will continue into the future, which of course cannot be ascertained with high certainty. Nonetheless, the predictions to be made using kinematic model elucidated in this study are more precise than the scenario-based projections in the short run (months to years) and these predictions establish also baselines for projections based on different scenarios to verify them independently (İz, 2021). Moreover, any deviations of the predictions in the near future from the observed progression of concentrations operate as flags to quantify changes in natural variations in atmospheric CO 2 levels, as well as the effectiveness of limiting anthropogenic contributions if there were any mitigation efforts made in the interim. Although there are abundant number of scenario-based dynamic modelling of atmospheric CO 2 concentrations, this investigator has failed to find any study to meet this end. Only vague numbers are quoted in current literature as percentage changes over brief periods for the historical and recent kinematic evolution of atmospheric CO 2 (ex. NOAA, 2021).
In addition, the kinematic models used in this study quantifies, what is visually evident in global records, the nonlinear process unambiguously. Large-scale properties of the progression embedded concentration records are identified and modelled. The outcome enables unbiased estimation of the pertinent model parameters.
In the following sections, first, the time series for the observed yearly and monthly concentrations, are briefly discussed. The sections that follow are about the statistical properties of the records. They are followed by the analyses of the variants of the generic kinematic model that best fit to the historical concentrations.

Data: Global atmospheric CO 2 concentrations
Spline of the concentrations depicted in Figure 1 in parts per million, ppm, are based on ice core data before 1959 (Etheridge et al., 1996;MacFarling Meure et al., 2006). The yearly averages of direct observations from Mauna Loa, and the South Pole after and including 1959(Scripps CO2 Program, 2021. Monthly averaged concentrations for Mauna Loa and South Pole, shown in Figure 2 for the period 1958-2021, are also considered. It is known as Keeling curve (C. D. C. Piper Keeling et al., 2005). The records were derived from seasonally detrended monthly mean values for each station that have been filled with values from the spline fit when monthly averages are not available. Average values are centred on July first of each year (Etheridge et al., 1996).

A critical statistical property of the atmospheric CO 2 concentrations
A critical statistical property of the concentrations which has not been accounted for in recent studies is their high first-order autocorrelations (serial correlations), AR(1). This property appears in the following expression of the observed concentrations, which has a distinct systematic and a random component,  Keeling et al., 2005) and World popululation since 1850 (Worldometer, 2021).
where, CO2 Obs t are the observed monthly averaged concentrations. CO2 t refers to their systematic components with t ¼ 1 � � � n where n is the number of months. The random variable ε t represents the lumpsum effect of the disturbances subject to an underlying autoregressive process of first-order AR(1). Their presence is determined recursively through the residuals of various kinematic models discussed in the following sections.
The autoregressive property of the disturbances, ε t ; may be attributed to the averaging/smoothing of the data and/or intrinsically physical in origin caused by unmodeled variations in concentrations. The disturbances ε t of such yearly/monthly averaged concentrations can be represented as, In this expression, À 1 � ρ � 1 is the first-order autocorrelation coefficient. The underlying random variations, denoted by u t , have the following assumed distributional properties, This expression together with Equation (2) gives, (Toutenburg, 1982). Accounting for the effect of the AR (1) process is of utmost importance for the subsequent analyses. If they are not properly represented, pertinent statistics, such as the uncertainties of the estimated model parameters, will be underestimated causing Type I error, i.e. fail to reject the null hypothesis in the presence of a positive AR(1) effect. Although the solution statistics are not as important in this study because the estimated parameters are statistically significant under both scenarios, they still induce persistent excursions in the residual series that may be triggered by what is called in econometrics as innovations/shocks (İz & Shum, 2020), hence they confound the interpretation of the solution statistics. A recursively estimated AR(1) correlation coefficient is about ρ ffi 0:95 for both yearly and monthly atmospheric CO 2 concentrations. This effect is eliminated by differencing the subsequent observations as follows, The resulting differences as represented by the lefthand side of the yearly series, which will be called revised gradients, ΔCO2 Obs t , are shown in Figure 3.

An abrupt change in atmospheric CO 2 concentrations in 1830s
Figure 3 also reveals a visually evident abrupt change in yearly concentrations during 0-2021. Because the variability during the transitory period is also affected by the AR(1) effect, the use of revised gradients is more appropriate to determine the epoch of this event. The change point analysis using Cumulative Sum, CUSUM, of the concentrations described in the appendix are used to determine the timing of the change. The CUSUMs of the revised gradients of the concentrations revealed a sharp change in 1830 as shown in Figure 4 with a probability close to 100%. However, as one of the reviewers commented, the time resolution of carbon dioxide concentration records in ice cores is very coarse and only becomes more precise since 1835. Nonetheless, the potential imprecision in timing of the change within a few years will not hamper this study.
The following sections will focus on the observed concentrations starting 1830, during 1830-2021.

A generic kinematic model for systematic and random variations in atmospheric CO 2 concentrations
The following basic kinematic model (İz & Shum, 2020), and its various parametric combinations will be used to represent systematic and random variations in observed concentrations as a baseline, In this model, T 0 represents the initial epoch of the records, with the unknown reference concentration CO2 T 0 . The initial velocity at T 0 is denoted by ν T 0 , and a is the uniform acceleration/deceleration in concentrations. Note that because of the uniform acceleration, the velocity at each epoch t, denoted by ν t , is given by, The statistical properties of the autocorrelated random errors are defined through Equations (1)-(5). The following n � n variance/covariance, V/C, matrix, of the autocorrelated errors, ε t ; is denoted by �, where n is the number of yearly/monthly observed concentrations (İz, 2018a), The AR(1) effect decreases for increasing time lag because ρ j j < 1. The correlation between two random variables ε t andε tÀ τ is σ 2 ρ τ , where τ is the time lag. The above patterned V/C matrix has an analytical inverse, and is given by, where P is the n � n weight matrix. In this model, the initial autocorrelation coefficient ρ is estimated from the residuals of the ordinary least squares (OLS) solutions iteratively and used in a generalised least squares (GLS) solution in forming the V/C matrix of concentration measurements. The residuals of the GLS solution are then used to calculate a new AR(1), which is then used in another GLS solution till there are no changes in the a posteriori variance of the unit weight (the root mean squared error or the standard error, SE, of the solution). Iterations result in an optimal AR(1) estimate between adjacent disturbance terms to be used in the final GLS solution.

Model I: Yearly averaged atmospheric CO 2 concentrations during 1830-2021
The kinematic model of this section assumes that the concentrations during 1830-2021 has a uniform acceleration, given by Equation (6). Hence, the velocity varies throughout the series given by Equation (7). A GLS solution was carried out using the records during 1830-2021 with the initial epoch set at T 0 ¼ 1830, which is also the beginning of the yearly series and ρ ¼ 0:95. The residuals of the solutions are displayed in Figure 5. Although this model assumed that the rise in the concentrations were smooth and occurred uniformly during this period, the residuals plotted in Figure 5 reveal that despite the highly explanatory power of the model (Adj. R 2 = 95.7%), there are systematic unmodeled changes in concentrations before and after the critical epoch as determined by the   Table 1 for the period 1830-2021 under Model I. Figure 6 displays the observed yearly concentrations together with the fitted model.
However, the detection of a statistically significant change during 1943 in the residual concentrations is a violation of the assumed uniform acceleration. It is therefore informative to evaluate the degree of uniformity before and after 1943. In the following sections, the kinematic model described by Equation (6) Table 1 under Model II. In this solution, all the estimated parameters are statistically significant at α = 0.05% significance level and the model explains over 95% of the variation in the concentrations during this period.
The residuals shown in Figure (7) still exhibit unmodeled variability during 1830-1943. But they are an order of magnitude smaller than those from the  solution for the period 1830-2021. Fitted model concentrations (adjusted values) are also superimposed with the yearly averaged concentrations in Figure 7. The important finding of the solution is that Model I initial velocity and the acceleration estimates, 0.78 ± 0.01 mm/yr. 0.010 ± 0.000 mm/yr 2 , respectively, are markedly different than those from Model I solution 0.41 ± 0.02 mm/yr. and 0.031 ± 0.000 mm/yr 2 . This outcome suggests that the two different processes are at play during 1830-2021.
The following section investigates the kinematics of the concentrations after 1943 if this is indeed the case also for the period 1943-2021.

Model III: Yearly averaged atmospheric CO 2 concentrations during 1943-2021, after the change point in 1943
The same steps were followed, as in the previous section, to investigate the variability after 1943, during 1943-2021. The residuals of the GLS solution are displayed in Figure 8 together with the fitted model and yearly concentrations. Summary statistics of the GLS solution are given in Table 1 under Model III. Again, the residuals are an order of magnitude smaller than those during 1830-2021. Like the Model II solution for the period before 1943, the estimates for the initial velocity and the acceleration are 0.12 ± 0.02 mm/yr., and 0.031 ± 0.000 mm/yr 2 respectively and again they are markedly different than the Model I solution estimates, 0.78 ± 0.01 mm/yr. 0.010 ± 0.000 mm/yr 2 respectively. This outcome confirms unambiguously the presence of two different regimes at play before and after 1943. What is noteworthy however, statistically significant the uniform acceleration, which is also large in magnitude as compared to the Model I and Model II estimates for the initial values. The implication is that Model I is not appropriate to properly represent the evolution of the concentrations during 1830-2021 since the acceleration is not uniform. Either one or both solutions' estimates of Model II and Model III for the period before and after 1943 may be be biased despite their best fits (Adjusted R 2 ) because of their differing estimates for their initial velocities. In the following section, Model II and Model III will be unified under a single model, Model IV, to better represent the varying kinematics of the concentrations during 1830-2021.

Model IV: A composite model for systematic and random variations in atmospheric CO 2
The following model is intended to preserve prominent properties of the records before and after 1943 studies in the previous sections under the same umbrella.  1943-2021. In other words, the formulation does not only conjoin the previous two independent models for two consecutive time spans under a single model but also identifies each one of them with attributes due to two different dynamics. The summary statistics of this model are listed in Table 1 under Model IV. They are as good as the separate kinematic solutions before and after 1943 but also coherent with each other as intended to be. The  residuals displayed in Figure 9 together with the observed and fitted concentrations are random but also exhibit small-scale transient excursions.
Modelling yearly concentrations so far was informative, but a model based on monthly concentrations and conform with the previous model will provide a better resolution not only for the concurrent data but also will enable high-resolution predictions. This is the topic of the following section.

Model V: Monthly atmospheric CO 2 concentrations during 1958-2021
Monthly averaged concentrations for the period 1958-2021 provide better resolutions (Figure 2). Considering the ongoing analyses, the following kinematic model with tend and uniform acceleration is entertained, This model is the generic model described previously through the equation Equations (6), (8) and (9) with additional representations to account for the effect of annual and biannual periodicities in concentrations that became relevant by the use of monthly data. Four more parameters are introduced for the components of annual and semi-annual changes. They are α h ; γ h for the sine and cosine components of the periodicities from which their amplitudes, a h , and their phase angles can be calculated. The estimated parameters of the GLS solution are listed in Table 1 are also statistically significant. The residuals for the e and u disturbances are shown in Figure 10. The residuals u that were also assumed to be identically and independently distributed, i.i.d., random variations, are also contrasted with a Normal Distribution in Figure 11 and passes the normality test. Meanwhile, the impact of the first-order autocorrelations in e is visually clear in Figure 10. The AR(1) residuals also show the presence of a reversal around 1990 that can be attributed to a global shock/intervention, worthy to explore in follow-up studies.

Discussion and conclusion
The CUSUMs of the yearly concentrations during 0-2021 identified unambiguously two significant change points during this period. The first abrupt change occurred during 1830 delineated the starting epoch of the preindustrial era. IPCC explained this increase in concentrations as driven by economic and population growth. Their correlations can be traced in Figure 1.
The first attempt to model the variations during 1830-2021 using a kinematic model with an initial velocity and uniform velocity found both estimates and their statistics statistically significant. However, the analysis of the solution residuals detected another significant upward change since 1943 that continues until 2021. 2 Splitting the records before and after 1943 and estimating their initial velocities and uniform accelerations independently using the same kinematic model show that, the velocities and the accelerations were not uniform during 1830-2021, despite the satisfactory performance of the models. In particular, the acceleration estimates, 0.003 ± 0.000 ppm/yr 2 for the period 1830-1943, versus 0.031 ± 0.000 ppm/yr 2 for the period 1943-2021 are markedly different suggesting a rapid increase in concentrations since 1943.
In the light of the findings from two independent solutions, another kinematic model was entertained. In this model, concentrations increase steadily during 1830-2021 without any uniform acceleration until  Another solution with an initial velocity and uniform acceleration using this time monthly records during 1958-2021 confirmed the constant acceleration detected in the earlier solutions using yearly averaged concentrations.
These findings bring support and provide precision into the IPCC's vague statement quoted in the introduction section. The sudden uniform acceleration after 1943, as quantified by the lagged uniform acceleration model, is not only because of the increasing population, but also the impact of the efficiency of the technology and the start of industrialisation around the World, especially in China, and efficiency in worldwide agriculture, etc., which can be traced back to the 1940s.
Nonetheless, this outcome unfortunately does not negate the ambiguity about the origin of the earlier rise in concentrations. If the increased concentrations before 1943 is dominantly caused by the climate system, i.e. of non-anthropogenic origin in addition to the increasing world population, then the concentrations will continue to increase with the constant rate estimated in this study despite the efforts to limit the anthropogenic contributors. This finding should not be interpreted as futility of limiting anthropogenic atmospheric CO 2 emissions, which are absolutely necessary, but also for looking beyond the necessary interventions with a deeper understanding of the natural source of CO 2 in the atmosphere and readying for its mitigation.
Another finding of this study is about the bias of the estimated initial velocity 0.78 ± 0.01 ppm/yr. for the period 1830-2021 assuming a uniform acceleration throughout this period (Model IV). This estimate is three times larger in magnitude the estimated initial velocity (0.24 ± 0.01 ppm/yr.) for the period 1943-2021. Proper modelling of the acceleration alleviates biased estimation of changes in the concentrations thereby improve the accuracy of their predictions.
The predictions to be made using the kinematic model based on the monthly records will suit well to watch minute-by-minute world-wide changes in atmospheric CO 2 concentrations, a topic which was addressed in another study by İz (2021). (2009) for a detailed discussion on the topic. 2. Start of the uniform acceleration in atmospheric CO 2 concentrations since 1943 also coincides with ice core records (Bastos et al., 2016;Rubino et al., 2019) as noted by one of the reviewers.

Acknowledgments
Two anonymous reviewers read the manuscript carefully and commented extensively. Their time and their insightful contributions are greatly appreciated.

Disclosure statement
No potential conflict of interest was reported by the authors.

Appendix: CUSUM Charts for Mean Shift Detection
Cumulative sums and change-point analysis are used to find whether the mean of a stochastic process has shifted. The following narrative is an excerpt from İz and Ng (2005). Change point analysis complements the CUSUM charts by trying to answer the following questions: Did a change occur? Did more than one change occur? When did the changes occur? With what confidence can we say that any changes occurred?
Consider the cumulative sums, s 0 ; s 1 . . . ; s n , that are calculated from data x 1 ; . . . ; x n , which are assumed to be random, as follows, where, Plots of CUSUMs s i against i generates CUSUM charts. If there are no changes in the mean, CUSUM charts display a steady straight path since the data is random. Otherwise, a segment of the CUSUM chart with an upward slope indicates a period where the values tend to be above average or a segment with a downward slope indicates a period of time where the values tend to be below overall average (see CUSUM plots in Figures 4 and 5 as examples). A sudden change s diff in the slope of the CUSUM occurs with a sudden shift in the average. The magnitude of the sudden change, s diff j j, can be estimated from, A bootstrap analysis based on the random reordering of the elements of the series with replacement gives the confidence level about whether a mean shift has occurred. The sudden shift, s diff , which is calculated from a large number of bootstrapped series, tends to stay at about zero or a mean value when compared to the s Original diff of the original series. Hence, for a total of N bootstraps for which p is the number of bootstraps with s diff < s Original diff , the confidence level as a percentage is, A high confidence level is strong evidence that a change has occurred. Once a change has been detected, the location of the change needs to be determined. Among others, a natural and simple approach is to use the following expression, where s m is the point furthest from zero in the CUSUM chart. The m th point gives the last data before the change occurred and the m þ 1 point refers to the first data value after the change.
The average value of the random data can be replaced by their median values in the case of outliers. Alternatively, random data can be replaced by their ranks, and the analysis can be performed using ranks rather than the actual values for more robust and reliable results.