Introduction
Carbon cycle data assimilation systems offer a promising new tool for
CO2 surface flux (CF) inversion (e.g., Peters et al., 2005; Feng et al.,
2009), which aims to yield CO2 surface flux estimates by optimally
combining information from both chemistry transport model (CTM) simulations
and atmospheric CO2 observations. Previous studies have helped to
improve our understanding of the contemporary carbon cycle (e.g., David et
al., 2006; Peters et al., 2007; Feng et al., 2011; Kang et al., 2012). The
ensemble Kalman filter (referred to as EnKF) has been widely adopted in
carbon cycle data assimilation (e.g., Peters et al., 2007; Feng et al.,
2009, 2011; Kang et al., 2012; Liu et al., 2012), largely due to its simple
conceptual formulation and relative ease of implementation (Evensen, 2003). Peters
et al. (2005) coupled the state-of-the-art atmospheric transport TM5 model
(http://www.projects.science.uu.nl/tm5/) to the ensemble square
root filter (EnSRF), which forms the “CarbonTracker” data assimilation
system, and its CF inversion results are fairly consistent with the majority
of carbon inventories reported by the first North American State of the
Carbon Cycle Report (SOCCR) (Peters et al., 2007). In CarbonTracker, a
simple persistence forecasting operator is taken as the forecast model to
represent the surface CO2 flux propagation. This implies that the CFs
(actually the scaling factors) are essentially treated as the model (i.e.,
the simple persistence forecasting operator) prognostic variables. Including
a CF dynamical model in CarbonTracker means that any useful information
on the CFs gained in the current data assimilation cycle
can be used in the next assimilation cycle, so that the observed
information is not wasted. However, the uncertainty of the initial
CO2 concentration fields has been ignored in CarbonTracker. In fact,
this uncertainty has such a large effect on CF estimates that neglecting this
effect might result in unpredictable consequences (Bousquet et al., 2000;
McKinley et al., 2004; Peylin et al., 2005). Recently, Kang et al. (2011, 2012) also presented a simultaneous data
assimilation system of surface CO2 fluxes and atmospheric CO2 concentrations
by means of the local ensemble transform Kalman filter (LETKF-CDAS). Here
“LETKF-CDAS” means the LETKF (i.e., the local ensemble transform Kalman
filter)-based carbon cycle data assimilation system (referred to as CDAS). In
LETKF-CDAS, the CFs were also treated as part of the model states (as in
Peters et al., 2005) and essentially a simple persistence dynamical model is
adopted to describe the CFs' integration. Similarly, Feng et al. (2009) also developed an ensemble Kalman
filter to estimate 8-day CO2 surface fluxes over geographical regions
globally from satellite measurements of CO2.
The four-dimensional variational data assimilation (4D-Var) method has also
been introduced in this field (e.g., Baker et al., 2006a; Engelen et al.,
2009). Compared with EnKF, 4D-Var has its own attractive features: for example, it
has the ability to simultaneously assimilate the observations at multiple
times to the analysis fields (Tian and Xie, 2012). Nevertheless, the need
for an adjoint model and for linearization of the forecast model limits the
wider application of 4D-Var. Tian et al. (2008b, 2011) proposed the
POD-based (proper orthogonal decomposition) ensemble four-dimensional
variational data assimilation method (PODEn4DVar) based on the POD and
ensemble forecasting techniques, which aims to exploit the strengths of the
two forms (i.e., EnKF and 4D-Var) of data assimilation while simultaneously
offsetting their respective weaknesses. In PODEn4DVar, the control (state)
variables in the 4D-Var cost function appear explicitly so that the adjoint
model is no longer needed and the data assimilation process is significantly
simplified (Tian et al., 2008). Furthermore, PODEn4DVar largely retains the
basic advantages of the traditional 4D-Var. Its feasibility and effectiveness
are demonstrated in an idealized model with simulated observations (Tian et
al., 2011; Tian and Xie, 2012). In those experiments, PODEn4DVar performed
better than both 4D-Var and EnKF, at a lower computational cost than the EnKF
(Tian et al., 2011). This method has been successfully applied to land data
assimilation (Tian et al., 2009, 2010). Furthermore, we have built a
PODEn3DVar (the three-dimensional version of PODEn4DVar)-based radar assimilation
system on the atmospheric transport WRF model platform (Pan et al., 2012). This
WRF-based data assimilation system indicates the potential of PODEn4DVar for atmospheric transport data assimilation.
In this study, we report on a new development of a CF data assimilation
system based on the PODEn4DVar approach, named Tan-Tracker (in Chinese,
“Tan” means carbon). This system is developed by incorporating a joint
PODEn4DVar assimilation framework into the GEOS-Chem model (V9-01-03,
http://acmg.seas.harvard.edu/geos/). We choose an identity
operator as the CF dynamical model to describe the CFs' evolution and then
utilize such a CF dynamical model to constitute an augmented dynamical model
together with the GEOS-Chem atmospheric transport model. Therefore in this case,
the large-scale state vector made up of both the CFs and CO2
concentrations is assumed to be the prognostic variable, which will be
simultaneously constrained by assimilation of atmospheric CO2
concentration observations.
In Sect. 2, we describe our Tan-Tracker data assimilation system,
including the Tan-Tracker joint assimilation framework, a simple review of
the PODEn4DVar assimilation approach and its coupling with the joint
assimilation framework, and its covariance localization scheme. Section 3 presents observing system simulation experiments (OSSEs) that
evaluate the Tan-Tracker system against its simplified
version, which takes only CFs as the prognostic variables.
Furthermore, an experiment assimilating real
spaceborne CO2 dry-air mole fraction observations (XCO2) indicates the potential for wider application of the newly proposed
Tan-Tracker system (Sect. 4). Finally, a summary and concluding remarks
are provided in Sect. 5.
The Tan-Tracker joint data assimilation system
Joint or dual-pass assimilation schemes have been utilized to optimize model
states and parameters simultaneously from noisy measurements through
classical filters (e.g., the dual UKF or EnKF) (Tian et al., 2008; Tian and
Xie, 2008). Tian et al. (2009) expanded the dual-pass assimilation strategy
to the PODEn4DVar approach and built a PODEn4DVar-based dual-pass microwave
land data assimilation system (Tian et al., 2010). Similar to the usual
joint assimilation schemes, the augmented vector used in LETKF-CDAS is also
a state-parameter-augmented one and the CFs are treated as the model
parameters. However, it should be noted that the prognostic variable used in
Tan-Tracker is the large-scale vector made up of CFs and CO2 concentrations,
whose evolution is described by an augmented dynamical
model consisting of an identity operator and the CTM.
Flowchart of the Tan-Tracker joint data assimilation
system.
The Tan-Tracker joint assimilation framework
An ordinary ensemble-based assimilation system (for example, CarbonTracker)
usually begins with the preparation of an ensemble of N CFs $F_{i,g}$ ($i=1,\dots,N$) based on the first-guess net CO2 surface exchange
$F^{*}(t)$ at the rth assimilation cycle:
$$F_{i,g}(t)=\lambda_{i,g,r}\,F_{g}^{*}(t),$$
where $\lambda_{g,r}$ represents a set of linear scaling factors (Peters et
al., 2005) for each day and each grid (g) to be estimated and the subscript
“r” denotes the rth assimilation cycle. Usually, the CTM is integrated N times
from the same initial background CO2 concentration field, driven by the ensemble of CFs $F_{i,g}(t)$,
to produce the 3-D CO2 concentration ensemble $U_{m,i}$ ($i=1,\dots,N$). For
Tan-Tracker, however, we adopt a different approach.
Figure 1 shows the flowchart of the Tan-Tracker joint assimilation system:
Tan-Tracker is initiated by two CTM runs – one is the background run (the
blue part in Fig. 1) and the other is the sampling run (the red part in
Fig. 1).
The 4-D moving sampling strategy.
Figure 2 shows the makeup of the assimilation window (i.e., the optimized
window + the lag window + the observational window; see Fig. 2) in
Tan-Tracker. $F_{b}^{a}$ ($F_{b}^{s}$) denotes the prior CF series over the
assimilation (sampling) window, and $F_{a}^{*}$ ($F_{s}^{*}$)
represents the first-guess CF series over the assimilation (sampling)
window. In the background run, we integrate the CTM (GEOS-Chem) to produce the background CO2 concentration fields $U_{b}$
forced by the prior CF series $F_{b}^{a}$ at the rth assimilation cycle over
the assimilation window
$$F_{b}^{a}(t)=\lambda_{b}(t)\,F_{a}^{*}(t),\quad t=1,\dots,L_{a},$$
which is used to prepare the background joint state vector
$(\lambda_{b},U_{b})^{\mathrm{T}}$. Here $L_{a}$ is the
length of the assimilation window and $\lambda_{b,r}$ is the prior scaling
factor at the rth assimilation cycle. As mentioned, the assimilation window
consists of an optimized window (1 week), a lag window (5 weeks) and an
observational window (1 week). In each assimilation cycle, the
observations in the observational window are used to update the joint
prognostic variables $(\lambda,U)^{\mathrm{T}}$ in the optimized window.
Correspondingly, in the sampling run, we run the CTM from the background
CO2 concentration field $U_{b}^{s}$ at the beginning of
the sampling window (i.e., the Pre-Assim window + the Assimilation window +
the Post-Assim window) (Fig. 2), driven by the prior CF series in the same
(rth) assimilation cycle
$$F_{b}^{s}(t)=\lambda_{b,r}\,F_{s}^{*}(t),\quad t=1,\dots,L_{s},$$
where $L_{s}$ ($=L_{\mathrm{Pre}}+L_{a}+L_{\mathrm{Pos}}$) is
the length of the sampling window, and $L_{\mathrm{Pre}}$ and $L_{\mathrm{Pos}}$ are the
lengths of the Pre-Assim and Post-Assim windows, respectively (see Fig. 2).
This run yields the sampling CO2 concentration series
$U_{i}^{s}$ ($i=1,\dots,L_{s}$, with $U_{1}^{s}=U_{b}^{s}$). Next, a 4-D moving sampling strategy
(Fig. 2; Wang et al., 2010) is adopted to create the large-scale vector ensemble
$(\lambda_{m,i},U_{m,i})^{\mathrm{T}}$ ($i=1,\dots,N$, $N=L_{s}-L_{a}+1$) as follows:
$$(\lambda_{m,i},U_{m,i})^{\mathrm{T}}=X_{i}^{s}=\left(\frac{F_{b}^{s}(i)}{F_{a}^{*}(1)},\;\dots,\;\frac{F_{b}^{s}(i+L_{a}-1)}{F_{a}^{*}(L_{a})},\;U_{i}^{s},\;\dots,\;U_{i+L_{a}-1}^{s}\right)^{\mathrm{T}}.$$
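The moving sampling step can be illustrated with a minimal NumPy sketch. It is not the Tan-Tracker code: the function name, the array layouts (time along the first axis, flattened grid along the second) and the assumption that the first-guess fluxes are nonzero everywhere are all hypothetical choices made for illustration. Each ensemble member is obtained by sliding a window of length La along the sampling run and stacking the flux ratios (the sampled scaling factors) on top of the sampled concentrations.

```python
import numpy as np

def moving_sample_ensemble(Fb_s, Fa_star, U_s):
    """Build the joint (scaling factor, concentration) ensemble by sliding an
    assimilation-length window along the sampling run.

    Fb_s    : (Ls, G) prior CFs over the sampling window (hypothetical layout)
    Fa_star : (La, G) first-guess CFs over the assimilation window (nonzero)
    U_s     : (Ls, K) sampled CO2 concentrations over the sampling window
    Returns a (La*G + La*K, N) array with N = Ls - La + 1 members.
    """
    Ls, La = Fb_s.shape[0], Fa_star.shape[0]
    N = Ls - La + 1
    members = []
    for i in range(N):
        lam = Fb_s[i:i + La] / Fa_star   # element-wise flux ratios, shape (La, G)
        conc = U_s[i:i + La]             # concentrations in the same window, (La, K)
        members.append(np.concatenate([lam.ravel(), conc.ravel()]))
    return np.stack(members, axis=1)     # one column per ensemble member
```

With a sampling window of Ls days and an assimilation window of La days, the function returns Ls - La + 1 columns, matching the ensemble size N stated in the text.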
As a result, the large-scale joint state vector $(\lambda,U)^{\mathrm{T}}$ is viewed as the prognostic variable in
Tan-Tracker, with the identity operator (4) chosen to be the CF dynamical
sub-model to describe the CFs' evolution:
$$M_{\mathrm{CF}}=I,$$
where $I$ is the identity matrix. This CF persistence forecasting
model (4) follows Peters et al. (2005) and assumes that the prior (or background)
scaling factors $\lambda_{b,r+1}$ for the next assimilation cycle
[(r+1)th] are equal to the optimized scaling factors $\lambda_{a,r}$ of the
current (rth) assimilation cycle. In the actual implementation, the
following dynamical model (5) is applied to the linear scaling factors
$\lambda$:
$$\lambda_{b,r+1}=\frac{1}{L_{o}}\sum_{j=1}^{L_{o}}\lambda_{a,r}^{j},$$
where $L_{o}$ is the length of the optimized window (Fig. 2) and $\lambda_{a,r}^{j}$ ($j=1,\dots,L_{o}$) are the daily optimized scaling factors.
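As a small illustration of the averaging rule above, the following sketch computes the prior scaling factors for the next cycle from the daily optimized ones. The function name and the (days, grid cells) array layout are assumptions for illustration only.

```python
import numpy as np

def next_prior_scaling(lam_a_daily):
    """Average the daily optimized scaling factors over the optimized window.

    lam_a_daily : (Lo, G) array, one row per day of the optimized window
    Returns the (G,) prior scaling factors for assimilation cycle r+1.
    """
    return np.asarray(lam_a_daily, dtype=float).mean(axis=0)
```

With a 1-week optimized window, Lo would be 7 and each grid cell's prior for the next cycle is simply the weekly mean of its optimized daily values.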
The CF dynamical sub-model $M_{\mathrm{CF}}$ is
thus combined with the CTM (GEOS-Chem) to constitute the augmented dynamical model
$$M=\begin{pmatrix}I\\ \mathrm{CTM}\end{pmatrix}$$
for Tan-Tracker. By applying the
observation operator $H$ to the modeled CO2 concentrations $U_{m,i}$ and the background CO2 concentrations $U_{b}$,
we obtain the ensemble simulated observations $U_{m}^{o}$
and the background simulated observations $U_{b}^{o}$ as
follows:
$$U_{m,i}^{o}=HU_{m,i}$$
and
$$U_{b}^{o}=HU_{b}.$$
The background joint vector $(\lambda_{b},U_{b})^{\mathrm{T}}$, the joint vector ensemble $(\lambda_{m,i},U_{m,i})^{\mathrm{T}}$, the simulated observations from Eqs. (8) and (9), and
the real CO2 measurements $U_{o}$ are then input to the
PODEn4DVar assimilation processor, which yields the assimilated $(\lambda_{a},U_{a})^{\mathrm{T}}$ and the
optimized CFs $F_{a}=\lambda_{a}F^{*}$ as a result.
In conclusion, Tan-Tracker works as follows: two CTM runs forced by the
background CF series are first carried out over the assimilation window and
the sampling window, respectively. The background run is used to prepare the
background joint vector, and the sampling run is used to produce the joint
vector ensemble by applying a 4-D moving sampling strategy (Wang et al., 2010) to the
sampling simulations throughout the sampling window. The background joint
vector and the joint vector ensemble are then input into the PODEn4DVar
processor, in which the usual observation operator (e.g., the interpolation
function that interpolates the model gridded variables to the in situ
observation locations) maps the simulated CO2 concentrations to observation space, where they are compared with the
observations according to the 4D-Var cost function. The assimilated CO2 concentrations
are used to initialize the next assimilation cycle. Meanwhile, the
scaling factors λ in the optimized window are also optimized and
used for the next assimilation cycle through Eq. (5).
The PODEn4DVar and its coupling with the joint assimilation
framework
The PODEn4DVar approach is born out of the incremental format of the 4D-Var
cost function
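$$J(x')=\frac{1}{2}(x')^{\mathrm{T}}B^{-1}x'+\frac{1}{2}\left[y'(x')-y'_{\mathrm{obs}}\right]^{\mathrm{T}}R^{-1}\left[y'(x')-y'_{\mathrm{obs}}\right],$$
where $x'=x-x_{b}$ is the perturbation of the background field $x_{b}$ at the initial time $t_{0}$,
$$y'_{\mathrm{obs}}=\begin{pmatrix}y'_{\mathrm{obs},1}\\ y'_{\mathrm{obs},2}\\ \vdots\\ y'_{\mathrm{obs},S}\end{pmatrix},\quad y'=y'(x')=\begin{pmatrix}(y_{1})'\\ (y_{2})'\\ \vdots\\ (y_{S})'\end{pmatrix},$$
$$(y_{k})'=y_{k}(x_{b}+x')-y_{k}(x_{b}),\quad y'_{\mathrm{obs},k}=y_{\mathrm{obs},k}-y_{k}(x_{b}),\quad y_{k}=H_{k}\left(M_{t_{0}\to t_{k}}(x)\right),$$
and
$$R=\begin{pmatrix}R_{1}&0&\cdots&0\\ 0&R_{2}&\cdots&0\\ \vdots&\vdots&\ddots&\vdots\\ 0&0&\cdots&R_{S}\end{pmatrix}.$$
Here index k denotes the observation time; the superscript T stands for a
transpose; b represents background values; S is the total number of observational time
steps in the observational window; $H_{k}$ acts as the observation operator;
and matrices $R_{k}$ and $B$ are the observational
and background error covariances, respectively.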
With the prepared background field $x_{b}$, the initial model
perturbations (MPs) $x'=(x'_{1},x'_{2},\dots,x'_{N})$, the simulated observation
perturbations $y'=(y'_{1},y'_{2},\dots,y'_{N})$, the observational increments
$y'_{\mathrm{obs},k}$, and the background and observational error
covariances $B$ and $R_{k}$, the final PODEn4DVar
analysis solution $x_{a}$ (without localization) and its analysis
error covariance $P_{a}$ are formulated through some necessary
calculations (see Tian et al., 2010, 2011, for more details) as
$$x_{a}=x_{b}+x'V\left[(N-1)I+P_{y}^{\mathrm{T}}R^{-1}P_{y}\right]^{-1}P_{y}^{\mathrm{T}}R^{-1}y'_{\mathrm{obs}}$$
and
$$P_{a}=P_{x}P_{a}^{*}P_{x}^{\mathrm{T}},$$
where $P_{a}^{*}=\left[(N-1)I+P_{y}^{\mathrm{T}}R^{-1}P_{y}\right]^{-1}$, $V$ is obtained from the eigendecomposition
$(y')^{\mathrm{T}}y'=V\Lambda^{2}V^{\mathrm{T}}$, and $P_{y}=y'V$. The background covariance $B$ is approximately
estimated by $B=P_{x}P_{x}^{\mathrm{T}}/(N-1)$ (with $P_{x}=x'V$) in formulating
PODEn4DVar.
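The analysis formulas above translate almost line by line into NumPy. The sketch below is only an illustration under stated assumptions: the function name and argument layout are hypothetical, all N eigenvectors in V are retained (no truncation of the POD modes), R is assumed symmetric positive definite, and no localization is applied.

```python
import numpy as np

def poden4dvar_analysis(xb, x_pert, y_pert, y_obs_inc, R):
    """Non-localized analysis update written directly from the formulas above.

    xb        : (n,)   background state
    x_pert    : (n, N) model perturbations x'
    y_pert    : (m, N) simulated observation perturbations y'
    y_obs_inc : (m,)   observational increments y'_obs
    R         : (m, m) observation error covariance (symmetric)
    """
    N = x_pert.shape[1]
    # y'^T y' = V Lambda^2 V^T: eigenvectors of the symmetric N x N matrix
    _, V = np.linalg.eigh(y_pert.T @ y_pert)
    Px = x_pert @ V
    Py = y_pert @ V
    Rinv_Py = np.linalg.solve(R, Py)                 # R^{-1} P_y
    Pa_star = np.linalg.inv((N - 1) * np.eye(N) + Py.T @ Rinv_Py)
    # (R^{-1} P_y)^T equals P_y^T R^{-1} because R is symmetric
    weights = Pa_star @ (Rinv_Py.T @ y_obs_inc)
    xa = xb + Px @ weights
    Pa = Px @ Pa_star @ Px.T
    return xa, Pa
```

For a linear observation operator, this update coincides with the Kalman update that uses the ensemble-estimated background covariance $B=P_{x}P_{x}^{\mathrm{T}}/(N-1)$, which provides a convenient consistency check on small random problems.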
In particular, in Tan-Tracker,
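$$y'_{\mathrm{obs},k}=U_{o,k}-U_{b}^{o}$$
and
$$y'=U_{m}^{o}-U_{b}^{o},$$
where $U_{b}^{o}=HU_{b}$. Here we
define
$$H=\begin{pmatrix}H_{1}&0&\cdots&0\\ 0&H_{2}&\cdots&0\\ \vdots&\vdots&\ddots&\vdots\\ 0&0&\cdots&H_{S}\end{pmatrix}.$$
As mentioned, the model state to be optimized is the joint vector $(\lambda,U)^{\mathrm{T}}$, which indicates
$$x_{b}=(\lambda_{b},U_{b})^{\mathrm{T}}$$
and
$$x'=(\lambda_{m},U_{m})^{\mathrm{T}}-(\lambda_{b},U_{b})^{\mathrm{T}}$$
in Tan-Tracker.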
We have realized the coupling of the joint assimilation framework with
the PODEn4DVar assimilation processor through Eqs. (18–22) (see the green
part of Fig. 1).
Covariance localization
As an ensemble-based assimilation system, Tan-Tracker also utilizes
covariance localization to ameliorate the contamination
resulting from spurious long-range correlations (Houtekamer and
Mitchell, 2001). It uses the following exponential decay of the covariance
structure with distance between state and observational variables (Gaspari
and Cohn, 1999),
$$\rho_{h}[i,j]=e^{-d_{i,j}/d_{0}},$$
to calculate the elements $\rho_{h}[i,j]$ of the $L_{x}\times L_{y}$ matrix $\rho_{h}$, where $L_{x}$ and $L_{y}$ are the lengths of the state
vector $x$ and the observational vector $y$,
respectively; $d_{i,j}$ is the distance between the ith state and the jth
observation locations; and $d_{0}$ is the horizontal covariance localization
Schur radius.
Consequently, the covariance localization in Tan-Tracker can be implemented
by calculating the Schur product $\circ$ (i.e., element-wise multiplication) as
follows (Greybush et al., 2011):
$$x_{a}=x_{b}+\left\{\rho_{h}\circ\left(x'V\left[(N-1)I+P_{y}^{\mathrm{T}}R^{-1}P_{y}\right]^{-1}P_{y}^{\mathrm{T}}R^{-1}\right)\right\}y'_{\mathrm{obs}}.$$
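A minimal sketch of this localization step is given below. It rests on several assumptions that are not taken from the paper: distances are computed as great-circle distances with the haversine formula on a sphere of radius 6371 km, each state and observation element carries a single latitude and longitude, and the non-localized gain matrix is passed in already computed. All function names are hypothetical.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km; inputs in degrees, broadcastable arrays."""
    p1, p2 = np.radians(lat1), np.radians(lat2)
    dphi = p2 - p1
    dlmb = np.radians(lon2) - np.radians(lon1)
    a = np.sin(dphi / 2) ** 2 + np.cos(p1) * np.cos(p2) * np.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

def localization_matrix(state_lat, state_lon, obs_lat, obs_lon, d0_km=900.0):
    """rho_h[i, j] = exp(-d_ij / d0) for every state/observation pair."""
    d = great_circle_km(state_lat[:, None], state_lon[:, None],
                        obs_lat[None, :], obs_lon[None, :])
    return np.exp(-d / d0_km)

def localized_increment(gain, rho, y_obs_inc):
    """Apply the Schur (element-wise) product to the gain, then the increments."""
    return (rho * gain) @ y_obs_inc
```

Here `gain` stands for the $L_{x}\times L_{y}$ matrix multiplying $y'_{\mathrm{obs}}$ in the analysis equation, so that `xb + localized_increment(gain, rho, y_obs_inc)` gives the localized analysis.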
OSSEs for the evaluations of Tan-Tracker
In this section, Tan-Tracker is evaluated through a
group of global observing system simulation experiments
(OSSEs) over a given assimilation period.
Experimental setup
We simulate atmospheric CO2 concentrations using the global
three-dimensional chemical transport model GEOS-Chem (version 9-01-03,
http://acmg.seas.harvard.edu/geos/) driven by the assimilated
meteorological data from the Goddard Earth Observing System (GEOS) of the
NASA Global Modeling and Assimilation Office. The version of the model we
use is driven by the GEOS-5 meteorological fields with a horizontal
resolution of 2° latitude by 2.5° longitude and 47
vertical layers up to 0.01 hPa. The original GEOS-Chem CO2 simulation
was described in Suntharalingam et al. (2004) and updated by Nassar et al. (2010).
Our simulations include CO2 fluxes from monthly fossil fuel
burning and cement production CO2 emissions from the Carbon Dioxide
Information Analysis Center (CDIAC) inventory for year 2009 (Andres et al.,
2010), monthly biomass burning from the third version of the Global Fire
Emission Database (GFEDv3) for 2010 (van der Werf et al., 2010; Mu et al.,
2011), climatological biofuel burning (Yevich and Logan, 2003), monthly
ocean exchange (Takahashi et al., 2009), 3-hourly biospheric fluxes from the
Carnegie–Ames–Stanford Approach (CASA) model for 2000 (Olsen and Randerson,
2004), annual climatology terrestrial biosphere exchange based on TransCom
CO2 inversion results adjusted with GFEDv2 fire emissions (Baker et
al., 2006b; van der Werf et al., 2006), the chemical production of CO2
from the atmospheric oxidation of other carbon species (Nassar et al.,
2010), the monthly emissions from shipping (Olivier and Berdowski, 2001), and
aviation CO2 emissions (Friedl, 1997; Sausen and Schumann, 2000; Kim et
al., 2005, 2007; Wilkerson et al., 2010). For this work, our model
simulation was initialized on 01 January 2008 with a globally uniform 3-D
CO2 field of 383.76 ppm. According to the record of NOAA-ESRL Mauna Loa
Observatory in Hawaii (http://www.esrl.noaa.gov/gmd/ccgg/),
which is a marine surface site, the annual mean CO2 at Mauna Loa in
2007 was 383.76 ppm, with monthly means of 383.89 ppm in December 2007 and
385.44 ppm in January 2008. A 2-year spin-up simulation from this
initialized state allows for model transport, sources and sinks to develop the
global spatial patterns of CO2; this approach was evaluated in
Nassar et al. (2010). After the spin-up run, the obtained CO2 fields
were used to drive the observing system simulation experiments. In all the
following OSSEs, we first take the default surface CO2 fluxes
released with the GEOS-Chem model as the true CF series $F_{\mathrm{True}}$. Then we run
the GEOS-Chem model, driven by the true CF series $F_{\mathrm{True}}$, to obtain the true CO2 concentrations from 1 January 2010 to 31 December 2010
(i.e., the assimilation period). The artificial
CO2 observations are then generated every day by sampling the daily
true CO2 concentrations at the 136 observational sites used in this
study (Fig. 3) and adding small random noise (with an error
variance of 0.01 ppm$^{2}$). The first-guess CF series $F^{*}$ are set to $1.8F_{\mathrm{True}}$, which drives the GEOS-Chem model at the same resolution
(2° latitude × 2.5° longitude) to produce the
background CO2 simulations from the spun-up equilibrium state.
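The construction of the pseudo-observations and the biased first guess can be sketched as follows. This is an illustration only: the function names are hypothetical, the noise is assumed Gaussian with zero mean, and sites are assumed to be sampled at the nearest grid cell of a flattened grid (the paper does not specify the sampling operator here).

```python
import numpy as np

def make_pseudo_obs(true_conc, site_index, noise_var=0.01, seed=0):
    """Sample the 'true' daily concentrations at the sites and add noise.

    true_conc  : (T, G) daily true CO2 concentrations on a flattened grid (ppm)
    site_index : (S,) integer grid indices of the observational sites
    noise_var  : error variance of the added noise (ppm^2)
    """
    rng = np.random.default_rng(seed)
    sampled = true_conc[:, site_index]
    noise = rng.normal(0.0, np.sqrt(noise_var), size=sampled.shape)
    return sampled + noise

def first_guess_flux(true_flux, factor=1.8):
    """Deliberately biased first-guess CFs, F* = factor * F_True."""
    return factor * np.asarray(true_flux, dtype=float)
```

An error variance of 0.01 ppm$^{2}$ corresponds to a standard deviation of 0.1 ppm, which is what the noise generator receives.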
The observational sites used in this study.
The performance of our Tan-Tracker system is examined by comparison with the
simplified version (referred to as TT-S), taking only CFs as the prognostic
variables. TT-S is somewhat similar to CarbonTracker except that
the ensemble square root filter (EnSRF) has been replaced by the PODEn4DVar approach and the GEOS-Chem model is used instead of the TM5 model. Similar to CarbonTracker,
the GEOS-Chem model in TT-S is actually the observation operator linking the
CFs with CO2 observations. In TT-S, since the CO2 concentrations
are not assimilated together with the CFs, we first obtain the optimized
scaling factors by assimilating CO2 observations, and the CO2
concentrations are then updated by a GEOS-Chem run forced by the
optimized CFs. All the assimilation processes are initiated by the GEOS-Chem
model with the first-guess CF series $F^{*}$ ($=1.8F_{\mathrm{True}}$) and
conducted continuously by assimilating the daily pseudo-observations
throughout the assimilation period. In all the assimilation experiments, the
scaling factors are initialized as $\lambda_{b,0}(i,j)=1.0$ (where i and
j are the longitude and latitude indexes, respectively, and the subscript 0 denotes the
initial assimilation cycle, r = 0). In all the OSSEs, the default lag window is
5 weeks, and the observational window and optimized window are both 1 week.
The reference ensemble size N is 106 and the standard localization radius
$d_{0}$ is 900 km. Changes in the assimilation parameters might influence the
assimilation performance. We further investigate the effects of the
horizontal localization Schur radius and the ensemble size in
Tan-Tracker by means of several numerical sensitivity experiments, the results of
which are presented in Sect. 3.2. In all assimilation experiments, we use
the adaptive inflation technique proposed by Li et al. (2009).
Time series of the global mean (a) CO2 surface fluxes
and (b) CO2 concentrations from the “truth”, simulations, TT-S (the
simplified version of Tan-Tracker) and TT (Tan-Tracker) assimilations from
1 January to 31 December 2010.
Time series of the posterior uncertainties (shaded
areas) of the analyzed surface fluxes (TT) from 1 January to 31 December 2010.
Time series of the averaged scaling factors from 1 January to 31 December 2010.
(a) Mean CO2 surface fluxes and (b) CO2 concentration from the “truth”, simulations, TT-S (the simplified
version of Tan-Tracker) and TT (Tan-Tracker) assimilations aggregated to
TransCom regions (i.e., CT-01: North America Boreal; CT-02: North America
Temperate; CT-03: South America Tropical; CT-04: South America Temperate; CT-05:
Northern Africa; CT-06: Southern Africa; CT-07: Eurasia Boreal; CT-08: Eurasia
Temperate; CT-09: Tropical Asia; CT-10: Australia; CT-11: Europe; CT-12: North
Pacific Temperate; CT-13: West Pacific Tropical; CT-14: East Pacific
Tropical; CT-15: South Pacific Temperate; CT-16: Northern Ocean; CT-17: North
Atlantic Temperate; CT-18: Atlantic Tropics; CT-19: South Atlantic
Temperate; CT-20: Southern Ocean; CT-21: India Tropical; CT-22: South India
Tropical; CT-23: Zero Flux Regions; G-T: Global Total) during the period from
1 June to 31 December 2010.
Experimental results
To evaluate Tan-Tracker's performance in a general view, time series of
the daily global mean fluxes and CO2 concentrations from the background
simulations, the TT-S and the TT (Tan-Tracker) assimilations are compared
with the true simulations in Fig. 4. Not surprisingly, the background
simulations (referred to as Sim) deviate seriously from the
“true” simulations because of the prescribed background CF series $F_{b}$ ($=1.8F_{\mathrm{True}}$).
Because both the CO2 concentrations and CFs are assimilated simultaneously under
the joint assimilation framework, Tan-Tracker can largely eliminate the effect of the uncertainty in the
initial CO2 concentrations on the CO2 evolution during the
assimilation window and make fuller use of the observations. Probably for
this reason, Fig. 4 shows that Tan-Tracker works very well throughout the
whole assimilation period, especially after the first few months, which can
be considered a spin-up period. However, the performance of TT-S is
not very robust and its errors do not show a decreasing trend,
even though it performs substantially better than the background
simulation: the impacts of the CO2 concentrations have
not been taken into full consideration in the TT-S system, so
non-negligible errors likely remain in the TT-S-optimized CO2
concentrations (Fig. 4b). The resulting errors in the initial CO2
concentrations will in turn contaminate the TT-S assimilation of CO2
fluxes for the next assimilation cycle. In the following discussions, we
focus on the results only during the latter half of the year 2010 and thus
exclude the spin-up period in the first half of the year. Figure 5
also shows that the posterior uncertainties of the analyzed CFs
gradually decrease with the assimilation of CO2 observations. Furthermore,
Fig. 6 shows time series of the daily globally averaged scaling factor.
The daily averaged scaling factor also decreases and approaches
∼ 0.56 (i.e., 1/1.8) with small fluctuations during the latter
half of the year 2010.
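The target value of the scaling factor follows directly from the experimental design. As a short worked step, assuming the optimized fluxes $F_{a}=\lambda_{a}F^{*}$ recover the true fluxes exactly and using the first guess $F^{*}=1.8F_{\mathrm{True}}$:

```latex
F_{a}=\lambda_{a}F^{*}=1.8\,\lambda_{a}F_{\mathrm{True}}=F_{\mathrm{True}}
\;\Longrightarrow\;
\lambda_{a}=\frac{1}{1.8}\approx 0.556 .
```

A globally averaged scaling factor close to this value therefore indicates that the assimilation has largely removed the imposed bias in the first-guess fluxes.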
Time series of the daily mean CO2 surface fluxes
from the “truth”, simulations, TT-S (the simplified version of
Tan-Tracker) and TT (Tan-Tracker) assimilations aggregated to the selected
four TransCom regions (i.e., CT-02, CT-07, CT-11 and CT-20) during the
period from 1 July to 31 December 2010.
Same as Fig. 8 but for CO2 concentrations.
Similar to Peters et al. (2005), we also aggregated the daily, gridded
(2° latitude × 2.5° longitude) simulation and
assimilation results to 24 “super-regions” corresponding to the TransCom 3
regions given by Gurney et al. (2002). Figure 7 shows the 24
super-regions' aggregated mean CO2 concentrations and fluxes during
the latter half of the year 2010. Generally, Tan-Tracker is able to
reproduce the true fluxes well and outperforms TT-S in most of the 24
super-regions, except for three – CT-09 (Tropical Asia), CT-12 (North
Pacific Temperate) and CT-20 (Southern Ocean) – whose absolute flux values are very
small (Fig. 7a). Furthermore, as far as the CO2 concentration is
concerned, the advantage of Tan-Tracker over TT-S is
even more obvious (Fig. 7b): the differences between the “truth” and
the TT-assimilated CO2 concentrations are much smaller than those between
the “truth” and the TT-S-assimilated concentrations in the overwhelming majority of
cases, which illustrates once more the importance of the simultaneous assimilation of
CO2 concentrations and CFs. The time series of daily
mean fluxes and CO2 concentrations from the four selected super-regions
(Temperate North America, Europe, Boreal Eurasia, and Southern Ocean) are
shown in Figs. 8 and 9. Similar to the global mean case shown in Fig. 4,
our assimilation system captures the seasonal
peak-to-trough amplitudes of CO2 concentrations and fluxes
well, which demonstrates its ability to make full use of the observations.
By comparison, the TT-S system performs considerably
worse than Tan-Tracker, especially in the Southern Ocean super-region during
October–December 2010: here the TT-S-optimized CO2 concentrations are even
worse than the background simulations (Fig. 9d).
Root-mean-square errors (RMSEs) (units are
$10^{-11}$ kg C m$^{-2}$ s$^{-1}$) for the daily (a) TT- and (b) TT-S-assimilated
CO2 surface fluxes during the period from 1 July to 31 December 2010.
To evaluate the performance of our Tan-Tracker data assimilation system
comprehensively, we show the root-mean-square errors (RMSEs) for the daily,
gridded (2° latitude × 2.5° longitude) TT- and
TT-S-assimilated fluxes from 1 July to 31 December 2010 in Fig. 10.
The corresponding RMSEs for the assimilated
(optimized) CO2 concentrations are shown in Fig. 11. Compared with
the Tan-Tracker case, larger RMSEs (> 300 × $10^{-11}$ kg C m$^{-2}$ s$^{-1}$) for the TT-S-assimilated fluxes can be found in the
central parts of South America, most of East Asia, and southern Africa (Fig. 10b). Encouragingly, the TT-assimilated flux RMSEs are largely kept at
a very low level (≤ 80 × $10^{-11}$ kg C m$^{-2}$ s$^{-1}$);
relatively larger RMSEs (but still much smaller than those of TT-S)
appear only in a very small area in the central parts of South America (Fig. 10a). A similar pattern is found in the CO2
concentration case (Fig. 11). These results suggest
that the uncertainty of the initial CO2 concentrations
cannot be ignored and that the joint assimilation framework contributes substantially to
the final Tan-Tracker performance. Moreover, the application of the advanced
hybrid assimilation approach (i.e., PODEn4DVar) likely also makes a
positive contribution to its performance (Tian et al., 2011). Of
course, the imbalance between CFs and CO2 concentrations in TT-S partly
explains its inferior performance.
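For reference, a gridded RMSE map of the kind described above can be computed as in the following sketch. The function name and the (time, latitude, longitude) array layout are assumptions for illustration; the paper does not give its implementation.

```python
import numpy as np

def gridded_rmse(assimilated, truth):
    """RMSE over the time axis for each grid cell.

    assimilated, truth : (T, nlat, nlon) daily fields in the same units
    Returns an (nlat, nlon) map of root-mean-square errors.
    """
    diff = np.asarray(assimilated, dtype=float) - np.asarray(truth, dtype=float)
    return np.sqrt(np.mean(diff ** 2, axis=0))
```

Applied to the daily TT- and TT-S-assimilated fluxes against the true fluxes, this yields one map per system that can be compared cell by cell.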
Another group of experiments using the Tan-Tracker system with different
horizontal localization radii ($d_{0}$ = 100, 900, 1450, 2000 and 5000 km) was
also conducted to explore the sensitivity of our Tan-Tracker assimilation
system to the horizontal localization radius. As suggested by Peters et al. (2005), we take 900 km as the default or reference radius. Figure 12
shows time series of the daily global CO2 concentrations and fluxes
from the “truth” as well as the TT assimilations using three different
horizontal localization radii ($d_{0}$ = 900, 1450 and 2000 km). From this comparison,
we can roughly judge that the Tan-Tracker system performs well with
a horizontal localization radius of around 900 km. Two extreme
localization radii ($d_{0}$ = 100 and 5000 km) were also tested
in our experiments (not shown here); their poor performance demonstrates that the
choice of an appropriate covariance localization radius is essential to Tan-Tracker's successful implementation.
Finally, to investigate the impact of ensemble size on Tan-Tracker's
assimilation results, we also conducted another group of Tan-Tracker
assimilation experiments with ensemble sizes N = 60, 106 and 150. Figure 13 shows that the differences between the two
assimilation experiments with N = 106 and 150 are very small. However, if we
decrease the ensemble size to 60 (not shown), the assimilation results
diverge. Taken together, these results suggest that
an ensemble size of about 100 or more (≥ 100) generally
ensures robust performance of our system.
Same as Fig. 10 but for CO2 concentrations
(units are ppm).
Time series of the daily global mean (a) CO2 surface
fluxes and (b) CO2 concentrations from the “truth” and the
TT (Tan-Tracker) assimilations using different covariance localization radii
(900, 1450 and 2000 km), respectively, from 1 January to 31 December 2010.
Time series of the daily global mean (a) CO2 surface
fluxes and (b) CO2 concentrations from the “truth” and the
TT (Tan-Tracker) assimilations with ensemble numbers N=106 and 150,
respectively, from 1 January to 31 December 2010.
Comparisons between the observed XCO2 and the open-loop GEOS-Chem-simulated (Sim), Tan-Tracker-assimilated (TT)
and the TT-Sim (i.e., the GEOS-Chem model run without data assimilation
forced by the TT-optimized CF series derived from the Tan-Tracker
assimilation run with the TT-assimilated initial CO2 fields at 1 January 2010) simulated XCO2 on 15 March 2010.
Real-data assimilation experiment with spaceborne observations
In this section, a preliminary real assimilation experiment is conducted by
using spaceborne CO2 dry-air mole fraction observations to illustrate
the potential applications of Tan-Tracker in real-data assimilation.
Experimental setup
The basic experimental designs (such as the GEOS-Chem model, ensemble size,
assimilation window, localization radius, etc.) are exactly the same as
those adopted in Sect. 3. Nevertheless, in this real-data experiment, we
took the default surface CO2 fluxes released with the GEOS-Chem model
as the first-guess CF series F∗ and used spaceborne CO2
dry-air mole fraction observations (XCO2) instead of
artificial CO2 observations. The spaceborne observations used here
originate from the Japanese Greenhouse Gases Observing Satellite (GOSAT),
which was launched into orbit in 2009. TANSO-FTS, onboard GOSAT, operates in the shortwave infrared (SWIR) band between 758 and 2080 nm and the thermal
infrared (TIR) band from 5.56 to 14.3 µm, providing information on
CO2 and CH4 in the atmosphere. The Level 2 data, i.e., the
column-averaged CO2 dry-air mole fraction XCO2,
are taken from version 3.3 of the Atmospheric CO2 Observations from Space
(ACOS) data product (O'Dell et al., 2012). Validation against ground-based
TCCON data shows a mean bias of less than 1.4 ppm; this bias can be further
reduced by applying the recommended data screening criteria and bias
correction technique (for more details, please refer to the document “ACOS
Level 2 Standard Product Data User's Guide”,
http://disc.sci.gsfc.nasa.gov/acdisc/documentation/ACOS_v3.3_DataUsersGuide.pdf). Furthermore, to keep the
quality of the assimilated data as high as possible, we discarded the
XCO2 data with observation errors ≥ 0.75 ppm.
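The error-based screening described above can be sketched as follows. This is only a minimal illustration: the record layout, the field names `xco2_ppm` and `xco2_error_ppm`, and the sample values are hypothetical and do not reflect the actual ACOS file structure or the Tan-Tracker code.

```python
def screen_soundings(soundings, max_error_ppm=0.75):
    """Keep soundings whose reported XCO2 error is below the threshold.

    Soundings with an error >= max_error_ppm are discarded, matching
    the criterion stated in the text. Field names are hypothetical.
    """
    return [s for s in soundings if s["xco2_error_ppm"] < max_error_ppm]


# Made-up example records, for illustration only.
soundings = [
    {"xco2_ppm": 389.2, "xco2_error_ppm": 0.60},
    {"xco2_ppm": 391.0, "xco2_error_ppm": 0.75},  # discarded: error >= 0.75
    {"xco2_ppm": 388.1, "xco2_error_ppm": 1.10},  # discarded: error >= 0.75
]
kept = screen_soundings(soundings)
```

Note that a sounding with an error of exactly 0.75 ppm is rejected, since the text discards errors greater than or equal to the threshold.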
In order to assimilate the spaceborne XCO2 directly,
the following observation operator (Eq. 25) needs to be incorporated into
Tan-Tracker to provide a link between the observational variable XCO2 and the GEOS-Chem-simulated CO2 concentrations (Feng et
al., 2009):
$$X_{\mathrm{CO_2}} = X_{\mathrm{CO_2},a} + \mathbf{h}^{T}\mathbf{A}\,(\mathbf{u}_m - \mathbf{u}_a), \qquad (25)$$
where h is the pressure weighting function; A is
the full averaging kernel matrix; ua and XCO2,a are the prior CO2 profile and the associated
column amount, respectively; and um is the GEOS-Chem-produced CO2 profile. The experiment period is from 1 January 2010 to
31 March 2010. In particular, we chose the XCO2 data of one arbitrary day
(15 March 2010 in this experiment) as the evaluation data
set; these data are deliberately not assimilated, so that they provide an
independent evaluation of the Tan-Tracker system.
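As an illustration of how the observation operator maps a model profile to a column value, the sketch below evaluates XCO2 = XCO2,a + hT A (um - ua) in plain Python. The function name, the four-level profiles, and the numerical values are assumptions made for this example; in practice h, A, ua and XCO2,a come from the retrieval product, and the model profile must first be interpolated to the retrieval levels, a step that is omitted here.

```python
def xco2_from_profile(xco2_prior, h, A, u_model, u_prior):
    """Apply XCO2 = XCO2_a + h^T A (u_m - u_a).

    h        : pressure weighting function (length n)
    A        : averaging kernel matrix (n x n, list of rows)
    u_model  : model CO2 profile on the retrieval levels (length n)
    u_prior  : retrieval prior CO2 profile (length n)
    """
    # Departure of the model profile from the retrieval prior.
    diff = [um - ua for um, ua in zip(u_model, u_prior)]
    # Smooth the departure with the averaging kernel: A (u_m - u_a).
    smoothed = [sum(a * d for a, d in zip(row, diff)) for row in A]
    # Weight by pressure and add to the prior column amount.
    return xco2_prior + sum(hi * si for hi, si in zip(h, smoothed))


# Toy 4-level example with made-up values (ppm).
n = 4
h = [0.25] * n                                   # uniform weights summing to 1
identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
u_prior = [390.0] * n
u_model = [392.0] * n
xco2 = xco2_from_profile(390.0, h, identity, u_model, u_prior)
```

With an identity averaging kernel, the result reduces to the pressure-weighted model column; with a zero kernel, the operator returns the prior column amount unchanged, which shows how A controls how much of the model information enters the simulated XCO2.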
Experimental results
The lack of reliable independent CF estimates derived from GOSAT XCO2
retrievals (Chevallier et al., 2014) forces us to seek an indirect way to
evaluate the Tan-Tracker assimilations. Here, we performed a parallel free run of the
GEOS-Chem forward simulation without any data assimilation. Then, to examine
Tan-Tracker's performance quantitatively, the simulated and assimilated
XCO2 values on
15 March 2010 were compared with the corresponding (independent) GOSAT
observations. After the data quality control (observation error < 0.75 ppm) applied in this experiment, 163 valid
footprints remain for system evaluation. Compared with the Sim case, the
TT-assimilated XCO2 is improved considerably, with a
higher correlation (0.83 vs. 0.77) and a smaller RMSE (1.38 ppm vs. 2.95 ppm).
The GEOS-Chem model generally underestimates the XCO2
values by a substantial negative bias of -2.46 ppm, where the
mean bias is given by
$$\mathrm{err} = \frac{1}{163}\sum_{i=1}^{163}\left(X_{\mathrm{CO_2},i}^{s(a)} - X_{\mathrm{CO_2},i}^{o}\right),$$
with XCO2,is(a) and XCO2,io being the simulated
(assimilated) and observed XCO2 values for each valid
footprint, respectively. However, the TT-assimilated case only has a
very small bias (err = -0.45 ppm). Obviously, the above discussions could
only demonstrate that our Tan-Tracker system is capable of yielding fairly
good CO2 concentration results. It is encouraging to find that the
performance of the TT-Sim case is only slightly inferior to that of the TT case (RMSE = 1.45 ppm and r = 0.83), suggesting that Tan-Tracker does enhance both the
CO2 concentration and flux estimations. It thus provides a promising new
tool for CO2 surface flux (CF) inversion. In addition, in Fig. 14,
α (0.01) is the confidence coefficient. Certainly, extra efforts
should be made to give a more detailed assessment of Tan-Tracker satellite
data assimilation, which will be provided in another study.
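The three statistics used in this evaluation (mean bias, RMSE and correlation coefficient) can be computed as in the sketch below. It is a minimal illustration, assuming paired lists of modelled and observed XCO2 at the valid footprints; the function name and the four sample values are made up and are not the 163 GOSAT footprints used in the experiment.

```python
import math


def evaluate(model, obs):
    """Return (mean bias, RMSE, Pearson r) of model values against observations."""
    n = len(obs)
    diffs = [m - o for m, o in zip(model, obs)]
    # Mean bias: average of (model - observation), negative if the model is low.
    bias = sum(diffs) / n
    # Root-mean-square error of the same differences.
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    # Pearson correlation coefficient between model and observations.
    mean_m = sum(model) / n
    mean_o = sum(obs) / n
    cov = sum((m - mean_m) * (o - mean_o) for m, o in zip(model, obs))
    var_m = sum((m - mean_m) ** 2 for m in model)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    r = cov / math.sqrt(var_m * var_o)
    return bias, rmse, r


# Made-up values (ppm): the model is uniformly 2 ppm too low.
obs = [388.0, 390.0, 392.0, 394.0]
model = [386.0, 388.0, 390.0, 392.0]
bias, rmse, r = evaluate(model, obs)
```

In this toy case the correlation is perfect even though the bias is large, which is why the bias and RMSE are reported alongside the correlation in the comparison above.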
Summary and concluding remarks
In this study, a new carbon cycle data assimilation system (i.e.,
Tan-Tracker) is developed based on an advanced hybrid assimilation approach
(PODEn4DVar), as a part of the preparation for the launch of the Chinese
carbon dioxide observation satellite (TanSat) (Liu et al., 2012; Cai et al.,
2014). Tan-Tracker adopts a joint data assimilation framework: a simple
persistence model is chosen to describe the CFs' evolution, which acts as
the CF dynamical sub-model and constitutes an augmented dynamical model
together with the GEOS-Chem atmospheric transport model. In such an
augmented dynamical model, the large-scale state vector made up of CFs and
CO2 concentrations is actually the prognostic variable, which is
designed to be simultaneously constrained by the observations of atmospheric
CO2 concentrations. As a step towards the application of Tan-Tracker,
we carefully designed several groups of observing system simulation
experiments (OSSEs) to comprehensively evaluate Tan-Tracker's performance
in comparison to its simplified version (TT-S), taking only
CFs as the prognostic variables. It is found that the simultaneous
estimation of CO2 concentrations and CFs plays a vital role in enhancing
the Tan-Tracker system's performance: the contamination of the CF
estimates by the uncertainty in the CO2
concentration evolution is gradually reduced by continuously
fitting the simulated CO2 concentrations to the observations.
Our future work will focus on the realization of XCO2 assimilation in the first version of Tan-Tracker, which is a key
step towards extending Tan-Tracker with functions for assimilating satellite
measurements. This goal could be achieved by introducing the observation
operator that links the CO2 concentration profiles with XCO2. As the Chinese TanSat has not
yet been launched, we will first apply the proposed Tan-Tracker to GOSAT and OCO-2 (O'Dell
et al., 2012) measurements of CO2. Encouragingly, a preliminary real-data assimilation experiment using spaceborne (GOSAT)
observations demonstrates its potential for wider applications.