Introduction
Considerable progress has been made in recent years in reducing
the uncertainties of surface CO2 flux estimates through the use of
advanced data assimilation techniques (e.g. Chevallier, 2007; Chevallier et al., 2005, 2007; Baker et al., 2006; Engelen et al., 2009; Liu et al., 2012). Feng et
al. (2009) showed that the uncertainties of surface CO2 flux estimates
can be reduced significantly by assimilating OCO XCO2
measurements. Peters et al. (2005, 2007, 2009) developed a surface CO2
flux inversion system, CarbonTracker, by incorporating the ensemble
square-root filter (EnSRF) into the atmospheric transport TM5 model; the
inversion results obtained by assimilating in situ surface CO2
observations are in excellent agreement with a wide collection of carbon
inventories that form the basis of the first North American State of the
Carbon Cycle Report (SOCCR) (Peters et al., 2007). CarbonTracker has also been frequently
used to constrain the surface CO2 fluxes over Europe and Asia (e.g.
Zhang et al., 2014a, b). Kang et al. (2012) presented a simultaneous data
assimilation of surface CO2 fluxes and atmospheric CO2
concentrations along with meteorological variables using a local ensemble
transform Kalman filter (LETKF). They showed that the evolving surface
fluxes can be estimated accurately even without any a priori
information. Recently, Tian et al. (2014) developed a new surface CO2
flux data assimilation system, Tan-Tracker, by incorporating a joint
PODEn4DVar assimilation framework into the GEOS-Chem model on the basis of
Peters et al. (2005, 2007) and Kang et al. (2011, 2012). They discussed in
detail that the assimilation of CO2 surface fluxes could be improved
through the use of simultaneous assimilation of CO2 concentrations and
CO2 surface fluxes. Despite the rigour of data assimilation theory,
current CO2 flux-inversion methods still face many challenging
scientific problems, such as: (1) the well-known “signal-to-noise” problem
(NRC, 2010); (2) large inaccuracies in chemical transport models (e.g.
Prather et al., 2008); (3) vast computational expenses (e.g. Feng et al.,
2009); and (4) the sparseness of observation data (e.g. Gurney et al.,
2002).
The “signal-to-noise” problem is one of the most challenging issues for an
ensemble-based CO2 flux inversion system due to the fact that surface
CO2 fluxes are the model forcing (or boundary condition), rather than
model states (like CO2 concentrations), of the chemistry transport
model (CTM). In the absence of a suitable dynamical model to describe the
evolution of the surface CO2 fluxes, most CO2 flux-inversion
studies have traditionally ignored the uncertainty of anthropogenic and
other CO2 emissions and focused on the optimization of natural (i.e.
biospheric and oceanic) CO2 emissions at the ecological scale (e.g.
Deng et al., 2007; Feng et al., 2009; Peters et al., 2005, 2007; Jiang et
al., 2013; Peylin et al., 2013).
This compromise is acceptable to some extent. Indeed, the total amount of
anthropogenic CO2 emissions can be estimated by relatively
well-documented global fuel-consumption data with a small degree of
uncertainty (Boden et al., 2011), and the uncertainties involved in the
total amount of anthropogenic CO2 emissions are much smaller than those
related to natural emissions. However, their spatial distribution, strength
and temporal development still remain elusive because of their inherent
non-uniformities (Andres et al., 2012; Gurney et al., 2009). Marland (2008)
pointed out that even a tiny amount of uncertainty, i.e. 0.9 %, in one of
the leading emitter countries like the U.S. is equivalent to the total
emissions of the smaller emitter countries in the world. Furthermore, the
usual values of anthropogenic CO2 emissions in chemical transport
models have thus far been simply interpolated from very coarse monthly-mean
fuel consumption data. Therefore, great uncertainty in the spatiotemporal
distributions of anthropogenic emissions likely exists, which could reduce
the accuracy of CO2 concentration simulations and subsequently increase
the inaccuracy of natural CO2 flux inversion results. In addition,
current research approaches tend only to assimilate natural CO2
emissions at the ecological scale, which is far from sufficient. Therefore,
surface CO2 fluxes should be constrained as a whole at a finer scale.
In CarbonTracker (Peters et al., 2007), a smoothing operator is innovatively
applied as the persistence forecast model. In that application, the surface
CO2 fluxes can be treated as the model states and the observed
information ingested by the current assimilation cycle can be used in the
next assimilation cycle effectively. However, the “signal-to-noise” problem
has not yet been resolved, and thus CarbonTracker also has to assimilate natural
CO2 emissions at the ecological scale only. In Tan-Tracker (Tian et al.,
2014), a four-dimensional (4-D) moving sampling strategy (Wang et al., 2010) is used to generate
the flux ensemble members, and so the surface CO2 fluxes can be
optimized as a whole at the grid scale. In this work, the
persistence dynamical model used by Peters et al. (2005) was developed
further to resolve the “signal-to-noise” problem, so that the surface
CO2 fluxes can be optimized as a whole at the grid scale. This
process is described in detail in Sect. 2 of this paper.
The surface CO2 flux inversion system presented in this paper was
developed by simultaneously optimizing the surface CO2 fluxes and
constraining the CO2 concentrations. Assimilating CO2
observations from multiple sources is known to improve the accuracy of
simulation results (e.g. Miyazaki, 2009; Liu et al., 2011, 2012; Feng et al., 2011;
Tangborn et al., 2013; Huang et al., 2014). In addition, previous studies
showed that the simultaneous assimilation of CO2 concentrations and
surface CO2 fluxes can largely eliminate the effect of uncertainty in the
initial CO2 concentrations on the CO2 evolution (Kang et al., 2012; Tian et
al., 2014). Therefore, we also use the simultaneous assimilation framework;
the ensemble Kalman filter (EnKF) was used to constrain CO2
concentrations and the ensemble Kalman smoother (EnKS) was used to optimize
surface CO2 fluxes. Since regional chemical transport models have some
advantages over global models in reproducing the effects of
meso- and micro-scale transport on atmospheric CO2 distributions (Ahmadov
et al., 2009; Pillai et al., 2011; Kretschmer et al., 2012), we choose a
regional model, Regional Atmospheric Modeling System and Community
Multi-scale Air Quality (RAMS-CMAQ) (Zhang et al., 2002, 2003, 2007; Kou et
al., 2013; Liu et al., 2013; Huang et al., 2014), to develop this inversion
system. For simplicity, this system is referred to as CFI-CMAQ (Carbon Flux
Inversion system and Community Multi-scale Air Quality).
Since this is the first introduction of CFI-CMAQ, we focus mainly on
introducing the methodology in this paper. In addition,
Observing System Simulation Experiments (OSSEs) were designed to assess the
system's ability to optimize surface CO2 fluxes. The retrieval
information of GOSAT XCO2 is used to generate artificial observations
because of the sparseness and heterogeneity of ground-based measurements.
The remainder of the paper is organized as follows. Section 2 describes the
details of the regional surface CO2 flux inversion system, CFI-CMAQ,
including the developed persistence dynamical model, a simple review of the
EnKS and EnKF assimilation approaches, and the process involved. The
experimental designs are then introduced and the assimilation results shown
in Sect. 3. Finally, a summary and conclusions are provided in Sect. 4.
Framework of the regional surface CO2 flux inversion system
Suppose we have a prescribed net CO2 surface flux, $F^{*}(x,y,z,t)$, which
can be obtained from a climate model or generated by other methods. Our
ultimate goal is to optimize $F^{*}(x,y,z,t)$ by assimilating CO2
observations from various platforms. As an ensemble-based assimilation
system, CFI-CMAQ applies a set of linear multiplication factors, similar to
the approach of Peters et al. (2007) and Tian et al. (2014). The $i$th
ensemble member of the surface fluxes, $F_{i}(x,y,z,t)$, from an $N$-member
ensemble can be described by
$$F_{i}(x,y,z,t)=\lambda_{i}(x,y,z,t)\,F^{*}(x,y,z,t),\quad (i=1,\ldots,N), \tag{1}$$
where $\lambda_{i}(x,y,z,t)$ represents the $i$th ensemble member of the
linear scaling factors (Peters et al., 2007; Tian et al., 2014) to be
optimized in the assimilation for each time and each grid cell. The
subscript $i$ refers to the $i$th ensemble member. In the following,
$\lambda_{i}(x,y,z,t)$ is referred to as $\lambda_{i,t}$, $F^{*}(x,y,z,t)$ is
referred to as $F^{*}_{t}$, and $F_{i}(x,y,z,t)$ is referred to as
$F_{i,t}$ for simplicity.
At each optimization cycle, CFI-CMAQ performs the following four steps in
turn (see Fig. 1): (1) forecasting of the linear scaling factors at time
$t$, $\lambda^{a}_{i,t|t-1}$; (2) optimization of the scaling factors in the
smoother window,
$(\lambda^{a}_{i,t-M|t-1},\lambda^{a}_{i,t-M+1|t-1},\ldots,\lambda^{a}_{i,j|t-1},\ldots,\lambda^{a}_{i,t-1|t-1},\lambda^{a}_{i,t|t-1})$,
by EnKS, where $\lambda^{a}_{i,j|t-1}$ $(j=t-1-M,\ldots,t-1)$ refers to the
analysed quantity at time $j$ from the previous assimilation cycle (see
Fig. 1), the subscript $|t-1$ means that these factors have been updated
using observations up to time $t-1$, and the superscript $a$ denotes an
analysed quantity; (3) updating of the fluxes in the smoother window,
$(F^{a}_{i,t-M|t-1},F^{a}_{i,t-M+1|t-1},\ldots,F^{a}_{i,j|t-1},\ldots,F^{a}_{i,t-1|t-1},F^{a}_{i,t|t-1})$;
and (4) assimilation of the forecast CO2 concentration fields at time $t$,
$C^{f}_{i}(x,y,z,t)$ (referred to as $C^{f}_{i,t}$, where the superscript
$f$ denotes the forecast or background), by EnKF. A flowchart illustrating
CFI-CMAQ is presented in Fig. 2. The assimilation procedure is described in
detail below. In addition, the observation operator, in particular for
GOSAT XCO2 data, is introduced in Sect. 2.4, and the covariance inflation
and localization techniques applied in CFI-CMAQ are introduced briefly in
Sect. 2.5.
Forecasting the linear scaling factors at time $t$, $\lambda^{a}_{i,t|t-1}$
In the previous assimilation cycle $t-1-M \sim t-1$ (see Fig. 1), the
optimized scaling factors in the smoother window are
$(\lambda^{a}_{i,t-1-M|t-1},\lambda^{a}_{i,t-M|t-1},\lambda^{a}_{i,t-M+1|t-1},\ldots,\lambda^{a}_{i,j|t-1},\ldots,\lambda^{a}_{i,t-1|t-1})$
and the assimilated CO2 concentration fields at time $t-1$ are
$C^{a}_{i}(x,y,z,t-1)$ (referred to as $C^{a}_{i,t-1}$). In the current
assimilation cycle $t-M \sim t$, the scaling factors in the smoother window
are
$(\lambda^{a}_{i,t-M|t-1},\lambda^{a}_{i,t-M+1|t-1},\ldots,\lambda^{a}_{i,j|t-1},\ldots,\lambda^{a}_{i,t-1|t-1},\lambda^{a}_{i,t|t-1})$
and the forecast CO2 concentration fields at time $t$ are $C^{f}_{i,t}$.
In order to pass the useful observed information on to the next assimilation
cycle effectively, following Peters et al. (2007), the smoothing operator is
applied as part of the persistence dynamical model to calculate the linear
scaling factors $\lambda^{a}_{i,t|t-1}$,
$$\lambda^{a}_{i,t|t-1}=\frac{\sum_{j=t-M}^{t-1}\lambda^{a}_{i,j|t-1}+\lambda^{p}_{i,t|t-1}}{M+1},\quad (i=1,\ldots,N), \tag{2}$$
where $\lambda^{p}_{i,t|t-1}$ refers to the prior values of the linear
scaling factors at time $t$, and the superscript $p$ denotes the prior.
This operation represents a smoothing over all the time steps in the
smoother window (see Fig. 1), thus dampening variations of the forecast
$\lambda^{a}_{i,t|t-1}$ in time.
In order to generate $\lambda^{p}_{i,t|t-1}$, a set of ensemble forecast
experiments is carried out with the atmospheric transport model (CMAQ).
The model is integrated from time $t-1$ to $t$, forced by the prescribed
net CO2 surface flux $F^{*}_{t}$ with $C^{a}_{i,t-1}$ as initial
conditions, to produce the CO2 concentration fields
$\hat{C}^{f}_{i}(x,y,z,t)$ (referred to as $\hat{C}^{f}_{i,t}$ hereafter to
distinguish them from $C^{f}_{i,t}$). Then, the ratio
$$\kappa_{i,t}=\frac{\hat{C}^{f}_{i,t}}{\overline{\hat{C}^{f}_{i,t}}},\qquad \overline{\hat{C}^{f}_{i,t}}=\frac{1}{N}\sum_{i=1}^{N}\hat{C}^{f}_{i,t},$$
is calculated. Because the surface CO2 fluxes correlate with the CO2
concentrations, we assume that $\lambda^{p}_{i,t|t-1}=\kappa_{i,t}$. The
values of $\lambda^{p}_{i,t|t-1}$ are thus obtained, and
$\lambda^{a}_{i,t|t-1}$ can finally be
calculated (see the red arrows in the flowchart in Fig. 2).
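The forecast step described above can be illustrated with a minimal NumPy sketch. It is not the CFI-CMAQ code: the function names, array shapes, and toy values are assumptions made for illustration, and the trial forecast from the transport model is replaced by random numbers.

```python
import numpy as np

def prior_scaling_factors(c_hat):
    """kappa_{i,t}: each member's trial forecast divided by the ensemble mean.

    c_hat has shape (N, n_grid): N ensemble members of the trial CO2 forecast.
    """
    c_mean = c_hat.mean(axis=0)      # ensemble mean at every grid cell
    return c_hat / c_mean            # ratio field whose ensemble mean is 1

def forecast_scaling_factors(lam_window, lam_prior):
    """Eq. (2): average the M analysed factors and the prior over M + 1 terms.

    lam_window has shape (M, N, n_grid); lam_prior has shape (N, n_grid).
    """
    m = lam_window.shape[0]
    return (lam_window.sum(axis=0) + lam_prior) / (m + 1)

# Toy setup (all sizes and values are hypothetical).
rng = np.random.default_rng(0)
N, M, n_grid = 4, 3, 5
c_hat = 390.0 + rng.normal(0.0, 2.0, size=(N, n_grid))  # stands in for the CMAQ trial run
lam_prior = prior_scaling_factors(c_hat)
lam_window = np.ones((M, N, n_grid))                    # pretend all past analyses equal 1
lam_forecast = forecast_scaling_factors(lam_window, lam_prior)
```

Because the prior factors are ratios to the ensemble mean, their ensemble mean is 1 at every grid cell, and the smoothing pulls the forecast toward the analysed factors already in the window.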
Figure 1. Schematic diagram of the smoother window.
$(\lambda^{a}_{i,t-1-M|t-1},\lambda^{a}_{i,t-M|t-1},\lambda^{a}_{i,t-M+1|t-1},\ldots,\lambda^{a}_{i,j|t-1},\ldots,\lambda^{a}_{i,t-1|t-1})$
are the optimized scaling factors in the smoother window and
$C^{a}_{i,t-1}$ are the assimilated CO2 concentration fields at time $t-1$
in the previous assimilation cycle $t-1-M \sim t-1$.
$(\lambda^{a}_{i,t-M|t-1},\lambda^{a}_{i,t-M+1|t-1},\ldots,\lambda^{a}_{i,j|t-1},\ldots,\lambda^{a}_{i,t-1|t-1},\lambda^{a}_{i,t|t-1})$
are the scaling factors in the smoother window and $C^{f}_{i,t}$ are the
forecast CO2 concentration fields at time $t$ that need to be optimized in
the current assimilation cycle $t-M \sim t$.
Figure 2. Flowchart of the CFI-CMAQ system used to optimize surface CO2
fluxes at each assimilation cycle. The system includes the following four
parts in turn: (1) forecasting of the linear scaling factors
$\lambda^{a}_{i,t|t-1}$ (red arrows); (2) optimization of the scaling
factors in the smoother window by EnKS (see Fig. 1) (blue arrows); (3)
updating of the fluxes in the smoother window (green arrows); and (4)
assimilation of the CO2 concentration fields at time $t$ by EnKF (black
arrows).
The way in which the prior scaling factors $\lambda^{p}_{i,t|t-1}$ are
updated through the atmospheric transport model is the main improvement
over the approach used in CarbonTracker (Peters et al., 2007). In
CarbonTracker, all $\lambda^{p}_{i,t|t-1}$ are set to 1 (Peters et al.,
2007), so the distribution of the ensemble members of the forecast scaling
factors at time $t$ depends entirely on the distribution of the previous
scaling factors, because Eq. (2) is a linear smoothing operator. In this
study, by contrast, the values of $\lambda^{p}_{i,t|t-1}$ are derived from
the atmospheric transport model and are random fields with mean 1, so the
distribution of $\lambda^{a}_{i,t|t-1}$ depends on the distribution of all
the scaling factors in the smoother window. An OSSE designed to illustrate
the difference between our method and the one in which
$\lambda^{p}_{i,t|t-1}$ are set to 1 is described in Sect. 3.
It is also important to note that, similar to Peters et al. (2007), this
dynamical model does not include an error term, so the model error cannot
yet be estimated. However, covariance inflation is applied before
optimization to compensate for model errors, as described in Sect. 2.5.
Optimizing the scaling factors in the smoother window by EnKS
Substituting $\lambda^{a}_{i,t|t-1}$ into Eq. (1) generates the $i$th
member of the surface fluxes at time $t$, $F^{a}_{i,t|t-1}$. Then, forced
by $F^{a}_{i,t|t-1}$, CMAQ is run from time $t-1$ to $t$ with
$C^{a}_{i,t-1}$ as initial conditions to produce the background
concentration field $C^{f}_{i,t}$.
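In the current assimilation cycle $t-M \sim t$ (see Fig. 1), the scaling
factors to be optimized in the smoother window are
$(\lambda^{a}_{i,t-M|t-1},\lambda^{a}_{i,t-M+1|t-1},\ldots,\lambda^{a}_{i,j|t-1},\ldots,\lambda^{a}_{i,t-1|t-1},\lambda^{a}_{i,t|t-1})$,
as stated in the first paragraph of Sect. 2.1. Using the EnKS analysis
technique, these scaling factors are updated in turn via
$$\lambda^{a}_{i,j|t}=\lambda^{a}_{i,j|t-1}+K^{e}_{j,t|t-1}\left(y^{\mathrm{obs}}_{t}-y^{f}_{i,t}+\upsilon_{i,t}\right),\quad (i=1,\ldots,N,\; j=t-M,\ldots,t),$$
$$K^{e}_{j,t|t-1}=S^{e}_{j,t|t-1}H^{\mathrm{T}}\left(HS^{e}_{t,t|t-1}H^{\mathrm{T}}+R\right)^{-1},$$
$$S^{e}_{j,t|t-1}=\frac{1}{N-1}\sum_{i=1}^{N}\left[\lambda^{a}_{i,j|t-1}-\overline{\lambda^{a}_{i,j|t-1}}\right]\left[\lambda^{a}_{i,t|t-1}-\overline{\lambda^{a}_{i,t|t-1}}\right]^{\mathrm{T}},$$
$$S^{e}_{t,t|t-1}=\frac{1}{N-1}\sum_{i=1}^{N}\left[\lambda^{a}_{i,t|t-1}-\overline{\lambda^{a}_{i,t|t-1}}\right]\left[\lambda^{a}_{i,t|t-1}-\overline{\lambda^{a}_{i,t|t-1}}\right]^{\mathrm{T}},$$
$$y^{f}_{i,t}=H\left(\phi_{t-1\to t}(\lambda^{a}_{i,t|t-1})\right)=H(C^{f}_{i,t}),$$
where $K^{e}_{j,t|t-1}$ is the Kalman gain matrix of EnKS,
$y^{\mathrm{obs}}_{t}$ is the observation vector measured at time $t$,
$y^{f}_{i,t}$ are the simulated values, $\upsilon_{i,t}$ is a random
perturbation field drawn from a normal distribution with zero mean,
$S^{e}_{j,t|t-1}$ is the background error cross-covariance between the
state vectors $\lambda^{a}_{i,j|t-1}$ and $\lambda^{a}_{i,t|t-1}$,
$S^{e}_{t,t|t-1}$ is the background error covariance of the state vector
$\lambda^{a}_{i,t|t-1}$, $H(\cdot)$ is the observation operator that maps
the state variable from model space into observation space, $R$ is the
observation error covariance matrix representing the measurement errors,
and $\phi(\cdot)$ is the atmospheric transport model.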
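In actual implementations, it is unnecessary to calculate
$S^{e}_{j,t|t-1}$ and $S^{e}_{t,t|t-1}$ separately.
$S^{e}_{j,t|t-1}H^{\mathrm{T}}$ and $HS^{e}_{t,t|t-1}H^{\mathrm{T}}$ can
each be calculated as a whole by
$$S^{e}_{j,t|t-1}H^{\mathrm{T}}=\frac{1}{N-1}\sum_{i=1}^{N}\left[\lambda^{a}_{i,j|t-1}-\overline{\lambda^{a}_{i,j|t-1}}\right]\left[y^{f}_{i,t}-\overline{y^{f}_{t}}\right]^{\mathrm{T}},$$
$$HS^{e}_{t,t|t-1}H^{\mathrm{T}}=\frac{1}{N-1}\sum_{i=1}^{N}\left[y^{f}_{i,t}-\overline{y^{f}_{t}}\right]\left[y^{f}_{i,t}-\overline{y^{f}_{t}}\right]^{\mathrm{T}},$$
$$\overline{y^{f}_{t}}=H\left(\overline{C^{f}_{t}}\right)=H\left(\frac{1}{N}\sum_{i=1}^{N}C^{f}_{i,t}\right).$$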
After EnKS,
$(\lambda^{a}_{i,t-M|t},\lambda^{a}_{i,t-M+1|t},\ldots,\lambda^{a}_{i,j|t},\ldots,\lambda^{a}_{i,t-1|t},\lambda^{a}_{i,t|t})$
are obtained. The corresponding fluxes in the smoother window,
$(F^{a}_{i,t-M|t},F^{a}_{i,t-M+1|t},\ldots,F^{a}_{i,j|t},\ldots,F^{a}_{i,t-1|t},F^{a}_{i,t|t})$,
can then be obtained (see the green arrows in the flowchart in Fig. 2) by
substituting these scaling factors into Eq. (1).
The ensemble mean values of the assimilated fluxes in the smoother
window are then calculated via
$$\overline{F^{a}_{i,j|t}}=\frac{1}{N}\sum_{i=1}^{N}F^{a}_{i,j|t},\quad (j=t-M,\ldots,t).$$
Finally, those ensemble mean assimilated fluxes that lie before the next
smoother window and will not be updated by succeeding observations are
regarded as the final optimized fluxes. We refer to them as
$\overline{F^{a}_{t}}$ for simplicity.
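The EnKS update can be sketched in a few lines of NumPy, forming the gain directly from ensemble anomalies as in the expressions for the products with the observation operator. This is an illustration under stated assumptions rather than the CFI-CMAQ implementation: the function name, the array layout, and the toy linear observation operator `h_mat` (which stands in for the transport model plus the observation operator) are all hypothetical.

```python
import numpy as np

def enks_update(lam, y_f, y_obs, r_cov, rng):
    """Perturbed-observation EnKS update of the factors in the smoother window.

    lam   : (M+1, N, n) factors lambda^a_{i,j|t-1} for j = t-M, ..., t
    y_f   : (N, m) simulated observations y^f_{i,t}
    y_obs : (m,) observation vector at time t
    r_cov : (m, m) observation error covariance R
    """
    n_ens = y_f.shape[0]
    y_anom = y_f - y_f.mean(axis=0)                 # (N, m) anomalies in observation space
    hsht = y_anom.T @ y_anom / (n_ens - 1)          # H S_tt H^T, shape (m, m)
    # Perturbations upsilon_{i,t}, drawn here from N(0, R) (an assumption).
    eps = rng.multivariate_normal(np.zeros(len(y_obs)), r_cov, size=n_ens)
    innov = y_obs + eps - y_f                       # (N, m) innovations per member
    lam_new = np.empty_like(lam)
    for j in range(lam.shape[0]):
        lam_anom = lam[j] - lam[j].mean(axis=0)     # (N, n)
        sht = lam_anom.T @ y_anom / (n_ens - 1)     # S_jt H^T, shape (n, m)
        gain = sht @ np.linalg.inv(hsht + r_cov)    # K^e_{j,t|t-1}, shape (n, m)
        lam_new[j] = lam[j] + innov @ gain.T
    return lam_new

# Toy example: "true" factors equal 1, forecast ensemble is biased towards 1.5.
rng = np.random.default_rng(42)
n_ens, n_state = 200, 3
h_mat = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])            # hypothetical linear H
lam_t = 1.5 + 0.3 * rng.standard_normal((n_ens, n_state))        # factors at time t
lam_prev = lam_t + 0.05 * rng.standard_normal((n_ens, n_state))  # correlated factors at t-1
lam = np.stack([lam_prev, lam_t])                                # window with M = 1
y_f = lam_t @ h_mat.T
y_obs = h_mat @ np.ones(n_state)
r_cov = 0.01 * np.eye(2)
lam_a = enks_update(lam, y_f, y_obs, r_cov, rng)
```

In this toy case the observed components of both window entries move from about 1.5 towards 1, because the earlier factors are correlated with the factors at time t through the ensemble cross-covariance.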
Assimilating the CO2 concentration fields at time t by EnKF
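The analysis of the CO2 concentration fields at time $t$ in the EnKF scheme
is updated via
$$C^{a}_{i,t}=C^{f}_{i,t}+K\left(y^{\mathrm{obs}}_{t}-y^{f}_{i,t}+\upsilon_{i,t}\right),$$
$$K=P^{f}H^{\mathrm{T}}\left(HP^{f}H^{\mathrm{T}}+R\right)^{-1},$$
where $K$ is the Kalman gain matrix of EnKF and $P^{f}$ is the background
error covariance of the background CO2 concentration fields $C^{f}_{i,t}$.
In practice, $P^{f}H^{\mathrm{T}}$ and $HP^{f}H^{\mathrm{T}}$ can each be
calculated as a whole by
$$P^{f}H^{\mathrm{T}}=\frac{1}{N-1}\sum_{i=1}^{N}\left[C^{f}_{i,t}-\overline{C^{f}_{t}}\right]\left[y^{f}_{i,t}-\overline{y^{f}_{t}}\right]^{\mathrm{T}},$$
$$HP^{f}H^{\mathrm{T}}=\frac{1}{N-1}\sum_{i=1}^{N}\left[y^{f}_{i,t}-\overline{y^{f}_{t}}\right]\left[y^{f}_{i,t}-\overline{y^{f}_{t}}\right]^{\mathrm{T}},$$
$$\overline{C^{f}_{t}}=\frac{1}{N}\sum_{i=1}^{N}C^{f}_{i,t}.$$
Finally, the ensemble mean of the assimilated CO2 concentration fields is
obtained via
$$\overline{C^{a}_{t}}=\frac{1}{N}\sum_{i=1}^{N}C^{a}_{i,t},$$
where $\overline{C^{a}_{t}}$ is regarded as the final
analysis concentration field.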
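For comparison with the smoother, the concentration update can be sketched in the same style. Again this is only an illustration: `enkf_update`, the selection-type observation operator `h_mat`, and the toy numbers are assumptions, and each member is updated with its own simulated observation, as in the smoother step.

```python
import numpy as np

def enkf_update(c_f, y_f, y_obs, r_cov, rng):
    """Perturbed-observation EnKF analysis of the concentration ensemble.

    c_f : (N, n) background concentrations, y_f : (N, m) simulated observations.
    Returns the analysis ensemble and its ensemble mean.
    """
    n_ens = c_f.shape[0]
    c_anom = c_f - c_f.mean(axis=0)
    y_anom = y_f - y_f.mean(axis=0)
    pht = c_anom.T @ y_anom / (n_ens - 1)        # P^f H^T, shape (n, m)
    hpht = y_anom.T @ y_anom / (n_ens - 1)       # H P^f H^T, shape (m, m)
    gain = pht @ np.linalg.inv(hpht + r_cov)     # Kalman gain K
    eps = rng.multivariate_normal(np.zeros(len(y_obs)), r_cov, size=n_ens)
    c_a = c_f + (y_obs + eps - y_f) @ gain.T
    return c_a, c_a.mean(axis=0)

# Toy example: background is about 6 ppmv too high; two of four cells are observed.
rng = np.random.default_rng(7)
n_ens = 100
c_f = 396.0 + 2.0 * rng.standard_normal((n_ens, 4))
h_mat = np.eye(4)[:2]                            # hypothetical H: picks the first two cells
y_f = c_f @ h_mat.T
y_obs = np.full(2, 390.0)
r_cov = 1.5**2 * np.eye(2)                       # 1.5 ppmv measurement error
c_a, c_a_mean = enkf_update(c_f, y_f, y_obs, r_cov, rng)
```

The ensemble mean in the observed cells moves towards the observations by a fraction set by the ratio of background variance to total variance.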
The observation operator
As mentioned above, the observation operator $H(\cdot)$ transforms the
state variable from model space into observation space. Usually, it is the
spatial bilinear interpolator for traditional ground-based observations.
Since the GOSAT XCO2 retrieval is a weighted CO2 column average,
the simulated XCO2 should be calculated with the same weighted column
average method (Connor et al., 2008; Crisp et al., 2010, 2012; O'Dell et al.,
2012). Hence, the observation operator to assimilate the GOSAT XCO2
retrieval is
$$y^{f}_{i,t}=H\left(\phi_{t-1\to t}(\lambda^{a}_{i,t|t-1})\right)=H(C^{f}_{i,t})=y_{\mathrm{priori}}+h^{\mathrm{T}}a_{\mathrm{CO_2}}\left(S(C^{f}_{i,t})-f_{\mathrm{priori}}\right), \tag{16}$$
where $y^{f}_{i,t}$ is the simulated XCO2; $y_{\mathrm{priori}}$ is the a
priori CO2 column average used in the GOSAT XCO2 retrieval process;
$S(\cdot)$ is the spatial bilinear interpolation operator that interpolates
the simulated fields to the GOSAT XCO2 locations to obtain the simulated
CO2 vertical profiles there; $f_{\mathrm{priori}}$ is the a priori CO2
vertical profile used in the retrieval process; $h$ is the pressure
weighting function, which indicates the contribution of the retrieved value
from each layer of the atmosphere; and $a_{\mathrm{CO_2}}$ is the
normalized averaging kernel.
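A minimal sketch of this column operator for a single sounding is given below. It assumes that the model profile has already been interpolated to the sounding location and retrieval levels (the role of S), and that the averaging kernel acts level by level on the profile difference; the function name and all numbers are hypothetical.

```python
import numpy as np

def simulated_xco2(profile, y_priori, f_priori, h, a_co2):
    """y = y_priori + h^T [a_CO2 * (profile - f_priori)] for one sounding.

    profile  : model CO2 profile at the sounding location (ppmv per level)
    f_priori : a priori profile used in the retrieval
    h        : pressure weighting function (weights per level)
    a_co2    : normalized averaging kernel, applied level by level (assumption)
    """
    profile = np.asarray(profile, dtype=float)
    return y_priori + np.dot(h, a_co2 * (profile - f_priori))

# Toy four-level example.
h = np.array([0.2, 0.3, 0.3, 0.2])
a_co2 = np.array([1.0, 1.0, 0.9, 0.8])
f_priori = np.array([392.0, 391.0, 390.0, 389.0])
y_priori = float(np.dot(h, f_priori))           # 390.5 ppmv for this toy prior
model_profile = f_priori + 2.0                  # model is 2 ppmv above the prior everywhere
xco2 = simulated_xco2(model_profile, y_priori, f_priori, h, a_co2)
```

With these toy values the simulated column is 392.36 ppmv rather than 392.5 ppmv, because the averaging kernel is below 1 in the upper two levels and damps the model departure from the prior there.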
Covariance inflation and localization
In order to keep the ensemble spread of the CO2 concentrations at a
certain level and compensate for transport model error to prevent filter
divergence, covariance inflation is applied before updating the CO2
concentrations:
$$\left(C^{f}_{i,t}\right)_{\mathrm{new}}=\alpha\left(C^{f}_{i,t}-\overline{C^{f}_{i,t}}\right)+\overline{C^{f}_{i,t}},$$
where $\alpha$ is the inflation factor of the CO2 concentrations and
$\left(C^{f}_{i,t}\right)_{\mathrm{new}}$ is the final field used for
data assimilation.
Similarly, covariance inflation is also used to keep the ensemble spread of
the prior scaling factors at a certain level and compensate for dynamical
model error:
$$\left(\lambda^{p}_{i,t|t-1}\right)_{\mathrm{new}}=\beta\left(\lambda^{p}_{i,t|t-1}-\overline{\lambda^{p}_{i,t|t-1}}\right)+\overline{\lambda^{p}_{i,t|t-1}},$$
where $\beta$ is the inflation factor of the scaling factors and
$\left(\lambda^{p}_{i,t|t-1}\right)_{\mathrm{new}}$ are the final
scaling factors used for data assimilation.
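Both inflation steps have the same form, so one small helper covers them. The sketch below assumes the ensemble is stored with members along the first axis; the function name and the two-member example are illustrative only.

```python
import numpy as np

def inflate(ensemble, factor):
    """Scale ensemble anomalies about the ensemble mean by `factor`.

    ensemble has shape (N, n): N members, n state elements.
    The ensemble mean is unchanged; the spread is multiplied by `factor`.
    """
    mean = ensemble.mean(axis=0)
    return factor * (ensemble - mean) + mean

# Two-member toy ensemble inflated with a factor of 1.1.
ens = np.array([[1.0, 2.0], [3.0, 6.0]])
inflated = inflate(ens, 1.1)   # rows become [0.9, 1.8] and [3.1, 6.2]
```

The same call with the inflation factor of the scaling factors would be applied to the prior scaling-factor ensemble before it enters the smoothing step.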
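In addition, the Schur product is used to filter out spurious long-range
correlations (Houtekamer and Mitchell, 2001). Hence, the Kalman gain
matrices $K^{e}_{j,t|t-1}$ and $K$ are updated via
$$K^{e}_{j,t|t-1}=\left[(\rho\circ S^{e}_{j,t|t-1})H^{\mathrm{T}}\right]\left[H(\rho\circ S^{e}_{t,t|t-1})H^{\mathrm{T}}+R\right]^{-1},$$
$$K=\left[(\rho\circ P^{f})H^{\mathrm{T}}\right]\left[H(\rho\circ P^{f})H^{\mathrm{T}}+R\right]^{-1},$$
where the filtering matrix $\rho$ is calculated using the formula
$$C_{0}(r,c)=\begin{cases}-\dfrac{1}{4}\left(\dfrac{|r|}{c}\right)^{5}+\dfrac{1}{2}\left(\dfrac{|r|}{c}\right)^{4}+\dfrac{5}{8}\left(\dfrac{|r|}{c}\right)^{3}-\dfrac{5}{3}\left(\dfrac{|r|}{c}\right)^{2}+1, & 0\le |r|\le c,\\[2ex] \dfrac{1}{12}\left(\dfrac{|r|}{c}\right)^{5}-\dfrac{1}{2}\left(\dfrac{|r|}{c}\right)^{4}+\dfrac{5}{8}\left(\dfrac{|r|}{c}\right)^{3}+\dfrac{5}{3}\left(\dfrac{|r|}{c}\right)^{2}-5\left(\dfrac{|r|}{c}\right)+4-\dfrac{2}{3}\dfrac{c}{|r|}, & c\le |r|\le 2c,\\[2ex] 0, & 2c\le |r|,\end{cases}$$
where $c$ is the localization Schur radius and $|r|$ is the separation
distance. The matrix $\rho$ filters out the small background error
correlations associated with remote observations through the Schur product
(Tian et al., 2011), and, because the filtering function is smooth and
monotonically decreasing, the Schur product reduces the effect of
observations at intermediate distances gradually.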
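The filtering function is the fifth-order piecewise rational function often called the Gaspari-Cohn function, which reaches zero at a distance of 2c. The sketch below evaluates it with NumPy and builds a small filtering matrix for a hypothetical one-dimensional row of grid points; the function name and the 1-D layout are assumptions made for illustration.

```python
import numpy as np

def gaspari_cohn(r, c):
    """Evaluate C0(r, c) element-wise; the result is 1 at r = 0 and 0 for |r| >= 2c."""
    z = np.abs(np.atleast_1d(np.asarray(r, dtype=float))) / c
    rho = np.zeros_like(z)
    inner = z <= 1.0
    outer = (z > 1.0) & (z < 2.0)
    zi = z[inner]
    rho[inner] = -0.25 * zi**5 + 0.5 * zi**4 + 0.625 * zi**3 - (5.0 / 3.0) * zi**2 + 1.0
    zo = z[outer]
    rho[outer] = (zo**5 / 12.0 - 0.5 * zo**4 + 0.625 * zo**3
                  + (5.0 / 3.0) * zo**2 - 5.0 * zo + 4.0 - 2.0 / (3.0 * zo))
    return rho

# Filtering matrix for five points spaced 64 km apart with c = 1280 km.
x = np.arange(5) * 64.0
rho = gaspari_cohn(x[:, None] - x[None, :], 1280.0)

# Values at r = 0, c, 2c and beyond 2c.
samples = gaspari_cohn(np.array([0.0, 1280.0, 2560.0, 3000.0]), 1280.0)
```

The two branches meet at |r| = c with the value 5/24, so the function is continuous there, and the element-wise product of `rho` with an ensemble covariance leaves the variances on the diagonal unchanged.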
Figure 3. (a) Total number of observations in February 2010 in the
model grid. Each symbol indicates the total number of all GOSAT
XCO2 measurements in the corresponding model grid cell. Monthly
mean values in February 2010 of (b) $\mathrm{XCO}_2^{p}$, the column mixing
ratio of $C^{p}_{t}$; (c) $\mathrm{XCO}_2^{f}$, the column mixing ratio of
$C^{f}_{t}$; (d) $\overline{\mathrm{XCO}_2^{a}}$, the column mixing ratio of
$\overline{C^{a}_{t}}$; (e) $\mathrm{XCO}_2^{p}-\mathrm{XCO}_2^{f}$; and (f)
$\mathrm{XCO}_2^{p}-\overline{\mathrm{XCO}_2^{a}}$. All column mixing ratios
are column-averaged with real GOSAT XCO2 averaging kernels at GOSAT XCO2
locations. Each symbol indicates the monthly average value of all
XCO2 estimates in the model grid cell. $\overline{C^{a}_{t}}$ are the
ensemble mean values of the assimilated CO2 concentration fields of a
CFI-CMAQ OSSE in which the lag-window was 9 days and $\beta$ was 70. The
same OSSE is used in Figs. 3–6.
Figure 4. Monthly mean values of (a) $C^{p}_{t}$,
the artificial true simulations driven by the prior surface CO2 fluxes
$F^{p}_{t}$; (b) $C^{f}_{t}$, the
background simulations driven by the magnified surface CO2 fluxes
$F^{*}_{t}=(1.8+\delta(x,y,z,t))F^{p}_{t}$;
(c) $\overline{C^{a}_{t}}$, the ensemble mean
values of the assimilated CO2 concentration fields; (d)
$C^{p}_{t}-C^{f}_{t}$; (e)
$C^{p}_{t}-\overline{C^{a}_{t}}$; and
(f) $100\times(C^{p}_{t}-\overline{C^{a}_{t}})/C^{p}_{t}$ at model level 1
in February 2010. Black
lines EF and GH indicate the positions of the cross sections shown in
Fig. 5.
Figure 5. Monthly mean cross sections of $C^{p}_{t}-C^{f}_{t}$ along line
(a) EF and (b) GH, and monthly mean cross sections of
$C^{p}_{t}-\overline{C^{a}_{t}}$ along line (c) EF and (d) GH
(cross section lines shown in Fig. 4d) in February 2010.
Figure 6. Daily mean time series of CO2 concentrations at national
background stations in China and their nearest large cities from 1 January to
20 March 2010 extracted from the artificial true simulations
$C^{p}_{t}$ (black), background simulations
$C^{f}_{t}$ (red), and the ensemble mean values of the
assimilated CO2 concentration fields $\overline{C^{a}_{t}}$ (blue). All
time series were interpolated to
the observation locations by spatial bilinear interpolation. The
sites used are (a) Waliguan (36.28° N, 100.91° E),
(b) Xining (36.56° N, 101.74° E), (c)
Longfengshan (44.73° N, 127.6° E), (d) Haerbin
(45.75° N, 126.63° E), (e) Shangdianzi
(40.65° N, 117.12° E), (f) Beijing
(39.92° N, 116.46° E), (g) Linan
(30.3° N, 119.73° E), and (h) Hangzhou
(30.3° N, 120.2° E).
OSSEs for evaluation of CFI-CMAQ
A set of OSSEs was designed to quantitatively assess the performance of
CFI-CMAQ. The setup of the experiments and the results are described in this
section.
Experimental setup
The chemical transport model utilized was RAMS-CMAQ (Zhang et al., 2002), in
which CO2 was treated as an inert tracer. The model domain was
6654 km × 5440 km on a rotated polar stereographic map projection
centred at (35.0° N, 116.0° E), with a horizontal grid
resolution of 64 km × 64 km and 15 vertical layers in the
$\sigma_{z}$-coordinate system, unequally spaced from the surface to
approximately 23 km. The initial fields and boundary conditions of the
CO2 concentrations were interpolated from the simulated CO2 fields
of CarbonTracker 2011 (Peters et al., 2007). The prior surface CO2 fluxes
included biosphere–atmosphere CO2 fluxes, ocean–atmosphere CO2
fluxes, anthropogenic emissions, and biomass-burning emissions (Kou et al.,
2013),
$$F^{p}(x,y,z,t)=F_{\mathrm{bio}}(x,y,z,t)+F_{\mathrm{oce}}(x,y,z,t)+F_{\mathrm{ff}}(x,y,z,t)+F_{\mathrm{fire}}(x,y,z,t),$$
where $F^{p}(x,y,z,t)$ (referred to as $F^{p}_{t}$) was the
prior surface CO2 flux; $F_{\mathrm{bio}}(x,y,z,t)$ and
$F_{\mathrm{oce}}(x,y,z,t)$ were the biosphere–atmosphere and
ocean–atmosphere CO2 fluxes, respectively, which were obtained from the
optimized results of CarbonTracker 2011 (Peters et al., 2007);
$F_{\mathrm{ff}}(x,y,z,t)$ was the fossil fuel emissions, adopted from the
Regional Emission inventory in ASia (REAS, 2005 Asia monthly mean emission
inventory) with a spatial resolution of 0.5° × 0.5° (Ohara et al., 2007);
and $F_{\mathrm{fire}}(x,y,z,t)$ was the biomass-burning emissions, provided
by the monthly mean inventory at a spatial resolution of 0.5° × 0.5° from
the Global Fire Emissions Database, Version 3 (GFED v3)
(van der Werf et al., 2010). Among these fluxes, $F_{\mathrm{bio}}(x,y,z,t)$,
$F_{\mathrm{oce}}(x,y,z,t)$ and $F_{\mathrm{ff}}(x,y,z,t)$ had nonzero values
at model level 1 only and were zero at the other 14 levels, whereas
$F_{\mathrm{fire}}(x,y,z,t)$ had nonzero values at model levels 1 to 5 and
was zero at the other 10 levels. For convenience, all fluxes in this paper
were therefore written as functions of $(x,y,z,t)$.
Firstly, the prior flux $F^{p}_{t}$ was assumed to be the true
surface CO2 flux in all of the following OSSEs. Forced
by $F^{p}_{t}$, the RAMS-CMAQ model was run to produce the
artificial true CO2 concentration results $C^{p}(x,y,z,t)$ (referred to as
$C^{p}_{t}$ in the following). Then, the artificial GOSAT
observations $y^{\mathrm{obs}}_{t}$ (or $\mathrm{XCO}_2^{p}$) were generated
by substituting $C^{p}_{t}$ into the observation
operator in Eq. (16). The retrieval information of GOSAT
XCO2 ($y_{\mathrm{priori}}$, $f_{\mathrm{priori}}$, $h$ and
$a_{\mathrm{CO_2}}$) needed in Eq. (16) was obtained from the v2.9
Atmospheric CO2 Observations from Space (ACOS) Level 2 standard data
products, which only utilized the SWIR observations. Only data classified
into the “Good” category were utilized in this study. During the retrieval
process, most of the soundings (such as data with a solar zenith angle
greater than 85°, data not in clear sky conditions, or data collected over
the ocean but not in glint) were not processed, so data products
for the “Good” category typically contained only 10–100 soundings per
satellite orbit (Osterman et al., 2011), and there were generally only 0 to
60 samples per orbit in the study model domain. Fig. 3a shows
the total number of “Good” GOSAT XCO2 observations for each
model grid cell in February 2010. There were relatively more observations
over most continental regions of the study domain except some regions in
North-East and South China. The total numbers ranged from 1 to 8. However,
there were almost no data over the oceans of the study domain.
Secondly, the prescribed surface CO2 flux series
$F^{*}_{t}$ were created by
$$F^{*}_{t}=(1.8+\delta(x,y,z,t))\,F^{p}_{t},$$
where $\delta$ was a random number drawn from the standard normal
distribution, forming a time series at each grid cell over the integration
period of our numerical experiment. Driven by $F^{*}_{t}$, the RAMS-CMAQ
model was integrated to obtain the CO2 simulations $C^{f}(x,y,z,t)$
(referred to as $C^{f}_{t}$ hereafter). Then, the column-averaged
concentrations $\mathrm{XCO}_2^{f}$ were obtained using Eq. (16).
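The construction of the deliberately wrong fluxes can be illustrated with a short NumPy sketch. The function name, array shape, and constant toy flux are hypothetical; only the magnification of 1.8 and the standard normal perturbation come from the text.

```python
import numpy as np

def prescribed_flux(f_prior, rng, magnification=1.8):
    """Return F* = (magnification + delta) * F^p with delta ~ N(0, 1) per element."""
    delta = rng.standard_normal(f_prior.shape)   # independent draw for every time and grid cell
    return (magnification + delta) * f_prior

# Toy prior flux: 1000 time steps by 50 grid cells, constant value (hypothetical).
rng = np.random.default_rng(1)
f_prior = np.full((1000, 50), 2.0)
f_star = prescribed_flux(f_prior, rng)
```

Averaged over many draws the prescribed flux is 1.8 times the prior, but at any single time and grid cell the multiplier scatters widely and can even change sign, which is what makes the retrieval of the true fluxes a demanding test.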
The performance of CFI-CMAQ was evaluated through a group of
OSSEs, and the goal of each OSSE was to retrieve the true fluxes
$F^{p}_{t}$ from the given true observations $\mathrm{XCO}_2^{p}$ and the
“wrong” fluxes $F^{*}_{t}$. In all the OSSEs, we
assimilated the artificial observations $\mathrm{XCO}_2^{p}$ about three
times a day, since GOSAT has about three orbits in the study model domain.
When observations were available, CFI-CMAQ paused the model integration to
assimilate them; otherwise, it
continued simulating. The default ensemble size $N$ was 48, the measurement
errors were 1.5 ppmv, the standard localization Schur radius $c$ was 1280 km
(20 grid spacings), and the covariance inflation factor of concentrations
$\alpha$ was 1.1. The reference lag-window was 9 days and the covariance
inflation factor of the prior scaling factors $\beta$ was 70. Since the
smoother window was very important for CO2 transport and $\beta$
was a newly introduced parameter, both these parameters were further
investigated by several numerical sensitivity experiments. Because the
primary focus of this paper was to describe the assimilation methodology,
all the numerical experiments started on 1 January 2010 and ended on 30
March 2010.
Figure 7. Monthly mean values in February 2010 of (a)
$F^{p}_{t}$, the prior true surface CO2 fluxes;
(b) $F^{*}_{t}$, the prescribed CO2 surface
fluxes, $F^{*}_{t}=(1.8+\delta(x,y,z,t))F^{p}_{t}$; (c)
$\overline{F^{a}_{t}}$, the ensemble mean values of the assimilated
surface CO2 fluxes; (d) $F^{p}_{t}-F^{*}_{t}$; and (e)
$F^{p}_{t}-\overline{F^{a}_{t}}$ (units:
µmol m$^{-2}$ s$^{-1}$). $\overline{F^{a}_{t}}$ are the assimilated results
of a CFI-CMAQ OSSE in which the lag-window
was 9 days and $\beta$ was 70. The same results are used in Figs. 7–10.
Figure 8. Monthly mean RMSEs of $\overline{F^{a}_{t}}$ in
February 2010 (units: µmol m$^{-2}$ s$^{-1}$).
As for the initialization of CFI-CMAQ, only the ensemble of background
concentration fields $C^{f}_{i,0}$ needed to be initialized at the time
$t=0$, because the values of $\lambda^{a}_{i,t|t-1}$ were
updated using the persistence dynamical model. In practice, the mean
concentration fields at $t=0$ were interpolated from the simulated CO2 fields
of CarbonTracker 2011 (Peters et al., 2007). The ensemble members of the
background concentration fields were created by adding random vectors. The
mean values of the random vectors were zero and the variances were 2.5
percent of the mean concentration fields. The atmospheric transport model
was then integrated from time $t=0$ to $t=1$, driven by $F^{*}_{t}$ with
$C^{f}_{i,0}$ as initial conditions, to produce the CO2 concentration
fields $\hat{C}^{f}_{i,1}$. Subsequently, the first prior linear scaling
factors, $\lambda^{p}_{i,1|0}$, could be calculated from
$\hat{C}^{f}_{i,1}$. Assuming that
$\lambda^{a}_{i,1|0}=\lambda^{p}_{i,1|0}$, the values of
$\lambda^{a}_{i,1|0}$ were finally obtained. For the first assimilation
cycle, the lag-window was only one (that is, only $\lambda^{a}_{i,1|0}$
needed to be optimized in the first assimilation cycle). It increased over
the first few dozen assimilation cycles until it reached $M+1$ as CFI-CMAQ
continued to assimilate observations. Once the system was initialized, all
future scaling factors could be created using the persistence dynamical
model, which combined the smoothing operator with the atmospheric transport
model.
In order to illustrate the limitation of using only the smoothing operator
as the persistence dynamical model to generate all future scaling factors,
another OSSE (referred to as the reference experiment to distinguish it from
the above-mentioned CFI-CMAQ OSSEs) was designed to optimize the surface
CO2 fluxes at the grid scale. The reference experiment used the same
assimilation framework as CFI-CMAQ except that all
$\lambda^{p}_{i,t|t-1}$ were set to 1 (Peters et al., 2007). In addition,
the initialization procedure of the reference experiment was different from
that of CFI-CMAQ. In practice, both the ensemble of background concentration
fields at $t=0$, $C^{f}_{i,0}$, and the ensemble members of the scaling
factors at $t=1$, $\lambda^{a}_{i,1|0}$, needed to be
initialized because they could not be generated in other ways (Peters et al.,
2005). The initial concentration fields $C^{f}_{i,0}$ were created using
the same method as that used to generate $C^{f}_{i,0}$ for the CFI-CMAQ
OSSEs. The ensemble members of the scaling factors $\lambda^{a}_{i,1|0}$
were random fields with a mean of 1 and a variance of 0.1. In addition, in
order to keep the ensemble spread of the
scaling factors $\lambda^{a}_{i,t|t-1}$ at a certain level
and compensate for dynamical model error, covariance inflation was also
used, and the covariance inflation factor of the scaling factors
$\lambda^{a}_{i,t|t-1}$ was 1.6. All other parameters were the same as those
used in the CFI-CMAQ OSSEs: the ensemble size $N$ was 48, the measurement
errors were 1.5 ppmv, the standard localization Schur radius $c$ was 1280
km, the covariance inflation factor of concentrations $\alpha$ was 1.1, and
the lag-window was 9 days.
Experimental results
Essentially, the assimilation part of CFI-CMAQ comprises two components: the
CO2 concentration assimilation with EnKF, which provides reliable CO2
initial analysis fields for the next assimilation cycle; and the CO2 flux
optimization with EnKS, which, in addition to the optimized CO2 fluxes,
provides better estimates of the scaling factors for the next time step
through the persistence dynamical model. The performance of the EnKF
component is strongly influenced by that of the EnKS component, and vice
versa. Firstly, the performance of CFI-CMAQ is quantitatively assessed in
detail using the assimilated results of a CFI-CMAQ OSSE in which the
lag-window was 9 days and $\beta$ was 70. The sensitivities to $\beta$ and
the lag-window are then discussed in the following two paragraphs. Finally,
the assimilation results of the reference experiment, in which
$\lambda^{p}_{i,t|t-1}$ were set to 1, are briefly described at the end
of this subsection.
We begin by describing the impacts of assimilating artificial observations
XCO2p on CO2 simulations by CFI-CMAQ. As shown in
Fig. 4a, b and d, the monthly mean values of the background CO2
concentrations Ctf produced by the magnified
surface CO2 fluxes Ft∗ were much larger
than those of the artificial true CO2 concentrations
Ctp produced by the prior surface CO2 fluxes
Ftp near the surface in February 2010. In the east and
south of China especially, the magnitude of the difference between
Ctp and Ctf was at least 6 ppmv.
Also, as expected, the monthly mean XCO2f was much larger
than the monthly mean artificial observations XCO2p, and the
magnitude of the difference between XCO2p and
XCO2f reached 2 ppmv in the east and south of China (see
Fig. 3b, c, e). However, the impact of magnifying surface CO2
fluxes on the CO2 concentrations was confined primarily to below model-level 10
(approximately 6 km), and especially to below model-level 7 (approximately 1.6 km). Above model-level 10, the differences between Ctp
and Ctf fell to zero (see Fig. 5a, b). After
assimilating XCO2p, the analysis CO2 concentrations
Cta‾ were much closer to
Ctp (see Fig. 4c, e, f). The monthly mean difference
between Ctp and Cta‾
ranged from -2 to 2 ppmv and the relative error (Ctp
– Cta‾)/Ctp ranged
from -1 to 1 % in almost the entire model domain at model-level 1. The
monthly mean differences between Ctp and Cta‾ were negligible above model-level 2 (see Fig. 5c, d). The monthly mean XCO2a was also closer to
XCO2p and the difference between XCO2p and
XCO2a ranged from -0.5 to 0.5 ppmv. In order to evaluate
the general impact of assimilating XCO2p in the surface
layer, time series of the daily mean CO2 concentration extracted from
the background simulations and the assimilations were compared with the
artificial true simulations at four national background stations in China
and their nearest large cities. As shown in Fig. 3a, Waliguan is 150 km away
from Xining, Longfengshan is 180 km away from Haerbin, Shangdianzi is 150 km
away from Beijing, and Linan is 50 km away from Hangzhou. The assimilated
results are shown in Fig. 6. The background time series were much larger
than the artificial true time series, especially at Shangdianzi, Beijing,
Linan and Hangzhou, which are strongly influenced by local anthropogenic
CO2 emissions. After assimilating XCO2p, the
assimilated time series were very close to the true time series with
negligible bias, as expected, at Waliguan, Xining, Shangdianzi, Beijing,
Linan and Hangzhou, especially after the first 10 days, which can be
considered the spin-up period. Meanwhile, the improvements at Longfengshan
and Haerbin were limited due to the absence of observation data at those
locations (see Fig. 3a). Nevertheless, in general, assimilating GOSAT
XCO2 with EnKF clearly and substantially improved the CO2 concentrations
in the surface layer. Taken together, these results show that CFI-CMAQ can
provide convincing CO2 initial analysis fields for CO2 flux
inversion.
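The two diagnostics quoted above, the monthly mean difference and the relative error (Ctp – Cta‾)/Ctp, can be sketched as follows. The array layout (time, y, x), the function name and the synthetic values are illustrative assumptions; in particular, computing the relative error from the monthly means, rather than averaging daily relative errors, is an assumption, since the text does not say which was done.

```python
import numpy as np

def monthly_mean_diagnostics(c_true, c_analysis):
    """Return the monthly mean difference (ppmv) and relative error (%) per grid cell.

    Both inputs have shape (time, ny, nx) and cover one month.
    """
    true_mean = c_true.mean(axis=0)
    analysis_mean = c_analysis.mean(axis=0)
    diff = true_mean - analysis_mean      # Ctp minus the analysis ensemble mean
    rel_err = 100.0 * diff / true_mean    # expressed as a percentage
    return diff, rel_err

# Synthetic example: a 400 ppmv "truth" and an analysis that is 2 ppmv too high
c_true = np.full((28, 4, 5), 400.0)
c_analysis = c_true + 2.0
diff, rel_err = monthly_mean_diagnostics(c_true, c_analysis)
```

For a background concentration of about 400 ppmv, a difference of 2 ppmv corresponds to a relative error of 0.5 %.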
(a) Ratios of monthly mean Ft∗ to
monthly mean Ftp; and (b) ratios of monthly
mean Fta‾ to monthly mean
Ftp in February 2010. The white part indicates the
areas where the absolute values of monthly mean Ftp
are not larger than 0.1, which were not analysed in this study. The black square labelled I
indicates the domain where surface CO2 fluxes were used for the results
presented in Figs. 12 and 13.
Daily mean time series of CO2 fluxes at national background
stations in China and their nearest large cities from 1 January to 20 March 2010,
extracted from the prior true surface CO2 fluxes
Ftp (black), the prescribed CO2 surface fluxes
Ft∗ (red), and the assimilated CO2 fluxes
Fta‾ (blue). All time series were
interpolated to the observation locations by spatial bilinear
interpolation. The sites used are (a) Waliguan, (b)
Xining, (c) Longfengshan, (d) Haerbin, (e)
Shangdianzi, (f) Beijing, (g) Linan, and (h)
Hangzhou.
(a) Ensemble spread of Ci,tf after
inflating; (b) ensemble spread of λi,t|t-1p before inflating; (c) ensemble spread of
λi,t|t-1a at model-level 1 at 00:00 UT
on 1 March 2010, when β=70.
Time series of daily mean CO2 fluxes averaged in domain I
(shown in Fig. 9a) from 1 January to 20 March 2010 with the inflation factor
of scaling factors β = 50, 60, 70, 75 and 80. The black dashed line is
the time series averaged from Ft∗ and the black solid
line is the time series averaged from Ftp.
The assimilation of XCO2p by CFI-CMAQ also had a pronounced impact on
surface CO2 fluxes. On the whole, the prescribed
CO2 surface fluxes Ft∗ were much larger
than the true surface CO2 fluxes Ftp in February
2010, especially in the east and south of China. The monthly mean difference
between Ft∗ and Ftp reached 5 µmol m-2 s-1 in Jing–Jin–Ji, the Yangtze River Delta, and
the Pearl River Delta Urban Circle because of the strong local anthropogenic
CO2 emissions (see Fig. 7a, b, d). After assimilating
XCO2p, the ensemble mean of the assimilated surface CO2
fluxes Fta‾ decreased sharply. Thus,
the monthly mean values of Fta‾ were
much smaller than Ft∗ in most of the model
domain in February 2010. The pattern of the difference between Fta‾ and Ft∗ was
similar to that of the difference between Ftp and
Ft∗ (see Fig. 7d). The ensemble mean of the
assimilated surface CO2 fluxes Fta‾ was also compared with the artificial true fluxes Ftp,
revealing that Fta‾ was nearly equal to
Ftp in most of the model domain: the monthly mean
difference between Fta‾ and
Ftp ranged only from -0.1 to 0.1 µmol m-2 s-1
(see Fig. 7e). In addition, the root-mean-square errors (RMSEs) of the
assimilated flux members were analysed. As shown in Fig. 8, the monthly mean
RMSE was less than 0.5 µmol m-2 s-1 in most of the model
domain, except in areas near large cities such as Beijing, Shanghai and
Guangzhou, indicating that the assimilated CO2 fluxes were reliable.
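One way to compute an RMSE of the assimilated flux members is sketched below. The formula is not given in this section, so measuring each member against the artificial true flux (rather than against the ensemble mean) is an assumption, as are the function name and the array layout.

```python
import numpy as np

def ensemble_rmse(flux_members, flux_true):
    """RMSE over ensemble members at each grid cell (assumed definition).

    flux_members: (n_members, ny, nx); flux_true: (ny, nx).
    """
    return np.sqrt(np.mean((flux_members - flux_true) ** 2, axis=0))

# Tiny example: three members on a one-cell grid, true flux of 2.0
members = np.array([1.0, 2.0, 3.0]).reshape(3, 1, 1)
truth = np.array([[2.0]])
rmse = ensemble_rmse(members, truth)
```

A monthly mean RMSE would then follow by averaging this field over the analysis times in the month.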
In order to evaluate comprehensively the ability of CFI-CMAQ to optimize the
surface CO2 fluxes, the ratios of the monthly mean
Ft∗ to the monthly mean Ftp were analysed. In practice, we only analysed the ratios where
the absolute values of the monthly mean Ftp were larger
than 0.1, to avoid ratios dominated by random noise where the denominator is
close to zero. As shown in Fig. 9a, the ratios of the
monthly mean Ft∗ to the monthly mean
Ftp were about 1.8 in most of China, except in the
Qinghai–Tibet Plateau, where the absolute values of the monthly mean
Ftp in February were very small and were not analysed. In
addition, the ratios of the monthly mean Fta‾ to the monthly mean Ftp are
shown in Fig. 9b. This figure demonstrates that the impact of the
assimilation of XCO2p by CFI-CMAQ on CO2 fluxes was
generally large in the east and south of China, but
negligible in Northeast China owing to the lack of observation data.
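The masked ratio described above can be sketched as follows. The function name and the use of NaN to mark cells that are not analysed are assumptions, and the threshold of 0.1 is taken to be in the same units as the fluxes.

```python
import numpy as np

def masked_flux_ratio(flux, flux_true, threshold=0.1):
    """Ratio flux / flux_true, computed only where |flux_true| > threshold.

    Cells that fail the test are set to NaN (not analysed).
    """
    ratio = np.full(flux_true.shape, np.nan)
    valid = np.abs(flux_true) > threshold
    ratio[valid] = flux[valid] / flux_true[valid]
    return ratio

# Synthetic monthly mean fluxes: the prescribed flux is 1.8 times the truth
flux_true = np.array([[0.05, 1.0], [-2.0, 0.0]])
flux_prescribed = 1.8 * flux_true
ratio = masked_flux_ratio(flux_prescribed, flux_true)
```

In this example the two cells with near-zero true flux are masked, and the remaining cells recover the ratio of 1.8.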
Time series of daily mean surface CO2 fluxes extracted from
Ft∗ and Fta‾ were also compared with those from Ftp at four national
background stations in China and their nearest large cities, as was done for the
CO2 concentration assimilation. The results are shown in Fig. 10. The
background time series were much larger than the artificial true time
series, especially at Haerbin, Shangdianzi, Beijing, Linan and Hangzhou,
which are strongly influenced by local anthropogenic CO2 emissions.
After assimilating XCO2p, the assimilated time series were
close to the true time series with acceptable bias, as expected, at Waliguan,
Xining, Shangdianzi, Linan and Hangzhou after the 10-day spin-up period.
However, the improvements at Longfengshan and Haerbin were negligible
because of a lack of observations at these locations. The inversion
system also failed to show improvements in Beijing. One possible reason
is that the ensemble spread of λi,t|t-1a in the Beijing area was too large (see Fig. 11c). Beijing is located
in the Jing–Jin–Ji Urban Circle, which had strong local anthropogenic CO2
emissions during January–March. The ensemble spread of
Ci,tf in the Beijing area at model-level 1 could therefore be
much larger than that in other areas, which had weak local anthropogenic
CO2 emissions (see Fig. 11a). As a result, the ensemble
spread of λi,t|t-1p before inflating in the
Beijing area was much larger than that in other areas with small local
anthropogenic CO2 emissions (see Fig. 11b). After inflating, the
ensemble spread of λi,t|t-1p in the Beijing
area could be too large compared with that in other areas with small local
anthropogenic CO2 emissions (see Fig. 11c), which led to the failure
to reproduce the true fluxes in the Beijing area. In future work, CFI-CMAQ will be
improved by optimizing the covariance inflation method.
Since the impact of assimilating XCO2p by CFI-CMAQ on
CO2 fluxes was in general greater in the east and south of China than
in other model areas (see Figs. 7, 9), the time series of the daily mean
CO2 fluxes in that area averaged from Fta‾ was compared with those from
Ft∗ and Ftp (see Fig. 12).
This figure indicates that CFI-CMAQ could in general reproduce the true
fluxes with acceptable bias.
As stated in the above section, β is a newly introduced parameter.
In principle, the prior scaling factors should have been inflated indirectly through the
inflated CO2 concentration forecast. However, the ensemble
spread of λi,t|t-1p before inflating was very
small (ranging from 0 to 0.08 in most areas at model-level 1, see Fig. 11b),
even though the ensemble spread of Ci,tf
after inflating could reach 1–14 ppmv in most areas at model-level 1 (see
Fig. 11a). Consequently, we had to inflate the prior scaling factors again before using them in Eq. (2).
Fig. 11c shows the distribution of the ensemble spread of λi,t|t-1a at model-level 1 at 00:00 UT on 1 March 2010 when β = 70; the ensemble spread of λi,t|t-1a ranged from 0.1 to 0.8 in most areas. In order to
investigate the sensitivity of the results to the inflation factor of the scaling
factors β, a series of numerical experiments was conducted. As shown
in Fig. 12, CFI-CMAQ worked rather well for β = 60, 70, 75 and 80.
However, if β was much smaller than 50 (e.g. β = 10), the impact
of assimilation was small because of the small ensemble spread, whereas if β
was much larger than 80 (e.g. β = 100), the assimilated CO2 fluxes
deviated markedly from the “true” CO2 fluxes. In other words, the
performance of CFI-CMAQ relies strongly on the choice of β.
From the perspective of the lag-window, the differences among the four
assimilation sensitivity experiments with lag-windows of 3, 6, 9 and 12 days
were very small (see Fig. 13). Although Peters et al. (2007) indicated that
the lag-window should be more than five weeks, the smoother
window had only a slight influence on the assimilated results for CFI-CMAQ. The
assimilated results with a larger lag-window were better
than those with a smaller lag-window, but only slightly; CFI-CMAQ performed very well
even with a short lag-window (e.g. 3 days).
Time series of daily mean CO2 fluxes averaged in domain I
(shown in Fig. 9a) from 1 January to 20 March 2010 with different smoother
windows (3, 6, 9 and 12 days). The black dashed line is the time series
averaged from Ft∗ and the black solid line is the
time series averaged from Ftp.
At the end of this subsection, the assimilation results of the reference
experiment in which λi,t|t-1p were set to 1
are addressed briefly. The impact of assimilating XCO2p
on CO2 fluxes showed no coherent pattern. The monthly mean values of the difference
between the prior true surface CO2 fluxes and the ensemble mean values
of the assimilated surface CO2 fluxes were irregular noise (see Fig. 14). The main reason is that all the elements of the scaling factors to be
optimized in the smoother window are only random numbers. As stated in the
above section, only λi,1|0a needed to be
optimized in the first assimilation cycle. However, λi,1|0a were random fields (in other words, all the elements of
λi,1|0a were random numbers) because they
could not be generated in any other way in the first instance. Their spatial
correlations were therefore too small, as were the correlations between the scaling factors
and the observations. It was therefore impossible to
systematically change the values of λi,1|0a over
large areas where observations were located after assimilating the observations
at t=1. Hence, the signal-to-noise problem arose, and the elements of
λi,1|1a were also only random numbers. Although
λi,2|1a could be generated automatically by
the smoothing operator when all λi,2|1p were
set to 1, the elements of λi,2|1a were also random
numbers because the smoothing operator is only a linear operator.
Similarly, it was impossible to systematically change the values of
λi,1|1a and λi,2|1a over large areas after assimilating observations at t=2. As the
inversion system continued assimilating observations, all future scaling
factors could be created by the smoothing operator and then updated, but
the system could not ingest the observations effectively because
all the elements of the scaling factors were always random numbers. Although
the 9-day lag-window in the reference experiment is too short compared with
the 5-week lag-window recommended by Peters et al. (2007), this reference
experiment illustrates the limitation of using only the smoothing
operator as the persistence dynamical model. If the lag-window were around 5
weeks, better results could be achieved because there would be more observations in
every assimilation cycle. However, the results would still not be better than
those obtained by CFI-CMAQ because most grid cells have no observations (see Fig. 3a) and the signal-to-noise problem would remain.
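The role of spatial correlation in the argument above can be illustrated with a small numerical sketch. It compares an ensemble of spatially uncorrelated random scaling factors with an ensemble that has spatial structure. The one-dimensional grid, the seed and the moving-average construction of the correlated ensemble are assumptions made purely for illustration and are not part of CFI-CMAQ.

```python
import numpy as np

rng = np.random.default_rng(42)
n_members, n_cells = 48, 400

# Spatially uncorrelated ensemble with mean 1 and variance 0.1
white = rng.normal(1.0, np.sqrt(0.1), size=(n_members, n_cells))

# Comparison ensemble: impose spatial structure with a 9-point moving average
kernel = np.ones(9) / 9.0
smooth = np.apply_along_axis(
    lambda member: np.convolve(member, kernel, mode="same"), 1, white
)

def neighbour_correlation(ensemble):
    """Mean sample correlation, taken across members, between adjacent grid cells."""
    anomalies = ensemble - ensemble.mean(axis=0)
    a, b = anomalies[:, :-1], anomalies[:, 1:]
    corr = (a * b).sum(axis=0) / np.sqrt((a**2).sum(axis=0) * (b**2).sum(axis=0))
    return corr.mean()

corr_white = neighbour_correlation(white)
corr_smooth = neighbour_correlation(smooth)
print(corr_white, corr_smooth)
```

In the uncorrelated ensemble the sample correlation between neighbouring cells is only sampling noise, so in an ensemble-based update an observation at one cell carries essentially no information about the scaling factors of its neighbours. In the spatially structured ensemble the same observation would adjust a whole neighbourhood.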
Monthly mean values of the difference between the prior true surface
CO2 fluxes and the ensemble mean values of the assimilated surface
CO2 fluxes (units: µmol m-2 s-1) of the reference
experiment in which λi,t|t-1p were set
to 1.