In this study, the effect of CO
Atmospheric CO
Recent studies on atmospheric CO
Data assimilation algorithms are fundamentally based on a linear statistical assumption (Talagrand, 1997). Both sequential and variational algorithms combine background and observation information to estimate parameters under this assumption, which allows the influence matrix that measures the impact of individual observations on estimated parameters to be calculated in the observation space. Cardinali et al. (2004) suggested a method for calculating the influence matrix within the general data assimilation framework and applied the method to a forecast model of the European Centre for Medium-Range Weather Forecasts (ECMWF). The diagonal elements of the influence matrix are the analysis sensitivities (i.e., self-sensitivities), which are proportional to the spread of the analysis and inversely proportional to the predetermined observation error. The trace of the influence matrix reflects the information content, which is the amount of information extracted from observations. The influence matrix provides objective diagnostics of the impact of observations on the analysis, and hence of the performance of the data assimilation system, because inaccurate observations can be identified by analyzing the observation impact (Cardinali et al., 2004). Liu et al. (2009) suggested a method for calculating self-sensitivity and cross-sensitivity (i.e., off-diagonal elements of the influence matrix) within the EnKF framework and diagnosed the relative importance of individual observations within an observation system using the idealized Lorenz 40-variable model and a simplified hydrostatic model.
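The diagnostics described above can be illustrated with a minimal sketch, assuming a linear observation operator and a toy problem that is not taken from this study. The matrices `B`, `H`, and `R` and the function name `influence_matrix` are illustrative assumptions; the influence matrix is written in the form S = (HK)^T, where K is the Kalman gain.

```python
import numpy as np

def influence_matrix(B, H, R):
    """Observation-space influence matrix S = (H K)^T for a linear analysis."""
    # Kalman gain K = B H^T (H B H^T + R)^{-1}
    HBHt = H @ B @ H.T
    K = B @ H.T @ np.linalg.inv(HBHt + R)
    return (H @ K).T

# Hypothetical toy problem: 3 state elements and 4 observations.
# Observations 1 and 2 both sample the first state element.
B = np.diag([1.0, 0.5, 2.0])            # background error covariance
H = np.array([[1.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])         # linear observation operator
R = np.diag([0.5, 0.5, 1.0, 0.25])      # observation error covariance

S = influence_matrix(B, H, R)
self_sensitivity = np.diag(S)           # diagonal elements of S
information_content = np.trace(S)       # degrees of freedom for signal

print("self-sensitivity:", self_sensitivity)
print("information content:", information_content)
```

In this toy case the two observations that share a state element each receive a smaller self-sensitivity than they would alone, and the observation with the largest background variance relative to its observation error receives the largest self-sensitivity.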
Although Cardinali et al. (2004) and Liu et al. (2009) suggested methods for
calculating the impact of individual observations on an analysis, their
studies focused on NWP. Therefore, the impact of individual observations on
surface CO
CarbonTracker is a system developed by the National Oceanic and Atmospheric
Administration (NOAA), which optimizes the surface CO
In this study, an influence matrix is calculated in CarbonTracker to evaluate
the impact of mole fraction observations of CO
CarbonTracker is an atmospheric CO
Schematic diagram of the assimilation process employed in
CarbonTracker. In each analysis cycle, observations made within 1 week are
used to update the state vectors with a 5-week lag. The dashed line
indicates how the simple dynamic model uses analysis state vectors from the
previous 1 and 2 weeks to produce a new background state vector for the
current analysis time. The TM5 model is used as the observation operator to
calculate the model CO
The TM5 model (Krol et al., 2005) is used as a transport model that
calculates model CO
The EnKF data assimilation method used in CarbonTracker is the ensemble
square root filter (EnSRF) suggested by Whitaker and Hamill (2002). The
analysis equation for data assimilation is expressed as
To reduce the sampling error and filter divergence due to the
underestimation of background error covariance in EnSRF, the covariance
localization method is used (Houtekamer and Mitchell, 2001). Because the
physical distance between the scaling factors cannot be defined in
CarbonTracker, correlations between the ensemble of the scaling factor and
the ensemble of the model CO
The influence matrix for EnKF is calculated as in Liu et al. (2009). The
projection of Eq. (3) onto the observation space becomes
Substituting Eq. (10) into Eq. (12) gives
The cumulative impact of the influence matrix for the 5 weeks of lag can
be calculated because the background in the lagged window already includes
the effect from previous observations. For example, Fig. 2 shows that
Schematic diagram of calculating cumulative impact in CarbonTracker.
Observation network of CO
The information content (i.e., degrees of freedom for signal), which is a
measure of the information extracted from the observations, is calculated by
the trace of the influence matrix. As suggested by Cardinali et al. (2004),
the globally averaged influence of the observations can be calculated by
averaging the global self-sensitivities as
Information on the observation sites used in this study. MDM represents the model–data mismatch, which is the observation error.
Observation site categories and corresponding MDM values.
Information on the observation sites located in Asia, including the
number of observations, number of rejected observations, MDM values,
innovation
The observations used in this study are surface CO
The surface carbon flux analysis system used in this study is based on the
CarbonTracker 2010 release (CT2010). However, the system employed in this
study is different from CT2010 in two aspects: first, the nesting domain of
the TM5 model, with 1
Cardinali et al. (2004) demonstrated that the self-sensitivity is
theoretically between 0 and 1 if observations are not correlated. In 4D-VAR,
Cardinali et al. (2004) noted that an analysis error covariance based on the
Hessian representation with a truncated eigenvector expansion can produce
self-sensitivities greater than 1 in a small percentage of cases.
In contrast, the self-sensitivity in EnKF theoretically has a value less
than 1 (Liu et al., 2009). Nevertheless, the self-sensitivity in this study
shows a value greater than 1 because the sparse observations cause
insufficient reduction of the background and the observation operator used has
nonlinearity in calculating the transport of CO
Because the spatial coverage and number of observations vary during the experimental period, the average self-sensitivity throughout the experimental period was analyzed to evaluate the overall characteristics of the self-sensitivity at each observation site. As in previous studies (e.g., Peters et al., 2007, 2010; Kim et al., 2014), the results for the year 2000 were excluded from the data analysis because 2000 is treated as the spin-up period.
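The theoretical bound on the EnKF self-sensitivity mentioned above can be checked on a toy ensemble. The sketch below is an illustration under stated assumptions, not the CarbonTracker implementation: it uses a linear observation operator, no covariance localization, random synthetic data, and an ensemble transform square-root update in place of the serial EnSRF. The ensemble estimate follows the form given by Liu et al. (2009), S = R^{-1} (HX^a)(HX^a)^T / (n - 1); all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_state, n_obs, n_ens = 5, 3, 20        # hypothetical toy dimensions

# Background perturbations (ensemble mean removed) and a linear operator.
Xb = rng.standard_normal((n_state, n_ens))
Xb -= Xb.mean(axis=1, keepdims=True)
H = rng.standard_normal((n_obs, n_state))
R = np.diag([0.5, 1.0, 2.0])            # uncorrelated observation errors
Rinv = np.linalg.inv(R)

# Ensemble transform update of the perturbations (assumption: used here
# instead of the serial EnSRF because it fits in a few lines).
Yb = H @ Xb
C = np.eye(n_ens) + Yb.T @ Rinv @ Yb / (n_ens - 1)
evals, evecs = np.linalg.eigh(C)
T = evecs @ np.diag(evals ** -0.5) @ evecs.T
Xa = Xb @ T                             # analysis perturbations

# Ensemble-based influence matrix from the analysis perturbations.
Ya = H @ Xa
S_ens = Rinv @ Ya @ Ya.T / (n_ens - 1)
self_sens_ens = np.diag(S_ens)

# Reference: influence matrix from the Kalman gain with the same covariance.
Pb = Xb @ Xb.T / (n_ens - 1)
K = Pb @ H.T @ np.linalg.inv(H @ Pb @ H.T + R)
S_ref = (H @ K).T

print("ensemble self-sensitivities:", self_sens_ens)
```

With a linear operator and no localization, the ensemble estimate matches the reference and every self-sensitivity lies between 0 and 1, so values above 1 in a real system point to departures from these idealized conditions.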
Figure 4 shows the average self-sensitivities at each observation site during the experimental period. Different sizes of circles are used in some locations to distinguish sites at the same location or at geographically close locations. Globally, the negative correlation between the spatial density of the observation sites and the self-sensitivities is not as apparent as that reported by Cardinali et al. (2004) and Liu et al. (2009). The negative correlation is, however, apparent in the Northern Hemisphere (NH). In particular, some observation sites in Asia show high sensitivities where the spatial density of observation sites is low. The observation sites located in deserts, remote oceans, and high-altitude regions generally exhibit low sensitivities.
Average self-sensitivity at each observation site from 2001 to 2009. The overlapping observation sites at the same locations or at close locations are distinguished by different sizes of circles.
Histograms of the average self-sensitivity for each observation site
category from 2001 to 2009
Time series of the average self-sensitivity (red solid line with
blue dots) and the number of observations (black solid line) with a weekly
temporal resolution
Time series of the average self-sensitivity (red solid line with
blue dots) and the number of observations (black solid line) with a weekly
temporal resolution for the
The average self-sensitivities of each observation site category over the
globe, in the NH, tropics, and Southern Hemisphere (SH) are shown in Fig. 5.
The average global self-sensitivity is 4.8 % (Fig. 5a), which implies
that the analysis extracts 4.8 % of its information from the observations
and 95.2 % from the background each assimilation cycle. Although the
average self-sensitivity seems low, the background contains the observation
information included in the previous analysis cycle, as reported in Cardinali
et al. (2004). Moreover, the surface CO
Average standard deviation of background biosphere and ocean fluxes
in
Globally, the Mixed site category shows the highest average
self-sensitivity, and the Difficult site category shows the lowest average
self-sensitivity (Fig. 5a), which is related to the model–data mismatch
values shown in Table 2. The model–data mismatch for the Mixed site category
is relatively low, while that of the Difficult site category is high.
Although the MBL site category has the lowest model–data mismatch, the MBL
site category does not show the highest average self-sensitivity due to the
small spread of the analysis CO
Figure 6 shows the time series of the average self-sensitivity and number of observations around the globe and in each region. Globally, two characteristics are apparent in the time series (Fig. 6a): first, the average self-sensitivity decreases as the number of observations increases, showing an inversely proportional relationship; second, the average self-sensitivity varies seasonally, with high values in summer and low values in winter. Both features are more apparent in the NH than in the other regions (Fig. 6b). Because most of the observation sites are located in the NH, the characteristics of the average global self-sensitivity are mostly determined by those in the NH. As the number of observations in the tropics increases in the late 2000s, a slight inversely proportional relationship between the average self-sensitivity and the number of observations appears in the tropics (Fig. 6c). However, the average self-sensitivity in the tropics does not show distinct seasonal variability. In the SH, an inverse relationship between the average self-sensitivity and the number of observations is not clearly shown (Fig. 6d) because the number of observations assimilated in the SH increases less than in the other regions. However, the seasonal variability of the average self-sensitivity appears clearly in the SH. Therefore, the inverse relationship appears distinctly only when the number of observations increases enough to reduce the average self-sensitivity.
Figure 7 shows the average self-sensitivity for each observation site category. Although the MBL site category has the second largest number of observations, its average self-sensitivity shows little variation with time (Fig. 7a). Similarly, the average self-sensitivity for the Continental site category does not vary with time (Fig. 7b). The average self-sensitivity of the Mixed site category shows distinct seasonal variation (Fig. 7c). The Continuous site category displays distinct seasonal variability in the average self-sensitivity and an inversely proportional relationship between the average self-sensitivity and the number of observations (Fig. 7d). Because the Continuous sites are relatively numerous and mostly located in North America (Fig. 3), the impact of a single observation decreases as the number of observations increases. Therefore, the inversely proportional relationships between the average self-sensitivity and the number of observations around the globe (Fig. 6a) and in the NH (Fig. 6b) are mainly attributed to the Continuous site category. The Difficult site category shows a slight inverse relationship between the average self-sensitivity and the number of observations (Fig. 7e).
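The inverse relationship described above can be made explicit with a worked idealization, which is an illustrative assumption and not the configuration used in this study: a single scalar parameter with background error variance \sigma_b^2 is observed directly by m independent observations, each with error variance \sigma_o^2. Under the linear assumption, the analysis error variance, the self-sensitivity of each observation, and the information content are

```latex
\begin{aligned}
\sigma_a^2 &= \left(\frac{1}{\sigma_b^2} + \frac{m}{\sigma_o^2}\right)^{-1}
            = \frac{\sigma_b^2\,\sigma_o^2}{\sigma_o^2 + m\,\sigma_b^2}, \\
S_{ii} &= \frac{\sigma_a^2}{\sigma_o^2}
        = \frac{\sigma_b^2}{\sigma_o^2 + m\,\sigma_b^2}, \\
\mathrm{tr}(\mathbf{S}) &= \frac{m\,\sigma_b^2}{\sigma_o^2 + m\,\sigma_b^2} < 1 .
\end{aligned}
```

In this idealized case, the self-sensitivity of each observation decreases as m increases, increases with the background spread, and decreases with the observation error, while the total information content grows more slowly than the number of observations.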
Despite the inversely proportional relationship between the self-sensitivity
and the number of observations in the NH time series (Fig. 6b), the average
self-sensitivity in the NH is higher than in the other regions (Fig. 5). In
addition, the average self-sensitivities in the NH and SH are greater in
summer than in winter (Fig. 6). The above two characteristics imply that
another factor in addition to the number of observations affects the
self-sensitivity. As briefly mentioned in Sect. 3.2.1, another factor that
affects the self-sensitivity is the spread of the analysis CO
The uncertainties of the optimized biosphere and ocean fluxes from 1 week of
observations, shown in Fig. 8c and d, are reduced compared with those of the
background fluxes, shown in Fig. 8a and b. The magnitude of the reduction of
the surface CO
Therefore, the surface CO
Average normalized information content for each observation site from 2001 to 2009. The overlapping observation sites at the same locations or at close locations are distinguished using different sizes of circles.
Figure 9 shows the average information content at each observation site during the experimental period. This value was calculated by averaging, over the experimental period, the ratio of each site's information content in each cycle. Note that this average value is not the amount of information content extracted from observations but rather the relative ratio of each site's information content, normalized by the total influence of all observations. Because the magnitude of the information content at one observation site is proportional to the self-sensitivity and the number of observations, the observation sites with a high average self-sensitivity or a large number of observations show high information content. The number of observations at one station depends on the temporal resolution, missing rate, and total period of observations. Therefore, the observation sites located in North America and Asia generally show high average information content.
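The normalization described above can be sketched for a single assimilation cycle as follows. This is a minimal illustration with made-up numbers, not values from this study; the function name `normalized_information_content` and the site labels are hypothetical.

```python
from collections import defaultdict

def normalized_information_content(site_ids, self_sens):
    """Share of the total information content contributed by each site.

    site_ids  : site label of each assimilated observation in one cycle
    self_sens : self-sensitivity of each of those observations
    """
    totals = defaultdict(float)
    for site, s in zip(site_ids, self_sens):
        totals[site] += float(s)        # information content of the site
    total = sum(totals.values())        # total influence of all observations
    return {site: value / total for site, value in totals.items()}

# Hypothetical cycle: site "A" reports four observations with low
# self-sensitivity, site "B" reports one with higher self-sensitivity.
site_ids = ["A", "A", "A", "A", "B"]
self_sens = [0.05, 0.05, 0.05, 0.05, 0.10]

ratios = normalized_information_content(site_ids, self_sens)
print(ratios)
```

In this example the site with the lower self-sensitivity still contributes the larger share because it reports more observations. Averaging such per-cycle ratios over all cycles would give a site-level value of the kind plotted in Fig. 9.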
Histograms of the average information content for each observation
site category
Time series of the average information content for each observation
site category
Time series of the
RMSD between the background flux and
prior flux in
To investigate the distribution of the information content in each region,
histograms of the average information content around the globe and in the NH,
tropics, and SH were generated (Fig. 10). The average information content was
80.2 % in the NH, 13.3 % in the tropics, and 6.5 % in the SH,
which implies that the observations in the NH are the most informative. This
is due to the large number of observations and high self-sensitivities in the
NH. Around the globe, the most informative observation site category is the
Continuous category (Fig. 10a). The MBL, Continental, and Mixed site
categories show a similar magnitude of information content, but the Difficult
site category shows the lowest information content. As at the global scale, the
Continuous site category is the most informative in the NH (Fig. 10b). In the
current CarbonTracker system, the observation sites of the Continuous site
category are mainly located in North America, except for the three JMA sites,
which are located in Asia. Therefore, most of the information extracted from
the Continuous site category is used to constrain the surface CO
Figure 11 shows the time series of the weekly averaged information content
for each site category in each region. Globally, the proportion of the
information content of the Continuous site category increases steadily over
time (Fig. 11a), which is associated with the steady increase in the number
of observations of the Continuous site category over time. In the NH, the
increase of the proportion of the information content and the number of
observations of the Continuous site category is more readily apparent
(Fig. 11b). In the tropics, the MBL and Mixed site categories provide the
most information, while the Difficult site category provides limited
information from 2004 onward (Fig. 11c) because, after this date,
observations from only one Difficult observation site (Bukit Kototabang (BKT),
Indonesia, 0.2
To investigate the regional distribution of the information content in the NH, the time series of the information content in Asia, North America, and Europe are shown in Fig. 12. The information content in North America is greater than that in the other regions because the self-sensitivities are high and the number of observations increases with time in North America. However, the information content increases more slowly than the number of observations because the self-sensitivity decreases as the number of observations increases in North America.
Because CarbonTracker is a system that optimizes the surface CO
In this study, the effect of observations of CO
The average global self-sensitivity is 4.8 %, which implies that the
impact of the background on the optimized flux is 95.2 %. The value of
4.8 % obtained in CarbonTracker is lower than the 15 % value obtained
from NWP models, as reported by Cardinali et al. (2004) and Liu et
al. (2009). However, as indicated by Cardinali et al. (2004), the background
fluxes predicted by the dynamic model already include information extracted
from earlier observations used in previous cycles. Because the state vector
used in CarbonTracker includes 5 weeks of lag, the cumulative impact of
the observations on the analysis is greater than the impact calculated for a
single assimilation cycle. The cumulative impact over 5 weeks is
19.1 %, much greater than 4.8 %, and the large cumulative impact is
confirmed by the RMSD of the surface CO
The self-sensitivity and spatial coverage of the observation sites are
inversely correlated in the NH, whereas these factors are not clearly
related in the tropics and SH. The weaker correlation between the
self-sensitivity and the spatial coverage of the observation sites in the
tropics and SH is attributed to either the sparseness of the observation
sites or observation site locations that are not appropriate
for detecting the variability of CO
The self-sensitivity time series is characterized by seasonal variations. In
both hemispheres, the self-sensitivity in summer is greater than in winter,
which is clearly evident in the Mixed and Continuous site categories and is
associated with the background surface CO
The observation sites with a high average self-sensitivity and a small
number of observations show low average information content, whereas the
observation sites with a low average self-sensitivity and a large number of
observations show high average information content because the range of
average self-sensitivity is bounded from 0 to 1, but the range of the number
of observations is large. Therefore, the Continuous site category shows high
average information content. In general, the information extracted from
observations is concentrated in the NH, especially in North America. A
strong correlation exists between the information content and the optimized
surface CO
The effect of various observations on the analyzed surface CO
The authors thank the two anonymous reviewers for their valuable comments. The authors thank Andrew R. Jacobson for providing the resources necessary for this study. This study was funded by the Korea Meteorological Administration Research and Development Program under the Grant CATER 2012-3032.

Edited by: N. Zeng