A method for evaluating bias in global measurements of CO 2 total columns from space

We describe a method of evaluating systematic errors in measurements of total column dry-air mole fractions of CO_2 (X_(CO_2)) from space, and we illustrate the method by applying it to the v2.8 Atmospheric CO_2 Observations from Space retrievals of the Greenhouse Gases Observing Satellite (ACOS-GOSAT) measurements over land. The approach exploits the lack of large gradients in X_(CO_2) south of 25° S to identify large-scale offsets and other biases in the ACOS-GOSAT data with several retrieval parameters and errors in instrument calibration. We demonstrate the effectiveness of the method by comparing the ACOS-GOSAT data in the Northern Hemisphere with ground truth provided by the Total Carbon Column Observing Network (TCCON). We use the observed correlation between free-tropospheric potential temperature and X_(CO_2) in the Northern Hemisphere to define a dynamically informed coincidence criterion between the ground-based TCCON measurements and the ACOS-GOSAT measurements. We illustrate that this approach provides larger sample sizes, hence giving a more robust comparison than one that simply uses time, latitude and longitude criteria. Our results show that the agreement with the TCCON data improves after accounting for the systematic errors, but that extrapolation to conditions found outside the region south of 25° S may be problematic (e.g., high airmasses, large surface pressure biases, M-gain, measurements made over ocean). A preliminary evaluation of the improved v2.9 ACOS-GOSAT data is also discussed.

Our technical approach for evaluating the X CO2 product from the ACOS-GOSAT retrievals makes use of the relatively spatially uniform CO 2 in the Southern Hemisphere to identify systematic errors, including large-scale biases and other artifacts caused by the retrieval algorithm or errors in the instrument calibration. Once identified, these biases are removed and the success of this modification to the data is evaluated through comparisons with the Northern Hemisphere TCCON data. We exploit observed correlations between free-troposphere potential temperature and X CO2 to minimize variability in X CO2 that is dynamic in origin (Keppel-Aleks et al., 2011) when defining coincidence criteria in the Northern Hemisphere. This better defines comparable observations than using a simple geographic constraint.
In Sect. 2, we detail our approach to comparing global X CO2 measurements against the TCCON X CO2 measurements. We then describe the ACOS-GOSAT X CO2 data product and screening procedures in Sect. 3. The techniques are applied and evaluated in Sect. 4 and Sect. 5, and a discussion and conclusions follow in Sect. 6.

2 Comparing satellite-based X CO2 with ground-based TCCON measurements
Observations and models of surface, partial and total column amounts of CO 2 in the Southern Hemisphere show low seasonal and geographic variability compared with the Northern Hemisphere. Observations from the global network of in situ atmospheric CO 2 measurements show that surface CO 2 concentrations at latitudes between 25° S and 55° S have a small seasonal cycle (∼1 ppm peak-to-peak), and small geographic gradients (GLOBALVIEW-CO 2, 2006). Olsen and Randerson (2004) predicted such uniformity in modeling the total columns of CO 2 in the Southern Hemisphere. Measurements of CO 2 profiles from the recent HIAPER Pole-to-Pole Observations (HIPPO) campaign by Wofsy et al. (2011) also show that the Southern Hemisphere CO 2 field does not vary by more than 1.6 ppm south of 25° S. Figure 2 shows the HIPPO CO 2 data centred on the Pacific Ocean.
There are two TCCON stations located south of 25° S: Wollongong, Australia (34° S) and Lauder, New Zealand (45° S). Wollongong lies on the Australian eastern coast, on the outskirts of a small urban centre about 100 km south of Sydney. Lauder is located on New Zealand's South Island and predominantly samples clean maritime air. The Lauder site has a seasonal cycle in X CO2 with a small peak-to-peak amplitude of about 0.6 ppm (Fig. 3). The measurements over Wollongong are affected by local pollutants which can increase the seasonal cycle of X CO2 over Wollongong to ∼2 ppm peak-to-peak, but this is variable from year to year. When the effect from the pollution is accounted for, the background seasonal cycle is reduced to ∼1 ppm peak-to-peak. The Lauder X CO2 time series is the longest in the Southern Hemisphere, and has a secular increase of 1.89 ppm yr −1 since 2004, which is in good agreement with the global mean secular increase of about 2 ppm yr −1 (with a year-to-year variability of 0.3 ppm yr −1, 1σ) from the GLOBALVIEW surface in situ flask network over the same time period (Conway and Tans, 2011). Consistent with HIPPO, TCCON, and GLOBALVIEW, we assume that the Southern Hemisphere poleward of 25° S has a small seasonal cycle in X CO2 of ∼0.6 ppm (peak-to-peak), has no geographic gradients and a secular increase of 1.89 ppm yr −1. We assume that measurements of X CO2 in this region that show spatial and temporal variations that exceed this constraint contain spurious variance, and we look for empirical correlations of X CO2 with retrieval or instrument parameters that explain the variance. We assume that these correlations represent systematic errors that exist globally. After accounting for these biases, the satellite X CO2 data are compared against TCCON data globally.
This procedure is applicable to any global measurement of X CO2, including the Scanning Imaging Absorption Spectrometer for Atmospheric Chartography (SCIAMACHY; Burrows et al., 1995), GOSAT and the future OCO-2 and OCO-3 instruments. We will apply it to the ACOS-GOSAT X CO2 in the following sections.

ACOS-GOSAT data product
The ACOS-GOSAT data processing algorithm is described in detail in O'Dell et al. (2011). It is adapted from the OCO retrieval algorithm (Boesch et al., 2006; Connor et al., 2008; Boesch et al., 2011) and incorporates modifications required to accurately represent the physics of the GOSAT instrument, such as the instrument line shape and noise model. The inverse method is based on the optimal estimation approach given by Rodgers (2000). The forward model is based on LIDORT (Spurr et al., 2001; Spurr, 2002), and a two-order scattering model to account for polarization, described by Natraj and Spurr (2007). A "low-streams interpolation" scheme, devised by O'Dell (2010), ensures that the scattering calculation is both fast and accurate.
The molecular absorption coefficients for CO 2 (Toth et al., 2008) and O 2 (Long et al., 2010) have been extended to account for line mixing and collision-induced absorption using the results of Hartmann et al. (2009) for CO 2 and of Tran and Hartmann (2008) for O 2 . The disk-integrated solar spectrum is based on ground-based measurements from the Kitt Peak Fourier transform spectrometer. All other molecular spectral parameters are taken from HITRAN 2008 (Rothman et al., 2009).
Surface pressure is retrieved from the oxygen A-band near 0.76 µm. The CO 2 columns are retrieved from the weak band near 1.61 µm, and the strong band near 2.1 µm. The spectral ranges used in the ACOS algorithm match those of the OCO and future OCO-2 instrument.

ACOS-GOSAT data screening
We use the v2.8 release of the ACOS-GOSAT data, available from the Goddard Data and Information Services Center (GDISC, see note ACOS-GOSAT Data Access), spanning 5 April 2009 through 21 March 2011. Using the method described in Taylor et al. (2011) and O'Dell et al. (2011), these retrievals are pre-screened to include only cloud-free scenes. The ACOS-GOSAT data product includes a "master quality flag" that provides an estimate of confidence in the retrieved X CO2 and its associated a posteriori error. The master quality flag uses filters that are described in the ACOS readme document also available from the GDISC (Savtchenko and Avis, 2010). Here, we apply postprocessing filters that are slightly different from those used to derive the master quality flag provided with the data. The filters as applied are listed in Table 1 and are chosen to limit the retrievals to those in which we have the highest confidence. The main differences between the filters applied here and those used to determine the master quality flag are in the quality of the spectral fit (i.e., reduced χ 2 ), the allowed deviation of the retrieved surface pressure from the a priori, and a few additional filters as described below.
Retrievals are defined as successful by the master quality flag when they satisfy χ 2 < 1.2. However, the χ 2 values have increased linearly over time, because the time-dependent radiometric calibration owing to a sensitivity degradation of the O 2 A-band channel was not applied to the noise model. To compensate for this, we adjust the cutoff value so that it starts at 1.2 and evolves with a linear increase in time, matching the increase in minimum χ 2 . As a result, a similar number of scenes are retained over time.
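The time-dependent cutoff described above can be sketched as follows. This is only an illustration of the idea, not the ACOS processing code: the function names are hypothetical, and the drift rate `slope_per_year` is a placeholder, since the fitted rate of increase of the minimum χ 2 is not quoted in the text.

```python
import numpy as np

def chi2_cutoff(years_since_start, base=1.2, slope_per_year=0.1):
    """Linearly evolving reduced chi-squared cutoff.

    base is the 1.2 threshold quoted in the text. slope_per_year is a
    hypothetical drift rate standing in for the fitted linear increase
    of the minimum chi-squared.
    """
    return base + slope_per_year * np.asarray(years_since_start, dtype=float)

def passes_chi2(chi2, years_since_start, **kwargs):
    """Boolean mask of soundings whose reduced chi-squared is below the cutoff."""
    return np.asarray(chi2, dtype=float) < chi2_cutoff(years_since_start, **kwargs)

# Illustrative values only: the same chi-squared of 1.25 is rejected at the
# start of the record but retained one year later, once the cutoff has risen.
mask = passes_chi2([1.25, 1.25], [0.0, 1.0])
```

Letting the threshold follow the drift of the minimum χ 2 keeps the retained fraction of scenes roughly constant over the record, which a fixed cutoff would not.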
Data with retrieved surface pressure (P surf ) that differs significantly from the ECMWF a priori surface pressure (P ECMWF ) are marked as 'bad' in the master quality flag. Data are retained by the master quality flag when the difference between the retrieved and a priori surface pressures,

$$\Delta P = P_{\mathrm{surf}} - P_{\mathrm{ECMWF}}, \qquad (1)$$

satisfies 0 < ∆P < 20 hPa. In this work, scenes are retained that satisfy $|\Delta P - \overline{\Delta P}| < 5$ hPa, where the overbar denotes the global mean. The global mean value of ∆P is approximately 10.9 hPa.
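A minimal sketch of this surface pressure screen is given below. It assumes that ∆P is the retrieved minus the a priori surface pressure and that the 5 hPa window is centred on the global mean of about 10.9 hPa quoted above; the function and argument names are hypothetical.

```python
import numpy as np

GLOBAL_MEAN_DELTA_P_HPA = 10.9  # approximate global mean quoted in the text

def delta_p_mask(p_retrieved_hpa, p_apriori_hpa,
                 mean_delta_p_hpa=GLOBAL_MEAN_DELTA_P_HPA, half_width_hpa=5.0):
    """Keep soundings whose Delta P lies within half_width_hpa of the mean.

    Delta P is taken as retrieved minus a priori surface pressure
    (an assumption about the sign convention).
    """
    delta_p = (np.asarray(p_retrieved_hpa, dtype=float)
               - np.asarray(p_apriori_hpa, dtype=float))
    return np.abs(delta_p - mean_delta_p_hpa) < half_width_hpa

# Illustrative soundings with Delta P of 12, 20 and 3 hPa: only the first is kept.
keep = delta_p_mask([1012.0, 1020.0, 1003.0], [1000.0, 1000.0, 1000.0])
```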
We apply three additional filters: one to remove the medium-gain scenes, one to remove the glint measurements, and one to remove scenes that contain surface ice or snow. The medium-gain (M-gain) TANSO-FTS mode, which is used over very bright surface scenes (Fig. 1), is known to have ghosting issues caused by mismatched timing delays in the signal chain (Suto and Kuze, 2010). In future releases of the spectra, this ghosting effect will be corrected, but in the meantime, we do not use the M-gain data. Glint measurements are made exclusively over ocean and have different properties than the nadir measurements made over land. The ACOS-GOSAT glint retrieval algorithm requires additional refinement, so glint retrievals are not considered here.
A fraction of the ACOS-GOSAT retrievals exhibit anomalous X CO2 values due to the presence of higher-albedo snow- and ice-covered land surfaces, which are indistinguishable from low-lying cloud or aerosol in the current version of the algorithm. We apply a filter that depends on the retrieved albedos of the O 2 A-band (A AO2 ) and the strong CO 2 band (A SCO2 ). We will call this combination of albedos the "blended albedo." The blended albedo was determined from a multivariate linear regression on the data, which was trained on scenes known to have snow or ice conditions at the surface, and correctly characterises over 99.9 % of the scenes. Data that are retained satisfy Eq. (2), and their distribution is shown in Fig. 4.

$$\text{blended albedo} \equiv 2.4\,A_{\mathrm{AO2}} - 1.13\,A_{\mathrm{SCO2}} < 1 \qquad (2)$$

4 Bias determination from the Southern Hemisphere
The filtering described in Sect. 3.1 removes spectra recorded under atmospheric conditions that are not yet modeled well in the ACOS retrieval (e.g., surface ice). However, these filters do not remove all systematic errors in the treatment of the instrument calibration, spectroscopy, measurement geometry, or other features. This section discusses the identification of these biases.
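Among the screening steps referred to above, the snow and ice filter based on the blended albedo is simple enough to sketch directly. The coefficients 2.4 and 1.13 and the threshold of 1 are those of Eq. (2); the function names and the example albedos are hypothetical and chosen only to show one retained and one rejected scene.

```python
import numpy as np

def blended_albedo(albedo_o2a, albedo_strong_co2):
    """Blended albedo as defined in Eq. (2) of the text."""
    return (2.4 * np.asarray(albedo_o2a, dtype=float)
            - 1.13 * np.asarray(albedo_strong_co2, dtype=float))

def snow_free_mask(albedo_o2a, albedo_strong_co2):
    """True where the scene passes the snow and ice screen (blended albedo < 1)."""
    return blended_albedo(albedo_o2a, albedo_strong_co2) < 1.0

# Illustrative albedos only: a moderately bright scene (0.30, 0.20) is kept,
# while a scene that is bright in the O2 A-band and dark in the strong CO2
# band (0.80, 0.05) is rejected.
keep = snow_free_mask([0.30, 0.80], [0.20, 0.05])
```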
Known deficiencies in the implementation of the spectroscopic line shape of the O 2 A-band and the strong CO 2 bands cause systematic biases in the retrieved X CO2 . In the absence of an improved line shape model (currently under development), the biases can either be removed after the retrieval by calibrating against known X CO2 values, or by scaling the cross-sections before the retrieval. The method that will be employed by the ACOS team in the 2.9 version of the algorithm (Appendix B) is to scale the cross-sections of the O 2 A-band in order to retrieve the known column of atmospheric O 2 , and to ensure that the spectroscopic parameters describing the strong CO 2 band result in a retrieval that yields the same column amount as the weak CO 2 band for the same atmospheric conditions. The v2.8 algorithm does not use scaled cross-sections, so here we perform an initial "calibration" of the ACOS-GOSAT X CO2 data using Southern Hemisphere TCCON data. The mean difference between the summertime (December, January, February) Lauder TCCON data and the corresponding ACOS-GOSAT data within ±5° latitude of Lauder is ∼2 %. We have thus corrected this bias globally by dividing all ACOS-GOSAT data by 0.982 (Fig. 5). Much of this bias is due to the retrieved surface pressure offset (∆P ), described in Sect. 3.1.
From the v2.8 release of the ACOS-GOSAT product, we select the most significant parameters that reduce the variance of the X CO2 anomalies in the Southern Hemisphere south of 25° S. The anomalies are computed by subtracting a 1.89 ppm yr −1 slope with a seasonal cycle derived from the Baring Head, New Zealand GLOBALVIEW seasonal climatology (GLOBALVIEW-CO 2, 2006) from the ACOS-GOSAT data between 25° S and 55° S. Because the GLOBALVIEW data replicate the in situ seasonal cycle at the surface and not the column seasonal cycle, we have applied a time lag of 6 weeks and have reduced the amplitude by multiplying by 0.65 to best match the seasonal cycles at Lauder and Wollongong (Fig. 3).
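The anomaly calculation can be sketched as below, under stated assumptions. The trend, lag and amplitude factor are the values given above; the sinusoidal `toy_surface_cycle` is a hypothetical stand-in for the Baring Head climatology, and `baseline_ppm` is a hypothetical reference level, since neither is specified here.

```python
import numpy as np

TREND_PPM_PER_YEAR = 1.89  # secular increase quoted in the text
LAG_WEEKS = 6.0            # lag applied to the surface climatology
AMPLITUDE_SCALE = 0.65     # damping applied to the surface climatology

def column_seasonal_cycle(t_years, surface_cycle):
    """Lag and damp a surface seasonal climatology to approximate the column."""
    lag_years = LAG_WEEKS * 7.0 / 365.25
    return AMPLITUDE_SCALE * surface_cycle(np.asarray(t_years, dtype=float) - lag_years)

def reference_xco2(t_years, surface_cycle, baseline_ppm):
    """Assumed Southern Hemisphere X_CO2: baseline + linear trend + seasonal cycle."""
    t = np.asarray(t_years, dtype=float)
    return (baseline_ppm + TREND_PPM_PER_YEAR * t
            + column_seasonal_cycle(t, surface_cycle))

def xco2_anomaly(t_years, xco2_ppm, surface_cycle, baseline_ppm):
    """Observed X_CO2 minus the assumed reference."""
    return (np.asarray(xco2_ppm, dtype=float)
            - reference_xco2(t_years, surface_cycle, baseline_ppm))

def toy_surface_cycle(t_years):
    """Hypothetical surface climatology: a 1 ppm peak-to-peak sinusoid."""
    return 0.5 * np.sin(2.0 * np.pi * t_years)

t = np.linspace(0.0, 2.0, 25)  # years since an arbitrary start
observed = reference_xco2(t, toy_surface_cycle, 385.0) + 0.8  # constant 0.8 ppm bias
anomaly = xco2_anomaly(t, observed, toy_surface_cycle, 385.0)
```

In this synthetic case the anomaly is the injected constant bias; with real soundings, the remaining variance is what the regression on retrieval parameters tries to explain.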
In order of importance, the most significant parameters correlated with this spurious variability in the retrieved X CO2 are the blended albedo (defined in Eq. 2), ∆P (defined in Eq. 1), airmass (described in Eq. 3 below), and the continuum level of the O 2 A-band spectral radiance (called "signal o2" in the v2.8 data files). The airmass is approximated by

$$\text{airmass} = \frac{1}{\cos(\text{solar zenith angle})} + \frac{1}{\cos(\text{observing angle})}, \qquad (3)$$

where solar zenith angle is the angle of the sun, and observing angle is the off-nadir viewing angle of the instrument. (These parameters are labeled "sounding solar zenith," and "sounding zenith," respectively, in the v2.8 data files.) A multivariate linear regression on the blended albedo, ∆P (in hPa), the airmass, and the signal o2 (in W cm −2 sr −1 (cm −1 ) −1 ) suggests that the following modification to the retrieved X CO2 (in ppm) partially removes the biases:

$$X'_{\mathrm{CO_2}} = \frac{X_{\mathrm{CO_2}}}{C_0} - C_1\left(\text{blended albedo} - \overline{\text{blended albedo}}\right) - C_2\left(\Delta P - \overline{\Delta P}\right) - C_3\left(\text{airmass} - \overline{\text{airmass}}\right) - C_4\left(\text{signal o2} - \overline{\text{signal o2}}\right), \qquad (4)$$

where $X'_{\mathrm{CO_2}}$ is the modified value, overbars denote mean values, and the coefficients are C 0 = 0.982, C 1 = 10.5 ppm/units of blended albedo, C 2 = −0.15 ppm hPa −1, C 3 = −2.0 ppm/airmass and C 4 = −0.25 ppm/(10 7 W cm −2 sr −1 (cm −1 ) −1 ). Subtracting off the mean values, listed in Table 2, minimizes the overall change in X CO2 . Scatter plots of the simultaneous regressions are shown in Fig. 6. If only the secular increase is removed from the Southern Hemisphere data to produce the anomalies (i.e., if we do not include the small seasonal cycle), the regression coefficients agree within two bootstrapped standard errors with the coefficients in Eq. (4). Further, if we apply a −1 ppm gradient between 25° S and 55° S to approximate the HIPPO observations, the coefficients again agree, within two bootstrapped standard errors (see Table 2). The bootstrapping technique is described by, for example, Efron and Gong (1983).
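The airmass approximation and the modification of Eq. (4) can be sketched as follows. The sketch assumes that the modification divides the retrieved value by C 0 and subtracts each regression term about its mean. The coefficients are those quoted above, but the mean values in `HYPOTHETICAL_MEANS` are placeholders (the Table 2 values are not reproduced here), and the radiance must be supplied in whatever scaled units C 4 is quoted in.

```python
import numpy as np

# Coefficients quoted in the text (C0 dimensionless, C1..C4 in ppm per parameter unit).
C0, C1, C2, C3, C4 = 0.982, 10.5, -0.15, -2.0, -0.25

# Placeholder means; the actual values are listed in Table 2 of the paper.
HYPOTHETICAL_MEANS = {"blended_albedo": 0.5, "delta_p": 10.9,
                      "airmass": 2.5, "signal_o2": 3.0}

def airmass(solar_zenith_deg, observing_zenith_deg):
    """Two-way geometric airmass: 1/cos(solar zenith) + 1/cos(observing angle)."""
    sza = np.radians(np.asarray(solar_zenith_deg, dtype=float))
    vza = np.radians(np.asarray(observing_zenith_deg, dtype=float))
    return 1.0 / np.cos(sza) + 1.0 / np.cos(vza)

def modified_xco2(xco2_ppm, blended_albedo, delta_p_hpa, airmass_value,
                  signal_o2_scaled, means=HYPOTHETICAL_MEANS):
    """Scale by C0, then subtract each regressed dependence about its mean.

    signal_o2_scaled is the continuum radiance in the scaled units in
    which C4 is quoted (an assumption about the input convention).
    """
    return (np.asarray(xco2_ppm, dtype=float) / C0
            - C1 * (blended_albedo - means["blended_albedo"])
            - C2 * (delta_p_hpa - means["delta_p"])
            - C3 * (airmass_value - means["airmass"])
            - C4 * (signal_o2_scaled - means["signal_o2"]))

# A sounding sitting exactly at the mean of every parameter only receives
# the global scaling; one unit of extra airmass adds 2 ppm because C3 is negative.
at_mean = modified_xco2(382.98, 0.5, 10.9, 2.5, 3.0)
high_airmass = modified_xco2(382.98, 0.5, 10.9, 3.5, 3.0)
```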
These basis functions (blended albedo, ∆P , airmass, signal o2) are not orthogonal, and other parameters may be used to accomplish a similar reduction in the variability of retrieved X CO2 .
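A regression on correlated regressors with bootstrapped standard errors can be illustrated on synthetic data. This sketch is not the analysis code used for Table 2: the data are random numbers, two of the regressors are deliberately anti-correlated to mimic non-orthogonal basis functions, and all names are hypothetical.

```python
import numpy as np

def fit_slopes(params, anomaly):
    """Least-squares slopes of the anomaly on mean-centred parameters."""
    centred = params - params.mean(axis=0)
    design = np.column_stack([np.ones(len(centred)), centred])
    coeffs, *_ = np.linalg.lstsq(design, anomaly, rcond=None)
    return coeffs[1:]  # drop the intercept

def bootstrap_standard_errors(params, anomaly, n_boot=500, seed=0):
    """Standard deviation of the slopes over resamples drawn with replacement."""
    rng = np.random.default_rng(seed)
    n = len(anomaly)
    samples = np.empty((n_boot, params.shape[1]))
    for k in range(n_boot):
        idx = rng.integers(0, n, size=n)
        samples[k] = fit_slopes(params[idx], anomaly[idx])
    return samples.std(axis=0, ddof=1)

# Synthetic demonstration: four regressors, the last two anti-correlated.
rng = np.random.default_rng(1)
params = rng.normal(size=(400, 4))
params[:, 3] = -0.8 * params[:, 2] + 0.6 * params[:, 3]
true_slopes = np.array([10.5, -0.15, -2.0, -0.25])
anomaly = params @ true_slopes + rng.normal(scale=0.5, size=400)

slopes = fit_slopes(params, anomaly)
errors = bootstrap_standard_errors(params, anomaly)
```

In such a setup the standard errors of the two correlated regressors are inflated relative to the others, which is one reason why a different but correlated set of parameters can achieve a similar reduction in variance.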
Errors in aerosol and cloud characterization or identification can affect the retrieved albedos and hence the blended albedo parameter, and they can also affect the retrieved path length and ∆P. [...] not use the blended albedo parameter directly, but they use the ratio of the weak CO 2 band signal to the O 2 A-band signal, which is strongly and linearly related to blended albedo (r 2 = 0.78).) This suggests that at least part of the blended albedo-X CO2 and ∆P -X CO2 relationships are caused by the retrieval algorithm itself.
In addition to parameters that can be tested in the simulator, there are several known causes of systematic effects on the retrievals. First, errors in the spectroscopy can produce spurious airmass dependencies as well as global biases (e.g., Yang et al., 2005; Hartmann et al., 2009; Deutscher et al., 2010; Wunch et al., 2011) and can affect the pressure retrieval (e.g., ∆P ). Another error source is from nonlinearities in the instrument signal chain that can manifest themselves as zero-level offsets in the O 2 A-band. Zero-level offsets in a Fourier transform spectrometer depend strongly on the signal at zero path difference, and hence on the average signal level of the spectrum (Abrams et al., 1994). As a proxy for the average signal level, which is not available in the public v2.8 data, we use the continuum level radiance ("signal o2"), which is highly correlated with the average signal level (r 2 = 0.994). Disentangling biases associated with the spectral continuum level from the airmass is difficult, because they are strongly (and nonlinearly) anti-correlated.
Future releases of data will account for the zero-level offset explicitly, either as in Butz et al. (2011), or, preferably, in the measured radiances in the interferograms, prior to the Fourier transform, once the underlying instrumental cause is properly quantified.
Finally, there is a photosynthetic fluorescence signal in the O 2 A-band (Joiner et al., 2011). Its potential impact on the retrieval of scattering properties in the A-band is described by Frankenberg et al. (2011) and makes use of the Fraunhofer lines near the O 2 A-band.
This effect is currently ignored in the X CO2 retrievals and can give rise to systematic biases. Over photosynthetically active regions of the globe, the vegetation fluoresces, adding a broad-band signal throughout the O 2 A-band. If this additional signal is not included in the forward model, the measured O 2 lines appear shallower than expected, and the retrieved X CO2 will be incorrect (too high), with a seasonal cycle from the vegetation fluorescence imposed on top of the true X CO2 seasonal cycle that is of interest here. The effects of fluorescence will be retrieved and the fluorescence data will be available in a future release of the ACOS-GOSAT data.
In applying Eq. (4) to the global dataset, we assume that the dependencies of ∆X CO2 on the parameters are linear, and can be reasonably extrapolated to values found outside the range in the Southern Hemisphere. The Northern Hemisphere and Southern Hemisphere have similar distributions of ∆P , blended albedo and signal o2, but the Northern Hemisphere data contain a larger range of airmasses. In the Southern Hemisphere, 99 % of the data poleward of 25° S have sampled airmasses between 2 and 3.3. In the Northern Hemisphere, 99 % of the data poleward of 25° N have sampled airmasses between 2 and 5.1. Any nonlinearity in the airmass-∆X CO2 relationship will result in a residual airmass dependency in the modified Northern Hemisphere data. Maps and histograms of the four parameters are in the supplementary material (Figs. S1 and S2).

Applying averaging kernels
To compare two X CO2 observations properly, the retrievals must be computed about a common a priori profile, and the effect of smoothing must be taken into account by applying the averaging kernels (Rodgers and Connor, 2003). Since the v2.8 ACOS and TCCON retrievals were computed using different a priori profiles, we must adjust the retrieved X CO2 values accordingly (see Appendix A for the mathematical details). To test the effect of this adjustment and of the smoothing, we select retrievals within ±0.5° latitude and ±1° longitude of the Lamont TCCON site. We cannot test the effects of the averaging kernels globally because this requires an estimate of the real atmospheric variability everywhere, which is unknown. We can generate an estimate of the atmospheric variability over Lamont, however, by using the bi-weekly low-altitude (0-5 km) aircraft measurements of CO 2 profiles over the Lamont TCCON station (Fig. 7) and the surface CO 2 measurements from the co-located tall tower when they were available. Each profile was extrapolated up to 5500 m and down to the surface altitude (315 m) from the nearest available data point, resulting in 177 profiles recorded between January 2006 and November 2009. In order to compute the weekly variance over several years of observations, a secular increase of 1.89 ppm yr −1 was subtracted from all altitudes of the profiles. Next, we adjust the ACOS-GOSAT values to the ensemble profile, which we assume to be the TCCON a priori profile. This results in an adjustment to the ACOS-GOSAT X CO2 that is seasonal, with an amplitude of about 0.5 ppm. It may also have a small secular decrease of about 0.1 ppm yr −1, which could be due to the differences in the secular increases in the ACOS-GOSAT and TCCON a priori profiles. The ACOS X CO2 values are adjusted downward in the winter, and upward in the summer, which has the effect of reducing the overall seasonal cycle of the ACOS-GOSAT retrieval (Fig. 8).
The adjustment at Lamont has a seasonal cycle because the ACOS-GOSAT a priori profile does not contain a seasonal cycle, whereas the real atmosphere does (Fig. A1). This seasonal cycle is driven near the surface by biospheric respiration and uptake, and in the stratosphere by dynamics that seasonally alter the tropopause height. The adjustment to the ACOS-GOSAT data will be latitude-dependent, with smaller adjustments in the Southern Hemisphere, and the largest adjustments at the latitude of the boreal forests (i.e., around 50-65° N), where the surface seasonal cycle has the largest amplitude. Figure S3 illustrates the latitude-dependence of the adjustment.
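The adjustment to a common a priori profile can be checked numerically on a toy problem. The sketch below assumes the standard linearised retrieval model of Rodgers and Connor (2003), in which the retrieved profile equals the a priori plus the averaging kernel applied to the departure of the truth from the a priori. It works with profiles rather than column averages, and the averaging kernel, profiles and function name are all synthetic or hypothetical.

```python
import numpy as np

def adjust_to_common_prior(x_hat, averaging_kernel, x_apriori, x_common):
    """Re-reference a retrieval to a common comparison profile.

    Adds (A - I)(x_a - x_c) to the retrieved profile, following the
    Rodgers and Connor (2003) adjustment.
    """
    a = np.asarray(averaging_kernel, dtype=float)
    identity = np.eye(a.shape[0])
    return x_hat + (a - identity) @ (x_apriori - x_common)

# Synthetic five-level example (all values illustrative, in ppm).
rng = np.random.default_rng(0)
n_levels = 5
A = 0.8 * np.eye(n_levels) + 0.02 * rng.normal(size=(n_levels, n_levels))
x_true = 390.0 + rng.normal(size=n_levels)
x_a = np.full(n_levels, 385.0)             # retrieval a priori without structure
x_c = 388.0 + rng.normal(size=n_levels)    # common comparison profile

x_hat = x_a + A @ (x_true - x_a)           # noise-free linearised retrieval
x_adj = adjust_to_common_prior(x_hat, A, x_a, x_c)

# After adjustment the retrieval behaves as if it had been made about x_c.
residual = (x_adj - x_c) - A @ (x_true - x_c)
```

The residual vanishes to rounding error, which is the sense in which the adjusted retrieval is referenced to the common profile. The size of the adjustment scales with the difference between the two a priori profiles, consistent with the seasonal and latitude dependence described above.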
The smoothing error (defined in the caption and given by the red curve in Fig. 8) is about 0.6 ppm, which is smaller than the sum of the variances of the ACOS-GOSAT X CO2 and the TCCON X CO2 (∼1.5 ppm) but not negligibly so. The effect of smoothing the TCCON data using the ACOS-GOSAT averaging kernel results in a bias of about 0.6 ppm with no significant seasonal cycle or airmass dependence (the yellow curve in Fig. 8).
Applying the averaging kernels in a globally consistent manner is not possible without a global estimate of atmospheric variability. However, we can draw two important conclusions from the Lamont test:
1. There is a seasonal cycle induced by the adjustment of the ACOS-GOSAT data to the TCCON a priori profile. The amplitude of the adjustment has a latitude dependence and is about 0.5 ppm at Lamont.
2. There is a bias of about 0.6 ppm induced by smoothing the TCCON profile with the ACOS-GOSAT averaging kernel at Lamont.
The TCCON a priori profile is being evaluated for a future version of the ACOS-GOSAT algorithm, which would make the adjustment step unnecessary.
Our correction scheme described by Eq. (4) should significantly reduce airmass dependencies caused by global error terms (e.g., spectroscopic errors) and the overall bias. This will not be perfect, of course, and the results will likely contain a residual latitude-dependent seasonal bias. Once the TCCON priors are used for the ACOS-GOSAT retrievals, the discrepancies caused by the a priori profiles will be eliminated, leaving us only to consider the smoothing error. For the remainder of this paper, only the adjustments in Eq. (4) are applied.

Comparisons in the Northern Hemisphere
The first step in evaluating the Northern Hemisphere seasonal cycles from the ACOS-GOSAT data before and after applying Eq. (4) is to inspect the retrieved values in latitude bands corresponding to TCCON sites. Figure 9 shows latitude bands containing the 11 TCCON sites used in this study. The Tsukuba TCCON data were adjusted up by 1.32 ppm in this analysis, due to a known instrumental bias that has been characterized through aircraft calibration campaigns (Tanaka et al., 2011).
The seasonal cycle shape, after applying Eq. (4) to the ACOS-GOSAT data, is generally improved over the data that has only the global bias removed (0.982). Site-by-site investigations require stricter coincidence criteria. However, criteria based on tight geographic and temporal constraints result in few coincidences at higher latitude sites, because the surface is covered in snow, or it is often cloudy.
We can loosen geographic and temporal constraints on the coincidence criteria if we exploit the relationship between the free-tropospheric potential temperature and variability in X CO2 in the Northern Hemisphere (Fig. 10). Keppel-Aleks et al. (2011) detail the use of the potential temperature coordinate as a proxy for equivalent latitude for CO 2 gradients in the Northern Hemisphere. We use the mid-tropospheric temperature field at 700 hPa, T 700 (which is directly proportional to the potential temperature at 700 hPa for the range of temperatures of interest here), to allow a significantly broader comparison between TCCON and ACOS-GOSAT than could be found using only geographic coincidence. The pressure (700 hPa) is arbitrary, and any mid-tropospheric pressure would do. Choosing 700 hPa is convenient, however, because the NCEP/NCAR analysis product is provided on a 700 hPa grid level (Kalnay et al., 1996), and the NCEP/NCAR data provide the a priori atmospheric information to the TCCON retrieval algorithm. A Northern Hemisphere map of the NCEP/NCAR T 700 field for 10 days in August 2010 is shown in Fig. 11.
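The proportionality between temperature and potential temperature at a fixed pressure level follows from the definition of potential temperature, as the short sketch below shows. The exponent of 0.2857 is the standard dry-air value of R/c p and the 1000 hPa reference pressure is the usual convention; neither is taken from the text.

```python
KAPPA = 0.2857  # R / c_p for dry air (standard value, an assumption here)

def potential_temperature(temperature_k, pressure_hpa, reference_hpa=1000.0):
    """Potential temperature from temperature and pressure (Poisson's equation)."""
    return temperature_k * (reference_hpa / pressure_hpa) ** KAPPA

# At a fixed pressure of 700 hPa the conversion is a constant factor, so a
# window in T700 maps onto a proportionally scaled window in potential temperature.
factor_700 = potential_temperature(1.0, 700.0)
window_theta_k = 2.0 * factor_700  # the +/-2 K T700 window expressed in theta
```

Because the factor is constant on the pressure surface, selecting on T 700 and selecting on potential temperature at 700 hPa pick out the same soundings.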
For our coincidence criteria, we find GOSAT measurements within 10 days, latitudes within ±10° and longitudes within ±30° of the TCCON site, for which T 700 is within ±2 K of the value over the TCCON site. The longitude limits for Tsukuba are set to be ±10° because we do not wish to inadvertently over-weight the measurements over China. The possible locations of the coincidences for each TCCON site, given the latitude, longitude, and T 700 of each site, are overlaid on the map in Fig. 11.
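These coincidence criteria translate into a simple boolean mask. In the sketch below the thresholds are those given above, "within 10 days" is interpreted as a ±10 day window (an assumption), longitude differences are wrapped across the date line, and the function name, argument names and example soundings are hypothetical.

```python
import numpy as np

def coincident_mask(sat_time_days, sat_lat, sat_lon, sat_t700,
                    site_time_days, site_lat, site_lon, site_t700,
                    max_days=10.0, max_dlat=10.0, max_dlon=30.0, max_dt700=2.0):
    """True for soundings matching a site in time, location and T700."""
    dtime = np.abs(np.asarray(sat_time_days, dtype=float) - site_time_days)
    dlat = np.abs(np.asarray(sat_lat, dtype=float) - site_lat)
    # Wrap the longitude difference into [-180, 180) degrees.
    dlon = (np.asarray(sat_lon, dtype=float) - site_lon + 180.0) % 360.0 - 180.0
    dt700 = np.abs(np.asarray(sat_t700, dtype=float) - site_t700)
    return ((dtime <= max_days) & (dlat <= max_dlat)
            & (np.abs(dlon) <= max_dlon) & (dt700 <= max_dt700))

# Three illustrative soundings near a mid-latitude site: the first matches,
# the second fails the T700 test, the third is too far away in time.
mask = coincident_mask(
    sat_time_days=[3.0, 3.0, 15.0], sat_lat=[50.0, 50.0, 50.0],
    sat_lon=[-70.0, -70.0, -70.0], sat_t700=[271.0, 274.0, 271.0],
    site_time_days=0.0, site_lat=45.9, site_lon=-90.3, site_t700=270.0)
```

For a site such as Tsukuba, the narrower longitude limit would be passed as `max_dlon=10.0`.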
This set of criteria results in many more coincident measurements over the higher latitude sites (Table 3). For example, over Park Falls, the T 700 criterion results in 10 times more coincident measurements than using a geographic constraint of ±0.5° latitude and ±1.5° longitude.
These criteria are applied to generate Fig. 12 and Table 3, which show the site-by-site comparisons in the Northern Hemisphere. The correlations between TCCON and ACOS-GOSAT are shown in Fig. 13. All slopes are quoted as x ± y, where x is the best fit slope and y is twice the standard error on the best fit, calculated using the method outlined in York et al. (2004), under the assumption that there is no correlation between the errors in x and the errors in y. The slope is significantly improved after applying Eq. (4) (compare the left and middle panels of Fig. 13, which have slopes of 0.82 ± 0.07 and 0.88 ± 0.07, respectively). Selecting a T 700 coincidence criterion also improves the coefficient of determination (r 2 ) over a simple latitude/longitude/time coincidence (compare the middle and right panels of Fig. 13, which have r 2 of 0.80 and 0.77, respectively). When using a T 700 constraint of ±1 K (instead of ±2 K), the r 2 decreases, and the comparison dataset diminishes significantly (10 % loss in data over Park Falls, and 25 % loss in data over Tsukuba). A constraint of ±3 K shows no reduction in r 2 , but also no significant gain in coincident measurements, as the geographic constraints become dominant. Using a simple geographic constraint but with a larger ±2.5° box around each TCCON site results in a reduced slope (0.89 ± 0.04) compared with the right panel of Fig. 13 (which has a slope of 0.96 ± 0.08), and the same coefficient of determination (r 2 = 0.76).
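A straight-line fit with errors in both coordinates, of the kind referred to above, can be sketched as follows. This is a compact rendering of the iterative equations of York et al. (2004) with the error correlation set to zero, written for illustration rather than taken from the analysis code; the function name and test data are hypothetical.

```python
import numpy as np

def york_fit(x, y, sigma_x, sigma_y, tol=1e-12, max_iter=100):
    """Best-fit line y = intercept + slope * x with uncorrelated x and y errors.

    Returns slope, intercept and their one-sigma standard errors.
    """
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    wx = 1.0 / np.asarray(sigma_x, dtype=float) ** 2  # weights of the x errors
    wy = 1.0 / np.asarray(sigma_y, dtype=float) ** 2  # weights of the y errors
    slope = np.polyfit(x, y, 1)[0]  # ordinary least squares starting guess
    for _ in range(max_iter):
        w = wx * wy / (wx + slope ** 2 * wy)
        x_bar = np.sum(w * x) / np.sum(w)
        y_bar = np.sum(w * y) / np.sum(w)
        u, v = x - x_bar, y - y_bar
        beta = w * (u / wy + slope * v / wx)
        new_slope = np.sum(w * beta * v) / np.sum(w * beta * u)
        converged = abs(new_slope - slope) < tol
        slope = new_slope
        if converged:
            break
    intercept = y_bar - slope * x_bar
    x_adj = x_bar + beta  # least-squares adjusted x values
    x_adj_bar = np.sum(w * x_adj) / np.sum(w)
    slope_err = np.sqrt(1.0 / np.sum(w * (x_adj - x_adj_bar) ** 2))
    intercept_err = np.sqrt(1.0 / np.sum(w) + x_adj_bar ** 2 * slope_err ** 2)
    return slope, intercept, slope_err, intercept_err

# Points lying exactly on y = 2x + 1 with equal errors in both coordinates.
xs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
slope, intercept, slope_err, intercept_err = york_fit(xs, 2.0 * xs + 1.0,
                                                      np.full(5, 0.1), np.full(5, 0.1))
quoted_uncertainty = 2.0 * slope_err  # slopes are quoted with twice the standard error
```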
The variability of the ACOS-GOSAT X CO2 seen in this work is comparable to that described by Morino et al. (2011) and Butz et al. (2011). Morino et al. (2011) remove a large-scale spectroscopic bias that is similar in magnitude to the bias seen in the ACOS retrievals (−8.6 ppm, or 2.2 %), but a significantly smaller northern hemisphere standard deviation of 1.2 ppm for Białystok, Orléans, Garmisch, Park Falls, Lamont and Tsukuba, using ±2° latitude and longitude and ±1 hour coincidence criteria (Table A1 of [...]).

The correlation slope between the ACOS-GOSAT and TCCON data is not unity within the uncertainty: it is 0.88 ± 0.07 with an r 2 of 0.80. This difference from unity may be partially due to a time-dependent difference in X CO2 between the TCCON data and the ACOS-GOSAT data (Hiroshi Suto, personal communication). This could imply that there is a residual radiometric calibration error (due to degradation of the mirrors or other optical components) or another time-dependent effect, such as a drift in the reference laser frequency. A residual airmass-dependent error remains, especially at very high airmasses, and indeed the assumed linear regression reduces the agreement at very high airmasses. This is clear in the Eureka time series and in Table 3. Limiting the correlation plot to airmasses ≤3.3 improves the r 2 and increases the slope (to 0.85 and 0.93 ± 0.08, respectively).
The additional airmass-dependent errors may be reduced by adjusting the ACOS-GOSAT retrieval to the TCCON a priori profile and accounting for the photosynthetic fluorescence signal. OCO-2's target mode will allow for a determination of the airmass dependence globally.
Even after modification of the ACOS-GOSAT data by Eq. (4), the ACOS-GOSAT noise is too large to see significant (∼2 ppm) interannual X CO2 drawdown differences. Figure 10 [...] Once improvements to the ACOS algorithms are implemented, the noise should reduce, and we anticipate that these important interannual features will become separable from the noise.

Discussion and conclusions
Estimating sources of bias in satellite observations is essential if the data are to be used to infer surface fluxes. The ACOS retrievals of X CO2 from the GOSAT TANSO-FTS instrument contain global and regional systematic errors. We have demonstrated that bias between the ACOS-GOSAT retrieval of X CO2 data and TCCON X CO2 is significantly reduced if a set of modifications determined from the Southern Hemisphere data is applied globally. After applying the modifications to the data described by Eq. (4), the comparisons of ACOS-GOSAT X CO2 to TCCON are significantly improved but remain imperfect and show both residual time and airmass dependences. Future versions of the ACOS-GOSAT data will include an updated radiometric calibration, a fluorescence correction and a nonlinearity correction, and will use a seasonally and latitudinally varying a priori profile, all of which should improve the retrievals.
One underlying assumption in this work has been that the X CO2 gradients in the Southern Hemisphere are small. We expect that as the quality of the satellite data improves, this assumption will become less valid. In future work, using assimilations of Southern Hemisphere CO 2 (e.g., CarbonTracker, described by Peters et al., 2007) and the Southern Hemisphere TCCON sites could provide a more robust estimate of the true Southern Hemisphere X CO2 fields. A second important assumption we have made is that the spurious variability in the Northern Hemisphere is caused by the same retrieval or instrument parameters that cause the spurious variability in the Southern Hemisphere. Wherever this assumption is invalid, residual variability and bias will remain in the Northern Hemisphere.
When turning to comparisons of ACOS-GOSAT X CO2 with TCCON in the Northern Hemisphere, coincidence criteria that include the temperature at 700 hPa, which serves as a tracer of dynamically driven variability in X CO2 , allow for a broader comparison with larger sample sizes. The ACOS-GOSAT noise in v2.8 is still too large to distinguish interannual variability in the Northern Hemisphere seasonal cycles in 2009 and 2010, but we anticipate that future versions of the ACOS algorithm will be able to clearly distinguish the two years.
The methods outlined in this paper (using the Southern Hemisphere to define modifications that remove spurious variability, and using the temperature at 700 hPa to define coincidence criteria in the Northern Hemisphere) are readily applicable to other satellite instruments observing X CO2 . In particular, they are directly applicable to the future OCO-2 retrieval algorithm, and will form the basis for initial evaluations of the OCO-2 data.

Appendix A
The effect of averaging kernels

The averaging kernels and a priori profiles for the ACOS-GOSAT retrievals over Lamont and the TCCON FTS retrievals are shown in Figs. A1 and A2. According to Rodgers and Connor (2003), to compare retrieval results from two different instruments with differing viewing geometries, retrieval algorithms, a priori profiles ($\mathbf{x}_a$) and averaging kernels ($\mathbf{A}$), an "ensemble" profile ($\mathbf{x}_c$) and covariance matrix ($\mathbf{S}_c$) should be selected, which represent the mean and variability of the ensemble of true atmospheric profiles over which the comparison is to be made. That is, in order to compare retrieved values $\hat{\mathbf{x}}_i$ from the $i$-th instrument, the equations, traditionally written as

$$\hat{\mathbf{x}}_i - \mathbf{x}_{ai} = \mathbf{A}_i (\mathbf{x} - \mathbf{x}_{ai}) + \boldsymbol{\epsilon}_{xi} \tag{A1}$$

with measurement error $\boldsymbol{\epsilon}_{xi}$, should be "adjusted" to a common comparison ensemble, $\mathbf{x}_c$, by adding $(\mathbf{A}_i - \mathbf{I})(\mathbf{x}_{ai} - \mathbf{x}_c)$ to both sides of the equation, giving our new, adjusted equations:

$$\hat{\mathbf{x}}'_i - \mathbf{x}_c = \mathbf{A}_i (\mathbf{x} - \mathbf{x}_c) + \boldsymbol{\epsilon}_{xi} \tag{A2}$$

where $\hat{\mathbf{x}}'_i$ is the "adjusted" $\hat{\mathbf{x}}_i$, and $\mathbf{I}$ is the identity matrix:

$$\hat{\mathbf{x}}'_i = \hat{\mathbf{x}}_i + (\mathbf{A}_i - \mathbf{I})(\mathbf{x}_{ai} - \mathbf{x}_c) \tag{A3}$$

We are interested in comparing the dry-air mole fractions (DMFs, X CO2 ) in ppm, and not the profiles of CO 2 . X CO2 is computed by dividing the total column abundance of CO 2 by the column of dry air.
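$$X_{\mathrm{CO_2}} = \frac{\mathrm{column}_{\mathrm{CO_2}}}{\mathrm{column}_{\text{dry air}}} \tag{A4}$$

$$\mathrm{column}_{\text{dry air}} = \frac{\mathrm{column}_{\mathrm{O_2}}}{0.2095} \tag{A5}$$

$$\mathrm{column}_{\text{dry air}} = \frac{P_s}{\{g\}_{\mathrm{air}}\, m_{\text{dry air}}} - \mathrm{column}_{\mathrm{H_2O}} \frac{m_{\mathrm{H_2O}}}{m_{\text{dry air}}} \tag{A6}$$

where $P_s$ is the surface pressure, $m_{\mathrm{H_2O}}$ is the molecular weight of water ($18.02 \times 10^{-3}/N_A$ kg molecule$^{-1}$), $m_{\text{dry air}}$ is the molecular weight of dry air ($28.964 \times 10^{-3}/N_A$ kg molecule$^{-1}$), $N_A$ is Avogadro's constant, and $\{g\}_{\mathrm{air}}$ is the column-averaged gravitational acceleration.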
The TCCON and ACOS-GOSAT algorithms compute the total column of dry air in different ways. Both use a measurement of the O 2 column, but the TCCON approach is to divide the total column of CO 2 by the total column of O 2 , measured in the 1.27 µm spectral region (i.e., Eq. A5). This approach is advantageous because the CO 2 and O 2 bands are spectrally close, so many errors caused by instrumental imperfections are reduced in the ratio, and no additional water vapor correction is necessary (Wallace and Livingston, 1990; Yang et al., 2002; Wunch et al., 2011). Mesospheric dayglow from the 1.27 µm O 2 band precludes useful measurements of this band from space, and so the GOSAT instrument measures the O 2 A-band (0.76 µm). The ACOS-GOSAT algorithm cannot simply use the TCCON formulation (Eq. A5) because the A-band is spectrally distant from the CO 2 bands and is measured on a separate detector. Instead, it uses the O 2 A-band measurements to compute a surface pressure, which is then used to compute the dry air column via Eq. (A6), explicitly correcting for the water column with the retrieved value from the ACOS algorithm.
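The two routes to the dry-air column can be sketched numerically. This is an illustration under stated assumptions rather than either algorithm's implementation: the function names are hypothetical, the O 2 dry-air mole fraction of 0.2095 is assumed, the TCCON-style route is taken to be the O 2 column divided by that fraction, and the ACOS-style route is taken to be the surface pressure divided by the product of gravity and the dry-air molecular mass, minus the water column scaled by the ratio of molecular masses. Columns are in molecules m^-2 and pressure in Pa.

```python
N_A = 6.02214076e23            # Avogadro's constant, molecules per mole
M_H2O = 18.02e-3 / N_A         # kg per molecule of water (value from the text)
M_DRY_AIR = 28.964e-3 / N_A    # kg per molecule of dry air (value from the text)
O2_DRY_MOLE_FRACTION = 0.2095  # assumed dry-air mole fraction of O2

def dry_air_column_from_o2(col_o2):
    """TCCON-style: scale the measured O2 column to a dry-air column."""
    return col_o2 / O2_DRY_MOLE_FRACTION

def dry_air_column_from_pressure(p_surf, col_h2o, g_air):
    """ACOS-style: total air column from surface pressure, minus the water column.

    p_surf  : surface pressure in Pa
    col_h2o : water vapor column in molecules m^-2
    g_air   : column-averaged gravitational acceleration in m s^-2
    """
    return p_surf / (g_air * M_DRY_AIR) - col_h2o * M_H2O / M_DRY_AIR

def xco2_ppm(col_co2, col_dry_air):
    """Column-averaged dry-air mole fraction in ppm."""
    return 1e6 * col_co2 / col_dry_air
```

With a surface pressure near 1013 hPa and no water vapor, the pressure-based column is roughly 2.1e29 molecules m^-2; any water column reduces it, which is the explicit correction the ratio-to-O 2 approach does not need.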
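The retrieved X CO2 , denoted $\hat{c}$, can also be described as the profile-weighted column-average CO 2 mixing ratio in dry air, and is related to the retrieved profile, $\hat{\mathbf{x}}$, via the pressure weighting function $\mathbf{h}$, described by Connor et al. (2008):

$$\hat{c} = \mathbf{h}^T \hat{\mathbf{x}} \tag{A7}$$

The pressure weighting function contains the pressure thicknesses in the state vector, normalized by the surface pressure corrected for the atmospheric water content. Applying $\mathbf{h}^T = (h_1, \ldots, h_j, \ldots)$ to both sides of Eq. (A2) gives Eq. (22) in Rodgers and Connor (2003):

$$\hat{c}'_i - c_c = \mathbf{h}^T \mathbf{A}_i (\mathbf{x} - \mathbf{x}_c) + \epsilon_{ci} = \sum_j h_j a_{ij} (\mathbf{x} - \mathbf{x}_c)_j + \epsilon_{ci} \tag{A8}$$

where $c_c = \mathbf{h}^T \mathbf{x}_c$ is the column average of the ensemble profile, $\epsilon_{ci}$ is the measurement error on the column retrieval for instrument $i$, and $j$ indexes the pressure level.

The normalized column averaging kernel is $\mathbf{a}_i = (a_{i1}, \ldots, a_{ij}, \ldots)^T$ for instrument $i$ and is defined by Connor et al. (2008), Eq. (8):

$$a_{ij} = \frac{(\mathbf{h}^T \mathbf{A}_i)_j}{h_j} \tag{A9}$$

The "adjusted" retrieved column $\hat{c}'_i$ is then

$$\hat{c}'_i = \hat{c}_i + \sum_j h_j (\mathbf{a}_i - \mathbf{u})_j (\mathbf{x}_{ai} - \mathbf{x}_c)_j \tag{A10}$$

where $\mathbf{u}$ is a vector of ones. The difference and variance in the DMFs are then represented by Eqs. (23) and (24) from Rodgers and Connor (2003):

$$\hat{c}'_1 - \hat{c}'_2 = \sum_j h_j (\mathbf{a}_1 - \mathbf{a}_2)_j (\mathbf{x} - \mathbf{x}_c)_j + \epsilon_{c1} - \epsilon_{c2} \tag{A11}$$

$$\sigma^2(\hat{c}'_1 - \hat{c}'_2) = \sum_k \sum_j h_j (\mathbf{a}_1 - \mathbf{a}_2)_j (\mathbf{S}_c)_{jk}\, h_k (\mathbf{a}_1 - \mathbf{a}_2)_k + \sigma^2_{c1} + \sigma^2_{c2} \tag{A12}$$

The matrix $\mathbf{S}_c$ is the ensemble covariance matrix, and represents the real atmospheric variability. We will use the convention that GOSAT is $i = 1$, and TCCON is $i = 2$.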
For simplicity, we can choose the TCCON a priori profile as the ensemble profile (i.e., $\mathbf{x}_{a2} = \mathbf{x}_c$). The TCCON a priori profile is a statistically reasonable estimate of CO 2 in the atmosphere: it is an empirical function that is latitude- and time-dependent, built on the GLOBALVIEW data set in the troposphere (GLOBALVIEW-CO 2 , 2006) and the age-of-air calculations of Andrews et al. (2001) in the stratosphere.
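The adjustment to a common ensemble profile can be sketched with a few lines of NumPy. This is a minimal illustration, assuming the adjustment takes the form of a sum over levels of h_j (a_i - u)_j (x_ai - x_c)_j and the smoothing contribution to the difference takes the form of a sum over levels of h_j (a_1 - a_2)_j (x - x_c)_j; the function names are hypothetical, and all profile arrays are assumed to share the same pressure levels.

```python
import numpy as np

def adjust_to_ensemble(c_hat, h, a, x_a, x_c):
    """Adjust a retrieved column c_hat (ppm) from its own prior x_a to x_c.

    h   : pressure weighting function (weights summing to one)
    a   : normalized column averaging kernel of the instrument
    x_a : a priori CO2 profile used in the retrieval (ppm)
    x_c : common ensemble profile (ppm)
    """
    h, a, x_a, x_c = (np.asarray(v, dtype=float) for v in (h, a, x_a, x_c))
    return c_hat + np.sum(h * (a - 1.0) * (x_a - x_c))

def smoothing_difference(h, a1, a2, x, x_c):
    """Column difference caused purely by the differing averaging kernels."""
    h, a1, a2, x, x_c = (np.asarray(v, dtype=float) for v in (h, a1, a2, x, x_c))
    return np.sum(h * (a1 - a2) * (x - x_c))
```

When the ensemble profile is chosen to be the TCCON a priori, `adjust_to_ensemble` leaves the TCCON column unchanged, so only the ACOS-GOSAT column needs adjusting; `smoothing_difference` can then be evaluated with an independent "true" profile, such as an aircraft profile, to judge whether the remaining kernel mismatch matters.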
If the first term on the right-hand side of Eq. (A12) is small compared with $\sigma^2_{c1} + \sigma^2_{c2}$, then an adjustment to a common ensemble a priori profile is sufficient to account for the major differences in the two retrievals at the same location and time. This means that we can directly compare $\hat{c}'_1$ and $\hat{c}'_2$. However, if the first term on the right-hand side of Eq. (A12) is not negligibly small, we must reduce our smoothing error by computing what the GOSAT instrument would retrieve given the TCCON total column as "truth," via Eq. (25) from Rodgers and Connor (2003):

$$\hat{c}_{12} = c_c + \sum_j h_j a_{1j} (\gamma \mathbf{x}_{a2} - \mathbf{x}_c)_j \tag{A13}$$

where $\gamma$ is the TCCON scaling factor applied to the a priori profile to get the final TCCON profile that is then integrated to produce $\hat{c}_2$.
A comparison of $\hat{c}_{12}$ with $\hat{c}'_1$ (the GOSAT adjusted retrieval) should significantly reduce the smoothing error introduced by the averaging kernels. Analogs of Eqs. (A11) and (A12) for this case are found in Eqs. (26) and (27) of Rodgers and Connor (2003). A full profile (from the surface up to 12 km) was measured by an instrumented aircraft over Lamont on 2 August 2009, which provides an example "true" profile (i.e., $\mathbf{x}$). Using this profile to compute $(\mathbf{a}_1 - \mathbf{a}_2)^T (\mathbf{x} - \mathbf{x}_c)$ yields a difference of about 0.2 ppm, which is very small compared with 1 +

Appendix B
A Preview of ACOS v2.9

A significant subset of version 2.9 data, covering July 1, 2009 through March 28, 2011, has been processed since this paper was first published. Significant changes and improvements to the algorithm include:
- The new time dependence of the radiometric calibration was computed and applied to the data and noise model. This implies that the time-dependent filter on the χ² values described in Table 1 is no longer necessary. The new recommendation for the χ² filters is described in Table B1.
- The O 2 A-band cross-sections were scaled by 1.024. This has corrected the ∼11 hPa bias between the retrieved surface pressure and the ECMWF surface pressure. This also eliminates the need for the overall bias correction factor (0.982 in v2.8).
- The zero level offsets in the O 2 A-band were removed through fitting the spectra with an additional parameter. This reduces the error caused by detector nonlinearity, improves the spectral fits and should have some impact on the relationship between X CO2 and both signal_o2 and airmass.
- The stratospheric column averaging kernel has been corrected. The error was a bug in the pressure-weighting function calculation, and correcting it should have little impact on the retrieved X CO2 .
The a priori profiles remain unchanged and fluorescence has not yet been included in the state vector. Hence, there may still be both a latitude-dependent seasonal cycle induced by the a priori profile (compared with using the more realistic TCCON a priori), and continued signal_o2 dependencies due to the unaccounted fluorescence signal in the O 2 A-band.
We now have more confidence in our glint data in v2.9, and would encourage data users to use it with caution. The parameters that are used to minimize the variance in the Southern Hemisphere glint data will likely not be the same as those needed to modify the land data. It is useful to note that the glint flag in the v2.9 data is incorrect after mid-October 2010, when the GOSAT viewing strategy changed from a 5-point observation to a 3-point observation. A suitable glint flag is described in Table B1. When using both glint and nadir data to determine the fit parameters in Eq. (4), the coefficients change significantly. The covariates for calculating a bias in the glint data will be different from those used for the land data, because there are no glint data poleward of 25° S between March and October, and there is little variability in airmass and signal_o2. The overall difference between the glint and glint-free data in the Southern Hemisphere over the same time period is ∼1 ppm.

3.4 × 10^−7 W cm^−2 sr^−1 (cm^−1)^−1   −0.25 ± 0.08   −0.23 ± 0.08   −0.24 ± 0.08

Table 3. This table presents the results of three comparisons between Northern Hemisphere TCCON X CO2 and the ACOS-GOSAT X CO2 . Coincidence between the two datasets is determined either by the T700 constraint

Table B1. Filters applied to the ACOS v2.9 data.

Filter | Filter criterion
Retain data with good spectral fits | reduced_chi_squared_o2_fph < 1.4; reduced_chi_squared_strong_co2_fph < 2; reduced_chi_squared_weak_co2_fph < 2
Retain data with well-retrieved surface elevation | |ΔP − ΔP̄| < 5 hPa (ΔP = surface_pressure_fph − surface_pressure_apriori_fph; ΔP̄ = 0.59 hPa)
Retain scenes without extreme aerosol | 0.05 < retrieved_aerosol_aod_by_type < 0.

variations. The solid red lines are the best fit lines described by the coefficients listed in Table 2.

show the ACOS-GOSAT adjustment to the ensemble profile ($\sum_j h_j (\mathbf{a}_1 - \mathbf{u})_j (\mathbf{x}_{a1} - \mathbf{x}_c)_j$, blue), the TCCON adjustment to the ensemble profile ($\sum_j h_j (\mathbf{a}_2 - \mathbf{u})_j (\mathbf{x}_{a2} - \mathbf{x}_c)_j = 0$, green), the smoothing error ($\sum_k \sum_j h_j (\mathbf{a}_1 - \mathbf{a}_2)_j (\mathbf{S}_c)_{jk} (\mathbf{a}_1 - \mathbf{a}_2)_k$, red), the ACOS-GOSAT standard deviation ($\sigma_1$, cyan), the TCCON standard deviation ($\sigma_2$, purple), the difference between the TCCON adjusted ACOS-GOSAT smoothed values ($\hat{c}_{12} - \hat{c}_2$, yellow) and the square root of the sum of the TCCON and ACOS-GOSAT variances ($\sqrt{\sigma_1^2 + \sigma_2^2}$, dark green). All parameters are defined in Appendix A.

The right-hand panel shows the regression after applying Eq. (4), but using coincidence criteria that restrict latitudes to within ±0.5°, longitudes to within ±1.5°, and interpolate the TCCON data onto the ACOS-GOSAT measurement times. Note that there are no coincident data over Eureka when using the geographic coincidence criteria (right-hand panel). The solid lines show the best fit to the data (with equations and ±2 standard errors shown on the plot), and the one-to-one line is plotted as a dashed line. The vertical bars represent the ±2σ variability of the ACOS-GOSAT data, illustrating the dependence of the variability of the ACOS-GOSAT data at each TCCON value (i.e., var(y|x)) in the regression. Similarly, the horizontal bars represent the ±2σ variability of the TCCON data.
