Edinburgh Research Explorer Estimating regional methane surface fluxes: the relative importance of surface and GOSAT mole fraction measurements

Abstract.
We use an ensemble Kalman filter (EnKF), together with the GEOS-Chem chemistry transport model, to estimate regional monthly methane (CH4) fluxes for the period June 2009-December 2010 using proxy dry-air column-averaged mole fractions of methane (XCH4) from GOSAT (Greenhouse gases Observing SATellite) and/or NOAA ESRL (Earth System Research Laboratory) and CSIRO GASLAB (Global Atmospheric Sampling Laboratory) CH4 surface mole fraction measurements. Global posterior estimates using GOSAT and/or surface measurements are between 510 and 516 Tg yr⁻¹, which is less than, though within the uncertainty of, the prior global flux of 529 ± 25 Tg yr⁻¹. We find larger differences between regional prior and posterior fluxes, with the largest changes in monthly emissions (75 Tg yr⁻¹) occurring in Temperate Eurasia. In non-boreal regions the error reductions for inversions using the GOSAT data are at least three times larger (up to 45 %) than if only surface data are assimilated, a reflection of the greater spatial coverage of GOSAT, with the two exceptions of latitudes > 60°, associated with a data filter, and Europe, where the surface network adequately describes fluxes on our model spatial and temporal grid. We use CarbonTracker and GEOS-Chem XCO2 model output to investigate the effect of model error on quantifying proxy GOSAT XCH4 (which involves model XCO2) and on inferring methane flux estimates from surface mole fraction data, and show similar resulting fluxes, with differences reflecting initial differences in the proxy value. Using a series of observing system simulation experiments (OSSEs) we characterize the posterior flux error introduced by non-uniform atmospheric sampling by GOSAT. We show that clear-sky measurements can theoretically reproduce fluxes within 10 % of true values, with the exception of tropical regions where, due to a large seasonal cycle in the number of measurements because of clouds and aerosols, fluxes are within 15 % of true fluxes.
We evaluate our posterior methane fluxes by incorporating them into GEOS-Chem and sampling the model at the location and time of surface CH4 measurements from the AGAGE (Advanced Global Atmospheric Gases Experiment) network and column XCH4 measurements from TCCON (Total Carbon Column Observing Network). The posterior fluxes modestly improve the model agreement with AGAGE and TCCON data relative to prior fluxes, with the correlation coefficients (r²) increasing by a mean of 0.04 (range: −0.17 to 0.23) and the biases de-

Introduction
Atmospheric in situ mole fraction measurements of methane (CH4) have been used extensively to estimate emissions of methane using "top-down" assimilation or inversion schemes (e.g. Rigby et al., 2008; Bousquet et al., 2006; Chen and Prinn, 2006; Wang et al., 2004; Houweling et al., 1999). Although the global annual methane budget is well constrained using these surface data, substantive discrepancies between estimates remain at the regional/subcontinental spatial scale and in terms of seasonal cycles (e.g. Kirschke et al., under review, 2013). Total column space-borne retrievals of methane are now available from several instruments, notably from SCIAMACHY (SCanning Imaging Absorption SpectroMeter for Atmospheric CHartographY, 2002-2012; Schneising et al., 2011; Frankenberg et al., 2011) and GOSAT (Greenhouse gases Observing SATellite, launched 2009; Kuze et al., 2009). SCIAMACHY data have been used in previous studies to estimate emissions (Spahni et al., 2011; Bergamaschi et al., 2009, and references therein).
Here, we build on previous work (Parker et al., 2011), in which we compared GOSAT retrievals of dry-air column-averaged mole fraction of methane (XCH4) and the corresponding GEOS-Chem model fields. In that study we found very good agreement on both annual and monthly time scales, with no significant bias, and the model capturing > 70 % of the variability, with some differences over key source regions such as Southeast Asia which we attributed to known uncertainties in the bottom-up inventories. In this paper, we exploit those spatial and temporal differences using an ensemble Kalman filter to assimilate XCH4 GOSAT retrievals and surface flask CH4 measurements and infer methane fluxes.
In Sect. 2 we discuss the space-borne and ground-based measurements used in the assimilations. Section 3 describes the GEOS-Chem chemical transport model. We discuss the ensemble Kalman filter scheme in Sect. 4. Results from the assimilation are presented in Sect. 5. Conclusions are given in Sect. 6.

Data
GOSAT, launched in a sun-synchronous orbit by the Japanese Space Agency in January 2009, provides global short-wave infrared (SWIR) radiances which allow the retrieval of XCO2 and XCH4 with global coverage every three days (Kuze et al., 2009). The GOSAT scientific payload comprises the Thermal And Near infrared Sensor for carbon Observations - Fourier Transform Spectrometer (TANSO-FTS) and the Cloud and Aerosol Imager (TANSO-CAI).
Here we include a brief description of the University of Leicester proxy XCH4 retrieval algorithm, and refer the reader to Parker et al. (2011), and references therein, for further details. XCH4 is retrieved using the proxy CO2 method, implemented with the OCO (Orbiting Carbon Observatory) retrieval algorithm (Boesch et al., 2006; Cogan et al., 2012), modified for use with TANSO-FTS spectra. XCH4 and XCO2 retrievals are performed sequentially at 1.65 and 1.61 µm, respectively. The ratio of the two species, using XCO2 as a proxy for the light path through the atmosphere, minimizes spectral artefacts due to aerosol scattering and instrument light-path effects. To obtain a mole fraction of XCH4, we use model XCO2 from a global 3-D model. We have used location- and time-specific model output from the GEOS-Chem and CarbonTracker (Peters et al., 2007) models, which are convolved with scene-dependent averaging kernels from the GOSAT XCO2 retrievals and normalized so that the annual global mean is consistent with the GOSAT XCO2. From here on, we refer to the XCH4 measurements scaled by GEOS-Chem XCO2 as the GC proxy data and those scaled by CarbonTracker XCO2 as the CT proxy data. We apply the data filtering from Parker et al. (2011), which includes cloud-screening and only uses retrievals over land. We further filter for solar zenith angle (< 70°), latitude (60° S ≤ lat ≤ 60° N), and instrument gain (high-gain only). We apply this conservative filtering to avoid potentially spurious data resulting from retrievals made over snow and ice.

We also assimilate weekly surface CH4 data from 48 sites of the NOAA Earth System Research Laboratory (ESRL), Global Monitoring Division, version 2011-10-14, and nine sites of the CSIRO Global Atmospheric Sampling Laboratory (GASLAB), released August 2011 (Francey et al., 1996), which collect air samples distributed globally with an uncertainty of 1.5 ppb.
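The filtering and proxy scaling described above can be sketched in a few lines of Python. This is a minimal illustration rather than the retrieval code itself: the `Sounding` record, its field names, the gain label `"H"`, and the numerical values in the usage note are hypothetical, and the scaling assumes the proxy XCH4 is simply the retrieved XCH4/XCO2 ratio multiplied by the model XCO2 at the sounding location and time.

```python
from dataclasses import dataclass


@dataclass
class Sounding:
    """One GOSAT sounding (hypothetical record layout for illustration)."""
    lat: float             # latitude, degrees north
    sza: float             # solar zenith angle, degrees
    gain: str              # instrument gain; "H" = high gain (assumed label)
    over_land: bool        # True if the footprint is over land
    cloud_free: bool       # True if the sounding passed cloud screening
    ch4_co2_ratio: float   # retrieved XCH4/XCO2 ratio, ppb per ppm


def passes_filter(s: Sounding) -> bool:
    """Apply the filters listed in the text: cloud-free, land only,
    solar zenith angle < 70 degrees, latitude within 60S-60N, high gain."""
    return (
        s.cloud_free
        and s.over_land
        and s.sza < 70.0
        and -60.0 <= s.lat <= 60.0
        and s.gain == "H"
    )


def proxy_xch4(s: Sounding, model_xco2_ppm: float) -> float:
    """Scale the retrieved ratio by model XCO2 (ppm) to obtain XCH4 (ppb).
    Assumption: the model XCO2 has already been convolved with the
    averaging kernel and normalized as described in the text."""
    return s.ch4_co2_ratio * model_xco2_ppm
```

For example, a cloud-free high-gain land sounding at 45° N with a retrieved ratio of 4.5 ppb per ppm and a model XCO2 of 390 ppm would pass the filter and give a proxy XCH4 of 1755 ppb; swapping in XCO2 from a different model changes the proxy proportionally, which is the origin of the GC and CT proxy differences.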
Four sites are in both networks: Alert, Canada; Mauna Loa, USA; Cape Grim, Australia; and the South Pole. The flask data from both networks are reported on the NOAA04 mole fraction scale. Figure 1 shows the locations of the 57 ESRL and GASLAB sites used in this work. Only sites that have a continuous record over the study period (June 2009-December 2010) were used in the inversions.

To evaluate the performance of the posterior fluxes we use surface CH4 measurements from the AGAGE (Advanced Global Atmospheric Gases Experiment, June 2012 release) network (Prinn et al., 2000; Cunnold et al., 2002; Chen and Prinn, 2006; Rigby et al., 2008) and total column XCH4 measurements from the TCCON (Total Carbon Column Observing Network, GGG2012; Wunch et al., 2011a). The AGAGE measurements have a precision of 0.075-0.15 % and an accuracy of 0.1-0.2 % (2-4 ppb) (Cunnold et al., 2002). These measurements are reported on the Tohoku University (TU) mole fraction scale, which differs from the NOAA04 scale by 0.03 %, approximately 0.5 ppb in a column of 1750 ppb (Dlugokencky et al., 2005). Because this is much smaller than the accuracy of the measurements, we do not adjust the AGAGE measurements to the NOAA04 scale. The TCCON measurements have a precision of 0.2 % and an accuracy of 7 ppb (Wunch et al., 2010). Figure 1 also shows the location of these measurement sites.

The GEOS-Chem transport model
We use the GEOS-Chem global 3-D chemical transport model (version v8-01-01), driven by version 5 of the assimilated meteorological fields from the NASA Global Modelling and Assimilation Office, to help interpret the GOSAT XCH4 measurements. The model is described and evaluated against surface, aircraft, and satellite measurements of methane in a recent paper. In that study we found that the model reproduces the seasonal cycle of methane at the surface and in the free troposphere but overestimates the positive trend over the four-year study period. In the stratosphere, the model systematically overestimates methane by ~10 %. For this study we use the model with a horizontal resolution of 4° (latitude) × 5° (longitude) and with 47 vertical levels that span from the surface to the mesosphere, with typically 35 levels in the troposphere. Anthropogenic sources of methane from ruminant animals, coal mining, oil and natural gas production, and landfills are from the Emission Database for Global Atmospheric Research, Fast Track (EDGAR 3.2 FT) inventory (Olivier et al., 2005). These emissions are assumed to have no seasonal variation; year-to-year variation is described using country-specific socio-economic factors (Wang et al., 2004).
Biomass burning emissions are from the Global Fire Emissions Database (GFED v3) inventory, which includes both seasonal and interannual variability (van der Werf et al., 2010). Natural sources from oceans (Houweling et al., 1999), termites, and hydrates are included, as well as a soil sink (Fung et al., 1991). We assume these emissions are constant throughout the study period, though they potentially exhibit seasonal behaviour that has yet to be described. Emissions from rice and wetlands vary seasonally and from year to year, based on a top-down study (Bloom et al., 2012). The tropospheric OH sink is described by monthly mean 3-D fields generated from a full-chemistry Ox-NOx-VOC run of the GEOS-Chem model (Fiore et al., 2003). This OH field has been shown to be consistent with observations of methyl chloroform (CH3CCl3, or MCF) from 1990 to 2007 (Patra et al., 2011). Loss rates for methane in the stratosphere are adapted from a 2-D stratospheric model (Wang et al., 2004).

Figure 2 compares GOSAT proxy methane retrievals with XCH4 simulated with the GEOS-Chem model. Unlike the comparisons in Parker et al. (2011), the new comparisons show a regional bias between the data and the model, peaking in the tropics, with GEOS-Chem generally underestimating the GOSAT data. These changes largely reflect revised estimates for wetlands and rice emissions, which take into account changes in the available carbon pool, improving the model's performance with respect to the in situ data (Bloom et al., 2012). Also shown in this figure is the number of GOSAT measurements in each region per month.

Ensemble Kalman filter
We use an ensemble Kalman filter (EnKF) to assimilate the in situ CH4 measurements and XCH4 retrievals and estimate consistent methane fluxes. A detailed description of the EnKF applied to CO2 is given by Feng et al. (2009, 2011). The methane-specific settings for the EnKF are as follows. We do not use a lag window to estimate monthly methane fluxes: measurements of methane only affect fluxes in the month they were taken. Because of model transport error, and unevenly distributed clear-sky observations, in some regions it can be difficult to identify the origin and strength of the emissions correctly. In those regions, using a lag window can introduce likely non-physical changes in the seasonal variation of the fluxes. Fluxes are estimated over the 13 regions (Gurney et al., 2002) shown in Fig. 1. The global ocean is treated as one region. Fluxes are estimated for nine source categories in each of the land regions: wetlands, rice, biomass burning and biofuel, fossil fuels (coal mining and emissions associated with natural gas), ruminant animals, landfills, termites, other emissions (oceans and hydrates), and the soil sink. We assume monthly uncertainties on the prior regional fluxes of 50 % for the categories that vary seasonally (wetlands, rice, and biomass burning) and 25 % for the remaining categories that are assumed to be constant in the model. We assume uncertainties of 1 % for the ocean region and 10 % for the ice region as these regions have diffuse sources that are unlikely to be informed by the mole fraction data. We assume errors between regions are uncorrelated.

Figure 2 caption (fragment): ... Fig. 1. The error bars represent one standard deviation of the GOSAT and GEOS-Chem data, respectively. The grey bars are the monthly total number of soundings. The inset numbers are the Pearson correlation coefficients between the two GOSAT proxies (green), the CT proxy and GEOS-Chem XCH4 (blue), and the GC proxy and GEOS-Chem XCH4 (red). Note the different y-scales for the XCH4 over each region.
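As an illustration of the prior uncertainty settings, the sketch below assigns the stated relative uncertainties and builds a diagonal prior error covariance. The function names, the category and region labels, and the flux values in the usage note are hypothetical. The text only states that errors between regions are uncorrelated; treating the categories within a region as uncorrelated too (a fully diagonal matrix) is an additional simplifying assumption made here.

```python
import numpy as np

# Categories that vary seasonally in the prior (labels are hypothetical).
SEASONAL = {"wetlands", "rice", "biomass_burning"}


def relative_uncertainty(region_type: str, category: str) -> float:
    """Fractional 1-sigma prior uncertainty for one regional flux element."""
    if region_type == "ocean":
        return 0.01
    if region_type == "ice":
        return 0.10
    # Land regions: 50 % for seasonally varying categories, 25 % otherwise.
    return 0.50 if category in SEASONAL else 0.25


def prior_covariance(fluxes, rel_unc) -> np.ndarray:
    """Diagonal covariance: variance = (relative uncertainty * |flux|)^2.
    Off-diagonal terms are zero, i.e. all elements are uncorrelated."""
    sigma = np.abs(np.asarray(fluxes, dtype=float)) * np.asarray(rel_unc, dtype=float)
    return np.diag(sigma ** 2)
```

With two hypothetical flux elements of 10 and −4 (a sink) in arbitrary units and relative uncertainties of 50 % and 25 %, the covariance is diagonal with variances 25 and 1.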
In Appendix A we show results from several observing system simulation experiments (OSSEs) that test the ability of the EnKF to retrieve reliable fluxes using the observed distribution of clear-sky GOSAT measurements in the presence of random and systematic errors, giving a theoretical upper limit to the performance of the assimilation system. In these idealized experiments we find that the assimilation scheme is able to retrieve fluxes within 10 % of the known true fluxes in most regions. In tropical regions with few observations and with a large seasonal cycle in the number of measurements, retrieved fluxes are within 15 % of the true fluxes.

Measurements are weighted by their uncertainties in the assimilation. We increase reported uncertainties for the filtered GOSAT XCH4 retrievals by 50 %, with resulting values ranging between 9 and 40 ppb with a median value of 14 ppb, which is consistent with the standard deviation between GOSAT and TCCON XCH4 (Parker et al., 2011). For the in situ measurements, we adopt the approach taken by Wang et al. (2004): the error is taken to be the sum in quadrature of the transport and representation errors. We describe the transport error as 0.5 % of the mixing ratio obtained by the flask measurement, and the representation error as the standard error of the monthly mean calculated from the observations made over that month (Wang et al., 2004). The relatively small measurement uncertainty of approximately 0.1 % (1.5 ppb) was not considered. The total error typically ranges between 5 and 20 ppb, with generally smaller values at Southern Hemisphere stations. Note that the EnKF weights the measurements inversely to their variance (i.e. the square of these total errors).
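The observation error model can be written out as a short sketch. The function names are hypothetical, and applying the 0.5 % transport error to the monthly mean mole fraction (rather than to each flask value) is an assumption made for brevity; at least two observations per month are needed for the standard error to be defined.

```python
import math
import statistics


def insitu_error_ppb(monthly_obs_ppb):
    """Total in situ error: transport and representation errors in quadrature.
    Transport error: 0.5 % of the (monthly mean) mole fraction.
    Representation error: standard error of the monthly mean."""
    n = len(monthly_obs_ppb)
    mean = statistics.fmean(monthly_obs_ppb)
    transport = 0.005 * mean
    representation = statistics.stdev(monthly_obs_ppb) / math.sqrt(n)
    return math.hypot(transport, representation)


def inflated_gosat_error_ppb(reported_error_ppb):
    """Reported GOSAT retrieval uncertainty increased by 50 %."""
    return 1.5 * reported_error_ppb


def assimilation_weight(total_error_ppb):
    """Measurements are weighted inversely to their error variance."""
    return 1.0 / total_error_ppb ** 2
```

For four hypothetical flask values of 1800, 1804, 1796 and 1800 ppb, the transport error is 9 ppb and dominates the representation error of roughly 1.6 ppb, giving a total of about 9.1 ppb, within the 5-20 ppb range quoted above.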

Bias correction
Similar to XCO2 retrievals, biases in GOSAT XCH4 are expected to be scene dependent, as they are sensitive to, for example, the presence of cirrus clouds and high-altitude aerosols, spectroscopy, airmass, and surface pressure (Wunch et al., 2011b). However, we expect biases from airmass, surface pressure, and aerosol optical depth to be smaller in the proxy XCH4 retrievals than from a full-physics retrieval (Butz et al., 2010). The biases for proxy XCH4 retrievals are further complicated by uncertainties in model XCO2 (Sect. 2) (Schepers et al., 2012). Biases between the model and data can also arise from the model, for example from errors in the transport. For simplicity, we assume that the biases in GOSAT XCH4 data vary only with latitude, following previous studies (Bergamaschi et al., 2009), and are constant over the study period.
From a comparison with prior model simulations, we find that the main features of the systematic difference between the model and GOSAT retrievals can be approximately described by a piecewise linear function with five evenly spaced nodes at latitudes 60° S, 30° S, 0°, 30° N, and 60° N. The biases at these five nodes are estimated as part of the inversions from comparisons of model simulations with GOSAT (and/or in situ) observations. The prior values of the bias at these nodes are taken from the mean difference between the model and GOSAT data at those latitudes averaged over the study period. The uncertainty of the bias at the nodes is taken to be 5 ppb. We find that the retrieved bias estimates are robust and not sensitive to assumed prior values or uncertainties and are consistent with an independent statistical analysis (Appendix B).
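A piecewise linear bias of this kind can be evaluated by linear interpolation between the node values. The sketch below uses `numpy.interp`; the function name and the node bias values in the usage note are hypothetical and are not the values estimated in the inversions.

```python
import numpy as np

# Node latitudes in degrees north: 60S, 30S, equator, 30N, 60N.
NODE_LATS = np.array([-60.0, -30.0, 0.0, 30.0, 60.0])


def latitudinal_bias_ppb(lat, node_bias_ppb):
    """Linearly interpolate the bias (ppb) between the five latitude nodes.
    numpy.interp holds the end-node values constant outside 60S-60N, which
    is harmless here because the GOSAT data are filtered to that band."""
    return np.interp(lat, NODE_LATS, np.asarray(node_bias_ppb, dtype=float))
```

With illustrative node biases of 0, 2, 6, 4 and 0 ppb, the bias at 15° N is halfway between the equatorial and 30° N nodes (5 ppb), and at 45° S it is 1 ppb. In an inversion the five node values would be appended to the state vector and estimated alongside the fluxes.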

Information metric
We define a metric, η, that gives an indication of how much information can be extracted from the GOSAT observations in a given region in a given month:

$$\eta = \frac{\mathrm{obs_{cs}} / \mathrm{obs_{p}}}{\sigma_{\mathrm{region}} / \sigma_{\mathrm{total}}},$$

where obs_cs is the number of clear-sky observations in the region for that month, obs_p is the number of possible observations in the region, calculated from the theoretical distribution of measurements for a satellite in the GOSAT orbit, σ_region is the standard deviation of the prior fluxes within the region during the month, and σ_total is the standard deviation of the total prior flux in the region over the 19-month study period. We normalize η to the maximum value in a particular region. When the fraction of clear-sky observations increases, η is larger: the more measurements there are, the more information they contain. When the variation of the fluxes within a region as a fraction of the variation of the total flux increases, η is smaller: more variation in the fluxes in a region means that more observations would be needed to fully capture it.

Figure 3 shows the time series of η for the 11 land regions used in this study. All regions display a seasonal cycle in η. As expected, the boreal regions and Europe have a minimum in the winter when the number of measurements is close to zero. These regions also have the largest peak-to-peak difference. The boreal regions have their maximum values in February or March, reflecting the small variation in fluxes within the regions at that time. Other regions, such as Tropical Asia and South America, show minima when cloud cover is greatest. Temperate North America has the smallest variation, with values of η always greater than 0.5. We do not define a "cutoff" below which we do not analyse data, but note that lower values of η denote months where we have less confidence in the inversion results within that region.
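The behaviour of η described above can be reproduced with a short sketch. The functional form used here, the clear-sky fraction divided by the ratio σ_region/σ_total, is an assumption chosen to match the stated dependence on each term; the function name and the numbers in the usage note are hypothetical.

```python
import numpy as np


def information_metric(obs_cs, obs_p, sigma_region, sigma_total):
    """Monthly eta for one region, normalized to its maximum over the period.

    obs_cs       : clear-sky observations per month (array)
    obs_p        : possible observations per month (array)
    sigma_region : std. dev. of prior fluxes within the region per month (array)
    sigma_total  : std. dev. of the total prior flux over the study period (scalar)
    Assumed form: eta = (obs_cs / obs_p) / (sigma_region / sigma_total).
    """
    obs_cs = np.asarray(obs_cs, dtype=float)
    obs_p = np.asarray(obs_p, dtype=float)
    sigma_region = np.asarray(sigma_region, dtype=float)
    eta = (obs_cs / obs_p) / (sigma_region / sigma_total)
    return eta / eta.max()
```

For three hypothetical months with 0, 50 and 100 clear-sky soundings out of 200 possible, σ_region of 1, 2 and 1, and σ_total of 4, the normalized values are 0, 0.25 and 1: a month with no soundings carries no information, and a month with larger within-region flux variability is down-weighted.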
Because the metric depends only on the number of observations and the variation within an individual region, and other factors that would influence the information content of the region are not explicitly included, values of η within one region cannot be directly compared to values of η in other regions.

Posterior fluxes
... June 2009-December 2010. The results from the ice and ocean regions are not shown as the emissions from these regions are small compared to the land regions and do not vary significantly from the prior. The total global fluxes from all the inversions agree with the prior amount of 529 ± 25 Tg yr⁻¹, but are 13-19 Tg yr⁻¹ smaller: between 510 and 516 Tg yr⁻¹. We define a percentage error reduction metric γ:

$$\gamma = \left(1 - \frac{\epsilon}{\epsilon_0}\right) \times 100\,\%,$$

where ε is the posterior flux error and ε_0 is the prior flux error. γ is defined such that larger values indicate that more information has been extracted from the observations (Palmer et al., 2011). The posterior flux errors are generally smaller for the inversions using GOSAT data (INV2-5): the mean γ for the surface-only inversion (INV1) is 6.0 %, while for the inversions using GOSAT data γ ranges from 17 to 20 %. This reflects information content from a much larger number and distribution of measurements than from the surface network. Europe is the only exception: this region has a reasonable surface measurement density on the spatial scale of the inversions, with six stations within the region and several more in the surrounding area. Recent results for a CO2 inversion also concluded that Europe is well sampled by the surface network (Niwa et al., 2012). In addition, because of GOSAT's orbit, high-latitude Europe is not observed through the winter (November-February at 50° N), allowing the surface data to have more influence than the satellite data during these months. The largest changes are found in Temperate Eurasia and Tropical Asia (Fig. 1). Fluxes over Boreal North America and Eurasia are largely unaffected by GOSAT data, which is expected as the majority of these regions lie north of the 60° latitude filter we apply to the GOSAT data (Sect. 2).
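The error reduction metric is simple to compute once prior and posterior flux errors are available. The sketch below assumes the usual form γ = (1 − posterior error / prior error) × 100 %; the function name and the error values in the usage note are hypothetical.

```python
def error_reduction_percent(prior_error, posterior_error):
    """Percentage error reduction, gamma = (1 - posterior/prior) * 100.
    Larger values mean the observations constrained the flux more strongly;
    0 means the posterior error equals the prior error."""
    if prior_error <= 0:
        raise ValueError("prior_error must be positive")
    return (1.0 - posterior_error / prior_error) * 100.0
```

For instance, a regional flux with a prior error of 10 Tg yr⁻¹ and a posterior error of 8 Tg yr⁻¹ has γ of about 20 %, comparable to the mean values reported for the inversions that use GOSAT data.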
The total posterior fluxes of the source categories are typically within 5 % of the prior fluxes; however, the associated uncertainties have been reduced by 9-48 % after the GOSAT data are assimilated. Only fossil fuel emissions change by more than the prior uncertainty, with emissions from the inversions using the GOSAT data (INV2-5) reduced by 34-36 %. Typically, assimilating the surface and GOSAT data moves the posterior fluxes in the same direction (becoming larger or smaller than the prior); however, wetland emissions become smaller using only the surface data (INV1) and larger in the four inversions using the GOSAT data (INV2-5).

Figure 4 shows the time series of the monthly regional prior and posterior methane flux estimates over the study period inferred from surface data only, GOSAT GC proxy data only, and surface and GOSAT GC proxy data (INV1-3). Similar results using the CT proxy data are shown in Fig. C1 in Appendix C. In general the inversion using only surface data (INV1) is consistent with the prior flux emissions. The posterior fluxes over Temperate North America, Eurasia, and Europe show shifts in the seasonal cycle and changes in the peak emissions relative to the prior. The seasonal cycle of methane fluxes over South Africa changes significantly, due primarily to changes in wetland emissions. Also shown in Fig. 4 are the monthly error reduction (γ) from the three inversions (coloured bars) and the mean error reduction over the whole time period. The mean error reductions for INV1, with the exception of Europe, are all less than 25 %.
In general for non-boreal regions, GOSAT XCH4 retrievals increase γ, resulting in posterior fluxes that are statistically different from the prior. Over South America, South Africa, Tropical Asia, and Australasia, where surface measurements are sparse and therefore provide weak constraints, GOSAT observations have the largest impact on the error reduction, with values at least three times as large as those for the surface inversions. For these regions the posterior fluxes generally follow the same seasonal cycle as the prior, with changes only in the magnitude of the fluxes. Europe, as discussed above, is the one region where more information comes from the surface than the satellite observations on our spatial scale.
The largest seasonal departures between the posterior and the prior are over Temperate North America and South Africa. In Temperate North America the GOSAT data are implying a smaller amplitude in the seasonal cycle of the methane emissions. For South Africa this is partly a result of the performance of the inversions in this region: the seasonal cycle of the observations, due to clouds and aerosols, leads to uneven seasonal sampling. As discussed previously, the OSSEs highlighted an upper limit of 11 % for inferring true fluxes over this region due to GOSAT sampling (Appendix A). This region is further discussed below.

South Africa
The posterior fluxes in South Africa from the GC and CT proxies differ, especially in January when the GC proxy flux drops to nearly zero. This difference is due to CarbonTracker's larger XCO2, and hence XCH4, columns in the region. The sharp drop in fluxes in January using the GC proxy is caused by a sharper latitudinal gradient in the GC proxy than the GOSAT XCH4. This region is often covered by cirrus clouds at this time of year (Heymann et al., 2012), which may not be filtered out by the cloud filtering applied in the GOSAT retrievals. Schepers et al. (2012) compare retrievals of XCH4 using both the proxy method and a "full physics" method, which explicitly models atmospheric scattering processes. The full physics retrieval returns several parameters to describe the scattering, or path length, through the atmosphere, including aerosol optical thickness, height of the aerosol layer, and a size parameter. Schepers et al. (2012) show that although the proxy method is less sensitive to these scattering parameters than the full physics method, some dependence remains, with columns underestimated by > 1 % (~17.5 ppb in a column of 1750 ppb) for large scattering path lengths. The proxy method does not return any estimates of scattering, so we have investigated three retrieved parameters to identify outlying data that may be affected by scattering, either by cirrus clouds or aerosols: the ratio of the model and retrieved CO2, differences in prior and posterior surface pressure, and differences in retrieved brightness temperature at several levels in the vertical profile. None of these parameters are correlated with the location of cirrus clouds, and filtering for outlying values of these parameters has no significant effect on the posterior fluxes.
We also attempted to filter for the aerosol optical depth (AOD) retrieved by the full physics XCO2 retrieval product from the Atmospheric CO2 Observations from Space (ACOS) group (Crisp et al., 2012). We matched the proxy XCH4 retrievals to the XCO2 retrievals and filtered using the recommended value for the ACOS product: 0.15 (Crisp et al., 2012), which eliminated roughly 25 % of the available data. The results of the inversion using this filter on the GC proxy data and assimilating the surface data are shown in Fig. 5. The fluxes are generally not significantly changed in the region, with the exception of the sharp drop in January 2010, which is reduced. The data that are excluded by the filter are affected by a large AOD, and could potentially be biased low, as per Schepers et al. (2012).
The standard inversion only allows measurements to affect the fluxes in the month that they were taken; however, methane has a lifetime of ~10 yr in the atmosphere. We increased the lag window to three months, so that measurements can affect monthly fluxes up to three months before or after they are taken. The results of this are also shown in Fig. 5. This has the effect of slightly increasing the drop in the flux in January 2010, and generally reducing the fluxes throughout the whole time period.
Finally we separated the South African region into three roughly equal area regions by latitude and ran the inversion. In this experiment, the posterior fluxes in the southern-most region stayed close to the prior, while those in the other two regions varied. The results of this are also shown in Fig. 5, with the three regions re-combined into one. The posterior fluxes in general stay closer to the prior and the sharp drop in January is removed. However, the fluxes in other regions influenced by South Africa are negatively affected. In North Africa and South America the fluxes are decreased and display unphysical variation.
As shown by the OSSEs discussed in Appendix A, the EnKF does not perform as well in South Africa as in other regions. That, combined with the sensitivity to AOD highlighted by the ACOS AOD filtering experiment, leads to posterior fluxes that are not always reliable. The value of η (Sect. 4.2) in the region is at a minimum in January, at the time of the drop, meaning that the information contained in the GOSAT data is at a minimum at this time.

Figure 5 caption (fragment): ... for the South African region assimilating the GC proxy and surface data and the effect of filtering by parameters related to aerosols and cirrus clouds, changing the lag window of the EnKF, and splitting the region into three regions.

Agreement with ground-based data
To assess the performance of the model's posterior fluxes, we force the GEOS-Chem model with the posterior fluxes described in Sect. 5.1. To avoid inconsistencies in the fluxes and resulting concentrations, we first "spin up" the model for 4.5 yr, from January 2005, using posterior fluxes from January to December 2010 and the appropriate GEOS-5 meteorology. Figure 6 shows daily mean and hemispherically averaged GEOS-Chem (prior and INV3 posterior fluxes) and observations for two other ground-based methane measurement networks (Sect. 2): AGAGE (surface mole fraction, June 2012 release) and TCCON (total column mole fraction, GGG2012). We have sampled the model at the time and location of the measurements, and for the TCCON sites we have smoothed the GEOS-Chem profile using TCCON averaging kernels and a priori profiles. Figure 6 also shows the mean bias and standard deviation of the differences between the observations and the model, and the correlation coefficient (r²) between the observations and model. For both AGAGE and TCCON comparisons the effects of the posterior emissions are greatest in the Northern Hemisphere, where the largest changes in the emissions occur. In the Northern Hemisphere the posterior standard deviations are 1-2 ppb smaller and the posterior correlations are larger (by 0.09 and 0.18, respectively) than the prior values, while the biases are increased by approximately 1 ppb for AGAGE and decreased by 2 ppb for TCCON. For the Southern Hemisphere AGAGE comparisons, the posterior standard deviation is increased by 0.2 ppb and the correlation coefficient is decreased by 0.03, while the biases are decreased by 9.1 ppb. At TCCON sites, the bias decreases by 1.7 ppb and the standard deviation decreases by 1.6 ppb, while the correlation coefficient increases by 0.03. In all cases, the absolute biases are decreased from the prior to the posterior.
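The three comparison statistics quoted above can be computed as in the following sketch. The function name is hypothetical, and the sign convention (model minus observation) and the use of the sample standard deviation are assumptions, since the text does not state them.

```python
import numpy as np


def comparison_stats(obs_ppb, model_ppb):
    """Return (mean bias, std. dev. of differences, r squared) for model
    values sampled at the time and location of the observations.
    Bias is defined here as model minus observation (assumed convention)."""
    obs = np.asarray(obs_ppb, dtype=float)
    model = np.asarray(model_ppb, dtype=float)
    diff = model - obs
    bias = diff.mean()
    spread = diff.std(ddof=1)           # sample standard deviation
    r = np.corrcoef(obs, model)[0, 1]   # Pearson correlation coefficient
    return bias, spread, r ** 2
```

Applying the function to the prior and posterior model time series at each site, and differencing the results, gives the changes in bias, standard deviation and r² of the kind reported for AGAGE and TCCON.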
The GC proxy posterior fluxes have the greatest impact on the AGAGE CH4 comparison. This is expected because, in the short term, changes in the emissions affect the surface mole fractions more than the total column abundance, owing to the time taken to transport methane emitted at the surface upwards from the boundary layer to the free troposphere. Differences between the prior and the four inversions using GOSAT data are similar, while the surface-only inversion remains closer to the prior model (not shown).

Figure 7 shows the correlation coefficient (r²) and absolute mean difference between the model (driven by prior and posterior flux estimates from the five inversions) and observations at the 5 AGAGE sites and the 12 TCCON sites used in this study. For the AGAGE CH4 data, all the inversions improve the correlation between the observations and the model relative to the prior at sites in the Northern Hemisphere and decrease the correlation at sites in the Southern Hemisphere. Co-located AGAGE and ESRL measurements have been shown to agree within 1 ppb and to have similar precisions, so we expect that assimilating the ESRL and GASLAB data should improve the agreement with the AGAGE data. In addition, the AGAGE stations are co-located with ESRL and GASLAB measurement sites (or located close to them, in the case of Trinidad Head) which are assimilated in our inversions. However, AGAGE measurements are continuous, while ESRL and GASLAB measurements are weekly, and at many sites samples are taken only when the wind is from a non-polluted direction (e.g. at Cape Grim, Australia, only when the winds are coming from the Southern Ocean). AGAGE measurements collect data from all directions, meaning that they are more influenced by local emissions than the ESRL and GASLAB measurements, which are designed to sample background airmasses. The biases between the observations and model values decrease for Mace Head, Ragged Point, and Cape Grim and increase at Trinidad Head and Samoa. On average, the bias is decreased by 1.1 ppb across all sites.
Karlsruhe, Wollongong, and Lauder are the only TCCON sites where the model reproduces most of the observed variability, with r-squared values greater than 0.5. At the other sites, neither the prior nor the posterior model reproduces the variability in the observations (i.e. r-squared values are smaller than 0.5). Using the posterior fluxes from all inversions improves the correlation coefficient between TCCON observations and the model, with the exception of Karlsruhe and Darwin. At Karlsruhe, correlation coefficients increase when only GOSAT data are assimilated, but decrease when surface data are included. This mixed performance is perhaps due to the shorter time series at Karlsruhe, where measurements are available only from April 2010. At Darwin, correlation coefficients are consistently smaller when the posterior fluxes are used. At Wollongong the r-squared values are mostly unchanged. Biases between the TCCON XCH 4 columns and the model can either increase or decrease, depending on the site. On average, the bias is decreased by 0.1 ppb across all sites.
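The site statistics reported in Figs. 6 and 7 (mean bias, standard deviation of the differences, and r-squared) can be computed along the following lines. This is a sketch under stated assumptions: the function name is hypothetical, the sign convention (model minus observation) is assumed rather than taken from the text, and the sample values are invented for illustration.

```python
import numpy as np

def comparison_stats(obs, model):
    """Mean bias, standard deviation of differences, and r-squared.

    obs and model are 1-D arrays already matched in time and location.
    Pairs containing a NaN are ignored.
    """
    obs = np.asarray(obs, dtype=float)
    model = np.asarray(model, dtype=float)
    valid = ~(np.isnan(obs) | np.isnan(model))
    obs, model = obs[valid], model[valid]

    diff = model - obs  # assumed sign convention: model minus observation
    r = np.corrcoef(obs, model)[0, 1]  # Pearson correlation coefficient
    return {"bias": diff.mean(), "std": diff.std(ddof=1), "r2": r ** 2}

# Invented daily means (ppb): the model is offset from the data by 5 ppb.
obs = np.array([1800.0, 1810.0, np.nan, 1820.0, 1815.0])
model = np.array([1805.0, 1815.0, 1812.0, 1825.0, 1820.0])
stats = comparison_stats(obs, model)
```

A constant offset such as the one in this example changes the bias but leaves the standard deviation and r-squared untouched, which is why the three statistics can move in different directions between the prior and posterior model.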
The mixed performance of the posterior fluxes in the Southern Hemisphere is a result of the trends at Southern Hemisphere sites. At Northern Hemisphere sites, the difference between the prior and posterior fluxes shows a seasonal cycle, but no strong trend at either AGAGE or TCCON sites. In the Southern Hemisphere, this difference is increasing at all sites from both networks. This is visible in the bottom panels of Fig. 6b and d, as the differences between the model and data diverge over time. At the relatively clean-air sites of Cape Grim and Lauder, the bias is significantly decreased as the posterior model approaches the data, but the r-squared value is reduced at Cape Grim as the amplitude of the seasonal cycle of the posterior model is reduced by the changes in the fluxes. At the tropical sites of Samoa and Darwin, interhemispheric transport (e.g. Fraser et al., 2011) may also have a complicating role.
Looking at both AGAGE and TCCON sites together, the biases within a given continent both increase and decrease. For example, in Europe the bias improves at four sites and increases at three. Similar patterns are seen in Australasia and Temperate North America. As shown by the OSSEs (Appendix A), the ensemble Kalman filter is able to retrieve continental-scale fluxes, but its spatial resolution is not fine enough to universally improve the comparisons with individual sites within these diverse continental regions.

Concluding remarks
We have used an EnKF to estimate regional methane fluxes using two different proxy XCH 4 GOSAT datasets and weekly surface ESRL and GASLAB CH 4 data and evaluated the results using AGAGE CH 4 and TCCON XCH 4 measurements. The posterior global flux of each inversion agrees with the prior value of 529±25 Tg yr −1 , but is consistently smaller: between 510-516 Tg yr −1 . Changes in total emissions and seasonal cycle are seen at the regional level. The largest changes occur in Temperate Eurasia (a decrease) and Tropical Asia (an increase) due to changes in emissions from rice cultivation. Despite the shift in rice emissions to lower latitudes, the total rice emissions remain the same as the prior. The posterior fluxes from the GC and CT proxy agree, with differences reflecting initial differences in the XCH 4 values, and hence differences in the modelled XCO 2 . In all inversions there is significant month-to-month variation in the retrieved fluxes in some regions (e.g. Temperate North America), which may be improved by introducing temporal correlation to the posterior fluxes in the EnKF.
We have used the posterior fluxes from the inversions to drive GEOS-Chem and compared the results with ground-based measurements of surface CH 4 (AGAGE) and total column XCH 4 (TCCON). As expected, the difference between the prior and posterior model was greater at the AGAGE sites, since the surface concentration makes up only a portion of the total column and changes in methane emissions are therefore detectable earlier at the surface than in the total column. At the AGAGE sites, which are co-located with assimilated ESRL and GASLAB sites, assimilating the surface and/or GOSAT data increases the correlations at Northern Hemisphere sites and decreases the correlations at Southern Hemisphere sites. At the TCCON sites, assimilating the data tends to increase the correlation coefficients, but the bias can either increase or decrease. In all cases, the changes in bias and r-squared are modest.
While the surface data do constrain methane emission estimates, the limited spatial coverage leaves large areas of the globe with no measurements. For example, tropical and Southern Asia, the regions with the largest methane emissions, have only two surface sites in India and Indonesia to constrain the emissions. GOSAT observations cover a larger geographical area than surface observations and hence provide more information to the assimilation system. The error reductions for inversions using GOSAT data are at least twice the error reductions when only surface data are assimilated with the exception of the boreal regions, where we filter the GOSAT data, and Europe, which is well covered by the surface network. However, surface data are integral to the inversions as the data from these networks, with a record dating back to the early 1980s, have been validated extensively, while the GOSAT data have so far not undergone such an extensive validation, with many regions of the world (e.g. South America) lacking any TCCON sites for validation. The surface data also contain a stronger signature from the emissions than the total column amounts from GOSAT.
In future studies, we plan to estimate fluxes on a finer spatial scale over select regions, for example resolving the diverse region of Temperate Eurasia on the model grid scale (4° × 5°). This will give more information about the fluxes in the regions, and also potentially improve the results over problematic regions limited by the current assimilation system (such as South Africa). Increasing the resolution of the inversions could also potentially help in improving the comparisons to the TCCON and AGAGE networks. We also plan to assimilate columns of methane from the Infrared Atmospheric Sounding Interferometer (IASI), which are sensitive to the middle troposphere (Razavi et al., 2009). IASI columns could help to constrain the free troposphere, allowing the GOSAT measurements to better inform the surface emissions.

Observing system simulation experiments
We performed a series of observing system simulation experiments (OSSEs) using GOSAT data simulated from the GEOS-Chem model to test the performance of the ensemble Kalman filter using clear-sky atmospheric measurements of XCH 4 sampled by the GOSAT instrument, following Feng et al. (2009). We simulated data by sampling the GEOS-Chem model at the location of the clear-sky GOSAT observations. Four sets of simulated data were created: "perfect data" where the model value is taken as the simulated data, "random error" where we added a randomly generated error to the model based on the error of the actual GOSAT measurement and assuming a Gaussian distribution (Feng et al., 2009), "global bias" where in addition to the random error we added a global bias of 10 ppb, and "varying bias" where in addition to the random error we added a latitudinally varying bias with minima at the poles (−5 ppb) and a maximum at the equator (15 ppb). The different simulated datasets will allow us to test our ability to retrieve fluxes with different types of error.
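The construction of the four simulated datasets can be sketched as below. This is an illustration only: the function and variable names are hypothetical, the inputs are invented, and the cosine shape of the latitudinally varying bias is an assumption (the text specifies only the values of −5 ppb at the poles and 15 ppb at the equator).

```python
import numpy as np

def simulate_datasets(model_xch4, obs_error, lat, seed=0):
    """Build the four OSSE datasets from model values sampled at GOSAT soundings.

    model_xch4: model XCH4 at each sounding (ppb)
    obs_error:  reported GOSAT measurement error at each sounding (ppb)
    lat:        latitude of each sounding (degrees north)
    """
    model_xch4 = np.asarray(model_xch4, dtype=float)
    obs_error = np.asarray(obs_error, dtype=float)
    lat = np.asarray(lat, dtype=float)

    rng = np.random.default_rng(seed)
    # Gaussian random error scaled by the error of the actual measurement.
    noise = rng.normal(0.0, obs_error)

    global_bias = 10.0  # ppb, as stated in the text
    # Assumed cosine shape: -5 ppb at the poles, 15 ppb at the equator.
    varying_bias = -5.0 + 20.0 * np.cos(np.deg2rad(lat))

    return {
        "perfect": model_xch4.copy(),
        "random_error": model_xch4 + noise,
        "global_bias": model_xch4 + noise + global_bias,
        "varying_bias": model_xch4 + noise + varying_bias,
    }

# Invented soundings at the south pole, the equator and the north pole.
lat = np.array([-90.0, 0.0, 90.0])
model_xch4 = np.array([1750.0, 1800.0, 1780.0])
obs_error = np.array([10.0, 12.0, 15.0])
data = simulate_datasets(model_xch4, obs_error, lat)
```

Using the same noise realization in the three noisy datasets, as done here, isolates the effect of the added bias; drawing independent noise for each dataset would be an equally valid design.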
In the first round of experiments, the inversions were run using the same set-up as described in Sect. 4. The model used in the inversion and the model used in generating the data were identical: the prior emissions corresponded exactly to the true emissions used in simulating the data. These experiments establish a theoretical upper limit to the assimilation system due to the non-uniform sampling of GOSAT. In these experiments the posterior fluxes retained the seasonal cycle of the true/prior emissions. The annual mean posterior fluxes are shown in Fig. A1a. With "perfect data" the posterior fluxes are within 0.02 % of the true/prior fluxes. The different bias simulations had no significant effect on the posterior fluxes, which are within 5 % of the true/prior fluxes, with the largest differences in Europe and Tropical South America. In all cases, the returned bias was within 1.3 ppb of the true value.
In the second round of experiments, the set-up was identical except the prior emissions in the model were increased by 20 %. In this case the true emissions used to simulate the data were therefore 83 % of the prior emissions. Four inversions were performed with the four simulated datasets, the results of which are shown in Fig. A1b. Again, the three experiments with random error and different biases return similar posterior fluxes. The inversions infer fluxes that agree with the truth to within 10 %, with the exception of Boreal Eurasia, Tropical and Temperate South America, and North and South Africa. In Boreal Eurasia this is likely due to a lack of observations due to the latitudinal filter used in the analysis. The other regions are all located in the tropics. In Tropical South America and North Africa, which are both mainly in the Northern Hemisphere, the performance may be a result of the relatively few measurements in the region, between 300 and 1100 per month for Tropical South America and between 350 and 1260 per month for North Africa (see Fig. 2), which are some of the smallest numbers for the nonboreal regions. In South Africa and Temperate South America, both located in the Southern Hemisphere, this is perhaps due to the seasonal cycle of the observations due to clouds and aerosols: both regions have a strong seasonal cycle with more observations in the austral winter (May-October) than in the austral summer (November-April). Other regions have seasonal cycles in the number of observations as well, though the amplitude in South Africa (amplitude 1600 observations, with minimum 580) and Temperate South America (amplitude 1270 observations, with minimum 700) is larger than any other region except Temperate Eurasia, which has a minimum of 1800 observations per month. The returned bias was within 1.5 ppb of the true values in all of the experiments.
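For reference, the 83 % figure quoted for the second round follows directly from the 20 % inflation of the prior. The symbols F_true and F_prior are introduced here for illustration only and denote the true and prior emissions of a region:

```latex
F_\mathrm{prior} = 1.2\,F_\mathrm{true}
\quad\Longrightarrow\quad
\frac{F_\mathrm{true}}{F_\mathrm{prior}} = \frac{1}{1.2} \approx 0.83
```

A perfect inversion in this round would therefore reduce the prior emissions by about 17 %.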
The third and final round of experiments was the same as the second, but in this case we perturbed the prior fluxes by a random number between −20 and 20 % from the true fluxes. We again performed four inversions with the simulated datasets, the results of which are shown in Fig. A1c. The original perturbation to the prior fluxes is also given in this figure. The results of these experiments are similar to those of the second round, with the posterior fluxes in most regions agreeing with the true fluxes to within 10 %, with the exception of Boreal North America, Temperate South America, North Africa, and Boreal Eurasia. In the boreal regions, this is likely due to a lack of measurements. For the other regions, the reasons are likely the same as in the second round of experiments. In the regions with differences in fluxes of less than 10 %, the percentage difference between the prior and true fluxes is at least halved, even in areas with small differences to begin with.
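The perturbation of the prior and the "at least halved" criterion can be expressed as in the sketch below. All names are hypothetical, the regional fluxes are invented, and drawing the perturbation from a uniform distribution is an assumption (the text states only the range of −20 to 20 %). The posterior used here is a stand-in, not the output of an inversion.

```python
import numpy as np

def percent_difference(flux, true_flux):
    """Signed difference from the true flux, in percent."""
    flux = np.asarray(flux, dtype=float)
    true_flux = np.asarray(true_flux, dtype=float)
    return 100.0 * (flux - true_flux) / true_flux

def perturb_prior(true_flux, max_fraction=0.2, seed=0):
    """Scale each regional flux by a random factor within +/- max_fraction."""
    true_flux = np.asarray(true_flux, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumed uniform distribution over the stated range.
    factors = 1.0 + rng.uniform(-max_fraction, max_fraction, size=true_flux.shape)
    return true_flux * factors

def at_least_halved(prior, posterior, true_flux):
    """True where the posterior error is no more than half the prior error."""
    prior_err = np.abs(percent_difference(prior, true_flux))
    post_err = np.abs(percent_difference(posterior, true_flux))
    return post_err <= 0.5 * prior_err

# Invented regional fluxes (Tg/yr) and a stand-in posterior that removes
# 60 % of the prior error in every region.
true_flux = np.array([50.0, 30.0, 80.0])
prior = perturb_prior(true_flux)
posterior = true_flux + 0.4 * (prior - true_flux)
halved = at_least_halved(prior, posterior, true_flux)
```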
We conclude that the GOSAT observing system is able to retrieve fluxes to within 10 % of the "true" values in most regions, and in some regions much better than this. The observation pattern of the measurements and the conservative latitude-based filtering applied mean that the system is not able to correct fluxes in the boreal regions. In tropical regions, the fluxes are over- or underestimated by up to 15 % in our idealized experiments, possibly due to the small number and seasonal cycle of the observations leading to an incomplete sampling of the seasonal cycle. The inversions with the addition of different errors return similar fluxes, so we conclude that the assimilation system is not sensitive to random error on the order of magnitude of the measurement error of GOSAT or to global or latitudinally varying biases. As more GOSAT data become available and the measurement distribution potentially changes, these conclusions will need to be revisited.

Bias correction
The bias correction scheme is described in Sect. 4.1. Figure B1 shows the time series of the bias between the GOSAT GC proxy and the prior model in different latitudinal bands. No obvious trend is apparent in any of the latitude bands, indicating that the prior model generally reproduces the trend in the GOSAT XCH 4 measurements. The bias varies with the latitude band, as expected from the comparisons in Fig. 2, with minimum values in the Southern Hemisphere extratropics and maximum values in the Northern Hemisphere tropics.
The initial value of the bias at 60° S, 30° S, 0°, 30° N, and 60° N was selected from the mean of the difference between the observations and prior model. Figure B2 shows the latitudinal distribution of the bias between the GOSAT GC proxy and the prior model, the first guess a priori bias, the retrieved bias, and the bias retrieved in a sensitivity study where the a priori bias was set to zero and the uncertainty in the nodes of the bias was increased to 15 ppb. The bias retrieved in the sensitivity study agrees very well with the bias retrieved in the standard inversion, and the resultant fluxes are nearly identical to those retrieved in the standard inversion. We conclude that our inversion is not sensitive to the prior bias chosen.
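A node-based latitudinal bias of this kind can be illustrated with the sketch below. It is not the scheme of Sect. 4.1: linear interpolation between nodes, constant values poleward of the outermost nodes, and the 15° half-width used to form the first guess are all assumptions, and the function names and numbers are hypothetical.

```python
import numpy as np

# Node latitudes named in the text (degrees north).
NODE_LATS = np.array([-60.0, -30.0, 0.0, 30.0, 60.0])

def bias_at_latitudes(lat, node_bias):
    """Bias (ppb) at arbitrary latitudes from values defined at the nodes.

    Assumes linear interpolation between nodes; np.interp holds the end
    values constant poleward of 60 degrees.
    """
    return np.interp(np.asarray(lat, dtype=float), NODE_LATS, node_bias)

def initial_node_bias(lat, obs_minus_model, half_width=15.0):
    """First-guess node values: mean observation-minus-model difference
    within half_width degrees of each node (band width is an assumption)."""
    lat = np.asarray(lat, dtype=float)
    diff = np.asarray(obs_minus_model, dtype=float)
    first_guess = np.full(NODE_LATS.shape, np.nan)
    for i, node in enumerate(NODE_LATS):
        in_band = np.abs(lat - node) <= half_width
        if in_band.any():
            first_guess[i] = diff[in_band].mean()
    return first_guess

# Invented node values (ppb) evaluated at four sounding latitudes.
node_bias = np.array([2.0, 4.0, 8.0, 6.0, 3.0])
bias = bias_at_latitudes([-75.0, -45.0, 15.0, 90.0], node_bias)

# Invented differences (ppb) used to form a first guess at the nodes.
first_guess = initial_node_bias([-58.0, -62.0, 1.0], [2.0, 4.0, 9.0])
```

Nodes with no nearby soundings are left as NaN in this sketch so that the gap is visible rather than silently filled.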
We also performed sensitivity studies by changing the number and location of nodes. We find no significant difference in the posterior fluxes when the location of the nodes is changed. The fluxes are also robust to the number of nodes, provided there are at least two nodes. If a single node is used, which represents a global bias, the fluxes in the extratropical Northern Hemisphere are not greatly affected, but the fluxes in the tropics and in the Southern Hemisphere, where the bias between the observations and the model is larger, become much smaller or larger than the prior and can display some potentially unphysical variations. We choose an initial bias with five nodes to capture the variation in the bias with latitude, which could be partially due to biases in the GOSAT data resulting from thin cirrus clouds, sensitivity to the solar zenith angle of the satellite, uncertainties in water vapour spectroscopy, and the modelled CO 2 used in the proxy method.
Figure C1 shows regional prior and posterior methane flux estimates over the study period inferred from surface data only, GOSAT CT proxy data only, and surface and CT proxy data. The posterior fluxes are similar to those found using the GOSAT GC proxy in Fig. 4. Differences between the results from the two proxies are a result of the differences between the XCH 4 values in the proxies shown in Fig. 2.