Atmospheric Chemistry and Physics
History of atmospheric SF 6 from 1973 to 2008

We present atmospheric sulfur hexafluoride (SF 6 ) mole fractions and emissions estimates from the 1970s to 2008. Measurements were made of archived air samples starting from 1973 in the Northern Hemisphere and from 1978 in the Southern Hemisphere, using the Advanced Global Atmospheric Gases Experiment (AGAGE) gas chromatographic-mass spectrometric (GC-MS) systems. These measurements were combined with modern high-frequency GC-MS and GC-electron capture detection (ECD) data from AGAGE monitoring sites, to produce a unique 35-year atmospheric record of this potent greenhouse gas. Atmospheric mole fractions were found to have increased by more than an order of magnitude between 1973 and 2008. The 2008 growth rate was the highest recorded, at 0.29 ± 0.02 pmol mol −1 yr −1 . A three-dimensional chemical transport model and a minimum variance Bayesian inverse method were used to estimate annual emission rates from the measurements, with a priori estimates from the Emissions Database for Global Atmospheric Research (EDGAR, version 4). Consistent with the mole fraction growth rate maximum, global emissions during 2008 were also the highest in the 1973-2008 period, reaching 7.4 ± 0.6 Gg yr −1 (1-σ uncertainties) and surpassing the previous maximum in 1995. The 2008 values follow an increase in emissions of 48 ± 20% since 2001. A second global inversion which also incorporated National Oceanic and Atmospheric Administration (NOAA) flask measurements and in situ monitoring site data agreed well with the emissions derived using AGAGE measurements alone. By estimating continent-scale emissions using all available AGAGE and NOAA surface measurements covering the period 2004-2008, with no pollution filtering, we find that it is likely that much of the global emissions rise during this five-year period originated primarily from Asian developing countries that do not report detailed, annual emissions to the United Nations Framework Convention on Climate Change (UNFCCC). We also find it likely that SF 6 emissions reported to the UNFCCC were underestimated between at least 2004 and 2005.

Introduction
With a global warming potential of around 22 800 over a 100-year time horizon, sulfur hexafluoride (SF 6 ) is the most potent greenhouse gas regulated under the Kyoto Protocol (by mass, Forster et al., 2007; Rinsland et al., 1990). The concentration of SF 6 in the atmosphere is currently relatively low, leading to a contribution to the total anthropogenic radiative forcing of the order of 0.1%. However, its long lifetime of ∼3200 years means that levels will only increase over human timescales (Ravishankara et al., 1993). Given these considerations, it is important that estimates of emissions of this compound are well constrained using both "bottom-up" (where emissions are calculated based on production, sales and usage information) and "top-down" methods (where atmospheric measurements are used to derive emissions) so that emissions reduction strategies can be properly designed and evaluated.
Published by Copernicus Publications on behalf of the European Geosciences Union.
Sulfur hexafluoride is primarily used as a dielectric and insulator in high voltage electrical equipment, from which it is released to the atmosphere through leakage and during maintenance and refill (Niemeyer and Chu, 1992). It is also released from a variety of more minor sources including the magnesium and aluminum industries and semiconductor manufacture (Maiss and Brenninkmeijer, 1998). Natural sources of SF 6 are very small (Deeds et al., 2008), leading to a very low estimated pre-industrial concentration, derived from firn air measurements, of ∼6×10 −3 pmol mol −1 (Vollmer and Weiss, 2002), compared to more than 6 pmol mol −1 in 2008. Its overwhelmingly anthropogenic origin means that SF 6 emissions are very highly weighted to the Northern Hemisphere (NH), with previous studies estimating a 94% NH contribution to the global total (Maiss et al., 1996).
Previous work has examined the concentration and growth of atmospheric SF 6 during various intervals, using measurements made with different instruments. Maiss and Levin (1994) and Maiss et al. (1996) reported an increase in the global atmospheric burden throughout the 1980s and early 1990s using GC-ECD measurements of air samples taken at a number of background sites. They found that mole fractions increased almost quadratically with time during this period, implying a near-linear rise in emissions. Geller et al. (1997) confirmed this finding using GC-ECD measurements of air samples collected at the NOAA Earth System Research Laboratory (ESRL) sites. Using a two-box model of the troposphere they derived a global emission rate of 5.9 ± 0.2 Gg yr −1 for 1996 (1-σ uncertainties are used unless specified otherwise). Fraser et al. (2004) reported high-frequency in situ GC-ECD measurements at Cape Grim, Tasmania starting in 2001. They deduced that global emissions had remained relatively constant from 1995 to 2003 within ± 10%. Most recently, Levin et al. (2010) extended the pioneering work of Maiss and Levin (1994), by reporting Southern Hemisphere (SH) measurements beginning in 1978 and NH measurements beginning in 1991, showing a renewed increase in the rise rate from 1997 to 2008. They inferred global SF 6 emissions by estimating the atmospheric burden and taking its derivative with respect to time. A two-dimensional atmospheric box model was then used to simulate atmospheric mole fractions based on these emissions, which were compared to the measurements to check that the derived emissions were reasonable. By studying inventory emissions estimates and economic factors, they postulated that emissions were likely to be under-reported to the United Nations Framework Convention on Climate Change (UNFCCC, 2010), and that a recent emissions increase was probably driven by non-reporting countries.
Some studies have derived regional emissions for SF 6 . Vollmer et al. (2009) investigated emissions for Northern China using a Lagrangian model and GC-ECD measurements at the Shangdianzi station. Airborne SF 6 measurements were also used to determine North American emissions in 2003 (Hurst et al., 2006).
Given that it is highly chemically inert and relatively easy to measure, many geophysical applications have been found for SF 6 . These include validation of chemical transport model advection schemes (Denning et al., 1999; Gloor et al., 2007; Peters et al., 2004), determination of the age of stratospheric air (e.g. Hall and Waugh, 1997), investigation of the relative importance of atmospheric transport processes (Patra et al., 2009) and groundwater dating (e.g. Bunsenberg and Plummer, 2000). Each of these applications relies on an accurate knowledge of the atmospheric history of SF 6 . Small amounts of SF 6 are also intentionally released to the atmosphere for a variety of purposes, including the tracking of urban air movements and the detection of leaks in reticulated gas systems.
In this paper we use a three-dimensional chemical transport model to derive annual hemispheric emission rates from 1973 to 2008 using new measurements of archived air samples collected at Cape Grim, Tasmania, and NH archived air samples mostly collected at Trinidad Head, California, along with modern ambient measurements from the Advanced Global Atmospheric Gases Experiment (AGAGE). We then use additional data from the NOAA-ESRL flask and in situ networks (Dlugokencky et al., 1994; Geller et al., 1997; Hall et al., 2007) to derive annual hemispheric and then regional-scale emissions using data from both networks. This work improves on the approach of Levin et al. (2010) by extending the NH record 18 years further back, by using a three-dimensional chemical transport model to derive emissions with an inverse approach that considers measurement error and allows for the incorporation of useful prior information, and by resolving regional sources.
The derived emissions are compared to the Emissions Database for Global Atmospheric Research (EDGAR v4, JRC/PBL, 2009) and reports to the UNFCCC. The 39 (so-called "Annex-1") countries that report detailed, annual emissions to the UNFCCC are Australia, Austria, Belarus, Belgium, Bulgaria, Canada, Croatia, Czech Republic, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Italy, Japan, Latvia, Liechtenstein, Lithuania, Luxembourg, Monaco, Netherlands, New Zealand, Norway, Poland, Portugal, Romania, Russia, Slovakia, Slovenia, Spain, Sweden, Switzerland, Turkey, UK and the USA. We refer to countries that do not make detailed annual reports as "non-reporting", or "non-UNFCCC" throughout this paper.

AGAGE measurements
AGAGE has been making high-frequency measurements of SF 6 at Cape Grim, Tasmania, using a GC-ECD system since 2001 (Fraser et al., 2004). From 2003, AGAGE stations also began measuring this compound with GC-MS "Medusa" systems.
To define the growth in the global background mole fraction, we used data from five AGAGE stations in the first part of this work, including Cape Grim, Tasmania (since 2001). Table 1 shows the location of these background AGAGE sites. Additional ambient AGAGE measurements from Gosan, Korea (since 2007) and Jungfraujoch, Switzerland (since 2008) were also used for regional emissions estimation in the second part of our analysis. The Cape Grim GC-ECD SF 6 measurements were discontinued in 2009.
To extend this time series back further than 2001, we report new AGAGE GC-MS Medusa measurements of samples from the Cape Grim Air Archive (CGAA) and a collection of Northern Hemisphere (NH) archived air samples. The CGAA consists primarily of whole air compressed by cryogenic trapping and archived into 35 L stainless steel cylinders, with most of the condensed water being expelled after trapping (for details, see Langenfelds et al., 1996). 64 samples of the CGAA filled between 1978 and 2006 were measured at the Commonwealth Scientific and Industrial Research Organisation (CSIRO) Division of Marine and Atmospheric Research (CMAR, Aspendale, Australia). Six additional SH samples filled between 1995 and 2004 were measured at the Scripps Institution of Oceanography (SIO, La Jolla, California) and were found to be in excellent agreement with the SH samples with similar fill dates measured at CSIRO (mole fractions differed by Δχ=0.001-0.05 pmol mol −1 for sampling time differences (Δt) of 3-33 days). 124 NH samples filled between 1973 and 2008 were measured at SIO. Four out of five additional NH samples filled between 1980 and 1999 were measured at CSIRO and were in excellent agreement with the NH samples with similar fill dates measured at SIO (Δχ=0.003-0.05 pmol mol −1 , Δt=0-12 days). The fifth NH sample measured at CSIRO had a unique fill date (Δt=194 days to other tanks). These tests show that measurements at the two sites are in agreement, at least for mixing ratios in the range 0.9-5.1 pmol mol −1 . The 129 NH samples originate mostly from Trinidad Head, California and to a smaller extent from La Jolla, California (laboratories of R. F. Weiss, C. D. Keeling, and R. F. Keeling at Scripps Institution of Oceanography), Cape Meares, Oregon (NOAA-ESRL, Norwegian Institute for Air Research (NILU), and CSIRO, originally collected by R. A. Rasmussen and the Oregon Graduate Institute), Niwot Ridge, Colorado (NOAA-ESRL and NILU), and Barrow, Alaska (University of California, Berkeley). In contrast to the CGAA samples, which were filled to create an air archive for the SH, the NH samples were filled during periods when the sites intercepted background air, but with various filling techniques and for different purposes. Non-background samples were rejected in an iterative process based on their deviation from a polynomial fit through all data. 29 NH samples had to be rejected as outliers with mostly higher mixing ratios for reasons such as initial retention of analytes on drying agents used during the filling followed by breakthrough, or sampling of polluted air, leaving 100 (81%) NH samples. None of the 70 (64 at CSIRO and 6 at SIO) SH samples had to be rejected, which is consistent with the strict procedures adopted for the collection of the CGAA samples.
The long-term stability of SF 6 in the early CGAA samples has been evaluated empirically. Sub-samples of the CGAA were prepared at CMAR, Aspendale, and sent to the University of Heidelberg for analysis of SF 6 by GC-ECD in 1995 (see Maiss et al., 1996). Small corrections to the originally reported SF 6 mixing ratios were applied recently, after a careful reassessment of the non-linear response of the ECD (Levin et al., 2010). The Medusa GC-MS measurements of the CGAA samples were carried out in 2007 at CMAR, Aspendale, without the need to prepare explicit sub-samples (each CGAA cylinder was sampled directly, via a suitable pressure-reducing regulator). Comparison of both sets of measurements (made 12 years apart) shows excellent agreement in the SF 6 trend for CGAA samples collected in the period 1978-1994 (see Supplement). A small average offset of ∼1.5% between the two sets of measurements is consistent with a difference of this magnitude between the two independently prepared SF 6 calibration scales. These results strongly support the contention that SF 6 has been stored faithfully in the CGAA cylinders.
All AGAGE in situ and archived measurements are presented on the SIO-2005 scale as dry gas mole fractions in pmol mol −1 . Details of the calibration chain from SIO to each station are reported in Miller et al. (2008). The estimated accuracy of the calibration scale is 1-2%. The typical repeatability of reference gas measurements is ∼0.05 pmol mol −1 , which was used as an estimate of the repeatability of each in situ measurement. Typical repeatability for archived air samples was ∼0.02 pmol mol −1 , with 3-4 replicates for most older samples and 10-12 replicates for more recent samples. Any non-linearity of the response of the Medusa GC-MS and any potential for system blank contamination in the analysis of the CGAA samples was experimentally determined to be negligible over the mole fraction range of the CGAA.

Table 1. Site identification codes, names and locations. An asterisk (*) in the first column denotes a site used in the AGAGE-only global inversion, and a plus (+) denotes a site used in the AGAGE-NOAA global inversion. All sites are used in the regional inversion.

Figure 1 shows the SF 6 mole fractions from the CGAA and NH archive from 1973-2008. From 2001, monthly mole fractions from the Cape Grim GC-ECD are presented and starting around 2004 the Cape Grim and Trinidad Head AGAGE Medusa measurements are shown, following removal of local pollution events using a statistical filtering algorithm. The measurements show that the SF 6 loading of the atmosphere has increased by more than a factor of 10 between 1973 and 2008. Close examination of the data indicates a steady acceleration of the mole fraction growth rate throughout the 1970s and 1980s, indicating a gradual rise in emission rate. This approximately quadratic increase with respect to time was previously noted by Maiss et al. (1996). The growth rate was seen to stabilize during the 1990s before accelerating again from around 2000, reaching 0.29 ± 0.02 pmol mol −1 yr −1 in 2008 (lower panel, Fig. 1). This recent acceleration in growth was previously noted by Elkins and Dutton (2009) and Levin et al. (2010).
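As a rough plausibility check, a global growth rate can be converted into an implied emission rate with a one-box calculation. The sketch below is not the inversion used in this paper. It assumes no atmospheric loss, assumes that the surface background growth rate applies to the whole atmosphere, and uses approximate textbook values for the mass and mean molar mass of dry air. The function and constant names are illustrative.

```python
# One-box sketch: convert a global mole fraction growth rate into an
# implied emission rate, assuming no loss and a fully mixed atmosphere.
# The constants are approximate textbook values, not taken from this paper.

M_SF6 = 146.06             # molar mass of SF6, g/mol
DRY_AIR_MASS_KG = 5.13e18  # approximate mass of the dry atmosphere, kg
M_AIR = 28.97              # mean molar mass of dry air, g/mol


def implied_emissions_gg_per_yr(growth_pmol_per_mol_per_yr):
    """Emission rate (Gg/yr) needed to sustain the given growth rate."""
    moles_air = DRY_AIR_MASS_KG * 1e3 / M_AIR  # mol of dry air
    moles_sf6_per_yr = growth_pmol_per_mol_per_yr * 1e-12 * moles_air
    grams_per_yr = moles_sf6_per_yr * M_SF6
    return grams_per_yr / 1e9                  # 1 Gg = 1e9 g


growth_2008 = 0.29  # pmol/mol/yr, growth rate quoted in the text
emissions_2008 = implied_emissions_gg_per_yr(growth_2008)
print(round(emissions_2008, 1))
```

Under these assumptions the 2008 growth rate corresponds to roughly 7.5 Gg yr −1 , which is of the same size as the inversion-based estimate quoted in the abstract. The agreement should be read only as an order-of-magnitude check, since the one-box picture ignores the lag between surface and stratospheric mole fractions.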

NOAA measurements
SF 6 data are from surface air samples collected as part of the NOAA Global Cooperative Air Sampling Network. Surface samples are collected in duplicate, approximately weekly, from a globally distributed network of background air sampling sites (Dlugokencky et al., 1994). Daily samples are collected at tall tower sites using flask and compressor packages built into suitcases for portability. The flask package contains 12 borosilicate glass flasks and a microprocessor to control flask valves. Flasks are cylindrical in shape, 0.7 L volume and have glass-piston, Teflon-O-ring sealed stopcocks on each end. Materials used in the flasks collected at tall towers are identical to those used in the surface network. Custom-built actuators, controlled by the microprocessor, are used to open and close stopcocks. The compressor package contains two compressors connected in series. During sampling, flask and compressor packages are connected by cables to transfer power and instructions from the microprocessor, and tubing to get air from the compressors to the flasks. For each sample, 10 L of ambient air is flushed through a flask, then it is pressurized to 0.28 MPa. The entire flask package is returned to Boulder, Colorado for trace gas analysis, while the compressor package remains at the sampling site.
SF 6 dry-air mole fractions (pmol mol −1 ) are determined at NOAA in Boulder by GC-ECD (for details, see Geller et al., 1997). The ECD response to SF 6 is calibrated against the NOAA 2006 (gravimetrically-prepared) standard scale. Each aliquot of sample is bracketed by aliquots of natural air from a reference cylinder; repeatability of the analytical system is 0.04 pmol mol −1 , determined as one standard deviation of multiple measurements of air from a cylinder containing natural air. In addition to SF 6 , samples are also analyzed for CH 4 , CO 2 , CO, H 2 , N 2 O, and δ 13 C and δ 18 O in CO 2 .
Six NOAA field sites (SPO, SMO, MLO, BRW, NWR and SUM, see Table 1) are equipped with GCs that sample air from a 10 m tower once an hour. Each in situ GC is fitted with four electron capture detectors and packed or capillary columns tuned to measure a variety of trace gases including SF 6 . To separate SF 6 , two 1.59 mm outer-diameter packed columns of Porapak Q are used (2 m pre-column and 3 m main column) and are thermally controlled at 60 °C. The air samples are compared to two on-site calibrated reference tanks with values assigned on the NOAA-2006 SF 6 scale that are sampled once every two hours. SF 6 estimated repeatability ranges from 0.03 to 0.05 pmol mol −1 at each in situ station.

Measurement intercomparison
Coincident AGAGE GC-ECD and GC-MS Medusa high-frequency measurements made at Cape Grim were found to compare very well with each other, with a mean bias of approximately 0.01 pmol mol −1 , and a standard deviation of 0.07 pmol mol −1 . Coincident AGAGE archive and high-frequency GC-MS Medusa measurements at Cape Grim and at Trinidad Head/Mace Head also agree with each other to better than 0.1 pmol mol −1 .
AGAGE measurements are regularly compared with the NOAA-ESRL in situ and flask networks where coincident measurements exist (NOAA flask samples are collected at Mace Head, Trinidad Head, Cape Matatula, Ragged Point, and Cape Grim and NOAA in situ measurements are made at Cape Matatula). Data from the two networks generally agree very well for this species with a mean bias (AGAGE minus NOAA) of around −0.02 pmol mol −1 and standard deviation of 0.05 pmol mol −1 (∼ 1%) between coincident measurements (defined as being within 3 hours of each other). This offset between the networks is consistent with measurements of air samples exchanged between NOAA-ESRL and SIO, for which a mean AGAGE minus NOAA difference of −0.02 ± 0.01 pmol mol −1 has also been derived. Where data from both networks are used in the inversions below, the NOAA-ESRL measurements were adjusted to the SIO-2005 scale by multiplication by a constant factor (0.998 ± 0.005) determined from the described comparisons of coincident measurements.
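The scale adjustment described above amounts to a single multiplication. The following sketch applies the quoted factor to one NOAA value and, as an assumption of this example only, propagates the factor uncertainty by adding independent relative errors in quadrature. The text does not state how, or whether, the factor uncertainty was propagated in the inversions. The function name and the sample value are hypothetical.

```python
import math

SCALE_FACTOR = 0.998      # NOAA-2006 to SIO-2005 factor quoted in the text
SCALE_FACTOR_ERR = 0.005  # 1-sigma uncertainty on the factor, from the text


def noaa_to_sio(x_noaa, sigma_noaa):
    """Return (value, 1-sigma) on the SIO-2005 scale, in pmol/mol.

    Assumes the measurement error and the factor error are independent,
    so their relative errors add in quadrature (assumption of this sketch).
    """
    x_sio = x_noaa * SCALE_FACTOR
    rel = math.hypot(sigma_noaa / x_noaa, SCALE_FACTOR_ERR / SCALE_FACTOR)
    return x_sio, abs(x_sio) * rel


# Hypothetical NOAA flask value of 6.50 pmol/mol with 0.04 pmol/mol repeatability
value, sigma = noaa_to_sio(6.50, 0.04)
```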

Emissions inversion method
In order to use the measurements to estimate emission rates, an atmospheric chemical transport model is required, along with a suitable inverse method, and prior estimates of global SF 6 emissions. Here we outline these individual components of the inversion.

Atmospheric chemical transport model
The Model for Ozone and Related Tracers (MOZART v4.5, Emmons et al., 2010) was used to simulate three-dimensional SF 6 atmospheric mole fractions. The model has previously been demonstrated to accurately represent the variability and inter-hemispheric and vertical gradients of this species at NOAA sampling sites, assuming EDGAR v3.2 emissions (Gloor et al., 2007). Meteorological data from the National Centers for Environmental Prediction/National Center for Atmospheric Research (NCEP/NCAR) reanalysis project (Kalnay et al., 1996) were used to simulate the transport of SF 6 , which was assumed to exhibit no chemical loss in the atmosphere or at the surface (i.e. an infinite lifetime was assumed).
We present two types of inversion in this paper, one estimating hemispheric emissions from 1970-2008 using the archived air samples and modern background measurements, and a second, from 2004-2008, in which continent-scale emissions were estimated, also incorporating non-background sites. NCEP/NCAR reanalyses were available for use with MOZART from 1990-2008 at 6-hourly intervals at 1.8° × 1.8° resolution, with 28 vertical sigma levels extending from the surface to approximately 3 hPa. For reasons of computational efficiency, these dynamics data were interpolated to 5° latitude/longitude for the hemispheric 1970-2008 inversion, and monthly average background mole fractions were compared to the measurements. Since it is assumed that all the measurements represented background values in this part of the work, the resolution was not expected to significantly influence the derived emissions. Annually repeating 1990 dynamics were used between 1970 and 1990, and the error associated with this limitation was incorporated into our inverse estimates using the method described below. For the regional 2004-2008 inversion, we ran MOZART at 1.8° × 1.8° resolution and output weekly average mole fractions.

Prior emissions
In both inversions, prior emissions estimates from EDGAR v4 (JRC/PBL, 2009) were used and interpolated to the required grid resolution. These estimates currently exist only until 2005, so we extrapolated the EDGAR values through the final years (2006-2008). The extrapolation was carried out by breaking the emissions field into separate continents and then subdividing continents into countries that report to the UNFCCC and those that do not. Emissions were then linearly projected in these regions using 2004 and 2005 EDGAR values.
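The linear projection can be illustrated with a short sketch. The region labels and emission values below are hypothetical placeholders, not EDGAR numbers, and the function name is illustrative.

```python
def extrapolate_linear(e_2004, e_2005, years=(2006, 2007, 2008)):
    """Project emissions forward along the straight line through 2004 and 2005."""
    slope = e_2005 - e_2004  # change per year, same units as the inputs
    return {year: e_2005 + slope * (year - 2005) for year in years}


# Hypothetical regional totals in Gg/yr for (2004, 2005); not EDGAR values
regions = {
    ("Asia", "non-reporting"): (1.8, 2.0),
    ("Europe", "reporting"): (0.50, 0.45),
}
projected = {key: extrapolate_linear(*pair) for key, pair in regions.items()}
```

A projection of this kind cannot follow any change in trend after 2005, which is consistent with the larger prior uncertainty assigned to the extrapolated years later in this section.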
The SF 6 consumption data used in EDGAR v4 are based on the global sales and emissions dataset constructed by Maiss and Brenninkmeijer (1998) for the period 1953-1995. To compile this dataset they considered production data, sales into six end-use categories, other end-use estimates, and atmospheric observations. For EDGAR v4, global sales data were used that were collected by the RAND Corporation through surveys of six major producers of SF 6 (SPS, 1997; Smythe, 2004; Knopman and Smythe, 2007), and modified as described in Maiss and Brenninkmeijer (1998). The data cover all producing countries except for Russia and China (and possibly a very small contribution from India). From this dataset, supplemented with estimates for Russia and China (R. Bitsch personal communication, 1998; Cheng, 2006), annual emissions were estimated as the sum of prompt releases and delayed emissions from banked SF 6 . For recent years, national consumption data have also been incorporated, including for example, SF 6 use in semiconductor manufacture, the magnesium industry, sound-insulated windows, soles of sport shoes and automobile tires (UNFCCC, 2010; ESIA, 2007; SIA, 2006; Nike, 2005). For end-use applications where SF 6 is stored in products, and for semiconductor manufacture, which exhibits reduced emissions due to SF 6 destruction during the manufacturing process, default emission factors and banking times were used as recommended by the Intergovernmental Panel on Climate Change (IPCC, 2006). The regional amounts of SF 6 banked in switchgear in 1995 and the identification of countries within each region that use SF 6 -containing switchgear were based on industry estimates (R. Bitsch personal communication, 1998). Per-country estimates of annual stock changes of SF 6 in switchgear were based on their relative share in regional electricity consumption changes in 1995, while the trend in this proxy was used to estimate stock changes in other years. Regional and global total stock build-up over time was estimated such that regional stocks and stock emissions matched industry estimates for 1995 (R. Bitsch personal communication, 1998).
Estimates of the time delay between the sale of SF 6 and its release to the atmosphere from insulated electrical equipment, and the global fraction of sales that were banked in such equipment in 1995, were obtained by Maiss and Brenninkmeijer (1998) through comparison with atmospheric measurements. Independent estimates of these quantities by equipment manufacturers in Europe and Japan, which were used in EDGAR v4 to calibrate the accumulated regional SF 6 stock through 1995, were found to agree well with the top-down values derived.
In their analysis, Maiss and Brenninkmeijer (1998) observed a discrepancy between SF 6 sales and end-use estimates in 1995 for two regions. In North America and Europe (including Russia) these discrepancies totaled 1.2 ± 0.4 Gg yr −1 and 0.4 ± 0.4 Gg yr −1 respectively, and were identified as unaccounted-for sales to utilities (SPS, 1997). These quantities have been added as unknown sources in the USA, Canada and Russia, and represent about 20% of global reported sales in 1995.
From the size of the adjustments made in the RAND data due to incomplete reporting, the uncertainty in global total production and sales data is estimated at 5 to 10% for the period 1970-2000 and could be as high as 15% in 2005 (2-σ interval). Additional uncertainty in global emissions arises from sources where SF 6 is partially banked, mainly in switchgear stocks, but also, from the 1990s onwards, in soundproof windows, soles of sport shoes and car tires. Taking into account the estimated uncertainty in the emission factors of switchgear, in other applications with delayed emissions and in the factors for Chinese and Russian consumption, we obtain an average global emission factor uncertainty of about 10% in 1970, about 20% in the 1990s, and more than 25% in 2005 (2-σ interval). The resulting uncertainty in global EDGAR v4 emissions is estimated at about 10% for the period 1970-1995, increasing to over 15% in 2005 (1-σ interval). This uncertainty was incrementally increased to 20% between 2005 and 2008, since we expected that our simple extrapolation of the EDGAR emissions was an increasingly poor approximation of the "true" emissions in later years. Global EDGAR v4 and projected emissions are shown in Fig. 2.
Compared to global total SF 6 emissions, regional and national estimates are more uncertain. Due to uncertainty introduced by the proxies used and differences in equipment and maintenance practices, uncertainty in regional emissions may be twice as high as for global estimates. For the regional emissions inversion presented in the second part of our analysis (Sect. 5), we therefore assume a 40% error on the EDGAR v4 emissions (1-σ interval). It is estimated that on national scales EDGAR v4 emissions will have an uncertainty of up to 100% or more (2-σ interval).

Sensitivity estimates and inverse method
Each inversion requires an estimate of the sensitivity of the atmospheric mole fractions at each measurement site to changes in emission rate from each region. A model reference run was performed using the EDGAR emissions. The emissions were then perturbed in each hemisphere/continental region uniformly throughout each year (cf. Chen and Prinn, 2006). These "pulses" were tracked for two years and compared to the reference: one year during which the emissions were increased, and a further year in which the emissions were returned to the reference value. After the second year the excess mole fraction due to the perturbation was similar at each station (in other words the excess SF 6 was almost fully mixed throughout the troposphere). For all subsequent times the perturbed mole fractions at each measurement site were assumed to tend exponentially towards the completely well mixed value with a timescale of one year (the inversions were not found to be sensitive to this mixing time, since most of the updates of the emissions occur within the first two years). The sensitivity to a change in emissions was then found by dividing the magnitude of these increases in mole fraction by the magnitude of the emissions perturbation.
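The construction of one sensitivity time series can be sketched as follows. This is an illustrative reading of the procedure, not the code used for this work. It assumes monthly model output, takes the fully mixed excess as a supplied number, and uses hypothetical input values. The function and argument names are placeholders.

```python
import numpy as np


def sensitivity_series(ref, pert, delta_emission, well_mixed,
                       n_months, tau_months=12.0):
    """Sensitivity (pmol/mol per Gg/yr) of one site to one regional pulse.

    ref, pert      : monthly model mole fractions over the tracked period
                     (reference and perturbed runs, e.g. 24 months each)
    delta_emission : size of the emissions perturbation (Gg/yr)
    well_mixed     : excess mole fraction once the pulse is fully mixed
    n_months       : total length of the series to return
    tau_months     : assumed relaxation timescale after the tracked period
    """
    excess = np.asarray(pert, dtype=float) - np.asarray(ref, dtype=float)
    n_model = excess.size
    out = np.empty(n_months)
    m = min(n_model, n_months)
    out[:m] = excess[:m]                      # modelled part of the response
    if n_months > n_model:
        # Relax exponentially from the last modelled excess to the mixed value
        t = np.arange(1, n_months - n_model + 1)
        out[n_model:] = well_mixed + (excess[-1] - well_mixed) * np.exp(-t / tau_months)
    return out / delta_emission


# Hypothetical example: a 1 Gg/yr pulse leaving a 0.05 pmol/mol excess at a site
ref_run = np.full(24, 6.00)
pert_run = ref_run + 0.05
sens = sensitivity_series(ref_run, pert_run, delta_emission=1.0,
                          well_mixed=0.039, n_months=48)
```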
A recursive Bayesian minimum variance (Kalman-type) filter was implemented to determine emissions using these sensitivities. This technique provides an optimal estimate of the true emission rate by combining the prior estimates with the information provided by the measurements, with each weighted by their respective uncertainties. Using this method we show that the annual global emission rate can be constrained very well using the in situ measurements from the AGAGE and NOAA networks and the archived air samples, which together cover the period from 1973 to 2008. Regional emissions estimates for 2004-2008 were also obtained using all available AGAGE and NOAA observations with no pollution filtering, but were more poorly constrained than the global values.
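For orientation, the generic textbook form of a minimum variance (Kalman-type) measurement update is sketched below. It is not the implementation used for this work. The state is taken here to be a vector of emissions adjustments relative to the prior, the observations are taken to be measured minus reference-run mole fractions, and all numbers are hypothetical.

```python
import numpy as np


def kalman_update(x, P, y, H, R):
    """One minimum-variance update of the state x given observations y.

    x : (n,) current estimate (here: emissions adjustments, Gg/yr)
    P : (n, n) covariance of x
    y : (m,) observations (here: measured minus reference mole fractions)
    H : (m, n) sensitivities of the observations to the state
    R : (m, m) measurement-model error covariance
    """
    innovation = y - H @ x
    S = H @ P @ H.T + R                  # innovation covariance
    K = np.linalg.solve(S, H @ P).T      # gain P H^T S^-1 (P and S symmetric)
    x_new = x + K @ innovation
    P_new = (np.eye(len(x)) - K @ H) @ P
    return x_new, P_new


# Hypothetical two-region example (e.g. two hemispheres)
x0 = np.zeros(2)                                # start from the prior
P0 = np.diag([0.5 ** 2, 0.1 ** 2])              # prior variances, (Gg/yr)^2
H = np.array([[0.04, 0.02], [0.02, 0.04]])      # pmol/mol per Gg/yr
R = np.diag([0.05 ** 2, 0.05 ** 2])             # (pmol/mol)^2
y = np.array([0.03, 0.02])                      # pmol/mol
x1, P1 = kalman_update(x0, P0, y, H, R)
```

In a recursive filter this update would be applied once per time step, with the posterior from one step serving as the prior for the next. The posterior variances on the diagonal of `P1` cannot exceed the prior variances, which is the sense in which measurements constrain the emissions.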

Measurement-model uncertainty estimation
The assumed total uncertainty on each monthly/weekly average mole fraction measurement included contributions from the measurements themselves, sampling frequency, model-data mismatch and a repeating dynamics uncertainty (where required):

$$\sigma^2 = \sigma^2_{\mathrm{measurement}} + \sigma^2_{\mathrm{sampling\,frequency}} + \sigma^2_{\mathrm{mismatch}} + \sigma^2_{\mathrm{dynamics}} \qquad (1)$$

Here $\sigma_{\mathrm{measurement}}$ is the estimated total uncertainty on each measurement, which includes measurement repeatability and scale propagation errors. Since we did not know the latter term exactly, we estimated it by comparisons of the co-located AGAGE and NOAA measurements. It was assumed to be equal to the unbiased root-mean-square (RMS) difference between the networks (equal to around 1% of the mole fraction). Where weekly or monthly average mole fractions were used, the repeatability error was reduced if multiple measurements were available. The reduction factor was calculated as the square root of the number of days for which measurements were available in an averaging period. The one-day unit was chosen since this is the typical order of magnitude of the autocorrelation timescale of the data at the high-frequency sites, and is therefore a measure of how many "independent" estimates contributed to each (weekly/monthly) average mole fraction.
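A minimal numerical sketch of Eq. (1) is given below. It assumes that repeatability and scale propagation errors combine in quadrature within the measurement term, which the text does not state explicitly, and the input values are hypothetical. The function and argument names are illustrative.

```python
import math


def total_sigma(sigma_repeat, sigma_scale, n_days,
                sigma_sampling, sigma_mismatch, sigma_dynamics=0.0):
    """Total 1-sigma uncertainty on an averaged mole fraction, following Eq. (1).

    The repeatability is divided by sqrt(n_days), the number of days with
    data in the averaging period. Combining it in quadrature with the scale
    propagation error is an assumption of this sketch.
    """
    sigma_meas_sq = (sigma_repeat / math.sqrt(n_days)) ** 2 + sigma_scale ** 2
    return math.sqrt(sigma_meas_sq + sigma_sampling ** 2
                     + sigma_mismatch ** 2 + sigma_dynamics ** 2)


# Hypothetical monthly mean of 6.0 pmol/mol with data on 25 days
sigma_total = total_sigma(sigma_repeat=0.05,
                          sigma_scale=0.01 * 6.0,   # about 1% of the mole fraction
                          n_days=25,
                          sigma_sampling=0.01,
                          sigma_mismatch=0.02,
                          sigma_dynamics=0.01)
```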
Since the high-frequency data provided a more representative estimate of the mole fraction averaged over some period, we also included a sampling frequency term ($\sigma_{\mathrm{sampling\,frequency}}$) equal to the standard deviation of the variability divided by the square root of the number of measurements in that time period. This term was estimated at the flask sites (for which no high-frequency data were available) from the standard deviation of the variability at the closest high-frequency measurement site, scaled by the ratio of the flask and high-frequency mole fractions.
The transport model-data mismatch uncertainty, $\sigma_{\text{mismatch}}$, was estimated as the standard deviation of the difference between the model grid cell containing the measurement site and the eight surrounding grid cells (cf. Chen and Prinn, 2006).
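The mismatch term can be sketched as follows for a single time step of a two-dimensional modelled field. This is an assumption-laden illustration: the text does not say whether a population or sample standard deviation was used (the population form is used here), and grid edges and longitude wrap-around are not handled. The function name is hypothetical.

```python
import statistics


def mismatch_sigma(field, i, j):
    """Standard deviation of (neighbour - centre) over the eight surrounding cells.

    field : list of rows of modelled mole fractions
    i, j  : row and column of the cell containing the measurement site
    """
    if not (0 < i < len(field) - 1 and 0 < j < len(field[0]) - 1):
        raise ValueError("site cell must not lie on the edge of the field")
    centre = field[i][j]
    diffs = [
        field[i + di][j + dj] - centre
        for di in (-1, 0, 1)
        for dj in (-1, 0, 1)
        if (di, dj) != (0, 0)
    ]
    return statistics.pstdev(diffs)  # population form: an assumption


# Illustrative 3 x 3 patch of mole fractions (pmol/mol).
patch = [[6.42, 6.40, 6.41],
         [6.45, 6.43, 6.40],
         [6.47, 6.44, 6.42]]
print(f"mismatch: {mismatch_sigma(patch, 1, 1):.3f} pmol/mol")
```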
In the hemispheric 1970-2008 inversion, the use of annually repeating meteorology before 1990 was found to increase the uncertainty in the derived emissions by only a small amount compared to the other terms in Eq. (1). It affects two components of the inversion: the simulated mole fractions and the derived sensitivities. The magnitude of the first component ($\sigma_{\text{dynamics}}$ in Eq. 1) was determined by running a one-year simulation multiple times with constant emissions and identical initial conditions, but with different wind fields. This term was found to introduce a mean uncertainty of approximately 0.01 pmol mol −1 at the grid cells used, and was included in the inversion through Eq. (1). The influence of the choice of meteorological year on the derived sensitivities was investigated by running the inversion 1000 times with randomly perturbed sensitivities. The standard deviation of the random perturbations was again found by performing multiple runs for one year with different dynamics. This process introduced an uncertainty of approximately 1% of the global emissions in each hemisphere, which is added to the pre-1990 emissions uncertainty presented.
The total estimated uncertainties, as calculated using Eq. (1), are shown as error bars in Fig. 1 at the five background AGAGE stations.
A further uncertainty must be added to the derived emissions, linked to the uncertainty in the SIO-2005 scale. This was estimated as approximately 2% in Sect. 2.1. Two potential sources of error are unaccounted for in the emissions derived below. Firstly, by solving for emissions from aggregated continental regions, we must assume that the EDGAR spatial distribution within each region is correct. This leads to "aggregation" errors that cannot be quantified here, but may be substantial, given the large estimated national-level uncertainty derived above. Secondly, whilst we attempt to account for random short-range transport uncertainties through the mismatch term in Eq. (1), transport model biases and large-scale transport errors cannot be fully estimated here, since only one transport model is used.

Global and hemispheric emissions
We first estimated global and hemispheric emissions of SF 6 between 1970 and 2008 using the AGAGE measurements and inverse method outlined above. MOZART was run at 5° × 5° using EDGAR v4 and extrapolated emissions to provide a prediction of atmospheric mole fractions and to estimate sensitivities of the mole fractions to hemispheric emissions changes. Monthly average modeled mole fractions were stored at each grid cell, to be compared to monthly averages of the measurements. To ensure that the modeled mole fractions were representative of background air, the values in the oceanic grid cell "upwind" of the actual cell containing the measurements were used (i.e. we used the cell to the west of the coastal sites in the high latitudes, and to the east of the coastal tropical sites). This strategy was necessary since, at the low resolution used, some of the cells containing a measurement site also contained a significant contribution from local land sources, thereby preventing the modeled mole fractions within the cell from simulating truly "background" values. For simplicity, we assume that all of the NH archive samples were taken at Trinidad Head (where the majority of the samples were collected), since the difference between background values at NH archive sites is typically much less than 0.1 pmol mol −1 (the typical magnitude of the total measurement-model uncertainty given by Eq. 1).
An initially well-mixed atmosphere was assumed in 1970, thereby allowing three (model) years before the first measurement for a reasonable inter-hemispheric and vertical profile to emerge. A longer spin-up period was not used since EDGAR v4 emissions begin in 1970. Therefore, a small error may be induced in the derived emissions in the first few years, due to an incomplete stratospheric profile being set up before the incorporation of the first measurement. The initial well-mixed mole fraction was solved for in the inversion.
The mole fractions predicted using MOZART with EDGAR v4 emissions were generally found to agree well with the observations (Fig. 1, upper panel), indicating that global EDGAR v4 estimates are reasonably reliable for at least the pre-2001 period. From 2001 onwards a growing discrepancy can be seen to emerge between the measurements and the model run with the EDGAR and extrapolated emissions, suggesting that emissions were somewhat underestimated.
Using the sensitivities estimated with the transport model, we derive a new estimate of emissions using EDGAR v4 as a prior. The estimated annual emissions are shown in Fig. 2 and are tabulated in Table 2. The global emission rate can be seen to grow steadily from below 1 Gg yr −1 in 1970 to 6.3 ± 0.6 Gg yr −1 in 1995. The emissions then drop to 5.0 ± 0.6 Gg yr −1 in 2001 before increasing by 48 ± 20% from 2001 through 2008. The global emission rate of 7.4 ± 0.6 Gg yr −1 in 2008 is the highest in this record. Our estimates generally agree with the top-down estimates of Levin et al. (2010, also shown in Fig. 2), who estimate a 6% error on their annual emissions. They also agree well with EDGAR v4 until around 2002. After 2002, the derived emissions are significantly higher than the inventory, consistent with the discrepancy noted above between the prior modeled mole fractions and the measurements. Annual mean background mole fractions obtained by running MOZART with optimized emissions are given in Table 2. Annually averaged, three-dimensional optimized mole fraction fields have also been extracted from the model and are available in NetCDF format in the Supplement.
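The quoted 48% rise follows directly from the 2001 and 2008 estimates. The sketch below reproduces that arithmetic and gives a first-order uncertainty under the assumption (ours, for illustration) that the two annual uncertainties are uncorrelated; the text does not state how the ± 20% was derived, and the inversion's own covariance between years would be needed for an exact value.

```python
import math


def percent_increase(e_start, s_start, e_end, s_end):
    """Percentage change and its first-order 1-sigma uncertainty.

    Assumes uncorrelated uncertainties on the two emission estimates.
    """
    ratio = e_end / e_start
    relative = math.sqrt((s_start / e_start) ** 2 + (s_end / e_end) ** 2)
    return 100.0 * (ratio - 1.0), 100.0 * ratio * relative


# 2001: 5.0 +/- 0.6 Gg/yr; 2008: 7.4 +/- 0.6 Gg/yr (values from the text).
increase, uncertainty = percent_increase(5.0, 0.6, 7.4, 0.6)
print(f"increase: {increase:.0f}% +/- {uncertainty:.0f}%")
```

Under this simple assumption the uncertainty comes out close to the quoted value, which suggests that the two estimates are not strongly correlated.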
A second hemispheric inversion was performed, incorporating several NOAA-ESRL background sites in addition to the AGAGE high-frequency and archive data (Table 1). These measurements begin from 1997, and therefore provide a slightly longer time series than the AGAGE in situ instruments. Several sites were not used, since their proximity to pollution sources, at the coarse model resolution used, made it difficult to identify nearby model grid cells that could be thought to represent background air. The emissions derived using both measurement networks are shown in Fig. 2 as a dotted line. The figure shows that the emissions derived using both networks do not deviate significantly from the AGAGE-only estimates in most years, adding confidence to our global estimates.
EDGAR places a higher percentage of emissions in the NH than previous estimates, being between 96% and 100%, depending on the year (for example Maiss et al., 1996, estimated a 94% NH source between 1978 to an average of 97% in EDGAR during this period). Our estimates do not deviate significantly from these values (Fig. 2, lower panel). However, it should be noted that some correlation exists between our hemispheric estimates (with an average R² of around 0.2). Therefore, whilst some hemispheric emissions information may be obtained from the measurements, one cannot be quite as confident in their value as one is in the global total. Between 2005 and 2008, our inversion indicates an increased weighting of the NH in the global total, showing that the increased emissions most likely originated predominantly from the NH. The upper panel of Fig. 2 shows the emissions reported by 39 countries to the UNFCCC, along with the EDGAR estimate of UNFCCC emissions. Levin et al. (2010) found that Japanese emissions were likely to be overestimated before 1995 in the reported UNFCCC values, and we applied their correction to the 1990-1994 values here. UNFCCC reports use a "bottom-up" methodology, and therefore have not incorporated any information from atmospheric data. The EDGAR inventory has been compiled using "bottom-up" methods, and draws on practical experience of European and Japanese switchgear manufacturers on the fraction of SF 6 lost during manufacture, commissioning and maintenance. Atmospheric measurements have been used to confirm global estimates by manufacturers of the banked fraction of SF 6 in insulated electrical equipment in 1995 (Maiss and Brenninkmeijer, 1998).
UNFCCC-reported emissions are substantially lower than the global totals derived in the inversion for all years. Whilst this is to be expected given that the reported emissions leave out many large emitters, the EDGAR estimates suggest that these countries may also be significantly under-reporting (as postulated in Levin et al., 2010). Both sets of bottom-up estimates indicate that since 1995, the trend amongst UNFCCC countries has been to report dramatically reduced emissions. If this trend in the reports and in EDGAR is reliable, it therefore seems likely that the growth since 2000 has been mostly driven by non-UNFCCC reporting regions. In the next section we discuss the possibility of verifying these estimates using the atmospheric measurements.

Continental emissions estimation
We have identified a new surge in SF 6 emissions between 2000 and 2008, of a similar magnitude to that previously derived by Levin et al. (2010). For the last three years of this period there is no global EDGAR v4 information available, only UNFCCC national inventory data for industrialized countries. Here we ask whether this increase can be attributed to specific regions using all available (unfiltered) data from the AGAGE and NOAA networks and the three-dimensional transport model.
The global emissions field was split into eight regions chosen to separate continents and UNFCCC reporting countries. The regions were: North America, South and Central America, Africa, European countries reporting to the UNFCCC, non-UNFCCC European countries, Asian UNFCCC countries, non-UNFCCC Asian countries and Oceania (Fig. 3). Only Asia and Europe are split into UNFCCC and non-UNFCCC regions, as they are the only continents with significant emissions from both reporting and non-reporting countries (although non-reporting European countries represent a very small source). As above, the emissions in each of these regions were linearly extrapolated from 2006 to 2008 using the EDGAR 2004 and 2005 values. A priori uncertainties of 40% were assumed for each region in each year (Sect. 3.2).
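The construction of the 2006-2008 prior can be sketched as a straight line through the 2004 and 2005 values, with a 40% 1-σ uncertainty attached to each year. The function names and the example emission values are hypothetical. The sketch does not guard against a negative extrapolated value, which a strongly declining region could produce.

```python
def extrapolate_linear(e2004, e2005, years=(2006, 2007, 2008)):
    """Extend the 2004-2005 trend linearly to later years."""
    slope = e2005 - e2004  # change per year
    return {year: e2005 + slope * (year - 2005) for year in years}


def prior_sigma(emission, fraction=0.40):
    """A priori 1-sigma uncertainty as a fixed fraction of the emission."""
    return fraction * abs(emission)


# Illustrative regional emissions in Gg/yr (assumed values).
prior = extrapolate_linear(1.0, 1.5)
for year, value in prior.items():
    print(year, value, "+/-", prior_sigma(value))
```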
In addition to the five AGAGE measurement sites described above, additional AGAGE Medusa measurements from Jungfraujoch, Switzerland and Gosan, Korea were incorporated along with all in situ and flask measurements from the NOAA network. NOAA flask data are collected at a frequency of approximately one pair of flasks per week at surface sites, and daily at tall tower sites, whilst the NOAA in situ measurements are approximately hourly. Sampling locations for the three networks are shown in Fig. 3 and coordinates and names of the sites are given in Table 1. MOZART was run at 1.8° × 1.8° resolution for the period 2004-2008, using inter-annually varying meteorology. This period was chosen because AGAGE GC-MS Medusa measurements of SF 6 began in 2004. The model was used to estimate the sensitivity of weekly-average mole fractions to changes in annual emission rates from each of the regions. Weekly averages were used in order to extract emissions information from synoptic-scale "pollution events" at the high-frequency measurement sites. In order to reproduce the flask measurements as accurately as possible, we compared the weekly minimum rather than the mean, to represent the conditional background sampling at these sites. Periods shorter than one week were not thought to be as well modeled at the spatial resolution of the global model used. Further, for measurements averaged over timescales shorter than that of typical synoptic variability (about 1 week) the assumption of independent measurements, used here, may not be as valid.
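The difference between the weekly mean (used at the high-frequency sites) and the weekly minimum (used to mimic background-conditional flask sampling) can be sketched as below. This is an illustration only: the bucketing by integer day number and the function name are assumptions, and the values are invented.

```python
from collections import defaultdict


def weekly_aggregate(samples, how="mean"):
    """Reduce (day_number, mole_fraction) pairs to one value per 7-day week."""
    if how not in ("mean", "min"):
        raise ValueError("how must be 'mean' or 'min'")
    buckets = defaultdict(list)
    for day, value in samples:
        buckets[day // 7].append(value)
    if how == "min":
        return {week: min(values) for week, values in sorted(buckets.items())}
    return {week: sum(values) / len(values) for week, values in sorted(buckets.items())}


# Illustrative series (pmol/mol): a pollution event on day 6 lifts the weekly mean.
series = [(0, 6.0), (3, 7.0), (6, 8.0), (7, 6.5)]
print(weekly_aggregate(series, "mean"))
print(weekly_aggregate(series, "min"))
```

The minimum is insensitive to the pollution event, which is why it is the closer analogue of a flask sample collected under background conditions.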
Modeled mole fractions were output at the location of each AGAGE, NOAA flask and in situ station. However, it was found that significant biases existed between the modeled and measured mole fractions at some sites, which were difficult to explain through changes in emissions alone. In many instances, the bias could be reduced by moving the measurement location to an adjacent grid cell in the model. It therefore seems likely that some measurement locations shared model grid cells with significant local sources, which would then "pollute" the simulated measurements at all times. To correct this effect, the RMS difference between the model and the measurements was calculated at each grid cell in which the station truly resided and at surrounding grid cells. A site was moved if a lower RMS error could be obtained by positioning the measurement in an adjacent grid cell. The error associated with the site relocation was investigated and included in our final error estimate. This was achieved by performing the inversion 1000 times, each time with a bias randomly added to the modeled mole fraction at each site. The magnitude of the added bias was randomly chosen from the mean difference between the grid cell containing the site and the eight surrounding cells. The bias was used here, rather than the standard deviation of the difference (for example), to avoid duplication of the "mismatch" uncertainty in Eq. (1), which accounts for normally distributed, random mismatch errors, but will not account for model biases. The uncertainty associated with these biases was added to the uncertainty derived in the inversion. The mole fractions predicted by the model with the a priori emissions estimates are shown in the Supplement.
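The site relocation step described above amounts to choosing, among the cell containing the station and its neighbours, the cell whose modelled time series has the lowest RMS difference from the observations. A minimal sketch, with hypothetical function names and invented time series:

```python
import math


def rms_difference(model, obs):
    """Root-mean-square difference between two equal-length series."""
    if len(model) != len(obs) or not obs:
        raise ValueError("series must be non-empty and of equal length")
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))


def best_cell(obs, candidates):
    """Return the grid-cell offset whose modelled series best matches obs.

    candidates maps (d_lat, d_lon) offsets from the true cell to modelled series;
    (0, 0) is the cell in which the station actually resides.
    """
    return min(candidates, key=lambda cell: rms_difference(candidates[cell], obs))


# Illustrative data: the home cell is biased high by a local source.
observed = [6.0, 6.1, 6.2]
modelled = {
    (0, 0): [6.5, 6.6, 6.7],
    (0, -1): [6.0, 6.1, 6.2],
    (1, 0): [6.2, 6.3, 6.4],
}
print(best_cell(observed, modelled))
```

In the full procedure all eight neighbours would be included as candidates, and the bias introduced by the move is then propagated into the final uncertainty as described in the text.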
Using the inversion technique described above, emissions from the regions were estimated in each year, using a total of ∼12 000 weekly-average measurements. The optimized emissions fields are provided in NetCDF format in the Supplement, along with plots showing the mole fractions obtained from the transport model run using the estimated emissions. The prior and optimized emissions are shown in Fig. 4, broken down into two periods of interest: 2004-2005 (for which EDGAR estimates are available) and 2006-2008 (for which EDGAR estimates are extrapolated). Significant error reduction was achieved in the inversion only for emissions from the three major source regions: non-UNFCCC Asia, North America and UNFCCC-Europe, with little error reduction in the more minor emissions regions.

Fig. 3. Eight regions whose emissions were estimated in our regional inversion. Colored areas show grid cells (1.8° × 1.8° resolution) where EDGAR predicts non-zero emissions. Measurement locations are shown for AGAGE (triangles), NOAA in situ (asterisks), tower (diamonds), surface flask sites (crosses) and NOAA Pacific Ocean Cruise tracks (dots). The site location shown here refers to the assumed position within the model and may differ from the true location as outlined in the text. The quoted emissions are the regional EDGAR v4 estimates for 2005.
When discussing the derived regional emissions, it is important to note that highly significant correlations were obtained between the major regions (Fig. 5). In other words, the inversion was not able to fully resolve emissions from these areas. The reason for the inability of the inversion to fully distinguish between emissions from some regions is thought to be two-fold. Firstly, most of the sites used here measure predominantly background air. Therefore, whilst the networks can constrain the global background very well, there is little influence of "polluted" air masses on the measurements, making it unclear which regions are responsible for any increase in the background. This is particularly true of the flask sites, where the sampling strategy generally attempts to avoid intercepting polluted air (i.e. air containing information on nearby emissions). The second problem may be one of low signal-to-noise. The effect of an increase in emissions in any one year in any region is to increase the hemispheric background, and to increase the size of pollution events (of, say, daily-weekly duration) at the monitoring sites close to the source. A regional inversion therefore relies on being able to distinguish these "local" signals from the change in the global background that also results from the change. However, the typical size of these signals for this compound tends to be relatively small compared to the measurement uncertainty at the existing monitoring sites (see sensitivity estimates presented in the Supplement).
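Correlations between regional estimates are commonly read off the posterior error covariance of an inversion. The sketch below shows that conversion. It is an assumption that the correlations in Fig. 5 were obtained this way, and the covariance matrix used here is invented for illustration.

```python
import math


def correlation_matrix(cov):
    """Convert a covariance matrix (list of rows) into a correlation matrix."""
    sigmas = [math.sqrt(cov[i][i]) for i in range(len(cov))]
    return [
        [cov[i][j] / (sigmas[i] * sigmas[j]) for j in range(len(cov))]
        for i in range(len(cov))
    ]


# Hypothetical posterior covariance for two regions, in (Gg/yr)^2.
# The negative off-diagonal term means the data constrain the sum of the two
# regions better than either region individually.
posterior_cov = [[4.0, -2.0],
                 [-2.0, 4.0]]
for row in correlation_matrix(posterior_cov):
    print(row)
```

A strongly negative correlation between two regions indicates that an increase attributed to one could equally be attributed to the other without degrading the fit to the measurements.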
Examination of the derived emissions in Fig. 4 shows that for most regions, no significant trend can be inferred between the two periods investigated. The exception is non-UNFCCC Asian emissions, which show a large upward trend from 2.7 ± 0.3 Gg yr −1 in 2004-2005 to 4.1 ± 0.3 Gg yr −1 in 2008. This rise would account for all of the required global emissions growth between the two periods (1.0 ± 0.4 Gg yr −1 , as shown in Fig. 2 and Table 2). It is also likely that non-UNFCCC Asian emissions are underestimated in EDGAR for 2004-2005. Other regions show smaller discrepancies between EDGAR and the optimized values.
When comparing our derived emissions to the UNFCCC estimates, we find it likely that SF 6 emissions are under-reported. Figure 4c shows the aggregated UNFCCC estimates from the inversion, EDGAR and those reported. Whilst the uncertainties will be larger than shown in the figure (for the reasons discussed in Sect. 3.4), the optimized UNFCCC values (2.5 ± 0.5 Gg yr −1 ) are almost two standard deviations larger than reported (1.6 Gg yr −1 ), strongly suggesting under-reporting. This agrees with Levin et al. (2010), who arrived at a similar conclusion by consideration of EDGAR emissions and economic factors. Similar discrepancies were obtained for 2006 and 2007, but are not shown in the figure since EDGAR estimates are not yet available for comparison during these years.
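The "almost two standard deviations" statement can be checked with a one-line calculation. The sketch treats the reported value as exact and uses only the inversion uncertainty, which is a simplifying assumption; the function name is hypothetical.

```python
def sigma_distance(estimate, sigma, reference):
    """Number of standard deviations separating an estimate from a reference."""
    return (estimate - reference) / sigma


# Optimized UNFCCC total 2.5 +/- 0.5 Gg/yr versus 1.6 Gg/yr reported (from the text).
n_sigma = sigma_distance(2.5, 0.5, 1.6)
print(f"{n_sigma:.1f} standard deviations")
```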
There are few regional "top-down" emissions estimates currently available that cover similar spatial scales and with which we can compare our estimates. Using airborne measurements, a Lagrangian transport model, carbon monoxide (CO) versus SF 6 ratios and a CO inventory, Hurst et al. (2006) found that North American emissions were 0.6 ± 0.2 Gg yr −1 in 2003, compared to approximately 1.6 ± 0.3 Gg yr −1 in 2004-2005 in this work. Whilst our values are roughly consistent with EDGAR v4, the Hurst et al. (2006) estimates are more consistent with emissions reported to the UNFCCC. The reasons for this apparent discrepancy are unclear. Potential sources of error could include biases in either transport model used, aggregation errors, a large bias in the EDGAR prior influencing our derived emissions, or errors in the CO inventory or assumptions about CO lifetime influencing the Hurst et al. (2006) estimates.
The uncertainties in global EDGAR v4 SF 6 emissions are partly due to uncertainties in global total consumption arising from incomplete reporting of production and sales (production data for Russia and China are largely missing) and discrepancies in amounts sold and sectoral consumption as reported by industry organizations and countries (e.g. sales in 1995 unaccounted for in sectoral applications of about 20% of total reported sales, mainly in North America). There is also increasing uncertainty in regional emissions from switchgear for years other than 1995, the year for which regional stock estimates were made by industry organizations (R. Bitsch personal communication, 1998). The use of UNFCCC data after 1995 is often hindered by opaque or incomplete reporting due to confidentiality of national sectoral data. We recall that while global EDGAR emissions have estimated uncertainties of the order of 10 to 20% (1-σ confidence interval, see Sect. 3.2), regional and particularly national EDGAR emission estimates are more uncertain due to differences in equipment and maintenance practices, with uncertainties in regional emissions up to 30 to 40% (1-σ interval). The largest uncertainties in the years after 1990 are in (a) unknown sources (reported as sales to utilities and equipment manufacturers), (b) production level and sales mix of China and Russia (in particular the division into sources with banking, e.g. switchgear, and others, e.g. magnesium), (c) 2004-2006 data due to incomplete surveys, (d) the effective annual emission rates of SF 6 stock in switchgear, and (e) emissions related to SF 6 production and handling of returned cylinders.
The differences between our estimates, UNFCCC reports and EDGAR highlight the need for improved national and regional emissions estimation in the future in terms of transparency and completeness, both by countries reporting to the UNFCCC and by SF 6 manufacturers. Particularly important issues are the unaccounted for sales in 1995 and thereafter, SF 6 production and sales by Russia and China and the consistency of data used for estimating emissions from the electrical equipment sector (manufacturers and utilities).
In order for top-down regional emissions validation to become more accurate for this species, more information is required in the inversion. This may be achievable in a number of ways. Firstly, the addition of many more high-frequency monitoring sites in regions that regularly intercept nonbackground air should increase the number of regions that can be distinguished. Secondly, the use of higher resolution transport models in regions close to high-frequency monitoring sites may allow us to extract a higher information content from the existing stations. For example, weekly averages were used here since we did not have confidence that the global model would be able to resolve shorter timescales. Further, the resolution of current global models means that local sources will not be well resolved at stations which are very close to polluted regions (e.g. Gosan, Korea, or Mace Head, Ireland). Therefore, it may be possible to extract more information from the higher-frequency (hourly-daily) measurements, potentially with a higher signal-to-noise ratio (since the smoothing effect of averaging can be avoided), provided suitable high-resolution meteorological fields are available.

Conclusions
We have presented new atmospheric SF 6 mole fraction measurements from the 1970s to 2008 in both hemispheres, comprising archived air samples and modern high-frequency data from the AGAGE network. Global emissions of this potent greenhouse gas were obtained using the AGAGE data alone and AGAGE plus NOAA-ESRL data, with a three-dimensional chemical transport model and an inverse method. These emissions were generally found to compare well with EDGAR v4 between 1970 and 2005 (the period for which inventory data currently exist). Since 2001, emissions have increased dramatically, and are now higher than at any point in the period investigated, reaching 7.4 ± 0.6 Gg yr −1 in 2008. The global-average growth rate for 2008 was found to be 0.29 ± 0.02 pmol mol −1 yr −1 .
Regional emissions estimates were obtained for the period 2004-2008 using all AGAGE and NOAA surface measurements. Significant correlations were found between the emissions derived in the inversion for the three major centers (North America, Europe, non-UNFCCC Asia). However, it was found that much of the emissions growth between 2004 and 2008 could most likely be attributed to non-UNFCCC Asian countries. No significant trends could be derived from the other emissions regions given the large uncertainties obtained in the inversion. However, even with these large uncertainties, we find it likely that the emissions reported to the UNFCCC were underestimated between 2004 and 2007. For these uncertainties to be reduced in future, a more dense monitoring network, higher-resolution transport models and more complete and transparent emissions reporting will be required.