Long-term tropospheric formaldehyde concentrations deduced from ground-based Fourier transform solar infrared measurements

We report a 13-year (1992–2005) dataset of total column measurements of formaldehyde (HCHO) over Lauder, New Zealand, inferred from solar infrared spectra measured using a high-resolution Fourier Transform Spectrometer (FTS). Ambient HCHO concentrations at this rural location are often close to levels typical of remote marine environments (<250 ppt), which are close to the detection limit of standard techniques. Consequently, we develop a new method that produces HCHO columns with sufficient sensitivity throughout the whole season. HCHO columns over Lauder have a strong seasonal cycle (±50%), with a mean column of 4.9×10¹⁵ molecules cm⁻², peaking during summer months. A simple box model of CH4 oxidation reproduces the observed broad-scale seasonal cycle, but significantly underestimates the seasonal peak HCHO ground concentrations during summer. This suggests the existence of an additional significant source of HCHO, possibly isoprene, that cannot be explained by oxidation of CH


Introduction
A key challenge in environmental science today is the change in atmospheric composition introduced by increasing emissions of pollutants from human activities. Pollutants introduced into the atmosphere are primarily removed by photooxidation. The oxidation capacity of the atmosphere is therefore an important indicator of the ability of the atmosphere to cleanse itself.
Formaldehyde (HCHO) is one of the most important species for understanding photo-oxidation pathways in the atmosphere. It is produced by the oxidation of methane (CH4) and non-methane volatile organic compounds (VOCs), emitted into the atmosphere by plants and animals, and generated by industrial processes as well as by incomplete combustion of biomass and fossil fuels. HCHO is closely linked with the atmosphere's principal oxidant, the hydroxyl radical (OH), and has a resulting atmospheric lifetime of the order of a few hours.
In the remote atmosphere methane is the main source of HCHO (Wagner et al., 2002; Lowe and Schmidt, 1983; Singh et al., 2001; Heikes et al., 2001; Frost et al., 2002). The main sinks of HCHO are reaction with OH and photolysis. Via its two photolysis channels HCHO is also an important source of free radicals. In the continental boundary layer, other VOCs emitted, for example, by the terrestrial biosphere have been shown to be important HCHO sources (Lee et al., 1998). One of the most important biogenic VOCs is isoprene (C5H8), a highly reactive VOC, which shows a strong relationship with HCHO (Palmer et al., 2003, 2006) where biogenic emissions dominate.
Published by Copernicus Publications on behalf of the European Geosciences Union.
Atmospheric HCHO concentrations can be measured by several independent techniques with different time resolutions and detection limits. For a good overview of the different intercomparison campaigns see Hak et al. (2005). The most common techniques include tunable diode laser absorption spectroscopy (TDLAS) (e.g. Wert et al., 2003), differential optical absorption spectroscopy (DOAS) (Platt et al., 1979), the fluorimetric Hantzsch technique (Dasgupta et al., 1988) and a chromatographic technique using 2,4-dinitrophenylhydrazine (DNPH) (Fung and Grosjean, 1981).
Fourier Transform InfraRed spectroscopy (FTIR), a method that has rarely been used for measuring HCHO, has the potential to address many issues relating to atmospheric HCHO concentrations and trends. Significant spectral datasets that cover the relevant absorption bands of HCHO already exist in numerous ground-based measurement programmes, but relatively few studies based on these data have been reported in the literature. The exceptions are the studies by Notholt et al. (1997a) and Mahieu et al. (1997), both using data from state-of-the-art high-resolution FTIR spectrometers.
This paper describes the analysis method and compares HCHO columns and mixing ratios with box model calculations and satellite measurements. In Sect. 2 details of the ground-based Fourier Transform Spectroscopy (gb-FTS) are given, including a formal error analysis. Section 3.1 describes the results of the analysis on 13 years of infrared (IR) data from Lauder, New Zealand; Sect. 3.2 outlines a simple box model based on CH4 and isoprene oxidation chemistry, while Sect. 3.3 compares the gb-FTS results with co-located measurements from GOME.

Method
The infrared spectra used in this study are from the Network for the Detection of Atmospheric Composition Change (NDACC, http://www.ndacc.org/) primary Southern Hemisphere station at Lauder, New Zealand (45° S, 169° E, 0.37 km a.s.l.). The time period covers 13 years from mid-1992 to mid-2005, so season-to-season changes can be studied along with long-term trends. The instrumentation and history of the monitoring program at Lauder are described in previous papers (Jones et al., 1994, 2001; Rinsland et al., 2002).
HCHO is a weak absorber in the mid-infrared, with absorption depths of less than 1% under normal background conditions. Its quantification must therefore be done carefully. The chosen lines are all part of the ν1 and ν5 stretches, with the band centers located at 2782 cm⁻¹ and 2843 cm⁻¹ respectively. The microwindows adopted for use in the mid-IR in this study are based on 5 windows selected using the method reported by Notholt et al. (2006). Table 1 outlines the windows, wavenumber ranges and interfering species. Because there are several major interfering species in all microwindows, the concentration profiles of the interfering gases CH4 and O3 were determined simultaneously with HCHO, while a single scaling parameter was used for other gases. Further, pre-fitting several of these interfering gases (HDO, H2O, N2O, and CH4) in wavenumber regions specifically selected to obtain better estimates of their individual a priori profiles (labeled as step 1 in Table 1) gave more consistent results; for HDO and H2O this is due to the large variability of atmospheric humidity levels. There are also very small but significant absorptions from solar CO; significant in the sense that the absorptions are very similar in shape to the HCHO feature in the 2869.650-2870.100 cm⁻¹ window. To overcome this potential source of ambiguity, a relatively isolated solar CO feature was fitted simultaneously in a separate window (2856.10-2856.35 cm⁻¹) with only very minor interferences from CH4 and N2O. The HITRAN 2004 line parameters (Rothman et al., 2005) were used in the analysis.
All spectra were recorded using a 700 cm⁻¹ wide filter centered at 2750 cm⁻¹. Since the HCHO spectral signature is very weak and broad (the peak of its density is close to the ground, so the line shape is completely dominated by pressure broadening), the resolution of the spectra was reduced to 0.02 cm⁻¹ (optical path difference of 50 cm) and the spectra were apodised with a triangular function, thus reducing the high frequency noise and increasing the signal to noise ratio (SNR) of the spectra to around 2000:1.
The algorithm used in the inversion, SFIT2 (version 3.91), has been developed jointly by several groups within the NDACC (NASA Langley Research Center, University of Denver, NCAR, NIWA Lauder, and the University of Wollongong). SFIT2 is capable of retrieving the vertical profiles of several gases simultaneously from ground-based infrared spectra. The inverse model is based on a semi-empirical application of the optimal estimation method (OEM) (Rodgers, 2000). Both the forward and inverse models have been described previously (Rinsland et al., 1998). The earlier simple solar model used in SFIT2, however, has been replaced with a more accurate algorithm (Hase et al., 2004).
The global SNR assumed in the retrieval was 1200, and was reduced to slightly lower values in selected spectral intervals (see Table 1) due to consistent residual features in the fitted spectra caused by various inaccuracies in line parameters or unknown absorbers.
The OEM uses an assumed a priori concentration for all gases. In this study the HCHO a priori profile is based on aircraft profile measurements (NASA/PEM-Tropics B; Singh et al., 2001) and is shown in Fig. 1a. The a priori profile decreases exponentially in the troposphere, with a concentration of 290 ppt at the ground. The scale height is approximately 6.2 km in mixing ratio, with a HCHO density scale height of 4.2 km. The 1 sigma uncertainties used in the OEM are directly employed in the a priori covariance matrix as a tuning parameter, i.e. they are adjusted empirically to obtain stable retrievals while obtaining the maximum possible spectral information. The specific method used here was to adopt a 1 sigma value at the ground that is consistent with reported variability in the Pacific region (Singh et al., 2001). Figure 1b shows the averaging kernels for the tropospheric part of the HCHO profile; there is a clear semi-independent kernel from 0-3 km, consistent with the degrees of freedom for signal (DOFS, the trace of the averaging kernel matrix, see discussion in Sect. 3.1) of 0.6 to 0.9 (the range here reflects the range of solar zenith angles in the measurements). We present in Sect. 3.3 the first multi-year comparison between the gb-FTS data and space-borne column measurements of HCHO from the GOME satellite instrument, which provides an independent validation dataset.
Table 1. Details of the microwindows, target gases, and interfering species used in the analysis of HCHO. The windows listed as "step 1" were first fitted for the listed target species. The retrieved profiles from step 1 were used as a priori profiles in the windows listed in step 2. Column headings: Window (cm⁻¹), Target gas, Interfering species, SNR.
Fig. 1. (a) The HCHO a priori profile, based on aircraft profile measurements (NASA/PEM-Tropics B; Singh et al., 2001). (b) Averaging kernels for the total column (0-100 km) and the 0-3 and 3-12 km partial columns for a solar zenith angle of 73.4°. Also shown (black dotted line) is the contribution of the a priori profile to the final retrieved solution.
Fig. 2 (caption incomplete). [...] Table 2) that is divided by the square root of the number of measurements per month. The blue line is a seasonal mean least squares fit to the data using Eq. (1).

Column averaged time-series
The full dataset is shown in Figs. 2 and 3. The HCHO total column is displayed in Fig. 2. The depicted points (diamonds) are monthly means, with error bars that are computed from the root mean square error of all measurements within the month, assuming an error of 16% per measurement (Table 2). This 16% error includes random components from smoothing (10.5%), measurement (4.5%) and temperature (1.2%) errors, and the systematic spectroscopic errors from uncertainties in the line strength (4.6%), air broadening half-width (6.6%), and the effective apodisation parameter (0.9%), a measure of the instrumental performance. The error terms for all random error components were computed using the OEM formalism (i.e. by calculating the gain and sensitivity matrices), while the systematic error terms were dealt with using perturbation methods. A selection of 50 spectra was chosen at random, covering a range of zenith angles and HCHO column amounts, and their columns were compared with those from the same spectra analysed with the systematic error component terms perturbed by either 5% (the line strength and the effective apodisation parameter were multiplied by a factor of 1.05) or 10% (the air broadened half width was decreased by 10%). These systematic spectroscopic errors were based on the reported uncertainties from HITRAN (Rothman et al., 2005), who reported 2-5% uncertainties in the line [...] Tejwani and Leung (1977). In this latter report, the air broadened half widths were computed for a range of lines and compared with a limited number of experimental values, and were found to agree to approximately 10%. The adopted number in HITRAN 2004 of 0.107 cm⁻¹ atm⁻¹ is weighted by line intensity as reported in Table 2 of Tejwani and Leung (1977).
Table 2. The characteristics (DOFS and contribution of the a priori HCHO profile to the final retrieval) and sources of error for the total column and two partial columns (0-3 and 3-12 km) given the assumed measurement conditions used in Fig. 1.
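The monthly error bars described above can be sketched as follows. This is a minimal illustration, assuming that every measurement in a month carries the same 16% fractional error and that the errors are independent, so the root mean square error is divided by the square root of the number of measurements. The function name and the sample columns are hypothetical and are not taken from the Lauder dataset.

```python
import math

PER_MEASUREMENT_ERROR = 0.16  # fractional error per measurement (Table 2 total)


def monthly_mean_with_error(columns, rel_error=PER_MEASUREMENT_ERROR):
    """Return (mean, error_of_mean) for one month of column retrievals.

    Each measurement is assumed to carry the same fractional error and the
    errors are treated as independent, so the RMS error is divided by sqrt(N).
    """
    n = len(columns)
    if n == 0:
        raise ValueError("need at least one measurement")
    mean = sum(columns) / n
    rms_error = math.sqrt(sum((rel_error * c) ** 2 for c in columns) / n)
    return mean, rms_error / math.sqrt(n)


# Hypothetical columns (molecules cm^-2) for a single month
january = [6.8e15, 7.4e15, 7.1e15, 6.5e15]
mean, err = monthly_mean_with_error(january)
print(f"{mean:.2e} +/- {err:.1e} molecules cm^-2")
```

With four measurements the error bar is half the single-measurement error, which is why months with few spectra stand out with larger bars.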
Recent HCHO line strengths reported by Perrin et al. (2006) were not available for this study, but the errors listed in Table 10 of Perrin et al. (2006) for the combination of lines used in the current study are 3% or less. All errors in the current paper are listed in Table 2. The retrieval characteristics listed in Table 2 consist of the DOFS and the contribution of the a priori, a measure of how much weighting the retrieval process gives to the a priori information compared with the measurement. Both are defined via the representation $\hat{x} = A x + (I - A) x_a$ (Rodgers, 2000), where $\hat{x}$ is the retrieved state, $A$ the averaging kernel matrix, $x$ the true state, $x_a$ the a priori state, and $I$ the identity matrix. For the purpose of computing the retrieval characteristics, the errors in the measurement have been ignored. The contribution of the a priori (the sensitivity of the retrieved state to the a priori) is the term $(I - A) x_a$, and is listed in Table 2. This term is small and negative for the total column, for example, implying that the averaging kernel is slightly greater than 1.0 (the measurements are slightly overweighted compared to the a priori). Note that the total column is the only independent retrieved quantity, with a DOFS that is generally in the range from 1.2 to 1.8.
Fig. 3. The monthly mean HCHO partial columns and seasonal trend line for the 0-3 km (red symbols and red line respectively) and 3-12 km (black symbols and black line respectively) layers. The error bars are based on the results from the full error analysis data in Table 2, while the seasonal trend lines are computed from the coefficients in Table 3.
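The retrieval characteristics defined above can be illustrated with a small numerical sketch. It evaluates the retrieved state as A x + (I − A) x_a with measurement error ignored, returns the a priori contribution (I − A) x_a, and computes the DOFS as the trace of A. The three-layer averaging kernel and the state vectors are hypothetical values chosen only for illustration; they are not SFIT2 output.

```python
def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


def retrieval_characteristics(avk, x_true, x_apriori):
    """Return (retrieved state, a priori contribution, DOFS).

    Implements x_hat = A x + (I - A) x_a with measurement error ignored.
    """
    n = len(avk)
    i_minus_a = [[(1.0 if i == j else 0.0) - avk[i][j] for j in range(n)]
                 for i in range(n)]
    apriori_part = mat_vec(i_minus_a, x_apriori)
    x_hat = [m + p for m, p in zip(mat_vec(avk, x_true), apriori_part)]
    dofs = sum(avk[i][i] for i in range(n))  # trace of the averaging kernel
    return x_hat, apriori_part, dofs


# Hypothetical 3-layer example (values chosen for illustration only)
avk = [[0.6, 0.2, 0.0],
       [0.2, 0.5, 0.1],
       [0.0, 0.1, 0.3]]
x_true = [1.2, 0.8, 0.3]      # "true" layer amounts, arbitrary units
x_apriori = [1.0, 1.0, 0.5]   # a priori layer amounts, same units
x_hat, apriori_part, dofs = retrieval_characteristics(avk, x_true, x_apriori)
print("retrieved:", x_hat)
print("a priori contribution:", apriori_part)
print("DOFS:", dofs)
```

An identity averaging kernel returns the true state with no a priori contribution, while a zero kernel returns the a priori unchanged; real retrievals sit between these limits.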
Also plotted in Fig. 2 (solid line) is a function consisting of a seasonal term and a trend term, Eq. (1), fitted using a non-linear gradient expansion curve fitting technique. In Eq. (1), HCHO(t) is the HCHO column at time t (the elapsed time in days since 1 January 1992), a0 the mean HCHO column at the start of the fitting interval, a1 the linear change in column, a2 the amplitude of the annual seasonal modulation, and φ1 the phase (day-of-year). The fitted coefficients and their uncertainties for the total column are given in Table 3. The data show a clear seasonal cycle with a summer maximum and winter minimum, consistent with expected photochemical control by OH and known NOx sources at the site (see Sect. 3.2). The annual trends in the layers are small but statistically significant, while there are also obvious departures from the mean seasonal trends in the summers of 1999, 2000, and 2002. The error terms of the retrieved coefficients (Table 3) were computed using a bootstrap method (Gautrois et al., 2003). The non-linear fitting procedure was repeatedly called (n=400) with the HCHO data resampled with replacement.
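Eq. (1) itself is not reproduced in the text above. A form consistent with the stated definitions would be HCHO(t) = a0 + a1·t + a2·cos(2π(t − φ1)/365.25); this exact form, including the cosine convention and the 365.25-day period, is an assumption. The sketch below fits that assumed form by expanding the cosine into cosine and sine terms, so that ordinary linear least squares can stand in for the non-linear gradient-expansion routine described above, and then estimates coefficient errors by bootstrap resampling with replacement (n=400). All function names and the synthetic series are hypothetical.

```python
import math
import random

PERIOD_DAYS = 365.25  # assumed length of the annual cycle


def solve_linear(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    sol = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (aug[r][n] - tail) / aug[r][r]
    return sol


def fit_seasonal(times, values):
    """Least-squares fit of a0 + a1*t + a2*cos(2*pi*(t - phi)/P).

    The cosine is expanded into cos and sin terms so the fit is linear.
    Returns (a0, a1, a2, phi) with phi in days (day of the seasonal maximum).
    """
    w = 2.0 * math.pi / PERIOD_DAYS
    t_scale = max(abs(t) for t in times) or 1.0  # keeps the normal matrix well scaled
    basis = [[1.0, t / t_scale, math.cos(w * t), math.sin(w * t)] for t in times]
    normal = [[sum(b[i] * b[j] for b in basis) for j in range(4)] for i in range(4)]
    rhs = [sum(b[i] * v for b, v in zip(basis, values)) for i in range(4)]
    a0, a1_scaled, c, s = solve_linear(normal, rhs)
    phase = (math.atan2(s, c) / w) % PERIOD_DAYS
    return a0, a1_scaled / t_scale, math.hypot(c, s), phase


def bootstrap_errors(times, values, n_resamples=400, seed=0):
    """Standard deviation of each coefficient over bootstrap resamples.

    The phase spread ignores wrap-around at the year boundary.
    """
    rng = random.Random(seed)
    n = len(times)
    fits = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        fits.append(fit_seasonal([times[i] for i in idx], [values[i] for i in idx]))
    errors = []
    for k in range(4):
        sample = [f[k] for f in fits]
        mean = sum(sample) / len(sample)
        errors.append(math.sqrt(sum((x - mean) ** 2 for x in sample) / (len(sample) - 1)))
    return errors


# Synthetic monthly series with known coefficients (illustration only)
w = 2.0 * math.pi / PERIOD_DAYS
times = [15.0 + 30.4375 * k for k in range(156)]
noise = random.Random(1)
values = [4.5e15 + 1.0e11 * t + 2.4e15 * math.cos(w * (t - 20.0)) + noise.gauss(0.0, 5.0e14)
          for t in times]
print("fit (a0, a1, a2, phi):", fit_seasonal(times, values))
print("bootstrap 1-sigma:", bootstrap_errors(times, values))
```

The linear expansion gives the same minimum as a non-linear fit of this particular functional form, which keeps the sketch free of external dependencies; it would not carry over to a model with additional non-linear parameters.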
Similarly, Fig. 3 shows partial columns of HCHO over two different height ranges, 0-3 km (red diamonds) and 3-12 km (black diamonds), for the same time period as the HCHO total column data in Fig. 2. These height ranges correspond to the averaging kernel function height ranges discussed in the previous section (see Fig. 1). The solid red (0-3 km) and black (3-12 km) lines are fitted using Eq. (1) with the corresponding coefficients presented in Table 3. The overall features of these two partial columns are comparable to each other as well as to the total column results from Fig. 2, as would be expected from retrievals of HCHO with limited vertical resolution. However, there are notable features in the data occurring between 1999 and 2002, referred to earlier, which appear to be prominent in the 0-3 km partial column but not to the same extent above this. This feature is particularly evident in the summer of 1999, where there is a factor of 3 difference between the mean 1999 HCHO column and the peak summer value. In 2000 and 2002 this difference (factor of 2) is less, while it is not always clear which partial column is perturbed the most with respect to the mean. In other years, the HCHO column seems to be well captured by the simple seasonal fit, particularly in the last two years of data. Most of this interannual variability corresponds with local biomass burning events. Long range transport of biomass burning plumes from Australia, associated with particularly severe burning events in New South Wales (Paton-Walsh et al., 2004) during austral summer months (with very high associated HCHO concentrations, up to 20 times above background), is a possible source of HCHO over Lauder. This is discussed further in Sect. 3.2.2.
Table 3 (caption incomplete). [...] (1), for the HCHO total column, partial columns 0-3 km and 3-12 km for the gb-FTS data, while the right two columns contain the smoothed gb-FTS total column using the GOME averaging kernel and the GOME total column results respectively.
Mahieu et al. (1997) reported a mean HCHO column of (5.9±1.5)×10¹⁴ molecules cm⁻² above the International Scientific Station of the Jungfraujoch (3.57 km a.s.l.), Switzerland, from a high resolution FTIR averaged over the time period from 1988 to 1996. Data binned into several day averages were also reported by Mahieu et al. (1997) but, unlike the results presented here, seasonal effects were less clear. The other studies that report multi-year datasets are two papers by Notholt et al. (1997a, b), who published column HCHO data from ground based instruments in the Arctic (Ny Alesund, 78.9° N, 11.9° E) and the Antarctic (McMurdo, 77.9° S, 166.7° E) covering 4 seasons in the case of Ny Alesund and a single campaign in Antarctica (September-October 1986). The Ny Alesund data in particular show seasonal behavior of a magnitude (range 2-5×10¹⁵ molecules cm⁻²) and phase consistent with our results, but are also affected by direct transport of pollutants from the European continent in the winter (giving a second maximum). More recently, De Smedt et al. (2008) reported a comparison of a combined GOME and SCIAMACHY HCHO total column dataset covering a 12 year period. De Smedt et al. (2008) noted apparent trends in the GOME HCHO total column of 16% (1996-2000 winter maxima) in the Asian region (possibly from anthropogenic emissions, particularly from China), but attributed this trend to possible instrument artifacts, as a similar trend was observed in Europe and North America where anthropogenic emissions have reportedly decreased.
Figure 3 also shows, on the right hand axis, the HCHO concentrations for the two plotted layers. These concentrations are estimated by dividing the retrieved partial column for the layer by the total air mass in the respective partial column.
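The conversion from a partial column to a layer-mean mixing ratio can be sketched as below. The air column is estimated here hydrostatically from the pressure difference across the layer, which is an assumption about how the air mass is obtained; the pressures (about 970 hPa at the surface and 700 hPa near 3 km) and the HCHO partial column are hypothetical round numbers, and the dry-air molar mass ignores water vapour.

```python
AVOGADRO = 6.02214076e23     # molecules per mole
MOLAR_MASS_AIR = 0.0289644   # kg per mole of dry air
GRAVITY = 9.80665            # m s^-2, assumed constant with height


def air_column(p_bottom_hpa, p_top_hpa):
    """Dry-air column (molecules cm^-2) between two pressure levels in hPa."""
    delta_p_pa = (p_bottom_hpa - p_top_hpa) * 100.0
    molecules_per_m2 = delta_p_pa * AVOGADRO / (GRAVITY * MOLAR_MASS_AIR)
    return molecules_per_m2 * 1.0e-4  # convert m^-2 to cm^-2


def layer_mixing_ratio_ppt(partial_column, p_bottom_hpa, p_top_hpa):
    """Layer-mean mixing ratio (ppt) from a partial column in molecules cm^-2."""
    return partial_column / air_column(p_bottom_hpa, p_top_hpa) * 1.0e12


# Hypothetical 0-3 km case: surface pressure ~970 hPa, ~700 hPa at 3 km
print(layer_mixing_ratio_ppt(3.0e15, 970.0, 700.0))
```

For these illustrative inputs the result is of order 500 ppt, the same order as the layer concentrations discussed in the box model comparison.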

Box modelling
We applied a box model (running in the Facsimile language) based on subsets of the Master Chemical Mechanism version 3 (MCMv3) developed by the University of Leeds (Saunders et al., 2003). Initially we used a clean air subset to estimate HCHO production from CH4 alone (including sensitivity tests with varying NOx levels). Later we included a module with the detailed isoprene chemistry provided in MCMv3. We used the MCMv3 functional forms for the appropriate photolysis j-values, but normalized their mid-day values to the corresponding values obtained from the Tropospheric Ultraviolet Visible (TUV) model (Madronich and Flocke, 1998) at the position of Lauder. A more detailed description of the modelling approach can be found in Riedel et al. (2005). We ran our model under mid-July (winter) and mid-January (summer) conditions for 30 days to ensure that the system had reached equilibrium and that the maximum HCHO levels for the conditions were obtained, although generally only a few days were required to reach equilibrium. Our model set-up and resources did not allow us to evaluate a complete seasonal cycle, so our choice of winter and summer conditions enabled us to obtain the seasonal amplitudes of the HCHO cycle under various conditions. Methane, O3 and CO were constrained to synoptic values. A boundary layer thickness of 1 km was chosen, consistent with the findings of Johnston and McKenzie (1984). Appropriate dry deposition rates for HCHO, CH3OOH, H2O2, O3, etc. were applied.
For comparisons with our model results we calculated monthly averages of atmospheric HCHO mixing ratios from the HCHO profiles measured by FTIR over the 13 year time period. We used the lowest layer of the FTIR measurements ranging from 0.1 to 3 km.
The maximum observed 13-year mean HCHO mixing ratio occurred in January (summer) at 870 ppt and the minimum observed 13-year mean HCHO mixing ratio occurred in August (winter) at 350 ppt HCHO (Fig. 4). Based on the degrees of freedom for the signal from the FTIR measurement conditions, the retrieval algorithm very likely underestimates the HCHO mixing ratios below 1 km. Thus the 13 year mean values are likely to be underestimates of the actual HCHO values in the lowest 1 km of the atmosphere where the box model is applied, and where we assume the main source of HCHO is located.

NOx dependency
The level of NOx (NO+NO2) has a strong influence on the production of HCHO from the precursor species CH3O2 and CH3O. We used the box model to test the NOx sensitivity of HCHO production from methane photo-oxidation. The model was run for mid-January and mid-July, capturing summer and winter HCHO atmospheric mixing ratios with NOx mixing ratios varying from 20 ppt to 1000 ppt. Modelled HCHO mixing ratios for January and July are shown in Fig. 5. HCHO mixing ratios increase initially with increasing NOx, but decrease after a certain NOx mixing ratio is reached, suggesting that the HCHO precursor species CH3O2 and CH3O are depleted through reaction with NOx. In January, a maximum of 515 ppt HCHO was calculated for a NOx mixing ratio of 613 ppt, while in July the highest HCHO mixing ratio (168 ppt) was reached at 220 ppt NOx, implying that a significant seasonal cycle in HCHO should exist, as is indeed seen in the results presented in Sect. 3.1. However, these maximum modelled HCHO values, reached with methane photo-oxidation alone under optimum NOx conditions, are still significantly smaller than the 13-year mean observed HCHO mixing ratios of 870 ppt in January and 380 ppt in July. At present, we do not have concurrent measurements of HCHO and NOx to confirm NOx mixing ratios at Lauder. However, Johnston and McKenzie (1984) found using long path spectroscopic absorption that the tropospheric mixing ratio of NO2 at Lauder was extremely variable. Values ranged from below the measurement threshold of about 20 ppt during windy conditions, to well over 1 ppb under still conditions, with typical values of a few hundred ppt. Lauder is located in a remote region with open grassland, pasture and sheep farming. Recent research showed that nitrogen fertilized soil can act as a source of NO (Bertram et al., 2005; Clough et al., 2003; Davidson and Kingerlee, 1997; Milligan et al., 2002; Veldkamp and Keller, 1997).
Between 0.5% (Veldkamp and Keller, 1997) and 10% of the nitrogen applied as fertilizer can be emitted as NO to the atmosphere if the soil is moist. These NO emissions could account for the large variability in NO2 observations (Johnston and McKenzie, 1984), especially after precipitation events when the soil contains the necessary moisture for nitrification. For our simulations with the box model, we chose a range of mean NOx values between 30 and 1000 ppt, accounting for very high and very low NOx conditions. In order to estimate realistic NOx mixing ratios we use the ratio between hydroperoxyl and hydroxyl radicals (HO2/OH ratio) calculated from the box model output as a control parameter. Measurement campaigns in mid-latitudes measuring HO2 and OH radicals reported HO2/OH ratios between 33 and 150 (Carslaw et al., 2001; Creasey et al., 2002; Stevens et al., 1997), with the most likely ratio being between 45 and 134 (Carslaw et al., 2001). Based on these considerations and extrapolation we conclude that NOx mixing ratios at Lauder are most likely to be in the range of 35 ppt to 275 ppt, provided the measured HCHO is generated only from methane. Extreme values of up to 1000 ppt, as observed by Johnston and McKenzie (1984), could occur after rain through emissions from fertilized soils (Clough et al., 2003).

HCHO production from isoprene
The photo-oxidation of VOCs is known to produce HCHO in the continental boundary layer (Palmer et al., 2006; Shim et al., 2005). Recent research analyzing HCHO vertical profiles has shown that the HCHO molar yield from isoprene oxidation can be as high as 1.6±0.5 (Millet et al., 2006), more than from any other VOC (Palmer et al., 2006). We therefore included isoprene in the box model by incorporating the full isoprene chemistry reaction system from MCMv3. Since the formaldehyde yield from isoprene oxidation is known to increase with NOx mixing ratio (Palmer et al., 2003, 2006; Horowitz et al., 1998) we varied NOx in our model, ranging from 60 to 330 ppt. These limits were chosen based on tests that showed the inclusion of isoprene generally required larger NOx mixing ratios compared with the system having only methane. We then modelled the isoprene mixing ratios that were required to match the 13-year mean HCHO observations at Lauder, i.e. 870 ppt in January and 380 ppt in July (see Fig. 4). With the same approach as for CH4 photochemistry, using HO2/OH ratios to estimate reasonable NOx values (between 120 and 300 ppt), we found that 135-190 ppt isoprene in January and 30-50 ppt isoprene in July would be necessary to produce the observed HCHO mixing ratios. Note that the reasonable NOx range is different from that for the methane-only system because of the much more complex chemistry taking place when isoprene is included; the two reasonable NOx ranges are both well within the Johnston and McKenzie (1984) measurement range of NOx at Lauder. The results of these simulations are shown in Fig. 6. Because no isoprene measurements have been taken over Lauder, we need to ensure our estimated mixing ratios are realistic. Isoprene is a reactive VOC mainly emitted by broadleaf trees, but also by grass and scrubs (Bai et al., 2006). Maximum emissions occur during summer in the main growing season.
The vegetation around Lauder is dominated by grassy farmland, natural tussock grass and low scrubs. The lifetime of isoprene is about 0.5-1.5 h during the day, but can be up to 20 h during the night, when OH mixing ratios decrease and NO3 mixing ratios are low (Carslaw et al., 2001; Warneke et al., 2004). This longer lifetime enables isoprene to be transported. Lewis et al. (2001) measured isoprene mixing ratios up to 120 ppt at Cape Grim, Tasmania, in air masses originating principally from Tasmanian forest and grassland. It is possible that similar high isoprene episodes can occur at Lauder, with air masses being transported from vast forested areas 150 km to the west and south west. Another possibility is that HCHO from biomass burning events, either local or transported from Australian bush fires (see Sect. 3.1), could be contributing to the seasonally large HCHO values measured at Lauder. While HCHO has a short lifetime of 5 h near the surface, only 50% or so of HCHO production from fires occurs during the first day (Stavrakou et al., 2008), while the balance is produced from other HCHO precursors, for example ethene, methanol, and acetic acid, that are also produced in the fires and have lifetimes of several days.
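The isoprene lifetimes quoted above can be checked with a simple first-order estimate, τ = 1/(k[OH]). The sketch below assumes a rate constant for the OH + isoprene reaction of about 1.0×10⁻¹⁰ cm³ molecule⁻¹ s⁻¹ near room temperature and uses illustrative OH concentrations; neither value comes from the Lauder analysis, and loss to O3 and NO3 is ignored.

```python
K_OH_ISOPRENE = 1.0e-10  # cm^3 molecule^-1 s^-1, approximate room-temperature value


def lifetime_hours(rate_constant, oxidant_concentration):
    """Chemical lifetime tau = 1 / (k [X]) in hours; [X] in molecules cm^-3."""
    return 1.0 / (rate_constant * oxidant_concentration) / 3600.0


# Assumed OH concentrations (molecules cm^-3), for illustration only
for label, oh in (("daytime, moderate OH", 2.0e6),
                  ("daytime, high OH", 5.0e6),
                  ("night, residual OH", 1.5e5)):
    print(f"{label}: {lifetime_hours(K_OH_ISOPRENE, oh):.1f} h")
```

Under these assumptions the daytime lifetimes come out at roughly 1.4 h and 0.6 h and the night-time value at about 18.5 h, in line with the ranges given in the text.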

A comparison with GOME HCHO vertical columns
GOME is a space-based grating spectrometer that measures backscattered solar radiation in the UV/Vis spectral range (240-790 nm) at a spectral resolution of 0.2-0.4 nm (Burrows et al., 1999). HCHO slant columns are fitted in the 336-356 nm wavelength region with a mean slant column fitting uncertainty of 4×10¹⁵ molec cm⁻² (Chance et al., 2000). An air-mass factor (AMF) that accounts for scattering processes from aerosols and clouds, in addition to Rayleigh scattering, is used to convert these slant columns to vertical columns (Palmer et al., 2001). Use of the AMF introduces a 30% error on the columns (Palmer et al., 2006), which is added in quadrature with the slant column fitting error. For a typical scene over Lauder with a vertical column of 1×10¹⁶ molec cm⁻² and an AMF of 0.91, the overall uncertainty on the vertical column is 0.7×10¹⁶ molec cm⁻². For the work shown here we use only GOME data with an associated cloud fraction of less than 40% and with the geolocation of the gb-FTS located within the corner coordinates of the 320×40 km² GOME pixels (which occurs on average every 3 days). Figure 7 shows GOME averaging kernels that are derived using the method described by Eskes and Boersma (2003), based on the Rodgers formulation (Rodgers, 2000), for summer (red) and winter (blue) profiles, while the averaging kernel for the gb-FTS is plotted in green for reference (from Fig. 1). The GOME averaging kernels are the means of assumed summer/winter model profiles from GEOS-Chem at several different airmasses. The specific computation assumes that the slant column is linear with respect to gas amount and that the dependence of the spectrum on the HCHO vertical distribution can be described by a single scaling factor (Eskes and Boersma, 2003). The averaging kernel for each layer can therefore be expressed as the ratio of the air-mass factor for the particular layer to the total a priori air-mass factor.
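The last statement above, that each layer's averaging kernel is the ratio of the layer air-mass factor to the total a priori air-mass factor, can be sketched as follows. The total AMF is taken here as the a priori weighted mean of the layer AMFs, which is an assumption about the implementation; the layer AMFs and a priori partial columns are hypothetical numbers, not GEOS-Chem or GOME values.

```python
def total_amf(layer_amfs, apriori_partial_columns):
    """Total AMF as the a priori weighted mean of the layer AMFs."""
    weighted = sum(m * x for m, x in zip(layer_amfs, apriori_partial_columns))
    return weighted / sum(apriori_partial_columns)


def column_averaging_kernel(layer_amfs, apriori_partial_columns):
    """Averaging kernel per layer: layer AMF divided by the total a priori AMF."""
    amf = total_amf(layer_amfs, apriori_partial_columns)
    return [m / amf for m in layer_amfs]


def vertical_column(slant_column, amf):
    """Convert a slant column to a vertical column with the total AMF."""
    return slant_column / amf


# Hypothetical 4-layer scene, surface layer first (illustration only)
layer_amfs = [0.5, 0.9, 1.4, 1.9]
apriori = [3.0e15, 1.5e15, 0.6e15, 0.2e15]  # molecules cm^-2 per layer
kernel = column_averaging_kernel(layer_amfs, apriori)
print("total AMF:", total_amf(layer_amfs, apriori))
print("averaging kernel:", kernel)
```

By construction the kernel applied to the a priori profile returns the a priori column, and layers with above-average sensitivity (here the upper layers) receive weights greater than one.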
While there are clear differences in the relative weights of the GOME and gb-FTS averaging kernels, in the lower half of the troposphere (below about 6 km) the kernels are reasonably similar in magnitude and shape. Both the gb-FTS and GOME averaging kernels tend to overweight the HCHO column in the upper troposphere, but this has little effect on the column because the HCHO mixing ratio decreases rapidly with altitude. The nature of the analysis techniques, instruments, platforms and geometries means that the two datasets are not directly comparable. We have therefore used the method outlined by Rodgers and Connor (2003). For simplicity, we assume that the gb-FTS is the "truth", against which the GOME measurement is compared. We can therefore write the following expression, following the nomenclature of Rodgers and Connor (2003), that relates the GOME column with the gb-FTS derived column:

$$\hat{C}_{Gf} = C_c + a_G^T \left( \hat{x}_f - x_c \right)$$

where $\hat{C}_{Gf}$ is the smoothed gb-FTS column, $C_c$ the ensemble gb-FTS column average, $a_G^T$ the transpose of the GOME column averaging kernel, $\hat{x}_f$ the gb-FTS mixing ratio profile, and $x_c$ the gb-FTS ensemble average mixing ratio profile. The gb-FTS ensemble averages for both the column and vertical profile were taken as the mean from all gb-FTS data.

Figure 8 shows monthly mean smoothed gb-FTS total columns along with GOME total columns over the time period 1996-2001. The GOME data have been smoothed with a 21-day filter to reduce noise on a sub-monthly timescale, and regridded temporally to match all gb-FTS monthly mean data points, resulting in 51 data points over the five-year period studied. The GOME data were also uniformly scaled down by 20% according to recent laboratory UV cross section measurements of HCHO (Gratien et al., 2007), which showed that measurements based on the cross section data of Cantrell et al. (1990), used by GOME, were inconsistent with the most recent UV cross section and IR line parameter data. The use of monthly mean data was adopted for two reasons: (1) to improve the precision of the presented data, given the inherently very weak spectroscopic lines, and (2) to average over large short-term (order of hours) variability in the HCHO concentration from local sources near the ground, which will affect the gb-FTS measurements but will more than likely be missed by GOME (both spatially and temporally).

Fig. 8. A comparison of total columns from GOME (blue diamonds) that have been regridded onto the spatial grid of the gb-FTS. The gb-FTS data (red diamonds) have been smoothed with the GOME averaging kernel (either summer or winter kernels, Fig. 7). Also plotted are seasonal fits, using Eq. (1), to the GOME data (solid blue line), the smoothed gb-FTS data (solid red line), and the fit of the "pre-smoothed" gb-FTS total column from Fig. 2 (solid black line). The fitting statistics are reported in Table 3. The vertical error bars are the GOME errors derived from the regridded GOME data by interpolation. The two horizontal dashed lines, red (gb-FTS) and blue (GOME), are the pair-wise statistical means from Sect. 3.3.

Also shown in Fig. 8 are three fitted curves, obtained using Eq. (1): the fit to the regridded GOME data (solid blue line), the fit to the gb-FTS data smoothed with the GOME summer or winter averaging kernel (solid red line), and the original pre-smoothed fit to the gb-FTS data (solid black line) as displayed in Fig. 2. The smoothing operation had little effect on the ground-based data. The two datasets are in good agreement in terms of seasonal trends (the fitted phases agree to within their respective errors, see Table 3), but the magnitudes of their respective cycles and year-to-year variations are clearly different. The variance in the GOME columns is much higher than that in the gb-FTS data. The correlation coefficient $r^2$ is 0.65, indicating that the two datasets are well correlated, driven mainly by the annual seasonal change. A simple statistical analysis of the means of the total columns of the two datasets using a paired t-test (gb-FTS = 5.1 ± 0.3, GOME = 5.6 ± 0.7, both in units of $10^{15}$ molecules cm$^{-2}$) gives a paired t value of 1.2 and an associated P-value of 0.22 for 49 degrees of freedom, i.e. the means are not statistically different. In the test we excluded four outliers. Note that the time period of this comparison is shorter than the complete gb-FTS record, and therefore the statistics are slightly different from those listed in Table 3.
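The smoothing step described above can be illustrated with a minimal sketch. It assumes that the GOME column averaging kernel and both gb-FTS profiles have already been interpolated onto one common vertical grid, and that the kernel elements carry the units needed to convert a mixing ratio difference into a column difference. The function name `smoothed_column` and its argument names are hypothetical and are not taken from the retrieval code used in this work.

```python
def smoothed_column(c_c, a_g, x_f, x_c):
    """Smoothed gb-FTS column: C_c + a_G^T (x_f - x_c).

    c_c : ensemble-average gb-FTS column (scalar)
    a_g : GOME column averaging kernel, one value per layer (assumed units:
          column per unit mixing ratio)
    x_f : individual gb-FTS mixing ratio profile, one value per layer
    x_c : ensemble-average gb-FTS mixing ratio profile, one value per layer
    All three sequences are assumed to share the same vertical grid.
    """
    if not (len(a_g) == len(x_f) == len(x_c)):
        raise ValueError("kernel and profiles must share one vertical grid")
    # Inner product of the kernel with the departure from the ensemble mean.
    correction = sum(a * (xf - xc) for a, xf, xc in zip(a_g, x_f, x_c))
    return c_c + correction
```

By construction, a profile equal to the ensemble average returns the ensemble-average column unchanged, so the kernel only reweights departures from the mean state.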
In general the two datasets agree very well given the preliminary nature of this comparison. On seasonal scales the two HCHO datasets (ground and space-based) agree to within their respective errors. However, the GOME data do appear to show larger variations in the columns throughout any one particular year. This is indicative of influences from heterogeneous sources being captured by one measuring platform and not the other. We note that the east and west coasts of the South Island of New Zealand have significantly different vegetation types, the west being wet with large tracts of dense forest, while the east coast is dry and less vegetated. The exact orbital path of the spacecraft could therefore have an impact on whether measurements of short-lived volatile compounds like HCHO are successfully correlated with ground-based observations. However, as the HCHO concentrations are normally at background levels, the agreement between the two datasets is remarkable given the difficulties of the HCHO measurements from both the gb-FTS and GOME platforms. New data products from the Ozone Monitoring Instrument (OMI) will likely improve this statistical comparison because of the better spatial and temporal resolution of the OMI measurement.
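The statistical comparison reported in this section (a paired t value and the squared correlation coefficient) can be sketched as follows. This is an illustrative sketch only: it assumes the two monthly mean series have already been matched point by point, and the function names `paired_t` and `r_squared` are hypothetical. It is not the analysis code used for the values quoted in the text.

```python
import math
from statistics import mean, stdev


def paired_t(a, b):
    """Return (t, degrees_of_freedom) for a paired t-test on matched samples."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two matched series with at least two points")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    # Standard error of the mean difference (sample standard deviation).
    # Note: this raises ZeroDivisionError if all differences are identical.
    se = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / se, n - 1


def r_squared(a, b):
    """Squared Pearson correlation coefficient between two matched series."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two matched series with at least two points")
    ma, mb = mean(a), mean(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab * sab / (saa * sbb)
```

Converting the t value into a P-value additionally requires the cumulative Student t distribution, which is not in the Python standard library; `scipy.stats.ttest_rel` is one existing routine that returns both the statistic and the P-value.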

Summary and conclusions
Long-term total column measurements of HCHO are reported from the Southern Hemisphere site at Lauder, New Zealand, and compared with co-located satellite measurements and a box model. A robust method of retrieving HCHO columns from ground-based remotely sensed infrared spectra is described. The low ambient HCHO concentrations recorded at Lauder are often close to the detection limit, which poses a challenge for analysis techniques. The mean HCHO column over Lauder from 1992 to 2005 was (4.9 ± 0.3) × $10^{15}$ molecules cm$^{-2}$, with a strong seasonal cycle (±50%) maximizing in the summer. A simple box model estimating summer and winter HCHO extremes from CH$_4$ oxidation alone over a likely range of NO$_x$ mixing ratios is consistent with the existence of a large seasonal cycle, but significantly underestimates the 13-year mean HCHO boundary layer mixing ratios deduced from the column observations, particularly in summer. This suggests the existence of a significant extra source of HCHO, possibly isoprene. A comparison of the ground-based FTS column data with co-located measurements from the GOME satellite instrument shows good agreement in the respective mean HCHO columns, with the data also being well correlated ($r^2$ = 0.65).