© Author(s) 2008. This work is licensed under a Creative Commons License.
Atmospheric Chemistry and Physics Discussions

Validation of ozone measurements from the Atmospheric Chemistry Experiment (ACE)

This paper presents extensive bias determination analyses of ozone observations from the Atmospheric Chemistry Experiment (ACE) satellite instruments: the ACE Fourier Transform Spectrometer (ACE-FTS) and the Measurement of Aerosol Extinction in the Stratosphere and Troposphere Retrieved by Occultation (ACE-MAESTRO) instrument. Here we compare the latest ozone data products from ACE-FTS and ACE-MAESTRO with coincident observations from nearly 20 satellite-borne, airborne, balloon-borne and ground-based instruments, by analysing volume mixing ratio profiles and partial column densities. The ACE-FTS version 2.2 Ozone Update product reports more ozone than most correlative measurements from the upper troposphere to the lower mesosphere. At altitude levels from 16 to 44 km, the average values of the mean relative differences are nearly all within +1 to +8%. At higher altitudes (45–60 km), the ACE-FTS ozone amounts are significantly larger than those of the comparison instruments, with mean relative differences of up to +40% (about +20% on average). For the ACE-MAESTRO version 1.2 ozone data product, mean relative differences are within ±10% (average values within ±6%) between 18 and 40 km for both the sunrise and sunset measurements. At higher altitudes (~35–55 km), systematic biases of opposite sign are found between the ACE-MAESTRO sunrise and sunset observations. While ozone amounts derived from the ACE-MAESTRO sunrise occultation data are often smaller than the coincident observations (with mean relative differences down to −10%), the sunset occultation profiles for ACE-MAESTRO show results that are qualitatively similar to ACE-FTS, indicating a large positive bias (mean relative differences within +10 to +30%) in the 45–55 km altitude range. In contrast, there is no significant systematic difference in bias found for the ACE-FTS sunrise and sunset measurements.


Introduction
Ozone is a key molecule in the middle atmosphere because it absorbs solar ultraviolet (UV) radiation and contributes to the radiative balance of the stratosphere. Understanding changes occurring in the distribution of ozone in the atmosphere is, therefore, important for studying ozone recovery, climate change and the coupling between these processes (WMO, 2007). To this end, it is important to have continuous high quality measurements of ozone in the stratosphere. Profile measurements from satellite-borne instruments provide height-resolved information that can be used to understand changes in ozone concentrations occurring at different altitudes. For the past two decades, one of the primary sources for ozone profile information has been satellite-borne instruments making solar occultation measurements. The solar occultation technique provides self-calibrating measurements of atmospheric absorption spectra with a high signal-to-noise ratio and good vertical resolution. Thus, to extend this time series of measurements in a consistent way, it is crucial to conduct validation studies that compare the results from new instruments with those from older and more established instruments.
The newest satellite for solar occultation studies is the Atmospheric Chemistry Experiment (ACE). This Canadian-led satellite mission, also known as SCISAT, was launched on 12 August 2003. There are two instruments on-board the spacecraft that provide vertical profiles of ozone and a range of trace gas constituents, as well as temperature and atmospheric extinction due to aerosols. The ACE Fourier Transform Spectrometer (ACE-FTS) measures in the infrared (IR) region of the spectrum and the Measurement of Aerosol Extinction in the Stratosphere and Troposphere Retrieved by Occultation (ACE-MAESTRO) operates in the UV/visible/near-IR. The main objective of the ACE mission is to understand the global-scale chemical and dynamical processes which govern the abundance of ozone from the upper troposphere to the lower mesosphere, with an emphasis on chemistry and dynamics in the Arctic. SCISAT, the platform carrying the ACE-FTS and ACE-MAESTRO, is in a circular low-Earth orbit, with a 74° inclination and an altitude of 650 km. From this orbit, the instruments measure up to 15 sunrise (hereinafter SR) and 15 sunset (hereinafter SS) occultations each day. Global coverage of the tropical, mid-latitude and polar regions (with the highest sampling in the Arctic and Antarctic) is achieved over the course of one year and the ACE measurement latitude pattern repeats each year. When ACE was launched, there were several solar occultation satellite-borne instruments in operation: Stratospheric Aerosol and Gas Experiment (SAGE) II (Mauldin et al., 1985), SAGE III (SAGE ATBD Team, 2002a), HALogen Occultation Experiment (HALOE) (Russell et al., 1993), Polar Ozone and Aerosol Measurement (POAM) III (Lucke et al., 1999) and SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY (SCIAMACHY) (Bovensmann et al., 1999). The first four instruments only make occultation measurements while SCIAMACHY operates in nadir, limb and occultation modes.
Between August and December 2005, the SAGE II, SAGE III, HALOE, and POAM III measurements ended. Currently, ACE-FTS and ACE-MAESTRO are the only satellite-borne instruments operating exclusively in solar occultation mode, while SCIAMACHY provides occultation measurements in addition to its limb and nadir observations. To be able to extend the long-standing record of observations from the SAGE II, SAGE III, POAM III and HALOE instruments, it is important that the ozone measurements provided by ACE-FTS and ACE-MAESTRO be well characterized and their quality thoroughly assessed.
In this paper, we present extensive studies focusing on bias determination for the most recent ozone data products from ACE-FTS (version 2.2 Ozone Update) and ACE-MAESTRO (version 1.2). The current ozone data are here compared with measurements from satellite-borne instruments as well as ozonesondes and balloon-borne, airborne and ground-based instruments employing different observation techniques. Section 2 describes the ACE satellite mission, instruments, and the ozone data products. The coincidence criteria and the validation methodology are described in Sects. 3 and 4, respectively. The comparisons are organized by instrument platform in the following two sections, Sect. 5 for the satellites and Sect. 6 for the ozonesondes, balloon-borne, airborne and ground-based instruments. The overall results are summarized and discussed in Sect. 7 and conclusions are given in Sect. 8.

ACE-FTS
The primary instrument for the ACE mission, the ACE-FTS, is a successor to the Atmospheric Trace MOlecule Spectroscopy (ATMOS) experiment, an infrared FTS that operated during four flights on the Space Shuttle (in 1985, 1992, 1993 and 1994). ACE-FTS measures high-resolution (0.02 cm⁻¹) atmospheric spectra between 750 and 4400 cm⁻¹ (2.2-13 µm). A feedback-controlled pointing mirror is used to target the centre of the Sun and track it during the measurements. Typical signal-to-noise ratios are more than 300 from ∼900 to 3700 cm⁻¹. From the 650 km ACE orbit, the instrument field-of-view (1.25 mrad) corresponds to a maximum vertical resolution of 3-4 km. The vertical spacing between consecutive 2 s ACE-FTS measurements depends on the satellite's orbit geometry during the occultation and can vary from 1.5 to 6 km. The altitude coverage of the measurements extends from the cloud tops to ∼100-150 km. The suntracker used by the ACE instruments cannot operate in the presence of thick clouds in the field-of-view. Therefore the profiles do not extend below cloud top level. The lower altitude limit of the profiles is thus generally 8-10 km, extending in some cases to 5 km, depending on the presence or absence of clouds.
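The 3-4 km figure quoted above for the field-of-view can be checked with simple limb geometry. The following is a minimal sketch, not the instrument team's calculation: it assumes a spherical Earth of radius 6371 km, ignores refraction, and uses a hypothetical helper name (`fov_footprint_km`) and an illustrative tangent altitude of 20 km.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius (not from the paper)


def fov_footprint_km(orbit_alt_km, tangent_alt_km, fov_mrad):
    """Vertical extent of the field-of-view at the tangent point, in km."""
    r_sat = EARTH_RADIUS_KM + orbit_alt_km
    r_tan = EARTH_RADIUS_KM + tangent_alt_km
    # Straight-line distance from the satellite to the tangent point
    # (right triangle with the Earth's centre; refraction is ignored).
    dist_km = math.sqrt(r_sat ** 2 - r_tan ** 2)
    # Small-angle approximation: footprint = distance * angle (in radians).
    return dist_km * fov_mrad * 1.0e-3


# 650 km orbit and 1.25 mrad field-of-view, as quoted in the text;
# the 20 km tangent altitude is an illustrative choice.
footprint = fov_footprint_km(650.0, 20.0, 1.25)
print(f"Field-of-view footprint at the limb: {footprint:.2f} km")
```

With these assumptions the satellite is roughly 2900 km from the tangent point, so the 1.25 mrad field-of-view spans a little under 4 km, in line with the quoted range.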
Vertical profiles of atmospheric parameters, namely temperature, pressure and volume mixing ratios (VMRs) of trace constituents, are retrieved from the occultation spectra. This is described in detail in Boone et al. (2005). Briefly, retrieval parameters are determined simultaneously in a modified global fit approach based on the Levenberg-Marquardt nonlinear least-squares method (see Boone et al., 2005, and references therein). The retrieval process consists of two steps. Knowledge of pressure and temperature is critical for the retrieval of VMR profiles. However, sufficiently accurate meteorological data are not available for the complete altitude range of ACE-FTS observations. Therefore, the first step of the retrieval derives atmospheric pressure and temperature profiles directly from the ACE-FTS spectra, using microwindows containing CO₂ spectral lines. During the second phase of the retrieval process, these profiles are used to calculate synthetic spectra that are compared to the ACE-FTS measured spectra in the global fitting procedure to retrieve the VMR profiles of the target species. In the current ACE-FTS dataset (version 2.2 with updates for ozone, N₂O₅, and HDO), profiles are retrieved for more than 30 species using spectroscopic information from the HITRAN 2004 line list (Rothman et al., 2005). First-guess profiles are based on the results of the ATMOS mission. It is important to emphasize that the global fitting approach used here does not use the Optimal Estimation Method, and hence does not impose constraints based on a priori information. Therefore the retrieval method is not sensitive to the first-guess profiles. Also, averaging kernels are not available for the ACE-FTS retrievals. The altitude range of the ozone retrievals typically extends from ∼10 km to ∼95 km. The final results are provided jointly on the measurement (tangent height) grid and interpolated onto a 1 km grid using a piecewise quadratic method.
The latter form is used for all analyses presented in this study. The uncertainties reported in the data files are the statistical fitting errors from the least-squares process and do not include systematic components or parameter correlations. The mean relative fitting errors are lower than 3% between 12 and ∼65 km and typically less than 1.5% around the VMR peak (30-35 km). A detailed error budget including systematic errors is not currently available for the ACE-FTS data products.
Initial validation comparisons for ACE-FTS version 1.0 ozone retrievals have been reported (Petelina et al., 2005a; Fussen et al., 2005; McHugh et al., 2005; Kerzenmacher et al., 2005). Version 2.1 ozone was used in the early validation studies for the Microwave Limb Sounder (MLS) on the Aura satellite (hereafter Aura-MLS) by Froidevaux et al. (2006). In these earlier ACE-FTS ozone retrievals (up to and including version 2.2), a set of microwindows from two distinct spectral regions (near ∼5 µm and ∼10 µm) was used. Because of apparent discrepancies in the spectroscopic data for these two regions, the vertical profiles near the stratospheric ozone concentration peak were found to have a consistent low bias of ∼10% in comparisons with other satellite-borne instruments. This was corrected in an update to version 2.2 by removing from the analysis the microwindows in the 5 µm spectral region. A consistent set of 37 microwindows around 10 µm (from 985 to 1128 cm⁻¹, with the addition of one microwindow at 922 cm⁻¹ to improve results for the interfering molecule CFC-12) is now used for ozone retrievals. This O₃ data product, "version 2.2 Ozone Update", is used in the comparisons presented here. These version 2.2 Ozone Update profiles were used in recent validation studies for Aura-MLS (Froidevaux et al., 2008) and the Michelson Interferometer for Passive Atmospheric Sounding (MIPAS) on Envisat (Cortesi et al., 2007). The agreement with Aura-MLS version 2.2 ozone profiles is within 5% in the lower stratosphere (with ACE-FTS ozone VMRs consistently larger than those of Aura-MLS), but degrades with altitude, with the largest difference in the upper stratosphere (up to ∼25%) (Froidevaux et al., 2008). Relative differences with the MIPAS ESA operational ozone v4.62 data products are within ±10% between 250 and ∼2 hPa (10-42 km) but increase above this range, with ACE-FTS reporting larger VMR values than MIPAS by up to +40% around 0.6 hPa (∼53 km) (Cortesi et al., 2007).

ACE-MAESTRO
ACE-MAESTRO is a dual-grating diode-array spectrophotometer that extends the wavelength range of the ACE measurements into the near-IR to UV spectral region. It records over a nominal range of 400-1010 nm with a spectral resolution of 1.5-2 nm for its solar occultation measurements. The forerunner of the ACE-MAESTRO is the SunPhotoSpectrometer instrument which was used extensively by Environment Canada as part of the NASA ER-2 stratospheric chemistry research program (McElroy, 1995; McElroy et al., 1995). ACE-MAESTRO uses the same sun tracking mirror as the ACE-FTS, receiving ∼7% of the beam collected by the mirror. The ACE-MAESTRO instrument vertical field-of-view is ∼1 km at the limb. The observation tangent altitudes range from the cloud tops to 100 km with a vertical resolution estimated at better than 1.7 km.
The processing of ACE-MAESTRO version 1.2 occultation data is done in two stages and is described in McElroy et al. (2007). In summary, the raw data are converted to wavelength-calibrated spectra, corrected for stray light, dark current and other instrument parameters in the first step. The corrected spectra are then analyzed by a nonlinear least-squares spectral fitting code to calculate slant-path column densities for each spectrum, from which vertical profiles of O₃ and NO₂ VMRs are subsequently derived. The retrieval algorithm does not require any a priori information or other constraints. The inversion routine uses the pressure and temperature profiles and tangent heights from the ACE-FTS data analysis to fix the tangent heights for ACE-MAESTRO. Vertical profiles for the trace gases are determined by adjusting an initial guess (high-vertical-resolution model simulation) using a nonlinear Chahine relaxation inversion algorithm (see McElroy et al., 2007, and references therein). The final profiles are provided both on the tangent grid and linearly interpolated onto a 0.5 km-spacing vertical grid. As is done for ACE-FTS, the latter profiles are used in the analyses presented in this work. Propagation of the spectral fitting errors in the ozone VMR retrievals yields typical errors of 1-2% between 20 and 40 km, increasing above and below this range. An error budget including systematic errors has not been produced for the ACE-MAESTRO ozone product. Averaging kernels are not available for the ACE-MAESTRO retrievals.
As described above, ACE-MAESTRO consists of two spectrophotometers and each can provide vertical VMR profiles for ozone. Following the previous validation study of Kar et al. (2007), this work presents only the comparisons made with the Visible-Near-IR (VIS) spectrometer ozone data product. The retrieved profiles from the VIS spectrometer are in good agreement (mean relative differences within ±10%) with those obtained from the UV spectrometer over the altitude range where the UV data have good signal-to-noise (∼15-30 km). The VIS profiles provide results over a larger vertical range, necessary for studies in the upper stratosphere and lower mesosphere.
The version 1.2 ACE-MAESTRO data products have been compared with SAGE III, POAM III and ozonesonde observations. Mean relative differences are generally within ±10% from 20-40 km. At higher altitudes, there is a significant bias between the SR observations, for which ACE-MAESTRO reports less ozone than the comparison instrument, and the SS observations, which show a large positive bias for ACE-MAESTRO with respect to the coincident measurements (of up to +30% around 50 km). Direct comparison with the ACE-FTS version 2.2 Ozone Update profiles was also performed by Kar et al. (2007) for data obtained in the period March 2004-March 2005. The SR comparisons show a low bias of ACE-MAESTRO at most altitudes. The mean relative differences are within ±5% between 22 and 42 km, and increase above and below this range to a maximum value of −30% at 15 and 55 km. For the SS comparisons, the mean relative differences remain globally within ±5% for the Northern Hemisphere occultations, with ACE-MAESTRO VMR values lower than those of ACE-FTS except around 40 km; however, the mean relative differences are larger (within ±10%) for the Southern Hemisphere observations, with ACE-MAESTRO showing less ozone than ACE-FTS below 35 km and more ozone above this altitude.

Temporal and spatial criteria for coincidences
The nominal time period chosen for this study extends over 2.5 years from 21 February 2004 to 31 August 2006. The start date is the first day for which routine, reliable measurements were available for both ACE-FTS and ACE-MAESTRO. This time period includes the 2004, 2005, and 2006 Canadian Arctic ACE Validation Campaigns (Kerzenmacher et al., 2005; Walker et al., 2005; Sung et al., 2007; Manney et al., 2008; Fraser et al., 2008; Fu et al., 2008; Sung et al., 2009) and the final period of measurements from the SAGE II, SAGE III, POAM III and HALOE instruments. Based on availability of correlative measurements, this time period has been adjusted for some comparisons.
Common coincidence criteria were used to search for correlative observations to compare with ACE-FTS and ACE-MAESTRO. In addition to the spatial and temporal criteria discussed below, it was also required that there were profiles available for both ACE instruments for each coincidence. This provided a consistent distribution of comparisons for ACE-FTS and ACE-MAESTRO. Coincidence criteria can vary widely between different validation studies. The coincidence criteria used in this study have been chosen to ensure a sufficient number of coincidences in all comparisons while trying to limit the scatter resulting from relaxed coincidence criteria. For satellite comparisons, a maximum time difference of ±2 h between the ACE observation and the correlative measurement, and maximum latitude and longitude differences of ±5° and ±10°, respectively, were generally used. All time differences were calculated using Universal Time (UT). The geographic coincidence criteria correspond to maximum distances of ∼600 km at high latitudes and about twice this value near the equator. These distances are of the same order of magnitude as the typical ground-track distance of an ACE occultation (300-600 km). Note that the measurement density is lower at low latitudes because of the high inclination of the ACE orbit and, therefore, we have significantly fewer coincidences available in the tropics and subtropics. These criteria provide good statistics consisting of a few hundred to several thousand events for most satellite-borne instruments. The list of the correlative datasets, time periods, number of coincidences and mean values of the distance and of the time, latitude and longitude differences is given in Table 1. For the sparser datasets from ozonesondes and airborne, balloon-borne and ground-based instruments, it is more difficult to find coincidences using the criteria listed above.
In those cases, a similar fixed distance criterion was used (800 km for ozonesondes, 500 to 1000 km for other ground-based instruments) but the time criterion was relaxed to ±24 h. This was done in an effort to maximize the number of coincident profiles while at the same time avoiding biases in the atmospheric sampling.
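The coincidence search described above can be illustrated with a short sketch. It is not the code used for the study: the function names, the tuple layout, the longitude wrap-around and the haversine distance on a 6371 km sphere are all assumptions made for illustration. The example also checks the quoted distances implied by the ±5°/±10° box near the equator and at high latitude.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points (degrees in, km out)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2.0) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2.0) ** 2
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_box_coincident(ace, other, max_dt_h=2.0, max_dlat=5.0, max_dlon=10.0):
    """Time/latitude/longitude test, as used for the satellite comparisons.

    Each observation is a (UT datetime, latitude, longitude) tuple.
    """
    dt_h = abs((ace[0] - other[0]).total_seconds()) / 3600.0
    dlat = abs(ace[1] - other[1])
    # Wrap the longitude difference so that 359 deg and 4 deg are 5 deg apart.
    dlon = abs((ace[2] - other[2] + 180.0) % 360.0 - 180.0)
    return dt_h <= max_dt_h and dlat <= max_dlat and dlon <= max_dlon


def is_distance_coincident(ace, other, max_km=800.0, max_dt_h=24.0):
    """Fixed-distance test with a relaxed time window (e.g. 800 km for ozonesondes)."""
    dt_h = abs((ace[0] - other[0]).total_seconds()) / 3600.0
    return dt_h <= max_dt_h and great_circle_km(ace[1], ace[2], other[1], other[2]) <= max_km


ace_obs = (datetime(2005, 1, 15, 12, 0), 65.0, 359.0)
sat_obs = (datetime(2005, 1, 15, 13, 30), 68.0, 4.0)
print(is_box_coincident(ace_obs, sat_obs))

# Largest separation allowed by the 5 deg x 10 deg box, low vs. high latitude.
d_equator = great_circle_km(0.0, 0.0, 5.0, 10.0)
d_high_lat = great_circle_km(70.0, 0.0, 75.0, 10.0)
print(round(d_equator), round(d_high_lat))
```

Under these assumptions the box corner is about 650 km away at 70° latitude and roughly twice that at the equator, consistent with the distances stated in the text.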
To test the sensitivity of the comparison results to the temporal and geolocation criteria of the correlative measurements, we performed comparisons within shorter time periods and smaller geographical regions: typically, comparisons were done for each month of the 2.5-year period and in five latitude bands: four (two in each hemisphere) for mid- and high latitudes (latitudes 30°-60° and 60°-90°, respectively) and a larger one for the tropics and subtropics (30°S-30°N). This analysis was performed for most of the statistical comparisons with satellite-borne instruments and with ozonesondes (not shown). In addition, a detailed check of the time series of the mean relative differences, at each ground-based station, was performed for the study presented in Sect. 6.6. These analyses did not show any systematic latitudinal dependence of the relative differences or apparent temporal trend in the quality of the ACE observations. We also analyzed the dependence of the relative difference profiles on the distance between the measurement pairs and on observation parameters such as the beta angle for occultation instruments or the solar zenith angle for sun-synchronous measurements (not shown). This did not reveal significant systematic biases which might have required the use of narrower coincidence criteria. Finally, we did not find any visible latitude bias between the ACE measurements (e.g., ACE latitudes systematically higher or lower than those of the coincident observations) and the correlative instruments (not shown).
It should be noted that broad criteria such as those defined here may result in multiple coincident observations for a particular ACE occultation, for instance when the ACE orbit footprint is close to the satellite ground-track of the correlative instrument or when the allowed time difference is large (e.g., 24 h). In such cases, each coincident pair (the same occultation measured by ACE-FTS or ACE-MAESTRO paired with a distinct observation from the comparison instrument) is treated as an independent event, except for the statistical comparisons with ozonesondes (see Sect. 6.5) and MicroWave Radiometers (MWRs) (see Sect. 6.9). However, the number of multiple matches did not exceed a few hundred for the largest comparison sets (e.g., for comparisons with SABER), with no more than 6-8 distinct comparison measurements coinciding with a single observation from the ACE instruments.
In a first step, the comparisons with all satellite instruments (Sect. 5) and with the ozonesondes (Sect. 6.5) were made for ACE-FTS or ACE-MAESTRO SR and SS occultations separately. These initial analyses did not show evidence for a systematic SR/SS bias in the ACE-FTS dataset. Therefore, averages over all coincidences (without SR/SS separation) are shown for the ACE-FTS analyses in all sections except Sect. 5.1. Since SR/SS differences can be important for intercomparisons between two solar occultation instruments, the results of the comparisons with SAGE II, HALOE, POAM III and SAGE III (Sect. 5.1) are presented separately for both ACE-FTS and the correlative dataset. For the ACE-MAESTRO measurements, there is a known SR/SS bias. Thus, we present all of the ACE-MAESTRO SR and SS comparisons separately.
Day/night differences in ozone VMR can have an impact on the comparison results in the mesosphere (e.g., Schneider et al., 2005). For the comparisons presented hereafter, we did not routinely use any photochemical model for the ACE measurements to account for these diurnal variations. However, in two cases, a photochemical correction was applied to the correlative data (Sects. 5.4.1 and 5.4.2).

Validation methodology
The satellite data used in the following comparisons have vertical resolutions ranging from 0.5 to 5 km, which is the same order of magnitude as those of the ACE instruments (∼3-4 km for ACE-FTS and better than 1.7 km for ACE-MAESTRO). Therefore, coincident profiles are linearly interpolated onto the ACE vertical grid (with a spacing of 1 km for ACE-FTS or 0.5 km for ACE-MAESTRO) for the comparison. Tests with other interpolation methods (using quadratic or cubic spline), or by comparing at the actual ACE tangent heights, did not yield any systematic differences. For example, the different interpolation methods gave results within a few percent for the Odin/OSIRIS SaskMART dataset (not shown). Secondly, for high-resolution measurements such as those from ozonesondes or other instruments measuring in situ, it is necessary to smooth the comparison data. Since averaging kernels are not available for the ACE measurements, alternative smoothing methods were employed. In this case, two techniques were used: either a smoothing function was applied or an integration method was used.

Table 1. Summary of the coincidence characteristics for the instruments (column 1) and data products (column 2) used in the statistical analyses. The full comparison period, latitude range and number of coincidences are presented in columns 3-5. Columns 6-9 give the mean and 1-σ standard deviation for: great circle distance, differences in latitude, longitude and time between the ACE and correlative measurements. For instruments which have multiple retrieval codes, these are noted in parentheses in column 1.
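The linear interpolation of a correlative profile onto the ACE vertical grid, mentioned above, can be sketched as follows. This is an illustrative implementation only: the function name `interp_linear`, the choice not to extrapolate beyond the measured range, and the sample values are assumptions, not details taken from the study.

```python
def interp_linear(z_new, z, x):
    """Linearly interpolate a profile x(z) onto the altitudes z_new.

    z must be strictly increasing and contain at least two levels.
    Target levels outside the range of z are returned as None, i.e. the
    profile is not extrapolated (an assumption of this sketch).
    """
    out = []
    for zn in z_new:
        if zn < z[0] or zn > z[-1]:
            out.append(None)
            continue
        # Locate the bracketing pair z[k] <= zn <= z[k + 1].
        k = 0
        while k < len(z) - 2 and z[k + 1] < zn:
            k += 1
        frac = (zn - z[k]) / (z[k + 1] - z[k])
        out.append(x[k] + frac * (x[k + 1] - x[k]))
    return out


# Hypothetical correlative profile (altitude in km, VMR in ppmv) ...
z_corr = [10.0, 12.5, 15.0]
x_corr = [1.0, 2.0, 4.0]
# ... placed on a 1 km grid such as the one used for ACE-FTS.
ace_grid = [float(k) for k in range(10, 17)]
on_ace_grid = interp_linear(ace_grid, z_corr, x_corr)
print(on_ace_grid)
```

The same routine applies to the 0.5 km ACE-MAESTRO grid by changing `ace_grid`.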
For most in situ and high-resolution profile comparisons, smoothing (convolution) functions were created for ACE-FTS, consisting of triangular functions of full width at the base equal to 3 km and centered at the tangent heights of each occultation. This value was chosen to account for the smoothing effect of the limited ACE-FTS vertical resolution (∼3-4 km field-of-view), whilst allowing for simplified but valid systematic analysis. Furthermore, it accounts for the vertical spacing of the tangent heights in a retrieved ACE-FTS profile. The spacing varies with altitude (including refraction below ∼30 km) and with the beta angle for the occultation (angle between the satellite orbital plane and the Earth-Sun vector). The minimum spacing is about 1.5 km at low altitudes for a high-beta occultation and increases to a maximum value of ∼6 km at mesospheric heights for a low-beta event. High-resolution correlative measurements are convolved with these triangular functions for each ACE tangent height $z_i$:

$$x_s(z_i) = \frac{\sum_{j=1}^{n_{hr}} w_j \, x_{hr}\!\left(z_j^{hr}\right)}{\sum_{j=1}^{n_{hr}} w_j} \tag{1}$$

where $x_s(z_i)$ is the smoothed mixing ratio for the high-resolution instrument at tangent height $z_i$, $x_{hr}$ is the VMR value of the high-resolution profile at altitude $z_j^{hr}$, $w_j$ the associated weight (function of $z_j^{hr} - z_i$), and $n_{hr}$ the number of points from the high-resolution profile found in the 3 km layer centered at $z_i$. The resulting smoothed profile is subsequently interpolated onto the 1 km grid. For ACE-MAESTRO comparisons, the high-resolution profiles are smoothed by convolution with a Gaussian filter of full width at half-maximum (FWHM) equal to 1.7 km, which is the upper limit for the vertical resolution of the instrument. The smoothed profiles are then interpolated onto the ACE-MAESTRO 0.5 km grid. This smoothing technique was used by Kar et al. (2007).
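As an illustration of the triangular smoothing used for ACE-FTS, the sketch below convolves a high-resolution profile with a triangle of 3 km full base width centered at each tangent height. It assumes that the weights fall linearly to zero at ±1.5 km and that the weighted sum is normalised by the sum of the weights; the function name and the synthetic profiles are hypothetical.

```python
def triangular_smooth(z_tangent, z_hr, x_hr, base_width_km=3.0):
    """Smooth a high-resolution profile with triangular weights.

    z_tangent : ACE tangent heights (km)
    z_hr, x_hr: altitudes (km) and VMRs of the high-resolution profile
    Returns one smoothed value per tangent height, or None where no
    high-resolution point falls inside the triangle.
    """
    half = base_width_km / 2.0
    smoothed = []
    for zi in z_tangent:
        num = 0.0
        den = 0.0
        for zj, xj in zip(z_hr, x_hr):
            # Weight is 1 at the tangent height and 0 at +/- half the base width.
            w = max(0.0, 1.0 - abs(zj - zi) / half)
            num += w * xj
            den += w
        smoothed.append(num / den if den > 0.0 else None)
    return smoothed


# Synthetic sonde-like profile sampled every 100 m from 0 to 40 km.
z_sonde = [0.1 * k for k in range(401)]
x_linear = list(z_sonde)            # VMR increasing linearly with altitude
x_const = [5.0] * len(z_sonde)      # constant VMR

print(triangular_smooth([20.0], z_sonde, x_linear))
print(triangular_smooth([20.0, 25.5], z_sonde, x_const))
```

A constant profile is returned unchanged and a linear profile keeps its value at the triangle centre, which is a quick sanity check on the normalisation. The Gaussian filter used for ACE-MAESTRO could be sketched the same way by swapping the weight function.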
An alternative method is used in some comparisons with ozonesondes and lidars (Sect. 6.6). To account for the higher vertical resolution of the ozonesonde and lidar measurements, these profiles are first integrated to obtain partial columns calculated within layers centered at the ACE measurement grid levels (tangent heights). To calculate the partial column corresponding to altitude $z_i$, the layer edges are defined as the mid-points between tangent heights $z_{i-1}$ and $z_i$ (lower limit) and $z_i$ and $z_{i+1}$ (upper limit). Then these partial columns are converted to VMR values attributed to the same tangent heights. The resulting profiles are interpolated onto the ACE-FTS (1 km) and ACE-MAESTRO (0.5 km) altitude grids.
Thirdly, for ground-based measurements with lower vertical resolution than the ACE instruments (Fourier Transform IR spectrometers (FTIRs) and MWRs), the ACE-FTS and ACE-MAESTRO profiles are smoothed using the averaging kernels calculated during the ground-based retrieval process, following the method of Rodgers and Connor (2003):

$$\mathbf{x}_s = \mathbf{x}_a + \mathbf{A}\left(\mathbf{x}_{\mathrm{ACE}} - \mathbf{x}_a\right) \tag{2}$$

where $\mathbf{x}_{\mathrm{ACE}}$ is the original ACE profile (ACE-FTS or ACE-MAESTRO), $\mathbf{x}_s$ is the smoothed profile, and $\mathbf{x}_a$ and $\mathbf{A}$ are the a priori profile and the averaging kernel matrix of the ground-based instrument, respectively.

For the analysis, data are screened to reject either the whole profile or identified low-quality measurements at some altitudes. First, the data from each instrument are filtered according to the recommendations provided by each calibration/processing team. The specific criteria that were used are described in the appropriate subsections of Sects. 5 and 6. The profiles which do not meet the quality requirements are rejected as a whole. Then, altitude levels for which the stated error represents more than 100% of the profile value, or which exhibit unphysical VMR values (outside of the relatively broad interval of [−10; +20] ppmv), are excluded from the analysis. This generally leads to a lower number of comparison pairs at the lowermost and uppermost altitude levels. Negative VMR values are not systematically rejected as they can be produced by the retrieval process as an artifact due to noise in the measurements, especially at altitudes where O₃ abundance is naturally low. Finally, an initial comparison step was used to identify and remove erroneous profiles that were not rejected during the aforementioned analysis (a maximum of 5-6 per comparison set). These general filtering criteria were applied to all comparisons given in Sects. 5 and 6.
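Two operations from this part of the methodology lend themselves to a short sketch: the averaging-kernel smoothing of Rodgers and Connor (2003), taken here in its usual form x_s = x_a + A (x_ACE − x_a), and the level-by-level screening on stated error and VMR range. The function names, the list-of-lists matrix layout and the sample numbers are assumptions for illustration; the sketch also assumes the ACE profile has already been placed on the ground-based retrieval grid.

```python
def smooth_with_averaging_kernel(x_ace, x_a, A):
    """Apply x_s = x_a + A (x_ace - x_a).

    x_ace, x_a : profiles on the ground-based retrieval grid (same length n)
    A          : n x n averaging kernel matrix as a list of rows
    """
    diff = [xe - xa for xe, xa in zip(x_ace, x_a)]
    return [xa + sum(a * d for a, d in zip(row, diff)) for xa, row in zip(x_a, A)]


def screen_levels(vmr, err, vmr_min=-10.0, vmr_max=20.0, max_rel_err=1.0):
    """Return True for each altitude level that is kept (VMR and error in ppmv).

    A level is dropped if the stated error exceeds 100% of the VMR magnitude
    or if the VMR lies outside [vmr_min, vmr_max]. Negative VMRs are kept,
    provided they pass both tests.
    """
    keep = []
    for x, e in zip(vmr, err):
        physical = vmr_min <= x <= vmr_max
        small_err = abs(e) <= max_rel_err * abs(x)
        keep.append(physical and small_err)
    return keep


# Two-level toy example: a kernel that averages the two levels.
A_toy = [[0.5, 0.5], [0.5, 0.5]]
x_smoothed = smooth_with_averaging_kernel([6.0, 2.0], [4.0, 4.0], A_toy)
print(x_smoothed)

mask = screen_levels([5.0, 0.2, 25.0, -0.5], [0.1, 0.3, 0.5, 0.2])
print(mask)
```

With an identity kernel the ACE profile is returned unchanged, and with a zero kernel the result collapses to the a priori, which illustrates how the kernel controls the weight given to the measurement.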
Differences are calculated for each individual pair of profiles, at the altitude levels where both instruments satisfy the screening criteria described above. The difference at a given altitude $z$ is expressed as

$$\delta(z) = \frac{x_{\mathrm{ACE}}(z) - x_{\mathrm{comp}}(z)}{x_{\mathrm{ref}}(z)} \tag{3}$$

where $x_{\mathrm{ACE}}(z)$ is the VMR at altitude $z$ for ACE (ACE-FTS or ACE-MAESTRO), $x_{\mathrm{comp}}(z)$ the corresponding VMR for the comparison instrument, and $x_{\mathrm{ref}}(z)$ is given by

$$x_{\mathrm{ref}}(z) = \begin{cases} 1 & \text{(abs.)} \\ x_{\mathrm{comp}}(z) & \text{(rel.-sondes and ground-based)} \\ \left(x_{\mathrm{ACE}}(z) + x_{\mathrm{comp}}(z)\right)/2 & \text{(rel.-others)} \end{cases}$$

The first line is the value of $x_{\mathrm{ref}}(z)$ for absolute difference calculations. The second and third lines give the denominator for calculations of relative differences for the ozonesondes and the ground-based instruments and for all other comparisons, respectively. This difference in the relative difference calculation method is based on the assumption that the in situ high-resolution ozonesonde measurements are a good reference for the comparisons, while satellite-borne measurements are affected by larger uncertainties and a more logical reference is the average of both instruments' VMRs (Randall et al., 2003). There are two exceptions. For the comparisons with the Airborne SUbmillimeter Radiometer (ASUR, Sect. 6.1), $x_{\mathrm{ref}}(z) = x_{\mathrm{ACE}}(z)$ was used. In comparisons between ACE and the Global Ozone Monitoring by Occultation of Stars (GOMOS, Sect. 5.4.1) instrument, $x_{\mathrm{ref}}(z) = x_{\mathrm{GOMOS}}(z)$ was used as the denominator. In addition, a different calculation methodology has been used for the comparisons with GOMOS. It is explained in detail in Sect. 5.4.1.
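The choice of reference in Eq. (3) can be made concrete with a small sketch. The mode labels (`"abs"`, `"rel-comp"`, `"rel-mean"`, `"rel-ace"`) and the function name are hypothetical; relative differences are returned in percent, as in the rest of the paper.

```python
def difference(x_ace, x_comp, mode="rel-mean"):
    """Difference between an ACE VMR and a comparison VMR at one altitude.

    mode = "abs"      : absolute difference (reference = 1)
    mode = "rel-comp" : relative to the comparison instrument
                        (ozonesondes, ground-based instruments, GOMOS)
    mode = "rel-mean" : relative to the mean of both instruments (other satellites)
    mode = "rel-ace"  : relative to ACE (the ASUR exception)
    """
    if mode == "abs":
        return x_ace - x_comp
    if mode == "rel-comp":
        ref = x_comp
    elif mode == "rel-mean":
        ref = (x_ace + x_comp) / 2.0
    elif mode == "rel-ace":
        ref = x_ace
    else:
        raise ValueError(f"unknown mode: {mode}")
    return 100.0 * (x_ace - x_comp) / ref


# ACE reports 5.5 ppmv where the comparison instrument reports 5.0 ppmv.
for m in ("abs", "rel-comp", "rel-mean", "rel-ace"):
    print(m, round(difference(5.5, 5.0, m), 2))
```

For the same pair of values the relative difference is 10% against the comparison instrument but about 9.5% against the mean of the two, so the choice of denominator matters most where the two instruments disagree strongly.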
The resulting mean differences (absolute or relative) for a complete set of coincident pairs of profiles are calculated as

$$\Delta(z) = \frac{1}{N(z)} \sum_{i=1}^{N(z)} \delta_i(z) \tag{4}$$

where $N(z)$ refers to the number of coincidences at altitude $z$ and $\delta_i(z)$ is the difference (absolute or relative) for the $i$th coincident pair calculated using Eq. (3). The mean relative differences are given in percent in the following sections. In some cases, notably for ACE-MAESTRO, there may seem to be a discrepancy between the apparent differences given by the mean profiles and the sign of the mean relative differences, or between the signs of the mean absolute and relative differences. The reader is reminded that the mean relative differences are not calculated from the mean VMR profiles but from each pair of coincident profiles (Eq. 3). Thus, the mean relative differences can become negative, even though the mean absolute differences are positive, if some profiles exhibit unusually low VMR values at certain altitude levels or if the VMRs for both instruments are of the same magnitude but of opposite signs (e.g., for the comparisons between ACE-MAESTRO and OSIRIS SaskMART, Fig. 10).
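The sign discrepancy described above is easy to reproduce with two made-up coincident pairs at a single altitude. The numbers below are invented purely for illustration and do not come from any of the datasets; relative differences use the mean of both instruments as the reference.

```python
def mean_difference(deltas):
    """Mean of the individual differences at one altitude level."""
    return sum(deltas) / len(deltas)


# (ACE VMR, comparison VMR) in ppmv for two hypothetical coincidences.
# The second pair has unusually low VMR values.
pairs = [(6.0, 5.0), (0.1, 0.4)]

abs_diffs = [a - c for a, c in pairs]
rel_diffs = [100.0 * (a - c) / ((a + c) / 2.0) for a, c in pairs]

mean_abs = mean_difference(abs_diffs)  # dominated by the large-VMR pair
mean_rel = mean_difference(rel_diffs)  # dominated by the low-VMR pair

print(f"mean absolute difference: {mean_abs:+.2f} ppmv")
print(f"mean relative difference: {mean_rel:+.1f} %")
```

Here the mean absolute difference is positive while the mean relative difference is strongly negative, because the small denominator of the low-VMR pair inflates its relative difference.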
Finally, as mentioned in Sect. 2, a full error budget including estimates of the systematic errors is not available for the ACE data products analyzed in this work. Therefore, it is not possible to conduct a full precision validation study. In order to provide the reader with additional information on the significance of the bias and to set an upper limit to the precision of the ACE instruments, we calculate and show the standard deviation of the bias-corrected differences (referred to as "de-biased standard deviation" hereinafter) and the statistical uncertainty of the mean.
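The two statistics named above can be sketched in a few lines. This sketch assumes the conventional sample standard deviation (normalised by N−1) of the differences about their mean, and the standard error obtained by dividing by the square root of N; the function names and the sample differences are hypothetical.

```python
import math


def debiased_std(deltas):
    """Standard deviation of the differences about their mean (N - 1 normalisation)."""
    n = len(deltas)
    mean = sum(deltas) / n
    return math.sqrt(sum((d - mean) ** 2 for d in deltas) / (n - 1))


def standard_error_of_mean(deltas):
    """Statistical uncertainty of the mean difference."""
    return debiased_std(deltas) / math.sqrt(len(deltas))


# Hypothetical relative differences (%) for eight coincidences at one altitude.
deltas = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(f"mean bias:            {sum(deltas) / len(deltas):.2f} %")
print(f"de-biased std. dev.:  {debiased_std(deltas):.2f} %")
print(f"std. error of mean:   {standard_error_of_mean(deltas):.2f} %")
```

The de-biased standard deviation reflects the scatter of individual pairs, whereas the standard error shrinks as more coincidences are added, which is why a small bias can still be statistically significant for a large comparison set.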
The de-biased standard deviation is a measure of the combined precision of the instruments that are being compared (von Clarmann, 2006). It has been used in previous validation studies, for example for POAM III (Randall et al., 2003) or MIPAS. It is expressed for a given altitude as

$$\sigma(z) = \sqrt{\frac{1}{N(z) - 1} \sum_{i=1}^{N(z)} \left(\delta_i(z) - \Delta(z)\right)^2} \qquad (5)$$

where $N(z)$ refers to the number of coincidences at altitude $z$, $\delta_i(z)$ is here the difference (absolute or relative) for the $i$th coincident pair calculated using Eq. (3), and $\Delta(z)$ the mean difference (absolute or relative) calculated from Eq. (4). The statistical uncertainty of the mean differences (also known as standard error of the mean or SEM) is the quantity that allows the significance of the estimated biases to be judged. It is related to the de-biased standard deviation by

$$\mathrm{SEM}(z) = \frac{\sigma(z)}{\sqrt{N(z)}} \qquad (6)$$

[...] O and aerosol extinction, measured using seven channels centered at wavelengths from 0.385 to 1.02 µm. The ozone retrievals use data from the center of the Chappuis absorption band measured by the 0.603 µm channel. The retrieval algorithm is described in detail by Chu et al. (1989).
Data versions prior to version 6.00 have been the subject of several publications, including an extensive study of version 5.96 in the first Stratospheric Processes And their Role in Climate assessment report (SPARC, 1998). In 2000, a major revision of the retrieval algorithm corrected long-standing data issues (version 6.00). Version 6.00 was used in detailed comparisons with HALOE (Morris et al., 2002) and several other instruments (Manney et al., 2001). Subsequent improvements, versions 6.10 and 6.20, were made and have been extensively validated (e.g., Kar et al., 2002; Iyer et al., 2003; Randall et al., 2003). The current version (version 6.20) shows good agreement with correlative measurements within ±5% above ∼18 km. At lower altitudes, the relative differences increase, with a persistent low bias of −10% or more below ∼10 km (e.g., Borchi et al., 2005; Nazaryan and McCormick, 2005; Froidevaux et al., 2008). This version (v6.20) was used for the comparisons with ACE-FTS and ACE-MAESTRO.
Applying the coincidence criteria (±2 h, ±5° in latitude and ±10° in longitude), we found 229 matches in the period between August 2004 and early May 2005. Among these, 199 correspond to SR occultations for both instruments, and 30 to both SS observations. The ACE-FTS comparison results are shown in Fig. 1 for the SR/SR (top panel) and the SS/SS (bottom panel) comparisons. ACE-FTS reports consistently higher ozone values than SAGE II at all altitudes. The mean relative differences are within +10 to +17% in the range 12-18 km, which is comparable to the low bias of SAGE II ozone values previously reported (e.g., Borchi et al., 2005). They are within 0 to +10% between 18 and 42 km for both SR and SS events, with average values of about +5 and +6% for SR and SS, respectively. Above 42 km, both SR and SS comparisons show larger positive differences of up to +20%.
Comparisons for SS events yield generally smaller mean relative difference values, notably around 12 km and in the range 38-44 km (<3%). Below ∼18 km, the de-biased standard deviation of the mean relative differences is large (within 30 to 60% for SR and within 20 to 50% for SS), which is explained by the lower number of coincident pairs and by the large natural variability of the ozone field at these altitudes. Above 18 km, the de-biased standard deviation of the mean relative differences remains lower than 10% for both SR and SS events up to the top of the comparison range. Note also that there is high consistency shown by the standard deviation of the ACE-FTS and SAGE II mean profiles, which confirms that both instruments sounded airmasses with similar variability. Finally, the observed differences are statistically significant as shown by the very small values of the standard errors of the mean. Figure 2 shows the comparisons between the SAGE II and ACE-MAESTRO ozone retrievals for the ACE-MAESTRO SR (top panel) and SS (bottom panel) profiles, respectively. For the SR cases, the agreement is very good between 15 and 55 km with mean relative differences within ±3% throughout, except near 20 km. For the ACE-MAESTRO SS events, the agreement is again quite good (within ±5% between 16 and 45 km), except for a significant positive bias between 45-55 km, reaching a maximum of +17% at 54 km. This is much larger than the SR bias at these altitudes. In contrast to ACE-FTS, the relatively large standard errors of the mean relative differences for the ACE-MAESTRO comparisons show that the observed biases are only marginally significant: below 20 km for both SR and SS events, and in the upper stratosphere for the SS comparisons. The standard deviation of the mean VMR profiles shows a noticeable scatter of the ACE-MAESTRO VMR values, also reflected in the de-biased standard deviation of the mean absolute and relative differences. 
These are within 30 to 70% for the SR comparisons and within 10 to 50% for the SS comparisons. The estimated biases in the stratosphere found for ACE-FTS and ACE-MAESTRO are comparable to those found in previous validation studies for SAGE II. Note also that this analysis provides an incomplete test of biases in the ACE (or SAGE II) datasets since the ACE SR (SS) occultations are all coincident with SAGE II SR (SS) occultations.

UARS/HALOE
The Upper Atmosphere Research Satellite (UARS) (Reber et al., 1993) was deployed from the Space Shuttle Discovery in September 1991. The satellite circled the Earth at an altitude of 585 km with an orbital inclination of 57°. HALOE (Russell et al., 1993) [...] HALOE observations used 8 channels to measure infrared absorption bands between 2.45 and 10.04 µm, providing VMR profiles of trace constituents (including O3, H2O, NO2, and CH4) with a vertical resolution of ∼2 km. O3 profiles are retrieved with an onion-peeling scheme from the 9.6 µm channel, which provides an accurate product from the upper troposphere to the mesopause (Russell et al., 1993).
Extensive validation studies have been conducted for previous versions of the HALOE dataset (e.g., for version 17: Brühl et al., 1996; for version 18: Bhatt et al., 1999). The latest version, version 19 (hereinafter V19), has also been compared to numerous correlative measurements. Good agreement, to within ∼10%, was found in comparisons with various satellite-borne instruments for the mid-latitudes in November 1994 (Manney et al., 2001). Differences of 4 to 11% were found between HALOE V19 and SAGE II version 6.10 throughout the stratosphere (Randall et al., 2003). The differences with the POAM III version 3 ozone profiles were typically smaller than 5% and always within ±10% (Randall et al., 2003). Comparisons with the MIPAS IMK-IAA version V3O_O3_7 retrievals show a global agreement within 10% in the middle and upper stratosphere. The agreement of the HALOE V19 O3 profiles with the most recent release (version 2.2) of the Aura-MLS ozone data product is ∼5% between 68 and 2 hPa (∼20-42 km) but degrades to 15% at 100 and 147 hPa (∼15 and ∼14 km, respectively), with Aura-MLS values larger than the HALOE values (Froidevaux et al., 2008). In this study, we use the HALOE V19 ozone retrievals.
In the comparisons, only 49 pairs of coincident profiles were found using ±2 h, ±5° in latitude and ±10° in longitude for the coincidence criteria. As for SAGE II, there are no SR/SS collocations, but only SR/SR and SS/SS events (respectively 8 and 41 coincidences). In Fig. 3, we present the results for the SS/SS comparisons only, because of the limited number of coincidences for the SR events. The ACE-FTS mixing ratios exhibit a positive bias over most of the altitude range. Mean relative differences for the SS comparisons are within +4 to +13% in the range 15-42 km, increasing to about +28% at 60 km. These larger positive mean relative differences are similar to those noted with SAGE II and are a persistent feature in most of the profile comparisons presented in this paper. The de-biased standard deviation of the mean relative differences remains small at all altitudes above ∼17 km (<8% throughout). As for SAGE II, the standard errors of the mean show that the observed differences are statistically significant.
The ACE-MAESTRO comparisons were also done separately for SR and SS events. As for ACE-FTS, only the comparison between ACE-MAESTRO SS and HALOE SS results is shown (Fig. 4). For this comparison, there is good agreement between 12 km and 40 km, with mean relative differences within 0 to +10% (+5% on average). The mean relative differences increase thereafter to a maximum of about +27% near 55 km. This is generally similar to the ACE-FTS – HALOE comparison shown above. Contrary to the comparisons with SAGE II, there is little discrepancy in the standard deviations of the ACE-MAESTRO and HALOE mean VMR profiles, except above 45 km. The de-biased standard deviations of the mean relative differences are larger than those found for ACE-FTS but remain within 10% between 15 and ∼50 km.

POAM III
POAM III (Lucke et al., 1999) was launched in March 1998 onboard the fourth Satellite Pour l'Observation de la Terre (SPOT-4) in a sun-synchronous orbit, with an altitude of 833 km, an inclination of 98.7° and ascending node crossing at 22:30 (local time). It is a solar occultation instrument able to provide high-resolution (∼1 km) vertical profiles of O3, NO2, H2O and aerosol extinction using nine filter channels from 0.353 to 1.02 µm. POAM III measured in high latitude ranges throughout the year (∼55°-71° N and ∼63°-88° S), with satellite sunrises in the Northern Hemisphere and satellite sunsets in the Southern Hemisphere. POAM III was operational from April 1998 to early December 2005.
Briefly, the retrieval algorithm for POAM III consists of a spectral inversion for species separation, followed by the limb (vertical) inversion. Ozone is retrieved primarily from the 0.603 µm channel where the Chappuis absorption dominates the total optical depth between 15 and 60 km.
The retrieval and error budget for the version 3 (v3) data products are described in detail in Lumpe et al. (2002). The ozone v3 retrievals have been extensively compared and validated using observations from aircraft, balloon and satellite-borne instruments (see Randall et al., 2003, and references therein). They were shown to be highly accurate from 13 to 60 km with a typical agreement of ±5%. A possible slight bias of ∼5% was noted between the SR (Northern Hemisphere) and SS (Southern Hemisphere) profiles, and a high bias (up to 0.1 ppmv) was found below 12 km (Randall et al., 2003). For these comparisons, we use version 4 (hereinafter v4) of the POAM III retrievals. This version was improved to account for problems in the POAM III v3 retrievals, due in part to unexpected instrument degradation over the course of the mission. Comparative studies similar to those conducted with v3 show that the general conclusions of Randall et al. (2003) can be applied to POAM III v4 ozone data (http://eosweb.larc.nasa.gov/PRODOCS/poam3/documents/poam3_ver4_validation.pdf).
The quality flag implemented for the POAM III v4 O3 product (http://eosweb.larc.nasa.gov/PRODOCS/poam3/documents/poam3_ver4_documentation.pdf) was used for data screening: altitude levels with non-zero values of the quality flag were excluded from the calculations. We used ±2 h, ±5° in latitude and ±10° in longitude for the coincidence search. A total of 376 coincidences was found in the comparison period, with about 1/3 in the Northern Hemisphere (POAM III SR) and the remainder in the Southern Hemisphere (POAM III SS).
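The screening and coincidence search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for this study: the function names are hypothetical, the time difference is assumed to be available in hours, and the wrap-around of the longitude difference at ±180° is an assumption that is not stated in the text.

```python
def is_coincident(dt_hours, lat1, lon1, lat2, lon2,
                  max_dt=2.0, max_dlat=5.0, max_dlon=10.0):
    """True if two measurements lie within the time/latitude/longitude limits.

    dt_hours: time difference between the two measurements, in hours.
    Latitudes and longitudes are in degrees.
    """
    dlat = abs(lat1 - lat2)
    # Wrap the longitude difference into [0, 180] degrees (assumption).
    dlon = abs(lon1 - lon2) % 360.0
    dlon = min(dlon, 360.0 - dlon)
    return abs(dt_hours) <= max_dt and dlat <= max_dlat and dlon <= max_dlon


def screen_levels(vmr, quality_flag):
    """Replace VMRs at levels with a non-zero quality flag by None."""
    return [x if q == 0 else None for x, q in zip(vmr, quality_flag)]
```

For example, `is_coincident(1.5, 65.0, 179.0, 62.0, -178.0)` treats the two longitudes as 3° apart rather than 357°, so a pair straddling the date line is not lost.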
Atmos. Chem. Phys., 9, 2009, www.atmos-chem-phys.net/9/287/2009/, E. Dupuy et al.: Validation of ACE ozone
Results are shown in Fig. 5 for the ACE-FTS SR (top) and SS (bottom) occultations. Mean relative differences are within ±10% (+2 to +5% on average) between ∼12 and ∼42 km for both SR and SS. In particular, the ACE-FTS SS/POAM III SS results show an excellent agreement with mean relative differences within ±3% in the range 23-41 km and de-biased standard deviation of the mean relative differences lower than 5%. These are indicative of a good combined precision for these events and therefore imply low random errors for the ACE-FTS retrievals. The largest differences are found for the ACE-FTS SR/POAM III SS comparisons (109 coincidences, with mean relative differences within 0 to +13%). Below 16 km, ACE-FTS measures consistently less ozone than POAM III, with large mean relative differences corresponding to mean absolute differences of less than 0.1 ppmv. The de-biased standard deviation of the mean relative differences is lower than 8% (SR/SS and SS/SR) and 15% (SR/SR and SS/SS) between about 12 and 42 km. Above 42 km, mean relative differences increase to a maximum of +34% around 60 km. The largest mean relative differences are found for the ACE-FTS SR/POAM III SS events in the range 42-48 km and for the ACE-FTS SS/POAM III SR pairs (∼230 coincidences) above 42 km. In each panel of Fig. 5, a discrepancy in the mean relative difference profiles can be seen, notably at high altitudes. However, when comparing all ACE-FTS SR profiles against POAM III (top panel) and all ACE-FTS SS profiles against POAM III (bottom panel), the resulting differences between the ACE-FTS SR and SS observations are always lower than 1-2% (not shown). Therefore the observed differences should not be interpreted as showing a SR/SS bias of the ACE-FTS data.
The ACE-MAESTRO and POAM III comparisons were done by Kar et al. (2007) using measurements from February 2004 to September 2005.
This slightly shorter comparison period did not significantly lower the number of coincidences. Therefore, a short summary will be given but the reader is referred to the analysis of Kar et al. (2007) for more information and to their Figs. 6a and 6b for illustration of the results. ACE-MAESTRO SR events show consistently smaller VMRs from 20-50 km when compared to POAM III SR or SS profiles, with mean relative differences within −5 to −15%. The comparison of the ACE-MAESTRO SS profiles with POAM III yields mean relative differences within ±10% in the altitude range ∼18-40 km, with smallest values (within ±4% from 20-35 km) for the comparisons of ACE-MAESTRO SS and POAM III SR. Above ∼40 km, the ACE-MAESTRO SS profiles show larger ozone values than POAM III (up to +20% for POAM III SR and +30% for POAM III SS). As for SAGE II or HALOE, the shape of the relative difference profile above ∼45 km for the ACE-MAESTRO SS events is qualitatively similar to the results obtained for ACE-FTS at high altitudes. Here also, the de-biased standard deviation of the mean relative differences is larger than that found for ACE-FTS, within 10 to 25% over the comparison altitude range (18-40 km).

SAGE III
Two different processing algorithms have been used for SAGE III ozone retrievals in the upper troposphere and the stratosphere. One is a SAGE II type (least-squares) algorithm using only a few wavelengths and the second one employs a multiple linear regression (MLR) technique to retrieve ozone number densities from the Chappuis absorption band (SAGE ATBD Team, 2002b). The recent study of H. J. [...], using the latest release (version 3.0) of the retrievals, showed that both products are essentially similar from 15 to 40 km.
When compared to correlative measurements, the SAGE II type retrievals provide better precision above 40 km and do not induce artificial hemispheric biases in the upper stratosphere, whereas the MLR retrieval yields slightly better accuracy in the upper troposphere/lower stratosphere (UT/LS) region. Comparisons with ozonesondes, SAGE II and HALOE show that the estimated precision of SAGE III for the least-squares (SAGE II type) retrieval algorithm is better than 5% between 20 and 40 km and ∼10% at 50 km, and the accuracy is ∼5% down to 17 km. In particular, excellent agreement was found with SAGE II from 15 to 50 km, with ozone values reported by SAGE III systematically larger than those of SAGE II by only 2-3%. Below 17 km, SAGE III ozone VMR values are systematically larger than those of the comparison instruments, by 10% at 13 km (H. J. [...]). We use version 3.0 of the ozone data product from the SAGE II type algorithm for the comparisons detailed hereafter.
Of the solar occultation instruments, the most coincidences were found with SAGE III (648 events). There is very good overall agreement between ACE-FTS and SAGE III, as shown in Fig. 6. Mean relative differences are within ±6% from 12-42 km (except for the ACE-FTS SR/SAGE III SR results at 17 km) and generally smaller than ±2%. Above 42 km, ACE-FTS reports larger VMRs than SAGE III (by up to +20%). This is consistent with other comparisons presented in this study. There is no significant difference between the ACE-FTS SR and SS comparisons below 42 km. Above this altitude, the SR results show slightly smaller mean relative differences (by −2 to −6%) but are based on a considerably lower number of coincidences. Based on these comparisons, there does not appear to be a systematic SR/SS bias in the ACE-FTS retrievals. The de-biased standard deviation of the mean relative differences is within 15% at all altitudes but often smaller than 6%, a value comparable to the estimated precision of the SAGE III retrievals.
This could mean that the ACE-FTS contribution to the combined random errors of the comparison is very small.
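The reasoning behind this statement can be illustrated with a short sketch. It assumes that the random errors of the two instruments are independent and add in quadrature, and that differences in the sampled air masses contribute nothing to the scatter; neither assumption is verified here, so the result is at best an upper limit. The function name `residual_precision` and the numbers in the example are illustrative only.

```python
import math

def residual_precision(sigma_combined, sigma_comp):
    """Scatter left for the other instrument, in the unit of the inputs.

    sigma_combined: de-biased standard deviation of the differences.
    sigma_comp: estimated precision of the comparison instrument.
    Returns 0.0 when the comparison instrument alone already accounts
    for the observed scatter.
    """
    residual_sq = sigma_combined ** 2 - sigma_comp ** 2
    return math.sqrt(residual_sq) if residual_sq > 0.0 else 0.0

# Round numbers in percent, for illustration only.
print(residual_precision(6.0, 5.0))
```

When the combined scatter is close to the precision of the comparison instrument, the quadrature difference leaves only a small remainder for the other instrument, which is the sense in which its contribution could be very small.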
As for POAM III, comparisons of ACE-MAESTRO with SAGE III were conducted by Kar et al. (2007) using narrower geographic criteria (maximum distance of 500 km) and will not be reproduced here. Mean relative differences within ±5% are found between 15 and ∼40 km for the larger samples (ACE-MAESTRO SS/SAGE III SR and ACE-MAESTRO SS/SAGE III SS). Above this range, the ACE-MAESTRO SS profiles exhibit a large positive bias with mean relative differences of up to +30%, larger than those found for ACE-FTS. The de-biased standard deviation of the mean relative differences is quite large (within 10 to 20%), which suggests that the ACE-MAESTRO spectral fitting errors do not entirely account for the random errors of the retrieval. For the ACE-MAESTRO SR measurements, the mean relative differences are consistently within −5 to −15% in the altitude range 28-55 km, with smaller values of the de-biased standard deviation (<7%) compared to the SS events. This is shown in Figs. 5a and 5b of Kar et al. (2007).

Odin
The Swedish-led Odin satellite, launched in February 2001, is in a sun-synchronous, near-terminator orbit at ∼600 km with a 97.8° inclination and an ascending node crossing at 18:00 (local time). This orbit provides the limb-scanning instruments with latitudinal coverage in the orbit plane from 82.2° N to 82.2° S. Odin serves both astronomy and aeronomy objectives and, while in normal operation, it shares time equally between aeronomy and astronomy measurements. The stratospheric mode (measured for one day out of every three) scans the Earth's limb from 7 to 70 km with a vertical speed of 0.75 km per second.

Odin/OSIRIS
The Optical Spectrograph and InfraRed Imager System (OSIRIS) is one of the two instruments on Odin. It measures limb-scattered solar radiance in the spectral range 280-810 nm with ∼1 nm resolution. The instrument's vertical field-of-view is ∼1 km at the tangent point. OSIRIS provides approximately 30 ozone profiles per orbit over the sunlit hemisphere (about 60 profiles per orbit during orbital equinox periods).
There are presently two versions of the OSIRIS ozone data product. The retrieval algorithm for the first product is developed and maintained at York University (Toronto, Canada). It applies the inversion technique developed by Flittner et al. (2000) and McPeters et al. (2000) to OSIRIS radiances measured at three wavelengths in the Chappuis absorption band (von Savigny et al., 2003). The resulting ozone number density profiles, version 3.0 (v3.0), are provided from 10-46 km with a 2 km spacing. The York v3.0 data products are described in Haley and Brohede (2007). The major change in the York v3.0 data product is the correction of a pointing drift affecting the previous retrieval versions. Total error estimates for the O3 retrievals are 6% at about 24 km, increasing to ∼14% at 10 km and 33% at 44 km (Haley and Brohede, 2007). These will be referred to as the "York retrievals" hereinafter. There were two previous releases of the York ozone product (v1.2 and v2.4), yielding very similar results (agreement better than 3%). Version 1.2 has been validated against coincident ozonesonde and satellite measurements (e.g., Petelina et al., 2005a). These comparisons showed a good agreement of the OSIRIS York data product with correlative measurements, within ±7% over the altitude range 16-32 km. Recently, v3.0 data were validated against Odin/SMR, POAM III, balloon-borne and ground-based instruments. An overall low bias of the York retrievals, generally of about −15% (−0.3 to −0.7 ppmv depending on the altitude), was found in the range 10-35 km (e.g., Jégou et al., 2008).
The second OSIRIS ozone retrieval algorithm, SaskMART, is developed and maintained at the University of Saskatchewan (Saskatoon, Canada). We also compare the ACE-FTS and ACE-MAESTRO ozone profiles with version 2.1 (v2.1) of this product (hereinafter "SaskMART retrievals").
The SaskMART algorithm combines information from the Chappuis and the Hartley-Huggins bands to infer the ozone number density from the cloud tops to the lower mesosphere. It is described by Roth et al. (2007) and uses a Multiplicative Algebraic Reconstruction Technique (MART) and the SASKTRAN radiative transfer model (Bourassa et al., 2007). SaskMART zonal mean profiles were compared with SAGE II v6.20 and SAGE III v3.0 O3 profiles by Roth et al. (2007). Results show an overall agreement within ±5% for SAGE II and ±10% for SAGE III from 20-40 km, with OSIRIS reporting less ozone over most of the altitude range. Comparisons with SAGE II, using the complete OSIRIS SaskMART dataset over the full altitude range of the retrievals (10-60 km), were conducted by Degenstein et al. (2008). The results show very good agreement with SAGE II, with mean relative differences within ±2% between 18 and 53 km, and a substantial low bias below and above this range (−20% at 58 km).
For OSIRIS, the ACE-FTS profiles were first compared with the York retrievals (Fig. 7). Following the developers' recommendation, only profiles for which the measurement response is greater than 0.9 (i.e., where 90% or more of the information content comes from the observation and not from the a priori (Rodgers, 2000)) were included in the analysis. Furthermore, the data were screened to exclude altitude levels for which the estimated vertical resolution is >5 km. A total of 913 coincidences was found with criteria of ±2 h, ±5° in latitude and ±10° in longitude. As explained in Sect. 3, results for ACE-FTS will now be given for averages over all coincident events, with no SR/SS separation. ACE-FTS consistently reports more ozone than the OSIRIS York retrievals except at the lowermost altitudes (11-12 km). Above 12 km, the mean relative differences are within +4 to +11% throughout, with largest values at 18 and at 37 km (∼+11%).
Here also, the standard error values are very small, indicating that the observed differences are statistically significant. These are, however, compatible with other validation studies of the York v3.0 retrievals. The de-biased standard deviation of the mean relative differences is lower than 15% above 20 km and increases below this altitude. Note again the very good consistency of the standard deviations of the ACE-FTS and York mean VMR profiles (as seen in most comparisons presented in this work).
Results of the comparison of ACE-FTS with the SaskMART retrievals are presented in Fig. 8. In these comparisons, the ACE-FTS VMR values are also consistently larger than those of OSIRIS, but with better agreement (with mean relative differences within ±6%) in the altitude range 9-45 km. Above 45 km mean relative differences increase, up to +44% at 60 km. The de-biased standard deviation of the mean relative differences remains lower than 20% at all altitudes between 18 and 55 km. Considering the low bias previously noted in the comparisons of OSIRIS SaskMART with SAGE II and SAGE III, this suggests that this large positive difference may be the combination of the persistent high bias of ACE-FTS between ∼45 and 55-60 km and of a low bias of the SaskMART retrievals above ∼50 km. Figure 9 shows the results of the comparison between ACE-MAESTRO and the York retrievals, for ACE-MAESTRO SR (top panel) and SS (bottom panel) occultations. For both types of events, the mean relative differences are within ±5% between 16 and 26 km and within +6 to +12% between 26 and 40 km. However, the ACE-MAESTRO SR profiles around 37 km seem to have a larger positive bias compared to the SS profiles, which is opposite to the known SR/SS bias seen with the solar occultation comparisons. The reason for this is not clear at this time.
For ACE-MAESTRO, the de-biased standard deviation of the mean relative differences is larger than for ACE-FTS, with values within 10 to 25% found above 18 km. The de-biased standard deviation of the mean relative differences is slightly smaller for the SS comparisons than for the SR events, but by less than 2-3%. Since these are an estimate of the combined precision of the instruments, the comparison of the results for ACE-FTS and for ACE-MAESTRO could indicate that ACE-MAESTRO retrievals have a noticeably poorer precision than those of ACE-FTS.
The comparison results for ACE-MAESTRO and OSIRIS SaskMART retrievals are shown in Fig. 10 for ACE-MAESTRO SR (top) and SS (bottom) events. The agreement is quite good for the SR events, with mean relative differences within ±7% over the altitude range 18-59 km. For ACE SS events, ACE-MAESTRO ozone mixing ratios have a large positive bias between 40 and 60 km, similar to comparisons with most other instruments. However, the maximum mean relative difference of ∼15% near 53 km is somewhat smaller than the corresponding positive bias for ACE-FTS at this altitude. A SR/SS bias in ACE-MAESTRO ozone measurements can be seen, particularly in the upper stratosphere. The fact that the mean relative differences at the uppermost levels are negative while the mean absolute differences are small but positive is due to very low VMR values in the ACE-MAESTRO retrievals for more than half (∼240 out of ∼450) of the coincident events. The de-biased standard deviation of the mean relative differences for the comparison of ACE-MAESTRO with the SaskMART retrievals is very similar to the York comparisons, with a minimum of ∼10% and a maximum of ∼28% in the altitude range 18-50 km, for both the SR and SS events.

Odin/SMR
The Sub-Millimetre Radiometer (SMR) is the second instrument on board the Odin satellite. It uses four tunable heterodyne radiometers to observe thermal limb emission from atmospheric molecules, in the frequency range 486-581 GHz. In the stratospheric mode, SMR measures several species related to stratospheric ozone processes in two frequency bands centered at 501.8 GHz and 544.6 GHz, namely O3, HNO3, ClO and N2O.
The current best ozone data product for SMR is version 2.1 of the operational processing developed at the Chalmers University of Technology, Gothenburg, Sweden (hereinafter Chalmers-v2.1). It uses the observations of a weak O3 line near 501.5 GHz to retrieve ozone VMRs mainly in the stratosphere (above ∼17-18 km at mid-latitudes), with a retrieval scheme based on the Optimal Estimation Method (Rodgers, 2000). The vertical resolution achieved is on the order of 2.5-3.5 km below ∼40-45 km. Chalmers-v2.1 and two previous operational ozone data products (v1.2 and v2.0) were compared with ozonesondes and with the MIPAS ozone profiles retrieved with the ESA Level 2 processor prototype (Raspollini et al., 2006) version 4.61 in the recent study of Jones et al. (2007). The SMR ozone v2.1 is very similar to the older versions in the altitude range 20-45 km, but is significantly improved below 20 km and above ∼45 km. Comparisons with MIPAS show relative differences of about −10% (smaller than 0.4 ppmv) between 17 and 55 km, with SMR reporting VMR values systematically smaller than those of MIPAS. Absolute differences with ozonesonde measurements are typically within ±0.3 ppmv below 27 km, but the SMR ozone VMRs are smaller than the ozonesonde measurements in the tropics around 30 km (by more than 10% or 0.9 ppmv; Jones et al., 2007). We used the Chalmers-v2.1 SMR ozone data product for the comparisons with the ACE instruments.
The comparisons were made with coincidence criteria of ±2 h, ±5° in latitude and ±10° in longitude. Following the recommendations of the retrieval team, only SMR data with a profile quality flag value of 0 were used at altitude levels where measurement response was greater than 0.9 (see Urban et al., 2005, for a description of the measurement response and the quality flag). The vertical range was limited to altitudes where the SMR measurements have a good signal-to-noise ratio (∼20-55 km). A total of 1161 coincidences was found in the comparison period. The results are presented in Fig. 11. Between 20 and ∼55 km, ACE-FTS consistently reports more ozone than SMR. The mean relative differences are within +2 to +13% (0.5 ppmv) below ∼25 km and within +13 to +20% between 25 and 41 km. In the altitude range 41-55 km, the mean relative differences are larger (within +20 to +30%), which is consistent with the other comparisons presented in this study. Here, the de-biased standard deviation of the mean relative differences is very large, within 30 to 60% between 20 and 55 km. The large positive bias is consistent with previous validation studies for SMR, and the large de-biased standard deviations of the mean relative differences may indicate that the SMR instrument has a relatively limited precision since such large values are not found in most other comparisons.
Similar comparisons were conducted with ACE-MAESTRO and are presented in Fig. 12. Overall, the mean relative differences for the SR and SS events are similar and comparable to those of ACE-FTS. Mean relative differences are within ±10% below 25 km and within +10 to +20% in the altitude range ∼25-44 km (25-40 km) for the ACE-MAESTRO SR (SS) events. The ACE-MAESTRO SR data show more ozone below 33 km than the SS data, which translates into larger values of the mean relative differences (by up to +5%) with SMR at these altitudes.
A larger positive bias is also observed in the ACE-MAESTRO – SMR comparisons between 40 and ∼50 km, with a maximum mean relative difference of about +28%. For both SR and SS comparisons, the de-biased standard deviation of the mean relative differences is comparable to that found for ACE-FTS (within 30 to 60% over the altitude range 25-44 km). Above 50 km, the mean relative differences rapidly decrease and become smaller than +5% at the top of the comparison range (∼55 km).

TIMED/SABER
The Sounding of the Atmosphere using Broadband Emission Radiometry (SABER) instrument is one of the four instruments onboard the Thermosphere, Ionosphere, Mesosphere Energetics and Dynamics (TIMED) satellite. TIMED was launched in December 2001 into an orbit with an altitude of ∼625 km and an inclination of 74°. The latitude coverage alternates between 54° S-82° N and 82° S-54° N, and the local time coverage is ∼22 h in about 60 days. SABER uses ten channels in the near- and mid-IR spectral region (1.27-15 µm) to perform broadband limb emission measurements of pressure, temperature, the O2(¹Δ) and OH Meinel volume emission rates, as well as VMR profiles for CO2, O3 and H2O. The retrieval code takes into account non-local thermodynamic equilibrium (non-LTE) effects in the emissions measured above ∼55 km (Mertens, 2001). The ozone profiles are retrieved from the 9.6 µm channel, in the vertical range ∼12-∼100 km with a vertical spacing of ∼0.4 km. The temperature and wind data have been used extensively for comparisons and scientific publications (e.g., Sica et al., 2008; Forbes et al., 2006; Petelina et al., 2005b; Mertens et al., 2004). However, at the time of writing, there are no published comparisons for the SABER trace gas data. The present study thereby constitutes the first large-scale intercomparison for the SABER ozone dataset.
The SABER O3 data product available at the time of writing, version 1.06 (hereinafter v1.06), is used for the comparisons. A new version (v1.07) is currently being developed, but the reprocessing was not completed in time for this analysis. Version 1.07 should show significant changes in the SABER temperature and ozone retrievals. For O3, it should yield lower VMR values (by a few percent) in the stratosphere and a larger decrease (by 10% or more) in the mesosphere (B. T. Marshall, personal communication).
Results for the ACE-FTS and SABER comparisons are shown in Fig. 13. The shape of the difference profile is significantly different from the comparisons presented above. A total of 6210 coincidences was found between ACE-FTS and SABER with the criteria: ±2 h and ±5° and ±10° for the latitude and longitude differences, respectively. Narrower coincidence criteria did not induce significant changes in the results. Good agreement is found in the altitude range 19-46 km, with mean relative differences within ±7%. ACE-FTS reports less ozone than SABER around the peak in ozone VMR (31-42 km), but shows larger VMRs around 20 km and at altitudes between 42 and 56 km. Below 19 km and above 56 km, the O3 VMRs measured by ACE-FTS are systematically lower than those of SABER. Note that the standard deviation of the SABER mean VMR profile is always larger than that of ACE-FTS, with the largest discrepancy found below 25 km. The de-biased standard deviation of the mean relative differences is within 13 to 30% between 19 and 50 km. The expected decrease in the ozone VMR for the SABER v1.07 ozone data product should significantly reduce the discrepancies, notably in the mesospheric part of the comparison range. However, the reasons for this particular behavior cannot be explained at this time.
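The coincidence criteria quoted above (±2 h, ±5° in latitude, ±10° in longitude) can be illustrated with a minimal sketch. The dictionary keys `time`, `lat` and `lon` and both function names are hypothetical and chosen for illustration only; the sketch returns every pair that satisfies the criteria and does not attempt to reproduce how the study selected a single match when several were available.

```python
from datetime import datetime, timedelta

def is_coincident(ace, other, max_dt_hours=2.0, max_dlat=5.0, max_dlon=10.0):
    """Return True if two observations meet the time, latitude and longitude criteria."""
    dt_hours = abs((ace["time"] - other["time"]).total_seconds()) / 3600.0
    dlat = abs(ace["lat"] - other["lat"])
    # Wrap the longitude difference into [0, 180] so that 359 deg and 1 deg are 2 deg apart.
    dlon = abs((ace["lon"] - other["lon"] + 180.0) % 360.0 - 180.0)
    return dt_hours <= max_dt_hours and dlat <= max_dlat and dlon <= max_dlon

def find_coincidences(ace_obs, other_obs, **criteria):
    """Return the index pairs (i, j) of all coincident observations (brute-force search)."""
    return [
        (i, j)
        for i, a in enumerate(ace_obs)
        for j, o in enumerate(other_obs)
        if is_coincident(a, o, **criteria)
    ]
```

Tightening the keyword arguments gives a simple way to repeat a comparison with narrower criteria, as was done to check the sensitivity of the SABER results.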
The comparisons of the ACE-MAESTRO retrievals with the SABER ozone profiles are shown in Fig. 14. Large mean relative differences are found at the top and at the bottom of the altitude range for both the SR and the SS events (below ∼20 km and above ∼52 km). Between 20 and 52 km, the ACE-MAESTRO SR profiles show good agreement with SABER (Fig. 14, top panel), with mean relative differences within ±7% and decreasing with increasing altitude above 27 km. The corresponding de-biased standard deviation values are within 20 to 40% in the altitude range 20-52 km. The mean relative difference profile for the SS occultations (Fig. 14, bottom panel) is closer in shape to the results found for ACE-FTS, with values within ±4% between 20 and 42 km and de-biased standard deviations of the mean relative differences comparable to, but slightly smaller than for the SR events (within 15 to 30%). Between 42 and 54 km, ACE-MAESTRO SS measurements show VMR values significantly larger than those of SABER, with mean relative differences of up to +16% around 48 km. As was found for the comparisons between ACE-MAESTRO and OSIRIS SaskMART in Sect. 5.2.1, the mean relative differences at the uppermost level of the comparison vertical range are negative for ACE-MAESTRO SS occultations. This is also explained by unusually low values of the retrieved ACE-MAESTRO VMRs.

Envisat
The ESA Environmental Satellite (Envisat) was launched in March 2002 into a quasi-polar, sun-synchronous orbit at an altitude of 800 km, with an inclination of 98.6° and an ascending node crossing at 22:00 (local time). For most of the onboard sensors, this allows complete coverage of the Earth in one to three days. Three of the ten instruments are dedicated to atmospheric chemistry: the GOMOS, MIPAS and SCIAMACHY instruments.

Envisat/GOMOS
GOMOS is a stellar occultation instrument that has been in operation since the launch of Envisat (see Kyrölä et al., 2004, and references therein). It is a UV/visible/near-IR grating spectrometer that can measure about 100 000 star occultations per year with a vertical sampling of better than 1.7 km. From these observations, atmospheric concentration profiles are retrieved for O3, NO2, NO3, H2O, O2, Na, OClO and stratospheric aerosols. The range of latitudes sampled by GOMOS depends on the suitable stars available during each orbit and thus varies throughout the year. GOMOS sounds the atmosphere at different local solar times depending on the position of the star that is being observed. The ozone measurements are made in the 250-687 nm spectral range. GOMOS ozone profiles are produced using a two-step retrieval process (Kyrölä et al., 2006). First, the spectral inversion uses a nonlinear Levenberg-Marquardt method to fit the refraction-corrected atmospheric spectra simultaneously at all wavelengths. Then, the onion-peeling method is used to perform the vertical inversion to obtain profiles. The typical altitude range of the GOMOS ozone retrievals is 15-100 km. The GOMOS precision is strongly influenced by the star magnitude and temperature, as both can impact the signal-to-noise ratio of the measured spectra. The daytime (bright-limb) occultations suffer from additional noise from scattered solar light. Because of this, the comparisons shown here are restricted to nighttime (dark-limb) observations. The GOMOS ozone profiles have been validated using measurements from ozonesondes, lidars and MWRs (Meijer et al., 2004). Between 14 and 64 km, the differences were found to be 2.5-7.5%, with GOMOS measuring less ozone than the comparison instrument. In comparisons with MIPAS and SCIAMACHY, the agreement for GOMOS dark-limb profiles was −5% from 20-50 km and +1% from 20-40 km, respectively (Bracher et al., 2005).
The level 2 data product used for these earlier validation comparisons was version 6.0a. Version IPF 5.00 is used here for the comparisons with ACE-FTS, and the difference between these versions is expected to be less than 1-2%.
The approach taken for the GOMOS comparisons differs from that used for the other satellite instruments. Instead of calculating the mean of the relative differences for the GOMOS and ACE-FTS comparisons, the weighted median difference is determined. This approach, used in earlier GOMOS validation studies (e.g., Fussen et al., 2005), was adopted because outliers in either dataset can significantly influence the results of a mean-based comparison. The weighted median difference, m, is calculated by minimizing, with respect to m, the weighted sum over all coincidences i of the absolute deviations of the individual ACE-FTS and GOMOS differences from m, where x_ACE(i) and x_GOMOS(i) are the profile values at a given altitude for coincidence i, for ACE-FTS and GOMOS respectively, and w_i is the weighting factor, equal to the inverse of the combined estimated experimental errors from ACE-FTS and GOMOS. Figure 15 shows the dependence of the weighted median difference at 24.5 km on the number of collocated events and the spatial and temporal coincidence criteria used for the comparisons. From these results, it can be seen that a larger dataset improves the statistical significance, although a slight linear bias is apparent. Using criteria of ±12 h and 500 km, 1240 pairs of collocated profiles were identified for the comparisons. Because both datasets extend into the mesosphere (60-80 km), we have used the Simulation of Chemistry, Radiation, and Transport of Environmentally important Species (SOCRATES) model to correct the GOMOS data for diurnal variations between the observation time and the local sunset or sunrise. SOCRATES is a two-dimensional chemistry-climate model which extends from the surface to the lower thermosphere. The version used here is optimized to study the heat budget and the photochemistry in the mesosphere (Chabrillat and Fonteyn, 2003; Kazil et al., 2003).
Because the present study requires a precise representation of the chemical composition at sunrise and sunset, the model was run with a photochemical time step of 5 min over a whole year with solar flux conditions representative of the year 2004. Each GOMOS observation was scaled by the modeled ratio between ozone density at local sunset or sunrise and ozone density at the observation time.
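As an illustration of the two steps described above, the following minimal sketch scales a GOMOS value by a modeled sunset/sunrise-to-observation-time ratio and then computes a weighted median difference. It rests on several assumptions that are not stated in the text: the differences are taken as plain (absolute) differences at one altitude, the two error estimates are combined in quadrature before inversion, and all function names are hypothetical.

```python
import math

def diurnal_scale(gomos_value, model_at_occultation, model_at_obs_time):
    """Scale a GOMOS value by the modeled ratio of ozone at local sunset/sunrise
    to ozone at the GOMOS observation time."""
    return gomos_value * model_at_occultation / model_at_obs_time

def weighted_median(values, weights):
    """Return a value m that minimizes sum_i w_i * |values_i - m|.

    The minimum is reached at the first sorted value whose cumulative weight
    reaches half of the total weight (the lower weighted median).
    """
    pairs = sorted(zip(values, weights))
    half = 0.5 * sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= half:
            return value
    raise ValueError("need at least one value with a positive weight")

def weighted_median_difference(x_ace, x_gomos, err_ace, err_gomos):
    """Weighted median of the ACE-FTS minus GOMOS differences at one altitude."""
    diffs = [a - g for a, g in zip(x_ace, x_gomos)]
    # Assumption: the "combined" error is the quadrature sum of the two estimates.
    weights = [1.0 / math.hypot(ea, eg) for ea, eg in zip(err_ace, err_gomos)]
    return weighted_median(diffs, weights)
```

Unlike the mean, the result is barely affected by a single extreme pair, which is the property that motivated the choice of this statistic for the GOMOS comparisons.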
The results of the ACE-FTS – GOMOS comparisons are presented in Fig. 16. The differences shown in Fig. 16 were calculated after applying the photochemical correction from the SOCRATES model. Good agreement (median relative differences within ±10%) can be observed in the stratosphere (15-40 km), with a slight positive bias increasing slowly with altitude. However, there is a larger bias (up to +40%) between 40 and 60 km, similar to other comparisons. Above 60 km, the positive bias increases strongly when comparing the ACE-FTS and corrected GOMOS profiles. Without applying the photochemical correction, ACE-FTS reports significantly less ozone than GOMOS (with median relative differences down to about −80%, not shown). Because of the photochemical correction method used and the low ozone number densities, it is difficult to draw conclusions about the accuracy of the ACE-FTS profiles in the mesosphere based on these relative differences.
The GOMOS observations have better vertical resolution than the ACE-FTS profiles. Thus, we also performed an additional qualitative comparison. Since the ACE-FTS retrievals do not produce averaging kernels, an empirical triangular smoothing function was applied to the GOMOS data. This was done to degrade their vertical resolution (from initial values of 0.3 to 1.7 km) in order to minimize the differences between the median profiles. The agreement between both datasets was considerably improved, as seen in Fig. 16. However, this result was obtained using a convolution function with a FWHM of 10.5 km, which could indicate that the effective resolution of the ACE-FTS measurements is coarser than 10 km in the upper mesosphere.
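The empirical triangular smoothing described above can be sketched as follows. This is a minimal illustration that assumes a regularly spaced altitude grid and renormalizes the weights at each level so that truncated kernels near the profile edges still sum to one; the function name and the edge treatment are assumptions, not the procedure actually used for Fig. 16.

```python
def triangular_smooth(altitudes_km, profile, fwhm_km):
    """Degrade the vertical resolution of a profile with a triangular kernel.

    For a triangle, the full width at half maximum equals half of the base
    width, so the kernel falls to zero at a distance of one FWHM from its peak.
    """
    smoothed = []
    for z0 in altitudes_km:
        weights = [max(0.0, 1.0 - abs(z - z0) / fwhm_km) for z in altitudes_km]
        total = sum(weights)  # at least 1.0, from the central level itself
        smoothed.append(sum(w * x for w, x in zip(weights, profile)) / total)
    return smoothed
```

Scanning `fwhm_km` and keeping the value that minimizes the difference between the median profiles would mimic the empirical adjustment that led to the 10.5 km width quoted above.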

Envisat/MIPAS
MIPAS is a mid-IR Fourier transform emission spectrometer designed to perform global-scale continuous (day/night) limb-sounding measurements of VMR profiles for a range of atmospheric species (Fischer et al., 2008). For this purpose, it acquires spectra in five frequency bands over the range 685-2410 cm⁻¹ (14.6-4.15 µm). Global measurements are achieved every day (Cortesi et al., 2007). The pointing system allows MIPAS to observe atmospheric parameters in a maximum altitude range of 5-160 km with a vertical spacing of 1-8 km depending on the altitude and on the measurement mode (Fischer et al., 2008). Operational measurements at full spectral resolution (0.025 cm⁻¹) were conducted from July 2002 to March 2004. However, anomalies affecting the interferometer slide mechanism led to the suspension of operations on 26 March 2004. Observations were resumed in January 2005 with a new operation mode, on a finer vertical grid and with reduced spectral resolution (0.0625 cm⁻¹). The following analyses present the comparisons of the ACE-FTS data product with three MIPAS datasets: the operational ESA processor (MIPAS full resolution mission), the ESA prototype processor used for validation purposes (reduced resolution observations) and the IMK-IAA scientific processor (full resolution observations). During the time period corresponding to the full resolution observations, ACE-FTS acquired data from SS occultations only. Therefore, there are no ACE-FTS SR events in the comparisons with the ESA operational retrievals and the IMK-IAA retrievals.

Comparison of ACE-FTS with the operational ESA retrievals
The algorithm used for the ESA near-real-time Level 2 analysis is based on the Optimised Retrieval Model (ORM) scientific prototype (Ridolfi et al., 2000). Given the redundancy of measurements in MIPAS limb-scanning sequences, vertical profiles do not need constraints such as a priori information. Complementary information, when available, can however be used to improve the quality of the retrieved parameters (Ridolfi et al., 2000). The retrieval uses a set of microwindows designed to obtain maximum information on the target species while minimizing the total error and the computing cost.
The microwindow selection algorithm is described by Dudhia et al. (2002). The standard products of the ESA processor are the atmospheric pressure and temperature profiles along with the volume mixing ratio profiles of 6 "key species": H2O, O3, HNO3, CH4, N2O and NO2. These are provided at the tangent heights of the MIPAS measurements during the full resolution mission, i.e., from 68-6 km with a variable vertical spacing ranging from 3 km below 42 km to 8 km above 52 km. A detailed validation analysis of the data acquired during the full resolution mission can be found in Cortesi et al. (2007). Briefly, the MIPAS profiles retrieved with the ESA operational processor (versions 4.61 and 4.62) showed very good agreement with the correlative datasets in the middle and upper stratosphere, with relative differences within ±10% in the altitude range between ∼20 and ∼50 km (50-1 hPa). In the UT/LS, MIPAS profiles show a significant positive bias of +5 to +25% with respect to the coincident observations (Cortesi et al., 2007).
Here, MIPAS operational ozone data version 4.62 (ESA-v4.62) are compared with ACE-FTS. We found a total of 138 events at latitudes 70°-80°N, using coincidence criteria of ±6 h and 300 km. The time constraint was relaxed to 6 h (instead of the typical 2 h) in order to increase the statistics of the comparison, since it did not introduce notable biases in the atmospheric sampling. For MIPAS, only profiles associated with successful pressure/temperature and target species retrievals have been considered. The results of the comparison are summarized in Fig. 17. Mean relative differences are within ±10% between 11 and 41 km, with a local maximum of about +10% (+0.44 ppmv) at 30 km. Between 35 and 48 km, ACE-FTS reports increasingly larger ozone values, with a pronounced maximum around 48 km corresponding to mean relative differences of +58% (about +1.4 ppmv). The amplitude of this peak is larger than the high altitude bias noted in other comparisons, but is limited to a narrower altitude range. The de-biased standard deviation of the mean relative differences is low (<10%) between 17 and 25 km and increases above and below this range, but remains within 25% at all altitudes between 11 and 41 km. As for most comparisons, the standard error of the mean is very small, showing that the observed biases are statistically significant.
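The three statistics quoted throughout these comparisons (mean relative difference, de-biased standard deviation and standard error of the mean) can be sketched for a single altitude level as below. Their exact definitions are not given in this section, so the sketch makes explicit assumptions: each relative difference is normalized by the mean of the two measurements, the de-biased standard deviation is the sample standard deviation of the relative differences about their mean, and the standard error is that value divided by the square root of the number of pairs. The function names are hypothetical.

```python
import math
from statistics import mean, stdev

def relative_differences(ace, comp):
    """Relative differences in percent, one per coincident pair.

    Assumption: each difference is normalized by the mean of the two values.
    """
    return [100.0 * (a - c) / (0.5 * (a + c)) for a, c in zip(ace, comp)]

def level_statistics(ace, comp):
    """Return (mean relative difference, de-biased std, standard error of the mean)."""
    rel = relative_differences(ace, comp)
    bias = mean(rel)
    # Sample standard deviation about the mean, i.e. with the bias removed.
    debiased_std = stdev(rel)
    sem = debiased_std / math.sqrt(len(rel))
    return bias, debiased_std, sem
```

Under these assumptions, a bias is judged statistically significant when its magnitude is large compared with the standard error, which shrinks as the number of coincident pairs grows.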

Comparison of ACE-FTS with the reduced-resolution mission ESA data product
New measurement scenarios were adopted for the reduced resolution mission. These scenarios are characterized by a finer vertical limb scanning step of 1.5 km from 6-21 km, 2 km from 21-31 km, 3 km from 31-46 km (i.e., equal to the instrument field-of-view) and 4 km above 46 km. A detailed description of these measurement scenarios can be found in Ceccherini et al. (2006). Since the retrieval is performed at the tangent altitudes, the use of a limb scanning step smaller than the width of the instrument field-of-view introduces instabilities in the retrieval and requires a regularization to avoid oscillations in the retrieved profiles. For this reason, the ORM retrieval code was modified to implement a Tikhonov regularization scheme that is described in detail by Ceccherini et al. (2007). Furthermore, a new set of microwindows, optimised for the new measurement mode, was selected using the same algorithm as for the full resolution observations. In particular, a larger number of spectral points is considered, in order to compensate for the loss of information content caused by the reduced spectral resolution. Comparison of the results obtained for the full and reduced resolution measurements showed that the new algorithm yields improved spatial resolution (horizontal and vertical) and lower retrieval errors. A first study of the quality of the MIPAS reduced resolution ozone profiles was reported by Ceccherini et al. (2008). In general, the quality of the ozone profile retrieved from reduced-resolution measurements is comparable to or better than that obtained from the full-resolution dataset. The only significant change in MIPAS performance is found at altitudes around 40 km, where a bias of approximately 3% is observed between full and reduced-resolution datasets. For this comparison, we used ±5° and ±10° for the latitude and longitude criteria, respectively. Here also, the time criterion was relaxed to ±6 h to increase the number of coincident pairs.
A total of 160 coincidences was found. We used the MIPAS profiles retrieved with the ESA MIPAS Level 2 processor prototype (version ML2PP/5.0). These are a preliminary set of data that ESA generated for validation purposes. Figure 18 shows the results of the comparison. They are qualitatively consistent in the stratosphere with those from the full resolution observations. Mean relative differences are within ±8% between 14 and 45 km, with closest agreement around 20 and around 38 km (±3%). Corresponding de-biased standard deviation values are within 12% in the range 20-58 km and increase substantially above and below. At altitudes between 45 and 65 km, the mean relative differences are larger, with a maximum of +27% (55 km). This is consistent with the comparisons with other satellite sensors.

Comparison of ACE-FTS with the IMK-IAA scientific processor
The IMK-IAA retrieval scheme (von Clarmann et al., 2003, and references therein) is a scientific processor complementary to ESA's near-real-time analysis. It is based on regularized inversion using a first-order Tikhonov-type smoothing constraint (von Clarmann et al., 2003) and optionally includes non-LTE calculations, implemented at the IAA, to analyse cases (specific molecular species and/or altitude levels) where the LTE assumption is not verified. Ozone retrievals use a set of 10 microwindows within the spectral ranges 740-800 cm⁻¹ and 1060-1110 cm⁻¹ where non-LTE emissions are mostly negligible (Glatthor et al., 2006). The retrieved profiles are provided on a vertical grid with finer spacing than the tangent height distances: 1 km up to 44 km and 2 km from 44 to 70 km (von Clarmann et al., 2003). For the analysis presented here, the current IMK-IAA ozone data product (V3O_O3_7) is used for the full spectral resolution observation period. This product was compared by Steck et al. (2007) with ground-based instruments, ozonesondes and observations from HALOE and POAM III. They found relative differences within ±10% in the stratosphere, with a precision of 5-10% and an accuracy of 15-20%. Below 18 km, the precision was reduced to 20% or more. Using criteria of ±9 h and 800 km, we found a total of 333 (348) coincidences between ACE-FTS and the daytime (nighttime) measurements from MIPAS. The results of the comparisons are shown in Fig. 19, for daytime (top panel) and nighttime (bottom panel) MIPAS profiles. To take into account diurnal variations in the ozone abundance, the retrieved MIPAS data were corrected using the KArlsruhe SImulation model of the Middle Atmosphere (KASIMA) chemistry and transport model (Kouker et al., 1999). Mean relative differences between ACE-FTS and the MIPAS data are within ±8% from 12 to 43 km in both the KASIMA-corrected and uncorrected cases, with the ACE-FTS VMRs generally larger than those of MIPAS.
The de-biased standard deviation of the mean relative differences is smaller than 15% in this range for both daytime and nighttime observations and smaller than 10% above 18 km, with slightly better results for the nighttime MIPAS measurements (up to 8%). When compared with the precision estimates of the MIPAS IMK-IAA product (previous paragraph), this seems to indicate, as mentioned previously, that the ACE-FTS random errors are small. This is also consistent with the results for the ESA retrievals from the full and reduced resolution data products. Above 40 km, the KASIMA correction generally improves the comparison. Overall, the mean relative differences become larger with increasing altitude, with values of about +40% (+0.9 ppmv) at 48 km. For daytime MIPAS measurements, a sharp decrease of the mean absolute differences can be noted around 52 km. The daytime mean relative differences at these altitudes are more affected by outliers but show a generally better agreement than the nighttime comparisons.

Envisat/SCIAMACHY
SCIAMACHY is a limb- and nadir-viewing imaging spectrometer, also capable of occultation measurements. It uses eight channels in the UV, visible and near-IR spectral range from 240 to 2380 nm, with a moderate resolution of 0.2-1.5 nm (Bovensmann et al., 1999). Number density profiles of several atmospheric species (such as O3, NO2, BrO, OClO), as well as polar stratospheric clouds and noctilucent clouds, are routinely retrieved from the limb measurements from the surface to ∼92 km with a vertical spacing of 3.3 km (e.g., Brinksma et al., 2006).
The retrievals of stratospheric ozone density profiles in the 15-40 km altitude range from SCIAMACHY limb scattering measurements, used in this study, are the scientific retrievals done at the Institute of Environmental Physics (IUP, Bremen, Germany). They use version 1.63 of the Stratozone retrieval code. Stratozone employs limb radiance profiles at three discrete visible wavelengths (525 nm, 600 nm, 675 nm) and exploits the differential absorption signature of ozone between the center and the wings of the Chappuis absorption band. A nonlinear iterative Optimal Estimation scheme drives the radiative transfer model SCIARAYS, which is used as the forward model.
As the SCIAMACHY limb tangent heights are affected by errors of up to 2.5 km (von Savigny et al., 2005b), in this study we used tangent heights retrieved with the Tangent height Retrieval by UV-B Exploitation (TRUE) algorithm (Kaiser et al., 2004) version 1.7 to correct the tangent heights prior to the O3 profile retrieval. TRUE version 1.7 uses pressure and temperature data from the European Center for Medium-range Weather Forecast (ECMWF) for the location, date and time of each limb measurement. The ozone profile information required for the tangent height retrieval is taken from the dynamic ozone climatology of Lamsal et al. (2004), providing ozone profiles as a function of total ozone columns for five latitude regimes, in combination with total ozone column measurements from the Earth Probe - Total Ozone Mapping Spectrometer (EP-TOMS, http://toms.gsfc.nasa.gov/index v8.html) for the location and date of each SCIAMACHY limb measurement. The tangent height offsets derived for tropical latitudes, where TRUE provides the most accurate results, are applied to all limb measurements in the corresponding orbit. The mean tangent height offset for 2004 is about −1.5 km. Previous SCIAMACHY IUP ozone profiles (version 1.6) have been validated extensively with lidars, ozonesondes, MWRs and SAGE II and SAGE III data (Brinksma et al., 2006). Results showed that the SCIAMACHY-IUP v1.62 data product is biased low between 16 and 40 km, by a few percent (3-6% with a standard deviation of ∼10%). In this analysis, we use version 1.63 of the IUP ozone number density profiles for SCIAMACHY. The difference between versions 1.62 and 1.63 is the improved pointing correction provided by the TRUE version 1.7 algorithm.
The criteria chosen for the ACE-FTS and SCIAMACHY comparisons are a maximum difference of ±6 h and a maximum distance of 500 km. This gives a total of 734 coincidences between March and December of 2004, with more than 75% occurring in the Arctic polar region in the latitude range 60°-82°N, out of which 90% or more of the SCIAMACHY events are measured at high solar zenith angle (70°-85°). The overall results are shown in Fig. 20. The vertical range was limited to 17-41 km, since the retrieval below and above this range is dominated by the a priori and there is no information from the measurement. Over the full altitude range, the mean relative differences are within ±4% (with de-biased standard deviations within 8 to 16%, consistent with previous validation results for SCIAMACHY IUP v1.62 data), except around 30 km where ACE-FTS reports larger ozone values than those of SCIAMACHY by up to +15%. This large bias around 30 km is noted in the high solar zenith angle SCIAMACHY observations, mostly in the Arctic (564 events), but is not seen in other regions. It is still present in the most recent version of the SCIAMACHY ozone data product (v2.0, currently in development), but its amplitude is significantly reduced (<10%) in comparisons with HALOE and SAGE II. The source of this bias is still unclear.

Aura-MLS
The Aura satellite (Schoeberl et al., 2006) was launched in July 2004 into a sun-synchronous, quasi-polar orbit, with an altitude of ∼700 km, an inclination of 98° and an ascending node crossing at 13:45 (local time). MLS aboard Aura scans the Earth's limb to measure thermal emission at millimeter and submillimeter wavelengths, using seven radiometers designed to cover five broad spectral regions from 118 GHz to 2.5 THz. The Aura-MLS instrument, calibration and performance for the different channels are described by Jarnot et al. (2006) and Cofield and Stek (2006). The orbit geometry provides global coverage from 82°S to 82°N each day. 240 vertical scans are performed during each orbit, allowing the retrieval of ∼3500 profiles per day for 17 primary atmospheric parameters: pressure, temperature and cloud ice water content, as well as 14 trace constituents such as O3, H2O and CO. An overview of the instrument and observation characteristics, main spectral lines and target species can be found in Waters et al. (2006). The retrieval scheme is based on the Optimal Estimation Method (Rodgers, 2000). Taking advantage of the forward-looking geometry of the instrument with respect to the spacecraft, the innovative approach of the Aura-MLS retrievals resides in the combination of ∼5-10 subsequent scans to retrieve atmospheric parameters on a two-dimensional grid, in the vertical direction and along the line-of-sight. This retrieval approach is detailed by Livesey et al. (2006). The vertical retrieval is provided on a standard pressure grid with 6 pressure surfaces per decade change in stratospheric pressure, and 3 levels per decade for pressures smaller than 0.1 hPa. The corresponding vertical resolution is 3-5 km. The ozone volume mixing ratio is retrieved from the observations of the radiometer centered at 240 GHz.
The Aura-MLS ozone version 1.5 dataset was compared with numerous correlative datasets (including SAGE II, HALOE, POAM III and the previous data version (v2.1) of ACE-FTS O3) in an early validation study, and with Odin/SMR (Bordeaux version 222 processor) by Barret et al. (2006). An overall agreement of 5-10% was found throughout the stratosphere, with Aura-MLS biased high in the lower stratosphere but low in the upper stratosphere. Extensive validation of the Aura-MLS version 2.2 (hereinafter v2.2) ozone product, with a limited time coverage, showed better results than version 1.5 with respect to the correlative datasets, with an agreement of 5-8% in the stratosphere (Froidevaux et al., 2008; Boyd et al., 2007; Jiang et al., 2007). Estimated precision is about 5% or better between 100 and 3 hPa.
The comparisons presented here extend the analyses of Froidevaux et al. (2008) to the full Aura-MLS v2.2 dataset processed (as of May 2007) and include comparisons with ACE-MAESTRO. At the time of the analysis, coincidences were available on 465 dates, with very few in 2004 (19) and the remainder evenly distributed in the other years. A total of 3180 coincidences was found using the coincidence criteria: ±2 h, ±5° in latitude and ±10° in longitude. We used the recommended parameters for screening the Aura-MLS data: quality value >0.4, positive precision, even values of the status flag, and convergence <1.8 (Froidevaux et al., 2008). We also limited the vertical range of the comparisons to the altitudes ∼10-65 km as recommended for Aura-MLS and ACE-MAESTRO. For the comparison, the Aura-MLS vertical profiles were interpolated in log(pressure) onto the ACE-FTS pressure levels and subsequently reported on the ACE-FTS or ACE-MAESTRO altitude grid.
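The screening and interpolation steps described above can be sketched as follows. The thresholds are those quoted in the text; the function and argument names are hypothetical, the interpolation is assumed to be linear in the natural logarithm of pressure, and target levels outside the source pressure range are returned as NaN rather than extrapolated. The final mapping onto the ACE altitude grids is not covered.

```python
import math
from bisect import bisect_left

def passes_mls_screening(quality, status, convergence):
    """Profile-level screening: quality > 0.4, even status flag, convergence < 1.8."""
    return quality > 0.4 and status % 2 == 0 and convergence < 1.8

def drop_flagged_levels(pressures, vmrs, precisions):
    """Keep only the levels whose reported precision is positive."""
    kept = [(p, v) for p, v, s in zip(pressures, vmrs, precisions) if s > 0]
    return [p for p, _ in kept], [v for _, v in kept]

def interp_log_pressure(p_src_hpa, vmr_src, p_target_hpa):
    """Interpolate a VMR profile linearly in log(pressure) onto target levels."""
    points = sorted((math.log(p), v) for p, v in zip(p_src_hpa, vmr_src))
    xs = [x for x, _ in points]
    result = []
    for p in p_target_hpa:
        x = math.log(p)
        if x < xs[0] or x > xs[-1]:
            result.append(math.nan)  # no extrapolation beyond the source grid
            continue
        j = bisect_left(xs, x)
        if xs[j] == x:
            result.append(points[j][1])
            continue
        (x0, v0), (x1, v1) = points[j - 1], points[j]
        result.append(v0 + (v1 - v0) * (x - x0) / (x1 - x0))
    return result
```

Interpolating in log(pressure) rather than in pressure keeps the weighting roughly linear in altitude, which suits the standard pressure grid with a fixed number of levels per decade.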
The results of the comparisons for ACE-FTS are shown in Fig. 21. ACE-FTS reports consistently more ozone than Aura-MLS over the comparison range. Between 12 km and 43 km (∼2 hPa), the mean relative differences are within 0 to +10% and often smaller than +4%. Above 43 km and below ∼60 km, they are within +10 to +25%, with the maximum value found at 53 km (∼0.6 hPa). This is consistent with the findings of Froidevaux et al. (2008) and with the other comparisons presented in this paper. The de-biased standard deviation of the mean relative differences is within 25% in the full altitude range and smaller than 12% between 24 and 48 km.
The results for ACE-MAESTRO are presented in Fig. 22 and recall what was found for SABER. The ACE-MAESTRO SR profiles show larger VMRs than Aura-MLS in the range 21-57 km, with mean relative differences within +2 to +15% (+6% on average), in closest agreement with the Aura-MLS data around 38 km (∼ +2%). Above and below this range, the SR retrievals report VMR values increasingly smaller than those of Aura-MLS, with mean relative differences down to about −50% at the limits of the comparison range. In the case of the ACE-MAESTRO SS events, the mean relative differences range from −10% at 15 km (∼120 hPa) to a maximum of +21% at 52 km (∼0.7 hPa), similar to that found for the ACE-FTS comparisons. For both SR and SS comparisons, the de-biased standard deviation of the mean relative differences is within 10 to 25% between 19 and ∼45 km, generally larger than what was found for ACE-FTS, suggesting again a poorer precision of the ACE-MAESTRO observations. Note that the standard deviation of the mean VMR profiles shows significant discrepancies for both SR and SS events.

Aircraft measurements from ASUR
ASUR is a microwave receiver operating in a tunable frequency range between 604.3 and 662.3 GHz (von Koenig et al., 2000). It measures atmospheric emission from various trace gas molecules including O3, N2O, HNO3 and ClO. Stratospheric measurements performed with the Acousto-Optical Spectrometer (AOS) are used in this intercomparison exercise. The total bandwidth of the AOS is 1.5 GHz and its resolution is 1.27 MHz. The heterodyne sensor is operated on board a high-flying research plane to avoid strong absorption signals from tropospheric water vapor. The instrument looks upwards at a stabilized constant zenith angle of 78°. Measured spectra are integrated for up to 80 s, which leads to a horizontal resolution of about 18 km along the flight path. Vertical abundance profiles are retrieved on a 2 km-spacing altitude grid using the Optimal Estimation Method (Rodgers, 2000). Vertical resolution of the ozone measurements is about 6-18 km, and the vertical range is 16-50 km. The precision of a single measurement is 0.1 ppmv (3 to 8% depending on the altitude) and the accuracy (including systematic uncertainties) is 15% or 0.3 ppmv, whichever is greater. Details about the measurement technique and retrieval theory can be found in Kuttippurath et al. (2007). The ASUR ozone measurements used in this study were performed aboard the NASA DC-8 aircraft during the Polar Aura Validation Experiment (PAVE) (http://www.espo.nasa.gov/ave-polar/). These were compared with ACE-FTS and ACE-MAESTRO using coincidence criteria of ±12 h and 1000 km. This resulted in a total of 39 (37) coincident ASUR measurements with ACE-FTS (ACE-MAESTRO), from 5 flights out of Portsmouth (New Hampshire, USA) reaching northern high latitudes (∼65°N) on 24, 29 and 31 January and 2 and 7 February 2005. The corresponding ACE-FTS and ACE-MAESTRO occultations were obtained exclusively at sunrise.
The ACE-FTS and ACE-MAESTRO VMR profiles were convolved with the ASUR averaging kernels to account for the lower vertical resolution of the ASUR profiles.  Figure 23 shows the results from the comparison between ACE-FTS and ASUR. The mean relative differences are within ±19% (0.45 ppmv) over the full altitude range and smaller than ±8% between 18 and 38 km, with consistently positive values above 22 km. Below 22 km, the ACE-FTS VMRs are slightly smaller than the ASUR values, down to −8% (−0.2 ppmv). The de-biased standard deviation of the mean relative differences is smaller than 11% over the full altitude range (<7% in the range 22-32 km). The agreement between the datasets is best around the peak in ozone VMR (mean relative difference of 0.8% at 32 km).
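The convolution with the ASUR averaging kernels can be illustrated with the standard optimal-estimation smoothing relation x_s = x_a + A (x_h − x_a) (Rodgers, 2000). The sketch below assumes that the high-resolution ACE profile x_h has already been interpolated onto the ASUR retrieval grid and that the ASUR a priori profile x_a is included in the smoothing; neither detail is stated in the text, and the function name is hypothetical.

```python
def apply_averaging_kernel(x_high, x_apriori, avk):
    """Smooth a high-resolution profile with a low-resolution averaging kernel.

    x_high    : high-resolution profile, already on the low-resolution grid
    x_apriori : a priori profile of the low-resolution retrieval
    avk       : averaging kernel matrix as a list of rows (one row per level)
    Returns x_a + A (x_high - x_a).
    """
    diff = [xh - xa for xh, xa in zip(x_high, x_apriori)]
    return [
        xa + sum(a * d for a, d in zip(row, diff))
        for xa, row in zip(x_apriori, avk)
    ]
```

With an identity kernel the profile is returned unchanged, while a kernel of zeros returns the a priori; real kernels lie between these limits and spread information over the 6-18 km vertical resolution of the ASUR measurements.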
The results from the comparison between ACE-MAESTRO and ASUR are presented in Fig. 24. The mean relative differences are within ±16% (0.33 ppmv) at all altitudes and within ±3% from 22-38 km, with a corresponding de-biased standard deviation of 6 to 13% (<10% in the range 22-32 km), again slightly larger than for ACE-FTS.
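The two statistics quoted throughout these comparisons are the mean relative difference and the de-biased standard deviation at each altitude. The definitions used in this paper are given in Sect. 4; the sketch below assumes the usual reading, in which the de-biased standard deviation is the sample standard deviation of the individual relative differences about their mean. The function name and the array layout (one row per coincident pair, one column per altitude level) are illustrative choices.

```python
import numpy as np


def mean_and_debiased_std(rel_diff):
    """Per-level mean and de-biased standard deviation of relative differences.

    rel_diff: array-like of shape (n_pairs, n_levels), in percent.
    Missing levels may be marked with NaN and are ignored.
    """
    d = np.asarray(rel_diff, dtype=float)
    mean = np.nanmean(d, axis=0)
    # Spread of the differences about their own mean (the bias is removed),
    # using the N-1 normalisation of a sample standard deviation.
    debiased_std = np.nanstd(d, axis=0, ddof=1)
    return mean, debiased_std
```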

Balloon-borne observations from FIRS-2
The Far-InfraRed Spectrometer (FIRS)-2 is a remote-sensing FTIR spectrometer designed and built at the Smithsonian Astrophysical Observatory. It measures thermal emission from the atmosphere in the wavelength range 8-120 µm (∼80-700 cm−1), with a spectral resolution of 0.004 cm−1 (Johnson et al., 1995). The balloon-borne observations are performed in the limb-sounding geometry. To analyse the data, first, the atmospheric pressure and temperature profiles are retrieved using the 15 µm band of CO2. Then, vertical profiles of about 30 trace constituents are retrieved from the float altitude (typically 38 km) down to the tropopause, using a nonlinear Levenberg-Marquardt least-squares algorithm (Johnson et al., 1995). Uncertainty estimates for FIRS-2 contain random retrieval error from spectral noise and systematic components from errors in atmospheric temperature and pointing angle (Jucks et al., 2002; Johnson et al., 1995).
In the case of the O3 profile used in this analysis, the total error is 10-20% below 20 km and 5-8% above. Balloon flights of FIRS-2 have been used to validate observations from the Improved Limb Atmospheric Spectrometer (ILAS) on board the Japanese Advanced Earth Observing Satellite (ADEOS) (e.g., Nakajima et al., 2002) as well as from the MLS, HALOE and the Cryogenic Limb Array Emission Spectrometer (CLAES) instruments aboard UARS (Jucks et al., 2002, and references therein). Results from FIRS-2 were also compared more recently with Aura-MLS observations (Canty et al., 2006). We compared a FIRS-2 observation acquired on 24 January 2007 (∼68°N, ∼22°E) with the ACE-FTS and ACE-MAESTRO profiles from the SR occultation sr18561 (64.7°N, 15.0°E, distance: ∼481 km) measured on 23 January 2007 at 08:25 UT (Fig. 25). Scaled PV values (Dunkerton and Delisi, 1986; Manney et al., 1994) for the times and locations of the measurements indicate that both ACE and FIRS-2 measured airmasses inside the polar vortex. Since the FIRS-2 data are reported on a 1 km-spacing altitude grid, we simply interpolated the FIRS-2 profile onto the altitude grids of ACE-FTS (1 km) and ACE-MAESTRO (0.5 km). For this particular observation, the float altitude of the balloon carrying FIRS-2 was lower than usual, setting the upper limit of the vertical range of the comparison at 31 km. The relative differences between the O3 profiles from ACE-FTS and FIRS-2 are within ±15% over the vertical range 13-30 km. ACE-FTS generally reports larger VMR values than those of FIRS-2 above 16 km, except around 26 km. The comparisons with ACE-MAESTRO yield similar results, with relative differences within ±15% at altitudes between 16 and 31 km but down to −20% at lower altitudes.

SAOZ-balloon measurements in the tropics
The Système d'Analyse par Observation Zénitale (SAOZ) sonde is a light-weight UV-visible diode array spectrometer measuring the atmospheric absorption of sunlight during the ascent of the balloon and during a sunset occultation from float altitude (Pommereau and Piquard, 1994). Spectral analysis is performed using the Differential Optical Absorption Spectroscopy (DOAS) technique, which uses least-squares fitting of the spectra with laboratory cross-sections. Ozone is measured in the Chappuis band (visible spectral range at 450-620 nm) where the absorption cross-section is not sensitive to temperature. The profiles are retrieved in the altitude range 10-28 km with a vertical resolution of 1.4 km, using the onion peeling method within 1 km-thick atmospheric shells. Data contaminated by clouds are removed by looking at the atmospheric extinction at 615 nm. For O3, the estimated precision is 1.5% at 20 km, degrading to 5% at 17.5 km, 10% at 15 km and 23% at 10 km. Accuracy is evaluated by adding a systematic error of 1.5% (uncertainty from the ozone absorption cross-sections) to the precision values. The SAOZ ozone profiles have been compared to a number of satellite and sonde observations and were found to be very consistent with the most accurate data available (Lumpe et al., 2003; Haley et al., 2004; Borchi and Pommereau, 2007). The three SAOZ flights used in this study were part of the African Monsoon Multidisciplinary Analysis (AMMA) balloon campaign (Redelsperger et al., 2006). The six resulting profiles (3 for ascent and 3 occultation profiles at float altitude) are compared with the spatially coincident ACE profiles from SS occultation ss16090 (8 August 2006 at 17:40 UT). Since the vertical resolution of the SAOZ balloon instrument is comparable to that of the ACE instruments, the SAOZ profiles were simply interpolated onto the vertical grids of ACE-FTS (1 km) and ACE-MAESTRO (0.5 km).
The results for ACE-FTS are presented in Fig. 26. Relative differences are within ±10% (<0.4 ppmv) above 19 km for all ascent (solid lines) and occultation (dotted lines) SAOZ profiles. Below 19 km the relative differences increase, with maximum values between −40 and −60% at 16 km for all SAOZ profiles. Figure 27 shows the comparison for ACE-MAESTRO. The ACE-MAESTRO and the SAOZ profiles are in good agreement, with relative differences within −15 to +5% above 19 km. As was found for ACE-FTS, ACE-MAESTRO reports significantly less ozone than SAOZ in the range 15-19 km, with maximum relative differences larger than −70%. Below 16 km, the ACE-MAESTRO VMRs are considerably larger than those of SAOZ. The large differences noted for ACE-FTS as well as for ACE-MAESTRO below ∼18 km may be explained by the fact that the SAOZ measurements used in this study were deliberately performed in the vicinity of high altitude (up to 18 km) convective clouds. Because the effects of these clouds can be highly localized, it is possible that the ozone field at the lowest altitudes measured by SAOZ and ACE could be quite different.

Balloon-borne SPIRALE observations
The SPectroscopie Infra-Rouge d'Absorption par Lasers Embarqués (SPIRALE) instrument is operated from a balloon-borne gondola by the Laboratoire de Physique et Chimie de l'Environnement (LPCE, Orléans, France) and is routinely used at all latitudes, in particular as part of European validation campaigns for the Odin and Envisat missions. This tunable diode laser absorption spectrometer (TDLAS), which uses six lasers, has been described in detail previously (Moreau et al., 2005). In brief, it can perform simultaneous in situ measurements of about ten chemical species over the vertical range 10-35 km. The high frequency sampling (∼1 Hz) yields a vertical resolution of a few meters, depending on the ascent rate of the balloon. The diode lasers emit at mid-IR wavelengths (3-8 µm) and the beams are injected into a multipass Herriott cell, located under the gondola and largely exposed to ambient air. The cell (3.5 m long) is deployed during ascent when the pressure is lower than 300 hPa. The multiple reflections obtained between the two cell mirrors give a total optical path of 430.78 m. Species concentrations are retrieved from direct IR absorption, by fitting experimental spectra with spectra calculated using the HITRAN 2004 database (Rothman et al., 2005). Specifically, the rovibrational lines at 2086.0191 and 2086.4294 cm−1 were used for the SPIRALE O3 retrievals. Simultaneous measurements of pressure and temperature onboard the gondola allow the number densities to be converted to VMRs. Estimates of the uncertainties in the SPIRALE measurements were detailed by Moreau et al. (2005). Total root-sum-square uncertainties are about 6% above 18 km (<80 hPa) and 8% below (>80 hPa).
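The conversion from number density to VMR mentioned above follows from the ideal-gas law: the air number density is p/(k_B T), and the VMR is the ratio of the species number density to it. A minimal sketch, assuming SI units and ideal-gas behaviour (the function name is illustrative and is not taken from the SPIRALE processing chain):

```python
K_B = 1.380649e-23  # Boltzmann constant in J/K


def number_density_to_vmr(n_species_m3, pressure_pa, temperature_k):
    """Convert a species number density (molecules per m^3) to a VMR.

    Assumes the ideal-gas law for the total air number density.
    """
    n_air = pressure_pa / (K_B * temperature_k)  # air molecules per m^3
    return n_species_m3 / n_air
```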
For this study, we compared a SPIRALE profile (obtained during ascent) from 20 January 2006 (17:34-19:47 UT) with the coincident ACE-FTS and ACE-MAESTRO profiles from the SR occultation sr13151. The SPIRALE O3 vertical range was 10.8-27.3 km. The balloon position remained rather constant around a mean location of 67.6±0.2°N and 21.6±0.2°E. The ACE occultation occurred 13 h later (on 21 January 2006 at 08:00 UT) and was located at 64.28°N, 21.56°E, at a distance of 413 km from the SPIRALE mean position. Potential vorticity (PV) maps were calculated with the Modélisation Isentrope du transport Méso-échelle de l'Ozone Stratosphérique par Advection (MIMOSA) contour advection model. They confirmed that SPIRALE and ACE sounded similar air masses in the well-established polar vortex at this time, for the whole range of altitudes, with PV differences of less than 10%.
Since the vertical resolution for SPIRALE is of the order of meters, we smoothed the SPIRALE data using triangular or Gaussian convolution functions as described in Sect. 4. The ACE-FTS (Fig. 28) and ACE-MAESTRO (Fig. 29) O 3 profiles are in good agreement with the SPIRALE profile between 15 and 25 km, where the relative differences remain within the error bars of the comparison.
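The convolution functions actually used are defined in Sect. 4. As a rough illustration of the idea only, the sketch below degrades a high-resolution profile to a coarser target grid with a normalised triangular weighting function centred on each target altitude. The function name and the `half_width_km` parameter (including its default value) are hypothetical and are not the values used in this study.

```python
import numpy as np


def triangular_smooth(z_hi, x_hi, z_target, half_width_km=1.5):
    """Smooth a high-resolution profile onto target altitudes.

    z_hi, x_hi   : altitudes (km) and values of the high-resolution profile
    z_target     : altitudes (km) of the coarser grid
    half_width_km: half-width at the base of the triangle (assumed value)
    """
    z_hi = np.asarray(z_hi, dtype=float)
    x_hi = np.asarray(x_hi, dtype=float)
    z_target = np.asarray(z_target, dtype=float)
    out = np.full(z_target.shape, np.nan)
    for i, z0 in enumerate(z_target):
        # Triangular weights: 1 at z0, falling linearly to 0 at +/- half_width_km.
        w = np.clip(1.0 - np.abs(z_hi - z0) / half_width_km, 0.0, None)
        if w.sum() > 0.0:
            out[i] = np.sum(w * x_hi) / w.sum()
    return out
```

Target levels with no high-resolution data inside the triangle are left as NaN, so they drop out of any later statistics instead of being extrapolated.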

Ozonesonde measurements
Ozonesondes are balloon-borne instruments launched (typically) weekly from various stations around the globe. They perform in situ measurements of pressure, temperature, humidity and O 3 abundances from the surface to the balloon's burst altitude (typically ∼35 km) with a resolution of 100-150 m. There are three types of ozonesondes currently in operation: the Electrochemical Concentration Cell (ECC) (Komhyr et al., 1995), Brewer-Mast (BM) (Brewer and Milford, 1960) and Carbon-Iodine (CI) (Kobayashi and Toyama, 1966) ozonesondes. The accuracy of ozonesonde observations is generally estimated to be 5% (e.g., SPARC, 1998) but in fact depends on numerous parameters (for instance, for ECC ozonesondes, the concentration of the sensing solution or the manufacturer influence the accuracy). Depending on the type of ozonesonde and the altitude, typical values for the precision and accuracy are ∼3-8% and ∼5-15%, respectively, up to 30 km (see Smit et al., 2007, and references therein).
For the statistical comparison of ACE-FTS and ACE-MAESTRO with ozonesonde observations, we used measurements from the World Ozone and Ultraviolet Data Center (WOUDC), the Southern Hemisphere ADditional OZonesonde (SHADOZ) archive and the 2004 INTEX Ozonesonde Network Study (IONS) campaign (see Table 2 for URLs and references). We defined coincidence criteria of ±24 h and 800 km. Table 2 lists the stations for which coincidences were found. Because of their high vertical resolution, the ozonesonde data were smoothed using the convolution functions described in Sect. 4. When several ACE-FTS or ACE-MAESTRO profiles were coincident with the same ozonesonde measurement, they were averaged and the resulting mean profile was compared with the ozonesonde data (Randall et al., 2003). From the initial total of 547 coincidences, we compared 376 profiles.
Figure 30 shows the results for the comparison with ACE-FTS. There is good agreement with the ozonesonde observations in the altitude range 11-35 km. In this range, ACE-FTS reports systematically larger VMRs than the ozonesondes, with mean relative differences within −1 to +10% and corresponding de-biased standard deviations within 12 to 15% (17 to 30%) above (below) 20 km. Note that ACE-FTS and the ozonesondes sample airmasses with similar variability, as demonstrated by the standard deviations of the mean VMR profiles. Below 11 km, the variability of the measured profiles is high (de-biased standard deviation of the mean relative differences of 40% and larger) and the mean relative differences increase significantly. Above 35 km, the number of coincident events drops sharply and the statistical significance of the results is limited, therefore these results are not shown.
Comparison results for ACE-MAESTRO are shown in Fig. 31 for the SR (top panel) and SS (bottom panel) events. Overall, the mean relative differences are within ±5% from 16-30 km, increasing above and below this altitude range, with corresponding de-biased standard deviation within 12 to 30% and 15 to 40% for the SR and for the SS comparisons, respectively. Using a rather limited sample, Kar et al. (2007) had earlier shown a small bias (of about +5%) between the ACE-MAESTRO SR and SS retrievals in the altitude range 20-30 km, when compared with the ozonesondes, with larger mean relative differences for the ACE-MAESTRO SR events. This bias is not seen for this larger sample of coincidences. The mean relative differences are larger below 15 km and reach −20% (SS) and −40% (SR) at the lowest altitudes, with ACE-MAESTRO reporting consistently lower VMRs than the ozonesondes, while the de-biased standard deviation at these altitudes exceeds 35%. The bias and de-biased standard deviation values found here are compatible with the second study including ozonesonde data (following section) for both ACE instruments.

Table 2. List of the ozonesonde stations which provided data for the analyses, including location (column 2) and operating agency (column 3). The type of sensor used by each station is indicated in column 5. The source of the data used for these studies is indicated in column 6. In column 1, normal font indicates the stations included only in the statistical comparisons (Sect. 6.5); bold font shows the stations used in the studies presented in Sects. 6.5 and 6.6; italicized font applies to stations used in the detailed NDACC study described in Sect. 6.6.
Station | Coordinates | Agency | GAW ID | Type | Source
e.g., in 2006), the corresponding results were provided directly by the station P.I.
b Summer 2004 sounding was part of the IONS protocol optimized for Aura validation (Thompson et al., 2007b,c); data available at http://croc.gsfc.nasa.gov/intex/ions.html.
c Data acquired from the SHADOZ archive (http://croc.gsfc.nasa.gov/shadoz/; Thompson et al., 2003a,b, 2007a).

Table 3. Name, location and operating agency for the lidar stations which provided data for the detailed NDACC analyses (Sect. 6.6).
Station | Coordinates | Agency

NDACC ozonesonde and lidar measurements
Detailed comparisons were performed for individual sites with two types of ozone profiling instruments, ozonesondes and lidars. These are operated within the framework of the Network for the Detection of Atmospheric Composition Change (NDACC, formerly the Network for the Detection of Stratospheric Change or NDSC), a major component of the World Meteorological Organization's Global Atmosphere Watch program (WMO-GAW). The ozonesonde measurements have been described in the previous section. DIfferential Absorption Lidar (DIAL) systems provide the vertical distribution of night-time ozone number density at altitudes between ∼10 km and ∼45 km, with a vertical resolution of 300 m to 3 km, depending on the altitude. Typical values for lidar accuracies are 3-7% between 15 and 40 km. At 40 km and above, due to the rapid decrease in signal-to-noise ratio, the errors increase and a significant bias of up to 10% may appear (Godin et al., 1999). Coincidence criteria of ±12 h and 500 km were used to select available data from a total of 31 ozonesonde stations (Table 2) and 5 lidar stations (Table 3). Figure 32 shows the time and latitude coverage of all coincidences stored in the database used for this study. However, to ensure a minimum statistical significance of the comparison results at all stations, only those for which at least three coincidences were found with the ACE instruments were included in the analyses. Therefore, stations visible in Fig. 32 but for which there were fewer than three coincident observations are not listed in Tables 2 and 3.
The analyses were conducted in three steps. First, the individual coincident events were examined to check the quality of the retrieved profiles. Then, time series for the ACE and the ground-based measurements and their relative differences were analyzed. This allowed time periods to be identified in which homogeneous results, and hence meaningful statistics, could be obtained. Finally, the vertical structure of the differences was investigated within these homogeneous time periods, by grouping the stations where similar results were found. The second and third steps will be described below. The integration methodology applied in smoothing the high-resolution ozonesonde and lidar profiles is described in Sect. 4.
In the detailed analysis of the time series, mean relative differences between the ACE-FTS profiles and the ground-based data were within ±10%, in the altitude ranges 10-30 km for the ozonesondes and 15-42 km for the lidars. For ACE-MAESTRO, the mean relative differences with ozonesondes were mostly negative, with values of about −10% in the altitude range 15-30 km and down to −16% below. When compared to lidars, ACE-MAESTRO also reported lower ozone VMRs (mean relative difference of about −7%) in the range 15-37 km, whilst larger negative values (down to −18%) were found below 15 km, and positive mean relative differences (∼+8%) were found in the range 37-41 km. This analysis showed that the temporal variations of the ozone layer are well captured by ACE-FTS and ACE-MAESTRO, but that the limited temporal sampling does not allow finer-scale variations to be revealed. Within the stratosphere, no important structure or seasonal variation was identified in the time series, which allowed us to derive meaningful statistics for the ACE-FTS and ACE-MAESTRO ozone data products by combining the three years of the comparison period.
We also investigated the height-resolved statistical differences over the full comparison time period for each station. An example of these relative difference profiles is shown in Fig. 33 for the coincidences between ACE-FTS  Figure 35 shows the mean relative differences between ACE-FTS and NDACC ozonesondes (top panel) and lidars (bottom panel), while the results for ACE-MAESTRO are summarized in Fig. 36. Figures 35 and 36 also illustrate the good consistency of the ACE data with respect to latitude, since there is no systematic meridional bias in the mean relative differences.
For the ACE-FTS and ozonesonde comparisons, the mean relative differences were within ±7% in the range 10-35 km and larger below this range. For the comparisons with lidars, the mean relative differences were within ±10% in the range 10-45 km. These values can be accounted for by known contributions to the systematic errors of the comparison, which indicates that ACE-FTS systematic errors are small. For the comparisons of ACE-MAESTRO retrievals with ozonesondes and lidar observations, the mean relative differences were globally negative, with an average value of about −7% above 15 km. Below this altitude, ACE-MAESTRO reported significantly less ozone than either of the ground-based instruments, with mean relative difference values within −20 to −40%. The negative biases observed for ACE-MAESTRO cannot be accounted for by the contributions from known sources, but are indicative of a systematic underestimation of the ozone VMR by the instrument.
The de-biased standard deviations of the mean relative differences, for both ACE-FTS and ACE-MAESTRO, were lower than 10% in the stratosphere but much larger in the troposphere. This can be explained by the atmospheric variability and the different horizontal smoothing by the occultation and ground-based measurements, which means that the contribution from the ACE retrievals to the combined random errors of the comparison is small. The different horizontal smoothing of the ozone field is an important contribution to the random error budget of the comparisons, since it can contribute about 10% of the standard deviation of the differences in the middle and upper stratosphere and more at lower altitudes (Cortesi et al., 2007).

Eureka DIAL measurements
A DIAL instrument has been in operation at the Arctic Stratospheric Ozone (AStrO) Observatory/Polar Environmental Atmospheric Research Laboratory (PEARL) in Eureka (80.05°N, 86.42°W) since 1993. In February-March 2004, it measured temperature and ozone profiles as part of the Canadian Arctic ACE Validation Campaigns (Kerzenmacher et al., 2005; Walker et al., 2005; Sung et al., 2007; Manney et al., 2008; Fraser et al., 2008; Fu et al., 2008; Sung et al., 2009). The measurements use radiation from a XeCl excimer laser at two wavelengths, one with a strong absorption signature of O3 (the "on" wavelength, 308 nm for the Eureka lidar) and one with little absorption (the "off" wavelength, hydrogen Raman-shifted to 353 nm at Eureka) (Donovan et al., 1995). A detailed description of the system is given by Carswell et al. (1991). The Eureka DIAL is operated exclusively at night and provides vertical profiles of ozone from the tropopause level to ∼45 km with a vertical resolution of 300 m and an estimated accuracy for ozone of 1-2% (e.g., Bird et al., 1997).
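The on/off principle can be illustrated with the textbook form of the DIAL equation, in which the ozone number density is proportional to the vertical derivative of the logarithm of the ratio of the two returned signals, n(z) = [1/(2Δσ)] d/dz ln[P_off(z)/P_on(z)], with Δσ the difference between the ozone absorption cross-sections at the two wavelengths. The sketch below is a simplified illustration and not the Eureka retrieval: it neglects the differential backscatter and extinction by air molecules and aerosols, and the function and argument names are assumptions.

```python
import numpy as np


def dial_number_density(z_m, p_on, p_off, sigma_on, sigma_off):
    """Ozone number density (m^-3) from on/off lidar returns.

    z_m       : altitudes in m
    p_on/p_off: returned signals at the absorbed and reference wavelengths
    sigma_*   : ozone absorption cross-sections in m^2 at each wavelength
    Simplification: molecular and aerosol differential terms are ignored.
    """
    z = np.asarray(z_m, dtype=float)
    log_ratio = np.log(np.asarray(p_off, dtype=float) / np.asarray(p_on, dtype=float))
    delta_sigma = sigma_on - sigma_off
    # The factor 2 accounts for the two-way path through the absorbing layer.
    return np.gradient(log_ratio, z) / (2.0 * delta_sigma)
```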
Data from the Eureka DIAL measurements obtained during the 2004 Canadian Arctic ACE Validation Campaigns were used for validation of the previous release of the ACE-FTS and ACE-MAESTRO data (Kerzenmacher et al., 2005). Comparisons of the DIAL temperature profiles with ACE observations can also be found in companion papers (e.g., Manney et al., 2008; Sica et al., 2008). Here we present the comparisons of DIAL O3 with ACE-FTS and ACE-MAESTRO. The results are presented in Fig. 37 for ACE-FTS and Fig. 38 for ACE-MAESTRO. The mean relative differences between the lidar measurements and the ACE-FTS profiles are within −10 to +3% (on average −7% and down to −0.8 ppmv) between 15 and 34 km. The corresponding de-biased standard deviation is within 10% between 21 and 31 km and increases above and below this range. At the lowermost altitudes, the mean relative differences are larger (down to −27%). Above 35 km, the lidar profiles appear very noisy and the low statistics prevent us from drawing meaningful conclusions.
The shape of the difference profile for the comparison with ACE-MAESTRO is quite similar, but ACE-MAESTRO shows a larger negative bias with respect to the Eureka DIAL observations. Mean relative difference values range from −20 to +7% (on average −13%) in the range 12-38 km. The de-biased standard deviation of the mean relative differences is within 10% between 19 and 30 km and increases above and below this range. This result is comparable to the values found for ACE-FTS. The maximum mean absolute difference is −1.1 ppmv at 28 km. These results are qualitatively comparable with those described in Sect. 6.6 for other lidars but show an unusual (especially for ACE-FTS) low bias of the ACE instruments with respect to the Eureka DIAL.

Fig. 35. Mean relative differences for comparisons between ACE-FTS and ozonesonde data, plotted versus altitude and latitude (top); same information as above for comparisons with lidar data (bottom). Uncertainties are discussed in the text.

Ground-based FTIR observations
In this section, we compare partial columns derived from the ACE-FTS and ACE-MAESTRO observations with ground-based measurements obtained by FTIR spectrometers at ten NDACC stations (Table 4). Although the coarse vertical resolution of FTIR measurements limits their use for profile comparisons, they provide regular observations at different locations under clear-sky conditions and offer possibilities that complement the ozonesonde and lidar measurements for evaluating the temporal variations of the ACE dataset.
The FTIR instruments involved in the comparisons use microwindows in the range 780-3060 cm−1 and have spectral resolutions ranging from 0.001 to 0.012 cm−1. They provide information on numerous species including O3 from the lower troposphere to the middle and upper stratosphere. Two different retrieval codes are used (depending on the station): SFIT2 (Rinsland et al., 1998) and PROFFIT92 (Hase, 2000). They were compared by Hase et al. (2004), who found that these algorithms are in excellent agreement (generally better than 1%) for both VMR retrievals and total column calculations. Both processing codes are based on the Optimal Estimation Method (Rodgers, 2000), thus providing averaging kernels which are useful for determining the information content and for smoothing higher vertical resolution measurements such as those from ACE-FTS and ACE-MAESTRO.
Table 4. List of the FTIR stations which provided data for the analyses (Sect. 6.8). The latitude and longitude of the station are provided, together with the altitude above sea level in meters (m a.s.l.) (columns 3-4). The coincidence criteria used in this study are indicated for each station (column 5). References describing the stations, measurements and analyses are given in column 6.

In this study, we used the coincidence criteria listed in Table 4. Because of the limited number of coincidences at some stations, the time period for the comparison exercise was extended to the end of 2006. The ACE-FTS and ACE-MAESTRO profiles were interpolated on the FTIR retrieval grid for each station and extended below the lowest retrieved altitude using the FTIR a priori VMR values. The resulting composite profile was smoothed using the FTIR averaging kernels and a priori profile, as described in Sect. 4. Partial columns were calculated for a specific altitude range for each station. To calculate the ACE-FTS and ACE-MAESTRO partial columns, we used the atmospheric density derived from the ACE-FTS measurements. For the FTIR instruments, we calculated a density profile from the pressure and temperature profiles used in their retrievals.
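The partial-column calculation described in this paragraph can be sketched as follows: the O3 number density is the product of the VMR and the air number density, and the partial column is its vertical integral between the chosen altitude limits. This is a minimal illustration under stated assumptions (ideal-gas air density from pressure and temperature, trapezoidal integration on the profile grid, SI units); the function names are hypothetical and the stations' actual processing may differ.

```python
import numpy as np

K_B = 1.380649e-23  # Boltzmann constant in J/K


def air_density(pressure_pa, temperature_k):
    """Air number density (molecules per m^3) from the ideal-gas law."""
    return np.asarray(pressure_pa, dtype=float) / (K_B * np.asarray(temperature_k, dtype=float))


def partial_column(z_m, vmr, n_air_m3, z_low_m, z_high_m):
    """Partial column (molecules per m^2) between z_low_m and z_high_m.

    z_m must be increasing; only grid levels inside the limits are used.
    """
    z = np.asarray(z_m, dtype=float)
    n_o3 = np.asarray(vmr, dtype=float) * np.asarray(n_air_m3, dtype=float)
    inside = (z >= z_low_m) & (z <= z_high_m)
    zi, ni = z[inside], n_o3[inside]
    # Trapezoidal rule written out explicitly.
    return float(np.sum(0.5 * (ni[1:] + ni[:-1]) * np.diff(zi)))
```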
The lower limit of the partial column range was given by the ACE-FTS or ACE-MAESTRO lowest measured altitude, while the upper limit was determined from the sensitivity of the FTIR measurements. We used an approach similar to that of Vigouroux et al. (2007): the sensitivity (also called measurement response) at one altitude is given by the area under the corresponding averaging kernel. The useful range for the FTIR is defined as the altitudes where the FTIR sensitivity is greater than 0.5 (i.e., where the information comes primarily from the measurement). The resulting vertical ranges vary from station to station and for ACE-FTS and ACE-MAESTRO, with lower limits of 10-18 km and upper limits of 38-47 km. For the partial columns, this yields a number of degrees of freedom for signal (DOFS, defined as the trace of the averaging kernel matrix over the altitude range of the partial column) ranging from ∼1.7 for Toronto to ∼3.9 for Izaña.
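Both quantities defined above follow directly from the averaging kernel matrix. The sketch below assumes that each row of the matrix holds the averaging kernel for one retrieval altitude, so that the area under a kernel is approximated by the row sum; that row convention, the function names and the simplification of reporting only the lowest and highest altitudes above the threshold are assumptions for illustration.

```python
import numpy as np


def sensitivity(avk):
    """Measurement response at each altitude: the sum over each kernel (row)."""
    return np.asarray(avk, dtype=float).sum(axis=1)


def useful_range(z, avk, threshold=0.5):
    """Lowest and highest altitudes where the sensitivity exceeds the threshold."""
    z = np.asarray(z, dtype=float)
    ok = sensitivity(avk) > threshold
    if not ok.any():
        return None
    return float(z[ok].min()), float(z[ok].max())


def dofs(avk, z, z_low, z_high):
    """Degrees of freedom for signal: trace of A restricted to [z_low, z_high]."""
    z = np.asarray(z, dtype=float)
    idx = np.where((z >= z_low) & (z <= z_high))[0]
    return float(np.trace(np.asarray(avk, dtype=float)[np.ix_(idx, idx)]))
```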
In Figs. 39 (for ACE-FTS) and 40 (for ACE-MAESTRO), we present time series of the partial columns and relative differences for the comparisons with each FTIR instrument. In some cases, the comparison period is limited to several days of measurements in 2004 (Poker Flat and La Réunion). The partial columns derived from the ACE-FTS profiles are in acceptable agreement (±20%) with the FTIR values, with mean relative differences within −10 to +7% and corresponding de-biased standard deviation ranging from ∼2% for Izaña to about 10% for Jungfraujoch and Wollongong. The results are slightly better for ACE-MAESTRO, with mean relative differences within −9 to +2%. For ACE-MAESTRO, the de-biased standard deviation of the mean relative differences is about 6% except for Harestua (∼10%) as well as Wollongong and Thule (16%). Furthermore, the scatterplots presented in Fig. 41 for ACE-FTS and in Fig. 42 for ACE-MAESTRO show very good correlation between the O3 partial columns for the ACE instruments and the ground-based FTIR spectrometers, with correlation coefficients of 0.88 for ACE-FTS and 0.84 for ACE-MAESTRO. When comparing the results for the northern high latitude stations, a larger scatter in the mean relative differences (especially for ACE-MAESTRO) can be noted for Thule than for Kiruna. This is most likely due to the coincidence criteria, which were broader for Thule than for Kiruna (Table 4). Additional tests were done with a stricter distance criterion (500 km) for comparison with Thule and showed significantly less scatter. However, it did not modify the mean agreement between the ACE data and the ground-based measurements. The results of the analysis for ACE-FTS and ACE-MAESTRO are presented in Table 5, showing the altitude range used for the calculations, the DOFS values, and the mean relative differences and associated de-biased standard deviations for each ground-based station.
The latter are useful for quantitative evaluation of the results, even though the statistical relevance can be limited by the low number of coincidences for some stations. Since we have calculated (and described) the de-biased standard deviations of the mean relative differences, the values given above and in Table 5 represent an estimate or an upper limit to the combined precision of the FTIR and ACE instruments.

Comparison with ground-based microwave radiometer measurements
Stratospheric and mesospheric profiles from the MWRs at the Lauder, New Zealand and Mauna Loa, Hawaii NDACC sites have been compared with ACE-FTS and ACE-MAESTRO measurements. These have also been used to perform non-coincident comparisons with other satellite-borne and ground-based instruments, in a manner previously employed by Boyd et al. (2007). This method allows comparison of datasets that would otherwise have limited or no coincident or collocated measurements. Here we compare a set of historical and current satellite-borne datasets as well as ground-based lidar measurements with the MWR measurements and, by using the MWRs as transfer standards, determine the agreement between the ACE instruments and a consensus of these other instruments.
The MWR instruments (Parrish et al., 1992; Parrish, 1994) observe atmospheric thermal emission of ozone at 110.836 GHz and the pressure-broadened line shape is analyzed to obtain the altitude distribution of ozone using the Optimal Estimation Method of Rodgers (2000). The observations are made 24 h a day and routinely averaged over 4-6 h to provide up to four VMR profiles per day. The lower altitude limit for the profiles is about 20 km based on the influence of the a priori on the retrieval, and the quality of the measurement averaging kernels. The upper altitude limit is between 64 km for daytime measurements and about 72 km during night, due to the increased mesospheric ozone signal. The expected precision is 4-5% between 20 and 57 km, and 7% at about 64 km. The expected accuracy (i.e., combined random and systematic error) is 6-9% between 20 and 57 km and 11% at about 64 km. The vertical resolution of the MWR profiles is 6-10 km between 20 and 50 km and about 13 km at 64 km. A detailed description of the error analysis approach used for this work is included in the work of Connor et al. (1995). In the ACE−MWR comparisons, broad coincidence criteria of ±24 h, ±6° latitude and ±12° longitude were used to increase the number of coincidences available. In the event that more than one ACE measurement met these criteria, the one closest in time to the MWR measurement was chosen. To avoid the effects of the significant diurnal variations in ozone amounts in the upper stratosphere and mesosphere, comparisons are restricted to below 52 km. To account for the different vertical resolutions of the instruments, each ACE measurement is convolved with the averaging kernels of the MWR measurement as described by Connor et al. (1995), using Eq. (2) (Sect. 4). The profiles used here are interpolated onto an altitude grid with 2 km vertical spacing. The differences in the VMR profiles are determined with respect to the correlative dataset ((ACE−MWR)/MWR).
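The two operations described at the end of this paragraph can be sketched together. The smoothing step below uses the common form x_s = x_a + A(x_h − x_a), where x_h is the high-resolution profile interpolated to the retrieval grid, x_a the a priori profile and A the averaging kernel matrix; Eq. (2) itself is given in Sect. 4, so its exact form is assumed here. The relative difference follows the (ACE−MWR)/MWR convention stated above. Function names are illustrative.

```python
import numpy as np


def smooth_with_avk(x_high, x_apriori, avk):
    """Degrade a high-resolution profile with averaging kernels.

    Assumed form: x_s = x_a + A (x_h - x_a), with all profiles on the
    retrieval grid of the lower-resolution instrument.
    """
    x_high = np.asarray(x_high, dtype=float)
    x_a = np.asarray(x_apriori, dtype=float)
    return x_a + np.asarray(avk, dtype=float) @ (x_high - x_a)


def relative_difference_percent(ace_smoothed, mwr):
    """Relative difference with respect to the correlative dataset, in percent."""
    ace_smoothed = np.asarray(ace_smoothed, dtype=float)
    mwr = np.asarray(mwr, dtype=float)
    return 100.0 * (ace_smoothed - mwr) / mwr
```

Two limiting cases give a quick sanity check: with A equal to the identity matrix the profile is returned unchanged, and with A equal to zero the result collapses to the a priori.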
The mean relative differences between the ACE and MWR measurements, as well as the corresponding mean ozone VMR profiles, are presented in Fig. 43. Despite the small number of comparison pairs at Mauna Loa (less than 15), the difference profiles at both sites are generally similar. Below 44 km, the mean relative differences between the ACE instruments and the MWRs are within ±10%, and often better than ±5%, except for the ACE-MAESTRO−MWR mean relative differences at Lauder from 32-36 km, which are between +10 and +15%. Above 42 km, the ACE instruments have a positive bias compared with the MWR, with mean relative differences within +3 to +25% and larger for ACE-FTS than for ACE-MAESTRO by 5-8%. Apart from a region between about 28 and 38 km at Lauder, ACE-FTS ozone retrievals yield larger VMRs than ACE-MAESTRO, though the differences are always within the indicated error bars.
A noticeable feature in the plots is the oscillation in the profile around the VMR peak at 34 km. This feature is also seen in comparisons between MWR measurements and those made with other instruments, as shown in Fig. 44, and can therefore be attributed to the MWR. Ground-based microwave measurements tend to produce retrievals with a small oscillatory component. The origin of this oscillation is discussed in Boyd et al. (2007) and Connor et al. (1995). These are effects of systematic spectral measurement errors that propagate through the process of averaging multiple spectra and can produce artifacts in difference profiles such as those seen in the figure.
To extend our validation comparisons, the MWR measurements were used as a transfer standard. The method compares data from the SAGE II, HALOE, Aura-MLS, GOMOS, and MIPAS satellite-borne instruments, as well as ground-based lidars, with the MWRs at Mauna Loa and Lauder. The difference profiles from these comparisons are then averaged to obtain a consensus difference profile. Also included in the averaging are MWR-MWR "zero-line" profiles so that the MWRs themselves are included in the consensus. The consensus difference profile is then subtracted from the ACE-FTS − MWR and ACE-MAESTRO − MWR difference profiles from Fig. 43, to obtain profiles which show the agreement between the ACE instruments and the consensus of the other instruments. Instrument comparisons with the MWRs were made using criteria similar to those used for the ACE−MWR comparisons discussed above, except that the geolocation window for the satellite-borne measurements extends to ±5.0° latitude and ±10.0° longitude of the two sites. All the instruments have relatively high vertical resolutions compared to the MWRs and have been convolved using the MWR averaging kernels for the comparison.
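The transfer-standard calculation described above can be illustrated with a short sketch. It assumes an unweighted, level-by-level average of the (instrument − MWR) difference profiles; the text does not state whether any weighting was applied, so this is an assumption. The instrument values below are made up, and the function names are hypothetical.

```python
def mean_profile(profiles):
    """Unweighted level-by-level mean of profiles on a common altitude grid.

    Levels where an instrument has no data are marked None and skipped.
    """
    n_levels = len(profiles[0])
    result = []
    for k in range(n_levels):
        values = [p[k] for p in profiles if p[k] is not None]
        result.append(sum(values) / len(values) if values else None)
    return result


def ace_minus_consensus(ace_minus_mwr, others_minus_mwr):
    """Subtract the consensus difference profile from the (ACE - MWR) profile.

    others_minus_mwr maps an instrument name to its (instrument - MWR)
    mean relative difference profile in percent. The MWR "zero line"
    is passed in as one of the entries so that it enters the consensus.
    """
    consensus = mean_profile(list(others_minus_mwr.values()))
    return [
        None if (a is None or c is None) else a - c
        for a, c in zip(ace_minus_mwr, consensus)
    ]


if __name__ == "__main__":
    # Made-up percent differences on a two-level grid (illustration only).
    others = {
        "SAGE II": [2.0, 4.0],
        "HALOE": [-2.0, 0.0],
        "MWR": [0.0, 0.0],  # zero-line profile
    }
    print(ace_minus_consensus([6.0, 10.0], others))
```

Because every profile is expressed relative to the same MWR reference, the subtraction removes the MWR contribution to first order, which is why MWR-specific artifacts such as the oscillation near the VMR peak do not necessarily carry over into the (ACE − consensus) profiles.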
All available measurements made by the satellite- and ground-based instruments, in the three-year period from 2004 through to the end of 2006, were used to determine the difference profiles. Table 6 summarizes the datasets used in this study, including the processing version number, the number of collocated pairs used in determining the difference profiles presented here and the gaps in the datasets. Results from the comparisons between the various instruments and the MWRs are presented in Fig. 44 for Mauna Loa (panel a) and for Lauder (panel b).

Table 5. Results of the comparisons between ACE-FTS, ACE-MAESTRO and the ground-based FTIRs. The microwindow(s) used in the FTIR retrievals are listed in column 2. For each ACE/FTIR instrument pair, the number of comparison pairs, the vertical range used to calculate the partial columns, the corresponding degrees of freedom (DOFS) and the mean difference and 1-σ standard deviation of the mean are indicated. The retrieval code (with version number) and spectroscopic database used by each station are given in the footnotes.
c When multiple microwindows are listed for a station, they are fitted simultaneously during the retrieval process.
d The 1000.00-1005.00 cm−1 microwindow was selected following the studies of Barret et al. (2002, 2003), for use within the European project UFTIR: "Time series of Upper Free Troposphere observations from a European ground-based FTIR network" (http://www.nilu.no/uftir/).
The resulting (ACE-consensus) difference profiles are again generally similar at both sites. Below 40 km, ACE-FTS shows a consistent positive bias, relative to the consensus, with mean relative differences within +2 to +7% at Mauna Loa and +4 to +8% at Lauder. ACE-MAESTRO also shows generally positive mean relative differences within +1 to +9%, in this altitude region, at Lauder. At Mauna Loa, the ACE-MAESTRO mean relative differences with the consensus are within ±5% up to 40 km, starting as a small negative bias but then tending positive. Above 40 km, both ACE instruments have an increasing positive bias, with mean relative differences between ACE-FTS and the consensus of up to +24% and, for ACE-MAESTRO, of up to +19%. Diurnal variation in ozone amounts becomes a factor above about 45 km, with rapid changes in ozone occurring around sunrise and sunset. The solar occultation SAGE II instrument has a small positive bias above this height, compared to the other consensus instruments, but still measures less ozone than the ACE instruments, suggesting other systematic errors are contributing to the higher positive bias in the ACE instruments. While HALOE is also a solar occultation instrument, the HALOE retrieval incorporates a photochemical model intended to account for diurnal variation of ozone along the instrument's line of sight at sunrise and sunset.

Summary -discussion
Here we summarize and discuss the VMR profile and partial column comparison results described in the previous sections. The mean relative differences from the vertical profile comparisons are presented in Figs. 45 and 46 for ACE-FTS and ACE-MAESTRO, respectively. In these plots, the vertical range has been limited to 60 km except for the comparisons with the Eureka DIAL, where the plotting limit was set to 38 km because of the large oscillations noted above this altitude. Only statistical comparisons are included in these summary plots; hence, the comparisons with individual FIRS-2, SAOZ and SPIRALE measurements are not included. The corresponding results are given in Table 7.
7.1 ACE-FTS

Figure 45 shows the mean relative differences of all statistical comparisons of VMR profiles for ACE-FTS. As can be seen, the results are highly consistent in the stratosphere between ∼16 km and 44 km for nearly all comparison datasets. In this vertical range, ACE-FTS reports on average +4% more ozone than the comparison instruments, with a spread of the mean relative differences on the order of ±5%. Two outliers with much larger mean relative differences can be noted in this altitude range: one with larger positive values and one with larger negative values. The former is the result of the comparison with Odin/SMR, for which the ACE-FTS VMR is consistently larger than that of SMR in the stratosphere (with mean relative differences within +3 to +20%), and the latter was obtained when comparing ACE-FTS with the Eureka DIAL, which shows negative mean relative differences of about −7%. The low bias of SMR ozone was noted in the validation study of Jones et al. (2007). The reason for the significant negative differences between ACE-FTS and the Eureka DIAL is still unclear. Furthermore, the individual comparisons with the balloon-borne instruments (not included in Fig. 45) show a similar agreement (with relative differences within ±10%). Additionally, the (ACE-FTS − consensus) mean relative difference profile (shown in Fig. 44 but not included in Fig. 45) obtained in the MWR study is an example of what can be obtained by combining the correlative observations from different instruments (Sect. 6.9). This shows results similar to what can be seen in Fig. 45, with a small positive bias of ACE-FTS with respect to the consensus at altitudes below 40 km, where the mean relative differences are within +2 to +8% at Mauna Loa and Lauder.

Below 16 km, the relative differences are more scattered. This can be explained by both geophysical and instrumental factors. The lower stratosphere is an atmospheric region with intrinsically large variability in the ozone VMR (as expressed by the large increase of the standard deviation of the mean VMR profiles at these altitudes), where the observations can encounter clouds or where the sensitivity of satellite sensors can decrease. Therefore, the methodology used here is not optimal for quality assessment of the ACE-FTS measurements at the lowest levels of the comparison. For detailed validation in the upper troposphere/lower stratosphere using alternative methods, the reader is referred to Hegglin et al. (2008).

The persistent high bias of ACE-FTS in the mesosphere (45-60 km), noted frequently in previous sections, is clearly seen in Fig. 45. The mean relative differences are generally about +20% at an altitude of about 55 km. Similar high VMR values were already noted in the initial validation for version 1.0 of the ACE-FTS data product (e.g., Walker et al., 2005; McHugh et al., 2005). The natural diurnal cycle of ozone in the mesosphere may be a factor in explaining the discrepancies, since the nighttime VMR values can be as much as 30 to 60% higher than the daytime values in the range 48-60 km (Schneider et al., 2005). However, these large differences are observed for comparisons with different instruments operating from different platforms, in different spectral ranges and with different viewing geometries. Therefore, it is unlikely that this difference at altitudes between ∼45 and 60 km arises solely due to the ozone diurnal cycle.
In addition, the comparison of partial columns derived from the ACE-FTS and ground-based FTIR measurements provides an alternate test of the overall quality of the ACE-FTS retrievals in the stratosphere. The partial column mean relative differences are within ±10% and generally positive, except for Thule (−9.1%) and Jungfraujoch (−9.9%), with de-biased standard deviation of the mean relative differences ranging from ∼2% for Izaña to 10% for Jungfraujoch and Wollongong. There is a good global correlation (∼0.88) between the values derived from the ACE-FTS measurements and those calculated for the FTIR observations.

For all statistical comparisons, we calculated the uncertainty of the mean (standard error), whose values are very small over the altitude range 16-44 km for most comparisons, and larger but still small at mesospheric altitudes. This indicates that the biases characterized in this work are statistically significant, since they are very rarely within the standard error bars of the comparison. Furthermore, we reported the de-biased standard deviation of the mean relative differences, which remains within 5 to 15% between 16 and 44 km and increases very rapidly below and above this altitude range. A large part of the de-biased standard deviation of the mean relative differences can be accounted for by the stated uncertainties of the correlative measurements. This suggests that the contribution of the ACE-FTS retrievals to the combined random errors of the comparisons is small and well estimated by the statistical fitting errors.

Several tests were performed with the ACE-FTS retrieval scheme to evaluate potential sources for systematic biases. The next processing version of the ACE-FTS software features an improved instrumental line shape (ILS) for the instrument. The ILS used for ACE-FTS version 2.2 processing gave an apparent 3-5% high bias in retrievals above ∼40 km for N2 and HCl (and presumably other molecules as well). There is also an improvement in the retrieval process for pressure and temperature developed for the next version of the ACE-FTS analysis software. Neither the new ILS nor the improvements in the pressure/temperature processing eliminate the systematic high bias in ACE-FTS O3 retrievals between 45 and 60 km. A more promising explanation for the high bias may be the spectroscopy for the microwindows employed in the retrievals. An alternative set of microwindows was tested for this altitude region that appears to yield improved agreement with other datasets, but this issue remains under investigation.
Finally, no systematic difference has been found between the ACE-FTS SR and SS profiles for all comparisons. There is very good consistency between the comparisons for ACE-FTS SR and SS occultations, as seen in Fig. 45.

ACE-MAESTRO
The current analyses have extended the results of Kar et al. (2007) to a broader range of correlative datasets. Figure 46 shows the mean relative differences of all statistical comparisons. These are separated into ACE-MAESTRO SR and ACE-MAESTRO SS events. For completeness, we have included the results of Kar et al. (2007) for POAM III and SAGE III in this plot.
The most obvious result is the bias between the MAESTRO SR and SS observations, at all altitudes between ∼35 and 55 km. The amplitude of this bias varies with altitude and with the comparison instrument. Below 35 km, the results are essentially comparable for both SR and SS, although the SR comparisons show generally positive and larger mean relative differences than the SS results in the range 25-35 km. Above ∼35 km and up to ∼55 km, the ACE-MAESTRO SR observations are systematically lower than the SS results for the same correlative dataset, and yield more scattered mean relative differences. The SR/SS bias is largest for POAM III and SAGE III around 50 km. For these instruments, the discrepancy can reach 25-30%, with mean relative differences of −10% for the ACE-MAESTRO SR occultations and +20% for the ACE-MAESTRO SS occultations.

It should be noted that the ACE-MAESTRO measurements are known to have a variable timing error of up to one second with respect to the ACE-FTS measurements. Since the ACE-MAESTRO retrievals use the tangent heights retrieved for ACE-FTS, this can lead to an offset of a few kilometers in the ACE-MAESTRO tangent heights, resulting in VMR profiles that can be significantly lower or higher than those retrieved from ACE-FTS or the comparison instrument (Manney et al., 2007). This issue is under investigation and has not yet been resolved. In particular, the v1.2 ACE-MAESTRO data used in the present study have not been corrected for this timing error. While this affects both SR and SS profiles, the effect is more pronounced for the SR profiles. This might explain the fact that, in general, the de-biased standard deviations of the mean relative differences for the comparisons involving the ACE-MAESTRO SR profiles are significantly larger than those obtained using the ACE-MAESTRO SS profiles. Part of the large spread in the SR differences seen in Fig. 46 might also be attributed to this.
For most instruments apart from POAM III and SAGE III, the comparisons with ACE-MAESTRO SR measurements show mean relative differences generally within ±5% but with an average close to 0% over the altitude range 20-55 km. However, the spread of the results is about ±10% around the average difference, larger than for ACE-FTS. In contrast, the ACE-MAESTRO SS results are more consistent. They show good agreement between 18 and 40 km, here also with an average difference close to 0%, and mean relative differences starting negative (−5% at 18 km) but becoming increasingly positive with increasing altitude (+5% at 40 km). As was found for ACE-FTS, the largest discrepancies in the altitude range ∼18-40 km are seen in the comparisons with Odin/SMR (+2 to +17%) and with the Eureka DIAL (about −13%). It is interesting to note that the SR/SS bias is not apparent in the comparisons with SMR. Consistent results were found using the MWR instruments as a transfer standard (Sect. 6.9), for which no separation of SR/SS was made. The mean relative differences below 40 km for (ACE-MAESTRO − consensus) are within +1 to +9% at Lauder and within ±5% at Mauna Loa.
In the upper stratosphere/lower mesosphere altitude range, the ACE-MAESTRO SS occultations show significantly more ozone than the comparison instruments, typically by up to +20%. This is comparable to the high-altitude positive bias already noted for ACE-FTS in the mesosphere. A potential explanation for this similarity between the ACE-FTS and the ACE-MAESTRO SS results is that the pressure and temperature profiles used in the ACE-MAESTRO retrievals are the profiles calculated from the ACE-FTS observations. This is also under investigation.
Below ∼18 km and above ∼55 km, the mean relative differences increase in magnitude and reach large negative values both for SR and SS observations. Above 55 km, the low signal-to-noise ratio in the O3 Chappuis band affects the retrievals and may be responsible for the larger negative differences noted at these altitudes.
Finally, comparisons of partial columns with the ground-based FTIR instruments show good agreement in the range used for calculations, with mean relative differences within ±9% but generally around ±2% and corresponding de-biased standard deviations of 6 to 16%. The correlation coefficient (0.84) is slightly lower than that found for the ACE-FTS comparisons.
As was found for ACE-FTS, the standard errors are very small for most statistical comparisons of VMR profiles, showing that the biases found in this study are statistically significant. The de-biased standard deviation of the mean relative differences is within ∼10 to 20% at most altitudes between 18 and 40 km and increases rapidly above and below this range. Unlike for ACE-FTS, the spectral fitting errors cannot account for the full contribution of ACE-MAESTRO retrievals to the de-biased standard deviation of the mean relative differences. Therefore, other sources will need to be taken into account in the ACE-MAESTRO random error budget.
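The two statistics used throughout this summary, the standard error of the mean and the de-biased standard deviation of the mean relative differences, can be illustrated at a single altitude level as follows. The formal definitions are given earlier in the paper and are not reproduced in this section, so the sketch assumes the usual forms: the de-biased standard deviation is the sample standard deviation of the individual relative differences about their mean, and the standard error is that value divided by the square root of the number of pairs. The function names and input values are hypothetical.

```python
import math


def comparison_statistics(rel_diffs):
    """Return (mean, de-biased standard deviation, standard error).

    rel_diffs holds the individual relative differences (in percent) of
    all coincident pairs at one altitude level; at least two are needed.
    """
    n = len(rel_diffs)
    mean = sum(rel_diffs) / n
    # Scatter about the mean, i.e. with the bias removed (assumed form).
    debiased_std = math.sqrt(
        sum((d - mean) ** 2 for d in rel_diffs) / (n - 1)
    )
    # Uncertainty of the mean itself; shrinks as the sample grows.
    std_error = debiased_std / math.sqrt(n)
    return mean, debiased_std, std_error


def bias_exceeds_standard_error(mean, std_error):
    """True when zero lies outside the standard error bar of the mean."""
    return abs(mean) > std_error


if __name__ == "__main__":
    # Made-up relative differences in percent (illustration only).
    mean, sigma, sem = comparison_statistics([2.0, 4.0, 6.0, 8.0])
    print(mean, sigma, sem, bias_exceeds_standard_error(mean, sem))
```

The de-biased standard deviation measures the combined precision of the two instruments being compared, whereas the standard error only indicates how well the mean bias is determined, which is why a bias can be statistically significant even when the scatter of the individual differences is large.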

Conclusions
We have completed a comprehensive bias determination study for the ozone profiles retrieved from measurements by the Atmospheric Chemistry Experiment satellite-borne instruments, namely the ACE-FTS version 2.2 Ozone Update and the ACE-MAESTRO version 1.2 data products. These datasets have been compared with VMR profiles from 11 satellite-borne instruments as well as ozonesondes and aircraft, balloon-borne and ground-based observations, over a time period of 1.5-3 years. Moreover, partial columns derived from the ACE measurements were compared with ground-based FTIR instruments. In these analyses, efforts were made to use consistent coincidence criteria, comparison methodology and data filtering (including selection of events with simultaneous observations from ACE-FTS, ACE-MAESTRO and the comparison instrument) in order to better assess the overall quality of the ACE-FTS and ACE-MAESTRO O3 data products. The overall results of the intercomparisons are summarized in Table 5 (partial column comparisons with ground-based FTIR instruments) and Table 7 (profile comparisons). The analyses show generally good agreement and very good consistency between ACE-FTS, ACE-MAESTRO and the correlative instruments in the stratosphere. Biases were identified over particular altitude domains in both datasets.

Table 7. Summary of results for the ACE-FTS and ACE-MAESTRO profile comparisons with correlative measurements. For cases when the SR and SS comparisons were performed separately or when only one type of occultation was used, the mean relative differences are labeled this way. SR/SS is used when the comparison was not separated by occultation type. Columns 2-5: for ACE-FTS, number of comparison pairs, continuous altitude range in which the mean relative differences are globally within ±10%, mean value (column 4) and maximum/minimum values (column 5) in this range. Columns 6-9: same information for ACE-MAESTRO.
The main findings for the ACE-FTS version 2.2 Ozone Update product are that there is very good agreement with the correlative measurements in the stratosphere, with a slight positive bias (mean relative differences of about 5%) between 15 and 45 km and a larger, well-characterized, systematic bias above 42-45 km. The analyses are remarkably consistent for the range of data products used in the comparisons, with a few exceptions which are generally accounted for by known biases of the comparison instrument. The de-biased standard deviation of the mean relative differences can be used to evaluate the ACE-FTS and comparison instrument combined precision. It shows that the statistical fitting errors appear to be an acceptable precision estimate for the ACE-FTS retrievals. This implies that the ACE-FTS measurements have good precision, comparable to, or lower than, that of the correlative instruments. Complete precision validation will be undertaken for the next version of the ACE-FTS ozone data product.
For the ACE-MAESTRO version 1.2 data product obtained from the VIS spectrometer, there is a noticeable bias between observations performed at sunrise and at sunset. Agreement for the SS measurements is generally better (with mean relative differences of +4% on average) in the range 20-40 km than that found for the SR events (with mean relative differences close to zero but showing a large scatter of ±15%), but there is a high bias above ∼45 km similar to the one noted for ACE-FTS. The SS difference profiles more closely resemble the results found for the ACE-FTS analyses. For ACE-MAESTRO, preliminary analysis of the de-biased standard deviations of the mean relative differences indicates that ACE-MAESTRO has poorer precision than ACE-FTS. The spectral fitting errors currently reported are not enough to account for the ACE-MAESTRO contribution to the random error budget of the comparison. Possible additional sources of random error are being investigated and should be included in the error budget of the ACE-MAESTRO ozone data product.
For both ACE-FTS and ACE-MAESTRO, comparisons of partial columns with ground-based FTIR instruments confirm the overall results and show comparable agreement with all stations.
Tests with a preliminary version of the next generation ACE-FTS retrievals (version 3.0) have shown that the slight positive stratospheric bias has been removed and that the large mesospheric differences have been decreased but are still present. Possible sources for these biases are being investigated at the time of writing. Additional work is ongoing to resolve the differences between the SR and SS retrievals for ACE-MAESTRO. A complete characterization of the random and systematic errors for both instruments will be undertaken during development of the next versions of the ACE ozone products. The ACE-FTS and ACE-MAESTRO ozone measurements analyzed in this work will be a valuable dataset to continue the long-standing record of occultation measurements from space and will play a role in monitoring stratospheric ozone recovery.