Five blind men and the elephant: what can the NASA Aura ozone measurements tell us about stratosphere-troposphere exchange?

We examine whether the individual ozone (O 3 ) measurements from the four Aura instruments can quantify the stratosphere-troposphere exchange (STE) flux of O 3 , an important term of the tropospheric O 3 budget. The level 2 (L2) Aura swath data and the nearly coincident ozone sondes for the years 2005–2006 are compared with the 4-D, high-resolution (1° × 1° × 40-layer × 0.5 h) model simulation of atmospheric ozone for the same period from the University of California, Irvine chemistry transport model (CTM). The CTM becomes a transfer standard for comparing individual profiles from these five, not-quite-coincident measurements of atmospheric ozone. Even with obvious model discrepancies identified here, the CTM can readily quantify instrument-instrument biases in the tropical upper troposphere and mid-latitude lower stratosphere. In terms of STE processes, all four Aura datasets have some skill in identifying stratosphere-troposphere folds, and we find several cases where both model and measurements see evidence of high-O 3 stratospheric air in the troposphere. In many cases identified in the model, individual in the at the model-measurement comparisons of individual profiles do provide some level of confidence in the model-derived STE O 3 flux, but it will be difficult to this flux from the satellite data


Introduction
Quantifying and understanding the causes of changes in the tropospheric ozone (O 3 ) burden are important topics for climate research and environmental studies, as ozone is a major greenhouse gas and plays a key role in the tropospheric chemistry. Besides obvious factors such as anthropogenic emissions of ozone precursors (Gauss et al., 2006;Hoor et al., 2009;Myhre et al., 2011;Holmes et al., 2011) and natural emissions of biogenic volatile organic compounds (Atkinson and Arey, 2003;Shao et al., 2009), stratospheric ozone influx has been identified as a major driver of tropospheric ozone changes (Roelofs and Lelieveld, 1997;Fusco and Logan, 2003;Terao et al., 2008;Hsu and Prather, 2009). There are large uncertainties in the estimates of global annual stratosphere-troposphere exchange (STE) of ozone flux either derived from observations (450 Tg(O 3 ) yr −1 (range, 200-870) (Murphy and Fahey, 1994), 510 Tg(O 3 ) yr −1 (450-590) (Gettelman et al., 1997), 550 ± 140 Tg(O 3 ) yr −1 (Olsen et al., 2001)) or from model simulations (e.g., Denman et al., 2007, Table 7.9 and references therein). There are also disagreements in terms of the magnitude and phase of the annual cycle as well as the geographical patterns (Gettelman et al., 1997;Roelofs and Lelieveld, 1997;Olsen et al., 2004;Hsu et al., 2005;Hsu and Prather, 2009). The large uncertainties and differences in the assessments of the STE O 3 budget are partly due to the different definitions and diagnostic methods used in these studies and partly due to the great temporal and spatial variances of the STE flux.
In this paper, we compare the Aura ozone measurements with University of California, Irvine (UCI) chemistry transport model (CTM) simulations, with the overall goal of using the model and measurements to derive a better understanding of how the stratospheric source affects the tropospheric ozone abundance. We evaluate whether the Aura ozone measurements can provide useful information regarding processes such as tropopause folds (TFs) (Danielsen, 1968); relate these events to STE O 3 fluxes; and examine the consistency amongst different Aura datasets, particularly in the upper troposphere and lower stratosphere (UT/LS) region (300-100 hPa). The UT/LS is where the influx of stratospheric O 3 is most evident (Terao et al., 2008) and where O 3 has the largest impact on radiative forcing of climate (Lacis et al., 1990). Identifying STE events, such as stratosphere-troposphere folds, is inherently a very difficult task for space remote sensing, especially for nadir-viewing instruments, and is beyond the designed scope of Aura (except for a fully functional HIRDLS). Therefore, we draw on the parable of the five blind men and the elephant, where the five ozone measurements are the five "blind men" who are touching the "elephant" (ozone) in different places, i.e., using different remote sensing techniques and observing different parts of the atmosphere (OMI and TES have some overlap). The UCI CTM is able to see the whole "elephant" and thus provides an intercomparison platform and integrator to connect and relate the different Aura ozone measurements.

Chemistry transport model
The chemistry transport model (CTM) is forced by the pieced-forecast meteorological fields provided by the University of Oslo (Kraabøl et al., 2002; Isaksen et al., 2005) from the European Centre for Medium-Range Weather Forecasts (ECMWF) Integrated Forecast System (IFS). The model is initialized on 1 January 2005 (00:00 UTC) with a restart file from a CTM simulation ending 31 December 2004 at a resolution of T42 (∼2.8° × ∼2.8°) with 40 layers and then run through 31 December 2006 at 1° × 1° × 40-layer × 0.5 h resolution. The 1° × 1° meteorological fields are only available for two years (2005-2006). The modeled atmosphere extends from the surface to 2 hPa with ∼1 km vertical resolution around the tropopause. Because the interpolation from the T42 grid to the 1° × 1° grid introduces errors at the beginning of the 1° × 1° simulation, the quantitative analysis here often omits the first few months. The primary model output for this analysis is a 65-min swath along the Aura orbit every half hour (30 min backward and 35 min forward from the sampling point) so that we can interpolate the overlapped swaths to match the exact time and location of each Aura measurement. The additional 5-min forward swath is designed to cover the MLS observations scanning in the forward limb direction. The swath is wide enough to include the cross-track scan of OMI and the off-track viewing geometry of HIRDLS, which is outside the OMI swath. We also store the 65° S-65° N O 3 field every two hours to match sondes.
The UCI CTM simulates basic tropospheric chemistry, including most of the major mechanisms, with the ASAD (A Self-contained Atmospheric chemistry coDe) software package (Carver et al., 1997) and a simplified stratospheric O 3 chemistry with Linoz version 2 (Hsu and Prather, 2009). The ASAD package includes the updates for the chemistry solver (Tang and Prather, 2010) and the chemical kinetics and photochemical coefficients (Sander et al., 2006). The tropopause is diagnosed by the abundance of an artificial tracer e90 (Tang et al., 2011), which has been demonstrated to match the traditional, but more-awkward-for-our-model, definitions. The emissions are taken from the European Union Quantifying the Climate Impact of Global and European Transport Systems (QUANTIFY) project Year-2000 inventory (Hoor et al., 2009). Advection uses the second-order moment scheme (Prather, 1986), and the convection scheme follows Tiedtke (1989).

Ozone sonde data
In this study, we use the World Ozone and Ultraviolet Radiation Data Centre (WOUDC) ozone sondes in years 2005-2006 from 42 stations (http://www.woudc.org, retrieved on 10 November 2010) for latitudes 65° S-65° N, which cover most folding events found in the model. The modeled ozone profiles generally match sondes (see Tang and Prather, 2010, Fig. 1). The search criteria for identifying TFs follow the algorithm in Tang and Prather (2010): proceeding upward from the surface, we tag the first layer above 5 km altitude at which the O 3 abundance exceeds 80 ppb (parts per billion, nanomoles per mole of air) and continue upward recording the maximum abundance; then, if the abundance decreases by at least 20 ppb relative to the maximum within the 3 km above the peak, to a minimum value less than 120 ppb, the profile is counted as a fold.
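The fold criteria above can be sketched in a few lines of Python. This is only an illustrative reading of the stated thresholds, not the code of Tang and Prather (2010): the function name, the argument names, and the choice to keep updating the peak as a running maximum are assumptions.

```python
def has_tropopause_fold(z_km, o3_ppb, start_km=5.0, onset_ppb=80.0,
                        drop_ppb=20.0, window_km=3.0, min_ppb=120.0):
    """Return True if a profile (ordered from the surface upward) shows a fold.

    Assumed reading of the criteria: start at the first level above
    `start_km` with O3 > `onset_ppb`, track the running maximum, and flag a
    fold when O3 falls by at least `drop_ppb` below that maximum within
    `window_km` above the peak, reaching a value below `min_ppb`.
    """
    peak = None      # running maximum O3 (ppb) since the onset level
    peak_z = None    # altitude (km) of that maximum
    for z, o3 in zip(z_km, o3_ppb):
        if peak is None:
            if z > start_km and o3 > onset_ppb:
                peak, peak_z = o3, z
            continue
        if o3 > peak:
            peak, peak_z = o3, z
        elif (z - peak_z) <= window_km and (peak - o3) >= drop_ppb and o3 < min_ppb:
            return True
    return False


# Hypothetical 1-km profiles: one with an O3 maximum at 7 km, one monotonic.
levels = list(range(16))
folded = [40, 40, 45, 50, 50, 55, 90, 110, 70, 60, 80, 150, 300, 500, 800, 1000]
smooth = [40, 42, 45, 50, 55, 60, 70, 85, 100, 150, 250, 400, 600, 800, 900, 1000]
print(has_tropopause_fold(levels, folded), has_tropopause_fold(levels, smooth))
```

The 120 ppb ceiling on the minimum keeps ordinary stratospheric variability above the tropopause from being counted as a fold.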

Aura ozone data and method
The EOS Aura satellite, launched on 15 July 2004, carries four instruments (HIRDLS, MLS, OMI, and TES) that observe ozone (Schoeberl et al., 2006). Aura flies in a sun-synchronous orbit with a 98° inclination and an ascending equator-crossing time of 13:45 local time. The orbit is 705 km above sea level with a 16-day repeat period. A single orbit takes ∼98 min. In this study, we use the level 2 (L2) swath data from all four instruments. Many studies of the Aura ozone data use averaged L2 data or L3 data (typically averaged over a month and over large grid cells), both of which smear the true meteorological variability and the folding events.

HIRDLS
HIRDLS observes the atmosphere in the limb direction at 21 infrared channels from 6.12 to 17.76 µm (HIRDLS Team, 2010). After launch, a malfunction resulted in ∼80 % blockage of its optical path, which limited the coverage to 65° S-82° N, missing the Antarctic. With its limited field of view at one azimuth angle, however, HIRDLS can still retrieve ozone profiles (260-0.5 hPa) with high vertical resolution (∼1 km). These continuous observations at one azimuth can facilitate studies of short-lived processes (e.g., gravity waves (Alexander et al., 2008)) and possibly TFs. Because of this failure, HIRDLS is not able to measure the same atmospheric profile as the rest of the Aura instruments within 15 min, but instead views the same location 84 min earlier than MLS at night and 17° to the east of the MLS track during the daytime. We use version 5 (v5.00.00) HIRDLS ozone data in this study. The data with negative "O3Precision" or earthward from the nearest and above the "CloudTopPressure" are screened out. The "gradient filter" is not applied, because high vertical ozone gradients exist when tropopause folds occur. Note that without the "gradient filter" some unrealistically high ozone spikes are not excluded (see

MLS
MLS measures stratospheric and upper tropospheric ozone using microwave limb sounding technology at 240 GHz (Schoeberl et al., 2006; Waters et al., 2006). Although the version 3.3 standard O 3 product has doubled vertical resolution and an enlarged pressure range, we opt here for version 2.2 data, as oscillations appear in the tropical upper troposphere (UT) profiles of version 3.3, even for monthly mean ozone profiles. The vertical resolution of MLS ozone profiles is ∼3 km in the UT and stratosphere. The horizontal resolution is ∼200 km × 6 km (along-track × cross-track). The precision of a single profile is 40 ppb at 215-100 hPa. We extract the scientifically useful data from 215 hPa to 0.02 hPa. Only data with (1) positive precision, (2) even-numbered "Status", (3) "Quality" greater than 1.2, and (4) "Convergence" less than 1.8 are used in this study.
Atmos. Chem. Phys., 12, 2357-2380, 2012 www.atmos-chem-phys.net/12/2357/2012/
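The four MLS screening rules and the 215-0.02 hPa range can be expressed as a boolean mask. The sketch below is illustrative only: the function name, array shapes, and argument names are assumptions, and reading the fields from the L2 files is left out.

```python
import numpy as np


def mls_good_mask(precision, status, quality, convergence, pressure_hpa):
    """Mask of usable MLS O3 values, shape (n_profiles, n_levels).

    Assumed inputs: `precision` is per level (n_profiles, n_levels);
    `status`, `quality`, `convergence` are per profile (n_profiles,);
    `pressure_hpa` holds the retrieval levels (n_levels,).
    """
    precision = np.asarray(precision, dtype=float)
    status = np.asarray(status, dtype=int)
    quality = np.asarray(quality, dtype=float)
    convergence = np.asarray(convergence, dtype=float)
    pressure_hpa = np.asarray(pressure_hpa, dtype=float)

    # Per-profile flags: even "Status", "Quality" > 1.2, "Convergence" < 1.8.
    profile_ok = (status % 2 == 0) & (quality > 1.2) & (convergence < 1.8)
    # Scientifically useful range used in this study: 215 hPa to 0.02 hPa.
    level_ok = (pressure_hpa <= 215.0) & (pressure_hpa >= 0.02)
    # Per-value flag: positive precision.
    return (precision > 0) & profile_ok[:, None] & level_ok[None, :]
```

Applying the mask before any interpolation keeps flagged values from leaking into the model-measurement comparison.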

OMI
OMI uses a 2-D Charge-Coupled Device to measure the backscattered solar irradiance from nadir direction at ultraviolet and visible wavelengths (UV-1: 264-311 nm, UV-2: 307-383 nm, VIS: 349-504 nm). The wide cross-orbit swath (2600 km) allows OMI to provide global daily coverage (OMI Team, 2009). In this study, we use the OMO3PR V003 ozone profiles with a horizontal resolution of 13 km×48 km (along-track × cross-track) and 18 vertical layers from the surface to 0.3 hPa (de Haan and Veefkind, 2009). In the troposphere, the vertical resolution is very coarse (3-6 layers) and thus OMI cannot resolve vertical structures such as TFs. OMI does provide useful information about the tropospheric column, including its enhancements in regions with TFs (Tang and Prather, 2010). The OMI tropospheric column ozone (TCO) is derived by applying the tropopause height output from the UCI CTM to the OMI ozone profiles.

TES
TES is a high-resolution infrared Fourier transform spectrometer with spectral coverage from 650 to 3250 cm −1 at a spectral resolution of 0.025 cm −1 . It is designed to view the atmosphere in both nadir and limb directions with a 5 km × 8 km nadir footprint. The limb scan mode, however, was eliminated in 2005 to conserve instrument life. The version 4 (V004, F05_07) nadir global survey standard ozone product is used in this paper. The profiles are reported at 67 pressure levels from the surface to 0.1 hPa, but in the troposphere there are only 1-2 degrees of freedom for the signal (DOFS) (Nassar et al., 2008; Zhang et al., 2010). Ozone profiles whose "SpeciesRetrievalQuality" or "O3_Ccurve_QA" does not equal 1 are excluded. Some of the TES profiles (0.5 %) contain fill values for the averaging kernel (AK), and these are also excluded.

Methods of mapping modeled ozone profiles onto reported Aura measurements
The CTM and Aura ozone profiles all have different locations and pressure coordinates. For geographic collocation, we choose without interpolation the 1° × 1° model grid box containing the centre (for OMI and TES) or the location of the tangent height (for HIRDLS and MLS) of the observation. Temporal differences are accounted for by interpolation between the two half-hour model simulations bounding the observation, as described above. The vertical remapping is more complex and specific to each instrument. Before comparing them, we first map CTM profiles onto the Aura levels either by linear interpolation in pressure for HIRDLS and MLS or by convolution with the a priori and averaging kernel (AK) (together referred to as the satellite operator) for OMI and TES. The recommended "least squares fit" method is unstable for the lowermost MLS layers, which are particularly important for this paper. Also considering that HIRDLS and MLS have vertical resolutions comparable to that of the CTM, we decided to simply interpolate the CTM profiles onto HIRDLS and MLS levels. For OMI and TES the satellite operators are applied to the CTM profiles to account for the limited vertical resolution and sensitivity of nadir-viewing measurements, based upon the following equation (Worden et al., 2007):
$$\hat{x}_m = x_a + A\,(x_m - x_a) \qquad (1)$$
where $x_a$ and $A$ are the a priori and averaging kernel as reported in the OMI and TES HDF-EOS5 metadata. The modeled ozone profiles are first interpolated onto the satellite levels. The above equation then transforms the interpolated profiles $x_m$ to the "retrieved" profiles $\hat{x}_m$, mimicking the vertical smoothing of the retrieval process of OMI and TES data.
Note that $x_m$ is in Dobson units (DU) for OMI and is the natural logarithm of the ozone molar ratio for TES.
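As an illustration of the satellite operator, the following minimal sketch applies the a priori and averaging kernel to a model profile that has already been interpolated onto the satellite levels. The function name, the `log_space` switch, and the assumption that each row of the kernel corresponds to one retrieved level are ours, not taken from the OMI or TES product documentation.

```python
import numpy as np


def apply_satellite_operator(x_model, x_apriori, avg_kernel, log_space=False):
    """Return the 'retrieved' model profile x_a + A (x_m - x_a).

    x_model, x_apriori : profiles on the satellite levels, shape (n,)
    avg_kernel         : averaging kernel A, shape (n, n); row i is assumed
                         to give the sensitivity of retrieved level i
    log_space          : if True, operate on ln(x), as for TES mixing
                         ratios; if False, operate on x directly, as for
                         OMI layer amounts in DU
    """
    x_m = np.asarray(x_model, dtype=float)
    x_a = np.asarray(x_apriori, dtype=float)
    A = np.asarray(avg_kernel, dtype=float)
    if log_space:
        return np.exp(np.log(x_a) + A @ (np.log(x_m) - np.log(x_a)))
    return x_a + A @ (x_m - x_a)
```

Two limiting cases help check a setup: an identity kernel returns the model profile unchanged, and a zero kernel returns the a priori, which is the sense in which the operator relaxes profiles towards the a priori where the measurement has little sensitivity.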

Problems with applying satellite operators for nadir-view instruments
TES contains 1-2 DOFS in the troposphere and thus provides some tropospheric profile data, but less information than is apparent in the 25 tropospheric pressure levels of its retrievals (see Fig. 1). Given the sparse spatial coverage of TES in contrast with OMI, we can only infer STE processes from the changes in upper tropospheric values. Therefore, in this study we compare the CTM with TES profiles. Convolving modeled or sonde profiles with Eq. (1) essentially smooths those profiles vertically and relaxes them towards the retrieval a priori. For most regions, this method works adequately and smooths the profiles as expected. In the UT, however, applying the satellite operator can cause unphysically high biases (see Fig. 1a and Table 1). Figure 1a shows the comparison for one of the TES profiles in Fig. 4i at 9.5° S, 295.0° E. The TES profile (red line) contains a slight inversion at 630 hPa, determined mainly by its a priori (green line), and a dispersed fold at 600-200 hPa that is primarily contributed by the TES signal. From 120 hPa to 70 hPa, TES values are almost identical to a priori values. The linearly interpolated CTM profile (blue line) also resolves the two folds, but with larger magnitudes. The modeled profile matches the shape as well as the magnitude retrieved by TES in the UT. After the TES operator is applied, the CTM profile (black line) still has the fold at 630 hPa, but the vertical gradient becomes smoother and the shape is quite similar to the a priori estimate. However, the fold at 400 hPa is totally smoothed out and the ozone abundance increases monotonically above 400 hPa with a much larger slope than that of TES, the TES a priori, or the raw CTM profile in the UT, resulting in unrealistically large O 3 abundances in the UT. Smearing of stratospheric information into the troposphere by the TES retrieval process is also noted in other studies.
The artificially high bias in the UT introduced by applying the TES operator reflects: (i) the model's high bias relative to TES in the lower stratosphere, (ii) TES's coarse vertical resolution around the tropopause, and (iii) the large cross-tropopause ozone gradient. The AK for this particular measurement is shown in Fig. 1b. It is generally thick, indicating coarse vertical resolution. Although data are reported at 67 levels, the DOFS of this measurement (i.e., the trace of the AK) are 4.0 for all levels, 1.5 for the troposphere, and 0.5 for the UT (250-100 hPa). The UT retrievals are strongly influenced by the layers at 100-50 hPa and 400-250 hPa, which have contributions as large as those of the UT region itself (red lines in Fig. 1b). Therefore, the large stratospheric differences between the TES a priori (green line in Fig. 1a) and the raw CTM profile (blue line in Fig. 1a) are smoothed and aliased into the UT, swamping the clear UT signal in the model, and leading to the high model biases in this region. Table 1 shows the means, standard deviations (σ), and the root mean squares (RMS) of the partial columns for TES and matching CTM profiles for the swath shown in Fig. 4i. This swath consists of 54 individual profiles, covering 31.5° S-71.4° N. In the upper troposphere (400 hPa-tropopause), the raw CTM mean (second column) is 7 % smaller than TES (fourth column). After processing with the TES operator, the CTM mean (third column) is enhanced by 3.0 DU (almost 1σ) and becomes 14 % larger than TES. In the middle (700-400 hPa) and lower (surface-700 hPa) troposphere, the CTM means using the TES operator become smaller and closer to the TES means. The RMS of anomalies (columns 5 and 6) is increased by the operator in the UT, but reduced in the middle and lower troposphere.
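Because the DOFS is the trace of the AK, the contribution of a pressure range can be estimated by summing the diagonal elements that fall within it. The sketch below assumes a square kernel on the retrieval pressure levels; the function name, arguments, and the toy kernel are hypothetical.

```python
import numpy as np


def dofs(avg_kernel, pressure_hpa, p_bottom_hpa=np.inf, p_top_hpa=0.0):
    """Degrees of freedom for signal between two pressure bounds.

    Sums the diagonal of the averaging kernel over levels with
    p_top_hpa <= p <= p_bottom_hpa (the full trace by default).
    """
    A = np.asarray(avg_kernel, dtype=float)
    p = np.asarray(pressure_hpa, dtype=float)
    in_range = (p <= p_bottom_hpa) & (p >= p_top_hpa)
    return float(np.diag(A)[in_range].sum())


# Toy 4-level kernel (not a real TES kernel).
A_toy = np.diag([0.5, 0.3, 0.2, 0.1])
p_toy = [1000.0, 500.0, 200.0, 50.0]
total_dofs = dofs(A_toy, p_toy)                                   # all levels
ut_dofs = dofs(A_toy, p_toy, p_bottom_hpa=250.0, p_top_hpa=100.0)  # 250-100 hPa
```

A partial DOFS well below 1, such as the 0.5 quoted for the UT, indicates that the retrieved values there depend heavily on neighbouring layers and on the a priori.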
The results in Table 1 are typical and suggest that applying the TES operator causes artificially high biases in the UT over most latitudes. This is also true for OMI, whose vertical resolution is even coarser than TES. One possible solution for this problem is to redo the TES retrieval process using our modeled profiles as the a priori, but this would be an extensive effort beyond the scope of this paper. Because the UT region is greatly affected by the STE processes (Terao et al., 2008), on which this paper focuses, we decided to compare the interpolated CTM profiles directly with TES on TES pressure levels in the following case studies (Sect. 6.1), and we show results both with and without the TES operator for the rest of the analysis. Our problems when using satellite operators highlight the fact that to avoid misusing and/or misinterpreting satellite data, it is important to know the sensitivities and resolutions of satellite measurements, especially in regions with large gradients like the tropopause. If the real ozone in the lower stratosphere was actually much higher than the retrieval a priori, TES would put too much ozone in the UT similar to the convolved model results.
The artificially high bias of local O 3 abundances caused by applying nadir satellite operators in the UT, however, appears to have only a small effect on the tropospheric column ozone (TCO) as shown by the sums of the second and third columns (38.0 DU vs. 38.1 DU) of Table 1. As mentioned above the TES DOFS in the troposphere are generally greater than one (Zhang et al., 2010) and thus give reasonable TCO values. Likewise, OMI has typically DOFS of 1 for the troposphere (de Haan and Veefkind, 2009), and its TCO matches the simulation in terms of geographical patterns and magnitudes (Tang and Prather, 2010). Therefore, when comparing the TCO of TES or OMI with the model results, we process CTM profiles with the satellite operators.

Temporal and spatial scales of STE
Knowledge of the temporal and spatial scales of STE-related processes is important for investigating stratospheric ozone influx. Figure 2 illustrates the high variability of STE with three snapshots at 2-h intervals of the simulated ozone cross sections as a function of pressure altitude (z* = 16 × log10(1000/p), with z* in km and p in hPa) and longitude at 29.4° N, starting on 31 January 2005 at 00:00 UTC. The tropopause folding structures, outlined by the thick black lines designating an O 3 abundance of 100 ppb, suggest cross-tropopause mixing. The magenta squares mark the same CTM grid box and highlight the short-lived nature of tropopause folds. In Fig. 2a, the magenta box is located in the stratospheric part of a TF beneath an apparently isolated tropospheric air mass, which is actually connected with the troposphere at another latitude. Two hours later, another TF develops and its tropospheric branch extends over the box (Fig. 2b). The isolated tropospheric air mass in Fig. 2a reconnects with the troposphere at ∼10° east in Fig. 2b. The folds continue moving to the east and the box is in the troposphere two hours later (Fig. 2c). An isolated stratospheric air mass (140° E) and tropospheric air mass (170° E) also emerge at 12-14 km in Fig. 2c. The stratospheric air stays deep in the troposphere (130° E-150° E, 6-8 km) with little change over 6 h, indicating a significant stratospheric intrusion and showing that the STE process can be relatively long-lived. Given the objective of identifying TFs, we must compare with the Aura level 2 (L2) swath data instead of the L2 averaged data or the L3 gridded data that are averaged over two weeks or more. Figure 2 also indicates that the STE processes occur on the scale of a few hundred kilometers and require model resolutions of about 1° to match the observations.
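The pressure-altitude coordinate used for the cross sections is a simple logarithmic transform of pressure. A short sketch, with function names chosen here for illustration, converts in both directions:

```python
import math


def pressure_altitude_km(p_hpa):
    """Pressure altitude z* (km) = 16 * log10(1000 / p), with p in hPa."""
    return 16.0 * math.log10(1000.0 / p_hpa)


def pressure_from_altitude_hpa(z_km):
    """Inverse transform: pressure (hPa) at pressure altitude z* (km)."""
    return 1000.0 * 10.0 ** (-z_km / 16.0)


for p in (1000.0, 215.0, 100.0):
    print(f"{p:7.1f} hPa -> z* = {pressure_altitude_km(p):5.2f} km")
```

With this definition each factor of ten in pressure corresponds to 16 km, so 100 hPa lies at z* = 16 km.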

Case studies
We select our case studies of tropopause folds to ensure that the most precise measurement of tropopause folds (i.e., the ozone sonde) observes a folding event, that the OMI swath overlaps the sonde measurement within one hour, and that all four Aura measurements are available. Out of the 1 907 WOUDC ozone sondes and the ∼10 000 Aura swaths, there are eight such cases in year 2005. Two of these cases are shown in Figs. 3 and 4 here, while the remaining six are in the appendix (Figs. A1-A6). In Fig. 3 Fig. 3a and 18:00 UTC for Fig. 4a). In the first case, the model and sonde have the same main shape (Fig. 3a): ozone decreases with height in the boundary layer (1000-800 hPa) and increases in the free troposphere with a clear inversion in the upper troposphere and another one of smaller magnitude at 125 hPa. The CTM misses the high values around 700 hPa. The CTM profile has larger variance in the free troposphere. The middle troposphere maximum seen in the sonde (300-200 hPa) occurs lower in the CTM (400-300 hPa) and is attributable to a TF. The model slightly underestimates the magnitude of the folding at 125 hPa. In the second case, the model reproduces the magnitude of the folding structure at 300 hPa with the peak point about 30 hPa lower in altitude, but overestimates the variance at the tropopause (120 hPa, Fig. 4a). These two comparisons reveal that the model matches the sonde in the shape and magnitude, in particular resolving the folds, and give us some confidence that the model should be capable of reproducing the folding structures and patterns in the nearly concurrent Aura swaths.
The locations of each available Aura measurement (HIRDLS: black crosses, MLS: cyan crosses, OMI: green dots, TES: red crosses) in the one-hour period close to the sonde measurement are shown on the CTM grids in Figs. 3b and 4b. Note that although the swaths of MLS and TES nearly overlap, MLS measurements are a few minutes ahead of TES, as MLS performs a forward limb scan whereas TES views in the nadir direction. Only OMI and TES can have exactly matching measurements in both time and location. The large black crosses represent the sonde locations. Comparisons between Aura and CTM matched swaths are presented in Fig. 3c-d for OMI, Fig. 3e-f for MLS, Fig. 3g-h for HIRDLS, and Fig. 3i-j for TES. Parallel structure and notation are used in plots of the other case studies (Figs. 4 and A1-A6). Each Aura measurement is compared with the coincident model result for the grid box containing the centre of the Aura observation. The modeled profiles are interpolated onto the corresponding Aura levels for the comparisons. The white spaces in the Aura swaths (panels c, e, g, i of Figs. 3 and 4) indicate either no measurements or bad values. The CTM swaths, by contrast, show all profiles along the orbit to present a continuous picture. White areas in Figs. 3j and 4j reflect topography. The black lines imposed on the vertical swaths represent the tropopause at each measurement location as determined by the artificial tracer e90 in the CTM. For OMI, we compare the CTM with only the TCO as a function of latitude and longitude, since OMI has a DOFS of 1 in the troposphere (de Haan and Veefkind, 2009). For the remaining three Aura instruments, the comparisons are performed for the pressure-by-latitude cross-sections of each swath for 0°-180° E in Fig. 3 and 180° E-360° E in Fig. 4.
Note that TES also contains 1-2 tropospheric DOFS, but its footprints are too sparse to allow it to infer STE from TCO anomalies alone (as for OMI), and thus the combination of upper troposphere changes in TES profiles and the simulated geographical pattern of the fold is used here to validate the modeled folds.
The model simulates the OMI TCO swath quite well (see Figs. 3c-d and 4c-d) as previously shown by Tang and Prather (2010). Here, the CTM profiles are convolved with the OMI operator to account for the limited vertical resolution and sensitivity of OMI, whereas Tang and Prather (2010) use the raw CTM profiles. The convolution does not have great impact on the CTM TCO. The OMI TCO uses the tropopause height calculated by the CTM to make consistent comparisons. In Fig. 3c-d, high TCO appears over Northern and Eastern Asia as well as Australia in both OMI and CTM. The geographic patterns match in details, such as the curvature in Northern Asia. The OMI TCO swath contains more high-frequency variability, likely to be noise, than does the CTM. Figure 4c-d show similar results. Both CTM and OMI have high TCO over North and South America. The CTM, however, underestimates TCO over North America and overestimates it over South America. The biases are within ±5 DU. The high anomalies in TCO are correlated with TF events, particularly near the subtropical jet streams (Tang and Prather, 2010) and hence can provide clues about whether the folding structures in MLS and TES swaths are realistic.
MLS and CTM have similar O 3 patterns just above the tropopause with the typical lower stratospheric values (>200 ppb) at 68 hPa in the tropics and at 215 hPa in the extra-tropics. The tropics-to-midlatitude transition from troposphere to stratosphere is the same in both: 23° S at 100 hPa and 30° N at 215 hPa in Fig. 3e-f; and 18° N at 100 hPa and 60° N at 215 hPa in Fig. 4e-f. MLS reports inversion structures near 13° S at 147-100 hPa (Fig. 3e), which are not found in the CTM swath (Fig. 3f). The corresponding OMI and CTM TCO do not show high anomalies around 13° S, and thus these inversions in the MLS data are probably noise in the MLS retrieval procedure. The folding structures at 15° N-30° N are consistent in MLS and CTM swaths and confirmed by the TCO high anomalies. As expected from the single-profile precision of MLS in this region, some unphysical values emerge in the MLS data with no analogues in the model, such as >200 ppb O 3 at 215 hPa near the equator, 70 ppb at 68 hPa at 20° N (Fig. 3e), and >200 ppb at 147 hPa at 12° N (Fig. 4e). These are likely due to contamination from thick clouds. For HIRDLS, most of the tropospheric values are missing for the tropics below the tropopause, as they are obscured by the presence of high clouds in the troposphere (Figs. 3g and 4g). The tropopause region in HIRDLS swaths appears fuzzier and more diffuse, with some non-physically low values (<50 ppb) in the stratosphere (e.g., 17° N at 70 hPa in Fig. 3g) and unrealistically high values (>200 ppb) in the troposphere (e.g., 7° N at 130 hPa and 10° S at 196 hPa in Fig. 4g). These unrealistic values reflect the fundamental difficulty for HIRDLS, or any limb-scanning instrument, of quantifying ozone abundances as they decline rapidly below the tropopause. Some of the non-physical, high abundances may be screened out as "high spikes" (HIRDLS Team, 2010).
On the other hand, HIRDLS does observe some tropospheric patterns that match the CTM, such as the high-O 3 spot at 150 hPa near 30° N in Fig. 3g and the low values at 230 hPa near 21° N in Fig. 4g. Given the fine vertical resolution (∼1 km), HIRDLS can resolve major STE events, following stratospheric air well into the troposphere (see Pan et al., 2009, for a case on 11 May 2007), but we did not find these in our test cases for 2005-2006. Observing at nadir angles, TES is able to retrieve the ozone profile down to the ground, but the vertical resolution is much coarser than that of MLS, HIRDLS, and the CTM in the UT/LS region. The profiles in the TES swaths (Figs. 3i and 4i) are much smoother than the CTM simulation, catching the main components and patterns but missing many of the details. In Fig. 3i-j, both TES and the CTM display high ozone abundances at about 20° N at 631 hPa and 42° N at 400 hPa, plus the displacement of stratospheric air (with O 3 >200 ppb) down to a typical tropospheric regime (38° N-50° N, 400-250 hPa), indicating stratospheric intrusions. The locations of these intrusions match the cyclonic pattern in observed and simulated OMI swaths (Fig. 3c-d). TES, however, does not show the intrusion structures at 20° N at 280 and 158 hPa. In Fig. 4i-j, high O 3 (∼80 ppb) values are found near 10° S at 350 hPa in both TES and the CTM, but the hot spot at 700 hPa at that latitude, probably due to biomass burning, is seen only in the CTM. TES shows the high-O 3 anomaly at 37° N in the lowermost troposphere, which may possibly be understood as the folding structure aloft (predicted by the CTM) being redistributed by the TES AK (Fig. 1b) into the lowermost troposphere to give reasonable TCO. The enhanced ozone at 15° N, 400 hPa is similar in both. The tropospheric inversion patterns simulated by the CTM at 30° N-60° N are seen as a broad area of enhanced O 3 by TES.
The other six cases (Figs. A1-A6) show results very similar to the above two cases for different locations and times. In one case (Fig. A2, 6 July 2005) the CTM reproduces the large stratospheric fold at 200 hPa as seen by the sonde, and the MLS and CTM patterns match quite well in Fig. A2e-f, except for the magnitudes of a few points. HIRDLS observes a stratospheric intrusion at 45° N, 260-200 hPa, also in agreement with the model. In Fig. A4 Figure A7 presents this case in the same way as the above eight cases except that there is no sonde available and the color scale is adjusted to emphasize the stratospheric O 3 . In the new HIRDLS version (v5.00.00), the 2-km thick intrusion is also found near 110 hPa at 30° N-55° N (see Fig. A7g). Figure A7h simulates this low O 3 layer at the same location, but the O 3 abundance in the surrounding air is biased high relative to the HIRDLS measurements due to known problems with the stratospheric meteorology (Hsu and Prather, 2009). The MLS swath (Fig. A7e) indicates an inversion structure at 52° N, 100 hPa, which does not appear in the simulation (Fig. A7f).
These case studies of the five ozone measurements and the CTM simulations, made on an instantaneous basis, confirm the model's ability to reproduce the STE processes and show that the Aura measurements can detect some of the fine structures in O 3 , such as TFs and stratospheric intrusions deep into the troposphere, while they miss a large number of such cases, presumably due to instrumental noise, lack of sensitivity, and limited vertical resolution in individual measurements. Like others, we find that the Aura measurements can resolve stratosphere-troposphere folds for specific cases (Manney et al., 2009; Pan et al., 2009; Manney et al., 2011). Combined with the 4-D hindcasts using the CTM or with a data assimilation system, they may lead to a general, comprehensive integration of the global STE flux, but more work is needed.

CTM vs. Aura instantaneous comparisons
Given changing tropopause folds and stratospheric intrusions, we need a 4-D description of atmospheric O 3 to determine if these instruments are measuring the same ozone. In this section, we use the UCI CTM as an intercomparison platform to study the consistency among the Aura ozone datasets, focusing on the UT/LS regions.

Figure caption (fragment; see Table 2): For TES, we present the comparisons for both the raw CTM simulation (fourth column) and that convolved with the TES operator (third column). The PDF (unit of frequency per ppb²) is weighted inversely by the sampling times for each latitude to account for unequal observations from different latitudes, and it is normalized to give an integral of 1. The number of CTM-Aura exact matches for the month is shown on each panel.

Fig. 7. Same as Fig. 5 for January 2006 at 215 hPa. The mean biases and RMS are given in Table 3.

Atmos. Chem. Phys., 12, 2357-2380, 2012, www.atmos-chem-phys.net/12/2357/2012/
The red, high-density pixels are generally located close to the black solid 1:1 line for all three instruments, indicating small biases, except for stratospheric comparisons such as Fig. 6q-t. The CTM is generally biased high compared to all three Aura measurements in the lower stratosphere, suggesting a model deficiency that is most likely due to the previously noted errors in the stratospheric circulation of the 40-layer ECMWF meteorological fields (Hsu and Prather, 2009). As expected, TES gives generally tighter PDFs than MLS and HIRDLS, reflecting the differences between nadir and limb scanning. For tropospheric model values (<100 ppb), such as in the tropics and jet regions of the summer hemisphere (Fig. 5e, f, i, j), the slopes of the MLS and HIRDLS PDFs are almost flat, consistent with low sensitivities and noise in the lowermost layers of these limb-scanning measurements. Negative MLS profile values (e.g., Fig. 5a, q) are allowed in the retrieval algorithm to achieve the correct column loading. The PDFs of HIRDLS are notably more dispersed than those of MLS and TES at 215 hPa for the winter hemisphere middle latitudes (Fig. 5r and Fig. 7b), indicating greater noise in the HIRDLS measurements for this region and season. With the TES operator, the tropospheric CTM values (<100 ppb) become stratospheric (>100 ppb) (e.g., Fig. 5c-d and Fig. 7s-t), as previously shown in Sect. 4.6. The CTM-TES comparisons are usually improved by applying the TES operator (e.g., Fig. 5g-h and Fig. 8g-h), because it relaxes the profile towards the TES a priori and reduces the variance at a given pressure level by vertical smoothing (see Tables 2 and 3). Without some clear indication of the relative influence of the a priori in each retrieval, this CTM-TES agreement may be artificial. Tables 2 and 3 summarize the mean biases and root mean square (RMS) errors for Figs. 5-8.
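The construction of such a 2-D PDF can be sketched as follows. This is a minimal illustration, not the code used for Figs. 5-8: the function name `weighted_pdf_2d`, the 10 ppb bin width, and the 5° latitude bins are assumptions made here, and the only properties taken from the text are that each matched CTM-Aura pair is weighted inversely by how often its latitude was sampled and that the PDF integrates to 1.

```python
import math
from collections import defaultdict


def weighted_pdf_2d(samples, bin_ppb=10.0, lat_bin_deg=5.0):
    """Build a 2-D PDF from matched (latitude, model_ppb, obs_ppb) samples.

    Each sample is weighted by 1 / (number of samples in its latitude bin),
    and the result is scaled so that sum(density) * bin_ppb**2 equals 1,
    i.e. the density has units of frequency per ppb^2.
    Bin widths are hypothetical choices, not values from the paper.
    """
    # Count how often each latitude bin was sampled.
    lat_counts = defaultdict(int)
    for lat, _, _ in samples:
        lat_counts[math.floor(lat / lat_bin_deg)] += 1

    # Accumulate inverse-sampling weights into (model, observation) bins.
    # math.floor keeps negative retrieved values in their own bins.
    weights = defaultdict(float)
    for lat, model, obs in samples:
        w = 1.0 / lat_counts[math.floor(lat / lat_bin_deg)]
        key = (math.floor(model / bin_ppb), math.floor(obs / bin_ppb))
        weights[key] += w

    # Normalize so the PDF integrates to 1 over the (model, obs) plane.
    total = sum(weights.values())
    area = bin_ppb * bin_ppb
    return {key: w / (total * area) for key, w in weights.items()}
```

With this weighting, a heavily sampled latitude band contributes the same total weight as a sparsely sampled one, so the shape of the PDF is not dominated by the orbit geometry.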
The RMS errors are generally much larger than the biases, consistent with previous validations against ozone sondes (Nassar et al., 2008; Zhang et al., 2010), and are most likely due to the high variability at this pressure range for all latitude zones in both summer and winter. Note that the biases are less meaningful given such large RMS, and thus are only useful for qualitative, long-term averages (i.e., L3 monthly gridded data). The biases clearly show that the CTM overestimates O 3 in the subtropical jet and mid-latitude regions, again pointing to the model deficiency in these regions.
The results in Tables 2 and 3 identify inconsistencies among the Aura datasets. In July 2005 at 215 hPa, the CTM means are smaller than MLS and TES, while greater than HIRDLS in the tropics, NH jets, and mid-latitudes. Compared with sondes, TES has at most a 15 % high bias in the troposphere (Nassar et al., 2008; Richards et al., 2008), while MLS has a ∼20 % high bias at the middle- to high-latitude tropopause. We have now shown that HIRDLS has large positive biases of ∼30-100 ppb at 215 hPa from the tropics to NH mid-latitudes in summer. In the tropics at 147 hPa, the CTM is smaller than MLS and HIRDLS but larger than TES for both July 2005 and January 2006, identifying a clear discrepancy between MLS-HIRDLS and TES without having to find collocated observations. In this case the RMS is much larger than the mean biases, although the standard error of the mean (SEM) calculated assuming a normal distribution is smaller. The bias does not show in individual measurements, and the statistical significance of the bias in L3 gridded data depends greatly on the assumption of a normally distributed error that does not depend systematically on specific atmospheric conditions.
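The relationship between mean bias, RMS, and SEM discussed here can be made concrete with a short sketch. The definitions below are assumptions for illustration (differences taken as CTM minus Aura, RMS computed about zero, SEM from the sample standard deviation of the differences); the exact conventions behind Tables 2 and 3 are not restated in this section, and the function name `bias_rms_sem` is hypothetical.

```python
import math


def bias_rms_sem(model, obs):
    """Return (mean bias, RMS difference, standard error of the mean).

    Assumed conventions: difference = model - obs, RMS is taken about
    zero, and the SEM uses the sample standard deviation (n - 1) of the
    differences, which presumes independent, normally distributed errors.
    """
    diffs = [m - o for m, o in zip(model, obs)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two matched pairs")
    mean_bias = sum(diffs) / n
    rms = math.sqrt(sum(d * d for d in diffs) / n)
    variance = sum((d - mean_bias) ** 2 for d in diffs) / (n - 1)
    sem = math.sqrt(variance / n)
    return mean_bias, rms, sem
```

Because the SEM shrinks as 1/sqrt(n) while the RMS does not, a monthly mean bias can appear statistically significant even when it is invisible in any single profile, which is the caveat raised above.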

Conclusions
The high-resolution CTM (1° × 1° × 40-layer × 0.5 h) simulation of ozone reveals that the time scale of stratosphere-troposphere exchange (STE) processes observed at a given location is as short as hours and indicates that STE occurs on a spatial scale of a few hundred kilometers. For nadir-view instruments (e.g., OMI and TES), the application of their satellite operators (averaging kernel (AK) and a priori) can cause an artificially high bias in the upper troposphere, as the nadir-view measurements have coarse vertical resolution and their AK can smear the high O 3 abundances in the stratosphere into the troposphere.
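The smearing effect of a satellite operator can be illustrated with the generic linear form x_hat = x_a + A (x - x_a), where x is the model profile, x_a the a priori, and A the averaging kernel. This is a sketch under stated assumptions: the three-level profile, the a priori, and the kernel values are invented for illustration, and an operational operator may be applied in a different state space (for example, the logarithm of the mixing ratio) rather than in linear ppb as done here.

```python
def apply_operator(profile, a_priori, ak):
    """Return x_a + A (x - x_a) for equal-length lists of layer values.

    `ak` is a row-major averaging-kernel matrix; row i gives the
    sensitivity of retrieved level i to the true state at every level.
    """
    delta = [x - xa for x, xa in zip(profile, a_priori)]
    return [
        xa + sum(a * d for a, d in zip(row, delta))
        for xa, row in zip(a_priori, ak)
    ]


# Hypothetical levels: lower troposphere, upper troposphere, lower stratosphere.
model_ppb = [50.0, 60.0, 600.0]
a_priori_ppb = [50.0, 80.0, 400.0]
broad_ak = [
    [0.6, 0.2, 0.0],
    [0.2, 0.4, 0.2],
    [0.0, 0.2, 0.6],
]
smoothed_ppb = apply_operator(model_ppb, a_priori_ppb, broad_ak)
```

In this toy case the upper-tropospheric model value of 60 ppb is pulled above 100 ppb after smoothing, because the broad kernel row mixes in the large stratospheric excess over the a priori.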
Aura without a fully functioning HIRDLS is not well designed for studying STE. The L2 swath data from Aura are chosen to study the STE flux of ozone based upon the short-lived features of most STE processes and previous case studies (Manney et al., 2009; Pan et al., 2009; Manney et al., 2011). The high-resolution simulation with the UCI CTM depicts the full ozone picture for the years 2005-2006 to compare with the individual, mostly non-coincident Aura ozone measurements derived from different remote sensing techniques. The model's ability to reproduce STE-related processes, such as tropopause folds (TFs), is confirmed by the comparisons with the WOUDC sondes, giving confidence in the reliability and accuracy of the folding and intrusion structures simulated along Aura swaths.
From the eight case studies here, the four Aura instruments demonstrate some skill in catching the STE structures, either from the high TCO anomalies (for OMI) or from the O 3 vertical profiles (for HIRDLS, MLS, and TES). Nevertheless, many of the features simulated by the model are not seen in the L2 data. Tropopause folds and stratospheric intrusions of O 3 present a fundamental difficulty for satellite passive remote sensing due to the large abundances and columns of stratospheric ozone above the troposphere. Beyond this work, Aura datasets have been studied for only a few STE cases, such as Pan et al. (2009) and Manney et al. (2011). Improvements in the instruments and sensing techniques that greatly reduce the apparent noise in individual retrievals will be necessary if satellite observations are to be used to map out folds and intrusions on a regular basis and thus provide better constraints for STE modeling.
We use the CTM as an intercomparison platform to investigate the consistency of different Aura ozone measurements that are close but not coincident in space, time, or averaging kernel. The CTM deficiencies can be readily identified when the biases are similar against all Aura observations. The 2-D PDFs, as well as the mean biases and RMS of exactly matched CTM-Aura data, identify the model's high biases in the lower stratosphere. On the other hand, the CTM as a transfer standard can be used to identify clearly the relative biases in the Aura ozone instruments on an instantaneous basis, including the meteorology at the time of observation, even when they do not have overlapping measurements. For example, the case study for July 2005 (Table 2) quantifies the different model-measurement biases for HIRDLS, MLS, and TES in the UT/LS region, thus identifying both consistencies and inconsistencies across these Aura datasets.
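The transfer-standard idea can be written down in a few lines. The sketch below assumes that the model is sampled at each instrument's own measurement locations and times, and that any model error is the same at both sets of sampling points so that it cancels in the difference; the function name `relative_bias` and the sign convention (instrument minus CTM) are choices made here for illustration, not taken from the paper.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def relative_bias(obs_a, ctm_at_a, obs_b, ctm_at_b):
    """Estimate the bias of instrument A relative to instrument B.

    Each instrument is compared with the model sampled at its own
    locations and times, so A and B need no coincident measurements.
    A common model error cancels only if it is the same for both
    sampling patterns, which is an assumption of this sketch.
    """
    bias_a = mean([o - c for o, c in zip(obs_a, ctm_at_a)])
    bias_b = mean([o - c for o, c in zip(obs_b, ctm_at_b)])
    return bias_a - bias_b
```

A positive result means instrument A reads higher than instrument B once the atmospheric differences between their sampling points, as represented by the model, have been removed.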