Technical Note: Ozonesonde climatology between 1995 and 2011: description, evaluation and applications

An ozone climatology based on ozonesonde measurements taken over the last 17 yr has been constructed for model evaluation and comparisons to other observations. Vertical ozone profiles for 42 stations around the globe have been compiled for the period 1995-2011, in pressure and tropopause-referenced altitudes. For each profile, the mean, standard deviation, median, and half-width are provided, as well as information about interannual variability. Regional aggregates are formed by combining stations with similar ozone characteristics. The Hellinger distance is introduced as a new diagnostic to identify stations that describe similar shapes of ozone probability distribution functions (PDFs). In this way, 12 regions were selected, each covering at least 2 stations, and the variability among those stations is discussed. Significant longitudinal variability of ozone distributions in the troposphere and lower stratosphere is found in the northern mid- and high latitudes. The representativeness of regional aggregates is discussed for high northern latitudes, Western Europe, Eastern US, and Japan, using independent observations from surface stations and MOZAIC aircraft data. Good agreement exists between ozonesondes and aircraft observations in the mid-troposphere and between ozonesondes and surface observations for Western Europe. For Eastern US and high northern latitudes, surface ozone values from ozonesondes are biased 10 ppb high compared to independent measurements. An application of the climatology is presented using the NCAR CAM-Chem model. The climatology allows evaluation of the model performance regarding ozone averages, seasonality, interannual variability, and the shape of ozone distributions. The new assessment of the key features of ozone distributions gives deeper insights into the performance of models.


Introduction
Ozone is one of the most important trace gases in the atmosphere. In the last few decades, tropospheric and stratospheric ozone have been strongly influenced by anthropogenic activities (WMO, 2010). In the troposphere, ozone is photochemically produced by the oxidation of volatile organic compounds (VOCs) and carbon monoxide (CO) in the presence of nitrogen oxides (NOx). Changes in fossil fuel consumption, industrial processes (e.g., Staehelin et al., 1994; Lelieveld and Dentener, 2000; Lamarque et al., 2005; Vestreng et al., 2009; Monks et al., 2009), and biomass burning (e.g., Oltmans et al., 2010) therefore have a large impact on ozone. Tropospheric ozone is further impacted by long-term changes in stratospheric ozone and by short-term stratospheric anomalies of ozone, including those due to volcanic eruptions (e.g., WMO, 2010, Chapter 4). In addition, tropospheric ozone is affected by interannual variations in sporadically occurring pollution events, such as forest fires, and in meteorological patterns, including the El Niño Southern Oscillation, Rossby waves and gravity waves in the Tropics (e.g., Randel and Thompson, 2011; Thompson et al., 2011b,a), and the North Atlantic Oscillation in midlatitudes. The interannual variability in stratospheric ozone has also been shown to impact the interannual variability in tropospheric ozone (e.g., Tarasick et al., 2005; Ordonez et al., 2007; Terao et al., 2009; Hess and Zbinden, 2011).
Published by Copernicus Publications on behalf of the European Geosciences Union.
Long-term changes in both dynamics and chemistry are responsible for ozone trends. However, the large interannual variability and the uncertainty of ozone measurements add large uncertainties to trends. Various studies have investigated the trend of ozone in the troposphere, including critical discussions of measurement uncertainties (e.g., Logan et al., 1999; Tarasick et al., 2005; Parrish et al., 2009; Oltmans et al., 2006; Logan et al., 2012). Upper Troposphere Lower Stratosphere (UTLS) trends have been studied by Thouret et al. (2006), Kivi et al. (2007), and Schnadt Poberaj et al. (2009). Ozone over Europe (Logan et al., 2012) and over the eastern US (Chan and Vet, 2010) has mostly decreased in the last 10-15 yr, whereas increasing values were observed at the US West Coast (Parrish et al., 2009) and over Canada (Tarasick et al., 2005; Chan and Vet, 2010).
The reproduction of ozone trends and variability in the troposphere and lowermost stratosphere is challenging. Free-running chemistry-climate models are not expected to precisely simulate the observed interannual variability. However, the performance of the models can be evaluated by comparing simulated ozone to the observed base state of present-day conditions (e.g., Bey et al., 2001; Lamarque et al., 2005; Fiore et al., 2008; Eyring et al., 2010). This requires a precise description of the geographical and vertical distribution of ozone and its seasonality for present-day conditions. Information about the interannual variability is also needed to identify and quantify shortcomings of processes in models. Besides seasonal averages, differences in the shape of ozone Probability Distribution Functions (PDFs) between models and observations are useful in identifying differences in non-Gaussian distributions, which occur frequently in the UTLS (Logan et al., 1999; Pan et al., 2010; Tilmes et al., 2010) and in the boundary layer, where ozone values can show a large spread.
Logan et al. (1999) established an ozone climatology based on ozonesonde observations available until 1995. In her comprehensive studies, Logan (1999a,b) summarized the key features of ozone distributions. Sondes and surface data were used to provide a 3-D climatology of averaged ozone profiles on a pressure and tropopause-referenced geometric altitude grid. This climatology was further updated to 1985-2000 with the inclusion of SHADOZ stations (Considine et al., 2008). McPeters et al. (2007) and McPeters and Labow (2012) provided a zonal mean climatology with a focus on the stratosphere using sonde and satellite data, with the purpose of providing prior information for satellite retrievals.
Compiled long-term monthly mean ozone profiles (Lamarque et al., 2005; Considine et al., 2008) and zonal mean estimates (Stevenson et al., 2006) have been used in model evaluation studies for the troposphere and the UTLS. Further, regional climatologies have been developed for the tropics (Thompson et al., 2003a, 2011a,b; Randel and Thompson, 2011), for North America (Newchurch et al., 2003) and for the SH Subtropics (Clain et al., 2009).
The distribution and seasonality of ozone as provided in the earlier studies are still valid in many ways. However, the extension of the ozonesonde network between 1995 and 2011, the increase in the number of observations (Table 1), and the trends in ozone concentrations observed in some regions, call for an updated description of present-day ozone. Therefore, the first goal of this study is to generate an ozone climatology using ozonesonde measurements taken between 1995 and 2011 from 42 stations around the globe. This climatology allows a station-by-station evaluation to capture the large spatial variability in ozone in both troposphere and UTLS, including ozone gradients across the tropopause (e.g., Considine et al., 2008). To our knowledge, an ozone climatology covering this period is not currently publicly available.
The second goal of this study is to identify stations that show similar ozone characteristics with regard to their seasonal median and the shape of their ozone PDFs, and to group them into regions. A new diagnostic is introduced that provides a measure of the similarity of two ozone distributions by employing the Hellinger distance (Nikulin, 2001), defined in Appendix A. The Hellinger distance is applicable to comparisons of distributions of various shapes, including non-Gaussian distributions. The third objective of this paper is to demonstrate an application of the new ozone climatology to two model simulations. In addition to comparing averaged profiles, we evaluate the model with regard to the shape of its ozone PDFs.
Detailed information about the ozone climatology is described in Sect. 2. Aircraft and surface observations used in this study are summarized in Sect. 3. In Sect. 4, we discuss key characteristics of the ozone distribution in defined regions and quantify the similarity of ozone distributions in both the troposphere and UTLS. The representativeness of ozone distributions using independent observations is discussed for some regions where sufficient measurements are available. In Sect. 5, application of the new ozone climatology in comparison to a model simulation is presented.
We consider 42 stations (shown in Fig. 1 and listed in Table 1) that have a sufficiently complete record of continuous sampling between 1995 and 2011 with at least 12 profiles per season for at least 5 continuous years. Some areas over the globe show a large sampling density, like Western Europe, whereas for other regions the sampling is sparse, such as the Southern Hemisphere (SH), Asia and Africa.
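The selection criterion can be sketched as a small check on per-season profile counts. This is only an illustration under one reading of the criterion (a run of at least 5 consecutive years in which each of the four seasons has at least 12 profiles); the function name and the input layout are hypothetical and not taken from the study.

```python
SEASONS = ("DJF", "MAM", "JJA", "SON")

def meets_sampling_criterion(counts, min_profiles=12, min_years=5):
    """counts: dict mapping (year, season) -> number of profiles (assumed layout).

    Returns True if there is a run of at least `min_years` consecutive years
    in which every season has at least `min_profiles` profiles.
    """
    years = {year for year, _ in counts}
    good_years = sorted(
        y for y in years
        if all(counts.get((y, s), 0) >= min_profiles for s in SEASONS)
    )
    best = run = 0
    prev = None
    for y in good_years:
        # extend the run only if this year directly follows the previous good year
        run = run + 1 if prev is not None and y == prev + 1 else 1
        best = max(best, run)
        prev = y
    return best >= min_years
```

A station with a single under-sampled season in the middle of its record would fail this check unless a separate five-year run exists elsewhere in the record.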
For most of the stations, the Electrochemical Concentration Cell (ECC) ozonesonde was used. For Hohenpeissenberg and Payerne, Brewer-Mast ozonesondes were used, as well as for Uccle before 1997. All Japanese stations (Kagoshima, Sapporo, Tateno, Naha, and Syowa) use the Carbon-Iodine Japanese Sonde (KC79/JMA). Details about the performance of different ozonesonde types are given in earlier studies (Logan et al., 1999; Smit et al., 2007; Thompson et al., 2007; Schnadt Poberaj et al., 2009). Different measurement techniques were evaluated against a UV photometer, and precision, bias and accuracy were determined in the Jülich Ozone Sonde Intercomparison Experiment (JOSIE) (Smit et al., 1998; Smit et al., 2007). For all three techniques, the precision is better than 5 % and the accuracy is within ±(5-13 %). An updated summary of the performance of ozonesondes over Europe is also given in Logan et al. (2012).
Between 1995 and 2011, the number of available ozone soundings has greatly increased and a larger number of stations with records of several years has become available over the globe (Table 1). For most stations the entire period is covered about equally with at least 12 ozone soundings per season and year, as further discussed in the Supplement (Fig. S1). Besides instrumental uncertainties, a sampling of 4 to 12 profiles per month each year, as is the case for most ozonesonde stations, can introduce uncertainties of 8-14 % in the seasonal mean (Saunois et al., 2012). A larger uncertainty can be expected for earlier periods where the sampling frequency was often less than 4 profiles per season and year.
Ozone profiles provided by the data centers are often scaled to ground-based ozone column measurements; the ozone column is strongly influenced by its stratospheric portion. If the scaling factor, called "correction factor", is outside the range of 0.8-1.2, ozone profiles in the troposphere might be biased. Here, we consider all profiles as provided by the data centers, without any additional filtering with regard to correction factors. Ignoring profiles that are corrected by factors outside the range of 0.8-1.2 has only a small impact on the averaged profile between 1995 and 2011 (as demonstrated in the Supplement, Fig. S2). We applied a column ozone filter to all ozone profiles to reject single profiles with column ozone values of more than 700 DU or of less than 50 DU.
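As an illustration of the two checks described above, the following minimal sketch applies the column ozone filter and tests whether a correction factor lies in the 0.8-1.2 range. The dictionary key `column_du` and both function names are hypothetical; only the numerical limits come from the text.

```python
def filter_profiles(profiles, min_du=50.0, max_du=700.0):
    """Keep profiles whose column ozone (in DU) lies within [min_du, max_du].

    `profiles` is assumed to be a list of dicts with a "column_du" entry.
    """
    return [p for p in profiles if min_du <= p["column_du"] <= max_du]

def correction_factor_in_range(cf, low=0.8, high=1.2):
    """True if the correction factor lies within the range discussed in the text.

    Note that the climatology itself does not filter on this factor.
    """
    return low <= cf <= high
```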
Profiles from ozonesondes exhibit excellent vertical resolution throughout the flight, from the surface up to about 10 hPa. Ozone data within a layer around each of 26 predefined pressure levels were then linearly averaged. Between 1000 hPa and 100 hPa, layers of 25 hPa thickness were used. Between 100 hPa and 10 hPa a layer thickness of 2.5 hPa and above 10 hPa a layer thickness of 0.25 hPa was chosen. Each high-resolution ozonesonde profile is further converted from pressure to geometric height using the hydrostatic equation, if not provided in the dataset, as is the case for the NOAA and SHADOZ data.
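The two processing steps above (layer averaging and the pressure-to-height conversion) can be sketched as follows. This is a simplified illustration, not the processing code used for the climatology: it assumes that the quoted layer thickness is the full thickness centred on the pressure level, and it uses the hypsometric form of the hydrostatic equation for dry air, neglecting humidity. All function names are hypothetical.

```python
import math

R_D = 287.05   # specific gas constant of dry air, J kg^-1 K^-1
G0 = 9.80665   # standard gravity, m s^-2

def layer_mean(pressures_hpa, ozone, level_hpa, thickness_hpa):
    """Arithmetic mean of ozone values inside a layer centred on level_hpa."""
    lo = level_hpa - thickness_hpa / 2.0
    hi = level_hpa + thickness_hpa / 2.0
    vals = [o for p, o in zip(pressures_hpa, ozone) if lo <= p <= hi]
    return sum(vals) / len(vals) if vals else None

def pressure_to_height(pressures_hpa, temps_k, z0_m=0.0):
    """Integrate the hypsometric equation upward from the first level.

    Pressures must decrease along the profile; temperatures are in kelvin.
    """
    heights = [z0_m]
    for i in range(1, len(pressures_hpa)):
        t_mean = 0.5 * (temps_k[i - 1] + temps_k[i])
        dz = R_D * t_mean / G0 * math.log(pressures_hpa[i - 1] / pressures_hpa[i])
        heights.append(heights[-1] + dz)
    return heights
```

For an isothermal layer the height increment reduces to the scale height times the logarithm of the pressure ratio, which gives a quick sanity check on the conversion.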
To derive the thermal tropopause (TP) based on the temperature lapse rate definition (World Meteorological Organization, 1957), we use temperature profiles that were interpolated to a 250 m vertical grid. We do not use the original temperature profiles, since small-scale variability can introduce uncertainties in the TP estimation (Homeyer et al., 2010). The TP height is therefore defined with an accuracy of 125 m, which is better than what is derived from most models and meteorological analyses. If a secondary TP is identified, we use the lowermost TP. Tropopause-referenced ozone profiles are derived by averaging available ozone observations onto a 500 m vertical grid around the TP. For this calculation, we reject those profiles where no TP could be calculated.
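A minimal sketch of the lapse-rate tropopause search on a regular height grid is given below. It follows the commonly quoted WMO criterion (the lowest level where the lapse rate drops to 2 K/km or less, provided the mean lapse rate between that level and all higher levels within 2 km does not exceed 2 K/km). The lower search limit `z_min_m` is an assumption added here to avoid near-surface inversions and is not taken from the text; the function name is hypothetical.

```python
def wmo_tropopause(z_m, t_k, lapse_max=2.0, depth_m=2000.0, z_min_m=5000.0):
    """Return the height (m) of the lowest lapse-rate tropopause, or None.

    z_m: increasing heights in metres (e.g. a 250 m grid); t_k: temperatures in K.
    """
    n = len(z_m)
    for i in range(n - 1):
        if z_m[i] < z_min_m:
            continue
        # local lapse rate in K/km between level i and the next level
        gamma = -(t_k[i + 1] - t_k[i]) / (z_m[i + 1] - z_m[i]) * 1000.0
        if gamma > lapse_max:
            continue
        ok, checked = True, False
        for j in range(i + 1, n):
            dz = z_m[j] - z_m[i]
            if dz > depth_m:
                break
            checked = True
            # mean lapse rate between level i and each higher level within depth_m
            if -(t_k[j] - t_k[i]) / dz * 1000.0 > lapse_max:
                ok = False
                break
        if ok and checked:
            return z_m[i]  # first hit is the lowermost tropopause
    return None
```

Because the search proceeds upward and returns at the first match, a secondary tropopause higher up is ignored, and a return value of None marks a profile that would be rejected for the tropopause-referenced averages.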
We provide monthly averaged ozone profiles, in both mass mixing ratios and volume mixing ratios, on a pressure and tropopause-referenced altitude grid for the periods 1980-1994 and 1995-2011 for each station at http://acd.ucar.edu/~tilmes/ozone.html. The structure of the provided data files is very similar to the climatology by Logan et al. (1999). However, we add additional pressure intervals and provide a higher vertical resolution of tropopause-referenced ozone profiles. In addition to the monthly mean, the standard deviation, and the number of profiles entering the average, as in Logan et al. (1999), we also provide information about the median, the half-width of the distribution (calculated as (75th percentile − 25th percentile)/2), and the interannual variability, defined here as the range of the 5th and 95th percentile of the annual median ozone value.
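The half-width and the interannual variability defined above can be computed as in the sketch below. The percentile routine uses linear interpolation between order statistics, which is one common convention and an assumption here; the text does not state which convention was used. Function names and the input layout are hypothetical.

```python
import statistics

def percentile(values, q):
    """Percentile with linear interpolation between order statistics, q in [0, 100]."""
    xs = sorted(values)
    if len(xs) == 1:
        return xs[0]
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)

def half_width(values):
    """(75th percentile - 25th percentile) / 2, as defined in the text."""
    return (percentile(values, 75) - percentile(values, 25)) / 2.0

def interannual_range(values_by_year):
    """5th and 95th percentile of the annual median values.

    values_by_year: dict mapping year -> list of ozone values (assumed layout).
    Returns the pair (p5, p95) rather than a single number.
    """
    annual_medians = [statistics.median(v) for v in values_by_year.values()]
    return percentile(annual_medians, 5), percentile(annual_medians, 95)
```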
A comparison between the climatology for the period 1980-1994 and the climatology derived by Logan et al. (1999) for a similar period shows an agreement within ±5 % between the two data sets for most of the stations (see Supplement, Fig. S3). A comparison between the 1980-1994 and 1995-2011 ozone profiles for each station is given in the Supplement (Fig. S4).
As for single stations, seasonal averages on a pressure and tropopause-referenced altitude grid, standard deviation, half-width, interannual variability and number of profiles are derived from aggregates of ozone profiles located within defined regions (see Sect. 4). Single profiles from all selected stations in one region are aggregated with equal weight before any statistics are applied. For regional aggregates, we only consider the period between 1995 and 2011 due to the limited data volume for some stations in earlier years. Further, to compare ozone PDFs directly to model results or other data, we provide a dataset for each season and region including long-term aggregates of observations between 1995 and 2011 for each of the defined pressure intervals. Median ozone mixing ratios and ozone distributions from different stations in selected regimes are similar (see Sect. 4), aside from the flagged cases listed in Table 2. To achieve the most reasonable comparison with model simulations, we recommend comparing regional aggregates, especially for those regimes that are characterized by a large variability between the stations included or for those regions with a very low sampling coverage, as further discussed in Sect. 4.

Surface and aircraft observations
We compare surface ozone measurements with ozonesonde data only for those regions where a large density of surface data is available, as is the case for high northern latitudes, Western Europe and Eastern US. Hourly surface observations from two different networks are used: the Clean Air Status and Trends Network (CASTNET) for the US (http://java.epa.gov/castnet/), and the European Monitoring and Evaluation Programme (EMEP) network in Europe (http://www.emep.int/). Surface measurements from EMEP are available for over 60 stations in Western Europe in the area around Germany and up to 17 stations for the eastern NH Polar region. Data from a large number of surface stations (about 120 stations) are also available for the US from the CASTNET network, with up to 26 stations in southeast US.
Surface stations located at higher elevations have been shown to sample airmasses from lower pressure levels than lower stations (e.g., Fiore et al., 2008). To consider similar airmasses, ozone measurements are compared for similar altitude intervals. Surface observations at elevations of 0-500 m, 500-1500 m and > 1500 m (above sea level) are compared to sonde data averaged over the pressure intervals 1000 hPa ± 50 hPa, 900 hPa ± 50 hPa and 800 hPa ± 50 hPa, respectively. For Western Europe, we only select surface ozone observations taken between 11:00 a.m. and 02:00 p.m., consistent with the time when ozone soundings are usually taken in this region. In this way, the influence of the daily cycle on surface ozone is taken into account (Logan, 1985; Beck and Grennfelt, 1994). During the day, ozone can vary by up to 15 ppb at the surface in summer, but changes are much smaller (below 5 ppb) at 800 hPa and above for Western Europe (Supplement, Fig. S4).
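The matching of surface stations to sonde pressure intervals, and the midday selection used for Western Europe, can be written as a small lookup. The treatment of the boundaries (500 m and 1500 m assigned to the lower class, the time window taken as 11 <= hour < 14 local time) is an assumption of this sketch, and the function names are hypothetical.

```python
def sonde_interval_for_elevation(elevation_m):
    """Return the (low, high) sonde pressure interval in hPa for a station elevation.

    0-500 m -> 1000 +/- 50 hPa, 500-1500 m -> 900 +/- 50 hPa, >1500 m -> 800 +/- 50 hPa.
    """
    if elevation_m < 0:
        raise ValueError("elevation below sea level is not covered by the three classes")
    if elevation_m <= 500:
        centre = 1000
    elif elevation_m <= 1500:
        centre = 900
    else:
        centre = 800
    return centre - 50, centre + 50

def in_midday_window(hour_local):
    """True for hourly observations between 11:00 a.m. and 02:00 p.m. (assumed half-open)."""
    return 11 <= hour_local < 14
```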
We further use independent data taken during daily passenger aircraft flights from the MOZAIC program (http://mozaic.aero.obs-mip.fr/web) (Marenco et al., 1998; Thouret et al., 1998b). We consider data over those airports where on average two ozone profiles per day are available between 1995 and 2008 (about 75 profiles per month). Ozone measured from MOZAIC aircraft shows an increasing trend between 1995 and 1998 over Europe, which is not in agreement with corresponding ozonesonde measurements (Logan et al., 2012). Therefore, we only compare to aircraft data taken between 1998 and 2009 for Western Europe (Frankfurt). A large number of vertical profiles from MOZAIC between 1995 and 2008 are also available for three locations in the eastern part of the US (Washington, New York, and Boston) and for two airports in Japan (Narita and Osaka).
Atmos. Chem. Phys., 12, 7475-7497, 2012, www.atmos-chem-phys.net/12/7475/2012/

Ozone distributions in selected regions
The use of ozone profiles from single ozone stations has the disadvantage of a low sampling frequency of typically 4 to 12 profiles per month. This can result in uncertainties in the seasonal mean, as mentioned above (Saunois et al., 2012). To reduce this uncertainty, a higher sampling frequency can be achieved by combining ozonesonde observations from multiple stations that are located close enough to each other to sample similar air masses. A combination of ozone profiles into fewer regimes allows a statistically more robust evaluation of large-scale processes. Additionally, regional aggregates are more representative of the larger regions resolved by models with a coarse horizontal resolution. These models are not expected to capture small-scale variations in the ozone field (e.g., Emmons et al., 2010).
Ozone distributions in the atmosphere are highly variable due to the influence of processes varying with season and altitude. The objective here is to define regions that include ozone observations over stations with similar characteristics. We focus on seasonal averages of ozone in the troposphere and UTLS. For each of the identified regions, as outlined in detail below, we derive median ozone profiles from regionally aggregated ozone distributions for four seasons and 26 pressure levels (Figs. 2 and 12; see also Sect. 2 for more details). Information about the half-width of the regionally aggregated ozone distribution and the interannual variability of seasonally averaged ozone profiles, as defined in Sect. 2, is shown in the Supplement, Figs. S6 and S7. The half-width and the interannual variability of ozone profiles are similar in the troposphere and reach about 10 % of the median values. A larger half-width and interannual variability are observed below 900 hPa, and for regions that are most likely strongly influenced by sporadically occurring pollution events, like Japan and NH Subtropics in summer, and SH mid-latitudes in austral spring.
In addition, we derive regionally aggregated median profiles projected relative to the thermal tropopause (Figs. 3 and 12). The number of profiles entering each regional aggregate is illustrated in Fig. 4. In none of the regions does a single station dominate for the majority of years between 1995 and 2011. Therefore, all stations in each region, as listed in Table 2, are weighted equally.
For each season, region, and altitude level, we compute the spread of median values from all the stations. Further, the Hellinger distance (H-value) between single stations and the regional aggregate is derived. The Hellinger distance is a statistical measure of the similarity of two distributions, taking values between zero and one: the H-value is one if two distributions are completely different, and zero if they are identical (see Appendix A). These measures are used to quantify the spatial variability in ozone in the selected regimes. A median spread of less than 10 ppb in the troposphere and 15 % in the UTLS is defined to correspond to stations with similar ozone characteristics. These values cover uncertainties of ozone measurements, including the impact of low sampling frequency. A regionally averaged H-value of less than 0.2 is estimated to describe a similar ozone distribution. Those cases where the median spread or the H-value is larger than the threshold values are listed in Table 2 and should be considered with care, for example when comparing to regional averages from model results (as outlined below). The combination of stations might not be suitable for higher altitudes than considered here, due to differences in processes influencing ozone. For some regions, ozone distributions are further compared with independent observations from surface ozone measurements and routine MOZAIC passenger aircraft measurements (described in Sect. 3), to discuss the representativeness of regional averages.
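To make the two diagnostics concrete, the sketch below computes a Hellinger distance between two binned distributions and the spread of station medians. Appendix A is not reproduced here, so the code uses the standard discrete form H = sqrt(1 − Σ sqrt(p_i q_i)), which is bounded by zero and one; the binning, the function names, and the choice to take the spread as maximum minus minimum are assumptions of this illustration.

```python
import math

def histogram(values, edges):
    """Normalized histogram over half-open bins [edges[k], edges[k+1]).

    Values outside the bin edges are ignored.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        for k in range(len(counts)):
            if edges[k] <= v < edges[k + 1]:
                counts[k] += 1
                break
    total = sum(counts)
    if total == 0:
        raise ValueError("no values fall inside the bin edges")
    return [c / total for c in counts]

def hellinger(p, q):
    """Hellinger distance between two discrete distributions on the same bins."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))  # Bhattacharyya coefficient
    return math.sqrt(max(0.0, 1.0 - bc))

def median_spread(station_medians):
    """Spread of station medians, taken here as maximum minus minimum."""
    return max(station_medians) - min(station_medians)
```

With these helpers, a station could be compared against its regional aggregate by binning both samples on the same edges and checking the H-value against 0.2 and the median spread against 10 ppb in the troposphere.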

High northern latitudes
For the high northern latitudes, we identify two regimes based on the PDFs of ozone, NH polar West and NH polar East (Fig. 1, yellow and brown symbols). At 1000 hPa, ozone distributions from the western sector (Eureka, Alert, and Resolute) cover much smaller mixing ratios than those obtained from stations in the East (Fig. 5, left top). Considering all ozone distributions in the NH high latitudes, the H-value between a single station and the regional aggregate is especially large in spring; see Fig. 5 (top right, green symbols). The three stations in the western sector are located further north, and ozone near the surface is likely influenced by halogen-induced ozone depletion events in spring and summer (as described in Tarasick and Bottenheim, 2002). The consideration of two regimes in high northern latitudes, rather than a zonal mean, is also supported by observed differences in the upper troposphere. Long-range transport of pollution from low and midlatitudes into high northern latitudes has a varying longitudinal influence (Koch and Hansen, 2005; Stohl, 2006; Shindell et al., 2008; Tilmes et al., 2011), resulting in differences in the tropospheric ozone profiles as shown in Fig. 2. A separation of the polar latitudes into eastern and western sectors reduces the variability in ozone within each region, as indicated by the reduced H-value and the reduced spread of median ozone values (Fig. 5, bottom panels). A slightly larger variability between stations in the western sector is found in summer compared to other seasons, but H-values are still below 0.15, indicating little variability in the shape of ozone PDFs between individual stations.
The comparison of PDFs between surface ozone observations and ozonesonde measurements over the eastern sector (not shown) shows good agreement for spring and summer. However, in winter and fall, ozone mixing ratios from surface observations are on average 10 ppb smaller. This is because some relatively low ozone values were observed at surface stations over Scandinavia, a region that is not covered by ozonesonde stations. Therefore, the surface ozone distributions of the three stations in winter and fall do not cover the variability in ozone over the entire region.
The grouping of ozone profiles using the three stations in Canada (Fig. 1, cyan symbols) results in averaged H-values below 0.17 and a median spread of less than 10 ppb in the troposphere. Ozone distributions and their seasonality are similar to those in the high latitude western sector; however, spring minimum values are not reached (Fig. 2). Further, fall and winter ozone mixing ratios at the surface are smaller compared to ozone in high latitudes, caused by smaller ozone mixing ratios in the boundary layer over Edmonton compared to the other two stations. A decrease of near-surface ozone was observed in recent years at the station in Edmonton, likely as a result of increased urbanization close to the ozonesonde station (Tarasick et al., 2005). In addition, stronger pollution might result in enhanced NOx titration, especially in winter. For the troposphere and the UTLS, the shape and seasonality of ozone profiles in high northern latitudes are very similar and in agreement with earlier studies (Logan, 1999a).

Mid-northern latitudes
For the northern mid-latitudes, we separate available ozone stations into three regions: Western Europe, Eastern US, and Japan. Western Europe covers stations between 45 and 53° N. Madrid is not included because of its location farther south (at 40° N). The slightly elevated TP in summer over Madrid (as shown in Fig. 11) results in different ozone characteristics in the UTLS compared to other stations in Western Europe. For the remaining stations in Western Europe, ozone PDFs are for the most part similar in shape (see Fig. 6). Some variability exists below 800 hPa, as discussed in comparison to surface observations. Ozone profiles are characterized by relatively low ozone mixing ratios at 1000 hPa in winter and spring (see Fig. 2), which are similar to those in Canada and smaller than those in high northern latitudes. The variability in ozone in the lowermost stratosphere (LMS) is discussed below.
In Fig. 7, ozone PDFs of regional aggregates from ozonesonde measurements around 1000 hPa are compared to PDFs from surface stations covering a similar region and altitude (see Fig. 8). The shape of regionally aggregated surface ozone PDFs over Western Europe (Fig. 7, left panels, red thick line) is very similar to PDFs from ozonesonde observations at 1000 hPa, and also at 900 and 800 hPa (not shown). Surface ozone observations show a larger variability than measurements taken by ozonesondes. The median difference between surface observations and regional aggregates from ozonesonde data varies between −25 and +25 % at 1000 hPa, and less for higher elevations, as illustrated in Fig. 8. MOZAIC aircraft observations and profiles from ozone soundings after 1998 agree well in the free troposphere (Fig. 9, left panels) for summer at 500 hPa and 800 hPa. A good agreement of monthly-averaged timeseries is also shown in Logan et al. (2012) and in Fig. S4. Ozone distributions from ozonesonde observations are therefore representative for Western Europe in comparison to aircraft and surface observations for the entire troposphere.
Four ozonesonde stations are located in the continental US: two in the east, one in the middle (Boulder, which is located above the 900 hPa level), and one in the west. Considering ozone values over all four stations in the troposphere, the H-value exceeds 0.2 (Fig. 6, middle, bottom panel), suggesting a larger variability in ozone among the stations, as discussed in detail in Newchurch et al. (2003). Considering median mixing ratios and ozone PDFs for the troposphere, stations in the east of the US are more similar to each other than to Boulder and Trinidad Head, with smaller ozone mixing ratios in the troposphere over Boulder and in the west.
The two eastern stations are located further south (between 35 and 38° N) compared to the other two stations and are more frequently influenced by stratospheric intrusions of ozone-rich air (Newchurch et al., 2003). Compared to other NH mid-latitude regions, ozone is on average 10 ppb higher in the lower troposphere, as a result of local pollution, with the largest values in summer over Huntsville. On the other hand, median ozone mixing ratios at 500 hPa are similar to those over Europe (Fig. 2), in agreement with what was found in Logan (1999a) and Newchurch et al. (2003). However, differences exist in the PDF of ozone compared to Japan, as further outlined below.
In comparison to surface ozone measurements, ozonesonde observations are biased high for Eastern US, as shown in Fig. 7 (right panel), with the most significant differences in summer at 1000 hPa. The largest differences also occur in summer at 900 hPa and 800 hPa when comparing observations from surface stations in the entire US to the median of the regional aggregate of ozonesonde data. Maximum ozone mixing ratios as observed by ozonesondes are not covered by most of the more rural surface stations, which reflects the influence of urban activities in the vicinity of the ozonesonde stations, especially in summer. Ozonesonde measurements at 800 hPa are further biased high compared to MOZAIC observations over Eastern US (Fig. 9, middle panels), as is also the case for earlier periods (Thouret et al., 1998a). Ozone from ozonesonde measurements over the US is further slightly biased high at 500 hPa.
Fig. 7. Probability distribution functions (PDFs) of surface ozone from surface ozone stations located below 500 m altitude (thin colored lines) for Western Europe (first column) and Eastern US (second column), and for DJF (top) and JJA (bottom). Further, regional aggregates of the PDFs from these surface observations are shown in each panel (red thick lines), as well as regional aggregates derived from ozonesonde observations at 1000 ± 50 hPa (black thick lines). The location of selected stations is shown in Fig. 8.
Ozonesonde distributions for the four available stations in Japan show large differences in the median values at 1000 hPa of −8 to 15 ppb (Fig. 6, bottom right). The largest spread occurs in summer and fall. At 500 hPa, all stations except Sapporo show a wide distribution ranging between 15 and 120 ppb. Those stations are influenced by tropical air masses transporting relatively low ozone values to the region (Logan, 1985), as well as by pollution from large cities nearby.
In contrast, Sapporo is characterized by a more compact distribution in the troposphere, with relatively low ozone at the surface (not shown) and higher values in the free troposphere. In the UTLS, Kagoshima and Tateno describe the most similar PDFs of ozone over Japan, as discussed in more detail below. Therefore, for Japan, ozone profiles over Kagoshima and Tateno are combined into one region, covering latitudes 32-36° N, similar to Eastern US. The median difference between the two stations in Japan is below 3 ppb throughout the troposphere. Ozone PDFs are similar in shape, as indicated by an H-value below 0.2 (Fig. 2). Near-surface ozone over Kagoshima in summer is more strongly influenced by tropical airmasses than over Tateno. This results in smaller median ozone values in summer than in spring in Japan below 700 hPa, in contrast to the other NH mid-latitude regions. The comparison of ozone PDFs with MOZAIC aircraft observations shows good agreement (Fig. 9, right panel). A slight difference in the shape of the ozone PDFs between aircraft and ozonesonde measurements is apparent at 800 hPa. The airmasses over the airports might be influenced by a slightly different fraction of polluted and clean airmasses compared to those over ozonesonde stations.
In the LMS, differences in the seasonality of ozone for the three regions in mid-latitudes become obvious (Fig. 3). For all regions in winter, the median value of ozone distributions varies strongly among the different stations in one region, as indicated by a median spread of up to 20 % for Western Europe, up to 30 % for Eastern US, and over 40 % for Japan in the LMS, even if tropopause-referenced altitudes are considered. On the other hand, the H-values do not exceed 0.15. The differences in PDFs for the three regions are illustrated in Fig. 10.
The main difference among the three regions is a different seasonality of TP heights for the considered profiles, as shown in Fig. 11 in geometric altitude. For Western Europe, all stations show a low tropopause between 10 and 12 km for the whole year, whereas Eastern US and Japan are located further south and are characterized by a low TP in winter and a tropical TP (16-18 km) in summer, with a transition in spring and fall. Ozone mixing ratios over Western Europe in summer are therefore more strongly influenced by polar airmasses than over Eastern US and Japan, as evident in the PDFs of ozone within 1-3 km above the TP (Fig. 10). In winter, ozone distributions of different stations over the US and Japan are much less homogeneous than in summer; the height of the TP is similar for all mid-latitude stations with the exception of Naha. Naha is characterized by a high tropical TP for most of the year (Fig. 11). Further, Sapporo and Tateno show a slightly lower TP height in winter compared to the other stations. In contrast to Logan (1999a), the TP over Kagoshima in winter does not show tropical values and its seasonality is more similar to the TP over Tateno.
For the selected stations in Japan (Kagoshima and Tateno) and, to a lesser degree, Eastern US (Huntsville and Wallops Island), ozone distributions have a peak of relatively low ozone mixing ratios with a long tail towards high ozone mixing ratios in winter (Fig. 10). The low ozone mixing ratios correspond to profiles with low TP heights, while the medium and larger values are from profiles with higher TP heights. This suggests that when the TP is low in winter, ozone in the lowermost stratosphere is strongly influenced by tropospheric intrusions of airmasses with a source in the upper tropical troposphere. This correlates with the more frequent occurrence of the double TP in winter and spring (Randel and Wu, 2007).
Interestingly, ozone mixing ratios over Japan are lower than those in the US in winter for the same latitude band, and much lower than those over Madrid, which is located at a similar latitude (not shown). This might be linked to the higher frequency of exchange processes between troposphere and stratosphere, including more tropospheric intrusions over the Pacific than over the Atlantic (Sprenger et al., 2007; Kunz et al., 2011). Therefore, zonal averages, as used in earlier climatologies (Stevenson et al., 2006), mask significant longitudinal variations that are valuable for model evaluation.

Tropics
The processes influencing tropospheric ozone for the three stations denoted as NH Subtropics and the stations in the Tropics (Fig. 1) differ depending on their location in regard to the ascending or descending branch of the Walker circulation, the seasonality of convection, the influence of biomass burning, the amount of pollution and stratospheric influence (Thompson et al., 2011a,b). The vertical structure of ozone and the interannual variations over a 10-yr period are distinct from station to station (Thompson et al., 2011a). For the tropical troposphere, the median differences of the stations are very high and reach 40 ppb and the H-values are around 0.3-0.4 (not shown). Based on averaged relative humidity, temperature, and ozone at the tropopause and TTL, Thompson et al. (2012) performed a separation of the tropical stations into the three sub-regions, Western Pacific and East Indian Ocean, equatorial America, and the Atlantic and Africa. We adopted these regions for our analysis.
Median ozone values in the NH Subtropics show a large spread around 300 hPa, with a maximum above 30 % in JJA. A distinct upper tropospheric ozone minimum occurs in winter, which might be the result of the influence of ozone-poor airmasses transported northward from the tropical tropopause, as suggested in Ogino et al. (2012). Ozone over the Western Pacific and East Indian Ocean region is strongly influenced by deep convection, resulting in a distinct upper tropospheric ozone minimum at about 200 hPa for all seasons (Fig. 12). The tropospheric distributions of the three stations are very similar and show a median spread of less than 10 ppb and an H-value of about 0.2 above 800 hPa. The equatorial Americas describe a transition zone between western Pacific and Atlantic. San Cristobal is more strongly influenced by deep convection than Paramaribo, leading to lower ozone values at 200 hPa over San Cristobal. Consequently, the median spread of ozone profiles in this region exceeds 10 ppb above 500 hPa and the ozone distributions of single stations are not very similar (Fig. 12, middle column). Ozone distributions over the Atlantic and Africa are very different from the other regions, due to their location near the descending branch of the Walker circulation. The three stations are also differently influenced by biomass burning and lightning activities (see Thompson et al., 2011b, for more details). The median spread of ozone in the troposphere is around 15 ppb, but reaches over 30 ppb at around 600 hPa in SON.

Middle and high southern latitudes
For the middle southern latitudes, only three ozonesonde stations are available: Lauder, Macquarie, and Broadmeadows. Ozone over Broadmeadows (former Laverton between 1982 and 1999, in southern Australia) is systematically higher in Austral summer (DJF) in the troposphere compared to Lauder and Macquarie, located further south (not shown). Further, ozonesondes over Broadmeadows sampled higher ozone mixing ratios in Austral summer (DJF) and lower ozone mixing ratios in Austral winter above the TP compared to the stations that are located further south. The two stations further south are more strongly influenced by low ozone mixing ratios after the break-up of the polar vortex in DJF and ozone describes a similar distribution compared to high northern latitudes. The higher ozone mixing ratios for the more southern stations are expected in the LMS in JJA, because of the stronger transport of ozone towards the poles in high latitudes. On the other hand, Broadmeadows is more strongly influenced by tropical airmasses than the other two stations. Consequently, we only combine Lauder and Macquarie into one regime. Tropospheric ozone distributions of the two stations are similar with H-values below 0.15. Some variability exists in summer in the boundary layer, with larger ozone values over Macquarie than over Lauder. For the SH Polar region, all three ozone stations show a very similar distribution in the troposphere, with very low values around 10 ppb in DJF and maximum values in JJA of 30 ppb, in agreement with Logan et al. (1999). The largest ozone mixing ratios in JJA could be a result of enhanced biomass burning that reaches into high southern latitudes. For the UTLS, a larger variability between the stations is a result of the variability of location and depth of the ozone hole in SON.

Application of the ozone climatology to model studies
Applications of the new climatology to evaluate ozone concentrations in chemistry climate and chemistry transport models are demonstrated and examples for new diagnostics are provided. We use two model simulations that were performed with CAM-Chem, as described in detail in Lamarque et al. (2012). One simulation, denoted as "CAM-Chem strat/trop" in the following, used the online representation of dynamics. For the second simulation, denoted as "CAM-Chem GEOS5", dynamical fields (temperature, winds, surface fluxes) were specified using NASA GMAO GEOS-5 meteorological analyses. The physics of both model simulations are the same, with a model top at 10 hPa. For the CAM-Chem GEOS5 simulation, a higher vertical resolution of 56 vertical layers was used, instead of 26 layers in the case of CAM-Chem strat/trop. Both model simulations are based on the same tropospheric chemistry using the MOZART-4 mechanism (Emmons et al., 2010). Additionally, CAM-Chem strat/trop used extensive stratospheric chemistry, whereas stratospheric chemical species were prescribed in the stratosphere for the CAM-Chem GEOS5 simulation.
Daily model output over 7 yr from both model simulations is compared to the climatology. We do not consider monthly mean model output, since it does not allow comparing ozone PDFs on a seasonal basis, as provided in the new climatology. To achieve the best comparison, regional aggregates are derived from the model output. For this, simulated ozone profiles are interpolated to the location of each of the ozonesonde stations. They are further interpolated to fixed pressure levels, as well as relative altitude levels, using the thermal TP. Here, we only show comparisons in the troposphere.
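The vertical part of this interpolation can be sketched as follows. This is a minimal illustration and not the procedure actually used for the climatology: the function names `to_pressure_levels` and `to_tropopause_relative` are hypothetical, linear interpolation in log-pressure is an assumed choice that the text does not specify, and the horizontal interpolation to the station location is assumed to have been done beforehand.

```python
import numpy as np


def to_pressure_levels(pressure_hpa, ozone_ppb, target_hpa):
    """Interpolate one ozone profile to fixed pressure levels.

    Assumption: linear interpolation in log-pressure. Target levels
    outside the range of the profile are returned as NaN.
    """
    pressure = np.asarray(pressure_hpa, dtype=float)
    ozone = np.asarray(ozone_ppb, dtype=float)
    order = np.argsort(pressure)  # np.interp needs increasing x values
    return np.interp(np.log(target_hpa), np.log(pressure[order]),
                     ozone[order], left=np.nan, right=np.nan)


def to_tropopause_relative(altitude_km, ozone_ppb, tropopause_km, target_rel_km):
    """Interpolate one ozone profile to altitudes relative to the tropopause.

    tropopause_km is the (thermal) tropopause height of this profile;
    target_rel_km are distances above (+) or below (-) the tropopause.
    """
    altitude = np.asarray(altitude_km, dtype=float)
    ozone = np.asarray(ozone_ppb, dtype=float)
    order = np.argsort(altitude)
    relative = altitude[order] - tropopause_km
    return np.interp(target_rel_km, relative, ozone[order],
                     left=np.nan, right=np.nan)
```

Applying both functions to every daily model profile and every sonde profile puts the two data sets on the same vertical grids before seasonal PDFs are formed.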
Seasonally-averaged and regionally-aggregated profiles are compared between both models and observations for the troposphere to identify shortcomings in different altitudes and seasons, as shown in Fig. 13, for Western Europe and Eastern US. Here, both model simulations are biased high at the surface and above 400 hPa. In general, CAM-Chem strat/trop shows a better representation of the ozone gradient across the TP, likely a result of a more precise description of stratospheric ozone in CAM-Chem strat/trop, as discussed in Makar et al. (2010)  GEOS5, pointing to a more realistic description of transport patterns in the offline model simulation.
The performance of the models regarding annual averages and the seasonality is illustrated using Taylor diagrams, which provide an overall summary for all regions in the troposphere at 1000 hPa and 500 hPa (500 hPa and 250 hPa for the Tropics); see Fig. 14. The Taylor diagram compares annual mean ozone between model and observations versus the correlation of monthly averaged ozone mixing ratios and therefore the seasonality of ozone. Ozone is poorly simulated at the surface for most of the regions in both model simulations, with a slightly better performance of the CAM-Chem GEOS5 simulation. For example, both models overestimate the low ozone mixing ratios over NH Polar West and overestimate surface ozone over Eastern US, which can be caused by both chemical and dynamical shortcomings. On the other hand, both models represent ozone well at 500 hPa, as also discussed in Lamarque et al. (2012).
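The two quantities entering such a diagram can be sketched as follows. This is an illustrative example under stated assumptions, not the code used to produce Fig. 14: the function name `seasonal_skill` is hypothetical, the annual mean is taken as the unweighted average of 12 monthly means, and the model-to-observation ratio of annual means is an assumed way of expressing the annual-mean comparison.

```python
import numpy as np


def seasonal_skill(model_monthly, obs_monthly):
    """Return (annual-mean ratio, seasonal correlation) for one region and level.

    Both inputs are 12 monthly mean ozone mixing ratios (January to December).
    ratio > 1 means the model is biased high in the annual mean;
    the correlation measures how well the seasonal cycle is reproduced.
    """
    model = np.asarray(model_monthly, dtype=float)
    obs = np.asarray(obs_monthly, dtype=float)
    if model.shape != (12,) or obs.shape != (12,):
        raise ValueError("expected 12 monthly means for model and observations")
    ratio = float(model.mean() / obs.mean())
    correlation = float(np.corrcoef(model, obs)[0, 1])
    return ratio, correlation
```

A model with the correct seasonal cycle but a uniform high bias would give a correlation of 1 and a ratio above 1, which separates errors in the mean from errors in the seasonality.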
Further, we demonstrate a comparison of ozone PDFs, as provided by the new climatology, using the example of Japan for the troposphere. Japan was flagged in the climatology, pointing to a larger variability of the two stations included in the regional aggregate. Neither simulation was able to reproduce the seasonality of ozone in the troposphere, especially CAM-Chem strat/trop (Fig. 13, middle panel), even though the annual averages are similar to observations. A station-by-station comparison between model simulations and observations of ozone PDFs for winter and summer is shown in Fig. 15. The median difference (in ppb) and H-value between models and observations are shown for each station in the top right corner of Fig. 15.
In winter, ozone PDFs are rather narrow for both model results and observations. Both model simulations overestimate ozone in the troposphere in winter. The high bias of ozone in winter results in a much smaller overlap of the simulated and observed PDFs than in summer, which results in   can be a result of an underestimation of emissions, or shortcomings in the representation of transport patterns. Nevertheless, at 800 hPa, CAM-Chem GEOS5 is able to reproduce a bimodal distribution over Tateno, showing some indication of the influence of polluted airmasses in this region. Since emissions are very similar in the two simulations considered, differences in transport patterns are likely responsible for differences between the two simulations. An investigation of regional averages alone would not allow these conclusions to be drawn.
A global ozonesonde climatology has been compiled using 42 stations. Monthly averaged ozone profiles for 1995-2011 are provided on pressure altitudes and tropopause-referenced altitudes for all stations and can be downloaded at http://acd.ucar.edu/~tilmes/ozone.html. Besides averaged profiles on pressure altitudes and tropopause-referenced altitudes (mean and median) (Logan, 1999a; Considine et al., 2008), we provide information about the standard deviation, the half-width of the distribution, the interannual variability, and the number of profiles entering the average. The same information is also provided for the period 1980-1994. Mean values agree well with the climatology compiled by Logan et al. (1999), who used the same period of data. Further, we identify ozone stations with similar ozone characteristics by comparing the medians and PDFs of ozone profiles for all seasons and pressure levels. The Hellinger distance (H-value) is employed to quantify the similarity of two ozone PDFs. If the median spread of two stations is below 10 ppb for the troposphere (or 15 % for the UTLS) and if the H-value is equal to or below 0.2, we define ozone stations to be similar. These values cover the instrumental uncertainties of ozonesonde measurements as well as uncertainties due to the limited sampling frequency. Similar stations are then combined into regions with at least two stations. As for single stations, regional aggregates and statistical information are also provided. We have flagged altitudes and seasons for those regions where ozone PDFs among individual stations are not similar to each other. Those cases have to be regarded with caution when comparing regional aggregates with model simulations or other data sets.
The seasonal variability of regionally aggregated ozone profiles in the troposphere is in agreement with earlier studies (e.g., Logan, 1999a; McPeters et al., 2007). We further identify longitudinal variations for different latitude bands. For example, the western and eastern parts of the NH polar region show different characteristics in the troposphere. Differences in the shape of ozone distributions occur for the three regions in the NH mid-latitudes (Western Europe, Japan and the Eastern US). Differences are caused by the varying influence of airmasses from tropics and high latitudes, indicated by variations in the height of the TP. Further, differences between Japan and Eastern US in winter and spring are likely caused by the different influences of mixing between upper tropical troposphere and LMS. This may be related to a weaker transport barrier between the tropical TP and the lower stratosphere in the area around the North Pacific in winter and spring (Sprenger et al., 2007) and suggests a need for further investigation. The findings support the conclusion that zonal averages are insufficient for evaluating models due to the longitudinal variation of ozone distributions.
We also combine three regions in the Tropics, as suggested by Thompson et al. (2012), that show very different characteristics. Large variability in ozone between stations in the Tropics and NH Subtropics should be considered when using regional aggregates. The representativeness of ozone PDFs from regional aggregates is investigated by comparing the ozonesonde data with independent data sets, using surface ozone and MOZAIC aircraft data. Ozone observations from ozonesondes, surface, and aircraft measurements over Western Europe agree well in both shape of the PDFs and median values. The variability in surface ozone is larger than that derived from ozonesonde observations. Ozonesonde measurements over Eastern US are biased high compared to surface and MOZAIC data below 800 hPa. Regions with only two or three ozonesonde stations are for the most part not sufficient to represent ozone near the surface for the entire region. Also, a different sampling time between different datasets might introduce differences of up to 15 ppb at the surface in summer. On the other hand, reasonable agreement between ozonesonde measurements and MOZAIC data exists at 500 hPa for Western Europe, Eastern US and Japan.
The climatology is applied to evaluate model results from two different model simulations performed with NCAR CAM-Chem. The comparison of median ozone profiles between simulations and observations identifies shortcomings in both model simulations, as well as differences between the two simulations. A better representation of the ozone gradient across the TP is found for the model simulation with derived stratospheric ozone, in contrast to the model with stratospheric ozone prescribed on a monthly and zonal basis. This further indicates the importance of considering longitudinal variations of ozone in the UTLS.
The illustration of model performance in Taylor diagrams is useful for evaluating annual mean ozone in different regions, as well as its seasonal behavior. A more detailed analysis can be performed with this climatology by comparing ozone PDFs between observations and model simulations. An example is discussed for tropospheric ozone over Japan. Differences between the two model simulations and the climatology point to possible shortcomings in the model transport over this region. The compiled ozonesonde climatology provides an updated and extended basis for present-day model evaluation, and the introduced diagnostics give further insights into the ability of models to reproduce observed features of the global ozone distribution.

Hellinger distance
Fig. A1. Top row: probability distribution function (PDF) of ozone for the three stations within Japan (different colors) for four seasons. The regionally-aggregated distribution is shown as black thick lines. Middle row: cumulative distribution functions (CDFs) of ozone for the three stations (thin lines) and for the regionally aggregated distribution (average distribution of ozone from all three stations, thick black line) using variable bin sizes for the underlying PDF. Bottom panel: Hellinger distance between different stations (different colors) and the regionally aggregated distribution plotted against the median differences of the two distributions. Distribution samples are from data within 3-5 km above the TP.
For the comparison of ozone distributions within observational data sets or between observations and models, the mean (or median) and standard deviation (or width) of a distribution are often considered. The comparison of means of ozone distributions does not give any information about the shape of the distributions, whereas the median and percentiles give only a first-order estimate. However, a distribution of ozone concentrations is often not well represented as a Gaussian distribution, as shown in Fig. A1 (top row), using the ozone distribution based on sondes in Japan in the LMS as an example. Differences in the shape of two ozone distributions, e.g., Gaussian compared to bi-modal in the UTLS, even if describing the same mean and width of the distribution, might produce significantly different signals in radiative forcing or heating in a climate model. Further, in the troposphere, the mean or median of a distribution does not give any information on the frequency of very high ozone episodes as a result of pollution that can lead to health problems. Consequently, we need to evaluate not only the differences between means of two distributions, but also how much the shapes of the distributions vary from each other, to get an idea of how well the models represent the physical behavior of the atmosphere. We introduce the "Hellinger distance" (Nikulin, 2001) as a tool to assess the similarity between two distributions. Let P and Q denote two probability measures that are absolutely continuous with respect to the ozone mixing ratio λ. The Hellinger distance H(P, Q) between two cumulative distribution functions (CDFs), P and Q, of ozone is defined as follows:

$$H(P, Q) = \sqrt{\frac{1}{2} \int \left( \sqrt{\frac{dP}{d\lambda}} - \sqrt{\frac{dQ}{d\lambda}} \right)^{2} d\lambda}$$

where dλ is the interval width of the mixing ratio bin. The Hellinger distance is 0 when two distributions are identical, and 1 when two distributions are completely different. The interval bin, λ, of the CDF is chosen in such a way that each bin contains an equal number of data, resulting in variable bin sizes. This allows a smoother representation of the shape of the CDF, as illustrated in Fig. A1, middle panel. To compare two distributions, the same number of bins is chosen for each distribution. Depending on the number of bins, the Hellinger distance can vary. This, however, does not change the conclusions.
Here, we use 25 bins to compare two ozone distributions.
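A discretized version of this calculation might look as follows. This is a minimal sketch rather than the implementation used for the climatology: the function name `hellinger_distance` is hypothetical, and taking the bin edges from quantiles of the pooled sample (so that both distributions share one set of roughly equally populated bins) is an assumption, since the text does not state how common bins are derived.

```python
import numpy as np


def hellinger_distance(sample_p, sample_q, n_bins=25):
    """Hellinger distance between two samples of ozone mixing ratios.

    Assumption: bin edges are quantiles of the pooled sample, which gives
    variable bin widths with a roughly equal number of data per bin.
    Returns 0 for identical distributions and 1 for non-overlapping ones.
    """
    p = np.asarray(sample_p, dtype=float)
    q = np.asarray(sample_q, dtype=float)
    pooled = np.concatenate([p, q])
    edges = np.quantile(pooled, np.linspace(0.0, 1.0, n_bins + 1))
    edges = np.unique(edges)  # repeated values can produce duplicate edges
    if edges.size < 2:
        return 0.0  # all values identical, so the distributions coincide
    # Probability mass per bin, i.e. (dP/d lambda) times the bin width
    prob_p = np.histogram(p, bins=edges)[0] / p.size
    prob_q = np.histogram(q, bins=edges)[0] / q.size
    return float(np.sqrt(0.5 * np.sum((np.sqrt(prob_p) - np.sqrt(prob_q)) ** 2)))
```

Because the result depends on the number of bins, the same `n_bins` should be used for every pair of distributions that is compared.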
To illustrate the performance of the Hellinger distance, we use the example of ozone distributions in the lowermost stratosphere (3-5 km above the tropopause) from the three Japanese data sets. Figure A1 illustrates the PDF (top row) and CDF (middle row) of the ozone distributions taken from three different ozonesonde stations (different colors) for all four seasons. The three distributions are compared to the regional average (Fig. A1, middle row, black line) by calculating the Hellinger distance between the distribution of each station and the regionally aggregated distribution. The derived Hellinger distance is then plotted vs. the percentage difference of the medians of the distributions (Fig. A1, bottom row). If the ozone distribution is very similar to the regional aggregate (black line), the Hellinger distance is below 0.1, as is the case in summer for all Japanese stations. On the other hand, even if the differences in the mean are small, the Hellinger distance can be larger than 0.2 if the shapes of the three ozone distributions are different, as is the case for winter and spring. This example supports the definition in Sect. 4 for similar distributions in one region, where the spread of median values has to be less than 10 ppb in the troposphere and 15 % in the UTLS, and the Hellinger distance has to be less than 0.2.
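The similarity criterion for two stations can be written as a short check. This is only a sketch: the function name `stations_similar` is hypothetical, the H-value is assumed to have been computed beforehand, and expressing the 15 % UTLS threshold relative to the mean of the two medians is an assumption, since the reference value for the percentage is not stated here. The summary of the criterion allows an H-value equal to 0.2, which is what the code follows.

```python
import numpy as np


def stations_similar(ozone_a, ozone_b, h_value, layer="troposphere"):
    """Check whether two stations count as similar for one level and season.

    ozone_a, ozone_b: samples of ozone mixing ratios at the two stations.
    h_value: Hellinger distance between the two distributions.
    layer: "troposphere" (median spread below 10 ppb) or
           "utls" (median spread below 15 %, relative to the mean of
           the two medians, which is an assumption of this sketch).
    """
    median_a = float(np.median(ozone_a))
    median_b = float(np.median(ozone_b))
    spread = abs(median_a - median_b)
    if layer == "troposphere":
        median_ok = spread < 10.0
    elif layer == "utls":
        median_ok = 100.0 * spread / (0.5 * (median_a + median_b)) < 15.0
    else:
        raise ValueError("layer must be 'troposphere' or 'utls'")
    return bool(median_ok and h_value <= 0.2)
```

Levels and seasons for which this check fails within a region correspond to the cases that are flagged in the climatology.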