3-D evaluation of tropospheric ozone simulations by an ensemble of regional Chemistry Transport Models

A detailed 3-D evaluation of an ensemble of five regional Chemistry Transport Models (RCTM) and one global CTM with focus on free tropospheric ozone over Europe is presented. It is performed over a summer period (June to August 2008) in the context of the GEMS-RAQ project. A data set of about 400 vertical ozone profiles from balloon soundings and commercial aircraft at 11 different locations is used for model evaluation, in addition to satellite measurements with the infrared nadir sounder (IASI) showing largest sensitivity to free tropospheric ozone. In the middle troposphere, the four regional models using the same top and boundary conditions from IFS-MOZART exhibit a systematic negative bias with respect to observed profiles of about −20 %. Root Mean Square Error (RMSE) values grow steadily with altitude, from 22 % to 32 % to 53 %, for the 0–2 km, 2–8 km and 8–10 km height ranges, respectively. Lowest correlation is found in the middle troposphere, with minimum coefficients (R) between 0.2 and 0.45 near 8 km, as compared to 0.7 near the surface and similar values around 10 km. A sensitivity test made with the CHIMERE model also shows that using hourly instead of monthly chemical boundary conditions generally improves the model skill (i.e. improves RMSE and correlation). Lower tropospheric 0–6


Introduction
Regional Chemical Transport Models (RCTM) are now central tools of air quality policy. In the case of ozone, their operational use for short-term forecast and monitoring (Rouil, et al., 2009;www.airnow.gov) implies the need for identifying and reducing the remaining uncertainties. Classically, RCTM are evaluated against surface observations (Honoré et al., 2008;Van Loon et al., 2007;Vautard et al., 2007) since their primary goal is to simulate pollutants to which humans and more generally the biosphere, are directly exposed. On the contrary, performance of such models to simulate free tropospheric ozone has been less evaluated (in contrast to global scale models (Johnson et al., 2010a)). Nevertheless, precise simulation of tropospheric ozone fields is crucial from the point of view of air quality. Since ozone is known to be harmful for humans (West et al., 2007) and vegetation development (Felzer et al., 2007), it is important to evaluate its long-range transport from source regions (Liang et al., 2004;Jonson et al., 2010) and the downward exchange between free troposphere and the boundary layer, which is poorly documented at the moment, but which is thought to be significant (Fiore et al., 2002;Foret et al., 2009;Parrington et al., 2009). In addition, the correct simulation of regional scale tropospheric ozone is important to assess its impact on regional climate change: ozone is the third most important greenhouse gas of the atmosphere (Forster et al., 2007) and, as an oxidant, it controls concentrations of other important greenhouse gases (mostly methane via OH production, Forster et al., 2007).
Vertical profiles of free tropospheric ozone provided by balloon-borne ozone sondes and by instruments on board commercial aircraft (MOZAIC program) are very valuable because of their high vertical resolution. For summer 2008, ozone vertical profiles from sondes were obtained at 9 sites over Europe, five of which performed one or more soundings per week. MOZAIC vertical ozone profiles were also obtained near 2 airports (Frankfurt, London), sometimes with more than one profile per day. In addition, the new generation of nadir-viewing infrared sounders (IASI, Clerbaux et al., 2009; TES, Worden et al., 2007) is now operational and opens new perspectives for studying free tropospheric ozone. IASI is a particularly good candidate thanks to its twice-daily coverage of Europe (under cloud-free conditions) and its higher sensitivity to free tropospheric ozone (compared to older instruments like GOME and/or SCIAMACHY) and, in some cases, to boundary layer ozone. These observations offer the possibility to evaluate/constrain pollution models (Eremenko et al., 2008; Foret et al., 2009; Coman et al., 2012).
The FP6 European project "Global and regional Earth-system (atmosphere) Monitoring using Satellite and in-situ data" (GEMS) aimed at developing a pre-operational system for forecasting the chemical composition of the atmosphere at the global scale and, more specifically, at the regional scale for Europe by using an ensemble of RAQ, where RAQ stands for Regional Air Quality models (Hollingsworth et al., 2008). In the framework of the RAQ-GEMS subproject, ten European RCTM have been set up since June 2008 to forecast pollutant concentrations (ozone, NO2, SO2, CO and particles) over Europe (http://gems.ecmwf.int/d/products/raq/). The IFS-MOZART (Global CTM) forecast (Flemming et al., 2009) is used as boundary conditions (for top and lateral boundaries) for most of the RAQ models, but it also produces forecasts over the regional domain. Model skill scores (such as bias, RMSE etc.) have been calculated online for pollutant surface concentrations using measurements made by European air quality networks (http://acm.eionet.europa.eu/databases/airbase/). However, little effort has been made to evaluate the ability of the models to reproduce free tropospheric concentrations. One reason for this is the lack of suitable (near real time) observations. What is proposed here is to conduct such an evaluation for tropospheric ozone in hindcast mode. To do so, a specific exercise has been set up in which five of the GEMS-RAQ RCTM have re-simulated the summer 2008 period, for some of them with new configurations that allow the whole troposphere to be simulated. These models are state-of-the-art models in Europe and together they are a representative sample of European RCTMs. They will be compared against an extended set of tropospheric ozone measurements from sondes, commercial aircraft (MOZAIC), and thermal-infrared measurements onboard satellite (IASI). To our knowledge, this is the first study that uses IASI ozone observations to evaluate RCTMs.
The frequency of observations (especially the daily coverage for IASI observations) allows comparisons between observations and models from the seasonal to the day-to-day temporal scale. More specifically, we discuss uncertainties induced by the different representation in models of some of the processes controlling tropospheric ozone concentrations. In particular, for boundary conditions, we compare the impact of climatological and daily resolved boundary conditions, and we also examine differences in model transport between regional-scale and global-scale CTMs.
In Sect. 2, in situ and satellite observations are described. Section 3 presents the models participating in the exercise. The results of the systematic comparison between observations and models over a whole summer period are shown in Sect. 4, which also includes a case study that illustrates the synergy between models and satellite data to analyse specific events.

Table 1. Geographical characteristics of the sounding sites as well as the number of profiles available for the study.

The sondes used in this paper are taken from two archives, namely (1) the World Ozone and Ultraviolet Data Center (WOUDC) (http://www.woudc.org) and (2)  Vertical ozone soundings are obtained from electrochemical sensors lifted by hydrogen-filled rubber balloons up to 30 km altitude. The vertical resolution of the stored measurements is about 100 m. The accuracy of such measurements is estimated to be better than 5 % in the troposphere (Smit et al., 2007). Over the "GEMS" European domain (covering part of European Russia, see model domain in Sect. 3), we have gathered data from 9 sounding sites for summer 2008 (June to August, Fig. 1). Table 1 indicates coordinates and altitude for each site as well as the number of profiles available and the databases from which they are available.

Tropospheric ozone measurements by commercial aircraft
Since 1994, ozone has been measured onboard commercial airliners in the framework of the MOZAIC program (Marenco et al., 1998). The measurement principle is dual-beam UV absorption, with an accuracy estimated at ±2 ppb or +2 % (Thouret et al., 1998). The vertical resolution of profiles taken during the take-off and landing phases is a few tens of meters. For summer 2008, a large number of profiles were available at Frankfurt (162 profiles) and London (58 profiles) airports. For Frankfurt, this corresponds to a daily frequency of nearly two (1.76 day⁻¹).

Tropospheric ozone measurements by satellite
The Infrared Atmospheric Sounding Interferometer (IASI) instruments    Stiller et al., 2000) and its numerical inversion module KOPRAFIT. The inversion method was set up and first applied by Eremenko et al. (2008). To achieve maximal information content in the troposphere, a constrained least-squares fit method with an analytical altitude-dependent regularization is used. The regularization matrix is a combination of zero-, first- and second-order Tikhonov constraints with altitude-dependent coefficients that were optimised both to maximise the Degrees of Freedom (DOF) of the retrieval in the troposphere and to minimise the total error of the retrieved profile. A validation exercise performed over the first year and a half of IASI operation for the northern mid-latitudes showed a bias of less than 5 % in the retrieved ozone. Calculated instrumental and retrieval errors (in total about 18 % for 0-6 km partial columns at mid-latitudes) are consistent with the standard deviation of the differences between sonde measurements and IASI observations.
Due to the limited number of degrees of freedom in the troposphere and considering the GEMS-RAQ focus on lower tropospheric ozone, partial 0-6 km ozone columns have been chosen as the basis of comparison between IASI observations and RAQ model simulations in this paper, as in Eremenko et al. (2008). In order to make simulations comparable to the retrieval, the simulated ozone profile vector x_s needs to be transformed into a pseudo-retrieved profile x_r by applying Eq. (1):
$x_r = x_a + \mathrm{AVK}\,(x_s - x_a)$ (1)
Here AVK denotes the Averaging Kernel Matrix, which expresses the sensitivity of the retrieved profile to the true profile and, by extension, to the a priori information (x_a). A row of the AVK indicates the sensitivity of retrieved ozone, at a given layer, to changes in ozone at the same and other layers. This matrix is calculated during the retrieval process for each individual retrieved profile. An example of a typical AVK is shown in Fig. 2. The left panel shows the rows of the AVK for different altitudes (black curves correspond to levels between 0 and 6 km, red curves to levels between 7 and 12 km). The right panel shows the integrated AVK over these two height ranges. This figure indicates that, due to the measurement set-up and as a result of the retrieval method: (1) it is impossible to separate information originating from nearby vertical levels; (2) the sensitivity to the lower levels of the ozone profile (below 3 km) is relatively small. Nevertheless, the lower and the upper parts of the troposphere (0-6 km and 7-12 km) are almost independent and can be separated when thermal conditions (surface temperature, thermal contrast) are favorable, i.e. mainly during summer (Dufour et al., 2010). In Eq. (1), x_a represents the a priori ozone profile used in the retrieval. Application of Eq. (1) to the "high resolution" vertical profile x_s ensures that x_a (the a priori) has no impact on the IASI-simulation comparison.

Fig. 2. Typical averaging kernels on the vertical grid used for retrieval (and model evaluation), with vertical layers spaced by 1 km (left panel), and in partial column space (right panel). The black curve displays the averaging kernel corresponding to the 0-6 km ozone partial column and the red one shows the same for the 6-12 km ozone partial column.
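The smoothing step can be sketched in a few lines of Python. This is a minimal illustration, assuming that Eq. (1) has the standard averaging-kernel form x_r = x_a + AVK (x_s − x_a), that all profiles are given as ozone amounts per 1 km layer on the 0-12 km retrieval grid, and that layer i spans i to i+1 km. The function names are hypothetical and are not taken from the retrieval code.

```python
import numpy as np

def smooth_profile(x_s, x_a, avk):
    """Pseudo-retrieved profile x_r = x_a + AVK (x_s - x_a) (assumed form of Eq. 1)."""
    x_s = np.asarray(x_s, dtype=float)
    x_a = np.asarray(x_a, dtype=float)
    avk = np.asarray(avk, dtype=float)
    return x_a + avk @ (x_s - x_a)

def partial_column(profile, z_low_km=0, z_high_km=6):
    """Sum the layer amounts between z_low_km and z_high_km.

    Assumes 1 km layers, with layer i spanning i to i+1 km.
    """
    profile = np.asarray(profile, dtype=float)
    return float(profile[z_low_km:z_high_km].sum())
```

With an identity averaging kernel the simulated profile is returned unchanged, and with a zero kernel the result collapses to the a priori, which is a quick sanity check on the sign convention.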
In the following, we name "raw" columns the vertical columns integrating the simulated profiles x_s, and "smoothed" columns those calculated from the x_r profile (Eq. 1). For the comparison, individual IASI measurements (pixels) are regridded onto the CHIMERE model grid cells with 0.5° resolution.
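One simple way to regrid individual pixels onto a regular 0.5° grid is to average all pixels whose centre falls in a given cell. The sketch below does this with NumPy. The text does not specify the regridding method, so cell averaging is an assumption here, and the function name and arguments (grid origin, number of cells) are hypothetical.

```python
import numpy as np

def regrid_mean(lon, lat, values, lon0, lat0, nlon, nlat, res=0.5):
    """Average scattered pixel values into regular lon/lat cells.

    Returns an (nlat, nlon) array; cells without any pixel are NaN.
    Cell averaging is an assumed method, not necessarily the one used in the study.
    """
    lon = np.asarray(lon, dtype=float)
    lat = np.asarray(lat, dtype=float)
    values = np.asarray(values, dtype=float)
    ix = np.floor((lon - lon0) / res).astype(int)
    iy = np.floor((lat - lat0) / res).astype(int)
    inside = (ix >= 0) & (ix < nlon) & (iy >= 0) & (iy < nlat)
    total = np.zeros((nlat, nlon))
    count = np.zeros((nlat, nlon))
    # np.add.at accumulates correctly when several pixels hit the same cell
    np.add.at(total, (iy[inside], ix[inside]), values[inside])
    np.add.at(count, (iy[inside], ix[inside]), 1.0)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(count > 0, total / count, np.nan)
```

Pixels outside the grid are ignored, and empty cells stay NaN so that cloud-covered areas are not mistaken for low ozone columns.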

Model set-up and processing
Five models from the RAQ activity within the FP6/GEMS project participate in this exercise (BOLCHEM, CAMx, CHIMERE, EURAD, and MOCAGE). In addition, MOZART global fields provided by IFS (Flemming et al., 2009) are also included in the comparison. Model runs are all performed over the common European GEMS-RAQ domain (Fig. 1) for three summer months of 2008 (June to August). Note that all models have also been active during this period for real-time air quality forecasting over Europe within the FP6 GEMS project (Hollingsworth et al., 2008). However, hindcast simulations are used in this work, because the whole set of tropospheric ozone simulations was not stored during real-time forecasting. Moreover, some models have modified their operational version for this exercise; in particular, the BOLCHEM and CHIMERE models have moved the top of their domain from 500 to 200 hPa.
Atmos. Chem. Phys., 12, 3219-3240, 2012 www.atmos-chem-phys.net/12/3219/2012/
Global meteorological analyses (available every 6 h) and forecasts (available every 3 h) with a spectral resolution of T799 and with 91 vertical levels up to 1 hPa are provided by ECMWF to the GEMS project (called IFS meteorology). All models use it either as direct meteorological input (CHIMERE, MOCAGE, IFS-MOZART) or as large-scale fields for separate mesoscale simulations: on-line in BOLCHEM, based on BOLAM dynamics (Buzzi et al., 1994), and off-line with the MM5 regional meteorological model for EURAD and CAMx. For all models, anthropogenic emissions are taken from the high-resolution emission database provided by TNO for GEMS (Visschedijk et al., 2007). For biogenic emissions, two models (BOLCHEM, CAMx) use the grid-based Biogenic Emission Model (BEM; Poupkou et al., 2010), which calculates NMVOC (Non-Methane Volatile Organic Compound) emissions from vegetation at high spatial (30 km × 30 km) and temporal (hourly) resolution. Similar approaches are used by the other models: EURAD (Guenther et al., 1993), MOCAGE (Guenther et al., 1995; Dentener et al., 2005) and CHIMERE, which uses the MEGAN model (Guenther et al., 2006). Moreover, altitude emissions, i.e. lightning NOx emissions and aircraft emissions, are taken into account in IFS-MOZART (Horowitz et al., 2003) following the parameterization proposed by Price et al. (1997) for lightning NOx and the work of Friedl (1997) for aircraft emissions. RCTM do not directly represent these altitude emissions but use boundary conditions from the IFS-MOZART model. Indeed, hourly varying boundary conditions (BC) for ozone (but also for CO, NO, NO2, HNO3, peroxyacetyl nitrate, C2H6, isoprene, toluene and some others) are taken from IFS-MOZART global fields for most of the models, except for MOCAGE, which uses MOCAGE global simulations as hourly boundary conditions.
As we will see later, the choice of boundary conditions can be a crucial factor in model behaviour. Models also include various formulations for atmospheric chemistry using well-characterised reduced chemical schemes (Table 2). Dry deposition schemes are based on the classical "resistance" approach (Wesely, 1989). Another important model feature is the representation of pollutant transport. Table 2 indicates the choices made in each RCTM to describe horizontal and vertical advection, turbulent transport in the planetary boundary layer, and convection by clouds. The impact of using different formulations for some of these processes will be analysed in Sect. 4.
The horizontal resolution of the models varies between 0.2° and 0.5°, and the model top between 200 hPa (approximate tropopause height over Europe) and 10 hPa (only the CAMx model has a top at 300 hPa, see Table 2). About 20 vertical levels are used to discretise the troposphere between the surface and 200 hPa. In order to have a common reference frame for comparison, daily concentrations have been interpolated onto a horizontal 0.5° × 0.5° lat/lon grid (Fig. 1) and a regularly spaced vertical grid ranging from 0 km to 12 km in 1 km increments. This should partially reduce the impact of the different horizontal and vertical resolutions used by the models (cf. Table 2). For the comparisons with satellite data, "raw" and "smoothed" partial 0-6 km columns have been calculated as explained in the previous section.

Systematic model evaluation over the summer period
We first present the systematic comparison of tropospheric ozone profiles simulated with RAQ models (Table 2) against in situ observations available from sondes and aircraft (Table 1). Results are analysed in terms of bias, RMSE and correlation as a function of altitude and integrated over the whole summer period (June to August 2008). The impact of the chemical boundary conditions (for lateral and top limits of the modelling domain) is also investigated. Next, this evaluation is completed by the comparison of models to satellite observations (IASI partial tropospheric columns of ozone) at different time scales (from seasonal to daily). This section further contains a case study that illustrates the ability of the models and IASI to reproduce strong ozone gradients in the troposphere associated with the tropopause height variability.

Comparisons between models and in situ vertical profiles
The comparison between simulations and in situ vertical ozone profiles is performed in the following way. In the horizontal plane, the model grid point closest to the observations is used. For MOZAIC measurements, we take into account the horizontal displacement of aircraft during take-off and landing (up to 500 km until the flight level is reached). In the vertical, we interpolate observations and simulations to a uniform grid, stretching from 0 to 10 km above the surface with 1 km steps. We apply linear interpolation. Second-order interpolation was also tested, but differences in error statistics were found to be negligible. With respect to time, the hourly model output closest to the mean observation time is taken. Figure 3 shows the results of this comparison: vertical profiles of mean bias (Model-Observation), RMSE and Pearson's correlation coefficient. Values are averaged over 1 km-height layers (plotted here in the middle of each layer) and over all available data from soundings and aircraft (about 400 vertical profiles). To analyse these results, we have chosen to first consider models using the IFS-MOZART hourly boundary conditions, which constitute a coherent sub-ensemble (i.e. BOLCHEM, CAMx, CHIMERE, EURAD and IFS-MOZART itself). In the following, results are presented as a function of three altitude ranges: (1) the Planetary Boundary Layer (PBL, 0-2 km height); (2) the middle troposphere (MT, 2-8 km height); (3) the upper troposphere (UT, 8-10 km height). This makes the presentation more concise and takes into account the main differences in processes driving ozone concentrations as a function of altitude (i.e. surface emissions, "fast" chemistry and dry deposition associated with turbulent transport in the PBL; horizontal transport and "slow" chemistry in the MT; and UTLS (Upper Troposphere Lower Stratosphere) exchange processes in the upper troposphere).

Table 2 (fragment; remainder not recoverable): CBM-IV + updates, Gery et al. (1989), Carter (1996…); Colella and Woodward (1984); Kain-Fritsch 2, Kain (2002); K-theory, coeff. MM5, Hong and Pan (1996); Wesely (…
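The vertical interpolation and the three scores described above can be sketched as follows. This is a minimal illustration, not the processing code of the study: the function names are hypothetical, and expressing bias and RMSE in percent of the mean observed concentration at each level is an assumption, since the normalisation is not spelled out in the text.

```python
import numpy as np

GRID_KM = np.arange(0.0, 11.0, 1.0)  # uniform grid, 0 to 10 km in 1 km steps

def to_grid(z_km, o3, grid=GRID_KM):
    """Linearly interpolate one ozone profile onto the uniform vertical grid.

    np.interp holds the end values constant outside the measured height range.
    """
    z_km = np.asarray(z_km, dtype=float)
    o3 = np.asarray(o3, dtype=float)
    order = np.argsort(z_km)  # np.interp needs increasing heights
    return np.interp(grid, z_km[order], o3[order])

def scores(model, obs):
    """Per-level bias (%), RMSE (%) and Pearson R.

    model, obs: arrays of shape (n_profiles, n_levels) on the same grid.
    Percentages are relative to the mean observation (assumed convention).
    """
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    diff = model - obs                      # Model - Observation
    mean_obs = obs.mean(axis=0)
    bias = 100.0 * diff.mean(axis=0) / mean_obs
    rmse = 100.0 * np.sqrt((diff ** 2).mean(axis=0)) / mean_obs
    m_anom = model - model.mean(axis=0)
    o_anom = obs - mean_obs
    r = (m_anom * o_anom).sum(axis=0) / np.sqrt(
        (m_anom ** 2).sum(axis=0) * (o_anom ** 2).sum(axis=0))
    return bias, rmse, r
```

A model that is uniformly 20 % too low gives a bias of −20 % and a correlation of 1 at every level, which shows that bias and correlation measure different aspects of skill.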

Results in the Planetary Boundary Layer (PBL; 0-2 km height)
In the PBL, mean ozone concentration is about 46 ppb with a 12.4 ppb standard deviation (Table 3). The mean model bias is −7 % with values varying, as a function of site, in a range between −24 % and +8 % (Table S1). Largest negative biases are observed at Valentia (−24 %), Lerwick (−21 %) and Sodankyla (−19 %), stations more directly under the influence of air masses from northern Atlantic and polar origin. This reflects probably a bias in the IFS-MOZART ozone fields for these regions. However, in the case of Valentia and Lerwick that are coastal stations, we can not exclude a systematic misrepresentation of local meteorological patterns such as land/sea breeze by the models. From Fig. 3, we observe a median bias of −2 % between 0 and 1 km height, which increases to about −12 % between 1 and 2 km. RMSE is almost constant in the PBL with a median value of about 10 ppb (22 %) ( Fig. 3; Table 3). This value is fairly similar at all sites (Table S1). We can note that these RMSE are similar to those obtained from previous evaluation studies using operational surface ozone measurements (Honoré et al., 2008;Vautard et al., 2007). Concerning Pearson's coefficient of correlation, the median value in the PBL is 0.71 but with significant variations from site to site (from 0.24 at Barajas to 0.86 at DeBilt). These discrepancies between sites are difficult to understand because they do not follow a clear pattern related to their geographical situation (i.e. coastal sites, mountainous sites which would be more complex to model do not show lower correlations in a systematic way). The general good correlation in PBL indicates that ozone build-up in the boundary layer is fairly well represented in the models as it is generally confirmed also by comparisons with ground stations (GEMS Final report, 2010). 
The variability ratio (model standard deviation divided by observation standard deviation) shows that the models reproduce the observed variability well (±10 %), except at Barajas (0.68) and Valentia (0.58) (Table 3 and Supplement Table S1).

Results in the Middle Troposphere (MT; > 2-8 km height)
In the MT, mean ozone concentrations are about 66 ppb with a 16.7 ppb standard deviation (Table 3). Mean model bias is negative (−16 %, Table 3). Largest negative values of the model median (about −20 %) are reached at 5 km height (Fig. 3). This behaviour is fairly systematic at all sites with a more or less pronounced minimum (Supplement Fig. S1). This type of negative model bias in middle tropospheric ozone over Europe has already been observed for several global models in earlier studies. Law et al. (2000) pointed out a negative model bias of ozone in the troposphere for European sites presenting a summer maximum. They postulated that it could be due to "a lack of chemistry, deficiencies in transport schemes, as well as inadequate resolution". Tarasick et al. (2007) also observed such a feature and postulated inaccuracies in the representation of stratosphere-to-troposphere exchange. More recently, Jonson et al. (2010) showed this negative bias for the Uccle station, especially in the middle and upper troposphere during a summer period (cf. Fig. 3 of their work). Finally, Ordonez et al. (2010) reached the same conclusion after comparing GEMS-GRG (GRG stands for Global Reactive Gases) global models, including IFS-MOZART, to aircraft data, and concluded that a combination of uncertainties affecting model simulations (coarse horizontal resolution, uncertainties in long-range transport of pollution, limitations of the chemistry scheme, underestimated emissions) was responsible for this underestimation of MT ozone. Since the RCTMs evaluated here use boundary conditions derived from the global IFS-MOZART model, it is plausible that middle tropospheric boundary conditions at the edge of Europe are also biased negatively, causing the negative bias in MT ozone in the RCTMs studied here.
RMSE (of the mean) increases almost linearly from about 24 % at 3 km height to 35 % at 8 km height (Fig. 3). This feature is observed at almost all sites except Barajas and Valentia (located near the western edge of the domain), for which values stay almost constant. In parallel, correlations decrease from 0.66 at 3 km height to a minimum of about 0.41 at 8 km height. Such patterns are observed for the three sites that dominate the statistics (273 of 395 profiles): Frankfurt, London and, to a lesser extent, Payerne. At most other locations a more or less broad minimum centred at 5-6 km height is observed (Supplement Fig. S1). Apparently, models reproduce ozone variability due to photochemical build-up in the PBL and due to the tropopause height variability in the upper troposphere (see below) better than the more "diffuse" forcing in the middle troposphere due to long-range transport, slow photochemistry, and exchange with lower and upper layers. The fact that RMSE increases with altitude, even when correlation increases again beyond its minimum, is due to increasing variability in ozone profiles at higher altitudes. Concerning model errors, it should be added that uncertainty in surface emissions inside the modelling domain probably does not play a significant role above the planetary boundary layer height. This is confirmed by sensitivity tests made with the CHIMERE model using either surface emissions increased/decreased by 20 % or the EMEP inventory (Vestreng et al., 2005) instead of the TNO inventory. Indeed, corresponding changes in ozone concentrations were always below 10 % within the first 2 km height, and below 5 % above this altitude. On the other hand, altitude emissions produced either by lightning or by aircraft could in principle explain a part of the model error.
The regional models of this study do not represent these emissions; they are only taken into account in IFS-MOZART, and it is also well known that these processes are still not well characterised. Nevertheless, due to the low residence time of air masses in the free troposphere within the model domain (of the order of several days) and small ozone production rates there, lightning NOx and aircraft emissions over Europe are not expected to significantly impact European free tropospheric ozone levels. Besides producing NOx via lightning activity, deep convection can also alter the redistribution of ozone and its precursors (Lawrence et al., 2005). Colette et al. (2005) have shown that 10 % of ozone-rich layers in the European free troposphere could have been uplifted by convection from the PBL. Nevertheless, even if taken into account in models, the parameterisation of such processes and their impact still remain highly uncertain. It should be noted that CHIMERE ozone simulations (made over the whole 2008 summer) in which deep convection has been switched off do not show significant differences (always less than a few percent) compared to the results obtained with the parameterisation included.
The ratio between the modelled and observed standard deviation is close to 1 on average (0.96; Table 3), but a certain spread is observed from one site to another, ranging from 0.65 at London to 1.40 at Sodankyla.

Results in the Upper Troposphere (UT; 8-10 km height)
In the UT, as expected, much higher and more variable ozone concentrations are observed, ranging from about 84 ppb (Barajas) to 148 ppb (Lerwick), with associated standard deviations representing 27 % to 65 % of these values (Supplement Table S1). Spatial gradients and the large temporal variability in observed ozone levels are induced by the vicinity of the tropopause and its spatial-temporal variability, which determine the degree to which air masses have a stratospheric, ozone-enhanced character. Vertical transport across the tropopause is an additional process affecting ozone fields (e.g. Stohl et al., 2003). As a consequence, ozone fields simulated by RAQ models are highly influenced by meteorological forcing and model transport (i.e. tropopause height, advection by winds) as well as by top and lateral boundary chemical forcing. Chemistry plays a minor role because the residence time of air masses over the domain is less than a few days, but it affects the boundary conditions. Mean model bias is weak (below 1 %, Table 3), with a large variability for individual sites ranging from −18 % at Lerwick to +30 % at Payerne. A rapid increase of relative RMSE with height is observed, reaching about 55 % at 10 km height, associated with a significant improvement of correlations (∼0.55 at 10 km height). These features are similar at almost all sites (Supplement Fig. S1, Table S1). RMSE increases with altitude despite a slight increase in correlation (Fig. 3). This could be explained by larger ozone concentrations and in particular larger ozone variability (1σ standard deviation) in the observed and simulated time series at these altitudes (cf. standard deviation in Table 3).
Larger variability generally favours correlation if basic processes are well taken into account, here in particular variations of the tropopause height, but also increases RMSE if such processes are not perfectly taken into account. The average variability ratio between simulations and observations is close to one (0.88), as for other height ranges (Table 3), but again with a large spread among individual sites (Supplement Table S1). Another explanation of increasing RMSE with altitude is related to the performance of the IFS itself. Indeed, systematic verifications of IFS performance are made at ECMWF. They show for this period at European latitudes that the RMSE (calculated by comparing 24 h forecasts to analyses) of both wind components increases with altitude, with maximum errors occurring between 200 and 300 hPa (in the jet stream region) and ranging from 1 to 3.6 m s⁻¹. These errors in wind amplitude and direction can affect the ozone advection simulated by RCTMs. A detailed analysis of this issue goes beyond the scope of this paper. Note that IFS meteorology is input for all simulations in this study (directly or as boundary conditions for mesoscale models); the impact on model errors is therefore expected to be similar.

These results indicate that models perform quite well in the PBL region for which they had been initially designed. Considering the vertical structure of model errors, the bias clearly shows a C-shaped profile with a minimum in the middle troposphere (∼5-6 km height) of about −20 %. RMSE exhibits increasing values from about 20 % in the PBL to about 55 % at 10 km height, which correlate with the vertical gradient of ozone variability. Correlation also follows a kind of C-shape but with a minimum (0.4) at about 8-9 km height.
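The argument that larger variability can raise RMSE even when correlation improves can be made explicit with the standard decomposition of the mean square error, MSE = bias² + σ_m² + σ_o² − 2 σ_m σ_o R, where σ_m and σ_o are the model and observation standard deviations. The sketch below checks this identity numerically. It is a generic illustration with a hypothetical function name, not part of the evaluation code of the study.

```python
import numpy as np

def mse_terms(model, obs):
    """Split the mean square error into a bias part and a centred part.

    MSE = bias**2 + (s_m**2 + s_o**2 - 2*s_m*s_o*r), using population
    standard deviations (ddof=0) for s_m and s_o.
    """
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    bias = model.mean() - obs.mean()
    s_m, s_o = model.std(), obs.std()
    r = np.corrcoef(model, obs)[0, 1]
    return {
        "mse": float(np.mean((model - obs) ** 2)),
        "bias2": float(bias ** 2),
        "centred": float(s_m ** 2 + s_o ** 2 - 2.0 * s_m * s_o * r),
        "r": float(r),
    }
```

If the anomalies of both series are doubled, R is unchanged while the centred part of the MSE is multiplied by four, so the centred RMSE doubles. This is the behaviour described above for the upper troposphere, where ozone variability is largest.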

Analysis of differences between models
Figure 3 also illustrates that discrepancies exist between the models themselves, even when global meteorological forcing, anthropogenic emissions (at least for RAQ models) and chemical boundary conditions are similar, as is the case for the subset of models using IFS-MOZART as boundary conditions (black curve and associated bars in Fig. 3). Nevertheless, various formulations (chemistry, transport etc.), forcings (natural emissions etc.) and numerical set-ups (horizontal and vertical resolutions) remain different between the models. A weak dispersion in model results for biases and RMSE is observed in the PBL, where ozone concentrations are strongly controlled by emissions (anthropogenic and natural), turbulent and horizontal transport and photochemistry. This indicates that differences in these processes likely do not induce large differences between models. This idea is reinforced by the fact that differences between models increase with height, where the influence of these "PBL" processes decreases (Fig. 3). As net ozone production due to photochemistry in the middle and upper troposphere is expected to be weak during the residence time of air masses in the regional model domain (several days), we suspect that discrepancies between models are induced by the (horizontal and vertical) advection scheme and by the horizontal and vertical resolution. The way the top boundary is handled can also be an issue. Sensitivity tests for short periods (ten days) performed with the CHIMERE model have shown that discrepancies can occur when using different advection schemes (a simple first-order upwind scheme; the Van Leer second-order scheme (Van Leer et al., 1979) used in the reference run; the PPM (Piecewise Parabolic Method) third-order scheme (Colella and Woodward, 1984)). Differences are larger when winds are stronger (i.e. at high altitudes and latitudes) but remain generally weak (a few ppb). Thus this error source does not explain the observed model-to-model differences. 
Differences due to horizontal resolution will be discussed below.
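To illustrate why the choice of advection scheme matters for sharp ozone structures, the following minimal sketch compares a first-order upwind step with a second-order flux-limited step using the van Leer limiter. It is an idealised 1-D periodic test with constant wind, not the CHIMERE implementation; the grid size, Courant number, ozone values and all function names are assumptions chosen for illustration only.

```python
import numpy as np

def upwind_step(q, c):
    """One first-order upwind step for a positive wind; c is the Courant number."""
    return q - c * (q - np.roll(q, 1))

def van_leer_step(q, c):
    """One second-order flux-limited step using the van Leer limiter (positive wind)."""
    dq_up = q - np.roll(q, 1)       # q[i] - q[i-1]
    dq_dn = np.roll(q, -1) - q      # q[i+1] - q[i]
    # Ratio of consecutive gradients, guarded against division by zero
    safe = np.where(dq_dn != 0.0, dq_dn, 1.0)
    r = np.where(dq_dn != 0.0, dq_up / safe, 0.0)
    phi = (r + np.abs(r)) / (1.0 + np.abs(r))   # van Leer limiter
    # Flux through the right face of cell i, scaled by dt/dx
    flux = c * (q + 0.5 * (1.0 - c) * phi * dq_dn)
    return q - (flux - np.roll(flux, 1))

def advect(q0, c, n_steps, step):
    """Apply the given step function n_steps times on a periodic domain."""
    q = q0.copy()
    for _ in range(n_steps):
        q = step(q, c)
    return q

# Hypothetical test case: a 100 ppb ozone layer on a 40 ppb background
n = 100
x = np.arange(n)
q0 = 40.0 + 60.0 * ((x >= 20) & (x < 40))
c = 0.5
n_steps = 2 * n   # n / c steps: one full revolution of the periodic domain

q_up = advect(q0, c, n_steps, upwind_step)
q_vl = advect(q0, c, n_steps, van_leer_step)

# Mean absolute error (ppb) against the exact solution, which equals q0
err_up = np.abs(q_up - q0).mean()
err_vl = np.abs(q_vl - q0).mean()
print(f"upwind error: {err_up:.2f} ppb, van Leer error: {err_vl:.2f} ppb")
```

Both schemes conserve mass on the periodic grid. The upwind scheme smears the layer through numerical diffusion, whereas the limited second-order scheme keeps the gradients sharper without creating new extrema.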
Differences could also arise from vertical transport, through differences in vertical advection schemes (different for each model), in the treatment of top boundary conditions and in vertical resolution. The way vertical velocities are obtained, either diagnosed from the continuity equation (CAMx, CHIMERE) or taken as a direct output of meteorological models (BOLCHEM, EURAD, IFS-MOZART), could also play a role. All models except CAMx (monthly mean from IFS-MOZART) use hourly IFS-MOZART as top conditions but with different top levels: 300 hPa for CAMx, 200 hPa for BOLCHEM and CHIMERE and 100 hPa for EURAD. For example, changing the top of the CHIMERE model from 200 hPa to 150 hPa (i.e. using 18 vertical levels instead of 17) induces differences (that grow with altitude) of +10 ppb for the 9-12 km layer, with 95 % of the values included between 0 and 20 ppb. Thus the choice of the level of the model top boundary could have some impact on model errors in the upper troposphere.
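For reference, the diagnostic approach mentioned above can be written generically as follows. This is the textbook form of the mass continuity equation integrated from the surface, assuming a vanishing vertical mass flux at the lower boundary; the exact discretisation used in CAMx and CHIMERE is not given here, and the symbols ($\rho$ for air density, $\mathbf{v}_h$ for horizontal wind, $w$ for vertical velocity) are introduced only for this illustration.

```latex
\frac{\partial \rho}{\partial t}
  + \nabla_h \cdot (\rho\, \mathbf{v}_h)
  + \frac{\partial (\rho w)}{\partial z} = 0
\quad\Longrightarrow\quad
\rho(z)\, w(z) = -\int_{0}^{z}
  \left[ \frac{\partial \rho}{\partial t}
  + \nabla_h \cdot (\rho\, \mathbf{v}_h) \right] \mathrm{d}z'
```

Because errors in the horizontal divergence accumulate in the integral, a diagnosed $w$ can differ most from the vertical velocity of the meteorological model at upper levels.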
We note that differences in correlations are more constant throughout the troposphere, with especially weak differences in the upper troposphere (contrary to bias), where all models probably follow IFS-MOZART. Last, it is interesting to discuss whether RCTM simulations with horizontal resolutions between 0.2° and 0.5° show larger correlation coefficients than the global IFS-MOZART model with nearly 2° horizontal resolution. Indeed, near the surface, the global model shows a lower correlation coefficient (0.67) than the regional models (0.69-0.78), making evident the benefit of higher resolution for simulating PBL photochemical ozone build-up. However, from 2 km height upwards, the IFS-MOZART correlation coefficient is close to the median one (not shown), so in the middle troposphere improved resolution does not necessarily result in better ozone simulations. As expected, the IFS-MOZART model exhibits (not shown) a lower variability than the observations and the RCTMs over the whole troposphere.
In conclusion, model-to-model differences are most pronounced in the upper troposphere. A large variety of model settings could be responsible for errors, in particular those related to transport processes. Some of them could be tested within CHIMERE, for instance the impact of the horizontal advection scheme (minor) or of the choice of top boundary (potentially contributing to part of the errors for the case of CHIMERE), but a final explanation for the model-to-model differences could not be achieved in this work.

Impact of chemical boundary conditions
By construction, limited-area models need to be provided at their boundaries with concentrations of long-lived (CO, O3 etc.) and shorter-lived pollutants (such as NOx). The impact of using different boundary conditions on regional model results is analysed here. As described previously, it is common to use large-scale climatologies to prescribe top and boundary conditions of RCTMs to avoid the setup of more complicated combined global-regional modelling chains. One of the achievements of the FP6 GEMS project was setting up this type of system, in which global models provide hourly chemical boundary conditions (BC) to regional models. Szopa et al. (2009) have shown that the impact of improving BC variability (use of daily instead of monthly BC) on surface ozone concentrations remains limited (less than 5 %) in the centre of the regional European modelling domain. Nevertheless, the authors did not evaluate the impact on middle tropospheric ozone concentrations. Here, we use our ensemble of different RCTMs with different forcings from GCTM output or from climatologies. Two different types of boundary forcings are evaluated: (1) hourly forcing from a GCTM other than IFS-MOZART (namely MOCAGE); (2) the IFS-MOZART climatology in comparison to hourly forcing for one of the RAQ models (namely the CHIMERE model). Their impact is evaluated over the whole tropospheric height range using the in situ measurements presented earlier.
First, we have evaluated the impact of using climatological boundary conditions instead of hourly ones. To do so, we have simulated the whole period with the CHIMERE model using the monthly averaged values of the IFS-MOZART model instead of the hourly values and compared both model configurations to observations. As expected, both produce quite similar results in terms of biases with differences never exceeding 4 % (Fig. 3). For RMSE, differences increase but remain quite small reaching about 5 % at 9 km height. This height dependence is explained by a smaller influence of BC in the PBL due to local forcings, and by a temporal ozone variability increase with height in the middle and upper troposphere. For correlations, differences are more systematic and the version with hourly IFS-MOZART BC is always better. Differences in the correlation coefficient are more significant above 3 km height, increasing from 0.03 to more than 0.2 at 10 km height. This indicates that temporal variations are better reproduced by the hourly BC than the monthly ones.
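The three scores used in this comparison can be sketched as follows for one height layer. The exact normalisation used in this study is not restated in this section, so the definitions below (bias and RMSE normalised by the mean observed value) are an assumption, and the function name and sample values are hypothetical.

```python
import numpy as np

def layer_scores(model, obs):
    """Relative bias (%), relative RMSE (%) and Pearson R for one height layer.

    Assumed definitions: bias and RMSE are normalised by the mean observation.
    """
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    diff = model - obs
    mean_obs = obs.mean()
    bias_pct = 100.0 * diff.mean() / mean_obs
    rmse_pct = 100.0 * np.sqrt((diff ** 2).mean()) / mean_obs
    r = np.corrcoef(model, obs)[0, 1]
    return bias_pct, rmse_pct, r

# Hypothetical ozone values (ppb): a model that is uniformly 20 % too low
obs = np.array([50.0, 60.0, 70.0, 80.0, 90.0])
model = 0.8 * obs
bias, rmse, r = layer_scores(model, obs)
print(f"bias = {bias:.1f} %, RMSE = {rmse:.1f} %, R = {r:.2f}")
```

The example shows why the scores are complementary: a purely multiplicative error gives a large bias and RMSE while the correlation stays perfect, whereas replacing hourly by monthly boundary conditions mainly degrades the correlation.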
Second, we compared results of the MOCAGE model, which uses its own BC in a nested global-regional simulation, to those obtained with RAQ models forced by IFS-MOZART fields. For MOCAGE, a positive mean bias (up to 20 % at the surface) is observed (Fig. 3). This result is in agreement with a parallel study of Ordonez et al. (2010). In the middle troposphere, the bias remains positive until about 4 km height, and then becomes neutral or negative above this altitude. The MOCAGE RMSE is larger than that of other models in the PBL (by about 40 % at the surface) and becomes lower than that of other models between 3 and 9 km. Except in the PBL, correlations (Fig. 3) are similar to those of other models. Differences between MOCAGE and the median of IFS-MOZART driven models are indeed due to different boundary conditions. This can be deduced from the fact that simulations differ for sites at the western edge of the domain (i.e. Barajas, Valentia; figure not shown), which are strongly influenced by boundary conditions. In addition, other differences in the model set-up (Table 2) can add to these differences.
As a conclusion of these comparisons, we find an improvement of middle tropospheric ozone simulations when passing from climatological ozone boundary conditions to hourly ones, although the benefit for boundary layer ozone predictions is rather small. This is an important finding of the GEMS project. It justifies the systematic coupling of global and regional models, if the aim is a consistent description of regional scale tropospheric ozone. However, if the aim is restricted to a prediction of boundary layer ozone only, this coupling is not mandatory and use of climatological boundary conditions for ozone seems sufficient (at least for the case of the CHIMERE model). BC from different global models can significantly affect vertical profiles at regional scales from the ground to the UT.
Atmos. Chem. Phys., 12, 3219-3240, 2012 www.atmos-chem-phys.net/12/3219/2012/
Fig. 4. 0-6 km smoothed ozone partial columns (in Dobson Units) averaged over the summer 2008 (JJA). The IASI columns are calculated from observations corresponding to the morning passage of the satellite. The "IASI counts" map indicates the number of (non cloudy) measurement days available during the period.

Comparisons between models and IASI 0-6 km columns
As a complementary data source for model evaluation, we use satellite observations obtained with the IASI instrument. As previously mentioned (Sect. 2.2), it is possible to derive 0-6 km tropospheric ozone partial columns from these measurements with good accuracy. Even though such observations still give limited vertical information (especially compared to those from sondes and aircraft), they are attractive because of their large spatial coverage (two complete ozone fields per day under cloud free conditions). It should be noted that results of the CAMx model are not included in the comparison due to its lower model top (6 km height). Inspection of the averaging kernels shows that ozone values above 6 km height contribute to the retrieved 0-6 km columns, which makes it necessary to have simulations with a model top above 6 km. Figure 4 shows average IASI 0-6 km ozone partial columns for summer 2008 (June to August) interpolated on the model grid. The average is calculated using the morning observations, which are more sensitive than the evening ones. All IASI pixels available (up to five) within one model grid cell with 0.5° horizontal resolution are averaged to obtain a daily value. Individual profiles are smoothed using the averaging kernels to remove the a priori information (see Sect. 2.2 and Eremenko et al., 2008). The number of available "days" per grid cell (for the whole summer) is shown in Fig. 4. Indeed, pixels that do not fulfil the quality check (cloudy for example) are systematically discarded in the retrieval procedure (Eremenko et al., 2008). The number of available pixels is often less than 2/3 above 55° N, and generally less than 50 % over the Scandinavian Peninsula (Fig. 4). For the southern part of the domain, areas with low surface emissivity like desert areas (Maghreb, Southern Spain or even Turkey) are also poorly covered.
For such regions, strong aerosol loading (dust) as well as the presence of cirrostratus along the subtropical jet stream can also alter the measured radiances and thus reduce the number of retrieved pixels. 23-26 Dobson Unit (DU). Indeed, during summer, persistent anticyclonic (and subsident) conditions associated with strong photochemistry and low deposition rates are observed over the Mediterranean basin (Lelieveld et al., 2002; Foret et al., 2009). Such conditions favour the persistence of high ozone levels throughout the troposphere over this region. It should be noted that, due to higher surface temperatures (and hence higher thermal contrast between ground and surface air masses) in the southern part of the European domain, partial columns observed over this area are probably more sensitive to ozone concentrations at lower tropospheric altitudes. Strong horizontal gradients are often observed between land and marine surfaces, for which surface temperature (and thus the sensitivity of the observations to ozone), but also orography, are significantly different. The potential impact of these features on the gradient is not yet clear and should be further investigated. Over elevated or mountainous areas, ozone values are smaller since the thickness of the atmospheric partial columns taken into account is reduced. Thus the signatures of European mountains and/or plateaus (Meseta plateau, Pyrenees, Massif Central, Alps, Scandinavian and Dinaric Alps, Carpathian and Balkan mountains, Anatolian plateau) are visible in Fig. 4. We also note high ozone values over the Black Sea, Bulgaria, Romania, Moldavia and Ukraine, with maxima of about 25 DU.

Geographical distribution of summer averages
Corresponding smoothed columns were calculated from models for hours with available observations. Models driven by IFS-MOZART BC qualitatively exhibit a similar, albeit less pronounced, north/northwest-south/southeast gradient as IASI (Fig. 4). Minimum values over Scandinavia in IASI observations, and maximum values over the eastern Mediterranean basin, are reproduced by most of the models. Differences between the model median and IASI partial columns (Fig. 6a) exhibit a latitude dependence, with an overall model underestimation south of 60° N of about 2 to 4 DU (∼10 to 20 %) and little bias (< 1 DU) north of 60° N. These results (i.e. negative bias) are well in line with the negative bias observed in the comparisons between models and vertical profiles (cf. Sect. 4.1). Discrepancies are larger over Spain and especially the Maghreb, regions with weaker data coverage due to soil particularities (i.e. low emissivity) and, potentially, to the presence of airborne mineral dust. Also, over the northern coast of the Black Sea and more generally over the south-eastern part of the domain (near Romania), models underestimate the ozone maxima observed by IASI by about 6 DU. This value is still within the range of uncertainty of models (about 2 DU as seen from model dispersion in Fig. 4) and observations (10 to 20 %, about 2.5 DU).
Note that the 0-6 km partial columns of models without vertical smoothing (hereafter called "raw" columns) show a clear north-south gradient (Fig. 5). In the case of smoothed columns, differences between models themselves and/or IASI are less representative of the surface (due to the weak sensitivity of satellite observations to the surface and the use of a common a priori that dominates lowest levels) but integrate to some extent information of the upper troposphere as seen from the averaging kernels (Fig. 2).
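The difference between "raw" and smoothed columns can be sketched with the usual averaging-kernel smoothing, x_smooth = x_a + A (x_model − x_a), followed by summation of the layers below 6 km. The retrieval set-up is described in Sect. 2.2 and Eremenko et al. (2008); the kernel matrix, a priori profile, layer grid and function names below are synthetic assumptions used only to show the mechanism, not the actual IASI kernels.

```python
import numpy as np

def smooth_profile(x_model, x_apriori, ak):
    """Apply an averaging kernel matrix to a model profile (layer partial columns)."""
    return x_apriori + ak @ (x_model - x_apriori)

def partial_column(layer_columns_du, layer_tops_km, top_km=6.0):
    """Sum the layer partial columns (DU) whose top lies at or below top_km."""
    return layer_columns_du[layer_tops_km <= top_km].sum()

# Hypothetical grid: twelve 1-km layers from the surface to 12 km
n = 12
idx = np.arange(n)
layer_tops_km = idx + 1.0

# Synthetic broad kernel (assumption): each row spreads over several kilometres
ak = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / 2.0) ** 2)
ak /= ak.sum(axis=1, keepdims=True)

x_apriori = np.full(n, 3.0)      # DU per layer, illustrative values
x_model = x_apriori.copy()
x_model[8:] += 2.0               # ozone enhancement confined to 8-12 km

raw_col = partial_column(x_model, layer_tops_km)
smooth_col = partial_column(smooth_profile(x_model, x_apriori, ak), layer_tops_km)
print(f"raw 0-6 km: {raw_col:.2f} DU, smoothed 0-6 km: {smooth_col:.2f} DU")
```

In this synthetic case the raw 0-6 km column is unaffected by the enhancement above 8 km, while the smoothed column increases, which mirrors the statement that smoothed columns integrate some upper tropospheric information.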
As expected from comparisons of models to sondes and aircraft, models using different chemical boundary conditions exhibit different behaviour. Figure 5 showing "raw" ozone columns confirms the positive bias of the MOCAGE model against other models below 6 km height as already shown by comparisons with in situ measurements. Comparisons with IASI (of the smoothed columns, Fig. 6b) show that MOCAGE performs well over the southern area of Europe but exhibits a positive bias over northern Europe, of more than 4 DU (∼20 %).
It is interesting to note that comparisons between models and in situ measurements are fully consistent with comparisons between models and IASI: the median of models shows a negative bias with respect to middle tropospheric ozone from in situ vertical profiles; this is confirmed by the comparison with 0-6 km IASI columns, which indeed are most sensitive to free tropospheric ozone.
Table 4. The domain has been divided into four quadrants: NW (North-West), NE (North-East), SW (South-West) and SE (South-East). The temporal evolution of 0-6 km ozone partial columns from IASI and models is compared (for each quadrant) in terms of relative bias and Pearson's correlation. MEDIAN-IFS stands for the median of models using IFS-MOZART as boundary condition. CHIMERE-IFS is one of these models and is compared with CHIMERE-CLIM, which uses the monthly mean of the IFS-MOZART hourly values as BC. Biases are expressed in DU.
The same as (a): MOCAGE minus IASI.

Summer ozone variability
As IASI inversions are available once per day from morning observations (under cloud free conditions), it is interesting to compare their temporal evolution over a summer season (here summer 2008) to the modelled evolution. Figure 7 shows this variability, expressed again as the smoothed 0-6 km partial columns and averaged over four model sub-domains that correspond to the NW, NE, SW and SE quadrants of the model domain. IASI daily (morning) observations are compared to the median of the models using IFS-MOZART as BC and to the MOCAGE model. Both IASI and the models reproduce the seasonal variability quite well. This feature seems well in line with the expected slow decrease of ozone during the summer that follows the spring maximum (Monks, 2000), as observed at some remote stations in Europe (Chevalier et al., 2007; Gilge et al., 2010). Considering the median, as expected, a stronger negative bias is observed for the southern part of the domain (−16 %) compared with −8 % (NW) and −5 % (NE) in the north (Table 4), in line with the latitude dependence of biases discussed before. It should be noted here that this bias is quite systematic (Fig. 7). Time correlations are relatively high, ranging from 0.74 for the NE to 0.63 for the SW sector (Table 4), indicating a good model ability to reproduce processes controlling regional scale ozone variability (either from BC or inside the domain itself). Also, we notice that correlations are systematically better in the eastern part than in the western part at the same latitude, where BC have less influence on the simulated concentrations. The dispersion of the ensemble is also plotted (Fig. 7) as the difference between the maximum and the minimum value of the ensemble for each day. In the northern part, the IASI observations are close to or inside the range of model variability, while in the southern part of the domain, where biases are larger, they are almost systematically larger than the maximum model values.
We notice that the mean dispersion of the ensemble is smaller in the western part of the domain than in the eastern part at the same latitude (2.7 DU in the NW against 3.5 DU in the NE, and 1.8 DU in the SW against 3 DU in the SE). This is likely related to the use of common BC, which have decreasing influence on simulated concentrations toward the east. As expected from previous sections, the MOCAGE model (with its own boundary conditions) exhibits higher values in the lower free troposphere, leading to a positive bias in the north (up to 10 %) and a weaker negative bias in the south (less than 5 %). The correlation remains good but is slightly lower than that of the median of the IFS-MOZART driven models. Also, from Table 4, it is confirmed that the use of hourly BC instead of monthly averages largely improves the correlations for the case of the CHIMERE model across the whole domain.
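The quantities discussed for each quadrant (ensemble median, daily max-min dispersion, relative bias and temporal correlation against IASI) can be computed along the following lines. This is a sketch under stated assumptions: the array layout, the use of NaN to mark days without valid IASI data, the function name and the numbers are all hypothetical.

```python
import numpy as np

def ensemble_summary(models, obs):
    """Summarise an ensemble against daily observations for one sub-domain.

    models: array (n_models, n_days) of smoothed 0-6 km columns in DU
    obs:    array (n_days,) of observed columns in DU, NaN where unavailable
    """
    valid = ~np.isnan(obs)
    m = models[:, valid]
    o = obs[valid]
    median = np.median(m, axis=0)
    spread = m.max(axis=0) - m.min(axis=0)   # daily max minus min of the ensemble
    return {
        "bias_pct": 100.0 * (median.mean() - o.mean()) / o.mean(),
        "correlation": np.corrcoef(median, o)[0, 1],
        "mean_spread_du": spread.mean(),
    }

# Hypothetical five-day series (DU); day 3 has no valid observation
base = np.array([20.0, 21.0, 21.5, 22.0, 23.0])
obs = base.copy()
obs[2] = np.nan
models = np.vstack([base - 3.0, base - 2.0, base - 1.0])

summary = ensemble_summary(models, obs)
print(summary)
```

Restricting all statistics to days with valid observations keeps the model sampling consistent with the cloud-free sampling of the satellite.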
In conclusion, the comparison between models and IASI shows that models qualitatively reproduce the observed lower tropospheric continental scale N/NW-S/SE gradient. Also the temporal variability of the columns at large geographical scales (1500-2000 km) is well reproduced (correlations between IASI and the model's median in the range 0.63-0.74). These correlation coefficients are larger than those obtained from the comparison between simulations and in situ ozone profiles in the free troposphere. This is consistent with the fact that for these comparisons point measurements are used (with respect to spatial averages for the case of IASI observations).

Case study of large ozone gradients in relation with an upper tropospheric wave
Since the IASI instrument is on board the MetOp platform, which samples the European domain daily, it is conceptually possible to track specific ozone events. In particular, it is interesting to evaluate whether models as well as IASI can reproduce large ozone gradients. To illustrate this point, we have focused our analysis on a case study of an upper tropospheric wave inducing a large variability in tropopause height and upper tropospheric ozone values. From 8 to 11 June 2008, the median of raw simulated 0-6 km column fields shows, to a varying degree, very prominent spatial features (Fig. 8, lower panel). A zone with enhanced ozone columns extends from Southern Norway to Northern Spain (also observed in the time series presented in Fig. 7). In particular, spatial gradients at the western edge of this zone are very pronounced. Corresponding smoothed column fields show similar features, although the spatial structure is less apparent because only cloud free pixels for which IASI observations are also available are presented (Fig. 8, middle panel). For 8 June, spatial structures for smoothed simulated columns and IASI observations (Fig. 8, upper panel) coincide rather well; the region of strong spatial gradients is only slightly shifted towards the south in IASI observations with respect to models (from North Sea to the North Sea coast). Observed and simulated spatial gradients coincide even better for 9 June, the steepest gradients being located at the German North Sea and the French Channel coasts. For 10 June, the correspondence is again very good, the steepest gradient zone being shifted about 100 km to the south.
We now need to explain why models (here represented by their median) show such similar spatial structures during this period, and moreover correspond very well with IASI observations. The potential vorticity (PV) contour map (figure not shown) at the 330 K potential temperature level (corresponding to about 12 km height) for 8 June shows a pronounced wave structure over Europe with a ridge over the British Isles (with low tropospheric PV values, below 1 PVU), and a trough covering a large part of Western, Southern and Central Europe (with large stratospheric PV values, above 3 PVU). The region with strongest PV gradients follows the Channel and North Sea coast from France to Denmark. Its NE-SW orientation and location correspond to the strong gradients in the IASI ozone column fields observed for this day (Fig. 9). The day to day evolution of IASI partial ozone columns and 330 K PV maps in the following days is also correlated. This close coincidence of spatial structures suggests that variations in IASI and modelled partial ozone columns are caused by the upper tropospheric wave structure. It is well known that upper tropospheric ozone and PV are well correlated (for example, Danielsen, 1968; Beekmann et al., 1994). A vertical cross section through the upper tropospheric front along 51° N (Fig. 10) shows enhanced ozone values in the 4-10 km height region in the trough region (> 60 ppb), compared to the ridge region (< 40-50 ppb). Note that IASI observations are shown for specific altitudes (in km steps), but their implicit vertical resolution is several kilometres. The picture in Fig. 10 is consistent with the spatial distribution in Fig. 8 when considering that, due to the vertical sensitivity of IASI measurements (cf. averaging kernels in Fig. 2), the large ozone values in the 4-10 km region have a strong impact on the smoothed 0-6 km partial columns.
Enhanced ozone values in the 4 to 8 km height range (between 60 ppb and 100 ppb) are also observed in a MOZAIC profile recorded from Frankfurt airport within the trough region on 8 June at 06:45 UTC. The coincidence of the enhanced ozone region with low CO and low relative humidity indicates subsident motion from the tropopause region to the free troposphere. This is confirmed by Lagrangian particle simulations with the FLEXPART model (Stohl et al., 2005). For air masses arriving at Frankfurt on 8 June between 7 and 8 km altitude, they show subsiding anticyclonic motion of the retro-plume during the last three days, and indicate a significant fraction of air of stratospheric origin (from PV analysis). Nearly all models show strongly enhanced ozone values in the 4-10 km height region in the 51° N cross section east of 10° W (Fig. 10). For most of them, ozone values in this region are somewhat larger than those observed by IASI. Differences induced by the use of a monthly mean climatology (CHIMERE2) instead of hourly values (CHIMERE1) are small for this case. Note that simulated fields in Fig. 10 are again smoothed in order to be comparable to IASI observations. Thus, in conclusion, the agreement in both the vertical and the horizontal distribution between observed and simulated 0-6 km partial ozone columns is striking (Fig. 8), especially for the gradient zone between the ridge and the trough regions. Apparently, the deep trough associated with low tropopause and high ozone values is well represented in the IFS meteorological fields which are used by all models as input (either directly for the CTM or as large scale or boundary values for the mesoscale meteorological simulations). This case study illustrates the possibility of using IASI observations to evaluate CTM behaviour for cases of strong ozone gradients related to upper tropospheric wave structures.

Conclusions
The 3-D evaluation of an ensemble of RCTMs simulating tropospheric ozone concentrations over Europe is presented here. Several models have simulated ozone concentrations over an entire summer period (June to August 2008) in the context of the GEMS-RAQ project. Among those, five state of the art RCTMs and the IFS-MOZART system have participated in this evaluation exercise. A large set of observed vertical ozone profiles, from either sondes or commercial aircraft, has been used for this evaluation, in addition to satellite derived partial columns. The data set used comprises about 400 vertical profiles at 11 different locations. The model skill in representing PBL ozone concentrations is in the range of values observed in earlier studies using surface ozone measurements (we have calculated a relative bias of 4 %, an RMSE of 24 % and a correlation of 0.77).
In the middle troposphere height region (> 2-8 km), models using the same hourly top and boundary conditions from IFS-MOZART exhibit a systematic negative bias of about −20 %. This feature is commonly observed in global scale CTMs and is not yet fully understood. RMSE values grow steadily with altitude, both in an absolute and a relative sense (from 32 % to 53 %, respectively, in the > 2-8 km and the > 8-10 km height ranges). The largest values in the UT are thought to be associated with the difficulty for models to fully capture the variability of troposphere-stratosphere exchange processes, or simply the height variation of the tropopause, although the large correlation in the UT indicates that the basic processes governing ozone variability are taken into account. Correlation in the middle troposphere was found to be low, with minimum values of 0.2 to 0.45 near 8 km. Apparently, the processes forcing ozone variability are not well captured by models in this height range. If long range transport of ozone contributes significantly to this variability, it is understandable that plume positions could not be easily predicted at several thousand kilometres distance from the sources. But misrepresentation of ozone chemistry as well as of stratospheric intrusions upwind of Europe could also explain model errors. We also note that bias and RMSE are lowest in the PBL (together with satisfactory correlations), showing a better model capacity to reproduce ozone concentrations in the part of the atmosphere for which these RAQ models were originally designed. We can also add that differences between models inside the domain are observed (generally increasing with altitude), especially for bias, in spite of common meteorology and chemical boundary conditions. In the part of the troposphere where surface processes like emissions and fast chemistry have a weak influence, transport processes are most likely responsible for these differences.
However, due to the multitude of different settings within the models tested, the exact sources of model-to-model discrepancies could not be determined. During this exercise, the impact of using different chemical BC has also been investigated. Two ways of prescribing BC have been tested: variable BC using hourly forecasts from either IFS-MOZART or the global CTM MOCAGE, and a monthly climatology calculated from IFS-MOZART instead of hourly values. It has been shown that the use of hourly (forecast) instead of monthly (climatology) BC generally improves the skill of one model to a certain extent (for example, the correlation in the 5-8 km height region increases from 0.2-0.3 to 0.4 when hourly BC are used with CHIMERE). Larger differences between models are observed when different CTMs are used to produce BC (as is the case for IFS-MOZART and MOCAGE, even if other settings are also different for MOCAGE).
Another goal of the paper was to compare models against satellite data, in particular partial ozone columns (0-6 km) calculated from IASI observations. The IASI sounder is a thermal infrared instrument that allows estimating tropospheric ozone concentrations (mainly in the free troposphere) at twice daily frequency over Europe. It thus allows the identification of geographical patterns of the tropospheric ozone distribution and their temporal variations. We observed an overall good agreement between IASI and models over the summer 2008 period, with differences generally lower than 20 % for the median of models. In particular, IASI observations of minimum values over Scandinavia and maximum values over the eastern Mediterranean basin are reproduced by most of the models. South of 60° N, a negative model bias is observed, well in line with comparisons between vertical profiles and models. Temporal variability in lower tropospheric ozone values during summer 2008 is also well reproduced by models (result obtained for IASI-model comparisons averaged over model sub-domains).
Finally, a case of a multiday upper tropospheric wave generating strong ozone gradients was observed by these satellite data, confirmed by a MOZAIC profile and meteorological analysis, and well reproduced by models. In particular, both IASI and models were able to resolve strong horizontal gradients in middle and upper tropospheric ozone occurring in the vicinity of the upper tropospheric frontal zone. This shows the potential of IASI observations for investigating the upper tropospheric ozone distribution. Ideally, these features should not only be analysed in 0-6 km partial ozone columns, which were the basis of this study, but also in 0-12 km or 6-12 km partial columns. During the summer 2008 period studied, no major photochemical ozone pollution event suitable for a case study occurred.
As a final general conclusion, it is shown in this paper that a combination of high resolution vertical ozone profiles at a limited number of sites and satellite observations with good spatial coverage, but low vertical resolution, allows for a thorough evaluation of tropospheric ozone simulations at various temporal scales (seasonal, case study). Within the framework of the GMES program (and its FP6/GEMS and FP7/MACC projects), this work also shows the ability of a combined system of vertical profile observations, satellite observations and model simulations to represent the free tropospheric vertical ozone distribution with a defined uncertainty, and to make evident key processes affecting its variability.