Evaluation of a regional air quality model using satellite column NO2: treatment of observation errors and model boundary conditions and emissions

We compare tropospheric column NO2 between the UK Met Office operational Air Quality in the Unified Model (AQUM) and satellite observations from the Ozone Monitoring Instrument (OMI) for 2006. Column NO2 retrievals from satellite instruments are prone to large uncertainty from random, systematic and smoothing errors. We present an algorithm to reduce the random error of time-averaged observations, once smoothing errors have been removed by applying satellite averaging kernels to the model data. This reduces the total error in seasonal mean columns by 10–70 %, which allows critical evaluation of the model. The standard AQUM configuration evaluated here uses chemical lateral boundary conditions (LBCs) from the GEMS (Global and regional Earth-system Monitoring using Satellite and in situ data) reanalysis. In summer the standard AQUM overestimates column NO2 in northern England and Scotland, but underestimates it over continental Europe. In winter, the model overestimates column NO2 across the domain. We show that heterogeneous hydrolysis of N2O5, which is missing from the standard AQUM, is a significant sink of column NO2 and that introducing this process corrects some of the winter biases. The sensitivity of AQUM summer column NO2 to different chemical LBCs and NOx emissions data sets is investigated. Using Monitoring Atmospheric Composition and Climate (MACC) LBCs increases AQUM O3 concentrations compared with the default GEMS LBCs. This enhances the NOx–O3 coupling, leading to increased AQUM column NO2 in both summer and winter and degrading the comparisons with OMI. Sensitivity experiments suggest of the


Introduction
Air quality has a major influence on the UK both socially and economically. Poor air quality results in approximately 50 000 premature deaths per year and an average reduction in life expectancy of 7-8 months (HoC, 2010). Air pollution health effects include lung disease and cancer, cardiovascular problems, asthma and eye irritation (WHO, 2011). In 2005, poor UK air quality cost GBP 8.5-20.2 billion (EUR 10.7-25.5 billion), and between 2007 and 2008 there were 74 000 asthma-related hospital admissions. Overall, these air quality asthma incidents cost society GBP 2.3 billion (EUR 2.9 billion) (HoC, 2010). Poor air quality associated with ozone concentrations over 40 ppbv can also significantly reduce crop yields (e.g. Hollaway et al., 2012). Therefore, regional models have been developed to predict hazardous levels of air pollution, to help inform the public and to allow local authorities to take action to reduce or accommodate the associated health risks and effects.
Air quality models have mainly been evaluated against surface observations, e.g. Savage et al. (2013). Recently such models have also been compared with satellite observations, taking advantage of the better spatial coverage despite the potentially large error of individual observations. In the past NO2 satellite data have been compared mainly with global atmospheric chemistry models (e.g. Velders et al., 2001; Lauer et al., 2002; Van Noije et al., 2006). More recently, other studies have used satellite data to evaluate models on a regional scale. Savage et al. (2008) investigated European tropospheric column NO2 interannual variability (IAV) during 1996-2000 by comparing GOME with the TOMCAT chemical transport model (CTM) (Monks et al., 2012). The best comparisons were found in the JFM and AMJ seasons, especially over western Europe. They also found that synoptic meteorology had more influence on NO2 IAV than NOx emissions did. Huijnen et al. (2010) compared Ozone Monitoring Instrument (OMI) tropospheric column NO2 against a European global-regional air quality model ensemble median for 2008-2009. The ensemble compared better with the OMI data than any individual model, with good agreement over the urban hotspots. Overall, the spread in the models was greater in summer (deviations from the mean OMI tropospheric column of 40-62 %) than in winter (20-34 %), owing to the more active NOx chemistry in summer and the differences in chemistry schemes among the contributing models. Several of the regional models successfully simulated the shipping lanes seen by OMI. Han et al. (2011) investigated tropospheric column NO2 over the Korean Peninsula through comparisons between OMI data and the Community Multi-scale Air Quality Model (CMAQ) (Foley et al., 2010). In summer, North and South Korea had similar column NO2 in both the model and the observations. In winter, South Korea, a more developed nation with greater infrastructure, had significantly greater NO2 concentrations than North Korea. Overall, CMAQ overestimated OMI NO2 concentrations by factors of 1.38-1.87 and 1. respectively.
Other studies investigating regional tropospheric column NO2 through model simulations and satellite observations include Blond et al. (2007), Boersma et al. (2009) and Curier et al. (2014). Blond et al. (2007) compared the CHIMERE 3-D CTM with SCIAMACHY column NO2 over western Europe; they found reasonable agreement, with winter and summer correlations of 0.79 and 0.82, respectively. Boersma et al. (2009) used the GEOS-Chem 3-D CTM to explain the seasonal cycle in SCIAMACHY and OMI column NO2 over Israeli cities, with larger photochemical loss of NO2 in summer than in winter. Curier et al. (2014) used a combination of OMI and the LOTOS-EUROS 3-D CTM to evaluate NOx trends, finding negative trends of 5-6 % per year over western Europe.
Published by Copernicus Publications on behalf of the European Geosciences Union. R. J. Pope et al.: Evaluation of a regional air quality model using satellite column NO2
The UK Met Office's Air Quality in the Unified Model (AQUM) is used for short operational chemical weather forecasts of UK air quality. Savage et al. (2013) performed the first evaluation of the AQUM operational forecast for the period May 2010-April 2011 by using surface O3, NO2 and particulate matter observations from the UK Automated Urban and Rural Network (AURN) (DEFRA, 2012). Among other model-observation metrics they used the mean bias (MB), root mean square error (RMSE), modified normalised mean bias (MNMB) and the fractional gross error (FGE) (Seigneur et al., 2000). See the Appendix for the definition of these metrics. Savage et al. (2013) found that AQUM overestimated O3 by 8.38 µg m^-3 (MNMB = 0.12), with a positive bias at urban sites but no systematic bias at rural sites. The model-observation correlation was reasonably high at 0.68. For NO2, there was a bias of −6.10 µg m^-3, a correlation of 0.57 and an MNMB of −0.26. At urban sites there was a large negative bias, while rural sites had marginal positive biases. The coarse resolution of AQUM (12 km) led to an underestimation at urban sites because the model NOx emissions are instantaneously spread over the entire grid box. The particulate matter (PM10) prediction skill was lower, with a correlation and bias of 0.52 and −9.17 µg m^-3, respectively.
The aim of this paper is to evaluate AQUM using satellite atmospheric trace gas observations. The Met Office has previously compared the skill of AQUM only against AURN surface measurements, which in the case of NO2 are not specific and include contributions from other oxidised nitrogen compounds (see Savage et al., 2013, and references therein). Therefore, for better spatial model-observation comparisons and to minimise the effect of measurement interferences, we use satellite observations over the UK. We focus on tropospheric column NO2 data from OMI for the summer (April-September) and winter (January-March, October-December) periods of 2006. Section 2 describes the OMI satellite data used and gives a detailed account of our error analysis, which determines how we can use satellite data to test AQUM. Section 3 describes AQUM and the model experiments performed. Results from the model-observation comparisons are given in Sect. 4. Section 5 presents our conclusions.

Satellite data
OMI is aboard NASA's EOS-Aura satellite and has an approximate London daytime overpass at 13:00 LT. It is a nadir-viewing instrument with pixel sizes of 16-23 km and 24-135 km along and across track, respectively, depending on the viewing zenith angle (Boersma et al., 2008). We have taken the DOMINO tropospheric column NO2 product, version 2.0, from the TEMIS (Tropospheric Emissions Monitoring Internet Service) website, http://www.temis.nl/airpollution/no2.html (Boersma et al., 2011a, b). We have binned NO2 swath data from 1 January to 31 December 2006 onto a daily 13:00 LT 0.25° × 0.25° grid between 43-63° N and 20° W-20° E. All satellite retrievals have been quality controlled, and retrievals/pixels with geometric cloud cover greater than 20 % and poor-quality data flags (flag = −1) were removed. The product uses the algorithm of Braak
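The gridding and quality control described in this section can be illustrated with a minimal sketch. It is not the processing code used for this study: the tuple layout of a pixel record and all function names are hypothetical, and each pixel is assigned to a grid cell by its centre only, which ignores the real pixel footprint. The grid limits, cell size, cloud threshold and flag value are those quoted in the text.

```python
import math

# Grid definition from the text: 0.25 degree cells over 43-63 N, 20 W-20 E.
LAT_MIN, LAT_MAX = 43.0, 63.0
LON_MIN, LON_MAX = -20.0, 20.0
CELL = 0.25


def passes_quality_control(cloud_fraction, flag):
    """Keep a pixel only if cloud cover is at most 20 % and the flag is not -1."""
    return cloud_fraction <= 0.20 and flag != -1


def grid_daily_columns(pixels):
    """Average quality-controlled pixel columns into 0.25 degree grid cells.

    `pixels` is an iterable of (lat, lon, column, cloud_fraction, flag)
    tuples. This record layout is hypothetical, not the DOMINO file format.
    Returns {(row, col): mean column} for cells holding at least one pixel.
    """
    sums, counts = {}, {}
    for lat, lon, column, cloud_fraction, flag in pixels:
        if not passes_quality_control(cloud_fraction, flag):
            continue
        if not (LAT_MIN <= lat < LAT_MAX and LON_MIN <= lon < LON_MAX):
            continue  # pixel centre lies outside the analysis domain
        # Simplification: assign the pixel to the cell containing its centre.
        row = int(math.floor((lat - LAT_MIN) / CELL))
        col = int(math.floor((lon - LON_MIN) / CELL))
        sums[(row, col)] = sums.get((row, col), 0.0) + column
        counts[(row, col)] = counts.get((row, col), 0) + 1
    return {cell: sums[cell] / counts[cell] for cell in sums}
```

Calling `grid_daily_columns` once per day of swath data would give the daily fields that are later averaged into seasonal means; an area-weighted assignment would be a more faithful treatment of the large across-track pixels.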

Satellite averaging kernels
Model transfer functions (MTFs), known as "averaging kernels" (AKs), allow for direct comparison between model column NO2 and satellite retrievals. This section describes how the AKs are applied to model vertical profiles and how they vary with season and location. Eskes and Boersma (2003) define the AK as the relationship between the retrieved quantities and the true distribution of the tracer (i.e. the vertical profile of a chemical species). In other words, the satellite instrument's capability to retrieve a quantity is a function of altitude. Since satellite retrievals and model vertical profiles are therefore not directly comparable, the AK is applied to the model data so that the sensitivity of the satellite is accounted for in the comparisons. The AK takes different forms for different retrieval methods. For the Differential Optical Absorption Spectroscopy (DOAS) method, the AK is a column vector, while in Optimal Estimation the AK is a matrix whose dimensions depend on the number of pressure levels in the retrieval process. The OMI retrievals use the DOAS technique, so the AK is a column vector. Following Huijnen et al. (2010) and the OMI documentation (Boersma et al., 2011a), the AKs are applied to the model as

$$ y = \mathbf{A} \cdot \mathbf{x}, \tag{1} $$

where $y$ is the total column, $\mathbf{A}$ is the AK and $\mathbf{x}$ is the vertical model profile. However, here the tropospheric column is needed:

$$ y_{\mathrm{trop}} = \mathbf{A}_{\mathrm{trop}} \cdot \mathbf{x}_{\mathrm{trop}}, \tag{2} $$

where $\mathbf{A}_{\mathrm{trop}}$ is

$$ \mathbf{A}_{\mathrm{trop}} = \mathbf{A} \cdot \frac{\mathrm{AMF}}{\mathrm{AMF}_{\mathrm{trop}}}, \tag{3} $$

AMF is the atmospheric air mass factor and $\mathrm{AMF}_{\mathrm{trop}}$ is the tropospheric air mass factor. For the OMI product, Huijnen et al. (2010) state that the AK tends to be lower than 1 in the lower troposphere (e.g. 0.2-0.7 up to 800 hPa) and greater than 1 in the mid-upper troposphere. Therefore, the OMI AKs reduce model NO2 subcolumns in the lower troposphere and increase them in the mid-upper troposphere (Huijnen et al., 2010).
Figure 1 shows example tropospheric AKs for summer and winter profiles over London (urban, higher column NO2) and Dartmoor (a rural area in southwest England, lower column NO2), coloured by their respective tropospheric AMFs. In the lower troposphere, for both seasons and locations, the tropospheric AKs range between about 0 and 1. However, in the mid-upper troposphere, the London tropospheric AKs tend to be greater than those over Dartmoor in both seasons. London tropospheric AKs are most pronounced in winter, with some over 8, while in summer they range between about 1 and 8. In both seasons, the tropospheric AMFs are largest (5-6) where the tropospheric AKs are in the lower range (0-1), and smaller (0-1.5) where the tropospheric AKs exceed 2. If the tropospheric AMF is small (i.e. near 0, suggesting that the majority of the NO2 is within the lower layers of the London boundary layer, where the tropospheric AKs are also small), then from Eq. (3), because the full atmospheric AKs naturally increase with altitude, the small tropospheric AMF returns larger tropospheric AKs. Also, in winter over London, the shallower boundary layer traps the larger winter emissions of NO2 closer to the surface. Therefore, the tropospheric AMF is smaller and the winter mid-upper tropospheric AKs are larger, as seen in Fig. 1. Over Dartmoor, the AKs show less seasonal variation and the majority range between about 1 and 6 in both summer and winter. The Dartmoor tropospheric AMF ranges over approximately 0-6 in both seasons, but shows no clear relationship with the tropospheric AKs.
The Dartmoor AKs tend to be lower than those of London, which could be a result of multiple factors: surface albedo, viewing geometry, cloud cover, etc. As data with cloud cover higher than 20 % are filtered out, and the viewing geometry of London and Dartmoor will vary depending on where OMI is in its orbit (both locations are at similar latitudes), we suggest that neither is the dominant cause of the AK differences. The surface albedo data in the satellite files are noisy and show no clear difference between London and Dartmoor. We suggest that the different NO2 loading between the locations is the primary factor in the AK differences. Belmonte Rivas et al. (2014) state that the AK depends on the scattering weighting function, the correction for the temperature sensitivity of the NO2 cross-section (both altitude dependent) and the AMF. The AMF is itself a function of the scattering weighting function, the temperature correction to the NO2 cross-section and an a priori vertical trace gas profile extracted from a CTM. In the case of OMI column NO2, this profile comes from TM4 calculations, which simulate higher NO2 loading over London than over Dartmoor. This can be seen in the TM4 simulations of Van Noije et al. (2006). Therefore, the AKs over London are larger than those over Dartmoor.
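As an illustration of how an averaging kernel is applied to a model profile, the sketch below scales the total-column kernel by AMF/AMF_trop and sums the weighted model subcolumns over the tropospheric layers. It assumes the model subcolumns have already been interpolated onto the kernel's pressure levels; the function names and the `n_trop_layers` argument are hypothetical and do not come from the OMI product files.

```python
def tropospheric_kernel(kernel, amf, amf_trop):
    """Scale the total-column kernel A to A_trop = A * AMF / AMF_trop."""
    return [a * amf / amf_trop for a in kernel]


def model_column_with_kernel(kernel, amf, amf_trop, subcolumns, n_trop_layers):
    """Tropospheric column the satellite would report for a model profile.

    `kernel` and `subcolumns` are ordered from the surface upwards and share
    the same levels (an assumption of this sketch). Only the lowest
    `n_trop_layers` layers, i.e. those below the tropopause, are summed.
    """
    a_trop = tropospheric_kernel(kernel, amf, amf_trop)
    return sum(a * x for a, x in zip(a_trop[:n_trop_layers],
                                     subcolumns[:n_trop_layers]))
```

With kernel values below 1 near the surface and above 1 aloft, this weighting lowers the contribution of boundary-layer NO2 and raises that of mid-upper tropospheric NO2, which is the behaviour described by Huijnen et al. (2010).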

Differential optical absorption spectroscopy NO 2 retrieval error
The DOAS retrievals are subject to random, systematic and smoothing errors in the retrieval process. Random (quasi-systematic) errors include fitting errors, cloud errors, instrument noise and signal corruption. Systematic errors include absorption cross-sections, surface albedo and stratospheric correction uncertainties. Finally, smoothing errors include biases in the a priori profiles and the sensitivity of the satellite when recording the slant column through the atmosphere. If multiple retrievals are averaged together, as in this study, the random errors will partially cancel, reducing the random error by a factor of 1/√N (where N is the number of retrievals).
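In contrast, systematic errors are not reduced by averaging. In the following section we investigate the different error components of the satellite retrievals and derive an expression for the error in the averaged retrievals. This methodology should give smaller errors which are more representative of the time-averaged retrieval error and so allow a stricter test of the model. Boersma et al. (2004) describe the error in the DOAS NO2 retrievals as

$$ \sigma_{\mathrm{trop}}^2 = \left(\frac{\sigma_{\mathrm{total}}}{\mathrm{AMF}_{\mathrm{trop}}}\right)^2 + \left(\frac{\sigma_{\mathrm{strat}}}{\mathrm{AMF}_{\mathrm{trop}}}\right)^2 + \left(\frac{(X_{\mathrm{total}} - X_{\mathrm{strat}})\,\sigma_{\mathrm{AMF_{trop}}}}{\mathrm{AMF}_{\mathrm{trop}}^2}\right)^2, \tag{4} $$

where $\sigma_{\mathrm{trop}}$, $\sigma_{\mathrm{strat}}$ and $\sigma_{\mathrm{total}}$ are the uncertainties in the tropospheric vertical, stratospheric slant and total slant columns, respectively. $\mathrm{AMF}_{\mathrm{trop}}$ is the tropospheric air mass factor, $\sigma_{\mathrm{AMF_{trop}}}$ is the error in the tropospheric air mass factor, $X_{\mathrm{total}}$ is the total slant column and $X_{\mathrm{strat}}$ is the stratospheric slant column.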
$\sigma_{\mathrm{total}}$ is made up of both random and systematic error, where the random error component can be reduced by 1/√N. The sources of systematic error in the total slant column include the NO2 cross-section, spectral calibration and temperature (Boersma et al., 2004). We assume that the systematic and random errors can be combined in quadrature. In Eq. (6) there could be two terms for $\sigma_{\mathrm{total}}$: $\sigma_{\mathrm{total,ran}}$ and $\sigma_{\mathrm{total,sys}}$, which are the random and systematic error components of the total slant column, respectively. Boersma et al. (2004) state that $\sigma_{\mathrm{total,sys}}$ can be expressed as $\sigma_{\mathrm{total,sys}} = 0.03\,X_{\mathrm{total}}$. However, any systematic error in the NO2 total slant column will largely be absorbed by the stratospheric assimilation procedure (Belmonte Rivas et al., 2014) and does not propagate into the tropospheric column error. Therefore, $\sigma_{\mathrm{total,sys}}$ can be neglected from Eq. (6). The OMI standard and DOMINO products estimate the stratospheric slant column using TM4 chemistry-transport model simulations and data assimilation (Dirksen et al., 2011). According to the DOMINO OMI product documentation (which references Boersma et al., 2004, and Dirksen et al., 2011), the error in the stratospheric slant column ($\sigma_{\mathrm{strat}}$) is estimated to be 0.25 × 10^15 molecules cm^-2 in all cases. Boersma et al. (2004) state that the tropospheric column is calculated as

$$ N_{\mathrm{trop}} = \frac{X_{\mathrm{total}} - X_{\mathrm{strat}}}{\mathrm{AMF}_{\mathrm{trop}}}, \tag{5} $$

where $N_{\mathrm{trop}}$ is the vertical tropospheric column. This can be substituted, together with the $\sigma_{\mathrm{total}}$ ($\sigma_{\mathrm{total,sys}}$ has been neglected) and $\sigma_{\mathrm{strat}}$ estimates, into Eq. (4). This leads to

$$ \sigma_{\mathrm{trop}}^2 = \left(\frac{\sigma_{\mathrm{total,ran}}}{\mathrm{AMF}_{\mathrm{trop}}}\right)^2 + \left(\frac{0.25 \times 10^{15}}{\mathrm{AMF}_{\mathrm{trop}}}\right)^2 + \left(\frac{N_{\mathrm{trop}}\,\sigma_{\mathrm{AMF_{trop}}}}{\mathrm{AMF}_{\mathrm{trop}}}\right)^2. \tag{6} $$

$\sigma_{\mathrm{trop}}$ is reduced in the model-satellite comparisons when the AK is applied to the model data. Therefore, the error product $\sigma_{\mathrm{trop,ak}}$ from the OMI retrieval files, which has the smoothing error removed, is used instead of $\sigma_{\mathrm{trop}}$ in Eqs. (4) and (6). Boersma et al. (2007) suggest that the uncertainty in the tropospheric AMF is around 10-40 %, which we treat as systematic.
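This is because the AMF uncertainty will be dominated by systematic errors in the surface albedo, NO2 profile and cloud and aerosol parameters. Also, the literature does not provide an estimate of the random error contribution to the AMF uncertainty. Therefore, we take the conservative estimate of $\sigma_{\mathrm{AMF_{trop}}} = 0.4 \cdot \mathrm{AMF}_{\mathrm{trop}}$. This leads to the new retrieval error approximation of

$$ \sigma_{\mathrm{trop,ak}}^2 = \left(\frac{\sigma_{\mathrm{total,ran}}}{\mathrm{AMF}_{\mathrm{trop}}}\right)^2 + \left(\frac{0.25 \times 10^{15}}{\mathrm{AMF}_{\mathrm{trop}}}\right)^2 + \left(0.4\,N_{\mathrm{trop}}\right)^2. \tag{7} $$

All of these terms are known apart from $\sigma_{\mathrm{total,ran}}$. We can rearrange to calculate this based on other variables provided in the OMI product files. This leads to

$$ \left(\frac{\sigma_{\mathrm{total,ran}}}{\mathrm{AMF}_{\mathrm{trop}}}\right)^2 = \sigma_{\mathrm{trop,ak}}^2 - \left(\frac{0.25 \times 10^{15}}{\mathrm{AMF}_{\mathrm{trop}}}\right)^2 - \left(0.4\,N_{\mathrm{trop}}\right)^2. \tag{8} $$

In the rare case that the right-hand side is negative (e.g. when $N_{\mathrm{trop}}$ is large but has small uncertainty; $\sigma_{\mathrm{strat}}$ will be relatively small compared to $N_{\mathrm{trop}}$), the random error component cannot be found as it would be complex, so the random error component is then set to 50 % (H. Eskes, personal communication, 2012). Now, rearranging for $\sigma_{\mathrm{total,ran}}$, and assuming the right-hand side is positive, Eq. (8) becomes

$$ \sigma_{\mathrm{total,ran}} = \mathrm{AMF}_{\mathrm{trop}} \sqrt{\sigma_{\mathrm{trop,ak}}^2 - \left(\frac{0.25 \times 10^{15}}{\mathrm{AMF}_{\mathrm{trop}}}\right)^2 - \left(0.4\,N_{\mathrm{trop}}\right)^2}. \tag{9} $$

This quantity was calculated for each retrieval in each grid square and then the new seasonal retrieval error was calculated taking the reduced random component into account:

$$ \overline{\sigma_{\mathrm{trop}}} = \sqrt{\left(\frac{\overline{\sigma_{\mathrm{total,ran}}}}{\overline{\mathrm{AMF}_{\mathrm{trop}}}\,\sqrt{N}}\right)^2 + \left(\frac{0.25 \times 10^{15}}{\overline{\mathrm{AMF}_{\mathrm{trop}}}}\right)^2 + \left(0.4\,\overline{N_{\mathrm{trop}}}\right)^2}, \tag{10} $$

where a bar superscript represents the seasonal time average. Figure 2 shows how averaging, by decreasing the random error component, reduces the seasonal satellite tropospheric column error as calculated by our algorithm. The figure compares the simple mean of the total satellite column NO2 error (calculated for each pixel) with our new method, which reduces the estimated random error component by one over the square root of the number of observations. The reduction in the satellite column error is then presented as a percentage of the original satellite column seasonal mean error. In both summer and winter, the seasonal mean column error is reduced to 30-90 % of its original value across the domain, making the OMI data much more useful for model evaluation.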
Table 1 gives examples of the seasonal tropospheric column NO2 error and the reduced tropospheric column NO2 error using our algorithm for multiple locations across Europe. The error in summer, compared with winter, and the error over sea in comparison to land, are smaller. We suggest that the larger sample size in summer and over the sea, when compared to winter and over the land, respectively, reduces the random error component further as N is larger. Only for a few retrievals over Scandinavia does this methodology of reducing the random error component increase the overall column error (not shown here).
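The error algorithm of this section can be summarised in a short sketch for a single grid square. It is an illustration under stated assumptions rather than the code used here: the function names and the tuple layout of a retrieval are hypothetical; the text's "set to 50 %" fallback is interpreted as 50 % of the kernel-corrected error, which is an assumption; and the seasonal averages are taken of the random component, the tropospheric AMF and the column before they are combined in quadrature.

```python
import math

SIGMA_STRAT = 0.25e15     # stratospheric slant column error, molecules cm^-2 (from the text)
AMF_REL_ERROR = 0.4       # conservative 40 % tropospheric AMF uncertainty (from the text)
FALLBACK_FRACTION = 0.5   # ASSUMPTION: "set to 50 %" read as 50 % of sigma_trop_ak


def random_component(sigma_trop_ak, amf_trop, n_trop):
    """Random slant-column error of one retrieval.

    Subtracts the stratospheric and AMF terms from the squared
    kernel-corrected error and rescales by the tropospheric AMF.
    """
    residual = (sigma_trop_ak ** 2
                - (SIGMA_STRAT / amf_trop) ** 2
                - (AMF_REL_ERROR * n_trop) ** 2)
    if residual < 0.0:
        # The square root would be complex, so fall back to a fixed fraction.
        return FALLBACK_FRACTION * sigma_trop_ak * amf_trop
    return amf_trop * math.sqrt(residual)


def seasonal_error(retrievals):
    """Time-averaged column error with the random part reduced by 1/sqrt(N).

    `retrievals` is a list of (sigma_trop_ak, amf_trop, n_trop) tuples for
    one grid square (a hypothetical layout).
    """
    n = len(retrievals)
    mean_random = sum(random_component(*r) for r in retrievals) / n
    mean_amf = sum(r[1] for r in retrievals) / n
    mean_column = sum(r[2] for r in retrievals) / n
    return math.sqrt((mean_random / (mean_amf * math.sqrt(n))) ** 2
                     + (SIGMA_STRAT / mean_amf) ** 2
                     + (AMF_REL_ERROR * mean_column) ** 2)
```

Because only the random term is divided by √N, the seasonal error tends towards the combined stratospheric and AMF terms as the number of retrievals grows, which is why the reduction saturates rather than approaching zero.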

Model setup
The AQUM domain covers the UK and part of continental Europe on a rotated grid between approximately 45-60° N and 12° W-12° E. The model has a horizontal resolution of 0.11° × 0.11° with 38 vertical levels between the surface and 39 km. The model has a coupled, online tropospheric chemistry scheme using the United Kingdom Chemistry and Aerosols (UKCA) subroutines. The chemistry scheme (Regional Air Quality, RAQ) includes 40 tracers, 23 photolysis reactions and 115 gas-phase reactions (Savage et al., 2013), including the reaction of the nitrate radical with formaldehyde, ethene, ethane, propene, n-butane, acetaldehyde, isoprene, organic nitrates and the hydroperoxyl radical. The standard model setup does not include any heterogeneous chemistry. A complete chemical mechanism is included in the online supplement to Savage et al. (2013).
The model uses the Coupled Large-scale Aerosol Simulator for Studies In Climate (CLASSIC) aerosol scheme. This is a bulk aerosol scheme with the aerosols treated as an external mixture. It contains six prognostic tropospheric aerosol types: ammonium sulfate, mineral dust, fossil fuel black carbon (FFBC), fossil fuel organic carbon (FFOC), biomass burning aerosols and ammonium nitrate. In addition, there is a diagnostic aerosol scheme for sea salt and a fixed climatology of biogenic secondary organic aerosols (BSOA). Mass is exchanged between the different aerosol modes by nucleation, evaporation and re-evaporation, coagulation and mode merging, diffusion and coagulation. For more details of the aerosol scheme see Bellouin et al. (2011). In common with most regional air quality forecast models in Europe, AQUM shows a small negative bias for PM2.5 and a larger negative bias for PM10. For full details of the performance of the model for aerosols, NO2 and ozone see Savage et al. (2013).
Meteorological initial conditions and lateral boundary conditions (LBCs) come from the Met Office's operational global Unified Model (25 km × 25 km) forecast. Initial chemical conditions come from the previous day's AQUM forecast, and aerosol and chemistry LBCs come from the ECMWF GEMS (Global and regional Earth-system Monitoring using Satellite and in situ data) reanalysis (Hollingsworth et al., 2008). The GEMS fields, available at http://www.gmes-atmosphere.eu/, provide boundary fluxes for regional air quality models such as AQUM.
This configuration of AQUM uses emission data sets from the National Atmospheric Emissions Inventory (NAEI) (1 km × 1 km) for the UK, ENTEC (5 km × 5 km) for the shipping lanes and the European Monitoring and Evaluation Programme (EMEP) (50 km × 50 km) for the rest of the model domain. Over the UK the NAEI NOx emissions data sets are made up of two source types: area and point. Area sources include traffic, light industry and urban emissions, while point sources are power stations, landfill, incinerators and refineries. Typically, the point source emissions are 100 g s^-1 in magnitude, while the area sources tend to be 10 g s^-1. The quoted uncertainty of the NAEI NOx emission data used in these simulations is 10 % (Li et al., 2009) for the total emissions. The spatial disaggregation adds further uncertainties. We do not have a separate parameterisation for soil NOx emissions, but given the large emissions from transport and industry, the soil NOx emissions are unlikely to be important in this region. Poupkou et al. (2010) provide the monthly climatology of biogenic emissions at a resolution of 0.125° × 0.0625°. The use of climatological biogenic isoprene emissions will partially diminish AQUM's representation of ozone from biogenic precursors. A new interactive biogenic isoprene scheme is under development but was not available for this study. However, this is a secondary issue in this paper as we focus on primary emissions of NOx. Biomass burning emissions of aerosols come from the Global Fire Emissions Database (GFED), version 1 (Randerson et al., 2005) for 2000. The use of biomass burning emissions from 2000 is somewhat arbitrary, but within the AQUM domain these emissions have relatively little impact.
Atmos. Chem. Phys., 15, 5611-5626, 2015, www.atmos-chem-phys.net/15/5611/2015/

Sensitivity experiments
We performed one control and five sensitivity experiments to investigate the AQUM's simulation of column NO2. Two experiments used different LBCs, two experiments used modified point source emissions and two included heterogeneous chemistry. These are summarised in Table 2.
Table 2 (recoverable rows only). Run N2O5 Low: with N2O5 heterogeneous chemistry with γ = 0.001. Run N2O5 High: as run N2O5 Low but with γ = 0.02.
Simulation MACC investigates the sensitivity of AQUM column NO2 to different chemical LBCs from the global Monitoring Atmospheric Composition and Climate (MACC) reanalysis, which is the follow-on project of GEMS (Inness et al., 2013). The GEMS reanalysis assimilated ozone profiles from SBUV, MIPAS, MLS and GOME; total ozone column from OMI and SCIAMACHY; and total CO column from MOPITT (GEMS, 2010). The MACC reanalysis uses a more recent version of the ECMWF model (Integrated Forecast System), and was run at a resolution of 80 km instead of 125 km. MACC assimilated ozone profiles from MIPAS, MLS and GOME; ozone tropospheric or partial columns from OMI, SBUV/2 and SCIAMACHY; CO tropospheric column from IASI and MOPITT; and NO2 tropospheric column from SCIAMACHY (Inness et al., 2013). No in situ observations of reactive gases were assimilated in either product. Both GEMS and MACC use 4D-Var to assimilate satellite and in situ (aircraft) observations into the reanalyses. Savage et al. (2013) have undertaken a similar analysis of the MACC LBCs in AQUM. They showed that when compared with the AURN observations of O3, AQUM-MACC performs well during the first quarter of 2006 and overestimates observations afterwards, while AQUM-GEMS has a negative bias during the first quarter of the year but compares well with observations afterwards.
We have performed additional runs to examine the impact of the point sources over the UK on NO2 columns. Run E1 repeated the control experiment but with all point sources removed. The objective was to test the hypothesis that the positive biases observed in the north of England (an area with a high density of power plants; see Sect. 4.1) were linked to uncertainties in the representation of NOx emissions from power stations. There are of course uncertainties in all emission sources (area and point), but a full assessment of their impact on the NO2 column is beyond the scope of this work. Run E2 introduces a new idealised passive tracer emitted from the UK point sources with the same emissions as the model NOx inventory. The idealised tracer is transported like any chemical tracer, but is not lost through chemical reactions. Instead it decays with an e-folding lifetime of one day. The point source tracer columns can then be examined to see if they correlate with the summer AQUM-OMI positive biases (see Sect. 4.3).
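The behaviour of the idealised tracer in Run E2 can be illustrated with a box-model sketch of its source and sink terms only. Transport, which the model applies as for any chemical tracer, is deliberately omitted, and the function name and arguments are hypothetical rather than taken from the AQUM code. For a constant emission rate E and lifetime τ the tracer obeys dC/dt = E − C/τ, which has the exact one-step solution used below.

```python
import math

ONE_DAY = 86400.0  # e-folding lifetime of the tracer in seconds (one day, from the text)


def step_tracer(concentration, emission_rate, dt, lifetime=ONE_DAY):
    """Advance the tracer by dt seconds, ignoring transport.

    Uses the exact solution of dC/dt = E - C / tau for constant E, so the
    update is stable for any time step.
    """
    decay = math.exp(-dt / lifetime)
    return concentration * decay + emission_rate * lifetime * (1.0 - decay)
```

With no emission the tracer falls to 1/e of its initial value after one day, and under a constant emission it relaxes towards the steady state E·τ, so the tracer column remains concentrated around and downwind of the point sources.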
Runs N2O5 High and N2O5 Low investigate the impact of heterogeneous chemistry on NO2 columns. Tropospheric NOx (NO + NO2) sources are dominated by anthropogenic emissions and the loss of NO2 to HNO3 is through two pathways, the daytime reaction with OH and the night-time route via N2O5:

$$ \mathrm{NO_2 + OH + M \rightarrow HNO_3 + M} $$

$$ \mathrm{NO_2 + O_3 \rightarrow NO_3 + O_2} $$

$$ \mathrm{NO_3 + NO_2 + M \rightleftharpoons N_2O_5 + M} $$

$$ \mathrm{N_2O_5 + H_2O\,(aerosol) \rightarrow 2\,HNO_3} $$

The standard configuration of AQUM does not include any heterogeneous reactions such as the hydrolysis of N2O5 on aerosol surfaces (see details of the chemistry scheme in the Supplement of Savage et al., 2013). Previous global modelling studies have shown that this process can be a significant NOx sink at mid-latitudes in winter (e.g. Tie et al., 2003; Macintyre and Evans, 2010). Following those analyses, we have implemented this reaction, with rate $k$ (s^-1) calculated as

$$ k = \frac{A\,\gamma\,\omega}{4}, $$

where $A$ is the aerosol surface area (cm^2 cm^-3), $\gamma$ is the uptake coefficient of N2O5 on aerosols (non-dimensional) and $\omega = 100\,[8RT/(\pi M)]^{1/2}$ (cm s^-1) is the mean molecular speed of N2O5 at temperature $T$ (K), $M$ is the molecular mass of N2O5 (kg mol^-1) and $R$ = 8.3145 J mol^-1 K^-1. Macintyre and Evans (2010) investigated the sensitivity of N2O5 loss on aerosol by using a range of uptake values (0.0, 10^-6, 10^-4, 10^-3, 5 × 10^-3, 10^-2, 2 × 10^-2, 0.1, 0.2, 0.5 and 1.0). They found limited sensitivity at low and high values of γ. At low values, the uptake pathway is an insignificant route for NOx loss. At high values, the loss of NOx through heterogeneous removal of N2O5 is limited by the rate of production of NO3, rather than the rate of heterogeneous uptake. However, in the northern extra-tropics (including the AQUM domain), their model shows significant sensitivity to intermediate values of γ (0.001-0.02), with a significant loss of NOx. Therefore, we experiment with γ = 0.001 and 0.02 to investigate the sensitivity of AQUM column NO2 to heterogeneous chemistry.
The aerosol surface area, A, includes the contribution of seven aerosol types present in CLASSIC: sea salt aerosol, ammonium nitrate, ammonium sulfate, biomass burning aerosol, black carbon, FFOC and BSOA. To account for hygroscopic growth of the aerosols, the formulation of Fitzgerald (1975) is used for growth above the deliquescence point for ammonium sulfate (RH = 81 %), sea salt (RH = 75 %) and ammonium nitrate (RH = 61 %) up to 99.5 % RH. We apply a linear fit between the efflorescence (RH = 30 % for sulfate, 42 % for sea salt and 30 % for nitrate) and deliquescence points. There is no hygroscopic growth below the efflorescence point. Look-up tables are used for the other aerosol types. Biomass burning and FFOC aerosol growth rates are taken from Magi and Hobbs (2003), BSOA growth rates come from Varutbangkul et al. (2006) and black carbon is considered to be hydrophobic (no growth).
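The first-order loss rate for N2O5 on aerosol described above can be sketched as follows. This is a minimal illustration rather than the UKCA implementation: the function names are hypothetical, the molar mass of N2O5 is taken as approximately 0.108 kg mol^-1, and the aerosol surface area is assumed to be supplied already summed over all aerosol types with hygroscopic growth applied.

```python
import math

R_GAS = 8.3145    # gas constant, J mol^-1 K^-1 (value quoted in the text)
M_N2O5 = 0.108    # molar mass of N2O5, kg mol^-1 (approximate value, assumed here)


def molecular_speed(temperature):
    """Molecular speed of N2O5 in cm s^-1: 100 * sqrt(8RT / (pi * M))."""
    return 100.0 * math.sqrt(8.0 * R_GAS * temperature / (math.pi * M_N2O5))


def n2o5_uptake_rate(surface_area, gamma, temperature):
    """First-order loss rate k (s^-1) of N2O5 on aerosol: k = A * gamma * omega / 4.

    surface_area: total aerosol surface area density, cm^2 cm^-3
    gamma:        dimensionless uptake coefficient (0.001 or 0.02 in the runs here)
    temperature:  air temperature, K
    """
    return surface_area * gamma * molecular_speed(temperature) / 4.0
```

Since k is linear in γ, the N2O5 High run (γ = 0.02) has an uptake rate 20 times that of the N2O5 Low run (γ = 0.001) for the same aerosol surface area and temperature.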

Statistical comparisons
For the AQUM-satellite comparisons the following model-observation statistics were used: mean bias (MB), root mean square error (RMSE) and the fractional gross error (FGE, bounded by the values 0 to 2). These statistics are described by Han et al. (2011) and Savage et al. (2013). Further details are given in the Appendix.

Control run
Figure 4 compares observed column NO2 with the AQUM control Run C (with AKs applied). The AQUM and OMI averages have similar spatial patterns, with maximum and minimum column NO2 over the urban and rural/ocean regions, respectively. In summer, AQUM and OMI background concentrations are around O(10^13)-3 × 10^15 molecules cm^-2, where O(10^13) denotes values of the order of 10^13. The OMI peak column NO2 of 16-20 × 10^15 molecules cm^-2 is over London. AQUM simulates similar London column NO2, but the model peak concentrations are over northern England at over 20 × 10^15 molecules cm^-2.
In winter, the background column NO2 is elevated, with a larger spatial extent, ranging around O(10^13)-6 × 10^15 molecules cm^-2 in both the AQUM and OMI fields. However, the elevated AQUM background state has a larger coverage than that of OMI. Over the source regions, OMI column NO2 peaks over London at 12-13 × 10^15 molecules cm^-2, with similar concentrations seen in AQUM. However, AQUM peak column NO2 values are over northern England at 12-16 × 10^15 molecules cm^-2. Therefore, independently of season, AQUM overestimates northern England column NO2. Interestingly, the background column NO2 is larger in winter for both AQUM and OMI, but column NO2 is lower over the source regions in winter than in summer (Pope et al., 2014). van der A et al. (2008) suggest that peak UK NOx emissions occur in July, while Pope et al. (2014) suggest that the transport of column NO2 away from source regions due to stronger winter dynamics outweighs the loss of UK source region column NO2 from enhanced summer photochemistry. Figure 5 shows the MB between AQUM Run C and OMI. The black polygoned regions show significant differences, i.e. where the magnitude of the MB is greater than the satellite error. In summer, there are significant positive (5-10 × 10^15 molecules cm^-2) and negative (−10 to −1 × 10^15 molecules cm^-2) biases in northern England and the Benelux region, respectively. The negative biases are potentially linked to the coarser-resolution EMEP NOx emissions data sets (50 km × 50 km), which average emissions over a larger grid square, causing AQUM to simulate lower column NO2 than seen by OMI. We hypothesise that the northern England biases are linked to the point source (power station) NOx emissions from NAEI. This is further discussed in Sect. 4.3.
In winter, AQUM overestimates OMI by 1-3 × 10^15 molecules cm^-2 over the North Sea and Scotland, as the modelled winter background column NO2 is larger; this is further investigated in Sect. 4.4 by including an additional NOx sink in the chemistry scheme of the model. The northern England positive biases seen in summer also extend to winter (3-5 × 10^15 molecules cm^-2), suggesting that they are not purely a seasonal feature. Finally, the large bias dipole in the Po Valley appears to be related to the LBCs or the winter emissions, as summer biases are small.
We also compared AQUM against surface observations of NO2 from AURN, available at http://uk-air.defra.gov.uk/networks/network-info?view=aurn and maintained by DEFRA, to see whether there was a consistent pattern in the biases in the model column and surface NO2. However, we find similar problems to Savage et al. (2013), where surface AQUM-observation comparisons show systematic negative biases at urban sites. The coarse model resolution, compared to the point measurements (even with roadside and traffic sites removed), results in significant model underestimation of NO2 in urban regions. It is therefore difficult to draw any conclusions on the AQUM skill, as the model grid-point data will struggle to reproduce the point measurements. In addition, the spatial coverage of the AURN data over the UK is very sparse, and interference in AURN NO2 measurements from molybdenum converters (Steinbacher et al., 2007) leads to overestimated surface concentrations, in particular at rural sites. Therefore, satellite (pixel area) data are the primary observations used to evaluate AQUM in this paper.
Figure 6a and b shows results of the sensitivity run with the MACC boundary conditions (Run MACC) and can be compared with Fig. 4a and b. The MACC LBCs have a limited impact on summer column NO2, with peak concentrations over London and northern England of 15-20 × 10^15 molecules cm^-2 for both Run MACC and Run C. However, in winter Run MACC increases column NO2 from approximately 12 × 10^15 to 16 × 10^15 molecules cm^-2 over the UK and Benelux region. When compared with OMI (Fig. 6c and d), the limited summer impact of the MACC LBCs results in biases similar to those from the control run in Fig. 5, with biases of 5-10 × 10^15 molecules cm^-2 over northern England and −5 to −3 × 10^15 molecules cm^-2 over continental Europe.
In winter, Run MACC has enhanced column NO2, resulting in biases with OMI of 2-5 × 10^15 molecules cm^-2 across the whole domain, unlike Run C with GEMS LBCs in Fig. 5. The peak positive biases, of 5 × 10^15 molecules cm^-2, are again over northern England (and the Po Valley), suggesting that AQUM overestimates NO2 in the region at the OMI overpass time, independently of season or LBCs. Therefore, the GEMS LBCs appear to give better AQUM column NO2 forecast skill than the MACC LBCs, as also found by Savage et al. (2013) for comparisons with surface ozone.

AQUM NOx emissions sensitivity experiments
We hypothesise that the significant summer Run C-OMI positive biases in northern England and Scotland (Fig. 5) are caused by AQUM's representation of point source (mainly power station) NOx emissions. Therefore, to better understand these biases, we perform sensitivity experiments on the NOx emissions (Table 2); Fig. 7a shows the JJA Run C-OMI positive biases. Figure 7b-d shows the JJA AQUM NOx emissions for runs C and E1 (with point sources removed) and their difference. The peak Run C NOx emissions are around 1.8 × 10^-9 kg m^-2 s^-1. With point sources removed, the differences are 1.8 × 10^-9 kg m^-2 s^-1 at point source locations, showing that point sources make up a significant part of the emissions budget. Figure 8a and b highlight the impact of removing point sources, as column NO2 over northern England reduces from 15-25 × 10^15 molecules cm^-2 to 4-5 × 10^15 molecules cm^-2. The Run E1-OMI MB now ranges between −10 and −6 × 10^15 molecules cm^-2, while the Run C-OMI MB (Fig. 7a) is around 6-10 × 10^15 molecules cm^-2. Therefore, the switch in sign of the biases, with similar magnitude, indicates that the point source emissions play a significant role in the AQUM column NO2 budget.
Run E2 aimed to test whether the point sources were responsible for the positive biases in Fig. 7a by using an idealised tracer of the power station emissions. Figure 8c shows the JJA tracer column with the OMI AKs applied, where peak columns are around 16-20 × 10^15 molecules cm^-2 over northern England. The minimum tracer values, of zero, are over the sea and continental Europe, as there is no emission of the tracer there. Inspection of Figs. 7a and 8c suggests that the peak tracer columns overlap with the large Run C-OMI positive biases.
To test this more quantitatively, the spatial correlation between the peak tracer columns from Run E2 and the Run C MBs was compared against a distribution of random tracer-MB correlations. The largest 100 tracer column pixels in Fig. 8c were correlated against the MBs at the same locations in Fig. 7a, yielding a correlation of 0.45. Then, using a Monte Carlo approach, a random sample of 100 of the land-based MB pixels in Fig. 7a (we use land pixels only, as the biases in Fig. 7a are over land) was correlated against the largest 100 tracer sample. This was repeated 1000 times and the resulting correlations were sorted from lowest to highest. The 5th and 95th percentiles were −0.162 and 0.158, respectively. Our reasoning is that if the point sources are responsible for the peak Run C-OMI biases, then the peak tracer columns, which represent the point source emissions, should be in the same locations as the peak biases. The correlations of the random samples show how the peak tracer-MB correlation compares with that obtained for randomly sampled MB locations. Since 0.45 is above the 95th percentile, the peak tracer-MB correlation is significant (it is in fact the greatest correlation; see Fig. 8d), indicating that AQUM's representation of point source emissions is linked to the AQUM overestimation of column NO2 in northern England and Scotland.
Figure 9 shows the winter and summer MBs between AQUM (with LBCs from GEMS) and OMI when heterogeneous hydrolysis of N2O5 is implemented in the model with γ = 0.001 (Run N2O5 Low) and γ = 0.02 (Run N2O5 High). In the Run C summer case (see Fig. 5a) there are positive biases over northern England and Scotland of around 5-10 × 10^15 molecules cm^-2. We have shown that these positive biases are likely linked to AQUM's representation of point source emissions. However, by introducing N2O5 heterogeneous chemistry these positive biases are significantly reduced. In Run N2O5 Low (Fig. 9a) there is some impact on the biases, as RMSE (over the UK domain 8° W-2° E and 50-60° N) decreases from 3.68 × 10^15 to 3.39 × 10^15 molecules cm^-2 and FGE (over the same domain) also reduces very slightly.
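Returning to the idealised tracer experiment (Run E2), the Monte Carlo significance test described for the tracer-MB correlation can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the code used in the study: the function names, the synthetic Gaussian tracer plume, the synthetic bias field and the land mask are all hypothetical stand-ins for the fields in Figs. 7a and 8c, and the Pearson correlation is assumed as the correlation measure.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient between two 1-D samples."""
    return float(np.corrcoef(x, y)[0, 1])

def peak_tracer_test(tracer, bias, land_mask, n_top=100, n_trials=1000, seed=0):
    """Compare the peak tracer-bias correlation with a random-sample distribution."""
    rng = np.random.default_rng(seed)
    tracer = np.asarray(tracer, dtype=float).ravel()
    bias = np.asarray(bias, dtype=float).ravel()
    land_idx = np.flatnonzero(np.asarray(land_mask, dtype=bool).ravel())

    # Correlate the n_top largest tracer columns with the biases at the same pixels.
    top = np.argsort(tracer)[-n_top:]
    r_peak = pearson(tracer[top], bias[top])

    # Correlate the same tracer sample with biases at randomly chosen land pixels.
    r_random = np.empty(n_trials)
    for k in range(n_trials):
        sample = rng.choice(land_idx, size=n_top, replace=False)
        r_random[k] = pearson(tracer[top], bias[sample])

    p05, p95 = np.percentile(r_random, [5, 95])
    return r_peak, p05, p95

# Synthetic stand-in fields (hypothetical, units of 10^15 molecules cm^-2).
rng = np.random.default_rng(42)
yy, xx = np.mgrid[0:40, 0:40]
tracer = 20.0 * np.exp(-((yy - 20) ** 2 + (xx - 20) ** 2) / (2 * 4.0 ** 2))
bias = 0.6 * tracer + rng.normal(0.0, 1.0, size=tracer.shape)
land_mask = xx > 5  # pretend the western strip of the grid is sea

r_peak, p05, p95 = peak_tracer_test(tracer, bias, land_mask)
print(f"peak r = {r_peak:.2f}, random r 5th-95th percentiles = [{p05:.2f}, {p95:.2f}]")
print("significant" if r_peak > p95 else "not significant")
```

The peak correlation is judged significant when it lies above the 95th percentile of the random-sample correlations, mirroring the comparison of 0.45 against 0.158 in the text.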

Sensitivity to heterogeneous removal of N2O5
In Run N2O5 High (Fig. 9c) many of the positive biases over point sources are now insignificant and the RMSE decreases to 3.08 × 10^15 molecules cm^-2. However, over parts of continental Europe the intensity and spread of negative biases have increased, suggesting that γ = 0.02 might be too strong an uptake here. The FGE increases slightly to 0.67, and we suspect that this is due to the introduction of negative biases over relatively clean or moderately polluted areas (e.g. the Irish Sea and parts of the continent). Note that the correction of errors of large magnitude (e.g. over point sources) reduces RMSE because this metric penalises large deviations between the model and the satellite-retrieved columns, while the introduction of errors of low magnitude over less polluted areas might increase the normalised errors given by FGE. We experimented with the MACC LBCs and γ = 0.02 in an initial AQUM study of January-February-March (JFM) 2006. However, for this value of γ, runs with GEMS LBCs gave better comparisons than runs with MACC LBCs (smaller domain RMSE when compared with OMI NO2). The changes at the point source locations are most significant because of the large emissions of NOx and of aerosols suitable for this heterogeneous process to take place. Therefore, we suggest that while AQUM's representation of point sources may be responsible for the summer northern England/Scotland positive biases, including N2O5 heterogeneous chemistry with γ = 0.02 partially compensates for them.
In winter, the positive biases seen in Fig. 5b, 2-5 × 10^15 molecules cm^-2, decrease as γ increases, as also found for summer. In Run N2O5 Low (Fig. 9b) the spatial spread of significantly positive biases is only partially reduced, resulting in small decreases of RMSE (from 5.12 × 10^15 to 5.05 × 10^15 molecules cm^-2) and FGE (from 0.63 to 0.62). For Run N2O5 High (Fig. 9d) the cluster of significantly positive biases has decreased spatially, yielding the best comparisons, with RMSE and FGE values of 4.48 × 10^15 molecules cm^-2 and 0.60, respectively.
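The opposing behaviour of RMSE and FGE noted for the summer Run N2O5 High case can be illustrated with a toy calculation. This is a minimal sketch assuming the standard definitions of the two statistics and strictly positive columns; the function names and the four-pixel values are hypothetical and are not taken from the model runs.

```python
import numpy as np

def rmse(model, obs):
    """Root mean square error: dominated by the largest absolute deviations."""
    model, obs = np.asarray(model, dtype=float), np.asarray(obs, dtype=float)
    return float(np.sqrt(np.mean((model - obs) ** 2)))

def fge(model, obs):
    """Fractional gross error: each deviation is normalised by (model + obs).
    Assumes positive columns, so that the result lies between 0 and 2."""
    model, obs = np.asarray(model, dtype=float), np.asarray(obs, dtype=float)
    return float(2.0 * np.mean(np.abs(model - obs) / (model + obs)))

# One polluted pixel and three clean pixels (units: 10^15 molecules cm^-2).
obs = [10.0, 1.0, 1.0, 1.0]
before = [16.0, 1.0, 1.0, 1.0]   # large positive bias over the source only
after = [12.0, 0.6, 0.6, 0.6]    # source bias reduced, small negative biases elsewhere

for label, model in (("before", before), ("after", after)):
    print(f"{label}: RMSE = {rmse(model, obs):.2f}, FGE = {fge(model, obs):.2f}")
```

Reducing the large error over the source lowers RMSE, while the small negative errors over the clean pixels are large relative to the columns there, so FGE rises.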

Conclusions
We have successfully used OMI satellite observations of column NO2 over the UK to further explore AQUM performance, extending previous validation of the model, which had used only surface data. In order to do this we have looked in detail at the satellite errors (random, systematic and smoothing) and derived an algorithm which reduces the re-
Based on the summer and winter comparisons, the standard (operational) AQUM overestimates column NO2 over northern England/Scotland by 5-10 × 10^15 molecules cm^-2 and over the northern domain by 2-5 × 10^15 molecules cm^-2. The use of a different set of lateral boundary conditions (from the MACC reanalysis), which are known to increase AQUM's surface ozone positive bias (Savage et al., 2013), also increases the error in the NO2 columns. The AQUM column NO2 is increased, especially in winter, by 2-5 × 10^15 molecules cm^-2, resulting in poorer comparisons with OMI.
From multiple sensitivity experiments on the UK NOx point source emissions we conclude that AQUM's representation of these emissions very likely caused the northern England/Scotland summer biases. By emitting an idealised tracer from the NOx point sources we found a significant correlation between the peak tracer columns and the AQUM-OMI MBs. Finally, introducing N2O5 heterogeneous chemistry in AQUM improves the AQUM-OMI comparisons in both seasons. In winter, the spatial extent of positive biases, 2-5 × 10^15 molecules cm^-2, decreases. In summer, the northern England biases decrease both spatially and in magnitude, from 5-10 to 0-5 × 10^15 molecules cm^-2. This suggests that in summer AQUM's representation of NOx point sources is inaccurate, but that the resulting biases can be partially masked by the introduction of N2O5 heterogeneous chemistry.
As this study has shown the potential use of satellite observations, along with the time-averaged random error algorithm, to evaluate AQUM, the data could be used in future to evaluate operational air quality forecasts. We also show that the heterogeneous loss of N2O5 on aerosol is an important sink of NO2 and should be included in the operational AQUM.