Multi-model study of mercury dispersion in the atmosphere: vertical and interhemispheric distribution of mercury species

Atmospheric chemistry and transport of mercury play a key role in the global mercury cycle. However, there are still considerable knowledge gaps concerning the fate of mercury in the atmosphere. This is the second part of a model intercomparison study investigating the impact of atmospheric chemistry and emissions on mercury in the atmosphere. While the first study focused on ground-based observations of mercury concentration and deposition, here we investigate the vertical and interhemispheric distribution and speciation of mercury from the planetary boundary layer to the lower stratosphere. So far, there have been few model studies investigating the vertical distribution of mercury, mostly focusing on single aircraft campaigns. Here, we present a first comprehensive analysis based on various aircraft observations in Europe, North America, and on intercontinental flights. The investigated models proved to be able to reproduce the distribution of total and elemental mercury concentrations in the troposphere including interhemispheric trends. One key aspect of the study is the investigation of mercury oxidation in the troposphere. We found that different chemistry schemes were better at reproducing observed oxidized mercury patterns depending on altitude. High concentrations of oxidized mercury in the upper troposphere could be reproduced with oxidation by bromine while elevated concentrations in the lower troposphere were better reproduced by OH and ozone chemistry. However, the results were not always conclusive as the physical and chemical parameterizations in the chemistry transport models also proved to have a substantial impact on model results.


Introduction
At the time of publication the Minamata Convention has 128 signatories and has been ratified by 55 countries. Having reached the threshold of 50 ratifications, the convention enters into force on 16 August 2017 (UNEP, 2013).
This international legally binding treaty obliges all participating parties to (i) assess the state of mercury pollution, (ii) take actions to reduce mercury emissions and concentrations in the environment, and (iii) evaluate the success of the measures taken on a regular basis.
The state of mercury contamination is typically determined by measurement of the relevant mercury species (e.g., total mercury (TM) in the atmosphere, methylmercury in fish). However, in order to understand the sources of mercury pollution and to predict the impact of various possible measures for mercury emission reduction it is necessary to apply complex chemistry transport models.
In the last decades, general chemistry transport models (CTMs) have been extended to model the global mercury cycle by including mercury chemistry and partitioning (Bergan et al., 1999; Xu et al., 2000; Lee et al., 2001; Petersen et al., 2001; Seigneur et al., 2001; Dastoor and Larocque, 2004; Selin et al., 2007; Hedgecock and Pirrone, 2004). Since then, extensive model intercomparison studies have been performed to evaluate and improve the original models (Bullock et al., 2008; Ryaboshapko et al., 2002, 2007a, b). However, until today, we have not fully understood all parts of the global mercury cycle. In the atmosphere, the main question is how elemental mercury emitted from anthropogenic, natural, and legacy sources is oxidized. This includes the relative importance of oxidizing reaction partners and the relevance of reduction pathways of oxidized mercury under environmental conditions. Once we understand the redox processes of atmospheric mercury, it will be possible to determine the range of mercury transport and the fate of mercury emitted in the past and the future.
In this study, we investigate the vertical distribution of mercury species in the atmosphere. While gaseous elemental mercury (GEM) makes up the vast majority of total atmospheric mercury near the surface (Sprovieri et al., 2016, this issue), recent aircraft-based observations have indicated that there is significant oxidation of mercury occurring in the free troposphere (FT) (Brooks et al., 2014; Lyman and Jaffe, 2012; Jaffe et al., 2014; Gratz et al., 2015; Shah et al., 2016). However, apart from GEM no individual mercury compound has been identified so far and the atmospheric oxidized mercury is an unknown mixture of mercury bound to Br, Cl, OH, O, and NO2 compounds (Horowitz et al., 2017). The speciation of mercury is thus operationally defined as GEM, gaseous oxidized mercury (GOM), and particulate bound mercury (PBM) (Gustin et al., 2015). In the following we will address the sum of all oxidized mercury species, including mercury in the aqueous phase, as OM (oxidized mercury). Thus, OM is defined as the difference between TM and GEM (OM = TM − GEM).
As oxidized mercury is much more rapidly removed from the atmosphere than elemental mercury, the free troposphere (the region between the planetary boundary layer (PBL) and the tropopause) is of great importance for the global mercury budget.
To investigate this issue further, the Mercury Modeling Task Force (MMTF) was founded during the course of the EU FP7 project GMOS (Global Mercury Observation System). The MMTF is a global collaboration not limited to GMOS project partners and, thus, incorporates most mercury CTMs currently in use in the scientific community. With a total of seven model combinations (including four global, one hemispheric, and two regional models), the partners in the MMTF carried out a set of sensitivity model runs and compared the results to airborne observations in Europe, North America, and on intercontinental flights.

Observations
Aircraft-based observations are expensive and thus rarely performed on a regular basis. They cover a limited area and a limited time interval and, taken alone, are hardly representative enough to be used to evaluate model performance. However, in the year 2013 an unprecedented number of aircraft-based observations were performed.
Within the European Tropospheric Mercury Experiment (ETMEP) five vertical profiles were flown in the PBL and the lower free troposphere (LFT) at an altitude of 500-3500 m over central Europe during August 2013 (Weigelt et al., 2016a). Mercury was measured using two collocated Tekran instruments (2537X and 2537B). Both Tekran instruments were run with upstream particle filters and one additionally with a quartz wool trap, which presumably removes GOM (Lyman and Jaffe, 2012; Ambrose et al., 2013). Neglecting PBM, the concentration of which is usually negligible, the measurement by the Tekran without the quartz wool trap approximates TM and that with the quartz wool trap approximates GEM (Weigelt et al., 2016b). GEM was also measured by a modified Lumex instrument (Weigelt et al., 2016b). Additionally, GOM was collected on denuders and analyzed on return to the laboratory.
In the US, Brooks et al. (2014) measured GEM, GOM, and PBM profiles on 28 flights between August 2012 and July 2013 at altitudes from 1000 to 6000 m. GEM was measured on board with a modified Tekran 2537B instrument with a temporal resolution of 2.5 min. GOM was collected on denuders and PBM on a filter tube downstream of the denuder. Both were later analyzed in the laboratory. In addition, 19 flights were flown in June and July 2013 mostly over the southeastern USA at altitudes between 500 and 7000 m during the NOMADSS (Nitrogen, Oxidants, Mercury and Aerosol Distributions, Sources and Sinks) campaign (Gratz et al., 2015; Shah et al., 2016). Here, oxidized mercury was calculated based on a differential method using two Tekran 2537B instruments, one of which was equipped with a GOM trap (quartz wool or ion-exchange membrane), using the University of Washington Detector for Oxidized Hg Species (DOHGS) (Lyman and Jaffe, 2012; Ambrose et al., 2015).
Finally, there were 19 intercontinental flights between Germany and North and South America made within the CARIBIC (Civil Aircraft for the Regular Investigation of the atmosphere Based on an Instrumented Container) project during which TM and GEM were measured in the upper troposphere and the lower stratosphere at altitudes between 6000 and 12 000 m using a modified Tekran 2537A instrument (Slemr et al., 2014, 2016).
The aircraft observations were complemented with ground-based observations from the GMOS measurement network (Sprovieri et al., 2016; GMOS, 2016). In particular, we used data from the ground-based stations in Mace Head, Ireland, and Waldhof, Germany, to augment the ETMEP profiles (Weigelt et al., 2013, 2015). At Mace Head and Waldhof GEM is measured with a Tekran 2537A. At Waldhof, additionally, GOM and PBM are measured with a Tekran 1130/1135 speciation unit.
These flights cover a large horizontal area in the midlatitudes above Europe (45-55° N) and North America (30-45° N) and also a large vertical range from the surface up to the lower stratosphere (12 000 m). Moreover, comparable flights were performed throughout the year between January and October. Finally, all measurements were performed with Tekran instruments, allowing for a comparison of all aircraft-based measurements as well as the combination with ground-based observations which use similar instruments. It is arguable whether this is already enough data to give us a comprehensive and representative picture of the vertical distribution of mercury in the atmosphere. However, we think that there is an adequate amount of data to allow for more than just an anecdotal investigation of specific episodes. Thus, we combined measurements from all flights in Europe and North America as well as ground-based observations for the year 2013 in order to construct idealized seasonal average vertical profiles for TM and OM (Fig. 1). It can be seen that TM concentrations are mostly uniform within each layer, with concentrations dropping at the top of the PBL and at the tropopause. We see increased TM concentrations inside the PBL during winter due to higher primary emissions and a shallower PBL. In winter, the current measurement techniques are not able to detect OM in the FT, where concentrations are always below 100 pg m⁻³. In spring and summer we see two distinct regions with increased OM concentrations in the lower and the upper free troposphere.
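The differential method amounts to subtracting the GEM channel from the TM channel for each paired sample. A minimal sketch, assuming paired readings in ng m⁻³ (the function name and the sample values are hypothetical and not taken from any campaign):

```python
import numpy as np

def oxidized_mercury(tm_ng_m3, gem_ng_m3):
    """Return OM = TM - GEM in pg m^-3 from paired readings in ng m^-3."""
    tm = np.asarray(tm_ng_m3, dtype=float)
    gem = np.asarray(gem_ng_m3, dtype=float)
    # 1 ng = 1000 pg; negative differences (measurement noise) are left as-is
    return (tm - gem) * 1000.0

# Hypothetical paired readings from the untrapped (TM) and trapped (GEM) channel
tm = [1.50, 1.40, 1.30]
gem = [1.45, 1.30, 1.30]
om = oxidized_mercury(tm, gem)
```

Because OM is a small difference between two large numbers, the uncertainty of both channels propagates directly into the result.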
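Constructing an average vertical profile from scattered flight samples can be sketched as binning by altitude. The example below is illustrative only: the function name, the sample values, and the upper limit are assumptions, and the 1000 m slice width is borrowed from the model evaluation section.

```python
import numpy as np

def average_profile(altitude_m, conc, bin_width_m=1000.0, top_m=12000.0):
    """Average concentrations within fixed altitude slices.

    Returns (lower bin edges, mean per bin, sample count per bin).
    Bins without samples are NaN; samples at or above top_m are ignored.
    """
    altitude_m = np.asarray(altitude_m, dtype=float)
    conc = np.asarray(conc, dtype=float)
    edges = np.arange(0.0, top_m + bin_width_m, bin_width_m)
    idx = np.digitize(altitude_m, edges) - 1  # slice index of each sample
    n_bins = len(edges) - 1
    means = np.full(n_bins, np.nan)
    counts = np.zeros(n_bins, dtype=int)
    for b in range(n_bins):
        sel = idx == b
        counts[b] = sel.sum()
        if counts[b] > 0:
            means[b] = conc[sel].mean()
    return edges[:-1], means, counts

# Hypothetical TM samples (ng m^-3) pooled from several flights of one season
alt = [500, 800, 1500, 2500, 2600]
tm = [1.6, 1.8, 1.3, 1.2, 1.4]
lower_edges, means, counts = average_profile(alt, tm)
```

Keeping the per-bin sample count alongside the mean makes it easy to see which altitude slices are poorly covered.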

Models
This study is based on an annual ensemble of seven different CTMs for the year 2013 including global (GLEMOS, GEOS-Chem, GEM-MACH-Hg, ECHMERIT), hemispheric (CMAQ-Hem), and regional (WRF-Chem, CCLM-CMAQ) models (Table 1). The models differ considerably in the implemented physical and chemical parameterizations, spatial and temporal resolution, and meteorological drivers. The ensemble includes models that use external fields for chemical reaction partners (GLEMOS, GEOS-Chem), models with a complete photochemical reaction scheme (CCLM-CMAQ, CMAQ-Hem), and online coupled meteorological models (GEM-MACH-Hg, ECHMERIT, WRF-Chem). Moreover, some models also consider reduction of Hg²⁺ to GEM in the aqueous phase (GLEMOS, ECHMERIT, WRF-Chem, CMAQ). In addition to the BASE cases, a set of chemistry and emission sensitivity runs was performed. These include runs with no anthropogenic emissions (NOANT) and with a 100 % GEM speciation of anthropogenic emissions (ANTSPEC). For the mercury chemistry, different runs with only one of the oxidants OH, O3, or Br (OHCHEM, O3CHEM, BRCHEM) and without any mercury chemistry (NOCHEM) were performed. Concerning the bromine reaction, two different Br and BrO fields were used. These are bromine fields from GEOS-Chem (Parrella et al., 2012) and the p-TOMCAT model (Yang et al., 2005, 2010). However, the described sensitivity runs were not performed by all models. Moreover, the list differs from that published by Travnikov et al. (2017, this issue) as only a limited set of 3-D model output data could be saved. A summary of the models is given in Table 1 and the sensitivity runs performed are further described in Table 2. An evaluation of ground-based mercury concentrations and deposition fluxes for the four global models (GLEMOS, GEOS-Chem, GEM-MACH-Hg, ECHMERIT) can be found in Travnikov et al. (2017, this issue). An evaluation of regional deposition fields can be found in Gencarelli et al. (2017, this issue). For the sake of completeness we provide the detailed model descriptions here as well.

GLEMOS (Global EMEP Multi-media Modeling System)
GLEMOS is a multi-scale chemistry transport model developed for the simulation of environmental dispersion and cycling of different chemicals, including mercury, based on the older hemispheric model MSCE-HM-Hem (Travnikov, 2005; Travnikov and Ilyin, 2009; Travnikov et al., 2009).
The model simulates atmospheric transport, chemical transformations, and deposition of three Hg species (GEM, GOM, and PBM). The atmospheric transport of the tracers is driven by meteorological fields generated with the Weather Research and Forecast modeling system (WRF 3.7.2) (Skamarock et al., 2007), which is fed by operational analysis data from the European Centre for Medium-Range Weather Forecast (ECMWF) (ECMWF, 2016). In the default setup configuration the model grid has a horizontal resolution of 1° × 1°. Vertically, the model domain reaches up to 10 hPa and consists of 20 irregular terrain-following sigma layers. The atmospheric chemical scheme includes Hg oxidation and reduction reactions in both the gas phase and the aqueous phase of cloud water. The major chemical mechanisms in the gas phase include Hg oxidation by O3 and OH radicals with reaction rate constants taken from Hall (1995) and Sommar et al. (2001), respectively. The latter was scaled down by a factor of 0.1 within and below clouds to account for reduced photochemical activity (Seigneur et al., 2001). The O3 and OH concentration fields are imported from MOZART (Emmons et al., 2010). A two-step gas-phase oxidation of GEM by Br is included as an option. Aqueous-phase reactions include oxidation by ozone, chlorine, and hydroxyl radical and reduction via decomposition of sulphite complexes (Van Loon et al., 2000). The model distinguishes in-cloud and sub-cloud wet deposition of PBM and GOM based on empirical data. The dry deposition scheme is based on the resistance analogy approach (Wesely and Hicks, 2000). Prescribed fluxes of natural and secondary emissions of Hg from soil and seawater were generated depending on Hg concentrations in soil, soil temperature, and solar radiation for emissions from land and proportional to the primary production of organic carbon in seawater for emissions from the ocean (Travnikov and Ilyin, 2009). In addition, an empirical parameterization of the prompt Hg re-emission from snow- and ice-covered surfaces is applied based on observational data.

GEOS-Chem
The GEOS-Chem global chemistry transport model (v9-02; http://www.geos-chem.org) is driven by assimilated meteorological data from the NASA GMAO Goddard Earth Observing System (Bey et al., 2001). The GEOS-FP and GEOS-5.2.0 data are used for the simulation year 2013 and the spin-up period, respectively (http://gmao.gsfc.nasa.gov/products/). GEOS-Chem couples a 3-D atmosphere (Holmes et al., 2010), a 2-D mixed layer slab ocean (Soerensen et al., 2010), and a 2-D terrestrial reservoir (Selin et al., 2008) at a horizontal resolution of 2° × 2.5°. Three mercury species (GEM, GOM, and PBM) are tracked in the atmosphere (Amos et al., 2012). A two-step gaseous oxidation mechanism initiated by Br atoms is used. Bromine fields are archived from a full-chemistry GEOS-Chem simulation (Parrella et al., 2012) while the rate constants of reactions are from Goodsite et al. (2012), Donohoue et al. (2006), and Balabanov et al. (2005). The surface fluxes of GEM include anthropogenic sources, biomass burning, geogenic activities, as well as the bidirectional fluxes in the atmosphere-terrestrial and atmosphere-ocean exchanges (Song et al., 2015; Strode et al., 2007). Biomass burning emissions are estimated using a global CO emission database and a volume ratio of Hg / CO of 1 × 10⁻⁷. Geogenic activities are spatially distributed based on the locations of mercury mines. For atmosphere-terrestrial exchange, GEOS-Chem treats the evasion and dry deposition of GEM separately (Selin et al., 2008). Dry deposition is parameterized with a resistance-in-series scheme (Wesely, 1989). In addition, an effective GOM uptake by sea salt aerosol is also included over the ocean (Holmes et al., 2010). GEM evasion includes volatilization from soil and rapid recycling of newly deposited Hg. The former is estimated as a function of soil Hg content and solar radiation. The latter is modeled by recycling a fraction of wet and dry deposited oxidized mercury to the atmosphere as GEM immediately after deposition (60 % for snow-covered land and 20 % for all other land uses) (Selin et al., 2008). GEOS-Chem estimates the atmosphere-ocean exchange of GEM using a standard two-layer diffusion model. The ocean mercury in the mixed layer interacts not only with the atmospheric boundary layer but also with subsurface waters through entrainment/detrainment of the mixed layer and wind-driven Ekman pumping (Soerensen et al., 2010).
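Two of the emission rules above reduce to simple arithmetic. The sketch below is not GEOS-Chem code: the function names are hypothetical, and the conversion of the Hg / CO volume ratio to a mass ratio via molar masses is an assumption about how such a ratio would be applied to a CO inventory given in mass units.

```python
M_HG = 200.59  # molar mass of Hg, g mol^-1
M_CO = 28.01   # molar mass of CO, g mol^-1
HG_CO_VOLUME_RATIO = 1e-7  # Hg / CO volume ratio quoted in the text

def hg_from_co(co_mass):
    """Biomass burning Hg emission (same mass unit as co_mass).

    For ideal gases a volume ratio equals a molar ratio, so the mass ratio
    follows by scaling with the molar masses (assumed conversion).
    """
    return co_mass * HG_CO_VOLUME_RATIO * M_HG / M_CO

def prompt_reemission(deposited_om, snow_covered):
    """GEM returned to the atmosphere immediately after deposition."""
    fraction = 0.60 if snow_covered else 0.20  # fractions quoted in the text
    return fraction * deposited_om

hg_emitted = hg_from_co(28.01)  # 28.01 g of CO corresponds to 1 mol of CO
recycled_snow = prompt_reemission(10.0, snow_covered=True)
recycled_land = prompt_reemission(10.0, snow_covered=False)
```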

GEM-MACH-Hg
GEM-MACH-Hg is a new chemical transport model for mercury that is based on the GRAHM model developed by Environment and Climate Change Canada (Dastoor and Larocque, 2014; Dastoor et al., 2008, 2015; Durnford et al., 2010, 2012; Kos et al., 2013). GEM-MACH-Hg uses a newer version of Environment and Climate Change Canada's operational meteorological model. The horizontal resolution of the model is 1° × 1°. GEM is oxidized in the atmosphere by OH radicals. The rate constant of the reaction is from Sommar et al. (2001) but scaled down by a coefficient of 0.34 to take into account possible dissociation/reduction reactions (Tossell et al., 2003; Goodsite et al., 2004). The gaseous oxidation of mercury by bromine is applied in polar regions using reaction rate constants from Donohoue et al. (2006), Dibble et al. (2012), and Goodsite et al. (2004). The parameterization of atmospheric mercury depletion events is based on Br production and chemistry and snow re-emission of GEM (Dastoor et al., 2008).
OH fields are from MOZART (Emmons et al., 2010) while BrO is derived from 2007-2009 satellite observations of BrO vertical columns. The associated Br concentration is then calculated from photochemical steady-state conditions (Platt and Janssen, 1995). Dry deposition in GEM-MACH-Hg is based on the resistance approach (Zhang, 2001; Zhang et al., 2003). In the wet deposition scheme, GEM and GOM are partitioned between cloud droplets and air using a temperature-dependent Henry's law constant. Total global emissions from natural sources and re-emissions of previously deposited Hg (from land and oceans) in GEM-MACH-Hg are based on the global Hg budgets by Gbor et al. (2007), Shetty et al. (2008), and Mason (2009). Land-based natural emissions are spatially distributed according to the natural enrichment of Hg. Terrestrial re-emissions are spatially distributed according to the historic deposition of Hg and land-use type and depend on solar radiation and the leaf area index. Oceanic emissions depend on the distributions of primary production and atmospheric deposition.

ECHMERIT
ECHMERIT is a global online meteorological chemistry transport model, based on the ECHAM5 global circulation model, with a highly flexible chemistry mechanism designed to facilitate the investigation of atmospheric mercury chemistry (Jung et al., 2009; De Simone et al., 2014, 2015, 2017). The model uses the same spectral grid as ECHAM. The standard horizontal resolution of the model is T42 (approximately 2.8° × 2.8°), whereas in the vertical the model is discretized with a hybrid sigma pressure system with 19 non-equidistant levels up to 10 hPa. The base chemical mechanism includes the GEM oxidation by OH and O3 in the gaseous and aqueous phases. Reaction rate constants are from Sommar et al. (2001), Hall (1995), and Munthe (1992). OH and O3 concentration fields were imported from MOZART (Emmons et al., 2010). The Hg oxidation by Br is also optionally available in a two-step gas-phase oxidation mechanism with reaction rates as described in Goodsite et al. (2004, 2012) and Donohoue et al. (2006). ECHMERIT uses a parameterization of dynamic air-seawater exchange as a function of ambient parameters, but using a constant value of mercury concentration in seawater (De Simone et al., 2014). Emissions from soils and vegetation were calculated offline and derived from the EDGAR/POET emission inventory (Granier et al., 2005; Peters and Olivier, 2003) that includes biogenic emissions from the GEIA inventories (http://www.geiacenter.org), as described by Jung et al. (2009). Prompt re-emission of a fixed fraction (20 %) of wet and dry deposited mercury is applied in the model to account for reduction and evasion processes which govern mercury short-term cycling between the atmosphere and terrestrial reservoirs (Selin et al., 2008). This fraction is increased to 60 % for snow-covered land and ice-covered seas.

17, 6925-6955, 2017 www.atmos-chem-phys.net/17/6925/2017/

CMAQ-Hem
This is a hemispheric setup of the Community Multi-Scale Air Quality System (CMAQ) version 4.6 (Byun and Schere, 2006; Byun and Ching, 1999). The model is based on a three-dimensional Eulerian atmospheric chemistry and transport modeling system that simulates Hg, ozone, particulate matter, acid deposition, and visibility simultaneously. The model components and scientific backgrounds have been documented elsewhere (Bullock and Brehme, 2002; Bullock et al., 2008; Travnikov et al., 2010). A spin-up period of 10 days is used to eliminate the impact of initial conditions for atmospheric oxidants (O3 and OH) that react with mercury. As for mercury species, global models were simulated for several years prior to the study period (2005) in order to provide the initial and boundary conditions for this study (Pongprueksa et al., 2011). A hemispheric model domain with a polar stereographic projection at 108 km spatial resolution and 187 × 187 grid cells was used for this experiment with 13 hybrid sigma layers up to 50 hPa. Hourly meteorological data were prepared using the WRF model version 3.7 (Skamarock et al., 2008). The selected physics options were Thompson (Microphysics Options) (Thompson et al., 2004), Betts-Miller-Janjic (Cumulus Parameterization Options) (Janjic, 1994, 2000), RRTMG (Radiation Physics Options), and BouLac (PBL Physics Options) based on the results of meteorological model performance evaluation (Wang et al., 2014). The ARW outputs were processed using MCIPv3.4.1 (Byun and Ching, 1999; Otte and Pleim, 2010) to generate model-ready meteorology for chemical transport simulations.

WRF-Chem
The WRF-Chem-Hg model (Gencarelli et al., 2014a, 2015, 2017) is a modified version of the WRF-Chem model (version 3.4, Grell et al., 2005), developed to reproduce the emission, transport, chemical transformation, and deposition of Hg at local scales with elevated spatial and temporal resolutions. The gas-phase chemistry of Hg and a parameterized representation of atmospheric Hg aqueous chemistry have been added to the RADM2 chemical mechanism using KPP (Sandu and Sander, 2006) and the WKC coupler (Salzmann and Lawrence, 2006) in order to represent four Hg species: GEM, GOM, PBM, and dissolved oxidized mercury (Hg(II)(aq)) (see Gencarelli et al., 2014, for further details regarding Hg parameterizations and the physics options employed). Oxidation by O3, OH, and Br was implemented as described in Gencarelli et al. (2015) in accordance with the experimental purpose. In the BASE case only O3 and OH chemistry are used. Chemical initial and boundary conditions were taken from the ECHMERIT model (Jung et al., 2009; De Simone et al., 2014) for Hg species, while boundary conditions for other chemical species were taken from MOZART-4 (Emmons et al., 2010). Dry deposition of gas-phase species is treated using the approach developed by Wesely (1989), multiplying the concentrations in the lowest model layer by the spatially and temporally varying deposition velocity, which is the inverse of the sum of the aerodynamic, sublayer, and surface resistances. The wet deposition of Hg species has been implemented by adding the Hg compounds to the scheme in WRF-Chem for gas and particulate convective transport and wet deposition. In-cloud and below-cloud scavenging of Hg species has been treated in accordance with the approach described by Neu and Prather (2012), with the Hg species scavenging rate assumed to be the same as that for HNO3(g). The model domain covers Europe and the Mediterranean Sea, including part of the western North Atlantic Ocean, North Africa, and the Middle East with a horizontal resolution of 24 × 24 km, and 30 vertical levels from soil to 50 hPa. Hg emissions by AMAP/UNEP (2013a, b) for mercury and from the EDGARv4.tox1 (2008) inventory for other species were interpolated on this model domain.
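The resistance approach to dry deposition can be illustrated with a few lines of arithmetic. This is a minimal sketch of the generic resistance-in-series formulation, not the WRF-Chem implementation; the function names and resistance values are hypothetical.

```python
def deposition_velocity(r_a, r_b, r_c):
    """Deposition velocity (m s^-1) from resistances in series (s m^-1).

    r_a: aerodynamic, r_b: sublayer, r_c: surface resistance.
    """
    return 1.0 / (r_a + r_b + r_c)

def dry_deposition_flux(conc_lowest_layer, r_a, r_b, r_c):
    """Downward flux = concentration in the lowest layer x deposition velocity."""
    return conc_lowest_layer * deposition_velocity(r_a, r_b, r_c)

# Hypothetical values: 100 pg m^-3 of GOM and a total resistance of 200 s m^-1
v_d = deposition_velocity(50.0, 50.0, 100.0)
flux = dry_deposition_flux(100.0, 50.0, 50.0, 100.0)  # pg m^-2 s^-1
```

A larger resistance in any of the three terms lowers the deposition velocity, so the largest resistance controls the flux.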

CCLM-CMAQ
This modeling system is based on the meteorological model CCLM and the chemistry transport model CMAQ v5.0.1. All physical atmospheric parameters were taken from regional atmospheric simulations with the COSMO-CLM v4.8 mesoscale meteorological model (Geyer, 2014) using NCEP reanalysis data as forcing (Kalnay et al., 1996). COSMO-CLM is the climate version of the regional-scale meteorological community model COSMO (Rockel et al., 2008), originally developed by Deutscher Wetterdienst (DWD) (Steppeler et al., 2003; Schaettler et al., 2008). It has been run on a 0.22° grid covering the whole of Europe. COSMO-CLM uses the TERRA-ML land surface model (Schrodin and Heise, 2001), a TKE closure scheme for the PBL (Doms, 2011; Doms et al., 2011), cloud microphysics after Seifert and Beheng (2001, 2006), the Tiedtke scheme (Tiedtke, 1989) for cumulus clouds, and a long-wave radiation scheme following Ritter and Geleyn (1992). The meteorological fields were then processed to match the Lambert conformal conic CMAQ grid with a grid size of 24 × 24 km with 30 hybrid sigma layers up to 50 hPa. CMAQ uses the information that is provided by the meteorological input fields to calculate transport, transformation, and loss of all gas-phase and particulate species (Byun and Ching, 1999; Byun and Schere, 2006). For this study we used the multi-pollutant version with the carbon bond 5 photochemical mechanism cb05tump (Tanaka et al., 2003; Yarwood et al., 2005; Sarwar et al., 2007; Whitten et al., 2010) and the aerosol module aero6 (Appel et al., 2013; Carlton et al., 2010; Foley et al., 2010). Deposition schemes are based on Byun and Schere (2006) for dry and Pleim and Ran (2011) for wet deposition. The mercury chemistry is based on Bullock and Brehme (2002) and was updated based on observations and model intercomparisons in the course of the EU FP7 project GMOS (Zhu et al., 2015; Bieser et al., 2014a, b). To describe the re-emission of deposited mercury we used the bidirectional flux parameterization following Bash et al. (2010). Additionally, emissions from the North and Baltic seas were estimated based on Bieser and Schrum (2016). Boundary conditions were obtained from the GLEMOS model for GEM, GOM, and PBM (Travnikov and Ilyin, 2009) and from TM-5 for all other species (Huijnen et al., 2010). The annual total emissions are based on AMAP for mercury (AMAP, 2013a, b) and EMEP for other species and were speciated and disaggregated to an hourly resolution with the SMOKE for Europe emission model (Bieser et al., 2011a). Plume rise of point sources was explicitly calculated based on Bieser et al. (2011b). Finally, biogenic emissions were calculated online using the BEIS3.14 model (Schwede et al., 2005; Vukovich and Pierce, 2002).

Sensitivity runs
To evaluate the impact of emissions and atmospheric chemistry on the vertical distribution of mercury a set of sensitivity runs was made. While for the BASE case each model uses its default setup, for the sensitivity runs certain aspects of the models were harmonized. The list of all sensitivity runs is given in Table 2. Concerning emissions, we tested the impact of anthropogenic emissions by considering only natural and legacy emissions (NOANT) and by altering the speciation of anthropogenic emissions to 100 % GEM (ANTSPEC). In addition, we investigated different oxidation reactions by considering only one reaction at a time, namely ozone (O3CHEM), hydroxyl radicals (OHCHEM), and bromine (BRCHEM). In these cases, the models used the same input fields for the investigated reactant. For bromine chemistry two alternative sets of bromine fields were used, from GEOS-Chem (BRCHEM1) and from the p-TOMCAT model (BRCHEM2).
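The run matrix described in the text can be written down as a small configuration table. This is only an illustrative encoding: the dictionary layout and key names are hypothetical, `None` stands for "model default", and the assumption that the emission runs keep the default chemistry is not stated explicitly in the text.

```python
# Sensitivity runs as named in the text; the structure itself is hypothetical
SENSITIVITY_RUNS = {
    "BASE":    {"anthropogenic": True,  "speciation": None,        "oxidants": None,    "br_field": None},
    "NOANT":   {"anthropogenic": False, "speciation": None,        "oxidants": None,    "br_field": None},
    "ANTSPEC": {"anthropogenic": True,  "speciation": "100% GEM",  "oxidants": None,    "br_field": None},
    "NOCHEM":  {"anthropogenic": True,  "speciation": None,        "oxidants": [],      "br_field": None},
    "O3CHEM":  {"anthropogenic": True,  "speciation": None,        "oxidants": ["O3"],  "br_field": None},
    "OHCHEM":  {"anthropogenic": True,  "speciation": None,        "oxidants": ["OH"],  "br_field": None},
    "BRCHEM1": {"anthropogenic": True,  "speciation": None,        "oxidants": ["Br"],  "br_field": "GEOS-Chem"},
    "BRCHEM2": {"anthropogenic": True,  "speciation": None,        "oxidants": ["Br"],  "br_field": "p-TOMCAT"},
}

def runs_using(oxidant):
    """Names of the runs in which `oxidant` is the only active oxidant."""
    return [name for name, cfg in SENSITIVITY_RUNS.items()
            if cfg["oxidants"] == [oxidant]]

bromine_runs = runs_using("Br")
```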

Model evaluation
For the model evaluation we used hourly model results for the year 2013 for all models with the exception of ECHMERIT, which provided 3-hourly average concentrations. The model value matching each individual measurement was taken from the nearest model grid cell and time step using a four-dimensional bilinear interpolation in space and time. This means, for example, that observations within a single vertical profile can correspond to different time steps in the model. For the analysis we used three aggregated model species: TM, GEM, and OM = TM − GEM. To investigate the models' capability to reproduce observed mercury concentration and speciation we use the traditional statistical measures bias, error, and correlation as given in Eqs. (1)-(5). We use the mean normalized bias (MNB) and mean normalized error (MNE) because these give more weight to the individual data points instead of the overall mean value.
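Matching each measurement to the model output can be sketched as a lookup along the four model axes. The example below uses a plain nearest-neighbour lookup as a simplified stand-in for the interpolation described above; the function names, the regular grid, and the synthetic field are all assumptions.

```python
import numpy as np

def nearest_index(axis, value):
    """Index of the axis coordinate closest to value."""
    return int(np.abs(np.asarray(axis, dtype=float) - value).argmin())

def sample_model(field, times, levels, lats, lons, obs):
    """Model value at the grid cell and time step nearest to one observation.

    field has shape (time, level, lat, lon); obs is (time, altitude, lat, lon).
    """
    t, z, lat, lon = obs
    return field[nearest_index(times, t), nearest_index(levels, z),
                 nearest_index(lats, lat), nearest_index(lons, lon)]

# Synthetic model field whose values simply encode the flattened grid index
field = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)
times = [0.0, 1.0]               # hours
levels = [500.0, 1500.0, 2500.0]  # m
lats = [40.0, 45.0, 50.0, 55.0]   # degrees N
lons = [0.0, 5.0, 10.0, 15.0, 20.0]  # degrees E
value = sample_model(field, times, levels, lats, lons, (0.8, 1400.0, 52.0, 11.0))
```

Because the lookup is done per observation, consecutive samples of one flight can fall into different model time steps, as noted in the text.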

Mean normalized bias: $\mathrm{MNB} = \frac{1}{N}\sum_{i=1}^{N}\frac{P_i - O_i}{O_i}$
Here, P is the predicted value from the model, O is the observed value from the measurement, and N is the sample size. Due to the small number of aircraft observations available, such a comparison faces the problem that the model bias will not average out as it tends to do for larger datasets (e.g., 8760 hourly observations for a single year of ground-based station data). Moreover, the vertical model performance is highly dependent on meteorological parameters (e.g., PBL height, vertical transport). Thus, for an individual profile the model bias can be quite large. We did not perform a detailed analysis of the meteorological fields because this would be beyond the scope of this paper. To increase the number of observations per altitude, we combined several vertical profiles into seasonal average profiles. On average, each of the resulting seasonal average profiles consists of 58 data points per 1000 m altitude slice.
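The statistical measures can be computed directly from paired model and observation values. The sketch below uses the commonly used definitions of MNB and MNE (mean of the normalized differences and of their absolute values) and the Pearson correlation; these are assumed to correspond to the paper's equations, and the function names and sample values are hypothetical.

```python
import numpy as np

def mean_normalized_bias(p, o):
    """MNB = mean((P - O) / O) over all paired samples."""
    p, o = np.asarray(p, dtype=float), np.asarray(o, dtype=float)
    return np.mean((p - o) / o)

def mean_normalized_error(p, o):
    """MNE = mean(|P - O| / O) over all paired samples."""
    p, o = np.asarray(p, dtype=float), np.asarray(o, dtype=float)
    return np.mean(np.abs(p - o) / o)

def correlation(p, o):
    """Pearson correlation coefficient between model and observation."""
    return np.corrcoef(p, o)[0, 1]

# Hypothetical paired TM values (ng m^-3)
predicted = [1.2, 1.5, 0.9]
observed = [1.0, 1.5, 1.0]
mnb = mean_normalized_bias(predicted, observed)
mne = mean_normalized_error(predicted, observed)
r = correlation(predicted, observed)
```

Since every sample is normalized by its own observed value, low observed concentrations weigh as much as high ones, which is why observations close to zero need care.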
Moreover, to completely remove the model bias from the analysis of the vertical distribution of mercury we calculated a relative vertical profile which we call the mean deviation profile (MDP) (Eqs. 6-8). The MDP indicates the difference for each individual altitude from the average column concentration and is calculated for models and observations independently. Thus, it indicates whether each model is able to reproduce the observed vertical distribution rather than the actual concentration of mercury species (Eq. 8). This is especially valuable for the analysis of oxidized mercury species, as there is an ongoing discussion about an underestimation of concentrations due to limitations of the current measurement techniques (Lyman et al., 2010, 2016; Ariya et al., 2015; Gustin et al., 2015; Huang and Gustin, 2015; Jaffe et al., 2014; McClure et al., 2014; Ambrose et al., 2013; Huang et al., 2013; Kos et al., 2013). Generally, the model error can be separated into three parts: the bias, which represents any systematic errors; the variance, which gives the variability around the mean value; and the covariance, which represents the correlation between model and observations (Solazzo and Galmarini, 2016). By using MDPs we completely remove the bias and all systematic errors from our evaluation. Combining MDP and correlation coefficient, we are able to investigate the models' capabilities to reproduce areas with high and low production of oxidized mercury and the influence of different chemistry schemes. The idea behind this is that even if the absolute measurements are not correct, we can use them to identify regions with mercury oxidation in the vertical column.
In the individual layer mean, X(i,L) is the model or observation value i in layer L, N_L is the number of values in layer L, i is the counter for values in layer L, and M is the number of layers in the profile.
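The mean deviation profile can be illustrated with a short sketch. It assumes that each layer mean is the average of the N_L values in layer L, that the column average is the mean of the M layer means, and that the MDP is the difference between the two; the exact form of Eqs. (6)-(8) may differ (e.g., by a normalization), and the function name and sample values are hypothetical.

```python
import numpy as np

def mean_deviation_profile(layers):
    """Return layer mean minus column mean for each altitude layer.

    layers: list of sequences, one per altitude layer (bottom to top),
            each holding the N_L values that fall into that layer.
    Assumption: the column mean is the average of the M layer means.
    """
    layer_means = np.array([np.mean(np.asarray(v, dtype=float)) for v in layers])
    column_mean = layer_means.mean()
    return layer_means - column_mean

# Hypothetical TM values (ng m^-3) per 1000 m layer.
obs_layers = [[1.6, 1.8], [1.3, 1.3, 1.3], [1.2], [1.1, 1.3]]
obs_mdp = mean_deviation_profile(obs_layers)

# A model with the same vertical shape but a constant low bias of 0.3 ng m^-3.
biased_layers = [[x - 0.3 for x in layer] for layer in obs_layers]
biased_mdp = mean_deviation_profile(biased_layers)
```

In this example the biased model yields the same MDP as the observations, which is the sense in which the MDP removes a constant systematic offset from the comparison.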

Results and discussion
Observations indicate that there is a tripartite distribution of TM in the atmosphere. The highest concentrations (1.4-1.8 ng m−3) are found inside the PBL with a strong gradient towards the free troposphere (1.1-1.4 ng m−3). This gradient seems to be mainly driven by anthropogenic emissions, as it was not observed in regions with low primary emissions (e.g., Mace Head, Ireland) (Fu et al., 2016; Sprovieri et al., 2016; Weigelt et al., 2015). Finally, in the stratosphere total mercury concentrations are typically below 1 ng m−3 (0.7-1.0 ng m−3) (Slemr et al., 2016; Lyman and Jaffe, 2012).
The observed TM profiles are often similar to GEM profiles. Here, oxidized mercury (OM) is defined as the sum of all oxidized forms of mercury, including the model species GOM, PBM, and any mercury in the aqueous phase. Inside the PBL, OM concentrations are very low, mostly between 20 and 100 pg m−3 in Europe and North America, even in source regions with high anthropogenic emissions (e.g., coal-fired power plants) (Xu et al., 2016; Weigelt et al., 2013, 2016; Gay et al., 2013; Tørseth et al., 2012; Prestbo and Gay, 2009; Weiss-Penzias et al., 2015). In China, PBM concentrations up to 1000 pg m−3 and GOM concentrations up to 100 pg m−3 have been observed, but no aircraft observations in the PBL and the LFT are available for this region (Fu et al., 2015, 2016; Sprovieri et al., 2016). CARIBIC measurements during intercontinental flights indicate that OM concentrations are also usually below 100 pg m−3 in the upper free troposphere (9000-12 000 m). High OM concentrations occur only occasionally and are probably caused by the direct inflow of OM from the stratosphere or by the inflow of oxidizing agents which then react with GEM (Lyman and Jaffe, 2012). A combination of ETMEP and CARIBIC observations over Germany resulted in a uniform TM and GEM distribution in the free troposphere during summer (Weigelt et al., 2016), and TM concentrations close to those measured at ground level were found on six overflights of the CARIBIC aircraft in April, June, and September. A similar vertical distribution was found in North America during winter (Brooks et al., 2014) and summer (Ambrose et al., 2015; Gratz et al., 2015; Shah et al., 2016). In none of these cases was a substantial TM gradient found inside the free troposphere, and the GEM / TM ratio was in the range of 0.95-0.99 in the upper free troposphere, which is a ratio typically found inside the PBL. During spring (14 April to 4 June) Brooks et al. (2014) consistently found low TM concentrations above 5000 m, which indicates a stratospheric intrusion of air masses with low mercury concentrations. Here, the GEM / TM ratio in the upper troposphere decreased to 0.88-0.92. For comparison, the GEM / TM ratio at the tropopause is around 0.8-0.9 and decreases to 0.6-0.8 in the first 4 km above the tropopause. A similar profile was observed by Gratz et al. (2015) on 24 June and could be attributed to high bromine concentrations. Bromine as the main oxidizing agent in the upper free troposphere is consistent with findings from CARIBIC that showed no consistent influence of ozone concentrations on the GEM / TGM ratio (Fig. S1 in the Supplement). Finally, in North America a peak of OM concentrations in the range of 100-300 pg m−3 with GEM / TM ratios below 0.9 was observed in the LFT (2000-4000 m). As there are no airborne observations in the range of 3500-6500 m, this feature has not yet been observed over Europe. The reasons for the occurrence of this OM peak, which points to GOM production at this altitude, are still unclear. However, it may be speculated that low relative humidity, low particle surface density, and high solar radiation facilitate photochemistry above the PBL. Based on the findings above, Fig. 1 depicts idealized seasonal vertical profiles for the northern midlatitudes.
Here, we investigate the capability of the models to reproduce the observed atmospheric distribution of TM, GEM, and OM. To increase the sample size for the model evaluation we created seasonal average profiles for Europe and North America. For this, we integrated the high-resolution 2.5 min Tekran data to hourly values, separated all observations into bins of 1000 m (0-1000, 1000-2000, etc.), and calculated the mean concentration as well as the 66 % quantile range for each bin. In addition to the absolute concentrations we investigate mean deviation profiles as described in Sect. 2.4.
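The binning step can be sketched as follows. The example assumes that the 66 % quantile range is the central interval between the 17th and 83rd percentiles, which is an interpretation rather than something stated in the text; the function name, argument names, and sample values are hypothetical, and the preceding aggregation of 2.5 min data to hourly values is not shown.

```python
import numpy as np

def binned_profile(altitude_m, conc, bin_width=1000.0, coverage=0.66):
    """Group samples into altitude bins; return mean and quantile range per bin.

    Returns a dict keyed by the lower bin edge in metres, with values
    (mean, lower quantile, upper quantile).
    Assumption: the quantile range is the central `coverage` interval.
    """
    alt = np.asarray(altitude_m, dtype=float)
    c = np.asarray(conc, dtype=float)
    idx = np.floor(alt / bin_width).astype(int)
    lo_q = 100.0 * (1.0 - coverage) / 2.0  # about the 17th percentile
    hi_q = 100.0 - lo_q                    # about the 83rd percentile
    profile = {}
    for b in np.unique(idx):
        vals = c[idx == b]
        profile[int(b) * int(bin_width)] = (
            float(vals.mean()),
            float(np.percentile(vals, lo_q)),
            float(np.percentile(vals, hi_q)),
        )
    return profile

# Hypothetical hourly samples: altitude (m) and TM (ng m^-3).
altitude = [200, 800, 1500, 1900, 2500]
tm = [1.6, 1.8, 1.3, 1.5, 1.2]
profile = binned_profile(altitude, tm)
```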

Europe
Based on the combination of ground-based observations from the GMOS network (Sprovieri et al., 2016; GMOS, 2016; Weigelt et al., 2013, 2015) and ETMEP observations inside the PBL and the lower troposphere (Weigelt et al., 2016), as well as CARIBIC observations in the upper troposphere and the lower stratosphere (Slemr et al., 2016), we were able to obtain comprehensive vertical mercury profiles for Europe from the surface up to 12 000 m. Here, we present two individual profiles (Fig. 2).
The first profile, measured on 21 August at 11-12:00 UTC at Leipzig, Germany, combines ETMEP and CARIBIC data and was published by Weigelt et al. (2016). Based on the discussion above and ETMEP GOM measurements being in the range of 20 to 40 pg m−3, we expect GEM to be almost identical to TM for these profiles, except perhaps for the data gap in the range of 3000-6000 m where GOM concentrations could have been higher. It can be seen that the models generally underestimate mercury concentrations. This is in line with many previous model studies which found that models tend to underestimate current TM concentrations in Europe (Bieser et al., 2014b; Chen et al., 2014; Muntean et al., 2014; Gencarelli et al., 2017). Based on a model run from 1996 to 2008, Muntean et al. (2014) hypothesized that this was due to an overestimation of emission reductions in the last decade. Moreover, a change in the speciation of mercury emissions due to new cleaning technologies of modern coal-fired power plants can have an impact on the lifetime of regional primary anthropogenic emissions. However, the majority of model values are still within the measurement uncertainty range (Fig. 2). Typically, ground-based GEM measurements have an uncertainty of around 10 %, and averaged over all European vertical profiles the models have an MNB of −0.14 and an MNE of 0.23. MNB and MNE for all models as well as the model ensemble are given in Table 3. It can be seen that all models except CCLM-CMAQ underestimate concentrations for Europe. Looking at the vertical distribution we found that the models are able to reproduce the vertical distribution of both GEM and TGM. Furthermore, we calculated the model ensemble MNB and MNE for altitude slices with a thickness of 1 km to investigate any vertical trends (Table 4). It can be seen that bias and error exhibit a very low variability inside the troposphere with a generally negative bias and MNE values mostly around 0.2 to 0.25. However, near the tropopause the bias becomes positive and the error increases strongly. Moreover, we find a slightly lower bias near the PBL, which we argue is an artifact due to the modelled PBL heights. The PBL height as calculated by the meteorological models has a large influence on the actual altitude of the Hg gradient. It can be seen, for example, that WRF-Chem simulates a PBL height of 500 m, while the observations located the top of the PBL at an altitude of 2500 m. Here, the PBL growth was delayed in the WRF meteorological model. All models exhibit higher concentrations inside the PBL and none has a gradient inside the troposphere, which is in agreement with the observations. Concerning the GEM / TGM ratio, only one model shows values lower than 0.9-0.95 inside the troposphere. The ECHMERIT model exhibits a mostly uniform GEM / TGM ratio between 0.7 and 0.8 over the whole altitude range. This would be a realistic ratio if OM measurements were underestimated by a factor of 5.
Figure 2. [...] (Weigelt et al., 2016). Lower panel: GEM / TGM profiles at Mace Head, Ireland (19 September 2013), compiled from GMOS ground-based observations (Weigelt et al., 2015) and CARIBIC measurements (Slemr et al., 2016). Solid lines indicate total mercury (TM), dashed lines indicate elemental mercury (GEM), and dotted lines depict the GEM / TM ratio given on the second x axis. The horizontal gray lines depict PBL and tropopause height. The black squares are ETMEP measurements, the gray circles are tropospheric, and the gray squares are stratospheric CARIBIC measurements.
Looking at the stratosphere, only the GLEMOS model is able to reproduce a decrease of TM concentrations above the tropopause. Due to the low resolution at this altitude, GLEMOS has only two layers between 10 000 and 15 000 m, and the modeled gradient is less steep than that observed. None of the other models give significantly lower TM concentrations in the stratosphere. However, GEOS-Chem and GEM-MACH-Hg have increased oxidation above the tropopause. In GEM-MACH-Hg the GEM / TM ratio declines from 0.9 at the tropopause (11 000 m) down to 0.2 inside the stratosphere. This is in line with observations from CARIBIC. The GEOS-Chem model also exhibits pronounced mercury oxidation above the tropopause with the GEM / TM ratio declining from 0.9 to 0.1 in the 5 km above the tropopause. ECHMERIT and WRF-Chem-Hg have no increased oxidation or reduced TM concentrations above the tropopause. The CMAQ-based models CCLM-CMAQ and CMAQ-Hem have the tropopause as their upper boundary and do not model the stratosphere.
The second profile is a combination of ground-based observations at the GMOS station Mace Head, Ireland, with the CARIBIC flight of 19 September at 06-08:00 UTC (Fig. 2). In 2013, the CARIBIC aircraft passed close to Mace Head six times within a range of 86-220 km (27 and 28 April, 7 and 8 June, 19 and 20 September). Here, we depict the profile for the nearest CARIBIC overflight; the other profiles look similar. The CARIBIC data are separated into tropospheric and stratospheric measurements based on the relative height above the tropopause (Sprung and Zahn, 2010). In this region, which is influenced by clean air from the Atlantic Ocean, we did not observe a gradient between the surface and the upper troposphere. Again, models tend to underestimate mercury concentrations. At Mace Head all models are able to reproduce the constant TM concentrations in the free troposphere. However, several models overestimate the concentrations near the surface. It has to be noted that Mace Head is a coastal station with predominantly westerly winds from the open Atlantic, which might be difficult to reproduce for models with a coarse resolution; the higher modeled ground-based concentrations could thus be due to anthropogenic emissions from Ireland. At the tropopause, the observations show an almost instantaneous decrease of TM concentrations from 1.4 to 1.0 ng m−3. The models behave similarly to the profile over Leipzig, with only GLEMOS showing a decrease above the tropopause. The models with a higher vertical resolution near the tropopause (GEM-MACH-Hg with 12 layers and GEOS-Chem with 5 layers between 10 000 and 15 000 m) are better able to reproduce the gradient, but they only show a decrease in the GEM / TM ratio, not in the TM concentration.
As described above, we calculated an average summer vertical profile for Europe using data from five ETMEP profiles in Germany and Slovenia performed between 19 and 23 August, complemented by CARIBIC flights on 21 and 22 August and 18 and 19 September. Thus, we created an average profile with 290 hourly samples based on a sampling interval of the co-located Tekran instruments of 2.5 min (Fig. 4). We did not use measurements from the Lumex instrument for this evaluation as none of the other aircraft were equipped with such an instrument. The performance of the Lumex instrument on this flight is discussed in Weigelt et al. (2016, this issue). The resulting GEM and TM profiles are depicted in Fig. 3a and b, respectively. Again, it can be seen that the models generally underestimate mercury concentrations in central Europe during August 2013. However, when looking at the mean deviation profile (MDP), which depicts the relative vertical distribution compared to the total column average concentration, all the models are within the observed range. By investigating the experimental model runs, it can be seen that in the case with all anthropogenic emissions emitted as elemental mercury (ANTSPEC) the models have slightly higher mercury concentrations near the surface, which leads to better agreement with observed gradients. While all models give similar vertical profiles for the BASE and ANTSPEC cases, in the cases without anthropogenic emissions (NOANT) and without atmospheric chemistry (NOCHEM) the models show different responses. In these cases the modeled vertical distributions of mercury start to diverge from the observations and from each other. This shows the strong impact of atmospheric chemistry on the vertical GEM distribution and global mercury transport in general.

North America
We created similar average vertical mercury profiles for North America based on 185 hourly samples from three profile flights at Tullahoma, TN, between 18 January and 14 April 2013 (Brooks et al., 2014) (Fig. 4) and 898 hourly samples from seven NOMADSS flights between 20 June and 12 July 2013 (Fig. 5). For the NOMADSS flights we selected vertical flight paths for this evaluation and discarded horizontal flight paths. Here, the observations exhibit a similar vertical distribution with higher concentrations inside the PBL and lower concentrations in the FT. The NOMADSS profile contains one flight with a stratospheric intrusion and thus shows a slightly decreasing trend in the upper troposphere. Observed profiles and model results for North America are comparable to Europe. For the summer profile (Fig. 5) there are elevated TM concentrations inside the PBL and no trend inside the FT. Models tend to underestimate TM and GEM concentrations but are in good agreement with the relative distribution. The average MNB and MNE as given in Table 3 are similar to those for Europe. For North America only the GEM-MACH-Hg model exhibits a positive bias, and on average the models underestimate GEM concentrations by 13 %.
As for Europe, the model error shows no significant vertical gradient and exhibits a minimum near the PBL (Table 4). The higher concentrations near the surface in the ANTSPEC case lead to better agreement with observations. For the winter profile (Fig. 4) GEOS-Chem and GEM-MACH-Hg are in good agreement with the absolute GEM and TM observations. However, models do overestimate concentrations near the surface, which could be due to modelled PBL height and anthropogenic emission fluxes.
Finally, we created a third profile for spring from three profile flights at Tullahoma, TN, on 15 April, 10 May, and 4 June 2013 (Brooks et al., 2014) (Fig. 6). This profile looks different from the others. Again, TM and GEM concentrations are highest inside the PBL, but there is a second decreasing gradient between 4000 and 5000 m. Above 6000 m GEM and TM concentrations fall below 1.0 ng m−3, which is a value typically found in the stratosphere. This feature was observed on all three flights during spring and thus does not seem to be an individual outlier. Furthermore, from April to July stratospheric mass transport into the upper and mid-troposphere is known to occur regularly (Appenzeller and Holten, 1996; Allen et al., 2003; Zanis et al., 2003; Olsen et al., 2004; Schoeberl, 2004). Moreover, Sprenger et al. (2003) and Sprenger and Wernli (2003) demonstrated that cross-tropopause mass flux is highest in the midlatitudes where these mercury profiles were measured. This is also in line with observations from CARIBIC, which found stratospheric intrusions of air masses with low mercury concentrations during this time span (F. Slemr, personal communication, 2017). Stratosphere-to-troposphere transport of mercury is also the most convincing reason for observed elevated oxidized mercury concentrations in the upper troposphere, which is further discussed in the next section.

Oxidized mercury
As the different implementations of the mercury redox chemistry in the models presented here are not directly compatible, we decided to sum all oxidized model species for this comparison. Thus, in the following section we compare modeled reactive mercury (OM = GOM + PBM = TM − GEM) concentrations to observations, mostly because of the supposed equilibrium between GOM and PBM (Rutter and Schauer, 2007; Amos et al., 2012). The species measured by the presented aircraft campaigns also differ. Some measure GOM and PBM explicitly and others measure the difference between TM and GEM. Moreover, depending on the sampling inlet geometry and operating conditions, filters in the sampling line, and temperature gradients, a fraction of PBM may not be accessible to measurement (Slemr et al., 2016). In the following we treat all observations alike and interpret them as total OM measurements. As discussed in Sect. 2.4, current GOM measurement techniques based on the sorption of GOM on KCl-coated denuders have been shown to be susceptible to environmental interferences. Mainly, ozone and humidity have been shown to lead to an underestimation of ambient GOM concentrations (Lyman et al., 2010; Jaffe et al., 2014; Gustin et al., 2015). Thus, we focus the following model evaluation on the relative distribution of OM in the atmosphere rather than on absolute values.

Europe
Measurements at Waldhof, Germany, indicate that there is a strong OM gradient inside the PBL with very low concentrations at the surface and 10-15 times higher concentrations above 500 m. This is to be expected because of the high stickiness and therefore fast dry deposition of OM on surfaces (Zhang et al., 2009). During the ETMEP campaign a total column OM measurement was performed inside the PBL above the ground-based measurement station Waldhof (Fig. 6). Five of the seven models (GLEMOS, GEOS-Chem, GEM-MACH-Hg, CMAQ-Hem, CCLM-CMAQ) are able to reproduce the OM concentrations above the surface, while one of the remaining models overestimates and the other underestimates the concentration. It has to be noted that ECHMERIT, which strongly overestimates OM, is able to reproduce the low concentrations at the surface and thus is in good agreement with the relative vertical distribution. An investigation of the experimental model runs indicated that the overestimation at the surface is due to anthropogenic emissions and was reduced significantly in the ANTSPEC run, while concentrations above the surface are mainly driven by atmospheric chemistry. This is in line with the findings of Bieser et al. (2014a) and Weigelt et al. (2016).

North America
For North America, we use the same profiles as described in Sect. 3.1.2. On the flights at Tullahoma, GOM and PBM were measured and for the analysis we plotted the sum as total OM. Due to the long sampling times necessary for denuder measurements, the sample size is much smaller than for the GEM observations. The winter profiles are based on 32 samples (Fig. 7) and the spring profiles on 48 samples (Fig. 8).
During winter, OM concentrations varied around 30 pg m−3, with slightly lower concentrations inside the PBL. For the BASE case, model results are mostly inside the uncertainty range of the observations. During winter the models with the lowest OM production (GEM-MACH-Hg, GLEMOS, CMAQ-Hem) are closest to the observations. ECHMERIT generally overestimates OM concentrations, while GEOS-Chem provides increasing concentrations above 4000 m which are not in agreement with observations. This increasing trend was also found in models when using the GEOS-Chem and p-TOMCAT bromine fields (BRCHEM1 and BRCHEM2). However, the peak is much more pronounced in the GEOS-Chem run. Further investigation of the experimental model runs indicates that the amount of oxidized mercury is strongly dependent on the choice of CTM. For example, the ECHMERIT model produces the highest OM concentrations for all chemical reactions. With the exception of ECHMERIT, all models are closest to the observations in the BASE case. Looking at the relative vertical distribution, the observations give lower OM concentrations inside the PBL and no trend in the free troposphere. The gradient at the PBL can be reproduced by all chemical reactants, but bromine and OH chemistry lead to an increasing trend in the upper troposphere (Fig. 8).
Here, only the ozone chemistry is able to reproduce the observed profiles. By investigating the correlation coefficient it can be seen that the model runs using bromine chemistry have a much lower R value compared to model runs using ozone and OH (Table 5a). This can also be seen for the BASE case, as models mainly based on ozone chemistry (GLEMOS, ECHMERIT, CMAQ-Hem) tend to have a better correlation than models based on other oxidants. However, the CMAQ-Hem model has a negative correlation because it cannot reproduce the OM gradient at the PBL.
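How a correlation coefficient separates models by the shape of their vertical profile, independent of their bias, can be sketched as follows. The example assumes that R is the Pearson correlation between modeled and observed layer means, which is not stated explicitly in this section; the function name and all profile values are hypothetical.

```python
import numpy as np

def profile_correlation(model_layer_means, obs_layer_means):
    """Pearson correlation between modeled and observed layer means.

    Assumption: R in the text is the Pearson coefficient over altitude layers.
    """
    m = np.asarray(model_layer_means, dtype=float)
    o = np.asarray(obs_layer_means, dtype=float)
    return float(np.corrcoef(m, o)[0, 1])

# Hypothetical OM layer means (pg m^-3), surface to upper troposphere.
obs = [20, 30, 32, 31, 30]       # lower inside the PBL, flat above
model_a = [15, 25, 27, 26, 25]   # same shape, constant low bias
model_b = [20, 25, 35, 50, 70]   # spurious increase with altitude

r_a = profile_correlation(model_a, obs)
r_b = profile_correlation(model_b, obs)
```

In this illustration the biased but correctly shaped profile (`model_a`) correlates perfectly with the observations, whereas the profile with an increasing trend aloft (`model_b`) correlates less well, even though its values near the surface are closer to the observations.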
The spring profile for OM at Tullahoma is depicted in Fig. 9. Here, a strong OM peak of up to 150 pg m−3 can be seen at an altitude of 3000-5000 m. This peak is above the PBL, which was between 2500 and 3200 m during these flights, all of which were made during the afternoon when the PBL reaches its highest expansion. In the BASE case most models fail to reproduce this peak and only CMAQ-Hem and ECHMERIT, both using ozone chemistry, give similar vertical profiles. On average, the multi-model mean is close to the observed concentrations but exhibits only the typical gradient at the PBL and no pronounced OM peak. Investigating the relative vertical distribution for different chemistry sensitivity runs reveals that ozone and OH chemistry are able to reproduce the observed peak. For bromine chemistry the profiles are inverted, exhibiting a minimum where the maximum OM concentrations were observed. Comparing the OM profiles to the TM profiles (Fig. 6) shows that the OM peak is below the presumably stratospheric low TM air masses. This could be an indication that the increased oxidation is due not to stratospheric bromine transport but rather to regional oxidation above the PBL. This would explain why the bromine chemistry cannot reproduce this peak but ozone and OH chemistry can. Of course, it has to be stated that the bromine fields themselves are also subject to large uncertainties and thus the interpretation of these findings depends on the quality of the bromine fields. However, results are similar for the independent bromine datasets from GEOS-Chem and p-TOMCAT. Furthermore, there were only two OM measurements that indicate the decline above 6000 m, and it would also be possible that this peak extended further upwards and was due to a deep stratospheric intrusion. Looking at the correlation coefficient, it can be seen that model runs based on bromine chemistry have, on average, a much lower correlation (Table 5b). The GLEMOS model is even strongly anticorrelated when using bromine fields. Again, model runs based on OH and ozone chemistry exhibit much higher correlation coefficients compared to model runs based on bromine chemistry. We interpret these findings to be an indicator of secondary oxidation processes by ozone and OH, as described by Horowitz et al. (2017), taking place near the PBL.
Finally, we evaluate the model performance for OM for the summer profile based on NOMADSS data from June and July 2013. Due to the differential measurement approach of the DOHGS instrumental setup, the sample size is equal to that of the GEM profiles (Lyman and Jaffe, 2012; Ambrose et al., 2013, 2015). The larger sample size, together with the fact that NOMADSS observations cover a region larger than the vertical profiles over Tullahoma, leads to a higher variability in the measurements, as given by the 66 % quantile range (Fig. 10). We created the average OM profile from the same data as the GEM profile. For OM measurements below the detection limit we used half the reported detection limit, which varied between 74 and 138 pg m−3, thus giving us a minimum OM concentration of 34 pg m−3, which is in line with the other observations previously presented.
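The treatment of measurements below the detection limit can be sketched as follows. This is a minimal illustration of substituting half the per-sample detection limit; the function name is hypothetical and the values are illustrative, not NOMADSS data.

```python
import numpy as np

def substitute_below_detection(values, detection_limits):
    """Replace values below their detection limit with half that limit."""
    v = np.asarray(values, dtype=float)
    dl = np.asarray(detection_limits, dtype=float)
    return np.where(v < dl, 0.5 * dl, v)

# Illustrative OM samples and per-sample detection limits (pg m^-3).
om = [20.0, 150.0, 90.0]
limits = [80.0, 80.0, 120.0]
om_filled = substitute_below_detection(om, limits)
```

Because the detection limit varies between samples, the substituted value varies as well, so the lowest value in the averaged profile is set by the smallest detection limit.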
The resulting profile exhibits a distinct vertical distribution with lower concentrations inside the PBL (40-60 pg m−3), an OM peak directly above the PBL (100-350 pg m−3), lower concentrations in the mid-troposphere (50-200 pg m−3), and increasing concentrations in the upper troposphere (100-300 pg m−3). The increasing trend in the upper troposphere was attributed to an episode with high bromine concentrations (Gratz et al., 2015), and accordingly only the model runs with bromine chemistry can reproduce this (Fig. 10, BRCHEM). The underestimation of the absolute OM concentrations by all models besides ECHMERIT is in line with the findings of Schmidt et al. (2017), who find that current models strongly underestimate bromine concentrations in this area.
The finding that the ozone and OH reactions cannot reproduce the observed increase in OM concentrations in the upper troposphere is in line with findings from CARIBIC, where no correlation of ozone with the GEM / TM ratio was found (Fig. S1). Looking at the correlation coefficients we find that model runs based on ozone and OH chemistry exhibit no correlation or even anticorrelation. For this episode the correlation coefficients are generally low, but the default setups for all model runs based on bromine chemistry give the highest values (Table 5c). We argue that the low correlation coefficients are due to two overlaying processes: ozone and OH-based oxidation in the LFT and bromine-induced oxidation in the mid- to upper troposphere.
Similarly to the spring profile at Tullahoma, the lower OM peak lies directly above the PBL, which is an area of enhanced photolytic activity due to higher solar radiation and low particle density compared to the PBL. Also, due to the low water vapor content in this region little aqueous reduction of OM can take place. This OM peak cannot be reproduced by model runs with bromine chemistry. In fact, the resulting profiles are even inverse to the observations. Ozone and OH chemistry, in contrast, led to increased oxidation above the PBL, with the OH chemistry run showing the best agreement with the observed vertical distribution and the ozone run with the actual concentrations (Fig. 10; O3CHEM and OHCHEM).

Stratosphere
Stratospheric observations from intercontinental CARIBIC flights indicate that the GEM / TM ratio declines above the tropopause with values typically in the range between 0.6 and 0.8 in the first 4 km above the tropopause (Fig. 11). During summer, values down to 0.5 were found in the tropics. Here, we compare those models which include the stratosphere (GLEMOS, GEM-MACH-Hg, GEOS-Chem, ECHMERIT) to observations. The models exhibit greater differences in the stratosphere compared to the troposphere. ECHMERIT exhibits no GEM / TM gradient throughout the year with similar values of 0.7-0.9 in troposphere and stratosphere. Although the model cannot reproduce the declining trend above the tropopause, it is mostly within the uncertainty range of the observations. GLEMOS shows the best agreement with observations. It is able to reproduce the slow GEM / TM ratio decrease above the tropopause with values mostly between 0.5 and 0.7 in the first 4 km above the tropopause. GEM-MACH-Hg and GEOS-Chem both exhibit much higher oxidation rates in the stratosphere. GEM-MACH-Hg also has a slow decrease of GEM / TM ratios above the tropopause but consistently shows GEM / TM ratios below 0.3 above 12 000 m north and south of 30°. Finally, in GEOS-Chem the GEM / TM ratio decreases earlier, already a few kilometers below the tropopause at altitudes of 6000-10 000 m. Above 12 000 m almost all mercury is oxidized at the poles and even at the Equator the GEM / TM ratio drops below 0.1 above 16 000 m (Fig. 11c). On flights during summer in the range of 30-0° N a steep decline of the GEM / TM ratio to values below 0.5 was observed, which is in line with the profiles modeled by GEOS-Chem. However, it has to be considered that the uncertainty of the observations is high and at times no gradient at all was observed. The GEM and TM CARIBIC measurements are further discussed in Sect. 3.3.

Interhemispheric gradients
Finally, observations on 8 flights from Munich, Germany, to Cape Town, South Africa, and 19 flights from Munich to Sao Paulo, Brazil, are used to investigate the models' capability to reproduce interhemispheric gradients. The interhemispheric CARIBIC flights were performed between 2013 and 2017. The CARIBIC Tekran instrument, which is usually set up to measure TM, was equipped with a quartz wool filter on each return flight to measure GEM only (Slemr et al., 2016). The Tekran raw data were manually reintegrated (Slemr et al., 2016). This allows us to look at interhemispheric gradients of elemental and total mercury. However, as the two quantities were not measured on the same flights, only a range of possible oxidized mercury concentrations can be deduced. Long-range transport and a variable tropopause height can easily lead to differences larger than the expected OM concentrations on the return flight on the same flight track. Because of this, the calculated average difference of TM and GEM can sometimes be lower than zero. Most of the TM and GEM measurements were within each other's 66 % quantile range (Fig. 12a, b). The difference between the average TM and GEM concentrations was 70 pg m−3 on the flights to Cape Town (N = 756) and 100 pg m−3 on the flights to Sao Paulo (N = 1399). A detailed investigation leads to the conclusion that OM concentrations are mostly low (∼ 50 pg m−3) in the upper troposphere with occasionally high concentrations of up to 200 pg m−3 and more. This is in line with the findings presented in Sect. 3.2 and with three of the four global models, which also give an average TM-GEM difference of around 100 pg m−3. GLEMOS, GEM-MACH-Hg, and ECHMERIT are in good agreement with observations in the BASE case, while GEOS-Chem overestimates oxidized mercury in the midlatitudes (50-30° N), leading to an average of 200 pg m−3 (Fig. 12c, d). The sensitivity runs using GEOS-Chem bromine fields led to similar results for other models (Fig. 12g, h).
To create average interhemispheric transects we grouped all observations which were at least 1 km below the tropopause into bins of 5° latitude and filtered out high mercury concentrations from polluted air masses (Hg > 2.5 ng m−3). This was especially necessary on the flights to South Africa, where a few large-scale biomass burning events led to measured GEM concentrations of up to 3 ng m−3. These events can mask the interhemispheric gradient. Finally, the first and last data points include take-off and landing. This results in a stronger gradient compared to measurements in the upper troposphere.
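The construction of the latitudinal transects can be sketched as follows. The thresholds (1 km below the tropopause, 5° bins, 2.5 ng m−3) are taken from the text; the function name, argument names, the choice of labeling each bin by its southern edge, and the sample values are assumptions made for illustration.

```python
import numpy as np

def latitude_transect(lat_deg, hg, dist_below_tropopause_km,
                      bin_deg=5.0, hg_max=2.5, min_dist_km=1.0):
    """Mean Hg concentration per latitude bin after filtering.

    Keeps samples at least `min_dist_km` below the tropopause and with
    Hg <= `hg_max` (ng m^-3). Returns a dict keyed by the southern bin edge.
    """
    lat = np.asarray(lat_deg, dtype=float)
    c = np.asarray(hg, dtype=float)
    dz = np.asarray(dist_below_tropopause_km, dtype=float)
    keep = (dz >= min_dist_km) & (c <= hg_max)
    edges = np.floor(lat[keep] / bin_deg) * bin_deg
    kept = c[keep]
    return {float(e): float(kept[edges == e].mean()) for e in np.unique(edges)}

# Hypothetical samples along a flight track (latitude in degrees north).
lat = [48, 46, 12, 11, -22, -23, -24]
hg = [1.4, 1.5, 1.2, 3.0, 1.0, 1.1, 0.9]   # ng m^-3; 3.0 mimics a plume
dz = [2.0, 1.5, 3.0, 2.0, 2.0, 0.5, 2.0]   # km below the tropopause

transect = latitude_transect(lat, hg, dz)
```

In this example the plume sample at 11° N and the sample only 0.5 km below the tropopause are discarded before averaging, so neither affects the resulting gradient.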
For the model evaluation we use monthly average GEM and TM concentrations for the month during which each flight was performed from the grid cell closest to the aircraft and aggregate the model data into bins similar to the observational data. It has to be kept in mind that for models with a low vertical resolution the relevant grid cell might extend above the tropopause. Here, we focus on the relative interhemispheric gradient to evaluate the models. The relative TM and GEM trends on flights to Sao Paulo are depicted in Fig. 13 and absolute values are given in Fig. 14. Similar plots for the flights to Cape Town are given in the Supplement (Figs. S2 and S3). The models are generally in better agreement with absolute and relative observations for total mercury (Figs. 13, 14). This is mainly due to an overestimation of oxidized mercury in the Northern Hemisphere (45 to 35° N). All models give slightly better results in the ANTSPEC case and the absolute mercury concentrations are 10 % higher compared to the BASE case (Fig. 14c, d). This is consistent with the findings in Sect. 3.1. In the case without anthropogenic emissions (NOANT) mercury concentrations are much too low and in the NOCHEM run models vastly overestimate mercury concentrations. This is to be expected, as the lifetime of GEM increases without oxidation processes. The exception is the ECHMERIT model, which is very close to observations in the NOCHEM case. This is due to the fact that the ECHMERIT model does not consider dry deposition of GEM. The results in all experimental chemistry runs are strongly dependent on the dynamic response of air-sea exchange. In models that prescribe fixed oceanic emission rates, changing deposition due to changes in the chemistry scheme cannot be compensated by re-emissions. The ECHMERIT model, for example, prescribes fixed oceanic mercury concentrations and thus an increase in deposition will result in lower TM concentrations and vice versa, which explains the very high TM concentrations in chemistry sensitivity runs. This underlines the importance of the air-sea exchange for global atmospheric models even near the tropopause.
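To illustrate why a relative transect separates the shape of the interhemispheric gradient from a model's absolute bias, consider the following sketch. The normalization used here (dividing each latitude bin by the transect mean) is an assumption made for illustration; the text does not specify how the relative values in Fig. 13 were computed. The function name and all concentration values are hypothetical.

```python
def relative_transect(binned):
    """Scale a binned transect so that its mean equals 1.

    binned: dict mapping latitude bin -> concentration.
    The normalization by the transect mean is an assumption of this sketch.
    """
    mean = sum(binned.values()) / len(binned)
    return {lat: value / mean for lat, value in binned.items()}


# Hypothetical binned concentrations (ng m-3), north to south.
observed = {45.0: 1.6, 20.0: 1.4, -5.0: 1.2, -25.0: 1.0}

# A hypothetical model with the same gradient but a uniform 10 % high bias.
modelled = {lat: 1.1 * value for lat, value in observed.items()}

rel_obs = relative_transect(observed)
rel_mod = relative_transect(modelled)
```

Under this assumption, a uniform multiplicative bias cancels and the two relative transects coincide, so any remaining mismatch reflects the latitudinal pattern rather than the absolute concentration level.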
For TM, no single chemistry setup could be found that reproduced the observed concentrations and trends most accurately. As was shown before in the evaluation of the vertical profiles, differences in the CTM formulation can have a larger impact than the choice of oxidant. Looking at GEM, it can be seen that different oxidants lead to different interhemispheric distributions. Here, the use of bromine fields leads to an overestimation of oxidation in the Northern Hemisphere (50–25° N). In contrast, the use of ozone and OH chemistry only leads to underestimation of the oxidation around the Equator. However, the GEM-MACH-Hg model does not exhibit this feature. With 12 layers between 10 000 and 15 000 m the GEM-MACH-Hg model has a much greater vertical resolution around the tropopause compared to the other models and this has a large impact on model results. In models with coarser vertical resolution, low stratospheric concentrations will have a larger impact on this evaluation. GLEMOS and ECHMERIT are the models with the lowest resolution in this altitude with two and three layers between 10 000 and 15 000 m, respectively. GEOS-Chem has five layers at this altitude.

Total atmospheric mercury burden
We investigated the total atmospheric mercury burden as predicted by the four global models. Looking at the vertical distribution, the models predict 30 % inside the PBL, 60 % in the free troposphere, and 10 % in the stratosphere (Fig. 15a). On average the models have a total atmospheric mercury burden of 4800 Mg (ECHMERIT 4650 Mg, GEOS-Chem 5100 Mg, GLEMOS 4200 Mg, GEM-MACH-Hg 5300 Mg), which is comparable to the 5300 Mg estimated by Amos et al. (2013). The average vertical distribution in the model ensemble is 1500 Mg in the PBL, 2800 Mg in the FT, and 500 Mg in the stratosphere. For the oxidized mercury species model results exhibit larger differences compared to TGM, leading to a smaller spread of the predicted atmospheric total GEM burden. We found that all models have a similar interhemispheric mercury distribution with 54 to 58 % of the total mercury in the Northern Hemisphere (Fig. 15b). Finally, we investigated the latitudinal distribution of GEM / TM ratios. Here, GLEMOS (0.95) and ECHMERIT (0.8) exhibit
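The burden figures quoted in this section can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the text; the variable names are illustrative.

```python
# Total atmospheric mercury burden per model, in Mg (values from the text).
burden_mg = {
    "ECHMERIT": 4650,
    "GEOS-Chem": 5100,
    "GLEMOS": 4200,
    "GEM-MACH-Hg": 5300,
}
ensemble_mean_mg = sum(burden_mg.values()) / len(burden_mg)

# Ensemble average burden per atmospheric layer, in Mg (values from the text).
layer_mg = {"PBL": 1500, "FT": 2800, "stratosphere": 500}
total_mg = sum(layer_mg.values())
layer_share_percent = {
    name: 100.0 * value / total_mg for name, value in layer_mg.items()
}
```

The ensemble mean evaluates to 4812.5 Mg, matching the rounded 4800 Mg, and the layer burdens sum to 4800 Mg with shares of roughly 31 %, 58 %, and 10 %, consistent with the rounded 30 %, 60 %, and 10 % given above.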

Conclusions
In this model intercomparison study we investigated the vertical distribution of mercury in the atmosphere and evaluated the impact of mercury chemistry and emissions. The key finding is that models are generally able to reproduce the vertical profile of TM and elemental gaseous mercury (GEM) from the surface up to the tropopause. This means largely uniform concentrations inside the PBL and FT. Increased GEM concentrations observed inside the PBL could be attributed to anthropogenic emissions. However, the models tend to overestimate GEM concentrations in the lower stratosphere and those models which feature declining GEM concentrations above the tropopause do so by oxidation to reactive mercury (OM) species, thus overestimating TM. Moreover, it was found that a high vertical resolution near the tropopause is very important for a better reproduction of the observed declining mercury gradient.
For OM, the observations indicate low concentrations inside the PBL, often below 50 pg m−3 with a strong decrease towards the surface. This seems plausible due to the high dry deposition velocity of OM. Current model setups tend to overestimate OM near the surface, which here could be attributed to the current speciation profiles used for anthropogenic emissions. Also, in the FT most observations are below 100 pg m−3, which is approximately the detection limit of current measurement techniques. Moreover, high concentrations of ozone and water vapor have been shown to negatively affect the retrieval rates of GOM species by the Tekran instruments (Gustin et al., 2015). Therefore, no further information on possible vertical gradients is available for these regions. However, two separate regions in the upper and lower free troposphere with increased GEM oxidation and OM concentrations above 100 up to 500 pg m−3 were identified in North America independently by Brooks et al. (2014) and Ambrose et al. (2013). Because current measurement techniques have been shown to underestimate concentrations of oxidized mercury (Jaffe et al., 2014; Gustin et al., 2015), we have focused the model evaluation on relative vertical distributions and correlation coefficients in order to remove the model bias and any systematic measurement error from the evaluation.
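The reasoning behind evaluating correlation coefficients can be illustrated with a short sketch. It assumes the Pearson correlation coefficient is meant (the text does not name the specific coefficient), and the profile values are hypothetical. The Pearson coefficient is unchanged by a constant offset or a positive scale factor, so a systematic low bias in the measurements or a uniform model bias does not affect it.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Hypothetical OM profile (pg m-3) on five altitude levels.
observed_om = [20.0, 40.0, 90.0, 60.0, 150.0]

# Hypothetical model: same vertical shape, three times higher plus an offset.
biased_model_om = [3.0 * value + 10.0 for value in observed_om]

# Hypothetical model: same magnitudes, but inverted vertical structure.
inverted_model_om = list(reversed(observed_om))

r_biased = pearson(observed_om, biased_model_om)
r_inverted = pearson(observed_om, inverted_model_om)
```

In this example the strongly biased model still yields a correlation of 1 (up to rounding), whereas the model with the wrong vertical structure yields a negative correlation although its concentrations have the right magnitude.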
Our interpretation of the observations is that stratospheric intrusions and tropopause folds, which mainly occur during springtime, play an important role for elevated OM concentrations in the upper FT at altitudes above 6000 m. The frequency of stratosphere-to-troposphere transport is regionally variable and has been shown to be most common in latitudes where the measurements were performed. However, long-range transport of marine bromine species as observed by Gratz et al. (2015) during the NOMADSS flights can also be an important source of stratospheric Br. Thus, we emphasize the importance of further research regarding the atmospheric bromine cycle to better understand the oxidation pathways of mercury. Besides bromine species, stratosphere-to-troposphere transport could also be a source for OM already formed in the lower stratosphere. This could also explain the missing correlation of ozone concentrations and GEM / TM ratios measured by the CARIBIC aircraft in the upper FT.
Uniformly low OM concentrations were observed during winter and could be reproduced by the models. In spring and summer, increased OM concentrations were observed above the PBL in the LFT. This could only be reproduced by models using O3 and OH chemistry. Any oxidant directly above the PBL is either produced locally or transported from the PBL and thus OH and/or O3 seem a plausible explanation. The production of stable oxidized mercury species directly above the PBL could be the result of a two-stage oxidation process, as suggested by Horowitz et al. (2017). Moreover, reduced water vapor content and particle surface densities would reduce any occurring aqueous OM reduction processes.
Finally, we have investigated TM and GEM concentrations and gradients in the upper troposphere between the Northern and Southern Hemisphere based on intercontinental CARIBIC flights. The models were more adept in reproducing TM concentrations and trends compared to GEM. Model runs using bromine reactions showed a better agreement with observed intercontinental TM gradients. However, the current bromine fields led to a strong overestimation of mercury oxidation in midlatitudes. Ozone and OH chemistry, however, led to overestimated oxidation in the tropics. Interestingly, reducing the OM fraction in the anthropogenic emission inventories led to a better agreement with observed concentrations. This could be due to high OM fractions for coal-fired power plants in current emission inventories, which have high stacks, and thus effective emission heights can even be above the PBL at times.
Data availability. Mercury modeling and measurement data discussed in this paper are reported within the GMOS central database and are available upon request at http://sdi.iia.cnr.it/geoint/publicpage/GMOS.
Competing interests. The authors declare that they have no conflict of interest.

Figure 1. Idealized observed TM and GEM mercury profiles for winter, spring, and summer in northern midlatitudes. The depicted profiles are based on aircraft observations from CARIBIC, ETMEP, NOMADSS, and Tullahoma flights. Data gaps in altitude where no observations are available were estimated.

Figure 2. Upper panel: GEM / TGM profiles at Leipzig, Germany (21 August 2013), compiled from ETMEP and CARIBIC measurements (Weigelt et al., 2016). Lower panel: GEM / TGM profiles at Mace Head, Ireland (19 September 2013), compiled from GMOS ground-based observations (Weigelt et al., 2015) and CARIBIC measurements (Slemr et al., 2016). Solid lines indicate total mercury (TM), dashed lines indicate elemental mercury (GEM), and dotted lines depict the GEM / TM ratio given on the second x axis. The horizontal gray lines depict PBL and tropopause height. The black squares are ETMEP measurements, the gray circles are tropospheric, and the gray squares are stratospheric CARIBIC measurements.

Figure 3. Comparison of modelled average mercury profile for Europe to observations based on vertical profiles from ETMEP and CARIBIC campaigns amended with ground-based observations at Waldhof and Mace Head (Weigelt et al., 2013; Slemr et al., 2016). The error bars indicate the 66 % quantile range of the observations in each altitude; the sample size for each altitude is indicated on the y axis of the legend. The mean deviation profiles (MDP) are given for TM.

Figure 4 .Figure 5 .
Figure 4. Comparison of modelled average mercury profile for North America to observations based on vertical profiles at Tullahoma, TN, from January and February 2013 (Brooks et al., 2014). The error bars indicate the 66 % quantile range of the observations in each altitude; the sample size for each altitude is indicated on the y axis of the legend. The mean deviation profiles (MDP) are given for TM.

Figure 6. Comparison of modelled average mercury profile for North America to observations based on vertical profiles at Tullahoma, TN, from April to June 2013 (Brooks et al., 2014). The error bars indicate the 66 % quantile range of the observations in each altitude; the sample size for each altitude is indicated on the y axis of the legend. The mean deviation profiles (MDP) are given for TM.

Figure 7. GOM profiles at Waldhof, Germany (23 August 2013) (Weigelt et al., 2016). The observations are a combination of ground-based measurements and a total column measurement in altitudes from 500 to 3000 m. Model values are given for BASE (solid line), ANTSPEC (dashed line), and NOCHEM (dotted line).

Figure 8. GOM + PBM with observations at Tullahoma, TN, for January and February 2013 reported by Brooks et al. (2014). The error bars indicate the 66 % quantile range of the observations in each altitude; the sample size for each altitude is indicated on the y axis of the legend.

Figure 9. Comparison of average reactive mercury profiles (GOM + PBM) at Tullahoma, TN, for April, May, and June (Brooks et al., 2014). The error bars indicate the 66 % quantile range of the observations in each altitude; the sample size for each altitude is indicated on the y axis of the legend.

Figure 10. Comparison of modeled average oxidized mercury (OM) concentration to observations based on NOMADSS flights in June and July 2013 (Shah et al., 2016; Gratz et al., 2016). The error bars indicate the 66 % quantile range of the observations in each altitude; the sample size for each altitude is indicated on the y axis of the legend.

Figure 11. Seasonal vertical profiles of modeled GEM / TM ratios for winter (upper panel) and summer (lower panel). Observations are based on TM and GEM measurements from CARIBIC flights.

Figure 12. Average interhemispheric transects for 19 flights from Munich to Sao Paulo (left) and 8 flights from Munich to Cape Town (right). Error bars indicate the 66 % quantile range of all observations for a given latitude. Average OM concentrations are calculated as TM−GEM where TM was measured on the outward and GEM on return flights; thus negative values for OM are possible (Slemr et al., 2014).

Figure 13. Relative interhemispheric transects for 19 flights from Munich to Sao Paulo. TM (left side) was measured on the outward and GEM (right side) on return flights (Slemr et al., 2014). Error bars indicate the 66 % quantile range of all observations for a given latitude. Plots in the left column are for TGM and those in the right column for GEM.

Figure 14. Average interhemispheric transects for 19 flights from Munich to Sao Paulo. TM (left side) was measured on the outward and GEM (right side) on return flights (Slemr et al., 2014). Error bars indicate the 66 % quantile range of all observations for a given latitude.

Table 2. Specification of model experiments.

Table 3. Mean normalized bias (MNB) and mean normalized error (MNE) for each model as well as for the model ensemble for GEM in Europe and North America.

Table 4. Model ensemble vertical distribution of model mean normalized bias (MNB) and mean normalized error (MNE) for GEM in Europe and North America.

Table 5. Correlation of individual models for OM profiles depicted in Figs. 8, 9, and 10.