The magnitude and causes of uncertainty in global model simulations of cloud condensation nuclei

Abstract. Aerosol–cloud interaction effects are a major source of uncertainty in climate models so it is important to quantify the sources of uncertainty and thereby direct research efforts. However, the computational expense of global aerosol models has prevented a full statistical analysis of their outputs. Here we perform a variance-based analysis of a global 3-D aerosol microphysics model to quantify the magnitude and leading causes of parametric uncertainty in model-estimated present-day concentrations of cloud condensation nuclei (CCN). Twenty-eight model parameters covering essentially all important aerosol processes, emissions and representation of aerosol size distributions were defined based on expert elicitation. An uncertainty analysis was then performed based on a Monte Carlo-type sampling of an emulator built for each model grid cell. The standard deviation around the mean CCN varies globally between about ±30% over some marine regions to ±40–100% over most land areas and high latitudes, implying that aerosol processes and emissions are likely to be a significant source of uncertainty in model simulations of aerosol–cloud effects on climate. Among the most important contributors to CCN uncertainty are the sizes of emitted primary particles, including carbonaceous combustion particles from wildfires, biomass burning and fossil fuel use, as well as sulfate particles formed on sub-grid scales. Emissions of carbonaceous combustion particles affect CCN uncertainty more than sulfur emissions. Aerosol emission-related parameters dominate the uncertainty close to sources, while uncertainty in aerosol microphysical processes becomes increasingly important in remote regions, being dominated by deposition and aerosol sulfate formation during cloud-processing. The results lead to several recommendations for research that would result in improved modelling of cloud–active aerosol on a global scale.


Introduction
Successive Intergovernmental Panel on Climate Change (IPCC) reports have identified aerosol direct and indirect effects on climate as the largest uncertainty in the assessment of anthropogenic forcing (Schimel et al., 1996; Penner et al., 2001; Forster et al., 2007). Global aerosols can impact the climate in two distinct ways. The direct radiative effect is a result of atmospheric aerosols reflecting or absorbing solar radiation and thereby cooling or warming the climate system. The indirect effect refers to the many ways in which aerosols interact with clouds, leading to changes in droplet concentrations, cloud albedo and precipitation (Lohmann and Feichter, 2005).
In response to the persistent uncertainty in aerosol forcing assessments, global aerosol microphysics models have been developed to describe more realistically the evolution of size-resolved aerosol properties, which determine the complex interactions between aerosols, clouds and the climate (Binkowski and Shankar, 1995; Jacobson, 1997; Whitby and McMurry, 1997; Ackermann et al., 1998; Ghan et al., 2001; Adams and Seinfeld, 2002; Lauer et al., 2005; Liu et al., 2005; Stier et al., 2005; Spracklen et al., 2005a, 2008; Debry et al., 2007; Mann et al., 2012; Zhang et al., 2012). These models are more complex than those used in Coupled Model Intercomparison Project (CMIP) assessments (whose results feed into IPCC assessments) because they attempt to simulate the microphysical processes that determine the aerosol particle size distribution and composition on a global scale. In principle, this development in model sophistication should improve model fidelity, but the increased complexity has led to an increase in the number of uncertain model parameters, many of which have very weak observational constraints and an incomplete scientific understanding (Ghan and Schwartz, 2007). Computational constraints have also restricted the grid resolutions used for tracer transport in the models, and forced modellers to introduce simplifications, such as parameterisation of the size distribution into log-normal modes or the use of a small number of bins in sectional approaches.
Assessment of multi-model diversity is the main way in which information about model uncertainty is obtained. Model intercomparison projects compare simulations of an ensemble of independent and often structurally different models over a small range of scenarios (Gates et al., 1998; Joussaume and Taylor, 1999; Meehl et al., 2000; Friedlingstein et al., 2006; Haywood et al., 2010; Kravitz et al., 2011). Many aspects of global aerosol models have been compared in this way as part of the AEROCOM project (Schulz et al., 2006; Textor et al., 2006, 2007; Meehl et al., 2007; Shindell et al., 2008; Koch et al., 2009). These comparisons have provided valuable information about model diversity that underpins the assessment of aerosol impacts on climate. However, aerosol microphysics models have only recently been included in these assessments. Moreover, the multi-model ensemble approach provides limited information about how the different treatment of processes in the models drives their simulations, making it difficult to attribute the sources of model diversity. Thus, approaches based on perturbation of the parameters in a single model (often called perturbed physics ensembles, or PPEs) offer a valuable, controlled way to explore process uncertainties systematically.
Our lack of understanding of how complex models behave across the full parameter space has several implications for the development, evaluation and use of global aerosol–climate models. First, it means that we cannot have confidence in the robustness of the models; our simulations might change if a different but plausible parameter setting was used. Second, it limits what we can conclude when the model is compared against observations. Do biases represent a fundamental weakness in the design of our model (such as missing processes) or do they simply mean that we have not evaluated or observationally calibrated our model over the full range of the parameters already in it? Third, we cannot confidently identify the model factors that most affect the uncertainty, which risks making model development an ad hoc process rather than one driven by the desire to reduce the persistent uncertainty in aerosol forcing.
Very few studies have attempted to quantify the parametric uncertainty of a single global aerosol model because of the computational expense. The first uncertainty analysis of the aerosol indirect effect was carried out by Pan et al. (1997), who used the probabilistic collocation method to produce an approximation to their computer model in order to make uncertainty analysis feasible. Ackerley et al. (2009) studied the climate responses to changes in several sulfate aerosol parameters as part of the Climateprediction.net project, with a simpler aerosol scheme than we use here. More recently, Haerter et al. (2009) studied the parametric uncertainty in aerosol indirect radiative forcing based on 7 cloud-related parameters with the ECHAM5 model. Lohmann and Ferrachat (2010) examined the effects of parametric uncertainty on the climate in a global aerosol model by systematically varying 4 cloud parameters at specified values following a factorial design with 168 model runs, and showed a parametric uncertainty in the aerosol–climate effect of 11 % when considering the uncertainty in the four cloud parameters. Another approach to understanding uncertainty is to use the adjoint of the model, which has been applied to cloud drop number in Karydis et al. (2012). Sensitivity analysis of cloud–aerosol interactions has been carried out by Markov chain Monte Carlo simulations using an inverse modelling approach in Partridge et al. (2012). These approaches require either a very large number of model simulations in a Monte Carlo-type approach (Ackerley et al., 2009) or a specific experimental design such as the factorial approach (Lohmann and Ferrachat, 2010), both of which are feasible only for a small number of parameters. However, the latest generation of global aerosol microphysics models has many tens of uncertain parameters. In order to make a realistic assessment of the spread in model simulations, a more efficient statistical approach is required.
We present a more efficient statistical approach here.
In our previous work we have demonstrated that Gaussian process emulators and variance-based sensitivity analysis can be used to study the sensitivity of global cloud condensation nuclei across the full uncertainty space of 8 microphysics parameters and emissions (Lee et al., 2011). Here we extend these studies to a much more comprehensive assessment of model uncertainty covering more parameters, with the selection and range of values based on expert elicitation. We quantify the uncertainty in cloud condensation nuclei (CCN) due to 28 parameters, with 10 related to aerosol microphysical processes, 14 related to emissions of aerosol precursor gases and primary particles, and 4 related to the representation of the size distributions in the microphysics model. The host model physics was not perturbed.
In this paper, we focus on CCN because it is the fundamental quantity that drives the aerosol indirect effect on climate through changes in cloud drop concentrations, cloud albedo and precipitation processes. However, the approach could be applied to assess and attribute uncertainties in other key predicted quantities such as aerosol optical depth, absorption or direct and indirect forcings. Our comprehensive coverage of aerosol model parameters provides the first essentially complete assessment of the parametric uncertainty of this key aerosol quantity. The results provide a detailed picture of the causes of model uncertainty mapped spatially and temporally across the globe for a full year. The ranked list of important parameters provides a strong steer on priorities for future model development and simplification. We use the term uncertainty in this study to imply the simulated range of CCN about the mean caused by an uncertainty range of input parameters determined by expert elicitation. The range of uncertainty about the mean is based on a complete sampling of the aerosol parameter uncertainty space, and is presented here in terms of the standard deviation of a CCN probability distribution for every grid cell of one altitude level of the model. The variance-based sensitivity analysis enables the contributions to this uncertainty to be quantified. We often refer to the parameter sensitivity as the "contribution to the uncertainty", which is justified given that we are able to calculate the absolute reduction in CCN standard deviation if a parameter were known precisely.
Atmos. Chem. Phys., 13, 8879-8914, 2013 www.atmos-chem-phys.net/13/8879/2013/
In Sect. 2 we introduce the global aerosol model, although this has been described in detail elsewhere. In Sect. 3 we describe the elicitation exercise, statistical approach and experimental design in general terms. In Sect. 4 we describe the uncertain parameters and their physical meaning in the model. In Sect. 5 we show the validation of the emulators. The results are presented in Sect. 6 in terms of the uncertain parameters and different global regions.

Model description and set-up
The GLObal Model of Aerosol Processes (GLOMAP-mode) is an aerosol microphysics module that simulates the evolution of the size distribution and composition of aerosol particles on a global 3-D domain. The model has been used in several studies of global aerosol (Schmidt et al., 2010; Woodhouse et al., 2010, 2012; Spracklen et al., 2011b; Lee et al., 2012; Mann et al., 2012) and is a faster version of the GLOMAP-bin module that has been very widely used (e.g. Spracklen et al., 2005a, b, 2010, 2011a; Korhonen et al., 2008; Reddington et al., 2011). Both models have been compared and evaluated against observations in Mann et al. (2012).
Here, the aerosol model is run within the TOMCAT global 3-D offline chemistry transport model (CTM) (Chipperfield, 2006). The same GLOMAP-mode module is also implemented within a general circulation model, being the aerosol component of the UK Chemistry and Aerosol (UKCA) sub-model of the Hadley Centre Global Environmental Model. In a CTM the aerosol and chemical species are transported and mixed by 3-D meteorological fields read in from analyses, here from the European Centre for Medium-Range Weather Forecasts ERA-40 reanalyses (Uppala et al., 2005). The CTM runs here are at 2.8° × 2.8° horizontal resolution with 31 vertical levels between the surface and 10 hPa. Aerosol transport is calculated on the 3-D grid every 30 min by temporally interpolating between the analyses, which are updated every 6 h. Uncoupling the aerosol from the model transport and meteorology, as we do here in the CTM, provides a useful environment for our analysis, as we can examine the changes in aerosol properties without the complicating effects of dynamical responses. If the meteorology evolved dynamically in each model run, the extra source of variability would prevent us from decomposing the variance into its original sources. The dynamically evolving features could be added to the statistical analyses, but that is beyond the scope of this study.
The GLOMAP-mode simulations here use the full 7-mode configuration (as in Mann et al., 2010) with one nucleation mode and soluble and insoluble modes covering the Aitken, accumulation and coarse size ranges. The modes are described by log-normal size distribution functions that are characteristic of observed particle distributions. The scheme resolves the main microphysical processes that shape the particle size distribution on a global scale: emissions of primary particles and precursor gases, new particle formation, coagulation, gas-to-particle transfer, cloud processing and dry and wet deposition. It includes the aerosol chemical components sulfate, sea salt, black carbon (BC), organic carbon (OC) and secondary organic aerosol (SOA). The SOA is lumped with the OC component after condensation. Aerosols and precursor gases in GLOMAP are emitted over a few model levels: SO2 emissions from industry and power plants are emitted between 100 and 300 m; volcanic SO2 and biomass burning SO2, BC and OC are emitted over a range of altitudes depending on the location. The model includes dust, but we have not included it among the uncertain parameters since our focus is on CCN, which we have previously shown are not strongly affected by dust particles. The important parameters and their effects in the model are described in detail in Sect. 4. The implementation of GLOMAP-mode in the CTM has been shown to compare well with ground-based and aircraft observations of aerosol mass and number (Schmidt et al., 2012; Spracklen et al., 2011b).
Wet deposition of particles occurs by two processes: (i) in-cloud nucleation scavenging, in which activated particles form cloud drops and are removed in precipitation, and (ii) below-cloud impaction scavenging by falling raindrops. ECMWF meteorological fields are used to diagnose large-scale frontal precipitation, and the scheme of Tiedtke (1989) is used to parameterise sub-grid convection, with precipitation assumed to occur in 30 % of the affected grid box area. These fields are updated every 6 h, but are used to calculate aerosol removal at every 30 min time step. Low-level stratified clouds which are not diagnosed as either large-scale frontal or convective are read in separately from International Satellite Cloud Climatology Project (ISCCP) D2 data (Rossow and Schiffer, 1999). In these clouds we assume that aerosol particles are activated and subsequently undergo "cloud processing" in which sulfate mass is added to activated aerosol due to aqueous phase oxidation of SO2 (see Sect. 4 for more details). The global pattern of January and July monthly mean precipitation rate is shown in Fig. 2. This version of the model does not include aerosol wet deposition due to low-level drizzling stratiform clouds. This has been shown to be important for Arctic aerosol in our model (Browse et al., 2012) but to have a small effect on global aerosol abundance. The model was run with a set-up very similar to that described in detail by Mann et al. (2010). Additional features for these runs include anthropogenic secondary organic aerosol and replacement of an earlier binary homogeneous nucleation scheme with that of Vehkamäki et al. (2002). We present results for the year 2008. The model was spun up for three months before any parameter perturbation was applied. After this common spin-up period, the parameter perturbations were applied and a further 3-month spin-up was performed.
The analysis was done on monthly mean CCN based on the following 12 months of data. At the resolution used here, GLOMAP-mode takes about 1.5 h to run per month on 32 cores.
CCN concentrations and sensitivities are calculated at an altitude of 915 hPa (approximately 850 m a.s.l.), which is within the planetary boundary layer and at the approximate altitude of cloud base (where CCN concentrations are most relevant). We define CCN to be the number concentration of soluble particles larger than 50 nm dry diameter (N50). CCN is a measured quantity that is usually reported at several supersaturations of water vapour (i.e. it equates to the number of aerosol particles activated to cloud drops when a particular maximum supersaturation is reached in a cloud). Supersaturations in real clouds vary between less than 0.1 % in very slow updraughts to several per cent in storm clouds. Thus, no single CCN metric can provide a complete picture of the importance for cloud drop formation in all clouds. Our choice of CCN = N50 is equivalent to a supersaturation of about 0.3 % and is typical of values reached in stratocumulus updraught cells. If we assumed a higher supersaturation (smaller diameter of activation), then CCN would become more sensitive to processes that determine the concentration of smaller particles, and vice versa for lower supersaturations.
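To illustrate the CCN = N50 definition, the following minimal sketch computes the number of particles larger than a cut-off diameter from log-normal modes, using the standard complementary cumulative distribution of a log-normal. The function names and the two example modes (number, geometric mean diameter, geometric standard deviation) are hypothetical and are not taken from GLOMAP-mode.

```python
import math

def number_larger_than(d_cut_nm, n_total, d_geo_nm, sigma_geo):
    """Number concentration of particles with dry diameter above d_cut_nm
    for one log-normal mode (complementary cumulative distribution)."""
    z = math.log(d_cut_nm / d_geo_nm) / (math.sqrt(2.0) * math.log(sigma_geo))
    return 0.5 * n_total * math.erfc(z)

def ccn_n50(soluble_modes, d_cut_nm=50.0):
    """Sum the contribution of every soluble mode above the cut-off diameter."""
    return sum(number_larger_than(d_cut_nm, n, dg, sg)
               for n, dg, sg in soluble_modes)

# Hypothetical soluble modes (illustrative values only):
# (number in cm^-3, geometric mean diameter in nm, geometric standard deviation)
example_modes = [
    (1000.0, 30.0, 1.6),   # Aitken-like mode
    (200.0, 150.0, 1.6),   # accumulation-like mode
]

n50 = ccn_n50(example_modes)
n100 = ccn_n50(example_modes, d_cut_nm=100.0)
print(f"N50 = {n50:.0f} cm^-3, N100 = {n100:.0f} cm^-3")
```

Raising the cut-off diameter (a lower assumed supersaturation) removes most of the Aitken-like contribution in this example, which is the sense in which the CCN metric controls sensitivity to the smaller particles.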

Statistical methods
To quantify the effect of parametric uncertainty on model simulations, we apply well-established statistical methods to the global 3-D aerosol model. The overall approach is shown in Fig. 1, and consists of several distinct steps: first, expert elicitation is used to choose the uncertain model parameters and represent the uncertainty in these parameters as a probability distribution. Second, statistical design is used to choose an appropriate number of model runs to explore the parameter uncertainty space. Third, Gaussian process emulation is used to estimate model output throughout the entire parameter uncertainty space. A Bayesian framework is used to combine expert prior beliefs on parameter uncertainty and model behaviour with model runs to produce a posterior distribution of model simulations to make global sensitivity analysis possible. Finally, a full variance-based sensitivity analysis is carried out using the emulator to quantify the sensitivity of model simulations to the parameters and their interactions conditional on the emulator and the elicited parameter probability distributions. In essence, we are using emulators conditioned on the GLOMAP output to generate continuous model output across the parameter uncertainty space. The emulator can then be used for a Monte Carlo-type sampling of the output to generate sufficient data to enable a full variance-based sensitivity analysis.

General principles of elicitation
Elicitation provides a framework to represent formally the uncertainty in model parameters, as judged by several experts in the relevant field, as a probability distribution. We follow the procedures of the Sheffield Elicitation Framework (SHELF) (Oakley and O'Hagan, 2010) to visualise the probability distributions. The first step is to choose the experts to participate in the elicitation process. The aim is to ensure that the experts do not bias the choice of parameters to be studied and can provide enough knowledge to produce meaningful representations of their uncertainty in the form of a probability distribution. The experts are asked in advance to think individually about the uncertain model parameters, to research the literature and to gather as much evidence as possible to support their prior beliefs about the parameter uncertainty. Different experts should have different expertise so that the evidence is wide ranging across the different model parameters, though all experts will have some feel for the whole model involved. The experts are then brought together, either face to face or through some online tool, and asked to discuss the model parameters to be studied and their uncertainty. At this stage a facilitator, most likely a statistician, is present to guide the discussion, prevent issues such as anchoring to one person's opinion, and produce the probability distributions that result from the experts' beliefs. Once the parameters have been chosen, the facilitator will ask the experts to suggest the uncertainty range for each, such that it is highly unlikely the true value of that parameter is outside the range. The range is the most crucial part of this process since the experimental design and the emulator will be based on the ranges, whilst the shape of the uncertainty distribution of the parameters can be changed later.
The shapes of the uncertainty distributions for the parameters are also elicited at this stage with all experts in discussion. The probability distribution is not restricted to the uniform or Gaussian distribution. The shape of the uncertainty distribution is obtained by asking the experts to split the uncertainty range into regions of specified probability. There are various methods for obtaining the probability ranges, as discussed in Oakley and O'Hagan (2010); the experts are asked to trial them, and their preferred method is used so that the choice of method does not influence the results. The SHELF software is used to draw the distributions based on the experts' discussions, and these are shared with the experts so that feedback can be given on the resulting distributions and changes made when necessary. One aim of expert elicitation is to remove an element of the subjectivity in such studies. As a rule, a sensitivity study follows the path of an expert choosing a process to study and a few values of the associated parameter with which to run the model. In this study, we look at many more processes, so much of the subjectivity in choosing the processes is removed. We also ask experts to choose ranges that are beyond the normal values that are used to run the model, and in fact choose ranges outside of which the parameter value is highly unlikely to fall. This approach results in a range that is wider than would normally be considered in model sensitivity studies. Furthermore, the parameter ranges are elicited independently, so the uncertainty space is much larger than would normally be considered because we do not let the knowledge of a particular parameter influence the others (i.e. the experts are not asked to make any judgement on the joint space of all parameters). Comparison of the results with observations will enable experts to review their beliefs about model processes and parameters, which is an important follow-up study.

Conduct of the elicitation exercise
In this study the elicitation involved six aerosol modelling experts and a statistician. The quartile method of elicitation was chosen from those in Oakley and O'Hagan (2010) following a trial with known true answers, such as the distance from Leeds to London. The experts were given a few weeks to decide on the uncertain parameters to study and to gather evidence. The experts then discussed the uncertain parameters with some in a single office and others by teleconference. The range of each of the uncertain parameters was decided first and then the shape determined by cutting the range into regions of 50 % probability and then the two halves further into 50 % probability. The result of the cutting process was 4 regions all believed to contain 25 % of the probability of each parameter. Throughout the elicitation the experts were shown how the shape of the probability distributions was impacted by the decisions they made regarding the regions of probability. Visualising the probability distributions proved a valuable way of assessing the choices made by the experts. The discussions showed that some parameters were quite uncertain to all experts so the uncertainty ranges were quite wide whilst others could be constrained by expert knowledge and evidence. The experts chose initially 37 parameters. An initial study of 5 months of the data following the same method presented here was used to eliminate 9 parameters, resulting in 28 parameters to include in the final study. The probability distributions for the 28 final parameters were agreed on by all experts after feedback. The experts were very confident in the ranges of the parameters even when the shape of the distribution was less certain. The details of the chosen parameters and their uncertainty distributions are given in Table 1.
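The quartile method described above can be sketched as follows. This is a minimal illustration, assuming the simplest possible shape (probability spread uniformly within each of the four 25 % regions); SHELF itself fits smooth distributions, so this is a stand-in and not the procedure used in the study. The parameter values are hypothetical.

```python
import bisect
import random

def make_quantile_function(lower, q1, median, q3, upper):
    """Inverse CDF of a piecewise-uniform distribution whose four regions
    [lower, q1], [q1, median], [median, q3], [q3, upper] each hold 25 %."""
    knots = [lower, q1, median, q3, upper]
    probs = [0.0, 0.25, 0.5, 0.75, 1.0]

    def quantile(p):
        if not 0.0 <= p <= 1.0:
            raise ValueError("p must lie in [0, 1]")
        # Index of the 25 % region that contains probability p.
        k = min(bisect.bisect_right(probs, p) - 1, 3)
        frac = (p - probs[k]) / 0.25
        return knots[k] + frac * (knots[k + 1] - knots[k])

    return quantile

# Hypothetical elicited judgements for one parameter (illustrative only):
# a range of 0 to 10 with a median of 2 and quartiles at 1 and 4.
quantile = make_quantile_function(0.0, 1.0, 2.0, 4.0, 10.0)

rng = random.Random(0)
draws = [quantile(rng.random()) for _ in range(10000)]
share_below_median = sum(d < 2.0 for d in draws) / len(draws)
print(f"share of draws below the elicited median: {share_below_median:.3f}")
```

Because the elicited range fixes the support while the quartiles only shape the density inside it, the shape can be revised later without changing the range, consistent with the emphasis placed on the range in the elicitation.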

Statistical design of the model runs
In order to build emulators of GLOMAP gridded output, 168 model runs were carried out using parameter settings sampled from a maximin Latin hypercube covering the uncertainty ranges of the 28 parameters in Table 1. Latin hypercube sampling splits the range in every dimension into n equal intervals, where n is the number of model runs, and then ensures that each interval is sampled exactly once. Parameters that are used to scale existing emissions are sampled uniformly on the log scale rather than the absolute scale to ensure a balance of points across the parameter uncertainty range. The scaled parameters are shown in Table 1. The maximin algorithm maximises the minimum distance between pairs of points in the 28-dimensional space to make it a space-filling design. Maximin Latin hypercube sampling has previously been shown to be an effective sampling design for building a Gaussian process emulator (McKay et al., 1979). We decided that 6 model runs per parameter (168 runs for 28 parameters) were sufficient, following tests during the building of the GLOMAP emulator in our previous studies. We also ran 84 model validation runs with 28 runs close to runs in the … A separate emulator was built for each month over the year and for every grid box, with CCN as the scalar output. At this stage no account is taken of spatial or temporal correlation. The set-up of the model runs is described in Sect. 2.
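The design step can be sketched in a few lines. The code below is a minimal illustration, assuming a simple random-search maximin criterion (the best of several random Latin hypercubes); the algorithm actually used for the 168-run design is not specified here, and all function names and example ranges are hypothetical.

```python
import math
import random

def latin_hypercube(n_runs, n_params, rng):
    """One Latin hypercube on [0, 1)^n_params: each of the n_runs equal
    intervals of every parameter is sampled exactly once."""
    columns = []
    for _ in range(n_params):
        strata = list(range(n_runs))
        rng.shuffle(strata)
        columns.append([(s + rng.random()) / n_runs for s in strata])
    # Transpose so that each row is one model run.
    return [[columns[j][i] for j in range(n_params)] for i in range(n_runs)]

def min_pairwise_distance(design):
    """Smallest Euclidean distance between any two runs of the design."""
    return min(math.dist(a, b)
               for i, a in enumerate(design) for b in design[i + 1:])

def maximin_lhs(n_runs, n_params, n_candidates=50, seed=0):
    """Keep the candidate hypercube with the largest minimum distance."""
    rng = random.Random(seed)
    candidates = [latin_hypercube(n_runs, n_params, rng)
                  for _ in range(n_candidates)]
    return max(candidates, key=min_pairwise_distance)

def to_log_range(u, low, high):
    """Map u in [0, 1) to [low, high) uniformly on the log scale,
    as described for parameters that scale existing emissions."""
    return low * (high / low) ** u

# Small example: 12 runs of 3 parameters (6 runs per parameter would be 18).
design = maximin_lhs(n_runs=12, n_params=3)
# Hypothetical emission scaling between 0.5 and 2.0 for the first parameter.
scalings = [to_log_range(run[0], 0.5, 2.0) for run in design]
```

With a log-scale mapping, half of the runs scale the emission down and half scale it up, which is the balance of points across the range referred to above.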

Gaussian process emulation
Gaussian process emulation (Currin et al., 1991; Haylock and O'Hagan, 1996; O'Hagan, 2006) is used to estimate model simulations at untried points throughout the space of the uncertainty of the model parameters when the computer model under investigation is too computationally expensive to be run enough times for a full Monte Carlo variance-based sensitivity analysis. Multivariate probability theory is used to produce a posterior probability distribution for the model simulations conditioned on model runs (training data) spread throughout the same space of uncertainty and a prior probability distribution to represent prior beliefs about the model behaviour. It is important to note that the emulators are based on output of the model generated from model runs covering the parameter space; they are not an alternative version of the model physics, such as the approach used by Tang and Dobbie (2011). First we explain the emulation method in its most general terms and then more specifically how we applied it in this study.
With the computer model (simulator) represented by the function η, the scalar model output is defined as Y = η(X), where X is the vector of parameter values {X_1, ..., X_28} investigated in this study. Capital letters here represent the fact that the parameters, and therefore the model output, are uncertain. The prior probability distribution used here is the Gaussian process. This means that the prior probability distribution can be specified completely by a mean function and a covariance function. The mean function is

$$E[\eta(x) \mid \beta] = h(x)^{T}\beta, \tag{1}$$

where h(x) is some function of x with coefficients β. This represents the prior belief that the expected model output is some function of the input parameters x. The covariance function is

$$\mathrm{Cov}[\eta(x), \eta(x') \mid \sigma^{2}] = \sigma^{2}\, c(x, x'), \tag{2}$$

where c is a function representing the correlation between pairs of parameter sets. It depends on the distance between the pairs and the assumed smoothness of the model response to the parameters (represented by δ), and obeys the rules that c(x, x) = 1 and that the resulting correlation matrix is positive definite (and therefore invertible). The hyperparameters β, σ² and δ are given weak conjugate prior distributions so that they are in effect estimated by the training data. The training data are provided by runs of the computer model, y = {y_1 = η(x_1), ..., y_n = η(x_n)}. The choice of parameter sets used to produce the training data is determined by some space-filling design, given the ranges placed on X by the expert elicitation, to gain as much information about the simulator response η(·) as possible over the region of interest. With the training data y, the parameters β, σ² and δ are estimated. Since β and σ² are given weak prior distributions, they are calculated by maximum likelihood estimation from the training data:

$$\hat{\beta} = (H^{T}A^{-1}H)^{-1}H^{T}A^{-1}y \tag{3}$$

and

$$\hat{\sigma}^{2} = \frac{y^{T}\left(A^{-1} - A^{-1}H(H^{T}A^{-1}H)^{-1}H^{T}A^{-1}\right)y}{n - q - 2}, \tag{4}$$

where

$$H = [h(x_1), \ldots, h(x_n)]^{T} \tag{5}$$

and

$$A = \left[c(x_i, x_j)\right]_{i,j = 1, \ldots, n}, \tag{6}$$

and where n is the number of training runs and q is the number of elements in β, which depends on the prior choice of h in Eq. (1). The choice of Gaussian process prior means that the posterior probability conditioned on the training data runs will also be a Gaussian process distribution, which can be specified by a mean function and a covariance function. The posterior Gaussian process is a result of standard conditional multivariate Gaussian theory. Therefore, the mean function is given by

$$m^{*}(x) = h(x)^{T}\hat{\beta} + t(x)^{T}A^{-1}(y - H\hat{\beta}), \qquad t(x) = \left(c(x, x_1), \ldots, c(x, x_n)\right)^{T}, \tag{7}$$

which ensures that the function passes through each of the training data points, and the posterior covariance function is

$$\hat{\sigma}^{2}c^{*}(x, x') = \hat{\sigma}^{2}\left[c(x, x') - t(x)^{T}A^{-1}t(x') + \left(h(x)^{T} - t(x)^{T}A^{-1}H\right)(H^{T}A^{-1}H)^{-1}\left(h(x')^{T} - t(x')^{T}A^{-1}H\right)^{T}\right], \tag{8}$$

ensuring that the variance is zero at the training data points. The mean of the posterior distribution is used as an approximation for the computer model, and sampling from it provides the data we need for sensitivity analysis. If, after performing the model simulations, it is decided that the range or distribution of a parameter is narrower than the maximum elicited range, then the emulator can be sampled again without the need for more model runs. The covariance of the posterior distribution tells us how much uncertainty is due to using emulation rather than direct simulation of the computer model. Sampling many possible functions from the posterior distribution and comparing them to the mean function will provide us with information on how robust our results are and will form part of the emulator validation in Sect. 5.

Table 1 (fragment): ANTH SOA | Anthropogenic VOC production of SOA | 2–112 Tg a−1 | Absolute
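The posterior mean described above can be illustrated with a small pure-Python sketch. It assumes a linear prior mean h(x) = (1, x_1, ..., x_p)^T, a Gaussian correlation of the form exp{-sum(((x_i - x'_i)/δ_i)^2)} and fixed, hand-picked smoothness values δ; in the study these are estimated from the training data with DiceKriging, so the names, the toy simulator and the δ values here are all hypothetical.

```python
import math

def gauss_corr(x1, x2, delta):
    """Assumed Gaussian correlation between two parameter sets."""
    return math.exp(-sum(((a - b) / d) ** 2 for a, b, d in zip(x1, x2, delta)))

def solve(matrix, rhs):
    """Solve matrix @ z = rhs by Gaussian elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= f * aug[col][k]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = aug[i][n] - sum(aug[i][k] * z[k] for k in range(i + 1, n))
        z[i] = s / aug[i][i]
    return z

def fit_emulator_mean(X, y, delta):
    """Return the posterior mean function conditioned on training runs (X, y)."""
    n = len(X)
    H = [[1.0] + list(x) for x in X]            # rows are h(x_i)^T
    q = len(H[0])
    A = [[gauss_corr(X[i], X[j], delta) for j in range(n)] for i in range(n)]
    ainv_y = solve(A, y)                         # A^{-1} y
    ainv_h = [solve(A, [H[i][k] for i in range(n)]) for k in range(q)]
    # Generalised least squares: (H^T A^{-1} H) beta = H^T A^{-1} y
    hah = [[sum(H[i][a] * ainv_h[b][i] for i in range(n)) for b in range(q)]
           for a in range(q)]
    hay = [sum(H[i][a] * ainv_y[i] for i in range(n)) for a in range(q)]
    beta = solve(hah, hay)
    resid = [y[i] - sum(H[i][k] * beta[k] for k in range(q)) for i in range(n)]
    w = solve(A, resid)                          # A^{-1} (y - H beta)

    def mean(x):
        h = [1.0] + list(x)
        t = [gauss_corr(x, xi, delta) for xi in X]
        return (sum(hk * bk for hk, bk in zip(h, beta))
                + sum(ti * wi for ti, wi in zip(t, w)))

    return mean

def simulator(x):
    """Hypothetical cheap stand-in for one grid cell of the expensive model."""
    return math.sin(3.0 * x[0]) + 0.5 * x[1]

# Nine training runs on a 3 x 3 grid over [0, 1]^2 (illustrative design).
design = [[i / 2.0, j / 2.0] for i in range(3) for j in range(3)]
outputs = [simulator(x) for x in design]
emulator = fit_emulator_mean(design, outputs, delta=[0.5, 0.5])
print(emulator([0.25, 0.25]), simulator([0.25, 0.25]))
```

The emulator reproduces the training outputs at the design points by construction; how well it predicts elsewhere depends on the design and on δ, which is why validation runs are needed.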

Emulation of GLOMAP CCN
The emulation is carried out using the R package DiceKriging (Roustant et al., 2012). The model output y is the monthly mean CCN for each model grid cell and the model parameters x and their ranges are given in Table 1 and described in detail in Sect. 4. An emulator is built for every month and every model grid cell. In every emulator our prior beliefs assume the modelled CCN can be estimated by a simple linear regression of the parameters, and therefore h(x) = (1, x_1, ..., x_28)^T and q = p + 1 = 29, where p = 28 is the number of parameters. The covariance structure is assumed to depend on the distance between each pair of parameter sets through a Gaussian function, so that the correlation c(x, x') decays as a Gaussian function of the distance between the values of each parameter, scaled by δ_i. The emulation depends on smoothness in the modelled monthly mean CCN response to each of the 28 parameters, δ_i for i = 1, ..., 28, which is calculated by maximum likelihood estimation. Model smoothness means that we have information on all model simulations in a neighbourhood close to those where the CCN concentrations have been calculated by running GLOMAP. If there are discontinuities in the model, the emulator will not deal with these, so alternative approaches would have to be found. It is reasonable to assume no sudden jumps in the monthly mean CCN within a single grid cell within the parameter uncertainty space (and finding such jumps, if they exist, is crucial if reliable estimates of CCN concentration are to be predicted by the model). The hyperparameters of the mean function (β) and the covariance function (σ² and δ) are calculated by maximum likelihood from the training data as shown previously, but if there is reason to believe their values are known they can be used directly. In most cases there is no strong prior information on the hyperparameters, so it is often necessary to use the weak priors as we do here. The assumptions of linear mean and Gaussian correlation can be changed if more information is available or when an emulator is not well validated.

Variance-based sensitivity analysis
Variance-based sensitivity analysis is used to decompose the uncertainty in the model simulations to the uncertainty in each of the model parameters (Saltelli et al., 2000). The approach is able to quantify the sensitivity to each of the model parameters and their interactions (in the case of independent parameters), which cannot be done using the often applied one-at-a-time (OAT) studies. In a complex system such as the global aerosol cycle, interactions between uncertain parameters are thought to be likely and the effect of these interactions can be studied with the variance-based sensitivity analysis. The total variance of the CCN in each grid box is calculated by sampling from the emulator mean function shown in Eq. (7) given the uncertainty distributions in each of the 28 parameters obtained by the elicitation exercise.
With Y and X defined as in Sect. 3.2.1, the emulator is used to estimate the variance (or uncertainty) around the mean Y due to the uncertainty in X, $V = \mathrm{Var}\{E(Y|X)\}$. With independent parameters X, as we have here, the variance can be decomposed into its individual components, $V = \sum_p V_p + \sum_{p<q} V_{p,q} + \ldots + V_{1,\ldots,28}$, where $V_p = \mathrm{Var}\{E(Y|X_p)\}$ represents the variance due to parameter p alone, $V_{p,q} = \mathrm{Var}\{E(Y|X_p, X_q)\} - V_p - V_q$ represents the variance due to the interaction effect of parameters p and q, and so on. With an accurate emulator these estimates will be close to their true values.
In this study we use the extended-FAST method (Saltelli et al., 1999) in the R package sensitivity (Pujol et al., 2008) to sample from the emulator mean function and decompose the total variance in CCN into its parametric sources. The extended-FAST method, which was designed specifically for sensitivity analysis, samples the parameter uncertainty space more efficiently than Monte Carlo sampling. Two measures of sensitivity are calculated in the first instance: the main effect index and the total effect index. The main effect index, $V_p/V$, measures the percentage of the total variance that would be removed if parameter p could be learnt precisely. The total effect index, $V_{Tp}/V$, measures both the individual effect and the interaction effect of each parameter with all others as a percentage of the total variance, where $V_{Tp}$ is the sum of all variance components that include parameter p. The two sensitivity measures are compared to assess the sensitivity of the model output to interactions. If there are no interactions with parameter p, $V_p = V_{Tp}$.
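The meaning of the two indices can be illustrated with a short Python sketch. It does not implement extended-FAST; a simpler Monte Carlo "pick-freeze" scheme is swapped in (the Saltelli et al. 2010 estimator for the main effect and the Jansen estimator for the total effect). The function `toy_model`, which stands in for an emulator mean function, and the uniform parameter distributions are assumptions for illustration only, and the indices are returned as fractions rather than percentages.

```python
import numpy as np


def sobol_indices(f, p, n=100_000, seed=0):
    """Monte Carlo estimates of main effect (V_p / V) and total effect (V_Tp / V) indices.

    f maps an (n, p) array of independent U(0, 1) parameters to n outputs.
    """
    rng = np.random.default_rng(seed)
    A = rng.uniform(size=(n, p))
    B = rng.uniform(size=(n, p))
    fA, fB = f(A), f(B)
    f0 = np.mean(np.concatenate([fA, fB]))
    V = np.var(np.concatenate([fA, fB]))
    fA, fB = fA - f0, fB - f0                 # centring reduces estimator noise
    main, total = np.empty(p), np.empty(p)
    for i in range(p):
        ABi = A.copy()
        ABi[:, i] = B[:, i]                   # all columns from A except column i from B
        fABi = f(ABi) - f0
        main[i] = np.mean(fB * (fABi - fA)) / V          # Var{E(Y | X_i)} / V
        total[i] = 0.5 * np.mean((fA - fABi) ** 2) / V   # all terms involving X_i
    return main, total


def toy_model(X):
    """Hypothetical response with an interaction between parameters 0 and 2."""
    return X[:, 0] + 2.0 * X[:, 1] + 3.0 * X[:, 0] * X[:, 2]


main_effect, total_effect = sobol_indices(toy_model, p=3)
for i in range(3):
    print(f"parameter {i}: main = {main_effect[i]:.3f}, total = {total_effect[i]:.3f}")
```

In this toy case parameter 1 does not interact, so its two indices should agree to within sampling noise, whereas the total effect of parameters 0 and 2 exceeds their main effect because of the product term.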

Parameters and their meaning
As described in Sect. 3, following expert elicitation, a total of 28 uncertain model parameters were identified for the perturbed parameter ensemble. The parameters relate to microphysical processes, emissions of precursor gases and primary particles, and the structure of the aerosol model (assumptions made about the representation of the size distribution). The parameters are summarised in Table 1. Although some parameters (e.g. wildfire emissions) are likely to be better constrained in some regions than others, we have varied each parameter uniformly over the whole global 3-D domain, with the chosen uncertainty reflecting an upper limit for the range of their variation or uncertainty. Regional variations in the uncertainties could be studied by introducing separate parameters for each region, but we have not done this. The effect of a smaller range can be studied by adjusting the assumed distribution of a parameter after emulation.

Definition of microphysical process parameters
(2010) have compared a large number of nucleation rate expressions under prescribed conditions. However, our previous studies (Spracklen et al., 2005a, b; Mann et al., 2010) show that in our model the BHN mechanism predicts total particle concentrations in reasonable agreement with observations through the free troposphere (FT) and is therefore likely to predict a fairly realistic median rate. We assume that the rate could be a factor of 100 lower but only a factor of 10 higher based on evidence that our model tends to overestimate particle concentrations in the upper troposphere (UT) (Metzger et al., 2010).
In the boundary layer we use a rate expression j = A[H₂SO₄], where j is the particle nucleation rate (cm⁻³ s⁻¹), [H₂SO₄] the gas-phase sulfuric acid concentration and A a rate coefficient. This expression is based on measurements in the global boundary layer (Sihto et al., 2006; Riipinen et al., 2007; Kuang et al., 2008) and has been shown to capture nucleation events and particle concentrations successfully in a range of environments in our model (Spracklen et al., 2006). The range of the rate coefficient A is based approximately on these measurements for continental conditions. The large variation in observed rate is probably because this simple expression hides a more complex mechanism that is influenced by organic compounds. A single range was applied globally. Although there is no evidence for rapid particle formation in marine regions, it is not clear whether this is due to low H₂SO₄ or a low rate coefficient.
Ageing rate (P3). Here, ageing refers to the process by which freshly emitted water-insoluble carbonaceous particles (e.g. from biomass burning) become soluble following condensation of sulfuric acid and condensable organic matter. Emitted BC/OC particles enter the insoluble modes. The controlling parameter is the number of monolayers of soluble material (assumed to be SOA and H₂SO₄) required to convert the particles into cloud condensation nuclei, which is achieved by moving the particles from the insoluble to the soluble mode. The lower limit (0.3 monolayers) makes insoluble particles soluble within a few hours in polluted conditions, and with the upper limit (5 monolayers) this occurs on the order of days. This parameter therefore controls the particle size distribution, since particles in the soluble distribution can be wet-scavenged or undergo cloud processing, which adds sulfate mass to the particles (see parameter 8).
Only particles in the soluble modes (larger than 50 nm equivalent dry diameter) are counted as CCN. This approach (developed by Wilson et al., 2001) is a simplification of a complex process in which multiple factors can affect the water solubility of the particles and their activation into cloud drops, but is widely used in global models (e.g. Stier et al., 2005;Spracklen et al., 2006).
Activation diameter (P4). The GLOMAP-mode version used here follows the approach for activation used by Spracklen et al. (2005a), whereby particles larger than a prescribed dry diameter are able to activate to cloud drops. A single value of activation diameter is used globally in a given run. In reality, the activation diameter depends on updraught speed (usually not diagnosed in models), particle composition, and the size distribution (Pringle et al., 2009), and is therefore likely to vary spatially. However, this is a computationally expensive process to simulate, with large uncertainties in the driving variables (such as unresolved cloud-scale updraughts applied over large global grid boxes). In GLOMAP, the activation diameter controls the formation of cloud drops in all low-level clouds, which we assume are non-precipitating (see Fig. 2a). Thus it mainly controls which particles undergo cloud processing (sulfate production on the particles due to oxidation of SO₂ during the existence of cloud), and therefore how the size distribution is affected by clouds.
Droplet pH controlling in-cloud SO₄ production from SO₂ + O₃ (P5 and P6). The rate of the reaction SO₂ + O₃ → SO₄ is controlled by the pH of cloud water (Gurciullo and Pandis, 1997; Kreidenweis et al., 2003) and has been identified as an important uncertainty in the global sulfur cycle (Faloona, 2009). We assume this reaction occurs in low-level clouds (Fig. 2a) but not in deep precipitating or frontal clouds in which the formed sulfate is rapidly removed. The pH is assumed to be the controlling parameter, which leads to a change in rate by a factor of 10⁵ for pH between 3 and 6 (Seinfeld and Pandis, 1998). One pH parameter is used for clean (lower acidity) environments (SO₂ < 0.5 ppb) and one for polluted environments (SO₂ > 0.5 ppb) based on measurements (Collett et al., 1994). The pH is complicated to calculate in cloud drops because it depends on kinetic and thermodynamic processes in an evolving cloud droplet distribution that are not explicitly simulated. Therefore, most models assume a fixed pH of the cloud water to control this reaction rate. Bulk models of cloud water (no droplet size resolution) underestimate the reaction rate versus droplet size-resolving models by typically a factor of 3, but sometimes much more (Hegg and Larson, 1990). This error could be larger in marine regions with large salt particles. Our parameter represents the "effective" pH of the bulk droplets, and the range takes into account the uncertainty introduced by simplifying the process.
In-cloud scavenging diameter offset (P7). In GLOMAP we assume that particles larger than DSCAV = Activation diameter + diameter offset (P4 + P7) are removed in precipitation (at a rate determined by the loss rate of cloud water). The distribution of precipitation is shown in Fig. 2b. The lower limit of P7 (zero nanometres) assumes all activated particles are subject to removal during precipitation. A non-zero value assumes that some activated aerosol particles escape removal based on the assumption that precipitation-sized drops are initiated by the largest cloud droplets (hence largest aerosol particles) in warm clouds. These processes can only be accurately resolved in a model that treats size-resolved cloud microphysics at very high cloud-resolving resolutions, which no global models do, so they must be parameterised in global models. We do not include the scavenging rate in warm clouds as an uncertain parameter. Previous one-at-a-time tests showed that the scavenging diameter was a much more important factor in shaping the size distribution, primarily because the scavenging lifetime in most clouds is shorter than the residence time of the aerosol in cloudy grid boxes such that the time-averaged removal becomes independent of the rate. Other models include a scavenging efficiency (fraction of particles that are accessible to scavenging in one time step). However, this is entirely equivalent to scavenging rate after multiple time steps.

Scavenging efficiency in ice-containing clouds (P8). This parameter controls the fraction of particles accessible to nucleation scavenging when air is below −10 °C (i.e. scavenging affects only a fraction of the aerosol in a given time step). Our previous work has shown this parameter to be important in the Arctic (Browse et al., 2012). We treat this parameter as separate from warm cloud effects because ice cloud scavenging can affect the seasonal cycle of Arctic aerosol (Browse et al., 2012).
Dry deposition of Aitken and accumulation mode particles (P9 and P10). GLOMAP calculates the wind speed and size-dependent deposition velocity due to Brownian diffusion, impaction and interception according to Slinn (1982) using resistances from Zhang et al. (2001) and three land-surface types: ocean, forest and other. In the perturbed runs, the calculated dry deposition velocity in each time step over each surface type is scaled for each particle size by a given factor. Taking into account the difficulty of applying dry deposition mechanisms to large global grid boxes containing unresolved inhomogeneity, we assume large uncertainties in the deposition velocity of a factor of 10 for the accumulation mode particles (Giorgi, 1988).

Definition of size distribution structural parameters
Accumulation and Aitken mode widths (P11 and P12). GLOMAP-mode uses fixed geometric widths of the lognormal size distribution modes (defined by the standard deviation of the distribution). Observations show that the width can vary in time and space (Heintzenberg et al., 2000, 2004; Birmili et al., 2001). However, allowing for dynamically evolving mode widths adds to the complexity of the model and is therefore not widely adopted in global models. The chosen uncertainty ranges of the Aitken and accumulation mode widths were based mainly on Heintzenberg et al. (2004) and Birmili et al. (2001). The same widths were applied for soluble and insoluble particles. Changing the mode width modifies the size distribution for particles in that mode, which in turn affects dry and wet deposition rates, and what fraction of particles are subject to cloud-processing (see P8).

Mode separation diameters (P13 and P14). In modal aerosol microphysics schemes, separation diameters define the ranges over which the geometric mean radius can vary while staying in that mode. This is an inherent limitation of the parameterised size distribution approach used in these models. The separation size alters the mean size simulated for the affected modes and hence also changes model process rates (such as coagulation and growth) and removal timescales. The gap between the Aitken and accumulation modes is controlled partly by cloud processing of aerosol, in which in-cloud sulfate production leads to larger accumulation mode particles upon cloud evaporation. Because of this link with cloud processing, we scale this size to lie between 0.9 and 2 times the activation diameter (P4).

Definition of primary aerosol and precursor gas emission parameters
Fossil fuel, biofuel and biomass burning particle emission fluxes (P15, P16 and P17). The mass emission fluxes and spatial distribution of these primary particle emissions are as recommended for the harmonised emissions experiment in the first phase of AEROCOM using the inventories of Bond et al. (2004) and Van der Werf et al. (2003). The recommended emissions are 3.2 Tg (OA) a⁻¹ from fossil fuel, 9.1 Tg (OA) a⁻¹ from biofuel and 34.7 Tg (OA) a⁻¹ from wildfire/biomass burning. BC and OA fluxes are scaled by the same amounts as they are assumed to be within the same particles. The expert elicitation determined the uncertainty ranges to be a factor of 2 larger/smaller for fossil fuel combustion sources and a factor of 4 for biofuel and wildfire emissions since they are less certain (Bond et al., 2004, 2007). The uncertainty in wildfire emissions in some parts of the world (e.g. North America) may be less than a factor of 4, but this can be adjusted after the emulator is built (although we have not done that here).
Fossil fuel, biofuel and biomass burning particle emission sizes (P18, P19 and P20). These parameters directly control the number of emitted particles for a given mass flux, and therefore directly influence the CCN population. The size of the emitted particles is not reported in emissions inventories, but is needed for size-resolving models, and is a major uncertainty in previous model studies of CCN (e.g. Merikanto et al., 2009;Reddington et al., 2011;Spracklen et al., 2011a). For the AEROCOM prescribed emissions experiment, Dentener et al. (2006) made recommendations for the size distribution of primary emissions based on available information in the literature. They recommended finer sizes be used for fossil fuel combustion sources than for biofuel combustion and wildfire emissions. Although more recent measurements provide some information about emitted particle number concentrations (Janhäll et al., 2010), the particle size remains very uncertain. The size of fossil fuel combustion particles depends on the source. Biomass burning and wildfire particle size depends on burning efficiency (Janhäll et al., 2010) amongst other parameters, but these processes are not treated in global models.
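The link between emitted size and emitted number can be made concrete with the standard lognormal mass-to-number relation: for a fixed mass flux, the number of particles scales with the inverse cube of the number-median diameter. The sketch below is illustrative only; the diameters, geometric standard deviation and particle density are assumed values chosen for the example, not the ranges in Table 1.

```python
import math


def particles_per_kg(median_diameter_nm, sigma_g, density_kg_m3):
    """Number of particles per kg of emitted mass for a lognormal mode.

    The mean particle volume of a lognormal number distribution with
    number-median diameter D and geometric standard deviation sigma_g is
    (pi / 6) * D^3 * exp(4.5 * ln(sigma_g)^2).
    """
    diameter_m = median_diameter_nm * 1e-9
    mean_volume = (math.pi / 6.0) * diameter_m ** 3 * math.exp(4.5 * math.log(sigma_g) ** 2)
    return 1.0 / (density_kg_m3 * mean_volume)


# Assumed illustrative values (not taken from the elicitation).
sigma_g = 1.59
density = 1500.0  # kg m^-3

n_small = particles_per_kg(30.0, sigma_g, density)
n_large = particles_per_kg(60.0, sigma_g, density)
print(f"number ratio for a halved emission diameter: {n_small / n_large:.1f}")
```

Halving the assumed emission diameter at fixed mass flux multiplies the emitted number by a factor of eight, which is why the size parameters can matter as much as the mass fluxes themselves.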
Sub-grid-scale sulfate particle production (P21 and P22). Two parameters describe the formation of particles in sub-grid-scale plumes, such as power plants and degassing volcanoes (Mather et al., 2003; Luo and Yu, 2011; Stevens et al., 2012). P21 defines the fraction of the SO₂ mass that enters the model grid square as new sulfate particles, and P22 defines the size of these particles (and hence their number concentration for fixed mass). The particles are most likely formed by nucleation and growth. Previous studies have shown this to be an important source of global CCN (Spracklen et al., 2005b; Pierce and Adams, 2006; Luo and Yu, 2011), but other studies suggest a more limited effect. We base our ranges on the plume-scale study of Stevens et al. (2012).
Sea spray particle mass flux (P23). We account for uncertainties in the wind-driven mass flux of sea spray particles in the size range 35 nm to 20 µm dry diameter by adjusting the baseline flux by given factors. Below 1 µm the emissions enter the accumulation mode, and at larger sizes they enter the coarse mode. This parameter conflates multiple sources of uncertainty: the function describing the wind-speed dependence of the flux, processes that are unaccounted for in the existing parameterisations (such as fetch, sea state, etc), the wind speed itself, and the effect of spatial resolution of the wind fields used by the model. The range is comparable to previous model studies (Pierce and Adams, 2006) and reflects uncertainties in the parameterisation of measured fluxes (O'Dowd and de Leeuw, 2007).
Anthropogenic SO₂ emissions (P24). The baseline emissions are those from the year 2000 from Cofala et al. (2007), as used for the AEROCOM harmonised emissions experiment.
Time-averaged volcanic SO₂ emissions (P25). The baseline emissions are as recommended by AEROCOM and are based on Andres and Kasgnoc (1998). Emissions include continuously degassing volcanoes and time-averaged sporadic eruptions. We use the same uncertainty range as applied to continuously degassing emissions in Schmidt et al. (2012).
Dimethyl sulfide (DMS) emissions (P26). DMS emissions are controlled by the sea-water concentration of DMS and the wind-driven transfer velocity parameterisation (Nightingale et al., 2000). We conflate uncertainties in these two factors by varying the calculated sea-air transfer flux by a given factor. This approach takes into account that the absolute uncertainty in flux is likely to be higher at higher wind speeds due to the uncertainty in the flux parameterisation. Combining these two uncertainties is a reasonable approach given the lack of separate information on the global DMS sea-water concentration. The range is comparable to that predicted by different parameterisations and models (Woodhouse et al., 2010).
Biogenic SOA production (P27). The range of this parameter conflates the uncertainty in the emissions of the precursor gases (biogenic volatile organic compounds, BVOCs) and the uncertainty in the yield of SOA following multi-step oxidation reactions into a single parameter that scales the VOC emissions and fixes the yield and chemical processes. In GLOMAP, SOA is produced through oxidation of transported monoterpenes (assumed to be α-pinene) by OH, NO₃ and O₃. The SOA yield from these reactions was assumed to be 13 % in our previous studies (Spracklen et al., 2006; Mann et al., 2010), and the SOA condenses with zero equilibrium vapour pressure (i.e. it is partitioned to the aerosol according to gas diffusion-limited uptake). Recent comparisons between global models and observations have suggested a global SOA source as large as 500 Tg a⁻¹ (Heald et al., 2011). Spracklen et al. (2011b) used a comparison between the model and organic aerosol observed by the aerosol mass spectrometer to suggest a global SOA source of 50-360 Tg a⁻¹. There may be spatial variations in the uncertainty in yield that are different to the spatial uncertainty in emissions, but there is not enough understanding to constrain these two uncertainties separately. There are also uncertainties in the volatility of different compounds (Spracklen et al., 2011b) that we do not account for here.
Anthropogenic SOA production (P28). Uncertainty in anthropogenic SOA is treated in a similar way to biogenic SOA, by conflating the uncertainty in emissions and yield into a single emission uncertainty. For emissions of anthropogenic VOCs ($\mathrm{VOC_A}$), we used the same approach as in Spracklen et al. (2011b) by scaling gridded CO emissions from the IPCC. In Spracklen et al. (2011b) SRES CO emissions from anthropogenic activity (470.5 Tg (CO) a⁻¹) were scaled using VOC/CO mass ratios of 0.29 g/g so as to reproduce the global sum of VOC emissions from the Emissions Database for Atmospheric Research (EDGAR) for anthropogenic sources (127 Tg (VOC) a⁻¹). Here we vary these emissions to produce total anthropogenic SOA that lies between 2 and 112 Tg a⁻¹. We included the reaction of $\mathrm{VOC_A}$ with OH.

Validation of the emulator
Figures 3 and 4 show the validation of the emulator. Scatter plots of the emulator estimates versus the GLOMAP validation runs at various grid box locations are shown in Fig. 3, with the 95 % confidence intervals around the emulator mean calculated using Eq. (9). Figure 4a and c show maps of the January and July global emulator validation in terms of the percentage of GLOMAP validation runs that lie within the 95 % confidence interval of the emulator estimate. In most grid cells over 90 % of the GLOMAP validation simulations lie within the 95 % confidence interval of the emulator. Note that the mean emulator estimate is used for the Monte Carlo-type sampling (Sect. 3.3), and Fig. 3 shows that the emulator mean CCN is very close to the GLOMAP simulation, shown by the 1 : 1 line.
If the emulator is to be useful, then the emulator uncertainty needs to be less than the parametric uncertainty that we are aiming to quantify. The emulator uncertainty is compared to the parametric uncertainty in Fig. 4b and d. The emulator uncertainty was calculated as the standard deviation around the mean of 10 000 Gaussian process functions sampled from the emulator (Eqs. 7 and 9). Figure 4b and d show that the emulator uncertainty is less than 10 % of the parametric uncertainty.
The validity of the emulator can also be assessed subjectively by examining the maps of parametric uncertainty (next section). The CCN and sensitivity maps are produced from an analysis of 8192 independent emulators (one for each grid cell), and yet we find that the spatial patterns can be readily understood in terms of the driving processes, implying that the emulator mean is not dominated by its uncertainty in the different grid boxes. There may be grid boxes that are less well emulated, but for the purpose of our global analysis the emulators here are considered valid.
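The two validation diagnostics used in this section (the percentage of validation runs inside the emulator 95 % confidence interval, and the ratio of emulator to parametric standard deviation) can be sketched as follows. The function names and the numbers are hypothetical, and the interval is assumed to be the emulator mean ± 1.96 emulator standard deviations, i.e. a Gaussian predictive distribution.

```python
import numpy as np


def percent_within_95ci(y_valid, emulator_mean, emulator_sd):
    """Percentage of validation runs inside mean +/- 1.96 sd of the emulator."""
    y_valid = np.asarray(y_valid, dtype=float)
    emulator_mean = np.asarray(emulator_mean, dtype=float)
    half_width = 1.96 * np.asarray(emulator_sd, dtype=float)
    inside = np.abs(y_valid - emulator_mean) <= half_width
    return 100.0 * inside.mean()


def emulator_to_parametric_ratio(sd_emulator, sd_parametric):
    """Emulator uncertainty as a fraction of the parametric uncertainty."""
    return np.asarray(sd_emulator, dtype=float) / np.asarray(sd_parametric, dtype=float)


# Hypothetical CCN values (cm^-3) for four validation runs in one grid cell.
model_runs = [100.0, 205.0, 290.0, 400.0]
emulator_means = [102.0, 200.0, 300.0, 405.0]
emulator_sds = [5.0, 5.0, 5.0, 5.0]

print(percent_within_95ci(model_runs, emulator_means, emulator_sds))
print(emulator_to_parametric_ratio(5.0, 80.0))
```

Applied per grid cell to the validation simulations, the first function gives the kind of quantity mapped in Fig. 4a and c, and the second the kind of ratio mapped in Fig. 4b and d.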

Metrics of uncertainty
We describe the results in terms of three measures of uncertainty.
The standard deviation of the CCN probability distribution in each grid cell provides a direct measure of the absolute uncertainty in CCN caused by the uncertain parameters. It is calculated as the square root of the total variance due to the uncertainty in the 28 parameters (see Sect. 3.3). Figure 5 shows January and July maps of emulator-estimated CCN and the standard deviation, while Fig. 6 gives some examples of the probability distribution of CCN for selected locations, from which the standard deviation was calculated. We also carry out a variance-based sensitivity analysis to quantify the contribution of each parameter i to the variance in the modelled CCN. These parameter effect variances can also be mapped. Here we show maps of the uncertainty in CCN due to each parameter ($\sigma_{\mathrm{CCN},i} = \sqrt{V_{\mathrm{CCN},i}}$ for parameter i, where V is the variance). The $\sigma_{\mathrm{CCN},i}$ value is the square root of the main effect index times the total variance for parameter i (see Sect. 3.3). The $\sigma^2_{\mathrm{CCN},i}$ values cannot be added to obtain the total uncertainty in Fig. 5 unless there is zero interaction between the parameters.
The coefficient of variation, or relative uncertainty, is the standard deviation divided by the emulator mean CCN ($\sigma_{\mathrm{CCN},i}/\mu_{\mathrm{CCN}}$). This is shown also in Fig. 7. Relative uncertainty is a more appropriate measure of uncertainty in CCN than absolute uncertainty because the uncertainty in cloud reflectivity depends approximately on the ratio of change in cloud drop number (CDN) concentration to absolute concentration (ΔCDN/CDN), termed the susceptibility (Twomey, 1991). Although CCN and CDN concentrations are not linearly related, the relative uncertainty is more relevant for climate than the absolute uncertainty. For other quantities, like black carbon mass concentrations, the direct aerosol effect depends approximately linearly on column mass, so the absolute uncertainty in BC would be more relevant.
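A small worked example may help to connect these measures. The numbers below are invented for illustration (a grid cell with mean CCN of 250 cm⁻³, total standard deviation of 100 cm⁻³, and two parameters with main effect indices of 0.5 and 0.3, expressed as fractions rather than percentages); the function name is also an assumption.

```python
import math


def sigma_from_main_effect(main_effect_index, total_variance):
    """Per-parameter standard deviation: sqrt(main effect index * total variance)."""
    return math.sqrt(main_effect_index * total_variance)


# Hypothetical grid-cell values.
mean_ccn = 250.0                 # cm^-3
total_variance = 100.0 ** 2      # (cm^-3)^2, i.e. total sigma_CCN = 100 cm^-3
main_effects = {"parameter a": 0.5, "parameter b": 0.3}

sum_of_variances = 0.0
for name, index in main_effects.items():
    sigma_i = sigma_from_main_effect(index, total_variance)
    sum_of_variances += sigma_i ** 2
    print(f"{name}: sigma = {sigma_i:.1f} cm^-3, relative uncertainty = {sigma_i / mean_ccn:.2f}")

# The shortfall relative to the total variance is the part not attributed to
# these two main effects (other parameters and interaction terms).
print(f"sum of main-effect variances = {sum_of_variances:.0f}, total = {total_variance:.0f}")
```

In this invented case the two main-effect variances sum to less than the total variance, illustrating why the per-parameter values cannot simply be added to recover the total uncertainty when other terms, including interactions, contribute.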
The fraction of variance explained by a parameter is the reduction in variance that would be obtained if a particular parameter were known precisely. A parameter with a large contribution to variance may have its effect in a region with overall low variance. It is therefore a measure of local "research priority" (improved knowledge of highly ranked parameters would lead to a greater reduction in uncertainty in CCN) but not directly relevant to the impact on clouds and climate. Thus, information on CCN relative uncertainty and fraction of variance can be used together to estimate the effect of an uncertain parameter on climate and to identify the most important parameter in terms of reducing the uncertainty in the model.

Magnitude of uncertainty in global CCN
Figure 5 shows that the standard deviation correlates well with mean CCN concentrations, but this is not the case for the relative uncertainty. In general, the relative uncertainty is lower at low latitudes than at high latitudes, although there are exceptions in the biomass burning regions. It varies between a minimum of about ±30 % in many clean marine regions and about ±40-100 % over land areas and at high latitudes. The peak $\sigma_{\mathrm{CCN}}$ reaches 100 % over the January Arctic and July Antarctic. There is a clear seasonal cycle in relative uncertainty in parts of the Northern Hemisphere (NH). For example, the relative uncertainty in NH marine regions reaches about 30-50 % in winter but is generally less than 30 % in summer. Peaks in uncertainty at summer high latitude continental locations are associated with large uncertainties in wildfires, as we show below.
Although we do not attempt to compare the model uncertainties with observed CCN, it is worth noting that in general the spread of the model simulations is less than shown in the only compilation of global CCN measurements (Spracklen et al., 2011a). In that study the $\sigma_{\mathrm{CCN}}$ range in modelled minus observed CCN was at least 100 %. Some of this model-observation scatter may be due to poor collocation of the modelled and observed concentrations.
Fig. 4. Global validation of emulator-predicted CCN. CCN concentrations predicted by the emulator are compared against CCN from 84 additional GLOMAP model simulations for every model grid box on the 915 hPa model level. The fraction of GLOMAP simulations lying within the emulator 95 % confidence interval for every grid box is shown for (a) January and (c) July. In (b) and (d) the emulator uncertainty is shown as the standard deviation around the mean due to the emulator uncertainty ($\sigma_{\mathrm{emulator}}$) divided by the standard deviation due to the uncertain parameters ($\sigma_{\mathrm{CCN}}$, shown in Fig. 5). Thus, everywhere, the emulator uncertainty is less than 10 % of the parametric uncertainty.

Factors controlling uncertainty in CCN
The variance-based sensitivity analysis was carried out on each model grid box separately. Figure 7 shows the global distribution of the absolute and relative CCN uncertainty, and Fig. 8 shows a global summary of the ranked relative uncertainties. The ranked bar charts were calculated by globally averaging $\sigma_{\mathrm{CCN},i}/\mu_{\mathrm{CCN}}$ over all grid boxes at 915 hPa, including a weighting for grid box area. We also stratify the global data into clean/polluted according to the black carbon concentration (clean < 50 ng m⁻³, polluted > 100 ng m⁻³) (Fig. 8c) and by weighting $\sigma_{\mathrm{CCN},i}/\mu_{\mathrm{CCN}}$ by cloud fraction based on the International Satellite Cloud Climatology Project (ISCCP) global D2 all-cloud data (Rossow and Schiffer, 1999) (Fig. 8d). The cloud fraction is shown in Fig. 2a. Figure 8a and b also distinguish parameters according to whether they describe processes, emissions, model structures, or a combination of processes and emissions (the two SOA-related parameters). These global mean bar charts summarise the global importance of parameters.
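The area-weighted global averaging behind the ranked bar charts can be sketched as below. The grid is an assumption: a regular 64 × 128 latitude-longitude grid is used only because it matches the 8192 grid cells mentioned earlier, the cosine of latitude is used as a proxy for grid box area, and the relative-uncertainty field is random placeholder data. An optional extra weight stands in for the cloud-fraction weighting.

```python
import numpy as np


def area_weighted_mean(field, lat_deg, extra_weight=None):
    """Global mean of a (nlat, nlon) field weighted by cos(latitude).

    extra_weight, if given, is a (nlat, nlon) array (e.g. cloud fraction)
    that multiplies the area weights.
    """
    field = np.asarray(field, dtype=float)
    weights = np.cos(np.deg2rad(lat_deg))[:, None] * np.ones_like(field)
    if extra_weight is not None:
        weights = weights * np.asarray(extra_weight, dtype=float)
    return np.sum(weights * field) / np.sum(weights)


# Assumed regular grid with 64 x 128 = 8192 cells and placeholder data.
lat = np.linspace(-87.2, 87.2, 64)
rng = np.random.default_rng(0)
relative_uncertainty = rng.uniform(0.0, 0.4, size=(64, 128))   # sigma_CCN,i / mu_CCN
cloud_fraction = rng.uniform(0.0, 1.0, size=(64, 128))

print(area_weighted_mean(relative_uncertainty, lat))
print(area_weighted_mean(relative_uncertainty, lat, extra_weight=cloud_fraction))
```

Ranking the parameters then amounts to computing this average for each parameter's relative-uncertainty map and sorting the results; restricting the average to clean or polluted grid boxes would be done by masking the field before averaging.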
There are several things to keep in mind when comparing these uncertainty maps. First, the importance of a parameter does not necessarily imply that the associated process or emission is acting locally. For example, the activation diameter in clouds accounts for a large fraction of the uncertainty over Antarctica, although there are no clouds there. This implies that the process is the dominant factor that affects the amount of aerosol transported to the region. Second, the importance of a parameter describes the effect it has on the uncertainty in aerosol, not necessarily how important it is for determining the absolute aerosol amount. For example, a low sensitivity to the FT nucleation rate does not imply that FT nucleation could be removed from the model, but only that, when it is included in the model, the aerosol is insensitive to the choice of rate within the range we have tested: the process could possibly be simplified but not eliminated. Third, the contribution of a parameter to aerosol variance does not imply a positive association. For example, increases in biogenic SOA could lead to decreases in CCN due to increases in aerosol surface area and suppression of nucleation. Below we describe the factors controlling uncertainty in CCN first by parameter and then by region and season.

Uncertainty due to microphysical processes
Nucleation rates (P1 and P2). The peak effect of uncertainty in the rate of boundary layer nucleation on the CCN standard deviation is about 200-500 cm⁻³, or a maximum CCN relative uncertainty of 20 % in any region, although we show in Sect. 6.3.6 that the peak contribution can locally reach 40 % in some months. The fraction of variance is also generally less than 40 % and is highly localised over remote parts of summertime Canada, the European boreal forest, the Arctic, South Africa and parts of Asia. FT nucleation is a process of high importance to CCN (Merikanto et al., 2009), but CCN are relatively insensitive to its rate. The greatest contribution to the standard deviation is mostly over land areas, reaching a $\sigma_{\mathrm{CCN}}$ of 100-200 cm⁻³ and a peak relative uncertainty of about 25 % at high latitudes, but generally less than 10 %. The regions where the FT nucleation rate is most important do not coincide with the regions where FT nucleation makes the greatest contribution to nucleated CCN, which are the subtropical marine regions. Over clean regions the production of CCN is mainly through slow coagulation through the dry FT, making the CCN insensitive to the initial nucleation rate in the UT. Over polluted regions with higher vapour supply, there is more condensational growth of the particles, and a larger fraction survive to CCN, making the CCN in the BL more sensitive.
Ageing (P3). Ageing makes a localised contribution to variance over biomass burning and other BC source regions, with a $\sigma_{\mathrm{CCN}}$ uncertainty of up to 2000 cm⁻³ in regions with very high CCN of 5000 cm⁻³. However, the relative uncertainty is typically less than 10 % in these regions, and the fractional contribution to variance is everywhere less than 5 %. This low sensitivity is partly because of the much larger effect of uncertainty in the mass flux and size of the emitted particles (see below) and partly because ageing timescales are only important up to the point at which most particles have aged. Ageing is therefore a relatively unimportant source of uncertainty in these regions.
Activation diameter (P4). This is an important parameter over persistent low-cloud regions off the west coasts of continents (Fig. 2a) and at high latitudes of both hemispheres. It is ranked fourth globally, but third in clean regions. It accounts for a σ CCN uncertainty of about 50 cm −3 in sub-tropical cloudy regions and a relative uncertainty of up to 20 %. At high latitudes the effect peaks in winter, reaching a relative uncertainty of 30-40 % in the Arctic and 60 % over Antarctica. Sulfate addition in liquid clouds therefore has an important effect on uncertainty in regions dominated by transported aerosol, and Fig. 8c shows that it has a considerably more important impact on uncertainty in clean regions.
Fig. 9. The map of mean CCN is the same as in Fig. 5. In some cases the CCN concentration is negative when in reality it will be truncated at zero, meaning that the uncertainty in some places will be slightly overestimated. Since the negative CCN is confined to a small region of the parameter uncertainty space, the sensitivity analysis results will be robust to the negative values. Emulator calibration is not part of this study, but the first regions of parameter space to be removed will be those that give negative values.
Droplet pH controlling in-cloud SO 4 production (P5 and P6). The droplet pH controlling the rate of reaction SO 2 + O 3 =SO 4 is an important parameter controlling much of mid-northern latitude CCN uncertainty in air affected by long-range transport of pollution in all seasons except summer. Figure 8a shows that the droplet pH is the third-most important parameter controlling CCN uncertainty in winter. It accounts for up to 70-80 % of variance over large areas of Alaska and Asia, and generally 20-30 % of Arctic CCN in winter. The absolute impact on CCN peaks over polluted regions, reaching a σ CCN uncertainty over E. US and Europe and China of 500 cm −3 , but the relative uncertainty peaks at about 30-40 % in the Arctic winter, making it one of the most important parameters there. This pattern is consistent with the seasonal importance of the chemical reaction SO 2 + O 3 , while in summer SO 2 oxidation in cloud water is controlled by H 2 O 2 . Under polluted conditions (pH between 3.5 and 5, controlled by P6) the uncertainty is relatively unimportant compared to cleaner conditions in which the pH lies between 4 and 6.5 (P5).
Nucleation scavenging diameter offset (P7). The size at which aerosol particles are scavenged in frontal and convective precipitation has a surprisingly small effect on CCN uncertainty at the 915 hPa level. As described in Sect. 4, the equivalent dry diameter at which activated aerosol particles are scavenged in precipitation is equal to the activation diameter (P4) plus the scavenging diameter offset. These results therefore show that CCN are more sensitive to the activation diameter (relative uncertainties exceeding 20 % in many areas) than they are to the scavenging diameter offset. The effect on standard deviation is concentrated over land areas, although the fractional contribution to uncertainty in CCN is never more than a few per cent. The relative uncertainty is greatest over marine regions and the wintertime Arctic, but is everywhere less than about 20 %. Thus, it appears that, at the altitude of cloud base, CCN concentrations are relatively insensitive to in-cloud nucleation scavenging assumptions, other than assuming all activated particles are scavenged. However, as we showed in Lee et al. (2012) the scavenging diameter becomes a dominant parameter throughout most of the FT.
Fig. 7. The global distribution of CCN standard deviation (σ CCN,i , right two columns) and relative uncertainty (σ CCN,i /µ CCN , left two columns) for each of the 28 parameters i in Table 1
Atmos. Chem. Phys., 13, 8879-8914, 2013 www.atmos-chem-phys.net/13/8879/2013/
Nucleation scavenging in ice clouds (P8). This parameter contributes only a few per cent to the total variance in a few isolated locations with no clear pattern. It was expected that it would strongly influence Arctic CCN uncertainty (Browse et al., 2012), but the effect is much smaller than for aerosol mass concentrations highlighted in that study. There is a more consistent wintertime effect on BC, accounting for

Dry deposition of Aitken and accumulation mode particles (P9 and P10). The effect of dry deposition on the standard deviation follows the changes in aerosol abundance, consistent with it being a first-order loss process. The dry deposition of accumulation mode particles is more important for CCN than Aitken mode, even though the rate is lower (primarily because CCN reside mostly in the accumulation mode). It is largest over land and on continental outflow regions. The map of relative uncertainty is quite different, with a 10-30 % effect over almost all marine regions and a negligible contribution over almost all land areas. The fractional contribution to variance reaches ∼ 30 % in regions where few other factors are important, such as in the tropics. Although dry deposition of accumulation mode particles is quite slow (particle lifetimes of up to several days), it is the dominant (or even sole) loss process of accumulation mode particles close to the surface in many regions. Unlike other processes and emissions, it is a first-order loss process that occurs continuously and everywhere. Thus, globally averaged, it is an important factor in the relative uncertainty in CCN in the boundary layer. We also note the lack of precipitation.

Uncertainty due to size distribution parameters
Accumulation and Aitken mode widths (P11 and P12). The accumulation mode width has an effect over polluted NH regions, reaching a maximum relative uncertainty of 10 % in the wintertime Arctic. The width of the Aitken mode has a much more widespread absolute effect over NH polluted regions and hotspots in biomass burning areas. The relative uncertainty in CCN reaches 30 % in the wintertime Arctic and 40 % over the Antarctic and parts of the Southern Ocean. As a fraction of total variance, it accounts for 10-30 % over large regions of the ocean including the Arctic. Thus the Aitken mode width is a structural parameter of high importance for reducing uncertainty in predicted CCN of 50 nm dry diameter. Figure 8c and d show that the Aitken mode width is the second-most important uncertain parameter for CCN in clean and cloudy regions. The Aitken mode width is more important for CCN uncertainty than the accumulation mode width because almost all accumulation mode particles are counted as CCN, while the fraction of Aitken mode that is counted depends on how the distribution extends beyond the assumed CCN size of 50 nm dry diameter. This is the only parameter that has a significantly different impact on CCN uncertainty in cloudy versus non-cloudy regions.
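The argument about mode width can be illustrated with a minimal sketch, assuming each mode is a lognormal number distribution. The median diameters and the range of widths below are hypothetical values chosen for illustration and are not the ranges used in the model; only the 50 nm CCN threshold comes from the text. The sketch computes the number fraction of a mode lying above the threshold for several geometric standard deviations.

```python
import math


def fraction_above(cutoff_nm, median_nm, sigma_g):
    """Number fraction of a lognormal mode with dry diameter above cutoff_nm.

    median_nm is the geometric median diameter and sigma_g the geometric
    standard deviation (the mode width).
    """
    z = math.log(cutoff_nm / median_nm) / (math.sqrt(2.0) * math.log(sigma_g))
    return 0.5 * math.erfc(z)


CCN_CUTOFF_NM = 50.0      # CCN size threshold used in the text
AITKEN_MEDIAN_NM = 30.0   # hypothetical Aitken mode median diameter
ACCUM_MEDIAN_NM = 150.0   # hypothetical accumulation mode median diameter

if __name__ == "__main__":
    for sigma_g in (1.2, 1.5, 1.8):  # hypothetical widths
        aitken = fraction_above(CCN_CUTOFF_NM, AITKEN_MEDIAN_NM, sigma_g)
        accum = fraction_above(CCN_CUTOFF_NM, ACCUM_MEDIAN_NM, sigma_g)
        print(f"sigma_g={sigma_g}: Aitken {aitken:.3f}, accumulation {accum:.3f}")
```

Under these assumed values the counted Aitken fraction changes by orders of magnitude as the width varies, while the accumulation mode fraction stays close to one, which is the qualitative behaviour described above.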
Mode separation diameters (P13 and P14). The effect of the nucleation-Aitken separation diameter is restricted almost entirely to high southern latitudes of the Southern Ocean and Antarctica, accounting for a maximum of about 5 % of variance and a relative uncertainty of less than 10 %.
The Aitken-accumulation mode separation diameter has an absolute effect mainly over polluted regions. The fractional effect is restricted to a few small hotspots, reaching 8 % of variance.

Uncertainty due to primary aerosol and precursor gas emissions
Fossil fuel particle mass flux and diameter (P15 and P18). Fossil fuel particle emissions have a highly localised effect on σ CCN /µ CCN, generally less than 10 %, over the main source regions, especially China. The size of the emitted particles is much more important for uncertainty in CCN than the mass emission flux. The size parameter has a maximum effect on relative uncertainty of 30 % over polluted regions and accounts for 50-60 % of the variance (σ CCN of 500-1000 cm −3 ), but typically less than 10 % over the US, where sulfate parameters are more important. The fossil fuel diameter is the fourth-most important parameter for CCN uncertainty in polluted regions.
Biomass burning particle mass flux and diameter (P16 and P19). The importance of the biomass burning mass flux follows the seasonality of the emissions and reaches 40 % of variance over large regions mostly immediately over the sources (Amazon, Africa, northern and western US and boreal regions), which equates to a σ CCN uncertainty greater than 1000 cm −3 and relative uncertainty of 40-50 %. The size of the emitted particles is more important than the mass flux and causes a σ CCN /µ CCN uncertainty of over 60 % in source regions and 50 % over the summertime Arctic. Locally it is by far the dominant parameter and accounts for up to 80 % of the variance over source regions and up to 40-50 % over large regions of the remote Arctic in summer. The importance of the emission parameters is strongly located over the emission regions, with very little extension over the downwind ocean regions. In these regions dry deposition becomes important (see below). The reliability of this result will depend on the realism of vertical mixing of plumes in the model, and could be tested against observations. The biomass diameter is globally ranked number three in July, but number one in polluted regions.

Biofuel particle mass flux and diameter (P17 and P20). The uncertainty due to biofuel mass flux is important only immediately over the main emission regions of India, southeast Asia and West Africa, with no strong seasonal variation. In the NH winter the impact over India extends over the Indian Ocean as the air pollution is advected out. The mass emission accounts for locally 30 % of the variance and the size of the particles up to 70 %. Thus in the main biofuel burning regions these primary emissions dominate the CCN uncertainty, but the effect is quite localised.
Sub-grid SO 2− 4 particle formation (P21 and P22). Uncertainties related to sub-grid SO 2− 4 particle production are as important as uncertainties in SO 2 emissions themselves. Parameters P21 and P22 have a large influence on CCN uncertainty over the eastern US, the North Atlantic, Europe and Asia, with the European emissions influencing the uncertainty right across Russia and into Asia. Relative uncertainty reaches 50 %. As with the other primary particle emissions, the size of the particles assumed to be formed at sub-grid scales is more important than the fraction of emitted sulfur assumed to be in them. Both parameters (mass flux and size) have significant interactions with other parameters, with up to 20 % of the total variance being due to interactions. Our analysis therefore shows that sub-grid production of a few per cent by mass of SO 2− 4 particles in plumes is much more important for CCN uncertainty than the SO 2 emissions themselves.
Sea spray emissions (P23). The uncertainty in sea spray emission has a small effect on CCN uncertainty over the world's oceans except in the Southern Ocean. Here, the fractional contribution to variance varies seasonally between 10-30 % in Southern Hemisphere (SH) summer and up to 60 % in SH winter, and the relative uncertainty reaches 30-40 %. It is the seventh-most important parameter globally in July and the fourth-most important in clean regions. Elsewhere the fractional contribution to variance is typically less than a few per cent in the mid-Pacific and Atlantic oceans throughout the year, making it a relatively unimportant parameter there. This is a surprisingly low sensitivity over windy oceans to a plus/minus factor of 5 change in flux. The reason may be related to the impact of the sea spray on CCN formed from nucleation, which is apparent in decreases in CN over many ocean regions. Absolute changes in CCN also occur over land regions, again suggesting an impact of sea spray on aerosol formation processes, impacting downwind regions.
Anthropogenic SO 2 emissions (P24). The effect of SO 2 emission uncertainty on CCN standard deviation is clearly associated with emission regions. The peak σ CCN uncertainty is about 500 cm −3 over Europe, but the relative uncertainty reaches 10-15 % over large parts of the NH.
Volcanic SO 2 emissions (P25). The contribution of volcanic SO 2 to the total uncertainty is important mainly in a zonal band between the Equator and 30° S, causing a 10-15 % uncertainty in CCN. The widespread effect of volcanic SO 2 emissions on CCN and cloud albedo has been studied by Schmidt et al. (2012). As in their study, we find that the volcanic emissions have a widespread effect on CCN due to formation of particles in the FT.
DMS emissions (P26). DMS has a strong seasonally varying effect on CCN uncertainty, restricted largely to the SH in terms of its relative effect. This is consistent with Woodhouse et al. (2012), who showed a diminished NH effect due to a higher background CCN concentration. This NH/SH difference means that, globally averaged, DMS emissions are ranked 14th in NH summer but 4th in SH summer. Over SH marine regions the relative uncertainty is about 10-30 %, rising to 50 % near the Antarctic coast from January to April, or a σ CCN uncertainty of up to 100 cm −3. On either side of this period (May, November and December) the impact is almost entirely over Antarctica. This short period of influence on CCN is consistent with our previous simulations of CCN over the Southern Ocean. The impact is much weaker in the NH, reaching a maximum of about 10 % relative uncertainty over much of the Arctic Ocean.
Biogenic SOA emission and production (P27). This parameter accounts for uncertainty in BVOC emissions and SOA production chemistry in one parameter. The impact of a large uncertainty in SOA on CCN is surprisingly small. It is ranked 16th globally and in clean regions. The effect on CCN standard deviation of about 200-500 cm −3 is mainly associated with vegetated land areas, but this is typically less than 10 % of mean CCN. The fractional contribution to variance reaches about 50 % only in a few very small spots in N. America, N. Europe, S. America and Australia, but has a negligible effect elsewhere. One reason for this weak effect on uncertainty is that increases in SOA in our model act to grow particles to CCN sizes (a positive effect), but the larger condensation sink acts to suppress nucleation, which we assume is not itself enhanced by organic compounds. The effect on CCN uncertainty could be much larger if nucleation were driven by organic compounds (Metzger et al., 2010). If this is not the case, then uncertainties in biogenic SOA could have a relatively minor effect on CCN because of compensating effects.

Anthropogenic SOA emission and production (P28). The spatial distribution of the standard deviation resembles that of fossil fuel primary particles but spread out over the downwind marine regions. The impact is also larger and more widespread in the winter hemispheres. According to Fig. 8c anthropogenic SOA is considerably more important for CCN variance in clean regions, rather than in regions where it is emitted. These effects contrast starkly with biogenic SOA. Biogenic emissions peak in the summer, and therefore have their maximum effect on aerosol during periods with highest nucleation, leading to a compensation effect on CCN. However, anthropogenic SOA precursors are emitted all year and, although photochemistry is slower in winter, it can form SOA and grow existing particles to CCN sizes with little impact on nucleation in the lower troposphere, thus leading to a significant wintertime impact on CCN. As we noted in Sect. 4, there is considerable uncertainty not just in the amount of anthropogenic SOA produced, but also whether observations can be explained by genuine anthropogenic SOA or by anthropogenic enhancement of biogenic SOA. Such a structural change in the model would lead to different results to those shown here. Nevertheless, anthropogenic SOA could have a potentially large impact on global CCN.

Ranking of global uncertainties
As described above, the relative importance of parameters for CCN uncertainty varies spatially and temporally. Nevertheless, some general observations can be made on the global mean importance of different parameters (Fig. 8).
For emissions (blue bars in Fig. 8), the rank order in terms of global mean σ CCN /µ CCN in July is as follows: (1) and (2) biomass burning (mass flux followed by particle size), (3) sea spray flux, (4) and (5) anthropogenic SO 2 and sub-grid sulfate particle size (approximately equal), (6) fossil fuel particle size and (7) DMS (which becomes parameter number 1 in January due to Southern Hemisphere emissions). If SOA formation is included as an emission uncertainty (rather than a process), then anthropogenic SOA would rank 4th among the other emissions, roughly equal to sea spray, and biogenic SOA would rank 9th. Taken together, anthropogenic and biogenic SOA would rank among the most important emissions. Biofuel emissions have a localised effect on σ CCN /µ CCN of about 20 % (due to their uncertain size), but are globally less important than the top 7. Volcanic SO 2 emissions are generally relatively unimportant. A clear feature of the results is that the size of the emitted primary particles is more important than their mass flux (by up to a factor of 2).
For processes (red bars in Fig. 8), the rank order in terms of global mean σ CCN /µ CCN in July is as follows: (1) dry deposition of accumulation mode particles, (2) the activation diameter, (3) the rate of sulfate production in cloud drops, (4) boundary layer nucleation, (5) dry deposition of the Aitken mode, and (6) the size of particles scavenged in precipitating clouds.
For size distribution representation, the Aitken mode width is clearly the number 1 parameter, with the other size distribution parameters being fairly unimportant.
The rank order of parameters is strongly dependent on the level of pollution, as defined by black carbon concentrations (Fig. 8c). There is an obvious reordering of the importance of the emissions of BC-containing particles (biomass, fossil fuel, biofuel emissions) in clean and polluted regions. But we also find that the sensitivity to natural emissions is strongly suppressed in polluted regions because of the high concentrations of anthropogenic aerosol. For example, between clean and polluted regions σ CCN /µ CCN decreases by a factor of 4.5 for sea spray, a factor of 4 for DMS and a factor of 2 for volcanic SO 2 . This implies that pollution will have suppressed the importance of natural aerosol-climate feedbacks. Interestingly, anthropogenic SOA has a larger effect on σ CCN /µ CCN in clean regions than polluted regions (by a factor of 3) because of its long-range transport.
Finally, we note that the rank order is essentially unchanged when the gridded σ CCN /µ CCN values are weighted by low-cloud fraction (Fig. 8d). This implies that the global rank importance of a parameter is also a good indicator of its importance for cloud drop formation and indirect forcing.
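The comparison between unweighted and cloud-weighted rankings can be sketched as follows. This is only an illustration under stated assumptions: the parameter names, the four "grid cells" and all numbers are made up, and the helper functions `weighted_mean` and `rank_parameters` are hypothetical rather than part of the analysis code. Area weighting of grid cells is ignored for brevity.

```python
def weighted_mean(values, weights):
    """Weighted mean of per-grid-cell values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def rank_parameters(rel_unc_by_param, weights):
    """Order parameter names by weighted-mean relative uncertainty, largest first."""
    means = {name: weighted_mean(vals, weights)
             for name, vals in rel_unc_by_param.items()}
    return sorted(means, key=means.get, reverse=True)


# Made-up sigma_CCN / mu_CCN values in four grid cells for three parameters.
REL_UNC = {
    "dry_deposition": [0.30, 0.25, 0.05, 0.05],
    "aitken_width": [0.20, 0.30, 0.02, 0.03],
    "biomass_diameter": [0.02, 0.03, 0.20, 0.05],
}
UNIFORM = [1.0, 1.0, 1.0, 1.0]
LOW_CLOUD_FRACTION = [0.7, 0.6, 0.1, 0.2]  # made-up low-cloud fractions

if __name__ == "__main__":
    print("unweighted:    ", rank_parameters(REL_UNC, UNIFORM))
    print("cloud-weighted:", rank_parameters(REL_UNC, LOW_CLOUD_FRACTION))
```

In this toy case the two orderings happen to agree, but that is a property of the made-up numbers and is not evidence for the result reported in Fig. 8d.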

Relative importance of emissions, size distribution and processes
Figure 10 splits the contribution to uncertainty according to microphysical processes, size distribution representation and emissions, as coloured in Fig. 8a. The two SOA-related parameters (green in Fig. 8a) were included in both the processes and emissions group because they represent uncertainty in both the emissions of BVOCs and the chemistry of SOA formation. These maps show a strong contrast between the importance of emissions over land areas and processes over marine areas. Thus, in terms of aerosol indirect radiative forcing, uncertain aerosol processes are an important factor. In summer, the hotspots of emissions uncertainty are mostly due to wildfires and biomass burning. As noted in Sect. 6.3.6, there is a sharp transition immediately downwind between the importance of emissions parameters and process parameters (mainly dry deposition at low latitudes), which ought to be tested against observations.
In general, the representation of the size distribution is less important than either emissions or processes, although the modal aerosol parameters are not negligible and are an important factor at high latitudes. Almost all of the uncertainty in CCN due to the size distribution parameters comes from the width of the Aitken mode, which accounts for 40 % of variance in large regions of the remote ocean. But, as noted previously, the accumulation mode will be more important for larger CCN sizes than assumed here.

Uncertainties by region
We now present results for a few specific locations that are representative of larger regions or of specific interest because of long-term measurements. Figure 9 shows the seasonal cycle of emulator mean CCN (with 2σ CCN bars) and variance contributions for locations representing polluted marine (N. Atlantic), clean marine (S. Ocean), marginal Arctic (Barrow, Alaska, and Zeppelin, Svalbard), high Arctic, remote NH continental (Tomsk, central Siberia), polluted continental (Melpitz, central Europe, and Bondville, E. United States), European boreal forest (Hyytiala, Finland), persistent stratocumulus (coastal Chile), biomass burning (Botsalano, S. Africa), and two long-term sites at Cape Grim (Tasmania) and Mace Head (Ireland). These data refer to the single grid box in which the station sits.
Polluted marine. In the North Atlantic the important parameters represent a mix of pollution (SO 2 emissions, SO 2− 4 particles and anthropogenic SOA production) and long-range transport processes (SO 2− 4 production in clouds and the Aitken diameter, as at the Arctic sites, and dry deposition). There is a clear seasonal cycle, with sulfur pollution and dry deposition dominating in summer and anthropogenic SOA and in-cloud SO 2− 4 production being more important in winter.
Remote marine. The Southern Ocean has two obvious zones: one between 40 and 60° S where sea spray is important (particularly in Southern Hemisphere winter) and one south of 60° S where DMS emissions play an important role in Southern Hemisphere summer. In both zones the activation diameter and the width of the Aitken mode are important through the year.
Arctic. The marginal Arctic sites Barrow (Bodhaine, 1989) and Zeppelin (Ström et al., 2003) look very similar except in the summer when CCN at Barrow are dominated by biomass burning. Outside the summer, the most important process parameters are the activation diameter and SO 2− 4 production in cloud drops (pH controlling the rate of O 3 + SO 2 ), both of which control the evolution of the size distribution during cloud processing. The width of the Aitken mode is also very important at both sites, and dominates at Zeppelin in the summer. Again, the width of the Aitken mode affects the fraction of particles that can be activated in clouds. Dry deposition is also important year-round. Thus these Arctic sites are dominated by processes that occur during long-range transport. The important parameters in the high Arctic (85° N, 0° W) are similar to those at Barrow and Zeppelin.
Remote NH continental. The important parameters at Tomsk, Central Siberia, are very similar to central Europe in winter and spring, but in the mid-summer the CCN variance is dominated by uncertainty in biomass burning emissions, primarily the size of the emitted particles. In early summer, boundary layer nucleation accounts for up to 20 % of the uncertainty.
Polluted NH continental. Uncertainty in CCN at Bondville and Melpitz (Engler et al., 2007) is dominated by anthropogenic SO 2 , particulate SO 2− 4 and fossil fuel BC/OC emissions. The most important parameter in these locations is the diameter of sub-grid SO 2− 4 particles, which accounts for about 30-40 % of variance through the year. Both sites have low seasonal variation in the importance of parameters.
European boreal forest. The important parameters at Hyytiala outside of summer are very similar to the central European site at Melpitz, with a large fraction of the total variance being due to pollution-related parameters, particularly SO 2− 4 production in clouds, anthropogenic SO 2 and sub-grid SO 2− 4 properties. However, as we stated previously, this does not imply that the properties and sources of aerosol are the same in both locations; only that the factors controlling uncertainty are similar. The main difference is the appearance of biogenic SOA and boundary layer nucleation as important parameters in summer in Hyytiala.
Cloudy region. The impact of CCN changes on climate usually focuses on low-level stratiform clouds because of their importance to the radiative budget of the planet. In the stratocumulus region off the coast of Chile, the dominant factor in CCN uncertainty is aerosol dry deposition in the summer and anthropogenic SOA in winter, with the activation diameter and width of the Aitken mode being important all year. Because we neglect wet deposition in such regions, dry deposition is the dominant removal process. It is likely that drizzle scavenging would be an important uncertainty if that process were included. Figure 8d also shows that the Aitken mode width is the second-most important parameter when weighted by cloud cover. Thus, this structural parameter is an important consideration in model development.
Biomass burning region. CCN uncertainty at Botsalano is dominated by uncertainty in the size and rate of the biomass burning emitted particles through much of the year. At the beginning of the year before the start of the biomass burning period, the uncertainty in CCN is controlled by boundary layer nucleation and the size and rate of anthropogenic emissions through fossil fuels and particulate SO 2− 4 .
Cape Grim. Cape Grim on the southern tip of Tasmania is an important site for the long-term monitoring of aerosols and trace gases (Ayers et al., 1986; Ayers and Gras, 1991) and has been used extensively for studies of marine aerosol processes. The key parameters controlling CCN variance at Cape Grim are very heterogeneous, but appear to be controlled mostly by continental emissions. The most obvious feature is the importance of biomass burning from March to May. But outside this period a mix of natural and anthropogenic aerosol parameters is important, with marine aerosols and precursors (sea spray and DMS) not being prominent among them. So although DMS emissions control the seasonal cycle of CCN at Cape Grim, CCN concentrations are much more sensitive to a range of other emissions (i.e. the seasonal cycle will still occur within the range of DMS emissions that we have used here). This result will have implications for interpreting any long-term trends. However, care needs to be taken when comparing with observations because of the strong land-sea gradient in aerosol properties at this site.
Mace Head. CCN uncertainty at this coastal site is similar to Cape Grim in being controlled by a wide range of parameters. The Mace Head site (Jennings et al., 1991) is assumed to be representative of the marine aerosol environment. However, the factors controlling CCN uncertainty in the global model are actually mainly pollution sources. One reason for the low importance of marine aerosol properties is that the site is in a model grid box that overlaps with the UK, although the agreement of our GLOMAP-bin model at this site does not suggest any particular issue with model skill (Reddington et al., 2011). An improved understanding of aerosol model uncertainty at this and other coastal sites may require filtering of the data to identify marine air masses, or analysis of model grid boxes over open ocean rather than on the coast.

Discussion and conclusions
We have used an ensemble of global aerosol microphysics simulations together with emulators and variance-based sensitivity analysis to quantify the magnitude and causes of uncertainty in monthly-mean CCN for every 2.8° grid box of a global aerosol model at the altitude of 915 hPa (approximately cloud base). Twenty-eight uncertain parameters and their likely uncertainty ranges were defined based on expert elicitation. A validated Gaussian process emulator of the model behaviour across the 28-dimensional parameter space in each grid box enables a full probability density distribution of CCN to be generated by Monte Carlo-type sampling for each grid box based on only 168 model runs. The probability distributions then allow the standard deviation of modelled CCN to be computed on a global scale. A full variance-based sensitivity analysis was also conducted, which generates globally gridded information about the most important sources of uncertainty in modelled CCN.
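The general idea of Monte Carlo sampling of a cheap emulator followed by a variance decomposition can be sketched in a few lines. The sketch below is illustrative only and makes several assumptions that are not taken from this study: `toy_emulator` is a hypothetical three-parameter linear function standing in for a per-grid-box Gaussian process emulator, the inputs are independent and uniform on [0, 1], and the main-effect (first-order) variance fractions are estimated with a standard pick-and-freeze Monte Carlo estimator, which may differ from the method actually used here.

```python
import random
from statistics import fmean, pvariance


def toy_emulator(x):
    """Hypothetical stand-in for a per-grid-box CCN emulator (not the paper's GP)."""
    return 100.0 + 40.0 * x[0] + 20.0 * x[1] + 5.0 * x[2]


def first_order_indices(model, n_params, n_samples, seed=0):
    """Return (mean, standard deviation, first-order variance fractions).

    Inputs are assumed independent and uniform on [0, 1]. The fractions are
    estimated by swapping one column at a time between two sample matrices.
    """
    rng = random.Random(seed)
    a = [[rng.random() for _ in range(n_params)] for _ in range(n_samples)]
    b = [[rng.random() for _ in range(n_params)] for _ in range(n_samples)]
    y_a = [model(row) for row in a]
    y_b = [model(row) for row in b]
    mean = fmean(y_a + y_b)
    total_var = pvariance(y_a + y_b)
    fractions = []
    for i in range(n_params):
        # Matrix A with column i replaced by the corresponding column of B.
        y_ab = [model(ra[:i] + [rb[i]] + ra[i + 1:]) for ra, rb in zip(a, b)]
        # Centring y_b reduces the Monte Carlo noise of the estimator.
        v_i = fmean((yb - mean) * (yab - ya)
                    for yb, yab, ya in zip(y_b, y_ab, y_a))
        fractions.append(v_i / total_var)
    return mean, total_var ** 0.5, fractions


if __name__ == "__main__":
    mu, sigma, frac = first_order_indices(toy_emulator, 3, 20000)
    print(f"mean={mu:.1f}, sigma={sigma:.1f}, sigma/mean={sigma / mu:.3f}")
    print("first-order variance fractions:", [round(f, 3) for f in frac])
```

For this linear toy function the exact fractions are proportional to the squared coefficients (about 0.79, 0.20 and 0.01), so the printed estimates can be checked by eye. Because the emulator is cheap to evaluate, tens of thousands of samples per grid box are affordable, which is what makes this kind of analysis feasible.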
This analysis of uncertainties in a global aerosol microphysics model points to several priorities for reducing parametric uncertainty in modelled CCN. The following statements refer to the relative uncertainty in CCN concentrations (> 50 nm dry diameter) in the boundary layer at 915 hPa, which we defined as the global mean of the standard deviation divided by the mean CCN in each grid box (Sect. 6.1). Figure 11 shows a schematic of the relative importance of the parameters we have studied, with the size of the font proportional to the relative uncertainty.
-The most important process for global mean CCN uncertainty is dry deposition of the accumulation mode. Dry deposition is a globally important process that occurs continuously and everywhere at a first-order rate that scales with aerosol concentration. In contrast, many other processes are only regionally important, so are less prominent as a global mean uncertainty. The dry deposition velocity is also the parameter with the greatest uncertainty (factor of 0.1 to 10). A more refined study could take account of differences in uncertainty over different land surface types, rather than the globally uniform uncertainty applied here.
-Processes related to the interaction of aerosols with low-level clouds are among the most important processes for CCN uncertainty. The two leading parameters are the activation diameter of aerosol in clouds and the oxidation of SO 2 by ozone in clouds (we did not study the uncertainty due to the H 2 O 2 reaction). Improved models of aerosol activation depend primarily on improved simulations of updraught speeds, while better constraint of SO 2 oxidation would require more advanced models of cloud drop chemistry and compilation of a global dataset of cloud drop pH measurements. Improvements in these processes and evaluation against data related to the sulfur budget (Alexander et al., 2002, 2005) would help to reduce uncertainty in global CCN.
-Among the primary particle and precursor gas emissions, the uncertainties in carbonaceous combustion particles (from biomass burning, wildfires and fossil-/biofuel) are more important for CCN uncertainty than anthropogenic SO 2 emissions. The ranges we used for the associated emission parameters were very large: up to a factor of 4 for mass emission flux with a range of particle sizes. More information on how these parameters vary with location and other conditions would have a substantial effect on model simulations of CCN and would help to reduce the uncertainty in their effects on cloud radiative forcing, which counteracts the positive direct forcing due to the presence of black carbon.
-The size of emitted primary particles is more important for CCN uncertainty than the mass flux. These particles derive from biomass burning, wildfires, fossil fuel and biofuel combustion, and sub-grid sulfate particle formation in plumes. In general, for the parameter ranges we used, the relative uncertainty in CCN due to uncertain particle size is about a factor of 2 larger than that due to mass flux. The importance of particle size makes intuitive sense because the number concentration scales with the reciprocal of the emitted size cubed, but the number concentration scales only linearly with the emitted mass. However, as shown by Pierce and Adams (2007) the relative effect of uncertainty can depend on several factors. In general, the dependence is much less than this scaling would suggest, because smaller more numerous primary particles need to grow to CCN sizes.
-Sub-grid formation of sulfate particles in plumes is approximately as important for CCN uncertainty as the uncertainty in SO₂ itself, despite the fact that less than 1 % of the SO₂ is converted into particles in the plume. More research is needed to understand the formation and dispersion of particles in plumes (Stevens et al., 2012). So far, studies have focused only on sulfate particles. However, given the large uncertainty, it would be worth identifying whether sub-grid production of particles occurs in other environments.
-Biogenic secondary organic aerosol has a surprisingly small effect on CCN uncertainty, despite a very large range applied in the model (5 to 360 Tg a⁻¹ SOA production). This low sensitivity of a secondary aerosol component contrasts with the much higher sensitivity of CCN to SO₂ emissions (35–87 Tg a⁻¹). The likely reason for the different sensitivities is that H₂SO₄ from SO₂ oxidation produces new particles as well as growing existing ones, while SOA only grows existing particles in our model. An important area of research is therefore to understand how and to what extent biogenic SOA influences the nucleation of new particles. If it does, the large uncertainties associated with biogenic SOA might make it one of the most important parameters in global CCN production.
-Anthropogenic SOA has a larger effect on CCN uncertainty than biogenic SOA despite having a smaller overall parameter uncertainty (3–160 Tg a⁻¹). With the approach we have taken, this parameter has an effect on CCN uncertainty approximately as great as sea spray and anthropogenic SO₂ emissions. Anthropogenic SOA uncertainty influences CCN mainly in winter and has a widespread hemispheric effect on CCN uncertainty, while biogenic SOA has a patchy continental effect. One reason for the greater impact on CCN may be that anthropogenic SOA forms in polluted regions where a large number of small particles can grow to CCN sizes. There are many open questions concerning anthropogenic SOA, including whether observed SOA is truly anthropogenic or whether air pollution enhances formation of biogenic SOA (Spracklen et al., 2011b). An improved understanding of anthropogenic SOA formation, and how it compares to biogenic SOA, could lead to a significant reduction in model uncertainty.
-Nucleation accounts for about 45 % of CCN globally (e.g. Merikanto et al., 2009), or up to 70 % if sub-grid sulfate particle formation in plumes is included (Yu and Luo, 2009) (although this effect is assessed here as part of the sub-grid particle production uncertainty). Merikanto et al. (2009) estimate that more than 75 % of the nucleated CCN at cloud base level come from particles formed in the free troposphere. Here, we find that free tropospheric nucleation accounts for a negligible fraction of total CCN uncertainty at cloud base. Boundary layer nucleation is a more important uncertainty, accounting for a global mean CCN standard deviation of about 6 % of the mean. The relatively small uncertainty from nucleation is in agreement with earlier studies (Pierce and Adams, 2009). However, it is an essential process in models because of the large net contribution it makes to CCN. It is likely that global CCN will be more sensitive to nucleation rates in the preindustrial era (Merikanto et al., 2010) when other particle sources were lower and the rate of formation may have been reduced over Northern Hemisphere land areas because of much lower emissions of sulfur species. The importance of nucleation might also change if a different mechanism were used, such as one driven by organic compounds. Thus an improved understanding of particle formation and the effects of biogenic and anthropogenic SOA is important.
-The ageing rate of insoluble primary particles (from combustion processes) into water-soluble particles has a negligible effect on CCN uncertainty globally. This result suggests that structural simplification of aerosol models in terms of chemical mixing state would have an acceptable impact on the reliability of CCN simulations.
-The wintertime high latitudes are regions of high CCN parametric uncertainty, which can be attributed almost entirely to uncertain microphysical processes.
-Emissions and processes are more important than the representation of the size distribution in the aerosol microphysics model. We previously showed that a bin and a modal model agree quite well in the simulations of many aerosol quantities. Some important structures can be improved, as noted below, but in general the development of more complicated and computationally demanding aerosol models to calculate varying mode widths should have lower priority than the improvement in model processes and emissions. The effects of structural changes in the host global transport model have not been assessed here, but AEROCOM intercomparisons suggest the variance could be large.
-The most important parameter representing the size distribution in a modal model in terms of simulation of CCN is the width of the Aitken mode. This parameter was varied between 1.2 and 1.8 and accounts for up to 40 % of variance in CCN in remote regions, particularly at high latitudes in winter. In terms of global mean relative uncertainty in CCN (σ_CCN/μ_CCN), it is ranked second out of the 28 parameters we studied. The width matters because it determines the fraction of Aitken particles that are counted as CCN at 50 nm dry diameter. Its importance would decrease if we considered larger CCN, although the width of the accumulation mode would then rank more highly. Mixing of different air masses with different mode widths cannot be handled in a modal model with constant width. Possible approaches to improvement include replacing the Aitken mode with bins (e.g. as in the SALSA model; Kokkola et al., 2008; Bergman et al., 2012) or developing a modal model with a prognostic treatment of the width of the modes, as previously suggested (e.g. Weisenstein et al., 2007). More climatological information on Aitken mode aerosol properties (Heintzenberg et al., 2000, 2004; Birmili et al., 2001) would be valuable for model evaluation.
-Interactions between parameters controlling CCN generally account for less than 20 % of the uncertainty. This is smaller than we found in a previous study of 8 parameters. Although the same interactions must still be occurring in the present much larger study, their relative contribution to the overall uncertainty is less.
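Two of the size-related points in the list above can be illustrated with a few lines of arithmetic. The sketch below is not taken from the model: it assumes monodisperse spherical particles for the mass-to-number conversion and a single lognormal Aitken mode for the width calculation. The mass concentration, density, reference diameter and the 30 nm median diameter are hypothetical values chosen only for illustration; the 50 nm threshold and the 1.2 to 1.8 width range are the values quoted in the text.

```python
import math

def number_concentration(mass_conc, diameter, density):
    """Number of monodisperse spheres per unit volume for a given mass concentration."""
    particle_mass = density * math.pi * diameter ** 3 / 6.0
    return mass_conc / particle_mass

def fraction_above(d_cut_nm, d_median_nm, sigma_g):
    """Number fraction of a lognormal mode with diameter larger than d_cut_nm."""
    z = math.log(d_cut_nm / d_median_nm) / (math.sqrt(2.0) * math.log(sigma_g))
    return 0.5 * math.erfc(z)

# Illustrative values only (assumptions, not taken from the model setup).
MASS = 1.0e-9      # kg m^-3
DENSITY = 1500.0   # kg m^-3
D_REF = 100.0e-9   # m

n_ref = number_concentration(MASS, D_REF, DENSITY)
print("halving the size multiplies number by",
      number_concentration(MASS, D_REF / 2, DENSITY) / n_ref)
print("doubling the mass multiplies number by",
      number_concentration(2 * MASS, D_REF, DENSITY) / n_ref)

D_CUT_NM = 50.0     # CCN dry-diameter threshold quoted in the text
D_MEDIAN_NM = 30.0  # hypothetical Aitken-mode median diameter
for sigma_g in (1.2, 1.5, 1.8):
    frac = fraction_above(D_CUT_NM, D_MEDIAN_NM, sigma_g)
    print(f"sigma_g = {sigma_g}: fraction above 50 nm = {frac:.4f}")
```

Under these assumptions, the fraction of the mode counted as CCN rises steeply as the width increases from 1.2 to 1.8. The cubic size scaling is an upper bound on the CCN response, since, as noted in the list, smaller particles still need to grow to CCN sizes.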
We reiterate that these conclusions refer to the model factors that are important for the uncertainty in model simulations of CCN. They are the properties of the model that should be given most attention in efforts to reduce uncertainty. The important uncertain factors may not be the same as those that account for the absolute abundance of aerosol. For example, SOA is a major component of the aerosol mass, but our model results have shown that CCN are not very sensitive to its formation, most likely because of how it affects other aerosol processes. Likewise, nucleation is known to be an important source of CCN (e.g. Spracklen et al., 2008;Makkonen et al., 2009;Merikanto et al., 2009;Yu and Luo, 2009), but CCN are not strongly sensitive to the nucleation rate.
Care needs to be taken to verify these model sensitivity results using observations. As we have pointed out in several cases, an uncertain model parameter can impact aerosol far away from where the emission or process occurred. The most obvious example of this effect is the importance of cloud processing for Antarctic CCN when there are no clouds over the Antarctic, which is caused by the integrated effect of in-cloud sulfate formation along air mass trajectories.
Nevertheless, the uncertainty information generated in this study provides the basis for a much more rigorous evaluation of the model against observations, leading to a more structured approach to model improvement. The normal approach in model evaluation and improvement is to reduce the bias between modelled and observed aerosol by tuning a small number of existing parameters or developing more sophisticated models for various processes of interest. With new information about the full probability distribution of the model and ranked parameter sensitivities in all grid boxes, it will be possible to home in on the most likely causes of model bias. Structural uncertainties can be more easily identified in cases where observations lie outside the confidence intervals of the model. Confronting these results with observations is therefore a high priority.
It is essential to extend the current study to include the structural uncertainty in the host transport model and the parametric uncertainty in the host model physics. An important question is whether the uncertainty in global aerosol stems largely from parametric uncertainty in the aerosol microphysics model or from uncertainty in the meteorological fields that transport and ultimately remove the aerosol. This study has been conducted in one model structural framework, so exploration of other structures and models is an important next step to generate a fuller picture of overall uncertainty.
The conclusions we reach about the relative importance of different parameters are dependent upon the estimated ranges of the parameters from the expert elicitation. If it is decided that a parameter is actually less uncertain than we have assumed, then the variance analysis can easily be repeated using the emulators and a new ranking of important parameters obtained. However, if the model structure or the design of the parameterisations changes, then new model runs would have to be performed unless new model processes simply help to constrain the value of the existing parameters.
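The re-ranking step described above can be sketched as follows. This is a minimal illustration, not the procedure used in this study: the `emulator` function is a hypothetical linear stand-in for a trained per-grid-cell emulator, the three parameters and their ranges are invented, and the first-order variance shares are estimated with a generic pick-freeze (Saltelli-type) Monte Carlo estimator under uniform sampling.

```python
import random

def emulator(x):
    """Hypothetical cheap surrogate; stands in for a trained emulator."""
    return 2.0 * x[0] + 1.0 * x[1] + 0.5 * x[2]

def sample(ranges, n, rng):
    """Draw n parameter vectors uniformly within the given (low, high) ranges."""
    return [[rng.uniform(lo, hi) for lo, hi in ranges] for _ in range(n)]

def main_effect_shares(model, ranges, n=20000, seed=0):
    """Estimate each parameter's first-order share of the output variance."""
    rng = random.Random(seed)
    a = sample(ranges, n, rng)
    b = sample(ranges, n, rng)
    fa = [model(x) for x in a]
    fb = [model(x) for x in b]
    both = fa + fb
    mean = sum(both) / len(both)
    var = sum((y - mean) ** 2 for y in both) / (len(both) - 1)
    shares = []
    for i in range(len(ranges)):
        v_i = 0.0
        for xa, xb, ya, yb in zip(a, b, fa, fb):
            x_mix = list(xa)
            x_mix[i] = xb[i]  # sample A with parameter i taken from sample B
            v_i += yb * (model(x_mix) - ya)
        shares.append(v_i / n / var)
    return shares

# Elicited ranges (invented), then the same analysis with parameter 0 narrowed.
before = main_effect_shares(emulator, [(0.0, 1.0), (0.0, 1.0), (0.0, 1.0)])
after = main_effect_shares(emulator, [(0.4, 0.6), (0.0, 1.0), (0.0, 1.0)])
print("shares with original ranges:", [round(s, 2) for s in before])
print("shares with parameter 0 narrowed:", [round(s, 2) for s in after])
```

In this toy case, narrowing the range of the leading parameter moves it out of first place without any new evaluations of the full model, which is the property of emulator-based analysis that the paragraph above relies on.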
How can these results be related to uncertainty in aerosol forcing? We have quantified uncertainties in present-day CCN, but the overall uncertainty in the indirect effect is determined by the uncertainty in CCN as well as uncertainties in cloud occurrence and cloud-related processes (updraught speeds, precipitation processes, etc.). Because aerosol forcing is calculated relative to a baseline (such as the preindustrial era), the uncertainty in forcing also depends on the baseline CCN concentration. In fact, as we showed in Schmidt et al. (2012), the cloud albedo forcing will probably be more sensitive to the uncertainties in the preindustrial CCN than to those in the present-day CCN. Thus, the ranking of important parameters for forcing may differ from what we have presented here.