We present a Monte Carlo genetic algorithm (MCGA) for efficient, automated, and unbiased global optimization of model input parameters by simultaneous fitting to multiple experimental data sets. The algorithm was developed to address the inverse modelling problems associated with fitting large sets of model input parameters encountered in state-of-the-art kinetic models for heterogeneous and multiphase atmospheric chemistry. The MCGA approach utilizes a sequence of optimization methods to find and characterize the solution of an optimization problem. It addresses an issue inherent to complex models, whose extensive input parameter sets may not be uniquely determined from limited input data. Such ambiguity in the derived parameter values can be reliably detected using this new set of tools, allowing users to design experiments that should be particularly useful for constraining model parameters. The MCGA has been used successfully to constrain parameters such as chemical reaction rate coefficients, diffusion coefficients, and Henry's law solubility coefficients in kinetic models of gas uptake and chemical transformation of aerosol particles as well as multiphase chemistry at the atmosphere–biosphere interface. While this study focuses on the processes outlined above, the MCGA approach should be portable to any numerical process model with similar computational expense and extent of the fitting parameter space.

Atmospheric aerosols play a key role in climate, air quality, and public health. Heterogeneous reactions and multiphase processes alter the physical and chemical properties of organic aerosol particles, but the effects of these reactions are not fully elucidated (e.g. Finlayson-Pitts, 2009; George and Abbatt, 2010; Abbatt et al., 2012; Pöschl and Shiraiwa, 2015). While multiphase chemistry in aerosols and clouds can be described by a sequence of well-understood physical and chemical elementary processes in kinetic models (Hanson et al., 1994; Pöschl et al., 2007; George and Abbatt, 2010), the deduction of parameters or rate coefficients of the individual elementary processes is severely complicated by the inherent coupling of chemical reactions and mass transport processes (Kolb et al., 2010; Berkemeier et al., 2013; Shiraiwa et al., 2014).

Heterogeneous chemical reactions on aerosol particles are traditionally described using so-called resistor models, which represent parallel and sequential physical or chemical processes in analogy to electrical circuits. These models have typically been used to derive analytical expressions for simplified limiting cases (e.g. Hanson et al., 1994; Worsnop et al., 2002; Hearn et al., 2005). Recently, numerical models have been developed that allow a more complete consideration of the time- and depth-resolved chemical and physical behaviour of aerosol particles, leading to a better understanding of these reaction systems, especially under conditions where the steady-state assumptions underlying the resistor models are not valid (Smith et al., 2003; Pöschl et al., 2007; Steimer et al., 2015; Berkemeier et al., 2016). Kinetic multilayer models describe single particles or thin films by division into compartments such as near-surface gas phase, surface, and particle bulk and further subdivision of the particle bulk into thin layers to achieve depth resolution. Specific models provide a focus on chemistry (KM-SUB, Shiraiwa et al., 2010), on gas-particle partitioning (KM-GAP, Shiraiwa et al., 2012, and ADCHAM, Roldin et al., 2014), or on water diffusion (the ETH Diffusion Model, Zobrist et al., 2011). For simplicity, throughout this paper we use the term kinetic model for any computational model that is used to simulate a system's behaviour. We will use the term input parameters to denote the prescribed model parameters (thermodynamic, kinetic, or physical) that are optimized in this study so that kinetic model output matches experimental data, a process that we will refer to as fitting the kinetic model. Note that this definition excludes model parameters that are clearly defined by physical laws or the experiment (e.g. physical constants and experimental conditions) or are of purely technical nature (e.g. integration time steps).

Ideally, fitting a kinetic model to experimental data would return all
chemical and physical parameters necessary to understand the importance of
the processes at work and to predict the outcome of future experiments, even
if conducted under experimental conditions not part of the training data
set, i.e. all experimental data used during the fitting process. However,
kinetic models often require a multitude of input parameters, some of which
are not constrained well experimentally or are merely effective parameters
combining a sequence of inherently coupled processes. In general, two main
difficulties arise when optimizing complex models to experimental data:

The optimization hypersurface is often non-convex, i.e. it will not have only a single minimum, due to interactions between non-orthogonal (coupled) input parameters and/or scatter in the experimental data. Hence, steepest descent methods fail since they are easily trapped in local minima. Brute-force or exhaustive searches, in which the parameter space is evaluated on a regular grid, become computationally prohibitive as the number and range of input parameters grow.

If too few or too similar experimental data are used during the fitting process, or if input parameters are allowed to vary over a large range, the optimization problem can be underdetermined (ill-defined) and multiple solutions may exist. In this case, even though good agreement between model output and training data set is obtained, it is likely that only the model input parameters corresponding to the most limiting processes will be physically meaningful. Extrapolation of the model outside its training range can then lead to strong discrepancies between modelled and measured data. For example, if a model is trained using data that are exclusively limited by a single process, it will constrain the parameters that represent that specific process while the other parameters remain nearly unconstrained, even if multiple data sets are used. This means that if a parameter set were optimized using data from surface film experiments, the bulk diffusion coefficients would likely be poorly constrained regardless of how many different experimental data sets of that type were used.

Hence, sophisticated optimization methods, which quickly and reliably determine the model input parameters that lead to the best correlation between kinetic model and experiment, are needed. Furthermore, experiments covering a broad range of conditions must be conducted to ensure that the observables are controlled by (a) as many model input parameters as possible across all experimental conditions, but (b) by as few model input parameters as possible for a specific experimental condition (i.e. limiting cases). The MCGA presented here is able to overcome the difficulty of a complex optimization hypersurface with many local minima while providing the user with a realistic assessment of how well-constrained the model input parameters are by the experimental data.

Schematic representation of the MCGA optimization method consisting of a Monte Carlo sampling, which feeds into a genetic algorithm. Populations of model input parameter sets (blue boxes) are iteratively improved over several generations through survival of elites (red boxes) and recombination and mutation of parents to create children (purple boxes) until a sufficient correlation to the experimental data (goodness of fit) is obtained.

In many modelling applications, methods are needed that reliably find the optimum in non-convex optimization problems and detect underdetermined optimization problems. Global optimization methods have been the subject of extensive research (Arora et al., 1995) and provide means of approximating the solution of non-convex optimization problems without premature convergence to local optima. Examples of such methods are simulated annealing and evolutionary algorithms. In atmospheric chemistry, simple optimization techniques are commonly used to determine kinetic parameters by fitting rate equations to experimental data sets. However, to our knowledge, no global optimization technique specifically designed for the determination of atmospheric reaction rate coefficients from multiple data sets has been described thus far. A related technique (the Markov chain Monte Carlo algorithm) has been used to determine parametric uncertainties in cloud–aerosol interaction models (Partridge et al., 2012; Lowe et al., 2016). Global optimization has also been used to calculate thermodynamic equilibria for phase separation of aqueous multicomponent solutions (Zuend and Seinfeld, 2013).

In this study, we present the Monte Carlo genetic algorithm (MCGA), a method combining direct Monte Carlo sampling with a genetic algorithm as a heuristic global optimization method that approximates the global optimum for input parameter sets of computational models. Repeated execution of the search algorithm can be used to test for uniqueness or to provide statistical bounds on the model input parameters. The MCGA utilizes a two-step approach to find minima on non-convex hypersurfaces. First, a Monte Carlo (MC) sampling is performed in the large space of possible model input parameters to narrow the search down to smaller areas of interest. The parameter sets are evaluated using a goodness-of-fit expression of the user's choice, such as the root mean square error (RMSE) between kinetic model output and experimental data. In the examples presented here, the RMSE or logarithmic RMSE was used. When multiple data sets were fitted, a weighting factor was introduced to prevent bias due to the different numbers of data points in the experimental data sets. An additional optional weighting factor allows the user to assign priority to experimental data with lower statistical error or scatter. The parameter sets for the MC sampling are generated randomly from a distribution of the model input parameters. Each parameter was sampled from a logarithmically spaced distribution of values to provide uniform sampling over the large ranges that most input parameters can adopt. Note that, depending on the problem, different distributions and sampling strategies (e.g. Latin hypercube sampling) could be applied.
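The MC screening step described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the dictionary-based data-set layout, and the simple per-data-set weighting scheme are our assumptions, not the implementation used in the study.

```python
import numpy as np

def sample_log_uniform(bounds, n, rng):
    """Draw n random parameter sets; each parameter is sampled log-uniformly
    within its (min, max) bounds to cover ranges spanning many decades."""
    lo = np.log10([b[0] for b in bounds])
    hi = np.log10([b[1] for b in bounds])
    return 10.0 ** rng.uniform(lo, hi, size=(n, len(bounds)))

def weighted_rmse(model, params, datasets):
    """Goodness of fit: RMSE per data set, each weighted so that data sets
    with many points (or large scatter) do not dominate the total."""
    total = 0.0
    for d in datasets:
        pred = model(params, d["x"])
        rmse = np.sqrt(np.mean((pred - d["y"]) ** 2))
        total += d.get("weight", 1.0) * rmse
    return total / len(datasets)

def monte_carlo_search(model, bounds, datasets, n_samples=10000, n_keep=50, seed=0):
    """MC step: random screening of the input parameter space; returns the
    n_keep best parameter sets (sorted by goodness of fit) to seed the GA."""
    rng = np.random.default_rng(seed)
    pop = sample_log_uniform(bounds, n_samples, rng)
    fitness = np.array([weighted_rmse(model, p, datasets) for p in pop])
    best = np.argsort(fitness)[:n_keep]
    return pop[best], fitness[best]
```

With a cheap model, a few thousand samples typically suffice to locate the region of interest; for expensive kinetic models, the same screening would be distributed over many cores.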

The genetic algorithm (GA) uses survival of the fittest to optimize an ensemble (the population) of model input parameter sets over successive generations.

During execution of the GA, a number of model input parameter sets with the
highest correlation between model output and experimental data
(goodness of fit) are directly transferred into the next generation by the
survival mechanism (the elite). The remainder of the next generation is created by recombination of pairs of parent parameter sets and by random mutation of individual parameters.
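A single GA generation, with elite survival, recombination (crossover), and mutation, might look as follows. This is a minimal Python sketch assuming tournament selection and uniform crossover; the study itself used the MATLAB Global Optimization Toolbox, so all names and operator choices here are illustrative.

```python
import numpy as np

def evolve(pop, fitness, bounds, n_elite=2, p_mut=0.1, rng=None):
    """One GA generation: the n_elite fittest parameter sets survive
    unchanged; the rest of the population is replaced by children created
    through recombination of two parents and random mutation."""
    rng = rng or np.random.default_rng()
    order = np.argsort(fitness)                  # lower error = fitter
    pop = pop[order]                             # after sorting, index 0 is best
    next_pop = [pop[i].copy() for i in range(n_elite)]   # elites survive
    while len(next_pop) < len(pop):
        # tournament selection: the fitter of two random candidates is a parent
        p1 = pop[min(rng.integers(0, len(pop), 2))]
        p2 = pop[min(rng.integers(0, len(pop), 2))]
        # uniform crossover: each parameter is inherited from either parent
        mask = rng.random(len(p1)) < 0.5
        child = np.where(mask, p1, p2)
        # mutation: occasionally redraw a parameter log-uniformly within bounds
        for j, (lo, hi) in enumerate(bounds):
            if rng.random() < p_mut:
                child[j] = 10.0 ** rng.uniform(np.log10(lo), np.log10(hi))
        next_pop.append(child)
    return np.array(next_pop)
```

Iterating `evolve` until the best goodness of fit stops improving (or a generation limit is reached) completes the GA stage of the optimization.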

Schematic visualization of the parallelized MCGA optimization
method. The Monte Carlo step is performed independently on each processor core.

In this study we used the genetic algorithm provided by
MathWorks^{®}
(MATLAB^{®} Global Optimization Toolbox) and
developed a routine for parallel computation on computer clusters. In a
typical setting, the MC step and GA step of the optimization occupied an
approximately equal amount of computation time. Figure 2 describes the
implementation of the parallel MCGA optimization method.
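The pooling logic of such a parallelized MC step can be illustrated as follows. For simplicity the workers are run serially here, whereas in practice each call would execute on its own core or cluster node; the function names and interfaces are our assumptions, not the MATLAB routine used in the study.

```python
import numpy as np

def parallel_mcga_seed(run_mc, n_workers=4, n_keep=20):
    """Pool the best parameter sets from independent MC screenings into one
    starting population for the genetic algorithm. The workers are run
    serially here for illustration; in practice each run_mc call would
    execute on its own processor core or cluster node."""
    pools, scores = [], []
    for worker_id in range(n_workers):
        pop, fit = run_mc(seed=worker_id)   # independent random seed per worker
        pools.append(pop)
        scores.append(fit)
    pop = np.concatenate(pools)
    fit = np.concatenate(scores)
    best = np.argsort(fit)[:n_keep]         # keep only the fittest sets overall
    return pop[best], fit[best]
```

Because the MC screenings are mutually independent, this stage scales almost linearly with the number of workers; only the GA stage requires communication between them.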

Although models may possess a multitude of kinetic and thermodynamic input
parameters that represent the many possible sequential and/or concurrent
processes occurring in the system, their behaviour is often driven by only a
single or at most a few processes at a certain point in time. In chemical
kinetics, the behaviour of the system can often be characterized by a kinetic
regime, which may change during the course of the reaction and with
experimental conditions (Berkemeier et al., 2013). If a set of model input
parameters can be uniquely determined (by MCGA or other means) and results
in a high-fidelity fit of model output to experimental data, the parameters
would then be regarded as correct within the approximations of the underlying
model and uncertainties of the experimental data. This is a convenient way to
assimilate data from multiple previous studies; data sets can be weighted to
reflect confidence in their results, and the final range of accepted
parameters then represents a consensus from the fitted data. However, it may
not always be possible to fully constrain the input parameters, even using
multiple experimental data sets. In general, there are two reasons that a
model input parameter can remain unconstrained after optimization.

The parameter is non-influential.

The parameter is inherently coupled to another one, forming a non-orthogonal parameter pair under all experimental conditions.

Figure 3 illustrates both cases in an example taken from atmospheric multiphase chemistry, using the benchmark system of ozone and oleic acid and data adopted from Hearn et al. (2005). The original data were converted from ozone exposure to a time series assuming a fixed ozone concentration.
The prerequisite of a successful optimization is a sufficiently broad experimental data set, so that a unique and accurate set of fitting parameters is obtained; thus, both of the conditions above must be avoided. This may be achieved by including additional experimental data, especially from a different experimental technique or over a different timescale, so that the system samples another limiting behaviour. For the data in Fig. 3, for example, measuring full time series at different oxidant concentrations may help to constrain the oxidant's desorption lifetime. However, if a model has too many free parameters (especially parameters that are not well constrained by experimental data), it may be necessary to reduce the model complexity or fix some of the parameters. We therefore recommend using data sets obtained from a range of different experimental techniques, where available, and models with as few free parameters as possible.

In the example above, it was possible to use brute-force sampling to determine the true optimization hypersurface (contour lines) for comparison to the MCGA results. Of course, in typical applications, the number and range of input parameters make such a search prohibitive. The computational feasibility of an optimization depends crucially on the size of the input parameter space, i.e. the number and possible range of all parameters. Using an unreasonably large range for input parameters increases the possibility of finding non-physical solutions that fit the experimental data. The input parameter space can be reduced based on a priori knowledge from laboratory experiments and theoretical calculations. Parameters can be narrowed down by laboratory experiments (e.g. bulk experiments for derivation of trace gas solubility), by physics (e.g. the upper limit of the accommodation coefficient at unity), or by simulations (e.g. molecular dynamics simulations to estimate the surface accommodation coefficient and desorption lifetime as in Vieceli et al. (2005) and Julin et al. (2013)). Note that in the example given in Fig. 3b, the two parameters were not truly independent, so that constraining either model parameter from a priori information would also constrain the other. In multi-parameter optimizations, where many such dependencies might exist, this can lead to a significant reduction in solution space.

Previous studies applying the MCGA.

The MCGA has been previously applied to chemical reaction systems
of atmospheric relevance (Table 1). The essential parameters we use to
describe an atmospheric multiphase chemical kinetic system of reactive trace
gases include surface accommodation coefficients, desorption lifetimes, Henry's law solubility coefficients, bulk diffusion coefficients, and chemical reaction rate coefficients.

In its first application, the MCGA was used to fit individual data sets of the decay of oleic acid upon ozonolysis (Berkemeier et al., 2013), highlighting the need to fit multiple experimental data sets to constrain kinetic parameters. Multiple data sets were fitted in subsequent studies that investigated gas uptake to (semi-)solid organic material in coated-wall flow-tube reactors (Arangio et al., 2015; Berkemeier et al., 2016), ozone-induced protein oligomerization in bulk solutions (Kampf et al., 2015), viscosity change upon alkene ozonolysis as measured with fluorescence microscopy (Hosny et al., 2016), redox-cycling reactions in the human lung lining fluid (Lakey et al., 2016a), and ozonolysis of squalene contained in human skin lipids (Lakey et al., 2016b). In each of these studies, a large set of model input parameters was optimized against several experimental data sets to constrain the input parameter space. In the following, we review results previously obtained with the MCGA to demonstrate its utility in determining kinetic parameters, assimilating large data sets, and detecting ill-defined problems.

Observed (markers) and modelled (lines) uptake coefficients of ozone
on a thin film of shikimic acid as a function of exposure time.

In Berkemeier et al. (2016), 11 model input parameters were varied simultaneously, using a single set of kinetic parameters across 12 distinct experimental conditions, to fit the ozone uptake of shikimic acid films over many hours (Fig. 4). The model was found to accurately describe the humidity and concentration dependence of ozone uptake, and a high correlation between model output and experimental data was achieved. During optimization, a subset of six parameters, including diffusion coefficients and trace gas solubility, was allowed to increase or decrease monotonically over six steps in relative humidity, resulting in a total of 41 optimized parameter values. Despite this large number of optimization parameters, a well-constrained parameter set could be obtained owing to the breadth of the training data and the application of a priori information.

Kinetic parameters for multiphase chemical reactions of OH with
levoglucosan (white) and abietic acid (grey) determined by the MCGA method of
fitting the experimental data with the KM-GAP model. The ranges of parameters
are depicted as box plots.

In another study, investigating the oxidation of biomass burning tracers by
hydroxyl radicals (Arangio et al., 2015), repeated execution of the MCGA revealed
a remaining uncertainty in the kinetic parameters obtained from optimization
against the two experimental data sets (Fig. 5). While some parameters, such as
the diffusion coefficient of the organic matrix, could be narrowly constrained, others remained ambiguous, indicating that the optimization problem was underdetermined.

The MCGA addresses the problem of extracting physical and chemical parameters from experimental data. The algorithm allows the user to assimilate multiple data sets, and its random sampling approach reduces the bias that may arise in more user-directed optimization methods. Unlike simple gradient-based optimization methods, the MCGA can thus be used as a statistical tool that not only detects unconstrained parameters but also finds dependencies between them. The results can be applied in process models and may serve to direct future experimental studies, e.g. to drive a reaction system into regimes in which the remaining unconstrained parameters have high sensitivity. The MCGA could also be used to constrain chemical reaction systems in the post-analysis of field and laboratory studies: starting with a large set of model input parameters (i.e. chemical reactions, physical processes), data from various measurement campaigns could be combined, reconciled, and, in a further step, used to reduce the number of model input parameters to the key processes necessary to describe all measurement data. The MCGA may be a powerful tool to constrain kinetic parameters and reaction rate coefficients in models that study the formation of secondary organic aerosol in reaction chambers (Chan et al., 2007; Shiraiwa et al., 2013; Cappa et al., 2013; Riedel et al., 2016). It could also be suitable for fine-tuning of reaction rates in large reaction mechanisms of atmospheric chemistry, such as the Master Chemical Mechanism (MCM; Jenkin et al., 1997; Saunders et al., 2003), the Gas-Aerosol Model for Mechanism Analysis (GAMMA; McNeill et al., 2012), or the Chemical Aqueous Phase Radical Mechanism (CAPRAM; Herrmann et al., 1999).
Multiple experimental data sets from a broad range of techniques could be used with the algorithm to narrow down difficult-to-measure reaction rate coefficients, provide uncertainty estimates, and reconcile experiments across different research groups and facilities.

No data sets were used in this article.

The authors declare that they have no conflict of interest.

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the US National Science Foundation.

Thomas Berkemeier was supported by the Max Planck Graduate Center with the Johannes Gutenberg-Universität Mainz (MPGC). Andrew J. Huisman was supported by the United States National Science Foundation under award no. IRFP 1006117 and by ETH Zürich. We gratefully acknowledge Geoffrey D. Smith for providing published data in tabulated form. The authors would like to thank Andrea Pozzer, Christian Pfrang, and Claudia Marcolli for stimulating discussions and support and John Rieffel for his expertise in algorithms. The article processing charges for this open-access publication were covered by the Max Planck Society. Edited by: David Topping Reviewed by: three anonymous referees