A computationally-efficient secondary organic aerosol module for three-dimensional air quality models

Accurately simulating secondary organic aerosols (SOA) in three-dimensional (3-D) air quality models is chal- lenging due to the complexity of the physics and chemistry involved and the high computational demand required. A computationally-efficient yet accurate SOA module is neces- sary in 3-D applications for long-term simulations and real- time air quality forecasting. A coupled gas and aerosol box model (i.e., 0-D CMAQ-MADRID 2) is used to optimize relevant processes in order to develop such a SOA module. Solving the partitioning equations for condensable volatile organic compounds (VOCs) and calculating their activity co- efficients in the multicomponent mixtures are identified to be the most computationally-expensive processes. The two processes can be speeded up by relaxing the error tolerance levels and reducing the maximum number of iterations of the numerical solver for the partitioning equations for organic species; conditionally activating organic-inorganic interac- tions; and parameterizing the calculation of activity coeffi- cients for organic mixtures in the hydrophilic module. The optimal speed-up method can reduce the total CPU cost by up to a factor of 31.4 from benchmark under the rural con- ditions with 2 ppb isoprene and by factors of 10-71 under various test conditions with 2-10 ppb isoprene and >40% relative humidity while maintaining ±15% deviation. These speed-up methods are applicable to other SOA modules that are based on partitioning theories.


Introduction
Representing secondary organic aerosols (SOA) in three dimensional (3-D) atmo-20 spheric models is very important because they constitute a sizeable fraction of fine particulate mater (PM 2.5 ), which has impacts on human health, visibility degradation, and climate change (Watson, 2002;Davidson et al., 2005;IPCC, 2007;Zhang et al., 2007). The formation of SOA depends on atmospheric abundance of anthropogenic and biogenic volatile organic compounds (VOCs), their chemical reactivity, solubility, and the els (Strader et al., 1999;Barthelmie and Pryor, 1999;Aumont et al., 2000;Andersson-Sköld and Simpson, 2001;Schell et al., 2001;Pun et al., 2002;Griffin et al., 2002aGriffin et al., , 2003Tulet et al., 2006). Two aerosol modules: Model of Aerosol Dynamics, Reaction, Ionization, and Dissolution 1 and 2 (MADRID 1 and MADRID 2) have been incorporated into the US Environmental Protection Agency (EPA)'s 3-D Community Multiscale 10 Air Quality (CMAQ) modeling system (Binkowski and Roselle, 2003) to simulate SOA (Zhang et al., 2004). MADRID 1 uses an empirical representation of SOA formation based on data obtained in smog chamber experiments (Odum et al., 1997;Griffin et al., 1999). MADRID 2 uses a mechanistic representation that simulates an external mixture of hydrophilic and hydrophobic particles (Pun et al., 2002). The two modules differ 15 in two aspects. First, both use the same partitioning equation for hydrophobic organic compounds (OC) but with different methods for partitioning coefficients, which are obtained from the smog chamber experiments in MADRID 1 but calculated in MADRID 2 as a function of temperature and composition of the hydrophobic OCs based on Raoult's law following the equation derived by Pankow et al. (1994). The determination 20 of chemical composition in MADRID 2 involves the use of UNIFAC (Fredenslund et al., 1977) for activity coefficients. Second, hydrophilic compounds are treated based on the Henry's Law in MADRID 2 but not in MADRID 1. While MADRID 2 represents a detailed treatment for SOA formation, it is more computationally-expensive than MADRID 1 (by up to a factor of 8 in 3-D simulations), which limits its application for long-term 25 3-D simulations and real-time air quality forecasting.
In this work, a coupled gas and aerosol box model (i.e., zero-dimensional (0-D) version of CMAQ-MADRID 2) is used to explore various speedup methods while maintaining a desirable numerical accuracy. The main objective of this study is to improve the  (Pun et al., 2002). Hydrophilic OCs include those with a short carbon chain (≤7 carbons; or ≤10 carbons with three or more functional groups), high solubility (≥1 g solute/100 g water), and a high effective Henry's law constant (≥1 ×10 6 M atm −1 ). Hydrophobic condensable OCs are identified by their estimated octanol-water partitioning coefficients. Each surrogate compound assumes 5 the characteristic size and functional groups of the corresponding explicit compounds. Due to the paucity of property data for complex OCs, partitioning parameters, such as Henry's law constants and saturation vapor pressures, are estimated using group contribution methods (Pun et al., 2002). Dissociation constants and deliquescence relative humidities (DRH) are assigned based on analogy to simpler compounds.

10
The partitioning of hydrophobic condensable OCs into particulate phase through absorption and gas-particle equilibrium is described as follows (Pun et al., 2002;Zhang et al., 2004): where K i (m 3 µg −1 ) is the partitioning coefficient, A i and G i (µg m −3 air) are the mass 15 concentrations of species i in the particulate-and gas-phase, respectively, and M om (µg m −3 air) is the sum of primary (non-volatile) and secondary organic carbon (semivolatile) in the particulate phase that serve as the organic absorbing medium. A i can be calculated based on K i , G i , and M om . The partitioning coefficient, K i , can be calculated as follows (Pun et al., 2003): where R is the ideal gas constant (8.2×10 −5 m 3 atm mol −1 K −1 ), T (K) is the temperature, P sat Introduction (1) to calculate a new A i and the deviation between the two A i values. Such iterative calculations will stop until the deviation meets a convergence criterion of 10 −4 , making this iteration computationally very expensive.
In the absence of liquid water, hydrophilic condensable products undergo absorption 5 that is similar to that of hydrophobic condensable OCs. In the presence of liquid water, the partitioning of hydrophilic condensable OCs into existing aqueous particles is governed by Henry's law assuming equilibrium between the bulk gas and whole particulate phase, and takes into account the activity coefficient of the molecular solute. The additional water uptake can occur in the presence of hydrophilic SOA when the ambient 10 relative humidity (RH) is above the DRH of any individual OC in the particulate phase Hildemann, 1996, 1997). The addition of organic ions changes the pH and total water content of aqueous particles, and consequently, affects the partitioning of inorganic compounds, such as nitrate and ammonium (Saxena et al., 1995;Cruz and Pandis, 2000;Pun et al., 2003). The role of SOA in total aerosol water and in 15 transferring of nitrate into the aerosol phase is significant, especially at low RH (<50%) and high SOA mass fractions (>20% of total PM 2.5 ) (Ansari and Pandis, 2000). The above effects are also species-specific (Choi and Chan, 2002), and their modeling is difficult due to the lack of fundamental thermodynamic data for OCs and a suitable theoretical approach (Clegg et al., 2001). These organic-inorganic (O-I) interactions 20 are simulated in CMAQ-MADRID 2 by coupling the hydrophilic SOA module with an inorganic equilibrium model, ISORROPIA of Nenes et al. (1999), to relate water and ions between organic and inorganic species. In CMAQ-MADRID 2, both the hydrophilic and hydrophobic SOA modules require the simultaneous solution of partitioning equations for several OCs. A globally-convergent 25 Newton/line search method is used to solve for the partitioning equations simultaneously. Some simplifications are provided as alternatives to detailed treatments to improve computational efficiency in 3-D applications (Pun et al., 2003). For example, the amount of water associated with hydrophilic OCs is calculated by assuming that the water associated with each hydrophilic compound in a binary solution is additive (i.e., using the Zdanovskii, Stokes, and Robinson (ZSR) equation (Stokes and Robinson, 1966)). In the hydrophobic module, M om in Eq. (1) is calculated based on the concentration and composition of PM at the beginning of the time step in each grid cell, and held constant for the partitioning calculation during the same time step.

Test condition and model input
The 0-D CMAQ-MADRID 2 used in this study simulates the CACM gas-phase chemistry and all of aforementioned aerosol processes except aerosol activation. Other atmospheric processes such as emissions, dilution, transport, removal, and aqueousphase chemistry are not included in the box model. Three numerical solvers are imple-10 mented to solve CACM in the box model, the quasi steady state approximation solver (QSSA) (Hestveldt et al., 1978), the RosenBrock solver (ROS3) (Sandu et al., 1997a, b), and the sparse-matrix vectorized Gear's solver (SMVGEAR) (Jacobson and Turco, 1994). The box model is set up for a one-day simulation starting 12:00 GMT (corresponds to 08:00 a.m. Eastern Daylight Time (EDT)). A diurnal photolytic rate profile 15 is used to represent a typical summer day in the eastern US. Four hypothetical atmospheric conditions are used to represent the rural and urban conditions with low and high biogenic VOCs (rural LBG/HBG, and urban LBG/HBG). The initial gas-phase and particulate concentrations under the four conditions are given in Tables 1 and 2. The typical ambient non-methane hydrocarbon (NMHC) concentrations range from a few to 20 2000 ppbC (Finlayson-Pitts and Pitts, 1986;Seinfeld and Pandis, 1998). In the baseline simulation, the total NMHC concentrations are assumed to be 50 and 150 ppbC for the rural and urban conditions, respectively, which represent the low limits under such conditions. The speciation of NMHCs for all the initial CACM organic species except biogenic VOCs (i.e., isoprene (ISOP), and monoterpenes (BIOL + BIOH)) is 25 based on the specific distribution factors of Griffin et al. (2002a). The concentrations of isoprene and monoterpenes in the ambient air are typically in the range of 1 to 10 ppbC (Finlayson-Pitts and Pitts, 1986). The only differences between rural/urban HBG and 7091 Introduction

Back Close
Full Screen / Esc

Printer-friendly Version
Interactive Discussion rural/urban LBG are the concentrations of ISOP and BIOL + BIOH (0.1 and 0.4 ppb, respectively, for rural and urban LBG and 2 and 8 ppb, respectively, for rural and urban HBG). The initial concentrations of other inorganic gas-phase species are based on Zhang et al. (1998). The initial PM mass and number concentrations are assumed to be 20.9 µg m −3 and 2.27×10 6 particle cm −3 , respectively, for all the four conditions, 5 which minimizes the effects of preexisting PM on the predicted SOA under different conditions. Two scenarios are considered for each condition: without PM (i.e., CACM gas-phase chemistry only) and with PM. All simulations are conducted on the Linux Cluster of IBM Blade Center of North Carolina State University, with 2.8-3.2 GHz dual Xeon compute nodes. The simulation results are written out every 30 min.
10 3 Improvement of computational efficiency of MADRID 2

Methodology and simulation design for computational efficiency improvement
Baseline simulations without aerosol treatments are first conducted to identify the most appropriate solver for gas-phase chemistry. Among the three solvers implemented for CACM, the SMVGEAR is widely used in atmospheric transport model (Hertel et al., 15 1993;Huang and Chang, 2001) and considered as the most accurate numerical chemical solver available (US EPA, 2004). The results from SMVGEAR with the most stringent error tolerances for convergence (a relative error tolerance (RTOL) of 10 −5 and an absolute error tolerance (ATOL) of 10 −13 ppm) are therefore used as a benchmark for accuracy. The results with the ROS3 and the QSSA solvers at the same most stringent 20 error tolerance are quite similar to those with the SMVGEAR solver (the deviations in mixing ratios are within 4.2% for the ROS3 solver and 5.6% for the QSSA solver for all major gas-phase species). For all conditions, the QSSA solver is the fastest among the three solvers, while the ROS3 solver is slightly faster than the SMVGEAR solver for most simulations. Regardless of the solvers used, simulations under rural and ur-Introduction Interactive Discussion respectively) than those with LBG conditions, with the highest CPU cost for the rural HBG condition. Considering accuracy and computational efficiency, the ROS3 solver is selected for the speed-up test experiments for all simulations with aerosols. Solving aerosol processes takes much more CPU time than solving gas-phase chemistry (e.g., 2.1 vs. 199.3 s using the ROS3 solver under the rural HBG condition). 5 Under the test conditions, among all aerosol processes simulated, two computationally most expensive physical processes are identified: solving partitioning equations for OCs using the globally-convergent Newton/line search method and calculating activity coefficients of multicomponent mixtures through UNIFAC. The speed-up test experiments therefore focus on the two physical processes under the rural HBG condition.

10
Speed-up methods that are tested include: speeding up the numerical solver for the SOA formation; conditionally activating organic-inorganic interactions; and parameterizing the UNIFAC calculation of activity coefficients. These methods are first tested in the box MADRID with a constant temperature of 298 K and an RH of 70% under the four test conditions. The computationally-efficient MADRID 2 (referred to as MADRID 15 2 Fast hereafter) using the optimal speed-up methods is further tested for atmospheric conditions with temperatures of 273-313 K and RHs of 10-95%.

Numerical solver for partitioning of organic compounds
The globally-convergent Newton/line search method is developed from Newton's method and can solve the simultaneous equations of the form F (x)=0. It can con-20 verge to a solution from any starting point (i.e., initial guess value) (Press et al., 1997). Six parameters are tested for potential speedup including three tolerance control parameters (i.e., TOLF, TOLMIN, and TOLX), two internal parameters (i.e., EPS and ALF) and one external parameter (i.e., MAXIT). Their default values and function are given in Table 3. Relaxing TOLF and TOLX results in small-to-significant speed-up. As shown in 25 Fig. 1a Multiple parameters are relaxed in one simulation to obtain an optimal speedup. Figure 1c shows various combinations of speed-up parameters, where solver 1, 5 solver 2, and solver 3 correspond to a combination of TOLF=1×10 −3 , TOLX=1×10 −5 , MAXITS=100, EPS=1×10 −4 , and ALF=1×10 −4 ; the same with solver 1 but with TOLX=1×10 −3 ; and a combination of TOLF=1×10 −2 , TOLX=1×10 −3 , MAXITS=5, EPS=1×10 −3 , and ALF=1×10 −3 , respectively. The speed-up is much more significant as compared with relaxing single parameter in one simulation. Solver 1, solver 2, and 10 solver 3 reduce the CPU cost by 45%, 54.5%, and 86.1%, respectively. Compared with the benchmark results (i.e., using the SMVGEAR solver for gas-phase chemistry without any speed-up in aerosol simulation), the ranges of percentage deviations for simulated major gas-phase species and total PM 2.5 with solver 1, solver 2, and solver 3 are −2.7 to 1.4%, −5.5 to 1.7%, and −10.6 to 39.1%, respectively. While the speed-up 15 with solver 3 is at the expense of reducing accuracy, solver 2 and solver 1 provide a good compromise between the CPU cost and the numerical accuracy.

Organic-Inorganic (O-I) interactions
The extent of the partitioning for any individual OC between the gas and particulate phases depends on not only the amounts and properties of the compound, but also the 20 amount of water present in the atmosphere. Meanwhile, the water uptake and ions associated with OCs will affect the partitioning of inorganic aerosols. In CMAQ-MADRID2, such O-I interactions are simulated by coupling of a module solving dissolution of water soluble OCs with an inorganic gas-aerosol equilibrium module (i.e., ISORROPIA). Due to the lack of experimental data on the equilibrium in mixed inorganic/organic/aqueous 25 system, the inorganics and organics are treated separately and coupled only through the liquid water content (LWC) and pH of the aerosols (Pun et al., 2002 absolute values of water uptake by OCs (i.e., LWCORG) and the total water content from inorganics (i.e., RWATER) may vary significantly (e.g., 10 −4 to 2 and 3-15 µg m −3 , respectively), their ratios (i.e., LWCORG/RWATER) provide a good indicator on the relative importance of water uptake by OCs in the system, which is chosen as a cutoff value to turn on/off the O-I interactions. If the value of LWCORG/RWATER is less than the cutoff value, the water uptake and ion added by OCs will not be treated in ISORROPIA (i.e., no O-I interactions). Several cutoff values of LWCORG/RWATER are tested at an RH of 70% in terms of CPU costs and numerical accuracy. With the increased cutoff values (e.g., from 0.1 to 20%), the CPU costs reduce up to 64% under all test conditions. The CPU cost is reduced significantly when choosing a large 10 cutoff value, but the tradeoff is that the percentage deviations from the benchmark results are also large. For example, the percent deviations are −16.7 to 49.2% for major gas-phase species and −21.2 to 17.7% for total PM 2.5 for a cutoff value of 20%. The cutoff value of 0.5% can reduce CPU cost 7% with relatively small percent deviations, i.e., −7.2 to 2.6% for major gas-phase species and −4.2 to 7.3% for total PM 2.5 . The 15 value of LWCORG/RWATER of 0.5% is thus selected as the cutoff value to turn on/off the O-I interactions. As shown in Fig. 2, activating O-I interactions only when LW-CORG/RWATER >0.5% can reduce CPU cost by 7.1% and 15.6% for the rural and urban HBG conditions, and 39.3% and 18.6% for the rural and urban LBG conditions. The reduction in CPU cost is greater under the rural LBG than rural HBG conditions, 20 because relative amount of water content associated with organics are smaller and LW-CORG/RWATER is less than the cutoff value under the rural LBG condition. However, the CPU reductions are similar under urban LBG and HBG conditions due to smaller differences of LWCORG/RWATER between the two conditions. 25 Multiple linear regression analysis based on the least squares approach is applied for parameterization of activity coefficient calculation (para-AC) based on UNIFAC in the hydrophilic SOA module that is identified to be the most computationally-expensive 7095 Introduction process. Six molecules are treated for hydrophilic OC partitioning in CMAQ-MADRID 2: propandioic acid, C8 dien-dioic acid with an aldehyde branch, C8 hydroxy-dien-dial, C9 hydroxy-carbonyl acid with one double bond, C10 hydroxy-carbonyl aldehyde, and butandioic acid. The molecule fractions of the above species and water content and temperature are treated as independent variables. Given the range of each indepen-5 dent variable (i.e., temperature ranges from 253 to 313 K; mole fraction ranges from 0 to 1), a total of 1.2×10 8 simulations are conducted (120 temperatures ×10 6 mole fractions of 6 hydrophilic species). The following parameterization expression of activity coefficient is then derived and applied in box MADRID 2:

Parameterization of activity coefficient calculation
where Y is activity coefficient, c is the intercept (i.e., constant term), α and β are vectors representing the polynomial-fitted coefficients, T and X are the matrixes representing temperature in K and molecule fractions of six hydrophilic species and water content. In Eq. (3), the values of c are all zeros, and the values of α, and β are shown in Table 4. The correlation coefficients between the calculated activity coefficient by the 15 parameterization and the UNIFAC range from 0.6 to 0.9. The above parameterization reduces the CPU cost by 90%, 54%, 82%, and 52% (corresponding to speed up factors of 10.5, 2.2, 5.6, and 2.1) for rural HBG/LBG and urban HBG/LBG conditions (see Fig. 2), respectively. The optimal speed-up method that combines solver 2, O-I cutoff, and para-AC (i.e., comb all) can reduce the total 20 CPU cost by 97%, 78%, 89%, and 61% (corresponding to speed up factors of 29.7, 4.5, 8.9, and 2.5) for rural HBG/LBG and urban HBG/LBG conditions, respectively. Compared with the baseline without speed-up, the ranges of percentage deviations in species concentrations are −14.9 to 14.6% for major gas-phase species and −16.0 to 3.9% for total PM 2.5 under the rural HBG condition, and −1.9 to 0.5% for major Introduction

Sensitivity test of MADRID 2 Fast
MADRID 2 Fast is further tested under the aforementioned four conditions with 5 temperatures (273,283,293,303,and 313 K) and 5 RHs (10,40,60,80,and 95%), with a total of 100 simulations (=5 temperatures×5 RHs×4 conditions). Figure 3 shows the average percentage deviations from the benchmark for major gas-phase species and 5 total PM 2.5 under the rural HBG condition at an RH of 10% or 95% and temperatures of 273-313 K and at temperatures of 273 K or 313 K and RHs of 10-95%. For most species, the average deviations are within ±10%. Larger deviations occur for nitrogen dioxide (NO 2 ), nitric oxide (NO), and nitrate radical (NO 3 ) at 273 K and RH≥80% or at RH=95% and temperature ≤283 K, due to their very small concentrations in the 10 baseline simulation. For example, at T =273 K, RH=80%, the mixing ratios of NO decrease from 1.24×10 −6 ppb at 01:00 p.m. to 6.87×10 −19 ppb at 11:30 p.m. in the baseline simulation, those in the sensitivity simulation with MADRID 2 Fast decrease from 2.19×10 −6 at 01:00 p.m. to 1.8×10 −19 ppb at 11:30 p.m., such changes from the small baseline values at each time step result in a large average percentage deviation 15 of 92.5% (see Fig. 3c), although the absolute changes are negligible. Figure 4 shows a fairly good correlation of simulated AROH and BIOH, total SOA, and total PM 2.5 between simulations with MADRID 2 Fast and MADRID 2 under the four test conditions during the 24 simulation period, with correlation coefficients of 0.99 to 1.0 and >86% of data points within ±15% deviations. These results illustrate that MADRID 2 Fast 20 provides an overall good compromise between computational efficiency and numerical accuracy under most atmospheric conditions.

Summary and future work
A box model, 0-D CMAQ-MADRID 2, is applied to explore various methods in improving computational efficiency of the SOA module. Solving the partitioning equations for 25 condensable VOCs and estimating their activity coefficients with the UNIFAC are iden- Interactive Discussion tified as the most computationally-expensive processes in simulating SOA. Potential speedup methods tested include relaxing the error tolerance levels, reducing the maximum number of iterations, and adjusting several internal parameters of the numerical solver; turning off O-I interactions when water content associated organics is relatively small, and parameterizing the UNIFAC calculation of activity coefficients using multi-5 ple linear regression analysis at various temperatures for the mixtures with different mole fractions. A combination of several speed-up parameters related to the numerical solver gives 47-55% CPU reductions (speedup by 1.9-2.2) with percentage deviations of −5.5 to 1.7%. The percentage of water associated with organics over the total water from inorganics is selected as an indicator of O-I interactions with an optimal cutoff 10 value of 0.5%. Using the cutoff value to turn on/off O-I interactions can reduce CPU by 7 to 39.3% under the test conditions with the percentage deviations of −7.2 to 7.3% for most species. Among all the methods tested, the parameterization of the activity coefficient calculation in the hydrophilic module gives the most effective CPU reduction (reduced by 51.9 to 90.5%, or speedup by factors of 2.1-10.5) with percentage 15 deviations of −14.9 to 14.6%. The optimal speed-up method that combines all above methods can significantly reduce the CPU cost by 60.5 to 96.6% (speedup by factors of 2.5-29.7) with percentage deviations of −14.9 to 14.6% from the benchmark. Under conditions with typical ranges of temperatures and RHs, the MADRID 2 Fast can reproduce the benchmark results with acceptable accuracy (within ±15%) and compu-20 tational efficiency for most species. The methods used to develop the MADRID 2 Fast in this study are generic to other SOA modules that employ the UNIFAC calculation for activity coefficients of OCs and/or that use a numerical solver to solve the partitioning equations for OCs. The MADRID 2 Fast is being implemented in 3-D CMAQ-MADRID of Zhang et al. (2004) and the Weather Research and Forecast Model with Chemistry 25 and MADRID (WRF/Chem-MADRID) (Zhang et al., 2005;Hu et al., 2006). It will be further tested with representative episodes in the US and abroad for long-term applications and real-time air quality forecasting.    Figure 1. The CPU time cost as a function of (a) various tolerance levels of TOLX by fixing values of TOLF; (b) various maximum iteration numbers; and (c) various combination Fig. 1. The CPU time cost as a function of (a) various tolerance levels of TOLX by fixing values of TOLF; (b) various maximum iteration numbers; and (c) various combination of speed-up parameters, where solver 1, solver 2, and solver 3 correspond to TOLF=1×10 −3 , TOLX=1×10 −5 , MAXITS=100, EPS=1×10 −4 , and ALF=1×10 −4 ; TOLF=1×10 −3 , TOLX=1×10 −3 , MAX-ITS=100, EPS=1 ×10 −4 , and ALF=1×10 −4 ; and TOLF=1×10 −2 , TOLX=1×10 −3 , MAXITS=5, EPS=1×10 −3 , and ALF=1×10 −3 , respectively.